Formatting the Status.Text with the various Entities

Jun 28, 2013 at 1:56 PM
Hi,

When formatting a Status.Text (i.e. the body of a tweet) for links, you have to use the various Status.Entities collections (UrlEntities, UserMentionEntities, etc.) in turn.

Each one has a Start and End index of where the entity is represented in the Status.Text of the tweet.

The problem is trying to format a tweet's text that includes user mentions, hashtags and URLs. As soon as you enumerate (backwards) any one of the entity types and change the Text to include the link tags, you have invalidated the indexes of all the other entity types. They now point to the wrong locations in the text and are worthless.

Wouldn't it work better for Status.Entities to be enumerable and each item specify as a property what type of entity it is? Am I missing something really obvious here?

Other than this, this lib is loads better than the one I was using.
thanks
Nev
Coordinator
Jun 29, 2013 at 3:19 AM
Hi Nev,

That's an interesting thought. It might be good to build a single collection of start/end positions, sorted by largest start first, and then modify the text backwards.
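A minimal sketch of that idea, not LinqToTwitter code: LinkSpan and BackwardsLinker are hypothetical names, and the sketch assumes End is the index just past the entity's last character. Every entity, whatever its type, is reduced to a start/end pair plus the opening tag to wrap it in, and the spans are applied from the highest Start down so earlier offsets are never shifted.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for any entity type: where it sits in the text
// and the opening tag to wrap it in. This is not a LinqToTwitter type.
public sealed class LinkSpan
{
    public int Start { get; set; }
    public int End { get; set; }        // assumed: index just past the last character
    public string OpenTag { get; set; }
}

public static class BackwardsLinker
{
    public static string Apply(string text, IEnumerable<LinkSpan> spans)
    {
        // Largest Start first: an insert only moves the text to its right,
        // so the offsets of the spans still waiting to be processed stay valid.
        foreach (var span in spans.OrderByDescending(s => s.Start))
        {
            text = text.Insert(span.End, "</a>")
                       .Insert(span.Start, span.OpenTag);
        }

        return text;
    }
}
```

For the text "Hi @bob #ok", a mention span of 3 to 7 and a hashtag span of 8 to 11 can be passed in any order; the hashtag is wrapped first and the mention second, so both links land in the right place.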

My thoughts are that this is one use case. I have to also consider that each entity type contains different metadata. If I optimized for your use case, I wonder if I would be making the situation more cumbersome for a person with a requirement to only extract a specific type of entity from the tweet - e.g. data mining URLs or finding mentioned users. There might be other scenarios, but the end result is that changing it now would break code for a lot of other developers.

@JoeMayo
Jun 29, 2013 at 10:11 AM
Hi,

Thanks for your quick response.

That's what I did in the end: I wrapped your Entities in a common interface, added each entity to a collection, and had each wrapper apply its link changes to the text, aggregating over the collection in descending order of the Start property.

The other lib I mentioned I was using (Twitterizer) had a method on the tweet called Linkify which basically returned the text properly formatted with all the links in place. For apps targeted at consuming tweets in their "intended" purpose (rather than lookups, stat driven apps etc), this sort of function is pretty handy.

I appreciate you don't want to change the structure for people relying on the broken out link types but I think you could probably accommodate both uses. If you had a property on the Status Entity called LinkedText or something, the consumer could choose the unformatted value in .Text or the formatted version in .LinkedText . That way, you don't mess with the structure, and someone just interested in the formatted text doesn't have to even think about the Entities.

As it's easily overcome it's not a major issue, just an idea.

thanks for the great library
Nev
Jul 4, 2013 at 2:42 PM
Hi Nevin,

I am facing a similar problem. Could you share your solution?

Thanks, John
Sep 9, 2013 at 6:44 AM
Hi all

We're stuck in the same boat.
Would anyone care to share their solution? You can PM me if you wish.

Many thanks, Jaans
Sep 11, 2013 at 11:23 AM
Sorry for the late response (especially to John); I didn't see the notification.

As I described above, I wrapped the LinqToTwitter tweet in my own class (LinqToTwitterTweet). I also have a common interface for all the entity types in a LinqToTwitter tweet (ILinqToTwitterEntity).
In the LinksInText method I create a list of all the entities, wrapping each one in the class that handles its type:
// Getters and setters for the tweet properties I am interested in removed for brevity.
// Needs "using System.Linq;" at the top of the file.
public LinqToTwitterTweet(LinqToTwitter.Status tweet)
{
    // Set all the properties (cut out here for brevity).
    // this.text is exposed through a public property and holds the full text of the tweet, complete with links.
    this.text = this.LinksInText(tweet.Text, tweet.Entities);
}

private string LinksInText(string raw, LinqToTwitter.Entities entities)
{
    // Wrap every entity in the common interface and gather them all in one list.
    var ents = entities.UrlEntities.Select(ent => new LtoTUrlEntities(ent)).Cast<ILinqToTwitterEntity>().ToList();
    ents.AddRange(entities.UserMentionEntities.Select(ent => new LtoTMentionEntities(ent)).Cast<ILinqToTwitterEntity>());
    ents.AddRange(entities.HashTagEntities.Select(ent => new LtoTHashEntities(ent)).Cast<ILinqToTwitterEntity>());

    // Work from the last entity to the first so that inserting tags never shifts
    // the Start/End offsets of the entities that are still to be processed.
    return ents.OrderByDescending(x => x.Start).Aggregate(raw, (current, entity) => entity.UpdateText(current));
}
An example of the entity handling (which is pretty simple) is the LtoTUrlEntities class:
public class LtoTUrlEntities : ILinqToTwitterEntity
{
    public int Start { get; private set; }

    private readonly int end;

    private readonly string url;

    public LtoTUrlEntities(LinqToTwitter.UrlEntity entity)
    {
        this.Start = entity.Start;
        this.end = entity.End;
        this.url = entity.Url;
    }

    public string UpdateText(string before)
    {
        string after = before.Insert(this.end, "</a>");
        after = after.Insert(this.Start, string.Format("<a href=\"{0}\">", this.url));

        return after;
    }
}
Another example is the UpdateText method of the LtoTHashEntities class (the constructor is the same):
public string UpdateText(string before)
{
    // Note: for this link to work, this.url has to hold the hashtag text (without the #), not a URL.
    string after = before.Insert(this.end, "</a>");
    after = after.Insert(this.Start, string.Format("<a href=\"https://twitter.com/search?q=%23{0}&src=hash\">", this.url));

    return after;
}
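The post above uses ILinqToTwitterEntity and LtoTMentionEntities without showing them. What follows is only a guess at their shape, inferred from how they are called: the interface needs a Start property and an UpdateText method and nothing more. MentionLinkSketch is a hypothetical name, takes plain values instead of a LinqToTwitter entity so it compiles on its own, and assumes a profile link of the form https://twitter.com/ followed by the screen name.

```csharp
using System;

// Assumed shape of the common interface, inferred from the calls in LinksInText.
public interface ILinqToTwitterEntity
{
    int Start { get; }

    string UpdateText(string before);
}

// Hypothetical mention wrapper. A real one would presumably take the
// LinqToTwitter mention entity in its constructor, like LtoTUrlEntities does.
public class MentionLinkSketch : ILinqToTwitterEntity
{
    public int Start { get; private set; }

    private readonly int end;

    private readonly string screenName;

    public MentionLinkSketch(int start, int end, string screenName)
    {
        this.Start = start;
        this.end = end;
        this.screenName = screenName;
    }

    public string UpdateText(string before)
    {
        // Close the tag first; inserting at end does not move the Start offset.
        string after = before.Insert(this.end, "</a>");
        after = after.Insert(this.Start, string.Format("<a href=\"https://twitter.com/{0}\">", this.screenName));

        return after;
    }
}
```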
Hope this helps.
Nev
Feb 11, 2014 at 2:57 AM
Apologies for the belated reply.

@Nevin, the above worked a treat! I tweaked it to fit our needs and code structure, and I like the extensible design.

I would really like to see this in the LinqToTwitter framework though :-)

Thanks again,
Jaans
Coordinator
Feb 11, 2014 at 4:22 AM
This discussion has been copied to a work item.
Coordinator
Feb 11, 2014 at 4:23 AM
Thanks for sharing the code. I added it to the task list to take another look at.
@JoeMayo