TwitterStream - I randomly get garbage when accessing stream.Content

Feb 26, 2014 at 10:30 AM
Edited Feb 26, 2014 at 2:20 PM
Version 2.1.11.0

I've been trying to maintain a permanent connection to the streaming endpoint and while I've succeeded, I am still running into an issue.

I already handle all the stream stalls and disconnects / reconnects successfully.

The current problem is that, after a few hours of receiving tweets, I start getting "garbage" in the stream.Content property. You can still make out a few parts of a tweet, but something seems to be wrong with the encoding or the buffering.

Here is an example from my logs table:
10:21 PM - Message that is not a tweet: tt543\u0648\u06terhtt5492687872","init"1tr":null,"in_reply_iheSqug.cFilgmin"w":339ihe Squg.cmin8d":533354ull,"ly_to_use533354ullmin87872","in 229str":null,"in_reply_Je021eNoujaimmin"w":339je021emedujaimmin8d"::fa340c \ly_to_use:fa340cmin87872","in9mes09}}]crat":"job ever. Al98humng ":"787c \ly_to_use45\u0humng ":"787cmin87872","in_bac99,"Jes_count":l":"http:\/\/pbs.twimg.com\at":"TuehB5Gh5IEAAlwjD/\/:{"hs_count":_image_url":"http:\/pbs.twimg.com\at":"TuehB5Gh5IEAAlwjD/\/:{"hg":null,"followites_cyAHPptegCtps""id":43698231600,"time_zone":"PyAHPptegCtps"303482,"id_str":"83034825084,"lang":"MrSedkyofile_use_bac98humf4n8om\7se:"en","contributors_enabled":false,"i\/\/pbs.twimg.com\/profile_background_imont,translatorgroom\/p40File_backgron_enablA2","profile_backgro19cone_backgron_enablat":false,"profile_baF6","e_backgron_ena}}]},"favorited":false,"retweities":{"hashtags":[{"text":"tbt","indices":[0,4]}],"symbols":[],"urls"arna
You can clearly see fragments resembling tweet properties. These garbled messages replace normal tweets for a few hours, and then I start getting normal tweets again. But since I keep receiving something from the stream, there is no stall warning or disconnect. I don't think I should have to disconnect and reconnect the stream to work around this. I have also not seen this problem reported anywhere else.
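
Since the garbled messages arrive without any stall warning or disconnect, one way to at least detect them is to check whether each piece of content parses as a JSON object before treating it as a tweet. The following is only a minimal sketch, not part of LinqToTwitter: the class and method names are hypothetical, and it assumes a runtime where System.Text.Json is available (with an older framework, a JSON library such as Json.NET could play the same role).

```csharp
using System;
using System.Text.Json;

// Hypothetical helper, not a LinqToTwitter API.
public static class StreamContentCheck
{
    // Returns true only when the text parses as exactly one JSON object.
    // Truncated or interleaved fragments fail to parse and yield false.
    public static bool LooksLikeJsonObject(string content)
    {
        if (string.IsNullOrWhiteSpace(content))
        {
            return false;
        }

        try
        {
            using (JsonDocument doc = JsonDocument.Parse(content))
            {
                return doc.RootElement.ValueKind == JsonValueKind.Object;
            }
        }
        catch (JsonException)
        {
            // Not valid JSON: treat it as a garbled message.
            return false;
        }
    }
}
```

In the callback, this could be called on stream.Content before further processing, logging anything that fails. Note that the check only tells well-formed JSON from malformed text; other well-formed streaming messages that are not tweets would still pass it.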

Any ideas, guys?

Piece of code (normal stuff):
            try
            {
                (from stream in twitterContext.Streaming
                 where stream.Type == StreamingType.Filter &&
                     stream.Track == keywords
                 select stream)
                 .StreamingCallback(stream =>
                 {
                     try
                     {
                         this.m_lastCallbackTime = DateTime.UtcNow;
                         this.m_twitterStream = stream;
                         if (stream != null)
                         {
                             if (stream.Status == TwitterErrorStatus.RequestProcessingException)
                             {
                                 var webEx = stream.Error as WebException;
                                 if (webEx != null && webEx.Status == WebExceptionStatus.ConnectFailure)
                                 {
                                     Trace.TraceInformation(DateTime.UtcNow.ToShortTimeString() + " - Error: LinqToTwitter stream connection failure!");
                                     this.m_hadStreamFailure = true;
                                 }
                             }
                             else if (!string.IsNullOrEmpty(stream.Content))
                             {
                                 var tweet = stream.Content;
and so on...
I appreciate all your help with this issue. Thanks in advance ;)
Coordinator
Feb 26, 2014 at 4:03 PM
Feb 26, 2014 at 5:17 PM
Thanks Joe ;)
Mar 4, 2014 at 10:26 AM
Sorry to bother you again Joe.

The library's latest version was a significant improvement with respect to those garbled tweets.
Although we no longer get them for hours straight, we still receive 1 or 2 "bad" tweets every 10 minutes. That is significantly better, and we can cope with this rate.
Nevertheless, I just wanted to let you know. Initially I thought it was caused by the volume we were getting (between 500 and 3000 tweets per minute), but after inspecting the logs, the "bad" tweets almost never appear during peak times. So I don't think volume is the cause.
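
For anyone who wants to repeat the comparison between volume and "bad" tweets, the log entries can be grouped per minute and the counts compared side by side. This is a rough sketch under assumptions: the Entry type and its fields are hypothetical stand-ins for whatever the log table actually stores.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper for inspecting logged stream messages.
public static class BadMessageStats
{
    // Assumed log record: arrival time and whether the message failed to parse as a tweet.
    public sealed class Entry
    {
        public DateTime ReceivedUtc { get; set; }
        public bool IsBad { get; set; }
    }

    // Returns one tuple per minute, ordered by time: (minute, total messages, bad messages).
    public static List<Tuple<DateTime, int, int>> PerMinute(IEnumerable<Entry> entries)
    {
        return entries
            .GroupBy(e => new DateTime(
                e.ReceivedUtc.Year, e.ReceivedUtc.Month, e.ReceivedUtc.Day,
                e.ReceivedUtc.Hour, e.ReceivedUtc.Minute, 0, DateTimeKind.Utc))
            .OrderBy(g => g.Key)
            .Select(g => Tuple.Create(g.Key, g.Count(), g.Count(e => e.IsBad)))
            .ToList();
    }
}
```

If the minutes with the highest totals rarely contain bad messages, that supports the observation that volume is not the trigger.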

Anyway, thanks for your help :)