Long Running User Stream Failure

Aug 15, 2012 at 4:35 PM

I am having an issue when running the User Stream in a long-running process. No log entry is ever made and no errors appear to be raised, but after some length of time the user stream just stops working.

The code that is experiencing the issue is running on Linux and Mono, but I do not think that is the problem, because everything works fine for a few hours and then the stream unexpectedly stops raising events.

The code where I am seeing this issue is in an open source project I am working on so it can be found in the link below.

https://github.com/postworthy/postworthy/blob/master/Postworthy.Tasks.Streaming/Program.cs

Any help on this would be great. I am basically trying to set up a long-running process that is always listening to the Twitter stream and processing the results for display. But the longest I have been able to run before I stop getting messages is around 11 hours.

Coordinator
Aug 17, 2012 at 3:14 AM

Hi,

A few things come to mind when looking at this, none of which point to a definitive answer, but open areas that might be useful to look at: logging strategy, exceptions, keep alives, and restarts. I'm not sure that I can set up this code and duplicate the problem on my Windows box in a decent amount of time, so I'll share some ideas.

I noticed you're logging to the console. Are you seeing any output on the console, other than Console.WriteLine? I've scanned the code for paths that might lead to an exception that doesn't log and found none, but that doesn't mean there aren't any. If I can reproduce the problem, maybe I can find a situation where an error isn't being logged. The logging is only for exceptions. One thought is that you might grab the source code and add more instrumentation that might indicate where the problem is occurring. I would take this as a new issue to add more sophisticated instrumentation to the code, but that won't help you as I'm unlikely to get it done any time soon. I'm not aware of any exception handling problems, but it's pretty important, which is why I discuss it at length.

In Windows, unhandled exceptions show up in the event log.  Is there an equivalent place on Linux where you can look?  Does an unhandled exception in Mono take down the process?  If so, is your process crashing?

Twitter sends keep-alive messages, which is what the code checks for on Line 220. The following documentation has a section on Stalls that might be worth a look:

https://dev.twitter.com/docs/streaming-apis/connecting

Essentially, keep a timer to detect when you've received neither a new entity nor a keep-alive message within 90 seconds, and reconnect when that happens.
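
A minimal sketch of that stall check, assuming hypothetical names (`StallDetector`, `Touch`, `IsStalled`) that are not part of LINQ to Twitter. The clock is passed in so the logic can be exercised without actually waiting 90 seconds:

```csharp
using System;

// Hypothetical helper, not part of LINQ to Twitter: remembers when the last
// message (entity or keep-alive) arrived and reports a stall after the timeout.
public sealed class StallDetector
{
    private readonly TimeSpan _timeout;
    private readonly Func<DateTime> _now;
    private readonly object _gate = new object();
    private DateTime _lastSeen;

    public StallDetector(TimeSpan timeout, Func<DateTime> now = null)
    {
        _timeout = timeout;
        _now = now ?? (() => DateTime.UtcNow);
        _lastSeen = _now();
    }

    // Call from the streaming callback for every message, including keep-alives.
    public void Touch()
    {
        lock (_gate) { _lastSeen = _now(); }
    }

    // Poll from a timer; true means nothing has arrived for longer than the timeout.
    public bool IsStalled()
    {
        lock (_gate) { return _now() - _lastSeen > _timeout; }
    }
}
```

The idea would be to call `Touch()` inside the streaming callback and poll `IsStalled()` every few seconds from a separate timer, reconnecting when it returns true.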

I'd be interested in hearing how this goes. If you have a solution/project that's easy for me to set up and reproduce, let me know.

Joe

Aug 18, 2012 at 3:19 AM

Q: Are you seeing any output on the console, other than Console.WriteLine?

A: If anything goes wrong in LinqToTwitter it writes to the output, but no errors appear in the log. Also, all errors thrown by my code are caught and written out to the console, so no errors are being thrown there either. It just stops working after some seemingly random length of time.

 

Q: In Windows, unhandled exceptions show up in the event log.  Is there an equivalent place on Linux where you can look?  Does an unhandled exception in Mono take down the process?  If so, is your process crashing?

A: The process handles all exceptions cleanly, displaying them in the output and continuing. The process never crashes, but the stream just stops raising new events to process.

Q: I'd be interested in hearing how this goes. If you have a solution/project that's easy for me to set up and reproduce, let me know.

A: The main branch builds directly from git. To get the stream running you really just need to modify the example.app.config in the streaming project. You would also want to copy the UsersCollection.config from the web project and make sure the app.config points to wherever you place the UsersCollection file. For a fully functional site you would want to have memcached running, though if it does not exist the code will just pass through and store locally. The part that would take the longest is signing up as a Twitter developer and getting all the required keys.

 

-------------------

I currently check every 60 seconds for data. I may attempt to reset the connection if I don't receive any data after some length of time. I may also look at modifying the streaming code to log more for debugging this issue.

Aug 18, 2012 at 3:46 AM

I have pushed the latest log:

http://postworthy.com/log.txt

The error that is at the beginning of the log is not related to the issue. You can see that the stream is processed without issue for a couple of hours before it just stops working without posting anything to the output.

Coordinator
Aug 18, 2012 at 4:00 AM

Thanks for that.  I have a couple more ideas:

  1. The stalling theory is looking more likely now. It might be useful to log every time a message comes in, with a timestamp and whether it's a valid message or a keep-alive. Have each message reset a stall-detection timer back to zero. When the timer expires at 91 seconds, log the fact that it expired. If no more messages arrive after the stall log entry, that points to stalling as the cause. You will then already have the logic in place to know when to reconnect.
  2. I don't know if this is a rational thought or not, but the Friends.Update call on Line #186 bugs me. I'm wondering if Twitter doesn't like REST API calls by the same user who is already connected to a stream, i.e. does it make sense to make a REST call for the same information that would be available via the stream? A better strategy might be to make all your REST API calls to fill in missed data between connections, and then connect to the stream and get new information in real time.

I added instrumentation as a new LINQ to Twitter issue.  Whatever I add will need to be cross-platform/technology.

I'm currently getting started on an app that does Twitter streaming, so this is interesting.

Joe

Aug 18, 2012 at 3:28 PM

I have been looking at the LINQ to Twitter code to see if there is any place an error could occur in the stream without a log entry being written. I have identified one such location. In TwitterExecute.cs on line 610 there is a default case that sets the error wait but does not write anything to the log. I am using version 2.0.28.0 of LinqToTwitter, which I believe is the latest release. I am going to try adding logging at that line and see if I can catch anything.

1. Feels like a stall, but it seems like it would be more native if the LinqToTwitter streaming code handled the stalls (in a future version, of course).

2. The Friends.Update is commented out because when it was called after the stream was connected, two things would happen. First, the Update would fail and LinqToTwitter would log errors. Second, the stream would stop functioning.

I am glad you find this idea interesting. My goal with this project in the Postworthy solution is to have a long-running process for real-time content processing (pull links, images, video, etc.). You can see from the code that when tweets are processed they are immediately pushed to the web site, which then immediately pushes the tweet to all connected clients (real time to the browser).

This timeout/stall issue is the last major issue for having a robust process for realtime processing of the twitter stream. I am sure it is resolvable.

Aug 18, 2012 at 6:29 PM

I updated the code to check and see if the stream stalls:

https://github.com/postworthy/postworthy/blob/master/Postworthy.Tasks.Streaming/Program.cs

This code is currently running on my server so I will update later when I can see if the latest code can work around any stalls.

Aug 20, 2012 at 5:21 PM

With the updates that I made to the code in the above post, here is what I have learned. For some reason something happens that sends the Twitter stream into an endless loop of firing off calls to the .StreamingCallback as fast as it is able. I know this because I can check my logs and see the

Console.WriteLine("{0}: Twitter Keep Alive", DateTime.Now);

being called multiple times a second by the .StreamingCallback. Something appears to throw this method off so that it gets into an endless loop, which explains why I never get any updates from Twitter once this happens.
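
One way to catch this condition programmatically is to count keep-alives in a short sliding window. This is only a sketch: the `KeepAliveFloodDetector` class and the threshold are assumptions for illustration and do not come from LinqToTwitter.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper: flags a flood when more than maxPerWindow keep-alives
// arrive within the sliding time window.
public sealed class KeepAliveFloodDetector
{
    private readonly Queue<DateTime> _recent = new Queue<DateTime>();
    private readonly int _maxPerWindow;
    private readonly TimeSpan _window;

    public KeepAliveFloodDetector(int maxPerWindow, TimeSpan window)
    {
        _maxPerWindow = maxPerWindow;
        _window = window;
    }

    // Record one keep-alive; returns true when the rate looks like a runaway loop.
    public bool Record(DateTime timestamp)
    {
        _recent.Enqueue(timestamp);

        // Drop entries that have fallen out of the window.
        while (_recent.Count > 0 && timestamp - _recent.Peek() > _window)
        {
            _recent.Dequeue();
        }

        return _recent.Count > _maxPerWindow;
    }
}
```

Calling `Record(DateTime.Now)` next to the keep-alive log line and treating a `true` result as a dead connection would turn the silent loop into something the process can react to.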

Coordinator
Aug 22, 2012 at 1:50 AM

I might have figured it out. Apparently, the code gets a lot of newlines whenever Twitter closes the stream, which L2T interprets as Keep-Alive messages. I've added some code to detect when the stream is closed and throw an exception. I've also modified StreamContent with new Status and Error properties for detecting this condition.

You can detect this by inspecting the Error and checking whether it's a WebException with WebExceptionStatus.ConnectFailure, then wait a second or so (whatever Twitter's recommended back-off procedure is) and reconnect. Here's an example:

            // Counts received messages so this sample can stop on its own.
            int count = 0;

            (from strm in twitterCtx.UserStream
             where strm.Type == UserStreamType.User
             select strm)
            .StreamingCallback(strm =>
            {
                // A closed stream now surfaces as an error status.
                if (strm.Status == TwitterErrorStatus.RequestProcessingException)
                {
                    WebException wex = strm.Error as WebException;
                    if (wex != null && wex.Status == WebExceptionStatus.ConnectFailure)
                    {
                        Console.WriteLine(wex.Message + " You might want to reconnect.");
                    }

                    Console.WriteLine(strm.Error.ToString());
                    return;
                }

                Console.WriteLine(strm.Content + "\n");

                // Demo only: stop after about 25 messages.
                if (count++ >= 25)
                {
                    strm.CloseStream();
                }
            })
            .SingleOrDefault();

This is checked into source control.  As usual, feedback is welcome. 

Joe

Aug 22, 2012 at 3:01 AM

Thanks Joe. Any idea when I could expect to see this update available through NuGet? I am not against downloading the source and compiling it locally in the meantime, but my current project setup uses NuGet to manage the packages.

Coordinator
Aug 22, 2012 at 6:36 PM

Available now. :)

Aug 23, 2012 at 3:31 AM

OK, so I am using the latest code and catching the error when the stream is stalled. But it appears there is an issue if I try to reconnect using the same context that lost the connection. When I try to reconnect after 90 seconds, it never reconnects to the Twitter stream. You can see where I am attempting to reconnect at line 180.

https://github.com/postworthy/postworthy/blob/master/Postworthy.Tasks.Streaming/Program.cs#L180

Coordinator
Aug 23, 2012 at 6:18 PM

Good - making progress. I have a couple of things out of the way and will probably spend some time over the next few days looking closer at this. In the meantime, a possible workaround is to reconnect with a new instance of TwitterContext.

Joe

Aug 24, 2012 at 4:24 AM

So far so good. If I reconnect with a new context it appears that everything reconnects. So far my output shows 2 disconnects but a reconnect was made 90s later. I will post back in a day or so with another update. Also, the code that I have working can be found by following the link below.

https://github.com/postworthy/postworthy/blob/master/Postworthy.Tasks.Streaming/Program.cs

Coordinator
Aug 27, 2012 at 4:54 AM

Found the problem - I wasn't resetting the CloseStream flag, which prevented reusing the same instance. I checked in an update. Until I do a new release, the current workarounds are to either restart with a new instance, like you're doing, or set CloseStream to false.
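
For the wait between reconnect attempts, here is a minimal sketch of a capped, linearly growing delay. The `ReconnectBackoff` class is hypothetical and the step and cap values are placeholders to tune against Twitter's current guidance, not numbers taken from it:

```csharp
using System;

// Hypothetical helper: the delay grows by one step per consecutive failure and
// is capped at a maximum. Step and cap are assumed values chosen by the caller.
public sealed class ReconnectBackoff
{
    private readonly TimeSpan _step;
    private readonly TimeSpan _max;
    private int _failures;

    public ReconnectBackoff(TimeSpan step, TimeSpan max)
    {
        _step = step;
        _max = max;
    }

    // Call after each failed or dropped connection; returns how long to wait.
    public TimeSpan NextDelay()
    {
        _failures++;
        long ticks = Math.Min(_step.Ticks * _failures, _max.Ticks);
        return TimeSpan.FromTicks(ticks);
    }

    // Call once messages are flowing again so the next outage starts small.
    public void Reset()
    {
        _failures = 0;
    }
}
```

After sleeping for `NextDelay()`, the caller would create a new TwitterContext (or set CloseStream to false) and issue the stream query again, calling `Reset()` once the first message arrives.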

BTW, I got your code working pretty easy - just a few configuration file updates and it runs great.

Joe

Aug 27, 2012 at 5:38 PM

The process ran for over 3 days without issue (the longest prior to the current update was 11 hours). I am currently investigating the reason for the failure; at this point I am unsure why it did not reconnect. It is possible the problem is on my side, but I have not had enough time to dig into it yet.

I am glad it was easy to get up and running. I have been trying to make it as easy to set up as possible. The hope is to eventually have it as easy to set up as a WordPress blog. Any feedback you have on it would be greatly appreciated. At the moment I am spending most of my energy on the Postworthy.Tasks.StreamMonitor project, which I hope will turn into an extensible means to process the Twitter stream in different ways. The stock processing method will be content extraction (the end result seen on the Postworthy front page). But other possibilities include "Twitter bots" which monitor the stream and reply to keywords, or sentiment analysis (how do people feel about a subject). Your tool has been a big help, since I have not had to reinvent the wheel to connect to the Twitter stream through C#.

Sep 7, 2012 at 11:11 PM

I have now been running for over a week without the service going down or needing to be restarted.

Coordinator
Sep 8, 2012 at 1:40 AM

Good news - thanks for sticking with it and working with me on this. :)

Joe