Rate Limits and Followers

Sep 6, 2011 at 1:16 PM

Joe and the L2T group,

Knowing that the rates (either 150 hits per hour or 350 hits per hour, depending on your authentication method) can change at Twitter's whim, is there an easy way to figure out whether you can pull back all followers for a screen name within the max hits per hour?  I'm having to follow a couple of Twitter accounts and would like to know if I'll be able to pull back the followers within these hit rates, assuming I'm able to pull back 200 followers at a time.  I'm making two Twitter calls: one to get the IDs of the followers and another to get their screen names.

Best I can tell, assuming I can do everything with one hit per query, I'd be limited to 175 hits per hour to get the IDs and another 175 hits to get the screen names (splitting the 350-hit limit between the two API calls), which looks like a max of 35K followers per hour.  Does the math work, or am I missing something?
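
For reference, here's the back-of-envelope calculation I'm working from as a quick C# sketch (the 350 and 200 figures are just my assumptions from above, not guaranteed API values):

using System;

class RateLimitEstimate
{
    static void Main()
    {
        // Back-of-envelope throughput estimate using the numbers assumed above;
        // these are not guaranteed API limits and can change at Twitter's whim.
        const int hitsPerHour = 350;                    // authenticated rate limit
        const int hitsForScreenNames = hitsPerHour / 2; // the other 175 go to the follower ID calls
        const int followersPerLookup = 200;             // assumed followers returned per lookup call

        int maxFollowersPerHour = hitsForScreenNames * followersPerLookup;
        Console.WriteLine(maxFollowersPerHour);         // 35000
    }
}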

Thanks,

Richard

Coordinator
Sep 6, 2011 at 3:04 PM

Hi Richard,

Your numbers look reasonable.  However, I'm not certain they'll hold up in practice, because 35K is a theoretical max rather than what you'll actually see; network traffic delays, outages, and dynamic rate limiting could result in something much less.

While a brute-force approach is easier to code, you might want to consider designs that would help you scale better.  I don't know if these tips will work in your situation, but they might give you some ideas on how to approach your own design:

  1. Determine how much staleness you can tolerate in your data, i.e. can it be 1 minute, 1 hour, or x days old? A related question is how often a given piece of data is used.
  2. Cache the follower ID list so that you aren't constantly requesting the same data (see the sketch after this list).
  3. Cache user records so you don't have to constantly request the same user data.
  4. Estimate the frequency of each user's tweets to adapt how often you re-request their data.
  5. Examine the APIs to see if you can trim off data you don't need (perhaps because it's already cached), e.g. Status.Home has a TrimUser parameter.
  6. Consider UserStreams if you need immediate updates.  SiteStreams is in beta right now, but would also be a potential option.
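
For items 1 through 3, here's a minimal sketch of what an in-memory cache with a staleness window might look like. FollowerIdCache and the fetch delegate are just illustrations, not part of L2T; you'd pass in whatever L2T query you use to pull the IDs.

using System;
using System.Collections.Generic;

// A minimal in-memory cache for follower ID lists, keyed by screen name.
// The fetch delegate stands in for whatever L2T query you use to pull IDs.
class FollowerIdCache
{
    class Entry
    {
        public DateTime FetchedAt;
        public List<string> Ids;
    }

    readonly Dictionary<string, Entry> cache = new Dictionary<string, Entry>();
    readonly TimeSpan maxAge;
    readonly Func<string, List<string>> fetchFollowerIds;

    public FollowerIdCache(TimeSpan maxAge, Func<string, List<string>> fetchFollowerIds)
    {
        this.maxAge = maxAge;
        this.fetchFollowerIds = fetchFollowerIds;
    }

    public List<string> GetFollowerIds(string screenName)
    {
        Entry entry;

        // Only hit Twitter when the cached list is missing or older than maxAge.
        if (cache.TryGetValue(screenName, out entry) &&
            DateTime.UtcNow - entry.FetchedAt < maxAge)
        {
            return entry.Ids;
        }

        entry = new Entry { FetchedAt = DateTime.UtcNow, Ids = fetchFollowerIds(screenName) };
        cache[screenName] = entry;
        return entry.Ids;
    }
}

The same idea applies to item 3, just with user records instead of ID lists.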

I'm sure this isn't an exhaustive list, but it might give you some ideas on how to minimize the impact of rate limits.

Joe

Sep 6, 2011 at 10:01 PM

Thanks Joe,

I have been playing with some designs around caching the list of users on my side.  The issue I see is that there doesn't appear to be any way to get only the 'new' followers added since a point in time, so, as far as I can tell in L2T and the native Twitter API, I'm kind of stuck having to 'brute force' the full list of members.  Am I missing something somewhere in L2T or the APIs?

Thanks,

Richard

Coordinator
Sep 6, 2011 at 10:45 PM

I don't see a way to query Twitter for only new followers, so it looks like requesting the entire list of IDs will be necessary.  Once you have the list, you can do something like:

var newFollowers = newList.Except(savedList).ToList();
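
As a slightly fuller sketch of that diff approach, with LoadSavedIds/SaveIds as hypothetical placeholders for however you persist the previous snapshot:

using System.Collections.Generic;
using System.Linq;

// Sketch of the diff approach: compare the freshly fetched ID list against the
// previously saved snapshot to find followers gained and lost since the last run.
// LoadSavedIds/SaveIds are placeholders for your own persistence.
static class FollowerDiff
{
    public static void Process(List<string> newList)
    {
        List<string> savedList = LoadSavedIds();

        var newFollowers = newList.Except(savedList).ToList();   // gained since last run
        var lostFollowers = savedList.Except(newList).ToList();  // unfollowed since last run

        // ... act on newFollowers/lostFollowers here ...

        SaveIds(newList);  // the current list becomes the next run's baseline
    }

    static List<string> LoadSavedIds() { /* read from your store */ return new List<string>(); }
    static void SaveIds(List<string> ids) { /* write to your store */ }
}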

Joe

Sep 7, 2011 at 1:39 AM

Joe,

I was thinking about something along those lines.  One question...does the call to get the IDs of the followers return 5000 IDs at a time like the native Twitter API does?  Also, do you have any code samples that demonstrate looping through the calls and checking the Rate Limits as you loop?

Thanks,

Richard

Coordinator
Sep 7, 2011 at 4:00 AM

You can do a search through the LinqToTwitterDemo project for "Cursor" to find an example - I think there are some in the SocialGraph demos.  Checking rate limits is a matter of accessing the rate limit properties on the TwitterContext instance.  There's also a Headers property (on TwitterContext) that's updated on every request.
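
Roughly, the loop looks something like the sketch below, pieced together from memory of the SocialGraph demos; treat the exact names (SocialGraphType.Followers, IDs, CursorMovement.Next, RateLimitRemaining, RateLimitReset) as assumptions to verify against the demos and your L2T version.

using System;
using System.Collections.Generic;
using System.Linq;
using LinqToTwitter;

// Rough sketch of paging through follower IDs with cursors while watching the
// rate limit properties on TwitterContext. Property/enum names are from memory
// of the SocialGraph demos and may differ slightly in your L2T version.
class FollowerIdPager
{
    public static List<string> GetAllFollowerIds(TwitterContext twitterCtx, string screenName)
    {
        var allIds = new List<string>();
        string cursor = "-1";  // -1 asks Twitter for the first page

        do
        {
            var page =
                (from graph in twitterCtx.SocialGraph
                 where graph.Type == SocialGraphType.Followers &&
                       graph.ScreenName == screenName &&
                       graph.Cursor == cursor
                 select graph)
                .SingleOrDefault();

            if (page == null) break;

            allIds.AddRange(page.IDs);          // up to 5000 IDs per page
            cursor = page.CursorMovement.Next;  // "0" means no more pages

            // The rate limit properties are refreshed from the headers of each response.
            if (twitterCtx.RateLimitRemaining == 0)
            {
                Console.WriteLine("Rate limit reached; resets at (epoch) " + twitterCtx.RateLimitReset);
                break;  // or sleep until the reset time and continue
            }
        }
        while (!string.IsNullOrEmpty(cursor) && cursor != "0");

        return allIds;
    }
}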

Joe

Sep 7, 2011 at 6:37 AM

I am using cursors currently.  My question is on the SocialGraph.Followers...does it return 5000 IDs per cursor like the native API does?  The documentation doesn't really say.

Coordinator
Sep 7, 2011 at 1:28 PM

Yes, it will.  Generally, LINQ to Twitter will do exactly what the API does.

Joe