Access to ManageFlitter was disabled for around 50 hours over Tuesday and Wednesday. For a couple of days before that, unfollows were not being completely processed. This blog post explains what occurred.
We had our first indication that something was wrong with the system on Friday afternoon when we received a couple of emails saying that unfollows were not being processed. We initially replied to these emails asking people to check the system again later as it can take some time for the request queue to complete. We have occasionally received emails like this in the past when there have not been issues.
Over the weekend we continued to receive emails from people saying the system was not always working for them. We ran a few tests and found nothing wrong. We didn’t see an increase in the error rate through the system, so there was no indication that our unfollow requests to Twitter were being dropped.
On Monday, after continued complaints from users, we did some more extensive testing and discovered that not all unfollows were appearing in Twitter’s pages. After digging into the API requests we were sending to Twitter, we noticed that strange results were being returned, with an error structure that was not documented for the requests we were making. We disabled access to the system while we sent a request to Twitter to find out what the problem was.
After a little digging on Twitter’s end, it turned out they had implemented some site-wide policies that rate limited API requests on a per-IP basis. These requests were previously unbounded. Twitter gave us no notice of this change and provided no documentation for it, which resulted in our extended downtime. Because the error structure was unrecognised, it took us longer than usual to respond.
Once we understood the cause, we quickly implemented a change that gave us distributed message queuing on API requests. This allows us to rate limit our unfollow and follow requests and balance them across several IPs.
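For readers curious about what balancing queued requests across several IPs can look like, here is a minimal in-process sketch. It is an illustration under stated assumptions, not the ManageFlitter implementation: the class names `IpBudget` and `Dispatcher`, the per-IP limit, and the window length are all hypothetical, and a real system would use a distributed queue rather than an in-memory one.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class IpBudget:
    """Tracks requests sent from one outbound IP within a sliding window."""
    address: str
    limit: int              # hypothetical per-IP cap per window
    window_seconds: float   # hypothetical window length
    sent_at: deque = field(default_factory=deque)

    def has_capacity(self, now: float) -> bool:
        # Forget sends that have aged out of the window.
        while self.sent_at and now - self.sent_at[0] >= self.window_seconds:
            self.sent_at.popleft()
        return len(self.sent_at) < self.limit

    def record(self, now: float) -> None:
        self.sent_at.append(now)


class Dispatcher:
    """Queues requests and hands them to IPs round-robin, within each IP's budget."""

    def __init__(self, budgets):
        self.budgets = list(budgets)
        self.queue = deque()
        self._next = 0  # index of the IP to try first on the next pick

    def enqueue(self, request) -> None:
        self.queue.append(request)

    def drain(self, now: float):
        """Return (ip, request) pairs that can be sent right now."""
        assigned = []
        while self.queue:
            budget = self._pick(now)
            if budget is None:
                break  # every IP is at its limit; the rest stay queued
            budget.record(now)
            assigned.append((budget.address, self.queue.popleft()))
        return assigned

    def _pick(self, now: float):
        count = len(self.budgets)
        for offset in range(count):
            index = (self._next + offset) % count
            if self.budgets[index].has_capacity(now):
                self._next = (index + 1) % count
                return self.budgets[index]
        return None
```

Calling `drain` periodically with the current time sends whatever fits within each IP's budget and leaves the remainder in the queue, so a burst of requests is delayed rather than rejected by the upstream limit.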
After we brought the system up 24 hours ago, we saw a large spike in traffic, which meant that some unfollows took around an hour to process. All unfollows from the past 24 hours were processed successfully. We’ve also now spun up some additional servers, so future requests should consistently be processed even faster. We will work to display the estimated time it will take for an unfollow to be processed in the ManageFlitter UI, as we know it’s confusing when you request an unfollow but it doesn’t show up in your account.
Unfortunately, we rely on Twitter to notify us of these changes and to process our requests in a consistent manner. When changes are made without notification, it can take us some time to respond. We have improved our error logging and distributed our requests in order to help keep the system running smoothly.
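One way to catch an undocumented error structure sooner is to log any response that matches neither the expected success shape nor a known error shape, instead of letting it pass silently. The sketch below is only an illustration of that idea: the key names in `EXPECTED_SUCCESS_KEYS` and `KNOWN_ERROR_KEYS` and the function `classify_response` are hypothetical and are not taken from Twitter's API documentation.

```python
import json
import logging

logger = logging.getLogger("api_responses")

# Hypothetical: top-level keys that mark a successful response.
EXPECTED_SUCCESS_KEYS = {"id"}
# Hypothetical: top-level keys of the error shape already handled.
KNOWN_ERROR_KEYS = {"errors"}


def classify_response(body: str) -> str:
    """Return 'ok', 'known_error', or 'unrecognised' for a raw response body."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        logger.warning("Unparseable response body: %r", body[:200])
        return "unrecognised"
    if not isinstance(payload, dict):
        logger.warning("Unexpected top-level type: %s", type(payload).__name__)
        return "unrecognised"
    keys = set(payload)
    if KNOWN_ERROR_KEYS & keys:
        return "known_error"
    if EXPECTED_SUCCESS_KEYS <= keys:
        return "ok"
    # Neither shape matched: log it so the change is noticed quickly.
    logger.warning("Unrecognised response structure, keys=%s", sorted(keys))
    return "unrecognised"
```

Counting the `unrecognised` results alongside ordinary errors means a change like this one would show up in the error rate rather than only in user reports.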
We apologise for this extended period of downtime and instability. We understand it is frustrating and many of you use ManageFlitter on a daily basis. We plan to identify and respond to these changes faster in the future.
James Peter – CTO