A common problem developers have is how to run through a large number of tasks in a short amount of time. For instance, sending out a nightly notification to all your users, calculating your user’s bills every month or updating their Facebook graph data. These types of things are very common and almost every application needs to do it and it becomes more and more of a problem as your user base grows.
I was just helping a customer today that was using Heroku workers at full worker capacity (24) and it was taking ~9 hours per day to go through his user database of 200,000 users and send out notifications. A quick calculation explains why:
((200,000 users * 3.5 seconds per task) / 3600 seconds/hour) / 24 workers processes = 8.1 hours
We took it down to less than 9 minutes. Here’s how:
The easy (naive?) way would just be to queue up 200,000 tasks on IronWorker. This would have worked just fine, but it would have taken a while just to queue up the tasks and the setup/teardown time of a task would be wasteful when the task only takes 3.5 seconds to run. Instead, we had each task process 100 users which is now only 2000 tasks. Each task should take approximately 3.5 * 100 = 350 seconds = ~6 minutes to run and we can run them all at pretty much the same time, but I’ll be conservative and adjust a bit for IronWorker capacity.
((200,000 users * 3.5 seconds) / 3600 seconds/hour) / 2000 worker processes * 1.5 capacity adjustment = 0.146 hours = 8.75 minutes
Batching up the tasks into batches of 100 was easy, here’s sample code:
IronWorker gives you super simple access to huge compute power and you don’t even need to think about a server. This customer will never have to worry about scaling this part of his application again.