Untappd provides a mobile check-in application for beer lovers. Their application has been downloaded by over a million users and on any given night, they can register over 300,000 individual check-ins and social sharing events.
The Untappd application lets users record their beer selections, share their likes with friends, win rewards, get recommendations, and participate in a shared passion for beer with others around the world. A solid design, a fun set of features, and a responsive application are just a few of the reasons they’re one of the fastest-growing entertainment-based social communities.
What’s even more impressive about Untappd is that it’s just a two-person company – a designer and a developer, both of whom have other jobs and are doing Untappd on the side. This is a story on how they got started and the work that goes on behind the scenes to power their success.
The Untappd Story
Greg Avola and Tim Mather met over Twitter six years ago when Greg was looking for a collaborator for a Twitter mobile app. They ended up working together on the app and then proceeded to take on several other projects as a designer/developer combination. In early summer of 2010, Tim came up with the idea for a check-in system for beer drinkers. The idea mapped well with Greg’s interest in beer and so they quickly created a mobile app and got to market by the fall.
The Untappd app has evolved a lot since the early days but the main premise is the same – users check in at locations and check in the beers they’re drinking. For each check-in, they become eligible to win badges and receive promotions. They can also get real-time recommendations for beers based on their location.
The team works closely with breweries and beer venues to increase the connections that users have with their favorite beers. They help breweries and other partners create badges and other promotional elements for beers and events. The badges are hugely popular and are posted and shared widely within the app and across social media.
Given how important up-to-date information about beers is, they’ve created what will soon become one of the largest open-source databases on beers in the world. It’s moderated by over 40 volunteers who help clean up information and de-duplicate entries. They offer free API access for developers and have the ultimate goal of making it the most widely used library about beer.
Untappd has over a million registered users, services over 300,000 check-ins on weekend nights, and has processed over 50M events (the majority of them using Iron.io). Users love the Untappd app and use it to keep track of their beer, discover new favorites, meet new people, and find new places of interest.
The Untappd app is a model that works – a fanatical user base, an app that provides rewards, hundreds of happy partners, and an almost limitless opportunity.
Checking In a Beer
Behind the Scenes of the App
The app framework for Untappd is that of a mobile client, a set of app servers connected to databases, and a large async/background processing component. They make use of a LAMP stack with PHP serving as their primary language. They use MySQL as their primary database for transactions, MongoDB for their recommendation engine and activity feeds, Redis to store all the counts for beer/user/brewery/venue, and Iron.io for their background processing and as their mobile compute engine.
When users check in to Untappd, there are a number of transactional events that take place. The user account gets updated and the check-in gets posted to Twitter, Facebook, and/or Foursquare. If a photo is uploaded, it gets processed. Check-in parameters get filtered for location and venue and then piped into the MongoDB clusters that power their local recommendation capability. All in all, there can be up to 10 different events taking place for each location or beer check-in.
Initially, the check-in processes were being handled as a large batch job after hours at night. Because actions were being posted well after the actual event, the check-in process wasn’t responsive enough for their users. The Untappd team then moved these actions to the check-in response loop. That lasted for a little while as it resulted in a more responsive check-in but it quickly showed signs of strain. On heavy nights, the Untappd main app servers would start to melt because they were being used to process all the actions for each check-in, in addition to serving pages and providing query responses.
This tightly coupled serial approach also resulted in users having to wait for each process to start and finish in sequence. The delayed response times began having noticeable impacts on engagement. It was taking much longer to check in as the app wouldn’t return for several seconds at a time. Users were getting frustrated and so they were not checking in for the second beer or the third.
Serial Processing Events at Check In = Slow Response Times
The general experience was also not feeling real-time enough for users because they wouldn’t see tweets until much later, and the information they were receiving from the app after a check-in was not as relevant as they might expect. Recommendations for other beers, for example, were out of date because the database wasn’t getting new beer inserts in a timely manner, and notifications of nearby trending places were not being sent out quickly enough to be relevant.
To keep user engagement high and their user base growing, they needed to find a solution to their check-in problem. They turned to Iron.io to do so.
To make their application more responsive and scalable, Untappd moved their event processing to Iron.io, using a combination of IronMQ and IronWorker. Each check-in event is sent to a queue within IronMQ and then routed to workers within IronWorker. The processing runs outside the user response loop, which speeds up check-ins and provides the ability to handle spikes in traffic.
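The idea of moving work out of the response loop can be illustrated with a minimal sketch. It uses Python's standard `queue` and `threading` modules as an in-memory stand-in for a hosted queue and worker service; it does not use the IronMQ or IronWorker APIs, and the names `handle_checkin`, `worker`, and `run_demo` are hypothetical.

```python
import queue
import threading

def handle_checkin(events, user_id, beer_id):
    """Request path: enqueue the event and return without doing the slow work."""
    events.put({"user_id": user_id, "beer_id": beer_id})
    return {"status": "ok"}

def worker(events, processed):
    """Background path: drain events outside the user response loop."""
    while True:
        event = events.get()
        if event is None:  # sentinel value tells the worker to stop
            events.task_done()
            break
        # Placeholder for the slow actions (social posts, photo processing, ...).
        processed.append((event["user_id"], event["beer_id"]))
        events.task_done()

def run_demo():
    events = queue.Queue()  # in-memory stand-in for a hosted message queue
    processed = []
    thread = threading.Thread(target=worker, args=(events, processed))
    thread.start()
    response = handle_checkin(events, 42, "ipa-001")  # returns right away
    events.put(None)
    events.join()   # wait until every queued item has been marked done
    thread.join()
    return response, processed
```

The point of the design is that `handle_checkin` only pays the cost of one enqueue, so the response time no longer depends on how many follow-up actions a check-in triggers.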
Using Iron.io, Untappd has been able to reduce the average check-in time from over 7 seconds to 500ms. They’ve also eliminated the need to manage infrastructure for this part of their app and given themselves an almost unlimited ability to scale their processing.
Continual Event Processing
The way the event flow works is that they put a check-in event onto a single queue and then that fans out to multiple queues – with each sub-queue controlling a different action, such as posting to social media or updating the recommendation engine. Multiple workers spin up soon after a check-in happens and so by the time the user has laid down their phone and sampled their beer, every action is either in process or has completed.
There are 4 main processing flows for Untappd events. General tasks handle the basic parts of Untappd; Social tasks push messages to social media for check-ins; SocialBadge tasks push to social media for badges earned within the service; and PostPush tasks handle push notifications.
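The fan-out from a single queue to one sub-queue per processing flow might look roughly like the sketch below. The flow names come from the text; the queue layout, the `checkins` key, and the functions `make_queues` and `fan_out` are assumptions for illustration, with `collections.deque` standing in for real message queues.

```python
from collections import deque

# The four processing flows named in the text.
FLOWS = ("General", "Social", "SocialBadge", "PostPush")

def make_queues():
    """Build one main queue plus one sub-queue per flow (hypothetical layout)."""
    queues = {"checkins": deque()}
    for flow in FLOWS:
        queues[flow] = deque()
    return queues

def fan_out(queues):
    """Copy every event on the main queue onto each per-flow sub-queue."""
    moved = 0
    while queues["checkins"]:
        event = queues["checkins"].popleft()
        for flow in FLOWS:
            # Each flow gets its own copy, tagged with the flow that owns it.
            queues[flow].append(dict(event, flow=flow))
        moved += 1
    return moved
```

This sketch copies every event to every flow for simplicity. A real router would presumably be selective, for example sending an event to SocialBadge only when a badge was earned.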
Untappd employs master tasks to control and manage the scaling of each flow. Each master task runs every 60 seconds as a scheduled job and calculates the number of open jobs left on MQ for each type of event. If the queue is large, the task will spin up new workers on demand to deal with the load. When a worker runs, it will poll a queue based on its job type, get the event payload, execute the job, delete the message, and then repeat. Each worker will run for a set duration ranging from seconds to minutes.
This structure gives them immediate flexibility to handle high traffic loads but also to kill off worker jobs when loads are light. In terms of concurrency, they might have 50-100 tasks running at once per processing flow at peak times, scaling down to 8-10 during slow periods.
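As a rough sketch of the master-task and worker pattern described above: one function turns queue depth into a target worker count, and another runs the poll, execute, delete, repeat loop for a bounded duration. The function names, the `jobs_per_worker` ratio, and the 30-second default are assumptions; the worker bounds of 8 and 100 are only loosely based on the concurrency figures in the text.

```python
import math
import time
from collections import deque

def workers_needed(open_jobs, jobs_per_worker=50, min_workers=8, max_workers=100):
    """Master-task step: turn the number of open jobs into a target worker count."""
    wanted = math.ceil(open_jobs / jobs_per_worker)
    # Clamp to the allowed range so the flow never scales to zero or runs away.
    return max(min_workers, min(max_workers, wanted))

def run_worker(jobs, execute, max_seconds=30.0, clock=time.monotonic):
    """Worker loop: poll, execute, delete, repeat until empty or out of time."""
    deadline = clock() + max_seconds
    done = 0
    while jobs and clock() < deadline:
        payload = jobs[0]   # poll: read the next message without removing it
        execute(payload)    # run the job for this payload
        jobs.popleft()      # delete the message only after the job returned
        done += 1
    return done
```

Deleting the message after the job runs, rather than before, means an exception in `execute` leaves the message on the queue so it can be retried.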
Processing Events Concurrently = Fast Response Times
The local recommendations feature was rolled out at the beginning of 2014 and became an instant hit. One of the biggest problems with making recommendations on beer is that users are drinking it in locations around the world. Untappd can't recommend a beer to someone in San Francisco when it’s only available in New York City. It’s not a relevant recommendation and so all it does is distract from the experience.
To provide recommendations that are up-to-date, Untappd uses a combination of IronWorker and MongoDB. They insert the GPS locations for all the checked-in beers into a database at the point of check-in. When other users check in, Untappd uses all of the fresh inserts to validate the recommendations they’re returning for subsequent check-ins. Prior to using IronWorker, the recommendations were taking too long to return and so they were not of much value to the user. With IronWorker, the process is much more real-time and the recommendations much more relevant.
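One way to picture the location check is a distance filter over recent check-ins. The sketch below is an in-memory illustration only: the haversine formula is standard, but the record fields, the 25 km radius, and the functions `haversine_km` and `nearby_beers` are assumptions, and it says nothing about how Untappd actually queries MongoDB.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two latitude/longitude points, in km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearby_beers(candidates, recent_checkins, lat, lon, radius_km=25.0):
    """Keep only candidate beers that were recently checked in near (lat, lon)."""
    local = {
        c["beer_id"]
        for c in recent_checkins
        if haversine_km(lat, lon, c["lat"], c["lon"]) <= radius_km
    }
    # Preserve the ranking order of the original candidate list.
    return [beer for beer in candidates if beer in local]
```

With a check-in near San Francisco and another near New York City, a user in San Francisco would only be shown the first beer, which matches the example in the text.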
The End Result
The combination of IronMQ and IronWorker has saved Untappd hundreds of hours of development time and greatly increased their ability to release new features and build new capabilities. It has also given them a very reliable and scalable processing framework. Untappd just pushes events to IronMQ and the Iron.io platform takes care of everything else. The Untappd team can concentrate on building their app and serving their customers, plus they get to relax on even their busiest nights.
"For a small development team and as someone who wants to enjoy their Friday and Saturday nights, I like that I don't have to worry about whether to scale more servers. It's done automatically by Iron.io, which is key for us and obviously why we love the platform."
- Greg Avola, CTO & Co-Founder, Untappd
What Greg Avola, CTO/Co-Founder of Untappd, says about their business and development structure and how Iron.io helps power their asynchronous processing.
First off, can you tell us how advertising and promotions fit into the application and the growth of the community?
From day one Tim and I have always been down on traditional advertising approaches. We think it lessens the experience. You need to do things in a better way. In our model, Untappd provides you with a communication channel around a very social activity like beer drinking and then users add to this channel in a number of key ways – from competitive things like collecting badges to just sharing or achieving things together.
Badges are similar to ads but better: people are posting and sharing these items on their own. When’s the last time someone saw an ad and shared it on Twitter with their friends? Nobody does that. People tend to get pretty obsessed with badges which is why we’re continually evolving ways for people to get them and then share them. It's a different take on advertising, for sure, but one that we think is a heck of a lot more sustainable.
You’ve moved from a monolithic, tightly bound application to a distributed system that’s highly scalable. How does that help you?
Doing things in a more distributed, asynchronous manner means you're able to develop features much more quickly because you're working on just that feature and everything else is still working as it did. It's also just a great way to modify the app without having a whole production come down on us if we make a mistake.
If I want to tweak the recommendations setting a bit, I just go into that task, adjust it, push it through some QA testing, and then push it into production. Or if I want to do something new – say I wanted to perform a large task such as pulling all the check-ins for a single day into a separate table and perform some complex analysis. All I have to do is fire up a worker or alter one of the ones we have and we’re good to go. You can do it independently knowing you aren't going to affect any production actions or workflows.
How has this type of architecture changed your development approach?
You greatly modularize your thinking. You can figure out which tasks can be separated so that when you get a chance to rev something, you're not going into the entire app and changing a few lines and then pushing the entire app out. You're really changing it in the one task and then just pushing that out. There's less risk involved and greater speed. I've extracted pretty much everything that's part of the main application's check-in process into workers. I’ve minimized that scope and I’ve minimized the risk.
It has helped even from a code analysis perspective. I'm a very visual person so when I go into the Iron.io dashboard, I can see all the worker code that’s in there. I know which ones are part of which tasks. I can inspect them and modify them without affecting the app in full. It’s sped up our development for sure.
An analyst we know is a fan of Untappd and uses your application as an example in his slide presentation about mobile development. Can you explain how you’re able to gain leverage in the market with such a small team?
We have to use all the tools we can to help us develop because we're already at a disadvantage against other companies that have way more funding and any number of developers. I'm the only one writing back-end code, so if there's anything I can use to help me develop faster, more efficiently, and more effectively, I'm going to take that opportunity. Obviously, the tools that I've been able to use from Iron.io have been able to help me significantly with that. Sounds like a commercial but it's really true.
What are some of your favorite things about working with Iron.io?
Given the spiky nature of our usage, one of the big benefits is the flexibility Iron.io gives us in terms of scaling. We’ve provisioned things well across our application stack and have a pretty good setup. At the same time, we always need to expand to handle user growth or deal with busy periods. With Iron.io, we know our system will not be impacted by these spikes. The resources are there and our task concurrency can be increased whenever we need it. It’s this flexibility that helps us maintain a consistent level of service without having to physically watch and manage all the things within our system.
Another thing I enjoy about working with Iron.io is the chat system that's pretty much manned 24 hours, seven days a week. As a developer working on things in production, I don't have time to use email support, fill out forms, or whatever to get someone to help me. I can hop into your support channel if we’re running into a problem or seeing an issue and get a quick answer. The documentation is great, but it's even better to get a real person on the line and get the answer right away. That's very important to us.
Weekend nights are your most active periods and yet you talk about being able to enjoy your evenings. Can you explain more about this application peace of mind?
This is a key point for us in using Iron.io. Of course I can easily install packages on my own servers but then I have to manage a lot of infrastructure and I don't want to do that. Being able to use the right tools to help us excel in our development practices, without having to manage ops, is a major thing for us.
If we’re worrying about ops all the time, we can't build the product. If we're worrying about building infrastructure, we're not worrying about building the best experience we can for our users.