Continuous Delivery / Ninja Deployments
Recently I’ve attended the PHPUK 2012 conference where I went to see a talk “To a thousand servers and beyond: scaling a massive PHP application” by Nikolay Bachiyski. The talk itself was more about how wordpress.com scales to serve it’s massive load but what got me interested to write this blog post is how wordpress.com does deployments.
There are two parts about WordPress. One is the blog that you can download and host on your own servers. The other is where you can create an account on wordpress.com and set up a blog on their own servers. These two are developed and released separately. What we’ve found out from the talk is that about 50 developers have access to wordpress.com codebase and can make changes and they do about 100 commits to trunk a day.
Now the interesting part is that every commit to trunk is an actual deployment to the live platform. And it’s super crazy fast. It takes 8 seconds for them to deploy WordPress.com to 3 datacenters. Note 100 commits equates to 100 deploys a day. And they don’t have a QA team, a testable environment or a stage environment. Crazy if you ask me but apparently it works for them. They serve hundreds of billions of pageviews and manage to keep the platform stable.
When asked Nikolay explained that it’s a much better strategy for them than going into 2 weeks of merging nightmares where all new changes are merged into a stable branch. I think that merge nightmares is as extreme as ninja deployments from trunk. I do believe in a balanced approach and think we’ve managed to achieve it at AOL with our own projects.
A Different Approach
We use an internally made tool which tracks on top of SVN all the changes made to different branches and allows to easily move those changesets from one branch to another. With every project repository we have three branches: trunk, testable and stable. Once a developer wants to make a new commit he would commit with a comment like this: “#123 > comment message” and this will assign a commit to a specific ticket number in our ticket system and do the commit. If a dev needs to make 10 commits he would do all of them against the same ticket number. Once he’s done he uses the internal tool to mark the set of changesets he made as resolved.
This is where the QA’s can now take all of those changesets and try and merge them into a testable branch when they feel they’re ready to test. They again do it via our internal tool. The smart thing here is that the tool detects all possible conflicts by dry running the merge and warning you which tickets conflict with which tickets. 95% of the time if conflicts happen is because people try to merge newer changesets first rather than merging older changesets first. Even then a lot of times it’s possible to merge ingnoring the conflict which does not cause any trouble later on.
We try to maintain discipline and push things in the order they were developed. Still conflicts do happen. It’s unavoidable. But for that we have a separate tole: a release manager. Who is responsible for solving these merge conflicts and usually they’re very minor, they quickly catch a dev responsible for the changeset and work it out. The release manager is also the guy who controls what goes into stable and then deploys to live with a click of a button.
Before we had this tool we lived in the nightmare merge world. But no more. We’re actually managing to deliver continuously deploying few times a day. It also allows our QA’s to have a controlled environment with only the changes they want. Yes it takes an extra role but that’s a minor cost for us considering the other two extremes. I believe this is a much more balanced approach that can and does make both the business owners happy and the developers less suicidal.
p.s The tool described is developed by one of our developers and last time I checked he seriously considered to make it opensource but want’s to polish it a bit further first.