I recently attended the PHPUK 2012 conference, where I went to see the talk “To a thousand servers and beyond: scaling a massive PHP application” by Nikolay Bachiyski. The talk was mostly about how wordpress.com scales to serve its massive load, but what prompted me to write this blog post is how wordpress.com does deployments.
There are two sides to WordPress. One is the blogging software that you can download and host on your own servers. The other is wordpress.com, where you can create an account and set up a blog on their servers. The two are developed and released separately. What we learned from the talk is that about 50 developers have commit access to the wordpress.com codebase, and between them they make about 100 commits to trunk a day.
Now the interesting part is that every commit to trunk is an actual deployment to the live platform. And it’s crazy fast: it takes them 8 seconds to deploy WordPress.com to 3 datacenters. Note that 100 commits equates to 100 deploys a day. And they don’t have a QA team, a test environment or a staging environment. Crazy if you ask me, but apparently it works for them. They serve hundreds of billions of pageviews and manage to keep the platform stable.
When asked, Nikolay explained that this is a much better strategy for them than going through two weeks of merging nightmares in which all new changes are merged into a stable branch. I think merge nightmares are just as extreme as ninja deployments from trunk. I believe in a balanced approach, and I think we’ve managed to achieve it at AOL with our own projects.
A Different Approach
We use an internally built tool that sits on top of SVN, tracks all the changes made to the different branches and makes it easy to move changesets from one branch to another. Every project repository has three branches: trunk, testable and stable. When a developer makes a commit, he uses a comment like this: “#123 > comment message”. This assigns the commit to a specific ticket number in our ticket system. If a dev needs to make 10 commits, he makes all of them against the same ticket number. Once he’s done, he uses the internal tool to mark that set of changesets as resolved.
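To make the commit message convention concrete, here is a minimal sketch of how a ticket number could be pulled out of a message like “#123 > comment message”. This is not our internal tool: the function name ticket_from_message is made up for illustration, and the exact pattern is my assumption based on the example above.

```shell
# Hypothetical helper, not part of the internal tool.
# Prints the ticket number from a message of the form "#123 > some text".
# Returns non-zero if the message does not follow the convention.
ticket_from_message() {
  printf '%s\n' "$1" \
    | sed -n 's/^#\([0-9][0-9]*\) > ..*$/\1/p' \
    | grep .
}

ticket_from_message '#123 > comment message'
```

A check like this could live in a pre-commit hook so that commits without a ticket number are rejected before they reach the repository.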
This is where the QAs can take those changesets and merge them into the testable branch when they feel ready to test. Again, they do it via our internal tool. The smart thing here is that the tool detects possible conflicts by dry-running the merge and warns you which tickets conflict with which. 95% of the time, conflicts happen because people try to merge newer changesets before older ones. Even then, it is often possible to merge ignoring the conflict without causing any trouble later on.
We try to maintain discipline and push things in the order they were developed. Still, conflicts do happen. It’s unavoidable. For that we have a separate role: a release manager, who is responsible for resolving these merge conflicts. They are usually very minor: the release manager quickly catches the dev responsible for the changeset and they work it out. The release manager is also the guy who controls what goes into stable and then deploys to live with the click of a button.
Before we had this tool we lived in the nightmare merge world. But no more. We’re actually managing to deliver continuously, deploying a few times a day. It also gives our QAs a controlled environment with only the changes they want. Yes, it takes an extra role, but that’s a minor cost for us considering the other two extremes. I believe this is a much more balanced approach that can and does make both the business owners happy and the developers less suicidal.
P.S. The tool described was developed by one of our developers. Last time I checked, he was seriously considering making it open source but wants to polish it a bit further first.
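The “older changesets first” rule can be sketched with plain svn commands. The snippet below is only an illustration of the idea, not how our tool is implemented: the function name dry_run_plan, the repository URL and the revision numbers are all placeholders. It sorts the revisions oldest first and prints one svn merge --dry-run command per revision, to be run from a working copy of the target branch. Keep in mind that a dry run changes nothing, so each command is checked against the working copy as it currently is, not against the result of the previous one.

```shell
# Hypothetical sketch: print dry-run merge commands, oldest revision first.
# $1 is the source branch URL, the remaining arguments are revision numbers.
dry_run_plan() (
  source_url="$1"
  shift
  for rev in $(printf '%s\n' "$@" | sort -n | uniq); do
    printf 'svn merge --dry-run -c %s %s .\n' "$rev" "$source_url"
  done
)

# Placeholder URL and revisions, given deliberately out of order.
dry_run_plan http://svn.example.com/repo/trunk 120 118 119
```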
When connecting to Subversion repositories over SSL, the SVN client checks that the server certificate has not expired, that its hostname matches the host of the repository and that the authority which signed the certificate is trusted.
If the certificate fails any of these checks, the SVN client will respond with a message such as this one:
Error validating server certificate for 'https://hostname:443':
 - The certificate is not issued by a trusted authority. Use the
   fingerprint to validate the certificate manually!
 - Hostname: hostname
 - Valid: from Tue, 16 Feb 2010 16:58:39 GMT until Fri, 14 Feb 2020 16:58:39 GMT
 - Issuer: company.com, London, Berkshire, GB
 - Fingerprint: d5:4e:d8:12:33:12:a5:f1:18:91:77:40:c4:77:3b:0b:f8:51:71:cd
(R)eject, accept (t)emporarily or accept (p)ermanently?
The certificate can still be accepted permanently by hand. That may not be a solution if SVN commands are issued by non-interactive processes, for example a PHP script run by Apache trying to export a branch from the repository.
Certificates signed by trusted authorities such as Verisign should not cause any problems. Self-signed certificates, however, will not be recognized by the SVN client, which will respond with the message above. The SVN client can be made to trust a self-signed certificate with the ssl-authority-files configuration option:
ssl-authority-files = /home/void/.subversion/company.crt
The configuration file that holds this option is named servers, and it can be stored in more than one location on the filesystem. The Subversion client first looks for it in the home folder of the user executing the SVN command. Users such as apache will most likely not have a home folder. In such cases SVN looks for the servers file in the /etc/subversion directory, which may or may not exist depending on the OS distribution. For example, it exists on Ubuntu but does not exist on CentOS, a flavour of RedHat.
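If neither location works for the user running the command, one option is to keep a dedicated configuration directory and point the client at it. The sketch below writes a minimal servers file and shows, in a comment, how svn could be told to use it. Everything specific here is an assumption for illustration: the temporary directory, the /etc/subversion/company.crt path and the repository URL are placeholders. Note that the option has to sit under a section such as [global] in the servers file.

```shell
# Hypothetical config directory; in practice pick a fixed path that the
# web server user can read.
CONFIG_DIR="$(mktemp -d)"

# Minimal servers file trusting a self-signed certificate.
# The certificate path is a placeholder.
cat > "$CONFIG_DIR/servers" <<'EOF'
[global]
ssl-authority-files = /etc/subversion/company.crt
EOF

# A non-interactive process could then run something like this
# (placeholder URL and target path):
#   svn export --non-interactive --config-dir "$CONFIG_DIR" \
#       https://hostname/repo/branches/example /tmp/example-export
```

The --config-dir and --non-interactive options are regular svn client options, so the PHP script only has to add them to the command it already runs.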
I was working on a small web application that creates Subversion branches and tags. In short, it just executes SVN commands on the repository. Whenever a user executes an SVN command, the SVN client checks that user’s home folder for the .subversion configuration directory. The issue I was running into was that, for some reason, apache’s home folder was pointing to our system administrator’s home folder, which resulted in a permission denied error whenever apache tried to access the .subversion folder.
It just didn’t make any sense. It turns out that if you start a service through /etc/init.d/, it starts with the environment variables of the user that started it. In this case our system administrator had started the service from his own account.
To start services in a clean environment, a special utility called service should be used. It usually resides in the /sbin directory. So, for example, instead of starting apache like this:
$ sudo /etc/init.d/httpd start
It should be started like this:
$ sudo /sbin/service httpd start
This results in the $HOME environment variable no longer pointing at the administrator’s home folder, so the SVN client does not get a permission denied error.
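The difference between an inherited and a clean environment is easy to see without touching apache at all. The sketch below is only a demonstration: /home/admin is a made-up path and the function names are mine. As far as I understand it, the service script on RedHat style systems does something similar by launching the init script through env -i, but treat that as my reading rather than a guarantee for every distribution.

```shell
# Inherited environment: the child process sees whatever HOME the
# launching user had (here a made-up administrator home folder).
inherited_home() {
  HOME=/home/admin /usr/bin/env | grep '^HOME='
}

# Clean environment: env -i starts the child with no variables at all,
# so HOME is simply not there.
clean_home() {
  env -i /usr/bin/env | grep '^HOME=' || echo 'HOME is not set'
}

inherited_home
clean_home
```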