At GitHub, we use a variant of the Flow pattern to deploy changes: new code is always deployed from a pull request branch, and merged only once it has been confirmed in production.
master is our stable release branch, so everything on master is considered production-ready code. If a branch deploy ships bad code containing a bug or performance regression, it is rolled back by deploying the latest master to production.
Using this workflow, engineers at GitHub deploy changes to our website several hundred times every week.
All deployments happen in chat via Hubot commands, which ensures that everyone in the company (from development to operations to support) has visibility into changes that are being pushed into production.
Deployments are reported to the GitHub API and show up in the timeline on corresponding pull requests.
Recent deployments are also available through chat.
Over the years, we’ve built a number of deployment features into Hubot and Heaven (our Capistrano-based deployment API) to help streamline our process. Below are some of our favorites.
During peak work hours, multiple developers are often trying to deploy their changes to production. To avoid confusion and give everyone a fair chance, we can ask Hubot to add us to the deployment queue.
We can also check the status of the queue, to deploy directly if it’s empty or to find a less busy time if it’s looking particularly full.
We can also unqueue ourselves if something comes up and we have to step away from the computer.
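As a rough illustration of the queueing behavior described above, here is a minimal Python sketch of a first-in, first-out deploy queue. It is not Hubot's actual implementation; the class and method names (DeployQueue, queue, unqueue, status, next_up) are hypothetical and chosen only to mirror the chat interactions.

```python
from collections import deque


class DeployQueue:
    """FIFO queue of developers waiting to deploy (hypothetical model)."""

    def __init__(self):
        self._waiting = deque()

    def queue(self, user):
        """Add user to the back of the queue; return their 1-based position."""
        if user not in self._waiting:
            self._waiting.append(user)
        return self._waiting.index(user) + 1

    def unqueue(self, user):
        """Remove user from the queue; return True if they were in it."""
        try:
            self._waiting.remove(user)
            return True
        except ValueError:
            return False

    def status(self):
        """Return the waiting users in deploy order."""
        return list(self._waiting)

    def next_up(self):
        """Pop and return the next user to deploy, or None if empty."""
        return self._waiting.popleft() if self._waiting else None
```

Queueing twice is treated as a no-op here so that a developer keeps their original position rather than moving to the back.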
To ensure bad code cannot make it to production, Hubot won’t let us deploy a branch until continuous integration tests have run. This prevents trigger-finger deploys while CI is still running.
Similarly, if CI completed but our branch failed some tests, Hubot will prevent us from deploying.
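The CI guard boils down to a small decision: pending and failed builds both block a deploy, for different reasons. The sketch below assumes a three-state CI status; the CIStatus enum and can_deploy function are hypothetical names, not part of Hubot.

```python
from enum import Enum


class CIStatus(Enum):
    PENDING = "pending"
    SUCCESS = "success"
    FAILURE = "failure"


def can_deploy(ci_status):
    """Return (allowed, reason) for a branch with the given CI status."""
    if ci_status is CIStatus.PENDING:
        return False, "CI is still running"
    if ci_status is CIStatus.FAILURE:
        return False, "CI failed"
    return True, "CI passed"
```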
Since master is our stable release branch, we want to ensure that any branches being deployed are caught up with the latest code in master. Before proceeding with a deployment, Hubot will detect if our branch is behind and automatically merge in master if required.
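One way to think about "behind" is in terms of commit ancestry: a branch is caught up when the tip of master is reachable from the tip of the branch. The sketch below models the commit graph as a plain dictionary of parent links; the function names and the toy graph are assumptions for illustration, and we make no claim about how Hubot performs this check.

```python
def ancestors(commit, parents):
    """Return commit plus every commit reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def is_behind(branch_head, master_head, parents):
    """A branch is behind if master's tip is not in the branch's history."""
    return master_head not in ancestors(branch_head, parents)


# Toy history: a <- b <- c is master; f forks from a; m merges c into f.
history = {"a": [], "b": ["a"], "c": ["b"], "f": ["a"], "m": ["f", "c"]}
needs_merge = is_behind("f", "c", history)        # f has not seen b or c
caught_up = not is_behind("m", "c", history)      # m includes master's tip
```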
To ensure deployments are visible to the rest of the team, Hubot forces us to deploy from specific chat rooms.
In rare emergency situations, it is possible to override these guards using
As soon as a branch is deployed, Hubot locks the environment so no other branches can be deployed to it. This prevents others from accidentally deploying while a developer is testing their branch.
Once we’ve merged our branch, Hubot will automatically unlock the environment and let the next person in the queue know they can deploy.
We can also manually unlock deploys to let someone else have a turn, when we decide not to merge our branch just yet.
Finally, during outages, attacks, and other emergency situations, we can lock deployments manually to prevent changes while we investigate problems.
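The locking rules above can be modeled as a small state machine: a deploy takes the lock, a merge of the locked branch releases it and hands off to the next person waiting, and a manual lock has no branch attached. This is a simplified sketch; DeployLock and its methods are hypothetical names rather than Hubot internals.

```python
class DeployLock:
    """Tracks which branch, if any, currently holds an environment."""

    def __init__(self):
        self.owner = None
        self.branch = None

    @property
    def locked(self):
        return self.owner is not None

    def acquire(self, owner, branch=None):
        """Lock for owner; branch stays None for a manual emergency lock."""
        if self.locked:
            return False
        self.owner, self.branch = owner, branch
        return True

    def release(self):
        """Manually unlock, e.g. when we decide not to merge yet."""
        self.owner = self.branch = None

    def on_merge(self, merged_branch, waiting):
        """Unlock if the locked branch was merged; return who is next."""
        if self.locked and self.branch == merged_branch:
            self.release()
            return waiting.pop(0) if waiting else None
        return None
```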
In addition to the main production environment, we can deploy to staging servers that are only accessible by GitHub staff. This staging environment closely mirrors our production environment, including real-world datasets to ensure high-fidelity testing.
To find out what environments are available for deployments, we can ask Hubot for a list and see which ones are currently unlocked.
The lab, garage and other staging environments each replicate different aspects of production: frontend web workers, background job queues, CDN setup for assets, Git fileserver workers, etc. Depending on what part of the stack a branch touches, we can pick a matching staging environment to exercise the new code without affecting production user traffic.
One of these environments is a special “branch lab” which does not require locking, because it sets up an isolated sandbox for each branch. This helps avoid deploy lock contention and lets developers and designers deploy experimental UI changes as shareable URLs they can send to others in the company for feedback.
The branch lab is implemented as a single staging server which runs one unicorn worker per branch. The branches deployed there can be listed via chat, and a branch can be deleted once it’s no longer being used. If the free memory on that server starts to run out, we automatically prune the oldest branches to free up some space.
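The pruning policy can be sketched as "drop the oldest deployments until enough memory is free". In the example below, the per-branch memory figures, the threshold, and the prune_oldest function are all assumptions made for illustration; the real server presumably measures free memory directly.

```python
def prune_oldest(branches, free_mb, min_free_mb):
    """Prune oldest branches until free memory reaches the threshold.

    branches is a list of (name, deployed_at, memory_mb) tuples.
    Returns (kept_branches, pruned_names).
    """
    kept = sorted(branches, key=lambda b: b[1])  # oldest first
    pruned = []
    while kept and free_mb < min_free_mb:
        name, _, memory_mb = kept.pop(0)
        free_mb += memory_mb  # assume the worker's memory is reclaimed
        pruned.append(name)
    return kept, pruned


deployed = [("new-nav", 30, 200), ("old-search", 10, 300), ("mid-css", 20, 250)]
kept, pruned = prune_oldest(deployed, free_mb=100, min_free_mb=500)
```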
We can also manually remove branches that we’re done testing, or have shipped to production already:
Once a branch has passed automated tests, undergone code review, and been verified in staging, it is time to push it into production. Recall that GitHub engineers are not allowed to merge any pull request that has not yet been verified in production. Production traffic patterns and datasets often trigger edge cases that expose bugs and performance issues which might not have been seen otherwise, and we want to ensure that our master branch always represents our stable production release.
To safely roll out a risky branch, we can ask Hubot to deploy it to a specific subset of servers within an environment. This limits the user impact of the change, and allows us to monitor for new exceptions or performance regressions coming from the servers that are running our branch.
A change to the Rails version, for example, can be deployed to one or two frontend webservers, and if things look good we can continue to deploy it to more frontends. Similarly, an upgraded version of Git could be deployed to a handful of backend fileservers.
Once we’ve gained confidence in our branch, we can deploy it to all of production and then merge it to unlock deployments for the next developer in the queue.
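The incremental rollout described above can be sketched as a series of growing batches with a health check between them. The batch sizes, server names, and the is_healthy callback are hypothetical; in practice the "health check" is a person watching exception and performance graphs rather than a function call.

```python
def rollout_batches(servers, stage_sizes):
    """Split servers into successive batches; the last batch takes the rest."""
    batches = []
    start = 0
    for size in stage_sizes:
        if start >= len(servers):
            break
        batches.append(servers[start:start + size])
        start += size
    if start < len(servers):
        batches.append(servers[start:])
    return batches


def staged_deploy(servers, stage_sizes, is_healthy):
    """Deploy batch by batch and stop after the first unhealthy batch.

    Returns the list of servers that received the branch.
    """
    deployed = []
    for batch in rollout_batches(servers, stage_sizes):
        deployed.extend(batch)
        if not all(is_healthy(server) for server in batch):
            break
    return deployed


frontends = ["fe1", "fe2", "fe3", "fe4", "fe5", "fe6"]
batches = rollout_batches(frontends, [1, 2])
```

Stopping early limits the blast radius to the servers deployed so far, at which point the branch can be rolled back by deploying master.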
Our deployment chatops and workflows work so well that we use them for everything in the company. If we want to add a DNS record, we make a pull request to github/dns and use /deploy dns. If we want to add a monitoring alert for a new service, we make a pull request to github/nagios and use /deploy nagios. If we want to install a new software package on a specific frontend, we use /deploy puppet/add-package-branch to prod/fe142. We even use similar workflows to ship new versions of our native desktop apps.
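Judging only from the example commands above, a deploy command appears to have the shape /deploy app/branch to environment/hosts, with everything after the app name optional. The parser below is a sketch based on that guess; the grammar, the regular expression, and the parse_deploy function are assumptions and not Hubot's real command parser.

```python
import re

# Assumed grammar: /deploy <app>[/<branch>] [to <env>[/<hosts>]]
DEPLOY_RE = re.compile(
    r"^/deploy\s+(?P<app>[\w.-]+)(?:/(?P<branch>\S+))?"
    r"(?:\s+to\s+(?P<env>[\w.-]+)(?:/(?P<hosts>\S+))?)?$"
)


def parse_deploy(command):
    """Return a dict of app, branch, env, hosts; missing parts are None."""
    match = DEPLOY_RE.match(command.strip())
    if match is None:
        raise ValueError(f"not a deploy command: {command!r}")
    return match.groupdict()


simple = parse_deploy("/deploy dns")
targeted = parse_deploy("/deploy puppet/add-package-branch to prod/fe142")
```

Parts left out of the command come back as None, leaving it to the caller to decide on defaults.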
If you aren’t already, we highly recommend you try some of the techniques mentioned in this blog post. This workflow brings a ton of great benefits, including: