GitHub Engineering

Context aware MySQL pools via HAProxy

At GitHub we use MySQL as our main datastore. While repository data lives in Git, metadata is stored in MySQL. This includes Issues, Pull Requests, Comments, etc. We also authenticate against MySQL via a custom Git proxy (babeld). To serve requests under the high load GitHub operates at, we use MySQL replication to scale out read load.

gh-ost: GitHub's online schema migration tool for MySQL

Today we are announcing the open source release of gh-ost: GitHub’s triggerless online schema migration tool for MySQL.

SYN Flood Mitigation with synsanity

GitHub hosts a wide range of user content, and as with all large websites, this often makes us a target of denial-of-service attacks. Around a year ago, GitHub was on the receiving end of a large, unusual and very well publicised attack involving both application-level and volumetric attacks against our infrastructure.

GitHub's CSP journey

We shipped subresource integrity a few months back to reduce the risk of a compromised CDN serving malicious JavaScript. That is a big win, but does not address related content injection issues that may exist on GitHub.com itself. We have been tackling this side of the problem over the past few years and thought it would be fun, and hopefully useful, to share what we have been up to.

Introducing DGit

GitHub hosts over 35 million repositories and over 30 million Gists on hundreds of servers. Over the past year, we’ve built DGit, a new distributed storage system that dramatically improves the availability, reliability, and performance of serving and storing Git content.
