
5 Sep 2013 · Semaphore News

Improved repository caching



Over the past week we rolled out an update to our repository caching system. Previously, the cache was branch-based, which had one disadvantage for all users: the first build of every new branch took longer, because the git clone and bundle install had to be done from scratch.

A brief overview of how repository caching works

Our build platform is powered by dedicated hardware, and every build server maintains a cache of the repositories it has worked with. The build scheduler tries to assign each job to an available build server that has worked with the given branch before.
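To make the scheduling preference concrete, here is a minimal sketch in Python. It is not Semaphore's actual scheduler: the `BuildServer` class, the `pick_server` function and the `repo:branch` cache key format are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class BuildServer:
    name: str
    busy: bool = False
    # Cache keys this server has worked with (hypothetical "repo:branch" format).
    cached: set = field(default_factory=set)


def pick_server(servers, cache_key):
    """Prefer an idle server that already has cache_key in its cache."""
    idle = [s for s in servers if not s.busy]
    if not idle:
        return None  # nothing available, the job has to wait
    for server in idle:
        if cache_key in server.cached:
            return server  # cache hit
    return idle[0]  # cache miss: fall back to any idle server


servers = [
    BuildServer("a", cached={"acme/shop:master"}),
    BuildServer("b", cached={"acme/shop:feature-x"}),
    BuildServer("c", busy=True, cached={"acme/shop:feature-y"}),
]
print(pick_server(servers, "acme/shop:feature-x").name)  # b
print(pick_server(servers, "acme/shop:feature-y").name)  # a
```

In the second call the only server with a warm cache is busy, so the job lands on a server that has to fetch the source from scratch.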

Source code is then taken from the cache, advanced to the revision specified by the build and moved to a virtual machine where the actual build or deploy is performed. When the cache reaches a certain usage quota, it frees up disk space by removing repositories, oldest first.
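The quota-based cleanup can be sketched roughly as follows. This is an illustration under assumptions, not our production code: the `RepoCache` class is hypothetical, sizes are arbitrary units, and "oldest" is assumed here to mean least recently used.

```python
from collections import OrderedDict


class RepoCache:
    def __init__(self, quota):
        self.quota = quota
        self.repos = OrderedDict()  # repo name -> size, oldest entry first

    def used(self):
        return sum(self.repos.values())

    def touch(self, repo, size):
        """Record that a repo was used, then evict oldest entries if over quota."""
        self.repos.pop(repo, None)
        self.repos[repo] = size  # re-insert as the newest entry
        evicted = []
        # Keep at least the repo that was just used, even if it exceeds the quota.
        while self.used() > self.quota and len(self.repos) > 1:
            oldest, _ = self.repos.popitem(last=False)
            evicted.append(oldest)
        return evicted


cache = RepoCache(quota=100)
cache.touch("repo-a", 40)
cache.touch("repo-b", 40)
cache.touch("repo-a", 40)  # repo-a is now newer than repo-b
print(cache.touch("repo-c", 40))  # ['repo-b']
```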

Repository caching is now global

From now on, repositories are cached globally and all branches use the same source. One immediate effect is that the cache lives longer. It also uses less disk space, so a build server can accommodate more projects. Since one project can easily be cached on many servers, this results in fewer cache misses overall.
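The difference between the two keying schemes can be illustrated with a small simulation. It is a deliberate simplification: it models a single build server with no eviction, and the `cache_key` and `count_full_clones` helpers and the repository names are made up for the example, so the numbers it prints are not the production figures.

```python
def cache_key(repo, branch, global_cache):
    # Branch-based caching keys on repo and branch; global caching keys on repo only.
    return repo if global_cache else f"{repo}:{branch}"


def count_full_clones(builds, global_cache):
    """Count builds that find nothing in the cache and must clone from scratch."""
    seen = set()
    full_clones = 0
    for repo, branch in builds:
        key = cache_key(repo, branch, global_cache)
        if key not in seen:
            full_clones += 1
            seen.add(key)
    return full_clones


builds = [
    ("acme/shop", "master"),
    ("acme/shop", "feature-x"),
    ("acme/shop", "master"),
    ("acme/shop", "feature-y"),
]
print(count_full_clones(builds, global_cache=False))  # 3
print(count_full_clones(builds, global_cache=True))   # 1
```

With branch-based keys every new branch starts cold. With a global key only the very first build of the project on that server does.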

So instead of 100% of first builds of a branch doing a full git clone and bundle install, that number is now down to about 50%. Since most teams use a workflow with feature branches, this reduces build times on many projects. The same applies to deploys. Of course, we’ll continue to work on improving the platform further.



Written by:

Marko Anastasov

Marko Anastasov is a software engineer, author, and co-founder of Semaphore. He worked on building and scaling Semaphore from an idea to a cloud-based platform used by some of the world’s engineering teams.