Reverse Engineering the GHA Cache to Improve Performance

We recently announced our new product, Depot-hosted GitHub Actions runners. Our runners bring an extra improvement in cache speed that's no longer limited to our accelerated Docker builds. We're excited to be bringing faster caching to all kinds of GitHub Actions workloads.

While building our runners, we learned a lot about the undocumented inner workings of the GitHub Actions cache. In this post, we share what we learned, how we built that knowledge into Depot GitHub Actions runners, and how you can use it to make your workflows more efficient.

Use Depot GitHub Actions runners to speed up your builds. Fast machines hosted in AWS, with up to 10x faster caching performance. Try free for 7 days. To get started, create a Depot account and then visit our docs.

The GitHub Actions cache challenge

For builds in general, and Docker builds in particular, to be efficient in CI, they need to rely heavily on caching, so that the compute- and time-intensive work of building code and dependencies is reused as much as possible between builds.

The problem with caching on hosted CI platforms is that all runners are ephemeral, so the cache has to be stored remotely and transferred over the network in every build before it can be used. Networks are often slow and flaky, and the time spent saving and loading the cache over them can negate the speed improvement that caching is supposed to provide.

GitHub's own hosted runners suffer from this limitation because their network speed is capped. They have access to around 1 Gbps of network throughput, equivalent to 125 MB/s, which greatly limits how quickly a cache can be saved or restored. GitHub also limits the cache to 10 GB per repository, which can be exhausted quickly as caches grow. On top of this, the cache API itself can be flaky.
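To put that throughput figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes an ideal, fully saturated 1 Gbps link with no protocol overhead or compression, and uses decimal units (1 GB = 1000 MB). The function name transfer_seconds and the sample cache sizes are ours, chosen only for illustration. Real transfers will usually be slower than this lower bound.

```python
def transfer_seconds(size_gb: float, link_gbps: float) -> float:
    """Best-case time to move size_gb gigabytes over a link_gbps link.

    Assumes an ideal link: no protocol overhead, no contention, no retries.
    """
    size_megabytes = size_gb * 1000               # decimal units: 1 GB = 1000 MB
    megabytes_per_second = link_gbps * 1000 / 8   # 1 Gbps = 125 MB/s
    return size_megabytes / megabytes_per_second


if __name__ == "__main__":
    # Illustrative cache sizes; 10 GB is the per-repository limit mentioned above.
    for size_gb in (1, 5, 10):
        seconds = transfer_seconds(size_gb, link_gbps=1.0)
        print(f"{size_gb} GB at 1 Gbps: at least {seconds:.0f} s")
```

Under these assumptions, restoring a cache that fills the entire 10 GB limit takes at least 80 seconds in each direction, before any build work starts.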

There are alternatives that seek to address these limitations, but they either don't solve them completely or create developer experience hurdles of their own. For example, most other hosted GitHub Actions runner providers have built a separate caching implementation and forked all the GitHub Actions cache actions into their own namespace in order to point those actions at it. So, to take advantage of the faster caching these providers offer, you would need to change all your workflows from actions/cache@v3 to something like hosting-provider/cache@v1. While the config change might seem small, it duplicates work for everyone and means maintaining multiple versions of what should effectively be the same GitHub Actions cache actions.

Another issue that alternative GitHub Actions runner providers can introduce is added latency and reduced bandwidth, because their compute is located in European data centers run by providers such as Hetzner. While these compute providers are inexpensive, many internet services and infrastructure providers (including GitHub) are hosted in the US, and the added latency and lower bandwidth when moving data between Europe and the US can make some workflows much slower.

We believe cache actions should “just work” out of the box and be fast, even on runners other than those hosted by GitHub, and developers shouldn't have to think about which action to use based on where their build is going to run.
