Tech News
← Back to articles

Package managers keep using Git as a database, it never works out

read original related products more articles

Using git as a database is a seductive idea. You get version history for free. Pull requests give you a review workflow. It’s distributed by design. GitHub will host it for free. Everyone already knows how to use it.

Package managers keep falling for this. And it keeps not working out.

Cargo

The crates.io index started as a git repository. Every Cargo client cloned it. This worked fine when the registry was small, but the index kept growing. Users would see progress bars like “Resolving deltas: 74.01%, (64415/95919)” hanging for ages, the visible symptom of Cargo’s libgit2 library grinding through delta resolution on a repository with thousands of historic commits.

The problem was worst in CI. Stateless environments would download the full index, use a tiny fraction of it, and throw it away. Every build, every time.

RFC 2789 introduced a sparse HTTP protocol. Instead of cloning the whole index, Cargo now fetches files directly over HTTPS, downloading only the metadata for dependencies your project actually uses. (This is the “full index replication vs on-demand queries” tradeoff in action.) By April 2025, 99% of crates.io requests came from Cargo versions where sparse is the default. The git index still exists, still growing by thousands of commits per day, but most users never touch it.

Homebrew

GitHub explicitly asked Homebrew to stop using shallow clones. Updating them was “an extremely expensive operation” due to the tree layout and traffic of homebrew-core and homebrew-cask.

Users were downloading 331MB just to unshallow homebrew-core. The .git folder approached 1GB on some machines. Every brew update meant waiting for git to grind through delta resolution.

Homebrew 4.0.0 in February 2023 switched to JSON downloads for tap updates. The reasoning was blunt: “they are expensive to git fetch and git clone and GitHub would rather we didn’t do that… they are slow to git fetch and git clone and this provides a bad experience to end users.”

... continue reading