Why it took 4 years to get a lock files specification

(This is the blog post version of my keynote from EuroPython 2025 in Prague, Czechia.)

We now have a lock file format specification. That might not sound like a big deal, but for me it took 4 years of active work to get us that specification. Part education, part therapy, this post is meant to help explain what makes creating a lock file difficult and why it took so long to reach this point.

What goes into a lock file

A lock file is meant to record all the dependencies your code needs to work along with how to install those dependencies.

The "how" is source trees, source distributions (aka sdists), and wheels. With all of these forms, the trick is recording the right details in order to know how to install code in any of those three forms. Luckily we already had the direct_url.json specification that just needed translation into TOML for source trees. As for sdists and wheels, it's effectively recording what an index server provides you when you look at a project's release.

The much trickier part is figuring out what to install when. For instance, let's consider where your top-level, direct dependencies come from. In pyproject.toml there's project.dependencies for dependencies you always need for your code to run, project.optional-dependencies (aka extras) for when you want to offer your users the option to install additional dependencies, and then there's dependency-groups for dependencies that are not meant for end-users (e.g. listing your test dependencies).

But letting users control what is (not) installed isn't the end of things. There are also the specifiers you can add to any of your listed dependencies. They allow you to not only restrict what versions of things you want (i.e. setting a lower-bound and not setting an upper-bound if you can help it), but also when the dependency actually applies (e.g. is it specific to Windows?).
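As an illustration of the "when does it apply" half, the sketch below splits a requirement string at its semicolon and evaluates the marker against a description of the target environment. It is deliberately tiny: it understands only a single == or != comparison, and the function names are hypothetical. Real tools lean on a full implementation such as the packaging library rather than a regular expression like this one.

```python
import re

# Only handles a single `name == "value"` or `name != "value"` comparison.
_SIMPLE_MARKER = re.compile(r"""^\s*(\w+)\s*(==|!=)\s*["']([^"']*)["']\s*$""")


def split_requirement(requirement):
    """Split 'name-and-version ; marker' into its two halves."""
    spec, _, marker = requirement.partition(";")
    return spec.strip(), marker.strip()


def marker_applies(marker, environment):
    """Evaluate a simple marker against an environment mapping."""
    if not marker:
        return True  # No marker means the dependency always applies.
    match = _SIMPLE_MARKER.match(marker)
    if match is None:
        raise ValueError(f"unsupported marker: {marker!r}")
    name, operator, value = match.groups()
    matches = environment[name] == value
    return matches if operator == "==" else not matches


# Example requirement chosen for illustration: a Windows-only dependency.
spec, marker = split_requirement('pywin32>=306; sys_platform == "win32"')
on_windows = marker_applies(marker, {"sys_platform": "win32"})
on_linux = marker_applies(marker, {"sys_platform": "linux"})
```

Here on_windows is True and on_linux is False. That difference is what a lock file has to capture if it is meant to work on more than one platform.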

Put that all together and you end up with a graph of dependencies whose edges dictate whether a dependency applies on some platform. If you manage to write it all out then you have multi-use lock files which are portable across platforms and whatever options the installing user selects, compared to single-use lock files that have a specific applicability due to only supporting a single platform and set of input dependencies.
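One way to picture the difference is the sketch below, which models the graph as a dictionary whose edges optionally carry the set of platforms they apply to. The package names, the graph shape, and packages_for are all hypothetical. Loosely speaking, a multi-use lock file records something like the whole graph, while a single-use lock file is closer to the result of walking it for one platform.

```python
# Hypothetical graph: package -> list of (dependency, platforms or None).
# None means the edge applies everywhere.
GRAPH = {
    "app": [("http-client", None), ("win-helper", {"win32"})],
    "http-client": [("tls-lib", None)],
    "win-helper": [("tls-lib", None)],
    "tls-lib": [],
}


def packages_for(root, platform, graph=GRAPH):
    """Walk the graph, following only edges that apply to `platform`."""
    seen = set()
    stack = [root]
    while stack:
        package = stack.pop()
        if package in seen:
            continue
        seen.add(package)
        for dependency, platforms in graph[package]:
            if platforms is None or platform in platforms:
                stack.append(dependency)
    return sorted(seen)


linux_packages = packages_for("app", "linux")
windows_packages = packages_for("app", "win32")
```

With this graph, linux_packages leaves out win-helper while windows_packages includes it, even though both come from the same recorded data.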

Oh, and even getting the complete list of dependencies in either case is an NP-complete problem.
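To get a feel for why, consider the toy backtracking resolver below. Everything in it is an assumption made for illustration: the in-memory INDEX, integer versions, and requirements expressed as sets of allowed versions. Real resolvers use far more sophisticated strategies. The shape of the problem is the same, though: a choice made early can be invalidated by a requirement discovered later, which forces a retry.

```python
# Hypothetical index: package -> {version: {dependency: allowed versions}}.
INDEX = {
    "a": {2: {"c": {1}}, 1: {"c": {1, 2}}},
    "b": {1: {"c": {2}}},
    "c": {2: {}, 1: {}},
}


def resolve(requirements, pinned=None):
    """Return {package: version} satisfying every requirement, or None.

    `requirements` is a list of (package, allowed versions) pairs.
    """
    pinned = dict(pinned or {})
    if not requirements:
        return pinned
    (package, allowed), rest = requirements[0], requirements[1:]
    if package in pinned:
        # Already chosen: the earlier choice must satisfy this requirement too.
        return resolve(rest, pinned) if pinned[package] in allowed else None
    for version in sorted(INDEX[package], reverse=True):  # Prefer newest.
        if version not in allowed:
            continue
        # Choosing this version adds its own dependencies to the work list.
        new_requirements = rest + list(INDEX[package][version].items())
        result = resolve(new_requirements, {**pinned, package: version})
        if result is not None:
            return result
    return None  # Every candidate failed, so the caller must backtrack.


solution = resolve([("a", {1, 2}), ("b", {1})])
```

In this example the newest "a" wants "c" version 1 while "b" insists on "c" version 2, so the resolver has to back out of its first choice and settle on the older "a". In the worst case this kind of search tries exponentially many combinations.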

And to make things "interesting", I also wanted the file format to be written by software but readable by people, secure by default, fast to install, and allow the locker which writes the lock file to be different from the installer that performs the install (and for either to be written in a language other than Python).
