Along with a pip (and now packaging) maintainer, Damian Shaw, I have been working on making packaging, the library behind almost all packaging-related tools, faster at reading versions and specifiers, something tools like pip have to do thousands of times during resolution. Using Python 3.15's new statistical profiler and metadata from every package ever uploaded to PyPI, I measured and improved core packaging constructs while keeping the code readable and simple. Reading in Versions can be up to 2x faster and SpecifierSets up to 3x faster in packaging 26.0rc1, now released! Other operations have been optimized as well, up to 5x in some cases. See the announcement and release notes too; this post will focus only on the performance work.
Introduction
packaging is the core library used by most Python tools to deal with many of the standardized packaging constructs, like versions, specifiers, and markers. It is the 11th most downloaded library, but if you also take into account that it is vendored into pip, meaning you get a (hidden) copy with every pip install, it's actually the 2nd most downloaded library. Given that pip is bundled with Python, everyone who has Python has packaging, unless their distro strips it out into a separate package; so it is quite possibly the most common third-party Python library in the world.
In packaging, a Version is something that follows PEP 440's version standard, and a SpecifierSet is a set of conditions on such a version; think >=2,<3 or ~=1.0. Specifiers are used on dependencies, on requires-python, and so on. They are also part of Markers: something like tomli; python_version < '3.11' (a Requirement) contains a Marker. A quick illustration of these constructs using packaging's public API (see the snippet below):
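```python
from packaging.requirements import Requirement
from packaging.specifiers import SpecifierSet
from packaging.version import Version

# A Version parses a PEP 440 version string.
v = Version("2.1.0")

# A SpecifierSet holds conditions on versions; membership testing
# checks whether a version satisfies all of them.
spec = SpecifierSet(">=2,<3")
print(v in spec)  # True

# A Requirement bundles a name, an optional specifier, and an
# optional environment marker.
req = Requirement("tomli; python_version < '3.11'")
print(req.name, req.marker)  # tomli  python_version < "3.11"
```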
I’d like to start by showing you the progress we’ve made as a series of plots; I’ll follow with in-depth examples of how some of these improvements were made.
Performance plots with asv
After most of the performance PRs were made, I finally invested a little time in making a proper set of micro-benchmarks with asv; I’ll be showing plots from that suite. The code for this is currently in a branch of my fork; it might eventually be either contributed upstream or moved to a separate repo. The benchmarks are an optimized (trimmed-down) version of the original code.
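For those who haven’t used asv, a benchmark is just a module of classes or functions whose time_-prefixed methods get timed automatically. The file name, class, and inputs here are illustrative, not the actual suite from my branch; this is only a minimal sketch of the shape asv expects:

```python
# benchmarks/bench_parse.py -- minimal asv micro-benchmark sketch
# (hypothetical names; not the actual suite). asv discovers methods
# named time_* and reports their per-call timing.
from packaging.specifiers import SpecifierSet
from packaging.version import Version


class ParseSuite:
    def time_version(self):
        # Time constructing a Version from a typical PEP 440 string.
        Version("1.2.3rc1")

    def time_specifier_set(self):
        # Time constructing a SpecifierSet with two clauses.
        SpecifierSet(">=2,<3")
```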
Plots were made using code in the source directory of my blog repository; values are normalized to the 25.0 performance numbers, with a green line showing the current performance after the changes we’ve been working on. I ran them with Python 3.14 from uv (which is a bit faster than the one from Homebrew) on an entry-level M1 Mac Mini. The plot x-axis is expanded after 25.0 to show the current work.
This is the Version constructor. You can see the series of PRs described below lowering the time to 0.5x the 25.0 baseline. Now, one of those steps was deferring generation of the comparison tuple to first use instead of doing it in the constructor, so the sorting benchmark has taken on that cost:
Sorting isn’t slower than before; we’ve just moved some of the construction time to the first time you compare a version. Inside pip, only around 30% of the versions constructed ever get compared, so this is a net savings.
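The general pattern is simple lazy caching: compute the expensive comparison key the first time a comparison asks for it, and remember it. This is a minimal sketch of that pattern, not packaging’s actual code; the class, the toy key function, and the dotted-number parsing are all stand-ins:

```python
# Sketch of deferring an expensive comparison key to first use
# (illustrative only; packaging's real key is a full PEP 440 tuple).
from functools import cached_property


class LazyVersion:
    def __init__(self, text: str):
        # The constructor stays cheap: no comparison key is built here.
        self.text = text

    @cached_property
    def _key(self) -> tuple:
        # Computed once, on the first comparison, then cached.
        # Hypothetical stand-in for building the real comparison tuple.
        return tuple(int(part) for part in self.text.split("."))

    def __lt__(self, other: "LazyVersion") -> bool:
        return self._key < other._key
```

Versions that are never compared never pay for the key at all, which is why the constructor benchmark improves while sorting stays roughly flat.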