
Which npm package has the largest version number?



I spent way too much time on this

I was recently working on a project that uses the AWS SDK for JavaScript. When updating the dependencies in said project, I noticed that the version of that dependency was v3.888.0. Eight hundred eighty-eight. That’s a big number as far as versions go.

That got me thinking: I wonder what package in the npm registry has the largest number in its version. It could be a major, minor, or patch version, and it doesn’t have to be the latest version of the package. In other words, out of the three numbers in major.minor.patch for each version of each package, what is the largest number I can find?
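To make the question concrete, here is a minimal sketch of the comparison in TypeScript. The helper names `largestVersionPart` and `largestAcrossVersions` are made up for illustration and are not taken from the actual script. The sketch assumes each version string starts with three dot-separated integers and that each one fits in a regular JavaScript number.

```typescript
// Hypothetical helper (not from the article): largest of major, minor, and patch
// for one version string, or null if it does not start with three integers.
function largestVersionPart(version: string): number | null {
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (match === null) return null;
  // Assumption: each part fits in Number.MAX_SAFE_INTEGER; bigger values lose precision.
  const parts = match.slice(1, 4).map((part) => Number(part));
  return Math.max(...parts);
}

interface LargestResult {
  version: string;
  value: number;
}

// Scan a list of version strings and keep the one containing the largest number.
function largestAcrossVersions(versions: readonly string[]): LargestResult | null {
  let best: LargestResult | null = null;
  for (const version of versions) {
    const value = largestVersionPart(version);
    if (value !== null && (best === null || value > best.value)) {
      best = { version, value };
    }
  }
  return best;
}
```

Anything after the patch number (a prerelease tag, for example) is ignored here, which may or may not match how the real analysis treated those versions.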

TL;DR? Jump to the results to see the answer.

The npm API

Obviously npm has some kind of API, so it shouldn’t be too hard to get a list of all… 3,639,812 packages. Oh. That’s a lot of packages. Well, considering npm had 374 billion package downloads in the past month, I’m sure they wouldn’t mind me making a few million HTTP requests.

Doing a quick search for “npm api” leads me to a readme in the npm/registry repo on GitHub. There’s a /-/all endpoint listed in the table of contents which seems promising. That section doesn’t actually exist in the readme, but maybe it still works?

```shell
$ curl 'https://registry.npmjs.org/-/all'
{"code":"ResourceNotFound","message":"/-/all does not exist"}
```

Whelp, maybe npm packages have an ID and I can just start at 1 and count up? It looks like packages have an _id field… never mind, the _id field is the package name. Okay, let’s try to find something else.

A little more digging brings me to this GitHub discussion about the npm replication API. So npm replicates package info in CouchDB at https://replicate.npmjs.com, and conveniently, they support the _all_docs endpoint. Let’s give that a try:

```shell
$ curl 'https://replicate.npmjs.com/registry/_all_docs'
{
  "total_rows": 3628088,
  "offset": 0,
  "rows": [
    {
      "id": "-",
      "key": "-",
      "value": {
        "rev": "5-f0890cdc1175072e37c43859f9d28403"
      }
    },
    {
      "id": "--------------------------------------------------------------------------------------------------------------------------------whynunu",
      "key": "--------------------------------------------------------------------------------------------------------------------------------whynunu",
      "value": {
        "rev": "1-1d26131b0f8f9702c444e061278d24f2"
      }
    },
    {
      "id": "-----hsad-----",
      "key": "-----hsad-----",
      "value": {
        "rev": "1-47778a3a6f9d8ce1e0530611c78c4ab4"
      }
    },
    # 997 more packages...
```

Those are some interesting package names.
Looks like this data is paginated and by default I get 1,000 packages at a time. When I write the final script, I can set the limit query parameter to the max of 10,000 to make pagination a little less painful. Fortunately, the CouchDB docs have a guide for pagination, and it looks like it’s as simple as using the skip query parameter.

```shell
$ curl 'https://replicate.npmjs.com/registry/_all_docs?skip=1000'
"Bad Request"
```

Never mind. According to the GitHub discussion linked above, skip is no longer supported. The “Paging (Alternate Method)” section of the same page says that I can use startkey_docid instead. If I grab the id of the last row, I should be able to use that to return the next set of rows. Fun fact: The 1000th package (alphabetically) on npm is 03-webpack-number-test.

```shell
$ curl 'https://replicate.npmjs.com/registry/_all_docs?startkey_docid="03-webpack-number-test"'
{
  "total_rows": 3628102,
  "offset": 999,
  "rows": [
    # another 1000 packages...
```

Nice. Also, another 3628102 - 3628088 = 14 packages have been published in the ~15 minutes since I ran the last query.

Now, there’s one more piece of the puzzle to figure out. How do I get all the versions for a given package? Unfortunately, it doesn’t seem like I can get package version information along with the base info returned by _all_docs. I have to separately fetch each package’s metadata from https://registry.npmjs.org/. Let’s see what good ol’ trusty 03-webpack-number-test looks like:

```shell
$ curl 'https://registry.npmjs.org/03-webpack-number-test'
{
  # i've omitted some fields here
  "_id": "03-webpack-number-test",
  "versions": {
    "1.0.0": { ... },
    # the rest of the versions...
```

Alright, I have everything I need. Now I just need to write a bash script that... just kidding. A wise programmer once said, “if your shell script is more than 10 lines, it shouldn’t be a shell script” (that was me, I said that). I like TypeScript, so let’s use that.
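For reference, the startkey_docid paging described above might look something like this in TypeScript. This is a minimal sketch, not the actual script: `AllDocsPage`, `PageFetcher`, `fetchPageOverHttp`, and `fetchAllPackageIds` are hypothetical names. It also assumes each page begins with the row named by startkey_docid (the offset of 999 in the output above suggests it does), so that row is dropped as a duplicate.

```typescript
// Only the fields visible in the _all_docs output above.
interface AllDocsPage {
  total_rows: number;
  offset: number;
  rows: { id: string }[];
}

// Hypothetical seam so the paging logic can be exercised without the network.
type PageFetcher = (url: string) => Promise<AllDocsPage>;

const fetchPageOverHttp: PageFetcher = async (url) => {
  const res = await fetch(url);
  if (!res.ok) throw new Error(`HTTP ${res.status} for ${url}`);
  return (await res.json()) as AllDocsPage;
};

async function fetchAllPackageIds(
  fetchPage: PageFetcher = fetchPageOverHttp,
  limit = 10000,
): Promise<string[]> {
  // With limit 1, every page would contain only the already-seen start row.
  if (limit < 2) throw new Error("limit must be at least 2");
  const base = "https://replicate.npmjs.com/registry/_all_docs";
  const ids: string[] = [];
  let lastId: string | undefined;
  while (true) {
    let url = `${base}?limit=${limit}`;
    if (lastId !== undefined) {
      // The key is a JSON string, so it keeps its double quotes (URL-encoded here).
      url += `&startkey_docid=${encodeURIComponent(JSON.stringify(lastId))}`;
    }
    const page = await fetchPage(url);
    // Assumption: the page starts at lastId itself, so drop the repeated row.
    const fresh = page.rows.map((row) => row.id).filter((id) => id !== lastId);
    if (fresh.length === 0) break;
    ids.push(...fresh);
    lastId = fresh[fresh.length - 1];
  }
  return ids;
}
```

Passing the fetcher in as a parameter is only a convenience for trying the loop against fake pages. Calling `fetchAllPackageIds()` with no arguments would hit the real replication endpoint.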
The biggest bottleneck is going to be waiting on the GETs for each package’s metadata. My plan is this:

1. Grab all the package IDs from the replication API and save that data to a file (I don’t want to have to refetch everything if something goes wrong later in the script)
2. Fetch package data in batches so we’re not just doing 1 HTTP request at a time
3. Save the package data to a file (again, hopefully I only have to fetch everything once)

Once I have all the package data, I can answer the original question of “largest number in version” and look at a few other interesting things.

(A few hours and many iterations later…)

```shell
$ bun npm-package-versions.ts
Fetching package IDs...
Fetched 10000 packages IDs starting from offset 0
# this goes on for a while...
Finished fetching package IDs
Fetched 50 packages in 884ms (57 packages/s)
Fetched 50 packages in 852ms (59 packages/s)
# this goes on for a really long while...
```

See the script section at the end if you want to see what it looks like.
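The "fetch in batches" step of the plan can be sketched roughly as follows. This is an illustration of one simple batching approach, not the actual script: `Outcome` and `mapInBatches` are hypothetical names, and the batch size of 50 seen in the log output above would just be passed in as `batchSize`.

```typescript
// Each item either produced a value or failed; failures are kept so they can be retried later.
type Outcome<R> = { ok: true; value: R } | { ok: false; error: unknown };

// Hypothetical helper: run `worker` over `items`, at most `batchSize` at a time.
async function mapInBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  worker: (item: T) => Promise<R>,
): Promise<Outcome<R>[]> {
  if (batchSize < 1) throw new Error("batchSize must be at least 1");
  const outcomes: Outcome<R>[] = [];
  for (let start = 0; start < items.length; start += batchSize) {
    const batch = items.slice(start, start + batchSize);
    // Requests within a batch run concurrently; the next batch waits for all of them.
    const settled = await Promise.all(
      batch.map(async (item): Promise<Outcome<R>> => {
        try {
          return { ok: true, value: await worker(item) };
        } catch (error) {
          // One failed request should not abort a run that takes hours.
          return { ok: false, error };
        }
      }),
    );
    outcomes.push(...settled);
  }
  return outcomes;
}
```

A worker for this use case would take a package name, GET its metadata from https://registry.npmjs.org/, and return the keys of the `versions` object. Waiting for a whole batch before starting the next is the simplest option. A sliding window of in-flight requests would keep the connection busier, at the cost of more bookkeeping.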

Results

Some stats:

- Time to fetch all ~3.6 million package IDs: A few minutes
- Time to fetch version data for each one of those packages: ~12 hours (yikes)
