Distributed ID Formats Are Architectural Commitments, Not Just Data Types

Most systems start with auto-increment IDs because it’s the easiest possible thing that works. The database hands you numbers, you store them, life is good. There’s something comforting about watching IDs tick upward in perfect sequence—12345, 12346, 12347.

But IDs have a funny property: they quietly spread everywhere. Into URLs, logs, analytics pipelines, API responses, customer support workflows—all the places you don’t think about until changing the format suddenly becomes painful.

Generating distributed IDs is a known problem at scale. What I didn’t expect was how much the format itself would matter. After watching teams struggle with migrations and format changes, I realized the real problem isn’t generation; it’s choosing a format you can live with long-term.

The First Time IDs Became a Problem

The first time I saw ID formats become a real architectural constraint was during a database split. A team was breaking a monolith into several database instances. The old auto-increment IDs were totally fine—until suddenly they weren’t, because multiple shards couldn’t share the same global counter anymore.

The migration itself wasn’t terrible. The ugly part was everything else. IDs already lived in URLs and in references held by other services; analytics jobs expected sequential integers; dashboards assumed ordering. You can’t just regenerate everything, because the IDs already have meaning out in the world.

Their workaround was simple and surprisingly effective: they offset new IDs by a huge constant, roughly a billion. Old IDs stayed below the threshold, new IDs lived above it, and nothing collided. It held up well, but it also taught me something.
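To make the trick concrete, here is a minimal sketch in Python. The names (ID_OFFSET, OffsetIdGenerator, is_legacy_id) are mine, not the team's, and the exact constant is an assumption based on "roughly a billion." It only shows how the ID space gets split in two; how new IDs were coordinated across shards isn't something I can reconstruct, so the sketch uses a single in-process counter as a stand-in.

```python
import itertools

# Assumed threshold: every pre-migration ID is expected to be below this value.
ID_OFFSET = 1_000_000_000


class OffsetIdGenerator:
    """Hands out new IDs starting at the offset, so they never overlap legacy IDs."""

    def __init__(self, offset: int = ID_OFFSET) -> None:
        # Stand-in for whatever actually issues IDs after the split.
        self._counter = itertools.count(offset)

    def next_id(self) -> int:
        return next(self._counter)


def is_legacy_id(value: int, offset: int = ID_OFFSET) -> bool:
    """True if the ID falls in the old auto-increment range."""
    return 0 < value < offset


gen = OffsetIdGenerator()
first_new = gen.next_id()
print(first_new, is_legacy_id(first_new), is_legacy_id(12345))
```

Notice what this quietly does: the numeric range of an ID now tells you where it came from. Any code that checks the threshold has started depending on the format.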

ID formats aren’t just formats. They’re commitments.

Once you deploy one, it becomes part of your architecture. That realization stuck with me.

When Does This Actually Matter?
