Distributed ID generation
Learn the five ID generation strategies used in production systems, when each breaks, and why Snowflake IDs are the default choice for any distributed write-heavy service.
The problem
Every record in a distributed system needs a unique identifier. The way you generate that ID shapes three things: how fast writes are, how well your indexes perform at scale, and whether a single node failure can stop new records from being created.
In a single-node database, auto-increment is trivial. In a distributed system, uniqueness without a central coordinator is a real problem. Pick the wrong approach and you will end up with index fragmentation, B-tree write amplification, or a global lock on every write.
I see candidates reach for UUIDs as a reflex in interviews, without reasoning about the consequences. This article gives you the full picture.
The four requirements
Every distributed ID must satisfy at minimum:
- Global uniqueness. Two different records on two different machines must never share an ID.
- No central coordination. Generating an ID must not require a network call to a global counter service sitting in front of every write.
- Monotonically increasing (usually). IDs that embed creation time allow `ORDER BY id DESC` to substitute for `ORDER BY created_at DESC`. This eliminates one index and keeps B-tree insertions sequential.
- Compact. IDs appear in multiple indexes, URLs, and API responses. A 36-character string UUID is 4.5x the storage of an 8-byte BIGINT.
The options
Option 1: Database auto-increment
The simplest approach: let the primary database issue sequential integers.
-- MySQL / PostgreSQL auto-increment
CREATE TABLE tweets (
tweet_id BIGINT NOT NULL AUTO_INCREMENT,
...
PRIMARY KEY (tweet_id)
);
What works: Simple. Sequential writes append to the right side of the B-tree leaf page. Minimal fragmentation. Human-readable ("tweet 42 million").
What breaks:
- Single point of failure. The auto-increment counter lives on one host. If that host fails, no new records can be written even if the rest of your fleet is healthy.
- Cross-shard collision. When you shard the table across N hosts, each shard issues its own counter starting from 1. IDs collide across shards immediately. You must either use disjoint ranges per shard (`shard 0: 0-1B, shard 1: 1B-2B`) or a separate global counter service, which is a new single point of failure.
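A common alternative to big disjoint ranges is interleaved strides, which MySQL exposes as the `auto_increment_increment` / `auto_increment_offset` settings: shard k issues k, k+N, k+2N, and so on. A minimal sketch of the scheme (the helper function is illustrative, not tied to any database):

```python
def shard_ids(shard_id: int, num_shards: int, count: int) -> list[int]:
    """IDs issued by one shard under an interleaved-stride scheme.

    Shard k (1-based) issues k, k + N, k + 2N, ... so no two shards
    can ever issue the same integer."""
    return [shard_id + i * num_shards for i in range(count)]

# Two shards in a 2-shard fleet never overlap:
print(shard_ids(1, 2, 4))  # [1, 3, 5, 7]
print(shard_ids(2, 2, 4))  # [2, 4, 6, 8]
```

The trade-off is the same either way: the assignment of offsets is itself configuration that must be coordinated once per shard, and adding shards later requires care.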
When it's fine: Single-database systems under moderate load where sharding is not a near-term concern.
Option 2: UUID v4 (random)
A 128-bit randomly generated identifier. Universally unique by construction: the probability of two UUIDs colliding is about 1 in 5 × 10^36.
// UUID v4: fully random, no coordination needed
tweet_id = uuid_v4()
// "f47ac10b-58cc-4372-a567-0e02b2c3d479"
db.insert(tweet_id, user_id, text)
What works: No coordination needed. Works identically on every node, everywhere. No configuration required.
What breaks:
- Index fragmentation. B-tree indexes expect mostly sequential inserts; new data appends to the rightmost leaf page. Random UUIDs scatter inserts across all leaf pages, causing constant page splits, a write amplification factor of 2-5x, and cache eviction of hot index pages. This is measurable at high insert rates on tables over ~50GB.
- Not time-sortable. To query in chronological order you need a separate `created_at` column and a secondary index on it. Every write maintains one more index.
- Storage overhead. A UUID stored as a VARCHAR(36) is 36 bytes vs 8 bytes for a BIGINT. Across billions of rows and the multiple indexes that reference the primary key, this multiplies significantly.
UUID v4 fragmentation is not theoretical. On a PostgreSQL table with 100 million rows and a UUID primary key, random inserts cause the index to span far more pages than a sequential-ID index, with constant page splits and cache eviction. The fill factor of UUID indexes in active production tables often hovers around 30-50%, compared to 80-95% for sequential IDs. This directly translates to 2-5x more I/O per write at high insert rates.
When it's fine: Tables under ~10M rows, or systems where sort order by ID does not matter and write volume is modest.
Option 3: UUID v7 (time-ordered)
A newer UUID variant (IETF RFC 9562, 2024) where the high bits encode a Unix millisecond timestamp. IDs from the same millisecond are random; across milliseconds, IDs are monotonically increasing.
// UUID v7: timestamp-prefixed, time-sortable
// Format: [unix_ts_ms 48 bits][version 4 bits][rand_a 12 bits][variant 2 bits][rand_b 62 bits]
tweet_id = uuid_v7()
// "01905cce-1820-7abc-def0-123456789abc"
// The first 48 bits encode Unix timestamp in milliseconds
What works: Solves the fragmentation problem from UUID v4. IDs from the same millisecond are still random, but at scale the prefix-sorted nature keeps most inserts near the right edge of the B-tree. Still 128 bits, so storage is still 16 bytes (binary) or 36 bytes (string).
What breaks: Still twice the storage of a 64-bit integer. IDs generated within the same millisecond carry random low bits, so they are not strictly ordered relative to each other; only millisecond-level ordering is guaranteed.
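The bit layout above can be sketched in a few lines of Python. This is a simplified generator under the RFC 9562 field layout, without the spec's optional within-millisecond monotonicity counters:

```python
import os
import time
import uuid

def uuid_v7() -> uuid.UUID:
    """Simplified UUID v7: 48-bit Unix-ms timestamp, version and variant
    bits, and 74 random bits. No within-millisecond ordering guarantees."""
    ts_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")   # 80 random bits
    value = (ts_ms & ((1 << 48) - 1)) << 80        # unix_ts_ms: top 48 bits
    value |= 0x7 << 76                             # version: 7
    value |= ((rand >> 68) & 0xFFF) << 64          # rand_a: 12 bits
    value |= 0b10 << 62                            # variant bits
    value |= rand & ((1 << 62) - 1)                # rand_b: 62 bits
    return uuid.UUID(int=value)

a = uuid_v7()
time.sleep(0.002)
b = uuid_v7()
assert a.version == 7 and a < b   # IDs from later milliseconds sort later
```

Recent standard libraries ship this directly (e.g. `uuid.uuid7()` in newer Python releases), so prefer the built-in where available.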
When it's fine: Greenfield systems that want UUID compatibility (no schema migration from existing UUID v4 systems) but care about insert performance at scale.
Option 4: Counter + base62 encoding
Maintain a global counter (typically in Redis using atomic INCR). For each new record, fetch the next counter value and encode it in base62 (a-z, A-Z, 0-9). Used by URL shorteners for short, human-readable codes.
// Redis atomic counter + base62 encode
CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"

function encode_base62(n):
    result = ""
    while n > 0:
        result = CHARS[n % 62] + result
        n = n // 62
    return result

counter = redis.INCR("global_id_counter")  // atomic, returns next unused integer
short_code = encode_base62(counter)        // e.g., "b9fA2"
db.insert(short_code, long_url)
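Here is the encoder as runnable Python, plus the inverse decode, which is useful when the shortener looks up rows by integer key rather than by string:

```python
CHARS = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789"
INDEX = {c: i for i, c in enumerate(CHARS)}

def encode_base62(n: int) -> str:
    if n == 0:
        return CHARS[0]          # edge case the pseudocode above skips
    out = []
    while n > 0:
        out.append(CHARS[n % 62])
        n //= 62
    return "".join(reversed(out))

def decode_base62(s: str) -> int:
    n = 0
    for c in s:
        n = n * 62 + INDEX[c]
    return n

print(encode_base62(62))   # "ba"
assert decode_base62(encode_base62(123456789)) == 123456789
```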
What works: Human-readable, short codes. Unique by construction (each counter value issued exactly once). Sequential writes. A 6-character base62 code covers 62^6 = 56B values.
What breaks:
- Redis is now in the write critical path. Every record creation requires a network round-trip to Redis. If Redis is unavailable, writes stop. Mitigate with batching: each writer claims 1,000 IDs at once via `INCRBY 1000`.
- Not suitable for infinite scale. The counter requires central coordination even if Redis is fast (~100K ops/sec single node). Distributed write rates above ~50K/sec need sharded counter ranges.
- IDs reveal volume. Counter-based IDs expose how many records have been created, which may be sensitive information.
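The batching mitigation can be sketched as a small allocator. Here `incr_by` is a stand-in for Redis `INCRBY`, injected as a callable so the sketch runs without Redis:

```python
class BatchedIdAllocator:
    """Claims IDs from a central counter in blocks of `batch` to cut
    round-trips by ~batch x. `incr_by(n)` must atomically add n and
    return the new counter value (the INCRBY contract)."""

    def __init__(self, incr_by, batch: int = 1000):
        self.incr_by = incr_by
        self.batch = batch
        self.next = 1
        self.limit = 0           # highest ID in the currently claimed block

    def next_id(self) -> int:
        if self.next > self.limit:
            self.limit = self.incr_by(self.batch)   # one network call per block
            self.next = self.limit - self.batch + 1
        nid = self.next
        self.next += 1
        return nid

# Fake counter standing in for Redis:
state = {"v": 0}
def fake_incrby(n):
    state["v"] += n
    return state["v"]

alloc = BatchedIdAllocator(fake_incrby, batch=10)
ids = [alloc.next_id() for _ in range(25)]
assert ids == list(range(1, 26)) and state["v"] == 30   # only 3 "network calls"
```

The cost of batching is gaps: IDs claimed but unused when a writer restarts are lost, which is fine for uniqueness but matters if anything downstream assumes density.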
When it's fine: URL shorteners, short codes for human-shareable links, anywhere a compact readable identifier is explicitly desired.
Option 5: Snowflake ID (recommended for most distributed systems)
A 64-bit integer structured as encoded bit fields. Each machine generates IDs independently using only local state. No network call, no lock, no retry.
flowchart TD
subgraph Layout["64-bit Snowflake ID"]
B1["Sign ยท 1 bit\nalways 0"]
B2["Timestamp ยท 41 bits\nms since custom epoch\n~69-year range"]
B3["Worker ID ยท 10 bits\n5-bit datacenter + 5-bit machine\n1,024 unique generators"]
B4["Sequence ยท 12 bits\nper-ms counter, resets to 0\n4,096 IDs per ms per node"]
B1 --> B2 --> B3 --> B4
end
subgraph Same_ms["Same millisecond: two machines never collide"]
M1["Machine 1 ยท worker=1\nts=T ยท seq=0 โ ID_A"]
M2["Machine 2 ยท worker=2\nts=T ยท seq=0 โ ID_B"]
M1 -->|"worker bits differ"| R["ID_A != ID_B\nno coordination needed"]
M2 -->|"worker bits differ"| R
end
// Snowflake ID generation: pure local computation, no network call
EPOCH_MS = 1288834974657  // custom epoch (Twitter: Nov 4, 2010)

function generate():
    ts = current_time_ms() - EPOCH_MS  // milliseconds since epoch
    seq = next_local_sequence(ts)      // atomic local counter, resets per ms
    return (ts << 22) | (DATACENTER_ID << 17) | (MACHINE_ID << 12) | seq
Components:
- 41-bit timestamp: milliseconds since a custom epoch. Covers ~69 years from the chosen epoch.
- 10-bit machine ID: split as 5 bits for datacenter ID and 5 bits for machine ID. Supports 1,024 unique generator nodes.
- 12-bit sequence: per-millisecond counter. Up to 4,096 IDs per millisecond per machine. Total capacity: 4,096 × 1,024 machines = ~4.2 million IDs per millisecond globally.
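Because the layout is fixed, any consumer can unpack a Snowflake ID with shifts and masks. A sketch mirroring the pseudocode above:

```python
EPOCH_MS = 1288834974657  # Twitter's custom epoch (Nov 4, 2010)

def snowflake_parts(sid: int) -> tuple[int, int, int, int]:
    """Unpack (unix_ms, datacenter_id, machine_id, sequence) from a 64-bit ID."""
    return ((sid >> 22) + EPOCH_MS,   # 41-bit timestamp field
            (sid >> 17) & 0x1F,       # 5-bit datacenter ID
            (sid >> 12) & 0x1F,       # 5-bit machine ID
            sid & 0xFFF)              # 12-bit sequence

# Round-trip against the packing formula from the pseudocode:
ts, dc, m, seq = 1700000000000 - EPOCH_MS, 3, 7, 42
sid = (ts << 22) | (dc << 17) | (m << 12) | seq
assert snowflake_parts(sid) == (1700000000000, 3, 7, 42)
```

This is why Snowflake IDs support time-range queries directly: the creation timestamp is recoverable from the ID alone.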
What works:
- Fully distributed. Each machine generates IDs from local state alone. No network call, no contention.
- Time-sortable. Higher IDs are always later. `ORDER BY id DESC` replaces `ORDER BY created_at DESC`; the `created_at` column becomes optional.
- Sequential inserts. B-tree insertions mostly append to the rightmost leaf page. No fragmentation.
- Compact. 8 bytes as a stored BIGINT.
What breaks:
- Clock skew. If a machine's system clock moves backward (NTP correction), the generator may emit an ID with a lower timestamp than a recent ID. The standard mitigation: detect clock drift, pause generation until the clock advances past the last-issued timestamp, then resume.
- Machine ID coordination. Machine IDs must be globally unique across all generator instances. Twitter solved this with a ZooKeeper registration service. A simpler approach for smaller fleets: assign machine IDs via config management (Ansible, Kubernetes pod annotations) and fail fast on startup if a collision is detected.
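Putting the pieces together, here is a thread-safe generator that implements the pause-on-skew mitigation. This is a sketch; production code would also bound the wait and alert on large drift:

```python
import threading
import time

EPOCH_MS = 1288834974657   # Twitter's epoch; any fixed past timestamp works

class Snowflake:
    def __init__(self, datacenter_id: int, machine_id: int):
        assert 0 <= datacenter_id < 32 and 0 <= machine_id < 32
        self.worker = (datacenter_id << 5) | machine_id
        self.last_ts = -1
        self.seq = 0
        self.lock = threading.Lock()

    def _now(self) -> int:
        return time.time_ns() // 1_000_000 - EPOCH_MS

    def next_id(self) -> int:
        with self.lock:
            ts = self._now()
            while ts < self.last_ts:           # clock moved backward:
                time.sleep(0.0005)             # pause until it catches up
                ts = self._now()
            if ts == self.last_ts:
                self.seq = (self.seq + 1) & 0xFFF
                if self.seq == 0:              # 4,096 IDs used this ms;
                    while ts <= self.last_ts:  # spin to the next millisecond
                        ts = self._now()
            else:
                self.seq = 0
            self.last_ts = ts
            return (ts << 22) | (self.worker << 12) | self.seq

gen = Snowflake(datacenter_id=1, machine_id=1)
ids = [gen.next_id() for _ in range(5000)]
assert len(set(ids)) == 5000 and ids == sorted(ids)   # unique and monotonic
```

Note the lock only protects local state; there is still no network call anywhere in the ID path.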
When to use: Any distributed system that writes records at non-trivial rate, needs chronological sorting by ID, and does not want a central ID issuer in the write critical path. This means: social feeds, event logs, messaging systems, order systems, anywhere write throughput matters.
Comparison table
| Strategy | Storage | Time-sortable | Coordination needed | Fragmentation risk | Good for |
|---|---|---|---|---|---|
| Auto-increment | 8 bytes | Yes | Yes (single host) | None | Single-node DBs |
| UUID v4 | 16 bytes | No | None | High at scale | Low-volume tables |
| UUID v7 | 16 bytes | Yes | None | Low | UUID-compatible systems |
| Counter + base62 | Variable | Yes | Yes (Redis) | None | Short human-readable codes |
| Snowflake | 8 bytes | Yes | Startup only (machine ID) | None | Distributed write-heavy systems |
flowchart TD
Start(["๐ค Choosing an ID strategy"])
Q1{"Single database,\nno sharding planned?"}
Q2{"Need human-readable\nshort codes?"}
Q3{"Need UUID format\ncompatibility?"}
Q4{"High write rate\nor time-sortable\nrequired?"}
AutoIncrement["โ
Auto-increment\nSimplest, sequential\nno coordination"]
Base62["โ
Counter + base62\nShort, readable, unique\nRedis dependency"]
UUIDv7["โ
UUID v7\n128-bit, time-sorted\nno coordination"]
Snowflake["โ
Snowflake ID\n64-bit, time-sorted\nfully distributed"]
UUIDv4["โ
UUID v4\nRandom, no coordination\nhigh fragmentation at scale"]
Start --> Q1
Q1 -->|"Yes"| AutoIncrement
Q1 -->|"No (distributed)"| Q2
Q2 -->|"Yes"| Base62
Q2 -->|"No"| Q3
Q3 -->|"Yes"| UUIDv7
Q3 -->|"No"| Q4
Q4 -->|"Yes"| Snowflake
Q4 -->|"No"| UUIDv4
Production usage
| System | Strategy | Notable behavior |
|---|---|---|
| Twitter | Snowflake (original, open-sourced 2010) | 10-bit machine ID space supports 1,024 nodes. Retired in favor of internal Snowflake variants that allocate more bits to the sequence field for higher QPS. |
| Instagram | Postgres sequence + shard ID encoded in 64-bit int | A Postgres SEQUENCE per logical shard avoids the need for a centralized ID service while remaining sortable and compact. |
| Discord | Snowflake variant with Discord epoch (Jan 1, 2015) | Message IDs and channel IDs are Snowflakes. The timestamp in the ID lets Discord query by time range without a secondary index. |
| Stripe | Prefixed random IDs (ch_xxxx, cus_xxxx) | Not time-sortable. Prefix encodes resource type, making IDs self-documenting in logs. Random suffix prevents enumeration. |
Interview cheat sheet
- State upfront that your ID choice has downstream consequences for index performance, latency, and failover behavior.
- Auto-increment is only safe for single-node databases. The moment you shard, you need a different strategy.
- UUID v4 causes B-tree fragmentation at scale: random inserts scatter across all leaf pages, causing page splits and write amplification.
- UUID v7 solves fragmentation but is still 16 bytes. Snowflake is 8 bytes and gives you all the same time-sortable properties.
- Snowflake IDs allow `ORDER BY tweet_id DESC` to substitute for `ORDER BY created_at DESC`, eliminating a secondary index on every timeline query.
- Clock skew is Snowflake's one real vulnerability. The mitigation is to pause on detection and wait for the clock to advance, not to ignore it.
- For short human-readable codes (URL shorteners), counter + base62 beats hashing: unique by construction, no retry loop, no birthday problem.
- Batch your counter claims. `INCRBY 1000` instead of `INCR` reduces Redis round-trips by 1,000x at high write rates.
Quick recap
- Auto-increment IDs are a hidden time bomb in sharded architectures. They produce collisions the moment rows from different shards must share one ID space.
- UUID v4 is globally unique but random, causing B-tree index fragmentation at high write volumes. UUID v7 solves this by embedding a timestamp prefix.
- Snowflake IDs pack 41 bits of millisecond timestamp, 10 bits of worker ID, and 12 bits of sequence into 8 bytes. They are time-sortable, globally unique, and fast to generate without coordination.
- Clock skew is Snowflake's fundamental vulnerability. Always pause on detection rather than continuing at the same timestamp.
- For human-readable short codes, counter plus base62 encoding beats hashing: guaranteed unique by construction, no retry loop, no birthday problem.
- Worker ID assignment at startup is an operational problem that hardcoded configs solve simply for small deployments and coordination services solve reliably for large ones.
Related concepts
- Databases → Snowflake IDs eliminate the need for a secondary index on `created_at`. Understanding B-tree index fragmentation explains why UUID v4 degrades write throughput at 50M+ rows.
- Cursor Pagination → Time-sortable IDs make the primary key the natural pagination cursor. `WHERE tweet_id < :cursor ORDER BY tweet_id DESC` works correctly without a separate timestamp column.
- Caching → Redis `INCRBY` is the coordination mechanism for counter-based ID schemes. Understanding Redis atomic operations explains why counter batching with `INCRBY 1000` is safe and efficient.
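The cursor-pagination point is easy to demonstrate. A SQLite sketch with plain sequential integers standing in for time-sortable IDs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tweets (tweet_id INTEGER PRIMARY KEY, text TEXT)")
conn.executemany("INSERT INTO tweets VALUES (?, ?)",
                 [(i, f"t{i}") for i in range(1, 11)])

def page(cursor_id, limit=3):
    # Keyset pagination: the time-sortable primary key is the cursor,
    # so no separate created_at column or index is needed
    return conn.execute(
        "SELECT tweet_id FROM tweets WHERE tweet_id < ? "
        "ORDER BY tweet_id DESC LIMIT ?",
        (cursor_id, limit)).fetchall()

print(page(11))  # [(10,), (9,), (8,)]
print(page(8))   # [(7,), (6,), (5,)]
```

The last ID of each page becomes the cursor for the next request, which stays correct even as new rows are inserted at the head.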