Design a URL shortener
Walk through a complete URL shortener design, from a single write path to a globally distributed system serving 1M redirects per second across 1B stored links in under 100ms.
What is a URL shortener?
A URL shortener converts a long URL into a short, shareable code. Visit tinyurl.com/3yk5m9f and you land on a product page with a 200-character URL. The interesting engineering problems are not the conversion itself; they are generating billions of globally unique codes without collisions, serving redirects in under 100ms at 1M requests per second, and surviving a read workload that dwarfs writes by a factor of 1,000.
Functional Requirements
Core Requirements
- Users can submit a long URL and receive a shortened version.
- Optionally, users can specify a custom alias for the short URL.
- Optionally, users can set an expiration date on the short URL.
- Visiting a short URL redirects the user to the original long URL.
Below the Line (out of scope)
- User authentication and account management
- Click analytics and geographic tracking
- Spam and malicious URL detection
- QR code generation
The hardest part in scope: Generating unique short codes at scale. We need globally unique 6-8 character codes at approximately 1,160 writes per second, with no collision risk and no retry overhead. We will dedicate a full deep dive to it.
User authentication is below the line because it does not change the write or redirect paths we are designing. To add it, I would associate each shortened URL with a user_id from a session token and expose a GET /users/{id}/urls endpoint for managing a user's links.
Click analytics is below the line because it introduces a separate write path and storage tier. To add it, I would emit a click event to a Kafka topic on every redirect and process it asynchronously into a time-series database. That pipeline sits beside the main system rather than inside it.
Spam detection is below the line because it requires an ML-backed classification model. To add it, I would run a synchronous check against a URL reputation service before inserting and queue suspicious URLs for async review.
QR code generation is below the line because it is stateless and does not interact with the core URL mapping. To add it, I would generate the QR code on demand from the short URL without storing anything new.
Non-Functional Requirements
Core Requirements
- Uniqueness: Each short code maps to exactly one long URL, globally. No collisions permitted.
- Availability: 99.99% uptime. Availability over consistency for redirects (a stale cache hit is better than a 500 timeout).
- Latency: Redirect completes in under 100ms p99. Short URL creation can tolerate up to 500ms.
- Scale: 1B total stored URLs, 100M DAU. Write rate peaks at approximately 1,160 new URLs per second. Read rate peaks at approximately 1.16M redirects per second.
Below the Line
- Sub-10ms redirect latency via CDN edge caching
- Real-time click analytics consistency
Read/write ratio: For every 1 new URL created, expect roughly 1,000 redirects. This 1000:1 skew is the single most important number in this design. It determines the caching strategy, the service split, and the database tier. Nearly every architectural decision in this article traces back to it.
Under 100ms redirect latency means a direct database lookup on every request is not viable (a single DB round-trip adds 10-50ms before accounting for query time). The 99.99% availability target means a single database node is not acceptable for the read path.
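The back-of-envelope math behind these figures can be checked in a few lines. This assumes 100M new URLs per day (roughly one create per DAU) and the stated 1000:1 read/write ratio; both are requirement assumptions, not measurements:

```python
# Rough capacity math behind the targets above. Assumes 100M new URLs
# per day (about one create per DAU) and the 1000:1 read/write ratio.
SECONDS_PER_DAY = 86_400

writes_per_sec = 100_000_000 / SECONDS_PER_DAY   # ~1,157 creates/sec
reads_per_sec = writes_per_sec * 1_000           # ~1.16M redirects/sec

print(f"writes/sec ~ {writes_per_sec:,.0f}")
print(f"reads/sec  ~ {reads_per_sec:,.0f}")
```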
Core Entities
- ShortURL: The core mapping from a short code to a long URL. Carries the short code, original URL, optional custom alias, optional expiration timestamp, and a creation timestamp.
- User (out of scope for now): The account that created the short URL. We would reference user_id in the full schema once authentication is in scope.
The full schema, indexes, and column types are deferred to the data model deep dive. The entities above are sufficient to drive the API design and High-Level Design.
API Design
Shorten a URL:
POST /urls
Body: { long_url, custom_alias?, expiration_date? }
Response: { short_url }
Redirect to the original URL:
GET /{short_code}
Response: HTTP 302 → original long URL
Delete a short URL:
DELETE /urls/{short_code}
Response: 204 No Content
302 vs 301: Use 302 (temporary redirect) over 301 (permanent redirect). A 301 tells browsers to cache the redirect permanently, so the browser never asks our server again. This matters for three reasons: expired URLs need to return 410 Gone rather than silently redirecting to a dead page; custom aliases can be reassigned; the redirect target can be updated without asking users to clear their cache.
The GET /{short_code} endpoint is the hot path, handling over 99% of all traffic. Every other endpoint is noise by comparison. I would split it into a dedicated service for exactly this reason.
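The response-code choices above can be captured as one small decision function. This is an illustrative sketch, assuming a looked-up record is a dict with a long_url field and an optional expires_at epoch timestamp (both field names are made up, not a fixed schema):

```python
# Minimal sketch of the response-code decision on the redirect path.
# Field names (long_url, expires_at) are illustrative assumptions.
import time

def redirect_response(record, now=None):
    """Return (status_code, location) for a looked-up short code."""
    now = time.time() if now is None else now
    if record is None:
        return 404, None                          # unknown short code
    expires_at = record.get("expires_at")
    if expires_at is not None and expires_at < now:
        return 410, None                          # expired: Gone, never a 302
    return 302, record["long_url"]                # temporary redirect, not 301
```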
High-Level Design
1. Users can submit a long URL and receive a shortened version
The write path: client submits a long URL, server generates a short code, database stores the mapping.
Components:
- Client: Web or mobile interface sending POST /urls requests.
- Write Server: Validates the URL, generates a short code (treated as a black box here, detailed in the deep dives), writes the mapping to the database.
- Database: Stores the short_code → long_url mapping plus the optional custom alias and expiration date.
Request walkthrough:
- Client sends POST /urls with the long URL and an optional custom alias.
- Write Server validates the URL format.
- Write Server checks whether a custom alias is requested and not already taken.
- Write Server generates a short code.
- Write Server inserts { short_code, long_url, expiration_date } into the database.
- Write Server returns the constructed short URL to the client.
flowchart LR
C(["👤 Client\nWeb / mobile app"])
WS["⚙️ Write Server\nValidate URL format\nGenerate short code · INSERT to DB"]
DB[("🗄️ PostgreSQL\nshort_code → long_url\nUNIQUE short_code constraint")]
C -->|"POST /urls · long_url"| WS
WS -->|"INSERT short_code + long_url"| DB
WS -->|"Returns short_url"| C
The write path only: client sends a long URL, the Write Server validates it, generates a short code, and stores the mapping. The redirect path and caching layer come in the next requirement.
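The write-path steps above can be sketched in a few lines. Validation uses the standard library's urllib.parse, the code generator is injected and treated as a black box (it gets its own deep dive), a plain dict stands in for the urls table, and the short.example domain is invented for illustration:

```python
# Sketch of the write path: validate, generate a code, store the mapping.
# The dict and the short.example base URL are stand-in assumptions.
from urllib.parse import urlparse

def is_valid_url(long_url: str) -> bool:
    parsed = urlparse(long_url)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)

def shorten(store, long_url, generate_code, base="https://short.example"):
    if not is_valid_url(long_url):
        return 400, None                    # reject malformed input early
    code = generate_code()                  # black box: unique short code
    store[code] = long_url                  # INSERT short_code -> long_url
    return 201, f"{base}/{code}"
```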
2. Users can access the original URL by visiting the short link
The read path: client visits the short URL, server looks up the mapping, returns a 302 redirect. At a 1000:1 read/write ratio, the database cannot absorb this traffic directly; a cache is required.
Components:
- Read Server: Receives GET /{short_code}, checks the cache, falls back to the database on a miss, and returns a 302 redirect.
- Redis Cache: Stores short_code → long_url mappings. Sub-millisecond lookups with a TTL set to the URL's expiration date.
- Database: Handles cache misses. After the first miss, the mapping is cached; the database is rarely hit again for popular codes.
Request walkthrough:
- Client sends GET /{short_code}.
- Read Server checks Redis for the short code.
- Cache hit: Redis returns the long URL. Read Server responds with 302 → long_url. Done in under 1ms.
- Cache miss: Read Server queries the database, gets the long URL, writes it back to Redis (cache-aside pattern), then responds with 302 → long_url.
flowchart LR
C(["👤 Client\nWeb / mobile app"])
RS["⚙️ Read Server\nLookup short_code · return 302\nHandles 99%+ of all traffic"]
RC["⚡ Redis Cache\nIn-memory key-value store · < 1ms\nshort_code → long_url · ~90%+ hit rate"]
DB[("🗄️ PostgreSQL\nCache miss fallback only\nPrimary key lookup · ~10-50ms")]
C -->|"GET /{short_code}"| RS
RS -->|"Cache lookup"| RC
RC -.->|"Cache miss → fall back to DB"| DB
DB -->|"Returns long_url"| RS
RS -->|"302 → long_url"| C
The Read Server bypasses the primary database on cache hits. With a warm cache absorbing 90%+ of lookups, the database handles cold starts and rare misses only.
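The cache-aside lookup described above reduces to a short function. Plain dicts stand in for Redis and PostgreSQL here; the point is the control flow, not the clients:

```python
# Cache-aside read path: check the cache, fall back to the DB on a miss,
# write back so later lookups for the same code hit the cache.
def resolve(cache, db, short_code):
    long_url = cache.get(short_code)
    if long_url is not None:
        return 302, long_url                # cache hit: sub-millisecond path
    long_url = db.get(short_code)           # miss: one DB round-trip
    if long_url is None:
        return 404, None                    # unknown short code
    cache[short_code] = long_url            # write back (cache-aside)
    return 302, long_url
```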
The Write Server and Read Server are shown as separate services from the start because of the 1000:1 traffic ratio. They scale independently in every subsequent diagram.
3. Custom alias support
Custom aliases must be globally unique across all users. The write path gains a uniqueness check, and the database's UNIQUE constraint becomes the final arbiter for concurrent conflicts.
Components:
- Write Server (updated): Before inserting, check whether the requested alias is already taken. Return 409 Conflict if it is.
- Database (updated): The short_code column carries a UNIQUE constraint. This is the safety net for concurrent requests that both pass the application-level check before either commits.
Request walkthrough:
- Client sends POST /urls with custom_alias: "mycompany".
- Write Server queries the database: SELECT 1 FROM urls WHERE short_code = 'mycompany'.
- If the alias is taken, return 409 Conflict immediately.
- If free, proceed with the insert. The UNIQUE constraint prevents a race condition between the check and the insert.
- Write Server populates the Redis cache for the new alias immediately, so the first redirect is a cache hit.
flowchart LR
C(["👤 Client\nWeb / mobile app"])
WS["⚙️ Write Server\nCheck alias · INSERT if free\n409 Conflict if taken · seed cache after insert"]
DB[("🗄️ PostgreSQL\nUNIQUE short_code · final conflict arbiter\nConstraint blocks concurrent duplicate inserts")]
RC["⚡ Redis Cache\nIn-memory · < 1ms\nPre-seeded on write\nEliminates cold-start miss on first redirect"]
C -->|"POST /urls · custom_alias='mycompany'"| WS
WS -->|"SELECT · check alias availability"| DB
DB -->|"Alias free or taken"| WS
WS -->|"INSERT (UNIQUE constraint enforced)"| DB
WS -->|"SET alias in cache immediately"| RC
WS -->|"Returns short_url or 409 Conflict"| C
Writing the alias to cache immediately after insertion eliminates the cold-redirect miss for URLs shared right after creation. The SELECT-then-INSERT pattern alone has a race condition: two concurrent requests for the same alias can both pass the SELECT check before either INSERT commits. The UNIQUE constraint on the short_code column makes this safe. One insert wins; the other gets a constraint violation, which the Write Server translates into a 409 response.
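Translating the constraint violation into a 409 looks roughly like this. FakeTable and UniqueViolation are stand-ins for the real table and the database driver's error type; in PostgreSQL the equivalent is the UNIQUE short_code constraint raising a unique-violation error on the losing INSERT:

```python
# Sketch: the UNIQUE constraint decides the race; the Write Server just
# translates the violation into a 409. FakeTable is a stand-in.
class UniqueViolation(Exception):
    pass

class FakeTable:
    def __init__(self):
        self.rows = {}

    def insert(self, short_code, long_url):
        if short_code in self.rows:         # UNIQUE constraint enforced here
            raise UniqueViolation(short_code)
        self.rows[short_code] = long_url

def create_custom_alias(table, alias, long_url):
    try:
        table.insert(alias, long_url)       # constraint is the final arbiter
    except UniqueViolation:
        return 409, None                    # loser of the race gets a 409
    return 201, f"/{alias}"
```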
4. URL expiration
Expired URLs must never serve a redirect. The expiration check must not add an extra round-trip to every request, and the database must not grow unboundedly with dead rows.
Components:
- Redis Cache (updated): Each cache entry carries a TTL equal to expiration_date - now. Cache entries auto-evict when the URL expires.
- Read Server (updated): On a cache miss for an expired URL, the database row may still exist. Check expiration_date inline on every cache miss before responding. Return 410 Gone if expired.
- Background Cleanup Job: Runs periodically (daily). Hard-deletes rows where expiration_date < NOW() in large batches. Prevents the database from accumulating unbounded expired rows.
Request walkthrough for an expired URL:
- Client sends GET /{short_code} for an expired short URL.
- Redis has no entry (TTL expired). Cache miss.
- Read Server queries the database. The row may still exist.
- Read Server checks expiration_date. If expiration_date < now, return 410 Gone.
- No cache repopulation. The Background Cleanup Job eventually hard-deletes the expired row.
flowchart LR
C(["👤 Client\nWeb / mobile app"])
RS["⚙️ Read Server\nInline expiry check on cache miss\nNo extra DB round-trip to check expiry"]
RC["⚡ Redis Cache\nIn-memory · < 1ms\nTTL = expiration_date - now\nAuto-evicts when URL expires"]
DB[("🗄️ PostgreSQL\nPartial index on expiration_date\nRow may outlive its Redis TTL")]
BG["⚙️ Cleanup Job\nRuns daily via cron\nBatched DELETE: 10K rows/batch\nThrottled 100ms between batches"]
C -->|"GET /{short_code}"| RS
RS -->|"Cache lookup"| RC
RC -.->|"TTL expired → cache miss"| RS
RS -->|"SELECT + inline expiry check"| DB
DB -->|"Row or null + expiration_date"| RS
RS -->|"410 Gone or 302 redirect"| C
BG -.->|"DELETE WHERE expiration_date < NOW()"| DB
The Cleanup Job runs asynchronously and on a gentle schedule. Expired rows linger briefly in the database, but the inline expiry check on every cache miss ensures no stale 302 is ever served.
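The cleanup loop itself is simple. This sketch uses the batch size and pause interval from the diagram, but both are tuning assumptions; execute_delete abstracts a batched DELETE such as PostgreSQL's DELETE FROM urls WHERE ctid IN (SELECT ctid FROM urls WHERE expiration_date < NOW() LIMIT 10000):

```python
# Throttled batch-delete loop. BATCH_SIZE and PAUSE_SECONDS match the
# diagram but are tuning assumptions; execute_delete wraps the real SQL.
import time

BATCH_SIZE = 10_000
PAUSE_SECONDS = 0.1

def cleanup_expired(execute_delete, sleep=time.sleep):
    """execute_delete(limit) removes up to `limit` expired rows and
    returns how many it deleted. Returns the total rows removed."""
    total = 0
    while True:
        deleted = execute_delete(BATCH_SIZE)
        total += deleted
        if deleted < BATCH_SIZE:            # short batch: nothing left
            return total
        sleep(PAUSE_SECONDS)                # throttle to protect reads
```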
Potential Deep Dives
1. How do we generate unique short codes?
Three constraints drive the design:
- Codes must be globally unique. Two different long URLs must never produce the same short code.
- Codes should be 6-8 characters for readability and shareability.
- Generation must be fast. At 1,160 writes per second, code generation is on the hot path of every create request.
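The counter-based approach the deep dive lands on can be sketched directly: every create claims the next integer ID and base62-encodes it, so codes are unique by construction with no collision handling on the write path:

```python
# Counter-based code generation sketch: base62-encode a monotonically
# increasing integer ID. No hashing, no birthday problem, no retries.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

def encode_base62(n: int) -> str:
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n > 0:
        n, rem = divmod(n, 62)
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))        # most-significant digit first

# Six characters cover 62**6 = 56,800,235,584 codes: ~56x headroom
# over the 1B stored-URL target.
```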
2. How do we scale redirects to 100M DAU under 100ms?
Three constraints drive the design:
- 1.16M redirect requests per second at peak.
- Under 100ms p99 for every redirect.
- A single database instance handles approximately 10K-50K reads per second. We are 23x beyond that at peak.
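These constraints also pin down the cache hit rate the system must sustain. Taking the optimistic end of the database's 10K-50K reads-per-second range, misses must stay under roughly 4.3% of redirects:

```python
# The cache hit rate the constraints above force: DB misses must fit
# within a single instance's read capacity (optimistic 50K figure).
PEAK_READS_PER_SEC = 1_160_000
DB_READ_CAPACITY = 50_000

required_hit_rate = 1 - DB_READ_CAPACITY / PEAK_READS_PER_SEC
print(f"required cache hit rate: {required_hit_rate:.1%}")   # ~95.7%
```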
3. How do we handle URL expiration efficiently?
Three constraints drive the design:
- Expired URLs must never serve a redirect. A 410 Gone must be returned.
- The expiration check must not add a separate database round-trip to every redirect.
- The database must not accumulate unbounded expired rows over the lifetime of the service.
Final Architecture
flowchart LR
subgraph Clients["👤 Client Layer"]
U(["👤 User\nWeb / mobile app"])
end
subgraph EdgeLayer["🌐 CDN Edge Layer"]
CDN["🌐 CDN Edge Node\n< 10ms globally\nCache-Control header drives TTL\nAbsorbs repeat redirects at edge"]
end
subgraph Gateway["🔀 Gateway Layer"]
AG["🔀 API Gateway\nRoute reads → Read Service\nRoute writes → Write Service\nRate limiting · auth headers"]
end
subgraph AppTier["⚙️ Application Tier"]
WS["⚙️ Write Service\nValidate · generate short code\nBatch ID fetch: INCRBY 1K IDs\nINSERT + seed cache on write"]
RS["⚙️ Read Service\nLookup + inline expiry check\nReturn 302 / 410 / 404\nHandles 99%+ of all traffic"]
GC["⚡ Counter (Redis)\nIn-memory · atomic INCR/INCRBY\n100K+ ops/sec · no lock needed\nBatch size: 1K IDs per claim"]
BG["⚙️ Cleanup Job\nRuns daily · 10K rows/batch\n100ms sleep between batches\nKeeps DB table size bounded"]
end
subgraph CacheTier["⚡ Cache Tier"]
RC["⚡ Redis Cluster\nIn-memory · < 1ms per lookup\nshort_code → long_url\nTTL = expiration_date · 6 nodes\n~90%+ cache hit rate for reads"]
end
subgraph DBTier["🗄️ Database Tier"]
PDB[("🟢 Primary DB (PostgreSQL)\nWrites only · ACID guarantees\nUNIQUE short_code · source of truth")]
RR[("🔵 Read Replica\nCache miss fallback only\nAsync replication · ~10-50ms lag\nNever receives direct writes")]
end
U -->|"GET /{short_code}"| CDN
U -->|"POST /urls · writes"| AG
CDN -->|"Cache miss → origin"| AG
AG -->|"Write requests"| WS
AG -->|"Read requests"| RS
WS -->|"INCRBY · claim 1K ID batch"| GC
WS -->|"INSERT short_code + long_url"| PDB
WS -->|"SET short_code in cache"| RC
RS -->|"GET short_code"| RC
RC -.->|"Cache miss"| RR
RR -->|"Row + expiration_date"| RS
RS -->|"SETEX · repopulate cache"| RC
PDB -.->|"Async replication · ~10-50ms"| RR
BG -.->|"DELETE expired rows"| PDB
The read/write split is the core insight. Write Service and Read Service scale independently based on the 1000:1 traffic ratio. Redis Cluster absorbs over 90% of redirect lookups at under 1ms, keeping the primary database reserved exclusively for writes. CDN edge caching adds a third layer that handles repeat accesses for popular codes without touching our infrastructure at all.
Interview Cheat Sheet
- Lock down 3-4 core features and name what is explicitly out of scope before drawing anything.
- State the read/write ratio immediately (1000:1 for URL shorteners) because it explains every downstream architectural decision.
- Counter-based short code generation beats hashing: unique by construction, no retry logic, no birthday problem.
- A 6-character base62 code covers 62^6 ≈ 56.8B values, giving ~56x headroom over the 1B URL target.
- INCR is atomic. Counter-based generation needs no distributed lock and no collision handling.
- Use 302 (temporary) not 301 (permanent) so browsers do not cache redirects, which would break expiration and alias updates.
- Split Write Service and Read Service early. They have completely different load profiles at a 1000:1 traffic ratio.
- Redis cache for short_code → long_url lookups drops redirect latency from ~20ms (read replica) to under 1ms.
- Counter batching reduces per-write Redis round-trips by 1000x: each Write Service instance claims 1,000 IDs at once via INCRBY.
- For multi-region, allocate disjoint counter ranges per region (A: 0-1B, B: 1B-2B). No cross-region coordination needed.
- Set Redis TTL equal to expiration_date - now on write. Expired entries evict automatically; do not repopulate the cache on an expired cache miss.
- The inline expiration check on a cache miss (not a separate query) keeps the read path to a single database round-trip.
- The background cleanup job runs in batches with throttled intervals. Expired rows linger briefly, but the inline check ensures no stale 302.
- The UNIQUE constraint on short_code is the final safety net for concurrent custom alias requests. One insert wins; the other gets a 409.
- CDN edge caching for popular short codes pushes redirect latency under 10ms and removes the majority of traffic from origin servers entirely.