URL Shortener
Walk through a complete URL shortener design, from a single write path to a globally distributed system serving 1M redirects per second across 1B stored links in under 100ms.
What is a URL shortener?
A URL shortener converts a long URL into a short, shareable code. Visit tinyurl.com/3yk5m9f and you land on a product page with a 200-character URL. The interesting engineering problems are not the conversion itself; they are generating billions of globally unique codes without collisions, serving redirects in under 100ms at 1M requests per second, and surviving a read workload that dwarfs writes by a factor of 1,000.
I use this as a warm-up question in interviews because every candidate thinks it is easy, but the 1000:1 read/write skew forces design tradeoffs that most people do not anticipate until they start drawing.
Functional Requirements
Core Requirements
- Users can submit a long URL and receive a shortened version.
- Optionally, users can specify a custom alias for the short URL.
- Optionally, users can set an expiration date on the short URL.
- Visiting a short URL redirects the user to the original long URL.
Below the Line (out of scope)
- User authentication and account management
- Click analytics and geographic tracking
- Spam and malicious URL detection
- QR code generation
The hardest part in scope: Generating unique short codes at scale. We need globally unique 6-8 character codes at approximately 1,160 writes per second without collision risk or retry overhead. We will dedicate a full deep dive to it.
User authentication is below the line because it does not change the write or redirect paths we are designing. To add it, I would associate each shortened URL with a user_id from a session token and expose a GET /users/{id}/urls endpoint for managing a user's links.
Click analytics is below the line because it introduces a separate write path and storage tier. To add it, I would emit a click event to a Kafka topic on every redirect and process it asynchronously into a time-series database. That pipeline sits beside the main system rather than inside it.
Spam detection is below the line because it requires an ML-backed classification model. To add it, I would run a synchronous check against a URL reputation service before inserting and queue suspicious URLs for async review.
QR code generation is below the line because it is stateless and does not interact with the core URL mapping. To add it, I would generate the QR code on demand from the short URL without storing anything new.
Non-Functional Requirements
Core Requirements
- Uniqueness: Each short code maps to exactly one long URL, globally. No collisions permitted.
- Availability: 99.99% uptime. Availability over consistency for redirects (a stale cache hit is better than a 500 timeout).
- Latency: Redirect completes in under 100ms p99. Short URL creation can tolerate up to 500ms.
- Scale: 1B total stored URLs, 100M DAU. Write rate peaks at approximately 1,160 new URLs per second. Read rate peaks at approximately 1.16M redirects per second.
Below the Line
- Sub-10ms redirect latency via CDN edge caching
- Real-time click analytics consistency
Read/write ratio: For every 1 new URL created, expect roughly 1,000 redirects. This 1000:1 skew is the single most important number in this design. It determines the caching strategy, the service split, and the database tier. Nearly every architectural decision in this article traces back to it.
Under 100ms redirect latency means a direct database lookup on every request is not viable (a single DB round-trip adds 10-50ms before accounting for query time). The 99.99% availability target means a single database node is not acceptable for the read path.
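The scale numbers above come from simple back-of-envelope arithmetic. A minimal sketch, assuming roughly one new URL per daily active user per day (the assumption implied by the ~1,160/s figure):

```python
# Back-of-envelope check of the stated scale numbers.
SECONDS_PER_DAY = 86_400
DAU = 100_000_000
URLS_PER_USER_PER_DAY = 1   # assumption behind the ~1,160 writes/s figure
READS_PER_WRITE = 1_000     # the 1000:1 skew

avg_writes_per_sec = DAU * URLS_PER_USER_PER_DAY / SECONDS_PER_DAY
avg_reads_per_sec = avg_writes_per_sec * READS_PER_WRITE

print(f"writes/s ~ {avg_writes_per_sec:,.0f}")  # ~1,157, i.e. ~1,160
print(f"reads/s  ~ {avg_reads_per_sec:,.0f}")   # ~1.16M
```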
Core Entities
- ShortURL: The core mapping from a short code to a long URL. Carries the short code, original URL, optional custom alias, optional expiration timestamp, and a creation timestamp.
- User (out of scope for now): The account that created the short URL. We would reference user_id in the full schema once authentication is in scope.
The full schema, indexes, and column types are deferred to the data model deep dive. The entities above are sufficient to drive the API design and High-Level Design.
API Design
Shorten a URL:
POST /urls
Body: { long_url, custom_alias?, expiration_date? }
Response: { short_url }
Redirect to the original URL:
GET /{short_code}
Response: HTTP 302 → original long URL
Delete a short URL:
DELETE /urls/{short_code}
Response: 204 No Content
302 vs 301: Use 302 (temporary redirect) over 301 (permanent redirect). A 301 tells browsers to cache the redirect permanently, so the browser never asks our server again. This matters for three reasons: expired URLs need to return 410 Gone rather than silently redirecting to a dead page; custom aliases can be reassigned; and the redirect target can be updated without asking users to clear their cache.
The GET /{short_code} endpoint is the hot path, handling over 99% of all traffic. Every other endpoint is noise by comparison. I would split it into a dedicated service for exactly this reason.
High-Level Design
1. Users can submit a long URL and receive a shortened version
The write path: client submits a long URL, server generates a short code, database stores the mapping.
Components:
- Client: Web or mobile interface sending POST /urls requests.
- Write Server: Validates the URL, generates a short code (treated as a black box here, detailed in the deep dives), writes the mapping to the database.
- Database: Stores the short_code → long_url mapping plus the optional custom alias and expiration date.
Request walkthrough:
- Client sends POST /urls with the long URL and an optional custom alias.
- Write Server validates the URL format.
- Write Server checks whether a custom alias is requested and not already taken.
- Write Server generates a short code.
- Write Server inserts { short_code, long_url, expiration_date } into the database.
- Write Server returns the constructed short URL to the client.
The write path only: client sends a long URL, the Write Server validates it, generates a short code, and stores the mapping. I'd call out to the interviewer that code generation is a black box for now and move on. The redirect path and caching layer come in the next requirement.
2. Users can access the original URL by visiting the short link
The read path: client visits the short URL, server looks up the mapping, returns a 302 redirect. At a 1000:1 read/write ratio, the database cannot absorb this traffic directly; a cache is required.
Components:
- Read Server: Receives GET /{short_code}, checks the cache, falls back to the database on a miss, and returns a 302 redirect.
- Redis Cache: Stores short_code → long_url mappings. Sub-millisecond lookups with a TTL set to the URL's expiration date.
- Database: Handles cache misses. After the first miss, the mapping is cached; the database is rarely hit again for popular codes.
Request walkthrough:
- Client sends GET /{short_code}.
- Read Server checks Redis for the short code.
- Cache hit: Redis returns the long URL. Read Server responds with 302 → long_url. Done in under 1ms.
- Cache miss: Read Server queries the database, gets the long URL, writes it back to Redis (cache-aside pattern), then responds with 302 → long_url.
The Read Server bypasses the primary database on cache hits. With a warm cache absorbing 90%+ of lookups, the database handles cold starts and rare misses only. This is the moment in the interview where the design starts to feel real: you have a dedicated read path that can scale independently from writes.
The Write Server and Read Server are shown as separate services from the start because of the 1000:1 traffic ratio. They scale independently in every subsequent diagram.
3. Custom alias support
Custom aliases must be globally unique across all users. The write path gains a uniqueness check, and the database's UNIQUE constraint becomes the final arbiter for concurrent conflicts.
Components:
- Write Server (updated): Before inserting, check whether the requested alias is already taken. Return 409 Conflict if it is.
- Database (updated): The short_code column carries a UNIQUE constraint. This is the safety net for concurrent requests that both pass the application-level check before either commits.
Request walkthrough:
- Client sends POST /urls with custom_alias: "mycompany".
- Write Server queries the database: SELECT 1 FROM urls WHERE short_code = 'mycompany'.
- If the alias is taken, return 409 Conflict immediately.
- If free, proceed with the insert. The UNIQUE constraint prevents a race condition between the check and the insert.
- Write Server populates the Redis cache for the new alias immediately, so the first redirect is a cache hit.
Writing the alias to cache immediately after insertion eliminates the cold-redirect miss for URLs shared right after creation. The SELECT-then-INSERT pattern has a race condition: two concurrent requests for the same alias can both pass the SELECT check before either INSERT commits. The UNIQUE constraint on the database column makes this safe. One insert wins; the other gets a constraint violation, which the Write Server translates into a 409 response.
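A minimal sketch of this flow using SQLite as a stand-in for the real database (the table and column names are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (short_code TEXT UNIQUE, long_url TEXT)")

def create_alias(short_code: str, long_url: str) -> int:
    # Application-level check: fast 409 for the common case.
    taken = conn.execute(
        "SELECT 1 FROM urls WHERE short_code = ?", (short_code,)
    ).fetchone()
    if taken:
        return 409
    try:
        # The UNIQUE constraint is the final arbiter: it catches the race
        # where two requests both passed the SELECT before either INSERT.
        conn.execute(
            "INSERT INTO urls (short_code, long_url) VALUES (?, ?)",
            (short_code, long_url),
        )
        return 201
    except sqlite3.IntegrityError:
        return 409
```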
4. URL expiration
Expired URLs must never serve a redirect. The expiration check must not add an extra round-trip to every request, and the database must not grow unboundedly with dead rows.
Components:
- Redis Cache (updated): Each cache entry carries a TTL equal to expiration_date - now. Cache entries auto-evict when the URL expires.
- Read Server (updated): On a cache miss for an expired URL, the database row may still exist. Check expiration_date inline on every cache miss before responding. Return 410 Gone if expired.
- Background Cleanup Job: Runs periodically (daily). Hard-deletes rows where expiration_date < NOW() in large batches. Prevents the database from accumulating unbounded expired rows.
Request walkthrough for an expired URL:
- Client sends GET /{short_code} for an expired short URL.
- Redis has no entry (TTL expired). Cache miss.
- Read Server queries the database. The row may still exist.
- Read Server checks expiration_date. If expiration_date < now, return 410 Gone.
- No cache repopulation. The Background Cleanup Job eventually hard-deletes the expired row.
The Cleanup Job runs asynchronously and on a gentle schedule. Expired rows linger briefly in the database, but the inline expiry check on every cache miss ensures no stale 302 is ever served.
Potential Deep Dives
1. How do we generate unique short codes?
Three constraints drive the design:
- Codes must be globally unique. Two different long URLs must never produce the same short code.
- Codes should be 6-8 characters for readability and shareability.
- Generation must be fast. At 1,160 writes per second, code generation is on the hot path of every create request.
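The counter-plus-base62 approach discussed in the cheat sheet satisfies all three constraints: a monotonically increasing ID is unique by construction, and base62 keeps the code short. A minimal encoding sketch (the alphabet ordering is one common convention, not the only one):

```python
import string

# 62-character alphabet: 0-9, a-z, A-Z.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode_base62(n: int) -> str:
    # Repeated divmod converts the counter value to base 62.
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, r = divmod(n, 62)
        digits.append(ALPHABET[r])
    return "".join(reversed(digits))

# 6 characters cover 62**6 = 56.8B codes: 56x headroom over 1B URLs.
assert 62**6 == 56_800_235_584
```

Because the counter only ever increases, no two create requests can receive the same code, so there is no collision check and no retry loop on the write path.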
2. How do we scale redirects to 100M DAU under 100ms?
Three constraints drive the design:
- 1.16M redirect requests per second at peak.
- Under 100ms p99 for every redirect.
- A single database instance handles approximately 10K-50K reads per second. We are 23x beyond that at peak.
3. How do we handle URL expiration efficiently?
Three constraints drive the design:
- Expired URLs must never serve a redirect. A 410 Gone must be returned.
- The expiration check must not add a separate database round-trip to every redirect.
- The database must not accumulate unbounded expired rows over the lifetime of the service.
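The batched cleanup job from requirement 4 addresses the third constraint. A minimal sketch using SQLite as a stand-in, with illustrative table and column names; a real job would sleep between batches to throttle load:

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE urls (short_code TEXT UNIQUE, expiration_date REAL)")

# Seed ten rows: odd-numbered codes are already expired.
now = time.time()
conn.executemany(
    "INSERT INTO urls VALUES (?, ?)",
    [(f"c{i}", now - 1 if i % 2 else now + 3600) for i in range(10)],
)

def cleanup_expired(batch_size: int = 3) -> int:
    # Delete expired rows in small batches so the job never holds long locks.
    deleted = 0
    while True:
        cur = conn.execute(
            "DELETE FROM urls WHERE rowid IN ("
            " SELECT rowid FROM urls WHERE expiration_date < ? LIMIT ?)",
            (time.time(), batch_size),
        )
        if cur.rowcount == 0:
            return deleted
        deleted += cur.rowcount
        # A production job would time.sleep(...) here between batches.
```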
Final Architecture
The read/write split is the core insight. Write Service and Read Service scale independently based on the 1000:1 traffic ratio. Redis Cluster absorbs over 90% of redirect lookups at under 1ms, keeping the primary database reserved exclusively for writes. CDN edge caching adds a third layer that handles repeat accesses for popular codes without touching our infrastructure at all.
Interview Cheat Sheet
- Lock down 3-4 core features and name what is explicitly out of scope before drawing anything.
- State the read/write ratio immediately (1000:1 for URL shorteners) because it explains every downstream architectural decision.
- Counter-based short code generation beats hashing: unique by construction, no retry logic, no birthday problem.
- A 6-character base62 code covers 62^6 = 56B values, giving 56x headroom over the 1B URL target.
- INCR is atomic. Counter-based generation needs no distributed lock and no collision handling.
- Use 302 (temporary) not 301 (permanent) so browsers do not cache redirects, which would break expiration and alias updates.
- Split Write Service and Read Service early. They have completely different load profiles at a 1000:1 traffic ratio.
- Redis cache for short_code → long_url lookups drops redirect latency from ~20ms (read replica) to under 1ms.
- Counter batching reduces per-write Redis round-trips by 1000x: each Write Service instance claims 1,000 IDs at once via INCRBY.
- For multi-region, allocate disjoint counter ranges per region (A: 0-1B, B: 1B-2B). No cross-region coordination needed.
- Set Redis TTL equal to expiration_date - now on write. Expired entries evict automatically; do not repopulate the cache on an expired cache miss.
- The inline expiration check on a cache miss (not a separate query) keeps the read path to a single database round-trip.
- The background cleanup job runs in batches with throttled intervals. Expired rows linger briefly, but the inline check ensures no stale 302.
- The UNIQUE constraint on short_code is the final safety net for concurrent custom alias requests. One insert wins; the other gets a 409.
- CDN edge caching for popular short codes pushes redirect latency under 10ms and removes the majority of traffic from origin servers entirely.