Pastebin
Design a text-sharing service that lets users store and share snippets of code or text, from a simple single-server prototype to a system handling millions of pastes with expiration, access control, and global CDN delivery.
What is Pastebin?
Pastebin stores arbitrary text or code snippets behind a short URL you can share with anyone. The apparent simplicity hides two real engineering problems: generating millions of unique short IDs without collisions, and serving a read workload that outpaces writes by 100:1 without turning the database into a bottleneck. It is a good interview question because it combines ID generation, blob storage decisions, TTL-based expiration, and caching strategy in a system compact enough to design end-to-end in 45 minutes.
Functional Requirements
Core Requirements
- Users can create a paste with a block of text or code.
- Each paste gets a unique short URL.
- Pastes can optionally expire after a set duration.
- Users can view a paste via its URL.
Below the Line
- Full-text search across all pastes.
- Real-time collaborative editing.
The hardest part in scope: generating globally unique short IDs without collisions while keeping the write path fast. We will dedicate Deep Dive 1 entirely to it.
Full-text search is below the line because it requires a separate search index (Elasticsearch or similar), a background ingestion pipeline, and query infrastructure that sits beside, not inside, the core read/write path. To add it, we would stream every new paste to a Kafka topic, have a consumer index the content, and build a search endpoint that queries the index rather than the paste database.
Real-time collaborative editing requires operational transforms or CRDTs, conflict resolution, cursor synchronization, and a WebSocket layer for broadcasting changes. This is effectively a separate product surface and does not interact with the batch paste storage model we are designing here.
Non-Functional Requirements
Core Requirements
- Low read latency: Paste content served in under 50ms p99.
- High availability: 99.99% uptime. Availability over consistency for paste reads.
- Scale: 10M DAU. 100K pastes created per day (~1.2 writes/second). 10M paste views per day (~116 reads/second).
- Durability: No silent data loss. A paste that was successfully created must be retrievable until it expires.
- Storage capacity: Average paste size 10KB. 100K pastes per day equals ~1GB of new content per day. Total storage requirement: ~1TB over 3 years.
Under 50ms latency for paste reads means a cache layer in front of the database is non-negotiable. A direct PostgreSQL read adds 10-30ms of query time before any network overhead.
99.99% uptime means we need at least one standby replica so a primary failure does not cause downtime. For paste reads, we prefer availability over consistency: a slightly stale cache hit is better than a failed request.
Below the Line
- Real-time view count analytics per paste.
- Abuse detection and spam filtering.
Real-time view count analytics is below the line because it requires a separate write path (incrementing a counter on every read) and an analytics pipeline that sits outside the core paste storage loop. To add it, I'd capture a view event on every GET, publish it to a Kafka topic, and aggregate counts asynchronously in a time-series store like ClickHouse or TimescaleDB. The core read path stays untouched.
Abuse detection and spam filtering is below the line because it needs content classification infrastructure (ML models, regex rulesets, URL scanning) that would add latency to the write path. To build it, I'd run an async content scanner that processes new pastes from a queue and flags or deletes ones that match abuse patterns, keeping the synchronous write path fast.
Read/write ratio: For every paste created, expect roughly 100 views. This 100:1 read/write skew is the number that shapes every downstream decision. It tells us the database read path is where we will suffer, that a cache is non-negotiable, and that read replicas matter more than write replicas for this system. I would call this out in an interview within the first three minutes.
Core Entities
- Paste: A short ID, raw content (or a reference to object storage), creation timestamp, expiration timestamp, and an optional owner ID.
- User: An account that owns pastes, relevant for rate limiting and distinguishing anonymous from registered users. Authentication is out of scope.
We will revisit schema details, including indexes and storage layout, in the deep dives. The entities above are enough to reason about the API and the data flow.
API Design
Start with one endpoint per functional requirement.
FR 1 - Create a paste:
# FR 1: Create a new paste
POST /pastes
Body: { content, expiration_seconds?, syntax_hint? }
Response: { paste_id, paste_url, expires_at? }
POST because we are creating a new resource and the server assigns the ID. The client does not supply the paste_id. Return the full paste_url so the client does not need to reconstruct it from the ID.
FR 2 - View a paste:
# FR 2: Retrieve paste content by ID
GET /pastes/{paste_id}
Response: { paste_id, content, created_at, expires_at? }
Unlike a URL shortener, Pastebin serves content directly rather than redirecting. Return 410 Gone for expired pastes rather than 404 Not Found, because 404 means "never existed" and 410 means "existed but is gone now." That distinction matters for debugging and for abuse detection.
FR 3 - Expiration is transparent. Any endpoint that retrieves a paste checks expires_at inline and returns 410 Gone if the paste is past its expiration. No separate expiration endpoint is needed. The background cleanup job covered in the HLD handles eventual hard deletion from storage.
Anonymous vs. registered users: anonymous users get rate-limited by IP address; registered users get a higher rate limit tied to their account. Enforce this with a Redis counter keyed by IP or user_id before allowing a paste creation. This prevents a single client from flooding the system with millions of pastes without requiring full authentication infrastructure.
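The per-client check described above can be sketched as a fixed-window counter. This is a simplified stand-in, not the production shape: a plain dict plays the role of Redis (production would use INCR plus EXPIRE per key, or a sliding window), and the limits, window size, and key format ("ip:..." / "user:...") are illustrative assumptions.

```python
import time

# Fixed-window rate limiter sketch. A plain dict stands in for Redis.
# Limits, window size, and key format are illustrative assumptions.
ANON_LIMIT = 5         # pastes per window for anonymous clients (keyed by IP)
USER_LIMIT = 50        # pastes per window for registered users (keyed by user_id)
WINDOW_SECONDS = 60

_counters = {}  # key -> (window_start, count)

def allow_paste(key, registered, now=None):
    """Return True if this client may create another paste in the current window."""
    now = time.time() if now is None else now
    limit = USER_LIMIT if registered else ANON_LIMIT
    window_start, count = _counters.get(key, (now, 0))
    if now - window_start >= WINDOW_SECONDS:   # window rolled over: start fresh
        window_start, count = now, 0
    if count >= limit:
        return False                           # over the limit: reject the POST
    _counters[key] = (window_start, count + 1)
    return True
```

The check runs before ID generation or any database write, so rejected requests cost almost nothing.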
High-Level Design
1. Users can create a paste
The write path: validate input, generate a short ID, and persist to the database.
For now, treat short ID generation as a black box. Deep Dive 1 walks through exactly three options and picks the right one.
Components:
- Client: Web or mobile interface sending POST /pastes requests.
- App Server: Validates input size (enforce a hard 10MB cap), generates a paste_id, and writes to the database.
- PostgreSQL: Stores paste_id, content, created_at, expires_at, and owner_id. The source of truth for all paste data.
Request walkthrough:
- Client sends POST /pastes with content and an optional TTL in seconds.
- App server validates the content size is under the limit and that the TTL is a positive integer.
- App server generates a short paste_id (black box for now).
- App server inserts the paste_id, content, and expires_at into PostgreSQL.
- App server returns the paste_url to the client.
The write path at 1.2 writes/second is trivial for a single PostgreSQL instance. I'd spend no more than 30 seconds on this in an interview before pivoting to the read side, which is where all the interesting problems live.
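The write-path walkthrough above can be sketched in a few lines. A dict stands in for PostgreSQL, the random ID generator is a placeholder for the scheme Deep Dive 1 develops, and the domain in the returned URL is an assumption.

```python
import secrets
import string
import time

MAX_PASTE_BYTES = 10 * 1024 * 1024   # the 10MB hard cap from the HLD
BASE62 = string.ascii_letters + string.digits

db = {}  # paste_id -> row dict; stands in for PostgreSQL

def generate_paste_id(length=6):
    # Placeholder: random base62 characters. Deep Dive 1 replaces this
    # with a collision-free counter-based scheme.
    return "".join(secrets.choice(BASE62) for _ in range(length))

def create_paste(content, ttl_seconds=None):
    """Validate, generate an ID, persist, and return the paste URL."""
    if len(content.encode()) > MAX_PASTE_BYTES:
        raise ValueError("content exceeds the 10MB limit")
    if ttl_seconds is not None and ttl_seconds <= 0:
        raise ValueError("TTL must be a positive integer")
    now = time.time()
    paste_id = generate_paste_id()
    db[paste_id] = {
        "paste_id": paste_id,
        "content": content,
        "created_at": now,
        "expires_at": now + ttl_seconds if ttl_seconds else None,
    }
    # paste.example.com is a placeholder domain
    return {"paste_id": paste_id, "paste_url": f"https://paste.example.com/{paste_id}"}
```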
2. Users can view a paste
The read path carries 99% of the traffic. At 100:1 read/write, every design decision here matters more than anything on the write side.
First, the naive approach: every GET request to /pastes/:paste_id reads directly from PostgreSQL. At 116 reads/second in steady state, PostgreSQL handles this without breaking a sweat. But a single viral paste changes everything: hundreds of concurrent requests for the same paste_id hit the same database row, causing lock contention and connection pool exhaustion. I've seen this exact failure mode in production, where one trending paste exhausted the connection pool for the entire service.
The fix is a Redis cache seeded on write. When a paste is created, the app server writes the content into Redis immediately with a TTL matching the paste's expires_at. On a cache hit, the request never touches the database.
Separate reads from writes at the database layer by adding a read replica so the primary only ever receives INSERTs and DELETEs.
Components added:
- Redis Cache: In-memory store, sub-1ms per lookup. Stores paste content keyed by paste_id. TTL equals the paste's expiration so content auto-evicts correctly without explicit invalidation.
- Read Replica: Async replica of the PostgreSQL primary. Serves cache-miss reads so the primary is never touched by a GET request.
Request walkthrough (read path):
- Client sends a GET request to /pastes/:paste_id.
- App server checks Redis for paste_id.
- Cache hit: return content directly in under 1ms.
- Cache miss: query the read replica for the paste row.
- Check expires_at. If expired, return 410 Gone.
- Write content into Redis with TTL equal to expires_at - NOW().
- Return content to the client.
At a 90%+ cache hit rate, the read replica handles fewer than 12 reads/second in steady state. The cache absorbs everything else.
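The read-path walkthrough above can be sketched as a cache-aside lookup. Dicts stand in for Redis and the read replica; status codes follow the API section (404 for "never existed", 410 for "expired").

```python
import time

db = {}     # paste_id -> {"content": ..., "expires_at": ...}; stands in for the read replica
cache = {}  # paste_id -> (content, expires_at); stands in for Redis

def get_paste(paste_id, now=None):
    """Cache-aside read. Returns (status, content): 200 found, 404 unknown, 410 expired."""
    now = time.time() if now is None else now
    hit = cache.get(paste_id)
    if hit is not None and (hit[1] is None or hit[1] > now):
        return 200, hit[0]                     # cache hit: never touches the database
    row = db.get(paste_id)
    if row is None:
        return 404, None                       # never existed
    if row["expires_at"] is not None and row["expires_at"] <= now:
        return 410, None                       # existed, but past expires_at
    cache[paste_id] = (row["content"], row["expires_at"])  # seed cache; TTL = expires_at - now
    return 200, row["content"]
```

Note the cache entry carries the paste's own expiry, so stale content self-evicts without explicit invalidation, exactly as the Redis TTL does in the real system.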
3. Pastes expire
Expiration has two layers: an inline check to block stale reads immediately, and a background worker for periodic hard deletes.
The inline check is already present in the read path above: every GET checks expires_at before returning content, returning 410 Gone if expired. Redis TTL handles cache-side eviction automatically when the TTL reaches zero. Together, these two mechanisms ensure a client can never receive expired content.
The background cleanup worker removes expired rows from the database so storage does not grow without bound. Run it as a scheduled task every 60 seconds. It does a batched DELETE: query for all pastes with expires_at < NOW() and delete them in batches of 500 rows per iteration.
Component added:
- Expiry Worker: Scheduled job running every 60 seconds. Batch-deletes expired rows from PostgreSQL. Runs on a dedicated connection pool so it does not compete with application traffic.
The Expiry Worker is deliberately simple. If it falls behind during a heavy expiration burst, increase the batch size or raise the run frequency. Expired content is already blocked at the read layer, so falling behind on cleanup is a storage efficiency problem, not a correctness problem.
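The Expiry Worker's batched delete can be sketched as follows. sqlite3 stands in for PostgreSQL so the loop is runnable here; in Postgres the inner query would select ctid instead of rowid, and the schema is illustrative.

```python
import sqlite3

BATCH_SIZE = 500  # rows deleted per iteration, per the HLD

def purge_expired(conn, now):
    """Delete expired pastes in batches of BATCH_SIZE; return total rows removed."""
    total = 0
    while True:
        cur = conn.execute(
            "DELETE FROM pastes WHERE rowid IN ("
            " SELECT rowid FROM pastes"
            " WHERE expires_at IS NOT NULL AND expires_at < ?"
            " LIMIT ?)",
            (now, BATCH_SIZE),
        )
        conn.commit()
        total += cur.rowcount
        if cur.rowcount < BATCH_SIZE:   # last, partial batch: done
            return total
```

Committing after each batch keeps individual transactions small, so the worker never holds long locks against application traffic.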
Potential Deep Dives
Pastebin is an easy-difficulty question, so we cover two deep dives: ID generation and large paste storage with CDN delivery.
1. How do we generate unique paste IDs?
Every paste needs a short, URL-safe ID. The constraints are strict: IDs must never collide globally, must be generated without a bottleneck serializing all writes, and should be short enough to fit cleanly in a URL (6-8 characters). I often see candidates jump straight to UUIDs here, but a full UUID is 36 characters, far too long for a paste URL and wasteful in database indexes.
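As a baseline for this deep dive, here is one common way to turn a monotonically increasing counter value into a short URL-safe ID: base62 encoding. The alphabet ordering (digits, then lowercase, then uppercase) is an arbitrary but fixed choice.

```python
import string

# 62 URL-safe characters; ordering is arbitrary but must stay fixed forever.
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase

def encode_base62(n):
    """Encode a non-negative integer as a base62 string."""
    if n == 0:
        return ALPHABET[0]
    out = []
    while n > 0:
        n, rem = divmod(n, 62)
        out.append(ALPHABET[rem])
    return "".join(reversed(out))  # most significant digit first
```

Any counter value below 62^6 (about 56.8 billion) encodes to at most 6 characters, which is why a 6-character code is sufficient at this system's write rate.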
2. How do we handle large paste sizes and CDN delivery?
The average paste is 10KB, but real workloads include full log files, configuration dumps, and threaded stack traces that regularly hit 1-10MB. Storing all content inline in PostgreSQL creates backup and serving problems. I'd bring this up in an interview proactively: "What happens when a user pastes a 5MB server log?" It shows you think about real-world data distributions, not just averages.
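The metadata/content split this deep dive leads to can be sketched as follows: the database keeps a small pointer row and the blob lives in object storage. Dicts stand in for PostgreSQL and GCS, and the object key layout is an assumption.

```python
metadata_db = {}   # paste_id -> ~200-byte metadata row; stands in for PostgreSQL
object_store = {}  # object key -> content blob; stands in for GCS

def store_paste(paste_id, content, expires_at=None):
    """Write content to object storage; keep only a pointer row in the database."""
    key = f"pastes/{paste_id}"            # key layout is an assumption
    object_store[key] = content
    metadata_db[paste_id] = {
        "paste_id": paste_id,
        "blob_key": key,
        "size_bytes": len(content.encode()),
        "expires_at": expires_at,
    }

def load_paste(paste_id):
    """Resolve the metadata row, then fetch content from object storage."""
    row = metadata_db.get(paste_id)
    if row is None:
        return None
    return object_store[row["blob_key"]]
```

With this split, a 5MB log file never touches the database write path, and the CDN can serve the object directly by its key.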
Final Architecture
The central insight is the layer split: PostgreSQL holds only 200-byte metadata rows, GCS holds the content, and the CDN absorbs read traffic before it ever reaches your servers. The 100:1 read/write ratio is handled by the CDN and Redis operating as a two-tier absorber, leaving the database to do what it does best: small, indexed ACID writes and metadata lookups.
Interview Cheat Sheet
- Lock down 3-4 core features first, then explicitly name what is out of scope (full-text search, collaborative editing) so the interviewer sees deliberate scoping.
- State the 100:1 read/write ratio within the first few minutes. It explains every caching and scaling decision you make for the rest of the interview.
- The hardest sub-problem is unique ID generation: use an atomic Redis counter with base62 encoding, not random strings or content hashes.
- A 6-character base62 code covers 62^6 ≈ 56.8 billion values, enough for more than 1,500 years at 100K pastes per day.
- Redis INCR is atomic: no distributed lock, no collision, no retry logic needed in the write path.
- Use INCRBY 1000 (counter batching) under load to reduce Redis round-trips by 1,000x. Each App Server instance holds a local batch until exhausted.
- Seed the Redis cache on write so the first read of any paste is a cache hit, not a database round-trip.
- Expiration has two layers: an inline expires_at check on every read for immediate enforcement, plus a background worker for periodic hard deletes so storage does not grow without bound.
- Return 410 Gone for expired pastes, not 404. The distinction tells operators whether a paste never existed (404) or was removed after a TTL (410).
- Store paste content in object storage (GCS or S3) and only metadata in PostgreSQL. At 1TB over 3 years, keeping content inline in the database creates multi-hour backup windows and blocks CDN delivery.
- A CDN (CloudFront or Fastly) in front of GCS serves content globally under 50ms p99 without touching your application servers on subsequent requests.
- Handle anonymous vs. registered users with a Redis rate-limit counter keyed by IP for anonymous clients and by user_id for registered ones, using a sliding window to prevent paste flooding.
- For private pastes: generate S3 presigned URLs with a 5-minute expiry instead of using the CDN, so access control is enforced on every request without changing CDN configuration.
- The Expiry Worker runs every 60 seconds and deletes in 500-row batches from PostgreSQL only. Use an S3 lifecycle rule to auto-delete content objects so the worker never touches GCS directly.
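The counter-batching bullet above can be sketched as follows. fake_incrby is a stand-in for Redis INCRBY so the example runs standalone; IDs come back as raw integers, with base62 encoding applied on top before building the URL.

```python
import itertools

BLOCK_SIZE = 1000              # matches the INCRBY 1000 bullet
_counter = itertools.count(1)  # pretend-Redis shared counter

def fake_incrby(amount):
    """Stand-in for Redis INCRBY: advance the shared counter, return its new value."""
    value = 0
    for _ in range(amount):
        value = next(_counter)
    return value

class IdAllocator:
    """Per-app-server allocator: one round-trip claims a block of 1,000 IDs."""
    def __init__(self):
        self._next = 1
        self._end = 0
    def next_id(self):
        if self._next > self._end:               # local block exhausted
            self._end = fake_incrby(BLOCK_SIZE)  # single atomic round-trip
            self._next = self._end - BLOCK_SIZE + 1
        n = self._next
        self._next += 1
        return n  # raw integer; base62-encode before building the URL
```

Each app server instance burns one Redis round-trip per 1,000 pastes, and blocks claimed by different instances never overlap because INCRBY is atomic.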