Ticketmaster
Walk through a complete Ticketmaster-style ticket booking system, from a basic event/seat model to a strongly consistent reservation system that handles flash-sale traffic spikes and prevents double-booking.
What is Ticketmaster?
Ticketmaster sells tickets to concerts, sports games, and theater. The system looks like a simple shopping platform until 500,000 users simultaneously hit "Buy" the moment Taylor Swift tickets go on sale.
Every seat is a unique, finite resource that must be sold to exactly one buyer. That constraint forces the design into ACID transactions, distributed locking, and queue-based traffic shaping. I've seen teams underestimate this: they build a nice e-commerce checkout and then discover on launch day that 200 people bought the same seat. This question tests pessimistic versus optimistic concurrency control, flash-sale architecture, and the boundary between strong and eventual consistency.
Functional Requirements
Core Requirements
- Event organizers can create events with a seat map and price tiers.
- Users can browse available events and view the seat map with real-time availability.
- Users can reserve one or more seats, hold them for a 10-minute checkout window, and complete the purchase.
- If a user abandons checkout, held seats are automatically released back to available.
Below the Line (out of scope)
- Ticket transfer and secondary market resale
- Mobile ticket delivery and NFC check-in
- Venue and event organizer analytics
- Dynamic pricing (surge pricing based on demand)
Ticket transfer and secondary market resale involves a second transaction lifecycle: a resale listing, a buyer payment, and a transfer of ownership that invalidates the original booking and creates a new one atomically. To add it, I would model a Resale entity linked to an existing Booking and handle ownership transfer in a separate service isolated from the primary booking flow.
Mobile ticket delivery and NFC check-in is a read-only verification flow with no write contention on seats. Each confirmed Booking generates a signed QR code (an HMAC of booking_id, event_id, and seat_id), stored as a JWT in the booking record. The check-in scanner verifies the signature against the system's secret key without hitting the booking database on every scan.
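The signed QR idea above can be sketched in a few lines. This is a minimal illustration, not a production token format: `SECRET_KEY` and the dot-delimited payload layout are assumptions, and a real system would use a standard JWT library with expiry claims.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-secret"  # assumption: shared secret held by the ticketing system and scanners

def sign_ticket(booking_id: str, event_id: str, seat_id: str) -> str:
    """Build a QR payload: the ticket fields plus an HMAC-SHA256 signature."""
    payload = f"{booking_id}.{event_id}.{seat_id}"
    sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}.{sig}"

def verify_ticket(token: str) -> bool:
    """Scanner-side check: recompute the HMAC locally; no database lookup needed."""
    payload, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(sig, expected)

token = sign_ticket("bk_def", "ev_abc123", "s_1A")
assert verify_ticket(token)                                   # genuine ticket passes
tampered = token[:-1] + ("0" if token[-1] != "0" else "1")
assert not verify_ticket(tampered)                            # altered ticket fails
```

Because verification is pure computation against the shared secret, check-in scanners stay off the booking database entirely, exactly as the flow above requires.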
Venue and organizer analytics is a read-only, eventually consistent reporting pipeline that sits beside the booking system. The integration point is a Kafka topic where every booking event is published; an analytics consumer writes into a data warehouse for reporting without touching the booking critical path.
Dynamic pricing requires a pricing engine with write access to the Seat.price field that runs a separate real-time demand model. The integration is a background pricing service that subscribes to booking events, computes demand signals, and updates price tiers on remaining seats outside the synchronous booking path.
The hardest part in scope: preventing two users from booking the same seat simultaneously when 500,000 concurrent users hit the booking endpoint. Every concurrency decision in this design is a direct response to this constraint.
Non-Functional Requirements
Core Requirements
- Scale: 50 million active users; 100 events active concurrently; peak events with up to 100,000 seats.
- Concurrency: Up to 500,000 simultaneous users at the booking endpoint for a single high-demand event at sale time.
- Latency: Seat reservation confirmation under 200ms p99.
- Consistency: Strong consistency for seat booking. A seat must never be sold twice.
- Availability: 99.99% uptime for the booking path (roughly 52 minutes of downtime per year).
- Hold window: Seats held for exactly 10 minutes; automatically released on expiry.
Below the Line
- Geo-replication across multiple regions (single-region is sufficient for this design)
- Real-time fraud detection during checkout (a separate async scoring pipeline)
Read/write ratio: Browsing the event catalog and viewing seat maps accounts for roughly 100 reads for every 1 booking write. The booking writes are the critical path. A slow seat-map read shows stale data; a failed or duplicated booking write loses revenue and violates correctness. This ratio tells you where to invest caching: the seat map read path can absorb a generous cache TTL, while the booking write path must bypass cache entirely and go to the source of truth.
Core Entities
- Event: A scheduled performance at a venue. Holds name, date, venue reference, and sale open time.
- SeatMap: The physical layout of a venue for an event. Contains sections, rows, and seat identifiers.
- Seat: An individual seat tied to an event. Tracks status (available, held, booked), price tier, and the current reservation holding it.
- Reservation: A 10-minute hold on one or more seats. Created when checkout begins. Expires and auto-releases on timeout.
- Booking: A confirmed, paid purchase of reserved seats. The permanent record of a completed transaction.
- User: An account that can browse events, hold seats, and complete purchases.
Schema details such as the expires_at index and seat status enum are addressed in the Interview Cheat Sheet. These six entities are enough to drive the API and High-Level Design.
API Design
FR 1: Create an event with a seat map:
# Organizer creates a new event; server expands seat map template into seat rows
POST /v1/events
Body: { name, venue_id, date, seat_map: { sections: [...] }, price_tiers: [...] }
Response: { event_id: "ev_abc123" }
POST creates a new resource. The seat_map in the request body describes sections and seat counts; the server generates individual Seat rows from this template inside a single transaction. Price tiers link to sections, not individual seats, keeping the schema manageable for 100,000-seat venues. I'd call this out early in the interview because interviewers sometimes expect you to model price per seat, which explodes the schema for large venues.
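The template-to-rows expansion can be sketched as follows. This is a hypothetical illustration under assumed field names (`rows`, `seat_count`, `price_tier` are not specified in the API above); the real request shape would be whatever the organizer dashboard sends.

```python
# Hypothetical sketch: expand a seat map template into individual Seat rows,
# one per physical seat, all starting as 'available'. Field names are assumed.
def expand_seat_map(event_id: str, seat_map: dict) -> list[dict]:
    seats = []
    for section in seat_map["sections"]:
        for row in section["rows"]:
            for number in range(1, row["seat_count"] + 1):
                seats.append({
                    "seat_id": f"s_{row['label']}{number}",
                    "event_id": event_id,
                    "section": section["name"],
                    "row": row["label"],
                    "number": number,
                    "status": "available",          # every seat starts available
                    "price_tier": section["price_tier"],  # tier on the section, not the seat
                })
    return seats

template = {"sections": [{"name": "Floor", "price_tier": "P1",
                          "rows": [{"label": "A", "seat_count": 2},
                                   {"label": "B", "seat_count": 2}]}]}
rows = expand_seat_map("ev_abc123", template)
assert len(rows) == 4 and all(s["status"] == "available" for s in rows)
```

Note how the price tier lives on the section, so a 100,000-seat venue carries only a handful of tier records rather than 100,000 per-seat prices.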
FR 2: View seat availability:
# Returns per-seat status for a given event; served from Redis on cache hit
GET /v1/events/{event_id}/seats
Response: {
event_id: "ev_abc123",
seats: [
{ seat_id: "s_1A", section: "Floor", row: "A", number: 1, status: "available", price: 150 },
{ seat_id: "s_1B", section: "Floor", row: "A", number: 2, status: "held" }
]
}
This is the hottest read endpoint: 100 reads for every 1 booking write. Responses are served from Redis; only cache misses hit PostgreSQL. Seat status changes (hold, release, book) write through to Redis immediately to keep the map accurate.
FR 3: Reserve seats (start checkout):
# Atomically holds requested seats; returns reservation with 10-minute expiry
POST /v1/events/{event_id}/reservations
Body: { seat_ids: ["s_1A", "s_1B"], user_id: "u_789" }
Response: { reservation_id: "res_xyz", expires_at: "2026-03-29T12:10:00Z", seats: [...] }
This is the hardest endpoint. Two requests for the same seat_id arriving simultaneously must result in exactly one succeeding (201 Created) and one failing (409 Conflict). The atomic seat-locking mechanism is the subject of Deep Dive 1.
FR 4: Confirm booking (complete checkout):
# Processes payment and converts the reservation into a confirmed booking
POST /v1/reservations/{reservation_id}/confirm
Body: { payment_method_id: "pm_abc" }
Response: { booking_id: "bk_def", seats: [...], total_amount: 300 }
The server validates the reservation has not expired, processes payment, then atomically transitions seats from "held" to "booked" and inserts a Booking record in one transaction. Payment failure leaves seats held until expiry; no manual rollback is needed.
Abandon checkout (release hold):
# Explicit release; expired reservations are also released by background worker
DELETE /v1/reservations/{reservation_id}
Response: 204 No Content
The client calls this on explicit cancel. Expired reservations are released automatically by a background expiry worker; this endpoint handles the explicit cancel path only.
High-Level Design
1. Event creation and seat storage
The foundation: an organizer creates an event; the server generates all seat records atomically and stores them in a relational database.
Components:
- Client (Organizer): Admin dashboard sending the event creation request.
- Event Service: Validates the event data, expands the seat map template into individual seat rows, and writes everything in a single transaction.
- PostgreSQL: Stores the events, seat_maps, and seats tables. ACID transactions ensure the event and all seat rows are written together or not at all.
Request walkthrough:
- Organizer sends POST /v1/events with event metadata and a seat map template.
- Event Service opens a transaction: INSERT into events, then for each seat in the template, INSERT into seats with status = 'available'.
- Event Service commits and returns the event_id.
This creates the event and initializes every seat as available. I'd mention at the whiteboard that the INSERT of 100,000 seat rows is a one-time batch operation (seconds, not milliseconds), so it does not need the latency guarantees of the booking path. The next step covers how to serve this seat data to thousands of concurrent viewers without overloading the database.
2. Seat availability read path
The read path: a user views the seat map; the system serves per-seat availability from Redis rather than from the primary database.
With 100 events active and up to 500,000 concurrent viewers during a flash sale, sending every seat map request directly to PostgreSQL creates a stampede that overwhelms the database. I'd always build this path cache-first: PostgreSQL is the source of truth for writes, and Redis is the source of truth for reads. The write-through pattern on every seat status change keeps them synchronized.
Components (added):
- Booking Service: Stateless service handling both the read and write paths. Implements a cache-aside pattern for seat map reads.
- Redis (Seat Map Cache): Stores per-event seat data as a Redis Hash keyed by seats:{event_id}. Each field is a seat_id; each value is a JSON blob with status and price. Field-level HSET updates on status changes keep it current.
Request walkthrough:
- Client sends GET /v1/events/{event_id}/seats.
- Booking Service runs HGETALL seats:{event_id} on Redis.
- Cache hit: returns seat data directly (under 2ms).
- Cache miss: reads all seats from PostgreSQL, populates the Redis hash, returns the result.
- Seat status changes (during reservations and bookings) call HSET seats:{event_id} {seat_id} {new_status} to keep Redis synchronized.
This read path serves the high fan-out browsing traffic efficiently. The harder problem is what happens the moment two users click "Hold" on the same seat at the same time. That is the next section.
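The cache-aside read and write-through update can be illustrated with plain dictionaries standing in for Redis (`HGETALL`/`HSET`) and PostgreSQL. A minimal sketch under those stand-in assumptions:

```python
# In-memory simulation of the seat map read path. `redis_hash` stands in for
# the Redis hash seats:{event_id}; `db` stands in for PostgreSQL.
db = {"s_1A": "available", "s_1B": "available"}   # source of truth for writes
redis_hash: dict[str, str] = {}                    # starts empty (cold cache)

def get_seats() -> dict[str, str]:
    if redis_hash:                    # cache hit: HGETALL returns the hash
        return dict(redis_hash)
    redis_hash.update(db)             # cache miss: read DB, populate the hash
    return dict(redis_hash)

def set_seat_status(seat_id: str, status: str) -> None:
    db[seat_id] = status              # write to the source of truth...
    redis_hash[seat_id] = status      # ...then write through with HSET

assert get_seats()["s_1A"] == "available"   # first read populates the cache
set_seat_status("s_1A", "held")
assert get_seats()["s_1A"] == "held"        # write-through keeps the map current
```

The design choice worth voicing in an interview: the write path never invalidates the cache and forces a reload; it writes the single changed field, so 100,000 unaffected seats stay hot in Redis.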
3. Seat reservation and the double-booking problem
The reservation path: a user selects seats; the system atomically marks them as held for 10 minutes. Two concurrent users selecting the same seat must result in one success and one rejection.
The naive approach exposes a classic check-then-act race condition. Two requests for the same seat both read status = 'available', both decide to proceed, and both UPDATE the row to 'held'. The last write wins silently, and two reservation records now point at the same seat.
The fix is a database-level row lock: SELECT ... FOR UPDATE inside a transaction. The second concurrent transaction blocks at the lock until the first commits. When it unblocks, it re-reads the row, sees status = 'held', and returns 409 Conflict without any application-level coordination. This is the single most important design decision in the entire system, and I'd draw it on the whiteboard before anything else in the deep dives.
Components:
- Booking Service: Opens a transaction per reservation request and acquires the seat row lock, holding it through the status UPDATE and the reservation INSERT.
- Expiry Worker: Background job running every 30 seconds. Releases holds where reservation.expires_at < NOW() by setting seat status back to 'available' and syncing the Redis cache.
Request walkthrough:
- User A and User B both send POST /v1/events/ev1/reservations selecting seat s_1A at the same instant.
- Both requests reach the Booking Service simultaneously.
- Booking Service for User A begins a transaction: SELECT * FROM seats WHERE seat_id = 's_1A' FOR UPDATE.
- User A's transaction acquires the row lock. User B's identical query blocks, waiting for the lock.
- Booking Service validates the seat is 'available', UPDATEs it to 'held', INSERTs the reservation record, and COMMITs. Lock released.
- User B's transaction unblocks and re-reads the row: status = 'held'. Returns 409 Conflict immediately.
- Booking Service calls HSET seats:{event_id} s_1A held on Redis to sync the seat map cache.
Row-level locking prevents double-booking completely within a single write path. What it does not solve is serving 500,000 concurrent users without the booking service itself becoming the bottleneck. That is the next section.
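The one-winner property of the walkthrough above can be demonstrated in miniature, with a `threading.Lock` standing in for the PostgreSQL row lock that `SELECT ... FOR UPDATE` acquires. This is a toy model of the concurrency behavior, not the actual database mechanism:

```python
# Two concurrent "requests" race for the same seat. The mutex models the
# row lock: the loser blocks, then re-reads and sees the seat already held.
import threading

seat = {"status": "available"}
row_lock = threading.Lock()
results = []

def reserve(user: str) -> None:
    with row_lock:                       # SELECT ... FOR UPDATE: second caller blocks here
        if seat["status"] == "available":
            seat["status"] = "held"      # UPDATE seats SET status = 'held'
            results.append((user, "201 Created"))
        else:                            # re-read after unblocking sees 'held'
            results.append((user, "409 Conflict"))

threads = [threading.Thread(target=reserve, args=(u,)) for u in ("A", "B")]
for t in threads: t.start()
for t in threads: t.join()

# Exactly one success and one conflict, regardless of thread scheduling.
assert sorted(r for _, r in results) == ["201 Created", "409 Conflict"]
```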
4. Checkout confirmation and payment
The confirmation path: user submits payment; the system charges the card and atomically converts the reservation into a confirmed booking.
Partial failure is the key risk here. Payment can succeed while the database write fails, leaving a user charged but without a booking record. I've seen this exact failure mode in production: a timeout between the payment gateway and the app server left a customer charged with no booking confirmation, which required a manual refund and a very apologetic support email.
The safe pattern is: capture a payment_intent_id from Stripe first, then write the booking atomically. If the DB commit fails after a successful charge, a reconciliation job detects the orphaned intent and retries the write or issues a refund.
Components:
- Payment Service: Delegates to the external payment processor (Stripe). Returns a payment_intent_id.
- Booking Service: Re-validates the reservation has not expired, calls the Payment Service, then atomically updates seat status to 'booked' and inserts the Booking record in one transaction.
Request walkthrough:
- Client sends POST /v1/reservations/{reservation_id}/confirm with a payment method.
- Booking Service validates the reservation: not expired, user matches.
- Booking Service calls Payment Service to charge the card. Receives payment_intent_id.
- Booking Service opens a transaction: UPDATE seats to 'booked', INSERT Booking record with the payment_intent_id, DELETE the Reservation row, COMMIT.
- Booking Service calls HSET seats:{event_id} {seat_id} booked to sync Redis.
- Returns the booking confirmation to the client.
The four steps so far cover event creation, browsing, reservation, and purchase. FR4 (seat hold expiry) is addressed directly below.
5. Seat hold expiry and explicit cancellation
FR4: If a user abandons checkout, the held seat must be released automatically. Both the explicit cancellation path and the background expiry path must converge on the same state change.
There are two ways a reservation ends without completing checkout: the user explicitly clicks "Cancel" (DELETE request), or the user does nothing and the 10-minute window elapses silently. Both paths must set seat status back to 'available' and update the Redis cache.
Components:
- Booking Service: Handles the explicit DELETE path synchronously. Deletes the Reservation row, UPDATEs seat status to 'available', and calls HSET to sync Redis.
- Expiry Worker: Background job running every 30 seconds. Queries for reservations where expires_at < NOW() and applies the same status reset in bulk.
Request walkthrough (explicit cancellation):
- User sends DELETE /v1/reservations/{reservation_id}.
- Booking Service verifies the reservation belongs to the requesting user.
- Booking Service opens a transaction: DELETE the reservation row, UPDATE seat status to 'available'.
- Booking Service calls HSET seats:{event_id} {seat_id} available to sync Redis.
- If the seat had a Redis lock (lock:seat:{seat_id}), call DEL lock:seat:{seat_id} to release it.
Request walkthrough (background expiry):
- Expiry Worker runs every 30 seconds.
- Queries: SELECT reservation_id, seat_ids FROM reservations WHERE expires_at < NOW().
- For each expired reservation: UPDATE all seat rows to 'available', DELETE the reservation row, HSET each seat in Redis, DEL each Redis lock.
Both paths converge on the same outcome: seat is available again and the cache reflects it. I'd point out at the whiteboard that the Expiry Worker is idempotent by design: running it twice on the same expired reservation produces the same result, which means you can safely run multiple worker instances without coordination.
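The idempotency claim is worth making concrete. A minimal sketch of the worker's release step, with dicts standing in for the reservations and seats tables and a fixed `now` for determinism:

```python
# Idempotent expiry sweep: running it twice over the same expired reservation
# produces the same final state, so multiple worker instances are safe.
import time

now = time.time()
reservations = {"res_xyz": {"seat_ids": ["s_1A"], "expires_at": now - 5}}
seats = {"s_1A": "held"}

def run_expiry_worker() -> None:
    expired = [rid for rid, r in reservations.items() if r["expires_at"] < now]
    for rid in expired:
        for sid in reservations[rid]["seat_ids"]:
            seats[sid] = "available"   # UPDATE seats SET status = 'available'
        del reservations[rid]          # DELETE the reservation row

run_expiry_worker()
run_expiry_worker()                    # second run finds nothing: a no-op
assert seats == {"s_1A": "available"} and reservations == {}
```

Setting a seat to 'available' that is already 'available' and deleting a row that is already gone both converge on the same state, which is the whole idempotency argument.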
The deep dives resolve the three remaining hard questions: how to prevent double-booking at high concurrency, how to survive a Taylor Swift on-sale event, and how to keep the seat map accurate under heavy concurrent viewership.
Potential Deep Dives
1. How do you prevent double-booking at scale?
The double-booking problem has three layers: correctness at low concurrency, correctness at medium concurrency, and correctness under per-seat write contention when every user wants the same front-row seat. The database locking strategy is the most important implementation decision in the entire system.
2. How do you handle the traffic spike when a high-demand event goes on sale?
When Taylor Swift concert tickets go on sale at 10am, 500,000 users hit the booking endpoint within the first 30 seconds. A standard scaling approach (add more servers) does not work here: each seat can only be booked once, and flooding the booking service with concurrent transactions creates database lock contention that degrades all-user latency, not just the unsuccessful buyers.
3. How do you keep the seat map accurate for 10,000 concurrent viewers?
During a flash sale, tens of thousands of users watch the seat map update in near-real-time. A seat map that is 30 seconds stale causes users to click seats that are already taken, generating a wave of 409 errors that looks like bugs from the outside. Getting the accuracy right without overloading the database involves three levels of cache design.
Final Architecture
The most important architectural insight here: Redis sits in front of PostgreSQL on both the read path (seat map hash) and the write path (seat lock pre-filter and waiting room queue). PostgreSQL only handles transactions that have already passed the Redis pre-filters, keeping database connection count and lock contention within manageable bounds even at flash-sale peak traffic.
Interview Cheat Sheet
- Prevent double-booking: Use SELECT FOR UPDATE inside a PostgreSQL transaction to acquire a row-level lock on the seat row before writing. A Redis pre-filter (SET lock:seat:{id} NX EX 600) eliminates most contention before it touches the database. At most one transaction can hold a given seat at a time.
- Optimistic vs. pessimistic locking: Optimistic locking (version numbers) works for low-contention seats but creates retry storms when 500K users compete for 10 front-row seats. Use pessimistic locking (SELECT FOR UPDATE) for this workload. The trade-off is held database connections; the benefit is zero-retry correctness.
- Seat reservation hold model: Create a Reservation row with expires_at = NOW() + INTERVAL '10 minutes' on POST /v1/reservations. Mirror the hold in Redis as SET lock:seat:{id} NX EX 600. A background Expiry Worker runs every 30 seconds and releases any hold where expires_at < NOW() by updating seat status back to 'available' and syncing the cache.
- Flash-sale traffic spike: Place incoming booking requests in a Redis sorted set (a virtual waiting room, keyed by arrival timestamp). Admit 1,000 users per second via a Queue Controller that issues short-lived JWTs (5-minute TTL). The Booking Service only accepts requests with a valid admission token, limiting database write concurrency to the system's proven throughput.
- Seat map accuracy for concurrent viewers: Store the seat map as a Redis Hash (HSET seats:{event_id} seat_id status). Write through on every seat status change. Use Redis Pub/Sub to push changes to a WebSocket Gateway, which fans updates out to all connected viewers in real time. No TTL; the cache is never stale because every status change writes through immediately.
- Database choice (SQL over NoSQL): PostgreSQL wins this workload. Strong consistency and row-level SELECT FOR UPDATE locking are non-negotiable. DynamoDB transactions support atomic writes on up to 25 items but lack a row-level locking primitive that blocks a conflicting transaction at the database layer; you would reintroduce the race condition in application code.
- Partial payment failure: Collect a payment_intent_id from Stripe before writing the booking record. The database transaction atomically inserts the Booking row including the payment_intent_id. If the DB write fails after a successful charge, a reconciliation job detects the orphaned payment intent and either retries the booking write or issues a refund.
- Lock ordering for multi-seat reservations: Always sort seat_ids in ascending order before acquiring SELECT FOR UPDATE locks. Two transactions each holding one lock the other needs will deadlock if they acquire in different orders. Consistent ordering eliminates this class of deadlock without any coordination protocol.
- Hold expiry at scale: Index reservations.expires_at. The Expiry Worker query is UPDATE seats SET status = 'available' WHERE reservation_id IN (SELECT reservation_id FROM reservations WHERE expires_at < NOW()). Without the index, this scans the full reservations table every 30 seconds under load.
- Concurrent viewers without database overload: A single HGETALL for a 100,000-seat event returns roughly 10MB from Redis in under 5ms. Serving 10,000 concurrent seat map reads from Redis costs roughly 10 GB/s of Redis memory bandwidth, well within a Redis Cluster's capacity. PostgreSQL is entirely off the hot read path.
- Latency budget at 200ms p99: Redis pre-filter under 1ms. PostgreSQL SELECT FOR UPDATE plus UPDATE plus INSERT runs 10 to 50ms under normal load and up to 150ms at flash-sale peak. Payment service call: 30 to 100ms. Network: 5 to 20ms. The virtual waiting room absorbs excess load before it hits the database, keeping p99 within budget on admitted requests.
- Availability at 99.99%: The booking path requires PostgreSQL and Redis to each have replicas with automatic failover. Use PostgreSQL synchronous replication on the primary to avoid data loss on failover. Use Redis Sentinel or Redis Cluster for automatic Redis failover. The WebSocket Gateway and Booking Service are stateless and can be restarted without data loss.
- Static seat map on CDN: The venue seat map SVG is a static asset served from CDN with a long TTL. It never goes through the booking cluster on each page load. Only the per-seat availability overlay (the Redis hash data) is dynamic. Serving the SVG from the application cluster for 500,000 pre-sale visitors would saturate the cluster before the first ticket is sold.
- Sold-out signaling: When the last seat transitions to 'booked', set a sold_out:{event_id} key in Redis. The API Gateway checks this key before routing any new reservation request to the Booking Service and returns 410 Gone immediately for sold-out events, protecting the booking cluster from hopeless traffic.
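The lock-ordering rule from the cheat sheet can be demonstrated in miniature, with per-seat `threading.Lock` objects standing in for row locks. This toy model shows why sorting matters: both transactions acquire in the same canonical order, so lock waits always point one way and no cycle can form.

```python
# Deadlock-free multi-seat locking via canonical (sorted) acquisition order.
import threading

locks = {sid: threading.Lock() for sid in ("s_1A", "s_1B")}
booked: list[str] = []

def reserve_many(user: str, seat_ids: list[str]) -> None:
    ordered = sorted(seat_ids)            # the key step: always lock s_1A before s_1B
    for sid in ordered:
        locks[sid].acquire()
    try:
        booked.append(user)               # both row locks held: safe to write
    finally:
        for sid in reversed(ordered):
            locks[sid].release()

# The two requests name the seats in opposite orders; without the sort,
# each thread could grab one lock and wait forever on the other.
t1 = threading.Thread(target=reserve_many, args=("A", ["s_1A", "s_1B"]))
t2 = threading.Thread(target=reserve_many, args=("B", ["s_1B", "s_1A"]))
t1.start(); t2.start(); t1.join(); t2.join()
assert sorted(booked) == ["A", "B"]       # both complete; no deadlock
```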