Walk through a complete Uber design, from a single trip service to a globally distributed system handling 5M registered drivers (1M concurrently active), real-time GPS matching, and sub-5-second dispatch.
39 min read | 2026-03-28 | hard | system-design, ride-hailing, geospatial, real-time, websockets
Uber is a ride-hailing platform that connects riders who need a trip with nearby drivers. The apparent core is simple: request a ride, match a driver, complete the trip. The hard part is underneath.
The system must continuously track millions of driver GPS coordinates, answer "who is closest to this pickup?" in under 100ms, stream location updates between two strangers in near real-time, and do all of this for hundreds of thousands of concurrent trips without losing a single location update. I start every Uber design by separating the location write path (drivers broadcasting GPS) from the matching read path (riders triggering geospatial queries), because they have almost nothing in common and mixing them creates the worst bottlenecks at scale.
This question tests geospatial indexing, real-time streaming at scale, event-driven matching, and atomic concurrency control, making it one of the most concept-dense questions in the interview circuit.
Scheduled rides and ride types (Pool, Comfort, Black)
Driver onboarding and background checks
The hardest part in scope: geospatial matching. Every ride request triggers a query across millions of GPS coordinates to find the nearest available driver. The naive approach (SQL range query on lat/lng columns) collapses at scale. Efficient geospatial indexing is the central engineering problem this article solves.
Payments are below the line because the payment flow (charge, refund, driver payout) runs after trip completion and does not share any infrastructure with matching or location tracking. To add it, I would integrate a payment processor like Stripe and publish a TripCompletedEvent from the Trip Service to a dedicated payments Kafka topic. A Payments Service consumes this event, calculates the fare using trip distance and any surge multiplier, and executes the charge asynchronously.
Ratings and reviews are below the line because they are a separate write-after-trip flow that does not affect the hot paths. To add them, I would store ratings in a Postgres table keyed by (trip_id, rater_id) and compute rolling rating averages in a background job rather than inline.
Scheduled rides require a separate scheduling layer. To add them, I would store scheduled trip requests in a persistent job queue and release them into the normal matching flow shortly before the scheduled pickup time, reusing the matching infrastructure entirely.
Availability: 99.99% uptime. Availability over consistency for matching; a mildly stale driver location is acceptable, a failed match is not.
Match latency: Rider receives a driver assignment within 5 seconds of requesting a trip.
Location freshness: Driver locations are current to within 5 seconds at all times.
Location write throughput: Support 1M concurrently active drivers, each sending GPS updates every 4 seconds. That is 250K location writes per second at peak.
Scale: 5M registered drivers, 15M trips per day. Peak matching throughput of approximately 500 trip requests per second during surge.
Sub-second GPS update propagation to rider app during trip
Surge pricing computation (important but not part of functional core)
Read/write ratio: Location writes (driver GPS broadcasts) are the dominant workload: 250K writes per second at peak. Trip requests (matching reads) are orders of magnitude lower: ~500 per second peak. But each trip request triggers a geospatial query across 1M+ driver positions. The write volume shapes the location storage architecture; the read access pattern shapes the geospatial index. They pull in different directions, and that tension drives every major design decision in this article.
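The figures above can be checked with a few lines of arithmetic. This is only a back-of-envelope sketch: the constants come from the requirements, and the derived ratios (average request rate, peak-to-average, writes per request) are computed here rather than stated in the original.

```python
# Back-of-envelope check of the workload figures quoted in the requirements.
ACTIVE_DRIVERS = 1_000_000          # concurrently active drivers at peak
UPDATE_INTERVAL_S = 4               # seconds between GPS updates per driver
TRIPS_PER_DAY = 15_000_000
PEAK_TRIP_REQUESTS_PER_S = 500      # stated peak during surge
SECONDS_PER_DAY = 86_400

# Each active driver writes once per interval.
location_writes_per_s = ACTIVE_DRIVERS // UPDATE_INTERVAL_S

# Average request rate if trips were spread evenly across the day.
avg_trip_requests_per_s = TRIPS_PER_DAY / SECONDS_PER_DAY

# How far the stated peak sits above that average.
peak_to_avg = PEAK_TRIP_REQUESTS_PER_S / avg_trip_requests_per_s

# Location writes arriving for every trip request at peak.
writes_per_peak_request = location_writes_per_s / PEAK_TRIP_REQUESTS_PER_S

print(f"location writes/s:        {location_writes_per_s:,}")
print(f"average trip requests/s:  {avg_trip_requests_per_s:.0f}")
print(f"peak / average requests:  {peak_to_avg:.1f}x")
print(f"writes per peak request:  {writes_per_peak_request:.0f}")
```

At peak the location path absorbs about 500 writes for every trip request, which is why the two paths get separate storage and separate scaling decisions.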
I target a 5-second match time because the 4-second GPS update interval already bounds how fresh any driver position can be. A match that takes much longer than one update cycle means the system is stalling on matching logic, not waiting for fresh data. The sub-5-second target rules out any matching approach that requires multiple sequential round-trips to the database with no caching.
Driver: A registered driver with a current location (latitude, longitude), status (available, on_trip, offline), vehicle details, and a driver_id. The status field gates every matching query.
Rider: A registered user with a rider_id who can place trip requests.
Trip: A request-to-completion record with trip_id, rider_id, driver_id, pickup_location, dropoff_location, status (requested, accepted, in_progress, completed, cancelled), and created_at.
DriverLocation: The current GPS snapshot for a driver: driver_id, latitude, longitude, updated_at. Ephemeral; not a durable historical record.
The schema details (indexes, partition keys, TTLs) are deferred to the deep dives. These four entities are sufficient to drive the API design and High-Level Design.
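As a rough illustration, the four entities could be written as plain Python dataclasses. The field names follow the text; the enum classes, the `vehicle` string placeholder, and the `is_matchable` helper are assumptions added for the sketch, not part of the original design.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional, Tuple

LatLng = Tuple[float, float]  # (latitude, longitude)


class DriverStatus(str, Enum):
    AVAILABLE = "available"
    ON_TRIP = "on_trip"
    OFFLINE = "offline"


class TripStatus(str, Enum):
    REQUESTED = "requested"
    ACCEPTED = "accepted"
    IN_PROGRESS = "in_progress"
    COMPLETED = "completed"
    CANCELLED = "cancelled"


@dataclass
class Driver:
    driver_id: str
    latitude: float
    longitude: float
    status: DriverStatus
    vehicle: str  # placeholder: the text only says "vehicle details"


@dataclass
class Rider:
    rider_id: str


@dataclass
class Trip:
    trip_id: str
    rider_id: str
    pickup_location: LatLng
    dropoff_location: LatLng
    created_at: datetime
    status: TripStatus = TripStatus.REQUESTED
    driver_id: Optional[str] = None  # unset until a driver accepts


@dataclass
class DriverLocation:
    """Ephemeral GPS snapshot; overwritten on every update."""
    driver_id: str
    latitude: float
    longitude: float
    updated_at: datetime


def is_matchable(driver: Driver) -> bool:
    # Hypothetical helper: the status field gates every matching query.
    return driver.status is DriverStatus.AVAILABLE


trip = Trip("t1", "r1", (37.7749, -122.4194), (37.8044, -122.2712),
            created_at=datetime.now(timezone.utc))
print(trip.status.value, trip.driver_id)
```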
GET /trips/{trip_id}
Response: { trip_id, status, driver: { lat, lng, eta_seconds } }
Driver broadcasts location:
POST /drivers/location
Body: { latitude, longitude, status }
Response: 200 OK
Driver accepts a trip offer:
PUT /trips/{trip_id}/accept
Response: { trip_id, pickup_location, rider_name }
Driver updates trip status:
PUT /trips/{trip_id}/status
Body: { status: "in_progress" | "completed" | "cancelled" }
Response: 200 OK
Rider subscribes to real-time driver location:
GET /trips/{trip_id}/live
Upgrade: websocket
Server pushes: { driver_lat, driver_lng, timestamp_ms }
Connection closes when trip status reaches "completed" or "cancelled"
Why HTTP for location updates? At 250K location writes per second, persistent HTTP/2 connections amortize connection setup across many requests. A separate HTTP POST per update adds roughly 5ms of latency but keeps the driver app stateless: there is no long-lived WebSocket session to maintain on mobile networks that regularly drop and reconnect. The alternative (a persistent WebSocket from driver to server) reduces per-update overhead but complicates reconnection logic on unreliable mobile connections. I choose HTTP for the driver write path and WebSocket for the rider receive path, since riders need server-pushed updates without polling.
The write path: the rider submits a pickup/dropoff pair, the Trip Service creates a trip record with status requested, and immediately kicks off asynchronous driver matching. The rider receives a trip_id back without waiting for a driver to accept.
Components:
Rider App: Mobile client sending the trip request.
Trip Service: Validates the request, creates the trip record, and publishes a TripRequestedEvent for the matching pipeline to consume.
Trip DB: Stores the authoritative trip record. Status progresses from requested through accepted, in_progress, to completed.
Request walkthrough:
Rider app sends POST /trips with pickup and dropoff coordinates.
Trip Service validates the locations (latitude and longitude within valid ranges, and the pickup and dropoff points are reachable).
Trip Service inserts { trip_id, rider_id, pickup_location, dropoff_location, status: "requested", created_at } into Trip DB.
Trip Service records the trip request in the surge demand index: ZADD trip_requests:cell:{geohash5(pickup_lat, pickup_lng)} {timestamp_ms} {trip_id} on Redis (consumed by the Surge Worker in deep dive 4).
Trip Service publishes TripRequestedEvent { trip_id, pickup_lat, pickup_lng } to Kafka.
Trip Service returns { trip_id, status: "requested" } to the rider.
The matching step that consumes the Kafka event is deferred to requirement 3. For now the trip exists in the database, the rider has a trip_id, and the matching pipeline has the event it needs.
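The walkthrough above can be sketched end to end with in-memory stand-ins. Everything here is illustrative: `trip_db`, `surge_index`, and `kafka_topic` are hypothetical substitutes for Trip DB, the Redis sorted sets, and the Kafka topic, and `request_trip` is an assumed function name. The geohash encoder is the standard algorithm, included so the `geohash5` cell key is concrete. The reachability check is not implemented; only the range check is.

```python
import time
import uuid
from collections import defaultdict

_BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"


def geohash(lat: float, lng: float, precision: int = 5) -> str:
    """Standard geohash: interleave lng/lat bits (lng first), 5 bits per char."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars, bits, value, use_lng = [], 0, 0, True
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            bit = lng >= mid
            lng_lo, lng_hi = (mid, lng_hi) if bit else (lng_lo, mid)
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = lat >= mid
            lat_lo, lat_hi = (mid, lat_hi) if bit else (lat_lo, mid)
        value = (value << 1) | int(bit)
        use_lng = not use_lng
        bits += 1
        if bits == 5:
            chars.append(_BASE32[value])
            bits, value = 0, 0
    return "".join(chars)


def valid_coord(lat: float, lng: float) -> bool:
    return -90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0


# In-memory stand-ins (assumptions, not real clients).
trip_db = {}                      # Trip DB: trip_id -> record
surge_index = defaultdict(dict)   # Redis sorted sets: key -> {member: score}
kafka_topic = []                  # ordered log of published events


def request_trip(rider_id, pickup, dropoff, now_ms=None):
    """pickup and dropoff are (lat, lng) tuples."""
    if not (valid_coord(*pickup) and valid_coord(*dropoff)):
        raise ValueError("coordinates out of range")
    now_ms = int(time.time() * 1000) if now_ms is None else now_ms
    trip_id = str(uuid.uuid4())

    # Insert the authoritative trip record.
    trip_db[trip_id] = {
        "trip_id": trip_id, "rider_id": rider_id,
        "pickup_location": pickup, "dropoff_location": dropoff,
        "status": "requested", "created_at": now_ms,
    }
    # Equivalent of: ZADD trip_requests:cell:{geohash5} {timestamp_ms} {trip_id}
    cell_key = f"trip_requests:cell:{geohash(pickup[0], pickup[1], 5)}"
    surge_index[cell_key][trip_id] = now_ms

    # Publish the event the matching pipeline consumes.
    kafka_topic.append({"type": "TripRequestedEvent", "trip_id": trip_id,
                        "pickup_lat": pickup[0], "pickup_lng": pickup[1]})
    return {"trip_id": trip_id, "status": "requested"}
```

The function returns as soon as the event is appended, mirroring the point that the rider gets a `trip_id` without waiting for a driver to accept.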
The write path for driver location is entirely separate from the trip request path. I always pause here on the whiteboard and draw a clear vertical line between the two paths, because conflating them is the single most common mistake candidates make. Drivers send a GPS update every 4 seconds regardless of whether they are available, on a trip, or transitioning between states. These updates flow into two destinations: a geospatial index for matching queries, and a real-time channel for active-trip tracking.
Components:
Driver App: Mobile client sending periodic GPS updates.
Location Service: Receives driver location updates and writes them to the geospatial index for available-driver queries. When the driver is on an active trip, it publishes GPS positions to Redis Pub/Sub instead (covered in requirement 4).
Redis Geo (Location Store): A geospatially indexed Redis sorted set. Available drivers remain in this set until they accept a trip or go offline.
Request walkthrough:
Driver app sends POST /drivers/location with current lat/lng and status.
Location Service validates the coordinates and driver status.
If status = available: Location Service calls GEOADD drivers:available <lng> <lat> <driver_id> on Redis.
If status = on_trip or status = offline: Location Service calls ZREM drivers:available <driver_id> to remove from the geospatial index.
Location Service returns 200 OK.
This diagram covers only the geospatial index write path. Active-trip location streaming via Redis Pub/Sub is introduced in requirement 4. The Redis Geo index is the critical structure: it receives 250K writes per second at peak and must answer "find all available drivers within 5km of this point" in under 10ms. I have seen candidates propose Postgres with PostGIS here, and the interviewer's follow-up is always "what happens at 250K writes per second?" Having Redis Geo ready as the answer skips that entire trap. The deep dives address how this scales and what geospatial indexing strategy actually underpins it.
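To make the semantics of the index concrete, here is a toy stand-in for the `drivers:available` key. It is a sketch only: the `AvailableDrivers` class and `handle_location_update` function are hypothetical names, and the search is a linear haversine scan, which is nothing like how Redis indexes points internally. It only mirrors what GEOADD, ZREM, and a radius search return.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance between two points in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


class AvailableDrivers:
    """Toy stand-in for the drivers:available key (O(n) scan, illustration only)."""

    def __init__(self):
        self._pos = {}  # driver_id -> (lat, lng)

    def geoadd(self, lng, lat, driver_id):
        # Argument order mirrors Redis GEOADD: longitude first.
        self._pos[driver_id] = (lat, lng)

    def zrem(self, driver_id):
        self._pos.pop(driver_id, None)

    def geosearch(self, lng, lat, radius_km, count):
        # Nearest-first list of driver ids within the radius.
        hits = [(haversine_km(lat, lng, dlat, dlng), did)
                for did, (dlat, dlng) in self._pos.items()]
        hits = sorted(h for h in hits if h[0] <= radius_km)
        return [did for _, did in hits[:count]]


def handle_location_update(index, driver_id, lat, lng, status):
    """Location Service write path: index available drivers, drop the rest."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lng <= 180.0):
        raise ValueError("coordinates out of range")
    if status == "available":
        index.geoadd(lng, lat, driver_id)
    elif status in ("on_trip", "offline"):
        index.zrem(driver_id)
    else:
        raise ValueError(f"unknown status: {status}")
    return 200


index = AvailableDrivers()
handle_location_update(index, "d1", 37.7749, -122.4194, "available")  # downtown SF
handle_location_update(index, "d2", 37.7849, -122.4094, "available")  # about 1.4 km away
handle_location_update(index, "d3", 37.3382, -121.8863, "available")  # San Jose
print(index.geosearch(-122.4194, 37.7749, radius_km=5, count=5))
```

Because every update for an available driver overwrites the previous position, the structure holds one entry per driver rather than a history, which matches the ephemeral DriverLocation entity.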
Matching consumes the TripRequestedEvent from Kafka, queries the Redis Geo index for nearby drivers, and assigns the trip to the first driver who accepts. The entire flow is asynchronous from the rider's perspective: the rider polls (or receives a WebSocket push) to learn when a driver is assigned.
Components:
Match Worker: Kafka consumer that processes TripRequestedEvent messages. Queries Redis Geo and dispatches offers to candidate drivers.
Redis Geo (from requirement 2): Answers geospatial proximity queries for available drivers.
Notification Service: Pushes trip offers to specific driver apps (via APNs/FCM push notification or driver WebSocket connection).
Trip Service (updated): Handles PUT /trips/{trip_id}/accept from drivers, atomically assigns the trip, and updates Trip DB.
Request walkthrough:
Match Worker consumes TripRequestedEvent { trip_id, pickup_lat, pickup_lng } from Kafka.
Match Worker calls GEOSEARCH drivers:available FROMLONLAT <pickup_lng> <pickup_lat> BYRADIUS 5 km ASC COUNT 5 on Redis.
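Match Worker marks each returned candidate (up to 5) as pending_offer with an atomic Redis SET that only succeeds if the key is absent (NX) and expires after 30 seconds, so a driver cannot hold two offers at once.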
Notification Service pushes a trip offer to each of the 5 candidate drivers simultaneously.
First driver app sends PUT /trips/{trip_id}/accept.
Trip Service executes a compare-and-swap update: UPDATE trips SET status='accepted', driver_id=? WHERE trip_id=? AND status='requested'. An affected-row count of 1 means this driver won the race; a count of 0 means another driver had already accepted and this request is rejected.
Trip Service removes the assigned driver from drivers:available via ZREM.
Trip Service pushes a "driver assigned" notification to the rider.
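The compare-and-swap step is the part of this flow that decides the race, and it can be demonstrated with SQLite from the standard library. This is a minimal sketch, assuming a simplified `trips` table with only three columns; `accept_trip` is a hypothetical function name, and a production Trip DB would differ in schema and driver. The UPDATE statement is the one from the walkthrough.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trips (trip_id TEXT PRIMARY KEY, status TEXT NOT NULL, driver_id TEXT)"
)
conn.execute("INSERT INTO trips (trip_id, status) VALUES ('t1', 'requested')")
conn.commit()


def accept_trip(conn, trip_id, driver_id):
    """Return True only for the driver whose UPDATE actually changed the row."""
    cur = conn.execute(
        "UPDATE trips SET status = 'accepted', driver_id = ? "
        "WHERE trip_id = ? AND status = 'requested'",
        (driver_id, trip_id),
    )
    conn.commit()
    # rowcount is 1 if the trip was still 'requested', 0 if someone else won.
    return cur.rowcount == 1


# Three of the offered drivers tap "accept"; only the first UPDATE matches.
results = {d: accept_trip(conn, "t1", d) for d in ["d1", "d2", "d3"]}
print(results)
print(conn.execute("SELECT status, driver_id FROM trips WHERE trip_id = 't1'").fetchone())
```

The WHERE clause carries the expected old state, so no separate read or application-level lock is needed: the database serializes the competing updates and exactly one of them sees status = 'requested'.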