Marketplace
Walk through a complete marketplace design, from a basic listing service to a geospatial-aware search platform handling 100M DAU with sub-200ms search, location-based discovery, and real-time seller-buyer messaging.
What is an online marketplace?
An online marketplace, like Craigslist or Facebook Marketplace, connects sellers with buyers nearby. The interesting engineering challenge is combining geospatial search, full-text search, and real-time messaging into a single coherent system that stays fast when listings number in the hundreds of millions.
Functional Requirements
Core Requirements
- Sellers can create listings with title, description, price, photos, and location.
- Buyers can browse and search listings filtered by category, price range, and proximity to their location.
- Buyers can message sellers directly about a specific listing.
- Sellers can mark a listing as sold, hiding it from search results.
Below the Line (out of scope)
- Integrated payments and escrow
- Buyer and seller reviews and reputation scores
- Promoted or sponsored listing placement
- Dispute resolution and fraud detection
The hardest part in scope: Geospatial search combined with full-text filtering. A buyer types "vintage guitar" and wants results sorted by distance, not just text relevance. That combination of geo and text is where the design gets interesting, and where naive SQL queries fall apart at scale.
Integrated payments are below the line because they require a licensed payment processor, escrow logic, and regulatory compliance. To add them, I would integrate Stripe Connect for marketplace payments and build a separate escrow service that holds funds until the buyer confirms receipt.
Reputation scores are below the line because they require a review submission pipeline and fraud detection to prevent fake reviews. To add them, I would add a Review entity after a transaction closes and roll up scores asynchronously into seller profiles.
Promoted listings are below the line because they introduce an ad auction mechanism. To add them, I would build a thin ad-serving layer that injects sponsored results at fixed positions in the search response.
Non-Functional Requirements
Core Requirements
- Availability: 99.9% uptime. Availability over consistency for search (a slightly stale search result is acceptable; a failed search is not).
- Search latency: Search results return in under 200ms p99, including geo-filter and text-match scoring.
- Scale: 100M DAU, 500M total active listings. Peak write rate: ~2,000 new listings per second. Peak search rate: ~50,000 searches per second.
- Message delivery: Messages between buyer and seller delivered within 500ms.
- Durability: Listings and messages are never lost. Photos stored durably in object storage.
Below the Line
- Sub-50ms search via CDN-edge caching of popular query results
- Real-time sold status propagation across all active sessions
Read/write ratio: For every listing created, expect roughly 25 searches that scan that listing. This 25:1 read skew shapes the entire storage and caching strategy. The search path must be fast and horizontally scalable. The write path handles a tiny fraction of the traffic.
Under 200ms search latency means a naive SELECT * FROM listings WHERE ST_DWithin(location, ?, ?) against a 500M-row PostgreSQL table is not viable without spatial indexing. Even with a PostGIS GiST index, filtering 500M rows by geo-box and then by text is slow without a dedicated search engine. The 100M DAU target means the search tier must scale horizontally with no single bottleneck.
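To sanity-check that the stated peak rates hang together, here is a quick back-of-envelope calculation. The per-user search rate and the peak factor are my assumptions, not requirements:

```python
# Back-of-envelope check on the scale targets.
# Assumptions: ~5 searches per active user per day, peak traffic ~8x the daily average.
DAU = 100_000_000
SEARCHES_PER_USER_PER_DAY = 5   # assumed
PEAK_FACTOR = 8                 # assumed

searches_per_day = DAU * SEARCHES_PER_USER_PER_DAY   # 500M searches/day
avg_search_qps = searches_per_day / 86_400           # ~5,800/s average
peak_search_qps = avg_search_qps * PEAK_FACTOR       # ~46,000/s, close to the 50k target

peak_write_qps = 2_000                               # from the requirements
read_write_ratio = 50_000 / peak_write_qps           # the 25:1 skew cited below
```

The assumed inputs land within shouting distance of the stated 50,000 searches/s peak, which is all a back-of-envelope needs to do.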
Core Entities
- Listing: The core object. Carries title, description, price, category, status (active/sold), and a geographic coordinate (latitude + longitude). Belongs to exactly one seller.
- Photo: A binary asset attached to a listing. Stored in object storage (S3); the listing record stores only the photo URLs.
- User: The account that creates listings or sends messages. Carries an ID, display name, and an optional saved location for proximity defaults.
- Message: A single message in a conversation between a buyer and a seller about a specific listing. A conversation is implicitly defined by the (listing_id, buyer_id, seller_id) triple.
Full schema, indexes, and column types are deferred to the data model deep dive. These four entities are sufficient to drive the API design and High-Level Design.
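Because the Message entity has no explicit Conversation parent, the API still needs a stable conversation_id to return to clients. One way to derive it from the identifying triple, sketched here (the hashing scheme is illustrative, not part of the design):

```python
import hashlib

def conversation_id(listing_id: str, buyer_id: str, seller_id: str) -> str:
    """Derive a deterministic conversation ID from the identifying triple.

    Since a conversation is implicitly defined by (listing_id, buyer_id,
    seller_id), hashing the triple yields the same ID regardless of which
    side sends the first message -- no separate Conversation table needed.
    """
    key = f"{listing_id}:{buyer_id}:{seller_id}"
    return hashlib.sha256(key.encode()).hexdigest()[:16]
```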
API Design
Start with one endpoint per functional requirement, then evolve where the naive shape needs adjustment.
FR 1: Create a listing
POST /listings
Authorization: Bearer <token>
Body: {
title: string,
description: string,
price_cents: number,
category: string,
location: { lat: number, lng: number },
photo_ids: string[] // pre-uploaded to S3, see note below
}
Response: 201 { listing_id, status: "active" }
Photos are not included in this request body. Embedding binary files in JSON is inefficient and creates timeouts on large images. Instead, clients upload photos directly to S3 via pre-signed URLs (a separate POST /photos/upload-url endpoint returns a short-lived signed URL). Once uploaded, the client passes the resulting photo IDs to this endpoint.
FR 2: Search listings
GET /listings/search
Query: {
q?: string, // full-text query (e.g. "vintage guitar")
lat: number,
lng: number,
radius_km: number, // defaults to 25km
category?: string,
min_price?: number,
max_price?: number,
cursor?: string, // for cursor-based pagination
limit?: number // default 20
}
Response: 200 {
listings: [Listing],
next_cursor: string | null
}
Cursor-based pagination over offset pagination because search results shift as new listings are posted. Offset pagination would show duplicates or skip items; a cursor anchors the result window to a stable position.
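A cursor can be as simple as a base64-encoded (sort key, listing ID) pair, which maps directly onto Elasticsearch's search_after semantics. A minimal sketch (the token format is an assumption; clients should treat the cursor as opaque):

```python
import base64
import json

def encode_cursor(sort_key: float, listing_id: str) -> str:
    """Pack the last result's sort key and ID into an opaque cursor token."""
    raw = json.dumps({"k": sort_key, "id": listing_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor: str) -> tuple[float, str]:
    """Recover the (sort key, listing ID) anchor for the next page query."""
    data = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return data["k"], data["id"]
```

The listing ID acts as a tiebreaker when two results share the same sort key, so pages never overlap even as new listings arrive.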
FR 3: Send a message to a seller
POST /listings/{listing_id}/messages
Authorization: Bearer <token>
Body: { text: string }
Response: 201 { message_id, conversation_id, sent_at }
The server derives seller_id from the listing, and buyer_id from the auth token. No need to pass either in the body.
FR 4: Get conversation messages
GET /listings/{listing_id}/messages
Authorization: Bearer <token>
Query: { cursor?: string, limit?: number }
Response: 200 {
messages: [Message],
next_cursor: string | null
}
FR 5: Mark a listing as sold
PATCH /listings/{listing_id}
Authorization: Bearer <token>
Body: { status: "sold" }
Response: 200 { listing_id, status: "sold" }
PATCH rather than a dedicated /listings/{id}/sold endpoint because status is a field on the listing. PATCH is idiomatic for partial updates. The server must validate that only the listing owner can change status.
High-Level Design
1. Sellers can create a listing with photos and location
The write path: seller uploads photos to S3, then submits listing metadata to the Listing Service, which writes to the database.
Components:
- Client: Web or mobile app. Fetches a pre-signed S3 URL, uploads photos directly to S3, then POSTs listing metadata to the API.
- API Gateway: Routes requests, handles auth token validation, and enforces rate limits to prevent listing spam.
- Listing Service: Validates the listing fields, persists the record to PostgreSQL, and publishes a listing.created event to a message queue for async downstream processing (search indexing).
- PostgreSQL: Stores listing records with geographic coordinates as a PostGIS GEOGRAPHY column. This is the source of truth.
- S3: Stores raw photo bytes. The Listing Service stores only the photo URLs in PostgreSQL.
Request walkthrough:
- Client calls POST /photos/upload-url and receives a pre-signed S3 URL (valid for 10 minutes).
- Client uploads the photo directly to S3 using the pre-signed URL. S3 returns the photo URL.
- Client calls POST /listings with metadata including the photo URLs.
- Listing Service validates all fields and writes the listing row to PostgreSQL.
- Listing Service publishes a listing.created event to Kafka for downstream processing.
- Listing Service returns 201 { listing_id, status: "active" } to the client.
This is the write path only. Photo URLs live in S3; the database stores only references. The Kafka event seeds the search index asynchronously (next section).
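The write-path logic of the Listing Service can be sketched with in-memory stand-ins for PostgreSQL and Kafka (the validation rules and field names here are my assumptions, kept consistent with the API design above):

```python
import uuid

class ListingService:
    """Sketch of the FR1 write path. self.db stands in for the PostgreSQL
    listings table; self.events stands in for the listing.* Kafka topic."""

    REQUIRED = ("title", "price_cents", "category", "location", "photo_ids")

    def __init__(self):
        self.db = {}       # listing_id -> row (source of truth)
        self.events = []   # (event_type, payload) pairs, published in order

    def create_listing(self, seller_id: str, body: dict) -> dict:
        missing = [f for f in self.REQUIRED if f not in body]
        if missing or body.get("price_cents", 0) < 0:
            raise ValueError(f"invalid listing: missing={missing}")
        listing_id = str(uuid.uuid4())
        row = {**body, "id": listing_id, "seller_id": seller_id, "status": "active"}
        self.db[listing_id] = row                      # 1. persist first
        self.events.append(("listing.created", row))   # 2. then publish for async indexing
        return {"listing_id": listing_id, "status": "active"}
```

Note the ordering: the row is committed before the event is published, so the search index can only ever lag the source of truth, never lead it.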
2. Buyers can browse and search by category, price, and location
This is where the interesting complexity lives. Buyers need two query patterns: structured browse (category + price filter, no text) and free-form search (text query + geo filter). These are different enough that a single naive SQL approach breaks for both.
The naive approach: SELECT * FROM listings WHERE category = ? AND price BETWEEN ? AND ? AND title ILIKE '%' || ? || '%' AND ST_DWithin(location, ST_MakePoint(?, ?)::geography, ?).
This breaks at 500M rows even with a PostGIS index. PostGIS can answer the geo-filter efficiently, but the ILIKE clause cannot use an ordinary index, so combining text matching with the geo filter saturates the database when 50,000 queries per second hit the same node.
The fix: Route search queries through a dedicated Elasticsearch cluster. Elasticsearch handles both geo_distance filtering and BM25 full-text scoring natively in one query, scales horizontally by adding shards, and keeps the PostgreSQL primary reserved for writes.
The Kafka consumer (Search Indexer) from the previous section consumes listing.created / listing.updated events and upserts documents into the Elasticsearch index. Eventual consistency here is acceptable: a listing appearing in search 1-2 seconds after creation is not a user-visible problem.
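The indexer's core is a pure transform from event payload to search document. A sketch, with field names that are my assumptions (title, price, category, status, geo-point, thumbnail, seller display name):

```python
def to_search_document(event_type: str, listing: dict) -> dict:
    """Map a listing.* event payload to the denormalized Elasticsearch
    document the Search Indexer upserts, keyed by listing ID."""
    return {
        "_id": listing["id"],                  # upsert key: one document per listing
        "title": listing["title"],
        "description": listing.get("description", ""),
        "price_cents": listing["price_cents"],
        "category": listing["category"],
        "status": "sold" if event_type == "listing.sold" else listing["status"],
        # Elasticsearch geo_point spells longitude "lon", not "lng"
        "location": {"lat": listing["location"]["lat"],
                     "lon": listing["location"]["lng"]},
        "seller_display_name": listing.get("seller_display_name", ""),
        "thumbnail_url": (listing.get("photo_urls") or [""])[0],
    }
```

Because the upsert is keyed by listing ID, replaying the same event twice is harmless: the consumer is idempotent, which is exactly what an at-least-once Kafka pipeline requires.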
Components added:
- Search Service: Translates the /listings/search query parameters into an Elasticsearch query with a geo_distance filter, a bool must clause for text, and range filters for price.
- Elasticsearch Cluster: Stores a denormalized listing document per listing. Handles geo_distance, full-text BM25 scoring, and filter aggregations.
- Search Indexer (Kafka consumer): Consumes listing.* events from Kafka and upserts documents into Elasticsearch. Runs async, not in the write path.
Request walkthrough:
- Buyer sends GET /listings/search?q=vintage+guitar&lat=37.7&lng=-122.4&radius_km=15.
- API Gateway routes to Search Service.
- Search Service builds an Elasticsearch query: a geo_distance circle around the buyer's location, a multi_match on title and description for "vintage guitar", a range on price, and a term on status: "active".
- Elasticsearch returns the top 20 matching listing IDs and scores.
- Search Service fetches full listing details from PostgreSQL by IDs (or from the Elasticsearch document itself, since it is denormalized).
- Search Service returns the ranked list to the client.
Denormalizing the listing into the Elasticsearch document avoids a second PostgreSQL fetch for most search requests. Store title, price, category, photo thumbnail URL, and seller display name in the document. Reserve the PostgreSQL fetch for the listing detail page, not the search results page.
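Concretely, the Search Service's query construction might look like this sketch. The title^2 boost and exact field names are my assumptions; the structure (text in must, everything else in filter) is the important part:

```python
def build_search_query(params: dict) -> dict:
    """Translate /listings/search parameters into an Elasticsearch query body.

    Text matching goes in `must` (scored by BM25); category, price, status,
    and the geo radius go in `filter` (cached bitsets, no relevance scoring).
    """
    must, filters = [], [{"term": {"status": "active"}}]
    if params.get("q"):
        must.append({"multi_match": {"query": params["q"],
                                     "fields": ["title^2", "description"]}})
    if params.get("category"):
        filters.append({"term": {"category": params["category"]}})
    price = {}
    if "min_price" in params:
        price["gte"] = params["min_price"]
    if "max_price" in params:
        price["lte"] = params["max_price"]
    if price:
        filters.append({"range": {"price_cents": price}})
    filters.append({"geo_distance": {
        "distance": f"{params.get('radius_km', 25)}km",
        "location": {"lat": params["lat"], "lon": params["lng"]},
    }})
    return {"query": {"bool": {"must": must, "filter": filters}},
            "size": params.get("limit", 20)}
```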
3. Buyers can message sellers about a listing
Messaging introduces a new access pattern: write-heavy at creation (many short messages), read-heavy at retrieval (load the full conversation history). PostgreSQL works here at moderate scale, though a single very active conversation can become a hot spot if it accumulates thousands of messages.
I'd use PostgreSQL for messaging at this scale (100M DAU, but conversations are low-volume compared to search). A table messages(id, listing_id, sender_id, receiver_id, text, sent_at) with an index on (listing_id, sent_at) covers the two main access patterns: fetch a conversation's messages for a listing (further filtered by the buyer-seller pair) and sort them by time.
Components added:
- Messaging Service: Validates sender permissions (buyer must not be the seller), persists the message, and notifies the recipient. Notification delivery is out of scope, but I'd publish to a message.created Kafka topic that a push notification service consumes.
- Messages Table (PostgreSQL): Partitioned by listing_id to keep conversation lookups fast even at scale.
Request walkthrough:
- Buyer sends POST /listings/{id}/messages with message text.
- API Gateway validates the auth token and routes to Messaging Service.
- Messaging Service verifies the buyer is not the listing owner (sellers cannot message themselves).
- Messaging Service inserts the message row into PostgreSQL.
- Messaging Service publishes message.created to Kafka for async push notification delivery.
- Messaging Service returns 201 { message_id, sent_at }.
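The permission and routing rules in this walkthrough reduce to a small pure function. A sketch (the sold-listing guard and the names are my assumptions; the core rule is that the server derives both parties itself):

```python
def route_message(listing: dict, sender_id: str, text: str) -> dict:
    """Validate an inbound message and derive its receiver from the listing.

    The server never trusts buyer_id/seller_id from the request body: the
    seller comes from the listing row, the sender from the auth token.
    """
    if not text.strip():
        raise ValueError("empty message")
    if listing["status"] != "active":
        # assumption: reject new messages on sold listings
        raise ValueError("listing is no longer active")
    if sender_id == listing["seller_id"]:
        raise PermissionError("sellers cannot message themselves on their own listing")
    return {"listing_id": listing["id"], "sender_id": sender_id,
            "receiver_id": listing["seller_id"], "text": text}
```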
4. Sellers can mark a listing as sold
Mark-as-sold is a simple update to the listing record, but it has one important consequence: the listing must drop out of search results quickly. Buyers should not message a seller about a listing that is already sold.
How it works:
- Seller sends PATCH /listings/{id} with { status: "sold" }.
- Listing Service verifies the caller is the listing owner.
- Listing Service updates status = 'sold' in PostgreSQL.
- Listing Service publishes a listing.sold event to Kafka.
- Search Indexer consumes the event and updates the Elasticsearch document's status field to "sold". All search queries filter on status: "active", so the listing drops from results within seconds.
There is a short window (typically under 2 seconds) between when the DB update commits and when Elasticsearch processes the event, during which the listing still appears in search. This is acceptable for a marketplace. If immediate removal were critical, you could synchronously delete the Elasticsearch document in the write path, but that couples the write latency to Elasticsearch availability.
No new diagram is needed here. The write path follows the same Listing Service + PostgreSQL + Kafka + Search Indexer flow established in FR1 and FR2.
Potential Deep Dives
1. How do we implement location-based listing discovery?
The geospatial search problem is the most interesting data engineering challenge in this system. We need to find all active listings within X km of a buyer's location, fast, at 50,000 queries per second against 500M listings. There are three approaches with meaningfully different tradeoffs.
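Whatever indexed approach wins, it must agree with the brute-force definition of "within X km". A haversine reference implementation makes that definition concrete (standard formula; the within_radius helper is illustrative):

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1: float, lng1: float, lat2: float, lng2: float) -> float:
    """Great-circle distance between two (lat, lng) points in kilometers."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi, dlmb = radians(lat2 - lat1), radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def within_radius(listings: list[dict], lat: float, lng: float,
                  radius_km: float) -> list[dict]:
    """Brute-force ground truth: active listings inside the radius, nearest first.
    Any indexed approach (geohash, quadtree, BKD) must return the same set."""
    hits = [(haversine_km(lat, lng, l["lat"], l["lng"]), l)
            for l in listings if l["status"] == "active"]
    return [l for d, l in sorted(hits, key=lambda t: t[0]) if d <= radius_km]
```

The brute-force version is O(n) per query, which is exactly why it cannot survive 50,000 QPS over 500M listings; the indexed approaches exist to prune the candidate set before the distance math runs.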
2. How do we add full-text search so buyers can find specific items?
Full-text search is the second major query pattern. A buyer types "red leather couch" and expects results ranked by relevance, not just by recency or distance. The challenge is that natural language matching against 500M listing titles and descriptions is a different problem from structured attribute filtering.
3. How do we generate personalized recommendations for buyers?
Recommendations surface relevant listings a buyer did not explicitly search for: the "You might like" section after a search, or personalized home-feed listings. I would approach this differently depending on where you are in the product lifecycle.
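As a sketch of the early-lifecycle approach, here is a content-based scorer that ranks candidates by overlap with recently viewed categories and title keywords. The weights are illustrative assumptions:

```python
def recommend(candidates: list[dict], viewed: list[dict], k: int = 5) -> list[dict]:
    """Content-based recommendations: score candidate listings by overlap
    with the categories and title keywords of recently viewed listings.
    No training data required, which sidesteps the cold start problem."""
    seen_ids = {v["id"] for v in viewed}
    liked_categories = {v["category"] for v in viewed}
    liked_words = {w for v in viewed for w in v["title"].lower().split()}

    def score(listing: dict) -> float:
        s = 0.0
        if listing["category"] in liked_categories:
            s += 2.0   # assumed weight: category match counts most
        s += len(set(listing["title"].lower().split()) & liked_words)
        return s

    ranked = sorted((l for l in candidates if l["id"] not in seen_ids),
                    key=score, reverse=True)
    return [l for l in ranked if score(l) > 0][:k]
```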
Final Architecture
The central insight is the Kafka-driven decoupling between write operations and search visibility: every change to a listing (create, update, sold) flows through Kafka to the Search Indexer, which keeps Elasticsearch eventually consistent without coupling the write path to search availability. Elasticsearch absorbs 100% of search traffic, keeping PostgreSQL reserved for writes and message history. The item-item similarity map in Redis enables sub-10ms recommendation lookups with no online ML inference.
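The offline half of that similarity map can be sketched in a few lines: count co-views per session and keep the top-N per item. Raw co-view counts stand in here for a normalized similarity like cosine, and the session format is an assumption:

```python
from collections import Counter, defaultdict
from itertools import combinations

def top_similar(view_sessions: list[list[str]], n: int = 20) -> dict[str, list[str]]:
    """Sketch of the offline item-item job: count how often two listings are
    viewed in the same session, then keep the top-n co-viewed listings per
    item. The result is what the daily job would write to Redis, keyed by
    listing ID, so online recommendation is a single lookup."""
    co = defaultdict(Counter)
    for session in view_sessions:
        for a, b in combinations(sorted(set(session)), 2):
            co[a][b] += 1
            co[b][a] += 1
    return {item: [other for other, _ in counts.most_common(n)]
            for item, counts in co.items()}
```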
Interview Cheat Sheet
- Lock down 4 core features early: create listing, search by location and text, message seller, and mark as sold. Payments and reviews are quick to name as out of scope.
- State the 25:1 read/write ratio upfront. It drives the decision to put Elasticsearch in front of PostgreSQL for search traffic.
- PostgreSQL with a PostGIS geography column is the source of truth for listing coordinates. Elasticsearch is the query engine, not the record of truth.
- Geohashing is a reasonable interview answer if you explain it. But Elasticsearch geo_distance on a BKD tree is typically faster in practice and avoids geohash cell-boundary edge cases.
- Elasticsearch handles geo_distance + full-text BM25 scoring in a single compound bool query. No need for application-side post-filtering.
- The English analyzer in Elasticsearch applies stemming: buyers searching "guitars" match listings containing "guitar". This is a free feature that LIKE queries cannot replicate.
- Use filter context (not must context) for non-text filters in Elasticsearch: category, status, and price range. Filter context caches bitsets across queries and skips BM25 scoring for those clauses.
- Photos are uploaded directly to S3 via pre-signed URLs. Never route binary uploads through your application servers.
- The mark-as-sold update propagates to Elasticsearch through Kafka within 1-3 seconds. This eventual consistency window is acceptable for a marketplace.
- Content-based recommendations (match by recently viewed categories + keywords) ship quickly and handle the cold start problem. Save collaborative filtering for when you have enough view data.
- Item-item collaborative filtering computes similarity offline (daily Spark job) and stores the top-20 similar listings per item in Redis. Online recommendation latency is just two lookups.
- Messaging fits in PostgreSQL partitioned by listing_id. Only add Cassandra if conversation volumes exceed ~1,000 messages per listing or you need multi-region fan-out.