News Feed
Design a personalized news feed system like Facebook's or Instagram's: from a naive fan-out-on-write to a hybrid push-pull model that serves hundreds of millions of users in under 200ms.
What is a social media news feed?
A news feed is the personalized, scrollable homepage showing posts from people you follow. The real challenge is not storing posts; it is delivering a user's feed in under 200ms when a single celebrity post must propagate to 10 million followers simultaneously. I consider this one of the best interview questions because the naive answer (fan-out-on-write) sounds correct until you do the math on celebrity accounts, and the interviewer gets to watch you reason through the trade-off in real time. This question tests caching, message queues, sharding, and the fundamental fan-out trade-off between writing to every follower's feed at post time versus computing each user's feed fresh on every read.
Functional Requirements
Core Requirements
- Users see a personalized, ranked feed of posts from friends and accounts they follow.
- New posts appear in followers' feeds within seconds of publishing.
- The feed supports infinite scroll (paginated reads).
Below the Line (out of scope)
- Feed recommendation ML model internals
- Ad insertion logic
- Stories and ephemeral content (separate architecture)
If ranking ML model internals were in scope, I would introduce a candidate retrieval and ranking service layer sitting between feed storage and the client: retrieve the top 500 candidate post_ids from the raw chronological feed and pass them to the ranking model, which applies signals like engagement rate, relationship strength, and recency to return the final ordered set. The ML model itself is a separate offline training pipeline that publishes scoring parameters to a feature store the ranking service reads at request time.
Ad insertion is out of scope because it requires its own auction pipeline and targeting model. The integration point is a slot injection layer that periodically splices ad slots between organic posts in the ranked feed before serializing the response, a background concern that sits downstream of everything in this design.
Stories live on a separate fan-out and storage architecture because they have short TTLs (24 hours), a different content format, and different engagement mechanics. They would not share the feed cache or fan-out worker design covered here.
The hardest part in scope: Deciding how to fan out a post from a user with 10 million followers. Fan-out-on-write (write to each follower's feed at post time) creates 10 million writes per post. Fan-out-on-read (compute each feed fresh at read time) creates N database lookups per page load. Neither works at scale in isolation; the entire system design pivots on getting this trade-off right.
Non-Functional Requirements
Core Requirements
- Latency: Feed load under 200ms p99 end to end.
- Scale: 500M DAU; each user follows 200 to 500 accounts on average; celebrity accounts have up to 10M followers.
- Writes: 10M new posts per day (approximately 115 posts per second on average; 5x at peak).
- Availability: 99.99% uptime (about 52 minutes of downtime per year).
- Consistency: Eventual. Seeing a post 2 to 5 seconds late is acceptable; feed staleness for minutes is not.
Below the Line
- Exactly-once delivery guarantees for feed updates (at-least-once with idempotent writes is sufficient)
- Cross-device feed position synchronization (cursor managed per device)
Read/write amplification: 10M new posts per day sounds modest, but fan-out is the multiplier that changes the math entirely. A user with 500 followers creates 500 feed-write operations per post. A celebrity with 10M followers creates 10M operations per post.
At peak post rates, fan-out writes can reach roughly 500K feed updates per second across the system, a 4,000x amplification over the raw post write rate. Every design decision in this article traces back to controlling that amplification.
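The amplification figures above follow from simple arithmetic; a quick sanity check using the numbers from the requirements (the 60-second celebrity fan-out window is an illustrative assumption):

```python
# Back-of-envelope fan-out amplification, using the numbers from the requirements.
POSTS_PER_DAY = 10_000_000
SECONDS_PER_DAY = 86_400
AVG_FOLLOWERS = 500
PEAK_MULTIPLIER = 5

raw_posts_per_sec = POSTS_PER_DAY / SECONDS_PER_DAY            # ~116 posts/s average
peak_posts_per_sec = raw_posts_per_sec * PEAK_MULTIPLIER       # ~580 posts/s at peak
peak_feed_writes_per_sec = peak_posts_per_sec * AVG_FOLLOWERS  # ~290K feed writes/s

# One celebrity post (10M followers) fanned out over ~60s adds ~167K writes/s on top,
# pushing the system toward the ~500K/s figure cited above.
celebrity_burst = 10_000_000 / 60
```

Running the numbers like this early in an interview is exactly what the amplification discussion calls for: the raw post rate looks trivial until you multiply by followers.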
I'd flag the read-to-write ratio early because it shapes which trade-off you optimize for: feeds are read roughly 100 times for every post written, which means read latency is the primary cost and write throughput is the secondary cost. You can afford more write latency (via async fan-out) to buy less read latency (pre-materialized feed in Redis).
Core Entities
- User: An account on the platform. Has a list of accounts they follow stored in the social graph.
- Post: Content published by a user (text, image references, video references). The atomic unit of the news feed.
- Follow: A directed edge in the social graph from follower to followee. Used to determine whose posts appear in a user's feed.
- FeedEntry: A materialized mapping of (user_id, post_id, score) representing a pre-computed feed item stored in the user's feed cache.
Full schema and indexing strategy are deferred to the deep dives. These four entities are enough to drive the API and High-Level Design.
API Design
FR 1 and FR 2: Create a post and publish it to followers' feeds:
# Create a new post; triggers async fan-out to followers
POST /v1/posts
Body: { content: "...", media_urls: [], created_at: "2026-03-29T12:00:00Z" }
Response: { post_id: "p_abc123", created_at: "2026-03-29T12:00:00Z" }
POST because this is a state-creating operation. Media uploads are handled separately via a pre-signed URL flow; this endpoint accepts media references, not raw bytes. The fan-out happens asynchronously after the response is returned, so the caller does not wait for all followers' feeds to be updated.
FR 3: Read the paginated news feed:
# Fetch next page of the caller's feed
GET /v1/feed?cursor=eyJ0c...&limit=20
Response: {
posts: [
{ post_id: "p_abc", author_id: "u_xyz", content: "...", created_at: "...", like_count: 312 },
...
],
next_cursor: "eyJ0c...",
has_more: true
}
Use cursor-based pagination rather than offset-based. Offset pagination on a mutable, ranked feed skips or repeats posts when new content is inserted ahead of the current offset. The cursor encodes a timestamp and post_id so the feed resumes deterministically after new posts arrive. Limit defaults to 20 and caps at 50.
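One common way to implement the opaque cursor is base64-encoded JSON of the last-seen timestamp and post_id (the field names and helper functions here are illustrative, not a prescribed wire format):

```python
import base64
import json

def encode_cursor(ts: float, post_id: str) -> str:
    """Pack the last-seen (timestamp, post_id) into an opaque, URL-safe token."""
    raw = json.dumps({"ts": ts, "id": post_id}).encode()
    return base64.urlsafe_b64encode(raw).decode()

def decode_cursor(cursor: str) -> tuple:
    """Recover (timestamp, post_id) so the next page resumes deterministically."""
    data = json.loads(base64.urlsafe_b64decode(cursor.encode()))
    return data["ts"], data["id"]

# Usage: round-trip the cursor for the last post on a page
cursor = encode_cursor(1711713600.0, "p_abc123")
assert decode_cursor(cursor) == (1711713600.0, "p_abc123")
```

Because the token is opaque to the client, the server is free to change the cursor's contents (for example, adding a ranking score) without breaking API consumers.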
High-Level Design
1. Basic post creation and feed write
The naive write path: a user posts, the Post Service saves the content, then synchronously writes the post_id into every follower's feed table before returning.
This is simple to reason about and works for small accounts. It fails for celebrity accounts, but understanding exactly why it fails is the setup for everything that follows. I always start an interview answer with this synchronous version because interviewers want to see you identify the bottleneck yourself, not skip to the optimized solution.
Components:
- Client: Web or mobile app sending POST /v1/posts.
- Post Service: Validates and stores the post in the Post DB. Queries the social graph for the poster's follower list, then writes one row per follower into the Feed DB.
- Post DB: Stores full post content (PostgreSQL). The source of truth for all post data.
- Social Graph DB: Stores follower relationships as a directed adjacency list. Read-heavy: queried on every post to enumerate followers.
- Feed DB: A per-user feed table storing (user_id, post_id, timestamp) rows. Feed reads query this table.
Request walkthrough:
- Client sends POST /v1/posts with post content.
- Post Service validates the request and inserts the post into Post DB.
- Post Service queries Social Graph DB: give me all followers for user X.
- Post Service loops through the follower list and writes (follower_id, post_id, timestamp) into Feed DB for each follower.
- Post Service returns the new post_id to the client.
This covers the happy path for regular users. The critical failure: a user with 10M followers triggers 10M synchronous inserts before the POST request returns. At average follower counts (500) and 10M posts per day, total fan-out is 5B feed writes per day, which is manageable. At celebrity scale, a single post blocks the write path for minutes. The next section fixes this.
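To make the failure mode concrete, the naive write path can be expressed as a runnable sketch (in-memory dicts stand in for the three stores; all names are illustrative):

```python
import time
from collections import defaultdict

post_db = {}                     # post_id -> post content (Post DB stand-in)
social_graph = defaultdict(set)  # user_id -> follower ids (Social Graph DB stand-in)
feed_db = defaultdict(list)      # follower_id -> [(post_id, ts), ...] (Feed DB stand-in)

def create_post_sync(poster_id: str, post_id: str, content: str) -> str:
    """Synchronous fan-out-on-write: blocks until EVERY follower row is written.

    For a celebrity this loop runs 10 million times before the POST returns --
    exactly the bottleneck the async design in the next section removes.
    """
    ts = time.time()
    post_db[post_id] = {"author": poster_id, "content": content, "ts": ts}
    for follower_id in social_graph[poster_id]:
        feed_db[follower_id].append((post_id, ts))
    return post_id
```

Profiling this loop against a 10M-follower account is the cleanest way to demonstrate the bottleneck before proposing the fix.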
2. Async fan-out via message queue
The fix: the Post Service publishes a single event to a message queue and returns immediately. Fan-out Workers consume the event asynchronously and handle the per-follower feed writes.
Decoupling the POST response from the fan-out work is the key architectural move. The client gets a fast acknowledgment. The fan-out work happens in the background, completing within seconds. In every production system I have worked on, this decoupling was the single biggest improvement to write latency and user-perceived responsiveness.
Components (new or changed):
- Post Service (evolved): After saving the post, publishes a post.created event to Kafka and returns. No longer performs fan-out directly.
- Kafka (post-created topic): Durable event log partitioned by poster's user_id. Buffers the post event so Fan-out Workers can process it at their own pace.
- Fan-out Worker: Stateless consumer reading from Kafka. Fetches the poster's follower list from the Social Graph DB, then writes (follower_id, post_id, timestamp) entries to the Feed Cache (Redis) for each follower.
- Feed Cache (Redis): Sorted sets keyed by feed:{user_id}, score = post timestamp, member = post_id. Replaces the Feed DB as the primary feed store for fast reads.
Request walkthrough:
- Client sends POST /v1/posts.
- Post Service saves the post to Post DB and publishes { poster_id, post_id, timestamp } to Kafka.
- Post Service returns 200 immediately (fan-out is not yet done).
- Fan-out Worker consumes the Kafka event.
- Fan-out Worker reads the poster's follower list from Social Graph DB.
- Fan-out Worker calls ZADD feed:{follower_id} {timestamp} {post_id} for each follower.
- Fan-out Worker trims old entries: ZREMRANGEBYRANK feed:{follower_id} 0 -1001 (keep top 1000 posts).
The poster's client now gets a response in under 50ms regardless of follower count. Followers see the post within seconds as the Workers process the Kafka event. The trade-off is eventual consistency: there is a small window (typically 2 to 10 seconds) between a post being made and it appearing in all followers' feeds.
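The producer/consumer split above can be sketched with in-memory stand-ins for Kafka and the Redis sorted set (a deque for the topic, a per-user dict of post_id to timestamp for the feed; names are illustrative):

```python
import time
from collections import defaultdict, deque

post_queue = deque()             # stand-in for the Kafka post-created topic
social_graph = defaultdict(set)  # user_id -> follower ids
feed_cache = defaultdict(dict)   # user_id -> {post_id: timestamp} (sorted-set stand-in)
FEED_MAX = 1000                  # keep at most 1000 entries per feed

def publish_post(poster_id: str, post_id: str) -> None:
    """Post Service: enqueue one event and return immediately; no per-follower work."""
    post_queue.append({"poster_id": poster_id, "post_id": post_id, "ts": time.time()})

def fanout_worker_step() -> None:
    """Fan-out Worker: consume one event, write one feed entry per follower, trim."""
    event = post_queue.popleft()
    for follower_id in social_graph[event["poster_id"]]:
        feed = feed_cache[follower_id]
        feed[event["post_id"]] = event["ts"]   # ZADD feed:{follower_id} ts post_id
        if len(feed) > FEED_MAX:               # ZREMRANGEBYRANK feed:{...} 0 -1001
            for old in sorted(feed, key=feed.get)[: len(feed) - FEED_MAX]:
                del feed[old]
```

Note that publish_post touches exactly one data structure regardless of follower count; all the O(followers) work lives in the worker, which is the whole point of the decoupling.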
3. Feed read path with Redis cache
The read path: the Feed Service reads a user's pre-built feed from Redis, then hydrates the post_ids with full post content from the Post DB before returning to the client.
The feed is stored as a sorted set of post_ids, not full posts. Storing post_ids keeps the Redis memory footprint small and means post edits propagate automatically when the hydration step re-fetches content from the Post DB. I would call this out explicitly in an interview because it is a common mistake to store full post JSON in the feed sorted set, which bloats Redis memory by 50x and creates a cache invalidation nightmare when posts are edited.
Components:
- Feed Service: Stateless read service. Reads ZREVRANGEBYSCORE feed:{user_id} from Redis, then batch-fetches full post objects from Post DB.
- Feed Cache (Redis): Already populated by Fan-out Workers. ZREVRANGEBYSCORE with a cursor encoded as the last seen timestamp gives cursor-based pagination.
- Post DB: Provides full post content and author info for the hydration batch read.
Request walkthrough:
- Client sends GET /v1/feed?cursor=...&limit=20.
- Feed Service decodes the cursor to extract the last seen timestamp.
- Feed Service runs ZREVRANGEBYSCORE feed:{user_id} ({cursor_ts} -inf LIMIT 0 20 to get the next 20 post_ids (the exclusive bound avoids repeating the post at the cursor).
- Feed Service batch-fetches full post objects from Post DB (SELECT ... WHERE post_id IN (...)).
- Feed Service encodes the next cursor (timestamp of the last returned post) and returns the response.
This read path handles the 99% case: logged-in users scrolling a pre-built feed. The edge cases, returning users whose cached feed has expired and celebrity posts that were never fanned out, are the subject of the deep dives.
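A minimal model of this read path, reusing in-memory stand-ins (a dict of post_id to timestamp for the sorted set; function and field names are illustrative):

```python
def read_feed_page(feed, post_db, cursor_ts=None, limit=20):
    """ZREVRANGEBYSCORE + hydration, sketched over in-memory stand-ins.

    feed maps post_id -> timestamp (Redis sorted-set stand-in);
    post_db maps post_id -> full post object (Post DB stand-in).
    """
    # Newest first; resume strictly below the cursor timestamp to avoid repeats.
    candidates = sorted(feed.items(), key=lambda kv: kv[1], reverse=True)
    if cursor_ts is not None:
        candidates = [(pid, ts) for pid, ts in candidates if ts < cursor_ts]
    page = candidates[:limit]
    # Hydration: one batch fetch of full post objects, never one query per post.
    posts = [{**post_db[pid], "post_id": pid} for pid, _ in page]
    next_cursor = page[-1][1] if page else None
    return posts, next_cursor

# Usage: paginate a 3-post feed two entries at a time
feed = {"p1": 100.0, "p2": 200.0, "p3": 300.0}
post_db = {pid: {"content": f"post {pid}"} for pid in feed}
page1, cursor = read_feed_page(feed, post_db, limit=2)   # p3, p2
page2, _ = read_feed_page(feed, post_db, cursor, limit=2)  # p1
```

The important structural property is that the feed store returns only ids; everything user-visible comes from the hydration step, which is what keeps edits consistent.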
Interview tip: separate post_id storage from post content
Store only post_ids in the feed cache, not full post objects. This keeps the sorted set small (a 64-bit integer vs. a kilobyte JSON blob), lets post edits propagate automatically via the hydration step, and decouples feed invalidation from post content invalidation. Always draw this separation explicitly when presenting the feed read path to an interviewer.
Potential Deep Dives
1. How should you handle fan-out for celebrity accounts?
Fan-out-on-write works for regular users but creates millions of writes per celebrity post. Fan-out-on-read avoids the write amplification but makes every feed read expensive. The right answer is a hybrid model, and this is the central trade-off every interviewer probes. I lead with the numbers here: "a user with 10M followers generates 10M Redis writes per post" is the sentence that makes the interviewer nod and signals you understand the scale.
Common mistake: relying solely on follower count for the celebrity threshold
Follower count misses accounts that are suddenly viral but have not yet crossed the threshold. Add a second signal: if a post generates more than 10K fan-out writes per second in the first 60 seconds (monitor the Fan-out Worker output rate), promote the poster to the celebrity path immediately for the duration of that post's fan-out window.
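The routing decision combines both signals discussed above; a sketch using the thresholds from this article (the function name and monitoring hook are hypothetical):

```python
# Thresholds from this design: 1M followers, or 10K fan-out writes/s observed
# in the first 60 seconds of a post's fan-out window.
CELEBRITY_FOLLOWER_THRESHOLD = 1_000_000
VIRAL_WRITE_RATE_THRESHOLD = 10_000

def use_celebrity_path(follower_count: int, observed_writes_per_sec: float) -> bool:
    """Route a post to pull-at-read-time fan-out if EITHER signal trips.

    follower_count catches established celebrities before the post is published;
    the write-rate signal (from the Fan-out Worker output monitor) catches
    accounts going viral before they cross the follower threshold.
    """
    return (follower_count >= CELEBRITY_FOLLOWER_THRESHOLD
            or observed_writes_per_sec >= VIRAL_WRITE_RATE_THRESHOLD)
```

In an interview, presenting this as a two-signal OR makes clear you have thought past the static threshold that the common mistake above warns about.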
2. How do you serve a fresh, relevant feed when a user returns after several days?
A user who opens the app after a 3-day absence has a feed cache that either expired or is filled with posts from 3 days ago. Serving stale content destroys the experience. The right approach serves immediately and refreshes asynchronously. I have seen teams spend weeks optimizing the hot path only to forget the returning-user cold start, which affects 5-10% of daily requests and generates the most user complaints.
Computing the feed completely from scratch adds too much latency to the foreground request path.
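The serve-immediately, refresh-asynchronously pattern can be sketched as follows (the cache shape and rebuild hook are illustrative stand-ins; in production the rebuild would be an enqueued background job, not an inline call):

```python
THREE_DAYS = 3 * 86_400

def serve_feed(user_id, cache, rebuild, now, max_age_s=THREE_DAYS):
    """Serve whatever is cached right away; trigger an async rebuild if stale.

    cache maps user_id -> (feed_page, built_at_timestamp).
    rebuild is the async trigger (e.g. enqueue a rebuild job); it is called
    inline here only so the sketch stays self-contained and testable.
    """
    entry = cache.get(user_id)
    if entry is None:
        rebuild(user_id)      # cold start: rebuild in the background...
        return []             # ...and serve an empty/default page now
    feed_page, built_at = entry
    if now - built_at > max_age_s:
        rebuild(user_id)      # stale: serve it anyway, refresh behind the scenes
    return feed_page
```

The foreground request never waits on the followee scan; at worst the user sees a slightly stale page once, and the next load hits the rebuilt cache.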
3. How do you serve 500M users reading feeds under 200ms?
The read path serving 500M DAU requires the hydration step (fetching full post content) to not become a bottleneck. The naive approach collapses under the sheer volume of post DB reads.
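One way to keep hydration off the critical path is a batched read-through cache in front of the Post DB, so each page load costs one cache lookup plus at most one batch DB query for the misses (dict stand-ins; names are illustrative):

```python
def hydrate(post_ids, post_cache, post_db):
    """Batch hydration with a read-through cache.

    Serve hits from the cache, fetch only the misses from the Post DB in a
    single batch (SELECT ... WHERE post_id IN (...)), then backfill the cache
    so the next reader of the same posts skips the DB entirely.
    """
    hits = {pid: post_cache[pid] for pid in post_ids if pid in post_cache}
    misses = [pid for pid in post_ids if pid not in hits]
    if misses:
        fetched = {pid: post_db[pid] for pid in misses}  # one batch DB read
        post_cache.update(fetched)                       # backfill the cache
        hits.update(fetched)
    return [hits[pid] for pid in post_ids]               # preserve feed order
```

Because feeds of users who follow the same accounts share post_ids, the hit rate on this cache climbs quickly, which is what makes the 200ms budget achievable at 500M DAU.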
Final Architecture
The hybrid fan-out model is the central architectural insight: fan-out-on-write for regular users pre-builds feeds in Redis at near-zero read cost, while celebrities bypass fan-out entirely and their posts are pulled and merged at read time, capping write amplification regardless of follower count.
Interview Cheat Sheet
- Establish the fan-out problem as the core challenge immediately: a post from a user with 10M followers creates 10M writes; naive fan-out-on-write cannot absorb this at celebrity scale.
- Fan-out-on-write gives O(1) feed reads but write amplification proportional to follower count; fan-out-on-read eliminates write amplification but makes each feed read proportional to the number of followees.
- The hybrid model (fan-out-on-write for regular users, fan-out-on-read for celebrities above 1M followers) bounds write amplification while keeping feed reads fast for the vast majority of users.
- Use Kafka to decouple post creation from fan-out: the Post Service publishes a post.created event and returns immediately; Fan-out Workers process the event asynchronously, giving the poster a fast response regardless of follower count.
- Store pre-built feeds as Redis sorted sets with post_ids as members and timestamp as score; ZREVRANGEBYSCORE with a timestamp cursor gives cursor-based pagination in O(log N + M), where M is the page size.
- Never store full post objects in the feed sorted set; store only post_ids and hydrate at read time. Post edits propagate automatically through the hydration step and feed memory stays small.
- At read time, merge the pre-built feed with a fresh pull from each celebrity timeline the user follows; with pipelining, these parallel Redis reads add under 10ms of total latency.
- The post content cache (Redis, write-through on post creation, 7-day TTL) absorbs over 95% of hydration reads and keeps the Post DB reserved for writes and cache-miss fallback only.
- Use cursor-based pagination rather than offset: offsets on a mutable ranked feed skip or repeat posts when new content is inserted ahead of the current position.
- For returning users with an empty or stale feed cache, serve the stale cache immediately and trigger an async rebuild; blocking the feed read on a synchronous full followee scan causes latency spikes and DB stampedes.
- Eventual consistency is the right model for feeds: a 2 to 5 second delay between a post being made and it appearing in followers' feeds is invisible to users and removes the requirement for synchronous fan-out.
- Feed Cache and Post Cache must be separate Redis clusters with different eviction policies: Feed Cache uses allkeys-lru; Post Cache uses volatile-lru to protect pinned posts from eviction.
- The celebrity flag threshold (1M followers) should be supplemented with a real-time write-rate monitor: if fan-out exceeds 10K writes per second for a single post event in the first 60 seconds, promote that poster to the celebrity path immediately.