News Feed
Design a personalized news feed system like Facebook's or Instagram's: from a naive fan-out-on-write to a hybrid push-pull model that serves hundreds of millions of users in under 200ms.
What is a social media news feed?
A news feed is the personalized, scrollable homepage showing posts from people you follow. The real challenge is not storing posts; it is delivering a user's feed in under 200ms when a single celebrity post must propagate to 10 million followers simultaneously. I consider this one of the best interview questions because the naive answer (fan-out-on-write) sounds correct until you do the math on celebrity accounts, and the interviewer gets to watch you reason through the trade-off in real time. This question tests caching, message queues, sharding, and the fundamental fan-out trade-off between writing to every follower's feed at post time versus computing each user's feed fresh on every read.
Functional Requirements
Core Requirements
- Users see a personalized, ranked feed of posts from friends and accounts they follow.
- New posts appear in followers' feeds within seconds of publishing.
- The feed supports infinite scroll (paginated reads).
Below the Line (out of scope)
- Feed recommendation ML model internals
- Ad insertion logic
- Stories and ephemeral content (separate architecture)
If ranking ML model internals were in scope, I would introduce a candidate retrieval and ranking service layer sitting between feed storage and the client: retrieve the top 500 candidate post_ids from the raw chronological feed and pass them to the ranking model, which applies signals like engagement rate, relationship strength, and recency to return the final ordered set. The ML model itself is a separate offline training pipeline that publishes scoring parameters to a feature store the ranking service reads at request time.
Ad insertion is out of scope because it requires its own auction pipeline and targeting model. The integration point is a slot injection layer that periodically splices ad slots between organic posts in the ranked feed before serializing the response, a background concern that sits downstream of everything in this design.
Stories live on a separate fan-out and storage architecture because they have short TTLs (24 hours), a different content format, and different engagement mechanics. They would not share the feed cache or fan-out worker design covered here.
The hardest part in scope: deciding how to fan out a post from a user with 10 million followers. Fan-out-on-write (write to each follower's feed at post time) creates 10 million writes per post. Fan-out-on-read (compute each feed fresh at read time) creates one lookup per followed account on every page load, which is 200 to 500 lookups for a typical user. Neither works at scale in isolation; the entire system design pivots on getting this trade-off right.
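To make the read-side cost concrete, here is a minimal sketch of fan-out-on-read. It assumes in-memory dictionaries standing in for the Post DB and Social Graph DB; the names `posts_by_author`, `following`, and `feed_on_read` are hypothetical and not part of the design above.

```python
import heapq
from itertools import islice

# Hypothetical in-memory stand-ins for the Post DB and Social Graph DB.
# Each author's posts are kept newest-first as (timestamp, post_id) tuples.
posts_by_author = {
    "alice": [(105, "p5"), (101, "p1")],
    "bob": [(104, "p4"), (102, "p2")],
    "carol": [(103, "p3")],
}
following = {"dave": ["alice", "bob", "carol"]}


def feed_on_read(user_id, limit=20):
    """Build one feed page at read time: one lookup per followed account."""
    per_author = [posts_by_author.get(a, []) for a in following.get(user_id, [])]
    # Every input list is already newest-first, so a k-way merge
    # yields a single newest-first stream without a full sort.
    merged = heapq.merge(*per_author, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]


page = feed_on_read("dave", limit=3)
print(page)  # ['p5', 'p4', 'p3']
```

The number of per-author lookups grows with the number of accounts followed, which is why this path gets expensive when it runs on every page load.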
Non-Functional Requirements
Core Requirements
- Latency: Feed load under 200ms p99 end to end.
- Scale: 500M DAU; each user follows 200 to 500 accounts on average; celebrity accounts have up to 10M followers.
- Writes: 10M new posts per day (approximately 115 posts per second on average; 5x at peak).
- Availability: 99.99% uptime (about 52 minutes of downtime per year).
- Consistency: Eventual. Seeing a post 2 to 5 seconds late is acceptable; feed staleness for minutes is not.
Below the Line
- Exactly-once delivery guarantees for feed updates (at-least-once with idempotent writes is sufficient)
- Cross-device feed position synchronization (cursor managed per device)
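Why at-least-once delivery is enough can be shown with a small sketch: if the feed write is an upsert keyed by post_id, a redelivered message overwrites the same entry instead of adding a duplicate. The store and function names here are hypothetical, and the score is an arbitrary example value.

```python
# Hypothetical in-memory feed store: user_id -> {post_id: score}.
feeds = {}


def apply_feed_update(user_id, post_id, score):
    """Idempotent upsert: applying the same update twice leaves one entry."""
    feeds.setdefault(user_id, {})[post_id] = score


# Simulate at-least-once delivery: the same message arrives twice.
for _ in range(2):
    apply_feed_update("u_1", "p_abc123", 1_700_000_000)

print(len(feeds["u_1"]))  # 1
```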
Read/write amplification: 10M new posts per day sounds modest, but fan-out is the multiplier that changes the math entirely. A user with 500 followers creates 500 feed-write operations per post. A celebrity with 10M followers creates 10M operations per post.
At peak post rates, fan-out writes can reach roughly 500K feed updates per second across the system, about a 4,000x amplification over the average raw post write rate of 115 posts per second. Every design decision in this article traces back to controlling that amplification.
I'd flag the read-to-write ratio early because it shapes which trade-off you optimize for: feeds are read roughly 100 times for every post written, which means read latency is the primary cost and write throughput is the secondary cost. You can afford more write latency (via async fan-out) to buy less read latency (pre-materialized feed in Redis).
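The back-of-envelope numbers in this section can be reproduced with a few lines of arithmetic. This is a sketch that assumes 500 followers per poster as the average; the variable names are mine. Celebrity posts push instantaneous rates well above these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60
MINUTES_PER_YEAR = 365 * 24 * 60

posts_per_day = 10_000_000
peak_multiplier = 5
avg_followers = 500  # assumed average follower count per poster
availability = 0.9999

avg_posts_per_sec = posts_per_day / SECONDS_PER_DAY            # about 115.7
peak_posts_per_sec = avg_posts_per_sec * peak_multiplier       # about 578.7
fanout_writes_per_day = posts_per_day * avg_followers          # 5 billion
avg_fanout_per_sec = fanout_writes_per_day / SECONDS_PER_DAY   # about 57,870
peak_fanout_per_sec = avg_fanout_per_sec * peak_multiplier     # about 289,352
downtime_min_per_year = (1 - availability) * MINUTES_PER_YEAR  # about 52.56

print(f"posts/sec (avg, peak): {avg_posts_per_sec:.1f}, {peak_posts_per_sec:.1f}")
print(f"fan-out writes/day: {fanout_writes_per_day:,}")
print(f"fan-out writes/sec (avg, peak): {avg_fanout_per_sec:,.0f}, {peak_fanout_per_sec:,.0f}")
print(f"allowed downtime: {downtime_min_per_year:.2f} min/year")
```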
Core Entities
- User: An account on the platform. Has a list of accounts they follow stored in the social graph.
- Post: Content published by a user (text, image references, video references). The atomic unit of the news feed.
- Follow: A directed edge in the social graph from follower to followee. Used to determine whose posts appear in a user's feed.
- FeedEntry: A materialized mapping of (user_id, post_id, score) representing a pre-computed feed item stored in the user's feed cache.
Full schema and indexing strategy are deferred to the deep dives. These four entities are enough to drive the API and High-Level Design.
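As a rough illustration, the four entities could be modeled like this. The field names beyond those mentioned above (for example `author_id` and `media_urls`, which mirror the API examples) are assumptions, not the final schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class User:
    user_id: str


@dataclass(frozen=True)
class Post:
    post_id: str
    author_id: str
    content: str
    media_urls: tuple[str, ...] = ()  # references only, never raw bytes
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


@dataclass(frozen=True)
class Follow:
    follower_id: str  # the account that follows
    followee_id: str  # the account being followed


@dataclass(frozen=True)
class FeedEntry:
    user_id: str   # owner of the feed
    post_id: str
    score: float   # ranking score, e.g. the post timestamp


post = Post(post_id="p_abc123", author_id="u_xyz", content="hello")
edge = Follow(follower_id="u_1", followee_id="u_xyz")
entry = FeedEntry(user_id=edge.follower_id, post_id=post.post_id,
                  score=post.created_at.timestamp())
```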
API Design
FR 1 and FR 2: Create a post and publish it to followers' feeds:
# Create a new post; triggers async fan-out to followers
POST /v1/posts
Body: { content: "...", media_urls: [], created_at: "2026-03-29T12:00:00Z" }
Response: { post_id: "p_abc123", created_at: "2026-03-29T12:00:00Z" }
POST because this is a state-creating operation. Media uploads are handled separately via a pre-signed URL flow; this endpoint accepts media references, not raw bytes. The fan-out happens asynchronously after the response is returned, so the caller does not wait for all followers' feeds to be updated.
FR 3: Read the paginated news feed:
# Fetch next page of the caller's feed
GET /v1/feed?cursor=eyJ0c...&limit=20
Response: {
  posts: [
    { post_id: "p_abc", author_id: "u_xyz", content: "...", created_at: "...", like_count: 312 },
    ...
  ],
  next_cursor: "eyJ0c...",
  has_more: true
}
Use cursor-based pagination rather than offset-based. Offset pagination on a mutable, ranked feed skips or repeats posts when new content is inserted ahead of the current offset. The cursor encodes a timestamp and post_id so the feed resumes deterministically after new posts arrive. Limit defaults to 20 and caps at 50.
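A minimal sketch of such a cursor, assuming it is URL-safe base64 over a small JSON object. The key names `ts` and `id` and the helper names are hypothetical; the feed is an in-memory list sorted newest-first.

```python
import base64
import json

DEFAULT_LIMIT = 20
MAX_LIMIT = 50


def encode_cursor(timestamp, post_id):
    raw = json.dumps({"ts": timestamp, "id": post_id}, separators=(",", ":"))
    return base64.urlsafe_b64encode(raw.encode("utf-8")).decode("ascii")


def decode_cursor(cursor):
    data = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return data["ts"], data["id"]


def clamp_limit(limit=None):
    if limit is None:
        return DEFAULT_LIMIT
    return max(1, min(limit, MAX_LIMIT))


def read_page(feed, cursor=None, limit=None):
    """feed is a list of (timestamp, post_id) tuples sorted newest-first."""
    limit = clamp_limit(limit)
    if cursor is not None:
        boundary = decode_cursor(cursor)
        # Keep only entries strictly older than the last one the client saw.
        feed = [entry for entry in feed if entry < boundary]
    page = feed[:limit]
    has_more = len(feed) > limit
    next_cursor = encode_cursor(*page[-1]) if page and has_more else None
    return page, next_cursor, has_more


feed = [(5, "p5"), (4, "p4"), (3, "p3"), (2, "p2"), (1, "p1")]
page1, cursor1, more1 = read_page(feed, limit=2)
feed.insert(0, (6, "p6"))  # a new post arrives between page loads
page2, cursor2, more2 = read_page(feed, cursor=cursor1, limit=2)
print(page1, page2)  # [(5, 'p5'), (4, 'p4')] [(3, 'p3'), (2, 'p2')]
```

The second page resumes at p3 even though p6 was inserted at the head, which is the behavior offset pagination cannot guarantee.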
High-Level Design
1. Basic post creation and feed write
The naive write path: a user posts, the Post Service saves the content, then synchronously writes the post_id into every follower's feed table before returning.
This is simple to reason about and works for small accounts. It fails for celebrity accounts, but understanding exactly why it fails is the setup for everything that follows. I always start an interview answer with this synchronous version because interviewers want to see you identify the bottleneck yourself, not skip to the optimized solution.
Components:
- Client: Web or mobile app sending POST /v1/posts.
- Post Service: Validates and stores the post in the Post DB. Queries the social graph for the poster's follower list, then writes one row per follower into the Feed DB.
- Post DB: Stores full post content (PostgreSQL). The source of truth for all post data.
- Social Graph DB: Stores follower relationships as a directed adjacency list. Read-heavy: queried on every post to enumerate followers.
- Feed DB: A per-user feed table storing (user_id, post_id, timestamp) rows. Feed reads query this table.
Request walkthrough:
- Client sends POST /v1/posts with post content.
- Post Service validates the request and inserts the post into Post DB.
- Post Service queries Social Graph DB: give me all followers for user X.
- Post Service loops through the follower list and writes (follower_id, post_id, timestamp) into Feed DB for each follower.
- Post Service returns the new post_id to the client.
This covers the happy path for regular users. The critical failure: a user with 10M followers triggers 10M synchronous inserts before the POST request returns. At average follower counts (500) and 10M posts per day, total fan-out is 5B feed writes per day, which is manageable. At celebrity scale, a single post blocks the write path for minutes. The next section fixes this.
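The synchronous write path can be sketched in a few lines. This uses in-memory structures as stand-ins for Post DB, Social Graph DB, and Feed DB; all names are hypothetical.

```python
import time

post_db = {}                                          # stand-in for Post DB
followers_of = {"u_author": ["u_1", "u_2", "u_3"]}    # stand-in for Social Graph DB
feed_db = []                                          # rows of (user_id, post_id, timestamp)


def create_post_sync(author_id, post_id, content, now=None):
    """Naive write path: the caller waits for every follower's feed row."""
    ts = now if now is not None else time.time()
    post_db[post_id] = {"author_id": author_id, "content": content, "created_at": ts}
    # One insert per follower, all before the request returns.
    for follower_id in followers_of.get(author_id, []):
        feed_db.append((follower_id, post_id, ts))
    return post_id


new_id = create_post_sync("u_author", "p_abc123", "hello", now=1000.0)
print(new_id, len(feed_db))  # p_abc123 3
```

The loop runs inside the request, so its length is the follower count. With three followers that is invisible; with 10M it dominates the response time.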
2. Async fan-out via message queue
The fix: the Post Service publishes a single event to a message queue and returns immediately. Fan-out Workers consume the event asynchronously and handle the per-follower feed writes.
Decoupling the POST response from the fan-out work is the key architectural move. The client gets a fast acknowledgment. The fan-out work happens in the background, completing within seconds. In every production system I have worked on, this decoupling was the single biggest improvement to write latency and user-perceived responsiveness.
Components (new or changed):
- Post Service (evolved): After saving the post, publishes a post.created event to Kafka and returns. No longer performs fan-out directly.
- Kafka (post-created topic): Durable event log partitioned by poster's user_id. Buffers the post event so Fan-out Workers can process it at their own pace.
- Fan-out Worker: Stateless consumer reading from Kafka. Fetches the poster's follower list from the Social Graph DB, then writes (follower_id, post_id, timestamp) entries to the Feed Cache (Redis) for each follower.
- Feed Cache (Redis): Sorted sets keyed by feed:{user_id}, score = post timestamp, member = post_id. Replaces the Feed DB as the primary feed store for fast reads.
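The interaction between these components can be sketched without any infrastructure. This is an illustration only: `queue.Queue` stands in for the Kafka topic and a dict of dicts stands in for the Redis sorted sets, so none of the calls below are Kafka or Redis client APIs, and the function names are hypothetical.

```python
import queue

post_events = queue.Queue()                       # stand-in for the post-created topic
followers_of = {"u_author": ["u_1", "u_2"]}       # stand-in for Social Graph DB
feed_cache = {}                                   # "feed:{user_id}" -> {post_id: score}


def publish_post_created(author_id, post_id, timestamp):
    """Post Service side: enqueue one event and return immediately."""
    post_events.put({"author_id": author_id, "post_id": post_id, "timestamp": timestamp})


def fan_out_worker_drain():
    """Worker side: consume pending events and write one entry per follower."""
    processed = 0
    while True:
        try:
            event = post_events.get_nowait()
        except queue.Empty:
            return processed
        for follower_id in followers_of.get(event["author_id"], []):
            key = f"feed:{follower_id}"
            # Keyed by post_id, so a redelivered event overwrites instead of duplicating.
            feed_cache.setdefault(key, {})[event["post_id"]] = event["timestamp"]
        processed += 1


def read_feed(user_id, limit=20):
    """Return post_ids ordered by score, newest first."""
    entries = feed_cache.get(f"feed:{user_id}", {})
    return sorted(entries, key=entries.get, reverse=True)[:limit]


publish_post_created("u_author", "p_1", 100)
publish_post_created("u_author", "p_2", 200)
pending_before = post_events.qsize()   # posts are acknowledged before any fan-out
handled = fan_out_worker_drain()
print(pending_before, handled, read_feed("u_1"))  # 2 2 ['p_2', 'p_1']
```

The publish call does a constant amount of work regardless of follower count; the per-follower loop has moved entirely into the worker.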
Request walkthrough: