Fan-out patterns
Fan-out on write vs. fan-out on read for social feeds and notifications: when each model breaks down, the hybrid threshold-based approach used by Twitter and Instagram, and how push-on-write affects storage.
The Feed Problem
When a user posts on a social platform, their followers need to see the post. The question is: do you distribute the post at write time or at read time?
- Fan-out on write: when a user posts, push the post (or a reference to it) to every follower's feed at write time
- Fan-out on read: when a user reads their feed, pull posts from all the accounts they follow and merge
The choice has massive implications for storage, latency, and infrastructure at scale.
TL;DR
- Fan-out on write (push model) precomputes each user's feed at post time by writing to every follower's inbox. Reads are fast (O(1)), but writes are expensive (O(followers)).
- Fan-out on read (pull model) assembles the feed at read time by querying all followed accounts and merging. Writes are cheap (O(1)), but reads are expensive (O(following)).
- Neither pure model works at scale. The real solution is hybrid: push for normal accounts, pull for high-follower accounts (celebrities).
- The celebrity/hotspot problem is the defining challenge: a user with 50M followers generates 50M writes per post under fan-out on write.
- This pattern appears in nearly every "Design Twitter/Instagram/Facebook" interview question. Know when to use each model and how to handle the celebrity threshold.
The Problem
When a user posts on a social platform, their followers need to see the post. This sounds simple until you look at the numbers.
Consider a platform with 500M users, where the average user follows 200 accounts and the average user posts twice per day. A celebrity user has 50M followers. When that celebrity posts, how do you get the post into 50M feeds?
Option 1: Write the post reference to all 50M follower inboxes at post time. That's 50M writes for one post. If the celebrity posts 10 times a day, that's 500M write operations just for one account.
Option 2: When any of those 50M followers opens their feed, query all accounts they follow, fetch recent posts, and merge them. If each user follows 200 accounts, that's 200 lookups and an N-way merge, computed on every feed refresh, for every user, multiple times per day.
Both options are expensive. The question isn't which one to use; it's when to use which.
The forces in tension: write amplification (fan-out on write makes posts expensive to create) vs read latency (fan-out on read makes feeds expensive to load). You can't eliminate both. Every feed system engineering team eventually confronts this trade-off, and the answer is always "it depends on the follower distribution."
One-Line Definition
Fan-out patterns distribute content to consumers either at write time (push) or at read time (pull), trading write amplification for read performance or vice versa.
Analogy
Think about a newspaper delivery service. There are two models.
Push model (fan-out on write): The printing press runs. A truck delivers a copy of the newspaper to every subscriber's doorstep at 5 a.m. When subscribers wake up, the paper is already there. Fast read experience. But if you have 1 million subscribers and a breaking story needs an extra edition, you need 1 million more deliveries.
Pull model (fan-out on read): No home delivery. Subscribers go to the newsstand when they want to read. The newsstand has a copy of every newspaper from every publisher. The subscriber picks the ones they want and reads them. No delivery cost, but the subscriber has to travel to the newsstand and browse every time.
Hybrid: High-volume newspapers get delivered to your door. Niche publications are only available at the newsstand. Most subscribers get 90% of their reading delivered and occasionally visit the newsstand for the rest. The decision of what to deliver vs what to leave at the newsstand is the "threshold" in our hybrid fan-out model.
Solution Walkthrough
Fan-Out on Write (Push Model)
At post time, iterate the posting user's follower list and write a reference to the new post into each follower's feed inbox.
```typescript
async function post(userId: string, content: string): Promise<void> {
  const postId = await createPost(userId, content);
  const followers = await getFollowers(userId); // could be millions
  for (const followerId of followers) {
    await feedInbox.prepend(followerId, postId); // write to each inbox
  }
}

async function getFeed(userId: string): Promise<Post[]> {
  const postIds = await feedInbox.getRecent(userId, 50); // pre-computed
  return fetchPosts(postIds); // simple batch lookup
}
```
Read path: A user's feed is just reading their inbox. Pre-computed, sorted, fast. O(1) read (ignoring pagination). Feed load times are typically under 50ms because there's no computation at read time.
Write path: O(followers) writes per post. A user with 10M followers generates 10M writes per post. This is write amplification.
Storage: Every post reference is duplicated in N inboxes. If a celebrity posts and has 50M followers, that post ID is written 50M times. Assuming 100 bytes per feed entry (postId + metadata + timestamp), one celebrity post consumes 50M x 100B = 5GB of feed storage.
Best for: Accounts with small-to-medium follower counts (under 10K) where read performance matters more than write cost.
Async fan-out is non-negotiable
Never fan out synchronously inside the post() request handler. A user with 50K followers would block the HTTP response for seconds while inbox writes complete. Always enqueue a fan-out job and return immediately. The poster sees their own post via a "write-through" to their own feed, while the fan-out workers distribute to followers in the background.
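A minimal in-memory sketch of this flow, assuming hypothetical stores (`feedInbox`, `followersOf`) and a plain array standing in for the job queue; production would use a database plus SQS or Kafka:

```typescript
// Illustrative in-memory stores; names are assumptions, not a real API.
type FanOutJob = { postId: string; authorId: string };

const fanOutQueue: FanOutJob[] = [];
const feedInbox = new Map<string, string[]>();   // followerId -> postIds, newest first
const followersOf = new Map<string, string[]>(); // authorId -> followerIds

function prepend(userId: string, postId: string): void {
  const inbox = feedInbox.get(userId) ?? [];
  inbox.unshift(postId);
  feedInbox.set(userId, inbox);
}

// Request handler: O(1) work, returns immediately.
function post(authorId: string, postId: string): void {
  prepend(authorId, postId);              // write-through: author sees own post now
  fanOutQueue.push({ postId, authorId }); // followers get it asynchronously
}

// Background worker: drains the queue and does the O(followers) writes.
function runFanOutWorker(): void {
  let job: FanOutJob | undefined;
  while ((job = fanOutQueue.shift()) !== undefined) {
    for (const followerId of followersOf.get(job.authorId) ?? []) {
      prepend(followerId, job.postId);
    }
  }
}
```

The key property: `post()` does constant work no matter the follower count, and the author's own feed is consistent immediately, while follower inboxes converge once the worker runs.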
Fan-Out on Read (Pull Model)
At read time, look up all accounts the user follows, fetch recent posts from each, and merge-sort them.
```typescript
async function post(userId: string, content: string): Promise<void> {
  await createPost(userId, content); // one write, done
}

async function getFeed(userId: string): Promise<Post[]> {
  const following = await getFollowing(userId); // 200 accounts
  const postLists = await Promise.all(
    following.map(id => getRecentPosts(id, 20))
  );
  return mergeSorted(postLists).slice(0, 50);
}
```
Read path: O(following) fetches + N-way merge. If a user follows 200 accounts, that's 200 parallel queries and a 200-way merge at read time. Even with parallelism and caching, this takes 100-500ms depending on infrastructure.
Write path: One write: create the post. No fan-out. O(1). Instant.
Storage: Each post is stored once. No duplication. A celebrity with 50M followers and 10 posts/day has the same storage footprint as any other user.
Best for: Systems where write cost must be minimized. Also works when follower counts are very high (celebrities) because there's zero write amplification regardless of follower count. The trade-off is clear: you pay nothing at write time but pay dearly at read time.
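A sketch of the `mergeSorted` step used in the read path, assuming each input list is already sorted newest-first by original post timestamp. Concatenate-and-sort is shown for clarity; a real system merging many long lists would use a heap-based k-way merge:

```typescript
interface FeedEntry {
  postId: string;
  timestamp: number; // original post time, not inbox insertion time
}

// Merge N timestamp-sorted lists into one, newest first.
function mergeSorted(lists: FeedEntry[][]): FeedEntry[] {
  return lists
    .flat()
    .sort((a, b) => b.timestamp - a.timestamp);
}

const merged = mergeSorted([
  [{ postId: "a2", timestamp: 200 }, { postId: "a1", timestamp: 100 }],
  [{ postId: "b1", timestamp: 150 }],
]);
// merged order: a2 (200), b1 (150), a1 (100)
```

Sorting by original timestamp (not arrival order) is what keeps the merged feed consistent when lists come from sources with different latencies.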
The Hybrid Approach
Neither pure model works at scale. The real solution is to use both, switching between them based on a follower count threshold.
```typescript
const CELEBRITY_THRESHOLD = 10_000; // above this follower count, switch to pull

async function post(userId: string, content: string): Promise<void> {
  const postId = await createPost(userId, content);
  const followerCount = await getFollowerCount(userId);
  if (followerCount <= CELEBRITY_THRESHOLD) {
    // Normal user: fan-out on write (push to all follower inboxes)
    const followers = await getFollowers(userId);
    await fanOutWorker.enqueue({ postId, followers });
  } else {
    // Celebrity: mark for pull at read time
    await markAsCelebrityPost(postId, userId);
  }
}

async function getFeed(userId: string): Promise<Post[]> {
  // Step 1: Read pre-computed inbox (pushed posts from normal accounts)
  const pushedPosts = await feedInbox.getRecent(userId, 50);

  // Step 2: Pull celebrity posts in real-time
  const celebsFollowed = await getCelebrityFollowing(userId);
  const celebPosts = await Promise.all(
    celebsFollowed.map(id => getRecentPosts(id, 10))
  );

  // Step 3: Merge and return
  return mergeSorted([pushedPosts, ...celebPosts]).slice(0, 50);
}
```
The threshold is the key design decision. Twitter reportedly uses around 10K followers as the dividing line. Instagram's threshold may differ. The exact number depends on your system's write throughput capacity and acceptable fan-out latency.
Why this works: Most accounts have under 10K followers. The 99% case uses fan-out on write (fast reads). Only celebrities (the top fraction of a percent) use fan-out on read. At read time, the user's feed merges the pre-computed inbox with a small number of real-time celebrity fetches. If a user follows 5 celebrities and 195 normal accounts, the read path does 1 inbox read + 5 celebrity fetches. That's 6 operations instead of 200.
Why the threshold matters
The threshold is a tuning parameter, not a fixed constant. Set it based on your system's write throughput capacity. If your fan-out workers can sustain 500K inbox writes/second total, and you process 1,000 posts/second, the average fan-out budget per post is 500 writes (500K / 1K). So your threshold might be 500, not 10K. The exact number depends on your infrastructure.
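That back-of-envelope calculation can be written down directly. A sketch; the inputs are whatever your capacity planning produces:

```typescript
// Average fan-out budget per post = sustainable inbox writes/sec / posts/sec.
// Using this as the celebrity threshold keeps the fan-out workers at or
// below their write capacity on average.
function fanOutBudgetPerPost(
  inboxWritesPerSec: number,
  postsPerSec: number
): number {
  return Math.floor(inboxWritesPerSec / postsPerSec);
}

// The numbers from the text: 500K writes/sec capacity, 1,000 posts/sec.
const threshold = fanOutBudgetPerPost(500_000, 1_000); // 500
```

In practice you would set the threshold below this average budget to leave headroom for bursts, since post rates are far from uniform across the day.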
SNS/SQS Fan-Out (Infrastructure Pattern)
The same fan-out concept appears at the infrastructure level. AWS SNS + SQS implements fan-out on write: when a message is published to an SNS topic with N SQS subscribers, SNS delivers a copy to each queue. This is commonly used for decoupling microservices.
```
SNS Topic: "order-created"
  → SQS Queue: "order-fulfillment"   (warehouse team)
  → SQS Queue: "order-analytics"     (data team)
  → SQS Queue: "order-notifications" (comms team)
```
Kafka consumer groups are the pull-side equivalent: a single topic, multiple consumer groups, each group independently reading all messages. The broker handles the fan-out internally.
The same trade-offs apply: SNS/SQS fan-out has write amplification (N copies), while Kafka consumer groups read from a shared log (no data duplication, but each group has its own offset tracking).
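The pull-side difference can be illustrated with a toy shared log: messages exist once, and each consumer group only tracks its own read position. This is a drastic simplification of what a Kafka broker does (no partitions, no rebalancing), but it shows why there is no data duplication:

```typescript
// One shared log; no per-consumer copies of the messages.
const log: string[] = ["m1", "m2", "m3"];

// Each consumer group stores only an offset into the shared log.
const offsets = new Map<string, number>([
  ["order-fulfillment", 0],
  ["order-analytics", 0],
]);

function poll(group: string, max: number): string[] {
  const offset = offsets.get(group) ?? 0;
  const batch = log.slice(offset, offset + max);
  offsets.set(group, offset + batch.length); // commit the new offset
  return batch;
}

// Both groups independently read all messages from the same log.
const fulfillmentBatch = poll("order-fulfillment", 10); // ["m1", "m2", "m3"]
const analyticsBatch = poll("order-analytics", 2);      // ["m1", "m2"]
```

Contrast with SNS/SQS fan-out, where each subscriber queue would hold its own physical copy of every message.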
Implementation Sketch
Here's the fan-out worker that handles the write-side distribution:
```typescript
// Fan-out worker processing jobs from a queue
async function processFanOutJob(job: FanOutJob): Promise<void> {
  const { postId, followers } = job;
  const BATCH_SIZE = 1000;
  for (let i = 0; i < followers.length; i += BATCH_SIZE) {
    const batch = followers.slice(i, i + BATCH_SIZE);
    await Promise.all(
      batch.map(followerId =>
        feedInbox.prepend(followerId, {
          postId,
          timestamp: Date.now(),
          authorId: job.authorId,
        })
      )
    );
  }
}
```
Key detail: the fan-out happens asynchronously via a job queue (SQS, Kafka). The post creation returns immediately to the user. The fan-out worker distributes the post reference to follower inboxes in batches. For a user with 100K followers, this might take 5-10 seconds. The author sees their post instantly; followers see it within a few seconds.
Inbox Data Model
The feed inbox is typically a Redis sorted set per user, keyed by user ID and scored by timestamp:
| Operation | Redis Command | Complexity |
|---|---|---|
| Push post to inbox | ZADD user:{id}:feed {timestamp} {postId} | O(log N) |
| Read feed (latest 50) | ZREVRANGEBYSCORE user:{id}:feed +inf -inf LIMIT 0 50 | O(log N + 50) |
| Trim inbox to 500 | ZREMRANGEBYRANK user:{id}:feed 0 -501 | O(log N + K) |
Each inbox entry is small (post ID + timestamp = ~16 bytes), so a full inbox of 500 entries is ~8KB of raw data. For 50M active users with hot inboxes, that's ~400GB of raw entry data; Redis sorted sets add per-element overhead on top, so budget a Redis cluster of 10-15 nodes (each with 32-64GB RAM) or more.
The trim operation runs after each push or on a periodic schedule. It caps inbox size to prevent unbounded growth. Posts trimmed from the inbox are not lost; they still exist in the posts table and can be fetched on demand for users who scroll far back.
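The push-plus-trim semantics can be sketched in memory. Redis handles this with ZADD and ZREMRANGEBYRANK; the class below merely mimics the behavior to make the bound explicit:

```typescript
interface InboxEntry {
  postId: string;
  timestamp: number;
}

// In-memory stand-in for a per-user Redis sorted set with a size cap.
class BoundedInbox {
  private entries: InboxEntry[] = []; // kept sorted newest-first
  constructor(private readonly maxSize: number) {}

  push(entry: InboxEntry): void {
    this.entries.push(entry);
    this.entries.sort((a, b) => b.timestamp - a.timestamp);
    if (this.entries.length > this.maxSize) {
      this.entries.length = this.maxSize; // trim: drop oldest beyond the cap
    }
  }

  latest(n: number): InboxEntry[] {
    return this.entries.slice(0, n);
  }
}

const inbox = new BoundedInbox(3);
for (let t = 1; t <= 5; t++) inbox.push({ postId: `p${t}`, timestamp: t });
// Only p5, p4, p3 remain; p1 and p2 were trimmed (but still exist in the posts table).
```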
When It Shines
- Social media feed systems: Instagram, Twitter/X, and Facebook all use hybrid fan-out for their timelines. This is the canonical use case.
- Notification systems: When an event triggers notifications to many users (e.g., a product goes on sale and 100K users have it on their wishlist), fan-out on write pushes notifications to user inboxes.
- SNS fan-out to multiple SQS queues: AWS SNS topic publishes to multiple SQS subscribers. Each subscriber gets its own copy. This is fan-out on write at the infrastructure level.
- Kafka topic fan-out with consumer groups: One topic, multiple consumer groups. Each group gets every message independently. The broker handles the fan-out.
- Activity feeds in enterprise apps: Slack channels, GitHub notification feeds, Jira activity streams. Same push/pull trade-off applies at smaller scale.
- Content recommendation pipelines: When a new piece of content is published, pre-compute "this might interest you" entries for targeted user segments. Similar to feed fan-out but filtered by interest signals rather than follow relationships.
The pattern is most valuable when you have a power-law distribution of content producers (few high-output accounts, many low-output accounts) and need to balance write cost against read latency.
When to Avoid Fan-Out on Write Entirely
Fan-out on write is the wrong choice when:
- Every feed view requires ranking: If you're building a TikTok-style "For You" page where 100% of content is algorithmically ranked from a large candidate pool, there's no stable "follow" relationship to pre-compute against. Pull at read time is the only option.
- Content has a very short half-life: If posts are only relevant for seconds (live sports scores, stock prices), pre-computing feeds is wasteful because most entries expire before being read.
- The graph is fully connected: If every user follows every other user (small team chat, all-hands channel), fan-out on write means every post is written to every inbox. At that point, a shared stream (Kafka topic, database table) is simpler.
- Follow relationships change rapidly: If users frequently follow/unfollow (ephemeral content platforms, event-based communities), the cost of maintaining materialized inboxes through constant follow-graph changes outweighs the read performance benefit.
Failure Modes & Pitfalls
1. The celebrity hotspot: A celebrity with 50M followers posts during peak hours. The fan-out worker generates 50M write operations. If your feed inbox storage (Redis, Cassandra) can handle 100K writes/second, this one post takes 8+ minutes to distribute. During that time, the fan-out worker queue backs up and other users' posts are delayed. The fix: detect celebrities and switch to pull model above the threshold.
2. Fan-out amplification factor: If the average user follows 200 accounts, each posting twice per day, each user's inbox receives roughly 400 new entries per day, and every post created generates ~200 inbox writes. Platform-wide, total inbox writes exceed total post creations by a factor equal to the average follower count, which is orders of magnitude. Track your amplification factor (total inbox writes / total posts created) as a health metric. I've seen production systems where this ratio exceeded 1000x during viral events, and the fan-out worker queue lag grew to minutes.
3. Stale feed after follow/unfollow: When a user follows a new account, their inbox doesn't contain historical posts from that account. The feed has a gap. Fix: on follow, backfill the last N posts from the newly followed account into the inbox. On unfollow, lazily filter out posts from the unfollowed account at read time (don't delete from inbox, too expensive). This backfill operation needs rate limiting to prevent a user who follows 100 accounts in rapid succession from generating a burst of 100 x N backfill writes.
4. Memory pressure from large inboxes: If you store feed inboxes in Redis, each inbox consuming 10KB per user at 500M users = 5TB of Redis memory. Cost becomes significant. Common optimization: only store the latest 200-500 feed entries per inbox and trim older ones.
5. Ordering inconsistency in hybrid: Pushed posts arrive at different times than pulled celebrity posts. A celebrity posted 2 minutes ago but the push for a normal user happened 5 minutes ago. The merge at read time must sort by original post timestamp, not inbox insertion time.
6. Thundering herd on celebrity read: In the pull model, when a celebrity posts and millions of followers simultaneously open their feeds, every feed assembly request queries that celebrity's recent posts. This creates a read hotspot on the celebrity's post storage. The fix: cache the celebrity's recent posts at the feed assembly layer with a short TTL (5-10 seconds). All concurrent readers hit the cache instead of the underlying storage.
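A sketch of that short-TTL cache, with an injectable clock so the expiry logic is testable; the function passed to `get` stands in for the query against the celebrity's post storage:

```typescript
// Short-TTL read-through cache: concurrent readers within the TTL window
// share one cached result instead of all hitting post storage.
class TtlCache<T> {
  private value: T | undefined;
  private expiresAt = 0;

  constructor(
    private readonly ttlMs: number,
    private readonly now: () => number = Date.now
  ) {}

  get(fetch: () => T): T {
    if (this.value === undefined || this.now() >= this.expiresAt) {
      this.value = fetch();                    // miss: hit the backing store
      this.expiresAt = this.now() + this.ttlMs;
    }
    return this.value;                         // hit: storage is not touched
  }
}

// Count how often the backing store is actually queried.
let fakeNow = 0;
let storageReads = 0;
const celebCache = new TtlCache<string[]>(5_000, () => fakeNow);
const readCelebPosts = () =>
  celebCache.get(() => { storageReads++; return ["post1"]; });

readCelebPosts(); readCelebPosts(); readCelebPosts(); // 3 reads in TTL window -> 1 storage query
fakeNow = 6_000;
readCelebPosts();                                     // TTL expired -> second storage query
```

With a 5-second TTL, a million concurrent followers refreshing their feeds produce on the order of one storage read per celebrity per 5 seconds instead of a million.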
Watch the amplification factor
Track your fan-out amplification ratio: total inbox writes per day divided by total posts per day. A healthy ratio for a hybrid system is 50-200x. If it exceeds 500x, your threshold is too high (too many users classified as "normal" who should be celebrities). If it's under 10x, you might be doing too much pull work at read time. This metric is your early warning system.
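The metric itself is a one-line division; the health bands below are a hypothetical classifier following the guidance above:

```typescript
// Amplification factor = total inbox writes / total posts created.
function amplificationFactor(inboxWrites: number, postsCreated: number): number {
  return inboxWrites / postsCreated;
}

// Hypothetical health bands from the guidance above; tune to your system.
type FanOutHealth = "pull-heavy" | "acceptable" | "threshold-too-high";

function classify(factor: number): FanOutHealth {
  if (factor < 10) return "pull-heavy";          // too much read-time work
  if (factor <= 500) return "acceptable";        // healthy hybrid range
  return "threshold-too-high";                   // too many "normal" users
}

classify(amplificationFactor(10_000_000, 100_000)); // 100x -> "acceptable"
```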
Trade-offs
| Dimension | Fan-out on Write | Fan-out on Read | Hybrid |
|---|---|---|---|
| Read latency | Very low (pre-computed) | High (200+ queries per read) | Low (inbox + few celebrity pulls) |
| Write cost | O(followers) per post | O(1) per post | O(followers) for normals, O(1) for celebs |
| Storage | High (duplicated per inbox) | Low (single copy) | Medium |
| Celebrity handling | Catastrophic | Fine | Fine |
| Feed freshness | Eventual (fan-out delay) | Real-time | Mostly eventual, celebs real-time |
| Complexity | Low | Low | Medium (threshold logic, dual path) |
The fundamental tension is storage + write cost vs compute + read latency. Fan-out on write pays upfront (lots of storage, lots of writes) for cheap reads. Fan-out on read pays at runtime (lots of compute, higher read latency) for cheap writes. Neither extreme is correct at scale. The hybrid is the only architecture that works for a platform with both normal users and celebrities.
Cost Comparison at Scale
Consider a platform with 500M users, 50M daily posts, average 200 followers per author:
| Metric | Fan-out on Write | Fan-out on Read |
|---|---|---|
| Daily inbox writes | 10 billion (50M x 200) | 0 |
| Daily feed reads (1B feed views) | 1B inbox lookups | 200B post fetches (1B x 200) |
| Write throughput (daily average; peaks are several times higher) | ~115K writes/sec | ~580 writes/sec (posts only) |
| Read throughput (daily average) | ~11.5K reads/sec | ~2.3M reads/sec |
| Redis memory (500 entries/user, hot tier) | ~2 TB (50M hot users, incl. sorted-set overhead) | ~0 (no inboxes) |
| Posts table storage | Same for both | Same for both |
The hybrid cuts write throughput by 80-90% (celebrities removed from fan-out) while keeping read throughput close to the pure push model. This is the sweet spot.
Real-World Usage
Twitter/X is the canonical case study for hybrid fan-out. Twitter's original architecture used fan-out on write exclusively. When a user tweeted, the tweet reference was pushed to every follower's timeline cache (stored in Redis). This worked well until high-follower accounts (celebrities, politicians, media outlets) created massive write amplification. A single tweet from an account with 80M followers generated 80M Redis writes.
Twitter's solution, described in their engineering blog and Raffi Krikorian's talks, was the hybrid approach: tweets from users with fewer than ~10K followers are fanned out on write; tweets from high-follower accounts are pulled at read time and merged with the pre-computed timeline. At peak, Twitter's fan-out service processed over 300K requests per second. The fan-out worker fleet was one of the largest services at Twitter, consuming significant compute resources even with the celebrity optimization in place.
Instagram uses a similar hybrid approach for its photo and story feeds. Instagram's engineering blog describes their feed ranking system, which combines pre-computed candidate generation (fan-out on write for followed accounts under the threshold) with real-time ranking at read time. Instagram's challenge is compounded by their ranking algorithm: unlike a simple reverse-chronological feed, Instagram ranks posts by predicted engagement, which requires pulling candidate posts and scoring them at read time regardless of how they were distributed. When Instagram switched from chronological to ranked feeds in 2016, it shifted more work to the read path but kept fan-out on write for candidate generation. The inbox became a "candidate pool" rather than the final feed.
Facebook took the pull model further with their TAO (The Associations and Objects) framework. Facebook's News Feed is not a simple chronological timeline. It's a ranked, algorithmically curated feed. Because every feed view requires ranking, Facebook computes feeds primarily at read time, pulling candidate posts from followed friends and pages, scoring them, and returning the top results. They cache aggressively to reduce the compute cost of repeated reads, but the core architecture is pull-dominant. Facebook published that their feed ranking system evaluates thousands of candidate posts per feed load, filtering down to ~50 shown posts.
Materialized timelines
All three companies use a concept called a materialized timeline: a pre-computed, denormalized view of a user's feed stored separately from the canonical posts table. This is essentially the "inbox" we've been discussing. The materialized timeline is an application of CQRS: the write model (posts table) is optimized for creating posts, while the read model (materialized timeline) is optimized for reading feeds. When the timeline becomes stale or is evicted from cache, it's reconstructed from the write model.
How This Shows Up in Interviews
Fan-out is the central design decision in every "Design Twitter," "Design Instagram," or "Design a social feed" interview question. If you get one of these questions, the interviewer is specifically waiting to hear you discuss the push/pull trade-off and the celebrity problem.
My recommendation: start with fan-out on write (it's simpler), explain the celebrity problem as you scale, then introduce the hybrid approach. This shows your design evolves with requirements, which is exactly what interviewers want to see.
The celebrity threshold shows scale thinking
When you say "users above 10K followers switch to fan-out on read," you demonstrate that you've thought about worst-case write amplification and designed a system that handles power-law distributions. This is the #1 signal interviewers look for in feed system design.
Depth expected at senior/staff level:
- Explain both pure models with their O() complexity on read and write.
- Identify the celebrity/hotspot problem without being prompted.
- Describe the hybrid approach with a specific threshold (e.g., 10K followers).
- Know that the fan-out happens asynchronously via a job queue, not synchronously during the post request.
- Explain how feed ranking interacts with fan-out (ranked feeds lean toward pull because scoring happens at read time).
- Discuss storage trade-offs: Redis inbox per user, trimming old entries, cache eviction.
- Discuss the Kafka batching strategy: one fan-out job per batch of 10K followers, not one message per follower.
Common follow-up questions and strong answers:
| Interviewer asks | Strong answer |
|---|---|
| "What happens when a celebrity with 50M followers posts?" | "We don't fan out at all. The post is stored once. At read time, any user following that celebrity pulls their recent posts and merges with the pre-computed inbox. The celebrity's post reaches 50M followers without 50M writes." |
| "How do you handle a user who follows 1,000 accounts?" | "With hybrid, most of those 1,000 accounts' posts are pre-pushed to the inbox. Only the handful of celebrity accounts are pulled at read time. If the user follows 5 celebrities and 995 normal users, the read path does 1 inbox read + 5 celebrity fetches, not 1,000 queries." |
| "How does your system handle a new follower seeing historical posts?" | "On follow, we backfill the last 20 posts from the newly followed account into the user's inbox. This is a one-time cost. For unfollows, we lazily filter at read time rather than removing entries from the inbox, since inbox deletion across millions of entries is expensive." |
| "What storage system do you use for the feed inbox?" | "Redis sorted sets, keyed by userId, scored by timestamp. ZADD for writes, ZREVRANGEBYSCORE for reads. Trim to latest 500 posts per inbox to bound memory. For cold users (haven't logged in recently), evict from Redis and reconstruct from the posts table on next login." |
| "How does this change for a ranked feed vs chronological?" | "Ranked feeds lean toward fan-out on read because ranking requires scoring candidates at read time anyway. You still push candidate posts to inboxes (generates the candidate list), but the read path fetches candidates, scores them, and returns the top-ranked results. Instagram and Facebook both use this approach." |
Don't forget the async worker fleet
A common interview mistake is to describe fan-out on write as synchronous. "The user posts, and we write to all 10K inboxes before returning a response." No. The post response returns immediately. The fan-out is a background job. If you describe it as synchronous, the interviewer will assume you don't understand production-scale systems. Always mention the job queue (SQS, Kafka, Redis Streams) and the worker fleet.
Test Your Understanding
Quick Recap
- Fan-out on write pushes posts to every follower's inbox at write time, giving O(1) reads but O(followers) writes per post.
- Fan-out on read assembles feeds at read time by pulling from all followed accounts, giving O(1) writes but O(following) reads.
- The celebrity/hotspot problem makes pure fan-out on write impractical: one 50M-follower post generates 50M inbox writes.
- The hybrid approach (push for normal users, pull for celebrities above a threshold like 10K followers) is the standard production architecture.
- Fan-out on write happens asynchronously via worker queues, not synchronously during the post request.
- Storage trade-offs are significant: fan-out on write duplicates every post reference across N inboxes, requiring careful memory management (inbox trimming, tiered storage).
- Ranked feeds push toward read-time computation regardless of fan-out model, since ranking signals are dynamic.
- Infrastructure fan-out (SNS to SQS, Kafka consumer groups) follows the same push/pull trade-offs as application-level feed fan-out.
- Track the amplification factor (total inbox writes / total posts) as a health metric. Values over 500x suggest the celebrity threshold is too high.
Related Patterns
- Message queues: The fan-out worker processes jobs from a queue. Understanding queue semantics (ordering, at-least-once delivery, consumer groups) is prerequisite knowledge for fan-out systems.
- Caching: Feed inboxes are essentially caches of materialized timelines. Cache eviction, trimming, and reconstruction on miss follow standard caching patterns.
- Competing consumers: Fan-out workers are a classic competing consumer pattern. Multiple workers process fan-out jobs from the same queue in parallel.
- CQRS: The feed system is a natural CQRS split. The write model (posts table) and the read model (feed inboxes) are separate, optimized for their respective access patterns.