How TikTok's For You Page works
TikTok's For You Page uses video completion rate as its primary signal, feeds that into a real-time recommendation model, and bootstraps new users with trending content before personalization kicks in. Here's what's known about how it works.
The Problem Statement
Interviewer: "You open TikTok and immediately see a feed of videos that feel eerily personalized, even though you never told TikTok what you like. Walk me through how the For You Page decides what to show you. What signals does it use, and how does it work for a brand-new user who has no history?"
This question tests three things: whether you understand why completion rate is a stronger signal than likes or follows, how a multi-stage recommendation pipeline works (candidate generation, ranking, re-ranking), and how a system bootstraps personalization for users with zero history.
Most candidates jump to "collaborative filtering" and describe the Netflix Prize approach. That misses the key insight: TikTok's recommendation engine is fundamentally different from Spotify or Netflix because it prioritizes real-time behavioral signals (watch time, replays, scroll speed) over historical taste profiles. The strong answer explains why this design choice matters, how the pipeline is structured, and what happens in the first 10 minutes of a new user's experience.
Clarifying the Scenario
You: "Great question. Before I walk through the architecture, I want to clarify a few things."
You: "When you say 'how the For You Page decides what to show,' are you asking about the ML model internals (embeddings, loss functions), or the system architecture that serves recommendations in real time?"
Interviewer: "Both, but lean toward how the system is built. I want to understand the full pipeline from signal collection to feed ranking."
You: "Got it. Should I also cover the creator distribution side, how TikTok decides to give a new video wider exposure?"
Interviewer: "Yes, that is part of it. The creator side is interesting."
You: "OK. And is the scope mobile only, or should I consider web?"
Interviewer: "Mobile. That is where 95%+ of usage happens."
You: "Perfect. I will structure my answer in five parts: why completion rate is the primary signal, how TikTok ingests behavioral signals in real time, the multi-stage recommendation pipeline (candidate generation, ranking, re-ranking with diversity), how cold start works for both new users and new videos, and how creators get their content distributed through the graduated exposure system."
My Approach
I break this into five parts:
- Why completion rate is the primary signal: Likes and follows are social signals with noise. Watching a 15-second video to completion is a high-purity intent signal that is hard to fake.
- Real-time feature ingestion: TikTok processes behavioral signals (watch time, replays, shares, scroll speed) within seconds to minutes, rather than in a periodic batch job like the weekly one behind Spotify's Discover Weekly.
- The multi-stage ranking pipeline: Candidate generation pulls hundreds of videos from multiple sources, a ranking model scores them, and a re-ranking pass injects diversity and freshness.
- Cold start and the exploration-exploitation tradeoff: New users see trending content first, and TikTok rapidly personalizes based on the first few interactions. New videos get a test pool of viewers before graduating to wider distribution.
- Creator-side distribution mechanics: The graduated exposure system (small test pool, measure completion rate, scale or suppress) is what makes TikTok's creator ecosystem work.
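The graduated exposure loop in the last bullet (small test pool, measure completion rate, scale or suppress) can be sketched as a simple tier ladder. This is a minimal illustration, not TikTok's actual logic: the pool sizes, the 0.40 threshold, and the names `POOL_SIZES`, `PROMOTE_THRESHOLD`, `PoolResult`, and `next_pool_size` are all assumptions made up for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical audience tiers and promotion threshold, for illustration only.
POOL_SIZES = [300, 3_000, 30_000, 300_000]
PROMOTE_THRESHOLD = 0.40  # assumed minimum completion rate needed to graduate


@dataclass
class PoolResult:
    views: int
    completions: int

    @property
    def completion_rate(self) -> float:
        # Guard against a pool that has not been served yet.
        return self.completions / self.views if self.views else 0.0


def next_pool_size(current_tier: int, result: PoolResult) -> Optional[int]:
    """Return the next audience size, or None if wider distribution stops."""
    if result.completion_rate < PROMOTE_THRESHOLD:
        return None  # suppress: the test pool did not finish the video often enough
    if current_tier + 1 >= len(POOL_SIZES):
        return POOL_SIZES[-1]  # already at the widest tier, keep serving it
    return POOL_SIZES[current_tier + 1]
```

A video that 180 of 300 test viewers finish would graduate to the next tier here, while one that only 60 finish would stop at the test pool.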
The Architecture
Here is how the pipeline works end to end:
- Event collection is continuous. Every interaction you have with a video (how long you watched, whether you replayed it, how fast you scrolled past, whether you shared it) streams as an event to Kafka within milliseconds. This is not batch processing. TikTok's edge is real-time signal ingestion.
- Feature computation happens in a streaming pipeline. Flink or a similar stream processor aggregates raw events into features: your rolling completion rate for different content categories, your average session depth, which audio tracks you replay. These features update every few minutes in the feature store.
- Candidate generation pulls from multiple channels. The system retrieves roughly 1,000 candidate videos from several sources: videos similar to what you recently engaged with (embedding similarity), trending content in your geographic region, content from creators you follow, and an "exploration" bucket of content you would not normally see.
- The ranking model scores each candidate. A deep neural network predicts multiple objectives: probability of completion, probability of sharing, probability of liking. These are combined into a weighted score. Completion rate gets the highest weight because it is the strongest signal of genuine interest.
- Re-ranking injects diversity and enforces policies. Without this step, the feed would be a monotonous stream of similar content. The re-ranker enforces rules: no back-to-back videos from the same creator, genre diversity across the session, an exploration budget (5-10% of feed slots go to content outside your usual taste), and content safety filters.
- The client prefetches the top 3 videos so there is zero load time when you swipe.
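One of the re-ranking rules in the list above, no back-to-back videos from the same creator, can be sketched as a greedy pass over the already-ranked list. This is an illustrative sketch under assumptions: the `Video` type and `rerank_no_back_to_back` are hypothetical names, and a production re-ranker would combine this with the other rules (genre diversity, exploration budget, safety filters).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Video:
    video_id: str
    creator_id: str
    score: float


def rerank_no_back_to_back(ranked: list[Video]) -> list[Video]:
    """Keep score order where possible, deferring a video whose creator
    matches the previous feed slot."""
    remaining = list(ranked)
    feed: list[Video] = []
    while remaining:
        last_creator = feed[-1].creator_id if feed else None
        # Take the highest-ranked video from a different creator. If every
        # remaining video is from the same creator, fall back to the top one.
        pick = next(
            (v for v in remaining if v.creator_id != last_creator),
            remaining[0],
        )
        remaining.remove(pick)
        feed.append(pick)
    return feed
```

The fallback means the rule is best-effort: when only one creator's videos are left, they are emitted consecutively rather than dropped.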
TikTok processes over 1 billion video views per day. The recommendation pipeline must return a ranked feed in under 200ms per request. This latency constraint is why the pipeline uses a multi-stage funnel (1,000 candidates, narrowed to 30) rather than scoring every video in the catalog.
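The ranking and narrowing step of the funnel can be sketched as a weighted sum over the predicted probabilities followed by a top-k selection. The weights below are invented for illustration; the only property taken from the text is that completion is weighted highest. `WEIGHTS`, `Candidate`, `combined_score`, and `top_k` are hypothetical names.

```python
import heapq
from dataclasses import dataclass

# Hypothetical weights: completion dominates, as described in the text.
WEIGHTS = {"p_complete": 0.6, "p_share": 0.25, "p_like": 0.15}


@dataclass
class Candidate:
    video_id: str
    p_complete: float  # predicted probability of watching to completion
    p_share: float     # predicted probability of sharing
    p_like: float      # predicted probability of liking


def combined_score(c: Candidate) -> float:
    """Collapse the model's per-objective predictions into one ranking score."""
    return (
        WEIGHTS["p_complete"] * c.p_complete
        + WEIGHTS["p_share"] * c.p_share
        + WEIGHTS["p_like"] * c.p_like
    )


def top_k(candidates: list[Candidate], k: int = 30) -> list[Candidate]:
    """Narrow the candidate set (about 1,000 in the text) to the k best."""
    return heapq.nlargest(k, candidates, key=combined_score)
```

`heapq.nlargest` avoids sorting the full candidate list, which matters little at 1,000 items but shows the shape of the narrowing step.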
Completion Rate and Real-Time Signal Processing
This is the section most people underestimate. Completion rate is not just "another engagement metric." It is the fundamental design insight behind TikTok's recommendation quality.
Consider the difference: a user can like a video for social reasons (their friend posted it, it is culturally relevant, they want the creator to see it). A user can follow a creator and then never watch their content. But watching a 15-second video to completion, especially rewatching it, is a strong behavioral signal that is extremely hard to fake at scale.
TikTok's system tracks far more than binary "watched" vs "skipped." The raw signals include:
| Signal | Format | Weight in Model | Why It Matters |
|---|---|---|---|
| Watch time / video length | Ratio (0.0 to 3.0+) | Highest | Completion and replay are strongest intent signals |
| Replay count | Integer | Very high | Rewatching means strong positive engagement |
| Share | Boolean + platform | High | Sharing to a friend is a high-effort action |
| Comment | Boolean + text | Medium-high | Comments indicate strong emotional response |
| Like | Boolean | Medium | Can be social/reciprocal, lower signal purity |
| Scroll speed past video | Float (px/sec) | Medium | Fast scroll = negative signal, slow scroll = interest |
| Follow after watching | Boolean | Medium | Strong signal but rare event |
| Profile visit | Boolean | Low-medium | Curiosity about creator |
| Audio page visit | Boolean | Low | Interest in the audio track/meme |
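The first row of the table, watch time divided by video length, is easy to make concrete. The sketch below assumes times are measured in seconds and that a ratio of 1.0 or more counts as a completion; the function names `watch_ratio` and `is_completion` are hypothetical.

```python
def watch_ratio(watch_time_s: float, video_length_s: float) -> float:
    """Total time spent on the video divided by its length.

    Values above 1.0 mean the viewer looped the video at least partly.
    """
    if video_length_s <= 0:
        raise ValueError("video_length_s must be positive")
    # Clamp negative watch times (bad client data) to zero.
    return max(watch_time_s, 0.0) / video_length_s


def is_completion(ratio: float) -> bool:
    """Assumed rule: one full play or more counts as a completion."""
    return ratio >= 1.0
```

Under this definition, 45 seconds spent on a 15-second video yields a ratio of 3.0, which matches the upper end of the range shown in the table.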
The key architectural insight: TikTok's feature store is near-real-time, not batch. When I watch a cooking video to completion right now, the very next batch of recommendations (fetched when I scroll past the current buffer) already reflects that signal. This is why TikTok feels like it "reads your mind" after just a few videos.
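A near-real-time feature of this kind can be approximated in a few lines as a sliding-window aggregate. This is a single-process sketch, not a stream processor: it assumes events arrive in timestamp order, uses a 10-minute window chosen arbitrarily, and the class name `RollingCompletionRate` is hypothetical.

```python
from collections import defaultdict, deque


class RollingCompletionRate:
    """Per-category completion rate over the last `window_s` seconds."""

    def __init__(self, window_s: float = 600.0) -> None:
        self.window_s = window_s
        # category -> deque of (timestamp, completed), oldest first
        self._events = defaultdict(deque)

    def record(self, category: str, ts: float, completed: bool) -> None:
        """Append one view event; assumes timestamps are non-decreasing."""
        self._events[category].append((ts, completed))

    def rate(self, category: str, now: float) -> float:
        events = self._events[category]
        # Evict events that have aged out of the window.
        while events and events[0][0] < now - self.window_s:
            events.popleft()
        if not events:
            return 0.0
        return sum(1 for _, done in events if done) / len(events)
```

Because old events fall out of the window, a burst of completed cooking videos moves the cooking rate within the same session instead of being averaged against long-term history.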
For your interview: emphasize that the feedback loop is minutes, not hours. This separates TikTok from systems like Spotify's Discover Weekly (batch, once per week) or Netflix's homepage (updated every few hours).