Caching vs. freshness
The tradeoff between serving data fast from a cache vs. serving data that reflects the latest state, covering TTL strategies, cache invalidation patterns, and how to decide which data can be cached and for how long.
TL;DR
| Data type | Cache aggressively | Serve fresh | Why |
|---|---|---|---|
| Account balance | Never cache during transactions | Always from primary DB | Showing stale balance during a transfer causes support tickets |
| Product catalog | Cache 1-5 min TTL, invalidate on edit | Not needed; changes are infrequent | 2M products, 100K views/min, 1000:1 read-to-write ratio |
| Social media like count | Cache 5-30s TTL | Not needed; approximate is fine | Nobody notices if a count is 5 seconds behind |
| News feed content | Cache 1-5 min with event-driven purge | After a user posts, show their own post immediately | Combination of caching + read-your-own-writes |
| Static assets (JS, CSS) | Immutable cache with content hashing | Never stale; hash changes on deploy | Cache-Control: immutable, max-age=31536000 |
Default instinct: cache everything, then carve out exceptions. Most data tolerates seconds to minutes of staleness. The exceptions (financial balances, inventory during checkout, real-time bidding) are the minority. Start by caching aggressively with TTL-based expiration, then add event-driven invalidation only for the data that truly needs it.
The Framing
Your e-commerce platform caches product data in Redis with a 1-hour TTL. A customer loads the product page and sees a price of $29.99. Twenty minutes ago, the merchandising team changed the price to $24.99. The customer adds the item to their cart, and the cart service reads the fresh price from the database: $24.99. The customer sees a different price on the product page and the cart. They call support.
This is the staleness problem. The product page served cached data that no longer matches the source of truth. The cart service (which reads the database directly) showed the real price. The user's experience fractured because two services had different views of the same data.
The naive fix: "stop caching product data." But without the Redis cache, every product page view hits the database. At 100K page views per minute, the database falls over within minutes. The cache was not optional; it was structural.
The real question is not "should I cache this data?" It is "how stale can this data be before a user is harmed, and what invalidation strategy matches that tolerance?"
I have seen teams swing between two extremes: caching everything with long TTLs (fast but stale) or caching nothing because invalidation is "too complex" (correct but slow). The answer is almost always in the middle, with different staleness tolerances per data type.
How Each Works
Caching: trade freshness for speed
Caching stores a computed result so subsequent requests skip the expensive computation. The most common pattern is cache-aside (lazy population): check the cache first, fall through to the database on a miss, and populate the cache for the next reader.
def get_product(product_id: str) -> Product:
    cache_key = f"product:{product_id}"

    # Check cache first
    cached = redis.get(cache_key)
    if cached:
        return deserialize(cached)

    # Cache miss: fetch from DB and populate the cache for the next reader
    product = db.find_product(product_id)
    redis.setex(cache_key, 300, serialize(product))  # 5-minute TTL
    return product
The speed benefit is dramatic. Redis responds in under 1ms. A PostgreSQL query with joins takes 5-50ms. At a 95% cache hit rate, 95 out of 100 requests are served 10-50x faster. The database handles only the 5% that miss the cache, reducing its load by 20x.
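The load-reduction arithmetic is worth being able to do on a whiteboard. A quick sketch (the numbers are illustrative, matching the 95% hit rate above):

```python
def effective_db_load(total_rpm: float, hit_rate: float) -> float:
    """Reads per minute that miss the cache and fall through to the database."""
    return total_rpm * (1 - hit_rate)

# At 100,000 reads/min and a 95% hit rate, only 5% reach the database,
# about 5,000 reads/min -- a 20x reduction in database load.
db_load = effective_db_load(100_000, 0.95)
reduction = 100_000 / db_load
```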
The cost: every cached value is a snapshot frozen in time. Until the TTL expires or the entry is explicitly invalidated, readers see the snapshot, not the current truth.
Freshness: always serve the source of truth
The "always-fresh" approach skips the cache entirely and reads from the primary database on every request. This guarantees that every response reflects the latest state.
def get_product_fresh(product_id: str) -> Product:
    # Always read from the primary database
    return db.find_product(product_id)
This is correct by definition but expensive. At scale, every read hits the database. Connection pools saturate. Query latency increases as load grows. You cannot horizontally scale a single-primary database beyond its hardware limits without adding read replicas (which introduce their own staleness via replication lag).
For your interview: never present "no caching" as a serious option for read-heavy data. The correct framing is choosing the right caching strategy and invalidation pattern, not choosing between caching and no caching.
Four cache interaction patterns
Cache patterns differ in who populates the cache and how writes reach the database.
| Pattern | Who populates cache | Write path | Staleness risk | Best for |
|---|---|---|---|---|
| Cache-aside | Application on read miss | App writes DB, invalidates/ignores cache | TTL window (seconds-hours) | Most use cases; simplest to implement |
| Read-through | Cache library on miss | Same as cache-aside | Same as cache-aside | Cache libraries with built-in loaders |
| Write-through | Cache on write | App writes to cache, cache writes to DB | Near-zero (cache always current) | Read-after-write consistency is critical |
| Write-behind | Cache on write, async DB flush | App writes cache only, cache flushes later | Data loss risk if cache crashes before flush | High write throughput, acceptable data loss |
Cache-aside is the default choice for 90% of use cases. Write-through when you need read-after-write consistency. Write-behind only when you can tolerate potential data loss.
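A minimal write-through sketch. The in-memory `FakeDB` and `FakeCache` stand-ins are illustrative, not a real client API; the point is that the write path updates both stores synchronously, so the writer's very next read is served current from the cache:

```python
class FakeDB:
    def __init__(self):
        self.rows = {}

    def update(self, key, value):
        self.rows[key] = value


class FakeCache:
    def __init__(self):
        self.entries = {}

    def set(self, key, value):
        self.entries[key] = value

    def get(self, key):
        return self.entries.get(key)


db, cache = FakeDB(), FakeCache()

def write_through(key: str, value: dict) -> None:
    # Write-through: the cache is updated in the same synchronous path
    # as the database, so this key is never stale for subsequent readers.
    db.update(key, value)
    cache.set(key, value)

def read(key: str):
    # Reads hit the cache; after a write-through, the writer
    # immediately sees their own update here.
    return cache.get(key)

write_through("product:42", {"price": 24.99})
```

Contrast with cache-aside, where the write path would delete the cache entry instead of setting it, and the next reader would repopulate it from the database.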
Head-to-Head Comparison
| Dimension | Aggressive caching | Always-fresh reads | Verdict |
|---|---|---|---|
| Read latency | Sub-ms from Redis | 5-50ms from database | Caching |
| Database load | Reduced 10-20x at 95% hit rate | Full load on every read | Caching |
| Data freshness | Stale by TTL window (seconds-hours) | Always current | Freshness |
| Consistency across services | Services may see different versions | All services see same state | Freshness |
| Horizontal read scaling | Add Redis nodes cheaply | Requires read replicas (still has lag) | Caching |
| Operational complexity | Cache infrastructure + invalidation logic | Simpler, but DB scaling is harder | Depends |
| Failure mode | Cache crash = temporary DB overload | DB crash = total outage | Caching (graceful degradation) |
| Cost at scale | Redis cluster ($200-800/mo) + DB | Larger DB instances ($2K-10K/mo) | Caching |
| Debugging | "Is this stale?" is a common question | Data always current, easier to reason about | Freshness |
The fundamental tension: response speed vs. data accuracy. Caching serves data faster but risks showing outdated information. Fresh reads are always correct but hit performance limits much sooner.
When Caching Wins
Choose aggressive caching when read throughput matters more than second-by-second accuracy. This is the majority of web application data.
Product catalogs and content pages. A product page viewed 10K times per minute changes once every few hours. A 5-minute TTL means the page is at most 5 minutes stale, which no user notices. The alternative (10K database reads per minute per product) is wasteful.
User profile data. Profile information (name, avatar, preferences) changes rarely. A 1-hour TTL with event-driven invalidation on profile update gives sub-millisecond reads with near-instant freshness when something changes.
Search results and recommendations. Search indexes and recommendation models update periodically (minutes to hours). Caching the rendered results eliminates redundant computation. A user searching "laptop" does not need results from the index updated 3 seconds ago.
Session data. Session lookups happen on every authenticated request. Storing sessions in Redis eliminates a database round-trip per request. Sessions change infrequently (login, permission change), so staleness is not a concern.
When Freshness Wins
Choose always-fresh reads when showing stale data causes real harm, not just cosmetic inconsistency.
Financial balances during transactions. A user checking their balance during a funds transfer must see the correct number. Showing a cached balance that does not reflect a pending debit causes incorrect decisions and support escalations.
Inventory counts at checkout. When a user clicks "Buy," the system must check real inventory, not a cached count. Selling the last item to two users simultaneously (because both saw cached inventory of 1) creates an oversell that requires manual resolution.
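The oversell is usually prevented with a conditional decrement at the source of truth rather than a cached count. A sketch using SQLite for illustration (any SQL database works the same way; table and column names are hypothetical): the `WHERE stock > 0` guard makes the check-and-decrement a single atomic statement, so two concurrent buyers cannot both take the last unit.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE inventory (product_id TEXT PRIMARY KEY, stock INTEGER)")
conn.execute("INSERT INTO inventory VALUES ('sku-1', 1)")  # one unit left

def try_purchase(conn, product_id: str) -> bool:
    # Atomic check-and-decrement: succeeds only if stock remains.
    # Checkout never consults the cache; it reads the source of truth.
    cur = conn.execute(
        "UPDATE inventory SET stock = stock - 1 "
        "WHERE product_id = ? AND stock > 0",
        (product_id,),
    )
    conn.commit()
    return cur.rowcount == 1  # 1 row updated = purchase succeeded

first = try_purchase(conn, "sku-1")   # last unit is sold
second = try_purchase(conn, "sku-1")  # sold out: no oversell
```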
Real-time bidding and auctions. If a bidder sees a cached high bid of $100 and places $101 when the real high bid is $150, their bid is wasted and the experience is confusing. Auction state must always reflect the latest bids.
Access control and permissions. When an admin revokes a user's access, the revocation must take effect immediately, not after the permission cache expires. A 5-minute TTL on permissions means 5 minutes of unauthorized access after revocation.
The Nuance
The real answer is almost never "cache everything" or "cache nothing." It is "cache most things, with invalidation strategies matched to each data type's tolerance."
Event-driven invalidation: best of both worlds
Instead of waiting for TTL expiry, invalidate the cache immediately when the source data changes. The write path publishes an event (via Kafka, Redis Pub/Sub, or database CDC), and a consumer deletes or updates the cache entry.
# Write path: update DB and publish invalidation event
def update_product(product_id: str, new_price: float):
    db.update_product(product_id, price=new_price)
    kafka.publish("product-updates", {
        "product_id": product_id,
        "action": "invalidate",
    })

# Consumer: listen for invalidation events
def on_product_update(event):
    redis.delete(f"product:{event['product_id']}")
This approach gives short effective staleness (the time between the DB write and the cache invalidation, typically 50-200ms) while still serving reads from the cache. The next read after invalidation repopulates from the database.
The trade-off: invalidation adds complexity. You need a reliable event pipeline. If the invalidation event is lost or delayed, the cache serves stale data until TTL expiry. This is why you still set a TTL as a safety net, even with event-driven invalidation.
HTTP caching: the CDN layer
For content served over HTTP, the caching layer extends beyond your application. Browsers, CDNs, and reverse proxies all cache responses based on HTTP headers.
Key headers:
Cache-Control: max-age=300, stale-while-revalidate=60
-> Cache for 5 min. After expiry, serve stale for 60s while revalidating.
Cache-Control: max-age=31536000, immutable
-> Cache forever. Used with content-hashed URLs (app.abc123.js).
ETag: "abc123"
-> Client sends If-None-Match: "abc123" on next request.
-> Server returns 304 Not Modified if unchanged (saves bandwidth).
Surrogate-Key: product-42 product-catalog
-> CDN tag for targeted purging.
-> Purge all responses tagged "product-42" when that product changes.
Surrogate-Key purging is the CDN equivalent of event-driven invalidation. When a product changes, you send a purge request to the CDN for that product's surrogate key. All cached responses tagged with that key are invalidated across all edge nodes. Fastly and Cloudflare both support this with sub-second global purge latency.
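A sketch of what a targeted purge looks like, assuming a Fastly-style API (the endpoint shape, header names, and IDs here are illustrative assumptions; check your CDN's documentation for the real interface). Building the request as data keeps the sketch testable without a network call:

```python
def build_purge_request(service_id: str, surrogate_key: str, api_token: str) -> dict:
    """Describe a surrogate-key purge request for a Fastly-style CDN API.

    The URL shape and header names are assumptions for illustration.
    """
    return {
        "method": "POST",
        "url": f"https://api.fastly.com/service/{service_id}/purge/{surrogate_key}",
        "headers": {
            "Fastly-Key": api_token,   # auth token (hypothetical value)
            "Fastly-Soft-Purge": "1",  # mark stale instead of deleting outright
        },
    }

# On product update, purge every cached response tagged "product-42":
req = build_purge_request("SERVICE_ID", "product-42", "API_TOKEN")
```

Soft purge (marking entries stale rather than deleting them) pairs well with `stale-while-revalidate`: the edge keeps serving the stale copy briefly while it refetches, avoiding an origin spike.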
Real-World Examples
Facebook TAO (write-through + TTL). TAO is Facebook's distributed cache for the social graph. Every write (new friendship, like, comment) updates both the database and the cache synchronously (write-through). This ensures the cache is always current for the writer. For other readers, eventual consistency is acceptable: a like count that is 1-2 seconds behind is fine. TAO also uses a short TTL as a safety net: even if a write-through fails to update a cache node, the stale entry expires within seconds.
Netflix EVCache (event-driven invalidation). Netflix uses EVCache (a distributed Memcached layer) to cache catalog metadata, user preferences, and viewing history. When a new title is added or metadata changes, an event is published through their internal event bus, and EVCache entries are invalidated within 200-500ms. Netflix explicitly classifies each data type by staleness tolerance: viewing history can be 30 seconds stale, but entitlement checks (can this user watch this title?) are always fresh.
Shopify (Surrogate-Key CDN purging). Shopify serves millions of storefronts through a CDN. Each HTTP response is tagged with Surrogate-Keys (product ID, collection ID, theme version). When a merchant updates a product, Shopify sends a targeted purge for that product's surrogate key. The purge propagates globally in under 150ms. Shopify reports that Surrogate-Key purging reduced their invalidation from "purge everything" (5-10 minute propagation) to targeted purges (sub-second).
How This Shows Up in Interviews
This trade-off appears whenever you design a read-heavy system with mutable data. The interviewer wants to see that you reason about staleness tolerance per data type, not apply a blanket caching strategy.
What they are testing: Do you know cache-aside, write-through, and event-driven invalidation? Can you identify which data should not be cached? Do you understand the thundering herd problem and how to prevent it? Can you design a CDN caching strategy with targeted invalidation?
Depth expected at senior level:
- Classify data by staleness tolerance (0s, seconds, minutes, hours, immutable)
- Explain cache-aside vs write-through vs write-behind with trade-offs
- Know the thundering herd problem and at least 2 solutions (jittered TTL, request coalescing)
- Understand HTTP caching headers (Cache-Control, ETag, Surrogate-Key)
- Reference real patterns (event-driven invalidation via Kafka, CDN Surrogate-Key purging)
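The two thundering herd mitigations above can be sketched in a few lines. This is a minimal in-process version for illustration; real deployments typically use a distributed lock or a cache library with built-in coalescing:

```python
import random
import threading

def jittered_ttl(base_seconds: int, jitter_fraction: float = 0.1) -> int:
    # Spread expirations so a hot key's readers do not all miss at once.
    jitter = base_seconds * jitter_fraction
    return int(base_seconds + random.uniform(-jitter, jitter))


class CoalescingLoader:
    """Request coalescing: one loader call per key, shared by concurrent callers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._inflight = {}  # key -> (done event, one-element result box)

    def load(self, key, loader):
        with self._lock:
            entry = self._inflight.get(key)
            if entry is None:
                entry = (threading.Event(), [])
                self._inflight[key] = entry
                is_leader = True
            else:
                is_leader = False
        done, box = entry
        if is_leader:
            try:
                box.append(loader())  # only the leader hits the database
            finally:
                done.set()
                with self._lock:
                    del self._inflight[key]
            return box[0]
        done.wait()  # followers reuse the leader's result
        return box[0]
```

If the leader's load raises, followers here see an empty result box; a production version would propagate the exception to all waiters.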
| Interviewer asks | Strong answer |
|---|---|
| "How do you decide what to cache?" | "Classify each data type by staleness tolerance. Immutable data caches forever. Product catalog gets 5-min TTL with event-driven invalidation. Financial balances are never cached. The staleness budget drives the TTL and invalidation strategy." |
| "What happens when a popular cache key expires?" | "Thundering herd: hundreds of requests simultaneously miss and hit the database. Prevention: jittered TTLs spread expirations, request coalescing ensures only one request refreshes the cache." |
| "How do you keep the cache consistent with the database?" | "Event-driven invalidation: on write, publish an event (Kafka or CDC), consumer deletes the cache entry. Set a TTL as a safety net in case the event is lost. For HTTP, use Surrogate-Key CDN purging." |
| "Should we use write-through or cache-aside?" | "Cache-aside for most cases: it only caches data that is actually read, saving memory. Write-through when read-after-write consistency is critical (the writer must immediately see their update in the cache)." |
Interview tip: name the staleness budget
When adding a cache to your design, say: "The staleness budget for [data type] is [X seconds/minutes]. I will use a [TTL/event-driven/write-through] strategy to stay within that budget." This shows you think about freshness as a design parameter, not an afterthought.
Quick Recap
- Caching trades data freshness for read speed. A 95% cache hit rate reduces database load by 20x, but every cached value is a snapshot that can become stale.
- Classify every data type by staleness tolerance: 0s (never cache), seconds, minutes, hours, or immutable. The tolerance determines the TTL and invalidation strategy.
- Cache-aside is the default pattern for 90% of use cases. Write-through when read-after-write consistency is critical. Write-behind only when data loss risk is tolerable.
- Event-driven invalidation (via Kafka or CDC) gives sub-second freshness while preserving caching performance. Always set a TTL as a safety net.
- The thundering herd problem is real at scale. Prevent it with jittered TTLs, request coalescing, or probabilistic early expiration.
- In interviews, name the staleness budget explicitly per data type. This shows you reason about freshness as a design parameter, not an afterthought.
Related Trade-offs
- Read replicas vs. caching for comparing two strategies that both improve read performance
- Strong vs. eventual consistency for the consistency model behind cache staleness
- Read vs. write optimization for how caching fits into the broader read optimization toolkit
- Sync vs. async communication for how invalidation events flow between services
- CDN for the full picture on edge caching architecture