# Read replicas vs. caching
When to add a read replica vs. a cache: the access pattern that drives the choice, write-heavy invalidation problems, replica lag trade-offs, and why adding a cache to a write-heavy system doesn't help.
## TL;DR
| Dimension | Choose Caching | Choose Read Replicas |
|---|---|---|
| Access pattern | Hot data, high read-to-write ratio, repeated identical reads | Complex queries, unique filter combinations, analytical workloads |
| Staleness tolerance | Seconds of staleness are acceptable (profiles, catalogs, sessions) | Need data freshness within replication lag (milliseconds to seconds) |
| Write rate | Low write rate relative to reads (invalidation rate is manageable) | Write-heavy workload where cache invalidation would destroy hit rate |
| Query complexity | Simple key-value lookups or pre-computed results | Full SQL capability needed (JOINs, GROUP BY, arbitrary filters) |
| Data coverage | Hot working set fits in memory (typically 5-20% of total data) | Need to query the full dataset, not just hot keys |
Default answer: use both, at different layers. Cache for hot, repetitive reads (sessions, profiles, product pages). Read replicas for complex queries, analytics, and everything the cache can't cover. The access pattern tells you which to deploy first.
## The Framing
Your primary database is at 80% CPU. Reads are taking 200ms when they used to take 20ms. The dashboard is red. You have two cards to play: add a read replica (more database capacity) or add a cache (less database traffic).
I've watched teams pick the wrong one and waste weeks. A team added Redis in front of a write-heavy user activity table with 2,000 writes per second. Cache invalidation rate matched the write rate. Hit rate hovered at 3%. They'd added infrastructure, complexity, and a new failure mode, and the database load didn't budge.
The access pattern tells you which lever to pull. High read-to-write ratio with a clear hot set? Cache. Complex queries where every request is unique? Read replica. Write-heavy with read contention? Definitely not a cache.
Here's my rule of thumb: if you can describe your top 10 most expensive queries and they all have the same parameters across thousands of users, cache them. If each user's query is unique, a cache won't help. Add a replica.
## How Each Works
### Caching (Redis / Memcached)
A cache sits between your application and database. On read, check the cache first. If the key is present (cache hit), return it immediately at sub-millisecond latency. If not (cache miss), read from the database, write the result to the cache with a TTL, and return it.
The most common pattern is cache-aside (lazy loading):
```python
# Cache-aside pattern (pseudocode)
def get_user_profile(user_id):
    # Step 1: Check cache
    cached = redis.get(f"user:{user_id}")
    if cached:
        return deserialize(cached)  # Sub-ms response

    # Step 2: Cache miss, read from DB
    profile = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # Step 3: Populate cache with a 1-hour TTL
    redis.setex(f"user:{user_id}", 3600, serialize(profile))
    return profile
```
The cache only stores data that's been requested at least once. Over time, the hot working set naturally populates the cache. A well-tuned cache-aside setup achieves 90-99% hit rates on read-heavy workloads, meaning 90-99% of reads never touch the database.
The catch: cache invalidation. When the underlying data changes, the cached copy is stale. You need an invalidation strategy (delete on write, TTL expiry, or event-driven invalidation). Each has trade-offs between freshness, consistency, and complexity.
### Read Replicas
A read replica is a copy of your primary database that receives changes via asynchronous replication. Your application routes write queries to the primary and read queries to one or more replicas. Each replica is a full database instance with complete query capability.
```python
# Read replica routing (pseudocode)
def get_user_profile(user_id):
    # Reads go to a replica (full SQL capability)
    return replica_db.query("SELECT * FROM users WHERE id = %s", user_id)

def search_users(filters):
    # Complex query that would be impractical to cache
    return replica_db.query("""
        SELECT u.*, COUNT(o.id) AS order_count
        FROM users u
        LEFT JOIN orders o ON o.user_id = u.id
        WHERE u.country = %s AND u.created_at > %s
        GROUP BY u.id
        ORDER BY order_count DESC
        LIMIT 50
    """, filters.country, filters.since)

def update_user(user_id, data):
    # Writes always go to the primary
    return primary_db.query("UPDATE users SET name = %s WHERE id = %s",
                            data.name, user_id)
```
The key advantage: replicas support arbitrary SQL queries. No pre-defined keys, no invalidation logic, no serialization. Every query your primary can run, the replica can run too.
The cost: replication lag. Async replication means the replica is always slightly behind the primary (typically milliseconds within a region, seconds under heavy load or cross-region). A user might write data and immediately read stale results from a replica that hasn't received the write yet.
## Cache Invalidation Patterns
The hardest part of caching isn't adding a cache. It's deciding how and when to update it. There are four patterns, each with a different consistency/complexity trade-off.
Cache-aside (lazy loading): The application manages the cache explicitly. On read, check cache first. On miss, read from DB and populate cache. On write, update DB and delete the cache key. This is the most common pattern because it's simple and the cache only stores data that's actually been requested.
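The write side of cache-aside pairs with the read path shown earlier: update the database, then delete the cache key. Deleting rather than updating the cached value avoids racing with concurrent readers. A minimal sketch with in-memory stand-ins (`FakeCache` and `FakeDB` are illustrative, not a real Redis or SQL client):

```python
# Cache-aside: delete-on-write invalidation with in-memory stand-ins.
class FakeCache:
    def __init__(self):
        self.store = {}
    def get(self, key):
        return self.store.get(key)
    def set(self, key, value):
        self.store[key] = value
    def delete(self, key):
        self.store.pop(key, None)

class FakeDB:
    def __init__(self):
        self.rows = {}

def get_user(cache, db, user_id):
    cached = cache.get(f"user:{user_id}")
    if cached is not None:
        return cached                       # cache hit
    row = db.rows.get(user_id)              # cache miss: read from DB
    cache.set(f"user:{user_id}", row)       # repopulate for the next reader
    return row

def update_user(cache, db, user_id, data):
    db.rows[user_id] = data                 # 1. write to the DB first
    cache.delete(f"user:{user_id}")         # 2. then invalidate the key

cache, db = FakeCache(), FakeDB()
db.rows[1] = {"name": "Ada"}
assert get_user(cache, db, 1) == {"name": "Ada"}    # miss, then populated
update_user(cache, db, 1, {"name": "Grace"})
assert cache.get("user:1") is None                  # key invalidated on write
assert get_user(cache, db, 1) == {"name": "Grace"}  # fresh read repopulates
```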
Write-through: On every write, update both the DB and the cache synchronously. The cache is always fresh, but every write pays the latency of two operations (DB write + cache write). Good for data that's read immediately after writing (user profiles, settings).
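Write-through can be sketched the same way: both stores are updated in the same synchronous call, so a read immediately after a write always sees fresh data (the dicts below stand in for the DB and cache):

```python
# Write-through: every write updates the DB and the cache together,
# paying two operations' latency in exchange for an always-fresh cache.
db, cache = {}, {}

def write_through(user_id, profile):
    db[user_id] = profile                  # synchronous DB write
    cache[f"user:{user_id}"] = profile     # synchronous cache write

def read(user_id):
    return cache.get(f"user:{user_id}")    # always fresh after a write

write_through(1, {"name": "Ada"})
assert read(1) == {"name": "Ada"}          # read-after-write is consistent
```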
Write-behind (write-back): On write, update the cache immediately and asynchronously flush to the DB. Writes are fast (cache-speed), but you risk data loss if the cache fails before the async flush completes. I've seen this used for analytics counters where losing a few seconds of data is acceptable.
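A write-behind sketch makes the risk concrete: the write lands in the cache immediately and sits in a queue until an async flush; if the process dies before `flush()` runs, the queued writes are lost (all names here are illustrative):

```python
# Write-behind (write-back): cache-speed writes, deferred DB persistence.
from collections import deque

db, cache, pending = {}, {}, deque()

def write_behind(key, value):
    cache[key] = value                # fast, cache-speed write
    pending.append((key, value))      # queued for a later async flush

def flush():
    while pending:                    # in production this runs in the background
        key, value = pending.popleft()
        db[key] = value               # batched DB write

write_behind("pageviews:home", 42)
assert cache["pageviews:home"] == 42 and "pageviews:home" not in db
flush()                               # until this runs, the write is at risk
assert db["pageviews:home"] == 42
```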
Event-driven invalidation: The database publishes change events (via CDC, triggers, or binlog streaming), and a consumer invalidates or updates cache entries. This decouples the write path from cache management entirely. More complex to set up, but eliminates the "forgot to invalidate" class of bugs.
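The consumer side of event-driven invalidation can be sketched as a handler for row-level change events; the event shape below is an illustrative assumption, not the format of any particular CDC tool:

```python
# Event-driven invalidation: a CDC consumer deletes affected cache keys,
# so the application's write path never touches the cache at all.
cache = {"user:1": {"name": "Ada"}, "user:2": {"name": "Grace"}}

def on_change_event(event):
    # event mirrors a row-level change: {"table": ..., "pk": ...}
    if event["table"] == "users":
        cache.pop(f"user:{event['pk']}", None)   # idempotent delete

on_change_event({"table": "users", "pk": 1})
assert "user:1" not in cache           # changed row evicted
assert "user:2" in cache               # unrelated rows untouched
```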
For your interview: cache-aside is the safe default. Mention write-through when the interviewer cares about read-after-write consistency. Mention event-driven when the system already has a CDC pipeline (Debezium, DynamoDB Streams). Write-behind is niche and risky; only bring it up for write-heavy counters or metrics.
## Head-to-Head Comparison
| Dimension | Caching (Redis) | Read Replicas | Verdict |
|---|---|---|---|
| Read latency | Sub-millisecond (0.1-0.5ms) | Same as primary (5-50ms for typical queries) | Cache, dramatically faster |
| Query flexibility | Key-value lookups only (pre-defined keys) | Full SQL (JOINs, GROUP BY, arbitrary filters) | Replica, full query power |
| Data freshness | Stale until TTL expires or explicit invalidation | Stale by replication lag (ms to seconds) | Replica (fresher by default) |
| Write-heavy tolerance | Poor (invalidation rate = write rate, hit rate collapses) | Good (replicas handle writes via replication stream) | Replica for write-heavy |
| Operational complexity | Cache invalidation logic, thundering herd, cold start | Replication lag monitoring, read-your-own-writes routing | Comparable, different failure modes |
| Cost at scale | Memory cost (RAM-only, expensive per GB) | Full DB instance cost (disk + compute) | Cache cheaper per read, replica cheaper per GB |
| Data coverage | Hot working set only (typically 5-20% of data) | Full dataset | Replica for full coverage |
| Failure mode | Cache miss hits DB (graceful degradation) | Replica lag or stale reads (data correctness risk) | Cache fails safer |
The fundamental tension is speed vs. flexibility. A cache gives you sub-millisecond reads but only for pre-defined keys with acceptable staleness. A read replica gives you full query capability with near-real-time data but no performance leap over the primary.
## When Caching Wins
My recommendation: reach for a cache when you have a clear hot set with a high read-to-write ratio.
Product catalogs and configuration data. A product page is read thousands of times per write. Cache the rendered product object with a 5-minute TTL. Even a 90% hit rate eliminates 90% of your product-read DB load. I've seen this single optimization drop database CPU from 80% to 15%.
Session data. Session state is read on every authenticated request and written infrequently (login, cart update). Redis is the canonical session store: in-memory, sub-millisecond, optional persistence for surviving restarts.
Expensive computed results. Consider an analytics dashboard that runs a 500ms aggregation query and is viewed by 200 users per minute. Without a cache, that's 200 identical 500ms queries per minute. With a 60-second TTL cache, one query per minute serves all 200 users, a 200x reduction in query load.
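The dashboard math above can be checked with a toy simulation; the query and values are made up, and a fake clock stands in for wall time:

```python
# One 60-second TTL entry absorbs 200 identical dashboard requests/minute.
db_queries = 0
cached_at, cached_value, TTL = None, None, 60

def run_aggregation():
    global db_queries
    db_queries += 1
    return {"total_orders": 12345}     # stands in for the 500 ms query

def get_dashboard(now):
    global cached_at, cached_value
    if cached_at is None or now - cached_at >= TTL:
        cached_value, cached_at = run_aggregation(), now   # refresh on expiry
    return cached_value

# 200 users hit the dashboard spread evenly over one minute
for i in range(200):
    get_dashboard(now=i * 60 / 200)
assert db_queries == 1                 # one DB query served all 200 requests
```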
API response caching. External API calls with rate limits or high latency. Cache the response with a TTL matching the data's freshness requirements. This is especially valuable for third-party APIs where you're paying per call.
### Interview tip: state the hit rate
When proposing a cache in a system design interview, always state the expected hit rate and explain why. "Product pages have a 95%+ hit rate because 80% of traffic hits the top 5% of products" is a strong answer. "I'll add a cache" without hit rate analysis is weak.
## When Read Replicas Win
Choose replicas when queries can't be cached effectively or when writes dominate the workload.
Complex filtered queries with unique parameters. A marketplace search with filters for price range, category, location, seller rating, and sort order. Every user's combination is different. Cache hit rate would approach 0% because nearly every query is unique. A read replica handles the full SQL query without any caching infrastructure.
Analytical queries on fresh data. Business dashboards, real-time reporting, aggregations that must reflect data from the last few seconds. A cache with a short TTL (say, 10 seconds) would still miss on the first request after expiry, and that miss sends the aggregation to your primary during peak load. A dedicated analytics replica isolates these expensive queries from your OLTP primary.
Write-heavy workloads with read contention. I'll be blunt: if your primary is dying from writes, a read cache doesn't help. The cache invalidation rate matches the write rate, and the hit rate drops toward zero. A read replica offloads the read queries to a separate compute instance while the primary handles writes.
Geographic distribution. Read replicas in multiple regions reduce latency for global users. A user in Tokyo reads from a Tokyo replica instead of crossing the Pacific to a US primary. Caching could also help here, but replicas provide full query capability at each region.
### Gotcha: caching a write-heavy table
If your cache invalidation rate matches your write rate, the cache is overhead with no benefit. Every write invalidates the cache, every subsequent read misses, hits the database, and re-populates a value that will be invalidated again within milliseconds. You've added latency (cache lookup + miss + DB query + cache write) while gaining nothing. Fix the write path or add read replicas.
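The hit-rate collapse is easy to demonstrate with a toy simulation (the key space, op counts, and write fractions are arbitrary; only the shape of the result matters):

```python
# Simulate delete-on-write caching at different write fractions.
import random

def simulate(write_fraction, ops=20_000, keys=100, seed=0):
    rng = random.Random(seed)
    cache, hits, reads = set(), 0, 0
    for _ in range(ops):
        key = rng.randrange(keys)
        if rng.random() < write_fraction:
            cache.discard(key)        # every write invalidates the key
        else:
            reads += 1
            if key in cache:
                hits += 1
            else:
                cache.add(key)        # miss: fall through to DB, repopulate
    return hits / reads

assert simulate(0.01) > 0.9   # read-heavy: the cache pays off
assert simulate(0.90) < 0.2   # write-heavy: hit rate collapses
```

With 90% writes, nearly every read finds its key freshly invalidated, which is exactly the 3%-hit-rate failure mode from the user activity table story above.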
## The Nuance: Use Both
Here's the honest answer: most production systems at scale use caching and read replicas together, serving different layers of the read path.
The cache handles the narrow, hot, repetitive reads: session data, user profiles, product pages, configuration. These have high hit rates (90-99%) and sub-millisecond latency. The replicas handle everything else: complex searches, analytics, reports, and any query that the cache doesn't cover or can't efficiently cache.
Typical layering:
| Read type | Solution | Why |
|---|---|---|
| Session data | Redis cache, 1-hour TTL | Read on every request, rarely changes, must be fast |
| User profiles | Redis cache, 5-minute TTL | High hit rate, acceptable staleness |
| Product search | Read replica, no cache | Every query has unique filters, cache can't help |
| Analytics dashboards | Read replica, optional 60-second cache | Expensive queries, dedicated replica isolates OLTP |
| Full-text search | Elasticsearch (specialized replica) | Neither Redis nor SQL replicas handle this well |
The cache miss path should fall through to a read replica, not the primary. This protects the primary from both cache misses and complex queries. My recommendation: always route cache misses to replicas, not to the primary, once you have both in place.
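A minimal sketch of that tiered read path, with dicts and a counter standing in for Redis, a replica, and the primary:

```python
# Tiered read path: a cache miss falls through to a replica, never the primary.
cache = {}
replica = {"user:1": "Ada"}
reads = {"cache": 0, "replica": 0, "primary": 0}

def read(key):
    if key in cache:
        reads["cache"] += 1
        return cache[key]             # hot path: sub-ms
    reads["replica"] += 1             # miss path: replica absorbs the load
    value = replica.get(key)
    cache[key] = value                # repopulate for the next reader
    return value

assert read("user:1") == "Ada"        # miss -> served by replica
assert read("user:1") == "Ada"        # hit -> served by cache
assert reads == {"cache": 1, "replica": 1, "primary": 0}
```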
There's also the replication lag mitigation problem. When a user writes data and immediately reads it back, they might hit a replica that hasn't received the write yet. The standard fix is read-your-own-writes routing: for a configurable window after a write (say, 5 seconds), route that user's reads to the primary. After the window, fall back to replicas.
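Read-your-own-writes routing reduces to a per-user timestamp check; a sketch, assuming a 5-second window chosen to exceed observed replica lag (the function and variable names are illustrative):

```python
# Read-your-own-writes routing: pin a user's reads to the primary
# for WINDOW seconds after their last write, then fall back to replicas.
import time

WINDOW = 5.0                          # should exceed max observed replica lag
last_write_at = {}                    # user_id -> timestamp of last write

def record_write(user_id, now=None):
    last_write_at[user_id] = time.time() if now is None else now

def choose_backend(user_id, now=None):
    now = time.time() if now is None else now
    if now - last_write_at.get(user_id, float("-inf")) < WINDOW:
        return "primary"              # inside the window: guaranteed fresh
    return "replica"                  # outside it: replica lag is acceptable

record_write("u1", now=100.0)
assert choose_backend("u1", now=102.0) == "primary"   # within the 5 s window
assert choose_backend("u1", now=106.0) == "replica"   # window elapsed
assert choose_backend("u2", now=100.0) == "replica"   # never wrote
```

In production the timestamp map lives somewhere shared (e.g. in the cache itself, or a session cookie carrying the write time) so every app server makes the same routing decision.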
## Real-World Examples
Netflix: Uses EVCache (a Memcached-based distributed caching layer built in-house) for high-frequency personalization data, serving billions of requests per day with a 99.9%+ hit rate. EVCache caches user profiles, viewing history summaries, and personalization scores that power the recommendation engine. Each cache hit saves a cross-service call that would otherwise take 5-20ms. For complex recommendation queries that join user history with content metadata across multiple data sources, Netflix routes to dedicated read replicas (Cassandra and MySQL) rather than trying to cache the infinite parameter combinations. The key insight from Netflix's architecture: the cache handles the narrow, hot, predictable reads; replicas handle the wide, unpredictable, analytical queries.
GitHub: Uses Memcached for rendered Markdown (the most expensive read operation on the platform), repository metadata, and user session data. A repository README that takes 200ms to render from Markdown is cached and served in under 1ms on subsequent views. The cache hit rate on rendered Markdown exceeds 98% because most repository pages are viewed far more often than they're edited. For search queries, activity feeds, and contributor graphs that involve complex joins across repositories, users, and events, GitHub routes to MySQL read replicas. Their architecture explicitly separates the hot path (cached page views) from the long tail (search and analytics on replicas).
Shopify: Uses Redis for session storage, storefront caching (product pages, cart data), and rate limiting. During flash sales (which can see 100x normal traffic in seconds), Redis absorbs 99% of product page reads, keeping MySQL primaries focused on order writes. MySQL read replicas handle order history, admin dashboard analytics, and merchant reporting queries. Shopify's architecture routes cache misses to replicas rather than the primary, a pattern that saved them during multiple Black Friday events. Their published numbers show that without the cache layer, flash sale traffic would require 50x more database read capacity.
## How This Shows Up in Interviews
This trade-off appears when a system design interview involves scaling reads. The interviewer will often ask "how would you handle increased read load?" and expects you to reason about which tool fits the access pattern.
What they're testing: Whether you pick the right tool for the access pattern or just default to "add a cache." They want to see you think about hit rates, invalidation, and when a cache doesn't help.
Depth expected at senior level:
- Know cache-aside, write-through, and write-behind patterns and when each applies
- Understand why write-heavy workloads kill cache hit rates
- Explain replication lag and the read-your-own-writes mitigation
- Quantify expected hit rates based on access patterns
- Know when to use both together and how they layer
| Interviewer asks | Strong answer |
|---|---|
| "Your DB is overloaded from reads. What do you do?" | "First I'd profile the queries. If they're repetitive key-value lookups (product pages, sessions), I'd add Redis with cache-aside. If they're complex filtered searches, I'd add read replicas. Usually you need both at different layers." |
| "Why not just cache everything?" | "Cache hit rate depends on the read-to-write ratio and query uniqueness. For a search endpoint where every query has different parameters, hit rate approaches 0%. The cache adds overhead without reducing DB load. Read replicas handle unique queries without hit-rate concerns." |
| "How do you handle stale reads after a write?" | "Read-your-own-writes routing. After a user writes, route their reads to the primary for 5 seconds (configurable based on max replication lag). After the window, fall back to replicas. This gives consistency where it matters most without overloading the primary." |
| "What if your cache goes down?" | "Cache-aside degrades gracefully. All reads fall through to the database (or read replicas). Latency increases from sub-ms to 20ms+, and the database takes more load, but the system keeps working. I'd design for cache failure by ensuring the database can handle the miss rate during recovery." |
## Quick Recap
- Use caching for high read-to-write ratio workloads where a small hot working set generates most reads. A 95%+ cache hit rate on sessions and profile data can eliminate most database read load entirely.
- Use read replicas for queries that can't be cached effectively: complex filtered searches, analytical queries, and any workload where each query is unique. Replicas add query capacity without invalidation problems.
- Caching doesn't help write-heavy workloads. If your cache invalidation rate matches your write rate, the hit rate approaches zero and the cache adds overhead without benefit. Fix the write path or add replica capacity.
- Replication lag is unavoidable with async replicas. Read-your-own-writes routing (send a user's reads to the primary for N seconds after they write) is the standard mitigation.
- Production systems use both: Redis cache for hot, repetitive data with high TTLs; read replicas for complex or freshness-sensitive queries. Route cache misses to replicas, not the primary.
- In interviews, always state the expected hit rate when proposing a cache, and explain why a cache won't help when the access pattern doesn't support it.
## Related Trade-offs
- Caching for cache-aside, write-through, and write-behind patterns in depth
- Replication for leader-follower replication mechanics and consistency guarantees
- Read vs. write optimization for the broader strategy of optimizing the read path vs. the write path
- Redis vs. Memcached for choosing between cache implementations once you've decided to cache
- Strong vs. eventual consistency for the consistency implications of both replicas and caches