Read vs. write optimization
How to design databases and caches for either read-heavy or write-heavy workloads, covering denormalization, read replicas, write-ahead logs, CQRS, and the tradeoffs of optimizing for one at the expense of the other.
TL;DR
| Scenario | Optimize for reads | Optimize for writes |
|---|---|---|
| E-commerce product catalog | Denormalize, add indexes, cache in Redis | Overkill; writes are rare compared to page views |
| IoT sensor telemetry | Not the bottleneck; reads are infrequent | Append-only LSM, batch inserts, minimal indexing |
| Social media feed | Precompute feeds, read replicas, CDN | Fan-out-on-write for celebrities creates write storms |
| Financial audit log | Occasional reads for investigations | Append-only, immutable, WAL-optimized |
| Real-time analytics dashboard | Materialized views, OLAP column store | Raw event ingestion must be write-optimized |
Default instinct: optimize for reads. Most web applications have a 100:1 read-to-write ratio. But know when you are in the minority: IoT, logging, event sourcing, and analytics ingestion are write-heavy by nature. The right answer depends on your read/write ratio, not on what sounds more sophisticated.
The Framing
Your e-commerce platform launches. The product table has a few thousand rows. Reads and writes are both fast. No one thinks about optimization.
Then the catalog grows to 2 million products. The homepage query that joins products, categories, reviews, and inventory takes 800ms. You add a B-tree index on category_id. It drops to 15ms. Problem solved.
Six months later, the warehouse integration pushes 50,000 inventory updates per minute. Each UPDATE touches the products table, which now has 12 indexes. Every write must update every index. Write latency jumps from 2ms to 45ms. The index that saved your reads is now strangling your writes.
This is the fundamental tension: every structure you add to speed up reads (indexes, materialized views, denormalized columns) becomes overhead on writes. Every optimization you make for writes (append-only logs, minimal indexing, batching) makes reads more expensive. You cannot optimize for both simultaneously on the same data path.
I have seen teams add indexes reactively (one for every slow query) until they have 15+ indexes on a single table and wonder why writes are crawling. The fix is not "remove all indexes." The fix is understanding which path (read or write) is your bottleneck and optimizing intentionally.
The honest answer: neither side is "better." Your read/write ratio determines where to invest. A 100:1 ratio means read optimization pays 100x more per unit of engineering effort. A 1:10 ratio means every write optimization matters far more than shaving milliseconds off reads.
How Each Works
Read optimization: make reads fast at the cost of write overhead
Read-optimized systems pre-compute and pre-organize data so that reads require minimal work. The B-tree is the canonical read-optimized data structure: data is always sorted, binary search finds any key in O(log n), and range scans follow sequential leaf pointers.
B-tree index lookup (PostgreSQL, MySQL):
1. Query: SELECT * FROM products WHERE category_id = 42
2. Traverse B-tree: root -> internal node -> leaf node (3-4 disk reads)
3. Leaf node contains row pointer -> fetch row from heap
4. Total: 3-5 random reads, predictable latency
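The lookup path above can be sketched with a sorted array and binary search, the same O(log n) idea a B-tree applies at each node (a toy in-memory stand-in, not a disk-backed tree; the keys and row pointers are hypothetical):

```python
import bisect

# Sorted (key, row_pointer) pairs stand in for B-tree leaf entries
# on an index over category_id.
index = [(10, "row-a"), (42, "row-b"), (42, "row-c"), (99, "row-d")]
keys = [k for k, _ in index]

def lookup(category_id):
    """Return all row pointers for a key, like an index scan on category_id."""
    lo = bisect.bisect_left(keys, category_id)   # O(log n) descent
    hi = bisect.bisect_right(keys, category_id)
    return [ptr for _, ptr in index[lo:hi]]      # sequential "leaf" scan

print(lookup(42))  # ['row-b', 'row-c']
```

In a real B-tree each binary-search step is a node fetch, which is why the lookup costs 3-5 reads rather than one.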
Cost of maintaining the B-tree on writes:
1. INSERT: find correct leaf -> insert key -> possibly split node
2. UPDATE: delete the old index entry, insert a new one (if the indexed column changes)
3. Each index on the table = one additional B-tree update per write
4. 12 indexes = 12 B-tree updates per INSERT
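The multiplier above can be sketched by inserting one row into a table with several indexes, each modeled as a sorted list standing in for an on-disk B-tree (column names are hypothetical):

```python
import bisect

class Table:
    def __init__(self, indexed_columns):
        self.rows = []
        # One sorted (value, row_id) list per index, standing in for a B-tree.
        self.indexes = {col: [] for col in indexed_columns}

    def insert(self, row):
        row_id = len(self.rows)
        self.rows.append(row)
        updates = 0
        for col, idx in self.indexes.items():    # every index pays on every write
            bisect.insort(idx, (row[col], row_id))
            updates += 1
        return updates

t = Table(["category_id", "price", "sku"])       # 3 indexes
cost = t.insert({"category_id": 42, "price": 19.99, "sku": "A-100"})
print(cost)  # 3 index updates for a single INSERT
```

With 12 indexes, the same insert would return 12: the write-side cost scales linearly with index count.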
Beyond indexes, read optimization includes denormalization (store redundant data to avoid joins), materialized views (pre-computed query results), caching (store hot data in Redis), and read replicas (horizontally scale read throughput). Every one of these techniques adds work on the write side.
For your interview: when someone says "our reads are slow," the optimization ladder is indexes, then cache, then read replicas, then denormalize. Each step trades write complexity for read speed.
Write optimization: make writes fast at the cost of read overhead
Write-optimized systems prioritize ingestion speed. The LSM tree (Log-Structured Merge Tree) is the canonical write-optimized data structure: writes go to an in-memory buffer (memtable), which is periodically flushed to disk as an immutable sorted file (SSTable). Reads must check multiple SSTables and merge results.
LSM tree write path (Cassandra, RocksDB, LevelDB):
1. Write arrives -> append to Write-Ahead Log (sequential I/O, fast)
2. Insert into in-memory memtable (sorted skip list or red-black tree)
3. When memtable is full (e.g., 64MB) -> flush to disk as immutable SSTable
4. Total write cost: 1 sequential append + 1 in-memory insert
5. No index updates, no page splits, no random I/O
LSM tree read path:
1. Check memtable (in-memory, fast)
2. Check each SSTable level, newest first
3. Bloom filter per SSTable: skip if key is definitely not present
4. Merge results from multiple SSTables
5. Total: potentially multiple disk reads across levels
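The two paths above can be sketched together: a sequential WAL append plus an in-memory memtable on the write side, and newest-first lookups across immutable SSTables on the read side (a minimal in-process sketch with no real disk I/O or compaction):

```python
class LSMStore:
    def __init__(self, memtable_limit=2):
        self.wal = []          # stand-in for the sequential write-ahead log
        self.memtable = {}     # sorted only at flush time, like a skip list snapshot
        self.sstables = []     # immutable flushed segments, newest last
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.wal.append((key, value))          # 1 sequential append
        self.memtable[key] = value             # 1 in-memory insert
        if len(self.memtable) >= self.memtable_limit:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}                 # flush as an immutable "SSTable"

    def get(self, key):
        if key in self.memtable:               # 1. check memtable first
            return self.memtable[key]
        for sst in reversed(self.sstables):    # 2. then SSTables, newest first
            if key in sst:
                return sst[key]
        return None

db = LSMStore()
db.put("a", 1); db.put("b", 2)   # second put triggers a flush
db.put("a", 99)                   # newer value shadows the flushed one
print(db.get("a"))  # 99
```

Note how the write path never touches old data, while the read path may have to walk every segment; that is the trade the section describes.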
Compaction runs in the background, merging SSTables to reduce the number of files a read must check. This is the write amplification tradeoff: the data is written once to the memtable, then rewritten during each compaction level. RocksDB's leveled compaction can produce 10-30x write amplification (each byte is rewritten 10-30 times across compaction levels).
Bloom filters are the critical optimization that makes LSM reads tolerable. A Bloom filter answers "is this key possibly in this SSTable?" with a configurable false positive rate (typically 1%). If the filter says "no," the SSTable is skipped entirely. Without Bloom filters, every read would scan every SSTable at every level.
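A minimal Bloom filter can be sketched with a bit array and a few hash functions (the sizes here are hypothetical; real engines derive bit count and hash count from the target false-positive rate):

```python
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1024, num_hashes=3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(size_bits)   # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive num_hashes positions by salting one hash function.
        for i in range(self.num_hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means definitely absent; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))

bf = BloomFilter()
bf.add("user:42")
print(bf.might_contain("user:42"))  # True
print(bf.might_contain("user:43"))  # almost certainly False -> skip this SSTable
```

One such filter sits in front of each SSTable; a "no" answer saves the disk read entirely, which is what keeps LSM read amplification tolerable.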
Head-to-Head Comparison
| Dimension | Read-optimized (B-tree) | Write-optimized (LSM tree) | Verdict |
|---|---|---|---|
| Point read latency | O(log n), 3-5 disk reads, predictable | Must check memtable + multiple SSTable levels | B-tree |
| Range scan | Sequential leaf traversal, excellent | Must merge across SSTables | B-tree |
| Write throughput | Random I/O per index, degrades with index count | Sequential I/O only, scales linearly | LSM |
| Write latency | 2-10ms (depends on index count) | Sub-ms (memtable insert + WAL append) | LSM |
| Space amplification | ~1x (data stored once, indexes add 20-40%) | 1.1-2x (duplicates during compaction) | B-tree (slightly) |
| Write amplification | 1x per index (N indexes = N updates per write) | 10-30x (compaction rewrites at each level) | B-tree (fewer total rewrites) |
| Read amplification | 1 (data is in one place per index) | 1-5 (check multiple levels, mitigated by Bloom) | B-tree |
| Storage I/O pattern | Random reads and writes | Sequential writes, random reads | LSM (write path) |
| Compaction overhead | None (data sorted in-place) | Background CPU and I/O spikes | B-tree |
The fundamental tension: read latency vs. write throughput. B-trees keep data sorted for fast reads but pay on every write. LSM trees buffer writes for speed but defer the sorting cost to reads and background compaction.
When Read Optimization Wins
Choose read optimization when your workload is read-heavy and read latency directly affects user experience. This is most web applications.
E-commerce product catalog. A catalog with 2M products and 100K page views per minute generates a 1000:1 read-to-write ratio. Products change a few times per day. Indexes, denormalized views, Redis caching, and read replicas are all justified. The write-side cost (slower product updates) is trivial compared to the benefit of sub-millisecond reads.
User profile lookups. Authentication and profile services read user records on every request. Writes happen only during registration and profile edits. A composite index on (email, status), a Redis cache, and 2-3 read replicas handle millions of reads per second.
Search and discovery. Full-text search (Elasticsearch), autocomplete, and recommendation results are read-heavy by definition. The indexing cost is paid once during ingestion; the search benefit is multiplied across every user query.
OLAP data warehouses. Analytical queries scan millions of rows but the data is loaded in batch (nightly ETL). Column-oriented storage (Redshift, BigQuery, ClickHouse) is a read optimization: columns are stored contiguously, compression is better, and analytical queries read only the columns they need.
When Write Optimization Wins
Choose write optimization when your workload is write-heavy and write throughput is the bottleneck. These workloads often accept higher read latency in exchange for ingestion speed.
IoT and sensor telemetry. A fleet of 100K devices reporting metrics every 10 seconds generates 10K writes/sec. Reads are infrequent (dashboard queries, incident investigations). An LSM-based time-series database (TimescaleDB, InfluxDB) or append-only log (Kafka + ClickHouse) handles this efficiently.
Event sourcing and audit logs. These workloads are append-only by design. Events are never updated or deleted; they are only appended. An LSM tree or append-only log is the natural fit. Reads reconstruct state by replaying events, which is expensive but infrequent.
Metrics and logging. Services generating 50K log lines per second need a write path that does not block the application. Writes go to a buffer (Kafka, Fluentd), then batch-insert into a write-optimized store (ClickHouse, Elasticsearch). Reads happen during investigations, not on the hot path.
High-throughput ingestion pipelines. Data lakes ingest raw events from multiple sources. The priority is "get it on disk fast" rather than "make it queryable immediately." Append to Parquet files in S3, then build read-optimized views later.
The Nuance
The "read vs. write" framing is often a false dichotomy. Most real systems need both fast reads and fast writes. The question is not "pick one" but "how do I split the read path from the write path so each can be optimized independently?"
CQRS: separate the read and write models
Command Query Responsibility Segregation (CQRS) is the pattern that resolves this tension. Writes go to a write-optimized store (normalized, minimal indexes, append-only). Reads go to a separate read-optimized store (denormalized, heavily indexed, cached). An asynchronous sync mechanism keeps the read model updated.
The write side uses a normalized PostgreSQL schema with minimal indexes. The read side uses a denormalized schema (or even a different database like Elasticsearch) with aggressive indexing and caching. The Kafka-based sync introduces a staleness window (typically 100ms to a few seconds), but for most use cases this is acceptable.
I have seen CQRS reduce write latency by 3-5x (fewer indexes on the write path) while simultaneously improving read latency by 10x (denormalized read model + cache). The cost is operational complexity: you maintain two data stores and a sync pipeline.
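The split can be sketched in-process: commands hit a minimal write store and emit events, and a separate projector builds the denormalized read model asynchronously (a toy sketch with an in-memory queue standing in for Kafka/CDC; field names are hypothetical):

```python
from collections import deque

write_store = []        # normalized, append-friendly write model
event_bus = deque()     # stands in for the Kafka / CDC stream
read_model = {}         # denormalized view, keyed for fast lookup

def handle_command(cmd):
    """Write path: persist minimally, emit an event, return fast."""
    write_store.append(cmd)
    event_bus.append(cmd)

def run_projector():
    """Async consumer: builds the read-optimized view; may lag behind writes."""
    while event_bus:
        evt = event_bus.popleft()
        key = evt["product_id"]
        view = read_model.setdefault(key, {"product_id": key, "events": 0})
        view.update(evt)            # denormalize latest state into the view
        view["events"] += 1

handle_command({"product_id": "p1", "price": 10})
handle_command({"product_id": "p1", "price": 12})
# Before the projector runs, the read model is stale (empty).
run_projector()
print(read_model["p1"]["price"])  # 12
```

The gap between `handle_command` returning and `run_projector` catching up is exactly the staleness window the paragraph above describes.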
Read/write ratio analysis
Before optimizing, measure your actual ratio:
Read/write ratio estimation (PostgreSQL):
-- Check reads (scan counts approximate read traffic; use
-- seq_tup_read + idx_tup_fetch for row-level counts):
SELECT seq_scan + idx_scan AS total_reads
FROM pg_stat_user_tables WHERE relname = 'products';
-- Check writes:
SELECT n_tup_ins + n_tup_upd + n_tup_del AS total_writes
FROM pg_stat_user_tables WHERE relname = 'products';
Common ratios:
Product catalog: 1000:1 reads:writes -> optimize reads aggressively
User sessions: 100:1 -> cache + replicas
Order processing: 10:1 -> balanced, lean toward reads
Event ingestion: 1:100 -> optimize writes, build read models later
Audit log: 1:1000 -> pure write optimization
The rule of thumb: if your ratio exceeds 10:1 in either direction, optimize for the dominant side. If it is between 1:10 and 10:1, consider CQRS to split the paths.
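The rule of thumb can be captured in a small helper (the thresholds are the article's; the function name and return strings are hypothetical):

```python
def strategy(reads, writes):
    """Pick an optimization strategy from observed read/write counts."""
    if writes == 0 or reads / writes >= 10:
        return "optimize reads: indexes -> cache -> replicas -> denormalize"
    if reads == 0 or writes / reads >= 10:
        return "optimize writes: LSM/append-only, minimal indexes, batching"
    return "balanced: consider CQRS to split read and write paths"

print(strategy(100_000, 100))   # read-heavy, 1000:1 -> optimize reads
print(strategy(50, 5_000))      # write-heavy, 1:100 -> optimize writes
print(strategy(300, 100))       # 3:1 -> balanced, consider CQRS
```

Feed it the totals from the pg_stat queries above, measured over the same time window, so the ratio reflects real traffic rather than a guess.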
Real-World Examples
Facebook TAO (read-optimized, extreme). TAO (The Associations and Objects store) is Facebook's read-optimized graph cache. It serves billions of reads per second for the social graph (friends, likes, comments). Writes go to a MySQL primary, which replicates to thousands of TAO cache servers storing denormalized graph data. TAO accepts 1-2 second replication lag because social graph reads tolerate eventual consistency. The write path is intentionally simple (normalized MySQL) while the read path is massively scaled across multiple datacenters.
Cassandra (write-optimized, LSM-based). Cassandra uses LSM trees and was designed for write-heavy workloads (originally built for Facebook's inbox search). It achieves millions of writes per second via commit log + memtable only, no random I/O, minimal indexing (partition key only, secondary indexes are expensive and discouraged), and background compaction. The trade-off: reads on non-partition-key columns are slow. Cassandra requires you to design your table schema around read patterns (one denormalized table per query), making the application layer responsible for read optimization.
Uber Schemaless (CQRS-inspired). Uber built Schemaless on top of MySQL for their write-heavy ride data (trip events, GPS pings, payment events). The write path appends immutable cells optimized for high throughput. Read-optimized views are built asynchronously via triggers that populate secondary indexes and materialized views. This separation allowed Uber to scale write throughput to millions of events per second while serving reads from purpose-built views with sub-10ms latency.
How This Shows Up in Interviews
This trade-off appears in nearly every system design interview. Whenever you choose a database, add an index, introduce caching, or discuss data modeling, you are implicitly making a read-vs-write optimization decision. The interviewer wants to see that you make this choice deliberately.
What they are testing: Do you understand why adding an index speeds up reads but slows writes? Can you identify whether a workload is read-heavy or write-heavy and choose the appropriate optimization? Do you know when CQRS is justified and when it is overkill?
Depth expected at senior level:
- Explain B-tree vs LSM tree trade-offs with specific numbers (write amplification, read amplification)
- Know that over-indexing is a real production problem, not just a textbook concern
- Articulate CQRS as the resolution when both read and write performance matter
- Understand Bloom filters and why they make LSM reads practical
- Reference specific databases and their optimization leanings (PostgreSQL = B-tree/read, Cassandra = LSM/write)
| Interviewer asks | Strong answer |
|---|---|
| "How would you optimize this slow read query?" | "First, check the read/write ratio. If it is 100:1, I would add a composite index, then a cache layer, then read replicas. If closer to 1:1, I would consider CQRS to avoid slowing writes with more indexes." |
| "Why not just add more indexes?" | "Each index is a write-time cost. With 10 indexes, every INSERT updates 10 B-trees. At high write throughput, index maintenance becomes the bottleneck. Audit existing indexes with pg_stat_user_indexes first." |
| "This system ingests 100K events/sec. How would you store them?" | "Write-optimized: append to a WAL-backed LSM store like Cassandra or ClickHouse. Minimal indexes on the write path. Build read-optimized views asynchronously via CDC or materialized views." |
| "What is CQRS and when would you use it?" | "Separate write model from read model. Writes go to a normalized store with minimal indexes. Reads go to a denormalized store with heavy indexing and caching. Sync via Kafka/CDC. Justified when both paths are critical and 100ms-5s staleness is acceptable." |
Interview tip: state the read/write ratio explicitly
When designing any data layer, say: "The read/write ratio here is approximately X:1, so I will optimize for [reads/writes]." This shows you make data structure choices based on workload analysis, not gut feeling.
Quick Recap
- Read-optimized structures (B-trees, indexes, materialized views, caches, read replicas) make reads fast at the cost of write overhead. Every index added is a B-tree that every write must update.
- Write-optimized structures (LSM trees, WALs, append-only logs, minimal indexing, batching) make writes fast at the cost of read complexity. Reads must check multiple levels and merge results.
- B-trees keep data sorted in-place for fast reads with random I/O writes. LSM trees buffer writes sequentially, deferring sort cost to compaction.
- Bloom filters make LSM reads practical by skipping SSTables that definitely do not contain the target key, cutting read amplification from one check per SSTable per level down to 1-2 disk reads for most point lookups.
- CQRS resolves the tension by splitting the write path (normalized, minimal indexes) from the read path (denormalized, heavily indexed, cached), synced via CDC or event streaming.
- In interviews, always state the read/write ratio before choosing an optimization strategy. A 100:1 ratio means read optimization pays 100x more per unit of effort.
Related Trade-offs
- Normalization vs. denormalization for the data modeling side of this trade-off
- SQL vs. NoSQL for how database engine choice reflects read/write optimization philosophy
- Caching vs. freshness for the read-optimization technique that introduces staleness
- Read replicas vs. caching for comparing two specific read optimization techniques
- CQRS for the full pattern that separates read and write paths