Over-indexing anti-pattern
Learn why adding too many indexes degrades write performance, explodes storage, and slows query planning, and how to choose only the indexes your queries actually need.
TL;DR
- Every index you add costs a write amplification tax: for every INSERT, UPDATE, or DELETE, Postgres must also update all indexes on that table. A table with 12 indexes incurs 12 index writes per row insertion.
- Indexes are not free storage. Each B-tree index is a full copy of the indexed columns, ordered and balanced. A table with many indexes can be double or triple its base table size.
- The query planner has to choose between available indexes at query time. Too many indexes can cause the planner to make suboptimal choices, or spend extra time evaluating options.
- The discipline: add indexes that are directly justified by a specific slow query with EXPLAIN ANALYZE evidence. Do not add indexes "just in case."
- The fix: audit with
pg_stat_user_indexes, drop indexes with 0 scans, consolidate single-column indexes into composites, and use partial indexes for hot subsets.
The Problem
It's 3 a.m. and your on-call engineer gets paged. The order processing pipeline is backing up. Write latency on the orders table has crept from 2ms to 45ms over the past month. Nobody noticed because reads were still fast. The culprit? Twelve indexes on a single table, most added "just in case" during code reviews over the past year.
A developer reviewing a schema adds "helpful" indexes to a large orders table:
CREATE INDEX idx_orders_user_id ON orders (user_id);
CREATE INDEX idx_orders_status ON orders (status);
CREATE INDEX idx_orders_created_at ON orders (created_at);
CREATE INDEX idx_orders_updated_at ON orders (updated_at);
CREATE INDEX idx_orders_amount ON orders (amount);
CREATE INDEX idx_orders_user_status ON orders (user_id, status);
CREATE INDEX idx_orders_status_created ON orders (status, created_at);
CREATE INDEX idx_orders_user_created ON orders (user_id, created_at);
-- ... 5 more indexes for other query shapes
The table has 50 million rows. Each order insertion now updates 13 indexes. What was a 1ms primary key insert becomes a 12ms multi-index update. Write throughput drops by 10x. WAL log size explodes, causing replication lag. VACUUM takes longer because index entries must be cleaned too. A hot promotion flow that processes 5,000 orders/second now struggles to keep up with 500/second.
I've seen this exact failure at two different companies. Both times, the fix was the same: audit every index, drop the ones without a matching query, and consolidate overlapping indexes into composites.
Here's the before-and-after from one of those incidents:
| Metric | Before (13 indexes) | After (4 indexes) |
|---|---|---|
| Insert latency (p99) | 45ms | 4ms |
| WAL generation | 120 MB/min | 35 MB/min |
| Replication lag | 8 seconds | < 1 second |
| VACUUM duration | 4 hours | 25 minutes |
| Table + index size | 180 GB | 65 GB |
Write amplification per index
For a B-tree index:
- Every INSERT inserts a key into the B-tree: O(log n) per index
- Every DELETE removes the key from the B-tree: O(log n) per index
- Every UPDATE on an indexed column deletes the old key and inserts the new key: O(log n) per index, twice
At 50M rows, each B-tree operation is logβ(50,000,000) β 26 comparisons. With 13 indexes: 338 comparisons per write. The page cache must hold 13Γ more index pages. Dirty pages from all 13 index structures compete for the shared buffer pool.
The math is simple: if you have N indexes, your write cost is roughly N times what it would be with just the primary key. For most OLTP tables, 3-5 well-chosen indexes is the sweet spot. Beyond that, you need a very good reason.
Why It Happens
Every one of these decisions sounds reasonable in isolation:
- A developer adds an index during a code review because "we might query on this column someday." It takes 10 seconds to write the migration.
- The DBA adds indexes during a performance incident to fix a slow query. The crisis passes, but the indexes stay forever. Nobody wants to drop an index that "fixed" something.
- An ORM auto-generates indexes for every foreign key, and nobody reviews the migration output. I've seen Django and Rails projects with 2-3 auto-generated indexes per model relationship.
- A team copies the indexing strategy from a read-heavy OLAP workload onto a write-heavy OLTP table. Different workloads need different indexing strategies.
- Someone reads a "SQL performance tips" blog post that says "always index your WHERE columns" and applies it literally to every column.
Continue Reading with Premium
Unlock this article and every other in-depth system design guide on the platform with NotesFromSDE Premium.