PostgreSQL query planner: statistics, cost model, and EXPLAIN
How PostgreSQL's query planner chooses execution plans using column statistics and a cost model. What EXPLAIN ANALYZE output means, how to identify bad plans, and how to fix them without rewriting queries.
The problem
Your e-commerce platform runs a query that joins orders, users, and products to generate a daily sales report. In development with 10,000 rows, it takes 12ms. In production with 50 million orders, the same query takes 47 seconds.
You add an index on orders.user_id. The query still takes 47 seconds. You check the execution plan with EXPLAIN ANALYZE and discover the planner is ignoring your index entirely. It chose a sequential scan across the full orders table, then a nested loop join against users, scanning all 2 million user rows for every batch of orders.
The index exists. The query is correct. The problem is not your SQL. The problem is the planner's decision: it estimated the join would return 12 rows (based on stale statistics from last month's data) when the real result set is 380,000 rows. With that wrong estimate, nested loop looked cheaper than hash join. The plan it chose is catastrophically wrong.
This is the problem the query planner solves, and the problem you face when the planner gets it wrong. PostgreSQL's query planner is a cost-based optimizer that evaluates multiple execution strategies and picks the cheapest. When its cost estimates are accurate, it makes excellent choices. When the estimates are wrong (stale statistics, skewed data, correlated columns), the plan can be orders of magnitude slower than optimal.
What it is
The PostgreSQL query planner is a cost-based optimizer that translates a declarative SQL query into a physical execution plan: a tree of operators (scans, joins, sorts, aggregates) that the executor runs to produce results. For each query, the planner generates candidate plans, estimates their cost using stored statistics about the data, and picks the cheapest one.
Analogy: Think of a GPS navigation system. You type in a destination (the SQL query). The GPS considers multiple routes: highway (fast but longer distance), side streets (shorter but more stops), toll road (expensive but avoids traffic). It picks the route with the lowest estimated travel time based on current traffic data. If the traffic data is stale (it thinks the highway is clear when it is actually jammed), it sends you the wrong way. The PostgreSQL planner works the same way: good statistics produce good plans, stale statistics produce catastrophic plans.
Optimizer vs. planner
PostgreSQL documentation uses "planner" and "optimizer" interchangeably. Other databases (MySQL, Oracle, SQL Server) call the same component the "query optimizer." The function is identical: translate SQL into the cheapest physical execution plan.
How it works
The planner transforms a SQL query into an execution plan through four stages: parsing, rewriting, planning, and execution. The planning stage is where all the interesting decisions happen.
The planner's core loop works by dynamic programming. For each table in the query, it generates all possible access paths (sequential scan, each available index scan, bitmap scan). For each pair of tables, it considers all join strategies. It builds the plan bottom-up, caching the cheapest sub-plan at each step.
function plan_query(tables, joins, filters):
    // Step 1: generate scan paths for each table
    for each table in tables:
        paths[table] = []
        paths[table].add(seq_scan(table))
        for each index on table:
            if index matches any filter on table:
                paths[table].add(index_scan(table, index))
        if multiple index conditions exist:
            paths[table].add(bitmap_scan(table, indexes))
        paths[table] = cheapest(paths[table])

    // Step 2: generate join paths for each subset of tables,
    // building larger joins from cached smaller ones
    for join_size = 2 to len(tables):
        for each subset S of tables with size join_size:
            for each way to split S into (left, right):
                for join_method in [nested_loop, hash_join, merge_join]:
                    cost = estimate_cost(paths[left], paths[right], join_method)
                    paths[S] = keep_cheapest(paths[S], cost)

    // Step 3: add sort, aggregate, limit on top
    plan = paths[all_tables]
    if ORDER BY: plan = add_sort(plan)
    if GROUP BY: plan = add_aggregate(plan)
    return plan
For queries involving more than geqo_threshold tables (default: 12), PostgreSQL switches from exhaustive dynamic programming to a Genetic Query Optimizer (GEQO) that uses randomized search. This is because the number of possible join orderings grows factorially: 12 tables produce 479 million possible orderings.
The cost model: sequential scan vs index scan
Every plan node gets a cost estimate. The cost formula combines I/O cost (pages read from disk) and CPU cost (rows processed). The cost units are arbitrary but consistent, anchored to seq_page_cost = 1.0.
Key cost parameters and their defaults:
| Parameter | Default | What it controls |
|---|---|---|
| seq_page_cost | 1.0 | Cost of reading one page sequentially (baseline) |
| random_page_cost | 4.0 | Cost of reading one page randomly (index lookups) |
| cpu_tuple_cost | 0.01 | CPU cost of processing one row |
| cpu_index_tuple_cost | 0.005 | CPU cost of processing one index entry |
| cpu_operator_cost | 0.0025 | CPU cost of one operator evaluation (comparison, function) |
| effective_cache_size | 4GB | Planner's estimate of available OS + PostgreSQL cache |
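These are ordinary server settings, so you can check the values in effect on a given instance. A quick inspection query (the setting names are standard; the cost factors are unitless, while effective_cache_size is reported in 8kB pages):

```sql
-- Show the planner cost parameters currently in effect
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('seq_page_cost', 'random_page_cost', 'cpu_tuple_cost',
               'cpu_index_tuple_cost', 'cpu_operator_cost',
               'effective_cache_size');
```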
A concrete example: table orders has 1 million rows across 50,000 pages. An index on user_id covers 2,700 index pages.
Sequential scan cost (scanning all rows, filtering to user_id = 123):
I/O: 50,000 pages × 1.0 (seq_page_cost) = 50,000
CPU: 1,000,000 rows × 0.01 (cpu_tuple_cost) = 10,000
Total: 60,000
Index scan cost (looking up user_id = 123, estimated 50 matching rows):
Index I/O: 3 levels × 4.0 (random_page_cost) = 12
Heap I/O: 50 rows × 4.0 (random_page_cost) = 200
CPU: 50 rows × 0.005 (cpu_index_tuple_cost) = 0.25
Total: 212.25
The index scan is roughly 280x cheaper, so the planner picks it. But if random_page_cost were misconfigured to 40.0, the index scan cost would jump to about 2,120 (both the index and heap page reads scale with it), and the planner would start preferring sequential scans at much smaller result sets than it should.
On SSD storage, set random_page_cost to 1.1 (not the default 4.0). SSDs have nearly equal sequential and random read latency. The default of 4.0 was calibrated for spinning disks and causes the planner to under-use indexes on SSD-backed systems. I have seen this single misconfiguration cause 10x slowdowns on production queries.
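A sketch of how to apply the change (ALTER SYSTEM writes to postgresql.auto.conf and needs a config reload to take effect):

```sql
-- Try the new value in one session first and compare EXPLAIN output
SET random_page_cost = 1.1;

-- Then persist it cluster-wide
ALTER SYSTEM SET random_page_cost = 1.1;
SELECT pg_reload_conf();
```

If only some tablespaces are SSD-backed, the setting can also be applied per tablespace with `ALTER TABLESPACE ... SET (random_page_cost = 1.1)`.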
Join algorithms
When a query joins two or more tables, the planner picks a join algorithm for each pair. PostgreSQL supports three algorithms, each optimal for different data shapes.
Nested loop join iterates over every row in the outer table and, for each, scans the inner table for matches. With an index on the inner table's join key, each inner lookup is O(log N). Without an index, it is O(M) per outer row, which is catastrophic for large tables. The planner picks nested loop when it estimates the outer result set is small (tens or hundreds of rows).
Hash join builds an in-memory hash table from the smaller relation, then probes it with each row from the larger relation. The cost is O(N + M) after the hash build phase. Hash joins require enough work_mem to hold the hash table. If the inner relation exceeds work_mem, PostgreSQL spills to disk in batches, significantly increasing I/O.
Merge join sorts both inputs on the join key, then walks them in a single pass. If either input is already sorted (from an index scan or a prior sort step), the sort phase is free. Merge join handles range joins better than hash join and is most efficient when both sides are large and pre-sorted.
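One way to confirm a suspected join-choice problem is to disable an algorithm for a single session and compare plans. The enable_* settings are real planner GUCs; the query below reuses the orders/users tables from the running example (the selected columns are illustrative):

```sql
-- Disable nested loops for this session only, then compare plan and
-- runtime against the original. Never change enable_* settings globally.
SET enable_nestloop = off;

EXPLAIN (ANALYZE)
SELECT o.user_id, u.email
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.status = 'pending';

RESET enable_nestloop;
```

If the hash join plan is dramatically faster, the underlying problem is usually a row estimate, not the join code itself.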
Interview tip: join selection
When an interviewer asks why a query is slow, check the join algorithm first. If you see a nested loop on a large outer set with no index on the inner, that is almost always the problem. The fix is either adding an index (so nested loop becomes cheap) or ensuring the statistics are accurate (so the planner picks hash join instead).
Statistics and pg_stats
The planner's cost estimates are only as good as its statistics. PostgreSQL collects per-column statistics via the ANALYZE command and stores them in pg_statistic (viewable through pg_stats).
| Statistic | What it stores | How the planner uses it |
|---|---|---|
| n_distinct | Number of distinct values. Positive = exact count, negative = fraction of rows (-0.5 means 50% distinct) | Estimates selectivity of equality filters. WHERE status = 'active' returns ~1/n_distinct of rows |
| most_common_vals / most_common_freqs | Top N most frequent values and their frequencies | For skewed columns, uses exact frequency instead of assuming uniform distribution |
| histogram_bounds | Array of values dividing the non-MCV range into equal-frequency buckets | Estimates range filter selectivity: WHERE price > 100 |
| correlation | How closely physical row order matches logical order (-1 to 1) | High correlation means index scans read sequential pages (cheap). Low correlation means random I/O |
| null_frac | Fraction of rows that are NULL | Adjusts row estimates for IS NULL / IS NOT NULL filters |
The default statistics target is 100 (100 most common values, 100 histogram buckets). For columns with high cardinality or heavy skew, increase it:
-- Increase statistics granularity for a skewed column
ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
-- Refresh statistics
ANALYZE orders;
-- View the statistics
SELECT attname, n_distinct, most_common_vals, most_common_freqs, correlation
FROM pg_stats
WHERE tablename = 'orders' AND attname = 'status';
Statistics staleness is the most common cause of bad plans. After bulk loads, large deletes, or data migrations, statistics may not reflect the current distribution. Autovacuum refreshes statistics automatically, but it triggers based on row-change thresholds. A table with 100 million rows needs 10 million changes (default autovacuum_analyze_scale_factor = 0.1) before auto-analyze runs. In high-write workloads, this lag causes the planner to make decisions based on old data.
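To check for staleness and tighten the trigger, the statements below use the standard pg_stat_user_tables view (the n_mod_since_analyze column exists since PostgreSQL 9.4) and a per-table storage parameter; the table name comes from the running example:

```sql
-- When were statistics last refreshed, and how many rows changed since?
SELECT relname, last_analyze, last_autoanalyze, n_mod_since_analyze
FROM pg_stat_user_tables
WHERE relname = 'orders';

-- Re-analyze after ~1% of rows change instead of the default 10%
ALTER TABLE orders SET (autovacuum_analyze_scale_factor = 0.01);
```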
Correlated columns break the planner's independence assumption. If city = 'Seattle' and state = 'WA' always appear together, the planner estimates the combined selectivity as P(city) × P(state), which is far too low. PostgreSQL 10+ has extended statistics (CREATE STATISTICS) for this, but you must create them explicitly.
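A minimal sketch for the city/state example, assuming a hypothetical addresses table with those two columns:

```sql
-- Tell the planner that city and state are functionally dependent
CREATE STATISTICS city_state_dep (dependencies) ON city, state FROM addresses;
ANALYZE addresses;  -- extended statistics are only populated by ANALYZE

-- Inspect what was collected (pg_stats_ext view, PostgreSQL 12+)
SELECT statistics_name, dependencies
FROM pg_stats_ext
WHERE tablename = 'addresses';
```

After this, a filter on both columns uses the measured dependency instead of multiplying the two selectivities.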
Reading EXPLAIN ANALYZE
EXPLAIN ANALYZE runs the query and reports both estimated and actual costs for every plan node. This is your primary tool for diagnosing planner problems.
EXPLAIN (ANALYZE, BUFFERS, FORMAT TEXT)
SELECT o.*, u.email
FROM orders o
JOIN users u ON o.user_id = u.id
WHERE o.status = 'pending'
AND o.created_at > '2026-01-01';
Hash Join  (cost=4521.00..12987.45 rows=3200 width=96) (actual time=45.2..187.3 rows=128450 loops=1)
  Hash Cond: (o.user_id = u.id)
  Buffers: shared hit=8921 read=2341
  ->  Bitmap Heap Scan on orders o  (cost=112.45..8234.12 rows=3200 width=48) (actual time=2.1..132.4 rows=128450 loops=1)
        Recheck Cond: (status = 'pending')
        Filter: (created_at > '2026-01-01')
        Rows Removed by Filter: 4521
        Buffers: shared hit=6234 read=2341
        ->  Bitmap Index Scan on orders_status_idx  (cost=0.00..111.65 rows=5400 width=0) (actual time=1.8..1.8 rows=132971 loops=1)
  ->  Hash  (cost=3012.00..3012.00 rows=200000 width=52) (actual time=42.8..42.8 rows=200000 loops=1)
        Buckets: 262144  Batches: 1  Memory Usage: 15234kB
        Buffers: shared hit=2687
        ->  Seq Scan on users u  (cost=0.00..3012.00 rows=200000 width=52) (actual time=0.01..18.2 rows=200000 loops=1)
              Buffers: shared hit=2687
The critical diagnostic signal: estimated rows vs. actual rows. In the bitmap heap scan above, the planner estimated 3,200 rows but got 128,450. That is a 40x underestimate. This means the planner under-allocated resources and may have chosen the wrong join algorithm entirely.
| Field | What it means |
|---|---|
| cost=X..Y | Estimated startup cost .. total cost (arbitrary units, anchored to seq_page_cost = 1.0) |
| rows=N | Estimated output rows from statistics |
| actual time=X..Y | Real wall-clock time in ms (startup .. total) |
| actual rows=M | Real output row count |
| loops=N | How many times this node executed (inner side of nested loop) |
| Buffers: shared hit=X read=Y | Pages served from cache vs. read from disk |
| Rows Removed by Filter | Rows that passed the index condition but failed the WHERE |
When you see a large mismatch between rows and actual rows, that is the root cause. Run ANALYZE on the affected tables. If the mismatch persists, increase the statistics target or create extended statistics for correlated columns.
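In production you rarely get to run EXPLAIN ANALYZE by hand at the moment a query misbehaves. The auto_explain contrib module logs the plan of any query that exceeds a duration threshold; a sketch of a session-level setup (it is more commonly preloaded via shared_preload_libraries):

```sql
LOAD 'auto_explain';
SET auto_explain.log_min_duration = '500ms';  -- log plans of queries slower than this
SET auto_explain.log_analyze = on;            -- include actual rows/times (adds overhead)
SET auto_explain.log_buffers = on;            -- include buffer usage
```

With log_analyze on, the server logs contain the same estimated-vs-actual row counts you would get interactively.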
Production usage
| System | Query planner type | Notable behavior |
|---|---|---|
| PostgreSQL | Cost-based optimizer with dynamic programming. GEQO for 12+ table joins | Uses column statistics (MCV, histograms, correlation). Extended statistics for correlated columns since v10 |
| MySQL (InnoDB) | Cost-based optimizer with greedy heuristics | Simpler statistics model. No histogram support before MySQL 8.0. Often requires optimizer hints for complex joins |
| CockroachDB | Distributed cost-based optimizer inspired by PostgreSQL | Extends the model with network transfer cost. Statistics collection is automatic across distributed nodes |
| Oracle | Cost-based optimizer with adaptive plans | Can switch strategy mid-execution if initial estimates are wrong. Most mature commercial optimizer |
| SQL Server | Cost-based optimizer with adaptive query processing | Memory grant feedback adjusts allocations across executions. Interleaved execution re-estimates mid-plan |
| SQLite | Simple cost-based planner | No histograms. Uses sqlite_stat1 for basic row counts. Adequate for single-node, small-to-medium workloads |
Limitations and when NOT to use it
- The planner assumes column values are independent. Correlated filters (city = 'Seattle' AND state = 'WA') produce wildly wrong selectivity estimates without explicitly created extended statistics.
- Statistics are point-in-time snapshots. Between ANALYZE runs, the planner works with stale data. High-write workloads with skewed distributions suffer most from this lag.
- The planner cannot see runtime parameter values in prepared statements. After 5 executions, PostgreSQL may switch to a generic plan that ignores parameter values. Use plan_cache_mode = force_custom_plan if this is a problem.
- For queries joining more than 12 tables, the genetic optimizer (GEQO) may not find the optimal join order. Break such queries into CTEs or materialized subqueries.
- The cost model does not account for CPU cache effects, NUMA topology, or concurrent query load. Two queries competing for the buffer pool are planned in isolation.
- User-defined functions are treated as black boxes with a flat cost. A function that makes a network call or runs a sub-query will be wildly undercosted.
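Two of these limitations have direct knobs. A sketch, assuming PostgreSQL 12+ for plan_cache_mode:

```sql
-- Avoid the generic-plan trap for prepared statements (PostgreSQL 12+)
SET plan_cache_mode = force_custom_plan;

-- Use exhaustive join planning for up to 13 tables instead of GEQO.
-- Planning time grows quickly with table count; raise with care.
SET geqo_threshold = 14;
```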
Interview cheat sheet
- When asked about slow database queries, start with EXPLAIN ANALYZE. The gap between estimated rows and actual rows is the single most diagnostic signal. If the estimate is off by 10x+, the plan is almost certainly suboptimal.
- When discussing index selection, mention that the planner compares sequential scan cost against index scan cost using random_page_cost. On SSDs, set it to 1.1 to make index scans competitive.
- When a join is slow, identify the algorithm. Nested loop with a large outer set and no inner index is the most common performance killer. Hash join is O(N+M) but needs enough work_mem.
- When asked about PostgreSQL tuning, mention pg_stats and the statistics target. The default is 100 histogram buckets; skewed columns need 500+.
- When discussing prepared statements, note the generic plan trap. After 5 executions, PostgreSQL may switch to a generic plan that ignores parameter values.
- When asked about multi-table joins, mention GEQO. Beyond 12 tables, the planner uses a genetic algorithm. Breaking into CTEs can force join order.
- When discussing correlated columns, mention extended statistics (CREATE STATISTICS). Without them, the planner multiplies selectivities independently, underestimating by orders of magnitude.
- When asked about query plan caching, note that PostgreSQL re-plans simple queries every execution (no persistent plan cache). Prepared statements cache plans but risk generic-plan degradation.
Quick recap
- PostgreSQL's query planner is a cost-based optimizer that generates candidate execution plans and picks the cheapest one based on column statistics (distinct counts, histograms, most common values) stored in pg_statistic.
- The cost model estimates I/O and CPU costs per plan node, anchored to seq_page_cost = 1.0. Set random_page_cost = 1.1 on SSDs, as the default 4.0 was calibrated for spinning disks.
- Three join algorithms (nested loop, hash join, merge join) are selected based on estimated row counts and available indexes. Nested loop with a large outer set and no inner index is the most common performance problem.
- Statistics staleness is the primary cause of bad plans. When estimated rows in EXPLAIN ANALYZE differ from actual rows by 10x+, run ANALYZE and increase the statistics target for skewed columns.
- For queries joining more than 12 tables, PostgreSQL switches to a genetic optimizer (GEQO) that may miss the optimal join order. Break complex queries into CTEs or materialized subqueries.
- Extended statistics (CREATE STATISTICS) fix the planner's independence assumption for correlated columns, which otherwise causes order-of-magnitude selectivity underestimates.
Related concepts
- Databases - How PostgreSQL's storage engine and buffer pool interact with the query planner's I/O cost model.
- Indexing strategies - B-tree, hash, GIN, and GiST indexes that the planner evaluates as alternatives to sequential scans.
- B-tree indexes - The internal data structure behind most PostgreSQL indexes, and how the planner estimates B-tree traversal costs.