Segment trees
How segment trees enable efficient range queries and range updates in O(log N), the data structure behind real-time leaderboards, financial analytics, and database range aggregations.
The problem
You are building a real-time gaming leaderboard. One million players have scores that change constantly (point updates), and your API needs to answer "how many players have scores between 500 and 1000?" (range queries) in under 10 milliseconds.
With a raw array, answering a range query requires scanning every element in the range. For a million elements, that is O(N) per query. At 10,000 queries per second, you are doing 10 billion comparisons per second. The leaderboard page takes 3 seconds to load. Response times degrade linearly as the player base grows.
You could precompute prefix sums for O(1) range queries. But when a player's score changes, you must rebuild the entire prefix sum array, which is O(N). With 50,000 score updates per second, the prefix sum array is permanently stale. By the time you finish rebuilding, thousands of new updates have arrived.
This is the fundamental tension: arrays are fast to update but slow to query. Prefix sums are fast to query but slow to update. You need a structure that gives O(log N) for both operations simultaneously.
| Approach | Query | Update |
|---|---|---|
| Raw array | O(N) scan | O(1) |
| Prefix sum array | O(1) | O(N) rebuild |
| Segment tree | O(log N) | O(log N) |
When both queries and updates happen at scale, neither extreme works. Segment trees handle both in O(log N).
What it is
A segment tree is a binary tree where each node stores a precomputed aggregate (sum, min, max, count) over a contiguous segment of an array. The root covers the entire array, and each leaf covers a single element.
Think of a company org chart for budget tracking. The CEO node holds the total budget of the entire company. Each VP node holds the total for their division. Each manager node holds the total for their team. Each individual contributor is a leaf with their own budget. To find the total budget of divisions 3 through 5, you sum three VP nodes instead of iterating through hundreds of employees. When one employee's budget changes, you update their leaf and recalculate upward through their manager, VP, and CEO, touching only log(N) nodes.
Interview tip: the one-liner
"A segment tree precomputes aggregates over every possible range in O(N) space, then answers any range query in O(log N) by combining at most 2 * log(N) precomputed nodes, and handles point updates in O(log N) by recomputing ancestors."
How it works
A segment tree is a complete binary tree. Each node stores the aggregate of a range of the original array. The root covers the full array; leaves cover individual elements.
For N elements, the tree has at most 4N nodes. Storage is O(N). The tree is stored as a flat array where node i has children at 2i+1 and 2i+2. No pointers needed.
class SegmentTree {
private tree: number[];
private n: number;
constructor(arr: number[]) {
this.n = arr.length;
this.tree = new Array(4 * this.n).fill(0);
this.build(arr, 0, 0, this.n - 1);
}
private build(arr: number[], node: number, start: number, end: number): void {
if (start === end) {
this.tree[node] = arr[start];
return;
}
const mid = Math.floor((start + end) / 2);
const left = 2 * node + 1;
const right = 2 * node + 2;
this.build(arr, left, start, mid);
this.build(arr, right, mid + 1, end);
this.tree[node] = this.tree[left] + this.tree[right]; // merge: sum
}
}
Build time: O(N). Each element is visited once. The recursive build visits every node exactly once, and each node's computation (merging two children) is O(1).
Array-based storage
The tree is stored as a flat array rather than with node pointers. Node at index i has its left child at 2i + 1 and right child at 2i + 2. This eliminates pointer overhead and allows single-block allocation. Why 4N nodes? For power-of-2 input sizes, the tree has exactly 2N - 1 nodes. For non-power-of-2 sizes, the last level may be sparse, requiring up to 4N slots in the worst case.
Range query
A query for sum([L, R]) descends the tree, cutting off branches that lie entirely outside [L, R] and using precomputed sums for branches entirely inside.
The query decomposes [2, 5] into two precomputed nodes: sum[2,3] = 5 and sum[4,5] = 14. Answer: 19. Only 7 nodes were visited instead of scanning 4 array elements. At scale, this is O(log N) versus O(N).
query(queryLeft: number, queryRight: number): number {
return this._query(0, 0, this.n - 1, queryLeft, queryRight);
}
private _query(
node: number, start: number, end: number,
qLeft: number, qRight: number
): number {
// No overlap
if (qRight < start || end < qLeft) return 0; // identity for sum
// Total overlap: use precomputed value
if (qLeft <= start && end <= qRight) return this.tree[node];
// Partial overlap: recurse into both children
const mid = Math.floor((start + end) / 2);
const leftSum = this._query(2 * node + 1, start, mid, qLeft, qRight);
const rightSum = this._query(2 * node + 2, mid + 1, end, qLeft, qRight);
return leftSum + rightSum;
}
At each level, at most 4 nodes are visited (2 partial overlaps, potentially 2 full overlaps that terminate). At most 2 nodes per level create "partial overlap" recursive calls, and the rest terminate early. Result: O(log N).
The three cases for each node during a query:
- No overlap (query range is entirely outside the node's range): return the identity element immediately. Zero work.
- Total overlap (node's range is entirely inside the query range): return the precomputed value immediately. Zero recursion.
- Partial overlap (query range partially covers the node's range): recurse into both children and merge results.
Point update
Change arr[i] = newValue. Walk from root to the leaf at index i, updating every ancestor.
The update touches exactly log(N) nodes: one per tree level. Each node recomputation is O(1) (summing two children), so the total is O(log N).
Continue Reading with Premium
Unlock this article and every other in-depth system design guide on the platform with NotesFromSDE Premium.