Connection pooling
Learn why database connection overhead makes per-request connections unusable at scale, how pool sizing directly impacts throughput, and when pool exhaustion cascades into a full outage.
The problem
Your application server makes a database call on every request. The naive approach: open a connection at request start, run the query, close the connection at request end.
On PostgreSQL, establishing a new connection is not cheap. The database forks a new backend process for each connection. The fork + TLS handshake + authentication sequence takes 20-50ms. For a request that takes 10ms to serve, you're spending 3-5x the request duration just establishing the connection.
At low concurrency with a fast database, this is invisible. At 1,000 requests per second, you're opening and closing 1,000 connections per second. Your database is spending more time managing connections than answering queries.
What a connection pool is
A connection pool is a set of pre-established, reusable database connections maintained by the application server. Instead of opening a new connection per request, the application borrows a connection from the pool, runs the query, and returns the connection. The connection itself is never closed between requests.
Analogy: A taxi rank at an airport. Instead of building a new taxi for each passenger and scrapping it after the ride, you maintain a fleet of taxis that wait, pick up passengers, and return to the rank. The taxi (connection) is reused. The setup cost (manufacturing the taxi, training a driver) is paid once.
How pool sizing affects performance
Pool sizing is where engineers most commonly go wrong. The intuition says: more connections = more parallelism = faster. This is wrong past a critical threshold.
Each PostgreSQL backend process consumes:
- ~5-10 MB of RAM just for the process overhead
- A file descriptor slot
- CPU context-switch overhead when the scheduler rotates through active backends
PostgreSQL's throughput peaks at a pool size well below the maximum allowed connections. A common empirical finding: on a system with 4 CPU cores, peak throughput is reached at 4-8 active connections. Beyond that, additional concurrent backends compete for the same CPU capacity, and context-switching overhead reduces throughput.
Throughput vs. Pool Size (4-core DB server, pgbench workload):
Pool size: 5 10 20 50 100 200
TPS: 2K 3.8K 4K 3.5K 2.5K 1.5K
Latency: 3ms 5ms 8ms 18ms 50ms 130ms
The peak is somewhere around 2x CPU cores for CPU-bound workloads, and higher for I/O-bound workloads. The right number is empirically determined, not guessed.
max_connections is not your pool target
PostgreSQL's max_connections = 200 is a safety limit, not a performance target. Setting your pool size to 200 because PostgreSQL allows 200 is a common mistake. At 200 concurrent backends on a 4-core machine, throughput collapses from context-switching overhead. Use a pool size based on core count and measure actual throughput.
Pool exhaustion and the cascade failure
When all connections in the pool are in use and a new request arrives, the request waits for a connection to free up. If requests take longer than expected (slow query, lock contention, external service timeout), the pool drains. Requests queue. The queue grows faster than connections are released.
This is pool exhaustion, and it often causes the complete failure most engineers attribute to something else.
The cascade: slow queries hold connections longer. Other requests wait for connections. Wait times exceed timeouts. Errors spike. The service appears down even though the database is responding fine to the 20 active connections.
I've seen engineers interpret this as "the database is down" and restart the database, which makes things worse because reconnection overhead compounds the problem.
Pool configuration parameters
Most pool libraries expose these settings:
| Parameter | What it controls | Wrong default behavior |
|---|---|---|
minConnections | Connections established at startup | 0: first requests after idle period pay connection overhead |
maxConnections | Maximum pool size | Too high: exhausts DB resources; too low: requests queue |
acquireTimeout | Max wait time for a connection | Too long: requests accumulate and memory grows; too short: spurious errors |
idleTimeout | How long an idle connection stays open before closing | Too long: DB resources held unnecessarily; too short: cold starts |
connectionTimeout | Max time to establish a new connection | Should be low (1-3s); a slow DB is better caught fast |
maxLifetime | Maximum connection age before forced replacement | Important for load balancers that kill long-lived connections silently |
// Example: pg-pool (Node.js)
const pool = new Pool({
max: 20, // pool size β set based on DB CPU cores, not max_connections
min: 5, // keep 5 warm connections at all times
idleTimeoutMillis: 30_000, // close connections idle for 30s
connectionTimeoutMillis: 3_000, // fail fast if DB unreachable
maxUses: 7500, // replace a connection after 7500 uses (memory hygiene)
});
PgBouncer: pooling outside the application
Application-side pools (pg-pool, HikariCP, SQLAlchemy) maintain connections per process. A fleet of 50 app servers each with a pool of 20 connections = 1,000 connections visible to PostgreSQL. At large fleet sizes, this overwhelms the database even if individual response times are fine.
Continue Reading with Premium
Unlock this article and every other in-depth system design guide on the platform with NotesFromSDE Premium.