Connection pooling
Learn why database connection overhead makes per-request connections unusable at scale, how pool sizing directly impacts throughput, and when pool exhaustion cascades into a full outage.
The problem
Your application server makes a database call on every request. The naive approach: open a connection at request start, run the query, close the connection at request end.
On PostgreSQL, establishing a new connection is not cheap. The database forks a new backend process for each connection. The fork, TLS handshake, and authentication sequence takes 20-50ms. For a request that takes 10ms to serve, you're spending 2-5x the request duration just establishing the connection.
At low concurrency with a fast database, this is invisible. At 1,000 requests per second, you're opening and closing 1,000 connections per second. Your database is spending more time managing connections than answering queries.
What a connection pool is
A connection pool is a set of pre-established, reusable database connections maintained by the application server. Instead of opening a new connection per request, the application borrows a connection from the pool, runs the query, and returns the connection. The connection itself is never closed between requests.
Analogy: A taxi rank at an airport. Instead of building a new taxi for each passenger and scrapping it after the ride, you maintain a fleet of taxis that wait, pick up passengers, and return to the rank. The taxi (connection) is reused. The setup cost (manufacturing the taxi, training a driver) is paid once.
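To make the borrow/return cycle concrete, here is a minimal sketch of the mechanism; real pools (pg-pool, HikariCP) layer timeouts, health checks, and connection replacement on top, and all names here are illustrative:

```javascript
// Minimal connection pool sketch: pre-create connections, hand them out,
// queue borrowers when none are idle. Illustrative only, not a real library.
class MiniPool {
  constructor(createConn, size) {
    this.idle = Array.from({ length: size }, createConn); // setup cost paid once
    this.waiters = []; // resolve callbacks for requests waiting on a connection
  }
  acquire() {
    if (this.idle.length > 0) return Promise.resolve(this.idle.pop());
    // Pool exhausted: the caller waits until someone calls release().
    return new Promise((resolve) => this.waiters.push(resolve));
  }
  release(conn) {
    const next = this.waiters.shift();
    if (next) next(conn);      // hand the connection straight to a waiter
    else this.idle.push(conn); // otherwise park it as idle
  }
}

// Usage: borrow, query, return. The connection is never closed between requests.
async function handle(pool) {
  const conn = await pool.acquire();
  try {
    return await conn.query('SELECT 1'); // stand-in for real work
  } finally {
    pool.release(conn); // always return it, even when the query throws
  }
}
```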
How pool sizing affects performance
Pool sizing is where engineers most commonly go wrong. The intuition says: more connections = more parallelism = faster. This is wrong past a critical threshold.
Each PostgreSQL backend process consumes:
- ~5-10 MB of RAM just for the process overhead
- A file descriptor slot
- CPU context-switch overhead when the scheduler rotates through active backends
PostgreSQL's throughput peaks at a pool size well below the maximum allowed connections. A common empirical finding: on a system with 4 CPU cores, peak throughput is reached at 4-8 active connections. Beyond that, additional concurrent backends compete for the same CPU capacity, and context-switching overhead reduces throughput.
Throughput vs. pool size (4-core DB server, pgbench workload):

| Pool size | 5 | 10 | 20 | 50 | 100 | 200 |
|---|---|---|---|---|---|---|
| TPS | 2K | 3.8K | 4K | 3.5K | 2.5K | 1.5K |
| Latency | 3ms | 5ms | 8ms | 18ms | 50ms | 130ms |
The peak is somewhere around 2x CPU cores for CPU-bound workloads, and higher for I/O-bound workloads. The right number is empirically determined, not guessed.
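Before measuring, a widely cited starting-point formula (popularized by the HikariCP wiki) combines core count and disk parallelism. Treat its output as the first pool size to benchmark, not the final answer:

```javascript
// Rule-of-thumb starting point for pool sizing, to be validated by measurement.
// effectiveSpindles is roughly the number of disks that can seek independently;
// 1 is a reasonable stand-in for a single SSD-backed volume.
function startingPoolSize(coreCount, effectiveSpindles) {
  return coreCount * 2 + effectiveSpindles;
}

console.log(startingPoolSize(4, 1)); // prints 9 for the 4-core server above
```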
max_connections is not your pool target
PostgreSQL's max_connections = 200 is a safety limit, not a performance target. Setting your pool size to 200 because PostgreSQL allows 200 is a common mistake. At 200 concurrent backends on a 4-core machine, throughput collapses from context-switching overhead. Use a pool size based on core count and measure actual throughput.
Pool exhaustion and the cascade failure
When all connections in the pool are in use and a new request arrives, the request waits for a connection to free up. If requests take longer than expected (slow query, lock contention, external service timeout), the pool drains. Requests queue. The queue grows faster than connections are released.
This is pool exhaustion, and it is often the real cause of outages that engineers attribute to something else.
The cascade: slow queries hold connections longer. Other requests wait for connections. Wait times exceed timeouts. Errors spike. The service appears down even though the database is responding fine to the 20 active connections.
I've seen engineers interpret this as "the database is down" and restart the database, which makes things worse because reconnection overhead compounds the problem.
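A common mitigation is to bound the wait and fail fast, surfacing exhaustion as retryable errors instead of an ever-growing queue. A sketch with illustrative names; the stub pool below simulates total exhaustion:

```javascript
// Sketch: an acquire timeout turns pool exhaustion into a fast, visible error.
// exhaustedPool simulates a pool whose connections never free up.
const exhaustedPool = { acquire: () => new Promise(() => {}) };

function acquireWithTimeout(pool, ms) {
  return Promise.race([
    pool.acquire(),
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error('pool acquire timed out')), ms)
    ),
  ]);
}

// Fail fast with 503 so callers can back off, instead of queueing forever.
async function handleRequest() {
  try {
    const conn = await acquireWithTimeout(exhaustedPool, 50);
    // ... run the query here, then release conn back to the pool
    return { status: 200 };
  } catch {
    return { status: 503, body: 'database busy, retry later' };
  }
}
```

Real pools expose this bound directly as an acquireTimeout-style setting; the point is that the error arrives in 50ms rather than after the queue has consumed all memory.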
Pool configuration parameters
Most pool libraries expose these settings:
| Parameter | What it controls | Wrong default behavior |
|---|---|---|
| minConnections | Connections established at startup | 0: first requests after an idle period pay connection overhead |
| maxConnections | Maximum pool size | Too high: exhausts DB resources; too low: requests queue |
| acquireTimeout | Max wait time for a connection | Too long: requests accumulate and memory grows; too short: spurious errors |
| idleTimeout | How long an idle connection stays open before closing | Too long: DB resources held unnecessarily; too short: cold starts |
| connectionTimeout | Max time to establish a new connection | Should be low (1-3s); a slow DB is better caught fast |
| maxLifetime | Maximum connection age before forced replacement | Important for load balancers that silently kill long-lived connections |
```javascript
// Example: pg-pool (Node.js)
const { Pool } = require('pg');

const pool = new Pool({
  max: 20,                        // pool size: set from DB CPU cores, not max_connections
  min: 5,                         // keep 5 warm connections at all times
  idleTimeoutMillis: 30_000,      // close connections idle for 30s
  connectionTimeoutMillis: 3_000, // fail fast if DB unreachable
  maxUses: 7500,                  // replace a connection after 7500 uses (memory hygiene)
});
```
PgBouncer: pooling outside the application
Application-side pools (pg-pool, HikariCP, SQLAlchemy) maintain connections per process. A fleet of 50 app servers each with a pool of 20 connections = 1,000 connections visible to PostgreSQL. At large fleet sizes, this overwhelms the database even if individual response times are fine.
PgBouncer is a standalone connection pool proxy that sits between your app tier and PostgreSQL. All app servers connect to PgBouncer (which is cheap), and PgBouncer maintains a small, controlled pool of real connections to PostgreSQL.
```
Without PgBouncer:
  50 app servers × 20 connections = 1,000 PostgreSQL backends

With PgBouncer:
  50 app servers × 20 connections → PgBouncer → 50-100 PostgreSQL backends
  PostgreSQL only sees 50-100 processes regardless of app server count
```
PgBouncer transaction mode breaks prepared statements
PgBouncer's transaction pooling mode returns the PostgreSQL connection to the pool after each transaction. This breaks session-level state including prepared statements and advisory locks. Most ORMs and query builders support a flag to disable server-side prepared statements. Toggle this when using PgBouncer transaction mode.
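As an illustration with node-postgres: passing a name in the query config is what creates a server-side prepared statement, so leaving it off keeps queries compatible with transaction pooling (the table and values here are made up):

```javascript
// A named query: node-postgres prepares it server-side and reuses it by name.
// Under PgBouncer transaction pooling, the next transaction may land on a
// different backend where 'fetch-user' was never prepared, and the query fails.
const named = {
  name: 'fetch-user',
  text: 'SELECT * FROM users WHERE id = $1',
  values: [42],
};

// An unnamed parameterized query: still safe from SQL injection, but creates
// no server-side session state, so it works in transaction pooling mode.
const unnamed = {
  text: 'SELECT * FROM users WHERE id = $1',
  values: [42],
};
```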
PgBouncer supports three modes:
- Session pooling: One PostgreSQL connection per client session, held until the client disconnects. Lowest multiplexing ratio, but fully compatible with session-level features. Simple.
- Transaction pooling: A PostgreSQL connection is held for the duration of one transaction, then returned to the pool. The most commonly used mode. Works with most application code but breaks session-level features (prepared statements, advisory locks, SET commands).
- Statement pooling: A connection is returned after each statement. Highest multiplexing ratio, but breaks multi-statement transactions.
Transaction pooling mode is the right default for most web applications.
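A minimal pgbouncer.ini sketch for transaction pooling; the host, ports, and sizes are illustrative:

```ini
[databases]
; clients connect to "appdb" on PgBouncer; it proxies to the real server
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 0.0.0.0
listen_port = 6432
pool_mode = transaction   ; return the server connection after each transaction
default_pool_size = 20    ; real PostgreSQL backends per database/user pair
max_client_conn = 2000    ; application-side connections PgBouncer will accept
```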
Production usage
| System | Pool mechanism | Notable behavior |
|---|---|---|
| PostgreSQL + pg-pool (Node.js) | Application-side pool | Default max: 10; idleTimeoutMillis defaults to 10s, causing cold-start overhead after idle periods |
| PostgreSQL + HikariCP (Java) | Application-side pool | Best-in-class JVM pool; maximumPoolSize = 2x CPU cores is a reasonable starting point; connectionTimeout = 30s default is too high for most web apps |
| PostgreSQL + PgBouncer | Proxy-side pool | Transaction mode breaks prepared statements; use server_reset_query = DISCARD ALL to reset session state between clients |
| MySQL + ProxySQL | Proxy-side pool | MySQL equivalent to PgBouncer; adds query routing for read/write split on top of connection multiplexing |
| Redis | No pooling by default | Redis is single-threaded; each client connection is lightweight; use pipelining rather than a large pool for throughput |
When connection pooling is not enough
- N+1 query problems are not fixed by pooling. If your application makes 100 database calls per request, pooling reuses the connection but still makes 100 round-trips. Fix N+1 with batching or JOINs first; the pool is not the bottleneck.
- Per-thread frameworks exhaust pools differently. A Java servlet container with 200 threads each borrowing one connection needs a 200-connection pool. A Go service with 10,000 goroutines using a pool of 20 is fine because goroutines yield while waiting. Profile the actual concurrency model before sizing.
- Application-side pools compound on serverless. A Lambda function that opens a pool with min: 5 creates 5 connections per cold start. With 500 concurrent Lambdas, that is 2,500 connections to PostgreSQL, each idle between invocations. Use RDS Proxy (AWS) or a PgBouncer sidecar to proxy the connections instead.
- Pooling does not hide a slow database. If queries are slow, connections stay checked out longer, the pool depletes faster, and request queues grow. Pooling provides headroom, not a fix for slow queries or missing indexes.
Interview cheat sheet
- Each PostgreSQL connection is a forked OS process. Connection establishment costs 20-50ms and ~10MB RAM.
- Optimal pool size is typically 2x DB CPU cores for CPU-bound workloads. More connections add context-switch overhead, not throughput.
- Pool exhaustion during slow queries is a common failure mode. It looks like an application outage, but the database is responding fine to the connections that are active.
- acquireTimeout determines how long a request waits before returning an error. This bound stops the request queue from growing indefinitely.
- PgBouncer solves the N-app-servers × pool-size explosion problem by multiplexing many application connections onto a small number of PostgreSQL backends.
- maxLifetime prevents connections from silently breaking when cloud load balancers terminate long-lived TCP connections.
- Connection pools on serverless runtimes (Lambda, Cloud Functions) create N connections per cold start times concurrent instances. Use a proxy layer (RDS Proxy, PgBouncer) to prevent connection count explosion.
- Application-side connection pools do not fix slow queries. If queries hold connections longer than expected, the pool depletes faster and request queues grow. Fix the query first, then right-size the pool.
Quick recap
- Each PostgreSQL connection is a forked OS process costing 20-50ms and ~10MB RAM to establish. Per-request connections are unusable at scale.
- Optimal pool size is determined empirically, typically 2-4x the database's CPU core count. More connections add context-switch overhead, reducing throughput.
- Pool exhaustion during slow queries cascades into request queue buildup, causing the service to appear down while the database is responding normally.
- acquireTimeout bounds how long a request waits for a pool connection. Without it, queues grow unbounded.
- PgBouncer multiplexes many application connections onto a small number of PostgreSQL backends, solving the N-servers × pool-size connection explosion.
- Cloud load balancers silently terminate idle TCP connections. Configure maxLifetime and a keepalive query to prevent "connection reset by peer" spikes.
Related concepts
- Databases — Connection pooling is one layer of database scaling. Read replicas, sharding, and caching are the other layers that complement it at higher scales.
- Scalability — Connection pool exhaustion is one of the most common scaling failures. It shows up in load tests before database capacity becomes the actual bottleneck.
- Load Balancing — PgBouncer is effectively a load balancer for database connections. The same principles (health checks, backends, session affinity) apply.