Connection pooling

The problem

Your application server makes a database call on every request. The naive approach: open a connection at request start, run the query, close the connection at request end.

On PostgreSQL, establishing a new connection is not cheap. The database forks a new backend process for each connection. The fork + TLS handshake + authentication sequence takes 20-50ms. For a request that takes 10ms to serve, you're spending 3-5x the request duration just establishing the connection.

At low concurrency with a fast database, this is invisible. At 1,000 requests per second, you're opening and closing 1,000 connections per second. Your database is spending more time managing connections than answering queries.

What a connection pool is

A connection pool is a set of pre-established, reusable database connections maintained by the application server. Instead of opening a new connection per request, the application borrows a connection from the pool, runs the query, and returns the connection. The connection itself is never closed between requests.

Analogy: A taxi rank at an airport. Instead of building a new taxi for each passenger and scrapping it after the ride, you maintain a fleet of taxis that wait, pick up passengers, and return to the rank. The taxi (connection) is reused. The setup cost (manufacturing the taxi, training a driver) is paid once.

How pool sizing affects performance

Pool sizing is where engineers most commonly go wrong. The intuition says: more connections = more parallelism = faster. This is wrong past a critical threshold.

Each PostgreSQL backend process consumes:

~5-10 MB of RAM just for the process overhead
A file descriptor slot
CPU context-switch overhead when the scheduler rotates through active backends

PostgreSQL's throughput peaks at a pool size well below the maximum allowed connections. A common empirical finding: on a system with 4 CPU cores, peak throughput is reached at 4-8 active connections. Beyond that, additional concurrent backends compete for the same CPU capacity, and context-switching overhead reduces throughput.

Throughput vs. Pool Size (4-core DB server, pgbench workload):

Pool size:  5   10   20   50   100  200
TPS:       2K  3.8K  4K  3.5K  2.5K  1.5K
Latency:   3ms  5ms  8ms  18ms  50ms  130ms

The peak is somewhere around 2x CPU cores for CPU-bound workloads, and higher for I/O-bound workloads. The right number is empirically determined, not guessed.

max_connections is not your pool target

PostgreSQL's max_connections = 200 is a safety limit, not a performance target. Setting your pool size to 200 because PostgreSQL allows 200 is a common mistake. At 200 concurrent backends on a 4-core machine, throughput collapses from context-switching overhead. Use a pool size based on core count and measure actual throughput.

Pool exhaustion and the cascade failure

When all connections in the pool are in use and a new request arrives, the request waits for a connection to free up. If requests take longer than expected (slow query, lock contention, external service timeout), the pool drains. Requests queue. The queue grows faster than connections are released.

This is pool exhaustion, and it often causes the complete failure most engineers attribute to something else.

The cascade: slow queries hold connections longer. Other requests wait for connections. Wait times exceed timeouts. Errors spike. The service appears down even though the database is responding fine to the 20 active connections.

I've seen engineers interpret this as "the database is down" and restart the database, which makes things worse because reconnection overhead compounds the problem.

Pool configuration parameters

Most pool libraries expose these settings:

Parameter	What it controls	Wrong default behavior
`minConnections`	Connections established at startup	0: first requests after idle period pay connection overhead
`maxConnections`	Maximum pool size	Too high: exhausts DB resources; too low: requests queue
`acquireTimeout`	Max wait time for a connection	Too long: requests accumulate and memory grows; too short: spurious errors
`idleTimeout`	How long an idle connection stays open before closing	Too long: DB resources held unnecessarily; too short: cold starts
`connectionTimeout`	Max time to establish a new connection	Should be low (1-3s); a slow DB is better caught fast
`maxLifetime`	Maximum connection age before forced replacement	Important for load balancers that kill long-lived connections silently

// Example: pg-pool (Node.js)
const pool = new Pool({
  max: 20,              // pool size — set based on DB CPU cores, not max_connections
  min: 5,              // keep 5 warm connections at all times
  idleTimeoutMillis: 30_000,   // close connections idle for 30s
  connectionTimeoutMillis: 3_000,  // fail fast if DB unreachable
  maxUses: 7500,       // replace a connection after 7500 uses (memory hygiene)
});

PgBouncer: pooling outside the application

Application-side pools (pg-pool, HikariCP, SQLAlchemy) maintain connections per process. A fleet of 50 app servers each with a pool of 20 connections = 1,000 connections visible to PostgreSQL. At large fleet sizes, this overwhelms the database even if individual response times are fine.

The problem

Your application server makes a database call on every request. The naive approach: open a connection at request start, run the query, close the connection at request end.

What a connection pool is

How pool sizing affects performance

Pool sizing is where engineers most commonly go wrong. The intuition says: more connections = more parallelism = faster. This is wrong past a critical threshold.

Each PostgreSQL backend process consumes:

~5-10 MB of RAM just for the process overhead
A file descriptor slot
CPU context-switch overhead when the scheduler rotates through active backends

Throughput vs. Pool Size (4-core DB server, pgbench workload):

Pool size:  5   10   20   50   100  200
TPS:       2K  3.8K  4K  3.5K  2.5K  1.5K
Latency:   3ms  5ms  8ms  18ms  50ms  130ms

The peak is somewhere around 2x CPU cores for CPU-bound workloads, and higher for I/O-bound workloads. The right number is empirically determined, not guessed.

max_connections is not your pool target

Pool exhaustion and the cascade failure

This is pool exhaustion, and it often causes the complete failure most engineers attribute to something else.

I've seen engineers interpret this as "the database is down" and restart the database, which makes things worse because reconnection overhead compounds the problem.

Pool configuration parameters

Most pool libraries expose these settings:

Parameter	What it controls	Wrong default behavior
`minConnections`	Connections established at startup	0: first requests after idle period pay connection overhead
`maxConnections`	Maximum pool size	Too high: exhausts DB resources; too low: requests queue
`acquireTimeout`	Max wait time for a connection	Too long: requests accumulate and memory grows; too short: spurious errors
`idleTimeout`	How long an idle connection stays open before closing	Too long: DB resources held unnecessarily; too short: cold starts
`connectionTimeout`	Max time to establish a new connection	Should be low (1-3s); a slow DB is better caught fast
`maxLifetime`	Maximum connection age before forced replacement	Important for load balancers that kill long-lived connections silently

// Example: pg-pool (Node.js)
const pool = new Pool({
  max: 20,              // pool size — set based on DB CPU cores, not max_connections
  min: 5,              // keep 5 warm connections at all times
  idleTimeoutMillis: 30_000,   // close connections idle for 30s
  connectionTimeoutMillis: 3_000,  // fail fast if DB unreachable
  maxUses: 7500,       // replace a connection after 7500 uses (memory hygiene)
});

Connection pooling

The problem

What a connection pool is

How pool sizing affects performance

Pool exhaustion and the cascade failure

Pool configuration parameters

PgBouncer: pooling outside the application

Continue Reading with Premium

Comments

Connection pooling

The problem

What a connection pool is

How pool sizing affects performance

Pool exhaustion and the cascade failure

Pool configuration parameters

PgBouncer: pooling outside the application

Continue Reading with Premium

Comments