Connection pool
Low-level design of a thread-safe database connection pool. Covers min/max pool size, acquire timeout, connection health checks, connection lifecycle (create/validate/evict), and metrics for monitoring pool exhaustion.
The Problem
Your application serves 2,000 concurrent requests. Each request opens a fresh database connection, runs a query, and closes it. Under load, connection creation alone (TCP handshake, TLS negotiation, authentication) burns 30-80ms per request. At peak traffic, the database rejects new connections because it hit its 500-connection limit, and the application starts throwing "too many connections" errors.
A connection pool solves this by maintaining a set of pre-created connections that threads borrow and return. Instead of paying the creation cost on every query, you pay it once per connection and amortize it across every request that reuses that connection. The pool enforces a maximum connection count, queues threads when all connections are in use, validates connections before handing them out, and evicts stale connections in the background.
Design the core classes for a connection pool that manages connection lifecycle (create, validate, evict), enforces min/max pool sizing, supports acquire-with-timeout semantics, runs background health checks, and exposes metrics for monitoring.
Requirements
Clarifying Questions
Before jumping into class design, ask questions to turn the vague prompt into a concrete specification. Cover four areas: core actions, error handling, boundaries, and future extensions.
You: "Is this a fixed-size pool or a dynamic pool that grows from min to max?"
Interviewer: "Dynamic. Start with a minimum number of connections, grow on demand up to a maximum, and shrink back down when idle connections exceed the minimum."
Dynamic sizing with min/max bounds. That means the pool has three phases: warm-up (fill to min), growth (create on demand up to max), and shrink (evict idle connections back to min).
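The three phases reduce to three small decisions, which can be sketched as below. This is a minimal illustration, not the article's implementation: the class name `PoolSizing` and its method names are hypothetical, and the shrink rule follows the interviewer's wording (idle connections above the minimum are eligible for eviction).

```java
/** Hypothetical helper: the sizing decisions behind warm-up, growth, and shrink. */
final class PoolSizing {
    private final int minSize;
    private final int maxSize;

    PoolSizing(int minSize, int maxSize) {
        if (minSize < 0 || maxSize < 1 || minSize > maxSize) {
            throw new IllegalArgumentException("require 0 <= minSize <= maxSize and maxSize >= 1");
        }
        this.minSize = minSize;
        this.maxSize = maxSize;
    }

    /** Warm-up: how many connections still need to be created to reach the minimum. */
    int connectionsToWarm(int currentTotal) {
        return Math.max(0, minSize - currentTotal);
    }

    /** Growth: create on demand only when nothing is idle and the cap is not reached. */
    boolean canGrow(int idle, int total) {
        return idle == 0 && total < maxSize;
    }

    /** Shrink: how many idle connections may be closed without dropping below the minimum. */
    int evictableIdle(int idle) {
        return Math.max(0, idle - minSize);
    }
}
```

With min 2 and max 10, `connectionsToWarm(0)` is 2 and `evictableIdle(8)` is 6, which matches the eviction scenario later in the requirements.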
You: "When a thread requests a connection and none are available, does it block forever or time out?"
Interviewer: "It blocks up to a configurable timeout, then throws an exception. The caller should never hang indefinitely."
Timeout-based acquisition. We need a blocking mechanism with a deadline, like a semaphore with tryAcquire or a condition variable with awaitNanos.
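The semaphore option can be sketched as follows, using the standard `Semaphore.tryAcquire(long, TimeUnit)` call. The class name `PermitGate` is hypothetical, and treating `AcquireTimeoutException` as an unchecked exception is an assumption made to keep the example short; the requirements only say that acquire throws after the timeout.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** Assumed to be unchecked here; the requirements only say acquire throws on timeout. */
class AcquireTimeoutException extends RuntimeException {
    AcquireTimeoutException(String message) { super(message); }
}

/** Hypothetical gate: one permit per connection slot, handed out with a deadline. */
final class PermitGate {
    private final Semaphore permits;

    PermitGate(int maxSize) {
        // fair = true so that waiting threads receive permits in arrival order
        this.permits = new Semaphore(maxSize, true);
    }

    /** Blocks for at most the given timeout, then fails instead of hanging. */
    void acquire(long timeout, TimeUnit unit) {
        try {
            if (!permits.tryAcquire(timeout, unit)) {
                throw new AcquireTimeoutException(
                        "no connection slot free within " + timeout + " " + unit);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            throw new IllegalStateException("interrupted while waiting for a connection slot", e);
        }
    }

    void release() { permits.release(); }

    int available() { return permits.availablePermits(); }
}
```

A permit only reserves a slot. The pool still has to pop an idle connection or create a new one after the permit is granted.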
You: "How do we validate connections? Test-on-borrow, background health checks, or both?"
Interviewer: "Both. Validate before handing a connection to the caller, and run a background evictor thread that periodically removes dead or idle-too-long connections."
Two validation paths: synchronous on borrow and asynchronous via a background thread. The health check itself should be pluggable since different databases need different validation queries.
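The synchronous path (test-on-borrow) might look like the sketch below. The names `BorrowValidator` and `firstHealthy` are hypothetical, and the health check is passed in as a plain `Predicate` so that each driver can supply its own validation query.

```java
import java.util.Deque;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical test-on-borrow helper; C is whatever connection type the driver provides. */
final class BorrowValidator<C> {
    private final Predicate<C> isAlive; // pluggable check, e.g. a driver-specific ping

    BorrowValidator(Predicate<C> isAlive) {
        this.isAlive = isAlive;
    }

    /**
     * Pops idle connections until one passes validation.
     * Dead connections go to the discard callback; returns null if none pass.
     */
    C firstHealthy(Deque<C> idle, Consumer<C> discard) {
        C candidate;
        while ((candidate = idle.pollLast()) != null) {
            if (isAlive.test(candidate)) {
                return candidate;
            }
            discard.accept(candidate); // dead: the caller closes it and frees its slot
        }
        return null;
    }
}
```

The same predicate can be reused by the background evictor, so both validation paths agree on what "healthy" means. The deque is assumed to be guarded by the caller's lock.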
You: "Should the pool detect connection leaks? For example, a thread borrows a connection but never returns it."
Interviewer: "Yes, track how long a connection has been checked out. If it exceeds a threshold, log a warning with the stack trace of where it was borrowed."
Leak detection needs to record the borrowing stack trace at acquire time. That is a wrapper concern, not something the pool itself handles directly.
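One way to capture the borrow site is to create a `Throwable` at acquire time and keep it with the checkout record. The sketch below assumes a hypothetical `Lease` class; a periodic scan would call `isLeakSuspect` on every active lease and log `describe()` for the ones over the threshold.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical lease record created at acquire time so a leak can be traced to its borrower. */
final class Lease {
    private final Instant borrowedAt;
    private final Throwable borrowSite; // holds the stack trace of the acquiring thread

    Lease(Instant borrowedAt) {
        this.borrowedAt = borrowedAt;
        this.borrowSite = new Throwable("connection borrowed here");
    }

    /** True when the connection has been checked out for longer than the threshold. */
    boolean isLeakSuspect(Instant now, Duration threshold) {
        return Duration.between(borrowedAt, now).compareTo(threshold) > 0;
    }

    /** Renders the warning text, including where the connection was borrowed. */
    String describe() {
        StringBuilder sb = new StringBuilder("possible connection leak, borrowed at:");
        for (StackTraceElement frame : borrowSite.getStackTrace()) {
            sb.append(System.lineSeparator()).append("    at ").append(frame);
        }
        return sb.toString();
    }
}
```

Capturing a stack trace on every acquire has a cost, so it is reasonable to make leak detection an opt-in setting.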
You: "Should we use FIFO or LIFO ordering for idle connections?"
Interviewer: "LIFO. It gives better cache locality because the most recently returned connection is the most likely to still be valid and have a warm TCP socket."
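LIFO means idle connections sit on a stack: returns and borrows happen at the same end of a Deque (addLast, pollLast), whereas a queue adds at one end and removes from the other (addLast, pollFirst). Some production pools default to this ordering; Apache Commons Pool, for example, ships with LIFO enabled.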
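A minimal sketch of LIFO ordering on an `ArrayDeque` follows. The class name `IdleStack` is hypothetical, and the class is not thread-safe on its own; the pool would guard it with its lock.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical idle store: returning and borrowing use the same end of the deque (LIFO). */
final class IdleStack<C> {
    private final Deque<C> idle = new ArrayDeque<>();

    /** A returned connection goes on top of the stack. */
    void giveBack(C conn) { idle.addLast(conn); }

    /** Most recently returned connection, or null when nothing is idle. */
    C borrow() { return idle.pollLast(); }

    /** Connection that has been idle the longest: the natural eviction candidate. */
    C oldest() { return idle.peekFirst(); }

    int size() { return idle.size(); }
}
```

A side effect of LIFO is that rarely used connections sink to the bottom of the stack, so the evictor can find idle-too-long connections by looking at the opposite end.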
You: "Do we need metrics, and if so, what metrics?"
Interviewer: "Total connections, idle count, active count, acquire wait time, and timeout count. The operations team needs these to set alerts for pool exhaustion."
Metrics are a separate concern. We will use a dedicated PoolMetrics class so the pool does not mix monitoring logic with connection management.
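A dedicated metrics class could be as small as the sketch below. The method names are assumptions; only the five tracked values (total, idle, active, acquire wait time, timeout count) come from the requirements. Counters use `AtomicInteger` and `LongAdder` so that recording a metric never needs the pool lock.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.LongAdder;

/** Sketch of a metrics holder; the pool calls the event methods, monitoring reads the getters. */
final class PoolMetrics {
    private final AtomicInteger idle = new AtomicInteger();
    private final AtomicInteger active = new AtomicInteger();
    private final LongAdder acquireCount = new LongAdder();
    private final LongAdder acquireWaitNanos = new LongAdder();
    private final LongAdder timeoutCount = new LongAdder();

    // Events. In this sketch connections are created into, and closed from, the idle set.
    void connectionCreated() { idle.incrementAndGet(); }
    void connectionClosed()  { idle.decrementAndGet(); }
    void returned()          { active.decrementAndGet(); idle.incrementAndGet(); }
    void timedOut()          { timeoutCount.increment(); }

    void borrowed(long waitedNanos) {
        idle.decrementAndGet();
        active.incrementAndGet();
        acquireCount.increment();
        acquireWaitNanos.add(waitedNanos);
    }

    // Readings for dashboards and alerts.
    int idleCount()   { return idle.get(); }
    int activeCount() { return active.get(); }
    int totalCount()  { return idle.get() + active.get(); }
    long timeouts()   { return timeoutCount.sum(); }

    double meanAcquireWaitMillis() {
        long n = acquireCount.sum();
        return n == 0 ? 0.0 : acquireWaitNanos.sum() / (n * 1_000_000.0);
    }
}
```

An alert for pool exhaustion would typically watch `timeouts()` and a rising `meanAcquireWaitMillis()` while `idleCount()` stays at zero.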
You: "Should the pool support multiple database drivers, or just JDBC?"
Interviewer: "Design for pluggability. The pool should not depend on JDBC directly. A factory creates connections, and the pool manages them."
Factory Method pattern for connection creation. The pool works with an abstract ConnectionFactory, and different database drivers provide concrete implementations.
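The factory abstraction can be sketched as a small generic interface. The `create` and `validate` methods come from the design; `destroy` is an added assumption, and `FakeConnection` is a hypothetical stand-in for a real driver so the pool can be exercised without a database.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Factory contract; C is the driver's raw connection type. */
interface ConnectionFactory<C> {
    C create();
    boolean validate(C connection);
    void destroy(C connection); // assumption: the pool closes connections through the factory
}

/** Hypothetical stand-in for a driver connection. */
final class FakeConnection {
    final int id;
    boolean open = true;

    FakeConnection(int id) { this.id = id; }
}

/** In-memory factory for tests; a JDBC-backed factory would implement the same interface. */
final class FakeConnectionFactory implements ConnectionFactory<FakeConnection> {
    private final AtomicInteger nextId = new AtomicInteger(1);

    @Override
    public FakeConnection create() {
        return new FakeConnection(nextId.getAndIncrement());
    }

    @Override
    public boolean validate(FakeConnection connection) {
        return connection.open;
    }

    @Override
    public void destroy(FakeConnection connection) {
        connection.open = false;
    }
}
```

Because the pool only sees `ConnectionFactory<C>`, swapping databases means supplying a different factory, with no change to pool code.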
Perfect. You have clarified scope and ruled out unnecessary complexity.
Final Requirements
Functional Requirements:
- Acquire a connection, blocking up to a configurable timeout if none are available.
- Release a connection back to the pool after use.
- Maintain a minimum number of idle connections (warm pool).
- Enforce a maximum total connection count (idle + active).
- Validate connections on borrow (test-on-borrow) and reject/replace dead ones.
- Run a background evictor that removes idle-too-long and max-lifetime-exceeded connections.
Non-Functional Requirements:
- Thread-safe: multiple threads acquire/release concurrently without corruption.
- Fair ordering: threads waiting for a connection are served in FIFO order. This is independent of the LIFO ordering used for idle connections.
- Metrics: expose pool state (total, idle, active, wait time, timeouts) for monitoring.
- Extensible: support different database drivers via a pluggable connection factory.
Out of Scope:
- SQL query execution and result set handling
- Distributed pooling across multiple application instances
- Connection routing (read/write splitting)
- UI or admin dashboard
Example Inputs and Outputs
Scenario 1: Normal acquire and release
- Input: Thread A calls pool.acquire() when 3 idle connections exist.
- Expected: Pool validates the top idle connection, marks it IN_USE, returns it. Idle count drops from 3 to 2, active count rises from 0 to 1.
- Why: Validates the happy-path borrow/return cycle.
Scenario 2: Pool exhaustion with timeout
- Input: Max pool size is 10, all 10 are active. Thread B calls pool.acquire() with a 5-second timeout.
- Expected: Thread B blocks. If no connection is returned within 5 seconds, pool throws AcquireTimeoutException. Timeout metric increments by 1.
- Why: Validates the backpressure mechanism under overload.
Scenario 3: Background eviction
- Input: Min pool size is 2, currently 8 idle connections. Idle timeout is 30 seconds. Evictor runs and finds 6 connections idle longer than 30 seconds.
- Expected: Evictor closes 6 connections, leaving 2 (the minimum). Total count drops from 8 to 2.
- Why: Validates that the pool shrinks back to min during low traffic.
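Scenario 3 can be sketched as a single eviction pass. The names `IdleEntry` and `IdleEvictor` are hypothetical. The sketch assumes the idle deque is ordered with the longest-idle entry first, which holds when returned connections are always appended at the tail, and it takes timestamps as plain milliseconds so the logic is easy to test.

```java
import java.util.Deque;

/** Hypothetical idle record: a name plus the time the connection became idle. */
final class IdleEntry {
    final String name;
    final long idleSinceMillis;

    IdleEntry(String name, long idleSinceMillis) {
        this.name = name;
        this.idleSinceMillis = idleSinceMillis;
    }
}

final class IdleEvictor {
    private IdleEvictor() {}

    /**
     * Removes expired entries from the old end of the deque, never dropping below minSize.
     * Returns how many were removed; a real evictor would also close each raw connection.
     */
    static int evict(Deque<IdleEntry> idle, int minSize, long idleTimeoutMillis, long nowMillis) {
        int closed = 0;
        while (idle.size() > minSize) {
            IdleEntry oldest = idle.peekFirst();
            if (nowMillis - oldest.idleSinceMillis <= idleTimeoutMillis) {
                break; // entries behind this one became idle later, so none are expired
            }
            idle.pollFirst();
            closed++;
        }
        return closed;
    }
}
```

In a running pool this pass would execute under the pool lock on a schedule, for example from a `ScheduledExecutorService`.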
Try it yourself
Before reading the solution, spend 15 minutes sketching the connection lifecycle states and the acquire/release methods. Focus on what happens when the pool is full and a thread is waiting. Compare your approach with the walkthrough below.
Step 1: Identify Core Entities
Start by asking: what are the main "things" in this problem? Look for nouns in your requirements: pool, connection, configuration, factory, health checker, metrics, state.
A common mistake is dumping everything into a single ConnectionPool god class. Good design means each class has a single, clear job.
| Entity | Responsibility | Key attributes |
|---|---|---|
| ConnectionPool | Orchestrator. Manages acquire, release, and shutdown. | idleConnections, activeConnections, config, metrics |
| PoolConfig | Immutable configuration holder. Min/max size, timeouts, intervals. | minSize, maxSize, acquireTimeout, idleTimeout, maxLifetime |
| PooledConnection | Wraps a raw connection with lifecycle metadata. | rawConnection, state, createdAt, lastUsedAt, lastValidatedAt |
| ConnectionFactory | Creates and validates raw connections. Pluggable per DB driver. | create(), validate() |
| HealthChecker | Background thread that evicts dead/stale connections. | evictionInterval, pool reference |
| PoolMetrics | Tracks pool statistics for monitoring. | totalCount, idleCount, activeCount, acquireWaitTime, timeoutCount |
| ConnectionState | Enum for connection lifecycle. | IDLE, IN_USE, VALIDATING, EVICTED |
Notice that PooledConnection wraps the raw connection rather than exposing it directly. This lets us track metadata (creation time, last validation time) and intercept close() to return the connection to the pool instead of actually closing it.
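A minimal sketch of that wrapper is shown below. The return-to-pool callback is an assumption about how the wrapper reaches the pool; the `state` and `lastValidatedAt` fields are omitted to keep the focus on the intercepted `close()`.

```java
import java.time.Instant;
import java.util.function.Consumer;

/** Sketch of the wrapper: metadata plus a close() that returns to the pool. */
final class PooledConnection<C> implements AutoCloseable {
    private final C raw;
    private final Instant createdAt;
    private volatile Instant lastUsedAt;
    private final Consumer<PooledConnection<C>> returnToPool; // assumed callback into the pool

    PooledConnection(C raw, Instant now, Consumer<PooledConnection<C>> returnToPool) {
        this.raw = raw;
        this.createdAt = now;
        this.lastUsedAt = now;
        this.returnToPool = returnToPool;
    }

    C raw() { return raw; }
    Instant createdAt() { return createdAt; }
    Instant lastUsedAt() { return lastUsedAt; }

    /** Does not close the raw connection; hands the wrapper back to the pool instead. */
    @Override
    public void close() {
        lastUsedAt = Instant.now();
        returnToPool.accept(this);
    }
}
```

Implementing `AutoCloseable` lets callers use try-with-resources, so the connection goes back to the pool even when the query throws.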
Step 2: Define Relationships and Class Design
Class Diagram
Class Interface Derivation
ConnectionPool
The central orchestrator. All acquire/release calls go through this class.
Deriving state from requirements:
| Requirement | What ConnectionPool must track |
|---|---|
| "Acquire a connection" | Collection of idle connections available for borrowing |
| "Release a connection" | Set of active connections currently checked out |
| "Enforce max pool size" | Semaphore or counter limiting total connections |
| "Min idle connections" | Reference to config for min size threshold |
| "Health check on borrow" | Reference to ConnectionFactory for validation |
This gives us the state:
idleConnections: Deque<PooledConnection> // LIFO stack for idle connections
activeConnections: Set<PooledConnection> // track checked-out connections
config: PoolConfig // immutable configuration
factory: ConnectionFactory // creates/validates connections
metrics: PoolMetrics // monitoring counters
sizeLimiter: Semaphore // enforces max pool size
lock: ReentrantLock // protects pool state
notEmpty: Condition // signals waiting threads
Deriving methods from needs:
| Need from requirements | Method |
|---|---|
| "Borrow a connection with timeout" | acquire(): PooledConnection |
| "Return connection to pool" | release(conn): void |
| "Warm pool on startup" | initialize(): void |
| "Clean shutdown" | shutdown(): void |
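The acquire and release methods can be sketched with the lock and condition from the state list above. This is a simplified illustration under several assumptions: the class is named `SimplePool` to mark it as hypothetical, a `Supplier` stands in for the factory, the size cap is enforced by counting under the lock instead of a separate semaphore, and validation, metrics, and shutdown are left out. `AcquireTimeoutException` is assumed to be unchecked.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

class AcquireTimeoutException extends RuntimeException {
    AcquireTimeoutException(String message) { super(message); }
}

final class SimplePool<C> {
    private final Deque<C> idle = new ArrayDeque<>();              // LIFO stack of idle connections
    private final Set<C> active = new HashSet<>();                 // checked-out connections
    private final ReentrantLock lock = new ReentrantLock(true);    // fair lock for waiting threads
    private final Condition notEmpty = lock.newCondition();
    private final Supplier<C> factory;
    private final int maxSize;
    private final long acquireTimeoutNanos;

    SimplePool(Supplier<C> factory, int minSize, int maxSize, long timeout, TimeUnit unit) {
        this.factory = factory;
        this.maxSize = maxSize;
        this.acquireTimeoutNanos = unit.toNanos(timeout);
        for (int i = 0; i < minSize; i++) {
            idle.addLast(factory.get()); // warm-up phase
        }
    }

    C acquire() {
        long remaining = acquireTimeoutNanos;
        lock.lock();
        try {
            while (true) {
                C conn = idle.pollLast();
                if (conn == null && idle.size() + active.size() < maxSize) {
                    conn = factory.get(); // growth phase; a real pool would create outside the lock
                }
                if (conn != null) {
                    active.add(conn);
                    return conn;
                }
                if (remaining <= 0) {
                    throw new AcquireTimeoutException("no connection available before the timeout");
                }
                remaining = notEmpty.awaitNanos(remaining); // returns the time left to wait
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        } finally {
            lock.unlock();
        }
    }

    void release(C conn) {
        lock.lock();
        try {
            if (!active.remove(conn)) {
                throw new IllegalArgumentException("connection is not checked out from this pool");
            }
            idle.addLast(conn);
            notEmpty.signal(); // wake one waiting thread
        } finally {
            lock.unlock();
        }
    }

    int idleCount()   { lock.lock(); try { return idle.size(); }   finally { lock.unlock(); } }
    int activeCount() { lock.lock(); try { return active.size(); } finally { lock.unlock(); } }
}
```

The loop around `awaitNanos` matters: a woken thread rechecks the idle stack because another thread may have taken the connection first, and the returned remaining time keeps the total wait within the configured timeout.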
PooledConnection
Wraps a raw database connection with lifecycle metadata.
Deriving state from requirements:
| Requirement | What PooledConnection must track |
|---|---|
| "Validate before borrow" | lastValidatedAt timestamp |
| "Evict idle-too-long" | lastUsedAt timestamp |
| "Evict max-lifetime-exceeded" | createdAt timestamp |
| "Lifecycle states" | state enum (IDLE, IN_USE, VALIDATING, EVICTED) |
We make the state transitions explicit. A connection in EVICTED state should never be handed to a caller.
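One way to make transitions explicit is to let the enum list its own legal successors and reject everything else. The four states come from the design; the transition table below is an assumption for illustration, and `StatefulConnection` is a hypothetical holder for the state.

```java
import java.util.EnumSet;
import java.util.Set;

enum ConnectionState {
    IDLE, IN_USE, VALIDATING, EVICTED;

    /** Assumed transition table, not taken from the article. */
    Set<ConnectionState> allowedNext() {
        switch (this) {
            case IDLE:       return EnumSet.of(VALIDATING, EVICTED);
            case VALIDATING: return EnumSet.of(IN_USE, IDLE, EVICTED);
            case IN_USE:     return EnumSet.of(IDLE, EVICTED);
            default:         return EnumSet.noneOf(ConnectionState.class); // EVICTED is terminal
        }
    }
}

/** Hypothetical holder that enforces the table on every state change. */
final class StatefulConnection {
    private ConnectionState state = ConnectionState.IDLE;

    synchronized void transitionTo(ConnectionState next) {
        if (!state.allowedNext().contains(next)) {
            throw new IllegalStateException(state + " -> " + next + " is not allowed");
        }
        state = next;
    }

    synchronized ConnectionState state() { return state; }
}
```

Because EVICTED has no successors, any attempt to hand an evicted connection to a caller fails at the transition to IN_USE instead of surfacing later as a broken query.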
Connection State Machine