HTTP keep-alive and connection reuse
Learn how HTTP persistent connections eliminate TCP handshake overhead, what HTTP/2 multiplexing adds on top, and why connection keep-alive settings are a frequent source of production 502 errors.
The problem
Your nginx reverse proxy has keepalive_timeout 75s set in its upstream block. (The client-facing directive of the same name defaults to 75s; the upstream one defaults to 60s.) Your Node.js app server leaves keepAliveTimeout at 5000 ms, the long-standing Node.js default. nginx holds idle upstream connections open for up to 75 seconds. Node.js closes them after 5 seconds of idle time.
After 5 seconds of inactivity, Node.js sends a FIN on the socket. nginx does not see this immediately. When the next request arrives, nginx picks that connection from its upstream pool and sends the request. Node.js responds with a TCP RST: that socket is closed. nginx gets the RST mid-write and returns a 502 Bad Gateway to the client.
This is not a fluke. It can happen on any connection that idles for more than 5 seconds. At high traffic, connections are always busy, so the window never opens. At moderate traffic with brief pauses between requests (typical for user-facing APIs), you see a steady background rate of 502s that seems random but always clusters around the 5-second idle mark.
What HTTP keep-alive is
HTTP keep-alive (persistent connections) reuses one TCP connection for multiple request-response cycles. Without it, every HTTP request pays the TCP handshake cost: at least 1.5 round trips before a single byte of application data is transferred.
Analogy: A phone call versus a series of walkie-talkie exchanges. Each walkie-talkie message (HTTP/1.0) requires keying the transmitter, speaking, releasing, and waiting for acknowledgment before the next message. A phone call (keep-alive) holds the channel open for the full conversation. You pay the connection cost once.
HTTP/1.1 enables keep-alive by default. You must explicitly opt out with Connection: close. HTTP/2 adds connection multiplexing on top: multiple requests travel on the same connection simultaneously, not sequentially.
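The default-on, opt-out rule can be written down as a small decision function. This is a minimal sketch, not how any particular server implements it: the function name isPersistent is hypothetical, and it assumes the usual HTTP/1.0 convention that persistence is opt-in via Connection: keep-alive.

```javascript
// Decide whether a connection stays open after the current exchange.
// Simplified: looks only at the protocol version and the Connection header.
function isPersistent(httpVersion, connectionHeader = "") {
  // The Connection header is a comma-separated, case-insensitive token list.
  const tokens = connectionHeader
    .split(",")
    .map((token) => token.trim().toLowerCase())
    .filter(Boolean);

  if (tokens.includes("close")) return false; // explicit opt-out
  if (httpVersion === "1.1") return true; // persistent by default
  if (httpVersion === "1.0") return tokens.includes("keep-alive"); // opt-in
  return false; // other versions are out of scope for this sketch
}
```

For example, isPersistent("1.1") is true, while isPersistent("1.1", "close") and isPersistent("1.0") are both false.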
How it works
In HTTP/1.0, every request requires a new TCP connection. The three-way handshake alone costs 1.5 RTT before the request is even sent. For 100 requests over 50 ms RTT, that is 7.5 seconds of pure handshake overhead.
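The arithmetic is easy to reproduce. The sketch below takes the 1.5 RTT handshake figure used above as a given and compares one connection per request against a single reused connection; the function and constant names are illustrative only.

```javascript
// Handshake cost per new connection, using the 1.5 RTT figure from the text.
const HANDSHAKE_RTTS = 1.5;

// Total time spent on TCP handshakes when `connections` connections are opened.
function handshakeOverheadMs(connections, rttMs) {
  return connections * HANDSHAKE_RTTS * rttMs;
}

const rttMs = 50;
const requests = 100;

// One connection per request (HTTP/1.0 style).
const perRequest = handshakeOverheadMs(requests, rttMs);
// One connection reused for every request (keep-alive).
const reused = handshakeOverheadMs(1, rttMs);

console.log(`new connection each time: ${perRequest} ms, reused: ${reused} ms`);
// prints "new connection each time: 7500 ms, reused: 75 ms"
```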
HTTP/1.1 keeps the connection open between requests. The handshake is paid once. The tradeoff is head-of-line (HOL) blocking: request B cannot be sent on the same connection until response A is fully received.
HTTP/2 introduces binary framing with stream IDs. Multiple requests are in flight on the same connection simultaneously. Request B does not wait for response A.
For a 50 ms RTT, HTTP/1.0 wastes 75 ms on handshake per request. HTTP/1.1 amortizes that 75 ms across all requests on the connection. HTTP/2 also eliminates the sequential wait: responses A and B can overlap in transit.
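To make the stream-ID idea concrete, here is a toy demultiplexer. It is only a sketch: the { streamId, data } objects stand in for real HTTP/2 frames, which carry binary headers and typed payloads, and reassembleStreams is a hypothetical helper name.

```javascript
// Group interleaved frames back into per-stream payloads.
// Each frame is a simplified { streamId, data } object, not a real HTTP/2 frame.
function reassembleStreams(frames) {
  const streams = new Map();
  for (const { streamId, data } of frames) {
    streams.set(streamId, (streams.get(streamId) ?? "") + data);
  }
  return streams;
}

// Responses A (stream 1) and B (stream 3) overlap on the same connection.
const wire = [
  { streamId: 1, data: "A1" },
  { streamId: 3, data: "B1" },
  { streamId: 1, data: "A2" },
  { streamId: 3, data: "B2" },
];

const streams = reassembleStreams(wire);
// streams.get(1) is "A1A2" and streams.get(3) is "B1B2"
```

Because every frame names its stream, the receiver can rebuild each response even though their pieces arrived interleaved.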
The 502 race condition
The root cause: the upstream server closes a connection that the proxy still holds in its pool as "idle but usable." The proxy sends the next request on a socket that is already closed.
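A toy model helps show when the dangerous window opens. The function below is a hypothetical helper, not part of nginx or Node.js, and it simplifies reality: a proxy often notices the upstream's FIN and drops the connection in time, so "upstream-closed-first" marks the window in which a 502 is possible, not guaranteed.

```javascript
// Classify what happens to a pooled upstream connection after an idle gap.
// All arguments are in milliseconds.
function classifyIdleGap(idleGapMs, upstreamKeepAliveMs, proxyIdleTimeoutMs) {
  // Both sides still consider the connection open.
  if (idleGapMs < Math.min(upstreamKeepAliveMs, proxyIdleTimeoutMs)) {
    return "reused";
  }
  // The proxy retires the connection before the upstream would close it.
  if (proxyIdleTimeoutMs < upstreamKeepAliveMs) {
    return "proxy-closed-first";
  }
  // The upstream closes while the proxy still has the socket pooled.
  return "upstream-closed-first";
}
```

With a 5-second upstream timeout and a 75-second proxy timeout, classifyIdleGap(6000, 5000, 75000) returns "upstream-closed-first", while a 3-second gap returns "reused".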
The fix is to ensure Node.js keeps connections alive longer than nginx's upstream idle period, so Node.js never closes a connection nginx might still use:
const http = require("node:http");
// `app` is your request handler, for example an Express application.
const server = http.createServer(app);
// Must be greater than the proxy's upstream keepalive timeout.
// If the nginx upstream keepalive_timeout is 60s, set this to 65s.
server.keepAliveTimeout = 65_000;
// Must be greater than keepAliveTimeout.
// Node.js enforces a separate deadline for receiving HTTP headers.
// If headersTimeout fires before keepAliveTimeout, you get a different 502.
server.headersTimeout = 66_000;
Set the upstream keepalive_timeout in nginx to 60s (or any value below your Node.js setting), and set server.keepAliveTimeout = 65000. Node.js will now hold connections open past nginx's cutoff point. nginx closes idle connections first, gracefully, without sending a new request. Node.js handles the incoming FIN cleanly.
Always set headersTimeout alongside keepAliveTimeout
Setting only keepAliveTimeout without headersTimeout is a common partial fix. If headersTimeout defaults to 60 000 ms and keepAliveTimeout is 65 000 ms, the headers timeout fires first on slow requests, producing its own 408 or 502 error. Always set headersTimeout = keepAliveTimeout + 1000 to ensure they fire in the right order.
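Both ordering rules are simple enough to check at startup or in a test. The helper below is a sketch under the assumptions stated in this section; checkTimeouts and its field names are hypothetical, and the proxy value has to be copied in by hand because Node.js cannot read the nginx configuration.

```javascript
// Return a list of ordering problems between proxy and Node.js timeouts.
// All values are in milliseconds; an empty array means the order is safe.
function checkTimeouts({ proxyIdleTimeoutMs, keepAliveTimeoutMs, headersTimeoutMs }) {
  const problems = [];
  if (keepAliveTimeoutMs <= proxyIdleTimeoutMs) {
    problems.push("keepAliveTimeout must exceed the proxy's upstream idle timeout");
  }
  if (headersTimeoutMs <= keepAliveTimeoutMs) {
    problems.push("headersTimeout must exceed keepAliveTimeout");
  }
  return problems;
}

// The configuration recommended above passes both checks.
const recommended = checkTimeouts({
  proxyIdleTimeoutMs: 60_000,
  keepAliveTimeoutMs: 65_000,
  headersTimeoutMs: 66_000,
});
```

In a real service the second and third values could be read from server.keepAliveTimeout and server.headersTimeout after the server is created.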
HTTP/2 and TCP head-of-line blocking
HTTP/2 eliminates application-layer HOL blocking. Streams 1, 3, and 5 are all in flight simultaneously. A slow response on stream 1 does not block stream 3 from being sent or received.
But HTTP/2 still runs over a single TCP connection. TCP guarantees in-order byte delivery. If one TCP segment is lost, the OS retransmit timer fires and all HTTP/2 streams on that connection stall, even streams whose data has already arrived in the receive buffer.
This is TCP head-of-line blocking. HTTP/2 trades application-layer HOL for TCP-layer HOL.
HTTP/3 eliminates this by replacing TCP with QUIC, which runs over UDP. QUIC orders and delivers each stream independently, so a lost packet carrying stream 1 data does not hold up stream 3.
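The difference between the two transports can be illustrated with a toy delivery model. This is a deliberate simplification with hypothetical names: packets are assumed to be listed in sequence order, each carries data for exactly one stream, and one packet is lost and not yet retransmitted.

```javascript
// TCP model: bytes are released strictly in sequence order,
// so nothing behind the gap reaches the application.
function deliveredOverTcp(packets, lostSeq) {
  const delivered = [];
  for (const packet of packets) {
    if (packet.seq === lostSeq) break; // everything after the gap waits
    delivered.push(packet);
  }
  return delivered;
}

// QUIC model: ordering is per stream, so only the stream that lost data waits.
function deliveredOverQuic(packets, lostSeq) {
  const lost = packets.find((packet) => packet.seq === lostSeq);
  const blockedStream = lost ? lost.streamId : null;
  const delivered = [];
  let gapSeen = false;
  for (const packet of packets) {
    if (packet.seq === lostSeq) {
      gapSeen = true;
      continue;
    }
    // Later data on the affected stream must wait for the retransmission.
    if (gapSeen && packet.streamId === blockedStream) continue;
    delivered.push(packet);
  }
  return delivered;
}

const packets = [
  { seq: 1, streamId: 1 },
  { seq: 2, streamId: 1 }, // lost in this example
  { seq: 3, streamId: 3 },
  { seq: 4, streamId: 1 },
  { seq: 5, streamId: 3 },
];

const tcpSeqs = deliveredOverTcp(packets, 2).map((packet) => packet.seq);
const quicSeqs = deliveredOverQuic(packets, 2).map((packet) => packet.seq);
// tcpSeqs is [1]; quicSeqs is [1, 3, 5]
```

Under the TCP model, stream 3 is stalled by a loss that only touched stream 1; under the QUIC model, stream 3 keeps flowing.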
HOL blocking by protocol:
Protocol  | App-layer HOL | TCP-layer HOL
----------+---------------+--------------
HTTP/1.0  | N/A           | Yes
HTTP/1.1  | Yes           | Yes
HTTP/2    | No            | Yes
HTTP/3    | No            | No (QUIC)
In practice: TCP HOL blocking only causes measurable harm under high packet loss (above 1-2%), which is common on mobile networks and uncommon on datacenter links. HTTP/2 is a clear improvement over HTTP/1.1 for most workloads. HTTP/3 is the right choice for user-facing endpoints on unreliable or high-latency networks.
Connection pool sizing
nginx maintains a pool of idle persistent upstream connections per worker process. The keepalive N directive in an upstream block sets the maximum number of idle connections kept per worker.
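Because the limit applies per worker, the ceiling across the whole proxy scales with the number of worker processes. A quick back-of-the-envelope helper, with a hypothetical name and example values (4 workers, keepalive 32) that are assumptions rather than recommendations:

```javascript
// Upper bound on idle upstream connections nginx may hold for one upstream
// block, summed across all worker processes.
function maxIdleUpstreamConnections(workerProcesses, keepalivePerWorker) {
  return workerProcesses * keepalivePerWorker;
}

// Example: 4 workers with `keepalive 32` can hold up to 128 idle connections.
const idleCeiling = maxIdleUpstreamConnections(4, 32);
```

This number is worth comparing against the connection limits of the upstream server, since every idle pooled connection still occupies a socket there.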