How Nginx processes a request
How Nginx uses an event-driven architecture with epoll, master-worker process model, and efficient connection handling to serve millions of concurrent connections.
The Interview Question
Interviewer: "Your service needs to handle 50,000 concurrent WebSocket connections behind a reverse proxy. You chose Nginx over Apache. Walk me through how Nginx handles that many connections with just a few worker processes, and why Apache's model would struggle here."
This question tests whether you understand event-driven I/O, the difference between process-per-connection and event-loop architectures, and the practical implications for memory and CPU usage at scale. The interviewer is looking for someone who can reason about system resource constraints, not just recite configuration directives. A great answer covers the event loop, explains why memory-per-connection matters, and contrasts with Apache's threading model quantitatively.
What to Clarify Before Answering
You: "Good question. Let me clarify the scope..."
- "Should I focus on Nginx as a reverse proxy (proxying to upstream backends), or also cover static file serving and load balancing?"
- "Are we talking about Nginx open-source or Nginx Plus (commercial) with active health checks?"
- "Should I cover the HTTP request processing phases, or keep it high-level at the architecture and I/O model?"
- "Is HTTP/2 or gRPC proxying in scope, or just HTTP/1.1?"
Why this matters: Nginx does many things (static files, reverse proxy, load balancer, TLS terminator, HTTP/2 gateway, mail proxy). A strong candidate narrows the discussion to the relevant use case and shows they know the breadth of the tool.
The 30-Second Answer
Nginx uses a master-worker process model with a small, fixed number of worker processes (typically one per CPU core). Each worker runs a single-threaded event loop powered by epoll (Linux) or kqueue (BSD) that handles thousands of connections simultaneously through non-blocking I/O. When a request arrives, the worker does not spawn a thread or fork a process. Instead, it registers interest in I/O events (socket readable, writable, timer expired) and processes them as they occur. This lets a single worker handle 10,000+ concurrent connections using roughly 2.5KB of memory per connection. HTTP requests pass through a multi-phase processing pipeline (11 phases) where modules handle rewriting, authentication, proxying, compression, and logging.
The Architecture Overview
Nginx was designed from the ground up to solve the C10K problem (handling 10,000+ concurrent connections). Igor Sysoev created it in 2004 specifically because Apache's process-per-connection model could not scale to the connection counts that large Russian websites needed. The fundamental design decision was to use a fixed number of event-driven worker processes rather than spawning a thread or process per connection.
The master process runs as root (to bind to privileged ports like 80 and 443), reads the configuration file, and spawns worker processes. Workers drop privileges and run as a less-privileged user (typically nginx or www-data). Each worker independently accepts connections from a shared listening socket and processes them through its event loop. The cache manager and cache loader are helper processes that manage the on-disk proxy cache.
The key architectural insight: connections are distributed across workers through the kernel's socket accept mechanism (SO_REUSEPORT on modern Linux, or the accept_mutex on older versions). Each worker is independent, sharing no memory for request processing. This eliminates lock contention entirely.
The total maximum concurrent connections Nginx can handle is worker_processes * worker_connections. With 4 workers and 10,240 connections each, that is 40,960 concurrent connections. In practice, each proxied connection uses two file descriptors (one for client, one for upstream), so the effective proxied connection limit is half that.
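A quick sanity check of that arithmetic, as a sketch (the halving for proxied traffic follows the two-file-descriptors-per-request reasoning above):

```python
# Back-of-the-envelope capacity check for an Nginx deployment.
# Assumes each proxied request holds two file descriptors
# (one client-side, one upstream-side), as described above.

def max_concurrent(workers: int, worker_connections: int, proxied: bool = True) -> int:
    total = workers * worker_connections
    # A proxied connection consumes a client fd and an upstream fd,
    # so the effective request capacity is roughly halved.
    return total // 2 if proxied else total

print(max_concurrent(4, 10240, proxied=False))  # 40960 raw connections
print(max_concurrent(4, 10240))                 # 20480 proxied requests
```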
The Master-Worker Process Model
The master process is the supervisor. It never handles client traffic directly. It runs as root to bind to privileged ports (80, 443) and then spawns worker processes that drop to unprivileged user (nginx or www-data). This privilege separation is a security feature: even if a worker is compromised through a vulnerability, the attacker has limited permissions.
The master process responsibilities:
- Read and validate the configuration file
- Bind to listening sockets (ports 80, 443, etc.)
- Spawn worker processes with reduced privileges
- Handle signals for graceful reloads and shutdowns
- Monitor workers and respawn them if they crash
- Write the master PID to a file for management scripts
Graceful Reload (Zero-Downtime Config Changes)
This is one of Nginx's most important operational features. Unlike most servers that require a restart for configuration changes, Nginx can swap its entire configuration while handling live traffic. No load balancer drain, no connection drops, no downtime window.
When you run nginx -s reload, the master process:
- Parses and validates the new configuration (if invalid, the reload is aborted with no impact on running workers)
- Spawns new worker processes with the new configuration
- Sends a graceful shutdown signal to old workers
- Old workers stop accepting new connections but finish processing in-flight requests
- Once all in-flight requests complete, old workers exit
This is why Nginx can reload configuration with zero downtime. At any point during the reload, either old or new workers (or both) are handling traffic. No connections are dropped.
Why this matters in production
In a Kubernetes environment with frequent config changes (new upstreams, updated TLS certificates), this reload mechanism means you never need to restart the Nginx pod. I use inotifywait or a sidecar to watch for config changes and trigger nginx -s reload automatically. The reload takes milliseconds and does not drop a single connection.
Worker Process Configuration
| Directive | Typical Value | Purpose |
|---|---|---|
| worker_processes | auto (one per CPU core) | Number of worker processes |
| worker_connections | 1024-65535 | Max connections per worker |
| worker_cpu_affinity | auto | Pin workers to specific CPU cores |
| worker_rlimit_nofile | 65535 | Max open file descriptors per worker |
| accept_mutex | off (with SO_REUSEPORT) | Serializes accept() calls across workers |
Event-Driven Architecture with epoll
This is the core of why Nginx is fast. Each worker runs a single-threaded event loop that multiplexes thousands of connections through non-blocking I/O. No threads are created per connection, no processes are forked. The event loop is the entire concurrency model.
How epoll Works
epoll is Linux's scalable I/O event notification mechanism. It replaced the older select() and poll() system calls, which had O(N) performance (the kernel scanned every file descriptor on every call). epoll is O(1) for waiting and O(active_events) for processing.
The three key system calls:
- epoll_create(): create an epoll instance (returns a file descriptor)
- epoll_ctl(epfd, ADD/MOD/DEL, fd): register, modify, or remove interest in a file descriptor
- epoll_wait(epfd, max_events, timeout): block until events are ready or the timeout expires
When a client sends data, the network card triggers an interrupt. The kernel protocol stack processes the packet and marks the socket as readable. The next epoll_wait() call returns that socket in the ready list. The worker reads the data (non-blocking, so it never waits), processes the request, and registers interest in writability when the response is ready.
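This readiness-notification cycle can be exercised directly through Python's `select.epoll` wrapper (Linux only). A minimal sketch, using a pipe in place of a client socket:

```python
# Minimal demonstration of the epoll workflow described above,
# using a pipe instead of a network socket (Linux only).
import os
import select

epoll = select.epoll()                 # epoll_create()
r, w = os.pipe()                       # epoll works on any fd, not just sockets
os.set_blocking(r, False)              # non-blocking, like an Nginx connection
epoll.register(r, select.EPOLLIN)      # epoll_ctl(ADD): interest in reads

os.write(w, b"GET / HTTP/1.1\r\n")     # kernel marks the fd readable
events = epoll.poll(timeout=1)         # epoll_wait(): returns the ready list
for fd, mask in events:
    if mask & select.EPOLLIN:
        data = os.read(fd, 4096)       # guaranteed not to block
        print(data)                    # b'GET / HTTP/1.1\r\n'

epoll.unregister(r)
epoll.close()
```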
The Event Loop
The simplified event loop pseudocode:
// Nginx worker event loop (simplified)
function worker_event_loop():
    while not shutting_down:
        timeout = find_next_timer_expiry()
        events = epoll_wait(epfd, max_events, timeout)
        for event in events:
            if event.is_accept:
                conn = accept(listen_fd)         // Non-blocking accept
                set_nonblocking(conn.fd)
                epoll_add(conn.fd, READ)         // Register for read events
            elif event.is_readable:
                data = read(event.fd)            // Non-blocking read
                if data:
                    process_request(event.conn, data)
                else:
                    close_connection(event.conn) // Peer closed the socket
            elif event.is_writable:
                bytes_sent = write(event.fd, event.conn.send_buf)
                if event.conn.send_buf is empty:
                    epoll_modify(event.fd, READ) // Done writing
        process_expired_timers()
        run_posted_events()
Why This Is Better Than Thread-Per-Connection
The critical difference between Nginx and Apache's prefork/worker model:
| Aspect | Nginx (event-driven) | Apache prefork (process-per-conn) |
|---|---|---|
| Memory per connection | ~2.5 KB | ~2-10 MB (full process) |
| 10K connections memory | ~25 MB | ~20-100 GB |
| Context switches | Minimal (single thread) | Constant (OS scheduling) |
| Idle connection cost | Nearly zero (just an fd) | Full process/thread resources |
| CPU cache efficiency | Excellent (single thread) | Poor (threads thrash L1/L2) |
The one thing that blocks Nginx workers
Any blocking operation inside a worker stalls ALL connections on that worker. This includes: blocking DNS resolution (use the resolver directive for async DNS), reading large files from a slow disk without AIO, and third-party modules that make synchronous HTTP calls. I have seen a single blocking DNS lookup add 5 seconds of latency to every connection on that worker.
Request Processing Phases
When an HTTP request arrives at a worker, Nginx processes it through a well-defined pipeline of 11 phases. Each phase has registered handler modules that execute in order. Understanding these phases is critical for debugging why a request is handled differently than expected, especially when multiple location blocks, rewrite rules, and access controls interact.
The 11 Phases
The 11 phases, in order: POST_READ, SERVER_REWRITE, FIND_CONFIG, REWRITE, POST_REWRITE, PREACCESS, ACCESS, POST_ACCESS, PRECONTENT, CONTENT, and LOG. Each phase can have multiple registered modules. A module in any phase can terminate the request early (e.g., the access phase returns 403, or the rate limiter returns 429). If no module terminates the request, processing continues to the next phase.
The key phases for most configurations:
Phase 3 (FIND_CONFIG) is where Nginx matches the request URI to a location block. This is the most commonly misunderstood phase. The matching order is:
- Exact match (location = /api/health): checked first
- Longest prefix match with the ^~ modifier: stops further searching
- Regular expression matches (first match wins, in config file order)
- Longest prefix match without ^~: used only if no regex matched
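The precedence can be sketched in a few lines. This is a simplified model with hypothetical location tables, not Nginx's implementation:

```python
# Simplified sketch of the FIND_CONFIG location-matching order described
# above. The location tables and URIs here are made-up examples.
import re

def match_location(uri, exact, prefixes, regexes):
    """exact: uri -> location name; prefixes: prefix -> has_caret_tilde;
    regexes: list of (pattern, name) in config-file order."""
    if uri in exact:                                   # 1. exact match wins
        return exact[uri]
    best = max((p for p in prefixes if uri.startswith(p)),
               key=len, default=None)                  # longest matching prefix
    if best is not None and prefixes[best]:            # 2. ^~ prefix: stop here
        return f"prefix {best}"
    for pattern, name in regexes:                      # 3. first regex match
        if re.search(pattern, uri):
            return name
    if best is not None:                               # 4. fall back to prefix
        return f"prefix {best}"
    return None

exact = {"/api/health": "exact /api/health"}
prefixes = {"/static/": True, "/api/": False}          # True = declared with ^~
regexes = [(r"\.php$", "regex .php")]

print(match_location("/api/health", exact, prefixes, regexes))     # exact
print(match_location("/static/app.js", exact, prefixes, regexes))  # ^~ prefix
print(match_location("/api/index.php", exact, prefixes, regexes))  # regex
```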
Phase 6 (PREACCESS) is where rate limiting happens. The limit_req module uses a leaky bucket algorithm with configurable burst size and delay. This is the first line of defense against DDoS and abuse.
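The leaky-bucket accounting can be sketched as follows. The unit of one request per token is illustrative (Nginx internally tracks excess at finer granularity), and the class name is made up:

```python
# Sketch of leaky-bucket rate limiting in the style of limit_req:
# excess requests drain at the configured rate; anything beyond the
# burst allowance is rejected. Illustrative, not Nginx internals.

class LeakyBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.burst = burst
        self.excess = 0.0      # requests queued above the steady rate
        self.last = 0.0        # timestamp of the previous request

    def allow(self, now: float) -> bool:
        # Excess drains at the configured rate between requests.
        self.excess = max(0.0, self.excess - (now - self.last) * self.rate)
        self.last = now
        if self.excess + 1 > self.burst + 1:   # over burst: reject (429/503)
            return False
        self.excess += 1
        return True

bucket = LeakyBucket(rate_per_sec=10, burst=5)
results = [bucket.allow(now=0.0) for _ in range(8)]
print(results)   # first 6 pass (1 + burst of 5), the rest are rejected
```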
Phase 7 (ACCESS) handles authentication and authorization. IP-based allow/deny rules, HTTP basic auth, and JWT validation all happen here. If access is denied, processing stops and a 403 is returned.
Phase 10 (CONTENT) is where the actual response generation happens. Only one content handler can execute per request (proxy_pass, fastcgi_pass, static file serving, or return).
Upstream and Load Balancing
When Nginx acts as a reverse proxy, it distributes requests to backend servers using configurable load balancing algorithms. The upstream module is one of the most heavily used parts of Nginx in production, and understanding its behavior is critical for high-availability deployments.
Load Balancing Algorithms
| Algorithm | Directive | Behavior | Best For |
|---|---|---|---|
| Round Robin | (default) | Rotate through servers sequentially | Homogeneous backends |
| Weighted Round Robin | weight=3 | Proportional distribution by weight | Mixed-capacity backends |
| Least Connections | least_conn | Send to server with fewest active connections | Variable request duration |
| IP Hash | ip_hash | Hash client IP to sticky backend | Session affinity without cookies |
| Generic Hash | hash $request_uri | Hash arbitrary key to server | Cache-friendly distribution |
| Random with Two Choices | random two least_conn | Pick 2 random servers, choose least-loaded | Large server pools (power of 2 choices) |
The power of two random choices algorithm deserves special attention. Instead of checking all servers (which requires shared state in a multi-worker setup), it picks two servers at random and sends the request to the less-loaded one. Research shows this achieves near-optimal load distribution with O(1) decision time, making it ideal for large upstream pools with 50+ servers.
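A sketch of the algorithm with made-up server names; even without global least-connections state, the load spread stays tight:

```python
# "Power of two choices": sample two backends at random and send the
# request to the one with fewer active connections. Server names and
# counts are illustrative.
import random

def pick_backend(active_conns: dict, rng: random.Random) -> str:
    a, b = rng.sample(list(active_conns), 2)   # two random candidates
    return a if active_conns[a] <= active_conns[b] else b

rng = random.Random(42)                        # seeded for reproducibility
conns = {f"srv{i}": 0 for i in range(10)}
for _ in range(10_000):
    conns[pick_backend(conns, rng)] += 1

spread = max(conns.values()) - min(conns.values())
print("max-min spread:", spread)   # far tighter than pure random assignment
```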
Passive vs Active Health Checks
Nginx open-source performs passive health checks only. It monitors responses from upstreams and marks a server as down after max_fails consecutive failures within fail_timeout seconds. The server is reintroduced after fail_timeout expires.
This means a real user request is used as the health probe, which has a cost: the first few users after a backend goes down will see errors before Nginx detects the failure.
Nginx Plus adds active health checks that send periodic synthetic requests to a health endpoint (e.g., /health). This detects failures before any real user is affected.
For open-source Nginx, I mitigate this by combining max_fails=2 fail_timeout=10s with proxy_next_upstream error timeout http_502 http_503. This retries failed requests on another backend transparently, so the user never sees the error (the first backend is just slightly slower while the retry happens).
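A sketch of this passive-failure accounting; the class and field names are illustrative, not Nginx internals:

```python
# Passive health checking in the style of max_fails / fail_timeout:
# mark a backend down after repeated failures, reintroduce it once
# fail_timeout has elapsed. Illustrative sketch only.

class Backend:
    def __init__(self, max_fails: int = 2, fail_timeout: float = 10.0):
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.fails = 0
        self.down_until = 0.0

    def available(self, now: float) -> bool:
        return now >= self.down_until

    def report(self, ok: bool, now: float) -> None:
        if ok:
            self.fails = 0
            return
        self.fails += 1
        if self.fails >= self.max_fails:              # too many failures:
            self.down_until = now + self.fail_timeout # remove from rotation
            self.fails = 0

b = Backend()
b.report(ok=False, now=0.0)
b.report(ok=False, now=1.0)    # second failure marks the backend down
print(b.available(5.0))        # False: still inside fail_timeout
print(b.available(12.0))       # True: reintroduced after fail_timeout
```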
Connection Pooling to Upstreams
Without keepalive connections, every proxied request incurs:
- TCP three-way handshake (~0.5ms local, 1-50ms cross-AZ)
- TLS handshake if HTTPS (~5-30ms additional)
- Request/response exchange
- TCP connection teardown
With keepalive connections, the TCP and TLS overhead is paid once, and subsequent requests reuse the established connection.
# Upstream configuration with connection pooling
upstream backend {
least_conn;
server 10.0.1.10:8080 max_fails=3 fail_timeout=30s;
server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
keepalive 64; # Pool of 64 idle connections per worker
}
server {
location /api/ {
proxy_pass http://backend;
proxy_http_version 1.1; # Required for keepalive
proxy_set_header Connection ""; # Clear hop-by-hop header
proxy_connect_timeout 5s;
proxy_read_timeout 30s;
proxy_next_upstream error timeout http_502; # Retry on failure
}
}
Static File Serving and Zero-Copy I/O
Nginx is exceptionally fast at serving static files due to kernel-level optimizations that bypass userspace entirely for the data transfer path.
The sendfile() System Call
Without sendfile, serving a file requires:
- read() from disk into a kernel buffer
- Copy from kernel buffer to userspace buffer
- Copy from userspace buffer to kernel socket buffer
- write() from socket buffer to network
This involves four context switches and four data copies (two of them through the CPU; the disk and network legs use DMA). With sendfile(), a single system call tells the kernel to transfer data directly from the file page cache to the socket buffer.
This is zero-copy: data moves from the page cache to the network card with only 2 context switches and zero userspace copies. On Linux with DMA-capable network cards and TCP_CORK, the kernel can further optimize by batching the HTTP headers and file data into a single TCP segment.
The performance difference is significant for large files. Serving a 10MB file with read()/write() moves 40MB of data across the four copies, 20MB of it through CPU copies into and out of userspace. With sendfile(), the CPU copies zero bytes (DMA handles the transfer). On a server pushing thousands of concurrent file downloads, this frees substantial memory bandwidth and CPU time.
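Python exposes this syscall as os.sendfile. A Linux-only sketch with throwaway temp files (file-to-file sendfile needs kernel 2.6.33+; Nginx uses the file-to-socket path):

```python
# Demonstration of the sendfile() zero-copy path described above.
# os.sendfile hands the transfer to the kernel; the bytes never enter
# a userspace buffer. Linux-only sketch using temp files.
import os
import tempfile

src = tempfile.NamedTemporaryFile(delete=False)
src.write(b"hello " * 1000)        # 6000-byte payload
src.flush()

dst = tempfile.NamedTemporaryFile(delete=False)
in_fd = os.open(src.name, os.O_RDONLY)

sent = os.sendfile(dst.fileno(), in_fd, 0, 6000)   # kernel-space transfer

with open(dst.name, "rb") as f:
    payload = f.read()
print(sent, len(payload))          # 6000 bytes moved, no userspace copy

os.close(in_fd)
src.close(); dst.close()
os.unlink(src.name); os.unlink(dst.name)
```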
# Optimal static file serving configuration
server {
location /static/ {
root /var/www;
sendfile on; # Zero-copy file transfer
tcp_nopush on; # Batch headers + file data in one TCP segment
tcp_nodelay on; # Disable Nagle for small responses
open_file_cache max=10000 inactive=60s; # Cache file descriptors
open_file_cache_valid 30s;
gzip on;
gzip_comp_level 6;
gzip_types text/css application/javascript application/json;
gzip_min_length 256; # Don't compress tiny files
expires 30d; # Cache-Control: max-age=2592000
}
}
open_file_cache
This caches file descriptors, file sizes, modification times, and directory lookups in a per-worker hash table. Without it, every request for /static/app.js requires:
- open() system call to get a file descriptor
- fstat() to get the file size for Content-Length
- close() after the response
With open_file_cache, the file descriptor stays open and metadata is cached. For high-traffic static sites, this reduces system calls by 60-70%.
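A toy version of such a descriptor-and-metadata cache; the expiry policy is illustrative, not Nginx's implementation:

```python
# Sketch of what open_file_cache saves: keep fds and stat metadata in a
# per-worker cache instead of open()/fstat()/close() on every request.
# Eviction policy here is illustrative only.
import os
import tempfile

class FileCache:
    def __init__(self, valid_for: float = 30.0):
        self.valid_for = valid_for
        self.entries = {}   # path -> (fd, size, cached_at)

    def get(self, path: str, now: float):
        entry = self.entries.get(path)
        if entry and now - entry[2] < self.valid_for:
            return entry[0], entry[1]          # cache hit: zero syscalls
        fd = os.open(path, os.O_RDONLY)        # miss: open + fstat once
        size = os.fstat(fd).st_size
        if entry:
            os.close(entry[0])                 # drop the stale descriptor
        self.entries[path] = (fd, size, now)
        return fd, size

tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"console.log('hi')"); tmp.flush()

cache = FileCache()
fd1, size = cache.get(tmp.name, now=0.0)
fd2, _ = cache.get(tmp.name, now=5.0)          # within valid_for: cache hit
print(fd1 == fd2, size)                        # True 17
```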
Connection Handling: Keep-Alive, HTTP/2, and TLS
Nginx's connection handling is designed to minimize per-connection overhead while maximizing throughput for both short-lived HTTP requests and long-lived WebSocket or streaming connections.
Keep-Alive Connections
HTTP keep-alive reuses a TCP connection for multiple requests. Without it, every request incurs a TCP handshake (and TLS handshake for HTTPS). The cost of not using keep-alive is measurable: at 100 requests/sec per client, you pay 100 TCP handshakes/sec (and 100 TLS handshakes for HTTPS). With keep-alive, you pay one handshake and reuse it for all 100 requests.
Nginx tracks keep-alive state per connection with minimal memory overhead. An idle keep-alive connection costs only a file descriptor and ~2.5KB of state. The kernel handles the actual TCP buffers.
| Directive | Default | Purpose |
|---|---|---|
| keepalive_timeout | 75s | Idle time before closing keep-alive connection |
| keepalive_requests | 1000 | Max requests per keep-alive connection |
| keepalive_time | 1h | Max lifetime of a keep-alive connection |
HTTP/2 Multiplexing
HTTP/2 allows multiple requests over a single TCP connection through stream multiplexing. This eliminates head-of-line blocking at the HTTP layer (though TCP-level HOL blocking remains).
HTTP/2 frames are the basic unit of communication. Each frame belongs to a stream (identified by a stream ID), and streams are multiplexed over the single TCP connection. Nginx maintains a priority tree for streams and allocates bandwidth proportionally based on stream weights and dependencies.
The flow control mechanism prevents a fast sender from overwhelming a slow receiver. Each stream has its own flow control window (default 64KB), and the connection itself has a window (default 64KB). Nginx tracks both windows and pauses sending when a window fills up.
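A sketch of this dual-window bookkeeping, using the 64KB defaults mentioned above; the class and method names are made up:

```python
# HTTP/2 flow-control accounting sketch: a DATA frame may be sent only
# if both the stream window and the connection window have room.
# Window sizes follow the 64KB (65535-byte) defaults.

class Http2Sender:
    def __init__(self, window: int = 65535):
        self.conn_window = window
        self.stream_windows = {}

    def open_stream(self, stream_id: int, window: int = 65535):
        self.stream_windows[stream_id] = window

    def send(self, stream_id: int, nbytes: int) -> bool:
        if nbytes > self.conn_window or nbytes > self.stream_windows[stream_id]:
            return False                 # paused until a WINDOW_UPDATE arrives
        self.conn_window -= nbytes
        self.stream_windows[stream_id] -= nbytes
        return True

    def window_update(self, stream_id: int, increment: int):
        if stream_id == 0:               # stream 0 = the whole connection
            self.conn_window += increment
        else:
            self.stream_windows[stream_id] += increment

s = Http2Sender()
s.open_stream(1)
s.open_stream(3)
print(s.send(1, 60000))   # True
print(s.send(3, 60000))   # False: connection window exhausted first
s.window_update(0, 60000)
print(s.send(3, 60000))   # True after a connection-level WINDOW_UPDATE
```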
server {
listen 443 ssl http2;
http2_max_concurrent_streams 128; # Streams per connection
http2_recv_buffer_size 256k;
}
A single HTTP/2 connection can carry 128 concurrent streams (requests/responses). Each stream is independent, so a slow response on stream 5 does not block stream 6. This is particularly important for browser connections, where a page load may trigger 50+ resource requests.
HTTP/2 server push is deprecated
HTTP/2 server push (proactively sending resources before the client requests them) was supported by Nginx but has been removed from major browsers due to poor real-world performance. The 103 Early Hints response code is the modern replacement, hinting to the browser which resources to preload while the server generates the main response. Nginx can emit preload hints through the Link header (add_header Link ...), though native 103 support varies by version, so check the changelog for your release.
TLS Termination
Nginx terminates TLS at the edge, decrypting traffic before forwarding plain HTTP to backends. This offloads the CPU-intensive cryptographic operations from application servers. The TLS handshake is the most CPU-intensive operation Nginx performs, and optimizing it is critical for HTTPS-heavy workloads.
| TLS Operation | Latency | CPU Cost |
|---|---|---|
| Full TLS 1.2 handshake (RSA) | 2 RTT + ~5ms CPU | High (RSA decryption) |
| Full TLS 1.3 handshake (ECDHE) | 1 RTT + ~1ms CPU | Moderate |
| TLS session resumption (ticket) | 1 RTT + ~0.5ms CPU | Low |
| TLS session resumption (0-RTT) | 0 RTT + ~0.5ms CPU | Low |
The key optimization for TLS at scale
Enable TLS session tickets (ssl_session_tickets on) and set a shared session cache (ssl_session_cache shared:SSL:50m). This lets returning clients resume sessions without a full handshake, reducing latency from ~5ms to ~0.5ms and cutting CPU usage by 80%+ for repeat visitors. For TLS 1.3, early data (0-RTT) eliminates the round-trip entirely for idempotent requests.
What Happens When Things Break
Understanding failure modes is essential because Nginx is typically the first component in the request path. When Nginx fails, everything behind it is unreachable.
| Failure | What Happens | How to Detect | How to Fix |
|---|---|---|---|
| Worker crashes | Master respawns immediately, other workers unaffected | Error log entry, brief connection drops | Check error log for segfault cause, update Nginx or disable faulty module |
| All upstreams down | Nginx returns 502 Bad Gateway | Upstream error count in logs, monitoring alerts | Fix backends, configure fallback with error_page 502 /maintenance.html |
| File descriptor limit hit | New connections refused with "Too many open files" | worker_connections exceeded alerts in error log | Increase worker_rlimit_nofile and OS ulimit -n |
| Slow upstream | Connections queue up, worker connections saturated | Rising proxy_read_timeout errors, growing active connections | Lower timeouts, add circuit breaking, scale backends |
| SSL certificate expired | Browsers show security warning, drop traffic | Certificate monitoring, CT log watching | Automate renewal with certbot/ACME |
| Disk full (cache/logs) | Cache stops working, logs stop writing, potential crash | Disk usage alerts | Rotate logs with logrotate, limit cache size with max_size |
The thundering herd problem
With many workers blocked on a shared listening socket, an incoming connection on older kernels wakes every worker even though only one can accept() it. Nginx historically serialized accepts with accept_mutex to avoid this. On Linux 3.9+, use reuseport in the listen directive to give each worker its own accept queue: listen 80 reuseport;. This eliminates the thundering herd, lets the kernel spread connections evenly across workers, and improves accept latency by 2-3x.
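The reuseport behavior can be demonstrated from Python (Linux 3.9+): two listening sockets bind the same port, and the kernel maintains a separate accept queue for each:

```python
# SO_REUSEPORT demonstration: two independent listeners on one port,
# as with Nginx's `listen 80 reuseport;`. Linux 3.9+ only.
import socket

def make_listener(port: int) -> socket.socket:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(128)                      # each socket gets its own accept queue
    return s

a = make_listener(0)                   # kernel picks a free port
port = a.getsockname()[1]
b = make_listener(port)                # second bind succeeds with reuseport
print(a.getsockname()[1] == b.getsockname()[1])   # True: shared port
a.close(); b.close()
```

Without SO_REUSEPORT on both sockets, the second bind would fail with EADDRINUSE; with it, the kernel hashes incoming connections across the listeners.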
Performance Characteristics
These numbers come from real-world benchmarks and my production experience. Your mileage will vary based on hardware, kernel version, and workload mix. The key takeaway is that Nginx's overhead per request is measured in microseconds, not milliseconds.
| Metric | Typical Value | Key Factor |
|---|---|---|
| Memory per connection | ~2.5 KB (idle keep-alive) | Minimal state: fd, buffers, timers |
| Memory per worker | ~10-30 MB | Config complexity, loaded modules |
| Static file requests/sec | 100,000-500,000+ per core | Bottleneck: disk I/O or network bandwidth |
| Proxy requests/sec | 20,000-80,000 per core | Upstream latency is the bottleneck |
| HTTP/2 streams per connection | 128 (default) | http2_max_concurrent_streams |
| TLS handshakes/sec | 5,000-30,000 per core | RSA vs ECDSA, key size (2048 vs 4096) |
| Config reload time | < 100ms | No connections dropped |
| Max connections per worker | 65,535 (fd limit) | OS ulimit and worker_connections |
| Latency overhead (proxy) | 0.1-0.5ms | Header parsing, upstream connection |
The real bottleneck is almost never Nginx
In my experience, Nginx itself is rarely the performance bottleneck. It adds 0.1-0.5ms of latency per request as a reverse proxy. The bottleneck is almost always the upstream backend (slow database queries, heavy computation) or the network (cross-AZ latency, bandwidth limits). I optimize backends first and Nginx configuration second.
How This Compares to Alternatives
Choosing a reverse proxy is one of the most consequential infrastructure decisions because it sits in the critical path of every request. Each option has different strengths and tradeoffs.
| Feature | Nginx | Apache (event MPM) | HAProxy | Envoy | Caddy |
|---|---|---|---|---|---|
| Architecture | Event-driven, multi-process | Event-driven + thread pool | Event-driven, multi-thread | Event-driven, multi-thread | Event-driven, goroutines |
| Static file serving | Excellent (sendfile) | Good | Not designed for this | Not designed for this | Good |
| Reverse proxy | Excellent | Good | Excellent | Excellent | Good |
| Load balancing | Good (basic algorithms) | Basic | Excellent (advanced algorithms, stick tables) | Excellent (xDS API, circuit breaking) | Basic |
| TLS automation | Manual (certbot sidecar) | Manual | Manual | SDS API (automatic) | Built-in (ACME) |
| Config format | Static file, reload needed | Static file + .htaccess | Static file, reload needed | API-driven (xDS), hot config | JSON/Caddyfile, auto-reload |
| Memory per connection | ~2.5 KB | ~10 KB (event MPM) | ~3 KB | ~30-50 KB | ~5-10 KB |
| HTTP/3 (QUIC) | Experimental | No | Yes (recent) | Yes | Yes |
| Service mesh sidecar | No | No | No | Yes (designed for this) | No |
I reach for Nginx when I need a proven, lightweight reverse proxy with excellent static file serving and straightforward configuration. I choose HAProxy when advanced load balancing features matter (stick tables, advanced health checks, detailed metrics per backend). I pick Envoy when building a service mesh or when I need dynamic configuration via API (xDS protocol). I use Caddy for small projects where automatic TLS with zero configuration is the priority.
The Envoy vs Nginx debate for microservices
In a Kubernetes environment, I increasingly see teams default to Envoy (via Istio or standalone) because of its API-driven configuration. Nginx requires config file changes and reloads, while Envoy can update routing rules via xDS without any restart. For service mesh sidecars, where hundreds of proxy instances need coordinated configuration changes, API-driven configuration is a significant operational advantage. But for edge proxying and static file serving, Nginx remains the better choice due to lower memory overhead and the sendfile optimization.
When to Use What
| Scenario | Best Choice | Reason |
|---|---|---|
| Static site + reverse proxy | Nginx | sendfile, low memory, simple config |
| High-availability TCP/HTTP LB | HAProxy | Stick tables, detailed health checks |
| Service mesh sidecar | Envoy | xDS API, circuit breaking, observability |
| Small project, auto-TLS | Caddy | Zero-config ACME, Caddyfile simplicity |
| Kubernetes ingress | Nginx Ingress or Envoy-based (e.g., Contour) | Depends on team familiarity and dynamic config needs |
Interview Cheat Sheet
These are the key points you need to remember for interviews. Each bullet pairs a trigger question with a concise, technically accurate response.
- When asked about the process model: "Nginx uses a master-worker model. The master manages workers and handles signals. Each worker is a single-threaded event loop. Typically one worker per CPU core."
- When asked about concurrency: "A single Nginx worker handles 10,000+ connections using epoll and non-blocking I/O. Memory per connection is ~2.5KB. This is why Nginx can handle the C10K problem easily."
- When asked about epoll: "epoll is Linux's I/O event notification mechanism. The worker registers interest in socket events (readable, writable) and blocks on epoll_wait(). When events are ready, it processes them in a tight loop. No thread-per-connection overhead."
- When asked about graceful reload: "Send SIGHUP to the master. It spawns new workers with the new config, then gracefully shuts down old workers. Old workers finish in-flight requests before exiting. Zero connections dropped."
- When asked about load balancing: "Round-robin by default. least_conn for variable request durations. ip_hash for session affinity. random two least_conn for large pools (power of two choices)."
- When asked about static files: "Nginx uses sendfile() for zero-copy transfer from disk to socket. Combined with open_file_cache for fd caching and gzip compression, it can serve 100K+ static requests per second per core."
- When asked about TLS termination: "Nginx terminates TLS at the edge and forwards plain HTTP to backends. Session tickets and shared session cache reduce repeat handshake cost by 80%+. ECDSA certificates are faster than RSA for handshakes."
- When asked about HTTP/2: "HTTP/2 multiplexes multiple streams over a single TCP connection, eliminating HTTP-level head-of-line blocking. Nginx supports 128 concurrent streams per connection by default."
- When asked why not Apache: "Apache's prefork model uses 2-10MB per connection. For 10,000 connections, that is 20-100GB of RAM. Nginx uses 25MB for the same load. The event MPM improved Apache significantly, but Nginx's architecture was designed for this from the start."
- When asked about the biggest limitation: "Any blocking operation inside a worker stalls all connections on that worker. Avoid blocking DNS, synchronous disk I/O on slow disks, and third-party modules that make blocking calls."
Quick Recap
These are the core facts about Nginx that every backend and infrastructure engineer should know.
- Nginx uses a master-worker process model where the master manages configuration and worker lifecycle, and each worker handles thousands of connections through a single-threaded event loop.
- The event loop uses epoll (Linux) for non-blocking I/O, allowing one worker to multiplex 10,000+ concurrent connections with ~2.5KB memory per connection.
- Graceful reloads spawn new workers with updated config while old workers finish in-flight requests, achieving zero-downtime configuration changes.
- HTTP requests pass through 11 processing phases, from IP resolution through rewriting, authentication, rate limiting, content generation, and logging.
- Load balancing to upstream backends supports round-robin, least connections, IP hash, and random-two-choices algorithms, with connection pooling via the keepalive directive.
- Static file serving uses the sendfile() system call for zero-copy I/O and open_file_cache for file descriptor caching, reaching 100K+ requests/sec per core.
- TLS termination at the Nginx layer offloads encryption from backends, with session tickets and shared caches reducing repeat handshake overhead by 80%+.
- The primary performance bottleneck in an Nginx deployment is almost never Nginx itself, but rather upstream backend latency or network constraints.
Related Concepts
These articles complement this one by covering related technologies and protocols.
- How Linux containers work: Understanding cgroups and namespaces explains how Nginx worker processes are resource-constrained in containerized deployments
- How SSL certificates work: The TLS handshake that Nginx terminates, including certificate chains, OCSP stapling, and key exchange algorithms
- How gRPC works: The HTTP/2-based RPC framework that Nginx can proxy with grpc_pass, including streaming and multiplexing semantics
- How DNS resolution works: Understanding DNS is critical for Nginx upstream configuration, since Nginx resolves upstream hostnames at config load time (not per-request) unless you use the resolver directive