How Nginx processes a request
How Nginx uses an event-driven architecture with epoll, master-worker process model, and efficient connection handling to serve millions of concurrent connections.
The Interview Question
Interviewer: "Your service needs to handle 50,000 concurrent WebSocket connections behind a reverse proxy. You chose Nginx over Apache. Walk me through how Nginx handles that many connections with just a few worker processes, and why Apache's model would struggle here."
This question tests whether you understand event-driven I/O, the difference between process-per-connection and event-loop architectures, and the practical implications for memory and CPU usage at scale. The interviewer is looking for someone who can reason about system resource constraints, not just recite configuration directives. A great answer covers the event loop, explains why memory-per-connection matters, and contrasts with Apache's threading model quantitatively.
What to Clarify Before Answering
You: "Good question. Let me clarify the scope..."
- "Should I focus on Nginx as a reverse proxy (proxying to upstream backends), or also cover static file serving and load balancing?"
- "Are we talking about Nginx open-source or Nginx Plus (commercial) with active health checks?"
- "Should I cover the HTTP request processing phases, or keep it high-level at the architecture and I/O model?"
- "Is HTTP/2 or gRPC proxying in scope, or just HTTP/1.1?"
Why this matters: Nginx does many things (static files, reverse proxy, load balancer, TLS terminator, HTTP/2 gateway, mail proxy). A strong candidate narrows the discussion to the relevant use case and shows they know the breadth of the tool.
The 30-Second Answer
Nginx uses a master-worker process model with a small, fixed number of worker processes (typically one per CPU core). Each worker runs a single-threaded event loop powered by epoll (Linux) or kqueue (BSD) that handles thousands of connections simultaneously through non-blocking I/O. When a request arrives, the worker does not spawn a thread or fork a process. Instead, it registers interest in I/O events (socket readable, writable, timer expired) and processes them as they occur. This lets a single worker handle 10,000+ concurrent connections using roughly 2.5KB of memory per connection. HTTP requests pass through a multi-phase processing pipeline (11 phases) where modules handle rewriting, authentication, proxying, compression, and logging.
The Architecture Overview
Nginx was designed from the ground up to solve the C10K problem (handling 10,000+ concurrent connections). Igor Sysoev created it in 2004 specifically because Apache's process-per-connection model could not scale to the connection counts that large Russian websites needed. The fundamental design decision was to use a fixed number of event-driven worker processes rather than spawning a thread or process per connection.
The master process runs as root (to bind to privileged ports like 80 and 443), reads the configuration file, and spawns worker processes. Workers drop privileges and run as a less-privileged user (typically nginx or www-data). Each worker independently accepts connections from a shared listening socket and processes them through its event loop. The cache manager and cache loader are helper processes that manage the on-disk proxy cache.
The key architectural insight: connections are distributed across workers through the kernel's socket accept mechanism (SO_REUSEPORT on modern Linux, or the accept_mutex on older versions). Each worker is independent, sharing no memory for request processing. This eliminates lock contention entirely.
The total maximum concurrent connections Nginx can handle is worker_processes * worker_connections. With 4 workers and 10,240 connections each, that is 40,960 concurrent connections. In practice, each proxied connection uses two file descriptors (one for client, one for upstream), so the effective proxied connection limit is half that.
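A quick sanity check of that arithmetic, as a sketch (the halving for proxied traffic follows the two-file-descriptors-per-request reasoning above):

```python
# Back-of-the-envelope capacity check for an Nginx deployment.
# Assumes each proxied request holds two file descriptors
# (one client-side, one upstream-side), as described above.

def max_concurrent(workers: int, worker_connections: int, proxied: bool = True) -> int:
    total = workers * worker_connections
    # A proxied connection consumes a client fd and an upstream fd,
    # so the effective request capacity is roughly halved.
    return total // 2 if proxied else total

print(max_concurrent(4, 10240, proxied=False))  # 40960 raw connections
print(max_concurrent(4, 10240))                 # 20480 proxied requests
```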
The Master-Worker Process Model
The master process is the supervisor. It never handles client traffic directly. It runs as root to bind to privileged ports (80, 443) and then spawns worker processes that drop to unprivileged user (nginx or www-data). This privilege separation is a security feature: even if a worker is compromised through a vulnerability, the attacker has limited permissions.
The master process responsibilities:
- Read and validate the configuration file
- Bind to listening sockets (ports 80, 443, etc.)
- Spawn worker processes with reduced privileges
- Handle signals for graceful reloads and shutdowns
- Monitor workers and respawn them if they crash
- Write the master PID to a file for management scripts
Graceful Reload (Zero-Downtime Config Changes)
This is one of Nginx's most important operational features. Unlike most servers that require a restart for configuration changes, Nginx can swap its entire configuration while handling live traffic. No load balancer drain, no connection drops, no downtime window.
When you run nginx -s reload, the master process:
- Parses and validates the new configuration (if invalid, the reload is aborted with no impact on running workers)
- Spawns new worker processes with the new configuration
- Sends a graceful shutdown signal to old workers
- Old workers stop accepting new connections but finish processing in-flight requests
- Once all in-flight requests complete, old workers exit
This is why Nginx can reload configuration with zero downtime. At any point during the reload, either old or new workers (or both) are handling traffic. No connections are dropped.
Why this matters in production
In a Kubernetes environment with frequent config changes (new upstreams, updated TLS certificates), this reload mechanism means you never need to restart the Nginx pod. I use inotifywait or a sidecar to watch for config changes and trigger nginx -s reload automatically. The reload takes milliseconds and does not drop a single connection.
Worker Process Configuration
| Directive | Typical Value | Purpose |
|---|---|---|
| worker_processes | auto (one per CPU core) | Number of worker processes |
| worker_connections | 1024-65535 | Max connections per worker |
| worker_cpu_affinity | auto | Pin workers to specific CPU cores |
| worker_rlimit_nofile | 65535 | Max open file descriptors per worker |
| accept_mutex | off (with SO_REUSEPORT) | Serializes accept() calls across workers |
Event-Driven Architecture with epoll
This is the core of why Nginx is fast. Each worker runs a single-threaded event loop that multiplexes thousands of connections through non-blocking I/O. No threads are created per connection, no processes are forked. The event loop is the entire concurrency model.
How epoll Works
epoll is Linux's scalable I/O event notification mechanism. It replaced the older select() and poll() system calls, which had O(N) performance (the kernel scanned every file descriptor on every call). epoll is O(1) for waiting and O(active_events) for processing.
The three key system calls:
- epoll_create(): create an epoll instance (returns a file descriptor)
- epoll_ctl(epfd, ADD/MOD/DEL, fd): register, modify, or remove interest in a file descriptor
- epoll_wait(epfd, max_events, timeout): block until events are ready or the timeout expires
When a client sends data, the network card triggers an interrupt. The kernel protocol stack processes the packet and marks the socket as readable. The next epoll_wait() call returns that socket in the ready list. The worker reads the data (non-blocking, so it never waits), processes the request, and registers interest in writability when the response is ready.
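This readiness-notification cycle can be exercised directly through Python's `select.epoll` wrapper (Linux only). A minimal sketch, using a pipe in place of a client socket:

```python
# Minimal demonstration of the epoll workflow described above,
# using a pipe instead of a network socket (Linux only).
import os
import select

epoll = select.epoll()                 # epoll_create()
r, w = os.pipe()                       # epoll works on any fd, not just sockets
os.set_blocking(r, False)              # non-blocking, like an Nginx connection
epoll.register(r, select.EPOLLIN)      # epoll_ctl(ADD): interest in reads

os.write(w, b"GET / HTTP/1.1\r\n")     # kernel marks the fd readable
events = epoll.poll(timeout=1)         # epoll_wait(): returns the ready list
for fd, mask in events:
    if mask & select.EPOLLIN:
        data = os.read(fd, 4096)       # guaranteed not to block
        print(data)                    # b'GET / HTTP/1.1\r\n'

epoll.unregister(r)
epoll.close()
```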
The Event Loop
The simplified event loop pseudocode:
// Nginx worker event loop (simplified)
function worker_event_loop():
    while not shutting_down:
        timeout = find_next_timer_expiry()
        events = epoll_wait(epfd, max_events, timeout)
        for event in events:
            if event.is_accept:
                conn = accept(listen_fd)         // Non-blocking accept
                set_nonblocking(conn.fd)
                epoll_add(conn.fd, READ)         // Register for read events
            elif event.is_readable:
                data = read(event.fd)            // Non-blocking read
                if data:
                    process_request(event.conn, data)
                else:
                    close_connection(event.conn) // Peer closed the socket
            elif event.is_writable:
                bytes_sent = write(event.fd, event.conn.send_buf)
                if event.conn.send_buf is empty:
                    epoll_modify(event.fd, READ) // Done writing
        process_expired_timers()
        run_posted_events()
Why This Is Better Than Thread-Per-Connection
The critical difference between Nginx and Apache's prefork/worker model:
| Aspect | Nginx (event-driven) | Apache prefork (process-per-conn) |
|---|---|---|
| Memory per connection | ~2.5 KB | ~2-10 MB (full process) |
| 10K connections memory | ~25 MB | ~20-100 GB |
| Context switches | Minimal (single thread) | Constant (OS scheduling) |
| Idle connection cost | Nearly zero (just an fd) | Full process/thread resources |
| CPU cache efficiency | Excellent (single thread) | Poor (threads thrash L1/L2) |
The one thing that blocks Nginx workers
Any blocking operation inside a worker stalls ALL connections on that worker. This includes: blocking DNS resolution (use the resolver directive for async DNS), reading large files from a slow disk without AIO, and third-party modules that make synchronous HTTP calls. I have seen a single blocking DNS lookup add 5 seconds of latency to every connection on that worker.
Request Processing Phases
When an HTTP request arrives at a worker, Nginx processes it through a well-defined pipeline of 11 phases. Each phase has registered handler modules that execute in order. Understanding these phases is critical for debugging why a request is handled differently than expected, especially when multiple location blocks, rewrite rules, and access controls interact.
The 11 Phases
The 11 phases, in order: POST_READ, SERVER_REWRITE, FIND_CONFIG, REWRITE, POST_REWRITE, PREACCESS, ACCESS, POST_ACCESS, PRECONTENT, CONTENT, and LOG. Each phase can have multiple registered modules. A module in any phase can terminate the request early (e.g., the access phase returns 403, or the rate limiter returns 429). If no module terminates the request, processing continues to the next phase.
The key phases for most configurations:
Phase 3 (FIND_CONFIG) is where Nginx matches the request URI to a location block. This is the most commonly misunderstood phase. The matching order is:
- Exact match (location = /api/health): checked first
- Longest prefix match with the ^~ modifier: stops further searching
- Regular expression matches (first match wins, in config file order)
- Longest prefix match without ^~: used only if no regex matched
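The precedence can be sketched in a few lines. This is a simplified model with hypothetical location tables, not Nginx's implementation:

```python
# Simplified sketch of the FIND_CONFIG location-matching order described
# above. The location tables and URIs here are made-up examples.
import re

def match_location(uri, exact, prefixes, regexes):
    """exact: uri -> location name; prefixes: prefix -> has_caret_tilde;
    regexes: list of (pattern, name) in config-file order."""
    if uri in exact:                                   # 1. exact match wins
        return exact[uri]
    best = max((p for p in prefixes if uri.startswith(p)),
               key=len, default=None)                  # longest matching prefix
    if best is not None and prefixes[best]:            # 2. ^~ prefix: stop here
        return f"prefix {best}"
    for pattern, name in regexes:                      # 3. first regex match
        if re.search(pattern, uri):
            return name
    if best is not None:                               # 4. fall back to prefix
        return f"prefix {best}"
    return None

exact = {"/api/health": "exact /api/health"}
prefixes = {"/static/": True, "/api/": False}          # True = declared with ^~
regexes = [(r"\.php$", "regex .php")]

print(match_location("/api/health", exact, prefixes, regexes))     # exact
print(match_location("/static/app.js", exact, prefixes, regexes))  # ^~ prefix
print(match_location("/api/index.php", exact, prefixes, regexes))  # regex
```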
Phase 6 (PREACCESS) is where rate limiting happens. The limit_req module uses a leaky bucket algorithm with configurable burst size and delay. This is the first line of defense against DDoS and abuse.
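The leaky-bucket accounting can be sketched as follows. The unit of one request per token is illustrative (Nginx internally tracks excess at finer granularity), and the class name is made up:

```python
# Sketch of leaky-bucket rate limiting in the style of limit_req:
# excess requests drain at the configured rate; anything beyond the
# burst allowance is rejected. Illustrative, not Nginx internals.

class LeakyBucket:
    def __init__(self, rate_per_sec: float, burst: int):
        self.rate = rate_per_sec
        self.burst = burst
        self.excess = 0.0      # requests queued above the steady rate
        self.last = 0.0        # timestamp of the previous request

    def allow(self, now: float) -> bool:
        # Excess drains at the configured rate between requests.
        self.excess = max(0.0, self.excess - (now - self.last) * self.rate)
        self.last = now
        if self.excess + 1 > self.burst + 1:   # over burst: reject (429/503)
            return False
        self.excess += 1
        return True

bucket = LeakyBucket(rate_per_sec=10, burst=5)
results = [bucket.allow(now=0.0) for _ in range(8)]
print(results)   # first 6 pass (1 + burst of 5), the rest are rejected
```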
Phase 7 (ACCESS) handles authentication and authorization. IP-based allow/deny rules, HTTP basic auth, and JWT validation all happen here. If access is denied, processing stops and a 403 is returned.
Phase 10 (CONTENT) is where the actual response generation happens. Only one content handler can execute per request (proxy_pass, fastcgi_pass, static file serving, or return).
Upstream and Load Balancing
When Nginx acts as a reverse proxy, it distributes requests to backend servers using configurable load balancing algorithms. The upstream module is one of the most heavily used parts of Nginx in production, and understanding its behavior is critical for high-availability deployments.
Load Balancing Algorithms
| Algorithm | Directive | Behavior | Best For |
|---|---|---|---|
| Round Robin | (default) | Rotate through servers sequentially | Homogeneous backends |
| Weighted Round Robin | weight=3 | Proportional distribution by weight | Mixed-capacity backends |
| Least Connections | least_conn | Send to server with fewest active connections | Variable request duration |
| IP Hash | ip_hash | Hash client IP to sticky backend | Session affinity without cookies |
| Generic Hash | hash $request_uri | Hash arbitrary key to server | Cache-friendly distribution |
| Random with Two Choices | random two least_conn | Pick 2 random servers, choose least-loaded | Large server pools (power of 2 choices) |
The power of two random choices algorithm deserves special attention. Instead of checking all servers (which requires shared state in a multi-worker setup), it picks two servers at random and sends the request to the less-loaded one. Research shows this achieves near-optimal load distribution with O(1) decision time, making it ideal for large upstream pools with 50+ servers.
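A sketch of the algorithm with made-up server names; even without global least-connections state, the load spread stays tight:

```python
# "Power of two choices": sample two backends at random and send the
# request to the one with fewer active connections. Server names and
# counts are illustrative.
import random

def pick_backend(active_conns: dict, rng: random.Random) -> str:
    a, b = rng.sample(list(active_conns), 2)   # two random candidates
    return a if active_conns[a] <= active_conns[b] else b

rng = random.Random(42)                        # seeded for reproducibility
conns = {f"srv{i}": 0 for i in range(10)}
for _ in range(10_000):
    conns[pick_backend(conns, rng)] += 1

spread = max(conns.values()) - min(conns.values())
print("max-min spread:", spread)   # far tighter than pure random assignment
```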
Passive vs Active Health Checks
Nginx open-source performs passive health checks only. It monitors responses from upstreams and marks a server as down after max_fails consecutive failures within fail_timeout seconds. The server is reintroduced after fail_timeout expires.
This means a real user request is used as the health probe, which has a cost: the first few users after a backend goes down will see errors before Nginx detects the failure.
Nginx Plus adds active health checks that send periodic synthetic requests to a health endpoint (e.g., /health). This detects failures before any real user is affected.
For open-source Nginx, I mitigate this by combining max_fails=2 fail_timeout=10s with proxy_next_upstream error timeout http_502 http_503. This retries failed requests on another backend transparently, so the user never sees the error (the first backend is just slightly slower while the retry happens).
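A sketch of this passive-failure accounting; the class and field names are illustrative, not Nginx internals:

```python
# Passive health checking in the style of max_fails / fail_timeout:
# mark a backend down after repeated failures, reintroduce it once
# fail_timeout has elapsed. Illustrative sketch only.

class Backend:
    def __init__(self, max_fails: int = 2, fail_timeout: float = 10.0):
        self.max_fails = max_fails
        self.fail_timeout = fail_timeout
        self.fails = 0
        self.down_until = 0.0

    def available(self, now: float) -> bool:
        return now >= self.down_until

    def report(self, ok: bool, now: float) -> None:
        if ok:
            self.fails = 0
            return
        self.fails += 1
        if self.fails >= self.max_fails:              # too many failures:
            self.down_until = now + self.fail_timeout # remove from rotation
            self.fails = 0

b = Backend()
b.report(ok=False, now=0.0)
b.report(ok=False, now=1.0)    # second failure marks the backend down
print(b.available(5.0))        # False: still inside fail_timeout
print(b.available(12.0))       # True: reintroduced after fail_timeout
```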
Connection Pooling to Upstreams
Without keepalive connections, every proxied request incurs:
- TCP three-way handshake (~0.5ms local, 1-50ms cross-AZ)
- TLS handshake if HTTPS (~5-30ms additional)
- Request/response exchange
- TCP connection teardown
With keepalive connections, the TCP and TLS overhead is paid once, and subsequent requests reuse the established connection.
# Upstream configuration with connection pooling
upstream backend {
least_conn;
server 10.0.1.10:8080 max_fails=3 fail_timeout=30s;
server 10.0.1.11:8080 max_fails=3 fail_timeout=30s;
server 10.0.1.12:8080 max_fails=3 fail_timeout=30s;
keepalive 64; # Pool of 64 idle connections per worker
}
server {
location /api/ {
proxy_pass http://backend;
proxy_http_version 1.1; # Required for keepalive
proxy_set_header Connection ""; # Clear hop-by-hop header
proxy_connect_timeout 5s;
proxy_read_timeout 30s;
proxy_next_upstream error timeout http_502; # Retry on failure
}
}
Static File Serving and Zero-Copy I/O
Nginx is exceptionally fast at serving static files due to kernel-level optimizations that bypass userspace entirely for the data transfer path.
The sendfile() System Call
Without sendfile, serving a file requires:
- read() from disk into a kernel buffer
- Copy from kernel buffer to userspace buffer
- Copy from userspace buffer to kernel socket buffer
- write() from socket buffer to network
This involves four context switches and four data copies (two of them through the CPU; the disk and network legs use DMA). With sendfile(), a single system call tells the kernel to transfer data directly from the file page cache to the socket buffer.
This is zero-copy: data moves from the page cache to the network card with only 2 context switches and zero userspace copies. On Linux with DMA-capable network cards and TCP_CORK, the kernel can further optimize by batching the HTTP headers and file data into a single TCP segment.
The performance difference is significant for large files. Serving a 10MB file with read()/write() moves 40MB of data across the four copies, 20MB of it through CPU copies into and out of userspace. With sendfile(), the CPU copies zero bytes (DMA handles the transfer). On a server pushing thousands of concurrent file downloads, this frees substantial memory bandwidth and CPU time.
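Python exposes this syscall as os.sendfile. A Linux-only sketch with throwaway temp files (file-to-file sendfile needs kernel 2.6.33+; Nginx uses the file-to-socket path):

```python
# Demonstration of the sendfile() zero-copy path described above.
# os.sendfile hands the transfer to the kernel; the bytes never enter
# a userspace buffer. Linux-only sketch using temp files.
import os
import tempfile

src = tempfile.NamedTemporaryFile(delete=False)
src.write(b"hello " * 1000)        # 6000-byte payload
src.flush()

dst = tempfile.NamedTemporaryFile(delete=False)
in_fd = os.open(src.name, os.O_RDONLY)

sent = os.sendfile(dst.fileno(), in_fd, 0, 6000)   # kernel-space transfer

with open(dst.name, "rb") as f:
    payload = f.read()
print(sent, len(payload))          # 6000 bytes moved, no userspace copy

os.close(in_fd)
src.close(); dst.close()
os.unlink(src.name); os.unlink(dst.name)
```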
# Optimal static file serving configuration
server {
location /static/ {
root /var/www;
sendfile on; # Zero-copy file transfer
tcp_nopush on; # Batch headers + file data in one TCP segment
tcp_nodelay on; # Disable Nagle for small responses
open_file_cache max=10000 inactive=60s; # Cache file descriptors
open_file_cache_valid 30s;
gzip on;
gzip_comp_level 6;
gzip_types text/css application/javascript application/json;
gzip_min_length 256; # Don't compress tiny files
expires 30d; # Cache-Control: max-age=2592000
}
}
open_file_cache
This caches file descriptors, file sizes, modification times, and directory lookups in a per-worker hash table. Without it, every request for /static/app.js requires:
- open() system call to get a file descriptor
- fstat() to get the file size for Content-Length
- close() after the response
With open_file_cache, the file descriptor stays open and metadata is cached. For high-traffic static sites, this reduces system calls by 60-70%.
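A toy version of such a descriptor-and-metadata cache; the expiry policy is illustrative, not Nginx's implementation:

```python
# Sketch of what open_file_cache saves: keep fds and stat metadata in a
# per-worker cache instead of open()/fstat()/close() on every request.
# Eviction policy here is illustrative only.
import os
import tempfile

class FileCache:
    def __init__(self, valid_for: float = 30.0):
        self.valid_for = valid_for
        self.entries = {}   # path -> (fd, size, cached_at)

    def get(self, path: str, now: float):
        entry = self.entries.get(path)
        if entry and now - entry[2] < self.valid_for:
            return entry[0], entry[1]          # cache hit: zero syscalls
        fd = os.open(path, os.O_RDONLY)        # miss: open + fstat once
        size = os.fstat(fd).st_size
        if entry:
            os.close(entry[0])                 # drop the stale descriptor
        self.entries[path] = (fd, size, now)
        return fd, size

tmp = tempfile.NamedTemporaryFile(delete=False)
tmp.write(b"console.log('hi')"); tmp.flush()

cache = FileCache()
fd1, size = cache.get(tmp.name, now=0.0)
fd2, _ = cache.get(tmp.name, now=5.0)          # within valid_for: cache hit
print(fd1 == fd2, size)                        # True 17
```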
Connection Handling: Keep-Alive, HTTP/2, and TLS
Nginx's connection handling is designed to minimize per-connection overhead while maximizing throughput for both short-lived HTTP requests and long-lived WebSocket or streaming connections.
Keep-Alive Connections
HTTP keep-alive reuses a TCP connection for multiple requests. Without it, every request incurs a TCP handshake (and TLS handshake for HTTPS). The cost of not using keep-alive is measurable: at 100 requests/sec per client, you pay 100 TCP handshakes/sec (and 100 TLS handshakes for HTTPS). With keep-alive, you pay one handshake and reuse it for all 100 requests.
Nginx tracks keep-alive state per connection with minimal memory overhead. An idle keep-alive connection costs only a file descriptor and ~2.5KB of state. The kernel handles the actual TCP buffers.
| Directive | Default | Purpose |
|---|---|---|
| keepalive_timeout | 75s | Idle time before closing keep-alive connection |
| keepalive_requests | 1000 | Max requests per keep-alive connection |
| keepalive_time | 1h | Max lifetime of a keep-alive connection |
HTTP/2 Multiplexing
HTTP/2 allows multiple requests over a single TCP connection through stream multiplexing. This eliminates head-of-line blocking at the HTTP layer (though TCP-level HOL blocking remains).
HTTP/2 frames are the basic unit of communication. Each frame belongs to a stream (identified by a stream ID), and streams are multiplexed over the single TCP connection. Nginx maintains a priority tree for streams and allocates bandwidth proportionally based on stream weights and dependencies.
The flow control mechanism prevents a fast sender from overwhelming a slow receiver. Each stream has its own flow control window (default 64KB), and the connection itself has a window (default 64KB). Nginx tracks both windows and pauses sending when a window fills up.
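A sketch of this dual-window bookkeeping, using the 64KB defaults mentioned above; the class and method names are made up:

```python
# HTTP/2 flow-control accounting sketch: a DATA frame may be sent only
# if both the stream window and the connection window have room.
# Window sizes follow the 64KB (65535-byte) defaults.

class Http2Sender:
    def __init__(self, window: int = 65535):
        self.conn_window = window
        self.stream_windows = {}

    def open_stream(self, stream_id: int, window: int = 65535):
        self.stream_windows[stream_id] = window

    def send(self, stream_id: int, nbytes: int) -> bool:
        if nbytes > self.conn_window or nbytes > self.stream_windows[stream_id]:
            return False                 # paused until a WINDOW_UPDATE arrives
        self.conn_window -= nbytes
        self.stream_windows[stream_id] -= nbytes
        return True

    def window_update(self, stream_id: int, increment: int):
        if stream_id == 0:               # stream 0 = the whole connection
            self.conn_window += increment
        else:
            self.stream_windows[stream_id] += increment

s = Http2Sender()
s.open_stream(1)
s.open_stream(3)
print(s.send(1, 60000))   # True
print(s.send(3, 60000))   # False: connection window exhausted first
s.window_update(0, 60000)
print(s.send(3, 60000))   # True after a connection-level WINDOW_UPDATE
```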
server {
listen 443 ssl http2;
http2_max_concurrent_streams 128; # Streams per connection
http2_recv_buffer_size 256k;
}
A single HTTP/2 connection can carry 128 concurrent streams (requests/responses). Each stream is independent, so a slow response on stream 5 does not block stream 6. This is particularly important for browser connections, where a page load may trigger 50+ resource requests.
HTTP/2 server push is deprecated
HTTP/2 server push (proactively sending resources before the client requests them) was supported by Nginx but has been removed from major browsers due to poor real-world performance. The 103 Early Hints response code is the modern replacement, hinting to the browser which resources to preload while the server generates the main response. Nginx can emit preload hints through the Link header (add_header Link ...), though native 103 support varies by version, so check the changelog for your release.
TLS Termination
Nginx terminates TLS at the edge, decrypting traffic before forwarding plain HTTP to backends. This offloads the CPU-intensive cryptographic operations from application servers. The TLS handshake is the most CPU-intensive operation Nginx performs, and optimizing it is critical for HTTPS-heavy workloads.
| TLS Operation | Latency | CPU Cost |
|---|---|---|
| Full TLS 1.2 handshake (RSA) | 2 RTT + ~5ms CPU | High (RSA decryption) |
| Full TLS 1.3 handshake (ECDHE) | 1 RTT + ~1ms CPU | Moderate |
| TLS session resumption (ticket) | 1 RTT + ~0.5ms CPU | Low |
| TLS session resumption (0-RTT) | 0 RTT + ~0.5ms CPU | Low |
The key optimization for TLS at scale
Enable TLS session tickets (ssl_session_tickets on) and set a shared session cache (ssl_session_cache shared:SSL:50m). This lets returning clients resume sessions without a full handshake, reducing latency from ~5ms to ~0.5ms and cutting CPU usage by 80%+ for repeat visitors. For TLS 1.3, early data (0-RTT) eliminates the round-trip entirely for idempotent requests.
What Happens When Things Break
Understanding failure modes is essential because Nginx is typically the first component in the request path. When Nginx fails, everything behind it is unreachable.
| Failure | What Happens | How to Detect | How to Fix |
|---|---|---|---|
| Worker crashes | Master respawns immediately, other workers unaffected | Error log entry, brief connection drops | Check error log for segfault cause, update Nginx or disable faulty module |
| All upstreams down | Nginx returns 502 Bad Gateway | Upstream error count in logs, monitoring alerts | Fix backends, configure fallback with error_page 502 /maintenance.html |
| File descriptor limit hit | New connections refused with "Too many open files" | worker_connections exceeded alerts in error log | Increase worker_rlimit_nofile and OS ulimit -n |
| Slow upstream | Connections queue up, worker connections saturated | Rising proxy_read_timeout errors, growing active connections | Lower timeouts, add circuit breaking, scale backends |
| SSL certificate expired | Browsers show security warning, drop traffic | Certificate monitoring, CT log watching | Automate renewal with certbot/ACME |
| Disk full (cache/logs) | Cache stops working, logs stop writing, potential crash | Disk usage alerts | Rotate logs with logrotate, limit cache size with max_size |
The thundering herd problem
With many workers blocked on a shared listening socket, an incoming connection on older kernels wakes every worker even though only one can accept() it. Nginx historically serialized accepts with accept_mutex to avoid this. On Linux 3.9+, use reuseport in the listen directive to give each worker its own accept queue: listen 80 reuseport;. This eliminates the thundering herd, lets the kernel spread connections evenly across workers, and improves accept latency by 2-3x.
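The reuseport behavior can be demonstrated from Python (Linux 3.9+): two listening sockets bind the same port, and the kernel maintains a separate accept queue for each:

```python
# SO_REUSEPORT demonstration: two independent listeners on one port,
# as with Nginx's `listen 80 reuseport;`. Linux 3.9+ only.
import socket

def make_listener(port: int) -> socket.socket:
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind(("127.0.0.1", port))
    s.listen(128)                      # each socket gets its own accept queue
    return s

a = make_listener(0)                   # kernel picks a free port
port = a.getsockname()[1]
b = make_listener(port)                # second bind succeeds with reuseport
print(a.getsockname()[1] == b.getsockname()[1])   # True: shared port
a.close(); b.close()
```

Without SO_REUSEPORT on both sockets, the second bind would fail with EADDRINUSE; with it, the kernel hashes incoming connections across the listeners.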
Performance Characteristics
These numbers come from real-world benchmarks and my production experience. Your mileage will vary based on hardware, kernel version, and workload mix. The key takeaway is that Nginx's overhead per request is measured in microseconds, not milliseconds.
| Metric | Typical Value | Key Factor |
|---|---|---|
| Memory per connection | ~2.5 KB (idle keep-alive) | Minimal state: fd, buffers, timers |
| Memory per worker | ~10-30 MB | Config complexity, loaded modules |
| Static file requests/sec | 100,000-500,000+ per core | Bottleneck: disk I/O or network bandwidth |
| Proxy requests/sec | 20,000-80,000 per core | Upstream latency is the bottleneck |
| HTTP/2 streams per connection | 128 (default) | http2_max_concurrent_streams |
| TLS handshakes/sec | 5,000-30,000 per core | RSA vs ECDSA, key size (2048 vs 4096) |
| Config reload time | < 100ms | No connections dropped |
| Max connections per worker | 65,535 (fd limit) | OS ulimit and worker_connections |
| Latency overhead (proxy) | 0.1-0.5ms | Header parsing, upstream connection |
The real bottleneck is almost never Nginx
In my experience, Nginx itself is rarely the performance bottleneck. It adds 0.1-0.5ms of latency per request as a reverse proxy. The bottleneck is almost always the upstream backend (slow database queries, heavy computation) or the network (cross-AZ latency, bandwidth limits). I optimize backends first and Nginx configuration second.
How This Compares to Alternatives
Choosing a reverse proxy is one of the most consequential infrastructure decisions because it sits in the critical path of every request. Each option has different strengths and tradeoffs.
| Feature | Nginx | Apache (event MPM) | HAProxy | Envoy | Caddy |
|---|---|---|---|---|---|
| Architecture | Event-driven, multi-process | Event-driven + thread pool | Event-driven, multi-thread | Event-driven, multi-thread | Event-driven, goroutines |
| Static file serving | Excellent (sendfile) | Good | Not designed for this | Not designed for this | Good |
| Reverse proxy | Excellent | Good | Excellent | Excellent | Good |
| Load balancing | Good (basic algorithms) | Basic | Excellent (advanced algorithms, stick tables) | Excellent (xDS API, circuit breaking) | Basic |
| TLS automation | Manual (certbot sidecar) | Manual | Manual | SDS API (automatic) | Built-in (ACME) |
| Config format | Static file, reload needed | Static file + .htaccess | Static file, reload needed | API-driven (xDS), hot config | JSON/Caddyfile, auto-reload |
| Memory per connection | ~2.5 KB | ~10 KB (event MPM) | ~3 KB | ~30-50 KB | ~5-10 KB |
| HTTP/3 (QUIC) | Experimental | No | Yes (recent) | Yes | Yes |
| Service mesh sidecar | No | No | No | Yes (designed for this) | No |
I reach for Nginx when I need a proven, lightweight reverse proxy with excellent static file serving and straightforward configuration. I choose HAProxy when advanced load balancing features matter (stick tables, advanced health checks, detailed metrics per backend). I pick Envoy when building a service mesh or when I need dynamic configuration via API (xDS protocol). I use Caddy for small projects where automatic TLS with zero configuration is the priority.
The Envoy vs Nginx debate for microservices
In a Kubernetes environment, I increasingly see teams default to Envoy (via Istio or standalone) because of its API-driven configuration. Nginx requires config file changes and reloads, while Envoy can update routing rules via xDS without any restart. For service mesh sidecars, where hundreds of proxy instances need coordinated configuration changes, API-driven configuration is a significant operational advantage. But for edge proxying and static file serving, Nginx remains the better choice due to lower memory overhead and the sendfile optimization.
When to Use What
| Scenario | Best Choice | Reason |
|---|---|---|
| Static site + reverse proxy | Nginx | sendfile, low memory, simple config |
| High-availability TCP/HTTP LB | HAProxy | Stick tables, detailed health checks |
| Service mesh sidecar | Envoy | xDS API, circuit breaking, observability |
| Small project, auto-TLS | Caddy | Zero-config ACME, Caddyfile simplicity |
| Kubernetes ingress | Nginx Ingress or Envoy-based (e.g., Contour) | Depends on team familiarity and dynamic config needs |
Interview Cheat Sheet
These are the key points you need to remember for interviews. Each bullet pairs a trigger question with a concise, technically accurate response.
- When asked about the process model: "Nginx uses a master-worker model. The master manages workers and handles signals. Each worker is a single-threaded event loop. Typically one worker per CPU core."
- When asked about concurrency: "A single Nginx worker handles 10,000+ connections using epoll and non-blocking I/O. Memory per connection is ~2.5KB. This is why Nginx can handle the C10K problem easily."
- When asked about epoll: "epoll is Linux's I/O event notification mechanism. The worker registers interest in socket events (readable, writable) and blocks on epoll_wait(). When events are ready, it processes them in a tight loop. No thread-per-connection overhead."
- When asked about graceful reload: "Send SIGHUP to the master. It spawns new workers with the new config, then gracefully shuts down old workers. Old workers finish in-flight requests before exiting. Zero connections dropped."
- When asked about load balancing: "Round-robin by default. least_conn for variable request durations. ip_hash for session affinity. random two least_conn for large pools (power of two choices)."
- When asked about static files: "Nginx uses sendfile() for zero-copy transfer from disk to socket. Combined with open_file_cache for fd caching and gzip compression, it can serve 100K+ static requests per second per core."
- When asked about TLS termination: "Nginx terminates TLS at the edge and forwards plain HTTP to backends. Session tickets and shared session cache reduce repeat handshake cost by 80%+. ECDSA certificates are faster than RSA for handshakes."
- When asked about HTTP/2: "HTTP/2 multiplexes multiple streams over a single TCP connection, eliminating HTTP-level head-of-line blocking. Nginx supports 128 concurrent streams per connection by default."
- When asked why not Apache: "Apache's prefork model uses 2-10MB per connection. For 10,000 connections, that is 20-100GB of RAM. Nginx uses 25MB for the same load. The event MPM improved Apache significantly, but Nginx's architecture was designed for this from the start."
- When asked about the biggest limitation: "Any blocking operation inside a worker stalls all connections on that worker. Avoid blocking DNS, synchronous disk I/O on slow disks, and third-party modules that make blocking calls."
Quick Recap
These are the core facts about Nginx that every backend and infrastructure engineer should know.
- Nginx uses a master-worker process model where the master manages configuration and worker lifecycle, and each worker handles thousands of connections through a single-threaded event loop.
- The event loop uses epoll (Linux) for non-blocking I/O, allowing one worker to multiplex 10,000+ concurrent connections with ~2.5KB memory per connection.
- Graceful reloads spawn new workers with updated config while old workers finish in-flight requests, achieving zero-downtime configuration changes.
- HTTP requests pass through 11 processing phases, from IP resolution through rewriting, authentication, rate limiting, content generation, and logging.
- Load balancing to upstream backends supports round-robin, least connections, IP hash, and random-two-choices algorithms, with connection pooling via the keepalive directive.
- Static file serving uses the sendfile() system call for zero-copy I/O and open_file_cache for file descriptor caching, reaching 100K+ requests/sec per core.
- TLS termination at the Nginx layer offloads encryption from backends, with session tickets and shared caches reducing repeat handshake overhead by 80%+.
- The primary performance bottleneck in an Nginx deployment is almost never Nginx itself, but rather upstream backend latency or network constraints.
Related Concepts
These articles complement this one by covering related technologies and protocols.
- How Linux containers work: Understanding cgroups and namespaces explains how Nginx worker processes are resource-constrained in containerized deployments
- How SSL certificates work: The TLS handshake that Nginx terminates, including certificate chains, OCSP stapling, and key exchange algorithms
- How gRPC works: The HTTP/2-based RPC framework that Nginx can proxy with grpc_pass, including streaming and multiplexing semantics
- How DNS resolution works: Understanding DNS is critical for Nginx upstream configuration, since Nginx resolves upstream hostnames at config load time (not per-request) unless you use the resolver directive