How to coordinate exclusive access to a shared resource across distributed processes. Covers Redis SETNX locks, fencing tokens, Redlock, etcd/ZooKeeper locks, and Martin Kleppmann's critique.
25 min read · 2026-04-04 · hard · distributed-locking · concurrency · redis · distributed-systems
Distributed locks coordinate exclusive access to a shared resource across machines. Local in-memory locks (synchronized, mutex) are useless because each process has its own memory space.
Redis SET key value NX PX ttl is the most common distributed lock: fast to acquire, auto-expires on crash, and sufficient for most coordination tasks.
Fencing tokens protect against the GC-pause problem: a slow lock holder whose TTL expires can still write to the resource after a new holder takes over. Without fencing, you get silent data corruption.
The Redlock algorithm (Antirez, 2015) acquires locks across 5 independent Redis instances. Martin Kleppmann's critique showed it's still unsafe under clock skew, sparking one of the most important debates in distributed systems.
For strict correctness (financial transactions, shard assignment), use consensus-backed locks (etcd, ZooKeeper). For best-effort coordination (cron dedup, cache warming), Redis locks are fine.
It's 11:59 PM. Your payment service processes end-of-day settlement batches. A Kubernetes pod running the settlement job starts processing batch #4472. Four seconds in, Kubernetes detects a slow health check and spins up a replacement pod. The replacement also picks up batch #4472. Now two pods are processing the same batch simultaneously.
Pod A debits the merchant account. Pod B debits the merchant account. The merchant gets charged twice. The support team spends the next morning issuing refunds, and your engineering team spends the next week explaining how this happened.
The root cause: both pods checked "is batch #4472 already processing?" against their own in-memory state. Each pod's answer was "no" because each pod has its own memory space. The synchronized block that was supposed to prevent concurrent execution only works within a single JVM.
This isn't a theoretical problem. I've seen it in production at companies that should know better. Any system running multiple instances of the same service (which is every production system) needs a way to coordinate exclusive access across those instances.
Local locks solve the single-process case. Distributed locks solve the multi-process case. The difference matters every time you horizontally scale.
A distributed lock is a mechanism that ensures at most one process across a cluster can execute a critical section at any given time. The lock lives in a shared store (Redis, etcd, a database) that all processes can access, rather than in any single process's memory.
Think of a shared bathroom with a lock on the door. Anyone in the building can try the handle. If the door is locked, you wait. If it's unlocked, you enter and lock it behind you. The lock isn't in your pocket (local mutex). It's on the door itself (shared store). Everyone checks the same door.
The distributed lock must provide three properties:
Mutual exclusion: At most one holder at any time.
Deadlock freedom: If a holder crashes without releasing, the lock eventually becomes available (via TTL or lease expiration).
Fault tolerance: The lock service itself must be available even if some of its nodes fail.
For your interview: say "I'd use a distributed lock with TTL for mutual exclusion, and fencing tokens if I need correctness under process pauses." That covers the two most important design decisions.
A distributed lock does not guarantee mutual exclusion
Most distributed locks (Redis SETNX, Redlock) guarantee mutual exclusion only if the lock holder completes its work before the TTL expires. If the holder pauses (GC, slow network, disk I/O stall) past the TTL, the lock expires, a second process acquires it, and you have two holders. Fencing tokens are the only reliable fix. Without fencing, a distributed lock is a "mostly exclusive" lock.
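In code, the standard single-Redis pattern is an atomic `SET key token NX PX ttl` to acquire, and a Lua compare-and-delete to release. A minimal sketch, assuming a redis-py-style client (`acquire_lock` and `release_lock` are illustrative names, not a library API):

```python
import uuid

# Release only if the stored value is still OUR token. This must be a
# single Lua script so the check and the delete are atomic on the server.
RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
"""

def acquire_lock(client, key, ttl_ms):
    """Try to acquire; returns the holder token on success, None on failure."""
    token = str(uuid.uuid4())  # unique holder ID so we never delete someone else's lock
    # SET key token NX PX ttl -- atomic create-if-absent with expiry
    if client.set(key, token, nx=True, px=ttl_ms):
        return token
    return None

def release_lock(client, key, token):
    """Atomic compare-and-delete via Lua; returns True if we released our own lock."""
    return client.eval(RELEASE_SCRIPT, 1, key, token) == 1
```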
The Lua script is critical. A non-atomic GET followed by DEL has a TOCTOU (time-of-check to time-of-use) race: between GET and DEL, the lock might expire and be reacquired by another process. Your DEL then removes their lock.
Here's the correctness gap that most Redis lock tutorials skip. A process can hold the lock, get paused (GC stop-the-world, network partition, slow disk I/O), and resume after the TTL expires. By then, another process has the lock. Both processes now believe they have exclusive access.
Fencing tokens fix this. The lock server issues a monotonically increasing integer with each lock acquisition. The downstream resource (database, API, storage layer) checks the token on every write and rejects writes with a token lower than the last seen value.
The catch: fencing tokens require the downstream resource to understand and enforce them. Most databases don't natively support this. You can implement it with a version column:
```sql
-- Resource-side fencing check
UPDATE batches
SET status = 'processing', fence_token = 35
WHERE id = 4472 AND fence_token < 35;
-- affected_rows = 0 means stale token
```
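The same enforcement can live in application code. A toy in-memory resource that rejects stale tokens, mirroring the strict `fence_token < 35` check above (all names here are illustrative, not a real API):

```python
class StaleTokenError(Exception):
    """Raised when a write arrives with an outdated fencing token."""

class FencedStore:
    """Toy resource that checks the fencing token on every write."""
    def __init__(self):
        self.highest_token = 0
        self.data = {}

    def write(self, token, key, value):
        # Reject any write whose token is not strictly newer than the
        # newest we've seen: a paused holder resuming after TTL expiry
        # carries an old token and gets refused here.
        if token <= self.highest_token:
            raise StaleTokenError(f"token {token} <= {self.highest_token}")
        self.highest_token = token
        self.data[key] = value
```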
Redis is single-node by default. If that node crashes, all locks are lost. Redlock (proposed by Salvatore Sanfilippo) attempts to make Redis locks fault-tolerant by acquiring across 5 independent Redis instances:
Get current time in milliseconds.
Try to acquire the lock on all 5 Redis instances sequentially, with a short per-instance timeout.
The lock is acquired if and only if: (a) you got the lock on at least 3 out of 5 instances, and (b) the total elapsed time is less than the lock TTL.
If acquired, the effective TTL is the original TTL minus the elapsed acquisition time.
If not acquired, release the lock on all instances where you did acquire it.
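The steps above can be sketched in a few lines, assuming redis-py-style clients (`redlock_acquire` is an illustrative name; production use would go through a maintained Redlock client library):

```python
import time
import uuid

def try_set(client, key, token, ttl_ms):
    """SET key token NX PX ttl against one instance; treat instance failure as a miss."""
    try:
        return bool(client.set(key, token, nx=True, px=ttl_ms))
    except Exception:
        return False

def redlock_acquire(clients, key, ttl_ms):
    """Returns (token, remaining_validity_ms) on success, None on failure."""
    token = str(uuid.uuid4())
    start = time.monotonic()                                   # step 1: note start time
    won = [c for c in clients if try_set(c, key, token, ttl_ms)]  # step 2: try each instance
    elapsed_ms = (time.monotonic() - start) * 1000.0
    validity_ms = ttl_ms - elapsed_ms                          # step 4: effective TTL
    quorum = len(clients) // 2 + 1                             # 3 of 5
    if len(won) >= quorum and validity_ms > 0:                 # step 3: quorum in time
        return token, validity_ms
    for c in won:                                              # step 5: roll back partial wins
        try:
            c.delete(key)  # production code would use the token-checked Lua release here
        except Exception:
            pass
    return None
```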
The Kleppmann-Antirez Debate
Martin Kleppmann's analysis ("How to do distributed locking," 2016) argued that Redlock is fundamentally unsafe because it relies on clock synchronization across Redis instances. If one instance's clock jumps forward (NTP correction, VM clock skew), locked keys expire early, allowing a second client to acquire a "majority." Kleppmann's recommendation: if you need correctness, use a consensus-backed lock (etcd, ZooKeeper). If you only need best-effort coordination, a single Redis instance with fencing is simpler and equally effective. The industry has largely settled on: Redlock for efficiency-type locks, etcd/ZooKeeper for correctness-type locks.
| Mechanism | How it works | Mutual exclusion? | Typical latency | Best for |
|---|---|---|---|---|
| ZooKeeper lock | Ephemeral znodes via consensus | Yes (consensus-backed) | 10-50ms | Same as etcd; prefer if ZooKeeper already in stack |
| SELECT FOR UPDATE | Row-level pessimistic lock | Yes (within transaction) | 5-20ms | Operations already in a DB transaction |
| Optimistic lock | Version column + CAS update | N/A (retry, not a lock) | 5-20ms | Low-contention reads with occasional writes |
| Advisory lock | pg_advisory_lock(key) in PostgreSQL | Yes (within DB availability) | 5-20ms | App-level coordination without schema changes |
My recommendation: start with Redis SETNX for 90% of use cases. Upgrade to etcd when someone proves you need correctness under partitions. Use database locks when the operation is already transactional. Redlock almost never, because single Redis + fencing is simpler and consensus-backed stores are safer.
For strict correctness requirements (financial transactions, leader election), consensus-backed locks provide stronger guarantees:
```python
# etcd lease-based lock
import etcd3

client = etcd3.client()

# Acquire lock (this is a Raft commit, not just a key set).
# The lock is backed by a 30-second lease; if the holder crashes or
# stops refreshing the lease, the lock is released automatically.
lock = client.lock('batch:4472', ttl=30)
lock.acquire()
try:
    process_batch()
finally:
    lock.release()
```
etcd uses Raft-based consensus. A lock acquisition is a Raft commit, so it's linearizable, not just "probably OK." The lease mechanism means the lock is automatically released if the holder crashes or the lease keeper fails to refresh.
ZooKeeper achieves similar semantics with ephemeral znodes: a client creates an ephemeral node at a known path, and ZooKeeper automatically deletes it when the client's session expires (disconnect or heartbeat timeout). For ordered locking, ZooKeeper uses sequential ephemeral nodes, and each waiter watches the next-lower sequence node to avoid thundering herd.
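To make the herd-avoidance concrete, here is a toy in-memory simulation of that queue; no real ZooKeeper is involved, and all names are illustrative:

```python
class ToyLockQueue:
    """Toy simulation of ZooKeeper's sequential ephemeral lock nodes.

    Each waiter creates a node with the next sequence number; the lowest
    live sequence holds the lock, and every waiter watches only the node
    immediately below its own, so a release wakes exactly one waiter.
    """
    def __init__(self):
        self.next_seq = 0
        self.nodes = []          # live sequence numbers (stand-ins for znodes)

    def enter(self):
        """Create a sequential ephemeral node; returns its sequence number."""
        self.next_seq += 1
        self.nodes.append(self.next_seq)
        return self.next_seq

    def holder(self):
        """The lowest live sequence number holds the lock."""
        return min(self.nodes) if self.nodes else None

    def watch_target(self, my_seq):
        """The single node this waiter watches: the next-lower live sequence."""
        lower = [n for n in self.nodes if n < my_seq]
        return max(lower) if lower else None

    def leave(self, seq):
        """Session expiry or explicit release deletes the ephemeral node."""
        self.nodes.remove(seq)
```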
For systems that already have a database, using it as the lock store avoids an extra dependency:
-- Pessimistic: SELECT FOR UPDATEBEGIN;SELECT id FROM orders WHERE id = '123' FOR UPDATE;-- Row-level exclusive lock held until COMMIT/ROLLBACKUPDATE orders SET status = 'processing' WHERE id = '123';COMMIT;-- Optimistic: version-based CASSELECT id, status, version FROM orders WHERE id = '123';-- version = 7UPDATE ordersSET status = 'processing', version = 8WHERE id = '123' AND version = 7;-- affected_rows = 0 means conflict → retry
SELECT FOR UPDATE works well for short-lived operations within a database transaction. It's not suitable for long-held locks because the transaction holds a DB connection for the duration.
Strengths:
TTL prevents deadlock from crashed holders
Simple mental model (one holder at a time)
Works with existing infrastructure (Redis, DB)
Consensus-backed locks survive partitions

Weaknesses:
Adds a network dependency on every critical-section entry
TTL too short = false release; TTL too long = blocked waiters
Fencing tokens required for true correctness under pauses
Lock contention becomes a bottleneck under high concurrency
Consensus locks add 10-50ms latency per acquisition
The fundamental tension is correctness vs. performance. Fully correct distributed locks (consensus-backed with fencing tokens) add latency and operational complexity. Fast locks (Redis SETNX) are not fully correct under partitions and process pauses. Pick your poison based on the cost of double-execution in your specific domain.
Use a distributed lock when:
Multiple instances of a service can pick up the same job (batch processing, cron, queue consumers)
Exactly-once semantics matter and the downstream isn't naturally idempotent
You need leader election without a dedicated consensus cluster
A shared resource (file, account, inventory item) must be modified exclusively
Avoid a distributed lock when:
You can make the operation idempotent instead (idempotency keys, upserts with dedup)
The critical section is a database write (use SELECT FOR UPDATE or optimistic locking instead of an external lock)
Lock contention would serialize all requests (if every request needs the lock, the lock becomes a bottleneck)
You're locking for reads (use snapshot isolation or MVCC instead)
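The first alternative, idempotency keys, can be as simple as a set-if-absent in a shared store. A sketch (`run_idempotent` and its parameters are illustrative; in a real system the insert must itself be atomic, e.g. a unique constraint or SETNX, to avoid the same check-then-act race a lock would prevent):

```python
def run_idempotent(results, idem_key, operation):
    """Execute `operation` at most once per idempotency key.

    `results` stands in for a shared store (e.g. a DB table keyed by
    idempotency key); duplicate requests return the cached result
    instead of repeating the side effect.
    """
    if idem_key in results:
        return results[idem_key]      # duplicate request: no new side effect
    outcome = operation()
    results[idem_key] = outcome
    return outcome
```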
The best distributed lock is the one you didn't need. If you can restructure the system so that each instance owns a non-overlapping partition of work (e.g., Kafka consumer groups with partition assignment), you eliminate the need for locking entirely. I'll always push for partition-based ownership before reaching for a distributed lock.
Stripe (Payment Idempotency): Stripe processes millions of payment API calls daily. Two duplicate requests for the same payment must result in exactly one charge. Stripe uses idempotency keys backed by distributed locks: the first request acquires a lock keyed on the idempotency key, processes the payment, and stores the result. Duplicate requests that arrive while the lock is held either wait or return the cached result. The lock TTL is tuned to slightly exceed the maximum expected payment processing time (roughly 30 seconds).
Uber (Ride Assignment): When a rider requests a ride, multiple dispatch servers might identify the same driver as the best match. Uber uses distributed locking to ensure that a driver is assigned to exactly one ride. The lock is keyed on the driver ID, acquired before sending the dispatch offer, and released after the driver accepts or the offer times out. Without this lock, two riders could see "Driver Ahmed is 2 minutes away" simultaneously, and Ahmed would get two conflicting ride assignments.
Distributed Cron (ShedLock): In microservice architectures, scheduled jobs often run on multiple instances. Libraries like ShedLock use database locks or Redis locks to ensure that a scheduled task runs on exactly one instance per execution window. The lock is acquired at the start of the cron window, with a TTL matching the expected job duration plus buffer. If the holder crashes, the TTL ensures the next cron window isn't blocked.
Distributed locking appears in interviews whenever you're designing a system with multiple instances that share mutable state:
"How do you prevent duplicate payment processing?" → Distributed lock + idempotency key
"How does the scheduler ensure a job runs only once?" → Distributed lock on the job ID
"How do you handle concurrent inventory decrements?" → Optimistic locking or distributed lock depending on contention level
"Your batch processor runs on 10 pods. How do you prevent overlapping work?" → Partition assignment or distributed lock
Mention distributed locking proactively when you spot shared mutable state accessed by multiple instances. It signals that you understand the single-process-to-multi-process gap.
Explain the basic mechanism: SET NX PX with a unique holder ID, atomic Lua release
Know the fencing token problem and when it matters (process pauses, GC)
Articulate the Redlock vs. single-Redis trade-off (and mention Kleppmann's critique)
Distinguish efficiency locks (Redis) from correctness locks (etcd/ZooKeeper)
Suggest alternatives: "Could we partition the work instead of locking?"
Interview move: question the lock
The strongest move in an interview is to say: "Before adding a distributed lock, can we make the operation idempotent instead?" This shows you understand that locks are a coordination mechanism, not a first resort. If the downstream supports upserts or idempotency keys, you don't need a lock at all. Only lock when there's no other way to prevent duplicate side effects.
"What happens if the lock holder crashes without releasing?"
"The TTL expires and the lock is automatically released. For Redis, this is the PX parameter. For etcd, it's the lease TTL. The next process that tries to acquire will succeed."
"How do you handle lock contention?"
"Exponential backoff with jitter on retries. If contention is chronic, it signals the system needs partition-based ownership instead of a central lock."
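That retry policy can be sketched as capped exponential backoff with full jitter (`acquire_with_backoff` is an illustrative helper, not a library API):

```python
import random
import time

def acquire_with_backoff(try_acquire, max_attempts=6, base_ms=50, cap_ms=2000):
    """Retry a lock acquire with capped exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        result = try_acquire()
        if result is not None:
            return result
        # Full jitter: sleep a random amount up to the capped exponential
        # bound, so contending clients don't retry in lockstep.
        bound_ms = min(cap_ms, base_ms * (2 ** attempt))
        time.sleep(random.uniform(0, bound_ms) / 1000.0)
    return None
```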
"Why not just use Redlock?"
"Redlock relies on synchronized clocks across 5 Redis instances. Clock skew (NTP corrections, VM migration) can break the quorum guarantee. For correctness-critical locks, I'd use etcd. For best-effort, a single Redis instance with fencing is simpler."
"What's a fencing token?"
"A monotonically increasing integer issued with each lock acquisition. The protected resource rejects writes with a token lower than the last seen. It prevents stale lock holders (whose TTL expired during a GC pause) from corrupting data."
"When would you use a database lock instead?"
"When the protected operation is already in a database transaction. SELECT FOR UPDATE gives you row-level exclusivity within the same commit scope, with no extra infrastructure. But it ties up a DB connection for the lock duration."
Consensus Algorithms: etcd and ZooKeeper locks are built on Raft/ZAB consensus. Understanding consensus explains why these locks are correct under partitions.
CAP Theorem: Redis locks favor availability (AP): they stay acquirable during a partition but can violate mutual exclusion. Consensus locks favor consistency (CP): they preserve mutual exclusion under partitions but add latency and become unavailable without a quorum.
Missing Idempotency (Anti-Pattern): Often the alternative to a distributed lock. If you can make the operation idempotent, you don't need to lock.
Replication: Redis replication is asynchronous, so a lock acquired on the primary may not have reached the replica before a failover, and the new primary can grant the same lock to a second client.