Outbox pattern
Learn how the Outbox pattern eliminates the dual-write problem in distributed systems, guaranteeing every database write produces its corresponding event even when brokers and services crash mid-flight.
TL;DR
- The Outbox pattern solves the dual-write problem: you cannot atomically write to a database and publish to a message broker simultaneously. One can succeed while the other fails, leaving your system in an inconsistent state.
- The fix is to write the event into an outbox table inside the same database transaction as your business data. A separate relay process then reads the outbox and publishes to the broker. Atomicity comes from the database; delivery is handled by the relay.
- This gives you at-least-once delivery, not exactly-once. Consumers must implement idempotency to handle duplicate events.
- Two relay strategies exist: polling (simple, works up to ~5K events/second) and CDC via Debezium (sub-second latency, higher operational burden, no theoretical throughput ceiling).
- This pattern is not optional complexity. It is the structural difference between an event-driven architecture that works in production and one that silently loses events under load.
The Problem
Your order service needs to do two things when a customer places an order: write the order to PostgreSQL, and publish an order.created event to Kafka so the inventory service can reserve stock.
Two lines of code: INSERT, then publish. The problem is those two operations are not atomic. PostgreSQL knows nothing about Kafka. Kafka knows nothing about PostgreSQL.
If your application crashes between the INSERT and the publish, the order exists in the database but the inventory service never sees the event. Stock is never reserved. The customer gets an order confirmation, but nothing ships. Your on-call engineer finds out at 3 a.m. when the customer calls.
This is not just a crash scenario. The same failure appears when Kafka is temporarily unavailable, when a network partition separates your app from the broker, or when the publish call throws an exception after the DB commit has already completed.
The instinctive fix is to add retry logic around the publish. Retry logic does not help if the application has already crashed before the retry runs. It also risks double-publishing if the original publish actually succeeded but the acknowledgment was lost.
The fundamental issue: you cannot make two writes to two unrelated systems atomic without a distributed transaction, and distributed transactions are something you do not want in a high-throughput service.
One-Line Definition
The Outbox pattern guarantees event delivery by writing events into a dedicated table in the same database transaction as your business data, then using a relay process to publish those events to the message broker asynchronously.
Analogy
Think of a restaurant kitchen where the chef (your service) must both prepare a dish (database write) and call out to a waiter (message broker) to deliver it. The chef can complete the dish and then faint before calling anyone. Dish plated, waiter never notified. The customer gets no food even though the kitchen has a completed meal.
The outbox is the kitchen ticket printer. Every time the chef finishes a dish, they print a ticket and clip it to the pass. A dedicated ticket runner (the relay) comes by every 30 seconds, picks up all undelivered tickets, and routes them to the right waiter. Even if the ticket runner is temporarily absent, the tickets stay clipped. The next time the runner comes, they deliver everything. The chef never worries about delivery.
Solution Walkthrough
The Outbox pattern has three moving parts: the transactional write, the relay, and the consumer. Here is how they work together.
Step 1: Atomic transactional write
Your service writes business data and the corresponding outbox row in a single database transaction. Both succeed together, or both fail together. If the transaction commits, you are guaranteed the outbox row exists. If the transaction rolls back, no outbox row exists. The two states are always consistent.
Step 2: Relay reads and publishes
A background process polls the outbox table for PENDING rows (and for PROCESSING rows older than a timeout, which indicates a crashed relay) and publishes them to Kafka. After receiving a Kafka acknowledgment, it marks the row as SENT. This relay runs independently of the service that wrote the data. If the relay crashes mid-publish, the row stays in PROCESSING and is recovered automatically when the next poll cycle finds it past the stale timeout.
Step 3: Consumer processes with idempotency
The consumer receives the event. Before executing business logic, it attempts to insert the event ID into a processed_events table. If the insert succeeds (first time), process normally. If it fails on a unique constraint (already processed), skip silently. Because the relay delivers at-least-once, duplicates are expected. Idempotency is how you make them harmless.
The FOR UPDATE SKIP LOCKED in Step 2 is not a cosmetic detail. Without it, running two relay instances means both claim the same batch of rows, and every event gets published twice. SKIP LOCKED makes each relay instance claim a disjoint subset of rows automatically.
For your interview: describe all three steps, name FOR UPDATE SKIP LOCKED as the concurrency lock, and state that consumers get at-least-once delivery and must be idempotent. That combination signals you've actually run this, not just read about it.
Implementation Sketch
Here is a production-grade relay covering batching, the three-state lifecycle, and dead-letter handling:
// outbox-relay.ts β background service, one per service instance or as a cron job
class OutboxRelay {
private readonly BATCH_SIZE = 100;
private readonly MAX_ATTEMPTS = 5;
private readonly POLL_INTERVAL_MS = 2000;
async start(): Promise<void> {
while (true) {
await this.processBatch();
await sleep(this.POLL_INTERVAL_MS);
}
}
private async processBatch(): Promise<void> {
// The SELECT and the UPDATE to PROCESSING MUST be in the same transaction.
// If they are separate db.query calls, each auto-commits on their own connection,
// releasing the FOR UPDATE lock between the two calls. A concurrent relay instance
// can then claim the same rows in that gap β exactly the race we are trying to prevent.
const client = await pool.connect();
let rows: OutboxRow[] = [];
try {
await client.query('BEGIN');
const result = await client.query<OutboxRow>(
`SELECT id, aggregate_id, event_type, payload, attempts
FROM outbox
WHERE (
(status = 'PENDING' AND attempts < $1)
OR
-- Recover PROCESSING rows left by a crashed relay instance
(status = 'PROCESSING' AND updated_at < NOW() - INTERVAL '10 minutes')
)
ORDER BY created_at
LIMIT $2
FOR UPDATE SKIP LOCKED`,
[this.MAX_ATTEMPTS, this.BATCH_SIZE]
);
rows = result.rows;
if (rows.length === 0) {
await client.query('COMMIT');
return;
}
// Mark PROCESSING while still holding the FOR UPDATE row locks.
// updated_at = NOW() is critical: the stale-recovery condition uses updated_at
// to detect crashed relays. Without this, a row that waited >10 min in PENDING
// would have an old updated_at and be immediately vulnerable to concurrent re-claiming.
await client.query(
`UPDATE outbox SET status = 'PROCESSING', updated_at = NOW() WHERE id = ANY($1)`,
[rows.map(r => r.id)]
);
await client.query('COMMIT');
// Locks released. Rows are now PROCESSING β concurrent relays skip them.
} catch (err) {
await client.query('ROLLBACK').catch(() => {});
throw err;
} finally {
client.release();
}
const sentIds: string[] = [];
const failedIds: string[] = [];
for (const row of rows) {
try {
await kafka.produce({
topic: topicForEventType(row.event_type),
key: row.aggregate_id, // ensures ordering: same aggregate β same partition
value: row.payload,
headers: { 'x-outbox-event-id': row.id }, // consumer uses for idempotency key
});
sentIds.push(row.id);
} catch {
failedIds.push(row.id);
}
}
// Batch updates β never do per-row DB round-trips in the relay hot path.
// These run OUTSIDE the above transaction so PROCESSING already prevents re-claiming.
if (sentIds.length > 0) {
await pool.query(
`UPDATE outbox SET status = 'SENT', sent_at = NOW(), updated_at = NOW() WHERE id = ANY($1)`,
[sentIds]
);
}
if (failedIds.length > 0) {
await pool.query(
`UPDATE outbox
SET status = CASE WHEN attempts + 1 >= $2 THEN 'DEAD_LETTER' ELSE 'PENDING' END,
attempts = attempts + 1,
updated_at = NOW()
WHERE id = ANY($1)`,
[failedIds, this.MAX_ATTEMPTS]
);
}
}
}
The three-status progression (PENDING β PROCESSING β SENT) is the detail most tutorial implementations omit. The intermediate PROCESSING state prevents a second relay instance from claiming a row while the first is mid-publish. Without it, a slow Kafka produce call gives the second relay a window to claim the same row and publish the event twice.
My rule: implement PROCESSING status from day one, even if you only ever run one relay instance. You will run two for HA eventually.
How the Relay Works: Polling vs. CDC
Choosing how to read the outbox table is one of the first architectural decisions teams get wrong. Not because the wrong choice breaks things immediately, but because migrating from polling to CDC under production load is painful.
Polling relay is what the implementation sketch above implements. A dedicated process runs on a tight loop, querying the outbox table for new rows every N seconds. Simple to implement, simple to debug (check the outbox table with SQL), and suitable for the vast majority of production workloads.
The tradeoff: polling adds read load to your primary database. At very high event volumes (50K+ events/second), you either poll so frequently you create meaningful DB contention, or you poll infrequently and add latency that makes events feel stale.
CDC (Change Data Capture) is the high-throughput alternative. Instead of polling the outbox table, a CDC daemon tails PostgreSQL's Write-Ahead Log (WAL). Every committed INSERT into the outbox table appears in the WAL as a change record, which Debezium captures and routes to Kafka directly. No SELECT query is issued against the outbox table.
The tradeoff: CDC requires operating Debezium (a distributed Kafka Connect cluster), managing PostgreSQL logical replication slots, and handling schema migrations carefully. There is real operational complexity here that teams consistently underestimate.
My recommendation: start with polling relay. Migrate to CDC when your event volume or latency requirements make polling untenable, and when you have a team member who has operated Debezium before.
Continue Reading with Premium
Unlock this article and every other in-depth system design guide on the platform with NotesFromSDE Premium.