Stripe: Building for idempotency

TL;DR

Network timeouts between client and server create an impossible question: "Did my request succeed?" Without idempotency, retrying a payment request risks charging the customer twice.
Stripe lets callers attach a client-generated Idempotency-Key header to mutating (POST) API calls. The server stores the key alongside the result and replays the stored response on any retry.
The design uses an atomic findOrCreate operation against a unique-indexed idempotency_keys table, with a pending intermediate state to handle concurrent retries safely.
Keys are scoped per API key, validated against a request body hash, and expired after 24 hours.
This pattern has become the industry standard for payment APIs and applies to any operation that must execute at-most-once: email sends, webhook deliveries, ledger entries.

The Trigger

Every payment API eventually faces the same failure mode. A client sends a POST /v1/charges request. The server processes it, debits the customer's card, and begins writing the response. Then the TCP connection drops. The client never receives a response.

From the client's perspective, there are two possibilities: the charge succeeded (and the response was lost in transit) or the request never reached Stripe's processing pipeline. The client has no way to distinguish between these cases. Its only way forward is to retry.

Without idempotency, that retry creates a second charge. The customer sees two $50.00 debits on their statement. Support tickets pile up. Trust erodes. I've worked with payment integrations where this exact scenario caused thousands of dollars in duplicate charges during a single network blip, and the engineering team spent weeks reconciling the mess.

At Stripe's scale (processing hundreds of billions of dollars annually across millions of API calls per minute), network timeouts are not edge cases. They are a constant. Load balancer connection resets, client-side timeout configurations, TLS handshake failures, and cloud provider network partitions all produce the same result: a request that may or may not have succeeded.

Double-charging is the worst UX failure

Users forgive slow pages. They forgive occasional errors. They do not forgive being charged twice. A single duplicate charge triggers a support ticket, a potential chargeback, and permanent distrust of the platform. For payment APIs, idempotency is not a nice-to-have. It is a correctness requirement.

This scenario is not theoretical. Before Stripe's idempotency system, it was a real and frequent source of customer complaints across the payments industry. The entire pattern exists because of this one failure mode.

The System Before

Payment APIs in the early 2010s handled retries in one of two ways, both flawed.

Approach 1: No retry safety. The API is stateless. Every request executes independently. If a client retries, a duplicate charge happens. The burden falls on clients to implement their own deduplication, which most get wrong.

Approach 2: Server-generated transaction IDs. The server returns a transaction ID after processing. The client can use this ID to check status before retrying. The problem: if the original request's response was lost, the client never received the transaction ID. You are back to square one.

The fundamental flaw in both approaches is the same: the deduplication signal does not exist before the first request. Any solution that requires the server to generate the deduplication key cannot handle the case where the server's response is lost. This is the insight that drove Stripe's design.

How other companies handled it

Stripe was not the first to face this problem, but most existing solutions had significant limitations.

Provider	Retry approach	Limitation
PayPal (early API)	Server-generated `txn_id` on response	Useless if the response is lost
Braintree	Client submits `order_id`, dedup on match	Breaks for recurring charges with same order
Early Stripe (pre-2015)	No deduplication	Duplicate charges on retry
Amazon Pay	Request-level `ReferenceId`	Close to Stripe's eventual design, but scoping was more restrictive

The pattern Stripe settled on, client-generated keys with server-side storage, was not entirely novel. But its implementation of the pending state, request hash validation, and automatic SDK integration set the standard that the rest of the industry adopted.

Why Not Just Use Database Unique Constraints?

The obvious first thought: "Why not deduplicate on the server using business fields?" For example, reject a second charge if (customer_id, amount, currency, timestamp) matches a recent charge within a window.

This breaks down immediately in practice.

A customer legitimately orders two items at the same price within seconds. A subscription service charges the same amount monthly. A marketplace splits a payment into identical sub-charges. Business-field deduplication cannot distinguish a legitimate duplicate from a retry of a failed request.

I've seen teams try fuzzy time-window deduplication ("reject charges with the same amount within 60 seconds"), and it creates a different nightmare: legitimate charges get rejected, and the window tuning becomes an endless game of whack-a-mole.
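To make the ambiguity concrete, here is a minimal sketch of time-window deduplication on business fields. The `Charge` type, the `is_probable_duplicate` helper, and the 60-second window are all hypothetical names and values chosen for illustration, not anything from Stripe's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Charge:
    customer_id: str
    amount: int        # minor units, e.g. cents
    currency: str
    timestamp: float   # seconds since epoch


WINDOW_SECONDS = 60  # hypothetical tuning knob


def is_probable_duplicate(new: Charge, recent: list[Charge]) -> bool:
    """Fuzzy dedup: same customer, amount, and currency inside the window."""
    return any(
        c.customer_id == new.customer_id
        and c.amount == new.amount
        and c.currency == new.currency
        and abs(new.timestamp - c.timestamp) <= WINDOW_SECONDS
        for c in recent
    )


first = Charge("cus_123", 5000, "usd", 1000.0)
retry_of_first = Charge("cus_123", 5000, "usd", 1005.0)   # retry after a timeout
second_purchase = Charge("cus_123", 5000, "usd", 1005.0)  # a genuinely new order

# The server sees identical fields, so both requests get the same verdict.
retry_flagged = is_probable_duplicate(retry_of_first, [first])
purchase_flagged = is_probable_duplicate(second_purchase, [first])
```

Both calls flag a duplicate, because the retry and the second purchase carry exactly the same business fields. No choice of window separates them; the request itself has to carry the caller's intent.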

Server-generated request IDs (like a UUID assigned on receipt) also fail. If the server assigns a request ID and the response carrying that ID is lost, the client cannot reference it during retry. The client needs to own the deduplication key before the first request ever leaves the client.

The bottom line: any deduplication mechanism that relies on server-side state created during request processing cannot solve the "lost response" problem. The key must originate on the client.
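On the client side, the rule amounts to generating the key once, before the first attempt, and reusing it on every retry. The sketch below assumes a hypothetical `send` callable and a hypothetical `TransportTimeout` exception standing in for a real HTTP client; the `FakeServer` exists only to simulate a lost response.

```python
import uuid
from typing import Callable


class TransportTimeout(Exception):
    """No response arrived; the request may or may not have been processed."""


def send_with_retries(
    send: Callable[[str, dict], dict], body: dict, max_attempts: int = 3
) -> dict:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # Generate the key before the first attempt and reuse it on every retry.
    idempotency_key = str(uuid.uuid4())
    last_error: Exception = TransportTimeout("no attempt made")
    for _ in range(max_attempts):
        try:
            return send(idempotency_key, body)
        except TransportTimeout as exc:
            last_error = exc  # a real client would back off before retrying
    raise last_error


class FakeServer:
    """Hypothetical server that deduplicates by key and loses its first response."""

    def __init__(self) -> None:
        self.results: dict[str, dict] = {}
        self.charges_created = 0
        self.drop_next_response = True

    def handle(self, key: str, body: dict) -> dict:
        if key not in self.results:
            self.charges_created += 1
            self.results[key] = {
                "id": f"ch_{self.charges_created}",
                "amount": body["amount"],
            }
        if self.drop_next_response:
            self.drop_next_response = False
            raise TransportTimeout("response lost in transit")
        return self.results[key]


server = FakeServer()
response = send_with_retries(server.handle, {"amount": 5000, "currency": "usd"})
```

The first attempt creates the charge but its response is dropped. The retry carries the same key, so the server returns the stored result and `server.charges_created` stays at 1. Generating a fresh key inside the loop would defeat the mechanism entirely.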

The Decision

Stripe's design centers on a single principle: the client generates a unique key before making the request, and sends it as an Idempotency-Key header. The server guarantees that any request with a previously seen key returns the original result without re-executing.

POST /v1/charges HTTP/1.1
Authorization: Bearer sk_live_xxx
Idempotency-Key: order_prod_8f14e45f-ceea-4a3b-9b97

{
  "amount": 5000,
  "currency": "usd",
  "customer": "cus_NhD8HD2bY8dP3V",
  "description": "Order #8f14e"
}
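A server could handle such a request roughly as follows. This is a sketch under stated assumptions, not Stripe's implementation: it uses an in-memory SQLite table, and the column names, the `handle` function, the `execute` callback, and the 409 and 422 status codes are illustrative choices. It shows the pieces named in the TL;DR: a unique index for atomic find-or-create, a pending state, scoping per API key, a request body hash, and 24-hour expiry.

```python
import hashlib
import json
import sqlite3
import time

TTL_SECONDS = 24 * 60 * 60  # keys expire after 24 hours


def open_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:", isolation_level=None)  # autocommit mode
    conn.execute(
        """CREATE TABLE idempotency_keys (
               api_key      TEXT NOT NULL,
               idem_key     TEXT NOT NULL,
               request_hash TEXT NOT NULL,
               state        TEXT NOT NULL,   -- 'pending' or 'completed'
               response     TEXT,
               created_at   REAL NOT NULL,
               UNIQUE (api_key, idem_key)    -- keys are scoped per API key
           )"""
    )
    return conn


def handle(conn, api_key, idem_key, body, execute, now=None):
    """Run execute(body) at most once per (api_key, idem_key); replay otherwise."""
    now = time.time() if now is None else now
    request_hash = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    # Drop keys older than the TTL so they can be used again.
    conn.execute(
        "DELETE FROM idempotency_keys WHERE created_at < ?", (now - TTL_SECONDS,)
    )
    try:
        # Atomic find-or-create: the unique index lets exactly one caller insert.
        conn.execute(
            "INSERT INTO idempotency_keys VALUES (?, ?, ?, 'pending', NULL, ?)",
            (api_key, idem_key, request_hash, now),
        )
    except sqlite3.IntegrityError:
        stored_hash, state, response = conn.execute(
            "SELECT request_hash, state, response FROM idempotency_keys "
            "WHERE api_key = ? AND idem_key = ?",
            (api_key, idem_key),
        ).fetchone()
        if stored_hash != request_hash:
            return 422, {"error": "key reused with a different request body"}
        if state == "pending":
            return 409, {"error": "original request is still in progress"}
        return 200, json.loads(response)  # replay the stored response
    result = execute(body)  # the side effect runs only for the inserting caller
    conn.execute(
        "UPDATE idempotency_keys SET state = 'completed', response = ? "
        "WHERE api_key = ? AND idem_key = ?",
        (json.dumps(result), api_key, idem_key),
    )
    return 200, result


conn = open_store()
calls = []


def create_charge(body):
    calls.append(body)
    return {"id": f"ch_{len(calls)}", "amount": body["amount"]}


body = {"amount": 5000, "currency": "usd"}
first = handle(conn, "sk_test_a", "key-1", body, create_charge, now=0.0)
replay = handle(conn, "sk_test_a", "key-1", body, create_charge, now=10.0)
```

Here `replay` equals `first` and `create_charge` runs once. The sketch leaves out what happens when `execute` raises and the row stays pending; a production system would need a recovery path for that case.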

Three design choices make this work:
