API gateway overload anti-pattern
Learn why cramming business logic, orchestration, authentication, and transformation into your API gateway creates a bottleneck and tight coupling that defeats the gateway's purpose.
TL;DR
- An API gateway should handle cross-cutting concerns: routing, TLS termination, rate limiting, auth token validation, and request logging. That's it.
- The anti-pattern: business logic, orchestration, data transformation, and response aggregation creep into the gateway. It starts knowing specific business rules, calling specific microservices, and holding domain knowledge.
- A logic-laden gateway becomes the slowest, most fragile point in your system. Every business change requires a gateway deployment, and one slow endpoint degrades all APIs.
- The fix: move business orchestration to a Backend-for-Frontend (BFF) service or a dedicated orchestrator that sits behind the gateway, not inside it.
- The test for whether logic belongs in the gateway: does it change when business requirements change? If yes, extract it.
The Problem
It's 11:15 p.m. on launch night. Your team just released the new checkout flow, and latency spikes from 120ms to 4.2 seconds. The on-call engineer checks the Order Service: healthy, 15ms p99. The Inventory Service: fine. The User Service: no issues. Every downstream service is green.
The problem is the API gateway. It's making 6 synchronous calls per checkout request: user tier lookup, feature flag evaluation, order creation, inventory check, shipping estimate, and a discount calculation. One slow response from the feature flag service (which has a 200ms timeout) cascades into every single API request, because the gateway is single-threaded on that code path and everything else waits behind the blocked call.
I've seen this exact failure at two different companies. Both times, the team was shocked because "the gateway was just routing." It wasn't. It had quietly become the most complex service in the system.
Your API gateway started clean six months ago. It routed /api/orders/* to the Order Service and validated JWTs. Simple. Then the requirements trickled in. Today the same gateway also:
- Calls the User Service to inject user tier into the request header
- Aggregates responses from Order, Inventory, and Shipping for the checkout endpoint
- Applies a 10% discount for "premium" tier users (business logic in infrastructure)
- Transforms date formats from ISO 8601 to Unix timestamps for legacy clients
- Queries the feature flag service to decide which checkout flow version to expose
The mistake I see most often: teams treat the gateway as "just a YAML config file" and assume adding logic there is cheaper than building a new service. It is cheaper, for exactly one sprint. After that, it's debt that compounds with every new feature.
Here's what the overloaded gateway looks like in practice:
Six sequential calls. Business logic in the gateway. One slow dependency tanks every API in the system, not just checkout.
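As a rough sketch of the handler described above: the service names, canned responses, and per-call latencies below are illustrative assumptions, not measurements from the incident. The stub records latency instead of making real HTTP calls, which is enough to show that sequential calls add up.

```python
# Hypothetical per-call latencies in milliseconds (illustrative, not measured).
LATENCY_MS = {
    "user_tier": 20,
    "feature_flags": 200,  # the slow dependency
    "create_order": 15,
    "inventory": 15,
    "shipping_estimate": 30,
    "discount": 10,
}

# Canned responses standing in for real downstream services.
FAKE_RESPONSES = {
    "user_tier": {"tier": "premium"},
    "feature_flags": {"checkout_flow": "v2"},
    "create_order": {"order_id": "o-1"},
    "inventory": {"in_stock": True},
    "shipping_estimate": {"days": 3},
    "discount": {"rate": 0.10},
}


def overloaded_gateway_checkout(cart_total: float) -> dict:
    """Anti-pattern: the gateway orchestrates six calls and applies a business rule."""
    spent_ms = []

    def call(service: str) -> dict:
        # Stand-in for a blocking HTTP call: record its latency instead of sleeping.
        spent_ms.append(LATENCY_MS[service])
        return FAKE_RESPONSES[service]

    tier = call("user_tier")["tier"]
    flow = call("feature_flags")["checkout_flow"]
    order = call("create_order")
    stock = call("inventory")
    shipping = call("shipping_estimate")
    rate = call("discount")["rate"]

    # Business rule living in infrastructure: discount only for premium users.
    total = round(cart_total * (1 - rate), 2) if tier == "premium" else cart_total

    return {
        "order_id": order["order_id"],
        "flow": flow,
        "in_stock": stock["in_stock"],
        "shipping_days": shipping["days"],
        "total": total,
        # Sequential calls: request latency is the sum of all six, not the max.
        "latency_ms": sum(spent_ms),
        "calls": len(spent_ms),
    }


result = overloaded_gateway_checkout(100.0)
print(result["calls"], result["latency_ms"], result["total"])
```

With these assumed numbers, the single 200ms dependency accounts for most of the request time, and the gateway cannot return anything until the last call finishes.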
The gateway is a shared resource. When it's busy orchestrating checkout, health check endpoints, search queries, and profile lookups all queue behind it. At 500 RPS of checkout traffic, those 6 sequential calls generate 3,000 outbound requests per second, and each one holds a connection from the gateway's pool until the downstream service answers. When one dependency slows down, connections are held longer and the pool drains. That's connection pool exhaustion waiting to happen.
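A back-of-envelope version of that load, assuming all 500 RPS are checkout requests. The average downstream latencies (50ms healthy, 200ms degraded) are hypothetical inputs chosen for illustration. The number of connections in use at any instant follows Little's law: in-flight calls = outbound rate × average call latency.

```python
def outbound_load(checkout_rps: float, calls_per_request: int, avg_call_latency_s: float):
    """Return (outbound requests per second, average in-flight downstream calls)."""
    outbound_rps = checkout_rps * calls_per_request
    # Little's law: average in-flight = arrival rate x average time in system.
    in_flight = outbound_rps * avg_call_latency_s
    return outbound_rps, in_flight


# Hypothetical latencies: 50 ms when healthy, 200 ms when a dependency degrades.
healthy = outbound_load(500, 6, 0.05)
degraded = outbound_load(500, 6, 0.20)
print(healthy, degraded)
```

Under these assumptions the gateway holds roughly 150 connections when everything is healthy and roughly 600 when average latency rises to 200ms. The outbound rate stays the same; the slower dependency is what multiplies the pool pressure.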
Every deployment of the checkout business logic now requires a gateway deployment. The gateway team becomes a bottleneck for every product feature. And because the gateway handles all APIs, a failed deployment rolls back everything, not just checkout.
The root cause: the gateway stopped being infrastructure and became a business service.
Here's the architecture from 10,000 feet. Notice how the gateway sits in the middle of everything:
Every highlighted node is business logic that has no place in a gateway. The gateway knows about user tiers, discount rules, inventory, and shipping. It's coupled to every domain in your system.
Compare this to what a clean gateway looks like:
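The clean gateway keeps only its cross-cutting responsibilities: routing, TLS termination, rate limiting, token validation, and request logging. The BFF layer owns all business orchestration. When the checkout flow changes, only the BFF deploys. The gateway stays untouched.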
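A minimal sketch of what "stays untouched" can look like in code. The upstream hostnames, the `verify_token` and `allow_request` callables, and the route table are all hypothetical. The point is that the gateway only knows a path prefix to upstream mapping and a few cross-cutting checks, with no user tiers, discounts, or service-specific calls.

```python
from typing import Callable, Optional, Tuple

# Hypothetical route table: path prefix -> upstream base URL.
ROUTES = {
    "/api/checkout": "http://checkout-bff.internal",
    "/api/orders": "http://order-service.internal",
    "/api/search": "http://search-service.internal",
}


def resolve_upstream(path: str) -> Optional[str]:
    """Longest-prefix match on whole path segments; None if nothing matches."""
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    return ROUTES[max(matches, key=len)] if matches else None


def gateway_handle(
    path: str,
    token: str,
    verify_token: Callable[[str], bool],
    allow_request: Callable[[str], bool],
) -> Tuple[int, Optional[str]]:
    """Return (status, upstream). Status 200 here means "forward to upstream"."""
    if not verify_token(token):    # auth token validation
        return 401, None
    if not allow_request(token):   # rate limiting
        return 429, None
    upstream = resolve_upstream(path)
    if upstream is None:           # routing
        return 404, None
    return 200, upstream


print(gateway_handle("/api/checkout/confirm", "t", lambda t: True, lambda t: True))
```

Changing the checkout flow in this setup means redeploying whatever sits behind `checkout-bff.internal`. The route table changes only when a service is added, moved, or removed, which passes the "does it change when business requirements change?" test.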
The visual difference is striking: the overloaded gateway connects to everything, while the clean gateway connects to exactly one downstream layer. If your gateway architecture diagram looks like a spider web, that's a red flag.
Why It Happens
Think of an API gateway like a security desk in a building lobby. It checks badges, logs entries, and directs visitors to the right floor. Now imagine the security desk also starts taking coffee orders, sorting mail, resolving HR disputes, and calculating payroll. Everyone in the building depends on the security desk, and now it's doing five jobs instead of one.
Every decision that leads here sounds reasonable in isolation. That's what makes this anti-pattern so common. No single PR introduces the problem; it's death by a thousand cuts.
"It's just one small call." The first piece of logic is always tiny. "We just need to check the user's tier before routing." A 5-line if-statement. No one creates a new service for 5 lines of code. So it goes in the gateway.
"The gateway already has the auth context." Since the gateway validates JWTs, it already knows the user ID. It feels wasteful to pass that downstream and have another service look it up again. So the gateway enriches the request. Now it has domain knowledge.
"We need response aggregation for mobile." The mobile team needs a single endpoint that combines data from three services. The gateway is the natural aggregation point. Except now it knows how to call specific services, merge their responses, and handle partial failures.
"We don't have time to build a BFF." Building a separate orchestration service feels like over-engineering when you have 3 microservices. The gateway is right there. I've made this exact argument myself, and it's valid for a while. The problem is that "a while" expires faster than you expect. What starts as one aggregation endpoint becomes five, then ten.
Each decision compounds. By the sixth or seventh addition, the gateway owns business logic, deployment cadence, and domain models. It's no longer infrastructure. It's the most important business service in your architecture, but nobody treats it that way.
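To make the mobile aggregation case concrete, here is a sketch of that endpoint living in a BFF instead of the gateway. The three fetch functions are hypothetical stubs (the shipping one fails on purpose), and the partial-failure policy of returning `None` plus an `errors` list is one possible choice, not a prescribed one.

```python
import asyncio


# Hypothetical downstream clients; real ones would make HTTP calls.
async def fetch_order(order_id: str) -> dict:
    return {"order_id": order_id, "status": "created"}


async def fetch_inventory(order_id: str) -> dict:
    return {"in_stock": True}


async def fetch_shipping(order_id: str) -> dict:
    # Simulate a failing dependency to exercise the partial-failure path.
    raise TimeoutError("shipping service timed out")


async def checkout_summary(order_id: str) -> dict:
    """BFF endpoint: call the three services concurrently and merge the results."""
    names = ("order", "inventory", "shipping")
    results = await asyncio.gather(
        fetch_order(order_id),
        fetch_inventory(order_id),
        fetch_shipping(order_id),
        return_exceptions=True,  # collect failures instead of aborting the whole call
    )
    summary = {"errors": []}
    for name, result in zip(names, results):
        if isinstance(result, Exception):
            summary[name] = None          # degrade this section only
            summary["errors"].append(name)
        else:
            summary[name] = result
    return summary


print(asyncio.run(checkout_summary("o-1")))
```

Even this small version knows which services exist, how to merge them, and what to do when one fails. That is domain knowledge, which is why it belongs in a service that deploys with the product rather than in shared infrastructure.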