Premature microservices anti-pattern
Understand why splitting a small application into microservices adds distributed systems complexity before you have the scale or team size to justify it, and when to actually make the switch.
TL;DR
- Microservices are the correct answer at scale and with large teams. At a three-person startup or in a new product domain, they are almost always the wrong answer.
- A monolith is not a failure. It is the correct architecture for services with unproven domain boundaries, small teams, and low operational maturity.
- The distributed systems tax (network latency, partial failures, distributed transactions, versioning, service discovery, observability) is real and must be earned by the scale that justifies it.
- The correct path: start as a modular monolith with clean interfaces, measure where you actually hit bottlenecks, then extract services surgically using the strangler fig.
The Problem
A startup of 4 engineers decides to build a SaaS product. They've read about Netflix and Uber. They architect 12 services on day one: Auth, User, Billing, Notifications, Analytics, Products, Orders, Shipping, Reviews, Search, Admin, and a BFF Gateway.
Six months in, adding a new feature requires coordinating changes across 4 services. They have a 2-hour deployment ceremony requiring all services to be updated in the right order. Their local development environment requires Kubernetes, 12 containers, and a service mesh to run. Debugging a bug requires reading distributed traces across 6 services.
Their competitor, a 2-person team with a Django monolith, shipped the same feature in an afternoon.
The startup's problem is not scale. They have 200 users. Their problem is that they've prematurely imported the complexity model of an organization with 2,000 engineers into a team that's too small to own that complexity.
I made this exact mistake in 2019. We had 3 engineers and 8 services. Our deploy pipeline took 45 minutes. The Django monolith we replaced used to deploy in 90 seconds.
A single feature that would be a one-line change in a monolith becomes a multi-service coordination problem.
Why It Happens
Premature microservices come from good intentions, misapplied.
"Netflix does it." Netflix has 2,000+ engineers. Their architecture solves coordination problems at that scale. At 4 engineers, you don't have coordination problems. You have a small team that can walk over to each other's desks.
"We need to be ready to scale." This assumes you know which parts will need independent scaling before you've found product-market fit. You almost certainly don't. Premature scaling decisions are premature optimization at the architecture level.
"Microservices enforce good boundaries." So do modules, packages, and interfaces. You don't need a network boundary to enforce a code boundary. A function call with a clean interface is just as decoupled as an HTTP call, without the network tax.
"We'll be stuck with the monolith forever." The strangler fig pattern exists precisely for this scenario. A well-structured monolith with clean module boundaries is easier to extract from than a poorly-structured microservices system is to consolidate.
The complexity cliff
The problem isn't that microservices are bad. The problem is that the complexity they introduce has a minimum cost regardless of system size. Whether you have 100 users or 100 million, you still need:
- Service discovery and health checks
- Distributed tracing and log aggregation
- Circuit breakers and retry logic
- API versioning and backward compatibility
- Integration testing across service boundaries
- A deployment pipeline per service
For a 200-person engineering org, this infrastructure already exists and is maintained by a platform team. For a 4-person startup, building and maintaining this infrastructure IS the product work. You spend 60% of your time on plumbing and 40% on features.
I've audited three startups that went microservices-first. In every case, the team estimated the infrastructure overhead at 10-15% of their time. The actual number, measured over 6 months, was 40-55%.
How to Detect It
| Symptom | What It Means | How to Check |
|---|---|---|
| More services than engineers | Over-decomposition | Count services vs team size. If ratio > 2:1, question every service |
| Local dev requires Kubernetes or Docker Compose with 5+ containers | Infrastructure overhead exceeds product complexity | Time how long docker-compose up takes. Over 5 minutes is a smell |
| Adding a simple feature requires PRs in 3+ repos | Cross-service coupling from premature splits | Track the average number of repos touched per feature over a sprint |
| Deployment requires a specific order or a coordination meeting | Deployment coupling | Ask: can each service deploy independently? If no, you split too early |
| Team spends more time on infra than product | Operational overhead exceeds product value | Track the ratio of infra tickets to product tickets over a month |
| Inter-service calls dominate latency | Network tax exceeds computation | Check distributed traces. If 80% of request time is hop-to-hop latency, you have too many hops |
| You haven't hit product-market fit yet | Optimizing architecture before validating the product | Honest conversation: do you have paying users who need this scale? |
If you recognize 3 or more, you likely have premature microservices.
The "services per engineer" ratio
A practical heuristic: count your services and divide by your team size.
| Ratio | Assessment |
|---|---|
| < 1.0 (fewer services than engineers) | Probably fine. Each engineer can own their service |
| 1.0 - 2.0 | Watch carefully. Infrastructure overhead is noticeable |
| 2.0 - 3.0 | Likely premature. Engineers spend more time on operations than features |
| > 3.0 | Almost certainly premature. Consider consolidation |
This isn't a hard rule, but it's a useful conversation starter. A team of 4 engineers managing 12 services (ratio 3.0) will spend most of their time keeping the lights on rather than building product.
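The ratio table above can be encoded as a quick sanity-check function. This is a sketch for conversation-starting purposes, with the thresholds copied directly from the table:

```typescript
// Encode the services-per-engineer heuristic from the table above.
// Thresholds match the table; treat the output as a prompt, not a verdict.
function servicesPerEngineer(services: number, engineers: number): string {
  const ratio = services / engineers;
  if (ratio < 1.0) return "probably fine";
  if (ratio <= 2.0) return "watch carefully";
  if (ratio <= 3.0) return "likely premature";
  return "almost certainly premature";
}

servicesPerEngineer(12, 4); // the startup from the example: ratio 3.0
```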
The "time to first feature" test
Another practical test: time how long it takes a new engineer to ship their first feature (end-to-end, including code review and deploy). In a well-structured monolith, this should be 1-3 days. In a microservices system appropriate for the team size, 3-5 days. If it takes more than 2 weeks, the architecture complexity is outpacing the team's capacity.
The distributed systems tax
Every function call you replace with a network call adds this overhead:
| Concern | Monolith (in-process) | Microservices (over network) |
|---|---|---|
| Latency | Nanoseconds | 1-50ms per hop |
| Failure modes | Process crash (all-or-nothing) | Partial failure, timeout, 503 |
| Data consistency | ACID transactions | Sagas, eventual consistency |
| Debugging | Single stack trace | Distributed traces, correlation IDs |
| Deployment | Deploy one thing | Version compatibility, rolling upgrades |
| Local dev | Run one process | Docker Compose, service stubs |
| Observability | Single log stream | Log aggregation, tracing, metrics per service |
None of these are insurmountable. All of them are real and compound. For a two-person team, this is a massive tax on velocity.
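The left and right columns of the table show up directly in code. In-process, a stock check is a plain function call; over the network, the same call needs retries and explicit partial-failure handling. A minimal sketch, with the client function injected so the failure behavior is visible (the URL and response shape are illustrative; a real client would also add a per-hop timeout and a circuit breaker):

```typescript
// Minimal shape of an HTTP client response, enough for this sketch.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<any>;
}>;

// In-process, this entire function would be: productModule.checkStock(productId).
// Over the network, the same call grows retry and failure-handling machinery.
async function checkStockRemote(
  productId: string,
  fetchFn: FetchLike,
  retries = 3,
): Promise<boolean> {
  let lastErr: unknown;
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      // Hypothetical internal service URL, for illustration only.
      const res = await fetchFn(`https://products.internal/stock/${productId}`);
      if (res.status === 503) {
        lastErr = new Error("service unavailable"); // transient: retry
        continue;
      }
      if (!res.ok) throw new Error(`stock check failed: ${res.status}`);
      return (await res.json()).available;
    } catch (err) {
      lastErr = err;
      if (attempt === retries) throw err; // retries exhausted: partial failure surfaces
    }
  }
  throw lastErr;
}
```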
The real cost: time-to-feature
Here's what happens to feature delivery speed when a 4-person team manages 12 services vs a monolith:
| Activity | Monolith | 12 Microservices |
|---|---|---|
| Add a field to the API | 1 file, 1 PR, 10 minutes | 2-4 services, 2-4 PRs, 2 hours |
| Set up local dev environment | npm start, 30 seconds | docker-compose up, 5-15 minutes |
| Debug a request failure | Single stack trace, 5 minutes | Distributed trace across 3 services, 30 minutes |
| Onboard a new engineer | Half day to productive | 1-2 weeks to understand service topology |
| Deploy a feature | 1 pipeline, 3 minutes | 4 pipelines in sequence, 45 minutes |
Multiply these differences across every feature, every week, for months. The velocity gap is enormous.
The Fix
Fix 1: Start with a modular monolith
A modular monolith has clear internal boundaries without deploying each module separately. Modules communicate through well-defined interfaces, not direct database access. When you need to extract a service, the interface already exists. You're just putting a network call where a function call was.
```typescript
// ❌ Big ball of mud: everything tangled together
class OrderController {
  async createOrder(req: Request) {
    const user = await db.query("SELECT * FROM users WHERE id = $1", [req.userId]);
    const stock = await db.query("SELECT quantity FROM products WHERE id = $1", [req.productId]);
    await db.query("INSERT INTO orders ...");
    await db.query("UPDATE products SET quantity = quantity - 1 ...");
    await sendEmail(user.email, "Order confirmed");
  }
}

// ✅ Modular monolith: clean interfaces, same process
class OrderModule {
  constructor(
    private orderRepo: OrderRepository,
    private userModule: UserModule,
    private productModule: ProductModule,
    private billingModule: BillingModule,
    private notificationModule: NotificationModule,
  ) {}

  async createOrder(userId: string, productId: string) {
    const user = await this.userModule.getUser(userId);
    const available = await this.productModule.checkStock(productId);
    if (!available) throw new InsufficientStockError();
    const order = await this.orderRepo.create(userId, productId);
    await this.billingModule.charge(userId, order.total);
    await this.notificationModule.sendOrderConfirmation(user.email, order.id);
    return order;
  }
}
```
Each module owns its tables, exposes a typed interface, and can be extracted into a service later with minimal changes.
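Concretely, "extracted into a service later with minimal changes" works because consumers depend on an interface, not on where the implementation runs. A sketch of that seam, with names and the stock-data shape invented for illustration:

```typescript
// Consumers depend on this interface, never on a concrete implementation.
interface ProductModule {
  checkStock(productId: string): Promise<boolean>;
}

// Today: in-process, backed by the module's own data.
class LocalProductModule implements ProductModule {
  private stock = new Map<string, number>([["p1", 3]]); // illustrative stand-in for the module's tables
  async checkStock(productId: string): Promise<boolean> {
    return (this.stock.get(productId) ?? 0) > 0;
  }
}

// Later, if evidence justifies extraction: the same interface over HTTP.
// Consumers don't change; only the wiring does.
class RemoteProductModule implements ProductModule {
  constructor(private baseUrl: string) {}
  async checkStock(productId: string): Promise<boolean> {
    const res = await fetch(`${this.baseUrl}/stock/${productId}`);
    return (await res.json()).available;
  }
}
```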
Fix 2: Extract only when you have evidence
Don't extract a service because it "feels" like a separate concern. Extract when you have concrete evidence.
When to extract:
- Scale mismatch: A module needs to scale 10x more than the rest (e.g., video transcoding vs user auth)
- Team growth: Multiple teams make conflicting changes to the same codebase
- Technology fit: A component needs a different runtime (e.g., ML inference in Python, business logic in Go)
- Compliance: Specific data needs regulatory isolation (e.g., PCI for payment processing)
When NOT to extract:
- "It's a logically separate domain" (that's what modules are for)
- "We might need to scale it someday" (measure first)
- "Netflix does it" (you are not Netflix)
The extraction decision checklist
Before extracting any module, run through this checklist. If you can't answer "yes" to at least two of these, keep it in the monolith.
| Question | Why it matters |
|---|---|
| Does this module need to scale independently from the rest? | If not, extraction adds network overhead for no benefit |
| Do different teams own this module and the consuming modules? | If one team owns everything, a monolith is simpler |
| Does this module have a fundamentally different deployment cadence? | If everything deploys together anyway, services add ceremony |
| Does this module need a different technology stack? | If it's all TypeScript, there's no runtime benefit to separation |
| Can the module's interface be expressed as a stable, versioned API? | If the interface changes weekly, you'll spend all your time on contract management |
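The "at least two yes answers" rule can be encoded directly. A sketch, with the field names invented to mirror the checklist rows:

```typescript
// One boolean per checklist row above. Field names are hypothetical.
interface ExtractionEvidence {
  independentScaling: boolean;
  separateTeamOwnership: boolean;
  differentDeployCadence: boolean;
  differentTechStack: boolean;
  stableVersionedApi: boolean;
}

// The rule from the text: fewer than two "yes" answers means stay in the monolith.
function shouldExtract(evidence: ExtractionEvidence): boolean {
  const yesCount = Object.values(evidence).filter(Boolean).length;
  return yesCount >= 2;
}
```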
Fix 3: Use the strangler fig for extraction
When you do have evidence, extract gradually. The strangler fig pattern lets you introduce a service alongside the monolith, migrate traffic incrementally, and roll back if something goes wrong.
```typescript
// Step 1: route through a facade
class VideoFacade {
  async transcode(videoId: string): Promise<TranscodeResult> {
    if (featureFlags.useVideoService) {
      // New: network call to the extracted service
      return await videoServiceClient.transcode(videoId);
    }
    // Old: in-process module call
    return await this.videoModule.transcode(videoId);
  }
}
```
Trade-off: You run two implementations during the migration. Keep this phase short (4-8 weeks) to avoid maintaining parallel code paths.
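A common refinement of the boolean flag is a percentage rollout keyed on a stable ID, so each video consistently takes the same path and you can ramp traffic from 1% to 100% gradually. A sketch (not from the original facade; any stable hash works):

```typescript
// Deterministic routing: the same videoId always gets the same answer
// for a given rollout percentage, so behavior is reproducible mid-migration.
function useNewService(videoId: string, rolloutPercent: number): boolean {
  // Cheap string hash for illustration; use a proper hash in production.
  let hash = 0;
  for (const ch of videoId) hash = (hash * 31 + ch.charCodeAt(0)) >>> 0;
  return hash % 100 < rolloutPercent;
}
```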
Choosing your architecture
The short version: if fewer than two teams deploy independently and no module needs order-of-magnitude different scaling, stay with the modular monolith. Revisit the decision when either of those conditions changes; the extraction checklist above tells you when a specific module is ready.
Severity and Blast Radius
Premature microservices are a medium-severity anti-pattern, but the blast radius grows fast.
What breaks: Developer velocity. Every feature takes 3-5x longer because of cross-service coordination. Local development becomes painful. Onboarding new engineers takes weeks instead of days because of infrastructure complexity.
Cascade risk: Low for production outages (each service is small), but high for organizational velocity. The team spends most of its energy managing infrastructure instead of building product. This is especially dangerous for startups where shipping speed is existential.
Recovery difficulty: Medium. Consolidating back to a monolith is doable but emotionally hard. Teams feel like they're "going backward." The technical work is straightforward if module boundaries were clean. If they weren't, you're untangling a distributed monolith too.
The hidden cost: Engineering morale. Teams that spend their days on deploy pipelines, service mesh configuration, and distributed debugging instead of product features tend to burn out. I've seen senior engineers leave companies specifically because "we spend all our time on infra and never ship anything."
When It's Actually OK
- You have 50+ engineers and clear domain boundaries. At this scale, deployment coordination across one codebase becomes the bottleneck, and microservices are the correct answer. Amazon, Netflix, and Uber moved to microservices because their team size demanded it, not because microservices are inherently better.
- A single module has 10x different scaling needs. If your video transcoding needs 100 GPUs and your user profile service needs 2 CPUs, separate deployment makes obvious sense. This is the strongest signal for extraction.
- Regulatory or compliance isolation. PCI-DSS for payment processing, HIPAA for health data. Isolating these into separate services with separate audit trails is sometimes legally required, not just architecturally preferred.
- You're building a platform with external API consumers. If third parties consume your API, service isolation provides stability guarantees. You need independent versioning and deployment so that internal changes don't break external consumers.
- Your monolith has genuinely become a big ball of mud. If the codebase has zero module boundaries and untangling it in-place is harder than extracting services, microservices can impose boundaries through network contracts. But this is admitting a code quality failure, not a scaling need.
Microservices don't fix bad code
If your monolith is a mess, microservices just give you a distributed mess. Clean up module boundaries first. If you can't enforce boundaries within a single codebase, you won't maintain them across network boundaries either.
How This Shows Up in Interviews
When asked to design a large-scale system, proposing a microservices architecture is acceptable. But always show you understand the cost. At the design phase, say: "I'll start with a modular monolith structure and note which modules would become independent services when team size or scale demands it." This signals architectural maturity rather than defaulting to microservices as the answer.
The two useful thresholds
A common rule of thumb: microservices make sense when you have 2+ teams that independently deploy code, or when you have a single module that needs to scale 10x more than the rest of the system. Below those thresholds, a well-structured monolith is probably better.
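Those two thresholds collapse into a one-line check. A sketch, not a substitute for judgment:

```typescript
// The two thresholds from the rule of thumb above:
// 2+ independently deploying teams, or a module with 10x different scaling needs.
function microservicesJustified(
  deployingTeams: number,
  maxModuleScaleFactor: number, // largest module's scaling need relative to the rest
): boolean {
  return deployingTeams >= 2 || maxModuleScaleFactor >= 10;
}
```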
Quick Recap
- Microservices import distributed systems complexity that is only justified at sufficient scale and team size.
- A monolith is not technical debt. It is the correct architecture for early-stage systems with unclear domain boundaries.
- The modular monolith gives you clean boundaries without the network tax, and makes future extraction safe.
- Extract services when a module needs independent scaling, team size creates coordination overhead, or technology mismatch demands it.
- The strangler fig pattern is the correct extraction strategy: introduce a service gradually alongside the monolith, never as a big-bang rewrite.
- The two thresholds: 2+ teams needing independent deploys, or a single module needing 10x different scaling.
- "Netflix does it" is not evidence that your 5-person team should too.