Database per service
Why microservices need isolated databases, the cross-service query and consistency challenges this creates, and the practical strategies (CQRS, API composition, sagas) for working around shared-data limitations.
TL;DR
- Each microservice owns its database exclusively. No other service reads from or writes to it directly.
- Shared databases create schema coupling (one team's migration breaks another service), deployment coupling (can't deploy independently), and scaling coupling (can't scale storage independently).
- Cross-service queries lose SQL JOINs. Replace them with API composition, CQRS read models, or bounded data duplication.
- Cross-service transactions require the saga pattern (local transactions with compensating actions). True distributed transactions (2PC) are avoided in practice.
- Polyglot persistence becomes possible: each service picks the optimal database for its workload (PostgreSQL, Redis, Elasticsearch, Cassandra).
The Problem
Your company runs an e-commerce monolith backed by a single PostgreSQL database. The Orders, Users, Products, and Inventory tables all live in one schema. Three teams (Order Team, User Team, Product Team) deploy from the same codebase.
The User Team needs to rename users.email to users.primary_email and add a users.secondary_email column. This is a routine migration. But the Order Service has a JOIN that references users.email. The Product Service has a reporting query that filters on users.email. Both break.
The User Team can't deploy without coordinating with two other teams. The Order Team can't scale their database independently because it shares storage with User and Product data. A runaway query from the Product Service's nightly report locks rows that the Order Service needs for checkout.
This is not a theoretical problem. I've seen shared databases turn three-week sprints into three-month coordination nightmares. Every schema change becomes a cross-team negotiation. Every deployment window requires all three teams to be available. The shared database is the hidden monolith that survives long after you've split the application code.
One-Line Definition
Database-per-service gives each microservice exclusive, private ownership of its persistence layer, eliminating schema, deployment, and scaling coupling between services.
Analogy
Think of apartments in a building vs a shared house. In a shared house, one roommate rearranging the kitchen affects everyone. In an apartment building, each tenant controls their own unit's layout, locks, and appliances independently. The building provides shared infrastructure (electricity, plumbing), but no tenant can modify another's space. Database-per-service is the apartment model: each service owns its data, and the "building" provides shared infrastructure (networking, observability).
Solution Walkthrough
Polyglot Persistence
Isolation enables each service to choose the optimal store for its workload:
| Service | Database | Reason |
|---|---|---|
| UserService | PostgreSQL | Relational, ACID, strong consistency |
| ProductService | Elasticsearch | Full-text search, faceted filters |
| SessionService | Redis | TTL-based, sub-millisecond reads |
| OrderService | PostgreSQL | Transactional, audit trail |
| ActivityFeedService | Cassandra | Append-only, high write throughput |
| RecommendationService | Neo4j | Graph traversal for relationships |
No more org-wide "we use MySQL for everything" decisions that don't fit every workload. Each team picks the right tool, and the technology choice stays local to the service.
Solving the Cross-Service Query Problem
The hard part of database-per-service is queries that span multiple services. In a shared database, this was trivial:
```sql
-- Trivial in a shared database:
SELECT o.id, o.total, u.name, u.email
FROM orders o JOIN users u ON o.user_id = u.id
WHERE o.status = 'pending';
```
With isolated databases, you can't do cross-service JOINs. Three alternatives exist:
Option 1: API Composition. The calling service (or an API gateway) fetches from each service separately and joins in application code:
```typescript
async function getPendingOrdersWithUsers(): Promise<OrderWithUser[]> {
  const orders = await orderService.getPendingOrders();
  const userIds = [...new Set(orders.map(o => o.userId))];
  const users = await userService.getUsersBulk(userIds);
  const userMap = new Map(users.map(u => [u.id, u]));
  return orders.map(order => ({
    ...order,
    // Non-null assertion assumes every order references an existing user;
    // production code must handle users deleted between the two fetches.
    user: userMap.get(order.userId)!,
  }));
}
```
This works for small result sets. At scale, the sequential network round trips add latency, and naive implementations that fetch users one by one instead of in bulk create N+1 call patterns.
Option 2: CQRS with a read-optimized view. Build a separate read model that denormalizes data from multiple services.
The read model is eventually consistent (updated asynchronously from events). Queries are fast because data is pre-joined. The trade-off is operational complexity and propagation delay.
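A minimal sketch of such a read-model projector, using an in-memory store for illustration. The event shapes (OrderCreated, UserNameChanged) and field names are hypothetical; a real system would consume these from an event stream such as Kafka and persist the view in a database.

```typescript
// Hypothetical cross-service events consumed by the read-model projector.
interface OrderCreated { type: 'OrderCreated'; orderId: string; userId: string; total: number }
interface UserNameChanged { type: 'UserNameChanged'; userId: string; name: string }
type DomainEvent = OrderCreated | UserNameChanged;

// Denormalized, pre-joined row: what the read model stores per order.
interface OrderView { orderId: string; userId: string; total: number; userName: string }

class OrderViewProjector {
  private views = new Map<string, OrderView>();   // keyed by orderId
  private userNames = new Map<string, string>();  // local cache of user names

  // Apply each event as it arrives; the view lags the source (eventual consistency).
  apply(event: DomainEvent): void {
    switch (event.type) {
      case 'UserNameChanged':
        this.userNames.set(event.userId, event.name);
        // Back-fill existing views for this user.
        for (const view of this.views.values()) {
          if (view.userId === event.userId) view.userName = event.name;
        }
        break;
      case 'OrderCreated':
        this.views.set(event.orderId, {
          orderId: event.orderId,
          userId: event.userId,
          total: event.total,
          userName: this.userNames.get(event.userId) ?? 'unknown',
        });
        break;
    }
  }

  // Queries are a single lookup: no cross-service call, no JOIN.
  get(orderId: string): OrderView | undefined {
    return this.views.get(orderId);
  }
}
```

Queries against the projector never touch the owning services, which is why they stay fast; the cost is that the view only reflects events it has already processed.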
Option 3: Bounded data duplication. Services store minimal copies of data they need from other services:
OrderService stores in its own DB:
- user_id (foreign reference to UserService)
- user_name (denormalized copy, may be stale)
- user_email (denormalized copy, may be stale)
When UserService updates a name, it emits a UserNameChanged event, and OrderService updates its local copy. This is eventual consistency applied to duplicated fields. Keep the duplicated fields minimal and non-critical.
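Events can arrive late or twice, so the update handler should be idempotent and order-safe. A sketch under the assumption that UserNameChanged carries a monotonically increasing version (the event shape and store are hypothetical):

```typescript
// Hypothetical event emitted by UserService; version increases with each change.
interface UserNameChanged { userId: string; name: string; version: number }

interface LocalUserCopy { name: string; version: number }

// OrderService's local store of duplicated user fields.
class DuplicatedUserStore {
  private copies = new Map<string, LocalUserCopy>();

  // Idempotent and out-of-order-safe: a stale or duplicate event never
  // overwrites newer data.
  handle(event: UserNameChanged): void {
    const current = this.copies.get(event.userId);
    if (current && current.version >= event.version) return; // late or duplicate delivery
    this.copies.set(event.userId, { name: event.name, version: event.version });
  }

  nameOf(userId: string): string | undefined {
    return this.copies.get(userId)?.name;
  }
}
```

Without the version check, at-least-once delivery from the event bus could roll a name back to an older value.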
Saga Pattern for Cross-Service Transactions
With isolated databases, you can't wrap a multi-service operation in a SQL transaction. The saga pattern replaces distributed transactions with a sequence of local transactions and compensating actions on failure.
Each service executes a local transaction. If a step fails, the orchestrator runs compensating transactions for all previously completed steps, in reverse order. This gives you eventual consistency across services without distributed locks.
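The orchestrator loop described above can be sketched in a few lines. SagaStep and the step implementations are hypothetical stand-ins for real service calls; a production orchestrator would also persist saga state and retry compensations:

```typescript
// One saga step: a local transaction paired with its compensating action.
interface SagaStep {
  name: string;
  execute: () => Promise<void>;
  compensate: () => Promise<void>;
}

// Runs steps in order; on failure, compensates completed steps in reverse order.
async function runSaga(steps: SagaStep[]): Promise<boolean> {
  const completed: SagaStep[] = [];
  for (const step of steps) {
    try {
      await step.execute();
      completed.push(step);
    } catch {
      for (const done of completed.reverse()) {
        await done.compensate(); // real systems must persist and retry these
      }
      return false; // saga aborted; system returned to a consistent state
    }
  }
  return true; // all local transactions committed
}
```

For an order saga this might be reserveInventory, chargePayment, createShipment: if the payment step throws, only reserveInventory's compensation (releasing the stock) runs, because the failed step never completed.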
Two-phase commit (2PC) is the alternative, but it's rarely used in practice. The coordinator becomes a single point of failure, and participants hold locks until the coordinator decides. In a microservices environment where services deploy independently and network partitions happen, 2PC creates more problems than it solves.
Migration: Shared DB to Isolated Databases
Splitting a monolith's shared database requires the Strangler Fig pattern applied to data:
1. Identify service boundaries in the existing schema (which tables belong to which service)
2. Catalog cross-boundary dependencies (every JOIN, every foreign key across boundaries)
3. Implement replacements (API composition, CQRS, duplication) while still on the shared database
4. Dual-write to both the old tables and the new service's DB, comparing results
5. Migrate reads to the new service's DB
6. Cut over writes and remove the old path
7. Drop the old tables after a bake period with monitoring
This takes months for a moderately coupled system. Start with the least-coupled service to build confidence, then tackle the most-coupled services last.
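The dual-write and comparison step can be sketched as below. UserStore, UserRecord, and the mismatch callback are assumed interfaces for illustration, not a real library:

```typescript
interface UserRecord { id: string; email: string }

// Minimal storage abstraction over both the legacy table and the new database.
interface UserStore {
  write(user: UserRecord): Promise<void>;
  read(id: string): Promise<UserRecord | undefined>;
}

class DualWriter {
  constructor(
    private legacy: UserStore,               // old shared-database table (source of truth)
    private next: UserStore,                 // new service-owned database (shadow)
    private onMismatch: (id: string) => void // alerting hook
  ) {}

  async write(user: UserRecord): Promise<void> {
    await this.legacy.write(user);           // legacy write must succeed
    try {
      await this.next.write(user);           // best-effort shadow write
    } catch {
      this.onMismatch(user.id);              // never fail the request on the shadow path
    }
  }

  // Compare reads from both stores before cutting traffic over.
  async verify(id: string): Promise<boolean> {
    const [a, b] = await Promise.all([this.legacy.read(id), this.next.read(id)]);
    const match = JSON.stringify(a) === JSON.stringify(b);
    if (!match) this.onMismatch(id);
    return match;
  }
}
```

The asymmetry is deliberate: while the legacy database is still the source of truth, a shadow-write failure is logged and investigated, never surfaced to the user.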
Implementation Sketch
```typescript
// API Composition: gateway joins data from two services
// This is the simplest cross-service query strategy
interface Order { id: string; userId: string; total: number; status: string }
interface User { id: string; name: string; email: string }

async function getOrdersWithUsers(
  orderClient: OrderServiceClient,
  userClient: UserServiceClient,
  status: string
): Promise<(Order & { userName: string })[]> {
  // Step 1: fetch orders from Order Service
  const orders = await orderClient.getByStatus(status);
  // Step 2: batch-fetch users (avoid N+1)
  const userIds = [...new Set(orders.map(o => o.userId))];
  const users = await userClient.getBulk(userIds);
  const userMap = new Map(users.map(u => [u.id, u]));
  // Step 3: join in application code
  return orders.map(o => ({
    ...o,
    userName: userMap.get(o.userId)?.name ?? 'Unknown',
  }));
}
```
This sketch shows the API composition pattern. The gateway fetches from each service independently, then joins in memory. The getBulk call avoids N+1 queries across the network.
When It Shines
- Independent deployment: teams deploy, rollback, and migrate schemas without cross-team coordination
- Independent scaling: a write-heavy Order Service gets a bigger database while the read-heavy Product Service scales horizontally with read replicas
- Polyglot persistence: each service uses the optimal database for its access patterns
- Fault isolation: a runaway query in the Product Service can't lock rows needed by the Order Service
- Clear ownership: the team that owns the service owns the data, the schema, the indexes, and the performance characteristics
- Organizational alignment: Conway's Law works in your favor when service boundaries match team boundaries
The rule of thumb: 3+ teams deploying to the same database = you need database-per-service. One team, one service, one database = you definitely don't.
Failure Modes & Pitfalls
The Hidden Shared Database. Teams split their application code into "microservices" but keep pointing at the same PostgreSQL instance with shared schemas. Nothing actually changed: schema coupling, deployment coupling, and scaling coupling all remain. Putting each service in its own schema on the same instance is a partial improvement; true isolation means separate database instances.
The Cross-Service JOIN Addiction. After splitting databases, teams discover hundreds of queries that joined across now-separated tables. The pressure to "just add a read replica that has all the data" is intense. Resist it. Each cross-service JOIN is a coupling point. Replace them with API composition or CQRS read models, one query at a time.
The Data Duplication Drift. Services duplicate fields from other services for local queries (e.g., OrderService stores user_name). Over time, the event that propagates updates is missed, and duplicated fields drift from the source. Set up monitoring that compares duplicated fields against the source periodically. Accept that some staleness is inherent, but catch systematic drift.
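The periodic drift check described above can be sketched as a sampling job. The lookup functions are assumptions standing in for reads against the local copy and the owning service:

```typescript
// Sample duplicated rows and compare each against the source of truth.
// Returns the fraction of sampled rows that have drifted.
async function measureDrift(
  sampleIds: string[],
  localName: (id: string) => Promise<string | undefined>,  // OrderService's duplicated copy
  sourceName: (id: string) => Promise<string | undefined>  // UserService, the owner
): Promise<number> {
  let drifted = 0;
  for (const id of sampleIds) {
    const [local, source] = await Promise.all([localName(id), sourceName(id)]);
    if (local !== source) drifted++;
  }
  return sampleIds.length === 0 ? 0 : drifted / sampleIds.length;
}
```

A small steady-state drift rate is expected from propagation delay; an alert should fire when the rate jumps, which usually means an event consumer is down or dropping messages.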
The Distributed Monolith. Services with isolated databases but synchronous API calls to each other on every request are a distributed monolith. If the User Service is down, the Order Service can't complete any operation. Design for resilience: cache essential data locally, use async events for non-critical updates, and degrade gracefully.
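Graceful degradation for a cross-service read can be sketched as a cached fallback. The function shape and the 'Unknown user' placeholder are illustrative choices, not a prescribed API:

```typescript
// Try the live User Service; fall back to the locally duplicated copy;
// degrade to a placeholder rather than failing the whole request.
async function getUserName(
  userId: string,
  fetchRemote: (id: string) => Promise<string>, // live call to the User Service
  cache: Map<string, string>                    // locally duplicated names
): Promise<string> {
  try {
    const name = await fetchRemote(userId);
    cache.set(userId, name); // refresh the local copy on success
    return name;
  } catch {
    return cache.get(userId) ?? 'Unknown user'; // stale-but-available beats unavailable
  }
}
```

The Order Service can now render an order page during a User Service outage; only the freshness of the displayed name degrades.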
The Reporting Black Hole. Business analysts who previously ran ad-hoc SQL queries on the shared database can no longer do so. They need a reporting solution (data warehouse, data lake) that aggregates data from all services via CDC or event streams. Plan for this from day one, or analytics teams will build shadow databases that bypass your service boundaries.
Trade-offs
| Pros | Cons |
|---|---|
| Independent deployment and schema evolution | No cross-service SQL JOINs |
| Independent scaling per service | Data duplication and eventual consistency |
| Polyglot persistence (right DB per workload) | Distributed transactions require sagas (complex) |
| Fault isolation (one service can't lock another) | Reporting requires separate infrastructure (data warehouse) |
| Clear data ownership aligned with team ownership | Operational overhead: more databases to manage |
The fundamental tension is autonomy vs queryability. Full data isolation gives you deployment independence but removes the ability to run ad-hoc cross-domain queries. Every cross-service query must be explicitly designed and maintained.
Real-World Usage
Amazon pioneered database-per-service as part of their 2002 mandate that all teams must communicate through service interfaces. Each service (catalog, checkout, recommendations, fulfillment) owns its own datastore. Amazon uses DynamoDB, Aurora, ElastiCache, and Neptune across different services, fully embracing polyglot persistence. Werner Vogels has described this isolation as the foundation that enabled Amazon to deploy thousands of times per day.
Netflix isolates data per service across more than 1,000 microservices. The streaming metadata service uses Cassandra for high-availability reads, the billing service uses PostgreSQL for ACID transactions, and the recommendation engine uses custom ML serving stores. Netflix's data isolation extends to separate AWS accounts per service domain, providing network-level as well as database-level isolation.
Uber adopted a "domain-oriented microservice architecture" (DOMA) where each domain (rides, eats, freight) owns its data completely. Cross-domain queries go through published APIs and event streams, never shared databases. Uber processes tens of millions of trips per day across these isolated domains, using a combination of Schemaless (MySQL-based), Cassandra, and Redis per service.
Interview shortcut: name the coupling types
When proposing database-per-service, immediately name the three coupling types you're eliminating: schema coupling, deployment coupling, and scaling coupling. Then proactively address the cost: "We lose SQL JOINs across services and need API composition or CQRS for cross-service reads."
How This Shows Up in Interviews
Database-per-service comes up in every microservices system design. The moment you draw two separate services, the interviewer expects you to address data ownership and cross-service queries.
When to bring it up: "Each service owns its own database. The Order Service uses PostgreSQL; the Search Service uses Elasticsearch. No shared schemas. Cross-service reads use API composition or a CQRS read model."
Depth expected:
- At senior level: state the pattern, explain why shared DBs are bad, name API composition and CQRS as query strategies
- At staff level: discuss saga pattern for distributed transactions, polyglot persistence trade-offs, data duplication strategies, and reporting infrastructure
- At principal level: explain migration strategy from shared DB (strangler fig), organizational alignment (Conway's Law), multi-region data isolation, and compliance considerations (GDPR data residency per service)
| Interviewer asks | Strong answer |
|---|---|
| "How do services share data?" | "They don't share databases. Cross-service reads use API composition or a CQRS read model. Cross-service writes use the saga pattern." |
| "What if I need to JOIN orders with users?" | "API composition: the gateway fetches orders, batch-fetches users by ID, and joins in memory. For high-frequency queries, I'd build a CQRS read model pre-joining the data via Kafka events." |
| "What about transactions across services?" | "Saga pattern: each service runs a local transaction. On failure, compensating transactions undo previous steps. Eventually consistent, not ACID across services." |
| "How do you handle reporting?" | "CDC streams from each service's database into a data warehouse (Snowflake, BigQuery). Analysts query the warehouse, not individual service databases." |
| "Doesn't this create a lot of data duplication?" | "Yes, that's the trade-off. Services duplicate minimal fields they need. Events propagate updates. Some staleness is inherent. We monitor for drift." |
Quick Recap
- Database-per-service gives each microservice exclusive ownership of its persistence layer, eliminating schema, deployment, and scaling coupling.
- Polyglot persistence is a benefit: each service uses the optimal database (PostgreSQL, Redis, Elasticsearch, Cassandra) for its workload.
- Cross-service SQL JOINs are gone. Replace them with API composition (simple, synchronous), CQRS read models (pre-joined, eventually consistent), or bounded data duplication.
- Distributed transactions require the saga pattern: local transactions with compensating actions on failure. Avoid 2PC in production microservices.
- Reporting and analytics need a dedicated data warehouse fed by CDC or event streams. Analysts must never query service databases directly.
- Start with a shared database at small scale. Split when cross-team coordination becomes a bottleneck (typically at 3+ independent teams).
- Migration from shared to isolated databases uses the Strangler Fig pattern: dual-write, migrate reads, cut over writes, drop old tables. This takes months.
Related Patterns
- Saga pattern: the primary mechanism for maintaining consistency across services with isolated databases
- CQRS: separates write and read models, enabling cross-service read models that replace lost SQL JOINs
- Materialized view: the CQRS read model is a form of materialized view, pre-computing cross-service data
- Change data capture: CDC streams data from per-service databases into shared analytics stores and CQRS read models