Horizontal vs. vertical scaling
The difference between scaling up (bigger machines) and scaling out (more machines): when each strategy applies, the limits of vertical scaling, and why horizontal scaling requires stateless architecture.
TL;DR
| Dimension | Choose Vertical Scaling | Choose Horizontal Scaling |
|---|---|---|
| Architecture | No code changes needed, single-node simplicity | Requires stateless design, load balancer, distributed state |
| Ceiling | Hardware limit (AWS: 448 vCPUs, 24 TB RAM on u-24tb1.metal) | Theoretically unlimited, add machines indefinitely |
| Fault tolerance | Single point of failure, the big machine goes down and everything stops | Built-in redundancy, one node fails and others continue |
| Cost curve | Non-linear: 2x CPU costs 2.5-3x the price | Near-linear: 2x capacity costs ~2x (commodity hardware) |
| Best for | Databases, GPU workloads, coordination nodes, early-stage startups | Stateless web/API servers, read-heavy workloads, traffic spikes |
Default answer: scale vertically first (it's simpler), then scale horizontally when you hit the ceiling or need fault tolerance. For stateless application servers, go horizontal from day one. For databases, go vertical until it hurts, then add read replicas, then shard as a last resort.
The Framing
A startup I worked with had a PostgreSQL database on a db.r6g.xlarge (4 vCPUs, 32 GB RAM). At 2,000 queries/sec it maxed out CPU. The team debated for two weeks: shard the database or upgrade the instance?
They upgraded to db.r6g.4xlarge (16 vCPUs, 128 GB RAM). Took 20 minutes of downtime. Cost went from $0.65/hr to $2.60/hr. Problem solved for the next 18 months. No code changes, no schema redesign, no distributed transaction headaches.
The team that sharded their database at 2,000 QPS? They spent three months building a sharding layer, introduced cross-shard query bugs that took weeks to diagnose, and ended up with two shards doing 1,000 QPS each. They could have clicked "modify instance" and gone to lunch.
This is the core lesson. Vertical scaling is boring. Horizontal scaling is interesting. Engineers systematically over-invest in horizontal scaling because it's more technically challenging. But "boring and simple" is usually the right first move. Scale vertically until you can't, then scale horizontally where you must.
How Each Works
Vertical Scaling: Bigger Machine
Vertical scaling means replacing your current server with a more powerful one. More CPU cores, more RAM, faster NVMe storage, better network bandwidth. The application code stays exactly the same.
AWS EC2 vertical scaling path:
```
t3.medium     →   2 vCPU,   4 GB RAM → $0.042/hr
m6i.xlarge    →   4 vCPU,  16 GB RAM → $0.192/hr
m6i.4xlarge   →  16 vCPU,  64 GB RAM → $0.768/hr
m6i.16xlarge  →  64 vCPU, 256 GB RAM → $3.072/hr
u-24tb1.metal → 448 vCPU,  24 TB RAM → ~$218/hr (the ceiling)
```
RDS PostgreSQL vertical scaling:
```
db.r6g.large    →  2 vCPU,  16 GB → ~3,000 QPS simple queries
db.r6g.4xlarge  → 16 vCPU, 128 GB → ~20,000 QPS simple queries
db.r6g.16xlarge → 64 vCPU, 512 GB → ~60,000 QPS simple queries
```
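Those throughput estimates make right-sizing a lookup exercise. A hypothetical sketch (the QPS figures are the rough numbers above, not benchmarks, and the 70% utilization cap is an assumption, not an AWS guideline):

```python
# Rough per-instance throughput from the table above (simple queries).
RDS_CAPACITY_QPS = {
    "db.r6g.large": 3_000,
    "db.r6g.4xlarge": 20_000,
    "db.r6g.16xlarge": 60_000,
}

def smallest_instance_for(qps: float, max_utilization: float = 0.7):
    """Pick the smallest instance that serves `qps` below the utilization cap."""
    for name, capacity in RDS_CAPACITY_QPS.items():
        if qps <= capacity * max_utilization:
            return name
    return None  # past the vertical ceiling: add read replicas or shard

print(smallest_instance_for(12_000))  # db.r6g.4xlarge
print(smallest_instance_for(50_000))  # None: no single instance fits at <70%
```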
The advantages are obvious. No architectural changes. No load balancer. No distributed state management. ACID transactions work on a single machine without coordination overhead. Upgrades take minutes (managed databases handle it with a brief failover).
The limits are equally obvious. There's a ceiling: the biggest machine AWS sells has 448 vCPUs and 24 TB RAM. You can't go bigger. The cost curve is non-linear: doubling CPU roughly triples the price at the high end. And you have one machine. If it fails, everything fails.
My rule of thumb: if a vertical upgrade buys you 12+ months of headroom and costs less than $5,000/month, do it. It's almost always cheaper than the engineering time to build horizontal scaling infrastructure.
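That rule of thumb is simple enough to write down. A minimal sketch (the 12-month and $5,000 thresholds are the ones from this section, not universal constants):

```python
HOURS_PER_MONTH = 730  # common cloud-billing approximation

def should_scale_vertically(headroom_months: float, hourly_cost_usd: float) -> bool:
    """Rule of thumb: take the vertical upgrade if it buys 12+ months
    of headroom for under $5,000/month."""
    return headroom_months >= 12 and hourly_cost_usd * HOURS_PER_MONTH < 5_000

# The db.r6g.4xlarge upgrade from the Framing story: $2.60/hr, 18 months of headroom.
print(should_scale_vertically(18, 2.60))  # True
```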
Horizontal Scaling: More Machines
Horizontal scaling means running multiple instances of your service behind a load balancer. Each instance handles a portion of the traffic. Add more instances to handle more load.
```python
# The fundamental requirement: stateless design.
# Each request can be handled by ANY instance.

# BAD: state stored in process memory
class BadServer:
    def __init__(self):
        self.sessions = {}  # <-- lost if this instance dies

    def handle(self, request):
        user = self.sessions[request.session_id]  # breaks on a different instance
        return process(request, user)

# GOOD: state stored externally
class GoodServer:
    def __init__(self, redis_client, db_client):
        self.redis = redis_client
        self.db = db_client

    def handle(self, request):
        user = self.redis.get(f"session:{request.session_id}")  # any instance works
        return process(request, user)
```
The prerequisite is stateless architecture. If an instance stores user sessions in process memory, requests from the same user must always hit the same instance (sticky sessions). That defeats the purpose. Move state to Redis, a database, or a distributed cache, and every instance is interchangeable.
Horizontal scaling is theoretically unlimited: need 10x capacity? Add 10x instances. Need 100x for Black Friday? Autoscaling handles it. It also provides fault tolerance: one instance crashes and the load balancer routes around it.
The costs are architectural complexity (load balancer, health checks, service registration, distributed state), operational complexity (deploying to N machines, monitoring N machines, debugging issues that only happen on 1 of N machines), and the requirement to move all state out of the process.
Head-to-Head Comparison
| Dimension | Vertical | Horizontal | Verdict |
|---|---|---|---|
| Implementation effort | Click "resize," wait 5 minutes | Redesign for statelessness, add LB, externalize state | Vertical, much simpler |
| Max capacity | Hardware ceiling (448 vCPU, 24 TB RAM) | Theoretically unlimited | Horizontal |
| Fault tolerance | None. One machine = one failure domain | Built-in. N-1 instances survive one failure | Horizontal |
| Cost efficiency at scale | Non-linear: 2x resources costs 2.5-3x | Near-linear: 2x resources costs ~2x | Horizontal at scale |
| Cost efficiency at small scale | One machine, no LB overhead, no coordination | LB + multiple instances + external state | Vertical at small scale |
| Latency | Everything on one box: no network hops | Cross-instance communication adds latency | Vertical for single-request |
| ACID transactions | Single-node transactions, no coordination | Distributed transactions or eventual consistency | Vertical for transactional workloads |
| Scaling granularity | Large jumps (4x CPU minimum upgrade steps) | Incremental (add one instance at a time) | Horizontal |
| Downtime during scaling | Brief (managed DB failover: ~30s) | Zero (add instances behind LB) | Horizontal |
| Operational complexity | One machine to monitor and debug | N machines, distributed logs, network partitions | Vertical |
The pattern I've seen repeatedly: teams adopt horizontal scaling for the app tier (correct, since web servers are naturally stateless) but keep the database vertical far longer than people expect. Shopify ran their core commerce database on a single very large MySQL instance well past $1B GMV. Scaling the database horizontally (sharding) has enormous complexity costs that vertical scaling avoids.
When Vertical Scaling Wins
Vertical scaling is right when you want simplicity and haven't hit the ceiling.
Databases, almost always first. PostgreSQL, MySQL, MongoDB. Vertical first because distributed database complexity is enormous. Sharding requires choosing a partition key, handling cross-shard queries, managing data migration, and dealing with hotspots. A db.r6g.16xlarge (64 vCPU, 512 GB RAM) handles far more than most startups need. I've seen databases serving 50,000+ simple queries per second on a single vertical instance.
Specialized compute. GPU workloads (ML inference, video encoding) scale better with a bigger GPU than with multiple smaller GPUs. An A100 80 GB outperforms two A10G 24 GBs for most inference workloads because the model fits in one GPU's memory without cross-GPU communication.
Coordination-sensitive services. Leader election quorum nodes, ZooKeeper clusters, etcd clusters. These run on 3-5 nodes by design. Adding more nodes increases coordination overhead. Scale each node up, not the cluster out.
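The quorum arithmetic behind that design choice: an n-node cluster needs a majority of votes to make progress, so it tolerates only ⌊(n-1)/2⌋ failures, and an even-sized cluster pays extra coordination cost without tolerating any extra failures. A quick sketch:

```python
def quorum(n: int) -> int:
    """Votes needed for a majority in an n-node cluster."""
    return n // 2 + 1

def failures_tolerated(n: int) -> int:
    """Nodes that can fail while the rest still form a quorum."""
    return (n - 1) // 2

for n in (3, 4, 5):
    print(f"{n} nodes: quorum={quorum(n)}, tolerates {failures_tolerated(n)} failure(s)")
# A 4-node cluster tolerates no more failures than a 3-node one,
# which is why these clusters run with an odd node count.
```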
Early-stage startups. Your first 100K users don't need horizontal scaling. A single $200/month instance handles more traffic than most startups will see in year one. Spend engineering time on the product, not the infrastructure.
When Horizontal Scaling Wins
Horizontal scaling is right when you need fault tolerance, handle variable traffic, or have hit vertical limits.
Stateless application servers. Web servers, API servers, microservices. These are designed to be stateless and interchangeable. Horizontal scaling is the natural model. There's no reason to run one giant API server when four medium ones give you the same capacity plus fault tolerance.
Traffic with spikes. E-commerce during flash sales, streaming during live events, news sites during breaking stories. Vertical scaling can't react: you can't resize an instance in 30 seconds. Auto-scaling adds instances in under a minute. My recommendation: if your peak traffic is 5x+ your baseline, horizontal with auto-scaling is the only practical option.
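Target-tracking autoscaling is proportional control: choose a desired count so the per-instance metric returns to the target. A simplified sketch of the math (real AWS policies and Kubernetes HPA add cooldowns, instance warm-up, and tolerance bands):

```python
import math

def desired_capacity(current: int, metric: float, target: float,
                     min_size: int = 3, max_size: int = 50) -> int:
    """Scale the group so per-instance load returns to the target,
    clamped to the group's configured bounds."""
    desired = math.ceil(current * metric / target)
    return max(min_size, min(max_size, desired))

print(desired_capacity(10, 90.0, 60.0))  # 15: 10 instances at 90% CPU, 60% target
print(desired_capacity(6, 20.0, 60.0))   # 3: scale-in stops at the minimum
```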
Read-heavy workloads. Database read replicas are a form of horizontal read scaling. Adding 3 read replicas gives you 4x the read capacity without touching the primary. Works beautifully for workloads that are 90%+ reads.
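Read replicas only pay off if the application actually routes reads to them. A minimal routing sketch (the connection names are placeholders; a production router would also handle replica lag and read-your-writes consistency):

```python
import random

class ReadWriteRouter:
    """Send writes to the primary and spread reads across replicas."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self.replicas = replicas

    def connection_for(self, sql: str):
        # Naive classification: anything that isn't a SELECT mutates state.
        if sql.lstrip().lower().startswith("select"):
            return random.choice(self.replicas) if self.replicas else self.primary
        return self.primary

router = ReadWriteRouter("pg-primary", ["replica-1", "replica-2", "replica-3"])
print(router.connection_for("SELECT * FROM users"))       # one of the replicas
print(router.connection_for("UPDATE users SET plan='x'")) # pg-primary
```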
Fault tolerance requirements. SLA demands 99.99% availability? You need redundancy across instances and availability zones. No single machine, no matter how powerful, achieves four nines alone. You need at least 2-3 instances with health checks and automatic failover.
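The availability claim is straightforward probability, under the optimistic assumption that instance failures are independent — which is exactly why you also spread replicas across availability zones:

```python
def availability(single_instance: float, n: int) -> float:
    """Probability that at least one of n independent instances is up."""
    return 1 - (1 - single_instance) ** n

print(f"1 instance at 99.9%:  {availability(0.999, 1):.4%}")
print(f"3 instances at 99.9%: {availability(0.999, 3):.7%}")
# Three nines per instance becomes far better than four nines in aggregate.
```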
Cost optimization at scale. At large scale, horizontal scaling on commodity hardware is cheaper per unit of compute than vertical scaling on premium instances. Four db.r6g.4xlarge instances ($2.60/hr each = $10.40/hr total) give you 64 vCPUs and 512 GB RAM. One db.r6g.16xlarge with the same specs costs $12.80/hr. That's a 23% premium for the convenience of a single machine.
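The 23% figure above and the ~19% figure quoted elsewhere in this document are the same $2.40/hr gap measured against different baselines, which is worth being able to reproduce on the spot:

```python
four_medium = 4 * 2.60  # four db.r6g.4xlarge at $2.60/hr = $10.40/hr
one_large = 12.80       # one db.r6g.16xlarge with the same total vCPU/RAM

gap = one_large - four_medium  # $2.40/hr either way
premium = gap / four_medium    # vs. the horizontal option: ~23%
savings = gap / one_large      # vs. the vertical option: ~19%

print(f"premium for one big machine:   {premium:.1%}")
print(f"savings from going horizontal: {savings:.1%}")
```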
The Nuance
The Hybrid Reality
In practice, every production system uses both strategies at different layers. Here's the standard pattern:
The app tier scales horizontally from day one. The cache starts vertical (single Redis node) and goes horizontal (Redis Cluster) when data exceeds one node's memory. The database stays vertical for as long as possible, then adds read replicas, then shards.
The Cost Crossover
There's a specific point where horizontal becomes cheaper than vertical. Below that point, the operational overhead of horizontal scaling is a waste.
For small teams (under 10 engineers), the operational overhead of horizontal infrastructure often costs more in engineering time than the savings on compute. For large teams with platform engineering capabilities, horizontal scaling is almost always cheaper at scale.
When Vertical Hits the Wall
The moment vertical scaling fails is abrupt. You're on the largest available instance and CPU is at 90%. There's no bigger machine. Now you must go horizontal, and you must do it under pressure, which is the worst time to make architectural decisions.
My advice: plan for horizontal scaling in your architecture (stateless services, external state) even if you deploy vertically today. The migration from "one big instance" to "four small instances" should be a configuration change, not a rewrite. If your application requires sticky sessions or stores state in local files, fix that now while you have time.
Real-World Examples
Shopify: Runs one of the largest Ruby on Rails monoliths on a vertically scaled database. Their core commerce database was a single large MySQL instance for years, handling millions of merchants. They stayed vertical with aggressive query optimization and caching, only sharding when they absolutely had to (and it took years of engineering effort). The lesson: vertical scaling's simplicity is worth protecting as long as possible.
Netflix: The poster child for horizontal scaling. Their microservices run on thousands of EC2 instances with auto-scaling. During peak hours (8 PM on a Sunday), they scale up to handle 200M+ active users. Their stateless architecture means instances are disposable: any instance can handle any request for any user. They pioneered Chaos Monkey specifically because horizontal architectures need automated failure testing.
Instagram: Scaled their PostgreSQL backend by combining aggressive vertical scaling (very large instances), read replicas, and application-level caching, famously serving tens of millions of users with a tiny engineering team. When they finally sharded, they took months to plan the partition strategy and migration. The delay was worth it because every month of vertical scaling was a month of simpler operations.
How This Shows Up in Interviews
This tradeoff comes up early in almost every system design interview, usually when you draw your first architecture box. The interviewer wants to see that you know when each strategy applies.
What they're testing: Do you default to horizontal because it sounds more "scalable," or do you show judgment about when vertical is the right call? Senior candidates know that horizontal scaling has real costs.
Depth expected at senior level:
- Know specific instance sizes and their limits (r6g.16xlarge: 64 vCPU, 512 GB)
- Explain the database scaling ladder: vertical, connection pooling, read replicas, caching, sharding
- Identify which tiers scale which way and why
- Discuss auto-scaling mechanics: metrics, cool-down periods, predictive scaling
- Name the stateless-architecture prerequisite for horizontal scaling
| Interviewer asks | Strong answer |
|---|---|
| "How would you scale this?" | "The app tier scales horizontally behind an ALB, starting at 3 instances with CPU-based autoscaling. The database scales vertically first. At this traffic level, a db.r6g.4xlarge handles it easily. I'd add read replicas when reads hit 80% of capacity." |
| "Why not shard the database from the start?" | "Sharding adds cross-shard query complexity, application-level routing, and months of migration work. Vertical scaling on a single instance handles 50K+ simple QPS. I'd exhaust vertical scaling, read replicas, and caching before introducing sharding." |
| "What happens when you hit the vertical ceiling?" | "The ceiling is real: 448 vCPU, 24 TB RAM for the largest EC2 instance. For databases, the path is: read replicas for read scaling, then sharding for write scaling. For compute, horizontal scaling with a load balancer and stateless design." |
| "How does auto-scaling work?" | "Kubernetes HPA scales pods based on CPU/memory metrics. AWS Auto Scaling uses target tracking policies. Key: scale up fast (30s), scale down slow (5 min) to avoid thrashing. Set minimum replicas to 3 for availability." |
| "What's the cost tradeoff?" | "Vertical is cheaper in ops overhead for small teams. Horizontal is cheaper in raw compute at scale. The crossover depends on team size and traffic patterns. Four r6g.4xlarge instances cost 19% less than one r6g.16xlarge with the same total specs, but you need a load balancer, distributed monitoring, and ops tooling." |
Interview tip: mention the database scaling ladder
When the interviewer asks about database scaling, don't jump to sharding. Walk the ladder: "I'd start with a vertically scaled primary, add PgBouncer for connection pooling, then read replicas, then Redis caching. Sharding is a last resort." This shows you understand the complexity cost of each step.
Quick Recap
- Vertical scaling (bigger machine) requires no code changes and is the simplest way to increase capacity. It's the correct first step for databases and most services that haven't hit their ceiling.
- Horizontal scaling (more machines) requires stateless architecture, a load balancer, and externalized state. It provides fault tolerance, auto-scaling, and theoretically unlimited capacity.
- The database scaling ladder runs: vertical upgrade, connection pooling, read replicas, caching, sharding. Most applications never need sharding. Exhaust every earlier rung first.
- Vertical scaling has a hard ceiling (448 vCPU, 24 TB RAM on AWS's largest instance) and no fault tolerance. Plan your architecture for horizontal scaling even if you deploy vertically today.
- The cost crossover favors horizontal at scale (4 medium instances are ~19% cheaper than 1 equivalent large instance) but vertical at small scale (simpler ops, no LB overhead).
- In interviews, show you know which tier scales which way: app tier horizontal from day one, database vertical first, cache vertical then cluster, and use auto-scaling with fast scale-up and slow scale-down.
Related Trade-offs
- Scalability for the broader principles of designing systems that handle growing load
- Load balancing for the routing algorithms and health checks that make horizontal scaling work
- Stateful vs. stateless for why stateless design is the prerequisite for horizontal scaling
- SQL vs. NoSQL for how database choice affects your scaling options
- Read replicas vs. caching for the two read-scaling strategies that come before sharding
- Consistent hashing for how keys are distributed across nodes
- Message brokers (e.g., Kafka) for partition-based horizontal scaling (add brokers, add partitions)
- CDN / edge nodes, which are horizontal by definition (nodes near users around the world)
The Database Sharding Decision
Databases require special consideration because they're inherently stateful:
Scaling path:
1. Vertical scaling: increase primary hardware → cheapest, simplest
2. Connection pooling: PgBouncer, RDS Proxy → more connections, not more write capacity
3. Read replicas: add horizontal read capacity → adds replica lag
4. Caching: remove read load from the DB entirely → most effective read scaling
5. Sharding: horizontal write scaling → major application changes required
Each step is an order of magnitude more complex than the previous. Don't jump to sharding until you've exhausted steps 1-4.
Sharding is the last resort, not the first answer, for databases.