SQL vs NoSQL
Learn when SQL's ACID guarantees are non-negotiable and when NoSQL's horizontal scale is the only viable path, with concrete decision criteria to defend your database choice in every interview.
TL;DR
- Use SQL when you need multi-table JOINs, ACID transactions, or complex ad-hoc queries you can't fully predict today.
- Use Key-Value NoSQL (Redis, DynamoDB) for O(1) single-key access at millions of requests per second: sessions, caches, feature flags, rate limiting.
- Use Document NoSQL (MongoDB, Firestore) when your data is naturally hierarchical, consumed together, and schemas will evolve rapidly.
- Use Column-Family NoSQL (Cassandra, HBase, ScyllaDB) when you have write-heavy time-series workloads (logs, metrics, IoT, messaging history) at petabyte scale.
- Use Graph NoSQL (Neo4j, Neptune) when the relationships between entities are the product: social graphs, fraud networks, recommendation engines.
- "NoSQL scales better than SQL" is largely a myth for most workloads. Instagram uses PostgreSQL for relational data (users, accounts, follows) at 2 billion users β combined with Cassandra for feeds and Redis for caching. SQL scales further than most teams think before NoSQL becomes necessary.
The Framing
In 2012, a well-funded startup migrated their entire product off PostgreSQL onto MongoDB. The reason: "SQL doesn't scale; we'll need NoSQL when we grow." The migration took four engineers three months. Within a year they hit three production bugs they had no answer for:
- Orphaned records: no foreign key enforcement meant deleting a user left their orders referencing a user_id that no longer existed. Reporting queries returned nonsensically partial data.
- Double-purchase bugs: the checkout flow read inventory, then wrote the sale, but without ACID transactions these steps weren't atomic. Under load, two concurrent requests both saw available inventory and both wrote a sale. Inventory went negative.
- Analytics debt: SQL queries that took a data analyst two minutes to write now required a MapReduce job. A 200ms ad-hoc JOIN became a 45-minute Hadoop run.
They migrated back. The real cause of their scaling problems turned out to be missing read replicas and a nonexistent caching layer: problems that had nothing to do with SQL vs. NoSQL.
Now contrast that with Discord, 2022. Discord moved billions of historical messages to ScyllaDB (a Cassandra-compatible database). That is a genuine NoSQL win: 4 billion messages accessed by (channel_id, timestamp), written at enormous throughput, read in fixed-width pages, with no joins anywhere. Cassandra's access pattern (partition key + sort key, wide rows, LSM-Tree writes) is exactly right for it. SQL would have required manual time-series partitioning and would still struggle on the write path.
The real question is not "SQL or NoSQL?" It is: does your query shape match your data store's access pattern model?
```mermaid
flowchart TD
    Start(["Choosing a database?\nStart here"])
    Q1{"Do you need\nmulti-row ACID\ntransactions?"}
    Q2{"Is every query\na single-key\nor single-partition\nlookup?"}
    Q3{"Is the workload\nwrite-heavy\ntime-series?"}
    Q4{"Are relationships\nBETWEEN entities\nyour core value?"}
    Q5{"Is the data\nnaturally hierarchical\n& schema evolves?"}
    SQL["SQL\nPostgreSQL / MySQL\nACID · JOINs · flexible queries"]
    KV["Key-Value NoSQL\nRedis · DynamoDB\nO(1) lookup · sessions · caches"]
    CF["Column-Family NoSQL\nCassandra · ScyllaDB\nwrite-heavy · time-series · petabyte scale"]
    GR["Graph NoSQL\nNeo4j · Neptune\nmulti-hop traversal · fraud · social"]
    DOC["Document NoSQL\nMongoDB · Firestore\nflexible schema · hierarchical data"]
    NewSQL["NewSQL\nCockroachDB · Spanner\nACID + horizontal write scale"]
    Q6{"Have you exhausted\nPostgreSQL's ceiling\n(>50K writes/s)?"}
    Start --> Q1
    Q1 -->|"Yes"| Q6
    Q6 -->|"No"| SQL
    Q6 -->|"Yes"| NewSQL
    Q1 -->|"No"| Q2
    Q2 -->|"Yes"| KV
    Q2 -->|"No"| Q3
    Q3 -->|"Yes"| CF
    Q3 -->|"No"| Q4
    Q4 -->|"Yes"| GR
    Q4 -->|"No"| Q5
    Q5 -->|"Yes"| DOC
    Q5 -->|"No"| SQL
```
Interview tip: narrate this decision tree out loud
When an interviewer asks "what database would you use?", don't jump to a name. Say: "Let me think about the query shape first." Then walk through: do we need transactions? What are the access patterns? Is it write-heavy or read-heavy? Are there JOINs? That process, visible reasoning about trade-offs, is what earns senior/staff-level credit.
How Each Works
SQL: The Relational Model
A relational database stores data in tables of typed rows and columns. Relationships between entities are expressed via foreign keys: a reference from one table's column to another table's primary key. At query time, the engine JOINs tables by matching these keys, allowing arbitrarily complex cross-entity queries without duplicating data.
Internally, the storage engine uses a B-Tree index for most access patterns: a balanced tree that sorts data by key, enabling O(log n) lookup for any indexed value. The Write-Ahead Log (WAL) records every modification before it is applied, enabling crash recovery and replication. The query planner takes your SQL, builds an execution plan, and optimizes it: you write declarative intent, the engine picks the algorithm.
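Declarative intent is easy to see in practice: the same query silently switches from a full scan to a B-Tree index search once an index exists. A small sketch using Python's sqlite3 (the table and index names are invented):

```python
import sqlite3

# Demonstrates "you write declarative intent, the engine picks the algorithm":
# identical SQL, but the plan changes once an index is available.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")

def plan(conn, sql: str) -> str:
    # The human-readable plan detail is the last column of each row.
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

q = "SELECT id FROM users WHERE email = 'a@b.c'"
print(plan(conn, q))  # full table scan: O(n)

conn.execute("CREATE INDEX idx_users_email ON users(email)")
print(plan(conn, q))  # B-Tree index search: O(log n)
```

The exact plan wording varies by SQLite version, but the second plan will mention the index; no application code changed, only the physical access path.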
ACID is the bedrock:
- Atomicity: a transaction is all-or-nothing. The checkout debit and inventory decrement either both commit or both roll back.
- Consistency: the database is always in a valid state per your constraints. A foreign key violation is caught and rejected, not silently stored.
- Isolation: concurrent transactions do not see each other's in-progress state unless you explicitly configure them to.
- Durability: a committed transaction survives a crash. The WAL was fsynced before the commit returned.
The trade-off: SQL databases scale vertically first. To scale reads, you add read replicas. To scale writes, you shard, but sharding a relational database is complex and kills cross-shard JOINs and transactions.
NoSQL: Four Models, Four Trade-offs
"NoSQL" is not a single thing β it is a family of systems that each gave up a specific SQL guarantee (usually JOINs, or ACID, or both) in exchange for a specific gain (usually horizontal scale, write throughput, or schema flexibility). There are four distinct models:
Key-Value (Redis, DynamoDB, etcd): A distributed hash table. Every query names a partition key; anything else (DynamoDB's Scan, Redis's SCAN) walks the whole keyspace and is prohibitively expensive at scale. Within a partition, you can range-query a sort key (e.g., DynamoDB: sk BETWEEN ts_start AND ts_end), but you cannot filter by non-key attributes without a pre-planned index. Redis is a pure in-memory key lookup; DynamoDB adds sort-key ranges within a partition. Both are blindingly fast for what they do, and completely wrong for ad-hoc, cross-partition queries.
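The partition-key + sort-key shape can be sketched with an in-process structure (an illustration only, not a real DynamoDB or Redis client; all key names are invented):

```python
import bisect
from collections import defaultdict

# Toy model of the key-value access pattern: a hash table of partitions,
# each holding (sort_key, value) pairs kept in sorted order.
table = defaultdict(list)  # partition key -> [(sort_key, value)], sorted

def put(pk: str, sk: str, value: str) -> None:
    bisect.insort(table[pk], (sk, value))  # sorted order is what enables range reads

def query(pk: str, sk_start: str, sk_end: str) -> list:
    # Every read names exactly one partition; there is no cross-partition query.
    return [v for sk, v in table[pk] if sk_start <= sk <= sk_end]

put("user#42", "2024-01-01T09:00", "login")
put("user#42", "2024-01-02T10:00", "purchase")
put("user#99", "2024-01-01T12:00", "login")
print(query("user#42", "2024-01-01", "2024-01-02"))  # ['login']
```

Note what is missing: there is no way to ask "all logins across all users" without touching every partition, which is exactly the query shape this model refuses to serve.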
Document (MongoDB, Firestore, CouchDB): Each record is a JSON/BSON document. Documents can be nested: a user document can contain their addresses, preferences, and recent activity all at once. You query by field values within documents. Schema is optional and flexible. JOINs do not exist natively: you either embed related data inside the document (fast reads, data duplication) or use application-level lookups (slow, multiple round trips). Best when your read and write unit is the entire document.
Column-Family (Cassandra, HBase, ScyllaDB): Data is stored in rows identified by a partition key and sorted within a partition by a clustering key. A row can have thousands of columns. All reads are by partition key; Cassandra routes the request to the node owning that partition. The LSM-Tree storage engine converts random writes into sequential I/O, making it extraordinarily fast for write-heavy workloads. The model forces you to think about your access pattern at schema design time: Cassandra's schema is essentially a pre-materialized view of your hottest query.
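The LSM-Tree idea behind that write speed fits in a few lines (a deliberately tiny model; real engines add a commit log, bloom filters, and background compaction):

```python
# Toy LSM-Tree: random writes land in an in-memory memtable, which is
# flushed as one immutable sorted run ("SSTable") when full. Writes become
# sequential I/O; reads check newest data first.
class TinyLSM:
    def __init__(self, memtable_limit: int = 2):
        self.memtable = {}   # absorbs writes in memory: O(1) per write
        self.sstables = []   # immutable sorted runs on "disk"
        self.limit = memtable_limit

    def put(self, key, value):
        self.memtable[key] = value
        if len(self.memtable) >= self.limit:
            # One sequential flush of sorted data, not n random disk seeks.
            self.sstables.append(sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.sstables):  # newest flushed run wins
            for k, v in run:
                if k == key:
                    return v
        return None

db = TinyLSM()
db.put("a", 1); db.put("b", 2); db.put("a", 3)
print(db.get("a"))  # 3 (the newer write shadows the flushed one)
```

The design choice to notice: writes never touch old data in place, which is why LSM engines sustain the 100K+ writes/s column-family stores are known for, at the cost of read amplification and eventual compaction work.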
Graph (Neo4j, Amazon Neptune, TigerGraph): Data is stored as vertices (entities) and edges (relationships). Traversal queries follow edges to answer questions like "who are friends-of-friends of user A?"; queries that require recursive self-joins in SQL become bounded O(relationship count) traversals in a native graph database. Specialized query languages (Cypher, Gremlin) express graph traversal declaratively. Overkill unless your core value is the relationship graph.
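A hypothetical adjacency-list sketch shows why traversal cost tracks the edges touched rather than the total data size (the names and graph are invented for illustration):

```python
from collections import deque

# Toy social graph: the traversal below visits only reachable edges,
# regardless of how many total vertices the store holds.
follows = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"dave", "erin"},
    "dave": set(),
    "erin": set(),
}

def within_hops(graph: dict, start: str, max_hops: int) -> set:
    """Everyone reachable from `start` in at most `max_hops` edges (BFS)."""
    seen, frontier = {start}, deque([(start, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == max_hops:
            continue  # don't expand past the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    seen.discard(start)
    return seen

print(sorted(within_hops(follows, "alice", 2)))  # ['bob', 'carol', 'dave', 'erin']
```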
'NoSQL has no schema' is wrong: it has schema at query time
NoSQL databases do not eliminate schema: they push schema enforcement from write time (SQL) to query time (NoSQL). MongoDB will happily store { "price": "twenty dollars" } where you expected a number. Your application code is now the schema validator. At small scale this is convenient. At large scale you end up with 12 different document shapes stored under the same collection name, and a migration nightmare that makes SQL ALTER TABLE feel trivial.
Head-to-Head Comparison
| Dimension | SQL (PostgreSQL, MySQL) | Key-Value (DynamoDB, Redis) | Document (MongoDB) | Column-Family (Cassandra) | Graph (Neo4j) |
|---|---|---|---|---|---|
| Schema enforcement | Strict at write time | None | Optional, per-document | Pre-defined per query pattern | Node/edge types defined |
| Scaling model | Vertical, then read replicas, then painful manual sharding | Horizontal, hash-partitioned natively | Horizontal with mongos sharding | Horizontal, consistent hash ring, auto-rebalance | Vertical primarily |
| ACID transactions | Full ACID, cross-table | Single-item atomic only | Multi-document in v4.0+ (with overhead) | Single-partition atomic only | ACID within local graph |
| JOIN capability | Native, arbitrary multi-table | None (by design) | $lookup (slow), application-level | None (by design) | Native multi-hop traversal |
| Query flexibility | Very high: ad-hoc, aggregations, window functions | Redis: none beyond key lookup; DynamoDB: partition key + sort-key range queries within partition only | Medium: field queries, aggregations | Low: partition key is mandatory on every query | High for graph traversal; restricted otherwise |
| Write throughput | 10–50K writes/s per primary (varies with txn complexity, fsync latency, index count; application-level sharding extends this) | Unlimited (DynamoDB provisioned capacity) | 10–30K writes/s | 100K–1M+ writes/s (LSM-Tree sequential) | 5–20K writes/s |
| Read latency (indexed) | 1–5ms | < 1ms (Redis), 1–10ms (DynamoDB) | 1–10ms | 1–10ms (single-partition) | 1–100ms (traversal depth-dependent) |
| Consistency model | Strong (serializable) | Eventual default, strong configurable | Eventual or strong (configurable) | Tunable: ONE / QUORUM / ALL | Strong (single-machine) |
| Operational complexity | Low–medium (mature tooling) | Low (managed/serverless) | Medium | High (ring topology, compaction, tombstones) | Medium-high |
| Best fit | Transactional systems, relational data, ad-hoc reporting | Sessions, caches, rate limiting, leaderboards | Catalogs, user profiles, CMS content | Messaging history, IoT telemetry, logs, feeds | Social graphs, fraud detection, recommendations |
When SQL Wins
SQL is the correct default for most applications. Reach for it when:
- You need multi-entity transactions. Any operation that must atomically update more than one logical entity (money transfers between accounts, inventory decrements with order creation, user registration with profile creation) requires ACID. Without it, you will have a production incident.
- Your query patterns are unknown or evolving. SQL's query planner lets you ask arbitrary questions of your data. "All orders in the last 30 days where the customer is in California and the product is in category X, grouped by week, with a running total" is four lines of SQL. In Cassandra, that query requires a pre-planned table designed around those exact access dimensions.
- Your data is highly relational. E-commerce (orders, items, customers, products, addresses, discounts), financial systems (accounts, transactions, ledger entries, reconciliations), SaaS products (tenants, users, resources, permissions, billing): all are deeply relational. Denormalizing into NoSQL buys zero performance benefit and adds significant correctness risk.
- Your write volume is under 50K writes/second. A single PostgreSQL primary handles 10–50K writes/second depending on transaction complexity and schema design. If writes exceed this AND your access patterns are partition-friendly (user-scoped writes with no cross-partition JOINs), application-level sharding on PostgreSQL (via Citus or range-key routing) is the next step, before migrating data models to NoSQL. With read replicas, read throughput scales almost without limit. This covers the vast majority of production applications.
- You need compliance or audit trails. ACID guarantees, foreign key constraints, and row-level security are what compliance auditors expect. Every finance, healthcare, and legal workload defaults to relational for this reason.
- You have a small team. A single managed PostgreSQL (RDS, Cloud SQL, Neon) is operationally trivial. A Cassandra ring is not.
Interview tip: state PostgreSQL's actual ceiling
Before reaching for NoSQL, say: "With PgBouncer for connection pooling, read replicas, and a Redis cache in front, a single-region PostgreSQL handles the write traffic of most unicorn-stage companies. I'd reach for NoSQL when I have verified write throughput above 50K/s AND confirmed my access patterns are truly partition-based with no cross-partition queries." That sentence signals you have benchmarks in mind, not just gut instinct.
When NoSQL Wins
NoSQL is genuinely better for specific workloads. The decision is not "prefer SQL"; it is "know precisely what you are trading away and confirm you are getting something measurable in return."
Key-Value wins when:
- You need sub-millisecond single-key lookup at millions of requests per second: session storage, feature flags, distributed locks, leaderboard counters, rate-limiting windows.
- The query is always "give me everything for key X": no filtering, no sorting, no aggregations.
- The value shape changes often and you want to avoid schema migrations (config blobs, A/B test parameters).
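The rate-limiting case above can be sketched with a fixed-window counter. Here an in-process dict stands in for Redis INCR/EXPIRE (the Redis deployment itself is assumed, not shown; key format is invented):

```python
import time
from collections import defaultdict

# Fixed-window rate limiter: one counter key per (client, time-window).
# In Redis this is INCR on the key plus EXPIRE of window_s seconds.
window_counts = defaultdict(int)

def allow(client_id: str, limit: int, window_s: int = 60, now=None) -> bool:
    now = time.time() if now is None else now
    window = int(now // window_s)          # which fixed window are we in?
    key = f"rate:{client_id}:{window}"     # the single-key lookup KV stores excel at
    window_counts[key] += 1                # Redis equivalent: INCR key
    return window_counts[key] <= limit

print([allow("alice", limit=2, now=100.0) for _ in range(3)])  # [True, True, False]
```

Every decision is one O(1) counter bump on one key, which is why this workload belongs in a key-value store rather than a relational table under write contention.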
Example: Every request on Twitter checks whether the authenticated user's session is valid. That is 500 million requests per day hitting a single key lookup. No relational structure, no JOIN needed: pure Redis O(1).
Document NoSQL wins when:
- The natural read/write unit is a hierarchical document: a product catalog entry with variants, pricing tiers, images, and metadata, all consumed together, all updated together.
- Schema flexibility is genuinely valuable because data shape differs per record (different product types, flexible CMS content, heterogeneous user settings where 80% of fields are null).
- You are in early prototype phase and the domain model is still evolving rapidly.
Example: A CMS stores articles, product pages, and help docs, each with completely different fields. MongoDB's flexible schema is the right fit; forcing all three into a SQL table would mean dozens of nullable columns and a confusing schema.
Column-Family wins when:
- Your write volume genuinely exceeds what a single SQL node can absorb AND your access pattern is fixed: "records for partition key X, sorted by Y, last N rows."
- You have time-series data at petabyte scale: IoT telemetry, application logs, messaging history, analytics events.
- You can tolerate eventual consistency (within seconds) in exchange for write availability and horizontal scale.
Example: Netflix stores viewing history for 250 million subscribers in Cassandra. The access pattern is always "last 50 watched titles for member_id X": a single-partition read sorted by timestamp. At Netflix scale, Cassandra is several times more cost-effective than sharded PostgreSQL.
Graph NoSQL wins when:
- Traversing relationships between entities IS the core query: not an occasional JOIN but the whole product.
- You need multi-hop traversal ("friends of friends of X who have also been flagged for fraud") that in SQL requires recursive CTEs or self-joins whose cost grows combinatorially with each hop.
- You are building fraud detection, recommendation engines, IAM permission graphs, or social connection features.
Example: LinkedIn's "People You May Know" traverses second-degree connection graphs across 900 million members. A second-degree connection query in SQL can touch billions of join rows per query. LinkedIn built custom serving infrastructure (Venice) to deliver this in milliseconds.
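For contrast, here is what the SQL side of a bounded multi-hop query looks like: a recursive CTE over an edges table, sketched in sqlite3 with an invented follows table. It is correct, but every additional hop multiplies the join work:

```python
import sqlite3

# Second-degree reachability via a recursive CTE: the relational answer
# to a graph question. Each hop joins the frontier back against `follows`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE follows (src TEXT, dst TEXT)")
conn.executemany("INSERT INTO follows VALUES (?, ?)",
                 [("alice", "bob"), ("alice", "carol"),
                  ("bob", "dave"), ("carol", "erin")])

second_degree = conn.execute("""
    WITH RECURSIVE reach(person, depth) AS (
        SELECT 'alice', 0
        UNION
        SELECT f.dst, r.depth + 1
        FROM reach r JOIN follows f ON f.src = r.person
        WHERE r.depth < 2
    )
    SELECT DISTINCT person FROM reach WHERE depth > 0 ORDER BY person
""").fetchall()
print([p for (p,) in second_degree])  # ['bob', 'carol', 'dave', 'erin']
```

On a toy graph this is instant; on 900 million members with hundreds of connections each, the intermediate join result is what explodes, which is the cost a native graph store avoids.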
The Nuance
NewSQL: ACID Plus Horizontal Scale
CockroachDB, Google Spanner, TiDB, and YugabyteDB blur the line. They give you a SQL query interface, ACID transactions (distributed, cross-shard), and horizontal write scale with automatic, transparent sharding.
The trade-off: higher latency (10–100ms for cross-region distributed consensus vs. 1–5ms for a local PostgreSQL write) and higher operational complexity. Use NewSQL when you have genuinely hit PostgreSQL's write ceiling and still need transactions. Do not use it to avoid PostgreSQL operational work; that is premature optimization.
Polyglot Persistence: The Production Reality
Most production systems at scale use multiple databases simultaneously:
```yaml
# Polyglot persistence by workload type
user_accounts_orders_transactions: PostgreSQL    # ACID, relational, ad-hoc queries
sessions_caches_feature_flags: Redis             # sub-ms lookup, high throughput
activity_feed_messaging_history: Cassandra       # write-heavy, time-series, fan-out
full_text_search_autocomplete: Elasticsearch     # inverted index, relevance ranking
analytics_event_store: ClickHouse                # columnar OLAP, aggregations
```
This is called polyglot persistence: multiple stores, each optimized for its specific workload. The skill is not choosing SQL or NoSQL. It is knowing which database fits which access pattern, and designing the data sync patterns between them (typically via Change Data Capture, event streaming, or dual-write with eventual consistency).
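One way to picture the sync problem is a deliberately simplified change-data-capture sketch. In production the log would be something like Debezium feeding Kafka; every store name here is an in-process stand-in:

```python
# CDC sketch: the system of record appends one change event per commit;
# each downstream store consumes the log at its own pace (eventual consistency).
postgres = {}      # system of record (stand-in dict)
change_log = []    # CDC stream (stand-in for Debezium -> Kafka)
redis_cache = {}   # downstream consumer #1
search_index = {}  # downstream consumer #2

def commit(key, value):
    postgres[key] = value
    change_log.append(("upsert", key, value))  # emitted alongside the write

def sync(consumer_store: dict, offset: int) -> int:
    # Each consumer tracks its own offset into the shared log.
    for op, key, value in change_log[offset:]:
        if op == "upsert":
            consumer_store[key] = value
    return len(change_log)  # new offset

commit("user:1", {"name": "Ada"})
commit("user:2", {"name": "Lin"})
cache_offset = sync(redis_cache, 0)
print(redis_cache["user:1"]["name"])  # 'Ada'
```

Note that `search_index` is still empty until its own sync runs: that lag between stores is exactly the eventual consistency the paragraph above mentions.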
Schema at Write Time vs. Query Time
The real axis of difference is not "structured vs. unstructured"; it is when schema is enforced:
- SQL enforces schema at write time. An INSERT with a wrong column type or a missing foreign key is rejected before it touches disk.
- NoSQL pushes schema to query time. MongoDB, DynamoDB, and Cassandra all require your query code to handle whatever shape the data is in. If someone stored "price": "twenty" three years ago, your query today returns a string where you expected a number.
The practical consequence: NoSQL flexibility costs you data integrity unless you compensate with application-layer validation (JSON Schema, protobuf, strict TypeScript codegen types). At scale, this trade-off usually costs more than it saves.
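A minimal sketch of that compensation (in production a JSON Schema or protobuf definition would play this role; the field names are invented):

```python
# Application-layer schema validation at the store boundary: since the
# database will accept any shape, the application refuses bad documents.
def validate_product(doc: dict) -> list:
    """Return a list of validation errors; empty list means the doc is valid."""
    errors = []
    if not isinstance(doc.get("price"), (int, float)):
        errors.append("price must be a number")
    if not isinstance(doc.get("sku"), str):
        errors.append("sku must be a string")
    return errors

print(validate_product({"sku": "W-1", "price": 19.99}))     # []
print(validate_product({"sku": "W-1", "price": "twenty"}))  # ['price must be a number']
```

The "price": "twenty" document from the paragraph above is rejected at write time in your code, which is precisely the enforcement SQL would have given you for free.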
The ACID vs. BASE Spectrum
The core axis is not SQL vs. NoSQL; it is ACID vs. BASE:
- ACID (Atomicity, Consistency, Isolation, Durability): strong guarantees, serializable, optimized for single-machine correctness.
- BASE (Basically Available, Soft-state, Eventually consistent): weaker guarantees, available under network partition, distributed-first.
Most NoSQL databases chose BASE to gain write throughput and partition tolerance (per the CAP theorem). You can tune some toward stronger consistency (Cassandra's QUORUM reads), but that costs availability and latency in return.
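The tunable part reduces to simple arithmetic: with N replicas, W write acknowledgements, and R read acknowledgements, a read is guaranteed to overlap the latest write iff R + W > N. A tiny sketch:

```python
# Quorum-overlap rule behind Cassandra's ONE / QUORUM / ALL levels:
# reads see the latest write iff the read set and write set must intersect.
def is_strongly_consistent(n: int, w: int, r: int) -> bool:
    return r + w > n

N = 3
quorum = N // 2 + 1                               # 2 of 3 replicas
print(is_strongly_consistent(N, quorum, quorum))  # True  (QUORUM writes + QUORUM reads)
print(is_strongly_consistent(N, w=1, r=1))        # False (ONE/ONE can return stale data)
```

This is why "tune toward stronger consistency" costs latency and availability: larger W and R mean waiting on more replicas per operation.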
Real-World Examples
| Company | SQL Choice | NoSQL Choice | Why NoSQL Was Added |
|---|---|---|---|
| Instagram | PostgreSQL: user profiles, follows, likes, comments | Cassandra: activity feeds; Redis: sessions | Feed fan-out at 2B users required 100K+ writes/second per post; Cassandra handles the distributed write path |
| Netflix | MySQL: account data, billing | Cassandra: viewing history, play states | 250M viewing history records with single-partition access by member_id; MySQL sharding was too complex |
| Discord | PostgreSQL: user/server metadata | ScyllaDB (Cassandra-compatible): 4B+ messages, partition key (channel_id, time_bucket), sort by message_timestamp DESC | Messages are always "last N for channel_id", a fixed access pattern; ScyllaDB replaced Cassandra to eliminate JVM GC pauses causing >100ms latency spikes |
| Uber | MySQL + Schemaless: trips, drivers | Riak: availability-critical geolocation data | Geolocation updates roughly 3 times per second × millions of drivers: write throughput SQL could not absorb alone |
| LinkedIn | MySQL: member profiles, connections | Voldemort (key-value); Venice (derived-data serving) | Profiles are document-shaped; "People You May Know" is multi-hop graph traversal at 900M members |
| Twitter | MySQL: user accounts, follow graph | Manhattan (internal key-value) | Timeline fan-out from celebrity tweets (50M followers) requires key-value writes per user_id at enormous throughput |
Interview tip: name the company AND the specific query pattern
Saying "Instagram uses PostgreSQL" is not enough. The answer is: "Instagram uses PostgreSQL for its relational, ACID-requiring data because their core entities β users, follows, comments β are deeply relational and require foreign key integrity. Their read scale is handled by a Redis caching layer, not by switching to NoSQL. They added Cassandra only for the activity feed write path, where fan-out volume made SQL impractical." That depth signals you understand the tradeoff, not just the name.
How This Shows Up in Interviews
When to Bring It Up
As soon as you have defined non-functional requirements (storage volume, DAU/WAU, read-to-write ratio, latency SLA), state your database choice and give the one or two deciding reasons.
"Given our write volume of ~30K events/second and the fact that all reads are by
(user_id, timestamp)with no cross-user queries, I would use Cassandra for the event store. For user account data and any transactional operations, PostgreSQL β we need ACID there and write volume is manageable."
That one statement signals: you know both options, you can match access pattern to storage model, and you are not picking NoSQL because it sounds modern.
Depth Expected at Senior/Staff Level
- Know all four NoSQL models and name a production system for each
- Know Cassandra's partition key + clustering key model and how to design a table around a query
- Know that DynamoDB uses consistent hashing and that table scans are prohibitively expensive
- Know the CAP theorem implications: CP vs. AP, and what partition tolerance means in practice
- Know that MongoDB multi-document transactions exist but carry latency overhead
- Know that PostgreSQL can shard via Citus (and MySQL via Vitess), and know when to reach for that vs. a NoSQL migration
- Articulate polyglot persistence (using two or three databases in one system) as the production-grade answer
Follow-up Q&A
| Interviewer asks | Strong answer |
|---|---|
| "Why not just use MongoDB for everything?" | "MongoDB is excellent for document-shaped data, but it trades ACID transactions for flexibility. Any operation spanning multiple documents risks partial writes unless I opt into distributed transactions, which add 10β30ms overhead. For anything financial, checkout, or inventory, I need SQL's guarantees β the correctness risk of MongoDB's optimistic model is higher than the migration cost." |
| "How does Cassandra achieve this scale?" | "Cassandra uses consistent hashing to distribute partitions across a ring of nodes. A write goes to the partition owner and RF-1 replicas. With QUORUM writes and reads, I tolerate one node failure with no data loss and no stale reads. Write throughput scales linearly β adding nodes adds proportional capacity. The trade-off is that all queries must target a partition key." |
| "What if our SQL database can't keep up with writes?" | "First, verify the bottleneck. Add PgBouncer for connection pooling. Check for missing indexes. Batch small writes. If writes genuinely saturate the primary β I'm looking at pg_stat_activity showing write queuing β I consider application-level sharding or CockroachDB before migrating data model to NoSQL. NoSQL migration carries irreversible correctness risks." |
| "What does eventual consistency actually mean for a user?" | "A user updates their profile. A second later they refresh and see the old value. That is eventual consistency β the update was accepted and will propagate, but not all replicas have it yet. For profile data this is usually fine. For inventory counts or bank balances, it is not β you can oversell or overdraft. The question is not 'do we want eventual consistency' but 'can the user tolerate seeing stale data for this specific field, for how many seconds?'" |
| "When is it worth migrating from SQL to NoSQL?" | "When you have exhausted SQL's ceiling: connection pooling, read replicas, a caching layer, and write-level sharding β and you still see write queue depth growing. At that point, if your access patterns are partition-based, NoSQL migration is justified. Most teams that say they need NoSQL actually need a caching layer. I would migrate only with clear benchmarks showing SQL is the bottleneck, not a projected future scale." |
Test Your Understanding
Quick Recap
- SQL is the correct default for most applications: ACID, foreign keys, and flexible ad-hoc queries carry the majority of production systems through their entire lifecycle.
- NoSQL is not the answer to scale in general; it is the answer to a specific access pattern that SQL cannot serve efficiently: key-value lookup, document hierarchy, write-heavy time-series, or graph traversal.
- The four NoSQL models trade away different things: Key-Value trades query flexibility for speed; Document trades integrity for flexibility; Column-Family trades JOINs and ACID for write throughput; Graph trades general-purpose simplicity for traversal power.
- "NoSQL scales better" is a myth for most teams. PostgreSQL with read replicas and a caching layer handles the read load of most unicorn-stage companies. Cassandra is the right call when writes saturate your SQL primary AND access patterns are partition-based.
- NewSQL (CockroachDB, Spanner) gives ACID plus horizontal write scale at the cost of higher latency and operational complexity; use it when you have genuinely exhausted PostgreSQL and still need transactions.
- Polyglot persistence is the production answer: SQL for transactional state, Redis for low-latency lookups, Cassandra for write-heavy append logs, Elasticsearch for full-text search. The skill is composing these and keeping data in sync.
- Schema at write time vs. query time: NoSQL flexibility pushes enforcement to your application code. At scale this costs more than it saves unless you compensate with strict application-layer validation.
Related Trade-offs
- Databases: B-Tree vs. LSM-Tree, ACID in depth, index internals, and how the SQL query planner works; the foundational concepts behind every trade-off on this page.
- Sharding: when you have outgrown a single database node, sharding is the mechanism, and SQL vs. NoSQL determines how painful or transparent that process is.
- Replication: how data copies propagate across nodes. Understanding replication lag is essential to knowing what "eventual consistency" means in practice.
- CAP Theorem: the theoretical foundation for why distributed NoSQL systems make availability-consistency trade-offs. Every NoSQL system is a specific CAP incarnation.
- Caching: before reaching for a NoSQL database, ask whether a cache layer solves your read-scale problem. It usually does.