Google Spanner and TrueTime
How Google's Spanner database uses GPS-synchronized atomic clocks and bounded uncertainty to provide globally consistent transactions across data centers on different continents.
TL;DR
- Google built Spanner because Bigtable and Megastore could not provide globally consistent ACID transactions across data centers.
- TrueTime uses GPS receivers (accurate to less than 1μs) and cesium atomic clocks in every data center to return wall-clock time as a bounded interval `[earliest, latest]`, not a single point.
- Spanner's commit wait protocol assigns a timestamp `s`, then waits until `TrueTime.now().earliest > s` before acknowledging, guaranteeing that `s` is safely in the past for every clock on Earth.
- Practical uncertainty stays below roughly 7ms (typically around 4ms), making global ACID commits remarkably fast.
- The transferable lesson: when you can bound uncertainty precisely, you can wait it out, and that changes what is possible in distributed systems.
The Trigger
By 2011, Google's internal infrastructure ran on two storage systems that could not coexist gracefully. Bigtable handled massive throughput but offered no cross-row transactions and no SQL interface. Megastore added cross-datacenter replication with ACID semantics on top of Bigtable, but write throughput collapsed under load because every write required Paxos consensus across replicas.
Google Ads needed to bill advertisers in real time across regions. Google Payments required financial-grade consistency for transactions spanning continents. YouTube needed globally consistent counters for views and subscriptions.
Each of these services was bending Megastore's limitations with application-level workarounds, and the complexity was compounding. One Google engineer described the situation in a 2012 conference talk: "We had more lines of code dealing with Megastore's consistency quirks than actual business logic."
The breaking point was not a single outage. It was the realization that every team building a globally distributed application at Google was independently reinventing the same coordination logic on top of broken abstractions. I've seen this pattern at smaller companies too: when three teams all build their own "distributed transaction layer," it is time to fix the platform.
The Spanner project started in 2008 under the leadership of Jeff Dean and Wilson Hsieh. The initial team was small (fewer than 20 engineers), but the scope was enormous: build a database that provides ACID transactions across continents with latency measured in single-digit milliseconds.
The System Before
Google's storage stack in 2011 looked like a layer cake with cracks running through it.
Bigtable was the workhorse: a sorted key-value store sharded across thousands of machines, optimized for high-throughput reads and writes within a single data center. It had no concept of multi-row transactions. If you needed to update a user's balance and their transaction log atomically, you had to put both in the same row or accept eventual consistency.
This limitation was intentional. Bigtable was designed for web-scale workloads (crawl indexing, analytics) where eventual consistency was acceptable. But as Google's business evolved, more services needed transactional guarantees.
Megastore sat on top of Bigtable and added cross-datacenter replication via Paxos. It supported ACID transactions within an "entity group" (a partition of related rows). But cross-group transactions required two-phase commit layered on top of Paxos, which meant write latency of 100-400ms for anything spanning entity groups.
The fundamental issue was clock coordination. Every distributed database must order transactions globally to provide consistency guarantees. Bigtable and Megastore used logical timestamps (monotonically increasing counters) for ordering within a single machine or entity group. But across data centers, there was no shared notion of "now."
And without a shared "now," there is no way to answer the most basic consistency question: "Has this transaction's result been made visible to everyone who started after it committed?"
Logical clocks solved ordering within a single Paxos group. They could not answer the question: "Did transaction A in Virginia commit before transaction B in Belgium started, according to actual wall-clock time?" Without that answer, external consistency was impossible.
The cost of this limitation was concrete. Google Ads engineers wrote thousands of lines of application-level reconciliation code to handle cases where two data centers disagreed about transaction ordering. Google Payments could not guarantee that a refund processed in Europe would be visible to a balance check in the US within any bounded time. These were not theoretical problems. They were support tickets.
Why Not Just NTP, Logical Clocks, or 2PC?
Three "obvious" fixes existed. Each one fell short in a specific way.
Why not just use NTP?
Network Time Protocol synchronizes clocks across the internet to within 1-50ms of UTC. That sounds precise enough until you realize what "1-50ms" actually means in practice.
NTP accuracy degrades unpredictably. A congested network link adds variable latency to time queries. Temperature changes in a data center shift crystal oscillator frequency by parts per million. A rebooted NTP server can introduce step changes of tens of milliseconds.
The drift between corrections is unbounded for short intervals. Between NTP polls (which happen every few minutes), a server's clock drifts at whatever rate its local oscillator dictates. That rate varies with temperature, load, and hardware age.
For a database, this means two nodes can disagree about "now" by 50ms or more, and neither knows how wrong it is. If transaction A commits at node 1's time T=100 and transaction B starts at node 2's time T=101, you cannot guarantee that A truly happened before B. Node 2's clock might be 50ms ahead.
The critical difference between NTP and TrueTime: NTP gives you a best-effort timestamp with no error bound. TrueTime gives you a timestamp with a worst-case error bound. The second is far more useful for building correct distributed systems, because you can reason about the bound.
Why not just use logical clocks?
Lamport clocks and vector clocks provide causal ordering: if event A caused event B, the clocks guarantee A's timestamp is smaller than B's. But they say nothing about concurrent events that have no causal relationship.
Two users in different cities submit payments at the same wall-clock moment. Neither payment caused the other. Logical clocks assign arbitrary ordering to these events. For an advertising billing system that needs to match real-world time ("this click happened before that impression expired"), arbitrary ordering is not acceptable.
Vector clocks detect concurrency but do not resolve it. They tell you "these two events are concurrent" without providing a total order. For a SQL database that needs to answer SELECT ... ORDER BY timestamp, you need a total order that respects real time.
The bottom line on logical clocks: they are the right tool for causal consistency (Amazon DynamoDB uses them well), but they fundamentally cannot provide external consistency because they have no connection to wall-clock time.
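The arbitrary-ordering problem is easy to demonstrate. Here is a minimal Lamport clock sketch (class names and the two-city scenario are mine, for illustration only):

```python
class LamportClock:
    def __init__(self):
        self.time = 0

    def tick(self):
        # Local event: advance the counter.
        self.time += 1
        return self.time

    def receive(self, msg_time):
        # Message receipt: jump past the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time

# Two nodes that never exchange messages: their events are concurrent.
virginia, belgium = LamportClock(), LamportClock()
a = virginia.tick()  # payment A in Virginia
belgium.tick()       # unrelated local event in Belgium
b = belgium.tick()   # payment B in Belgium

# b > a, but only because Belgium happened to count more local events.
# Nothing here reflects which payment occurred first in wall-clock time.
print(a, b)  # 1 2
```

Swapping which node processes an extra event flips the ordering without anything changing in the real world, which is exactly why this is unusable for billing.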
Why not just use two-phase commit?
Two-phase commit (2PC) can coordinate transactions across data centers, but the coordinator must communicate with every participant before committing. For a transaction spanning Virginia and Belgium, that means at least one cross-Atlantic round trip (~70ms).
Megastore already used this approach (Paxos plus 2PC for cross-group transactions), and the result was 100-400ms write latency. For Google Ads processing millions of bid updates per second, that latency budget was unacceptable. The problem was not correctness. The problem was performance.
2PC also has a well-known availability weakness: if the coordinator crashes mid-protocol, participants hold locks until the coordinator recovers. At Google's scale, coordinator failures happen daily. The combination of high latency and fragile availability made pure 2PC a non-starter for a global database.
Spanner does use 2PC, but with a twist: each participant in the 2PC protocol is a Paxos group, not a single machine. If the coordinator machine crashes, another machine in its Paxos group takes over. This makes 2PC fault-tolerant, but that insight alone did not solve the clock-ordering problem.
The real gap
NTP gives you an approximate clock with unknown error bounds. Logical clocks give you ordering without time. 2PC gives you coordination with high latency. Google needed ordering that respects real time, with known error bounds, and without cross-datacenter round trips for every transaction.
| Approach | What it provides | What it lacks |
|---|---|---|
| NTP | Wall-clock sync to ~1-50ms | No bound on actual error at any moment |
| Lamport clocks | Causal ordering | No wall-time semantics, no concurrent ordering |
| Vector clocks | Causal ordering + concurrency detection | No total order, no time semantics |
| Hybrid Logical Clocks (HLC) | Causal + approximate wall time | No guaranteed bound on uncertainty |
| 2PC / Paxos | Strong consistency | Cross-DC latency for every write |
| TrueTime | Wall time with bounded uncertainty | Requires specialized hardware |
The Decision
Google's clock infrastructure team made a bet: if you can measure uncertainty precisely, you can wait it out. The result was TrueTime, a time API unlike anything in production systems before it.
The name "TrueTime" is slightly misleading. It does not give you the true time. It gives you a bounded range that is guaranteed to contain the true time. That guarantee is what makes everything else possible.
The TrueTime API
TrueTime exposes three methods:
```
TT.now()     → TTinterval { earliest, latest }
TT.after(t)  → true if t is definitely in the past
TT.before(t) → true if t is definitely in the future
```
The critical insight is now(). It does not return a timestamp. It returns an interval [earliest, latest] where the true current time is guaranteed to fall. The width of that interval is the uncertainty, denoted ε (epsilon).
In practice, ε stays below 7ms and typically hovers around 4ms. That is the cost of global consistency: a 4-7ms wait per commit.
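The interval semantics fit in a few lines. This is an illustrative model, not Google's implementation: the real system derives its bound from GPS and atomic-clock masters, whereas here epsilon is an assumed constant around the local clock.

```python
import time
from dataclasses import dataclass

@dataclass
class TTInterval:
    earliest: float  # lower bound on the true time (seconds since epoch)
    latest: float    # upper bound on the true time

class TrueTime:
    def __init__(self, epsilon_s=0.004):  # assumed ~4ms uncertainty
        self.epsilon = epsilon_s

    def now(self):
        # Faked around the local clock for illustration only.
        t = time.time()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t):
        return self.now().earliest > t  # t is definitely in the past

    def before(self, t):
        return self.now().latest < t    # t is definitely in the future

tt = TrueTime()
iv = tt.now()
print(round(iv.latest - iv.earliest, 6))  # 0.008: the 2-epsilon window width
```

Note that `after` and `before` are deliberately conservative: when the true time might fall inside the interval, both return false, and the caller must wait.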
Compare this to the alternative. Without TrueTime, achieving the same consistency guarantee requires a cross-datacenter Paxos round trip for every transaction: 50-150ms depending on geography. TrueTime replaces network coordination with a local clock query and a short wait. The trade-off is hardware cost (GPS antennas and atomic clocks in every data center) versus latency cost (round trips for every write).
For Google's scale (millions of transactions per second across dozens of data centers), the hardware investment pays for itself many times over in reduced write latency.
The hardware stack
Every Google data center deploys two types of time references:
GPS receivers provide time accurate to less than 1 microsecond by receiving signals from GPS satellites (which carry atomic clocks). Multiple GPS antennas per data center provide redundancy against antenna failures, GPS signal jamming, and satellite ephemeris errors.
Cesium atomic clocks serve as a fallback when GPS signals are unavailable. Atomic clocks drift slowly (roughly 200 microseconds per day for a cesium beam clock), so they provide a reliable lower bound on accuracy even during extended GPS outages.
The TrueTime daemon on each machine polls multiple time masters every 30 seconds, uses Marzullo's algorithm to intersect their intervals, and accounts for local clock drift between polls. The result: a continuously updated uncertainty interval that tightens after each poll and slowly widens between polls.
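Marzullo's algorithm is worth sketching, because it is the step that turns several partially trusted intervals into one trustworthy one. A simplified version (my own naming; the production daemon additionally weights sources and models drift between polls):

```python
def marzullo(intervals):
    """Return the first window consistent with the most sources.

    intervals: list of (earliest, latest) pairs from time masters.
    Simplified sketch: assumes well-formed, non-degenerate intervals.
    """
    # Edge events: +1 where an interval opens, -1 where it closes.
    events = []
    for lo, hi in intervals:
        events.append((lo, +1))
        events.append((hi, -1))
    events.sort(key=lambda e: (e[0], -e[1]))  # opens before closes at ties

    best = count = 0
    best_lo = best_hi = None
    for i, (t, edge) in enumerate(events):
        count += edge
        if edge == +1 and count > best:
            best = count
            best_lo = t
            best_hi = events[i + 1][0]  # the next edge closes this window
    return best_lo, best_hi

# Three masters; the third disagrees and is outvoted by the other two.
print(marzullo([(10.0, 10.8), (10.2, 11.0), (9.0, 9.5)]))  # (10.2, 10.8)
```

The key property: a faulty master that reports a wrong interval gets outvoted, because the algorithm prefers the range where the most sources overlap.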
My favorite detail is the redundancy model. GPS receivers and atomic clocks have uncorrelated failure modes. A solar flare that disrupts GPS signals does not affect cesium clocks. An earthquake that damages a data center's atomic clock does not affect GPS. The combination provides availability that neither could achieve alone.
The Migration Path
Spanner was not a migration of an existing system. It was a new database built from scratch between 2008 and 2012, designed to replace both Bigtable (for structured data needing transactions) and Megastore (for cross-datacenter consistency).
Phase 1: TrueTime infrastructure (2008-2009)
Google deployed GPS receivers and atomic clocks in data centers and built the TrueTime daemon. This was the foundational bet. Before writing a single line of database code, the team validated that bounded clock uncertainty was achievable at scale.
The validation criteria: ε must stay below 10ms under normal conditions and below 200ms during GPS outages (atomic clock fallback). Both targets were met.
This order of operations matters. Many infrastructure projects start with the application and bolt on the hard part later. Google started with the hard part (reliable bounded time) and validated it independently before building anything on top. That de-risked the entire project.
Phase 2: Spanner core (2009-2011)
The database layer was built on top of TrueTime with these components:
- Paxos replication groups (called "spanservers"): each shard of data is replicated across 3-5 data centers using Paxos for consensus. The Paxos leader handles reads and writes for that shard.
- Directory-based sharding: data is partitioned into "directories" (contiguous key ranges) that can move between Paxos groups for load balancing. This allows Spanner to automatically rebalance load without manual intervention.
- Two-phase commit across Paxos groups: for transactions spanning multiple shards, a coordinator runs 2PC where each participant is a Paxos group (not a single machine). This makes 2PC fault-tolerant, because the Paxos group survives individual machine failures.
- SQL query engine: unlike Bigtable, Spanner exposed a full SQL interface, making adoption easier for teams accustomed to relational databases.
Phase 3: Internal adoption (2011-2012)
Google Ads was the first major customer at production scale. The F1 database (Google's ad-serving database, previously on MySQL with manual sharding) was rebuilt on top of Spanner between 2011 and 2013.
The F1 migration proved that Spanner could handle millions of transactions per second across five data centers while maintaining external consistency. Google published the Spanner paper at OSDI 2012 and the F1 paper at VLDB 2013.
The 2012 paper was a watershed moment for the database industry. It showed that globally consistent transactions were not just theoretically possible but practically achievable at Google scale. Within five years, CockroachDB, YugabyteDB, and TiDB had all launched as commercial databases inspired by Spanner's architecture.
Phase 4: External availability (2017)
Google launched Cloud Spanner as a managed service in February 2017, making the technology available outside Google for the first time. By that point, hundreds of internal Google services had migrated to Spanner, validating its operational model at a scale no external customer would match for years.
This was a five-year build
Spanner took roughly five years from initial TrueTime work to the OSDI paper. This was not a weekend hackathon. The lesson: infrastructure bets at this level require long-term commitment and organizational patience that most companies cannot afford.
The System After
Spanner's architecture combines Paxos replication, TrueTime, and a commit wait protocol to achieve external consistency without requiring cross-datacenter coordination for reads.
The key architectural insight: Spanner separates the concern of "replicating data" (Paxos) from the concern of "ordering transactions" (TrueTime). Paxos ensures every replica agrees on which writes happened. TrueTime ensures those writes are ordered consistently with real-world time. Neither mechanism alone provides external consistency. Together, they do.
The commit wait protocol
This is the mechanism that makes everything work. When a read-write transaction commits:

1. The coordinating Paxos leader picks a commit timestamp `s` no earlier than `TT.now().latest`.
2. The writes are replicated through Paxos as usual.
3. The leader waits until `TT.after(s)` is true, meaning `TrueTime.now().earliest > s`.
4. Only then does it make the writes visible and acknowledge the client.
The commit wait guarantees a simple invariant: by the time the client hears "committed at timestamp s," the real time is definitely past s according to every clock in the system. Any subsequent transaction, on any node, anywhere in the world, will pick a timestamp greater than s.
Why this guarantees external consistency
Suppose transaction T1 commits at timestamp s1, and the client then starts transaction T2. The commit wait ensures real time has moved past s1 before T2 begins. T2's timestamp s2 must be greater than s1 (because TrueTime's earliest is past s1). Therefore T2 sees T1's writes. This holds even if T2 executes on a different continent.
The commit wait is the price Spanner pays. Every read-write transaction waits ~4-7ms (the current uncertainty bound) before acknowledging. This sounds expensive, but consider the alternative: without commit wait, you need a cross-datacenter round trip for every read to check whether a newer write exists. A cross-Atlantic round trip is ~70ms. Paying 4-7ms on writes to avoid 70ms on reads is an extraordinary trade.
The wait duration is at most 2ε (roughly 7-14ms in the worst case). In practice, Paxos replication often takes longer than the commit wait itself, so the wait is typically hidden behind replication latency.
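The whole protocol fits in a short simulation. This is a sketch with an assumed 4ms epsilon faked around the local clock, and the Paxos replication step stubbed out:

```python
import time

EPSILON = 0.004  # assumed ~4ms uncertainty; the real bound comes from the daemon

def tt_now():
    """Fake TrueTime: (earliest, latest) around the local clock."""
    t = time.time()
    return (t - EPSILON, t + EPSILON)

def commit(apply_writes):
    s = tt_now()[1]          # 1. pick commit timestamp s >= TT.now().latest
    apply_writes(s)          # 2. replicate via Paxos (stubbed out here)
    while tt_now()[0] <= s:  # 3. commit wait: block until earliest > s
        time.sleep(0.001)    #    worst case ~2*epsilon (~8ms in this sketch)
    return s                 # 4. acknowledge: s is in the past on every clock

s = commit(lambda ts: None)
assert tt_now()[0] > s  # any later transaction must pick a timestamp above s
```

In a real deployment step 3 overlaps with step 2, which is why the wait is usually hidden behind replication latency.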
Read-only transactions
Spanner also supports lock-free, globally consistent snapshot reads. A read-only transaction picks a timestamp, and every Paxos group serves data as of that timestamp. Because TrueTime bounds uncertainty, the node can determine whether all writes that should be visible at that timestamp have been applied. No locks, no cross-datacenter coordination.
This is the part that surprises most engineers when they first encounter Spanner. Consistent reads across continents with no locking overhead. The trick: the commit wait that already happened during the write ensures all subsequent reads at that timestamp are safe.
Stale reads are also supported: a client can request data as of a specific timestamp in the past (say, 10 seconds ago). Stale reads avoid even the need to check whether all Paxos groups have applied their latest writes, making them even faster. For dashboards and analytics queries where perfect freshness is unnecessary, stale reads are the right choice.
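The mechanism behind this relies on a per-group "safe time" (called t_safe in the Spanner paper): the timestamp below which a replica knows all writes have been applied. A toy multiversioned sketch, with my own class and method names:

```python
class PaxosGroupReplica:
    """Toy MVCC replica; real Spanner tracks t_safe per Paxos group."""

    def __init__(self):
        self.t_safe = 0.0   # all writes with ts <= t_safe are applied here
        self.versions = {}  # key -> list of (timestamp, value)

    def apply_write(self, key, value, ts):
        self.versions.setdefault(key, []).append((ts, value))
        self.t_safe = max(self.t_safe, ts)

    def snapshot_read(self, key, t_read):
        if self.t_safe < t_read:
            raise RuntimeError("would block until t_safe catches up to t_read")
        # Newest version at or below t_read: multiversioning, no locks taken.
        visible = [(ts, v) for ts, v in self.versions.get(key, []) if ts <= t_read]
        return max(visible)[1] if visible else None

replica = PaxosGroupReplica()
replica.apply_write("balance", 100, ts=5.0)
replica.apply_write("balance", 80, ts=9.0)
print(replica.snapshot_read("balance", t_read=7.0))  # 100: the ts=5.0 version
```

A stale read is just a `snapshot_read` with an older `t_read`, which is why it never has to wait on `t_safe`.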
For your interview: mention that Spanner achieves lock-free consistent reads by pushing the coordination cost into the write path (commit wait), so reads are fast and cheap.
Software alternatives today
Spanner's design inspired an entire generation of "NewSQL" databases:
- CockroachDB uses Hybrid Logical Clocks (HLC) instead of TrueTime. HLC combines a physical timestamp with a logical counter, providing causal ordering with approximate wall-time semantics. The trade-off: CockroachDB cannot guarantee external consistency without a "clock skew" configuration that adds latency proportional to estimated clock drift (default 250ms in older versions, 500ms max offset).
- YugabyteDB also uses HLC with a similar approach, targeting Google Spanner's feature set without the hardware dependency.
- AWS Aurora Global Database provides cross-region replication with write forwarding, but does not offer external consistency. Reads from replicas may be stale.
- TiDB uses a centralized Timestamp Oracle (TSO) for ordering. This provides strong consistency but introduces a single point of coordination (and latency) for every transaction.
The key distinction: all software alternatives approximate TrueTime's guarantees with wider uncertainty bounds or different consistency trade-offs. None can match the ~4ms bounded uncertainty without specialized hardware.
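For contrast with TrueTime's hardware-backed intervals, here is a minimal Hybrid Logical Clock sketch following the published HLC update rules (field names are mine; real implementations such as CockroachDB's pack the physical and logical parts into a single 64-bit value):

```python
import time

class HLC:
    def __init__(self, physical_clock=time.time):
        self.pt = physical_clock  # physical time source (seconds)
        self.l = 0.0              # highest physical component seen so far
        self.c = 0                # logical counter that breaks ties

    def now(self):
        """Timestamp for a local or send event."""
        pt = self.pt()
        if pt > self.l:
            self.l, self.c = pt, 0  # physical clock advanced: reset counter
        else:
            self.c += 1             # clock stalled or behind: bump counter
        return (self.l, self.c)

    def update(self, remote):
        """Merge a remote (l, c) stamp on message receipt to keep causality."""
        rl, rc = remote
        m = max(self.l, rl, self.pt())
        if m == self.l and m == rl:
            self.c = max(self.c, rc) + 1
        elif m == self.l:
            self.c += 1
        elif m == rl:
            self.c = rc + 1
        else:
            self.c = 0
        self.l = m
        return (self.l, self.c)

clock = HLC()
t1 = clock.now()
t2 = clock.now()
assert t2 > t1  # tuple comparison: causally later events compare greater

fixed = HLC(physical_clock=lambda: 0.0)  # frozen clock to expose the merge rule
print(fixed.update((100.0, 3)))          # (100.0, 4): jumps past the remote stamp
```

The merge rule preserves causality without any hardware, but the physical component can be arbitrarily far from true time, which is why HLC systems need a configured maximum clock offset.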
If you remember one thing from this comparison: CockroachDB is the closest open-source equivalent to Spanner. In an interview, mentioning CockroachDB as the "software TrueTime alternative" shows you understand both the ideal and the practical.
The Results
| Metric | Before (Megastore) | After (Spanner) |
|---|---|---|
| Write latency (cross-DC) | 100-400ms | 8-15ms (commit wait + Paxos) |
| Read latency (consistent) | Required Paxos round-trip | Lock-free snapshot reads |
| Consistency model | Serializable within entity group | External consistency (globally) |
| SQL support | None (key-value with entity groups) | Full SQL (Spanner SQL dialect) |
| Sharding | Manual entity group design | Automatic directory-based sharding |
| Replication | 3 replicas via Paxos | 3-5 replicas, configurable per table |
| Clock uncertainty (ε) | N/A (logical clocks) | ~4-7ms bounded |
| Scaling | Thousands of entity groups | Millions of Paxos groups |
The numbers speak louder than any architectural argument. Write latency dropped by an order of magnitude while the consistency guarantee got strictly stronger.
The F1 migration (Google Ads) showed that Spanner could handle the full ad-serving workload: millions of transactions per second, five data centers, sub-10ms commit latency for writes within a region.
Google Payments achieved financial-grade consistency across regions without application-level coordination. Before Spanner, payment logic included extensive retry and reconciliation code to handle Megastore's consistency gaps. After Spanner, that code was deleted.
I remember reading the F1 paper and thinking the before/after numbers were too clean. But multiple Google engineers have confirmed in talks that removing application-level consistency workarounds was one of the biggest wins, sometimes eliminating thousands of lines of coordination code per service.
The cost side is less public. Cloud Spanner pricing is roughly 10x that of Cloud SQL for equivalent compute, reflecting both the hardware overhead (GPS/atomic clock infrastructure) and the complexity of managing a globally replicated database. For most applications, that premium is not justified. For applications where a consistency bug means a financial loss, it is cheap.
What They'd Do Differently
The Spanner team has been publicly reflective about their design choices in conference talks and follow-up papers.
SQL from day one. Spanner initially launched with a key-value API. SQL support was added later (and became the dominant interface). The team has said they would include SQL from the start if building again, because the relational model is what most application developers actually need.
Smaller Paxos groups. Early Spanner deployments used relatively large Paxos groups. Google later moved toward finer-grained sharding with smaller groups for better load balancing and faster leader elections.
Automated schema management. Schema changes in early Spanner required careful coordination. Online schema changes (adding columns, creating indexes without downtime) were added later and are now considered essential. The team wishes they had prioritized this earlier.
Better client library abstractions. Early Spanner clients exposed too many low-level details (transaction retries, session management). Later client libraries abstracted these concerns, significantly improving developer experience. The pattern is universal: make the common case easy, not just possible.
The honest trade-off acknowledgment: Spanner requires hardware that most organizations cannot deploy. The external consistency guarantee comes at a literal physical cost (GPS antennas, atomic clocks, hardened time infrastructure). Google has never pretended this is free.
Better documentation of commit wait implications. Application developers sometimes found it surprising that write-heavy workloads had a hard latency floor of ~4-7ms per commit. The team later invested in better documentation and tooling to help developers understand the performance characteristics before choosing Spanner.
I think the SQL-from-day-one point is the most universally applicable. Every infrastructure team I've worked with has made the same mistake: shipping a low-level API first and adding the friendly interface later. Developers adopt the low-level API, and then the migration to the friendly one becomes its own project.
Architecture Decision Guide
When should you reach for Spanner-style global consistency? The decision is simpler than most teams make it.
Most teams overcomplicate this. The default answer is "use a regional database." Only escalate to global consistency when you have a specific, named scenario where stale reads cause real harm.
The practical shortcut
Most applications do not need external consistency. If your users are in one region, a regional database with read replicas covers 95% of use cases. Reach for global consistency only when you have a concrete scenario where stale reads cause financial or safety consequences.
Transferable Lessons
1. Bounded uncertainty beats unbounded precision.
The breakthrough in TrueTime is not that clocks are perfectly accurate. They are not. The breakthrough is that the system knows exactly how inaccurate it is. An NTP clock might be off by 50ms and have no idea. TrueTime might be off by 4ms and knows that 4ms is the worst case. When you can bound the uncertainty, you can wait it out. This principle applies beyond clocks: any system that can quantify its uncertainty can make guarantees that a system with unknown uncertainty cannot.
2. Push coordination cost to the write path to make reads free.
Spanner's commit wait adds a few milliseconds to every write. In exchange, read-only transactions are lock-free and globally consistent. Most systems are read-heavy (10:1 or 100:1 read-to-write ratios), so paying a small write-side cost to eliminate read-side coordination is a massive net win. I apply this principle in cache invalidation design too: pay the cost at write time so reads never have to check.
3. Hardware-software co-design unlocks capabilities that software alone cannot reach.
Google did not solve the distributed clock problem with a clever algorithm. They solved it by deploying GPS receivers and atomic clocks in every data center, then building software that exploits the guarantees those devices provide. Most companies cannot do this, but the lesson is broader: when a software-only approach hits a fundamental limit, ask whether a hardware investment changes the equation.
4. Replace application-level workarounds with platform guarantees.
Before Spanner, every team at Google wrote their own transaction coordination logic on top of Megastore. This was error-prone, duplicated, and inconsistent. Spanner eliminated thousands of lines of application-level consistency code across multiple services. When you see the same workaround in three services, it belongs in the platform.
5. CAP theorem is about assumptions, not absolute limits.
The CAP theorem assumes that clocks are unreliable. TrueTime changes that assumption by providing clocks with bounded uncertainty. Spanner still cannot avoid partition-induced unavailability (it is CP, not AP), but it sidesteps the clock-coordination penalty that makes most CP systems slow. The transferable insight: revisit the assumptions behind any "impossible" result. Sometimes you can change the assumption.
Every one of these lessons comes back to the same meta-principle: Google did not build Spanner by being cleverer than everyone else. They built it by investing in infrastructure that changed the problem's constraints. That is replicable at any scale, as long as you identify which constraint is actually limiting you.
How This Shows Up in Interviews
When an interviewer asks you to design a global payment system or a multi-region database with strong consistency, Spanner is the reference architecture.
The sentence to say: "For global strong consistency, I would use a Spanner-style approach: bounded clock uncertainty with commit wait, so writes pay a small latency cost and reads are lock-free."
Don't go deep on GPS and atomic clocks unless asked. The interviewer cares about the mechanism (commit wait) and the trade-off (write latency vs. read coordination), not the physics.
| Interviewer asks | Strong answer citing Spanner |
|---|---|
| "How do you handle consistency across regions?" | "Bounded clock uncertainty with commit wait. Assign a commit timestamp, wait for the uncertainty window to pass, then acknowledge. Reads are lock-free snapshots." |
| "What about CAP theorem?" | "Spanner is CP. It sidesteps the clock assumption by using hardware clocks with bounded uncertainty. During partitions, affected writes stall, but the consistency guarantee holds." |
| "Isn't cross-region latency too high for strong consistency?" | "Commit wait adds only ~7ms. Paxos replication is usually the bottleneck at ~50-100ms cross-continent. The consistency is almost free relative to replication cost." |
| "What if you can't use Spanner?" | "CockroachDB uses Hybrid Logical Clocks to approximate TrueTime in software. You trade tighter uncertainty bounds for not needing specialized hardware." |
| "How do consistent reads work without locks?" | "The write-side commit wait guarantees timestamps are safely in the past. Read-only transactions pick a timestamp and read a consistent snapshot, no locks needed." |
| "Why not just use eventual consistency?" | "Eventual consistency works for most reads, but for financial transactions or inventory, you need a guarantee that a completed write is visible to all subsequent reads. Spanner provides that without per-read coordination." |
Interview shortcut
You do not need to explain GPS receivers and atomic clocks in detail. Say "bounded clock uncertainty" and focus on the commit wait protocol. That is the mechanism the interviewer cares about.
Quick Recap
- Google built Spanner because Bigtable lacked transactions and Megastore's cross-datacenter writes were too slow at 100-400ms.
- TrueTime uses GPS receivers and cesium atomic clocks to provide wall-clock time as a bounded interval `[earliest, latest]` with roughly 4-7ms uncertainty.
- The commit wait protocol assigns a timestamp `s`, then waits until `TrueTime.now().earliest > s`, guaranteeing `s` is in the past for every clock globally.
- This achieves external consistency: if transaction A commits before B starts in real time, B sees A's writes, without cross-datacenter coordination for reads.
- Spanner is CP under CAP: it sidesteps the clock assumption but still stalls writes during network partitions.
- Software alternatives (CockroachDB with HLC, YugabyteDB) approximate TrueTime's guarantees without specialized hardware, trading wider uncertainty bounds.
- Cloud Spanner (launched 2017) makes this technology available as a managed service, but at roughly 10x the cost of traditional managed databases.
- The transferable principle: bounded uncertainty is more powerful than unbounded precision, because you can wait out a known bound.
Related Concepts
- Consistency Models: Spanner achieves external consistency (stronger than linearizability), which sits at the top of the consistency hierarchy. Understanding where external consistency fits relative to serializable and linearizable is essential.
- CAP Theorem: Spanner is the canonical example of a CP system that "bends" CAP by changing the clock reliability assumption. It proves that hardware investment can shift the boundary of what is theoretically possible.
- Two-Phase Commit: Spanner uses 2PC across Paxos groups for multi-shard transactions, but TrueTime makes the coordination cost predictable and bounded.
- Replication: Spanner's Paxos-based replication across 3-5 data centers is a textbook example of synchronous multi-region replication with configurable placement policies.