Checksums and error detection
How CRC checksums detect data corruption in storage and networking, the math behind polynomial division, when to use which algorithm, and how databases use checksums to catch silent corruption.
The problem
A production PostgreSQL cluster serves financial records. One morning, a batch job produces account balances that are off by millions. No application bugs, no bad queries, no failed writes. The data on disk has silently changed.
The root cause: a DRAM bit flip during a write-back from the page cache to disk. The hardware reported a successful write. The OS confirmed it. The filesystem metadata is consistent. But the data itself is wrong, and nothing in the stack noticed.
Silent data corruption (often called "bit rot") is a real phenomenon in production storage. DRAM bit flips happen at roughly 1 error per 4GB per year without ECC. SSD firmware bugs can write data to the wrong cell. Disk controllers occasionally misroute writes. Network packets can arrive with flipped bits that slip past Ethernet's frame check.
Without checksums, a database silently serves corrupted data. The application trusts the result, the user trusts the application, and the corruption propagates through downstream systems. This is the problem checksums solve: they give every piece of stored or transmitted data a short fingerprint that can be verified on every read.
What it is
A checksum is a small, fixed-size value computed from a block of data. If even a single bit of the data changes, the checksum changes too. On every read, you recompute the checksum and compare it to the stored value. A mismatch means corruption.
Think of it like a parity digit on a credit card number. The last digit of any credit card is not part of the account number; it is a check digit derived from all the other digits using Luhn's algorithm. If you mistype one digit, the check digit will not match, and the system rejects the number. CRC checksums work the same way, except the "check digit" is computed using polynomial division over binary data, and the data can be megabytes instead of 16 digits.
CRC (Cyclic Redundancy Check) is the most common checksum in storage and networking. It produces a 16, 32, or 64-bit fingerprint that detects virtually all accidental corruption, costs only nanoseconds to compute, and has dedicated hardware instructions on every modern CPU.
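The compute-on-write, verify-on-read cycle can be sketched with Python's built-in `zlib.crc32`. The data value here is illustrative:

```python
import zlib

data = b"account_balance=1000000"
checksum = zlib.crc32(data)        # computed at write time, stored alongside the data

# Later, on read: even a single flipped bit is caught
corrupted = bytearray(data)
corrupted[0] ^= 0x04               # simulate a DRAM bit flip
assert zlib.crc32(bytes(corrupted)) != checksum

# Intact data still verifies
assert zlib.crc32(data) == checksum
```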
How it works
CRC treats the input data as a long binary polynomial and divides it by a fixed generator polynomial. The remainder of that division is the CRC value, typically 32 bits.
The key insight: all arithmetic is in GF(2), meaning addition and subtraction are both XOR. There are no carries. This makes CRC computable with shift-and-XOR operations, which is why it runs so fast.
Step-by-step walkthrough
- Treat data as a polynomial. A byte 0b11010011 becomes $x^7 + x^6 + x^4 + x + 1$. The entire input file is one massive polynomial.
- Pad with zeros. Append 32 zero bits (for CRC-32) to the data polynomial. This makes room for the remainder.
- Divide by the generator. The generator polynomial for CRC-32 is 0x04C11DB7. Division works bit-by-bit: if the current leading bit is 1, XOR with the generator and shift. If 0, just shift.
- The remainder is the CRC. After processing all bits, the remaining 32 bits are the CRC value.
- Verify on read. Divide the data+CRC block by the same generator. If the remainder is zero, no corruption. If non-zero, at least one bit has changed.
# CRC-32 core algorithm (simplified, no lookup table)
def crc32(data: bytes) -> int:
    generator = 0xEDB88320  # CRC-32 polynomial (reflected)
    crc = 0xFFFFFFFF        # Initial value
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ generator
            else:
                crc >>= 1
    return crc ^ 0xFFFFFFFF  # Final XOR
The reflected form (0xEDB88320 instead of 0x04C11DB7) processes bits LSB-first, which matches how x86 CPUs naturally read bytes. Real implementations replace the inner loop with a 256-entry lookup table, processing one byte per table lookup instead of one bit per iteration.
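The bitwise routine above can be sanity-checked against `zlib.crc32`, which implements the same reflected algorithm. The standard CRC-32 check value for the input `"123456789"` is 0xCBF43926:

```python
import zlib

def crc32_bitwise(data: bytes) -> int:
    """Bit-by-bit reflected CRC-32, identical to the simplified algorithm above."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF

assert crc32_bitwise(b"123456789") == 0xCBF43926   # standard check value
assert crc32_bitwise(b"hello") == zlib.crc32(b"hello")
```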
The lookup table optimization
The bit-by-bit algorithm processes one bit per XOR operation, which means 8 iterations per byte. In practice, every CRC implementation precomputes a 256-entry table at startup. The table maps each possible byte value to its effect on the CRC register. Then the inner loop becomes a single table lookup per byte.
# Build CRC-32 lookup table (done once at startup)
def build_crc_table() -> list[int]:
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            if crc & 1:
                crc = (crc >> 1) ^ 0xEDB88320
            else:
                crc >>= 1
        table.append(crc)
    return table

CRC_TABLE = build_crc_table()
# CRC-32 with lookup table (8x faster than bit-by-bit)
def crc32_fast(data: bytes) -> int:
    crc = 0xFFFFFFFF
    for byte in data:
        index = (crc ^ byte) & 0xFF
        crc = (crc >> 8) ^ CRC_TABLE[index]
    return crc ^ 0xFFFFFFFF
This reduces the inner loop from 8 bit-iterations per byte to one table lookup, an XOR, and a shift. Modern implementations go further with "slicing-by-8" or "slicing-by-16" techniques that process 8 or 16 bytes per loop iteration using larger tables, reaching several GB/s in pure software.
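The table-driven version can be verified the same way, against both the standard check value and `zlib.crc32` (repeated here self-contained):

```python
import zlib

# Precompute the 256-entry table once
TABLE = []
for i in range(256):
    crc = i
    for _ in range(8):
        crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
    TABLE.append(crc)

def crc32_table(data: bytes) -> int:
    """One table lookup per byte instead of 8 bit-iterations."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = (crc >> 8) ^ TABLE[(crc ^ byte) & 0xFF]
    return crc ^ 0xFFFFFFFF

assert crc32_table(b"123456789") == 0xCBF43926
assert all(crc32_table(bytes([b] * n)) == zlib.crc32(bytes([b] * n))
           for b in (0, 0x55, 0xFF) for n in (1, 7, 64))
```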
Interview tip: CRC is not encryption
When an interviewer asks about data integrity, mention CRC immediately for corruption detection. But always clarify: CRC is not collision-resistant. An attacker can craft data that produces any desired CRC in microseconds. For tamper detection, you need a cryptographic hash (SHA-256, BLAKE3).
Polynomial division mechanics
The polynomial division in CRC is not regular integer division. In GF(2), there are no carries, so "subtraction" is just XOR. Here is what the division looks like for a small example: computing CRC-3 of the data 11010011 with generator 1011.
// CRC division: data = 11010011, generator = 1011 (degree 3)
// Step 1: Pad data with 3 zeros (degree of generator)
// Padded: 11010011 000
11010011 000   ← data padded with 3 zeros
1011           ← XOR with generator (leading bit = 1)
-----------
1100011 000
1011
----------
111011 000
1011
---------
10111 000
1011
--------
0001 000       ← shift until the next leading 1
   1 011       ← generator aligned under that 1
--------
     011
// Final remainder (3 bits) = 011, the CRC value
Each step: if the leading bit is 1, XOR with the generator and shift. If 0, just shift. After processing all data bits, the remaining bits are the CRC.
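The long division above can be reproduced programmatically. This sketch implements a raw GF(2) polynomial remainder (no initial value or final XOR, unlike full CRC-32) and checks both the worked example and the verify-on-read property:

```python
def poly_rem(value: int, nbits: int, gen: int, width: int) -> int:
    """Remainder of GF(2) polynomial division: subtraction is XOR, no carries."""
    for i in range(nbits - 1, width - 1, -1):
        if value & (1 << i):              # leading bit is 1: XOR generator in
            value ^= gen << (i - width)
    return value                          # fewer than `width` bits remain

data, gen, width = 0b11010011, 0b1011, 3

# Pad with 3 zeros, then divide: remainder is the CRC
crc = poly_rem(data << width, 8 + width, gen, width)
assert crc == 0b011

# Verify-on-read: dividing data+CRC by the generator leaves a zero remainder
assert poly_rem((data << width) | crc, 8 + width, gen, width) == 0
```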
The mathematical property that makes CRC useful: any single-bit error in the data changes the remainder. Any burst error no longer than the CRC width (32 bits for CRC-32) is guaranteed to produce a different remainder. For longer bursts, the probability of an undetected error is 1 in 2^32 (roughly 1 in 4 billion).
Why CRC catches specific error patterns
CRC's error detection comes from the algebraic properties of the generator polynomial. A well-chosen generator polynomial is irreducible over GF(2) and has specific factors that guarantee detection of common error types:
- All single-bit errors. A single bit flip at position i changes the data polynomial by x^i. Since the generator polynomial has more than one term, x^i is never divisible by G(x), so the remainder always changes.
- All double-bit errors. Two bit flips at positions i and j (with i > j) change the data by x^i + x^j = x^j * (x^(i-j) + 1). CRC-32's generator is chosen so that x^k + 1 is not divisible by G(x) for any k up to the maximum data length.
- All odd-number-of-bit errors. CRC-32's generator has (x + 1) as a factor, so any error polynomial with an odd number of terms (an odd number of flipped bits) is guaranteed to produce a non-zero remainder.
- All burst errors up to 32 bits. A burst error of length b ≤ 32 is x^j * B(x), where B(x) has lower degree than G(x). Since G(x) divides neither factor, the remainder is non-zero.
These guarantees are deterministic, not probabilistic. For the specific error patterns above, CRC never misses. This is what separates CRC from general-purpose hash functions.
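The single-bit guarantee can be checked exhaustively on a small message with `zlib.crc32` (the message here is arbitrary):

```python
import zlib

msg = bytearray(b"The quick brown fox jumps over the lazy dog")
baseline = zlib.crc32(msg)

# Flip every individual bit of the message; the CRC must change every time
for byte_pos in range(len(msg)):
    for bit in range(8):
        corrupted = bytearray(msg)
        corrupted[byte_pos] ^= 1 << bit
        assert zlib.crc32(corrupted) != baseline
```

This is a deterministic guarantee, so the loop never finds a counterexample, regardless of the message chosen.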
CRC-32 vs CRC-16 vs CRC-64
Not all CRCs are the same size. The choice depends on the data size and the error detection guarantee you need.
| Variant | Output | Polynomial | Burst detection | Undetected error probability | Typical use |
|---|---|---|---|---|---|
| CRC-16 | 2 bytes | 0x8005 | All bursts ≤ 16 bits | 1 in 65,536 | Modbus, USB, legacy serial |
| CRC-32 | 4 bytes | 0x04C11DB7 | All bursts ≤ 32 bits | 1 in ~4 billion | Ethernet, ZIP, PNG, gzip |
| CRC-32C | 4 bytes | 0x1EDC6F41 | All bursts ≤ 32 bits | 1 in ~4 billion | iSCSI, Btrfs, ext4, HDFS |
| CRC-64 | 8 bytes | 0x42F0E1EBA9EA3693 | All bursts ≤ 64 bits | 1 in ~1.8 × 10^19 | ECMA-182, XZ Utils |
CRC-32C (Castagnoli) is preferred over CRC-32 (IEEE) in modern storage systems. It has better error detection for certain bit patterns common in real data, and it is the variant that Intel's CRC32 instruction accelerates. I always recommend CRC-32C for new storage systems unless you need backward compatibility with a protocol that mandates CRC-32.
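For illustration, here is a pure-Python sketch of CRC-32C (real systems use the hardware instruction or a lookup table). The reflected form of 0x1EDC6F41 is 0x82F63B78, and the standard check value for the input "123456789" is 0xE3069283:

```python
def crc32c(data: bytes) -> int:
    """Software CRC-32C (Castagnoli), bit-by-bit with the reflected polynomial."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF

assert crc32c(b"123456789") == 0xE3069283   # standard CRC-32C check value
```

Note that the algorithm is identical to CRC-32; only the generator polynomial differs.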
For error bursts longer than 16 bits, CRC-16 detection is only probabilistic: roughly 1 in 65,536 such errors slips through (99.998% detection). For any block larger than a few hundred bytes, CRC-16 is not sufficient. Use CRC-32C as the default unless you have a hard constraint on checksum size.
Hardware acceleration: the CRC32C instruction
Since 2008 (Intel Nehalem, SSE4.2), x86 CPUs include a dedicated CRC32 instruction that computes CRC-32C in hardware. ARM added equivalent support with CRC extensions in ARMv8. This changes the performance profile dramatically.
| Method | Throughput | Latency per 64 bytes |
|---|---|---|
| Software CRC-32 (table lookup) | ~1 GB/s | ~60 ns |
| Hardware CRC-32C (SSE4.2) | ~20 GB/s | ~3 ns |
| Hardware CRC-32C (AVX-512 VPCLMULQDQ) | ~100 GB/s | < 1 ns |
At 20 GB/s, CRC-32C adds negligible overhead to any storage I/O path. This is why every modern storage system (RocksDB, LevelDB, HDFS, ScyllaDB) uses CRC-32C: the cost is effectively zero on hardware that supports it, and all production hardware does.
The storage engine computes CRC on every write and verifies on every read. With hardware acceleration, this adds less than 1% CPU overhead even at high throughput.
CRC vs cryptographic hashes
CRC and cryptographic hashes (SHA-256, BLAKE3) both produce a fixed-size fingerprint from variable-length data. But they solve fundamentally different problems.
| Property | CRC-32C | SHA-256 | BLAKE3 |
|---|---|---|---|
| Output size | 4 bytes | 32 bytes | 32 bytes |
| Speed | ~20 GB/s (HW) | ~500 MB/s | ~5 GB/s |
| Detects random corruption | Yes | Yes | Yes |
| Resists intentional collision | No | Yes | Yes |
| Resists preimage attacks | No | Yes | Yes |
| Use case | Error detection | Tamper detection, signatures | Fast secure hashing |
CRC is designed to detect accidental bit errors with minimal overhead. An attacker can forge a CRC in microseconds: just compute the CRC of the tampered data and replace the checksum. This is by design. CRC optimizes for speed, not security.
Cryptographic hashes are designed so that finding two inputs with the same hash is computationally infeasible (2^128 operations for SHA-256). This makes them suitable for verifying that data has not been tampered with (code signing, certificate pinning, content addressing in Git).
In system design interviews, I always frame this as a two-axis decision: "Do I need to detect accidental corruption, or intentional tampering?" Accidental means CRC. Intentional means cryptographic hash. If you need both (common in distributed systems), compute both: CRC for fast per-block validation, SHA-256 for end-to-end integrity proofs.
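The two-axis decision can be made concrete with stdlib `zlib` and `hashlib`. In this sketch (the data values are illustrative), an attacker who can rewrite both the block and its checksum defeats the CRC trivially, but cannot forge a matching SHA-256 digest:

```python
import hashlib
import zlib

original = b"balance=1000000"
stored_crc = zlib.crc32(original)
stored_digest = hashlib.sha256(original).hexdigest()

# Attacker swaps the data AND recomputes the CRC -- trivial and instant
tampered = b"balance=9999999"
forged_crc = zlib.crc32(tampered)

# CRC verification against the forged checksum passes...
assert zlib.crc32(tampered) == forged_crc
# ...but the digest stored out-of-band (or signed) exposes the tampering
assert hashlib.sha256(tampered).hexdigest() != stored_digest
```

This is why the SHA-256 digest must be protected (stored separately, signed, or keyed as an HMAC); a digest stored next to the data can be replaced just like the CRC.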
Production usage
| System | Usage | Notable behavior |
|---|---|---|
| PostgreSQL | 16-bit checksum in every 8KB page header (FNV-1a-based, not a true CRC); CRC-32C protects WAL records | Opt-in via initdb --data-checksums. On mismatch, refuses to serve the page and logs invalid page in block X. Enabling on an existing cluster requires pg_checksums with the server stopped (or dump/restore on older versions). |
| HDFS | CRC-32C per 512-byte chunk within each 128MB block | DataNodes verify on read and report mismatches to the NameNode, which triggers automatic re-replication from a healthy replica. |
| ZFS | SHA-256 or Fletcher-4 per block (full Merkle tree) | Checksums cover both data and metadata. On mismatch, ZFS self-heals from mirror or RAID-Z parity. One of the few filesystems with true end-to-end integrity (Btrfs takes a similar approach). |
| Ethernet | CRC-32 in the Frame Check Sequence (FCS) | Every Ethernet frame includes a 4-byte CRC. NICs discard frames with CRC mismatches silently, before the OS sees them. |
| RocksDB / LevelDB | CRC-32C per block in SST files | Verified on every read. Uses hardware CRC32C instruction. Corruption triggers a compaction repair or manual ldb repair. |
| Kafka | CRC-32C per message batch | Broker verifies CRC on produce. Consumer verifies on fetch. Catches corruption from disk or network at both ends. |
| gzip / PNG / ZIP | CRC-32 | Stored in the file footer. Decompressor verifies after decompression. A mismatch means the archive is corrupt. |
The pattern across these systems is consistent: compute the checksum on write, store it alongside the data, and verify on every read. The cost of a false positive (rejecting valid data) is a single retry or failover. The cost of a false negative (serving corrupted data) can be catastrophic. Every production storage system errs on the side of detection.
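That pattern is small enough to sketch directly. `ChecksummedStore` below is a hypothetical in-memory stand-in for a disk page store; the 4-byte big-endian CRC header mirrors what RocksDB-style block formats do:

```python
import struct
import zlib

class ChecksummedStore:
    """Sketch: prefix each block with a 4-byte CRC-32, verify on every read."""

    def __init__(self) -> None:
        self._blocks: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        # Compute the checksum on write and store it alongside the data
        crc = zlib.crc32(data)
        self._blocks[key] = struct.pack(">I", crc) + data

    def get(self, key: str) -> bytes:
        raw = self._blocks[key]
        stored_crc, = struct.unpack(">I", raw[:4])
        data = raw[4:]
        # Verify on read: a mismatch means the block changed since write
        if zlib.crc32(data) != stored_crc:
            raise IOError(f"checksum mismatch for block {key!r}")
        return data
```

A real system would respond to the `IOError` by retrying from a replica or reconstructing from parity rather than surfacing the failure to the application.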
Notice that ZFS uses SHA-256 (a cryptographic hash) rather than CRC-32C. This is because ZFS stores checksums in parent blocks, forming a Merkle tree. A cryptographic hash prevents an attacker who compromises one disk from forging both the data and its checksum. For most systems without this threat model, CRC-32C is sufficient and much faster.
Limitations and when NOT to use it
- CRC does not detect intentional tampering. An attacker can modify data, recompute the CRC, and replace the checksum in microseconds. If your threat model includes malicious actors, CRC is useless. Use HMAC-SHA256 or a digital signature.
- CRC does not correct errors, only detect them. If corruption is detected, you need a separate recovery mechanism (replicas, parity, WAL replay). CRC alone just tells you the data is bad.
- CRC-32 has a non-zero false-negative rate for bursts longer than 32 bits. The probability of an undetected error is 1 in 2^32 per corrupted block. At billions of blocks per day (cloud-scale storage), a few undetected corruptions will occur over years.
- CRC does not provide deduplication. Two different data blocks can have the same CRC-32 (collisions are easy to find). For content-addressable storage, use SHA-256 or BLAKE3.
- Checksum verification adds latency on the read path. With hardware CRC32C this is negligible (nanoseconds), but software implementations on embedded devices or older CPUs can be measurable at high IOPS.
- Wrong CRC polynomial gives false confidence. If you use CRC-16 on large blocks or CRC-32 (IEEE) in a system that assumes CRC-32C, the error detection guarantees differ. Mismatched polynomials between writer and reader will produce false corruption alerts on every read.
Interview cheat sheet
- When asked about data integrity in storage systems, state CRC-32C immediately. It detects all single-bit and all burst errors up to 32 bits in a 4-byte checksum, with hardware acceleration on every modern CPU.
- When designing a write path, mention computing the checksum before writing and verifying on every read. PostgreSQL, RocksDB, HDFS, and Kafka all do this.
- When differentiating CRC from cryptographic hashes, say CRC detects accidental corruption; SHA-256 detects intentional tampering. CRC is 40x faster but trivially forgeable.
- When discussing Ethernet or TCP, note that Ethernet uses CRC-32 at the frame level. TCP uses a weaker 16-bit ones' complement checksum that misses some multi-bit errors, which is why application-layer checksums (CRC in protocols like gRPC, Kafka) still matter.
- When asked about ZFS or Btrfs, highlight that they checksum the entire block tree (data and metadata), not just data pages. This catches corruption that filesystem-level checksums miss, like a misdirected write from the disk controller.
- When comparing CRC-32 vs CRC-32C, say CRC-32C (Castagnoli) is preferred in modern systems because Intel's SSE4.2 CRC32 instruction computes it at ~20 GB/s, and it has slightly better error detection properties for certain common bit patterns.
- When asked about error correction (not just detection), clarify that CRC only detects errors. For correction, you need Reed-Solomon codes, LDPC, or ECC memory. Checksums and error-correcting codes are complementary, not interchangeable.
- When discussing end-to-end integrity, argue for checksumming at every layer: network (Ethernet CRC), transport (TCP checksum), application (CRC-32C per message), and storage (CRC-32C per page/block). Each layer catches corruption that the layer below might miss.
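The point about TCP's weaker checksum can be made concrete. This is a sketch of the RFC 1071 ones'-complement sum used by TCP, UDP, and IP headers, along with one error class it provably misses (reordered 16-bit words, since ones'-complement addition is commutative) that any CRC would catch:

```python
def internet_checksum(data: bytes) -> int:
    """RFC 1071 ones'-complement checksum (as used by TCP/UDP/IP)."""
    if len(data) % 2:
        data += b"\x00"                           # pad odd-length input
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]     # sum 16-bit big-endian words
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF                        # ones' complement of the sum

# Swapping two 16-bit words goes completely undetected
assert internet_checksum(b"\x12\x34\x56\x78") == internet_checksum(b"\x56\x78\x12\x34")
```

Ethernet's CRC-32 beneath TCP catches most such errors on the wire, but corruption introduced inside a router or NIC after the frame check is stripped can survive, which is exactly why application-layer CRCs still matter.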
Quick recap
- Silent data corruption (bit flips in DRAM, disk errors, NIC bugs) produces incorrect data that hardware reports as successful. Checksums detect this by verifying data against a short fingerprint on every read.
- CRC uses polynomial division in GF(2) to compute a fixed-size remainder. All arithmetic is XOR-based, making it extremely fast with dedicated hardware instructions.
- CRC-32C (Castagnoli) is the modern default for storage systems. It detects all single-bit errors, all burst errors up to 32 bits, and has hardware acceleration via SSE4.2 at ~20 GB/s throughput.
- CRC detects accidental corruption but cannot detect intentional tampering. An attacker can forge a matching CRC in microseconds. For tamper detection, use SHA-256 or BLAKE3.
- Every production storage system (PostgreSQL, HDFS, RocksDB, Kafka, ZFS) computes checksums on writes and verifies on reads. The performance cost with hardware CRC is negligible (under 1% CPU).
- End-to-end integrity requires checksumming at every layer (network, transport, application, storage) because each layer can introduce corruption that the layer below cannot catch.
Related concepts
- Hashing - CRC is a specialized hash function optimized for error detection speed rather than collision resistance. Understanding hash function properties helps you choose between CRC, SHA-256, and BLAKE3.
- Write-ahead log - WAL entries are protected by CRC checksums in PostgreSQL and RocksDB. If you understand CRC, you understand what happens when a WAL entry fails verification during crash recovery.
- Databases - Page-level checksums are a core database reliability feature. PostgreSQL, MySQL, and SQLite all use CRC variants to catch corruption before it reaches queries.