DNS internals
Learn how DNS resolves a hostname end to end, what recursive vs. iterative resolution means, and why TTL tuning during deployments is as important as the deployment itself.
The problem
You finish a zero-downtime deployment. The new application servers are running. The load balancer is updated and routing traffic to them. Your old EC2 instances are shut down. Five minutes later, 30% of users are still hitting the old, terminated IP address. Their requests are timing out. You get paged from half your regions.
The deployment worked. The DNS did not. Specifically, your DNS TTL was set to 86,400 seconds (24 hours). Resolvers cached the old IP and will not check for an update until their cached record expires. The users hitting the old IP are using resolvers that cached the old record this morning.
DNS propagation delay is not magic or randomness. It is arithmetic: every resolver holds your record for exactly its TTL, and they all refreshed at different times throughout the day. Understanding DNS end to end turns a mysterious deployment failure into a predictable problem you can prevent by lowering your TTL before a planned IP change.
What DNS is
The Domain Name System (DNS) is a globally distributed, hierarchically delegated database that maps human-readable names (like api.example.com) to machine-usable values (like 93.184.216.34). It is not a single server. It is a tree of authorities, each responsible for a portion of the namespace, coordinated through delegations.
Think of it like a nested directory of phone books. There is a global directory (root) that tells you which regional directory handles .com. The .com directory tells you which business directory handles example.com. The example.com directory tells you the actual phone number for api. No single book contains everything; each book tells you who to ask next.
The DNS namespace is a tree: delegation flows downward from the root, to the TLD, to your zone's authoritative nameservers, and finally to the individual record (root -> .com -> example.com -> api.example.com).
How DNS resolution works
Resolving api.example.com from a fresh cache (no information cached anywhere) walks a full chain of referrals. The sequence below shows recursive resolution, where the recursive resolver does all the work on behalf of the client.
Step by step:
- Your application calls getaddrinfo("api.example.com"). The OS stub resolver checks its local cache. Cache miss.
- The stub resolver forwards the query to the configured recursive resolver (your ISP's resolver, 8.8.8.8, or a private DNS server).
- The recursive resolver checks its cache. Cache miss. It must start from the top.
- The recursive resolver asks a root nameserver. Root servers do not know the answer but know which nameservers are authoritative for .com. Returns a referral.
- The recursive resolver asks the .com TLD nameserver. It knows which nameservers are authoritative for example.com. Another referral.
- The recursive resolver asks ns1.example.com (the authoritative nameserver). This server has the actual record. Returns the answer with a TTL.
- The recursive resolver caches the answer for the TTL duration and returns it to the stub resolver, which caches it and returns it to the application.
The entire chain for a cold cache typically takes 30-150 ms. Subsequent queries hit the recursive resolver's cache and return in under 1 ms.
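The first hop of that chain is visible from any program. Below is a minimal sketch using Python's standard socket module; the helper name lookup_ipv4 is illustrative. It exercises step 1: ask the OS stub resolver (and, behind it, the configured recursive resolver) for a hostname's IPv4 addresses.

```python
import socket

def lookup_ipv4(hostname):
    """Resolve a hostname to its IPv4 addresses via the OS stub resolver.

    This is the same getaddrinfo call an application makes; everything
    after it (recursive resolver, root, TLD, authoritative) is hidden
    behind this one function.
    """
    results = socket.getaddrinfo(hostname, None,
                                 family=socket.AF_INET,
                                 type=socket.SOCK_STREAM)
    # Each result is (family, type, proto, canonname, sockaddr);
    # for AF_INET, sockaddr is an (ip, port) tuple.
    return sorted({sockaddr[0] for *_, sockaddr in results})

# localhost avoids a network round trip; a real hostname would
# trigger the full chain described above on a cold cache.
print(lookup_ipv4("localhost"))
```

Note that getaddrinfo hides all caching layers: it gives no access to the record's remaining TTL, which is one reason long-lived applications can keep using stale IPs.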
```
// Pseudocode: recursive resolver algorithm
function resolve(name, type):
    cached = cache.get(name, type)
    if cached and not expired: return cached

    // Start from the bottom of what we know
    // e.g. for api.example.com, we might have the .com NS cached already
    best_known_ns = find_closest_cached_nameserver(name)

    while true:
        response = query(best_known_ns, name, type)
        if response.is_answer:
            cache.store(name, type, response.answer, ttl=response.ttl)
            return response.answer
        if response.is_referral:
            // Follow the referral: ask the next nameserver in the chain
            best_known_ns = response.referral_ns
            continue
        if response.is_nxdomain:
            cache.store(name, NXDOMAIN, ttl=response.negative_ttl)
            return NXDOMAIN
```
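To make the control flow concrete, here is a runnable Python sketch of the same loop, with an in-memory table of mock nameserver responses standing in for real network queries. The server names, the IP, and the TTLs are illustrative, not real infrastructure.

```python
import time

# Mock responses keyed by (server, name). Each value is one of:
#   ("answer", ip, ttl)   -- authoritative answer
#   ("referral", server)  -- delegation to the next nameserver
NAMESERVERS = {
    ("root", "api.example.com"): ("referral", "tld-com"),
    ("tld-com", "api.example.com"): ("referral", "ns1.example.com"),
    ("ns1.example.com", "api.example.com"): ("answer", "93.184.216.34", 300),
}

query_count = 0  # counts simulated network round trips

def query(server, name):
    global query_count
    query_count += 1
    return NAMESERVERS.get((server, name), ("nxdomain", 900))

cache = {}  # name -> (value, expires_at)

def resolve(name, now=None):
    now = time.time() if now is None else now
    if name in cache:
        value, expires_at = cache[name]
        if now < expires_at:        # cache hit: zero network queries
            return value
    server = "root"                 # cold cache: start at the root
    while True:
        response = query(server, name)
        if response[0] == "answer":
            _, ip, ttl = response
            cache[name] = (ip, now + ttl)
            return ip
        if response[0] == "referral":
            server = response[1]    # follow the delegation downward
        elif response[0] == "nxdomain":
            _, neg_ttl = response
            cache[name] = (None, now + neg_ttl)  # negative caching
            return None

print(resolve("api.example.com"))   # walks root -> TLD -> authoritative
```

The first call costs three queries (root, TLD, authoritative); a second call within the 300 s TTL is served from the cache with no queries at all, which is exactly the cold-versus-warm latency gap described above.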
TTL and caching at every layer
TTL (Time To Live) is the number of seconds a resolver is allowed to cache a DNS record. Once the TTL expires, the resolver must re-query the authoritative nameserver for a fresh copy.
Every layer in the chain caches independently, and the TTL countdown starts from when each resolver fetched the record, not from when you published it.
| Layer | What it caches | Typical TTL | Notes |
|---|---|---|---|
| OS stub resolver | Query results | Typically 0-30 s | Many systems re-query on every process restart |
| Recursive resolver (ISP) | Full answers and referrals | As published in DNS | May enforce a minimum TTL floor (often 60 s) |
| Recursive resolver (public: 8.8.8.8) | Full answers | Honors TTL exactly | Google and Cloudflare honor low TTLs; many ISP resolvers enforce floors |
| Browser | A record results | Varies (10 s - 60 s) | Chrome and Firefox have their own DNS cache |
| Application | Results from getaddrinfo | Application-controlled | Many HTTP clients cache results for the connection lifetime |
This is why DNS propagation is gradual rather than instant: every resolver refreshes independently when its cached copy expires. A TTL of 300 s means all resolvers will have the new record within 5 minutes of the change. A TTL of 86,400 s means some resolvers may serve the old record for up to 24 hours.
Lowering TTL must happen before the planned IP change, not during it. If your TTL is 86,400 when you make the change, resolvers that cached the record two hours ago will hold it for another 22 hours regardless of your new TTL. Lower TTL to 300 s at least 24-48 hours before any planned IP rotation, wait one full old-TTL period, then make the change. This is one of the most common deployment mistakes I see in production post-mortems.
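The propagation arithmetic is simple enough to check directly. A small sketch using the article's numbers (the function name is illustrative):

```python
# A resolver that cached the record age_s seconds before your change
# keeps serving the old IP for (ttl_s - age_s) more seconds.
def stale_seconds_remaining(ttl_s, age_s):
    return max(ttl_s - age_s, 0)

# Worst case: the resolver refreshed an instant before the change,
# so it holds the old IP for one full TTL: here, 24 hours.
assert stale_seconds_remaining(86_400, 0) == 86_400

# A resolver that cached 22 hours ago is stale for 2 more hours,
# regardless of any lower TTL you published after it cached.
assert stale_seconds_remaining(86_400, 22 * 3600) == 7_200

# With TTL lowered to 300 s one old-TTL period in advance,
# the worst-case window shrinks to 5 minutes.
assert stale_seconds_remaining(300, 0) == 300
```

This is why the order of operations matters: the stale window is governed by the TTL resolvers saw when they cached, never by the TTL you publish at change time.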
DNS record types
DNS is more than just A records. Each record type serves a specific purpose:
| Record type | Purpose | Example |
|---|---|---|
| A | Maps hostname to IPv4 address | api.example.com -> 93.184.216.34 |
| AAAA | Maps hostname to IPv6 address | api.example.com -> 2606:2800:220:1::93c8:d823 |
| CNAME | Alias: maps hostname to another hostname | www.example.com -> example.com |
| MX | Mail exchanger: where to deliver email | example.com -> mail1.example.com (priority 10) |
| TXT | Arbitrary text: used for SPF, DKIM, verification tokens | example.com -> "v=spf1 include:_spf.google.com ~all" |
| NS | Delegations: which nameservers are authoritative | example.com -> ns1.example.com, ns2.example.com |
| SOA | Start of Authority: zone metadata and negative TTL | Zone serial, refresh intervals, negative caching TTL |
CNAMEs have an important constraint: they cannot be used at the zone apex (the root of a domain). You cannot set example.com itself to a CNAME. This is why many DNS providers offer ALIAS or ANAME records as a vendor extension that behaves like a CNAME at the apex but resolves to an A record at the edge.
Negative caching
When a DNS query returns NXDOMAIN (name does not exist) or NODATA (name exists but no records of the requested type), resolvers cache that negative result too. The TTL for negative caching comes from the SOA record's minimum field.
This matters for incident response: if you accidentally deleted a DNS record and a resolver received an NXDOMAIN response, it will cache that negative result for the negative TTL (often 300-900 s). Even after you restore the record, affected resolvers continue returning NXDOMAIN until their negative cache entry expires. There is no way to force a resolver to clear its cache from the outside.
Negative cache TTL is controlled by the SOA record's minimum field, not your A record's TTL. If your A record TTL is 60 s but your SOA minimum is 900 s, a briefly deleted record stays broken for 15 minutes even after it is restored. Audit your SOA negative TTL separately from your A record TTLs, especially in on-call runbooks for accidental deletion scenarios.
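A minimal Python simulation makes the failure mode concrete; the Resolver class, zone contents, and timestamps are illustrative, with an explicit clock so the timeline is deterministic. Restoring the record does nothing for a resolver that already cached the NXDOMAIN:

```python
class Resolver:
    """Toy resolver cache that stores both positive and negative answers."""

    def __init__(self):
        self.cache = {}  # name -> (ip_or_None, expires_at)

    def answer(self, name, zone, now):
        if name in self.cache:
            value, expires_at = self.cache[name]
            if now < expires_at:
                return value  # may be a cached NXDOMAIN (None)
        if name in zone:
            ip, ttl = zone[name]
            self.cache[name] = (ip, now + ttl)
            return ip
        # NXDOMAIN: cached for the SOA minimum, NOT the A record's TTL.
        self.cache[name] = (None, now + zone["SOA_MINIMUM"])
        return None

zone = {"SOA_MINIMUM": 900}                    # negative TTL: 15 minutes
r = Resolver()

# t=0: the A record has been accidentally deleted; resolver caches NXDOMAIN.
assert r.answer("api.example.com", zone, now=0) is None

# t=60: the record is restored with a 60 s TTL...
zone["api.example.com"] = ("93.184.216.34", 60)

# t=600: ...but the resolver still serves its cached NXDOMAIN.
assert r.answer("api.example.com", zone, now=600) is None

# t=901: only after the 900 s negative TTL expires does resolution recover.
assert r.answer("api.example.com", zone, now=901) == "93.184.216.34"
```

Note the asymmetry: the restored record's 60 s TTL is irrelevant to recovery time; only the SOA minimum that was in effect when the NXDOMAIN was cached matters.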
Anycast for root and TLD nameservers
There are 13 root nameserver addresses (a.root-servers.net through m.root-servers.net). But each address is served by hundreds of physical servers worldwide using anycast routing. Anycast means the same IP address is announced from multiple locations; routers send queries to whichever server is topologically closest.
This is why DNS root servers can handle billions of queries per day: each query goes to the nearest physical replica, not to 13 centralized machines.
Production usage
| System | Usage | Notable behavior |
|---|---|---|
| AWS Route 53 | Authoritative DNS for AWS-hosted services | Health checks can update DNS records automatically on failure; propagation is constrained by TTL |
| Cloudflare DNS | Both authoritative and recursive (1.1.1.1) | Honors very low TTLs (down to 1 s); provides DDoS protection via anycast |
| Consul | Internal service discovery DNS | Default TTL is 0 (no caching); intended for dynamic microservice environments with frequently changing IPs |
| CoreDNS | Kubernetes internal DNS | Resolves service names to ClusterIP; every pod uses it as its default resolver |
| nscd / dnsmasq | Host-level caching daemon | Adds a local cache to reduce upstream resolver load; can cause stale results if not tuned |
Limitations and when NOT to use it
- DNS is not a real-time failover mechanism unless TTLs are very short. With a 300 s TTL, any DNS-based failover takes up to 5 minutes to propagate to all resolvers. For sub-minute failover requirements, route at the load balancer or via health-check-based record updates, not TTL expiry.
- Lowering TTL must happen before the planned change, not during. If you lower TTL right when you are changing an IP, resolvers that cached the record 1 minute ago with the old high TTL will still hold it for the full original duration. Lower TTL at least one full old-TTL interval before any IP-changing deployment.
- Resolvers may not honor TTLs exactly. ISP resolvers often enforce a minimum TTL floor (30-60 s). Public resolvers (8.8.8.8, 1.1.1.1) generally honor low TTLs. You cannot guarantee sub-60-second propagation across all resolvers in the wild.
- CNAME chaining adds latency. Each CNAME hop may require an additional DNS lookup. Deeply nested CNAMEs (CNAME to CNAME to CNAME) add round-trip time on cache misses.
- Negative caching locks out deleted records. An accidentally deleted record causes NXDOMAIN responses that are cached for the negative TTL. Restoring the record does not help until those caches expire.
- Split-horizon DNS adds operational complexity. Running one DNS view for internal traffic and another for external traffic (split-horizon) requires careful zone management. Misconfiguration directs internal services to external IPs or vice versa.
Interview cheat sheet
- When asked why users still see old servers after a deployment: DNS TTL. Resolvers cache the old record until its TTL expires. If TTL was 86,400 s when you made the change, some resolvers will serve the old IP for up to 24 hours. The fix is to lower TTL before deployments, not after.
- When asked about DNS propagation time: It equals the TTL of the record at the time resolvers last cached it. There is no magic delay. Lower TTLs give faster propagation. An industry best practice is to lower TTL to 60-300 s at least one full old-TTL period before any IP-changing deployment.
- When asked how DNS resolution works: Stub resolver to recursive resolver, then recursive resolver queries root, TLD, authoritative in sequence. Each level returns either an answer or a referral. The recursive resolver caches answers at each level.
- When asked about Consul or CoreDNS in microservices: Internal service discovery DNS with very short TTLs (0-5 s) allows services to discover each other dynamically as IPs change with container scheduling. Standard DNS is too slow for this without near-zero TTLs.
- When asked about CNAME vs A record: A record resolves directly to an IP. CNAME points to another name which then resolves to an IP, requiring an extra lookup on cache miss. CNAMEs cannot be used at the zone apex (naked domain).
- When asked about the 13 root nameservers: There are 13 root nameserver addresses but hundreds of physical machines behind them via anycast routing. Each query goes to the nearest replica; they are not 13 centralized servers.
- When asked about negative caching: NXDOMAIN responses are cached just like positive records. The TTL comes from the SOA record's minimum field. A deleted record causes NXDOMAIN caching; restoring it does not instantly fix resolution for resolvers that already cached the NXDOMAIN.
- When asked about DNS-based load balancing: DNS can return multiple A records (round-robin) or health-check-weighted records (Route 53). It is not a real-time balancer since TTL delays visibility of changes. Use it for coarse-grained routing, not fine-grained traffic management.
Quick recap
- DNS is a globally distributed hierarchical database; resolving a name walks from the root through TLD to authoritative nameservers, with each resolver caching results for the record's TTL.
- TTL determines propagation time: the window for a change to reach all resolvers equals the TTL that was in place when they last cached the record, not the TTL you set at the moment of the change.
- To minimize DNS propagation delay for a planned IP change, lower the TTL at least one full old-TTL period before the change, then make the change.
- NXDOMAIN responses are cached as negative entries with a TTL from the SOA record; re-adding a deleted record does not immediately fix resolution for resolvers that cached the NXDOMAIN.
- CNAMEs cannot be used at the zone apex; use provider-specific ALIAS/ANAME records instead to get dynamic-IP behavior on the naked domain.
- For dynamic microservices requiring sub-second service discovery, DNS with any meaningful TTL is too slow; use a purpose-built service registry (Consul, Kubernetes Services) instead.
Related concepts
- Networking — DNS is the entry point to every networked request; understanding TCP, UDP, and anycast routing puts DNS resolution latency and reliability in context.
- CDN — CDNs rely heavily on DNS-based geo-routing and short TTLs to direct users to the nearest edge node; DNS propagation delays directly affect CDN failover behavior.
- Load balancing — DNS round-robin is the simplest form of load balancing, but it lacks health checking and consistent hashing; understanding its limitations explains why application-layer load balancers exist.