Edge computing
What edge computing is and when to use it, edge vs CDN, Cloudflare Workers vs Lambda@Edge cold starts, edge data stores, and when the edge creates more problems than it solves.
TL;DR
- Edge computing runs application logic at CDN Points of Presence (PoPs) close to users, cutting dynamic request latency from 150-300ms to 10-50ms.
- Best edge use cases: JWT validation, A/B test assignment, geolocation routing, request/response header manipulation, bot detection. All stateless or eventually-consistent workloads.
- Edge runtimes use V8 isolates (not containers), giving near-zero cold starts but strict limits: ~128MB RAM, 50ms CPU time, no blocking I/O.
- Edge data stores (KV, Durable Objects, distributed SQLite) make stateful edge logic possible, but with consistency tradeoffs that you need to design around.
- The fundamental question is not "can I run this at the edge?" but "does the latency win justify running code in 300+ distributed locations?"
The Problem It Solves
Your CDN handles static assets perfectly. Images, CSS, JavaScript bundles all serve from the nearest PoP in under 20ms. But the moment a user hits a dynamic endpoint (login, personalized homepage, API call), the request flies past the CDN and travels to your origin server, often on another continent.
A user in Tokyo making a request to your origin in Virginia faces a minimum 150ms network round-trip just for the speed of light through fiber. Add TLS handshake, server processing, and database queries, and you're looking at 300-500ms for a single dynamic request. Multiply that by the 3-5 sequential API calls a typical page load makes, and your Tokyo users experience 1-2 seconds of latency that your Virginia users never see.
I've seen teams spend months optimizing database queries and application code, shaving off 10ms here and 20ms there, while ignoring the 300ms physics tax on every cross-ocean request. No amount of code optimization fixes the speed of light.
The CDN is right there in Tokyo, 10ms from the user, but it can only serve cached files. Every dynamic request bypasses it entirely. Edge computing changes this: what if the CDN node could also run your application logic?
What Is It?
Edge computing means running application logic at the same physical locations where CDNs serve static content, typically 200-300+ data centers distributed globally. Instead of every dynamic request traveling to a centralized origin, the edge node closest to the user handles it locally.
Think of it like a bank. Traditional web architecture is like a bank with one central office: every customer, no matter which branch they walk into, has to call the central office and wait for an answer. Edge computing puts a teller at every branch who can handle common transactions (verify your ID, check your balance) locally, and only calls the central office for complex operations (wire transfers, loan approvals).
The key insight: the edge handles what it can (auth, routing, personalization), and forwards only what it must to the origin. For many applications, 60-80% of requests can be fully resolved at the edge without ever touching origin.
Edge compute is not CDN caching
A CDN caches static files. Edge compute runs code. They coexist at the same physical locations, and vendors like Cloudflare offer both, but they solve different problems. In an interview, never conflate "adding a CDN" with "adding edge compute." CDN is a caching strategy. Edge compute is a processing strategy.
For your interview: say "edge compute runs application logic at CDN PoPs so we can handle auth, routing, and personalization in 10ms instead of 300ms" and move on.
How It Works
Let's trace a single request through an edge worker from start to finish. A user in São Paulo loads your dashboard.
- DNS resolves to nearest PoP. Your domain uses anycast DNS, so the user's request routes to the São Paulo PoP automatically. Latency: ~5ms.
- TLS terminates at the edge. The edge worker handles the TLS handshake locally, saving one full round-trip (~150ms for users far from origin).
- Edge worker executes. The V8 isolate spins up in under 1ms (no container cold start). The worker runs your middleware logic.
- Auth check. The worker verifies the JWT signature using a cached public key. Invalid tokens get a 401 immediately, never reaching origin.
- Feature flag lookup. The worker reads feature flags from edge KV (~1ms). No origin round-trip needed.
- A/B test assignment. The worker deterministically assigns the user to a cohort based on a hash of their user ID. Sets a cookie, selects the correct variant.
- Decision: edge or origin? If the request can be fully served (auth rejection, cached response, A/B redirect), the worker responds directly. If it needs fresh data, the worker forwards to origin with enriched headers.
- Origin handles complex logic. The origin receives a pre-authenticated, pre-enriched request. It queries the database, runs business logic, returns the response.
- Edge caches the response. If the response is cacheable, the worker stores it at the local PoP for future requests from that region.
```typescript
// Cloudflare Worker: edge middleware for auth + A/B + geolocation
// (verifyJWT, decodeJWT, and hashToPercent are app-provided helpers, not shown here)
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    // 1. Verify JWT at the edge (no origin round-trip for invalid tokens)
    const token = request.headers.get("Authorization")?.replace("Bearer ", "");
    if (!token) return new Response("Unauthorized", { status: 401 });
    const isValid = await verifyJWT(token, env.JWT_PUBLIC_KEY);
    if (!isValid) return new Response("Invalid token", { status: 401 });

    // 2. Read feature flags from edge KV (~1ms)
    const flags = await env.FLAGS_KV.get("feature-flags", "json");

    // 3. Deterministic A/B assignment (no database needed)
    const userId = decodeJWT(token).sub;
    const cohort = hashToPercent(userId) < 50 ? "control" : "variant-a";

    // 4. Geolocation (provided by the edge runtime automatically)
    const country = request.cf?.country || "US";

    // 5. Forward to origin with enriched headers (preserve method, body, and query string)
    const url = new URL(request.url);
    const originReq = new Request(env.ORIGIN_URL + url.pathname + url.search, {
      method: request.method,
      body: request.body,
      headers: {
        ...Object.fromEntries(request.headers),
        "X-User-Id": userId,
        "X-AB-Cohort": cohort,
        "X-Country": country,
        "X-Feature-Flags": JSON.stringify(flags),
      },
    });
    return fetch(originReq);
  },
};
```
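The worker above relies on helpers the snippet doesn't define. Signature verification (`verifyJWT`) is runtime-specific Web Crypto work, but the two pure helpers can be sketched. A minimal version, assuming an FNV-1a hash for bucketing and unverified base64url payload decoding (these are illustrative implementations, not a specific library's):

```typescript
// FNV-1a hash of a string, mapped to 0-99 for cohort bucketing.
// Deterministic: the same user ID always lands in the same bucket,
// so no database lookup is needed to keep assignments stable.
function hashToPercent(input: string): number {
  let hash = 0x811c9dc5; // FNV offset basis (32-bit)
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  return (hash >>> 0) % 100;
}

// Decode a JWT payload WITHOUT verifying the signature -- safe only
// because verifyJWT has already run. base64url -> base64 -> JSON.
function decodeJWT(token: string): { sub: string; [k: string]: unknown } {
  const b64url = token.split(".")[1];
  const b64 = b64url.replace(/-/g, "+").replace(/_/g, "/");
  const padded = b64 + "=".repeat((4 - (b64.length % 4)) % 4);
  return JSON.parse(atob(padded)); // atob exists in Workers and modern Node
}
```

The deterministic hash is what makes edge A/B assignment cheap: cohort membership is a pure function of the user ID, so every PoP computes the same answer with no shared state.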
My recommendation: in an interview, sketch this as "edge middleware" between users and your load balancer, label it with 3-4 things it handles, and draw a single arrow to origin for everything else. That's the whole story.
Key Components
| Component | Role |
|---|---|
| Edge PoP | Physical data center (200-300+ globally). Hosts CDN cache and edge compute side by side. |
| Edge runtime | Execution environment at each PoP: V8 isolates (Cloudflare Workers, Vercel) or containers (Lambda@Edge). |
| Edge worker | Your application code at the PoP. Handles middleware: auth, routing, transformation, caching decisions. |
| Edge KV store | Eventually-consistent key-value store replicated across all PoPs. Reads ~1ms, writes propagate in 10-60s. |
| Durable Objects | Strongly-consistent stateful objects, each pinned to one location. Rate limiters, counters, WebSocket hubs. |
| Edge cache | Per-PoP HTTP cache (separate from static CDN cache). Workers can programmatically cache origin responses. |
| Origin server | Traditional backend in a single region. Complex logic, database writes, strong consistency. |
| Anycast DNS | Routes users to the nearest PoP automatically based on network topology. Zero configuration from the user. |
Types / Variations
Edge runtimes differ significantly in execution model, resource limits, and cold start behavior. This determines what you can actually run at the edge.
| Runtime | Execution Model | Cold Start | Memory | CPU Limit | Regions |
|---|---|---|---|---|---|
| Cloudflare Workers | V8 isolates | 0ms (pre-warmed) | 128MB | 50ms free, 30s paid | 300+ PoPs |
| Lambda@Edge | Containers | 100-500ms | 128MB viewer, 10GB origin | 5s viewer, 30s origin | CloudFront PoPs |
| Deno Deploy | V8 isolates | Near-zero | 512MB | 50ms free, 10min paid | 35 regions |
| Vercel Edge Functions | V8 isolates (Cloudflare) | Near-zero | 128MB | 25ms | Global (Cloudflare) |
| Fastly Compute | WebAssembly | 0ms (AOT compiled) | 128MB | 60s | 80+ PoPs |
The V8 isolate model (Cloudflare, Deno, Vercel) gives near-zero cold starts because isolates share a single process. Lambda@Edge uses containers, which explains the 100-500ms cold start penalty. For latency-sensitive edge work, isolates are almost always the right model.
Interview shortcut: just say Cloudflare Workers
Unless the question specifically asks about AWS, default to Cloudflare Workers as your edge runtime example. It has the widest PoP coverage, the simplest mental model (V8 isolates, 0ms cold start), and the most complete edge data story (KV + Durable Objects + D1). Interviewers rarely care which vendor you pick.
Edge Data Stores
Running code at the edge is only useful if you can access data at the edge. The store you choose determines your consistency guarantees.
| Store | Consistency | Read Latency | Write Latency | Best For |
|---|---|---|---|---|
| Cloudflare KV | Eventually consistent (60s propagation) | ~1ms (local PoP) | ~500ms (central + replicate) | Feature flags, config, sessions |
| Durable Objects | Strongly consistent (single location) | ~50ms (if not co-located) | ~50ms | Rate limiters, counters, WebSocket hubs |
| D1 / Turso | Read replicas (eventual), writes to primary | ~5ms (local replica) | ~100ms (forwarded to primary) | User prefs, small SQL datasets |
| DynamoDB Global Tables | Eventually consistent (cross-region) | ~5ms (local replica) | ~20ms (local, async replicate) | Session state, user profiles |
The rule of thumb: use KV for anything that tolerates 60-second staleness (most things), Durable Objects for coordination that needs exactness, and distributed SQLite when you need queries on small datasets.
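One way to design around KV's 60-second staleness window: treat every flag read as possibly missing (not yet propagated to this PoP) or partial (older schema version), and merge it over safe defaults. A minimal sketch, with hypothetical flag names:

```typescript
// Hypothetical feature-flag shape for this app.
interface Flags {
  newCheckout: boolean;
  searchV2: boolean;
}

// Known-good defaults: what the app does if KV has nothing for this PoP yet.
const DEFAULT_FLAGS: Flags = { newCheckout: false, searchV2: false };

// Merge a (possibly null or partial) KV read over the defaults, so a stale
// or missing read degrades to known-good behavior instead of failing.
function resolveFlags(stored: Partial<Flags> | null): Flags {
  return { ...DEFAULT_FLAGS, ...(stored ?? {}) };
}
```

With this pattern, eventual consistency stops being a correctness problem and becomes a rollout-speed property: a new flag simply takes up to a minute to reach every PoP.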
Trade-offs
| Advantage | Disadvantage |
|---|---|
| Dynamic request latency drops from 150-300ms to 10-50ms | Data at the edge is eventually consistent (seconds to minutes stale) |
| Near-zero cold starts with V8 isolates | Strict resource limits: 128MB RAM, 50ms CPU, no blocking I/O |
| Automatic global deployment to 300+ PoPs | Bugs deploy globally in seconds with no staged rollout by default |
| Pay-per-request pricing scales to zero | Debugging and observability are spread across hundreds of PoPs |
| Offloads 60-80% of requests from origin | Limited runtime APIs: no raw TCP sockets, no filesystem access |
| Reduced origin infrastructure cost | Vendor lock-in (Workers code doesn't port to Lambda@Edge) |
The fundamental tension is latency vs. operational simplicity. Edge computing trades the debuggability and consistency guarantees of a single-region deployment for sub-50ms response times to global users. The closer you move compute to users, the further you move it from your data and your debugging tools.
When to Use It / When to Avoid It
Use edge compute when:
- Auth and token validation: JWT verification at the edge rejects 100% of unauthenticated traffic without origin involvement. This is the single highest-value edge use case.
- Request routing and middleware: URL rewrites, header injection, A/B test assignment, bot detection. Anything that enriches or filters requests before they hit origin.
- Geolocation logic: Redirect users to regional domains, show localized content, enforce geo-restrictions. The edge runtime provides IP geolocation automatically.
- Personalized caching: Serve cached pages with personalized fragments (user name, locale, feature flags) by assembling responses at the edge from cached components.
- Rate limiting: Edge-based rate limiting blocks abuse before it reaches your infrastructure. Durable Objects or distributed counters make this feasible.
- Static site generation with dynamic headers: Add security headers, CORS, CSP to static responses without any origin involvement.
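The rate-limiting case above reduces to a small piece of coordination logic. A framework-free token-bucket sketch of the counter a Durable Object might host (the Durable Object class wrapper and storage calls are omitted; this is just the core algorithm, not Cloudflare's API):

```typescript
// Token bucket: starts with `capacity` tokens, refills at `refillPerSec`,
// and each request consumes one token. A Durable Object would hold one
// instance per client key, giving a strongly-consistent global count.
class TokenBucket {
  private tokens: number;
  private last: number;

  constructor(
    private capacity: number,
    private refillPerSec: number,
    now: number = Date.now(), // injectable clock (ms) for testability
  ) {
    this.tokens = capacity;
    this.last = now;
  }

  // Returns true if the request is allowed, false if rate-limited.
  allow(now: number = Date.now()): boolean {
    const elapsedSec = (now - this.last) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.last = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Because each bucket lives in exactly one Durable Object, every PoP sees the same count, which is what makes edge rate limiting exact rather than approximate.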
Avoid edge compute when:
- Your data lives in one region: If every edge function calls a centralized Postgres in Virginia, you've added edge compute latency on top of the database round-trip. The edge function completes in 2ms, then waits 150ms for the database. Net improvement: zero.
- You need strong consistency: Distributed edge state is eventually consistent by default. Financial transactions, inventory counts, and anything where stale reads cause real harm should stay at origin.
- The logic is CPU-intensive: Image processing, PDF generation, ML inference. Edge workers are optimized for I/O-bound tasks, not computation.
- You need full runtime APIs: Raw TCP connections, filesystem access, native extensions, or heavy dependencies. Edge runtimes are sandboxed.
- Your users are in one region: If 95% of your traffic comes from the US East Coast and your origin is in Virginia, edge compute adds architectural complexity with minimal latency benefit.
If you're unsure whether something belongs at the edge, it probably doesn't. Start with auth and routing, measure the impact, and expand from there.
Real-World Examples
Cloudflare Workers handles millions of requests per second across 300+ PoPs in 100+ countries. Discord uses Workers for rate limiting and request routing, processing every API request through edge middleware before it reaches origin. The 0ms cold start means no user ever waits for an isolate to spin up.
Vercel Edge Functions power Next.js middleware for authentication, redirects, and A/B testing. When you deploy a Next.js app on Vercel, the middleware.ts file runs at the edge by default. Companies like Washington Post and Notion use this to serve personalized content with sub-50ms time-to-first-byte globally.
Netflix Open Connect is a custom edge CDN that handles over 15% of all internet traffic globally. While primarily a caching system for video content (not general-purpose compute), it demonstrates the core principle: placing processing power close to users at ISP locations reduces latency and backbone bandwidth. Netflix stores copies of its content across 17,000+ servers in 6,000+ locations worldwide.
How This Shows Up in Interviews
When to bring it up
Mention edge computing when the interviewer asks about latency optimization for global users, when a CDN alone isn't sufficient for dynamic content, or when you're designing middleware/gateway layers. It fits naturally in auth discussions ("reject invalid tokens at the edge before they reach origin") and A/B testing systems ("assign cohorts at the edge with zero origin cost").
Depth expected at senior/staff level
- Edge vs. CDN distinction: Caching static content vs. running dynamic code. Different purposes, same physical locations.
- What belongs at edge vs. origin: Auth, routing, flags, geo-logic at edge. Complex business logic, writes, strong consistency at origin.
- Edge data stores: KV for config/flags (eventually consistent), Durable Objects for coordination (strongly consistent at one location), distributed SQLite for small queryable datasets.
- Cold start tradeoffs: V8 isolates (0ms) vs. containers (100-500ms) and why this drives runtime choice.
- The hybrid model: Edge handles auth/routing/caching, origin handles business logic and writes. Draw the separation explicitly.
- Blast radius: Bugs deploy to 300+ PoPs in seconds. Canary rollouts and feature-flag gating are essential.
Interview pattern: the edge middleware sketch
Draw your system with an "edge middleware" box between users and your load balancer. Label it with 3-4 things it handles (JWT validation, feature flags, geo routing, bot detection). Draw a single arrow from edge to origin labeled "complex requests only." This shows the interviewer you understand the split without overcomplicating the diagram.
Follow-up Q&A
| Interviewer asks | Strong answer |
|---|---|
| "Why not put everything at the edge?" | "Edge runtimes have strict limits (128MB, 50ms CPU) and edge data is eventually consistent. Complex business logic, database transactions, and CPU-heavy work need origin. The edge handles stateless, fast middleware." |
| "How do you handle auth at the edge?" | "The edge worker verifies JWT signatures using a cached public key. Invalid tokens get a 401 in 10ms. Valid tokens are decoded and forwarded as enriched headers (X-User-Id, X-Roles), so origin trusts the edge's auth decision." |
| "What about edge data consistency?" | "Edge KV is eventually consistent with ~60s propagation. For feature flags and sessions, that's fine. For strong consistency (inventory, balances), keep it at origin or use Durable Objects, which are consistent at one location." |
| "Cold starts at the edge?" | "V8 isolate runtimes like Cloudflare Workers have 0ms cold starts because isolates share a process. Container runtimes like Lambda@Edge have 100-500ms cold starts. For latency-sensitive edge work, isolates are the right choice." |
Quick Recap
- Edge computing runs application code at CDN PoPs, cutting dynamic request latency from 150-300ms to 10-50ms by eliminating trans-oceanic round-trips.
- Best edge use cases are stateless or eventually-consistent: JWT validation, A/B test assignment, geolocation routing, request/response transformation, and rate limiting.
- V8 isolate runtimes (Cloudflare Workers, Deno Deploy, Vercel) have 0ms cold starts with 128MB/50ms limits. Container runtimes (Lambda@Edge) have 100-500ms cold starts but higher resource ceilings.
- Edge data stores trade consistency for locality: KV gives 1ms reads with 60-second propagation, Durable Objects give strong consistency at one location with ~50ms reads, and distributed SQLite gives SQL queries on eventually-consistent replicas.
- The edge-origin hybrid model is the standard: edge handles auth, routing, and caching while origin handles business logic, transactions, and strong-consistency operations.
- Edge deployment blast radius is a real operational risk: code deploys to 300+ PoPs in seconds. Canary rollouts, automated rollback triggers, and feature-flag gating are essential safeguards.
- The decision framework is simple: if the request is stateless or tolerates eventual consistency, and the latency savings matter for your user base, move it to the edge. Everything else stays at origin.
Related Concepts
- CDN: CDN caching is the static-content counterpart to edge compute. Understanding CDNs first makes edge compute a natural extension of the same infrastructure.
- Caching: Edge KV stores and per-PoP caches are caching strategies with specific consistency tradeoffs. The cache-aside pattern applies at the edge layer too.
- Networking: Understanding TCP round-trip time, TLS handshakes, and anycast DNS routing explains why edge compute provides such significant latency improvements for global users.