Drawing the architecture diagram
How to draw a clear, interview-worthy architecture diagram: what components to include, how to label data flows, and how to layer complexity over three rounds.
TL;DR
- Most interview diagrams fail because they dump every component at once with unlabeled arrows. The interviewer can't tell what your system actually does.
- Draw in three rounds: core data path first, then read optimization (cache, replicas, CDN), then write optimization (queues, workers, sharding).
- Every arrow must have a label. "HTTP POST /tweets" is useful. A bare arrow is meaningless.
- Group components into visual layers (Internet, App Tier, Cache Tier, Async Tier, Data Tier) so the interviewer can scan your diagram in seconds.
- Narrate while you draw. Silent drawing wastes interview time and misses the chance to explain trade-offs in real time.
The Diagram Nobody Can Read
You're 15 minutes into a "Design Twitter" interview. Your whiteboard has 14 boxes, 22 arrows, and zero labels. The interviewer squints at it, tilts their head, and asks: "So... what talks to what here?"
You've drawn a diagram that makes sense to you (because you just drew it) but communicates nothing to the person evaluating you. I've reviewed hundreds of mock interviews, and this is the single most common failure mode. The candidate's architecture might be perfectly sound, but the diagram is a wall of spaghetti that nobody can follow.
The fix is a layering strategy. You don't draw everything at once. You build the diagram in three deliberate rounds, narrating each one. By the end, your diagram is complex but readable, because the interviewer watched it evolve and understood each addition.
Think of it like explaining directions. You wouldn't dump a complete route with every turn, highway merge, and landmark at once. You'd say: "First, get on the highway heading north. Then take exit 15. Then turn left at the intersection." Each step builds on the last. Your architecture diagram should work the same way.
Unlabeled arrows are invisible architecture
An arrow without a label carries zero information. The interviewer doesn't know if it's an HTTP request, a Kafka message, a database query, or a WebSocket push. Label every single edge with the operation name, HTTP verb, or data flow description.
Why Most Diagrams Fail
Before we fix diagrams, let's name the actual failure modes. I see four consistently.
No labels on arrows. This is the most common and most damaging. Your diagram has a line from "App Server" to "Database" but doesn't say whether it's a read, a write, a transaction, or a health check. The interviewer has to ask, which wastes time and signals that you don't think about data flows precisely.
Everything at once. The candidate draws load balancers, app servers, two databases, a cache, a message queue, three workers, a CDN, and a monitoring service in the first two minutes. There's no story, no progression, and no clear indication of what's core versus what's optimization.
No visual grouping. Components are scattered randomly. The database is next to the CDN. The cache is below the client. The interviewer can't see which components form a layer or which path data takes through the system.
Missing components that matter. The candidate draws app servers and a database but forgets the load balancer. Or draws a cache but never shows a cache-miss path. Or has a queue but no workers consuming from it. Incomplete paths leave holes the interviewer will probe, and they will.
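If you practice with diagrams written down as data, you can check for these holes mechanically. The sketch below is a rough heuristic, not a standard tool: the `kinds` mapping, the `(source, target, label)` edge tuples, and the convention that a cache-miss edge has "miss" in its label are all assumptions made for illustration.

```python
def find_gaps(kinds, edges):
    """Flag queues nobody consumes from and caches with no miss path.

    kinds: node name -> "queue", "cache", or anything else (hypothetical schema).
    edges: (source, target, label) tuples; a queue -> worker edge means "consume".
    """
    gaps = []
    for node, kind in kinds.items():
        if kind == "queue" and not any(src == node for src, _dst, _label in edges):
            gaps.append(f"{node}: nothing consumes from this queue")
        if kind == "cache":
            # Whoever reads the cache should also have an edge for the miss case.
            readers = {src for src, dst, _label in edges if dst == node}
            has_miss_path = any(src in readers and "miss" in label.lower()
                                for src, _dst, label in edges)
            if not has_miss_path:
                gaps.append(f"{node}: no cache-miss path drawn")
    return gaps

kinds = {"Kafka": "queue", "Redis": "cache"}
edges = [
    ("App Server", "Redis", "cache read"),
    ("App Server", "Kafka", "publish fan-out event"),
    ("App Server", "Primary DB", "INSERT tweet"),
]
for gap in find_gaps(kinds, edges):
    print(gap)
```

Adding a `("Kafka", "Fan-out Workers", "consume event")` edge and an `("App Server", "Read Replica", "SELECT on cache miss")` edge clears both warnings.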
For your interview: the bar is not a beautiful diagram. The bar is a diagram where someone who has never seen your design can trace the read path and write path in under 10 seconds.
The Three-Round Layering Approach
Here's the core technique. You draw your architecture in three rounds, each adding a specific type of complexity. This gives the interviewer a clear narrative arc and proves you know when to add each component.
Round 1: The Core Data Path
This is the happy path with zero optimization. Client sends a request, it reaches your application, your application reads from or writes to a database. That's it.
The goal of Round 1 is to establish the basic functional contract. What does the system do? What's the write path? What's the read path? You're not thinking about scale yet. You're thinking about correctness.
I tell every candidate: if your Round 1 diagram has more than 5 boxes, you're overcomplicating the start. The components you need are: client, load balancer, application server(s), and primary database. Nothing else.
Say this out loud while drawing: "The client sends POST /tweets to create a tweet and GET /timeline to read their feed. The load balancer distributes requests to stateless app servers. Each app server reads from and writes to a single PostgreSQL primary. This handles the basic functional requirement but obviously won't scale."
That last sentence is key. You're signaling that Round 2 is coming.
Round 2: Read Optimization
Now you identify the read bottleneck and address it. For most systems, reads vastly outnumber writes (often 100:1 or higher). Your single database can't handle the read volume at scale, so you add caching and read replicas.
The components you add in Round 2: a cache layer (Redis or Memcached), read replicas for the database, and potentially a CDN for static content.
Narrate the addition: "Reads dominate our traffic, so I'm adding a Redis cache in front of the database. For timeline reads, the app server checks Redis first. On a cache miss, it falls back to a read replica. Writes still go to the primary, and replicas receive updates via async replication. I'm also putting static content behind a CDN to offload image serving entirely."
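The read path in that narration is the cache-aside pattern. Here is a minimal sketch of it, assuming plain dictionaries as stand-ins for Redis and the read replica; `get_timeline`, `stats`, and the sample data are hypothetical names, not part of any real client library.

```python
# Hypothetical in-memory stand-ins: one dict for Redis, one for the read replica.
cache: dict[str, list[str]] = {}
replica: dict[str, list[str]] = {"alice": ["tweet-1", "tweet-2"]}
stats = {"replica_reads": 0}

def get_timeline(user_id: str) -> list[str]:
    """Cache-aside read: check the cache first, fall back to the replica on a miss."""
    if user_id in cache:                  # cache hit: no database work at all
        return cache[user_id]
    stats["replica_reads"] += 1           # cache miss: query the read replica
    timeline = replica.get(user_id, [])
    cache[user_id] = timeline             # populate the cache for the next read
    return timeline

get_timeline("alice")   # first read misses and goes to the replica
get_timeline("alice")   # second read is served from the cache
print(stats["replica_reads"])
```

Each branch in the function corresponds to one labeled arrow on the diagram: "cache read", "SELECT on cache miss", and "populate cache".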
My recommendation: always state the expected cache hit rate when you add caching. "With a 95% cache hit rate on timeline reads, we reduce database load by 20x" is dramatically more credible than just "we add a cache."
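The 20x figure is simple arithmetic: if a fraction h of reads is served from cache, only 1 - h reach the database, so database reads drop by a factor of 1 / (1 - h). A small sketch for sanity-checking the number you quote (the function name is my own, and it assumes every miss costs exactly one database read):

```python
def db_load_reduction(hit_rate: float) -> float:
    """Factor by which database reads drop when `hit_rate` of reads hit the cache."""
    if not 0.0 <= hit_rate < 1.0:
        raise ValueError("hit_rate must be in [0, 1)")
    return 1.0 / (1.0 - hit_rate)

for rate in (0.80, 0.90, 0.95, 0.99):
    print(f"{rate:.0%} hit rate -> about {db_load_reduction(rate):.0f}x fewer database reads")
```

Notice how steep the curve is near the top: going from 95% to 99% cuts the remaining database reads by another factor of five, which is why the hit rate is worth stating out loud.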
Round 3: Write Optimization and Async Processing
Now you address the write path. For a system like Twitter, the bottleneck isn't writing the tweet itself (that's one INSERT). The bottleneck is fan-out: delivering that tweet to every follower's timeline.
The components you add in Round 3: a message queue (Kafka, SQS), background workers, and potentially write sharding.
The narration: "When a user tweets, the app server writes the tweet to the primary DB and publishes a fan-out event to Kafka. The client gets a 200 OK as soon as Kafka acknowledges, so the write path is fast. Fan-out workers then read the event, look up the author's followers, and write the tweet to each follower's timeline cache in Redis. This makes the write path asynchronous, so even a user with 100K followers doesn't block the API response."
Interview tip: name the boundary between sync and async
When you introduce a message queue, always say where the synchronous boundary ends. "The client gets a 200 after Kafka ACK. Everything after that is async." This tells the interviewer you understand the latency contract.
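To make the boundary concrete, here is a toy sketch of the write path, assuming Python's in-process `queue.Queue` as a stand-in for a Kafka topic and a thread as the fan-out worker. The function names, the follower graph, and the data stores are all hypothetical.

```python
import queue
import threading

tweets_db: list[dict] = []                   # stand-in for the primary DB
timeline_cache: dict[str, list[str]] = {}    # stand-in for Redis timelines
followers = {"alice": ["bob", "carol"]}      # hypothetical follower graph
fanout_queue: queue.Queue = queue.Queue()    # stand-in for the Kafka topic

def post_tweet(author: str, text: str) -> int:
    """Synchronous part: persist, enqueue, respond."""
    tweets_db.append({"author": author, "text": text})    # INSERT tweet
    fanout_queue.put({"author": author, "text": text})    # publish fan-out event
    return 200   # sync boundary: the client's wait ends here

def fanout_worker() -> None:
    """Asynchronous part: work proportional to follower count, off the request path."""
    while True:
        event = fanout_queue.get()
        for follower in followers.get(event["author"], []):
            timeline_cache.setdefault(follower, []).append(event["text"])
        fanout_queue.task_done()

threading.Thread(target=fanout_worker, daemon=True).start()
status = post_tweet("alice", "hello")
fanout_queue.join()   # only so this demo can inspect the result; a client never waits here
print(status, timeline_cache)
```

Everything inside `post_tweet` is what the client waits for. Everything inside `fanout_worker` happens after the response, which is exactly the line you want to point at on the whiteboard.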
Labeling Rules That Actually Matter
Every arrow in your diagram needs a label. But not all labels are equally useful. Here's what makes a good label versus a bad one.
Good labels describe the operation or data flowing across the edge:
- POST /tweets (HTTP verb + path)
- INSERT tweet (database operation)
- Publish fan-out event (message queue operation)
- Cache read / Cache miss (cache operation with outcome)
- Async replication (replication type)
Bad labels are vague or redundant:
- request (what kind of request?)
- data (what data?)
- sends to (the arrow already shows direction)
- connects (meaningless)
I use a simple rule: if someone covers up the boxes at each end of an arrow, can they still understand what's happening from the label alone? If not, the label needs work.
For your interview: you don't need to label every arrow with full HTTP path details. At minimum, label with the operation type (read, write, publish, consume, replicate). Upgrade to HTTP verb + path when the specific endpoint matters for your design discussion.
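If you keep practice diagrams as text, the labeling rule is easy to lint. This is a small sketch under my own assumptions: edges are `(source, target, label)` tuples, and the vague-word list is just the bad examples from above plus the empty label.

```python
VAGUE_LABELS = {"", "request", "data", "sends to", "connects"}

def vague_edges(edges):
    """Return the (source, target) pairs whose label is missing or vague."""
    return [(src, dst) for src, dst, label in edges
            if label.strip().lower() in VAGUE_LABELS]

edges = [
    ("Client", "App Server", "POST /tweets"),
    ("App Server", "Primary DB", "INSERT tweet"),
    ("App Server", "Kafka", "data"),
    ("App Server", "Redis", ""),
]
print(vague_edges(edges))
```

The two flagged edges are the ones an interviewer would have to ask about: what goes to Kafka, and is the Redis arrow a read or a write?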
Visual Grouping with Subgraph Layers
The most readable diagrams group related components into named layers. This isn't just aesthetic preference. It communicates architectural thinking.
Standard layers for most web applications:
| Layer | Contains | Purpose |
|---|---|---|
| 🌐 Internet | Clients, CDN | Traffic sources, edge caching |
| ⚙️ App Tier | Load balancer, app servers | Stateless request handling |
| ⚡ Cache Tier | Redis, Memcached | Hot data, session storage |
| 📨 Async Tier | Kafka, workers | Background processing |
| 🗄️ Data Tier | Primary DB, replicas | Persistent storage |
When you group components this way, the interviewer immediately sees: requests flow top to bottom, each tier can scale independently, and the boundary between synchronous and asynchronous processing is visible.
I've noticed that candidates who use visual layers get fewer clarifying questions from interviewers. The diagram speaks for itself.
Not every system needs all five layers. A simple CRUD application might only need Internet, App Tier, and Data Tier. That's fine. The point is not to have all layers, but to group whatever components you have into coherent tiers. Resist the temptation to add a Cache Tier just because the template has one.
The other benefit of layers: they make scaling discussions easier. "We'll horizontally scale the App Tier" is clearer when the App Tier is visually distinct. "We'll add replicas to the Data Tier" makes sense when the Data Tier is its own group. Layers turn scaling from an abstract discussion into a visual one.
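For practicing away from a whiteboard, one option is to describe the layers as data and generate diagram text from it. The sketch below assumes Mermaid flowchart syntax as the target (`subgraph` blocks and `-->|label|` edges); the node ids, the layer contents, and the `to_mermaid` helper are illustrative choices of mine, not a required layout.

```python
LAYERS = {
    "Internet": ["client", "cdn"],
    "App Tier": ["lb", "app"],
    "Cache Tier": ["redis"],
    "Data Tier": ["primary", "replica"],
}
EDGES = [
    ("client", "lb", "GET /timeline"),
    ("lb", "app", "GET /timeline"),
    ("app", "redis", "cache read"),
    ("app", "replica", "SELECT on cache miss"),
    ("primary", "replica", "async replication"),
]

def to_mermaid(layers, edges):
    """Emit one subgraph per layer, then every edge with its label."""
    lines = ["flowchart TB"]
    for i, (name, nodes) in enumerate(layers.items()):
        lines.append(f'  subgraph layer{i}["{name}"]')
        lines.extend(f"    {node}" for node in nodes)
        lines.append("  end")
    for src, dst, label in edges:
        lines.append(f'  {src} -->|"{label}"| {dst}')
    return "\n".join(lines)

print(to_mermaid(LAYERS, EDGES))
```

Because the layers are declared top to bottom, the rendered diagram keeps the same reading order you would use on a whiteboard, and dropping a layer is a one-line change.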
Tracing Read and Write Paths Separately
Here's a technique that separates strong candidates from average ones: explicitly call out the read path and the write path as separate flows through your diagram.
Most systems have dramatically different read and write characteristics. Twitter's write path (create a tweet) goes through the app server, the primary DB, Kafka, and fan-out workers. The read path (load timeline) goes through the app server, Redis cache, and maybe a read replica on cache miss.
When you narrate, trace each path separately:
Write path (tweet creation): "Client sends POST /tweets to the app server. The server validates, writes to the primary DB, and publishes to Kafka. Client gets 200. Async workers fan out to followers' timeline caches."
Read path (timeline load): "Client sends GET /timeline. App server checks Redis cache. On hit (95% of the time), returns immediately. On miss, queries the read replica, populates cache, returns."
This separated narration proves you understand that read-heavy systems need different optimization strategies than write-heavy systems. It's one of the clearest signals of architectural maturity.
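One way to rehearse this is to write each path down as an ordered list of labeled edges and read it back. A minimal sketch, assuming the component names from the Twitter example; the tuple layout and the helper functions are hypothetical.

```python
# (path, step, source, target, label): the two narrations above, as data.
EDGES = [
    ("write", 1, "Client", "App Server", "POST /tweets"),
    ("write", 2, "App Server", "Primary DB", "INSERT tweet"),
    ("write", 3, "App Server", "Kafka", "publish fan-out event"),
    ("write", 4, "Kafka", "Fan-out Workers", "consume event"),
    ("write", 5, "Fan-out Workers", "Redis", "write follower timelines"),
    ("read", 1, "Client", "App Server", "GET /timeline"),
    ("read", 2, "App Server", "Redis", "cache read"),
    ("read", 3, "App Server", "Read Replica", "SELECT on cache miss"),
]

def trace(path):
    """Return the steps of one path, in order, as readable lines."""
    steps = sorted((e for e in EDGES if e[0] == path), key=lambda e: e[1])
    return [f"{src} -> {dst}: {label}" for _path, _step, src, dst, label in steps]

def components(path):
    """Return the set of components that one path touches."""
    return {name for p, _step, src, dst, _label in EDGES if p == path
            for name in (src, dst)}

for path in ("write", "read"):
    print(f"{path} path:")
    for line in trace(path):
        print("  " + line)
print("only on the write path:", sorted(components("write") - components("read")))
```

The set difference at the end is the point of the exercise: it names the components that exist purely for one path, which is where your optimization story for that path lives.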
Narrating While You Draw
Never draw in silence. The diagram is only half the deliverable. The narration is the other half.
Here's why: when you draw silently, the interviewer watches a diagram appear piece by piece with no context. They don't know if the next box is a database or a cache until you label it. They're trying to reverse-engineer your thought process from the drawing order alone.
When you narrate, you control the story. The interviewer understands each component as you add it, hears your reasoning, and can ask targeted follow-ups. This makes the whole conversation more productive and collaborative.
My approach: speak in "I'm adding X because Y" sentences.
- "I'm adding a load balancer here because our app servers are stateless and we want horizontal scalability."
- "I'm putting Redis in front of the database because timeline reads are 100:1 relative to writes, and I want sub-millisecond cache reads."
- "I'm introducing Kafka here because fan-out to 100K followers is too slow to do synchronously. The client shouldn't wait for that."
Each sentence has the component (what) and the justification (why). That's the pattern.
If you can only remember one rule about narration, make it that one: what you're adding and why you're adding it. Everything else is refinement.
Interview tip: pause after Round 1
After drawing the core data path, stop and ask: "Does this capture the core flow? Should I focus on scaling the read path or the write path first?" This shows you're collaborative and gives the interviewer a chance to steer the discussion toward what they care about.
Common Diagram Mistakes
Spaghetti lines. Arrows cross in every direction, making the data flow impossible to trace. Fix: organize components in layers so data flows primarily top-to-bottom or left-to-right. Avoid crossing arrows by rearranging component positions within their layer.
The magic app server. A single "App Server" box that apparently handles routing, authentication, business logic, caching, and database access. Fix: if you're discussing a specific subsystem (like fan-out), break the app server into its relevant subcomponents. Otherwise, label what the app server does at each arrow.
One giant database. A single "DB" box that stores users, tweets, timelines, social graphs, and analytics. Fix: label what's in the database (at minimum, name the key tables), or split into separate stores when they have different access patterns.
Unidirectional arrows only. Every arrow goes from client to server to database. No responses shown. Fix: you don't need to show every response arrow, but for async flows where the response path is different from the request path, make the return path explicit.
Over-precision too early. Drawing internal subsystems of your cache or detailed Kafka partition assignments before you've established the overall flow. Fix: get the high-level architecture right first. Zoom into subsystems only when asked, or when you're specifically discussing a deep-dive component.
No failure paths. The diagram shows only the happy path with no indication of what happens when a component fails. Fix: be ready to add fallback arrows (cache miss path, circuit breaker, retry path) when asked "what if X goes down?"
Identical boxes with no differentiation. Three "Service" boxes that all look the same and connect to the same database. If your services are truly different (User Service, Tweet Service, Timeline Service), label them differently and show how they handle distinct data flows. If they're identical stateless instances behind a load balancer, say so explicitly.
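For the missing failure paths in particular, it helps to know what the fallback arrow means in code. One common option, sketched here under my own assumptions, is to treat a cache outage like a cache miss and read from the replica instead. `CacheDown`, `read_timeline`, and the injected `cache_get` / `replica_get` callables are hypothetical names, not a real Redis client API.

```python
class CacheDown(Exception):
    """Raised by the hypothetical cache client when Redis is unreachable."""

def read_timeline(user_id, cache_get, replica_get):
    """Return (timeline, source), falling back to the replica on a miss or an outage."""
    try:
        cached = cache_get(user_id)
        if cached is not None:
            return cached, "cache"
    except CacheDown:
        pass   # fallback arrow: treat the outage like a cache miss
    return replica_get(user_id), "replica"

def broken_cache(user_id):
    raise CacheDown("redis unreachable")

print(read_timeline("alice", broken_cache, lambda uid: ["tweet-1"]))
```

Be ready for the follow-up: with the cache gone, the replicas see the full read load rather than only the misses, so the fallback only holds if the Data Tier has headroom or you shed load.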
The 10-second scan test
After finishing your diagram, step back and ask: "Can someone who didn't draw this trace the read path in 10 seconds?" If arrows are crossing, labels are missing, or there's no clear top-to-bottom flow, simplify before adding more complexity.
How This Shows Up in Interviews
"Walk me through your diagram." The interviewer wants you to trace the read and write paths separately. They want operation names on arrows, not vague "sends data" labels. Practice this narration out loud.
"Can you make this more concrete?" Your diagram is too abstract. Add specific technology choices (PostgreSQL, Redis, Kafka), operation labels (POST /tweets, INSERT INTO tweets), and capacity hints (95% cache hit rate).
"What happens if Redis goes down?" This tests whether you've thought about failure paths. If your diagram only shows the happy path, you need to be ready to draw a fallback arrow (app server falls back to read replica on cache miss).
"Why did you add the queue here?" The interviewer is testing whether you added the component deliberately or reflexively. Your answer should reference a specific bottleneck: "Because fan-out to N followers is O(N) work that would block the API response if done synchronously."
"This seems complex for this scale." You've over-engineered. Round 1 might have been sufficient for the stated requirements. Always calibrate your architecture to the stated scale. If the interviewer said "100 users," you probably don't need Kafka.
Quick Recap
- Draw in three rounds: core data path (5 boxes max), read optimization (cache, replicas, CDN), write optimization (queues, workers, sharding). This progression is the framework.
- Label every arrow with the operation, HTTP verb, or data description. An unlabeled arrow communicates nothing and forces the interviewer to ask clarifying questions.
- Group components into named layers (Internet, App Tier, Cache Tier, Async Tier, Data Tier). Visual grouping makes complex diagrams scannable.
- Narrate while you draw using "I'm adding X because Y" sentences. The narration is as important as the diagram itself.
- Trace read and write paths separately. They go through different components in any non-trivial system, and calling them out separately shows architectural maturity.
- After Round 1, pause and ask the interviewer where to focus. This is collaborative and prevents you from optimizing the wrong thing.
- Be ready for failure-path questions. "What if Redis goes down?" should prompt a fallback arrow, not a blank stare.