File Downloader
Design a production-grade file download service: walk through pre-signed URLs, HTTP Range requests, parallel multipart downloads, CDN offloading, and pause-resume state management across 10M concurrent clients.
What is a file download service?
A file download service delivers files from server storage to clients reliably and efficiently. The simplicity is deceptive: the real engineering challenge is serving arbitrary byte ranges from multi-gigabyte files concurrently across millions of connections, without flooding your origin storage or losing a 5 GB resume state when a mobile client drops off the network for 30 minutes. This question tests CDN architecture, HTTP Range requests, connection management, distributed session state, and how to separate the control plane (session management) from the data plane (moving bytes).
Functional Requirements
Core Requirements
- Users can initiate file downloads from web or mobile clients.
- Downloads can be paused and resumed; state persists across client restarts.
- The client reports real-time download progress (bytes received out of total).
- The system handles files from 1 KB to 50 GB.
Below the Line (out of scope)
- File upload pipeline
- Per-file digital rights management (DRM)
- Per-user download speed throttling
- Download scheduling and queue prioritization
The hardest part in scope: Serving byte-range requests for 50 GB files across 10 M concurrent connections without proxying bytes through application servers. The challenge sits at the intersection of CDN edge caching, Range request semantics, and distributed session state for pause-and-resume.
File upload is below the line because it is an independent write path with different reliability constraints (chunked upload, deduplication, virus scanning) that do not influence the download architecture.
DRM is below the line because it requires license server integration, per-device key management, and encrypted segment delivery. To add it, I would generate a time-limited DRM token alongside the pre-signed download URL and configure the CDN to enforce token validity before serving each segment.
Speed throttling is below the line because it adds significant CDN configuration complexity. To add it, I would configure per-client token bucket rate limits at the CDN edge. That configuration layer sits above the download path we are designing here.
Non-Functional Requirements
Core Requirements
- Availability: 99.99% uptime. Favor availability over strict consistency: a slightly stale progress count is far better than an unavailable download endpoint.
- Latency: First byte under 200 ms p99 for CDN-cached files; under 1 s for origin-fetched cold files.
- Throughput: 10 M concurrent downloads sustained; p99 single-client transfer speed at or above 50 Mbps.
- Integrity: Every delivered file must SHA-256-match the server-recorded checksum. Partial delivery must also be verifiable per chunk.
- Range support: Server must respond correctly to any Range: bytes=X-Y request within a file, enabling pause-resume and parallel multi-part download.
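The integrity and Range requirements above meet at the chunk level: a client that fetches byte ranges in parallel can verify each one as it lands. The following is a minimal sketch of that idea, not the article's implementation. The 8 MiB chunk size and all function names are assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 8 * 1024 * 1024  # assumed chunk size; the article does not specify one


def chunk_ranges(total_bytes: int, chunk_size: int = CHUNK_SIZE) -> list:
    """Split a file into inclusive (start, end) byte ranges, matching Range header semantics."""
    return [
        (start, min(start + chunk_size, total_bytes) - 1)
        for start in range(0, total_bytes, chunk_size)
    ]


def chunk_checksums(data: bytes, chunk_size: int = CHUNK_SIZE) -> list:
    """SHA-256 hex digest of each chunk, in chunk order (what the server would record)."""
    return [
        hashlib.sha256(data[start:end + 1]).hexdigest()
        for start, end in chunk_ranges(len(data), chunk_size)
    ]


def verify_chunk(chunk: bytes, expected_hex: str) -> bool:
    """Client-side check of one downloaded range against its recorded digest."""
    return hashlib.sha256(chunk).hexdigest() == expected_hex
```

Each (start, end) pair maps directly onto a Range: bytes=start-end request, so a failed verification only forces a re-fetch of that one range rather than the whole file.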
Below the Line
- Geographic IP-based content restrictions
- Per-user bandwidth billing and metered egress quotas
- Real-time download analytics (top files, geographic heat maps)
Read/write ratio: For popular software packages or media files, the read-to-write ratio reaches 1,000,000:1 or more. A single OS release uploaded once can be downloaded by tens of millions of clients over days. The entire design is a read optimization problem. Every architectural decision traces back to this skew: how do we move bytes from origin to client without the origin seeing most of the traffic?
Under 200 ms first-byte latency for cached files demands CDN edge placement within one network hop of the client. The 10 M concurrent download target rules out proxying bytes through application servers entirely: even ignoring bandwidth, each open TCP connection parks a goroutine or thread, and a fleet that holds 10 M of them open is not one you want to maintain.
Core Entities
- File: The downloadable artifact. Carries file_id, name, size_bytes, content_type, storage_key (S3 object key), checksum (SHA-256 of full file), and created_at.
- DownloadSession: Tracks pause-and-resume state. Carries session_id, file_id, client_id, bytes_confirmed (last checkpointed byte offset), total_bytes, status (active, paused, completed, expired), download_url, and expires_at.
- FileChunk: Addressable sub-range of a large file (introduced in the integrity deep dive). Carries chunk_index, file_id, start_byte, end_byte, and a per-chunk checksum.
Full schema with indexes and access patterns is deferred to the data model section of the deep dives. These three entities are sufficient to drive the API design and High-Level Design from here.
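As a rough illustration, the three entities could be written down as plain data classes. The field names come from the list above. The Python types, the hex encoding of checksums, and the inclusive end_byte are assumptions, since the full schema is deferred.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum


class SessionStatus(str, Enum):
    ACTIVE = "active"
    PAUSED = "paused"
    COMPLETED = "completed"
    EXPIRED = "expired"


@dataclass(frozen=True)
class File:
    file_id: str
    name: str
    size_bytes: int
    content_type: str
    storage_key: str      # S3 object key
    checksum: str         # SHA-256 of the full file (hex encoding is an assumption)
    created_at: datetime


@dataclass
class DownloadSession:
    session_id: str
    file_id: str
    client_id: str
    bytes_confirmed: int  # last checkpointed byte offset
    total_bytes: int
    status: SessionStatus
    download_url: str
    expires_at: datetime


@dataclass(frozen=True)
class FileChunk:
    chunk_index: int
    file_id: str
    start_byte: int
    end_byte: int         # assumed inclusive, to match Range header semantics
    checksum: str         # per-chunk SHA-256
```

File and FileChunk are frozen here because they describe immutable stored bytes, while DownloadSession is the only entity that changes over a download's lifetime.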
API Design
FR 1 - Initiate a download:
The naive instinct is a direct GET that streams the full file:
GET /files/{file_id}
Response: 200 OK, Content-Type: application/octet-stream, binary body
This breaks immediately at scale: the API server proxies all bytes, consuming one thread or goroutine per active connection. At 10 M concurrent downloads, that is 10 M goroutines before accounting for bandwidth.
The evolved shape separates intent from delivery. The client tells the API what it wants; the API returns a URL pointing directly to the CDN:
POST /downloads
Body: { file_id, client_id? }
Response: { session_id, download_url, file_size, checksum, content_type, expires_at }
The download_url is a pre-signed URL pointing to the CDN or S3 directly. The client fetches from that URL without touching the application server again:
GET {download_url}
Headers: Range: bytes=104857600- (optional; omit for full file)
Response (no Range header): 200 OK
  Accept-Ranges: bytes
  Content-Disposition: attachment; filename="large-file.zip"
  Content-Length: 10737418240
Response (Range header present): 206 Partial Content
  Content-Range: bytes 104857600-10737418239/10737418240
  Accept-Ranges: bytes
POST over GET for session creation: creating a DownloadSession is a write with a persistent side effect. POST is correct. The pre-signed URL encodes the authorization; the CDN validates it at the edge on each range request without calling back to the application server.
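To make the 200 versus 206 behavior concrete, here is a small sketch of how a server or edge node might resolve a single-range header into a status code and response headers. In this design the CDN handles this natively, so the function is purely illustrative. Its name, and the choice to fall back to a full 200 response for anything it cannot parse (including multi-range requests), are assumptions.

```python
import re
from typing import Optional

_RANGE_RE = re.compile(r"^bytes=(\d*)-(\d*)$")


def resolve_range(range_header: Optional[str], size: int):
    """Return (status, headers) for a request against a file of `size` bytes."""
    base = {"Accept-Ranges": "bytes"}
    full = (200, {**base, "Content-Length": str(size)})

    match = _RANGE_RE.match(range_header.strip()) if range_header else None
    if match is None or match.groups() == ("", ""):
        # No Range header, or one this sketch does not understand: send the full body.
        return full

    first, last = match.groups()
    if first == "":
        # Suffix form "bytes=-N": the final N bytes of the file.
        start, end = max(size - int(last), 0) if int(last) else size, size - 1
    else:
        start = int(first)
        if last and int(last) < start:
            return full  # invalid spec (last < first): ignore the header
        end = min(int(last), size - 1) if last else size - 1

    if start >= size:
        # Nothing in the file overlaps the requested range.
        return 416, {**base, "Content-Range": f"bytes */{size}"}

    return 206, {
        **base,
        "Content-Range": f"bytes {start}-{end}/{size}",
        "Content-Length": str(end - start + 1),
    }
```

For example, resolve_range("bytes=104857600-", 10737418240) yields a 206 with Content-Range: bytes 104857600-10737418239/10737418240, matching the response shown above.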
FR 2 - Pause and resume:
PATCH /downloads/{session_id}
Body: { status: "paused" | "resumed", bytes_confirmed: 104857600 }
Response: { session_id, status, bytes_confirmed, download_url }
bytes_confirmed is the number of bytes the client has fully received and verified, which is also the zero-based offset of the next byte it needs. On resume, the client issues a Range: bytes={bytes_confirmed}- request against the download_url.
GET /downloads/{session_id}
Response: { session_id, file_id, status, bytes_confirmed, total_bytes, download_url, expires_at }
GET fetches current session state when a client restarts and needs to know where it left off.
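The pause and resume flow can be sketched as two small functions: one that applies a PATCH body to stored session state, and one that derives the Range header for the next request. This is a hypothetical sketch. The function names, the mapping of "resumed" onto the stored "active" status, and the rule that bytes_confirmed never moves backwards are assumptions, not part of the API described above.

```python
from typing import Optional


def apply_checkpoint(session: dict, status: str, bytes_confirmed: int) -> dict:
    """Apply a PATCH /downloads/{session_id} body and return the new session state."""
    if status not in ("paused", "resumed"):
        raise ValueError(f"unsupported status: {status!r}")
    total = session["total_bytes"]
    if not 0 <= bytes_confirmed <= total:
        raise ValueError("bytes_confirmed out of range")

    # Assumption: a late or duplicated PATCH must not erase progress already recorded.
    confirmed = max(session["bytes_confirmed"], bytes_confirmed)

    if confirmed == total:
        new_status = "completed"
    elif status == "paused":
        new_status = "paused"
    else:
        new_status = "active"  # "resumed" in the request maps to "active" in storage

    return {**session, "bytes_confirmed": confirmed, "status": new_status}


def resume_range_header(session: dict) -> Optional[str]:
    """Range header value for the next request, or None when nothing is left to fetch."""
    if session["bytes_confirmed"] >= session["total_bytes"]:
        return None
    return f"bytes={session['bytes_confirmed']}-"
```

A client restarting after a crash would call GET /downloads/{session_id}, pass the returned state to resume_range_header, and send the result against download_url.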
FR 3 - Progress:
Progress is primarily tracked client-side: the download client counts bytes received in memory and renders a progress bar locally. Server-side progress is useful for server-to-server transfers or multi-device syncing:
GET /downloads/{session_id}/progress
Response: { bytes_confirmed, total_bytes, percentage, status, transfer_rate_bps }
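One way the progress response might be assembled on the server is shown below. The sliding-window inputs (window_bytes, window_seconds) are an assumption about how transfer_rate_bps would be measured; the article does not specify it.

```python
def progress_response(bytes_confirmed: int, total_bytes: int, status: str,
                      window_bytes: int, window_seconds: float) -> dict:
    """Build the GET /downloads/{session_id}/progress body.

    window_bytes / window_seconds describe the bytes confirmed during a recent
    measurement window (hypothetical inputs used to derive the transfer rate).
    """
    percentage = round(100 * bytes_confirmed / total_bytes, 2) if total_bytes else 100.0
    # Rate is reported in bits per second, so convert bytes to bits.
    rate_bps = int(window_bytes * 8 / window_seconds) if window_seconds > 0 else 0
    return {
        "bytes_confirmed": bytes_confirmed,
        "total_bytes": total_bytes,
        "percentage": percentage,
        "status": status,
        "transfer_rate_bps": rate_bps,
    }
```

Because bytes_confirmed only advances at checkpoints, this server-side view will lag the client's local byte counter, which is consistent with the availability-over-consistency stance in the non-functional requirements.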
Authorization note: All endpoints assume a client token in the Authorization header. The pre-signed download URL embeds authorization in its HMAC signature; the CDN validates it without a round-trip to the API server. If authentication is added to scope, I would associate each session with a user_id from the session token and enforce it on POST /downloads.
High-Level Design
1. Initiating a download - the naive proxy path
The simplest possible design streams bytes through the API server itself.
Components:
- Client: Web or mobile app sending POST /downloads, then fetching the file.
- API Server: Creates a session, fetches the file from S3, and streams bytes back to the client.
- Sessions DB: Stores download session rows (session_id, file_id, bytes_confirmed, status).
- S3 / Object Storage: Stores the raw file bytes.
Request walkthrough:
- Client POSTs to /downloads with file_id.
- API Server creates a session row in the Sessions DB.
- API Server opens a GET to S3 for the full file.
- API Server streams S3 response bytes directly back to the client.
- Client accumulates bytes to disk.
This is the write path and byte delivery path combined. The API server here carries all download bandwidth.
What breaks: At 10 M concurrent downloads at 50 Mbps each, the API tier must sustain 500 Tbps of egress. No application server fleet handles that at reasonable cost. Each connection also parks a goroutine for the full duration of a download, which can run for hours on large files. And because nothing caches between S3 and the client, every download triggers a full read from S3, billed per request and per GB of egress.
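The numbers behind this failure mode are easy to check. The sketch below reproduces the aggregate egress figure and the duration of a single 50 GB download at the target speed. The 100 Gbps per-server NIC capacity is an assumed figure used only to give a sense of fleet size.

```python
import math

CONCURRENT_DOWNLOADS = 10_000_000
PER_CLIENT_BPS = 50_000_000          # 50 Mbps target from the NFRs
ASSUMED_NIC_BPS = 100_000_000_000    # assumption: 100 Gbps per server, fully saturated


def aggregate_egress_bps(clients: int, per_client_bps: int) -> int:
    """Total egress if every client downloads at the target rate simultaneously."""
    return clients * per_client_bps


def transfer_seconds(size_bytes: int, bps: int) -> float:
    """Time to move size_bytes at a steady bit rate."""
    return size_bytes * 8 / bps


total_bps = aggregate_egress_bps(CONCURRENT_DOWNLOADS, PER_CLIENT_BPS)
print(total_bps / 1e12)                           # 500.0 (Tbps)
print(math.ceil(total_bps / ASSUMED_NIC_BPS))     # 5000 servers at line rate
print(transfer_seconds(50 * 10**9, PER_CLIENT_BPS) / 3600)  # hours for a 50 GB file
```

Even under the generous assumption that every server saturates its NIC, the proxy design needs thousands of machines doing nothing but copying bytes, and a single 50 GB download holds its connection open for a little over two hours.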
2. Evolved - pre-signed URL and CDN offloading
The fix decouples the control plane (deciding what to download and authorizing it) from the data plane (actually moving bytes).
Components:
- Client: Sends POST /downloads once to get a pre-signed URL, then fetches from CDN directly. Never talks to the API server for bytes.
- API Server (control plane only): Creates sessions, generates pre-signed URLs, handles pause/resume PATCH calls. Touches zero bandwidth.
- CDN (data plane): Serves file bytes directly to clients. Caches popular files at edge PoPs. Validates pre-signed URL HMAC signatures inline. Handles Range requests natively.
- Origin Storage (S3): Backing store. CDN pulls from here only on a cache miss.
- Sessions DB: Stores session state including bytes_confirmed for resume.
Request walkthrough:
- Client POSTs to /downloads with file_id.
- API Server looks up file metadata (size, checksum, storage_key) in the File DB.
- API Server creates a DownloadSession row in the Sessions DB.
- API Server generates a pre-signed URL pointing to the CDN (expiry + HMAC signature in query params).
- API Server returns { session_id, download_url, file_size, checksum }.
- Client GETs the download_url directly from the CDN, bypassing the API server entirely.
- CDN validates the URL signature at the edge, checks its cache, and serves bytes from cache or pulls once from S3.
- Client receives bytes from the nearest CDN PoP.
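The signing and edge-validation steps in this walkthrough can be sketched with a generic HMAC scheme. This is not the format used by any particular CDN or by S3; real providers define their own signed-URL formats. The secret, parameter names (expires, sig), and function names below are all hypothetical.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-signing-key"  # hypothetical key shared by the API server and the CDN edge


def _signature(path: str, expires: int, secret: bytes) -> str:
    payload = f"{path}?expires={expires}".encode()
    return hmac.new(secret, payload, hashlib.sha256).hexdigest()


def sign_url(base_url: str, path: str, ttl_seconds: int,
             now: float = None, secret: bytes = SECRET) -> str:
    """API server side: attach an expiry and an HMAC over (path, expiry)."""
    expires = int(time.time() if now is None else now) + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires, secret)})
    return f"{base_url}{path}?{query}"


def verify_url(url: str, now: float = None, secret: bytes = SECRET) -> bool:
    """CDN edge side: check expiry and signature without calling the API server."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    try:
        expires = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    if (time.time() if now is None else now) > expires:
        return False
    expected = _signature(parts.path, expires, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode(), expected.encode())
```

Because the signature covers only the path and expiry, the same URL stays valid for every Range request against that file until it expires, which is what lets a paused client resume without a fresh round-trip to the API server.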