Document Vault
Design a scalable document management system like Google Docs or Notion: versioned storage, collaborative editing, access control, full-text search, and real-time sync across clients.
What is a document management system?
A document management system lets users create, organize, and retrieve documents in a folder hierarchy, with version history and per-node access control. The interesting engineering challenge isn't the CRUD; it's storing 1,000 edits without consuming 50MB per document, enforcing permission inheritance across a deep folder tree without an O(depth) query on every read, and making 1 billion documents searchable under 500ms.
I'd frame this question around the version storage and ACL inheritance tradeoffs early, because they force completely different design decisions depending on which constraint you prioritize. This combination of storage efficiency, hierarchical access control, and full-text indexing is why the question shows up consistently in senior interviews.
Functional Requirements
Core Requirements
- Users can create, read, update, and delete documents organized in folders.
- The system maintains a version history for every document and supports restoring to any previous version.
- Users can share documents and folders with specific people, controlling read, write, and manage permissions.
- Users can search documents by title and full-text content.
Below the Line (out of scope)
- Real-time collaborative editing (concurrent multi-user editing using OT or CRDT)
- Large binary attachments embedded in documents (images, PDFs, videos)
- Comment threads on specific document sections
- Document templates and publishing or export to PDF
The hardest parts in scope: Version history storage and hierarchical access control. Storing 1,000 edits naively at 50KB per copy consumes 50MB per document. Permission inheritance across a deep folder tree requires careful design to avoid a slow tree walk on every read request.
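To make the 50MB figure concrete, here is a minimal sketch of line-level delta encoding using Python's difflib and zlib. The copy/insert op format and the function names make_delta and apply_delta are assumptions for illustration, not the delta format covered in the deep dives.

```python
import difflib
import json
import zlib


def make_delta(old: str, new: str) -> bytes:
    """Encode `new` as copy/insert operations against `old`, line by line."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    ops = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Unchanged run: reference the old line range instead of storing it.
            ops.append(["copy", i1, i2])
        elif tag in ("replace", "insert"):
            # New or changed text is the only content stored verbatim.
            ops.append(["insert", "".join(new_lines[j1:j2])])
        # "delete" emits nothing: the old lines are simply not copied.
    return zlib.compress(json.dumps(ops).encode("utf-8"))


def apply_delta(old: str, delta: bytes) -> str:
    """Rebuild the new revision from the old revision plus a delta."""
    old_lines = old.splitlines(keepends=True)
    ops = json.loads(zlib.decompress(delta).decode("utf-8"))
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.extend(old_lines[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


# A roughly 48KB document with a one-line edit.
base = "".join(f"line {i:04d}: " + "lorem ipsum " * 3 + "\n" for i in range(1000))
edited = base.replace("line 0500:", "line 0500 (edited):")
delta = make_delta(base, edited)
print(len(edited.encode("utf-8")), "bytes full copy vs", len(delta), "bytes delta")
```

A small edit produces a delta that is a tiny fraction of the full copy, which is the whole argument for not storing 1,000 full revisions.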
Real-time collaborative editing is below the line because it requires Operational Transformation or CRDT-based merge logic plus a WebSocket broadcast layer. To add it, introduce a Collaboration Service that serializes concurrent operations and broadcasts change deltas to all connected clients over WebSocket, sitting alongside the Document Service rather than inside it. The server resolves conflicts by serializing concurrent ops against a shared document state, so two users editing the same paragraph see a merged result rather than a last-write-wins overwrite.
Large binary attachments are below the line because they introduce a separate upload flow (chunked multipart, virus scanning, CDN delivery) without changing document metadata or versioning semantics. To add them, store the binary in S3, embed an attachment_id reference in the document body, and lazy-load the binary on the client. The document itself stays lightweight text; the attachment reference is just a pointer to the stored object.
Comment threads are below the line because they introduce a separate content type with its own read patterns, including threading, reactions, and notifications. To add them, store comments as a separate entity anchored to a (document_id, anchor_offset) tuple, using the same ACL table to control visibility.
Document templates and export are below the line because they are rendering concerns that don't affect the storage or access control design. To add export, a background renderer consumes the latest document version and produces a PDF, storing it in S3 as a derived artifact.
Non-Functional Requirements
Core Requirements
- Scale: 100M registered users, 1B documents total.
- Storage: Average document 50KB; 50TB for text content. Version history adds 2-3x storage overhead with delta encoding, bringing the total to roughly 100-150TB.
- Write latency: Document save acknowledges in under 500ms p99.
- Read latency: Document load completes in under 300ms p99. Search results return in under 500ms p99.
- Availability: 99.9% uptime. Consistency over availability for document writes: a confirmed save must never be silently lost.
- Durability: Document content and version history must survive any single-region failure.
Below the Line
- Sub-100ms global read latency via CDN edge caching (achievable with aggressive caching but not a core NFR here)
- Exactly-once change event delivery to the search indexer (at-least-once with idempotent indexing is sufficient)
Read/write ratio: 10:1. Documents are read far more than written. This opens the door for a Redis cache in front of hot metadata reads and justifies async search index updates rather than synchronous writes on every save.
The 500ms write latency budget is generous enough to absorb a synchronous write to PostgreSQL plus an async delta computation. It is not generous enough to also update the search index synchronously, so index updates go through an async pipeline with a few seconds of lag.
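At-least-once delivery means the indexer will occasionally see the same change event twice, or see events out of order. A minimal sketch of idempotent indexing follows, assuming each event carries a per-document version_seq counter. That field, the ChangeEvent shape, and the in-memory SearchIndexer are all hypothetical stand-ins for the real pipeline.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ChangeEvent:
    document_id: str
    version_seq: int  # hypothetical: increases by one on every save of a document
    title: str
    body: str


class SearchIndexer:
    """In-memory stand-in for the search index."""

    def __init__(self) -> None:
        self.docs: Dict[str, ChangeEvent] = {}

    def apply(self, event: ChangeEvent) -> bool:
        """Index the event unless an equal or newer version is already indexed.

        Returns True if the index changed. Replaying an event is a no-op,
        so duplicate deliveries are harmless.
        """
        current = self.docs.get(event.document_id)
        if current is not None and current.version_seq >= event.version_seq:
            return False  # duplicate or stale delivery
        self.docs[event.document_id] = event
        return True


indexer = SearchIndexer()
first = ChangeEvent("d1", 1, "Q3 report", "draft")
second = ChangeEvent("d1", 2, "Q3 report", "final")
indexer.apply(first)
indexer.apply(second)
indexer.apply(first)  # late redelivery: ignored, index still holds version 2
```

Because stale events are dropped rather than applied, the queue does not need ordering or exactly-once guarantees.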
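Storage math drives a key design decision: 1B documents at 50KB each is 50TB for content alone. With version history averaging 10 deltas per document at 1-2KB per delta, that adds another 10-20TB for the deltas themselves; periodic full snapshot anchors add more on top, which the 100-150TB total above leaves room for. Storing those same 10 versions as full 50KB copies would instead add 500TB. Delta encoding is not optional at this scale.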
I'd run this math on the whiteboard early. It anchors the interviewer's expectations and makes the case for delta encoding before you propose it.
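The whiteboard arithmetic can be checked in a few lines. This sketch uses decimal units (1KB = 1,000 bytes) and the figures quoted in this section; the constant names are mine.

```python
KB = 1_000
MB = 1_000_000
TB = 1_000_000_000_000

DOCS = 1_000_000_000        # 1B documents
AVG_DOC_BYTES = 50 * KB     # 50KB average document
DELTAS_PER_DOC = 10         # average version count per document
EDITS_PER_HOT_DOC = 1_000   # the "1,000 edits" worst case

content_tb = DOCS * AVG_DOC_BYTES / TB
delta_tb_low = DOCS * DELTAS_PER_DOC * (1 * KB) / TB
delta_tb_high = DOCS * DELTAS_PER_DOC * (2 * KB) / TB

# Naive alternative: every version stored as a full copy.
naive_history_tb = DOCS * DELTAS_PER_DOC * AVG_DOC_BYTES / TB
naive_hot_doc_mb = EDITS_PER_HOT_DOC * AVG_DOC_BYTES / MB

print(f"content:            {content_tb:.0f} TB")
print(f"delta history:      {delta_tb_low:.0f}-{delta_tb_high:.0f} TB")
print(f"full-copy history:  {naive_history_tb:.0f} TB")
print(f"1,000 full copies:  {naive_hot_doc_mb:.0f} MB per document")
```

The gap between the delta history and the full-copy history is the number worth saying out loud.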
Core Entities
- Document: The primary content item. Contains document_id, title, owner_id, parent_folder_id, created_at, and updated_at. The content body lives in object storage; the database row holds only metadata and a pointer to the latest version.
- Folder: A container node in the hierarchy. Shares the same nodes table as documents, distinguished by a node_type discriminator column. Folders can nest arbitrarily deep.
- Version: A recorded revision of a document. Contains version_id, document_id, created_by, created_at, parent_version_id, and an S3 key pointing to the compressed delta (or a periodic full snapshot anchor).
- AclEntry: A permission record linking a principal (user or group) to a node (document or folder). Contains node_id, principal_id, permission (read, write, manage), and an is_override flag that stops permission propagation from overwriting this entry.
Full schema, index design, and delta format are covered in the deep dives. These four entities are sufficient to drive the API and high-level architecture.
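As a rough picture of how the four entities relate, here is a sketch using Python dataclasses. Field names follow the list above where it gives them. The Node class merging documents and folders, the current_version_id and s3_key names, and the is_snapshot flag are my assumptions, not the schema from the deep dives.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Optional


class NodeType(Enum):
    DOCUMENT = "document"
    FOLDER = "folder"


class Permission(Enum):
    READ = "read"
    WRITE = "write"
    MANAGE = "manage"


@dataclass
class Node:
    """One row of the shared nodes table; node_type tells documents from folders."""
    node_id: str                      # doubles as document_id for documents
    node_type: NodeType
    title: str
    owner_id: str
    parent_folder_id: Optional[str]   # None for a root folder
    created_at: datetime
    updated_at: datetime
    current_version_id: Optional[str] = None  # pointer to latest version; None for folders


@dataclass
class Version:
    version_id: str
    document_id: str
    created_by: str
    created_at: datetime
    parent_version_id: Optional[str]  # None for the first version
    s3_key: str                       # where the compressed delta or snapshot lives
    is_snapshot: bool = False         # assumption: marks periodic full snapshot anchors


@dataclass
class AclEntry:
    node_id: str
    principal_id: str                 # user or group
    permission: Permission
    is_override: bool = False         # True: propagation must not overwrite this entry
```

Keeping the content body out of Node is what lets the metadata row stay small enough to cache.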
API Design
One endpoint group per functional requirement, evolved where the naive shape breaks down.
FR 1 - CRUD on documents and folders:
# Create a document inside a folder
POST /nodes
Body: { type: "document", title, parent_folder_id, content }
Response: 201 Created · { node_id, version_id }
# Read a document (content fetched directly from S3 via presigned URL)
GET /nodes/{node_id}
Response: 200 · { title, parent_folder_id, current_version_id, content_url }
# Update a document
PATCH /nodes/{node_id}
Body: { title?, content? }
Response: 200 · { version_id }
# Delete a node
DELETE /nodes/{node_id}
Response: 204 No Content
PATCH over PUT for updates: the client sends only changed fields rather than the full document, keeping request payloads small. The content_url in the GET response is a presigned S3 URL so the client fetches the body directly from S3, avoiding large payload routing through the API tier.
I'd call out the presigned URL pattern here because interviewers often probe it. The alternative (streaming the document body through the API server) works at small scale but becomes a CPU and bandwidth bottleneck at 1B documents. Presigned URLs let S3 handle the heavy lifting of content delivery.
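If the interviewer asks what a presigned URL actually is, the idea is a URL that carries its own expiry and a signature the storage layer can verify without calling back to the API tier. The sketch below illustrates that idea with a plain HMAC over the object key and expiry. It is not S3's real signing scheme, and SECRET, the storage.example.com host, sign_url, and verify_url are all made up for illustration. In practice an SDK call such as boto3's generate_presigned_url produces the URL.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-signing-key"  # hypothetical key shared by the API tier and storage


def _signature(object_key: str, expires: int) -> str:
    payload = f"{object_key}:{expires}".encode("utf-8")
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def sign_url(object_key: str, ttl_seconds: int, now: Optional[float] = None) -> str:
    """Return a URL that grants read access to one object until it expires."""
    issued = time.time() if now is None else now
    expires = int(issued + ttl_seconds)
    query = urlencode({"expires": expires, "sig": _signature(object_key, expires)})
    return f"https://storage.example.com/{object_key}?{query}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """What the storage side does: check the signature and the expiry, nothing else."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    object_key = parsed.path.lstrip("/")
    current = time.time() if now is None else now
    return hmac.compare_digest(sig, _signature(object_key, expires)) and current < expires


url = sign_url("docs/d1/v3", ttl_seconds=300)
print(verify_url(url))  # valid until the TTL elapses
```

The API server does the permission check once when it issues the URL; after that the content bytes never touch it.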
FR 2 - Version history and restore:
# List version history
GET /nodes/{node_id}/versions?cursor=<opaque>&limit=20
Response: 200 · { versions: [...], next_cursor }
# Restore to a specific version (creates a new version, never mutates history)
POST /nodes/{node_id}/versions/{version_id}/restore
Response: 201 Created · { new_version_id }
POST for restore because restoring appends a new version to history rather than mutating the document in place. The old version is never deleted; the timeline gains a restore marker pointing back to the target version.
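The append-only behaviour is easy to show in miniature. This sketch keeps full content per version for readability (the real store would hold deltas), and VersionRecord, VersionHistory, and the restored_from marker are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class VersionRecord:
    version_id: int
    content: str
    restored_from: Optional[int] = None  # restore marker pointing at the target version


class VersionHistory:
    """Append-only timeline: nothing is ever edited or removed."""

    def __init__(self, initial_content: str) -> None:
        self._versions: List[VersionRecord] = [VersionRecord(1, initial_content)]

    def get(self, version_id: int) -> VersionRecord:
        if not 1 <= version_id <= len(self._versions):
            raise KeyError(version_id)
        return self._versions[version_id - 1]

    def latest(self) -> VersionRecord:
        return self._versions[-1]

    def _append(self, content: str, restored_from: Optional[int]) -> int:
        record = VersionRecord(len(self._versions) + 1, content, restored_from)
        self._versions.append(record)
        return record.version_id

    def save(self, content: str) -> int:
        return self._append(content, None)

    def restore(self, version_id: int) -> int:
        """Create a new version whose content equals the target version's."""
        target = self.get(version_id)
        return self._append(target.content, restored_from=version_id)

    def __len__(self) -> int:
        return len(self._versions)


history = VersionHistory("first draft")
history.save("second draft")
new_id = history.restore(1)  # appends version 3; versions 1 and 2 are untouched
```

Because restore is just another append, "undo the restore" needs no special case: restore version 2 and the timeline grows by one more entry.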
FR 3 - Access control:
# Grant or update access for a principal
PUT /nodes/{node_id}/acl
Body: { principal_id, permission, is_override }
Response: 200 · { acl_entry_id }
# List current ACL for a node
GET /nodes/{node_id}/acl
Response: 200 · { entries: [...] }
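To see why inheritance needs careful design, it helps to write down the naive resolver this design is trying to avoid: walk from the node toward the root until an explicit entry turns up. The sketch assumes the nearest explicit entry wins and that manage implies write implies read. Both rules, and the sample tree, are my assumptions for illustration.

```python
from typing import Dict, Optional, Tuple

# Assumed ordering: a higher rank includes every lower permission.
RANK = {"read": 1, "write": 2, "manage": 3}

# child -> parent; a four-level path root / eng / specs / doc-1
PARENTS: Dict[str, Optional[str]] = {
    "root": None,
    "eng": "root",
    "specs": "eng",
    "doc-1": "specs",
}

# (node_id, principal_id) -> permission, explicit entries only
ACL: Dict[Tuple[str, str], str] = {
    ("eng", "alice"): "write",
    ("doc-1", "bob"): "read",
}


def effective_permission(
    node_id: str,
    principal_id: str,
    parents: Dict[str, Optional[str]],
    acl: Dict[Tuple[str, str], str],
) -> Tuple[Optional[str], int]:
    """Return (permission, lookups). Lookups grow with depth: the O(depth) cost."""
    lookups = 0
    current: Optional[str] = node_id
    while current is not None:
        lookups += 1
        permission = acl.get((current, principal_id))
        if permission is not None:
            return permission, lookups
        current = parents[current]
    return None, lookups


def can(node_id: str, principal_id: str, needed: str) -> bool:
    permission, _ = effective_permission(node_id, principal_id, PARENTS, ACL)
    return permission is not None and RANK[permission] >= RANK[needed]


print(effective_permission("doc-1", "alice", PARENTS, ACL))  # inherited from "eng"
```

Every read of a deeply nested document pays one lookup per ancestor, and a user with no access at all pays the full depth. That per-read cost is what precomputing or caching effective permissions removes.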
FR 4 - Search:
# Full-text search across accessible documents
GET /search?q=quarterly+report&cursor=<opaque>&limit=20
Response: 200 · { results: [...], next_cursor }
Search returns only results the requesting user has at least read permission for. The Document Service resolves the user's accessible node_ids before forwarding the query to Elasticsearch, rather than relying on Elasticsearch alone for access control. Cursor-based pagination handles deep result sets without offset-based paging, which gets progressively more expensive the deeper a client pages into a 1B-document index.
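Both ideas, filtering by accessible node_ids and paging with an opaque cursor, fit in a short in-memory sketch. The term-count scoring, the dict standing in for the index, and the cursor encoding (base64 of the last sort key) are assumptions for illustration; in Elasticsearch the comparable paging mechanism is search_after.

```python
import base64
import json
from typing import Dict, List, Optional, Set, Tuple

SortKey = Tuple[int, str]  # (negated score, node_id): best score first, ties by id


def encode_cursor(last_key: SortKey) -> str:
    """Opaque to the client: just the sort key of the last result returned."""
    return base64.urlsafe_b64encode(json.dumps(last_key).encode("utf-8")).decode("ascii")


def decode_cursor(cursor: str) -> SortKey:
    neg_score, node_id = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))
    return (neg_score, node_id)


def search(
    index: Dict[str, str],
    accessible: Set[str],
    query: str,
    limit: int,
    cursor: Optional[str] = None,
) -> Tuple[List[str], Optional[str]]:
    terms = query.lower().split()
    hits: List[SortKey] = []
    for node_id, text in index.items():
        if node_id not in accessible:
            continue  # access filter is applied before any scoring
        words = text.lower().split()
        score = sum(words.count(term) for term in terms)
        if score > 0:
            hits.append((-score, node_id))
    hits.sort()
    if cursor is not None:
        last = decode_cursor(cursor)
        hits = [hit for hit in hits if hit > last]  # resume strictly after the cursor
    page = hits[:limit]
    next_cursor = encode_cursor(page[-1]) if len(hits) > limit else None
    return [node_id for _, node_id in page], next_cursor


INDEX = {
    "d1": "quarterly report q1",
    "d2": "quarterly report quarterly",
    "d3": "quarterly secret",
    "d4": "lunch menu",
    "d5": "report",
}
ACCESSIBLE = {"d1", "d2", "d4", "d5"}  # resolved by the Document Service; d3 is excluded

page1, cursor1 = search(INDEX, ACCESSIBLE, "quarterly report", limit=2)
page2, cursor2 = search(INDEX, ACCESSIBLE, "quarterly report", limit=2, cursor=cursor1)
```

The cursor names a position in the sort order rather than a row count, so the second request never has to re-walk the first page.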
High-Level Design
1. CRUD and folder hierarchy