Storage types: block vs. file vs. object
When to use block storage, file storage, or object storage: performance characteristics, consistency models, cost, use cases for each, and how cloud services map to these types.
TL;DR
| Dimension | Block Storage | File Storage | Object Storage |
|---|---|---|---|
| Access model | Raw blocks, OS filesystem on top | Network filesystem (NFS/SMB), POSIX semantics | HTTP API (PUT/GET/DELETE), flat namespace |
| Latency | Sub-ms to low-ms (local NVMe: ~100 µs, EBS gp3: 1-5 ms) | 1-10 ms (network + filesystem overhead) | 10-100 ms (HTTP first byte) |
| Best for | Databases, boot volumes, low-latency random I/O | Shared access across servers, legacy apps, media processing | Images, videos, backups, data lakes, static assets |
| Cost (AWS) | $0.08-0.125/GB/mo (gp3/io2) | $0.30/GB/mo (EFS Standard) | $0.023/GB/mo (S3 Standard) |
| Scalability | Provisioned size (can be grown, not shrunk), attached to an instance | Elastic, shared mount | Virtually unlimited (exabyte-scale) |
Default answer: Object storage for anything immutable or rarely updated (media, backups, static assets). Block storage for databases and applications requiring low-latency random I/O. File storage only when multiple servers need POSIX-compatible shared access.
The Framing
Your team launches a new service. The database runs on an EC2 instance with a 500 GB EBS volume. User-uploaded images go into the same EBS volume. Shared configuration files that all servers need live in Redis serialized as a blob.
Six months later, the EBS volume is 90% full because images take up 450 GB. Scaling the database to a larger instance means detaching and reattaching the volume (downtime). Configuration changes require manually updating Redis. And at io2 rates ($0.125/GB/month), the 500 GB volume costs $62.50/month, of which about $56 pays for images that would cost roughly $10.35/month in S3 Standard ($0.023/GB/month), a difference of more than 5x.
Every storage type exists because the others can't do something it can. Block storage gives you sub-millisecond random I/O for databases. Object storage gives you infinite capacity for pennies per GB. File storage gives you shared POSIX access that legacy applications expect. Using the wrong type doesn't fail immediately. It fails at scale, at cost, or at operational complexity.
I've seen teams store everything on EBS and wonder why their AWS bill is astronomical. And I've seen teams try to run a database on S3 and wonder why latency is terrible. The rule is simple: match storage type to access pattern.
How Each Works
Block Storage: Raw Blocks, Maximum Performance
Block storage exposes fixed-size blocks (typically 512 bytes or 4 KB) to the operating system. The OS formats them with a filesystem (ext4, XFS, NTFS) and treats them as a local disk. Network-attached block storage (AWS EBS, GCP Persistent Disk, Azure Managed Disks) makes this model work over a network with near-local latency.
The key characteristic is random I/O at any block offset. A database can read page 4,721 of a B-tree index, modify it, and write it back, all within microseconds on local NVMe or low single-digit milliseconds on network-attached volumes. This low-latency, in-place random access is what makes block storage essential for databases. Object stores such as S3 generally cannot provide it at all (objects are replaced whole rather than modified in place), and network file storage provides it only with added per-operation latency.
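The page-level read-modify-write described above can be sketched in a few lines. This is a minimal illustration, not how any particular database engine does it: it uses a temporary file as a stand-in for a block device, and the 4 KB `PAGE_SIZE` and the helper names are assumptions made for the example.

```python
import os
import tempfile

PAGE_SIZE = 4096  # assumed page size for this sketch


def read_page(f, page_no):
    """Read one fixed-size page by seeking directly to its offset."""
    f.seek(page_no * PAGE_SIZE)
    return f.read(PAGE_SIZE)


def write_page(f, page_no, data):
    """Overwrite one page in place and force it to stable storage."""
    if len(data) != PAGE_SIZE:
        raise ValueError("page must be exactly PAGE_SIZE bytes")
    f.seek(page_no * PAGE_SIZE)
    f.write(data)
    f.flush()
    os.fsync(f.fileno())


def demo():
    with tempfile.TemporaryFile() as f:
        f.truncate(5000 * PAGE_SIZE)          # a 5,000-page "volume" of zeros
        page = bytearray(read_page(f, 4721))  # read page 4,721
        page[0:4] = b"BTRE"                   # modify it in memory
        write_page(f, 4721, bytes(page))      # write only that page back
        return read_page(f, 4721)[:4], read_page(f, 4720)[:4]
```

Only page 4,721 is touched; the neighbouring pages are never read or rewritten. That in-place update is the operation an object store's whole-object PUT cannot express.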
```yaml
# AWS EBS gp3 volume configuration
VolumeType: gp3
Size: 100          # GB
Iops: 3000         # Baseline (included in the price), up to 16,000
Throughput: 125    # MB/s baseline, up to 1,000
Encrypted: true    # AES-256 at rest
```

```yaml
# AWS EBS io2 Block Express (high-performance)
VolumeType: io2
Size: 100                  # GB
Iops: 64000                # Up to 256,000 for Block Express
# No Throughput setting: that property applies to gp3 only.
# io2 throughput (up to 4,000 MB/s) follows the provisioned IOPS.
MultiAttachEnabled: false  # true for shared-disk clusters
```
Performance tiers matter. gp3 includes 3,000 baseline IOPS at no extra charge; io2 Block Express scales to 256,000 IOPS for latency-sensitive workloads. NVMe instance storage (i3, i4i instances) provides the lowest latency (~100 microseconds) but is ephemeral: data is lost when the instance stops or terminates.
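RAID striping (RAID 0 across multiple EBS volumes) adds the volumes' IOPS and throughput together, up to the instance's own EBS bandwidth limit. Four gp3 volumes at baseline settings, striped together, provide 12,000 IOPS and 500 MB/s throughput. I use this pattern for write-heavy databases that need more throughput than a single volume provides. Keep in mind that RAID 0 has no redundancy: losing any one volume loses the whole array.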
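The striping arithmetic is simple enough to write down. The sketch below uses the gp3 baseline figures quoted in this article; the optional instance-level caps are an assumption added for illustration, since the real ceiling depends on the instance type.

```python
GP3_BASELINE_IOPS = 3000  # per volume, gp3 baseline
GP3_BASELINE_MBPS = 125   # per volume, gp3 baseline


def raid0_aggregate(volume_count,
                    iops=GP3_BASELINE_IOPS,
                    mbps=GP3_BASELINE_MBPS,
                    instance_iops_cap=None,
                    instance_mbps_cap=None):
    """Return (IOPS, MB/s) for a RAID 0 stripe of identical volumes.

    The caps are hypothetical per-instance limits; pass None to ignore them.
    """
    total_iops = volume_count * iops
    total_mbps = volume_count * mbps
    if instance_iops_cap is not None:
        total_iops = min(total_iops, instance_iops_cap)
    if instance_mbps_cap is not None:
        total_mbps = min(total_mbps, instance_mbps_cap)
    return total_iops, total_mbps


print(raid0_aggregate(4))  # (12000, 500)
```

If the instance caps throughput below the sum of the volumes, adding more volumes to the stripe buys nothing, so check the instance limit before widening the array.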
Multi-Attach is available on io2 volumes, allowing a single volume to be attached to up to 16 Nitro-based instances in the same Availability Zone simultaneously. This enables shared-disk cluster architectures where multiple nodes access the same block device. However, the application must coordinate concurrent writes itself: standard filesystems such as ext4 and XFS are not cluster-aware and will not do this for you. Oracle RAC and some custom distributed databases use this pattern.
EBS snapshots are incremental and stored in S3 behind the scenes. The first snapshot copies the full volume; subsequent snapshots copy only changed blocks. Snapshots can be copied across regions for disaster recovery. You can create a new volume from any snapshot in any Availability Zone of the region, which makes migration and scaling straightforward. EBS also supports fast snapshot restore (FSR), which pre-warms the volume data and eliminates the performance penalty of reading from a snapshot on first access. Without FSR, the first read of each block fetches from S3 (slow); with FSR, the volume performs at full speed immediately after creation.
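To make "incremental" concrete, here is a toy model of the idea. It is not how EBS tracks changes internally; the dictionary-of-blocks representation and the function name are assumptions for illustration only.

```python
def take_snapshot(volume, previous_view=None):
    """Toy incremental snapshot.

    volume: dict mapping block index -> block contents (bytes).
    previous_view: the full logical contents as of the last snapshot.
    Returns (stored_blocks, full_view): only blocks that changed since the
    previous snapshot are stored, but the full view can always be rebuilt.
    """
    previous_view = previous_view or {}
    stored = {i: data for i, data in volume.items()
              if previous_view.get(i) != data}
    full_view = {**previous_view, **stored}
    return stored, full_view


volume = {0: b"boot", 1: b"data-v1", 2: b"logs"}
snap1_stored, snap1_view = take_snapshot(volume)              # full copy
volume[1] = b"data-v2"                                        # one block changes
snap2_stored, snap2_view = take_snapshot(volume, snap1_view)  # stores 1 block
```

The second snapshot stores a single block yet still describes the whole volume, which is why frequent snapshots stay cheap for data that changes slowly.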
Encryption at rest is available on all EBS volume types using AWS KMS keys. The performance impact is negligible because encryption is handled by the Nitro controller hardware. For compliance requirements, encrypted volumes keep data protected even if someone gains access to the underlying hardware. Once you enable default encryption for a region, all new volumes in that region are automatically encrypted.
Data Durability Comparison
Durability is how likely your data survives hardware failure. The three storage types have very different durability profiles:
- EBS (block): Durability depends on the volume type. gp3 and most other types are designed for 99.8-99.9% durability (a 0.1-0.2% annual failure rate); io2 is designed for 99.999% (a 0.001% annual failure rate). Data is replicated within a single Availability Zone. If the AZ fails, the volume is lost unless you have snapshots (stored in S3, which has higher durability). For critical databases, combine EBS with daily snapshots and cross-region snapshot copies.
- S3 (object): Designed for 99.999999999% (11 nines) durability across 3+ AZs. AWS illustrates this as an average of one object lost every 10,000 years if you store 10 million objects. S3 is among the most durable storage options AWS offers.
- EFS (file): Data is replicated across multiple AZs within a region, providing similar durability to S3. EFS Standard stores data redundantly across 3+ AZs.
- Instance Store (NVMe): Zero durability guarantee. Data is lost on instance stop, termination, or hardware failure. Use only for temporary scratch data that can be regenerated.
The durability hierarchy: S3 (11 nines) > EFS (multi-AZ) > EBS (single-AZ) > Instance Store (ephemeral). For any data that matters, either use S3/EFS or back up EBS with snapshots.
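The durability figures above translate into expected losses with one multiplication. The sketch below treats the published durability as a per-object (or per-volume) annual survival probability, which is a simplifying assumption; the function name is made up for the example. It uses `Fraction` so that eleven nines do not get lost to floating-point rounding.

```python
from fractions import Fraction


def expected_annual_losses(item_count, durability):
    """Expected items lost per year, given durability as a decimal string."""
    return item_count * (1 - Fraction(durability))


# S3: 10 million objects at 11 nines
s3_losses = expected_annual_losses(10_000_000, "0.99999999999")
years_per_lost_object = 1 / s3_losses

# EBS: 1,000 volumes at 99.8% durability (0.2% annual failure rate)
ebs_losses = expected_annual_losses(1_000, "0.998")

print(years_per_lost_object)  # 10000
print(ebs_losses)             # 2
```

At fleet scale the gap is stark: one lost S3 object per ten millennia versus roughly two lost volumes every year, which is why unsnapshotted EBS volumes should never hold the only copy of anything.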
| EBS Volume Type | IOPS (max) | Throughput (max) | Latency | Monthly cost (200 GB) | Best For |
|---|---|---|---|---|---|
| gp3 | 16,000 | 1,000 MB/s | 1-5 ms | $16 + provisioned IOPS/throughput above baseline | General workloads |
| io2 Block Express | 256,000 | 4,000 MB/s | sub-ms | $25 + $0.065 per provisioned IOPS | High-perf databases |
| st1 (HDD) | 500 | 500 MB/s | 5-10 ms | $9 | Sequential throughput |
| NVMe instance store | 400,000+ | 7,000+ MB/s | ~100 µs | Included with instance | Temporary scratch |
File Storage: Shared Access with POSIX Semantics
File storage provides a network filesystem that multiple servers mount simultaneously. All servers see the same files and directories with POSIX semantics (read, write, seek, lock). NFS (Network File System) for Linux and SMB/CIFS for Windows are the primary protocols.
AWS EFS (Elastic File System) is the canonical cloud example. It auto-scales capacity (no pre-provisioning), supports thousands of concurrent connections, and charges per GB actually used ($0.30/GB/month for Standard, $0.016/GB/month for Infrequent Access).
```bash
# Mount EFS on multiple EC2 instances
sudo mkdir -p /mnt/shared   # the mount point must exist first
sudo mount -t nfs4 \
  -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 \
  fs-0123456789abcdef0.efs.us-east-1.amazonaws.com:/ /mnt/shared

# All instances now see the same filesystem
ls /mnt/shared/config/
# app-config.yaml  feature-flags.json  tls-certs/

# FSx for Lustre: high-throughput parallel filesystem
# Up to 1,000 GB/s aggregate for HPC / ML training
```
FSx variants serve specialized needs. FSx for Lustre provides parallel filesystem performance (hundreds of GB/s throughput) for machine learning training and high-performance computing. FSx for Windows File Server provides SMB-compatible storage. FSx for NetApp ONTAP provides enterprise NAS features.
The trade-off: file storage is expensive per GB compared to object storage (13x the cost of S3 Standard) and slower than block for random I/O. Use it when you genuinely need multiple servers writing to the same filesystem simultaneously, not as general-purpose storage.
EFS performance has two dimensions: throughput mode and performance mode. Bursting throughput scales with stored size (50 KB/s per GB stored). Provisioned throughput lets you decouple throughput from storage size. For small filesystems (under 1 TB) with high throughput needs, provisioned throughput avoids the bursting bottleneck. Max I/O performance mode handles highly parallelized workloads at the cost of slightly higher per-operation latency.
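The bursting bottleneck for small filesystems follows directly from the 50 KB/s-per-GB figure. A quick sketch of that arithmetic, with function names invented for the example and decimal units (1 MB = 1,000 KB) assumed for simplicity:

```python
KB_PER_S_PER_GB = 50  # bursting-mode baseline quoted above


def bursting_baseline_mbps(stored_gb):
    """Baseline throughput in MB/s for a filesystem of the given size."""
    return stored_gb * KB_PER_S_PER_GB / 1000


def storage_needed_gb(target_mbps):
    """Storage required before the baseline alone reaches target_mbps."""
    return target_mbps * 1000 / KB_PER_S_PER_GB


print(bursting_baseline_mbps(100))  # 5.0 MB/s for a 100 GB filesystem
print(storage_needed_gb(100))       # 2000.0 GB needed for 100 MB/s
```

A 100 GB filesystem that needs 100 MB/s sustained would have to hold twenty times more data to earn it in bursting mode, which is the case provisioned throughput exists for.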
NFS performance limitations are the main reason to avoid file storage when possible. NFS adds 1-10ms of latency per operation compared to local disk, and metadata-heavy operations (listing directories with thousands of files, checking file permissions) are significantly slower than on local block storage. Applications that open and close many small files rapidly (build systems, package managers) perform poorly on NFS. The rule: if your application's I/O pattern involves many small random operations, block storage is the right choice. If that data must also be shared between servers, put it in a database or object store rather than on an NFS mount.
Object Storage: Infinite Capacity at Pennies per GB
Object storage uses a flat namespace where each object is identified by a key (string) and stored as an immutable blob. There's no filesystem hierarchy (the "folders" in S3 are just key prefixes). Objects are accessed via HTTP API: PUT to write, GET to read, DELETE to remove.
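The "folders are just key prefixes" point is easy to demonstrate without touching S3. The sketch below emulates prefix-and-delimiter listing over a plain set of keys; it mirrors the general idea of S3's list operation but is a simplified stand-in, and the function name and sample keys are made up.

```python
def list_keys(keys, prefix="", delimiter="/"):
    """Emulate prefix/delimiter listing over a flat set of object keys.

    Returns (objects, common_prefixes): keys directly "in" the prefix, and
    the pseudo-folders one level below it.
    """
    objects, common_prefixes = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Collapse everything past the next delimiter into one "folder".
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return objects, sorted(common_prefixes)


keys = {
    "readme.txt",
    "users/456/avatar.jpg",
    "users/456/banner.jpg",
    "users/789/avatar.jpg",
}
print(list_keys(keys, "users/"))
# ([], ['users/456/', 'users/789/'])
```

No directory objects exist anywhere in `keys`; the "folders" appear only because the listing groups keys by their shared prefix.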
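```python
import boto3

s3 = boto3.client('s3')

# Example payload; in a real service this comes from the upload request
with open('avatar.jpg', 'rb') as f:
    image_bytes = f.read()

# Upload an image
s3.put_object(
    Bucket='my-app-images',
    Key='users/456/avatar.jpg',
    Body=image_bytes,
    ContentType='image/jpeg',
    ServerSideEncryption='AES256'
)

# Generate a pre-signed URL (temporary access; the holder needs no AWS credentials)
url = s3.generate_presigned_url(
    'get_object',
    Params={'Bucket': 'my-app-images', 'Key': 'users/456/avatar.jpg'},
    ExpiresIn=3600  # 1 hour
)

# Lifecycle policy: move to Glacier after 90 days, Deep Archive after 365
lifecycle_config = {
    'Rules': [{
        'ID': 'archive-old-images',
        'Status': 'Enabled',
        'Filter': {'Prefix': ''},  # apply to every object in the bucket
        'Transitions': [
            {'Days': 90, 'StorageClass': 'GLACIER'},
            {'Days': 365, 'StorageClass': 'DEEP_ARCHIVE'}
        ]
    }]
}

# Attach the policy to the bucket; defining the dict alone does nothing
s3.put_bucket_lifecycle_configuration(
    Bucket='my-app-images',
    LifecycleConfiguration=lifecycle_config
)
```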
S3 provides strong read-after-write consistency (since December 2020). Request throughput scales per prefix: at least 3,500 PUT/COPY/POST/DELETE requests per second and 5,500 GET/HEAD requests per second per prefix. Spreading objects across more prefixes raises the aggregate limit roughly linearly, so ten prefixes support about 55,000 GET requests per second.
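One common way to spread objects across prefixes is to prepend a short hash-derived shard to each key. The sketch below is one possible scheme, not an AWS-prescribed one; the shard count, the two-character hex prefix, and the function name are all assumptions.

```python
import hashlib


def sharded_key(key, shards=16):
    """Prepend a deterministic shard prefix so keys spread across prefixes."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    shard = int(digest, 16) % shards
    return f"{shard:02x}/{key}"


# The same logical key always maps to the same shard, so reads can
# recompute the full key without a lookup table.
example = sharded_key("users/456/avatar.jpg")
```

The trade-off is that listing "everything under `users/456/`" now means querying every shard, so this only pays off for workloads that actually approach the per-prefix request limits.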
Storage classes provide cost tiers. S3 Standard ($0.023/GB/mo) for frequently accessed data. S3 Infrequent Access ($0.0125/GB/mo) for data accessed less than once a month. S3 Glacier Instant Retrieval ($0.004/GB/mo) for archival with millisecond access. S3 Glacier Deep Archive ($0.00099/GB/mo) for compliance archives rarely accessed. Lifecycle policies move objects between tiers automatically.
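To see what tiering is worth, the per-GB prices above can be plugged into a small calculator. This sketch covers storage charges only; request fees, retrieval fees, and minimum storage durations are deliberately left out, and the dictionary keys are just labels chosen for the example.

```python
PRICE_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER_IR": 0.004,
    "DEEP_ARCHIVE": 0.00099,
}


def monthly_storage_cost(gb_by_class):
    """Sum storage-only cost in USD for a {storage_class: GB} mapping."""
    return sum(PRICE_PER_GB_MONTH[cls] * gb for cls, gb in gb_by_class.items())


all_standard = monthly_storage_cost({"STANDARD": 450})
tiered = monthly_storage_cost({"STANDARD": 50, "DEEP_ARCHIVE": 400})

print(round(all_standard, 2))  # 10.35
print(round(tiered, 2))        # 1.55
```

For archives that are almost never read the saving is large, but frequent retrievals from the colder tiers can erase it, so model access patterns before committing to a lifecycle rule.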
Versioning keeps every version of every object, enabling rollback. Server-side encryption (SSE-S3, SSE-KMS, SSE-C) encrypts at rest. Cross-region replication copies objects to another region for disaster recovery. Object Lock provides WORM (Write Once Read Many) compliance for regulatory requirements that mandate immutable storage.
Event notifications trigger Lambda functions, SQS messages, or SNS notifications when objects are created, overwritten, or deleted. This enables reactive architectures: a user uploads an image to S3, which triggers a Lambda that creates thumbnails in multiple sizes and writes them back to S3 (under a different prefix or bucket, so the thumbnails do not retrigger the function). The entire pipeline runs without any server management.
```hcl
# S3 event notification configuration (Terraform)
# Note: S3 also needs permission to invoke the function
# (an aws_lambda_permission with principal "s3.amazonaws.com").
resource "aws_s3_bucket_notification" "upload_trigger" {
  bucket = aws_s3_bucket.uploads.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.thumbnail_generator.arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = "uploads/"
    filter_suffix       = ".jpg"
  }
}
```
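On the receiving side, the Lambda function gets an event describing the new object. The sketch below shows only the event-parsing half of such a handler. The `thumbnails/` output prefix, the size list, and the helper names are hypothetical, and the actual image resizing and upload are omitted.

```python
import posixpath
from urllib.parse import unquote_plus

THUMBNAIL_WIDTHS = (64, 256, 1024)  # hypothetical sizes


def thumbnail_keys(source_key, widths=THUMBNAIL_WIDTHS):
    """Map uploads/<path>.jpg to thumbnails/<path>_<width>.jpg keys."""
    prefix = "uploads/"
    relative = source_key[len(prefix):] if source_key.startswith(prefix) else source_key
    stem, ext = posixpath.splitext(relative)
    return [f"thumbnails/{stem}_{w}{ext}" for w in widths]


def handler(event, context):
    """Collect (bucket, key, output_keys) for each record in an S3 event."""
    jobs = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (spaces become "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        jobs.append((bucket, key, thumbnail_keys(key)))
    return jobs
```

Writing the outputs under `thumbnails/` rather than `uploads/` keeps them outside the notification's `filter_prefix`, so the function does not invoke itself in a loop.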
S3 Transfer Acceleration uses CloudFront edge locations to speed up uploads from distant clients. Instead of uploading directly to the S3 bucket's region, the client uploads to the nearest edge location, which routes the data to S3 over AWS's optimized backbone network. This can improve upload speed by 50-500% for cross-continental transfers.