📐HowToHLD

Redis data structures

Learn which Redis data structure to use for timelines, counters, leaderboards, and sessions, with the specific commands and trade-offs each one makes.

23 min read · 2026-03-28 · medium · Tags: redis, caching, data-structures, sorted-sets, scalability

Why Redis data structures matter in interviews

The caching.mdx article covers cache patterns (cache-aside, write-through) and eviction policies. What it does not cover is which Redis structure to use for a given problem, and the commands that structure exposes.

In system design interviews, reaching for "Redis" is the right instinct. Knowing which data structure inside Redis to use is the sign of real depth. Using a List where a Sorted Set is needed, or a String where a Hash would fit better, is the kind of detail that distinguishes a candidate who has read about Redis from one who has used it under load.

This article covers the five structures you need to know and the canonical use cases for each.


flowchart TD
    subgraph Types["Five Redis Data Structures"]
        Str["String\nbyte value / int / float"]
        Hash["Hash\nfield-value map"]
        List["List\ndoubly-linked sequence"]
        Set["Set\nunique unordered members"]
        ZSet["Sorted Set\nscored ranked members"]
    end
    Str -->|"INCR / INCRBY"| C1["Atomic counters\nrate limiters"]
    Str -->|"SET key val EX ttl"| C2["Sessions\ncached objects"]
    Hash -->|"HSET / HINCRBY"| C3["User profiles\nper-field counters"]
    List -->|"LPUSH + LTRIM"| C4["Recent-N items\nactivity feed"]
    List -->|"RPUSH + LPOP"| C5["FIFO queue"]
    Set -->|"SADD / SISMEMBER"| C6["Unique visitors\ndeduplication guards"]
    ZSet -->|"ZADD + ZREVRANGE"| C7["Leaderboards\ntimeline cache"]
    ZSet -->|"ZADD + ZREMRANGEBYSCORE"| C8["Sliding window\nrate limiter"]

Strings

The simplest structure. A key maps to a single value: a byte string, integer, or float.

Commands:

  • SET key value [EX seconds] — store a value, optionally with a TTL
  • GET key — retrieve the value
  • INCR key / INCRBY key delta — atomic increment (returns new value)
  • SETNX key value — set only if the key does not exist (used for distributed locks)
  • MGET key1 key2 ... — fetch up to N keys in one round-trip

When to use:

  • Atomic counters. INCR rate_limit:{user_id} is the standard rate limiter primitive. The increment and read are atomic: no two clients can increment simultaneously and both read the same old value.
  • Session tokens. SET session:{token} {user_id_json} EX 3600 stores a session with a 1-hour TTL. One key, one lookup, one eviction via TTL.
  • Feature flags. SET feature:{name} 1 / GET feature:{name}. Simple, instant toggles.
  • Caching serialized objects. SET tweet:{tweet_id} {serialized_tweet} EX 86400. This is the tweet content cache from the Twitter design.
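The INCR-based rate limiter above can be sketched as a small in-memory model. This is plain Python, not a Redis client call; the class and key names are illustrative. In production the two steps are the Redis commands INCR (with EXPIRE set on the first increment of each window):

```python
import time

class FixedWindowLimiter:
    """In-memory model of the INCR fixed-window rate limiter pattern."""

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.counters = {}  # key -> (window_start, count)

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        # Bucket the current time into a window; in Redis the key would
        # simply expire via TTL when the window ends.
        window_start = int(now) // self.window * self.window
        start, count = self.counters.get(user_id, (window_start, 0))
        if start != window_start:      # previous window "expired"
            start, count = window_start, 0
        count += 1                     # the atomic INCR step
        self.counters[user_id] = (start, count)
        return count <= self.limit
```

With limit=3, the first three calls in a window are allowed and the fourth is rejected; a new window resets the count, just as key expiry would in Redis.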

What it does not do well: Anything where you need partial updates. Updating one field in a cached user object requires deserializing, modifying, and re-serializing the entire value. Use a Hash for objects with many independently updated fields.


Lists

An ordered sequence of strings. Insertion is O(1) from either end. Access by index is O(N). Think of it as a doubly-linked list.

Commands:

  • RPUSH key value [value ...] — append to the right (tail)
  • LPUSH key value [value ...] — prepend to the left (head)
  • LRANGE key start stop — return elements from index start to stop (0-indexed; -1 = last)
  • LTRIM key start stop — remove everything outside [start, stop]; destructive
  • LLEN key — number of elements
  • LREM key count value — remove the first count occurrences of a value

When to use:

  • Recent-N caches. LPUSH recent_activity:{user_id} {event_id} then LTRIM recent_activity:{user_id} 0 99 keeps the last 100 events. Combine: LPUSH + LTRIM in a pipeline; the list self-maintains its size cap.
  • Simple queues. RPUSH queue:{name} task to enqueue; LPOP queue:{name} to dequeue. Simple and fast, though message queues (Kafka, SQS) are preferable for durable distributed queues.
  • Fan-out feed (simple version). Before adding scores, Twitter's early architecture used LPUSH timeline:{user_id} tweet_id + LTRIM to maintain an 800-entry feed. This works but cannot do time-range lookups within the list without scanning all items.
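The LPUSH + LTRIM recent-N pattern above can be modelled in a few lines of plain Python (no Redis client; key and cap values are illustrative). In Redis the two commands would typically be sent in one pipeline:

```python
def push_recent(feed, event_id, cap=100):
    """In-memory model of LPUSH + LTRIM for a self-capping recent-N list."""
    feed.insert(0, event_id)   # LPUSH: prepend to the head
    del feed[cap:]             # LTRIM key 0 cap-1: drop everything past the cap
    return feed
```

After pushing 150 events with cap=100, the list holds exactly the 100 most recent, newest first; memory stays bounded no matter how many events arrive.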

What it does not do well: Finding an element by value in the middle of a long list (O(N)). Sorting or ranking items by a score. Use a Sorted Set if you need ordered access by a computed score rather than insertion order.


Sorted Sets

A set of unique members each with a floating-point score. Members are stored sorted by score. All operations maintain the sorted order.

Commands:

  • ZADD key score member [score member ...] — add or update members. If member exists, updates its score (useful with GT flag to only update on higher scores).
  • ZREVRANGE key start stop [WITHSCORES] — return members from highest to lowest score, by rank
  • ZRANGEBYSCORE key min max [LIMIT offset count] — return members with score within [min, max]
  • ZREVRANGEBYSCORE key max min — reverse: highest score first within range
  • ZREMRANGEBYRANK key start stop — remove members by rank (used to trim a sorted set to N entries)
  • ZCARD key — count of members
  • ZSCORE key member — get a member's score
  • ZRANK key member / ZREVRANK key member — get the zero-indexed rank of a member (forward or reverse)

Time complexity: Most operations are O(log N) due to the underlying skip list structure.

When to use:

  • Home timeline cache. ZADD home_timeline:{user_id} {unix_timestamp} {tweet_id} stores tweet_ids scored by creation time. ZREVRANGE home_timeline:{user_id} 0 19 returns the 20 most recent tweet_ids in one command. ZREMRANGEBYRANK home_timeline:{user_id} 0 -801 trims to the most recent 800, discarding older entries. This is the core data structure powering Twitter's pre-computed feed.
  • Leaderboards. ZADD leaderboard:{game_id} {score} {player_id} with GT flag ensures scores only update upward. ZREVRANGE leaderboard:{game_id} 0 9 WITHSCORES returns the top 10. ZREVRANK leaderboard:{game_id} {player_id} returns a player's current rank in O(log N).
  • Rate limiters with time windows. Store request timestamps as members with the timestamp as the score. ZREMRANGEBYSCORE key 0 {window_start} evicts old requests; ZCARD key counts requests in the current window. This is the sliding window rate limiter pattern.
  • Scheduled delayed jobs. Store job IDs scored by their intended execution timestamp. A worker polls ZRANGEBYSCORE jobs 0 {now} LIMIT 0 10 to find jobs ready to run.
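The sliding window rate limiter described above can be sketched as an in-memory model of one sorted-set key (plain Python, no Redis client; names are illustrative). The three modelled commands would normally be pipelined:

```python
import time

class SlidingWindowLimiter:
    """In-memory model of the Sorted Set sliding-window rate limiter.

    Models one key for brevity; in Redis each user gets their own key:
      ZADD key {now} {request_id}
      ZREMRANGEBYSCORE key 0 {now - window}
      ZCARD key
    """

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window = window_seconds
        self.zset = {}  # member -> score (request_id -> timestamp)

    def allow(self, request_id, now=None):
        now = time.time() if now is None else now
        self.zset[request_id] = now                  # ZADD
        cutoff = now - self.window
        self.zset = {m: s for m, s in self.zset.items()
                     if s > cutoff}                  # ZREMRANGEBYSCORE
        return len(self.zset) <= self.limit          # ZCARD vs limit
```

Note that, as in the naive Redis pattern, a denied request is still recorded in the window; once enough time passes, old members age out and requests are allowed again.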

What it does not do well: Member values must be unique within a sorted set. If two tweets are scored identically (same millisecond timestamp) and you use the timestamp as both score and member, only one survives. Always use the unique identifier (tweet_id) as the member, and the sort key (timestamp) as the score.


Hashes

A map of field-value pairs stored under a single key. Think of it as an object or row: one Redis key, many fields.

Commands:

  • HSET key field value [field value ...] — set one or more fields
  • HGET key field — get one field
  • HMGET key field1 field2 ... — get multiple fields in one round-trip
  • HGETALL key — get all fields and values (avoid on large hashes)
  • HINCRBY key field delta — atomic increment of an integer field
  • HDEL key field [field ...] — delete fields
  • HEXISTS key field — check field existence without fetching value

When to use:

  • Cached user/session objects. HSET session:{token} user_id 42 name "Alice" role "admin" exp 1748390400. Individual fields can be updated without re-serializing the whole object: HINCRBY user:{id} login_count 1 increments just that counter.
  • Per-object counters. HINCRBY tweet:{tweet_id} like_count 1 updates a single counter in one atomic call. Batch several counters under one key: like_count, retweet_count, view_count.
  • Feature flags per user. HSET flags:{user_id} new_profile 1 dark_mode 0. Individual flags toggled independently without touching the whole flag set.
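The per-field counter pattern above reduces to one atomic call per field. A minimal in-memory sketch (plain Python, not a Redis client; the key and field names are illustrative):

```python
def hincrby(store, key, field, delta):
    """In-memory model of HINCRBY: increment one field of a hash.

    In Redis this is a single atomic command, so no read-modify-write
    cycle happens in application code.
    """
    obj = store.setdefault(key, {})         # the hash stored under one key
    obj[field] = obj.get(field, 0) + delta
    return obj[field]                       # HINCRBY returns the new value
```

Several counters (like_count, view_count) live under one key and each is updated independently, which is exactly what a serialized String cannot do.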

What it does not do well: Hashes with thousands of fields become slow to scan (HGETALL). For a fixed set of well-known fields on a known number of objects, Hashes are efficient. For dynamically growing collections of items (comments on a post, followers of a user), use a List or Set instead.


Sets

An unordered collection of unique strings. No duplicates, no scores, no order.

Commands:

  • SADD key member [member ...] — add members (idempotent for duplicates)
  • SREM key member [member ...] — remove members
  • SISMEMBER key member — O(1) membership test
  • SMEMBERS key — return all members (avoid on large sets in production)
  • SCARD key — count members
  • SUNION key [key ...] / SINTER key [key ...] / SDIFF key [key ...] — set operations

When to use:

  • Unique visitor tracking. SADD unique_visitors:{date} {user_id} then SCARD unique_visitors:{date} counts distinct visitors. Exact, but memory grows with cardinality. For approximate counts at very high cardinality (100M+ users), use HyperLogLog instead.
  • Tag and category membership. SADD tag:{tag_name} {article_id}. SINTER tag:redis tag:caching returns articles tagged with both.
  • Deduplication guards. SADD processed_events {event_id} before processing. SISMEMBER checks whether an event was already handled. Used to make idempotent consumers.
  • User permissions/roles. SADD user_permissions:{user_id} read write publish. SISMEMBER user_permissions:{user_id} delete checks a specific permission in O(1).
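The deduplication guard above can be sketched with a plain Python set standing in for the Redis Set (no Redis client; function and names are illustrative). In Redis, SADD itself returns 1 when the member is new and 0 when it already existed, giving an atomic check-and-mark:

```python
def process_once(seen, event_id, handler):
    """In-memory model of the SADD / SISMEMBER idempotent-consumer guard."""
    if event_id in seen:        # SISMEMBER: already handled, skip
        return False
    seen.add(event_id)          # SADD: mark as processed
    handler(event_id)
    return True
```

A redelivered event is detected in O(1) and the handler runs at most once per event_id.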

What it does not do well: Sets have no order. To retrieve items in insertion order or sorted order, use a List or Sorted Set. For very large membership sets (100M+ members), memory cost becomes significant.


Internal encoding

Redis uses a compact flat byte array called a ziplist for small collections, regardless of the data type. Once a collection exceeds a threshold (default 128 entries, or any single entry exceeds 64 bytes), Redis automatically upgrades to the full structure. The upgrade is irreversible: a sorted set that grows past 128 members becomes a skip list and stays a skip list even if members are later removed.

Sorted Set (ZSET) encoding: ziplist when N <= 128 AND every member <= 64 bytes (the initial encoding for small sets); skiplist + hashtable when N > 128 OR any member > 64 bytes (auto-upgrade once the threshold is exceeded; irreversible).

The same compact-to-full-structure promotion applies to Lists, but the upgrade path is different. Redis 7.0+ replaced ziplist with listpack throughout, but the threshold concept is identical:

List encoding: listpack when N <= 128 AND each entry <= 64 bytes (the initial encoding for small lists; a compact flat buffer using roughly 10x less memory than a quicklist); quicklist, a linked list of listpack (or, pre-7.0, ziplist) nodes, once the list grows beyond the threshold (auto-upgrade; irreversible).

For a list that stores 50 strings of under 20 bytes each, the listpack encoding uses a contiguous block of roughly 1.5 KB. The same data in a quicklist uses around 15 KB because each node adds pointer overhead. If your use case stores millions of small per-user lists (recent searches, notification IDs), keeping lists within the listpack threshold reduces Redis memory by a factor of 10.

Ziplist thresholds are the hidden memory tuning parameter

For workloads with millions of small sorted sets (per-user timelines, per-item leaderboards), the settings zset-max-ziplist-entries (default 128) and zset-max-ziplist-value (default 64 bytes) can be the difference between 10 GB and 100 GB of Redis RAM. A ziplist uses roughly 10x less memory than a skip list for the same data. If your sorted set members are short strings and you rarely exceed 128 entries, increasing these thresholds saves significant memory at the cost of slower linear scans within the ziplist. Hashes and Sets have their own threshold settings: hash-max-ziplist-entries and set-max-intset-entries.
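As a rough sketch of where these knobs live, the settings named above map to redis.conf directives. The values shown are the defaults from this section; verify the exact directive names against your Redis version's configuration reference, since 7.0+ introduced -listpack- names with the older -ziplist- names kept as aliases:

```
# redis.conf (illustrative; defaults shown)
zset-max-ziplist-entries 128   # above this, the sorted set becomes a skiplist
zset-max-ziplist-value   64    # any member longer than this (bytes) also upgrades
hash-max-ziplist-entries 128
hash-max-ziplist-value   64
set-max-intset-entries   512   # integer-only sets have their own threshold
```

Raising the entry thresholds trades slightly slower linear scans inside the compact encoding for a large reduction in per-structure memory.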


Choosing the right structure

Problem | Structure | Key command(s)
Cache a serialized object | String | SET / GET with EX
Atomic counter or rate limit | String | INCR / INCRBY
Recent-N items, insertion-ordered | List | LPUSH + LTRIM + LRANGE
Simple FIFO queue | List | RPUSH (enqueue) + LPOP (dequeue)
Timeline sorted by time | Sorted Set | ZADD (score=ts) + ZREVRANGE
Leaderboard by score | Sorted Set | ZADD GT + ZREVRANK
Sliding window rate limiter | Sorted Set | ZADD (score=ts) + ZREMRANGEBYSCORE
Object with independently updatable fields | Hash | HSET / HGET / HINCRBY
Unique membership / deduplication | Set | SADD / SISMEMBER
Set intersection (users in both groups) | Set | SINTER

Production usage

System / Use case | Structure | Notable behavior
Twitter timeline fan-out | Sorted Set keyed by user ID | Followers' timelines stored as sorted sets. ZADD timeline:{user} {tweet_id} {tweet_id} with score = tweet_id (Snowflake IDs are time-ordered, so score order matches time order). ZREVRANGE returns the latest tweets. ZREMRANGEBYRANK caps length.
Rate limiter (sliding window) | Sorted Set | Members are request timestamps, score = timestamp. ZREMRANGEBYSCORE evicts expired entries. ZCARD counts requests in the window. Used by Stripe, GitHub API.
Session storage | String with TTL | SET session:{token} {user_json} EX 3600. Atomic write plus expiry in one round-trip. Most web frameworks (Express, Rails) use Redis this way via session middleware.
Feature flags / user segments | Set | Store user IDs in a set per feature flag. SISMEMBER flag:dark_mode {user_id} in O(1) is faster than a database check per request.
Pub/Sub message broker | List (LPUSH/BRPOP) or Streams | Background job queues (Sidekiq, Bull) use LPUSH+BRPOP on a list. Redis Streams add consumer groups and at-least-once delivery for more reliable pipelines.

Limitations and when NOT to use Redis data structures

  • RAM cost scales with every key and value. All Redis data lives in memory. A sorted set timeline cache for 100 million users at 800 entries each holds 80 billion entries; even at a compact ~16 bytes per entry that is over 1 TB of RAM before any per-key overhead. For datasets that exceed available RAM, use Redis as a cache with eviction (LRU or LFU policy) backed by a disk store, not as the primary record of truth.
  • Sorted Set members must be unique within the set. Two events at the exact same millisecond timestamp can share the same score, but member values must differ. Always use the unique event ID as the member and the timestamp as the score. Using the timestamp as both score and member silently drops duplicates.
  • List lookup by value is always O(N). LRANGE returns elements by index range, not by value. Checking whether a specific element is in a list requires scanning the entire list. Use a Set for O(1) membership tests; use a Sorted Set for ordered collections where you need to look up by score, not insertion position.
  • String-serialized objects require full read-modify-write cycles. Storing a user object as a JSON string means updating one field requires GET, decode, modify field, encode, SET. Under concurrent writes, this is a race condition. Use a Hash so each field is updated atomically with HSET or HINCRBY without touching unrelated fields.
  • Redis is not a durable primary store without careful configuration. With persistence disabled (save "" and no AOF), all data is lost on restart. RDB snapshots can lose up to the last snapshot interval. AOF with appendfsync always provides full durability at a performance penalty. Do not store the authoritative copy of financial balances or inventory counts in Redis without a synchronized write to a durable store.
  • Sets and Sorted Sets are wrong for very-high-cardinality deduplication. A Set storing 1 billion user IDs for unique-visitor counting consumes roughly 40 GB of RAM. Use HyperLogLog for approximate unique counts at 12 KB per counter regardless of cardinality, or per-key TTL strings for time-windowed deduplication.
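The durability trade-off above maps to a handful of redis.conf directives. A hedged sketch (the directive names are standard Redis configuration; the values are illustrative choices, not recommendations for every workload):

```
# redis.conf durability knobs (illustrative)
appendonly yes            # enable the append-only file
appendfsync everysec      # common middle ground: fsync once per second
# appendfsync always      # full durability, at a large throughput cost
# save ""                 # disables RDB snapshots entirely
save 900 1                # otherwise: snapshot if >=1 change in 900s
```

Even with AOF everysec, up to one second of acknowledged writes can be lost on crash, which is why authoritative financial state belongs in a durable store.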

Quick decision guide

flowchart TD
  Q1{"What does the key store?"}
  Q1 -->|"A single value\nwith optional TTL"| Str["String\nSET / GET / INCR"]
  Q1 -->|"An integer counter\n(likes, views, rate limit)"| StrI["String + INCR\nAtomic, no locking needed"]
  Q1 -->|"An ordered collection\n(feed, timeline)"| Q2{"Ordered by insertion\nor by score?"}
  Q1 -->|"A map of named fields\n(user profile, counters)"| Hash["Hash\nHSET / HGET / HINCRBY"]
  Q1 -->|"Unique membership\n(deduplication, tags)"| Set["Set\nSADD / SISMEMBER"]
  Q2 -->|"Insertion order,\nfixed size cap"| List["List\nLPUSH + LTRIM + LRANGE"]
  Q2 -->|"Score-based order\n(timestamp, rank, rating)"| ZSet["Sorted Set\nZADD + ZREVRANGE + ZRANK"]

Interview cheat sheet

  • Name the specific structure, not just "Redis." Saying "a Redis sorted set keyed by user ID, with tweet_id as member and creation timestamp as score" is specific; "store the timeline in Redis" is not.
  • Sorted Set is the answer for timelines, leaderboards, and sliding window rate limiters. It maintains sorted order at O(log N) per insert.
  • ZADD is idempotent for the same member: adding a tweet_id that already exists just updates the score. That makes sorted-set fan-out safe under Kafka's at-least-once delivery, since a redelivered event overwrites the same member instead of duplicating it.
  • ZREMRANGEBYRANK key 0 -801 trims a sorted set to its 800 highest-scored members in one command. Use this to cap timeline length after every ZADD.
  • Hash objects let you increment individual counter fields atomically. HINCRBY tweet:{id} likes 1 does not require read-modify-write at the application layer.
  • List + LPUSH + LTRIM is the correct pattern for a fixed-size recent activity window. The trim is cheap and keeps memory bounded.
  • Set SISMEMBER for deduplication is O(1). If an idempotent consumer needs to check "did I already process event X?", a Redis Set is the simplest correct implementation.
  • MGET (String) and HMGET (Hash) batch multiple lookups into one round-trip. Use them for tweet content hydration instead of N individual GET calls.


Quick recap

  1. Redis String with INCR / INCRBY is the correct choice for any counter: like counts, view counts, rate limit windows. Single-threaded execution makes it inherently atomic.
  2. List plus LPUSH plus LTRIM is the correct pattern for any fixed-size recent-events window. Without LTRIM, the list grows unbounded and eventually OOMs Redis.
  3. Hash stores field-value pairs under one key. HINCRBY increments individual fields atomically without a read-modify-write cycle in application code. Best for objects with multiple counters.
  4. Sorted Set is the answer for timelines, leaderboards, and sliding window rate limiters. It maintains sorted order at O(log N) per insert and supports range queries by rank or score.
  5. Set membership checks are O(1) and idempotent. SISMEMBER for deduplication; SADD for adding; SINTERSTORE for intersection. Switch to per-key TTLs for large-scale deduplication with bounded time windows.
  6. MGET and HMGET batch N reads into one round-trip. Issuing N individual reads instead pays N sequential network round-trips, so total latency grows linearly with N even when each individual read is fast.

Related concepts

  • Caching — Redis data structures serve as cache layer primitives. Understanding eviction policies (LRU, LFU) explains how Redis manages memory when it stores tens of millions of timeline sorted sets.
  • Rate Limiting — Sliding window rate limiters use Sorted Set with timestamp as score and ZREMRANGEBYSCORE to evict old entries. This article's Sorted Set patterns map directly to production rate limiter implementations.
  • Databases — Choosing between Redis (in-memory, no durability guarantee) and a relational store (disk-backed, ACID) for a given data type is a fundamental design decision. Timelines tolerate Redis restarts (rebuild from DB); financial counters do not.
