Parts Catalog
Design a product compatibility system for an eCommerce site that lets users find which parts, accessories, or components are compatible with their specific product model, covering catalog modeling, search, and cross-referencing at millions of SKUs.
What is a parts compatibility catalog?
A parts compatibility catalog lets users find which parts fit their specific product (vehicle, appliance, or device model). The engineering challenge is not storing parts data; that part is easy.
It is the many-to-many compatibility relationship: a single part can fit thousands of vehicle configurations, and each vehicle can have thousands of compatible parts. Naive modeling creates hundreds of millions of fitment rows; smart attribute-based rules collapse that to a few million rule rows while still answering exact compatibility queries in under 100ms. I'd open an interview with exactly that framing because it signals you understand the real problem.
Functional Requirements
Core Requirements
- Users can search for parts compatible with a specific vehicle (year, make, model, submodel).
- Product pages show a "fits your vehicle" badge when a user's garage vehicle is set.
- Catalog managers can add, update, and remove fitment data.
- Results show only parts confirmed compatible with the exact vehicle configuration.
Below the Line (out of scope)
- Inventory management and stock levels
- Pricing and checkout
- Full-text search across part descriptions
The hardest part in scope: Fitment modeling. Explicitly storing one row per part-vehicle fitment pair for an automotive catalog produces hundreds of millions of rows. Attribute-based fitment rules collapse this dramatically while remaining queryable with a simple composite index lookup.
Inventory management is below the line because stock levels change independently of compatibility data. To add it: join the parts table with a separate inventory service and expose availability as a distinct API call. Keeping them decoupled prevents inventory writes from contending with the high-read compatibility path.
Pricing is below the line because it belongs to a separate pricing microservice with a part_id lookup. The catalog service stays read-only and does not need to understand promotional tiers or regional pricing rules.
Full-text search is below the line because it requires a dedicated search index. To add it: index part text fields into Elasticsearch, back the search service with the catalog database as source of truth, and run the two services independently. The catalog database is still the authoritative record; Elasticsearch is a derived read model.
Non-Functional Requirements
Core Requirements
- Scale: 10 million distinct parts, 300 million fitment relationships (or 5 million attribute-based fitment rules). 50 million DAU browsing products.
- Latency: Compatibility check returns under 100ms p99 end-to-end. Catalog updates reflected within 5 minutes.
- Availability: 99.9% uptime. Availability over consistency: a 5-minute lag on new fitment data is acceptable; returning wrong compatibility results is not.
- Read/write skew: Product page loads trigger compatibility checks far more than catalog managers add fitment data. The read path is the hot path.
Below the Line
- Sub-10ms compatibility checks (achievable with more aggressive caching but adds operational cost)
- Multi-region catalog replication
Read/write ratio: Every product page load triggers at least one compatibility check (read) for users with a saved garage vehicle. Catalog managers write fitment data in bulk imports, at most a few times per week. The read-to-write ratio is roughly 1,000:1. This makes the cache tier the most important component in the design, not the database. Almost every architectural decision in this article traces back to this number.
Core Entities
- Part: A catalog entry for a specific component or accessory (part_id, name, brand, part_number, category).
- Vehicle: A specific vehicle configuration (vehicle_id, year, make, model, submodel, engine).
- FitmentRule: The compatibility bridge between a Part and a set of Vehicles (either an explicit part_id + vehicle_id row, or an attribute-based rule row covering year_start, year_end, make, model, submodel).
- UserGarage: A user's saved vehicles (user_id, vehicle_id, nickname).
Schema design (column types, indexes, partitioning) is deferred to the deep dives. I'd keep the entity list this concise in an interview. These four entities are sufficient to drive the API design and High-Level Design.
API Design
FR1 + FR4: Search compatible parts for a vehicle:
GET /parts/compatible?year=2019&make=Ford&model=F-150&submodel=XLT&category=brake_pads
Response: { parts: [{ part_id, name, brand, part_number }], total, next_cursor }
Cursor-based pagination over offset: with 10 million parts, offset pagination requires scanning and discarding rows. Cursor-based pagination is O(1) per page regardless of depth.
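To make the offset-vs-cursor tradeoff concrete, here is a minimal keyset-pagination sketch using SQLite in place of the real catalog database. The table, column names, and page size are illustrative assumptions, not the article's actual schema.

```python
import sqlite3

# Hypothetical compatible-parts result set, ordered by part_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (part_id TEXT PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?)",
    [(f"part_{i:04d}", f"Brake Pad {i}") for i in range(1, 11)],
)

def fetch_page(cursor_after, page_size=3):
    # WHERE part_id > :cursor seeks directly via the primary-key index,
    # so no rows are scanned and discarded, unlike OFFSET pagination.
    rows = conn.execute(
        "SELECT part_id FROM parts WHERE part_id > ? ORDER BY part_id LIMIT ?",
        (cursor_after, page_size),
    ).fetchall()
    next_cursor = rows[-1][0] if rows else None
    return [r[0] for r in rows], next_cursor

page1, cur1 = fetch_page("")    # first page
page2, cur2 = fetch_page(cur1)  # resume from the opaque cursor
```

The `next_cursor` field in the API response is simply the last `part_id` on the page, which is why the query stays index-only at any depth.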
FR2: Check "fits your vehicle" for a specific part:
GET /parts/{part_id}/fits?vehicle_id=veh_382
Response: { fits: true, notes?: "Fits 2019-2021 F-150 XLT with 5.0L engine" }
This endpoint is the hot path. It fires on every product page load for users with a saved garage vehicle. It must complete under 10ms at the application layer to stay within the 100ms end-to-end budget. A direct DB query here is a trap.
FR3: Add fitment data (catalog manager / admin):
POST /fitment
Body: { part_id, rule: { year_start, year_end, make, model, submodel } }
Response: { fitment_id }
Use rule (attribute-based) as the default in the API contract, not an explicit vehicle_id. This forces catalog managers to think in rules from day one. If a single vehicle_id is needed, the rule degenerates to year_start = year_end = specific_year with exact make, model, submodel values.
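The degenerate single-vehicle case can be handled with a small normalization step at the API boundary. This is a sketch; the `year` shorthand field and the validation rules shown are assumptions, not part of the article's API contract.

```python
def normalize_fitment_rule(rule):
    """Coerce a fitment payload into attribute-rule form (sketch;
    a hypothetical 'year' shorthand degenerates to year_start == year_end)."""
    r = dict(rule)
    if "year" in r:
        r["year_start"] = r["year_end"] = r.pop("year")
    if r["year_start"] > r["year_end"]:
        raise ValueError("year_start must not exceed year_end")
    # None/NULL submodel means "all trims".
    r.setdefault("submodel", None)
    return r
```

Every write then lands in the `fitment_rules` table in the same shape, so the query path never needs a special case for single-year fitments.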
High-Level Design
1. Basic compatibility search (FR1 and FR4)
The read path is the system: a compatibility query must resolve to the correct list of compatible parts in under 100ms, even at 50 million DAU.
In the naive approach, a catalog manager stores one row per part-vehicle combination in a fitments table. A compatibility query selects all part_ids where vehicle_id matches. This works fine at thousands of parts.
Components (naive):
- Client: Web or mobile app. Sends a compatibility search with year, make, model, submodel.
- Catalog Service: Translates query params into a vehicle_id, queries fitments, returns the matching part list.
- PostgreSQL (fitments): Stores part_id + vehicle_id rows. An index on vehicle_id makes the lookup fast at small scale.
Request walkthrough (naive):
- Client sends GET /parts/compatible?year=2019&make=Ford&model=F-150&submodel=XLT.
- Catalog Service resolves the vehicle_id from the query parameters.
- Catalog Service queries: SELECT part_id FROM fitments WHERE vehicle_id = 'veh_382'.
- Catalog Service fetches part details for the returned part_id list.
- Response returns to the client.
At 300 million fitment rows, the index scan for one vehicle still returns thousands of part_id rows across a massive index tree. Query time climbs past 200ms at peak concurrent load. The database becomes the bottleneck before the application server does.
The fix is a cache. Pre-compute the compatible parts list for each vehicle_id and store it in Redis as a Set. A compatibility check becomes a single SMEMBERS call rather than a DB scan at every page load. I'd draw this cache addition on the whiteboard immediately after sketching the naive version.
Components (evolved):
- Redis (inverted index): One Redis Set per vehicle_id containing the compatible part_ids. Cache-aside pattern: populate on first miss, 24-hour TTL.
- PostgreSQL: Source of truth for cache misses. Not the hot path anymore.
Request walkthrough (evolved):
- Client sends the compatibility query.
- Catalog Service resolves the vehicle_id.
- Catalog Service calls SMEMBERS vehicle:veh_382 on Redis.
- On cache hit (expected 95%+), return the part list immediately without touching the DB.
- On cache miss, query PostgreSQL, populate the Redis Set, return the result.
Redis absorbs the read load. The database handles only cache misses and writes, which is exactly what you want from a 1,000:1 read-skewed system.
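The evolved read path above can be sketched as cache-aside logic. `FakeRedis` is a hypothetical in-memory stand-in for the two Redis set commands used, so the sketch runs without a live server; a real deployment would use a Redis client and set the 24-hour TTL, which is omitted here.

```python
class FakeRedis:
    # Minimal stand-in for the Redis set commands this path needs.
    def __init__(self):
        self.sets = {}
    def smembers(self, key):
        return self.sets.get(key, set())
    def sadd(self, key, *members):
        self.sets.setdefault(key, set()).update(members)
    def exists(self, key):
        return key in self.sets

def query_db_for_vehicle(vehicle_id):
    # Placeholder for the PostgreSQL fitment query on a cache miss.
    return {"veh_382": {"part_1", "part_2"}}.get(vehicle_id, set())

CACHE = FakeRedis()

def compatible_parts(vehicle_id):
    key = f"vehicle:{vehicle_id}"
    if CACHE.exists(key):                        # cache hit: no DB touch
        return CACHE.smembers(key)
    parts = query_db_for_vehicle(vehicle_id)     # cache miss: hit Postgres
    if parts:
        CACHE.sadd(key, *parts)                  # populate for later reads
    return parts
```

The first call for a vehicle misses and warms the Set; every subsequent call is pure memory, which is what lets the database drop out of the hot path.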
2. "Fits your vehicle" badge on product pages (FR2)
Product page compatibility checks are the hottest request in the system: every product page load for a logged-in user with a saved garage vehicle fires this check.
I'd call this out early in an interview. The GET /parts/{part_id}/fits?vehicle_id=veh_382 endpoint fires once per visible product on listing pages. A user scrolling through 30 brake pads triggers 30 back-to-back fits checks, each of which must complete under 10ms at the application layer.
The badge check is not a background task
The "fits your vehicle" badge is synchronous and in the critical render path. Users see a spinner or delayed badge if this is slow. Sub-millisecond response means Bloom filter or Redis, never a direct DB query per product.
Components:
- Client: Product listing or detail page. Fires one fits check per displayed part.
- Catalog Service: Checks whether part_id is a member of the vehicle's compatibility set.
- Bloom Filter (in-process): Per-vehicle bit array, approximately 7KB each. Rejects "definitely not compatible" parts in a single memory probe without any network call.
- Redis (inverted index): SISMEMBER vehicle:veh_382 part_id_123. Exact membership check in under 1ms.
Request walkthrough:
- Product page sends GET /parts/brake-pad-123/fits?vehicle_id=veh_382.
- Catalog Service consults the in-process Bloom filter for vehicle veh_382.
- If the Bloom filter says "definitely not in set," return { fits: false } immediately. No Redis call.
- If the Bloom filter says "possibly in set," call SISMEMBER vehicle:veh_382 brake-pad-123 on Redis.
- Redis returns 1 (fits) or 0 (does not fit) in under 1ms.
- On a Redis cache miss, evaluate rules against the DB and populate the Redis Set.
The Bloom filter handles "definitely not compatible" checks without touching Redis. For a user with a 2019 Ford F-150, the Bloom filter eliminates motorcycle parts, marine parts, and appliance parts in a single memory probe. On a mixed product listing page, this eliminates the vast majority of Redis calls at zero cost.
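A per-vehicle Bloom filter is small enough to build from scratch. This is a minimal sketch, not the article's exact implementation: sizing uses the standard formulas for n expected elements at false-positive rate p (about 6KB of bits for 5,000 elements at 1%, the same ballpark as the ~7KB figure above), and the k probe positions come from double hashing over a SHA-256 digest.

```python
import hashlib
import math

class BloomFilter:
    def __init__(self, n=5000, p=0.01):
        # Standard sizing: m = -n ln p / (ln 2)^2 bits, k = (m/n) ln 2 hashes.
        self.m = max(8, math.ceil(-n * math.log(p) / math.log(2) ** 2))
        self.k = max(1, round(self.m / n * math.log(2)))
        self.bits = bytearray((self.m + 7) // 8)

    def _positions(self, item):
        digest = hashlib.sha256(item.encode()).digest()
        # Double hashing: derive k positions from two 8-byte halves.
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # False => definitely not in set; True => possibly in set.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

bf = BloomFilter()
bf.add("brake-pad-123")
```

A `False` answer short-circuits the fits check in process memory; only `True` answers pay the Redis round-trip for the exact SISMEMBER verdict.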
3. Catalog updates and fitment data management (FR3)
When catalog managers add or change fitment data, the change must propagate to the Redis cache within 5 minutes, without blocking the write path.
Direct cache invalidation on every write would work at small write volumes. With bulk fitment imports (a single ACES file can contain 50,000-200,000 rules), synchronous invalidation for each affected vehicle would stall the import pipeline.
Components:
- Admin Client: Catalog manager UI or automated import pipeline.
- Catalog Service (write path): Validates and writes fitment rules to PostgreSQL.
- Kafka (fitment change events): Decouples the write from cache invalidation. Each write emits a fitment.updated event with the rule attributes.
- Cache Invalidation Worker: Consumes Kafka events, resolves affected vehicle_ids, deletes their Redis keys, and rebuilds Bloom filters.
Request walkthrough:
- Admin Client sends POST /fitment with a rule (year_start=2019, year_end=2021, make=Ford, model=F-150).
- Catalog Service validates the rule and writes it to the fitment_rules table.
- Catalog Service publishes a fitment.updated event to Kafka with the rule attributes.
- Cache Invalidation Worker consumes the event, resolves all vehicle_ids matching the rule, and calls DEL vehicle:veh_382 on each.
- The next compatibility request for any invalidated vehicle misses the cache and rebuilds the Redis Set from the updated fitment_rules table.
The Kafka event buffer means a bulk import of 50,000 rules emits 50,000 events that the worker drains at its own pace. Cache invalidation lag is bounded by the Kafka consumer throughput, typically a few seconds, well within the 5-minute NFR. I've seen catalog systems skip this async pattern and invalidate caches synchronously during imports. The result is a 30-minute import that hammers Redis with 50,000 DEL operations while serving live traffic.
Interview tip: name the invalidation pattern
When explaining catalog updates, call it "event-driven cache invalidation via Kafka." Say the worker resolves affected vehicle_ids from the rule attributes, then DELetes those Redis keys. This shows you understand that it is the vehicle's cache entry (not the part's) that gets invalidated, because the inverted index is keyed by vehicle.
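The worker's rule-to-vehicle resolution step can be sketched in a few lines. The dict shapes and the in-memory vehicle table are illustrative assumptions; in production this resolution is a query against the vehicle reference table, and the Redis DELete per vehicle_id is elided here.

```python
def rule_matches(rule, vehicle):
    """Does an attribute-based fitment rule cover this vehicle config?
    A missing/None attribute on the rule means 'all' (the NULL convention)."""
    return (
        rule["year_start"] <= vehicle["year"] <= rule["year_end"]
        and rule["make"] == vehicle["make"]
        and rule["model"] == vehicle["model"]
        and (rule.get("submodel") is None or rule["submodel"] == vehicle["submodel"])
    )

def affected_vehicle_ids(rule, vehicle_table):
    # These are the vehicle:{id} Redis keys the worker would DELete.
    return [v["vehicle_id"] for v in vehicle_table if rule_matches(rule, v)]

VEHICLES = [
    {"vehicle_id": "veh_382", "year": 2019, "make": "Ford", "model": "F-150", "submodel": "XLT"},
    {"vehicle_id": "veh_383", "year": 2022, "make": "Ford", "model": "F-150", "submodel": "XLT"},
    {"vehicle_id": "veh_500", "year": 2020, "make": "Honda", "model": "Civic", "submodel": "LX"},
]
rule = {"year_start": 2019, "year_end": 2021, "make": "Ford", "model": "F-150"}
```

Note that the rule identifies vehicles, not parts: it is the vehicle-keyed inverted-index entries that go stale when a rule changes.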
Potential Deep Dives
1. How do you model the fitment relationship without creating a billion-row table?
There are three constraints to design against: fitment data must be storable without combinatorial explosion, compatibility queries must resolve in under 100ms, and catalog managers need to express "fits Ford F-150 2019-2021 (all trims)" as a single entry, not thousands. The modeling choice here changes everything downstream.
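A minimal sketch of the attribute-rule query, using SQLite in place of PostgreSQL. The schema and sample rows are hypothetical; the point is the single-row-per-range storage, the NULL-means-"all trims" predicate, and the composite index that serves the lookup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fitment_rules (
        part_id TEXT, make TEXT, model TEXT,
        year_start INTEGER, year_end INTEGER, submodel TEXT
    )""")
# Composite index: narrows to (make, model) then range-scans the years.
conn.execute("CREATE INDEX idx_fit ON fitment_rules (make, model, year_start, year_end)")
conn.executemany("INSERT INTO fitment_rules VALUES (?, ?, ?, ?, ?, ?)", [
    ("brake-pad-123", "Ford",  "F-150", 2019, 2021, None),   # one row, all trims
    ("lift-kit-9",    "Ford",  "F-150", 2015, 2018, "XLT"),  # trim-specific
    ("wiper-7",       "Honda", "Civic", 2016, 2022, None),
])

def compatible_parts(year, make, model, submodel):
    return [row[0] for row in conn.execute(
        """SELECT part_id FROM fitment_rules
           WHERE make = ? AND model = ?
             AND year_start <= ? AND year_end >= ?
             AND (submodel IS NULL OR submodel = ?)""",
        (make, model, year, year, submodel),
    )]
```

One "fits Ford F-150 2019-2021 (all trims)" row answers queries for every year and trim in the range, which is exactly the collapse from explicit fitment pairs to rules.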
2. How do you serve compatibility checks at 50 million DAU without overwhelming the database?
At 50 million DAU, even a conservative assumption of 5 product views per session generates 250 million fits-check requests per day, or roughly 3,000 requests per second at steady state. A PostgreSQL read replica handles about 5,000 simple indexed queries per second at moderate row counts. Direct DB queries leave almost no headroom for traffic spikes.
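The back-of-envelope numbers above check out; here they are as explicit arithmetic, using the article's stated assumptions (50 million DAU, 5 product views per session, ~5,000 indexed queries per second per replica).

```python
dau = 50_000_000
views_per_user = 5
fits_checks_per_day = dau * views_per_user   # 250 million fits checks/day
steady_rps = fits_checks_per_day / 86_400    # seconds in a day -> ~2,900 rps
replica_capacity_rps = 5_000                 # simple indexed queries/sec
headroom = replica_capacity_rps - steady_rps # ~2,100 rps of slack at steady state
```

Since peak traffic commonly runs 2-3x steady state, that remaining slack disappears at peak, which is the argument for keeping the hot path out of the database entirely.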
Final Architecture
The three layers work together: attribute-based rules collapse 300 million explicit fitment rows to 5 million, Redis turns DB queries into sub-millisecond memory lookups, and the Bloom filter eliminates 90%+ of Redis calls for the negative case on product listing pages.
I'd emphasize this layered-caching story in an interview. Each layer handles a different class of request, and the system degrades gracefully if any layer fails.
Interview Cheat Sheet
- Start by naming the fitment modeling problem. The naive approach (one row per part-vehicle pair) produces 300 million rows; attribute-based rules collapse this to 5 million. This is the central question the interviewer is probing for.
- Attribute-based rules express "fits Ford F-150 2019-2021 (all trims)" as one row. The key SQL pattern: year_start <= :year AND year_end >= :year AND (submodel IS NULL OR submodel = :submodel). NULL means "all."
- A composite index on (make, model, year_start, year_end) turns a full table scan into a narrow range lookup. For 5 million rules, the query returns in under 5ms.
- State the read/write ratio (1,000:1) right after the NFRs. It explains every downstream decision: Redis inverted index, Bloom filter, separate read and write services.
- The Redis inverted index stores one Set per vehicle_id containing compatible part_ids. SISMEMBER for the fits-check endpoint, SMEMBERS for the full compatible-parts search. Both are sub-millisecond.
- Bloom filters eliminate 90%+ of Redis calls for the negative case. A user browsing a listing page sees mostly non-fitting parts; the Bloom filter rejects them in a memory probe with no Redis round-trip. A 5,000-element Bloom filter at 1% false positive rate uses 7KB.
- Cache invalidation: Kafka event on every fitment rule change, async worker resolves affected vehicle_ids and DELetes their Redis keys. The next request rebuilds the cache from the DB. Max lag is bounded by the Kafka consumer throughput (seconds, not minutes).
- Bulk ACES imports emit thousands of Kafka events. The worker drains them at its own pace without stalling the import pipeline. This is why the write path does not touch Redis directly.
- ACES/PIES is the automotive industry standard for fitment data interchange. Supplier ACES XML files map directly to the fitment_rules schema. Knowing this signals domain awareness beyond generic system design.
- A Bloom filter for 5,000 elements at a 1% false positive rate uses about 7KB. At 100,000 active vehicles, total Bloom filter memory is roughly 700MB in application process memory, well within a standard JVM or Node.js heap.
- The "fits your vehicle" badge endpoint is on the synchronous render path. Sub-millisecond response at the application layer means Bloom filter or Redis, never a direct DB call per product.
- Split read and write into separate services. The Compatibility Read Service scales horizontally for 1,000x the request volume. The Catalog Write Service handles low-frequency catalog imports with a single instance in most deployments.
- For a non-automotive catalog (appliances, electronics), the same schema works: replace year/make/model with the product's classification dimensions (brand, model_number, generation). The NULL-means-all convention handles universal accessories without any schema change.