A systematic approach to discovering classes and relationships from requirements, using noun extraction, responsibility testing, and relationship classification.
22 min read | 2026-04-04 | medium | interview-guide, entity-modeling, oop, methodology, lld
You just read "Design a library management system" on the whiteboard. You know you need classes. You know they need relationships. But which nouns become classes, which become fields, and which are just noise? Most candidates either freeze here or start typing class Book and hope inspiration strikes.
Entity discovery is the single most important skill in LLD interviews, and it is also the most under-practiced. Get the entities wrong and every pattern you apply later fights against the shape of your model. Get them right and the patterns almost pick themselves. This guide gives you a repeatable, mechanical process you can apply to any prompt in under five minutes.
Entities are the skeleton of your design. Every class you write, every interface you define, every relationship you draw flows from the entities you chose in the first five minutes. If you pick the wrong entities, you spend the rest of the interview patching a broken model.
I have watched candidates build beautiful Strategy patterns on top of a class model that had Book and Library as a single merged class. The pattern was correct. The entities underneath it were wrong. The interviewer saw it immediately and the conversation went sideways.
Think of entities like the columns in a database schema. Get the columns wrong and no amount of clever queries will save you. Get them right and most queries write themselves.
The good news: entity discovery is not a creative act. It is a mechanical extraction process. You read the requirements, highlight the nouns, filter out the noise, and what remains are your classes. The rest of this guide teaches you exactly that process.
Interview tip: make it visible
Write your entity list on the whiteboard before coding anything. Interviewers score "structured thinking" as a separate dimension. A visible entity list with attributes is evidence of that thinking, even before you write a single line of code.
Read the requirements and underline every noun or noun phrase. Do not judge yet. Do not ask "is this a class?" Just collect.
Here is a sample requirement for a library system:
"The library has multiple branches. Each branch has a collection of books. Members can borrow up to 5 books at a time and must return them within 14 days. A librarian can add or remove books from the catalog. The system tracks fines for overdue books. Each book has a title, author, ISBN, and publication year."
Some nouns start as attributes but deserve promotion to full entities. The author example above is a good case. If books share authors and you need to query "all books by this author," then Author should be its own class with a name, biography, and a list of books. If authors are just a display string, keep it as a field.
The verb "borrow" also hides an entity. A borrowing event has a start date, due date, return date, and status. That is too much state for a simple field. It deserves its own class: Loan or BorrowRecord.
For your interview: always look for hidden entities in the verbs. "Reserve," "borrow," "pay," "fine" are all actions that carry state. When an action has a date, a status, or multiple attributes, it is probably an entity.
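To make the "borrow" example concrete, here is a minimal sketch of the verb promoted to a class. The field names, the plain-string IDs, and the choice to derive the status from the dates instead of storing it are assumptions for illustration; the 14-day period comes from the sample requirement.

```java
import java.time.LocalDate;

// Hypothetical sketch: the verb "borrow" promoted to an entity.
enum LoanStatus { ACTIVE, RETURNED, OVERDUE }

final class Loan {
    static final int LOAN_PERIOD_DAYS = 14; // from the sample requirement

    private final String memberId;  // plain IDs keep the sketch self-contained
    private final String isbn;
    private final LocalDate borrowDate;
    private final LocalDate dueDate;
    private LocalDate returnDate;   // null until the book comes back

    Loan(String memberId, String isbn, LocalDate borrowDate) {
        this.memberId = memberId;
        this.isbn = isbn;
        this.borrowDate = borrowDate;
        this.dueDate = borrowDate.plusDays(LOAN_PERIOD_DAYS);
    }

    void markReturned(LocalDate on) {
        this.returnDate = on;
    }

    LocalDate getDueDate() {
        return dueDate;
    }

    // Assumption: status is computed from the dates rather than stored.
    LoanStatus statusOn(LocalDate today) {
        if (returnDate != null) {
            return LoanStatus.RETURNED;
        }
        return today.isAfter(dueDate) ? LoanStatus.OVERDUE : LoanStatus.ACTIVE;
    }
}
```

Three dates and a status are already more state than a field on Book or Member can carry cleanly, which is the signal that the action deserves its own class.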
This three-step process takes under five minutes with practice. Do it on paper before you touch code.
This is the question that trips up most candidates. Is "Address" a class or a group of fields on Member? Is "Author" its own class or a String? There is a simple heuristic.
Promote to a class when any of these are true:
It has its own identity. Can two different instances of this thing exist with different attributes? Two authors can have the same name but different biographies. That is identity. A title string does not have identity.
It participates in multiple relationships. If Author is referenced by Book and also by Event (author signing events), it needs to be shared. Shared things need to be entities.
It has behavior. An Address that just holds street/city/zip is a value object (or even just fields). An Address that can validate itself, format for different locales, or calculate shipping zones has behavior. Behavior means class.
It changes independently. If the author's biography changes, should every book record update? If yes, Author is a shared entity. If no, it is just a copied string.
Here is the decision as a quick flowchart:
I use this decision tree in every LLD interview I coach. If even one question gets a "yes," promote the noun to a class. When in doubt, start with a class. It is easier to inline a class back into a field than to extract one later after you have written 200 lines of code.
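The four questions reduce to a single boolean OR. The sketch below writes that down; `NounCandidate` and its field names are hypothetical, invented only to mirror the checklist, and you would not write this class in an interview.

```java
// Hypothetical checklist type; the four flags mirror the four questions above.
final class NounCandidate {
    final String name;
    final boolean hasIdentity;
    final boolean inMultipleRelationships;
    final boolean hasBehavior;
    final boolean changesIndependently;

    NounCandidate(String name, boolean hasIdentity, boolean inMultipleRelationships,
                  boolean hasBehavior, boolean changesIndependently) {
        this.name = name;
        this.hasIdentity = hasIdentity;
        this.inMultipleRelationships = inMultipleRelationships;
        this.hasBehavior = hasBehavior;
        this.changesIndependently = changesIndependently;
    }

    // One "yes" is enough to promote the noun to a class.
    boolean shouldPromote() {
        return hasIdentity || inMultipleRelationships || hasBehavior || changesIndependently;
    }
}
```

For example, an author with a biography that books share answers "yes" to identity and independent change, so it is promoted. A title answers "no" to all four and stays a field.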
The String trap
The most common entity-modeling mistake is representing complex concepts as Strings. String status should be an enum. String address should be a value object. String author might need to be a class. If you find yourself writing String for something with rules or validation, stop and reconsider.
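A short sketch of the fix, assuming Java 16 or later for the record syntax. The validation rule on `Address` is an assumption made up for illustration; the point is only that rules now live in one place.

```java
// Instead of "String status": a closed set of values the compiler can check.
enum BookStatus { AVAILABLE, CHECKED_OUT, RESERVED, LOST }

// Instead of "String address": a value object that validates itself.
// The "all parts required" rule is an assumed example, not a requirement.
record Address(String street, String city, String zip) {
    Address {
        if (street == null || street.isBlank()
                || city == null || city.isBlank()
                || zip == null || zip.isBlank()) {
            throw new IllegalArgumentException("street, city, and zip are required");
        }
    }
}
```

With the String version, a typo like "AVAILBLE" compiles and fails silently at runtime. With the enum, the typo either fails to compile or is rejected by `BookStatus.valueOf` with an `IllegalArgumentException`.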
Every relationship needs a count on both sides. Ask: "How many Bs can one A have? How many As can one B belong to?"
| Relationship | Multiplicity | Reasoning |
| --- | --- | --- |
| Library to Branch | 1 to many | One library, multiple branches |
| Branch to Book | 1 to many | One branch holds many books |
| Member to Loan | 1 to many | One member, up to 5 active loans |
| Loan to Book | 1 to 1 | Each loan is for exactly one book |
| Loan to Fine | 1 to 0..1 | A loan may or may not generate a fine |
Put all of this into a class diagram and you have your entity model:
That diagram took under five minutes to derive and it tells the interviewer everything: you know the domain, you understand relationships, and you can reason about ownership.
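Multiplicities are not only diagram decoration; they become field types and guard clauses. Here is a rough sketch of two rows from the multiplicity table, "Member to Loan" (1 to many, capped at 5) and "Loan to Fine" (1 to 0..1). The method names and the cents-based `Fine` are assumptions for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

final class Fine {
    final long amountCents; // assumed representation of the fine amount
    Fine(long amountCents) { this.amountCents = amountCents; }
}

final class Loan {
    final String isbn;  // each loan is for exactly one book
    private Fine fine;  // 0..1: stays null unless a fine is charged

    Loan(String isbn) { this.isbn = isbn; }

    void charge(Fine fine) { this.fine = fine; }

    Optional<Fine> getFine() { return Optional.ofNullable(fine); }
}

final class Member {
    static final int MAX_ACTIVE_LOANS = 5; // from the requirement

    private final List<Loan> activeLoans = new ArrayList<>(); // 1 to many

    Loan borrow(String isbn) {
        if (activeLoans.size() >= MAX_ACTIVE_LOANS) {
            throw new IllegalStateException("loan limit reached");
        }
        Loan loan = new Loan(isbn);
        activeLoans.add(loan);
        return loan;
    }

    List<Loan> getActiveLoans() {
        return Collections.unmodifiableList(activeLoans);
    }
}
```

A "many" side turns into a collection, an upper bound turns into a check before adding, and an optional side turns into a nullable field exposed as `Optional`.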
"Build a library management system. The library has branches in different locations. Each branch maintains a catalog of books. Members register with the library and can borrow books from any branch. There is a limit of 5 books per member. Books must be returned within 14 days or a fine is charged. Librarians manage the inventory. Members can also reserve books that are currently checked out."
That is the right number. I have seen candidates stop at three (Book, Member, Library) and candidates go to fifteen. Seven to nine is the sweet spot for a 45-minute library problem.
For each entity, list 3-5 attributes. No more. You can always add fields later.
```java
import java.time.LocalDate;
import java.util.List;

// Each public class would live in its own file.
public class Book {
    private String isbn;
    private String title;
    private Author author;
    private int publicationYear;
    private BookStatus status; // AVAILABLE, CHECKED_OUT, RESERVED, LOST
}

public class Member {
    private String memberId;
    private String name;
    private String email;
    private List<Loan> activeLoans;
    private List<Reservation> reservations;
}

public class Loan {
    private Member member;
    private Book book;
    private LocalDate borrowDate;
    private LocalDate dueDate;
    private LocalDate returnDate;
    private LoanStatus status; // ACTIVE, RETURNED, OVERDUE
}

public class Reservation {
    private Member member;
    private Book book;
    private LocalDate reservedDate;
    private LocalDate expiryDate;
    private ReservationStatus status; // PENDING, FULFILLED, EXPIRED, CANCELLED
}
```
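Notice the pattern: an identity (an ID field such as isbn or memberId, or the member-book pair for Loan and Reservation), a status enum wherever the entity moves through states, and 2-4 domain-relevant fields. This is a reliable template.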
Interview tip: use enums for status
Whenever you see an entity that can be in different states (a loan that is active, returned, or overdue), model that as an enum. It shows the interviewer you think about state management, and it opens the door for the State pattern discussion later.
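One way to set up that State pattern conversion is to let the enum know its own legal transitions. This is a minimal sketch; the transition rules below are assumed for illustration and do not come from the requirements.

```java
import java.util.EnumSet;
import java.util.Set;

enum LoanStatus {
    ACTIVE, RETURNED, OVERDUE;

    // Assumed rules: an active loan can be returned or go overdue,
    // an overdue loan can only be returned, and RETURNED is terminal.
    Set<LoanStatus> nextStates() {
        switch (this) {
            case ACTIVE:
                return EnumSet.of(RETURNED, OVERDUE);
            case OVERDUE:
                return EnumSet.of(RETURNED);
            default:
                return EnumSet.noneOf(LoanStatus.class);
        }
    }

    boolean canMoveTo(LoanStatus target) {
        return nextStates().contains(target);
    }
}
```

If the interviewer later asks for per-state behavior (different fine rules for overdue loans, say), each constant is a natural seed for a state class.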
| Entity pair | Do they change together? | Decision |
| --- | --- | --- |
| Book and Author | No. Author's biography changes independently of book data. | Separate classes. |
| Loan and Fine | Partially. A fine only exists because of a loan, but fine calculation rules change independently (e.g., new fine policy). | Separate classes, composed. |
| Member and Librarian | They share identity fields but have different permissions and operations. | Separate classes (or Librarian extends Member). |
| Branch and Library | A branch's catalog changes independently of other branches. Library just groups them. | Separate classes, composed. |
| Loan and Reservation | Different lifecycles. A reservation becomes a loan but they track different data. | Separate classes. |
If you find two entities that always change together and share the same lifecycle, merge them. For instance, if Address only ever appears as part of Branch and never independently, it might just be fields on Branch rather than its own class.
The inverse mistake is equally common: keeping things merged that should be separate. If you have a Book class with borrowerName, borrowDate, and dueDate fields directly on it, you have merged Book with Loan. Split them. A book can exist without being borrowed.
A common mistake is creating HardcoverBook, PaperbackBook, and AudioBook as separate classes before you know whether the system treats them differently. Premature inheritance creates rigid hierarchies. Start with a single Book class and a BookFormat enum. Only split into subclasses when the behavior genuinely differs (e.g., AudioBook has a duration field and a streaming method that the others do not).
The rule of thumb: if the only difference is data (fields), use composition or an enum. If the difference is behavior (methods), consider inheritance.
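A minimal sketch of that rule of thumb. Format is only data, so it is an enum on Book. AudioBook earns a subclass only because it adds a method the others lack. The `startStreaming` name and its string result are placeholders, not a real streaming API.

```java
enum BookFormat { HARDCOVER, PAPERBACK, AUDIO }

class Book {
    private final String title;
    private final BookFormat format; // a data-only difference: no subclass needed

    Book(String title, BookFormat format) {
        this.title = title;
        this.format = format;
    }

    String getTitle() { return title; }

    BookFormat getFormat() { return format; }
}

// Justified only because the behavior differs, not just the data.
class AudioBook extends Book {
    private final int durationMinutes;

    AudioBook(String title, int durationMinutes) {
        super(title, BookFormat.AUDIO);
        this.durationMinutes = durationMinutes;
    }

    // Placeholder for real streaming logic.
    String startStreaming() {
        return "Streaming " + getTitle() + " (" + durationMinutes + " min)";
    }
}
```

Hardcover and paperback copies are both plain `Book` instances with different `BookFormat` values, so adding a new format later is a one-line enum change instead of a new class.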
The opposite problem is stuffing everything into three mega-classes: Library, Book, User. A class with 20 fields and 15 methods is a sign you missed entities. If your User class has borrowBook(), returnBook(), payFine(), reserveBook(), addBook(), and removeBook(), you have merged Member and Librarian concerns.
Break it up. Each class should have one clear reason to exist.
Names matter more than candidates think. Common naming mistakes:
| Bad name | Problem | Better name |
| --- | --- | --- |
| Data | Meaningless | Name it by what data: BookRecord, LoanDetails |
| Manager | Vague, becomes a god class | LoanService, CatalogManager with a focused scope |
| Info | Same as Data | MemberProfile, BookMetadata |
| Helper / Util | Dumping ground | Move methods to the entity they operate on |
| Object suffix | Redundant (everything is an object) | Drop it: Book not BookObject |
Your class names are your first communication to the interviewer. BorrowRecord tells them you understand the domain. DataObject1 tells them you do not.
When you see a concept with a fixed set of values (book status, loan state, member type), model it as an enum. Not as a String, not as an int, not as a boolean. Enums are self-documenting and type-safe.
```java
public enum BookStatus {
    AVAILABLE,
    CHECKED_OUT,
    RESERVED,
    LOST,
    UNDER_MAINTENANCE
}
```
This is a tiny thing that sends a strong signal. Interviewers notice it.
"Members borrow books" does not mean Member has a direct reference to Book. The borrowing action itself (the Loan) is an entity. Every time two entities interact through a time-bound, stateful process, there is a "through" entity hiding in the verb.
Other examples: Enrollment between Student and Course. Payment between Customer and Order. Appointment between Doctor and Patient. Miss these and your model becomes a tangled web of direct many-to-many references.
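The same shape applies outside the library domain. Below is a sketch of Enrollment as the through entity between Student and Course; every field and status value here is an assumption chosen for illustration.

```java
import java.time.LocalDate;

final class Student {
    final String name;
    Student(String name) { this.name = name; }
}

final class Course {
    final String code;
    Course(String code) { this.code = code; }
}

enum EnrollmentStatus { ENROLLED, DROPPED }

// The through entity: it owns the date and status that belong to
// neither Student nor Course on their own.
final class Enrollment {
    final Student student;
    final Course course;
    final LocalDate enrolledOn;
    private EnrollmentStatus status = EnrollmentStatus.ENROLLED;

    Enrollment(Student student, Course course, LocalDate enrolledOn) {
        this.student = student;
        this.course = course;
        this.enrolledOn = enrolledOn;
    }

    void drop() { status = EnrollmentStatus.DROPPED; }

    EnrollmentStatus getStatus() { return status; }
}
```

Student and Course never reference each other directly. The many-to-many link becomes two one-to-many links through Enrollment, which is also where the time-bound state lives.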
Entity discovery is a mechanical process, not a creative one. Extract nouns, filter noise, promote hidden entities from verbs.
Use the noun-extraction technique: read requirements, highlight nouns, apply three filters (too vague, primitive attribute, duplicate).
Promote a noun to a class when it has its own identity, participates in multiple relationships, has behavior, or changes independently.
Discover relationships by reading verbs and classifying ownership: association (knows about), aggregation (contains but does not own), composition (owns, dies together).
Apply the responsibility test to every entity pair: "Do these change for different reasons?" If yes, keep them separate.
Watch for through entities hiding in verbs. "Borrow," "reserve," "enroll," and "pay" almost always produce their own class.
Name your classes after domain concepts (Book, Loan, Reservation), not technical roles (Manager, Helper, Data).