Prompt contracts
Learn how structured goal-constraint-format-failure contracts eliminate vague agent tasks and boost first-attempt success rates from under 40% to above 85%.
TL;DR
- A prompt contract is a structured agreement (goal, constraints, format, failure conditions) that the agent drafts and the user approves before implementation starts.
- Without contracts, agents guess at implicit requirements. First-attempt success rates on non-trivial tasks drop below 40%.
- The four sections of a contract: Goal (what to deliver), Constraints (technical boundaries), Format (output structure), Failure conditions (what counts as failure).
- Contracts work like a freelancer's scope of work. They force the agent to surface assumptions before burning tokens on the wrong thing.
- The tradeoff: contracts add 15-30 seconds of upfront review time per task, but eliminate 2-5 revision cycles that each cost minutes and tokens.
The Problem It Solves
You tell your agent "build me a beautiful marketing website." The agent picks Next.js, installs 14 dependencies, generates 800 lines of TypeScript, and produces something that looks like a generic Bootstrap template from 2019. You wanted a single-page HTML site with smooth animations. The agent guessed wrong on every major decision, and now you're starting over.
This happens constantly. Vague tasks produce wildly inconsistent results because the agent fills in every ambiguity with its own implicit assumptions. Framework choice, visual style, file count, animation approach, section structure, responsive behavior: each of these is a decision point where the agent can go wrong. Six decision points with three plausible options each give 3^6 = 729 possible combinations, most of which aren't what you wanted.
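The 729 figure comes from independent choices compounding: three options at each of six decision points gives 3 to the power of 6 combinations. A quick sketch confirms the count. The option names are made up for illustration; only the six decision points come from the paragraph above.

```python
from itertools import product

# Hypothetical options for the six decision points named above.
DECISION_POINTS = {
    "framework": ["plain HTML", "Next.js", "Astro"],
    "visual_style": ["dark", "light", "brutalist"],
    "file_count": ["single file", "a few files", "full project"],
    "animation": ["CSS only", "JS library", "none"],
    "section_structure": ["three sections", "five sections", "seven sections"],
    "responsive": ["mobile-first", "desktop-first", "fixed width"],
}

# Every way the agent could fill in all six ambiguities at once.
combinations = list(product(*DECISION_POINTS.values()))
print(len(combinations))  # 3 ** 6 = 729
```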
I've watched teams lose entire afternoons to this loop: vague prompt, wrong output, frustrated correction, slightly-less-wrong output, more corrections. The fix isn't better prompting. The fix is a structured checkpoint before any implementation begins.
What Is It?
A prompt contract formalizes the "definition of done" before agent work starts. The agent drafts a structured specification (goal, constraints, expected output format, and explicit failure conditions), and the user approves it before implementation begins.
Think of it like hiring a contractor to renovate your kitchen. A good contractor doesn't just start ripping out cabinets. They walk through the space with you, write up a scope of work (materials, timeline, budget, what "done" looks like), and get your sign-off before touching anything. Prompt contracts are the scope-of-work step for AI agents.
The contract has exactly four sections. No more, no less. Each section answers a different question: Goal (what?), Constraints (within what boundaries?), Format (shaped how?), and Failure conditions (what does broken look like?).
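One way to picture the four sections is as a small data structure. This is a minimal sketch, not a prescribed schema: the class name `PromptContract`, its field names, and the `render` method are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PromptContract:
    goal: str                                                    # what?
    constraints: list[str] = field(default_factory=list)         # within what boundaries?
    output_format: list[str] = field(default_factory=list)       # shaped how?
    failure_conditions: list[str] = field(default_factory=list)  # what does broken look like?

    def render(self) -> str:
        """Plain-text version of the contract for the user to review."""
        def bullets(items: list[str]) -> str:
            return "\n".join(f"- {item}" for item in items)

        return "\n\n".join([
            f"GOAL\n{self.goal}",
            f"CONSTRAINTS\n{bullets(self.constraints)}",
            f"FORMAT\n{bullets(self.output_format)}",
            f"FAILURE CONDITIONS\n{bullets(self.failure_conditions)}",
        ])


contract = PromptContract(
    goal="A single-page marketing site with a modern, dark-themed design.",
    constraints=["No external frameworks or build tools"],
    output_format=["Five sections: Hero, About, Services, Testimonials, CTA"],
    failure_conditions=["Broken layout on mobile"],
)
print(contract.render())
```

Keeping the sections as separate fields rather than one free-text blob makes it harder for a draft to quietly omit one of them.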
How It Works
Step 1: The agent analyzes the request
Before drafting the contract, the agent performs a self-analysis step. This is not a pass-through; the agent actively decomposes the vague request into five categories:
- Stated requirements: What the user explicitly asked for
- Implicit assumptions: Decisions the agent is about to make on the user's behalf
- Decision points: Places where multiple valid approaches exist
- Failure modes: Things that could go wrong with any approach
- Taste-dependent choices: Aesthetic or stylistic decisions the agent can't make alone
This analysis is the critical step most implementations skip. Without it, the contract is just a reformatted version of the original request. I've seen agents draft contracts that simply restate the user's prompt in four sections, which completely misses the point. The contract's value comes from surfacing what the user didn't say.
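The five categories can be captured in a structure that also detects the restatement trap. This is a sketch under assumptions: `RequestAnalysis`, `surfaced_items`, and `is_restatement` are hypothetical names, and the example values are illustrative guesses at what an agent might surface for the marketing-site request.

```python
from dataclasses import dataclass, field, fields


@dataclass
class RequestAnalysis:
    stated_requirements: list[str] = field(default_factory=list)
    implicit_assumptions: list[str] = field(default_factory=list)
    decision_points: list[str] = field(default_factory=list)
    failure_modes: list[str] = field(default_factory=list)
    taste_dependent_choices: list[str] = field(default_factory=list)

    def surfaced_items(self) -> int:
        """Count everything the user did NOT say explicitly."""
        return sum(
            len(getattr(self, f.name))
            for f in fields(self)
            if f.name != "stated_requirements"
        )

    def is_restatement(self) -> bool:
        """True when the analysis only echoes the original request."""
        return self.surfaced_items() == 0


analysis = RequestAnalysis(
    stated_requirements=["marketing website", "beautiful"],
    implicit_assumptions=["Next.js with TypeScript"],
    decision_points=["framework vs. plain HTML", "CSS vs. JS animations"],
    failure_modes=["broken layout on mobile"],
    taste_dependent_choices=["dark or light theme"],
)
print(analysis.is_restatement())  # False: four categories surfaced something
```

An orchestrator could refuse to draft a contract while `is_restatement()` is true and ask the agent to analyze the request again.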
Step 2: The agent drafts the four-section contract
Each section serves a distinct purpose:
Goal is a single sentence describing the deliverable. Not a paragraph, not a bulleted wish list. One sentence that both parties can point to and say "this is what we're building." Example: "A single-page marketing site for LeftClick.ai with a modern, dark-themed design."
Constraints are technical boundaries and restrictions. These are the guardrails that prevent the agent from over-engineering or choosing the wrong tools. Examples: "Under 500 lines of HTML," "No external frameworks or build tools," "Mobile responsive down to 375px," "Smooth scroll animations using CSS only."
Format describes the expected output structure. This is where you specify what sections to include, what file types to produce, and how the deliverable should be organized. Examples: "Five sections in this order: Hero, About, Services, Testimonials, CTA," "Subtle fade-in animations on scroll," "Hover states on all interactive elements."
Failure conditions are the most valuable part of the contract. They explicitly define what counts as failure, so the agent has objective criteria to validate against instead of subjectively deciding the output is "good enough." Examples: "Looks like a generic Bootstrap template," "Broken layout on mobile," "Animations are janky or cause layout shifts," "File exceeds 500 lines."
The failure conditions section is what separates prompt contracts from vague planning. Without it, the agent has no definition of done.
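The rules in this step (exactly four sections, a one-sentence goal, failure conditions always present) are simple enough to check before the draft reaches the user. Below is a rough sketch; `lint_contract` and the dictionary keys are assumptions, and the sentence-splitting regex is a crude heuristic, not a real sentence tokenizer.

```python
import re

REQUIRED_SECTIONS = ("goal", "constraints", "format", "failure_conditions")


def lint_contract(contract: dict) -> list[str]:
    """Return a list of structural problems; an empty list means the draft is well-formed."""
    problems = []
    for section in REQUIRED_SECTIONS:
        if not contract.get(section):
            problems.append(f"missing or empty section: {section}")
    extra = set(contract) - set(REQUIRED_SECTIONS)
    if extra:
        problems.append(f"unexpected sections: {sorted(extra)}")
    goal = contract.get("goal", "")
    # Heuristic: a sentence boundary is ., ! or ? followed by whitespace.
    if isinstance(goal, str) and len(re.split(r"(?<=[.!?])\s+", goal.strip())) > 1:
        problems.append("goal should be a single sentence")
    return problems


draft = {
    "goal": "A single-page marketing site for LeftClick.ai with a modern, dark-themed design.",
    "constraints": ["Under 500 lines of HTML", "No external frameworks or build tools"],
    "format": ["Five sections in this order: Hero, About, Services, Testimonials, CTA"],
    "failure_conditions": ["Broken layout on mobile", "File exceeds 500 lines"],
}
print(lint_contract(draft))  # []
```

A lint pass like this checks only the shape of the contract. Whether the clauses are the right ones is still the user's call in the review step.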
Step 3: User reviews and approves
The user reads the contract and does one of three things: approve it as-is, modify specific sections, or reject it and rephrase the original request. This step typically takes 15-30 seconds for a well-drafted contract.
The review step catches misalignment before any implementation tokens are spent. If the agent's contract reveals it was planning to use React when you wanted plain HTML, you catch that in 15 seconds instead of after 3 minutes of code generation.
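The three review outcomes map naturally onto a small decision type. This is a sketch, assuming the contract is held as a dictionary of sections; `Decision`, `Review`, and `apply_review` are hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Decision(Enum):
    APPROVE = "approve"
    MODIFY = "modify"
    REJECT = "reject"


@dataclass
class Review:
    decision: Decision
    changes: dict = field(default_factory=dict)  # section name -> replacement value


def apply_review(contract: dict, review: Review) -> Optional[dict]:
    """Return the contract to implement, or None if the request must be rephrased."""
    if review.decision is Decision.REJECT:
        return None
    if review.decision is Decision.MODIFY:
        unknown = set(review.changes) - set(contract)
        if unknown:
            raise KeyError(f"review edits unknown sections: {sorted(unknown)}")
        return {**contract, **review.changes}
    return dict(contract)  # approved as-is


draft = {"goal": "A single-page marketing site.", "constraints": ["Use React"]}
review = Review(Decision.MODIFY, {"constraints": ["No external frameworks or build tools"]})
approved = apply_review(draft, review)
print(approved["constraints"])  # ['No external frameworks or build tools']
```

Here the React assumption is corrected by editing one section, without discarding the rest of the draft.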
Step 4: Agent implements with clear spec
Once approved, the agent treats the contract as its specification. Every decision maps back to a contract clause. If the agent faces an ambiguous choice during implementation, it checks the contract. If the contract doesn't cover it, the agent flags it rather than guessing.
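The "check the contract, otherwise flag" rule can be sketched as a lookup that never falls back to a default. The assumption here, not stated in the article, is that approved clauses are indexed by topic; `Ruling` and `consult_contract` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Ruling:
    topic: str
    clause: Optional[str]  # None means the contract is silent on this topic

    @property
    def needs_user_input(self) -> bool:
        return self.clause is None


def consult_contract(topic: str, clauses: dict[str, str]) -> Ruling:
    """Look up the governing clause; deliberately has no fallback guess."""
    return Ruling(topic=topic, clause=clauses.get(topic))


# Hypothetical topic index built from the approved contract.
clauses = {
    "framework": "No external frameworks or build tools",
    "animation": "Smooth scroll animations using CSS only",
}

print(consult_contract("animation", clauses).clause)           # covered: follow the clause
print(consult_contract("fonts", clauses).needs_user_input)     # True: flag it, don't guess
```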
Step 5: Agent validates against failure conditions
After implementation, the agent runs through the failure conditions as a checklist. Does it look like a Bootstrap template? Is it broken on mobile? Are animations janky? If any failure condition is triggered, the agent fixes the issue before presenting the result.
This validation step is mechanical, not subjective. The agent doesn't decide if the output is "pretty good." It checks explicit conditions with binary pass/fail results.
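A binary checklist can be sketched as a mapping from each failure condition to a function that returns True when the condition is triggered. The sketch covers only conditions that can be measured from the file itself. The viewport check is an assumed, crude proxy for mobile breakage, and a condition such as "looks like a generic Bootstrap template" would need a human or a separate judging step that is not shown here.

```python
from typing import Callable


def line_count_exceeds(limit: int) -> Callable[[str], bool]:
    def check(html: str) -> bool:
        return len(html.splitlines()) > limit
    return check


def missing_viewport_meta(html: str) -> bool:
    # Assumed proxy only: real mobile validation needs rendering at 375px.
    return 'name="viewport"' not in html


FAILURE_CHECKS: dict[str, Callable[[str], bool]] = {
    "File exceeds 500 lines": line_count_exceeds(500),
    "No viewport meta tag": missing_viewport_meta,
}


def validate(html: str, checks: dict[str, Callable[[str], bool]]) -> dict[str, bool]:
    """Map each failure condition to True if it was triggered."""
    return {name: check(html) for name, check in checks.items()}


def passed(results: dict[str, bool]) -> bool:
    return not any(results.values())


page = '<meta name="viewport" content="width=device-width">\n<h1>Hello</h1>'
results = validate(page, FAILURE_CHECKS)
print(results)
print(passed(results))
```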
For your interview: the key insight is that prompt contracts shift the cost of ambiguity from implementation time (expensive, token-heavy) to planning time (cheap, human-readable).
Implementation Sketch
This is a simplified implementation showing the core mechanism. In production, you'd integrate this into your agent framework (LangChain, CrewAI, or a custom orchestrator).
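As a stand-in illustration of that mechanism (a sketch written under assumptions, not the article's own listing), the five steps can be wired together with plain callables. Every name here is hypothetical, the model calls are replaced by stub functions, and `max_fix_rounds` is an added safeguard that the article does not mention.

```python
from typing import Callable, Optional

Contract = dict  # keys: goal, constraints, format, failure_conditions


def run_with_contract(
    request: str,
    draft: Callable[[str], Contract],                    # steps 1-2: analyze and draft
    review: Callable[[Contract], Optional[Contract]],    # step 3: None means rejected
    implement: Callable[[Contract], str],                # step 4
    validate: Callable[[str, Contract], list],           # step 5: triggered conditions
    fix: Callable[[str, list], str],
    max_fix_rounds: int = 3,
) -> Optional[str]:
    contract = review(draft(request))
    if contract is None:
        return None  # user rejected the draft and will rephrase the request
    output = implement(contract)
    triggered = validate(output, contract)
    rounds = 0
    while triggered and rounds < max_fix_rounds:
        output = fix(output, triggered)
        triggered = validate(output, contract)
        rounds += 1
    if triggered:
        raise RuntimeError(f"failure conditions still triggered: {triggered}")
    return output


# Stubs standing in for model calls and the human reviewer.
def draft(request: str) -> Contract:
    return {"goal": request, "constraints": ["Under 500 lines of HTML"],
            "format": ["Single HTML file"], "failure_conditions": ["File exceeds 500 lines"]}

def approve(contract: Contract) -> Optional[Contract]:
    return contract

def implement(contract: Contract) -> str:
    return "\n".join(["<p>line</p>"] * 600)  # deliberately too long

def validate(output: str, contract: Contract) -> list:
    return ["File exceeds 500 lines"] if len(output.splitlines()) > 500 else []

def fix(output: str, triggered: list) -> str:
    return "\n".join(output.splitlines()[:500])

result = run_with_contract("A single-page marketing site.", draft, approve, implement, validate, fix)
print(len(result.splitlines()))  # 500
```

Passing the steps in as callables keeps the control flow independent of any particular agent framework, so the same loop could wrap LangChain, CrewAI, or a custom orchestrator.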