Schema validation retry
Validate LLM outputs against a schema, feed validation errors back as context for retry, and propagate learned fixes across pipeline steps to prevent repeated failures.
TL;DR
- Schema validation retry parses LLM output against a strict schema (Pydantic, Zod, JSON Schema), and when validation fails, feeds the exact error message back to the model as context for a targeted retry.
- The retry prompt includes the original request, the invalid output, and the specific validation error ("Field 'age' expected int, got string '25'"), letting the model self-correct just the broken field instead of regenerating everything from scratch.
- Production pipelines using this pattern report 85-95% recovery rates within 2 retries, compared to 40-60% with blind regeneration (retrying without error context).
- Cross-step learning propagates fix patterns discovered in early pipeline steps to later steps: if step 3 learns that the model confuses null with the string "null", step 7's prompt pre-empts the same mistake.
- Retry budget: cap at 2-3 attempts. If schema validation fails 3 times, the model likely cannot produce valid output for this schema/prompt combination, and further retries waste tokens.
- Limitation: retry loops add latency (each retry is a full LLM call). For latency-sensitive paths, pair with a fallback to a simpler schema or cached default.
The Problem It Solves
Your agent extracts structured data from customer emails. The prompt says "return a JSON object with fields: name (string), email (string), priority (integer 1-5), issues (array of strings)." The model returns this:
{
  "name": "Sarah Chen",
  "email": "sarah@acme.com",
  "priority": "high",
  "issues": "Login broken, billing page 500 error"
}
Two fields are wrong. priority is the string "high" instead of an integer. issues is a flat string instead of an array. Your downstream code calls issues.forEach(...) and crashes. The customer ticket silently drops.
The naive fix is to retry: send the same prompt again and hope the model gets it right. But "hope" is not an engineering strategy. Without telling the model what went wrong, the second attempt has roughly the same probability of making the same mistakes. I've watched agents burn 4-5 retries on the same schema violation because the retry prompt was identical to the original.
The smarter fix: tell the model exactly what failed. "Field 'priority' expected integer 1-5, got string 'high'. Field 'issues' expected array of strings, got string." Now the model knows which fields to fix, and the retry prompt is different from the original. The recovery rate jumps from roughly 50% to around 90%.
What Is It?
Schema validation retry parses every LLM output against a strict schema before passing it downstream. When validation fails, the system constructs a retry prompt containing the original request, the invalid output, and the exact validation errors. The model uses this error context to self-correct the specific fields that failed rather than regenerating from scratch.
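The loop described above can be sketched in a few lines. This is a minimal, dependency-free illustration, not a definitive implementation: call_model, simple_validate, and the stub model are hypothetical stand-ins for your LLM client and a real schema validator, and the retry prompt here is a condensed version of the one built in Step 3.

```python
import json
from typing import Callable, Optional


def generate_with_retry(
    call_model: Callable[[str], str],   # hypothetical: prompt -> raw LLM text
    validate: Callable[[str], tuple],   # hypothetical: raw -> (result, errors)
    prompt: str,
    max_attempts: int = 3,
) -> Optional[dict]:
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = call_model(current_prompt)
        result, errors = validate(raw)
        if not errors:
            return result
        # Feed the invalid output and the exact errors back to the model.
        error_block = "\n".join(f"  - {e}" for e in errors)
        current_prompt = (
            f"{prompt}\n\nYour previous output was invalid:\n{raw}\n\n"
            f"Validation errors:\n{error_block}\n\n"
            "Return the complete corrected JSON object."
        )
    return None  # budget exhausted; the caller decides on a fallback


def simple_validate(raw: str) -> tuple:
    """Toy validator for the sketch: checks only the 'priority' field."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as e:
        return None, [f"Invalid JSON at position {e.pos}: {e.msg}"]
    if not isinstance(data, dict):
        return None, ["Top-level value must be a JSON object"]
    priority = data.get("priority")
    if not isinstance(priority, int) or isinstance(priority, bool):
        return None, [f"Field 'priority': expected integer 1-5, got {priority!r}"]
    return data, []


def make_stub_model():
    """Stub LLM: returns a bad reply first, then a corrected one."""
    replies = iter(['{"priority": "high"}', '{"priority": 3}'])
    prompts_seen = []

    def call_model(prompt: str) -> str:
        prompts_seen.append(prompt)
        return next(replies)

    return call_model, prompts_seen


stub, prompts_seen = make_stub_model()
ticket = generate_with_retry(stub, simple_validate, "Extract the ticket as JSON.")
```

The second prompt the stub receives contains both the invalid output and the error text, which is the whole difference between this and blind regeneration.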
Think of it as a teacher grading an exam. A teacher who writes "WRONG" on every incorrect answer is useless. A teacher who writes "You used the formula for area instead of circumference" teaches the student exactly what to fix. The student corrects the specific mistake without redoing the entire exam. Schema validation retry is the detailed red ink, not the rejection stamp.
How It Works
Step 1: Strict schema definition
The schema is the contract between your LLM and your code. Define it with the strictest types your domain allows. Pydantic in Python and Zod in TypeScript are the standard tools.
from enum import Enum
from typing import List

from pydantic import BaseModel, Field


class Priority(int, Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    URGENT = 4
    CRITICAL = 5


class CustomerTicket(BaseModel):
    name: str = Field(min_length=1, max_length=200)
    # Loose shape check on the address, not full email validation
    email: str = Field(pattern=r'^[\w\.-]+@[\w\.-]+\.\w+$')
    priority: Priority  # rejects anything outside 1-5
    # On a list, min_length/max_length bound the number of items
    issues: List[str] = Field(min_length=1, max_length=10)
    summary: str = Field(min_length=10, max_length=500)
Every constraint you add to the schema is a constraint the validator enforces for free. min_length=1 on issues catches empty arrays. The regex on email catches malformed addresses. Priority as an enum rejects values outside 1-5. I've learned to be as specific as possible here because vague schemas produce vague validation errors, which produce vague retries.
The schema also serves as documentation. When another engineer reads Priority(int, Enum), they know exactly what values are valid without reading the prompt.
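To see what one of these constraints actually accepts, here is a small sketch that runs the email pattern from the schema through Python's standard re module. The sample addresses are made up for illustration, and Pydantic's own regex engine may differ in edge cases, so treat this as a quick sanity check rather than a statement about the validator's exact behavior.

```python
import re

# Same pattern as the email field in the schema above.
EMAIL_PATTERN = re.compile(r'^[\w\.-]+@[\w\.-]+\.\w+$')

# Illustrative inputs (not from any real dataset).
samples = [
    "sarah@acme.com",      # well-formed
    "sarah@acme",          # no dot after the @
    "sarah at acme.com",   # no @ at all
    "sarah@acme..com",     # doubled dot, still passes this loose pattern
]

results = {s: bool(EMAIL_PATTERN.search(s)) for s in samples}
for address, ok in results.items():
    print(f"{address!r}: {'accepted' if ok else 'rejected'}")
```

The last sample passing is a reminder that a regex like this catches obviously malformed output, not every invalid address. If that matters for your domain, tighten the pattern or use a dedicated email type.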
Step 2: Validation with actionable error messages
The validator returns structured error objects, not boolean pass/fail. Each error includes the field path, the expected type or constraint, the actual value received, and a human-readable message. This detail is what makes retries effective.
import json

from pydantic import ValidationError


def validate_and_extract_errors(raw_output: str, schema) -> tuple:
    """Parse LLM output, return (result, errors)."""
    # Stage 1: the output must be syntactically valid JSON
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError as e:
        return None, [f"Invalid JSON at position {e.pos}: {e.msg}"]
    # Stage 2: the parsed value must satisfy the schema
    try:
        result = schema.model_validate(parsed)
        return result, []
    except ValidationError as e:
        errors = []
        for err in e.errors():
            # loc is a tuple path, e.g. ("issues", 0) for the first list item
            field = " -> ".join(str(loc) for loc in err["loc"])
            errors.append(
                f"Field '{field}': expected {err['type']}, "
                f"got {repr(err['input'])}. {err['msg']}"
            )
        return None, errors
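With the CustomerTicket schema above, the error output looks something like: "Field 'priority': expected enum, got 'high'. Input should be 1, 2, 3, 4 or 5". The "expected" slot carries Pydantic's error-type code (enum here, int_parsing for a plain int field) rather than a Python type name, and the exact wording depends on the Pydantic version. This is actionable. The model reads it and knows to change "high" to an integer. Compare that to "Validation failed", which tells the model nothing.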
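Errors inside nested structures carry a multi-part path, which is why the formatter joins loc with " -> ". The sketch below runs the same formatting over hand-written sample errors shaped like the dictionaries Pydantic v2 returns from errors() (keys type, loc, msg, input). The sample values are illustrative, not captured from a real run.

```python
# Hand-written samples in the shape of Pydantic v2 error dicts (illustrative).
sample_errors = [
    {
        "type": "enum",
        "loc": ("priority",),
        "msg": "Input should be 1, 2, 3, 4 or 5",
        "input": "high",
    },
    {
        "type": "string_type",
        "loc": ("issues", 1),  # second element of the issues list
        "msg": "Input should be a valid string",
        "input": 500,
    },
]


def format_error(err: dict) -> str:
    """Turn one error dict into a single line the model can act on."""
    field = " -> ".join(str(part) for part in err["loc"])
    return f"Field '{field}': expected {err['type']}, got {err['input']!r}. {err['msg']}"


messages = [format_error(e) for e in sample_errors]
for line in messages:
    print(line)
```

The second message names 'issues -> 1', so the model can fix one list element instead of rewriting the whole array.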
Step 3: Error-informed retry prompts
The retry prompt is the key innovation. It includes three pieces of context: the original request, the model's failed attempt, and the specific validation errors. This gives the model a clear correction target.
def build_retry_prompt(original_prompt: str, failed_output: str,
                       errors: list[str], attempt: int) -> str:
    error_block = "\n".join(f" - {e}" for e in errors)
    return f"""{original_prompt}
Your previous attempt produced invalid output. Here is what you returned:
{failed_output}
The following validation errors were found:
{error_block}
Fix ONLY the fields mentioned above. Keep all other fields exactly as they were.
Return the complete corrected JSON object.
Attempt {attempt} of 3."""
Two important details. First, "Fix ONLY the fields mentioned" prevents the model from "helpfully" changing correct fields while fixing incorrect ones. Without this instruction, I've seen models rewrite perfectly valid fields during retry, introducing new errors while fixing old ones. Second, including the attempt counter creates implicit urgency. Some models produce more careful output when they know retries are limited.
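The "Fix ONLY the fields mentioned" instruction is a request, not a guarantee, so it can be worth checking whether the model honored it. Here is a minimal sketch of such a check. The unexpected_changes helper is hypothetical (it is not part of any library), and it only compares top-level fields.

```python
_MISSING = object()  # sentinel so a missing key differs from an explicit None


def unexpected_changes(failed: dict, retried: dict, flagged: set[str]) -> list[str]:
    """Return top-level fields that changed on retry without being flagged."""
    changed = []
    for key in sorted(set(failed) | set(retried)):
        if key in flagged:
            continue  # the validator asked for this field to change
        if failed.get(key, _MISSING) != retried.get(key, _MISSING):
            changed.append(key)
    return changed


failed = {
    "name": "Sarah Chen",
    "email": "sarah@acme.com",
    "priority": "high",
    "issues": "Login broken, billing page 500 error",
}
# Illustrative retry output: fixes the flagged fields but also rewrites name.
retried = {
    "name": "S. Chen",
    "email": "sarah@acme.com",
    "priority": 4,
    "issues": ["Login broken", "billing page 500 error"],
}

drift = unexpected_changes(failed, retried, flagged={"priority", "issues"})
print(drift)
```

What to do with drift is a design choice: you could restore the original values for unflagged fields, or simply log the drift and keep the retry output if it validates.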
Step 4: Temperature adjustment on retry
Lower the temperature on retry attempts. The first generation at temperature 0.7 explores the output space. Retries at temperature 0.1-0.3 reduce variance, making the model more likely to produce the conservative, schema-compliant output you need.
RETRY_TEMPERATURES = {1: 0.7, 2: 0.3, 3: 0.1}
The pattern: creative on the first attempt, precise on retries. This works because the first attempt establishes the content (the right names, the right summary), and the retries just fix the formatting. You don't want the model to generate entirely different content on retry, just fix the schema violations.
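One way to wire the schedule into a retry loop is a small lookup keyed by the 1-based attempt number. The temperature_for helper below is a hypothetical name for this sketch, and holding the lowest temperature past the end of the table is an assumption, not something the schedule itself specifies.

```python
RETRY_TEMPERATURES = {1: 0.7, 2: 0.3, 3: 0.1}


def temperature_for(attempt: int) -> float:
    """Look up the sampling temperature for a 1-based attempt number."""
    if attempt < 1:
        raise ValueError("attempt numbers start at 1")
    # Past the end of the table, stay at the last (lowest) configured value.
    last_configured = max(RETRY_TEMPERATURES)
    return RETRY_TEMPERATURES[min(attempt, last_configured)]


schedule = [temperature_for(n) for n in range(1, 5)]
print(schedule)
```

How the value reaches the model depends on your client. Most chat APIs expose a temperature parameter, but check your provider's documentation for its name and valid range.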
Step 5: Cross-step learning (the advanced pattern)
In multi-step pipelines, the same model makes the same mistakes across different steps. If step 3 discovers that the model outputs "null" (the string) instead of null (the JSON value), step 7 will likely hit the same issue. Cross-step learning shares these fix patterns across the pipeline.
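One possible shape for this, sketched under my own assumptions rather than taken from any specific implementation: a pipeline-wide memory that records a short hint whenever a retry fixes a recurring mistake, then prepends those hints to later prompts. FixMemory, its methods, and the step 7 prompt text are all hypothetical names for illustration.

```python
from collections import Counter


class FixMemory:
    """Pipeline-wide record of recurring validation mistakes (hypothetical helper)."""

    def __init__(self, min_occurrences: int = 1):
        self._counts = Counter()
        self._min_occurrences = min_occurrences

    def record(self, hint: str) -> None:
        """Call this when a retry succeeds after a specific, describable fix."""
        self._counts[hint] += 1

    def preamble(self) -> str:
        """Render the hints seen often enough as a block to prepend to a prompt."""
        hints = [h for h, n in self._counts.most_common() if n >= self._min_occurrences]
        if not hints:
            return ""
        lines = "\n".join(f"  - {h}" for h in hints)
        return f"Known pitfalls from earlier steps:\n{lines}\n\n"


memory = FixMemory()
# Step 3 hit the string-vs-null confusion and recovered on retry.
memory.record('Use the JSON value null, not the string "null".')
# Step 7 starts with the hint already in its prompt.
step7_prompt = memory.preamble() + "Extract the shipping details as JSON."
print(step7_prompt)
```

The min_occurrences threshold is there so that a one-off glitch does not bloat every later prompt. Raising it trades earlier prevention for shorter prompts.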