Human-in-the-loop

TL;DR

Agents make mistakes. Some mistakes are irreversible. Reversibility is the primary filter: irreversible actions always need an approval gate, reversible ones usually don't.
Classify every action as reversible, partially reversible, or irreversible before designing your HITL strategy. This classification drives every downstream design decision.
The approval gate pattern: agent builds a complete work plan, human reviews in plain English, human approves or edits, agent executes. One review per workflow, not one per action.
In LangGraph, compiling the graph with interrupt_before pauses execution before the named nodes, serializes state to the checkpointer, and returns without blocking any thread. On approval, resume with invoke(None, config) using the same thread_id.
Progressive autonomy: start with approval on everything, track the false positive rate per action type, expand autonomy incrementally where the data justifies it.
HITL is not a permanent safety crutch. It is a mechanism for building trust with evidence before removing oversight.

The problem it solves

A customer success agent processes 200 billing inquiries overnight. An edge case in the routing logic triggers a re-notification branch. By morning, 200 customers have each received a duplicate "Your invoice is overdue" email. Half of them already paid. Finance and customer success spend two days sending corrections, managing inbound complaints, and tracking down the root cause.

The direct cost was the duplicate emails. The real cost was the 40 hours of human remediation, the customer trust lost, and the compliance flag raised by a finance audit. None of this appeared in the demo run where the agent worked perfectly on the three test cases prepared for it.

This type of failure does not surface in controlled testing. It surfaces when the agent runs autonomously on real data with real edge cases, and when the action it takes has no undo button.

What is it?

Human-in-the-loop (HITL) is an architectural pattern where an AI agent pauses execution at defined checkpoints, presents its proposed actions to a human reviewer, and waits for explicit approval before continuing. Think of it like a surgeon requiring a second-surgeon sign-off before an irreversible procedure: the surgery proceeds with full speed and skill, but the point of no return requires a deliberate human decision.

The pattern is not about limiting agents. It is about placing the human decision point exactly where it provides maximum value: at the irreversible boundary.

How it works

Figure: Agent → Plan Builder → Human Reviewer → Executor → Completed Task. The HITL approval gate: agent builds a plan, human reviews and approves, executor runs the approved actions.

Action reversibility classification

Before designing the approval gates, classify every action the agent can take. This classification is the foundation of your entire HITL design.

Tier 1 (reversible): No external system changes permanently. A draft memo, a staging record, a temp file. Auto-execute with no approval needed. Mistakes cost nothing to fix.

Tier 2 (partially reversible): The change can be undone, but it requires effort. A CRM record update creates an audit trail and may trigger downstream notifications. Consider batch review at end of execution for high-stakes workflows. Mistakes cost engineer time.

Tier 3 (irreversible): No practical undo exists. An email sent is an email received. A payment processed has moved money. A production deployment is live to users. Always requires human approval before execution. Mistakes cost remediation time, trust, or legal liability.
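
The three tiers can be written down as a lookup table that the agent consults before acting. The sketch below is one possible encoding: the action names, the `review_policy` function, and the policy labels are hypothetical, and treating unknown actions as irreversible is an added assumption rather than something the tiers above specify.

```python
from enum import Enum


class Reversibility(Enum):
    REVERSIBLE = 1            # Tier 1
    PARTIALLY_REVERSIBLE = 2  # Tier 2
    IRREVERSIBLE = 3          # Tier 3


# Hypothetical action names mapped to tiers, following the examples in the text.
ACTION_TIERS = {
    "draft_memo": Reversibility.REVERSIBLE,
    "write_temp_file": Reversibility.REVERSIBLE,
    "update_crm_record": Reversibility.PARTIALLY_REVERSIBLE,
    "send_email": Reversibility.IRREVERSIBLE,
    "process_payment": Reversibility.IRREVERSIBLE,
    "deploy_to_production": Reversibility.IRREVERSIBLE,
}


def review_policy(action: str, high_stakes: bool = False) -> str:
    """Return 'auto', 'batch_review', or 'approve_first' for an action."""
    # Assumption: unclassified actions fail closed and are treated as Tier 3.
    tier = ACTION_TIERS.get(action, Reversibility.IRREVERSIBLE)
    if tier is Reversibility.IRREVERSIBLE:
        return "approve_first"
    if tier is Reversibility.PARTIALLY_REVERSIBLE and high_stakes:
        return "batch_review"
    return "auto"
```

Keeping the table separate from the policy function means a tier can be changed for one action type without touching the gating logic.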

The approval gate pattern

The most reliable implementation: the agent generates a complete work plan before taking any action, presents the full plan for human review, the human approves or edits, then the agent executes. This is plan-level review, not action-level review.

Plan-level review has two advantages. First, the human sees full context: what the agent plans to do and in what sequence, not just one action in isolation. This catches plan-level errors, such as individually correct actions placed in the wrong order. Second, the reviewer is interrupted once per workflow rather than once per action, so review overhead is O(1) instead of O(n) for a plan of n actions, which prevents alert fatigue.
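
A plan-level gate might look like the following sketch. `PlannedAction`, `render_plan`, `execute_with_gate`, and the reviewer callback are hypothetical names introduced for illustration; the reviewer is modeled as a plain function so the example stays self-contained, whereas a real reviewer would be a person responding through some interface.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PlannedAction:
    name: str               # hypothetical action identifier, e.g. "send_email"
    description: str        # plain-English summary shown to the reviewer
    run: Callable[[], str]  # the side effect, deferred until approval


def render_plan(plan: list[PlannedAction]) -> str:
    """Format the whole plan as numbered plain-English steps."""
    return "\n".join(f"{i}. {a.description}" for i, a in enumerate(plan, start=1))


def execute_with_gate(plan, review):
    """Ask for one decision on the whole plan, then run the approved actions.

    `review` receives the rendered plan and the plan itself, and returns the
    (possibly edited) list of actions to run, or None to reject everything.
    """
    approved = review(render_plan(plan), plan)
    if approved is None:
        return []  # rejected: nothing is executed
    return [action.run() for action in approved]


plan = [
    PlannedAction("update_crm_record", "Mark the invoice as paid in the CRM", lambda: "crm updated"),
    PlannedAction("send_email", "Email the customer an overdue notice", lambda: "email sent"),
]


def drop_emails(summary, actions):
    # Stand-in reviewer that edits the plan by removing the email step.
    return [a for a in actions if a.name != "send_email"]


results = execute_with_gate(plan, drop_emails)
```

The reviewer is called exactly once regardless of how many actions the plan contains, and no action's `run` is invoked until that single decision has been made.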
