A human approval step is the cheapest insurance policy in agent operations and the most over-applied governance pattern. Done well it catches the actions that should never happen and stays out of the way for everything else. Done badly it produces approval fatigue, rubber-stamping, and incidents anyway. This guide walks the production pattern: where to insert the gate, what the approver needs to see, how to time out gracefully, and the graduation rules that move trusted actions out of approval over time.
The framing builds on the four-level trust model in agent trust models. Approval gating is the operational mechanism for trust level 3 (approve-then-act). NIST's AI RMF and the OWASP LLM Top 10 both emphasise human oversight as a control surface (NIST AI RMF, retrieved 2026-05-09).
Where to insert the gate
The gate goes immediately before the irreversible action. Not at the end of the agent's plan; not at the start; not at every step. At the action.
Concretely, the agent's tool wrapper for a high-risk action does this:
- Receive the proposed action and arguments from the agent.
- Validate against allow-lists and policy.
- Submit the action to the approval queue with full context.
- Wait for approval, rejection, timeout, or escalation.
- Execute on approval; abort and report on rejection; escalate on timeout.
The agent's loop sees the approval response as a tool result. The agent does not run the approval logic itself; the orchestration layer does. This separation is what makes the approval auditable independently of the agent's reasoning.
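The wrapper sequence above can be sketched in a few lines. This is a minimal sketch, not a reference implementation: `policy`, `queue`, and `execute` are hypothetical interfaces supplied by the orchestration layer, and the return shapes are illustrative.

```python
from dataclasses import dataclass
from enum import Enum

class Decision(Enum):
    APPROVE = "approve"
    REJECT = "reject"
    TIMEOUT = "timeout"

@dataclass
class ApprovalResult:
    decision: Decision
    reason: str = ""

def gated_execute(action, args, policy, queue, execute):
    """Wrap a high-risk tool call behind an approval gate.

    The agent never calls this directly with its own approval logic;
    the orchestration layer owns policy, queue, and execution.
    """
    # 1. Validate against allow-lists and policy before queueing.
    if not policy.allows(action, args):
        return {"status": "blocked", "detail": "policy violation"}
    # 2. Submit to the approval queue with full context and wait.
    result = queue.submit_and_wait(action, args)
    # 3. Execute only on explicit approval; never on timeout.
    if result.decision is Decision.APPROVE:
        return {"status": "executed", "output": execute(action, args)}
    if result.decision is Decision.TIMEOUT:
        queue.escalate(action, args)
        return {"status": "escalated"}
    return {"status": "rejected", "detail": result.reason}
```

Whatever comes back is returned to the agent's loop as an ordinary tool result, which keeps the approval decision auditable outside the agent's reasoning.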
What counts as "high-risk" maps to the trust-level decision matrix in trust models: high blast radius, irreversible, or low frequency. Sending email externally, transferring money, deleting records, posting to a public channel, signing a document. Reading data, drafting text, internal computations: usually no gate.
The timeout pattern
Set a timeout that matches how long the action remains useful. Four example tiers:

| Action type | Useful window | Timeout |
|---|---|---|
| Customer email reply | ~24 hours before stale | 24 hours |
| Meeting scheduling | ~2-4 hours before slot disappears | 2-4 hours |
| Invoice payment | ~5 business days | 5 business days |
| Internal Slack triage | ~30 minutes | 30 minutes |
On timeout the agent must not auto-act. Auto-acting on timeout is the most common shortcut in badly designed approval systems and the most common source of incidents. The correct timeout behaviour:
- Log the timeout with full context.
- Escalate to a fallback approver (a manager, a shared queue).
- Downgrade the original approver's queue priority for future approvals.
- If the escalated fallback also times out, the action is dropped, not executed.
Auto-execute-on-timeout is acceptable only for the lowest-risk auto-graduated actions where the gate exists for surveillance rather than control. Even there, log the auto-execute distinctly from approved executions so the audit can distinguish them.
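The escalation chain can be sketched as a single handler. Assumptions: `request` is a dict with an `"action"` key, and `fallback_queue` and `wait_for_fallback` are hypothetical interfaces for the fallback approver's queue; a real system would also carry the downgrade of the original approver's queue priority, which is elided here.

```python
import logging

logger = logging.getLogger("approvals")

def on_timeout(request, fallback_queue, wait_for_fallback):
    """Timeout handling sketch: log, escalate, never auto-execute."""
    # Log the timeout with full context before anything else.
    logger.warning("approval timed out: %s", request["action"])
    # Escalate to the fallback approver (a manager, a shared queue).
    fallback_queue.submit(request)
    decision = wait_for_fallback(request)
    if decision == "approve":
        return "execute"
    # Fallback timeout or rejection: the action is dropped, never run.
    return "dropped"
```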
What approvers need to see
Approvers without context default to two failure modes: rubber-stamping (approving everything) and blanket rejecting (rejecting on principle). Both are dangerous. Five fields keep approvers in the productive middle:
- Agent identity and prompt version. Which agent is asking and which version of its prompt produced this action.
- Goal context. What was the user request, and what state has the agent established so far.
- Proposed action with full input arguments. Not a summary; the actual payload that will be sent if approved.
- Alternatives considered. What other actions the agent looked at and why it rejected them.
- Reversibility cost. If this action turns out to be wrong, what does undoing it look like.
The fifth field is the most under-implemented and the most useful. Approvers who can see "this email cannot be unsent; if wrong, you will need to send a follow-up apology" approve more carefully than approvers who see only the email body.
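The five fields translate directly into the shape of an approval request. A minimal sketch, assuming these field names (they are illustrative, not a fixed schema):

```python
from dataclasses import dataclass

@dataclass
class ApprovalRequest:
    """The five fields an approver needs to see; names are illustrative."""
    agent_id: str            # agent identity
    prompt_version: str      # which prompt version produced this action
    goal_context: str        # user request and state established so far
    action_payload: dict     # the actual arguments, not a summary
    alternatives: list       # actions considered and why they were rejected
    reversibility_cost: str  # what undoing a wrong action looks like
```

Making `reversibility_cost` a required field, rather than an optional note, is the design choice that forces the fifth field to actually be implemented.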
Preventing approval fatigue
Approval fatigue is the slow death of approval systems. The pattern: approvers see hundreds of items per day; quality of review drops; bad actions get approved; an incident happens; the team adds more approvals; fatigue worsens.
Three rules contain the spiral:
- Approve actions, not runs. An agent run that takes five steps with three high-risk actions should produce three approval requests, not one.
- Batch low-risk approvals. Twenty triage labels, one click. Twenty Slack thread reactions, one click. Group by action type, present in a table, approve in bulk.
- Graduate actions out of approval over time. If "send onboarding email to new customer" has run 1,000 times with no incidents and zero rejections, the action should auto-execute under guardrail. Keeping it at level 3 forever guarantees fatigue.
The graduation pattern is covered in trust models. The cluster post on monitoring agent activity covers the dashboards that tell you when an action is ready to graduate.
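The batching rule can be sketched as a grouping step over the pending queue. Assumptions: each pending request is a dict with an `"action"` key, and `low_risk_types` is the set of action types your trust matrix marks as eligible for bulk approval.

```python
from collections import defaultdict

def batch_by_action_type(pending, low_risk_types):
    """Group low-risk pending approvals so each type becomes one decision.

    High-risk requests stay as individual approvals; low-risk requests
    of the same type are presented together for a single bulk click.
    """
    batches, singles = defaultdict(list), []
    for req in pending:
        if req["action"] in low_risk_types:
            batches[req["action"]].append(req)
        else:
            singles.append(req)  # high-risk: one approval each
    return dict(batches), singles
```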
Audit fields per approval
Every approval event writes a row to the audit table with eight fields:
- Timestamp (ISO 8601 with timezone; rows written in monotonically increasing order).
- Approver identifier (user ID, role, queue ID).
- Agent identifier (with prompt version, model version).
- Action type (sendEmail, postPayment, etc.).
- Action arguments (full payload as submitted).
- Decision (approve / reject / timeout / escalate).
- Reasoning text if provided by the approver.
- Downstream action ID linking to the executed action's audit row.
The trail is immutable. Append-only storage. Retention matches the underlying action's regulatory requirement (typically 7 years for financial actions, 6 for healthcare, 3-7 for general).
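The eight fields map onto an immutable record. A sketch with assumed field names; the frozen dataclass mirrors the append-only rule (rows are written once, never mutated):

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)  # frozen mirrors the immutable, append-only trail
class AuditRow:
    """One row per approval event; field names are illustrative."""
    ts: str                            # ISO 8601 with timezone
    approver_id: str                   # user ID, role, or queue ID
    agent_id: str                      # includes prompt and model version
    action_type: str                   # sendEmail, postPayment, etc.
    action_args: dict                  # full payload as submitted
    decision: str                      # approve / reject / timeout / escalate
    reasoning: str                     # approver's note, may be empty
    downstream_action_id: Optional[str]  # links to the executed action's row
```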
Graduating actions out of approval
The criteria to remove an approval gate:
- 1,000 incident-free executions of the action type at the current trust level.
- Zero rejections in the last 200 approvals (rejections indicate the agent is producing bad actions; if it is, do not graduate).
- A regression test exists in the agent's eval suite for this action type.
- Audit and rollback paths are operational and tested.
When the criteria are met, drop the gate. The action moves to level 4 (autonomous-with-guardrail). If the next 30 days produce an incident, restore the gate immediately and run the post-incident recovery covered in trust models.
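The four criteria are mechanical enough to encode as a single check. The thresholds (1,000 executions, the last 200 approvals) come from the list above; the function shape and argument names are assumptions.

```python
def ready_to_graduate(executions, incidents, recent_decisions,
                      has_regression_test, rollback_tested):
    """Return True only when all four graduation criteria hold.

    `recent_decisions` is the ordered list of approval outcomes
    ("approve" / "reject") for this action type, most recent last.
    """
    last_200 = recent_decisions[-200:]
    return (
        executions >= 1000                          # incident-free volume
        and incidents == 0
        and all(d == "approve" for d in last_200)   # zero recent rejections
        and has_regression_test                     # eval suite coverage
        and rollback_tested                         # audit + rollback paths
    )
```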
Rollout pattern for first deployment
The rollout sequence that produces the fewest incidents:
- Week 1. Approve every action of every type. Approval queue is loud; the team feels the workload. This is intentional. The signal is which actions are safe enough to graduate.
- Week 2. Auto-approve the action types where rejection rate is below 1 percent. Keep the rest at level 3.
- Week 3-4. Continue graduating low-rejection action types. Add batch approval for any remaining high-volume low-risk types.
- Month 2. Review the audit log. Identify any action that approvers consistently spent over 30 seconds reviewing. Either improve the approval interface for that action or downgrade it to level 2 (suggest, not approve-then-act).
- Quarterly. Re-audit the trust matrix. Actions whose blast radius or reversibility cost has changed (because of policy or product changes) get reassessed.
The rollout pattern looks slow because it is. Speed at the cost of skipping the audit produces the incidents that approval was meant to prevent. The cluster post on how we test AI agents covers the test discipline that supports rollout decisions.
Frequently asked questions
Where should the human approval step go in an agent flow?
Immediately before the irreversible action (sending an email, posting a payment, deleting a record), not after the agent finishes its full plan. Approving a final summary is theatre; approving the specific action is governance. Insert one approval per high-risk action, not one approval at the end of the run.
How long should the agent wait for approval?
Set a timeout that matches how long the action remains useful. For email replies, 24 hours. For meeting scheduling, 2 to 4 hours. For invoice payment approvals, 5 business days. On timeout the agent should not auto-act; it should escalate to a fallback approver, log the timeout, and downgrade the original approver's queue priority.
What does an approval interface need to show?
Five things: the agent's identifier and prompt version, the goal context, the proposed action with full input arguments, the alternative actions the agent considered (and why it rejected them), and the reversibility cost if the action turns out to be wrong. Approvers without all five default to rubber-stamping or to blanket rejecting.
How do I prevent approval fatigue?
Three rules. First, approve once per action, not per run. Second, batch low-risk approvals into a single decision (twenty triage tags with one click). Third, graduate actions out of approval as they pass an incident-free threshold; the trust model in the cluster post on agent trust models defines the graduation pattern. Approval fatigue produces incidents because tired approvers approve everything.
How is approval tracked for audit?
Every approval event records eight fields: timestamp, approver identifier, agent identifier, action type, action arguments, decision (approve / reject / timeout / escalate), reasoning text if provided, and downstream action ID. The trail is immutable and queryable by approver, action type, and time window. Audit is the difference between governance and theatre.
Three takeaways before you close this tab
- Gate the action, not the run. One approval per high-risk action.
- Show five fields. Without them, approvers default to rubber-stamping.
- Graduate out. Approvals are scaffolding, not the building.
Sources
- NIST, "AI Risk Management Framework", retrieved 2026-05-09, nist.gov/itl/ai-risk-management-framework
- OWASP, "Top 10 for Large Language Model Applications", retrieved 2026-05-09, owasp.org/www-project-top-10-for-large-language-model-applications
- Anthropic, "Building Effective Agents", retrieved 2026-05-09, anthropic.com/engineering/building-effective-agents
- European Union, "AI Act, Regulation (EU) 2024/1689", retrieved 2026-05-09, eur-lex.europa.eu/eli/reg/2024/1689/oj
- Russell & Norvig, "Artificial Intelligence: A Modern Approach", 4th edition, 2020, agent oversight chapter