One-shot prompts and recurring agent prompts look the same on the page but behave differently in practice. A one-shot prompt runs once; you see the output; you re-prompt if it is wrong. A recurring agent prompt runs unattended for weeks or months on inputs that change. Small ambiguities that are invisible in a one-shot prompt compound into drift across runs. The four-part structure below is what makes a recurring prompt stable.
The four parts are outcome statement, input contract, refusal conditions, and output format with examples. Each part addresses a different failure mode that recurring agents experience but one-shot prompts do not.
Why recurring prompts are different
A one-shot prompt has a human in the loop. The human writes the prompt, the model produces an output, the human reads it. If the output is wrong, the human re-prompts. The feedback loop is tight; ambiguity in the prompt is corrected on the spot.
A recurring agent prompt runs unattended. The agent reads the inputs, runs the prompt, takes actions, produces output. There is no immediate human review. If the prompt was ambiguous, the agent picks an interpretation and keeps picking it. If the input distribution changes, the agent applies the same interpretation to inputs the prompt did not anticipate. The first you hear about it is when something goes visibly wrong, which often means several weeks of degraded output have already happened.
The fix is making the implicit explicit. A recurring prompt cannot leave to the agent's interpretation anything you would otherwise catch and correct in a one-shot loop. Hence the four-part structure.
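Before walking through the parts one by one, here is the shape of the whole thing. A minimal sketch, using the inbox triage agent that serves as the running example below; the section wording is abbreviated and the constant name is an illustrative choice, not a prescribed schema:

```python
# Skeleton of a four-part recurring prompt. The wording is abbreviated;
# each part is expanded in its own section below.
RECURRING_PROMPT = """\
## Outcome
Produce a one-screen morning digest of the user's overnight inbox...

## Inputs
- messages: a JSON list of emails; each field is described below...

## When to refuse
- If an input is missing, contradictory, or unlike anything described
  here, mark it 'unhandled' instead of guessing...

## Output format
# Morning digest, YYYY-MM-DD
...
"""
```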
Part 1: Outcome statement
The outcome statement is the first paragraph of the prompt and the one that matters most. State what the agent should achieve in plain English. No implementation steps. No tool names. No conditions yet.
An outcome statement for an inbox triage agent: "Produce a one-screen morning digest of the user's overnight inbox, grouped by sender type (customer, vendor, internal, other). Surface the three most urgent items at the top. Skip newsletters and notifications. The output should be readable in under a minute."
The outcome statement should pass the substitution test: if the underlying model improves, the outcome statement should still describe what you want. If it depends on a specific tool, a specific number of steps, or a specific decision tree, you have written a workflow, not an outcome. The thinking behind this is laid out in "describe outcome, not workflow".
Part 2: Input contract
The input contract names every field the agent reads and explains what each field means. This is the part most one-shot prompts skip because the human is mentally filling in the missing context. A recurring agent has no such mental model.
For each input field write: what the field is, what type of value it contains, and what the agent should do when the field is missing or unparseable. Missing-field handling is the most-skipped part of input contracts and the most common source of recurring-agent failures. The 80-tests methodology in "how we test AI agents" includes a dedicated category for missing-field cases.
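As an illustration, the input contract for the inbox triage agent might read like the sketch below. The field names, types, and fallback behaviours are assumptions made for the example, not part of any fixed schema:

```python
# Sketch of an input contract written as prompt text. Every field gets
# a type and an explicit missing-field behaviour.
INPUT_CONTRACT = """\
Inputs you will receive:
- messages: a JSON list of overnight emails. Each email has:
  - sender (string, an email address). If missing, classify the
    message as 'other' and note 'unhandled: no sender' in the digest.
  - subject (string). If missing, use the first line of the body.
  - body (string, plain text). If missing or unparseable, skip the
    message and note it as 'unhandled: empty body'.
  - received_at (ISO 8601 timestamp). If missing or unparseable,
    include the message but exclude it from urgency ranking.
"""
```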
Part 3: Refusal conditions
Refusal conditions are explicit instructions for when the agent should stop and ask for help instead of producing output. Three categories cover most situations.
First, out-of-distribution inputs. If an input field is wildly different from anything the prompt has seen, the agent should refuse rather than guess. Example: "If the email length exceeds 5000 words or the language is not English, skip the message and note it in the digest as 'unhandled: out-of-distribution'."
Second, contradictions. If two pieces of information in the input disagree, the agent should refuse rather than pick one. Example: "If the calendar event start time is after the end time, do not assume which is correct; flag and skip."
Third, ambiguity. If the agent cannot tell which interpretation is correct, the agent should refuse rather than choose. Example: "If the message could plausibly be interpreted as customer or vendor, classify as 'ambiguous' rather than guessing."
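The same three categories can also be enforced outside the prompt as a pre-flight check, so the worst inputs never reach the model at all. A rough sketch, assuming items arrive as plain dicts, timestamps are comparable ISO strings, and the 5000-word limit from the example above; ambiguity is left to the prompt itself, since it is the one category a mechanical check cannot see:

```python
from typing import Optional

MAX_WORDS = 5000  # matches the out-of-distribution example above

def refusal_reason(item: dict) -> Optional[str]:
    """Return an 'unhandled' label if the item should be skipped, else None."""
    # Out-of-distribution: wildly longer than anything the prompt was
    # written against. (A language check would need a detector library,
    # so it is omitted here.)
    body = item.get("body", "")
    if len(body.split()) > MAX_WORDS:
        return "unhandled: out-of-distribution (length)"
    # Contradiction: two fields that cannot both be true, such as a
    # calendar event that starts after it ends.
    start, end = item.get("start_time"), item.get("end_time")
    if start is not None and end is not None and start > end:
        return "unhandled: contradiction (start after end)"
    # Ambiguity (customer vs vendor) needs judgement, so it stays in
    # the prompt's own refusal instructions.
    return None
```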
Refusal is the recurring agent's safety valve. The "8 categories of AI agent failure modes" include refusal correctness as a dedicated category for this reason.
Part 4: Output format with examples
The output format is a literal template the agent fills in. Not a description of the format. Not a list of fields. A template, including the punctuation, headings, and spacing.
For a digest agent: "Output exactly the following Markdown, with each section filled in: # Morning digest, dated YYYY-MM-DD followed by ## Top 3 urgent items, then ## Customer inbox, then ## Vendor inbox, then ## Internal, then ## Other. Each item is a single line: 'sender, one-line summary, link'. Skip a section if it has no items."
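Written out as the literal artifact, that template might look like the sketch below, stored here as a Python string. The placeholder lines are assumptions about how the 'sender, one-line summary, link' rule renders; the headings and their order come straight from the paragraph above:

```python
# The digest template, literal enough that the agent copies structure
# instead of inventing it. The date and item lines are placeholders.
OUTPUT_TEMPLATE = """\
# Morning digest, YYYY-MM-DD

## Top 3 urgent items
- sender, one-line summary, link

## Customer inbox
- sender, one-line summary, link

## Vendor inbox
- sender, one-line summary, link

## Internal
- sender, one-line summary, link

## Other
- sender, one-line summary, link
"""
# Per the format rules, a section with no items is omitted entirely.
```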
Then add two or three full input-output examples. The agent learns more from two concrete input-output pairs than from a paragraph of format description. Avoid filling the prompt with examples; two or three is enough, and longer prompts cost more per run, as covered in "AI agent cost models".
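One such pair might look like the sketch below; the email and the digest line are invented for illustration:

```python
# A single worked input-output pair for the examples section of the
# prompt. Content is invented; the point is the shape, not the data.
EXAMPLE_PAIR = """\
Example input:
  sender: billing@acmevendor.example
  subject: Invoice #4471 overdue
  body: Your invoice #4471 is now 14 days overdue. Please remit payment.

Example output line (under '## Vendor inbox'):
- billing@acmevendor.example, invoice #4471 is 14 days overdue, [link]
"""
```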
Drift and how to prevent it
Drift is when a recurring agent's quality decays over runs without any change to the prompt or model. The cause is almost always a change in the input distribution: new senders, new product lines, new edge cases. The agent applies the same interpretation to inputs the prompt did not anticipate, and the interpretation is wrong on the new inputs.
The prevention is twofold. First, refusal conditions catch out-of-distribution inputs and surface them as "unhandled" rather than letting the agent guess. Second, weekly review of the agent's outputs for the first month and monthly review thereafter surfaces drift before it scales. Most prompts converge after the first six to eight reviews; new edge cases discovered in review become new refusal conditions.
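Because refusals surface as 'unhandled' lines in the output, drift can also be watched mechanically between reviews. A rough sketch, assuming digests are saved as Markdown files in a directory, items render as '- ' bullet lines, and the ten-percent threshold is picked by eye rather than by any principle:

```python
import pathlib

UNHANDLED_MARKER = "unhandled:"  # matches the refusal labels above
ALERT_THRESHOLD = 0.10           # illustrative; tune against your own runs

def unhandled_rate(digest_text: str) -> float:
    """Fraction of digest items the agent refused rather than classified."""
    items = [ln for ln in digest_text.splitlines() if ln.startswith("- ")]
    if not items:
        return 0.0
    return sum(UNHANDLED_MARKER in ln for ln in items) / len(items)

# Scan the last week of digests. A rising rate means the input
# distribution has moved and the prompt needs new refusal conditions.
for path in sorted(pathlib.Path("digests").glob("*.md"))[-7:]:
    rate = unhandled_rate(path.read_text())
    flag = "  <-- review" if rate > ALERT_THRESHOLD else ""
    print(f"{path.name}: {rate:.0%} unhandled{flag}")
```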
The discipline of reviewing recurring outputs is the same discipline as running 80 tests per capability before shipping. Both prevent the same failure mode: the agent silently producing degraded output until something visible breaks.
Frequently asked questions
How is a recurring agent prompt different from a one-shot prompt?
A one-shot prompt runs once with one set of inputs; the operator sees the output and can re-prompt if it is wrong. A recurring agent prompt runs unattended on changing inputs over weeks or months. Small ambiguities that go unnoticed in a one-shot prompt compound into drift across runs. Recurring prompts need explicit refusal conditions, output formats, and edge cases that one-shots can leave implicit.
What are the four parts of a recurring agent prompt?
Outcome statement (what the agent should achieve), input contract (what the agent reads and how to interpret each field), refusal conditions (what to do when inputs are missing, contradictory, or out of distribution), and output format with examples (a literal template the agent fills in).
Why do recurring agent prompts drift over time?
Drift happens because input distributions change. New senders appear, new product lines launch, new edge cases occur. The prompt that worked on the first 100 inputs may not handle inputs 101 through 1000. Drift is not a model bug; it is a distribution change the prompt did not anticipate. The fix is explicit refusal conditions plus periodic prompt review.
Should I include examples in a recurring agent prompt?
Yes, in the output format section. Two or three concrete input-output pairs make the format unambiguous. Avoid filling the prompt with examples; the agent generalises from a small number better than from a large number, and long prompts cost more per run.
How often should I review a recurring agent prompt?
Review the agent's outputs weekly for the first month and monthly thereafter. Flag any output that surprised you. Update the refusal conditions and edge cases when surprises are caused by inputs the prompt did not anticipate. Most prompts converge after the first six to eight reviews.
Three takeaways before you close this tab
- Recurring prompts run unattended. Make the implicit explicit because no human is in the loop.
- Four parts: outcome, input contract, refusal, output format. Each handles a different failure mode.
- Drift is a distribution change, not a model bug. Review weekly for a month, then monthly.
Sources
- Anthropic, "Building Effective Agents", retrieved 2026-05-07, anthropic.com/engineering/building-effective-agents
- OpenAI, "Prompt engineering best practices", retrieved 2026-05-07, platform.openai.com/docs/guides/prompt-engineering
- Mialon et al., "GAIA: A Benchmark for General AI Assistants", arXiv:2311.12983, 2023, retrieved 2026-05-07, arxiv.org/abs/2311.12983
- Aryan Agarwal, "Gravity prompt schema", internal v1, May 2026