AI Agent Architecture Patterns Explained (ReAct, Plan-and-Execute, Reflection)

An AI agent is not one thing. Under the surface, the agent follows a control structure that decides how it reasons, when it calls tools, whether it checks its own work, and how it splits a job into smaller jobs. That control structure is the architecture pattern. The same model can be wired into very different patterns, and the pattern shapes how reliable, fast, and expensive the agent turns out to be.

This post explains the patterns most builders reach for, in plain language: the ReAct loop, plan-and-execute, reflection and self-critique, the router or dispatcher, and the orchestrator-worker (supervisor) pattern, with tool use as the shared building block beneath all of them. For each, we cover what it is, when to use it, and the trade-offs in latency, cost, reliability, and debuggability. It pairs with the broader view at AI agent orchestration explained and the single-versus-many question in single agent vs multi-agent.

What is an architecture pattern

An architecture pattern is the reusable shape of the loop that turns a goal into actions. It answers four questions about an agent: how does it decide the next step, when does it call a tool, does it check its own output, and how does it break a large goal into smaller ones. The model supplies the intelligence. The pattern supplies the control flow around it.

The reason patterns matter is that the same underlying model behaves very differently depending on how it is wired. A model dropped into a tight reason-act loop solves a problem step by step. The same model wrapped in a planner writes the whole plan first. Wrapped in a supervisor, it delegates to other copies of itself. Same brain, different skeleton, very different reliability.

Tool use is the shared building block

Every pattern in this post rests on one capability: tool use, also called function calling. The model is given a catalog of functions it can call, such as "search the CRM", "send an email", or "query the ledger", and it decides which to call with which arguments. Without tools, an agent can only talk. With tools, it can act on real systems. We cover the mechanics in AI agent tool use explained.
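As a rough illustration of that catalog idea, here is a minimal sketch of a tool registry and dispatcher. The tool names, their canned data, and the request shape (`name` plus `arguments`) are assumptions made for this example, not any particular vendor's function-calling API.

```python
from typing import Any, Callable

# Hypothetical tools: names and canned data are illustrative only.
def search_crm(query: str) -> list[str]:
    """Pretend CRM lookup; returns client names containing the query."""
    clients = ["Acme Ltd", "Birch & Co", "Acme Labs"]
    return [c for c in clients if query.lower() in c.lower()]

def send_email(to: str, subject: str) -> str:
    """Pretend email sender; returns a confirmation string."""
    return f"sent '{subject}' to {to}"

# The catalog the model chooses from.
TOOLS: dict[str, Callable[..., Any]] = {
    "search_crm": search_crm,
    "send_email": send_email,
}

def call_tool(request: dict) -> Any:
    """Dispatch a request shaped like {"name": ..., "arguments": {...}}."""
    name = request["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    return TOOLS[name](**request["arguments"])

# A real model would emit this structure; here it is hard-coded.
matches = call_tool({"name": "search_crm", "arguments": {"query": "acme"}})
```

The model only ever produces the request; the surrounding code decides whether and how to run it, which is where validation and permissions would normally sit.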

Tool use is the building block; the patterns are the buildings. ReAct decides when to call a tool inside a loop. Plan-and-execute decides the sequence of tool calls up front. Reflection adds a pass that critiques earlier output, sometimes by calling a tool such as a test runner. An orchestrator treats whole worker agents as tools. So when you compare patterns, you are mostly comparing how each one organizes the same fundamental action: a tool call.

The ReAct loop

ReAct, short for reason plus act, is the most common single-agent pattern. The agent thinks about what to do, takes one action, observes the result, then thinks again, repeating until the goal is met. The pattern was introduced by Yao et al. (2022), who reported gains from interleaving reasoning traces with actions over using either alone on several benchmarks. Most off-the-shelf agent frameworks default to some form of this loop.

How the ReAct loop works

Picture a buyer asking an agent to "find the overdue invoices and email each client a reminder". A ReAct agent reasons that it first needs the invoice list, calls the accounting tool, observes the results, reasons that three are overdue, then loops through emailing each one. It does not plan the whole thing in advance. It figures out each step from the last step's outcome.

That step-by-step quality is the strength. The agent reacts to what it actually sees, so it handles surprises (an empty result, an error, an unexpected field) without breaking the way a rigid script would. It also makes ReAct easy to debug: because every thought and action is logged in sequence, you can read the trace like a transcript and see exactly where the agent went wrong.
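The invoice example can be sketched as a small loop. This is a toy under stated assumptions: `scripted_model` stands in for a real model call, the two tools return canned strings, and the `max_steps` budget is an added safeguard rather than part of the pattern's definition.

```python
from typing import Callable

# Hypothetical tools for the invoice example; both return canned strings.
def list_invoices(_: str) -> str:
    return "INV-1 overdue; INV-2 paid; INV-3 overdue"

def send_reminder(invoice_id: str) -> str:
    return f"reminder sent for {invoice_id}"

TOOLS: dict[str, Callable[[str], str]] = {
    "list_invoices": list_invoices,
    "send_reminder": send_reminder,
}

def scripted_model(history: list[str]) -> dict:
    """Stand-in for an LLM call: picks the next step from the trace so far."""
    observations = [h.removeprefix("observation: ") for h in history
                    if h.startswith("observation: ")]
    if not observations:
        return {"thought": "I need the invoice list first.",
                "action": "list_invoices", "input": ""}
    overdue = [item.split()[0] for item in observations[0].split("; ")
               if item.endswith("overdue")]
    reminded = len(observations) - 1  # every later observation is one reminder
    if reminded < len(overdue):
        target = overdue[reminded]
        return {"thought": f"{target} is overdue and not yet reminded.",
                "action": "send_reminder", "input": target}
    return {"thought": "Every overdue invoice has a reminder.",
            "action": "finish", "input": f"{len(overdue)} reminders sent"}

def react(goal: str, max_steps: int = 10) -> tuple[str, list[str]]:
    """Think, act, observe, repeat; returns the answer and the full trace."""
    history = [f"goal: {goal}"]
    for _ in range(max_steps):
        step = scripted_model(history)
        history.append(f"thought: {step['thought']}")
        if step["action"] == "finish":
            return step["input"], history
        observation = TOOLS[step["action"]](step["input"])
        history.append(f"action: {step['action']}({step['input']})")
        history.append(f"observation: {observation}")
    return "stopped: step budget exhausted", history

answer, trace = react("email a reminder for each overdue invoice")
```

Here `answer` is `"2 reminders sent"`, and `trace` holds the thought, action, and observation lines in order, which is the transcript-style log the pattern is valued for.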

Trade-offs of ReAct

The weakness is drift on long tasks. Because each step depends on the previous one, a small early mistake compounds. By step twelve the agent may be confidently solving the wrong problem. Latency also climbs, since every step is a separate model call. Cost rises with step count. ReAct is the right default for short-to-medium tasks where flexibility matters more than tight cost control, and where reasoning quality is well understood; see AI agent reasoning vs pattern matching.

Plan-and-execute

Plan-and-execute splits the work in two. First, a planner reads the goal and writes a full plan: an ordered list of steps. Then an executor runs the steps, often without re-planning between them. The split is the whole idea. Instead of deciding each move from the last result, the agent commits to a route up front, then drives it. This pattern shows up in many production systems where the task shape is predictable.

When to use plan-and-execute

Use it when the steps are largely known in advance. A monthly financial close, a data migration, a multi-section report: these have a stable structure. Writing the plan once and executing it is cheaper than reasoning step by step, because you spend the expensive reasoning tokens only on planning. It is also easier to review. A human can read the plan before any action runs, which is valuable when the actions are irreversible.
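A minimal sketch of the planner and executor split, with a review hook before anything runs. The `Step` type, the three tools, and the fixed plan returned by `plan` are assumptions for illustration; a real planner would be a model call.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    tool: str
    arg: str

# Hypothetical tools for a report-building task.
TOOLS: dict[str, Callable[[str], str]] = {
    "export": lambda name: f"exported {name}",
    "summarize": lambda name: f"summary of {name}",
    "assemble": lambda name: f"report '{name}' assembled",
}

def plan(goal: str) -> list[Step]:
    """Stand-in for a single planning model call; returns a fixed plan here."""
    return [
        Step("export", "ledger"),
        Step("summarize", "ledger"),
        Step("assemble", goal),
    ]

def execute(steps: list[Step], approve: Callable[[list[Step]], bool]) -> list[str]:
    """Run the steps in order, but only if the whole plan is approved first."""
    if not approve(steps):
        return []
    return [TOOLS[s.tool](s.arg) for s in steps]

steps = plan("monthly close")
# The approval hook could be a human review; here it only checks tool names.
results = execute(steps, approve=lambda p: all(s.tool in TOOLS for s in p))
```

Because the plan is plain data, it can be printed, diffed, or rejected before a single action runs.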

Trade-offs of plan-and-execute

The cost is rigidity. When reality drifts from the plan (an export fails, a record does not exist), a pure executor either barrels ahead or stalls. The fix is a hybrid: re-plan when a step fails, which adds back some ReAct-style adaptivity. The honest read is that plan-and-execute and ReAct are not rivals; most reliable agents blend them, planning the skeleton and reacting within each step. Debuggability sits in between: the plan is legible, but a failure mid-execution can be harder to trace than a flat ReAct log.
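One way to picture the hybrid is an executor that asks for a revised plan when a step fails. Everything named here (`run_step`, `replan`, the `broken` set, the `max_replans` cap) is hypothetical and stands in for real tool calls and a real re-planning model call.

```python
class StepFailed(Exception):
    """Raised by the executor when a step cannot complete."""

def run_step(step: str, broken: set[str]) -> str:
    """Hypothetical executor: fails for any step listed in `broken`."""
    if step in broken:
        raise StepFailed(step)
    return f"done: {step}"

def replan(failed_step: str, remaining: list[str]) -> list[str]:
    """Stand-in for a re-planning model call: put a fallback ahead of the rest."""
    return [f"fallback for {failed_step}"] + remaining

def execute_with_replan(plan: list[str], broken: set[str],
                        max_replans: int = 2) -> list[str]:
    queue, log, replans = list(plan), [], 0
    while queue:
        step = queue.pop(0)
        try:
            log.append(run_step(step, broken))
        except StepFailed:
            if replans == max_replans:
                log.append(f"gave up at: {step}")
                break
            replans += 1
            log.append(f"failed: {step}, re-planning")
            queue = replan(step, queue)
    return log

log = execute_with_replan(["export", "transform", "load"], broken={"export"})
```

The cap on re-plans is a design choice worth keeping: without it, a step that can never succeed turns the hybrid into an unbounded loop.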

Reflection and self-critique

Reflection adds a quality-control step. The agent produces a draft answer, then a second pass critiques that draft against the goal and either approves it or sends it back for revision. The Reflexion paper (Shinn et al., 2023) formalized this self-feedback idea and reported meaningful gains on reasoning and coding tasks where a single pass often fails. The pattern can wrap any of the others.

How reflection works in practice

Think of reflection as a built-in reviewer. An agent writing a SQL query runs it, sees the result looks wrong, reasons about why, and rewrites the query. An agent drafting a contract clause checks it against the requirements and flags a missing term before returning. The critique can be the same model prompted to be skeptical, a separate evaluator, or a real test, like running code and reading the error.
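The sketch below shows the third kind of critic, a real test. It assumes a hypothetical `draft` function in place of a model call (its first attempt contains a deliberate bug), and the critic simply runs the code against one known case.

```python
from typing import Optional

def draft(task: str, feedback: Optional[str]) -> str:
    """Stand-in for a model call; the first draft has a deliberate bug."""
    if feedback is None:
        return "def mean(xs): return sum(xs) / (len(xs) - 1)"
    return "def mean(xs): return sum(xs) / len(xs)"

def critique(code: str) -> Optional[str]:
    """Critic as a real test: run the code and check one known case."""
    namespace: dict = {}
    exec(code, namespace)  # fine for a toy; never exec untrusted model output
    got = namespace["mean"]([2, 4, 6])
    return None if got == 4 else f"mean([2, 4, 6]) returned {got}, expected 4"

def reflect(task: str, max_rounds: int = 3) -> tuple[str, bool, int]:
    """Draft, critique, revise. Returns (candidate, approved, rounds used)."""
    feedback: Optional[str] = None
    candidate = ""
    for round_number in range(1, max_rounds + 1):
        candidate = draft(task, feedback)
        feedback = critique(candidate)
        if feedback is None:
            return candidate, True, round_number
    return candidate, False, max_rounds

code, approved, rounds = reflect("write a mean function")
```

Returning the `approved` flag alongside the candidate matters: when the round budget runs out, the caller can tell an unverified answer from a verified one.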

Trade-offs of reflection

The benefit is accuracy on hard tasks; the cost is latency and tokens. Every reflection pass is another round of model calls, so a task that took one call now takes three or four. In building Gravity's reference agents, we found reflection earns its cost on high-stakes, low-volume outputs such as a legal summary or a financial reconciliation, and rarely pays off on cheap, high-volume calls where a wrong answer is easy to retry. Reflection also depends on the critic being honest; a lazy self-critique adds cost without adding accuracy.
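The "three or four" figure can be reproduced with simple counting, assuming the critic is itself a model call and each revision costs one critique plus one rewrite. The function below is a back-of-the-envelope helper under those assumptions; it counts calls only and says nothing about price.

```python
def model_calls(revisions: int, verify_final: bool = True) -> int:
    """Count calls: one draft, then (critique + rewrite) per revision,
    plus an optional last critique that approves the final version."""
    calls = 1                # initial draft
    calls += 2 * revisions   # each revision = one critique + one rewrite
    if verify_final:
        calls += 1           # closing critique of the last version
    return calls

single_pass = model_calls(revisions=0, verify_final=False)   # 1
one_revision = model_calls(revisions=1, verify_final=False)  # 3
one_revision_checked = model_calls(revisions=1)              # 4
```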

Router and orchestrator-worker

Once a system handles many kinds of request, two coordination patterns appear: the router and the orchestrator-worker. The router classifies an incoming request and sends it to the right handler. The orchestrator-worker, also called the supervisor pattern, breaks a complex goal into subtasks and delegates each to a specialized worker agent. Anthropic's engineering write-up on building effective agents describes both as core composition patterns. We go deeper in AI agent multi-agent coordination.

The router or dispatcher pattern

A router is a triage desk. A support request arrives; the router decides whether it is a billing question, a bug report, or a refund, then hands it to the matching agent or workflow. The router itself does not solve the task; it classifies and forwards. The benefit is that each downstream handler stays simple and specialized. The trade-off is that a misclassification at the front door sends the whole request down the wrong path, so router accuracy is worth testing hard.
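A toy version of the triage desk, for illustration only. The keyword classifier is a stand-in (a production router would more likely use a model or a trained classifier), and the handler names and the human fallback are assumptions added for the example.

```python
from typing import Callable

def classify(message: str) -> str:
    """Stand-in classifier based on keywords."""
    text = message.lower()
    if "refund" in text:
        return "refund"
    if "invoice" in text or "charge" in text:
        return "billing"
    if "error" in text or "crash" in text:
        return "bug"
    return "unknown"

# Hypothetical downstream handlers; each would be its own agent or workflow.
HANDLERS: dict[str, Callable[[str], str]] = {
    "refund": lambda m: "refund agent handling request",
    "billing": lambda m: "billing agent handling request",
    "bug": lambda m: "bug triage agent handling request",
}

def escalate_to_human(message: str) -> str:
    return "escalated to a human"

def route(message: str) -> str:
    """Classify, then forward; the router never solves the task itself."""
    handler = HANDLERS.get(classify(message), escalate_to_human)
    return handler(message)
```

Because a wrong label sends the request down the wrong path, `classify` is the piece to test hardest, ideally against a labeled set of real requests.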

The orchestrator-worker (supervisor) pattern

An orchestrator owns the goal and farms out the pieces. Asked to "research three competitors and build a comparison", it spawns a worker per competitor, lets them run in parallel, then merges the findings. Each worker is focused, which keeps its reasoning tight. The orchestrator handles sequencing and assembly. This is the backbone of most serious multi-agent systems, and the foundations of who-talks-to-whom live in AI agent orchestration explained.
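The fan-out and merge can be sketched with a thread pool. The worker here returns canned findings instead of running an agent, and the competitor names are placeholders; only the shape (spawn per subtask, run in parallel, merge in order) is the point.

```python
from concurrent.futures import ThreadPoolExecutor

def research_worker(competitor: str) -> dict:
    """Hypothetical worker agent; returns canned findings, not model output."""
    return {"name": competitor, "pricing": f"{competitor} pricing notes"}

def orchestrate(competitors: list[str]) -> str:
    """Fan out one worker per competitor, then merge into a comparison."""
    # max(1, ...) avoids a ValueError when the list is empty.
    with ThreadPoolExecutor(max_workers=max(1, len(competitors))) as pool:
        # Executor.map returns results in input order, whatever finishes first.
        findings = list(pool.map(research_worker, competitors))
    rows = [f"{f['name']}: {f['pricing']}" for f in findings]
    return "\n".join(rows)

comparison = orchestrate(["Alpha", "Beta", "Gamma"])
```

Threads suit this sketch because real workers would spend most of their time waiting on network calls to a model or tool.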

Trade-offs of orchestration

The upside is modular reliability: small, focused workers are easier to make correct than one agent juggling everything. The downside is coordination overhead. More agents means more tokens, more latency from message passing, and harder debugging, since a failure can hide in any worker or in the hand-off between them. Memory and context sharing get trickier too, which is why AI agent memory explained matters once you go multi-agent. As a rule, reach for orchestration only when a single well-built agent genuinely cannot hold the whole task.

Choosing a pattern

For buyers, the practical answer is that you do not choose a pattern at all. On an agent marketplace, the builder selects the architecture; you describe the outcome you want and run the agent. That is the whole point of "describe the outcome, not the workflow". Still, knowing the patterns changes the questions you ask, and better questions surface more reliable agents. The capability map in what can an AI agent actually do sets the baseline.

How buyers can judge an agent

You cannot see the architecture from outside, but you can probe its effects. Ask three things. Does the agent check its own work before returning a result, which is a sign of reflection? What happens when a step fails: does it recover, re-plan, or just stop? And does it escalate to a human when it is unsure, or does it guess? An agent that reflects, recovers, and escalates is usually built on a sturdier pattern than one that always answers instantly.

How builders should choose

For builders, the rule of thumb is to start as simple as the task allows. Our internal guidance for Gravity builders is to begin with a plain ReAct loop, add a reflection pass only where accuracy demonstrably needs it, switch to plan-and-execute when the steps are stable and cost matters, and reach for orchestration last, only when one agent cannot hold the job. Every layer you add buys capability and spends latency, cost, and debuggability. The discipline is to add a pattern only when the task forces it, not because it looks sophisticated.
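That guidance can be written down as a small decision helper. This is one possible reading of the rule of thumb, not an official Gravity algorithm; the four boolean inputs and the pattern labels are assumptions chosen for the example.

```python
def choose_patterns(steps_stable: bool, cost_sensitive: bool,
                    needs_high_accuracy: bool, fits_one_agent: bool) -> list[str]:
    """Start simple and add a layer only when the task forces it."""
    # Default to ReAct; plan up front only when steps are stable and cost matters.
    base = "plan-and-execute" if steps_stable and cost_sensitive else "react"
    patterns = [base]
    if needs_high_accuracy:
        patterns.append("reflection")
    if not fits_one_agent:
        patterns.append("orchestrator-worker")  # last resort
    return patterns

simple_task = choose_patterns(False, False, False, True)
monthly_close = choose_patterns(True, True, True, True)
```

With these inputs, `simple_task` is `["react"]` and `monthly_close` is `["plan-and-execute", "reflection"]`.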

Frequently asked questions

What is the ReAct pattern in AI agents?

ReAct, short for reason plus act, is a loop where the agent thinks, takes one action, observes the result, then thinks again. It interleaves reasoning with tool calls instead of planning everything up front. It is simple, flexible, and easy to debug, but it can drift on long tasks because each step depends on the last.

When should you use plan-and-execute instead of ReAct?

Use plan-and-execute when the task has many steps that are known in advance, like a multi-stage report or a fixed migration. The agent writes the full plan once, then runs the steps. You get lower cost and clearer structure, at the price of weaker recovery when reality drifts from the plan.

What is the reflection pattern in agent design?

Reflection adds a self-critique step: the agent produces an answer, then a second pass checks the work against the goal and revises it. Work such as Reflexion (Shinn et al., 2023) reports that this can raise accuracy on hard tasks. The cost is extra latency and tokens, so reflection suits high-stakes outputs rather than every routine call.

What is the orchestrator-worker pattern?

An orchestrator agent breaks a goal into subtasks and hands each to a specialized worker agent, then merges the results. It is the supervisor pattern for complex, multi-skill work. The benefit is modular reliability; the cost is coordination overhead, harder debugging, and more total tokens than a single agent would spend.

Do buyers need to choose an agent architecture pattern?

No. On a platform like Gravity, the builder picks the pattern; the buyer describes the outcome. Knowing the patterns still helps you judge reliability: ask whether the agent reflects on its own output, how it recovers from a failed step, and whether it escalates instead of guessing.

Three takeaways before you close this tab

The pattern is the skeleton. Same model, different control structure, very different reliability, latency, and cost.
Simpler is sturdier. Start with ReAct; add reflection, planning, or orchestration only when the task forces it.
Buyers judge, builders choose. You describe the outcome; the patterns tell you what questions reveal a reliable agent.

Sources

Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models", 2022, arxiv.org/abs/2210.03629
Shinn et al., "Reflexion: Language Agents with Verbal Reinforcement Learning", 2023, arxiv.org/abs/2303.11366
Anthropic, "Building Effective Agents", 2024, anthropic.com/engineering/building-effective-agents
Gravity agent design notes, internal v1, 2026. Retrieved 2026-06-05.