Every agent has to answer two different questions. First, what should I do to reach this goal? Second, how do I actually do it? The first is planning. The second is execution. They look like one smooth motion from the outside, but inside the agent they are distinct jobs, and how a builder splits them shapes whether the agent is cheap or expensive, rigid or adaptive, easy or hard to trust.
This post explains the planning-execution split in plain language: what each job is, why agents separate them, how re-planning stitches the two back together, and how to pick the right balance for a task. It builds on the plan-and-execute pattern from AI agent architecture patterns explained and on the deeper question of what the model is really doing, covered in AI agent reasoning vs pattern matching.
Two jobs, one agent
The cleanest way to see the split is to watch a person do a chore. Asked to "get the overdue invoices paid", you first work out the steps: pull the list, find the overdue ones, draft a reminder for each, send them, log the result. That is planning. Then you sit down and actually do those steps. That is execution. You did both, but they used different parts of your attention. Planning is abstract and forward-looking; execution is concrete and hands-on.
Agents work the same way, and the interesting design choice is how tightly the two are interleaved. At one extreme, the agent plans the entire job up front, then executes the whole plan without rethinking. At the other, it plans exactly one step, executes it, then plans the next from what it just saw. Most patterns live somewhere in between. The position you choose on that spectrum is the cognitive architecture of the agent.
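To make the two extremes concrete, here is a minimal sketch of both loops. Everything in it is an assumption for illustration: `plan_all` and `plan_next` stand in for model calls, `run` stands in for a tool call, and none of the names come from a real framework.

```python
from typing import Callable, List, Optional

# Hypothetical stand-ins: `plan_all` / `plan_next` represent model calls,
# `run` represents a tool call. None of these names come from a real library.

def plan_then_execute(goal: str,
                      plan_all: Callable[[str], List[str]],
                      run: Callable[[str], str]) -> List[str]:
    """Plan the entire job up front, then execute without rethinking."""
    steps = plan_all(goal)                 # exactly one planning call
    return [run(step) for step in steps]


def interleaved(goal: str,
                plan_next: Callable[[str, List[str]], Optional[str]],
                run: Callable[[str], str],
                max_steps: int = 20) -> List[str]:
    """Plan one step, execute it, then plan the next from what was observed."""
    observations: List[str] = []
    for _ in range(max_steps):
        step = plan_next(goal, observations)   # one planning call per step
        if step is None:                       # planner signals it is done
            break
        observations.append(run(step))
    return observations
```

The only structural difference is where the planning call sits: outside the loop in the first function, inside it in the second. Most real patterns move that call somewhere between the two.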
What planning is
Planning is the agent deciding what sequence of actions will reach the goal. It is the reasoning-heavy part: the model weighs options, orders steps, anticipates dependencies, and produces something like a route. A good plan names the steps, their order, and often the tool each step will use. The output of planning is not an action; it is a decision about future actions.
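One way to picture a plan as "a decision about future actions" is as plain data that nothing has acted on yet. The sketch below is an assumed shape, not a standard one: the `Step` and `Plan` classes and the tool names are invented for the invoice example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Step:
    name: str
    tool: str                                   # the tool this step expects to use
    depends_on: List[str] = field(default_factory=list)


@dataclass
class Plan:
    goal: str
    steps: List[Step]                           # list order is execution order


# Illustrative plan for the overdue-invoice chore; tool names are made up.
invoice_plan = Plan(
    goal="get the overdue invoices paid",
    steps=[
        Step("pull_list", tool="billing_export"),
        Step("find_overdue", tool="filter", depends_on=["pull_list"]),
        Step("draft_reminders", tool="llm_draft", depends_on=["find_overdue"]),
        Step("send_reminders", tool="email", depends_on=["draft_reminders"]),
        Step("log_result", tool="crm_note", depends_on=["send_reminders"]),
    ],
)
```

Nothing here calls a tool. The plan names the steps, their order, and the tool each one will use, and stops there.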
Planning is where reasoning lives
Because planning is abstract, it is where the model's reasoning quality matters most. A weak planner produces a plan that is missing a step, orders things wrong, or assumes a tool it does not have. Those errors are expensive because every later action inherits them. This is why builders spend real effort on the planning prompt and why a planning failure tends to be more damaging than an execution slip. The difference between a model reasoning and a model pattern-matching shows up first in the quality of its plans.
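Two of those planning errors, a tool the agent does not have and steps in the wrong order, can be caught mechanically before anything runs. The check below is a sketch under assumptions: steps are plain dicts with hypothetical `name`, `tool`, and `depends_on` keys. It cannot tell whether a step is missing altogether; that still takes judgment.

```python
from typing import Dict, List, Set


def validate_plan(steps: List[Dict], available_tools: Set[str]) -> List[str]:
    """Return a list of structural problems; an empty list means none found."""
    problems: List[str] = []
    seen: Set[str] = set()                      # names of steps earlier in the plan
    for step in steps:
        if step["tool"] not in available_tools:
            problems.append(f"{step['name']}: unknown tool {step['tool']!r}")
        for dep in step.get("depends_on", []):
            if dep not in seen:                 # dependency is ordered after its user
                problems.append(
                    f"{step['name']}: depends on {dep!r}, which has not run yet"
                )
        seen.add(step["name"])
    return problems
```

Running a check like this right after planning is cheap, and it stops the most mechanical errors from being inherited by every later action.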
What execution is
Execution is the agent carrying out the steps: calling tools, sending requests, writing data, reading results. It is the hands-on part where the agent touches real systems. Execution is usually cheaper per step than planning, because the hard thinking already happened. The model is now mostly translating a planned step into a concrete tool call and handling the response. The mechanics of that translation are the subject of AI agent tool use explained.
Execution is where reality pushes back
The catch is that execution is where the plan meets the real world, and the real world rarely cooperates. The export that was supposed to return ten rows returns zero. The email address bounces. The API is down. A pure executor following a fixed plan has no answer to these surprises; it either fails loudly or, worse, continues as if nothing happened. That gap between a clean plan and a messy reality is the central problem the split has to manage, and it connects directly to the AI agent failure modes that bite agents in production.
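A small guard around each step is one way to avoid the "continues as if nothing happened" outcome. The sketch below assumes each step comes with an expectation the plan relied on (for example, "the export returns at least one row"). `StepResult` and `run_checked` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class StepResult:
    ok: bool
    value: Any = None
    reason: Optional[str] = None


def run_checked(tool: Callable[[], Any],
                expect: Callable[[Any], bool],
                label: str) -> StepResult:
    """Run one step and report a surprise instead of silently continuing."""
    try:
        value = tool()
    except Exception as exc:                    # e.g. the API is down
        return StepResult(ok=False, reason=f"{label}: tool raised {exc!r}")
    if not expect(value):                       # e.g. zero rows instead of ten
        return StepResult(ok=False, value=value,
                          reason=f"{label}: unexpected result")
    return StepResult(ok=True, value=value)


# Usage: an export that comes back empty is flagged rather than passed along.
result = run_checked(lambda: [], lambda rows: len(rows) > 0, "export")
```

The guard does not fix anything on its own. It only turns a silent surprise into an explicit signal that the rest of the agent can act on.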
Why split them
If interleaving planning and execution on every step is so adaptive, why separate them at all? Three reasons, and they are the reasons plan-and-execute exists. The first is cost. Planning uses the expensive reasoning tokens; execution uses cheaper ones. Plan once and you pay the reasoning tax a single time instead of on every step. For a long task with many similar steps, that is a large saving, and the maths is laid out in AI agent cost models explained.
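A back-of-the-envelope version of that saving, with numbers that are purely illustrative: assume a planning call costs 2000 units, an execution call costs 300, and the task has 20 steps. The model is deliberately crude and ignores re-planning and growing context.

```python
def run_cost(n_steps: int, plan_cost: int, exec_cost: int,
             planning_calls: int) -> int:
    """Total cost = planning calls * plan cost + steps * execution cost."""
    return planning_calls * plan_cost + n_steps * exec_cost


# Assumed figures for illustration only; real costs depend on model and task.
N_STEPS = 20
PLAN_COST = 2000    # cost units per planning call
EXEC_COST = 300     # cost units per execution call

plan_once = run_cost(N_STEPS, PLAN_COST, EXEC_COST, planning_calls=1)
plan_every_step = run_cost(N_STEPS, PLAN_COST, EXEC_COST, planning_calls=N_STEPS)

print(plan_once, plan_every_step)   # 8000 46000
```

Under these made-up numbers the plan-once run costs less than a fifth of the plan-every-step run. The gap narrows as tasks get shorter or as planning and execution costs converge.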
The second reason is reviewability. A plan is a legible artifact a human can read before anything happens. When the actions are irreversible (sending money, deleting records, emailing customers), being able to approve the plan first is a genuine safety control, and it pairs naturally with the gates in how to add human-in-the-loop to an agent. The third reason is structure: a written plan gives the whole run a backbone, which makes long tasks more coherent than a chain of step-by-step decisions that can wander.
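The approval control can be sketched in a few lines. This is an assumed design, not a prescribed one: each step is a dict with a hypothetical `irreversible` flag, and `approve` stands in for whatever shows the plan to a human and returns their decision.

```python
from typing import Any, Callable, Dict, List


def execute_with_approval(steps: List[Dict],
                          approve: Callable[[List[Dict]], bool],
                          run: Callable[[Dict], Any]) -> List[Any]:
    """Run the plan, but only after sign-off if any step is irreversible."""
    needs_review = any(step.get("irreversible") for step in steps)
    if needs_review and not approve(steps):
        return []                               # rejected: nothing runs at all
    return [run(step) for step in steps]
```

The gate sits in front of the whole plan rather than in front of the risky step, so the reviewer sees everything that will happen before any of it does.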
Re-planning closes the gap
The weakness of a strict split is rigidity, and the fix is re-planning. Re-planning means the agent pauses execution, looks at what actually happened, and revises the plan before going on. It is the safety valve that keeps a planned agent from marching off a cliff when a step fails. Done well, it gives you most of the cost benefit of planning with much of the adaptivity of step-by-step reasoning.
When to trigger a re-plan
The practical question is what should trigger a re-plan, since re-planning too often throws away the cost savings, and re-planning too rarely lets failures compound. The usual triggers are a failed step, a result that contradicts an assumption in the plan, or a check that flags the run as off track. In building Gravity's reference agents, we found the cheapest reliable recipe was to plan the skeleton, execute each step with a small amount of local reasoning, and re-plan only when a step actually fails or returns something the plan did not expect. That keeps re-planning rare without letting the agent drift.
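That recipe can be sketched as a loop that plans once and goes back to the planner only when a step reports a problem. This illustrates the idea and is not Gravity's implementation. `make_plan` and `run_step` are hypothetical callables, and the `max_replans` cap is an added assumption to keep a broken run from looping forever.

```python
from typing import Any, Callable, List, Tuple

History = List[Tuple[str, bool, Any]]           # (step, succeeded, result)


def run_with_replanning(goal: str,
                        make_plan: Callable[[str, History], List[str]],
                        run_step: Callable[[str], Tuple[bool, Any]],
                        max_replans: int = 2) -> Tuple[History, int]:
    """Plan the skeleton once; re-plan only when a step fails or surprises."""
    history: History = []
    plan = make_plan(goal, history)             # initial skeleton
    replans = 0
    i = 0
    while i < len(plan):
        step = plan[i]
        ok, result = run_step(step)             # ok=False covers failure or surprise
        history.append((step, ok, result))
        if ok:
            i += 1
            continue
        if replans >= max_replans:
            raise RuntimeError(f"giving up after {replans} re-plans at {step!r}")
        replans += 1
        plan = make_plan(goal, history)         # revised plan for the remaining work
        i = 0
    return history, replans
```

On a clean run the planner is called exactly once. Each failure costs one extra planning call, which is what keeps re-planning rare while still letting the agent change course.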
Choosing the right balance
There is no single correct point on the planning-execution spectrum; there is a right point for a given task. The deciding factor is how predictable the task is. A stable, repeating job (a monthly close, a fixed migration, a templated report) rewards heavy planning, because the plan barely changes between runs and the cost savings are real. An open-ended, unpredictable job rewards lighter planning and more step-by-step reasoning, because any plan written up front will be wrong by step three.
How buyers can judge the balance
If you are running agents rather than building them, you do not set this balance; the builder does. On a marketplace you describe the outcome, and the agent's internal split is the builder's job. But you can probe its effects with two questions. Can you see the plan before the agent acts on anything irreversible? And when a step fails, does the agent recover and re-plan, or does it simply stop? An agent that exposes its plan and recovers from failure is usually built on a healthier split than one that acts instantly and gives up at the first error. The wider map of what agents can take on is in what can an AI agent actually do.
Frequently asked questions
What is the difference between planning and execution in an AI agent?
Planning is deciding what steps to take to reach a goal. Execution is carrying those steps out by calling tools and acting on systems. Some agents interleave the two on every step, while others write a full plan first and then execute it, which changes their cost, flexibility, and reliability.
Why do agents separate planning from execution?
Separating them lets the agent spend expensive reasoning on the plan once, then run the steps cheaply. It also makes the plan reviewable before any irreversible action runs. The cost is rigidity: if reality drifts from the plan, a pure executor needs a re-planning step to recover gracefully.
What is re-planning in an AI agent?
Re-planning is when an agent stops executing, looks at what actually happened, and revises its plan before continuing. It is the bridge between rigid plan-and-execute and adaptive step-by-step reasoning. Triggering a re-plan when a step fails is what keeps a planned agent from barrelling ahead into a broken state.
Is plan-and-execute better than step-by-step reasoning?
Neither is universally better. Plan-and-execute wins when steps are known in advance and cost matters, because it reasons once. Step-by-step wins when the task is unpredictable and needs to adapt. Most reliable agents blend them: plan the skeleton, then reason within each step and re-plan on failure.
Do buyers need to choose how an agent plans?
No. On a marketplace like Gravity, the builder decides how the agent plans and executes; the buyer describes the outcome. Knowing the split helps you judge an agent: ask whether you can see its plan before it acts, and whether it recovers when a step fails instead of stopping.
Three takeaways before you close this tab
- Two jobs, not one. Planning decides what to do; execution does it; the balance between them is a design choice.
- The split buys cost and review. Plan once to save reasoning tokens and to give a human something to approve.
- Re-planning makes it safe. A planned agent that re-plans on failure gets the savings without the rigidity.
Sources
- Wang et al., "Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning", 2023, arxiv.org/abs/2305.04091
- Yao et al., "ReAct: Synergizing Reasoning and Acting in Language Models", 2022, arxiv.org/abs/2210.03629
- Anthropic, "Building Effective Agents", 2024, anthropic.com/engineering/building-effective-agents
- Gravity agent design notes, internal v1, 2026. Retrieved 2026-06-07.