The terms "AI agent", "chatbot", and "assistant" are used interchangeably by vendors and inconsistently by buyers. The result is that procurement decisions get made on the basis of marketing copy rather than on the basis of the structural properties that actually determine whether the thing will work for the job. This guide is the structural version. The argument is not that one category is better than another; it is that the categories are different jobs and confusing them is what causes most AI procurement to fail.
After three startups in this space, the framework that survived is simple: chatbots talk, assistants help, agents act. Each has a job, each has a structural footprint, and each fails in characteristic ways when used outside its job. The full lineage of how this framework came together lives in three startups, three shutdowns, and in the workflow comparison at describe outcome, not workflow.
Why the terms blur
The blur is partly historical. "Chatbot" dates back to 1990s rule-based systems and became the mainstream label in the 2010s for rule-based or shallow-ML conversational interfaces. "Assistant" was popularised by Siri, Cortana, and Alexa as a category for voice-driven helpers tied to a specific device or platform. "AI agent" is the 2024-2026 term for autonomous systems that take goals and act over time. The categories overlap because the underlying model technology now spans all three jobs.
The blur is also partly commercial. Vendors selling chatbot products in 2024 rebranded as agents in 2025 because the word commands a higher price. Vendors selling assistants did the same. The result is that "AI agent" on a product page can mean anything from a stateless chatbot with a slightly bigger model to a fully autonomous system with persistent memory and tool use. Buyers cannot rely on the label.
What buyers can rely on is structural questions. Does the system remember across sessions? Does it call external tools? Does it run when the user is not watching? Those questions are answerable in a discovery call and they map directly onto whether the thing will do the job. The four axes below are the buyer-facing version of those questions.
Four axes of difference
The four axes that separate chatbot, assistant, and agent are statefulness, autonomy, tool use, and scope.
Statefulness. Does the system carry context between interactions, or does each interaction start fresh? Chatbots are typically stateless or session-only. Assistants carry session-level state and sometimes user-level memory. Agents carry persistent task and world state across long horizons.
Autonomy. Who decides the next step? Chatbots react to user messages and choose nothing else. Assistants suggest steps and execute under user supervision. Agents plan and execute autonomously, escalating to a human only on policy or failure.
Tool use. Does the system call external systems, or does it only generate text? Chatbots typically only generate text. Assistants call a defined set of tools mediated by user approval. Agents discover and call tools as part of their planning loop, often dynamically.
Scope. What is the unit of work? A chatbot's unit is a message. An assistant's unit is a conversation or a session. An agent's unit is a task or a goal that may take minutes, hours, or days to complete.
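To make the axes concrete, here is a minimal sketch of the three structural profiles as data. The field names and values are illustrative, not any standard schema; they are simply the properties a buyer can verify in a discovery call.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StructuralProfile:
    """Buyer-facing footprint along the four axes (illustrative, not a standard)."""
    statefulness: str   # what context survives between interactions
    autonomy: str       # who decides the next step
    tool_use: str       # whether and how external systems get called
    unit_of_work: str   # message, session, or task

CHATBOT = StructuralProfile(
    statefulness="stateless or session-only",
    autonomy="reacts to user messages; decides nothing else",
    tool_use="none; generates text only",
    unit_of_work="message",
)

ASSISTANT = StructuralProfile(
    statefulness="session-level state, sometimes user-level memory",
    autonomy="suggests steps; executes under user supervision",
    tool_use="defined tool set, gated by user approval",
    unit_of_work="conversation or session",
)

AGENT = StructuralProfile(
    statefulness="persistent task and world state across long horizons",
    autonomy="plans and executes; escalates on policy or failure",
    tool_use="discovers and calls tools inside its planning loop",
    unit_of_work="task or goal (minutes, hours, or days)",
)
```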
What a chatbot actually is
A chatbot is a stateless or session-bound conversational interface. The user sends a message; the chatbot responds. The defining property is that the unit of work is the message and the system is not expected to act in the world. Modern LLM-backed chatbots are far more capable than 2010s rule-based bots, but the structural envelope is the same. They talk. They do not act.
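A minimal sketch of that envelope, with a hypothetical complete() standing in for any LLM completion call; the point is what is absent, not what is present.

```python
def complete(prompt: str) -> str:
    """Hypothetical stand-in for any LLM completion API."""
    raise NotImplementedError

def chatbot_turn(user_message: str) -> str:
    # The entire structural envelope: one message in, one reply out.
    # No state survives this call, no tool is invoked, and nothing
    # happens in the world beyond the returned text.
    return complete(f"Answer the customer's question:\n{user_message}")
```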
Chatbots are the right abstraction for FAQ replacement, intent routing, top-of-funnel qualification, and customer-service tier-one deflection. The cost is low, the failure modes are well understood, and the integration burden is small. Where chatbots fail is in jobs that require multi-step action: scheduling, booking, multi-system updates. Those are agent jobs dressed as chat.
What an assistant actually is
An assistant is a co-pilot: a system that works alongside a user, suggesting actions, calling a defined set of tools under user approval, and carrying session-level state. The canonical examples are GitHub Copilot, ChatGPT with tools, and the assistants embedded in office suites. The user is in the loop on every consequential step.
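A sketch of the approval gate that defines this envelope; propose_action and the tool names are hypothetical stand-ins, not any vendor's API.

```python
from typing import Callable

# A defined, named tool catalogue, fixed at deployment time
# (hypothetical examples, not real integrations).
TOOLS: dict[str, Callable[[str], str]] = {
    "run_tests": lambda path: f"ran tests in {path}",
    "format_file": lambda path: f"formatted {path}",
}

def propose_action(context: str) -> tuple[str, str]:
    """Hypothetical model call that proposes a (tool_name, argument) pair."""
    raise NotImplementedError

def assistant_step(context: str) -> str:
    tool_name, argument = propose_action(context)
    # The defining property: a human approves every consequential step.
    if input(f"Run {tool_name}({argument!r})? [y/N] ").strip().lower() != "y":
        return "skipped: user declined"
    return TOOLS[tool_name](argument)
```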
Assistants are the right abstraction for skilled-worker augmentation: developers, analysts, researchers, designers. They reduce the time per task without removing the human's judgment from the loop. Where assistants fail is in jobs that require running unattended; an assistant is, by definition, attended. If the job is "do this every morning at 7am whether anyone is watching", that is not an assistant job.
What an AI agent actually is
An AI agent is an autonomous system that takes a goal, plans steps, calls tools, monitors progress, and runs to completion or escalation. The unit of work is the task. The defining property is that the user is not in the loop on every step; the user is in a supervisory relationship with the agent, not a conversational one. After three startups, the framework I came back to is that an agent is the answer when you want a worker, not a tool.
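A sketch of that supervisory loop; the four helpers are hypothetical stand-ins for a real runtime's planner, tool layer, post-condition check, and escalation policy.

```python
def plan_next_step(state: dict) -> str:
    """Hypothetical planner: the agent, not the user, picks the next step."""
    raise NotImplementedError

def execute_tool(step: str) -> str:
    """Hypothetical tool call: the agent acts in the world."""
    raise NotImplementedError

def goal_met(state: dict) -> bool:
    """Hypothetical structured check on the task's post-condition."""
    raise NotImplementedError

def should_escalate(state: dict) -> bool:
    """Hypothetical policy check: anything off-policy goes to a human."""
    raise NotImplementedError

def agent_run(goal: str, max_steps: int = 50) -> str:
    state: dict = {"goal": goal, "history": []}
    for _ in range(max_steps):
        step = plan_next_step(state)
        result = execute_tool(step)
        state["history"].append((step, result))
        if goal_met(state):
            return "done"
        if should_escalate(state):
            return "escalated"
    return "escalated: step budget exceeded"  # budgets bound runaway loops
```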
Agents are the right abstraction for recurring multi-step tasks: inbox triage, lead enrichment, monitoring and remediation, scheduled reports, multi-system reconciliation. The structural cost is higher: agents need persistent state, tool catalogs, escalation rules, and reliability testing. The 80-test methodology in how we test AI agents exists precisely because agent failures are different from chatbot or assistant failures.
Which one to buy for which job
Match the abstraction to the job:
- Customer FAQ deflection: chatbot.
- Code review and pair programming: assistant.
- Inbox triage that runs daily without supervision: agent.
- Lead enrichment from a CRM trigger: agent.
- Internal doc Q&A: chatbot or assistant, depending on whether the user wants to take action.
- Procurement-ticket reconciliation: agent.
Two failure modes are common. The first is buying an agent for a chatbot job: heavy infrastructure for a thing that just needed a stateless response. The second, more dangerous, is buying a chatbot for an agent job: cheap-looking deployment that fails silently because the underlying need was multi-step action. The future-hub at what is an autonomous AI agent walks through the agent decision in more depth; the workflow comparison at describe outcome, not workflow covers the case against pre-agent automation tools.
Gravity sits firmly in the agent category: persistent runners that take an outcome description, run continuously, and escalate when they cannot make progress. The economics are worked through at economics of bootstrapped AI agents; the bootstrapping logic is at bootstrapping an AI agent platform.
Frequently asked questions
What is the difference between an AI agent and a chatbot?
A chatbot is a stateless responder: it answers a message, returns a reply, and forgets. An AI agent is a stateful, autonomous worker that takes goals, plans steps, calls tools, and runs over time. Chatbots talk; agents act. The structural differences are statefulness, autonomy, and tool use, not the size of the underlying model.
What is the difference between an AI agent and an assistant?
An assistant is a co-pilot: it works alongside a user who is in the loop on every step. An agent is autonomous: it runs without the user in the loop, escalating only when it cannot make progress. The user experience for an assistant is conversational; the user experience for an agent is supervisory. The same model can power both; the operating mode is different.
Are these names just marketing terms?
Partly. Vendors use the words loosely and inconsistently, which is part of the buyer confusion. The structural distinctions are real, though, and they map onto observable properties: does the system remember across sessions, does it call tools, does it run when the user is not watching. Those are the buyer questions.
Which one should a business buy in 2026?
Buy the abstraction that matches the job. Customer FAQ is a chatbot job. Coding alongside a developer is an assistant job. Inbox triage that runs at 7am every day is an agent job. The mistake most buyers make is buying the highest-tier abstraction (an agent) for a job a chatbot would have done at one tenth the cost.
How does an agent know when to stop?
Agents stop when the goal is met, when a budget is exceeded, or when an escalation rule fires. The goal-met signal is a structured check on the post-condition. The budget can be tokens, tool calls, time, or money. Escalation rules trigger when the agent encounters something outside its policy. Stopping correctly is harder than starting correctly.
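A sketch of those three stop conditions as an ordered check; the Budget fields and thresholds are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Budget:
    """Hypothetical caps; real budgets can be tokens, tool calls, time, or money."""
    max_tool_calls: int = 100
    max_cost_usd: float = 5.00
    tool_calls: int = 0
    cost_usd: float = 0.0

    def exceeded(self) -> bool:
        return (self.tool_calls > self.max_tool_calls
                or self.cost_usd > self.max_cost_usd)

def should_stop(goal_met: bool, budget: Budget, off_policy: bool) -> str | None:
    # Checked in order: goal met, budget exceeded, escalation rule fired.
    if goal_met:
        return "done: post-condition verified"
    if budget.exceeded():
        return "stopped: budget exceeded"
    if off_policy:
        return "escalated: outside policy"
    return None  # no stop condition fired; keep running
```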
Three takeaways before you close this tab
- The four axes are observable. Statefulness, autonomy, tool use, scope. Ask in the discovery call.
- The marketing label lies. Vendors call everything an "agent" because the word commands a premium.
- Buy by job, not by tier. An agent for a chatbot job is over-engineered; a chatbot for an agent job will silently fail.