Build vs buy is the wrong opening question for AI agents. The right opening question is: what would have to be true for either answer to be obvious? The four-axis framework that follows (cost, time, capability, control) is designed to surface those conditions, not to declare a winner. Some axes favour build; others favour buy; the right answer is whichever set of axes dominates for the specific task.
The framework draws on the three shutdowns documented in Three Startups, Three Shutdowns. Those shutdowns produced one rule that survived: a feature must be at least three times better than the alternative, not slightly better, to justify building it. That rule applies to build-vs-buy decisions too. If buying gets the buyer to 80 percent of the value at 10 percent of the cost, building has to be three times better than that to win.
The question, framed correctly
The framing that goes wrong: "should we build or buy an agent for X?" The framing that works: "what is the cost of being wrong about each axis, and which axis can we afford to be wrong about?" Build-vs-buy is a portfolio decision under uncertainty, not a yes-or-no question. The framework converts intuition into a score that survives a procurement review.
The pragmatic test: write the four axes on the board, score each axis from one to nine for build and from one to nine for buy, then sum the columns. If the column totals are within three points of each other, the decision is buy by default; the cost of being wrong on a buy decision is one renewal cycle, while the cost of being wrong on a build decision is months of engineering and an internal political conflict. Indifference favours buy.
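The tally above fits in a few lines. A minimal sketch, assuming nothing beyond the text: four axes, one-to-nine scores per column, a three-point indifference margin that defaults to buy. The scores themselves are illustrative.

```python
# Sum the four-axis scores for each column; within a three-point
# margin the default is buy, otherwise the higher column wins.
AXES = ["cost", "time", "capability", "control"]

def decide(build: dict, buy: dict, margin: int = 3) -> str:
    build_total = sum(build[a] for a in AXES)
    buy_total = sum(buy[a] for a in AXES)
    if abs(build_total - buy_total) <= margin:
        return "buy"  # indifference favours buy
    return "build" if build_total > buy_total else "buy"

# Illustrative scores, one to nine per axis
build = {"cost": 3, "time": 2, "capability": 8, "control": 9}
buy = {"cost": 8, "time": 9, "capability": 6, "control": 4}
print(decide(build, buy))  # 22 vs 27, gap of 5 -> "buy"
```

The margin parameter is the whole point of the exercise: it forces the build case to win by a visible gap, not by a rounding error.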
The four axes
- Cost. Total cost of ownership over 24 months. For build: engineers, model spend, tool infrastructure, ops. For buy: licence, integration, switching cost.
- Time. Calendar weeks from decision to first production use. Build is rarely faster than buy; buying a tool that does 80 percent today often beats a 100 percent solution six months out.
- Capability. Does the option meet the autonomy level the task requires? See autonomous vs assistive AI for the five-axis scoring of autonomy levels.
- Control. Data residency, compliance, customisation depth, dependency risk. Build wins when control is non-negotiable; buy wins when standardisation is acceptable.
The axes are not equally weighted. Capability is binary at a threshold: either the option meets the required autonomy level or it does not. Cost and time are continuous. Control is contextual; for some buyers it dominates, for others it is irrelevant.
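The binary nature of the capability axis can be made concrete as a gate that runs before any score comparison. A sketch under one assumption: autonomy is expressed as an integer level, with the required level set per task (the level values here are illustrative, not from the text).

```python
# Capability is pass/fail: an option that misses the required autonomy
# level is eliminated before cost, time, or control are compared.
def viable(option: dict, required_autonomy: int) -> bool:
    return option["autonomy_level"] >= required_autonomy

def shortlist(options: dict, required_autonomy: int) -> list:
    return [name for name, o in options.items()
            if viable(o, required_autonomy)]

options = {
    "build": {"autonomy_level": 4, "cost_score": 3, "time_score": 2},
    "buy":   {"autonomy_level": 2, "cost_score": 8, "time_score": 9},
}
# A task that needs level-3 autonomy removes the buy option outright,
# however well it scores elsewhere.
print(shortlist(options, required_autonomy=3))  # ['build']
```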
Cost, with real numbers
A production-grade single-capability agent built in-house in 2026 costs roughly six months of one to two senior engineers plus model spend and tool infrastructure. At Bangalore engineering rates documented in economics of bootstrapped AI agents, that runs 30,000 to 60,000 USD. At US rates, 150,000 to 300,000 USD. The variance is engineer count and seniority, not technology.
Buying runs differently. Per-task pricing on agent platforms in 2026 typically falls in the 0.10 to 2.00 USD per execution range depending on complexity (multi-tool tasks land at the high end). Per-seat copilot pricing runs 20 to 50 USD per user per month. Outcome-described platforms tend toward flat or capability-based pricing for predictability. For most mid-market tasks, the buy option pays for itself before the build option ships.
The exception is high-volume tasks where per-task pricing exceeds engineer cost over 24 months. The break-even point depends on volume, complexity, and pricing model. For tasks above roughly 100,000 executions per month at 1 USD per task, in-house builds start to become cost-competitive on cost alone, even before capability and control are factored in.
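The break-even arithmetic can be made explicit. A sketch using the figures quoted above (100,000 executions per month at 1.00 USD per task, 150,000 to 300,000 USD US build cost); the 40,000 USD monthly run cost for the build option is an illustrative assumption, not a figure from the text.

```python
# 24-month total cost of ownership for each option.
def buy_tco(execs_per_month: float, price_per_task: float,
            months: int = 24) -> float:
    return execs_per_month * price_per_task * months

def build_tco(upfront: float, monthly_run: float,
              months: int = 24) -> float:
    return upfront + monthly_run * months

# At the quoted threshold: 100,000 executions/month at 1.00 USD/task.
buy = buy_tco(100_000, 1.00)        # 2,400,000 USD
# Assumed: 300k upfront (top of the US range) plus 40k/month model,
# tooling, and ops spend -- the 40k is illustrative.
build = build_tco(300_000, 40_000)  # 1,260,000 USD
print(buy > build)  # True: at this volume, build wins on cost alone
```

Changing the per-task price or the assumed run cost moves the crossover point, which is exactly the sensitivity the text describes.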
The hidden cost: reliability
The cost most buyers underestimate is reliability work. A working prototype is roughly one third of the way to production. The remaining two thirds is the discipline that turns "it worked" into "it always works": the kind of testing documented in 80-test methodology. Eighty tests per capability, eight categories, weighted scoring, monthly re-checks. None of it feels like product work; all of it determines whether the agent runs unattended.
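The weighted-scoring idea can be sketched directly. The 80-test total split across eight categories follows the text; the category names and weights below are illustrative assumptions, not Gravity's actual set.

```python
# Weighted reliability score: each category contributes its pass rate
# times its weight. Ten tests per category gives the 80-test total.
# Category names and weights are illustrative.
categories = {
    "happy_path":      {"weight": 0.10, "passed": 10, "total": 10},
    "edge_cases":      {"weight": 0.15, "passed": 9,  "total": 10},
    "error_recovery":  {"weight": 0.20, "passed": 8,  "total": 10},
    "tool_failures":   {"weight": 0.15, "passed": 9,  "total": 10},
    "ambiguous_input": {"weight": 0.10, "passed": 7,  "total": 10},
    "long_running":    {"weight": 0.10, "passed": 9,  "total": 10},
    "concurrency":     {"weight": 0.10, "passed": 8,  "total": 10},
    "regression":      {"weight": 0.10, "passed": 10, "total": 10},
}

def reliability_score(cats: dict) -> float:
    return sum(c["weight"] * c["passed"] / c["total"]
               for c in cats.values())

print(f"{reliability_score(categories):.1%}")  # 87.0%
```

The weights are where the judgment lives: a platform that aces the happy path but drops error-recovery tests scores worse than a flat pass rate would suggest.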
Building this discipline from scratch is a six-month project on top of the agent itself. It is the main reason in-house agent projects miss timelines and budgets. Buying a platform that has done the reliability work transfers that cost. Buying without checking whether the platform has done the reliability work transfers nothing; it just hides where the failure will surface.
The buyer-side test is direct: ask the vendor for their reliability methodology in writing. If they cannot describe what they test, how they weight failures, and how often they re-check, the reliability work has not happened. The stop-after-one-task failure mode is what unhandled reliability looks like in production.
Reading the decision
The decision pattern: score the four axes, weight capability as a binary threshold, prefer buy on indifference. Build only when the buy column loses on capability or control by enough that the 3x rule from Three Startups, Three Shutdowns is met. Build because the buy option is missing something fundamental, not because it is missing something nice-to-have.
The Gravity bet is that for a wide class of business tasks, the buy option will win on the framework starting in 2026. The product framing in describe outcome, not workflow exists to compress the time-to-value gap that has historically favoured build. If the time axis collapses (60-second deployment), the cost and capability advantages compound.
Frequently asked questions
Should I build or buy an AI agent?
Buy if the task is common, the data is non-sensitive, and time-to-value matters. Build if the task is core differentiation, the data cannot leave your perimeter, or you need autonomy levels no platform offers. The four-axis framework (cost, time, capability, control) gives the structured answer beyond gut feel.
How much does it cost to build an AI agent in-house?
A production-grade single-capability agent costs roughly six months of one to two senior engineers plus model and tool infrastructure. At Bangalore rates that runs 30,000 to 60,000 USD; at US rates 150,000 to 300,000 USD. The reliability work is the long tail: getting from a working prototype to 95 percent reliability often takes longer than the prototype itself.
What does buying an AI agent platform actually cost?
Most platforms charge per-task, per-seat, or per-capability in 2026. Per-task pricing runs 0.10 to 2.00 USD per agent execution depending on complexity. Per-seat pricing for copilot-style products runs 20 to 50 USD per user per month. Outcome-described platforms tend toward flat or capability-based pricing, which is more predictable for buyers.
When does building an AI agent make sense?
Building makes sense when the agent is core IP, when data residency or compliance demands an air-gapped deployment, or when no available platform reaches the autonomy level required. For non-core tasks, buying is almost always faster and cheaper, even at the highest quoted platform prices.
What is the hidden cost of building an AI agent?
Reliability work is the largest hidden cost. The 80-test methodology applied at Gravity takes around four hours per capability and is re-run on an ongoing basis. Building this discipline in-house from scratch is a six-month project that does not feel like product work and is the main reason in-house agent projects miss timelines.
Three takeaways before you close this tab
- Score four axes; do not argue from gut. Cost, time, capability, control.
- Reliability is the long tail. Buy a platform that has done the work or budget for doing it yourself.
- Indifference favours buy. One renewal cycle is cheaper than a failed build.
Sources
- Aryan Agarwal, "Three Startups, Three Shutdowns", May 2026, /blog/three-startups-three-shutdowns/
- Bessemer Venture Partners, "State of the Cloud 2025", retrieved 2026-05-07, bvp.com/atlas/state-of-the-cloud-2025
- a16z, "16 Changes to the Way Enterprises Are Building and Buying Generative AI", retrieved 2026-05-07, a16z.com
- Mialon et al., "GAIA: A Benchmark for General AI Assistants", arXiv:2311.12983, 2023, retrieved 2026-05-07, arxiv.org/abs/2311.12983
- NIST, "AI Risk Management Framework", retrieved 2026-05-07, nist.gov/itl/ai-risk-management-framework