The public version of the Super AI postmortem is on this site already (read it here). It explains the structural failure: an all-in-one AI router built in 2024, killed in March 2025, in a category that was getting commoditised by the underlying foundation models. That is the right framing. It is not the whole story.

This post is the deeper version. The decisions I made from the inside, the moments I should have called it earlier, and the specific things I would undo if I started Super AI over today. Some of these are tactical; some are structural. All of them are now rules in how I build Gravity.

The thesis I should have killed

The Super AI thesis in 2024 was: foundation models are improving fast, picking the right model for each task is annoying, a router that does it for you is real value. It read true at the time. By the second half of 2024 it was already softening, and by early 2025 it was wrong.

The signal I missed: model output quality was converging at the top. GPT-4, Claude 3.5, Gemini 1.5; the differences mattered for benchmarks, less for the jobs my users actually ran. Once the floor was good enough, the value of routing dropped to near zero. The wrapper had no margin to defend.

The decision I would undo: at the GrowthX Capstone in October 2024, two reviewers told me the routing thesis was already weakening. I treated their input as a discount on a thesis I already believed, instead of as a sign to redesign. The right response would have been to pick a single high-value job, build that job 10x better, and let routing be a hidden implementation detail, not the headline.

Three startups later, I now hold a hard rule: if two outside reviewers, separately, push back on the headline thesis, the headline is wrong. Not "needs to be communicated better". Wrong. Three Startups, Three Shutdowns walks through how this rule was forged.

The scope I should have refused

Super AI was scoped as an all-in-one platform. Chat, code generation, image generation, document Q&A, transcription, the whole menu. The reasoning was that any one of these capabilities by itself was a small product, but together they were a platform. The reasoning was wrong.

Scope is the second-most expensive decision a founder makes after thesis. Each capability comes with its own user expectation, its own failure modes, its own pricing pressure, and its own competitor. Adding a sixth capability did not make the product 6x more valuable; it made it 6x harder to be the best at any one thing. The user could not point at Super AI and say "this is the thing I open when I need to do X". That is the death sentence for a wrapper.

The decision I would undo: I would have killed four of the six capabilities by month three. The single capability I would have kept is the one that had the highest retention against a defined job. Everything else was a distraction the early users were polite about.

This is now a rule for Gravity: one capability ships per week, end-to-end, with a unit-economics check before the next one starts. The rule is detailed in Bootstrapping an AI Agent Platform in 2026.

The pricing model I should have rebuilt

Super AI was priced as a flat monthly subscription. Twenty dollars a month, all-you-can-use within reason. The pricing model was inherited from SaaS instinct, not from the cost curve underneath. Every paying user ran a different volume; every dollar I charged was either a discount on a heavy user or a tax on a light one. The cost-per-active-user was unpredictable in the wrong direction.

The structural fix would have been capability-based pricing aligned to the cost-of-inference per capability. Light capabilities priced low, heavy capabilities priced high, and the per-capability margin checked monthly against the underlying API spend. I half-built this in early 2025 but never shipped it, because the all-in-one positioning conflicted with capability-based pricing.
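The monthly margin check could be sketched like this. A minimal sketch only: every capability name, price, and API cost below is a hypothetical placeholder, not Super AI's actual numbers, and the 30% minimum margin is an invented bar.

```python
# Sketch of a monthly per-capability margin check. Every capability name,
# price, and API cost below is a hypothetical placeholder, not real data.
MIN_MARGIN = 0.30  # assumed minimum acceptable gross margin per capability

# capability -> (price charged per unit, underlying API spend per unit)
capabilities = {
    "chat":          (0.010, 0.004),
    "transcription": (0.050, 0.030),
    "image_gen":     (0.080, 0.065),
}

def margin_report(caps, min_margin=MIN_MARGIN):
    """Per-capability gross margin, plus whether it clears the bar."""
    report = {}
    for name, (price, api_cost) in caps.items():
        margin = (price - api_cost) / price
        report[name] = (margin, margin >= min_margin)
    return report

for name, (margin, ok) in margin_report(capabilities).items():
    print(f"{name:>13}: gross margin {margin:.1%} -> {'OK' if ok else 'REPRICE'}")
```

Run against real API spend each month, the point is that a capability whose margin dips below the bar gets repriced or cut before the next one ships.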

Super AI: monthly cost vs price by user cohort (illustrative)

Cohort               Price / user    API cost / user
Light user (5%)      $20             below price
Median user (70%)    $20             at or below price
Heavy user (20%)     $20             cost > price
Power user (5%)      $20             deep loss

Source: Aryan Agarwal, Super AI internal cohort review, Q1 2025.
The top 25% of users were money-losing every month. Flat-rate pricing made the loss unfixable.
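To make the inversion concrete, here is a sketch where the cohort shares follow the table above but the per-user API costs are invented assumptions, chosen only to show the shape of the problem:

```python
# Illustrative only: cohort shares follow the table above; the per-user
# API costs are invented to show the shape of the inversion, not real data.
FLAT_PRICE = 20.0  # $/user/month

# cohort -> (share of users, assumed API cost per user per month)
cohorts = {
    "light":  (0.05,  4.0),
    "median": (0.70, 15.0),
    "heavy":  (0.20, 35.0),
    "power":  (0.05, 90.0),
}

blended_cost = sum(share * cost for share, cost in cohorts.values())
blended_margin = FLAT_PRICE - blended_cost

for name, (share, cost) in cohorts.items():
    print(f"{name:>6}: margin/user ${FLAT_PRICE - cost:+.2f}")
print(f"blended margin/user ${blended_margin:+.2f}")  # negative overall
```

With these assumed numbers, 75% of users are individually profitable and the blended book still loses money every month, which is exactly why the loss was unfixable at a flat price.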

The decision I would undo: I would have shipped capability-based pricing in month two, before any meaningful user volume. Re-pricing live users is harder than pricing right from the first paying customer. The same lesson is detailed in economics of bootstrapped AI agents.

The hire I should not have made

I made one early hire in Super AI for engineering capacity. The hire was technically excellent. The hire was also a fixed-cost addition to a product whose unit economics had not yet balanced. Adding fixed cost to a wrapper with a converging margin meant the kill threshold moved closer, not further.

The decision I would undo: I would have hired no one until the per-active-user margin could absorb a salary without the cost-per-active-user crossing the kill threshold. That is a much higher bar than "we have runway". Runway is not the same as unit economics; one of the three checks I missed was confusing the two.

Gravity's hiring rule: the first hire goes in only when the per-active-agent margin can cover them with room to spare. Not when the founder is overworked. Not when the runway looks long. Only when the math works.
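The gate described above can be written down as a check. A sketch under stated assumptions: the `buffer` multiplier and every number in the calls are hypothetical, not Gravity's actual thresholds.

```python
# Sketch of the hiring gate. The buffer multiplier and all numbers
# below are hypothetical assumptions, not Gravity's actual thresholds.
def can_hire(active_users, margin_per_user, salary_monthly,
             cost_per_user, kill_threshold, buffer=1.5):
    """True only if total margin covers the salary with room to spare AND
    the added fixed cost keeps cost-per-active-user under the kill line."""
    covers_salary = active_users * margin_per_user >= buffer * salary_monthly
    new_cost_per_user = cost_per_user + salary_monthly / active_users
    return covers_salary and new_cost_per_user < kill_threshold

# Long runway, thin margin: the gate says no.
print(can_hire(active_users=800, margin_per_user=6.0, salary_monthly=8000,
               cost_per_user=12.0, kill_threshold=18.0))   # False
# Same margin per user, enough active users: the math works.
print(can_hire(active_users=3000, margin_per_user=6.0, salary_monthly=8000,
               cost_per_user=12.0, kill_threshold=18.0))   # True
```

Note that runway appears nowhere in the function, which is the point: the gate is a margin condition, not a cash condition.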

The timing I should have read better

By late 2024, the AI assistant category was already showing structural cracks. ChatGPT was rolling out custom GPTs as native capability; Claude was shipping Projects; Gemini was bundling into Workspace. The substrate was eating the wrappers. I read the signal as competitive pressure, not as category obsolescence. They are not the same.

Competitive pressure is something you fight with better execution. Category obsolescence is something you survive only by repositioning into a different category. Super AI was suffering category obsolescence; I treated it like competitive pressure. That was a perception error, not a strategy error. The decision I would undo is the calendar one: I would have reviewed the category-obsolescence question quarterly, not annually.

The same reading error appears in the Vibe AI postmortem (read it here). Different category, same misread. Pattern recognition is what makes shutdowns useful; the framework that emerged is in Three Startups, Three Shutdowns.

What Gravity inherits from this

Five rules that came directly out of Super AI's mistakes, every one of them now hard-coded into how Gravity gets built:

1. If two outside reviewers, separately, push back on the headline thesis, the headline is wrong.
2. One capability ships per week, end-to-end, with a unit-economics check before the next one starts.
3. Price per capability, aligned to cost-of-inference, from the first paying customer.
4. The first hire goes in only when the per-active-agent margin can cover them with room to spare.
5. Review the category-obsolescence question quarterly, not annually.

Frequently asked questions

What was the single biggest mistake with Super AI?

Treating the routing problem as the product instead of as a feature. Routing was a wrapper over models that were converging fast. Once GPT-4-class quality became the floor in late 2024, the wrapper had no margin to defend. The product needed a defensible job-to-be-done, not a clever middle layer.

Why didn't pricing fix the unit economics?

Flat-rate pricing on a usage-priced underlying API guarantees an inverted cost curve. Heavy users cost more than they paid; light users paid more than they cost. Switching to capability-based pricing would have helped, but only if I had also reduced scope. I tried neither in time.

Was the all-in-one positioning a mistake from day one?

Yes. All-in-one means no specific user remembers you for one thing. The closest thing to a moat in a wrapper category is a specific job done 10x better than the underlying model alone. All-in-one positioning makes that moat impossible to articulate, let alone build.

Should you have raised more capital?

No. More capital would have funded a longer run at the wrong thesis. The structural problem was the category, not the runway. Capital does not fix a wrapper that is being commoditised by the underlying API. It just buys more months of the same losing equation.

What did Super AI teach you that is shaping Gravity?

Pick a job, not a category. Build the agent so the user remembers a specific outcome, not a generic capability. Price aligned to cost-of-inference per agent, not flat across all behaviour. Refuse scope creep ruthlessly in the first six months. Gravity is the fourth bet, the one where these are non-negotiable.
