Month 1, Building Gravity in Public: What Shipped, What Didn't

This is month 1 of Gravity, in public. Public means: I write the retro before I have all the answers, name what shipped and what didn't, and let the kill thresholds I set at month start judge themselves. The point is to make rationalisation harder, not to write a press release. After three shutdowns, I take rationalisation seriously.

For context, Gravity is an autonomous AI agent platform: a single description box, an agent deployed in 60 seconds, and an agent that runs 24/7. The bootstrap framework that produced the cadence is in "Bootstrapping an AI Agent Platform in 2026". The unit-economics framework is in the post on the economics of bootstrapped AI agents. The failure framework that produced the discipline is in "Three Startups, Three Shutdowns".

What shipped

Month 1 was infrastructure and content. Specifically:

Marketing site at gravity.fast. Cloudflare Workers Static Assets, security headers, IndexNow on every deploy, sitemap and RSS auto-generation, GitHub Actions auto-deploy on push and a daily 09:30 UTC cron for scheduled-publish promotion.
Waitlist surface. Two-field form, double-opt-in, automated thank-you email with what-to-expect content. Backed by Cloudflare D1 with the standard daily backup and Cloudflare Web Analytics RUM for funnel monitoring.
Founder narrative cluster (hub + 10 spokes = 11 posts). The founder authority cluster is now publicly readable: three startups synthesis (hub), MindWave postmortem, Super AI postmortem, Vibe AI postmortem, three checks framework, bootstrapping playbook, 80-test reliability methodology, anti-Zapier thesis, unit economics, workflow timing, plus this retro.
Reliability methodology documentation. The 80-test methodology is committed to a public spec, weighted by failure-cost category, ready to apply to the first shipping capability.
SEO foundation. Schema.org markup on every post (BlogPosting, BreadcrumbList, FAQPage, Person, Organization). Cluster-internal linking is set. The 9-pillar 300-post calendar is committed. The deploy chain follows Cloudflare's Workers Static Assets documentation and pings IndexNow on every push.
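The IndexNow step in a deploy chain like this can be sketched roughly as below. This is an illustration under assumptions, not the actual Gravity deploy script: the function name, the key value, and the key file location are placeholders. The payload fields (host, key, keyLocation, urlList) and the shared endpoint follow the public IndexNow protocol.

```python
import json
import urllib.request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host, key, urls):
    """Build (but do not send) an IndexNow bulk-submission request."""
    payload = {
        "host": host,
        "key": key,
        # Assumption: the key file is served from the site root.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": list(urls),
    }
    return urllib.request.Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


# Placeholder key, for illustration only.
request = build_indexnow_request(
    "gravity.fast",
    "hypothetical-indexnow-key",
    ["https://gravity.fast/blog/three-startups-three-shutdowns"],
)
# A deploy step would then send it with urllib.request.urlopen(request).
```

Building the request separately from sending it keeps the payload easy to inspect in CI before any network call is made.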

What didn't

The product. The waitlist is live; the agent is not. That's the headline miss.

Two capabilities (inbox triage and KPI report) were scoped and partially built. Neither cleared the 80-test bar in time. Inbox triage failed on the refusal-correctness category (over-compliant when emails contained instruction-shaped phrasing). KPI report failed on the schema-drift category (one of the upstream APIs changed its response shape mid-build). Both are fixable; neither was fixable inside the month-1 window.
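To make "weighted by failure-cost category" concrete, a weighted pass rate can be sketched as below. The two failing category names come from this retro. The weights, the third category ("formatting"), and every per-category count are invented for illustration and are not taken from the published spec.

```python
from dataclasses import dataclass


@dataclass
class CategoryResult:
    name: str
    weight: float  # hypothetical failure-cost weight
    passed: int
    total: int


def weighted_pass_rate(results):
    """Weight each category's pass fraction by its failure-cost weight."""
    total_weight = sum(r.weight for r in results)
    weighted = sum(r.weight * r.passed / r.total for r in results)
    return weighted / total_weight


# Illustrative numbers only: 80 tests split across three categories.
results = [
    CategoryResult("refusal-correctness", 3.0, 15, 20),
    CategoryResult("schema-drift", 2.0, 20, 20),
    CategoryResult("formatting", 1.0, 40, 40),
]
rate = weighted_pass_rate(results)
raw = sum(r.passed for r in results) / sum(r.total for r in results)
print(f"raw pass rate: {raw:.4f}, weighted pass rate: {rate:.4f}")
```

With these made-up numbers the raw rate is 75/80 = 93.75% while the weighted rate is 87.5%, which shows how failures in a high-cost category can hold a capability below the bar even when most tests pass.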

Three planned blog clusters slipped. P1 definitional was scheduled for week 1 alongside P4; it got pushed because the founder-narrative cluster expanded from 10 posts to 11 (the brief for the hub turned into hub + 10 spokes once I started writing it, not 10 posts total). P9 tutorials and P3 use cases stayed on the month-1 calendar but at a smaller share than originally scoped.

The honest read on the slip: month 1 was infrastructure and content, not product. Infrastructure-first was the right sequence for a bootstrap (the marketing site, sitemap, schema, and content cluster are the things that compound), but it does not feel like progress when the founder narrative is "the product is shipping" and the product hasn't shipped.

What surprised me

Three things I did not expect at month start.

Surprise 1: infrastructure took longer than the schedule. The Cloudflare Workers Static Assets setup, security headers, IndexNow integration, GitHub Actions deploy chain, and Cloudflare D1 waitlist surface took roughly 2.5x the scoped time. Some of this was Cloudflare-specific (debugging Workers asset routing edge cases), and some was the standard greenfield-infrastructure overhead that founders forget about between startups.

Surprise 2: founder-led distribution split unevenly. The X-led distribution plan underperformed; LinkedIn long-form posts converted to waitlist signups at roughly 4x the X rate per post. Reddit was higher than expected for traffic but lower for signup conversion. The mix I'd planned was 60% X, 25% LinkedIn, 15% Reddit; the mix that worked was closer to 25% X, 55% LinkedIn, 20% Reddit. Channel switching is cheap; the lesson is to switch on data, not on prior expectations.

Surprise 3: the founder-narrative cluster generated more inbound conversation than the product page. Most week-1 inbound emails referenced the postmortems, not the homepage. The founder-narrative content compounds faster than I'd modelled, which is consistent with the strategy doc, but consistent in a way that surprised me operationally. The implication: lean harder on this cluster in month 2 distribution.

Channel mix shifted toward LinkedIn against the planned weighting. Switch made on month-2 calendar.

The kill thresholds, checked

Month 1 had three kill thresholds. Two are below target.

Threshold 1: cost-per-active-agent. Not yet measurable; no live agents. Carried to month 2.

Threshold 2: capability ship cycle (target 3 weeks). Actual: 4-5 weeks for the first capability, and neither capability shipped. The threshold framework requires capability retirement if the pattern persists. Response: re-scope inbox triage to the smallest viable version that can pass the 80-test bar, with refusal-correctness as the binding constraint. Re-scope KPI report to operate on a single fixed schema (Stripe + GA4) rather than handle arbitrary upstream APIs. Both re-scopes shrink the capability and improve ship probability.

Threshold 3: distribution conversion (target 50 signups/week from founder-led posts). Actual: roughly 35 in week 4, ramping from below 10 in week 1. Below target overall, on a positive ramp. Response: keep the cadence; switch the mix toward LinkedIn long-form and Reddit comment-led seeding (the channels that are actually working) for month 2.
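One way to encode this kind of calendar check is a small table of targets and actuals. The sketch below uses the figures from this retro; the data structure and the status labels are illustrative assumptions. No cost target is stated here, so that entry carries a placeholder, and the ship cycle uses 4 weeks, the optimistic end of the 4-5 week range.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Threshold:
    name: str
    target: float
    actual: Optional[float]  # None means not yet measurable
    lower_is_better: bool


def status(t):
    """Return 'unmeasured', 'on target', or 'below target'."""
    if t.actual is None:
        return "unmeasured"
    if t.lower_is_better:
        ok = t.actual <= t.target
    else:
        ok = t.actual >= t.target
    return "on target" if ok else "below target"


thresholds = [
    # No cost target is stated in this retro; 0.0 is a placeholder.
    Threshold("cost-per-active-agent", 0.0, None, True),
    # 4 weeks is the optimistic end of the 4-5 week actual.
    Threshold("capability ship cycle (weeks)", 3, 4, True),
    Threshold("signups per week", 50, 35, False),
]
report = {t.name: status(t) for t in thresholds}
for name, result in report.items():
    print(f"{name}: {result}")
```

Keeping "unmeasured" as its own status, rather than folding it into pass or fail, matches the decision to carry threshold 1 forward instead of judging it.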

What changes in month 2

Three concrete changes for month 2:

Ship the first capability. Re-scoped inbox triage clears the 80-test bar and goes to the first 100 waitlist users for closed beta. The success criterion is week-1 retention above 60% and weighted-test pass rate above 95% at month end.
Shift the distribution mix on calendar. 25% X / 55% LinkedIn / 20% Reddit. One LinkedIn long-form per week, two Reddit conversation threads, one X thread. Total founder distribution time stays at ~10 hours/week.
Continue blog cadence. 10 posts/day. Cluster build expands from P4 (founder narrative, 11 posts shipped) into P1 (definitional, 30 spokes scheduled) and P9 (tutorials, 15 spokes scheduled). The cannibalisation rule applies starting with P2 batches; P4, P1, P9 are low-cannibalisation clusters.
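As rough arithmetic, the shift from the planned mix to the month-2 mix can be applied to the ~10 hours/week budget. This sketch assumes the percentages are shares of founder distribution time, which the post does not state explicitly; they could equally be shares of posts.

```python
WEEKLY_HOURS = 10  # the "~10 hours/week" budget from the post

planned = {"X": 60, "LinkedIn": 25, "Reddit": 15}  # percent
month2 = {"X": 25, "LinkedIn": 55, "Reddit": 20}   # percent


def hours(mix, budget=WEEKLY_HOURS):
    """Convert a percentage mix into hours per channel."""
    assert sum(mix.values()) == 100, "mix must sum to 100%"
    return {channel: budget * pct / 100 for channel, pct in mix.items()}


planned_hours = hours(planned)
month2_hours = hours(month2)
shift = {c: month2_hours[c] - planned_hours[c] for c in planned}

for channel in planned:
    print(f"{channel}: {planned_hours[channel]} h -> "
          f"{month2_hours[channel]} h ({shift[channel]:+.1f} h)")
```

Under that assumption, X drops by 3.5 hours a week, LinkedIn gains 3.0, and Reddit gains 0.5, with the total unchanged.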

Open questions for the audience

Three things I'd appreciate input on, if you've worked through similar questions:

Refusal-correctness calibration. Inbox triage failed on over-compliance with instruction-shaped emails. If you've shipped agents that read untrusted text, what's your rule for distinguishing "this email contains instructions for the user" from "this email contains instructions for the agent"?
Closed-beta sizing. 100 users is a guess based on the 80-test pass rate and how many failure modes I expect to surface in real traffic. If you've run agent betas, what cohort size produced enough signal without overwhelming support?
Distribution channels you'd add. The X / LinkedIn / Reddit mix is what's working. If you've found a channel that works specifically for outcome-based-agent products, I'd like to hear about it.

Email is at the top of /contact. The next public retro is month 2.

Frequently asked questions

What is Gravity AI?

Gravity is an autonomous AI agent platform: a single description box where the user types a recurring task and the platform deploys an agent in 60 seconds that runs 24/7. It is bootstrapped, founded by Aryan Agarwal in Bangalore through XAI Technologies Pvt Ltd, and is in pre-launch waitlist as of May 2026.

What shipped in month 1?

The marketing site, the waitlist surface, the founder narrative content cluster (11 posts), the SEO infrastructure (sitemap, RSS, IndexNow integration, Cloudflare Workers security headers), and the first reliability methodology documentation. The product itself is not yet shipped to users; the waitlist has launched, and the public commitment is to ship the first capability before the end of month 2.

What didn't ship in month 1?

The product itself. The waitlist is live; the agent is not. Two capabilities were scoped and partially built but did not pass the 80-test methodology bar. Three planned blog clusters were delayed. The honest read is that month 1 was infrastructure and content, not product, and that infrastructure-first was the right sequence for a bootstrap, even though it does not feel like progress.

What were the kill thresholds for month 1?

Three kill thresholds were set on calendar: cost-per-active-agent (not yet measurable, no live agents), capability ship cycle (target 3 weeks, actual 4-5 weeks for the first capability), and distribution conversion (target 50 waitlist signups per week from founder-led posts, actual roughly 35 in week 4). Two of the three are below target. The framework requires repricing or capability retirement if patterns persist; the response is the re-scope and channel-mix shift detailed above.

What changes in month 2?

Month 2 ships the first capability through the 80-test methodology to early waitlist users. Distribution shifts toward LinkedIn long-form and Reddit comment-led seeding because the X-led mix in month 1 underperformed the target. The blog calendar continues at 10 posts per day; the cluster build expands beyond founder narrative into definitional and tutorial content.

Three takeaways before you close this tab

Infrastructure-first is the right sequence for a bootstrap. It does not feel like progress. It is.
Channel mix should switch on data, not on prior expectations. X underperformed; LinkedIn overperformed. Switch made.
Kill thresholds checked on calendar, not on intuition. Two below target; responses are scoped and on the month-2 calendar.

Sources

Aryan Agarwal, "Bootstrapping an AI Agent Platform in 2026", 2026, gravity.fast/blog/bootstrapping-ai-agent-platform-2026
Aryan Agarwal, "How We Run 80+ Tests Per AI Agent Capability", 2026, gravity.fast/blog/how-we-test-ai-agents-80-tests
Aryan Agarwal, "Three Startups, Three Shutdowns", 2026, gravity.fast/blog/three-startups-three-shutdowns
Cloudflare, "Workers Static Assets documentation", retrieved 2026-05-05, developers.cloudflare.com/workers/static-assets
IndexNow, "Submitting URLs to Bing/Yandex/Naver/Seznam", retrieved 2026-05-05, indexnow.org