Week 1 of Gravity is in the books. This is the tactical retrospective, written in the format I plan to use weekly: what shipped, what tools settled in, what decisions got made, what problems hit, week-1 metrics, and what changes for week 2. The cadence is the point. The discipline of writing this every week is the unit-economics enforcement tool I detailed in bootstrapping an AI agent platform in 2026.
Build-in-public is not a marketing tool first; it is a discipline tool first. The marketing benefit is downstream of the discipline benefit. If the writing is honest, the marketing follows. If the marketing is leading, the writing softens, and the discipline collapses.
What shipped
One capability, end-to-end, tested against the methodology at how we test AI agents. The capability covered the smallest version of the autonomous-outcome flow: describe the outcome in one sentence, agent runs, agent reports completion. Eighty tests pass; refusal correctness is on track; cost-per-active-agent is bounded.
Beyond the capability, the marketing site went live with twenty blog posts seeded across the founder cluster. The posts are deliberately structured around the framework that emerged from three shutdowns; each one wires back to the hub at three startups, three shutdowns. The cluster math: twenty posts compounding internal-link equity is a different shape than twenty posts standing alone. The hub-and-spoke approach is the cheapest way for a single founder to build search authority.
Tools that settled in
The tooling stack that ended week 1 is small and intentional. Smaller is faster for a single founder.
- Cloudflare Workers for the marketing site and the API edge. Pay-per-request, global, zero idle cost.
- OpenAI and Anthropic APIs for foundation-model calls. Capability-priced; the per-capability cost is tracked weekly.
- Stripe for billing. Capability-based pricing tied to the cost-of-inference per agent, set up in week 1 so the live behaviour is correct from the first paying customer.
- An internal capability registry built in TypeScript. Each capability has a name, a cost ceiling, a refusal table, and a test set. Adding a capability is editing the registry plus shipping the test set; a sketch of the shape follows this list.
- Cloudflare Web Analytics for traffic. Privacy-respecting, no third-party cookies, free.
- X, LinkedIn, Reddit, IndieHackers, Hacker News for distribution. One channel per audience.
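For concreteness, here is what a registry entry could look like. This is a minimal sketch, not the production code: the field names, the values, and the autonomous-outcome entry are illustrative assumptions, and the real registry carries more than this.

```typescript
// Hypothetical shape of a capability registry entry (illustrative only).
type RefusalRule = {
  pattern: RegExp; // request shapes this capability refuses
  reason: string;  // the calendar reason surfaced back to the requester
};

type Capability = {
  name: string;
  costCeilingUsd: number;      // per-run ceiling; a run projected above it is refused
  refusalTable: RefusalRule[];
  testSet: string[];           // test files that must pass before the capability ships
};

const registry: Capability[] = [
  {
    name: "autonomous-outcome",
    costCeilingUsd: 0.5,
    refusalTable: [
      { pattern: /multi-agent/i, reason: "not on the next-six-weeks roadmap" },
    ],
    testSet: ["tests/autonomous-outcome.test.ts"],
  },
];
```

The point of the shape is that the cost ceiling and the refusal table are data, not code: the unit-economics constraints live next to the capability they constrain.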
What is conspicuously absent: a CRM, a paid analytics tool, a CI/CD platform beyond GitHub Actions, a project-management app. A single founder does not need any of those. Adding tools adds maintenance cost; the rule is "tool only when the absence is hurting".
Decisions taken
Three meaningful decisions in week 1.
Decision 1, capability ordering. The first capability is the autonomous-outcome flow described above, not a flashier demo. The reasoning is in the mistakes I made with Super AI: the first capability shapes the first user, and the first user shapes the product more than the founder does. The capability picked the user the company wants to keep building for.
Decision 2, no premature hire. Multiple offers of "I would join" came through the inbox in week 1. Every one was politely deferred to month three or later, on the principle that the per-active-agent margin must be able to absorb the salary before the hire happens. The reasoning is in the mistakes I made with Super AI; hiring before the unit economics balance just makes the imbalance worse.
Decision 3, refusal log live. Every feature request that does not match the next-six-weeks roadmap goes into a public refusal log with a calendar reason. The reasoning is in what Vibe AI taught me about product: refusal is a feature when the underlying economics are tight.
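To make the mechanics concrete, a refusal-log entry can be very small. A minimal sketch, with the caveat that the entry shape and the example request are invented for illustration, not lifted from the actual log:

```typescript
// Hypothetical shape of a public refusal-log entry (illustrative only).
type RefusalEntry = {
  request: string;       // the feature request, as received
  refusedOn: string;     // ISO date of the refusal
  reason: string;        // the calendar reason: what the next six weeks hold instead
  revisitAfter?: string; // optional date when the decision is re-examined
};

const refusalLog: RefusalEntry[] = [
  {
    request: "Slack integration",
    refusedOn: "2026-05-08",
    reason: "the next six weeks are committed to the autonomous-outcome flow",
    revisitAfter: "2026-06-19",
  },
];
```

The calendar reason is the load-bearing field: a refusal with a date attached is a decision; a refusal without one is a brush-off.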
Problems hit
Two problems worth naming.
Problem 1, distribution discipline. The single-founder temptation to post into every channel as the urge arises won out in week 1, and the output failed the first signal-to-noise check. Some days had three X threads; some days had none. LinkedIn was sporadic. Hacker News was opportunistic. The cumulative effect was lumpy distribution that did not produce a clean signal.
Problem 2, write-while-building gap. Building a capability took three days. Writing about it took four. That ratio is wrong; the writing should keep pace with the building, not lag it. If writing lags, the public-build discipline weakens, and the unit-economics enforcement weakens with it.
Week-1 metrics
The metrics dashboard is small and runs on a fixed calendar: every Sunday, four numbers.
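A rough sketch of the snapshot shape, in the same TypeScript as the registry. The field names are illustrative; three of them mirror the kill thresholds named later in this post, and the waitlist count as the fourth number is an assumption rather than a commitment.

```typescript
// Hypothetical Sunday snapshot. Three fields mirror the kill thresholds
// (cost-per-active-agent, ship time, distribution floor); the fourth is assumed.
type SundaySnapshot = {
  weekEnding: string;            // ISO date of the Sunday
  costPerActiveAgentUsd: number; // mirrors the cost-per-active-agent threshold
  capabilityShipDays: number;    // mirrors the three-week ship-time threshold
  distributionPosts: number;     // mirrors the distribution floor
  waitlistSignups: number;       // assumed fourth number; the waitlist count is public
};
```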
Changes for week 2
Three discrete changes for the next week.
- Fixed posting calendar. One X thread Tuesday, one LinkedIn long-form Wednesday, one Reddit or IndieHackers crosspost Thursday, one Hacker News Show post on a capability ship Friday. Channels have schedules, not vibes.
- Write-while-building. The blog post for a capability gets started on day one of the build, not on the day after the build ends. The ratio target: build and write each take three days, in parallel.
- Capability ship time tracked explicitly. Every capability gets a stopwatch from start to ship. The threshold from the playbook is three weeks; if a capability looks like it will exceed three weeks, it gets killed. Tracking explicitly makes the threshold visible early; a sketch of the clock follows this list.
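The clock is a few lines. A minimal sketch, assuming a 21-day threshold and helper names made up for illustration:

```typescript
// Hypothetical ship clock (illustrative only).
const SHIP_THRESHOLD_DAYS = 21; // the three-week kill threshold from the playbook

function daysSince(startedAt: Date, now: Date = new Date()): number {
  return (now.getTime() - startedAt.getTime()) / (1000 * 60 * 60 * 24);
}

function shouldKill(startedAt: Date, estimatedDaysRemaining: number): boolean {
  // Kill early: if the projected total exceeds the threshold, stop now
  // rather than discovering the overrun on day 21.
  return daysSince(startedAt) + estimatedDaysRemaining > SHIP_THRESHOLD_DAYS;
}

// Example: 10 days in, 14 days of work left -> projected 24 days -> kill.
const startedTenDaysAgo = new Date(Date.now() - 10 * 24 * 60 * 60 * 1000);
console.log(shouldKill(startedTenDaysAgo, 14)); // true
```

The point is the projection: the threshold fires on the estimate, not on the calendar, which is what makes the overrun visible early.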
What is staying the same: the weekly cadence, the kill thresholds, the framework, the refusal log, the bootstrap discipline. The staying-the-same list is longer than the changing list, which is the right ratio for a single founder. Stability of process is what lets a one-person operation compound across months instead of restarting weekly.
Next week's retro lands the following Sunday. The format will be identical so the comparison is honest. If you are building bootstrapped right now, my email is at the top of /contact.
Frequently asked questions
What did week 1 of Gravity look like?
One capability shipped end-to-end, the marketing site live with 20 published posts, the test methodology in place, the waitlist count visible. Distribution started on X, LinkedIn, and Hacker News. Tools settled into Cloudflare Workers, Stripe, and an internal capability registry. The week was deliberately bounded: ship one thing, write the retro, decide what changes.
What was the biggest week-1 problem?
Distribution discipline. The single-founder temptation to post into every channel without a cadence won out, and the output failed the first signal-to-noise check. The fix for week 2 is a fixed posting calendar: one X thread, one LinkedIn long-form, one Reddit/IH crosspost, one Hacker News Show post on a capability ship. Channels have schedules, not vibes. The discipline change is small; the compound effect over a quarter is large.
What surprised you?
How much faster shipping a single capability is than I expected, and how much slower writing about it is. The build was three days; the writing was four. For build-in-public to work as a unit-economics enforcement tool, the writing has to keep pace. The week-2 change is to write while building, not after.
What is staying the same in week 2?
The cadence: one capability, end-to-end, with a unit-economics check before the next. The kill thresholds: cost-per-active-agent, capability ship time, distribution floor. The framework: 10x value, scaling potential, sustainable margins. The staying-the-same list is longer than the changing list, which is the right ratio for a single founder.
Why publish a weekly retro?
Public commitment is unit-economics enforcement. The discipline of writing the retro forces honesty about what shipped, what stuck, what failed. The discipline of publishing it removes the rationalisation surface. Build-in-public is not a marketing tool first; it is a discipline tool first. The marketing benefit is downstream of the discipline benefit.
Three takeaways before you close this tab
- Cadence is the point. The same retro structure every week makes the data comparable.
- Channels have schedules, not vibes. Distribution discipline is week 2's biggest fix.
- Write while building. The post for the capability starts on day one of the build, not after.