On 2026-06-12 we ran the counting scripts against this blog: 355 live posts, 704,255 words, 39 days since the first post went up on 2026-05-05. That is 9.1 posts per day at an average of 1,984 words each, written, fact-checked, scheduled, and deployed by an autonomous pipeline that starts from a scheduled task on a laptop in Bangalore. API spend is roughly five to seven dollars a day.

One note before anything else: every number in this post was measured on 2026-06-12, and the corpus grows by about ten posts a day, so the number in the title is already stale by the time you read this. We can rerun the scripts any day and get a bigger total. The shape of the system is what stays constant, and the shape is what this post is about. That includes the parts that broke, which turned out to be the most useful parts to write down.

The numbers, measured 2026-06-12

Everything below comes from scripts run against the live site and the git history on 2026-06-12. Nothing here is an estimate except where marked, and each line can be regenerated by counting files, words, and links in the repository.

| Metric | Value |
| --- | --- |
| Live posts | 355 |
| Publishing window | 2026-05-05 to 2026-06-12 (39 days) |
| Average cadence | 9.1 posts per day |
| Total words | 704,255 |
| Average post length | 1,984 words |
| Internal blog-to-blog links | 2,689 |
| Posts with FAQPage schema | 355 of 355 |
| Busiest days | 20 posts (2026-05-08, 05-09, 05-10) |
| Cover images at time of measure | 336 of 355 (the missing 19 were backfilled the same day) |
| Running cost | roughly five to seven dollars a day in API spend |
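The derived figures follow from the raw counts. As a quick check, here is a small Python sketch that recomputes them; it assumes the 39-day window is counted inclusively, which is the reading that matches the published 9.1 average.

```python
from datetime import date

first_post = date(2026, 5, 5)
measured_on = date(2026, 6, 12)

# Inclusive day count: both the first and the last day are publishing days.
window_days = (measured_on - first_post).days + 1

live_posts = 355
total_words = 704_255

posts_per_day = round(live_posts / window_days, 1)
words_per_post = round(total_words / live_posts)

print(window_days, posts_per_day, words_per_post)  # 39 9.1 1984
```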

For scale: 704,255 words is about seven novels. We are a tiny pre-launch team building an AI agent platform, and there was no version of this output that involved humans typing. The honest framing is that the blog is itself an agent deployment: a long-running, unattended system doing real work with real failure modes. We treat it with the same engineering seriousness as anything else we ship.

Figure: Posts published per day, 2026-05-05 to 2026-06-12 (approximate sketch). Verified points: first post May 5, 20-post days May 8-10, 355 total on Jun 12. Other days drawn at the 9.1 average.
An honest sketch rather than a precise chart: the verified facts are the start date, the three 20-post burst days, and the 355 total. Every other bar is drawn at the 9.1 daily average because that is the number we can defend.

How does the pipeline actually work?

The pipeline is six stages. None of them is exotic on its own. The whole thing is a markdown file, one scheduled task, a handful of Node scripts, a GitHub Actions workflow, and static hosting. The interesting engineering is in the seams between stages, because that is where everything failed.

Figure: The autonomous content pipeline, end to end. As running on 2026-06-12. Each stage exists in a script or workflow file in the site repository.

  1. Editorial calendar. A markdown file, the single source of truth; the agent extends it when it runs short.
  2. Daily batch, 15:00 IST. Task Scheduler wakes a headless agent session; it writes the next ten posts.
  3. Per-post QA chain. factcheck, analyze (score 80 or higher, up to 2 auto-rewrites), seo-check, schema validation.
  4. 3-day draft buffer. Posts wait in a drafts directory with a future datePublished; the laptop can sleep.
  5. GitHub Actions cron. A daily workflow promotes every draft whose date has arrived into the live blog.
  6. Deploy and notify. Listing, RSS, and sitemap regenerate; Cloudflare deploy; IndexNow ping to search engines.
Six stages, no exotic infrastructure: a markdown calendar, one scheduled task, a QA chain, a dated drafts folder, a cron, and a static deploy.

Stage 1: the calendar is the only brain

The editorial calendar is a markdown file in the repository. It holds the post list: slugs, titles, pillars, target dates. Nothing else in the system decides what gets written. The batch run reads the calendar, finds the next unwritten entries, and writes those. When the calendar runs short, the agent extends it following the documented pillar strategy, so the system never starves itself of work.

This sounds trivial. It is the single most important design decision in the pipeline. Because the calendar is a file under version control, every editorial decision has a diff, a timestamp, and an author. When output looked wrong, we could always answer "what was it told to write?" by reading one file.
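To make "finds the next unwritten entries" concrete, here is a minimal Python sketch. The calendar layout (a four-column markdown table of slug, title, pillar, date) and the names `Entry`, `parse_calendar`, and `next_unwritten` are assumptions for illustration; the post does not specify the real file format, and the production scripts are Node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    slug: str
    title: str
    pillar: str
    target_date: str  # ISO date string, e.g. "2026-06-13"


def parse_calendar(markdown: str) -> list[Entry]:
    """Read rows of a hypothetical `| slug | title | pillar | date |` table."""
    entries = []
    for line in markdown.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip headers, separator rules, blank lines, and prose.
        if len(cells) != 4 or cells[0] in ("", "slug") or set(cells[0]) <= {"-", ":"}:
            continue
        entries.append(Entry(*cells))
    return entries


def next_unwritten(markdown: str, written: set[str], batch_size: int = 10) -> list[Entry]:
    """Return the earliest-dated entries that have no post yet."""
    pending = [e for e in parse_calendar(markdown) if e.slug not in written]
    # ISO dates sort correctly as plain strings.
    return sorted(pending, key=lambda e: e.target_date)[:batch_size]
```

Because the function's only inputs are the file and the set of existing slugs, rerunning it after a crash yields the same batch, which keeps "what was it told to write?" answerable from the calendar alone.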

Stages 2 and 3: a headless agent with mechanical gates

At 15:00 IST, Windows Task Scheduler starts a headless Claude Code session. No chat window, no human watching. The session writes the next ten posts into a drafts directory, each one wrapped in the standard post template with full schema markup. Then each post runs a QA chain: a fact-check pass that requires every numeric claim to carry a source, a quality analyzer that scores the post and triggers up to two automatic rewrites if the score is below 80, an SEO check, and schema validation.

The gates are mechanical on purpose. We do not ask the agent to "make sure it is good." We ask scripts to produce numbers, and the numbers decide. This is the same philosophy as the 80-plus tests we run per agent capability: non-deterministic systems do not get judged on a single run or a vibe; they get judged on measured pass rates against fixed criteria. A post that cannot clear the gates after two rewrites does not ship that day.
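The score-then-rewrite loop can be sketched in a few lines of Python. The threshold of 80 and the cap of two rewrites come from the description above; `score` and `rewrite` are hypothetical callables standing in for the quality analyzer and the agent's rewrite step.

```python
from typing import Callable

PASS_SCORE = 80    # a post ships only at this score or higher
MAX_REWRITES = 2   # automatic rewrites allowed before the post is held back


def gate(
    draft: str,
    score: Callable[[str], int],
    rewrite: Callable[[str], str],
) -> tuple[bool, str, int]:
    """Return (passed, final_draft, rewrites_used)."""
    rewrites = 0
    while True:
        if score(draft) >= PASS_SCORE:
            return True, draft, rewrites
        if rewrites == MAX_REWRITES:
            # Out of attempts: fail loudly instead of shipping a weak post.
            return False, draft, rewrites
        draft = rewrite(draft)
        rewrites += 1
```

The cap bounds cost as well as quality: a post is scored at most three times and regenerated at most twice.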

Stages 4 and 5: the buffer that decouples writing from publishing

Posts are not published when they are written. They land in a drafts directory with a datePublished up to three days in the future, forming a rolling buffer. A GitHub Actions cron runs daily, executes the build script, and promotes any draft whose date has arrived into the live blog directory. The listing page, RSS feed, and sitemap regenerate in the same step.

The buffer is the reliability trick. The laptop that writes posts and the cloud cron that publishes them are independent failure domains. The laptop can sleep, hang, or lose power for a day or two and the blog keeps publishing on schedule from the buffer. The cron can fail and the next day's run picks up everything it missed, because promotion is by date, not by event. Both halves are idempotent: running them twice produces the same result as running them once.
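Promotion by date is what makes catch-up and idempotency fall out for free. The sketch below models the drafts buffer and the live directory as in-memory collections so the behavior is easy to see; the real build script works on files, and `promote_due` is a hypothetical name.

```python
from datetime import date


def promote_due(drafts: dict[str, date], live: set[str], today: date) -> list[str]:
    """Promote every draft whose datePublished has arrived and is not yet live."""
    promoted = sorted(
        slug
        for slug, publish_on in drafts.items()
        if publish_on <= today and slug not in live
    )
    live.update(promoted)
    return promoted


drafts = {
    "post-a": date(2026, 6, 11),  # was due yesterday, but the cron failed
    "post-b": date(2026, 6, 12),  # due today
    "post-c": date(2026, 6, 14),  # still in the buffer
}
live: set[str] = set()

first_run = promote_due(drafts, live, date(2026, 6, 12))   # catches up on post-a
second_run = promote_due(drafts, live, date(2026, 6, 12))  # nothing left to do
print(first_run, second_run)
```

The comparison is `publish_on <= today`, not `== today`, so a missed day is picked up by the next run without any record of what failed.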

Stage 6: deploy and tell the search engines

The deploy is static files on Cloudflare Workers. After each deploy, an IndexNow ping tells participating search engines which URLs changed, instead of waiting for a recrawl. The whole publish path, from cron trigger to indexed-and-pinged, involves no human and no manual step.
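For reference, an IndexNow submission is a single JSON POST. The Python sketch below builds the request without sending it; the field names follow the public IndexNow protocol as we understand it, while the host, key, and key-file location are placeholders rather than this site's real values.

```python
import json
from urllib.request import Request

INDEXNOW_ENDPOINT = "https://api.indexnow.org/indexnow"


def build_indexnow_request(host: str, key: str, urls: list[str]) -> Request:
    """Build (but do not send) a bulk URL submission."""
    body = {
        "host": host,
        "key": key,
        # Assumes the key file is served from the site root.
        "keyLocation": f"https://{host}/{key}.txt",
        "urlList": urls,
    }
    return Request(
        INDEXNOW_ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )
```

Sending it is one call to `urllib.request.urlopen(request)`; keeping construction separate makes the payload testable without network access.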

What broke along the way?

This is the section the post exists for. The architecture above looks clean because every failure below got turned into a guardrail. We are listing them in the order they hurt us, with the specific fix each one produced. If you build something like this, you will hit most of these.

Two runs committed but never pushed, and we deployed stale code

Two early batch runs finished their work, committed twenty posts locally, and never pushed. No error surfaced anywhere; the headless session simply ended. The next morning the GitHub Action deployed origin/main, which did not contain the posts. The site published nothing new while two days of finished work sat invisible in a local working tree. We only noticed because the blog listing looked thin.

The fix is push verification: after every batch, the run confirms that the remote branch actually contains the new commit, and a missing push is a loud failure instead of a silent one. The general lesson stuck with us: in an unattended system, "the step ran" and "the step's effect is visible downstream" are different assertions, and you have to test the second one.
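One way to express that second assertion is to ask the remote directly which commit the branch points at. This is a hedged Python sketch, not the script the pipeline runs: `verify_pushed` is a hypothetical name, and it uses the stricter check that the remote tip equals local HEAD, which is reasonable immediately after a push.

```python
import subprocess
from typing import Callable


def git(*args: str) -> str:
    """Run a git command and return its trimmed stdout."""
    result = subprocess.run(["git", *args], check=True, capture_output=True, text=True)
    return result.stdout.strip()


def verify_pushed(branch: str = "main", remote: str = "origin",
                  run: Callable[..., str] = git) -> None:
    """Raise unless the remote branch tip is the commit we just made."""
    local = run("rev-parse", "HEAD")
    # ls-remote queries the remote itself, so a stale local tracking ref cannot mislead us.
    listing = run("ls-remote", remote, f"refs/heads/{branch}")
    remote_sha = listing.split()[0] if listing else ""
    if local != remote_sha:
        raise RuntimeError(
            f"push not verified: local {local[:7]} != "
            f"{remote}/{branch} {remote_sha[:7] or 'missing'}"
        )
```

The `run` parameter exists so the check can be exercised with a fake in tests; in a batch run it defaults to the real `git` binary.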

Placeholder markers shipped image-less posts

Early posts contained bracketed image placeholder markers in the body, on the assumption that a later step would replace them with real figures. A post-processing step stripped the markers instead. The posts went live looking exactly like what they were: articles where the images had been deleted and nobody checked. At the 2026-06-12 measurement, 336 of 355 posts had cover images; the missing 19 were found by that same audit and backfilled the same day.

The fix has two parts. Markers are banned outright; the writing step is not allowed to emit them. And an image generation step produces a cover and one inline figure for every post, with a check that fails the batch when either file is missing. Image presence is now an asserted property, not a hoped-for one.
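Both parts reduce to a check that returns a list of problems, where an empty list means the post may proceed. In this Python sketch the marker pattern and the file-naming scheme (`<slug>-cover.png`, `<slug>-figure.png`) are assumptions for illustration; the post does not state the real ones.

```python
import re

# Hypothetical marker shape, e.g. "[IMAGE: latency chart]".
MARKER = re.compile(r"\[(?:IMAGE|FIGURE)[^\]]*\]", re.IGNORECASE)


def check_post(slug: str, body: str, files: set[str]) -> list[str]:
    """Return every image-related problem found for one post."""
    problems = [f"{slug}: placeholder marker {m!r}" for m in MARKER.findall(body)]
    for name in (f"{slug}-cover.png", f"{slug}-figure.png"):
        if name not in files:
            problems.append(f"{slug}: missing {name}")
    return problems
```

A batch runner would collect the problems across all posts and exit non-zero if any list is non-empty, which is what turns image presence into an asserted property.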

A trailing-slash bug turned every /legal page into an infinite redirect

A worker change normalized URLs by redirecting to add a trailing slash. Another rule redirected the slashed version back. Every page under /legal entered a redirect loop until the browser gave up. We shipped this, and it sat live until we happened to click a legal link ourselves. Search crawlers met the same loop, and duplicate-URL noise from this class of bug took real effort to clean out of the index later.

The fix is post-deploy smoke tests: after every deploy, a script fetches a fixed list of representative URLs and asserts clean 200 responses with no redirect chains. It is the cheapest test we run and it has caught the most embarrassing class of bug. A deploy that breaks routing now reverts itself before anyone outside the team sees it.
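A smoke test of this kind is small. The Python sketch below treats any non-200 response, including a single redirect, as a failure; `smoke` and `http_fetch` are hypothetical names, and the fetcher is injectable so the logic can be tested without a network.

```python
from http.client import HTTPSConnection
from typing import Callable, Optional
from urllib.parse import urlsplit

Fetch = Callable[[str], tuple[int, Optional[str]]]


def http_fetch(url: str) -> tuple[int, Optional[str]]:
    """GET one URL without following redirects; return (status, Location header)."""
    parts = urlsplit(url)
    conn = HTTPSConnection(parts.netloc, timeout=10)
    try:
        conn.request("GET", parts.path or "/")
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def smoke(urls: list[str], fetch: Fetch = http_fetch) -> list[str]:
    """Return one failure line per URL that is not a clean 200."""
    failures = []
    for url in urls:
        status, location = fetch(url)
        if status != 200:
            target = f" -> {location}" if location else ""
            failures.append(f"{url}: {status}{target}")
    return failures
```

Listing both the slashed and unslashed form of a path in the URL list is enough to surface a trailing-slash loop: each one reports a redirect pointing at the other.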

The analytics beacon collected nothing for weeks

The Content-Security-Policy headers we set in the worker blocked our own real-user-monitoring beacon. Pages rendered perfectly, no user saw any error, and the analytics dashboard quietly showed almost nothing. For weeks we interpreted near-zero RUM data as "the blog is young, traffic is small" when the truth was "the measuring instrument is disconnected." We lost the early baseline entirely; that data is simply gone.

The fix was the CSP allowance for the beacon origin, plus a recurring check that analytics rows actually exist for recent days. The lesson generalizes to any autonomous system: silence is not a signal of health. Instrumentation needs its own instrumentation, even something as dumb as "alert if zero rows."
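The "alert if zero rows" check really is that simple. A Python sketch, assuming the analytics export can be reduced to a row count per day (the shape of `rows_by_day` and the three-day lookback are illustrative choices, not details from the pipeline):

```python
from datetime import date, timedelta


def silent_days(rows_by_day: dict[date, int], today: date, lookback: int = 3) -> list[date]:
    """Return recent complete days for which no analytics rows exist."""
    recent = [today - timedelta(days=n) for n in range(1, lookback + 1)]
    # A day missing from the export counts the same as a day with zero rows.
    return [d for d in sorted(recent) if rows_by_day.get(d, 0) == 0]
```

Any non-empty result should fail the run. Today is excluded because a partial day can legitimately be empty early on.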

Orphan posts that nothing linked to

The pipeline wrote posts that linked out to other posts, but nothing guaranteed that any given post received links. A crawl found a set of orphans: live, index-eligible pages with zero internal links pointing at them, invisible both to crawlers following links and to readers browsing hubs. For a system publishing ten pages a day, orphan creation compounds fast.

The fix is orphan adoption as a scheduled job: hub pages and related posts get backlinks added to any post that lacks inbound internal links. The corpus now carries 2,689 blog-to-blog links, and link reception, like image presence, is an asserted property checked by a script rather than an outcome we assume.
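Finding orphans is a set difference over the internal link graph. A minimal Python sketch, assuming the crawl has already produced a mapping from each post to the posts it links to (`find_orphans` is a hypothetical name):

```python
def find_orphans(links: dict[str, set[str]]) -> list[str]:
    """Return posts that receive no inbound link from any other post."""
    linked_to = {
        target
        for source, targets in links.items()
        for target in targets
        if target != source  # a self-link does not count as being linked to
    }
    return sorted(slug for slug in links if slug not in linked_to)


links = {
    "pricing-model": {"agent-testing"},
    "agent-testing": {"pricing-model"},
    "new-post": {"pricing-model", "new-post"},  # links out, but nothing links in
}
print(find_orphans(links))  # ['new-post']
```

The adoption job then only needs to pick hub pages or related posts for each returned slug and add a backlink.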

The CI gate that hung

The deploy workflow includes a test gate. One run hung without a timeout configured, so the job just sat there, and the day's promotion never executed until we noticed and re-ran it by hand. Nothing was wrong with the content, the buffer, or the build; the pipeline was simply wedged on a step that would never finish on its own.

The fix is boring and absolute: explicit timeouts on every workflow step, so a hung step becomes a failed step, and a failed daily run self-heals because the next day's cron promotes everything the previous one missed. Wedged is the worst state for an unattended system, worse than failed, because failed states get retried and wedged states get discovered.
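In GitHub Actions the relevant setting is `timeout-minutes`, which can be set on a job or a step. The same principle applies to any script that shells out: give every child process a deadline so a hang surfaces as a failure. A Python sketch, with `run_step` as a hypothetical helper:

```python
import subprocess
import sys


def run_step(cmd: list[str], timeout_s: float) -> bool:
    """Run one pipeline step; a hang or a non-zero exit both count as failure."""
    try:
        subprocess.run(cmd, check=True, timeout=timeout_s)
        return True
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing stays wedged.
        print(f"step timed out after {timeout_s}s: {cmd}", file=sys.stderr)
        return False
    except subprocess.CalledProcessError as err:
        print(f"step failed with exit code {err.returncode}: {cmd}", file=sys.stderr)
        return False
```

For example, `run_step([sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=0.5)` returns `False` after half a second instead of blocking for thirty.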

What does it cost to run?

Roughly five to seven dollars a day in API spend, at a cadence of ten posts a day with the full QA chain. The QA chain is most of the bill: a post that needs two rewrites costs meaningfully more than one that passes first time, because each rewrite is another full generation plus another scoring pass. The infrastructure around the model calls rounds to zero: static hosting on Cloudflare Workers, a GitHub Actions cron well inside the free tier, and a scheduler that is literally a laptop we already owned.
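Dividing the daily spend by the daily cadence gives a per-post figure. This is derived arithmetic from the two numbers above, not a separately measured cost:

```python
POSTS_PER_DAY = 10
DAILY_SPEND_USD = (5.0, 7.0)  # the rough low and high ends quoted above

low, high = (spend / POSTS_PER_DAY for spend in DAILY_SPEND_USD)
print(f"${low:.2f} to ${high:.2f} per post")  # $0.50 to $0.70 per post
```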

We like that the cost scales with work performed and nothing else. No retainer, no seat licenses, no idle spend on days the pipeline writes nothing. That is the same economic shape we chose for Gravity itself, pay per use with $1 buying 1,000 credits, for reasons we laid out in why we chose credits over subscriptions: when a system runs autonomously, billing tied to actual runs is the only model that stays honest as usage moves around.

Steal this pattern

None of this depends on our stack. The pattern generalizes to any team that wants an unattended content system, and every piece can be swapped for an equivalent. Here is the whole design in five rules.

  1. Make an editorial calendar the single source of truth. One version-controlled file that says what gets written, when, and against which strategy. The agent reads it, executes it, and extends it under documented rules. Specify outcomes, not procedures; we wrote about why that framing matters in describe the outcome, not the workflow. If a human wants to steer the system, they edit the file, not the agent.
  2. Run a headless agent on a plain cron. Any scheduler works: Task Scheduler, cron, a cloud workflow. The requirement is that the agent runs unattended end to end, never blocking on a question. If it lacks information, it uses documented defaults and leaves a note in the output rather than waiting for a human who is not there.
  3. Gate quality with mechanical checks and an auto-rewrite loop. Score every artifact against fixed criteria. Below threshold, rewrite automatically, at most N times, then fail loudly. Never ship on "it looked fine." The threshold and the rewrite cap matter more than the specific scoring tool, because they bound both quality and cost.
  4. Publish by date, not by event. Writing produces dated drafts into a buffer; an independent daily job promotes whatever is due. Both halves are idempotent, so either can fail for a day and the next run catches up with no missed or duplicated posts. The buffer is what turns two flaky components into one reliable system.
  5. Smoke-test after every deploy. Fetch a fixed list of URLs, assert clean responses and no redirect chains, confirm the feed and sitemap regenerated, and confirm telemetry is actually arriving. Every failure we listed above is now caught by a check in this stage or the batch stage. When something new breaks, the fix is a new mechanical check, never a note to remember to look.

The meta-rule underneath all five: an autonomous pipeline is not the clever prompt, it is the scar tissue. Our writing step worked on day one. Nearly all the engineering that followed was making "it ran" imply "it worked," and that part accumulated one failure at a time.

Does a human still touch it?

Yes, and deliberately so, just not where you might expect. No human writes, edits, schedules, or deploys the daily posts. The human work is entirely upstream and downstream: setting the pillar strategy the calendar follows, writing briefs for flagship pieces like this one, deciding what the blog is for, and doing distribution, which stays founder-led and off-pipeline.

In our experience the role change is the real story. The job stopped being "write posts" and became "edit the system that writes posts." When output disappointed us, the fix was never a better paragraph; it was a better calendar entry, a tighter gate, or a new check. That matches what we see across the industry in the state of AI agents in mid-2026: the teams getting real work from agents are the ones engineering the harness, not the ones polishing individual outputs. It is also, frankly, the only way a pre-launch team this small ships 355 posts while building a product, a constraint we wrote about in bootstrapping an AI agent platform in 2026.

Frequently asked questions

What does an autonomous content pipeline cost to run?

Roughly five to seven dollars a day in API spend at a cadence of ten posts a day, measured across May and June 2026. Hosting is static files on Cloudflare Workers, the scheduler is a laptop running Windows Task Scheduler, and promotion runs on GitHub Actions, none of which adds anything meaningful to that figure.

Does Google penalize AI-generated content?

Google's published guidance targets low-quality content regardless of how it is produced, not AI authorship itself. Our answer is mechanical quality gates: every post passes a fact-check pass for sourced claims, a quality analyzer with a minimum score of 80 and up to two automatic rewrites, an SEO check, and schema validation before it can ship.

What tools does the pipeline use?

A markdown editorial calendar, Claude Code running headless under Windows Task Scheduler for writing and QA, Node scripts that rebuild the listing, RSS feed, and sitemap, GitHub Actions for the daily promotion cron, Cloudflare Workers for static hosting, and IndexNow to ping search engines after each deploy.

How long does it take to build a pipeline like this?

The core loop (a calendar, a scheduled headless agent run, and a date-gated build script) is a few days of work. Most of the real build time came afterwards: each production failure became a new guardrail, and the guardrails (push verification, smoke tests, image enforcement) accumulated over weeks of daily runs.

Does an autonomous content pipeline still need a human?

Yes. The pipeline writes, checks, schedules, and deploys without input, but a human still owns strategy: the editorial calendar's direction, briefs for flagship posts, distribution, and the decision of what the blog is for. In our experience the human work shifts from writing posts to editing the system that writes them.
