I let a Claude agent run rpsg.co.id. Here's the setup.
How I handed my 6-month organic search plan to a Claude agent. The architecture, the autopilot crons, the 4-layer fact-check, what I still own.
Most "AI runs my SEO" demos stop at "ChatGPT writes my blog post." This experiment goes further. I handed Claude my entire 6-month organic search plan for rpsg.co.id and asked it to manage execution end-to-end. Here is what it built, what surprised me, and what I still keep my hands on.
TL;DR
- An agent now manages the rpsg.co.id 6-month plan through a Notion command center, six scheduled crons, and an end-to-end weekly publish pipeline.
- Technical SEO runs through a real Screaming Frog crawler on the 22nd of each month. Not chatbot analysis of a sitemap.
- Every post passes a 4-layer anti-hallucination check before publish. A missed Wednesday is better than a fabricated post.
- I kept hands on outreach approval and code merges. Everything else is autopilot.
- The split ended up roughly 95/5 in the agent's favour. The bottleneck was workflow definition, not the agent.
The starting point
This site is a brand-new domain: first posts in May 2026, near-zero authority, no referring domains worth counting. The 6-month plan calls for foundation in M1, topic clusters in M2, first links in M3, commercial intent and local in M4, an AI citation push in M5, and compounding and systematising in M6. One quality post per week. The plan itself lives in a Google Doc.
My setup is unremarkable: an Astro site on Firebase Hosting, a Sanity studio I rarely log into, a Notion workspace already crowded with other projects, and a GitHub Action that publishes posts via the Anthropic API.
What I wanted to know: how much of the plan can an agent execute without me hand-holding every step?
Step 1, a command center in Notion
The agent's first move was to set up its own state. Six databases parented under my existing SEO hub:
- A 6-month plan tracker with one row per workstream per month
- A content queue with priority, cluster, and scheduled-for dates
- A 20-query AI citation scoreboard (ChatGPT, Perplexity, Gemini, Google AI Overviews)
- A link outreach CRM
- A monthly KPI snapshot
- A glossary backlog
Plus a single-row "agent lock" page used as an advisory lock against concurrent writes.
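A minimal sketch of how an advisory lock like this can behave, assuming the lock page stores a holder name and an expiry time. `LockStore` is an in-memory stand-in for the Notion page. The field names, the 15-minute TTL, and the function names are illustrative assumptions, not what the agent actually wrote.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockStore:
    """In-memory stand-in for the single-row lock page (hypothetical fields)."""
    holder: Optional[str] = None
    expires_at: float = 0.0


def acquire(store: LockStore, who: str, ttl_s: float = 900.0,
            now: Optional[float] = None) -> bool:
    """Take the lock if it is free, expired, or already held by `who`."""
    now = time.time() if now is None else now
    if store.holder not in (None, who) and store.expires_at > now:
        return False  # another writer holds a live lock: back off
    store.holder = who
    store.expires_at = now + ttl_s
    return True


def release(store: LockStore, who: str) -> None:
    """Release only if `who` is the current holder."""
    if store.holder == who:
        store.holder = None
        store.expires_at = 0.0
```

The expiry is what keeps a crashed run from blocking every later one. The lock stays advisory: it only works if every writer checks it first, since a page in a workspace cannot enforce anything by itself.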
Forty-eight plan-tracker rows seeded. Ten priority content ideas in the queue. Twenty-five candidate queries proposed, of which I picked the final 20. That took about ten minutes of my time.
Step 2, subagents and slash commands
Claude did not get one giant prompt. It split itself into six subagents, each with a focused job:
- Orchestrator owns Notion reads and writes, acquires the advisory lock, dispatches the specialists
- Content writer drafts posts from briefs
- Technical auditor runs the technical SEO baseline
- Citation auditor runs the monthly AI citation scoreboard
- Outreach drafter writes pitches and queues them for my approval
- KPI reporter assembles the monthly snapshot
Fourteen slash commands sit on top of those agents: status, weekly-prep, citation-audit, outreach-batch, monthly-report, and so on.
Three skills carry the voice rules: forbidden phrases, the case-study allowlist, the structural rules every post must follow. The same rules apply to blog posts, pitch emails, and anything else written in my name.
Step 3, technical SEO that actually crawls
This is the part most "AI SEO" tools skip. The agent connects to a real Screaming Frog crawler through an MCP integration. Every month on the 22nd, it crawls rpsg.co.id and reports back on:
- Indexation status across the sitemap
- Core Web Vitals on mobile (the priority for the en_ID market)
- Broken internal and external links
- Orphan pages with zero internal inbound links
- Schema sanity per page (Person, Organization, Service, Article, FAQ)
- AI crawler access in robots.txt (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Applebot, CCBot)
When an orphan shows up, the agent files a refresh item in the content queue automatically.
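The orphan check itself reduces to a set difference: pages in the sitemap minus pages that receive at least one internal link. A sketch, assuming the crawl output has already been reduced to a plain mapping of source URL to linked URLs. That input shape is hypothetical and is not Screaming Frog's export format. Exempting the home page is also an assumption.

```python
from typing import Dict, Iterable, List, Set


def find_orphans(sitemap_urls: Iterable[str],
                 internal_links: Dict[str, List[str]],
                 home: str) -> List[str]:
    """Return sitemap URLs that no other page links to.

    internal_links maps a source URL to the URLs it links to. Self-links are
    ignored, and the home page is exempt because it is the crawl entry point.
    """
    linked: Set[str] = set()
    for source, targets in internal_links.items():
        linked.update(t for t in targets if t != source)
    return sorted(u for u in sitemap_urls if u != home and u not in linked)
```

Each URL returned here would become one refresh item in the content queue.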
Asking a chatbot to "analyze my sitemap" is not the same as a real crawl. The difference shows up when the site has an actual broken link and nothing in the loop ever fetched the page to catch it.
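One item from the crawl report, AI crawler access in robots.txt, is easy to check deterministically. A sketch using Python's standard-library `urllib.robotparser`. The function name is made up for illustration, and the caller is assumed to have fetched the robots.txt text already.

```python
from typing import Dict
from urllib.robotparser import RobotFileParser

# The six crawlers named in the audit list above.
AI_CRAWLERS = ["GPTBot", "ClaudeBot", "PerplexityBot",
               "Google-Extended", "Applebot", "CCBot"]


def ai_crawler_access(robots_txt: str, url: str = "/") -> Dict[str, bool]:
    """Map each AI crawler to whether robots.txt lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in AI_CRAWLERS}
```

For example, a robots.txt that disallows only `CCBot` would come back with `CCBot` set to `False` and the other five set to `True`. Any `False` is worth a line in the monthly report.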
Step 4, autopilot crons
Six scheduled tasks run unattended in Jakarta time:
- Monday 08:00. Status email to me with the week ahead and any pending approvals.
- Friday 12:00. Promote the next queued content item to "scheduled" for the following Wednesday and write the brief.
- 1st of month. Automated AI citation audit on Google AI Mode via SearchAPI.
- 15th. Draft five outreach pitches and queue them in Notion as Pending Approval (no auto-send).
- 22nd. Screaming Frog crawl and internal link audit.
- 28th. Monthly KPI report from Search Console, Semrush, and Notion rollups, drafted as an email to me.
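The schedule above can be sketched as one function that answers "what is due right now?" in Jakarta time. This is an illustration, not the actual cron configuration. Mapping each cron to a slash-command name is an assumption, `crawl-audit` is a made-up name, and the four monthly tasks match on the date only because their run times are not stated.

```python
from datetime import datetime, timedelta, timezone
from typing import List

JAKARTA = timezone(timedelta(hours=7))  # WIB is UTC+7 with no daylight saving

# Day of month -> task. "crawl-audit" is a hypothetical name.
MONTHLY = {1: "citation-audit", 15: "outreach-batch",
           22: "crawl-audit", 28: "monthly-report"}


def tasks_due(now: datetime) -> List[str]:
    """Return the tasks due at `now` (any timezone-aware datetime)."""
    local = now.astimezone(JAKARTA)
    due = []
    if local.weekday() == 0 and local.hour == 8:   # Monday 08:00
        due.append("status")
    if local.weekday() == 4 and local.hour == 12:  # Friday 12:00
        due.append("weekly-prep")
    if local.day in MONTHLY:
        due.append(MONTHLY[local.day])
    return due


def next_publish_date(friday: datetime) -> datetime:
    """The Wednesday after a Friday promotion is five days later."""
    return friday + timedelta(days=5)
```

A caller polling this more than once a day would need to remember which monthly tasks it has already run. A fixed UTC+7 offset is enough here because Jakarta does not observe daylight saving.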
The Wednesday publish itself runs as a GitHub Action. It pulls the next scheduled content queue item, hands the brief to Claude, runs four layers of validation, publishes to Sanity, builds, and deploys to Firebase Hosting. End to end without me in the loop.
Step 5, anti-hallucination guardrails
This is the layer I would not ship without. Four checks run before any post is published:
- Vague-citation phrase blocklist. A regex pass over every sentence catches authority-by-vibes patterns: appeals to unnamed studies, unnamed experts, unnamed industry surveys, and generic-population percentages. Any of those triggers a fail unless the same sentence has an inline link to a primary source.
- Unknown-metric detection. Any numeric claim about an entity fails the post unless the entity is on the case-study allowlist, the claim is framed as first-person experience, or the claim is inline-linked.
- External link reachability. Every external link is fetched with a 10-second timeout. 4xx or 5xx fails the post.
- Adversarial fact-check. A second Claude call with web search categorises every claim as supported, first-person-acceptable, allowlist-match, common-knowledge, or unsupported. Any unsupported claim kills the post.
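The first layer can be sketched as a sentence-level regex pass. The patterns below are illustrative only, since the real blocklist is not reproduced in this post. The sketch also assumes drafts are Markdown, so an inline link looks like `[text](https://...)`.

```python
import re
from typing import List

# Hypothetical patterns, one per category named above.
VAGUE_PATTERNS = [
    r"\b(?:studies|research) (?:show|shows|suggest|suggests)\b",
    r"\bexperts (?:say|agree|believe)\b",
    r"\bindustry surveys?\b",
    r"\b\d{1,3}% of (?:people|users|companies|marketers)\b",
]
VAGUE_RE = re.compile("|".join(VAGUE_PATTERNS), re.IGNORECASE)
INLINE_LINK_RE = re.compile(r"\[[^\]]+\]\(https?://[^)\s]+\)")  # Markdown link


def vague_citations(text: str) -> List[str]:
    """Return sentences that match a vague pattern and carry no inline link."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    return [s for s in sentences
            if VAGUE_RE.search(s) and not INLINE_LINK_RE.search(s)]
```

A regex can only confirm that a link is present in the sentence. Whether it points to a primary source is left to the adversarial fact-check.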
If any layer fails, the script exits non-zero. Sanity is not written. The week's slot stays empty. A missed Wednesday beats a fabricated post.
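The fail-closed behaviour is simple to express. A sketch, assuming each layer is a function that takes the draft and returns a list of failure messages. The names `run_pipeline`, `Layer`, and `publish` are made up for illustration.

```python
from typing import Callable, List, Sequence, Tuple

# (layer name, check returning failure messages; an empty list means pass)
Layer = Tuple[str, Callable[[str], List[str]]]


def run_pipeline(draft: str, layers: Sequence[Layer],
                 publish: Callable[[str], None]) -> int:
    """Run each layer in order; publish only if none of them reports a failure."""
    for name, check in layers:
        failures = check(draft)
        if failures:
            for message in failures:
                print(f"[{name}] {message}")
            return 1  # non-zero exit: nothing is written, the slot stays empty
    publish(draft)
    return 0
```

In a script this would end with `sys.exit(run_pipeline(...))`, so the CI job fails visibly. The publish call sits after the loop, which means there is no code path that writes a post without every layer passing.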
This guardrail is stricter than most human editors I have worked with. That is intentional. The whole point of rpsg.co.id is to be a site AI engines can trust enough to cite. A post built on invented statistics breaks that trust on day one.
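The link reachability layer is the most mechanical of the four. A sketch with the standard library, using the 10-second timeout and the 4xx/5xx rule from the list above. Treating timeouts and DNS errors as failures is an assumption, and the `fetch` parameter exists only so the check can run without network access.

```python
import urllib.error
import urllib.request
from typing import Callable, Iterable, List


def fetch_status(url: str, timeout: float) -> int:
    """GET the URL and return its HTTP status code."""
    request = urllib.request.Request(url, headers={"User-Agent": "link-check-sketch"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # urllib raises on 4xx/5xx; recover the status code


def broken_links(urls: Iterable[str],
                 fetch: Callable[[str, float], int] = fetch_status,
                 timeout: float = 10.0) -> List[str]:
    """Return one failure message per unreachable or 4xx/5xx link."""
    failures = []
    for url in urls:
        try:
            status = fetch(url, timeout)
        except OSError as exc:  # URLError and socket timeouts are OSError subclasses
            failures.append(f"{url}: unreachable ({exc})")
            continue
        if status >= 400:
            failures.append(f"{url}: HTTP {status}")
    return failures
```

An empty list means the layer passes. Some servers reject unfamiliar user agents, so a production version would likely need a retry or a manual override.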
What I still keep my hands on
Two things stay manual.
Outreach approval. When the agent drafts a guest post pitch, a podcast ask, or a directory submission, it writes the row to Notion as Pending Approval and creates a Gmail draft to me with an "rpsg-approval" subject prefix. I review, edit if needed, and approve via a slash command that re-addresses the draft to the actual prospect and sends. The agent never auto-sends to a third party in my name.
Code edits. Any change to the Astro source (schema updates, robots.txt tweaks, the site config) goes through a PR on a branch prefixed chore/rpsg-agent. I review the diff, run checks, merge.
Everything else is autopilot.
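The outreach gate described above amounts to a small state machine in which sending is impossible from the Pending Approval state. A sketch follows. The "Approved" and "Sent" states, the `Pitch` fields, and the function names are assumptions, and `deliver` stands in for whatever actually sends the email.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable


class Status(Enum):
    PENDING_APPROVAL = "Pending Approval"
    APPROVED = "Approved"  # hypothetical state name
    SENT = "Sent"          # hypothetical state name


@dataclass
class Pitch:
    prospect_email: str
    body: str
    status: Status = Status.PENDING_APPROVAL


def approve(pitch: Pitch, approver: str, owner: str) -> None:
    """Only the site owner can move a pitch out of Pending Approval."""
    if approver != owner:
        raise PermissionError("only the owner can approve outreach")
    pitch.status = Status.APPROVED


def send(pitch: Pitch, deliver: Callable[[str, str], None]) -> None:
    """Refuse to deliver anything that has not been approved."""
    if pitch.status is not Status.APPROVED:
        raise RuntimeError(f"cannot send a pitch in state {pitch.status.value!r}")
    deliver(pitch.prospect_email, pitch.body)
    pitch.status = Status.SENT
```

Putting the check inside `send` rather than in the caller means a misbehaving agent cannot skip it by calling the delivery step directly.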
What surprised me
I expected to do roughly half the work. The actual split was closer to 95/5 in the agent's favour. I answered a handful of clarifying questions early ("How autonomous?" "Where does state live?" "Cron or on-demand?"), approved the initial twenty queries, and verified rpsg.co.id in Google Search Console and Bing Webmaster Tools. The agent did the rest itself.
The biggest surprise was the guardrail layer. Working with a human team, I would not have written that on day one. I would have shipped posts, watched a stat slip through, and added validation later. The agent built it in upfront because I told it to make hallucination impossible.
What this is and is not
This is not a demo. It is a system that runs unattended, with a 6-month plan and a published scoreboard. Every Wednesday a new post lands. Every month an AI citation audit runs against twenty queries. On the 28th of every month a KPI report drafts itself.
The next six months are trackable in public on this site.
If you are weighing whether to build something similar for your own search practice or your team's, what holds most engagements back is not the agent. It is the workflow definition. The agent only goes as far as the plan you can describe to it.
That is also the most useful outcome of an advisory engagement: turning a vague six-month wish list into a structured plan an agent or an in-house team can actually execute.