All You Need To Know About Cloudflare’s Agent Readiness Score via @sejournal, @slobodanmanic

Agent-readiness crossed from concept to measurable infrastructure this week. On April 17, as Cloudflare Agents Week extended into its sixth day, the company shipped isitagentready.com, a public scanner that scores any website on how prepared it is for AI agents. Paste a URL, get a score, see which checks passed and which failed, read AI-generated guidance on how to improve. For the first time, the agent-legibility conversation moved from “is my website ready for agents” as a gut feeling to “my website scored X out of 100 in these five categories, here are the failing signals.”

The Agent Readiness Score is a real shift. It is also a structurally misleading tool if you stop reading after the composite number.

I ran the scan on this website (nohacks.co) and scored 33 out of 100, Level 2 “Bot-Aware.” The robots.txt passed. The sitemap passed. The AI bot rules in robots.txt passed. Content Signals passed. Then the score collapsed across categories where a content-only blog genuinely doesn’t need what the scanner checks for. More on that in a minute.

First, the context. Cloudflare has been shipping agent-facing infrastructure all week. The Agent Readiness Score arrived alongside Agent Memory, Shared Dictionaries, Redirects for AI Training, an LLM compression technique called Unweight, and a feature-flag tool called Flagship built for AI-generated code. Four days earlier, they shipped Project Think (a new Agents SDK), and OpenAI matched it within hours with their own Agents SDK. I wrote about that in The Agent Runtime Wars Started This Week. The readiness scanner is the logical next piece: If runtimes are the new browser layer, website owners need a way to test whether their website is legible to that layer. Cloudflare shipped the tester.

The question this article answers is narrower: What does the scanner actually check, what should you do with your score, and where is the scoring structurally misleading enough that the number by itself leads you astray?

What Cloudflare Shipped: Scanner, API, And An MCP Endpoint Agents Can Call On You

The scanner is at isitagentready.com. Paste any URL, pick a website type (All Checks, Content Site, or API/Application) to scope which signals get scanned, hit Scan. The scanner fetches the homepage and a handful of well-known paths, runs a set of checks against each, and returns a scored report with pass/fail markers, status codes, response bodies, and AI-generated guidance on what to fix.

The scanner is also available in three other ways:

  • Integrated into Cloudflare Radar, so the same checks run alongside Radar’s existing URL analysis.
  • Exposed programmatically via the Cloudflare URL Scanner API for automation.
  • Available as a stateless MCP server at /.well-known/mcp.json, so any MCP-compatible agent can call the scan as a tool and reason over the result.

That last one is worth sitting with for a moment. Cloudflare shipped an agent-readiness scanner that agents themselves can call to audit websites before deciding how to interact with them. The scanner checks whether your website is ready for agents, and any agent can invoke it to decide how to interact with you before arriving. The measurement and the measured are starting to share the same surface.

Back to the practical question. What exactly does it check?

16 Checks, 5 Categories: What The Scanner Actually Tests

The scanner groups its checks into five categories. Here is what each one looks for, grouped by what the check actually means in practice.

Discoverability (3 Checks)

Whether the website publishes the basic metadata an agent needs to find what is where.

  • robots.txt exists. The classic crawl-policy file. An agent that follows robots.txt needs it to exist and parse.
  • sitemap.xml exists. Either declared via a Sitemap directive in robots.txt or available at the standard path. An agent that wants to enumerate pages uses the sitemap.
  • Link headers (RFC 8288). HTTP Link headers pointing to canonical, alternate, or related resources. Useful for agents that parse responses rather than HTML.
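
To make the Discoverability checks concrete, here is a minimal Python sketch of two of them: reading Sitemap declarations out of a robots.txt body and splitting an HTTP Link header into targets and parameters. The function names are hypothetical, the Link parser is deliberately simplified (it does not handle every quoting rule in RFC 8288), and none of this is Cloudflare's actual scanner code.

```python
import re


def sitemaps_from_robots(robots_txt: str) -> list[str]:
    """Return the URLs declared on Sitemap: lines (field name is case-insensitive)."""
    urls = []
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


def parse_link_header(header: str) -> list[tuple[str, dict[str, str]]]:
    """Split a Link header into (target, params) pairs. Simplified parser."""
    links = []
    # Each link is "<target>" followed by ";name=value" parameters.
    for match in re.finditer(r"<([^>]+)>([^<]*)", header):
        target, param_text = match.group(1), match.group(2)
        params = dict(re.findall(r';\s*([\w-]+)\s*=\s*"?([^";,]+)"?', param_text))
        links.append((target, params))
    return links


if __name__ == "__main__":
    robots = "User-agent: *\nAllow: /\nSitemap: https://example.com/sitemap.xml\n"
    print(sitemaps_from_robots(robots))
    header = (
        '<https://example.com/a>; rel="canonical", '
        '<https://example.com/a.md>; rel="alternate"; type="text/markdown"'
    )
    for target, params in parse_link_header(header):
        print(target, params)
```

Both functions take text you have already fetched, so they can run against saved responses as easily as live ones.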

Content (1 Check)

  • Markdown for Agents. Content negotiation. The scanner sends Accept: text/markdown and checks whether the website returns Markdown instead of HTML. This is Cloudflare’s own proposal rather than an IETF spec, though the mechanism (HTTP content negotiation via the Accept header) is standard. Real agent runtimes prefer Markdown because it is cheaper to tokenize and easier to parse than HTML. Some early movers (Cloudflare itself, a handful of docs websites) support Markdown content negotiation; most websites do not.
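
On the server side, the decision behind this check comes down to reading the Accept header. The sketch below is one way to do that in Python, assuming a policy of serving Markdown only when the client lists text/markdown explicitly and ranks it at least as high as HTML. The function names and that policy are illustrative assumptions, not Cloudflare's implementation.

```python
def parse_accept(accept: str) -> dict[str, float]:
    """Map each media type in an Accept header to its q-value (default 1.0)."""
    prefs: dict[str, float] = {}
    for part in accept.split(","):
        fields = [f.strip() for f in part.split(";")]
        media = fields[0].lower()
        if not media:
            continue
        q = 1.0
        for field in fields[1:]:
            key, _, value = field.partition("=")
            if key.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs[media] = q
    return prefs


def wants_markdown(accept: str) -> bool:
    """True when text/markdown is explicitly listed and not ranked below HTML."""
    prefs = parse_accept(accept)
    markdown_q = prefs.get("text/markdown", 0.0)
    html_q = prefs.get("text/html", prefs.get("text/*", prefs.get("*/*", 0.0)))
    return markdown_q > 0 and markdown_q >= html_q


if __name__ == "__main__":
    print(wants_markdown("text/markdown"))
    print(wants_markdown("text/html,application/xhtml+xml,*/*;q=0.8"))
```

A response that varies on this header should also send Vary: Accept so that caches do not hand the Markdown variant to browsers.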

Bot Access Control (3 Checks)

  • AI bot rules in robots.txt (RFC 9309). Whether robots.txt contains directives for AI-specific user agents (GPTBot, ClaudeBot, PerplexityBot, etc.).
  • Content Signals in robots.txt. An emerging spec for declaring, inside robots.txt, how content may be used once it has been accessed. Parsed as User-agent: * followed by Content-signal: directives. Adoption is minimal right now.
  • Web Bot Auth request signing. HTTP message signatures at /.well-known/http-message-signatures-directory that let agents prove their identity cryptographically. This is the Agent Name Service side of things, which Cloudflare shipped with GoDaddy earlier in Agents Week. Adoption is almost zero outside Cloudflare’s own properties.
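
The first two checks in this category are both plain-text lookups in robots.txt. A rough Python sketch of that lookup follows. The agent list is taken from the bullet above, the sample directive values (search=yes, ai-train=no) are assumptions for illustration, and the function is hypothetical rather than a copy of what the scanner runs.

```python
AI_AGENTS = {"gptbot", "claudebot", "perplexitybot"}  # names mentioned in the article


def scan_robots(robots_txt: str) -> dict:
    """Report which AI user agents have their own group and which
    Content-Signal values appear under the User-agent: * group."""
    agents_seen: set[str] = set()
    signals: dict[str, str] = {}
    current: list[str] = []  # user agents of the group being read
    in_rules = False         # True once the group has a non-User-agent line
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()
        name, sep, value = line.partition(":")
        if not sep:
            continue
        name, value = name.strip().lower(), value.strip()
        if name == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
            agents_seen.add(value.lower())
        else:
            in_rules = True
            if name == "content-signal" and "*" in current:
                for pair in value.split(","):
                    key, eq, val = pair.partition("=")
                    if eq:
                        signals[key.strip()] = val.strip()
    return {
        "ai_agents": sorted(agents_seen & AI_AGENTS),
        "content_signals": signals,
    }


if __name__ == "__main__":
    sample = (
        "User-agent: GPTBot\n"
        "Disallow: /private/\n"
        "\n"
        "User-agent: *\n"
        "Content-Signal: search=yes, ai-train=no\n"
        "Allow: /\n"
    )
    print(scan_robots(sample))
```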

API, Auth, MCP & Skill Discovery (6 Checks)

  • API Catalog (RFC 9727). A machine-readable index of a website’s API endpoints at /.well-known/api-catalog.
  • OAuth / OIDC discovery (RFC 8414). Standard OAuth 2.0 authorization server metadata at /.well-known/oauth-authorization-server and /.well-known/openid-configuration.
  • OAuth Protected Resource (RFC 9728). A website declaring which endpoints are OAuth-protected and how to authenticate.
  • MCP Server Card (SEP-1649). A Model Context Protocol server advertising its capabilities at /.well-known/mcp/server-card.json. SEP-1649 is a draft proposal inside the MCP spec process.
  • Agent Skills index. A list of agent-callable skills at /.well-known/agent-skills/index.json. Also emerging.
  • WebMCP (Experimental). An in-page JavaScript API registering agent-callable tools via navigator.modelContext. The scanner uses headless browser rendering to detect whether the website registers any WebMCP tools on page load.
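
Most of the checks in this category reduce to asking whether a well-known path answers. The Python sketch below probes only the paths named in the list above. Treating HTTP 200 as a pass, accepting either OAuth path, and using HEAD requests are all assumptions on my part, since the scanner's real pass criteria (it also inspects response bodies) are not spelled out here.

```python
import urllib.error
import urllib.request
from typing import Callable
from urllib.parse import urljoin

# Paths as listed in the article; any one path answering counts as a pass (assumption).
WELL_KNOWN = {
    "API Catalog": ["/.well-known/api-catalog"],
    "OAuth / OIDC discovery": [
        "/.well-known/oauth-authorization-server",
        "/.well-known/openid-configuration",
    ],
    "MCP Server Card": ["/.well-known/mcp/server-card.json"],
    "Agent Skills index": ["/.well-known/agent-skills/index.json"],
}


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status for a HEAD request, or 0 on a network error."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code
    except urllib.error.URLError:
        return 0


def probe(base_url: str, fetch: Callable[[str], int] = http_status) -> dict[str, bool]:
    """Map each check name to True when any of its paths returns 200."""
    return {
        check: any(fetch(urljoin(base_url, path)) == 200 for path in paths)
        for check, paths in WELL_KNOWN.items()
    }


if __name__ == "__main__":
    for check, found in probe("https://example.com").items():
        print(f"{check}: {'found' if found else 'missing'}")
```

The fetch parameter exists so the probe can be exercised with a stub instead of live requests. Some servers reject HEAD, in which case a GET fallback would be needed.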

Commerce (3 Optional Checks, Not Scored On Non-Commerce Websites)

  • x402 payment protocol. HTTP 402 Payment Required infrastructure for agent-native payments.
  • UCP profile (Universal Commerce Protocol). Google’s merchant-metadata standard at /.well-known/ucp.
  • ACP discovery document (Agentic Commerce Protocol). At /.well-known/acp.json.

The Commerce category is flagged “optional” on non-commerce websites. The scanner detects whether any ecommerce signals are present and, if not, displays the commerce checks for informational purposes without counting them in the score.

That last design detail matters. It is evidence Cloudflare anticipated exactly the problem the rest of this article is about.

Nohacks.co Scored 33/100, Level 2 Bot-Aware

I ran the scan on nohacks.co. The result was 33 out of 100, Level 2 “Bot-Aware.”

The Agent Readiness Score report for nohacks.co, scanned on 2026-04-18. Composite: 33/Level 2 “Bot-Aware.” Category breakdown: Discoverability 67 (2/3), Content 0 (0/1), Bot Access Control 100 (2/2), API, Auth, MCP & Skill Discovery 0 (0/6). Commerce checks not scored (no ecommerce signals detected). Image Credit: Slobodan Manic

A note on that number: After the first scan, I added Content Signals directives to robots.txt, which moved Bot Access Control from 50 to 100 and pulled the composite up eight points from an initial 25. Every other category below is unchanged from the first scan. I’ll come back to the Content Signals fix and why I made it at the end of this section.

Here is what drove each category score:

  • Discoverability: 67. robots.txt and sitemap.xml passed. Link headers failed because this website does not emit Link: headers in its responses.
  • Content: 0. Markdown content negotiation is not configured. The website returns HTML regardless of the Accept header.
  • Bot Access Control: 100. Both scored checks passed: AI bot rules in robots.txt (I have explicit rules for AI user agents) and Content Signals in robots.txt (I added these after the first scan). Web Bot Auth request signing is listed in this category as an informational check, but it is not counted toward the 2/2.
  • API, Auth, MCP & Skill Discovery: 0. All six checks failed. No API Catalog. No OAuth discovery. No OAuth Protected Resource metadata. No MCP Server Card. No Agent Skills index. No WebMCP tools on the page.
  • Commerce: not scored. nohacks.co has no e-commerce. The Commerce checks all failed, but the category is correctly excluded from the composite score.

That is a 33 on a scanner built by the company I most trust to understand where the agent-ready web is going. I consider this website reasonably well-designed for agents. The robots.txt is clean and explicit. The content is server-rendered, machine-readable HTML with clean semantic structure. The sitemap is current. The URLs are stable. If you asked me a week ago whether this website was agent-ready, my answer would be somewhere between “mostly yes” and “for what it needs to do, yes.”

And yet: 33, Level 2.

The scanner is measuring what it says it is measuring. The composite score, by itself, is still the wrong number to optimize for.

One note on the Content Signals fix, because it’s relevant to the Goodhart argument later in this article. Content Signals is a Cloudflare proposal with almost no deployment beyond Cloudflare-aligned crawlers. I debated adding it for exactly the score-chasing reason this article warns about. I decided it was defensible for two reasons. First, the fix is declarative, not decorative. The directives state real policy about what should happen with my content, and the statement has meaning even if the spec fails. That is different from adding an empty MCP Server Card to satisfy a scorer. Second, for a website that writes about agent-readiness specifically, publicly declaring content policy is editorial practice regardless of which crawler respects it. The fix was one commit to public/robots.txt and the directives are readable by any human curious enough to check.

Same Website Scores 33 Or 67 Depending On The Preset You Select

On the All Checks preset, nohacks.co scores 33 out of 100, Level 2 “Bot-Aware.” On the Content Site preset, same website, same day, different scan configuration, it scores 67, still Level 2 “Bot-Aware.” Roughly double the composite number. The 34-point gap is the difference between two scan configurations of the same scanner, not a difference between two websites.

Here is what the Content Site preset changes in the scan configuration:

The Content Site preset unchecks every item in the API/Auth/MCP/Skill Discovery category, every item in the Commerce category, and Web Bot Auth in Bot Access Control. Six scored checks remain: three Discoverability (robots.txt, Sitemap, Link headers), one Content Accessibility (Markdown negotiation), two Bot Access Control (AI bot rules, Content Signals). Image Credit: Slobodan Manic

Running that preset on nohacks.co produced this result:

Nohacks.co under the Content Site preset: 67 / Level 2 “Bot-Aware.” Four of six scored checks pass. The two failing checks are Link headers (a fix I have not deployed yet) and Markdown content negotiation (not configured). Both are real shipping signals that agent runtimes benefit from today. Image Credit: Slobodan Manic

Four of six scored checks pass. The two failures are unambiguous remediation targets: Link headers via HTTP response configuration, Markdown content negotiation via origin or CDN response logic. Both ship against real agent-runtime behavior today. Neither is a proposal-stage format that will only maybe become a standard. This is the honest reading of nohacks.co’s agent-readiness state: two specific, actionable gaps.
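
The three composites reported for nohacks.co (25 on the first scan, 33 after the Content Signals fix, 67 under the Content Site preset) are all consistent with a plain pass ratio: scored checks passed divided by scored checks run. That is my inference from three data points, not a formula Cloudflare has published, so read the Python sketch below as an assumption that happens to reproduce the numbers.

```python
def composite(categories: dict[str, tuple[int, int]]) -> int:
    """Composite as a pass ratio: total passed / total scored, as a rounded percent.

    ASSUMPTION: inferred from the reported scores, not a documented formula.
    """
    passed = sum(p for p, _ in categories.values())
    scored = sum(t for _, t in categories.values())
    return round(100 * passed / scored)


# (passed, scored) per category, taken from the reports described above.
ALL_CHECKS = {
    "Discoverability": (2, 3),
    "Content": (0, 1),
    "Bot Access Control": (2, 2),
    "API, Auth, MCP & Skill Discovery": (0, 6),
}
FIRST_SCAN = {**ALL_CHECKS, "Bot Access Control": (1, 2)}  # before Content Signals
CONTENT_SITE = {
    "Discoverability": (2, 3),
    "Content": (0, 1),
    "Bot Access Control": (2, 2),
}

if __name__ == "__main__":
    print(composite(FIRST_SCAN))    # 3 of 12 scored checks
    print(composite(ALL_CHECKS))    # 4 of 12 scored checks
    print(composite(CONTENT_SITE))  # 4 of 6 scored checks
```

Under this reading, the preset changes only the denominator. The same four passing checks are divided by twelve in one scan and by six in the other.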

The Correct Toggle Is Hidden, And The Default Score Is Wrong

The scanner is doing its job. It knows a blog does not need an MCP Server Card. It knows a podcast archive does not publish an API catalog. The Content Site preset is not cosmetic. It removes irrelevant checks and gives a content website an accurate reading against standards that actually apply.

The problem is that the preset is hidden. When a user lands on isitagentready.com and pastes a URL, the default scan is All Checks. The Site Type toggle that would switch to Content Site or API/Application lives inside a Customize dropdown that most users will never open. The user clicks Scan, reads the composite number, takes a screenshot, shares it. The shareable number, the one that travels on social media, the one competitors compare across, is the All Checks composite.

For a content website that runs the default scan without reading individual checks, the composite is structurally too low. The 33 on nohacks.co is wrong for the kind of website nohacks.co is. The 67 from the Content Site preset is the accurate reading. Two numbers from the same scanner on the same website. The accurate number is behind a dropdown. The wrong number is on the front page.

Any web professional who runs the scanner and plans to share the score anywhere public needs to open Customize, select the preset that matches their website type, and re-run before sharing. Without that step, the public score will understate the website’s actual agent-readiness, and the gap between the shared number and the accurate number will be larger for content websites than for API websites (which are closer to the All Checks baseline). Read the individual checks. Do not share a composite until you know which preset produced it.

For the record: the 67 is bothering me. I am going to go get the 100. I know exactly what the Goodhart section below is about to warn against, and I am going to do it anyway. Two fixes stand between me and the 100. Both are five-minute jobs. Both map to real agent-runtime behavior (Link headers for discovery, Markdown content negotiation for efficient agent parsing), so at least the motivation is legitimate and not pure score-chasing. That caveat is also exactly what score-chasers say. Public scores are a gravitational field. Even the person writing a long article about their unreliability ends up orbiting.

Agent Readiness Measures Delivery, Not Message

Every category the Agent Readiness scanner tests is about delivery: discoverability, content negotiation, bot access, API discovery, commerce protocols. None tests the quality of the message itself.

The scanner never asks whether your headlines are clear, whether your product descriptions persuade, whether your content answers the query well, whether your writing is any good. Those are SEO and CRO questions. They occupy the discipline of making the message better. The Agent Readiness Score occupies a different discipline entirely. It asks whether an agent can fetch your content, parse the format it arrives in, authenticate against your endpoints, call your functions, pay for your outputs.

That is the distinction that matters. Classical web optimization (SEO, CRO) is about what you say and how persuasively you say it. Agent-readiness is about how you deliver what you say to a non-human reader. Two websites can publish word-for-word identical content. One serves it as server-rendered HTML with semantic markup, responds to Accept: text/markdown, exposes structured data, returns predictable response codes. The other serves it as a JavaScript-rendered single-page application with no content negotiation and an inconsistent error surface. The message is identical. The delivery is different. The agent-readiness score will be different. And it will be right to be different, because the delivery is what the agent interacts with.

This is also why agent-readiness fixes tend to be orthogonal to SEO and CRO work. You can improve an agent-readiness score without rewriting a single word of your content. You can also have world-class SEO content that scores a 10 on the agent-readiness scanner because none of your delivery pipeline was designed for machine consumers. SEO and CRO work on the content layer. Agent-readiness works on the transport and protocol layer. They are adjacent but not the same craft, and treating them as the same is the mistake that turns an agent-readiness project into a content-rewrite project and misses the actual fix.

The people who will do well over the next several years are the ones who stop arguing about which discipline matters more and start recognizing they occupy different layers of the stack.

3 Goodhart Risks Built Into The Agent Readiness Score

Goodhart’s law says that when a measure becomes a target, it stops being a good measure. The Agent Readiness Score is well-designed, but it is also now a public, shareable, compared number, which produces three predictable behavioral failures in the wild.

The first risk is that website owners will optimize for the number rather than for real agent behavior. Add an MCP Server Card that points nowhere because the scanner wants one. Publish an Agent Skills index with no actual skills. Ship a WebMCP tool that does nothing just to pass the detection check. The score goes up, and nothing changes for real agent runtimes visiting the website.

The second risk is that consultancies will start selling “Agent Readiness Score optimization” as a service, selling the score rather than the underlying architecture. The history of SEO gives us more than two decades of data on how this plays out. PageRank became a target, and a decade of link-spam economy grew up around it. Core Web Vitals became a target, and a generation of performance-theater optimizations followed. The Agent Readiness Score is a better-designed metric than either of those was at launch, but the same gravity applies.

The third risk is that the scanner’s inclusion of emerging standards as scored signals will accelerate the adoption of those standards past the point where they are ready to carry real traffic. The scanner checks for llms.txt, a proposed format for exposing website content to language models. Llms.txt is not a ratified standard, has no governing body, and has competing proposals for how it should be structured. Including it as a scored signal gives it weight it has not earned in the ecosystem. A website owner looking to fix a failing check is the marginal adopter who tips a proposal into a de facto standard before the spec work is done.

None of these failure modes are hypothetical. They are how every public measurement score in the history of the web has played out. The Agent Readiness Score is better than most because Cloudflare is honest about what it is, because the per-check detail is available right alongside the composite number, and because the Commerce category correctly excludes itself on non-commerce websites. That honesty is a feature worth protecting. Website owners and the consultancy industry will be tempted to treat the composite number as the target anyway.

Do not do this.

6 Weekend Fixes That Map To Real Agent Runtimes

Six actions for a web professional running the scanner the weekend of its launch, ordered from highest-leverage to lowest:

  1. Run the scan on your website. It takes about 30 seconds. Note the score and open the detailed report. The detail is where the signal is.
  2. Fix the failing checks that ship against real agent runtimes today. These are the ones whose absence measurably hurts your website for agents visiting it right now:
    • robots.txt. If missing, add one. If present, make sure it contains specific rules for AI user agents (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, etc.).
    • sitemap.xml. If missing, generate one and link it from robots.txt. Keep it current.
    • Markdown content negotiation. Configure your origin or CDN to return text/markdown when the Accept header requests it. Cloudflare’s own AI Crawl Control has first-class support for this. Other providers require custom server logic.
    • Structured data. Ship schema.org JSON-LD for the content types your website publishes (Article, Product, Organization, BreadcrumbList). This is not a scored check, but it is the highest-leverage fix for citation behavior across every agent runtime currently deployed.
  3. Treat the proposal-stage formats as a watch list, not a checklist. llms.txt, Content Signals in robots.txt, Web Bot Auth, API Catalog, MCP Server Card, Agent Skills, WebMCP, ACP, UCP are all real working standards in some sense. They are not shipping against real agent-runtime behavior at scale yet. Watch them. Implement them when your stack has a reason to, not because the scanner flags them.
  4. Ignore the composite number in your own tracking. Track individual check outcomes over time. A website that goes from 3 of 5 real-runtime checks passing to 5 of 5 has measurably improved, even if the composite score barely moved because the 10 proposal-stage checks still fail.
  5. Re-scan after changes. The scanner is fast, free, and available via the URL Scanner API if you want to script regression checks into your deployment pipeline.
  6. Skip the consultancies selling Agent Readiness Score optimization. The work is straightforward enough that a half-day audit and a focused remediation sprint will beat any packaged service.
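
Steps 4 and 5 amount to keeping a per-check record and comparing it between scans. A minimal Python sketch of that comparison is below. It assumes you have already reduced each report to a mapping of check name to pass/fail yourself. The scan's actual response format is not described here, and the sample data is hypothetical.

```python
def diff_checks(before: dict[str, bool], after: dict[str, bool]) -> dict[str, list[str]]:
    """Compare two scans and list the checks that flipped in either direction."""
    fixed = sorted(k for k in after if after[k] and not before.get(k, False))
    regressed = sorted(k for k in before if before[k] and not after.get(k, False))
    return {"fixed": fixed, "regressed": regressed}


if __name__ == "__main__":
    # Hypothetical pass/fail snapshots, keyed by the check names used in this article.
    last_week = {
        "robots.txt": True,
        "sitemap.xml": True,
        "Link headers": False,
        "Markdown for Agents": False,
    }
    today = {
        "robots.txt": True,
        "sitemap.xml": False,  # e.g. a deploy dropped the sitemap route
        "Link headers": True,
        "Markdown for Agents": False,
    }
    changes = diff_checks(last_week, today)
    print("fixed:", changes["fixed"])
    print("regressed:", changes["regressed"])
    # In a deployment pipeline, a non-empty "regressed" list could fail the build.
```

Tracking flips per check keeps the record meaningful even when the composite barely moves.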

The scanner is the tool. The work is still the work.

Vendor-Specific Scanners Are Coming: Track What Every Scanner Tests

The Agent Readiness scanner is standards-list-shaped: a set of checks against a fixed list of protocols and formats, some ratified (RFC 8288 Link headers, RFC 9309 robots.txt rules, RFC 8414 OAuth discovery, RFC 9727 API Catalog, RFC 9728 OAuth Protected Resource), some emerging proposals (MCP SEP-1649, WebMCP, Content Signals, Web Bot Auth, x402, UCP, ACP, llms.txt). The next thing that happens in the ecosystem is predictable: Other vendors will ship their own scanners against their own preferred lists. The overlap will be significant because most of the ratified standards are uncontroversial. The divergence will be in which proposals each vendor scores for.

That divergence is where the agent-readiness measurement story gets interesting. A Cloudflare scanner that checks for Web Bot Auth and UCP is making a bet. A Google scanner, if it ships, would check for some of the same things and some different ones (Google has UCP, does not have Web Bot Auth). A Perplexity scanner would check for yet another set. Website owners would see different scores from different scanners on the same website. The composite number, already not trustworthy, becomes vendor-specific.

The signal worth tracking is which checks show up in every scanner that ships. Those are the de facto standards. The checks that only show up in Cloudflare’s scanner are Cloudflare’s bets. Some will win. Most will not.
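
Keeping that record is a set operation. The Python sketch below uses two made-up scanners ("scanner_a" and "scanner_b" are placeholders, and their check lists are hypothetical) to separate the checks every scanner agrees on from the checks only one vendor scores.

```python
def shared_checks(scanners: dict[str, set[str]]) -> set[str]:
    """Checks that appear in every scanner: candidates for de facto standards."""
    sets = list(scanners.values())
    return set.intersection(*sets) if sets else set()


def single_vendor_bets(scanners: dict[str, set[str]]) -> dict[str, set[str]]:
    """Checks that appear in exactly one scanner, grouped by that scanner."""
    bets = {}
    for name, checks in scanners.items():
        others = set().union(*(c for n, c in scanners.items() if n != name))
        bets[name] = checks - others
    return bets


if __name__ == "__main__":
    # Hypothetical check lists; real scanners may differ.
    scanners = {
        "scanner_a": {"robots.txt", "sitemap.xml", "Link headers", "Web Bot Auth"},
        "scanner_b": {"robots.txt", "sitemap.xml", "UCP"},
    }
    print(sorted(shared_checks(scanners)))
    for name, bets in single_vendor_bets(scanners).items():
        print(name, sorted(bets))
```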

This is the pattern that made me comfortable publishing an article about a Cloudflare tool on the day it shipped. The Agent Readiness Score is real. The thesis behind it (agent-readiness is a measurable property) is the right thesis. The specific scorecard is version one of something that is going to have dozens of versions, each reflecting its vendor’s bets. Web professionals should engage with the version-one scorecard, fix what it correctly flags as real, watch what it flags as emerging, and keep their own running list of which checks survive across every scanner that ships in the next six months.

That running list is the real agent-readiness standard. The composite score is the marketing layer.

Run the scan. Read the report. Fix what matters. Watch what might.



This post was originally published on No Hacks.



LLM Guidance Doesn’t Transfer The Way SEO Guidance Did via @sejournal, @DuaneForrester

For roughly two decades, the SEO discipline operated on a quiet assumption that turned out to be one of its most valuable features. Guidance from one search engine traveled. If Google said sitemaps mattered, Bing said sitemaps mattered. If Bing said structured data deserved real effort, Google said the same. Practitioners optimized for Google with reasonable confidence that the work would carry across the other engines, and most of the time it did. That portability was not luck. It was the product of a structurally large overlap layer that the major search engines had jointly built, brick by brick, over twenty years.

That world doesn’t exist in LLM-land. The major providers train on different corpora, run different crawlers under different policies, route different queries through different retrieval systems, and apply different alignment processes that shape the final response in ways the upstream signals can’t predict. Guidance from any one provider, including Google’s guidance about its own Gemini products, is one data point. Practitioners carrying the SEO habit forward, the habit of treating one engine’s guidance as roughly the whole map, will optimize confidently for one platform and miss the others.

Sidebar: As I was finalizing this piece, Google published fresh guidance on optimizing for their generative AI features. Their framing is explicit: from Google Search’s perspective, optimizing for AI search is still SEO. That framing is accurate for Google Search. It does not extend to ChatGPT, Claude, Perplexity, or any other LLM, and that is precisely the trap this article is about.

The Shared Standards That Made SEO Guidance Portable

The era of portable guidance was built on actual collaboration, not coincidence. The Sitemaps protocol became the joint property of Google, Yahoo, and Microsoft in November 2006, when the three engines formally agreed to support a common protocol at version 0.90, building on Google’s earlier Sitemaps 0.84 from June 2005. Five years later, on June 2, 2011, the same three engines launched Schema.org, with Yandex joining shortly after, to create a common vocabulary for structured data markup. That was the announcement that got made on stage at SMX Advanced. I was on the Bing team at the time, and what struck me then is what still matters now. The engines were competitors, but they had decided that a shared vocabulary served them all. Webmasters got one set of rules. The web got cleaner data. The engines got better signals. Everybody won.

The pattern repeated with robots.txt, the 1994 convention that became RFC 9309 at the IETF in 2022, formalizing what every serious crawler already honored. And it repeated again, more recently, with IndexNow, the protocol Microsoft Bing and Yandex launched in October 2021. IndexNow is now supported by Bing, Yandex, Naver, Seznam, and Yep. Google has tested the protocol since 2021, but has not adopted it.

That overlap layer is exactly why Google’s guidance felt safe to follow, even if you cared about Bing traffic. The signals the engines used were not identical, but the inputs they accepted, the protocols they honored, and the standards they advertised were. Optimization had a shared substrate.

Where The LLM Stacks Actually Diverge

The LLM environment doesn’t have a shared substrate of comparable size. The differences are not cosmetic, and they are not temporary. They are baked into how the systems are built.

Start with training data. OpenAI has signed disclosed licensing deals with News Corp worth up to $250 million over five years, Axel Springer at roughly $13 million per year, Reddit at an estimated $70 million per year, plus the Financial Times, Condé Nast, Hearst, Vox Media, The Atlantic, the Associated Press, Le Monde, and others. Google has its own Reddit deal, estimated at $60 million per year, granting real-time data API access. Anthropic has not publicly disclosed equivalent publisher licensing deals, and that undisclosed status is itself the practitioner-facing point. The corpora that fed these models, and that continue to refresh them, are not the same documents. Practitioners cannot know what any given provider has paid for and what it hasn’t.

The crawler infrastructure diverges next. OpenAI runs three separate bots: GPTBot for training, OAI-SearchBot for search indexing, and ChatGPT-User for user-initiated retrieval. Anthropic runs three of its own: ClaudeBot for training, Claude-SearchBot for search, and Claude-User for user-initiated retrieval. Perplexity runs PerplexityBot and Perplexity-User. Google introduced Google-Extended in September 2023 as the user-agent that controls whether Google can use a site’s content to train Gemini, separate entirely from the Googlebot that handles traditional search indexing. There is no single AI user-agent. Every provider requires a separate rule, and the rules don’t translate cleanly across providers because the bots don’t do equivalent jobs in equivalent ways.
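
In practice, "a separate rule per provider" means one robots.txt group per bot. The Python sketch below generates those groups from a policy table and checks the result with the standard library's urllib.robotparser. The policy itself (block training bots, allow search and user-initiated bots) is a hypothetical example, not a recommendation, and real crawlers may interpret robots.txt differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: True = allow the whole site, False = disallow it.
POLICY = {
    "GPTBot": False,          # OpenAI training
    "OAI-SearchBot": True,    # OpenAI search indexing
    "ChatGPT-User": True,     # OpenAI user-initiated retrieval
    "ClaudeBot": False,       # Anthropic training
    "Claude-SearchBot": True, # Anthropic search
    "Claude-User": True,      # Anthropic user-initiated retrieval
    "Google-Extended": False, # Gemini training control
}


def build_robots(policy: dict[str, bool]) -> str:
    """Emit one User-agent group per bot, separated by blank lines."""
    groups = []
    for bot, allowed in policy.items():
        rule = "Allow: /" if allowed else "Disallow: /"
        groups.append(f"User-agent: {bot}\n{rule}\n")
    return "\n".join(groups)


def parser_for(robots_txt: str) -> RobotFileParser:
    """Load robots.txt text into the standard-library parser."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


if __name__ == "__main__":
    robots_txt = build_robots(POLICY)
    print(robots_txt)
    parser = parser_for(robots_txt)
    for bot in ["GPTBot", "OAI-SearchBot", "PerplexityBot"]:
        print(bot, parser.can_fetch(bot, "https://example.com/post"))
```

PerplexityBot has no group in this policy, so the parser falls back to allowing it. A bot you forget to name is a bot you have implicitly allowed.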

The retrieval architectures diverge structurally. ChatGPT has historically used Bing’s index as its primary web search source, and that connection appears to still be primary, though OpenAI continues to build out additional infrastructure alongside it. Perplexity built its retrieval system on a Vespa-based pipeline that treats documents and sub-document chunks as first-class retrievable units. Google’s Gemini uses Google’s own index plus Knowledge Graph grounding. Claude uses Brave Search as a retrieval partner. Same query, four different retrieval systems, four different views of which sources exist and which sources are worth surfacing.

Then comes the alignment layer, which is where SEO had no equivalent at all. After a model is trained on its corpus, providers run post-training to shape how the model actually behaves: tone, refusal patterns, format, safety posture, what counts as a good answer. OpenAI’s primary approach has been RLHF, or Reinforcement Learning from Human Feedback, where human raters score model outputs and the model learns to produce highly rated responses. Anthropic developed Constitutional AI, which trains models to critique and revise their own outputs against a written set of principles. These methodologies produce demonstrably different behavior in the final products. The same retrieved content, fed into two models aligned by two methodologies, can yield two materially different responses about the same brand.

When One Provider’s Guidance Demonstrably Fails To Port

The clearest single example of guidance that doesn’t port is llms.txt. Jeremy Howard of Answer.AI proposed the file in September 2024 as a markdown manifest, placed at a site’s root, that would guide LLMs to the most important content. The proposal got picked up across the SEO community. Yoast built a generator. Agencies added llms.txt creation to their service catalogs. Conference speakers declared it essential.

As of mid-2026, no major LLM provider has confirmed they consume the file. Not OpenAI. Not Anthropic. Not Google. Server-log analyses across hundreds of thousands of domains show major AI crawlers don’t routinely request /llms.txt at all. Google’s John Mueller publicly compared it to the deprecated meta keywords tag. Gary Illyes confirmed at Search Central Live in July 2025 that Google does not support llms.txt and is not planning to.

I’ve written about this elsewhere, so I won’t repeat the technicalities here. What matters for this argument is the structural lesson. Schema.org succeeded because three engines built it together and then enforced it together. Llms.txt was proposed by one researcher, picked up by tooling vendors, and ignored by the platforms it was supposed to serve. The shared-standards model that gave SEO its portable guidance is not available to LLM practitioners at the same scale, because the platforms are not building the standards together. They are building their own pipelines.

The Gemini Inversion

The cleanest illustration of how far guidance portability has degraded sits inside one company. Google publishes its own SEO documentation at Search Central, the canonical guidance the industry has followed for two decades. Those documents emphasize traditional ranking signals, E-E-A-T, content quality, technical accessibility, and structured data. That guidance is still useful for Google Search itself.

Google also makes Gemini, the model that powers AI Overviews and Google’s separate AI Mode surface. And the citation behavior of those surfaces does not appear to track the guidance the same company publishes for its own search results.

In late 2024, roughly three-quarters of pages cited in AI Overviews also ranked in Google’s top 10 for the same query. By early 2026, after Google upgraded AI Overviews to Gemini 3 in January, Ahrefs analyzed 4 million AI Overview URLs and found that only 38% of cited pages also appeared in the top 10 for the same query. A separate BrightEdge analysis put the overlap closer to 17%. SE Ranking’s post-upgrade work found that Gemini 3 replaced approximately 42% of the domains previously cited under earlier model versions and generated 32% more sources per response.

The gap widens further when you look at Google’s AI Mode, which is a separate conversational surface that runs on the same Gemini family. Semrush data shows AI Mode and AI Overviews reach semantically similar conclusions 86% of the time, but cite the same URLs only 13.7% of the time. Only 14% of AI Mode citations rank in Google’s traditional top 10.

It appears, so far, that the canonical relationship has shifted. Google’s published SEO guidance is still the cleanest path to ranking in Google Search. But that ranking is no longer a reliable proxy for being cited by Google’s own AI surfaces. The same guidance, the same content, the same domain, can produce three meaningfully different outcomes across Google Search, AI Overviews, and AI Mode, even though all three live inside the same company. The old playbook of following the search engine’s guidance and trusting that the engine’s other surfaces would behave consistently does not appear to be delivering the same returns it used to.

What Still Ports, And Why It’s Smaller Than It Looks

A universal layer does survive. Crawler accessibility still matters across every provider. Primary-source factual content still wins more citations than aggregator restatement. Clean retrievable structure still helps every system understand what a page is about. Presence on the high-authority sources that all major LLMs disproportionately cite, Wikipedia, YouTube, Reddit, major news outlets, still functions as a force multiplier across platforms. Earning visibility on those sources gives content a chance to surface in any LLM that draws on them.

But the universal layer is much smaller than it was in the SEO era. Qwairy’s analysis of 118,000 AI responses across ChatGPT, Perplexity, Google AI Mode, and Claude found that only 11% of cited domains appeared across multiple platforms. The other 89% were platform-specific. A brand that wins citations on Perplexity may be largely invisible on Claude. A brand that’s a regular reference on ChatGPT may not show up in AI Overviews at all. The same content can be the right answer for one system and the wrong answer for the system next to it.

What This Means For The Work

The practical implication is not abandoning all hope. It is that practitioners need to stop treating any single LLM provider’s guidance as the universal map and start treating it as one input among several. Read what every major provider publishes about their own systems. Test your visibility across platforms, not just on the platform you happen to use most. Treat divergence as the default and overlap as the exception, not the other way around.

This is not how SEO worked, and the difference matters. The old reflex was to optimize for Google and trust the portability. The new reality is that following one LLM’s guidance, even Google’s guidance about Gemini, will leave you optimized for a slice of the landscape and potentially blind to the rest. The discipline is being rebuilt on platform-specific work that didn’t exist in the SEO era, and the practitioners who recognize that first are going to spend the next two years setting the standards everyone else follows.

The overlap has shrunk. You now have more work than ever to accomplish.

If you have thoughts on where the divergence between providers is sharpest in your own work, reach out directly. I’d genuinely like to hear what’s showing up in the data.


This post was originally published on Duane Forrester Decodes.


Featured Image: Rawpixel.com/Shutterstock; Paulo Bobita/Search Engine Journal

How To Stress-Test A Staging Environment To Surface Risks Pre-Launch – Ask An SEO via @sejournal, @HelenPollitt1

This week’s Ask An SEO question:

“How do you stress-test a staging environment to surface SEO risks before a large-scale launch?”

It is one of the most important questions to answer when considering rolling out new websites, migrations, or significant changes to your live site.

First, let’s look at the difference between a “staging” site and the “production” site.

The staging site is often also called the “development” site, “pre-production,” or another name that is specific to your company. It is a test site that is meant to mirror your live site as much as possible to help developers test changes in a safe, private environment before launching them.

The “production” site is your live site. It’s the one that is accessible to the general public and should be operating as close to perfectly as possible.

There are some instances where developers might deploy straight to the production site without testing on a staging site first. This can happen, for example, when there is no testing site to use, or when there is no way of reproducing the test conditions without deploying the change to the live site. Deploying this way is risky. If a deployment breaks something else in the code, it could critically affect the usability of the live site.

How To Stress-Test The Staging Environment

As SEOs, it is very important that we test deployments that could potentially impact SEO performance before they launch. Oftentimes, we find ourselves discovering deployments after they have already started to affect traffic and rankings. This is less than ideal, as it can take a while for Googlebot to pick up changes once a bad deployment has been fixed. It is far better to test how Googlebot might process changes before it is able to do so.

Mirror The Production Site As Closely As Possible

The most important aspect of the staging site is that it is as close to the production environment as possible. This is critical because it enables any testing that you do to reveal the same outcome as if you had run the test on the production environment.

Any deviations between the two environments need to be cataloged. These discrepancies need to be communicated so that testers know to pay special attention to the areas of the production site that differ from staging. Once the deployment goes live, testers can quickly ensure these areas of the production site are behaving as expected.

Crawl The Site At Scale With Multiple User-Agents

One area that is often overlooked when stress-testing the staging environment is using several different user agents when crawling the site.

By using different agents, for example, mimicking Googlebot Smartphone and Googlebot Desktop, you are more likely to pick up technical issues with the site that aren’t obvious on first crawl. For instance, crawling as both desktop Googlebot and mobile Googlebot could show issues with rendering that are only occurring on mobile devices.

Make sure to crawl the site with user agents that are important for your specific industry. If you are targeting Google News as a channel, make sure to crawl the site as Googlebot-News. If images or videos are important to your SEO, crawl as Googlebot-Image and Googlebot-Video.

To put your staging site through its paces, make sure to crawl it with a mobile user agent, a desktop user agent, and spoof two search engine bots, e.g., Google and Bing. This way you are getting good coverage of the experiences of different, important bots. If possible, try to crawl as an LLM bot also.
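As a minimal sketch of that setup, the snippet below prepares one request per user agent for the same staging URL using only Python’s standard library. The staging URL is hypothetical, and the user-agent strings are illustrative examples (the Chrome version in particular is a placeholder), so check each search engine’s current crawler documentation before relying on them.

```python
from urllib.request import Request

# Illustrative user-agent strings (assumptions, not authoritative values).
# The Chrome version below is a placeholder; real crawlers update it over time.
USER_AGENTS = {
    "googlebot-desktop": (
        "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    ),
    "googlebot-smartphone": (
        "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Mobile Safari/537.36 "
        "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    ),
    "bingbot": "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
}


def build_requests(url, agents=USER_AGENTS):
    """Return one prepared Request per user agent for the same URL."""
    return {
        name: Request(url, headers={"User-Agent": ua})
        for name, ua in agents.items()
    }


# Hypothetical staging URL. Nothing is sent over the network here.
requests_by_agent = build_requests("https://staging.example.com/category/shoes")
for name, req in requests_by_agent.items():
    print(name, "->", req.get_header("User-agent"))
```

Passing each prepared request to `urllib.request.urlopen` would fetch the page as that agent. In practice, most crawling tools expose the same idea as a user-agent setting, so a script like this is mainly useful for quick spot-checks.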

Check The Rendering

A good starting point when testing a staging environment before a large-scale deployment is rendering. Modern websites often use a lot of JavaScript, which, while not inherently bad, can be difficult for some search bots to process. For more information on how search bots process JavaScript, see this guide.

Set your crawling tool to include JavaScript rendering, and see what elements it can pick up. For example, can you see the header tags, meta title, schema markup? Then crawl the site again without JavaScript rendering enabled. Make sure those same elements are still available to the bots.

If in doubt, carry out some spot-checks on pages on the staging site. Inspect the Document Object Model (DOM) to see if the critical code elements are visible on first load of the page.

It is important that what you are seeing on the page is what the search bots are able to parse and render.
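For a spot-check, a rough sketch along these lines can compare the raw HTML response with the JavaScript-rendered HTML for the same page. It assumes you have already saved both versions as strings (for example, from your crawler’s export). The function names and the choice of elements (title, h1, JSON-LD) are illustrative, not a complete SEO check.

```python
from html.parser import HTMLParser


class SeoElementParser(HTMLParser):
    """Collect the title text, h1 text, and JSON-LD presence from HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.h1 = []
        self.has_json_ld = False
        self._capturing = None  # tag whose text is currently being collected

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._capturing = "title"
        elif tag == "h1":
            self._capturing = "h1"
            self.h1.append("")
        elif tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.has_json_ld = True

    def handle_endtag(self, tag):
        if tag == self._capturing:
            self._capturing = None

    def handle_data(self, data):
        if self._capturing == "title":
            self.title += data
        elif self._capturing == "h1":
            self.h1[-1] += data


def extract_seo_elements(html):
    parser = SeoElementParser()
    parser.feed(html)
    return {
        "title": parser.title.strip(),
        "h1": [text.strip() for text in parser.h1],
        "json_ld": parser.has_json_ld,
    }


def missing_without_js(raw_html, rendered_html):
    """List elements present after rendering that differ in the raw HTML."""
    raw = extract_seo_elements(raw_html)
    rendered = extract_seo_elements(rendered_html)
    return [key for key in rendered if rendered[key] and raw[key] != rendered[key]]


# Hypothetical page: the h1 and schema only exist after JavaScript runs.
raw = "<html><head><title>Shoes</title></head><body><div id='app'></div></body></html>"
rendered = (
    "<html><head><title>Shoes</title>"
    "<script type='application/ld+json'>{}</script></head>"
    "<body><h1>Running Shoes</h1></body></html>"
)
print(missing_without_js(raw, rendered))
```

Anything the function returns is an element that a bot without JavaScript rendering would not see in the same form, which makes it a candidate for a closer manual look.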

Test SEO Elements In Bulk And Across Page Types

Carrying out tests in bulk is important when testing a site before a large launch. When carrying out your tests, make sure they are across different page types and, if applicable, across languages.

If your site uses templates, make sure to test each of the templates that are critical to your SEO success. For example, on an ecommerce site, this means checking the category and product pages as a high priority.

For multilingual sites, ensure your tests are being run across different languages, and set a VPN to target the countries those languages are important for. Spoof those countries when running your crawls to make sure users will be seeing the correct language and content for their region. Although Googlebot frequently crawls from U.S.-based IP addresses, it also uses geo-distributed configurations, particularly for locale-adaptive or multilingual sites.

On your staging site, you may find that not all of the languages are represented, or perhaps there is a different localization process than what exists on production. This brings us back to the first point of needing the staging site to be as comparable to the production site as possible.

If it isn’t, in particular for localization elements, these need to be at the top of your post-deployment checks.

Benchmark Current Production Performance

It is worth remembering that your staging site may well be on a less performant server. This means that when conducting speed tests on staging, the results might be worse than if the tests were run on production. This can limit your ability to run meaningful checks before deployment.

To work around this, make sure to benchmark performance on production so that you can run the tests again quickly after deployment. This will mean waiting until the changes have gone live, but may be the only way to get an accurate understanding of areas like page load speed in situations where the staging server just isn’t as good as the production one.

Test For Edge Cases

Developers will try to break their code when testing it; we should too. When testing your staging site before deployment, run it through some edge cases. In practice, this means thinking of scenarios that, although unlikely, are possible. For example:

  • I am visiting the website from the U.S., but my language is set to French. What language are the meta tags in?
  • I am viewing the website on a mobile device but have the viewport set to desktop. What content am I able to access that I couldn’t on mobile otherwise?
  • If I turn JavaScript off, can I still use the menu drop-downs?

Test For Previously Known Issues

Make sure previous issues haven’t been reintroduced into the code during the most recent work. Even if the mass deployment is for a small area, such as a new meta title template being rolled out, that’s not to say issues aren’t being reintroduced elsewhere.

Don’t test only for the item being changed; check across critical SEO areas. In particular, if work has been done recently to improve pages on the site, check that those improvements will still be in place after this latest deployment.

Equally, if there are known bugs that have affected your SEO performance in the past, check for these even if the deployment isn’t related to them. It’s easy for bugs to sneak back into code, especially if they have been there before.


Featured Image: Paulo Bobita/Search Engine Journal

More Organic Search Traffic, More Ad Revenue: 4 Publishing Workflow Fixes That Bring Both

This post was sponsored by WP Engine. The opinions expressed in this article are the sponsor’s own.

Why are we missing the SERP window on breaking stories we should be winning?
How are smaller outlets ranking faster than us on the same news?
Why is our ad stack tanking Core Web Vitals on our highest-traffic pages?

In most large newsrooms, the answer traces back to the same culprit: a fragile, patchwork legacy CMS held together with ad-hoc plugins. For SEO and growth teams, that’s a direct hit to organic search traffic and ad revenue.
Below are four publishing workflow fixes that move both metrics in the same direction.

The 4 Publishing Pillars That Improve SEO & Monetization

To stop paying the cost of that fragmentation, media organizations are moving away from treating their workflows as a collection of disparate parts. Instead, they are adopting a unified system that eliminates the friction between engineering, editorial, and growth.

A modern publishing standard addresses these marketing hurdles through four key operational pillars:

Pillar 1: Automated Governance (Built-In SEO & Tracking Integrity)

Marketing integrity relies on consistency.

In a fragmented system, SEO metadata, tracking pixels, and brand standards are often managed manually, leading to human error.

A unified approach embeds governance directly into the workflow.

By using automated checklists, organizations ensure that no article goes live until it meets defined standards, protecting the brand and ensuring every piece of content is optimized for discovery from the moment of publication.

Pillar 2: Fearless Iteration (Continuous SEO & CRO Optimization Without Risk)

High-traffic articles are a marketer’s most valuable asset. However, in a legacy stack, updating a live story to include, for instance, a Call-to-Action (CTA), is often a high-risk maneuver that could break site layouts.

A modern unified approach allows for “staged” edits, enabling teams to draft and review iterations on live content without forcing those changes live immediately. This allows for a continuous improvement cycle that protects the user experience and site uptime.

Pillar 3: Cross-Functional Collaboration (Reducing Workflow Bottlenecks Between Editorial, SEO & Engineering)

Any type of technology disruption requires a team to collaborate in real time. The “sticky-taped” approach often forces teams to work in separate tools, creating bottlenecks.

A modern unified standard utilizes collaborative editing, separating editorial functions into distinct areas for text, media, and metadata. This allows an SEO specialist or a growth marketer to optimize a story simultaneously with the journalist, ensuring the content is “market-ready” the instant it’s finished.

Pillar 4: Native Breaking News Capabilities (Capturing Real-Time Search Demand)

Late-breaking or real-time events, such as global geopolitical shifts or live sports, require in-the-moment storytelling to keep audiences informed, engaged, and on-site. Traditionally, “Live Blogs” relied on clunky third-party embeds that fragmented user data and slowed page loads.

A unified standard treats breaking news as a native capability, enabling rapid-fire updates that keep the audience glued to the brand’s own domain, maximizing ad impressions and subscription opportunities.

If those are things you’ve explored changing, it may be time to examine your own Fragmentation Tax, and why a new publishing standard is required to reclaim growth.

Stop Paying The Fragmentation Tax: How A Siloed CMS, Disconnected Data & Tech Debt Are Costing You Growth

The Fragmentation Tax is the hidden cost of operational inefficiency. It drains budgets, burns out teams, and stunts the ability to scale. For digital marketing and growth leads, this tax is paid in three distinct “currencies”:

1. Siloed Data & Strategic Blindness.

When your ad server, subscriber database, and content tools exist as siloed work streams, you lose the ability to see the full picture of the reader’s journey.

Without integrated attribution, marketers are forced to make strategic pivots based on vanity metrics like generic pageviews rather than true business intelligence, such as conversion funnels or long-term reader retention.

2. The Editorial Velocity Gap.

In the era of breaking news, being second is often the same as being last. If an editorial team is forced into complex, manual workflows because of a fragmented tech stack, content reaches the market too late to capture peak search volume or social trends. This friction creates a culture of caution precisely when marketing needs a culture of velocity to capture organic traffic.

3. Tech Debt vs. Innovation.

Tech debt is the future cost of rework created by choosing “quick-and-dirty” solutions. This is a silent killer of marketing budgets. Every hour an engineering team spends fixing plugin conflicts or managing security fires caused by a cobbled-together infrastructure is an hour stolen from innovation.

Conclusion: Trading Toil for Agility

Ultimately, shifting to a unified standard is about reducing inefficiencies caused by “fighting the tools.” By removing the technical toil that typically hides insights in siloed tools, media organizations can finally trade operational friction for strategic agility.

When your site’s foundation is solid and fast, editors can hit “publish” without worrying about things breaking. At the same time, marketers can test new ways to grow the audience without waiting weeks for developers to update code. This setup clears the way for everyone to move faster and focus on what actually matters: telling great stories and connecting with readers.

The era of stitching software together with “sticky tape” is over. For modern media companies to thrive amid constant digital disruption, infrastructure must be a launchpad, not a hindrance. By eliminating the Fragmentation Tax, marketing leaders can finally stop surviving and start growing.

Jason Konen is director of product management at WP Engine, a global web enablement company that empowers companies and agencies of all sizes to build, power, manage, and optimize their WordPress® websites and applications with confidence.

Image Credits

Featured Image: Image by WP Engine. Used with permission.

In-Post Images: Image by WP Engine. Used with permission.

SERP FAQ Removal & New Data Challenge Schema’s AI Search Value via @sejournal, @MattGSouthern

Schema markup had a rough week. Google ended FAQ rich results. Four days later, Ahrefs published a report finding that adding JSON-LD didn’t produce a clear citation lift across Google AI Overviews, AI Mode, or ChatGPT.

These developments weaken two common pitches for schema markup: increased SERP visibility and potential AI citation gains. This article examines their implications and what the data indicates about schema’s future.

Google’s Visible Schema Rewards Have Been Narrowing For Years

Google has been pulling back visible Search rewards tied to specific structured data types since 2023. Google restricted FAQ rich results to authoritative government and health sites, and HowTo rich results were limited to desktop and later deprecated.

In 2025, Google announced the retirement of several structured data features, including Course Info, Claim Review, and Estimated Salary. Book Actions was initially included but later carved out after Google removed its deprecation banner. Google described the retired features as “not commonly used in Search” and as no longer providing value to users.

In 2026, Practice Problem structured data was deprecated. John Mueller noted on Reddit that “markup types come and go, but a precious few you should hold on to.”

The pattern is that visible structured data rewards have disappeared after becoming familiar SEO tactics. The markup itself stays valid, but the rich result doesn’t. Google doesn’t always describe these removals as responses to overuse, but the pattern offers less reason to treat any single markup type as a durable strategy.

These recent updates differ because the evidence for one proposed replacement value also weakened. The “GEO” advisory space claims schema boosts AI citations, and Ahrefs data tested part of that.

What The Ahrefs Report Found

Ahrefs tracked 1,885 web pages that added JSON-LD schema. Each page was matched against control pages that never added schema. Citation changes were measured across Google AI Overviews, AI Mode, and ChatGPT.

The results were flat. Google AI Mode showed +2.4%, ChatGPT showed +2.2%, and Google AI Overviews showed -4.6%.

The first two were too small to tell apart from random variation. The AI Overviews decline was statistically significant, but Ahrefs said it can’t confidently attribute that to schema.

Every page in the dataset already had more than 100 AI Overview citations before any schema was added. These pages were already being crawled and cited.

Ahrefs acknowledged that for pages not yet visible to AI, schema might still help with crawling, parsing, or indexing. But their data can’t confirm that.

Gianluca Fiorelli, a strategic SEO consultant, called the study “one of the more honest pieces of research to come out of the AI Search space in 2026.” But he argued the scope was narrower than the headline suggested. He compared it to “testing whether adding a label to a bottle already on the supermarket shelf makes customers pick it up more often.”

Ahrefs also cited a searchVIU experiment that found five AI systems relied on visible HTML during direct page retrieval and did not use hidden JSON-LD, Microdata, or RDFa. That finding covers one stage of the pipeline. It does not rule out schema playing a role earlier in indexing or entity understanding.

Ryan Law, Ahrefs’ director of content marketing, summarized the finding on LinkedIn.

“Does adding schema markup help your pages get cited in AI search? Probably not,” he wrote. He added that schema is “probably not some magic fix for improving your AI citations.”

The Practitioner Debate

Both updates land in the middle of an active argument about schema and GEO.

Roughly 168,000 pages use the phrase “FAQ schema is critical for GEO,” according to search results that Lily Ray, VP of SEO and AI Search at Amsive, flagged on LinkedIn. She called the trend familiar.

“Anything that can be spammed in SEO, will be spammed,” Ray wrote. She’d warned about this in a 2019 Moz article when FAQ schema first launched, and described Google’s FAQ removal as the same cycle repeating.

Ray hedged throughout her post, calling it “putting on my tin foil hat” and “just an idea.” But the pattern she described is the same one visible in the timeline above. A useful markup type gets scaled as a tactic, Google pulls the reward, and the industry moves on to the next one.

Joost de Valk, founder of Yoast, made the connection explicit in a blog post. “The GEO industry is replaying early SEO, just faster,” de Valk said. “And the FAQ schema deprecation is the first concrete proof point that the cycle is back on.”

He also filed a Schema.org proposal for a new FAQSection type to address what he sees as the structural problem, separating “this page has an FAQ section” from “this page IS an FAQ.”

The frustration was sharpest from practitioners who’d been watching the GEO playbook harden around schema as its most concrete recommendation. Mark Williams-Cook, director at Candour and founder of AlsoAsked, shared the Ahrefs report on LinkedIn.

“GEO bros are selling snake oil with schema to boost citations, but people like Gianluca Fiorelli are talking sense,” he posted.

Marie Haynes, founder of Marie Haynes Consulting, commented on Ray’s post with a different theory altogether.

“My theory is that Google needed our FAQs to train AI so they gave us incentive to add them (aka rich results.) And now they don’t need them anymore,” she wrote. The theory is unconfirmed by any primary source, but it shows how far the speculation has traveled.

Some practitioners pushed back on the gloomier readings. Google’s broader guidance still presents structured data as a way to make page information machine-readable, and at a 2025 Search Central Live event in Madrid, the Search Relations team told practitioners that supported structured data types are still worth using.

What The Data Can’t Answer Yet

Whether schema helps pages that aren’t yet being cited is a separate question that the data can’t answer, because every page already had more than 100 AI Overview citations before schema was added.

The test also pooled all schema types together. Article, FAQ, Product, HowTo, and Organization were all treated as one category. Type-specific effects haven’t been isolated, and they could look different.

The 30-day measurement window may miss slower effects, and on live websites, schema changes can overlap with other page changes, making it hard to separate what schema did from what changed around it. The report only examined schema in the page’s HTML, not schema injected via JavaScript, which AI crawlers treat differently.

Ahrefs measured Google AI Overviews, AI Mode, and ChatGPT. Whether Bing, Copilot, Perplexity, Claude, or other answer systems treat schema differently from the systems Ahrefs measured is an open question.

Google’s FAQ deprecation notice says the company will continue using FAQ structured data to “better understand” pages. What that produces in measurable terms is unclear. The same uncertainty applies to whether schema affects citations indirectly, through eligibility, entity understanding, or source selection, rather than during the direct retrieval that searchVIU tested.

Nobody has published data that isolates that path.

Why This Matters

The Ahrefs data gives no measured reason to add JSON-LD in the expectation of short-term AI citation gains for pages already visible in AI Overviews. The trickier question is what to do with schema strategies more broadly.

Product, Review, Event, Video, and some other structured data types still support active rich result features. Organization, Person, and Article markup can still help describe entities and content, even when the payoff is less visible.

A blanket “schema doesn’t work” reading overstates what the data showed, because the test pooled all types and measured only one outcome. What the data does challenge is a specific sales pitch.

“Add schema to boost AI citations” has been one of the more concrete recommendations in GEO guides. For example, Frase.io called schema markup “critically important for AI search, GEO, and AEO.”

Without data support for that claim, it’s harder to justify the investment. AI systems in searchVIU’s test relied on visible HTML during retrieval, not JSON-LD. That suggests content structure, clear headings, and direct answers in prose may matter more for AI citation than markup structure.

Looking Ahead

The question hanging over the SEO industry is where schema creates measurable value. Adding JSON-LD didn’t measurably increase AI citations for pages already visible in AI Overviews.

For those pages, schema looks more like plumbing that serves other systems than a lever that moves citation counts. That’s still real value, but it’s a different pitch.


Featured Image: BEST-BACKGROUNDS/Shutterstock

The Tech SEO Audit for the AI Search Era: How to Maximize Your AI Visibility via @sejournal, @JetOctopus

This post was sponsored by JetOctopus. The opinions expressed in this article are the sponsor’s own.

How do I optimize my site for ChatGPT and Perplexity, not just Google?

How do I know if AI bots are actually crawling my site?

How should my technical SEO strategy change for AI Search?

A significant portion of your site’s search impressions in 2026 are generated by machines researching on behalf of humans.

Those machines don’t care about your keyword rankings. They care whether your:

  • HTML loads cleanly in under 200 milliseconds
  • Product detail page is reachable in fewer than four clicks
  • Content answers a specific, nine-word question that has never appeared in any keyword research tool in your career.

This isn’t speculation. It’s what our server log data across hundreds of enterprise websites is showing us, consistently, since mid-2025.

What’s Actually Happening On Your Site

My colleague, Stan, flagged a pattern in a Slack message: query lengths were growing at rates that didn’t correlate with human behavior.

A 161% growth rate in 10-word queries year-over-year is not driven by users who suddenly got more verbose. It’s driven by AI agents decomposing a single user prompt into dozens of parallel sub-queries, a process researchers now call “fan-out.”

[Figure: Query Length Growth in 2025. Image created by JetOctopus; aggregated GSC data across hundreds of enterprise properties, 2025]

The gradient is the tell. Human search behavior doesn’t scale this cleanly by word count. Machines do. By October 2025, 7-plus-word queries reached nearly 1% of total query volume, roughly triple their historical share.

More revealing than the volume is the CTR. While impression counts for 10-word queries spiked 161%, click-through rate collapsed to 2.26%, down from 8–11% in 2023.

The AI reads your page, extracts the answer, synthesizes it for the user. Your site never gets the visit.

We call these “phantom impressions.” They’re real signals that your content is being evaluated inside AI reasoning chains. If you’re filtering them out of your reporting because they don’t drive traffic, you are flying blind.

The Three Bots Visiting Your Site & Their Impact On SERP Visibility

Not all AI crawlers are equal, and treating them as a single category is the first mistake most technical SEOs make.

Training bots crawl broadly and ignore click depth. A training visit means the AI knows your content exists, not that users will ever see it.

AI search bots drop off quickly beyond two or three clicks from the homepage and typically visit each page only once a month.

AI user bots are initiated when a real person asks a question in ChatGPT, Perplexity, or Claude, and the AI researches the answer on their behalf. These are the only visits that translate to actual AI visibility.

Bot Type | What Triggers It | Crawl Depth | Impact on AI Visibility
Training bots | Model education cycles | Deep: ignores click distance | None directly. Awareness only.
AI search bots | New URL discovery & fresh content | Shallow: ~1 visit/month beyond 2–3 clicks | Critical gatekeeper. If it misses a page, user bots won’t find it either.
AI user bots | Real user query in ChatGPT / Claude / Perplexity | Selective: driven by speed and structure | High. Closest proxy to an AI impression.

Your site can receive heavy crawling from training and search bots and still be completely absent from AI-generated answers. If you’re not segmenting AI bot traffic by type in your log analysis, you have no idea which third of the iceberg you’re measuring.
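As a rough sketch of that segmentation, the snippet below groups user-agent strings from a server log into the three bot types. The token-to-type mapping is an assumption based on publicly documented crawler names, not a definitive list, and the sample user-agent strings are simplified, so verify both against each provider’s current documentation and your own logs.

```python
from collections import Counter

# Assumed mapping of user-agent tokens to bot types (illustrative, incomplete).
BOT_TYPES = {
    "GPTBot": "training",
    "ClaudeBot": "training",
    "CCBot": "training",
    "OAI-SearchBot": "search",
    "PerplexityBot": "search",
    "ChatGPT-User": "user",
    "Claude-User": "user",
    "Perplexity-User": "user",
}


def classify_user_agent(user_agent):
    """Return 'training', 'search', 'user', or 'other' for a user-agent string."""
    ua = user_agent.lower()
    for token, bot_type in BOT_TYPES.items():
        if token.lower() in ua:
            return bot_type
    return "other"


def count_by_type(user_agents):
    """Count log hits per bot type."""
    return Counter(classify_user_agent(ua) for ua in user_agents)


# Simplified sample user agents, as they might appear in a log export.
sample_hits = [
    "Mozilla/5.0 (compatible; GPTBot/1.2; +https://openai.com/gptbot)",
    "Mozilla/5.0 (compatible; ChatGPT-User/1.0; +https://openai.com/bot)",
    "Mozilla/5.0 (compatible; PerplexityBot/1.0)",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0.0.0 Safari/537.36",
]
print(count_by_type(sample_hits))
```

User-agent strings can be spoofed, so treat counts like these as a first pass rather than verified bot traffic.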

Which SEO Signals Do LLMs Respect?

Robots.txt is your primary lever.

Most major AI platforms (ChatGPT, Claude, Gemini) follow robots.txt directives. Perplexity is a partial exception: PerplexityBot respects robots.txt, but Perplexity-User, the user-triggered bot, does not. Cloudflare confirmed this in an investigation. Most sites haven’t audited their robots.txt with AI access in mind. Do it.
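One way to start that audit is a small script like the sketch below, which uses Python’s standard `urllib.robotparser` to report whether a given path is allowed for a list of AI user agents. The agent list and the sample robots.txt are illustrative assumptions. Python’s parser applies simple prefix matching, so its verdicts may not match every crawler’s own interpretation of wildcard rules.

```python
from urllib.robotparser import RobotFileParser

# Illustrative list of AI user-agent tokens to audit (an assumption, not exhaustive).
AI_AGENTS = ["GPTBot", "OAI-SearchBot", "ChatGPT-User", "ClaudeBot", "PerplexityBot"]


def audit_robots(robots_txt, path, agents=AI_AGENTS):
    """Return {agent: True/False} for whether each agent may fetch the path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


# Hypothetical robots.txt: blocks one training bot, restricts checkout for all.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /checkout/
"""

for agent, allowed in audit_robots(SAMPLE_ROBOTS, "/products/widget").items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Running the same check against a handful of representative paths (a product page, a category page, an article) gives a quick picture of which AI agents can reach which parts of the site.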

Sitemaps are broadly supported.

ChatGPT, Claude, and PerplexityBot all use XML sitemaps for URL discovery. Keep them accurate.

Signals Best Saved For SEO & Ranking Efforts

The signals below don’t appear to affect AI visibility, but they remain key for ranking on queries that still trigger traditional SERPs.

Canonical tags and noindex directives do nothing for AI bots.

AI crawlers don’t build a search index, so they have no use for these meta-signals. Content hidden from Google using noindex is fully visible to ChatGPT’s crawler.

llms.txt does nothing.

Our log data shows major AI bots don’t read this file. Don’t invest time here.

JavaScript rendering is a critical blind spot.

Most AI crawlers (ChatGPT, Claude, Perplexity) don’t render JavaScript. If your product pages load key content client-side, those agents read an empty shell. Server-side rendering is the only architecture that works universally. The exception is Google Gemini, which uses the same Web Rendering Service as Googlebot.

How To Make Sure ChatGPT, Perplexity & LLMs Can Reach Your Content

AI search bots visit deep pages roughly once a month and drop off sharply beyond three clicks from the homepage. The pages with the most specific, answerable information are often the hardest for agents to reach.

The fix: Elevate your most valuable deep pages through internal linking, ensuring they’re reachable within three clicks.
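One way to find pages that sit too deep is a breadth-first walk over your internal link graph. The sketch below assumes you already have a crawl export turned into a dictionary of page to outgoing links; the `links` structure, the sample paths, and the `max_clicks` default of 3 (taken from the three-click drop-off described above) are illustrative assumptions.

```python
from collections import deque


def click_depths(links, home="/"):
    """Breadth-first search: minimum number of clicks from the homepage to each page."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def too_deep(links, max_clicks=3, home="/"):
    """Return reachable pages that need more than max_clicks clicks."""
    depths = click_depths(links, home)
    return sorted(page for page, depth in depths.items() if depth > max_clicks)


# Hypothetical internal link graph: page -> pages it links to.
SAMPLE_LINKS = {
    "/": ["/category"],
    "/category": ["/category/sub"],
    "/category/sub": ["/category/sub/list"],
    "/category/sub/list": ["/product-50000"],
}

print(too_deep(SAMPLE_LINKS))
```

Adding a single link from a shallow hub page to a flagged URL is usually enough to pull it back inside the threshold, which you can confirm by re-running the function on the updated graph.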

Pages crawled by training bots but never reached by user bots are your highest-priority targets. Pages AI user bots visit frequently are telling you what to scale: more content covering the same topic cluster and depth.

Optimize Content For Longer, Fan-Out Queries

95% of the queries driving AI citations have zero monthly search volume. They’re synthetic sub-queries generated by AI models. But they show up in GSC: impressions, no clicks, query lengths you’d never target voluntarily.

How To Find Fan-Out Query Opportunities

To surface fan-out queries that are worth chasing, connect your GSC API to JetOctopus (to bypass the 1,000-row UI limit) and filter for: query length greater than 7 words, impressions under 50, clicks at 0, over the last 3 months. That’s your Fan-Out Opportunity Matrix, the exact questions AI agents are asking about your content.
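If you are working from a raw export instead, the same filter is easy to sketch in Python. This assumes each row is a dictionary with `query`, `impressions`, and `clicks` keys and that the three-month window was already applied when the data was pulled; the field names and sample rows are hypothetical.

```python
def fan_out_candidates(rows, min_words=7, max_impressions=50):
    """Keep long, low-impression, zero-click queries (likely synthetic sub-queries)."""
    return [
        row
        for row in rows
        if len(row["query"].split()) > min_words
        and row["impressions"] < max_impressions
        and row["clicks"] == 0
    ]


# Hypothetical rows from a Search Console export covering the last 3 months.
SAMPLE_ROWS = [
    {"query": "best trail running shoes for wide feet under 100 dollars",
     "impressions": 12, "clicks": 0},
    {"query": "running shoes", "impressions": 4200, "clicks": 310},
    {"query": "compare waterproof trail shoes with rock plate for muddy terrain",
     "impressions": 75, "clicks": 0},
]

for row in fan_out_candidates(SAMPLE_ROWS):
    print(row["query"])
```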

Prompt Types That Fan Out Most

Image created by JetOctopus, 2025

If your content isn’t structured to answer list and comparison queries, with explicit rankings, pros/cons, and side-by-side specs, you’re leaving the highest fan-out surface area unoptimized.

“Product review” intent queries surged from 239 in June 2025 to over 40,000 by September 2025. That 16,000% increase was AI agents systematically harvesting structured opinion data. If your product pages lack this depth, you’re invisible to that harvest.

The Technical Audit: Where to Start

Step 1: Identify AI User Bot Traffic In Logs

Pull raw server logs (Apache/Nginx) and export all lines containing these user agents: OAI-SearchBot and ChatGPT-User, PerplexityBot and Perplexity-User, Claude-SearchBot and Claude-User. Then manually group hits by user-agent patterns and endpoints in a spreadsheet. To distinguish training bots from user bots, you’ll need to maintain your own classification list — one that changes often and isn’t standardized.
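As a rough starting point for the manual route, the grouping can be sketched in a few lines of Python. The regular expression assumes the Apache/Nginx combined log format, the sample log lines are invented, and the training-bot tokens (GPTBot, ClaudeBot, CCBot) are an illustrative assumption rather than a maintained classification list, so adjust all of them to your own logs.

```python
import re
from collections import Counter

# User and search tokens come from the list above; the "training" group is an assumption.
BOT_TYPES = {
    "user": ["ChatGPT-User", "Perplexity-User", "Claude-User"],
    "search": ["OAI-SearchBot", "PerplexityBot", "Claude-SearchBot"],
    "training": ["GPTBot", "ClaudeBot", "CCBot"],
}

# Combined log format: "METHOD /path HTTP/x" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|POST|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)


def classify(user_agent):
    """Return 'user', 'search', 'training', or None for a user-agent string."""
    ua = user_agent.lower()
    for bot_type, tokens in BOT_TYPES.items():
        if any(token.lower() in ua for token in tokens):
            return bot_type
    return None


def hits_by_type(lines):
    """Count hits per (bot type, path) across raw access-log lines."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        bot_type = classify(match.group("ua"))
        if bot_type:
            counts[(bot_type, match.group("path"))] += 1
    return counts


# Invented sample lines for illustration only.
SAMPLE_LINES = [
    '203.0.113.5 - - [10/Apr/2026:10:00:00 +0000] "GET /products/a HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; ChatGPT-User/1.0)"',
    '203.0.113.6 - - [10/Apr/2026:10:00:05 +0000] "GET /products/b HTTP/1.1" 200 4096 "-" "Mozilla/5.0 (compatible; GPTBot/1.1)"',
    '203.0.113.7 - - [10/Apr/2026:10:00:09 +0000] "GET /products/a HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/120.0"',
]

for (bot_type, path), hits in sorted(hits_by_type(SAMPLE_LINES).items()):
    print(bot_type, path, hits)
```

Paths that appear under "training" but never under "user" are the fix targets described earlier; paths with repeated "user" hits are the content to scale.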

In JetOctopus Log Analyzer, this segmentation is built in: filter by bot type (training, search, and user) in a few clicks and immediately see which pages AI user bots visit (your AI-visible content, ready to scale) versus pages training bots hit but user bots never reach (your highest-priority fix targets).

Step 2: Audit Technical Accessibility Of Deep Pages

Pick a sample of deep URLs and check HTML payload size, confirm key content isn’t injected via JavaScript by viewing raw HTML, simulate crawl depth by counting clicks from the homepage, and test load time in Chrome DevTools or Lighthouse. Also check whether important content sits behind accordions or “View More” elements; these require JavaScript execution that AI bots skip entirely. AI agents don’t click. If information only appears after user interaction, it doesn’t exist for these crawlers. For large sites with thousands of deep pages, this sampling approach misses a lot.

Step 3: Clean Up Your Robots.txt

Open your robots.txt and review all Disallow and Allow directives for every user-agent line by line. AI bots follow Disallow rules, so make sure you’re not accidentally blocking important URLs. Manually test key URLs to confirm they aren’t blocked. A 30-minute audit here can prevent you from blocking crawlers you want in, or exposing content you’d rather keep out.
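The manual URL test can be partly automated with Python’s standard-library `urllib.robotparser`. The sketch below parses a robots.txt body from a string and reports which URLs each user agent is blocked from; the sample rules and example.com URLs are hypothetical, and the standard-library parser may not mirror every crawler’s matching behavior exactly, so treat the output as a first pass.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt body for illustration.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /checkout/
"""


def blocked_urls(robots_txt, user_agents, urls):
    """Map each user agent to the URLs it is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {
        agent: [url for url in urls if not parser.can_fetch(agent, url)]
        for agent in user_agents
    }


KEY_URLS = [
    "https://example.com/products/a",
    "https://example.com/checkout/cart",
]

for agent, blocked in blocked_urls(ROBOTS_TXT, ["GPTBot", "OAI-SearchBot"], KEY_URLS).items():
    print(agent, "is blocked from", blocked)
```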

Step 4: Map Your Phantom Impressions

Export data from GSC Performance reports filtered by impressions with zero clicks. Because of the 1,000-row UI limit, you’ll need to use the GSC API or export in chunks by date and query, then merge datasets in spreadsheets or BigQuery. Also factor in query frequency: long queries appearing daily are likely not fan-outs.

Connect your GSC API to JetOctopus to bypass the row limit and build your Fan-Out Opportunity Matrix automatically — the exact questions AI agents are asking about your content, ready to act on.

Step 5: Monitor The Changes

Set up a recurring export process — pull GSC data monthly and compare impressions over time, re-run log analysis scripts and diff bot activity, track Core Web Vitals separately in PageSpeed Insights or CrUX. You’ll end up stitching together multiple data sources with no unified alerting, making it hard to catch regressions early.

JetOctopus Alerts covers exactly this: unified notifications for changes in AI bot activity alongside Googlebot behavior, Core Web Vitals, on-page SEO issues, and SERP efficiency drops, so you catch regressions before they compound.

The New KPI: Technical Accessibility

SEO in 2026 is restructuring around one constraint: can an AI agent crawl, reach, and extract a fact from your 50,000th product page in under 200 milliseconds?

If the answer is no, your rankings, backlinks, and content quality become irrelevant for a growing share of search interactions. The machines are searching. The question is how quickly you can see what’s actually happening.

Start with your logs. Everything else follows from there.

Want to see exactly how AI bots are interacting with your site: which pages they reach, which they skip, and where your fan-out opportunities are hiding? Book a live walkthrough of the JetOctopus platform. We’ll pull your actual log data and show you what your GSC reports aren’t telling you.

Image Credits

Featured Image: Image by JetOctopus. Used with permission.

Schema Markup Didn’t Move AI Citations In Ahrefs Test via @sejournal, @MattGSouthern

Schema markup is far more common on pages cited by AI. But a new Ahrefs report found that adding it didn’t result in a clear increase in citations.

Ahrefs tracked 1,885 web pages that added JSON-LD schema. Each page was matched against control pages that never added schema, and citation changes were measured across Google AI Overviews, AI Mode, and ChatGPT.

No platform showed a meaningful citation increase after schema was added.

What Ahrefs Found

The report analyzed 6 million URLs and found that pages cited by AI were roughly three times more likely to include JSON-LD. This gap has been seen as evidence that schema improves AI visibility. However, Ahrefs tested whether this held true when isolated from other signals, since sites with schema tend to invest in better content and earn more links.

They ran a controlled comparison, matching each schema page with three control pages from different domains with similar citation levels that never added JSON-LD. Citation changes were measured 30 days before and after schema addition.

Using its Brand Radar tool and Agent A, Ahrefs conducted a matched difference-in-differences analysis to account for platform trends. Here’s what was found.

  • Google AI Overviews: −4.6% (a small but statistically notable decline relative to controls)
  • Google AI Mode: +2.4% (too small to distinguish from random variation)
  • ChatGPT: +2.2% (too small to distinguish from random variation)
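For readers unfamiliar with difference-in-differences, the core arithmetic is the change in the treated group minus the change in the control group over the same period. The sketch below shows only that subtraction; the citation counts are hypothetical numbers chosen for illustration, not figures from the Ahrefs report, and the function is not a reconstruction of Ahrefs’ actual analysis, which also involved page matching.

```python
from statistics import mean


def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Change in the treated group's mean minus change in the control group's mean."""
    treated_change = mean(treated_after) - mean(treated_before)
    control_change = mean(control_after) - mean(control_before)
    return treated_change - control_change


# Hypothetical daily citation counts per page, before and after schema was added.
effect = diff_in_diff(
    treated_before=[300, 310],
    treated_after=[270, 280],
    control_before=[300, 290],
    control_after=[285, 275],
)
print(effect)
```

In this made-up example both groups decline, so a naive before/after comparison would blame schema for the whole drop. Subtracting the control group’s change isolates only the part of the decline that the control pages did not share.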

Three more tests were run alongside the primary comparison, and all four found no clear positive or negative effect.

The AI Overview Decline

The −4.6% decline in the AI Overview section deserves context. Ahrefs reports both treated and control pages were already declining before schema was added. Treated pages declined slightly faster, but the difference is small, with about 12 fewer daily citations per page in a sample where most pages received hundreds.

The report notes that the decline could reflect a small negative effect from schema, or it could be coincidence. It doesn’t draw a conclusion either way.

What The Report Doesn’t Cover

Every page in the dataset had 100+ AI Overview citations before any schema was added. These pages were already in the consideration set, being crawled and surfaced.

The report admits this limitation. For pages not yet visible to AI, schema might still aid crawling, parsing, or indexing, but the data can’t confirm this.

The report also notes other limitations. Pages adding JSON-LD often change other elements, making it hard to separate schema effects from those changes. All schema types were pooled, so some might perform differently. The 30-day window might miss slower effects.

A searchVIU experiment cited in the report tested whether five AI systems used schema markup when fetching pages in real time. None did; they only extracted visible HTML, ignoring JSON-LD, Microdata, and RDFa. This was a direct-fetch test, not proof of schema’s role during training, indexing, or retrieval.

Why This Matters

Schema markup is frequently recommended for AI visibility. However, Ahrefs’ data complicates this. While schema supports rich results and knowledge graphs, adding JSON-LD doesn’t increase AI citations for pages already cited.

The data shows a correlation: pages with schema are cited more often by AI, but Ahrefs interprets this as a sign of overall site quality rather than schema’s direct impact.

Looking Ahead

The report can’t determine whether schema helps pages that aren’t yet cited, which is a different group of pages that need another study. If pages are visible to AI, JSON-LD probably won’t boost citations.


Featured Image: Roman Samborskyi/Shutterstock

Google Drops FAQ Rich Results From Search via @sejournal, @MattGSouthern
  • Google has removed FAQ rich results from Search after limiting them for most sites.
  • Search Console reporting ends in June, followed by API support in August.
  • FAQ schema can stay on your pages, but it no longer earns visible FAQ results in Google.

Google deprecated FAQ rich results, completing a removal that started years ago. FAQ rich results were already restricted for most sites.

How To Design URL Structures For AI Retrieval, Not Just Rankings

For years, URL structure was a technical SEO checkbox. Keep it short, use hyphens, include the keyword, done.

While that playbook still works, it’s increasingly incomplete. A growing share of the target audience now discovers content through AI assistants and large language models like ChatGPT, Perplexity, Claude, Google’s AI Overviews, and more.

These systems retrieve and synthesize information differently from traditional search crawlers, and if your URL architecture isn’t built with that in mind, you are increasing your chances of not being cited by LLMs.

In the new age of search, we need to extend those SEO fundamentals to also align with AI bots and how they crawl URLs.

Why AI Systems Read URLs Differently

Search engines have spent decades developing sophisticated crawling and indexing infrastructure. They follow redirects, resolve canonicals, parse JavaScript (sometimes…), and can infer context from a page when the URL is a string of random characters.

AI retrieval systems, particularly retrieval-augmented generation (RAG) pipelines and web-connected LLMs, often work differently.

There are three core parts to how RAG works:

  1. The input prompt is converted into a vector embedding.
  2. Relevant passages are then retrieved from indexed URLs, documents, and knowledge graphs, including those surfaced through traditional search engines like Google and Bing.
  3. An LLM like ChatGPT then processes this information and generates a refined response.

A developer-built RAG system will essentially use data sources from URLs to extract content – they will crawl the URL, convert the web content into searchable “chunks” and store them as numerical vectors for later retrieval.
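To make the chunk-and-retrieve idea concrete, here is a deliberately simplified sketch. It is a toy under loud assumptions: the `embed` function uses plain word counts as a stand-in for a real embedding model, the sample chunks are invented, and the generation step is left out entirely. Production pipelines use learned embeddings and a vector database, but the flow of chunking, vectorizing, and ranking by cosine similarity is the same.

```python
import math
import re
from collections import Counter


def chunk(text, size=12):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=1):
    """Rank stored chunks by similarity to the query and return the best matches."""
    query_vector = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vector, embed(c)), reverse=True)
    return ranked[:top_k]


# Invented chunks, as if extracted from two crawled URLs.
SAMPLE_CHUNKS = [
    "Server-side rendering puts product prices in the raw HTML.",
    "Our office dog is called Biscuit.",
]

print(retrieve("which rendering exposes prices in raw html", SAMPLE_CHUNKS))
```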

This is now also evolving into a realm of URL context grounding, which is specific to Gemini. The aim for URL context grounding is to help Gemini (and presumably AI Overviews / AI Mode) to better understand and answer questions about content and data in individual URLs without performing traditional RAG processing.

The aim here is for the LLM to specifically pull direct information from multiple URLs, analyze multiple reports and combine information from several sources to generate more accurate summaries. This should, in theory, help to improve AI factual accuracy and reduce hallucinations.

Then there’s zero shot classification – a technique that enables models to categorize the purpose of a webpage without any task-specific training data.

Rather than relying on labeled examples, the model analyzes semantic cues such as URL structures (treated as plain text strings) and maps them to predefined categories using methods like cosine similarity or prompt-based reasoning.

This works by drawing on the model’s pre-trained language knowledge to infer a page’s likely function, while also detecting distinct patterns in the words and phrasing that signal what type of content the page contains.

This has been particularly useful in identifying phishing links and other malicious links based solely on their URL patterns but also indicates how LLMs could begin to leverage zero-shot classification to rely solely on URLs to infer semantic relevance.

A URL that communicates nothing forces LLMs to work harder and introduces ambiguity in how the content gets categorized.

More practically, when an AI system cites a source in a response, it often surfaces the URL alongside the excerpt. That URL becomes visible to real users, in the same way it does in a search result, and they’re going to make real decisions about whether or not to click.

A clean, descriptive path builds trust in a way that something like /p?id=4821 never will.

The Core Principle Of URLs As Semantic Signals

Think of your URL structure as a secondary content layer, one that communicates hierarchy, topic, and specificity independently of the page title, H1, or other metadata.

A URL like /resources/seo/url-structure-ai-retrieval/ tells a retrieval system several things at once: This lives under a resources hub, it’s within an SEO category, and it covers a specific subtopic at a granular level.

That’s a useful signal. It maps to how AI systems try to understand content provenance and relevance before surfacing it in a response.

This matters especially for:

  • Long-tail and question-based queries, where AI systems are looking for precise matches to specific information needs.
  • Topical authority, where your URL hierarchy can reinforce that your domain owns a subject area.
  • Citation quality, where a descriptive URL increases the likelihood an AI agent references your content over a competitor’s near-identical page.

Practical Architecture Principles

There are a number of practical architecture principles that you should consider for both traditional search as well as AI search.

Use A Logical, Shallow Hierarchy

Deep nesting (e.g., /blog/category/subcategory/year/month/post-title/) creates noise and puts your content multiple steps away from the homepage. A structure three levels deep is almost always sufficient, i.e., domain > category > specific page. There are some CMS setups, like Shopify, where you are forced into four or five levels, depending on your theme (e.g., domain/blog/name-of-blog/blog-post-title/), but as long as you’re adding meaningful context and not administrative clutter, your structure will be aligned with the principle.

Make Every Segment Human-Readable And Descriptive

Avoid abbreviations, internal jargon, or ID numbers in public-facing URLs. A URL like /ai-search-optimization communicates the topic directly, whereas a URL like /aso-v2 communicates nothing without prior knowledge.

Align URL Slugs With The Actual Search Intent, Not Just The Keyword

There’s a big difference between /email-marketing and /email-marketing-best-practices-b2b. The second one signals specificity. It’s more likely to surface when an AI system is generating a response to a precise question, because the URL itself narrows the relevance scope before the content is even parsed.

Be Consistent With Category Naming Across Your Site

If your content strategy uses /guides/ for long-form education content and /blog/ for shorter commentary, maintain that consistently. It’s likely that AI retrieval systems build a model of your site structure over time. Inconsistency blurs the signal about what type of content lives where.

Avoid Keyword Stuffing In URLs

This is old SEO advice, but it also applies here. A URL crammed with keywords looks spammy to human users who see it cited in an AI response, which undermines the trust benefit you’re trying to build. One primary keyword or phrase per segment is the right call.
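These principles are simple enough to check mechanically. The sketch below is one possible linter, not a standard: the three-segment depth limit, the rule that purely numeric segments are suspicious, and the repeated-word check for stuffing are all assumptions drawn loosely from the guidance above, and the sample paths are illustrative.

```python
import re
from urllib.parse import urlparse


def audit_url(url, max_depth=3):
    """Return a list of structural issues found in a URL's path."""
    parsed = urlparse(url)
    segments = [segment for segment in parsed.path.split("/") if segment]
    issues = []
    if len(segments) > max_depth:
        issues.append("too deep")
    if parsed.query:
        issues.append("query string")
    for segment in segments:
        if re.fullmatch(r"\d+", segment):
            issues.append(f"numeric segment: {segment}")
        words = segment.split("-")
        if len(words) != len(set(words)):
            issues.append(f"repeated word in: {segment}")
    return issues


for path in [
    "/resources/seo/url-structure-ai-retrieval/",
    "/blog/category/subcategory/2024/03/post-title/",
    "/seo/seo-tips-seo-guide",
]:
    print(path, audit_url(path))
```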

What Does This Look Like In Practice

If two different marketers are writing about the same topic, the URL structure could be key for RAG systems to better understand the context of the page as part of content retrieval.

An example:

Marketer A publishes /blog/2024/03/email-tips-part-4.

Marketer B publishes /resources/email-marketing/b2b-deliverability-guide.

Marketer B’s URL structure properly communicates hierarchy (resources hub), category (email marketing), and a specific focus (B2B deliverability) before a single word of body copy is processed.

Users are also more likely to benefit from this URL being cited because they can make sense of it immediately.

Arguably, this clarity and specificity compounds: your URL structure and information architecture shape the topical structure of the whole site, which in turn helps communicate both expertise and relevance.

The Redirect & Consolidation Problem

This is more relevant to enterprise sites that have accumulated URL debt like redirects, duplicate paths, and inconsistent slugs due to historical content management system migrations.

This could create a specific problem for AI retrieval if there are redirect chains and duplicate paths, as crawlers may not consistently land on the canonical version of a page, and different retrieval systems handle redirect resolution differently.

A practical fix will be to prioritize your website’s URLs. Audit your highest traffic and highest value pages, and confirm that their canonical URLs are clean, accessible, and structured in line with your current taxonomy.
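If you can export your redirect rules as source-to-target pairs, finding chains and loops is a short script. The sketch below assumes such an export already exists as a dictionary; the `redirects` mapping and the paths in it are hypothetical, and it does not make any network requests.

```python
def redirect_chain(url, redirects):
    """Follow a URL through the redirect map and return every hop, stopping on loops."""
    chain = [url]
    seen = {url}
    while chain[-1] in redirects:
        target = redirects[chain[-1]]
        chain.append(target)
        if target in seen:  # redirect loop: stop following
            break
        seen.add(target)
    return chain


def chains_to_fix(redirects):
    """Return start URLs that need more than one hop (or loop) to resolve."""
    flagged = {}
    for url in redirects:
        chain = redirect_chain(url, redirects)
        if len(chain) > 2:
            flagged[url] = chain
    return flagged


# Hypothetical redirect rules left over from two CMS migrations.
SAMPLE_REDIRECTS = {
    "/old-guide": "/guides/old",
    "/guides/old": "/resources/email-marketing/b2b-deliverability-guide",
}

for start, chain in chains_to_fix(SAMPLE_REDIRECTS).items():
    print(" -> ".join(chain))
```

Each flagged source can then be pointed straight at the final URL in its chain, so every retrieval system lands on the canonical page in a single hop.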

Then work backward.

You don’t need to restructure the entire site for the chance of being cited in AI responses, but especially for your highest value pages, you should ensure that you’re offering the best possible URL signals.

What You Should Avoid Changing

It’s important not to always chase the big and shiny, so don’t completely restructure your entire site’s URL architecture just for marginal AI retrieval gains.

URL restructuring carries real SEO risk, and even with 301 redirects in place it takes time to recover link equity. Plenty of web migration horror stories attest to what can happen when redirects aren’t implemented correctly.

The goal is to apply these principles to new content and flag structural problems in existing high-value pages where the case to remediate these issues is clear and lower risk.

If your current URL structure already follows clean, descriptive, hierarchical conventions (which is all a standard part of SEO best practice), then congratulations! You’ve been optimizing for AI retrieval without even knowing.

In Summary

URL structure has always been a relatively small signal, but as AI assistants become more of a meaningful discovery channel, URL structures have the potential to be cited in more places than just Google and Bing.

They can help you to appear in AI-generated answers, they can shape citation quality, and they can contribute to how retrieval systems will categorize your content before anything else.

Simply build URLs that tell the story of your content clearly, before the user clicks on it.


Featured Image: Vitya_M/Shutterstock

The Technical SEO Audit Needs A New Layer via @sejournal, @slobodanmanic

The standard technical SEO audit checks crawlability, indexability, website speed, mobile-friendliness, and structured data. That checklist was designed for one consumer: Googlebot.

This is how it’s always been.

In 2026, your website has at least a dozen additional non-human consumers. AI crawlers like GPTBot, ClaudeBot, and PerplexityBot train models and power AI search results. User-triggered agents like the newly announced Google-Agent, or its “siblings” Claude-User and ChatGPT-User, browse websites on behalf of specific humans in real time. A Q1 2026 analysis across Cloudflare’s network found that 30.6% of all web traffic now comes from bots, with AI crawlers and agents making up a growing share. Your technical audit needs to account for all of them.

Here are the five layers to add to your existing technical SEO audit.

Layer 1: AI Crawler Access

Your robots.txt was probably written for Googlebot, Bingbot, and maybe a few scrapers. AI crawlers need their own robots.txt rules, and they need to be separate from Googlebot and Bingbot.

What To Check

Review your robots.txt for rules targeting AI-specific user agents: GPTBot, ClaudeBot, PerplexityBot, Google-Extended, Bytespider, AppleBot-Extended, CCBot, and ChatGPT-User. If none of these appear, you’re running on defaults, and those defaults might not reflect what you actually want. Never accept the defaults unless you know they are exactly what you need.

The key is making a conscious decision per crawler rather than blanket allowing or blocking everything. Not all AI crawlers serve the same purpose. AI crawler traffic can be split into three categories: training crawlers that collect data for model training (89.4% of AI crawler traffic according to Cloudflare data), search crawlers that power AI search results (8%), and user-triggered agents like Google-Agent and ChatGPT-User that browse on behalf of a specific human in real time (2.2%). Each category warrants a different robots.txt decision.
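A quick way to see which of these crawlers you have never made a decision about is to list the AI user agents that your robots.txt does not name at all. The sketch below does only that: it reads `User-agent` lines from a robots.txt body and reports which agents from the list above fall through to the defaults. The sample file is hypothetical.

```python
# AI user agents named in this audit layer.
AI_AGENTS = [
    "GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended",
    "Bytespider", "AppleBot-Extended", "CCBot", "ChatGPT-User",
]


def named_agents(robots_txt):
    """Collect every user agent that has its own User-agent line (lowercased)."""
    named = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "user-agent":
            named.add(value.strip().lower())
    return named


def agents_on_defaults(robots_txt, agents=AI_AGENTS):
    """Return the AI agents with no explicit rule group in this robots.txt."""
    named = named_agents(robots_txt)
    return [agent for agent in agents if agent.lower() not in named]


# Hypothetical robots.txt body for illustration.
SAMPLE_ROBOTS = """\
User-agent: GPTBot  # training crawler
Disallow: /

User-agent: *
Allow: /
"""

print(agents_on_defaults(SAMPLE_ROBOTS))
```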

Cloudflare Radar data showing traffic volume by crawl purpose (Q1 2026); Screenshot by author, April 2026

The crawl-to-referral ratios from Cloudflare’s Radar report can make this an informed decision for you. Anthropic’s ClaudeBot crawls 20.6 thousand pages for every single referral it returns. OpenAI’s ratio is 1,300:1. Meta sends no referrals. Blocking OpenAI’s OAI-SearchBot or PerplexityBot reduces your visibility in ChatGPT Search and Perplexity’s AI answers. Blocking training-focused crawlers like CCBot or Meta’s crawler prevents data extraction from a provider that sends zero traffic back. The crawl-to-referral ratios tell you who is taking without giving.

There is one crawler that requires special attention. Google added Google-Agent to its official list of user-triggered fetchers on March 20, 2026. Google-Agent identifies requests from AI systems running on Google infrastructure that browse websites on behalf of users. Unlike traditional crawlers, Google-Agent ignores robots.txt. Google’s position is that since a human initiated the request, the agent acts as a user proxy rather than an autonomous crawler. Blocking Google-Agent requires server-side authentication, not robots.txt rules. This is both interesting and important for the future, even if it’s not within the scope of this article.


Layer 2: JavaScript Rendering

Googlebot renders JavaScript using headless Chromium. There is nothing new about that. What is new and different is that virtually every major AI crawler does not render JavaScript.

| Crawler | Renders JavaScript |
| --- | --- |
| GPTBot (OpenAI) | No |
| ClaudeBot (Anthropic) | No |
| PerplexityBot | No |
| CCBot (Common Crawl) | No |
| AppleBot | Yes |
| Googlebot | Yes |

AppleBot (which uses a WebKit-based renderer) and Googlebot are the only major crawlers that render JavaScript. Four of the six major web crawlers (GPTBot, ClaudeBot, PerplexityBot, and CCBot) fetch static HTML only, making server-side rendering a requirement for AI search visibility, not an optimization. If your content lives in client-side JavaScript, it is invisible to the crawlers training OpenAI, Anthropic, and Perplexity’s models and powering their AI search products.

What To Check

Run curl -s [URL] on your critical pages and search the output for key content like product names, prices, or service descriptions. If that content isn’t in the curl response, GPTBot, ClaudeBot, and PerplexityBot can’t see it either. Alternatively, use View Source in your browser (not Inspect Element, which shows the rendered DOM after JavaScript execution) and check whether the important information is present in the raw HTML.
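To run the same check across many pages, the curl test can be scripted. The sketch below fetches raw HTML without executing JavaScript and reports which key phrases are missing; the `raw-html-audit/0.1` user-agent string, the sample markup, and the phrases are all hypothetical, and the fetch helper is shown but not called so the example runs offline.

```python
from urllib.request import Request, urlopen


def fetch_raw_html(url, user_agent="raw-html-audit/0.1"):
    """Fetch a page's HTML without running JavaScript, much like curl -s."""
    request = Request(url, headers={"User-Agent": user_agent})
    with urlopen(request, timeout=10) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def missing_phrases(html, phrases):
    """Return the key phrases that do not appear in the raw HTML."""
    lowered = html.lower()
    return [phrase for phrase in phrases if phrase.lower() not in lowered]


# Invented markup: a client-rendered shell versus a server-rendered page.
SPA_SHELL = '<html><body><div id="root"></div><script src="/bundle.js"></script></body></html>'
SSR_PAGE = "<html><body><main><h1>Acme Widget</h1><p>Price: $49</p></main></body></html>"

for label, html in [("SPA shell", SPA_SHELL), ("SSR page", SSR_PAGE)]:
    print(label, "is missing", missing_phrases(html, ["Acme Widget", "$49"]))
```

In practice you would pass `fetch_raw_html(url)` into `missing_phrases` for each critical URL and flag any page where the list is not empty.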

Curl fetch of No Hacks homepage (Image from author, April 2026)

Single-page applications (SPAs) built with React, Vue, or Angular are particularly at risk unless they use server-side rendering (SSR) or static site generation (SSG). A React SPA that renders product descriptions, pricing, or key claims entirely on the client side is sending AI crawlers a blank page with a link to the JavaScript bundle.

The fix isn’t complicated. Server-side rendering (SSR), static site generation (SSG), or pre-rendering solves this for every major framework. Next.js supports SSR and SSG natively for React, Nuxt provides the same for Vue, and Angular Universal handles server rendering for Angular applications. The audit just needs to flag which pages depend on client-side JavaScript for critical content.

Layer 3: Structured Data For AI

Structured data has been part of technical SEO audits for years, but the evaluation criteria need updating. The question is no longer just “does this page have schema markup?” It’s “does this markup help AI systems understand and cite this content?”

What To Check

  • JSON-LD implementation (preferred over Microdata and RDFa for AI parsing).
  • Schema types that go beyond the basics: Organization, Article, Product, FAQ, HowTo, Person.
  • Entity relationships: sameAs, author, publisher connections that link your content to known entities.
  • Completeness: are all relevant properties populated, or are you just checking a box using skeleton schemas with name and URL?
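The completeness check in particular lends itself to a script. The sketch below parses a JSON-LD block and lists expected properties that are missing or empty. The `EXPECTED` property lists are an illustrative assumption, not an official schema.org or search engine requirement, and the sample markup is a hypothetical skeleton schema with only a name and URL.

```python
import json

# Assumed audit baseline per schema type; tune this to your own standards.
EXPECTED = {
    "Organization": ["name", "url", "logo", "sameAs"],
    "Article": ["headline", "author", "publisher", "datePublished"],
}


def missing_properties(json_ld_text, expected=EXPECTED):
    """Return expected properties that are absent or empty in a JSON-LD object."""
    data = json.loads(json_ld_text)
    wanted = expected.get(data.get("@type"), [])
    return [prop for prop in wanted if not data.get(prop)]


# Hypothetical skeleton schema: technically valid, but nearly empty.
SKELETON = json.dumps({
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Co",
    "url": "https://example.com",
})

print(missing_properties(SKELETON))
```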

Why This Matters Now

Microsoft’s Bing principal product manager Fabrice Canel confirmed in March 2025 that schema markup helps LLMs understand content for Copilot. The Google Search team stated in April 2025 that structured data gives an advantage in search results.

No, you can’t win with schema alone. Yes, it can help.

The data density angle matters too. The GEO research paper by Princeton, Georgia Tech, the Allen Institute for AI, and IIT Delhi (presented at ACM KDD 2024, first to publicly use the term “GEO”) found that adding statistics to content improved AI visibility by 41%. Yext’s analysis found that data-rich websites earn 4.3x more AI citations than directory-style listings. Structured data contributes to data density by giving AI systems machine-readable facts rather than requiring them to extract meaning from prose.

An important caveat: No peer-reviewed academic studies exist yet on schema’s impact on AI citation rates specifically. The industry data is promising and consistent, but treat these numbers as indicators rather than guarantees.

W3Techs reports that approximately 53% of the top 10 million websites use JSON-LD as of early 2026. If your website isn’t among them, you’re missing signals that both traditional and AI search systems use to understand your content.

Duane Forrester, who helped build Bing Webmaster Tools and co-launched Schema.org, argues that schema markup is only step one. As AI agents continue moving from simply interpreting pages to making decisions, brands will also need to publish operational truth (pricing, policies, constraints) in machine-verifiable formats with versioning and cryptographic signatures. Publishing machine-verifiable source packs is beyond the scope of a standard audit today, but auditing structured data completeness and accuracy is the foundation verified source packs build on.

Layer 4: Semantic HTML And The Accessibility Tree

The first three layers of the AI-readiness audit cover crawler access (robots.txt), JavaScript rendering, and structured data. The final two address how AI agents actually read your pages and what signals help them discover and evaluate your content.

Most SEOs evaluate HTML for search engine consumption. Agentic browsers like ChatGPT Atlas, Chrome with auto browse, and Perplexity Comet don’t parse pages the way Googlebot does. They read the accessibility tree instead.

The accessibility tree is a parallel representation of your page that browsers generate from your HTML. It strips away visual styling, layout, and decoration, keeping only the semantic structure: headings, links, buttons, form fields, labels, and the relationships between them. Screen readers like VoiceOver and NVDA have used the accessibility tree for decades to make websites usable for people with visual impairments. AI agents now use the same tree to understand and interact with web pages.

And the reason is simple: efficiency. Processing screenshots is both more expensive and slower than working with the accessibility tree.

This is what an accessibility tree looks like in Google Chrome (Image from author, April 2026)

This matters because the accessibility tree exposes what your HTML actually communicates, not what your CSS (or JS) makes it look like. A <div> styled to look like a button doesn’t appear as a button in the accessibility tree. An image without alt text means nothing. A heading hierarchy that skips from H1 to H4 creates a broken structure that both screen readers and AI agents will struggle to navigate.

Microsoft’s Playwright MCP, the standard tool for connecting AI models to browser automation, uses accessibility snapshots rather than raw HTML or screenshots. Playwright MCP’s browser_snapshot function returns an accessibility tree representation because it’s more compact and semantically meaningful for LLMs. OpenAI’s documentation states that ChatGPT Atlas uses ARIA tags to interpret page structure when browsing websites.

Web accessibility and AI agent compatibility are now the same discipline. Proper heading hierarchy (H1-H6) creates meaningful sections that AI systems use for content extraction. Semantic elements like <nav>, <main>, <article>, and <section> tell machines what role each content block plays. Form labels and descriptive button text make interactive elements understandable to agents that parse the accessibility tree instead of rendering visual design.

What To Check

  • Heading hierarchy: logical H1-H6 structure that machines can use to understand content relationships.
  • Semantic elements: nav, main, article, section, aside, header, footer, used appropriately.
  • Form inputs: every input has a label, every button has descriptive text.
  • Interactive elements: clickable things use <button> or <a>, not <div>.

  • Accessibility tree: run a Playwright MCP snapshot or test with VoiceOver/NVDA to see what agents actually see.
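A few of these checks can be approximated from raw HTML before you reach for a browser. The sketch below uses Python’s built-in `html.parser` to flag skipped heading levels, clickable div or span elements, and images with no alt attribute. It is a rough static pass under stated assumptions (the rules and the sample markup are illustrative) and is no substitute for inspecting the real accessibility tree.

```python
from html.parser import HTMLParser


class StructureAudit(HTMLParser):
    """Collect simple semantic-HTML issues while parsing a page."""

    def __init__(self):
        super().__init__()
        self.last_level = 0
        self.issues = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            level = int(tag[1])
            if level > self.last_level + 1:
                previous = f"h{self.last_level}" if self.last_level else "no heading"
                self.issues.append(f"heading level jumps from {previous} to h{level}")
            self.last_level = level
        elif tag in ("div", "span") and "onclick" in attributes:
            self.issues.append(f"clickable <{tag}> should be a <button> or <a>")
        elif tag == "img" and "alt" not in attributes:
            self.issues.append("img without alt attribute")


def audit(html):
    """Parse the HTML and return the list of issues found."""
    parser = StructureAudit()
    parser.feed(html)
    parser.close()
    return parser.issues


# Invented markup that breaks three of the rules above.
SAMPLE_HTML = '<h1>Shop</h1><h4>Deals</h4><div onclick="buy()">Buy</div><img src="a.png">'

for issue in audit(SAMPLE_HTML):
    print(issue)
```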

Somehow, things are getting worse on this front. The WebAIM Million 2026 report found that the average web page now has 56.1 accessibility errors, up 10.1% from 2025.

ARIA (Accessible Rich Internet Applications) usage increased 27% in a single year. ARIA is a set of HTML attributes that add extra semantic information to elements, telling screen readers and AI agents things like “this div is actually a dialog” or “this list functions as a menu.” But what’s critical is this: pages with ARIA present had significantly more errors (59.1 on average) than pages without ARIA (42 on average). Adding ARIA without understanding it makes things worse, not better, because incorrect ARIA overrides the browser’s default accessibility tree interpretation with wrong information. Start with proper semantic HTML. Add ARIA only when native elements aren’t sufficient.

Technical SEOs do not need to become accessibility experts. But treating accessibility as someone else’s problem is no longer viable when the same tree that screen readers parse is now the primary interface between AI agents and your website.

Sidenote: The Markdown Shortcut Doesn’t Work

Serving raw markdown files to AI crawlers instead of HTML can result in a 95% reduction in token usage per page. However, Google Search Advocate John Mueller called this “a stupid idea” in February 2026 on Bluesky. Mueller’s argument was this: “Meaning lives in structure, hierarchy and context. Flatten it and you don’t make it machine-friendly, you make it meaningless.” LLMs were trained on normal HTML pages from the beginning and have no problems processing them. The answer isn’t to create a flat, simplified version for machines. It’s to make the HTML itself properly structured. Well-written semantic HTML already is the machine-readable format. Besides, that simplified version already exists in the accessibility tree, and it is what AI agents already use.

Layer 5: AI Discoverability Signals

The final layer covers signals that don’t fit neatly into traditional audit categories but directly affect how AI systems discover and evaluate your website.

llms.txt (dis-honourable mention). Listed first for one reason only: ask any LLM what you should do to make your website more visible to AI systems, and llms.txt will be at or near the top of the list. It’s their world, I guess. The llms.txt specification provides a simple markdown file that helps AI agents understand your website’s purpose, structure, and key content. No large-scale adoption data has been published yet, and its actual impact on AI citations is unproven. But LLMs consistently recommend it, which means AI-powered audit tools and consultants will flag its absence. It takes minutes to create and costs nothing to maintain.
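
If you do decide to add one, the file is small enough to generate by hand or with a few lines of code. The sketch below assumes the commonly described llms.txt layout: a title heading, a one-line blockquote summary, then sections of annotated links. The function name, site name, and URLs are all placeholders, so check the current specification before publishing anything.

```python
def build_llms_txt(site_name, summary, sections):
    """Build llms.txt content from a title, a summary, and {section: [(title, url, note)]}."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"


# Placeholder content for illustration only.
content = build_llms_txt(
    "Example Site",
    "A blog about making websites usable by AI agents.",
    {"Key pages": [("About", "https://example.com/about", "Who runs the site")]},
)
print(content)
```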

OK, now that we’ve got that out of the way, let’s look at what might really matter.

AI crawler analytics. Are you monitoring AI bot traffic? Cloudflare’s AI Audit dashboard shows which AI crawlers visit, how often, and which pages they hit. If you’re not on Cloudflare, check server logs for Google-Agent, ChatGPT-User, and ClaudeBot user agent strings. Google publishes a user-triggered-agents.json file containing IP ranges that Google-Agent uses, so you can verify whether incoming requests are genuinely from Google rather than spoofed user agent strings.
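
For the server-log route, a minimal sketch could look like the following. It counts log lines that mention each of the three user agent tokens named above, and checks whether a request IP falls inside a list of CIDR ranges. The function names are hypothetical, the matching is a plain substring test, and the range in the usage note is a reserved documentation block. I'm not assuming anything about the structure of Google's JSON file, so loading the real ranges is left out.

```python
import ipaddress
from collections import Counter

# User agent tokens named in the text; extend this tuple for other crawlers.
AI_AGENT_TOKENS = ("Google-Agent", "ChatGPT-User", "ClaudeBot")


def count_ai_agents(log_lines):
    """Count log lines that mention each AI agent token (first match wins per line)."""
    counts = Counter()
    for line in log_lines:
        for token in AI_AGENT_TOKENS:
            if token in line:
                counts[token] += 1
                break
    return counts


def ip_in_ranges(ip, cidr_ranges):
    """Return True if the IP address falls inside any of the given CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)
```

A call such as ip_in_ranges("192.0.2.10", ["192.0.2.0/24"]) returns True. In practice, the ranges would come from the published file, and a request that claims an agent's user agent but fails the range check is a candidate for spoofing.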

Entity definition. Does your website clearly define what the business is, who runs it, and what it does? Not in marketing copy, but in structured, machine-parseable markup. Organization schema should include name, URL, logo, founding date, and sameAs links to verified profiles on LinkedIn, Crunchbase, and Wikipedia. Person schema for key people should connect them to the organization via author and employee properties. AI systems need to resolve your identity as a distinct entity before they can confidently recommend you over competitors with similar names or offerings. Don’t slap this on top of your website when your designer is done with their work. Start here; it will make your life easier.
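
As a rough illustration of what that markup can look like, the sketch below builds an Organization JSON-LD object with the properties listed above and links people through the schema.org employee property. Every value is a placeholder, the helper name is hypothetical, and a real implementation should be checked against a schema validator before it ships.

```python
import json


def organization_jsonld(name, url, logo, founding_date, same_as, employees):
    """Build an Organization JSON-LD dict; employees is a list of (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "logo": logo,
        "foundingDate": founding_date,
        "sameAs": list(same_as),
        "employee": [
            {"@type": "Person", "name": person_name, "url": person_url}
            for person_name, person_url in employees
        ],
    }


# Placeholder values for illustration only.
markup = organization_jsonld(
    name="Example Co",
    url="https://example.com",
    logo="https://example.com/logo.png",
    founding_date="2015-01-01",
    same_as=["https://www.linkedin.com/company/example-co"],
    employees=[("Jane Doe", "https://example.com/team/jane-doe")],
)
print(json.dumps(markup, indent=2))
```

The printed JSON goes inside a script tag of type application/ld+json in the page head.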

Content position. Where you place information on the page directly affects whether AI systems cite it. Kevin Indig’s analysis of 98,000 ChatGPT citation rows across 1.2 million responses found that 44.2% of all AI citations come from the top 30% of a page. The bottom 10% earns only 2.4-4.4% of citations regardless of industry. Duane Forrester calls this “dog-bone thinking”: strong at the beginning and end, weak in the middle, a pattern Stanford researchers have confirmed as the “lost in the middle” phenomenon. Audit your key pages: are the most important claims and data points in the first 30%, or buried in the middle?
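
One way to run that audit at scale is to measure where a key claim sits in the page's extracted text. The sketch below is a simplification. It uses character offsets as a proxy for page position, which is my assumption and not necessarily how the cited analysis measured it. The function names are hypothetical, and the 0.30 default mirrors the 30% threshold above.

```python
def relative_position(page_text, passage):
    """Return the passage's start offset as a fraction of page length, or None if absent."""
    if not page_text:
        return None
    index = page_text.find(passage)
    if index == -1:
        return None
    return index / len(page_text)


def in_top_share(page_text, passage, share=0.30):
    """Return True if the passage starts within the first `share` of the page text."""
    position = relative_position(page_text, passage)
    return position is not None and position < share
```

Run it over the plain text of each key page with the two or three claims you most want cited. Anything that returns False for in_top_share is a candidate for moving up.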

Content extractability. Pull any key claim from your page and read it in isolation. Does it still make sense without the surrounding paragraphs? AI retrieval systems, like ChatGPT, Perplexity, and Google AI Overviews, extract and cite individual passages. Sentences that rely on “this,” “it,” or “the above” for meaning become unusable when extracted from their original context. Ramon Eijkemans’ excellent utility-writing framework maps these principles to documented retrieval mechanisms: self-contained sentences, explicit entity relationships, and quotable anchor statements that AI systems can confidently cite without additional inference.
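
The isolation test can be roughed out in code as a first filter before a human read. The sketch below is a crude heuristic and not part of any framework named above. It splits text into sentences on end punctuation and flags those that open with "this" or "it", or that contain "the above". The names and the pattern are my own assumptions, and it will produce both false positives and misses.

```python
import re

# Flags sentences that open with "this"/"it" or refer back to "the above".
CONTEXT_DEPENDENT = re.compile(r"^(this|it)\b|\bthe above\b", re.IGNORECASE)


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def context_dependent_sentences(text):
    """Return sentences that probably lose meaning when quoted on their own."""
    return [s for s in split_sentences(text) if CONTEXT_DEPENDENT.search(s)]
```

Flagged sentences are not automatically wrong. They are the ones to reread alone and, where the claim matters, rewrite with the entity named explicitly.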

The Audit Checklist

| Check | Tool/Method | What You’re Looking For |
| --- | --- | --- |
| AI crawler robots.txt | Manual review | Conscious per-crawler decisions |
| JavaScript rendering | curl, View Source, Lynx browser | Critical content in static HTML |
| Structured data | Schema validator, Rich Results Test | Complete, connected JSON-LD |
| Semantic HTML | axe DevTools, Lighthouse | Proper elements, heading hierarchy |
| Accessibility tree | Playwright MCP snapshot, screen reader | What agents actually see |
| AI bot traffic | Cloudflare, server logs | Volume, pages hit, patterns |

From Audit To Action

This audit identifies gaps. Fixing them requires a sequence, because some fixes depend on others. Optimizing content structure before establishing a machine-readable identity means agents can extract your information, but can’t confidently attribute it to your brand. I wrote Machine-First Architecture to provide that sequence: identity, structure, content, interaction, each pillar building on the previous one.

Why Technical SEO Audit Is Where This Belongs

None of this is technically SEO. Robots.txt rules for AI crawlers don’t affect Google rankings. Accessibility tree optimization doesn’t move keyword positions. Content position scoring has nothing to do with search indexing.

But most of it did grow out of technical SEO. Crawl management, structured data, semantic HTML, JavaScript rendering, server log analysis: these are skills technical SEOs already have. The audit methodology transfers directly. The consumer it serves is what changed.

The websites that get cited in AI responses, that work when Chrome auto browse visits them, that show up when someone asks ChatGPT for a recommendation, they won’t be the ones with the best content alone. They’ll be the ones whose technical foundation made that content accessible to machines. Technical SEOs are the people best equipped to build that foundation. The old audit template just needs a new section to reflect it.


Featured Image: Anton Vierietin/Shutterstock