What Is A Migration Hangover Traffic Drop & How Do You Avoid It?

Launching a new website, whether it’s a redesign, replatform, or full CMS migration, is often treated as a milestone for a business. But for SEO teams, it can quickly become a high-risk transition. Even migrations that appear technically sound at launch can trigger significant visibility and traffic declines in the months that follow.

In more severe cases, the impact of an “SEO migration hangover” can persist for 12 to 18 months, impacting rankings, organic revenue, and overall search performance long after the new site deploys.

What Is A Migration Hangover?

An SEO migration hangover is the prolonged, significant, and often avoidable drop in organic traffic, along with the long-term loss of authority, that follows a poorly executed website migration or domain move. It differs significantly from normal volatility.

Normal volatility is a temporary post-migration traffic drop, with fluctuation easing as Google recrawls, reprocesses, and re-evaluates the changed content. In my experience, a normal, temporary dip in site traffic is typically 10-30%, while a damaging hangover causes a traffic drop of 50% or more.

Google needs time to process structural changes largely because of the scale at which it operates. It is continually re-crawling, re-evaluating, and re-indexing trillions of pages, so a full reassessment of a migrated site can take months, especially around core updates.

Google processes URL changes through a multi-stage workflow designed to transfer ranking signals and ensure users reach the correct content. It can take anywhere from a few weeks to several months to fully crawl your site, depending on the number of URLs.

Why Does It Happen? The Most Common Causes

The majority of post-migration traffic drops share a common root cause. Site migrations are too often scoped as technical projects, a handoff between developers and designers, rather than strategic business decisions with significant SEO implications. When teams launch without SEO input, the consequences can follow a business for months.

Some of the most common reasons for a website migration drop include:

Broken Or Missing 301 Redirects

301 redirects for an SEO migration are responsible for passing link equity to new URLs. When they’re missing or wrong, Google will treat the old page as if it’s gone and strip its ranking power. Even one missed high-authority URL can cause a significant dip in traffic.

Common Redirect Errors:

  • Missing redirects entirely.
  • Temporary 302 redirects used instead of a permanent 301.
  • Redirect chains with multiple hops that slow crawling.
  • Redirects to irrelevant pages.
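
The four errors above can be checked mechanically once you have recorded what each old URL actually returns. The following is a minimal sketch, assuming you have already collected the redirect hops for a URL with an HTTP client of your choice (with automatic redirect following turned off). The function name `audit_redirect` and the problem labels are hypothetical, and treating 308 as permanent alongside 301 is my assumption rather than something the article states.

```python
# Status codes that signal a permanent move (308 is included as an assumption).
PERMANENT = {301, 308}


def audit_redirect(hops, expected_url):
    """Return a list of problems for one old URL.

    hops is the ordered list of (status_code, location) pairs observed
    when requesting the old URL; an empty list means no redirect fired.
    """
    if not hops:
        return ["missing redirect"]
    problems = []
    # Any non-permanent hop (for example a 302) is flagged.
    if any(status not in PERMANENT for status, _ in hops):
        problems.append("temporary redirect")
    # More than one hop means crawlers have to follow a chain.
    if len(hops) > 1:
        problems.append("redirect chain")
    # The final location should be the mapped destination, not a generic page.
    if hops[-1][1] != expected_url:
        problems.append("wrong destination")
    return problems


target = "https://www.example.com/new-page"
print(audit_redirect([(302, "https://www.example.com/tmp"), (301, target)], target))
```

An empty list means the old URL redirects cleanly in a single permanent hop to the page you mapped it to.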

Noindex Tags Left Over From Staging

Leaving noindex tags on a live site after the migration is a classic and devastating mistake. Developers set pages to noindex during staging to prevent premature indexing, then forget to remove the tags at launch.

Google is instructed to ignore the pages and starts to de-index the entire site. Once the tags are removed, it can take anywhere from a few days to several weeks for Google and other search engines to re-index all pages.
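
A leftover noindex is easy to test for on launch day. The sketch below assumes you already have a page's HTML and response headers in hand (fetching is left out), and checks both the robots meta tag and the `X-Robots-Tag` header. The helper name `has_noindex` is hypothetical; treating the `none` directive as equivalent to noindex follows Google's documented robots meta values.

```python
from html.parser import HTMLParser

# Directives that keep a page out of the index.
BLOCKING = {"noindex", "none"}


class _RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> and <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() in ("robots", "googlebot"):
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html, headers=None):
    """True if the meta tags or the X-Robots-Tag header block indexing."""
    parser = _RobotsMetaParser()
    parser.feed(html)
    directives = set(parser.directives)
    for name, value in (headers or {}).items():
        if name.lower() == "x-robots-tag":
            directives.update(p.strip().lower() for p in value.split(","))
    return bool(directives & BLOCKING)


print(has_noindex('<meta name="robots" content="noindex, nofollow">'))
```

Running this against a sample of key templates (home, category, product, article) right after deployment is usually enough to catch a site-wide staging setting.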

Canonical Tags Pointing To Old URLs

If canonical tags still reference the old domain or URL structure post-migration, Google will continue to credit the old URLs and ignore the new ones. This will delay the transfer of ranking signals. New pages will fail to index because Google sees the old URL as the true authority.

This is one of the most common causes of a migration hangover, as it’s not always picked up by typical crawling tools without manual review.
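
Because this issue is easy to miss, it can help to script the check. The sketch below is illustrative only: it assumes you have the page HTML, the live URL, and a set of hostnames from the old site. The function name `canonical_issue` and its return strings are hypothetical. A canonical that differs from the page URL is not always wrong, so that result is a prompt for manual review rather than a verdict.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class _CanonicalParser(HTMLParser):
    """Captures the href of the first <link rel="canonical"> tag."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def canonical_issue(html, page_url, old_hosts):
    """Return a short description of the problem, or None if none is found."""
    parser = _CanonicalParser()
    parser.feed(html)
    if not parser.canonical:
        return "missing canonical"
    # Resolve relative hrefs against the live page URL before comparing.
    resolved = urljoin(page_url, parser.canonical)
    if urlsplit(resolved).hostname in old_hosts:
        return "canonical points to old host"
    if resolved != page_url:
        return "canonical differs from page URL"
    return None


page = "https://new.example.com/pricing"
html = '<link rel="canonical" href="https://old.example.com/pricing">'
print(canonical_issue(html, page, {"old.example.com"}))
```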

Content Changes That Hurt Relevance

Sometimes the new design includes rewriting copy or removing pages that ranked well. If the content changes, the keyword relevance changes, and the rankings will follow.

Content changes that can hurt relevance include:

  • Content edits (heading structures, body content, internal linking patterns).
  • Missing title tags and meta descriptions.
  • Inconsistent formatting (headings, bullet points).
  • Missing content elements (images, videos, body copy).

Page Speed Regression

A new design or new CMS can quietly make the site slower. Slow, clunky sites hurt rankings and the user experience. A performance regression after the migration can chip away at rankings over time since Google uses Core Web Vitals as a ranking signal.

Unnecessary Changes To URL Structures

While some URL structure changes are unavoidable during a replatforming project, for example, moving from WordPress to Shopify, where default structures like /collections/ and /products/ are introduced, unnecessary URL changes can create avoidable ranking volatility and visibility loss.

Even when 301 redirects are implemented correctly, redirects are not a perfect transfer of authority or relevance. Changing URLs at scale forces search engines to reassess page signals and process new site structures. Google can take time to fully understand the relationship between the old and new URLs, particularly on larger or more complex websites.

How To Know If You Have A Migration Hangover (Vs. Normal Volatility)

Changes in traffic might lead business owners to wonder, “Is this normal, or is something broken?”

In my experience, normal volatility looks like a 10-20% dip that stabilizes and recovers within two to six weeks with no ongoing errors in Google Search Console.

A migration hangover looks like a drop exceeding 30-50%, new crawl errors or 404s appearing in Search Console, indexed page counts falling, and no sign of stabilization after four or more weeks.
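
The rule of thumb in the two paragraphs above can be written down as a small triage function. This is a sketch of those thresholds only, not a diagnostic tool: the function names and the "investigate" middle category are my own additions, and the cut-offs (20% and 30%, four weeks) are taken from the figures just quoted.

```python
def traffic_drop_pct(baseline, current):
    """Percentage drop in organic sessions versus the pre-migration baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - current) * 100 / baseline


def classify_drop(baseline, current, weeks_since_launch, stabilizing, new_crawl_errors):
    """Triage a post-migration drop using the article's rule-of-thumb thresholds."""
    drop = traffic_drop_pct(baseline, current)
    # Large drop, four or more weeks in, and no sign of leveling off.
    if drop >= 30 and weeks_since_launch >= 4 and not stabilizing:
        return "likely hangover"
    # Modest dip with a clean Search Console report.
    if drop <= 20 and not new_crawl_errors:
        return "normal volatility"
    # Everything in between deserves a closer look.
    return "investigate"


print(classify_drop(10_000, 6_000, weeks_since_launch=5,
                    stabilizing=False, new_crawl_errors=True))
```

Compare like-for-like periods (same weekdays, same season) when choosing the baseline, otherwise the percentage says more about the calendar than the migration.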

It is possible to fix websites that suffer from a 30% drop in traffic post-migration, but it is highly recommended to avoid this situation in the first place.

How To Avoid A Migration Hangover

A successful website migration begins many months before the code changes. The pre-migration phase will determine whether the migration will lead to growth or a loss in traffic. Here are a few website migration SEO best practices:

  • Crawl your existing site before launch and document all URLs, title tags, and canonical tags.
  • Map every old URL to its new destination and test all 301 redirects in staging.
  • Audit structured data and schema markup to ensure it migrates correctly.
  • Benchmark page speed before migration so you have a baseline to compare post-launch.
  • Confirm robots.txt and noindex tags are configured for production, not staging, before going live.
  • Submit a fresh XML sitemap in Google Search Console immediately post-launch.
  • Have your SEO team involved before the migration, not called in to fix things after.
  • Explore Google’s Site Move Guide for tips to help transfer your site to a new domain or URL.
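
The URL-mapping step in this checklist can be sanity-checked before anything reaches staging. The sketch below assumes the pre-launch crawl is available as a list of old URLs and the redirect plan as a simple old-to-new mapping; the function name `validate_redirect_map` and the report keys are hypothetical.

```python
def validate_redirect_map(old_urls, redirect_map):
    """Check a planned old-to-new URL mapping before launch.

    old_urls: URLs found in the pre-migration crawl.
    redirect_map: dict of old URL -> new URL.
    """
    # Crawled URLs that nobody mapped to a destination.
    unmapped = sorted(u for u in old_urls if u not in redirect_map)
    # Destinations that are themselves redirected, which creates a chain.
    chained = sorted(
        src for src, dst in redirect_map.items()
        if dst in redirect_map and dst != src
    )
    # Entries that point at themselves and would loop.
    self_redirects = sorted(src for src, dst in redirect_map.items() if src == dst)
    return {"unmapped": unmapped, "chained": chained, "self_redirects": self_redirects}


crawl = ["/a", "/b", "/c"]
plan = {"/a": "/new-a", "/b": "/a", "/d": "/d"}
print(validate_redirect_map(crawl, plan))
```

All three lists should be empty before the map is handed to developers.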

What To Do If You’re Already Experiencing A Drop

The migration doesn’t end when the site goes live. During the post-migration monitoring period, it’s critical to catch issues as soon as possible, because the earlier they are fixed, the sooner SEO performance can recover. To recover traffic after a site migration, start with a crawl of the new site to identify potential technical errors.

Developers should fix the highest-traffic pages first, then cross-check canonicals, re-submit the sitemap, and verify noindex isn’t blocking key pages.

Any content that changed significantly may need to be restored or re-optimized for target keywords. It’s common for fluctuation to happen during the migration.

A Migration Hangover Case Study

In this example, a SaaS website brought in an SEO agency halfway through a site migration. They staggered the redesign and partially relaunched initiatives while the old version of the site sat, still live, on a “legacy” subdomain that was crawlable and accessible to Google.

SEMrush keyword visibility
Image from author, May 2026

Here’s Where They Went Wrong:

  • A subdomain was launched during the migration. The subdomain hosted the old version of the site, which put it in conflict with the live domain: content was cannibalized, and the main domain and subdomain competed against each other for branded and non-branded key terms. Google has said that it does not treat a migration as a gradual or modular process, and a partial site migration can mean Google cannot reliably determine which domain represents the site’s primary identity.
  • Content delays, optimization approvals, and a lack of content transfer caused a loss in visibility. The new design offered minimal space for content, which would have been discussed during the process had the agency been brought in sooner.
  • Multiple redirects, broken pages, and external domain migrations were left outstanding with no priority, despite the impact on the main domain being clear.

The website migration took place at the peak of the website’s visibility, indicating the significant impact that the hangover had.

A Migration Success Story

It doesn’t have to be this messy, though. Bringing in an SEO partner prior to the site migration taking place can be highly impactful.

In this case study, an aftermarket parts distributor (ecommerce site) rebuilt their website from the ground up with a new platform, new structure, new URL architecture, and new design. They had momentum on the old site and weren’t willing to risk it. In just three months post-migration, they generated over $750,000 in organic revenue, top 3 ranking positions increased to an all-time high, both clicks & impressions increased by 5% vs. the previous period, and they saw month-over-month gains across every user acquisition metric.

SEO Gets graph showing an uplift in MoM performance
Image from author, May 2026
SEMrush keyword visibility pinpointing top 3 and SERP Feature improvements from March 2026 to May 2026
Image from author, May 2026

This came from a well-managed migration process with clear pre- and post-migration steps, and from bringing in the SEO team from the initial design phase through to post-deployment.

Final Thoughts

A messy website migration doesn’t have to happen.

A website migration can unlock major improvements for a business – from better user experience to more scalable technology and long-term growth. But without a clear SEO migration strategy, even well-intentioned redesigns that may look great for the brand can result in prolonged traffic loss and reduced organic revenue.

The difference between a successful migration and a damaging one usually comes down to proper preparation, collaboration, and post-launch monitoring. Businesses that involve SEO teams early in the planning stage are far more likely to preserve visibility and maintain momentum after launch.

From initial feedback during the design and wireframing phase to auditing the existing site, protecting high-value URLs, validating technical SEO elements before and after deployment, and closely monitoring performance post-launch, businesses can significantly reduce migration risk and position the new website for long-term organic growth instead of recovery.

Featured Image: DC Studio/Shutterstock

How To Use Lighthouse To Test Your Website For Agentic Readiness via @sejournal, @marie_haynes

Google just shared more information to help us get our websites ready for agents. There is a new report from Lighthouse that anybody can run. You do not need external software to run it. You can do it right from within your Chrome browser.

The report tells you whether your website is discoverable for AI agents, whether you have WebMCP integration set up, and something worth discussing: an evaluation of your LLMs.txt file!

How To Run The Agentic Web Report

Right now, if you try to get this report from within the standard version of Chrome, you probably will not find it. You can use it if you have Chrome Canary, the experimental nightly build of Chrome that runs ahead of the beta channel. Once you have that, simply right-click the page and choose Inspect, then navigate to Lighthouse at the top of DevTools. You will see a new category for “Agentic Browsing.”

Image Credit: Marie Haynes

Walk Through The New Lighthouse Agent Readiness Report With Me

I ran this on Google’s own page about the new Lighthouse report for agentic browsing, and it turns out that Google’s own documentation has issues that might hinder agents! The new report does not give a score out of 100, but rather a ratio showing how many agentic readiness checks your site passes.

3 Things To Watch For Agentic Readiness

Here are the topics I have been discussing with my clients regarding this new shift.

1. AI Accessibility And The Accessibility Tree

Agents can look at your pages in three ways:

  • Vision (screenshots).
  • HTML.
  • The accessibility tree.

The accessibility tree was originally meant for screen readers, but it actually tells an agent where the buttons are and what the important elements are. If your accessibility tree is not well-formed, agents will struggle to use your site. I think that making our pages agent-friendly will eventually be a ranking factor in terms of whether agents recommend your page.

2. Understanding WebMCP

WebMCP is a proposed web standard to help you build and expose structured tools for AI agents. Essentially, it is a way to teach agents how to use the functionality of your website.

There are two types: declarative and imperative. Declarative is simple code you wrap around a form, while imperative allows the agent to interact back and forth with your website. If you have tools on your site that people will use with their agents, WebMCP is going to be very important.

3. The LLMs.txt File

This sounds crazy because Google just put out documentation for ranking in the AI features of Search, saying you do not need an LLMs.txt file. But this report is not about Search; it is about agents using your website. The proposal is to use an LLMs.txt file (similar to robots.txt) to provide markdown information that helps agents understand your site at inference time. It allows you to give specific instructions to agents on what they are allowed to do and where they can find important information. You likely don’t need an LLMs.txt file unless your site has elements that are specifically going to be used by agents.

LLMs.txt is for agents using your site, and not for Search reasons.
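
To make the format concrete, here is a minimal sketch that assembles a file in the layout described by the public llms.txt proposal: a title heading, a one-line summary as a blockquote, then sections containing annotated links. The function name `build_llms_txt`, the store name, and the URLs are hypothetical, and what Lighthouse actually checks in the file is not something this sketch claims to reproduce.

```python
def build_llms_txt(title, summary, sections):
    """Assemble llms.txt content.

    sections maps a section heading to a list of (name, url, note) tuples.
    """
    lines = [f"# {title}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for name, url, note in links:
            lines.append(f"- [{name}]({url}): {note}")
        lines.append("")
    # End with exactly one trailing newline.
    return "\n".join(lines).rstrip() + "\n"


content = build_llms_txt(
    "Example Store",  # hypothetical site
    "Aftermarket parts retailer with a fitment lookup tool.",
    {
        "Tools": [
            ("Fitment lookup", "https://example.com/fitment.md",
             "Find parts by make, model, and year"),
        ],
        "Policies": [
            ("Returns", "https://example.com/returns.md", "Return window and conditions"),
        ],
    },
)
print(content)
```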

I would highly recommend setting aside some time to check your own site in Chrome Canary. Most of us do not need these files right this second, but we need to be aware of them as our websites start to become agentic.

Featured Image: Roman Samborskyi/Shutterstock

Machine-First Architecture: How To Build Websites Machines Can Identify, Read, Cite & Use via @sejournal, @slobodanmanic

In the late 2000s, “mobile-first” emerged as a design discipline. The argument was a single sentence: don’t design for the big screen and squeeze it down. Start with the small screen, the harder constraint, the one that forces you to figure out what actually matters. If it works on a phone, it works everywhere.

Google leaned in early. By February 2010, Eric Schmidt was telling Mobile World Congress that Google’s strategy was “Mobile First in everything.” In April 2015, the Mobilegeddon update penalized non-mobile-friendly websites at scale. In October 2016, StatCounter reported mobile traffic surpassing desktop globally for the first time. A month later, Google announced mobile-first indexing. By October 2023, that migration was complete.

The web is now standing at the same kind of inflection point. Except the harder constraint isn’t a small screen. It’s no screen at all. It’s a machine.

The approach I use, Machine-First Architecture, is a full-stack methodology covering the entire arc of how machines now interact with a brand. It runs from how an organization is identified and resolved across the web, to how a website’s pages expose their data, to how content is consumed and cited, to how an autonomous agent completes a transaction on the website itself. Four pillars, in a specific order: Identity, Structure, Content, Interaction. The order matters. Each pillar depends on the one before it.

This is a website architecture discipline, not a content optimization playbook. Content is just one of four pillars. Most existing AI-search guidance, including frameworks I deeply respect, sits inside that single pillar. Machine-First Architecture extends upstream to organizational identity and downstream to autonomous agent action because that is where the actual work now is.

Last month, I outlined five layers the technical SEO audit needs to add for AI search. That piece described what to check on a website that already exists. Machine-First Architecture is the build framework the audit assumes: the architectural sequence you follow before any audit, on a website you are designing or rebuilding from the ground up. The audit catches gaps. The architecture prevents them. Reading the two together is the point: the build sequence here, the audit checklist there.

The whole journey has to be covered, and that is the part that matters most. The agentic journey is end-to-end: a machine has to identify your brand, parse your website’s structure, evaluate your content, and complete an action on your website. If any one of those steps fails, the whole chain fails. Excellent content cannot save a website with broken identity, because the machine never resolves the right entity to attribute the content to. Strong identity does nothing if the website’s structure hides the data behind JavaScript a crawler will not run. And both of those are wasted if an agent arrives ready to transact and finds a checkout flow it cannot navigate without a human.

It is important to note that machine-first does not mean human-last. Designing for the most constrained consumer (a machine that cannot interpret visual layouts, guess at meaning, or recover from ambiguity) creates a foundation that serves all visitors more effectively. Mobile-first didn’t make desktop worse. It made desktop better by prioritizing what really matters. Machine-first does the same thing for human consumers.

This is the reference version of the framework. What each pillar covers, what to build, what fails when it is missing, and what real protocol infrastructure now backs each one.

Pillar 1: Identity. Can Machines Unambiguously Identify Who You Are?

Identity must come first because AI systems cannot evaluate, recommend, or transact with a brand they cannot confidently resolve.

Google’s Knowledge Graph holds tens of billions of entities and well over a trillion facts about them, with E-E-A-T credibility signals applied at the person-entity level. AI systems consolidate brand identity by reading multiple external platforms in parallel and reconciling what they find. When your website says “AI consultancy,” your LinkedIn says “digital agency,” and your Google Business Profile says “IT services,” models either average those signals into something vague or lose confidence in the entity altogether.

Canonical Definition

A canonical definition is a single, structured, machine-readable document that defines what an organization is in fields rather than paragraphs. Think of it as your brand’s API documentation. Every bio, directory listing, schema block, and social profile description should trace back to this one canonical source.
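
As an illustration of "fields rather than paragraphs," the sketch below keeps a canonical definition as a plain dictionary and derives a schema.org `Organization` block from it. The company, URLs, and field names in `CANONICAL_DEFINITION` are invented for the example, and which fields belong in your own definition is a judgment call this sketch does not settle.

```python
import json

# Hypothetical canonical definition: the single source every profile traces back to.
CANONICAL_DEFINITION = {
    "name": "Example Consulting",
    "category": "AI consultancy",
    "description": "AI consultancy helping retailers prepare their websites for agents.",
    "url": "https://www.example.com",
    "profiles": [
        "https://www.linkedin.com/company/example-consulting",
        "https://github.com/example-consulting",
    ],
}


def to_organization_jsonld(definition):
    """Derive a schema.org Organization block from the canonical definition."""
    return {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": definition["name"],
        "url": definition["url"],
        "description": definition["description"],
        # sameAs ties the website entity to its profiles on other platforms.
        "sameAs": list(definition["profiles"]),
    }


print(json.dumps(to_organization_jsonld(CANONICAL_DEFINITION), indent=2))
```

Generating the markup from the definition, rather than hand-editing each copy, is what keeps the schema block from drifting away from the source.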

Entity Relationships

When an AI system answers “who are the leading consultants in this space,” the model traverses connections between entities: founders, clients, industry categories, technologies, publications. The machine-first approach means actively defining and publishing those relationships as structured data, rather than leaving them implicit in blog posts.

Ecosystem Mapping

Map every platform where your brand exists or should exist: industry directories, review platforms, podcast directories, GitHub profiles, marketplace listings, data aggregators. Each platform exposes data to machines differently. Optimize for each platform’s specific structured data format rather than copy-pasting the same bio across all of them.

Version Control

Treat your canonical definition as a versioned document. When identity changes, propagate that change across every platform in your ecosystem map. Machines synthesize identity continuously, and staleness in any one source can degrade the overall picture.
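
One way to keep platforms aligned is to diff each profile against the canonical definition whenever it changes. The sketch below assumes you have collected each platform's current values by hand or by export; the function name `find_identity_drift` and the field names are hypothetical, and the sample data mirrors the "AI consultancy" versus "digital agency" versus "IT services" mismatch described earlier.

```python
def _normalize(value):
    """Compare values case-insensitively and ignore surrounding whitespace."""
    return (value or "").strip().casefold()


def find_identity_drift(canonical, platform_profiles, fields=("name", "category")):
    """Return {platform: {field: current_value}} wherever a profile disagrees."""
    drift = {}
    for platform, profile in platform_profiles.items():
        mismatched = {
            field: profile.get(field)
            for field in fields
            if _normalize(profile.get(field)) != _normalize(canonical.get(field))
        }
        if mismatched:
            drift[platform] = mismatched
    return drift


canonical = {"name": "Example Co", "category": "AI consultancy"}
profiles = {
    "website": {"name": "Example Co", "category": "AI consultancy"},
    "linkedin": {"name": "Example Co", "category": "Digital agency"},
    "google_business_profile": {"name": "Example Co", "category": "IT services"},
}
print(find_identity_drift(canonical, profiles))
```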

Research by The Digital Bloom from December 2025 found that brands mentioned on four or more platforms are 2.8 times more likely to appear in ChatGPT responses. The architectural condition that makes that compounding effect work, in my experience, is that the platforms tell the same story, which is what the Identity pillar is built to enforce.

A note on scope. This pillar is about the identity of the brand the AI system is trying to recognize. It is not about the cryptographic identity of the AI agent accessing the website. Both matter, but they are different problems.

Output of this pillar:

  • A structured identity document serving as the single source of truth.
  • A map of every platform in your digital ecosystem.
  • A process for keeping all platforms aligned over time.

Pillar 2: Structure. Can Machines Extract Your Information?

Structure inverts the traditional web design process. Define the data model first, then wrap the design around the data.

Most websites are designed to look good to humans, with critical information locked inside visual layouts, JavaScript interactions, and design patterns that machines cannot parse. When an AI agent lands on a product page, it needs to extract the price, specifications, and availability programmatically. Structure is what makes that extraction work.

Structure overlaps with classical technical SEO and modern front-end engineering, but it is neither. Technical SEO has historically focused on what a single rendered page exposes to one crawler. Front-end engineering has focused on how that page is delivered and made interactive for human eyes. Structure, as a pillar of Machine-First Architecture, is upstream of both. It asks what data each page type exists to expose, before either the technical SEO audit or the front-end build begins. The audit checks whether the data is reachable. The architecture decides what data is there to be reached.

Data Models Before Page Designs

Before wireframing a page, define the discrete, extractable pieces of information that page must contain. The question changes from “what should this page look like?” to “what data does this page need to expose?” The page design wraps around the data model, instead of forcing the data model to conform to the design. This is the inversion that distinguishes architecture from audit. An audit can tell you whether your product page exposes price, availability, and specifications. Only the architecture step decides those are the facts the page exists to express in the first place.
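
A data-model-first product page might start life as nothing more than a typed record. The sketch below is one possible shape, assuming the price, availability, and specifications named above are the facts the page exists to express; the class name `ProductPageData` and its fields are hypothetical, while the JSON-LD it emits uses standard schema.org `Product`, `Offer`, and `PropertyValue` types.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProductPageData:
    """The facts a product page exists to expose, defined before any design work."""
    name: str
    price: str
    currency: str
    in_stock: bool
    specifications: dict = field(default_factory=dict)

    def to_jsonld(self):
        """Render the same data as schema.org Product markup."""
        availability = "InStock" if self.in_stock else "OutOfStock"
        return {
            "@context": "https://schema.org",
            "@type": "Product",
            "name": self.name,
            "offers": {
                "@type": "Offer",
                "price": self.price,
                "priceCurrency": self.currency,
                "availability": f"https://schema.org/{availability}",
            },
            "additionalProperty": [
                {"@type": "PropertyValue", "name": key, "value": value}
                for key, value in self.specifications.items()
            ],
        }


page = ProductPageData("Brake Pad Set", "49.99", "USD", True, {"Material": "Ceramic"})
print(page.to_jsonld()["offers"])
```

The visual template, the JSON-LD, and any feed or API response can all be rendered from the same record, which is the point of defining it first.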

Information Hierarchy For Machines

Machine information hierarchy is structural, not visual. Machines read heading level, schema markup, semantic HTML, and position on the page, not font size, color, or visual weight. Architecturally, this means deciding what goes in the first content block of every page type before deciding how the page looks.

Relationship Architecture

This is where Machine-First Architecture diverges most sharply from how websites are traditionally built. The conventional process designs and ships pages one at a time, with the relationships between them inferred later from navigation menus and internal links. That is backward. Machines need to understand how pages relate to each other before they understand any single page: product taxonomies, service hierarchies, content-to-offering mappings, parent-child structures. Declare those connections explicitly through internal linking patterns, breadcrumb structures, and schema that names the hierarchical relationships directly. The test: Could a machine, starting from your homepage, construct a complete and accurate map of everything you offer by following structured, declared relationships? Not by guessing from menu labels. By traversing connections you have explicitly published.
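
The homepage test can be approximated with a breadth-first traversal over the relationships you have explicitly declared. The sketch below assumes those relationships have been exported as a mapping of page to child pages (from breadcrumbs, schema, or internal links); the function name `unreachable_pages` and the sample paths are hypothetical.

```python
from collections import deque


def unreachable_pages(declared_links, homepage, all_pages):
    """Pages a machine cannot reach from the homepage via declared relationships.

    declared_links maps each page to the pages it explicitly points to.
    """
    seen = {homepage}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for child in declared_links.get(page, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return sorted(set(all_pages) - seen)


links = {
    "/": ["/products", "/services"],
    "/products": ["/products/widget"],
}
pages = ["/", "/products", "/products/widget", "/services", "/legacy-landing"]
print(unreachable_pages(links, "/", pages))
```

Any page in the result exists on the site but is missing from the published relationship map.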

One more decision belongs in this pillar: rendering. Critical data has to be present in the initial HTML response, before any client-side JavaScript runs. Build a JavaScript-heavy website where prices, specifications, and availability load after the page renders, and that data is locked away from every crawler that doesn’t execute JavaScript. Retrofitting a client-rendered SPA into something that serves data in static HTML is a very expensive failure mode. I broke down which AI crawlers render JavaScript and which ones don’t in “The Technical SEO Audit Needs A New Layer” if you want the specifics.
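
A quick way to test the rendering decision is to look only at the raw HTML response, exactly as a non-rendering crawler would, and see which facts are already there. The sketch below parses JSON-LD blocks out of an HTML string without executing any JavaScript. The function name `missing_fields` is hypothetical, and checking only top-level JSON-LD keys is a simplification; data exposed through visible HTML or microdata would need its own check.

```python
import json
from html.parser import HTMLParser


class _JsonLdParser(HTMLParser):
    """Collects parsed <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Malformed markup is treated as absent.


def missing_fields(raw_html, required_fields):
    """Required top-level JSON-LD fields that are absent from the initial HTML."""
    parser = _JsonLdParser()
    parser.feed(raw_html)
    present = set()
    for block in parser.blocks:
        if isinstance(block, dict):
            present.update(block)
    return [name for name in required_fields if name not in present]


client_rendered = '<html><body><div id="root"></div><script src="/app.js"></script></body></html>'
print(missing_fields(client_rendered, ["name", "offers"]))
```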

Output of this pillar:

  • A data model for every key page type, defining exactly what machine-readable information each page contains.
  • A relationship architecture connecting all pages.
  • A rendering strategy ensuring critical data is accessible regardless of how the page is processed.

Do not start designing pages until this work is done. The rendered page is one possible output of the data model. AI search results, voice answers, agent tool calls, and chat citations are other outputs the same data model has to serve. If the design comes first, the data model is whatever the design happened to support, which is rarely what every machine consumer needs.

Pillar 3: Content. Will Machines Rely On What You Are Saying?

Content is the pillar most existing AI-search research already targets. Kevin Indig’s Growth Memo, Duane Forrester’s Substack, Ramon Eijkemans’ utility-writing framework, and the ongoing work coming out of SEO Week and the BrightonSEO research community have produced rigorous data on how AI systems evaluate content. I lean on their work in this pillar more than I do in the others, and so should you.

The discipline of writing for AI extraction (answer-first writing, content extractability, citable specificity, content position) is something I get into in detail in “The Technical SEO Audit Needs A New Layer,” and the practitioners I named go deeper still. What Machine-First Architecture adds to that discipline is three architectural decisions that determine whether any of the writing-side work can succeed at all. They are: how authorship is structurally established, how time is signaled, and how the page is composed as modular knowledge units rather than a monolithic narrative.

Authorship And Attribution

AI systems evaluate authorship against the broader knowledge graph when deciding whether to cite a source. Machine-first content makes authorship explicit and structured: who wrote this, what their credentials are, where else they have published. Connected to the knowledge graph through schema markup, with sameAs links to verified profiles, with the author entity itself defined in the canonical identity document established by the Identity Pillar. This is where Identity and Content compose: the author entity referenced here is the same entity defined upstream. Authorship buried in a footer bio is invisible to that compounding effect.

Temporal Signaling

AI systems weigh recency heavily. A 2024 guide loses ground to a 2026 article on the same topic, regardless of objective quality. The distinction runs deeper than ranking. As Duane Forrester wrote, pre-cutoff and post-cutoff content occupy different systems inside the same model. Pre-cutoff content is presented confidently and without attribution. Post-cutoff content arrives with hedging language and citations. The architectural move is this: declare when specific claims were true, what data they are based on, and what has changed since original publication, at a granularity finer than the page’s publication date. AI systems can then evaluate the freshness of individual claims rather than treating the whole page as one timestamp.

Knowledge Modularity

Retrieval systems extract specific claims, answers, and data points. They do not consume content as continuous narrative. Long documents have a well-documented middle-section problem: Language models attend most strongly to the beginning and end of a document and lose fidelity in the middle. Self-contained sections are how content survives that effect. The architectural move is to design content as collections of modular knowledge units rather than monolithic articles. Each section has its own clear scope, its own question, its own supporting evidence. The page tells a complete story where each component functions independently when extracted. This is a composition decision made at the architecture level, not a writing decision made at the draft step.

Output of this pillar: a content framework where:

  • Authorship is structurally connected to your identity layer.
  • Time is declared at claim granularity.
  • The page is composed as modular knowledge units that function independently when retrieved.

Pillar 4: Interaction. Can Machines Act On Your Website Autonomously?

Interaction is the pillar where most existing AI-search frameworks stop. Visibility and citation work covers the first half of the journey: The machine finds and reads you. Accessibility work covers a different problem entirely: a human user with assistive technology making decisions in real time. The pillar that nobody else is finishing is the part where an autonomous agent has to do something on the website on behalf of a real person, with real money, with no human in the loop at the moment of action.

Leaving this last step unfinished is the costliest gap in the journey. An agent that can find your website, parse it, and decide it is the right answer will still abandon if it cannot complete the action it came to perform. That failure will be silent. You never see it in your analytics or your error log, the customer never tells you their agent gave up, and the next agent visit goes to a competitor whose interaction layer works. The full agentic journey is identification through completion, and the framework only delivers compounding value if every pillar holds.

The distinction from accessibility is important. Accessibility assumes a human is still in control: A screen reader translates the page for a person who makes decisions, interprets ambiguity, and recovers from errors. Machine interaction has no human in the loop at the point of action. The agent decides, acts, and verifies on its own.

Most of the eye-catching numbers in trade press right now (393% year-over-year jumps in AI-referred traffic, conversion lifts of 42%, peaks above 1,000% in the December holiday window) measure human traffic that came from AI-powered browsers and AI search results, not autonomous agent activity on the website. A person used ChatGPT or Atlas or Comet to find your website, then clicked through and shopped themselves. That is a real and growing share of website traffic, but it is the visibility-and-citation half of the journey, not the interaction half.

However, the logical next step for that same traffic is the machine also doing the action. The user who today asks ChatGPT to recommend a product and then clicks through to buy it will, increasingly, ask ChatGPT to buy it. The user who today asks Comet to compare hotels and then completes the booking themselves will, increasingly, hand the booking off to the agent. Each step delegates more of the journey to the machine. The Interaction pillar is the layer that has to be ready before that delegation becomes the default. That layer is currently developing, but moving very fast.

Every major AI vendor running the citation layer is also building the agent layer at the same pace, often faster. The companies that decide whether to cite your website are the same companies that decide where their agents try to act.

  • OpenAI runs ChatGPT alongside the Atlas browser, with built-in agent mode (formerly the standalone Operator product, integrated into ChatGPT in mid-2025).
  • Google folded Project Mariner into Gemini Agent and Chrome’s auto-browse capability in May 2026, and operates the Google-Agent fetcher for AI systems acting on user queries.
  • Anthropic pairs Claude with computer-use capability and the Claude-User crawler.
  • Perplexity has both its answer engine and the Comet browser.
  • Microsoft built Copilot Mode and Agent Mode into Edge for multi-step automation.

Treating AI as a pure distribution channel (optimizing for citation, stopping at “be visible in the answer”) is the most dangerous position in this discipline. It assumes the journey ends at the citation, and the vendors building these systems have already publicly committed to the opposite. The citation and agent layers are rolling out on overlapping timelines from the same companies. The website architecture has to be ready for both.

The protocol stack supporting agent-side interaction has crystallized over the last twelve months.

  • Model Context Protocol (MCP): agent-to-tool communication. An inaugural project of the Agentic AI Foundation under the Linux Foundation.
  • A2A: agent-to-agent coordination. A separate Linux Foundation project.
  • WebMCP: agent-to-website interaction. A W3C Community Group draft.
  • Agentic Commerce Protocol (ACP): agent-initiated commerce. Co-developed by OpenAI and Stripe and launched inside ChatGPT in 2025. OpenAI scaled back native in-ChatGPT checkout in early 2026 after low adoption, and ACP now powers purchases through merchant apps integrated into ChatGPT rather than native checkout. The protocol continues, but the deployment model is still being figured out.
  • Universal Commerce Protocol (UCP): agent-to-merchant commerce. Developed by Google with Shopify, Etsy, Wayfair, Target, and Walmart, and endorsed by 20+ partners across retail, payments, and processors (Stripe, Visa, Mastercard, American Express, Best Buy, Macy’s, The Home Depot, Zalando, and more). Announced at NRF in January 2026. Shopify’s implementation includes UCP-compliant MCP servers covering storefront browsing, customer account access, and developer tooling so agents can browse, compare, and place orders without screen-scraping.
  • Visa’s Trusted Agent Protocol: cryptographic identity for agent-initiated transactions. In production.

Autonomous agent transactions are not the dominant share of website traffic today, but the infrastructure is in place, the first flows are live, and the websites that wait until traffic forces the issue will be the ones rebuilding under pressure rather than designing into it. Interaction is the build-now-for-the-near-future pillar.

Discoverability Of Actions

A human can tell that a button is clickable through visual design. An AI agent has no such intuition. It needs a programmatic action manifest: Structured declarations of what actions are available on each page, what inputs those actions require, and what outcomes they produce. Schema.org actions provide one path; WebMCP provides another. Every page must answer “what can a machine do here?” as clearly as it answers “what can a human see here?”
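As one illustration of the Schema.org path, the sketch below builds a JSON-LD block that declares a single purchase action on a product page. The product name, SKU, and endpoint URL are placeholders, and the choice of BuyAction with an EntryPoint target is an assumption about how a given website might model its actions, not a required pattern.

```python
import json


def buy_action_manifest(product_name: str, sku: str, cart_endpoint: str) -> dict:
    """Describe one available action on a product page as schema.org JSON-LD."""
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product_name,
        "sku": sku,
        # potentialAction tells a machine what it can do with this entity.
        "potentialAction": {
            "@type": "BuyAction",
            "target": {
                "@type": "EntryPoint",
                "urlTemplate": cart_endpoint,  # hypothetical endpoint
                "httpMethod": "POST",
                "contentType": "application/json",
            },
        },
    }


manifest = buy_action_manifest(
    "Trail Jacket", "TJ-001", "https://example.com/api/cart"
)
print(json.dumps(manifest, indent=2))
```

The printed object would be embedded in the page inside a script tag of type application/ld+json, next to the visible button it describes.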

Predictable Outcomes

Every action must return a machine-readable response confirming what happened, what changed, and what the next available actions are. An agent adding an item to a cart needs structured state confirmation: The item was added, the cart now contains three items, the total is this amount, the next available action is checkout or continued browsing. Design the state communication layer before the visual feedback layer.
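A minimal sketch of that state confirmation for the add-to-cart case follows. The field names (status, item_count, total_cents, next_actions) are illustrative assumptions, not part of any published protocol.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class CartState:
    status: str              # what happened
    item_count: int          # what the cart contains now
    total_cents: int         # the new total, in minor currency units
    currency: str
    next_actions: list[str]  # what the agent can do next


def add_to_cart(cart: dict[str, int], prices: dict[str, int], sku: str) -> CartState:
    """Add one unit of sku to cart and return the resulting state."""
    cart[sku] = cart.get(sku, 0) + 1
    total = sum(prices[item] * qty for item, qty in cart.items())
    return CartState(
        status="item_added",
        item_count=sum(cart.values()),
        total_cents=total,
        currency="USD",
        next_actions=["checkout", "continue_browsing"],
    )


cart = {"A": 1, "B": 1}
prices = {"A": 1000, "B": 2500, "C": 500}
state = add_to_cart(cart, prices, "C")
print(json.dumps(asdict(state)))
```

The visual "added to cart" toast can then be rendered from the same object, which keeps the human and machine views of the state from drifting apart.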

Workflow Continuity

A human navigating a multi-step checkout maintains context mentally. An agent needs that context exposed as structured data: current step, prior decisions, remaining steps, required inputs, and the ability to revise without losing progress.
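One way to expose that context is a session object that can always report where the flow stands. This is a sketch under assumed step names and field names, and it keeps state in memory only for the sake of brevity.

```python
from dataclasses import dataclass, field

# Hypothetical checkout steps, in order.
STEPS = ["cart", "shipping", "payment", "review"]


@dataclass
class CheckoutSession:
    decisions: dict[str, dict] = field(default_factory=dict)

    def submit(self, step: str, data: dict) -> None:
        """Record (or revise) the decision for one step without touching others."""
        if step not in STEPS:
            raise ValueError(f"unknown step: {step}")
        self.decisions[step] = data

    def state(self) -> dict:
        """Return the context an agent needs to continue the flow."""
        remaining = [s for s in STEPS if s not in self.decisions]
        return {
            "current_step": remaining[0] if remaining else "complete",
            "completed_steps": [s for s in STEPS if s in self.decisions],
            "remaining_steps": remaining,
            "prior_decisions": dict(self.decisions),
            "revisable_steps": [s for s in STEPS if s in self.decisions],
        }


session = CheckoutSession()
session.submit("cart", {"items": 3})
session.submit("shipping", {"method": "standard"})
session.submit("shipping", {"method": "express"})  # revise without losing progress
print(session.state())
```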

Error Recovery

Treat errors as structured branching points, not dead ends. When an agent encounters an out-of-stock item, “sorry, something went wrong” is useless. The error response must include structured data: The item is unavailable in size M, available sizes are S, L, and XL, a similar product is available in size M. Every error needs to be a decision point the agent can navigate without human intervention.
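The out-of-stock example above could be expressed as a response like the one this sketch builds. The error code, field names, and the similar-product SKU are hypothetical.

```python
def out_of_stock_error(
    sku: str, requested_size: str, stock: dict[str, int], similar: list[dict]
) -> dict:
    """Build an error payload an agent can branch on instead of giving up."""
    available = [size for size, qty in stock.items() if qty > 0]
    return {
        "error": "out_of_stock",
        "sku": sku,
        "requested_size": requested_size,
        "available_sizes": available,
        "alternatives": similar,
        # Each next action is something the agent can do without a human.
        "next_actions": ["choose_available_size", "view_alternative", "cancel"],
    }


response = out_of_stock_error(
    sku="TJ-001",
    requested_size="M",
    stock={"S": 4, "M": 0, "L": 2, "XL": 1},
    similar=[{"sku": "TJ-002", "size": "M", "in_stock": True}],
)
print(response)
```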

Trust And Verification

Humans rely on visual trust signals: padlock icons, brand recognition, professional design. Agents acting on behalf of humans with real money need machine-verifiable trust data: structured, verifiable transaction terms covering pricing, return policies, merchant verification, and guarantees that can be evaluated programmatically before committing. Visa’s Trusted Agent Protocol adds cryptographic proof-of-identity to agent-initiated transactions. The Agentic Commerce Protocol provides the merchant-side payment specification that agent checkouts run on.

Agent Policies And Permissions

When agents visit your website, you need a way to communicate what they are allowed to do. Browse only, or transact? Compare prices? Identify themselves? Rate limits? Standards work here is moving fast and not yet settled. New drafts are published every few weeks across IETF, W3C, and vendor working groups. The architectural need stays the same regardless of which draft wins: a programmatic way to declare what agents can do on your website, before they try to do it.

Output of this pillar: a functional map of every key action on the website, designed as:

  • Machine-navigable pathways with predictable outcomes.
  • Structured error recovery.
  • Verifiable trust signals.
  • Explicit agent policies.

The human visual experience is an enhancement layer on top of this.

The Four Pillars Are Sequential, Not Parallel

Build order matters. Identity first, Structure second, Content third, Interaction last.

You cannot have machine-readable Content without resolved Identity. The authorship principle (who wrote this, what their credentials are, what entities they connect to) depends on the canonical definition that Identity establishes.

You cannot expose Interaction without underlying Structure. An agent cannot complete a checkout flow on a page where the data model was never defined. The action manifest the agent reads is built on the same structural foundation that exposes price, specifications, and availability.

You cannot fix Interaction by patching it on at the end. Websites that try this end up with disconnected JavaScript widgets that simulate machine-readability without actually delivering it. Agents detect the gap, abandon the task, and leave no trace in your analytics.

Build Identity first. Layer Structure on top of it. Build Content into the Structure. Add Interaction as the operational layer once the first three are in place. Each pillar makes the next one possible.

Where To Start: One Action Per Pillar

A practical architecture move per pillar. None of these are audit checks. They are decisions you make before any audit becomes useful.

Identity. Write your canonical definition as fields, not paragraphs. What you do, who you do it for, where you operate, what makes you credible, who the key people are, what entities you connect to. Make this the source of truth that every bio, schema block, and platform listing derives from. Then Google your business name and compare what comes back against that definition. Every platform that tells a different story is a leak in your identity that the canonical document needs to resolve.
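A sketch of what "fields, not paragraphs" can look like in practice: one record that every downstream representation is generated from, here rendered as schema.org Organization JSON-LD. The field names and the mapping to schema.org properties are assumptions for illustration, and the company details are placeholders.

```python
import json
from dataclasses import dataclass, field


@dataclass
class CanonicalIdentity:
    name: str
    what_we_do: str
    audience: str
    area_served: str
    key_people: list[str] = field(default_factory=list)
    profiles: list[str] = field(default_factory=list)  # other platforms describing you

    def to_json_ld(self) -> dict:
        """Derive a schema.org Organization block from the canonical fields."""
        return {
            "@context": "https://schema.org",
            "@type": "Organization",
            "name": self.name,
            "description": f"{self.what_we_do} for {self.audience}",
            "areaServed": self.area_served,
            "employee": [{"@type": "Person", "name": p} for p in self.key_people],
            "sameAs": self.profiles,
        }


identity = CanonicalIdentity(
    name="Example Co",
    what_we_do="Conversion audits",
    audience="ecommerce teams",
    area_served="Europe",
    key_people=["Jane Doe"],
    profiles=["https://www.linkedin.com/company/example-co"],
)
json_ld = identity.to_json_ld()
print(json.dumps(json_ld, indent=2))
```

Bios and platform listings can be generated from the same record, so a change to the definition is made once and propagates everywhere.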

Structure. Pick your three most important page types: homepage, primary product or service, primary content. For each, list the discrete facts the page exists to expose, in priority order, before any consideration of layout or design. If you cannot list those facts, the page is being designed before the data model exists, which is the inversion you should aim to prevent.

Content. Pick the three pages most likely to be cited by AI systems. For each, establish two architectural connections: the author entity, schema-linked to the canonical identity document established by the Identity Pillar, and granular temporal signaling on specific claims, declaring when each was true and what data underlies it. The audit will catch whether the content reads well. The architecture decides whether the content is structurally connected to your identity and dated at the claim level.

Interaction. Try to complete a core action on your website (buying something, booking something, submitting a form) using only a screen reader. If you cannot get through the flow, neither can an agent. And agents do not have the patience to figure it out. They move on to a competitor.

Where Machine-First Architecture Fits Among SEO, GEO, And Accessibility

Machine-First Architecture is deliberately broader in scope than the existing AI-search guidance most practitioners are working with. Most frameworks in this space focus on a single slice of the journey: visibility, citation, content optimization, retrieval mechanics. Those are real disciplines, and they are necessary work. Machine-First Architecture is built one altitude above them: the architectural methodology that determines whether any of those tactics can land at all, plus the autonomous-interaction layer the others do not address.

Look at the scope mapping. SEO has historically covered Structure, plus parts of Identity through schema. Generative Engine Optimization covers Content, plus parts of Structure for retrieval. Accessibility covers parts of Structure and parts of Interaction, but only for human-assisted access. Both organizational Identity and autonomous-agent Interaction sit outside the primary scope of every existing discipline. Machine-First Architecture is what sits at the union.

The framework’s scope is bounded by what AI vendors and standards bodies are actively building toward consuming, not by speculation about what future AI might want. Identity protocols are landing, with Knowledge Graph consolidation already in production and verifiable-identity standards moving through W3C. Structural data extraction is mature, with all major AI crawlers parsing JSON-LD and semantic HTML. Content evaluation has documented retrieval mechanisms across position-based citation, authorship cross-referencing, and recency weighting. Interaction protocols are crystallizing as I write this. The four pillars don’t describe what to build for an imagined future. They describe what to build for the demand surface that already exists, plus a near-future surface that is already being shipped.

Duane Forrester’s The Machine Layer is the canonical guide for the visibility-and-trust side of the journey. Read it. Machine-First Architecture is what you build under that, wrapping the same content discipline inside the full architectural span, with Identity at one end and Interaction at the other.

The piece on the technical SEO audit I linked in the opening is the audit you run once the architecture is in place. The accessibility tree work I covered earlier is the rendering surface where most agentic browsers actually read your website, which is where the Structure Pillar’s information hierarchy ultimately gets evaluated.

Mobile-first took years to fully play out, but the actual transition (the point where websites that ignored it started losing) happened in months. Once Google began penalizing non-mobile-friendly websites in 2015, the window for ignoring it closed.

Machine-first is following the same curve, compressed.


Featured Image: Olga S L/Shutterstock

All You Need To Know About Cloudflare’s Agent Readiness Score via @sejournal, @slobodanmanic

Agent-readiness crossed from concept to measurable infrastructure this week. On April 17, as Cloudflare Agents Week extended into its sixth day, the company shipped isitagentready.com, a public scanner that scores any website on how prepared it is for AI agents. Paste a URL, get a score, see which checks passed and which failed, read AI-generated guidance on how to improve. For the first time, the agent-legibility conversation moved from “is my website ready for agents” as a gut feeling to “my website scored X out of 100 in these five categories, here are the failing signals.”

The Agent Readiness Score is a real shift. It is also a structurally misleading tool if you stop reading after the composite number.

I ran the scan on this website (nohacks.co) and scored 33 out of 100, Level 2 “Bot-Aware.” The robots.txt passed. The sitemap passed. The AI bot rules in robots.txt passed. Content Signals passed. Then the score collapsed across categories where a content-only blog genuinely doesn’t need what the scanner checks for. More on that in a minute.

First, the context. Cloudflare has been shipping agent-facing infrastructure all week. The Agent Readiness Score arrived alongside Agent Memory, Shared Dictionaries, Redirects for AI Training, an LLM compression technique called Unweight, and a feature-flag tool called Flagship built for AI-generated code. Four days earlier, they shipped Project Think (a new Agents SDK), and OpenAI matched it within hours with their own Agents SDK. I wrote about that in The Agent Runtime Wars Started This Week. The readiness scanner is the logical next piece: If runtimes are the new browser layer, website owners need a way to test whether their website is legible to that layer. Cloudflare shipped the tester.

The question this article answers is narrower: What does the scanner actually check, what should you do with your score, and where is the scoring structurally misleading enough that the number by itself leads you astray?

What Cloudflare Shipped: Scanner, API, And An MCP Endpoint Agents Can Call On You

The scanner is at isitagentready.com. Paste any URL, pick a website type (All Checks, Content Site, or API/Application) to scope which signals get scanned, hit Scan. The scanner fetches the homepage and a handful of well-known paths, runs a set of checks against each, and returns a scored report with pass/fail markers, status codes, response bodies, and AI-generated guidance on what to fix.

The scanner is also available in three other ways:

  • Integrated into Cloudflare Radar, so the same checks run alongside Radar’s existing URL analysis.
  • Exposed programmatically via the Cloudflare URL Scanner API for automation.
  • Available as a stateless MCP server at /.well-known/mcp.json, so any MCP-compatible agent can call the scan as a tool and reason over the result.

That last one is worth sitting with for a moment. Cloudflare shipped an agent-readiness scanner that agents themselves can call to audit websites before deciding how to interact with them. The scanner checks whether your website is ready for agents, and any agent can invoke it to decide how to interact with you before arriving. The measurement and the measured are starting to share the same surface.

Back to the practical question. What exactly does it check?

16 Checks, 5 Categories: What The Scanner Actually Tests

The scanner groups its checks into five categories. Here is what each one looks for, grouped by what the check actually means in practice.

Discoverability (3 Checks)

Whether the website publishes the basic metadata an agent needs to find what is where.

  • robots.txt exists. The classic crawl-policy file. An agent that follows robots.txt needs it to exist and parse.
  • sitemap.xml exists. Either declared via a Sitemap directive in robots.txt or available at the standard path. An agent that wants to enumerate pages uses the sitemap.
  • Link headers (RFC 8288). HTTP Link headers pointing to canonical, alternate, or related resources. Useful for agents that parse responses rather than HTML.
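To make the Link header check concrete, here is a simplified sketch of how an agent might read one. It handles the common shape of the header (angle-bracketed target followed by parameters) and is not a complete RFC 8288 parser. The example URLs are placeholders.

```python
import re

# One link: <target> followed by zero or more ;name=value parameters.
LINK_RE = re.compile(r'<([^>]+)>((?:\s*;\s*[\w-]+=(?:"[^"]*"|[^;,\s]+))*)')
PARAM_RE = re.compile(r';\s*([\w-]+)=(?:"([^"]*)"|([^;,\s]+))')


def parse_link_header(value: str) -> list[dict[str, str]]:
    """Split a Link header value into one dict per link."""
    links = []
    for target, params in LINK_RE.findall(value):
        link = {"target": target}
        for name, quoted, bare in PARAM_RE.findall(params):
            link[name.lower()] = quoted or bare
        links.append(link)
    return links


header = (
    '<https://example.com/post>; rel="canonical", '
    '<https://example.com/post.md>; rel="alternate"; type="text/markdown"'
)
links = parse_link_header(header)
print(links)
```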

Content (1 Check)

  • Markdown for Agents. Content negotiation. The scanner sends Accept: text/markdown and checks whether the website returns Markdown instead of HTML. This is Cloudflare’s own proposal rather than an IETF spec, though the mechanism (HTTP content negotiation via the Accept header) is standard. Real agent runtimes prefer Markdown because it is cheaper to tokenize and easier to parse than HTML. Some early movers (Cloudflare itself, a handful of docs websites) support Markdown content negotiation; most websites do not.
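The negotiation logic itself is small. The sketch below shows the decision an origin or edge function would make, assuming both an HTML and a Markdown rendering of the page already exist. The function names are hypothetical, and the Accept parsing is deliberately simplified (it only compares text/markdown against text/html).

```python
def prefers_markdown(accept_header: str) -> bool:
    """Return True if the client asked for text/markdown at least as strongly as HTML."""
    weights: dict[str, float] = {}
    for part in accept_header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        q = 1.0  # default quality value when none is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        weights[media_type] = q
    markdown_q = weights.get("text/markdown", 0.0)
    return markdown_q > 0 and markdown_q >= weights.get("text/html", 0.0)


def negotiate(accept_header: str, html: str, markdown: str) -> tuple[dict[str, str], str]:
    """Pick a representation and return (headers, body)."""
    if prefers_markdown(accept_header):
        content_type, body = "text/markdown; charset=utf-8", markdown
    else:
        content_type, body = "text/html; charset=utf-8", html
    # Vary: Accept tells caches that the response depends on the Accept header.
    return {"Content-Type": content_type, "Vary": "Accept"}, body


headers, body = negotiate("text/markdown", "<h1>Hello</h1>", "# Hello")
print(headers["Content-Type"], body)
```

A typical browser Accept header does not list text/markdown, so human visitors keep receiving HTML.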

Bot Access Control (3 Checks)

  • AI bot rules in robots.txt (RFC 9309). Whether robots.txt contains directives for AI-specific user agents (GPTBot, ClaudeBot, PerplexityBot, etc.).
  • Content Signals in robots.txt. An emerging spec for expressing per-URL access rules inside robots.txt. Parsed as User-agent: * followed by Content-signal: directives. Adoption is minimal right now.
  • Web Bot Auth request signing. HTTP message signatures at /.well-known/http-message-signatures-directory that let agents prove their identity cryptographically. This is the Agent Name Service side of things, which Cloudflare shipped with GoDaddy earlier in Agents Week. Adoption is almost zero outside Cloudflare’s own properties.
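For the two robots.txt checks, the sketch below shows a small file with explicit AI user-agent rules and a Content-Signal line, read with Python's standard-library parser. The paths are placeholders, and the Content-Signal values shown are an assumption about the proposal's syntax, so check the current draft before copying them. The standard parser ignores directives it does not recognize, which is why the Content-Signal line does not affect the results.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /drafts/

User-agent: ClaudeBot
Disallow: /drafts/

User-agent: *
Content-Signal: search=yes, ai-train=no
Allow: /

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# AI-specific rules apply to the named agents.
print(parser.can_fetch("GPTBot", "https://example.com/drafts/idea"))
print(parser.can_fetch("GPTBot", "https://example.com/posts/launch"))
# The Sitemap directive is exposed separately (Python 3.8+).
print(parser.site_maps())
```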

API, Auth, MCP & Skill Discovery (6 Checks)

  • API Catalog (RFC 9727). A machine-readable index of a website’s API endpoints at /.well-known/api-catalog.
  • OAuth / OIDC discovery (RFC 8414). Standard OAuth 2.0 authorization server metadata at /.well-known/oauth-authorization-server and /.well-known/openid-configuration.
  • OAuth Protected Resource (RFC 9728). A website declaring which endpoints are OAuth-protected and how to authenticate.
  • MCP Server Card (SEP-1649). A Model Context Protocol server advertising its capabilities at /.well-known/mcp/server-card.json. SEP-1649 is a draft proposal inside the MCP spec process.
  • Agent Skills index. A list of agent-callable skills at /.well-known/agent-skills/index.json. Also emerging.
  • WebMCP (Experimental). An in-page JavaScript API registering agent-callable tools via navigator.modelContext. The scanner uses headless browser rendering to detect whether the website registers any WebMCP tools on page load.
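Most of the checks in this category come down to whether a well-known path returns something. The sketch below only builds the list of URLs to probe for a given origin, using the paths named in the list above. It makes no network requests, and the dictionary keys are labels chosen for illustration.

```python
from urllib.parse import urljoin

# Paths as described in the check list; keys are illustrative labels.
WELL_KNOWN_PATHS = {
    "api_catalog": "/.well-known/api-catalog",
    "oauth_authorization_server": "/.well-known/oauth-authorization-server",
    "openid_configuration": "/.well-known/openid-configuration",
    "mcp_server_card": "/.well-known/mcp/server-card.json",
    "agent_skills_index": "/.well-known/agent-skills/index.json",
}


def probe_urls(origin: str) -> dict[str, str]:
    """Return the absolute URL to request for each discovery check."""
    return {name: urljoin(origin, path) for name, path in WELL_KNOWN_PATHS.items()}


urls = probe_urls("https://example.com")
for name, url in urls.items():
    print(f"{name}: {url}")
```

Requesting each URL and recording the status code would reproduce the discovery half of these checks. WebMCP is the exception, since it is detected by rendering the page rather than by fetching a path.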

Commerce (3 Optional Checks, Not Scored On Non-Commerce Websites)

  • x402 payment protocol. HTTP 402 Payment Required infrastructure for agent-native payments.
  • UCP profile (Universal Commerce Protocol). Google’s merchant-metadata standard at /.well-known/ucp.
  • ACP discovery document (Agentic Commerce Protocol). At /.well-known/acp.json.

The Commerce category is flagged “optional” on non-commerce websites. The scanner detects whether any ecommerce signals are present and, if not, displays the commerce checks for informational purposes without counting them in the score.

That last design detail matters. It is evidence Cloudflare anticipated exactly the problem the rest of this article is about.

Nohacks.co Scored 33/100, Level 2 Bot-Aware

I ran the scan on nohacks.co. The result was 33 out of 100, Level 2 “Bot-Aware.”

The Agent Readiness Score report for nohacks.co, scanned on 2026-04-18. Composite: 33/Level 2 “Bot-Aware.” Category breakdown: Discoverability 67 (2/3), Content 0 (0/1), Bot Access Control 100 (2/2), API, Auth, MCP & Skill Discovery 0 (0/6). Commerce checks not scored (no ecommerce signals detected). Image Credit: Slobodan Manic

A note on that number: After the first scan, I added Content Signals directives to robots.txt, which moved Bot Access Control from 50 to 100 and pulled the composite up eight points from an initial 25. Every other category below is unchanged from the first scan. I’ll come back to the Content Signals fix and why I made it at the end of this section.

Here is what drove each category score:

  • Discoverability: 67. robots.txt and sitemap.xml passed. Link headers failed because this website does not emit Link: headers in its responses.
  • Content: 0. Markdown content negotiation is not configured. The website returns HTML regardless of the Accept header.
  • Bot Access Control: 100. Both scored checks passed. AI bot rules in robots.txt (I have explicit rules for AI user agents) and Content Signals in robots.txt (I added these after the first scan). Web Bot Auth request signing is listed in this category as an informational check, but not counted toward the 2/2.
  • API, Auth, MCP & Skill Discovery: 0. All six checks failed. No API Catalog. No OAuth discovery. No OAuth Protected Resource metadata. No MCP Server Card. No Agent Skills index. No WebMCP tools on the page.
  • Commerce: not scored. nohacks.co has no e-commerce. The Commerce checks all failed, but the category is correctly excluded from the composite score.

That is a 33 on a scanner built by the company I most trust to understand where the agent-ready web is going. I consider this website reasonably well-designed for agents. The robots.txt is clean and explicit. The content is server-rendered, machine-readable HTML with clean semantic structure. The sitemap is current. The URLs are stable. If you asked me a week ago whether this website was agent-ready, my answer would be somewhere between “mostly yes” and “for what it needs to do, yes.”

And yet: 33, Level 2.

The scanner is measuring what it says it is measuring. The composite score, by itself, is still the wrong number to optimize for.

One note on the Content Signals fix, because it’s relevant to the Goodhart argument later in this article. Content Signals is a Cloudflare proposal with almost no deployment beyond Cloudflare-aligned crawlers. I debated adding it for exactly the score-chasing reason this article warns about. I decided it was defensible for two reasons. First, the fix is declarative, not decorative. The directives state real policy about what should happen with my content, and the statement has meaning even if the spec fails. That is different from adding an empty MCP Server Card to satisfy a scorer. Second, for a website that writes about agent-readiness specifically, publicly declaring content policy is editorial practice regardless of which crawler respects it. The fix was one commit to public/robots.txt and the directives are readable by any human curious enough to check.

Same Website Scores 33 Or 67 Depending On The Preset You Select

On the All Checks preset, nohacks.co scores 33 out of 100, Level 2 “Bot-Aware.” On the Content Site preset, same website, same day, different scan configuration, it scores 67, still Level 2 “Bot-Aware.” Roughly double the composite number. The 34-point gap is the difference between two scan configurations of the same scanner, not a difference between two websites.

Here is what the Content Site preset changes in the scan configuration:

The Content Site preset unchecks every item in the API/Auth/MCP/Skill Discovery category, every item in the Commerce category, and Web Bot Auth in Bot Access Control. Six scored checks remain: three Discoverability (robots.txt, Sitemap, Link headers), one Content Accessibility (Markdown negotiation), two Bot Access Control (AI bot rules, Content Signals). Image Credit: Slobodan Manic

Running that preset on nohacks.co produced this result:

Nohacks.co under the Content Site preset: 67 / Level 2 “Bot-Aware.” Four of six scored checks pass. The two failing checks are Link headers (a fix I have not deployed yet) and Markdown content negotiation (not configured). Both are real shipping signals that agent runtimes benefit from today. Image Credit: Slobodan Manic

Four of six scored checks pass. The two failures are unambiguous remediation targets: Link headers via HTTP response configuration, Markdown content negotiation via origin or CDN response logic. Both ship against real agent-runtime behavior today. Neither is a proposal-stage format that will only maybe become a standard. This is the honest reading of nohacks.co’s agent-readiness state: two specific, actionable gaps.
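The three composites reported in this article (25, 33, and 67) are all consistent with a simple ratio of passed checks to scored checks. That formula is an inference from the reported numbers, not documented scanner behavior, and the check names below are shorthand labels.

```python
def composite(results: dict[str, bool]) -> int:
    """Score as the share of scored checks that passed, rounded to a whole number."""
    return round(100 * sum(results.values()) / len(results))


# The six checks that remain under the Content Site preset.
content_site = {
    "robots_txt": True,
    "sitemap": True,
    "link_headers": False,
    "markdown_negotiation": False,
    "ai_bot_rules": True,
    "content_signals": True,
}

# All Checks adds the six API, Auth, MCP & Skill Discovery checks, all failing.
all_checks = {
    **content_site,
    "api_catalog": False,
    "oauth_discovery": False,
    "oauth_protected_resource": False,
    "mcp_server_card": False,
    "agent_skills": False,
    "webmcp": False,
}

# The first scan, before Content Signals was added to robots.txt.
first_scan = {**all_checks, "content_signals": False}

print(composite(first_scan), composite(all_checks), composite(content_site))
```

Under this reading, the preset changes the denominator rather than anything about the website: the same four passing checks are divided by 12 or by 6.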

The Correct Toggle Is Hidden, And The Default Score Is Wrong

The scanner is doing its job. It knows a blog does not need an MCP Server Card. It knows a podcast archive does not publish an API catalog. The Content Site preset is not cosmetic. It removes irrelevant checks and gives a content website an accurate reading against standards that actually apply.

The problem is that the preset is hidden. When a user lands on isitagentready.com and pastes a URL, the default scan is All Checks. The Site Type toggle that would switch to Content Site or API/Application lives inside a Customize dropdown that most users will never open. The user clicks Scan, reads the composite number, takes a screenshot, shares it. The shareable number, the one that travels on social media, the one competitors compare across, is the All Checks composite.

For a content website that runs the default scan without reading individual checks, the composite is structurally too low. The 33 on nohacks.co is wrong for the kind of website nohacks.co is. The 67 from the Content Site preset is the accurate reading. Two numbers from the same scanner on the same website. The accurate number is behind a dropdown. The wrong number is on the front page.

Any web professional who runs the scanner and plans to share the score anywhere public needs to open Customize, select the preset that matches their website type, and re-run before sharing. Without that step, the public score will understate the website’s actual agent-readiness, and the gap between the shared number and the accurate number will be larger for content websites than for API websites (which are closer to the All Checks baseline). Read the individual checks. Do not share a composite until you know which preset produced it.

For the record: the 67 is bothering me. I am going to go get the 100. I know exactly what the Goodhart section below is about to warn against, and I am going to do it anyway. Two fixes stand between me and the 100. Both are five-minute jobs. Both map to real agent-runtime behavior (Link headers for discovery, Markdown content negotiation for efficient agent parsing), so at least the motivation is legitimate and not pure score-chasing. That caveat is also exactly what score-chasers say. Public scores are a gravitational field. Even the person writing a long article about their unreliability ends up orbiting.

Agent Readiness Measures Delivery, Not Message

Every category the Agent Readiness scanner tests is about delivery: discoverability, content negotiation, bot access, API discovery, commerce protocols. None tests the quality of the message itself.

The scanner never asks whether your headlines are clear, whether your product descriptions persuade, whether your content answers the query well, whether your writing is any good. Those are SEO and CRO questions. They occupy the discipline of making the message better. The Agent Readiness Score occupies a different discipline entirely. It asks whether an agent can fetch your content, parse the format it arrives in, authenticate against your endpoints, call your functions, pay for your outputs.

That is the distinction that matters. Classical web optimization (SEO, CRO) is about what you say and how persuasively you say it. Agent-readiness is about how you deliver what you say to a non-human reader. Two websites can publish word-for-word identical content. One serves it as server-rendered HTML with semantic markup, responds to Accept: text/markdown, exposes structured data, returns predictable response codes. The other serves it as a JavaScript-rendered single-page application with no content negotiation and an inconsistent error surface. The message is identical. The delivery is different. The agent-readiness score will be different. And it will be right to be different, because the delivery is what the agent interacts with.

This is also why agent-readiness fixes tend to be orthogonal to SEO and CRO work. You can improve an agent-readiness score without rewriting a single word of your content. You can also have world-class SEO content that scores a 10 on the agent-readiness scanner because none of your delivery pipeline was designed for machine consumers. SEO and CRO work on the content layer. Agent-readiness works on the transport and protocol layer. They are adjacent but not the same craft, and treating them as the same is the mistake that turns an agent-readiness project into a content-rewrite project and misses the actual fix.

The people who will do well over the next several years are the ones who stop arguing about which discipline matters more and start recognizing they occupy different layers of the stack.

3 Goodhart Risks Built Into The Agent Readiness Score

Goodhart’s law says that when a measure becomes a target, it stops being a good measure. The Agent Readiness Score is well-designed, but it is also now a public, shareable, compared number, which produces three predictable behavioral failures in the wild.

The first risk is that website owners will optimize for the number rather than for real agent behavior. Add an MCP Server Card that points nowhere because the scanner wants one. Publish an Agent Skills index with no actual skills. Ship a WebMCP tool that does nothing just to pass the detection check. The score goes up, and nothing changes for real agent runtimes visiting the website.

The second risk is that consultancies will start selling “Agent Readiness Score optimization” as a service, selling the score rather than the underlying architecture. The history of SEO gives us a quarter century of data on how this plays out. PageRank became a target, and a decade of link-spam economy grew up around it. Core Web Vitals became a target, and a generation of performance-theater optimizations followed. The Agent Readiness Score is a better-designed metric than either of those were at launch, but the same gravity applies.

The third risk is that the scanner’s inclusion of emerging standards as scored signals will accelerate the adoption of those standards past the point where they are ready to carry real traffic. The scanner checks for llms.txt, a proposed format for exposing website content to language models. Llms.txt is not a ratified standard, has no governing body, and has competing proposals for how it should be structured. Including it as a scored signal gives it weight it has not earned in the ecosystem. A website owner looking to fix a failing check is the marginal adopter who tips a proposal into a de facto standard before the spec work is done.

None of these failure modes are hypothetical. They are how every public measurement score in the history of the web has played out. The Agent Readiness Score is better than most because Cloudflare is honest about what it is, because the per-check detail is available right alongside the composite number, and because the Commerce category correctly excludes itself on non-commerce websites. That honesty is a feature worth protecting. Website owners and the consultancy industry will be tempted to treat the composite number as the target anyway.

Do not do this.

6 Weekend Fixes That Map To Real Agent Runtimes

Six actions for a web professional running the scanner the weekend of its launch, ordered from highest-leverage to lowest:

  1. Run the scan on your website. It takes about 30 seconds. Note the score and open the detailed report. The detail is where the signal is.
  2. Fix the failing checks that ship against real agent runtimes today. These are the ones whose absence measurably hurts your website for agents visiting it right now:
    • robots.txt. If missing, add one. If present, make sure it contains specific rules for AI user agents (GPTBot, ClaudeBot, PerplexityBot, Google-Extended, etc.).
    • sitemap.xml. If missing, generate one and link it from robots.txt. Keep it current.
    • Markdown content negotiation. Configure your origin or CDN to return text/markdown when the Accept header requests it. Cloudflare’s own AI Crawl Control has first-class support for this. Other providers require custom server logic.
    • Structured data. Ship schema.org JSON-LD for the content types your website publishes (Article, Product, Organization, BreadcrumbList). This is not a scored check, but it is the highest-leverage fix for citation behavior across every agent runtime currently deployed.
  3. Treat the proposal-stage formats as a watch list, not a checklist. llms.txt, Content Signals in robots.txt, Web Bot Auth, API Catalog, MCP Server Card, Agent Skills, WebMCP, ACP, UCP are all real working standards in some sense. They are not shipping against real agent-runtime behavior at scale yet. Watch them. Implement them when your stack has a reason to, not because the scanner flags them.
  4. Ignore the composite number in your own tracking. Track individual check outcomes over time. A website that goes from 3 of 5 real-runtime checks passing to 5 of 5 has measurably improved, even if the composite score barely moved because the 10 proposal-stage checks still fail.
  5. Re-scan after changes. The scanner is fast, free, and available via the URL Scanner API if you want to script regression checks into your deployment pipeline.
  6. Skip the consultancies selling Agent Readiness Score optimization. The work is straightforward enough that a half-day audit and a focused remediation sprint will beat any packaged service.
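The Markdown content negotiation fix in step 2 comes down to reading the Accept header and deciding which representation to return. Here is a minimal Python sketch of that decision. The helper names (parse_accept, wants_markdown) are hypothetical, the q-value parsing is simplified and ignores wildcard ranges, and this is not how Cloudflare or any particular CDN implements it.

```python
def parse_accept(header: str) -> dict[str, float]:
    """Parse an Accept header into {media_type: q}. Simplified: only the q parameter is read."""
    prefs: dict[str, float] = {}
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # per HTTP semantics, a missing q parameter means q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # treat an unparseable weight as "not acceptable"
        prefs[media_type] = q
    return prefs


def wants_markdown(accept_header: str) -> bool:
    """True when text/markdown is requested and weighted at least as high as text/html."""
    prefs = parse_accept(accept_header)
    markdown_q = prefs.get("text/markdown", 0.0)
    html_q = prefs.get("text/html", 0.0)
    return markdown_q > 0 and markdown_q >= html_q


if __name__ == "__main__":
    for header in ("text/markdown", "text/html;q=0.8, text/markdown", "text/html"):
        print(header, "->", wants_markdown(header))
```

If you serve two representations from one URL this way, the response should also carry a Vary: Accept header so caches do not hand the Markdown version to a browser.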

The scanner is the tool. The work is still the work.

Vendor-Specific Scanners Are Coming: Track What Every Scanner Tests

The Agent Readiness scanner is standards-list-shaped: a set of checks against a fixed list of protocols and formats, some ratified (RFC 8288 Link headers, RFC 9309 robots.txt rules, RFC 8414 OAuth discovery, RFC 9727 API Catalog, RFC 9728 OAuth Protected Resource), some emerging proposals (MCP SEP-1649, WebMCP, Content Signals, Web Bot Auth, x402, UCP, ACP, llms.txt). The next thing that happens in the ecosystem is predictable: Other vendors will ship their own scanners against their own preferred lists. The overlap will be significant because most of the ratified standards are uncontroversial. The divergence will be in which proposals each vendor scores for.

That divergence is where the agent-readiness measurement story gets interesting. A Cloudflare scanner that checks for Web Bot Auth and UCP is making a bet. A Google scanner, if it ships, would check for some of the same things and some different ones (Google has UCP, does not have Web Bot Auth). A Perplexity scanner would check for yet another set. Website owners would see different scores from different scanners on the same website. The composite number, already not trustworthy, becomes vendor-specific.

The signal worth tracking is which checks show up in every scanner that ships. Those are the de facto standards. The checks that only show up in Cloudflare’s scanner are Cloudflare’s bets. Some will win. Most will not.
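Keeping that list is a set-intersection exercise. The sketch below assumes you record each scanner's checks as a set of names. The scanner names and check lists are invented for illustration and do not describe what any real scanner tests.

```python
from collections import Counter


def common_checks(scanner_checks: dict[str, set[str]]) -> set[str]:
    """Checks that appear in every scanner: the de facto standards."""
    if not scanner_checks:
        return set()
    return set.intersection(*scanner_checks.values())


def vendor_only_checks(scanner_checks: dict[str, set[str]]) -> dict[str, set[str]]:
    """Checks that appear in exactly one scanner: that vendor's bets."""
    counts = Counter(check for checks in scanner_checks.values() for check in checks)
    return {
        scanner: {check for check in checks if counts[check] == 1}
        for scanner, checks in scanner_checks.items()
    }


# Hypothetical data: illustrative only, not real scanner contents.
scanners = {
    "cloudflare": {"robots.txt", "sitemap.xml", "web-bot-auth", "ucp"},
    "vendor_b": {"robots.txt", "sitemap.xml", "ucp"},
    "vendor_c": {"robots.txt", "sitemap.xml", "llms.txt"},
}

if __name__ == "__main__":
    print("In every scanner:", sorted(common_checks(scanners)))
    print("Single-vendor bets:", vendor_only_checks(scanners))
```

Re-running this each time a new scanner ships shows which checks are converging and which remain one vendor's position.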

This is the pattern that made me comfortable publishing an article about a Cloudflare tool on the day it shipped. The Agent Readiness Score is real. The thesis behind it (agent-readiness is a measurable property) is the right thesis. The specific scorecard is version one of something that is going to have dozens of versions, each reflecting its vendor’s bets. Web professionals should engage with the version-one scorecard, fix what it correctly flags as real, watch what it flags as emerging, and keep their own running list of which checks survive across every scanner that ships in the next six months.

That running list is the real agent-readiness standard. The composite score is the marketing layer.

Run the scan. Read the report. Fix what matters. Watch what might.


This post was originally published on No Hacks.


Featured Image: RobinRmD/Shutterstock

LLM Guidance Doesn’t Transfer The Way SEO Guidance Did via @sejournal, @DuaneForrester

For roughly two decades, the SEO discipline operated on a quiet assumption that turned out to be one of its most valuable features. Guidance from one search engine traveled. If Google said sitemaps mattered, Bing said sitemaps mattered. If Bing said structured data deserved real effort, Google said the same. Practitioners optimized for Google with reasonable confidence that the work would carry across the other engines, and most of the time it did. That portability was not luck. It was the product of a structurally large overlap layer that the major search engines had jointly built, brick by brick, over twenty years.

That world doesn’t exist in LLM-land. The major providers train on different corpora, run different crawlers under different policies, route different queries through different retrieval systems, and apply different alignment processes that shape the final response in ways the upstream signals can’t predict. Guidance from any one provider, including Google’s guidance about its own Gemini products, is one data point. Practitioners carrying the SEO habit forward, the habit of treating one engine’s guidance as roughly the whole map, will optimize confidently for one platform and miss the others.

Sidebar: As I was finalizing this piece, Google published fresh guidance on optimizing for their generative AI features. Their framing is explicit: from Google Search’s perspective, optimizing for AI search is still SEO. That framing is accurate for Google Search. It does not extend to ChatGPT, Claude, Perplexity, or any other LLM, and that is precisely the trap this article is about.

The Shared Standards That Made SEO Guidance Portable

The era of portable guidance was built on actual collaboration, not coincidence. The Sitemaps protocol became the joint property of Google, Yahoo, and Microsoft in November 2006, when the three engines formally agreed to support a common protocol at version 0.90, building on Google’s earlier Sitemaps 0.84 from June 2005. Five years later, on June 2, 2011, the same three engines launched Schema.org, with Yandex joining shortly after, to create a common vocabulary for structured data markup. That was the announcement that got made on stage at SMX Advanced. I was on the Bing team at the time, and what struck me then is what still matters now. The engines were competitors, but they had decided that a shared vocabulary served them all. Webmasters got one set of rules. The web got cleaner data. The engines got better signals. Everybody won.

The pattern repeated with robots.txt, the 1994 convention that became RFC 9309 at the IETF in 2022, formalizing what every serious crawler already honored. And it repeated again, more recently, with IndexNow, the protocol Microsoft Bing and Yandex launched in October 2021. IndexNow is now supported by Bing, Yandex, Naver, Seznam, and Yep. Google has tested the protocol since 2021, but has not adopted it.

That overlap layer is exactly why Google’s guidance felt safe to follow, even if you cared about Bing traffic. The signals the engines used were not identical, but the inputs they accepted, the protocols they honored, and the standards they advertised were. Optimization had a shared substrate.

Where The LLM Stacks Actually Diverge

The LLM environment doesn’t have a shared substrate of comparable size. The differences are not cosmetic, and they are not temporary. They are baked into how the systems are built.

Start with training data. OpenAI has publicly disclosed licensing deals with News Corp (worth up to $250 million over five years), Axel Springer (roughly $13 million per year), and Reddit (an estimated $70 million per year), plus the Financial Times, Condé Nast, Hearst, Vox Media, The Atlantic, the Associated Press, Le Monde, and others. Google has its own Reddit deal, estimated at $60 million per year, granting real-time data API access. Anthropic has not publicly disclosed equivalent publisher licensing deals, and that undisclosed status is itself the practitioner-facing point. The corpora that fed these models, and that continue to refresh them, are not the same documents. Practitioners cannot know what any given provider has paid for and what it hasn’t.

The crawler infrastructure diverges next. OpenAI runs three separate bots: GPTBot for training, OAI-SearchBot for search indexing, and ChatGPT-User for user-initiated retrieval. Anthropic runs three of its own: ClaudeBot for training, Claude-SearchBot for search, and Claude-User for user-initiated retrieval. Perplexity runs PerplexityBot and Perplexity-User. Google introduced Google-Extended in September 2023 as the user-agent that controls whether Google can use a site’s content to train Gemini, separate entirely from the Googlebot that handles traditional search indexing. There is no single AI user-agent. Every provider requires a separate rule, and the rules don’t translate cleanly across providers because the bots don’t do equivalent jobs in equivalent ways.
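In practice, that means writing one robots.txt group per bot. The sketch below generates those groups from a per-purpose policy and then verifies them with Python's standard-library robots.txt parser. The bot tokens are the ones named above. The grouping by purpose, the placement of PerplexityBot under "search", and the function name build_robots_txt are assumptions for illustration, so check each provider's crawler documentation before relying on them.

```python
from urllib.robotparser import RobotFileParser

# User-agent tokens named in the article, grouped by the job each bot does.
# Assumption: PerplexityBot is grouped under "search"; the article does not state its role.
AI_BOTS = {
    "training": ["GPTBot", "ClaudeBot", "Google-Extended"],
    "search": ["OAI-SearchBot", "Claude-SearchBot", "PerplexityBot"],
    "user": ["ChatGPT-User", "Claude-User", "Perplexity-User"],
}


def build_robots_txt(policy: dict[str, bool]) -> str:
    """Emit one robots.txt group per AI bot; policy maps purpose -> allowed (default allow)."""
    lines: list[str] = []
    for purpose, bots in AI_BOTS.items():
        rule = "Allow: /" if policy.get(purpose, True) else "Disallow: /"
        for bot in bots:
            lines += [f"User-agent: {bot}", rule, ""]
    # Everything not named above falls through to the wildcard group.
    lines += ["User-agent: *", "Allow: /"]
    return "\n".join(lines)


if __name__ == "__main__":
    # Example policy: block training crawlers, allow search and user-initiated retrieval.
    robots_txt = build_robots_txt({"training": False, "search": True, "user": True})
    print(robots_txt)

    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    for bot in ("GPTBot", "OAI-SearchBot", "Claude-User"):
        print(bot, "may fetch:", parser.can_fetch(bot, "https://example.com/page"))
```

The policy shown is only an example. The point is that the decision is made per bot and per purpose, not once for "AI."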

The retrieval architectures diverge structurally. ChatGPT has historically used Bing’s index as its primary web search source, and that connection appears to still be primary, though OpenAI continues to build out additional infrastructure alongside it. Perplexity built its retrieval system on a Vespa-based pipeline that treats documents and sub-document chunks as first-class retrievable units. Google’s Gemini uses Google’s own index plus Knowledge Graph grounding. Claude uses Brave Search as a retrieval partner. Same query, four different retrieval systems, four different views of which sources exist and which sources are worth surfacing.

Then comes the alignment layer, which is where SEO had no equivalent at all. After a model is trained on its corpus, providers run post-training to shape how the model actually behaves: tone, refusal patterns, format, safety posture, what counts as a good answer. OpenAI’s primary approach has been RLHF, or Reinforcement Learning from Human Feedback, where human raters score model outputs and the model learns to produce highly rated responses. Anthropic developed Constitutional AI, which trains models to critique and revise their own outputs against a written set of principles. These methodologies produce demonstrably different behavior in the final products. The same retrieved content, fed into two models aligned by two methodologies, can yield two materially different responses about the same brand.

When One Provider’s Guidance Demonstrably Fails To Port

The clearest single example of guidance that doesn’t port is llms.txt. Jeremy Howard of Answer.AI proposed the file in September 2024 as a markdown manifest, placed at a site’s root, that would guide LLMs to the most important content. The proposal got picked up across the SEO community. Yoast built a generator. Agencies added llms.txt creation to their service catalogs. Conference speakers declared it essential.

As of mid-2026, no major LLM provider has confirmed they consume the file. Not OpenAI. Not Anthropic. Not Google. Server-log analyses across hundreds of thousands of domains show major AI crawlers don’t routinely request /llms.txt at all. Google’s John Mueller publicly compared it to the deprecated meta keywords tag. Gary Illyes confirmed at Search Central Live in July 2025 that Google does not support llms.txt and is not planning to.

I’ve written about this elsewhere, so I won’t repeat the technicalities here. What matters for this argument is the structural lesson. Schema.org succeeded because three engines built it together and then enforced it together. Llms.txt was proposed by one researcher, picked up by tooling vendors, and ignored by the platforms it was supposed to serve. The shared-standards model that gave SEO its portable guidance is not available to LLM practitioners at the same scale, because the platforms are not building the standards together. They are building their own pipelines.

The Gemini Inversion

The cleanest illustration of how far guidance portability has degraded sits inside one company. Google publishes its own SEO documentation at Search Central, the canonical guidance the industry has followed for two decades. Those documents emphasize traditional ranking signals, E-E-A-T, content quality, technical accessibility, and structured data. That guidance is still useful for Google Search itself.

Google also makes Gemini, the model that powers AI Overviews and Google’s separate AI Mode surface. And the citation behavior of those surfaces does not appear to track the guidance the same company publishes for its own search results.

In late 2024, roughly three-quarters of pages cited in AI Overviews also ranked in Google’s top 10 for the same query. By early 2026, after Google upgraded AI Overviews to Gemini 3 in January, Ahrefs analyzed 4 million AI Overview URLs and found that only 38% of cited pages also appeared in the top 10 for the same query. A separate BrightEdge analysis put the overlap closer to 17%. SE Ranking’s post-upgrade work found that Gemini 3 replaced approximately 42% of the domains previously cited under earlier model versions and generates 32% more sources per response.

The gap widens further when you look at Google’s AI Mode, which is a separate conversational surface that runs on the same Gemini family. Semrush data shows AI Mode and AI Overviews reach semantically similar conclusions 86% of the time, but cite the same URLs only 13.7% of the time. Only 14% of AI Mode citations rank in Google’s traditional top 10.

It appears, so far, that the canonical relationship has shifted. Google’s published SEO guidance is still the cleanest path to ranking in Google Search. But that ranking is no longer a reliable proxy for being cited by Google’s own AI surfaces. The same guidance, the same content, the same domain, can produce three meaningfully different outcomes across Google Search, AI Overviews, and AI Mode, even though all three live inside the same company. The old playbook of following the search engine’s guidance and trusting that the engine’s other surfaces would behave consistently does not appear to be delivering the same returns it used to.

What Still Ports, And Why It’s Smaller Than It Looks

A universal layer does survive. Crawler accessibility still matters across every provider. Primary-source factual content still wins more citations than aggregator restatement. Clean retrievable structure still helps every system understand what a page is about. Presence on the high-authority sources that all major LLMs disproportionately cite, Wikipedia, YouTube, Reddit, major news outlets, still functions as a force multiplier across platforms. Earning visibility on those sources gives content a chance to surface in any LLM that draws on them.

But the universal layer is much smaller than it was in the SEO era. Qwairy’s analysis of 118,000 AI responses across ChatGPT, Perplexity, Google AI Mode, and Claude found that only 11% of cited domains appeared across multiple platforms. The other 89% were platform-specific. A brand that wins citations on Perplexity may be largely invisible on Claude. A brand that’s a regular reference on ChatGPT may not show up in AI Overviews at all. The same content can be the right answer for one system and the wrong answer for the system next to it.
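You can run a rough version of this measurement on your own citation tracking. The sketch below counts how many cited domains appear on more than one platform. The function name and the sample domains are invented, and this is a simplified stand-in, not Qwairy's methodology.

```python
from collections import Counter


def multi_platform_share(citations: dict[str, set[str]]) -> float:
    """Share of all cited domains that appear on more than one platform."""
    counts = Counter(domain for domains in citations.values() for domain in domains)
    if not counts:
        return 0.0
    shared = sum(1 for platforms in counts.values() if platforms > 1)
    return shared / len(counts)


# Hypothetical sample: the domains each platform cited for the same prompt set.
sample = {
    "chatgpt": {"a.example", "b.example", "c.example"},
    "perplexity": {"a.example", "d.example"},
    "claude": {"e.example"},
    "ai_mode": {"f.example", "g.example", "h.example"},
}

if __name__ == "__main__":
    print(f"{multi_platform_share(sample):.1%} of cited domains appear on 2+ platforms")
```

A low share in your own data is the practical sign that a win on one platform is not carrying over to the others.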

What This Means For The Work

The practical implication is not abandoning all hope. It is that practitioners need to stop treating any single LLM provider’s guidance as the universal map and start treating it as one input among several. Read what every major provider publishes about their own systems. Test your visibility across platforms, not just on the platform you happen to use most. Treat divergence as the default and overlap as the exception, not the other way around.

This is not how SEO worked, and the difference matters. The old reflex was to optimize for Google and trust the portability. The new reality is that following one LLM’s guidance, even Google’s guidance about Gemini, will leave you optimized for a slice of the landscape and potentially blind to the rest. The discipline is being rebuilt on platform-specific work that didn’t exist in the SEO era, and the practitioners who recognize that first are going to spend the next two years setting the standards everyone else follows.

The overlap has shrunk. You now have more work than ever to accomplish.

If you have thoughts on where the divergence between providers is sharpest in your own work, reach out directly. I’d genuinely like to hear what’s showing up in the data.


This post was originally published on Duane Forrester Decodes.


Featured Image: Rawpixel.com/Shutterstock; Paulo Bobita/Search Engine Journal

How To Stress-Test A Staging Environment To Surface Risks Pre-Launch – Ask An SEO via @sejournal, @HelenPollitt1

This week’s Ask An SEO question:

“How do you stress-test a staging environment to surface SEO risks before a large-scale launch?”

It is one of the most important questions to answer when considering rolling out new websites, migrations, or significant changes to your live site.

First, let’s look at the difference between a “staging” site and the “production” site.

The staging site is often also called the “development” site, “pre-production,” or another name that is specific to your company. It is a test site that is meant to mirror your live site as much as possible to help developers test changes in a safe, private environment before launching them.

The “production” site is your live site. It’s the one that is accessible to the general public and should be operating as close to perfectly as possible.

There are some instances where developers might deploy straight to the production site without testing on a staging site first. For example, when there is no testing site to use, or there is no way of mimicking the conditions to test without deploying the change to the live site. This is risky to do. If a deployment breaks something else in the code, it could critically affect the usability of the live site.

How To Stress-Test The Staging Environment

As SEOs, it is very important that we test deployments that could potentially impact SEO performance before they launch. Oftentimes, we find ourselves discovering deployments after they have already started to affect traffic and rankings. This is less than ideal, as it can take a while for Googlebot to pick up changes once a bad deployment has been fixed. It is far better to test how Googlebot might process changes before it is able to do so.

Mirror The Production Site As Closely As Possible

The most important aspect of the staging site is that it is as close to the production environment as possible. This is critical because it enables any testing that you do to reveal the same outcome as if you had run the test on the production environment.

Any deviations between the two environments need to be cataloged. These discrepancies need to be communicated so that testers know to pay special attention to the areas of the production site that differ from staging. Once the deployment goes live, testers can quickly ensure these areas of the production site are behaving as expected.

Crawl The Site At Scale With Multiple User-Agents

One area that is often overlooked when stress-testing the staging environment is using several different user agents when crawling the site.

By using different agents, for example, mimicking Googlebot Smartphone and Googlebot Desktop, you are more likely to pick up technical issues with the site that aren’t obvious on first crawl. For instance, crawling as both desktop Googlebot and mobile Googlebot could show issues with rendering that are only occurring on mobile devices.

Make sure to crawl the site with user agents that are important for your specific industry. If you are targeting Google News as a channel, make sure to crawl the site as Googlebot-News. If images or videos are important to your SEO, crawl as Googlebot-Image and Googlebot-Video.

To put your staging site through its paces, make sure to crawl it with a mobile user agent, a desktop user agent, and spoof two search engine bots, e.g., Google and Bing. This way you are getting good coverage of the experiences of different, important bots. If possible, try to crawl as an LLM bot also.
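If your crawling tool makes this awkward, a small script can do a first pass. The sketch below requests one URL with several user-agent strings and flags any response whose status or body differs from the first. The user-agent strings are approximations of the ones the engines publish (the Chrome version is left as the W.X.Y.Z placeholder), and the function names and the staging URL in the test are hypothetical. It also assumes the staging site is reachable without extra authentication.

```python
import hashlib
import urllib.error
import urllib.request
from typing import Callable

# Approximate published user-agent strings; verify against each engine's documentation.
USER_AGENTS = {
    "googlebot-desktop": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "googlebot-smartphone": (
        "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 "
        "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    ),
    "bingbot": "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
}


def fetch(url: str, user_agent: str) -> tuple[int, bytes]:
    """Fetch a URL with the given User-Agent header; returns (status, body)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses raise; report them as results instead of crashing.
        return err.code, err.read()


def compare_user_agents(
    url: str, fetcher: Callable[[str, str], tuple[int, bytes]] = fetch
) -> dict[str, dict]:
    """Request the same URL once per user agent and summarize each response."""
    results = {}
    for name, user_agent in USER_AGENTS.items():
        status, body = fetcher(url, user_agent)
        results[name] = {
            "status": status,
            "bytes": len(body),
            "sha256": hashlib.sha256(body).hexdigest()[:12],
        }
    return results


def differing_agents(results: dict[str, dict]) -> list[str]:
    """Names of user agents whose status or body hash differs from the first entry."""
    baseline = next(iter(results.values()))
    return [
        name
        for name, result in results.items()
        if (result["status"], result["sha256"]) != (baseline["status"], baseline["sha256"])
    ]
```

A difference is not automatically a bug, since pages with dynamic content will hash differently on every request. Treat the output as a list of URLs to inspect by hand.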

Check The Rendering

A good starting point when testing a staging environment before a large-scale deployment is rendering. Modern websites often use a lot of JavaScript, which, while not inherently bad, can be difficult for some search bots to process. For more information on how search bots process JavaScript, see this guide.

Set your crawling tool to include JavaScript rendering, and see what elements it can pick up. For example, can you see the header tags, meta title, schema markup? Then crawl the site again without JavaScript rendering enabled. Make sure those same elements are still available to the bots.
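For spot-checks outside a crawling tool, the same comparison can be scripted. The sketch below extracts the title, heading tags, and JSON-LD block count from two HTML strings and lists the elements that only match once JavaScript has run. It assumes you already have both versions of the page: the raw server response and the rendered HTML exported from a headless browser or your crawler. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class SeoElements(HTMLParser):
    """Collects the <title>, heading tags, and JSON-LD script blocks from an HTML string."""

    def __init__(self) -> None:
        super().__init__()
        self.title = ""
        self.headings: list[tuple[str, str]] = []  # (tag, text)
        self.json_ld_blocks = 0
        self._capture = None  # tag whose text is currently being collected
        self._buffer: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" or tag in HEADING_TAGS:
            self._capture, self._buffer = tag, []
        elif tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.json_ld_blocks += 1

    def handle_data(self, data):
        if self._capture:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._capture:
            text = "".join(self._buffer).strip()
            if tag == "title":
                self.title = text
            else:
                self.headings.append((tag, text))
            self._capture = None


def extract(html: str) -> dict:
    """Return the SEO elements found in one HTML document."""
    parser = SeoElements()
    parser.feed(html)
    parser.close()
    return {
        "title": parser.title,
        "headings": parser.headings,
        "json_ld_blocks": parser.json_ld_blocks,
    }


def js_dependent_elements(raw_html: str, rendered_html: str) -> list[str]:
    """Elements present after rendering that are missing or different in the raw HTML."""
    raw, rendered = extract(raw_html), extract(rendered_html)
    return [key for key in rendered if rendered[key] and raw[key] != rendered[key]]
```

Anything this returns for a critical template is an element that bots which do not execute JavaScript will not see.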

If in doubt, carry out some spot-checks on pages on the staging site. Inspect the Document Object Model (DOM) to see if the critical code elements are visible on first load of the page.

It is important that what you are seeing on the page is what the search bots are able to parse and render.

Test SEO Elements In Bulk And Across Page Types

Carrying out tests in bulk is important when testing a site before a large launch. When carrying out your tests, make sure they are across different page types and, if applicable, across languages.

If your site uses templates, make sure to test each of the templates that are critical to your SEO success. For example, on an ecommerce site, this means checking the category and product pages as a high priority.

For multilingual sites, ensure your tests are being run across different languages, and set a VPN to target the countries those languages are important for. Spoof those countries when running your crawls to make sure users will be seeing the correct language and content for their region. Although Googlebot frequently crawls from U.S.-based IP addresses, it also uses geo-distributed configurations, particularly for locale-adaptive or multilingual sites.

On your staging site, you may find that not all of the languages are represented, or perhaps there is a different localization process than what exists on production. This brings us back to the first point of needing the staging site to be as comparable to the production site as possible.

If it isn’t, in particular for localization elements, these need to be at the top of your post-deployment checks.

Benchmark Current Production Performance

Keep in mind that your staging site may well be on a less performant server. This means that when conducting speed tests on staging, the results might be worse than if the tests were run on production. This can limit your ability to run meaningful checks before deployment.

To work around this, make sure to benchmark performance on production so that you can run the tests again quickly after deployment. This will mean waiting until the changes have gone live, but may be the only way to get an accurate understanding of areas like page load speed in situations where the staging server just isn’t as good as the production one.

Test For Edge Cases

Developers will try to break their code when testing it; we should too. When testing your staging site before deployment, run it through some edge cases. In practice, this means thinking of scenarios that, although unlikely, are possible. For example:

  • I am visiting the website from the U.S., but my language is set to French. What language are the meta tags in?
  • I am viewing the website on a mobile device but have the viewport set to desktop. What content am I able to access that I couldn’t on mobile otherwise?
  • If I turn JavaScript off, can I still use the menu drop-downs?

Test For Previously Known Issues

Make sure previous issues haven’t been reintroduced into the code during the most recent work. Even if the mass deployment is for a small area, such as a new meta title template being rolled out, that’s not to say issues aren’t being reintroduced elsewhere.

Don’t test only for the item being changed; check across critical SEO areas as well. In particular, if work has been done recently to improve pages on the site, check that those improvements will still be in place after this latest deployment.

Equally, if there are known bugs that have affected your SEO performance in the past, check for these even if the deployment isn’t related to them. It’s easy for bugs to sneak back into code, especially if they have been there before.


Featured Image: Paulo Bobita/Search Engine Journal

More Organic Search Traffic, More Ad Revenue: 4 Publishing Workflow Fixes That Bring Both

This post was sponsored by WP Engine. The opinions expressed in this article are the sponsor’s own.

Why are we missing the SERP window on breaking stories we should be winning?
How are smaller outlets ranking faster than us on the same news?
Why is our ad stack tanking Core Web Vitals on our highest-traffic pages?

In most large newsrooms, the answer traces back to the same culprit: a fragile, patchwork legacy CMS held together with ad-hoc plugins. For SEO and growth teams, that’s a direct hit to organic search traffic and ad revenue.
Below are four publishing workflow fixes that move both metrics in the same direction.

The 4 Publishing Pillars That Improve SEO & Monetization

To stop paying the “Fragmentation Tax,” the hidden cost of a patchwork stack, media organizations are moving away from treating their workflows as a collection of disparate parts. Instead, they are adopting a unified system that eliminates the friction between engineering, editorial, and growth.

A modern publishing standard addresses these marketing hurdles through four key operational pillars:

Pillar 1: Automated Governance (Built-In SEO & Tracking Integrity)

Marketing integrity relies on consistency.

In a fragmented system, SEO metadata, tracking pixels, and brand standards are often managed manually, leading to human error.

A unified approach embeds governance directly into the workflow.

By using automated checklists, organizations ensure that no article goes live until it meets defined standards, protecting the brand and ensuring every piece of content is optimized for discovery from the moment of publication.

Pillar 2: Fearless Iteration (Continuous SEO & CRO Optimization Without Risk)

High-traffic articles are a marketer’s most valuable asset. However, in a legacy stack, updating a live story to include, for instance, a Call-to-Action (CTA), is often a high-risk maneuver that could break site layouts.

A modern unified approach allows for “staged” edits, enabling teams to draft and review iterations on live content without forcing those changes live immediately. This allows for a continuous improvement cycle that protects the user experience and site uptime.

Pillar 3: Cross-Functional Collaboration (Reducing Workflow Bottlenecks Between Editorial, SEO & Engineering)

Any type of technology disruption requires a team to collaborate in real time. The “sticky-taped” approach often forces teams to work in separate tools, creating bottlenecks.

A modern unified standard utilizes collaborative editing, separating editorial functions into distinct areas for text, media, and metadata. This allows an SEO specialist or a growth marketer to optimize a story simultaneously with the journalist, ensuring the content is “market-ready” the instant it’s finished.

Pillar 4: Native Breaking News Capabilities (Capturing Real-Time Search Demand)

Late-breaking or real-time events, such as global geopolitical shifts or live sports, require in-the-moment storytelling to keep audiences informed, engaged, and on-site. Traditionally, “Live Blogs” relied on clunky third-party embeds that fragmented user data and slowed page loads.

A unified standard treats breaking news as a native capability, enabling rapid-fire updates that keep the audience glued to the brand’s own domain, maximizing ad impressions and subscription opportunities.

If those are things you’ve explored changing, it may be time to examine your own Fragmentation Tax, and why a new publishing standard is required to reclaim growth.

Stop Paying The Fragmentation Tax: How A Siloed CMS, Disconnected Data & Tech Debt Are Costing You Growth

The Fragmentation Tax is the hidden cost of operational inefficiency. It drains budgets, burns out teams, and stunts the ability to scale. For digital marketing and growth leads, this tax is paid in three distinct “currencies”:

1. Siloed Data & Strategic Blindness.

When your ad server, subscriber database, and content tools exist as siloed work streams, you lose the ability to see the full picture of the reader’s journey.

Without integrated attribution, marketers are forced to make strategic pivots based on vanity metrics like generic pageviews rather than true business intelligence, such as conversion funnels or long-term reader retention.

2. The Editorial Velocity Gap.

In the era of breaking news, being second is often the same as being last. If an editorial team is forced into complex, manual workflows because of a fragmented tech stack, content reaches the market too late to capture peak search volume or social trends. This friction creates a culture of caution precisely when marketing needs a culture of velocity to capture organic traffic.

3. Tech Debt vs. Innovation.

Tech debt is the future cost of rework created by choosing “quick-and-dirty” solutions. This is a silent killer of marketing budgets. Every hour an engineering team spends fixing plugin conflicts or managing security fires caused by a cobbled-together infrastructure is an hour stolen from innovation.

Conclusion: Trading Toil For Agility

Ultimately, shifting to a unified standard is about reducing inefficiencies caused by “fighting the tools.” By removing the technical toil that typically hides insights in siloed tools, media organizations can finally trade operational friction for strategic agility.

When your site’s foundation is solid and fast, editors can hit “publish” without worrying about things breaking. At the same time, marketers can test new ways to grow the audience without waiting weeks for developers to update code. This setup clears the way for everyone to move faster and focus on what actually matters: telling great stories and connecting with readers.

The era of stitching software together with “sticky tape” is over. For modern media companies to thrive amid constant digital disruption, infrastructure must be a launchpad, not a hindrance. By eliminating the Fragmentation Tax, marketing leaders can finally stop surviving and start growing.

Jason Konen is director of product management at WP Engine, a global web enablement company that empowers companies and agencies of all sizes to build, power, manage, and optimize their WordPress® websites and applications with confidence.

Image Credits

Featured Image: Image by WP Engine. Used with permission.

In-Post Images: Image by WP Engine. Used with permission.

SERP FAQ Removal & New Data Challenge Schema’s AI Search Value via @sejournal, @MattGSouthern

Schema markup had a rough week. Google ended FAQ rich results. Four days later, Ahrefs published a report finding that adding JSON-LD didn’t produce a clear citation lift across Google AI Overviews, AI Mode, or ChatGPT.

These developments weaken two common pitches for schema markup: increased SERP visibility and potential AI citation gains. This article examines their implications and what the data indicates about schema’s future.

Google’s Visible Schema Rewards Have Been Narrowing For Years

Google has been pulling back visible Search rewards tied to specific structured data types since 2023. Google restricted FAQ rich results to authoritative government and health sites, and HowTo rich results were limited to desktop and later deprecated.

In 2025, Google announced the retirement of several structured data features, including Course Info, Claim Review, and Estimated Salary. Book Actions was initially included but later carved out after Google removed its deprecation banner. Google described the remaining retired features as “not commonly used in Search” and as no longer providing value to users.

In 2026, Practice Problem structured data was deprecated. John Mueller noted on Reddit that “markup types come and go, but a precious few you should hold on to.”

The pattern is that visible structured data rewards have disappeared after becoming familiar SEO tactics. The markup itself stays valid, but the rich result doesn’t. Google doesn’t always describe these removals as responses to overuse, but the pattern offers less reason to treat any single markup type as a durable strategy.

These recent updates differ because the evidence for one proposed replacement value also weakened. The “GEO” advisory space claims schema boosts AI citations, and Ahrefs data tested part of that.

What The Ahrefs Report Found

Ahrefs tracked 1,885 web pages that added JSON-LD schema. Each page was matched against control pages that never added schema. Citation changes were measured across Google AI Overviews, AI Mode, and ChatGPT.

The results were flat. Google AI Mode showed +2.4%, ChatGPT showed +2.2%, and Google AI Overviews showed -4.6%.

The first two were too small to tell apart from random variation. The AI Overviews decline was statistically significant, but Ahrefs said it can’t confidently attribute that to schema.

Every page in the dataset already had more than 100 AI Overview citations before any schema was added. These pages were already being crawled and cited.

Ahrefs acknowledged that for pages not yet visible to AI, schema might still help with crawling, parsing, or indexing. But their data can’t confirm that.

Gianluca Fiorelli, a strategic SEO consultant, called the study “one of the more honest pieces of research to come out of the AI Search space in 2026.” But he argued the scope was narrower than the headline suggested. He compared it to “testing whether adding a label to a bottle already on the supermarket shelf makes customers pick it up more often.”

Ahrefs also cited a searchVIU experiment that found five AI systems relied on visible HTML during direct page retrieval and did not use hidden JSON-LD, Microdata, or RDFa. That finding covers one stage of the pipeline. It does not rule out schema playing a role earlier in indexing or entity understanding.

Ryan Law, Ahrefs’ director of content marketing, summarized the finding on LinkedIn.

“Does adding schema markup help your pages get cited in AI search? Probably not,” he wrote. He added that schema is “probably not some magic fix for improving your AI citations.”

The Practitioner Debate

Both updates land in the middle of an active argument about schema and GEO.

Roughly 168,000 pages use the phrase “FAQ schema is critical for GEO,” according to search results that Lily Ray, VP of SEO and AI Search at Amsive, flagged on LinkedIn. She called the trend familiar.

“Anything that can be spammed in SEO, will be spammed,” Ray wrote. She’d warned about this in a 2019 Moz article when FAQ schema first launched, and described Google’s FAQ removal as the same cycle repeating.

Ray hedged throughout her post, calling it “putting on my tin foil hat” and “just an idea.” But the pattern she described is the same one visible in the timeline above. A useful markup type gets scaled as a tactic, Google pulls the reward, and the industry moves on to the next one.

Joost de Valk, founder of Yoast, made the connection explicit in a blog post. “The GEO industry is replaying early SEO, just faster,” de Valk said. “And the FAQ schema deprecation is the first concrete proof point that the cycle is back on.”

He also filed a Schema.org proposal for a new FAQSection type to address what he sees as the structural problem, separating “this page has an FAQ section” from “this page IS an FAQ.”

The frustration was sharpest from practitioners who’d been watching the GEO playbook harden around schema as its most concrete recommendation. Mark Williams-Cook, director at Candour and founder of AlsoAsked, shared the Ahrefs report on LinkedIn.

“GEO bros are selling snake oil with schema to boost citations, but people like Gianluca Fiorelli are talking sense,” he posted.

Marie Haynes, founder of Marie Haynes Consulting, commented on Ray’s post with a different theory altogether.

“My theory is that Google needed our FAQs to train AI so they gave us incentive to add them (aka rich results.) And now they don’t need them anymore,” she wrote. The theory is unconfirmed by any primary source, but it shows how far the speculation has traveled.

Some practitioners pushed back on the gloomier readings. Google’s broader guidance still presents structured data as a way to make page information machine-readable, and at a 2025 Search Central Live event in Madrid, the Search Relations team told practitioners that supported structured data types are still worth using.

What The Data Can’t Answer Yet

Whether schema helps pages that aren’t yet being cited is a separate question that the data can’t answer, because every page already had more than 100 AI Overview citations before schema was added.

The test also pooled all schema types together. Article, FAQ, Product, HowTo, and Organization were all treated as one category. Type-specific effects haven’t been isolated, and they could look different.

The 30-day measurement window may miss slower effects. On live websites, schema changes can also overlap with other page changes, making it hard to separate what schema did from what changed around it. The report examined only schema present in the page’s HTML, not schema injected via JavaScript, which AI crawlers treat differently.

Ahrefs measured Google AI Overviews, AI Mode, and ChatGPT. Whether Bing, Copilot, Perplexity, Claude, or other answer systems treat schema differently from the systems Ahrefs measured is an open question.

Google’s FAQ deprecation notice says the company will continue using FAQ structured data to “better understand” pages. What that produces in measurable terms is unclear. The same uncertainty applies to whether schema affects citations indirectly, through eligibility, entity understanding, or source selection, rather than during the direct retrieval that searchVIU tested.

Nobody has published data that isolates that path.

Why This Matters

The Ahrefs data gives no measured reason to add JSON-LD in the expectation of short-term AI citation gains for pages already visible in AI Overviews. The trickier question is what to do with schema strategies more broadly.

Product, Review, Event, Video, and some other structured data types still support active rich result features. Organization, Person, and Article markup can still help describe entities and content, even when the payoff is less visible.

A blanket “schema doesn’t work” reading overstates what the data showed, because the test pooled all types and measured only one outcome. What the data does challenge is a specific sales pitch.

“Add schema to boost AI citations” has been one of the more concrete recommendations in GEO guides. For example, Frase.io called schema markup “critically important for AI search, GEO, and AEO.”

Without data support for that claim, it’s harder to justify the investment. AI systems in searchVIU’s test relied on visible HTML during retrieval, not JSON-LD. That suggests content structure, clear headings, and direct answers in prose may matter more for AI citation than markup structure.

Looking Ahead

The question hanging over the SEO industry is where schema creates measurable value. Adding JSON-LD didn’t measurably increase AI citations for pages already visible in AI Overviews.

For those pages, schema looks more like plumbing that serves other systems than a lever that moves citation counts. That’s still real value, but it’s a different pitch.


Featured Image: BEST-BACKGROUNDS/Shutterstock

The Tech SEO Audit for the AI Search Era: How to Maximize Your AI Visibility via @sejournal, @JetOctopus

This post was sponsored by JetOctopus. The opinions expressed in this article are the sponsor’s own.

How do I optimize my site for ChatGPT and Perplexity, not just Google?

How do I know if AI bots are actually crawling my site?

How should my technical SEO strategy change for AI Search?

A significant portion of your site’s search impressions in 2026 are generated by machines researching on behalf of humans.

Those machines don’t care about your keyword rankings. They care whether your:

  • HTML loads cleanly in under 200 milliseconds
  • Product detail page is reachable in fewer than four clicks
  • Content answers a specific, nine-word question that has never appeared in any keyword research tool in your career.

This isn’t speculation. It’s what our server log data across hundreds of enterprise websites is showing us, consistently, since mid-2025.

What’s Actually Happening On Your Site

My colleague, Stan, flagged a pattern in a Slack message: query lengths were growing at rates that didn’t correlate with human behavior.

A 161% growth rate in 10-word queries year-over-year is not driven by users who suddenly got more verbose. It’s driven by AI agents decomposing a single user prompt into dozens of parallel sub-queries, a process researchers now call “fan-out.”

Query Length Growth in 2025

Image created by JetOctopus, Aggregated GSC data across hundreds of enterprise properties, 2025

The gradient is the tell. Human search behavior doesn’t scale this cleanly by word count. Machines do. By October 2025, 7-plus-word queries reached nearly 1% of total query volume, roughly triple their historical share.

More revealing than the volume is the CTR. While impression counts for 10-word queries spiked 161%, click-through rate collapsed to 2.26%, down from 8–11% in 2023.

The AI reads your page, extracts the answer, synthesizes it for the user. Your site never gets the visit.

We call these “phantom impressions.” They’re real signals that your content is being evaluated inside AI reasoning chains. If you’re filtering them out of your reporting because they don’t drive traffic, you are flying blind.

The Three Bots Visiting Your Site & Their Impact On SERP Visibility

Not all AI crawlers are equal, and treating them as a single category is the first mistake most technical SEOs make.

Training bots crawl broadly and ignore click depth. A training visit means the AI knows your content exists, not that users will ever see it.

AI search bots drop off quickly beyond two or three clicks from the homepage and typically visit pages at that depth only about once a month.

AI user bots are initiated when a real person asks a question in ChatGPT, Perplexity, or Claude, and the AI researches the answer on their behalf. These are the only visits that translate to actual AI visibility.

Bot Type | What Triggers It | Crawl Depth | Impact on AI Visibility
Training bots | Model education cycles | Deep (ignores click distance) | None directly. Awareness only.
AI search bots | New URL discovery & fresh content | Shallow (~1 visit/month beyond 2–3 clicks) | Critical gatekeeper. If it misses a page, user bots won’t find it either.
AI user bots | Real user query in ChatGPT / Claude / Perplexity | Selective (driven by speed and structure) | High. Closest proxy to an AI impression.

Your site can receive heavy crawling from training and search bots and still be completely absent from AI-generated answers. If you’re not segmenting AI bot traffic by type in your log analysis, you have no idea which third of the iceberg you’re measuring.

Which SEO Signals Do LLMs Respect?

Robots.txt is your primary lever.

Most major AI platforms (ChatGPT, Claude, Gemini) follow robots.txt directives. Perplexity is a partial exception: PerplexityBot respects robots.txt, but Perplexity-User, the user-triggered bot, does not. Cloudflare confirmed this in an investigation. Most sites haven’t audited their robots.txt with AI access in mind. Do it.

Sitemaps are broadly supported.

ChatGPT, Claude, and PerplexityBot all use XML sitemaps for URL discovery. Keep them accurate.

Signals Best Saved For SEO & Ranking Efforts

The signals below don’t appear to impact AI visibility, but they are still key to ranking for queries that trigger traditional SERPs.

Canonical tags and noindex directives do nothing for AI bots.

AI crawlers don’t build a search index, so they have no use for these meta-signals. Content hidden from Google using noindex is fully visible to ChatGPT’s crawler.

llms.txt does nothing.

Our log data shows major AI bots don’t read this file. Don’t invest time here.

JavaScript rendering is a critical blind spot.

Most AI crawlers (ChatGPT, Claude, Perplexity) don’t render JavaScript. If your product pages load key content client-side, those agents read an empty shell. Server-side rendering is the only architecture that works universally. The exception is Google Gemini, which uses the same Web Rendering Service as Googlebot.

How To Make Sure ChatGPT, Perplexity & LLMs Can Reach Your Content

AI search bots visit deep pages roughly once a month and drop off sharply beyond three clicks from the homepage. The pages with the most specific, answerable information are often the hardest for agents to reach.

The fix: Elevate your most valuable deep pages through internal linking, ensuring they’re reachable within three clicks.
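One way to see which pages sit too deep is a breadth-first walk over your internal link graph. The sketch below is illustrative only: the `SITE` link map, its URLs, and the `max_clicks=3` threshold are assumptions for the example (the threshold mirrors the drop-off described above), and in practice the link map would come from a crawl export.

```python
from collections import deque

# Hypothetical internal link graph: page -> pages it links to.
SITE = {
    "/": ["/category"],
    "/category": ["/category/sub"],
    "/category/sub": ["/category/sub/list"],
    "/category/sub/list": ["/product-50000"],
    "/orphan": [],  # no page links here
}

def click_depths(links, home):
    """Breadth-first search from the homepage: {page: clicks from home}."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:  # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

def pages_beyond(links, home, max_clicks=3):
    """Pages deeper than max_clicks, or unreachable (depth None)."""
    depths = click_depths(links, home)
    all_pages = set(links) | {t for targets in links.values() for t in targets}
    return {
        page: depths.get(page)
        for page in sorted(all_pages)
        if depths.get(page) is None or depths[page] > max_clicks
    }
```

Calling `pages_beyond(SITE, "/")` on this sample flags `/product-50000` (four clicks deep) and `/orphan` (unreachable), which are the kinds of pages internal linking would need to lift.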

Pages crawled by training bots but never reached by user bots are your highest-priority targets. Pages AI user bots visit frequently are telling you what to scale: more content covering the same topic cluster and depth.

Optimize Content For Longer, Fan-Out Queries

95% of the queries driving AI citations have zero monthly search volume. They’re synthetic sub-queries generated by AI models. But they show up in GSC: impressions, no clicks, query lengths you’d never target voluntarily.

How To Find Fan-Out Query Opportunities

To surface fan-out queries that are worth chasing, connect your GSC API to JetOctopus (to bypass the 1,000-row UI limit) and filter for: query length greater than 7 words, impressions under 50, clicks at 0, over the last 3 months. That’s your Fan-Out Opportunity Matrix, the exact questions AI agents are asking about your content.
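If you are working from a raw GSC export instead, the same filter can be approximated in a few lines. This is a rough sketch, not the JetOctopus implementation: the row shape (`query`, `impressions`, `clicks`) and the sample queries are assumptions, and the rows are assumed to already cover the last three months.

```python
# Hypothetical rows from a GSC export covering the last 3 months.
ROWS = [
    {"query": "best trail running shoes for wide feet under 100 dollars",
     "impressions": 12, "clicks": 0},
    {"query": "running shoes", "impressions": 4000, "clicks": 310},
    {"query": "which trail running shoes are best for muddy winter conditions",
     "impressions": 75, "clicks": 0},
    {"query": "are waterproof trail shoes worth it for summer hiking",
     "impressions": 8, "clicks": 1},
    {"query": "trail running shoes for wide flat feet",
     "impressions": 5, "clicks": 0},
]

def fan_out_candidates(rows, min_words=8, max_impressions=50):
    """Keep queries longer than 7 words with under 50 impressions and 0 clicks."""
    return [
        row for row in rows
        if len(row["query"].split()) >= min_words   # "greater than 7 words"
        and row["impressions"] < max_impressions    # "impressions under 50"
        and row["clicks"] == 0                      # "clicks at 0"
    ]
```

On the sample rows, only the first query passes all three conditions; the others fail on length, impressions, or clicks.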

Prompt Types That Fan Out Most

Image created by JetOctopus, 2025

If your content isn’t structured to answer list and comparison queries, with explicit rankings, pros/cons, and side-by-side specs, you’re leaving the highest fan-out surface area unoptimized.

“Product review” intent queries surged from 239 in June 2025 to over 40,000 by September 2025. That 16,000% increase was AI agents systematically harvesting structured opinion data. If your product pages lack this depth, you’re invisible to that harvest.

The Technical Audit: Where to Start

Step 1: Identify AI User Bot Traffic In Logs

Pull raw server logs (Apache/Nginx) and export all lines containing these user agents: OAI-SearchBot and ChatGPT-User, PerplexityBot and Perplexity-User, Claude-SearchBot and Claude-User. Then manually group hits by user-agent patterns and endpoints in a spreadsheet. To distinguish training bots from user bots, you’ll need to maintain your own classification list — one that changes often and isn’t standardized.
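For teams doing this by hand, a small script can replace the spreadsheet step. The sketch below assumes logs in the common Apache/Nginx "combined" format and uses simplified, illustrative user-agent strings. The search and user bot tokens are the ones listed above; the training bot tokens (`GPTBot`, `ClaudeBot`) are an assumption added for the example, and any such list needs regular maintenance.

```python
import re
from collections import Counter, defaultdict

# Search and user bot tokens come from the text above.
# Training bot tokens are an assumption; maintain your own list.
BOT_TYPES = {
    "OAI-SearchBot": "search",
    "PerplexityBot": "search",
    "Claude-SearchBot": "search",
    "ChatGPT-User": "user",
    "Perplexity-User": "user",
    "Claude-User": "user",
    "GPTBot": "training",     # assumption
    "ClaudeBot": "training",  # assumption
}

# Combined log format: "METHOD path protocol" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def classify(user_agent):
    """Return 'training', 'search', 'user', or None for non-AI traffic."""
    ua = user_agent.lower()
    for token, bot_type in BOT_TYPES.items():
        if token.lower() in ua:
            return bot_type
    return None

def hits_by_type(lines):
    """Count hits per path for each AI bot type."""
    hits = defaultdict(Counter)
    for line in lines:
        match = LOG_LINE.search(line)
        if not match:
            continue
        bot_type = classify(match.group("ua"))
        if bot_type:
            hits[bot_type][match.group("path")] += 1
    return hits

def training_only_pages(hits):
    """Pages training bots hit but user bots never reached."""
    return sorted(set(hits.get("training", {})) - set(hits.get("user", {})))

# Illustrative log lines with simplified user agents.
SAMPLE_LOG = [
    '203.0.113.5 - - [12/Oct/2025:10:01:22 +0000] "GET /products/widget-a HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
    '203.0.113.5 - - [12/Oct/2025:10:01:25 +0000] "GET /products/widget-b HTTP/1.1" 200 4870 "-" "Mozilla/5.0 (compatible; GPTBot/1.2)"',
    '198.51.100.7 - - [12/Oct/2025:11:15:02 +0000] "GET /products/widget-a HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; ChatGPT-User/1.0)"',
    '198.51.100.9 - - [12/Oct/2025:11:20:41 +0000] "GET /products/widget-a HTTP/1.1" 200 5120 "-" "Mozilla/5.0 (compatible; OAI-SearchBot/1.0)"',
    '192.0.2.44 - - [12/Oct/2025:12:00:00 +0000] "GET /products/widget-b HTTP/1.1" 200 4870 "-" "Mozilla/5.0 (Windows NT 10.0) Chrome/129.0"',
]
```

Running `training_only_pages(hits_by_type(SAMPLE_LOG))` on the sample returns `/products/widget-b`, the page a training bot crawled but no user bot reached.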

In JetOctopus Log Analyzer, this segmentation is built in: filter by bot type (training, search, and user) in a few clicks and immediately see which pages AI user bots visit (your AI-visible content, ready to scale) versus pages training bots hit but user bots never reach (your highest-priority fix targets).

Step 2: Audit Technical Accessibility Of Deep Pages

Pick a sample of deep URLs and run four checks: measure HTML payload size, view the raw HTML to confirm key content isn’t injected via JavaScript, simulate crawl depth by counting clicks from the homepage, and test load time in Chrome DevTools or Lighthouse. Also check whether important content sits behind accordions or “View More” elements, which require JavaScript execution that AI bots skip entirely. AI agents don’t click. If information only appears after user interaction, it doesn’t exist for these crawlers. For large sites with thousands of deep pages, this sampling approach misses a lot.
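The raw-HTML check can be scripted once you have saved the server response for each sampled URL. The following is a minimal sketch using only the Python standard library. The two HTML snippets and the phrase being searched for are made up for illustration, and the check only tells you whether text is present in the markup outside `<script>` and `<style>`, not whether CSS hides it.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collects text nodes, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self._skip = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.chunks.append(data)

def raw_html_contains(html, phrase):
    """True if phrase appears in the markup's text without running JavaScript."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())  # normalize whitespace
    return phrase.lower() in text.lower()

# Hypothetical responses for the same product page.
SERVER_RENDERED = "<html><body><h1>Widget A</h1><p>Battery life: 12 hours</p></body></html>"
CLIENT_RENDERED = (
    '<html><body><div id="app"></div>'
    '<script>render({"spec": "Battery life: 12 hours"})</script></body></html>'
)
```

Here `raw_html_contains(SERVER_RENDERED, "Battery life: 12 hours")` is true, while the same call on `CLIENT_RENDERED` is false because the phrase exists only inside a script that a non-rendering crawler would never execute.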

Step 3: Clean Up Your Robots.txt

Open your robots.txt and review all Disallow and Allow directives for every user-agent, line by line. AI bots follow Disallow rules, so make sure you’re not accidentally blocking important URLs. Manually test key URLs to confirm they aren’t blocked. A 30-minute audit here can prevent you from blocking crawlers you want in, or exposing content you’d rather keep out.
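Testing key URLs can be partly automated with Python's built-in `urllib.robotparser`. This is a sketch under assumptions: the robots.txt content and the example.com URLs are invented, and Python's parser may resolve conflicting rules differently from any particular crawler. It reports what the rules say, not whether a bot actually honors them.

```python
from urllib.robotparser import RobotFileParser

# User agents named earlier in this article.
AI_AGENTS = [
    "OAI-SearchBot", "ChatGPT-User",
    "PerplexityBot", "Perplexity-User",
    "Claude-SearchBot", "Claude-User",
]

# Hypothetical robots.txt with an accidental block on product pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/

User-agent: OAI-SearchBot
Disallow: /products/
"""

KEY_URLS = [
    "https://example.com/products/widget-a",
    "https://example.com/cart/checkout",
]

def blocked_pairs(robots_txt, urls, agents=AI_AGENTS):
    """Return (agent, url) pairs that the rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        (agent, url)
        for agent in agents
        for url in urls
        if not parser.can_fetch(agent, url)
    ]
```

On the sample file, `blocked_pairs(ROBOTS_TXT, KEY_URLS)` shows OAI-SearchBot blocked from the product page, which is the kind of accidental rule this audit is meant to catch, while the other agents are blocked only from the cart.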

Step 4: Map Your Phantom Impressions

Export data from GSC Performance reports filtered by impressions with zero clicks. Because of the 1,000-row UI limit, you’ll need to use the GSC API or export in chunks by date and query, then merge datasets in spreadsheets or BigQuery. Also factor in query frequency: long queries appearing daily are likely not fan-outs.
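The merge-and-filter step can be sketched as follows, assuming each exported chunk is a list of rows with `date`, `query`, `impressions`, and `clicks` fields. The sample data and the `max_days=3` cutoff for "appearing daily" are assumptions for illustration; tune the cutoff to your own export window.

```python
from collections import defaultdict

def merge_chunks(chunks):
    """Combine chunked exports into per-query totals, skipping duplicate rows."""
    totals = defaultdict(lambda: {"impressions": 0, "clicks": 0, "days": set()})
    seen = set()
    for chunk in chunks:
        for row in chunk:
            key = (row["date"], row["query"])
            if key in seen:  # overlapping exports repeat the same row
                continue
            seen.add(key)
            total = totals[row["query"]]
            total["impressions"] += row["impressions"]
            total["clicks"] += row["clicks"]
            total["days"].add(row["date"])
    return totals

def phantom_impressions(totals, max_days=3):
    """Zero-click queries seen on few days; frequent ones are likely not fan-outs."""
    return sorted(
        query for query, total in totals.items()
        if total["clicks"] == 0 and len(total["days"]) <= max_days
    )

# Hypothetical chunked exports with one overlapping row.
LONG_ONE_OFF = "which widget battery lasts longest for overnight winter camping trips"
LONG_DAILY = "how do i reset my widget after a firmware update fails"
CHUNKS = [
    [
        {"date": "2025-10-01", "query": LONG_ONE_OFF, "impressions": 3, "clicks": 0},
        {"date": "2025-10-01", "query": "widget brand", "impressions": 500, "clicks": 40},
        {"date": "2025-10-02", "query": LONG_DAILY, "impressions": 2, "clicks": 0},
    ],
    [
        {"date": "2025-10-02", "query": LONG_DAILY, "impressions": 2, "clicks": 0},
        {"date": "2025-10-03", "query": LONG_DAILY, "impressions": 2, "clicks": 0},
        {"date": "2025-10-04", "query": LONG_DAILY, "impressions": 2, "clicks": 0},
        {"date": "2025-10-05", "query": LONG_DAILY, "impressions": 2, "clicks": 0},
    ],
]
```

With this sample, `phantom_impressions(merge_chunks(CHUNKS))` keeps only the one-off long query. The query that shows up on four separate days is dropped as a likely recurring human search, and the branded query is dropped because it earned clicks.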

Connect your GSC API to JetOctopus to bypass the row limit and build your Fan-Out Opportunity Matrix automatically — the exact questions AI agents are asking about your content, ready to act on.

Step 5: Monitor The Changes

Set up a recurring export process: pull GSC data monthly and compare impressions over time, re-run log analysis scripts and diff bot activity, and track Core Web Vitals separately in PageSpeed Insights or CrUX. You’ll end up stitching together multiple data sources with no unified alerting, making it hard to catch regressions early.
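The "diff bot activity" part of that manual process might look something like the sketch below. It assumes you already have per-page AI user bot hit counts for two periods; the page paths, counts, and the 50% drop threshold are all hypothetical.

```python
def regressions(previous, current, drop_threshold=0.5):
    """Pages whose hits fell by at least drop_threshold (a fraction) between periods."""
    flagged = {}
    for path, before in previous.items():
        after = current.get(path, 0)  # missing page means no hits this period
        if before > 0 and (before - after) / before >= drop_threshold:
            flagged[path] = (before, after)
    return flagged

# Hypothetical AI user bot hits per page for two consecutive months.
SEPTEMBER = {"/guides/sizing": 40, "/products/widget-a": 10, "/products/widget-c": 8}
OCTOBER = {"/guides/sizing": 38, "/products/widget-a": 4, "/products/widget-d": 5}
```

`regressions(SEPTEMBER, OCTOBER)` flags `/products/widget-a` (10 hits down to 4) and `/products/widget-c` (8 down to 0), while the small dip on `/guides/sizing` stays under the threshold.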

JetOctopus Alerts covers exactly this: unified notifications for changes in AI bot activity alongside Googlebot behavior, Core Web Vitals, on-page SEO issues, and SERP efficiency drops, so you catch regressions before they compound.

The New KPI: Technical Accessibility

SEO in 2026 is restructuring around one constraint: can an AI agent crawl, reach, and extract a fact from your 50,000th product page in under 200 milliseconds?

If the answer is no, your rankings, backlinks, and content quality become irrelevant for a growing share of search interactions. The machines are searching. The question is how quickly you can see what’s actually happening.

Start with your logs. Everything else follows from there.

Want to see exactly how AI bots are interacting with your site: which pages they reach, which they skip, and where your fan-out opportunities are hiding? Book a live walkthrough of the JetOctopus platform. We’ll pull your actual log data and show you what your GSC reports aren’t telling you.

Image Credits

Featured Image: Image by JetOctopus. Used with permission.

Schema Markup Didn’t Move AI Citations In Ahrefs Test via @sejournal, @MattGSouthern

Schema markup is far more common on pages cited by AI. But a new Ahrefs report found that adding it didn’t result in a clear increase in citations.

Ahrefs tracked 1,885 web pages that added JSON-LD schema. Each page was matched against control pages that never added schema, and citation changes were measured across Google AI Overviews, AI Mode, and ChatGPT.

No platform showed a meaningful citation increase after schema was added.

What Ahrefs Found

The report analyzed 6 million URLs and found that pages cited by AI were roughly three times more likely to include JSON-LD. This gap has been seen as evidence that schema improves AI visibility. However, Ahrefs tested whether this held true when isolated from other signals, since sites with schema tend to invest in better content and earn more links.

They ran a controlled comparison, matching each schema page with three control pages from different domains with similar citation levels that never added JSON-LD. Citation changes were measured over the 30 days before and after schema was added.

Using its Brand Radar tool and Agent A, Ahrefs conducted a matched difference-in-differences analysis to account for platform trends. Here’s what was found.

  • Google AI Overviews: −4.6% (a small but statistically notable decline relative to controls)
  • Google AI Mode: +2.4% (too small to distinguish from random variation)
  • ChatGPT: +2.2% (too small to distinguish from random variation)

Three more tests were run alongside the primary comparison, and all four found no clear positive or negative effect.
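For readers unfamiliar with the method, a matched difference-in-differences comparison subtracts the change seen in control pages from the change seen in treated pages, so that platform-wide trends cancel out. The sketch below shows the basic arithmetic only. The citation counts are invented, and the averaging scheme is an assumption rather than Ahrefs' actual procedure.

```python
from statistics import mean

def did_estimate(matches):
    """Average of (treated change - mean control change) across matched groups.

    Each match is (treated_before, treated_after, [(control_before, control_after), ...]).
    """
    effects = []
    for treated_before, treated_after, controls in matches:
        treated_change = treated_after - treated_before
        control_change = mean(after - before for before, after in controls)
        effects.append(treated_change - control_change)
    return mean(effects)

# Hypothetical daily citation counts: one schema page matched to three controls.
MATCHES = [
    (300, 270, [(310, 290), (295, 280), (305, 280)]),  # treated -30, controls average -20
    (200, 190, [(210, 200), (190, 185), (205, 190)]),  # treated -10, controls average -10
]
```

In this made-up sample every page is declining, but the first treated page falls 10 citations more than its controls and the second falls by the same amount, so `did_estimate(MATCHES)` comes out at -5 citations per page attributable to the treatment period.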

The AI Overview Decline

The −4.6% decline in AI Overview citations deserves context. Ahrefs reports that both treated and control pages were already declining before schema was added. Treated pages declined slightly faster, but the difference is small: about 12 fewer daily citations per page in a sample where most pages received hundreds.

The report notes that the decline could reflect a small negative effect from schema, or it could be coincidence. It doesn’t draw a conclusion either way.

What The Report Doesn’t Cover

Every page in the dataset had 100+ AI Overview citations before any schema was added. These pages were already in the consideration set, being crawled and surfaced.

The report admits this limitation. For pages not yet visible to AI, schema might still aid crawling, parsing, or indexing, but the data can’t confirm this.

The report also notes other limitations. Pages adding JSON-LD often change other elements, making it hard to separate schema effects from those changes. All schema types were pooled, so some might perform differently. The 30-day window might miss slower effects.

A searchVIU experiment cited in the report tested whether five AI systems used schema markup when fetching pages in real time. None did; they only extracted visible HTML, ignoring JSON-LD, Microdata, and RDFa. This was a direct-fetch test, so it does not establish what role, if any, schema plays during training, indexing, or retrieval.

Why This Matters

Schema markup is frequently recommended for AI visibility. However, Ahrefs’ data complicates this. While schema supports rich results and knowledge graphs, adding JSON-LD didn’t measurably increase AI citations for pages that were already being cited.

The data shows a correlation: pages with schema are cited more often by AI, but Ahrefs interprets this as a sign of overall site quality rather than schema’s direct impact.

Looking Ahead

The report can’t determine whether schema helps pages that aren’t yet cited. That is a different group of pages, and it needs another study. If pages are already visible to AI, JSON-LD probably won’t boost citations.


Featured Image: Roman Samborskyi/Shutterstock