Your AI Visibility Tracker Is Quietly Breaking Your Analytics And Your Strategy via @sejournal, @TaylorDanRW

Jan-Willem Bobbink shared a take on X: AI visibility trackers are quietly breaking the analytics of the brands that pay for them. It’s time we put more focus on this issue, as it is causing misalignment, misreporting, and misspending of resources and marketing budget in the clamor to be more visible in AI.

Screenshot from X, April 2026

Jan-Willem hits on the issue of the lack of attribution in RAG loops. When a tracker triggers a prompt, and that prompt triggers a fetch, the brand is essentially paying a tool to generate its own AI visibility, and it begins to report on itself.

This self-referential loop is often described as an ouroboros, a word you will likely see more and more in the SEO industry as we describe AI/LLMs.

Pedro Dias recently covered the ouroboros effect of AI starting to quote itself.

A large number of AI visibility tools have received significant funding in recent months, and some of them charge brands tens of thousands of dollars to “track” visibility. But this looping effect is becoming a reality, and how third-party tools track AI visibility will have a knock-on effect on brands’ analytics.

One example I point back to a lot is the drop in citations that ChatGPT produced after OpenAI released the GPT-5 model in August 2025.

A number of tools that report ChatGPT visibility saw their graphs decline, not because websites had violated spam policies or their short-termist tactics had run their course, but because the tools track citations and the model produced fewer of them. This isn’t a measure of visibility, but a rehashed version of rank tracking, and these graphs can cost vendor contracts, incorrectly inform budget spending, and create false panic (or false celebration).

The Dangers Of The Observer Effect

In physics, the observer effect states that the act of monitoring a phenomenon changes it. This is happening in real time for the SEO industry.

Most LLM trackers use a headless browser or a specialized API. When Perplexity or ChatGPT “searches” for fresh info to answer your tracker’s prompt, it doesn’t just hit your homepage; it performs a RAG fetch and can hit multiple URLs.

Because these bots often rotate IPs/proxies or use “stealth” headers to avoid being blocked by anti-scraping walls, they look like legitimate organic discovery crawls. This is how a number of rank tracking tools have operated for a number of years.

Because of this, you might report to a client, or other stakeholders, that “AI interest in our product pages is up 40%,” when in reality, 35% of that was just your own tracking tool refreshing its cache, or other tracking tools looking for you as a competitor of their brand.

AI Tracking Noise Is Worse Than Rank Tracking Noise

As Jan-Willem noted, we used to ignore rank tracker noise in Google Search Console because impressions were a “soft” metric. But log file data is hard data: it is used for infrastructure decisions, for understanding how bots access your website (server log file analysis), and now, in the age of AI, for understanding how AI platforms interact with your site.

When you present a report to your client, peers, or your chief marketing officer, you are trying to prove brand preference within a large language model. If your data is polluted by your own tracking (and other people’s tracking), you risk a “false positive” strategy.

You might double down on content that isn’t actually popular with real AI users, but is simply the content your tracking tool happens to trigger most often.

What To Do Right Now

Until a vendor builds the “Clean Log” API Jan-Willem is calling for, you have to treat log files with skepticism.

Run your tracking tools on a “quiet” staging environment or a specific set of sacrificial URLs to measure the “noise floor” created by the tool itself.

Look for specific patterns in the logs (user-agent fingerprinting) that correlate with your tool’s scan times. Even if IPs rotate, the timing often shows patterns that are easy to identify.
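As a rough illustration of the timing check, here is a minimal Python sketch that flags log hits falling inside the windows when a tracker ran its prompts. The scan times, the 10-minute window, and the sample log hits are all hypothetical placeholders; real logs would need to be parsed from your own server format first.

```python
from datetime import datetime, timedelta

# Hypothetical inputs: when the tracker ran its prompts, and parsed log hits.
SCAN_STARTS = [datetime(2026, 4, 1, 9, 0), datetime(2026, 4, 1, 15, 0)]
WINDOW = timedelta(minutes=10)  # assumed length of one tracker scan

LOG_HITS = [
    (datetime(2026, 4, 1, 9, 2), "ChatGPT-User"),
    (datetime(2026, 4, 1, 9, 7), "ChatGPT-User"),
    (datetime(2026, 4, 1, 11, 30), "ChatGPT-User"),
    (datetime(2026, 4, 1, 15, 4), "OAI-SearchBot"),
]


def in_scan_window(ts, scan_starts=SCAN_STARTS, window=WINDOW):
    """Return True if a log hit falls inside any tracker scan window."""
    return any(start <= ts < start + window for start in scan_starts)


def split_hits(hits):
    """Separate hits that coincide with tracker scans from the rest."""
    suspect = [h for h in hits if in_scan_window(h[0])]
    other = [h for h in hits if not in_scan_window(h[0])]
    return suspect, other


suspect, other = split_hits(LOG_HITS)
noise_share = len(suspect) / len(LOG_HITS)
print(f"{len(suspect)} of {len(LOG_HITS)} hits coincide with scans ({noise_share:.0%})")
```

A hit inside a scan window is only a candidate for tracker noise, not proof of it, so the share is best read as an upper bound on what your own tool may be adding.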

And stop reporting “total AI fetches” as a success metric. Focus on how often your brand is mentioned relative to competitors, which is a metric derived from the LLM output, not your server logs.

Featured Image: Master1305/Shutterstock

The 90-Day GEO Playbook for Local Search: How To Show Up When AI Does The Searching

This post was sponsored by Uberall. The opinions expressed in this article are the sponsor’s own.

Local consumers have stopped searching the way we built our marketing around.

This significant change in buyer habits has been quietly happening over the last 18 to 24 months.

According to recent Uberall research into AI search behavior, an estimated $750 billion in consumer spend is already shifting toward AI-powered search. Roughly 60% of all searches now end without a single click to a website. And in a finding that should stop every marketer cold, or at least those working for multi-location businesses, 68% of brands are missing entirely from the recommendations AI engines generate in their category.

That problem goes beyond channels. It’s a fast-moving visibility problem that risks affecting conversions and revenue.

Generative Engine Optimization (GEO) is the discipline built for this moment. Where SEO optimized pages for a ranking, GEO optimizes entities for a recommendation.

The goal is no longer just to be found in Search Engine Results Pages (SERPs). It’s to be cited, summarized, and trusted when a model answers on your customer’s behalf.

In GEO, three pillars carry the weight. If you’ve worked in SEO for any length of time, the shape will look familiar — compounding visibility isn’t new, it’s the surface that’s changed.

  • Source of truth. The basic facts about your brand (name, address, hours, services) need to match everywhere a model might look. Inconsistent signals train AI engines to trust you less.
  • Context engineering. Your content has to answer the questions customers actually ask, in the language they ask them. Of course, conversational answers should take priority over keyword clusters.
  • Orchestration. You measure citations, refresh content, and compound visibility over time.

Here is how those three pillars translate into a realistic 90-day plan teams can actually run.

Phase 1 (Week 1): Foundational Analysis

You cannot optimize what the model cannot parse. The first week is a data hygiene sprint, rather than a content sprint.

Start with the local SEO basics most teams assume are already clean:

  • Audit your NAP details (Name, Address, Phone) across Google Business Profiles, Apple Maps, Yelp, Bing Places, and the major data aggregators. Even small inconsistencies — a missing suite number, an old phone format, a rebrand that never propagated — train AI engines to treat your brand as a lower-confidence entity.
  • Check your location pages, about page, and product pages for structured data. Schema isn’t a magic AI switch — recent tests suggest LLMs largely read it like any other on-page text. What it does is reduce ambiguity about what your business is and does, and that clarity is what helps a model interpret and cite you correctly.
  • Type the questions your customers actually ask into ChatGPT, Gemini, Perplexity, and Google AI Overviews. Not branded queries – real ones like “best orthodontist near Lincoln Park,” “which EV charger works with a Ford Lightning,” “coffee shops in Berlin that allow dogs.” Note where you appear, where you don’t, and which competitors show up instead.
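The NAP audit in the first bullet can be partly automated. The sketch below is one possible approach, assuming you have already exported listing data into a dictionary; the source names, the sample business, and the normalization rules are illustrative assumptions, not a standard.

```python
import re


def normalize_phone(phone):
    """Keep digits only so formatting differences do not count as mismatches."""
    return re.sub(r"\D", "", phone)


def normalize_text(value):
    """Lowercase and collapse punctuation and whitespace into single spaces."""
    return re.sub(r"[^a-z0-9]+", " ", value.lower()).strip()


def nap_key(listing):
    """Build a comparable (name, address, phone) tuple for one listing."""
    return (
        normalize_text(listing["name"]),
        normalize_text(listing["address"]),
        normalize_phone(listing["phone"]),
    )


def find_mismatches(listings, reference):
    """Report which NAP fields differ from the reference listing, per source."""
    ref = nap_key(listings[reference])
    fields = ("name", "address", "phone")
    report = {}
    for source, listing in listings.items():
        diffs = [f for f, a, b in zip(fields, nap_key(listing), ref) if a != b]
        if diffs:
            report[source] = diffs
    return report


# Hypothetical exported listings for a single location.
LISTINGS = {
    "website": {"name": "Example Dental", "address": "12 Main St, Suite 4", "phone": "(555) 010-2000"},
    "maps_a": {"name": "Example Dental", "address": "12 Main St Suite 4", "phone": "555-010-2000"},
    "directory_b": {"name": "Example Dental", "address": "12 Main St", "phone": "555.010.2000"},
}

mismatches = find_mismatches(LISTINGS, reference="website")
print(mismatches)
```

Here punctuation and phone formatting are treated as harmless, while the missing suite number is flagged. Whether a given difference matters to a particular AI engine is not something this check can tell you.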

That gap list becomes your brief for the next 80 days. It’s also where most brands discover the blind spots they didn’t know they had.

Phase 2 (Days 7–30): Context Engineering And Targeted Content

Once you know which prompts you’re missing from, the work becomes specific. For each blind spot, you are building the content a model would actively want to cite.

A few patterns that hold up across industries:

  • One prompt, one page. If “best family dentist in Austin with Saturday hours” returns three competitors and none of your locations, build or optimize the pages that answer exactly that. Don’t bury the answer three scrolls down.
  • Write for the question, not the keyword. AI engines extract complete answers, not phrases. A well-structured FAQ with direct, factual responses often outperforms a 2,000-word, keyword-stuffed guide that dances around the point.
  • Cite yourself credibly. Include dates, local details, original data, named authors, and explicit comparisons. Models reward specificity and downgrade vague claims.

This is the phase where content that actually gets cited starts to look different from content built for the old ranking game. It is tighter, more factual, and structured around how someone would ask a question out loud.

Phase 3 (Days 30–60): Surgical Placement & Off-Page Authority

Off-page authority still matters. The economics, however, have flipped.

The instinct is to chase top-tier publishers. For GEO, that is usually the wrong move.

The sites that generative engines pull from most often aren’t always the ones with the highest domain authority. They are the ones that are relevant to your business and cited more frequently, even if they’re not huge publications.

A more effective approach:

  • Focus on sites that already rank in Google for the prompts your customers use — the kind of credible, topical sources you’d want them to find when they’re researching. Top-tier placement isn’t the goal; any authoritative site that actually serves your audience counts.
  • The publishers AI engines already cite in your category are the ones models trust enough to source from. Re-run your Phase 1 prompts, track which domains keep appearing in the citations, and that’s your shortlist.
  • Size and prestige aren’t reliable proxies for AI citation rates. A specialist publication with real topical authority in your category often earns more AI citations than a bigger, more generic name.

The goal isn’t link volume. It is being mentioned, in context, in the sources your category’s models already trust.

Phase 4 (Days 60–90): Orchestration And Compounding

By day 60, you should have new content live, citations starting to show up on publisher sites, and enough signal to measure. Phase 4 is where GEO stops being a project and starts being a system.

Three metrics worth tracking weekly:

  • AI citation rate — how often your brand is named in AI-generated answers for your priority prompts.
  • Share of Voice — your citation rate relative to competitors across the same prompt set.
  • Content decay — which cited pages are losing citations over time and need refreshing with new data, dates, or insights.
Image created by Uberall, April 2026
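To make the three metrics concrete, here is a minimal Python sketch. It assumes you have already recorded, for each priority prompt, which brands an AI answer named; the prompts, brand names, and weekly counts are invented for illustration, and share of voice is computed here as one plausible reading (your mentions divided by all brand mentions).

```python
# Hypothetical weekly results: for each priority prompt, the brands named in the answer.
ANSWERS = {
    "best family dentist in austin": ["BrandA", "BrandB"],
    "dentist with saturday hours": ["BrandB"],
    "emergency dentist near me": ["BrandA", "BrandC"],
    "kids dentist austin": [],
}


def citation_rate(answers, brand):
    """Share of prompts whose answer names the brand."""
    return sum(brand in brands for brands in answers.values()) / len(answers)


def share_of_voice(answers, brand):
    """Brand mentions divided by all brand mentions across the prompt set."""
    total = sum(len(brands) for brands in answers.values())
    mine = sum(brands.count(brand) for brands in answers.values())
    return mine / total if total else 0.0


def decaying_pages(citations_by_week, min_drop=1):
    """Pages whose latest weekly citation count fell below their first count."""
    return sorted(
        page
        for page, counts in citations_by_week.items()
        if counts[0] - counts[-1] >= min_drop
    )


# Hypothetical citation counts per page over three weeks.
WEEKLY = {"/locations/austin": [5, 4, 2], "/faq/saturday-hours": [3, 3, 4]}

rate = citation_rate(ANSWERS, "BrandA")
sov = share_of_voice(ANSWERS, "BrandA")
decay = decaying_pages(WEEKLY)
print(rate, sov, decay)
```

Because AI answers vary between runs, a single week of counts is noisy; the same prompt set would need to be re-run on a schedule before trends mean much.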

The compounding effect here is profound. Brands that treat GEO as an ongoing loop — audit, publish, place, measure, refresh — see substantially higher citations and conversion rates. A recent Search Engine Journal webinar, featuring Uberall with AthenaHQ, states that GEO-savvy brands see 2x as many citations and 3–9x higher conversion rates within 90 days compared to brands still optimizing purely for classic search.

That delta matters more than it looks. As zero-click behavior grows, the citation inside the AI answer is the conversion surface.

For a concrete example, Audika France, a multi-location hearing-care brand and Uberall customer, ran this orchestration loop as an early adopter. They used it to track how AI engines described their clinics, spot the attributes models were missing, and close the gap between visible and recommended. Their results show how one multi-location brand went from an AI blind spot to a consistent recommendation.

What To Do Next

The pattern is consistent across multiple industries, including retail and restaurants. Brands that start now build a structural advantage that is hard to unwind once the category catches up. The ones that wait end up explaining to their board a year from now why a competitor became the default recommendation in every model their customers use.

If you want a snapshot of how your locations are performing in AI search, check out our AI Visibility Grader tool. It gives you a quick view of your AI visibility and the factors shaping it.

Or if you want to take this further and get a higher definition picture of where you stand in AI search, GEO Studio’s free trial will map your brand’s presence across the major generative engines.

Local search has changed. This is how you become the default answer.


Image Credits

Featured Image: Image by Michelle Azar/Uberall. Used with permission.
In-Post Image: Image by Uberall. Used with permission.

B2B Buyers Choose A Vendor Before They Reach Out – 3 Ways To Be Visible When It Counts via @sejournal, @alexanderkesler

The fundamental question for 2026 is not how visible you are in search, but how wide the gap has grown between where you invest in discoverability and where buyers actually form their decisions.

Here is the reality: B2B buyers complete the majority of their research and form vendor preferences before your sellers can make their introductions.

Traditional SEO is a critical component of the brand discovery process, but it represents only a fraction of how buying groups validate decisions.

While SEO requires optimizing content for individual search intent (one person researching a solution), B2B purchasing works fundamentally differently. Enterprise software and service decisions are made when buying groups, averaging eleven members, reach consensus.

B2B buyers contact vendors only after completing 61% of their research. So, by the time they reach out to schedule that first demo, they have already done most of their research out of sight of client relationship managers and formed a shortlist of preferred vendors.

To earn consideration from B2B buyers as a preferred vendor in 2026, organizations ought to master this invisible buying journey and the discoverability process to out-position competitors.

In this article, I will present three tactics to help you improve the discoverability of your brand beyond SEO, helping your brand appear as a top choice for B2B buyers.

How To Make Your Brand Discoverable For B2B Buyers

SEO remains essential for organic search visibility, but buyer research extends far beyond search queries.

Buyers use AI tools to research solutions and validate findings across peer networks, review sites, technical documentation, and professional networks.

This creates a need for your B2B brand to be visible across multiple channels at once.

Your ability to establish brand confidence by enabling validation across the entire buying group, and to measure performance in these channels, is essential for securing favorable placement on B2B vendor shortlists.

3 Tactics To Increase Brand Discoverability

1. Establish Brand Confidence

Beyond traditional search, you need credibility across peer networks and review sites where buying groups conduct research.

Ensure your brand is visible across these B2B buyer research channels:

  • Search engines, answer engines, and AI tools.
  • Review sites like G2 and TrustRadius.
  • Peer networks, including Slack, Reddit, and technical forums.
  • Technical documentation sites.
  • PR coverage and Wikipedia.
  • Third-party sites, like partner and syndication networks.

Prioritize AEO And GEO

As buyers increasingly turn to AI tools to research solutions, answer engine optimization (AEO) and generative engine optimization (GEO) have become important to brand discoverability.

  • Conduct an AI visibility audit to assess brand visibility across AI platforms.
  • Track citations, identify entity recognition gaps, and monitor competitors in AI-generated responses.
  • Enhance technical infrastructure with schema markup and optimize content for large language models (LLMs).
  • Secure consistent citations through PR and vendor comparison content.
  • Use citation monitoring tools to connect AI visibility to revenue, not just impressions.

Review Platform Management

Buyers trust professional peers’ validation of a solution’s quality more than vendor claims.

  • Maintain a steady flow of authentic reviews on sites like G2 and TrustRadius through client engagement.
  • Analyze competitors’ reviews to identify gaps your products cover, then address those gaps with specific use cases and documentation.
  • Respond promptly to every client/user review. Your responses demonstrate commitment to client success and provide context for future readers evaluating similar use cases.
  • Align review content with B2B buyer journey stages. Early-stage (top of funnel) researchers need high-level product capability validation, while late-stage (bottom of funnel) evaluators need detailed implementation and integration information.

Peer Community Engagement

When practitioners recommend your solution unprompted in peer forums, you have established genuine community support.

  • Engage in peer networks like LinkedIn, Reddit, Slack channels, and technical forums to build trust through authentic contributions.
  • Track community sentiment and branded search lift to measure impact.
  • Monitor how frequently your brand appears in organic peer discussions versus competitors.

2. Enable B2B Buyers To Validate Your Solutions

Supporting buying group decision-making relies on the discoverability of evidence that aligns with the specific priorities of individual group members.

Organizations that ensure discoverability and enable validation across technical and business stakeholders earn consideration when B2B buying groups narrow their options.

Technical Decision Maker Enablement

Technical buyers test solutions themselves before talking to sales. They research how to connect systems on GitHub, solve setup problems on Stack Overflow, and review code interfaces through live documentation before contacting vendors.

Use structured data strategies and content architecture techniques to ensure resources like code guides and setup workflows are easily discoverable by AI crawlers.

Enhance discoverability by:

  • Providing resources that allow technical buyers to test things on their own time. This includes complete code guides with working examples, test environments they can use immediately, detailed security documentation, and setup workflows for common platforms.
  • Making these resources easy to find where technical buyers actually work. Maintain GitHub projects with real examples, answer questions on Stack Overflow, and publish technical content that demonstrates expertise.
  • Creating discoverable materials that cater to different teams within an organization. Operations teams need setup guides demonstrating clean code design. Engineers need system diagrams showing how your solution fits their tech setup. Security teams need security reviews and access controls validated through independent audits.
  • Implementing FAQ schema, HowTo schema, and Organization/Product markup to improve visibility for LLMs, making resources like documentation and guides more accessible during AI search.
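For the FAQ markup mentioned in the last bullet, a minimal sketch of generating Schema.org FAQPage JSON-LD in Python might look like this. The question and answer text are placeholders, and the helper name `faq_jsonld` is an assumption made for this example.

```python
import json


def faq_jsonld(pairs):
    """Build a Schema.org FAQPage object from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Placeholder content; real entries should mirror the visible FAQ on the page.
doc = faq_jsonld([
    ("Does the API support webhooks?", "Yes, webhooks are available on all plans."),
])

# The serialized string would go inside a <script type="application/ld+json"> tag.
markup = json.dumps(doc, indent=2)
print(markup)
```

Markup like this reduces ambiguity about what a page contains; it does not by itself guarantee that any AI system will cite the page.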

Business Leader Validation Frameworks

Business leaders trust proven results and return on investment over technical specifications. Ensure that validation data is discoverable and geared toward demonstrating how your solutions meet industry standards.

Provide benchmark data showing how your solution compares to industry standards, with metrics executives can confidently present to their CFO and board.

  • Commission independent research that positions your approach within broader market trends.
  • Secure placement in analyst evaluations. These third-party validations carry weight with executive buyers who need external credibility to support internal business cases.
  • Distribute insights through channels executives actually monitor: LinkedIn posts that demonstrate thought leadership on strategic challenges, webinars that address business transformation rather than product features, and board-ready presentations that translate technical capabilities into business outcomes.
  • Enhance citation authority by building backlinks and optimizing for third-party mentions. This positions your solution favorably within broader market trends, making it more discoverable and credible.

B2B Buying Group Champion Enablement Systems

Internal champions require easily discoverable resources to address objections of other stakeholders and build consensus across their buying groups.

  • Equip B2B buying group champions with resource kits that provide responses to predictable concerns:
    • Finance (ROI models and cost-benefit analyses).
    • IT (integration complexity and security requirements).
    • Security (compliance frameworks and audit readiness).
    • Operations (change management and training requirements).
    • Executive leadership (strategic alignment and competitive positioning).
  • Offer presentation templates designed for different audiences:
    • Executive summaries for C-suite approval.
    • Technical reviews for architecture committees.
    • Business cases for financial justification.
    • Adoption plans for operational leadership.
  • Use citation authority-building tactics such as knowledge panel optimization and competitor comparison content to make champion resources more visible and credible.

By weaving discoverability into these offerings, organizations will better support technical decision makers in validating solutions effectively, thus positioning themselves favorably in the decision-making process.

3. Measure And Optimize

Discovery channel analytics reveal which research paths lead to actual buyer engagement and revenue.

Track Discovery Performance Across Channels

Build a comprehensive discovery analytics dashboard that monitors:

AI Visibility Metrics:

  • Share-of-voice in AI-generated responses across LLMs like ChatGPT, Perplexity, Gemini, and Copilot.
  • Citation frequency trends and competitive displacement rate within AI answers (this can be a challenge right now, but should get easier as tools mature).
  • AI-sourced traffic attribution and correlation with pipeline outcomes.

Review Platform Metrics:

  • Review volume trends, average ratings across key categories (ease of use, support quality, value), and competitive positioning within your category (quarterly).
  • Sentiment analysis from peer networks like Reddit and Slack, where practitioners discuss solutions candidly.

Technical Validation Metrics:

  • Developer engagement on GitHub and Stack Overflow, API call volumes, and technical documentation traffic.
  • Page interaction depth (scroll patterns, time on page) and trial conversion rates from documentation paths.

Business Stakeholder Metrics:

  • Content consumption patterns by role and lead quality from executive-focused content.
  • Analyst report downloads and correlation with enterprise deal conversion rates.

Discovery Path Indicators:

  • Branded search lift and correlation between community engagement and inbound inquiry volume.
  • Channel combinations and content sequences that appear in successful deals.

Analyze Discovery Patterns That Drive Revenue

Trace content consumption paths that lead to demo requests, trial signups, and sales conversations. Use tracking parameters and form fields that identify origin sources.
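As a small example of reading tracking parameters, the Python sketch below extracts the common UTM fields from a landing URL using the standard library. The URL and the "unknown" fallback are assumptions for illustration; your own parameter naming may differ.

```python
from urllib.parse import urlparse, parse_qs


def origin_from_url(url):
    """Extract UTM source, medium, and campaign from a landing URL."""
    params = parse_qs(urlparse(url).query)
    return {
        "source": params.get("utm_source", ["unknown"])[0],
        "medium": params.get("utm_medium", ["unknown"])[0],
        "campaign": params.get("utm_campaign", ["unknown"])[0],
    }


# Hypothetical demo-request URL arriving from a review site.
origin = origin_from_url("https://example.com/demo?utm_source=g2&utm_medium=review")
print(origin)
```

Visits that arrive without parameters fall into "unknown", which is where a "How did you hear about us?" form field can fill the gap.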

Reverse-engineer successful deals to uncover:

  • Which channels start serious evaluation (peer networks, review sites, technical documentation).
  • Whether discovery through practitioner recommendations correlates with higher-quality leads.
  • Which content types drive engagement from different stakeholder roles (technical documentation for engineers, analyst reports for executives, peer reviews for operations leaders).

Correlate discovery metrics with sales cycle length, win rates, and client advocacy rates to identify which activities drive shortlist inclusion versus those that simply generate activity without business impact.
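One simple way to start this comparison is to group closed deals by the channel where evaluation began and compare win rates. The sketch below assumes a hypothetical `first_touch` field on each deal record and uses made-up data; with real pipelines, sample sizes per channel are often too small to draw firm conclusions.

```python
from collections import defaultdict

# Hypothetical closed deals with the channel where evaluation started.
DEALS = [
    {"first_touch": "review_site", "won": True},
    {"first_touch": "review_site", "won": False},
    {"first_touch": "peer_network", "won": True},
    {"first_touch": "peer_network", "won": True},
    {"first_touch": "paid_search", "won": False},
]


def win_rate_by_channel(deals):
    """Return the share of won deals for each first-touch channel."""
    totals = defaultdict(lambda: [0, 0])  # channel -> [wins, deal count]
    for deal in deals:
        totals[deal["first_touch"]][0] += int(deal["won"])
        totals[deal["first_touch"]][1] += 1
    return {channel: wins / count for channel, (wins, count) in totals.items()}


rates = win_rate_by_channel(DEALS)
print(rates)
```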

The buyer journey has fundamentally changed. Research happens before engagement, decisions form before conversation, and shortlists solidify before prospects present themselves.

Organizations that win in 2026 understand this reality and act accordingly. They establish presence where B2B buyers research, enable validation across stakeholder groups, and measure what drives consideration.

Implemented successfully, discoverability is the revenue engine that drives conversion in the AI-led buying era.

Key Takeaways

  • Optimize for AI-powered search: AEO and GEO are now foundational to brand discoverability. Audit your visibility across ChatGPT, Perplexity, Gemini, and Copilot, then build citation authority, structured data, and AI-consumable content architecture to earn consistent inclusion.
  • Build systematic review presence: Maintain an authentic review flow on platforms like G2 and TrustRadius through consistent client engagement.
  • Engage peer networks authentically: Participate in LinkedIn, Reddit, Slack channels, and technical forums where target buyers gather. Share insights and answer questions to build organic support.
  • Enable technical validation: Provide comprehensive resources on GitHub and Stack Overflow where technical buyers validate solutions through hands-on testing.
  • Support business leader decisions: Offer benchmarking data, independent research reports, and analyst validations that economic buyers can defend to CFOs and boards.
  • Equip internal champions: Supply presentation templates, competitive frameworks, and objection response playbooks that enable champions to build consensus across finance, IT, security, operations, and executive stakeholders.
  • Measure what drives consideration: Track AI visibility metrics alongside review site performance, peer network sentiment, technical documentation engagement, and champion support usage, connecting every channel to pipeline outcomes.

Featured Image: eamesBot/Shutterstock

OpenAI Crawl Activity Tripled Since GPT-5, Data Shows via @sejournal, @MattGSouthern

OpenAI’s automated crawl activity is estimated to have roughly tripled after the launch of GPT-5, according to a new analysis from Botify and guest author Chris Long.

In Botify’s dataset, OpenAI’s search crawler is now generating more log events than its training crawler. That’s a reversal from the period before GPT-5.

Long, co-founder of the SEO consultancy Nectiv, analyzed roughly 7 billion OpenAI-bot log events from Botify’s enterprise client dataset spanning November 2024 through March 2026.

What The Data Shows

Two of the three OpenAI user agents Botify measured saw activity spike around the GPT-5 launch.

OAI-SearchBot, which retrieves content when ChatGPT performs web searches, recorded about 3.5x more events after August 2025. That works out to roughly 2.2 billion additional events in Botify’s dataset.

GPTBot, which collects training data, recorded about 2.9x more events over the same period. That is another 1.8 billion events.

The third user agent, ChatGPT-User, moved in the opposite direction. Long reports a 28% drop in ChatGPT-User log events between December 2025 and March 2026. ChatGPT-User fires when a ChatGPT session fetches a page on behalf of a user, so the drop measures logged user-initiated fetches rather than ChatGPT usage overall.

Long offers two possible readings. One is that fewer sessions may be triggering real-time page fetches. The other, suggested by Botify’s team, is that OpenAI may be relying more on stored or indexed resources, reducing the need to fetch pages in real time. Long does not pick between them.

Search Bot Now Outpaces Training Bot

Before GPT-5, OAI-SearchBot and GPTBot ran at roughly even volumes in Botify’s dataset, with a ratio of about 0.95 search events per training event. After GPT-5, that ratio rose to about 1.14.
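The change in ratio can be sanity-checked against the growth figures quoted earlier. This short Python sketch assumes "3.5x" and "2.9x" are multipliers on pre-launch volumes; all inputs are rounded, so the result should only land near the reported figure, not match it exactly.

```python
# Figures as quoted in the article; all are rounded.
ratio_before = 0.95        # search events per training event, pre-GPT-5
search_multiplier = 3.5    # OAI-SearchBot growth (assumed to be a multiplier)
training_multiplier = 2.9  # GPTBot growth (assumed to be a multiplier)

# If both crawlers grew by their multipliers, the ratio scales by their quotient.
ratio_after = ratio_before * search_multiplier / training_multiplier
print(f"implied post-launch ratio: {ratio_after:.3f}")
```

The implied value sits within rounding distance of the reported ratio of about 1.14, which suggests the growth figures and the ratios are internally consistent.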

The pattern lines up with what Dan Petrovic wrote in August 2025 about GPT-5, arguing that OpenAI was sourcing more answers from live search than from trained memory. Botify’s data is consistent with that read.

Industry Breakdown

The post-GPT-5 search bot increases varied by industry. Healthcare sites saw about 740% more OAI-SearchBot activity after launch; Media and Publishing, 702%; and Marketplaces, Software, and Retail, 190-216%.

Travel sites had the smallest rise at 30%. The balance between search and training crawls also varies by industry. Long reports a +256% OAI-SearchBot to GPTBot crawl difference for Media/Publishing, the largest gap. Software and Internet also lean toward search, while Healthcare and Retail lean toward training, at -50% and -33% respectively, meaning GPTBot is the more active crawler there.

Botify and Long suggest OpenAI routes prompt types differently: news inquiries trigger live search, while health and product queries rely on trained knowledge.

How OpenAI’s Crawl Compares To Google’s

Even after tripling, OpenAI’s crawl activity is much smaller than Google’s.

In Botify’s most recent 30-day window, Googlebot registered 18.2 billion events, compared with 887 million events from OpenAI’s crawlers combined. That puts OpenAI at about 4.9% of Google’s crawl volume.

A year earlier, the same comparison was 15 billion Google events to 207 million OpenAI events, or about 1.38%. The gap is closing, though Google’s crawl is still roughly 20 times larger in absolute terms.

Bingbot registered about 5.49 billion events in the most recent window, putting OpenAI at roughly 16% of Bing.
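These shares can be recomputed directly from the event counts quoted above. The Python sketch below does that arithmetic; the counts are the rounded figures from the article, so the results are approximate.

```python
# Event counts as quoted (rounded) for Botify's 30-day windows.
RECENT = {"googlebot": 18.2e9, "bingbot": 5.49e9, "openai": 887e6}
YEAR_EARLIER = {"googlebot": 15e9, "openai": 207e6}


def share(part, whole):
    """Return part as a percentage of whole."""
    return part / whole * 100


openai_vs_google = share(RECENT["openai"], RECENT["googlebot"])
openai_vs_google_prior = share(YEAR_EARLIER["openai"], YEAR_EARLIER["googlebot"])
openai_vs_bing = share(RECENT["openai"], RECENT["bingbot"])
google_multiple = RECENT["googlebot"] / RECENT["openai"]

print(f"OpenAI vs Google (recent): {openai_vs_google:.2f}%")
print(f"OpenAI vs Google (year earlier): {openai_vs_google_prior:.2f}%")
print(f"OpenAI vs Bing (recent): {openai_vs_bing:.2f}%")
print(f"Google crawl is {google_multiple:.1f}x OpenAI's")
```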

Methodology & Commercial Context

The dataset is Botify’s, covering enterprise clients in retail, ecommerce, technology, publishing, travel, and marketplaces. The analysis was conducted by Long as a guest author on Botify’s blog.

For transparency, Botify sells log file analysis and AI bot management software, and the post promotes a follow-up webinar and a product demo.

The dataset skews toward large enterprise websites rather than a representative cross-section of the web.

Why This Matters

In Botify’s dataset, OAI-SearchBot now generates more log events than GPTBot. Sites that block only GPTBot are not blocking the bot OpenAI says is used to surface websites in ChatGPT search answers.

Sites that block OAI-SearchBot may be excluding themselves from ChatGPT search answers.
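To see which OpenAI user agents a given robots.txt actually blocks, you can test the rules with Python's standard `urllib.robotparser`. The robots.txt content and URL below are examples only, and the sketch checks how the standard-library parser reads the rules, which may not match every crawler's behavior exactly.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt that blocks only the training crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def blocked_agents(robots_txt, agents, url="https://example.com/products/"):
    """Return {agent: True if blocked from url} under the given robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: not parser.can_fetch(agent, url) for agent in agents}


status = blocked_agents(ROBOTS_TXT, ["GPTBot", "OAI-SearchBot", "ChatGPT-User"])
for agent, blocked in status.items():
    print(f"{agent}: {'blocked' if blocked else 'allowed'}")
```

With this file, only GPTBot is blocked; OAI-SearchBot and ChatGPT-User remain allowed because no rule names them and there is no wildcard group.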

How This Fits With Other Reports

Botify’s findings line up with patterns other vendors have reported. An Alli AI analysis covered earlier this month found OpenAI’s ChatGPT-User made 3.6x more requests than Googlebot in a smaller WordPress-heavy sample. A Hostinger analysis found OAI-SearchBot’s website coverage reaching 55% while GPTBot coverage fell. Akamai’s recent bot traffic report showed OpenAI leading AI bot traffic to publishing sites.

The reports suggest that AI training crawls and AI search crawls need to be measured separately, especially as OAI-SearchBot activity grows.

The Fully Non-Human Web: No One Builds The Page, No One Visits It via @sejournal, @slobodanmanic

In January 2026, Google was granted patent US12536233B1. Six engineers worked on it, and it describes a system that scores a landing page on conversion rate, bounce rate, and design quality. If the landing page falls below a threshold, the system generates an AI replacement personalized to the searcher. The advertiser never sees it. Never approves it. Might not even know it happened.

The debate around this patent has centered on scope: Is it limited to shopping ads, or does it signal something broader? That’s the wrong question.

The right question: What happens when you combine AI-generated pages with AI agents that browse, shop, and transact on behalf of humans?

For the first time, we have the infrastructure for a web where no human creates the page and no human visits it. Both sides can be non-human. That changes everything.

The Supply Side: AI-Generated Pages

The supply side of the web has always been human. Someone designs a page, writes copy, publishes it. Three developments are changing that.

Google’s patent US12536233B1 is the most direct: Score a landing page on conversion rate, bounce rate, and design quality, then replace underperforming pages with AI-generated versions. The replacement pages draw on the searcher’s full search history, previous queries, click behavior, location, and device data. Google builds personalized landing pages no advertiser can match, because no advertiser has access to cross-query behavioral data at that scale. Barry Schwartz covered the patent on Search Engine Land, describing a system where Google could automatically create custom landing pages, replacing organic results. Glenn Gabe called Google’s AI landing page patent potentially more controversial than AI Overviews. Roger Montti at Search Engine Journal argued the patent’s scope is limited to shopping and ads. Both camps agree: the technology to score and replace landing pages with AI exists and works.

NLWeb, Microsoft’s open project, takes a different approach. NLWeb turns any website into a natural language interface using existing Schema.org markup and RSS feeds. An AI agent querying an NLWeb-enabled site doesn’t load a page at all. The agent asks a structured question, NLWeb returns a structured answer. The rendered page becomes optional.

WebMCP goes further still. With WebMCP, a website registers tools with defined input/output schemas that AI agents discover and call as functions. A product search becomes a function call. A checkout becomes an API request. WebMCP eliminates the “page” concept entirely, dissolving the web page as a unit of content into a set of callable capabilities.
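
To make the “page becomes a function call” idea concrete, here is a minimal sketch in plain Python. It is not the WebMCP API: the ToolRegistry class, the search_products tool, and the catalog data are all hypothetical stand-ins that only illustrate the pattern of registering a tool with an input schema and letting a caller discover and invoke it.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    description: str
    input_schema: dict  # JSON-Schema-style description of the arguments
    handler: Callable[..., Any]


class ToolRegistry:
    """Hypothetical registry; a stand-in for a site exposing callable tools."""

    def __init__(self):
        self._tools = {}

    def register(self, tool):
        self._tools[tool.name] = tool

    def describe(self):
        # What an agent would read to discover the available tools.
        return [
            {"name": t.name, "description": t.description, "input_schema": t.input_schema}
            for t in self._tools.values()
        ]

    def call(self, name, **kwargs):
        tool = self._tools[name]
        missing = [k for k in tool.input_schema.get("required", []) if k not in kwargs]
        if missing:
            raise ValueError(f"missing arguments: {missing}")
        return tool.handler(**kwargs)


# Made-up catalog data for the example.
CATALOG = [
    {"sku": "RS-1", "name": "Trail running shoe", "price": 120.0},
    {"sku": "FJ-2", "name": "Fleece jacket", "price": 90.0},
]


def search_products(query):
    return [item for item in CATALOG if query.lower() in item["name"].lower()]


registry = ToolRegistry()
registry.register(Tool(
    name="search_products",
    description="Search the catalog by product name.",
    input_schema={
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
    handler=search_products,
))

results = registry.call("search_products", query="shoe")
print(results)
```

The agent never sees a rendered page here. It reads the tool description, sends arguments, and gets structured data back.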

Each mechanism works differently, but the direction is the same: the page is becoming something generated, queried, or bypassed entirely. The human-designed, human-published web page is no longer the only way content reaches an audience.

The Demand Side: AI Agents As Visitors

The demand side shifted faster. In 2024, bots surpassed human traffic for the first time in a decade, accounting for 51% of all web activity. Cloudflare’s data shows AI “user action” crawling (agents actively doing things, not just indexing) grew 15x during 2025. Gartner predicts 40% of enterprise applications will feature task-specific AI agents by end of 2026, up from less than 5% in 2025. The scale is hard to overstate.

Agentic browsers are the most visible shift. Chrome’s auto browse turned 3 billion Chrome installations into potential AI agent launchpads. Google’s Gemini scrolls, clicks, fills forms, and completes multi-step tasks autonomously inside Chrome. Perplexity’s Comet browser conducts deep research across multiple sites simultaneously. Microsoft’s Edge Copilot Mode handles multi-step workflows from within the browser sidebar. The full agentic browser landscape now includes over a dozen consumer and developer tools, all browsing on behalf of humans.

Commerce agents have moved past browsing into buying. OpenAI launched Instant Checkout to let users purchase products directly inside ChatGPT, powered by Stripe’s Agentic Commerce Protocol (ACP). OpenAI killed the feature in March 2026 after near-zero purchase conversions and only a dozen merchant integrations out of over a million promised. The failure was execution, not concept: Alibaba’s Qwen app processed 120 million orders in six days in February 2026 because Alibaba owns the AI model, the marketplace, the payment rails (Alipay), and the logistics. OpenAI tried to replicate agentic commerce without owning the stack. Google and Shopify’s Universal Commerce Protocol (UCP) connects over 20 companies, including Walmart, Target, and Mastercard, in a framework designed for AI agents to handle commerce from product discovery through checkout. Shopify auto-opted over a million merchants into agentic shopping experiences with ChatGPT, Copilot, and Perplexity. The transaction happens in an AI conversation. No checkout page loads.

Agent-to-agent communication removes the human from both ends. Google’s Agent-to-Agent (A2A) protocol lets AI agents from different vendors discover each other’s capabilities and collaborate on tasks without human mediation. A travel planning agent negotiates directly with a booking agent. A procurement agent evaluates supplier agents across vendors. Over 150 organizations support A2A, including Salesforce, SAP, and PayPal, making agent-to-agent commerce and coordination a production reality.

When Both Sides Go Non-Human

Until now, one side of the web was always human. A person built the page, or a person visited it. Usually both.

Google’s patent closes the circuit.

Here’s what a complete non-human flow might look like. A user tells their AI assistant they need running shoes. The assistant queries product data through NLWeb or WebMCP, no page load needed. The assistant evaluates options by checking inventory across retailers via A2A. If the user needs to review a comparison, Google generates a landing page personalized to that specific user’s search history and preferences. The assistant completes checkout through ACP or UCP using Shared Payment Tokens. The user receives a confirmation.

The human’s role in that entire flow: stating intent and approving the purchase. Discovery, page generation, product evaluation, and transaction completion are all handled by AI systems. The human touches only the two endpoints of the chain.

Every piece of technology in that chain exists in production today. Chrome auto browse is live for 3 billion Chrome users. A2A has 150+ organizational supporters. ACP underpins Stripe’s agentic commerce infrastructure (ChatGPT’s Instant Checkout failed on execution, not protocol). UCP connects Shopify, Google, Walmart, and Target. Patent US12536233B1 is granted. No single company has assembled the full loop yet, but every component is operational.

Who’s Building The Non-Human Web

Here’s where it gets interesting. Map out who’s building what, and a pattern emerges:

Layer | What | Who
Page generation | AI landing pages | Google
Content-as-API | WebMCP, NLWeb | Google, Microsoft
Agent infrastructure | MCP, A2A | Anthropic, Google
Agent browsers | Chrome, Comet, Copilot | Google, Perplexity, Microsoft
Agent commerce | ACP, UCP | Stripe + OpenAI, Shopify + Google
Edge delivery | Markdown for Agents | Cloudflare

Google appears in five of six layers: page generation (patent US12536233B1), content-as-API (WebMCP), agent infrastructure (A2A), agent browsers (Chrome auto browse), and commerce (UCP). Google is positioning itself to mediate the non-human web the same way Google mediates the human one through Search.

The Agentic AI Foundation (AAIF), formed under the Linux Foundation with Anthropic, OpenAI, Google, and Microsoft as platinum members, provides the governance layer. The AAIF functions as the W3C for the agentic web: the vendor-neutral body that decides which protocols become standards for agent interoperability.

What Website Owners Need To Know

This isn’t an optimization checklist. It’s three structural shifts in what your website is for.

Your Data Layer Is Your Website

Google’s patent generates landing pages from product feed data, making product feeds the most important asset an ecommerce business maintains. NLWeb queries Schema.org markup instead of rendering pages, making structured markup the front door to your content. WebMCP exposes site capabilities as function calls, making tool definitions the user interface agents interact with.

Structured data, product feeds, JSON-LD, and API surfaces have traditionally been treated as backend infrastructure. In the non-human web, these data layers become the primary way a business reaches customers. Product feed accuracy (specs, pricing, stock levels, images) matters more than homepage design when AI systems generate the page from that feed.
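
As a rough illustration of what that data layer looks like, the sketch below builds Schema.org Product markup as JSON-LD from a single feed record. The product values and the helper name product_jsonld are invented for the example, and the fields shown are only a small subset of what Schema.org defines.

```python
import json


def product_jsonld(name, sku, price, currency, in_stock, image_url):
    """Build a minimal Schema.org Product object with one Offer."""
    availability = (
        "https://schema.org/InStock" if in_stock else "https://schema.org/OutOfStock"
    )
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "sku": sku,
        "image": image_url,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": availability,
        },
    }


# Made-up feed record for illustration.
markup = product_jsonld(
    name="Fleece jacket",
    sku="FJ-2",
    price=89.0,
    currency="USD",
    in_stock=True,
    image_url="https://example.com/images/fj-2.jpg",
)
print(json.dumps(markup, indent=2))
```

If the feed record is wrong (stale price, wrong stock flag), every system that reads this markup inherits the error, which is why feed accuracy carries so much weight.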

Trust Is The Moat

AI can generate a page. It cannot generate a reason to seek you out by name.

Direct traffic, email subscribers, community members, and brand reputation persist when the page itself becomes replaceable. An AI agent can build a product page, but no AI agent can build the trust that makes a consumer (or their agent) request a specific brand by name.

The brands that matter in the non-human web are the ones people tell their agents to find. “Get me a fleece jacket” is a commodity query. “Get me a fleece jacket from Patagonia” is a brand moat.

The Measurement Problem

How do you measure a page you didn’t build? How do you A/B test against something Google generates dynamically? How do you attribute a conversion that happened inside ChatGPT, initiated by an agent acting on behalf of a user who never saw your website?

Traditional web analytics (page views, sessions, bounce rate, time on site) assume two things: a human visitor and a page you control. On the non-human web, neither assumption holds. A Google-generated landing page isn’t yours. A ChatGPT checkout session doesn’t register in your analytics.

I don’t have a clean answer here, and neither does anyone else. Measurement is the genuinely unsolved problem of the non-human web. New metrics will need to track agent discoverability, agent conversion rate, and data feed quality. But as of March 2026, the measurement infrastructure hasn’t caught up to the technology it needs to measure.

Four Predictions For 2026-2027

Four things to watch over the next 12-18 months.

Google ships patent US12536233B1, or something like it. The technology for scoring and replacing landing pages exists. The business incentive exists. Google has a history of introducing features in ads first, then expanding (Google Shopping went from free to paid to essential). AI-generated landing pages will likely appear in shopping ads first, then broaden to other verticals. Landing page quality scores in Google Ads serve as the early warning system for which pages Google considers replaceable.

Agent traffic becomes measurable. Analytics platforms will need to distinguish human sessions from agent sessions. BrightEdge reports AI agents account for roughly 33% of organic search activity as of early 2026. WP Engine’s traffic data shows 1 AI bot visit for every 31 human visits by Q4 2025, up from 1 per 200 at the start of that year. Agent traffic ratios will accelerate further as Chrome auto browse rolls out globally beyond the US. New metrics around agent conversion rate and agent discoverability will emerge from necessity.

The protocol stack consolidates. MCP, A2A, NLWeb, and WebMCP form a coherent stack covering tool access, agent communication, content querying, and browser-level integration. Expect more interoperability between these protocols and fewer competing standards. The Agentic AI Foundation (AAIF) accelerates consolidation. Within 18 months, “does your site support MCP?” will be as standard a question as “is your site mobile-friendly?”

Brand differentiation gets harder and more important. When AI generates pages and agents do the shopping, the only defensible position is being the brand people (and their agents) seek out by name. Direct relationships, owned audiences, trust signals. Everything else is a commodity.

The Web Splits In Two

When Shopify auto-opted merchants into agentic shopping, I asked whether your website just became optional. The answer is more nuanced than optional or essential. It’s becoming something different.

The web isn’t dying. It’s splitting.

The transactional web (product listings, checkout flows, information retrieval, comparison shopping) is going non-human first. AI generates the landing pages. AI agents visit and transact on those pages. Humans approve decisions at the endpoints. Google’s patent lives in the transactional web, and the economics of conversion optimization push hardest toward automation in this layer.

The experiential web (brand storytelling, community, content that rewards sustained attention, design that creates emotional response) stays human. Not because AI can’t generate brand experiences, but because the value of those experiences comes from the human connection behind them. Nobody tells their agent to “go enjoy a brand experience on my behalf.”

Your website’s new job description: data source for the agents, trust anchor for the humans, brand home for both. The companies that treat their structured data, product feeds, and API surfaces with the same care they give their homepage design are the ones that show up in both worlds.

The non-human web isn’t replacing the human web. It’s growing alongside it. Your job is to show up in both.


This was originally published on No Hacks.


Featured Image: Yaaaaayy/Shutterstock

Google’s Updates Push Search Further Into Task Completion via @sejournal, @MattGSouthern

Google announced three updates to Search and AI Mode this week, which Roger Montti reported for SEJ. Reading his article motivated me to examine these updates, the broader pattern, and their implications for search this year.

Looking at this in detail, it appears the updates push more of what used to be a results-page experience into task completion.

What Google Announced

Google launched individual hotel price tracking in Search, now available globally for signed-in users searching in English and Spanish. Email alerts notify users of rate changes during selected dates.

Additionally, in March, Canvas trip planning in AI Mode moved from Labs preview to general U.S. availability, allowing users to describe trips and receive custom itineraries with flights, hotels, and attractions that save automatically. Agent-powered store calling, first introduced in classic Search, will soon roll out to AI Mode, enabling Google’s AI to call nearby stores and check inventory using Gemini models and Duplex.

Rose Yao, Product Leader in Search, posted the updates on X. Additional detail sits in Google’s blog post.

The Pattern

These updates reflect Google’s product direction seen in research, patents, and executive statements since January.

In January, Google published the SAGE research paper on training agents for reasoning chains over four steps, laying groundwork for multi-step tasks in Search.

Pichai’s April interview made the language public. Pichai said, “A lot of what are just information-seeking queries will be agentic in Search.” Our deep dive tracked how his language shifted from “search will change” to specific descriptions of task completion.

Earlier this month, Montti argued that task-based agentic search was already changing SEO, citing Google’s global rollout of agentic restaurant booking as evidence that the future tense in Pichai’s language was already past tense in product.

A week ago, the U.S. Patent Office published a Google continuation patent titled “Autonomously providing search results post-facto” (our coverage). The filing describes a system that waits for answers when none are immediately available, then delivers them later through assistant interactions.

These updates continue in the same direction. Canvas moves from Labs preview to broader U.S. availability, approximately five months after its initial launch in November. Store calling is headed to AI Mode following its debut in Search last November. Additionally, hotel price tracking is now available in Search at the single-property level.

Microsoft’s recent news fits the same pattern. Sumit Chauhan, President of Microsoft’s Office Product Group, wrote in a company blog post that Copilot’s agentic capabilities are now generally available in Word, Excel, and PowerPoint:

“Copilot creates the most value when it performs the work—formatting, restructuring, building visuals, and transforming data—rather than just suggesting steps.”

The features are the default for Microsoft 365 Copilot and Premium subscribers, and available to Personal and Family plans. It’s unclear whether businesses will receive similar reporting for agent-driven surfaces, a point not addressed in Microsoft’s post.

The Vocabulary Hasn’t Settled

Google uses “agentic” in its product language and announcements, describing features like calling and AI Mode as task-oriented. A SeatGeek partnership was called “Google’s Agentic AI Search Experience.” Other companies use similar agent-framing language.

Pichai describes “agent manager” as Google’s role, envisioning a future in which Search becomes “an agent manager” overseeing various tasks. That framing positions Google as an orchestration layer on top of agents rather than a direct competitor.

Montti has used “task-based agentic search” in his recent SEJ coverage, sometimes shortened to TBAS. That’s his shorthand for this beat, not industry-standard terminology.

“Agentic” describes the capability. “Agent manager” refers to a specific architectural role that Google is claiming. “Task-based” centers the user’s goal. When three different labels show up in one month, the market is still working out what to call this.

Why This Matters For Search Professionals

Features introduced this week change the meaning of visibility across several business categories.

Local retailers now encounter a new discovery surface. When store calling arrives in AI Mode, Google’s agents, rather than users, will contact businesses to verify stock and details. Google hasn’t disclosed which stores its agents will contact first, how eligibility is decided, or whether specific business information influences the process.

An analysis of 68 million AI crawler visits across 858,457 Duda-hosted sites shows that sites with connections to Yext, Google Business Profile, and review systems were crawled more often than those without. These findings describe crawler behavior, not agent calls. It’s unknown if similar signals influence which stores are contacted.

Hotels and travel businesses now face individual-property price monitoring. Trip itineraries are based on Canvas’s selection logic. No report shows if a hotel appeared in a Canvas plan, triggered an alert, or was named in an AI Mode response.

Publishers face continued pressure from AI-driven summarization. Index Exchange analyzed 1,200 publishers on its exchange platform, finding that 69% experienced year-over-year declines in ad opportunities, with an average drop of 14%.

Declines varied across verticals. Health and careers publishers saw 40-50% ad drops, while news and politics publishers saw only 7% declines.

Vanessa Otero, Founder and CEO of Ad Fontes Media, told Index Exchange for the same piece:

“When it’s important enough that you want to be accurately and fully informed about some big international, national, or local event, a quality news site is still a much better experience than asking an AI chatbot, which may give a genericized or inaccurate answer. AI users already know this, which is why most news consumers still go direct to their trusted sites. News has always performed well for advertisers, and if the trend of news site resilience holds, this inventory will likely become the most valuable on the open web of the future.”

Travel publishers face pressure as Canvas compiles itineraries without citing sources, making it impossible for publications to know if their coverage influences trip plans.

Ecommerce retailers lack visibility into which stores get called, so they can’t determine if inventory feeds, listing accuracy, or Google Business Profile signals are effective.

Multi-platform coverage complicates strategy. Google’s agents favor structured data and verified profiles. Perplexity Computer routes across 19 models with diverse retrieval preferences. ChatGPT Atlas scrapes browser content directly. OpenAI’s Operator uses GUI vision to interact with rendered pages.

A single business now faces multiple discovery mechanisms, each with different technical requirements. Single-strategy optimization no longer covers all surfaces.

What’s Still Invisible

Since our coverage flagged the measurement gap, it has widened.

Search professionals still can’t see whether their business was included in a Canvas trip plan. They can’t see whether an agent called them. They can’t see whether their hotel was surfaced in a price-tracking alert. And they can’t see how often their content was used to assemble someone else’s itinerary.

No new reporting surfaces were shipped alongside the updates. Alphabet reported $63.1 billion in Google Search & Other advertising revenue for Q4 2025, up 17% year-over-year, with management crediting Search and Cloud acceleration and AI usage gains. No new reporting tools have arrived to help businesses track their role in AI-mediated search.

The pattern holds across platforms. ChatGPT referral data is limited to what OpenAI shares. Perplexity citation visibility is confined to Perplexity itself. Google’s agent surfaces don’t cleanly map to Search Console.

Academic research on agent training continues to advance. Two April 2026 papers on arXiv show the pace. CW-GRPO, from Junzhe Wang and colleagues, proposes reinforcement-learning improvements for multi-turn search agents. SKILL0, developed by Zhengxi Lu and colleagues at Zhejiang University, trains agents to internalize skill packages. The result is agents that operate without instruction overhead during inference.

The training pipeline is evolving faster than the measurement pipeline businesses depend on. Search professionals can’t close that gap alone. Google, OpenAI, Perplexity, and Anthropic would all need to provide equivalent agent-surface reporting. None has publicly committed to doing so.

Looking Ahead

Pichai said that 2027 would be “an important inflection point for certain things.” He cited non-engineering workflows and some agentic business processes. Our coverage walked through that timeline.

May brings Google I/O and Microsoft Build. Both companies are likely to expand their agentic surfaces at those events, making reporting the most urgent thing to watch. If businesses can’t see their role in task-based search, they can’t optimize for it or argue about who should pay for it.

Two longer-running questions sit behind that. Pay-per-click worked when users clicked links. Store calling, Canvas planning, and price tracking don’t produce clicks, and no platform has described a replacement. Schema.org was designed for search engine crawling, not for agents that need real-time inventory, booking availability, and action endpoints. Standards for agent-readable business data haven’t caught up either.

What happens next depends on whether any platform builds the reporting alongside the capability. So far, none has described how it would. Until that changes, businesses will be optimizing for surfaces they can’t see. Next signals land at I/O and Build in three weeks.

OpenAI’s Crawler Docs Now List OAI-AdsBot For ChatGPT Ads via @sejournal, @MattGSouthern

OpenAI’s public crawler documentation now lists OAI-AdsBot, a bot that may visit pages submitted as ChatGPT ads to check policy compliance and help determine ad relevance.

The entry sits alongside OAI-SearchBot, GPTBot, and ChatGPT-User on OpenAI’s crawler docs page, bringing the documented bot count to four.

OpenAI states that OAI-AdsBot only visits pages submitted as ads and that the data it collects isn’t used to train its generative AI foundation models.

What The Bot Does

Per OpenAI’s docs, OAI-AdsBot may visit an ad’s landing page after the ad gets submitted. The bot checks whether the page complies with OpenAI’s ad policies. It may also use content from the landing page to help decide when to show the ad to ChatGPT users.

The bot identifies itself with the user-agent string Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-AdsBot/1.0; +https://openai.com/adsbot.

OAI-SearchBot and GPTBot are both at version 1.3, per OpenAI’s docs, while the OAI-AdsBot user-agent string shows version 1.0. OAI-AdsBot only visits pages submitted as ad landing pages, not the wider web.

What The Bot Doesn’t Do

Data collected by OAI-AdsBot isn’t used to train generative AI foundation models. That keeps OAI-AdsBot out of GPTBot’s territory, which handles training data collection.

It also keeps OAI-AdsBot separate from OpenAI’s other bots. OAI-SearchBot surfaces content in ChatGPT search, while ChatGPT-User fetches pages during user-initiated browsing, and OAI-AdsBot is limited to ad validation.

OAI-SearchBot and GPTBot can be controlled independently through robots.txt. ChatGPT-User is user-initiated, and the company notes that robots.txt rules may not apply to it. The OAI-AdsBot entry doesn’t say how the bot treats robots.txt.

No Public IP List Yet

OpenAI publishes IP range files for its three earlier bots at openai.com/searchbot.json, openai.com/gptbot.json, and openai.com/chatgpt-user.json. At the time of publication, no equivalent openai.com/adsbot.json file appears in OpenAI’s docs.

Without a published list, verifying a real OAI-AdsBot visit becomes harder. User-agent strings can be spoofed, and the IP lists give you a way to cross-check for the other three OpenAI bots. For OAI-AdsBot, that cross-check isn’t available.
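
The cross-check can be sketched with Python’s standard ipaddress module. Everything specific here is an assumption: the CIDR ranges are reserved documentation blocks used as placeholders rather than OpenAI’s real ranges, loading the published JSON files is left out, and the status labels and the verify_bot_hit helper are invented for the example.

```python
import ipaddress


def ip_in_ranges(ip, cidr_ranges):
    """True if ip falls inside any of the given CIDR blocks."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


def verify_bot_hit(user_agent, ip, ranges_by_bot):
    """Return (bot_name, status) for one log entry.

    A ranges value of None means no IP list is available for that bot,
    so the user-agent claim cannot be cross-checked.
    """
    lowered = user_agent.lower()
    for bot, ranges in ranges_by_bot.items():
        if bot.lower() in lowered:
            if ranges is None:
                return (bot, "unverifiable")
            if ip_in_ranges(ip, ranges):
                return (bot, "verified")
            return (bot, "spoof-suspect")
    return (None, "not-openai")


# Placeholder ranges (reserved documentation blocks), NOT real OpenAI ranges.
RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "OAI-SearchBot": ["198.51.100.0/24"],
    "OAI-AdsBot": None,  # no published list at the time of writing
}

ads_ua = "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-AdsBot/1.0; +https://openai.com/adsbot"
gpt_ua = "Mozilla/5.0; compatible; GPTBot/1.3; +https://openai.com/gptbot"  # simplified

print(verify_bot_hit(gpt_ua, "192.0.2.10", RANGES))
print(verify_bot_hit(gpt_ua, "203.0.113.5", RANGES))
print(verify_bot_hit(ads_ua, "203.0.113.5", RANGES))
```

For OAI-AdsBot, the sketch can only ever report that the claim is unverifiable, which mirrors the gap left by the missing IP list.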

Why This Matters

OAI-AdsBot has two audiences. Advertisers buying placements on ChatGPT need the bot to reach their landing pages; otherwise, the ad may not validate. Anyone tracking AI bot activity in server logs gets a new user-agent to watch, one tied to paid inventory rather than search or training.

Aggressive bot protection through Cloudflare, Akamai, or similar tools may block OAI-AdsBot before it reaches the page. That could create validation friction for advertisers who use strict bot-mitigation tools.

Looking Ahead

ChatGPT’s ad program has moved fast since OpenAI started testing ads on Feb. 9. As access opens up to more advertisers, OAI-AdsBot traffic will start showing up in more server logs. Watch for an eventual IP range file at openai.com/adsbot.json if OpenAI chooses to publish one. For now, the user-agent string is what you have to work with.


Featured Image: Blossom Stock Studio/Shutterstock

Google Ads Posts GEO Partner Manager Role via @sejournal, @MattGSouthern

Google’s Large Customer Sales team has posted a role titled “GEO Partner Manager, Performance Solutions” on Google Careers. The listing is a single job posting inside Google’s ads sales organization.

The term “GEO” appears seven times across the listing, including the title. “Generative Engine Optimization” is spelled out twice. Other references include “GEO players,” “GEO ecosystem,” and “GEO/AEO companies.”

The listing says the role will “shape the GEO ecosystem to prioritize Google surfaces.” Responsibilities include influencing partners to prioritize Google-owned surfaces in their tools and methodologies, as well as in “Share of Model” analysis. “Share of Model” is an industry term for a brand’s presence in AI-generated answers.

Why This Matters

The terminology is worth noting because it sits alongside a different public position from Google’s search side. In July, Google’s Gary Illyes said standard SEO is sufficient for AI Overviews and AI Mode, and that specialized AEO or GEO optimization is not needed. As of publication, Google has not publicly updated that guidance.

Large Customer Sales manages relationships with major advertisers and agencies. The role’s alignment with the 3P Measurement team places it firmly inside Google’s ad-side partner work.

Microsoft and Google are in different places here, and the categories of evidence differ. In March, Bing added “GEO” to its official webmaster guidelines, defining the term and placing it alongside SEO as a named category. Bing’s AI Performance dashboard, launched in February, was positioned as a step toward GEO tooling.

The Google listing is one job posting inside an ads sales team. Both are adoption signals, but not the same level of commitment.

Looking Ahead

The language reflects how one team inside Google’s ads organization frames this work today. It doesn’t carry the same weight as a documentation update, a public statement from Google Search, or a policy change.

Whether similar GEO language appears in other Google job listings across Ads, Cloud, or Search would indicate whether this is a pattern or a single team’s choice.

For brands working with GEO or AEO partners, the listing is worth noting: it indicates Google’s ads team wants partner tools and methodologies to prioritize Google surfaces.


Featured Image: Jack_the_sparow/Shutterstock

How To Build AI Visibility In 90 Days [Webinar] via @sejournal, @hethr_campbell

AI search has changed how buyers discover solutions. Here’s how to make sure they find you.

Why AI Visibility Is Now a Growth Priority

Platforms like ChatGPT, Perplexity, and Google AI Overviews are now active discovery channels for buyers. Marketing leaders who understand those signals are building durable visibility. Those who don’t are quietly losing ground.

What You’ll Learn in This Free SEO Webinar

  • Which AI visibility signals actually drive discoverability in 2026
  • A phased 90-day framework that helps you audit your baseline, run AI-native experiments, then scale what works
  • How funded startups are restructuring teams and budgets around this shift

About the Speaker

Jason Shafton is Founder & CEO of Winston Francois, a growth consulting firm. He’s led growth and marketing at Google, Headspace, and Kajabi, and has built AI visibility playbooks across 10+ venture and PE-backed startups navigating this exact transition.

Register Free

This is one hour of tactical, experience-backed frameworks, built for founders, CMOs, and marketing leaders who are ready to act.

68 Million AI Crawler Visits Show What Drives AI Search Visibility via @sejournal, @martinibuster

A new analysis of 858,457 sites hosted on the Duda platform shows how AI crawlers are interacting with websites at scale. The data offers a clearer view of how crawling activity is growing and what SEOs and businesses should do to increase traffic from AI search.

AI Crawling Has Already Reached Scale

AI crawling is growing quickly, with more requests tied to real-time answers and most of that activity coming from a single provider. The data reveals a pattern that shows which sites are being crawled and, more importantly, why.

Year-Over-Year Growth In LLM Referrals

LLM referral traffic has increased sharply over the past year, with multiple platforms showing meaningful gains from very different starting points.

AI Referral Traffic Patterns

  • Total LLM referrals: 93,484 to 161,469 (+72.7%)
  • ChatGPT: 81,652 to 136,095 (+66.7%)
  • Claude: 106 to 2,488 (23x growth)
  • Copilot: 22 to 9,560 (from near-zero)
  • Perplexity: 11,533 to 13,157 (+14.1%)
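
The growth figures above can be reproduced from the before and after counts. The short sketch below does the arithmetic; the function names are just illustrative helpers, and the input numbers are the ones listed above.

```python
# (before, after) referral counts as listed above.
REFERRALS = {
    "Total": (93_484, 161_469),
    "ChatGPT": (81_652, 136_095),
    "Claude": (106, 2_488),
    "Copilot": (22, 9_560),
    "Perplexity": (11_533, 13_157),
}


def pct_change(before, after):
    """Percentage change from before to after."""
    return (after - before) / before * 100


def growth_multiple(before, after):
    """How many times larger after is than before."""
    return after / before


for name, (before, after) in REFERRALS.items():
    print(
        f"{name}: {pct_change(before, after):+.1f}% "
        f"({growth_multiple(before, after):.1f}x)"
    )
```

Percentage change is the more readable figure when the starting point is large, while the multiple is more useful for tiny baselines such as Claude and Copilot, where percentages run into the thousands.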

Growth is not happening evenly, but across the board, referral traffic from AI systems is increasing. That makes AI-generated discovery a growing source of traffic, not a marginal one.

Crawlers Are Increasingly Fetching Content To Ground Answers

AI crawlers are no longer used primarily for indexing, with most activity now tied to retrieving content in real time to generate answers for users.

Most crawling is now happening in response to user queries rather than for building an index, which changes how content is accessed and used.

  • User Fetch (real-time answers): 56.9% of all crawler activity, driven almost entirely by ChatGPT
  • Training (model learning): 28.8%, split across GPTBot and other model crawlers
  • Discovery (content indexing): 14.3%, distributed across multiple systems
  • ChatGPT User Fetch volume: ~39.8 million visits

The trends are largely driven by ChatGPT, which is responsible for nearly all real-time retrieval activity. That means the move toward answer-based crawling is not evenly distributed, but concentrated in one platform shaping how content is accessed. This trend may change with Google’s new Google-Agent crawler.

Market Concentration In AI Crawling

AI crawler activity is heavily concentrated, with OpenAI responsible for the vast majority of requests, reflecting its position as the primary tool users rely on to find and retrieve information.

  • OpenAI: 55.8 million visits (81.0%)
  • Anthropic (Claude): 11.5 million (16.6%)
  • Perplexity: 1.3 million (1.8%)
  • Google (Gemini): 380,000 (0.6%)

Claude follows at a much smaller share, suggesting a different usage pattern, while Perplexity and Google together account for less than 3% of crawler activity.

Scale And What That Actually Means

AI crawling is already operating across a large portion of the web, reaching hundreds of thousands of sites and generating tens of millions of requests in a single month.

More than half of all sites in the dataset received at least one AI crawler visit, showing that this activity is not limited to a small subset of websites.

  • Total sites analyzed: 858,457
  • Sites with at least one AI crawler visit: 506,910 (59%)
  • Total AI crawler visits (Feb 2026): 68.9 million

AI crawling is not isolated to high-profile or heavily trafficked sites. It is already widespread, with activity recorded on a majority of the sites in the dataset.

The Relationship Between Crawling and Real Traffic

Sites that allow AI systems to crawl them consistently show stronger engagement across multiple metrics.

What the data actually shows is:

  1. Sites that allow AI crawling receive significantly more human traffic
  2. Higher-traffic sites are more likely to be crawled

Sites crawled by AI systems average 527.7 human sessions, compared to 164.9 for sites that are not crawled. This does not establish causation, but it shows that the sites attracting human visitors are also the sites AI systems tend to revisit.

  • Average human traffic (AI-crawled vs not): 527.7 vs 164.9 (3.2x higher)
  • Average form completions: 4.17 vs 1.57 (2.7x higher)
  • Average click-to-call: 8.62 vs 3.46 (2.5x higher)
  • Sites with 10K+ sessions: 90.5% crawl rate

AI systems are not discovering weak or inactive sites and lifting them up. They are returning to sites that already attract human visitors. For marketers, that shifts the focus away from trying to “get crawled” and toward building real audience demand, since visibility in AI systems appears to follow it.
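Whether a site permits AI crawlers at all is commonly controlled through robots.txt. The sketch below uses Python's standard `urllib.robotparser` to check which crawler tokens a given robots.txt allows. The robots.txt content and the `example.com` URL are hypothetical, and the assumption that robots.txt rules are how "allowing" was measured is mine; the research may define it differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration: blocks GPTBot, allows everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt: str, agents, url: str = "https://example.com/"):
    """Map each user-agent token to whether robots.txt lets it fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(ROBOTS_TXT, ["GPTBot", "ChatGPT-User", "ClaudeBot"])
print(result)
```

In this example the training crawler is blocked while user-triggered fetches are still permitted, which shows that "allowing AI crawling" is not a single switch.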

What Correlates With More Crawling

The research compared sites that include specific third-party integrations, structured features, and content depth with those that do not, and identified which of these correlated most strongly with AI crawler activity and referrals.

Across the dataset, 59% of sites received at least one AI crawler visit in February 2026. Sites that are crawled more often tend to combine three types of signals: external integrations, structured business data, and content depth.

1. External Integrations

These integrations connect the site to external systems that validate and distribute business information.

  • Yext integration: 97.1% crawl rate vs ~58% without (+38.9pp)
  • Reviews integrations: 89.8% crawl rate vs 58.8% without, 376.9 average crawler visits

Sites that are connected to external data and review systems are more likely to be crawled and are crawled more frequently, suggesting that AI systems may treat these integrations as signals that a business is real, verifiable, and worth revisiting.

2. Structured Site Features And Business Data

These are built into the site and help AI systems understand and verify business identity.

  • Google Business Profile sync: 92.8% crawl rate vs 58.9% without, 415.6 average crawler visits
  • Local schema: 72.3% vs 55.2% (+17.1pp), 22.3% adoption
  • Dynamic pages: 69.4% vs 58.2% (+11.2pp)
  • Ecommerce: 54.2% vs 59.2% (-5.0pp)

Sites that clearly define their business identity and structure their information in a machine-readable way are crawled more often, suggesting that AI systems favor sites they can easily interpret, verify, and extract information from. Ecommerce is the exception, with a slightly lower crawl rate than sites without it.
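The "pp" figures in the list above are percentage-point differences, which are not the same as relative percentage changes. As a rough illustration, the sketch below computes both from the crawl rates listed; the helper names are illustrative, not from the research.

```python
# Crawl rates in percent (with feature, without feature), as listed above.
crawl_rates = {
    "Local schema": (72.3, 55.2),
    "Dynamic pages": (69.4, 58.2),
    "Ecommerce": (54.2, 59.2),
}


def pp_diff(with_rate: float, without_rate: float) -> float:
    """Absolute gap between two percentages, in percentage points."""
    return round(with_rate - without_rate, 1)


def relative_lift(with_rate: float, without_rate: float) -> float:
    """Gap expressed as a percentage of the 'without' rate."""
    return round((with_rate - without_rate) / without_rate * 100, 1)


for feature, (with_rate, without_rate) in crawl_rates.items():
    print(
        f"{feature}: {pp_diff(with_rate, without_rate):+.1f}pp, "
        f"{relative_lift(with_rate, without_rate):+.1f}% relative"
    )
```

A 17.1 percentage point gap for local schema works out to a relative lift of roughly 31%, so it matters which of the two numbers a report is quoting.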

3. Content Depth (Volume Of Usable Data)

Sites with more content provide more opportunities for AI systems to retrieve, reference, and reuse information in responses.

  • Sites with 50+ blog posts: 1,373.7 average crawler visits vs 41.6 with no blog (~33x higher)

Sites with more content are crawled far more often, indicating that AI systems may return to sources that offer a larger supply of usable information to draw from when generating answers.

Local Business Schema Completeness = More Crawling

This part of the research focuses specifically on local business schema, comparing how the completeness of schema implementation for communicating business details relates to AI crawler activity. The fields measured include business name, phone number, address, hours, and social profiles.

  • No local schema fields: 55.2% crawl rate
  • 10–11 completed schema fields: 82% crawl rate
  • Sites with more complete local schema show a 26.8 percentage point higher crawl rate (82% vs 55.2%)

Sites that provide more complete local business information in structured form are crawled more often and receive more crawler visits. As more of these fields are filled in, both crawl rate and crawl frequency increase.

The data shows that clearly defined local business data makes a site easier for AI systems to identify, verify, and subsequently revisit, all of which are prerequisites for receiving traffic from AI search.
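For readers unfamiliar with local business schema, the sketch below builds a JSON-LD object covering the field types named above (business name, phone number, address, hours, and social profiles) using schema.org's `LocalBusiness` vocabulary. All business details are made-up placeholders, and the exact 10 to 11 fields counted in the research are not specified here, so this is an approximation rather than the study's checklist.

```python
import json


def local_business_jsonld(name, telephone, address, opening_hours, same_as):
    """Build a schema.org LocalBusiness object as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "LocalBusiness",
        "name": name,
        "telephone": telephone,
        "address": {"@type": "PostalAddress", **address},
        "openingHours": opening_hours,
        "sameAs": same_as,
    }


# Placeholder values for illustration only.
example = local_business_jsonld(
    name="Example Bakery",
    telephone="+1-555-0100",
    address={
        "streetAddress": "123 Example Street",
        "addressLocality": "Springfield",
        "addressRegion": "IL",
        "postalCode": "62701",
        "addressCountry": "US",
    },
    opening_hours=["Mo-Fr 08:00-17:00", "Sa 09:00-13:00"],
    same_as=["https://www.example.com/social/example-bakery"],
)

# The resulting JSON would typically be embedded in the page inside a
# <script type="application/ld+json"> element.
print(json.dumps(example, indent=2))
```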

Takeaways

AI crawling is a parallel method for content discovery, and the research shows clear patterns for the sites that crawlers visit most often.

  • AI crawling operates alongside traditional search, changing how content is accessed and reused
  • Sites with structured local signals, deeper content, and more complete schema are crawled more often
  • Multiple reinforcing signals appear together on the same sites, not in isolation
  • The data shows direction, not causation, but the patterns are consistent

The data shows that sites that make it easy for AI crawlers to index and revisit them tend to perform better. Interestingly, sites that present clear, structured, and verifiable information, while continuing to build real audience demand, are more likely to be revisited by AI systems and benefit from traffic generated through AI search.

Read the research: Duda study finds AI-optimized websites drive 320% more traffic to local businesses

Featured Image by Shutterstock/Preaapluem