Google’s Sundar Pichai recently said that the future of Search is agentic, but what does that really mean? A recent tweet from Google’s search product lead shows what this new kind of task-based search looks like. It’s increasingly apparent that the internet is transitioning to a model where every person has their own agent running tasks on their behalf, making each person’s experience of the internet increasingly personal.
Search Is Becoming Task-Oriented
The internet, with search as its gateway, follows a model in which websites are indexed, ranked, and served to users who enter essentially the same queries and retrieve virtually the same sets of web pages. AI is starting to break that model because users are shifting to researching topics, where a link to a website no longer provides the clear answers they are gradually becoming conditioned to expect. The internet was built to serve websites that users could visit and read, and to connect people with each other via social media.
What’s changing is that people can now use that same search box to do things, exactly as Pichai described. For example, Google recently announced the worldwide rollout of the ability to describe what you need for a restaurant reservation and have AI agents go out and fetch the information, including booking information.
“Date nights and big group dinners just got a lot easier.
We’re thrilled to expand agentic restaurant booking in Search globally, including the UK and India!
Tell AI Mode your group size, time, and vibe—it scans multiple platforms simultaneously to find real-time, bookable spots.
No more app-switching. No more hassle. Just great food.”
That’s not search, that’s task completion. What was not stated is that restaurants will need to be able to interact with these agents, providing information like available reservation slots and that evening’s menu choices, and at some point those websites will need to be able to book a reservation with the AI agent directly. This is not something that’s coming in the near future; it’s here right now.
“I feel like in search, with every shift, you’re able to do more with it.
…If I fast forward, a lot of what are just information seeking queries will be agentic search. You will be completing tasks, you have many threads running.”
When asked if search will still be around in ten years, Pichai answered:
“Search would be an agent manager, right, in which you’re doing a lot of things.
…And I can see search doing versions of those things, and you’re getting a bunch of stuff done.”
Everyone Has Their Own Personal Internet
Cloudflare recently published an article that says the internet was the first way for humans to interact with online content, and that cloud infrastructure was the second adaptation that emerged to serve the needs of mobile devices. The next adaptation is wild and has implications for SEO because it introduces a hyper-personalized version of the web that impacts local SEO, shopping, and information retrieval.
AI agents are currently forced to use an internet infrastructure that’s built to serve humans. That’s the part that Cloudflare says is changing. But the more profound insight is that the old way, where millions of people asked the same question and got the same indexed answer, is going away. What’s replacing it is a hyper-personal experience of the web, where every person can run their own agent.
“Unlike every application that came before them, agents are one-to-one. Each agent is a unique instance. Serving one user, running one task. Where a traditional application follows the same execution path regardless of who’s using it, an agent requires its own execution environment: one where the LLM dictates the code path, calls tools dynamically, adjusts its approach, and persists until the task is done.
Think of it as the difference between a restaurant and a personal chef. A restaurant has a menu — a fixed set of options — and a kitchen optimized to churn them out at volume. That’s most applications today. An agent is more like a personal chef who asks: what do you want to eat? They might need entirely different ingredients, utensils, or techniques each time. You can’t run a personal-chef service out of the same kitchen setup you’d use for a restaurant.”
Cloudflare’s angle is that they are providing the infrastructure to support the needs of billions of agents representing billions of humans. But that is not the part that concerns SEO. The part that concerns digital marketing is that the moment when search transforms into an “agent manager” is here, right now.
WordPress 7.0
Content management systems are rapidly adapting to this change. It’s difficult to overstate the importance of the soon-to-be-released WordPress 7.0, which is packed with capabilities for connecting to AI systems, features that will help the internet transition from a human-centered web to an increasingly agentic one.
The current internet is built for human interaction. Agents are operating within that structure, but that’s going to change very fast. The search marketing community really needs to wrap its collective mind around this change and to really understand how content management systems fit into that picture.
What Sources Do The Agents Trust?
Search marketing professional Mike Stewart recently posted on Facebook about this change, reflecting on what it means to him.
“I let Claude take over my computer. Not metaphorically — it moved my mouse, opened apps, and completed tasks on its own. That’s when something clicked… This isn’t just AI assisting anymore. This is AI operating on your behalf.
Google’s CEO is already talking about “agentic search” — where AI doesn’t just return results, it manages the process. So the real questions become: 👉 Who controls the journey? 👉 What sources does the agent trust? 👉 Where does your business show up in that decision layer? Because you don’t get “agentic search” without the ecosystem feeding it — websites, content, businesses.
That part isn’t going away. But it is being abstracted.”
Task-Based Agentic Search
I think the part we need to wrap our heads around is that humans are still making the decision to click the “make the reservation” button, and at some point, at least at the B2B layer, making purchases will increasingly become automated.
I still have my doubts about the complete automation of shopping. It feels unnatural, but it’s easy to see that the day may be rapidly approaching when, instead of writing a shopping list, a person will simply tell an AI agent to talk to local grocery stores’ AI agents, identify which store has the items in stock at the best price, dump them into a shopping cart, and present it to the human, who then approves it.
The big takeaway is that the web may be transitioning to the “everyone has a personal chef” model, and that’s a potentially scary level of personalization. How does an SEO optimize for that? I think that’s where WordPress 7.0 comes in, as well as any other content management systems that are agentic-web ready.
Every major AI platform can now browse websites autonomously. Chrome’s auto browse scrolls and clicks. ChatGPT Atlas fills forms and completes purchases. Perplexity Comet researches across tabs. But none of these agents sees your website the way a human does.
This is Part 4 in a five-part series on optimizing websites for the agentic web. Part 1 covered the evolution from SEO to AAIO. Part 2 explained how to get your content cited in AI responses. Part 3 mapped the protocols forming the infrastructure layer. This article gets technical: how AI agents actually perceive your website, and what to build for them.
The core insight is one that keeps coming up in my research: The most impactful thing you can do for AI agent compatibility is the same work web accessibility advocates have been pushing for decades. The accessibility tree, originally built for screen readers, is becoming the primary interface between AI agents and your website.
According to the 2025 Imperva Bad Bot Report (Imperva is a cybersecurity company), automated traffic surpassed human traffic for the first time in 2024, constituting 51% of all web interactions. Not all of that is agentic browsing, but the direction is clear: the non-human audience for your website is already larger than the human one, and it’s growing. Throughout this article, I draw exclusively from official documentation, peer-reviewed research, and announcements from the companies building this infrastructure.
Three Ways Agents See Your Website
When a human visits your website, they see colors, layout, images, and typography. When an AI agent visits, it sees something entirely different. Understanding what agents actually perceive is the foundation for building websites that work for them.
The major AI platforms use three distinct approaches, and the differences have direct implications for how you should structure your website.
Vision: Reading Screenshots
Anthropic’s Computer Use takes the most literal approach. Claude captures screenshots of the browser, analyzes the visual content, and decides what to click or type based on what it “sees.” It’s a continuous feedback loop: screenshot, reason, act, screenshot. The agent operates at the pixel level, identifying buttons by their visual appearance and reading text from the rendered image.
Google’s Project Mariner follows a similar pattern with what Google describes as an “observe-plan-act” loop: observe captures visual elements and underlying code structures, plan formulates action sequences, and act simulates user interactions. Mariner achieved an 83.5% success rate on the WebVoyager benchmark.
The vision approach works, but it’s computationally expensive, sensitive to layout changes, and limited by what’s visually rendered on screen.
Accessibility Tree: Reading Structure
ChatGPT Atlas uses ARIA tags, the same labels and roles that support screen readers, to interpret page structure and interactive elements.
Atlas is built on Chromium, but rather than analyzing rendered pixels, it queries the accessibility tree for elements with specific roles (“button”, “link”) and accessible names. This is the same data structure that screen readers like VoiceOver and NVDA use to help people with visual disabilities navigate the web.
Microsoft’s Playwright MCP, the official MCP server for browser automation, takes the same approach. It provides accessibility snapshots rather than screenshots, giving AI models a structured representation of the page. Microsoft deliberately chose accessibility data over visual rendering for their browser automation standard.
Hybrid: Both At Once
In practice, the most capable agents combine approaches. OpenAI’s Computer-Using Agent (CUA), which powers both Operator and Atlas, layers screenshot analysis with DOM processing and accessibility tree parsing. It prioritizes ARIA labels and roles, falling back to text content and structural selectors when accessibility data isn’t available.
Perplexity’s research confirms the same pattern. Their BrowseSafe paper, which details the safety infrastructure behind Comet’s browser agent, describes using “hybrid context management combining accessibility tree snapshots with selective vision.”
| Platform | Primary Approach | Details |
| --- | --- | --- |
| Anthropic Computer Use | Vision (screenshots) | Screenshot, reason, act feedback loop |
| Google Project Mariner | Vision + code structure | Observe-plan-act with visual and structural data |
| OpenAI Atlas | Accessibility tree | Explicitly uses ARIA tags and roles |
| OpenAI CUA | Hybrid | Screenshots + DOM + accessibility tree |
| Microsoft Playwright MCP | Accessibility tree | Accessibility snapshots, no screenshots |
| Perplexity Comet | Hybrid | Accessibility tree + selective vision |
The pattern is clear. Even platforms that started with vision-first approaches are incorporating accessibility data. And the platforms optimizing for reliability and efficiency (Atlas, Playwright MCP) lead with the accessibility tree.
Your website’s accessibility tree isn’t a compliance artifact. It’s increasingly the primary interface agents use to understand and interact with your website.
Last year, before the European Accessibility Act took effect, I half-joked that it would be ironic if the thing that finally got people to care about accessibility was AI agents, not the people accessibility was designed for. That’s no longer a joke.
The Accessibility Tree Is Your Agent Interface
The accessibility tree is a simplified representation of your page’s DOM that browsers generate for assistive technologies. Where the full DOM contains every div, span, style, and script, the accessibility tree strips away the noise and exposes only what matters: interactive elements, their roles, their names, and their states.
This is why it works so well for agents. A typical page’s DOM might contain thousands of nodes. The accessibility tree reduces that to the elements a user (or agent) can actually interact with: buttons, links, form fields, headings, landmarks. For AI models that process web pages within a limited context window, that reduction is significant.
OpenAI’s guidance for Atlas says it plainly:
“Follow WAI-ARIA best practices by adding descriptive roles, labels, and states to interactive elements like buttons, menus, and forms. This helps ChatGPT recognize what each element does and interact with your site more accurately.”
And:
“Making your website more accessible helps ChatGPT Agent in Atlas understand it better.”
Research backs this up. The most rigorous data comes from a UC Berkeley and University of Michigan study published for CHI 2026, the premier academic conference on human-computer interaction. The researchers tested Claude Sonnet 4.5 on 60 real-world web tasks under different accessibility conditions, collecting 40.4 hours of interaction data across 158,325 events. The results were striking:
| Condition | Task Success Rate | Avg. Completion Time |
| --- | --- | --- |
| Standard (default) | 78.33% | 324.87 seconds |
| Keyboard-only | 41.67% | 650.91 seconds |
| Magnified viewport | 28.33% | 1,072.20 seconds |
Under standard conditions, the agent succeeded nearly 80% of the time. Restrict it to keyboard-only interaction (simulating how screen reader users navigate) and success drops to 42%, taking twice as long. Restrict the viewport (simulating magnification tools), and success drops to 28%, taking over three times as long.
The paper identifies three categories of gaps:
Perception gaps: agents can’t reliably access screen reader announcements or ARIA state changes that would tell them what happened after an action.
Cognitive gaps: agents struggle to track task state across multiple steps.
Action gaps: agents underutilize keyboard shortcuts and fail at interactions like drag-and-drop.
The implication is direct. Websites that present a rich, well-labeled accessibility tree give agents the information they need to succeed. Websites that rely on visual cues, hover states, or complex JavaScript interactions without accessible alternatives create the conditions for agent failure.
Perplexity’s search API architecture paper from September 2025 reinforces this from the content side. Their indexing system prioritizes content that is “high quality in both substance and form, with information captured in a manner that preserves the original content structure and layout.” Websites “heavy on well-structured data in list or table form” benefit from “more formulaic parsing and extraction rules.” Structure isn’t just helpful. It’s what makes reliable parsing possible.
Semantic HTML: The Agent Foundation
The accessibility tree is built from your HTML. Use semantic elements, and the browser generates a useful accessibility tree automatically. Skip them, and the tree is sparse or misleading.
This isn’t new advice. Web standards advocates have been screaming “use semantic HTML” for two decades. Not everyone listened. What’s new is that the audience has expanded. It used to be about screen readers and a relatively small percentage of users. Now it’s about every AI agent that visits your website.
Use native elements. A <button> element automatically appears in the accessibility tree with the role “button” and its text content as the accessible name. A <div> with a click handler does not; the agent doesn’t know it’s clickable. For example, <button>Search flights</button> exposes both a role and a name, while <div onclick="search()">Search flights</div> exposes neither.
Label your forms. Every input needs an associated label. Agents read labels to understand what data a field expects.
The autocomplete attribute deserves attention. It tells agents (and browsers) exactly what type of data a field expects, using standardized values like name, email, tel, street-address, and organization. When an agent fills a form on someone’s behalf, autocomplete attributes make the difference between confident field mapping and guessing.
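Put together, a labeled form field with standard autocomplete values might look like this (a minimal sketch; the field names and form action are illustrative):

```html
<!-- Each input has a programmatically associated label (via for/id)
     and a standardized autocomplete value agents can map with confidence. -->
<form action="/checkout" method="post">
  <label for="full-name">Full name</label>
  <input id="full-name" name="full-name" type="text" autocomplete="name">

  <label for="email">Email address</label>
  <input id="email" name="email" type="email" autocomplete="email">

  <label for="street">Street address</label>
  <input id="street" name="street" type="text" autocomplete="street-address">

  <button type="submit">Continue</button>
</form>
```

The label text becomes each field’s accessible name in the accessibility tree, and the autocomplete value tells an agent what kind of data belongs in the field without guessing.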
Establish heading hierarchy. Use h1 through h6 in logical order. Agents use headings to understand page structure and locate specific content sections. Skip levels (jumping from h1 to h4) create confusion about content relationships.
Use landmark regions. HTML5 landmark elements (<header>, <nav>, <main>, <aside>, <footer>) tell agents where they are on the page. A <nav> element is unambiguously navigation. A <div class="nav-wrapper"> requires interpretation. Clarity for the win, always.
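As a sketch, a minimal landmark skeleton for a hypothetical flight-search page could look like this:

```html
<header>
  <h1>Flight Search</h1>
</header>
<nav aria-label="Primary">…site menu…</nav>
<main>…search form and results…</main>
<aside aria-label="Travel tips">…related content…</aside>
<footer>…contact and legal links…</footer>
```

Each element maps to a named landmark role (banner, navigation, main, complementary, contentinfo), so an agent can jump straight to the region it needs.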
Microsoft’s Playwright test agents, introduced in October 2025, generate test code that uses accessible selectors by default. When the AI generates a Playwright test, it writes:
const todoInput = page.getByRole('textbox', { name: 'What needs to be done?' });
Not CSS selectors. Not XPath. Accessible roles and names. Microsoft built its AI testing tools to find elements the same way screen readers do, because it’s more reliable.
The final slide of my Conversion Hotel keynote about optimizing websites for AI agents. (Image Credit: Slobodan Manic)
ARIA: Useful, Not Magic
OpenAI recommends ARIA (Accessible Rich Internet Applications), the W3C standard for making dynamic web content accessible. But ARIA is a supplement, not a substitute. Like protein shakes: useful on top of a real diet, counterproductive as a replacement for actual food.
The W3C’s own guidance is blunt: “If you can use a native HTML element or attribute with the semantics and behavior you require already built in, instead of re-purposing an element and adding an ARIA role, state or property to make it accessible, then do so.”
The fact that the W3C had to make “don’t use ARIA” the first rule of ARIA tells you everything about how often it gets misused.
Adrian Roselli, a recognized web accessibility expert, raised an important concern in his October 2025 analysis of OpenAI’s guidance. He argues that recommending ARIA without sufficient context risks encouraging misuse. Websites that use ARIA are generally less accessible according to WebAIM’s annual survey of the top million websites, because ARIA is often applied incorrectly as a band-aid over poor HTML structure. Roselli warns that OpenAI’s guidance could incentivize practices like keyword-stuffing in aria-label attributes, the same kind of gaming that plagued meta keywords in early SEO.
The right approach is layered:
Start with semantic HTML. Use <button>, <a>, <select>, <label>, and other native elements. These work correctly by default.
Add ARIA when native HTML isn’t enough. Custom components that don’t have HTML equivalents (tab panels, tree views, disclosure widgets) need ARIA roles and states to be understandable.
Use ARIA states for dynamic content. When JavaScript changes the page, ARIA attributes communicate what happened:
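For example, a disclosure widget whose open/closed state is mirrored in ARIA might be wired up like this (a minimal sketch; the ids are illustrative):

```html
<button id="filters-toggle" aria-expanded="false" aria-controls="filters-panel">
  Show filters
</button>
<div id="filters-panel" hidden>…filter controls…</div>

<script>
  const toggle = document.getElementById('filters-toggle');
  const panel = document.getElementById('filters-panel');

  toggle.addEventListener('click', () => {
    const isOpen = toggle.getAttribute('aria-expanded') === 'true';
    // Flip the ARIA state so agents and screen readers know what happened…
    toggle.setAttribute('aria-expanded', String(!isOpen));
    // …and keep the visible state in sync with it.
    panel.hidden = isOpen;
  });
</script>
```

An agent that reads aria-expanded after clicking can confirm the action worked, instead of inferring it from pixels.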
Keep aria-label descriptive and honest. Use it to provide context that isn’t visible on screen, like distinguishing between multiple “Delete” buttons on the same page. Don’t stuff it with keywords.
The principle is the same one that applies to good SEO: build for the user first, optimize for the system second. Semantic HTML is building for the user. ARIA is fine-tuning for edge cases where HTML falls short.
The Rendering Question
Browser-based agents like Chrome auto browse, ChatGPT Atlas, and Perplexity Comet run on Chromium. They execute JavaScript. They can render your single-page application.
But not everything that visits your website is a full browser agent.
AI crawlers (PerplexityBot, OAI-SearchBot, ClaudeBot) index your content for retrieval and citation. Many of these crawlers do not execute client-side JavaScript. If your page is a blank <div> until React hydrates, these crawlers see an empty page. Your content is invisible to the AI search ecosystem.
Part 2 of this series covered the citation side: AI systems select fragments from indexed content. If your content isn’t in the initial HTML, it’s not in the index. If it’s not in the index, it doesn’t get cited. Server-side rendering isn’t just a performance optimization.
It’s a visibility requirement.
Even for full browser agents, JavaScript-heavy websites create friction. Dynamic content that loads after interactions, infinite scroll that never signals completion, and forms that reconstruct themselves after each input all create opportunities for agents to lose track of state. The A11y-CUA research attributed part of agent failure to “cognitive gaps”: agents losing track of what’s happening during complex multi-step interactions. Simpler, more predictable rendering reduces these failures.
Microsoft’s guidance from Part 2 applies here directly: “Don’t hide important answers in tabs or expandable menus: AI systems may not render hidden content, so key details can be skipped.” If information matters, put it in the visible HTML. Don’t require interaction to reveal it.
Practical rendering priorities:
Server-side render or pre-render content pages. If an AI crawler can’t see it, it doesn’t exist in the AI ecosystem.
Avoid blank-shell SPAs for content pages. Frameworks like Next.js (which powers this website), Nuxt, and Astro make SSR straightforward.
Don’t hide critical information behind interactions. Prices, specifications, availability, and key details should be in the initial HTML, not behind accordions or tabs.
Use standard links for navigation. Client-side routing that doesn’t update the URL or uses onClick handlers instead of real links breaks agent navigation.
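The last point is easy to check in markup. A hedged sketch of the difference (the URLs and handler names are illustrative):

```html
<!-- Agents (and crawlers) can follow this: a real link with a real URL. -->
<a href="/flights/results?from=AMS&amp;to=JFK">View flights</a>

<!-- Agents often can't: no href, no link role, navigation buried in a handler. -->
<span class="link" onclick="router.push('/flights/results')">View flights</span>
```

The first appears in the accessibility tree as a link with a destination; the second is just text with a class name.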
Testing Your Agent Interface
You wouldn’t ship a website without testing it in a browser. Testing how agents perceive your website is becoming equally important.
Screen reader testing is the best proxy. If VoiceOver (macOS), NVDA (Windows), or TalkBack (Android) can navigate your website successfully, identifying buttons, reading form labels, and following the content structure, agents can likely do the same. Both audiences rely on the same accessibility tree. This isn’t a perfect proxy (agents have capabilities screen readers don’t, and vice versa), but it catches the majority of issues.
Microsoft’s Playwright MCP provides direct accessibility snapshots. If you want to see exactly what an AI agent sees, Playwright MCP generates structured accessibility snapshots of any page. These snapshots strip away visual presentation and show you the roles, names, and states that agents work with. Published as @playwright/mcp on npm, it’s the most direct way to view your website through an agent’s eyes.
The output looks something like this (simplified):
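Here is a hedged illustration of what such an accessibility snapshot can look like, using a hypothetical flight-search page (the exact output format varies by version):

```yaml
- banner:
    - heading "Flight Search" [level=1]
- navigation "Primary":
    - link "Home"
    - link "Deals"
- main:
    - textbox "From"
    - textbox "To"
    - button "Search flights"
- contentinfo:
    - link "Privacy policy"
```

If a control shows up here without a role or a useful name, that is the gap to fix.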
If your critical interactive elements don’t appear in the snapshot, or appear without useful names, agents will struggle with your website.
Browserbase’s Stagehand (v3, released October 2025, and humbly self-described as “the best browser automation framework”) provides another angle. It parses both DOM and accessibility trees, and its self-healing execution adapts to DOM changes in real time. It’s useful for testing whether agents can complete specific workflows on your website, like filling a form or completing a checkout.
The Lynx browser is a low-tech option worth trying. It’s a text-only browser that strips away all visual rendering, showing you roughly what a non-visual agent parses. A trick I picked up from Jes Scholz on the podcast.
A practical testing workflow:
Run VoiceOver or NVDA through your website’s key user flows. Can you complete the core tasks without vision?
Generate Playwright MCP accessibility snapshots of critical pages. Are interactive elements labeled and identifiable?
View your page source. Is the primary content in the HTML, or does it require JavaScript to render?
Load your page in Lynx or disable CSS and check if the content order and hierarchy still make sense. Agents don’t see your layout.
A Checklist For Your Development Team
If you’re sharing this article with your developers (and you should), here’s the prioritized implementation list. Ordered by impact and effort, starting with the changes that affect the most agent interactions for the least work.
High impact, low effort:
Use native HTML elements. <button> for actions, <a> for links, <select> for dropdowns. Replace <div onclick> patterns wherever they exist.
Label every form input. Associate <label> elements with inputs using the for attribute. Add autocomplete attributes with standard values.
Server-side render content pages. Ensure primary content is in the initial HTML response.
High impact, moderate effort:
Implement landmark regions. Wrap content in <header>, <nav>, <main>, and <footer> elements. Add aria-label when multiple landmarks of the same type exist on the same page.
Fix heading hierarchy. Ensure a single h1, with h2 through h6 in logical order without skipping levels.
Move critical content out of hidden containers. Prices, specifications, and key details should not require clicks or interactions to reveal.
Moderate impact, low effort:
Add ARIA states to dynamic components. Use aria-expanded, aria-controls, and aria-hidden for menus, accordions, and toggles.
Use descriptive link text. “Read the full report” instead of “Click here.” Agents use link text to understand where links lead.
Test with a screen reader. Make it part of your QA process, not a one-time audit.
Key Takeaways
AI agents perceive websites through three approaches: vision, DOM parsing, and the accessibility tree. The industry is converging on the accessibility tree as the most reliable method. OpenAI Atlas, Microsoft Playwright MCP, and Perplexity’s Comet all rely on accessibility data.
Web accessibility is no longer just about compliance. The accessibility tree is the literal interface AI agents use to understand your website. The UC Berkeley/University of Michigan study shows agent success rates drop significantly when accessibility features are constrained.
Semantic HTML is the foundation. Native elements like <button>, <a>, <nav>, and <main> automatically create a useful accessibility tree. No framework required. No ARIA needed for the basics.
ARIA is a supplement, not a substitute. Use it for dynamic states and custom components. But start with semantic HTML and add ARIA only where native elements fall short. Misused ARIA makes websites less accessible, not more.
Server-side rendering is an agent visibility requirement. AI crawlers that don’t execute JavaScript can’t see content in blank-shell SPAs. If your content isn’t in the initial HTML, it doesn’t exist in the AI ecosystem.
Screen reader testing is the best proxy for agent compatibility. If VoiceOver or NVDA can navigate your website, agents probably can too. For direct inspection, Playwright MCP accessibility snapshots show exactly what agents see.
The first three parts of this series covered why the shift matters, how to get cited, and what protocols are being built. This article covered the implementation layer. The encouraging news is that these aren’t separate workstreams. Accessible, well-structured websites perform better for humans, rank better in search, get cited more often by AI, and work better for agents. It’s the same work serving four audiences.
And the work builds on itself. The semantic HTML and structured data covered here are exactly what WebMCP builds on for its declarative form approach. The accessibility tree your website exposes today becomes the foundation for the structured tool interfaces of tomorrow.
Up next in Part 5: the commerce layer. How Stripe, Shopify, and OpenAI are building the infrastructure for AI agents to complete purchases, and what it means for your checkout flow.
Everyone is scrambling to incorporate AI. But what takes priority?
Is generative engine optimization (GEO) replacing traditional SEO?
Should you shift budget from traditional SEO to AI content experiments?
Watch this on-demand SEO webinar to see how to prioritize SEO vs. AI search based on your business model.
Before You Reallocate SEO Budget, Validate Where AI Will Drive Incremental Growth In Channel Mix
In this session, DAC’s Alex Hernandez, Associate Director of SEO, and Orli Millstein, Director of Content Strategy, challenge the assumption that more AI optimization automatically equals more growth. Instead, you’ll see how business model, product complexity, and customer journey determine whether AI visibility should be accelerated, balanced, or deprioritized.
You’ll walk away with a structured way to evaluate strategic fit, content readiness, and revenue impact before reallocating budget or rewriting your roadmap.
Watch the on-demand webinar now to build an AI search strategy that strengthens performance rather than dilutes it!
Akamai analyzed AI bot activity by examining application-layer traffic from its bot management tools.
Commerce drew the most AI bot traffic at 48%. Media, which includes publishing, video, social media, and broadcasting, came second at 13%.
Publishing companies accounted for 40% of all AI bot activity in media, ahead of broadcast and OTT at 29%.
OpenAI generated the most AI bot traffic hitting media companies, with 40% of its media requests going to publishing companies. That’s partly because OpenAI runs multiple bots. GPTBot handles training, OAI-SearchBot powers AI search, and ChatGPT-User retrieves content in real time.
Meta and ByteDance were the second- and third-largest operators. Anthropic and Perplexity rounded out the top five at lower volumes.
Why Akamai Says Fetcher Bots Are The Bigger Concern
The report groups AI bots into four types based on behavior.
Training crawlers and fetchers account for most of the AI bot activity Akamai saw in media, which includes publishing. Training crawlers collect content to build language models. They made up 63% of AI bot activity targeting media in H2 2025.
Fetcher bots grab specific pages in real time when someone asks an AI chatbot a question. They made up 24%, and publishing accounted for 43% of that fetcher activity.
Akamai argues that fetcher bots are the more immediate revenue concern, even though training crawlers generate more total traffic. When a fetcher bot pulls an article to answer a chatbot query, the user gets the information without visiting the publisher’s site.
How Publishers Are Responding
It’s worth noting that Akamai sells bot management tools, and the report’s recommendations point toward its own products and partners.
The most common responses among Akamai’s customers are deny (blocking requests outright), tarpit (holding connections open to waste bot resources), and delay (adding a pause before responding). One unnamed publisher chose tarpitting over blocking, controlled 97% of AI bot requests, and kept the door open to potential licensing deals.
The report argues against blanket blocking, saying some AI companies are willing to pay for content access and that blocking all bots removes that option.
Looking Ahead
The report’s top takeaway is the distinction between training crawlers and fetcher bots. Blocking a training crawler can influence how your content helps build future AI models. Blocking a fetcher bot affects whether your content appears in AI responses right now.
AI search is dominating the strategy conversation right now, and every SEO director is fielding the same pressure from leadership: “What’s our AI search plan?”
The instinct is to optimize everywhere: close every citation gap, refresh every page, pursue every placement. But before you reallocate budget or rebuild your GEO roadmap, there’s a more useful question to ask first:
Which AI search signals are actually driving citations for your brand, and do you have a system to act on them?
Join us for an upcoming expert webinar where we’ll dive into exactly that.
What You’ll Learn
In this webinar, Sam Garg, Founder and CEO of Writesonic, will break down what 500M+ AI conversations reveal about citation signals, and show how that data should shape your GEO execution strategy.
Specifically, you’ll walk away with:
The signals behind AI citations: which content types, sources, and placements actually get cited in ChatGPT, Perplexity, and Gemini, and why it differs from traditional ranking logic
A GEO prioritization framework: so you stop spreading effort equally across citation outreach, content refresh, and third-party placements, and focus on what moves the needle for your specific gaps
An execution model powered by AI agents: including free open-source tools you can deploy right away to automate GEO tasks at scale
Why Attend?
Most SEO teams already have dashboards showing where they’re invisible in AI search. Few have a process to fix it. This session gives you both the diagnostic framework and the execution playbook to close those gaps, and the data to make the case for AI search investment internally.
Join us live to get your questions answered directly by the expert.
This post was sponsored by Alli AI. The opinions expressed in this article are the sponsor’s own.
Everyone assumes Googlebot is the dominant crawler hitting their website. That assumption is now wrong.
We analyzed 24,411,048 proxy requests across 78,000+ pages on 69 customer websites on Alli AI’s crawler enablement platform over a 55-day period (January to March 2026). OpenAI’s ChatGPT-User crawler made 3.6x more requests than Googlebot across our data sample. And that’s not even counting GPTBot, OpenAI’s separate training crawler.
A note on methodology: Crawler identification used user agent string matching, verified against published IP ranges. Request metrics are measured at the proxy/CDN layer. The dataset covers 69 websites across a variety of industries and sizes, predominantly WordPress-based. Full methodology is detailed at the end.
Finding 1: AI Crawlers Now Outpace Google 3.6x & ChatGPT Leads the Pack
Image created by Alli AI, April 2026.
When we ranked every identified crawler by request volume, the results were unambiguous:
Rank
Crawler
Requests
Category
1
ChatGPT-User (OpenAI)
133,361
AI Search
2
Googlebot
37,426
Traditional Search
3
Amazonbot
35,728
AI / E-Commerce
4
Bingbot
18,280
Traditional Search
5
ClaudeBot (Anthropic)
13,918
AI Search
6
MetaBot
10,756
Social
7
GPTBot (OpenAI)
8,864
AI Training
8
Applebot
6,794
AI Search
9
Bytespider (ByteDance)
6,644
AI Training
10
PerplexityBot
5,731
AI Search
ChatGPT-User made more requests than Googlebot, Amazonbot, and Bingbot combined.
Image created by Alli AI, April 2026.
Grouped by purpose, AI-related crawlers (ChatGPT-User, GPTBot, ClaudeBot, Amazonbot, Applebot, Bytespider, PerplexityBot, CCBot) made 213,477 requests versus 59,353 for traditional search crawlers (Googlebot, Bingbot, YandexBot). AI crawlers are now making 3.6x more requests than traditional search crawlers across our network.
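The headline multiples fall straight out of those grouped counts; a quick check using the totals reported above:

```python
# Grouped totals reported in the analysis. CCBot and YandexBot are
# included in these totals but not broken out in the top-10 table.
ai_requests = 213_477      # ChatGPT-User, GPTBot, ClaudeBot, Amazonbot, etc.
search_requests = 59_353   # Googlebot, Bingbot, YandexBot

print(f"{ai_requests / search_requests:.1f}x")  # 3.6x

# The per-vendor claim: OpenAI's two crawlers combined vs. Googlebot.
openai_total = 133_361 + 8_864  # ChatGPT-User + GPTBot = 142,225
print(f"{openai_total / 37_426:.1f}x")  # 3.8x
```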
Finding 2: OpenAI Uses 2 Crawlers (And Most Sites Don’t Know the Difference)
Image created by Alli AI, April 2026.
OpenAI operates two distinct crawlers with very different purposes.
ChatGPT-User is the retrieval crawler. It fetches pages in real time when users ask ChatGPT questions that require up-to-date web information. This determines whether your content appears in ChatGPT’s answers.
GPTBot is the training crawler. It collects data to improve OpenAI’s models. Many sites block GPTBot via robots.txt but not ChatGPT-User, or vice versa, without understanding the distinct consequences of each.
Combined, OpenAI’s crawlers made 142,225 requests: 3.8x Googlebot’s volume.
The robots.txt directives are separate:
User-agent: GPTBot # Training crawler — feeds OpenAI's models
User-agent: ChatGPT-User # Retrieval crawler — fetches pages for ChatGPT answers
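Because the two user agents are addressed separately, a robots.txt can carry opposite rules for each. A purely illustrative fragment, matching the common pattern of staying visible in ChatGPT answers while opting out of model training:

```text
# Allow real-time retrieval so pages can appear in ChatGPT answers
User-agent: ChatGPT-User
Allow: /

# Opt out of model training entirely
User-agent: GPTBot
Disallow: /
```

Sites that want both crawlers allowed, or path-level control over what gets trained on, would adjust the Allow/Disallow lines accordingly.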
Finding 3: AI Crawlers Are Faster & More Reliable, But Their Volume Adds Up
Image created by Alli AI, April 2026.
AI crawlers are significantly more efficient per request:
Crawler
Avg Response Time
200 Success Rate
PerplexityBot
8ms
100%
ChatGPT-User
11ms
99.99%
GPTBot
12ms
99.9%
ClaudeBot
21ms
99.9%
Bingbot
42ms
98.4%
Googlebot
84ms
96.3%
Two likely reasons. First, AI retrieval crawlers are fetching specific pages in response to user queries, not exhaustively discovering site architecture. They know what they want, they grab it, and they leave. Second, while all crawlers on our infrastructure receive pre-rendered responses, Googlebot’s broader crawl pattern means it requests a wider range of URLs, including stale paths from sitemaps and its own legacy index, which adds latency from redirect chains and error handling that retrieval crawlers avoid entirely.
But there’s a catch: while each individual request is lightweight, the sheer volume means aggregate server load is substantial. ChatGPT-User at 11ms × 133,361 requests is still a real infrastructure cost, just distributed differently than Googlebot’s fewer, heavier requests.
Finding 4: Googlebot Sees a Different (Worse) Version of Your Site
Image created by Alli AI, April 2026.
Googlebot’s 96.3% success rate versus near-perfect rates for AI crawlers reveals an important structural difference.
Googlebot received 624 blocked responses (403) and 480 not found errors (404), accounting for 3% of its requests. Meanwhile, ChatGPT-User achieved 99.99% success. PerplexityBot hit a perfect 100%.
Image created by Alli AI, April 2026.
Why the gap? The most likely explanation is index age and crawl behavior, not site misconfiguration.
Googlebot maintains a massive legacy index built over years of continuous crawling. It routinely re-requests URLs it already knows about — including pages that have since been deleted (404s) or restructured (403s). This is normal behavior for a search engine maintaining an index of this scale, but it means a meaningful percentage of Googlebot’s requests are directed at URLs that no longer exist.
AI crawlers don’t carry that baggage. ChatGPT-User fetches specific pages in response to real-time user queries, targeting content that’s currently relevant and linked. That’s a structural advantage that produces near-perfect success rates.
Industry Reports Confirm AI Crawling Surged 15x in 2025
Our data suggests this crossover, from traditional search crawling to AI crawling as the dominant source of bot traffic, may already be happening at the site level for properties that actively enable AI crawler access.
Your New SEO Strategy: How To Audit, Clean Up & Optimize For AI Crawlers
1. Audit your robots.txt for AI crawlers today
Most robots.txt files were written for a Googlebot-first world. At minimum, have explicit directives for ChatGPT-User, GPTBot, ClaudeBot, Amazonbot, PerplexityBot, Applebot, Bytespider, CCBot, and Google-Extended.
Our recommendation: Most businesses benefit from allowing both retrieval crawlers (ChatGPT-User, PerplexityBot, ClaudeBot) and training crawlers (GPTBot, CCBot, Bytespider). Training data is what teaches these models about your brand, products, and expertise. Blocking training crawlers today means AI models learn less about you tomorrow, which reduces your chances of being cited in AI-generated answers down the line.
The exception: if you have content you specifically need to protect from model training (proprietary research, gated content), use granular Disallow rules for those paths rather than blanket blocks.
2. Clean up stale URLs in Google Search Console
Our data shows Googlebot hits a 3% error rate, mostly 403s and 404s, while AI crawlers achieve near-perfect success rates. That gap likely reflects Googlebot re-crawling legacy URLs that no longer exist. But those failed requests still consume the crawl budget.
Audit your GSC crawl stats for recurring 404s and 403s. Set up proper redirects for restructured URLs and submit updated sitemaps.
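If you export crawl stats or server logs to CSV, surfacing the URLs Googlebot keeps failing on is a short script. The column names below are assumptions about your export format, so adjust them to match:

```python
import csv
from collections import Counter

def recurring_errors(log_csv_path: str, min_hits: int = 5) -> list[tuple[str, int]]:
    """Return URLs that repeatedly returned 403/404 to Googlebot.

    Assumes a CSV with 'url', 'status', and 'user_agent' columns;
    rename to match your actual log export.
    """
    hits = Counter()
    with open(log_csv_path, newline="") as f:
        for row in csv.DictReader(f):
            if "Googlebot" in row["user_agent"] and row["status"] in ("403", "404"):
                hits[row["url"]] += 1
    # Most-requested failures first; ignore one-off errors.
    return [(url, n) for url, n in hits.most_common() if n >= min_hits]
```

Each URL this surfaces is a candidate for a 301 redirect, a robots rule review, or a sitemap cleanup.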
3. Treat AI crawler accessibility as a distinct SEO channel
Ranking in ChatGPT’s answers, Perplexity’s results, and Claude’s responses is emerging as a distinct visibility channel. If your content isn’t accessible to these crawlers, particularly if you’re running JavaScript-heavy frameworks, you’re invisible in AI search.
If you want to see what this looks like in practice, we’ve published a live dashboard showing how AI crawler traffic breaks down across a real site: which platforms are visiting, how often, and their share of total traffic.
4. Plan for volume, not just individual request weight
AI crawlers send light, fast requests, but they send many of them. ChatGPT-User alone accounted for more than 133,000 requests in 55 days, so the aggregate server load from AI crawlers is now likely exceeding your Googlebot load. Make sure your hosting and CDN can handle it. The low per-request response times in our data reflect the fact that Alli AI serves pre-rendered static HTML from the CDN edge, which is exactly the kind of architecture that absorbs this volume without taxing your origin server.
Methodology
This analysis is based on 24,411,048 HTTP proxy requests processed through Alli AI’s crawler enablement platform between January 14 and March 9, 2026, covering 69 customer websites.
Crawler identification used user agent string matching, verified against published IP ranges. For OpenAI crawlers specifically, every request was cross-referenced against OpenAI’s published CIDR ranges. This confirmed 100% of GPTBot requests and 99.76% of ChatGPT-User requests originated from OpenAI’s infrastructure. The remaining 0.24% (requests from spoofed user agents) were excluded.
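The cross-referencing step described here can be reproduced with the standard library's ipaddress module. The CIDR ranges below are placeholders for illustration, not OpenAI's actual published list:

```python
import ipaddress

# Placeholder ranges (documentation-only addresses). Fetch the real
# CIDR list from the vendor's published documentation before relying
# on this check in production.
OPENAI_CIDRS = [ipaddress.ip_network(c) for c in ("203.0.113.0/24", "198.51.100.0/24")]

def is_verified_openai(ip: str) -> bool:
    """True if the request IP falls inside a published OpenAI range."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in OPENAI_CIDRS)

print(is_verified_openai("203.0.113.7"))  # True
print(is_verified_openai("192.0.2.1"))    # False — likely a spoofed user agent
```

Requests whose user agent claims to be an OpenAI crawler but whose IP fails this check are the "spoofed" 0.24% excluded from the dataset.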
Limitations: The dataset is scoped to Alli AI customers who have opted into crawler enablement. Crawlers that don’t self-identify via user agent are not captured. Response time measurements are at the proxy layer, not the origin server.
About Alli AI
Alli AI provides server-side rendering infrastructure for AI and search engine crawlers. This analysis was produced using data from our proxy infrastructure to help the SEO community better understand the evolving crawler landscape.
Want to see this data in action? See the breakdown firsthand by visiting our AI visibility dashboard.
Resoneo says ChatGPT responses began referencing about 20% fewer websites after what it identifies as the early-March transition to GPT-5.3 Instant.
The analysis comes from the French SEO consultancy and draws on data from Meteoria, an AI visibility-tracking platform that monitored 400 prompts daily over 14 weeks, producing 27,000 comparable responses.
Average unique domains per response dropped from 19 before the transition to 15 after. Average unique URLs per response fell from 24 to 19.
The URLs-per-domain ratio held roughly steady (about 1.3) throughout the tracking period. The data suggests ChatGPT isn’t visiting as many sites per response, but it’s going just as deep into each one.
Fewer domains now share the same citation surface in each response, meaning the sites that do get cited take up a larger share of each answer.
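Those averages can be sanity-checked in a few lines, using the figures as reported above:

```python
# Averages per response, before and after the reported model transition.
before = {"domains": 19, "urls": 24}
after = {"domains": 15, "urls": 19}

# URLs-per-domain stays essentially flat while both counts shrink,
# which is the "fewer sites, same depth" pattern.
print(round(before["urls"] / before["domains"], 2))  # 1.26
print(round(after["urls"] / after["domains"], 2))    # 1.27
print(f"{(before['domains'] - after['domains']) / before['domains']:.0%} fewer domains")
```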
Server Logs Back Up The Pattern
Independent log analysis from Jérôme Salomon at Oncrawl supports the findings. Tracking ChatGPT-User bot activity across multiple websites, his data shows crawl volume has settled at a lower level. Some pages aren’t being crawled at all anymore, and the crawl frequency for pages still being visited has dropped.
Resoneo links the change to ChatGPT’s default experience now being driven more heavily by GPT-5.3 Instant, which the company says triggers fewer web searches and citations than earlier behavior. Oncrawl’s server log data shows the lower crawl pattern over the same period.
A 20% drop in cited domains per response means fewer websites competing for visibility inside each ChatGPT answer. The total citation surface shrank, but the sites that kept appearing maintained their same crawl depth.
For anyone tracking referral traffic from ChatGPT, the early-March model transition is a date range worth checking in your analytics.
Looking Ahead
Resoneo’s analysis notes that GPT-5.4 Thinking reintroduces search fan-outs and uses site: operators to target trusted domains, but these behaviors weren’t captured in the quantitative dataset, which covers GPT-5.3 Instant and below.
Whether the citation surface continues to narrow or widens again with newer models isn’t yet clear.
Google, OpenAI, and Shopify insist that the next revolution in AI is agentic shopping. Shopping is a lucrative area for AI to burrow into. The thing I keep thinking about is that shopping is a deeply important activity for humans; it’s literally a part of our DNA. Is surrendering the shopping experience something the general public is willing to do?
Agentic AI shopping is like a personal assistant that you tell what you want and maybe why you need it, plus some features and a price range. The AI will go out and do the research and comparison and even make the purchase.
There’s no human performing a search in that scenario, so it’s not necessarily good for SEO unless you’re optimizing shopping sites for agentic AI shoppers.
Shopping Is A Part Of Human Biology
Scientists say that shopping is literally a part of our DNA. Our desire to hunt, to gather, and to flaunt our ability to be successful is a part of the evolutionary competition we participate in (whether we know it or not).
“Richard Dawkins outlines in The Selfish Gene (1976) that humans are machines made of genes, and genes are the grounding for everything people do.
…Therefore, everything that people do relates to thriving in their environment above competition, including the way people consume as a form of survival in their environment when simply purchasing the basic physiological needs of food, water and warmth. People also consume to thrive above others, for example in conspicuous consumption where a luxury car represents money and high social status…”
What that means is that whether we know it or not, our drive to shop is a part of evolutionary competition with each other. Part of it is to signal our status and attractiveness for reproduction. So when we go shopping for clothes or toilet paper, it’s part of our genetic programming to feel good about it.
Shopping And The Brain’s Chemical Cocktail
And when it comes to feeling good, some of that is triggered by chemicals like dopamine, endorphins, and serotonin firing off to reward you for finding a good deal.
Even scoring a deal on toilet paper can trigger reward signals in the brain.
Another Wikipedia page about the biology of our reward system explains:
“Reward is the attractive and motivational property of a stimulus that induces appetitive behavior, also known as approach behavior, and consummatory behavior. A rewarding stimulus has been described as “any stimulus, object, event, activity, or situation that has the potential to make us approach and consume it.”
A sale sign in a store can act as a reward cue because it signals a lower price or added value, which can drive someone to approach and buy it. The sign itself is just information, but when a person recognizes the discount or deal as beneficial, it can trigger motivation to act. That’s a deeply embedded behavior that we carry with us.
We are like machines that are programmed in our genes to shop.
So that raises the question: Why would anyone delegate that deeply rewarding activity to an AI agent? It’s like delegating the enjoyment of chocolate to a robot.
I suspect that most of you reading this know which supermarkets sell the best produce at the cheapest price, which ones have the yummiest bread, and which markets have the best spices. That’s our programming; it’s biological. It does not make sense to delegate the rewards inherent in discovery or acquisition to an AI shopping agent.
Serendipity And Shopping
Serendipity is when things happen by chance, unplanned, that nonetheless provide a happy outcome or benefit. One of the joys of shopping is stumbling onto something that’s a good deal or beautiful or has some other value. Employing an AI agent will cause humans to miss out on the serendipitous joy of discovering something they hadn’t been looking for that is not just desirable but also something they hadn’t known they needed.
For example, I purchased a birthday gift for my wife. I walked into a gift shop run by a charming new age hippie. We talked about music as I browsed the gifts for sale. I found something, two things, that I hadn’t planned on buying. The two things had a semantic connection to each other that I found to be poetic and therefore extra nice as a gift. The shop owner put the two items into two boxes, then placed the boxes in a lovely mesh gift bag with a ribbon.
That’s serendipity in action. It was a pleasurable moment I enjoyed. I walked out of the store into the sunshine with a fresh cocktail of dopamine, endorphins, and serotonin flooding my brain, and it was a delightful moment. I bought a gift that I was certain my wife would enjoy.
Agentic AI Shopping Is Unnatural
My question is, why does Silicon Valley think it can automate the many things that make us human?
It’s as if Silicon Valley is trying to turn us into teenagers by taking over the things adults normally do for themselves.
Now they want to take shopping away from us?
I think the only way agentic AI has a chance of working is if a sense of serendipity and discovery is built into the system. I’ve been a part of the technology scene for over 25 years; I lived in San Francisco, the world capital of the internet, and even worked for a time at a leading technology magazine.
So it’s not that I’m a Luddite about technology. AI integrated into a shopping site makes a lot of sense. It can make recommendations and answer questions. That’s great. There is still a human clicking around and discovering things for themselves in a way that satisfies our natural urge to shop and consume. That’s good for SEO because it means a store needs to be optimized for search.
AI agents doing the shopping for humans makes less sense because it’s unnatural; it goes against our biology.
Artificial intelligence led all employer-cited reasons for U.S. job cuts in March, accounting for 15,341 of the month’s 60,620 announced layoffs, according to outplacement firm Challenger, Gray & Christmas.
That’s 25% of all cuts for the month, up from roughly 10% in February.
Since Challenger began tracking AI as a reason in 2023, employers have now cited it in 99,470 layoff announcements, or 3.5% of all cuts during that period.
What The Numbers Show
Total U.S. job cuts rose 25% from February to March but are down 78% from March 2025, when a wave of federal layoffs pushed that month’s total to 275,240.
For the first quarter overall, employers announced 217,362 cuts. That’s the lowest Q1 total since 2022.
AI ranks fifth among all cited reasons year-to-date, behind market and economic conditions, restructuring, closings, and contract loss. But its share is growing. In all of 2025, AI accounted for 5% of cited cuts. Through Q1 2026, it’s at 13%.
These are employer-stated reasons, not independently verified causes. Companies may cite AI when cuts involve broader cost restructuring.
Technology Sector Hit Hardest
Technology companies announced 18,720 cuts in March alone, bringing the 2026 total to 52,050. That’s up 40% from the 37,097 tech cuts announced in the same period last year. It’s the highest year-to-date total for the sector since 2023.
Andy Challenger, the firm’s chief revenue officer, said the pattern goes beyond traditional cost-cutting.
“Companies are shifting budgets toward AI investments at the expense of jobs. The actual replacing of roles can be seen in Technology companies, where AI can replace coding functions. Other industries are testing the limits of this new technology, and while it can’t replace jobs completely, it is costing jobs.”
Dell accounted for a large portion of March’s tech cuts based on its latest annual filing, according to the report. Oracle reportedly began layoffs late last month but has not released a total. Meta is also cutting roles in its Reality Labs division as it redirects resources toward AI.
Other Industries
Transportation companies announced the second-most cuts year-to-date with 32,241, up 703% from the same period in 2025. It’s the highest Q1 total for the sector on record.
Healthcare announced 23,520 cuts in Q1, also a record for the sector.
The news industry, tracked as a subset of media, announced 639 cuts through Q1 2026, up 12% from 573 in the same period last year.
Why This Matters
The Challenger data puts company-level numbers behind what workforce projections have estimated.
SEJ recently covered the Tufts American AI Jobs Risk Index, which ranked computer programmers at 55% vulnerability and web developers at 46%.
Challenger’s report separately shows tech sector cuts at their highest since 2023 and AI as the top employer-cited reason for March layoffs overall. The two datasets measure different things, but they point in the same direction.
For people working in search, content, and digital marketing, Challenger’s data adds another reference point to track alongside academic projections and company earnings calls.
Looking Ahead
Challenger said he expects more tech layoffs in 2026 as companies continue redirecting budgets toward AI.
“One thing that is clear is that AI is changing work and the workforce. Workers will need to be more strategic as they lead AI-powered agents that handle increasingly complex tasks.”
Challenger, Gray & Christmas publishes updated cut data monthly.
AI search is dominating the strategy conversation right now, and everyone is hearing the same thing from clients and directors: “What’s our AI search plan?”
The instinct is to optimize everywhere (ChatGPT, Perplexity, Gemini) and move fast. But before you reallocate budget or rewrite your GEO roadmap, there’s a more useful question to ask first:
Which AI platforms are actually sending you high-intent traffic that converts, and can you prove it?
Join us for an upcoming expert panel webinar where we’ll dive into exactly that.
What You’ll Learn
In this webinar, Danielle Wood, Content & Creative Manager at CallRail, and Natalie Johnson, SEO & AI Visibility Expert & Founder of SweetGlow Marketing, will break down real conversion data by LLM and show how platform-level performance should shape your GEO strategy.
Specifically, you’ll walk away with:
Conversion data by LLM platform, so you know where high-intent traffic is actually coming from in each industry
A clear AI prioritization framework to stop spreading GEO effort equally and concentrate it where it converts
A reporting model that ties AI search activity to real business outcomes clients can see and trust
Why Attend?
You’ll finally be able to justify AI search investment; this session will give you the data and the framework to make that case and to implement the strongest, most successful AI search strategy possible.
Join us live to get your questions answered directly by the expert panel.