Complete Crawler List For AI User-Agents [Dec 2025] via @sejournal, @vahandev

AI visibility plays a crucial role for SEOs, and this starts with controlling AI crawlers. If AI crawlers can’t access your pages, you’re invisible to AI discovery engines.

On the flip side, unmonitored AI crawlers can overwhelm servers with excessive requests, causing crashes and unexpected hosting bills.

User-agent strings are essential for controlling which AI crawlers can access your website, but official documentation is often outdated, incomplete, or missing entirely. So, we curated a verified list of AI crawlers from our actual server logs as a useful reference.

Every user-agent is validated against official IP lists when available, ensuring accuracy. We will maintain and update this list to catch new crawlers and changes to existing ones.

The Complete Verified AI Crawler List (December 2025)

For each crawler below, we list its purpose, the crawl rate observed on SEJ (pages/hour), whether an official verified IP list is available, an example robots.txt block, and the complete user agent string.

GPTBot
Purpose: AI training data collection for GPT models (ChatGPT, GPT-4o)
Crawl rate on SEJ: 100 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: GPTBot
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.3; +https://openai.com/gptbot)

ChatGPT-User
Purpose: AI agent for real-time web browsing when users interact with ChatGPT
Crawl rate on SEJ: 2,400 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: ChatGPT-User
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot

OAI-SearchBot
Purpose: AI search indexing for ChatGPT search features (not for training)
Crawl rate on SEJ: 150 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: OAI-SearchBot
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36; compatible; OAI-SearchBot/1.3; +https://openai.com/searchbot

ClaudeBot
Purpose: AI training data collection for Claude models
Crawl rate on SEJ: 500 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: ClaudeBot
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)

Claude-User
Purpose: AI agent for real-time web access when Claude users browse
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Not available
Robots.txt example:
User-agent: Claude-User
Disallow: /sample-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Claude-User/1.0; +Claude-User@anthropic.com)

Claude-SearchBot
Purpose: AI search indexing for Claude search capabilities
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Not available
Robots.txt example:
User-agent: Claude-SearchBot
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Claude-SearchBot/1.0; +https://www.anthropic.com)

Google-CloudVertexBot
Purpose: AI agent for Vertex AI Agent Builder (crawls at site owners’ request only)
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: Google-CloudVertexBot
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/141.0.7390.122 Mobile Safari/537.36 (compatible; Google-CloudVertexBot; +https://cloud.google.com/enterprise-search)

Google-Extended
Purpose: Robots.txt token controlling AI training usage of Googlebot-crawled content (not a separate crawler)
Robots.txt example:
User-agent: Google-Extended
Allow: /
Disallow: /private-folder

Gemini-Deep-Research
Purpose: AI research agent for Google Gemini’s Deep Research feature
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: Gemini-Deep-Research
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Gemini-Deep-Research; +https://gemini.google/overview/deep-research/) Chrome/135.0.0.0 Safari/537.36

Google
Purpose: Fetcher used by Gemini’s chat when a user asks to open a webpage
Crawl rate on SEJ: <10 pages/hour
User agent string: Google

Bingbot
Purpose: Powers Bing Search and Bing Chat (Copilot) AI answers
Crawl rate on SEJ: 1,300 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: BingBot
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) Chrome/116.0.1938.76 Safari/537.36

Applebot-Extended
Purpose: Doesn’t crawl, but controls how Apple uses Applebot data
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: Applebot-Extended
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)

PerplexityBot
Purpose: AI search indexing for Perplexity’s answer engine
Crawl rate on SEJ: 150 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: PerplexityBot
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)

Perplexity-User
Purpose: AI agent for real-time browsing when Perplexity users request information
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: Perplexity-User
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user)

Meta-ExternalAgent
Purpose: AI training data collection for Meta’s LLMs (Llama, etc.)
Crawl rate on SEJ: 1,100 pages/hour
Verified IP list: Not available
Robots.txt example:
User-agent: meta-externalagent
Allow: /
Disallow: /private-folder
User agent string: meta-externalagent/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)

Meta-WebIndexer
Purpose: Used to improve Meta AI search
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Not available
Robots.txt example:
User-agent: Meta-WebIndexer
Allow: /
Disallow: /private-folder
User agent string: meta-webindexer/1.1 (+https://developers.facebook.com/docs/sharing/webmasters/crawler)

Bytespider
Purpose: AI training data for ByteDance’s LLMs for products like TikTok
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Not available
Robots.txt example:
User-agent: Bytespider
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Mobile Safari/537.36 (compatible; Bytespider; https://zhanzhang.toutiao.com/)

Amazonbot
Purpose: AI training for Alexa and other Amazon AI services
Crawl rate on SEJ: 1,050 pages/hour
Verified IP list: Not available
Robots.txt example:
User-agent: Amazonbot
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot) Chrome/119.0.6045.214 Safari/537.36

DuckAssistBot
Purpose: AI search indexing for DuckDuckGo search engine
Crawl rate on SEJ: 20 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: DuckAssistBot
Allow: /
Disallow: /private-folder
User agent string: DuckAssistBot/1.2; (+http://duckduckgo.com/duckassistbot.html)

MistralAI-User
Purpose: Mistral’s real-time citation fetcher for the “Le Chat” assistant
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Not available
Robots.txt example:
User-agent: MistralAI-User
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; MistralAI-User/1.0; +https://docs.mistral.ai/robots)

Webz.io
Purpose: Data extraction and web scraping used by other AI training companies (formerly known as Omgili)
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Not available
Robots.txt example:
User-agent: webzio
Allow: /
Disallow: /private-folder
User agent string: webzio (+https://webz.io/bot.html)

Diffbot
Purpose: Data extraction and web scraping used by companies all over the world
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Not available
Robots.txt example:
User-agent: Diffbot
Allow: /
Disallow: /private-folder
User agent string: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.2) Gecko/20090729 Firefox/3.5.2 (.NET CLR 3.5.30729; Diffbot/0.1; +http://www.diffbot.com)

ICC-Crawler
Purpose: AI and machine learning data collection
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Not available
Robots.txt example:
User-agent: ICC-Crawler
Allow: /
Disallow: /private-folder
User agent string: ICC-Crawler/3.0 (Mozilla-compatible; ; https://ucri.nict.go.jp/en/icccrawler.html)

CCBot
Purpose: Open-source web archive (Common Crawl) used as training data by multiple AI companies
Crawl rate on SEJ: <10 pages/hour
Verified IP list: Official IP list available
Robots.txt example:
User-agent: CCBot
Allow: /
Disallow: /private-folder
User agent string: CCBot/2.0 (https://commoncrawl.org/faq/)

The user-agent strings above have all been verified against Search Engine Journal server logs.
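If you decide to block some of these crawlers, the per-crawler rules above can be combined into a single robots.txt. The example below is only a sketch of one possible policy (blocking training crawlers while leaving search-oriented bots alone); which bots you block depends entirely on your own visibility strategy.

```
# Example policy: disallow AI training crawlers, allow everything else.
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: Bytespider
Disallow: /

User-agent: meta-externalagent
Disallow: /

User-agent: CCBot
Disallow: /

# All other crawlers
User-agent: *
Allow: /
```

Note that robots.txt is advisory: well-behaved crawlers honor it, but it does not technically prevent access, which is why IP-level verification (covered later in this article) still matters.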

Popular AI Agent Crawlers With Unidentifiable User Agent

We’ve found that the following didn’t identify themselves:

  • you.com.
  • ChatGPT’s agent Operator.
  • Bing’s Copilot chat.
  • Grok.
  • DeepSeek.

There is no way to stop these crawlers from accessing your webpages other than by identifying the explicit IP addresses they use.

We set up a trap page (e.g., /specific-page-for-you-com/) and used the on-page chat to prompt you.com to visit it, allowing us to locate the corresponding visit record and IP address in our server logs. Below is the screenshot:

Screenshot by author, December 2025

What About Agentic AI Browsers?

Unfortunately, AI browsers such as Comet or ChatGPT’s Atlas don’t differentiate themselves in the user agent string, so you can’t identify them in server logs; their visits blend in with those of normal users.

ChatGPT’s Atlas browser user agent string from server logs records (Screenshot by author, December 2025)

This is disappointing for SEOs because tracking agentic browser visits to a website is important from a reporting point of view.

How To Check What’s Crawling Your Server

Depending on your hosting service, your provider may offer a user interface (UI) that makes it easy to access and review server logs.

If your hosting doesn’t offer this, you can download the server log files (usually located at /var/log/apache2/access.log on Linux-based servers) via FTP, or ask your server support team to send them to you.

Once you have the log file, you can view and analyze it in Google Sheets (if the file is in CSV format) or Screaming Frog’s Log File Analyser, or, if the file is less than 100 MB, you can try analyzing it with Gemini.
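If you prefer to script the analysis yourself, a short Python sketch can tally AI crawler hits straight from an access log. This is an illustrative helper, not an official tool; the token list is drawn from the user agents covered above, and the function name `count_ai_crawler_hits` is our own.

```python
from collections import Counter

# Substrings that identify the AI crawlers covered in this article.
AI_BOT_TOKENS = [
    "GPTBot", "ChatGPT-User", "OAI-SearchBot", "ClaudeBot",
    "PerplexityBot", "meta-externalagent", "Bytespider",
    "Amazonbot", "DuckAssistBot", "CCBot",
]

def count_ai_crawler_hits(log_lines):
    """Count hits per AI crawler in Apache/Nginx combined-format log lines."""
    counts = Counter()
    for line in log_lines:
        lowered = line.lower()
        for token in AI_BOT_TOKENS:
            if token.lower() in lowered:
                counts[token] += 1
                break  # attribute each request line to one crawler
    return counts
```

You could run it with something like `count_ai_crawler_hits(open("/var/log/apache2/access.log"))` and sort the resulting Counter to see which bots hit your site hardest.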

How To Verify Legitimate Vs. Fake Bots

Fake crawlers can spoof legitimate user agents to bypass restrictions and scrape content aggressively. For example, anyone can impersonate ClaudeBot from their laptop by initiating a crawl request from the terminal. In your server logs, it will appear as if ClaudeBot is crawling your site:

curl -A 'Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)' https://example.com

Verification helps save server bandwidth and prevents your content from being harvested illegitimately. The most reliable verification method is checking the request IP.

Check each request’s IP against the officially declared IP lists linked above. If it matches, allow the request; otherwise, block it.
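That check can be sketched in a few lines of Python using the standard library’s ipaddress module. The CIDR ranges below are documentation-only placeholders, not the vendors’ real ranges; in practice you would load the current blocks from each official IP list.

```python
import ipaddress

# Placeholder ranges for illustration only -- always load the current
# CIDR blocks from each vendor's official published IP list.
VERIFIED_RANGES = {
    "GPTBot": ["192.0.2.0/24"],        # example range, not OpenAI's real list
    "ClaudeBot": ["198.51.100.0/24"],  # example range, not Anthropic's real list
}

def is_verified_bot(claimed_bot, request_ip):
    """Return True only if the request IP falls inside an official range."""
    ip = ipaddress.ip_address(request_ip)
    return any(
        ip in ipaddress.ip_network(cidr)
        for cidr in VERIFIED_RANGES.get(claimed_bot, [])
    )
```

A request claiming to be GPTBot from an IP outside the published ranges would return False here, and that is exactly the kind of request you want your firewall to drop.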

Various firewalls can handle this by allowlisting verified IPs (letting legitimate bot requests pass through) and blocking all other requests that impersonate AI crawlers in their user agent strings.

For example, in WordPress, you can use the free Wordfence plugin to allowlist legitimate IPs from the official lists (as above) and add custom blocking rules as below:

The allowlist approach is more robust: it lets legitimate crawlers pass through and blocks any impersonating request that comes from a different IP.

However, note that IP addresses can also be spoofed; when both the user agent and the IP are spoofed, you won’t be able to block the request.

Conclusion: Stay In Control Of AI Crawlers For Reliable AI Visibility

AI crawlers are now part of our web ecosystem, and the bots listed here represent the major AI platforms currently indexing the web, although this list is likely to grow.

Check your server logs regularly to see what’s actually hitting your site, and make sure you don’t inadvertently block AI crawlers if visibility in AI search engines is important for your business. If you don’t want AI crawlers to access your content, block them via robots.txt using the user-agent name.

We’ll keep this list updated as new crawlers emerge and existing ones change, so bookmark this URL or revisit this article regularly to keep your AI crawler list up to date.

More Resources:


Featured Image: BestForBest/Shutterstock

SEO Pulse: Google Updates Console, Maps & AI Mode Flow via @sejournal, @MattGSouthern

Google packed a lot into this week, with Search Console picking up AI-powered configuration, Maps loosening its real-name rule for reviews, and a new test nudging more people from AI Overviews into AI Mode.

Here’s what that means for you.

Google Search Console Tests AI-Powered Report Configuration

Google introduced an experimental AI feature in Search Console that lets you describe the report you want and have the tool build it for you.

The feature, announced in a Google blog post, lives inside the Search results Performance report. You can type something like “compare clicks from UK versus France,” and the system will set filters, comparisons, and metrics to match what it thinks you mean.

For now, the feature is limited to Search results data, while Discover, News, and video reports still work the way they always have. Google says it’s starting with “a limited set of websites” and will expand access based on feedback.

The update is about configuration, not new metrics. It can help you set up a table, but it will not change how you sort or export data, and it does not add separate reporting for AI Overviews or AI Mode.

Why SEOs Should Pay Attention

If you spend a lot of time rebuilding the same types of reports, this can save you some setup time. It’s easier to describe a comparison in one sentence than to remember which checkboxes and filters you used last month.

The tradeoff is that you still need to confirm what the AI actually did. When a view comes from a written request instead of a manual series of clicks, it’s easy for a small misinterpretation to slip through and show up in a deck or a client email.

This is not a replacement for understanding how your reports are put together. It also does nothing to answer a bigger question for SEO professionals about how much traffic is coming from Google’s AI surfaces.

What SEO Professionals Are Saying

On LinkedIn, independent SEO consultant Brodie Clark summed up the launch with:

“Whoa, Google Search Console just rolled out another gem: a new AI-powered configuration to analyse your search traffic. The new feature is designed to reduce the effort it takes for you to select, filter, and compare your data.”

He then walks through how it can apply filters, set comparisons, and pick metrics for common tasks.

Under the official Search Central post, one commenter joked about the gap between configuration and data:

“GSC: ‘Describe the dataview you want to see’ Me: ‘Show me how much traffic I receive from AI overviews and AI mode’ :’)”

The overall mood is that this is a genuine quality-of-life improvement, but many SEO professionals would still rather get first-class reporting for AI Overviews and AI Mode than another way to slice existing Search results data.

Read our full coverage: Google Adds AI-Powered Configuration To Search Console

Google Maps Reviews No Longer Require Real Names

Google Maps now lets people leave reviews under a custom display name and profile picture instead of their real Google Account name. The change rolled out globally and is documented in recent Google Maps updates.

You set this up in the Contributions section of your profile. Once you choose a display name and avatar, that identity appears on new reviews and can be applied to older ones if you edit them, while Google still ties everything back to a real account with a full activity history.

The change is more than cosmetic because review identity shapes how people interpret trust and intent when they scan a local business profile.

Why SEOs Should Pay Attention

Reviews remain one of the strongest local ranking signals, based on Whitespark’s Local Search Ranking Factors survey. When names turn into nicknames, it shifts how business owners and customers read that feedback.

For local businesses, it becomes harder to recognize reviewers at a glance, review audits feel more manual because names are less useful, and owners may feel they have less visibility into who is talking about them, even though Google still sees the underlying accounts.

If you manage local clients, you will likely spend time explaining that this doesn’t make reviews truly anonymous, and that review solicitation and response strategies still matter.

What Local SEO Professionals Are Saying

In a LinkedIn post, Darren Shaw, founder of Whitespark, tried to calm some of the panic:

“Hot take: Everyone is freaking out that anonymous Google reviews will cause a surge in fake review spam, but I don’t think so.”

He points out that anyone determined to leave fake reviews can already create throwaway accounts, and that:

“Anonymous display names ≠ anonymous accounts”

Google still sees device data, behavior patterns, and full contribution history. In his view, the bigger story is that this change lowers the barrier for honest feedback in “embarrassed consumer” categories like criminal defense, rehab, and therapy, where people do not want their real names in search results.

The comments add useful nuance. Curtis Boyd expects “an increase in both 5 star reviews for ‘embarrassed consumer industries’ and correspondingly – 1 star reviews, across all industries as google makes it easier to hide identity.”

Taken together, the thread suggests you should watch for changes in review volume and rating mix, especially in sensitive verticals, without assuming this update alone will cause a sudden spike in spam.

Read our full coverage: Google Maps Lets Users Post Reviews Using Nicknames

Google Tests Seamless AI Overviews To AI Mode Transition

Google is testing a new mobile flow that sends people straight from AI Overviews into AI Mode when they tap “Show more,” based on a post from Robby Stein, VP of Product for Google Search.

In the examples Google has shown, you see an AI Overview at the top of the results page. When you expand it, an “Ask anything” bar appears at the bottom, and typing into that bar opens AI Mode with your original query pulled into a chat thread.

The test is limited to mobile and to countries where AI Mode is already available, and Google hasn’t said how long it will run or when it might roll out more broadly.

Why SEOs Should Pay Attention

This test blurs the line between AI Overviews as a SERP feature and AI Mode as a separate product. If it sticks, someone who sees your content cited in an Overview has a clear path to keep asking follow-up questions inside AI Mode instead of scrolling down to organic results.

On mobile, where this is running first, the effect is stronger because screen space is tight. A prominent “Ask anything” bar at the bottom of the screen gives people an obvious option that doesn’t involve hunting for blue links underneath ads, shopping units, and other features.

If your pages show up in AI Overviews today, it’s worth watching mobile traffic and AI-related impressions so you have before-and-after data if this behavior expands.

What SEO Professionals Are Saying

In a widely shared LinkedIn post, Lily Ray, VP of SEO Strategy & Research at Amsive, wrote:

“Google announced today that they’ll be testing a new way for users to click directly into AI Mode via AI Overviews.”

She notes that many people will likely expect “Show more” to lead back to traditional results, not into a chat interface, and ties the test to the broader state of the results page, arguing that ads and new sponsored treatments are making it harder to find organic listings.

Ray’s most pointed line is:

“Compared to the current chaotic state of Google’s search results, AI Mode feels frictionless.”

Her view is that Google is making traditional search more cluttered while giving AI Mode a cleaner, easier experience.

Other SEO professionals in the comments give concrete examples. One notes that “the well hidden sponsored ads have gotten completely out of control lately,” describing a number one organic result that sits below “5–6 sponsored ads.” Another says they have “been working with SEO since 2007” and only recently had to pause before clicking on a result because they were not sure whether it was organic or an ad.

There’s also frustration with AI Mode’s limits. One commenter describes how the context window “just suddenly refreshes and forgets everything after about 10 prompts/turns,” which makes longer research sessions difficult even as the entry point gets smoother.

Overall, the thread reads as a warning that AI Mode may feel cleaner but also keeps people on Google, and that this test is one more step in nudging searchers toward that experience.

Read our full coverage: Google Connects AI Overviews To AI Mode On Mobile

Theme Of The Week: Google Tightens Its Grip On The Journey

All three updates are pulling in the same direction: More of the search journey happens inside Google’s own interfaces.

Search Console’s AI configuration keeps you in the Performance report longer by taking some of the work out of report setup. Maps nicknames make it easier for people to speak freely, but on a platform where Google defines how identity is presented. The AI Overviews to AI Mode test turns follow-up questions into a chat that runs on Google’s terms rather than yours.

There are real usability wins in all of this, but also fewer clear moments where a searcher is nudged off Google and onto your site.

If you want to dig deeper into this week’s stories, you can read:

And for broader context:


Featured Image: Pixel-Shot/Shutterstock

The New Structure Of AI Era SEO via @sejournal, @DuaneForrester

People keep asking me what it takes to show up in AI answers. They ask in conference hallways, in LinkedIn messages, on calls, and during workshops. The questions always sound different, but the intent is the same. People want to know how much of their existing SEO work still applies. They want to know what they need to learn next and how to avoid falling behind. Mostly, they want clarity (hence my new book!). The ground beneath this industry feels like it moved overnight, and everyone is trying to figure out if the skills they built over the last twenty years still matter.

They do. But not in the same proportions they used to. And not for the same reasons.

When I explain how GenAI systems choose content, I see the same reaction every time. First, relief that the fundamentals still matter. Then a flicker of concern when they realize how much of the work they treated as optional is now mandatory. And finally, a mix of curiosity and discomfort when they hear about the new layer of work that simply did not exist even five years ago. That last moment is where the fear of missing out turns into motivation. The learning curve is not as steep as people imagine. The only real risk is assuming future visibility will follow yesterday’s rules.

That is why this three-layer model helps. It gives structure to a messy change. It shows what carries over, what needs more focus, and what is entirely new. And it lets you make smart choices about where to spend your time next. As always, feel free to disagree with me, or support my ideas. I’m OK with either. I’m simply trying to share what I understand, and if others believe things to be different, that’s entirely OK.

This first set contains the work every experienced SEO already knows. None of it is new. What has changed is the cost of getting it wrong. LLM systems depend heavily on clear access, clear language, and stable topical relevance. If you already focus on this work, you are in a good starting position.

You already write to match user intent. That skill transfers directly into the GenAI world. The difference is that LLMs evaluate meaning, not keywords. They ask whether a chunk of content answers the user’s intent with clarity. They no longer care about keyword coverage or clever phrasing. If your content solves the problem the user brings to the model, the system trusts it. If it drifts off topic or mixes multiple ideas in the same chunk/block, it gets bypassed.

Featured snippets prepared the industry for this. You learned to lead with the answer and support it with context. LLMs treat the opening sentences of a chunk as a kind of confidence score. If the model can see the answer in the first two or three sentences, it is far more likely to use that block. If the answer is buried under a soft introduction, you lose visibility. This is not stylistic preference. It is about risk. The model wants to minimize uncertainty. Direct answers lower that uncertainty.

This is another long-standing skill that becomes more important. If the crawler cannot fetch your content cleanly, the LLM cannot rely on it. You can write brilliant content and structure it perfectly, and none of it matters if the system cannot get to it. Clean HTML, sensible page structure, reachable URLs, and a clear robots.txt file are still foundational. Now they also affect the quality of your vector index and how often your content appears in AI answers.

Updating fast-moving topics matters more today. When a model collects information, it wants the most stable and reliable view of the topic. If your content is accurate but stale, the system will often prefer a fresher chunk from a competitor. This becomes critical in categories like regulations, pricing, health, finance, and emerging technology. When the topic moves, your updates need to move with it.

This has always been at the heart of SEO. Now it becomes even more important. LLMs look for patterns of expertise. They prefer sources that have shown depth across a subject instead of one-off coverage. When the model attempts to solve a problem, it selects blocks from sources that consistently appear authoritative on that topic. This is why thin content strategies collapse in the GenAI world. You need depth, not coverage for the sake of coverage.

This second group contains tasks that existed in old SEO but were rarely done with discipline. Teams touched them lightly but did not treat them as critical. In the GenAI era, these now carry real weight. They do more than polish content. They directly affect chunk retrieval, embedding quality, and citation rates.

Scanning used to matter because people skim pages. Now chunk boundaries matter because models retrieve blocks, not pages. The ideal block is a tight 100 to 300 words that covers one idea with no drift. If you pack multiple ideas into one block, retrieval suffers. If you create long, meandering paragraphs, the embedding loses focus. The best performing chunks are compact, structured, and clear.
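To make the 100-to-300-word target concrete, a content audit could flag blocks that fall outside that range before they ever reach an embedding pipeline. This is an illustrative sketch under the article’s own rule of thumb; the function name and thresholds are ours, and real retrieval systems chunk by tokens rather than words.

```python
def flag_chunk_sizes(paragraphs, min_words=100, max_words=300):
    """Flag paragraphs whose word count falls outside the target block size."""
    report = []
    for i, para in enumerate(paragraphs):
        n = len(para.split())
        if n < min_words:
            report.append((i, n, "too short: consider merging with a neighbor"))
        elif n > max_words:
            report.append((i, n, "too long: likely covers more than one idea"))
    return report
```

Running this over a drafted page gives a quick list of blocks to merge or split before publishing.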

This used to be a style preference. You choose how to name your product or brand and try to stay consistent. In the GenAI era, entity clarity becomes a technical factor. Embedding models create numeric patterns based on how your entities appear in context. If your naming drifts, the embeddings drift. That reduces retrieval accuracy and lowers your chances of being used by the model. A stable naming pattern makes your content easier to match.
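One practical way to catch naming drift is to count how often each variant of an entity name appears in your content. The helper below is a simple sketch of that audit; the function name and the example variants are ours.

```python
import re
from collections import Counter

def entity_variant_counts(text, variants):
    """Count occurrences of each naming variant of the same entity."""
    counts = Counter()
    for v in variants:
        counts[v] = len(re.findall(re.escape(v), text, flags=re.IGNORECASE))
    return counts
```

If one variant dominates, standardizing the stragglers on that form keeps the entity’s embedding context consistent.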

Teams used to sprinkle stats into content to seem authoritative. That is not enough anymore. LLMs need safe, specific facts they can quote without risk. They look for numbers, steps, definitions, and crisp explanations. When your content contains stable facts that are easy to lift, your chances of being cited go up. When your content is vague or opinion-heavy, you become less usable.

Links still matter, but the source of the mention matters more. LLMs weigh training data heavily. If your brand appears in places known for strong standards, the model builds trust around your entity. If you appear mainly on weak domains, that trust does not form. This is not classic link equity. This is reputation equity inside a model’s training memory.

Clear writing always helped search engines understand intent. In the GenAI era, it helps the model align your content with a user’s question. Clever marketing language makes embeddings less accurate. Simple, precise language improves retrieval consistency. Your goal is not to entertain the model. Your goal is to be unambiguous.

This final group contains work the industry never had to think about before. These tasks did not exist at scale. They are now some of the largest contributors to visibility. Most teams are not doing this work yet. This is the real gap between brands that appear in AI answers and brands that disappear.

The LLM does not rank pages. It ranks chunks. Every chunk competes with every other chunk on the same topic. If your chunk boundaries are weak or your block covers too many ideas, you lose. If the block is tight, relevant, and structured, your chances of being selected rise. This is the foundation of GenAI visibility. Retrieval determines everything that follows.

Your content eventually becomes vectors. Structure, clarity, and consistency shape how those vectors look. Clean paragraphs create clean embeddings. Mixed concepts create noisy embeddings. When your embeddings are noisy, they lose queries by a small margin and never appear. When your embeddings are clean, they align more often and rise in retrieval. This is invisible work, but it defines success in the GenAI world.
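The clean-versus-noisy point can be illustrated with a toy model. Real systems use dense neural embeddings, but even with crude bag-of-words vectors and cosine similarity, a focused chunk scores higher against a query than a chunk that mixes unrelated ideas, and that margin is what decides retrieval. Everything here (the `embed` and `cosine` helpers, the sample texts) is our own illustration.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding'; production systems use dense neural vectors."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

A chunk that stays on one topic shares proportionally more of its vector with the query, so it wins the comparison even when the mixed chunk technically mentions the same terms.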

Simple formatting choices change what the model trusts. Headings, labels, definitions, steps, and examples act as retrieval cues. They help the system map your content to a user’s need. They also reduce risk, because predictable structure is easier to understand. When you supply clean signals, the model uses your content more often.

LLMs evaluate trust differently than Google or Bing. They look for author information, credentials, certifications, citations, provenance, and stable sourcing. They prefer content that reduces liability. If you give the model clear trust markers, it can use your content with confidence. If trust is weak or absent, your content becomes background noise.

Models need structure to interpret relationships between ideas. Numbered steps, definitions, transitions, and section boundaries improve retrieval and lower confusion. When your content follows predictable patterns, the system can use it more safely. This is especially important in advisory content, technical content, and any topic with legal or financial risk.

The shift to GenAI is not a reset. It is a reshaping. People are still searching for help, ideas, products, answers, and reassurance. They are just doing it through systems that evaluate content differently. You can stay visible in that world, but only if you stop expecting yesterday’s playbook to produce the same results. When you understand how retrieval works, how chunks are handled, and how meaning gets modeled, the fog lifts. The work becomes clear again.

Most teams are not there yet. They are still optimizing pages while AI systems are evaluating chunks. They are still thinking in keywords while models compare meaning. They are still polishing copy while the model scans for trust signals and structured clarity. When you understand all three layers, you stop guessing at what matters. You start shaping content the way the system actually reads it.

This is not busywork. It is strategic groundwork for the next decade of discovery. The brands that adapt early will gain an advantage that compounds over time. AI does not reward the loudest voice. It rewards the clearest one. If you build for that future now, your content will keep showing up in the places your customers look next.


My new book, “The Machine Layer: How to Stay Visible and Trusted in the Age of AI Search,” is now on sale at Amazon.com. It’s the guide I wish existed when I started noticing that the old playbook (rankings, traffic, click-through rates) was quietly becoming less predictive of actual business outcomes. The shift isn’t abstract. When AI systems decide which content gets retrieved, cited, and trusted, they’re also deciding which expertise stays visible and which fades into irrelevance. The book covers the technical architecture driving these decisions (tokenization, chunking, vector embeddings, retrieval-augmented generation) and translates it into frameworks you can actually use. It’s built for practitioners whose roles are evolving, executives trying to make sense of changing metrics, and anyone who’s felt that uncomfortable gap opening between what used to work and what works now.

The Machine Layer
Image Credit: Duane Forrester

More Resources:


This post was originally published on Duane Forrester Decodes.


Featured Image: Master1305/Shutterstock

How CMOs Should Prioritize SEO Budgets In 2026 Q1 And H1 via @sejournal, @TaylorDanRW

Search evolved quickly throughout 2025 as AI systems became a primary route for information discovery, which, in turn, reduced the consistency and predictability of traditional organic traffic for many brands.

As blue‑link visibility tightened and click‑through rates became more erratic, CMOs found themselves under growing pressure to justify marketing spend while still demonstrating momentum. This shift required marketing leaders to think more seriously about resilience across their owned channels. It is no longer viable to rely solely on rankings.

Brands need stable visibility across AI surfaces, stronger and more coherent content operations, and cleaner technical foundations that support both users and AI systems.

Q1 and H1 2026 are the periods in which these priorities need to be funded and executed.

Principles For 2026 SEO Budgeting In Q1/H1

A well‑structured SEO budget for early 2026 is built on a clear set of principles that guide both stability and experimentation.

Protect A Baseline Allocation For Core SEO

This includes technical health, site performance, information architecture, and the ongoing maintenance of content. These activities underpin every marketing channel, and cutting them introduces unnecessary risk at a time when discovery patterns are shifting.

Create A Separate Experimental Pot For AI Discovery

As AI Overviews and other generative engines influence how users encounter brands, it becomes important to ring‑fence investment for testing answer‑led content, entity development, evolving schema patterns, and AI measurement frameworks. Without a dedicated pot, these activities either stall or compete with essential work.

Invest In Measurement That Explains Real User Behavior

Because AI visibility remains immature and uneven, analytics must capture how users move through journeys, where AI systems mention the brand, and which content shapes those outcomes.

This level of insight strengthens the CMO’s ability to defend and adjust budgets later in the year.

Where To Put Money In Q1

Q1 is the moment to stabilize the foundation while preparing for new patterns in discovery. The work done here shapes the results achieved in H1.

Technical Foundations

Begin with site health. Improve performance, resolve crawl barriers, modernize internal linking, and strengthen information architecture. AI systems and LLMs rely heavily on clean and consistent signals, so a strong technical environment supports every subsequent content, GEO, and measurement initiative.

Entity‑Rich, Question‑Led Content

Users are now expressing broader and more layered questions, and AI engines reward content that defines concepts clearly, addresses common questions in detail, and builds meaningful topical depth. Invest in structured content programmes aligned to real customer problems and journeys, placing emphasis on clarity, usefulness, and authority rather than chasing volume for its own sake.

Early GEO Experimentation

There is considerable overlap between SEO and LLM inclusion because both rely on strong technical foundations, consistent entity signals, and helpful content that is easy for systems to interpret. LLM discovery should be seen as an extension of SEO rather than a standalone discipline, since most of the work that strengthens SEO also strengthens LLM inclusion by improving clarity, coherence, and relevance.

Certain sectors are beginning to experience new nuances. One example is Agentic Commerce Protocol (ACP), which is influencing how AI systems understand products, evaluate them, and, in some cases, transact with them.

Whether we refer to this area as GEO, AEO, or LLMO, the principle is the same – brands are now optimising for multiple platforms and an expanding set of discovery engines, each with its own interpretation of signals.

Q1 is the right time to assess how your brand appears across these systems. Review answer hubs, evaluate your entity relationships, and examine how structured signals are interpreted. This initial experimentation will inform where budget should be expanded in H1.

H1 View: Scaling What Works

H1 is when early insights from Q1 begin to mature into scalable programmes.

Rolling Winning Experiments Into BAU

When early LLM discovery or structured content initiatives show clear signs of traction, they should be incorporated into business‑as‑usual SEO. Formalizing these practices allows them to grow consistently without requiring new budget conversations every quarter.

Cutting Low‑ROI Tools And Reinvesting In People And Process

Many organizations overspend on tools that fail to deliver meaningful value.

H1 provides the opportunity to review tool usage, identify duplication, and retire underused platforms. Redirecting that spend towards people, content quality, and operational improvements generally produces far stronger outcomes. The AI race that pretty much all tool providers have entered will begin to die down, and those that drive clear value will begin to emerge from the noise.

Adjusting Budget Mix As Data Emerges

By the latter part of H1, the business should have clearer evidence of where visibility is shifting and which activities genuinely influence discovery and engagement. Budgets should then be adjusted to support what is working, maintain core SEO activity, expand successful content areas, and reduce investment in experiments that have not produced results.

CMO Questions Before Sign‑Off

As CMOs review their SEO budgets for 2026, the final stage of sign‑off should be shaped by a balanced view of both offensive and defensive tactics, ensuring the organization invests in movement as well as momentum.

Defensive tactics protect what the brand has already earned: stability in rankings, continuity of technical performance, dependable content structures, and the preservation of existing visibility across both search and AI‑driven experiences.

Offensive tactics, on the other hand, are designed to create new points of visibility, unlock new categories of demand, and strengthen the brand’s presence across emerging discovery engines.

A balanced budget needs to fund both, because without defence the brand becomes fragile, and without offence it becomes invisible.

Movement refers to the activities that help the brand adapt to evolving discovery environments. These include early LLM discovery experiments, entity expansion, and the modernization of content formats.

Momentum represents the compounding effect of sustained investment in core SEO and consistent optimization across key journeys.

CMOs should judge budgets by their ability to generate both: movement that positions the brand for the future, and momentum that sustains growth.

With that in mind, CMOs may wish to ask the following questions before approving any budget:

  • To what extent does this budget balance defensive activity, such as technical stability and content maintenance, with offensive initiatives that expand future visibility?
  • How clearly does the plan demonstrate where movement will come from in early 2026, and how momentum will be protected and strengthened throughout H1?
  • Which elements of the programme directly enhance the brand’s presence across AI surfaces, GEO, and other emerging discovery engines?
  • How effectively does the proposed content strategy support both immediate user needs and longer‑term category growth?
  • How will we track changes in brand visibility across multiple platforms, including traditional search, AI‑driven answers, and sector‑specific discovery systems?
  • What roles do teams, processes, and first‑party data play in sustaining movement and momentum, and are they funded appropriately?
  • What reporting improvements will allow the leadership team to judge the success of both defensive and offensive investments by the end of H1?


Featured Image: N Universe/Shutterstock

5 Reasons To Use The Internet Archive’s New WordPress Plugin via @sejournal, @martinibuster

The Internet Archive, also known as the Wayback Machine, is generally regarded as a place to view old web pages, but its value goes far beyond that. There are five ways that Archive.org can help a website improve its user experience and SEO, and the Wayback Machine’s new WordPress plugin makes it easy to benefit from the Internet Archive automatically.

1. Copyright, DMCA, And Business Disputes

The Internet Archive can serve as an independent timestamped record to prove ownership of content or to defend against false claims that someone else wrote the content first. The Internet Archive is an independent non-profit organization and there is no way to fake an entry, which makes it an excellent way to prove who was first to publish disputed content.

2. The Worst Case Scenario Backup

Losing an entire website’s content to hardware failure, ransomware, a vulnerability, or even a datacenter fire is always within the realm of possibility. While it’s a best practice to keep an up-to-date backup stored off the server, unforeseen mistakes can happen.

The Internet Archive does not offer a convenient way to download website content, but there are services that facilitate it. It used to be a popular spammer technique to use these services to download the previous content of expired domains and bring it back to the web. Although I’ve not used any of these services and therefore can’t vouch for them, you’ll be able to find them if you search around.

3. Fix Broken Links

Sometimes a URL gets lost in a website redesign, or it was purposely removed, and you find out later that the page is popular and people are linking to it. What do you do?

Something like this happened to me in the past when I changed domains and decided I didn’t need certain pages. A few years later, I discovered that people were still linking to those pages because they were still useful. The Internet Archive made it easy to reproduce the old content on the new domain. It’s one way to recover the PageRank that would otherwise have been lost.

Having old pages archived makes it possible to revive them on the current website. But you can’t do this unless the page was archived in the first place, and the new plugin makes sure that happens for every web page.

4. Can Indicate Trustworthiness

This isn’t about search algorithms or LLMs. This is about trust with other sites and site visitors. Spammy sites tend to not be around very long. A documented history on Archive.org can be a form of proof that a site has been around for a long time. A legitimate business can point to X years of archived pages to prove that they are an established business.

5. Identify Link Rot

The Internet Archive Wayback Machine Link Fixer plugin provides an easy way to archive your web pages at Archive.org. When you publish a new page or update an older page the Wayback Machine WordPress plugin will automatically create a new archive page.

But one of the useful features of the plugin is that it automatically scans all outbound links and tests them to see if the linked pages still exist. The plugin can automatically update the link to a saved page at the Internet Archive.

The official plugin lists these features and benefits:

  • “Automatically scans for outbound links in post content
  • Checks the Wayback Machine for existing archives
  • Creates new snapshots if no archive exists
  • Redirects broken or missing links to archived versions
  • Archives your own posts on updates
  • Works on both new and existing content
  • Helps maintain long-term content reliability and SEO”
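The link check described above can be approximated with the Wayback Machine’s public Availability API (https://archive.org/wayback/available). The sketch below is illustrative only: the helper names are mine, not the plugin’s internals.

```python
# Rough sketch of a Wayback Machine link check, using the public
# Availability API. Helper names are my own, not the plugin's code.
import json
import urllib.parse
import urllib.request
from typing import Optional

API = "https://archive.org/wayback/available"

def availability_url(page_url: str) -> str:
    """Build the Availability API query URL for a given page."""
    return API + "?" + urllib.parse.urlencode({"url": page_url})

def latest_snapshot(payload: dict) -> Optional[str]:
    """Pull the closest archived snapshot URL out of an API response, if any."""
    snap = payload.get("archived_snapshots", {}).get("closest")
    if snap and snap.get("available"):
        return snap.get("url")
    return None

def check_link(page_url: str) -> Optional[str]:
    """Live check (requires network access): returns an archive URL or None."""
    with urllib.request.urlopen(availability_url(page_url), timeout=10) as resp:
        return latest_snapshot(json.load(resp))
```

The API responds with JSON of the shape `{"archived_snapshots": {"closest": {"available": true, "url": "..."}}}`; `latest_snapshot()` simply unwraps that, so a broken outbound link can be swapped for the archived copy when one exists.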

I don’t know what they mean by maintaining SEO, but one benefit they don’t mention is that it keeps users happy, and that’s always a plus.

Wayback Machine Is Useful For Competitor Analysis

The Internet Archive makes it so easy to see how a competitor has changed over the years. It’s also a way to catch competitors who are copying or taking “inspiration” from your content when they do their annual content refresh.

The Wayback Machine can let you see what services or products a competitor offered and how they were offered. It can also give a peek into what changed during a redesign, which reveals something about their competitive priorities.

Takeaways

  • The Internet Archive provides practical benefits for website owners beyond simply viewing old pages.
  • Archived snapshots help address business disputes, lost content, broken links, and long-term site credibility.
  • Competitor history and past site versions become easy to evaluate through Archive.org.
  • The Wayback Machine WordPress plugin automates archiving and helps manage link rot.
  • Using the Archive proactively can improve user experience and support SEO-adjacent needs, even if indirectly.

The five reasons covered in this article show that the Internet Archive is useful for SEO, competitor research, improving the user experience, and maintaining trust. The Internet Archive’s new WordPress plugin makes archiving and link-checking easy because it’s completely automatic. Taken together, these strengths make the Archive a useful part of keeping a website reliable, recoverable, and easier for people to use.

The Internet Archive Wayback Machine Link Fixer is a project created by Automattic and the Internet Archive, which means it’s a high-quality, trusted plugin for WordPress.

Download The Internet Archive WordPress Plugin

Check it out at the official WordPress plugin repository: Internet Archive Wayback Machine Link Fixer By Internet Archive

Featured Image by Shutterstock/Red rose 99

Google Year In Search 2025: Gemini, DeepSeek Top Trending Lists via @sejournal, @MattGSouthern

Google released its Year in Search data, revealing the queries that saw the largest spikes in search interest.

AI tools featured prominently in the global list, with Gemini ranking as the top trending search worldwide and DeepSeek also appearing in the top 10.

The annual report tracks searches with the highest sustained traffic spikes in 2025 compared to 2024, rather than total search volume.

AI Tools Lead Global Trending Searches

Gemini topped the global trending searches list, reflecting the growth of Google’s AI assistant throughout 2025.

DeepSeek, the Chinese AI company that drew attention earlier this year, appeared in both the global (#6) and US (#7) trending lists.

The global top 10 trending searches were:

  1. Gemini
  2. India vs England
  3. Charlie Kirk
  4. Club World Cup
  5. India vs Australia
  6. DeepSeek
  7. Asia Cup
  8. Iran
  9. iPhone 17
  10. Pakistan and India

US Trending Searches Show Different Priorities

The US list diverged from global trends, with Charlie Kirk leading and entertainment properties ranking high. KPop Demon Hunters claimed the second spot.

The US top 10 trending searches were:

  1. Charlie Kirk
  2. KPop Demon Hunters
  3. Labubu
  4. iPhone 17
  5. One Big Beautiful Bill Act
  6. Zohran Mamdani
  7. DeepSeek
  8. Government shutdown
  9. FIFA Club World Cup
  10. Tariffs

AI-Generated Content Leads US Trends

A dedicated “Trends” category in the US data showed AI content creation drove search interest throughout 2025.

The top US trends included:

  1. AI action figure
  2. AI Barbie
  3. Holy airball
  4. AI Ghostface
  5. AI Polaroid
  6. Chicken jockey
  7. Bacon avocado
  8. Anxiety dance
  9. Unfortunately, I do love
  10. Ghibli

The Ghibli entry likely reflects the viral AI-generated images mimicking Studio Ghibli’s animation style that circulated on social media platforms.

News & Current Events

News-related trending searches reflected the year’s developments. Globally, the top trending news searches included the LA Fires, Hurricane Melissa, TikTok ban, and the selection of a new pope.

US news trends focused on domestic policy, with the One Big Beautiful Bill Act and tariffs appearing alongside the government shutdown and Los Angeles fires.

Why This Matters

This data shows where user interest spiked throughout 2025. The presence of AI tools at the top of global trends confirms continued growth in AI-related search behavior.

The split between global and US lists also shows regional differences in trending topics. Cricket matches dominated global sports interest while US searches leaned toward entertainment and policy.

Looking Ahead

Google’s Year in Search data is available on the company’s trends site.

Comparing this year’s trending topics against your content calendar can reveal gaps in coverage or opportunities for timely updates to existing content.

7 SEO, Marketing, And Tech Predictions For 2026 via @sejournal, @Kevin_Indig

Previous predictions: 2018 | 2019 | 2020 | 2021 | 2022 | 2023 | 2024

This is my 8th time publishing annual predictions. As always, the goal is not to be right but to practice thinking.

For example, in 2018, I predicted that “Niche communities will be discovered as a great channel for growth” and that “Email marketing will return” in 2019. The latter took another six years. That same year, I also wrote, “Smart speakers will become a viable user-acquisition channel in 2018.” Well…

All 2026 Predictions

  1. AI visibility tools face a reckoning.
  2. ChatGPT launches first quality update.
  3. Continued click-drops lead to a “Dark Web” defense.
  4. AI forces UGC platforms to separate feeds.
  5. ChatGPT’s ad platform provides “demand data.”
  6. Perplexity sells to xAI or Salesforce.
  7. Competition tanks Nvidia’s stock by -20%.

For the past three years, we have lived in the “generative era,” where AI could read the internet and summarize it for us. 2026 marks the beginning of the “agentic era,” where AI stops just consuming the web and starts writing to it – a shift from information retrieval to task execution.

This isn’t just a feature update; it is a fundamental restructuring of the digital economy. The web is bifurcating into two distinct layers:

  1. The Transactional Layer: Dominated by bots executing API calls and “Commercial Agents” (like Remarkable Alexa) that bypass the open web entirely.
  2. The Human Layer: Verified users and premium publishers retreating behind “Dark Web” blockades (paywalls, login gates, and C2PA encryption) to escape the sludge of AI content.

A big question mark is advertising. Google’s expansion of ads into AI Mode and ChatGPT showing ads to free users could alleviate pressure on CPCs, but AI Overviews (AIOs) could drive them up. 2026 could be a year of wild price swings, where smart teams (your “holistic pods”) move budget daily between Google (high cost/high intent) and ChatGPT (low cost/discovery) to exploit the spread.

It is not the strongest of the species that survives, nor the most intelligent; it is the one most adaptable to change.

— Leon C. Megginson


SEO/AEO

AI Visibility Tools Face A Reckoning

Prediction: I forecast an “Extinction Event” in Q3 2026 for the standalone AI visibility tracking category. Rather than a simple consolidation, our analysis shows the majority of pure-play tracking startups might fold or sell for parts as their 2025 funding runways expire simultaneously without the revenue growth to justify Series B rounds.

Why:

  • Tracking is a feature, not a company. Amplitude built an AI tracker for free in three weeks, and legacy platforms like Semrush bundled it as a checkbox, effectively destroying the standalone business model.
  • Many tools have almost zero “customer voice” proof of concept (e.g., zero G2 reviews), creating a massive valuation bubble.
  • The ROI of AI visibility optimization is still unclear and hard to prove.

Context:

  • Roughly 20 companies raised over $220 million at high valuations. 73% of those companies were founded in 2024.
  • Adobe’s $1.9 billion acquisition of Semrush proves that value lies in platforms with distribution, not in isolated dashboards.

Consequences:

  • Smart money will flee “read-only” tools (dashboards) and rotate into “write-access” tools (agentic SEO) that can automatically ship content and fix issues.
  • There will be 2-3 winners among AI visibility trackers on top of the established all-in-one platforms. Most of them will evolve into workflow automation, where most of the alpha is and where established platforms have not yet built features.
  • The remaining players will sell, consolidate, pivot, or shut down.
  • AI visibility tracking itself faces a crisis of (1) what to track and (2) how to influence the numbers, since a large part of impact comes from third-party sites.

ChatGPT Launches First Quality Update

Prediction: It’ll be harder for spammers to influence AI visibility in 2026 with link spam, mass-generated AI content, and cloaking. By 2026, agents will likely use Multi-Source Corroboration to eliminate this asymmetry.

Why:

  • The fact that you can publish a listicle about top solutions on your site and name yourself first and influence AI visibility seems off.
  • New techniques are available, such as “ReliabilityRAG” and “Multi-Agent Debate,” in which one AI agent retrieves the information and another acts as a “judge,” verifying it against other sources before showing it to the user.
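As a toy illustration of the corroboration idea (my own sketch, not how ReliabilityRAG or any production system works), a claim could be surfaced only when sources on multiple independent domains assert it:

```python
# Toy multi-source corroboration filter: a claim is surfaced only when it is
# asserted by sources on at least `min_domains` distinct domains.
# Purely illustrative; production systems are far more involved.
from collections import defaultdict
from urllib.parse import urlparse

def corroborated(claims, min_domains=2):
    """claims: iterable of (claim_text, source_url) pairs.
    Returns the set of claims backed by >= min_domains distinct domains."""
    domains = defaultdict(set)
    for claim, url in claims:
        domains[claim].add(urlparse(url).netloc)
    return {c for c, d in domains.items() if len(d) >= min_domains}

claims = [
    ("ToolX is the market leader", "https://toolx.com/blog"),       # self-serving, one domain
    ("ToolY supports CSV export", "https://toolx.com/compare"),
    ("ToolY supports CSV export", "https://docs.tooly.io/export"),  # independently corroborated
]
```

Under this filter, the self-published “we’re number one” listicle never surfaces on its own, which is exactly the asymmetry the prediction says quality updates will target.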

Context:

  • Most current agents (like standard ChatGPT, Gemini, or Perplexity) use a process called Retrieval-Augmented Generation (RAG). But RAG is still susceptible to hallucination and making errors.
  • Spammers often target specific, low-volume queries (e.g., “best AI tool for underwater basket weaving”) because there is no competition. However, new “knowledge graph” integration allows AIs to infer that a basket-weaving tool shouldn’t be a crypto-scam site based on domain authority and topic relevance, even if it’s the only page on the internet with those keywords.

Consequences:

  • OpenAI engineers are likely already working on better quality filters.
  • LLMs will shift from pure retrieval to corroboration.
  • Spammers might move to more sophisticated tactics, where they try to manufacture the consensus by buying and using zombie media outlets, cloaking, and other malicious tactics.

Continued Click-Drops Lead To A “Dark Web” Defense

Prediction: AI Overviews (AIOs) scale to 75% of keywords for big sites. AI Mode rolls out to 10-20% of queries.

Why:

  • Google said they’re seeing more queries as a result of AIOs. The logical conclusion is to show even more AIOs.
  • CTR for organic search results had already tanked from 1.41% to 0.64% by January. Since January, paid CTR has dropped from 14.92% to 6.34% (down to roughly 42% of its January level).

Context:

  • Big sites already see AIOs for ~50% of their keywords.
  • Google started testing ads in AI Mode. If successful, Google would feel more confident to roll out AI Mode more broadly, and the investor story would sound better.
  • 80% of consumers now use AI summaries for at least 40% of their searches, according to Bain.
  • 2025 saw a massive purge in digital media, with major layoffs at networks like NBC News, BBC, and tech publishers as they restructured for a “post-traffic” world.

Consequences:

  • Publishers monetize audiences directly instead of through ads and move to “experience-based” content (firsthand reviews, contrarian opinions, proprietary data) because AI cannot experience things. The space consolidates further (layoffs, acquisitions, Chapter 11 filings).
  • By 2026, we expect a massive wave of “LLM blockades.” Major publishers will update their robots.txt to block Google-Extended and GPTBot, forcing users to visit the site to see the answer. This creates a “Dark Web” of high-quality content that AI cannot see, bifurcating the internet into AI slop (free) and human insight (paid).
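Mechanically, such a blockade is just a robots.txt policy. A minimal sketch, assuming a publisher wants to refuse OpenAI’s training crawler and Google’s AI-training token while leaving classic web-search crawling intact (note that Google-Extended governs AI-training use of content, while AI Overviews draw on Googlebot’s regular index, so this alone does not remove a site from AIOs):

```txt
# Refuse AI-training crawlers site-wide
User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

# Classic web-search crawling remains allowed
User-agent: Googlebot
Allow: /
```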

Marketing

AI Forces UGC Platforms To Separate Feeds

Prediction: By 2026, “identity spoofing” will become the single largest cybersecurity risk for public companies. We move from “Is this content real?” to “Is this source verified?”

Why:

  • Real influencers are risky (scandals, contract disputes). AI influencers are brand-safe assets that work 24/7/365 and never say anything controversial unless prompted. Brands will pay a premium to avoid humans.

Context:

  • Deepfake fraud attempts increased 257% in 2024. Most detection tools currently have a 20%+ false positive rate, making them hard to use for platforms like YouTube without killing legitimate creator reach.
  • Example: In 2024, the engineering firm Arup lost $25 million when an employee was tricked by a deepfake video conference call where the “CFO” and other colleagues were all AI simulations.
  • In May 2023, a fake AI image of an explosion at the Pentagon caused a momentary dip in the S&P 500.

Consequences:

  1. Cryptographic signatures (C2PA) become the only proof of reality for video.
  2. YouTube and LinkedIn will likely split feeds into “verified human” (requires ID + biometric scan) and “synthetic/unverified.”
  3. “Blue checks” won’t just be for status, but a security requirement to comment or post video, effectively ending anonymity for high-reach accounts.
  4. Platforms will be forced by regulators (EU AI Act, August 2026 deadline) to label AI content.
  5. Cameras (Sony, Canon) and iPhones will start embedding C2PA digital signatures at the hardware level. If a video lacks this “chain of custody” metadata, platforms will auto-label it as “unverified/synthetic.”
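C2PA itself defines signed manifests with X.509 certificate chains embedded in the media file. As a minimal illustration of the underlying tamper-evidence idea only (not the C2PA format), here is a signature over a content hash, with an HMAC key standing in for a device’s hardware signing key:

```python
# Minimal illustration of tamper-evident content provenance.
# NOT the C2PA format: C2PA uses X.509 certificates and signed manifests.
# Here an HMAC key stands in for a camera's hardware signing key.
import hashlib
import hmac

DEVICE_KEY = b"stand-in-for-hardware-key"

def sign_content(content: bytes) -> str:
    """Sign the SHA-256 digest of the content with the device key."""
    digest = hashlib.sha256(content).digest()
    return hmac.new(DEVICE_KEY, digest, hashlib.sha256).hexdigest()

def verify_content(content: bytes, signature: str) -> bool:
    """Re-derive the signature and compare in constant time."""
    return hmac.compare_digest(sign_content(content), signature)

video = b"original frame bytes"
sig = sign_content(video)
```

Any edit to the bytes breaks verification, which is the “chain of custody” property platforms would rely on when auto-labeling unsigned uploads as unverified.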

ChatGPT’s Ad Platform Provides “Demand Data”

Prediction: OpenAI shifts to a hybrid pricing model in 2026: An “ad-supported free tier” and “credit-based pro tier.”

Why:

  • Inference costs are skyrocketing. A heavy user paying $20/month can easily burn $100+ of computing, making them unprofitable.

Context:

  • Leaked code in the ChatGPT Android App (v1.2025.329) explicitly references “search ads carousel” and “bazaar content.”

Consequences:

  • Free users will see “sponsored citations” and product cards (ads) in their answers.
  • Power users will face “compute credits” – a base subscription gets you standard GPT-5, but heavy use of deep research or reasoning agents will require buying top-up packs.
  • We get a Search-Console style interface. Brands need data. If OpenAI wants to sell ads, it must give brands a dashboard showing, “Your product was recommended in 5,000 chats about running shoes.” The data will add fuel to the fire for AEO/GEO/LLMO/SEO.
  • The leaked term “bazaar content” suggests OpenAI might not just show ads, but allow transactions inside the chat (e.g., “Book this flight”) where they take a cut. This moves OpenAI from a software company to a marketplace (like the App Store), effectively competing with Amazon and Expedia.

Tech

Perplexity Sells To xAI Or Salesforce

Prediction: Perplexity will be acquired in late 2026 for $25-$30 billion. After its user growth plateaus at ~50 million MAU, the “unit economics wall” forces a sale to a giant that needs its technology (real-time RAG), not its business model.

Why:

  • In late 2025, Perplexity raised capital at a $20 billion valuation (roughly 100x its ~$200 million ARR). To justify this, they need Facebook-level growth. However, 2025 data shows they hit a ceiling at ~30 million users while ChatGPT surged to +800 million.
  • By 2026, Google and OpenAI will have effectively cloned Perplexity’s core feature (Deep Research) and given it away for free.

Context:

  • While Perplexity grew 66% YoY in 2025 to ~30 million monthly active users (MAU), this pales in comparison to ChatGPT’s +800 million.
  • It costs ~10x more to run a Perplexity deep search query than a standard Google search. Without a high-margin ad network (which takes a decade to build), they burn cash on every free user, creating a “negative scale” problem.
  • Salesforce acquired Informatica for ~$8 billion in 2025 specifically to power its Agentforce strategy. This proves Benioff is willing to spend billions to own the data layer for enterprise agents.
  • xAI raised over $20 billion in late 2025, valuing the company at $200 billion. Musk has the liquid cash to buy Perplexity tomorrow to fix Grok’s hallucination problems.

Consequences:

  • xAI has the cash, and Musk needs a “real-time truth engine” for Grok. Perplexity could make X (Twitter) a more powerful news engine. Grok (X’s current AI) learns from tweets, but Perplexity cites sources that can reduce hallucination. Perplexity could also give xAI a browser, bringing it closer to Musk’s vision of a super app.
  • Marc Benioff wants to own “enterprise search.” Imagine a Salesforce Agent that can search the entire public web (via Perplexity) + your private CRM data to write a perfect sales email.

Competition Tanks Nvidia’s Stock By -20%

Prediction: Nvidia stock will correct by >20% in 2026 as its largest customers successfully shift 15-20% of their workloads to custom internal silicon. This causes a P/E compression from ~45x to ~30x as the market realizes Nvidia is no longer a monopoly, but a “competitor” in a commoditized market. (Not investment advice!)

Why:

  • Microsoft, Meta, Google, and Amazon likely account for over 40% of Nvidia’s revenue. For them, Nvidia is a tax on their margins. They are currently spending ~$300 billion combined on CAPEX in 2025, but a growing portion is now allocated to their own chip supply chains rather than Nvidia H100s/Blackwells.
  • Hyperscalers don’t need chips that beat Nvidia on raw specs; they just need chips that are “good enough” for internal inference (running models), which accounts for 80-90% of compute demand.

Context:

  • In late 2025, reports surfaced that Meta was negotiating to buy/rent Google’s TPU v6 (Trillium) chips to reduce its reliance on Nvidia.
  • AWS Trainium 2 & 3 chips are reportedly 30-50% cheaper to operate than Nvidia H100s for specific workloads. Amazon is aggressively pushing these cheaper instances to startups to lock them into the AWS silicon ecosystem.
  • Microsoft’s Maia 100 is now actively handling internal Azure OpenAI workloads. Every workload shifted to Maia is an H100 Nvidia didn’t sell.
  • Reports confirm OpenAI is partnering with Broadcom to mass-produce its own custom AI inference chip in 2026, directly attacking Nvidia’s dominance in the “Model Serving” market.
  • Fun fact: Without Nvidia, the S&P 500 would’ve made 3 percentage points less in 2025.

Consequence:

  • Nvidia will react by refusing to sell just chips. They will push the GB200 NVL72 – a massive, liquid-cooled supercomputer rack that costs millions. This forces customers to buy the entire Nvidia ecosystem (networking, cooling, CPUs), making it physically impossible to swap in a Google TPU or Amazon chip later.
  • If hyperscalers signal even a 5% cut in Nvidia orders to favor their own chips, Wall Street will panic-sell, fearing the peak of the AI Infrastructure Cycle has passed.

Featured Image: Paulo Bobita/Search Engine Journal

The Search Equity Gap: Quantifying Lost Organic Market Share (And Winning It Back) via @sejournal, @billhunt

Every month, companies lose millions in unrealized search value not because their teams stopped optimizing, but because they stopped seeing where visibility converts into economic return.

When search performance drops, most teams chase rankings. The real leaders chase equity.

This is the Search Equity Gap – the measurable delta between the organic market share your brand once held and what it holds today.

In most organizations, this gap isn’t tracked or budgeted for. Yet it represents one of the most consistent and compounding forms of digital opportunity cost. Every unclaimed click isn’t just lost traffic; it’s lost demand at the lowest acquisition cost possible – an invisible tax on growth.
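One way to put a number on that tax is to price lost organic clicks at what it would cost to buy them back as paid traffic. A minimal sketch with placeholder figures (the function and all inputs are illustrative, not benchmarks from the article):

```python
# Price the Search Equity Gap as the paid-media cost of replacing lost
# organic clicks, plus the revenue those clicks would have driven.
# All inputs below are illustrative placeholders.
def search_equity_gap(baseline_clicks: int, current_clicks: int,
                      avg_cpc: float, conversion_rate: float,
                      avg_order_value: float) -> dict:
    lost_clicks = max(baseline_clicks - current_clicks, 0)
    return {
        "lost_clicks": lost_clicks,
        "paid_replacement_cost": lost_clicks * avg_cpc,  # cost to buy the clicks back
        "lost_revenue": lost_clicks * conversion_rate * avg_order_value,
    }

gap = search_equity_gap(
    baseline_clicks=120_000,  # monthly organic clicks at the historical peak
    current_clicks=80_000,    # monthly organic clicks today
    avg_cpc=2.50,             # what paid search charges for the same clicks
    conversion_rate=0.02,
    avg_order_value=90.0,
)
```

With these placeholder inputs, 40,000 lost monthly clicks would cost $100,000 to replace via paid search and represent roughly $72,000 in forgone revenue, which is the kind of figure that turns “free traffic” into a line item leadership can budget against.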

When we treat SEO as a channel, we chase traffic.

When we treat it as an equity engine, we reclaim value.

Search Equity: The Compounding Value Of Discoverability

Search equity is the accumulated advantage your brand earns when visibility, authority, and user trust align. Like financial equity, it compounds over time – links build reputation, content earns citations, and user engagement reinforces relevance.

But the opposite is also true: When migrations break URLs, when content fragments across markets, or when AI overviews intercept clicks, that equity erodes.

And that’s usually the moment when management suddenly discovers the value of organic search – right after it vanishes.

What was once dismissed as “free traffic” becomes an expensive emergency as other channels scramble to compensate for the lost opportunity. Paid budgets balloon, acquisition costs spike, and leadership learns that SEO isn’t a faucet you can turn back on.

Search equity isn’t just about rankings. It’s about discoverability at scale – ensuring your brand appears, is understood, and is chosen in every relevant search context, from classic results to AI-generated overviews.

In this new environment, visibility without qualification is meaningless. A million impressions that never convert are not an asset. The opportunity lies in reclaiming qualified visibility – the type that drives revenue, reduces acquisition costs, and compounds shareholder value.

Diagnosing The Decline: Where Search Equity Disappears

Every SEO audit can uncover technical or content issues. But the deeper cause of declining performance often stems from three systemic leaks.

1. Structural Leaks

Migrations, redesigns, and rebrands remain the biggest equity destroyers in enterprise SEO. When URLs change without proper mapping, Google’s understanding of authority resets. Internal link equity splinters. Canonical signals conflict.

Each broken or redirected page acts like a severed artery in your digital system – small losses multiplied at scale. What seems like a simple platform refresh can erase years of accumulated search trust.

2. Behavioral Shifts

Even when nothing changes internally, the ecosystem around you continues to evolve. Zero-click results, AI Overviews, and new answer formats siphon attention. Search visibility remains, but user behavior no longer translates into traffic.

The new challenge isn’t “ranking first.” It’s being chosen when the user’s question is answered before they click. This demands a shift from keyword optimization to intent satisfaction and requires restructuring your content, data, and experience for discoverability and decision influence.

3. Organizational Drift

Perhaps the most corrosive leak of all: misalignment. When SEO sits in marketing, IT in technology, and analytics in finance, nobody owns the whole system.

Executives fund rebrands that destroy crawl efficiency. Paid teams buy traffic that good content could have earned. Each department optimizes its own key performance indicator (KPI), and in doing so, the organization loses cohesion. Search equity collapses not because of algorithms, but because of organizational architecture. The fix starts at the top.

Quantifying The Search Equity Gap (Actuals-Based Model)

Most companies estimate what they should earn in search and compare it to current performance. But in volatile, AI-driven SERPs, real performance deltas tell the truer story.

Instead of modeling potential, this approach uses before-and-after data – actual performance metrics from both pre-impact and current states. By doing so, you measure realized loss, click erosion, and intent displacement with precision.

Search Equity Gap = Lost Qualified Traffic + Lost Discoverability + Lost Intent Coverage

Step 1: Establish A Baseline (Pre-Impact Period)

Pull your data from a stable window before the event (typically three to six months prior).

From Google Search Console and analytics, extract:

  • Top performing queries (impressions, clicks, CTR, position).
  • Top landing pages and their mapped queries.
  • Conversion or value proxies where available.

This becomes your search equity portfolio – the measurable value of your earned discoverability.
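In practice, a baseline like this can be assembled from a standard Search Console performance export. The sketch below parses a query-level CSV into a per-query "portfolio" dict; the column names (`Query`, `Clicks`, `CTR`, etc.) are assumptions about the export format and should be adjusted to match your actual file.

```python
import csv
import io

def build_baseline(gsc_export):
    """Aggregate a Search Console performance export (query-level CSV)
    into a baseline equity portfolio keyed by query.
    Column names are assumed; adjust to match your export."""
    portfolio = {}
    for row in csv.DictReader(gsc_export):
        portfolio[row["Query"]] = {
            "clicks": int(row["Clicks"]),
            "impressions": int(row["Impressions"]),
            "ctr": float(row["CTR"].rstrip("%")) / 100,  # "3%" -> 0.03
            "position": float(row["Position"]),
        }
    return portfolio

# Illustrative two-row export; a real file would come from GSC's Export button.
sample = io.StringIO(
    "Query,Clicks,Impressions,CTR,Position\n"
    "running shoes,1200,40000,3%,4.2\n"
    "trail shoes,300,9000,3.33%,6.8\n"
)
baseline = build_baseline(sample)
```

Running the same extraction over the post-impact window with identical columns gives you two directly comparable snapshots for Step 2.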

Step 2: Compare To The Current State (Post-Impact)

Run the same data for the current period and align query-to-page pairs.

Then classify each outcome:

  • Lost Equity – Queries or pages no longer ranking or receiving traffic. Typical cause: migration, technical issues, cannibalization. Recovery outlook: high (fixable).
  • Eroded Equity – Still ranking, but with dropped positions or CTR. Typical cause: content fatigue, new competitors, UX decay. Recovery outlook: moderate (recoverable).
  • Reclassified Equity – Still visible but replaced or suppressed by AI Overviews, zero-click blocks, or SERP features. Typical cause: algorithmic change or behavioral shift. Recovery outlook: low to moderate (influence possible).

This comparison reveals both visibility loss and click erosion, clarifying where and why your equity declined.
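The classification above can be expressed as a small rule set over each aligned query-page pair. The thresholds in this sketch (80% impression retention, 50% click collapse) are illustrative assumptions, not part of the framework, and should be tuned per site:

```python
def classify_equity(pre, post):
    """Classify one query/page pair by comparing pre- and post-impact
    metrics (dicts with 'clicks', 'impressions', 'position').
    Thresholds are illustrative and should be tuned per site."""
    if post is None or post["impressions"] == 0:
        return "Lost Equity"          # no longer ranking or receiving traffic
    still_visible = post["impressions"] >= 0.8 * pre["impressions"]
    clicks_collapsed = post["clicks"] <= 0.5 * pre["clicks"]
    if still_visible and clicks_collapsed:
        return "Reclassified Equity"  # visible, but zero-click/AI features absorb the click
    if post["position"] > pre["position"] or post["clicks"] < pre["clicks"]:
        return "Eroded Equity"        # still ranking, but position/CTR slipped
    return "Stable"

# Visibility held, clicks halved: the classic AI Overview signature.
pre = {"clicks": 1000, "impressions": 50000, "position": 3.0}
post = {"clicks": 400, "impressions": 48000, "position": 3.2}
print(classify_equity(pre, post))  # → Reclassified Equity
```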

Step 3: Attribute The Loss

Link each pattern to its primary driver:

  1. Structural – Indexation, redirects, broken templates.
  2. Content – Thin, outdated, or unstructured pages lacking E-E-A-T.
  3. SERP Format – AI overviews, videos, or answer boxes replacing classic results.
  4. Competitive – New entrants or aggressive refresh cycles.

These map to equity types:

  • Recoverable Equity: technical or content improvements.
  • Influence Equity: optimizing brand/entity visibility within AI Overviews.
  • Retired Equity: informational queries no longer yielding clicks.

This triage converts diagnosis into a prioritized investment plan.

Step 4: Quantify The Economic Impact

For each equity type, calculate:

Lost Value = Δ Clicks × Conversion Rate × Value per Conversion

Add a Paid Substitution Cost to translate organic loss into a financial figure:

Cost of Not Ranking = Lost Clicks × Avg CPC

This ties the forensic analysis directly to what I call The Cost of Not Ranking, and shows executives the tangible price of underperformance.

Example:

  • 15,000 fewer monthly clicks on high-intent queries.
  • 3% conversion × $120 avg order value = $54,000/month in unrealized value.
  • CPC $3.10 → $46,500/month to replace via paid.

Now your analysis quantifies both organic value lost and capital inefficiency created.
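Both formulas are simple enough to verify directly with the article's own figures (15,000 lost monthly clicks, 3% conversion, $120 average order value, $3.10 average CPC):

```python
def lost_value(delta_clicks, conversion_rate, value_per_conversion):
    # Lost Value = Δ Clicks × Conversion Rate × Value per Conversion
    return delta_clicks * conversion_rate * value_per_conversion

def cost_of_not_ranking(lost_clicks, avg_cpc):
    # Cost of Not Ranking = Lost Clicks × Avg CPC
    return lost_clicks * avg_cpc

monthly_lost = lost_value(15_000, 0.03, 120)           # $54,000/month unrealized value
paid_substitution = cost_of_not_ranking(15_000, 3.10)  # $46,500/month to replace via paid
```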

Step 5: Separate The Signal From The Noise

Not all loss deserves recovery. Patterns surface quickly:

  • High-volume informational pages: visibility stable, clicks down – reclassified (low ROI).
  • Product or service pages: dropped due to structural issues – recoverable (high ROI).
  • Brand or review pages: replaced by AI summaries – influence (medium ROI).

Plot these on a Search Equity Impact Matrix – potential value vs. effort – to direct resources toward recoverable, high-margin opportunities.
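A minimal version of that matrix is just potential value divided by effort. The segments and effort scores below (1 = easy, 5 = hard) are hypothetical placeholders to show the ranking mechanic, not figures from the article:

```python
# Sketch of a Search Equity Impact Matrix: rank opportunities by
# potential monthly value vs. estimated effort (1 = easy, 5 = hard).
# All numbers are hypothetical placeholders.
opportunities = [
    {"segment": "product pages (structural fix)", "value": 46500, "effort": 2},
    {"segment": "brand/review pages (AI influence)", "value": 12000, "effort": 3},
    {"segment": "informational pages (reclassified)", "value": 3000, "effort": 4},
]

ranked = sorted(opportunities, key=lambda o: o["value"] / o["effort"], reverse=True)
for opp in ranked:
    print(f'{opp["segment"]}: priority score {opp["value"] / opp["effort"]:.0f}')
```

Recoverable structural fixes on revenue pages dominate the ranking; reclassified informational queries fall to the bottom, which matches the ROI pattern above.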

Why This Matters

Most SEO reports describe position snapshots. Few reveal equity trajectories. By grounding analysis in actuals before and after impact, you replace speculation with measurable evidence that data executives can trust. This reframes search optimization as loss prevention and value recovery, not traffic chasing.

From Visibility Metrics To Value Metrics

Traditional metrics focus on activity:

  • Average ranking position.
  • Total impressions.
  • Organic sessions.

Value-based metrics focus on performance and economics:

  • Qualified Visibility Share (discoverability within high-intent categories).
  • Recovered Revenue Potential (modeled from Δ Clicks × Value).
  • Digital Cost of Capital (what it costs to replace that traffic via paid).

Integrating the Cost of Not Ranking logic further amplifies this.

Every click you have to buy is a symptom of a ranking you didn’t earn.

By comparing your paid and organic data for the same query set, you can see how much budget covers for lost equity and how much could be redeployed if organic recovery occurred.

When teams present SEO performance in these financial terms, they gain executive attention and budget alignment.

Example:

“Replacing lost organic share with paid clicks costs $480,000 per quarter. Fixing canonical and internal-link issues can recover 70% of that value within 90 days.”

That’s not an SEO report. That’s a business case for digital capital recovery.

Winning It Back: A Framework For Recovery

Search equity recovery follows the same progression as digital value creation – diagnose, quantify, prioritize, and institutionalize.

1. Discover The Gap

Compare actual performance pre- and post-impact. Visualize equity at risk by category or market.

2. Diagnose The Cause

Layer crawl data, analytics, and competitive intelligence to isolate technical, behavioral, and AI factors.

3. Differentiate

Focus on qualified clicks from mid- and late-funnel intents where AI summaries mention your brand but don’t link to you.

Answer those queries more directly. Reinforce them with structured data and content relationships that signal expertise and trust.

4. Reinforce

Embed SEO governance into development, design, and content workflows. Optimization becomes a process, not a project – or, as I’ve written before, infrastructure, not tactic. When governance becomes muscle memory, equity doesn’t just recover; it compounds.

From Cost Center To Compounding Asset

Executives often ask:

“How much revenue does SEO drive?”

The better question is:

“How much value are we losing by not treating search as infrastructure?”

The search equity gap quantifies that blind spot. It reframes SEO from a cost-justified marketing function into a value-restoration system – one that preserves and grows digital capital over time. Each recovered visit is a visit you no longer need to buy. Each resolved structural issue accelerates time-to-value for every future campaign.

Ironically, the surest way to make executives appreciate SEO is to let it break once. Nothing clarifies its importance faster than the sound of paid budgets doubling to make up for “free” traffic that suddenly disappeared. That’s how SEO evolves from an acquisition channel to a shareholder-value lever.

Final Thought

The companies dominating search today aren’t publishing more content – they’re protecting and compounding their equity more effectively.

They’ve built digital balance sheets that grow through governance, not guesswork. The rest are still chasing algorithm updates while silently losing market share in the one channel that could deliver the highest margin growth.

The search equity gap isn’t a ranking problem. It’s a visibility-to-value disconnect, and closing it starts by measuring what most teams never even notice.

Featured Image: N Universe/Shutterstock

Tools to Track GenAI Citations, Sources

Generative AI platforms increasingly conduct live web searches to respond to users’ prompts. The platforms don’t reveal how or where they search, but it’s likely a combination of Google, Bing, and the platforms’ own bots.

Just a few months ago, those answers would have relied primarily on existing training data.

Regardless, understanding how AI platforms conduct the searches is key to optimizing visibility in the answers.

Analyze:

  • Which web pages produce the genAI answers? Try to appear in those pages.
  • Which brands and products influenced an answer? Are they competitors?

Here are three tools to help reveal impactful pages and influential brands and products.

ChatGPT Path

ChatGPT Path from Ayima, an agency, is a free Chrome extension that extracts citations, brands and products (entities), and fan-out queries from any ChatGPT dialog. Download the extension and converse with ChatGPT. Then click the extension icon to open a side panel with the key info, called RAG Sources (“Retrieval‑Augmented Generation”).

Export the report via CSV for easier analysis.


ChatGPT Path extracts citations, brands and products, and fan-out queries from any ChatGPT dialog, such as this example for “help me choose running shoes for rainy weather.”

AI Search Impact Analysis

AI Search Impact Analysis is another free Chrome extension that analyzes multiple queries on Google AI Overviews.

Install the extension and type your comma-separated queries into the tool’s sidebar. The tool will run each search and identify AI Overviews and the queries that triggered them.

A separate “Citation Report” includes all URLs cited in each Overview and overall for all queries. In my testing, this feature was handy for identifying URLs cited repeatedly.

The extension’s “Brand Check” analyzes mentions of your company and competitors in Overviews.


“Brand Check” analyzes Overviews for mentions of your company and competitors, such as “nike” and “hoka” shown here.

Peec AI

Peec AI is a premium analytics tool for sources and brand mentions in ChatGPT, Perplexity, and AI Overviews.

To use, enter your brand and targeted prompts. The tool, after a few minutes, will create a detailed report, listing:

  • Domains cited in genAI answers for those prompts,
  • URLs linked in the answers.

The report categorizes cited domains by type (e.g., corporate, brand-owned, user-generated) and frequency (to know a domain’s impact on a cluster of answers).

A separate aggregated report combines all genAI platforms, with URL filters for each one. The “Gap analysis” lists cited URLs that mention competing brands but not yours.

Finally, Peec AI analyzes all entered prompts and lists the most-cited brands to compare and track against your own.


Peec AI’s report categorizes cited domains by type and frequency.