Data: Translated Sites See 327% More Visibility in AI Overviews

This post was sponsored by Weglot. The opinions expressed in this article are the sponsor’s own.

When Google’s AI Overviews launched in 2024, dozens of questions quickly surfaced among SEO professionals, one being: if AI now curates and summarizes search results, how do websites earn visibility, especially across languages?

Weglot recently conducted a data-driven study, analyzing 1.3 million citations across Google AI Overviews and ChatGPT, to determine whether content that LLMs cite in one language would also be cited in others.

The result: translated websites saw up to 327% more visibility in AI Overviews than untranslated ones, a clear signal that international SEO is becoming inseparable from AI search.

What’s more, websites with another language available were also more likely to be cited in AI Overviews, regardless of the language the search was made in.

This shift is redefining the rules of visibility. AI Overviews and large language models (LLMs) now mediate how information is discovered. Instead of ranking pages, they “cite” sources in generated responses.

But with that shift comes a new risk: if your website isn’t available in the user’s search language, does AI simply overlook it, or worse, send users to Google Translate’s proxy page instead?

The risk with Google’s Translate proxy is that while it does the translation work for you, you have no control over the translations of your content. Worse still, you don’t get any of the traffic benefits, as users are not directed to your site.

The Study

Here’s how the research worked. To understand how translation affects AI visibility, Weglot focused the research on Spanish-language websites across two markets: Spain and Mexico.

The study was then split into two phases. Phase one focused on websites that weren’t translated, and therefore only displayed the language intended for their market, in this case, Spanish.

In that phase, Weglot looked at 153 websites without English translations: 98 from Spain and 55 from Mexico. Weglot deliberately selected high-traffic sites that offered no English versions.

Phase two involved a comparison group of 83 Spanish and Mexican sites with versions in both Spanish and English. This allowed Weglot to directly compare the performance of translated versus untranslated content.

The methodology converted each site’s top 50 non-branded keywords into queries that users would likely search, and those queries were then translated between Spanish and English. In total, this generated 22,854 queries in phase one and 12,138 in phase two.

In total, 1.3 million citations were analyzed.

The Key Results

Untranslated Sites Have Very Low AI Search Visibility

The findings show that untranslated websites experience a substantial drop in visibility for searches conducted in languages they don’t offer, despite maintaining strong visibility in their available language.

Diving deeper into this, untranslated sites lose massive visibility: even when these Spanish websites performed well in Spanish searches, they virtually disappeared in English searches.

Looking at this data further within Google AI Overviews:

  • The 98 untranslated sites from Spain had 17,094 citations for Spanish queries vs 2,810 citations for the equivalent searches in English, a 431% gap in visibility.
  • Untranslated sites in Mexico showed a similar pattern: 12,038 citations for Spanish queries vs 3,450 for English, a 213% gap when searching in English.

Even ChatGPT, though more balanced, still showed a gap for untranslated sites, with Spanish sites receiving 3.5% fewer citations for English queries and Mexican sites 4.9% fewer.

Image created by Weglot, November 2025

Translated Sites Have 327% More AI Search Visibility

But what happens when you do translate your site?

Bringing in the comparison group of Spanish websites that also have an English version, we can see that translated sites dramatically close the visibility gap and that having a second language transformed visibility within Google AI Overviews.

Google AI Overviews:

  • Translated sites in Spain saw 10,046 citations for Spanish queries vs 8,048 for English, a gap of only 22%.
  • Translated sites in Mexico showed 5,527 citations for Spanish queries vs 3,325 for English, a 59% gap.

Overall, translated sites achieved 327% more visibility than untranslated ones and earned 24% more total citations per query.

When looking at ChatGPT, the bias almost vanished. Translated sites saw near-equal citations in both languages.

Image created by Weglot, November 2025

Next Steps: Translate Your Site To Boost Global Visibility In AI SERPs

Translation does more than boost visibility; it multiplies it.

Not only does offering multiple languages ensure your site gets picked up for searches in those languages, it also lifts the overall visibility of your site as a whole.

The study found that translated sites perform better across all metrics. The data shows that translated sites received 24% more citations per prompt than untranslated sites.

Looking at this by language, translation resulted in a 33% increase in English citations and a 16% increase in Spanish citations per query.

Weglot’s findings indicate that translation acts as a signal of authority and reliability for AIOs and ChatGPT, boosting citation performance across all languages, not only the ones the content is translated into.

Image created by Weglot, November 2025

AI Search Rewards Translated Content as a Visibility Signal

Traditional international SEO has long focused on hreflang tags and localized keywords. But in the age of AI search, translation itself becomes a visibility signal:

  1. Language alignment: AI engines prioritize content matching the query’s language.
  2. Authority building: Translated content attracts engagement across markets, improving perceived reliability.
  3. Traffic control: Proper translations prevent Google Translate proxies from intercepting clicks.
  4. Semantic reach: Multilingual content broadens your surface area for AI training and citation.

Put simply: If your content isn’t in the language of the question, it’s unlikely to be in the answer either.

The Business Impact

The consequences aren’t theoretical. One case in Weglot’s dataset, a major Spanish book retailer selling English-language titles worldwide without an English version of its site, shows the impact.

When English speakers searched for relevant books:

  • The site appeared 64% less often in Google AI Overviews and ChatGPT.
  • In 36% of the cases where it did appear, the link pointed to Google Translate’s proxy, not the retailer’s own domain.

Despite offering exactly what English users wanted, the business lost visibility, traffic, and ultimately, sales.

The Bigger Picture: AI Search Is Redefining SEO and Translation Is Now a Growth Strategy

The implications reach far beyond Spain or Mexico, or even the Spanish language.

As AI search evolves, the SEO playbook is expanding. Ranking isn’t just about “position one” anymore; it’s about being cited, summarized, and surfaced by machines trained on multilingual web content.

Weglot’s findings point to a future where translation is both an SEO and an AI strategy and not a localization afterthought.

With Google AIOs now live in multiple languages and ChatGPT integrating real-time web data, multilingual visibility has become an equity issue: sites optimized for one language risk being invisible in another.

Image created by Weglot, November 2025

Final Takeaway: Untranslated Sites Are Invisible in AI Search

The evidence is clear: Untranslated = unseen. Website translation is now a major factor in AIO visibility.

As AI continues to shape how search engines understand relevance, translation isn’t just about accessibility; it’s how your brand gets recognized by algorithms and audiences alike.

For the easiest way to translate a website, start your free trial now!

Plus, enjoy a 15% discount for 12 months on public plans by using the promo code SEARCH15 on a paid plan purchase.

Image Credits

Featured Image: Image by Weglot. Used with permission.

In-Post Images: Image by Weglot. Used with permission.

llms.txt: The Web’s Next Great Idea, Or Its Next Spam Magnet via @sejournal, @DuaneForrester

At a recent conference, I was asked if llms.txt mattered. I’m personally not a fan, and we’ll get into why below. I listened to a friend who told me I needed to learn more about it as she believed I didn’t fully understand the proposal, and I have to admit that she was right. After doing a deep dive on it, I now understand it much better. Unfortunately, that only served to crystallize my initial misgivings. And while this may sound like a single person disliking an idea, I’m actually trying to view this from the perspective of the search engine or the AI platform. Why would they, or why wouldn’t they, adopt this protocol? And that POV led me to some, I think, interesting insights.

We all know that search is not the only discovery layer anymore. Large-language-model (LLM)-driven tools are rewriting how web content is found, consumed, and represented. The proposed protocol, called llms.txt, attempts to help websites guide those tools. But the idea carries the same trust challenges that killed earlier “help the machine understand me” signals. This article explores what llms.txt is meant to do (as I understand it), why platforms would be reluctant, how it can be abused, and what must change before it becomes meaningful.

Image Credit: Duane Forrester

What llms.txt Hoped To Fix

Modern websites are built for human browsers: heavy JavaScript, complex navigation, interstitials, ads, dynamic templates. But most LLMs, especially at inference time, operate in constrained environments: limited context windows, single-pass document reads, and simpler retrieval than traditional search indexers. The original proposal from Answer.AI suggests adding an llms.txt markdown file at the root of a site, which lists the most important pages, optionally with flattened content so AI systems don’t have to scramble through noise.

Supporters describe the file as “a hand-crafted sitemap for AI tools” rather than a crawl-block file. In short, the theory: Give your site’s most valuable content in a cleaner, more accessible format so tools don’t skip it or misinterpret it.
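To make the proposal concrete, here is a minimal sketch of what such a file might look like, based on the Answer.AI format of an H1 site name, a short blockquote summary, and H2 sections listing markdown links. The site name and URLs are hypothetical placeholders, not taken from any real implementation.

# Example Store

> Example Store sells refurbished laptops and publishes repair guides for common hardware problems.

## Docs

- [Repair guides](https://example.com/guides.md): step-by-step hardware fixes
- [Warranty policy](https://example.com/warranty.md): coverage terms and claims process

## Optional

- [Blog archive](https://example.com/blog.md): older posts, lower priority for AI tools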

The Trust Problem That Never Dies

If you step back, you discover this is a familiar pattern. Early in the web’s history, something like the meta keywords tag let a site declare what it was about; it was widely abused and ultimately ignored. Similarly, authorship markup (rel=author, etc) tried to help machines understand authority, and again, manipulation followed. Structured data (schema.org) succeeded only after years of governance and shared adoption across search engines. llms.txt sits squarely inside this lineage: a self-declared signal that promises clarity but trusts the publisher to tell the truth. Without verification, every little root-file standard becomes a vector for manipulation.

The Abuse Playbook (What Spam Teams See Immediately)

What concerns platform policy teams is plain: If a website publishes a file called llms.txt and claims whatever it likes, how does the platform know that what’s listed matches the live content users see, or can be trusted in any way? Several exploit paths open up:

  1. Cloaking through the manifest. A site lists pages in the file that are hidden from regular visitors or behind paywalls, then the AI tool ingests content nobody else sees.
  2. Keyword stuffing or link dumping. The file becomes a directory stuffed with affiliate links, low-value pages, or keyword-heavy anchors aimed at gaming retrieval.
  3. Poisoning or biasing content. If agents trust manifest entries more than the crawl of messy HTML, a malicious actor can place manipulative instructions or biased lists that affect downstream results.
  4. Third-party link chains. The file could point to off-domain URLs, redirect farms, or content islands, making your site a conduit or amplifier for low-quality content.
  5. Trust laundering. The presence of a manifest might lead an LLM to assign higher weight to listed URLs, so a thin or spammy page gets a boost purely by appearance of structure.

The broader commentary flags this risk. For instance, some industry observers argue that llms.txt “creates opportunities for abuse, such as cloaking.” And community feedback apparently confirms minimal actual uptake: “No LLM reads them.” That absence of usage ironically means fewer real-world case studies of abuse, but it also means fewer safety mechanisms have been tested.

Why Platforms Hesitate

From a platform’s viewpoint, the calculus is pragmatic: New signals add cost, risk, and enforcement burden. Here’s how the logic works.

First, signal quality. If llms.txt entries are noisy, spammy, or inconsistent with the live site, then trusting them can reduce rather than raise content quality. Platforms must ask: Will this file improve our model’s answer accuracy or create risk of misinformation or manipulation?

Second, verification cost. To trust a manifest, you need to cross-check it against the live HTML, canonical tags, structured data, site logs, etc. That takes resources. Without verification, a manifest is just another list that might lie.

Third, abuse handling. If a bad actor publishes an llms.txt manifest that lists misleading URLs which an LLM ingests, who handles the fallout? The site owner? The AI platform? The model provider? That liability issue is real.

Fourth, user-harm risk. An LLM citing content from a manifest might produce inaccurate or biased answers. This adds to the problem we already face of people receiving, and acting on, incorrect or even dangerous answers.

Google has already stated that it will not rely on llms.txt for its “AI Overviews” feature and continues to follow “normal SEO.” And John Mueller wrote: “FWIW no AI system currently uses llms.txt.” So the tools that could use the manifest are largely staying on the sidelines. This reflects the idea that a root-file standard without established trust is a liability.

Why Adoption Without Governance Fails

Every successful web standard has shared DNA: a governing body, a clear vocabulary, and an enforcement pathway. The standards that survive all answer one question early … “Who owns the rules?”

Schema.org worked because that answer was clear. It began as a coalition between Bing, Google, Yahoo, and Yandex. The collaboration defined a bounded vocabulary, agreed syntax, and a feedback loop with publishers. When abuse emerged (fake reviews, fake product data), those engines coordinated enforcement and refined documentation. The signal endured because it wasn’t owned by a single company or left to self-police.

Robots.txt, in contrast, survived by being minimal. It didn’t try to describe content quality or semantics. It only told crawlers what not to touch. That simplicity reduced its surface area for abuse. It required almost no trust between webmasters and platforms. The worst that could happen was over-blocking your own content; there was no incentive to lie inside the file.

llms.txt lives in the opposite world. It invites publishers to self-declare what matters most and, in its full-text variant, what the “truth” of that content is. There’s no consortium overseeing the format, no standardized schema to validate against, and no enforcement group to vet misuse. Anyone can publish one. Nobody has to respect it. And no major LLM provider today is known to consume it in production. Maybe they are, privately, but publicly, no announcements about adoption.

What Would Need To Change For Trust To Build

To shift from optional neat-idea to actual trusted signal, several conditions must be met, and each of these entails a cost in either dollars or human time, so again, dollars.

  • First, manifest verification. A signature or DNS-based verification could tie an llms.txt file to site ownership, reducing spoof risk. (cost to website)
  • Second, cross-checking. Platforms should validate that URLs listed correspond to live, public pages, and identify mismatch or cloaking via automated checks. (cost to engine/platform)
  • Third, transparency and logging. Public registries of manifests and logs of updates would make dramatic changes visible and allow community auditing. (cost to someone)
  • Fourth, measurement of benefit. Platforms need empirical evidence that ingesting llms.txt leads to meaningful improvements in answer correctness, citation accuracy, or brand representation. Until then, this is speculative. (cost to engine/platform)
  • Finally, abuse deterrence. Mechanisms must be built to detect and penalize spammy or manipulative manifest usage. Without that, spam teams simply assume negative benefit. (cost to engine/platform)

Until those elements are in place, platforms will treat llms.txt as optional at best or irrelevant at worst. So maybe you get a small benefit? Or maybe not…

The Real Value Today

For site owners, llms.txt still may have some value, but not as a guaranteed path to traffic or “AI ranking.” It can function as a content alignment tool, guiding internal teams to identify priority URLs you want AI systems to see. For documentation-heavy sites, internal agent systems, or partner tools that you control, it may make sense to publish a manifest and experiment.

However, if your goal is to influence large public LLM-powered results (such as those by Google, OpenAI, or Perplexity), you should tread cautiously. There is no public evidence those systems honor llms.txt yet. In other words: Treat llms.txt as a “mirror” of your content strategy, not a “magnet” pulling traffic. Of course, this means building the file(s) and maintaining them, so factor in the added work v. whatever return you believe you will receive.

Closing Thoughts

The web keeps trying to teach machines about itself. Each generation invents a new format, a new way to declare “here’s what matters.” And each time the same question decides its fate: “Can this signal be trusted?” With llms.txt, the idea is sound, but the trust mechanisms aren’t yet baked in. Until verification, governance, and empirical proof arrive, llms.txt will reside in the grey zone between promise and problem.

This post was originally published on Duane Forrester Decodes.


Featured Image: Roman Samborskyi/Shutterstock

Data Shows How AI Overviews Is Ranking Shopping Keywords via @sejournal, @martinibuster

BrightEdge’s latest research shows that Google’s AI Overviews are now appearing in ways that reflect what BrightEdge describes as “deliberate, aggressive choices” about where AI shows up and where it does not. These trends show marketers where AI search is showing up within the buyer’s journey and what businesses should expect.

The data indicates that Google is concentrating AI in parts of the shopping process where it gives clear informational value, particularly during research and evaluation. This aligns AI Overviews with the points in the shopping journey where users need help comparing options or understanding product details.

BrightEdge reports that Google retained only about 30 percent of the AI Overview keywords that appeared at the peak of its September 1 through October 15, 2025 research window. The retained queries also tended to have higher search volume than the removed ones, which BrightEdge notes is the opposite pattern observed in 2024. This fits with the higher retention in categories where shoppers look for explanations, comparisons, and instructional information.

BrightEdge explains:

“The numbers paint an interesting story: Google retained only 30% of its peak AI Overview keywords. But here’s what makes 2025 fundamentally different: those retained keywords have HIGHER search volume than removed ones—the complete opposite of 2024. Google isn’t just pulling back; it’s being strategic about which searches deserve AI guidance.”

The shifting behavior of AI Overviews shows how actively Google is tuning its system. BrightEdge observed a spike from 9 percent to 26 percent coverage on September 18 before returning to 9 percent soon after. This change signals ongoing testing. The year-over-year overlap of AI Overview keywords is only 18 percent, which BrightEdge calls a “massive reshuffling” that shows “active experimentation” and requires marketers to plan for change rather than stability. The volatility shows Google may be experimenting or responding to user trends and that the queries shown in AI Overviews can change over time.

My opinion is that Google is likely responding to user trends, testing how users respond to AI Overviews, then using the data to show more if user reactions are positive.

AI Is A Comparison And Evaluation Layer

BrightEdge’s research indicates that AI Overviews aligns with shopper intent. Google places AI in research queries such as “best TV for gaming,” continues support for evaluation queries like “Samsung vs LG,” and then withdraws when users show purchase intent with searches like “Samsung S95C price.”

These examples show that AI serves as an educational and comparison layer, not a transactional one. When a shopper reaches a buying decision, Google steps back and lets traditional results handle the final step. This apparent alignment with comparison and evaluation means Google is confident in using AI Overviews as a part of the shopping journey.

Usefulness Varies Across Categories

The data shows that AI’s usefulness varies across categories, and Google adjusts AIO keyword retention based on these needs. Categories that retained AI Overviews, such as Grocery, TV and Home Theater, and Small Appliances, share a pattern.

Users rely on comparison, explanation, and instruction during their decisions. In contrast, categories with low retention, like Furniture and Home, rely on visual browsing rather than text-based evaluation. This limits the value of AI. Google’s category patterns show that AI appears more often in categories where text-based information (such as comparison, explanation, and instruction) guides decisions.

Google’s keyword filtering clarifies how AI fits into the shopping journey. Among retained queries, a little more than a quarter are evaluation or comparison searches, including “best [product]” and “X vs Y” terms. These are queries where users need background and guidance. In contrast, Google removes bottom-funnel keywords. Price, buy, deals, and specific product names are removed. This shows Google’s focus is on how useful AI serves for each intent. AI educates and guides but does not handle the final purchase step.

Shopping Trends Influence AI Appearance

The shopping calendar shapes how AI appears in search results. BrightEdge describes the typical shopping journey as consisting of research in November, evaluation and comparison in early December, and buying in late December. AI helps shoppers understand options in November, assists with comparisons in early December, and by late December, AI tends to be less influential and traditional search results tend to complete the sale.

This makes November the key moment for making evaluation and comparison content easier for AI to cite. Once December arrives, the chance for AI-driven discovery shrinks because consumers have moved on to the final leg of their shopping journey, purchase.

These findings mean that brands should align their content strategies with the points in the journey where AI Overviews are active. BrightEdge advises identifying evaluation and transactional pages, ensuring that comparison content is indexed early, and watching category-specific retention patterns. The data indicates two areas where brands can focus their efforts. One is supporting AI during research and review stages. The other is improving organic search visibility for purchasing queries. The 18 percent year-over-year consistency figure also shows that flexibility is needed because the queries shown in AI Overviews change frequently.

Although the behavior of AI Overviews may seem volatile, BrightEdge’s research suggests that the changes follow a consistent pattern. AI surfaces when people are learning and evaluating and withdraws when users shift into buying. Categories that require explanations or comparisons see the highest retention in AI Overviews, and November remains the key period when AI can use that content. The overall pattern gives brands a clearer view of how AI fits into the shopping journey and how user intent shapes where AI shows up.

Read BrightEdge’s report:
Google AI Overview Holiday Shopping Test: The 57% Pullback That Changes Everything

Featured Image by Shutterstock/Misselss

Why WordPress 6.9 Abilities API Is Consequential And Far-Reaching via @sejournal, @martinibuster

WordPress 6.9, scheduled for release on December 2, 2025, is shipping with a new Abilities API, a system designed to make advanced AI-driven functionality possible for themes and plugins. The Abilities API will standardize how plugins, themes, and core describe what they can do in a format that humans and machines can understand.

This positions WordPress sites to be understood and used more reliably by AI agents and automation tools, since the Abilities API provides the structured information those systems need to interact with site functionality in a predictable way.

The Abilities API is designed to address a long-standing issue in WordPress: functionality has been scattered across custom functions, AJAX handlers, and plugin-specific implementations. According to WordPress, the purpose of the API is to provide a common way for WordPress core, plugins, and themes to describe what they can do in a standardized, machine-readable form.

This approach enables discoverability, clear validation, and predictable execution wherever an ability originates. By centralizing how capabilities are described and exposed, the Abilities API brings together functionality that might otherwise be scattered across different implementations.

What An Ability Is

The announcement defines an “ability” as a self-contained unit of functionality that includes its inputs, outputs, permissions, and execution logic. This structure allows abilities to be managed as separate pieces of functionality rather than fragments buried in theme or plugin code. WordPress explains that registering abilities through the API lets developers define permission checks, execution callbacks, and validation requirements, ensuring predictable behavior wherever the ability is used. By replacing isolated functions with defined units, WordPress creates a clearer and more open system for interacting with its features.

What Developers Gain From Abilities API

Developers gain several advantages by registering functionality as abilities. According to the announcement, abilities become discoverable through standardized interfaces, which means they can be queried, listed, and inspected across different contexts. Developers can organize them into categories, validate inputs and outputs, and apply permission rules that define who or what can execute them. The announcement notes that one benefit is automatic exposure through REST API endpoints under the wp-abilities/v1 namespace. This setup shifts WordPress from custom-coded actions to a system where functionality is defined in a consistent and reachable way.
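To illustrate what that discoverability could look like from outside WordPress, here is a rough Python sketch that queries a site’s REST API for registered abilities. The announcement confirms the wp-abilities/v1 namespace, but the exact routes and response fields are not spelled out there, so the /abilities listing path and the name and description fields below are assumptions for illustration only.

import requests

# Hypothetical example: list abilities exposed under the wp-abilities/v1 namespace.
# The namespace is confirmed by WordPress; the "/abilities" route and the response
# fields used below are assumptions, not documented endpoints.
SITE = "https://example.com"

response = requests.get(f"{SITE}/wp-json/wp-abilities/v1/abilities", timeout=10)
response.raise_for_status()

for ability in response.json():
    # Assumed fields: "name" (namespace/ability-name) and "description".
    print(ability.get("name"), "-", ability.get("description"))

In practice, an AI agent or automation tool would use a listing like this to discover what a site can do before deciding which ability to invoke.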

Abilities Best Practices

One of the frustrating pain points for WordPress users is when a plugin or theme conflicts with another one. This happens for a variety of reasons, but in the case of the Abilities API, WordPress has created a set of rules that should help prevent conflicts and errors.

WordPress explains the practices:

Ability names should follow these practices:

  • Use namespaced names to prevent conflicts (e.g., my-plugin/my-ability)
  • Use only lowercase alphanumeric characters, dashes, and forward slashes
  • Use descriptive, action-oriented names (e.g., process-payment, generate-report)
  • The format should be namespace/ability-name

Abilities API

The Abilities API introduces three components that work together to provide a complete system for registering and interacting with abilities.

1. The first is a PHP API for registering, managing, and executing abilities.

2. The second is automatic REST API exposure, which ensures that abilities can be accessed through endpoints without extra developer effort.

3. The third is a set of new hooks that help developers integrate with the system. These components, according to the announcement, bring consistency to how abilities are described and executed, forming a base described in the announcement as a consistent way to register and execute abilities.

The Abilities API is guided by several design goals that help it function as a long-term foundation.

Discoverability
Discoverability is a central goal, allowing every ability to be listed, queried, and inspected.

Interoperability
Interoperability is also emphasized, as the uniform schema lets different parts of WordPress create workflows together.

Security
Security is part of the new API by design, with permission checks defining who and what can invoke abilities.

Part Of The AI Building Blocks Initiative

The Abilities API is not an isolated change but part of the AI Building Blocks initiative meant to prepare WordPress for AI-driven workflows. The announcement explains that this system provides the base for AI agents, automation tools, and developers to interact with WordPress in a predictable way.
Abilities are machine-readable and exposed in the same manner across PHP, REST, and planned interfaces, and the announcement describes them as usable across those contexts. The Abilities API provides the metadata that AI agents and automation tools can use to understand and work with WordPress functionality.

The introduction of the Abilities API in WordPress 6.9 potentially marks a major change in how functionality is organized, described, and accessed across the platform. By creating a standardized way to define abilities and expose them in different contexts, WordPress positions itself at the forefront of future AI innovations for years to come. This is a big and consequential update to WordPress that will arrive in a few short weeks.

Featured Image by Shutterstock/AntonKhrupinArt

OpenAI Releases GPT-5.1 With Improved Instruction Following via @sejournal, @MattGSouthern

OpenAI released GPT-5.1 Instant and GPT-5.1 Thinking with updates to conversational style and reasoning capabilities.

The updates begin rolling out today to paid users before expanding to free accounts.

OpenAI says this release addresses feedback from users who want AI that feels more natural to interact with, while also improving intelligence.

What’s New

GPT-5.1 Instant

GPT-5.1 Instant, ChatGPT’s most-used model, now defaults to a warmer, more conversational tone.

OpenAI reports improved instruction following, with the model more reliably answering the specific question asked rather than drifting into tangents.

GPT-5.1 Instant can use adaptive reasoning. The model decides when to think before responding to challenging questions, producing more thorough answers while maintaining speed.

GPT-5.1 Thinking

The advanced reasoning model adapts thinking time more precisely. On a representative distribution of ChatGPT tasks, GPT-5.1 Thinking runs roughly twice as fast on the fastest tasks and spends roughly twice as long on the slowest tasks compared to GPT-5 Thinking.

Responses use less jargon and fewer undefined terms, which OpenAI says should make the most capable model more approachable for complex workplace tasks and explaining technical concepts.

Customization Options

OpenAI refined personality presets to better reflect common usage patterns. Default, Friendly (formerly Listener), and Efficient (formerly Robot) remain with updates, and new options include Professional, Candid, and Quirky.

These presets apply across all models. The original Cynical (formerly Cynic) and Nerdy (formerly Nerd) options remain available under the same personalization menu.

Beyond presets, OpenAI is experimenting with controls that let you tune specific characteristics such as response conciseness, warmth, scannability, and emoji frequency from personalization settings.

Personalization changes now take effect across all chats immediately, including ongoing conversations. Previously, changes only applied to new conversations started afterward.

The updated GPT-5.1 models also adhere more closely to custom instructions, giving you more precise tone and behavior control.

Rollout Timeline

GPT-5.1 Instant and Thinking begin rolling out today starting with paid subscribers. Free and logged-out users will get access afterward.

Enterprise and Education customers get a seven-day early access toggle to GPT-5.1 (off by default). After that window, GPT-5.1 becomes the default ChatGPT model.

GPT-5 (Instant and Thinking) remains available in the legacy models dropdown for paid subscribers for three months, giving people time to compare and adapt.

Why This Matters

GPT-5.1 can change how your day-to-day workflows behave. Better instruction following means less prompt tweaking and fewer off-brief outputs.

Adaptive reasoning may make simple tasks feel faster while giving more complex work, like technical explanations or data analysis, extra time.

Looking Ahead

OpenAI frames this update as a step toward personalized AI that adapts to individual preferences and tasks.

Updated personality styles and tone options roll out today. Granular characteristic tuning will roll out later this week as an experiment to a limited number of users, with further changes based on feedback.


Featured Image: Photo Agency/Shutterstock

OpenAI’s Sam Altman Says Personalized AI Raises Privacy Concerns via @sejournal, @martinibuster

In a recent interview with Stanford University, OpenAI’s CEO Sam Altman predicted that AI security will become the defining problem of the next phase of AI development, saying that AI security is one of the best fields to study right now. He also cited personalized AI as one example of a security concern that he’s been thinking about lately.

What Does AI Security Mean Today?

Sam Altman said that concerns about AI safety will be reframed as AI security issues that can be solved by AI.

Interview host Dan Boneh asked:

“So what does it mean for an AI system to be secure? What does it mean for even trying to kind of make it do things it wasn’t designed to do?

How do we protect AI systems from prompt injections and other attacks like that? How do you think of AI security?

I guess the concrete question I want to ask is, among all the different things we can do with AI, this course is about learning one sliver of the field. Is this a good area? Should people go into this?”

Sam Altman encouraged today’s students to study AI security.

He answered:

“I think this is one of the best areas to go study. I think we are soon heading into a world where a lot of the AI safety problems that people have traditionally talked about are going to be recast as AI security problems in different ways.

I also think that given how capable these models are getting, if we want to be able to deploy them for wide use, the security problems are going to get really big. You mentioned many areas that I think are super important to figure out. Adversary robustness in particular seems like it’s getting quite serious.”

What Altman means is that people are starting to find ways to trick AI systems, and the problem is becoming serious enough that researchers and engineers need to focus on making AI resistant to manipulation and other kinds of attacks, such as prompt injections.

AI Personalization Becoming A Security Concern

Altman also said that something he’s been thinking a lot about lately is possible security issues with AI personalization. People appreciate personalized responses from AI, he said, but this could open the door to malicious hackers figuring out how to steal (exfiltrate) sensitive data.

He explained:

“One more that I will mention that you touched on a little bit, but just it’s been on my mind a lot recently. There are two things that people really love right now that taken together are a real security challenge.

Number one, people love how personalized these models are getting. So ChatGPT now really gets to know you. It personalizes over your conversational history, your data you’ve connected to it, whatever else.

And then number two is you can connect these models to other services. They can go off and like call things on the web and, you know, do stuff for you that’s helpful.

But what you really don’t want is someone to be able to exfiltrate data from your personal model that knows everything about you.

And humans, you can kind of trust to be reasonable at this. If you tell your spouse a bunch of secrets, you can sort of trust that they will know in what context what to tell to other people. The models don’t really do this very well yet.

And so if you’re telling like a model all about your, you know, private health care issues, and then it is off, and you have it like buying something for you, you don’t want that e-commerce site to know about all of your health issues or whatever.

But this is a very interesting security problem to solve this with like 100% robustness.”

Altman identifies personalization as both a breakthrough and a new opening for cyber attack. The same qualities that make AI more useful also make it a target, since models that learn from individual histories could be manipulated to reveal them. Altman shows how convenience can become a source of exposure, explaining that privacy and usability are now security challenges.

Lastly, Altman circled back to AI as both the security problem and the solution.

He concluded:

“Yeah, by the way, it works both directions. Like you can use it to secure systems. I think it’s going to be a big deal for cyber attacks at various times.”

Takeaways

  • AI Security As The Next Phase Of AI Development
    Altman predicts that AI security will replace AI safety as the central challenge and opportunity in artificial intelligence.
  • Personalization As A New Attack Surface
    The growing trend of AI systems that learn from user data raises new security concerns, since personalization could expose opportunities for attackers to extract sensitive information.
  • Dual Role Of AI In Cybersecurity
    Altman emphasizes that AI will both pose new security threats and serve as a powerful tool to detect and prevent them.
  • Emerging Need For AI Security Expertise
    Altman’s comments suggest that there will be a rising demand for professionals who understand how to secure, test, and deploy AI responsibly.

Is AI Search SEO Leaving Bigger Opportunities Behind? via @sejournal, @martinibuster

A recent podcast by Ahrefs raised two issues about optimizing for AI search that can cause organizations to underperform and miss out on opportunities to improve sales. The conversation illustrates a gap between realistic expectations for AI-based trends and what can be achieved through overlooked opportunities elsewhere.

YouTube Is Second Largest Search Engine

The first thing noted in the podcast is that YouTube is the second-largest search engine by queries entered in the search bar. More people type search queries into YouTube’s search bar than any other search engine except Google itself. So it absolutely makes sense for companies to seriously consider how a video strategy can work to increase traffic and brand awareness.

It should be a no-brainer that businesses figure out YouTube, and yet many businesses are rushing to spend time and money optimizing for answer engines like Perplexity and ChatGPT, which have a fraction of the traffic of YouTube.

Patrick Stox explained:

“YouTube is the second largest search engine. There’s a lot of focus on all these AI assistants. They’re in total driving less than 1% of your traffic. YouTube might be a lot more. I don’t know how much it’s going to drive traffic to the website, but there’s a lot of eyes on it. I know for us, like we see it in our signups, …they sign up for Ahrefs.

It’s an incredible channel that I think as people need to diversify, to kind of hedge their bets on where their traffic is coming from, this would be my first choice. Like go and do more video. There’s your action item. If you’re not doing it, go do more video right now.”

Tim Soulo, Ahrefs CMO, expressed curiosity that so many people are looking two or three years ahead for opportunities that may or may not materialize on AI assistants, while overlooking the real benefits available today on YouTube.

He commented:

“I feel that a lot of people get fixated on AI assistants like ChatGPT and Perplexity and optimizing for AI search because they are kind of looking three, five years ahead and they are kind of projecting that in three, five years, that might be the dominant thing, how people search.

…But again, if we focus on today, YouTube is much more popular than ChatGPT and YouTube has a lot more business potential than ChatGPT. So yeah, definitely you have to invest in AI search. You have to do the groundwork that would help you rank in Google, rank in ChatGPT and everything. …I don’t see YouTube losing its relevance five years from now. I can only see it getting bigger and bigger because the new generation of people that is growing up right now, they are very video oriented. Short form video, long form video. So yeah, definitely. If you’re putting all your eggs in the basket of ChatGPT, but not putting anything in YouTube, that’s a big mistake.”

Patrick Stox agreed with Tim, noting that Instagram and TikTok are big for short-form videos that are wildly popular today, and encouraged viewers and listeners to see how video can fit into their marketing.

Some of the disconnect regarding SEO and YouTube is that SEOs may feel that SEO is about Google, and YouTube is therefore not their domain of responsibility. I would counter that YouTube should be a part of SEOs’ concern because people use it for reviews, how-to information, and product research, and the searches on YouTube are second only to Google.

SEO/AEO/GEO Can’t Solve All AI Search Issues

The second topic they touched on was the expectations placed on SEO to solve all of a business’s traffic and visibility problems. Patrick Stox and Tim Soulo suggested that high rankings and a satisfactory marketing outcome begin and end with a high-quality product, service, and content. Problems at the product or service end cause friction and result in negative sentiment on social media. This isn’t something that you can SEO yourself out of.

Patrick Stox explained:

“We only have a certain amount of control, though. We can go and create a bunch of pages, a bunch of content. But if you have real issues, like if everyone suddenly is like Nvidia’s graphics cards suck and they’re saying that on social media and Reddit and everything, YouTube, there’s only so much you can do to combat that.

…And there might be tens of thousands of them and there’s one of me. So what am I gonna do? I’m gonna be a drop in the bucket. It’s gonna be noise in the void. The internet is still the one controlling the narrative. So there’s only so much that SEOs are gonna be able to do in a situation like that.

…So this is going to get contentious in a lot of organizations where you’re going to have to do something that the execs are going to be yelling, can’t you just change that, make it go away?”

Tim and Patrick went on to use the example of their experience with a pricing change they made a few years ago, where customers balked at the changes. Ahrefs made the change because they thought it would make their service more affordable, but despite their best efforts to answer user questions and get control of the conversation, the controversy wouldn’t go away, so they ultimately decided to give users what they wanted.

The point is that positive word of mouth isn’t necessarily an SEO issue, even though SEO/GEO/AEO practitioners are now expected to get out there and build positive brand associations so that brands are recommended by AI Mode, ChatGPT, and Perplexity.

Takeaways

  • Find balance between AI search and immediate business opportunities:
    Some organizations may focus too heavily on optimizing for AI assistants at the expense of video and multimodal search opportunities.
  • YouTube’s marketing power:
    YouTube is the second-largest search engine and a major opportunity for traffic and brand visibility.
  • Realistic expectations for SEO:
    SEO/GEO/AEO cannot fix problems rooted in poor products, services, or customer sentiment. Long-term visibility in AI search depends not just on optimization, but on maintaining positive brand sentiment.

Watch the video at about the 36 minute mark:

Featured Image by Shutterstock/Collagery

How To Cultivate Brand Mentions For Higher AI Search Rankings via @sejournal, @martinibuster

Building brand awareness has long been an important but widely overlooked part of SEO. AI search has brought this activity to the forefront. The following ideas should help you form a strategy for earning brand-name mentions at scale, with the goal of achieving similar ubiquity in AI search results.

Tell People About The Site

SEOs and businesses can become overly concerned with getting links and forget that the more important thing is to get the word out about a website. A website must have unique qualities that positively impress people and make them enthusiastic about the brand. If the site you’re trying to build traffic to lacks those unique qualities, then building links or brand awareness can become a futile activity.

User behavior signals have been a part of Google’s algorithms since Navboost began influencing results around 2004, and the recent Google antitrust lawsuit shows that they continue to play a role. What has changed is that SEOs have noticed that AI search results tend to recommend sites that are recommended by other sites: brand mentions.

The key to all of this has been to tell other sites about your site and make it clear to potential consumers or website visitors what makes your site special.

  • So the first task is always to make a site special in every possible way.
  • The second task is to tell others about the site in order to build word of mouth and top-of-mind brand presence.

Optimizing a website for users and cultivating awareness of that site are the building blocks of the external signals of authoritativeness, expertise, and popularity that Google always talks about.

Downside of Backlink Searches

Everyone knows how to do a backlink search with third-party tools, but a lot of the data consists of garbage-y sites; that’s not the tool’s fault, it’s just the state of the Internet. In any case, a backlink search is limited: it doesn’t surface the conversations real people are having about a website.

In my experience, a better way to do it is to identify all instances of where a site is linked from another site or discussed by another site.

Brand And Link Mentions

Some websites have bookmark and resource pages. These are low-hanging fruit.

Search for a competitor’s links:

example.com site:.com “bookmarks” -site:example.com

example.com site:.com “resources” -site:example.com

The “-site:example.com” removes the competitor site from the search results, showing you just the sites that mention the full URL, whether or not it is actually linked.

The TLD segmented variants are:

example.com site:.net "resources" 
example.com site:.org "resources" 
example.com site:.edu "resources" 
example.com site:.ai "resources" 
example.com site:.net "links" 
example.com site:.org "links" 
example.com site:.edu "links" 
example.com site:.ai "links" 
Etc.

The goal is not necessarily to get links. It’s to build awareness of the site and build popularity.

Brand Mentions By Company Name

One way to identify brand mentions is to search by company name using the TLD segmentation technique. Making a broad search for a company’s name will only get you some of the brand mentions. Segmenting the search by TLD will reveal a wider range of sites.

Segmented Brand Mention Search

The following assumes that the competitor’s site is on the .com domain and you’re limiting the search to .com websites.

Competitor's Brand Name site:.com -site:example.com

Segmented Variants:

Competitor's Brand Name site:.org
Competitor's Brand Name site:.edu
Competitor's Brand Name site:reddit.com
Competitor's Brand Name site:.io
etc.
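If you plan to run many of these segmented searches, it can help to generate the query strings programmatically and work through them one at a time. The short Python sketch below builds TLD-segmented variants for a brand name; the brand, the excluded domain, and the TLD list are placeholder values, and the same pattern can be reused for the “sponsored post” and filetype:pdf searches covered later in this guide.

# Illustrative helper: build TLD-segmented Google queries for brand mention research.
# "Acme Widgets", "acmewidgets.com", and the TLD list are placeholder values.
brand = '"Acme Widgets"'
own_site = "acmewidgets.com"  # the competitor's (or your own) domain to exclude
tlds = [".com", ".net", ".org", ".edu", ".io"]
extra = ""  # e.g. '"sponsored post"' or 'filetype:pdf newsletter'

for tld in tlds:
    query = f"{brand} site:{tld} {extra}".strip()
    if tld == ".com":
        # Exclude the domain itself from the .com results, as in the examples above.
        query += f" -site:{own_site}"
    print(query)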

Sponsored Articles

Sponsored articles are indexed by search engines and ranked in AI search surfaces like AI Mode and ChatGPT. These can present opportunities to purchase a sponsored post that enables you to present your message with links that are nofollow and a prominent “sponsored post” disclaimer at the top of the web page – all in compliance with Google and FTC guidelines.

Brand Mentions: Authoritativeness Is Key

The thing that some SEOs never learned is that authoritativeness is important, and quite likely millions of dollars have been wasted on paying for links from low-quality blogs and higher-quality sites.

ChatGPT and AI Mode have been found to recommend sites that are mentioned on high-quality, authoritative sites. Do not waste time or money paying for mentions on low-quality sites.

Some Ways To Search

Product/Service/Solution Search

Name Of Product Or Service Or Problem Needing Solving site:.com “sponsored article”
Name Of Product Or Service Or Problem Needing Solving site:.net “sponsored article”
Name Of Product Or Service Or Problem Needing Solving site:.org “sponsored article”
Name Of Product Or Service Or Problem Needing Solving site:.edu “sponsored article”
Name Of Product Or Service Or Problem Needing Solving site:.io “sponsored article”
etc.

Sponsored Post Variant

Name Of Product Or Service Or Problem Needing Solving site:.com “sponsored post”
Name Of Product Or Service Or Problem Needing Solving site:.net “sponsored post”
Name Of Product Or Service Or Problem Needing Solving site:.org “sponsored post”
Name Of Product Or Service Or Problem Needing Solving site:.edu “sponsored post”
Name Of Product Or Service Or Problem Needing Solving site:.io “sponsored post”
etc.

Key insight: Test whether “sponsored post” or “sponsored article” provides better results or just more results. Using quotation marks, or if necessary the verbatim search tool, will stop Google from stemming the search results and prevents it from showing a mix of both “post” and “article” results. By forcing Google to be specific, you’re forcing Google to show more search results.

Competitor Search

Competitor’s Brand Name site:.com “sponsored post”
Competitor’s Brand Name site:.net “sponsored post”
Competitor’s Brand Name site:.org “sponsored post”
Competitor’s Brand Name site:.edu “sponsored post”
Competitor’s Brand Name site:.io “sponsored post”
etc.

Pure Awareness Building With Zero Internet Presence

This method of getting the word out is pure gold, especially for B2B, but also for professional businesses such as those in the legal niches. There are organizations and associations that print magazines or send out newsletters to thousands, sometimes tens of thousands, of people who are an exact match for the audience you want to build top-of-mind brand-name recognition with.

Emails and magazines do not have links and that’s okay. The goal is to build name brand recognition with positive associations. What better way than getting interviewed in a newsletter or magazine? What better way than submitting an article to a newsletter or magazine?

Don’t Forget PDF Magazines

Not all magazines are print, many magazines are in the form of a PDF. For example, I subscribe to a surf fishing magazine that is entirely in a proprietary web format that can only be viewed by subscribers. If I were a fishing company, I would make an effort to meet some of article authors, in addition to the publishers, at fishing industry conferences where they appear as presenters and in product booths.

This kind of outreach is in-person; it’s called relationship building.

Getting back to the industry organizations and associations, this is an entire topic in itself and I’ll follow up with another article, but many of the techniques covered in this guide will work with this kind of brand building.

Using the filetype search operator in combination with the TLD segmentation will yield some of these kinds of brand building opportunities.

[product/service/keyword/niche] filetype:pdf site:.com newsletter
[product/service/keyword/niche] filetype:pdf site:.org newsletter

1. Segment the search for opportunities by TLD: .net/.com/.org/.us/.edu, etc.
Segmenting by TLD will help you discover different kinds of brand building opportunities. Websites on a Dot Org domain often link to a site for different reasons than a Dot Com website. Dot Org domains represent article writing projects, free links on a links page, newsletter article opportunities, and charity link opportunities, just to name a few.

2. Consider Segmenting Dot Com Searches
The Dot Com TLD yields an overabundance of search results, not all of them useful. This makes it imperative to segment the results to find all available opportunities.

Ways to segment the Dot Com are by:

  • A. Kinds of sites (blog/shopping related keywords/product or service keywords/forum/etc.)
    This is pretty straightforward. If you’re looking for brand mentions, be sure to add keywords to the searches that are directly relevant to what your business is about. If your site is about car injuries, then sites about cars, as well as specific makes, models, and kinds of automobiles, are how you would segment a .com search.
  • B. Context – Audience Relevance Not Keyword Match
    Context of a sponsored article is important. This is not about whether the website content matches what your site, business, product, or service is about. What’s important is to identify whether the audience reach is an exact match to the audience that will be interested in your product, business, or service.
  • C. Quality And Authoritativeness
    This is not about third-party metrics related to links. This is just about making a common sense judgment about whether a site where you want a mention is well-regarded by those who are likely to be interested in your brand. That’s it.

Takeaway

The thing I want you to walk away with is that it’s useful to just tell people about a site and to get as many people as possible aware of it. Identify ways to get them to tell a friend. There is no better recommendation than the one you can get from a friend or from a trusted organization. This is the true source of authoritativeness and popularity.

Featured Image by Shutterstock/Bird stocker TH

Google AI Overviews Appear On 21% Of Searches: New Data via @sejournal, @MattGSouthern

Ahrefs analyzed 146 million search results to determine which query types trigger AI Overviews. The research tracked AIO appearance across 86 keyword characteristics.

Here’s a concise look at the patterns and how they may affect your strategy.

What The Analysis Found

AI Overviews appear on 20.5% of all keywords. Specific query types show notable variance, with some categories hitting 60% trigger rates while others stay below 2%.

Patterns Observed Across Query Types

Single-word queries activate AIOs only 9.5% of the time, whereas queries with seven or more words trigger them 46.4% of the time. This correlation indicates that Google primarily uses AIOs for complex informational searches rather than simple lookups.

The question format also shows a similar trend: question-based queries result in AIOs 57.9% of the time, while non-question queries have a much lower rate of 15.5%.

The most significant distinctions are seen based on intent. Informational queries make up 99.9% of all AIO appearances, while navigational queries trigger AIOs just 0.09%. Commercial queries account for 4.3%, and transactional queries for 2.1%.

Patterns Observed Across Industry Categories

Science queries have an AIO rate of 43.6%, while health queries are at 43.0%, and pets & animals reach 36.8%. People & society questions result in AIOs 35.3% of the time.

In contrast, commerce categories exhibit opposite trends. Shopping queries are associated with AIOs only 3.2% of the time, the lowest in the dataset. Real estate remains at 5.8%, sports at 14.8%, and news at 15.1%.

YMYL queries display unexpectedly high trigger rates. Medical YMYL searches trigger AI Overviews 44.1% of the time, financial YMYL hits 22.9%, and safety YMYL reaches 31.0%.

These findings contradict Google’s focus on expert content for topics that could impact health, financial security, or safety.

Queries With Low Presence Of AI Overviews

6.3% of “very newsy” keywords trigger AI Overviews, while 20.7% of non-news queries display AIOs.

The pattern indicates that Google deliberately limits AIOs for time-sensitive content where accuracy and freshness are essential.

Local searches demonstrate a similar trend, with only 7.9% of local queries showing AI Overviews compared to 22.8% for non-local queries.

NSFW content consistently avoids AIOs across categories: adult queries trigger AIOs 1.5% of the time, gambling 1.4%, and violence 7.7%. Drug-related queries have the highest NSFW trigger rate at 12.6%, yet this remains well below the baseline.

Brand vs. Non-Brand

Branded keywords show slight differences compared to non-branded ones. Non-branded queries trigger AIOs 24.9% of the time, whereas branded queries do so 13.1% of the time.

The data indicates that AIOs occur 1.9 times more frequently for generic searches than for brand-specific lookups.

No Correlation With CPC

CPC shows no meaningful correlation with AIO appearance. Keyword cost-per-click values don’t affect trigger rates across any price range tested, with rates hovering between 12.4% and 27.6% regardless of commercial value.

Why This Matters

Publishers focused on informational content encounter the greatest AIO exposure. Question-based content and how-to guides align closely with Google’s trigger criteria, putting educational content publishers at the highest risk of traffic loss.

Medical content has the highest category-specific AIO rate, despite concerns about AI accuracy in health advice.

Ecommerce and news publishers are relatively less affected by AIOs. The low trigger rates for shopping and news queries indicate these sectors experience less AI-driven traffic disruption compared to informational sites.

Looking Ahead

Using this data, publishers can review their current keyword portfolios to identify AIO exposure patterns. The most reliable indicators are query intent and length, with industry category and question format also playing significant roles.

AIO exposure varies considerably across different industry categories, with differences exceeding 40 percentage points between the highest and lowest. Content strategies need to consider this variation at the category level instead of assuming consistent baseline risk across all topics.
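
To make that portfolio review concrete, here is a minimal Python sketch of one way to flag keywords by likely AIO exposure, using the trigger-rate patterns reported above as rough thresholds. The intent labels, keyword list, and cutoffs are illustrative assumptions, not part of Ahrefs’ methodology.

```python
QUESTION_WORDS = {"how", "what", "why", "when", "where", "who", "which", "can", "does", "is", "are"}

def estimate_aio_exposure(keyword: str, intent: str) -> str:
    """Rough AIO exposure flag based on the trigger-rate patterns reported above:
    intent first, then question format and query length. Thresholds are illustrative."""
    words = keyword.lower().split()
    if not words:
        return "unknown"
    if intent in ("navigational", "transactional"):
        return "low"          # reported trigger rates of roughly 0.09% and 2.1%
    if intent == "commercial":
        return "low-medium"   # reported 4.3%
    # Informational queries account for nearly all AIO appearances.
    if words[0] in QUESTION_WORDS or keyword.strip().endswith("?"):
        return "high"         # question-format queries: ~57.9% trigger rate
    if len(words) >= 7:
        return "high"         # 7+ word queries: ~46.4% trigger rate
    if len(words) == 1:
        return "low-medium"   # single-word queries: ~9.5% trigger rate
    return "medium"

# Illustrative portfolio of (keyword, intent) pairs; review the "high" bucket first.
portfolio = [
    ("how to treat a sprained ankle at home", "informational"),
    ("running shoes", "commercial"),
    ("facebook login", "navigational"),
]
for kw, intent in portfolio:
    print(f"{estimate_aio_exposure(kw, intent):<10} {kw}")
```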

For a more in-depth examination of this data, see the full analysis.


Featured Image: Zorion Art Production/Shutterstock

A Step-By-Step AEO Guide For Growing AI Citations & Visibility via @sejournal, @fthead9

This post was sponsored by TAC Marketing. The opinions expressed in this article are the sponsor’s own.

After years of trying to understand the black box that is Google search, SEO professionals have a seemingly even more opaque challenge these days – how to earn AI citations.

While at first glance inclusion in AI answers seems even more of a mystery than traditional SEO, there is good news. Once you know how to look for them, the AI engines do provide clues to what they consider valuable content.

This article will give you a step-by-step guide to discovering the content that AI engines value and provide a blueprint for optimizing your website for AI citations.

Take A Systematic Approach To AI Engine Optimization

The key to building an effective AI search optimization strategy begins with understanding the behavior of AI crawlers. By analyzing how these bots interact with your site, you can identify what content resonates with AI systems and develop a data-driven approach to optimization.

While Google remains dominant, AI-powered search engines like ChatGPT, Perplexity, and Claude are increasingly becoming go-to resources for users seeking quick, authoritative answers. These platforms don’t just generate responses from thin air – they rely on crawled web content to train their models and provide real-time information.

This presents both an opportunity and a challenge. The opportunity lies in positioning your content to be discovered and referenced by these AI systems. The challenge is understanding how to optimize for algorithms that operate differently from traditional search engines.

The Answer Is A Systematic Approach

  • Discover what content AI engines value based on their crawler behavior.
    • Traditional log file analysis.
    • SEO Bulk Admin AI Crawler monitoring.
  • Reverse engineer prompting.
    • Content analysis.
    • Technical analysis.
  • Building the blueprint.

What Are AI Crawlers & How To Use Them To Your Advantage

AI crawlers are automated bots deployed by AI companies to systematically browse and ingest web content. Unlike traditional search engine crawlers that primarily focus on ranking signals, AI crawlers gather content to train language models and populate knowledge bases.

Major AI crawlers include:

  • GPTBot (OpenAI’s ChatGPT).
  • PerplexityBot (Perplexity AI).
  • ClaudeBot (Anthropic’s Claude).
  • Googlebot crawlers (Google AI).

These crawlers impact your content strategy in two critical ways:

  1. Training data collection.
  2. Real-time information retrieval.

Training Data Collection

AI models are trained on vast datasets of web content. Pages that are crawled frequently may have a higher representation in training data, potentially increasing the likelihood of your content being referenced in AI responses.

Real-Time Information Retrieval

Some AI systems crawl websites in real-time to provide current information in their responses. This means fresh, crawlable content can directly influence AI-generated answers.

When ChatGPT responds to a query, for instance, it’s synthesizing information gathered by its underlying AI crawlers. Similarly, Perplexity AI, known for its ability to cite sources, actively crawls and processes web content to provide its answers. Claude also relies on extensive data collection to generate its intelligent responses.

The presence and activity of these AI crawlers on your site directly impact your visibility within these new AI ecosystems. They determine whether your content is considered a source, if it’s used to answer user questions, and ultimately, if you gain attribution or traffic from AI-driven search experiences.

Understanding which pages AI crawlers visit most frequently gives you insight into what content AI systems find valuable. This data becomes the foundation for optimizing your entire content strategy.

How To Track AI Crawler Activity: Find & Use Log File Analysis

The Easy Way: We use SEO Bulk Admin to analyze server log files for us.

However, there’s a manual way to do it, as well.

Server log analysis remains the standard for understanding crawler behavior. Your server logs contain detailed records of every bot visit, including AI crawlers that may not appear in traditional analytics platforms, which focus on user visits.

Essential Tools For Log File Analysis

Several enterprise-level tools can help you parse and analyze log files:

  • Screaming Frog Log File Analyser: Excellent for technical SEOs comfortable with data manipulation.
  • Botify: Enterprise solution with robust crawler analysis features.
  • Semrush: Offers log file analysis within its broader SEO suite.
Screenshot from Screaming Frog Log File Analyser, October 2025

The Complexity Challenge With Log File Analysis

The most granular way to understand which bots are visiting your site, what they’re accessing, and how frequently, is through server log file analysis.

Your web server automatically records every request made to your site, including those from crawlers. By parsing these logs, you can identify specific user-agents associated with AI crawlers.

Here’s how you can approach it:

  1. Access Your Server Logs: Typically, these are found in your hosting control panel or directly on your server via SSH/FTP (e.g., Apache access logs, Nginx access logs).
  2. Identify AI User-Agents: You’ll need to know the specific user-agent strings used by AI crawlers. While these can change, common ones include:
    • OpenAI (for ChatGPT, e.g., `ChatGPT-User` or variations).
    • Perplexity AI (e.g., `PerplexityBot`).
    • Anthropic (for Claude, e.g., `ClaudeBot`, though its requests are sometimes less distinct or may appear under a general cloud provider’s user agent).
    • Other LLM-related bots (e.g., `Googlebot` and `Google-Extended` for Google’s AI initiatives, potentially `Vercelbot` or other cloud infrastructure bots that LLMs might use for data fetching).
  3. Parse And Analyze: This is where the previously mentioned log analyzer tools come into play. Upload your raw log files into the analyzer and filter the results to identify AI crawler and search bot activity. Alternatively, for those with technical expertise, Python scripts or tools like Splunk or Elasticsearch can be configured to parse logs, identify specific user-agents, and visualize the data. A minimal Python sketch of this step follows below.
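
If you prefer to script step 3 yourself rather than use a log analyzer, here is a minimal Python sketch, assuming a standard Apache/Nginx “combined” format access log saved locally as access.log. The user-agent substrings are examples only and change over time, so check each provider’s current documentation before relying on them.

```python
import re
from collections import Counter, defaultdict

# Example user-agent substrings to look for; verify these against each
# provider's current documentation, as they change over time.
AI_CRAWLERS = {
    "GPTBot": "GPTBot",
    "OAI-SearchBot": "OAI-SearchBot",
    "ChatGPT-User": "ChatGPT-User",
    "PerplexityBot": "PerplexityBot",
    "ClaudeBot": "ClaudeBot",
    "Google-Extended": "Google-Extended",
}

# In the "combined" log format, the request is the first quoted field and
# the user agent is the final quoted field on the line.
LINE_RE = re.compile(r'"(?P<request>[^"]*)"[^"]*"[^"]*"\s*"(?P<agent>[^"]*)"\s*$')

hits_by_bot = Counter()
hits_by_bot_and_path = defaultdict(Counter)

with open("access.log", encoding="utf-8", errors="replace") as log:
    for line in log:
        match = LINE_RE.search(line)
        if not match:
            continue
        agent = match.group("agent")
        request = match.group("request").split()  # e.g. "GET /page/ HTTP/1.1"
        path = request[1] if len(request) > 1 else "-"
        for bot, needle in AI_CRAWLERS.items():
            if needle.lower() in agent.lower():
                hits_by_bot[bot] += 1
                hits_by_bot_and_path[bot][path] += 1
                break

# Print a per-bot total plus each bot's ten most-requested paths.
for bot, total in hits_by_bot.most_common():
    print(f"{bot}: {total} requests")
    for path, count in hits_by_bot_and_path[bot].most_common(10):
        print(f"  {count:>5}  {path}")
```

The output gives you a per-crawler request count and the pages each bot requests most, which feeds directly into the content analysis later in this guide.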

While log file analysis provides the most comprehensive data, it comes with significant barriers for many SEOs:

  • Technical Depth: Requires server access, understanding of log formats, and data parsing skills.
  • Resource Intensive: Large sites generate massive log files that can be challenging to process.
  • Time Investment: Setting up proper analysis workflows takes considerable upfront effort.
  • Parsing Challenges: Distinguishing between different AI crawlers requires detailed user-agent knowledge.

For teams without dedicated technical resources, these barriers can make log file analysis impractical despite its value.

An Easier Way To Monitor AI Visits: SEO Bulk Admin

While log file analysis provides granular detail, its complexity can be a significant barrier for all but the most highly technical users. Fortunately, tools like SEO Bulk Admin can offer a streamlined alternative.

The SEO Bulk Admin WordPress plugin automatically tracks and reports AI crawler activity without requiring server log access or complex setup procedures. The tool provides:

  • Automated Detection: Recognizes major AI crawlers, including GPTBot, PerplexityBot, and ClaudeBot, without manual configuration.
  • User-Friendly Dashboard: Presents crawler data in an intuitive interface accessible to SEOs at all technical levels.
  • Real-Time Monitoring: Tracks AI bot visits as they happen, providing immediate insights into crawler behavior.
  • Page-Level Analysis: Shows which specific pages AI crawlers visit most frequently, enabling targeted optimization efforts.
Screenshot of SEO Bulk Admin AI/Bots Activity, October 2025

This gives SEOs instant visibility into which pages are being accessed by AI engines – without needing to parse server logs or write scripts.

Comparing SEO Bulk Admin Vs. Log File Analysis

  • Data Source: Log file analysis uses raw server logs; SEO Bulk Admin reports from the WordPress dashboard.
  • Technical Setup: High for log file analysis; low for SEO Bulk Admin.
  • Bot Identification: Manual with log file analysis; automatic with SEO Bulk Admin.
  • Crawl Tracking: Detailed with log file analysis; automated with SEO Bulk Admin.
  • Best For: Log file analysis suits enterprise SEO teams; SEO Bulk Admin suits content-focused SEOs and marketers.

For teams without direct access to server logs, SEO Bulk Admin offers a practical, real-time way to track AI bot activity and make data-informed optimization decisions.

Screenshot of SEO Bulk Admin Page Level Crawler Activity, October 2025

Using AI Crawler Data To Improve Content Strategy

Once you’re tracking AI crawler activity, the real optimization work begins. AI crawler data reveals patterns that can transform your content strategy from guesswork into data-driven decision-making.

Here’s how to harness those insights:

1. Identify AI-Favored Content

  • High-frequency pages: Look for pages that AI crawlers visit most frequently. These are the pieces of content that these bots are consistently accessing, likely because they find them relevant, authoritative, or frequently updated on topics their users ask about (a short sketch of how to surface these pages from your crawl data follows this list).
  • Specific content types: Are your “how-to” guides, definition pages, research summaries, or FAQ sections getting disproportionate AI crawler attention? This can reveal the type of information AI models are most hungry for.
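
As a rough illustration of how to surface those high-frequency pages, the sketch below builds on the hits_by_bot_and_path dictionary from the earlier log-parsing example. The content-type URL patterns are placeholders you would replace with your own site’s structure.

```python
from collections import Counter

def top_ai_crawled_pages(hits_by_bot_and_path, limit=20):
    """Combine per-bot path counts (from the log-parsing sketch above)
    into a single ranking of the pages AI crawlers request most."""
    combined = Counter()
    for per_path in hits_by_bot_and_path.values():
        combined.update(per_path)
    return combined.most_common(limit)

# Illustrative content-type buckets; adjust the URL patterns to your own site.
CONTENT_TYPES = {
    "/how-to": "how-to guide",
    "/glossary/": "definition page",
    "/faq": "FAQ",
    "/research/": "research summary",
}

def label_content_type(path):
    """Tag a path with a rough content type so patterns are easier to spot."""
    for pattern, label in CONTENT_TYPES.items():
        if pattern in path:
            return label
    return "other"

for path, count in top_ai_crawled_pages(hits_by_bot_and_path):
    print(f"{count:>5}  {label_content_type(path):<18}  {path}")
```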

2. Spot LLM-Favored Content Patterns

  • Structured data relevance: Are the highly-crawled pages also rich in structured data (Schema markup)? It’s an open debate, but some speculate that AI models often leverage structured data to extract information more efficiently and accurately.
  • Clarity and conciseness: AI models excel at processing clear, unambiguous language. Content that performs well with AI crawlers often features direct answers, brief paragraphs, and strong topic segmentation.
  • Authority and citations: Content that AI models deem reliable may be heavily cited or backed by credible sources. Track if your more authoritative pages are also attracting more AI bot visits.

3. Create A Blueprint From High-Performing Content

  • Reverse engineer success: For your top AI-crawled content, document its characteristics.
    • Content structure: Headings, subheadings, bullet points, numbered lists.
    • Content format: Text-heavy, mixed media, interactive elements.
    • Topical depth: Comprehensive vs. niche.
    • Keywords/Entities: Specific terms and entities frequently mentioned.
    • Structured data implementation: What schema types are used?
    • Internal linking patterns: How is this content connected to other relevant pages?
  • Upgrade underperformers: Apply these successful attributes to content that currently receives less AI crawler attention.
    • Refine content structure: Break down dense paragraphs, add more headings, and use bullet points for lists.
    • Inject structured data: Implement relevant Schema markup (e.g., `Q&A`, `HowTo`, `Article`, `FactCheck`) on pages lacking it (a minimal example follows this list).
    • Enhance clarity: Rewrite sections to achieve conciseness and directness, focusing on clearly answering potential user questions.
    • Expand authority: Add references, link to authoritative sources, or update content with the latest insights.
    • Improve internal linking: Ensure that relevant underperforming pages are linked from your AI-favored content and vice versa, signaling topical clusters.
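
As one concrete example of the “inject structured data” step, here is a minimal Python sketch that generates FAQPage JSON-LD from placeholder question-and-answer pairs. FAQPage is just one option; the right schema type depends on the page, and the questions and answers shown are illustrative only.

```python
import json

def build_faq_jsonld(qa_pairs):
    """Build FAQPage JSON-LD from (question, answer) pairs. The printed output
    would go inside a <script type="application/ld+json"> tag on the page."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in qa_pairs
        ],
    }

# Placeholder content for illustration only.
faq = build_faq_jsonld([
    ("What are AI crawlers?",
     "Automated bots that AI companies use to browse and ingest web content."),
])
print(json.dumps(faq, indent=2))
```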

This short video walks you through the process of discovering what pages are crawled most often by AI crawlers and how to use that information to start your optimization strategy.

Here is the prompt used in the video:

You are an expert in AI-driven SEO and search engine crawling behavior analysis.

TASK: Analyze and explain why the URL [https://fioney.com/paying-taxes-with-a-credit-card-pros-cons-and-considerations/] was crawled 5 times in the last 30 days by the oai-searchbot(at)openai.com crawler, while [https://fioney.com/discover-bank-review/] was only crawled twice.

GOALS:

– Diagnose technical SEO factors that could increase crawl frequency (e.g., internal linking, freshness signals, sitemap priority, structured data, etc.)

– Compare content-level signals such as topical authority, link magnet potential, or alignment with LLM citation needs

– Evaluate how each page performs as a potential citation source (e.g., specificity, factual utility, unique insights)

– Identify which ranking and visibility signals may influence crawl prioritization by AI indexing engines like OpenAI’s

CONSTRAINTS:

– Do not guess user behavior; focus on algorithmic and content signals only

– Use bullet points or comparison table format

– No generic SEO advice; tailor output specifically to the URLs provided

– Consider recent LLM citation trends and helpful content system priorities

FORMAT:

– Part 1: Technical SEO comparison

– Part 2: Content-level comparison for AI citation worthiness

– Part 3: Actionable insights to increase crawl rate and citation potential for the less-visited URL

Output only the analysis, no commentary or summary.

Note: You can find more prompts for AI-focused optimization in this article: 4 Prompts to Boost AI Citations.

By taking this data-driven approach, you move beyond guesswork and build an AI content strategy grounded in actual machine behavior on your site.

This iterative process of tracking, analyzing, and optimizing will ensure your content remains a valuable and discoverable resource for the evolving AI search landscape.

Final Thoughts On AI Optimization

Tracking and analyzing AI crawler behavior is no longer optional for SEOs seeking to remain competitive in the AI-driven search era.

By using log file analysis tools – or simplifying the process with SEO Bulk Admin – you can build a data-driven strategy that ensures your content is favored by AI engines.

Take a proactive approach by identifying trends in AI crawler activity, optimizing high-performing content, and applying best practices to underperforming pages.

With AI at the forefront of search evolution, it’s time to adapt and capitalize on new traffic opportunities from conversational search engines.

Image Credits

Featured Image: Image by TAC Marketing. Used with permission.

In-Post Images: Image by TAC Marketing. Used with permission.