Google’s New User Intent Extraction Method via @sejournal, @martinibuster

Google published a research paper on how to extract user intent from user interactions that can then be used for autonomous agents. The method they discovered uses on-device small models that do not need to send data back to Google, which means that a user’s privacy is protected.

The researchers discovered they were able to solve the problem by splitting it into two tasks. Their solution worked so well that it beat the baseline performance of multimodal large language models (MLLMs) running in massive data centers.

Smaller Models On Browsers And Devices

The focus of the research is on identifying the user intent through the series of actions that a user takes on their mobile device or browser while also keeping that information on the device so that no information is sent back to Google. That means the processing must happen on the device.

They accomplished this in two stages.

  1. In the first stage, a model on the device summarizes what the user was doing.
  2. The sequence of summaries is then sent to a second model that identifies the user intent.
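The two-stage flow can be sketched as follows. This is a minimal illustration, not Google's implementation: the function names and stub logic are hypothetical, and the real system uses two small on-device models whose prompts and weights are not public.

```python
# Hypothetical sketch of the two-stage, on-device intent extraction flow.
# All names are illustrative stand-ins, not the paper's actual components.

def summarize_interaction(screenshot: str, action: str) -> str:
    """Stage 1: a small on-device model summarizes one interaction
    (a screenshot plus the action taken on it). Stubbed here."""
    return f"On a screen showing '{screenshot}', the user did '{action}'."

def infer_intent(summaries: list[str]) -> str:
    """Stage 2: a second, fine-tuned model reads the full sequence of
    summaries and produces one overall intent description. Stubbed."""
    return "Overall intent inferred from: " + " -> ".join(summaries)

def extract_intent(trajectory: list[tuple[str, str]]) -> str:
    # Everything runs on the device; no data leaves it.
    summaries = [summarize_interaction(obs, act) for obs, act in trajectory]
    return infer_intent(summaries)
```

The key design point is the decomposition itself: each stage is simple enough for a small model, yet chaining them emulates the reasoning a much larger model would do in one pass.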

The researchers explained:

“…our two-stage approach demonstrates superior performance compared to both smaller models and a state-of-the-art large MLLM, independent of dataset and model type.
Our approach also naturally handles scenarios with noisy data that traditional supervised fine-tuning methods struggle with.”

Intent Extraction From UI Interactions

Intent extraction from screenshots and text descriptions of user interactions was a technique proposed in 2025 using Multimodal Large Language Models (MLLMs). The researchers say they applied this approach to their problem but with an improved prompt.

The researchers explained that extracting intent is not a trivial problem and that multiple errors can occur along the way. They use the word trajectory to describe a user journey within a mobile or web application, represented as a sequence of interactions.

The user journey (trajectory) is turned into a formula where each interaction step consists of two parts:

  1. An Observation
    This is the visual state of the screen (screenshot) of where the user is at that step.
  2. An Action
    The specific action that the user performed on that screen (like clicking a button, typing text, or clicking a link).
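As a minimal data-structure sketch (the field names are mine, not the paper's), each step pairs an observation with an action, and a trajectory is the ordered list of such steps:

```python
from dataclasses import dataclass

@dataclass
class InteractionStep:
    observation: str  # the visual state of the screen (screenshot) at this step
    action: str       # what the user did on that screen (click, type, etc.)

# A trajectory is simply the ordered sequence of steps in one user journey.
trajectory = [
    InteractionStep(observation="home screen", action="tap search bar"),
    InteractionStep(observation="search results", action="tap first result"),
]
```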

They described three qualities of a good extracted intent:

  • “faithful: only describes things that actually occur in the trajectory;
  • comprehensive: provides all of the information about the user intent required to re-enact the trajectory;
  • and relevant: does not contain extraneous information beyond what is needed for comprehensiveness.”

Challenging To Evaluate Extracted Intents

The researchers explain that grading extracted intents is difficult because user intents contain complex details (like dates or transaction data) and are inherently subjective, containing ambiguities. Trajectories are subjective because the underlying motivations are ambiguous.

For example, did a user choose a product because of the price or the features? The actions are visible, but the motivations are not. Previous research shows that human annotators agreed on intents only 80% of the time for web trajectories and 76% for mobile trajectories, so a given trajectory does not always indicate a single, specific intent.

Two-Stage Approach

After ruling out other methods like Chain of Thought (CoT) reasoning (because small language models struggled with the reasoning), they chose a two-stage approach that emulated Chain of Thought reasoning.

The researchers explained their two-stage approach:

“First, we use prompting to generate a summary for each interaction (consisting of a visual screenshot and textual action representation) in a trajectory. This stage is
prompt-based as there is currently no training data available with summary labels for individual interactions.

Second, we feed all of the interaction-level summaries into a second stage model to generate an overall intent description. We apply fine-tuning in the second stage…”

The First Stage: Screenshot Summary

For the first-stage summary of each interaction's screenshot, the researchers divide the summary into two parts, but there is also a third part.

  1. A description of what’s on the screen.
  2. A description of the user’s action.

The third component is labeled "speculative intent": the model's guesses about what the user is trying to do. The researchers generate it and then simply discard it. Surprisingly, allowing the model to speculate and then removing that speculation leads to a higher-quality result.
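One way to picture that filtering step (the field names here are illustrative stand-ins, not taken from the paper):

```python
def first_stage_summary(model_output: dict) -> str:
    """The model is prompted to produce three parts, but only the first
    two are kept. Field names are hypothetical."""
    screen = model_output["screen_description"]
    action = model_output["action_description"]
    # The third part, "speculative_intent", is generated and then
    # deliberately discarded; letting the model speculate and throwing
    # the speculation away improved summary quality.
    return f"{screen} | {action}"
```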

The researchers cycled through multiple prompting strategies and this was the one that worked the best.

The Second Stage: Generating Overall Intent Description

For the second stage, the researchers fine-tuned a model to generate an overall intent description. They fine-tuned the model with training data made up of two parts:

  1. Summaries that represent all interactions in the trajectory
  2. The matching ground truth that describes the overall intent for each of the trajectories.

The model initially tended to hallucinate because the first part (the input summaries) is potentially incomplete, while the target intents are complete. This caused the model to learn to fill in the missing parts so that the input summaries matched the target intents.

They solved this problem by “refining” the target intents by removing details that aren’t reflected in the input summaries. This trained the model to infer the intents based only on the inputs.

The researchers compared four different approaches and settled on this approach because it performed so well.

Ethical Considerations And Limitations

The research paper ends by summarizing potential ethical issues, such as an autonomous agent taking actions that are not in the user's interest, and stresses the necessity of building proper guardrails.

The authors also acknowledged constraints that might limit the generalizability of the results. For example, testing was done only on Android and web environments, which means the results might not generalize to Apple devices. Another limitation is that the research was limited to English-language users in the United States.

There is nothing in the research paper or the accompanying blog post that suggests that these processes for extracting user intent are currently in use. The blog post ends by communicating that the described approach is helpful:

“Ultimately, as models improve in performance and mobile devices acquire more processing power, we hope that on-device intent understanding can become a building block for many assistive features on mobile devices going forward.”

Takeaways

Neither the blog post about this research nor the research paper itself describes the results of these processes as something that might be used in AI search or classic search. Both mention the context of autonomous agents.

The research paper explicitly describes an autonomous agent on the device that observes how the user interacts with a user interface and then infers the goal (the intent) of those actions.

The paper lists two specific applications for this technology:

  1. Proactive Assistance:
    An agent that watches what a user is doing for “enhanced personalization” and “improved work efficiency”.
  2. Personalized Memory:
    The process enables a device to “remember” past activities as an intent for later.

Shows The Direction Google Is Heading In

While this might not be used right away, it shows the direction that Google is heading, where small models on a device will be watching user interactions and sometimes stepping in to assist users based on their intent. Intent here is used in the sense of understanding what a user is trying to do.

Read Google’s blog post here:

Small models, big results: Achieving superior intent extraction through decomposition

Read the PDF research paper:

Small Models, Big Results: Achieving Superior Intent Extraction through Decomposition (PDF)

Featured Image by Shutterstock/ViDI Studio

User Data Is Important In Google Search, Per Liz Reid’s DOJ Filing via @sejournal, @marie_haynes

I found some interesting things in the latest document in the DOJ vs. Google trial. Google has appealed the ruling that says they need to give proprietary information to competitors.

Image Credit: Marie Haynes

Key Takeaways:

  • Google has been ordered to give information to competitors so as not to be an illegal monopoly. Google does not want to give its extensive user-side data away.
  • Google’s data on page quality and freshness is proprietary. They don’t want to give it away.
  • Pages that are indexed are marked up with annotations, including signals that identify spam pages.
  • If spammers got hold of those spam signals, it would make stopping spam difficult.
  • User data is important to Google’s Glue system that stores info on every query searched, what the user saw, and how they interacted with the search results.
  • User data is important for training RankEmbed BERT – one of the deep learning systems behind Search.

OK, let’s get into the interesting stuff!

Google Has Proprietary Page Quality And Freshness Signals

This really isn’t a surprise. I did find it interesting that freshness signals are at the heart of Google’s proprietary secrets.

Image Credit: Marie Haynes

Again, here’s more on the importance of Google’s proprietary freshness signals:

Image Credit: Marie Haynes

Pages That Are Crawled Are Marked Up With ‘Proprietary Page Understanding Annotations’

Every page in Google’s index is marked up with annotations to help it understand the page. These include signals to identify spam and duplicate pages. I’ve written before about how every page in the index has a spam score.

Image Credit: Marie Haynes

Spam Scores Could Be Used To Reverse Engineer Ranking Systems

Google doesn’t want to share information with its competitors on these scores.

Image Credit: Marie Haynes

If the spam scores get out, it could lead to more spamming and more difficulty for Google in fighting spam.

Image Credit: Marie Haynes

Google Builds The Index Using These Marked-Up Pages

The pages to which Google has added page understanding annotations are organized based on how frequently Google expects the content will need to be accessed and how fresh the content needs to be.

Image Credit: Marie Haynes

Only A Fraction Of Pages Make It Into Google’s Index

Google argues that giving competitors a list of indexed URLs will enable them to “forgo crawling and analyzing the larger web, and to instead focus their efforts on crawling only the fraction of pages Google has included in its index.” Building this index costs Google extensive time and money. They don’t want to give that away for free.

Image Credit: Marie Haynes

The Role Of User Data In Google’s Ranking Systems

This is the most interesting part. I feel that we do not pay enough attention to Google’s use of user data. (Stay tuned to my YouTube channel as I’m soon about to release a very interesting video with my thoughts on how user-side data is so important – likely the MOST important factor in Google’s ranking systems.)

User Data Is Used To Build GLUE And RankEmbed Models

Google Glue is a huge table of user activity. It collects the text of the queries searched, the user’s language, location and device type, and information on what appeared on the SERP, what the user clicked on or hovered over, how long they stayed on a SERP, and more.

RankEmbed BERT is even more interesting. RankEmbed BERT is one of the deep learning systems that underpins Search. In the Pandu Nayak testimony, we learned that RankEmbed BERT is used in reranking the results returned by traditional ranking systems. RankEmbed BERT is trained on click and query data from actual users.

The AI systems behind search are continually learning to improve upon presenting searchers with satisfying results. Google looks at what they are clicking on and whether they return to the SERPs or not. Google also runs live experiments that look at what searchers choose to click on and stay on. Those actions help train RankEmbed BERT. It is further fine-tuned by ratings from the quality raters. I will be publishing more on this soon. The take-home point I want to hammer on is that user satisfaction is by far the most important thing we should be optimizing for!

From the Liz Reid document we are analyzing today, we can see that user data is used to train, build, and operate RankEmbed models.

Image Credit: Marie Haynes

Once again, we learn that the user data that is used to train these models includes query, location, time of search, and how the user interacted with what was displayed to them.

Image Credit: Marie Haynes

This is talking about the actions that users take from within the Google Search results. What I really want to know is how much of a role Chrome data plays. Does Google look at whether people are engaging with your pages, filling out your forms, making your recipes, and more? I think they do. The judgment summary of this trial hints that Chrome data is used in the ranking systems, but not a lot of detail is shared.

Image Credit: Marie Haynes

Google Says That If Someone Had The Glue And RankEmbed User Data, They Could Train An LLM With It

This user data is the key to Google’s success.

Image Credit: Marie Haynes

It’s worthwhile reading the whole declaration from Liz Reid.

More Resources:


This post was originally published on Marie Haynes Consulting.


Featured Image: N Universe/Shutterstock

Marketing to Humans and Machines

Agentic shopping presents ecommerce marketers with a familiar problem in a new form.

The promise is simple enough. AI agents act on behalf of shoppers to search, compare, select, and even purchase products. These agents will use a shopper’s preferences — stated and inferred — rather than browsing products from digital shelves.

McKinsey & Company describes it this way: “Companies have spent decades refining consumer journeys, fine-tuning every click, scroll, and tap. But in the era of agentic commerce, the consumer no longer travels alone. Their digital proxies now navigate the commerce ecosystem.”

2 Targets

Ecommerce marketers will target both people and AI in the era of agentic commerce.

In effect, this means ecommerce marketers have two targets: a human and a machine.

It’s a familiar scenario. Marketers seeking organic traffic have long sought shoppers and appeased machines, e.g., search engines.

An online pet supply company wants Google to place its dripless water bowls at the top of search results and humans to click the listing.

In much the same way, this retailer now wants an AI shopping agent to offer that dripless bowl when a consumer asks a genAI platform how to keep a Doberman puppy from sloshing water all over the kitchen.

This two-prong approach paints a helpful picture, as many ecommerce businesses wonder how they will drive sales when chatbots do most of the shopping.

Marketing to Machines

For merchants, the most important component — shopping agents — will likely come via platforms.

Few ecommerce businesses will integrate their catalogs directly into every LLM or shopping agent. Instead, commerce platforms and marketplaces will be the conduits. Merchants will publish structured product data once and let those intermediaries distribute it into agentic ecosystems.

This is already happening. Shopify, for example, is building an agentic shopping infrastructure that allows agents to tap merchant catalogs and build carts.

Marketplaces will play a similar role. Amazon and Walmart already serve as product discovery engines and have no incentive to surrender that position.

A recent dispute between Amazon and Perplexity over agentic shopping tools underscores how aggressively marketplaces may defend their infrastructure and customer relationships.

The implication for ecommerce marketers is practical. Marketing to machines will be a lot of structured data work. Product feeds, catalog hygiene, and API-ready commerce systems will become part of the visibility strategy, much as technical search engine optimization was necessary when Google dominated.

Marketing to People

With agentic commerce, marketers aim to influence the AI. The second target is the person typing the prompt.

AI agents select products based on users’ stated needs and inferred preferences. Merchants, then, have a clear objective: Shape what shoppers want, how they describe it, and which brands or shops they trust before asking.

This, too, is not new. It resembles brand demand in Google search results. A shopper will get one set of results from typing “best dog bowl” and another for “best dripless dog bowl Chewy.”

In agentic commerce, brand-building and preference-setting become even more valuable because they guide the shopper’s intent. And that intent, in turn, influences the agent.

Here’s how merchants exert that influence.

Advertising. Social and video ads foster familiarity, define product categories, and introduce specific terminology.

In time, that language becomes prompt phrasing. A merchant may not control the AI’s model, but it can control whether its product name, differentiator, or problem statement becomes part of a shopper’s vocabulary.

Content marketing. Buying guides, comparisons, and problem-solving articles seed the concepts that shoppers recall later in prompts.

Personalized lifecycle marketing and email marketing may become even more critical because it represents an owned audience and an opportunity to identify shopper preferences.

Merchant systems, including AI, can use purchase history, browsing signals, and customer data to anticipate needs and recommend actions. The better a merchant is at retention, the more likely it influences the prompt. Or, for that matter, bypasses it altogether.

Personalized lifecycle marketing emphasizes individuals, according to Matthew Fanelli, chief revenue officer at Digital Remedy. Shoppers, Fanelli said, are like snowflakes: beautiful and unique in their own ways.

Influencer marketing is another prompt-shaper. Fanelli described it as a third prong, driven by peer behavior and social proof. “What is my peer group doing? What are they buying? How do I get in with them?” he said.

Fanelli expects a trifecta of forces to reshape ecommerce: more choice, shorter attention spans, and more connected devices. “That’s when you start to get agents,” he said. For marketers, the response is not panic but discipline. Create demand from humans and structure data for machines.

A Breakdown Of Microsoft’s Guide To AEO & GEO via @sejournal, @martinibuster

Microsoft published a sixteen-page explainer guide about optimizing for AI search and chat. While many of the suggestions can be classified as SEO, some of the other tips relate exclusively to AI search surfaces. Here are the most helpful takeaways.

What AEO and GEO Are And Why They Matter

Microsoft explains that AI search surfaces have created an evolution from “ranking for clicks” to “being understood and recommended by AI.” Traditional SEO still provides a foundation for being cited in AI, but AEO and GEO determine whether content gets surfaced inside AI-driven experiences.

Here is how Microsoft distinguishes AEO and GEO. The first thing to notice is that they define AEO as Agentic Engine Optimization. That’s different from Answer Engine Optimization, which is how AEO is commonly understood.

  • AEO (Answer/Agentic Engine Optimization) focuses on making content and product information easy for AI assistants and agents to retrieve, interpret, and present as direct answers.
  • GEO (Generative Engine Optimization) focuses on making your content discoverable and persuasive inside generative AI systems by increasing clarity, trustworthiness, and authoritativeness.

Microsoft views AEO and GEO as not limited to marketing but as involving multiple teams within an organization.

The guide says:

“This shift impacts every part of the organization. Marketing teams must rethink brand differentiation, growth teams need to adapt to AI-driven journeys, ecommerce teams must measure success differently, data teams must surface richer signals, and engineering teams must ensure systems are AI-readable and reliable.”

AI shopping is not one channel; it's a set of overlapping systems.

Microsoft describes AI shopping as three overlapping consumer touchpoints:

  1. AI browsers that interpret what’s on a page and surface context while users browse.
  2. AI assistants that answer questions and guide decisions in conversation.
  3. AI agents that can take actions, like navigating, selecting options, and completing purchases.

The AI touchpoint matters less than whether the system can access accurate, structured, and trustworthy product information.

SEO Still Plays A Role

Microsoft's guide says that with AEO and GEO, competition shifts from discovery to influence. SEO is still important, but it is no longer the whole game.

The new competition is about influencing the AI recommendation layer, not just showing up in rankings.

Microsoft describes it like this:

  • SEO helps the product get found.
  • AEO helps the AI explain it clearly.
  • GEO helps the AI trust it and recommend it.

Microsoft explains:

“Competition is shifting from discovery to influence (SEO to AEO/GEO).

If SEO focused on driving clicks, AEO is focused on driving clarity with enriched, real-time data, while GEO focuses on building credibility and trust so AI systems can confidently recommend your products.

SEO remains foundational, but winning in AI-powered shopping experiences requires helping AI systems understand not just what your product is, but why it should be chosen.”

How AI Systems Decide What To Recommend

Microsoft explains how an AI assistant, in this case Copilot, handles a user’s request. When a user asks for a recommendation, the AI assistant goes into a reasoning phase where the query is broken down using a combination of web and product feed data.

The web data provides:

  • “General knowledge
  • Category understanding
  • Your brand positioning”

Feed data provides:

  • “Current prices
  • Availability
  • Key specs”

The AI assistant may, based on the feed data, choose to surface the product with the lowest price that is also in stock. When the user clicks through to the website, the AI assistant scans the page for information that provides context.

Microsoft lists these as examples of context:

  • Detailed reviews
  • Videos that explain the product
  • Current promotions
  • Delivery estimates

The agent aggregates this information and provides guidance on what it discovered in terms of the context of the product (delivery times, etc.).

Microsoft brings it all together like this:

First, there's crawled data:
The information AI systems learned during training and retrieve from indexed web pages, which shapes your brand's baseline perception and provides grounding for AI responses, including your product categories, reputation, and market position.

Second, there’s product feeds and APIs:
The structured data you actively push to AI platforms, giving you control over how your products are represented in comparisons and recommendations. Feeds provide accuracy, details and consistency.

Third, there’s live website data:
The real-time information AI agents see when they visit your actual site, from rich media and user reviews to dynamic pricing and transaction capabilities. Each data source plays a distinct role in the shopping journey — traditional SEO remains essential because AI systems perform real-time web searches frequently throughout the shopping journey, not just at purchase time, and your site must rank well to be discovered, evaluated, and recommended.

Microsoft Recommends A Three-Part Action Plan

Strategy 1: Technical Foundations

The core idea for this strategy is that your product catalog must be machine-readable, consistent everywhere, and up to date.

Key actions:

  • Use structured data (schema) for products, offers, reviews, lists, FAQs, and brand.
  • Include dynamic fields like pricing and availability.
  • Keep feed data and on-page structured data aligned with what users actually see.
  • Avoid mismatches between visible content and what is served to crawlers.
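As a hedged illustration of the first key action, here is what schema.org Product markup with dynamic offer fields could look like as JSON-LD. The product, brand, and values are invented for the example; Python's json module is used here only to build and print the markup.

```python
import json

# Illustrative schema.org Product markup with dynamic offer fields.
# All values are made up; real feeds should mirror what shoppers
# actually see on the page, per Microsoft's consistency advice.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Dripless Dog Water Bowl",
    "brand": {"@type": "Brand", "name": "ExamplePet"},
    "offers": {
        "@type": "Offer",
        "price": "24.99",              # dynamic field: keep current
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",  # dynamic field
    },
    "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.6",
        "reviewCount": "312",
    },
}
print(json.dumps(product, indent=2))
```

The same structured record can feed both on-page JSON-LD and the product feeds pushed to AI platforms, which keeps the two sources aligned, exactly the mismatch the guide warns against.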

Strategy 2: Optimize Content For Intent And Clarity

This strategy is about optimizing product content so that it answers typical user questions and is easy for AI to reuse.

Key actions:

  • Write product descriptions that start with benefits and real use-case value.
  • Use headings and phrasing that match how people ask questions.

Add modular content blocks:

  • FAQs
  • specs
  • key features
  • comparisons

Add Contextual Information

  • Support multi-modal interpretation (good alt text, transcripts for video content, structured image metadata).
  • Add complementary product context (pairings, bundles, “goes well with”).

Strategy 3: Trust Signals (Authority And Credibility)

The takeaway for this strategy is that AI assistants and agents prioritize content that looks verified and reputable.

Key actions:

  • Strengthen review credibility (verified reviews, strong volumes, clear sentiment).
  • Reinforce brand authority through real-world signals (press, certifications, partnerships).
  • Keep claims grounded and consistent to avoid trust degradation.
  • Use structured data to clarify legitimacy and identity.

Microsoft explains it like this:

“AI assistants prioritize content from sources they can trust. Signals such as verified reviews, review volume, and clear sentiment help establish credibility and influence recommendations.

Brand authority is reinforced through consistent identity, real-world validation such as press coverage, certifications, and partnerships, and the use of structured data to clearly define brand entities.

Claims should be factual, consistent, and verifiable, as exaggerated or misleading information can reduce trust and limit visibility in AI-powered experiences”

Takeaways

AI search changes the goal from winning rankings to earning recommendations. SEO still matters, but AEO and GEO determine how well content is interpreted, explained, and chosen inside AI assistants and agents.

AI shopping is not a single channel but an ecosystem of assistants, browsers, and agents that rely on authoritative signals across crawled content, structured feeds, and live site experiences. The brands that win are the ones with consistent, machine-readable data, and clear content that contains useful contextual information that can be easily summarized.

Microsoft published a blog post that is accompanied by a link to the downloadable explainer guide: From Discovery to Influence: A Guide to AEO and GEO.

Featured Image by Shutterstock/Kues

Five Things To Do That Will Increase Authoritativeness And Earn Links via @sejournal, @martinibuster

The following are five things that anyone can do to establish authoritativeness and trustworthiness that can be communicated quickly and contribute to earning more links. The trick to this technique is that you have to put some time into these tactics first, but the rewards afterward are links, lots of them.

The idea behind this tactic is to convince a web publisher to give you a free link, or to give you the opportunity to publish an article (with or without a customary byline and link).

In order to cut through the noise of all the other emails the web publisher receives, it is necessary to establish your authority in order to inspire trust. And you need to do it quickly. These are some touchstones I crafted, through trial and error, in order to accomplish a higher success level in link building campaigns.

I call this method Establishing Your Bona Fides. It works by creating trust with one or two sentences. Whether it goes at the beginning, middle, or end of the outreach is up to you, but I've enjoyed a good response rate by placing it near the beginning.

Here are the shortcuts to establishing bona fides:

  1. Awards
  2. Media appearances and mentions
  3. List of authoritative organizations that have published your work
  4. List of peers that have published your work
  5. Authority of your website’s authors

As you can see, this isn't really something you can fake your way through. But if you take the time to first establish your bona fides (what makes you legitimate and authoritative), you will see a higher percentage of positive responses. People will take your emails more seriously.

There is no need to be annoying and badger people over and over the way some marketing agencies do. The success rate improvement from this method will cut the need for such aggressive pestering, something that I have never approved of.

The first two bona fides are self-explanatory, but I will explain them quickly.

Awards
It's always useful to obtain recognition in whatever field you are in (if that's a thing), even if it's recognition for volunteering for an organization and doing charitable work. Other kinds of awards are the kind that local news might give out, like a best-of award in whatever town your company is based in.

Media Appearances And Mentions
Appearing in television news or being cited in respected news outlets or online magazines are ways to establish signals of authoritativeness. Signals of authoritativeness aren't just ranking signals; they are also the kinds of things that humans respond to.

Organizations And Associations
The third bona fide relates to associations and organizations that your company is allied or partnered with, and any publications related to those organizations, both online and offline. Some organizations are always on the lookout for people to profile, or for member-written articles to run in their association publications. This kind of publishing is a great way to establish authoritativeness and trustworthiness; it's truly earning recognition for your expertise.

Publishing articles in offline publications is a bonanza. While you likely won't get a link, you will be the rare online organization submitting a guest post to those publications. Most companies and marketing agencies aren't doing this because there is no link associated with it. This will be your advantage because, as you'll see, it will help increase your link building success rate. When you publish an article in an authoritative space, even if it's offline, it gives you the ability to rightfully say in your outreach email that you've been published in so-and-so magazine or newsletter. Associating your brand with the authoritative brand in this way instantly makes your brand authoritative to the person you're communicating with. This is especially powerful if that person is also a member of whatever association or organization you have published with.

The reason this approach works is that it enables you to establish yourself as authoritative with a single sentence. With only a few words in your outreach email, you can quickly profile your site as not a spammer, and a legit organization that’s ultimately worthy of getting a link. In my experience this has worked exceedingly well for consistently earning instant trust from whoever you’re outreaching to.

You can get to number four (list of peers that have published your work) without doing number three (list of organizations that have published your work), but you'll have greater success if you put a good number of number-three projects behind you. Even if you don't use all of them in your initial outreach email, you may have to deploy them in follow-up emails to doubting recipients who need more convincing. And you can add all of these to your About Us page.

Authority Of Website Authors
Point number five (authority of your website’s authors) is more or less self-explanatory. It helps if the person authoring your articles is someone the outreach recipient can identify with, someone they can think of as “one of us” when you list their credentials. For example, I once did an outreach campaign in the educational space citing the writing talents of a math teacher who was also an education technology blogger. This person’s credentials and authority opened doors for my link building outreach and helped my client receive links from some truly prestigious education-related websites.

Obviously, the success of this approach requires doing some work ahead of time to secure appearances on blogs, podcasts, and video interviews, and to publish in association and organization publications, both online and offline. Even taking a photo with someone who is well known and authoritative and putting it on your About Us page can be helpful. People who are considering giving you a link will visit your website’s About Us page to verify who your company is and whether it’s as above board and authoritative as you say.

Using the above pre-campaign tactics will improve your trustworthiness and authoritativeness and have a positive impact on link building success rates.

Featured Image by Shutterstock/Krakenimages.com

When Platforms Say ‘Don’t Optimize,’ Smart Teams Run Experiments via @sejournal, @DuaneForrester

A quick note up front, so we start on the right foot.

The research I’m about to reference is not mine. I did not run these experiments. I’m not affiliated with the authors. I’m not here to “endorse” a camp, pick a side, or crown a winner. What I am going to endorse, loudly and without apology, is measurement. Replication. Real-world experiments. The kind of work that teaches us in real time, in real life, what changes when an LLM sits between customers and content. We need more tested data, and this is one of those starting points.

If you do nothing else with this article, do this: Read the paper, then run your own test. Whether your results agree or disagree, publish them. We need more receipts and fewer hot takes.

Now, the reason I’m writing this.

Over the last year, the industry has been pushed toward a neat, comforting story: GEO is just SEO. Nothing new to learn. No need to change how you work. Just keep doing the fundamentals, and everything will be fine.

I don’t buy that.

Not because SEO fundamentals stopped mattering. They still matter, and they remain necessary. But because “necessary” is not the same as “sufficient,” and because the incentives behind platform messaging do not always align with the operational realities businesses are walking into and dealing with.

Image Credit: Duane Forrester

The Narrative And The Incentives

If you’ve paid attention to public guidance coming from the leading search platforms lately, you’ve probably heard a version of: Don’t focus on chunking. Don’t create “bite-sized chunks.” Don’t optimize for how the machine works. Focus on good content.

That’s been echoed and amplified across industry coverage, though I want to be precise about my position here. I’m not claiming a conspiracy, and I’m not saying anyone is being intentionally misleading. I’m not doing that.

I am saying something much simpler, and it’s my opinion based on actual experience: when messaging repeats across multiple spokespeople in a tight window, it signals an internal alignment effort.

That’s not an insult nor is it a moral judgment. That’s how large organizations operate when they want the market to hear one clear message. I was part of exactly that type of environment for well over a decade in my career.

And the message itself, on its face, is not wrong. You can absolutely hurt yourself by over-optimizing for the wrong proxy. You can absolutely create brittle content by trying to game a system you do not fully understand. In many cases, “write clearly for humans” is solid baseline guidance.

The problem is what happens when that baseline guidance becomes a blanket dismissal of how the machine layer works today, even if it’s unintentional. Because we are not in a “10 blue links” world anymore.

We are in a world where answer surfaces are expanding, search journeys are compressing, and the unit of competition is shifting from “the page” to “the selected portion of the page,” assembled into an answer the user never clicks past.

And that is where “GEO is just SEO” starts to break in my mind.

The Wrong Question: “Is Google Still The Biggest Traffic Driver?”

Executives love comforting statements: “Google still dominates search. Traditional SEO still drives the most traffic. Therefore this LLM stuff is overblown.”

The first half is true, but the conclusion is where companies get hurt.

The biggest risk here is asking the wrong question. “Where does traffic come from today?” is a dashboard question, and it’s backward-looking. It tells you what has been true.

The more important questions are forward-looking:

  • What happens to your business when discovery shifts from clicks to answers?
  • What happens when the customer’s journey ends on the results page, inside an AI Overview, inside an AI Mode experience, or inside an assistant interface?
  • What happens when the platform keeps the user, monetizes the answer surface, and your content becomes a source input rather than a destination?

If you want the behavior trendline in plain terms, start here, with the 2024 SparkToro study, then take a look at what Danny Goodwin wrote in 2024 and, as a follow-up, in 2025 (spoiler: zero-click instances increased year over year). And while some sources are a couple of years old, you can easily find newer data showing the trend growing.

I’m not using these sources to claim “the sky is falling.” I’m using them to reinforce a simple operational reality: If the click declines, “ranking” is no longer the end goal. Being selected into the answer becomes the end goal.

That requires additional thinking beyond classic SEO. Not instead of it. On top of it.

The Platform Footprint Is Changing, And The Business Model Is Following

If you want to understand why the public messaging is conservative, you have to look at the platform’s strategic direction.

Google, for example, has been expanding AI answer surfaces, and it’s not subtle. Both AI Overviews and AI Mode saw announcements of large expansions during 2025.

Again, notice what this implies at the operating level. When AI Overviews and AI Mode expand, you’re not just dealing with “ranking signals.” You’re dealing with an experience layer that can answer, summarize, recommend, and route a user without a click.

Then comes the part everyone pretends not to see until it’s unavoidable: Monetization follows attention.

This is no longer hypothetical. Search Engine Journal covered Google’s official rollout of ads in AI Overviews, which matters because it signals this answer layer is being treated as a durable interface surface, not a temporary experiment.

Google’s own Ads documentation reinforces the same point: This isn’t just “something people noticed,” it’s a supported placement pattern with real operational guidance behind it. And Google noted mid-last-year that AI Overviews monetize at a similar rate to traditional search, which is a quiet signal that this isn’t a side feature.

You do not need to be cynical to read this clearly. If the answer surface becomes the primary surface, the ad surface will evolve there too. That’s not a scandal; it’s simply the reality of where the model is evolving.

Now connect the dots back to “don’t focus on chunking”-style guidance.

A platform that is actively expanding answer surfaces has multiple legitimate reasons to discourage the market from “engineering for the answer layer,” including quality control, spam prevention, and ecosystem stability.

Businesses, however, do not have the luxury of optimizing for ecosystem stability. Businesses must optimize for business outcomes. Their own outcomes.

That’s the tension.

This isn’t about blaming anyone. It’s about understanding misaligned objectives, so you don’t make decisions that feel safe but cost you later.

Discovery Is Fragmenting Beyond Google, And Early Signals Matter

I’m on record that traditional search is still an important driver, and that optimizing in this new world is additive, not an overnight replacement story. But “additive” still changes the workflow.

AI assistants are becoming measurable referrers. Not dominant, not decisive on their own, but meaningful enough to track as an early indicator. Two examples capture this trend.

TechCrunch noted that while it’s not enough to offset the loss of traffic from search declines, news sites are seeing growth in ChatGPT referrals. And Digiday has data showing traffic from ChatGPT doubled from 2024 to 2025.

Why do I include these?

Because this is how platform shifts look in the early stages. They start small, then they become normal, then they become default. If you wait for the “big numbers,” you’re late in building competence and taking action. (Remember “directories”? Yeah, search ate their lunch.)

And competence, in this new environment, is not “how do I rank a page.” It’s “how do I get selected, cited, and trusted when the interface is an LLM.”

This is where the “GEO is just SEO” framing stops being a helpful simplification and starts becoming operationally dangerous.

Now, The Receipts: A Paper That Tests GEO Tactics And Shows Measurable Differences

Let’s talk about the research. The paper I’m referencing here is publicly available, and I’m going to summarize it in plain English, because most practitioners do not have time to parse academic structure during the week.

At a high level, the paper (“E-GEO: A Testbed for Generative Engine Optimization in E-Commerce”) tests whether common human-written rewrite heuristics actually improve performance in an LLM-mediated product selection environment, then compares that to a more systematic optimization approach. It uses ecommerce as the proving ground, which is smart for one reason: Outcomes can be measured in ways that map to money. Product rank and selection are economically meaningful.

This is important because the GEO conversation often gets stuck in “vibes.” In contrast, this work is trying to quantify outcomes.

Here’s the key punchline, simplified:

A lot of common “rewrite advice” does not help in this environment. Some of it can be neutral. Some of it can be negative. But when the researchers apply a meta-optimization process, prompts improve consistently, and the optimized patterns converge on repeatable features.

That convergence is the part that should make every practitioner sit up. Because convergence suggests there are stable signals the system responds to. Not mystical. Not magical. Not purely random.

Stable signals.
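To make “meta-optimization” concrete, it can be pictured as a greedy improvement loop. This is my own illustrative sketch, not the paper’s actual procedure; `rewrite_fn` and `score_fn` are hypothetical stand-ins for the LLM-driven rewriting and evaluation steps the researchers used.

```python
def meta_optimize(description, rewrite_fn, score_fn, rounds=5):
    """Greedy meta-optimization sketch: propose a rewrite each round,
    keep it only if the measured score improves. rewrite_fn and score_fn
    are hypothetical stand-ins for LLM-driven rewriting and evaluation."""
    best, best_score = description, score_fn(description)
    for _ in range(rounds):
        candidate = rewrite_fn(best)
        candidate_score = score_fn(candidate)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    return best
```

In the paper’s setting, the score would be a measured selection or rank outcome in the simulated shopping environment; the convergence of the surviving rewrites across runs is what suggests stable signals.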

And this is where I come back to my earlier point: If GEO were truly “just SEO,” then you would expect classic human rewrite heuristics to translate cleanly. You would expect the winning playbook to be familiar.

This paper suggests the reality is messier. Not because SEO stopped mattering, but because the unit of success changed.

  • From page ranking to answer selection.
  • From persuasion copy to decision copy.
  • From “read the whole page” to “retrieve the best segment.”
  • From “the user clicks” to “the machine chooses.”

What The Optimizer Keeps Finding, And Why That Matters

I want to be careful here, as I’m not telling you to treat this paper like doctrine. You should not accept it at face value and suddenly adopt it as gospel. You should treat it as a public experiment that deserves replication.

Now, the most valuable output isn’t the exact numbers in their environment; rather, it’s the shape of the solution the optimizer keeps converging on. (“Optimizer” is the researchers’ name for their system/process.)

The optimized patterns repeatedly emphasize clarity, explicitness, and decision-support structure. They reduce ambiguity. They surface constraints. They define what the product is and is not. They make comparisons easier. They encode “selection-ready” information in a form that is easier for retrieval and ranking layers to use.

That is a different goal than classic marketing copy, which often leans on narrative, brand feel, and emotional persuasion.

Those things still have a place. But if you want to be selected by an LLM acting as an intermediary, the content needs to do a second job: become machine-usable decision support.

That’s not “anti-human.” It’s pro-clarity, and it’s the kind of detail that will come to define what “good content” means in the future, I think.

The Universal LLM-Optimization Rewrite Recipe, Framed As A Reusable Template

What follows is not me inventing a process out of thin air. This is me reverse-engineering what their optimization process converged toward, and turning it into a repeatable template you can apply to product descriptions and other decision-heavy content.

Treat it as a starting point, then test it. Revise it, create your own version, whatever.

Step 1: State the product’s purpose in one sentence, with explicit context.
Not “premium quality.” Not “best in class.” Purpose.

Example pattern:
This is a [product] designed for [specific use case] in [specific constraints], for people who need [core outcome].

Step 2: Declare the selection criteria you satisfy, plainly.
This is where you stop writing like a brochure and start writing like a spec sheet with a human voice.

Include what the buyer cares about most in that category. If the category is knives, it’s steel type, edge retention, maintenance, balance, handle material. If it’s software, it’s integration, security posture, learning curve, time-to-value.

Make it explicit.

Step 3: Surface constraints and qualifiers early, not buried.
Most marketing copy hides the “buts” until the end. Machines do not reward that ambiguity.

Examples of qualifiers that matter:
Not ideal for [X]. Works best when [Y]. Requires [Z]. Compatible with [A], not [B]. This matters if you [C].

Step 4: State what it is, and what it is not.
This is one of the simplest ways to reduce ambiguity for both the user and the model.

Pattern:
This is for [audience]. It is not for [audience].
This is optimized for [scenario]. It is not intended for [scenario].

Step 5: Convert benefits into testable claims.
Instead of “durable,” say what durable means in practice. Instead of “fast,” define what “fast” looks like in a workflow.

Do not fabricate. Do not inflate. This is not about hype. It’s about clarity.

Step 6: Provide structured comparison hooks.
LLMs often behave like comparison engines because users ask comparative questions.

Give the model clean hooks:
Compared to [common alternative], this offers [difference] because [reason].
If you’re choosing between [A] and [B], pick this when [condition].

Step 7: Add evidence anchors that improve trust.
This can be certifications, materials, warranty terms, return policies, documented specs, and other verifiable signals.

This is not about adding fluff. It’s about making your claims attributable and your product legible.

Step 8: Close with a decision shortcut.
Make the “if you are X, do Y” moment explicit.

Pattern:
Choose this if you need [top 2–3 criteria]. If your priority is [other criteria], consider [alternative type].

That’s the template*.
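As a rough sketch of how the eight steps could be operationalized in a content pipeline (my illustration, not from the paper; all field names and example values are hypothetical), a template renderer might look like this:

```python
from dataclasses import dataclass

@dataclass
class DecisionCopy:
    """One field per template step; every value supplied is hypothetical."""
    purpose: str              # Step 1: one-sentence purpose with context
    criteria: list            # Step 2: selection criteria you satisfy
    constraints: list         # Step 3: qualifiers, surfaced early
    is_for: str               # Step 4: who it is for
    not_for: str              # Step 4: who it is not for
    testable_claims: list     # Step 5: benefits as testable claims
    comparison: str           # Step 6: structured comparison hook
    evidence: list            # Step 7: verifiable trust anchors
    decision_shortcut: str    # Step 8: "if you are X, do Y"

    def render(self) -> str:
        # Assemble the description in template order, one claim per line.
        return "\n".join([
            self.purpose,
            "Key criteria: " + "; ".join(self.criteria),
            "Constraints: " + "; ".join(self.constraints),
            f"This is for {self.is_for}. It is not for {self.not_for}.",
            "In practice: " + "; ".join(self.testable_claims),
            self.comparison,
            "Evidence: " + "; ".join(self.evidence),
            self.decision_shortcut,
        ])
```

The point isn’t the code; it’s that each claim becomes an explicit, retrievable field rather than narrative buried mid-paragraph, which is exactly the “selection-ready” shape the optimized patterns converge on.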

Notice what it does. It turns a product description into structured decision support, which is not how most product copy is written today. And it is an example of why “GEO is just SEO” fails as a blanket statement.

SEO fundamentals help you get crawled, indexed, and discovered. This helps you get selected when discovery is mediated by an LLM.

Different layer. Different job.

Saying GEO = SEO and SEO = GEO is an oversimplification that, as it becomes normalized, will lead people to miss the fact that the details matter. The differences, even small ones, matter, and they can have real repercussions.

*A much deeper-dive pdf version of this process is available for my Substack subscribers for free via my resources page.

What To Do Next: Read The Paper, Then Replicate It In Your Environment

Here’s the part I want to be explicit about. This paper is interesting because it’s measurable, and because it suggests the system responds to repeatable features.

But you should treat it as a starting point, not a law of physics. Results like this are sensitive to context: industry, brand authority, page type, and even the model and retrieval stack sitting between the user and your content.

That’s why replication matters. The only way we learn what holds, what breaks, and what variables actually matter is by running controlled tests in our own environments and publishing what we find. If you work in SEO, content, product marketing, or growth, here is the invitation.

Read the paper here.

Then run a controlled test on a small, meaningful slice of your site.

Keep it practical:

  • Pick 10 to 20 pages with similar intent.
  • Split them into two groups.
  • Leave one group untouched.
  • Rewrite the other group using a consistent template, like the one above.
  • Document the changes so you can reverse them if needed.
  • Measure over a defined window.
  • Track outcomes that matter in your business context, not just vanity metrics.

And if you can, track whether these pages are being surfaced, cited, paraphrased, or selected in the AI answer interfaces your customers are increasingly using.
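The split itself can be as simple as a seeded random assignment. A minimal sketch (the page slugs are hypothetical placeholders for your own similar-intent pages):

```python
import random

def split_test_groups(page_ids, seed=42):
    """Shuffle deterministically, then split pages into two halves:
    control (left untouched) and treatment (rewritten with the template)."""
    ids = list(page_ids)
    random.Random(seed).shuffle(ids)
    half = len(ids) // 2
    return ids[:half], ids[half:]

# Hypothetical slugs standing in for 20 similar-intent pages
pages = [f"/products/item-{i:02d}" for i in range(20)]
control, treatment = split_test_groups(pages)
```

The fixed seed keeps the assignment reproducible, so you can document exactly which pages were changed and reverse the rewrite later if needed.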

You are not trying to win a science fair. You are trying to reduce uncertainty with a controlled test. If your results disagree with the paper, that’s not failure. That’s signal.

Publish what you find, even if it’s messy. Even if it’s partial. Even if the conclusion is “it depends.” Because that is exactly how a new discipline becomes real. Not through repeating platform talking points. Not through tribal arguments. Through measurement.

One Final Level-Set, For The Executives Reading This

Platform guidance is one input, not your operating system. Your operating system is your measurement program. SEO is still necessary. If you can’t get crawled, you can’t get chosen.

But GEO, meaning optimizing for selection inside LLM-mediated discovery, is an additional competence layer. Not a replacement. A layer. If you decide to ignore that layer because a platform said “don’t optimize,” you’re outsourcing your business risk to someone else’s incentive structure.

And that’s not a strategy. The strategy is simple: learn the layer by testing the layer.

We need more people doing exactly that.

This post was originally published on Duane Forrester Decodes.


Featured Image: Rawpixel.com/Shutterstock

Shopify Shares More Details On Universal Commerce Protocol (UCP) via @sejournal, @martinibuster

Harley Finkelstein, the president of Shopify, was recently interviewed about the company’s open source Universal Commerce Protocol (UCP), which enables agentic AI shopping. He explains how UCP, co-developed with Google, enables brands to be discovered by customers through personalized recommendations, as opposed to advertising and classic search paradigms that are less personalized.

Finkelstein said that the Universal Commerce Protocol (UCP) is designed to enable AI agents to surface products in a manner that merchants can control, show consumers personalized recommendations based on users’ preferences, and deliver a shopping experience that’s as good as any ecommerce store platform.

Shopify is also opening agentic commerce access to brands that are not Shopify customers through its Agentic plan, which he briefly mentions. This plan enables enterprise brands and merchants that do not use Shopify to upload their product data to Shopify’s infrastructure so it can be discovered and purchased directly by AI agents.

This positions Shopify as infrastructure for agentic commerce, not just a hosted commerce platform. This makes it easier for brands to gain immediate access to agentic shopping channels without having to migrate platforms.

Finkelstein also points out that agentic commerce only works if consumers can access all brands, not just those on Shopify.

Shopify’s Finkelstein said that UCP will enable merchants to more effectively control how their products are shown. He also discussed their strategy of bringing agentic shopping to all brands, regardless of whether they are on Shopify or not.

He explained:

“We created this protocol called Universal Commerce Protocol which effectively is this universal language is open sourced so that all merchants can speak directly to every single one of the agents.

And the best way to explain it is up until now, it was really just about like a single transaction.

So I can buy something on ChatGPT or Gemini or Microsoft. There’s no concept of loyalty or subscription or bundling or, you know, if it’s furniture, for example, please don’t ship it to me on Thursday. I’m not home Thursday. Send it Friday.

So this idea of creating this universal protocol that we co-developed with Google means that now merchants can actually tell these agents exactly how to show their products on these agentic tools. And it should be as good as it is on the online store. So that was a really, really big one.

The second thing we announced also with Google is that now we’re actually expanding. You can sell everywhere commerce is happening from an agentic perspective.

So we’re going beyond the agentic storefronts of just ChatGPT, which is what we said, you know, in Q3. Now it’s also, we’re going to be working with Gemini, with AI Mode in Google Search, and also with Copilot.

And maybe the last one is that we’re actually bringing agentic commerce to every brand, whether or not they’re on Shopify.

So if you’re not on Shopify, but you want to have your product syndicated and indexed, you can do so with our agentic plan.”

Access To Many Brands Is Key

Finkelstein stressed that the key to the success of agentic AI is to be able to show the widest possible selection of brands. He said it’s a big opportunity.

He explained:

“I think if Agentic is going to do what a lot of us think it’s going to do from a commerce perspective, you have to give consumers all the brands.

We obviously want them all on Shopify, but there’s some brands that want to participate now, but it may take some time for them to migrate over.

So this idea of opening up to anyone, we think is a big opportunity.”

Who Will Be The Early Adopters?

Finkelstein was asked about who the early adopters will be. His answer was cautious, seemingly acknowledging that it’s likely not going to immediately be a big crush of people turning to AI to buy things.

He answered:

“I think it’ll likely be something that like most people use some of the time and some people use most of the time. I don’t think it’s going to cross the threshold of most most, the way e-commerce does now. It’s just going to take time. It’s going to take some time.”

AI Chat Reduces Friction

Finkelstein said that Universal Commerce Protocol (UCP) enables better shopping experiences, reducing the “friction” that AI shopping may have produced. He believes that once people start having good experiences shopping with an agent, they will start to get into the habit of using it for other kinds of shopping and begin relying on it.

Finkelstein explained:

“Once you have a good experience, I think the actual friction reduces. You’ll keep having it over and over again.

But the thing that we felt was missing, and this is the reason why I think this UCP protocol is so important, is it was very difficult to do merchandising inside of these applications.

And this protocol allows you to do a lot more… Well, up until UCP happened, you couldn’t actually do subscriptions. Now you can.

Or this idea of bundling, you know, for Gymshark, it’s a huge part of their business is if you buy these, you’ll also buy these as well. You can do that as well.

So I think all of these things are sort of in line with creating a much more delightful experience in the chat.”

Merit Based Shopping Versus SEO?

Finkelstein brought up the topic of merit-based shopping, where products are recommended to users because they are what those users are looking for. He used the phrase “merit-based shopping” in contrast to today’s online advertising ecosystems, which prioritize products that pay to be shown as recommendations. The main point is that shopping recommendations are made based on personalization.

Finkelstein explained:

“And I think ultimately what it leads to is like, this will be merit-based shopping, which will be different than I think some of the traditional retailers who were kind of leaning on their balance sheets to spend money on ads. You can’t really game the system in that way.

You actually have to be, from a context perspective, the right product for the right consumer.”

What Happens To Creative Assets And SEO

One of the podcast hosts asked about what happens to creative assets like photos, saying that he noticed that shopping AI uses images. He asked how that was going to evolve. Finkelstein’s answer touched on SEO in the context of how agentic AI shopping is about showing products based on user preferences, a tighter form of relevance than in the advertising and classic search ecosystems.

Finkelstein explained:

“I think …the idea of SEO won’t exist in Agentic because again, it’s merit-based and it’s mostly based on the context history you’ve had.

But I do think though, you’re going to have… these brands are going to have people at their companies who are thinking a lot about like consistent updates to UCP, consistent updates to the catalog.

So they may pull something off the catalog and say, we don’t want to sell it anymore this way. So I think there’s going to be, I don’t know if they’re going to be actual jobs, but there’s going to be people inside of the company, potentially in the merchandising department, who say, actually, the way that we want to sell all this, the way we want to describe this to these agents is a particular way.

And then because of UCP and because of Shopify catalog, it gets easily disseminated across every single one of these agentic applications. So the experience just gets better and better.

I think you have to be a little bit of a techno optimist… as I am, to believe that even if the experience is not incredible right now, it’s likely just going to get better at this ridiculous pace.”

Cutting Out Incentivized Recommendations

When asked what’s the most exciting thing about Agentic AI, he returned to the concept of merit-based shopping, where LLMs have the ability to personalize responses by learning user preferences and therefore recommend a product that fits within that person’s requirements. He contrasted that with what happens in the real world, where a salesperson’s recommendations are influenced by commissions.

So what he is excited about is the idea of the playing field being leveled. He mentioned the possibility of lesser-known brands, like True Classic Tees, being surfaced in AI shopping because that kind of brand is a match for a specific consumer.

He responded:

“Most of the excitement is actually around this idea of like, is there a potential for this to level the playing field? Meaning, you know, if I’ve done a bunch of research historically on an agentic application …about the stuff that I love, the brands that I love. …It probably should not show me a generic pair of boots.

So the excitement actually is around like, is this going to introduce more brands that otherwise are unknown to more people or, you know, True Classic Tee, for example, which, you know, if you’re looking for a black t-shirt, I suspect on a search engine, you’re not going to see True Classic Tee come up that much, but it’s an incredible product and ultimately it can be found on these agentic tools in a way that it probably couldn’t historically.”

Agentic AI Will Accelerate Online Shopping

The other thing that Finkelstein is excited about is that he believes Agentic AI shopping will accelerate the amount of shopping that is done online. He compared using Agentic AI to the COVID moment, where people changed their work and shopping behavior in a major way that became permanent.

He then circled back to the idea that Agentic AI is less biased:

“I think it’s actually a better version of that because it’s an unbiased discussion, an unbiased conversation.”

Watch the video podcast interview starting a few minutes after the three-hour mark:

Featured Image by Shutterstock/Julien Tromeur

More Sites Blocking LLM Crawling – Could That Backfire On GEO? via @sejournal, @martinibuster

Hostinger released an analysis showing that businesses are blocking the crawlers used to train large language models while allowing AI assistant crawlers to read and summarize more websites. The company examined 66.7 billion bot interactions across 5 million websites and found that the AI assistant crawlers used by tools such as ChatGPT now reach more sites even as companies restrict other forms of AI access.

Hostinger Analysis

Hostinger is a web host and also a no-code, AI agent-driven platform for building online businesses. The company said it analyzed anonymized website logs to measure how verified crawlers access sites at scale, allowing it to compare changes in how search engines and AI systems retrieve online content.

The analysis they published shows that AI assistant crawlers expanded their reach across websites during a five-month period. Data was collected during three six-day windows in June, August, and November 2025.

OpenAI’s SearchBot increased coverage from 52 percent to 68 percent of sites, while Applebot (which indexes content for powering Apple’s search features) doubled from 17 percent to 34 percent. During the same period, traditional search crawlers essentially remained constant. The data indicates that AI assistants are adding a new layer to how information reaches users rather than replacing search engines outright.

At the same time, the data shows that companies sharply reduced access for AI training crawlers. OpenAI’s GPTBot dropped from access on 84 percent of websites in August to 12 percent by November. Meta’s ExternalAgent dropped from 60 percent to 41 percent website coverage. These crawlers collect data over time to improve AI models and update their Parametric Knowledge, but many businesses are blocking them, either to limit data use or out of fear of copyright infringement issues.

Parametric Knowledge

Parametric Knowledge, also known as Parametric Memory, is the information that is “hard-coded” into the model during training. It is called “parametric” because the knowledge is stored in the model’s parameters (the weights). Parametric Knowledge is long-term memory about entities, for example, people, things, and companies.

When a person asks an LLM a question, the LLM may recognize an entity, like a business, and then retrieve the associated vectors (facts) that it learned during training. So, when a business blocks a training bot from its website, it’s keeping the LLM from knowing anything about it, which might not be the best thing for an organization that’s concerned about AI visibility.

Allowing an AI training bot to crawl a company website enables that company to exercise some control over what the LLM knows about it, including what it does, branding, whatever is in the About Us, and enables the LLM to know about the products or services offered. An informational site may benefit from being cited for answers.

Businesses Are Opting Out Of Parametric Knowledge

Hostinger’s analysis shows that businesses are “aggressively” blocking AI training crawlers. While Hostinger’s research doesn’t mention this, the effect of blocking AI training bots is that businesses are essentially opting out of an LLM’s parametric knowledge: the LLM is prevented from learning directly from first-party content during training, which removes the site’s ability to tell its own story and forces the LLM to rely on third-party data or knowledge graphs.

Hostinger’s research shows:

“Based on tracking 66.7 billion bot interactions across 5 million websites, Hostinger uncovered a significant paradox:

Companies are aggressively blocking AI training bots, the systems that scrape content to build AI models. OpenAI’s GPTBot dropped from 84% to 12% of websites in three months.

However, AI assistant crawlers, the technology that ChatGPT, Apple, etc. use to answer customer questions, are expanding rapidly. OpenAI’s SearchBot grew from 52% to 68% of sites; Applebot doubled to 34%.”

A recent post on Reddit shows how blocking LLM access to content has become normalized and is understood as a way to protect intellectual property (IP).

The post starts with an initial question asking how to block AIs:

“I want to make sure my site is continued to be indexed in Google Search, but do not want Gemini, ChatGPT, or others to scrape and use my content.

What’s the best way to do this?”

Screenshot Of A Reddit Conversation

Later in the thread, someone asked whether they were blocking LLMs to protect their intellectual property, and the original poster confirmed that was the reason:

“We publish unique content that doesn’t really exist elsewhere. LLMs often learn about things in this tiny niche from us. So we need Google traffic but not LLMs.”

That may be a valid reason. A site that publishes unique instructional information about a software product, information that doesn’t exist elsewhere, may want to block an LLM from training on its content, because otherwise the LLM could answer those questions itself and remove the need to visit the site.
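
The selective blocking the Reddit poster asks about is generally implemented in robots.txt. Below is a minimal sketch, assuming the user-agent tokens each vendor currently documents (tokens change over time, and compliance with robots.txt is voluntary, so always verify against the vendors’ own crawler documentation):

```
# Keep traditional search crawling open
User-agent: Googlebot
Allow: /

# Block AI training crawlers
User-agent: GPTBot
Disallow: /

# Google-Extended controls Gemini training, not Google Search indexing
User-agent: Google-Extended
Disallow: /

User-agent: Meta-ExternalAgent
Disallow: /

# Optionally leave AI assistant/search crawlers allowed,
# e.g. OpenAI's search crawler
User-agent: OAI-SearchBot
Allow: /
```

Note the distinction the file relies on: blocking GPTBot or Google-Extended opts the site out of model training, while Googlebot and OAI-SearchBot continue to crawl for search and live-answer retrieval, which is exactly the split Hostinger’s data describes.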

But for other sites with less unique content, like a product review and comparison site or an ecommerce site, it might not be the best strategy to block LLMs from adding information about those sites into their parametric memory.

Brand Messaging Is Lost To LLMs

As AI assistants answer questions directly, users may receive information without needing to visit a website. This can reduce direct traffic and limit the reach of a business’s pricing details, product context, and brand messaging. It’s possible that the customer journey ends inside the AI interface and the businesses that block LLMs from acquiring knowledge about their companies and offerings are essentially relying on the search crawler and search index to fill that gap (and maybe that works?).

The increasing use of AI assistants affects marketing and extends into revenue forecasting. When AI systems summarize offers and recommendations, companies that block LLMs have less control over how pricing and value appear. Advertising efforts lose visibility earlier in the decision process, and ecommerce attribution becomes harder when purchases follow AI-generated answers rather than direct site visits.

According to Hostinger, some organizations are becoming more selective about which content is available to AI, especially AI assistants.

Tomas Rasymas, Head of AI at Hostinger, commented:

“With AI assistants increasingly answering questions directly, the web is shifting from a click-driven model to an agent-mediated one. The real risk for businesses isn’t AI access itself, but losing control over how pricing, positioning, and value are presented when decisions are made.”

Takeaway

Blocking LLMs from using website data for training shouldn’t be an automatic default, even though many people feel real anger and annoyance at the idea of an LLM training on their content. It may be more useful to take a considered approach that weighs the benefits against the disadvantages, and to ask whether those disadvantages are real or merely perceived.

Featured Image by Shutterstock/Lightspring

A Little Clarity On SEO, GEO, And AEO via @sejournal, @martinibuster

The debate about AEO/GEO centers on whether it’s a subset of SEO, a standalone discipline, or just standard SEO. Deciding where to plant a flag is difficult because every argument makes a solid case. There’s no doubt that change is underway, and it may be time to find where all the competing ideas intersect and work from there.

The Case Against AEO/GEO

Many SEOs argue that AEO/GEO doesn’t differentiate itself enough to justify being anything other than a subset of SEO, sharing computers in the same office.

Harpreet Singh Chatha (X profile) of Harps Digital recently tweeted about AEO / GEO myths to leave behind in 2025.

Some of what he listed:

  • “LLMs.txt
  • Paying a GEO expert to do “chunk optimization.” Chunking content is just making your content readable.
  • Thinking AEO / GEO have nothing in common with SEO. Ask your favourite GEO expert for 25 things that are unique to AI search and don’t overlap with SEO. They will block you.
  • Saying SEO is dead. “

The legendary Greg Boser (LinkedIn profile), one of the original SEOs, practicing since 1996, tweeted this:

“At the end of the day, the core foundation of what we do always has been and always will be about understanding how humans use technology to gain knowledge.

We don’t need to come up with a bunch of new acronyms to continue to do what we do. All that needs to happen is we all agree to change the “E” in SEO from “Engine” to “Experience”.

Then everyone can stop wasting time writing all the ridiculous SEO/GEO/AEO posts, and get back to work.”

Inability To Articulate AEO/GEO

What contributes to the perception that AEO/GEO is not a real thing is that many proponents of AEO/GEO fail to differentiate it from standard SEO. We’ve all seen it: someone tweets their new tactic and the SEO peanut gallery chimes in with “nah, that’s SEO.”

Back in October, Microsoft published a blog post about optimizing content for AI in which it asserted:

“While there’s no secret strategy for being selected by AI systems, success starts with content that is fresh, authoritative, structured, and semantically clear.”

The post goes on to affirm the importance of SEO fundamentals such as “Crawlability, metadata, internal linking, and backlinks” but then states that these are just starting points. Microsoft points out that AI search provides answers, not ranked lists of pages. That’s correct, and it changes a lot.

Microsoft says that now it’s about which pieces of content are being ranked:

“In AI search, ranking still happens, but it’s less about ordering entire pages and more about which pieces of content earn a place in the final answer.”

That kind of echoes what Jesse Dwyer of Perplexity AI recently said about AI Search and SEO:

“As for the index technology, the biggest difference in AI search right now comes down to whole-document vs. “sub-document” processing.

…The AI-first approach is known as “sub-document processing.” Instead of indexing whole pages, the engine indexes specific, granular snippets (not to be confused with what SEO’s know as “featured snippets”).”

Microsoft recently published an explainer called “From discovery to influence: A guide to AEO and GEO” that is focused mostly on shopping, which is notable because there’s a growing awareness that ecommerce stands to gain a lot from AI search.

No such luck for informational sites, because it’s also gradually becoming understood that agentic AI is poised to strip informational sites of branding and value-add, treating them as mere sources of data.

Common SEO Practices That Pass As GEO

Some of what some champion as GEO and AEO are actually longstanding SEO practices:

  • Crafting content in the form of answers
    Good SEOs have been doing this since Featured Snippets came out in 2014.
  • Chunking content
    Crafting content in tight paragraphs looks good on mobile devices, and it’s something good SEOs and thoughtful content creators have been doing for well over a decade.
  • Structured Content
    Headings and other elements that strongly disambiguate the content are also SEO.
  • Structured Data
    Shut your mouth. This is SEO.
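
To make that last point concrete, the “structured data” in question is typically JSON-LD markup in the page source, a practice search engines have supported for years. A minimal Organization sketch (the name, URLs, and logo here are placeholder values, and the properties a site actually needs depend on its type):

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Co.",
  "url": "https://www.example.com/",
  "logo": "https://www.example.com/logo.png",
  "sameAs": [
    "https://www.linkedin.com/company/example"
  ]
}
```

This goes in a `<script type="application/ld+json">` tag; nothing about it is new or unique to AI search, which is the point being made above.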

The Customer Is Always Right

Some in the “GEO Is Real” camp tend to regard themselves as evolving with the times, but they also acknowledge they’re just offering what clients are demanding. SEO practitioners are in a hard spot: what are you going to do? Plant your flag on traditional SEO and turn your back on what potential clients are begging for?

Googlers Insist It’s Still SEO

There are Googlers such as Robby Stein (VP of Product), Danny Sullivan, and John Mueller who say that SEO is 100% still relevant because, under the hood, AI is just firing off Google searches for top-ranked sites to backfill into synthesized answers and links (Read: Google Downplays GEO – But Let’s Talk About Garbage AI SERPs). OpenAI was recently hiring a content strategist who can lean into SEO (not GEO), which some say demonstrates that even OpenAI is focused on traditional SEO.

Optimization Is No Longer Just Google

Manick Bhan (LinkedIn profile), founder of the Search Atlas SEO suite, offered an interesting take on why we may be transitioning to a divided SEO and GEO path.

Manick shared:

“SEO has always meant ‘search engine optimization,’ but in practice it has historically meant ‘Google optimization.’ Google defined the interface, the ranking paradigm, the incentives, and the entire mental model the industry used.

The challenge with calling GEO a ‘sub-discipline’ of SEO is that the LLM ecosystem is not one ecosystem, and Google’s AI Mode is becoming a generative surface itself.”

Manick asserts that there is no one “GEO” because each of the AI search and answer engines uses different methodologies. He observed that the underlying tactics remain the same, but “the interface, the retrieval model, and the answer surface” are all radically changed from anything that’s come before.

Manick believes that GEO is not SEO, offering the following insights:

“My position is clear: GEO is not just SEO with a fresh coat of paint, and reducing it to that misses the fundamental shift in how modern answer engines actually retrieve, rank, and assemble information.

Yes, the tactics still live in the same universe of on-page and off-page signals. Those fundamentals haven’t changed. But the machines we’re optimizing for have.

Today’s answer engines:

  • Retrieve differently,
  • Fuse and weight sources differently,
  • Handle recency differently,
  • Assign trust and authority differently,
  • Fan out queries differently,
  • And incorporate user behavior into their RAG corpora differently.

Even seemingly small mechanics — like logit calibration and temperature — produce practically different retrieval outputs, which is why identical prompts across engines show measurable semantic drift and citation divergence.

This is why we’re seeing quantifiable, repeatable differences in:

  • Retrieved sources,
  • Answer structures,
  • Citation patterns,
  • Semantic frames,
  • And ranking behavior across LLMs, AI Mode surfaces, and classical Google results.

In this landscape, humility and experimentation matter more than dogma. Treating all of this as ‘just SEO’ ignores how different these systems already are, and how quickly they’re evolving.”

It’s Clear We Are In Transition

Maybe one of the reasons for the anti-GEO backlash is that there is a loud contingent of agencies and individuals who have very little experience with SEO, some fresh out of college. And it’s not their lack of experience that gets some SEOs into ranting mode. It’s the things they purport are GEO/AEO that are clearly just SEO.

Yet, as Manick of Search Atlas pointed out, AI search and chat surfaces are wildly different from classic search, and it’s closing one’s eyes to the obvious to deny that things are different and in transition.

Featured Image by Shutterstock/Natsmith1

Web Governance As A Growth Lever: Building A Center Of Excellence That Actually Works via @sejournal, @billhunt

In every digital transformation I’ve consulted on, from global banks to manufacturing giants, the failure point isn’t usually the strategy. It’s the governance.

Strategy defines where to go. Operations define how to get there. Governance is what keeps everyone moving in the same direction, at the same speed, without crashing into each other.

In my earlier Search Engine Journal articles, we built the foundation for this discussion.

This article closes the loop. Because until governance and accountability take hold, every strategy, no matter how visionary, remains a PowerPoint slide.

Governance As Guardrails For Growth

Governance has a branding problem. Too often, it’s mistaken for red tape – a set of rules designed to slow things down. In reality, good governance is what lets organizations move faster without flying apart. It’s a system of guardrails, not gates – a shared framework that protects creativity by keeping it aligned with purpose.

When done right, governance is the difference between freedom and anarchy. It ensures that every team (design, dev, content, and analytics) can innovate confidently within an agreed-upon structure of trust, compliance, and clarity.

Governance doesn’t limit autonomy; it enables responsible autonomy.

The most effective Centers of Excellence build their governance around three principles:

  1. Guardrails, not barriers – Standards prevent rework and confusion, not creativity.
  2. Enablement through clarity – When expectations are clear, teams spend less time negotiating and more time executing.
  3. Evolution, not enforcement – Governance must adapt with technology, markets, and now, AI systems.

This turns governance into a living framework – one that scales excellence, accelerates innovation, and protects enterprise value simultaneously.

The Cast Of Characters: Who Belongs In A Modern Center Of Excellence

A true Center of Excellence (COE) isn’t a department; it’s an alignment mechanism.

Its power lies in uniting diverse roles around shared definitions of value, performance, and accountability.

  • Business Leadership (CEO, CFO, CMO) – Primary focus: direction, metrics, incentives. Key question they answer: “Are our digital assets creating measurable enterprise value?”
  • Digital Operations (CTO, DevOps, Product) – Primary focus: infrastructure, scalability, uptime. Key question they answer: “Can we deploy and measure at scale without friction?”
  • Marketing & Experience (SEO, UX, Content, CX) – Primary focus: discoverability, usability, trust. Key question they answer: “Is our content findable, credible, and consistent across markets?”
  • Data & AI Enablement (Analytics, Schema, AI Strategy) – Primary focus: structuring and measuring the data layer. Key question they answer: “Can machines – and humans – understand our brand at every level?”

An effective COE sits at the crossroads of these groups. It translates corporate objectives into digital guardrails, workflows, and shared KPIs.

And it does so through clarity of ownership – who decides, who executes, and who is accountable for outcomes.

Without that alignment, teams drift into the ownership gap I outlined in “Who Owns Web Performance?,” each optimizing their own slice while the organization loses system-level performance.

Anatomy Of A Working Center Of Excellence

A COE that works isn’t a poster on the wall but an ecosystem built around five components:

  1. Vision & Mandate – A clearly articulated purpose with executive sponsorship. Governance without mandate becomes optional. Tie the COE to measurable outcomes – revenue efficiency, cost avoidance, and risk reduction.
  2. Standards & Playbooks – Codified frameworks for content hierarchy, tagging, schema, and AI readiness. Standards remove friction when they’re written for usability, not perfection.
  3. Measurement & Accountability – Shared dashboards connecting digital KPIs to business KPIs. The CEO shouldn’t ask, “How’s SEO?” but “What’s the digital contribution to EBITDA?”
  4. Enablement & Knowledge Sharing – Training, automation, and playbooks that make compliance the natural outcome of good work, not an afterthought.
  5. Feedback & Evolution – Regular audits and retrospectives to ensure standards evolve as the technology – and the company – does.

A COE that only publishes rules is a library.

A COE that enforces and evolves them is a growth engine.

Effective governance transforms from control to enablement when standards become self-reinforcing. Instead of asking, “Did we follow the rules?” teams ask, “Do the rules help us move faster and smarter?” That’s the culture shift a Center of Excellence exists to create.

Corporate Judo: Turning Structure Into Strength

In “Epiphany 2 — Leverage Corporate Judo,” I wrote that the secret to lasting change isn’t fighting the system – it’s using its momentum. You don’t overpower corporate structure; you redirect it.

“The art of corporate judo is learning to use the organization’s own weight to create forward motion.”

Web governance works the same way. Rather than viewing process and policy as obstacles, a skilled COE converts them into leverage – turning approvals, reporting lines, and compliance requirements into tools for acceleration. A well-designed COE doesn’t rebel against structure; it channels it toward growth.

In this sense, governance becomes corporate aikido by absorbing friction and transforming it into alignment.

Cross-Channel Alignment: The Prerequisite For Performance

Before you can optimize, you must align. The most advanced analytics stack or SEO roadmap will fail if the organization itself is out of sync.

A functioning COE creates connective tissue between:

  • Search & Content – shared definitions of topics, authority, and metrics.
  • UX & Engineering – balance between design freedom and structural consistency.
  • Marketing & Analytics – unified measurement across paid, earned, and owned.
  • Corporate & Regional Teams – global templates with local flexibility.

In multinational environments, this alignment prevents the “geo-targeting misalignment” I’ve written about – where the wrong market page ranks, or translation replaces true localization. The COE becomes the referee between global efficiency and local relevance.

Why This Matters Even More In The AI Era

AI has raised the stakes for governance.
In the old world, poor governance hurt rankings.
In the new world, it hurts eligibility.

Search-grounded AI systems like Google’s AI Overviews and Bing Copilot rely on structured, accessible, and authoritative data to decide what’s trustworthy enough to include. If your schema, content, or infrastructure is inconsistent, the machine can’t reconcile your brand – and when it can’t reconcile, it omits you.

If SEO was about visibility, AI is about eligibility – and eligibility depends on governance.

As I argued in “Stop Retrofitting. Start Commissioning: The New Role of SEO in the Age of AI,” the role of SEO and, by extension, digital governance has shifted from a reactive fix to a proactive design function. SEO is no longer the cleanup crew that patches gaps after launch. It must become the Commissioning Authority, the group that ensures what’s being built meets both user and machine standards before it ever goes live.

Governance, in this new context, isn’t back-office oversight. It’s front-office enablement.

It ensures that every digital asset – content, structure, schema, and technical architecture – is commissioned for machine interpretation, not just human readability.

Because in today’s AI-first ecosystem, the question isn’t simply, “Can users find us?” It’s “Can machines trust, understand, and use us?”

“The era of being brought in after launch is over.
Governance – and SEO – must move upstream to where strategy and systems are conceived.”

Good governance isn’t a final check; it’s a design ethos. It transforms your organization from retrofitting performance to commissioning excellence.

And that shift from reactivity to readiness is what separates the brands that survive AI disruption from the ones that silently vanish from the conversation.

Governance As Digital Operating Leverage

Governance may not sound glamorous, but it’s the lever that compounds returns across every other investment.

  • Revenue Growth – Faster launches, better discoverability, consistent brand experience.
  • Cost Efficiency – Reduced rework, redundant tools, and duplicated content.
  • Capital Efficiency – Shared systems and reusable frameworks across markets.
  • Risk & Resilience – Compliance, uptime, and data consistency.
  • Innovation & Optionality – Guardrails that enable safe experimentation with AI and automation.

In financial terms, governance converts digital activity into operating leverage by increasing output without proportionally growing cost. This means your overall Web Effectiveness is a shareholder issue, not a marketing one. Governance is how you turn that theory into muscle.

The Leadership Imperative

Ultimately, governance fails when it’s delegated. A COE can’t succeed without executive willpower and cross-functional buy-in.

The CEO owns shareholder value.
The CMO owns demand.
The CTO owns systems.
But the COE owns the connection between them.

If your website is the factory, your COE is the operations manual that keeps it producing value – efficiently, predictably, and at scale.

Web governance isn’t a brake pedal; it’s a steering system. It creates the clarity and confidence that allow innovation to scale safely. It’s how large organizations protect creativity without chaos — and how they turn complexity into compound value.

In the age of AI, alignment isn’t optional. Governance is growth.

Featured Image: ImageFlow/Shutterstock