Your Owned Content Is Losing To A Stranger’s Reddit Comment via @sejournal, @DuaneForrester

The next time you ask an AI what product to buy, which agency to hire, or which software platform actually works, pay attention to where the answer comes from. Increasingly, it does not come from the vendor’s own website. It comes from a stranger’s Reddit comment written eighteen months ago, upvoted 847 times by people who tried the thing themselves.

This is not an accident. It’s architecture.

The Reddit Effect

The financial architecture behind Reddit’s presence in AI answers became public in early 2024. Google signed an initial licensing agreement with Reddit worth a reported $60 million per year, with total disclosed licensing across multiple AI companies reaching $203 million. That arrangement gave Google real-time access to Reddit’s posts and comments for training its AI models and powering AI Overviews, and the terms are now being renegotiated upward. Reddit executives have said current agreements undervalue the platform’s discussions, which now fuel everything from ChatGPT to Google’s generative answers.

The citation data confirms how central Reddit has become. Between August 2024 and June 2025, Reddit was the most cited domain in both Google AI Overviews and Perplexity, and the second most cited source in ChatGPT, trailing only Wikipedia. In Google’s AI Overviews specifically, Reddit citations grew 450% between March and June 2025. A separate study from early 2024 found Reddit appearing in results more than 97% of the time for queries related to products and reviews.

Reddit’s visibility in traditional search has fluctuated over this period, with organic rankings dropping noticeably in early January 2025. But its foothold in the AI answer layer has proven more durable than its SERP position, because these are different systems pulling from the same data source. Reddit’s hold on the AI layer reflects something structural about the content itself, not just a licensing arrangement.

Why Community Signals Work For AI

To understand why community platforms have become load-bearing infrastructure for AI answers, you need to hold two ideas at once.

First, community signals enter AI systems through two distinct pathways, not one. In the parametric pathway, community content gets baked into model weights during training and becomes part of what the model knows before anyone types a query. In the retrieval pathway, community content gets pulled in real time through retrieval-augmented generation (RAG) when the model needs current, specific, or contested information. Brands absent from community platforms before a model’s training cutoff face a significantly harder problem than brands simply absent from recent crawls. They are invisible at both layers simultaneously.
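To make the second pathway concrete, here is a deliberately toy sketch of retrieval, assuming a tiny corpus of community threads and a crude lexical scorer (real systems use embeddings and far larger indexes). The vendor names, documents, and query are invented for illustration only.

```python
# Toy sketch of the retrieval pathway (RAG): the answer can only cite what the
# retriever finds in its corpus. Everything below is a simplified assumption.
from collections import Counter

community_corpus = [
    "Thread: best CRM for a 10-person agency? Most replies recommend VendorA.",
    "Review: VendorA support is slow, but the reporting is excellent.",
    "Thread: VendorB vs VendorC for small teams; the accepted answer prefers VendorB.",
]

def overlap_score(query: str, doc: str) -> int:
    # Real systems use embeddings; lexical overlap is enough to show the mechanism.
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    return sum((q & d).values())

def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    return sorted(corpus, key=lambda doc: overlap_score(query, doc), reverse=True)[:k]

# A brand with no community footprint never enters the retrieved context,
# no matter how strong its owned content is.
print(retrieve("best crm for a small agency", community_corpus))
```

The mechanism is the point: if a brand never appears in the corpus the retriever searches, no amount of owned-content optimization puts it into the context the model answers from.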

Second, the quality filtering that community platforms apply, through upvotes, accepted answers, reply chains, and sustained engagement, functions as a proxy signal that training pipelines have learned to weight. OpenAI’s training data hierarchy explicitly places Reddit content with three or more upvotes at Tier 2, directly below Wikipedia and licensed publisher partners. A heavily upvoted Reddit thread is treated as more credible input than most published content on the open web, because it carries the accumulated validation of hundreds or thousands of independent human judgments.

When multiple independent voices converge on the same recommendation across a thread, that convergence pattern looks different to a retrieval system than a single authoritative publication making the same claim. It is the AI equivalent of a strong link graph, distributed and uncoordinated agreement that no single actor manufactured. About 48% of AI citations now come from community platforms like Reddit and YouTube, with 85% of brand mentions originating from third-party pages rather than owned domains. The model is telling you something about where it trusts the signal.

The Manipulation Risk

Any system that rewards community consensus will attract people who want to manufacture it, and this one is no exception. The SEO parallel is exact: The same logic that made link spam profitable for decades is now making fake community engagement attractive to anyone who understands how AI systems weigh these signals.

The Trap Plan incident in late 2025 is the clearest recent case study. A marketing firm posted approximately 100 fake organic comments promoting a game on Reddit, then published a blog post documenting the campaign’s approach. The screenshots circulated everywhere. The post was ultimately deleted, but the reputational damage was not. A thread naming the company indexed in Google and sat in search results alongside legitimate coverage, visible to every potential customer searching the brand.

The detection infrastructure is more robust than in the early link spam era. Reddit’s automated systems flag coordinated inauthentic behavior through patterns in posting timing, account age, karma accumulation, and comment structure, and moderator communities actively watch for coordinated campaigns. The community itself maintains a strong norm against manufactured consensus, and the backlash when a campaign is exposed tends to be proportional to how authentic it claimed to be.

There is also a structural dimension that goes beyond individual campaigns. Research by Originality.ai found that 15% of Reddit posts in 2025 were likely AI-generated, up from 13% in 2024. That is not just brands gaming the system. It is a broader contamination of the community signal itself, creating a feedback loop where AI trains on Reddit content that increasingly contains AI-generated material designed to look like human consensus. The argument for building authentic community presence now, before detection systems become more aggressive about filtering synthetic signals, is a strategic one, not a moral one. Manufactured signals degrade faster than authentic ones, and the penalty when they collapse is worse than the benefit while they worked.

What Brands Should Actually Do

The practical implication is not “post more on Reddit.” It is more precise than that.

Monitor brand mentions across Reddit, Stack Overflow, Quora, and review platforms not as a reputation exercise but as entity intelligence. The narrative that forms in community discussions, the specific language, the repeated associations, the persistent objections, is the narrative AI systems are more likely to reproduce than anything on your own website. If community threads consistently describe your enterprise product as “great for small teams,” that characterization will surface in AI answers regardless of how your positioning page reads.

Ensure subject matter experts are participating in relevant communities under their real identities, contributing answers to questions they actually know well. The upvote accumulation those answers generate is a durable quality signal that persists across training cycles. One genuinely helpful response in a relevant technical subreddit or a well-supported Stack Overflow answer does more long-term structural work than ten pieces of owned content, because it carries community validation that owned content cannot provide.

Create content that community members actively want to reference. Original research, specific benchmarks, documented case studies with real numbers, these are the formats that generate organic community citations, which in turn generate the kind of third-party mentions that AI systems treat as consensus rather than marketing. A practical rule of thumb that holds in community engagement generally: 80% of participation should contribute genuine value with no promotional intent, and the 20% that mentions your product should only appear when it is the honest answer to the question being asked.

Think of community presence as a context moat with a long construction timeline. Unlike most marketing assets, authentic community reputation compounds slowly and is genuinely difficult for competitors to replicate quickly. A brand that has been a good-faith participant in its relevant communities for two years has something that cannot be acquired in a quarter.

The Review Layer

Most brands managing reviews understand that aggregate star ratings affect purchase decisions. Fewer understand that the specific review content, the language customers use, the features they praise or criticize, the comparisons they draw to competitors, is increasingly the raw material for how AI describes your brand at the moment of recommendation.

The numbers make the stakes concrete. Domains with profiles on review platforms have three times higher chances of being chosen by ChatGPT as a source compared to sites without such presence. In a G2 survey of B2B software buyers in August 2025, 87% reported that AI chatbots are changing how they research products, and half now start their buying journey in an AI chatbot rather than Google, a 71% increase in just four months. When a procurement director asks an AI to recommend CRM options for a 50-person team, the answer draws from review platform content, not from vendor websites.

Here is where the landscape shifts in a way that most review management programs have not caught up with yet. Not all review platforms are accessible to AI retrieval systems, and the differences are significant.

A June 2025 analysis of 456,570 AI citations found that review platforms divide into three distinct categories based on crawler access policies. Platforms like Clutch and SourceForge allow full crawler access, and their content surfaces regularly in AI-generated answers. Platforms like G2 and Capterra operate with selective access that permits some retrieval. Major platforms (Yelp is an example) block AI crawlers at the robots.txt level, which means reviews written there, however numerous or positive, are structurally unavailable to AI retrieval at the point of recommendation.

The citation data reflects this directly. For Perplexity, 75% of review site citations in the software category come from G2. Clutch dominates AI citations in the agency and digital services category. The market prominence of a review platform and its accessibility to AI crawlers are different variables, and review management strategy that conflates them is directing effort toward platforms where the AI visibility signal cannot be retrieved regardless of review volume.

This is not an argument that major platform reviews are worthless. They still matter significantly for direct consumer decision-making, traditional search, and brand reputation overall. It is an argument that the AI visibility value of a review depends specifically on whether the platform permits retrieval, and that understanding has material consequences for where teams prioritize cultivating review volume when AI answer visibility is the goal.

One additional layer of complexity: robots.txt compliance among AI crawlers is not guaranteed. Analysis by Tollbit found that 13.26% of AI bot requests ignored robots.txt directives in Q2 2025, up from 3.3% in Q4 2024. The boundary between “blocked” and “accessible” is not as clean in practice as it is in policy. The implication is to treat your entire review footprint as potentially accessible to AI retrieval while being deliberate about which platforms receive active cultivation for AI visibility specifically.
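If you want to check the stated policy for a given platform yourself, Python's standard library can read a robots.txt file and report which AI crawlers it permits. This is a minimal sketch: the domain is a placeholder, and the user-agent list is a small assumed sample of commonly cited AI crawlers, not an exhaustive one.

```python
from urllib.robotparser import RobotFileParser

# Illustrative sample of AI crawler user agents; extend as needed.
AI_CRAWLERS = ["GPTBot", "PerplexityBot", "Google-Extended", "ClaudeBot"]

def check_ai_access(site: str, path: str = "/") -> dict:
    """Return {user_agent: allowed?} based on the site's published robots.txt rules."""
    parser = RobotFileParser()
    parser.set_url(f"https://{site}/robots.txt")
    parser.read()  # fetches and parses robots.txt
    return {agent: parser.can_fetch(agent, f"https://{site}{path}") for agent in AI_CRAWLERS}

if __name__ == "__main__":
    # Placeholder domain; swap in the review platforms you actually cultivate.
    print(check_ai_access("example-review-platform.com", "/reviews/"))
```

As the Tollbit numbers above suggest, this only tells you the declared policy, not what every crawler actually does.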

The Broader Picture

Community presence has always been a trust signal. What has changed is that the systems making purchase recommendations at scale are now reading those signals directly, at the platform level, and weighting them above the content brands produce about themselves.

SEO professionals who have spent years optimizing owned content for search visibility now face a layer of visibility that operates on fundamentally different inputs. The link-building parallel is not rhetorical. Just as the profession eventually accepted that links from authoritative external sources outweigh on-page optimization in many contexts, the community signal layer is demonstrating the same dynamic for AI-generated answers. Authority comes from outside the brand’s control, which means the work of building it looks less like content production and more like sustained, authentic participation in the places where buyers actually talk.

The brands that start building authentic community presence now are constructing a signal that compounds. Genuine community reputation is difficult to manufacture at scale, genuinely difficult for competitors to replicate quickly, and structurally favored by the same AI systems that are increasingly the first stop in the purchase journey. Later entrants will find it expensive to match.

If you want to learn more about topics like these, take a look at my newest book on Amazon: The Machine Layer: How to Stay Visible and Trusted in the Age of AI Search. It’s written to help you not only understand the topics I write about here, but also to help you learn more about LLMs and consumer behavior, build ways to grow conversations within your organization, and can serve as a workbook with multiple frameworks included.



Featured Image: ginger_polina_bublik/Shutterstock; Paulo Bobita/Search Engine Journal

Why Your New SEO Vendor Can’t Build on a Broken Foundation via @sejournal, @TaylorDanRW

There is a common expectation in SEO that needs to be challenged, and it usually appears as soon as a new agency or consultant takes over performance.

Many businesses assume that fresh expertise should lead to quick wins, as if changing vendors resets everything and removes the issues that held performance back before. This view ignores how search works and overlooks the lasting impact of previous decisions.

Quick wins can still exist, but they should be seen as small steps rather than complete solutions. Changes such as improving page titles, updating content, or fixing isolated issues can lead to short-term improvements, but they do not address deeper problems.

Relying too heavily on these quick fixes can create a false sense of progress while leaving the core issues unresolved.

Inheriting History Is Never Starting Fresh

A new SEO vendor does not start with a clean slate, and they are never working in isolation from what came before. They inherit the full history of the website, including past strategies, technical decisions, and content choices, whether those were effective or not. That inherited position becomes the real starting point, and in many cases, it is far more restrictive than stakeholders expect.

Poor SEO does not simply fail to deliver results; it often creates long-term problems that take time to fix. If a site has built up low-quality backlinks, published thin or duplicated content, or ignored technical issues, it develops a reputation that search engines take into account. This means that even strong improvements can take time to show results, as they must first counterbalance what already exists.

The impact of past decisions tends to build over time, shaping how a domain is viewed and ranked. Practices such as buying links at scale, creating large volumes of low-value pages, or focusing only on short-term gains often leave a lasting footprint. Search engines respond to this by becoming more cautious, which affects not just old content but also any new work that is introduced.

Technical debt is another major factor that often goes unnoticed until a new vendor begins to investigate properly. Many websites grow over time without clear structure or oversight, leading to issues such as broken internal links, inefficient crawl paths, duplicate content, and problems with how pages are rendered. These issues directly affect how search engines access and understand a site, which makes them a priority to fix before growth can happen.

Stabilization Comes Before Growth

The early stages of a new SEO engagement are usually focused on stabilizing the site rather than driving immediate growth. This involves detailed audits, identifying crawl and indexation issues, and making sure important pages are accessible and prioritized correctly. Although this work is essential, it does not always lead to instant improvements in rankings, which can be frustrating if expectations are not aligned.

Rebuilding trust is another key part of the process, especially if a site has used poor practices in the past. Search engines are designed to reward consistency and reliability over time, and trust cannot be restored through quick changes. It requires steady improvements in content quality, link profile, and overall site structure, supported by signals that show genuine value to users.

Brand Strength As A Limiting Factor

Brand strength also plays a larger role than many businesses realize, and its absence can limit SEO performance even when technical issues are addressed. Websites with little presence outside their own domain, few mentions across the web, and low branded search demand often struggle to compete. Search engines look for signals that a brand is recognized and trusted, which means visibility beyond the site itself matters.

A lack of investment in brand building creates additional work for any new SEO vendor, as they may need to introduce digital PR, content promotion, and strategies that increase visibility across relevant platforms. These efforts take time to build momentum and rarely deliver immediate results, which reinforces the need for a longer-term view.

Frustration often comes from the fact that much of the early work is not visible in the form of traffic or ranking gains. Audits, clean-up work, and structural improvements are not always obvious to stakeholders, but they are necessary to remove the barriers that limit performance. Without this foundation, any gains from new content or links are likely to be limited.

Accountability And Communication

Accountability still matters, and a new SEO vendor should be clear about what they are doing and why. They should explain the starting position, set realistic timelines, and outline the steps needed to improve performance. Clear communication helps build trust and ensures that stakeholders understand what progress looks like at each stage.

Realistic timelines are often longer than businesses expect, especially when there are significant issues to address. The first few months are usually focused on fixing problems and improving site health, followed by a period where early signals begin to improve. More noticeable growth in rankings and traffic often comes later, once the foundation is stronger and search engines respond to consistent improvements.

A shift in mindset is needed to get the most from SEO, moving away from the idea of quick fixes towards a more long-term approach. Past decisions, whether they involve shortcuts or a lack of investment, shape current performance and cannot be undone instantly. Accepting this reality allows businesses to focus on building sustainable growth rather than chasing immediate results.

The Long-Term View

Bringing in a new SEO vendor should be seen as the start of a process rather than the end of a problem. The best outcomes come from understanding the starting point, investing in the necessary work, and staying consistent over time. This approach creates the conditions for steady improvement rather than short bursts of activity that do not last.

The key point is that SEO reflects both history and current effort, and ignoring the past leads to unrealistic expectations. A new vendor can bring structure, expertise, and direction, but they cannot remove the impact of previous actions overnight. What they can do is build a stronger position over time, provided they are given the space and support to do it properly.



Featured Image: Anton Vierietin/Shutterstock

Google’s CEO Predicts Search Will Become An AI Agent Manager via @sejournal, @martinibuster

In a recent interview, Google’s CEO, Sundar Pichai, explained how search is changing in response to advances in AI. The discussion centered on a simple question: If AI can act, plan, and execute, then what role will search play in the future?

Information Queries May Become Agent AI Search

The interviewer asked whether search remains a product or becomes something else as AI systems begin handling tasks instead of returning results.

They asked:

“What do you view as a future of search? Is it a distribution mechanism? Is it a future product? Is it one of N ways people are going to interact with the world?”

Had Pichai been interviewed by members of the publishing and SEO community, his answer might have received some pushback. He answered that search does not get replaced, but continues to expand as new capabilities are introduced and user expectations change.

He said:

“I feel like in search, with every shift, you’re able to do more with it.

And we have to absorb those new capabilities and keep evolving the product frontier.

If it’s mobile, the product evolved pretty quickly, you’re getting out of a New York subway, you’re looking for web pages, you want to go somewhere, how do you find it? So you’re constantly shifting, people’s expectations shift, and you’re moving along.

If I fast forward, a lot of what are just information seeking queries will be agentic search. You will be completing tasks, you have many threads running.”

In the first example of a person coming out of a New York subway, yes, someone may search for a web page, but will Google show the user a web page or treat it like data by summarizing it?

The second example completely removes the user from search and inserts agents in the middle. That scenario implicitly treats web pages as data.

Will Search Exist In Ten Years?

Pichai was asked what the future of search will be like in ten years. His answer suggests that the future of search will involve many information-seeking queries being handled as tasks carried out by agentic AI systems. Furthermore, search will be more like an orchestration layer that sits between the user and AI agents.

The exact question he was asked is:

“Will search exist in ten years?”

Google’s CEO responded:

“It keeps evolving. Search would be an agent manager, right, in which you’re doing a lot of things.

I think in some ways, I use anti-gravity today, and you have a bunch of agents doing stuff.

And I can see search doing versions of those things, and you’re getting a bunch of stuff done.”

At this point, the interviewer tried to get Pichai to return to the question of the actual search paradigm, if that will exist in ten years. Pichai declined to expressly state whether the search paradigm will still exist.

He continued his answer:

“Today in AI mode in search, people do deep research queries. So that doesn’t quite fit the definition of what you’re saying. But kind of people adapted to that.

So I think people will do long-running tasks, can be asynchronous.”

What he described is a version of search that manages actions across multiple steps, where multiple processes can run at once instead of returning a single set of ranked results. And yet, it’s weirdly abstract because he’s talking about queries but fails to mention websites or web pages in that specific context.

What’s going on? His next answer brings it into sharper focus.

Who Is The Flea And Who Is The Dog?

The interviewer picked up on Pichai’s mention of adaptation, made an analogy to evolution, and then asked:

“It’s almost like, does that former version or paradigm eventually go away? And what was search becomes an agent and your future interface is an agent, and the search box in ten years or n years is no longer the–“

Pichai interrupted the interviewer to say that it’s no longer possible to look ahead five or ten years because the models are changing, what people do is rapidly changing, and given that pace, the only thing to do is to embrace it.

He explained:

“The form factor of devices are going to change. I/O is going to radically change. And so …I think you can paralyze yourself thinking ten years ahead. But we are fortunate to be in a moment where you can think a year ahead, and the curve is so steep. It’s exciting to just do that year ahead, right?

Whereas in the past, you may need to sit and envision five years out, unlike the models are going to be dramatically different in a year’s time.

…I think it’ll evolve, but it’s an expansionary moment. I think what a lot of people underestimate in these moments is, it feels so far from a zero-sum game to me, right? The value of what people are going to be able to do is also on some crazy curve, right?

I think the more you view it as a zero-sum game, it looks difficult. It can become a zero-sum game if you’re innovating or the product is not evolving.

But as long as you’re at the cutting edge of doing those things, and we’re doing both search and Gemini, and so they will overlap in certain ways. They will profoundly diverge in certain ways, right? And so I think it’s good to have both and embrace it.”

What Google’s CEO is doing is rejecting the prospect of obsolescence by deliberately focusing on competitive agility and embracing uncertainty as a strategic advantage.

That might work for Google, but what about websites?

I think businesses also need to embrace competitive agility and get out of the mental attitude of fleas on the dog. And yet, online businesses, publishers, and the SEO community are not fleas because Google itself is the one feeding off the web’s content.

What About Websites?

The interview lasted for over an hour, and at no point did Pichai mention websites. He mentioned web pages twice, once as something to understand with technology and once in the example of a person emerging from a subway who is looking for a web page. In both of those instances, the context was not Google Search looking for or fetching a web page in response to a query.

Given that Google Search is used by billions of people every day, it’s a bit odd that websites aren’t mentioned at all by the CEO of the world’s most successful search engine.

Google Confirms March 2026 Core Update Is Complete via @sejournal, @MattGSouthern

Google’s March core update has finished rolling out, according to the Google Search Status Dashboard.

The dashboard updated at 6:12 AM PDT on April 8 with the completion note: “The rollout was complete as of April 8, 2026.” The update began on March 27 at 2:00 AM PT, making the total rollout 12 days.

That’s within Google’s original two-week estimate and faster than the December 2025 core update, which took 18 days.

What Google Said About This Update

Google called the March 2026 core update “a regular update designed to better surface relevant, satisfying content for searchers from all types of sites.”

The company didn’t publish a companion blog post or announce specific goals for this update. It also didn’t share new guidance with the completion notice.

Core updates involve broad changes to Google’s ranking systems. They aren’t targeted at specific types of content or policy violations. Pages can move up or down based on how the update reassesses quality across the web.

Three Updates In One Month

March was unusually active for Google’s ranking systems. The core update was the third confirmed update in roughly five weeks.

The February Discover core update finished rolling out on February 27 after 22 days. That was the first time Google publicly labeled a core update as Discover-only.

The March 2026 spam update rolled out and completed in under 20 hours on March 24-25. That was the shortest confirmed spam update in the dashboard’s history.

The core update followed two days later on March 27.

Roger Montti, writing for Search Engine Journal, noted that the spam-then-core sequencing may not have been a coincidence. He wrote that spam fighting is logically part of the broader quality reassessment in a core update, comparing it to “clearing the table” before recalibrating the core ranking signals.

How The Rollout Compared To Recent Core Updates

The March rollout was the second-shortest of the past five broad core updates.

Only the December 2024 update finished faster.

Why This Matters

The completed rollout means you can now compare pre-update and post-update performance in Search Console across a full window. Google recommends waiting at least one full week after completion before drawing conclusions from the data.

Your baseline period should be the weeks before March 27, compared against performance after April 8. Keep in mind that the March spam update completed on March 25, so any ranking changes between March 24-27 could be from either update.
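If you keep daily exports of Search Console performance data, that before/after comparison can be scripted. This is a rough sketch under assumptions: the file name and the "date" and "clicks" column names come from a hypothetical CSV export, and the cutoff dates are the ones described above.

```python
import pandas as pd

# Hypothetical daily Search Console export with "date" and "clicks" columns.
df = pd.read_csv("search_console_performance.csv", parse_dates=["date"])

baseline = df[df["date"] < "2026-03-27"]["clicks"].mean()  # weeks before the rollout began
post = df[df["date"] > "2026-04-08"]["clicks"].mean()      # after the rollout completed

print(f"Average daily clicks before: {baseline:.0f}, after: {post:.0f}")
```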

A drop in rankings after a core update doesn’t mean your site violated a policy. Core updates reassess content quality across the web, and some pages move up while others move down.

Looking Ahead

Google will likely continue making smaller, unannounced core updates between the larger confirmed rollouts. The company updated its core updates documentation in December to say that smaller core updates happen on an ongoing basis.


Featured Image: Rohit-Tripathi/Shutterstock

GEO Was Invented On Sand Hill Road

I’ve been putting this one off.

Not because the argument is hard to make – it isn’t – but because the behavior it’s about has been a fixture of the SEO industry for as long as I’ve worked in it. The shiny new object arrives, the FOMO kicks in, the conference decks update, and an entire professional class reshuffles its vocabulary to match whatever acronym landed that quarter. I wrote recently about how AI content scaling is just content spinning with better grammar – the tools change, the qualitative wall doesn’t. The acronym cycle runs on the same engine.

But this time, the shiny object didn’t emerge from practitioners observing a genuine shift and trying to name it. It was manufactured upstream – by venture capital, amplified by engagement farming, and adopted by professionals whose primary motivation isn’t “this is real” but “I can’t afford to look like I’m not keeping up.”

So here we are.

The Investment Thesis

In May 2025, Andreessen Horowitz published a blog post titled “How Generative Engine Optimization (GEO) Rewrites the Rules of Search.” It appeared in their enterprise newsletter, written by two a16z partners, Zach Cohen and Seema Amble. Public, on their website, available to anyone with a browser.

The post declared that the “$80 billion+ SEO market just cracked” and that “a new paradigm is emerging.” It name-dropped three GEO tools – Profound, Goodie, and Daydream – as platforms enabling brands to track how they appear in AI-generated responses. It described a future where GEO companies would “fine-tune their own models” and “own the loop” between insight and iteration. a16z promoted it across their social channels, including a post from the firm’s official account: “SEO is slowly losing its dominance. Welcome to GEO.”

Screenshot from X, April 2026

Also: a16z is an investor in Profound.

The blog post creates demand for the category. The category creates demand for the tools. The tools are in their portfolio. A sales funnel with a byline.

Marc Andreessen’s “Software is eating the world” wasn’t just an essay – it was a prospectus dressed in editorial clothing. The GEO post follows the same logic: identify the wave, position your bets as the inevitable response, publish the narrative that makes both feel like settled truth. Even sympathetic coverage noticed. The Alts.co write-up noted plainly that “a16z is drawing attention to GEO because it’s a chance to peddle/pump their own investments.”

What Happens When Nobody Checks The Source

Ten months later, in March 2026, someone on X described the blog post as “a 34-page internal memo” that a16z had “quietly published” and which had received only “200 views.” It cited a specific statistic: portfolio companies ranking No.1 on Google saw “a 34% drop in organic traffic in 12 months.” I’m not interested in the individual. This post is one of hundreds following the same pattern, and the pattern is what matters.

None of this is real.

The blog post isn’t 34 pages. It isn’t internal. It wasn’t quietly published. The specific opening line and the 34% stat don’t appear in the actual piece. You can verify this yourself right now.

This isn’t a16z’s doing. An engagement farmer found an old blog post and repackaged it with fictional scaffolding because that format performs better on social media. A “leaked internal memo” is sexier than a newsletter. “200 views” creates scarcity. Invented statistics create authority.

And it worked. People shared it, built threads around it, didn’t check whether the memo existed. Why would they? The narrative was too good.

Two independent forces – a VC firm doing standard narrative-building, and an engagement farmer doing standard engagement farming – converge on the same result. The VC seeds the category. The farmer, months later, independently amplifies a distorted version. Professionals absorb the distortion because nobody goes back to check the primary source.

Not coordination. Convergence. And a category becomes “real” without anybody establishing that it is.

The Willing Participants

VCs and engagement farmers can’t take all the credit. SEO professionals are the most culpable link in the chain.

One widely-shared post on X captures the mentality – and I’m citing the behavior, not the person, because this position is everywhere in the industry right now. The argument: Clients don’t want to hear that GEO is “just SEO repackaged.” Neither does your executive team. Tell them “it’s just SEO,” and you’ll be “perceived as a legacy outdated thinker.” You might even be “replaced by a GEO agency.” The conclusion: “whether you like it or not… it’s in your best interest to get aboard the AI train.”

Image Credit: Pedro Dias

The argument is not that GEO works. Not that it measures anything meaningful. Not that it produces better outcomes for clients. The argument is that if you don’t adopt the label, you will lose your job.

Ambulance chasing dressed as career advice.

And here’s what makes SEO professionals more culpable than the VCs or the engagement farmers: they don’t just absorb the fear. They market it. They repackage the anxiety about their own relevance and sell it downstream to clients and executives who are even less equipped to evaluate the claims. The VC creates the narrative. The engagement farmer amplifies it. The SEO professional walks into a client meeting and says, “You need a GEO strategy, or you’ll be invisible to AI,” knowing full well they can’t define what that means in terms the client could verify.

This is how SEO professionals undermine their own credibility. Not by being wrong about the technical shift, but by selling certainty they don’t have about a category they didn’t bother to verify, using someone else’s terminology to paper over their own lack of understanding.

Nobody held a gun to anyone’s head and said, “Put GEO on your LinkedIn headline.” SEO professionals are choosing to adopt terminology they haven’t evaluated, from sources they haven’t verified, for tools they can’t validate; and then surfing that same fear factor into client budgets. If the only way you can sell your expertise is by rebranding it every eighteen months, the problem isn’t the label. It’s the confidence.

The people most capable of evaluating whether GEO is a real discipline are the same people adopting it fastest. Every hour they spend chasing the vocabulary is an hour not spent building the understanding that would make them impossible to replace. I’ve written about how AI is hollowing out the junior pipeline: the apprenticeship layer where practitioners actually learn judgment. The acronym treadmill accelerates that. It replaces depth with breadth, understanding with terminology, and professional development with professional performance.

What’s Actually Underneath

Strip away the a16z framing, the fabricated memos, and the professional anxiety, and ask the boring question: what would you actually do differently if you took GEO seriously?

I’ve argued before that grounding is just retrieval: When an AI system cites a source, it’s running a search task, not exercising editorial judgment. Indexing, vector search, relevance scoring. The same principles we’ve been working with for two decades, with a generative interface on top. GEO isn’t a second discipline standing alongside SEO. It’s old retrieval visibility in a trench coat pretending to be two disciplines. And your data interpretation skills – perched comfortably atop Mount Dunning-Kruger – don’t trump the clear, demonstrable logic of how a retrieval engine works. If you can’t explain why a result appeared, you have no business selling a service that claims to optimize for it.

The a16z post itself confirms this, perhaps accidentally. The advice it gives brands pursuing GEO is a greatest hits of SEO best practices: structured content, authoritative backlinks (rebranded as “earned media”), schema markup, topical authority. It even recommends “short, dense, citation-worthy paragraphs” and “specific claims with verifiable numbers” – which is, and I cannot stress this enough, just competent writing.

David McSweeney has been doing SEO since before some of these GEO startups’ founders graduated. He’s spent years writing about the same tactics now being repackaged under the GEO label (content freshness, digital PR, community participation, link building) and has the publication dates to prove it. His summary of the GEO pitch: take advantage of the fact that businesses don’t understand AI systems rely on traditional search, and extract more money from them.

Screenshot from X, April 2026

He called it the grift. I think that’s generous. A grift implies individual con artists. This is structural: a category manufactured at the top, distorted in the middle, and adopted at the bottom. Not because it describes anything new, but because the professional cost of ignoring it feels higher than the professional cost of pretending it’s real.

You’re Not In The Driver’s Seat

Your job as a competent professional is to understand what these abbreviations actually mean, where they come from, and what – if anything – they change about your work.

If you can explain to your clients and your leadership what AI systems actually do, how they retrieve information, what’s genuinely measurable, and what isn’t – you will never be in a reactive position. You will never be the person scrambling to add “GEO” to a slide deck because someone on X told you it was the future.

If instead you let yourself be dragged around by whatever narrative venture capitalists need you to believe this quarter, you will always be reacting. One blog post away from a strategy pivot. Buying tools sold by people who benefit from your insecurity. That’s a choice. Not a fate.

The underlying mechanics of how content gets discovered – search engine crawler, LLM grounding system, RAG pipeline – haven’t undergone a paradigm shift. The interface has shifted. Users get answers synthesized from sources rather than a list of links.

But “the interface changed” doesn’t sell software. “Everything you know is obsolete and you need our dashboard” does.

Follow The Money

a16z benefits because the GEO narrative creates demand for their portfolio companies. The tool startups benefit because the narrative creates their market. The engagement farmers benefit because fabricated memos drive impressions. The agencies that rebrand as “GEO specialists” benefit because they can charge more for the same services with a shinier label.

Who doesn’t benefit? The practitioners doing solid, foundational work. Those people don’t need a new acronym. They need the industry to stop mistaking marketing for methodology.

And the clients. The clients are where the fear chain terminates, and the invoices begin. A new line item for work that should have been happening already under the SEO retainer, or that can’t be reliably measured in the first place. The VC manufactures the category. The SEO professional absorbs it and marks it up. The client pays for it. A game of telephone where the bill lands on the last person in the room who doesn’t speak the language.

I’ve written separately about the measurement problem with these tools – the non-determinism, the gap between parametric and retrieved knowledge, the dashboards built on methodological sand. The tools a16z promotes in that blog post have the same structural limitations. The dashboards look great. The numbers move. Whether the numbers mean anything is a question nobody selling the dashboard has an incentive to answer.

Meanwhile, the actual crisis gets no airtime. Organic search traffic across major U.S. publishers dropped 42% after AI Overviews expanded. Rankings didn’t change. Traffic did. That’s the real problem. Not which three-letter acronym to put on your slide deck, but the fact that the economic model underpinning content production on the open web is breaking. GEO doesn’t address that. It doesn’t even pretend to. It just gives everyone something to be busy with while the floor drops out.

The cycle time is getting shorter. We went from “AEO” to “GEO” in about eighteen months. Give it another year, and there’ll be another acronym, another VC blog post, another fabricated memo, and another round of professionals trying to decide whether the latest three letters are worth putting on their LinkedIn headline.

Or you could just do good work and understand what you’re doing well enough to explain it without borrowed terminology. But I suppose that doesn’t have the same ring to it on a pitch deck.



This post was originally published on The Inference.


Featured Image: Summit Art Creations/Shutterstock

Why Product Feeds Shouldn’t Be The Most Ignored SEO System In Ecommerce

Most ecommerce brands obsess over category pages, backlinks, or product optimizations, while their product feeds remain auto-generated and underoptimized. Product feeds act as the backbone of ecommerce site catalogs and have long been the sole remit of PPC teams, but in the new era of AI Search, this is changing.

Back in 2023, Search Console added enhancements to the Shopping tab Listings report to help brands get a better understanding of how their products were being seen in Merchant Center.

We’ve also seen the emergence of OpenAI’s Product Feed specification as a specific requirement to allow ChatGPT to accurately index and display products. More recently, though, we’ve seen announcements that OpenAI has ended Instant Checkout and is considering new directions.

These changes are pulling product feed visibility directly into the SEO performance ecosystem and aligning it as general “search infrastructure,” not just “ads infrastructure.”

In this article, we’ll be talking you through the value that product feeds can bring to businesses and how SEO aligns with this.

SEO’s Role In Product Feeds

In ecommerce, product feeds are often seen as “set it and forget it” assets, but treating these feeds as simply raw data is an immediate missed opportunity to boost visibility across organic search, shopping, and agentic commerce in the future.

While a standard product feed provides basic data to search bots, an optimized feed enhances attribute accuracy to ensure your products appear for high-intent search queries. By refining your product data, you bridge the gap between technical specs and consumer needs, increasing both visibility and click-through rates.

SEO can help to optimize feeds across four main pillars:

1. Semantic Query Mapping

SEOs don’t just use basic product names. They use consumer language built out of query mapping and intent-matching.

By front-loading titles with high-intent keywords and writing “long-tail” descriptions that include attributes like color, material, or use case, you make products more likely to appear where the user’s intent is highest.

Example:

Instead of “Men’s Waterproof Jacket Black”

SEO Driven Product Feed: “Brand X Men’s Waterproof Running Jacket – Black Lightweight Performance Shell”
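A minimal sketch of how that kind of title can be assembled programmatically from structured attributes, so high-intent modifiers are front-loaded consistently across the catalog. The field names and ordering template are assumptions to adapt to your own feed schema; the sample record mirrors the example above.

```python
# Compose a feed title from structured attributes, front-loading high-intent modifiers.
def build_title(item: dict) -> str:
    parts = [
        item.get("brand"),
        item.get("gender"),
        item.get("attribute"),     # e.g., "Waterproof"
        item.get("use_case"),      # e.g., "Running"
        item.get("product_type"),  # e.g., "Jacket"
        "-",                       # separator before the descriptive tail
        item.get("color"),
        item.get("style"),         # e.g., "Lightweight Performance Shell"
    ]
    return " ".join(p for p in parts if p)

print(build_title({
    "brand": "Brand X", "gender": "Men's", "attribute": "Waterproof",
    "use_case": "Running", "product_type": "Jacket",
    "color": "Black", "style": "Lightweight Performance Shell",
}))
# Brand X Men's Waterproof Running Jacket - Black Lightweight Performance Shell
```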

2. Taxonomy Logic

Taxonomy is important to stop your products from being lost in the void. A misplaced product can quickly become a lost sale.

By refining categorization and product grouping, specific products like “tactical hiking boots” won’t get buried under generic categories like “general footwear.”

Building a logical hierarchy allows algorithms to crawl and understand the catalog with higher confidence about exactly who each product is for. All products within your feed will be automatically assigned a product category.

Keeping your taxonomy accurate and consistent, along with the titles, descriptions, and GTIN information, will help to ensure that products are correctly categorized under the [google_product_category] attribute.
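As a hedged illustration, here is what a single feed record might look like with the site’s own taxonomy and Google’s taxonomy expressed as separate attributes. The field names, GTIN, and category string are placeholders; check the current Google product taxonomy file for valid [google_product_category] values.

```python
# Hypothetical feed record: your own taxonomy and Google's taxonomy live side by side.
feed_item = {
    "id": "SKU-1042",
    "title": "Brand X Men's Waterproof Running Jacket - Black Lightweight Performance Shell",
    "link": "https://www.example.com/jackets/brand-x-waterproof-running",
    "price": "189.99",
    "gtin": "0012345678905",  # placeholder value
    "product_type": "Outdoor > Running > Jackets",  # your site's own taxonomy
    "google_product_category": "Apparel & Accessories > Clothing > Activewear",  # illustrative only
}
```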

3. Structured Data

In Google Shopping, structured data acts as the anchor of “truth” that connects your website to your Merchant Center feed.

Structured data allows Google and other bots to directly pull product data from your HTML, creating a form of automated data validation. If, for example, your feed says a product is $50, but your schema says $60, Google will likely disapprove the listing.

In many cases, high-performing feeds rely on structured data to update price and availability in real time. If you run a flash sale, Google’s crawler can detect the change via schema and update your Shopping Ads, preventing “out of stock” clicks.
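That feed-versus-page consistency is easy to spot-check in code. The sketch below fetches a landing page, pulls the first Offer price out of its JSON-LD, and compares it with the feed value, reusing the hypothetical feed_item fields (link, price) from the earlier example. It assumes requests and BeautifulSoup are available and deliberately ignores edge cases such as @graph blocks or multiple offers.

```python
import json
from typing import Optional

import requests
from bs4 import BeautifulSoup

def schema_price(url: str) -> Optional[str]:
    """Return the first Offer price found in the page's JSON-LD, if any."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    for tag in soup.find_all("script", type="application/ld+json"):
        try:
            data = json.loads(tag.string or "")
        except json.JSONDecodeError:
            continue
        for node in (data if isinstance(data, list) else [data]):
            offers = node.get("offers") if isinstance(node, dict) else None
            if isinstance(offers, dict) and "price" in offers:
                return str(offers["price"])
    return None

def price_mismatch(feed_item: dict) -> bool:
    """True when the on-page schema price disagrees with the feed price."""
    on_page = schema_price(feed_item["link"])
    return on_page is not None and on_page != str(feed_item["price"])
```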

When it comes to agentic commerce, agents will query schema properties to see if your product fits the user’s specific constraints.

Structured data provides hard facts and allows agents to see if a product is “agent-ready” for checkout.

4. Analytical Review

With a highly analytical mind that is always looking for opportunity, SEOs can help identify any “ghost products” and diagnose whether the issues are down to attributes, images, or descriptions, providing ongoing optimization recommendations.

As we move into an era of AI-driven discovery, the quality of a brand’s feed data can quickly become a reflection of a brand’s reputation.

By providing more context within the feed, you are more likely to see your brand get recommended in conversational search and show up in organic shopping.

What Ecommerce Brands Get Wrong With Product Feed Optimization

The majority of issues that we see in product feeds come from inconsistencies and a lack of depth within the feed.

From conversations with brand managers, this seems to stem from a lack of ownership within a channel and a lack of understanding of the impact these inconsistencies can have.

In some cases, feeds can be disapproved for an inaccurate price status caused by inconsistency between the feed and the landing page.

Other common issues include:

  • Auto-generated Shopify titles.
  • No keyword layering.
  • Inconsistent variants.
  • Missing GTIN/MPN.
  • Thin descriptions.
  • Feed data not aligned with on-page SEO.

This is where an SEO’s eye can be vital to product feed performance: someone used to ongoing technical auditing and hygiene maintenance, who understands the value of structured data and of content for context.
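A quick audit script can surface most of the issues listed above before Merchant Center does. This is a sketch under assumptions: it reads a CSV export of the feed, and the column names (id, title, gtin, mpn, description) and the 160-character threshold are placeholders to tune for your own catalog.

```python
import csv
from collections import Counter

def audit_feed(path: str) -> list[str]:
    """Flag missing GTIN/MPN, thin descriptions, and duplicated titles in a CSV feed export."""
    findings = []
    with open(path, newline="", encoding="utf-8") as fh:
        rows = list(csv.DictReader(fh))
    title_counts = Counter(r.get("title", "").strip() for r in rows)
    for r in rows:
        item_id = r.get("id", "?")
        if not r.get("gtin") and not r.get("mpn"):
            findings.append(f"{item_id}: missing GTIN/MPN")
        if len(r.get("description", "")) < 160:  # arbitrary "thin description" threshold
            findings.append(f"{item_id}: thin description")
        if title_counts[r.get("title", "").strip()] > 1:
            findings.append(f"{item_id}: duplicate title (inconsistent variants?)")
    return findings

for finding in audit_feed("product_feed_export.csv"):
    print(finding)
```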

How Product Feeds Directly Impact Organic & AI Visibility

Quite simply, the more context you can provide in your product feed, the more chances you have of being shown or cited in traditional search and in AI engines.

If a product feed is missing critical attributes like size, color, material, compatibility, or use case, the product won’t just rank lower; it will become ineligible for more specific, high-intent queries.

As search queries grow longer and intent becomes more nuanced, e.g., searchers looking for “men’s waterproof trail running jacket black medium” rather than just “men’s trail running jacket,” feeds need to evolve past being simple descriptors.

They need to properly layer structured attributes that mirror how real customers search and filter online. The more complete the product feed, the more opportunities your products have to appear everywhere from Shopping results to AI-generated citations.

What Product Feed Optimization Actually Looks Like

There are a few stages of product feed optimization that SEOs need to be both aware of and able to deliver.

Keyword & Intent Architecture

SEOs should approach product feeds the same way they approach category and content strategy.

Keyword research should be conducted at a product level, identifying high-intent modifiers such as size, material, compatibility, and demographic, and layering those attributes into both product titles and feed data.

Rather than relying on generic exports from Shopify or another ecommerce platform, product titles should reflect real organic search behavior around how customers actually query products.

Structured Data Alignment

SEOs should also make sure that feed attributes match on-page schema.

Keeping a close eye on Merchant Center for any potential issues, such as missing GTINs or prices not matching, and making any necessary adjustments to schema/structured data, will help to ensure that the feed is consistent and context is fully delivered to bots.
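One way to keep them aligned is to generate the on-page JSON-LD from the same record that populates the feed, so price, availability, and GTIN cannot drift apart. A minimal sketch, assuming the hypothetical feed_item fields from the earlier example; the schema.org property names used (Product, Offer, price, priceCurrency, availability) are real, but map the GTIN property to whichever code length you actually use.

```python
import json

def product_jsonld(item: dict) -> str:
    """Render a Product JSON-LD block from the same record that feeds Merchant Center."""
    data = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": item["title"],
        "gtin13": item["gtin"],  # use the gtin property matching your code length
        "offers": {
            "@type": "Offer",
            "price": item["price"],
            "priceCurrency": item.get("currency", "USD"),
            "availability": "https://schema.org/" + item.get("availability", "InStock"),
        },
    }
    return '<script type="application/ld+json">' + json.dumps(data) + "</script>"
```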

Variant Consolidation Strategy

This leans heavily into faceted navigation – which ecommerce SEOs have been battling for years.

By determining when product variations should be grouped under a single parent entity versus given a standalone URL, SEOs can gain more control over unnecessary duplication and cannibalization.

This can also help to protect crawl efficiency across large product catalogs and declutter product feeds.

Feed Health Monitoring

Similar to how SEOs regularly run technical crawls of websites to maintain hygiene and pick up any issues, SEOs should also treat feed governance as part of their regular checks.

This includes actively monitoring feed errors and addressing any Merchant Center issues that might limit visibility.

Prioritizing AI Search Readiness

A large opportunity for the future of search comes with agentic commerce, and product feeds are going to align directly with this.

By ensuring feeds are clearly structured and contain complete and accurate attributes, SEOs can reinforce strong product entity signals and provide clarity, which AI systems rely on to determine what to display in comparisons and recommendations.

Final Thoughts

Product feeds are no longer just paid media assets; they are core search infrastructure that directly impacts organic shopping visibility and AI-driven discovery.

Even the strongest category pages can’t compensate for inconsistent or poorly structured data at scale.

As search becomes more conversational and comparative, structured product clarity is going to be the difference between brands that are cited and brands that are not.



Featured Image: Roman Samborskyi/Shutterstock

Google Says It Can Handle Multiple URLs To The Same Content via @sejournal, @martinibuster

Google’s John Mueller answered a question about duplicate URLs appearing after a site structure change. His response offers clarity about how Google handles duplicate content and what actually influences indexing and ranking decisions.

Concern About Duplicate URLs And Ranking Impact

A site owner had changed the URL structure of their web pages then later discovered that older versions of those URLs were still accessible and appearing in Google Search Console.

The person asking the question on Reddit was concerned that requesting recrawls of the older URLs might confuse Google or lead to ranking issues.

They asked:

“I switched over themes a while back and did some redesign and at some point …I changed all my recipes urls by taking the /recipe/ part out of site.com/recipe/actualrecipe so it’s now just site.com/actualrecipe but there are urls that still work when you put the /recipe/ back in the url.

I went to GSC and panicked that a bunch of my recipes weren’t indexed due to a 5xx error (I think it was when my site was down for a few days).

Now I’ve requested a bunch of them already to be recrawled, but realizing maybe google was ignoring them for a reason, like it didn’t want the duplicates.

Are my recrawl requests for /recipe/ urls going to confuse google who might penalize my ranking for the duplicates?”

The question reflects a reasonable concern that duplicate URLs and content might negatively affect rankings, especially when the error is surfaced through the Search Console indexing reports.

Google Is Able To Handle Duplicate URLs

Google’s John Mueller answered the question by explaining that multiple URLs pointing to the same content do not trigger a penalty or loss of search visibility. He also noted that this kind of duplication is common across the web, implying that Google’s systems are experienced with handling this kind of problem.

He explained:

“It’s fine, but you’re making it harder on yourself (Google will pick one to keep, but you might have preferences).

There’s no penalty or ranking demotion if you have multiple URLs going to the same content, almost all sites have it in variations. A lot of technical SEO is basically search-engine whispering, being consistent with hints, and monitoring to see that they get picked up.”

What Mueller is referring to is Google’s ability to canonicalize a single URL as the one that’s representative of the various similar URLs. As Mueller said, having multiple URLs for essentially the same content is a frequent issue on the web.

Google’s documentation lists five reasons duplicate content happens:

  1. “Region variants: for example, a piece of content for the USA and the UK, accessible from different URLs, but essentially the same content in the same language
  2. Device variants: for example, a page with both a mobile and a desktop version
  3. Protocol variants: for example, the HTTP and HTTPS versions of a site
  4. Site functions: for example, the results of sorting and filtering functions of a category page
  5. Accidental variants: for example, the demo version of the site is accidentally left accessible to crawlers”

The point is that duplicate content happens often on the web and is something Google is able to handle.

Technical SEO Signals

Mueller said Google will pick one URL to keep, but added that the site owner might have preferences. That means Google will canonicalize the duplicates on its own, but the site owner or SEO can still signal which URL is the best choice (the canonical one) for ranking in the search results.

That is where technical SEO comes in. Internal linking, the proper use of rel=”canonical”, sitemap consistency, and consistent 301 redirects all work as hints that help Google settle on the version you actually want indexed.
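For the /recipe/ case in the question, one low-effort hint is making sure every legacy URL answers with a single 301 to its new counterpart, and verifying that in bulk. A small sketch, assuming the requests library and using placeholder URLs modeled on the example paths:

```python
import requests

def check_redirect(old_url: str, expected_url: str) -> str:
    """Confirm a legacy URL returns a single 301 pointing at the new URL."""
    resp = requests.head(old_url, allow_redirects=False, timeout=10)
    if resp.status_code != 301:
        return f"{old_url}: expected 301, got {resp.status_code}"
    location = resp.headers.get("Location", "")
    if location.rstrip("/") != expected_url.rstrip("/"):
        return f"{old_url}: redirects to {location or '(nothing)'}, not {expected_url}"
    return f"{old_url}: OK"

print(check_redirect(
    "https://example.com/recipe/actualrecipe",
    "https://example.com/actualrecipe",
))
```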

The Real Problem Is Mixed Signals

Mueller’s remark about making it harder on yourself was about the site owner or SEO spending time requesting recrawls of URLs that Google will sort out on its own. But he also referenced preferences, which alludes to the signals mentioned earlier, in particular rel=”canonical”.

Technical SEO Is Often About Reinforcing Preferences

Mueller’s description of technical SEO as “search-engine whispering” is useful because it captures how much of SEO involves reinforcing your preferences: which URLs are crawled, which content is chosen to rank, and which pages of a website are the most important. Google may still choose a canonical on its own, but consistent signals increase the chance that it chooses the version the site owner wants.

That makes this a good example of what SEO is all about: Making it easy for Google to crawl, index, and understand the content. That’s really the essence of SEO. It is about being clear and consistent in the content, URLs, internal linking, overall site navigation, and even in showing the cleanest HTML, including semantic HTML (which makes it easier for Google to annotate a web page).

Semantic HTML can be used to clearly identify the main content of a web page. It can directly help Google zero in on what’s called the Centerpiece content, which is likely used for Google’s Centerpiece Annotation. The centerpiece annotation is a summary of the main topic of the web page.
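A toy illustration of why that helps: when the centerpiece lives inside a <main> or <article> element, isolating it is trivial for any parser, and by extension easier for an indexing system to annotate. The HTML fragment below is invented, and BeautifulSoup is assumed to be installed.

```python
from bs4 import BeautifulSoup

html = """
<body>
  <nav>Home &gt; Recipes</nav>
  <main>
    <article>
      <h1>Actual Recipe</h1>
      <p>The content that should drive canonical selection and ranking.</p>
    </article>
  </main>
  <aside>Related recipes, ads, newsletter signup.</aside>
</body>
"""

soup = BeautifulSoup(html, "html.parser")
centerpiece = soup.find("main") or soup.find("article")
print(centerpiece.get_text(" ", strip=True))
```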

Google’s canonicalization documentation explains:

“When Google indexes a page, it determines the primary content (or centerpiece) of each page. If Google finds multiple pages that seem to be the same or the primary content very similar, it chooses the page that, based on the factors (or signals) the indexing process collected, is objectively the most complete and useful for search users, and marks it as canonical. The canonical page will be crawled most regularly; duplicates are crawled less frequently in order to reduce the crawling load on sites.”

Technical SEO And Being Consistent

Stepping back to take a forest-level view, duplicate URLs are really about a website not being consistent. Consistency is not often thought of as an SEO issue, but on a general level it is. Every time I have created a new website, I had a plan for how to keep it consistent, from the URLs to the topics, and for how to expand it in a consistent way as the website grows to cover more topics.

Takeaways

  • Multiple URLs to the same content do not cause a penalty or ranking demotion
  • Google will usually pick one version to keep
  • Site owners can influence that choice through consistent technical signals
  • The real issue is mixed signals, not duplicate content itself
  • Technical SEO often comes down to reinforcing clear preferences and monitoring whether Google picks them up
  • At the forest level, SEO can be seen as being about consistency

Featured Image by Shutterstock/Andrey_Kuzmin

How Consumers Navigate High-Stakes Purchases In AI Mode via @sejournal, @Kevin_Indig


AI Mode is compressing the stage where buyers compare, reject, and discover brands on their own. Our new usability study of 185 documented purchase tasks shows that 74% of AI Mode final shortlists came directly from the AI’s output – no external check, no triangulation, no second opinion.

This analysis will cover:

  • How the comparison search phase has collapsed.
  • What this means for brands competing in categories with high competitor AI Mode saturation.
  • The three levers that determine whether your brand shows up.

Why We Conducted The Study

AI transforms Search from a list of results into a list of recommendations (a shortlist). Until now, we have had no idea how users treat AI shortlists. Do they take them at face value or thoroughly validate them?

That’s why I partnered with Citation Labs and Clickstream Solutions to record real users and their interactions when facing high-stakes purchases. This usability study of 48 participants completing 185 major-purchase tasks reveals that AI Mode operates as a recommendation environment, not a comparison one.

In traditional search, people click through results, comparing across sources to assemble a candidate set. In AI Mode, they accept the AI’s candidates and move on. 74% of AI Mode shortlists came directly from the AI’s output with no external check. In traditional search, more than half of users built their own shortlist from scratch.

The study covers four categories (televisions, laptops, washer/dryer sets, and car insurance). Participants completed tasks using both AI Mode and traditional search in a within-subjects A/B design, producing 149 AI Mode task observations and 36 search observations. The behavioral patterns are consistent enough across categories and participants to carry weight. (Full study design is at the end.)

From Garret French, founder of Citation Labs:

“In AI Mode, buyers often use a shortlist synthesis to shortcut the cognitive effort of Standard Searching and comparing. This raises the value of onsite decision assets and third-party sources that provide AI with clear trade-offs, specific evidence, and sufficient contextual structure to describe a brand’s offering with confidence.”

From Eric Van Buskirk:

“The absence of narrowness frustration is the most intellectually significant finding. 15% in AI Mode vs 11% in Search, with no meaningful statistical difference. That’s the finding that rules out the obvious alternative explanation: that users accepted the AI’s shortlist because they felt trapped. They didn’t push back. They weren’t frustrated. They were satisfied. That makes the acceptance harder to dismiss.”

Here’s what happened.

1. 88% Of Users Took The AI’s Shortlist Outright

Across the laptop and insurance tasks, where participants used both search surfaces (classic search and AI Mode), the gap in constructing a product shortlist was stark.

Image Credit: Kevin Indig

Definitions:

  • AI Adopted: The participant took the AI’s recommended candidates as their shortlist with no changes or external verification.
  • User Built: The participant ignored the AI’s (or Search’s) suggestions and assembled their own candidate list from independent sources.
  • AI Verified: The participant started with the AI’s candidates but checked them against an outside source (a retailer site, a review, a manufacturer page) before finalizing.
  • Hybrid: The participant combined AI-suggested candidates with at least one candidate they found independently.

In classic search, 56% of participants built their own shortlist from multiple sources. In AI Mode, only 8 out of 147 codeable tasks produced a genuinely self-built shortlist. The user’s comparison process didn’t just shrink when using AI Mode. For most participants, it didn’t happen at all.

64% of AI Mode participants clicked nothing at all during their task. They read the AI’s text, sometimes scrolled through inline product snippets, and declared their finalists. The no-click rate varied by category:

Image Credit: Kevin Indig

Insurance participants delegated most heavily. Washer/dryer participants clicked the most, likely because appliance decisions involve specific physical constraints (capacity, stacking compatibility, dimensions) that the AI summary didn’t always resolve.

The 36% who did interact with individual results within AI Mode broke into 2 groups:

  • About 15% of the AI Adopted group (17 of 117 participants) verified inside AI Mode: They opened inline product cards or merchant pop-ups to check a price or spec, then returned to the AI’s list.
  • Others used follow-up prompts as verification tools, asking the AI for prices or narrowing by constraints.

A separate 23% of all AI Mode tasks involved at least one visit to an external website, mostly retailers (Best Buy appeared in 10 of 34 tasks with external visits) and manufacturer sites. The destination pattern matters: Users left AI Mode to confirm a candidate they’d already accepted from the AI’s list, not to find new ones.

Of the 117 participants who adopted the AI’s shortlist directly, roughly 85% showed no internal verification behavior at all. Participants who built their own lists took an average of 89 seconds longer and consulted more than twice as many sources.

  • “Given that the first paragraph says Lenovo or Apple… going with that,” said one user about laptops when searching via AI Mode. Position one in the AI response was the entire decision.
  • Another AI Mode user remarked: “I liked it more than anything else I’ve ever used for product searching. It made it a lot quicker to find the options.” They experienced speed as a valuable feature, not a shortcut.

In classic search, the pattern reversed. Nearly 89% of participants clicked on something.

  • One insurance participant clicked out to Progressive and GEICO independently, read both landing pages, consulted an Experian article, and then arrived at a shortlist.
  • A laptop participant applied hardware filters and flagged a review score discrepancy: “It shows 4.6 out of 5 stars for the reviews, but when you actually click the link: not reviewed yet.” Active skepticism of aggregated data was a behavior absent from AI Mode transcripts.

2. The AI’s Top Pick Becomes The User’s Top Pick 74% Of The Time

Just like in classic search, the top answer carries outsized weight. 74% of participants chose the item ranked first in the AI’s response as their top pick. The mean rank of the final choice was 1.35. Only 10% chose something ranked third or lower.

Image Credit: Kevin Indig

Position one in the AI’s output carries an outsized advantage because of where it sits: inside a curated section that typically contains two to five items, after the AI has already done the filtering. The first item is the AI’s top pick. When people engage with AI Mode, we know they read almost all of the output: The first AI Mode study found users spend 50 to 80 seconds reading AI Mode output, more than double the dwell time on AI Overviews. Users are reading carefully. They just read within a set the AI already narrowed.

However, 26% of participants in this study overrode rank order. The driver: brand recognition. They spotted a brand lower on the list and preferred it regardless of where the AI placed it. TV and laptop categories saw this most, where participants arrived with existing preferences for Samsung, LG, Apple, or Lenovo. But overriding rank did not mean rejecting the AI’s output: 81% of rank-override participants still chose from the AI’s candidate set.

3. The AI’s Words Become The Trust Signal

“Travelers and USAA actually tell me how much, whereas State Farm and GEICO give percentages. Just knowing the exact amount makes me want to pick Travelers or USAA right off the bat.”

That quote captures a core pattern in AI Mode trust. The AI’s formatting shaped the decision: Dollar amounts versus percentage discounts determined which brands made the shortlist.

AI framing (37%), meaning how AI talks about the product, and brand recognition (34%) were the top 2 trust drivers in AI Mode. They run nearly even:

  • Brand recognition led when participants arrived with brand preferences.
  • AI’s wording filled the gaps where participants didn’t already have preferences.

Image Credit: Kevin Indig

In classic search, the dominant trust mechanism was multi-source convergence: Participants built confidence by checking whether multiple independent sources agreed about a product.

Essentially, users triangulated. One checked Progressive, then GEICO, then an Experian article. Another compared aggregated star ratings against reviews on the actual site. They were building a case from separate inputs.

That behavior was almost absent in AI Mode (5%). Instead, AI framing (how the AI worded its description of a product) and brand recognition were the top 2 trust drivers.

The split between these two signals tracked closely with product category:

Image Credit: Kevin Indig

For televisions and laptops, where most participants arrived with existing brand preferences, brand recognition dominated. For insurance and washer/dryer, where participants had less prior knowledge, AI framing dominated.

When you lack a prior view, the AI’s description becomes the trust signal. In AI Mode, the synthesis is the corroboration. Participants treated the AI’s summary as if the cross-checking had already been done for them.

The first study showed a related pattern from the supply side: AI Mode matches site type to intent, surfacing brands for transactional queries and review sites for comparisons. This study shows the demand side of the same behavior: When the AI surfaces a brand the user already knows, brand recognition drives the decision; when it doesn’t, the AI’s own framing fills that role. The site-type matching and the trust mechanism reinforce each other.

4. If You’re Not In The List, You Don’t Exist

Purchase outcomes in AI Mode concentrated heavily. For laptops, three brands captured 93% of all AI Mode final choices. In classic search, the distribution was broader: HP EliteBook variants appeared three times, ASUS once, and other brands got consideration they never received in AI Mode.

Image Credit: Kevin Indig

Two distinct problems emerged:

  1. Brands that never appeared in the AI’s output were never considered. Participants didn’t see them, so they couldn’t evaluate them. The AI decided who made the list, not the buyer.
  2. Brands that did appear but lacked recognition faced a different problem: They weren’t seriously considered. Erie Insurance showed up in AI Mode results, but multiple participants eliminated it on name recognition alone. The brand was present but hadn’t built enough awareness to survive the moment of selection. One participant dropped a brand because it lacked a hyperlink in the AI output, reading that formatting gap as a credibility signal: “There’s not even a link there.”

Another participant said when using AI Mode: “I’m already eager to believe these are good recommendations because it mentions LG and Samsung, two brands I consider very reliable.” The AI didn’t say those brands were better. The participant inferred it from familiarity.

Participants didn’t feel constrained by the narrower set. Narrowness frustration appeared in 15% of AI Mode tasks and 11% of classic search tasks, statistically indistinguishable. The option set shrank, but the feeling of having enough options didn’t change. The most skeptical AI Mode participant in the comparison set, who complained the AI kept pointing to “teen drivers, teen drivers, teen drivers,” still chose GEICO and Travelers: the consensus AI result.

5. Users Leave To Buy, Not To Research

23% of AI Mode tasks involved an external site visit, but keep in mind these prompts reflect high-stakes situations. In standard search, that figure was 67%.

Image Credit: Kevin Indig

The volume difference matters less than the intent difference:

  • AI Mode participants who left went to retailer sites and manufacturer pages to verify a price or spec for a candidate they’d already selected.
  • Standard Search participants left to discover candidates: Reddit for peer opinions, editorial review sites for expert takes, insurance aggregators for comparison.

In the first AI Overviews study, we found that high risk leads users to verify AI claims more and to cross-check them against answers from other users on UGC platforms (like Reddit).

In this study, Reddit appeared in 19% of standard search tasks and only twice across all 149 AI Mode sessions. The peer-opinion layer that shapes a large share of traditional Search barely exists in AI Mode behavior.

There’s irony in that pattern. Google leans heavily on Reddit content to train its models, yet the source users rely on most in standard search is the one they almost never visit when the AI synthesizes those same sources for them.

The first study found the same pattern at a different scale. Across 250 sessions, clicks were “reserved for transactions:” Shopping prompts drove the highest exit share, while comparison prompts drove the lowest. The exit destinations were retailers and brand sites, not editorial or peer-opinion sources. Six months and a different task set later, the pattern holds: When users leave AI Mode, they leave to buy.

6. 3 Levers: Visibility, Framing, And Pricing Data

Three things that excite me most about the study:

First, we can apply the mental model of rankings (higher = better) to AI Mode as well. Most users choose the first product. We can now apply this to prompt tracking by focusing more on prompts that lead to shortlists and using our position as a goalpost.

Second, trust trumps rank. We have known this since the first user behavior studies I published, but this study reinforces the importance of building trust with users before they search. It’s the ultimate cheat code.

Third, we now know buyers trust AI’s recommendations. Obviously, there’s a high risk here if the AI is wrong, but seeing how quickly buyers take the AI’s recommendation also shows us how fast consumers adopt AI. It truly is the future of Search.

Keep in mind:

1. Visibility at the model layer is the new threshold. If AI Mode doesn’t surface your brand, you have a visibility problem at the model layer. Query your own category the way a buyer would (i.e., “best car insurance for a family with a teen driver,” “best washer dryer set under $2,000”) and document which brands appear, in what order, and with what framing. Do this across multiple prompt variations. Do it regularly, because AI responses shift over time.

2. How the AI describes you matters as much as whether it appears. Brands cited with concrete attributes (specific model, specific price, named use case) held stronger positions than brands described generically. The content on your site that the AI draws from not only affects whether you show up, but also how confidently and specifically you show up. A brand with structured pricing data, clear product specs, and explicit use cases gives the AI better material to work with.

3. For categories with context-dependent pricing, AI Mode creates a false-confidence problem. 63% of insurance participants were rated overconfident about pricing. They accepted AI-quoted rate estimates without checking whether the figures applied to their actual state, driving record, or current insurer. They made elimination decisions based on numbers that may not have applied to them. Where shopping panels showed explicit retailer-confirmed prices (washer/dryer), 85% of participants understood pricing clearly. Where they didn’t (insurance, laptops), confusion and overconfidence filled the gap. Structured pricing data through Merchant Center feeds and schema markup is the most direct lever for brands selling physical products. For services, the lever is editorial: Make sure your landing pages and FAQ content frame pricing as conditional (“your rate depends on X, Y, Z”) so the AI has that framing to draw from.
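As one concrete example of the structured-pricing lever, here is a minimal JSON-LD sketch built with Python. The product name, SKU, and price are invented, and the property names follow schema.org’s Product and Offer types; treat it as a starting point, not a complete markup strategy.

```python
import json

# Hypothetical product with explicit, machine-readable pricing.
product_schema = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Example 27-inch Stackable Washer/Dryer Set",
    "sku": "WD-2700-EX",
    "offers": {
        "@type": "Offer",
        "price": "1799.00",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

# Embedded in the page as a JSON-LD block that crawlers and AI systems can read directly.
json_ld = f'<script type="application/ld+json">{json.dumps(product_schema, indent=2)}</script>'
print(json_ld)
```

The point is not the specific fields but the explicitness: a concrete price and availability give an AI summary something firmer to cite than a vague “competitive pricing” claim.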

Study Design

Citation Labs and Clickstream Solutions ran this as a remote, unmoderated usability study with 48 U.S.-based participants recruited through Prolific. Each participant completed up to four major-purchase shortlisting tasks across televisions, laptops, washer/dryer sets, and car insurance.

The comparison between AI Mode and traditional standard search used a within-subjects A/B design: Participants used both surfaces, not one or the other. Significance calculations were normalized for the exact number of participants in each group (149 AI Mode task observations, 36 standard search task observations). This matters because the groups are unequal in size, and raw percentage comparisons between them would overstate confidence without that correction.

Sessions were screen-recorded with think-aloud audio. Trained analysts annotated each recording for behavioral markers (click-through, shortlist origin, trust signals, external site visits) and qualitative markers (stated reasoning, brand mentions, frustration signals). The 185 task-level observations provide a larger analytical base than the 48-participant headcount suggests, but confidence intervals remain wider than a large-scale survey. Findings are directional, not population-level estimates.

Notes on terminology used throughout this report:

  • Shortlist: The final set of brands a user would consider buying from.
  • AI Adopted: The participant took the AI’s recommended candidates as their shortlist with no changes or external verification.
  • User Built: The participant ignored the AI’s (or Search’s) suggestions and assembled their own candidate list from independent sources. In Search, when there was no AIO present, they had no option for relying on AI suggestions.
  • AI Verified: The participant started with the AI’s candidates but checked them against an outside source (a retailer site, a review, a manufacturer page, further prompting, or interaction with a panel outside the main AI text block) before finalizing.
  • Hybrid: The participant combined AI-suggested candidates with at least one candidate they found independently.
  • AI framing: The specific words and structure the AI used to describe a product, such as labels like “best for affordability” or explicit price comparisons.
  • Brand recognition: The user chose or eliminated a brand based on prior familiarity, not the AI’s description or any external research.
  • AI trust (general): The user accepted the AI’s output as credible without citing a specific reason, such as a particular label or description.
  • Source trust: The user trusted a recommendation because of where it came from, such as a retailer, manufacturer, or named publication surfaced in results.
  • Multi-source convergence: The user built confidence by checking whether multiple independent sources agreed on the same recommendation.
  • Rank override rate: The share of users who chose a brand other than the AI’s top-ranked option, regardless of whether they stayed within the AI’s candidate list.

Featured Image: Tapati Rinchumrus/Shutterstock; Paulo Bobita/Search Engine Journal

Google Explains Why It Doesn’t Matter That Websites Are Getting Larger via @sejournal, @martinibuster

A recent Google podcast called attention to the fact that websites are getting larger than ever before. Google’s Gary Illyes and Martin Splitt explained that the assumption that “larger” websites are a bad thing is not necessarily true. The takeaway for publishers and SEOs is that page weight is not a trustworthy metric, because the cause of the “excess” weight might very well be something useful.

Page Size Depends On What’s Being Measured

Google’s Martin Splitt explained that what many people think of as page size depends on what is being measured.

  • Is it measured by just the HTML?
  • Or are you talking about total page size, including images, CSS, and JavaScript?

It’s an important distinction. For example, many SEOs were freaked out when they heard that Googlebot was limiting their page crawl to just 2 megabytes of HTML per page. To put that into perspective, two megabytes of HTML equals about two million characters (letters, numbers, and symbols). That’s the equivalent of one HTML page with the same number of letters as two Harry Potter books.

But when you include CSS, images, and JavaScript along with the HTML, now we’re having a different conversation that’s related to page speed for users, not for the Googlebot crawler.

Martin discussed an article on HTTPArchive’s Web Almanac, which is a roundup of website trends. The article appeared to be mixing up different kinds of page weight, and that makes it confusing because there are at least two ways to define page weight.

He noted:

“See that’s where I’m not so clear about their definition of page weight.

…they have a paragraph where they are trying to like explain what they mean by page weight. …I don’t understand the differences in what these things are. So they say page weight (also called page size) is the total volume of data measured in kilobytes or megabytes that a user must download to view a specific page. In my book that includes images and whatnot because I have to download that to see.

And that’s why I was surprised to hear that in 2015 that was 845 kilobytes. That to me was surprising. …Because I would have assumed that with images it would be more than 800 kilobytes.

… In July 2025, the same median page is now 2.3 megabytes.”

Data Gets Compressed

But that is only one way to understand page size. Another way to consider page size is by focusing on what is transferred over the network, which can be smaller due to compression. Compression is a server-side process that reduces the size of the file sent from the server and downloaded by the browser. Most servers use a compression algorithm called Brotli.

Martin Splitt explains:

“I ask this question publicly that different people had very different notions of how they understood page size. Depending on the layer you are looking at, it gets confusing as well
because there’s also compression.

…So some people are like, ah, but this website downloads 10 megabytes onto my disk.

And I’m like, yes. …but maybe if you look at what actually goes over the wire, you might find that this is five or six megabytes, not the whole 10 megabytes. Because you can compress things on the network level and then you decompress them on the client side level…”

Technically, the transfer size in Martin’s example is five or six megabytes because of compression, which lets the page download faster. But on the user’s side, that five or six megabytes gets decompressed back into ten megabytes, which occupies that much space on the user’s phone, desktop, or wherever.
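A quick way to see the on-disk versus over-the-wire distinction is to compress some repetitive HTML yourself. The sketch below uses gzip from Python’s standard library as a stand-in for Brotli, and the sample markup is invented; real ratios depend entirely on the content.

```python
import gzip

# Repetitive markup compresses well, which is why transfer size and decompressed size diverge.
html = "<div class='row'><p>boilerplate markup repeated many times</p></div>" * 2000

decompressed = html.encode("utf-8")          # what the browser holds after decompression
compressed = gzip.compress(decompressed)     # roughly what travels over the network

print(f"decompressed: {len(decompressed) / 1024:.0f} KB")
print(f"compressed:   {len(compressed) / 1024:.0f} KB")
```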

And that introduces an ambiguity. Is your web page ten megabytes or five megabytes?

That illustrates a wider problem: different people are talking about different things when they talk about page size.

Even widely used definitions don’t fully resolve the ambiguity. Page weight is described as “the total volume of data measured in kilobytes or megabytes that a user must download,” but as the discussion makes clear, there is no one clear definition.

Martin asserts:

“When you ask people what they think, if this is big or not, you start getting very different answers depending on how they think about page size. And there is no one true definition of it.”

What About Ratio Of Markup To Content?

One of the most interesting distinctions made in the podcast is that a large page is not necessarily inefficient. For example, a 15 MB HTML document is considered acceptable because “pretty much most of these 15 megabytes are actually useful content.” The size reflects the value being delivered.

By contrast, what if the ratio of content to markup were the other way around, with only a little content but the overwhelming share of the page weight being markup?

Martin discussed the ratio example:

“…what if the markup is the only overhead? And I mean like what do you mean? It’s like, well, you know, if it’s like five megabytes but it’s only very little content, is that bad? Is that worse as in this case, the 15 megabytes.

And I’m like, that’s tricky because then we come into this weird territory of the ratio between content and markup. Yeah.

And I said, well, but what if a lot of it is markup that is metadata for some third party tool or for some service or for regulatory reasons or licensing reasons or whatever. Then that’s useful content, but not necessarily for the end user, but you still kind of have to have it.

It would be weird to say that that is worse than the page where the weight is mostly content.”

What Martin is doing here is shifting the idea of page weight away from raw size toward what the data actually represents.
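One rough way to think about the ratio Martin describes is to compare the bytes of visible text against the total bytes of the page. The toy page and the regex tag-stripping below are only illustrative; real HTML needs a proper parser.

```python
import re

# A page where most of the weight is wrapper markup and attributes, not visible content.
page = (
    "<div class='wrap' data-tracking-id='abc123'><div class='row'><div class='col'>"
    "<span class='text'><p>Only this sentence is content a reader actually sees.</p></span>"
    "</div></div></div>"
)

visible = re.sub(r"<[^>]+>", "", page)   # crude tag stripping, good enough for this toy string
total_bytes = len(page.encode("utf-8"))
content_bytes = len(visible.encode("utf-8"))

print(f"total page:      {total_bytes} bytes")
print(f"visible content: {content_bytes} bytes")
print(f"content share:   {content_bytes / total_bytes:.0%}")
```

The same total weight can represent very different content shares, which is exactly why a raw size threshold is hard to defend.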

Why Pages Include Data Users Never See

A major contributor to page weight is content that users never see.

Gary Illyes points to structured data as an example of content that is specifically meant for machines and not for users. While it can be useful for search engines, it also adds to the overall size of the page. If a publisher adds a lot of structured data to their page in order to take advantage of all the different options that are available, that’s going to add to the page size even though the user will never see it.

This calls attention to a structural reality of the web: pages are not just built for human readers. They are also built for search engines, tools, AI agents, and other systems, all of which add their own requirements to the weight of a web page.

When Overhead Is Justified

Not all non-user-facing content is unnecessary.

Martin talked about how markup may include metadata for a third-party tool or serve a regulatory or licensing purpose, creating a kind of gray area. Even if the additional data does not improve the user experience directly, it does serve a purpose, including helping the user find the page through a search engine.

The point Martin was getting at is that these considerations complicate attempts to label page weight as good if it is under some threshold or bad if it exceeds it.

Why Separating Content and Metadata Doesn’t Work

One possible solution that Gary Illyes discussed is separating human-facing content from machine-facing data. While Gary didn’t specifically mention the LLMs.txt proposal, what he discussed resembles it in that it would serve content to a machine without all the overhead that accompanies the user-facing content.

What he actually discussed was a way to separate all of the machine-facing data from what the user will download, thus, in theory, making the user’s version of a web page smaller.

Gary quickly dismisses that idea as “utopic” because there will always be hordes of spammers who will find a way to take advantage of that.

He explained:

“But then unfortunately this is an utopic thing. Because not everyone on the internet is playing nice.

We know how much spam we have to deal with. On our blog we say somewhere that we catch like 40 billion URLs per day that’s spam or some insane number, I don’t remember exactly, but it’s some insane number and definitely billions. That will just exacerbate the amount of spam that search engines receive and other machines receive maybe like I would bet $1 and 5 cents that will actually increase the amount of spam that search engines and LLMs and others ingest.”

Gary also said that Google’s experience is that, historically, when you have two separate kinds of content, there will always be differences between them. He used the example of when websites had separate mobile and desktop pages, where the two versions of the content were generally different, which caused issues for search and for usability: a site would rank for content on one version of a page and then send the user to a different version where that content did not exist.

Although he didn’t explicitly mention it, that explanation of Google’s experience may shed more light on why Google will not adopt LLMs.txt.

As a result, search engines have largely settled on a single-document model, even if it is inefficient.

Website Size Vs. Page Size Is The Real Issue

The discussion ultimately challenges the framing of the original question, the idea that websites getting heavier is necessarily bad.

Gary observes:

“The first question is, are websites getting fat? I think this question is not even meaningful.

Because it does not matter in the context of a website if it’s fat. In the context of a single page, yes.

But in the context of a website, it really doesn’t matter.”

So now Gary and Martin change the focus to web pages that are getting heavier, a more meaningful way to look at the issue of how web pages and websites are evolving.

This moves the discussion from an abstract idea to something more measurable and actionable.

Heavier Pages Still Carry Real Costs

Even with faster connections and better infrastructure, larger pages still have consequences, and lighter pages still have real benefits.

Martin explains:

“I think we are wasting a lot of resources. And I mean we, we had that in another episode where we said that we know that there are studies that show that websites that are faster have better retention and better conversion rates. Yeah. And speed is in part also based on size. Because the more data I ship, the longer it takes for the network to actually transfer that data and the longer it takes for the processor of whatever device you’re on to actually process it and display it to you.”

From a broader perspective, the issue is not just performance but efficiency. As Splitt puts it, “we are wasting a lot of resources.”

The web may be getting heavier, but the more important takeaway is why. Pages are carrying more than just user-facing content, and that design choice shapes both their size and their impact.

Featured Image by Shutterstock/May_Chanikran

Google’s Mueller On SEO Gurus Who Are “Clueless Imposters” via @sejournal, @martinibuster

A search marketing professional from India wrote a blog post about how she feels about seeing the word guru used within the SEO community in a way that’s different from its meaning in India. Several people, including Google’s John Mueller, agreed with her and shared how they felt when people self-identify as SEO gurus.

The Word Guru Is Misused

Preeti Gupta wrote a blog post titled, I don’t like how the word ‘Guru’ is misused in the SEO industry, in which she shared what the word guru actually means and how it’s misused in the SEO industry in a way that trivializes a word that in India holds special meaning.

She wrote that in India the word guru has a deep meaning and that they hold great respect for actual gurus.

Her blog post shared a Sanskrit mantra about it:

“The Guru is like Brahma (the creator). They create the desire for knowledge.
The Guru is like Vishnu (The preserver). They help the student keep and use the knowledge.
The Guru is like Maheshwara (Shiva, the Destroyer). They destroy ignorance and bad habits.
The Guru is the supreme reality itself, standing right before your eyes.
I bow and offer my respects to that great teacher.”

She then contrasted that profound meaning of the word guru with the trivialization of it within the context of self-described SEO gurus, who she regards as shady types who engage in unethical SEO practices. She said that it’s not her intention to tell people what words to use, but she did express the hope that people would use the word in the right context.

The phrase SEO guru is used in both senses: as a derogatory label that paints someone as a false leader with naïve followers, and as a description of someone who is highly regarded. However, I think an argument can be made that using the phrase for oneself is immodest, self-aggrandizing, and simply isn’t a good look.

AlexHarford-TechSEO responded to her post on Bluesky:

“It puts me off when I see an SEO self-describe themselves as a “Guru.” I’ve never come across anyone who does so who is a good and ethical SEO.

A lot of words are losing meaning in today’s world, though there can’t be many that were as special to you as Guru.”

Words are always in a state of change, and the way people speak not only changes from region to region but also from decade to decade. The meaning of words does change, especially when they jump continents and languages.

Self-Declared SEO Gurus

It was at this point that John Mueller responded to share what he thinks about self-described SEO gurus:

“To me, when someone self-declares themselves as an SEO guru, it’s an extremely obvious sign that they’re a clueless imposter. SEO is not belief-based, nobody knows everything, and it changes over time. You have to acknowledge that you were wrong at times, learn, and practice more.”

Mueller is right that nobody knows everything and that SEO changes over time, and for a long time many SEOs didn’t keep up with how Google ranks websites. The industry has largely shed that naivete, and yet nobody really agrees on what to do to rank better in search engines and AI search.

SEO Is A Belief System

I know there are some SEOs who firmly believe that SEO is a set of universally agreed-upon practices and that that is all there is, unaware that the history of SEO is one of constant change. How SEO is practiced today is quite different from how it was practiced eight years ago. There is no set of practices to be agreed on except Google’s best practices, which are less about “do this and you will rank better” and more about “do this and you may have a chance to rank better.”

So yes, to a certain extent, SEO is a belief system, and it will continue to be one so long as Google’s search ranking algorithms remain a black box: people can see what goes in and what comes out, but not what happens in the middle. That part remains a mystery. So when you don’t know for sure that what you do will guarantee better rankings, the only thing left is to believe, hope, and even have faith that the rankings will happen. Faith, after all, is belief in something for which there is no definitive proof. You don’t need faith to believe in a fact, right?

And that last part, the mystery of what happens in the black box, is why nobody can really call themselves a guru in the sense of being all-knowing. Nobody outside of Google knows everything that’s going on within that part in the middle where the rankings “magic” happens.

Given all that, who can truly call themselves a guru in SEO?

Featured Image by Shutterstock/funstarts33