Are AI Tools Eliminating Jobs? Yale Study Says No via @sejournal, @MattGSouthern

Marketing professionals rank among the most vulnerable to AI disruption, with Indeed recently placing marketing fourth for AI exposure.

But employment data tells a different story.

New research from Yale University’s Budget Lab finds “the broader labor market has not experienced a discernible disruption since ChatGPT’s release 33 months ago,” undercutting fears of economy-wide job losses.

The gap between predicted risk and actual impact suggests “exposure” scores may not predict job displacement.

Yale notes the two measures it analyzes, OpenAI’s exposure metric and Anthropic’s usage, capture different things and correlate only weakly in practice.

Exposure Scores Don’t Match Reality

Yale researchers examined how the occupational mix changed since November 2022, comparing it to past tech shifts like computers and the early internet.

The occupational mix measures the distribution of workers across different jobs. It changes when workers switch careers, lose jobs, or enter new fields.

Jobs are changing only about one percentage point faster than during early internet adoption, according to the research:

“The recent changes appear to be on a path only about 1 percentage point higher than it was at the turn of the 21st century with the adoption of the internet.”

Sectors with high AI exposure, including Information, Financial Activities, and Professional and Business Services, show larger shifts, but “the data again suggests that the trends within these industries started before the release of ChatGPT.”

Theory vs. Practice: The Usage Gap

The research compares OpenAI’s theoretical “exposure” data with Anthropic’s real usage from Claude and finds limited alignment.

Actual usage is concentrated: “It is clear that the usage is heavily dominated by workers in Computer and Mathematical occupations,” with Arts/Design/Media also overrepresented. This illustrates why exposure scores don’t map neatly to adoption.

Employment Data Shows Stability

The team tracked unemployed workers by duration to look for signs of AI displacement. They didn’t find them.

Unemployed workers, regardless of duration, “were in occupations where about 25 to 35 percent of tasks, on average, could be performed by generative AI,” with “no clear upward trend.”

Similarly, when looking at occupation-level AI “automation/augmentation” usage, the authors summarize that these measures “show no sign of being related to changes in employment or unemployment.”

Historical Disruption Timeline

Past disruptions took years, not months. As Yale puts it:

“Historically, widespread technological disruption in workplaces tends to occur over decades, rather than months or years. Computers didn’t become commonplace in offices until nearly a decade after their release to the public, and it took even longer for them to transform office workflows.”

The researchers also stress their work is not predictive and will be updated monthly:

“Our analysis is not predictive of the future. We plan to continue monitoring these trends monthly to assess how AI’s job impacts might change.”

What This Means

A measured approach beats panic. Both Indeed and Yale emphasize that realized outcomes depend on adoption, workflow design, and reskilling, not raw exposure alone.

Early-career effects are worth watching: Yale notes “nascent evidence” of possible impacts for early-career workers, but cautions that data are limited and conclusions are premature.

Looking Ahead

Organizations should integrate AI deliberately rather than restructure reactively.

Until comprehensive, cross-platform usage data are available, employment trends remain the most reliable indicator. So far, they point to stability over transformation.

OpenAI Launches Apps In ChatGPT & Releases Apps SDK via @sejournal, @MattGSouthern

OpenAI has launched a new app ecosystem within ChatGPT, along with a preview of the Apps SDK, enabling developers to create conversational, interactive applications based on the Model Context Protocol.

These apps are now accessible to all logged-in ChatGPT users outside the European Union, across Free, Go, Plus, and Pro plans.

Early partners include Booking.com, Canva, Coursera, Expedia, Figma, Spotify, and Zillow.

How ChatGPT Apps Work

Apps integrate naturally into conversation, and you can activate them by name, such as saying, “Spotify, make a playlist for my party this Friday.”

When using an app for the first time, ChatGPT prompts you to connect and clarifies what data might be shared. For example, OpenAI demonstrates ChatGPT suggesting the Zillow app during a home-buying discussion, allowing you to browse listings on an interactive map without leaving the chat.

John Weisberg, Head of AI at Zillow, said:

“The Zillow app in ChatGPT shows the power of AI to make real estate feel more human. Together with OpenAI, we’re bringing a first-of-its-kind experience to millions — a conversational guide that makes finding a home faster, easier, and more intuitive.”

Developer Opportunities & Reach

OpenAI positions the Apps SDK as a way to “reach over 800 million ChatGPT users at just the right time.”

The SDK is open source and built on MCP, allowing developers to create their own chat logic and custom interfaces. You can also connect to your own backends for login and premium features, and easily test everything through Developer Mode in ChatGPT.
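
Since the Apps SDK builds on MCP, a minimal MCP server gives a feel for the underlying protocol. The sketch below uses the open-source Model Context Protocol Python SDK rather than the Apps SDK itself, and the server and tool names are illustrative:

```python
# pip install mcp -- the open-source MCP Python SDK, not OpenAI's Apps SDK
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("playlist-demo")  # hypothetical server name

@mcp.tool()
def suggest_playlist(mood: str, track_count: int = 10) -> str:
    """Toy tool a chat client could call; a real app would hit a backend here."""
    return f"A {track_count}-track playlist for a {mood} party."

if __name__ == "__main__":
    mcp.run()  # serves the tool over MCP's standard transport
```

An Apps SDK app would presumably layer its own chat logic, custom interface, and backend connections on top of a server like this.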

OpenAI has provided detailed documentation, design guidelines, and example apps to support developers.

Submission & Monetization

Developers can begin building immediately. OpenAI has announced that formal app submissions, reviews, and publication will commence later this year, along with a directory for browsing and searching apps.

Additionally, the company plans to disclose monetization details, including support for the Agentic Commerce Protocol, which enables instant checkout within ChatGPT.

Safety & Privacy

All apps must follow OpenAI’s policies, be audience-appropriate, and have clear third-party rules. Developers should provide privacy policies, collect only necessary data, and be transparent about permissions.

OpenAI’s draft guidelines also require apps to be purposeful, avoid misleading designs, and manage errors effectively. Submissions must demonstrate stability, responsiveness, and low latency; apps that crash or hang will be rejected.

Rollout & Availability

Today’s rollout does not include EU users, but OpenAI has announced plans to introduce these apps to that region soon.

Additionally, eleven more partner apps are scheduled for release later this year. OpenAI also intends to expand app availability to ChatGPT Business, Enterprise, and Education plans.

Looking Ahead

Apps that appear within AI-led conversations could transform the way services are found and accessed.

Instead of relying on traditional rankings or app-store positions, visibility might be driven more by conversational relevance and demonstrated value within the chat.

Teams responsible for app functionality should think about how users will naturally request these services and identify the key moments when ChatGPT is likely to recommend them.

Google’s AI Mode: What We Know & What Experts Think via @sejournal, @martinibuster

AI Mode is Google’s most powerful AI search experience, providing answers to complex questions in a way that anticipates the user’s information needs. Although Google says that nothing special needs to be done to rank in AI Mode, the reality is that SEO only makes pages eligible to appear.

The following facts, insights, and examples demystify AI Mode and offer a clear perspective on how pages are ranked and why.

What Is AI Mode?

Google’s AI Mode was introduced on March 5, 2025, as an experiment in Google Labs, then swiftly rolled out as a live Google search surface on May 20. Google describes AI Mode as its most cutting-edge search experience, combining advanced reasoning with multimodality. Multimodality means content beyond text data, such as images and video content.

AI Mode is a significant evolution of Google Search that encourages users to research topics. This presents benefits and changes to how search works:

  • The benefit is that Google is citing a greater variety of websites per query.
  • The change is that websites are being cited for multiple queries, beginning with the initial query plus follow-up queries.

Those two factors present challenges to SEO. For example, do you optimize for the initial query, or what can be considered a more granular follow-up query? Most SEOs may consider optimizing for both.

Query Fan-Out

Similar to AI Overviews, AI Mode uses what Google calls a query fan-out technique, which divides the initial search query into subtopics that anticipate further information the user may need.

Query fan-out anticipates the user’s information journey. So, if they ask question A, Google’s AI Mode will show answers to follow-up questions about B, C, and D.

For example, if you ask, “What is a mechanical keyboard?” Google answers the following questions:

  1. What is a mechanical keyboard?
  2. What are mechanical switches?
  3. What happens when a key is pressed on a mechanical keyboard?
  4. What are keycaps and what materials are they made from?
  5. What is the role of the printed circuit board (PCB)?
  6. How are mechanical switches categorized?

The following screenshot of the AI Mode search result shows the questions (in red) positioned next to the answers, illustrating how query fan-out generates related questions and creates answers for them.

Screenshot of query fan-out in AI Mode, September 2025

How I Extracted Latent Questions From AI Mode Search Results

The way I extracted the questions that query fan-out is answering was by doing an inverse knowledge search, also known as reverse QA.

I copied the output from AI Mode into a document, then uploaded it to ChatGPT with the following prompt:

Read the document and extract a list of questions that are directly and completely answered by full sentences in the text. Only include questions if the document contains a full sentence that clearly answers it. Do not include any questions that are answered only partially, implicitly, or by inference.

Try that with AI Mode to get a better understanding of the underlying questions it generates with query fan-out. This will help clarify what is happening and make it less mysterious.
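
If you want to script the same reverse-QA step instead of pasting into the ChatGPT UI, a minimal sketch looks like this; the model choice and file name are assumptions:

```python
from openai import OpenAI  # pip install openai; assumes OPENAI_API_KEY is set

REVERSE_QA_PROMPT = (
    "Read the document and extract a list of questions that are directly and "
    "completely answered by full sentences in the text. Only include questions "
    "if the document contains a full sentence that clearly answers it. Do not "
    "include any questions that are answered only partially, implicitly, or by "
    "inference."
)

document = open("ai_mode_output.txt").read()  # your saved AI Mode answer

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",  # any capable model works; this choice is an assumption
    messages=[{"role": "user", "content": f"{REVERSE_QA_PROMPT}\n\n{document}"}],
)
print(response.choices[0].message.content)
```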

Content With Depth

Google’s advice to publishers who want to rank in AI Mode is to create content that engages users who are conducting in-depth queries:

“…users are asking longer and more specific questions – as well as follow-up questions to dig even deeper.”

That may not mean creating giant articles with depth. It just means focusing on the content that users are looking for. That approach to content is subtly different from chasing keyword inventory.

Google recommends:

  • Focus on unique, valuable content for people.
  • Provide a great page experience.
  • Ensure we can access your content.
  • Manage visibility with preview controls. (Make use of nosnippet, data-nosnippet, max-snippet, or noindex to set your display preferences.)
  • Make sure structured data matches the visible content.
  • Go beyond text for multimodal success.
  • Understand the full value of your visits.
  • Evolve with your users.

The last two recommendations require further clarification:

Understand The Full Value Of Your Visits

This is an encouragement to focus on meeting the information needs of the user, and to note that focusing too hard on the “click” comes at the expense of providing what an “engaged” audience is looking for.

Evolve With Your Users

Google frames this as evolving along with how users are searching. A more pragmatic view is to evolve with how Google is showing results to users.

What Experts Say About Content Structure For AI Mode

Duane Forrester, formerly of Bing Search, advises that content needs to be structured differently for AI search.

He advises:

“…the search pipeline has changed. You don’t need to rank – you need to be retrieved, fused, and reasoned over by GenAI systems.”

In his article titled “Search Without A Webpage,” he expands on the idea that content must be able to serve as the basis of an answer:

“…your content doesn’t have to rank. It has to be retrieved, understood, and assembled into an answer.”

He also says that content needs to be:

“…structured, interpretable, and available when it’s time to answer.

This is the new search stack. Not built on links, pages, or rankings – but on vectors, embeddings, ranking fusion, and LLMs that reason instead of rank.”

When Duane says that content needs to be structured, he’s referring to on-page structure that communicates not just the hierarchy of information but also offers a clean delineation of what each section of content is about.

In my opinion:

  • Paragraphs should consist of sentences that build to an idea, with a clear payoff at the end.
  • If a sentence doesn’t have a purpose within the paragraph, it’s probably better to remove it.
  • If a paragraph doesn’t have a clear purpose, get rid of it.
  • If a group of paragraphs is out of place near the end of the document, move it closer to the beginning if that’s where it belongs.
  • The entire document should have a clear beginning, middle, and end, with each section serving as “the basis of an answer.”

Itai Sadan, CEO of Duda, recommends:

“Use clear, specific language: LLMs rely on clarity first and foremost, so avoid using too many pronouns or any other vague, undefined references.

Organize your content predictably: Break your content up into sections and use headings, like H2 and H3, to organize the unique ideas central to your article’s thesis.”

Mordy Oberstein, founder of Unify Marketing, explains that the focus on attribution took precedence for the average digital marketer:

“What resonates with the person hasn’t fundamentally changed, and I don’t think we’ve realized that. I think we’ve forgotten. I think we’ve completely forgotten what resonance is as digital marketers because of the advent of two things with the internet:

  1. Attribution
  2. The ability to track responses

Businesses were seemingly OK with digital marketers doing whatever it took to get that traffic, to get that conversion, because that’s just the Internet, so everyone just goes along.

Now, with AI Mode, attribution no longer exists in the same way.”

Mordy’s right about attribution. AI Mode visits cannot be tracked in Google Analytics 4 or Google Search Console. They’re lumped into the Web Search bucket, so they can’t be distinguished from regular organic search in either GA4 or GSC.

The attribution question is a big issue for digital marketers. Michael Bonfils of Digital International Group recently discussed the issue of attribution from the perspective of zero-click searches.

Bonfils says:

“But the organic side, there is an area … that is zero click. So zero click is for those audience members who don’t know what that means, zero click means when you are having a conversation with AI, for example, I’m trying to compare two different running shoes and I’m having this, ‘what’s going to be better for me?’

I’m having a conversation with AI and AI is pooling and referencing … whatever winning schema formats and content that are out there … but it’s zero click. It’s not going to your site. It’s not going there. So without this data that really affects … organic content strategy.”

And that dovetails with what Mordy is getting at, that SEOs are conditioned to view internet marketing through the “attribution” lens, but that we may be entering a kind of post-attribution period, which is what it largely was pre-internet. So, the old marketing strategies are back in, but they were always good strategies (building awareness and popularity); it’s just that digital marketers tended to engage more with attribution.

Mordy shares the example of someone researching a brand of sneakers, who asks a chatbot about it, then goes to Amazon to see what it looks like and what people are saying about it, then watches video reviews on YouTube, and then goes to AI Mode to review the specs. After all that research, the consumer might return to Amazon and then head over to Google Shopping to compare prices.

He concludes with the insight that resonating with users has always been important, and that very little has changed in terms of consumers conducting research prior to making a purchase:

“That was all happening before. But now the perception is that it’s happening because of LLMs. I don’t think things have fundamentally changed.”

I think that the key insight here is that the research is still happening exactly as before, but what’s changed is that the opportunities to expose your business or products have expanded to multimodal search surfaces, especially with AI Mode.

The screenshot below shows how Nike is taking charge of the conversation on AI Mode with both text and video content.

Screenshot of citations and videos in AI Mode, September 2025

Connect Your Brand To A Product

It’s becoming evident that connecting a brand semantically to a service or product may be important for communicating that the brand is relevant to whatever you want it to be relevant for.

Below is a screenshot of a sponsored post that’s indexed by Google and is ranking in AI Mode for the keyword phrase “what are ad hijacking tools.”

Screenshot of sponsored post ranking in AI Mode, September 2025

SEO Makes Content Eligible For AI Mode

SEO best practices are necessary to be eligible to appear in AI Mode. That’s different from saying that standard SEO will help you rank in AI Mode.

This is what Google says:

“To be eligible to be shown as a supporting link in AI Overviews or AI Mode, a page must be indexed and eligible to be shown in Google Search with a snippet, fulfilling the Search technical requirements. There are no additional technical requirements.”

The “Search technical requirements” are just the three basics of SEO:

  • “Googlebot isn’t blocked.
  • The page works, meaning that Google receives an HTTP 200 (success) status code.
  • The page has indexable content.”

Google clearly says that foundational SEO is necessary to be eligible to rank in AI Mode. But it does not explicitly confirm that SEO will help a site rank in AI Mode.

Is SEO Enough For AI Mode?

Google and Googlers have reassured publishers and SEOs that nothing extra needs to be done to rank in AI search surfaces. They affirm that standard SEO practices are enough.

Standard SEO practices ensure that a site is crawled, indexed, and eligible for ranking in AI Mode. But the implication is that the signals for actually ranking in AI Mode are substantially different from those of standard organic search.

What Is FastSearch?

Information contained in recent Google antitrust court documents shows that AI Mode ranks pages with a technology called FastSearch.

FastSearch grounds Google’s AI search results in facts, including data from the web. This is significant because FastSearch uses different ranking signals from those used in regular organic search, prioritizing speed and selecting only a few top pages for AI grounding.

The recent Google antitrust trial document from early September offers this explanation of FastSearch:

“To ground its Gemini models, Google uses a proprietary technology called FastSearch. … FastSearch is based on RankEmbed signals—a set of search ranking signals—and generates abbreviated, ranked web results that a model can use to produce a grounded response. …

“FastSearch delivers results more quickly than Search because it retrieves fewer documents, but the resulting quality is lower than Search’s fully ranked web results.”

And elsewhere in the same document:

“FastSearch is a technology that rapidly generates limited organic search results for certain use cases, such as grounding of LLMs, and is derived primarily from the RankEmbed model.”

RankEmbed

RankEmbed is a deep learning model that identifies patterns in datasets and develops signals that are used for ranking purposes. It uses a combination of user data from search logs and scores generated by human raters to create the ranking-related signals.

The court document explains:

“RankEmbed and its later iteration RankEmbedBERT are ranking models that rely on two main sources of data: __% of 70 days of search logs plus scores generated by human raters and used by Google to measure the quality of organic search results.

The RankEmbed model itself is an AI-based, deep learning system that has strong natural-language understanding. This allows the model to more efficiently identify the best documents to retrieve, even if a query lacks certain terms.”

Human-Rated Data

The human-rated data, which is part of RankEmbed, is not used to rank webpages. Human-rated data is used to train deep learning models so they can recognize patterns that correlate with high- and low-quality webpages.

How human-rated data is used in general:

  • Human-rated data is used to create what are called labeled data.
  • Labeled data are examples that models use to identify patterns in vast amounts of data.

In this specific instance, the human-labeled data are examples of relevance and quality. The RankEmbed deep learning model uses those examples to learn how to identify patterns that correlate with relevance and page quality.

Search Logs And User Behavior Signals

Let’s go back to how Google uses “70 days of search logs” as part of the RankEmbed deep learning model, which underpins FastSearch.

Search logs refer to user behavior at the point when they’re searching. The data is rich with a wide range of information, such as what users mean when they search, and it can also include the domain names of businesses they associate with certain keywords.

The court documentation doesn’t say all the ways this data can be used. However, a Google antitrust document from May 2025 revealed that search log (click) patterns only become meaningful when scaled to the billions.

Some SEOs have theorized that click data can directly influence the rankings, describing a granular use of clicks for ranking. But that may not be how click data is used, because it’s too noisy and imprecise.

What’s really happening is more scaled than granular. Patterns reveal themselves in the billions, not in the individual click. That’s not just my opinion; it’s a fact confirmed in the May 2025 Google antitrust exhibit:

“Some Known Shortcomings of Live Traffic Eval
The association between observed user behavior and search result quality is tenuous. We need lots of traffic to draw conclusions, and individual examples are difficult to interpret.”

It’s fair to say that search logs are not used to directly impact the rankings of an individual webpage, but are used to learn about relevance and quality from user behavior.

FastSearch is not the same ranking algorithm as the one used for organic search results. It is based on RankEmbed, and the term “embed” suggests that embeddings are involved. Embeddings map words into a vector space so that the meaning of the text is captured. For SEO, this means that keyword relevance matters less, and topical relevance and semantic meaning carry more weight.
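
For intuition, here’s a toy illustration of embeddings and similarity. The vectors below are made up and only three-dimensional, while real embedding models use hundreds or thousands of dimensions:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings for three pieces of text
mechanical_keyboard = np.array([0.9, 0.2, 0.1])
keyboard_switches   = np.array([0.8, 0.3, 0.2])
running_shoes       = np.array([0.1, 0.9, 0.4])

print(cosine_similarity(mechanical_keyboard, keyboard_switches))  # high: related topics
print(cosine_similarity(mechanical_keyboard, running_shoes))      # low: unrelated topics
```

Retrieval in vector space means finding the stored chunks whose vectors sit closest to the query’s vector, which is why semantic meaning can matter more than exact keyword matches.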

Google’s statement that standard SEO is all that’s needed to rank in AI Mode is true only to the extent that standard SEO will ensure that the webpage is crawled, indexed, and eligible for the final stage of AI Mode ranking, which is FastSearch.

But FastSearch uses an entirely different set of considerations at the LLM level to decide what will be used to answer the question.

In my opinion, it’s more realistic to say that SEO best practices make webpages eligible to appear in AI Mode, but the ranking processes are different, and so new considerations come into play.

SEO is still important, but it may be useful to focus on semantic and topical relevance.

AI Mode Is Multimodal

AI Mode is multimodal, meaning image and video content rank in AI Mode. That’s something that SEOs and publishers need to consider in terms of how user expectations drive content discovery. This means it may be useful to create image, video, and maybe even audio content in addition to text.

Optimizing Images For AI Mode

Something that is under your control is the featured image and the in-content images that go with your content. The best images, in my opinion, are images that are noticeable when displayed in AI Mode and contain visual information that is relevant to the search query.

Here’s a screenshot of images that accompany the cited webpages for the query, “What is a mechanical keyboard?”

Screenshot from AI Mode, September 2025

As you can see, none of the images pop out or call attention to themselves. I don’t think that’s Google’s preference; that’s just what publishers use. Images should not be an afterthought. Make them an integrated part of your ranking strategy for AI Mode.

Creative use of images, in my opinion, can help a page call attention to itself as useful and relevant. The best images are ones that look good when Google crops them into a square format.

Google AI Mode is multimodal, which means you should optimize your images so that they display well in AI Mode search results. Your images should be attractive whether they are displayed as a rectangle (approximately 16:9 aspect ratio) or cropped to a near-square format (approximately 4:3 aspect ratio).

Mordy Oberstein offers these insights on multimodal marketing:

“AI Mode is looking at videos, images, and yes, you could do all of that. Yes, you should do all of that – whatever is possible to do while being efficient and not getting misdirected or losing focus – yes, go ahead. I’m all for creating authoritativeness through content. I think that’s an essential strategy for pretty much any business.

AI Mode is not just looking at your website content, whether it’s your image content, audio content, whatever it may be, it’s also looking at how the web is talking about you.”

AI Mode Is Evolution, Not Extension

AI Mode is not just an extension of traditional search but an evolution of it. Search now includes text, images, and video. It anticipates follow-up queries and displays the answers to them using the query fan-out technique. This shifts the SEO focus away from keyword inventory and chasing clicks and toward considering how the entire user information journey is best addressed and then crafting content that satisfies that need.

Featured Image: Jirsak/Shutterstock

Perplexity Launches Comet Browser For Free Worldwide via @sejournal, @MattGSouthern

Perplexity released its Comet browser to everyone today, shifting from a waitlist to free desktop downloads worldwide.

Comet bakes an AI assistant into every new tab so you can ask questions, summarize pages, and navigate without jumping between search results and multiple tools.

Perplexity first introduced Comet in July in a limited release. Since then, the company says “millions” have joined the waitlist, and early users asked 6–18 times more questions on day one.

The move poses a challenge to traditional search engines and browsers by adopting an AI-first approach to web navigation, which reduces the need for multiple searches and the management of numerous tabs.

What Makes Comet Different

At the core of Comet’s functionality is the Comet Assistant, an AI-powered helper that browses alongside users and handles tasks such as research, meeting support, coding assistance, and e-commerce activities.

The assistant appears in every new tab, ready to answer questions or complete actions without requiring users to navigate away from their current workflow.

Unlike traditional browsers where users must open a separate search engine, copy information between tabs, or use multiple tools, Comet integrates assistance directly into the browsing experience. You can ask questions in natural language, and the assistant provides answers drawn from web sources.

Background Assistants

Perplexity also announced Background Assistants today. These assistants work simultaneously and asynchronously in the background, handling tasks without requiring active user supervision.

The Background Assistants join the recently announced Email Assistant, currently available to Max Subscribers. The Email Assistant can be cc’d on email threads to handle scheduling, draft replies, and manage inbox tasks without opening a separate application.

Mobile & Voice Coming Soon

While Comet has been desktop-only since its July launch, Perplexity recently previewed mobile versions for iPhone and Android.

The mobile version will include voice technology, allowing users to interact with Comet assistants through speech rather than typing.

Availability

Comet is now available for free download at perplexity.ai/comet for desktop users.

For tips on using the browser, see Perplexity’s resource hub.


Featured Image: Sidney van den Boogaard/Shutterstock

Vector Index Hygiene: A New Layer Of Technical SEO via @sejournal, @DuaneForrester

For years, technical SEO has been about crawlability, structured data, canonical tags, sitemaps, and speed. All the plumbing that makes pages accessible and indexable. That work still matters. But in the retrieval era, there’s another layer you can’t ignore: vector index hygiene. And while I’d like to claim my usage of vector index hygiene is unique, similar concepts exist in machine learning (ML) circles already. It is unique when applied specifically to our work with content embedding, chunk pollution, and retrieval in SEO/AI pipelines, however.

This isn’t a replacement for crawlability and schema. It’s an addition. If you want visibility in AI-driven answer engines, you now need to understand how your content is dismantled, embedded, and stored in vector indexes and what can go wrong if it isn’t clean.

Traditional Indexing: How Search Engines Break Pages Apart

Google has never stored your page as one giant file. From the beginning, search has dismantled webpages into discrete elements and stored them in separate indexes.

  • Text is broken into tokens and stored in inverted indexes, which map terms to the documents they appear in. Here, tokenization means traditional IR terms, not LLM sub-word units. This is the backbone of keyword retrieval at scale. (See: Google’s How Search Works overview.)
  • Images are indexed separately, using filenames, alt text, captions, structured data, and machine-learned visual features. (See: Google Images documentation.)
  • Video is split into transcripts, thumbnails, and structured data, all stored in a video index. (See: Google’s video indexing docs.)

When you type a query into Google, it queries these indexes in parallel (web, images, video, news) and blends the results into one SERP. This separation exists because handling “an internet’s worth” of text is not the same as handling an internet’s worth of images or video.

For SEOs, the important point is this: you never really ranked “the page.” You ranked the parts of it that were indexed and retrievable.
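
For intuition, here’s a toy inverted index. It’s a simplification, but it shows why term-based retrieval is a cheap dictionary lookup rather than a scan of every page:

```python
from collections import defaultdict

docs = {
    1: "mechanical keyboard switches",
    2: "keyboard keycap materials",
    3: "running shoes review",
}

# Build the index: each term maps to the set of documents containing it
inverted_index = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.split():
        inverted_index[term].add(doc_id)

print(inverted_index["keyboard"])  # {1, 2} -- retrieval is a lookup, not a scan
```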

GenAI Retrieval: From Inverted Indexes To Vector Indexes

AI-driven answer engines like ChatGPT, Gemini, Claude, and Perplexity push this model further. Instead of inverted indexes that map terms to documents, they use vector indexes that store embeddings, essentially mathematical fingerprints of meaning.

  • Chunks, not pages. Content is split into small blocks. Each block is embedded into a vector. Retrieval happens by finding semantically similar vectors in response to a query. (See: Google Vertex AI Vector Search overview.)
  • Hybrid retrieval is common. Dense vector search captures semantics. Sparse keyword search (BM25) captures exact matches. Fusion methods like reciprocal rank fusion (RRF) combine both; a minimal RRF sketch follows this list. (See: Weaviate hybrid search explained and RRF primer.)
  • Paraphrased answers replace ranked lists. Instead of showing a SERP, the model paraphrases retrieved chunks into a single answer.
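
Here’s that minimal sketch of reciprocal rank fusion, assuming you already have a dense (vector) ranking and a sparse (BM25) ranking to combine:

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of chunk IDs into one list.

    rankings: lists of IDs, each ordered best-first (e.g., dense + BM25).
    k: damping constant; 60 is the value from the original RRF paper.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

dense = ["chunk_b", "chunk_a", "chunk_c"]   # semantic similarity ranking
sparse = ["chunk_a", "chunk_b", "chunk_d"]  # BM25 keyword ranking
print(reciprocal_rank_fusion([dense, sparse]))  # chunks ranked by both lists rise
```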

Sometimes, these systems still lean on traditional search as a backstop. Recent reporting showed ChatGPT quietly pulling Google results through SerpApi when it lacked confidence in its own retrieval. (See: Report)

For SEOs, the shift is stark. Retrieval replaces ranking. If your blocks aren’t retrieved, you’re invisible.

What Vector Index Hygiene Means

Vector index hygiene is the discipline of preparing, structuring, embedding, and maintaining content so it remains clean, deduplicated, and easy to retrieve in vector space. Think of it as canonicalization for the retrieval era.

Without hygiene, your content pollutes indexes:

  • Bloated blocks: If a chunk spans multiple topics, the resulting embedding is muddy and weak.
  • Boilerplate duplication: Repeated intros or promos create identical vectors that may drown out unique content.
  • Noise leakage: Sidebars, CTAs, or footers can get chunked and embedded, then retrieved as if they were main content.
  • Mismatched content types: FAQs, glossaries, blogs, and specs each need different chunk strategies. Treat them the same and you lose precision.
  • Stale embeddings: Models evolve. If you never re-embed after upgrades, your index contains inconsistencies.

Independent research backs this up. LLMs lose salience on long, messy inputs (“Lost in the Middle”). Chunking strategies show measurable trade-offs in retrieval quality. (See: “Improving Retrieval for RAG-based Question Answering Models on Financial Documents.”) Best practices now include regular re-embedding and index refreshes. (See: Milvus guidance.)

For SEOs, this means hygiene work is no longer optional. It decides whether your content gets surfaced at all.

SEOs can begin treating hygiene the way we once treated crawlability audits. The steps are tactical and measurable.

1. Prep Before Embedding

Strip navigation, boilerplate, CTAs, cookie banners, and repeated blocks. Normalize headings, lists, and code so each block is clean. (Do I need to explain that you still need to keep things human-friendly, too?)
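
As a rough sketch, the prep step can look like this, assuming a BeautifulSoup pipeline; the selectors are placeholders you’d tune to your own templates:

```python
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def extract_main_content(html: str) -> str:
    soup = BeautifulSoup(html, "html.parser")
    # These selectors are assumptions -- adjust them to your own templates
    for noise in soup.select("nav, header, footer, aside, form, .cookie-banner, .cta"):
        noise.decompose()  # remove the node and everything inside it
    main = soup.find("main") or soup.body
    return main.get_text(separator="\n", strip=True) if main else ""
```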

2. Chunking Discipline

Break content into coherent, self-contained units. Right-size chunks by content type. FAQs can be short, guides need more context. Overlap chunks sparingly to avoid duplication.
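
A minimal chunker along these lines might group heading-plus-paragraph blocks under a size cap; the cap and the block format here are assumptions to adjust per content type:

```python
def chunk_by_section(blocks, max_words=200):
    """Group (heading, text) blocks into self-contained chunks.

    blocks: list of (heading, paragraph_text) tuples in document order.
    max_words: rough size cap -- tune by content type (FAQs short, guides longer).
    """
    chunks, current, count = [], [], 0
    for heading, text in blocks:
        words = len(text.split())
        # Close the running chunk before it gets bloated and "muddy"
        if current and count + words > max_words:
            chunks.append("\n\n".join(current))
            current, count = [], 0
        current.append(f"{heading}\n{text}")
        count += words
    if current:
        chunks.append("\n\n".join(current))
    return chunks

sections = [
    ("What is a mechanical keyboard?", "A mechanical keyboard uses individual switches..."),
    ("What are keycaps?", "Keycaps are the plastic covers that sit on each switch..."),
]
print(chunk_by_section(sections))
```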

3. Deduplication

Vary intros and summaries across articles. Don’t let identical blocks generate nearly identical embeddings.
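
The cheapest first pass here is exact-duplicate detection on normalized text; near-duplicates (reworded boilerplate) would need embedding similarity instead. A minimal sketch:

```python
import hashlib

def dedupe_chunks(chunks):
    """Drop exact duplicates by hashing whitespace-normalized, lowercased text."""
    seen, unique = set(), []
    for chunk in chunks:
        normalized = " ".join(chunk.lower().split())
        fingerprint = hashlib.sha256(normalized.encode()).hexdigest()
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(chunk)
    return unique

print(dedupe_chunks(["Welcome to our blog!", "welcome  to our Blog!", "Unique intro."]))
```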

4. Metadata Tagging

Attach content type, language, date, and source URL to every block. Use metadata filters during retrieval to exclude noise. (See: Pinecone research on metadata filtering.)
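
In practice, that can be as simple as storing a metadata dict with each chunk and filtering on it at retrieval time. A generic, store-agnostic sketch (the field names are assumptions):

```python
chunk_record = {
    "text": "A mechanical keyboard uses individual switches under each key...",
    "metadata": {
        "content_type": "faq",  # field names here are assumptions
        "language": "en",
        "published": "2025-09-01",
        "source_url": "https://example.com/mechanical-keyboards",
    },
}

def filter_chunks(chunks, **wanted):
    """Keep only chunks whose metadata matches every requested field."""
    return [
        c for c in chunks
        if all(c["metadata"].get(key) == value for key, value in wanted.items())
    ]

faq_chunks = filter_chunks([chunk_record], content_type="faq", language="en")
```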

5. Versioning And Refresh

Track embedding model versions. Re-embed after upgrades. Refresh indexes on a cadence aligned to content changes. (See: Milvus versioning guidance.)

6. Retrieval Tuning

Use hybrid retrieval (dense + sparse) with RRF. Add re-ranking to prioritize stronger chunks. (See: Weaviate hybrid search best practices.)

A Note On Cookie Banners (Illustration Of Pollution In Theory)

Cookie consent banners are legally required across much of the web. You’ve seen the text: “We use cookies to improve your experience.” It’s boilerplate, and it repeats across every page of a site.

In large systems like ChatGPT or Gemini, you don’t see this text popping up in answers. That’s almost certainly because they filter it out before embedding. A simple rule like “if text contains ‘we use cookies,’ don’t vectorize it” is enough to prevent most of that noise.
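
That rule is a one-liner to implement. A minimal sketch, with marker strings you’d extend for your own boilerplate:

```python
BOILERPLATE_MARKERS = ("we use cookies", "accept all cookies")  # extend per site

def should_embed(chunk: str) -> bool:
    """Skip blocks that are repeated consent or boilerplate text."""
    lowered = chunk.lower()
    return not any(marker in lowered for marker in BOILERPLATE_MARKERS)

chunks = [
    "We use cookies to improve your experience.",
    "Mechanical switches come in linear, tactile, and clicky variants...",
]
print([c for c in chunks if should_embed(c)])  # only the substantive chunk survives
```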

But despite this, cookie banners are still a useful illustration of theory meeting practice. If you’re:

  • Building your own RAG stack, or
  • Using third-party SEO tools where you don’t control the preprocessing,

Then cookie banners (or any repeated boilerplate) can slip into embeddings and pollute your index. The result is duplicate, low-value vectors spread across your content, which weakens retrieval. This, in turn, messes with the data you’re collecting, and potentially the decisions you’re about to make from that data.

The banner itself isn’t the problem. It’s a stand-in for how any repeated, non-semantic text can degrade your retrieval if you don’t filter it. Cookie banners just make the concept visible. And if the systems ignore your cookie banner content, etc., is the volume of that content needing to be ignored simply teaching the system that your overall utility is lower than a competitor without similar patterns? Is there enough of that content that the system gets “lost in the middle” trying to reach your useful content?

Old Technical SEO Still Matters

Vector index hygiene doesn’t erase crawlability or schema. It sits beside them.

  • Canonicalization prevents duplicate URLs from wasting crawl budget. Hygiene prevents duplicate vectors from wasting retrieval opportunities. (See: Google’s canonicalization troubleshooting.)
  • Structured data still helps models interpret your content correctly.
  • Sitemaps still improve discovery.
  • Page speed still influences rankings where rankings exist.

Think of hygiene as a new pillar, not a replacement. Traditional technical SEO makes content findable. Hygiene makes it retrievable in AI-driven systems.

You don’t need to boil the ocean. Start with one content type and expand.

  • Audit your FAQs for duplication and block size (chunk size).
  • Strip noise and re-chunk.
  • Track retrieval frequency and attribution in AI outputs.
  • Expand to more content types.
  • Build a hygiene checklist into your publishing workflow.

Over time, hygiene becomes as routine as schema markup or canonical tags.

Your content is already being chunked, embedded, and retrieved, whether you’ve thought about it or not.

The only question is whether those embeddings are clean and useful, or polluted and ignored.

Vector index hygiene is not THE new technical SEO. But it is A new layer of technical SEO. If crawlability was part of the technical SEO of 2010, hygiene is part of the technical SEO of 2025.

SEOs who treat it that way will still be visible when answer engines, not SERPs, decide what gets seen.


This post was originally published on Duane Forrester Decodes.


Featured Image: Collagery/Shutterstock

How People Really Use LLMs And What That Means For Publishers

OpenAI released the largest study to date on how users really use ChatGPT. I have painstakingly synthesized the findings you and I should pay heed to, so you don’t have to wade through the plethora of useful and pointless insights.

TL;DR

  1. LLMs are not replacing search. But they are shifting how people access and consume information.
  2. Asking (49%) and Doing (40%) queries dominate the market and are increasing in quality.
  3. The top three use cases – Practical Guidance, Seeking Information, and Writing – account for 80% of all conversations.
  4. Publishers need to build linkable assets that add value. It can’t just be about chasing traffic from articles anymore.
Image Credit: Harry Clarkson-Bennett

Chatbot 101

A chatbot is a statistical model trained to generate a text response given some text input. Monkey see, monkey do.

The more advanced chatbots have a training process with two or more stages. In stage one (less colloquially known as “pre-training”), LLMs are trained to predict the next word in a string.

Like the world’s best accountant, they are both predictable and boring. And that’s not necessarily a bad thing. I want my chefs fat, my pilots sober, and my money men so boring they’re next in line to lead the Green Party.

Stage two is where things get a little fancier. In the “post-training” phase, models are trained to generate “quality” responses to a prompt. They are fine-tuned using strategies like reinforcement learning, where graded responses steer the model.

Over time, the LLMs, like Pavlov’s dog, are either rewarded or reprimanded based on the quality of their responses.

In phase one, the model “understands” (definitely in inverted commas) a latent representation of the world. In phase two, its knowledge is honed to generate the best quality response.

With temperature effectively at zero, LLMs will generate exactly the same response time after time, as long as the model and prompt stay the same.

Higher temperatures (closer to 1.0) increase randomness and creativity. Lower temperatures (closer to 0) make the model(s) far more predictive and precise.

So, your use case determines the appropriate temperature settings. Coding should be set closer to zero. Creative, more content-focused tasks should be closer to one.
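
In the OpenAI API, for example, temperature is just a request parameter. A minimal sketch (the model choice and prompts are illustrative):

```python
from openai import OpenAI  # pip install openai; assumes OPENAI_API_KEY is set

client = OpenAI()

# Low temperature for deterministic, precise tasks (e.g., coding)
code_reply = client.chat.completions.create(
    model="gpt-4o",  # model choice is an assumption
    temperature=0.1,
    messages=[{"role": "user", "content": "Write a function that parses ISO dates."}],
)

# Higher temperature for creative, content-focused tasks
creative_reply = client.chat.completions.create(
    model="gpt-4o",
    temperature=0.9,
    messages=[{"role": "user", "content": "Brainstorm five headline ideas."}],
)
```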

I have already talked about this in my article on how to build a brand post AI. But I highly recommend reading this very good guide on how temperature scales work with LLMs and how they impact the user base.

What Does The Data Tell Us?

That LLMs are not a direct replacement for search. Not even that close, IMO. This Semrush study highlighted that LLM super users increased the number of traditional searches they were doing. The expansion theory seems to hold true.

But they have brought on a fundamental shift in how people access and interact with information. Conversational interfaces have incredible value. Particularly in a workplace format.

Who knew we were so lazy?

1. Guidance, Seeking Information, And Writing Dominate

These top three use cases account for 80% of all human-robot conversations. Practical guidance, seeking information, and please help me write something bland and lacking any kind of passion or insight, wondrous robot.

I will concede that the majority of Writing queries are for editing existing work. Still. If I read something written by AI, I will feel duped. And deception is not an attractive quality.

2. Non-Work-Related Usage Is Increasing

  • Non-work-related messages grew from 53% of all usage to more than 70% by July 2025.
  • LLMs have become habitual. Particularly when it comes to helping us make the right decisions. Both in and out of work.

3. Writing Is The Most Common Workplace Application

  • Writing is the most common work use case, accounting for 40% of work-related messages on average in June 2025.
  • About two-thirds of all Writing messages are requests to modify existing user text rather than create new text from scratch.

I know enough people that just use LLMs to help them write better emails. I almost feel sorry for the tech bros that the primary use cases for these tools are so lacking in creativity.

4. Less So Coding

  • Computer coding queries are a relatively small share, at only 4.2% of all messages.*
  • This feels very counterintuitive, but specialist bots like Claude or tools like Lovable are better alternatives.
  • This is a point of note. Specialist LLM usage will grow and will likely dominate specific industries because they will be able to develop better quality outputs. The specialized stage two style training makes for a far superior product.

*Compared to 33% of work-related Claude conversations.

It’s important to note that other studies have some very different takes on what people use LLMs for. So this isn’t as cut and dry as we think. I’m sure things will continue to change.

5. Men No Longer Dominate

  • Early adopters were disproportionately male (around 80% with typically masculine names).
  • That number declined to 48% by June 2025, with active users now slightly more likely to have typically feminine names.

Sure, us men have our flaws. Throughout history maybe we’ve been a tad quick to battle and a little dominating. But good to see parity.

6. Asking And Doing Queries Dominate

  • 89% of all queries are Asking and Doing related.
  • 49% Asking and 40% Doing, with just 11% for Expressing.
  • Asking messages have grown faster than Doing messages over the last year, and are rated higher quality.
A ChatGPT-built table with examples of each query type – Asking, Doing, and Expressing (Image Credit: Harry Clarkson-Bennett)

7. Relationships And Personal Reflection Are Not Prominent

  • There have been a number of studies that state that LLMs have become personal therapists for people (see above).
  • However, relationships and personal reflection only account for 1.9% of total messages according to OpenAI.

8. The Bloody Youth (*Shakes Fist*)

Takeaways

I don’t think LLMs are a disaster for publishers. Sure, they don’t send any referral traffic and have started to remove citations outside of paid users (classic). But none of these tech-heads are going to give us anything.

It’s a race to the moon, and we’re the dog they sent on the test flight.

But if you’re a publisher with an opinion, an audience, and – hopefully – some brand depth and assets to hand, you’ll be ok. Although their crawling behavior is getting out of hand.

Shit-quality traffic and not a lot of it (Image Credit: Harry Clarkson-Bennett)

One of the most practical outcomes we as publishers can take from this data is the apparent change in intents. For eons, we’ve been lumbered with navigational, informational, commercial, and transactional.

Now we have Doing. Or Generating. And it’s huge.

Even simple tools can still drive fantastic traffic and revenue (Image Credit: Harry Clarkson-Bennett)

SEO isn’t dead for publishers. But we do need to do more than just keep publishing content. There’s a lot to be said for espousing the values of AI, while keeping it at arm’s length.

Think BBC Verify. Content that can’t be synthesized by machines because it adds so much value. Tools and linkable assets. Real opinions from experts pushed to the fore.

But it’s hard to scale that quality. Programmatic SEO can drive amazing value. As can tools. Tools that answer users’ “Doing” queries time after time. We have to build things that add value outside of the existing corpus.

And if your audience is generally younger and more trusting, you’re going to have to lean into this more.


This post was originally published on Leadership in SEO.


Featured Image: Roman Samborskyi/Shutterstock

How AI Really Weighs Your Links (Analysis Of 35,000 Datapoints) via @sejournal, @Kevin_Indig

Before we jump in:

  • I hate to brag, but I will say I’m extremely proud to have placed 4th in the G50 SEO World Championships this past week.
  • I’m speaking at NESS, the global News & Editorial SEO Summit, on October 22. Growth Memo readers get 20% off with code “kevin2025.”

Boost your skills with Growth Memo’s weekly expert insights. Subscribe for free!

Historically, backlinks have always been one of the most reliable currencies of visibility in search results.

We know links matter for visibility in AI-based search, but how they work inside LLMs – including AI Overviews, Gemini, or ChatGPT & Co. – is still somewhat of a black box.

The rise of AI search models changes the rules of organic visibility and the competition for share of voice in LLM results.

So the question is, do backlinks still earn visibility in AI-based modalities of search… and if so, which ones?

If backlinks were the currency of the pre-LLM web, this week’s analysis is a first look at whether they’re still legal tender in the new AI search economy.

Together with Semrush, I analyzed 1,000 domains and their AI mentions against core backlink metrics.

Image Credit: Kevin Indig

The data surfaced four clear takeaways:

  1. Backlink-earned authority helps, but it’s not everything.
  2. Link quality outweighs volume.
  3. Most surprisingly, nofollow links pull real weight.
  4. Image links can move the needle on authority.

These findings help us all understand how AI models surface sites, along with exposing what backlink levers marketers can pull to influence visibility.

Below, you’ll find the methodology, deeper data takeaways, and, for premium subscribers, recommendations (with benchmarks) to put these findings into action.

Methodology

For this analysis, I looked at the relationships between AI mentions and backlink metrics for 1,000 randomly selected web domains. All data is from the Semrush AI SEO Toolkit, Semrush’s AI visibility & search analytics platform.

Along with the Semrush team, I examined the number of mentions across:

  • ChatGPT.
  • ChatGPT with Search activated.
  • Gemini.
  • Google’s AI Overviews.
  • Perplexity.

(If you’re wondering where Claude.ai fits in this analysis, we didn’t include it at this time as its user base is generally less focused on web search and more on generative tasks.)

For the platforms above, we measured Share of Voice and the number of AI mentions against the following backlink metrics:

  • Total backlinks.
  • Unique linking domains.
  • Follow links.
  • Nofollow links.
  • Authority Score (a Semrush metric referred to as Ascore below).
  • Text links.
  • Image links.

In this analysis, I used two different ways of measuring correlation across the data: a Pearson correlation and a Spearman correlation.

If you are familiar with these concepts, skip to the next section where we dive into the results.

For everyone else, I’ll break these down so you have a better understanding of the findings below.

Both Pearson and Spearman are correlation coefficients – numbers between -1 and +1 that measure how strongly two different variables are related.

The closer the coefficient is to +1 or -1, the more likely and stronger the correlation. (Near 0 means weak or no correlation at all.)

  • Pearson’s r measures the strength and direction of a linear relationship between two variables, calculated on the raw values. It is sensitive to outliers, and if the relationship curves or has thresholds, Pearson under-measures it.
  • Spearman’s ρ (rho) measures the strength and direction of a monotonic relationship: when one thing increases, does the other usually increase too, even if not in a straight line? Because it ranks the data rather than using raw values, it is more robust to outliers and captures non-linear, monotonic patterns.

A gap between Pearson and Spearman correlation coefficients can mean the gains are non-linear.

In other words: There’s a threshold to cross. And that means the effect of X on Y doesn’t kick in right away.

Examining both the Pearson and Spearman coefficients can tell us if nothing (or very little) happens until you pass a certain point – and then once you exceed that point, the relationship shows up strongly.

Here’s a quick example of what an analysis that involves both coefficients can reveal:

Spending $500 (action X) on ads might not move the needle on sales growth (outcome Y). But once you cross, say, $5,000/month (action X), sales start growing steadily (outcome Y).
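
To see that threshold effect in code, here’s a small sketch with synthetic data; the numbers are illustrative only:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

x = np.arange(1, 101)                            # e.g., monthly ad spend steps
y = np.where(x < 70, 0.01 * x, (x - 69.0) ** 2)  # flat until a threshold, then sharp growth

r, _ = pearsonr(x, y)     # linear fit on raw values
rho, _ = spearmanr(x, y)  # rank-based, captures monotonic patterns

print(f"Pearson r: {r:.2f}, Spearman rho: {rho:.2f}")  # Spearman ~1.0, Pearson noticeably lower
```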

And that’s the end of your statistics lesson for today.

Image Credit: Kevin Indig

The first signal we examined was the strength of the relationship between the number of backlinks a site gets versus its AI Share of Voice.

Here’s what the data showed:

  • Authority Score has a moderate link to Share of Voice (SoV): Pearson ~0.23, Spearman ~0.36.
  • Higher authority means higher SoV, but the gains are uneven. There’s a threshold you need to cross.
  • Authority supports visibility, yet it does not explain most of the variance. What this means is that backlinks do have an impact on AI visibility, but there is more to the story, like your content, brand perceptions, etc.

Also, the number of unique linking domains matters more than the total number of backlinks.

In plain terms, your site is more likely to have a larger SoV when you have links from many different websites than a huge number of links from just a few sites.

Image Credit: Kevin Indig

Across all models, the strongest relationship occurred between Authority Score (0.65 Pearson, 0.57 Spearman) and the number of mentions.

Here’s how Semrush defines the Authority Score measurement:

Authority Score is our compound metric that grades the overall quality of a website or a webpage. The higher the score, the more assumed weight a domain’s or webpage’s outbound links to another site could have.

It takes into account the number and quality of backlinks, organic traffic to link source pages, and the spamminess of the link profile.

Of course, Ascore is just a proxy for quality. LLMs have their own way of arriving at backlink quality. But the data shows that we can use Semrush’s Ascore as a good representative.

Most models value this metric equally for mentions, but ChatGPT Search and Perplexity value it the least compared to the average.

Surprisingly, regular ChatGPT (without search activated) weighs Ascore the most out of all models.

Critical to know: Median mentions jump from ~21.5 in decile 8 to ~79.0 in decile 9. The relationship is non-linear. In other words, the biggest gains come when you hit the upper boundaries of authority, or Ascore in this case.

(For context, a decile is a way of splitting a dataset into 10 equal parts. Each segment, or decile, contains 10% of the data points when they’re sorted in order.)
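
For the curious, here’s how a decile split like this is computed. A minimal pandas sketch with made-up domain data shaped like the finding above:

```python
import pandas as pd

# Hypothetical data: one row per domain (values are illustrative only)
df = pd.DataFrame({
    "ascore":   [12, 25, 31, 44, 52, 60, 67, 73, 81, 95],
    "mentions": [0, 1, 2, 3, 5, 8, 12, 21.5, 79, 140],
})

# Split domains into 10 equal-sized buckets by Authority Score (decile 1 = lowest)
df["decile"] = pd.qcut(df["ascore"], 10, labels=False) + 1

# Median mentions per decile -- the jump between deciles 8 and 9 is the non-linearity
print(df.groupby("decile")["mentions"].median())
```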

Image Credit: Kevin Indig

Perhaps the most significant finding from this analysis is that it doesn’t matter much if the links are set to nofollow or not!

And this has huge implications.

Confirmation of the value of nofollow links is so important because these types of links tend to be easier to build than follow links.

This is where LLMs are distinctly different from search engines: We’ve known for a while that Google also counts nofollow links, but not how much and for what (crawling, ranking, etc).

Once again, you won’t see big gains until you’re in the top 3 deciles, or the top 30% of the data points.

Follow links → Mentions:

  • Pearson 0.334, Spearman 0.504

Nofollow links → Mentions:

  • Pearson 0.340, Spearman 0.509

Conversely, Google’s AI Overviews and Perplexity weighed regular links the highest and nofollow links the least.

And interestingly, Gemini and ChatGPT weigh nofollow links the highest (over regular follow links).

Here’s my own theory as to why Gemini and ChatGPT weigh nofollow more:

With Gemini, I’m curious if Google weighs nofollow links higher than we have believed them to be in the past. And with ChatGPT, my hypothesis is that Bing is also weighing nofollow links higher (once Google started doing it, too). But this is just a theory, and I don’t have the data to support it at this time.

Image Credit: Kevin Indig

Beyond text-based backlinks, we also tested if image-based backlinks carry the same weight.

And in some cases, they had a stronger relationship to mentions than text-based links.

But how strong?

  • Images vs mentions: Pearson 0.415, Spearman 0.538
  • Text links vs mentions: Pearson 0.334, Spearman 0.472

Image links really start to pay off once you already have some authority.

  • From the mid-decile tiers up, the relationship turns positive, then strengthens, and is strongest in the top deciles.
  • In low-Ascore deciles (deciles 1 and 2), the images → mentions tie is weak or negative.

If you are targeting mention growth on Perplexity or Search-GPT, image links are especially productive.

  • Images correlate with mentions most on Perplexity and Search-GPT (Spearman ≈ 0.55 and 0.53), then ChatGPT/Gemini (≈ 0.49 – 0.52), then Google-AI (≈ 0.46).

Featured Image: Paulo Bobita/Search Engine Journal

OpenAI Launches Sora iOS App Alongside Sora 2 Video Model via @sejournal, @MattGSouthern

OpenAI launched the Sora iOS app, beginning an invite-based rollout in the United States and Canada.

With Sora, OpenAI appears to be releasing its first non-ChatGPT consumer app and its first social product.

The app runs on the newly released Sora 2 model for video and synchronized audio.

What’s The Sora App?

Sora is positioned as a creation-first social experience rather than a public-broadcast platform.

It adds social features on top of Sora 2’s generation capabilities, including tools to remix videos and collaborate with friends inside the app.

Custom Feed

The app uses OpenAI’s language models to power a recommender algorithm that accepts natural language instructions.

Users can customize their feed through conversational commands rather than buried settings menus.

By default, the feed prioritizes content from people users follow or interact with.

The Sora team wrote:

“We are not optimizing for time spent in feed, and we explicitly designed the app to maximize creation, not consumption.”

Cameos

Sora centers on “cameos,” which let you place yourself or friends inside AI-generated scenes after a short one-time video and audio capture in the app.

OpenAI says people who appear in cameos control who can use their likeness and can revoke access or remove any video that includes it.

Content Creation

Beyond cameos and feed browsing, the app lets users create original videos through text prompts and remix other users’ generations.

The underlying Sora 2 model can follow multi-shot instructions, maintain world state across scenes, and generate synchronized dialogue and sound effects.

ChatGPT Pro subscribers can access an experimental higher-quality Sora 2 Pro model on sora.com, with app access planned.

The original Sora 1 Turbo remains available, and existing user content stays in personal libraries.

Monetization

OpenAI plans to keep Sora free initially, with generation limits determined by available compute resources.

The company’s revenue strategy involves charging users for extra generations when demand surpasses capacity. No plans for advertising or creator revenue sharing have been announced.

Availability

The app operates on an invite-only basis, with sign-ups available through the iOS app. The App Store listing is live.

Image Credit: Apple App Store

OpenAI says it made Sora invite-only to ensure users arrive with friends already in the app. The company cites feedback indicating that cameos drive the experience, making existing connections essential.

Looking Ahead

For marketers and creators, Sora serves as a new platform for distributing short, AI-generated videos, affirming OpenAI’s focus on developing consumer-oriented tools.

Sora’s adoption will largely depend on accessibility, real-world applications, and how well the feed encourages active creation instead of passive viewing.


Featured Image: Robert Way/Shutterstock

Google AI Mode Gets Visual + Conversational Image Search via @sejournal, @MattGSouthern

Google announced that AI Mode now supports visual search, letting you use images and natural language together in the same conversation.

The update is rolling out this week in English in the U.S.

What’s New

Visual Search Gets Conversational

Google’s update to AI Mode aims to address the challenge of searching for something that’s hard to describe.

You can start with text or an image, then refine results naturally with follow-up questions.

Robby Stein, VP of Product Management for Google Search, and Lilian Rincon, VP of Product Management for Google Shopping, wrote:

“We’ve all been there: staring at a screen, searching for something you can’t quite put into words. But what if you could just show or tell Google what you’re thinking and get a rich range of visual results?”

Google provides an example that begins with a search for “maximalist bedroom inspiration,” and is refined with “more options with dark tones and bold prints.”

Image Credit: Google

Each image links to its source, so searchers can click through when they find what they want.

Shopping Without Filters

Rather than using conventional filters for style, size, color, and brand, you can describe products conversationally.

For example, asking “barrel jeans that aren’t too baggy” will find suitable products, and you can narrow down options further with requests like “show me ankle length.”

Image Credit: Google

This experience is powered by the Shopping Graph, which spans more than 50 billion product listings from major retailers and local shops.

The company says over 2 billion listings are refreshed every hour to keep details such as reviews, deals, available colors, and stock status up to date.

Technical Foundation

Building on Lens and Image Search, AI Mode’s visual capabilities now incorporate Gemini 2.5’s advanced multimodal and language understanding.

Google introduces a technique called “visual search fan-out,” where it runs several related queries in the background to better grasp what’s in an image and the nuances of your question.
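
Google hasn’t shared how the fan-out is implemented. Conceptually, though, the idea is to derive several sub-queries from the image and the question, run them in parallel, and merge the results. The sketch below is purely illustrative; every function name and sub-query pattern in it is an assumption, not Google’s system:

```python
# Conceptual sketch of a query "fan-out": run several derived queries
# in parallel, then merge results. Illustrative only; not Google's system.
from concurrent.futures import ThreadPoolExecutor

def run_query(query: str) -> list[str]:
    # Stand-in for a real retrieval call.
    return [f"result for '{query}'"]

def visual_fan_out(image_subject: str, user_question: str) -> list[str]:
    # Derive related sub-queries from the image content and the question.
    sub_queries = [
        f"{image_subject} {user_question}",
        f"{image_subject} style ideas",
        f"{image_subject} similar looks",
    ]
    with ThreadPoolExecutor() as pool:
        result_lists = pool.map(run_query, sub_queries)
    # Merge, dropping duplicates while preserving order.
    return list(dict.fromkeys(r for results in result_lists for r in results))

print(visual_fan_out("maximalist bedroom", "dark tones and bold prints"))
```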

Plus, on mobile devices, you can search within a specific image and ask conversational follow-ups about what you see.

Image Credit: Google

Additional Context

In a media roundtable attended by Search Engine Journal, a Google spokesperson said:

  • When a query includes subjective modifiers, such as “too baggy,” the system may use personalization signals to infer what you likely mean and return results that better match that preference. The spokesperson didn’t detail which signals are used or how they are weighted.
  • For image sources, the systems don’t explicitly differentiate real photos from AI-generated images for this feature. However, ranking may favor results from authoritative sources and other quality signals, which can make real photos more likely to appear in some cases. No separate policy or detection standard was shared.

Why This Matters

For SEO and ecommerce teams, images are becoming even more essential. As Google gets better at understanding detailed visual cues, high-quality product photos and lifestyle images may boost your visibility.

Since Google updates the Shopping Graph every hour, it’s important to keep your product feeds accurate and up-to-date.

As search continues to become more visual and conversational, remember that many shopping experiences might begin with a simple image or a casual description instead of exact keywords.

Looking Ahead

The new experience is rolling out this week in English in the U.S. Google hasn’t shared timing for other languages or regions.

What OpenAI’s Research Reveals About The Future Of AI Search

The launch of ChatGPT in 2022 didn’t so much cause a shift in the search landscape as trigger a series of seismic events. And, like seismologists, the SEO industry needs data if it’s to predict future tremors and aftershocks – let alone prepare itself for what the landscape might reshape itself into once the ground has finally settled.

So, when OpenAI released a 65-page research paper on Sept. 15, 2025, titled “How People Use ChatGPT,” some of us were understandably excited to finally have some authoritative usage data from inside a major large language model (LLM).

Two key findings leap out:

  1. We’re closer to mass adoption of AI than most probably realize.
  2. How users interact with ChatGPT has fundamentally shifted in the past year.

For SEOs, this isn’t just another adoption study: It’s strategic intelligence about where AI search is heading.

Mass Adoption Is Closer Than You Think

How close is ChatGPT to the tipping point where it will accelerate into mass adoption?

Developed by sociologist Everett Rogers, the diffusion of innovation theory provides us with a useful framework to explain how new technologies spread through society in predictable stages.

First, there are the innovators, accounting for 2.5% of the market. Then, the early adopters come along (13.5%), to be followed by the early majority (34%). At this point, ~50% of the potential market has adopted the technology. Anyone jumping on board after this point can safely be described as either the late majority (34%) or laggards (16%).

The tipping point happens at around 20%, when the new technology is no longer confined to innovators or early adopters but is gradually taken up by the early majority. It’s at this point that mainstream adoption accelerates rapidly.

Now, let’s apply this to ChatGPT’s data.

Since launching in late 2022, ChatGPT’s growth has been staggering. The new report reveals that, in the five-month period from February to July 2025, ChatGPT grew from 400 million to 700 million weekly active users (WAU), sending 18 billion messages per week. That represents an average compound growth of roughly 11-12% month-over-month.

700 million WAU is equivalent to around 10% of the global adult population; impressive, but not quite mass adoption. Yet.

(Side note: Back in April, Sam Altman gave a figure of ~800 million weekly active users when speaking at TED 2025. To avoid confusion, we’ll stick with the official figure of 700 million WAU quoted in OpenAI’s report.)

It’s estimated there were approximately 5.65 billion internet users globally at the start of July 2025. This is the total addressable market (TAM) available to ChatGPT.

20% of 5.65 billion = 1.13 billion WAU. That’s the tipping point.

Even if the growth rate slows to a more conservative 5-6% per month, ChatGPT would already have reached at least 770 million WAU as I write this. At that rate of growth, ChatGPT will cross the mass adoption threshold between December 2025 and August 2026, with April 2026 as the most likely midpoint.

Of course, if the rate of growth remains closer to 11-12%, we can expect to tip over into mass adoption even earlier.

Start Level (July 2025) | Growth (MoM) | Projected WAU, Sept. 2025 (millions) | Approx. Crossing Window
700 million | 4% | 757.12 | Aug 2026
700 million | 5% | 771.75 | May 2026
700 million | 6% | 786.52 | Apr 2026
700 million | 7% | 801.43 | Mar 2026
700 million | 8% | 816.48 | Feb 2026
700 million | 9% | 831.67 | Jan 2026
700 million | 10% | 847.00 | Jan 2026
700 million | 11% | 862.47 | Dec 2025
700 million | 12% | 878.08 | Dec 2025
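
If you want to sanity-check these projections yourself, here’s a minimal sketch of the compounding math behind the table, using the 700 million WAU baseline and the 1.13 billion tipping point established above:

```python
# Minimal sketch: months until WAU crosses the 1.13B tipping point,
# compounding monthly from the July 2025 baseline of 700M.
START_WAU = 700_000_000          # July 2025 baseline
TIPPING_POINT = 1_130_000_000    # 20% of ~5.65B internet users

for rate in range(4, 13):        # 4% to 12% month-over-month
    wau, months = START_WAU, 0
    while wau < TIPPING_POINT:
        wau *= 1 + rate / 100
        months += 1
    print(f"{rate}% MoM: crosses in month {months} (~{wau / 1e6:.0f}M WAU)")
```

Counting months forward from July 2025 reproduces the crossing windows in the table.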

For SEOs, this timeline matters. We don’t have years to prepare for mass AI search adoption. We have months.

The window is rapidly closing for any brands not wanting to be left behind.

The Behavioral Revolution Hiding In Plain Sight

Buried within OpenAI’s usage data is perhaps the most significant finding for search marketers: a fundamental shift in how people are using AI tools.

In June 2024, non-work messages accounted for 53% of all ChatGPT interactions. By June 2025, this figure had climbed to 73%. This is a clear signal that ChatGPT is moving from workplace tool to everyday utility.

Things get even more interesting when we look at the intent behind those queries. OpenAI categorizes user interactions into three types:

  1. Asking (seeking information and guidance).
  2. Doing (generating content or completing tasks).
  3. Expressing (sharing thoughts or feelings with no clear intent).

The data reveals that “Asking” now makes up 51.6% of all interactions, compared to 34.6% for “Doing” and 13.8% for “Expressing.”

Let’s be clear: What ChatGPT categorizes as “Asking” is pretty much synonymous with what we think of as AI search. These are the queries that were once the exclusive domain of search engines.

Users are also increasingly satisfied with the quality of responses to “Asking” queries, rating them Good over Bad at a ratio of 4.45:1. For “Doing” interactions, the Good-to-Bad ratio drops to 2.76:1.

The trend becomes even clearer when we break down interactions by topic. Three topics account for just under 78% of all messages.

  • Practical Guidance (29%).
  • Seeking Information (24%).
  • Writing (24%).

These figures are even more noteworthy when you consider that, in July 2024, “Writing” was easily the most common topic (36%), dropping 12 percentage points in just one year.

And while “Practical Guidance” has remained steady at 29%, “Seeking Information” has shot up 10 percentage points from 14%. What a difference a year makes.

And while “Writing” still accounts for 42% of all work-related messages, the nature of these requests has shifted. Instead of generating content from scratch, two-thirds of writing requests now focus on editing, translating, or summarizing text supplied by the user.

Whichever way you slice it, AI search is now the primary use case for ChatGPT, not content generation. But where does that leave traditional search?

The AI Wars: Battling For The Future Of Search

ChatGPT may be reshaping the landscape, but Google hasn’t been sitting idle.

Currently rolling out to 180 countries worldwide, AI Mode is Google’s biggest response yet to ChatGPT’s encroachment on its territory, setting the scene for what is likely to become a competitive struggle between Google and OpenAI to define and dominate AI search.

ChatGPT has an advantage in having largely established the conversational search behaviors we’re now seeing. Instead of piecing together information by clicking back and forth on links in the SERPs, ChatGPT provides users with complete answers in a fraction of the time.

Meanwhile, Google’s advantage is that AI Mode grounds responses against a highly sophisticated search infrastructure, drawing on decades of web indexing expertise, contextual authority, and myriad other signals.

The stakes are high. If Google doesn’t transition aggressively enough to seize ground in AI search and protect its overall search dominance, it risks becoming the next Ask Jeeves.

That’s why I wouldn’t be surprised at all to see AI Mode become Google’s primary search interface sooner rather than later.

Naturally, this would be a massive disruption to the traditional Google Ads model. Google’s recent launch of a new payment protocol suggests it is already hedging against the risk of falling ad revenue from traditional search.

With everything still so fluid, it’s virtually impossible to predict what the search landscape will eventually look like once the dust has settled and new business models have emerged.

Whichever platform ultimately dominates, it’s all but certain that AI search will be the victor.

Instead of focusing on what we don’t know and waiting for answers, brands can use what they do know about AI search to seize a strategic advantage.

Rethinking Traffic Value

With most websites only seeing ~1-2% of traffic coming from LLMs like ChatGPT, it would be tempting to dismiss AI search as insignificant, a distraction – at least for now.

But with ChatGPT about to hit mass adoption in months, this picture could change very rapidly.

Plus, AI search isn’t primarily about clicks. Users will often get the information they need from AI search without clicking on a single link. AI search is about influence, awareness, and decision support.

However, analyzing traffic from AI sources does reveal some interesting patterns.

Our own research indicates that, in some industries at least, LLM-referred visitors convert at a higher rate than traditional search traffic.

This makes sense. If someone has already engaged with your brand through one or more AI interactions and still chooses to visit your site, they’re doing so with more intent than someone clicking through in search of basic information. Perhaps they’re highly engaged in the topic and want to go deeper. Or perhaps the AI responses have answered their product queries, and they’re now ready to buy.

Even if it results in fewer clicks, this indirect form of brand exposure could become increasingly valuable as AI adoption reaches mass market levels.

If 1-2% of traffic currently comes from AI sources at 10% market adoption, what happens when we reach 20% or 30% adoption? AI-mediated traffic – with its higher conversion rate – could easily grow to 5-10% of total website visits within two years.
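
As a rough back-of-envelope, here’s that scaling logic sketched out. Both assumptions, that AI-referred share scales linearly with adoption and that per-user usage intensifies by some factor, are mine, purely for illustration:

```python
# Back-of-envelope: projecting AI-referred traffic share as adoption grows.
# Linear scaling and the 1.5x intensity factor are illustrative assumptions.
current_share = 0.015      # ~1.5% of site traffic from AI sources today
current_adoption = 0.10    # ~10% of internet users on ChatGPT weekly

for adoption in (0.20, 0.30):
    linear = current_share * (adoption / current_adoption)
    heavier_usage = linear * 1.5   # allow for maturing AI habits
    print(f"At {adoption:.0%} adoption: ~{linear:.1%} linear, "
          f"up to ~{heavier_usage:.1%} with heavier per-user usage")
```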

For many businesses, that’s enough to warrant strategic attention now.

Strategic Implications For Search Marketers

Traditional keyword optimization hasn’t been cutting it for a while. And things aren’t about to get any simpler for anyone hoping to capture the intent-driven queries dominating AI interactions.

Digital marketers and SEOs need to think beyond algorithms, considering aspects that aren’t always so easily captured in a spreadsheet, such as user goals and decision-making processes.

This doesn’t mean we should abandon those SEO fundamentals essential to healthy, scalable growth. And technical SEO remains as important as ever, including proper site structure, fast loading times, and crawlable content.

However, when it comes to the content itself, the emphasis needs to shift toward providing greater depth, expertise, and user value. AI systems are far more likely to reward original, comprehensive, and authoritative information over keyword-optimized but otherwise thin content.

In short, your content needs to be built for “Asking.”

Focus on the underlying needs of the user: information gathering, interpretation, or decision support. And plan your content around “answer objects.” These are modular content components designed to be reused and repurposed by AI when generating responses to specific queries.

Instead of traditional articles targeting specific keywords, build decision frameworks that include goals, options, criteria, trade-offs, and guardrails. Each of these components can provide useful material for AI to cite in responses, whichever AI system that might be.
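
There’s no standard schema for an “answer object,” but as a sketch, a modular component built on that framework might look something like this (every field name and value is my own illustration):

```python
# Illustrative sketch of a modular "answer object": a self-contained
# component an AI system could lift into a response. Schema is hypothetical.
answer_object = {
    "question": "Which CRM suits a 10-person B2B sales team?",
    "short_answer": "Prioritize pipeline visibility and email sync over enterprise features.",
    "goals": ["track deals", "automate follow-ups"],
    "options": ["Tool A", "Tool B"],  # placeholder names
    "criteria": ["price per seat", "integrations", "onboarding time"],
    "trade_offs": "Cheaper tiers usually cap automation and reporting.",
    "guardrails": "Avoid annual contracts before running a 30-day pilot.",
}

for field, value in answer_object.items():
    print(f"{field}: {value}")
```

The point isn’t the format; it’s that each component is self-contained enough for an AI system to cite on its own.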

Preparing for AI search isn’t about looking for ways to game an algorithm. It’s about creating genuinely useful content that helps users make decisions.

For many brands, this will mean moving away from individually optimized pages to entire content ecosystems.

The Way Ahead

OpenAI’s research gives us the most authoritative picture yet of AI search adoption and user behavior. The data shows that we’re approaching a tipping point where AI-mediated search will become mainstream, while user behavior has shifted dramatically toward information seeking over content generation.

Meanwhile, the competitive landscape remains extremely fluid.

The message is clear, for now at least: Build for “Asking.”

Start planning strategies around intent-driven, decision-supporting content now, while the landscape is still evolving.

The businesses that can establish their authority in AI responses now will be in the best position when AI search does reach mass adoption – regardless of which platforms ultimately dominate.

Featured Image: Collagery/Shutterstock