New AI Models Make More Mistakes, Creating Risk for Marketers via @sejournal, @MattGSouthern

The newest AI tools, built to be smarter, make more factual errors than older versions.

As The New York Times highlights, tests show errors as high as 79% in advanced systems from companies like OpenAI.

This can create problems for marketers who rely on these tools for content and customer service.

Rising Error Rates in Advanced AI Systems

Recent tests reveal a trend: newer AI systems are less accurate than their predecessors.

OpenAI’s latest system, o3, got facts wrong 33% of the time when answering questions about people. That’s twice the error rate of its previous system.

Its o4-mini model performed even worse, with a 48% error rate on the same test.

For general questions, the results (PDF link) were:

  • OpenAI’s o3 made mistakes 51% of the time
  • The o4-mini model was wrong 79% of the time

Similar problems appear in systems from Google and DeepSeek.

Amr Awadallah, CEO of Vectara and former Google executive, tells The New York Times:

“Despite our best efforts, they will always hallucinate. That will never go away.”

Real-World Consequences For Businesses

These aren’t just abstract problems. Real businesses are facing backlash when AI gives wrong information.

Last month, Cursor (a tool for programmers) faced angry customers when its AI support bot falsely claimed users couldn’t use the software on multiple computers.

This wasn’t true. The mistake led to canceled accounts and public complaints.

Cursor’s CEO, Michael Truell, had to step in:

“We have no such policy. You’re of course free to use Cursor on multiple machines.”

Why Reliability Is Declining

Why are newer AI systems less accurate? According to a New York Times report, the answer lies in how they’re built.

Companies like OpenAI have used most of the available internet text for training. Now they’re using “reinforcement learning,” which involves teaching AI through trial and error. This approach helps with math and coding, but seems to hurt factual accuracy.

Researcher Laura Perez-Beltrachini explained:

“The way these systems are trained, they will start focusing on one task—and start forgetting about others.”

Another issue is that newer AI models “think” step-by-step before answering. Each step creates another chance for mistakes.
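The compounding effect is easy to see with some back-of-the-envelope arithmetic. This is illustrative only: the 95% per-step accuracy below is an assumed figure, not from the report.

```python
# Illustrative arithmetic only: the 95% per-step accuracy is an assumed
# figure. If each reasoning step is independently correct with probability
# p, the chance an n-step chain contains no error at all is p ** n.
def chain_accuracy(p: float, n: int) -> float:
    return p ** n

for steps in (1, 5, 10, 20):
    print(steps, round(chain_accuracy(0.95, steps), 3))
```

Even with a strong per-step accuracy, a 10-step chain is right only about 60% of the time under this simple independence assumption.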

These findings are concerning for marketers using AI for content, customer service, and data analysis.

AI content with factual errors could hurt your search rankings and brand.

Pratik Verma, CEO of Okahu, tells the New York Times:

“You spend a lot of time trying to figure out which responses are factual and which aren’t. Not dealing with these errors properly basically eliminates the value of AI systems.”

Protecting Your Marketing Operations

Here’s how to safeguard your marketing:

  • Have humans review all customer-facing AI content
  • Create fact-checking processes for AI-generated material
  • Use AI for structure and ideas rather than facts
  • Consider AI tools that cite sources (called retrieval-augmented generation)
  • Create clear steps to follow when you spot questionable AI information
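As a toy illustration of the retrieval idea behind the source-citing point above, here is a minimal sketch in which an answer is only returned together with a citation to a vetted snippet. The sources, scoring function, and query are all invented; a real retrieval-augmented setup would use embeddings and an LLM rather than word overlap.

```python
# Toy sketch of the "cite sources" idea behind retrieval-augmented
# generation: retrieve the best-matching approved snippet and attach it as
# a citation. The sources, scoring, and query here are all invented.
APPROVED_SOURCES = {
    "refund-policy": "Refunds are available within 30 days of purchase.",
    "license-terms": "Each license covers use on up to three machines.",
}

def retrieve(query: str) -> tuple[str, str]:
    """Return the (source_id, snippet) pair sharing the most words with the query."""
    words = set(query.lower().split())
    return max(
        APPROVED_SOURCES.items(),
        key=lambda item: len(words & set(item[1].lower().split())),
    )

source_id, snippet = retrieve("How many machines can I use my license on?")
print(f"{snippet} [source: {source_id}]")
```

The point is the shape of the safeguard: every customer-facing answer carries a pointer back to an approved source that a human can verify.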

The Road Ahead

Researchers are working on these accuracy problems. OpenAI says it’s “actively working to reduce the higher rates of hallucination” in its newer models.

Marketing teams need their own safeguards while still using AI’s benefits. Companies with strong verification processes will better balance AI’s efficiency with the need for accuracy.

Finding this balance between speed and correctness will remain one of digital marketing’s biggest challenges as AI continues to evolve.


Featured Image: The KonG/Shutterstock

ChatGPT Leads AI Search Race While Google & Others Slip, Data Shows via @sejournal, @MattGSouthern

ChatGPT leads the AI search race with an 80.1% market share, according to fresh data from Similarweb.

Over the last six months, OpenAI’s tool has maintained a strong lead despite ups and downs.

Meanwhile, traditional search engines are struggling to grow as AI tools reshape how people find information online.

AI Search Market Share: Today’s Picture

The latest numbers show ChatGPT’s market share rebounding to 80.1%, up from 77.6% a month ago.

Here’s how the competition stacks up:

  • DeepSeek: 6.5% (down from 7.6% last month)
  • Google’s AI tools: 5.6% (up slightly from 5.5% last month)
  • Grok: 2.6% (down from 3.2% last month)
  • Perplexity: 1.5% (down from 1.9% last month)

These numbers are part of Similarweb’s bigger “AI Global” report (PDF link).

Traditional Search Engines Losing Ground

The most important finding may be that traditional search engines aren’t growing:

  • Google: -2% year-over-year
  • Bing: -18% year-over-year (a big drop from +18% in January)
  • Yahoo: -11% year-over-year
  • DuckDuckGo: -6% year-over-year
  • Baidu: -12% year-over-year

Traditional search shows a steady year-over-year decline of 1% to 2%. It’s important to note, however, that Google still has seven times the user base of ChatGPT.

Which AI Categories Are Growing Fastest

While AI is changing search, some AI categories are growing faster than others:

  • DevOps & Code Completion: +103% (over 12 weeks)
  • General AI tools: +34%
  • Music Generation: +12%
  • Voice Generation: +8%

On the other hand, some AI areas are shrinking, including Writing and Content Generation (-12%), Customer Support (-11%), and Legal AI (-70%).

Beyond Search: Other Affected Industries

AI’s impact goes beyond just search engines. Other digital sectors facing big changes include:

  • EdTech: -28% year-over-year (with Chegg down 66% and CourseHero down 69%)
  • Website Builders: -13% year-over-year
  • Freelance Platforms: -19% year-over-year

Design platforms are still growing at +10% year-over-year, suggesting that AI might be helping rather than replacing these services.

What This Means

Traditional SEO still matters, but it isn’t enough. As traditional search traffic drops, you need to branch out.

Similarweb’s data shows consistent negative growth for traditional search engines alongside ChatGPT’s dominant market position, indicating a significant shift in information discovery patterns.

The takeaway for search marketers is to adapt to AI-driven search while keeping up with practices that work in traditional search. This balanced approach will be key to success in 2025 and beyond.


Featured Image: Fajri Mulia Hidayat/Shutterstock

Factors To Consider When Implementing Schema Markup At Scale via @sejournal, @marthavanberkel

Organizations adopting schema markup at scale often see a boost in non-branded search queries, signaling broader topic authority and improved discoverability.

It has also become a powerful answer to a pressing executive question: “What are we doing about generative AI?” One smart answer is, “We’re implementing schema markup.”

In March 2025, Fabrice Canel, principal program manager at Bing, confirmed that Microsoft uses structured data to support how its large language models (LLMs) interpret web content.

Just a day later, at Google’s Search Central Live event in New York, Google structured data engineer Ryan Levering shared that schema markup plays a critical role in grounding and scaling Google’s own generative AI systems.

“A lot of our systems run much better with structured data,” he noted, adding that “it’s computationally cheaper than extracting it.”

This is unsurprising to hear since schema markup, when done semantically, creates a knowledge graph, a structured framework of organizing information that connects concepts, entities, and their relationships.

A 2023 study by Data.world found that enterprise knowledge graphs improved LLM response accuracy by up to 300%, underscoring the value structured data brings to AI initiatives.

With Google continuing to dominate both search and AI – most recently launching Gemini 2.5 in March 2025, which topped the LMArena leaderboard – the intersection between structured data and AI is only growing more critical.

With that in mind, let’s explore the four key factors to consider when implementing schema markup at scale.

1. Establish Your Goal For Implementing Schema Markup

Before you invest in doing schema markup at scale, let’s explore the business outcomes you can achieve with the different schema markup implementations.

There are three different levels of schema markup complexity:

  1. Basic schema markup.
  2. Internal and external linked schema markup.
  3. Full representation of your content with a content knowledge graph.
| Level Of Schema Markup | Outcome | Strategy |
| --- | --- | --- |
| Basic schema markup | Rich results with higher click-through rates. | Implement schema markup for required properties. |
| Internal and external linked entities within schema markup | Increase in non-branded queries. Entities can be fully understood by AI and search engines. | Define key entities within the page and add them to your schema markup. Link entities within the website and to external knowledge bases for clarity. |
| Content knowledge graph (a full representation of your content) | Content is fully understood in context. A reusable semantic data layer that enables accurate inferencing and supports LLMs. | Define all important elements of a page using the Schema.org vocabulary and elaborate entity linking to enable accurate extraction of facts about your brand. |

Basic Schema Markup

Basic schema markup is when you choose to optimize a page specifically to achieve a rich result.

You look at the minimum required properties from Google’s Documentation and add them to the markup on your page.

The benefits of basic schema markup come from being eligible for a rich result. Achieving this enhanced search result can help your page stand out on the search engine results page (SERP), and it typically yields a higher click-through rate.
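For illustration, a minimal Article markup of this kind might look like the following sketch. All values are placeholders, and the exact property list depends on the rich result you target in Google’s documentation.

```python
import json

# A minimal Article JSON-LD sketch. All values are placeholders; consult
# Google's documentation for the properties your target rich result needs.
article_markup = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example Headline",
    "datePublished": "2025-05-01",
    "author": {"@type": "Person", "name": "Jane Doe"},
    "image": "https://example.com/featured.jpg",
}

# This JSON would be embedded in a <script type="application/ld+json"> tag.
print(json.dumps(article_markup, indent=2))
```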

Internal And External Linked Entities Within The Schema Markup

Building on your basic schema markup, you can use the Schema.org vocabulary to clarify the entities on your website and how they connect with each other.

An entity refers to a single, unique, well-defined, and distinguishable thing or idea. Examples of an entity on your website include your organization, employees, products, services, blog articles, etc.

You can clarify a topic by linking an entity mentioned on your page to a corresponding external entity definition on Wikidata, Wikipedia, or Google’s knowledge graph.

This enables search engines to clearly understand the entity mentioned on your website, which results in measurable increases in non-branded queries related to that entity or topic.

You can also provide context on how entities on your site are connected by using the appropriate property to link your entity and its identifier.

For example, if you had a page that outlined your product geared toward women, you would use external entity linking to clarify that the audience is women.

If the page also lists related products or services, your schema markup would be used to point to where those related products and services are defined on your site.

When you do this, you provide a holistic and complete view of the content on your page.

With these internal and external entities fully defined, AI and search engines can understand and contextualize your entities accurately.
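In JSON-LD terms, that women’s-product example can be sketched roughly as follows, with placeholder URLs: `sameAs` links an entity on the page to an external definition (here, Wikidata’s entry for “woman”), while `@id` references connect entities across your own site.

```python
import json

# Sketch of internal and external entity linking; all URLs are placeholders.
# "sameAs" points an entity at an external definition, while "@id"
# references link entities defined elsewhere on the same site.
product_markup = {
    "@context": "https://schema.org",
    "@type": "Product",
    "@id": "https://example.com/products/running-shoe#product",
    "name": "Example Running Shoe",
    "audience": {
        "@type": "PeopleAudience",
        "suggestedGender": "female",
        "sameAs": "https://www.wikidata.org/wiki/Q467",  # external entity link
    },
    # Internal entity link: a related product defined elsewhere on the site.
    "isRelatedTo": {"@id": "https://example.com/products/insole#product"},
}

print(json.dumps(product_markup, indent=2))
```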

Full Representation Of Your Content As A Content Knowledge Graph

The final level of schema markup involves using Schema.org to define all page content. This creates a content knowledge graph, which is the most strategic use case of schema markup and has the greatest potential impact on the business.

The benefit of building a content knowledge graph lies in providing an accurate semantic data layer to both search engines and AI to fully understand your brand and the content on your website.

By defining the relationships between things on the website, you give them what they need to get accurate, clear answers.

In addition to how search engines use this robust schema markup, internal AI initiatives can use it to accelerate training on your web data.
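One way to picture a content knowledge graph is a single `@graph` whose entities reference each other by `@id`. The names and URLs below are illustrative placeholders.

```python
import json

# Sketch of a content knowledge graph: one @graph connecting an
# organization, an author, and an article via "@id" references.
# All names and URLs are illustrative placeholders.
knowledge_graph = {
    "@context": "https://schema.org",
    "@graph": [
        {
            "@type": "Organization",
            "@id": "https://example.com/#org",
            "name": "Example Co",
        },
        {
            "@type": "Person",
            "@id": "https://example.com/team/jane#person",
            "name": "Jane Doe",
            "worksFor": {"@id": "https://example.com/#org"},
        },
        {
            "@type": "Article",
            "@id": "https://example.com/blog/post#article",
            "headline": "Example Post",
            "author": {"@id": "https://example.com/team/jane#person"},
            "publisher": {"@id": "https://example.com/#org"},
        },
    ],
}

print(json.dumps(knowledge_graph, indent=2))
```

Because every relationship is explicit, the same graph can be queried by search engines and by internal AI systems alike.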

Now that you have decided what kind of schema markup you need to achieve your business goals, let’s talk about the role cross-functional stakeholders play in helping you do schema markup at scale.

2. Cross-Departmental Collaboration And Buy-In

The SEO team often initiates schema markup. They define the strategy, map Schema.org types to key pages, and validate the markup to ensure it’s indexed by search engines.

However, while SEO professionals may lead the charge, schema markup is not just an SEO task.

Successful schema markup implementation at scale requires alignment across multiple departments that can all derive business results from this strategy.

To maximize the value of your schema markup strategy, consider these key stakeholders before you get started:

Content Team

Whether it’s your core content team, lines of business, or a center of excellence, the teams who own the content on the website play a critical role.

Your schema markup is only as good as the content on the page. If you want to achieve a rich result and gain visibility for a specific entity, you need to ensure your page has the required content to make it eligible for this result.

Help your content team understand the value of structured data and how it helps them achieve their goals, so they’ll be motivated to make the content adjustments needed to support your schema markup strategy.

IT Team

No matter how you plan to implement schema markup, whether internally or through a vendor, your IT team’s buy-in is essential.

If you’re working with a vendor, IT will support setting up integrations and enforce security protocols. Their support is critical for enabling deployment while protecting your infrastructure.

If you’re managing schema markup in-house, IT will be responsible for the technical implementation, building advanced capabilities such as entity recognition, and ongoing maintenance.

Without their partnership, scaling and creating an agile, high-value schema markup strategy will be a challenge.

Either way, securing IT’s support early on ensures smoother implementation, stronger data governance, and long-term success.

Executive Team

Your executive leadership team ultimately determines where you should put your dollars to get the best return on investment (ROI).

They want to see the ROI and understand how this strategy helps them prepare for AI, and also stay competitive in the market.

Clear reporting on the outcomes of your structured data efforts will help secure ongoing executive support.

Educating them on how schema markup can help their brand visibility, AI search understanding, and accelerate internal AI initiatives can often help get them on board.

Innovation Team

As mentioned earlier, you can use schema markup to develop a semantic data layer, also known as a content knowledge graph.

This can be useful for your innovation or AI governance team as they could use this data layer to ground their LLMs and accelerate internal AI programs.

Your innovation team will want to understand this potential, especially if AI is a priority on the roadmap.

Pro tip: Communicate early and often. Sharing both the why and the wins will keep cross-functional teams aligned and invested as your schema markup strategy scales.

3. Capability Readiness For Doing Schema Markup At Scale

Now that you know what type of schema markup you want to implement at scale and have the cross-functional team aligned, there are some technical capabilities you need to consider.

When looking to do schema markup at scale, here are key capabilities required from either your IT team or vendor to achieve your desired outcomes.

Basic Schema Markup Capabilities

For basic schema markup aimed at rich results, implementing at scale requires two core capabilities: mapping content to the properties Google requires for a rich result, and integrating the markup so it appears on page load where Google can see it. The key factor that simplifies this process is having a well-templated website.

Your team or vendor can map the schema markup and required properties from Google to the appropriate content elements on the page and generate the JSON-LD using these mappings.
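Under those assumptions, the mapping step might be sketched like this, with hypothetical CMS field names standing in for your template’s real fields:

```python
import json

# Sketch of scaling markup on a templated site: one mapping function turns
# every CMS record built on the same template into JSON-LD. The field names
# ("title", "published", "author") are hypothetical CMS fields.
def page_to_jsonld(page: dict) -> str:
    markup = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": page["title"],
        "datePublished": page["published"],
        "author": {"@type": "Person", "name": page["author"]},
    }
    return json.dumps(markup)

pages = [
    {"title": "Post A", "published": "2025-01-10", "author": "Jane Doe"},
    {"title": "Post B", "published": "2025-02-03", "author": "John Roe"},
]
print(len([page_to_jsonld(p) for p in pages]), "JSON-LD blocks generated")
```

Because every page built on the template shares the same fields, one mapping covers thousands of pages.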

Internal And External Entity Linking Capabilities

If you want to do internal and external entity linking within your schema markup at scale, you require more complex capabilities to identify, define, and nest entities within your schema markup.

To identify your internal and external entities and nest them within your schema markup to showcase their relationships, your team or vendor will need the ability to do Named Entity Recognition (NER).

NER extracts named entities and disambiguates the terms.

In addition to extracting proper nouns, you will want the technology to recognize your business terms, products, people, and events that perhaps aren’t yet notable enough to warrant a Wikipedia page.

Once the entity is identified, you will need the capability to look up the Entity Definition in a reference knowledge base. This is often done with an API to Wikidata or Google’s knowledge graph.

Now that the entity is defined, you will need the capability to dynamically insert the entity with the appropriate relationship within your schema markup.

To ensure accurate and complete entity identification and relationship mapping, you want human-in-the-loop controls to fine-tune matches in your domain.
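A toy stand-in for that pipeline is sketched below. Real systems use trained NER models and live Wikidata or Google Knowledge Graph APIs; this gazetteer and its IDs are purely illustrative.

```python
# Toy stand-in for the NER -> entity-lookup -> markup pipeline. Real systems
# use trained NER models and live Wikidata / Google Knowledge Graph APIs;
# this gazetteer and its IDs are illustrative.
KNOWN_ENTITIES = {
    "Acme Widget": "https://example.com/products/acme-widget#product",
    "Python": "https://www.wikidata.org/wiki/Q28865",
}

def annotate(text: str) -> list[dict]:
    """Return schema.org 'mentions' entries for entities found in the text."""
    return [
        {"@type": "Thing", "name": name, "@id": entity_id}
        for name, entity_id in KNOWN_ENTITIES.items()
        if name in text
    ]

print(annotate("We build the Acme Widget in Python."))
```

The human-in-the-loop step would review these matches before the `mentions` entries are inserted into the page’s schema markup.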

Full Content Knowledge Graph Representation

For a full representation of your content knowledge graph, which can scale and update dynamically with your content, you will need to add further natural language processing capabilities.

Specifically, your vendor or IT will need to have the ability to identify the semantic relationship between entities in the text (relation extraction) and the ability to identify the concepts within sentences (semantic parsing).

Alternatively, you can do these three functions (NER, relation extraction, and semantic parsing) with a large language model.

LLMs dramatically improve this functionality with some caveats, which include high cost, lack of explainability, and hallucinations.

Once the semantic schema markup is created, your IT or vendor will store the schema markup in a database or knowledge graph and monitor the data to ensure business outcomes.

Finally, depending on the business case, you’ll want the capability to re-use your knowledge graph, so ensure that your knowledge graph data is available to be queried by other tools and systems.

4. The Maintenance Factor

Schema markup isn’t a “set it and forget it” strategy.

Your website content is constantly evolving, especially in enterprise organizations, where different teams may be publishing new content daily.

To remain accurate and effective, your schema markup needs to be dynamic and stay up to date alongside any content changes.

Apart from your website, the broader search landscape is also rapidly shifting.

Between Google’s frequent updates and the growing influence of AI platforms that consume and interpret your content, your schema markup strategy needs to be agile and adaptable.

Consider having someone on your team focused on evolving your schema markup in alignment with business goals and desired outcomes.

Whether it’s an internal resource or a vendor partner, this individual should be adaptable and bear a growth mindset.

They’ll measure the impact of your schema markup, as well as test and measure new strategies (like those mentioned above) to help you thrive in search and AI-driven experiences.

In this ever-changing search landscape, agility is key. The ability to iterate quickly is critical to staying ahead of your competitors in today’s fast-moving digital environment.

Finally, don’t overlook the importance of ongoing monitoring.

Ensuring your markup remains valid and accurate across all key pages is where long-term value is realized.

Many organizations forget this step, but it’s often where the biggest gains in performance and visibility happen.

Schema Markup Is A Business Growth Lever

Schema markup is not just an SEO tactic to achieve rich results. It’s a business growth lever that can drive discoverability, support AI readiness, and fuel long-term business growth.

Depending on the business outcome your organization is targeting – whether it’s improved search visibility, AI initiatives, deeper content intelligence, or all of the above – different factors will take priority.

That’s why CMOs and digital leaders must treat structured data as a core component of their marketing and digital transformation strategy and carefully consider how they will scale it for the best outcomes.



Featured Image: Just Life/Shutterstock

Google Expands AIO Coverage In Select Industries via @sejournal, @martinibuster

BrightEdge Generative Parser™ detected an expansion of AI Overviews beginning April 25th, covering a larger quantity of entertainment and travel search queries, with noteworthy growth in insurance, B2B technology and education queries.

Expanded AIO Coverage

Expansion of AIO coverage for actor filmographies represented the largest growth area for the entertainment sector, accounting for 76.34% of new query coverage. In total, the entertainment sector experienced approximately 175% expansion of AI Overview coverage.

Geographic-specific travel queries experienced substantial coverage growth of approximately 108%, showing up in greater numbers for people searching for activities in specific travel destinations within specific time periods. These are complex search queries that are difficult to get right with normal organic search.

B2B Technology

The technology space continues to experience steady growth of approximately 7%, while the insurance topic shows a slightly greater expansion of nearly 8%. These two sectors merit closer examination because they suggest publishers should rely less on keyword search performance and instead focus on growing mindshare among the audiences likely to be interested in these topics. Doing this may also help generate the external signals of relevance that Google may look for when determining which topics a website is authoritative and expert in.

According to BrightEdge:

“Technical implementation queries for containerization (Docker) and data management technologies are gaining significant traction, with AIOs expanding to address specific coding challenges.”

That suggests Google is stepping up coverage of how-to queries to help people navigate the blizzard of new technologies, services, and products that appear every month.

Education Queries

The education sector also continues to see steady growth, with a nearly 5% expansion of AIO keyword coverage. Nearly 32% of that growth comes from keywords associated with online learning, with particular focus on specialized degree programs and professional certifications in new and emerging fields.

BrightEdge commented on the data:

“Industry-specific expansion rates directly impact visibility potential. Intent patterns are unique to each vertical – success requires understanding the specific query types gaining AI Overviews in YOUR industry, not just high-volume terms. Google is building distinct AI Overview patterns for each sector.”

Jim Yu, CEO of BrightEdge, observes:

“The data is clear, Google is reshaping search with AI-first results in highly specific ways across different verticals. What works in one industry won’t translate to another.”

Takeaways

Entertainment Sector Sees Largest AIO Growth

  • Actor filmographies dominate expanded coverage, making up over 76% of entertainment-related expansions.
  • Entertainment queries in AIO expanded by about 175%.

Travel AIO Coverage Grows For Location-Specific Queries

  • Geographic and time-specific activity searches expanded by roughly 108%.
  • AIO is increasingly surfacing for complex trip planning queries.

Steady AIO Expansion In B2B Technology

  • About 7% growth, with increasing coverage of technical topics.
  • Google appears to target how-to queries in fast-growing technology sectors.

Insurance Sector Expansion Signals Broader Intent Targeting

  • Insurance topics coverage by AIO grew by nearly 8%.

Education Sector Growth Is Focused On Online Learning

  • 5% increase overall, with nearly one-third of new AIO coverage tied to online programs and professional certifications in emerging fields.

Sector-Specific AIO Patterns Require Tailored SEO Strategies

Success depends on understanding AIO triggers within your vertical rather than relying solely on high-volume keywords, which means taking a more nuanced approach to topics. Google’s AI-first indexing is reshaping how publishers need to think about search visibility.

Featured Image by Shutterstock/Sergey Nivens

Google AI Mode Exits Waitlist, Now Available To All US Users via @sejournal, @MattGSouthern

Google has removed the waitlist for AI Mode in Search. This Gemini-powered search tool is now available to all US users.

The update introduces new features, including visual cards for places and products, shopping integration, and a history panel for desktop users.

This growth aligns with Google’s recent earnings reports, which indicate that investments in AI are yielding financial returns.

AI Mode Now Available to All US Users

Previously, AI Mode was only available to participants in Google Labs. Now, anyone in the United States can access it.

Google reports that early users provided “incredibly positive feedback” about the tool.

The announcement reads:

“Millions of people are using AI Mode in Labs to search in new ways. They’re asking longer, harder questions, using follow-ups to dig deeper, and discovering new websites and businesses.”

New Visual Cards for Places and Products

The update adds visual cards to AI Mode results. These cards help users take action after getting information.

For local businesses, cards show:

  • Ratings and reviews
  • Opening hours
  • How busy a place is right now
  • Quick buttons to call or get directions

Here’s an example of a local business query in Google’s AI mode:

Image Credit: Google

For products, cards include:

  • Current prices and deals
  • Product images
  • Shipping details
  • Local store availability

Google’s announcement reads:

“This is made possible by Google’s trusted and up-to-date info about local businesses, and our Shopping Graph — with over 45 billion product listings.”

It’s worth noting this expansion comes days after OpenAI announced an upgrade to ChatGPT’s shopping capabilities.

History Panel for Continuous Research

Google has added a new left-side panel on desktop that saves your past AI Mode searches. This helps with ongoing research projects. You can:

  • Return to previous search topics
  • Pick up where you left off
  • Ask follow-up questions
  • Take the next steps based on what you found earlier

Here’s an example of what it looks like:

Image Credit: Google

Limited Test Outside of Labs

Google plans to test AI Mode beyond the Labs environment. The company says:

“In the coming weeks, a small percentage of people in the U.S. will see the AI Mode tab in Search.”

This indicates that Google is moving cautiously toward broader integration.

AI Mode Capabilities

Google’s AI Mode utilizes a technology called “query fan-out.” This means it runs multiple searches at once across different topics and sources. It then combines this information into a comprehensive answer, providing links to sources.

The system also supports image search. You can upload pictures and ask questions about them. It combines Google Lens, which identifies objects, with Gemini’s reasoning abilities to understand and explain what’s in the image.
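The fan-out pattern described above can be sketched as parallel sub-queries whose results are then merged into one answer with source links. Everything below, including the fake search function and the sub-queries, is invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy sketch of the query fan-out pattern: issue several sub-queries in
# parallel, then merge results into one answer with source links. The
# sub-queries and the fake search function are invented for illustration.
def fake_search(query: str) -> dict:
    slug = query.replace(" ", "-")
    return {"query": query, "url": f"https://example.com/{slug}"}

def fan_out(sub_queries: list[str]) -> list[dict]:
    with ThreadPoolExecutor() as pool:
        return list(pool.map(fake_search, sub_queries))  # map preserves order

results = fan_out(["rome attractions", "rome restaurants", "rome weather april"])
for r in results:
    print(r["query"], "->", r["url"])
```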

AI Investment Reflected in Earnings

The expansion of AI Mode follows strong financial results from Google.

Despite concerns that AI might harm traditional search, Google Search revenue increased 10% to $50.7 billion in Q1 2025. This suggests AI is helping, not hurting, their core business.

Google plans to invest $75 billion in capital improvements in 2025, including infrastructure to support its AI features.

In February, CEO Sundar Pichai announced:

  • 11 new Cloud regions and data centers worldwide.
  • 7 new undersea cable projects to improve global connectivity.

Alphabet’s spending on infrastructure jumped 43% to $17.2 billion in Q1 2025.

Pichai claims that modern data centers now deliver four times more computing power using the same amount of energy.

For marketers, this financial context matters. Google’s investment in AI search isn’t just a tech experiment. It’s a core business strategy that’s already showing positive returns.

As these AI-powered search experiences continue to grow, marketing strategies must evolve to remain visible.

What This Means for Digital Marketers

For SEO and marketing professionals, these updates signal the following trends:

  • Visual content is becoming increasingly important as Google improves its ability to understand and display images in search results.
  • Local SEO remains critical, with business details appearing directly in AI Mode responses.
  • As AI Mode pulls from Google’s Shopping Graph, product data feeds must be accurate and complete.
  • Long-form content addressing complex questions may become more valuable, as AI Mode is better equipped to handle longer, more nuanced queries.
  • Google’s success with AI search, resulting in 10% revenue growth in Q1 2025, indicates that these features will continue to expand.

Availability

To access AI Mode, you need:

  • To be in the United States
  • To be at least 18 years old
  • The latest Google app or Chrome browser
  • Search history turned on

You can access AI Mode through google.com/aimode, the Google.com homepage (tap AI Mode below the search bar), or the Google app.

AI Search & SEO: Key Trends and Insights [Webinar] via @sejournal, @lorenbaker

As AI continues to reshape search, marketers and SEOs are facing a new set of challenges and opportunities. 

From the rise of AI Overviews to shifting SERP priorities, it’s more important than ever to know what to focus on in 2025.

Why This Webinar Is a Must-Attend Event

In this session, you’ll learn how to:

  • Adapt your approach to optimize for both answer engines and traditional search engines.
  • Create top-of-SERP content that stands out to AI Overviews.
  • Update technical SEO strategies for the AI era.
  • Use success in conversions as the overall KPI.

Expert Insights From Conductor

Join Shannon Vize, Sr. Content Marketing Manager at Conductor, and Pat Reinhart, VP of Services & Thought Leadership, as they walk through the biggest search and content shifts shaping 2025. From Google’s AI Overviews to new content strategies that actually convert, you’ll get clear guidance to help you move forward with confidence.

Don’t Miss Out!

Join us live and walk away with a clear roadmap for leading your SEO and content strategy in 2025.

Can’t attend live?

Register anyway and we’ll send you the full recording to watch at your convenience.

We Figured Out How AI Overviews Work [& Built A Tool To Prove It] via @sejournal, @mktbrew

This post was sponsored by Market Brew. The opinions expressed in this article are the sponsor’s own.

Wondering how to realign your SEO strategy for maximum SERP visibility in AI Overviews (AIO)?

Do you wish you had techniques that mirror how AI understands relevance?

Imagine if Google handed you the blueprint for AI Overviews:

  • Every signal.
  • Every scoring mechanism.
  • Every semantic pattern it uses to decide what content makes the cut.

That’s what our search engineers did.

They reverse-engineered how Google’s AI Overviews work and built a model that shows you exactly what to fix.

It’s no longer about superficial tweaks; it’s about aligning with how AI truly evaluates meaning and relevance.

In this article, we’ll show you how to rank in AIO SERPs by creating embeddings for your content and how to realign your content for maximum visibility by using AIO tools built by search engineers.

The 3 Key Features Of AI Overviews That Can Make Or Break Your Rankings

Let’s start with the basic building blocks of a Google AI Overviews (AIO) response:

What Are Embeddings?

Embeddings are high-dimensional numerical representations of text. They allow AI systems to understand the meaning of words, phrases, or even entire pages, beyond just the words themselves.

Rather than matching exact terms, embeddings turn language into vectors, or arrays of numbers, that capture the semantic relationships between concepts.

For example, “car,” “vehicle,” and “automobile” are different words, but their embeddings will be close in vector space because they mean similar things.

Large language models (LLMs) like ChatGPT or Google Gemini use embeddings to “understand” language; they don’t just see words, they see patterns of meaning.
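To make that concrete, here is a toy sketch with invented three-dimensional vectors (real embeddings have hundreds or thousands of dimensions; these numbers are made up purely for illustration):

```python
import math

# Toy 3-D "embeddings" (values invented purely for illustration).
vectors = {
    "car":        [0.90, 0.10, 0.00],
    "automobile": [0.85, 0.15, 0.05],
    "banana":     [0.05, 0.20, 0.90],
}

def cosine(a, b):
    """Cosine similarity: ~1.0 means same direction (similar meaning)."""
    dot = sum(x * y for x, y in zip(a, b))
    mag = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / mag

print(cosine(vectors["car"], vectors["automobile"]))  # close to 1: similar meaning
print(cosine(vectors["car"], vectors["banana"]))      # close to 0: unrelated
```

Words with similar meanings point in similar directions in vector space, so their cosine similarity is close to 1 even though the strings themselves never match.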

What Are Embeddings? Infographic (Image created by MarketBrew.ai, April 2025)

Why Do Embeddings Matter For SEO?

Understanding how Large Language Models (LLMs) interpret content is key to winning in AI-driven search results, especially with Google’s AI Overviews.

Search engines have shifted from simple keyword matching to deeper semantic understanding. Now, they rank content based on contextual relevance, topic clusters, and semantic similarity to user intent, not just isolated words.

Vector Representations of Words (Image created by MarketBrew.ai, April 2025)

Embeddings power this evolution.

They enable search engines to group, compare, and rank content with a level of precision that traditional methods (like TF-IDF, keyword density, or Entity SEO) can’t match.

By learning how embeddings work, SEOs gain tools to align their content with how search engines actually think, opening the door to better rankings in semantic search.

The Semantic Algorithm Galaxy (Image created by MarketBrew.ai, April 2025)

How To Rank In AIO SERPs By Creating Embeddings

Step 1: Set Up Your OpenAI Account

  • Sign Up or Log In: If you haven’t already, sign up for an account on OpenAI’s platform at https://platform.openai.com/signup.
  • API Key: Once logged in, you’ll need to generate an API key to access OpenAI’s services. You can find this in your account settings under the API section.

Step 2: Install The OpenAI Python Client

OpenAI provides a Python client that simplifies the process of interacting with their API. To install it, run the following command in your terminal or command prompt:

pip install openai

Step 3: Authenticate With Your API Key

Before making requests, you need to authenticate using your API key. Here’s how you can set it up in your Python script:

from openai import OpenAI

client = OpenAI(api_key="your-api-key-here")

Step 4: Choose Your Embedding Model

At the time of writing, OpenAI’s text-embedding-3-small is one of the most capable and cost-efficient embedding models, suitable for a wide range of text processing tasks.

Step 5: Create Embeddings For Your Content

To generate an embedding for a piece of text:

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="This is an example sentence."
)

embeddings = response.data[0].embedding
print(embeddings)

The result is a list of numbers representing the semantic meaning of your input in high-dimensional space.

Step 6: Store Embeddings

Store embeddings in a database for future use; tools like Pinecone or PostgreSQL with pgvector are great options.
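If you just want to experiment before adopting Pinecone or pgvector, a minimal stopgap is to persist embeddings to a JSON file keyed by URL or chunk ID, using only the standard library (the URLs and vectors below are hypothetical):

```python
import json
from pathlib import Path

# Hypothetical chunk IDs mapped to (truncated) embedding vectors.
store = {
    "https://example.com/page-1#chunk-0": [0.12, -0.03, 0.41],
    "https://example.com/page-1#chunk-1": [0.08, 0.22, -0.15],
}

path = Path("embeddings.json")
path.write_text(json.dumps(store))     # persist after calling the API
loaded = json.loads(path.read_text())  # reload later for similarity comparisons
print(len(loaded))  # 2
```

A flat file works fine for a few thousand vectors; a dedicated vector store earns its keep once you need fast nearest-neighbor search at scale.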

Step 7: Handle Large Text Inputs

For large content, break it down into paragraphs or sections and generate embeddings for each chunk.

Use similarly sized chunks for better cosine similarity calculations. To represent an entire document, you can average the embeddings for each chunk.
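Averaging chunk embeddings into a single document-level vector is just an element-wise mean. A minimal sketch with invented three-dimensional chunk vectors:

```python
# Hypothetical per-chunk embeddings for one document (values invented).
chunk_embeddings = [
    [0.2, 0.4, 0.0],
    [0.4, 0.0, 0.2],
    [0.0, 0.2, 0.4],
]

def mean_vector(vectors):
    """Element-wise average of equal-length vectors."""
    return [sum(dim) / len(vectors) for dim in zip(*vectors)]

doc_embedding = mean_vector(chunk_embeddings)
print(doc_embedding)  # each dimension averages to ~0.2
```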

💡Pro Tip: Use Market Brew’s free AI Overviews Visualizer. The search engineer team at Market Brew has created this visualizer to help you understand exactly how embeddings, the fourth generation of text classifiers, are used by search engines.

Semantics: Comparing Embeddings With Cosine Similarity

Cosine similarity measures the similarity between two vectors (embeddings), regardless of their magnitude.

This is essential for comparing the semantic similarity between two pieces of text.

How Does Cosine Similarity Work? (Image created by MarketBrew.ai, April 2025)

Typical search engine comparisons include:

  1. Keywords with paragraphs,
  2. Groups of paragraphs with other paragraphs, and
  3. Groups of keywords with groups of paragraphs.
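Those comparisons boil down to ranking candidate vectors by cosine similarity against a query vector. A toy sketch (all vectors invented for illustration):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical embeddings: one query and three paragraphs (values invented).
query = [0.9, 0.1, 0.1]
paragraphs = {
    "para-1": [0.8, 0.2, 0.1],  # on-topic
    "para-2": [0.1, 0.9, 0.2],  # off-topic
    "para-3": [0.7, 0.1, 0.3],  # related
}

# Rank paragraphs by semantic similarity to the query, best match first.
ranked = sorted(paragraphs, key=lambda p: cosine(query, paragraphs[p]), reverse=True)
print(ranked)  # ['para-1', 'para-3', 'para-2']
```

Because cosine similarity ignores vector magnitude, a short paragraph and a long one can score equally well if they point in the same semantic direction.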

Next, search engines cluster these embeddings.

How Search Engines Cluster Embeddings

Search engines can organize content based on clusters of embeddings.

In the video below, we illustrate why and how you can use embedding clusters, via Market Brew’s free AI Overviews Visualizer, to fix content alignment issues that may be preventing you from appearing in Google’s AI Overviews or even their regular search results!

Embedding clusters, or “semantic clouds”, form one of the most powerful ranking tools for search engineers today.

Semantic clouds are topic clusters in thousands of dimensions. The illustration above shows a 3D representation to simplify understanding.

Topic clusters are to entities as semantic clouds are to embeddings. Think of a semantic cloud as a topic cluster on steroids.

Search engineers use this like they do topic clusters.

When your content falls outside the top semantic cloud – what the AI deems most relevant – it is ignored, demoted, or excluded from AI Overviews (and even regular search results) entirely.

No matter how well-written or optimized your page might be in the traditional sense, it won’t surface if it doesn’t align with the right semantic cluster that the finely tuned AI system is seeking.
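One way to picture this: cluster page embeddings with k-means and check which “semantic cloud” a page falls into. A toy sketch with invented 2-D vectors (real embeddings have thousands of dimensions, and production systems use far more sophisticated clustering):

```python
import math

# Toy 2-D "embeddings" forming two semantic clouds (values invented).
points = [
    (0.10, 0.10), (0.20, 0.15), (0.15, 0.20),  # cloud A
    (0.90, 0.85), (0.85, 0.90), (0.95, 0.95),  # cloud B
]
page = (0.30, 0.20)  # hypothetical embedding of your page

def kmeans(pts, centroids, iters=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iters):
        clusters = [[] for _ in centroids]
        for p in pts:
            nearest = min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(dim) / len(c) for dim in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids

centroids = kmeans(points, centroids=[points[0], points[3]])
cloud = min(range(len(centroids)), key=lambda i: math.dist(page, centroids[i]))
print(cloud)  # index of the semantic cloud the page falls into (0 here)
```

If your page’s vector lands near the dominant cloud for a query, it is a candidate; if it lands in the other cloud, no amount of traditional on-page tweaking changes that.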

By using the AI Overviews Visualizer, you can finally see whether your content aligns with the dominant semantic cloud for a given query. If it doesn’t, the tool provides a realignment strategy to help you bridge that gap.

In a world where AI decides what gets shown, this level of visibility isn’t just helpful. It’s essential.

Free AI Overviews Visualizer: How To Fix Content Alignment

Step 1: Use The Visualizer

Input your URL into this AI Overviews Visualizer tool to see how search engines view your content using embeddings. The Cluster Analysis tab will display embedding clusters for your page and indicate whether your content aligns with the correct cluster.

MarketBrew.ai dashboard (Screenshot from MarketBrew.ai, April 2025)

Step 2: Read The Realignment Strategy

The tool provides a realignment strategy if needed. This provides a clear roadmap for adjusting your content to better align with the AI’s interpretation of relevance.

Example: If your page is semantically distant from the top embedding cluster, the realignment strategy will suggest changes, such as reworking your content or shifting focus.

Example: Embedding Cluster Analysis (Screenshot from MarketBrew.ai, April 2025)
Example of New Page Content Aligned with Target Embedding (Screenshot from MarketBrew.ai, April 2025)

Step 3: Test New Changes

Use the “Test New Content” feature to check how well your content now fits the AIO’s top embedding cluster. Iterative testing and refinement are recommended as AI Overviews evolve.

AI Overviews author (Screenshot by MarketBrew.ai, April 2025)

See Your Content Like A Search Engine & Tune It Like A Pro

You’ve just seen under the hood of modern SEO – embeddings, clusters, and AI Overviews. These aren’t abstract theories. They’re the same core systems that Google uses to determine what ranks.

Think of it like getting access to the Porsche service manual, not just the owner’s guide. Suddenly, you can stop guessing which tweaks matter and start making adjustments that actually move the needle.

At Market Brew, we’ve spent over two decades modeling these algorithms. Tools like the free AI Overviews Visualizer give you that mechanic’s-eye view of how search engines interpret your content.

And for teams that want to go further, a paid license unlocks Ranking Blueprints to help track and prioritize which AIO-based metrics most affect your rankings – like cosine similarity and top embedding clusters.

You have the manual now. The next move is yours.


Image Credits

Featured Image: Image by Market Brew. Used with permission.

In-Post Image: Images by Market Brew. Used with permission.

The Data Behind Google’s AI Overviews: What Sundar Pichai Won’t Tell You via @sejournal, @Kevin_Indig

Google claims AI Overviews are revolutionizing search behavior.

But the data tells a different story.

Since launching AI Overviews in 2024, Google CEO Sundar Pichai has repeatedly claimed they’re transforming search behavior:

People are using it to Search in entirely new ways, and asking new types of questions, longer and more complex queries… and getting back the best the web has to offer.1

Image Credit: Kevin Indig

The narrative has been consistent across earnings calls and interviews:

  • “We have been able to grow traffic to the ecosystem.”2
  • “Growth actually increases over time as people learn to adapt to that new behavior.”3
  • “Based on our testing, we are encouraged that we are seeing an increase in search usage among people who use the new AI overviews as well as increased user satisfaction with the results.” (Source)

Yet, Google has never backed these claims with actual data.

So, I partnered with Similarweb to analyze over 5 billion search queries across multiple markets.

There’s too much good data in this analysis not to share. So today, you’re getting part one of a two-part series. Here’s what we’ll cover across both parts:

  1. Part 1: How AI Overviews affect user behavior on Google.
  2. Part 2: How AI Overviews impact traffic and engagement on websites.
Image Credit: Lyna ™

Boost your skills with Growth Memo’s weekly expert insights. Subscribe for free!

The stand-out result?

Google’s claims are partially true but significantly oversimplified.

My analysis with Similarweb shows:

  • People visit Google more frequently (+9%) but spend less time per visit.
  • Query length has barely changed.
  • And – most importantly for SEO pros and marketers – the data reveals critical insights about how to adapt to this new search paradigm.
Image Credit: Kevin Indig

About The Data

The data set includes:

  • Over 5 billion search queries and 20 million websites.4
  • Average time on site, searches per session, and visits per user on Google.com – both in total and comparing the UK, U.S., and Germany.
  • A comparison of keywords with and without AI Overviews that analyze searches per session, average time spent on Google, and zero-click share.
  • Page views and time spent on Google.com for keywords showing AI Overviews vs. keywords without AI Overviews.
  • Average query length for the UK, U.S., and Germany.

In Google’s overall claim, the phrase “Search usage” isn’t well defined.

Is it more searches, more engagement with SERP features, or longer sessions?

Unfortunately, I wasn’t able to pinpoint the exact definition of search usage. It’s Google’s own wording, and so it might be intentionally vague. (Have your own thoughts on this? Let me know!)

Whether there has actually been an overall increase in Search usage because of AI Overviews is more complex, depending on several factors. And the analysis below shows clear patterns we can learn from.

For the U.S. market specifically, the data confirms that the claim “We are seeing an increase in Search usage among people who use the new AI Overviews” is directionally correct.

Here’s how we know it’s correct: Google visits rose +9% after the May 2024 rollout (from 26.9% to 29.1%).

The initial drop could be explained by the PR disaster of the first two weeks. (Remember those strange results that recommended smoking while pregnant or putting glue on pizza?)

While U.S. visits to Google grew modestly from 2023 to mid-2024, a clearer upward trend began around August 2024.

Image Credit: Kevin Indig

When we look at the two comparative keyword sets – remember, one set in this study shows AIOs and one doesn’t – we can see that page views on websites from AIO keywords have increased by 22% since the U.S. launch. (Shown by the red line below).

I know we all want to talk about the effect of AIOs on organic clicks, but we’ll get there. I’ll come back to the fact that non-AIO queries drive more page views in Part 2.

However, the “Page views on websites” chart for U.S. searches below reveals two critical insights:

  1. Websites are receiving more views from AIO keyword searches over time, but
  2. Non-AIO queries drive about twice as many page views (shown by the black line).

This suggests AI Overviews may be increasing engagement for specific query types while having a limited impact on overall traffic patterns.

Image Credit: Kevin Indig

Next, let’s take a look at the US, UK, and Germany markets compared.

Although Google has claimed “We are seeing an increase in Search usage among people who use the new AI Overviews,” in general, the Similarweb data shows a more nuanced story.

Here’s how we know the claim is only partially correct, depending on the market:

The growth of U.S. visits to Google is proportionally higher than in Germany (chart below), which is our control market because AIOs didn’t roll out there until March 2025.

Image Credit: Kevin Indig

However, in the UK, where AI Overviews rolled out in August 2024, visits are trending flat to down after the rollout (shown via the green line above).

In fact, there was more engagement growth from 2023 to 2024 (before the AIO roll-out).

Ultimately, I consider Google’s claim incorrect for markets outside the US: More SERP interaction does not translate into longer on-Google sessions.

In the chart below, we can see that time-on-site for Google.com in both the US and UK has been either flat or declining.

And something is reducing time-on-site in Google DE fairly significantly. Maybe it’s related to Google losing market share in the EU.

Image Credit: Kevin Indig

We see the same trend when we compare AIO-showing with non-AIO-showing queries in the chart below.

Time on Google for AIO queries falls by 1%.

While this isn’t a huge dip, it certainly isn’t an “increase in Search usage.”

Image Credit: Kevin Indig

Notably, you’ll see in the chart below that pages-per-visit on Google.com declined across the board in 2024 after AIOs rolled out, but then started recovering and growing again in 2025.

This chart shows a clear dip in pages-per-visit immediately following the May 2024 AIO launch, suggesting users needed fewer results pages when AI Overviews answered their queries directly.

The subsequent recovery in 2025 indicates either user adaptation or Google adjustments to how AIOs function within the search experience.

Image Credit: Kevin Indig

But what about this sudden uplift in 2025?

It happens in our control market, Germany, as well. So, it’s not due to AIOs.

How do we know this? Pop back up to that Time on Site graph above that shows all three markets. And you’ll see that Germany’s time on site shows a decline after the AIO launch.

While I’m not sure what drives this trend, I do wonder how less time on Google impacts its bottom line.

MBI published a very interesting deep dive on Alphabet with a chart that indicates that AI Overviews do not monetize as well as claimed.

Instead, a rise in cost per click seems to drive Alphabet’s outstanding earnings.

To be fair, that trend started in 2018, so it’s not clear how much AIOs have accelerated it.

Chart from MBI’s latest deep dive on Alphabet (Image Credit: Kevin Indig)

“We are seeing an increase in Search usage among people who use the new AI Overviews.”

Based on the data, this claim has layers of truth and omission.

Google visits did increase post-AIO launch (+9%), and AIO keyword pageviews rose impressively (+22%).

However, the full picture reveals important nuances that we need to take into account:

  • UK visits remained flat after the AIO rollout, despite U.S. gains.
  • Time-on-site metrics are flat or declining across markets.
  • Pages-per-visit initially dropped after AIOs launched.

The data suggests users are visiting Google more frequently but spending less time per visit, likely because AI Overviews provide faster answers without deeper exploration.

This pattern aligns with a “resolve and leave” user behavior, rather than increased engagement with Google itself.

While it might be technically true that “Search usage” increased by some metrics, the claim obscures how AIOs are fundamentally changing search interaction patterns at the cost of web traffic.

Google also claims that people are asking “longer and more complex queries.” When we look at the data closely, this claim doesn’t hold true.

Here’s how we know it’s incorrect:

The growth in query length is tiny – certainly not a step-change to “entirely new ways.”

We’re talking about a very gradual increase from 3.27 to 3.37 average words per query in the U.S. over the course of two years.

Sure, that’s only 3% – and maybe at the scale of Google, that has a huge impact.

But this is no step change.

Image Credit: Kevin Indig

The difference in query length between May 2024 and February 2025 is only +0.6%.

In the UK, query length decreased by 0.3% after AI Overviews launched, from 3.18 words in August 2024 to 3.17 words in February 2025.

In Germany, query length increased a bit (+0.4%) before the AI Overviews launch.

Verdict: This Claim Is Overstated And Incorrect

While Google reports “People are using it to search…longer and more complex queries,” a closer look shows otherwise.

The data shows only minimal changes in query length in the US, with the UK seeing a decrease after AIOs rolled out.

The data simply doesn’t support the narrative that AI Overviews are driving users to construct “longer and more complex queries” in any meaningful way.

When we examine the data closely, a clear pattern emerges:

Google’s claims about how AI Overviews are fundamentally changing how we search are largely overstated.

Yes, users visit Google more frequently, but they’re spending less time per visit and not crafting significantly longer or more complex queries.

This suggests AI Overviews are creating a “quick answer” behavior pattern rather than deeper engagement with search.

The modest increases in visits are counterbalanced by decreases in time-on-site and pages-per-visit.

And the minimal change in query length across all markets – regardless of whether AI Overviews have launched – indicates that any evolution in search behavior is happening independently of AI features.

These findings matter because they challenge the narrative that AI Overviews represent a revolutionary enhancement to search.

Instead, they’re changing user interaction patterns in ways that Google hasn’t fully acknowledged.

Keep in mind that I’m working with third-party data, which can always be skewed or partial. I don’t think it’s wrong, but we always need to keep the limitations of the data in mind.

  1. Optimize for the new “visit more, stay less” pattern: Users are more frequently turning to Google, but they’re spending less time seeking answers. Your content strategy should focus on both being accurately represented in AI Overviews and providing deeper value that encourages clicks when the overview isn’t sufficient.
  2. Focus on engagement quality: The pattern suggests users are more selective about clicking through, making the quality of experience more important than ever when they do reach your site.
  3. Factor in regional differences: The significant variations between U.S. and UK behavior after AI Overview launches suggest regional testing is essential – what works in one market may not transfer directly to others.

In Part 2, we’ll explore the even more critical question: What happens to the broader web ecosystem when users get their answers directly on Google rather than clicking through to websites?

The answer will reveal whether Google’s claims about “growing traffic to the ecosystem” hold up to scrutiny.


1 Google I/O 2024: An I/O for a new generation

2 CNBC Exclusive: CNBC Transcript: Alphabet CEO Sundar Pichai Speaks with CNBC’s Deirdre Bosa on “Closing Bell: Overtime” Today

3 2024 Q3 Earnings Call

4 A 360-Degree View into the Digital Landscape


Featured Image: Paulo Bobita/Search Engine Journal

Reddit Mods Accuse AI Researchers Of Impersonating Sexual Assault Victims via @sejournal, @martinibuster

Researchers testing the ability of AI to influence people’s opinions violated the ChangeMyView subreddit’s rules and used deceptive practices that allegedly were not approved by their ethics committee, including impersonating victims of sexual assault and using background information about Reddit users to manipulate them.

The researchers argued that the controlled conditions of prior studies may have introduced biases. Their solution was to introduce AI bots into a live environment without telling the forum members they were interacting with an AI bot. Their audience was unsuspecting Reddit users in the Change My View subreddit (r/ChangeMyView), even though this violated the subreddit’s rules, which prohibit the use of undisclosed AI bots.

After the research was finished, the researchers disclosed their deceit to the Reddit moderators, who subsequently posted a notice about it in the subreddit, along with a draft copy of the completed research paper.

Ethical Questions About Research Paper

The CMV moderators posted a discussion that underlines that the subreddit prohibits undisclosed bots and that permission to conduct this experiment would never have been granted:

“CMV rules do not allow the use of undisclosed AI generated content or bots on our sub. The researchers did not contact us ahead of the study and if they had, we would have declined. We have requested an apology from the researchers and asked that this research not be published, among other complaints. As discussed below, our concerns have not been substantively addressed by the University of Zurich or the researchers.”

The fact that the researchers violated the Reddit rules was completely absent from the research paper.

Researchers Claim Research Was Ethical

While the researchers omit that the research broke the subreddit’s rules, they do create the impression that it was ethical by stating that their methodology was approved by an ethics committee and that all generated comments were checked to ensure they were not harmful or unethical:

“In this pre-registered study, we conduct the first large-scale field experiment on LLMs’ persuasiveness, carried out within r/ChangeMyView, a Reddit community of almost 4M users and ranking among the top 1% of subreddits by size. In r/ChangeMyView, users share opinions on various topics, challenging others to change their perspectives by presenting arguments and counterpoints while engaging in a civil conversation. If the original poster (OP) finds a response convincing enough to reconsider or modify their stance, they award a ∆ (delta) to acknowledge their shift in perspective.

…The study was approved by the University of Zurich’s Ethics Committee… Importantly, all generated comments were reviewed by a researcher from our team to ensure no harmful or unethical content was published.”

The moderators of the ChangeMyView subreddit dispute the researchers’ claim to the ethical high ground:

“During the experiment, researchers switched from the planned “values based arguments” originally authorized by the ethics commission to this type of “personalized and fine-tuned arguments.” They did not first consult with the University of Zurich ethics commission before making the change. Lack of formal ethics review for this change raises serious concerns.”

Why Reddit Moderators Believe Research Was Unethical

The Change My View subreddit moderators raised multiple concerns about why they believe the researchers engaged in a grave breach of ethics, including impersonating victims of sexual assault. They argue that this qualifies as “psychological manipulation” of the original posters (OPs), the people who started each discussion.

The Reddit moderators posted:

“The researchers argue that psychological manipulation of OPs on this sub is justified because the lack of existing field experiments constitutes an unacceptable gap in the body of knowledge. However, If OpenAI can create a more ethical research design when doing this, these researchers should be expected to do the same. Psychological manipulation risks posed by LLMs is an extensively studied topic. It is not necessary to experiment on non-consenting human subjects.

AI was used to target OPs in personal ways that they did not sign up for, compiling as much data on identifying features as possible by scrubbing the Reddit platform. Here is an excerpt from the draft conclusions of the research.

Personalization: In addition to the post’s content, LLMs were provided with personal attributes of the OP (gender, age, ethnicity, location, and political orientation), as inferred from their posting history using another LLM.

Some high-level examples of how AI was deployed include:

  • AI pretending to be a victim of rape
  • AI acting as a trauma counselor specializing in abuse
  • AI accusing members of a religious group of “caus[ing] the deaths of hundreds of innocent traders and farmers and villagers.”
  • AI posing as a black man opposed to Black Lives Matter
  • AI posing as a person who received substandard care in a foreign hospital.”

The moderator team has filed a complaint with the University of Zurich.

Are AI Bots Persuasive?

The researchers discovered that AI bots are highly persuasive and do a better job of changing people’s minds than humans can.

The research paper explains:

“Implications. In a first field experiment on AI-driven persuasion, we demonstrate that LLMs can be highly persuasive in real-world contexts, surpassing all previously known benchmarks of human persuasiveness.”

One of the findings was that humans were unable to identify when they were talking to a bot and (unironically) they encourage social media platforms to deploy better ways to identify and block AI bots:

“Incidentally, our experiment confirms the challenge of distinguishing human from AI-generated content… Throughout our intervention, users of r/ChangeMyView never raised concerns that AI might have generated the comments posted by our accounts. This hints at the potential effectiveness of AI-powered botnets… which could seamlessly blend into online communities.

Given these risks, we argue that online platforms must proactively develop and implement robust detection mechanisms, content verification protocols, and transparency measures to prevent the spread of AI-generated manipulation.”

Takeaways:

  • Ethical Violations in AI Persuasion Research
    Researchers conducted a live AI persuasion experiment without Reddit’s consent, violating subreddit rules and allegedly violating ethical norms.
  • Disputed Ethical Claims
    Researchers claim ethical high ground by citing ethics board approval but omitted citing rule violations; moderators argue they engaged in undisclosed psychological manipulation.
  • Use of Personalization in AI Arguments
    AI bots allegedly used scraped personal data to create highly tailored arguments targeting Reddit users.
  • Reddit Moderators Allege Profoundly Disturbing Deception
    The Reddit moderators claim that the AI bots impersonated sexual assault victims, trauma counselors, and other emotionally charged personas in an effort to manipulate opinions.
  • AI’s Superior Persuasiveness and Detection Challenges
    The researchers claim that AI bots proved more persuasive than humans and remained undetected by users, raising concerns about future bot-driven manipulation.
  • Research Paper Inadvertently Makes Case For Why AI Bots Should Be Banned From Social Media
    The study highlights the urgent need for social media platforms to develop tools for detecting and verifying AI-generated content. Ironically, the research paper itself is a reason why AI bots should be more aggressively banned from social media and forums.

Researchers from the University of Zurich tested whether AI bots could persuade people more effectively than humans by secretly deploying personalized AI arguments on the ChangeMyView subreddit without user consent, violating platform rules and allegedly going outside the ethical standards approved by their university ethics board. Their findings show that AI bots are highly persuasive and difficult to detect, but the way the research itself was conducted raises ethical concerns.

Read the concerns posted by the ChangeMyView subreddit moderators:

Unauthorized Experiment on CMV Involving AI-generated Comments

Featured Image by Shutterstock/Ausra Barysiene and manipulated by author

How LLMs Interpret Content: How To Structure Information For AI Search via @sejournal, @cshel

In the SEO world, when we talk about how to structure content for AI search, we often default to structured data – Schema.org, JSON-LD, rich results, knowledge graph eligibility – the whole shooting match.

While that layer of markup is still useful in many scenarios, this isn’t another article about how to wrap your content in tags.

Structuring Content Isn’t The Same As Structured Data

Instead, we’re going deeper into something more fundamental and arguably more important in the age of generative AI: How your content is actually structured on the page and how that influences what large language models (LLMs) extract, understand, and surface in AI-powered search results.

Structured data is optional. Structured writing and formatting are not.

If you want your content to show up in AI Overviews, Perplexity summaries, ChatGPT citations, or any of the increasingly common “direct answer” features driven by LLMs, the architecture of your content matters: Headings. Paragraphs. Lists. Order. Clarity. Consistency.

In this article, I’m unpacking how LLMs interpret content — and what you can do to make sure your message is not just crawled, but understood.

How LLMs Actually Interpret Web Content

Let’s start with the basics.

Unlike traditional search engine crawlers that rely heavily on markup, metadata, and link structures, LLMs interpret content differently.

They don’t scan a page the way a bot does. They ingest it, break it into tokens, and analyze the relationships between words, sentences, and concepts using attention mechanisms.

They’re not looking for a tag or a JSON-LD snippet to tell them what a page is about. They’re looking for semantic clarity: Does this content express a clear idea? Is it coherent? Does it answer a question directly?

LLMs like GPT-4 or Gemini analyze:

  • The order in which information is presented.
  • The hierarchy of concepts (which is why headings still matter).
  • Formatting cues like bullet points, tables, bolded summaries.
  • Redundancy and reinforcement, which help models determine what’s most important.

This is why poorly structured content – even if it’s keyword-rich and marked up with schema – can fail to show up in AI summaries, while a clear, well-formatted blog post without a single line of JSON-LD might get cited or paraphrased directly.

Why Structure Matters More Than Ever In AI Search

Traditional search was about ranking; AI search is about representation.

When a language model generates a response to a query, it’s pulling from many sources – often sentence by sentence, paragraph by paragraph.

It’s not retrieving a whole page and showing it. It’s building a new answer based on what it can understand.

What gets understood most reliably?

Content that is:

  • Segmented logically, so each part expresses one idea.
  • Consistent in tone and terminology.
  • Presented in a format that lends itself to quick parsing (think FAQs, how-to steps, definition-style intros).
  • Written with clarity, not cleverness.

AI search engines don’t need schema to pull a step-by-step answer from a blog post.

But they do need you to label your steps clearly, keep them together, and not bury them in long-winded prose or interrupt them with calls to action, pop-ups, or unrelated tangents.

Clean structure is now a ranking factor – not in the traditional SEO sense, but in the AI citation economy we’re entering.

What LLMs Look For When Parsing Content

Here’s what I’ve observed (both anecdotally and through testing across tools like Perplexity, ChatGPT Browse, Bing Copilot, and Google’s AI Overviews):

  • Clear Headings And Subheadings: LLMs use heading structure to understand hierarchy. Pages with proper H1–H2–H3 nesting are easier to parse than walls of text or div-heavy templates.
  • Short, Focused Paragraphs: Long paragraphs bury the lede. LLMs favor self-contained thoughts. Think one idea per paragraph.
  • Structured Formats (Lists, Tables, FAQs): If you want to get quoted, make it easy to lift your content. Bullets, tables, and Q&A formats are goldmines for answer engines.
  • Defined Topic Scope At The Top: Put your TL;DR early. Don’t make the model (or the user) scroll through 600 words of brand story before getting to the meat.
  • Semantic Cues In The Body: Words like “in summary,” “the most important,” “step 1,” and “common mistake” help LLMs identify relevance and structure. There’s a reason so much AI-generated content uses those “giveaway” phrases. It’s not because the model is lazy or formulaic. It’s because it actually knows how to structure information in a way that’s clear, digestible, and effective, which, frankly, is more than can be said for a lot of human writers.

A Real-World Example: Why My Own Article Didn’t Show Up

In December 2024, I wrote a piece about the relevance of schema in AI-first search.

It was structured for clarity, it was timely, and it was highly relevant to this conversation, yet it didn’t show up in my research queries for this article (the one you are presently reading). The reason? I didn’t use the term “LLM” in the title or slug.

All of the articles returned in my search had “LLM” in the title. Mine said “AI Search” but didn’t mention LLMs explicitly.

You might assume that a large language model would understand “AI search” and “LLMs” are conceptually related – and it probably does – but understanding that two things are related and choosing what to return based on the prompt are two different things.

Where does the model get its retrieval logic? From the prompt. It interprets your question literally.

If you say, “Show me articles about LLMs using schema,” it will surface content that directly includes “LLMs” and “schema” – not necessarily content that’s adjacent, related, or semantically similar, especially when it has plenty to choose from that contains the words in the query (a.k.a. the prompt).

So, even though LLMs are smarter than traditional crawlers, retrieval is still rooted in surface-level cues.
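The anecdote above can be sketched in a few lines. The titles below are hypothetical, but the retrieval behavior is the point: a literal, surface-level filter drops the conceptually relevant article because it lacks the exact query term.

```python
# Minimal sketch of literal, surface-level retrieval (hypothetical titles).
# A query for "LLM" misses an article titled with the related term
# "AI-First Search", even though the two concepts clearly overlap.

articles = [
    "How LLMs Use Schema Markup",
    "Schema in AI-First Search",   # conceptually relevant, wrong keyword
    "LLM Retrieval Explained",
]

def literal_retrieve(query_terms, docs):
    """Keep only docs containing every query term (case-insensitive)."""
    return [d for d in docs
            if all(t.lower() in d.lower() for t in query_terms)]

print(literal_retrieve(["LLM"], articles))
# The "AI-First Search" piece never makes it into the candidate set.
```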

This might sound suspiciously like keyword research still matters – and yes, it absolutely does. Not because LLMs are dumb, but because search behavior (even AI search) still depends on how humans phrase things.

The retrieval layer – the layer that decides what’s eligible to be summarized or cited – is still driven by surface-level language cues.

What Research Tells Us About Retrieval

Even recent academic work supports this layered view of retrieval.

A 2023 research paper by Doostmohammadi et al. found that simpler, keyword-matching techniques, like a method called BM25, often led to better results than approaches focused solely on semantic understanding.

The improvement was measured through a drop in perplexity, which tells us how confident or uncertain a language model is when predicting the next word.

In plain terms: Even in systems designed to be smart, clear and literal phrasing still made the answers better.
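For the curious, here is a minimal sketch of the two ideas just mentioned: a bare-bones BM25 scorer (the standard keyword-matching formula the research compared against) and the standard perplexity formula. Both are textbook definitions, not the paper’s actual implementation.

```python
import math

def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Bare-bones BM25: rewards literal term overlap, weighted by
    term rarity and normalized by document length. A sketch of
    classic keyword-matching retrieval, not a production scorer."""
    tokenized = [d.lower().split() for d in docs]
    avg_len = sum(len(d) for d in tokenized) / len(tokenized)
    n = len(tokenized)
    scores = []
    for doc in tokenized:
        score = 0.0
        for term in query.lower().split():
            df = sum(1 for d in tokenized if term in d)  # document frequency
            if df == 0:
                continue
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            tf = doc.count(term)
            score += idf * tf * (k1 + 1) / (
                tf + k1 * (1 - b + b * len(doc) / avg_len))
        scores.append(score)
    return scores

def perplexity(token_probs):
    """Perplexity = exp of the average negative log-probability.
    Lower means the model is more confident about the next token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

docs = ["schema markup for llm retrieval",
        "a history of search engines"]
print(bm25_scores("llm schema", docs))
# The document that literally contains the query terms scores higher.
```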

So, the lesson isn’t just to use the language models have been trained to recognize. The real lesson is this: If you want your content to be found, understand how AI search works as a system – a chain of prompts, retrieval, and synthesis – and make sure you’re aligned at the retrieval layer.

This isn’t about the limits of AI comprehension. It’s about the precision of retrieval.

Language models are incredibly capable of interpreting nuanced content, but when they’re acting as search agents, they still rely on the specificity of the queries they’re given.

That makes terminology, not just structure, a key part of being found.

How To Structure Content For AI Search

If you want to increase your odds of being cited, summarized, or quoted by AI-driven search engines, it’s time to think less like a writer and more like an information architect – and structure content for AI search accordingly.

That doesn’t mean sacrificing voice or insight, but it does mean presenting ideas in a format that makes them easy to extract, interpret, and reassemble.

Core Techniques For Structuring AI-Friendly Content

Here are some of the most effective structural tactics I recommend:

Use A Logical Heading Hierarchy

Structure your pages with a single clear H1 that sets the context, followed by H2s and H3s that nest logically beneath it.

LLMs, like human readers, rely on this hierarchy to understand the flow and relationship between concepts.

If every heading on your page is an H1, you’re signaling that everything is equally important, which means nothing stands out.

Good heading structure is not just semantic hygiene; it’s a blueprint for comprehension.
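To see why nesting matters, here is a small sketch using Python’s standard-library HTML parser. It recovers the outline that clean H1–H2–H3 markup exposes; if every heading were an H1, the recovered outline would be flat and the hierarchy lost.

```python
from html.parser import HTMLParser

class HeadingOutline(HTMLParser):
    """Collects (level, text) pairs for h1-h3 tags: roughly the
    outline any parser can recover from well-nested headings."""
    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None

    def handle_starttag(self, tag, attrs):
        if tag in ("h1", "h2", "h3"):
            self._level = int(tag[1])

    def handle_data(self, data):
        if self._level is not None and data.strip():
            self.outline.append((self._level, data.strip()))

    def handle_endtag(self, tag):
        if tag in ("h1", "h2", "h3"):
            self._level = None

html = """<h1>Structuring Content</h1>
<h2>Why Structure Matters</h2>
<h3>Headings</h3>"""
parser = HeadingOutline()
parser.feed(html)
print(parser.outline)
# [(1, 'Structuring Content'), (2, 'Why Structure Matters'), (3, 'Headings')]
```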

Keep Paragraphs Short And Self-Contained

Every paragraph should communicate one idea clearly.

Walls of text don’t just intimidate human readers; they also increase the likelihood that an AI model will extract the wrong part of the answer or skip your content altogether.

This is closely tied to readability metrics like the Flesch Reading Ease score, which rewards shorter sentences and simpler phrasing.

While it may pain those of us who enjoy a good, long, meandering sentence (myself included), clarity and segmentation help both humans and LLMs follow your train of thought without derailing.
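The Flesch formula itself is simple. The sketch below uses the standard formula (206.835 − 1.015 × words-per-sentence − 84.6 × syllables-per-word) with a naive vowel-group syllable counter, so treat its scores as directional rather than exact.

```python
import re

def flesch_reading_ease(text):
    """Naive Flesch Reading Ease sketch using the standard formula:
    206.835 - 1.015*(words/sentences) - 84.6*(syllables/words).
    Syllables are approximated by counting vowel groups, so scores
    are directional, not exact."""
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(
        max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)
    return (206.835
            - 1.015 * (len(words) / sentences)
            - 84.6 * (syllables / len(words)))

short = "We write short sentences. Each one has a single idea."
long_ = ("Although it may seem advantageous to construct elaborately "
         "meandering sentences, comprehension deteriorates considerably.")
print(flesch_reading_ease(short) > flesch_reading_ease(long_))  # True
```

Shorter sentences and simpler words push the score up, which is exactly the segmentation that helps both readers and models.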

Use Lists, Tables, And Predictable Formats

If your content can be turned into a step-by-step guide, numbered list, comparison table, or bulleted breakdown, do it. AI summarizers love structure, and so do users.

Frontload Key Insights

Don’t save your best advice or most important definitions for the end.

LLMs tend to prioritize what appears early in the content. Give your thesis, definition, or takeaway up top, then expand on it.

Use Semantic Cues

Signal structure with phrasing like “Step 1,” “In summary,” “Key takeaway,” “Most common mistake,” and “To compare.”

These phrases help LLMs (and readers) identify the role each passage plays.
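Those cues are machine-liftable as well as human-readable. A sketch (with hypothetical example text): a trivial pattern match is enough to pull out a complete step sequence when the steps are explicitly labeled, which is the kind of signal summarizers latch onto.

```python
import re

# Sketch: lift a step-by-step answer by matching explicit "Step N"
# cues. The migration text below is a hypothetical example.
text = """Migrating a site? Step 1: Crawl the old URLs.
Step 2: Map redirects. Step 3: Monitor indexing."""

steps = re.findall(r"Step \d+: [^.]+\.", text)
print(steps)
# ['Step 1: Crawl the old URLs.', 'Step 2: Map redirects.',
#  'Step 3: Monitor indexing.']
```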

Avoid Noise

Interruptive pop-ups, modal windows, endless calls-to-action (CTAs), and disjointed carousels can pollute your content.

Even if the user closes them, they’re often still present in the Document Object Model (DOM), and they dilute what the LLM sees.

Think of your content like a transcript: What would it sound like if read aloud? If it’s hard to follow in that format, it might be hard for an LLM to follow, too.

The Role Of Schema: Still Useful, But Not A Magic Bullet

Let’s be clear: Structured data still has value. It helps search engines understand content, populate rich results, and disambiguate similar topics.

However, LLMs don’t require it to understand your content.

If your site is a semantic dumpster fire, schema might save you, but wouldn’t it be better to avoid building a dumpster fire in the first place?

Schema is a helpful boost, not a magic bullet. Prioritize clear structure and communication first, and use markup to reinforce – not rescue – your content.

How Schema Still Supports AI Understanding

That said, Google has recently confirmed that its LLM (Gemini), which powers AI Overviews, does leverage structured data to help understand content more effectively.

In fact, John Mueller stated that schema markup is “good for LLMs” because it gives models clearer signals about intent and structure.

That doesn’t contradict the point; it reinforces it. If your content isn’t already structured and understandable, schema can help fill the gaps. It’s a crutch, not a cure.

Schema is a helpful boost but not a substitute for structure and clarity.

In AI-driven search environments, we’re seeing content without any structured data show up in citations and summaries because the core content was well-organized, well-written, and easily parsed.

In short:

  • Use schema when it helps clarify the intent or context.
  • Don’t rely on it to fix bad content or a disorganized layout.
  • Prioritize content quality and layout before markup.
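For reference, a minimal Article JSON-LD block looks like this (the headline, author, and date are hypothetical placeholders). Markup like this reinforces well-structured content; it does not rescue poor content.

```python
import json

# A minimal Article JSON-LD block (hypothetical values). In practice
# this would be embedded in a <script type="application/ld+json"> tag.
article_schema = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How To Structure Content For AI Search",
    "author": {"@type": "Person", "name": "Jane Example"},
    "datePublished": "2025-01-15",
}

print(json.dumps(article_schema, indent=2))
```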

The future of content visibility is built on how well you communicate, not just how well you tag.

Conclusion: Structure For Meaning, Not Just For Machines

Optimizing for LLMs doesn’t mean chasing new tools or hacks. It means doubling down on what good communication has always required: clarity, coherence, and structure.

If you want to stay competitive, you’ll need to structure content for AI search just as carefully as you structure it for human readers.

The best-performing content in AI search isn’t necessarily the most optimized. It’s the most understandable. That means:

  • Anticipating how content will be interpreted, not just indexed.
  • Giving AI the framework it needs to extract your ideas.
  • Structuring pages for comprehension, not just compliance.
  • Anticipating and using the language your audience uses, because LLMs respond literally to prompts and retrieval depends on those exact terms being present.

As search shifts from links to language, we’re entering a new era of content design. One where meaning rises to the top, and the brands that structure for comprehension will rise right along with it.
