An AI-Powered Workflow To Solve Content Cannibalization via @sejournal, @Kevin_Indig

Your site likely suffers from at least some content cannibalization, and you might not even realize it.

Cannibalization hurts organic traffic and revenue: The impact can stretch from key pages not ranking to algorithm issues due to low domain quality.

However, cannibalization is tricky to detect, can change over time, and exists on a spectrum.

It’s the “microplastics of SEO.”

In this Memo, I’ll show you:

  1. How to identify and fix content cannibalization reliably.
  2. How to automate content cannibalization detection.
  3. An automated workflow you can try out right now: The Cannibalization Detector, my new keyword cannibalization tool.

I could never have done this without Nicole Guercia from AirOps. I designed the concept and stress-tested the automated workflow, but Nicole built the whole thing.

How To Think About Content Cannibalization The Right Way

Before jumping into the workflow, we must clarify a few guiding principles about content cannibalization that are often misunderstood.

The biggest misconception about cannibalization is that it happens on the keyword level.

It’s actually happening on the user intent level.

We all need to stop thinking about this concept as keyword cannibalization and instead as content cannibalization based on user intent.

With this in mind, cannibalization…

  • Is a moving target: When Google updates its understanding of intent during a core update, suddenly two pages can compete with each other that previously didn’t.
  • Exists on a spectrum: A page can compete with another page or several pages, with an intent overlap from 10% to 100%. It’s hard to say exactly how much overlap is fine without looking at outcomes and context.
  • Doesn’t stop at rankings: Looking for two pages that are getting a “substantial” amount of impressions or rankings for the same keyword(s) can help you spot cannibalization, but it is not a very accurate method. It’s not enough proof.
  • Needs regular check-ups: You need to check your site for cannibalization regularly and treat your content library as a “living” ecosystem.
  • Can be sneaky: Many cases are not clear-cut. For example, international content cannibalization is not obvious. A /en directory to address all English-speaking countries can compete with a /en-us directory for the U.S. market.
Image Credit: Kevin Indig

Different types of sites have fundamentally different weaknesses for cannibalization.

My model for site types is the integrator vs. aggregator model. Online retailers and other marketplaces face fundamentally different cases of cannibalization than SaaS or D2C companies.

Integrators cannibalize between pages. Aggregators cannibalize between page types.

  • With aggregators, cannibalization often happens when two page types are too similar. For example, you can have two page types that may or may not compete with each other: “points of interest in {city}” and “things to do in {city}”.
  • With integrators, cannibalization often happens when companies publish new content without maintenance and a plan for the existing content. A big part of the issue is that, past a certain number of articles, it becomes harder to keep an overview of what you have and what keywords/intent each piece targets (I’ve found the tipping point to be around 250 articles).

How To Spot Content Cannibalization

An example of content cannibalization (Image Credit: Kevin Indig)

Content cannibalization can have one or more of the following symptoms:

  • “URL flickering”: At least two URLs alternate in ranking for one or more keywords.
  • A page loses traffic and/or ranking positions after another one goes live.
  • A new page hits a ranking plateau for its main keyword and cannot break into the top 3 positions.
  • Google doesn’t index a new page or pages within the same page type.
  • Exact duplicate titles appear in Google’s search index.
  • Google reports “crawled, not indexed” or “discovered, not indexed” for URLs that don’t have thin content or technical issues.

Since Google doesn’t give us a clear signal for cannibalization, the best way to measure similarity between two or more pages is cosine similarity between their tokenized embeddings (I know, it’s a mouthful).

But this is what it means: Basically, you compare how similar two pages are by turning their text into numbers and seeing how closely those numbers point in the same direction.

Think about it like a chocolate cookie recipe:

  • Tokenization = Break down each recipe (e.g., page content) into ingredients: flour, sugar, chocolate chips, etc.
  • Embeddings = Convert each ingredient into numbers, like how much of each ingredient is used and how important each one is to the recipe’s identity.
  • Cosine Similarity = Compare the recipes mathematically. This gives you a number between 0 and 1. A score of 1 means the recipes are identical, while 0 means they’re completely different.
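
A tiny worked example: If recipe A’s ingredient vector is (2, 1, 0) and recipe B’s is (1, 1, 0), their cosine similarity is (2×1 + 1×1 + 0×0) / (√5 × √2) ≈ 3 / 3.16 ≈ 0.95 – two nearly identical recipes.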

Follow this process to scan your site and find cannibalization candidates:

  • Crawl: Scrape your site with a tool like Screaming Frog (optionally, exclude pages that have no SEO purpose) to extract the URL and meta title of each page.
  • Tokenization: Turn words in both the URL and title into pieces of words that are easier to work with. These are your tokens.
  • Embeddings: Turn the tokens into numbers to do “word math.”
  • Similarity: Calculate the cosine similarity between all URLs and meta titles.

Ideally, this gives you a shortlist of URLs and titles that are too similar.
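
If you want to prototype this first pass, here’s a minimal sketch in Python. It assumes the sentence-transformers and scikit-learn packages; the model choice, the example pages, and the 0.7 shortlist cutoff are illustrative, not prescriptive.

```python
from itertools import combinations

from sentence_transformers import SentenceTransformer
from sklearn.metrics.pairwise import cosine_similarity

# (URL, meta title) pairs from your crawl; these examples are hypothetical.
pages = [
    ("https://example.com/points-of-interest-paris", "Points of Interest in Paris"),
    ("https://example.com/things-to-do-paris", "Things to Do in Paris"),
    ("https://example.com/paris-hotels", "Best Hotels in Paris"),
]

model = SentenceTransformer("all-MiniLM-L6-v2")  # handles tokenization + embeddings
embeddings = model.encode([f"{url} {title}" for url, title in pages])
scores = cosine_similarity(embeddings)

# Shortlist suspiciously similar pairs for the deeper content check.
for i, j in combinations(range(len(pages)), 2):
    if scores[i, j] > 0.7:
        print(f"{pages[i][0]} <-> {pages[j][0]}: {scores[i, j]:.2f}")
```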

In the next step, you can apply the following process to make sure they truly cannibalize each other:

  • Extract content: Clearly isolate the main content (exclude navigation, footer, ads, etc.). Maybe clean up certain elements, like stop words.
  • Chunking or tokenization: Either split content into meaningful chunks (sentences or paragraphs) or tokenize directly. I prefer the latter.
  • Embeddings: Embed the tokens.
  • Entities: Extract named entities from the tokens and weigh them higher in embeddings. In essence, you check which embeddings are “known things” and give them more power in your analysis.
  • Aggregation of embeddings: Aggregate token/chunk embeddings with weighted averaging (e.g., TF-IDF) or attention-weighted pooling.
  • Cosine similarity: Calculate cosine similarity between resulting embeddings.
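
Here’s a rough sketch of that deeper check, assuming spaCy (with its small English model) and sentence-transformers. The 1.5x entity weight is an invented illustrative value, and attention-weighted pooling is left out for brevity.

```python
import numpy as np
import spacy
from sentence_transformers import SentenceTransformer

nlp = spacy.load("en_core_web_sm")               # sentences + named entities
model = SentenceTransformer("all-MiniLM-L6-v2")  # embedding model

def page_embedding(main_content: str) -> np.ndarray:
    # Chunk the already-extracted main content into sentences and embed each.
    sentences = [s.text for s in nlp(main_content).sents]
    embeddings = model.encode(sentences)
    # Weigh chunks that contain named entities higher (1.5x is illustrative).
    weights = [1.5 if nlp(s).ents else 1.0 for s in sentences]
    return np.average(embeddings, axis=0, weights=weights)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```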

You can use my Apps Script if you’d like to try it out in Google Sheets (but I have a better alternative for you in a moment).

About cosine similarity: It’s not perfect, but good enough.

Yes, you can fine-tune embedding models for specific topics.

And yes, you can use advanced embedding models like sentence transformers on top, but this simplified process is usually sufficient. No need to make an astrophysics project out of it.

How To Fix Cannibalization

Once you’ve identified cannibalization, you should take action.

But don’t forget to adjust your long-term approach to content creation and governance. If you don’t, all this work to find and fix cannibalization is going to be a waste.

Solving Cannibalization In The Short Term

The short-term action you should take depends on the degree of cannibalization and how quickly you can act.

“Degree” means how similar the content across two or more pages is, expressed in cosine or content similarity.

Though not an exact science, in my experience, a cosine similarity higher than 0.7 is classified as “high”, while it’s “low” below a value of 0.5.
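
In code, that triage could look like this (the cutoffs are my heuristics from above, not an exact science):

```python
def classify_similarity(score: float) -> str:
    # Rough triage of a page pair's cosine similarity.
    if score > 0.7:
        return "high"      # canonicalize, noindex, or consolidate
    if score < 0.5:
        return "low"       # disambiguate intent, or noindex/remove
    return "moderate"      # gray zone: review the intent overlap manually
```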

4 ways to fix cannibalization (Image Credit: Kevin Indig)

What to do if the pages have a high degree of similarity:

  • Canonicalize or noindex the page when cannibalization happens due to technical issues like parameter URLs, or if the cannibalizing page is irrelevant for SEO, like paid landing pages. In this case, canonicalize the parameter URL to the non-parameter URL (or noindex the paid landing page).
  • Consolidate with another page when it’s not a technical issue. Consolidation means combining the content and redirecting the URLs. I suggest taking the older page and/or the worse-performing page and redirecting to a new, better page. Then, transfer any useful content to the new variant.
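
To make the consolidation case concrete, here’s a minimal sketch of the redirect side in Flask; the routes are hypothetical placeholders, and merging the useful content still happens in your CMS.

```python
from flask import Flask, redirect

app = Flask(__name__)

@app.route("/blog/keyword-cannibalization-basics")
def retired_page():
    # The older, worse-performing page permanently redirects (301) to the
    # consolidated page after its useful content has been merged over.
    return redirect("/blog/content-cannibalization-guide", code=301)
```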

What to do if the pages have a low degree of similarity:

  • Noindex or remove (status code: 410) when you don’t have the capacity or ability to make content changes.
  • Disambiguate the intent focus of the content if you have the capacity, and if the overlap is not too strong. In essence, you want to differentiate the parts of the pages that are too similar.

Solving Cannibalization In The Long Term

It’s critical to take long-term action to adjust your strategy or production process because content cannibalization is a symptom of a bigger issue, not a root cause.

(Unless we’re talking about Google changing its understanding of intent during a core algorithm update, and that has nothing to do with you or your team.)

The most critical long-term changes you need to make are:

  1. Create a content roadmap: SEO Integrators should maintain a living spreadsheet or database with all SEO-relevant URLs and their main target keywords and intent to tighten editorial oversight. Whoever is in charge of the content roadmap needs to ensure there is no overlap between articles and other page types. Writers need to have a clear target intent for new and existing content.
  2. Develop clear site architecture: The counterpart of a content roadmap for SEO Aggregators is a site architecture map, which is simply an overview of different page types and the intent they target. It’s critical to underpin the intent as you define it with example keywords that you verify on a regular basis (“Are we still ranking well for those keywords?”) to match it against Google’s understanding and competitors.

The last question is: “How do I know when content cannibalization is fixed?”

The answer is when the symptoms mentioned in the previous chapter go away:

  • Indexing issues resolve.
  • URL flickering goes away.
  • No duplicate titles appear in Google’s search index.
  • “Crawled, not indexed” or “discovered, not indexed” issues decrease.
  • Rankings stabilize and break through a plateau (if the page has no other apparent issues).

And, after working with my clients under this manual framework for years, I decided it’s time to automate it.

Introducing: A Fully Automated Cannibalization Detector

Together with Nicole, I used AirOps to build a fully automated AI workflow that goes through 37 steps to detect cannibalization within minutes.

It performs a thorough analysis of content cannibalization by examining keyword rankings, content similarity, and historical data.

Below, I’ll break down the most important steps that it automates on your behalf:

1. Initial URL Processing

The workflow extracts and normalizes the domain and brand name from the input URL.

This foundational step establishes the target website’s identity and creates the baseline for all subsequent analysis.

Image Credit: Kevin Indig
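
AirOps doesn’t publish the workflow’s internals beyond these step descriptions, but a domain and brand normalization step might look roughly like this sketch, which assumes the tldextract package and a hypothetical input URL.

```python
import tldextract

# Normalize the input URL into a registrable domain and a rough brand name.
ext = tldextract.extract("https://www.example-brand.com/blog/some-post")
domain = f"{ext.domain}.{ext.suffix}"         # "example-brand.com"
brand = ext.domain.replace("-", " ").title()  # "Example Brand" (naive guess)
print(domain, brand)
```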

2. Target Content Analysis

To ensure that the system has quality source material to analyze and compare against competitors, Step 2 involves:

  • Scraping the page.
  • Validating and analyzing the HTML structure for main content extraction.
  • Cleaning the article content and generating target embeddings.
Image Credit: Kevin Indig

3. Keyword Analysis

Step 3 reveals the target URL’s search visibility and potential vulnerabilities by:

  • Analyzing ranking keywords through Semrush data.
  • Filtering branded versus non-branded terms.
  • Identifying SERP overlap with competing URLs.
  • Conducting historical ranking analysis.
  • Determining page value based on multiple metrics.
  • Analyzing position differential changes over time.
Image Credit: Kevin Indig

4. Competing Content Analysis (Iteration Over Competing URLs)

Step 4 gathers additional context for cannibalization by iteratively processing each competing URL in the search results through the previous steps.

Image Credit: Kevin Indig

5. Final Report Generation

In the final step, the workflow cleans up the data and generates an actionable report.

Image Credit: Kevin Indig

Try The Automated Content Cannibalization Detector

Image Credit: Kevin Indig

Try the Cannibalization Detector and check out an example report.

A few things to note:

  1. This is an early version. We’re planning to optimize and improve it over time.
  2. The workflow can time out due to a high number of requests. We intentionally limit usage so as not to get overwhelmed by API calls (they cost money). We’ll monitor usage and might temporarily raise the limit. If your first attempt isn’t successful, try again in a few minutes; it might just be a temporary spike in usage.
  3. I’m an advisor to AirOps but was neither paid nor incentivized in any other way to build this workflow.

Please leave your feedback in the comments.

We’d love to hear how we can take the Cannibalization Detector to the next level!



Featured Image: Paulo Bobita/Search Engine Journal

AI Researchers Warn: Hallucinations Persist In Leading AI Models via @sejournal, @MattGSouthern

A report from the Association for the Advancement of Artificial Intelligence (AAAI) reveals a disconnect between public perceptions of AI capabilities and the reality of current technology.

Factuality remains a major unsolved challenge for even the most advanced models.

The AAAI’s “Presidential Panel on the Future of AI Research” report draws on input from 24 experienced AI researchers and survey responses from 475 participants.

Here are the findings that directly impact search and digital marketing strategies.

Leading AI Models Fail Basic Factuality Tests

Despite billions in research investment, AI factuality remains largely unsolved.

According to the report, even the most advanced models from OpenAI and Anthropic “correctly answered less than half of the questions” on new benchmarks like SimpleQA, a collection of straightforward factual questions.

The report identifies three main techniques being deployed to improve factuality:

  • Retrieval-augmented generation (RAG): Gathering relevant documents using traditional information retrieval before generating answers.
  • Automated reasoning checks: Verifying outputs against predefined rules to cull inconsistent responses.
  • Chain-of-thought (CoT): Breaking questions into smaller units and prompting the AI to reflect on tentative conclusions.
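
To make the first of these techniques concrete, here’s a toy sketch of the RAG pattern in Python. The generate() function is a stand-in for an LLM call, not a real API, and the documents are invented.

```python
# Toy RAG: retrieve the documents most relevant to the question, then
# condition the model's answer on them instead of on parameters alone.
def retrieve(question: str, docs: list[str], k: int = 2) -> list[str]:
    q_terms = set(question.lower().split())
    # Score each document by simple term overlap with the question.
    return sorted(docs, key=lambda d: len(q_terms & set(d.lower().split())), reverse=True)[:k]

def generate(prompt: str) -> str:
    return f"[LLM answer grounded in: {prompt[:80]}...]"  # placeholder for a model call

docs = [
    "The AAAI panel drew on input from 24 experienced AI researchers.",
    "SimpleQA is a benchmark of straightforward factual questions.",
    "Flights to Costa Rica are cheapest in May.",
]
question = "How many researchers contributed to the AAAI panel?"
context = "\n".join(retrieve(question, docs))
print(generate(f"Context:\n{context}\n\nQuestion: {question}"))
```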

However, these techniques show limited success, with 60% of AI researchers expressing pessimism that factuality issues will be “solved” in the near future.

This suggests you should prepare for continuous human oversight to ensure content and data accuracy. AI tools may speed up routine tasks, but full autonomy remains risky.

The Reality Gap: AI Capabilities vs. Public Perception

The report highlights a concerning perception gap, with 79% of AI researchers surveyed disagreeing or strongly disagreeing that “current perception of AI capabilities matches the reality.”

The report states:

“The current Generative AI Hype Cycle is the first introduction to AI for perhaps the majority of people in the world and they do not have the tools to gauge the validity of many claims.”

As of November, Gartner placed Generative AI just past the peak of inflated expectations in its Hype Cycle framework, now heading toward the “trough of disillusionment.”

For those in SEO and digital marketing, this cycle can provoke boom-or-bust investment patterns. Decision-makers might overcommit resources based on AI’s short-term promise, only to experience setbacks when performance fails to meet objectives.

Perhaps most concerning, 74% of researchers believe research directions are driven by hype rather than scientific priorities, potentially diverting resources from foundational issues like factuality.

Dr. Henry Kautz, chair of the Factuality & Trustworthiness section of the report, notes that “many of the public statements of people quite new to the field are out of line with reality,” suggesting that even expert commentary should be evaluated cautiously.

Why This Matters for SEOs & Digital Marketing

Adopting New Tools

The pressure to adopt AI tools can overshadow their limitations. Since issues of factual accuracy remain unresolved, marketers should use AI responsibly.

Conducting regular audits and seeking expert reviews can help reduce the risks of misinformation, particularly in industries regulated by YMYL (Your Money, Your Life) standards, such as finance and healthcare.

The Impact On Content Quality

AI-based content generation can lead to inaccuracies that can directly harm user trust and brand reputation. Search engines may demote websites that publish unreliable or deceptive material produced by AI.

Taking a human-plus-AI approach, where editors meticulously fact-check AI outputs, is recommended.

Navigating the Hype

Beyond content creation challenges, leaders must adopt a clear-eyed view to navigate the hype cycle. The report warns that hype can misdirect resources and overshadow more sustainable gains.

Search professionals who understand AI’s capabilities and limitations will be best positioned to make strategic decisions that deliver real value.

For more details, read the full report (PDF link).


Featured Image: patpitchaya/Shutterstock

The Role Of E-E-A-T In AI Narratives: Building Brand Authority For Search Success via @sejournal, @cshel

For over a decade, E-A-T (expertise, authoritativeness, and trustworthiness) has played a role in search rankings, first introduced in Google’s Search Quality Rater Guidelines in 2014.

But with the rise of AI-generated content and AI-synthesized answers, E-E-A-T (now including experience) is no longer just a good idea. It has become the defining factor in determining which sources AI-driven search results consider authoritative enough to cite and include in their synthesized narratives and responses.

AI Overviews and other AI-generated search features don’t just favor sites that “align with E-E-A-T principles” – they favor recognized experts.

To be cited in AI-driven answers, a brand needs to demonstrate undeniable expertise and establish itself as the authority in its field.

This means consistently producing original research, providing real-world insights, and gaining industry-wide (or broader) recognition.

In this article, we’ll explore how E-E-A-T determines visibility in AI-driven search and AI-generated answers, what challenges brands face in maintaining credibility, and strategies for ensuring that AI models and search engines rely on your content as a trusted source.

The Intersection Of E-E-A-T And AI-Generated Answers

The rise of AI-generated search results presents both opportunities and challenges for brands.

AI-powered features like Google’s AI Overviews, ChatGPT search integrations, and Perplexity AI are synthesizing answers instead of just returning traditional blue links.

This means that appearing in AI-driven answers requires more than just good SEO – it requires E-E-A-T-backed authority.

Key considerations for ensuring visibility in AI search features:

  • Experience: AI models favor content backed by first-hand knowledge. Brands that demonstrate real-world expertise through case studies, original research, and hands-on experience have a greater chance of being cited.
  • Expertise: AI-generated answers prefer sources with clear subject matter expertise. Author bylines, credentials, and expert contributions all signal trustworthiness to AI-driven search.
  • Authoritativeness: AI Overviews and LLM-generated answers prioritize brands that own their knowledge graph, are widely referenced, and are recognized leaders in their industry.
  • Trustworthiness: AI-generated content is acceptable to use (in that it is not inherently “bad” or penalized) but must be factually accurate and verifiable. Content backed by reliable sources, citations, and transparent authorship is more likely to surface in AI-generated search features.

Read More: A Candid Assessment Of AI Search & SEO

AI Overviews And E-E-A-T: What Google’s Latest Research Reveals

Google’s recent post on AI Overviews and AI Mode highlights how AI-generated search experiences are evolving and underscores the importance of E-E-A-T in shaping AI-driven responses.

Here are key takeaways that reinforce the role of E-E-A-T:

Google Integrates E-E-A-T Into AI Overviews

  • AI Overviews leverage Google’s ranking systems and Knowledge Graph to determine which sources are most authoritative. (Hint: Ensure your Knowledge Graph exists and is accurate!)
  • E-E-A-T signals directly influence which websites AI Overviews pull from, reinforcing the need for brands to establish themselves as leading authorities.

High-Quality Sources Are A Requirement

  • AI Overviews corroborate AI-generated summaries with top-ranked content, (theoretically) ensuring the information is reliable.
  • For Your Money or Your Life (YMYL) queries, the bar for trustworthiness is even higher, emphasizing the importance of expert-driven content. (This is why author biographies with CVs, other credentials, and proof of expertise are necessary.)

AI Overviews Increase Engagement With High-Quality Content

  • Google reports that users who interact with AI Overviews visit a greater diversity of websites and that click-throughs from AI Overviews are of higher quality.
  • This presents an opportunity for brands with strong E-E-A-T signals to attract engaged visitors who trust the AI-curated results (but click through to verify).

Manual And Algorithmic Safety Checks Reinforce E-E-A-T’s Importance

  • Google’s Search Quality Raters, adversarial testing, and fact-checking systems ensure AI Overviews prioritize reliable information.
  • Brands that lack E-E-A-T credentials (specifically Knowledge Graphs and other key indicators that your brand is considered authoritative) may struggle to appear in AI-generated search experiences.

Future AI Search Innovations Will Reward E-E-A-T Signals

  • Google’s experimental AI Mode in Search expands AI-generated responses using multimodal data and real-time corroboration with authoritative sources.
  • Brands with verified expertise, structured citations, and widespread recognition will have an advantage in AI-driven search.

This reinforces the need for brands to proactively establish E-E-A-T authority to maintain visibility in AI-driven search features.

Read More: AI Search Optimization: Data Finds Brand Mentions Improve Visibility

Challenges In Applying E-E-A-T To AI-Generated Search

Despite its benefits, AI-driven search presents several challenges for brands trying to maintain visibility:

1. AI Prioritizes Recognized Authorities: Simply optimizing for E-E-A-T is not enough. Brands must become the trusted source that AI search engines consistently reference.

It’s easy to optimize for or align with E-E-A-T in principle, but much more difficult to achieve in reality because some of the requirements simply aren’t within your control.

2. Potential For Misinformation: AI-generated search results can fabricate statistics, misquote sources, or create misleading narratives. Brands must actively monitor AI-generated mentions for accuracy.

3. Duplicate And Unoriginal Content: AI often pulls from widely cited knowledge bases, meaning brands that don’t produce original insights and research risk being ignored.

4. Algorithmic Bias And Filtering: AI search models prioritize widely referenced sources, which can disadvantage emerging brands. Overcoming this requires strategic partnerships, citations, and broad industry engagement.

AI’s Tendency To Be “Confidently Wrong”

A March 2025 study by the Columbia Journalism Review found that AI-powered search tools frequently provide incorrect answers with “alarming confidence.”

The study tested eight major AI search engines and found that chatbots collectively provided inaccurate answers more than 60% of the time, nearly always without acknowledging uncertainty.

Most interesting finding: Premium AI models were even more prone to confidently incorrect responses than their free counterparts, contradicting the assumption that paid AI services are more reliable.

ChatGPT, in particular, indicated uncertainty in its wrong answers only 7.5% of the time, which means that 92.5% of the time it was wrong, it was confident it was correct.

If ChatGPT’s success rate at indicating uncertainty were a batting average, it would be .075.

John Vukovich, known for recording the lowest-ever MLB batting average (for non-pitchers with more than 500 at-bats), had a career BA of .161 – still more than double ChatGPT’s rate of acknowledging it might not be right.

The findings in this report only underscore the need for careful, attentive human oversight when producing content and active reputation management to ensure accuracy in AI-generated search environments.

Read More: The Impact Of AI And Other Innovations On Data Storytelling

Strategies For Strengthening E-E-A-T In AI-Driven Search

To ensure visibility in AI-generated search results, brands must prioritize establishing true authority, not just optimizing content.

1. Own And Optimize Your Knowledge Graph

  • Ensure Google’s Knowledge Graph accurately represents your brand.
  • Claim your entity in Google Search and establish schema markup for credibility.
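
As one small, concrete step, here’s a minimal Organization schema snippet built as JSON-LD in Python; every field value is a hypothetical placeholder, and the properties your entity needs may differ.

```python
import json

# Organization schema markup (schema.org), embedded on your site inside a
# <script type="application/ld+json"> tag; all values are placeholders.
org_schema = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Brand",
    "url": "https://www.example.com",
    "logo": "https://www.example.com/logo.png",
    "sameAs": [
        "https://www.linkedin.com/company/example-brand",
        "https://en.wikipedia.org/wiki/Example_Brand",
    ],
}

print(json.dumps(org_schema, indent=2))
```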

2. Demonstrate Real-World Expertise

  • Publish original research, case studies, and expert insights.
  • Engage in media interviews, guest contributions, and speaking engagements.

3. Become The Primary Source Of Industry Insights

4. Monitor And Influence AI Search Results

  • Actively track how AI-generated answers represent your brand.
  • Engage with AI search models via feedback loops and corrections.

5. Leverage Thought Leadership Beyond Your Website

  • Be featured on authoritative platforms, podcasts, and news outlets.
  • Participate in peer-reviewed research and industry collaborations.

Read More: What 7 SEO Experts Think About AI Overviews And Where Search Is Going

Becoming The Source AI Can’t Ignore

E-E-A-T is the key to visibility in AI-driven search – but it’s not just about optimization.

Brands must become the expert sources AI models trust, reference, and cite.

Those who invest in credibility, expertise, and real-world authority will survive in AI-powered search landscapes, and those who don’t will fade into irrelevance.



Featured Image: insta_photos/Shutterstock

    Google Rolls Out AI-Powered Travel Updates For Search & Maps via @sejournal, @MattGSouthern

    Google has released its annual summer travel trends report alongside several AI-powered updates to its travel planning tools.

    The announcement reveals shifting travel preferences while introducing enhancements to Search, Maps, Lens, and Gemini functionality.

    New AI Search and Planning Features

    Google announced five major updates to its travel planning ecosystem.

    Expanded AI Overviews

    Google has enhanced its AI Overviews in Search to generate travel recommendations for entire countries and regions, not just cities.

    You can now request specialized itineraries by entering queries like “create an itinerary for Costa Rica with a focus on nature.”

    The feature includes visual elements and the ability to export recommendations to various Google products.

    Image Credit: Google

    Price Monitoring for Hotels

    Following its flight price tracking implementation, Google has extended similar functionality to accommodations.

    When browsing google.com/hotels, you can now toggle price tracking to receive alerts when hotel rates decrease for selected dates and destinations.

    The system factors in applied filters, including amenity preferences and star ratings.

    Image Credit: Google

    Screenshot Recognition in Maps

    A new Google Maps feature can help organize travel plans by automatically identifying places mentioned in screenshots.

    Using Gemini AI capabilities, the system recognizes venues from saved images and allows users to add them to dedicated lists.

    The feature is launching first on iOS in English, with Android rollout planned.

    Gemini Travel Assistance

    Google’s Gemini AI assistant now offers enhanced travel planning support, allowing users to create “Gems” – customized AI assistants for specific travel needs.

    Now available at no cost, these specialized assistants can help with destination selection, local recommendations, and trip logistics.

    Expanded Lens Capabilities

    Google Lens continues evolving, offering enhanced AI-powered information delivery when pointing your camera at landmarks or objects.

    The feature is expanding beyond English to include Hindi, Indonesian, Japanese, Korean, Portuguese, and Spanish, complementing its existing translation capabilities.

    Image Credit: Google

    Travel Search Trends

    According to Google’s Flights and Search data analysis, travelers are increasingly drawn to coastal destinations for summer 2025.

    Caribbean islands, including Puerto Rico, Curacao, and St. Lucia, are seeing significant search growth, along with other beach destinations like Rio de Janeiro, Maui, and Nantucket.

    The data also reveals continued momentum for outdoor adventure travel within the U.S.:

    • Cities with proximity to nature experiences (Billings, Montana; Juneau, Alaska; and Bangor, Maine) are experiencing higher search volume
    • “Cabins” has emerged as the top accommodation search for romantic getaways
    • Family travelers are increasingly searching for “dude ranch” vacations
    • Weekend getaway searches concentrate on natural destinations, including upstate New York, Joshua Tree National Park, and Sedona.

    An unexpected trend in luggage preferences was also noted, with “checked bags” queries now exceeding historically dominant “carry on” searches.

    Supporting this shift, space-saving solutions like vacuum bags and compression packing cubes have become top trending travel accessory searches.

    Implications for SEO and Travel Content

    These updates signal Google’s continued investment in controlling the travel research journey within its own ecosystem.

    The expansion of AI-generated itineraries and information potentially reduces the need for users to visit traditional travel content sites during the planning phase.

    Travel brands and publishers may need to adapt their SEO and content strategies to account for these changes, focusing more on unique experiences and in-depth content beyond what Google’s AI tools can generate.

    The trend data also provides valuable insights for travel-related keyword targeting and content development as summer vacation planning begins for many consumers.

    OpenAI Rolls Out GPT-4o Image Creation To Everyone via @sejournal, @MattGSouthern

    OpenAI has rolled out a new image generation system directly integrated with GPT-4o. This system allows the AI to access its knowledge base and conversation context when creating images.

    This integration is said to enable more contextually relevant and accurate visual outputs.

    OpenAI’s announcement reads:

    “GPT‑4o image generation excels at accurately rendering text, precisely following prompts, and leveraging 4o’s inherent knowledge base and chat context—including transforming uploaded images or using them as visual inspiration. These capabilities make it easier to create exactly the image you envision, helping you communicate more effectively through visuals and advancing image generation into a practical tool with precision and power.”

    Here’s everything else you need to know.

    Technical Capabilities

    OpenAI highlights the following capabilities of its new image generation system:

    1. It accurately renders text within images.
    2. It allows users to refine images through conversation while keeping a consistent style.
    3. It supports complex prompts with up to 20 different objects.
    4. It can generate images based on uploaded references.
    5. It creates visuals using information from GPT-4o’s training data.

    OpenAI states in its announcement:

    “Because image generation is now native to GPT‑4o, you can refine images through natural conversation. GPT‑4o can build upon images and text in chat context, ensuring consistency throughout. For example, if you’re designing a video game character, the character’s appearance remains coherent across multiple iterations as you refine and experiment.”

    Examples

    To demonstrate character consistency, here’s an example showing a cat and then that same cat with a hat and monocle.

    Screenshot from: openai.com/index/introducing-4o-image-generation/, March 2025.

    Here’s a more practical example for marketers, demonstrating text generation: a full restaurant menu generated with a detailed prompt.

    Screenshot from: openai.com/index/introducing-4o-image-generation/, March 2025.

    There are dozens more examples in OpenAI’s announcement post, many of which contain several prompts and follow-ups.

    Limitations

    OpenAI admits:

    “Our model isn’t perfect. We’re aware of multiple limitations at the moment which we will work to address through model improvements after the initial launch.”

    The company notes the following limitations of its new image generation system:

    • Cropping: GPT-4o sometimes crops long images, like posters, too closely at the bottom.
    • Hallucinations: This model can create false information, especially with vague prompts.
    • High Binding Problems: It struggles to accurately depict more than 10 to 20 concepts at once, like a complete periodic table.
    • Multilingual Text: The model can have issues showing non-Latin characters, leading to errors.
    • Editing: Requests to edit specific image parts may change other areas or create new mistakes. It also struggles to keep faces consistent in uploaded images.
    • Information Density: The model has difficulty showing detailed information at small sizes.

    Search Implications

    This update changes AI image generation from mainly decorative uses to more practical functions in business and communication.

    Websites can use AI-generated images but with important considerations.

    Google’s guidelines do not prohibit AI-generated visuals, focusing instead on whether content provides value regardless of how it’s produced.

    Following these best practices is recommended:

    • Using C2PA metadata (which GPT-4o adds automatically) to maintain transparency
    • Adding proper alt text for accessibility and indexing
    • Ensuring images serve user intent rather than just filling space
    • Creating unique visuals rather than generic AI templates

    Google Search Advocate John Mueller has expressed a negative opinion regarding AI-generated images. While his personal preferences don’t influence Google’s algorithms, they may indicate how others feel about AI images.

    Screenshot from: bsky.app/profile/johnmu.com, March 2025.

    Note that Google is implementing measures to label AI-generated images in search results.

    Availability

    The feature is now available to ChatGPT users with Plus, Pro, Team, or Free plans. Access for Enterprise and Edu users will be available soon.

    Developers can expect API access in the coming weeks. Because of higher processing needs, image generation takes about one minute on average.


    Featured Image: PatrickAssale/Shutterstock

    70% Of Media Companies Not Fully Using AI, IAB Report Finds via @sejournal, @MattGSouthern

    IAB’s latest “State of Data” report reveals that despite recognizing its potential, 70% of agencies, brands, and publishers have yet to integrate AI into their campaigns fully.

    Here’s a look at the study, which examines the current use of AI in advertising, the challenges of adoption, and the opportunities for success.

    Current State of AI Adoption

    A report from the Interactive Advertising Bureau (IAB) surveyed over 500 experts and found that AI use varies across the industry:

    • 30% of companies have implemented AI in their media campaigns.
    • Agencies (37%) and publishers (34%) are more advanced in using AI compared to brands (19%).
    • Half of the companies that haven’t adopted AI plan to do so by 2026.
    • Most organizations (85%) are using general AI tools, while fewer are using custom solutions (45%) or proprietary tools (24%).

    One SVP from an undisclosed brand stated in the report:

    “We have been slow to fully implement AI into our day-to-day processes. We are wary to go ‘all in’ until it’s become a bit more of a societal norm with a long-standing track record of scalable success.”

    AI Perceptions

    Companies using AI generally have positive experiences:

    • 82% say AI meets or exceeds their efficiency expectations, saving time and costs.
    • 75% believe AI helps their media campaigns effectively.
    • 73% find AI reliable over time.

    AI excels in data-heavy tasks, like audience segmentation and targeting, but struggles with tasks needing human judgment, such as RFP management and campaign setup.

    Adoption Barriers

    The research found several barriers to adopting AI in media campaigns:

    • 62% said they’re concerned about how complex it is to set up and maintain AI.
    • 62% worry about the risk of data security.
    • 61% noted that their organizations lack AI knowledge.
    • 60% have concerns about how accurate and transparent AI is.

    Interestingly, job displacement isn’t seen as a major issue, with only 37% identifying it as a concern.

    Buy-Side vs. Sell-Side Challenges

    Agencies, brands, and publishers face unique challenges with AI:

    • Publishers struggle with complex technology (67%) and scattered capabilities (62%).
    • Brands and publishers (56% each) lack a clear AI vision.
    • Agencies encounter the most resistance to change from teammates and clients (61%).
    • Additionally, 51% of brands worry about transparency in how their partners use AI.

    Looking Ahead

    AI is changing media campaigns, and IAB’s report highlights some important points.

    First, many companies are in the early stages of adopting AI, but this is happening faster than before. Companies without clear plans risk falling behind by 2026.

    Second, companies need good data and solid governance guidelines to succeed with AI. Organizations should train their teams in best practices and set clear goals.

    Standards for transparency, privacy, and reliability are still being developed across the industry. Companies that collaborate to set these standards will be best positioned to handle this change in digital advertising.

    The full “State of Data” report is available through IAB.


    Featured Image: eamesBot/Shutterstock

    Google Search Central Live NYC: Insights On SEO For AI Overviews via @sejournal, @martinibuster

    Danny Sullivan, Google’s Search Liaison, shared insights about AI Overviews, explaining how predictive summaries, grounding links, and the query fan-out technique work together to shape AI-generated search results.

    Optimizing For AIO

    Danny Sullivan shared insights into how AI Overviews are generated, helping explain why Google may link to websites that don’t match the typical search results. While the links can differ, he emphasized that the fundamentals of search optimization remain unchanged.

    This is what Danny Sullivan said, based on my notes:

    “The core fundamental things haven’t really changed. If you’re doing things that are making you successful on search, those sorts of things should transfer into some of the things that you see in the generative AI kind of summaries.”

    Google Explains Why AIO Results Are Different

    One of the main takeaways from this part of Danny’s presentation was his explanation of why Google AIO search results are different. It’s the clearest explanation of this to date; every SEO and publisher needs to know it.

    He introduced two concepts to familiarize yourself with in order to better understand AIO search results.

    1. Predictive Summaries
    2. Grounding Links

    Predictive Summaries

    Danny solved the mystery behind AIO search results that show content and links different from what the organic search results show, which makes it harder to understand how to optimize for that kind of AI search result.

    He shared that the reason for that kind of AIO is something called predictive summaries. Predictive summaries show answers to a search query but also try to predict related variations of what a user will want to see. This sounds a lot like Google’s patent about Information Gain, which is about predicting the next question that a searcher may ask after reading the answer to their present question. The Information Gain patent applies strictly to the context of AI search and AI assistants.

    Here is what he said, according to my notes:

    “One thing I think that people find really confusing sometimes is that they’ll do a query and especially you’ll see …these are the top 10 results, but I don’t see them in the AIO, what’s going on?

    And it’s like, yeah, the query in the search box is the same query, but the model that’s going out there to try to understand what to show is kind of an overview, going beyond just the top 10 results. It’s understanding a lot of results and it’s understanding a lot of variations that you might kind of get and so that it’s coming back and it’s trying to provide its predictive summary of what the query is related to.”

    Grounding Links

    Sullivan also revealed that “grounding links” are another reason why AIO search results are different from the regular organic search results. An AIO search result is a summary of a topic that includes facts about multiple subtopics. The purpose of grounding is to anchor the entire summary to verifiable information from the web ecosystem.

    In the context of AIO, grounding is the process of confirming the factual authenticity of the AI summaries so that a searcher can click to read about any subtopic discussed in the answer summary provided by AIO. This is the second reason why the links in AIO show a variety not normally seen in the organic search results.

    One way to look at this is that the links are more contextual than the regular ten blue links of the organic search results. These contextual links are also referred to as qualified clicks or qualified links: links that are hyper-specific and generally more relevant than organic search results.

    Danny appears to say that the grounding links are created from searches that are related to the initial search query but are not the same. For example, if you want to explain how a conventional automobile runs, you need information about the powertrain, which is made up of a gas combustion engine, a transmission, the axles, and so on. Answering a complex question requires grounding from a wide array of information sources.

    According to my notes, this is how Danny Sullivan explained it:

    “And then on top of that, it’s then also trying to bring in the grounding links. And those grounding links, because it kind of comes from a broader set aren’t just going to match. The queries are going to be different and the overall set is going to be different.

    Which is why it’s a great opportunity for diversity and whatever our query thing is that we say, but that’s why you can see different things that are showing there.”

    Don’t Mess Up Your Rankings

    Sullivan cautioned about trying to rank for both the organic and the different parts of the AIO summaries, saying that it’s likely to “mess things up” because “it doesn’t really work like that.”

    Query Fan-Out Technique

    Danny Sullivan also touched on the topic of AI Mode, saying that right now it’s not really something to optimize for because it’s still in Google Labs and it’s very likely to change and be something different if it ever gets out of Google Labs.

    But he did say that AI Mode uses something called a query fan-out technique.

    He said:

    “…one of the things they talk about is like ‘we use an advanced query fan out technique with multiple related queries in it…’ And it’s basically that what I said before.

    You issued a query. You try to understand the variations and things that are related. which by the way is not that much different to how search works at the moment even when you didn’t have the AI elements to it. Because when you would issue a query now we try to understand synonyms, we try to understand the meaning of the entire query. If it’s a sentence, we try to match it in all sorts of different ways …because sometimes it just brings you better results.”

    Takeaways:

    Google Search Liaison, aka Danny Sullivan, encouraged the use of the core SEO fundamentals, saying that they are still relevant for ranking. Danny explained why the links in AI Overviews can sometimes differ significantly from those in the organic search results, introducing three concepts that help understand AIO search results better.

    Three concepts related to AIO search results to understand:

    1. Predictive Summaries
    2. Grounding Links
    3. Query Fan-Out Technique

    Google Expands AI Overviews To Thousands More Health Queries via @sejournal, @MattGSouthern

    Google is expanding AI overviews to “thousands more health topics,” per an announcement at the company’s health-focused ‘The Check Up’ event.

    The event included developments spanning research, wearable technology, and medical records.

    Here’s more about how Google is refining health results in Search.

    AI Overviews For Health Queries

    Google is showing AI overviews for more health-related queries.

    Compared to other types of questions, this topic has had fewer AI overviews. Now, these overviews will be available for more queries and in more languages.

    Google states:

    “Now, using AI and our best-in-class quality and ranking systems, we’ve been able to expand these types of overviews to cover thousands more health topics. We’re also expanding to more countries and languages, including Spanish, Portuguese and Japanese, starting on mobile.”

    Google notes health-focused advancements to its Gemini models will go into summarizing information for health topics.

    With these updates, Google claims AI overviews for health queries are “more relevant, comprehensive and continue to meet a high bar for clinical factuality.”

    New “What People Suggest” Feature

    Google is introducing a new feature for health queries called “What People Suggest.”

    It uses AI to organize perspectives from online discussions and to analyze what people with similar health conditions are saying.

    For example, someone with arthritis looking for exercise recommendations could use this feature to learn what works for others with the same condition.

    See an example below.

    Screenshot from: blog.google/technology/health/the-check-up-health-ai-updates-2025/, March 2025.

    “What People Suggest” is currently available only on mobile devices in the U.S.

    Broader Health AI Initiatives

    The search updates were part of a larger set of health technology announcements at The Check Up event. Google also revealed:

    • Medical Records APIs in Health Connect for managing health data across applications
    • FDA clearance for Loss of Pulse Detection on Pixel Watch 3
    • An AI co-scientist built on Gemini 2.0 to help biomedical researchers
    • TxGemma, a collection of open models for AI-powered drug discovery
    • Capricorn, an AI tool for pediatric oncology treatment developed with Princess Máxima Center

    Looking Ahead

    Hallucination remains a problem for AI models. While Gemini may have upgrades that make it more accurate, it will still be wrong at least sometimes.

    Google’s inclusion of personal experiences alongside medical websites marks a shift, recognizing people value both clinical information and real-world perspectives.

    Health publishers should be aware that this could affect search visibility but may also increase chances of appearing for more queries or the “What People Suggest” section.

    What Content Works Well In LLMs? via @sejournal, @Kevin_Indig

    Over the last 12 months, we filled significant gaps in our understanding of AI Chatbots like ChatGPT & Co.

    We know:

    1. Adoption is growing rapidly.
    2. AI chatbots send more referrals to websites over time.
    3. Referral traffic from AI chatbots has a higher quality than that from Google.

    You can read all about it in the state of AI chatbots and SEO.

    But there isn’t much content about examples and success factors of content that drives citations and mentions in AI chatbots.

    To get an answer, I analyzed over 7,000 citations across 1,600 URLs to content-heavy sites (think: Integrators) in three AI chatbots (ChatGPT, Perplexity, AI Overviews) in February 2024 with the help of Profound.

    My goal is to figure out:

    1. Why some pages are more cited than others, so we can optimize content for AI chatbots.
    2. Whether classic SEO factors matter for AI chatbot visibility, so we can prioritize.
    3. What traps to avoid, so we don’t have to learn the same lessons many times.
    4. If different factors influence mentions and citations, so we can be more targeted in our efforts.

    Here are my findings:


    The Key To Brand Citation In AI Chatbots: Deep Content

    Image Credit: Kevin Indig

    🔍 Context: We know that AI chatbots use Retrieval Augmented Generation (RAG) to weigh their answers with results from Google and Bing. However, does that mean classic SEO ranking factors also translate to AI chatbot citations? No.

    My correlation analysis shows that none of the classic SEO metrics have strong relationships with citations. LLMs have light preferences: Perplexity and AI Overviews weigh word and sentence count higher, while ChatGPT weighs domain rating and Flesch Score.

    💡Takeaway: Classic SEO metrics don’t matter nearly as much for AI chatbot mentions and citations. The best thing you can do for content optimization is to aim for depth, comprehensiveness, and readability (how easy the text is to understand).

    The following examples all demonstrate those attributes:

    • https://www.byrdie.com/digital-prescription-services-dermatologist-5179537
    • https://www.healthline.com/nutrition/best-weight-loss-programs
    • https://www.verywellmind.com/we-tried-online-therapy-com-these-were-our-experiences-8780086

    Broad correlations didn’t reveal enough meat on the bone and left me with too many open questions.

    So, I looked at what the most-cited content does differently than the rest. That approach showed much stronger patterns.

    Image Credit: Kevin Indig

    🔍Context: Because I didn’t get much out of statistical correlations, I wanted to see how the top 10% of most cited content stacks up against the bottom 90%.

    The bigger the difference, the more critical the factor for the top 10%. In other words, the multiplier (x-axis on the chart) indicates what factors LLMs reward with citations.

    The results:

    • The two factors that stand out are sentence and word count, followed by the Flesch Score. Metrics related to backlinks and traffic seem to have a negative effect, which doesn’t mean that AI chatbots weigh them negatively but simply that they don’t matter for mentions or citations.
    • The top 10% of most cited pages across all three LLMs have much less traffic, rank for fewer keywords, and get fewer total backlinks. How does that make sense? It almost looks like being strong in traditional SEO metrics is bad for AI chatbot visibility.
    • Copilot (not included in the chart) has the starkest inequality, by the way. The top 10% get 17.6x more citations than the bottom 90%. However, the top 10% also rank for 1.7x more keywords in organic search. So, Copilot seems to have stronger preferences than other AI chatbots.

    Splitting the data up by AI Chatbot shows you their unique preferences:

    Image Credit: Kevin Indig

    💡Takeaway: Content depth (word and sentence count) and readability (Flesch Score) have the biggest impact on citations in AI chatbots.

    This is important to understand: Longer content isn’t better because it’s longer, but because it has a higher chance of answering a specific question prompted in an AI chatbot.

    Examples:

    • www.verywellmind.com/best-online-psychiatrists-5119854 has 187 citations, over 10,000 words, and over 1,500 sentences, with a Flesch Score of 55, and is cited 72 times by ChatGPT.
    • On the other hand, www.onlinetherapy.com/best-online-psychiatrists/ has only three citations and a similarly low Flesch Score of 48, but comes up “short” with only 3,900 words and 580 sentences.

    🔍Context: We don’t yet know the value of a brand being mentioned by an AI chatbot.

    Early research indicates it’s high, especially when prompts indicate purchase intent.

    However, I wanted to get a step closer by understanding what leads to brand mentions in AI chatbots in the first place.

    After matching many metrics with AI chatbot visibility, I found one factor that stands out more than anything else: Brand search volume.

    The number of AI chatbot mentions and brand search volume have a correlation of .334 – pretty good in this field. In other words, the popularity of a brand broadly decides how visible it is in AI chatbots.

    Image Credit: Kevin Indig

    Popularity is the most significant predictor for ChatGPT, which also sends the most traffic and has the highest usage of all AI chatbots.

    When breaking it down by AI chatbot, I found ChatGPT has the highest correlation with .542 (strong), but Perplexity (.196) and Google AIOs (.254) have lower correlations.

    To be clear, there is a lot of nuance on the prompt and category level. But broadly, a brand’s visibility seems to be strongly driven by how popular it is.

    Example of popular brands and their visibility in the health category (Image Credit: Kevin Indig)

    However, when brands are mentioned, all AI chatbots prefer popular brands and consistently rank them in the same order.

    • There is a clear link between the categories of the users’ questions (mental health, skincare, weight loss, hair loss, erectile dysfunction) and brands.
    • Early data shows that the most visible brands are digital-first and invest heavily in their online presence with content, SEO, reviews, social media, and digital advertising.

    💡Takeaway: Popularity is the biggest criterion that decides whether a brand is mentioned in AI chatbots or not. The way consumers connect brands to product categories also matters.

    Comparing brand search volume and product category presence with your competitors gives you the best idea of how competitive you are on ChatGPT & Co.

    Examples: All models in my analysis cite Healthline most often. Not a single other domain was in the top 10 citations for all four models, showing their distinctly different tastes and how important it is to keep track of many models as opposed to only ChatGPT – if those models also send you traffic.

    Image Credit: Kevin Indig

    Other well-cited domains across most models:

    • verywellmind.com
    • onlinedoctor.com
    • medicalnewstoday.com
    • byrdie.com
    • cnet.com
    • ncoa.org
    Image Credit: Kevin Indig

    🔍 Context: Not all AI chatbots mentioned brands with the same frequency. Even though ChatGPT has the highest adoption and sends the most referral traffic to sources, Perplexity mentions the most brands on average in answers.

    Prompt structure matters for brand visibility:

    • The word “best” was a strong trigger for brand mentions in 69.71% of prompts.
    • Words like “trusted” (5.77%), “source” (2.88%), “recommend” (0.96%), and “reliable” (0.96%) were also associated with an increased likelihood of brand mentions.
    • Prompts including “recommend” often mention public organizations like the FDA, especially when the prompt includes words like “trusted” or “leading.”
    • Google AIOs show the highest brand diversity, followed by Perplexity, then ChatGPT.

    💡Takeaway: Prompt structure has a meaningful impact on the brands that come up in the answer.

    However, we’re not yet able to truly know which prompts users use. This is important to keep in mind: All prompts we look at and track are just proxies for what users might be doing.

    Image Credit: Kevin Indig

    🔍Context: In my research, I encountered several ways brands unintentionally sabotage their AI chatbot visibility.

    I surface them here because the pre-requisite to being visible in LLMs is, of course, their ability to crawl your site, whether that’s directly or through training data.

    For example, Copilot doesn’t cite onlinedoctor.com because it’s not indexed in Bing. I couldn’t find indicators that this was done on purpose, so I assume it’s an accident that could quickly be fixed and rewarded with referral traffic.

    On the other hand, ChatGPT 4o doesn’t cite cnet.com, and Perplexity doesn’t cite everydayhealth.com because both sites intentionally block the respective LLM in their robots.txt.

    But there are also cases in which AI chatbots reference sites even though they technically shouldn’t.

    The most cited domain in Perplexity in my dataset is blocked.goodrx.com. GoodRX blocks users from non-U.S. countries, and it seems it accidentally or intentionally blocks Perplexity.

    Image Credit: Kevin Indig

    It’s important to single out Google’s AI Overviews here: There is no opt-out for AIOs, meaning if you want to get organic traffic from Google, you need to allow it to crawl your site, potentially use your content to train its models and surface it in AI Overviews. Chegg recently filed a lawsuit against Google for this.

    💡Takeaway: Monitor your site in Google Search Console and Bing Webmaster Tools, especially to confirm that all the URLs you want indexed actually are.

    Double-check whether you accidentally block an LLM crawler in your robots.txt or through your CDN.

    If you intentionally block LLM crawlers, double-check whether you appear in their answers simply by asking them what they know about your domain.
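
    One quick way to run that double-check is a small Python script using only the standard library; the crawler tokens below are common examples (GPTBot, PerplexityBot, ClaudeBot, Google-Extended), and the domain is a placeholder.

    ```python
    from urllib import robotparser

    # Check whether robots.txt blocks common LLM crawler user-agents.
    rp = robotparser.RobotFileParser("https://www.example.com/robots.txt")
    rp.read()

    for bot in ["GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended"]:
        verdict = "allowed" if rp.can_fetch(bot, "https://www.example.com/") else "blocked"
        print(f"{bot}: {verdict}")
    ```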

    Summary: 6 Key Learnings

    • Classic SEO metrics don’t strongly influence AI chatbot citations.
    • Content depth (higher word and sentence counts) and readability (good Flesch Score) matter more.
    • Different AI chatbots have distinct preferences – monitoring multiple platforms is important.
    • Brand popularity (measured by search volume) is the strongest predictor of brand mentions in AI chatbots, especially in ChatGPT.
    • Prompt structure influences brand visibility, and we don’t yet know how users phrase prompts.
    • Technical issues can sabotage AI visibility – ensure your site isn’t accidentally blocking LLM crawlers through robots.txt or CDN settings.

    Featured Image: Paulo Bobita/Search Engine Journal