WordPress Robots.txt: What Should You Include? via @sejournal, @alexmoss

The humble robots.txt file often sits quietly in the background of a WordPress site, but the default output is fairly basic out of the box and, of course, doesn’t include any of the customized directives you may want to adopt.

No more intro needed – let’s dive right into what else you can include to improve it.

(A small note: This post only applies to WordPress installations in the root directory of a domain or subdomain, e.g., domain.com or example.domain.com.)

Where Exactly Is The WordPress Robots.txt File?

By default, WordPress generates a virtual robots.txt file. You can see it by visiting /robots.txt of your install, for example:

https://yoursite.com/robots.txt

This default file exists only in memory and isn’t represented by a file on your server.

If you want to use a custom robots.txt file, all you have to do is upload one to the root folder of the install.

You can do this either with an FTP application or with a plugin, such as Yoast SEO (SEO → Tools → File Editor), that includes a robots.txt editor accessible from within the WordPress admin area.

The Default WordPress Robots.txt (And Why It’s Not Enough)

If you don’t manually create a robots.txt file, WordPress’ default output looks like this:

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

While this is safe, it’s not optimal. Let’s go further.

Always Include Your XML Sitemap(s)

Make sure that all XML sitemaps are explicitly listed, as this helps search engines discover all relevant URLs.

Sitemap: https://example.com/sitemap_index.xml
Sitemap: https://example.com/sitemap2.xml

Some Things Not To Block

You may still come across dated suggestions to disallow some core WordPress directories like /wp-includes/, /wp-content/plugins/, or even /wp-content/uploads/. Don’t!

Here’s why you shouldn’t block them:

  1. Google is smart enough to ignore irrelevant files. Blocking CSS and JavaScript can hurt renderability and cause indexing issues.
  2. You may unintentionally block valuable images/videos/other media, especially those loaded from /wp-content/uploads/, which contains all uploaded media that you definitely want crawled.

Instead, let crawlers fetch the CSS, JavaScript, and images they need for proper rendering.

Managing Staging Sites

It’s advisable to ensure that staging sites are not crawled for both SEO and general security purposes.

I always advise disallowing the entire staging site.

You should still use the noindex meta tag as well; robots.txt alone isn’t a guarantee, so doing both covers an extra layer.

If you navigate to Settings > Reading, you can tick the option “Discourage search engines from indexing this site,” which does the following in the robots.txt file (or you can add this in yourself).

User-agent: *
Disallow: /

Google may still index pages if it discovers links elsewhere (usually caused by calls to staging from production when migration isn’t perfect).

Important: When you move to production, double-check this setting to make sure you revert any disallowing or noindexing.

Clean Up Some Non-Essential Core WordPress Paths

Not everything needs to be blocked, but many default paths add no SEO value, such as the following:

Disallow: /trackback/
Disallow: /comments/feed/
Disallow: */feed/
Disallow: */embed/
Disallow: /cgi-bin/
Disallow: /wp-login.php
Disallow: /wp-json/

Disallow Specific Query Parameters

Sometimes, you’ll want to stop search engines from crawling URLs with known low-value query parameters, like tracking parameters, comment responses, or print versions.

Here’s an example:

User-agent: *
Disallow: /*?replytocom=
Disallow: /*?print=

Since Google Search Console retired its URL Parameters tool, use the Page indexing and Crawl Stats reports to monitor parameter-driven indexing patterns and decide whether additional disallows are worth adding.

Disallowing Low-Value Taxonomies And Internal Search Results

If your WordPress site includes tag archives or internal search results pages that offer no added value, you can block them too:

User-agent: *
Disallow: /tag/
Disallow: /page/
Disallow: /?s=

As always, weigh this against your specific content strategy.

If you use tag taxonomy pages as part of content you want indexed and crawled, then ignore this, but generally, they don’t add any benefits.

Also, make sure your internal linking structure supports your decision and minimizes any internal linking to areas you have no intention of indexing or crawling.
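
Before deploying disallow rules like the ones above, it can be worth sanity-checking them against a handful of real URLs. Here is a rough sketch using Python’s built-in urllib.robotparser; note that it only understands simple prefix rules, not Googlebot-style wildcards such as /*?replytocom=, so verify wildcard patterns with Google’s open-source parser or Search Console instead. The rules and URLs below are placeholders.

from urllib import robotparser

rules = """
User-agent: *
Disallow: /wp-login.php
Disallow: /wp-json/
Disallow: /tag/
Disallow: /?s=
""".strip()

parser = robotparser.RobotFileParser()
parser.parse(rules.splitlines())

# Placeholder URLs - swap in pages from your own site.
urls = [
    "https://example.com/",                     # should stay crawlable
    "https://example.com/blog/my-post/",        # should stay crawlable
    "https://example.com/tag/widgets/",         # should be blocked
    "https://example.com/wp-json/wp/v2/posts",  # should be blocked
]

for url in urls:
    allowed = parser.can_fetch("*", url)
    print(f"{'ALLOWED' if allowed else 'BLOCKED'}  {url}")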

Monitor Crawl Stats

Once your robots.txt is in place, monitor crawl stats via Google Search Console:

  • Look at Crawl Stats under Settings to see if bots are wasting resources.
  • Use the URL Inspection Tool to confirm whether a blocked URL is indexed or not.
  • Check Sitemaps and make sure they only reference pages you actually want crawled and indexed.

In addition, some server management tools, such as Plesk, cPanel, and Cloudflare, can provide extremely detailed crawl statistics beyond Google.

Lastly, use Screaming Frog’s custom robots.txt override to simulate changes, and revisit Yoast SEO’s crawl optimization features, some of which address the points above.

Final Thoughts

While WordPress is a great CMS, it doesn’t ship with the most ideal default robots.txt, nor is it set up with crawl optimization in mind.

Just a few lines of directives and less than 30 minutes of your time can save you thousands of unnecessary crawl requests to URLs that don’t need to be crawled at all, and head off a potential crawl-scaling issue down the line.



Featured Image: sklyareek/Shutterstock

We Figured Out How AI Overviews Work [& Built A Tool To Prove It] via @sejournal, @mktbrew

This post was sponsored by Market Brew. The opinions expressed in this article are the sponsor’s own.

Wondering how to realign your SEO strategy for maximum SERP visibility in AI Overviews (AIO)?

Do you wish you had techniques that mirror how AI understands relevance?

Imagine if Google handed you the blueprint for AI Overviews:

  • Every signal.
  • Every scoring mechanism.
  • Every semantic pattern it uses to decide what content makes the cut.

That’s what our search engineers did.

They reverse-engineered how Google’s AI Overviews work and built a model that shows you exactly what to fix.

It’s no longer about superficial tweaks; it’s about aligning with how AI truly evaluates meaning and relevance.

In this article, we’ll show you how to rank in AIO SERPs by creating embeddings for your content and how to realign your content for maximum visibility by using AIO tools built by search engineers.

The 3 Key Features Of AI Overviews That Can Make Or Break Your Rankings

Let’s start with the basic building blocks of a Google AI Overviews (AIO) response:

What Are Embeddings?

Embeddings are high-dimensional numerical representations of text. They allow AI systems to understand the meaning of words, phrases, or even entire pages, beyond just the words themselves.

Rather than matching exact terms, embeddings turn language into vectors, or arrays of numbers, that capture the semantic relationships between concepts.

For example, “car,” “vehicle,” and “automobile” are different words, but their embeddings will be close in vector space because they mean similar things.

Large language models (LLMs) like ChatGPT or Google Gemini use embeddings to “understand” language; they don’t just see words, they see patterns of meaning.

What Are Embeddings? (Infographic created by MarketBrew.ai, April 2025)

Why Do Embeddings Matter For SEO?

Understanding how Large Language Models (LLMs) interpret content is key to winning in AI-driven search results, especially with Google’s AI Overviews.

Search engines have shifted from simple keyword matching to deeper semantic understanding. Now, they rank content based on contextual relevance, topic clusters, and semantic similarity to user intent, not just isolated words.

Vector Representations of Words (Image created by MarketBrew.ai, April 2025)

Embeddings power this evolution.

They enable search engines to group, compare, and rank content with a level of precision that traditional methods (like TF-IDF, keyword density, or Entity SEO) can’t match.

By learning how embeddings work, SEOs gain tools to align their content with how search engines actually think, opening the door to better rankings in semantic search.

The Semantic Algorithm Galaxy (Image created by MarketBrew.ai, April 2025)

How To Rank In AIO SERPs By Creating Embeddings

Step 1: Set Up Your OpenAI Account

  • Sign Up or Log In: If you haven’t already, sign up for an account on OpenAI’s platform at https://platform.openai.com/signup.
  • API Key: Once logged in, you’ll need to generate an API key to access OpenAI’s services. You can find this in your account settings under the API section.

Step 2: Install The OpenAI Python Client To Simplify This Step For SEO Pros

OpenAI provides a Python client that simplifies the process of interacting with their API. To install it, run the following command in your terminal or command prompt:

pip install openai

Step 3: Authenticate With Your API Key

Before making requests, you need to authenticate using your API key. Here’s how you can set it up in your Python script:

from openai import OpenAI

# openai>=1.0 uses a client object; older versions set openai.api_key directly.
client = OpenAI(api_key="your-api-key-here")

Step 4: Choose Your Embedding Model

At the time of this article’s creation, OpenAI’s text-embedding-3-small is considered one of the most advanced embedding models. It is highly efficient for a wide range of text processing tasks.

Step 5: Create Embeddings For Your Content

To generate embeddings for text:

response = client.embeddings.create(
    model="text-embedding-3-small",
    input="This is an example sentence.",
)

# Each embedding is returned as a list of floats.
embeddings = response.data[0].embedding
print(embeddings)

The result is a list of numbers representing the semantic meaning of your input in high-dimensional space.

Step 6: Storing Embeddings

Store embeddings in a database for future use; tools like Pinecone or PostgreSQL with pgvector are great options.

Step 7: Handling Large Text Inputs

For large content, break it down into paragraphs or sections and generate embeddings for each chunk.

Use similarly sized chunks for better cosine similarity calculations. To represent an entire document, you can average the embeddings for each chunk.
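
As a rough illustration of that chunk-and-average approach, here is a minimal sketch. It assumes the client object from Step 3, NumPy installed, and a plain-text file named page-content.txt (a placeholder); the blank-line chunking is just a stand-in for whatever chunking logic you prefer.

import numpy as np

def embed(text: str) -> np.ndarray:
    # Assumes the OpenAI client created in Step 3 is available as `client`.
    response = client.embeddings.create(
        model="text-embedding-3-small",
        input=text,
    )
    return np.array(response.data[0].embedding)

# page-content.txt is a placeholder: the plain text of the page you want to represent.
with open("page-content.txt", encoding="utf-8") as f:
    document = f.read()

# Naive chunking on blank lines - swap in your own chunking logic.
chunks = [chunk.strip() for chunk in document.split("\n\n") if chunk.strip()]
chunk_embeddings = [embed(chunk) for chunk in chunks]

# One vector for the whole document: the mean of its chunk vectors.
document_embedding = np.mean(chunk_embeddings, axis=0)
print(document_embedding.shape)  # (1536,) for text-embedding-3-small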

💡Pro Tip: Use Market Brew’s free AI Overviews Visualizer. The search engineer team at Market Brew has created this visualizer to help you understand exactly how embeddings, the fourth generation of text classifiers, are used by search engines.

Semantics: Comparing Embeddings With Cosine Similarity

Cosine similarity measures the similarity between two vectors (embeddings), regardless of their magnitude.

This is essential for comparing the semantic similarity between two pieces of text.
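
Here is a minimal sketch of the calculation itself. The tiny hard-coded vectors are placeholders; in practice you would pass in embeddings returned by an API call like the one in Step 5.

import numpy as np

def cosine_similarity(a, b):
    a, b = np.array(a), np.array(b)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# In practice these would be real embedding vectors; tiny placeholders keep the example runnable.
query_vec = [0.12, 0.98, 0.05]
page_vec = [0.10, 0.95, 0.07]
off_topic_vec = [0.91, 0.02, 0.40]

print(cosine_similarity(query_vec, page_vec))       # close to 1.0 = semantically similar
print(cosine_similarity(query_vec, off_topic_vec))  # much lower = semantically distant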

How Does Cosine Similarity Work? Image created by MarketBrew.ai, April 2025

Typical search engine comparisons include:

  1. Keywords with paragraphs,
  2. Groups of paragraphs with other paragraphs, and
  3. Groups of keywords with groups of paragraphs.

Next, search engines cluster these embeddings.

How Search Engines Cluster Embeddings

Search engines can organize content based on clusters of embeddings.

In the video below, we are going to illustrate why and how you can use embedding clusters, using Market Brew’s free AI Overviews Visualizer, to fix content alignment issues that may be preventing you from appearing in Google’s AI Overviews or even their regular search results!

Embedding clusters, or “semantic clouds”, form one of the most powerful ranking tools for search engineers today.

Semantic clouds are topic clusters in thousands of dimensions. The illustration above shows a 3D representation to simplify understanding.

Topic clusters are to entities as semantic clouds are to embeddings. Think of a semantic cloud as a topic cluster on steroids.

Search engineers use this like they do topic clusters.

When your content falls outside the top semantic cloud – what the AI deems most relevant – it is ignored, demoted, or excluded from AI Overviews (and even regular search results) entirely.

No matter how well-written or optimized your page might be in the traditional sense, it won’t surface if it doesn’t align with the right semantic cluster that the finely tuned AI system is seeking.

By using the AI Overviews Visualizer, you can finally see whether your content aligns with the dominant semantic cloud for a given query. If it doesn’t, the tool provides a realignment strategy to help you bridge that gap.

In a world where AI decides what gets shown, this level of visibility isn’t just helpful. It’s essential.

Free AI Overviews Visualizer: How To Fix Content Alignment

Step 1: Use The Visualizer

Input your URL into this AI Overviews Visualizer tool to see how search engines view your content using embeddings. The Cluster Analysis tab will display embedding clusters for your page and indicate whether your content aligns with the correct cluster.

MarketBrew.ai dashboard Screenshot from MarketBrew.ai, April 2025

Step 2: Read The Realignment Strategy

The tool provides a realignment strategy if needed. This provides a clear roadmap for adjusting your content to better align with the AI’s interpretation of relevance.

Example: If your page is semantically distant from the top embedding cluster, the realignment strategy will suggest changes, such as reworking your content or shifting focus.

Example: Embedding Cluster Analysis (Screenshot from MarketBrew.ai, April 2025)
Example of New Page Content Aligned with Target Embedding (Screenshot from MarketBrew.ai, April 2025)

Step 3: Test New Changes

Use the “Test New Content” feature to check how well your content now fits the AIO’s top embedding cluster. Iterative testing and refinement are recommended as AI Overviews evolve.

AI Overviews author (Screenshot by MarketBrew.ai, April 2025)

See Your Content Like A Search Engine & Tune It Like A Pro

You’ve just seen under the hood of modern SEO – embeddings, clusters, and AI Overviews. These aren’t abstract theories. They’re the same core systems that Google uses to determine what ranks.

Think of it like getting access to the Porsche service manual, not just the owner’s guide. Suddenly, you can stop guessing which tweaks matter and start making adjustments that actually move the needle.

At Market Brew, we’ve spent over two decades modeling these algorithms. Tools like the free AI Overviews Visualizer give you that mechanic’s-eye view of how search engines interpret your content.

And for teams that want to go further, a paid license unlocks Ranking Blueprints to help track and prioritize which AIO-based metrics most affect your rankings – like cosine similarity and top embedding clusters.

You have the manual now. The next move is yours.


Image Credits

Featured Image: Image by Market Brew. Used with permission.

In-Post Image: Images by Market Brew. Used with permission.

Google’s John Mueller: Updating XML Sitemap Dates Doesn’t Help SEO via @sejournal, @MattGSouthern

Google’s John Mueller clarifies that automatically changing XML sitemap dates doesn’t boost SEO and could make it harder for Google to find actual content updates.

The “Freshness Signal” Myth Busted

On Reddit’s r/SEO forum, someone asked if competitors ranked better by setting their XML sitemap dates to the current date to send a “freshness signal” to Google.

Mueller’s answer was clear:

“It’s usually a sign they have a broken sitemap generator setup. It has no positive effect. It’s just a lazy setup.”

The discussion shows a common frustration among SEO pros. The original poster was upset after following Google’s rules for 15 years, only to see competitors using “spam tactics” outrank established websites.

When asked about sites using questionable tactics yet still ranking well, Mueller explained that while some “sneaky things” might work briefly, updating sitemap dates isn’t one of them.

Mueller said:

“Setting today’s date in a sitemap file isn’t going to help anyone. It’s just lazy. It makes it harder for search engines to spot truly updated pages. This definitely isn’t working in their favor.”

XML Sitemaps: What Works

XML sitemaps help search engines understand your website structure and when content was last updated. While good sitemaps are essential for SEO, many people misunderstand the impact they have on rankings.

According to Google, the lastmod tag in XML sitemaps should show when a page was truly last updated. When used correctly, this helps search engines know which pages have new content that needs to be recrawled.

Mueller confirms that faking these dates doesn’t help your rankings and may prevent Google from finding your real content updates.
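
One way to keep lastmod honest is to derive it from the content’s real modification date rather than the build time. The sketch below is only an illustration: it assumes your pages are static HTML files in a hypothetical public/ folder, while a CMS-driven site would pull the date from its database instead.

from datetime import datetime, timezone
from pathlib import Path
from xml.sax.saxutils import escape

SITE = "https://example.com"
pages = Path("public").glob("**/*.html")  # hypothetical build output folder

entries = []
for page in pages:
    # Use the file's real modification time, not "now".
    lastmod = datetime.fromtimestamp(page.stat().st_mtime, tz=timezone.utc)
    loc = f"{SITE}/{page.relative_to('public').as_posix()}"
    entries.append(
        f"  <url>\n"
        f"    <loc>{escape(loc)}</loc>\n"
        f"    <lastmod>{lastmod.strftime('%Y-%m-%d')}</lastmod>\n"
        f"  </url>"
    )

sitemap = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
    + "\n".join(entries)
    + "\n</urlset>"
)
Path("public/sitemap.xml").write_text(sitemap, encoding="utf-8")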

What This Means for Your SEO

Mueller’s comments remind us that while some SEO tactics might seem to improve rankings, correlation isn’t causation.

Sites ranking well despite questionable methods are likely succeeding due to other factors, rather than manipulated sitemap dates.

For website owners and SEO professionals, the advice is:

  • Keep your XML sitemaps accurate
  • Only update lastmod dates when you change content
  • Focus on creating valuable content instead of technical shortcuts
  • Be patient with ethical SEO strategies – they provide lasting results

It can be frustrating to see competitors seemingly benefit from questionable tactics. However, Mueller suggests these advantages don’t last long and can backfire.

This exchange confirms that Google’s smart algorithms can recognize and eventually ignore artificial attempts to manipulate ranking signals.


Featured Image:  Keronn art/Shutterstock

Google’s Martin Splitt Explains How To Find & Remove Noindex Tags via @sejournal, @MattGSouthern

Google’s Search Relations team has released a new SEO Office Hours video with Martin Splitt.

He tackles a common problem many website owners face: unwanted noindex tags that keep pages out of search results.

In the video, Splitt helps a user named Balant who couldn’t remove a noindex tag from their website. Balant wanted their page to be public, but the tag prevented this.

Where Unwanted Noindex Tags Come From

Splitt listed several places where unwanted noindex tags might be hiding:

“Make sure that it’s not in the source code, it’s not coming from JavaScript, it’s not coming from a third-party JavaScript.”

Splitt pointed out that A/B testing tools often cause this problem. These tools sometimes add noindex tags to test versions of your pages without you realizing it.

CDN & Cache Problems

If you use a Content Delivery Network (CDN), Splitt warned that old cached versions might still have noindex tags even after you remove them from your site.

Splitt explained:

“If you had a noindex in and you’re using a CDN, it might be that the cache hasn’t updated yet.”

Check Your CMS Settings & Plugins

Splitt explained that your Content Management System (CMS) settings might be adding noindex tags without you knowing.

He said:

“If you’re using a CMS, there might be settings or plugins for SEO, and there might be something like ‘allow search engines to index this content’ or ‘to access this content,’ and you want to make sure that’s set.”

Splitt added that settings labeled as “disallow search engines” should be unchecked if you want your content to appear in search results.

See the full video:

Debugging Process for Persistent Noindex Issues

If you’re dealing with stubborn noindex problems, Splitt suggests checking these places in order:

  1. Check your HTML source code directly
  2. Look at JavaScript files that might add meta tags
  3. Review third-party scripts, especially testing tools
  4. Check if your CDN cache needs updating
  5. Look at your CMS settings and SEO plugins

What This Means For SEO Professionals

Google’s advice shows why thorough technical SEO checks are essential. Modern websites are complex with dynamic content and third-party tools, so finding technical SEO problems takes deeper digging.

SEO professionals should regularly crawl their sites with tools that process JavaScript. This practice provides a deeper understanding of how search engines interpret your pages, going beyond the basic HTML and revealing the true visibility of your content.

Google keeps covering these basic technical issues in its videos, suggesting that even well-designed websites often struggle with indexing problems.

If your pages aren’t showing up in search results, use Google’s URL Inspection tool in the Search Console. This shows you how Google sees your page and whether any noindex tags exist.
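
For a quick first pass before opening Search Console, a small script can check the raw HTML and the HTTP headers for noindex directives. This is only a rough sketch using the requests and BeautifulSoup libraries, and it only sees the source HTML, so a tag injected by JavaScript or an A/B testing tool (the cases Splitt describes) still needs a rendered check like URL Inspection. The URL is a placeholder.

import requests
from bs4 import BeautifulSoup

url = "https://example.com/some-page/"  # placeholder URL
response = requests.get(url, timeout=10)

# 1. Check the X-Robots-Tag HTTP header, which can also carry noindex.
header = response.headers.get("X-Robots-Tag", "")
if "noindex" in header.lower():
    print(f"noindex found in X-Robots-Tag header: {header}")

# 2. Check robots/googlebot meta tags in the raw (un-rendered) HTML.
soup = BeautifulSoup(response.text, "html.parser")
for meta in soup.find_all("meta", attrs={"name": ["robots", "googlebot"]}):
    content = meta.get("content", "")
    if "noindex" in content.lower():
        print(f"noindex found in meta tag: {meta}")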

Why Do Web Standards Matter? Google Explains SEO Benefits via @sejournal, @MattGSouthern

Google Search Relations team members recently shared insights about web standards on the Search Off the Record podcast.

Martin Splitt and Gary Illyes explained how these standards are created and why they matter for SEO. Their conversation reveals details about Google’s decisions that affect how we optimize websites.

Why Some Web Protocols Become Standards While Others Don’t

Google has formally standardized robots.txt through the Internet Engineering Task Force (IETF). However, they left the sitemap protocol as an informal standard.

This difference illustrates how Google determines which protocols require official standards.

Illyes explained during the podcast:

“With robots.txt, there was a benefit because we knew that different parsers tend to parse robots.txt files differently… With sitemap, it’s like ‘eh’… it’s a simple XML file, and there’s not that much that can go wrong with it.”

This statement from Illyes reveals Google’s priorities. Protocols that confuse platforms receive more attention than those that work well without formal standards.

The Benefits of Protocol Standardization for SEO

The standardization of robots.txt created several clear benefits for SEO:

  • Consistent implementation: Robots.txt files are now interpreted more consistently across search engines and crawlers.
  • Open-source resources: “It allowed us to open source our robots.txt parser and then people start building on it,” Illyes noted.
  • Easier to use: According to Illyes, standardization means “there’s less strain on site owners trying to figure out how to write the damned files.”

These benefits make technical SEO work more straightforward and more effective, especially for teams managing large websites.

Inside the Web Standards Process

The podcast also revealed how web standards are created.

Standards groups, such as the IETF, W3C, and WHATWG, work through open processes that often take years to complete. This slow pace ensures security, clear language, and broad compatibility.

Illyes explained:

“You have to show that the thing you are working on actually works. There’s tons of iteration going on and it makes the process very slow—but for a good reason.”

Both Google engineers emphasized that anyone can participate in these standards processes. This creates opportunities for SEO professionals to help shape the protocols they use on a daily basis.

Security Considerations in Web Standards

Standards also address important security concerns. When developing the robots.txt standard, Google included a 500-kilobyte limit specifically to prevent potential attacks.

Illyes explained:

“When I’m reading a draft, I would look at how I would exploit stuff that the standard is describing.”

This demonstrates how standards establish security boundaries that safeguard both websites and the tools that interact with them.
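
As a practical aside, that size boundary is easy to check for your own site. A quick sketch (the URL is a placeholder; Google’s documentation states it ignores anything beyond the first 500 KiB of a robots.txt file):

import requests

robots_url = "https://example.com/robots.txt"  # placeholder
body = requests.get(robots_url, timeout=10).content

limit = 500 * 1024  # Google ignores content past the first 500 KiB
print(f"{len(body)} bytes ({len(body) / 1024:.1f} KiB)")
if len(body) > limit:
    print("Warning: over the 500 KiB limit - rules past that point may be ignored.")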

Why This Matters

For SEO professionals, these insights indicate several practical strategies to consider:

  • Be precise when creating robots.txt directives, since Google has invested heavily in this protocol.
  • Use Google’s open-source robots.txt parser to check your work.
  • Know that sitemaps offer more flexibility with fewer parsing concerns.
  • Consider joining web standards groups if you want to help shape future protocols.

As search engines continue to prioritize technical quality, understanding the underlying principles behind web protocols becomes increasingly valuable for achieving SEO success.

This conversation shows that even simple technical specifications involve complex considerations around security, consistency, and ease of use, all factors that directly impact SEO performance.

Hear the full discussion in the video below:

AI Overviews: We Reverse-Engineered Them So You Don’t Have To [+ What You Need To Do Next]

This post was sponsored by DAC. The opinions expressed in this article are the sponsor’s own. Authors: Dan Lauer & Michael Goodman

Is the classic funnel model (TOFU-MOFU-BOFU) still relevant in an AI-driven SERP?

What kinds of queries trigger Google’s AI Overviews?

How can you structure content so that AI pulls your site into the response?

Do you really need to change your SEO strategy?

For years, SEO teams followed a familiar SEO playbook:

  1. Optimize upper-funnel content to capture awareness,
  2. mid-funnel content to drive consideration,
  3. lower-funnel content to convert.

One page, one keyword, one intent.

But with the rise of ChatGPT, Perplexity, Copilot, Gemini, and now Google’s AI Mode, that linear model is increasingly outdated.

So, how do you move forward and keep your visibility high in modern search engine results pages (SERPs)?

We’ve reverse-engineered AI Overviews, so you don’t have to. Let’s dive in.

What We’ve Discovered Through Reverse Engineering Google’s AI Overviews (AIO)

From what we’re seeing across client industries and in how AI-driven results behave, the traditional funnel model – the idea of users moving cleanly from awareness to consideration to conversion – feels increasingly out of step with how people actually search.

How Today’s Search Users Actually Search

Today’s users jump between channels, devices, and questions.

They skim, abandon, revisit, and decide faster than ever.

AI Overviews don’t follow a tidy funnel because most people don’t either.

They surface multiple types of information at once, not because it’s smarter SEO, but because it’s closer to how real decisions get made.

AIOs & AI Mode Aren’t Just Answering Queries – They’re Expanding Them

Traditionally, SEO strategy followed a structured framework. Take a travel-related topic, for example:

  • Informational (Upper-Funnel) – “How to plan a cruise?”
  • Commercial (Mid-Funnel) – “Best cruise lines for families”
  • Transactional (Lower-Funnel) – “Find Best Alaska Cruise Deals”

However, AI Overviews don’t stick to that structure.

Instead, they blend multiple layers of intent into a single, comprehensive response.

How AI Overviews Answer & Expand Search Queries

Let’s stay with the travel theme. A search for “Mediterranean cruise” might return an AI Overview that includes:

  • Best Time to go (Informational).
  • Booking Your Cruise (Commercial).
  • Cruise Lines (Navigational).

AI Mode Example for ‘Mediterranean Cruise’

What’s Happening Here?

In this case, Google isn’t just answering the query.

It anticipates what the user will want to know next, acting more like a digital concierge than a traditional search engine.

The AI Overview Test & Parameters

  • Source: Semrush & Google
  • Tested Data: 200 cruise-related informational queries

We started noticing this behavior showing up more often, so we wanted to see how common it actually is.

To get a clearer picture, we pulled 200 cruise-related informational queries from SEMrush and ran them through our custom-built AI SERP scraper. The goal was to see how often these queries triggered AI Overviews, and what kind of intent those Overviews covered.

The patterns were hard to miss:

  • 88% of those queries triggered an AI Overview
  • More than half didn’t just answer the initial question.
  • 52% mixed in other layers of intent, like brand suggestions, booking options, or comparisons, right alongside the basic information someone might’ve been looking for.

Using a different query related to Mediterranean Cruises, the AIO response acts as a travel agent, guiding the user on topics like:

  • How to fly,
  • Destinations with region,
  • Cruise prices,
  • Cruise lines that sail to that destination.

While it’s an informational, non-brand search query, the AIO response is lower-funnel as well.

Again, less than half of the queries returned a response matching only the original intent.

Here are some examples of queries that were identified as Informational and provided only the top-of-funnel response without driving the user further down the funnel.

The Verdict

Even when someone asks a simple, top-of-funnel question, AI is already steering them toward what to do next, whether that’s comparing prices, picking a provider, or booking a trip.

What Does This Mean for SEO Strategies Moving Forward?

If AI Overviews and AI Mode are blending intent types, then content and SEO strategies need to catch up:

  1. It’s no longer enough to rank for high-volume informational keywords. If your content doesn’t address multiple layers of intent, AI will fill the gaps with someone else’s content.
  2. SEO teams need to analyze how AI handles their most important queries. What related questions is it pulling in? Are those answers coming from your site or your competitors?
  3. Think beyond keyword volume. Long-tail queries may have lower search traffic, but they often align better with AI-cited content. Structure your pages with clear headings, bullets, and concise, helpful language—that’s what AI models prefer to surface.

The Future of SEO in an AI World: Hybrid Intent Optimization

The fundamentals of technical and on-page SEO still matter. But if your content is still built around single keywords and single intent types, you’re likely to lose visibility as AI continues to reshape the SERP.

The brands that adapt to this shift by creating content that mirrors the blended, fast-moving behavior of actual users are the ones that will continue to own key moments across the funnel, even as the funnel itself evolves.

As AI transforms search behavior, it’s crucial to adapt your SEO strategies accordingly. At DAC, we specialize in aligning your content with the latest search trends to enhance visibility and engagement. Reach out to us today to future-proof your strategy with our award-winning TotalSERP approach and stay ahead in the evolving digital landscape.

Optimize Your SEO For AI Search, Now: https://www.dacgroup.com/

Image Credits

Featured Image: Image by DAC. Used with permission.

In-Post Image: Images by DAC. Used with permission.

How To Identify Migration Issues Quickly Using AI via @sejournal, @makhyan

Site migration issues happen. You plan, create a staging site, and then when the site goes live, there’s bound to be something wrong.

Quality assurance gets thrust into overdrive the moment that migrations are complete.

You sift through thousands of pages, metadata, and more to fix any problems before someone else notices.

It’s a lot of work and time-consuming to feel confident that a site migration is complete without issues.

But, I’m going to show you how to identify migration issues quickly using Google Sheets and AI. You still have a lot to do (migration experts, rejoice!), but this script is going to help you:

  • Compare old and new ScreamingFrog crawls.
  • Identify immediate issues that you need to resolve.

SEOs have their own strategies and practices that they follow, and this script is going to allow you to QA migrations quickly based on your own requirements.

You can adapt the script below to make this work for you, whether you’re working on a small local business site or an enterprise.

Setting Everything Up With Screaming Frog And Google Sheets

I’m using Screaming Frog for this example because it makes it easy for me to export data for both sites.

We’re going to assume the following:

  1. Your first version is your live website, which we’ll call the Old Crawl.
  2. Your second version is your new site on a staging environment, which we’ll call New Crawl.

You’re going to create a Google Sheets with the following Sheets:

  • Overview.
  • Old Crawl.
  • New Crawl.

Once your Sheet is set up properly, run your ScreamingFrog scan using any settings that you like.

You’ll run the scan for your Old and New Crawl and then import the data into the Old Crawl and New Crawl tabs in your Sheet.

Your sheets will look something like this:

Screaming Frog export crawl results (Screenshot of Google Sheet, March 2025)

The New Crawl will look very similar.

Once you fill in both the New and Old Crawl sheets, you’ll need to populate your Overview sheet.

The table that you create in this sheet should contain the following columns:

  • Existing (old) URL.
  • New URL.
  • Status Code.
  • Indexability.
  • Title 1.
  • Meta Description 1.
  • H1-1.
  • H2-1.
  • Column 3.
  • Column 4.

Your Overview sheet will look something like this:

Migration QA Overview Sheet (Screenshot of Google Sheet, March 2025)

Once you have your sheets set up, it’s time to put your favorite AI to work to compare your data.

I used ChatGPT, but you can use any AI you like. I’m sure Claude, DeepSeek, or Gemini would do equally well as long as you use similar prompts.

Prompts To Create Your Google Sheets Data

You can fill in your Google Sheet formulas by hand if you’re a formula guru, but it’s easier to let AI do it for you since we’re making basic comparisons.

Remember, the Old Crawl is the live site, and the New Crawl is my staging site.

Now, go to your AI tool and prompt it with the following:

I need a Google Sheets formula that compares values between two sheets: "Old Crawl" and "New Crawl." The formula should:
Look up a value in column A of "Old Crawl" using the value in column A of the current sheet.
Look up a value in column A of "New Crawl" using the value in column B of the current sheet.
Find the corresponding column in both sheets by matching the column header in row 1 with the current column header.
If the values match, return "Pass".
If they don't match, return "Error (old<>new)" with the differing values shown.
Use TEXTJOIN("<>", TRUE, ...) to format the error message.
Ensure compatibility with Google Sheets by specifying explicit ranges instead of full-column references.

You can adjust these prompt points on your own.

For example, you can change “Old Crawl” to “Live Site,” but be sure that the sheet names match up properly.

ChatGPT generated code for me that looks something like this:

=IF(
INDEX('Old Crawl'!$A$1:$Z$1000, MATCH($A2, 'Old Crawl'!$A$1:$A$1000, 0), MATCH(C$1, 'Old Crawl'!$1:$1, 0)) =
INDEX('New Crawl'!$A$1:$Z$1000, MATCH($B2, 'New Crawl'!$A$1:$A$1000, 0), MATCH(C$1, 'New Crawl'!$1:$1, 0)),
"Pass",
"Error (" & TEXTJOIN("<>", TRUE,
IFERROR(INDEX('Old Crawl'!$A$1:$Z$1000, MATCH($A2, 'Old Crawl'!$A$1:$A$1000, 0), MATCH(C$1, 'Old Crawl'!$1:$1, 0)), ""),
IFERROR(INDEX('New Crawl'!$A$1:$Z$1000, MATCH($B2, 'New Crawl'!$A$1:$A$1000, 0), MATCH(C$1, 'New Crawl'!$1:$1, 0)), "")
) & ")"
)

You can use these basic formulas to start comparing rows by pasting the formula in row 2.

Adding the formula is as simple as double-clicking the field and pasting it in.

I know that you’ll want to make this a little more complex. You can do a lot of things with Google Sheets and formulas, so tweak things as needed.

Ideas For Expanding Your Migration Sheet

Your formulas will depend on the settings of your Screaming Frog crawl, but here are a few that I think will work well:

  • Create a function to compare all of the status codes between the Old Crawl and New Crawl to identify key issues that exist. For example, if a page has anything but a 200 code, you can highlight the issue to fix it quickly.
  • Add a formula to highlight metadata that is too long or short, so that you can add it to your task list for when the audit is over.
  • Create a function to monitor Response Time between both the Old and New Crawl so that you can identify any issues that the new crawl may have or report speed increases if switching to a new host or server.
  • Create another function to compare the URL structure of each URL. You might compare trailing slashes, structure and more.
  • Develop a new function for Inlinks to be sure that no internal links were lost in the migration. You can also check external links using the same concept.

Migrating a site is always tedious.

A lot of QA goes into the process, and while necessary, the concept above will make the process much easier.
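
If you would rather run the same comparison outside Google Sheets, the sketch below does a similar pass in Python with pandas. The file names and the redirect-map file are placeholders, and the columns are assumed to match Screaming Frog’s defaults (Address, Status Code, Title 1, and so on) plus an Old URL/New URL mapping you supply.

import pandas as pd

# Placeholder file names: export "Internal > HTML" from each Screaming Frog crawl.
old = pd.read_csv("old_crawl.csv").fillna("")
new = pd.read_csv("new_crawl.csv").fillna("")

# Hypothetical redirect map: one row per page, columns "Old URL" and "New URL".
mapping = pd.read_csv("redirect_map.csv")

checks = ["Status Code", "Indexability", "Title 1", "Meta Description 1", "H1-1"]

# Inner joins: URLs missing from either crawl simply drop out of the report.
merged = (
    mapping
    .merge(old, left_on="Old URL", right_on="Address")
    .merge(new, left_on="New URL", right_on="Address", suffixes=(" (old)", " (new)"))
)

for col in checks:
    old_col, new_col = f"{col} (old)", f"{col} (new)"
    merged[f"{col} check"] = merged.apply(
        lambda row: "Pass" if row[old_col] == row[new_col]
        else f"Error ({row[old_col]}<>{row[new_col]})",
        axis=1,
    )

report_cols = ["Old URL", "New URL"] + [f"{c} check" for c in checks]
merged[report_cols].to_csv("migration_qa.csv", index=False)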

You can also use AI to recommend further enhancements to your newly migrated site.

How would you improve this file or its functionality?



Featured Image: TarikVision/Shutterstock

Google’s Martin Splitt Reveals 3 JavaScript SEO Mistakes & Fixes via @sejournal, @MattGSouthern

Google’s Martin Splitt recently shared insights on how JavaScript mistakes can hurt a website’s search performance.

His talk comes as Google Search Advocate John Mueller also urges SEO pros to learn more about modern client-side technologies.

Mistake 1: Rendered HTML vs. Source HTML

During the SEO for Paws Conference, a live-streamed fundraiser by Anton Shulke, Splitt drew attention to a trend he’s noticing.

Many SEO professionals still focus on the website’s original source code even though Google uses the rendered HTML for indexing. Rendered HTML is what you see after JavaScript has finished running.

Splitt explains:

“A lot of people are still looking at view source. That is not what we use for indexing. We use the rendered HTML.”

This is important because JavaScript can change pages by removing or adding content. Understanding this can help explain some SEO issues.
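
One way to see that difference for yourself is to fetch a page twice: once as raw source and once through a headless browser that executes JavaScript. The sketch below uses the requests and Playwright libraries (both need installing separately, e.g. pip install requests playwright followed by playwright install chromium); the URL is a placeholder and the check is deliberately crude.

import requests
from playwright.sync_api import sync_playwright

url = "https://example.com/some-page/"  # placeholder

# "View source": the HTML exactly as the server sends it.
source_html = requests.get(url, timeout=10).text

# Rendered HTML: what the page looks like after JavaScript runs,
# which is closer to what Google uses for indexing.
with sync_playwright() as p:
    browser = p.chromium.launch()
    page = browser.new_page()
    page.goto(url, wait_until="networkidle")
    rendered_html = page.content()
    browser.close()

for label, html in [("source", source_html), ("rendered", rendered_html)]:
    print(f"{label}: {len(html)} characters, noindex present: {'noindex' in html.lower()}")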

Mistake 2: Error Pages Being Indexed

Splitt pointed out a common error with single-page applications and JavaScript-heavy sites: they often return a 200 OK status for error pages.

This happens because the server sends a 200 response before the JavaScript checks if the page exists.

Splitt explains:

“Instead of responding with 404, it just responds with 200 … always showing a page based on the JavaScript execution.”

When error pages get a 200 code, Google indexes them like normal pages, hurting your SEO.

Splitt advises checking server settings to handle errors properly, even when using client-side rendering.
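
A quick way to spot this soft-404 pattern is to request a URL that definitely should not exist and see which status code comes back. A minimal sketch, with a placeholder domain:

import uuid
import requests

# A random path that should not exist on the site.
url = f"https://example.com/{uuid.uuid4()}/"
response = requests.get(url, timeout=10, allow_redirects=True)

if response.status_code == 200:
    print(f"Soft 404 suspected: {url} returned 200 OK")
elif response.status_code == 404:
    print("Good: non-existent pages return 404")
else:
    print(f"Returned {response.status_code} - check how errors are handled")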

Mistake 3: Geolocation Request Issue

Another problem arises when sites ask users for location or other permissions.

Splitt says Googlebot will always refuse the request if a site relies on geolocation (or similar requests) without a backup plan.

Splitt explains:

“Googlebot does not say yes on that popup. It says no on all these requests … so if you request geolocation, Googlebot says no.”

The page can appear blank to Googlebot without alternative content, meaning nothing is indexed. This can turn into a grave SEO mistake.

How to Debug JavaScript for SEO

Splitt shared a few steps to help diagnose and fix JavaScript issues:

  1. Start with Search Console: Use the URL Inspection tool to view the rendered HTML.
  2. Check the Content: Verify if the expected content is there.
  3. Review HTTP Codes: Look at the status codes in the “More info” > “Resources” section.
  4. Use Developer Tools: Open your browser’s developer tools. Check the “initiator” column in the Network tab to see which JavaScript added specific content.

Splitt adds:

“The initiator is what loaded it. If it’s injected by JavaScript, you can see which part of the code did it.”

Following these steps can help you find the problem areas and work with your developers to fix them.

See Splitt’s full talk in the recording below:

A Shift in SEO Skills

Splitt’s advice fits with Mueller’s call for SEOs to broaden their skill set.

Mueller recently suggested that SEO professionals learn about client-side frameworks, responsive design, and AI tools.

Mueller stated:

“If you work in SEO, consider where your work currently fits in … if your focus was ‘SEO at server level,’ consider that the slice has shrunken.”

Modern JavaScript techniques create new challenges that old SEO methods cannot solve alone. Splitt’s real-world examples show why understanding these modern web practices is now critical.

What This Means For SEO Professionals

Both Google Advocates point to a clear trend: SEO now requires more technical skills. As companies look for professionals who can blend SEO and web development, the demand for these modern skills is growing.

To keep up, SEO pros should:

  • Learn How JavaScript Affects Indexing: Know the difference between source and rendered HTML.
  • Master Developer Tools: Use tools like Search Console and browser developer tools to spot issues.
  • Collaborate with Developers: Work together to build sites that serve users and search engines well.
  • Broaden Your Skillset: Add client-side techniques to your traditional SEO toolkit.

Looking Ahead

As the web evolves, so must the skills of SEO professionals. However, leveling up your knowledge doesn’t have to be intimidating.

This fresh look at JavaScript’s role in SEO shows that even simple changes can have a big impact.


Featured Image: BestForBest/Shutterstock

Google’s Mueller Predicts Uptick Of Hallucinated Links: Redirect Or Not? via @sejournal, @MattGSouthern

Website owners and SEO professionals are facing a new problem. AI content generation tools are creating fake URLs when referencing real websites.

This issue was discussed in a recent social media conversation between industry professionals.

Hallucinated Links Causing 404s

On Bluesky, digital marketer Dan Thornton pointed out a pattern of 404 errors from non-existent URLs generated by AI systems.

His question: Should these links be redirected to existing pages?

Thornton states:

“Investigated a number of 404s recorded on a client website.

And a significant amount were generated by an AI service, which appears to have just made up articles, and URLs, in citations. It isn’t even using the right URL structure 🤦‍♂️

Debating the value of redirects and any potential impact.”

Thornton adds:

“On one hand, mistakes by more obscure AI bots might not seem worth correcting for the sake of adding more redirects. On the other, if it’s a relatively small client with a high value for conversions, even a couple of lost sales due to the damage to the brand will be noticeable.”

Google’s Perspective

Predicting an increase in hallucinated links, Google Search Advocate John Mueller offers guidance that can help navigate this issue.

First, he recommends having a good 404 page in place, stating:

“A good 404 page could help explain the value of the site, and where to go for more information. You could also use the URL as a site-search query & show the results on the 404 page, to get people closer.”
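
As a rough illustration of that idea, here is a minimal 404 handler that turns the missing path into a site-search suggestion. Flask is used purely for brevity (a WordPress site would do the equivalent in its theme’s 404 template), and the ?s= format matches WordPress’s default site search.

from flask import Flask, request
from urllib.parse import quote_plus

app = Flask(__name__)

@app.errorhandler(404)
def not_found(error):
    # Turn the missing path's slug into a site-search query.
    slug = request.path.strip("/").split("/")[-1]
    query = quote_plus(slug.replace("-", " "))
    return (
        "<h1>Page not found</h1>"
        "<p>This URL doesn't exist; it may have been cited incorrectly by an AI tool.</p>"
        f'<p><a href="/?s={query}">Search the site for "{slug}"</a></p>',
        404,
    )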

Before investing in solutions, he recommends collecting data.

Mueller states:

“I wonder if this is going to be a more common thing? It’s tempting to extrapolate from one off [incidents], but perhaps it makes sense to collect some more data before spending too much on it.”

In a follow-up comment, Mueller predicted:

“My tea leaves say that for the next 6-12 months we’ll see a slight uptick of these hallucinated links being clicked, and then they’ll disappear as the consumer services adjust to better grounding on actual URLs.”

Don’t Hope For Accidental Clicks

Mueller provided a broader perspective, advising SEO professionals to avoid focusing on minor metrics.

He adds:

“I know some SEOs like to over-focus on tiny metrics, but I think sites will be better off focusing on a more stable state, rather than hoping for accidental by-clicks. Build more things that bring real value to the web, that attract & keep users coming back on their own.”

What This Means

As AI adoption grows, publishers may need to develop new strategies for mitigating hallucinations.

Ammon Johns, recognized as a pioneer in the SEO industry, offers a potential solution to consider.

In response to Thornton, he suggests:

“I think any new custom 404 page should include a note to anyone that arrived there from an AI prompt to explain hallucinations and how AI makes so many of them you’ve even updated your site to warn people. Always make your market smarter – education is the ultimate branding.”

It’s too early to recommend a specific strategy at this time.

Mueller advises monitoring these errors and their impact before making major changes.


Featured Image: Iljanaresvara Studio/Shutterstock

Google’s Mueller Cautions SEO Pros On Changing Business Needs via @sejournal, @MattGSouthern

John Mueller, a Google Search Advocate, suggests that SEO professionals should reconsider how their work fits into the modern web stack.

He references a “vibes-based” visualization highlighting how developers’ focus areas have shifted.

Mueller notes a disconnect between what industry pros pay attention to (such as JavaScript frameworks, performance optimizations, or new AI-driven tech) and what online businesses need.

However, he sees this as an opportunity for SEO professionals. He provides advice on staying relevant amid shifting business priorities.

Changing Business Priorities

Laurie Voss, VP of Developer Relations at Llama Index, shared a chart showing the areas of focus of software professionals from 1990 to 2025.

Screenshot from: Seldo.com, March 2025.

In the early days, developers were mainly concerned with hardware and networking. By the mid-2000s, the focus shifted to HTML, CSS, and server technologies. More recently, we’ve seen a move toward client frameworks, responsive design, and AI-powered development.

Although the data is subjective, Mueller highlights its value for SEOs. It shows how quickly areas like server-level work have become less critical for average web developers.

Mueller’s Take

Mueller’s point is straightforward: as web development changes, SEO must change, too. The skills that made you valuable five years ago might not be enough today.

Screenshot from: Seldo.com, March 2025.

Mueller says:

“If you work in SEO, consider where your work currently fits in with a graph like this. It’s not an objective graph based on data, but I think it’s worth thinking about how your work could profit from adding or shifting “tracks.””

He adds:

“What the average web developer thinks about isn’t necessarily what’s relevant for the “online business” (in whichever form you work). Looking at the graph, if your focus was “SEO at server level,” consider that the slice has shrunken quite a bit already.”

This matches Voss’s argument in the article “AI’s effects on programming jobs.”

Voss believes AI won’t kill development jobs but will create a new abstraction layer, changing how work is done. The same likely applies to SEO work.

What Should SEO Pros Focus On?

Reading between the lines of Mueller’s comment and the chart, several areas stand out for SEOs to develop:

  • Mobile performance skills
  • Working with AI tools
  • Understanding responsive design
  • Knowledge of client-side frameworks and how they affect SEO
  • Prompt engineering

In other words, step outside server-level optimizations and focus on client-side rendering and user experience elements.

Our Take At Search Engine Journal

Mueller’s advice hits home for us at SEJ. We’ve watched SEO evolve firsthand.

Not long ago, technical SEO mostly meant handling sitemaps, robots.txt files, and basic schema markup. Now, we’re writing about JavaScript rendering, Core Web Vitals, and AI content evaluation.

The most successful industry pros are those who expand their technical knowledge rather than stick to outdated practices. Those who understand traditional optimization and new web technologies will continue to thrive as our industry changes.

Mueller’s reminder to adapt isn’t just sound advice; it’s essential for staying relevant in search.


Featured Image: B Desain28/Shutterstock