Google Says It Can Handle Multiple URLs To The Same Content via @sejournal, @martinibuster

Google’s John Mueller answered a question about duplicate URLs appearing after a site structure change. His response offers clarity about how Google handles duplicate content and what actually influences indexing and ranking decisions.

Concern About Duplicate URLs And Ranking Impact

A site owner had changed the URL structure of their web pages then later discovered that older versions of those URLs were still accessible and appearing in Google Search Console.

The person asking the question on Reddit was concerned that requesting recrawls of the older URLs might confuse Google or lead to ranking issues.

They asked:

“I switched over themes a while back and did some redesign and at some point …I changed all my recipes urls by taking the /recipe/ part out of site.com/recipe/actualrecipe so it’s now just site.com/actualrecipe but there are urls that still work when you put the /recipe/ back in the url.

I went to GSC and panicked that a bunch of my recipes weren’t indexed due to a 5xx error (I think it was when my site was down for a few days).

Now I’ve requested a bunch of them already to be recrawled, but realizing maybe google was ignoring them for a reason, like it didn’t want the duplicates.

Are my recrawl requests for /recipe/ urls going to confuse google who might penalize my ranking for the duplicates?”

The question reflects a reasonable concern that duplicate URLs and content might negatively affect rankings, especially when the error is surfaced through Search Console's indexing reports.

Google Is Able To Handle Duplicate URLs

Google’s John Mueller answered the question by explaining that multiple URLs pointing to the same content do not trigger a penalty or loss of search visibility. He also noted that this kind of duplication is common across the web, implying that Google’s systems are experienced with handling this kind of problem.

He explained:

“It’s fine, but you’re making it harder on yourself (Google will pick one to keep, but you might have preferences).

There’s no penalty or ranking demotion if you have multiple URLs going to the same content, almost all sites have it in variations. A lot of technical SEO is basically search-engine whispering, being consistent with hints, and monitoring to see that they get picked up.”

What Mueller is referring to is Google’s ability to canonicalize a single URL as the one that’s representative of the various similar URLs. As Mueller said, multiple URLs for essentially the same content is a frequent issue on the web.

Google’s documentation lists five reasons duplicate content happens:

  1. “Region variants: for example, a piece of content for the USA and the UK, accessible from different URLs, but essentially the same content in the same language
  2. Device variants: for example, a page with both a mobile and a desktop version
  3. Protocol variants: for example, the HTTP and HTTPS versions of a site
  4. Site functions: for example, the results of sorting and filtering functions of a category page
  5. Accidental variants: for example, the demo version of the site is accidentally left accessible to crawlers”

The point is that duplicate content happens often on the web and is something that Google is able to handle.

Technical SEO Signals

Mueller said Google will pick one URL to keep, but added that the site owner might have preferences. That means Google will canonicalize the duplicates on its own, but the site owner or SEO can still signal which URL is the best choice (the canonical one) for ranking in the search results.

That is where technical SEO comes in. Internal linking, redirects, the proper use of rel="canonical", sitemap consistency, and consistency in 301 redirects all work as hints that help Google identify the version you actually want indexed.
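To make "being consistent with hints" concrete, here is a minimal monitoring sketch in Python (using the requests and BeautifulSoup libraries) that checks whether each legacy URL either 301-redirects to the preferred URL or declares it via rel="canonical". The URLs are hypothetical placeholders, and this illustrates the kind of monitoring Mueller describes rather than any Google-provided tool.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

# Hypothetical example: legacy /recipe/ URLs mapped to their preferred versions.
url_pairs = [
    ("https://example.com/recipe/apple-pie", "https://example.com/apple-pie"),
    ("https://example.com/recipe/banana-bread", "https://example.com/banana-bread"),
]

for old_url, preferred_url in url_pairs:
    # Don't follow redirects, so we can see what the old URL itself returns.
    response = requests.get(old_url, allow_redirects=False, timeout=10)

    if response.status_code in (301, 308):
        target = urljoin(old_url, response.headers.get("Location", ""))
        status = "OK" if target == preferred_url else "MIXED SIGNAL"
        print(f"{old_url}: 301 -> {target} ({status})")
    elif response.status_code == 200:
        soup = BeautifulSoup(response.text, "html.parser")
        canonical_tag = soup.find("link", rel="canonical")
        canonical_href = canonical_tag.get("href") if canonical_tag else None
        status = "OK" if canonical_href == preferred_url else "MIXED SIGNAL"
        print(f"{old_url}: 200, canonical={canonical_href} ({status})")
    else:
        print(f"{old_url}: unexpected status {response.status_code}")
```

Any URL flagged as a mixed signal is a place where the hints contradict each other, which is exactly the situation the next section describes.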

The Real Problem Is Mixed Signals

Mueller's remark about making it harder on yourself referred to the site owner/SEO spending time requesting recrawls of URLs that Google will eventually sort out on its own. But he also referenced preferences, which alludes to the signals mentioned above, in particular rel="canonical".

Technical SEO Is Often About Reinforcing Preferences

Mueller’s description of technical SEO as “search-engine whispering” is useful because it captures how much of SEO involves reinforcing your preferences for what URLs are crawled, which content is chosen to rank, and indicating which pages of a website are the most important. Google may still choose a canonical on its own, but consistent signals increase the chance that it chooses the version the site owner wants.

That makes this a good example of what SEO is all about: Making it easy for Google to crawl, index, and understand the content. That’s really the essence of SEO. It is about being clear and consistent in the content, URLs, internal linking, overall site navigation, and even in showing the cleanest HTML, including semantic HTML (which makes it easier for Google to annotate a web page).

Semantic HTML can be used to clearly identify the main content of a web page. It can directly help Google zero in on what’s called the Centerpiece content, which is likely used for Google’s Centerpiece Annotation. The centerpiece annotation is a summary of the main topic of the web page.
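As an illustration, here is a minimal sketch (Python with BeautifulSoup; the HTML is a made-up fragment) of how semantic elements such as `<main>` and `<article>` let a machine isolate the primary content without guesswork, which is the practical benefit the paragraph above describes.

```python
from bs4 import BeautifulSoup

# Hypothetical page: semantic HTML makes the primary content easy to isolate.
html = """
<html>
  <body>
    <nav>Home | Recipes | About</nav>
    <main>
      <article>
        <h1>Apple Pie</h1>
        <p>A simple recipe for a classic apple pie.</p>
      </article>
    </main>
    <footer>Copyright notice and boilerplate links.</footer>
  </body>
</html>
"""

soup = BeautifulSoup(html, "html.parser")

# Prefer <main>, fall back to <article>, then the whole body.
main_content = soup.find("main") or soup.find("article") or soup.body
print(main_content.get_text(" ", strip=True))
# -> "Apple Pie A simple recipe for a classic apple pie."
```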

Google’s canonicalization documentation explains:

“When Google indexes a page, it determines the primary content (or centerpiece) of each page. If Google finds multiple pages that seem to be the same or the primary content very similar, it chooses the page that, based on the factors (or signals) the indexing process collected, is objectively the most complete and useful for search users, and marks it as canonical. The canonical page will be crawled most regularly; duplicates are crawled less frequently in order to reduce the crawling load on sites.”

Technical SEO And Being Consistent

Stepping back to take a forest-level view, duplicate URLs are really about a website not being consistent. Consistency is not often seen as an SEO concern, but on a general level it is. Every time I have created a new website, I have had a plan for keeping it consistent, from the URLs to the topics, and for expanding it in a consistent manner as the site grows to cover more topics.

Takeaways

  • Multiple URLs to the same content do not cause a penalty or ranking demotion
  • Google will usually pick one version to keep
  • Site owners can influence that choice through consistent technical signals
  • The real issue is mixed signals, not duplicate content itself
  • Technical SEO often comes down to reinforcing clear preferences and monitoring whether Google picks them up
  • The forest-level view of SEO can be seen as being consistent

Featured Image by Shutterstock/Andrey_Kuzmin

Pichai Says AI Could ‘Break Pretty Much All Software’ via @sejournal, @MattGSouthern

Google CEO Sundar Pichai said AI models could break widely used software and that prices for black-market zero-day exploits may be falling. The comments came during a conversation on the Cheeky Pint podcast with Stripe CEO Patrick Collison.

What Pichai Said

The discussion touched on constraints facing AI infrastructure buildout when Pichai turned to security as a less visible risk.

Pichai said:

“These models are definitely like really going to break pretty much all software out there. Maybe already we don’t know as we sit here and speak.”

Elad Gil mentioned hearing that black market zero-day prices were falling because AI was increasing the supply of discoverable vulnerabilities. Pichai said he was “not at all surprised,” though neither cited specific pricing data.

Pichai framed security threats as a hidden constraint on AI deployment, alongside memory supply and energy. He said the situation would require “more coordination, which is not happening today” and predicted a potential “sharp moment” ahead.

“I don’t think you can wish them away,” Pichai said.

What The Data Shows

Google’s Threat Intelligence Group tracked 90 zero-day exploits used in attacks during 2025, up from 78 in 2024. Nearly half targeted enterprise software, an all-time high.

The GTIG report predicted that AI would “accelerate the ongoing race between attackers and defenders” in 2026. It said adversaries are likely to use AI to accelerate reconnaissance, vulnerability discovery, and exploit development.

While Pichai and Gil described falling black-market prices, industry reporting on the separate commercial-exploit market has shown prices holding or rising in some categories as vendors harden their products.

Why This Matters

Every website runs on software with potential vulnerabilities. WordPress plugins, server configurations, third-party scripts, and authentication systems are all part of the attack surface that AI-assisted exploit discovery could target faster.

If AI is accelerating the pace at which vulnerabilities are found and weaponized, the window between a flaw existing and an attacker using it gets shorter. That puts more pressure on site owners to keep patches current and audit their dependencies.

Google’s threat data shows exploit volume rising and AI accelerating discovery, even if the pricing claim lacks specifics.

Looking Ahead

Pichai’s comments were conversational, not a formal Google policy statement. But they came from someone who oversees both the company’s AI models and its threat intelligence operation.

The gap between AI capability and security readiness is a theme Google’s threat researchers have been documenting with increasing urgency. The GTIG report expects AI to speed both offense and defense going forward.


Featured Image: FotoField/Shutterstock

Google Explains Why It Doesn’t Matter That Websites Are Getting Larger via @sejournal, @martinibuster

A recent podcast by Google called attention to the fact that websites are getting larger than ever before. Google's Gary Illyes and Martin Splitt explained that the assumption that "larger" websites are a bad thing is not necessarily true. The takeaway for publishers and SEOs is that page weight is not a trustworthy metric because the cause of the "excess" weight might very well be something useful.

Page Size Depends On What's Being Measured

Google’s Martin Splitt explained that what many people think of as page size depends on what is being measured.

  • Is it measured by just the HTML?
  • Or are you talking about total page size, including images, CSS, and JavaScript?

It’s an important distinction. For example, many SEOs were freaked out when they heard that Googlebot was limiting their page crawl to just 2 megabytes of HTML per page. To put that into perspective, two megabytes of HTML equals about two million characters (letters, numbers, and symbols). That’s the equivalent of one HTML page with the same number of letters as two Harry Potter books.

But when you include CSS, images, and JavaScript along with the HTML, now we’re having a different conversation that’s related to page speed for users, not for the Googlebot crawler.
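To make the distinction concrete, here is a rough sketch, in Python with the requests and BeautifulSoup libraries, that measures the HTML alone and then the HTML plus the images, CSS, and scripts it references. The URL is a placeholder, and the sketch is deliberately simplified: it does not execute JavaScript, fetch fonts, or account for compression.

```python
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

page_url = "https://example.com/"  # placeholder URL

html = requests.get(page_url, timeout=10).text
html_bytes = len(html.encode("utf-8"))

# Collect the subresources the page references directly in its HTML.
soup = BeautifulSoup(html, "html.parser")
resource_urls = set()
for img in soup.find_all("img", src=True):
    resource_urls.add(urljoin(page_url, img["src"]))
for script in soup.find_all("script", src=True):
    resource_urls.add(urljoin(page_url, script["src"]))
for stylesheet in soup.find_all("link", rel="stylesheet", href=True):
    resource_urls.add(urljoin(page_url, stylesheet["href"]))

total_bytes = html_bytes
for url in resource_urls:
    try:
        total_bytes += len(requests.get(url, timeout=10).content)
    except requests.RequestException:
        pass  # skip resources that fail to load

print(f"HTML only:        {html_bytes / 1024:.1f} KB")
print(f"HTML + resources: {total_bytes / 1024:.1f} KB")
```

The two numbers answer different questions: the first relates to what a crawler parses, the second to what a user's browser has to download and render.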

Martin discussed an article on HTTPArchive’s Web Almanac, which is a roundup of website trends. The article appeared to be mixing up different kinds of page weight, and that makes it confusing because there are at least two versions of page weight.

He noted:

“See that’s where I’m not so clear about their definition of page weight.

…they have a paragraph where they are trying to like explain what they mean by page weight. …I don’t understand the differences in what these things are. So they say page weight (also called page size) is the total volume of data measured in kilobytes or megabytes that a user must download to view a specific page. In my book that includes images and whatnot because I have to download that to see.

And that’s why I was surprised to hear that in 2015 that was 845 kilobytes. That to me was surprising. …Because I would have assumed that with images it would be more than 800 kilobytes.

… In July 2025, the same median page is now 2.3 megabytes.”

Data Gets Compressed

But that is only one way to understand page size. Another way to consider page size is by focusing on what is transferred over the network, which can be smaller due to compression. Compression is applied on the server side to reduce the size of the file that is sent to, and downloaded by, the browser. Many servers use a compression algorithm called Brotli.
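As a rough illustration of the gap between bytes on disk and bytes over the wire, here is a small Python sketch using the brotli package. The payload is deliberately repetitive, so it compresses far better than a typical page would; the point is only that the transfer size and the decompressed size are two different numbers.

```python
import brotli

# Illustrative payload: repetitive HTML compresses extremely well; real pages vary.
html = ("<div class='product'><h2>Item</h2><p>Description text</p></div>" * 20000).encode("utf-8")

# Quality 4-6 is a common server-side setting; the library's default is 11.
compressed = brotli.compress(html, quality=5)

print(f"Uncompressed (what the browser decompresses and renders): {len(html) / 1_048_576:.2f} MB")
print(f"Over the wire (what actually transfers):                  {len(compressed) / 1_048_576:.2f} MB")
```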

Martin Splitt explains:

“I ask this question publicly that different people had very different notions of how they understood page size. Depending on the layer you are looking at, it gets confusing as well because there’s also compression.

…So some people are like, ah, but this website downloads 10 megabytes onto my disk.

And I’m like, yes. …but maybe if you look at what actually goes over the wire, you might find that this is five or six megabytes, not the whole 10 megabytes. Because you can compress things on the network level and then you decompress them on the client side level…”

Technically, the page size in Martin’s example is actually five or six megabytes because of compression, and it’s able to download faster. But on the user’s side, that five or six megabytes gets decompressed, and it turns back into ten megabytes, which occupies that much space on a user’s phone, desktop, or wherever.

And that introduces an ambiguity. Is your web page ten megabytes or five megabytes?

That illustrates a wider problem: different people are talking about different things when they talk about page size.

Even widely used definitions don’t fully resolve the ambiguity. Page weight is described as “the total volume of data measured in kilobytes or megabytes that a user must download,” but as the discussion makes clear, there is no one clear definition.

Martin asserts:

“When you ask people what they think, if this is big or not, you start getting very different answers depending on how they think about page size. And there is no one true definition of it.”

What About Ratio Of Markup To Content?

One of the most interesting distinctions made in the podcast is that a large page is not necessarily inefficient. For example, a 15 MB HTML document is considered acceptable because “pretty much most of these 15 megabytes are actually useful content.” The size reflects the value being delivered.

By contrast, what if the ratio of content to markup were the other way around, with only a little content while the overwhelming majority of the page weight is markup?

Martin discussed the ratio example:

“…what if the markup is the only overhead? And I mean like what do you mean? It’s like, well, you know, if it’s like five megabytes but it’s only very little content, is that bad? Is that worse as in this case, the 15 megabytes.

And I’m like, that’s tricky because then we come into this weird territory of the ratio between content and markup. Yeah.

And I said, well, but what if a lot of it is markup that is metadata for some third party tool or for some service or for regulatory reasons or licensing reasons or whatever. Then that’s useful content, but not necessarily for the end user, but you still kind of have to have it.

It would be weird to say that that is worse than the page where the weight is mostly content.”

What Martin is doing here is shifting the idea of page weight away from raw size toward what the data actually represents.

Why Pages Include Data Users Never See

A major contributor to page weight is content that users never see.

Gary Illyes points to structured data as an example of content that is specifically meant for machines and not for users. While it can be useful for search engines, it also adds to the overall size of the page. If a publisher adds a lot of structured data to their page in order to take advantage of all the different options that are available, that’s going to add to the page size even though the user will never see it.
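As a simple illustration of how machine-facing data adds weight, here is a sketch (Python with BeautifulSoup, using a made-up page fragment) that measures how many bytes of a page are JSON-LD structured data the user never sees.

```python
from bs4 import BeautifulSoup

# Hypothetical page fragment: visible content plus machine-facing JSON-LD.
html = """
<html><body>
  <article><h1>Apple Pie</h1><p>A short recipe intro for readers.</p></article>
  <script type="application/ld+json">
  {"@context": "https://schema.org", "@type": "Recipe", "name": "Apple Pie",
   "recipeIngredient": ["apples", "flour", "butter", "sugar"],
   "aggregateRating": {"@type": "AggregateRating", "ratingValue": "4.8", "ratingCount": "231"}}
  </script>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
total_bytes = len(html.encode("utf-8"))

# Bytes taken up by structured data blocks that are never rendered for the user.
jsonld_bytes = sum(
    len(tag.get_text().encode("utf-8"))
    for tag in soup.find_all("script", type="application/ld+json")
)

print(f"Structured data: {jsonld_bytes} of {total_bytes} bytes "
      f"({jsonld_bytes / total_bytes:.0%} of this fragment)")
```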

This calls attention to a structural reality of the web: pages are not just built for human readers. They are also built for search engines, tools, AI agents, and other systems, all of which add their own requirements to the weight of a web page.

When Overhead Is Justified

Not all non-user-facing content is unnecessary.

Martin talked about how markup may include "metadata" that serves a third-party tool, a service, or a regulatory or licensing purpose, creating a kind of gray area. Even if the additional data does not improve the user experience directly, it does serve a purpose, including helping the user find the page through a search engine.

The point Martin was getting at is that these considerations complicate attempts to label page weight as good below some threshold and bad above it.

Why Separating Content and Metadata Doesn’t Work

One possible solution that Gary Illyes discussed is separating human-facing content from machine-facing data. While Gary didn’t specifically mention the LLMs.txt proposal, what he discussed kind of resembles it in that it serves content to a machine minus all the other overhead that goes with the user-facing content.

What he actually discussed was a way to separate all of the machine-facing data from what the user will download, thus, in theory, making the user’s version of a web page smaller.

Gary quickly dismisses that idea as “utopic” because there will always be hordes of spammers who will find a way to take advantage of that.

He explained:

“But then unfortunately this is an utopic thing. Because not everyone on the internet is playing nice.

We know how much spam we have to deal with. On our blog we say somewhere that we catch like 40 billion URLs per day that’s spam or some insane number, I don’t remember exactly, but it’s some insane number and definitely billions. That will just exacerbate the amount of spam that search engines receive and other machines receive maybe like I would bet $1 and 5 cents that will actually increase the amount of spam that search engines and LLMs and others ingest.”

Gary also said that Google's experience is that, historically, when you maintain separate kinds of content, there will always be differences between the two. He used the example of when websites had separate mobile and desktop pages, where the two versions were generally different. That in turn caused issues for search and for usability, when a search engine ranked a page for content found on one version and then sent the user to a different version where that content did not exist.

Although he didn’t explicitly mention it, that explanation of Google’s experience may shed more light on why Google will not adopt LLMs.txt.

As a result, search engines have largely settled on a single-document model, even if it is inefficient.

Website Size vs. Page Size Is The Real Question

The discussion ultimately challenges the original concept of the problem, that heavy web pages are bad.

Gary observes:

“The first question is, are websites getting fat? I think this question is not even meaningful.

Because it does not matter in the context of a website if it’s fat. In the context of a single page, yes.

But in the context of a website, it really doesn’t matter.”

So now Gary and Martin change the focus to web pages that are getting heavier, a more meaningful way to look at the issue of how web pages and websites are evolving.

This moves the discussion from an abstract idea to something more measurable and actionable.

Heavier Pages Still Carry Real Costs

Even with faster connections and better infrastructure, larger pages still have consequences, and lighter pages still have measurable benefits.

Martin explains:

“I think we are wasting a lot of resources. And I mean we, we had that in another episode where we said that we know that there are studies that show that websites that are faster have better retention and better conversion rates. Yeah. And speed is in part also based on size. Because the more data I ship, the longer it takes for the network to actually transfer that data and the longer it takes for the processor of whatever device you’re on to actually process it and display it to you.”

From a broader perspective, the issue is not just performance but efficiency. As Splitt puts it, “we are wasting a lot of resources.”

The web may be getting heavier, but the more important takeaway is why. Pages are carrying more than just user-facing content, and that design choice shapes both their size and their impact.

Featured Image by Shutterstock/May_Chanikran

Google’s Mueller On SEO Gurus Who Are “Clueless Imposters” via @sejournal, @martinibuster

A search marketing professional from India wrote a blog post about how she feels about seeing the word guru used within the SEO community in a way that’s different from its meaning in India. Several people, including Google’s John Mueller, agreed with her and shared how they felt when people self-identify as SEO gurus.

The Word Guru Is Misused

Preeti Gupta wrote a blog post titled, I don’t like how the word ‘Guru’ is misused in the SEO industry, in which she shared what the word guru actually means and how it’s misused in the SEO industry in a way that trivializes a word that in India holds special meaning.

She wrote that in India the word guru has a deep meaning and that they hold great respect for actual gurus.

Her blog post shared a Sanskrit mantra about it:

“The Guru is like Brahma (the creator). They create the desire for knowledge.
The Guru is like Vishnu (The preserver). They help the student keep and use the knowledge.
The Guru is like Maheshwara (Shiva, the Destroyer). They destroy ignorance and bad habits.
The Guru is the supreme reality itself, standing right before your eyes.
I bow and offer my respects to that great teacher.”

She then contrasted that profound meaning of the word guru with the trivialization of it within the context of self-described SEO gurus, who she regards as shady types who engage in unethical SEO practices. She said that it’s not her intention to tell people what words to use, but she did express the hope that people would use the word in the right context.

The phrase SEO guru is used in both contexts, as a derogatory phrase to paint someone as a false leader with naïve followers and also as someone who is highly regarded. However, I think an argument can be made that using that phrase for oneself is immodest, self-aggrandizing, and simply isn’t a good look.

AlexHarford-TechSEO responded to her post on Bluesky:

“It puts me off when I see an SEO self-describe themselves as a “Guru.” I’ve never come across anyone who does so who is a good and ethical SEO.

A lot of words are losing meaning in today’s world, though there can’t be many that were as special to you as Guru.”

Words are always in a state of change, and the way people speak not only changes from region to region but also from decade to decade. The meaning of words does change, especially when they jump continents and languages.

Self-Declared SEO Gurus

It was at this point that John Mueller responded to share what he thinks about self-described SEO gurus:

“To me, when someone self-declares themselves as an SEO guru, it’s an extremely obvious sign that they’re a clueless imposter. SEO is not belief-based, nobody knows everything, and it changes over time. You have to acknowledge that you were wrong at times, learn, and practice more.”

Mueller is right that nobody knows everything and that SEO changes over time, and for a long time many SEOs didn’t keep up with how Google ranks websites. The industry has largely shed that naivete, and yet nobody really agrees on what to do to rank better in search engines and AI search.

SEO Is A Belief System

I know there are some SEOs who firmly believe that SEO is a set of universally agreed-upon practices and that that is all there is, unaware that the history of SEO is one of constant change. How SEO is practiced today is quite different from how it was practiced eight years ago. There is no set of practices to be agreed on except Google's best practices, which are less "do this and you will rank better" and more "do this and you may have a chance to rank better."

So yes, to a certain extent, SEO is a belief system and will continue to be one so long as Google's search ranking algorithms remain a black box: people can see what goes in and what comes out, but not what happens in the middle. That part remains a mystery. When you don't know for sure that what you do will guarantee better rankings, the only thing left is to believe, hope, and even have faith that the rankings will happen. Faith, after all, is belief in something without definitive proof. You don't need faith to believe in a fact, right?

And that last part, the mystery of what happens in the black box, is why nobody can really call themselves a guru in the sense of being all-knowing. Nobody outside of Google knows everything that’s going on within that part in the middle where the rankings “magic” happens.

Given all that, who can truly call themselves a guru in SEO?

Featured Image by Shutterstock/funstarts33

Trust In AI Search Could Drop With Ads, Survey Shows via @sejournal, @MattGSouthern
  • A new Ipsos survey adds consumer sentiment data to the growing debate over ads in AI search.
  • Most US adults say ads in AI search results would reduce their trust in those results.
  • Early advertiser data from ChatGPT’s ad pilot offers limited context.

An Ipsos survey of U.S. adults found 63% say ads in AI search results would reduce trust. Early advertiser data offers limited, mixed signals.

ChatGPT Search Is Citing Fewer Sites, Data Shows via @sejournal, @MattGSouthern

Resoneo says ChatGPT responses began referencing about 20% fewer websites after what it identifies as the early-March transition to GPT-5.3 Instant.

The analysis comes from the French SEO consultancy and draws on data from Meteoria, an AI visibility-tracking platform that monitored 400 prompts daily over 14 weeks, producing 27,000 comparable responses.

Average unique domains per response dropped from 19 before the transition to 15 after. Average unique URLs per response fell from 24 to 19.

The URLs-per-domain ratio remained essentially unchanged throughout the tracking period. The data suggests ChatGPT isn't visiting as many sites per response, but it's going just as deep into each one.

Fewer domains now share the same citation surface in each response, meaning the sites that do get cited take up a larger share of each answer.
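For context on what these metrics mean, here is a minimal sketch of how unique domains per response and the URLs-per-domain ratio are computed from a list of cited URLs. The data is hypothetical and this is not Resoneo's or Meteoria's actual pipeline; it only illustrates the arithmetic behind the figures.

```python
from urllib.parse import urlparse

# Hypothetical citations from a single ChatGPT response; real tracking would
# aggregate this across thousands of responses.
cited_urls = [
    "https://example-recipes.com/apple-pie",
    "https://example-recipes.com/crust-guide",
    "https://cooking-blog.example.org/pie-tips",
    "https://news.example.net/baking-trends",
]

unique_urls = set(cited_urls)
unique_domains = {urlparse(url).netloc for url in unique_urls}
urls_per_domain = len(unique_urls) / len(unique_domains)

print(f"Unique URLs cited:    {len(unique_urls)}")
print(f"Unique domains cited: {len(unique_domains)}")
print(f"URLs per domain:      {urls_per_domain:.2f}")
```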

Server Logs Back Up The Pattern

Independent log analysis from Jérôme Salomon at Oncrawl supports the findings. Tracking ChatGPT-User bot activity across multiple websites, his data shows crawl volume has settled at a lower level. Some pages aren’t being crawled at all anymore, and the crawl frequency for pages still being visited has dropped.

Resoneo links the change to ChatGPT’s default experience now being driven more heavily by GPT-5.3 Instant, which the company says triggers fewer web searches and citations than earlier behavior. Oncrawl’s server log data shows the lower crawl pattern over the same period.

Earlier data has shown how AI platforms cite sources differently from traditional search. An SE Ranking analysis of 129,000 domains found that referring domains were the strongest predictor of the likelihood of ChatGPT citation, with a threshold effect at 32,000 referring domains.

A Search Atlas report similarly showed low overlap between Google rankings and ChatGPT citations, with median domain overlap around 10-15%.

Why This Matters

A 20% drop in cited domains per response means fewer websites competing for visibility inside each ChatGPT answer. The total citation surface shrank, but the sites that kept appearing maintained their same crawl depth.

For anyone tracking referral traffic from ChatGPT, the early-March model transition is a date range worth checking in your analytics.

Looking Ahead

Resoneo’s analysis notes that GPT-5.4 Thinking reintroduces search fan-outs and uses site: operators to target trusted domains, but these behaviors weren’t captured in the quantitative dataset, which covers GPT-5.3 Instant and below.

Whether the citation surface continues to narrow or widens again with newer models isn’t yet clear.

WordPress’s Troubled Real-Time Collaboration Feature via @sejournal, @martinibuster

WordPress delayed the release of the highly anticipated version 7.0 of the CMS because the real-time collaboration (RTC) feature was not yet stable. The delay has caused some to question whether the feature is necessary in the core, while others say that the delay is a symptom of deeper issues within WordPress itself.

Real-Time Collaboration (RTC)

The Gutenberg project has been on a four-phase development track: Gutenberg block editor (phase 1), Full Site Editing (phase 2), Collaboration (phase 3), and multilingual capabilities within core (phase 4).

WordPress 7.0, initially due to be released on April 9, was supposed to roll out phase 3, along with other important features that make it easier to use AI within WordPress.

RTC enables multiple users to simultaneously edit content and block-based design within the block editor, a functionality that will be useful to publishers and agencies.

RTC Has Been Tested

The commercial side of WordPress, Automattic’s WordPress.com, has made RTC available to beta testers since October 2025. These beta testers are enterprise-level customers of WordPress VIP. WordPress’s documentation states that RTC works best with native WordPress blocks and implies that the feature could be buggy with blocks that don’t strictly abide by best practices.

A post on the official WordPress.org site provides this information about RTC performance:

“The most consistent feedback: real-time collaboration works seamlessly when sites are built for modern WordPress. Organizations using the block editor with native WordPress blocks and custom blocks developed using best practices reported smooth experiences with minimal issues.

One technical lead at a major research institution noted their team has invested in a deep understanding of Gutenberg and, as a result, “…have not run into any issues.”

…Multiple teams tested the limits by:

Adding dozens of blocks simultaneously.

Copying large amounts of existing content in parallel.

Having entire teams edit the same post together (one team specifically noted “this is so fun”).

In these stress tests with native blocks and modern custom blocks, real-time collaboration held up remarkably well.”

Those tests were with a version that reused existing tables to store the editing events. That method resulted in multiple bugs, leading to a delay after it was decided to create a dedicated table for the RTC feature in the database used by WordPress sites in order to improve stability.

The beta-tested version of RTC had to limit the number of users who could simultaneously edit together.

A GitHub issue ticket explains what’s wrong with the old version of RTC:

“It is known to be limited on a performance and scaling basis, but provides an easy way to see collaboration working.

By limiting the provider to a set low number of collaborators by default, the chance of overloading is reduced.”

So that’s one of the issues being solved by introducing a new database table. Once that is done, the RTC feature will need to be tested, and this is the area that WordPress web hosts will be concerned about.

Symptom Of Deeper Issues?

Joost de Valk, founder of Yoast SEO, recently published a blog post that made the case that WordPress is in need of rewriting existing code to make it more secure, modern, and efficient. He called attention to the troubled state of real-time collaboration as a symptom of the problems with the core itself.

He wrote:

“The recent deferral of WordPress 7.0 illustrates the problem in real time. The release was delayed because the team needs to revisit how real-time collaboration data is stored — the initial approach of stuffing it into postmeta wasn’t going to hold up. They’re now considering a custom table. This is exactly the pattern: a new feature runs into the limits of the existing data model, and the team has to work around it or pause to rethink.”

That’s one person’s opinion, and not everyone shares it. A lively discussion on the Post Status Slack channel showed that some in the WordPress community strongly disagreed that WordPress needs to be refactored.

Impact To WordPress Hosts

A concern I have heard privately is that RTC could have a negative impact on shared hosting providers. But it’s hard to know because the RTC feature is still evolving from what was tested on WordPress.com, which is supposed to make it more stable.

Shared hosting environments will have to make a decision as to how to accommodate this feature.

  • How will the hosting environment be affected by thousands of RTC customers editing all at once?
  • Will they need to limit how many users can edit within the block editor?
  • Will they have to place an upper limit on simultaneous editors for one tier of customers and a higher limit for others?

Should RTC Be A Plugin?

WordPress professional Matt Cromwell (LinkedIn profile) recently published an opinion piece questioning whether RTC should be in the WordPress core at all or should instead be developed as a plugin. His reasoning is based on WordPress's core philosophy that any new feature introduced into the core should be something that the majority of WordPress users will need.

The reason for that design philosophy is to make WordPress usable for the majority of users and not ship with features that most will not use. This keeps WordPress lean. His article quotes the official WordPress design philosophy:

“Design for the majority
Many end users of WordPress are non-technically minded. They don’t know what AJAX is, nor do they care about which version of PHP they are using. The average WordPress user simply wants to be able to write without problems or interruption. These are the users that we design the software for as they are ultimately the ones who are going to spend the most time using it for what it was built for.”

Cromwell writes:

“If a feature isn’t needed by the vast majority, it belongs in a plugin. It is the reason WordPress remains lean enough to power 43% of the web.

Real-time collaboration fails this test spectacularly.”

Although Cromwell insists that this feature wouldn’t be used by the majority, an argument could be made that this is a feature that people want. For example, the Atarim collaboration plugin, with the free version currently installed on over 1,000 websites, states that the plugin has been used on over 120,000 websites by agencies and freelancers.

It could be that RTC is indeed an important feature, especially to designers, agencies, and editorial teams working together on articles.

AI In WordPress

The four-phase WordPress roadmap was created back in 2018, and there was no way to know then how important AI would be today. Yet arguably it's AI, not collaboration, that's the most anticipated integration for WordPress 7.0. Nevertheless, real-time collaboration will very likely land in WordPress 7.0, enabling freelancers and agencies to work together with clients as well as with internal teams spread out across countries. That seems like a valid reason to ship a stable feature within core as opposed to within a plugin.

Featured Image by Shutterstock/Summit Art Creations

Why Agentic AI Shopping Feels Unnatural And May Not Threaten SEO via @sejournal, @martinibuster

Google, OpenAI, and Shopify insist that the next revolution in AI is agentic AI shopping agents. Shopping is a lucrative area for AI to burrow into. The thing that I keep thinking is that shopping is a deeply important activity to humans; it’s literally a part of our DNA. Is surrendering the shopping experience something the general public is willing to do?

Agentic AI shopping is like a personal assistant that you tell what you want and maybe why you need it, plus some features and a price range. The AI will go out and do the research and comparison and even make the purchase.

There’s no human performing a search in that scenario. So it’s kind of not necessarily good for SEO unless you’re optimizing shopping sites for agentic AI shoppers.

Shopping Is A Part Of Human Biology

Scientists say that shopping is literally a part of our DNA. Our desire to hunt, to gather, and to flaunt our ability to be successful is a part of the evolutionary competition we participate in (whether we know it or not).

A Wikipedia page on the subject explains:

“Richard Dawkins outlines in The Selfish Gene (1976) that humans are machines made of genes, and genes are the grounding for everything people do.

…Therefore, everything that people do relates to thriving in their environment above competition, including the way people consume as a form of survival in their environment when simply purchasing the basic physiological needs of food, water and warmth. People also consume to thrive above others, for example in conspicuous consumption where a luxury car represents money and high social status…”

What that means is that whether we know it or not, our drive to shop is a part of evolutionary competition with each other. Part of it is to signal our status and attractiveness for reproduction. So when we go shopping for clothes or toilet paper, it’s part of our genetic programming to feel good about it.

Shopping And The Brain’s Chemical Cocktail

And when it comes to feeling good, some of that is triggered by chemicals like dopamine, endorphins, and serotonin firing off to reward you for finding a good deal.

Even scoring a deal on toilet paper can trigger reward signals in the brain.

Another Wikipedia page about the biology of our reward system explains:

“Reward is the attractive and motivational property of a stimulus that induces appetitive behavior, also known as approach behavior, and consummatory behavior. A rewarding stimulus has been described as “any stimulus, object, event, activity, or situation that has the potential to make us approach and consume it.”

A sale sign in a store can act as a reward cue because it signals a lower price or added value, which can drive someone to approach and buy it. The sign itself is just information, but when a person recognizes the discount or deal as beneficial, it can trigger motivation to act. That’s a deeply embedded behavior that we carry with us.

We are like machines that are programmed in our genes to shop.

So that raises the question: Why would anyone delegate that deeply rewarding activity to an AI agent? It’s like delegating the enjoyment of chocolate to a robot.

I suspect that most of you reading this know which supermarkets sell the best produce at the cheapest price, which ones have the yummiest bread, and which markets have the best spices. That’s our programming; it’s biological. It does not make sense to delegate the rewards inherent in discovery or acquisition to an AI shopping agent.

Serendipity And Shopping

Serendipity is when things happen by chance, unplanned, that nonetheless provide a happy outcome or benefit. One of the joys of shopping is stumbling onto something that’s a good deal or beautiful or has some other value. Employing an AI agent will cause humans to miss out on the serendipitous joy of discovering something they hadn’t been looking for that is not just desirable but also something they hadn’t known they needed.

For example, I purchased a birthday gift for my wife. I walked into a gift shop run by a charming new age hippie. We talked about music as I browsed the gifts for sale. I found something, two things, that I hadn’t planned on buying. The two things had a semantic connection to each other that I found to be poetic and therefore extra nice as a gift. The shop owner put the two items into two boxes, then placed the boxes in a lovely mesh gift bag with a ribbon.

That’s serendipity in action. It was a pleasurable moment I enjoyed. I walked out of the store into the sunshine with a fresh cocktail of dopamine, endorphins, and serotonin flooding my brain, and it was a delightful moment. I bought a gift that I was certain my wife would enjoy.

Agentic AI Shopping Is Unnatural

My question is, why does Silicon Valley think it can automate the many things that make us human?

It's as if Silicon Valley is trying to turn us back into teenagers by doing for us the things adults normally do for themselves.

Now they want to take shopping away from us?

I think the only way agentic AI has a chance of working is if they build a sense of serendipity and discovery into the system. I've been a part of the technology scene for over 25 years; I lived in San Francisco, the world capital of the internet, and even worked for a time at a leading technology magazine.

So it's not that I'm a Luddite about technology. AI integrated into a shopping site makes a lot of sense. It can make recommendations and answer questions. That's great. There is still a human clicking around and discovering things for themselves in a way that satisfies our natural urge to shop and consume. That's good for SEO because it means that a store needs to be optimized for search.

AI agents doing the shopping for humans makes less sense because it's unnatural; it goes against our biology.

Featured Image by Shutterstock/Prostock-studio

Mullenweg To Cloudflare: Keep WordPress Out Of Your Mouth via @sejournal, @martinibuster

WordPress co-founder Matt Mullenweg responded to Cloudflare’s announcement of EmDash as the spiritual successor to WordPress by invoking Will Smith’s Oscars slap. Cloudflare’s CEO responded by doing exactly what Mullenweg told him not to do.

Spiritual Successor To WordPress

Mullenweg’s first criticism of the new EmDash was the claim that it was the spiritual successor to WordPress. He made the point that WordPress can be installed and used on virtually any device and on any platform, saying that this was a part of their mission to democratize publishing by making it easy to deploy on almost any kind of infrastructure.

Although he didn’t say it out loud, the implication is clear: WordPress can be deployed everywhere, and EmDash is not as flexible.

Matt aimed his next comment straight at Cloudflare:

“You can come after our users, but please don’t claim to be our spiritual successor without understanding our spirit.”

The Compliment Sandwich

Back in the early 2000s, Googlers were famous for their friendliness and smiles. I don’t think it was a calculated thing; the smiles were not a persona; it was genuine. I believe that many of the Googlers who had interactions with the SEO community were genuinely friendly and truly wanted to help people with their SEO issues. When I lived in San Francisco, I had many visits at Google and had nothing but positive experiences.

Matt affects that same kind of persona where he speaks with a smile. But he also does it while being critical of things, which is a kind of dissonant thing to witness. His response to Cloudflare is the written equivalent of that approach.

It follows the compliment sandwich pattern:

  • Positive statement
  • Criticism or negative point
  • Another positive statement

Done correctly, with tact and genuine empathy, it can soften criticism. It’s a valid approach to providing critical but helpful feedback.

Matt accused Cloudflare of using EmDash as a way to promote their infrastructure, but he did it with a smile.

He criticized:

“I think EmDash was created to sell more Cloudflare services.”

Then he switched over to the positive statement:

“And that’s okay! It can kinda run on Netlify or Vercel, but good stuff works best on Cloudflare. This is where I’m going to stop and say, I really like Cloudflare! I think they’re one of the top engineering organizations on the planet; they run incredible infrastructure, and their public stock is one of the few I own. And I love that this is open source! That’s more important than anything. I will never belittle a fellow open source CMS; I only hate the proprietary ones.”

Then he criticized Cloudflare again:

“If you want to adopt a CMS that will work seamlessly with Cloudflare and make it hard for you to ever switch vendors, EmDash is an incredible choice.”

That last part is a backhanded and sarcastic compliment, implying that EmDash is a way to trap users within Cloudflare’s infrastructure. Mullenweg offered a bullet-point list of additional criticism mixed with compliments.

Keep WordPress Out Of Your Mouth

Mullenweg ended his blog post with a conciliatory-sounding paragraph that ends abruptly with a phrase that invoked Will Smith’s Oscars slap:

“Some day, there may be a spiritual successor to WordPress that is even more open. When that happens, I hope we learn from it and grow together. Until then, please keep the WordPress name out of your mouth.”

Mullenweg is doing something between the lines there. Whether he did it intentionally or not, he’s invoking Will Smith’s infamous moment at the Oscars, when he slapped Chris Rock across the face and told him to keep his wife’s name out of his mouth. That phrase subtly invokes a violent image, with Mullenweg playing the role of Will Smith slapping Cloudflare across the face.

By using that specific phrase, Matt Mullenweg was, intentionally or not, invoking the conflict by comparing Cloudflare’s use of the “WordPress” name to an insulting personal attack.

Understated Irony

After being told to keep WordPress out of his mouth, Cloudflare co-founder and CEO Matthew Prince responded on X by saying it's a fair criticism and then immediately putting WordPress in his mouth. Prince tweeted:

“Think this is a fair critique from @photomatt of EmDash.

I remain hopeful it’ll bring a broader set of developers into the WordPress ecosystem.”

What Prince did there was politely defy Mullenweg by tweeting the word “WordPress” in his response after being told to keep it out of his mouth while simultaneously adopting the persona of someone trying to “help” the person who just slapped him. In the context of the Oscar reference, it’s as if Chris Rock had responded to the slap by calmly saying, “I hope this incident brings more viewers to your next movie.”

Was that meant as understated irony? If so, it’s a master class.

Featured Image by Shutterstock/Prostock-studio

Google Answers Why Some SEOs Split Their Sitemap Into Multiple Files via @sejournal, @martinibuster

Google’s John Mueller answered a question about why some websites use multiple XML sitemaps instead of a single file. His answer suggests that what looks like unnecessary complexity may come from reasons that are not immediately obvious.

The question came from an SEO trying to understand why managing multiple sitemap files would be preferred over keeping everything in one place.

Question About Using Multiple Sitemaps

The SEO framed the issue as a matter of efficiency, questioning why anyone would choose to increase the number of files they have to manage.

They asked:

“Can I ask a silly question, what’s the advantage of multiple site maps? It seems like your going from 1 file to manage to X files to manage.

Why the extra work? Why not just have 1 file?”

It's a good question. Avoiding extra work is always a good idea in SEO, and for a relatively small website it makes sense to have just one sitemap. But as Google's Mueller explains, there may be good reasons to split a sitemap into multiple files.

Mueller Explains Why Multiple Sitemaps Are Used

Mueller responded by listing several reasons why multiple sitemap files are used, including both practical and less intentional causes.

He responded:

“Some reasons I’ve seen:

  • want to track different kinds of urls in groups (“product detail page sitemap” vs “product category sitemap” — which you can kinda do with the page indexing report)
  • split by freshness (evergreen content in a separate sitemap file – theoretically a search engine might not need to check the “old” sitemap as often; I don’t know if this actually happens though)
  • proactively split (so that you don’t get to 50k and have to urgently figure out how to change your setup)
  • hreflang sitemaps (can take a ton of space, so the 50k URLs could make the files too large)
  • my computer did it, I don’t know why”

Mueller's answer shows that sitemaps can be used in creative ways that serve a purpose. Something I've heard from enterprise-level SEOs is that keeping each sitemap well under the 50k-URL limit ensures better indexing.
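For anyone weighing the split, here is a rough sketch of the kind of setup Mueller describes: a Python script that divides a large URL list into sitemap files that stay below the 50,000-URL protocol limit and then writes a sitemap index pointing at each file. The URL list, file names, and 45,000-URL threshold are placeholders, not a recommendation from Google.

```python
from datetime import date
from xml.sax.saxutils import escape

MAX_URLS_PER_SITEMAP = 45000  # stay comfortably under the 50,000-URL protocol limit

# Placeholder URL list; in practice this would come from the CMS or database.
all_urls = [f"https://example.com/product-{i}" for i in range(100_000)]

def write_sitemap(filename, urls):
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    xml = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n</urlset>\n"
    )
    with open(filename, "w", encoding="utf-8") as f:
        f.write(xml)

sitemap_files = []
for i in range(0, len(all_urls), MAX_URLS_PER_SITEMAP):
    filename = f"sitemap-products-{i // MAX_URLS_PER_SITEMAP + 1}.xml"
    write_sitemap(filename, all_urls[i:i + MAX_URLS_PER_SITEMAP])
    sitemap_files.append(filename)

# Sitemap index pointing search engines at each file.
index_entries = "\n".join(
    f"  <sitemap><loc>https://example.com/{name}</loc>"
    f"<lastmod>{date.today().isoformat()}</lastmod></sitemap>"
    for name in sitemap_files
)
with open("sitemap-index.xml", "w", encoding="utf-8") as f:
    f.write(
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{index_entries}\n</sitemapindex>\n"
    )

print(f"Wrote {len(sitemap_files)} sitemap files plus sitemap-index.xml")
```

Splitting by content type or freshness, as Mueller suggests, works the same way: each group simply gets its own file and its own entry in the index.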

Takeaways

Mueller's answer shows that keeping things "simple" isn't always the best strategy. It can make sense to apply organization to sitemaps, and what appears to be unnecessary complexity is often the result of practical constraints, evolving site structures, or automated systems rather than deliberate optimization.

  • Multiple sitemaps can be used to group different types of content
  • They help avoid hitting technical limits like the 50,000 URL cap
  • Some implementations are based on theory rather than confirmed behavior
  • Not all sitemap structures are intentional or strategically planned

Featured Image by Shutterstock/Rachchanont Hemmawong