This post was sponsored by WP Engine. The opinions expressed in this article are the sponsor’s own.
In the race for audience attention, digital marketers at media companies often have one hand tied behind their backs. The mission is clear: drive sustainable revenue, increase engagement, and stay ahead of technological disruptions such as LLMs and AI agents.
Yet, for many media organizations, execution is throttled by a “Sticky-taped stack,” a fragile patchwork of a legacy CMS and ad-hoc plugins. For a digital marketing leader, this isn’t just a technical headache; it’s a direct hit to the bottom line.
Fragmentation Tax: How A Siloed CMS, Disconnected Data & Tech Debt Are Costing You Growth
The Fragmentation Tax is the hidden cost of operational inefficiency. It drains budgets, burns out teams, and stunts the ability to scale. For digital marketing and growth leads, this tax is paid in three distinct “currencies”:
1. Siloed Data & Strategic Blindness.
When your ad server, subscriber database, and content tools exist as siloed work streams, you lose the ability to see the full picture of the reader’s journey.
Without integrated attribution, marketers are forced to make strategic pivots based on vanity metrics like generic pageviews rather than true business intelligence, such as conversion funnels or long-term reader retention.
2. The Editorial Velocity Gap.
In the era of breaking news, being second is often the same as being last. If an editorial team is forced into complex, manual workflows because of a fragmented tech stack, content reaches the market too late to capture peak search volume or social trends. This friction creates a culture of caution precisely when marketing needs a culture of velocity to capture organic traffic.
3. Tech Debt vs. Innovation.
Tech debt is the future cost of rework created by choosing “quick-and-dirty” solutions. This is a silent killer of marketing budgets. Every hour an engineering team spends fixing plugin conflicts or managing security fires caused by a cobbled-together infrastructure is an hour stolen from innovation.
The 4 Publishing Pillars That Improve SEO & Monetization
To stop paying this tax, media organizations are moving away from treating their workflows as a collection of disparate parts. Instead, they are adopting a unified system that eliminates the friction between engineering, editorial, and growth.
A modern publishing standard addresses these marketing hurdles through four key operational pillars:
Pillar 1: Automated Governance (Built-In SEO & Tracking Integrity)
Marketing integrity relies on consistency.
In a fragmented system, SEO metadata, tracking pixels, and brand standards are often managed manually, leading to human error.
A unified approach embeds governance directly into the workflow.
By using automated checklists, organizations ensure that no article goes live until it meets defined standards, protecting the brand and ensuring every piece of content is optimized for discovery from the moment of publication.
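As a rough illustration of such an automated checklist, the sketch below blocks publication until a few example rules pass. The field names (`title`, `meta_description`, `canonical_url`, `tracking_id`) and the 60-character title limit are assumptions chosen for illustration, not rules taken from any particular CMS.

```python
from dataclasses import dataclass


@dataclass
class Article:
    # Hypothetical fields; a real CMS would expose its own schema.
    title: str = ""
    meta_description: str = ""
    canonical_url: str = ""
    tracking_id: str = ""


def checklist_failures(article: Article) -> list[str]:
    """Return the failed checks; an empty list means the article may go live."""
    failures = []
    if not article.title.strip():
        failures.append("missing title")
    elif len(article.title) > 60:  # assumed limit, adjust to your own standard
        failures.append("title longer than 60 characters")
    if not article.meta_description.strip():
        failures.append("missing meta description")
    if not article.canonical_url.startswith("https://"):
        failures.append("canonical URL must use HTTPS")
    if not article.tracking_id:
        failures.append("missing tracking ID")
    return failures


def can_publish(article: Article) -> bool:
    """Gate the publish action on a clean checklist."""
    return not checklist_failures(article)
```

In a real workflow, the publish button would call something like `can_publish` and show the list of failures to the editor instead of letting the article through.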
Pillar 2: Fearless Iteration (Continuous SEO & CRO Optimization Without Risk)
High-traffic articles are a marketer’s most valuable asset. However, in a legacy stack, updating a live story to include, for instance, a Call-to-Action (CTA), is often a high-risk maneuver that could break site layouts.
A modern unified approach allows for “staged” edits, enabling teams to draft and review iterations on live content without forcing those changes live immediately. This allows for a continuous improvement cycle that protects the user experience and site uptime.
Pillar 3: Cross-Functional Collaboration (Reducing Workflow Bottlenecks Between Editorial, SEO & Engineering)
Any type of technology disruption requires a team to collaborate in real time. The “Sticky-taped” approach often forces teams to work in separate tools, creating bottlenecks.
A modern unified standard utilizes collaborative editing, separating editorial functions into distinct areas for text, media, and metadata. This allows an SEO specialist or a growth marketer to optimize a story simultaneously with the journalist, ensuring the content is “market-ready” the instant it’s finished.
Pillar 4: Native Breaking News
Late-breaking or real-time events, such as global geopolitical shifts or live sports, require in-the-moment storytelling to keep audiences informed, engaged, and on-site. Traditionally, “Live Blogs” relied on clunky third-party embeds that fragmented user data and slowed page loads.
A unified standard treats breaking news as a native capability, enabling rapid-fire updates that keep the audience glued to the brand’s own domain, maximizing ad impressions and subscription opportunities.
Conclusion: Trading Toil for Agility
Ultimately, shifting to a unified standard is about reducing inefficiencies caused by “fighting the tools.” By removing the technical toil that typically hides insights in siloed tools, media organizations can finally trade operational friction for strategic agility.
When your site’s foundation is solid and fast, editors can hit “publish” without worrying about things breaking. At the same time, marketers can test new ways to grow the audience without waiting weeks for developers to update code. This setup clears the way for everyone to move faster and focus on what actually matters: telling great stories and connecting with readers.
The era of stitching software together with “sticky tape” is over. For modern media companies to thrive amid constant digital disruption, infrastructure must be a launchpad, not a hindrance. By eliminating the Fragmentation Tax, marketing leaders can finally stop surviving and start growing.
Jason Konen is director of product management at WP Engine, a global web enablement company that empowers companies and agencies of all sizes to build, power, manage, and optimize their WordPressⓇ websites and applications with confidence.
Google’s John Mueller shared a case where a leftover HTTP homepage was causing unexpected site-name and favicon problems in search results.
The issue, which Mueller described on Bluesky, is easy to miss because Chrome can automatically upgrade HTTP requests to HTTPS, hiding the HTTP version during normal browsing.
What Happened
Mueller described the case as “a weird one.” The site used HTTPS, but a server-default HTTP homepage was still accessible at the HTTP version of the domain.
“A hidden homepage causing site-name & favicon problems in Search. This was a weird one. The site used HTTPS, however there was a server-default HTTP homepage remaining.”
The tricky part is that Chrome can upgrade HTTP navigations to HTTPS, which makes the HTTP version easy to miss in normal browsing. Googlebot doesn’t follow Chrome’s upgrade behavior.
“Chrome automatically upgrades HTTP to HTTPS so you don’t see the HTTP page. However, Googlebot sees and uses it to influence the sitename & favicon selection.”
Google’s site name system pulls the name and favicon from the homepage to determine what to display in search results. The system reads structured data from the website, title tags, heading elements, og:site_name, and other signals on the homepage. If Googlebot is reading a server-default HTTP page instead of the actual HTTPS homepage, it’s working with the wrong signals.
How To Check For This
Mueller suggested two ways to see what Googlebot sees.
First, he joked that you could use AI. Then he corrected himself.
“No wait, curl on the command line. Or a tool like the structured data test in Search Console.”
Running curl http://yourdomain.com from the command line would show the raw HTTP response without Chrome’s auto-upgrade. If the response returns a server-default page instead of your actual homepage, that’s the problem.
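For readers who prefer a script to curl, here is a minimal Python sketch of the same check. It relies on the fact that `http.client` does not follow redirects or upgrade to HTTPS, so it shows the raw plain-HTTP response. The function names and the wording of the diagnoses are illustrative assumptions, not anything Mueller or Google prescribes.

```python
import http.client
from typing import Optional, Tuple


def fetch_http_homepage(domain: str, timeout: float = 10.0) -> Tuple[int, Optional[str], str]:
    """Request http://<domain>/ over plain HTTP without following redirects.

    Returns (status, Location header, first part of the body).
    """
    conn = http.client.HTTPConnection(domain, 80, timeout=timeout)
    try:
        conn.request("GET", "/", headers={"User-Agent": "homepage-check/0.1"})
        resp = conn.getresponse()
        body = resp.read(2000).decode("utf-8", errors="replace")
        return resp.status, resp.getheader("Location"), body
    finally:
        conn.close()


def diagnose(status: int, location: Optional[str]) -> str:
    """Classify the plain-HTTP response into a rough diagnosis."""
    if status in (301, 302, 307, 308) and location and location.startswith("https://"):
        return "redirects to HTTPS"
    if status == 200:
        return "serves its own page over HTTP; compare it with the HTTPS homepage"
    return f"unexpected response (status {status})"
```

Calling `fetch_http_homepage("yourdomain.com")` and passing the status and `Location` value to `diagnose` gives a quick first read. If the result is a 200, inspect the returned body to see whether it is a server-default page.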
If you want to see what Google retrieved and rendered, use the URL Inspection tool in Search Console and run a Live Test. Google’s site name documentation also notes that site names aren’t supported in the Rich Results Test.
This case introduces a new complication. The problem wasn’t in the structured data or the HTTPS homepage itself. It was a ghost page in the HTTP version, which you’d have no reason to check because your browser never showed it.
Google’s site name documentation explicitly mentions duplicate homepages, including HTTP and HTTPS versions, and recommends using the same structured data for both. Mueller’s case shows what can go wrong when an HTTP version contains content different from the HTTPS homepage you intended to serve.
The takeaway for troubleshooting site-name or favicon problems in search results is to check the HTTP version of your homepage directly. Don’t rely on what Chrome shows you.
Looking Ahead
Google’s site name documentation specifies that WebSite structured data must be on “the homepage of the site,” defined as the domain-level root URI. For sites running HTTPS, that means the HTTPS homepage is the intended source.
If your site name or favicon looks wrong in search results and your HTTPS homepage has the correct structured data, check whether an HTTP version of the homepage still exists. Use curl or the URL Inspection tool’s Live Test to view it directly. If a server-default page is sitting there, removing it or redirecting HTTP to HTTPS at the server level should resolve the issue.
Google’s crawl team has been filing bugs directly against WordPress plugins that waste crawl budget at scale.
Gary Illyes, Analyst at Google, shared the details on the latest Search Off the Record podcast. His team filed an issue against WooCommerce after identifying its add-to-cart URL parameters as a top source of crawl waste. WooCommerce picked up the bug and fixed it quickly.
Not every plugin developer has been as responsive. An issue filed against a separate action-parameter plugin is still sitting unclaimed. And Google says its outreach to the developer of a commercial calendar plugin that generates infinite URL paths fell on deaf ears.
What Google Found
The details come from Google’s internal year-end crawl issue report, which Illyes reviewed during the podcast with fellow Google Search Relations team member Martin Splitt.
Action parameters accounted for roughly 25% of all crawl issues reported in 2025. Only faceted navigation ranked higher, at 50%. Together, those two categories represent about three-quarters of every crawl issue Google flagged last year.
The problem with action parameters is that each one creates what appears to be a new URL by adding text like ?add_to_cart=true. Parameters can stack, doubling or tripling the crawlable URL space on a site.
Illyes said these parameters are often injected by CMS plugins rather than built intentionally by site owners.
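To make the stacking effect concrete, the sketch below enumerates the URLs a crawler could encounter when optional action parameters are appended to one page, then collapses them back to a single canonical URL. The parameter names in `ACTION_PARAMS` are assumptions for illustration, and the model treats each parameter as simply present or absent.

```python
from itertools import combinations
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed action parameters, for illustration only.
ACTION_PARAMS = {"add_to_cart", "add_to_wishlist"}


def url_variants(base_url: str, params: dict[str, str]) -> list[str]:
    """Every URL produced by appending any subset of the given parameters."""
    names = sorted(params)
    urls = []
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            query = urlencode({name: params[name] for name in subset})
            urls.append(base_url + ("?" + query if query else ""))
    return urls


def strip_action_params(url: str) -> str:
    """Drop action parameters so all variants collapse to one canonical URL."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ACTION_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Under this model, two optional parameters already turn one product page into four crawlable URLs, all of which serve essentially the same content.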
The WooCommerce Fix
Google’s crawl team filed a bug report against the plugin, flagging the add-to-cart parameter behavior as a source of crawl waste affecting sites at scale.
Illyes describes how they identified the issue:
“So we would try to dig into like where are these coming from and then sometimes you can identify that perhaps these action parameters are coming from WordPress plug-ins because WordPress is quite a popular CMS content management system. And then you would find that yes, these plugins are the ones that add to cart and add to wish list.
“And then what you would do if you were a Gary is to try to see if they are open source in the sense that they have a repository where you can report bugs and issues and in both of these cases the answer was yes. So we would file issues against these uh plugins.”
WooCommerce responded and shipped a fix. Illyes noted the turnaround was fast, but other plugin developers with similar issues haven’t responded. Illyes didn’t name the other plugins.
He added:
“What I really, really loved is that the good folks at WooCommerce almost immediately picked up the issue and they solved it.”
The data suggests that earlier warnings and documentation updates didn’t solve the problem, because the same issues still dominate crawl reports.
The crawl waste is often baked into the plugin layer. That creates a real bind for websites with ecommerce plugins. Your crawl problems may not be your fault, but they’re still your responsibility to manage.
Illyes said Googlebot can’t determine whether a URL space is useful “unless it crawled a large chunk of that URL space.” By the time you notice the server strain, the damage is already happening.
Google consistently recommends using robots.txt to block parameter URLs proactively, which is more effective than waiting for symptoms.
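Before deploying a robots.txt rule, it helps to confirm which URLs it would actually match. The sketch below is a simplified matcher for the `*` and trailing `$` wildcards that Google documents for robots.txt paths. It ignores `Allow` rules and rule precedence, and the pattern `/*?*add_to_cart=` is a hypothetical example based on the parameter mentioned above, not a rule recommended by Google or WooCommerce.

```python
import re


def robots_pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern ('*' wildcard, optional trailing '$') into a regex."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' matches any run of characters; everything else is matched literally.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path_and_query: str, disallow_patterns: list[str]) -> bool:
    """True if any Disallow pattern matches from the start of the path."""
    return any(
        robots_pattern_to_regex(pattern).match(path_and_query)
        for pattern in disallow_patterns
    )
```

Running a sample of real URLs from server logs through `is_blocked` is one way to check that a rule catches the parameter URLs without also blocking pages that should stay crawlable.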
Looking Ahead
Google filing bugs against open-source plugins could help reduce crawl waste at the source. The full podcast episode with Illyes and Splitt is available with a transcript.
Natural language is quickly becoming the default way people interact with online tools. Instead of typing a few keywords, users now ask full questions, give detailed instructions, and are starting to expect clear, conversational answers. So, how can you make sure your content provides the answer to their question? Or better yet, how can you make it possible for them to interact with your website in a similar way? That’s where Microsoft’s NLWeb comes in.
Meet NLWeb, Microsoft’s new open project
NLWeb, short for Natural Language Web, is an open project recently launched by Microsoft. The aim of this project is to bring conversational interfaces directly to websites, rather than users having to use an external chatbot that’s in control of what’s shown. Instead of relying on traditional navigation or search bars, NLWeb is designed to allow users to ask questions and explore content in a more personal, conversational way.
At its core, NLWeb connects website content to AI-powered tools. It enables AI to understand what a website is about, what information it contains, and how that information should be interpreted for the purpose of returning personalized results. With this project, Microsoft is moving toward a more interoperable, standards-based, and open web that allows everyone to prepare their website for the future of search.
This project was initiated and realized by R.V. Guha, CVP and Technical Fellow at Microsoft. Guha is one of the creators of widely used web standards such as RSS and Schema.org.
How NLWeb works
NLWeb works by combining structured data, standardized APIs and AI models capable of understanding natural language. Every NLWeb instance acts as a Model Context Protocol (MCP) server, which makes your content discoverable for all the agents operating in the MCP ecosystem. This makes it easy for these agents to find your website.
Using structured data, website owners then present their content in a machine-readable way. AI applications can then consume this data and answer user questions accurately by matching them to the most relevant information. The result is a conversational experience powered by existing content, either directly on a website or through an online search tool: a conversational interface for both human users and AI agents collecting information.
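To show what machine-readable content can look like in practice, here is a small sketch that builds a Schema.org `Article` description as JSON-LD. The chosen properties and helper names are illustrative assumptions; this is not a statement of what NLWeb itself requires.

```python
import json


def article_jsonld(headline: str, author: str, date_published: str, url: str) -> str:
    """Build a Schema.org Article description as a JSON-LD string."""
    data = {
        "@context": "https://schema.org",
        "@type": "Article",
        "headline": headline,
        "author": {"@type": "Person", "name": author},
        "datePublished": date_published,
        "url": url,
    }
    return json.dumps(data, indent=2)


def as_script_tag(jsonld: str) -> str:
    """Wrap JSON-LD in the script tag commonly used to embed it in an HTML page."""
    return f'<script type="application/ld+json">\n{jsonld}\n</script>'
```

On WordPress, SEO plugins typically generate this kind of markup automatically, so a hand-written version like this is mainly useful for understanding what is being emitted.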
An important thing to note is that NLWeb is an open project. It’s not a closed ecosystem, meaning that Microsoft wants to make it accessible to everyone. The idea is to make it easy for any website owner to create an intelligent, natural language experience for their site, while also preparing their content to interact with and be discovered by other online agents, such as AI tools and search engines.
How does natural language work?
Natural language simply refers to the way we speak and write. This means using full sentences that allow room for intent, context and nuance. More than keywords or short commands, natural language reflects how people think and what they are looking for exactly.
To give you an example: a focus keyphrase might be running shoes trail. But using natural language, the request would look more like this: What are the best running shoes for trail running in wet conditions?
Natural language in AI tools
Modern AI tools are designed to understand this kind of input. The large language models behind these tools can analyze intent and context to generate responses that fulfill the given request. This is why conversational interfaces feel more intuitive than traditional search or forms.
Tools like AI chat assistants, voice search, and even traditional search engines rely heavily on natural language understanding and users have quickly adapted to it.
The current state of search
The way people find information online is changing fast, a change that is heavily influenced by the use of AI-powered tools. We now expect personalized answers instead of a list of results to sort through ourselves. AI chatbots also give us the option to follow up on our original search query, which turns search into a conversation instead of a series of clicks.
Research from McKinsey & Company shows that AI adoption and natural language interfaces are becoming mainstream, with 50% of consumers already using AI-driven tools for information discovery. The majority even say it’s the top digital source they use to make buying decisions. As these habits continue to grow, websites that aren’t optimized for natural language risk becoming invisible in AI-generated answers.
Why this is interesting for you
The shift to natural language isn’t just a technical trend. As discussed above, it directly impacts your online visibility and competitive position.
If users ask an AI system for information, only a handful of sources will be referenced in the response. This is because, like search engines, AI platforms also need to be able to read the information on your website. Being one of those sources can be the difference between being discovered or being overlooked.
NLWeb collaborates with Yoast
With NLWeb, you are communicating your website’s content clearly and in a standardized way. That means your brand, products, or expertise can appear in AI-powered answers instead of your competitors’. To help as many website owners as possible benefit from this shift, Yoast is collaborating with NLWeb.
The best part? If you’re a user of any of our Yoast plans designed for WordPress, you’re well ahead here. Yoast’s integration with NLWeb will roll out in phases, starting with functionality that helps our WordPress users express their content in ways AI systems can interpret accurately, without any additional setup required. So sit tight and let us help you prepare your website for the new world of search!
NLWeb aims to make your content understandable not just for people, but for the AI systems that are increasingly relevant to your website’s discovery.
The open web is the part of the internet built on open standards that anyone can use. This concept creates a democratic digital space where people can build on each other’s work without restrictions, just like how WordPress.org is built. For website owners, understanding and leveraging the open web is increasingly crucial, especially with the rise of AI-powered systems and the general direction that online search is taking. So, let’s explore what the open web is and what it means for your website.
What is the open web?
The open web refers to the part of the internet built on open, shared standards that are available to everyone. It’s powered by technologies like HTTP, HTML, RSS, and Schema.org, which make it easy for websites and online systems to interact with each other. But it is more than just technical protocols. It also includes open‑source code, public APIs, and the free flow of data and content across sites, services, and devices, creating a democratic digital space where people can build on each other’s work without heavy restrictions.
Because these standards are not owned or patented, the open web remains largely decentralized. This allows content to be accessed, understood, and reused across devices and platforms. This not only encourages innovation but also ensures that information is discoverable without being locked behind proprietary ecosystems.
The benefits of an open web
The open web is built on publicly available protocols that enable access, collaboration, and innovation at a global scale.
The most important benefits include:
Collaboration and innovation: Open protocols enable developers to build on each other’s work without proprietary restrictions.
Accessibility: Users and AI agents alike can access and interact with web content regardless of device, platform, or underlying technology.
Democratization: No single company controls access to information, giving publishers greater autonomy.
Inclusion: The open web creates a more level playing field, where everyone gets a chance to participate in the digital economy.
The open web vs the deep web
To give you a better idea of what the open web is, it helps to know about the “deep web” and closed or “walled garden” platforms. The deep web covers content not indexed by search engines, while closed systems or walled gardens restrict access and keep data siloed.
On the open web, anyone can access information freely. A good example of that is Wikipedia, which is accessible to anyone looking for information on a topic and to anyone who wants to contribute to its content. Closed-off platforms, like proprietary apps or social media ecosystems, create places where content is only available if you pay or use a specific service. Well-known examples of this are social media platforms such as Facebook and Instagram. Another example is a news website that requires a paid subscription to get access.
In essence, the open web keeps information discoverable, accessible, and interoperable – instead of locked inside a handful of platforms.
AI and the open web
The popularity of AI-powered search makes open web principles more important than ever. Decentralized and accessible information allows AI tools to interact with content directly and use it freely to generate an answer for a user.
“We believe the future of AI is grounded in the open web.”
Ramanathan Guha, CVP and Technical Fellow at Microsoft.
Microsoft’s open project NLWeb is a prime example. It provides a standardized layer that enables AI agents to discover, understand, and interact with websites efficiently, without needing separate integrations for every platform.
What this means for website owners
For website owners, including small business owners, embracing the open web means making your content freely available in ways that AI can interpret. By using structured data standards like Schema.org, your website becomes discoverable to AI tools, increasing your reach and ensuring that your content remains part of the future of search.
Yoast and Microsoft: collaborating towards a more open web
Yoast is proud to collaborate with NLWeb, a Microsoft project that makes your content easier for AI agents to understand without extra effort from website owners. This allows your content to remain discoverable, reach a wider audience, and show up in AI-powered search results.
The open web strives towards an accessible web where content is available for everyone: a web where it doesn’t matter how big your website or marketing budget is, and where everyone has the chance to be found and represented in AI-powered search. NLWeb helps turn this vision into reality by connecting today’s open web with tomorrow’s AI-driven search ecosystem.
This post was sponsored by WP Media. The opinions expressed in this article are the sponsor’s own.
You’ve built a WordPress site you’re proud of. The design is sharp, the content is solid, and you’re ready to compete. But there’s a hidden cost you might not have considered: a slow site doesn’t just hurt your SEO; it now affects your AI visibility too.
With AI-powered search platforms such as ChatGPT and Google’s AI Overviews and AI Mode reshaping how people discover information, speed has never mattered more. And optimizing for it might be simpler than you think.
The conventional wisdom? “Speed optimization is technical and complicated.” “It requires a developer.” “It’s not that big a deal anyway.” These myths spread because performance optimization is genuinely challenging. But dismissing it because it’s hard? That’s leaving lots of untapped revenue on the table.
Here’s what you need to know about the speed-SEO-AI connection, and how to get your site up to speed without having to reinvent yourself as a performance engineer.
Why Visitors Won’t Wait For Your Site To Load (And What It Costs You)
Let’s start with the basics. When’s the last time you waited patiently for a slow website to load? Exactly.
Google’s research shows that as page load time increases from one second to three seconds, the probability of a visitor bouncing increases by 32%. Push that to five seconds, and the increase in bounce probability jumps to 90%.
Think about it. You’re spending money on ads, content, and SEO to get people to your site, and then losing nearly half of them before they see anything because your pages load too slowly.
For e-commerce, the stakes are even higher:
A site loading in 1 second has a conversion rate 5x higher than one loading in 5 seconds.
79% of shoppers who experience performance issues say they won’t return to buy again.
Every 1-second delay reduces customer satisfaction by 16%.
A slow site isn’t just losing one sale. It’s potentially losing you customers for life.
Website Speeds That AI and Visitors Expect
Google stopped being subtle about this in 2020. With the introduction of Core Web Vitals, page speed became an official ranking factor. If your WordPress site meets these benchmarks, you’re signaling quality to Google. If it doesn’t, you’re handing competitors an advantage.
Here’s the challenge: only 50% of WordPress sites currently meet Google’s Core Web Vitals standards.
That means half of WordPress websites have room to improve, and an opportunity to gain ground on competitors who haven’t prioritized performance.
The key metric to watch is Largest Contentful Paint (LCP): how quickly your main content loads. Google wants this under 2.5 seconds. Hit that target, and you’re in good standing.
What most site owners miss: speed improvements compound. Better Core Web Vitals leads to better rankings, which leads to more traffic, which leads to more conversions. The sites that optimize first capture that momentum.
The AI Visibility Advantage: Why Speed Matters More Than Ever
Here’s where it gets really interesting, and where early movers have an edge.
The rise of AI-powered search tools like ChatGPT, Perplexity, and Google’s AI Overviews is fundamentally changing how people discover information. And here’s what most haven’t realized yet: page speed influences AI visibility too.
A recent study by SE Ranking analyzed 129,000 domains across over 216,000 pages to identify what factors influence ChatGPT citations. The findings on page speed were striking:
Fast pages (FCP under 0.4 seconds): averaged 6.7 citations from ChatGPT
Slow pages (FCP over 1.13 seconds): averaged just 2.1 citations
That’s a threefold difference in AI visibility based largely on how fast your pages load.
Why does this matter? Because 50% of consumers already use AI-powered search when making purchase decisions. Sites that load fast are more likely to be cited, recommended, and discovered by a growing audience that starts their search with AI.
The opportunity: Speed optimization now serves double duty. It boosts your traditional SEO and positions you for visibility in an AI-first search landscape.
How To Improve Page Speed Metrics & Increase AI Citations
Speed, SEO, and AI visibility are now deeply connected.
Every day your site underperforms, you’re missing opportunities.
Your Page Speed Optimization Roadmap
Here’s your action plan:
Audit your current speed.
Identify the bottlenecks.
Implement a comprehensive solution. Rather than patching issues one plugin at a time, use an all-in-one performance tool that addresses caching, code optimization, and media loading together.
Monitor and maintain. Speed isn’t a one-time fix. Track your metrics regularly to ensure you’re maintaining performance as you add content and features.
Step 1: Audit Your Current Website Speed
To find the source of your site’s slowness and build a baseline to test against, start with a website speed audit.
Compare your Core Web Vitals scores to your industry’s CWV baseline.
Identify which scores are lowest before moving to step 2.
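Once you have measurements from a speed test, a small script can rank them so the weakest metric stands out. The sketch below uses Google’s published “good” and “needs improvement” boundaries for LCP, INP, and CLS. The function names and the sample numbers in the usage note are illustrative assumptions.

```python
# (good_max, needs_improvement_max) per metric, per Google's published thresholds.
THRESHOLDS = {
    "LCP": (2.5, 4.0),   # seconds
    "INP": (200, 500),   # milliseconds
    "CLS": (0.1, 0.25),  # unitless layout-shift score
}


def rate(metric: str, value: float) -> str:
    """Rate one measurement as good, needs improvement, or poor."""
    good_max, needs_improvement_max = THRESHOLDS[metric]
    if value <= good_max:
        return "good"
    if value <= needs_improvement_max:
        return "needs improvement"
    return "poor"


def worst_first(measurements: dict[str, float]) -> list[tuple[str, float, str]]:
    """Sort measurements so the weakest ratings come first."""
    order = {"poor": 0, "needs improvement": 1, "good": 2}
    rated = [(metric, value, rate(metric, value)) for metric, value in measurements.items()]
    return sorted(rated, key=lambda row: order[row[2]])
```

For example, `worst_first({"LCP": 2.0, "INP": 650, "CLS": 0.2})` puts INP at the top of the list, which points to interactivity as the first bottleneck to investigate in step 2.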
Step 2: Identify Your Page Speed Bottlenecks
Is it unoptimized images? Render-blocking JavaScript? Too many plugins? Understanding the issue helps you choose the right solution.
In fact, this is where most of your competitors drop the ball, allowing you to pick it up and outperform their websites on SERPs. For business owners focused on running their company, this often falls to the bottom of the priority list.
Why? Because traditional website speed optimization involves a daunting technical checklist.
This is where purpose-built performance technology can change the game. Rather than piecing together multiple plugins and manually tweaking settings, you get an all-in-one approach that handles the heavy lifting automatically.
The endgame is to remove the complexity from WordPress optimization:
Instant results. For example, upon activation, WP Rocket implements 80% of web performance best practices without requiring any configuration. Page caching, GZIP compression, CSS and JS minification, and browser caching are just a few of the many optimizations that run in the background for you.
No coding required. Advanced features such as lazy-loading images, removing unused CSS, and delaying JavaScript are available via simple toggles.
Built-in compatibility. It’s designed to work with popular themes, plugins, page builders, and WooCommerce.
Performance tracking included. A built-in tool lets you monitor your speed improvements and Core Web Vitals scores without leaving your dashboard.
The goal isn’t to become a performance expert. It’s to have a fast website that supports your business objectives. When optimization happens in the background, you’re free to focus on what you actually do best.
For many, shifting tactics can cause confusion and unnecessary complexity. Utilizing the right technology makes implementing them so much easier and ensures you maximize AI visibility and website revenue.
A three-minute fix can make a huge difference to how your WordPress site performs.
Google released another Core Update to its search algorithm over the holidays. It was the most comprehensive update of 2025.
Google changes its algorithm frequently. Some changes are more widespread than others. Unlike Spam Updates, Core Updates generally do not penalize but, instead, alter how the algorithm treats certain queries and their intent.
For example, a Core Update may result in more “best of” listings (rather than product categories) in search results. Ecommerce sites may lose traffic, but not because of anything they’ve done, so no fix is required.
Yet a Core Update may result in higher rankings for certain types of content, which could prompt merchants to add those pages.
Core Updates can elevate a wide range of queries. The recent holiday update lowered the listings of large publishers and elevated niche sites. Search Engine Journal reported that Macy’s rankings decreased, while those of Columbia, The North Face, and Fragrance Market increased.
Content helpfulness
Google’s infamous Helpful Content algorithm is now part of its Core Updates and can, in theory, target an “unhelpful” site.
Google provides guidelines to human evaluators for what makes content helpful. It’s the best indicator for search optimizers as to Google’s definition of that term. To paraphrase from the guidelines:
Websites should place the most useful portions at the top of a page.
The amount of effort, originality, and skill determines the quality of the content.
Avoid unnecessary fluff or “filler” content that obscures what visitors are looking for.
Use clear titles and headings that inform, not oversell.
If a Core Update resulted in lost traffic, scrutinize your content helpfulness and on-page engagement.
How to recover
It’s often difficult to know why a Core Update lowered a site’s rankings. To diagnose, I typically start with the helpfulness of its pages and its overall engagement.
The first step is always to identify what was lost. Search Console will reveal the impacted queries:
Go to the full “Performance” report.
Choose “Compare” in the “More” filter.
Choose “Custom” and set start and end dates to expose the week before the change (early December for the most recent update) and the week after (beginning of January). Click “Apply.”
Sort the ensuing “Queries” column and the “Clicks Difference” column to see queries that now generate fewer clicks.
Select a before and after date range in Search Console to identify queries that generate fewer organic clicks.
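The same comparison can be reproduced outside Search Console once the two date ranges are exported. Below is a minimal sketch, assuming each period has been reduced to a mapping of query to clicks. The function name and the sample queries are hypothetical, not part of any Search Console export format.

```python
def click_losses(before, after):
    """Return (query, clicks_before, clicks_after, difference) rows for
    queries that lost clicks, with the biggest loss first."""
    rows = []
    for query, clicks_before in before.items():
        # A query missing from the later period counts as zero clicks.
        clicks_after = after.get(query, 0)
        difference = clicks_after - clicks_before
        if difference < 0:
            rows.append((query, clicks_before, clicks_after, difference))
    return sorted(rows, key=lambda row: row[3])


# Hypothetical sample data for the week before and the week after an update.
before = {"wool sweater": 120, "navy sweater": 40, "sweater care": 15}
after = {"wool sweater": 70, "navy sweater": 45, "sweater care": 5}

losses = click_losses(before, after)
for query, was, now, difference in losses:
    print(f"{query}: {was} -> {now} ({difference})")
```

Queries that gained clicks are left out on purpose, so the list only contains candidates for the manual SERP check described next.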
Next, manually search Google for each affected query to determine if results shifted broadly or only for your page. The appearance of many new listings that answer a query in a new way may indicate a broad shift.
Semrush provides monthly snapshots of ranking URLs for each query. Refer to its archive to see how your overall SERPs have changed. If you see a widespread shift (i.e., 80% of listings are new for a given query), there is likely no fix needed. It’s Google changing its algorithm.
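To put a number on "widespread," you can compare two snapshots of the ranking URLs for one query. This is a rough sketch that assumes you have copied the before and after URL lists out of whichever tool you use. The function name and example URLs are hypothetical, and the 80% figure is the rule of thumb from above, not a Google threshold.

```python
def share_of_new_listings(old_urls, new_urls):
    """Fraction of URLs in the newer snapshot that were absent from the older one."""
    if not new_urls:
        return 0.0
    old = set(old_urls)
    fresh = [url for url in new_urls if url not in old]
    return len(fresh) / len(new_urls)


# Hypothetical snapshots: two listings survived, eight are new.
old_snapshot = [f"https://old.example/{i}" for i in range(10)]
new_snapshot = old_snapshot[:2] + [f"https://new.example/{i}" for i in range(8)]

share = share_of_new_listings(old_snapshot, new_snapshot)
print(f"{share:.0%} of listings are new")
```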
If only your site is downranked, examine the impacted pages and look for ways to make them more helpful and engaging, such as:
Move the main portion, such as a quick answer to a search query, to the top.
Improve page structure and subheadings.
Remove ads, such as intrusive pop-ups, that block users from interacting with a page.
Add jump-to links that help visitors navigate the page.
Include social proof on the page.
Show the author’s name and bio.
Link to trusted sources.
Add helpful images and videos.
Update the page with recent data, trends, and stats (with sources).
Add explanatory sections, such as FAQs and definitions, tailored to the page’s purpose.
Helpfulness is subjective and vague. Nonetheless, consider your target audience and tailor your content accordingly.
Google announces only substantial Core Updates, those that affect many users. Lesser, unannounced updates occur more often and can result in recoveries.
Google’s John Mueller recently answered a question about phantom noindex errors reported in Google Search Console. Mueller asserted that these reports may be real.
Noindex In Google Search Console
A noindex robots directive is one of the few commands that Google must obey, one of the few ways that a site owner can exercise control over Googlebot, Google’s indexer.
And yet it’s not uncommon for Search Console to report that it cannot index a page because of a noindex directive when the page seemingly has none, at least none that is visible in the HTML code.
When Google Search Console (GSC) reports “Submitted URL marked ‘noindex’,” it is reporting a seemingly contradictory situation:
The site asked Google to index the page via an entry in a Sitemap.
The page sent Google a signal not to index it (via a noindex directive).
It’s a confusing message: Search Console says a page is preventing Google from indexing it, yet the publisher or SEO cannot see anything at the code level that would do so.
“For the past 4 months, the website has been experiencing a noindex error (in ‘robots’ meta tag) that refuses to disappear from Search Console. There is no noindex anywhere on the website nor robots.txt. We’ve already looked into this… What could be causing this error?”
Noindex Shows Only For Google
Google’s John Mueller answered the question, sharing that there was always a noindex showing to Google on the pages he has examined where this kind of thing was happening.
Mueller responded:
“The cases I’ve seen in the past were where there was actually a noindex, just sometimes only shown to Google (which can still be very hard to debug). That said, feel free to DM me some example URLs.”
While Mueller didn’t elaborate on the possible causes, there are ways to troubleshoot the issue and find out what’s going on.
How To Troubleshoot Phantom Noindex Errors
It’s possible that code somewhere is causing a noindex to show just for Google. For example, a page may once have carried a noindex, and a server-side cache (like a caching plugin) or a CDN (like Cloudflare) may have cached the HTTP headers from that time. The old noindex header would then be shown to Googlebot (because it frequently visits the site) while the site owner is served a fresh version.
Checking the HTTP headers is easy. There are many HTTP header checkers, such as those at KeyCDN and SecurityHeaders.com.
A 520 server header response code is one that’s sent by Cloudflare when it’s blocking a user agent.
Screenshot: 520 Cloudflare Response Code
Below is a screenshot of a 200 server response code generated by Cloudflare:
Screenshot: 200 Server Response Code
I checked the same URL using two different header checkers. One returned a 520 (blocked) server response code and the other returned a 200 (OK) response code. That shows how differently Cloudflare can respond to something like a header checker. Ideally, try several header checkers to see if there’s a consistent 520 response from Cloudflare.
In the situation where a web page is showing something exclusively to Google that is otherwise not visible to someone looking at the code, what you need to do is to get Google to look at the page for you using an actual Google crawler and from a Google IP address. The way to do this is by dropping the URL into Google’s Rich Results Test. Google will dispatch a crawler from a Google IP address and if there’s something on the server (or a CDN) that’s showing a noindex, this will catch it. In addition to the structured data, the Rich Results test will also provide the HTTP response and a snapshot of the web page showing exactly what the server shows to Google.
When you run a URL through the Google Rich Results Test, the request:
Originates from Google’s Data Centers: The bot uses an actual Google IP address.
Passes Reverse DNS Checks: If the server, security plugin, or CDN checks the IP, it will resolve back to googlebot.com or google.com.
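For context, here is roughly what such a reverse DNS check looks like from the server’s side. This is a sketch only: the function names are hypothetical, the accepted domains are just the two named above, and real security plugins or CDNs may apply different or additional rules.

```python
import socket

# Domains named above; treat this list as an assumption, not a complete one.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def hostname_is_google(hostname):
    """True if the hostname sits under one of the accepted Google domains."""
    return hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES)


def ip_passes_reverse_dns(ip):
    """Reverse-resolve the IP, check the domain, then forward-resolve the
    hostname and confirm it maps back to the same IP."""
    try:
        hostname = socket.gethostbyaddr(ip)[0]
    except OSError:
        return False
    if not hostname_is_google(hostname):
        return False
    try:
        forward_ips = {info[4][0] for info in socket.getaddrinfo(hostname, None)}
    except OSError:
        return False
    return ip in forward_ips
```

The forward lookup matters because anyone can set a reverse DNS record that claims to be Google. A spoofed user agent sent from your own machine would fail this check, which is why a test that really originates from Google is useful.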
If the page is blocked by noindex, the tool will be unable to provide any structured data results. It should provide a status saying “Page not eligible” or “Crawl failed”. If you see that, click the “View Details” link or expand the error section. It should show something like “Robots meta tag: noindex” or “‘noindex’ detected in ‘robots’ meta tag”.
This approach does not send the Googlebot user agent; it uses the Google-InspectionTool/1.0 user agent string. Because the request still comes from a Google IP address, this method will catch a noindex that the server applies by IP address.
Another possibility is a rogue noindex that is written specifically to target the Googlebot user agent. To catch that, spoof (mimic) the Googlebot user agent string with Google’s own User Agent Switcher extension for Chrome, or configure an app like Screaming Frog to identify itself as Googlebot.
Screenshot: Chrome User Agent Switcher
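The header check and the user agent spoof can also be scripted. The sketch below uses only the Python standard library. The helper names are hypothetical, and a request sent this way still comes from your own IP address, so it only catches a noindex that is keyed to the user agent string, not one keyed to Google’s IP ranges.

```python
import re
import urllib.request

# Googlebot's published desktop user agent token.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

# Matches <meta name="robots" ...> and <meta name="googlebot" ...> tags.
META_ROBOTS = re.compile(
    r'<meta[^>]+name=["\'](?:robots|googlebot)["\'][^>]*>', re.IGNORECASE
)


def find_noindex(headers, html):
    """Return the places (header and/or meta tag) where a noindex appears."""
    found = []
    for name, value in headers.items():
        if name.lower() == "x-robots-tag" and "noindex" in value.lower():
            found.append("X-Robots-Tag header")
    for tag in META_ROBOTS.findall(html):
        if "noindex" in tag.lower():
            found.append("robots meta tag")
    return found


def fetch_as(url, user_agent=GOOGLEBOT_UA):
    """Fetch a URL with the given user agent and return (headers, html)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        html = response.read().decode("utf-8", errors="replace")
        return dict(response.headers.items()), html
```

A typical use would be to call `find_noindex(*fetch_as(url))` once with the default user agent and once with a regular browser string, then compare the two results. If a noindex shows up only for the Googlebot string, something on the server or CDN is treating that user agent differently.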
Phantom Noindex Errors In Search Console
These kinds of errors can feel like a pain to diagnose, but before you throw your hands up in the air, take some time to see if any of the steps outlined here help identify the hidden cause of the issue.
A common question among search optimizers is whether a “404” HTTP status code conveys negative ranking signals for the site as a whole.
The answer is yes, but indirectly.
Impact of 404s
For starters, a 404 error is not a direct ranking signal. Broken links or deleted pages do not impact sitewide rankings in Google search results. Former Google Webmaster Trends Analyst Susan Moskwa confirmed this in 2011. She called 404 errors a natural occurrence on the web, one that search engines are aware of.
She also stated that Google prefers 404 status codes (or 410s for pages intentionally removed) because they clearly inform that the page is unavailable.
Google’s Search Console guidelines also address 404s, stating they “won’t impact your site’s search performance.”
Google’s John Mueller recently confirmed this on Reddit (“johnmu”): “Just to be clear: 404s/410s are not a negative quality signal. It’s how the web is supposed to work.”
Yet 404 status codes can result in a loss of organic rankings through other signals:
Poor usability. Clicking a broken link is a poor user experience, which can prompt visitors to abandon a site. Clicks and engagement are Google ranking factors. Visitors who land on a site and quickly leave suggest to Google that they are dissatisfied.
Loss of link equity. Internal and external links to deleted pages pass no link equity.
Detecting 404s
Hence detecting and fixing broken links and deleted pages is a key step in diagnosing organic traffic drops. I typically use three methods: Search Console, Google Analytics, and third-party tools.
Search Console
Search Console’s “Pages” report includes unindexed URLs and the reasons, such as 404 and 410 status codes. Review the list and confirm:
You removed the pages intentionally.
No internal links point to those pages. To verify, click the 404 page in the list, then “Inspect URL” to the right for referring sitemaps or pages.
Google Analytics
First, note the default title of your 404 pages. Load a meaningless, non-existent URL on your site, such as yoursite.com/iuyhtgf, and view the page title. (One way is to bookmark the page, then view the title in “Edit bookmark” or similar.)
In my case, it’s “404 – Page Not Found.”
View the page title, such as “404 – Page Not Found” in this example.
Next, go to the “Pages and screens: Landing page” report in Google Analytics:
Keep the primary dimension as “Page title and screen name.”
Add a secondary dimension “Page URL.”
Search for your 404 page title.
Third-party tools
Platforms such as Ahrefs and Semrush can identify external links pointing to error pages on your site. Access Semrush’s tool in the “Backlink Audit” section:
Enter your domain.
Go to the “Backlink Audit” in the right-hand panel.
Click “Indexed pages.”
Check the box for “Broken links.”
The Semrush report shows the number of domains linking to each broken page. Ahrefs’ report is similar.
Web crawlers such as Screaming Frog can identify broken internal links.
301-redirect the link to another internal page. Google will pass link equity via a 301 only if the destination page’s content is identical or very similar to the deleted version.
Don’t mass-redirect all 404 pages to the home page or an unrelated page. It’s a poor user experience because visitors were expecting different content.
Instead, optimize 404s by redirecting to similar pages, or do not redirect at all and encourage visitors on the 404 page to use internal search.
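The redirect policy described above can be expressed as a small lookup. This is a sketch under stated assumptions: the paths and the function name are hypothetical, and the map has to be built by hand by someone who knows which remaining page is genuinely similar to each deleted one.

```python
def resolve_deleted_url(path, redirect_map):
    """Return (status, target) for a deleted URL.

    Only paths with a hand-picked, closely related destination get a 301.
    Everything else stays a 404 instead of being sent to the home page.
    """
    target = redirect_map.get(path)
    if target:
        return 301, target
    return 404, None


# Hypothetical map of deleted pages to their closest equivalents.
redirect_map = {
    "/navy-wool-sweater-2023.html": "/navy-wool-sweater.html",
}

print(resolve_deleted_url("/navy-wool-sweater-2023.html", redirect_map))
print(resolve_deleted_url("/discontinued-gift-guide.html", redirect_map))
```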
Magento, now officially Adobe Commerce (but still known as Magento among SEOs), remains a powerful but demanding ecommerce platform, especially in the Magento 2 era.
Adobe Commerce can deliver strong organic performance when built and optimized correctly, but it requires careful attention to technical SEO, site speed, and structured data. This guide outlines key Magento/Adobe Commerce SEO challenges and how to set your store up for long-term success in 2026.
It offers deep flexibility, strong product catalog capabilities, and enterprise-grade customization, which is why major brands like RadioShack and The National Gallery still rely on it today. But technical SEO on Magento can be challenging if the build, theme, and extensions are not handled with care.
Modern Magento builds must go beyond legacy SEO thinking. Alongside fundamentals like crawl efficiency and URL handling, store owners now need to consider Core Web Vitals, mobile-first indexing, structured data for product discovery, and visibility in AI-driven search experiences.
Magento can perform very well for organic search when implemented correctly, but out of the box, it is not optimized. Many of the known issues are fixable with the proper development and SEO process in place.
General Magento/Adobe Commerce SEO Issues
Magento 2 can deliver strong performance, but it requires the right hosting stack, theme, and caching setup. In a mobile-first and Core Web Vitals world, speed and stability are not optional, as they influence rankings, conversion rates, and how efficiently Google can crawl your store.
To build a fast Magento site, focus on solid hosting, full-page caching, Varnish, and Redis. Reducing JavaScript bloat from extensions, compressing images into modern formats like WebP or AVIF, and lazy-loading heavy assets also help keep load times low. Regular audits in tools like Lighthouse and PageSpeed Insights make sure you stay aligned with Core Web Vitals.
Crawl efficiency is another important consideration. Your mobile version needs to load all core content and links, since Google now uses mobile crawling. It helps to maintain a clear category structure and server-side rendering for critical templates, so search engines can discover and interpret key content easily. Log file analysis is also useful to understand what Googlebot sees and where it may be wasting crawl budget.
On the infrastructure side, a CDN helps with serving assets quickly globally, while running PHP 8+ and MySQL 8 ensures stronger performance and security. Server-side caching layers further support speed and consistency.
Magento/Adobe Commerce Site Speed Issues
In my experience, Magento sites often become slow due to heavy themes and unnecessary extensions. Always question whether a module is needed and consider its impact on JavaScript and DOM complexity.
A slow site can cost both traffic and sales. Faster sites convert better and get crawled more frequently by Google.
There is more to performance than these points, but focusing on hosting, caching, and efficient rendering gives your store a strong foundation.
Key priorities to keep in mind:
Optimize hosting and caching so your store responds quickly at all times, especially during peak traffic.
Minimize JavaScript and extension load to reduce render delays and maintain healthy Core Web Vitals.
Ensure content and navigation are fully accessible on mobile, supporting crawl efficiency and user experience.
Use modern image formats and lazy loading to keep pages fast, even with rich visuals.
Common Magento/Adobe Commerce Product SEO Issues
Modern Magento product SEO is not just about fixing duplication. The goal is to help search engines understand products as entities, scale content efficiently, and support both shoppers and AI-driven discovery.
Simple Vs. Configurable Products
Configurable products should hold the primary authority. Simple SKUs used for variations (color, size, style) should:
Canonicalize to the parent configurable product.
Avoid indexation unless they serve a unique search intent (rare cases).
Carry structured data that matches the parent.
This prevents duplicate content, consolidates ranking signals, and aligns with Google’s preference for primary entity pages.
You need to ensure canonicals are server-side rendered and not dependent on JavaScript.
Product Titles & On-Page Content
Magento’s default document titles are still too generic and leave a lot of room for optimization.
Your title strategy should scale but remain meaningful. For scale, it’s easy enough to use a template (such as the one below), and then modify key pages with bespoke titles as needed.
[Type] [Key Attribute] [Brand] [Variant]
Which would generate a title like: Men’s Navy Wool Sweater – Medium
For larger catalogs, you should:
Set smart naming conventions.
Optimize high-value SKUs manually.
Avoid keyword stuffing; clarity wins.
Header Tags
Header tags shape how both users and search engines understand a product page. I’ve seen Magento themes sometimes misuse or duplicate headers, which weakens content structure and confuses algorithms trying to interpret page hierarchy.
A clean structure with one H1 for the product name helps search engines immediately identify the primary entity.
Supporting sections such as details, reviews, and shipping information should use H2 tags so AI systems and Google can parse the content into digestible, meaningful blocks. A structured hierarchy also improves accessibility and user experience, which, in turn, supports better engagement metrics and signals that modern AI‑enhanced ranking systems increasingly factor in.
Structured Data
Magento provides a basic layer of schema, but modern search and AI systems require richer, more complete product data. Enhanced product schema gives search engines precise information about what the product is, how it is priced, whether it is in stock, and what other attributes define it.
This matters because AI‑driven search experiences rely heavily on structured data to understand products as entities. Including fields like brand, GTIN, SKU, material, and size helps AI classify products correctly and match them to user intent.
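As an illustration, the fields listed above map onto schema.org’s Product type roughly as follows. This is a minimal sketch, not Magento’s built-in output: the function name and the sample product are hypothetical, and a production template would pull these values from catalog attributes.

```python
import json


def product_jsonld(name, sku, gtin, brand, material, size, price, currency, in_stock):
    """Build a schema.org Product object as a plain dictionary."""
    availability = (
        "https://schema.org/InStock" if in_stock else "https://schema.org/OutOfStock"
    )
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "sku": sku,
        "gtin": gtin,
        "brand": {"@type": "Brand", "name": brand},
        "material": material,
        "size": size,
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",
            "priceCurrency": currency,
            "availability": availability,
        },
    }


# Hypothetical product values for illustration only.
data = product_jsonld(
    name="Navy Wool Sweater",
    sku="SW-NAVY-M",
    gtin="00012345678905",
    brand="Example Brand",
    material="Wool",
    size="M",
    price=59,
    currency="USD",
    in_stock=True,
)
print(json.dumps(data, indent=2))
```

The printed JSON would go inside a `script` tag of type `application/ld+json` in the server-rendered page, so it is available without JavaScript execution.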
AI‑driven search platforms evaluate far more than keywords. They look for clarity, completeness, and consumer‑ready information. Systems like Google AI Overviews and Perplexity need clean specifications, descriptive language, review signals, and trustworthy product attributes to surface a product confidently. When Magento product pages include well‑structured attributes, FAQs, clear specifications, and real user feedback, AI models can better determine relevance and usefulness. This not only improves visibility in AI‑generated summaries but also increases the likelihood of being surfaced as a recommended product in conversational search flows.
Product URLs
The structure of your product URLs plays a significant role in crawl efficiency and clarity.
Magento allows both category‑based URLs and top‑level product URLs, but the latter is usually better for SEO and AI systems. Clean, stable URLs reduce duplication and consolidate ranking signals into one definitive version of the page. When URLs change based on category paths, search engines may split authority across multiple versions or waste crawl budget on unnecessary duplicates.
For AI search systems, predictable URLs make it easier to associate product data with a single entity across the web. Using top‑level URLs, supported by strong internal linking and accurate canonicals, helps ensure that both search engines and AI models reference the correct version of the product page.
Faceted Navigation & Crawl Efficiency
Filters can easily create thousands of thin or duplicate pages. In 2026, the focus is not on hiding parameters but managing crawl paths and preserving essential category pages.
Use “noindex, follow” on filter pages if they must be accessible.
Avoid infinite combinations of parameters.
The old Google Search Console parameter tool is deprecated. Handle parameters via:
Robots rules (carefully).
Canonical tags.
Internal linking logic.
Smart sitemap control.
AJAX filtering is fine if you pair it with crawlable fallback links.
Ensure stateful URLs exist for important filtered views (e.g., size filters for apparel).
Product filters should not hide useful product attributes. AI systems still need to understand sizing, material, price, and availability.
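One way to turn the rules above into template logic is a function that decides the robots meta value for a category URL. This is a sketch with assumed names: the filter parameters, the choice to keep single size filters indexable, and the function itself are illustrative rather than Magento defaults.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical layered-navigation parameters for an apparel store.
FILTER_PARAMS = {"color", "size", "price", "material"}
# Filtered views considered important enough to stay indexable (assumption).
INDEXABLE_FILTERS = {"size"}


def robots_for(url):
    """Return the robots meta content for a category or filtered URL."""
    params = set(parse_qs(urlsplit(url).query))
    active = params & FILTER_PARAMS
    if not active:
        # Plain category pages (and non-filter parameters) stay indexable.
        return "index, follow"
    if len(active) == 1 and active <= INDEXABLE_FILTERS:
        # A single important filter keeps its own stateful, indexable URL.
        return "index, follow"
    # Every other filter combination is crawlable but kept out of the index.
    return "noindex, follow"


print(robots_for("https://example.com/sweaters?size=m"))
print(robots_for("https://example.com/sweaters?color=navy&size=m"))
```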
URL Rewrites & Duplicate Paths
Magento’s rewrite system is powerful but prone to creating duplicates if not managed carefully. Duplicate URLs dilute authority, create confusion for crawlers, and introduce unnecessary complexity for AI systems that depend on consistent signals. Issues like category paths reappearing, /catalog/ versions resurfacing, or numbered duplicates often stem from misconfigured rewrites or bulk product imports.
For SEO, duplicate paths waste crawl budget and risk indexing low‑quality or unintentional versions of a product page. For AI‑driven search, inconsistency makes it harder to map attributes, reviews, and pricing data to the correct canonical product. Regular rewrite table audits, strict redirect rules, and blocking system paths ensure search engines see only the correct version. Clean, predictable URL behavior is essential for long‑term organic stability.
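A rewrite audit can start from a flat list of request paths, for example from a crawl or an export of the rewrite table. The sketch below flags the two patterns mentioned above. The function name and sample paths are hypothetical, and a numbered path is only flagged when the unnumbered version also exists, so legitimate URLs that end in a number are left alone.

```python
import re

# Captures paths such as /navy-wool-sweater-1.html or /navy-wool-sweater-2.
NUMBERED = re.compile(r"^(?P<base>.+)-\d+(?P<ext>\.html)?$")


def audit_paths(paths):
    """Return {path: reason} for paths that look like unintended duplicates."""
    known = set(paths)
    flagged = {}
    for path in paths:
        if path.startswith("/catalog/"):
            flagged[path] = "system path"
            continue
        match = NUMBERED.match(path)
        if match:
            original = match.group("base") + (match.group("ext") or "")
            if original in known:
                flagged[path] = f"numbered duplicate of {original}"
    return flagged


# Hypothetical sample of crawled paths.
paths = [
    "/navy-wool-sweater.html",
    "/navy-wool-sweater-1.html",
    "/catalog/product/view/id/42",
    "/trail-shoe-2.html",
]
flagged = audit_paths(paths)
for path, reason in flagged.items():
    print(path, "->", reason)
```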
Pagination
With Google no longer using rel=next/prev, the emphasis has shifted toward clarity, crawlability, and consistent content signals. Each paginated page should have its own unique title and H1, so search engines understand that these pages represent distinct sections of a category, not duplicates. Self‑canonicals prevent incorrect consolidation while still allowing discovery of deeper product listings.
For infinite scroll implementations, providing paginated fallbacks ensures that Google and AI systems can access all products, not just the first batch loaded on scroll. Proper pagination protects category visibility, prevents orphaned products, and ensures that AI models can access a full and accurate representation of your catalog.
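The pagination rules above (unique title and H1 per page, plus a self-referencing canonical) can be sketched as a small helper. The function name, the label format, and the `p` query parameter are assumptions for illustration, so adjust them to match how your theme builds paginated URLs.

```python
def pagination_meta(category_name, base_url, page):
    """Return the title, H1, and self-canonical for one page of a category."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    if page == 1:
        # The first page is the category URL itself.
        return {"title": category_name, "h1": category_name, "canonical": base_url}
    label = f"{category_name} - Page {page}"
    return {
        "title": label,
        "h1": label,
        # Each deeper page points to itself, not back to page one.
        "canonical": f"{base_url}?p={page}",
    }


meta = pagination_meta("Wool Sweaters", "https://example.com/wool-sweaters", 3)
print(meta)
```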
Adobe Commerce Page Builder (ACP)
Adobe Commerce Page Builder (ACP) is becoming a central part of the Adobe Commerce ecosystem, offering a more streamlined and flexible way to manage content.
ACP could play a growing role in SEO and AI visibility because it supports cleaner markup, more consistent content structures, and modular components that search engines can interpret more easily. As AI-driven systems rely more on structured, reliable, and semantically organized content, ACP provides a foundation for producing consistent product pages, category templates, and merchandising blocks.
For merchants, ACP also reduces the risk of layout bloat since content blocks are standardized and optimized. This helps with performance and keeps Core Web Vitals healthy, a key factor for rankings and AI-driven discovery. Even if you cannot access ACP yet, planning for its adoption ensures your store remains future-ready.
Agentic Commerce Protocol (ACP)
The Agentic Commerce Protocol (ACP) is one of the most important emerging technologies for ecommerce, and Adobe Commerce merchants should begin preparing for its impact. Unlike Page Builder, this ACP refers to a protocol designed to help AI agents interact directly with online stores.
Screenshot from acp-magento.com, November 2025
AI agents, such as those in ChatGPT, Perplexity, and Google’s agentic systems, will increasingly handle tasks like comparing products, checking availability, and completing purchases on behalf of users. The Agentic Commerce Protocol standardizes how product data, pricing, stock status, and checkout operations are communicated to these agents.
For SEO and AI visibility, ACP represents a major shift. It allows AI systems to:
Access real‑time product information.
Evaluate product suitability based on user needs.
Complete actions such as adding products to carts.
Ensure product data is accurate and trustworthy.
For merchants, adopting ACP in the future means your products can be recommended, compared, and purchased through AI interfaces, not only through traditional search. Stores that implement ACP early will have a competitive advantage in AI‑driven discovery, especially as conversational shopping becomes normalized.
Even though ACP adoption is still developing, Adobe Commerce teams should begin preparing for:
Cleaner, more structured product data.
Accurate, machine‑readable availability and pricing.
Consistent taxonomy and attributes.
Technical readiness for API exposure and agent interactions.
Magento/Adobe Commerce can be a high-performing platform for search when built with technical SEO in mind. The key is to set strong foundations: efficient crawl paths, fast performance, structured data, and clarity for both users and AI systems.
Whether you are building yourself or working with developers, use this guide as a framework to ensure your Adobe Commerce store is set up for organic success in 2026 and beyond.