Cloudflare’s New Markdown for AI Bots: What You Need To Know via @sejournal, @MattGSouthern

Cloudflare launched a feature that converts HTML pages to markdown when AI systems request it. Sites on its network can now serve lighter content to bots without building separate pages.

The feature, called Markdown for Agents, works through HTTP content negotiation. An AI crawler sends a request with Accept: text/markdown in the header. Cloudflare intercepts it, fetches the original HTML from the origin server, converts it to markdown, and delivers the result.
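
To make the mechanism concrete, here is a minimal sketch in Python of how a client could request the markdown representation of a page. The URL is a placeholder, and whether markdown actually comes back depends on the zone having the feature enabled; clients that omit the Accept header simply receive the normal HTML.

```python
import requests

# Hypothetical URL; any page proxied by a zone with Markdown for Agents enabled.
url = "https://example.com/blog/some-post"

# Ask for markdown via standard HTTP content negotiation.
resp = requests.get(url, headers={"Accept": "text/markdown"})

content_type = resp.headers.get("Content-Type", "")
if "text/markdown" in content_type:
    print("Received markdown:")
    print(resp.text[:500])
else:
    # The feature may not be enabled for this zone, so the origin HTML comes back.
    print(f"Received {content_type or 'an unknown content type'} instead of markdown.")
```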

The launch arrives days after Google’s John Mueller called the idea of serving markdown to AI bots “a stupid idea” and questioned whether bots can even parse markdown links properly.

What’s New

Cloudflare described the feature as treating AI agents as “first-class citizens” alongside human visitors. The company used its own blog post as an example. The HTML version consumed 16,180 tokens while the markdown conversion used 3,150 tokens.

“Feeding raw HTML to an AI is like paying by the word to read packaging instead of the letter inside,” the company wrote.

The conversion happens at Cloudflare’s edge network, not at the origin server. Websites enable it per zone through the dashboard, and it’s available in beta at no additional cost for Pro, Business, and Enterprise plan customers, plus SSL for SaaS customers.

Cloudflare noted that some AI coding tools already send the Accept: text/markdown header. The company named Claude Code and OpenCode as examples.

Each converted response includes an x-markdown-tokens header that estimates the token count of the markdown version. Developers can use this to manage context windows or plan chunking strategies.
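
As a rough sketch of how that header could be used, the snippet below requests the same placeholder page as above, reads x-markdown-tokens, and estimates how many chunks a given context budget would need. The per-chunk budget is an arbitrary illustrative number, not a Cloudflare value.

```python
import math

import requests

# Same placeholder URL as in the earlier example.
resp = requests.get(
    "https://example.com/blog/some-post",
    headers={"Accept": "text/markdown"},
)

# x-markdown-tokens carries Cloudflare's estimate of the markdown token count.
token_header = resp.headers.get("x-markdown-tokens")

if token_header is not None:
    estimated_tokens = int(token_header)
    per_chunk_budget = 8_000  # arbitrary budget chosen for illustration
    chunks_needed = math.ceil(estimated_tokens / per_chunk_budget)
    print(f"~{estimated_tokens} tokens; plan for {chunks_needed} chunk(s)")
else:
    print("No token estimate header; the response may not have been converted")
```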

Content-Signal Defaults

Converted responses include a Content-Signal header set to ai-train=yes, search=yes, ai-input=yes by default, signaling the content can be used for AI training, search use, and AI input (including agentic use). Whether a given bot honors those signals depends on the bot operator. Cloudflare said the feature will offer custom Content-Signal policies in the future.

The Content Signals framework, which Cloudflare announced during Birthday Week 2025, lets site owners set preferences for how their content gets used. Enabling markdown conversion also applies a default usage signal, not just a format change.

How This Differs From What Mueller Criticized

Mueller was criticizing a different practice. Some site owners build separate markdown pages and serve them to AI user agents through middleware. Mueller raised concerns about cloaking and broken linking, and questioned whether bots could even parse markdown properly.

Cloudflare’s feature uses a different mechanism. Instead of detecting user agents and serving alternate pages, it relies on content negotiation. The same URL serves different representations based on what the client requests in the header.

Mueller’s comments addressed user-agent-based serving, not content negotiation. In a Reddit thread about Cloudflare’s feature, Mueller responded with the same position. He wrote, “Why make things even more complicated (parallel version just for bots) rather than spending a bit of time improving the site for everyone?”

Google defines cloaking as showing different content to users and search engines with the intent to manipulate rankings and mislead users. The cloaking concern may apply differently here. With user-agent sniffing, the server decides what to show based on who’s asking. With content negotiation, the client requests a format and the server responds. The content is the same information in a different format, not different content for different visitors.

The practical result is still similar from a crawler’s perspective. Googlebot requesting standard HTML would see a full webpage. An AI agent requesting markdown would see a stripped-down text version of the same page.

New Radar Tracking

Cloudflare also added content type tracking to Cloudflare Radar for AI bot traffic. The data shows the distribution of content types returned to AI agents and crawlers, broken down by MIME type.

You can filter by individual bot to see what content types specific crawlers receive. Cloudflare showed OAI-SearchBot as an example, displaying the volume of markdown responses served to OpenAI’s search crawler.

The data is available through Cloudflare’s public APIs and Data Explorer.

Why This Matters

If you already run your site through Cloudflare, you can enable markdown conversion with a single toggle instead of building separate markdown pages.

Enabling Markdown for Agents also sets the Content-Signal header to ai-train=yes, search=yes, ai-input=yes by default. Publishers who have been careful about AI access to their content should review those defaults before toggling the feature on.

Looking Ahead

Cloudflare said it plans to add custom Content-Signal policy options to Markdown for Agents in the future.

Mueller’s criticism focused on separate markdown pages, not on standard content negotiation. Google hasn’t addressed whether serving markdown through content negotiation falls under its cloaking guidelines.

The feature is opt-in and limited to paid Cloudflare plans. Review the Content-Signal defaults before enabling it.

Google Clarifies Its Stance On Campaign Consolidation via @sejournal, @brookeosmundson

In a recent episode of Google’s Ads Decoded podcast, Ginny Marvin sat down with Brandon Ervin, Director of Product Management for Search Ads, to address a topic many PPC marketers have strong opinions about: campaign and ad group consolidation.

Ervin, who oversees product development across core Search and Shopping ad automation, including query matching, Smart Bidding, Dynamic Search Ads, budgeting, and AI-driven systems, made one thing clear.

Consolidation is not the end goal. Equal or better performance with less granularity is.

What Was Said

During the discussion, Ervin acknowledged that many legacy account structures were built with good reason.

“What people were doing before was quite rational,” he said.

For years, granular campaign builds gave advertisers control. Match type segmentation, tightly themed ad groups, layered bidding strategies, and regional splits all made sense in a manual or semi-automated environment.

But according to Ervin, the rise of Smart Bidding and AI has shifted that dynamic.

“The big shift we’ve seen with the rise of Smart Bidding and AI, the machine in general can do much better than most humans. Consolidation is not necessarily the goal itself. This evolution we’ve gone through allows you to get equal or better performance with a lot less granularity.”

In other words, the structure that once helped performance may now be limiting it.

Ervin also pushed back on the idea that consolidation means losing control.

“Control still exists,” he said. “It just looks different than it did before.”

Ginny Marvin described it as a “mindset shift.”

When Segmentation Still Makes Sense

Despite Google’s push toward leaner account structures, Ervin did not suggest collapsing everything into one campaign.

Segmentation still makes sense when it reflects how a business actually operates.

Examples he shared included:

  • Distinct product lines with separate budgets and bidding goals
  • Different business objectives that require their own targets or reporting
  • Regional splits if that mirrors how the company runs operations

The key distinction is intent. If structure supports real budget decisions, reporting requirements, or operational differences, it belongs. If it exists only because that was the best practice five years ago, it may be creating more friction than value.

Ervin also addressed a common concern: how do you know when you’ve consolidated enough?

His benchmark was 15 conversions over a 30-day period. Those conversions do not need to come from a single campaign. Shared budgets and portfolio bidding strategies can aggregate conversion data across campaigns to meet that threshold.
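
As a rough illustration of that benchmark, the sketch below aggregates 30-day conversions across campaigns that share a hypothetical portfolio bid strategy and checks them against the 15-conversion threshold Ervin mentioned. The data structure and numbers are invented for the example.

```python
from collections import defaultdict

# Hypothetical 30-day conversion counts, keyed by (portfolio strategy, campaign).
conversions_30d = {
    ("portfolio_a", "brand_exact"): 6,
    ("portfolio_a", "brand_broad"): 5,
    ("portfolio_a", "generic_terms"): 7,
    ("portfolio_b", "regional_test"): 4,
}

THRESHOLD = 15  # Ervin's benchmark: 15 conversions over a 30-day period

totals = defaultdict(int)
for (portfolio, _campaign), conversions in conversions_30d.items():
    totals[portfolio] += conversions

for portfolio, total in totals.items():
    status = "meets" if total >= THRESHOLD else "falls below"
    print(f"{portfolio}: {total} conversions ({status} the 15/30-day benchmark)")
```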

If your campaign or ad group segmentation dilutes learning and slows down bidding models, it may be time to rethink your structure.

Why This Matters

For many PPC professionals, granularity has long been associated with expertise. Highly segmented accounts, tightly themed ad groups, and cautious use of broad match were once signs of disciplined management.

In earlier versions of Google Ads, that level of control often made a measurable difference.

I used to build accounts that way, too. When I managed highly competitive and seasonal ecommerce brands, SKAG structures were common practice for good reason. They were a way to better control budget for high-volume, generic terms that performed differently than niche, long-tail terms.

What has changed my mindset is not the importance of structure, but the role it plays in my accounts. As Smart Bidding and automation have matured, I have seen firsthand how legacy segmentation can dilute data and slow down learning.

In several accounts where consolidation was tested thoughtfully, performance stabilized and, in some cases, improved, especially in accounts with low overall conversion volume. What I thought was a perfectly built account structure was actually limiting performance because I was trying to spread budget and conversion volume too thin.

After a few months of poor performance, I was essentially “forced” to test out a simpler campaign structure and let go of old habits.

Was it uncomfortable? Absolutely. When you’ve been doing PPC for years (think back to when Google Shopping was first free!), you’re essentially unlearning years of ‘best practices’ and having to learn a new way of managing accounts.

That does not mean consolidation is always the answer. It does suggest that structure should be tied directly to business logic, not inherited from best practices that were built for a different version of the platform.

Looking Ahead

If you’re in the camp of needing to start consolidating campaigns or ad groups, know that these large structural changes should not happen overnight.

For many teams, especially those managing complex accounts, restructuring can carry risk and large volatility spikes if it is done too aggressively.

A more measured approach may make sense. Start by identifying splits that clearly align with budgets, reporting requirements, or business priorities. Then evaluate the ones that exist primarily because they were once considered best practice.

In some cases, consolidation may unlock stronger data signals and steadier bidding. In others, maintaining separation may still be justified. The key is being intentional about the reason each layer exists.

WP Engine Complaint Adds Unredacted Allegations About Mullenweg Plan via @sejournal, @martinibuster

WP Engine recently filed its third amended complaint against WordPress co-founder Matt Mullenweg and Automattic, which includes newly unredacted allegations that Mullenweg identified ten companies to pursue for licensing fees and contacted a Stripe executive in an effort to persuade Stripe to cancel contracts and partnerships with WPE.

Mullenweg And “Nuclear War”

The defendants argued that Mullenweg did not use the phrase “nuclear war.” However, documents they produced show that he used the phrase in a message describing his response to WP Engine if it did not comply with his demands.

The footnote states:

“During the recent hearing before this Court, Defendants represented that “we have seen over and over again ‘nuclear war’ in quotes,” but Mullenweg “didn’t say it” and it “[d]idn’t happen.” August 28, 2025 Hrg. Tr. at 33. According to Defendants’ counsel, Mullenweg instead only “refers to nuclear,” not “nuclear war.””

While WPE alleges that both threats are abhorrent and wrongful, reflecting a distinction without a difference, documents recently produced by Defendants confirm that in a September 13, 2024 message sent shortly before Defendants launched their campaign against WPE, Mullenweg declared “for example with WPE . . . [i]f that doesn’t resolve well it’ll look like all-out nuclear war[.]”

Email From Matt Mullenweg To A Stripe Executive

Another newly unredacted detail is an email from Matt Mullenweg to a Stripe executive in which he asked Stripe to “cancel any contracts or partnerships with WP Engine.” Stripe is a financial infrastructure platform that enables companies to accept credit card payments online.

The new information appears in the third amended complaint:

“In a further effort to inflict harm upon WPE and the market, Defendants secretly sought to strongarm Stripe into ceasing any business dealings with WPE. Shocking documents Defendants recently produced in discovery reveal that in mid-October 2024, just days after WPE brought this lawsuit, Mullenweg emailed a Stripe senior executive, insisting that Stripe “cancel any contracts or partnerships with WP Engine,” and threatening, “[i]f you chose not to do so, we should exit our contracts.””

“Destroy All Competition”

In paragraphs 200 and 202, WP Engine alleges that Defendants acknowledged having the power to “destroy all competition” and were seeking contributions that benefited Automattic rather than the WordPress.org community. WPE argues that Mullenweg abused his roles as the head of a nonprofit foundation, the owner of critical “dot-org” infrastructure, and the CEO of a for-profit competitor, Automattic.

These paragraphs appear intended to support WP Engine’s claim that the “Five for the Future” program and other community-oriented initiatives were used as leverage to pressure competitors into funding Automattic’s commercial interests. The complaint asserts that only a monopolist could make such demands and successfully coerce competitors in this manner.

Here are the paragraphs:

“Indeed, in documents recently produced by Defendants, they shockingly acknowledge that they have the power to “destroy all competition” and would inflict that harm upon market participants unless they capitulated to Defendants’ extortionate demands.”

“…Defendants’ monopoly power is so overwhelming that, while claiming they are interested in encouraging their competitors to “contribute to the community,” internal documents recently produced by Defendants reveal the truth—that they are engaged in an anticompetitive campaign to coerce their competitors to “contribute to Automattic.” Only a monopolist could possibly make such demands, and coerce their competitors to meet them, as has occurred here.”

“They Get The Same Thing Today For Free”

Additional paragraphs allege that internal documents contradict the defendants’ claim that their trademark enforcement is legitimate by acknowledging that certain WordPress hosts were already receiving the same benefits for free.

The new paragraph states:

“Contradicting Defendants’ current claim that their enforcement of supposed trademarks is legitimate, Defendants conceded internally that “any Tier 1 host (WPE for example)” would “pushback” on agreeing to a purported trademark license because “they get the same thing today for free. They’ve never paid for [the WordPress] trademarks and won’t want to pay …”

“If They Don’t Take The Carrot We’ll Give Them The Stick”

Paragraphs 211, 214, and 215 cite internal correspondence that WP Engine alleges reflects an intention to enforce compliance using a “carrot” or “stick” approach. The complaint uses this language to support its claims of market power and exclusionary conduct, which form the basis of its coercion and monopolization allegations under the Sherman Act.

Paragraph 211:

“Given their market power, Defendants expected to be able to enforce compliance, whether with a “carrot” or a “stick.””

Paragraph 214:

“Defendants’ internal discussions further reveal that if market participants did not acquiesce to the price increases via a partnership with a purported trademark license component, then “they are fair game” and Defendants would start stealing their sites, thereby effectively eliminating those competitors. As Defendants’ internal correspondence states, “if they don’t take the carrot we’ll give them the stick.””

Paragraph 215:

“As part of their scheme, Defendants initially categorized particular market participants as follows:
• “We have friends (like Newfold) who pay us a lot of money. We want to nurture and value these relationships.”
• “We have would-be friends (like WP Engine) who are mostly good citizens within the WP ecosystem but don’t directly contribute to Automattic. We hope to change this.”
• “And then there are the charlatans ( and ) who don’t contribute. The charlatans are free game, and we should steal every single WP site that they host.””

Plan To Target At Least Ten Competitors

Paragraphs 218, 219, and 220 serve to:

  • Support WP Engine’s claim that WPE was the “public example” of what it describes as a broader plan to target at least ten other competitors with similar trademark-related demands.
  • Allege that certain competitors were paying what the complaint describes as “exorbitant sums” tied to trademark arrangements.

WP Engine argues that these allegations show the demands extended beyond WPE and were part of a broader pattern.

The complaint cites internal documents produced by Defendants in which Mullenweg claimed he had “shield[ed]” a competitor “from directly competitive actions,” which WP Engine presents as evidence that Defendants had, and exercised, the ability to influence competitive conditions through these arrangements.

In those same internal documents, proposed payments were described as “not going to work,” which the complaint uses to argue that the payment amounts were not standardized but could be increased at Defendants’ insistence.

Here are the paragraphs:

“218. Ultimately, WPE was the public example of the “stick” part of Defendants’ “trademark license” demand. But while WPE decided to stand and fight by refusing Defendants’ ransom demand, Defendants’ list included at least ten other competitors that they planned to target with similar demands to pay Defendants’ bounty.

219. Indeed, based on documents that Defendants have recently produced in discovery, other competitors such as Newfold and [REDACTED] are paying Defendants exorbitant sums as part of deals that include “the use of” Defendants’ trademarks.

220. Regarding [REDACTED], in internal documents produced by Defendants, [REDACTED] confirmed that “[t]he money we’re sending from the hosting page is going to you directly”.

In return, Mullenweg claimed he apparently “shield[ed]” [REDACTED] “from directly competitive actions from a number of places[.]”.

Mullenweg further criticized the level of contributions for the month of August 2024, claiming “I’d need 3 years of that to get a new Earthroamer”.

Confronted with Mullenweg’s demand for more, [REDACTED] described itself as “the smallest fish,” suggesting that Mullenweg “can get more money from other companies,” and asking whether [REDACTED] was “the only ones you’re asking to make this change” in an apparent reference to “whatever trademark guidelines you send over”.

Mullenweg responded “nope[.]”. Later, on November 26, 2024—the same day this Court held the preliminary injunction hearing—Mullenweg told [REDACTED] that its proposed “monthly payment of [REDACTED] and contributions to wordpress.org were not “going to work,” and wished it “[b]est of luck” in resisting Defendants’ higher demands.”

WP Engine Versus Mullenweg And Automattic

Much of the previously redacted material is presented to support WP Engine’s antitrust claims, including statements that Defendants had the power to “destroy all competition.” What happens next is up to the judge.

Featured Image by Shutterstock/Kues

Google’s Ads Chief Details UCP Expansion, New AI Mode Ads via @sejournal, @MattGSouthern

Google’s VP of Ads and Commerce, Vidhya Srinivasan, published her third annual letter to the industry, outlining how the company plans to connect advertising, commerce, and AI across Search, YouTube, and Gemini in 2026.

The letter covers agentic commerce, AI-powered ad formats, creator partnerships, and creative tools. Several of the announcements build on features Google previewed at NRF 2026 in January and detailed during its Q4 2025 earnings call earlier this month.

What’s New

UCP Adoption

The letter confirms that the Universal Commerce Protocol now powers purchases from Etsy and Wayfair for U.S. shoppers inside AI Mode in Search and Gemini. Google said it has received interest from “hundreds of top tech companies, payments partners and retailers” since launching UCP.

When Google announced UCP at NRF, the company said the protocol was co-developed with Shopify and that more than 20 companies had endorsed it.

Google also said UCP’s potential “extends far beyond retail,” describing it as the foundation for agentic experiences across all commercial categories.

AI Mode Ad Formats

Srinivasan wrote that Google is testing a new ad format in AI Mode that highlights retailers offering products relevant to a query and marks them as sponsored. The letter describes the format as helping “shoppers easily find convenient buying options” while giving retailers visibility during the consideration stage.

The letter also mentioned Direct Offers, the ad pilot Google introduced at NRF that lets businesses share tailored deals with shoppers in AI Mode. Google plans to expand Direct Offers beyond price-based promotions to include loyalty benefits and product bundles.

Creator-Brand Matching

Srinivasan described YouTube creators as “today’s most trusted tastemakers,” citing a Google/Kantar study of 2,160 weekly video viewers. YouTube CEO Neal Mohan outlined related creator and commerce priorities in his own annual letter last month.

The letter highlights new AI-powered tools that match brands with creator communities based on content and audience analysis. Google said it started with its “open call” feature for sourcing creator partnerships and plans to go further in 2026.

Creative Asset Stats

Google said it saw a 3x increase in Gemini-generated assets in 2025, and that Q4 alone accounted for nearly 70 million assets across AI Max and Performance Max campaigns, according to Google internal data.

Srinivasan wrote that Veo 3, Google’s video generation tool, is now in Google Ads Asset Studio alongside the previously launched Nano Banana.

AI Max Performance Claims

Srinivasan wrote that AI Max is “unlocking billions of net-new searches” that advertisers had not previously reached.

Google introduced AI Max as an expansion tool for Search campaigns and discussed its performance during the Q4 earnings call.

Why This Matters

We’ve covered each major announcement in this letter as it was made. The UCP checkout announcement came at NRF in January. The retailer tradeoff questions followed days later. The pricing controversy played out the same week. The AI Mode monetization details came through during the earnings call.

What this letter adds is a bigger picture of where Google’s leadership sees these pieces fitting together. Srinivasan says this is the year agentic commerce moves from concept to operating reality, with UCP as the connective layer across shopping, payments, and AI agents.

For advertisers, the notable updates are the expansion of Direct Offers beyond price discounts and the testing of AI Mode ad formats in travel. For ecommerce stores, the Etsy and Wayfair confirmation shows that UCP checkout is processing real transactions with recognizable retailers. But the open questions I raised in January’s coverage about Merchant Center controls, opt-in mechanics, and reporting remain unanswered.

Looking Ahead

Srinivasan’s letter didn’t include specific launch dates for the features coming later this year. Google Marketing Live, the company’s annual ads event, takes place in the spring and would be the likely venue for more detailed announcements.


Featured Image: Mijansk786/Shutterstock

Hidden HTTP Page Can Cause Site Name Problems In Google via @sejournal, @MattGSouthern

Google’s John Mueller shared a case where a leftover HTTP homepage was causing unexpected site-name and favicon problems in search results.

The issue, which Mueller described on Bluesky, is easy to miss because Chrome can automatically upgrade HTTP requests to HTTPS, hiding the HTTP version from normal browsing.

What Happened

Mueller described the case as “a weird one.” The site used HTTPS, but a server-default HTTP homepage was still accessible at the HTTP version of the domain.

Mueller wrote:

“A hidden homepage causing site-name & favicon problems in Search. This was a weird one. The site used HTTPS, however there was a server-default HTTP homepage remaining.”

The tricky part is that Chrome can upgrade HTTP navigations to HTTPS, which makes the HTTP version easy to miss in normal browsing. Googlebot doesn’t follow Chrome’s upgrade behavior.

Mueller explained:

“Chrome automatically upgrades HTTP to HTTPS so you don’t see the HTTP page. However, Googlebot sees and uses it to influence the sitename & favicon selection.”

Google’s site name system pulls the name and favicon from the homepage to determine what to display in search results. The system reads structured data from the website, title tags, heading elements, og:site_name, and other signals on the homepage. If Googlebot is reading a server-default HTTP page instead of the actual HTTPS homepage, it’s working with the wrong signals.

How To Check For This

Mueller suggested two ways to see what Googlebot sees.

First, he joked that you could use AI. Then he corrected himself.

Mueller wrote:

“No wait, curl on the command line. Or a tool like the structured data test in Search Console.”

Running curl http://yourdomain.com from the command line would show the raw HTTP response without Chrome’s auto-upgrade. If the response returns a server-default page instead of your actual homepage, that’s the problem.
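
For anyone who prefers a script to the command line, here is a minimal sketch in Python of the same check. It fetches the plain-HTTP homepage without following redirects, so nothing upgrades the request the way a browser would, and compares it against the HTTPS version. The domain is a placeholder.

```python
import requests

domain = "example.com"  # placeholder; substitute your own domain

# Fetch the HTTP homepage without following redirects, so a leftover
# server-default page (or a missing redirect) is visible rather than hidden.
http_resp = requests.get(f"http://{domain}/", allow_redirects=False, timeout=10)
https_resp = requests.get(f"https://{domain}/", timeout=10)

print("HTTP status:", http_resp.status_code)
if http_resp.status_code in (301, 302, 307, 308):
    print("HTTP redirects to:", http_resp.headers.get("Location"))
elif http_resp.text.strip() and http_resp.text != https_resp.text:
    # A 200 with different markup suggests a leftover server-default page.
    print("The HTTP homepage serves different content than the HTTPS homepage.")
else:
    print("The HTTP and HTTPS homepages look consistent.")
```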

If you want to see what Google retrieved and rendered, use the URL Inspection tool in Search Console and run a Live Test. Google’s site name documentation also notes that site names aren’t supported in the Rich Results Test.

Why This Matters

The display of site names and favicons in search results is something we’ve been documenting since Google first replaced title tags with site names in 2022. Since then, the system has gone through multiple growing pains. Google expanded site name support to subdomains in 2023, then spent nearly a year fixing a bug where site names on internal pages didn’t match the homepage.

This case introduces a new complication. The problem wasn’t in the structured data or the HTTPS homepage itself. It was a ghost page in the HTTP version, which you’d have no reason to check because your browser never showed it.

Google’s site name documentation explicitly mentions duplicate homepages, including HTTP and HTTPS versions, and recommends using the same structured data for both. Mueller’s case shows what can go wrong when an HTTP version contains content different from the HTTPS homepage you intended to serve.

The takeaway for troubleshooting site-name or favicon problems in search results is to check the HTTP version of your homepage directly. Don’t rely on what Chrome shows you.

Looking Ahead

Google’s site name documentation specifies that WebSite structured data must be on “the homepage of the site,” defined as the domain-level root URI. For sites running HTTPS, that means the HTTPS homepage is the intended source.

If your site name or favicon looks wrong in search results and your HTTPS homepage has the correct structured data, check whether an HTTP version of the homepage still exists. Use curl or the URL Inspection tool’s Live Test to view it directly. If a server-default page is sitting there, removing it or redirecting HTTP to HTTPS at the server level should resolve the issue.

Google Can Now Monitor Search For Your Government IDs via @sejournal, @MattGSouthern
  • Google’s “Results about you” tool now lets you find and request removal of search results containing government-issued IDs.
  • This includes IDs like passports, driver’s licenses, and Social Security numbers.
  • The expansion is rolling out in the U.S. over the coming days, with additional regions planned.

Google’s Results about you tool now monitors Search results for government-issued IDs like passports, driver’s licenses, and Social Security numbers.

New Data Shows Googlebot’s 2 MB Crawl Limit Is Enough via @sejournal, @martinibuster

New data based on real-world web pages demonstrates that Googlebot’s two-megabyte crawl limit is more than adequate. New SEO tools also provide an easy way to check how much the HTML of a web page weighs.

Data Shows 2 Megabytes Is Plenty

Raw HTML is basically just a text file. For a text file to reach two megabytes, it would need over two million characters (2 × 1,024 × 1,024 = 2,097,152 bytes, at roughly one byte per plain-text character).

The HTTPArchive explains what’s in the HTML weight measurement:

“HTML bytes refers to the pure textual weight of all the markup on the page. Typically it will include the document definition and commonly used on page tags […]. However it also contains inline elements such as the contents of script tags or styling added to other tags. This can rapidly lead to bloating of the HTML doc.”

That is the same thing that Googlebot is downloading as HTML, just the on-page markup, not the links to JavaScript or CSS.

According to the HTTPArchive’s latest report, the real-world median size of raw HTML is 33 kilobytes. At the 90th percentile, HTML weight is 155 kilobytes, meaning the HTML for 90% of sites is at or below roughly 155 kilobytes. Only at the 100th percentile does the size of HTML explode to far beyond two megabytes, which means that pages weighing two megabytes or more are extreme outliers.

The HTTPArchive report explains:

“HTML size remained uniform between device types for the 10th and 25th percentiles. Starting at the 50th percentile, desktop HTML was slightly larger.

Not until the 100th percentile is a meaningful difference when desktop reached 401.6 MB and mobile came in at 389.2 MB.”

The data separates home page measurements from inner page measurements and surprisingly shows that there is little difference between the two. The report explains:

“There is little disparity between inner pages and the home page for HTML size, only really becoming apparent at the 75th and above percentile.

At the 100th percentile, the disparity is significant. Inner page HTML reached an astounding 624.4 MB—375% larger than home page HTML at 166.5 MB.”

Mobile And Desktop HTML Sizes Are Similar

Interestingly, the page sizes between mobile and desktop versions were remarkably similar, regardless of whether HTTPArchive was measuring the home page or one of the inner pages.

HTTPArchive explains:

“The size difference between mobile and desktop is extremely minor, this implies that most websites are serving the same page to both mobile and desktop users.

This approach dramatically reduces the amount of maintenance for developers but does mean that overall page weight is likely to be higher as effectively two versions of the site are deployed into one page.”

Though the overall page weight might be higher since the mobile and desktop HTML exists simultaneously in the code, as noted earlier, the actual weight is still far below the two-megabyte threshold all the way up until the 100th percentile.

Given that it takes about two million characters to push a page’s HTML to two megabytes, and that the HTTPArchive data based on actual websites shows the vast majority of sites are well under Googlebot’s 2 MB limit, it’s safe to scratch HTML size off the list of SEO things to worry about.

Tame The Bots

Dave Smart of Tame The Bots recently posted that the tool was updated to stop fetching at the two-megabyte limit, showing the point at which Googlebot would stop crawling a page on sites that are extreme outliers.

Smart posted:

“At the risk of overselling how much of a real world issue this is (it really isn’t for 99.99% of sites I’d imagine), I added functionality to tamethebots.com/tools/fetch-… to cap text based files to 2 MB to simulate this.”

Screenshot Of Tame The Bots Interface

The tool will show what the page will look like to Google if the crawl is limited to two megabytes of HTML. But it doesn’t show whether the tested page exceeds two megabytes, nor does it show how much the web page weighs. For that, there are other tools.

Tools That Check Web Page Size

There are a few tool sites that show HTML size, but here are two that focus on just the web page size. I tested the same page on each tool, and they both showed roughly the same page weight, give or take a few kilobytes.

Toolsaday Web Page Size Checker

The interestingly named Toolsaday web page size checker enables users to test one URL at a time. This specific tool does just the one thing, making it easy to get a quick reading of how much a web page weighs in kilobytes (or more if the page is in the 100th percentile).

Screenshot Of Toolsaday Test Results

Small SEO Tools Website Page Size Checker

The Small SEO Tools Website Page Size Checker differs from the Toolsaday tool in that Small SEO Tools enables users to test ten URLs at a time.

Not Something To Worry About

The bottom line about the two megabyte Googlebot crawl limit is that it’s not something the average SEO needs to worry about. It literally affects a very small percentage of outliers. But if it makes you feel better, give one of the above SEO tools a try to reassure yourself or your clients.
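
If you would rather check from a script than a web tool, a short sketch along these lines reports a page’s raw HTML weight against the two-megabyte figure. The URL is a placeholder, and the script measures only the markup itself, not linked JavaScript, CSS, or images.

```python
import requests

GOOGLEBOT_LIMIT_BYTES = 2 * 1024 * 1024  # the roughly two-megabyte crawl limit

url = "https://example.com/"  # placeholder; test your own page
html = requests.get(url, timeout=10).text

size_bytes = len(html.encode("utf-8"))
print(f"Raw HTML weight: {size_bytes / 1024:.1f} KB")

if size_bytes > GOOGLEBOT_LIMIT_BYTES:
    print("Above the ~2 MB limit; an extreme outlier worth trimming.")
else:
    print("Well within the limit, like the vast majority of pages.")
```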

Featured Image by Shutterstock/Fathur Kiwon

Bing Webmaster Tools Adds AI Citation Performance Data via @sejournal, @MattGSouthern

Microsoft introduced an AI Performance dashboard in Bing Webmaster Tools, giving site owners visibility into how their content gets cited across Copilot and AI-generated answers in Bing.

The feature, now in public preview, shows citation counts, page-level activity, and trends over time. It covers AI experiences across Copilot, AI summaries in Bing, and select partner integrations.

Microsoft announced the feature on the Bing Webmaster Blog.

What’s New

The AI Performance dashboard provides four core metrics.

Total citations tracks how often your content appears as a source in AI-generated answers during a selected time period. Average cited pages shows the daily average of unique URLs from your site referenced across AI answers.

Page-level citation activity breaks down which specific URLs get cited most often. This lets you see which pages AI systems reference and how that activity changes over time.

The dashboard also introduces “grounding queries,” which Microsoft describes as the key phrases AI used when retrieving content for answers. The company notes this data represents a sample rather than complete citation activity.

A timeline view shows how citation patterns change over time across supported AI experiences.

Why This Matters

This is the first time Bing Webmaster Tools has shown how often content is cited in generative answers, including which URLs are referenced and how citation activity changes over time.

Google includes AI Overviews and AI Mode in Search Console’s overall Performance reporting, but it doesn’t offer a dedicated AI Overviews/AI Mode report or citation-style URL counts. AI Overviews also occupy a single position, with all links assigned that same position.

Bing’s dashboard goes further. It tracks which pages get cited, how often, and what phrases triggered the citation. That gives you data to work with instead of guesses.

Looking Ahead

AI Performance is available now in Bing Webmaster Tools as a public preview. Microsoft said it will continue refining metrics as more data is processed.

Bing has been building toward this for a while. The platform consolidated web search and chat metrics into a single dashboard and has added comparison features and content control tools since then.


Featured Image: Mijansk786/Shutterstock

OpenAI Begins Testing Ads In ChatGPT For Free And Go Users via @sejournal, @MattGSouthern

OpenAI is testing ads inside ChatGPT, bringing sponsored content to the product for the first time.

The test is live for logged-in adult users in the U.S. on the free and Go subscription tiers. Subscribers on the Plus, Pro, Business, Enterprise, and Education tiers won’t see ads.

OpenAI announced the launch with a brief blog post confirming that the principles it outlined in January are now in effect.

OpenAI’s post also adds Education to the list of ad-free tiers, which wasn’t included in the company’s initial plans.

How The Ads Work

Ads appear at the bottom of ChatGPT responses, visually separated from the answer and labeled as sponsored.

OpenAI says it selects ads by matching advertiser submissions with the topic of your conversation, your past chats, and past interactions with ads. If someone asks about recipes, they might see an ad for a meal kit or grocery delivery service.

Advertisers don’t see users’ conversations or personal details. They receive only aggregate performance data like views and clicks.

Users can dismiss ads, see why a specific ad appeared, turn off personalization, or clear all ad-related data. OpenAI also confirmed it won’t show ads in conversations about health, mental health, or politics, and won’t serve them to accounts identified as under 18.

Free users who don’t want ads have another option. OpenAI says you can opt out of ads in the Free tier in exchange for fewer daily free messages. Go users can avoid ads by upgrading to Plus or Pro.

The Path To Today

OpenAI first announced plans to test ads on January 16, alongside the U.S. launch of ChatGPT Go at $8 per month. The company laid out five principles. They cover mission alignment, answer independence, conversation privacy, choice and control, and long-term value.

The January post was careful to frame ads as supporting access rather than driving revenue. Altman wrote on X at the time:

“It is clear to us that a lot of people want to use a lot of AI and don’t want to pay, so we are hopeful a business model like this can work.”

That framing sits alongside OpenAI’s financial reality. Altman said in November that the company is considering infrastructure commitments totaling about $1.4 trillion over eight years. He also said OpenAI expects to end 2025 with an annualized revenue run rate above $20 billion. A source told CNBC that OpenAI expects ads to account for less than half of its revenue long term.

OpenAI has confirmed a $200,000 minimum commitment for early ChatGPT ads, Adweek reported. Digiday reported media buyers were quoted about $60 per 1,000 views for sponsored placements during the initial U.S. test.

Altman’s Evolving Position

The launch represents a notable turn from Altman’s earlier public statements on advertising.

In an October 2024 fireside chat at Harvard, Altman said he “hates” ads and called the idea of combining ads with AI “uniquely unsettling,” as CNN reported. He contrasted ChatGPT’s user-aligned model with Google’s ad-driven search, saying Google’s results depended on “doing badly for the user.”

By November 2025, Altman’s position had softened. He told an interviewer he wasn’t “totally against” ads but said they would “take a lot of care to get right.” He drew a line between pay-to-rank advertising, which he said would be “catastrophic,” and transaction fees or contextual placement that doesn’t alter recommendations.

The test rolling out today follows the contextual model Altman described. Ads sit below responses and don’t affect what ChatGPT recommends. Whether that distinction holds as ad revenue grows will be the longer-term question.

Where Competitors Stand

The timing puts OpenAI’s decision in sharp contrast with its two closest rivals.

Anthropic ran a Super Bowl campaign last week centered on the tagline “Ads are coming to AI. But not to Claude.” The spots showed fictional chatbots interrupting personal conversations with sponsored pitches.

Altman called the campaign “clearly dishonest,” writing on X that OpenAI “would obviously never run ads in the way Anthropic depicts them.”

Google has also kept distance from chatbot ads. DeepMind CEO Demis Hassabis said at Davos in January that Google has no current plans for ads in Gemini, calling himself “a little bit surprised” that OpenAI moved so early. He drew a distinction between assistants, where trust is personal, and search, where Google already shows ads in AI Overviews.

That was the second time in two months that Google leadership publicly denied plans for Gemini advertising. In December, Google Ads VP Dan Taylor disputed an Adweek report claiming advertisers were told to expect Gemini ads in 2026.

The three companies are now on distinctly different paths. OpenAI is testing conversational ads at scale. Anthropic is marketing its refusal to run them. Google is running ads in AI Overviews but holding off on its standalone assistant.

Why This Matters

OpenAI says ChatGPT is used by hundreds of millions of people. CNBC reported that Altman told employees ChatGPT has about 800 million weekly users. That creates pressure to find revenue beyond subscriptions, and advertising is the proven model for monetizing free users across consumer tech.

For practitioners, today’s launch opens a new ad channel for AI platform monetization. The targeting mechanism uses conversation context rather than search keywords, which creates a different kind of intent signal. Someone asking ChatGPT for help planning a trip is further along in the decision process than someone typing a search query.

The restrictions are also worth watching. No ads near health, politics, or mental health topics means the inventory is narrower than traditional search. Combined with reported $60 CPMs and a $200K minimum, this starts as a premium play for a limited set of advertisers rather than a self-serve marketplace.

Looking Ahead

OpenAI described today’s rollout as a test to “learn, listen, and make sure we get the experience right.” No timeline was given for expanding beyond the U.S. or beyond free and Go tiers.

Separately, CNBC reported that Altman told employees in an internal Slack message that ChatGPT is “back to exceeding 10% monthly growth” and that an “updated Chat model” is expected this week.

How users respond to ads in their ChatGPT conversations will determine whether this test scales or gets pulled back. It will also test whether the distinction Altman drew in November between trust-destroying ads and acceptable contextual ones holds up in practice.

7 Insights From Washington Post’s Strategy To Win Back Traffic via @sejournal, @martinibuster

The Washington Post’s recent announcement of staffing cuts is a story with heroes, villains, and victims, but buried beneath the headlines is the reality of a big-brand publisher confronting the same changes in Google Search that SEOs, publishers, and ecommerce stores are struggling with. The following are insights into its strategy to claw back traffic and income that could be useful for anyone seeking to stabilize traffic and grow.

Disclaimer

The Washington Post is proposing the following strategies in response to steep drops in search traffic, the rise of multi-modal content consumption, and many other factors that are fragmenting online audiences. The strategies have yet to be proven.

The value lies in analyzing what they are doing and understanding if there are any useful ideas for others.

Problem That Is Being Solved

The reasons given for the announced changes are similar to what SEOs, online stores, and publishers are going through right now because of the decline of search and the hyper-diversification of sources of information.

The memo explains:

“Platforms like Search that shaped the previous era of digital news, and which once helped The Post thrive, are in serious decline. Our organic search has fallen by nearly half in the last three years.

And we are still in the early days of AI-generated content, which is drastically reshaping user experiences and expectations.”

Those problems are the same ones affecting virtually all online businesses, which makes The Washington Post’s solution of interest well beyond news sites.

Problems Specific To The Washington Post

Recent reporting on The Washington Post tended to frame the story narrowly in the context of politics, concerns about the concentration of wealth, and the impact of the cuts on coverage of sports, international news, and the performing arts, in addition to the hundreds of staff and reporters who lost their jobs.

The job cuts in particular are a solution specific to The Washington Post and are highly controversial. One could argue that cutting some of the lower-performing topics removes the very things that differentiate the website. As you will see next, Executive Editor Matt Murray justifies the cuts as listening to readers’ signals.

Challenges Affecting Everyone

If you zoom out, there is a larger pattern: many organizations are struggling to understand where their audience has gone and how best to bring it back.

Shared Industry Challenges

  • Changes in content consumption habits
  • Decline of search
  • Rise of the creator economy
  • Growth of podcasts and video shows
  • Social media competing for audience attention
  • Rise of AI search and chat

A recent podcast interview (link to Spotify) with the executive editor of The Washington Post, Matt Murray, revealed a years-long struggle to restructure the organization’s workflow into one that:

  • Was responsive to audience signals
  • Could react in real time instead of the rigid print-based news schedule
  • Explored emerging content formats so as to evolve alongside readers
  • Produced content that is perceived as indispensable

The issues affecting The Washington Post are similar to issues affecting everyone else, from recipe bloggers to big-brand review sites. A key point Murray made was that the changes were driven by audience signals.

Matt Murray said the following about reader signals:

“Readers in today’s world tell you what they want and what they don’t want. They have more power. …And we weren’t picking up enough of the reader signals.”

Then a little later on he again emphasized the importance of understanding reader signals:

“…we are living in a different kind of a world that is a data reader centric world. Readers send us signals on what they want. We have to meet them more where they are. That is going to drive a lot of our success.”

Whether listening to audience signals justifies cutting staff or ends up removing the things that differentiate The Washington Post remains to be seen.

For example, I used to subscribe to the print edition of The New Yorker for the articles, not for the restaurant or theater reviews, yet those reviews were still of interest to me because I liked to keep track of trends in live theater and dining. The New Yorker cartoons rarely had anything to do with the article topics, and yet they were a value add. Would something like that show up in audience signals?

Build A Base Then Adapt

The memo paints what they’re doing as a foundation for building a strategy that is still evolving, not as a proven strategy. In my opinion that reflects the uncertainty introduced by the rapid decline of classic search and the knowledge that there are no proven strategies.

That uncertainty makes it more interesting to examine what a big brand organization like The Washington Post is doing to create a base strategy to start from and adapt it based on outcomes. That, in itself, is a strategy for coping with a lack of proven tactics.

Three concrete goals they are focusing on are:

  1. Attract readers
  2. Create content that leads to subscriptions
  3. Increase engagement

They write:

“From this foundation, we aim to build on what is working, and grow with discipline and intent, to experiment, to measure and deepen what resonates with customers.”

In the podcast interview, Murray also described the stability of a foundation as a way to nurture growth, explaining that it creates the conditions for talent to do its best work and gives the staff space to focus on what works.

He explained:

“One of the reasons I wanted to get to stability, as I want room for that talent to thrive and flourish.

I also want us to develop it in a more modern multi-modal way with those that we’ve been able to do.”

A Path To Becoming Indispensable

The Washington Post memo offered insights into its strategy, stating the goal that the brand must become indispensable to readers and naming three criteria that articles must be validated against.

According to the memo:

“We can’t be everything to everyone. But we must be indispensable where we compete. That means continually asking why a story matters, who it serves and how it gives people a clearer understanding of the world and an advantage in navigating it.”

Three Criteria For Content

  1. Content must matter to site visitors.
  2. Content must have an identifiable audience.
  3. Content must provide understanding and also be applicable (useful).

Content Must Matter
Regardless of whether the content is about a product or a service, or is purely informational, the Washington Post’s strategy states that content must strongly fulfill a specific need. For SEOs, creators, ecommerce stores, and informational content publishers, “mattering” is one of the pillars that support making a business indispensable to a site visitor and provide an advantage.

Identifiable Audience
Information doesn’t exist in a vacuum, but traditional SEO has strongly focused on keyword volume and keyword relevance, essentially treating information as existing in a space devoid of human relevance. Keyword relevance is not the same as human relevance. Keyword relevance is relevance to a keyword phrase, not relevance to a human.

This point matters because AI chat and AI search destroy the concept of keywords: people are no longer typing keyword phrases but are instead engaging in goal-oriented discussions.

When SEOs talk about keyword relevance, they are talking about relevance to an algorithm. Put another way, they are essentially defining the audience as an algorithm.

So, point two is really about stepping back and asking, “Why does a person need this information?”

Provide Understanding And Be Applicable
Point three states that it’s not enough for content to provide an understanding of what happened (facts). It requires that the information must make the world around the reader navigable (application of the facts).

This is perhaps the most interesting pillar of the strategy because it acknowledges that information vomit is not enough. It must be information that is utilitarian. Utilitarian in this context means that content must have some practical use.

In my opinion, an example of this principle in the context of an ecommerce site is product data. The other day I was on a fishing lure site, and the site assumed that the consumer understood how each lure is supposed to be used. It just had the name of the lure and a photo. In every case, the name of the lure was abstract and gave no indication of how the lure was to be used, under what circumstances, and what tactic it was for.

Another example is a clothing site where clothing is described as small, medium, large, and extra large, which are subjective measurements because every retailer defines small and large differently. One brand I shop at consistently labels objectively small-sized jackets as medium. Fortunately, that same retailer also provides chest, shoulder, and length measurements, which enable a user to understand exactly whether that clothing fits.

I think that’s part of what the Washington Post memo means when it says that the information should provide understanding but also be applicable. It’s that last part that makes the understanding part useful.

Three Pillars To Thriving In A Post-Search Information Economy

All three criteria are pillars that support the mandate to be indispensable and to provide an advantage. Satisfying those goals helps differentiate content from information vomit and AI slop. The strategy supports becoming a navigational entity, a destination that users specifically seek out, and it helps publishers, ecommerce stores, and SEOs build an audience in order to claw back what classic search no longer provides.

Featured Image by Shutterstock/Roman Samborskyi