Google On Generic Top Level Domains For SEO via @sejournal, @martinibuster

Google’s John Mueller answered a question about whether a generic Top Level Domain (gTLD) with a keyword in it offered any SEO advantage. His answer was in the context of a specific keyword TLD, but the topic involves broader questions about how Google evaluates TLDs in general.

Generic Top Level Domains (gTLDs)

gTLDs are top-level domains that have a theme relating to a topic or a purpose. The most commonly known ones are .com (generally used for commercial purposes) and .org (typically used for non-profit organizations).

The availability of unique keyword-based gTLDs exploded in 2013. Now there are hundreds of gTLDs with which a website can brand itself and stand out.

Is There SEO Value In gTLDs?

The person asking the question on Reddit wanted to know if there’s an SEO value to registering a .music gTLD. The regular .com version of the domain name they wanted was not available but the .music version was.

The question they asked was:

“Noticed .music domains available and curious if it is relevant, growing etc or does the industry not care about it whatsoever? Is it worth reserving yours anyways just so someone else can’t have it, in case it becomes a thing?”

Are gTLDs Useful For SEO Purposes?

Google’s John Mueller limited his response to whether gTLDs offer SEO value, and his answer was no.

He answered:

“There’s absolutely no SEO advantage from using a .music domain.”

The funny thing about SEO is that Google’s standard of relevance is based on humans while SEOs think of relevance in terms of what Google thinks is relevant.

This sets up a huge disconnect: SEOs on one side create websites that are keyword-optimized for Google, while Google itself analyzes billions of user behavior signals because it’s optimizing search results for humans.

Optimizing For Humans With gTLDs

The thing about SEO is that it’s search engine optimization. When venturing out on the web it’s easy to forget that every website must be optimized for humans, too. Aside from spammy TLDs which can be problematic for SEO, the choice of a TLD isn’t important for SEO but it could be important for Human Optimization.

Optimizing for humans is a good idea because human interactions with search engines and websites generate signals that Google uses at scale to better understand what users mean by their queries and what kinds of sites they expect to see for those queries. Some user-generated signals, like searching by brand name, can send Google a signal that a particular brand is popular and is associated with a particular service, product, or keyword phrase (read about Google’s patent on branded search).

Circling back to optimizing for humans, if a particular gTLD is something that humans may associate with a brand, product, or service then there is something there that can be useful for making a site attractive to users.

I have experimented in the past with various gTLDs and found that I was able to build links more easily to .org domains than to the .com or .net versions. That’s an example of how a gTLD can be optimized for humans and lead to success.

I discovered that overtly commercial affiliate sites on .org domains ranked and converted well. They didn’t rank because they were .org, though. The sites were top-ranked because humans responded well to the sites I created with that gTLD. It was easier to build links to them, for example. I have no doubt that people trusted my affiliate sites a little more because they were created on .org domains.

Optimizing for humans is conversion optimization. It’s super important.

Optimizing For Humans With Keyword-Based gTLDs

I haven’t played around with keyword gTLDs but I suspect that what I experienced with .org domains could happen with a keyword-based gTLD because a meaningful gTLD may communicate positive feelings or relevance to humans. You can call it branding but I think that the word “branding” is too abstract. I prefer the phrase optimizing for humans because in the end that’s what branding is really about.

So maybe it’s time we ditched the bla, bla, bla about branding and started talking about optimizing for humans. If that person had considered the question from the perspective of human optimization, they might have been able to answer the question themselves.

When SEOs talk about relevance it seems like they’re generally referring to how relevant something is to Google. Relevance to Google is what was top of mind to the person asking the question about the .music gTLD and it might be why you’re reading this article.

Heck, relevance to search engines is what all that “entity” optimization hand waving is all about, right? Focusing on being relevant to search engines is a limited way to chase after success. For example, I cracked the code with the .org domains by focusing on humans.

At a certain point, if you’re trying to be successful online, it may be useful to take a step back and start thinking more about how relevant the content, colors, and gTLDs are to humans and you might discover that being relevant to humans makes it easier to be relevant to search engines.

Featured Image by Shutterstock/Kues

LLMs Are Changing Search & Breaking It: What SEOs Must Understand About AI’s Blind Spots via @sejournal, @MattGSouthern

In the last two years, incidents have shown how large language model (LLM)-powered systems can cause measurable harm. Some businesses have lost a majority of their traffic overnight, and publishers have watched revenue decline by over a third.

Tech companies have faced wrongful death lawsuits after teenagers had extensive interactions with chatbots.

AI systems have given dangerous medical advice at scale, and chatbots have made up false claims about real people in defamation cases.

This article looks at the proven blind spots in LLM systems and what they mean for SEOs who work to optimize and protect brand visibility. You can read specific cases and understand the technical failures behind them.

The Engagement-Safety Paradox: Why LLMs Are Built To Validate, Not Challenge

LLMs face a basic conflict between business goals and user safety. The systems are trained to maximize engagement by being agreeable and keeping conversations going. This design choice increases retention and drives subscription revenue while generating training data.

In practice, it creates what researchers call “sycophancy,” the tendency to tell users what they want to hear rather than what they need to hear.

Stanford PhD researcher Jared Moore demonstrated this pattern. When a user claiming to be dead (showing symptoms of Cotard’s syndrome, a mental health condition) gets validation from a chatbot saying “that sounds really overwhelming” with offers of a “safe space” to explore feelings, the system backs up the delusion instead of giving a reality check. A human therapist would gently challenge this belief while the chatbot validates it.

OpenAI admitted this problem in September after facing a wrongful death lawsuit. The company said ChatGPT was “too agreeable” and failed to spot “signs of delusion or emotional dependency.” That admission came after 16-year-old Adam Raine from California died. His family’s lawsuit showed that ChatGPT’s systems flagged 377 self-harm messages, including 23 with over 90% confidence that he was at risk. The conversations kept going anyway.

The pattern was observed in Raine’s final month. He went from two to three flagged messages per week to more than 20 per week. By March, he spent nearly four hours daily on the platform. OpenAI’s spokesperson later acknowledged that safety guardrails “can sometimes become less reliable in long interactions where parts of the model’s safety training may degrade.”

Think about what that means. The systems fail at the exact moment of highest risk, when vulnerable users are most engaged. This happens by design when you optimize for engagement metrics over safety protocols.

Character.AI faced similar issues with 14-year-old Sewell Setzer III from Florida, who died in February 2024. Court documents show he spent months in what he perceived as a romantic relationship with a chatbot character. He withdrew from family and friends, spending hours daily with the AI. The company’s business model was built for emotional attachment to maximize subscriptions.

A peer-reviewed study in New Media & Society found users showed “role-taking,” believing the AI had needs requiring attention, and kept using it “despite describing how Replika harmed their mental health.” When the product is addiction, safety becomes friction that cuts revenue.

This creates direct effects for brands using or optimizing for these systems. You’re working with technology that’s designed to agree and validate rather than give accurate information. That design shows up in how these systems handle facts and brand information.

Documented Business Impacts: When AI Systems Destroy Value

The business results of LLM failures are clear and proven. Between 2023 and 2025, companies showed traffic drops and revenue declines directly linked to AI systems.

Chegg: $17 Billion To $200 Million

Education platform Chegg filed an antitrust lawsuit against Google showing major business impact from AI Overviews. Traffic declined 49% year over year, while Q4 2024 revenue hit $143.5 million (down 24% year-over-year). Market value collapsed from $17 billion at peak to under $200 million, a 98% decline. The stock trades at around $1 per share.

CEO Nathan Schultz testified directly: “We would not need to review strategic alternatives if Google hadn’t launched AI Overviews. Traffic is being blocked from ever coming to Chegg because of Google’s AIO and their use of Chegg’s content.”

The case argues Google used Chegg’s educational content to train AI systems that directly compete with and replace Chegg’s business model. This represents a new form of competition where the platform uses your content to eliminate your traffic.

Giant Freakin Robot: Traffic Loss Forces Shutdown

Independent entertainment news site Giant Freakin Robot shut down after traffic collapsed from 20 million monthly visitors to “a few thousand.” Owner Josh Tyler attended a Google Web Creator Summit where engineers confirmed there was “no problem with content” but offered no solutions.

Tyler documented the experience publicly: “GIANT FREAKIN ROBOT isn’t the first site to shut down. Nor will it be the last. In the past few weeks alone, massive sites you absolutely have heard of have shut down. I know because I’m in contact with their owners. They just haven’t been brave enough to say it publicly yet.”

At the same summit, Google allegedly admitted prioritizing large brands over independent publishers in search results regardless of content quality. This wasn’t leaked or speculated but stated directly to publishers by company reps. Quality became secondary to brand recognition.

There’s a clear implication for SEOs. You can execute perfect technical SEO, create high-quality content, and still watch traffic disappear because of AI.

Penske Media: 33% Revenue Decline And $100 Million Lawsuit

In September, Penske Media Corporation (publisher of Rolling Stone, Variety, Billboard, Hollywood Reporter, Deadline, and other brands) sued Google in federal court. The lawsuit showed specific financial harm.

Court documents allege that 20% of searches linking to Penske Media sites now include AI Overviews, and that percentage is rising. Affiliate revenue declined more than 33% by the end of 2024 compared to peak. Click-throughs have declined since AI Overviews launched in May 2024. The company showed lost advertising and subscription revenue on top of affiliate losses.

CEO Jay Penske stated: “We have a duty to protect PMC’s best-in-class journalists and award-winning journalism as a source of truth, all of which is threatened by Google’s current actions.”

This is the first lawsuit by a major U.S. publisher targeting AI Overviews specifically with quantified business harm. The case seeks treble damages under antitrust law, permanent injunction, and restitution. Claims include reciprocal dealing, unlawful monopoly leveraging, monopolization, and unjust enrichment.

Even publishers with established brands and resources are showing revenue declines. If Rolling Stone and Variety can’t maintain click-through rates and revenue with AI Overviews in place, what does that mean for your clients or your organization?

The Attribution Failure Pattern

Beyond traffic loss, AI systems consistently fail to give proper credit for information. A Columbia University Tow Center study showed a 76.5% error rate in attribution across AI search systems. Even when publishers allow crawling, attribution doesn’t improve.

This creates a new problem for brand protection. Your content can be used, summarized, and presented without proper credit, so users get their answer without knowing the source. You lose both traffic and brand visibility at the same time.

SEO expert Lily Ray documented this pattern, finding a single AI Overview contained 31 Google property links versus seven external links, a ratio of more than 4:1 favoring Google’s own properties. She stated: “It’s mind-boggling that Google, which pushed site owners to focus on E-E-A-T, is now elevating problematic, biased and spammy answers and citations in AI Overview results.”

When LLMs Can’t Tell Fact From Fiction: The Satire Problem

Google AI Overviews launched with errors that made the system briefly notorious. The technical problem wasn’t a bug. It was an inability to distinguish satire, jokes, and misinformation from factual content.

The system recommended adding glue to pizza sauce (sourced from an 11-year-old Reddit joke), suggested eating “at least one small rock per day,” and advised using gasoline to cook spaghetti faster.

These weren’t isolated incidents. The system consistently pulled from Reddit comments and satirical publications like The Onion, treating them as authoritative sources. When asked about edible wild mushrooms, Google’s AI emphasized characteristics shared by deadly mimics, creating potentially “sickening or even fatal” guidance, according to Purdue University mycology professor Mary Catherine Aime.

The problem extends beyond Google. Perplexity AI has faced multiple plagiarism accusations, including adding fabricated paragraphs to actual New York Post articles and presenting them as legitimate reporting.

For brands, this creates specific risks. If an LLM system sources information about your brand from Reddit jokes, satirical articles, or outdated forum posts, that misinformation gets presented with the same confidence as factual content. Users can’t tell the difference because the system itself can’t tell the difference.

The Defamation Risk: When AI Makes Up Facts About Real People

LLMs generate plausible-sounding false information about real people and companies. Several defamation cases show the pattern and legal implications.

Australian mayor Brian Hood threatened the first defamation lawsuit against an AI company in April 2023 after ChatGPT falsely claimed he had been imprisoned for bribery. In reality, Hood was the whistleblower who reported the bribes. The AI inverted his role from whistleblower to criminal.

Radio host Mark Walters sued OpenAI after ChatGPT fabricated claims that he embezzled funds from the Second Amendment Foundation. When journalist Fred Riehl asked ChatGPT to summarize an actual lawsuit, the system generated a completely fictional complaint naming Walters as a defendant accused of financial misconduct. Walters was never a party to the lawsuit nor mentioned in it.

The Georgia Superior Court dismissed the Walters case, finding OpenAI’s disclaimers about potential errors provided legal protection. The ruling established that “extensive warnings to users” can shield AI companies from defamation liability when the false information isn’t published by users.

The legal landscape remains unsettled. While OpenAI won the Walters case, that doesn’t mean all AI defamation claims will fail. The key issues are whether the AI system publishes false information about identifiable people and whether companies can disclaim responsibility for their systems’ outputs.

LLMs can generate false claims about your company, products, or executives. These false claims get presented with confidence to users. You need monitoring systems to catch these fabrications before they cause reputational damage.

Health Misinformation At Scale: When Bad Advice Becomes Dangerous

When Google AI Overviews launched, the system provided dangerous health advice, including recommending drinking urine to pass kidney stones and suggesting health benefits of running with scissors.

The problem extends beyond obvious absurdities. A Mount Sinai study found AI chatbots vulnerable to spreading harmful health information. Researchers could manipulate chatbots into providing dangerous medical advice with simple prompt engineering.

Meta AI’s internal policies explicitly allowed the company’s chatbots to provide false medical information, according to a 200+ page document exposed by Reuters.

For healthcare brands and medical publishers, this creates risks. AI systems might present dangerous misinformation alongside or instead of your accurate medical content. Users might follow AI-generated health advice that contradicts evidence-based medical guidance.

What SEOs Need To Do Now

Here’s what you need to do to protect your brands and clients:

Monitor For AI-Generated Brand Mentions

Set up monitoring systems to catch false or misleading information about your brand in AI systems. Test major LLM platforms monthly with queries about your brand, products, executives, and industry.

When you find false information, document it thoroughly with screenshots and timestamps. Report it through the platform’s feedback mechanisms. In some cases, you may need legal action to force corrections.
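As a starting point, a scheduled script can make those monthly checks repeatable. Below is a minimal sketch using the official OpenAI Python SDK as one example platform; the brand name, query list, model name, and log file are placeholder assumptions, and you would want a similar routine for each LLM platform you track.

```python
# Minimal brand-monitoring sketch (assumes the official OpenAI Python SDK and an
# OPENAI_API_KEY environment variable; brand, queries, model, and log file are placeholders).
import csv
import datetime
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

BRAND = "Example Brand"  # hypothetical brand name
QUERIES = [
    f"What is {BRAND} known for?",
    f"Is {BRAND} trustworthy?",
    f"Who is the CEO of {BRAND}?",
]

def run_monthly_check(model: str = "gpt-4o-mini") -> None:
    """Ask each brand query and log the raw answer with a timestamp for later review."""
    timestamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
    with open("brand_mention_log.csv", "a", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        for query in QUERIES:
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": query}],
            )
            answer = response.choices[0].message.content
            writer.writerow([timestamp, model, query, answer])

if __name__ == "__main__":
    run_monthly_check()
```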

Add Technical Safeguards

Use robots.txt to control which AI crawlers access your site. Major systems like OpenAI’s GPTBot, Google-Extended, and Anthropic’s ClaudeBot respect robots.txt directives. Keep in mind that blocking these crawlers means your content won’t appear in AI-generated responses, reducing your visibility.

The key is finding a balance that allows enough access to influence how your content appears in LLM outputs while blocking crawlers that don’t serve your goals.
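For illustration, a robots.txt along these lines blocks one AI crawler entirely, restricts another to part of the site, and leaves the rest open. The user-agent tokens shown (GPTBot, ClaudeBot, Google-Extended) are the ones mentioned above; the paths are placeholders, and you should verify current token names against each vendor's documentation.

```
# Hypothetical robots.txt sketch: allow AI visibility where it helps,
# block crawlers that don't serve your goals. Verify tokens with each vendor.
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /premium/

User-agent: Google-Extended
Allow: /

User-agent: *
Allow: /
```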

Consider adding terms of service that directly address AI scraping and content use. While legal enforcement varies, clear Terms of Service (TOS) give you a foundation for possible legal action if needed.

Monitor your server logs for AI crawler activity. Understanding which systems access your content and how frequently helps you make informed decisions about access control.
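A rough sketch of that log review, assuming a standard text access log and a handful of user-agent substrings to look for (the log path and crawler list are assumptions to adapt to your stack):

```python
# Rough sketch: count AI-crawler hits in a standard access log.
# The log path and user-agent substrings are assumptions; adjust for your stack.
from collections import Counter

AI_CRAWLERS = ["GPTBot", "ClaudeBot", "Google-Extended", "PerplexityBot", "CCBot"]
LOG_PATH = "/var/log/nginx/access.log"  # hypothetical path

def count_ai_crawler_hits(log_path: str = LOG_PATH) -> Counter:
    """Tally requests whose user-agent string contains a known AI crawler token."""
    hits = Counter()
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            for bot in AI_CRAWLERS:
                if bot in line:
                    hits[bot] += 1
                    break
    return hits

if __name__ == "__main__":
    for bot, count in count_ai_crawler_hits().most_common():
        print(f"{bot}: {count}")
```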

Advocate For Industry Standards

Individual companies can’t solve these problems alone. The industry needs standards for attribution, safety, and accountability. SEO professionals are well-positioned to push for these changes.

Join or support publisher advocacy groups pushing for proper attribution and traffic preservation. Organizations like News Media Alliance represent publisher interests in discussions with AI companies.

Participate in public comment periods when regulators solicit input on AI policy. The FTC, state attorneys general, and Congressional committees are actively investigating AI harms. Your voice as a practitioner matters.

Support research and documentation of AI failures. The more documented cases we have, the stronger the argument for regulation and industry standards becomes.

Push AI companies directly through their feedback channels by reporting errors when you find them and escalating systemic problems. Companies respond to pressure from professional users.

The Path Forward: Optimization In A Broken System

There is a lot of specific and concerning evidence. LLMs cause measurable harm through design choices that prioritize engagement over accuracy, through technical failures that create dangerous advice at scale, and through business models that extract value while destroying it for publishers.

Two teenagers died, multiple companies collapsed, and major publishers lost 30%+ of revenue. Courts are sanctioning lawyers for AI-generated lies, state attorneys general are investigating, and wrongful death lawsuits are proceeding. This is all happening now.

As AI integration accelerates across search platforms, the magnitude of these problems will scale. More traffic will flow through AI intermediaries, more brands will face lies about them, more users will receive made-up information, and more businesses will see revenue decline as AI Overviews answer questions without sending clicks.

Your role as an SEO now includes responsibilities that didn’t exist five years ago. The platforms rolling out these systems have shown they won’t address these problems proactively. Character.AI added minor protections only after lawsuits, OpenAI admitted sycophancy problems only after a wrongful death case, and Google pulled back AI Overviews only after public proof of dangerous advice.

Change within these companies comes from external pressure, not internal initiative. That means the pressure must come from practitioners, publishers, and businesses documenting harm and demanding accountability.

The cases here are just the beginning. Now that you understand the patterns and behavior, you’re better equipped to see problems coming and develop strategies to address them.

Featured Image: Roman Samborskyi/Shutterstock

SEO Pulse: AI Shopping, GPT-5.1 & EU Pressure On Google via @sejournal, @MattGSouthern

This week’s news in SEO brings changes and questions about control.

Google’s shopping AI moved from showing where to buy to completing purchases directly. Google added structured data for merchant shipping policies. OpenAI released GPT-5.1 with personality controls.

And the EU opened an investigation into Google’s site reputation abuse enforcement, raising the question: should one gatekeeper control how independent media funds online journalism?

Here’s what you need to know for this week in SEO.

Google’s Shopping AI Completes Transactions Without Your Site

Google rolled out Gemini-powered shopping features that find products, compare prices, and handle checkout across multiple retailers.

AI Mode in Search can now automate checkout on participating merchants’ sites using your saved Google Pay details, so you don’t have to manually fill payment and shipping forms.

For full details, see our coverage: Google Adds AI Shopping Tools Across Search, Gemini.

Key Facts: AI Mode shopping is launching in Search. Agentic checkout works with select retailers. An AI calling feature can confirm stock, price, and availability with local stores. All features are U.S.-only and gradually rolling out.

Why SEOs Should Pay Attention

Google’s moving from showing where to buy things to completing transactions for you. This changes what “search” means for ecommerce sites.

When AI Mode handles checkout across retailers, your website becomes optional infrastructure. Users never see your brand presentation, never encounter your upsells, never make decisions on your pages. Google’s AI extracts the transaction with your site reduced to inventory management.

The local business calling feature shows where this goes. If Gemini calls five restaurants to check availability, users never see your website, reviews, or menu.

The impact goes beyond rankings to the transaction itself. Your SEO strategy is optimized for driving traffic where users make decisions. Google’s building an environment where AI makes those decisions using your business as a data source, not a destination.

Google Adds Structured Data For Merchant Shipping Policies

Google launched support for merchant shipping policy structured data, letting ecommerce sites describe shipping costs, delivery times, and regional availability so they can surface directly in search results.

The markup can appear alongside your products and in relevant knowledge panels for eligible merchants.

You can get the implementation details in our article: Google Launches Structured Data For Merchant Shipping Policies.

Key Facts: Shipping policy structured data appears with eligible merchant listings, with no geographic limits. It supports flat rate, free, and calculated shipping, including delivery times and regional restrictions. Best used with Product structured data for search results. Validation requires Rich Results Test or URL Inspection, as no specific Search Console report exists.

Why SEOs Should Pay Attention

Shipping costs affect purchase decisions before users reach your site. Displaying this information in search results answers a primary objection at the discovery stage.

The markup lets you differentiate on shipping when competitors don’t show it. If you offer free shipping or faster delivery, you can now surface that advantage in search results rather than hoping users click through to find out.

Implementation is straightforward if you already use Product markup. Add the shipping policy structured data to your existing schema and specify rates, zones, and delivery times. This is one of the more actionable structured data updates Google’s released this year.
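As a hedged illustration, here is the long-established Product-level pattern using schema.org’s OfferShippingDetails, with placeholder values; for the newer merchant-level shipping policy markup described above, check Google’s documentation for the exact types and properties before deploying.

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Product",
  "offers": {
    "@type": "Offer",
    "price": "49.99",
    "priceCurrency": "USD",
    "shippingDetails": {
      "@type": "OfferShippingDetails",
      "shippingRate": { "@type": "MonetaryAmount", "value": "0", "currency": "USD" },
      "shippingDestination": { "@type": "DefinedRegion", "addressCountry": "US" },
      "deliveryTime": {
        "@type": "ShippingDeliveryTime",
        "handlingTime": { "@type": "QuantitativeValue", "minValue": 0, "maxValue": 1, "unitCode": "DAY" },
        "transitTime": { "@type": "QuantitativeValue", "minValue": 1, "maxValue": 5, "unitCode": "DAY" }
      }
    }
  }
}
```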

OpenAI Releases GPT-5.1 With User-Controlled Personality

OpenAI shipped GPT-5.1 models with customizable personality controls and improved instruction following. Users can now adjust ChatGPT’s tone through preset styles or granular characteristic tuning instead of the previous one-size-fits-all approach.

We covered the release here: OpenAI Releases GPT-5.1 With Improved Instruction Following.

Key Facts: GPT-5.1 is now available first to paid users (Pro, Plus, Go, Business), with free access following. Adaptive reasoning optimizes processing time based on query complexity. Legacy GPT-5 models stay available for three months.

Why SEOs Should Pay Attention

You can now customize ChatGPT’s output style to match your needs, rather than editing around the default tone.

The adaptive reasoning means faster responses on simple queries without sacrificing quality on complex requests.

The three-month legacy model availability gives you time to test whether GPT-5.1 performs better for your specific use cases before GPT-5 sunsets.

EU Challenges Google’s Parasite SEO Crackdown

The European Commission opened a Digital Markets Act investigation into whether Google’s site reputation abuse enforcement discriminates against news publishers.

Google published a defense within hours, calling the investigation “misguided” and arguing it protects users from spam. We dug into both sides: Google Defends Parasite SEO Crackdown As EU Opens Investigation.

Key Facts: Publisher groups in France, Germany, Italy, Poland, Spain, and other EU countries report significant traffic drops. DMA violations can incur fines up to 10% of global revenue, rising to 20% for repeat offenses.

Why SEOs Should Pay Attention

The investigation exposes the tension publishers face. Google’s definition of “spam” now includes their revenue model.

Publishers aren’t defending payday loan scams on university domains. They’re arguing that sponsored content with editorial oversight shouldn’t be treated the same as affiliate coupon pages designed purely for ranking manipulation.

Google’s position treats business arrangements as ranking signals rather than judging content quality.

If the EU forces exemptions for “legitimate” sponsored content, every spammer will dress up their tactics with editorial oversight theater. The policy only works if it draws lines. But publishers aren’t wrong to question whether one gatekeeper should control how independent media funds journalism.

This Week In SEO: The Balance Of Power Is Shifting

The news from this week tells a bigger story: Search engines are no longer just organizing the web; they’re absorbing it.

The theme of the week? Power and control.

Google’s AI is deciding what users buy and what content deserves to be seen. OpenAI is letting creators shape the AI’s voice for the first time. And regulators are finally asking who gets to define “fair” visibility online.

As AI reshapes discovery, SEOs face the challenge of staying visible in a world where the search interface itself has become the destination.

Featured Image: Pixel-Shot/Shutterstock

How To Make Search Console Work Harder For You

TL;DR

  1. Search Console has some pretty severe limitations when it comes to storage, anonymized and incomplete data, and API limits.
  2. You can bypass a lot of these limitations and make GSC work much harder for you by setting up far more properties at a subfolder level.
  3. You can have up to 1,000 properties in your Search Console account. Don’t stop with one domain-level property.
  4. All of this allows for far richer indexation, query, and page-level analysis. All for free. Particularly if you make use of the 2,000-URL-per-property API indexing cap.
Image Credit: Harry Clarkson-Bennett

Now, this is mainly applicable to enterprise sites. Sites with a deep subfolder structure and a rich history of publishing a lot of content. Technically, this isn’t publisher-specific. If you work for an ecommerce brand, this should be incredibly useful, too.

Search Console and I love all big and clunky sites equally.

What Is A Search Console Property?

A Search Console Property is a domain, subfolder, or subdomain variation of a website you can prove that you own.

You can set up domain-level or URL-prefix-level properties (Image Credit: Harry Clarkson-Bennett)

If you just set up a domain-level property, you still get access to all the good stuff GSC offers: click and impression data, indexation analysis, and the crawl stats report (only available in domain-level properties), to name a few. But you’re hampered by some pretty severe limitations:

  • 1,000 rows of query and page-level data.
  • 2,000 URL API limit for indexation level analysis each day.
  • Sampled keyword data (and privacy masking).
  • Missing data (in some cases, 70% or more).
  • 16 months of data.

While the 16-month limit and sampled keyword data require you to export your data to BigQuery (or use one of the tools below), you can massively improve your GSC experience by making better use of properties.

There are a number of verification methods available – DNS verification, HTML tag or file upload, Google Analytics tracking code. Once you have set up and verified a domain-level property, you’re free to add any child-level property. Subdomains or subfolders alike.

The crawl stats report can be an absolute goldmine, particularly for large sites (not this one!) (Image Credit: Harry Clarkson-Bennett)

The crawl stats report can be extremely useful for debugging issues like spikes in parameter URLs or from naughty subdomains. Particularly on large sites where departments do things you and I don’t find out about until it’s too late.

But by breaking down changes at a host, file type, and response code level, you can stop things at the source. Easily identify issues affecting your crawl budget before you want to hit someone over the head with their approach to internal linking and parameter URLs.

Usually, anyway. Sometimes people just need a good clump. Metaphorically speaking, of course.

Subdomains are usually seen as separate entities with their own crawl budget. However, this isn’t always the case. According to John Mueller, it is possible that Google may group your subdomains together for crawl budget purposes.

According to Gary Illyes, crawl budget is typically set by host name. So subdomains should have their own crawl budget if the host name is separate from the main domain.

How Can I Identify The Right Properties?

As an SEO, it’s your job to know the website better than anybody else. In most cases, that isn’t too hard because you work with digital ignoramuses. Usually, you can just find this data in GSC. But larger sites need a little more love.

Crawl your site using Screaming Frog, Sitebulb, or the artist formerly known as Deepcrawl, and build out a picture of your site structure if you don’t already know it. Add the most valuable properties first (revenue first, traffic second) and work from there.
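If your crawler can export a flat list of URLs, a quick script like the sketch below (the export filename is an assumption) can count pages per top-level subfolder and help you rank which properties to register first.

```python
# Quick sketch: count URLs per top-level subfolder from a crawl export.
# Assumes a plain text file with one absolute URL per line; adjust for your crawler's format.
from collections import Counter
from urllib.parse import urlparse

def subfolder_counts(url_file: str) -> Counter:
    """Group URLs by their first path segment (e.g., /news/, /sport/)."""
    counts = Counter()
    with open(url_file, encoding="utf-8") as f:
        for line in f:
            url = line.strip()
            if not url:
                continue
            segments = [s for s in urlparse(url).path.split("/") if s]
            folder = f"/{segments[0]}/" if segments else "/"
            counts[folder] += 1
    return counts

if __name__ == "__main__":
    for folder, n in subfolder_counts("crawl_urls.txt").most_common(25):
        print(f"{folder}\t{n}")
```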

Some Alternatives To GSC

Before going any further, it would be remiss of me not to mention some excellent alternatives to GSC. Alternatives that completely remove these limitations for you.

SEO Stack

SEO Stack is a fantastic tool that removes all query limits and has an in-built MCP-style setup where you can really talk to your data. For example: show me content that has always performed well in September, or identify pages with a healthy query counting profile.

Daniel has been very vocal about query counting, and it’s a fantastic way to understand the direction of travel your site or content is taking in search. Gaining queries in the top 3 or top 10 positions – good. Losing queries there and gaining them further down the page – bad.

SEO Gets

SEO Gets is a more budget-friendly alternative to SEO Stack (which in itself isn’t that expensive). SEO Gets also removes the standard row limitations associated with Search Console and makes content analysis much more efficient.

Growing and decaying pages and queries in SEO Gets are super useful (Image Credit: Harry Clarkson-Bennett)

Create keyword and page groups for query counting and click and impression analysis at a content cluster level. SEO Gets has arguably the best free version of any tool on the market.

Indexing Insight

Indexing Insight – Adam Gent’s ultra-detailed indexation analysis tool – is a lifesaver for large, sprawling websites. 2,000 URLs per day just doesn’t cut the mustard for enterprise sites. But by cleverly taking the multi-property approach, you can leverage 2,000 URLs per property.

With some excellent visualizations and datapoints (did you know if a URL hasn’t been crawled for 130 days, it drops out of the index?), you need a solution like this. Particularly on legacy and enterprise sites.

Remove the indexation limits of 2,000 URLs per day with the API and the 1,000 row URL limit (Image Credit: Harry Clarkson-Bennett)

All of these tools instantly improve your Search Console experience.

Benefits Of A Multi-Property Approach

Arguably, the most effective way of getting around some of the aforementioned issues is to scale the number of properties you own. For two main reasons – it’s free and it gets around core API limitations.

Everyone likes free stuff. I once walked past a newsagent doing an opening day promotion where they were giving away tins of chopped tomatoes. Which was bizarre. What was more bizarre was that there was a queue. A queue I ended up joining.

Spaghetti Bolognese has never tasted so sweet.

Granular Indexation Tracking

Arguably, one of Search Console’s best but most limiting features is its indexation analysis. Understanding the differences between Crawled – Currently Not Indexed and Discovered – Currently Not Indexed can help you make smart decisions that improve the efficiency of your site. Significantly improving your crawl budget and internal linking strategies.

Image Credit: Harry Clarkson-Bennett

Pages that sit in the Crawled – Currently Not Indexed pipeline may not require any immediate action. The page has been crawled but hasn’t been deemed fit for Google’s index. This could signify page quality issues, so it’s worth ensuring your content adds value and your internal linking prioritizes important pages.

Discovered – Currently Not Indexed is slightly different. It means that Google has found the URL, but hasn’t yet crawled it. It could be that your content output isn’t quite on par with Google’s perceived value of your site. Or that your internal linking structure needs to better prioritize important content. Or some kind of server or technical issue.

All of this requires at least a rudimentary understanding of how Google’s indexation pipeline works. It is not a binary approach. Gary Illyes said Google has a tiered system of indexation. Content that needs to be served more frequently is stored in a better-quality, more expensive system. Less valuable content is stored in a less expensive system.

How Google’s crawling and rendering system works (Image Credit: Harry Clarkson-Bennett)

Less monkey see, monkey do; more monkey see, monkey make decision based on the site’s value, crawl budget, efficiency, server load, and use of JavaScript.

The tiered approach to indexation prioritizes the perceived value and raw HTML of a page. JavaScript is queued because it is so much more resource-intensive. Hence why SEOs bang on about having your content rendered on the server side.

Adam has a very good guide to the types of not indexed pages in GSC and what they mean here.

Worth noting the page indexation tool isn’t completely up to date. I believe it’s updated a couple of times a week. But I can’t remember where I got that information, so don’t hold me to that…

If you’re a big news publisher you’ll see lots of your newsier content in the Crawled – Currently Not Indexed category. But when you inspect the URL (which you absolutely should do) it might be indexed. There is a delay.

Indexing API Scalability

When you start working on larger websites – and I am talking about websites where subfolders have well over 500,000 pages – the API’s 2,000 URL limitation becomes apparent. You just cannot effectively identify pages that drop in and out of the “Why Pages Aren’t Indexed?” section.

Not great, have seen worse (Image Credit: Harry Clarkson-Bennett)

But when you set up multiple properties, you can scale effectively.

The 2,000 limit only applies at a property level. So if you set up a domain-level property alongside 20 other properties (at the subfolder level), you can leverage up to 42,000 URLs per day. The more you do, the better.

And the API does have some other benefits, but it doesn’t guarantee indexing. It is a request, not a command.

To set it up, you need to enable the API in Google Cloud Console. You can follow this semi-helpful quickstart guide. It is not fun. It is a pain in the arse. But it is worth it. Then you’ll need a Python script to send API requests and to monitor API quotas and responses (2xx, 3xx, 4xx, etc.).
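As a rough sketch of what that script might look like, here is a minimal example using google-api-python-client against the Search Console API’s URL inspection endpoint. It assumes a service account that has been added as a user on the property; the key file path, property URL, and page URL are placeholders, and the 2,000-inspections-per-property daily quota still applies.

```python
# Minimal sketch of URL inspection via the Search Console API
# (google-api-python-client + a service account added to the property).
# Key file path, property URLs, and page URLs are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters"]
KEY_FILE = "service-account.json"  # hypothetical path

credentials = service_account.Credentials.from_service_account_file(KEY_FILE, scopes=SCOPES)
service = build("searchconsole", "v1", credentials=credentials)

def inspect(url: str, property_url: str) -> dict:
    """Return the index status result for one URL within one property."""
    body = {"inspectionUrl": url, "siteUrl": property_url}
    response = service.urlInspection().index().inspect(body=body).execute()
    return response["inspectionResult"]["indexStatusResult"]

if __name__ == "__main__":
    result = inspect(
        "https://www.example.com/news/some-article/",
        "https://www.example.com/news/",  # the subfolder-level property
    )
    print(result.get("coverageState"), result.get("lastCrawlTime"))
```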

If you want to get fancy, you can combine it with your publishing data to figure out exactly how long pages in specific sections take to get indexed. And you should always want to get fancy.

This is a really good signal as to what your most important subfolders are in Google’s eyes, too. Performant vs. under-performing categories.

Granular Click And Impression Data

An essential for large sites. Not only does the default Search Console only store 1,000 rows of query and URL data, but it only stores it for 16 months. While that sounds like a long time, fast forward a year or two, and you will wish you had started storing the data in BigQuery.

Particularly when it comes to looking at YoY click behavior and event planning. The teeth grinding alone will pay for your dentist’s annual trip to Aruba.

But far and away the easiest way to see search data at a more granular level is to create more GSC properties. While each property still has the same query and URL caps, because you have multiple properties instead of one, those caps become far less limiting.
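A small extension of the earlier script idea can then pull query-level clicks and impressions per property via the Search Analytics API. The property list and date range below are placeholders, and the sketch assumes the same authenticated service object as the inspection example.

```python
# Sketch: pull top query rows per property with the Search Analytics API.
# Assumes the authenticated `service` object from the inspection sketch;
# property URLs and dates are placeholders.
PROPERTIES = [
    "https://www.example.com/",        # domain- or prefix-level property
    "https://www.example.com/news/",   # subfolder-level properties
    "https://www.example.com/sport/",
]

def top_queries(property_url: str, start: str, end: str, limit: int = 25000) -> list:
    """Return up to `limit` query rows (clicks/impressions) for one property."""
    body = {
        "startDate": start,
        "endDate": end,
        "dimensions": ["query"],
        "rowLimit": limit,
    }
    response = service.searchanalytics().query(siteUrl=property_url, body=body).execute()
    return response.get("rows", [])

if __name__ == "__main__":
    for prop in PROPERTIES:
        rows = top_queries(prop, "2025-01-01", "2025-01-31")
        print(prop, len(rows), "query rows")
```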

What About Sitemaps?

Not directly related to GSC indexation, but a point of note. Sitemaps are not a particularly strong tool in your arsenal when it comes to encouraging indexing of content. The indexation of content is driven by how “helpful” it is to users.

Now, it would be remiss of me not to highlight that news sitemaps are slightly different. When speed to publish and indexation are so important, you want to highlight your freshest articles in a dedicated place.

Ultimately, it comes down to Navboost. Good vs. bad clicks and the last longest click. Or in more of a news sense, Glue – a huge table of user interactions, designed to rank fresh content in real-time and keep the index dynamic. Indexation is driven by your content being valuable enough to users for Google to continue to store in its index.

Glue emphasizes immediate interaction signals like hovers and swipes for more instant feedback (Image Credit: Harry Clarkson-Bennett)

Thanks to decades of experience (and confirmation via the DoJ trial and the Google Leak), we know that your site’s authority (Q*), impact over time, and internal linking structure all play a key role. But once it’s indexed, it’s all about user engagement. Sitemap or no sitemap, you can’t force people to love your beige, miserable content.

And Sitemap Indexes?

Most larger sites use sitemap indexes. Essentially, a sitemap of sitemaps to manage larger websites that exceed the 50,000-URL-per-sitemap limit. When you upload the sitemap index to Search Console, don’t stop there. Upload every individual sitemap in your sitemap index.

This gives you access to indexation at a sitemap level in the page indexing or sitemaps report. Something that is much harder to manage when you have millions of pages in a sitemap index.

Seeing data at a sitemap level gives more granular indexation data in GSC (Image Credit: Harry Clarkson-Bennett)

Take the same approach with sitemaps as we have discussed with properties. More is generally better.
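If the sitemap index is long, submitting each child sitemap by hand gets tedious. The sketch below fetches the index, parses out each child sitemap, and submits it via the API’s sitemaps endpoint; the index URL and property are placeholders, and it reuses the authenticated service object from the earlier sketches (which needs the full webmasters scope to submit).

```python
# Sketch: register every child sitemap from a sitemap index with Search Console.
# Reuses the authenticated `service` object from the earlier sketches;
# the sitemap index URL and property are placeholders.
import urllib.request
import xml.etree.ElementTree as ET

SITEMAP_INDEX = "https://www.example.com/sitemap_index.xml"  # hypothetical
PROPERTY = "https://www.example.com/"

def child_sitemaps(index_url: str) -> list[str]:
    """Return the <loc> of every sitemap listed in a sitemap index."""
    with urllib.request.urlopen(index_url) as resp:
        tree = ET.fromstring(resp.read())
    ns = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    return [loc.text.strip() for loc in tree.findall(".//sm:sitemap/sm:loc", ns)]

if __name__ == "__main__":
    for sitemap_url in child_sitemaps(SITEMAP_INDEX):
        service.sitemaps().submit(siteUrl=PROPERTY, feedpath=sitemap_url).execute()
        print("Submitted:", sitemap_url)
```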

Worth knowing that each document is also given a DocID. The DocID stores signals used to score the page’s popularity: user clicks, its quality and authoritativeness, crawl data, and a spam score, among others.

Anything classified as crucial to ranking a page is stored and used for indexation and ranking purposes.

What Should I Do Next?

  1. Assess your current GSC setup – is it working hard enough for you?
  2. Do you have access to a domain-level property and a crawl stats report?
  3. Have you already broken your site down into “properties” in GSC?
  4. If not, crawl your site and establish the subfolders you want to add.
  5. Review your sitemap setup. Do you just have a sitemap index? Have you added the individual sitemaps to GSC, too?
  6. Consider connecting your data to BigQuery and storing more than 16 months of it.
  7. Consider connecting to the API via Google Cloud Console.
  8. Review the above tools and see if they’d add value.

Ultimately, Search Console is very useful. But it has significant limitations, and to be fair, it is free. Other tools have surpassed it in many ways. But if nothing else, you should make it work as hard as possible.

This post was originally published on Leadership in SEO.


Featured Image: N Universe/Shutterstock

Is Google About To Go Full AI Mode? via @sejournal, @wburton27

AI search is rapidly changing the way people discover content and engage with brands. Logan Kilpatrick, Google’s lead product manager for AI products, suggested in an X (formerly Twitter) post that “AI Mode” will become the default for Google Search “soon,” and later clarified his statement. But if that happens, what could it mean for the SEO industry, especially with over 100 million monthly active users already searching in AI Mode, according to Google?

Screenshot from X (Twitter), November 2025

Let’s explore some possibilities, but before we do, let’s distinguish between AI Mode and AI Overviews, as there is a clear difference between the two.

AI Overviews Vs. AI Mode

AI Overviews are short, AI-generated summaries that appear above traditional search results for some queries, helping users find information quickly. AIOs provide quick, concise answers and save users time by reducing the need to click on links, which reduces clicks and traffic to brands.

AI Mode is a more advanced, interactive search experience that might replace the standard search results page in the future. It helps with complex, multi-step, or open-ended questions by providing a more comprehensive, conversational AI-powered response that you can follow up on to learn more. Google added “AI Mode” to its search page earlier this year, looking to keep its millions of users from leaving for other AI models.

What Could Potentially Happen If AI Mode Becomes Default?

If Google decides to switch to AI Mode by default, brands will almost certainly see a decrease in organic traffic, since users will get direct answers to their queries and won’t need to click through to websites; they will find what they need right in AI Mode. We are already seeing this trend with AI Overviews, and if AI Mode becomes the default, clicks will fall further.

Brands May Rely More On Paid For Visibility  

Currently, the way AI Mode is designed, there are no ads and no way for Google to monetize the interface, but that is all about to change, and change extremely fast. Google’s head of Search, Liz Reid, shared a look into how the company is navigating its transition into the AI era – and how it’s thinking about keeping its multibillion-dollar ad business alive. In 2024, Google made $264.59 billion in ads, according to Statista, and that figure has been growing year over year.

Screenshot from Statista, November 2025

Google is beginning to roll out ads in AI Mode, but the effort is in its infancy. Google is looking into showing ads when they’re high-quality and relevant. Since AI Mode queries are 2x to 3x longer than they are on main search, Google can serve better-targeted, higher-quality ads, according to Liz Reid. Brands that can afford to be visible in AI Mode paid results will benefit, but brands that only focus on traditional SEO tactics and strategies could be left behind.

Google has also added advertisements to AI Overviews, increasing Search ad sales, so we can expect the same from AI Mode.

A Potential Shift In Visibility And Discovery

AI search is causing us to move away from traditional SEO metrics, i.e., keyword rankings and click-through rates, to brand visibility and relevance. Your brand should be cited as the authoritative source for AI answers, and if your brand is not visible as the answer, then you will lose more clicks.

Measurement

Tracking the customer journey may become harder because users interact within the AI interface rather than on your brand’s website. Traditional analytics will provide fewer insights and will push brands to develop new metrics focused on AI citations, brand mentions, and local visibility. We are already seeing this with the emergence of AI tools, from traditional players like Semrush and Ahrefs to new AI players like PeecAI and Profound, to name a few.

Loss Of Control Over Brand Narrative

Since AI Overviews pull information from various online sources to build a brand’s presence, if your brand lacks a good brand strategy and has inconsistent, outdated, or poorly managed information across the web (e.g., reviews, social signals, and local listings), then AI may represent your brand inaccurately.

What Could Potentially Happen To Google Chrome?

If Google does go to full AI Mode by default, Chrome could potentially undergo a major transformation with deep integration of Gemini and other AI capabilities, which would change the browser from a passive tool to a proactive, intelligent assistant. According to eMarketer, Gemini is growing its user base faster than ChatGPT.

Screenshot from eMarketer, November 2025

OpenAI has already launched its AI browser, ChatGPT Atlas, which is currently only available on macOS and is challenging Google Chrome.

Screenshot from ChatGPT Atlas, November 2025

If Google Does Make AI Mode Default, What Can We Do?

  • Experiment with AI paid ads when they become available and put aside some budget and test the impact and return on investment (ROI) of ads in AI Mode.
  • Focus on making sure conversion funnels and processes are easy and provide a good user experience.
  • Be present and have great content everywhere your audience is. Your brand must have a strong presence across Reddit, Quora, YouTube, OpenAI, Perplexity, and other places where end users are looking for information about your brand. For example, Apple is exploring search options for Safari, which could end its partnership with Google. At the end of the day, we will see whether Google maintains the relationship or Apple goes somewhere else, like OpenAI, which could boost traffic and bring more users to OpenAI or another large language model (LLM).
  • Continue to optimize for AIO by creating high-quality, authoritative content that directly answers user questions, is well-structured, and easy for AI to understand. This involves creating new content and refreshing your old content with up-to-date research, original information, and different perspectives.

Wrapping Up

The shift toward AI-powered search isn’t hypothetical anymore; it’s actually here and moving fast. With AI Overviews and AI Mode gaining traction among more than 100 million monthly users, Google is positioning itself for a future where conversational, answer-focused experiences may replace traditional search results.

If AI Mode becomes the default search for Google, it won’t just change how users search; it will fundamentally reshape how brands earn visibility, traffic, and trust online.

For brands, publishers, and SEOs, this transition presents both risks and opportunities. Organic traffic will almost certainly decline as more answers stay within Google’s ecosystem. Paid visibility in AI results will grow rapidly, favoring brands with budgets and adaptable strategies. And success will depend less on ranking for keywords and more on becoming a trusted source that AI cites, references, or recommends across platforms.

This era will demand a new kind of optimization centered on brand authority, AI citations, structured data, user trust signals, and multi-platform presence.

No one has a crystal ball and knows what a full AI Mode future looks like, but brands that adapt early will be the market share leaders, and those that wait will lose visibility, traffic, and relevance.

Featured Image: Collagery/Shutterstock

Lazy Link Building Strategies That Work via @sejournal, @martinibuster

I like coming up with novel approaches to link building. One way to brainstorm an approach is to reverse a common method. I created a handful of approaches to link building; several are passive and two others are a little more active but have very little to do with email outreach. I wrote about these tips back around 2013, but I’ve polished them up and updated them for today.

Passive Link Building

Someone asked that I put together some tips for those who are too lazy to do link building. So here it goes!

Guilt Trip Copyright Infringers

Check who’s stealing your content. Be hard on scrapers. But if it’s an otherwise legit site, you might want to hold off asking them to take down your content. Check whether they’re linking to a competitor or similar sites, like from a links page.

You can ask them nicely to take down the content and, after they email you back to confirm it’s down, email them back to thank them. But then say something like, “I see you are linking to Site-X.com. If my content was good enough to show on your site, then I would be grateful and much obliged if you considered it good enough to list from your links page.”

I heard a keynote speaker at an SEO conference once encouraging people to come down hard on people who steal your content. I strongly disagree with that approach. Some people who steal your content are under the impression that if it’s on the Internet then it’s free and they can use it on their own site. Some think it’s free to use as long as they link back to your site.

If they are linking to your site, tell them that you prefer they don’t infringe on your copyright but that you would be happy to write them a different article they can use as long as they link back to your site. You can be nice to people and still get a link.

Reverse Guest Posting

Instead of publishing articles on someone else’s site, solicit people to publish on your site. Many people tweet, promote, and link from their sites to sites that they are interviewed on. An interesting thing about doing this is that interviewing people who have a certain amount of celebrity helps to bring more people to your site, especially if people are searching for that person.

Relationship Building

Authors of books are great for this kind of outreach. People are interested in what authors and experts say. Sometimes you can find the most popular authors and influencers at industry conferences. I’ve met some really famous and influential people at conferences, got their email addresses, and scored interviews just by going up and talking to them.

This is called relationship building. SEOs and digital marketers are so overly focused on sending out emails and doing everything online that they forget that people actually get together in person at industry events, meetups, and other kinds of social events.

Giveaways

This is an oldie, and I get that many SEOs have talked about it. But it’s something I used successfully way back around 2005. I did an annual giveaway for my readers and website members.

The way I did it was to contact some manufacturers of products that are popular with my readers and ask for a discount if I buy in bulk and tell them I’ll be promoting their products to my subscribers, readers, and members. I’ve been responsible for making several companies popular by bringing attention to their products, elevating them from a regional business to a nationwide business.

Leverage Niche Audience For Links

The way to do this is to identify an underserved subtopic of your niche, then create a useful section that addresses a need for that niche. The idea is to create a compelling reason to link to the site.

Here is an example of how to do this for a travel destination site.

Research gluten-free, dairy-free, nut-free, and raw food dining destinations. Then make a point to visit, interview, and build a resource for those.

Conduct interviews with lodging and restaurant owners that offer gluten-free options. You’ll be surprised by how many restaurants and lodgings might decide on their own to link to your site or maybe just hint at it.

Summary

Reach out to sites about the niche topic, not just to businesses but also to organizations and associations related to that niche that have links and resources pages. Just tell them about the site, quickly explain what it offers, and ask for a link. This method is flexible and can be adapted to a wide range of niche topics. And if they have an email newsletter or publish articles, suggest contributing to those, but don’t ask for a link; just ask for a mention.

Don’t underestimate the power of building positive awareness of your site. Focus on creating positive feelings for your site (goodwill) and generating positive word of mouth, otherwise known as external signals of quality. The rankings will generally follow.

Featured Image by Shutterstock/pathdoc

The Quid Pro No Method Of Link Building via @sejournal, @martinibuster

Expressly paying for links has been out for a while. Quid Pro No is in. These are some things you can do when a website asks for money in exchange for a link. During the course of building links, whether it’s free links, publishing an article, or getting a brand mention, it’s not unusual to get solicited for money. It’s tempting to take the bait and get a project done. But I’m going to suggest some considerations prior to making a decision, as well as a way to turn it around using an approach that I call Quid Pro No.

Link building, digital PR, and brand mention building can often lead to solicitations for a paid link. There are many good reasons for not engaging in paid links, and in my experience, when someone asks you for money in return for a link, it’s possible to get the link without doing it their way.

Red Light Means Stop

The first consideration is that someone who has their hand out for money is a red light because it’s highly likely they have done this before and are linking to low-quality websites in really bad neighborhoods, putting the publisher’s site and any sites associated with it into the outlier part of the web graph where sites are identified as spam and tend not to get indexed. In this case, consider it a favor that they outed their site for the crap neighborhood it resides in, and walk away. Quid pro… no.

Getting solicited for money can be a frequent occurrence. Site publishers, some of them apparently legit, are publishing Guest Post Submission Guidelines for the purpose of attracting paying submissions. It’s an industry and overly normalized in certain circles. Beware.

Spook The Fish

A less frequent occurrence is the newb who's trying to extract something. If the site checks out, then there may be room for some kind of concession. If they're asking for money, Quid Pro No in this case means using FUD (fear, uncertainty, and doubt) to steer them away from that kind of activity, THEN turning them around to doing the project on your terms.

When angling on a river, a fish that's on the hook might make a run downstream away from you, which makes it tough to land because you're fighting both the fish and the current. Sometimes a tap on the rod will spook it into changing position. Sometimes a sharp pull can get it to turn around. For this character, I have found it efficacious to spook them with all the bad things that can happen and turn them around to where I want them to be.

Very briefly, and in the most polite terms, explain you’d love to do business, but that there are other considerations. Here’s what you can trot out:

  • FTC Guidelines
    FTC guidelines prohibit a web publisher from accepting money for an unlabeled advertisement.
  • Google Guidelines
    Google's spam policies prohibit paid links that pass ranking credit; bought links need to be qualified with rel="sponsored" or rel="nofollow".

Land The Link

"What's in it for me" is a useful concept for convincing someone that it's in their interest to do things your way. The other party wants something, so it's worthwhile to make them feel as if they're getting something out of the deal.

The approach I take for closing a project, whether it's a free link or an article, is to circle back to the ask by communicating why my site is high quality and the ways we can cross-promote. It's essentially relationship building. The message is that your site is authoritative and well promoted, and that there are ways both sites can benefit without doing a straight link buy.

But at this point I want to emphasize again that any site asking for money in exchange for a link is not necessarily in a good neighborhood. You might not actually want a link from them if they're linking out to low-quality sites.

Or Go For A Labeled Sponsored Post

However, another way to turn this around is to just go ahead and pay them, as long as the article is labeled as a sponsored post and contains nofollow links and/or brand mentions. Sponsored posts get indexed by search engines and AI platforms, which can use those mentions as validation for how great your site is and recommend it.
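For illustration, a compliant placement might look something like the snippet below: the disclosure stays visible to readers and the paid link carries Google's documented rel="sponsored" qualifier (optionally combined with rel="nofollow"). The URL and wording are placeholders, not a template from any particular publisher.

    <!-- Disclosure label shown to readers at the top of the article -->
    <p><strong>Sponsored post:</strong> This article was paid for by Example Brand.</p>

    <!-- The paid link is qualified so it does not pass ranking credit -->
    <p>Learn more at <a href="https://www.example.com/" rel="sponsored nofollow">Example Brand</a>.</p>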

What's beautiful about a labeled sponsored post is that it gives you full control over the messaging, which can be more valuable than a tossed-off link in a random paragraph. And because everything is disclosed and compliant, you reduce the long-term risk while still capturing visibility in AI Mode, ChatGPT, and Perplexity through the citation signals.

Quid Pro No

Quid Pro No is about responding negatively to a solicitation, turning it around, and getting something you want without actually saying the word no.

Featured Image by Shutterstock/Studio Romantic

Google Defends Parasite SEO Crackdown As EU Opens Investigation via @sejournal, @MattGSouthern

Google has defended its enforcement of site reputation abuse policies after the European Commission announced an investigation into whether the company unfairly demotes news publishers in search results.

The company published a blog post stating the investigation “is misguided and risks harming millions of European users” and that it “risks rewarding bad actors and degrading the quality of search results.”

Google’s Chief Scientist for Search, Pandu Nayak, wrote the response.

Background

The European Commission announced an investigation under the Digital Markets Act examining whether Google’s anti-spam policies unfairly penalize legitimate publisher revenue models.

Publishers complained that Google demotes news sites running sponsored content and third-party promotional material. EU antitrust chief Teresa Ribera said:

“We are concerned that Google’s policies do not allow news publishers to be treated in a fair, reasonable and non-discriminatory manner in its search results.”

Google updated its site reputation abuse policy last year to combat parasite SEO. The practice involves spammers paying publishers to host content on established domains to manipulate search rankings.

The policy targets content like payday loan reviews on educational sites, casino content on medical sites, or third-party coupon pages on news publishers. Google provided specific examples in its announcement including weight-loss pill spam and payday loan promotions.

Manual enforcement began shortly after. Google issued penalties to major publishers including Forbes, The Wall Street Journal, Time and CNN in November 2024.

Google later updated the policy to clarify that first-party oversight doesn’t exempt content primarily designed to exploit ranking signals.

Google’s Defense

Google’s response emphasized three points.

First, Google stated that a German court dismissed a similar claim, ruling the anti-spam policy was “valid, reasonable, and applied consistently.”

Second, Google says its policy protects users from scams and low-quality content. Allowing pay-to-play ranking manipulation would “enable bad actors to displace sites that don’t use those spammy tactics.”

Third, Google says smaller creators support the crackdown. The company claims its policy “helps level the playing field” so legitimate sites competing on content quality aren’t outranked by sites using deceptive tactics.

Nayak argues the Digital Markets Act is already making Search "less helpful for European businesses and users," and says the new probe risks rewarding bad actors.

The company has relied exclusively on manual enforcement so far. Google confirmed in May 2024 that it hadn’t launched algorithmic actions for site reputation abuse, only manual reviews by human evaluators.

Google added site reputation abuse to its Search Quality Rater Guidelines in January 2025, defining it as content published on host sites “mainly because of that host site’s already-established ranking signals.”

Why This Matters

The investigation creates a conflict between spam enforcement and publisher business models.

Google maintains parasite SEO degrades search results regardless of who profits. Publishers argue sponsored content with editorial oversight provides legitimate value and revenue during challenging times for media.

The distinction matters. If Google’s policy captures legitimate publisher-advertiser partnerships, it restricts how news organizations monetize content. If the policy only targets manipulative tactics, it protects search quality.

The EU’s position suggests regulators view Google’s enforcement as potentially discriminatory. The Digital Markets Act prohibits gatekeepers from unfairly penalizing others, with fines up to 10% of global revenue for violations.

Google addressed concerns about the policy in December 2024, confirming that affiliate content properly marked isn’t affected and that publishers must submit reconsideration requests through Search Console to remove penalties.

The updated policy documentation clarified that simply having third-party content isn’t a violation unless explicitly published to exploit a site’s rankings.

The policy has sparked debate in the SEO community about whether Google should penalize sites based on business arrangements rather than content quality.

Looking Ahead

The European Commission has opened the investigation under the Digital Markets Act and will now gather evidence and define the specific DMA provisions under examination.

Google will receive a formal statement of objections outlining the alleged violations. The company can respond with arguments defending its policies.

DMA investigations move faster than traditional antitrust cases. Publishers may submit formal complaints providing evidence of traffic losses and revenue impacts.

The outcome could force changes to how Google enforces spam policies in Europe or validate its current approach to protecting search quality.


Featured Image: daily_creativity/Shutterstock

llms.txt: The Web’s Next Great Idea, Or Its Next Spam Magnet via @sejournal, @DuaneForrester

At a recent conference, I was asked if llms.txt mattered. I’m personally not a fan, and we’ll get into why below. I listened to a friend who told me I needed to learn more about it as she believed I didn’t fully understand the proposal, and I have to admit that she was right. After doing a deep dive on it, I now understand it much better. Unfortunately, that only served to crystallize my initial misgivings. And while this may sound like a single person disliking an idea, I’m actually trying to view this from the perspective of the search engine or the AI platform. Why would they, or why wouldn’t they, adopt this protocol? And that POV led me to some, I think, interesting insights.

We all know that search is not the only discovery layer anymore. Large-language-model (LLM)-driven tools are rewriting how web content is found, consumed, and represented. The proposed protocol, called llms.txt, attempts to help websites guide those tools. But the idea carries the same trust challenges that killed earlier “help the machine understand me” signals. This article explores what llms.txt is meant to do (as I understand it), why platforms would be reluctant, how it can be abused, and what must change before it becomes meaningful.

Image Credit: Duane Forrester

What llms.txt Hoped To Fix

Modern websites are built for human browsers: heavy JavaScript, complex navigation, interstitials, ads, dynamic templates. But most LLMs, especially at inference time, operate in constrained environments: limited context windows, single-pass document reads, and simpler retrieval than traditional search indexers. The original proposal from Answer.AI suggests adding an llms.txt markdown file at the root of a site, which lists the most important pages, optionally with flattened content so AI systems don’t have to scramble through noise.

Supporters describe the file as “a hand-crafted sitemap for AI tools” rather than a crawl-block file. In short, the theory: Give your site’s most valuable content in a cleaner, more accessible format so tools don’t skip it or misinterpret it.
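To make that concrete, an llms.txt file is roughly shaped like the sketch below, per the Answer.AI proposal: an H1 with the site or project name, a short blockquote summary, and sections of markdown links, with an "Optional" section for lower-priority pages. The names and URLs here are placeholders.

    # Example Site

    > A short, plain-language summary of what the site covers and who it serves.

    ## Guides

    - [Getting started](https://www.example.com/getting-started.md): a flattened, markdown version of the key page
    - [Pricing overview](https://www.example.com/pricing.md): plans and terms

    ## Optional

    - [About the site](https://www.example.com/about.md)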

The Trust Problem That Never Dies

If you step back, you discover this is a familiar pattern. Early in the web’s history, something like the meta keywords tag let a site declare what it was about; it was widely abused and ultimately ignored. Similarly, authorship markup (rel=author, etc) tried to help machines understand authority, and again, manipulation followed. Structured data (schema.org) succeeded only after years of governance and shared adoption across search engines. llms.txt sits squarely inside this lineage: a self-declared signal that promises clarity but trusts the publisher to tell the truth. Without verification, every little root-file standard becomes a vector for manipulation.

The Abuse Playbook (What Spam Teams See Immediately)

What concerns platform policy teams is plain: If a website publishes a file called llms.txt and claims whatever it likes, how does the platform know that what’s listed matches the live content users see, or can be trusted in any way? Several exploit paths open up:

  1. Cloaking through the manifest. A site lists pages in the file that are hidden from regular visitors or behind paywalls, then the AI tool ingests content nobody else sees.
  2. Keyword stuffing or link dumping. The file becomes a directory stuffed with affiliate links, low-value pages, or keyword-heavy anchors aimed at gaming retrieval.
  3. Poisoning or biasing content. If agents trust manifest entries more than the crawl of messy HTML, a malicious actor can place manipulative instructions or biased lists that affect downstream results.
  4. Third-party link chains. The file could point to off-domain URLs, redirect farms, or content islands, making your site a conduit or amplifier for low-quality content.
  5. Trust laundering. The presence of a manifest might lead an LLM to assign higher weight to listed URLs, so a thin or spammy page gets a boost purely by appearance of structure.

The broader commentary flags this risk. For instance, some industry observers argue that llms.txt “creates opportunities for abuse, such as cloaking.” And community feedback apparently confirms minimal actual uptake: “No LLM reads them.” That absence of usage ironically means fewer real-world case studies of abuse, but it also means fewer safety mechanisms have been tested.

Why Platforms Hesitate

From a platform’s viewpoint, the calculus is pragmatic: New signals add cost, risk, and enforcement burden. Here’s how the logic works.

First, signal quality. If llms.txt entries are noisy, spammy, or inconsistent with the live site, then trusting them can reduce rather than raise content quality. Platforms must ask: Will this file improve our model’s answer accuracy or create risk of misinformation or manipulation?

Second, verification cost. To trust a manifest, you need to cross-check it against the live HTML, canonical tags, structured data, site logs, etc. That takes resources. Without verification, a manifest is just another list that might lie.

Third, abuse handling. If a bad actor publishes an llms.txt manifest that lists misleading URLs which an LLM ingests, who handles the fallout? The site owner? The AI platform? The model provider? That liability issue is real.

Fourth, user-harm risk. An LLM citing content from a manifest might produce inaccurate or biased answers. This adds to the problem we already face of people following incorrect, or even dangerous, answers.

Google has already stated that it will not rely on llms.txt for its “AI Overviews” feature and continues to follow “normal SEO.” And John Mueller wrote: “FWIW no AI system currently uses llms.txt.” So the tools that could use the manifest are largely staying on the sidelines. This reflects the idea that a root-file standard without established trust is a liability.

Why Adoption Without Governance Fails

Every successful web standard has shared DNA: a governing body, a clear vocabulary, and an enforcement pathway. The standards that survive all answer one question early … “Who owns the rules?”

Schema.org worked because that answer was clear. It began as a coalition between Bing, Google, Yahoo, and Yandex. The collaboration defined a bounded vocabulary, agreed syntax, and a feedback loop with publishers. When abuse emerged (fake reviews, fake product data), those engines coordinated enforcement and refined documentation. The signal endured because it wasn’t owned by a single company or left to self-police.

Robots.txt, in contrast, survived by being minimal. It didn’t try to describe content quality or semantics. It only told crawlers what not to touch. That simplicity reduced its surface area for abuse. It required almost no trust between webmasters and platforms. The worst that could happen was over-blocking your own content; there was no incentive to lie inside the file.
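For comparison, a complete robots.txt can be as small as the sketch below; the paths and the GPTBot entry are purely illustrative, and the file does nothing more than tell crawlers which paths to skip.

    # Illustrative robots.txt: it only describes what crawlers should not fetch
    User-agent: *
    Disallow: /admin/
    Disallow: /cart/

    # A specific crawler can be blocked entirely
    User-agent: GPTBot
    Disallow: /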

llms.txt lives in the opposite world. It invites publishers to self-declare what matters most and, in its full-text variant, what the "truth" of that content is. There's no consortium overseeing the format, no standardized schema to validate against, and no enforcement group to vet misuse. Anyone can publish one. Nobody has to respect it. And no major LLM provider today is known to consume it in production. Maybe some do privately, but publicly there have been no announcements of adoption.

What Would Need To Change For Trust To Build

To shift from optional neat-idea to actual trusted signal, several conditions must be met, and each of these entails a cost in either dollars or human time, so again, dollars.

  • First, manifest verification. A signature or DNS-based verification could tie an llms.txt file to site ownership, reducing spoof risk. (cost to website)
  • Second, cross-checking. Platforms should validate that the URLs listed correspond to live, public pages, and identify mismatches or cloaking via automated checks; a rough sketch of what such a check might look like follows this list. (cost to engine/platform)
  • Third, transparency and logging. Public registries of manifests and logs of updates would make dramatic changes visible and allow community auditing. (cost to someone)
  • Fourth, measurement of benefit. Platforms need empirical evidence that ingesting llms.txt leads to meaningful improvements in answer correctness, citation accuracy, or brand representation. Until then, this is speculative. (cost to engine/platform)
  • Finally, abuse deterrence. Mechanisms must be built to detect and penalize spammy or manipulative manifest usage. Without that, spam teams simply assume negative benefit. (cost to engine/platform)
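To make the cross-checking idea concrete, here is a minimal sketch in Python of what an automated check might look like, assuming a manifest that uses ordinary markdown links. Real platform-side verification would also need rendering, cloaking detection, and logging, so treat this as an illustration rather than a production tool.

    # Rough sketch: confirm that URLs listed in a site's llms.txt resolve to
    # live, public pages. Assumes standard markdown links in the manifest.
    import re
    import urllib.request
    from urllib.parse import urljoin

    def check_llms_manifest(site_root: str) -> dict:
        manifest_url = urljoin(site_root, "/llms.txt")
        with urllib.request.urlopen(manifest_url, timeout=10) as resp:
            manifest = resp.read().decode("utf-8", errors="replace")

        # Pull out every markdown link target: [title](https://...)
        listed_urls = re.findall(r"\[[^\]]*\]\((https?://[^)\s]+)\)", manifest)

        results = {}
        for url in listed_urls:
            try:
                with urllib.request.urlopen(url, timeout=10) as page:
                    # A listed URL should at least be publicly reachable;
                    # errors here are a red flag worth deeper review.
                    results[url] = page.status
            except Exception as err:
                results[url] = f"error: {err}"
        return results

    # Example with a hypothetical domain:
    # check_llms_manifest("https://www.example.com")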

Until those elements are in place, platforms will treat llms.txt as optional at best or irrelevant at worst. So maybe you get a small benefit? Or maybe not…

The Real Value Today

For site owners, llms.txt still may have some value, but not as a guaranteed path to traffic or “AI ranking.” It can function as a content alignment tool, guiding internal teams to identify priority URLs you want AI systems to see. For documentation-heavy sites, internal agent systems, or partner tools that you control, it may make sense to publish a manifest and experiment.

However, if your goal is to influence large public LLM-powered results (such as those by Google, OpenAI, or Perplexity), you should tread cautiously. There is no public evidence those systems honor llms.txt yet. In other words: Treat llms.txt as a "mirror" of your content strategy, not a "magnet" pulling traffic. Of course, this means building the file(s) and maintaining them, so factor in the added work versus whatever return you believe you will receive.

Closing Thoughts

The web keeps trying to teach machines about itself. Each generation invents a new format, a new way to declare “here’s what matters.” And each time the same question decides its fate: “Can this signal be trusted?” With llms.txt, the idea is sound, but the trust mechanisms aren’t yet baked in. Until verification, governance, and empirical proof arrive, llms.txt will reside in the grey zone between promise and problem.

This post was originally published on Duane Forrester Decodes.


Featured Image: Roman Samborskyi/Shutterstock

Secrets Of A Wildly Successful Website via @sejournal, @martinibuster

Back in 2005, I intuited that there were wildly successful Internet enterprises that owed nothing to SEO. These successes intrigued me because they happened according to undocumented rules, outside the SEO bubble. These sites have stories and lessons about building success.

Turning Your Enthusiasm Into Success

In 2005, I interviewed the founder of the Church Of The Flying Spaghetti Monster, which at the time had a massive PageRank score of 7. The founder explained how promotion was never part of a plan; in fact, he denied having any success plan at all. He simply put the visual material out there and let people hotlink the heck out of it, at a rate of 40 GB per day.

The site is controversial because it was created in response to an idea called Intelligent Design, an ideology which holds that aspects of the universe and life are the products of an unseen intelligent hand rather than of undirected processes like evolution and natural selection. This article is not about religion; it's about how someone leveraged their passion to create a wildly successful website.

The point is, there was no direct benefit to hotlinking, only the indirect benefits of putting his name out there and having it seen, known, and remembered. It's the essence of what we talk about when we talk about brand and mindshare building, which is why I say this interview is wildly relevant in 2013. Many of my most innovative methods for obtaining links come from a mindset of identifying latent opportunities related to indirect benefits. There is a lot of opportunity there because most of the industry is focused on the direct-benefits/ROI mindset. Without further ado, here is the interview. Enjoy!

Secrets Of A Wildly Popular Website

The other day I stumbled across a successful website called Church of the Flying Spaghetti Monster that does about 40 GB of traffic (including hotlinks) every single day. The site was created as a response to a social, cultural, political, and religious issue of the day.

Many of you are interested in developing strategies for creating massively popular sites, so the following story of this hyper-successful website (PR 7, in case you were wondering) may be of interest.

Creating a website to react to controversy or a current event is an old but maybe forgotten method for receiving links. Blogs fit into this plan very nicely. The following is the anatomy of a website created purely for the passion of it. It was not created for links or monetary benefit. Nevertheless, it has accomplished what thousands of link-hungry, money-grubbing webmasters aspire to every day. Ha!

So let’s take a peek behind the scenes of a wildly successful site that also makes decent change. The following is an interview with Bobby Henderson, the man behind the site.

Can you give me a little history of the Church of the Flying Spaghetti Monster website?

"The site was never planned. 'The letter' had been written and sent off – with no reply – for months before it occurred to me to post it online."

Have you ever built a website before, what is your web background?

“I made a website for the Roseburg, Oregon school district when I was in high school.

With the Flying Spaghetti Monster (FSM) site, I want things to be as plain and non-shiny as possible. Screw aesthetics. I don't want it to look slick and well-designed at all. I prefer it to be just slapped together, with new content added frequently. I love it when people give me tips to make the site better. It's received well over 100 million hits at this point, so maybe there's something to this content-instead-of-shiny-ness thing."

What made you decide to build your website?

"The idea of a Flying Spaghetti Monster was completely random. I wrote the letter at about 3am one night, for no particular reason other than I couldn't sleep. And there must have been something in the news about ID that day.

After posting the letter online, it was 'discovered' almost immediately. It got boingboing'ed within a couple weeks, and blew up from there. I've done zero 'promotion.' Promotion is fake. None of the site was planned; it has evolved over the months. Same with the whoring-out, the t-shirts, etc. None of that stuff was my idea. People asked for it, so I put it up. I can remember telling a friend that I would be shocked if one person bought a t-shirt. Now there have been around 20k sold."

To what do you attribute the support of your site from so many people?

“I believe the support for the FSM project comes from spite…

"I get 100-200 emails a day. Depends on the news, though. I got maybe 300 emails about that 'pirate' attack on the cruise ship. Incidentally, the reason we saw no change in global weather was because they were not real pirates. Real pirates don't have machine guns and speedboats. (Editor's note: The FSM dogma asserts a connection between pirates and global warming.)"

Were you surprised at how the site took off?

"Yes, of course I'm surprised the site took off. And it blows my mind that it's still alive. Yesterday was the highest-traffic day yet, with 3.5 million hits (most of those hits were hotlinked images)."

What advice do you have to others who have a site they want to promote?

"Advice… ok… here's something. A lot of people go out of their way to stop hotlinking. I go out of my way to allow it – going so far as paying for the extra bandwidth to let people steal my stuff. Why? It's all part of the propaganda machine. It would be easy enough to prevent people from hotlinking FSM images. But I WANT people to see my propaganda, so why not allow it?

It’s like advertising, requiring zero effort by me. I am paying for about 40GB in bandwidth every day in just hijacked images – and it’s totally worth it, because now the Flying Spaghetti Monster is everywhere.”

Seeing how your deity is a flying spaghetti monster, I am curious… do you like eating spaghetti?

“No comment.”

Featured Image by Shutterstock/Elnur