Archive

Content

Shorter, Focused Content Wins In ChatGPT via @sejournal, @Kevin_Indig

For years, SEOs have operated on a simple assumption: The more ground your content covers, the more likely it is to surface in AI-generated answers. In fact, every “best practice” in classic SEO content pushes you toward more: more subtopics, more sections, more words. Build the “ultimate guide.”
An analysis of 815,000 query-page pairs across 16,851 queries and 353,799 pages says otherwise:

Fan-out coverage is nearly irrelevant to citation rates.
Two signals actually predict whether ChatGPT cites your page.
Six concrete changes to your existing content library help.

1. The Study
AirOps ran 16,851 queries through ChatGPT three times each through the UI, capturing every fan-out sub-query, every URL searched, every citation made, and every page scraped. Oshen Davidson built the pipeline. I analyzed the data.
Each query generates an average of two fan-out queries. ChatGPT retrieves roughly 10 URLs per sub-search, reads through them, then selects which ones to cite. We scored how well each page’s H2-H4 subheadings matched those fan-out queries using cosine similarity on bge-base-en-v1.5 embeddings. That score is what we call fan-out coverage: the share of subtopics a page addresses at a 0.80 similarity threshold. (The 0.80 cutoff decides whether a subheading counts as a match to a fan-out query. Think of it as a relevance bar.)
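To make the fan-out coverage definition concrete, here is a minimal sketch of one way to compute it. It is not the study's pipeline: the function names are hypothetical, the toy 2-D vectors stand in for real bge-base-en-v1.5 embeddings, and it assumes a fan-out query counts as covered when at least one subheading reaches the 0.80 threshold.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def fan_out_coverage(heading_vecs, fanout_vecs, threshold=0.80):
    """Share of fan-out queries matched by at least one heading."""
    if not fanout_vecs:
        return 0.0
    matched = sum(
        1
        for query_vec in fanout_vecs
        if any(cosine(h, query_vec) >= threshold for h in heading_vecs)
    )
    return matched / len(fanout_vecs)

# Toy vectors standing in for real embeddings (hypothetical values).
headings = [[1.0, 0.0], [0.0, 1.0]]
fanouts = [[0.9, 0.1], [-1.0, 0.0]]

# The first fan-out query is close to the first heading; the second matches nothing.
coverage = fan_out_coverage(headings, fanouts)
print(coverage)
```

In a real run, the vectors would come from an embedding model applied to the subheading and fan-out query text.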
The question: Do pages with higher fan-out coverage get cited more?
You’ll find even more information in the co-written AirOps report.
2. Density Barely Moves The Needle
Across 815,484 rows, the relationship between fan-out coverage and citation is weak.
Covering 100% of subtopics adds 4.6 percentage points over covering none. That gap shrinks further when you control for query match (how well the page’s best heading matches the original query). Among pages with strong query match (>= 0.80 cosine similarity):
Image Credit: Kevin Indig
Moderate coverage (26-50%) outperforms exhaustive coverage. Pages that cover everything score lower than pages that cover a quarter of the subtopics. The “ultimate guide” strategy produces worse results than a focused article that covers two to three related angles well.
3. What Actually Predicts Citation
These two signals dominate: retrieval rank and query match.
1. Retrieval rank is the strongest predictor by a wide margin. A page at position 0 in ChatGPT’s web search results (the first URL returned by its search tool) has a 58% citation rate. By position 10, that drops to 14%. We ran each prompt three times consecutively for this analysis, and pages cited in all three runs have a median retrieval rank of 2.5. Pages never cited: median rank 13.
Image Credit: Kevin Indig
2. Query match (cosine similarity between the query and the page’s best heading) is the strongest content signal. Pages with a 0.90+ heading match have a 41% citation rate compared to the 30% rate for pages below 0.50. Even among top-ranked pages (position 0-2), higher query match adds 19 percentage points.
Fan-out coverage, word count, heading count, domain authority: all secondary. Some are flat. Some are inversely correlated.
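Citation rates by retrieval position, like the ones quoted above, come down to a grouped average. The sketch below shows that calculation on a handful of hypothetical rows; the data is made up for illustration and is not the study's dataset, and the function name is an assumption.

```python
from collections import defaultdict

def citation_rate_by_position(rows):
    """Map each retrieval position to the share of rows that were cited."""
    totals = defaultdict(int)
    cited = defaultdict(int)
    for position, was_cited in rows:
        totals[position] += 1
        if was_cited:
            cited[position] += 1
    return {pos: cited[pos] / totals[pos] for pos in sorted(totals)}

# Hypothetical (retrieval position, cited?) rows, not the study's data.
rows = [
    (0, True), (0, True), (0, False), (0, True),
    (10, False), (10, True), (10, False), (10, False),
]

rates = citation_rate_by_position(rows)
print(rates)
```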
4. The Wikipedia Exception
One site type breaks the pattern. Wikipedia has the worst retrieval rank in the dataset (median 24) and the lowest query match score (0.576). It still achieves the highest citation rate: 59%.
Wikipedia pages average 4,383 words, 31 lists, and 6.6 tables. They are encyclopedic in the literal sense. ChatGPT cites Wikipedia from deep in the search results where every other site type gets ignored.
This is density working as a signal, but at a scale no publisher can replicate. Wikipedia’s content is exhaustive, richly structured, and cross-linked across millions of topics. A 3,000-word corporate blog post with 15 subheadings is not the same thing.
5. The Bimodal Reality
58% of pages retrieved by ChatGPT in this dataset are never cited. 25% are always cited when they appear. Only 17% fall in between.
The always-cited and never-cited groups look nearly identical on most content metrics: similar word counts (~2,200), similar heading counts (~20), similar readability scores (~12 FK grade), similar domain authority (~54). The on-page signals we can measure do not separate winners from losers.
What separates them is retrieval rank. Always-cited pages rank near the top when they surface. Never-cited pages rank in the bottom half. The retrieval system, whatever signals it uses internally, is the gatekeeper. Everything else is a tiebreaker.
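The always, never, and mixed split can be reproduced on your own tracking data with a small classifier. This is a sketch under assumptions: the input format (one list of cited or not-cited flags per URL, one flag per appearance in retrieval) and the function names are hypothetical, and the example URLs are placeholders.

```python
def classify_page(cited_flags):
    """Label a page by how consistently it was cited across appearances."""
    if not cited_flags:
        raise ValueError("page never appeared in retrieval results")
    if all(cited_flags):
        return "always"
    if not any(cited_flags):
        return "never"
    return "mixed"

def bucket_shares(pages):
    """Share of pages in each bucket, given {url: [cited?, ...]}."""
    counts = {"always": 0, "never": 0, "mixed": 0}
    for flags in pages.values():
        counts[classify_page(flags)] += 1
    total = len(pages)
    return {label: n / total for label, n in counts.items()}

# Placeholder URLs and flags for illustration only.
pages = {
    "https://example.com/a": [True, True, True],
    "https://example.com/b": [False, False],
    "https://example.com/c": [True, False, True],
    "https://example.com/d": [False],
}

shares = bucket_shares(pages)
print(shares)
```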
6. What This Means For Your Content
Conventional SEO content writing wisdom says cover more subtopics, add more sections, build density. The data says the conventional approach produces “mixed” pages, the 17% in the middle that get cited sometimes and ignored other times.
Mixed pages have the highest word counts, the most headings, and the highest domain authority in the dataset. They are the “ultimate guides.” They are also the least reliable performers in ChatGPT.
The pages that win consistently are focused. They:

Match the query directly in their headings,
Tend to be shorter (the citation sweet spot is 500-2,000 words), and
Have enough structure (7-20 subheadings) to organize the content without diluting it.

Build the page that is the best answer to one question. Not the page that adequately answers 20.
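If you want to screen an existing content library against the ranges above, a rough check could look like the following. The function name and the sample numbers are hypothetical, and the thresholds are simply the 500-2,000 word and 7-20 subheading ranges quoted in this section, not hard rules.

```python
def audit_page(word_count, subheading_count):
    """Return notes for each metric that falls outside the quoted ranges."""
    notes = []
    if not 500 <= word_count <= 2000:
        notes.append(f"word count {word_count} is outside 500-2,000")
    if not 7 <= subheading_count <= 20:
        notes.append(f"subheading count {subheading_count} is outside 7-20")
    return notes

# Hypothetical pages: one focused article and one sprawling guide.
focused = audit_page(word_count=1400, subheading_count=9)
sprawling = audit_page(word_count=4300, subheading_count=31)

print(focused)
print(sprawling)
```

A page that passes this check still needs a heading that matches the query directly; length and structure alone do not predict citation.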

Featured Image: Tero Vesalainen/Shutterstock; Paulo Bobita/Search Engine Journal

Read More »
News

Google Lists 9 Scenarios That Explain How It Picks Canonical URLs via @sejournal, @martinibuster

Google’s John Mueller answered a question on Reddit about why Google picks one web page over another when multiple pages have duplicate content, also explaining why Google sometimes appears to pick the wrong URL as the canonical.
Canonical URLs
The word canonical was previously mostly used in the religious sense to describe what writings or beliefs were recognized to be authoritative. In the SEO community, the word is used to refer to which URL is the true web page when multiple web pages share the same or similar content.
Google enables site owners and SEOs to provide a hint of which URL is the canonical with the use of an HTML attribute called rel=canonical. SEOs often refer to rel=canonical as an HTML element, but it’s not. Rel=canonical is an attribute of the <link> element. An HTML element is a building block for a web page. An attribute is markup that modifies the element.
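To see the element and attribute distinction in practice, the sketch below uses Python's standard html.parser to pull the canonical hint out of a page. The markup and the CanonicalFinder class are hypothetical examples; the point is that the parser looks for a link element and then inspects its rel attribute.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collect href values from <link> elements whose rel includes 'canonical'."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr_map.get("href"):
            self.canonicals.append(attr_map["href"])

# Hypothetical page markup for illustration.
html_doc = """
<html><head>
<title>Red pandas</title>
<link rel="canonical" href="https://example.com/red-panda">
<link rel="stylesheet" href="/style.css">
</head><body></body></html>
"""

finder = CanonicalFinder()
finder.feed(html_doc)
canonical_urls = finder.canonicals
print(canonical_urls)
```

A page that declares more than one canonical would show up here as a list with several entries, which is worth checking when canonical choices look odd.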
Why Google Picks One URL Over Another
A person on Reddit asked Mueller to provide a deeper dive on the reasons why Google picks one URL over another.
They asked:
“Hey John, can I please ask you to go a little deeper on this? Let’s say I want to understand why Google thinks two pages are duplicate and it chooses one over the other and the reason is not really in plain sight. What can one do to better understand why a page is chosen over another if they cover different topics? Like, IDK, red panda and “regular” panda 🐼. TY!!”
Mueller answered with about nine different reasons why Google chooses one page over another, including the technical reasons why Google appears to get it wrong when in reality it’s sometimes due to something that the site owner or SEO overlooked.
Here are the nine reasons he cited for canonical choices:

Exact duplicate content: The pages are fully identical, leaving no meaningful signal to distinguish one URL from another.
Substantial duplication in main content: A large portion of the primary content overlaps across pages, such as the same article appearing in multiple places.
Too little unique main content relative to template content: The page’s unique content is minimal, so repeated elements like navigation, menus, or layout dominate and make pages appear effectively the same.
URL parameter patterns inferred as duplicates: When multiple parameterized URLs are known to return the same content, Google may generalize that pattern and treat similar parameter variations as duplicates.
Mobile version used for comparison: Google may evaluate the mobile version instead of the desktop version, which can lead to duplication assessments that differ from what is manually checked.
Googlebot-visible version used for evaluation: Canonical decisions are based on what Googlebot actually receives, not necessarily what users see.
Serving Googlebot alternate or non-content pages: If Googlebot is shown bot challenges, pseudo-error pages, or other generic responses, those may match previously seen content and be treated as duplicates.
Failure to render JavaScript content: When Google cannot render the page, it may rely on the base HTML shell, which can be identical across pages and trigger duplication.
Ambiguity or misclassification in the system: In some cases, a URL may be treated as duplicate simply because it appears “misplaced” or due to limitations in how the system interprets similarity.

Here’s Mueller’s complete answer:
“There is no tool that tells you why something was considered duplicate – over the years people often get a feel for it, but it’s not always obvious. Matt’s video “How does Google handle duplicate content?” is a good starter, even now.
Some of the reasons why things are considered duplicate are (these have all been mentioned in various places – duplicate content about duplicate content if you will :-)): exact duplicate (everything is duplicate), partial match (a large part is duplicate, for example, when you have the same post on two blogs; sometimes there’s also just not a lot of content to go on, for example if you have a giant menu and a tiny blog post), or – this is harder – when the URL looks like it would be duplicate based on the duplicates found elsewhere on the site (for example, if /page?tmp=1234 and /page?tmp=3458 are the same, probably /page?tmp=9339 is too — this can be tricky & end up wrong with multiple parameters, is /page?tmp=1234&city=detroit the same too? how about /page?tmp=2123&city=chicago ?).
Two reasons I’ve seen people get thrown off are: we use the mobile version (people generally check on desktop), and we use the version Googlebot sees (and if you show Googlebot a bot-challenge or some other pseudo-error-page, chances are we’ve seen that before and might consider it a duplicate). Also, we use the rendered version – but this means we need to be able to render your page if it’s using a JS framework for the content (if we can’t render it, we might take the bootstrap HTML page and, chances are it’ll be duplicate).
It happens that these systems aren’t perfect in picking duplicate content, sometimes it’s also just that the alternative URL feels obviously misplaced. Sometimes that settles down over time (as our systems recognize that things are really different), sometimes it doesn’t.
If it’s similar content then users can still find their way to it, so it’s generally not that terrible. It’s pretty rare that we end up escalating a wrong duplicate – over the years the teams have done a fantastic job with these systems; most of the weird ones are unproblematic, often it’s just some weird error page that’s hard to spot.”
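The URL parameter example in Mueller's answer can be explored on your own site with a small grouping script. The sketch below is not how Google's systems work; it only shows which URLs collapse together if a given parameter is treated as irrelevant. The function names are hypothetical, and the URLs are the ones from the quote.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit

def normalize(url, ignored_params):
    """Drop the ignored query parameters and sort the remaining ones."""
    parts = urlsplit(url)
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query)
        if key not in ignored_params
    )
    query = urlencode(kept)
    return parts.path + ("?" + query if query else "")

def group_urls(urls, ignored_params):
    """Group URLs that become identical once ignored parameters are removed."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url, ignored_params)].append(url)
    return dict(groups)

urls = [
    "/page?tmp=1234",
    "/page?tmp=3458",
    "/page?tmp=9339",
    "/page?tmp=1234&city=detroit",
    "/page?tmp=2123&city=chicago",
]

# Assumption for illustration: "tmp" never changes the content.
groups = group_urls(urls, ignored_params={"tmp"})
for key, members in groups.items():
    print(key, members)
```

With only tmp ignored, the city URLs stay separate. Adding city to ignored_params would fold all five URLs into one group, which is the kind of over-generalization Mueller describes as tricky with multiple parameters.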
Takeaway
Mueller offered a deep dive into the reasons why Google chooses canonicals. He described the process of choosing canonicals as a kind of fuzzy sorting system built from overlapping signals, with Google comparing content, URL patterns, rendered output, and crawler-visible versions, while borderline classifications (“weird ones”) are given a pass because they don’t pose a problem.
Featured Image by Shutterstock/Garun .Prdt

Read More »
Affiliate marketing

How To Break Through An Affiliate Site Plateau & Find New Growth – Ask An SEO via @sejournal, @rollerblader

This week’s Ask an SEO question is:
“I’ve been running an affiliate site for 2 years but hit a plateau. What advanced data analysis techniques can help me identify new growth opportunities that I might be missing?”
This is one of my favorite questions that come up at conferences and in the affiliate marketing programs we manage. Most of the time, the affiliate submits their site or niche, and I can give direct examples and opportunities. But for this, we want to keep everything anonymous, so I’ll share the processes and ideas so you and anyone else reading can implement, no matter what industry, type of content, etc., you produce.
Breaking The Plateaus
There are a few plateaus affiliates face more than others, including:

Traffic stagnation.
New products and services for recommendation.
Revenue flat lines.
Topics to talk about.

These are the most common with this question, so I’ll focus on them. If anyone reading this has hit a different one and is looking for ways to overcome them, send the question through my author bio page here. If I’ve worked through it, I’ll do my best to answer it in an upcoming column.
Traffic Stagnation
If you have a website and traffic has stagnated because you dominate for all the main queries and topics, look outside your own writing and knowledge base for help. Instead of hiring writers to help with more content based on what exists within your platform, try funneling new visitors in from other platforms (websites, podcasts, apps, etc.) or bringing people in to create unique content for you by featuring them and asking them to promote it.
To find new topics, ideas, and questions people have, consider adding a forum or community. It can bring both new traffic and new ideas to your website. Some search engines like Google tend to reward this authentic user-generated content, but it does come with a decent amount of manual labor for monitoring and quality control. The benefit here is you build a community that creates content for you.
Pro-tip: Add a prompt on the main website pages like “question not answered, click here to ask the community,” where it goes to the forum, or have it go to an answer box where you collect it and create a new guide. Similar to how Search Engine Journal has a “Submit Questions” section for me and other “Ask an” columnists.
The UGC can begin showing up in Google as well as LLMs like ChatGPT, Perplexity, and Claude, and you can begin getting new traffic in and a new user base. This can all be monetized. But maybe you don’t want the hassle and risk of a UGC platform; there are more options.
Take your top guides and articles and begin turning them into videos. A long-form video can help with YouTube and bringing in traffic; it can be uploaded to Skool if you create a course. Skool and other platforms let you charge a fee for access, and each chapter in the video can become a long-form video or a short that works for YouTube, TikTok, and likely Instagram. With the exception of the shorts, affiliate links can be used on all of these platforms. The benefit of videos is a lot of the platforms like YouTube can be steady streams of traffic vs. IG or TikTok where it only lasts for a couple days to a week.
Now begin adding text versions to social media platforms in ways that fit. LinkedIn allows long-form and encourages users to ask questions, answer polls, and then you can link to your website. Bluesky and X are short-form but allow quick and easy links to your website or pages, although the traffic is in short bursts. Pinterest is short form, but image-heavy, and a pin that is done well and gets attention can be consistent traffic for a year and sometimes longer.
Some partners decide they want to start podcasting. Every topic on your website can become a theme or session, or combined into a really strong one that becomes a course you can monetize. Find other people with complementary knowledge and/or who have audiences and invite them to participate. You’ll be helping to grow each other’s traffic and sharing expertise. Sometimes your guest may spur new content ideas for you, too.
New Products And Services And Fixing Revenue Flatlines
When you run out of products and services to promote, or you’re hitting the highest AOVs available, revenue begins to flatline. While you cannot control what happens on the merchant or lead gen website’s funnel, you can control how you make money. This is an affiliate post, so I won’t talk about driving higher EPCs and CPMs or getting pageviews to increase ad media. Instead, it is about using affiliate links and offers.
Here’s where to begin looking.
Survey Your Audience Or Use Your Analytics For Demographics
Having your audience’s demographics, including age, urban/rural/suburban, likes and interests, and anything else, can make you a ton of money. If it turns out the majority have dogs and are urban, but you run a cooking site, add in pet-friendly matching recipes or toys for dogs that get them exercise and stimulation when they cannot be outside more regularly to burn energy.
If the same demographics are local-based, like a group for parents in New England, create snow day resources where you review family-friendly tabletop games for snow days, lists of local restaurants across the area that offer kids eat free or family deals, and affordable snowbird family vacations.
If your audience has a large portion in rural areas, think about the ingredients that are hard to come by in rural areas due to smaller grocery stores, then share online resources to access them. This is a low-hanging fruit item I see as recipe sites will focus on the tools and products, but they can also monetize ingredients.
Learn What Else They’re Into
Once you know who your audience is and where they skew demographically, survey them to find their interests. If you can’t get them to take surveys, even with incentives like gift cards or prizes via a drawing (assuming it’s legal where you and your audience live), look up free research documents and use your marketing skills to find hobbies, stores, and associations that have similar audiences.
Maybe your audience is 50-year-old suburbanites that love bird watching. You’ve already maxed out sportswear and hiking equipment, same with books on birds and binoculars. Maybe it turns out they’re also into photography, so you can sell cameras, photo storage solutions, ways to print the photos and sell them, editing software, and guides to using the camera and setting up different types of shots.
It could also turn out they love to travel. Create guides of where to go that are friendly for people 50-60 years old, including the types of birds that they could see in each spot along the route, and what to pack based on the season, as weather can change. You can now use affiliate links for hotels and airfare, travel supplies, camera bags for different climates, and ebooks or physical books with trail maps, travel guides, and bird watching books to check off the ones they see.
You do need to watch adding too much content that is not the core topic of your channel, so you don’t accidentally uncategorize your platforms for SEO or alienate your core reader base. When you go off topic too often, you chase away current and new subscribers while also confusing algorithms. This is easy to resolve with tech SEO by using meta robots or robots.txt, and having an editorial calendar, but that is a different topic.
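As one possible illustration of the robots.txt route, the sketch below uses Python's standard urllib.robotparser to confirm which URLs a rule set would block from crawling. The /off-topic/ directory, the example URLs, and the rules themselves are hypothetical; how you section off-topic content is a site-specific decision.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules that keep crawlers out of an off-topic section.
rules = """
User-agent: *
Disallow: /off-topic/
""".strip().splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Check one URL inside the blocked section and one core-topic URL.
blocked = not parser.can_fetch("*", "https://example.com/off-topic/travel-deals")
allowed = parser.can_fetch("*", "https://example.com/recipes/chili")

print(blocked, allowed)
```

Note that robots.txt controls crawling, while a meta robots tag is set per page in the HTML, so the two are checked differently.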
Now you have new products and services to promote and new merchants to work with, and this leads to more affiliate sales, increasing your revenue. Think shopping guides, comparison grids, listicles, etc.
New Topics To Talk About
Above, I mentioned podcast guests, UGC, and a few ways that can spark new ideas for topics when you run out of things to talk about. So here are a few other ways I break writer’s block with the programs I work on for myself, for clients, and the affiliates in our programs.

AlsoAsked.com: You plug in a topic like “running shoes,” and it spits out a ton of potential questions about them. From there, I go to Google or an LLM and type it in, then I look to see what shows up. To go a step further, I may ask, “What are similar questions to this one?” or “What are complementary but different questions to this one?” as a second query to see what I may be missing.
Rank trackers: Take a URL for a blog or forum and plug it into a rank tracking tool. It’ll provide a list of keywords, questions, and phrases it shows up for.
Comments: Read the comments on YouTube videos for channels that are directly related to your business. These are things people want to know about and can be a way to get new traffic while breaking writer’s block.
AI and LLMs: Ask AI for a list of ideas that are related but not covered on your platform yet, and then have it double-verify. Not everything it recommends will be relevant, but it could spark ideas for you.

There are almost always solutions to preventing stagnation for affiliates, no matter if it is traffic, revenue, topics, or products and services to promote. You may need to expand your offerings to other types of products and services that match the same demographics or look to other platforms and competitors for content inspiration. I hope this helps, and thank you for asking.

Featured Image: Paulo Bobita/Search Engine Journal

Read More »
Software releases and updates

Three new tasks, better navigation, and a bug fix in the Yoast SEO Task List 

We launched the Yoast SEO Task List in December to give you a clear, actionable to-do list for your site’s SEO. In this update, we’ve added three new tasks, improved how you navigate to fixes, and resolved a bug that was showing tasks in the wrong language. 

A quick recap: what does the Task List do? 

The Task List scans your site and surfaces specific content that needs attention, ranked by priority with an estimated time to fix. Instead of guessing what to work on next, you click a task and Yoast takes you directly to the right place to make the improvement. Think of it as a personal SEO assistant that knows your site. 

What’s new in this update 

New task: improve your meta descriptions 

Meta descriptions are the short snippets that appear under your page title in Google search results. They don’t directly affect rankings; however, they have a significant impact on whether someone clicks your link. The Task List will now flag recent posts where the meta description is missing or could be stronger, and point you to where you can fix it. Premium users can use the AI Generate button to write one in seconds. 

New task: delete your sample page 

Every new WordPress site comes with a default “Sample Page” that most people never delete. It adds no value and can create unnecessary noise for search engines. The Task List will now remind you to remove it if it’s still there. It’s a two-minute job that’s easy to overlook. 

New task: set social sharing images  

Available with Yoast SEO Premium, Yoast WooCommerce SEO, and Yoast SEO AI+

When someone shares your content on Facebook or X, the image that appears alongside it can make a real difference to whether people click. The Task List will now remind you to set a custom social sharing image for your posts and pages, so your content looks its best every time it gets shared. 

Go directly to the right place in the editor 

Previously, clicking a task would open the post editor and leave you to find the right section yourself. Now, Yoast takes you to the exact part of the editor you need: the SEO tab, the readability panel, or the meta description field. Less scrolling, faster fixing. 

Bug fix: tasks now appear in your language 

We fixed a bug where task descriptions were showing up in the site’s language rather than the logged-in user’s language. If you manage a multilingual site, or your personal language settings differ from your site’s default, tasks will now display correctly for you. 

Also in this release 

We’ve added a new Yoast tab to the WordPress Plugins screen that groups all your installed Yoast plugins in one place. This requires WordPress 7.0+. 

We fixed a bug where alt text changes made via the inline image editor in How-to and FAQ blocks weren’t saving correctly to the frontend. Thanks to @param-chandarana for the report. 

What’s coming next 

We’re continuing to expand the Task List with improvements that surface high-impact changes specific to your content. Users of paid plans will see additional tasks in upcoming releases.

Update to Yoast SEO 27.4 to get these improvements automatically, or download the latest version from the WordPress plugin directory. 

Beth Parker

Beth is Product Marketing Manager at Yoast. Before joining the company, she honed her digital marketing and project management skills in various in-house and agency environments.

Read More »