For years, SEOs have operated on a simple assumption: The more ground your content covers, the more likely it is to surface in AI-generated answers. In fact, every “best practice” in classic SEO content pushes you toward more: more subtopics, more sections, more words. Build the “ultimate guide.”
An analysis of 815,000 query-page pairs across 16,851 queries and 353,799 pages says otherwise:
Fan-out coverage is nearly irrelevant to citation rates.
Two signals actually predict whether ChatGPT cites your page.
Six concrete changes to your existing content library help.
1. The Study
AirOps ran 16,851 queries through the ChatGPT UI three times each, capturing every fan-out sub-query, every URL searched, every citation made, and every page scraped. Oshen Davidson built the pipeline. I analyzed the data.
Each query generates an average of two fan-out queries. ChatGPT retrieves roughly 10 URLs per sub-search, reads through them, then selects which ones to cite. We scored how well each page’s H2-H4 subheadings matched those fan-out queries using cosine similarity on bge-base-en-v1.5 embeddings. That score is what we call fan-out coverage: the share of subtopics a page addresses at a 0.80 similarity threshold. (The 0.80 cutoff decides whether a subheading counts as a match to a fan-out query. Think of it as a relevance bar.)
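To make that scoring concrete, here is a minimal sketch of the coverage calculation using the same embedding model. The helper below illustrates the method as described; it is not the study’s actual pipeline.

```python
# Minimal sketch of the fan-out coverage score described above.
# The model name and 0.80 threshold come from the study; the
# function itself is illustrative, not the authors' pipeline.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("BAAI/bge-base-en-v1.5")

def fanout_coverage(subheadings, fanout_queries, threshold=0.80):
    """Share of fan-out queries matched by at least one H2-H4 subheading."""
    if not fanout_queries or not subheadings:
        return 0.0
    heading_emb = model.encode(subheadings, normalize_embeddings=True)
    query_emb = model.encode(fanout_queries, normalize_embeddings=True)
    sims = util.cos_sim(query_emb, heading_emb)  # queries x headings
    matched = (sims.max(dim=1).values >= threshold).sum().item()
    return matched / len(fanout_queries)
```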
The question: Do pages with higher fan-out coverage get cited more?
You’ll find even more information in the co-written AirOps report.
2. Density Barely Moves The Needle
Across 815,484 rows, the relationship between fan-out coverage and citation is weak.
Covering 100% of subtopics adds 4.6 percentage points over covering none. That gap shrinks further when you control for query match (how well the page’s best heading matches the original query). Among pages with strong query match (>= 0.80 cosine similarity):
Image Credit: Kevin Indig
Moderate coverage (26-50%) outperforms exhaustive coverage. Pages that cover everything score lower than pages that cover a quarter of the subtopics. The “ultimate guide” strategy produces worse results than a focused article that covers two to three related angles well.
3. What Actually Predicts Citation
These two signals dominate: retrieval rank and query match.
1. Retrieval rank is the strongest predictor by a wide margin. A page at position 0 in ChatGPT’s web search results (the first URL returned by its search tool) has a 58% citation rate. By position 10, that drops to 14%. We ran each prompt three times consecutively for this analysis, and pages cited in all three runs have a median retrieval rank of 2.5. Pages never cited: median rank 13.
Image Credit: Kevin Indig
2. Query match (cosine similarity between the query and the page’s best heading) is the strongest content signal. Pages with a 0.90+ heading match have a 41% citation rate compared to the 30% rate for pages below 0.50. Even among top-ranked pages (position 0-2), higher query match adds 19 percentage points.
Fan-out coverage, word count, heading count, domain authority: all secondary. Some are flat. Some are inversely correlated.
4. The Wikipedia Exception
One site type breaks the pattern. Wikipedia has the worst retrieval rank in the dataset (median 24) and the lowest query match score (0.576). It still achieves the highest citation rate: 59%.
Wikipedia pages average 4,383 words, 31 lists, and 6.6 tables. They are encyclopedic in the literal sense. ChatGPT cites Wikipedia from deep in the search results where every other site type gets ignored.
This is density working as a signal, but at a scale no publisher can replicate. Wikipedia’s content is exhaustive, richly structured, and cross-linked across millions of topics. A 3,000-word corporate blog post with 15 subheadings is not the same thing.
5. The Bimodal Reality
58% of pages retrieved by ChatGPT in this dataset are never cited. 25% are always cited when they appear. Only 17% fall in between.
The always-cited and never-cited groups look nearly identical on most content metrics: similar word counts (~2,200), similar heading counts (~20), similar readability scores (~12 FK grade), similar domain authority (~54). The on-page signals we can measure do not separate winners from losers.
What separates them is retrieval rank. Always-cited pages rank near the top when they surface. Never-cited pages rank in the bottom half. The retrieval system, whatever signals it uses internally, is the gatekeeper. Everything else is a tiebreaker.
6. What This Means For Your Content
Conventional SEO content writing wisdom says cover more subtopics, add more sections, build density. The data says the conventional approach produces “mixed” pages, the 17% in the middle that get cited sometimes and ignored other times.
Mixed pages have the highest word counts, the most headings, and the highest domain authority in the dataset. They are the “ultimate guides.” They are also the least reliable performers in ChatGPT.
The pages that win consistently are focused. They:
Match the query directly in their headings,
Tend to be shorter (the citation sweet spot is 500-2,000 words), and
Have enough structure (7-20 subheadings) to organize the content without diluting it.
Build the page that is the best answer to one question. Not the page that adequately answers 20.
Featured Image: Tero Vesalainen/Shutterstock; Paulo Bobita/Search Engine Journal
Fair to say the majority of evergreen content will not drive the value it did five years ago. Hell, even one or two years ago. What we have done for the last decade will not be as profitable.
AIOs have eroded clicks. Answer engines have given people options. And to be fair, people are bored of the 2,000+ word article answering “What time does X start?” Or recipes where the ingredient list is hidden below 1,500 words about why daddy didn’t like me.
So, you’ve got to be smart. This has to be framed as a commercial decision. Content needs to drive real business value. You’ve got to be confident in it delivering.
That doesn’t mean every article, video, or podcast has to drive a subscription or direct conversion. But it needs to play a clear part in the user’s journey. You need to be able to argue for its inclusion:
Is it a jumping-off point?
Will it drive a registration?
Or a free subscription, a save, or a follow on social?
More commonly known as micro-conversions, these things really matter when it comes to cultivating and retaining an audience. People don’t want more bland, banal nonsense. They want something better.
The antithesis of AI slop will help your business be profitable.
Inherently, nothing. It’s a foundational part of the content pyramid.
In most cases, it’s been done to death, and AI is very effective at summarizing a lot of this bread-and-butter content.
The epitome of quantity over quality; it worked and made a fortune.
But I digress.
An authoritative enough site has been able to drive clicks and follow-up value with sub-par content for decades. That is slowly diminishing. Rightly or wrongly.
And not because of the Helpful Content stuff. Google nerfed all the small sites long before the goliaths. Now they’ve gone after the big fish.
We have to make commercial decisions that help businesses make the right choice. Concepts like E-E-A-T have had an impact on the quality of content (a good thing). They’ve also had an impact on the cost of creating quality content.
Working with experts.
Unique imagery.
Video.
Product and development costs.
Data.
This isn’t cheap. Once upon a time, we could generate value from authorless content full of stock images and no unique value. Unless you’re willing to bend the rules (which isn’t an option for most of us), you need an updated plan.
It depends.
You need to establish how much your content now costs to produce and the value it brings. Not everything is going to drive a significant conversion. That doesn’t mean you shouldn’t do it. It means you need to have a very clear reason for what you’re creating and why.
If particular topics are essential to your audience, service, and/or product, then they should at least be investigated.
One of the joys of creating evergreen content has always been that it adds value throughout the year(s). A couple of annual updates, even relatively light-touch ones, could yield big results.
Commissioning something of quality in this space is likely more expensive. It needs to be worth it; it has to form part of your multi-channel experience to make it so.
Unique data and visuals that can be shared on socials.
Building campaigns around it (or making it part of one).
You can even build authors and your brand around it.
And if it resonates, you can rinse and repeat year after year.
Ahrefs created demand for their brand + an evergreen topic – AIOs (Image Credit: Harry Clarkson-Bennett)
And this type of content or campaign can increase demand for a topic. You can become a thought leader by shifting the tide of public opinion.
For publishers and content creators, that is foundational.
I don’t – on both counts. We should be focused on driving real value for the business.
Something like:
Tier 1: Value – core, revenue, and value-driving conversions.
Tier 2: Registrations (and things that help you build your owned properties), links, shares, and comments.
Tier 3: Page views, returning visits, and engagement metrics.
Micro-conversions over clicks. We’re focusing on registrations, free or lower-value subscriptions. Whatever gets the user into the ecosystem and one step closer to a genuinely valuable conversion.
The messy middle has changed, and it is largely unattributable (Image Credit: Harry Clarkson-Bennett)
Now, could a click be a micro-conversion? If you know that someone who reads a secondary article (by clicking a follow-up link) is 10x more likely to register, that follow-up click could be a sensible micro-conversion.
This type of conversion may not directly drive your bottom line. But it forces you and your team to focus on behaviors that are more likely to lead to a valuable conversion.
That is the point of a micro-conversion. It changes behaviors.
You can tweak the above tiers to better suit your content offering. Not all content is going to drive direct tier one or even two value. You just need to have a very clear idea of its purpose in the customer journey.
If what you’re creating already exists, you’d better make sure you add something extra. You’ve got to force your way into the conversation, and unless you can offer something unique, you’re (almost certainly) wasting your time IMO.
I’ll break all of these down, but I think (in order of importance):
Writing content for people.
Information gain.
Getting it found.
Creating it at the right time.
Structuring it for bots.
Everyone is obsessed with getting cited or being visible in AI.
I think this is completely the wrong way of framing this new era. Getting cited there, or being visible, is a happy byproduct of building a quality brand with an efficient, joined-up approach to marketing.
The more you understand your audience, the more likely you will be to create high-quality, relevant content that gets cited.
If you know your audience really cares about a topic, that’s step one taken care of. If you know where they spend time and how they’re influenced, that’s step two. And if you know how to cut through the noise, that’s step three.
Really, this is an evolution in SEO and the internet at large.
Invest in and create content that will resonate with your audience.
Create a cross-channel marketing strategy that will genuinely reach and influence them.
Share, share, share. Be impactful. Get out there.
Make sure it’s easy to read, share, and consume.
Your content still needs to reach and be remembered by the right people. Do that better than anybody else, and wider visibility will come.
In SEO, we have a different definition of information gain than in more traditional information retrieval. I don’t know if that’s because we’re wrong (probably) or because we have a valid reason…
Maybe someone can enlighten me?
In more traditional machine learning, information gain measures how much uncertainty is reduced after observing new data. That uncertainty is captured by entropy, which is a way of quantifying how unpredictable a variable is based on its probability distribution.
Events with low probability are more surprising and therefore carry more information. High-probability events are less surprising and less novel. Therefore, entropy reflects the overall level of disorder and unpredictability across all possible outcomes.
Information gain, then, tells us how much that unpredictability drops when we split or segment the data. A higher information gain means the data has become more ordered and less uncertain – in other words, we’ve learned something useful.
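For anyone who wants that textbook definition in runnable form, here is a tiny, self-contained sketch; the labels and the split are invented purely for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy: -sum(p * log2(p)) over the label distribution."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(parent, splits):
    """Drop in entropy after segmenting `parent` into `splits`."""
    n = len(parent)
    remaining = sum(len(s) / n * entropy(s) for s in splits)
    return entropy(parent) - remaining

# A 50/50 mix carries 1 bit of entropy; splitting it into two pure
# halves removes all remaining uncertainty, so the gain is 1.0.
parent = ["a"] * 5 + ["b"] * 5
print(information_gain(parent, [["a"] * 5, ["b"] * 5]))  # 1.0
```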
To us in SEO, information gain means the addition of new, relevant information beyond what is already out there in the wider corpus.
A representative workflow of Google’s Contextual estimation of link information gain patent (Image Credit: Harry Clarkson-Bennett)
Google wants to reduce uncertainty. Reduce ambiguity. Content with a higher level of information gain isn’t only different; it elevates a user’s understanding. It raises the bar by answering the question(s) and topic more effectively than anyone else.
So, try something different, novel even, and watch Google test your content higher up in the SERPs to see if it satisfies a user.
This is such an important concept for evergreen content because so many of these queries have well-established answers. If you’re just parroting these answers because your competitors do it, you’re not forcing Google’s hand.
Particularly if you’re still just copying headers and FAQs from the top three results.

Audiences are not arriving at publisher destinations through direct navigation at the same scale. They encounter journalism incidentally, through social feeds, not through habitual site visits.
Younger audiences spend less time on news sites and more time on social every year (Image Credit: Harry Clarkson-Bennett)
You’ve got to meet them there and force their hand.
According to this patent – contextual estimation of link information gain – Google scores documents based on the additional information they offer to a user, considering what the user has already seen.
“Based on the information gain scores of a set of documents, the documents can be provided to the user in a manner that reflects the likely information gain that can be attained by the user if the user were to view the documents.”
Bots, like people, need structure to properly “understand” content.
Elements like headings (h1 – h6), semantic HTML, and linking effectively between articles help search engines (and other forms of information retrieval) understand what content you deem important.
While most humans, even semi-literate ones, genuinely “understand” content, bots don’t. They fake it. They use engagement signals, NLP, and vector space models to map your document against others.
They can only do this effectively if you understand how to structure a page.
Frontloading key information.
Effectively targeting highly relevant queries.
Using structured data formats like lists and tables, where appropriate (these are more cost-effective forms of tokenization; see the rough sketch below).
The more clearly a page communicates its topic, subtopics, and relationships, the more likely it is to be consistently retrieved and reused across search and AI surfaces. This has a compounding effect.
Rank more effectively (great for RAG, obviously) → feature more heavily in versions of the internet → force your way into model training data.
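As a rough illustration of the tokenization point above, you can compare token counts yourself with OpenAI’s tiktoken library. The example strings are invented, and real savings will vary by content.

```python
# Rough illustration only: compact, tabular shapes often tokenize
# more cheaply than the equivalent prose. Strings are made up.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

prose = ("The latex paint we tested was dry to the touch after roughly "
         "one hour, while the oil-based paint needed six to eight hours.")
table = "Paint | Dry to touch\nLatex | ~1 hour\nOil-based | 6-8 hours"

print(len(enc.encode(prose)), len(enc.encode(table)))
```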
If you need to get development work put through, frame it through the lens of assistive technology. Can people with specific needs fully access your pages?
Track and pay very close attention to spikes in demand (the Google Trends API being a very obvious option here; see the sketch after this list).
Make sure you’re adding something of value to the wider corpus.
If quality content is already out there and you have nothing extra to add, consider whether it’s worth spending money on (SEO is not free).
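If you want a quick way to watch those demand spikes programmatically, here is a minimal sketch using pytrends, an unofficial Python client for Google Trends rather than the official API; the keyword, market, and 1.5x spike threshold are all example assumptions.

```python
# Minimal sketch using pytrends (unofficial Google Trends client).
# Keyword, geo, and the 1.5x spike threshold are example choices.
from pytrends.request import TrendReq

pytrends = TrendReq(hl="en-GB", tz=0)
pytrends.build_payload(["holidays to spain"], timeframe="today 5-y", geo="GB")
interest = pytrends.interest_over_time()

# Flag weeks running well above the series' typical level.
baseline = interest["holidays to spain"].median()
spikes = interest[interest["holidays to spain"] > 1.5 * baseline]
print(spikes.head())
```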
Create and update timely evergreen content (Image Credit: Harry Clarkson-Bennett)
While this is primarily for news, you can apply a similar logic to evergreen content if you zoom out and follow macro trends.
Evergreen content still spikes at different times throughout the year. Take Spain as an example. There’s much more limited interest in going to Spain in the winter months from the UK. But January (holiday planning or weekend breaks) and summer (more immediate holidaying with the kids) provide better opportunities to generate traffic.
You’re capturing the spike in demand by updating content at the right time. Particularly if you understand the difference in user needs when this spike in demand happens.
In January, get your holiday planning content ready.
In the summer, get your family-friendly and last-minute holiday content up and running.
Image Credit: Harry Clarkson-Bennett
Demand for evergreen topics can be cyclical. In this example, you would want to capture the spike(s) with carefully planned updates, so you have up-to-date content when a user is really searching for that product, service, or information.
Well, what matters to your brand and your users? Have you asked them?
By the very nature of new and evolving topics and concepts, not everything “evergreen” has been done.
New topics rise. Old ones fall. Some are cyclical.
My rule(s) of thumb would be to establish:
Is the topic foundational to your product and service?
Does your current (and potential) audience demand it?
Do you have something new to add to the wider corpus of information?
If the answer to those three is a broad variation of yes, it’s almost certainly a good bet. Then, I would consider topic search volume, cross-platform demand, and whether the topic is trending up or down in popularity.
There are some things you should be doing “just for SEO.” Content isn’t one of them. You can yell topical authority until you’re blue in the face. If you’re creating stuff just for SEO – kill it.
IMO, these plays have been dead or dying for some time. The modern-day version of the internet (in particular search) demands disambiguation. It demands accuracy. Verification that you are an expert. Otherwise, you’re competing with those who have a level of legitimacy that you do not.
Social profiles, newsletters, real people sharing stories. You’re competing with people who aren’t polishing turds.
If all you’re thinking about is search volume or clicks, I don’t think it’s worth it.
And this brings me nicely onto rented land. Platforms you don’t own.
We’ve spent years creating assets (your websites) to deliver value in search. Owning all of your assets and prioritizing your site above all else. But that is changing. In many cases, people don’t reach your website until they’ve already made a purchasing decision.
I think Rand has managed this transition better than anybody (Image Credit: Harry Clarkson-Bennett)
So, you have to get your stuff out there. Create large, unique studies. Cut them into snippets and short-form videos. Use your individual platform to boost your profile and the content’s chances of soaring.
This is, IMO, particularly pertinent for publishers. You’ve got to get out there. You’ve got to share and reuse your content. To make the most of what you’ve created.
Sweat your assets. Even if senior figures aren’t comfortable with this, you need to make it happen.
People have been extolling how important it is to feature as part of the answer. And that may be true. But you’re going to have to be good at selling your projects in if there’s no clear attribution or value.
It might not have the spikes of news, but evergreen interest still surges at certain times of the year.
Get people – real people – to share it. To have their spin on it.
Outperform the expected early-stage engagement to maximize your chance of appearing in platforms like Discover through wider platform engagement.
You have to work harder than before.
I shared an example of this around a year ago, but to revisit it, I now have 11 recommendations from other Substacks.
You can’t do this alone (Image Credit: Harry Clarkson-Bennett)
They have accounted for over 40% of my total subscribers. Admittedly, mainly from Barry, Shelby, and Jessie. But they are, if I may be so bold, superhumans.
And when our main driver of evergreen traffic to the site (Google) has really leaned into the evil that surrounds big tech, we’ve got to be cannier. We have to find ways to get people to share our content.
Even evergreen content.
If we’re being honest, a lot of SEO content has been rubbish. Churned out muck.
People are still churning out muck at an incredible rate. When what you’ve got is crap, more crap isn’t the answer. I think people are turned off. They’re tuning out of things at an alarming rate, especially young people.
It is all about getting the right people into the system. Evergreen content is still foundational here. You just have to make it work harder. Be more interesting. Be shareable.
Hopefully, this makes decisions over what we should and shouldn’t create easier.
Every few years, the SEO industry discovers a new way to mass-produce content and convinces itself that this time it’ll work. That the sheer volume of pages will overwhelm Google’s ability to assess quality. That if you just publish enough, the numbers will carry you.
It never works. It has never worked. And the people selling you these approaches know it has never worked. They just need it to work long enough to collect the invoice.
The Pattern Has A Name. It’s Called “Not Learning”
Let’s walk through the timeline, because apparently, we need to do this again.
2008-2011: Content Spinning
The pitch was simple: Take one article, run it through software that swaps synonyms, and suddenly you have 50 “unique” articles. The word “unique” was doing a lot of heavy lifting in that sentence. These articles read like someone had fed a dictionary through a blender. But even if the output had been polished, the premise was broken.

Here’s what the content spinners never grasped, and what their successors still don’t: Uniqueness is trivially easy to produce. A monkey dropping its hands on a keyboard produces unique content. The string of characters has never existed before – congratulations, it’s original. The hard part was never uniqueness. It was producing uniqueness that’s worth something. Unique and valuable are not synonyms, and the gap between them is where every scaling strategy falls apart.
Google tolerated it for a while. Its systems simply hadn’t caught up yet. Then Panda arrived in February 2011, hit nearly 12% of all search queries, and content farms watched their traffic evaporate overnight … I was “fortunate” enough to watch it happen in real time. Demand Media, the poster child of the content-farm model, reported a $6.4 million loss the following year.
The lesson was supposed to be clear: You cannot industrialize quality. Volume without substance is a liability with a longer tail than most budgets can absorb.
2015-2022: Programmatic SEO
The pitch evolved. Instead of spinning existing articles, you’d build templates and fill them with structured data. “Best [X] in [City]” pages, generated by the thousand, each one a thin wrapper around a database query. Some of these actually provided value – if the underlying data was good and the template served genuine user needs. Most didn’t. Most were just doorway pages wearing a better outfit. Google spent years refining its ability to detect and demote templated content that existed primarily for indexing purposes rather than for humans.
The lesson was supposed to be reinforced: scale works when there’s substance underneath. Without it, you’re just building a bigger target.
2023-Present: AI-Generated Content At Scale
And here we are again. Same pitch, shinier tools. “We can produce 500 articles a month!” Wonderful. Can you produce 500 articles a month that are worth reading? That contain something a reader couldn’t get from the results already in the index? That demonstrate any form of expertise, experience, or original thought?
No? Then you’re not scaling content. You’re scaling your crawl budget waste.
And the pattern recognition failures are stunning. (This wasn’t subtle. Several of us noticed. No, we weren’t impressed.)
I recently came across an AI visibility tool – one that sells itself on helping you get discovered by AI systems – that had generated hundreds of pages following the pattern “best SEO agencies in {city}.” Déjà vu. Anyone who lived through programmatic SEO recognizes this immediately – it’s the 2017 playbook, except now the copy is written by an LLM. The template got a grammar upgrade and an “it’s AEO” stamp. The strategy didn’t.
Lily Ray flagged a similar case: a resume site with 500+ programmatic pages for “resume examples for {career}.” Every title following the exact same formula. Near-identical page templates. Misused AggregateRating schema. Obvious AI content throughout. Her summary was three words: “Worked until it didn’t.”
Image Credit: Pedro Dias
That phrase should be tattooed on every content scaling pitch deck. Worked until it didn’t. It always does. And then it doesn’t.
The irony of an AI optimization tool using mass-generated doorway pages to build its own visibility would be funny if it weren’t so perfectly on-brand for this industry.
The Qualitative Wall Doesn’t Move
Here’s what every generation of content scalers fails to understand: Google doesn’t evaluate content in isolation. It evaluates content relative to everything else in the index on the same topic.
Publishing 500 AI-generated articles about mortgage rates doesn’t make you an authority on mortgage rates. It makes you the 500th source saying the same thing in slightly different words. And Google already has 499 of those. It doesn’t need yours.
The qualitative wall is this: There is a minimum threshold of genuine value – original insight, lived experience, specific expertise, something the reader cannot get elsewhere – below which no amount of volume helps you. You can publish a million pages below that threshold. You’ll rank for nothing that matters.
And it gets worse. For the people scaling AI content specifically to gain visibility in AI-powered answer systems, the volume strategy doesn’t just fail; it actively backfires. A 2025 paper on retrieval evaluation for LLM-era systems introduces a metric that measures both helpful and distracting passages in retrieval. The finding that matters here: Low-utility content doesn’t sit quietly in the index waiting to be ignored. It can pull retrieval models off-track, degrading the quality of answers those systems produce.

Your 500 thin articles aren’t just invisible. They’re noise. And if your site also has genuinely useful pages buried in that noise, congratulations – you’ve built your own interference pattern. The volume you thought would help discovery is actively drowning the pages that might have earned it.
This isn’t a new insight. It’s the same insight that content spinners ignored in 2010, that programmatic SEO factories ignored in 2018, and that AI content mills are ignoring right now. The tools got better at producing text. The text still has nothing to say.
Google Told You. Repeatedly
Google’s spam policies define scaled content abuse as generating pages “for the primary purpose of search rankings and not helping users.” They explicitly list “using generative AI tools or other similar tools to generate many pages without adding value for users” as an example. This is not subtext. It’s text.
In June 2025, Google began issuing manual actions specifically for scaled content abuse, targeting sites that had been mass-publishing AI-generated content. Sites across the UK, US, and EU received Search Console notifications citing “aggressive spam techniques, such as large-scale content abuse.” Complete visibility drops. Pages didn’t slide down the rankings; they vanished.
The August 2025 spam update continued the enforcement. Subsequent core updates have kept tightening the screws. Each time, the same profile gets hit: high volume, low substance, no editorial oversight.
And each time, the affected site owners acted surprised. As if Google hadn’t been telling them this for 15 years.
‘But Our Content Is Ranking Well’
This is my favorite delusion. I’ve seen it at every stage of this cycle. “Our AI content is ranking, so it must be fine.” The fact that low-value content is ranking well is often precisely what prompts Google’s algorithmic improvements and manual actions against your site. If your low-value content is ranking, the system hasn’t gotten to you yet. That’s all it means.
Google aggregates signals at the site level, not just the page level. You can have individual pages performing while the overall quality signal of your site degrades. And when the enforcement catches up (algorithmically or manually), it doesn’t pick off pages one by one. It hits the lot.
This is the content spinner’s fallacy, recycled: “It’s working right now, so it must be a strategy.” Demand Media’s content was ranking too. Right up until it wasn’t.
Lily captured this perfectly: “The case study: scaling AI content is working! The reality:” – followed by the traffic cliff that inevitably arrives. Every scaling success story is a snapshot taken before the correction. Nobody publishes the sequel.
Image Credit: Pedro Dias
The Economics Don’t Even Make Sense
Set aside the risk for a moment. Let’s talk about what you’re actually producing.
Five hundred AI-generated articles a month. Each one needs to be reviewed for accuracy – because LLMs hallucinate, and publishing incorrect information is a liability that extends well beyond SEO. Each one needs to be checked for originality – because if it reads like everything else in the index, it provides no added value and no competitive advantage. Each one needs editorial oversight to ensure it actually serves the audience you claim to serve.
If you’re doing all of that, the cost just moved – and possibly increased – while you convinced yourself you were being efficient. The “efficiency” of AI content generation evaporates the moment you apply the quality standards the content actually needs to meet.
And if you’re not doing any of that? You’re publishing unreviewed, unoriginal, potentially inaccurate content at scale under your brand name. I genuinely do not understand how anyone signs off on that.
Same Mistake, Better Tools
Content spinning. Programmatic SEO. AI-generated content at scale. Three different tools, one identical mistake: treating content as a manufacturing problem.
Manufacturing produces identical outputs at scale – that’s the point. Content derives its value from the opposite: from being specific, from being informed by experience, from saying something the rest of the index doesn’t. Every attempt to industrialize it crashes into that contradiction.
You can’t automate specificity. You can’t template experience. You can’t generate original thought by running a prompt through an LLM and hoping something useful comes out. And these constraints won’t be solved by the next model release. They’re baked into what makes content worth reading in the first place.
The people who keep chasing scale are optimizing for the wrong variable. They see “more content” as an input that produces “more traffic” as an output. But the function is not linear. It never was. It’s gated by quality, and no amount of volume bypasses the gate.
The Only Question That Matters
Before you publish anything (AI-assisted or otherwise), ask one question: What does this page offer that the reader cannot already get?
If the answer is “nothing, but we’ll have more pages indexed,” you’re not building a content strategy. You’re building a liability. And you’re doing it with the confidence of someone who has apparently never heard of Panda, never looked at what happened to programmatic SEO sites in 2022, and never read Google’s own spam policies.
You can convince yourself for as long as you want. But you’ll only fool everyone else for a while.
The wall is still there. It’s always been there. The tools keep changing. The wall doesn’t.
This post was originally published on The Inference.
The middle is where your content dies. Not because your writing suddenly gets bad halfway down the page, and not because your reader gets bored, but because large language models have a repeatable weakness with long contexts, and modern AI systems increasingly squeeze long content before the model even reads it.
That combo creates what I think of as dog-bone thinking. Strong at the beginning, strong at the end, and the middle gets wobbly. The model drifts, loses the thread, or grabs the wrong supporting detail. You can publish a long, well-researched piece and still watch the system lift the intro, lift the conclusion, then hallucinate the connective tissue in between.
This is not theory: It shows up in research, and it also shows up in production systems.
Image Credit: Duane Forrester
Why The Dog-Bone Happens
There are two stacked failure modes, and they hit the same place.
First, “lost in the middle” is real. Stanford and collaborators measured how language models behave when key information moves around inside long inputs. Performance was often highest when the relevant material was at the beginning or end, and it dropped when the relevant material sat in the middle. That’s the dog-bone pattern, quantified.
Second, long contexts are getting bigger, but systems are also getting more aggressive about compression. Even if a model can take a massive input, the product pipeline frequently prunes, summarizes, or compresses to control cost and keep agent workflows stable. That makes the middle even more fragile, because it is the easiest segment to collapse into mushy summary.
A fresh example: ATACompressor is a 2026 arXiv paper focused on adaptive, task-aware compression for long-context processing. It explicitly frames “lost in the middle” as a problem in long contexts and positions compression as a strategy that must preserve task-relevant content while shrinking everything else.
So you were right if you ever told someone to “shorten the middle.” Now, I’d offer this refinement:
You are not shortening the middle for the LLM so much as engineering the middle to survive both attention bias and compression.
Two Filters, One Danger Zone
Think of your content going through two filters before it becomes an answer.
Filter 1: Model Attention Behavior: Even if the system passes your text in full, the model’s ability to use it is position-sensitive. Start and end tend to perform better, middle tends to perform worse.
Filter 2: System-Level Context Management: Before the model sees anything, many systems condense the input. That can be explicit summarization, learned compression, or “context folding” patterns used by agents to keep working memory small. One example in this space is AgentFold, which focuses on proactive context folding for long-horizon web agents.
If you accept those two filters as normal, the middle becomes a double-risk zone. It gets ignored more often, and it gets compressed more often.
That is the logic behind the dog-bone idea. A “shorten the middle” approach becomes a direct mitigation for both filters. You are reducing what the system will compress away, and you are making what remains easier for the model to retrieve and use.
What To Do About It Without Turning Your Writing Into A Spec Sheet
This is not a call to kill longform. Longform still matters for humans, and for machines that use your content as a knowledge base. The fix is structural, not “write less.”
You want the middle to carry higher information density with clearer anchors.
Here’s the practical guidance, kept tight on purpose.
1. Put “Answer Blocks” In The Middle, Not Connective Prose
Most long articles have a soft, wandering middle where the author builds nuance, adds color, and tries to be thorough. Humans can follow that. Models are more likely to lose the thread there. Instead, make the middle a sequence of short blocks where each block can stand alone.
An answer block has: A clear claim. A constraint. A supporting detail. A direct implication.
If a block cannot survive being quoted by itself, it will not survive compression. This is how you make the middle “hard to summarize badly.”
2. Re-Key The Topic Halfway Through
Drift often happens because the model stops seeing consistent anchors.
At the midpoint, add a short “re-key” that restates the thesis in plain words, restates the key entities, and restates the decision criteria. Two to four sentences are often enough here. Think of this as continuity control for the model.
It also helps compression systems. When you restate what matters, you are telling the compressor what not to throw away.
3. Keep Proof Local To The Claim
Models and compressors both behave better when the supporting detail sits close to the statement it supports.
If your claim is in paragraph 14, and the proof is in paragraph 37, a compressor will often reduce the middle into a summary that drops the link between them. Then the model fills that gap with a best guess.
Local proof looks like: Claim, then the number, date, definition, or citation right there. If you need a longer explanation, do it after you’ve anchored the claim.
This is also how you become easier to cite. It is hard to cite a claim that requires stitching context from multiple sections.
4. Use Consistent Naming For The Core Objects
This is a quiet one, but it matters a lot. If you rename the same thing five times for style, humans nod, but models can drift.
Pick the term for the core thing and keep it consistent throughout. You can add synonyms for humans, but keep the primary label stable. When systems extract or compress, stable labels become handles. Unstable labels become fog.
5. Treat “Structured Outputs” As A Clue For How Machines Prefer To Consume Information
A big trend in LLM tooling is structured outputs and constrained decoding. The point is not that your article should be JSON. The point is that the ecosystem is moving toward machine-parseable extraction. That trend tells you something important: machines want facts in predictable shapes.
So, inside the middle of your article, include at least a few predictable shapes: Definitions. Step sequences. Criteria lists. Comparisons with fixed attributes. Named entities tied to specific claims.
Do that, and your content becomes easier to extract, easier to compress safely, and easier to reuse correctly.
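To make “predictable shapes” concrete, here is an illustrative sketch of the kind of record an extraction pipeline might pull from a well-structured answer block. The field names and example values are assumptions for illustration, not any real system’s schema.

```python
# Illustrative only: the "predictable shape" an extraction pipeline
# might pull from a well-structured answer block. The field names
# and example values are assumptions, not a real system's schema.
from dataclasses import dataclass

@dataclass
class ExtractedClaim:
    entity: str      # stable label for the core object
    claim: str       # the quotable statement
    constraint: str  # when or where the claim holds
    evidence: str    # the number, date, or citation kept local

block = ExtractedClaim(
    entity="latex paint",
    claim="dry to the touch in about one hour",
    constraint="indoor conditions, single coat",
    evidence="manufacturer drying-time tables",
)
print(block)
```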
How This Shows Up In Real SEO Work
This is the crossover point. If you are an SEO or content lead, you are not optimizing for “a model.” You are optimizing for systems that retrieve, compress, and synthesize.
Your visible symptoms will look like:
Your article gets paraphrased correctly at the top, but the middle concept is misrepresented. That’s lost-in-the-middle plus compression.
Your brand gets mentioned, but your supporting evidence does not get carried into the answer. That’s local proof failing. The model cannot justify citing you, so it uses you as background color.
Your nuanced middle sections become generic. That’s compression turning your nuance into a bland summary, then the model treating that summary as the “true” middle.
Your “shorten the middle” move is how you reduce these failure rates. Not by cutting value, but by tightening the information geometry.
A Simple Way To Edit For Middle Survival
Here’s a clean, five-step workflow you can apply to any long piece, and it’s a sequence you can run in an hour or less.
Identify the midpoint and read only the middle third. If the middle third can’t be summarized in two sentences without losing meaning, it’s too soft.
Add one re-key paragraph at the start of the middle third. Restate: the main claim, the boundaries, and the “so what.” Keep it short.
Convert the middle third into four to eight answer blocks. Each block must be quotable. Each block must include its own constraint and at least one supporting detail.
Move proof next to claim. If proof is far away, pull a compact proof element up. A number, a definition, a source reference. You can keep the longer explanation later.
Stabilize the labels. Pick the name for your key entities and stick to them across the middle.
If you want the nerdy justification for why this works, it is because you are designing for both failure modes documented above: the “lost in the middle” position sensitivity measured in long-context studies, and the reality that production systems compress and fold context to keep agents and workflows stable.
Wrapping Up
Bigger context windows do not save you. They can make your problem worse, because long content invites more compression, and compression invites more loss in the middle.
So yes, keep writing longform when it is warranted, but stop treating the middle like a place to wander. Treat it like the load-bearing span of a bridge. Put the strongest beams there, not the nicest decorations.
That’s how you build content that survives both human reading and machine reuse, without turning your writing into sterile documentation.
You publish a page that solves a real problem. It reads clean. It has examples, and it has the edge cases covered. You would happily hand it to a customer.
Then you ask an AI platform the exact question that page answers, and your page never shows up. No citation, no link, no paraphrase. Just omitted.
That moment is new. Not because platforms give different answers; most people already accept that as reality. The shift is deeper. Human relevance and model utility can diverge.
If you are still using “quality” as a single universal standard, you will misdiagnose why content fails in AI answers, and you will waste time fixing the wrong things.
The Utility Gap is the simplest way to name the problem.
Image Credit: Duane Forrester
What The Utility Gap Is
This gap is the distance between what a human considers relevant and what a model considers useful for producing an answer.
Humans read to understand. They tolerate warm-up, nuance, and narrative. They will scroll to find the one paragraph that matters and often make a decision after seeing the whole page or most of the page.
A retrieval plus generation system works differently. It retrieves candidates, it consumes them in chunks, and it extracts signals that let it complete a task. It does not need your story, just the usable parts.
That difference changes how “good” works.
A page can be excellent for a human and still be low-utility to a model. That page can also be technically visible, indexed, and credible, and yet, it can still fail the moment a system tries to turn it into an answer.
This is not just theory we’re exploring here: Research already separates relevance from utility in LLM-driven retrieval.
Why Relevance Is No Longer Universal
Many standard IR ranking metrics are intentionally top-heavy, reflecting a long-standing assumption that user utility and examination probability diminish with rank. In RAG, retrieved items are consumed by an LLM, which typically ingests a set of passages rather than scanning a ranked list like a human, so classic position discounts and relevance-only assumptions can be misaligned with end-to-end answer quality. (I’m over-simplifying here, as IR is far more complex than one paragraph can capture.)
A 2025 paper on retrieval evaluation for LLM-era systems attempts to make this explicit. It argues classic IR metrics miss two big misalignments: position discount differs for LLM consumers, and human relevance does not equal machine utility. It introduces an annotation scheme that measures both helpful passages and distracting passages, then proposes a metric called UDCG (Utility and Distraction-aware Cumulative Gain). The paper also reports experiments across multiple datasets and models, with UDCG improving correlation with end-to-end answer accuracy versus traditional metrics.
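To illustrate the intuition only (this is explicitly not the paper’s UDCG formula), imagine scoring a retrieved set where helpful passages add to the gain and distracting passages subtract from it:

```python
# Toy illustration of utility-minus-distraction scoring. This is
# NOT the UDCG metric from the paper, just the intuition behind it.
def toy_retrieval_score(passage_labels):
    """passage_labels: +1 helpful, 0 neutral, -1 distracting."""
    if not passage_labels:
        return 0.0
    return sum(passage_labels) / len(passage_labels)

print(toy_retrieval_score([1, 1, 0, -1]))    #  0.25: mostly helpful
print(toy_retrieval_score([0, -1, -1, -1]))  # -0.75: mostly noise
```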
The marketer takeaway is blunt. Some content is not merely ignored. It can reduce answer quality by pulling the model off-track. That is a utility problem, not a writing problem.
A related warning comes from NIST. Ian Soboroff’s “Don’t Use LLMs to Make Relevance Judgments” argues you should not substitute model judgments for human relevance judgments in the evaluation process. The mapping is not reliable, even when the text output feels human.
That matters for your strategy. If relevance were universal, a model could stand in for a human judge, and you would get stable results, but you do not.
The Utility Gap sits right in that space. You cannot assume that what reads well to a person will be treated as useful by the systems now mediating discovery.
Even When The Answer Is Present, Models Do Not Use It Consistently
Many teams hear “LLMs can take long context” and assume that means “LLMs will find what matters.” That assumption fails often.
“Lost in the Middle: How Language Models Use Long Contexts” shows that model performance can degrade sharply based on where relevant information appears in the context. Results often look best when the relevant information is near the beginning or end of the input, and worse when it sits in the middle, even for explicitly long-context models.
This maps cleanly to content on the web. Humans will scroll. Models may not use the middle of your page as reliably as you expect. If your key definition, constraint, or decision rule sits halfway down, it can become functionally invisible.
You can write the right thing and still place it where the system does not consistently use it. This means that utility is not just about correctness; it’s also about extractability.
Proof In The Wild: Same Intent, Different Utility Target
This is where the Utility Gap moves from research to reality.
BrightEdge published research comparing how ChatGPT and Google AI approach visibility by industry. In healthcare, BrightEdge reports 62% divergence and gives an example that matters to marketers because it shows the system choosing a path, not just an answer. For “how to find a doctor,” the report describes ChatGPT pushing Zocdoc while Google points toward hospital directories. Same intent. Different route.
A related report from them also frames this as a broader pattern, especially in action-oriented queries, where the platform pushes toward different decision and conversion surfaces.
That is the Utility Gap showing up as behavior. The model is selecting what it considers useful for task completion, and those choices can favor aggregators, marketplaces, directories, or a competitor’s framing of the problem. Your high-quality page can lose without being wrong.
Portability Is The Myth You Have To Drop
The old assumption was simple: If you build a high-quality page and win in search, you win in discovery. That is no longer a safe assumption.
BCG describes the shift in discoverability and highlights how measurement is moving from rankings to visibility across AI-mediated surfaces. Their piece includes a claim about low overlap between traditional search and AI answer sources, which reinforces the idea that success does not transfer cleanly across systems.
Profound published a similar argument, positioning the overlap gap as a reason top Google visibility does not guarantee visibility in ChatGPT.
Method matters with overlap studies, so treat these numbers as directional signals rather than fixed constants. Search Engine Land published a critique of the broader trend of SEO research being over-amplified or generalized beyond what its methods can support, including discussion of overlap-style claims.
You do not need a perfect percent to act. You just need to accept the principle. Visibility and performance are not portable by default, and utility is relative to the system assembling the answer.
How You Measure The Utility Gap Without A Lab
You do not need enterprise tooling to start, but you do need consistency and intent discipline.
Start with 10 intents that directly impact revenue or retention. Pick queries that represent real customer decision points: choosing a product category, comparing options, fixing a common issue, evaluating safety or compliance, or selecting a provider. Focus on intent, not keyword volume.
Run the exact same prompt on the AI surfaces your customers use. That might include Google Gemini, ChatGPT, and an answer engine like Perplexity. You are not looking for perfection, just repeatable differences.
Capture four things each time:
Which sources get cited or linked.
How your brand is treated (cited, mentioned, paraphrased, or omitted).
Whether your preferred page appears.
Whether the answer routes the user toward or away from you.
Then, score what you see. Keep the scoring simple so you will actually do it. A practical scale looks like this in plain terms:
Your content clearly drives the answer.
Your content appears, but plays a minor role.
Your content is absent, and a third party dominates.
The answer conflicts with your guidance or routes users somewhere you do not want them to go.
That becomes your Utility Gap baseline.
When you repeat this monthly, you track drift. When you repeat it after content changes, you can see whether you reduced the gap or merely rewrote words.
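You can keep this lightweight. Here is an illustrative sketch of a flat CSV log for those monthly checks; the column schema and the 0-3 mapping of the scale above are my assumptions, not a standard.

```python
# Illustrative Utility Gap log. The schema and the 0-3 mapping of
# the scale above are assumptions, not a standard.
import csv
from datetime import date

SCALE = {
    3: "our content clearly drives the answer",
    2: "our content appears but plays a minor role",
    1: "our content is absent; a third party dominates",
    0: "the answer conflicts with us or routes users away",
}

def log_check(path, intent, surface, cited_sources, brand_status,
              preferred_page_shown, routes_toward_us, score):
    assert score in SCALE
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow([
            date.today().isoformat(), intent, surface,
            "|".join(cited_sources), brand_status,
            preferred_page_shown, routes_toward_us, score,
        ])

log_check("utility_gap.csv", "how to find a doctor", "ChatGPT",
          ["zocdoc.com"], "omitted", False, False, 1)
```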
How You Reduce The Utility Gap Without Turning Your Site Into A Checklist
The goal is not to “write for AI.” The goal is to make your content more usable to systems that retrieve and assemble answers. Most of the work is structural.
Put the decision-critical information up front. Humans accept a slow ramp. Retrieval systems reward clean early signals. If the user’s decision depends on three criteria, put those criteria near the top. If the safest default matters, state it early.
Write anchorable statements. Models often assemble answers from sentences that look like stable claims. Clear definitions, explicit constraints, and direct cause-and-effect phrasing increase usability. Hedged, poetic, or overly narrative language can read well to humans and still be hard to extract into an answer.
Separate core guidance from exceptions. A common failure pattern is mixing the main path, edge cases, and product messaging inside one dense block. That density increases distraction risk, which aligns with the utility and distraction framing in the UDCG work.
Make context explicit. Humans infer, but models benefit when you state assumptions, geography, time sensitivity, and prerequisites. If guidance changes based on region, access level, or user type, say so clearly.
Treat mid-page content as fragile. If the most important part of your answer sits in the middle, promote it or repeat it in a tighter form near the beginning. Long-context research shows position can change whether information gets used.
Add primary sources when they matter. You are not doing this for decoration. You are giving the model and the reader evidence to anchor trust.
This is content engineering, not gimmicks.
Where This Leaves You
The Utility Gap is not a call to abandon traditional SEO. It is a call to stop assuming quality is portable.
Your job now runs in two modes at once. Humans still need great content. Models need usable content. Those needs overlap, but they are not identical. When they diverge, you get invisible failure.
That changes roles.
Content writers cannot treat structure as a formatting concern anymore. Structure is now part of performance. If you want your best guidance to survive retrieval and synthesis, you have to write in a way that lets machines extract the right thing, fast, without getting distracted.
SEOs cannot treat “content” as something they optimize around at the edges. Technical SEO still matters, but it no longer carries the whole visibility story. If your primary lever has been crawlability and on-page hygiene, you now have to understand how the content itself behaves when it is chunked, retrieved, and assembled into answers.
The organizations that win will not argue about whether AI answers differ. They will treat model-relative utility as a measurable gap, then close it together, intent by intent.
Key dates and notable events throughout the year can feed your content strategy and your social media marketing strategy. Aligning your digital campaigns with the right seasons for your brand, at the right time, is a staple part of creating a content calendar.
The SEJ marketing calendar includes everything from holidays to big sporting events to awareness months that you can plan content around for maximum engagement. We also include a template for you to plan your own calendar of relevant awareness dates.
Just review the full calendar and copy across the dates you want for each month to create your own 2026 marketing calendar.
Use the dates as a starting point to help you brainstorm ideas and find opportunities for content that you can align to events throughout the year for a better chance of engagement.
Free Marketing Calendar And Template For 2026
Below are listed many of the major holidays, events, and obscure awareness days for 2026, month by month. There should be an event for every day of the year.
The full marketing calendar and template are available at the end of the article, with a breakdown of each month.
This calendar focuses mainly on the U.S. and Canada, with some major international and religious holidays included.
Your 2026 Holiday Marketing Calendar
Note: You can use this marketing calendar with our social media planner to keep your ideas, posts, and scheduling organized.
January
January is a time of resolutions and fresh starts, with many picking a goal for the year or looking to make a change.
It can be a slow start, given that many people are still recovering from the end of last year, but that gives you time to plan your calendar and ease into a new year of content.
There are plenty of broad activities to lean into, like Veganuary and National Hobby Month, to connect with audience lifestyles.
January events like the Golden Globes and Winter X Games always have all eyes on them, too, so content around them can kickstart your 2026 engagement.
Monthly Holidays And Observances
International Creativity Month
National Blood Donor Month
National Braille Literacy Month
National Hobby Month
Dry January
Veganuary
Cervical Cancer Awareness Month
National Polka Music Month
National Skating Month
National Slow Cooking Month
National Soup Month
National Staying Healthy Month
National “Thank You” Month
National Train Your Dog Month
Weekly Observances
January 1 – 7 New Year’s Resolutions Week
January 1 – 7 Celebration of Life Week
January 12 – 18 National Pizza Week
January 12 – 18 Home Office and Security Week
January 19 – 25 Healthy Weight Week
Days
January 1 – New Year’s Day
January 1 – Global Family Day
January 2 – National Science Fiction Day
January 4 – World Braille Day
January 5 – National Screenwriters Day
January 6 – Epiphany
January 7 – Orthodox Christmas Day
January 11 – International Thank You Day
January 11 – 83rd Annual Golden Globe Awards
January 13 – Korean American Day
January 13 – Stephen Foster Memorial Day
January 14 – Orthodox New Year
January 14 – Ratification Day
January 17 – Ditch New Year’s Resolutions Day
January 17 – Benjamin Franklin Day
January 19 – Martin Luther King Jr. Day
January 21 – National Hug Day
January 22 (to February 1) – Sundance Film Festival
January 23 – National Pie Day
January 23-25 – Winter X Games
January 24 – International Day of Education
January 27 – International Holocaust Remembrance Day
January 28 – Data Privacy Day
Popular Hashtags For January
#NewYearsDay
#ScienceFictionDay
#NationalTriviaDay
#NationalBirdDay
#NationalStickerDay
#GetToKnowYourCustomersDay
#CheeseLoversDay
#MLKDay
#NationalHuggingDay
#PieDay
#NationalComplimentDay
#PrivacyAware
February
Despite being the shortest month, February is full of interesting events you can leverage for your marketing campaigns. The month is centered on the theme of love (along with timely observances like American Heart Month), making it relatable ground for brands to craft creative campaigns around couples and community.
The colder days can leave people looking for things to get involved with from the comfort of their homes. So, make sure your content works in line with popular days to attract people to your organization.
February may be short, but it offers plenty of opportunities to tap into the heart of the season and connect with your audience.
Monthly Holidays And Observances
Black History Month
American Heart Month
National Heart Month
National Weddings Month
National Cancer Prevention Month
National Library Lovers Month
Celebration of Chocolate Month
Weekly Observances
February 7-13 – African Heritage and Health Week
February 9-15 – Freelance Writers Appreciation Week
February 9-15 – International Flirting Week
February 11-16 – New York Fashion Week
February 14-20 – Random Acts of Kindness Week
February 16-22 – Engineers’ Week
February 17-23 – National Pancake Week
February 24-March 2 – National Eating Disorders Awareness Week
Days
February 1 – First Day of Black History Month
February 1 – National Freedom Day
February 1 – National Change Your Password Day
February 1 – 68th Annual Grammy Awards
February 2 – Groundhog Day
February 4 – World Cancer Day
February 5 – National Girls and Women in Sports Day
February 8 – Super Bowl LX
February 9 – National Pizza Day
February 11 – International Day of Women and Girls in Science
February 12 – Abraham Lincoln’s Birthday
February 12 – Red Hand Day
February 12 – Georgia Day
February 12 – Darwin Day
February 13 – World Radio Day
February 13-15 – NBA All-Star Weekend
February 14 – Valentine’s Day
February 15 – Susan B. Anthony’s Birthday
February 16 – Presidents’ Day
February 17 – Lunar New Year
February 17 – Mardi Gras
February 17-18 (estimated) – Ramadan Begins
February 22 – George Washington’s Birthday
Popular Hashtags For February
#GroundhogDay
#WorldCancerDay
#NationalWeatherpersonsDay
#SendACardToAFriendDay
#BoyScoutsDay
#NationalPizzaDay
#ValentinesDay
#RandomActsOfKindnessDay
#PresidentsDay
#LoveYourPetDay
March
March marks the beginning of spring, and the days start to get longer. Whether March Madness turns up the heat or Pi Day inspires a little fun, there are plenty of exciting events to get your content involved with.
Some of the monthly observances, such as Women’s History Month or The Great American Cleanup, can serve as great causes for regular engagement this month.
Monthly Observances
Women’s History Month
Nutrition Month
Music in Our Schools Month
National Craft Month
American Red Cross Month
Irish-American Heritage Month
Ramadan (projected to end on March 18-19)
Weekly Observances
March 9-15 – Girl Scout Week
March 9-15 – National Sleep Awareness Week
March 18-24 – National Agriculture Week
March 23-29 – National Cleaning Week
Days
March 1 – Zero Discrimination Day
March 3 – World Wildlife Day
March 3 – National Anthem Day
March 4 – International HPV Awareness Day
March 6 – Global Unplugging Day
March 7 – Employee Appreciation Day
March 8 – International Women’s Day
March 8 – Daylight Saving Time begins
March 13 – Purim
March 13 – World Sleep Day
March 14 – Pi Day
March 15 – The Ides of March
March 15 – 98th Academy Awards Ceremony
March 17 – St. Patrick’s Day
March 18 – Global Recycling Day
March 18-19 (expected) – Ramadan ends
March 19-20 (expected) – Eid Al-Fitr
March 20 – Nowruz
March 20 – Spring Equinox
March 22 – World Water Day
March 26 – Epilepsy Awareness Day
March 27 – World Theatre Day
March 27 – MLB Opening Day
March 29 – Palm Sunday
Popular Hashtags For March
#PeanutButterLoversDay
#EmployeeAppreciationDay
#ReadAcrossAmerica
#DrSeuss
#WorldWildlifeDay
#NationalGrammarDay
#BeBoldForChange
#DaylightSavings
#PiDay
#StPatricksDay
#FirstDayofSpring
#WorldWaterDay
#NationalPuppyDay
#PurpleDay
#NationalDoctorsDay
#EarthHour
April
April is probably best known for April Fools’ Day, and a chance to get creative with parody and spoof content for your calendar that can make your customers smile.
Earth Month also means you can make more eco-friendly posts about your organization’s commitment to reducing its impact on the planet.
You also might want to get your cape out of storage on April 28 for National Superhero Day.
Monthly Observances
Earth Month
National Autism Awareness Month
Parkinson’s Awareness Month
Celebrate Diversity Month
Stress Awareness Month
Weekly Observances
April 20-26 – National Volunteer Week
April 20-26 – Administrative Professionals Week
April 21-25 – Every Kid Healthy Week
April 21-27 – Animal Cruelty/Human Violence Awareness Week
Days
April 1 – April Fools’ Day
April 1 – Passover starts
April 2 – World Autism Awareness Day
April 2 – International Children’s Book Day
April 2 – National Walking Day
April 2 – Maundy Thursday
April 3 – Good Friday
April 4 – Holy Saturday
April 5 – Easter Sunday
April 6 – Easter Monday
April 7 – National Beer Day
April 7 – World Health Day
April 9-12 – Masters Tournament (golf)
April 9 – Passover ends
April 11 – National Pet Day
April 11-13 and 18-20 – Coachella Music Festival
April 13 – Thomas Jefferson’s Birthday
April 13-14 – Yom HaShoah (begins at sundown April 13, ends April 14)
April 13-15 – Songkran
April 15 – American Sign Language Day
April 15 – Tax Day
April 16 – Emancipation Day
April 20 – Patriots’ Day
April 21 – World Creativity and Innovation Day
April 22 – Yom Ha’atzmaut (sundown April 21 to nightfall April 22)
April 22 – Earth Day
April 25 – Arbor Day
April 27 – World Design Day
April 28 – National Superhero Day
April 30 – National Honesty Day
Popular Hashtags For April
#AprilFools
#WAAD
#FindARainbowDay
#NationalWalkingDay
#LetsTalk
#EqualPayDay
#TaxDay
#NH5D
#NationalLookAlikeDay
#AdministrativeProfessionalsDay
#DenimDay
#EndMalariaForGood
#COUNTONME
#ArborDay
#NationalHonestyDay
#AdoptAShelterPetDay
May
May brings plenty of variety: good causes to raise awareness for, major sporting events, and unique celebrations to join in with.
Cinco de Mayo, the Kentucky Derby, and Memorial Day are just a few examples of events that will have lots of people paying attention and can make for great marketing themes.
Monthly Observances
ALS Awareness
Asthma Awareness Month
Asian Pacific American Heritage Month
Jewish American Heritage Month
National Celiac Disease Awareness Month
National Clean Air Month
Better Sleep Month
Lupus Awareness Month
Weekly Observances
May 4-10 – National Pet Week
May 4-10 – National Travel & Tourism Week
May 4-10 – Drinking Water Week
May 6-12 – National Nurses Week
May 11-17 – Food Allergy Awareness Week
Days
May 1 – May Day
May 1 – Law Day
May 1 – Lei Day
May 1 – World Password Day
May 2 – Kentucky Derby
May 4 – Star Wars Day
May 4 – International Firefighters Day
May 5 – Cinco De Mayo
May 6 – National Nurses Day
May 8 – World Red Cross and Red Crescent Day
May 10 – World Lupus Day
May 10 – World Fair Trade Day
May 10 – Mother’s Day
May 15-18 – PGA Championship
May 15 – International Day of Families
May 15 – Malcolm X Day
May 17 – Internet Day
May 18 – National HIV Vaccine Awareness Day
May 18 – Victoria Day (Canada)
May 20 – World Bee Day
May 21 – World Meditation Day
May 24-June 7 – French Open
May 25 – Geek Pride Day
May 25 – Memorial Day
May 28 – World Hunger Day
Popular Hashtags For May
#RedNoseDay
#MayDay
#WorldPasswordDay
#StarWarsDay & #Maythe4thBeWithYou
#InternationalFirefightersDay
#CincoDeMayo
#MothersDay
#BTWD
#MemorialDay & #MDW
June
Once June has arrived, it’s finally starting to feel like summer. Everyone wants to make the most of the sunshine, and the positive energies are flowing.
Given that June also marks Great Outdoors Month, this is a great opportunity to make your brand a must-have companion for planning a beachside vacation or hosting a cookout.
You can also show your support for LGBTQ+ Pride, Flag Day, and Father’s Day, along with all the other events listed here.
Monthly Observances
LGBTQ Pride Month
Caribbean-American Heritage Month
Great Outdoors Month
Men’s Health Month
National Safety Month
National Zoo and Aquarium Month
Weekly Observances
June 1-7 – National Garden Week
June 1-7 – National Headache Awareness Week
June 9-15 – National Men’s Health Week
June 15-21 – National Roller Coaster Week
Days
June 1 – Global Parents Day
June 5 – Hot Air Balloon Day
June 5 – World Environment Day
June 6 – D-Day
June 6 – Belmont Stakes
June 8 – World Oceans Day
June 8 – National Best Friends Day
June 8 (expected) – Tony Awards
June 9 – Donald Duck Day
June 11 – Kamehameha Day
June 11-14 – Bonnaroo Music Festival
June 14 – National Flag Day
June 15 – Trinity Sunday
June 18-21 – U.S. Open PGA
June 19 – Juneteenth
June 19 – Chinese Dragon Boat Festival
June 21 – Father’s Day
June 21 – Summer Solstice
June 23 – International Widows Day
June 25-26 – Ashura
June 29-July 12 – Wimbledon
June 30 – International Asteroid Day
Popular Hashtags For June
#NationalDonutDay
#FathersDay
#NationalSelfieDay
#TakeYourDogToWorkDay
#HandshakeDay
#SMDay
July
July presents lots of opportunities for savvy marketers, from the 4th of July to the International Day of Friendship.
As we enter the summer slowdown period, there’s a lot to celebrate that can help feed your social media content to keep customers engaged.
So celebrate your independence, indulge in a little ice cream, and bring people together with one of the many events in July.
Monthly Observances
Family Golf Month
Ice Cream Month
National Parks and Recreation Month
National Picnic Month
National Independent Retailer Month
National Blueberry Month
Weekly Observances
July 6-12 – Nude Recreation Week
July 14-20 – Capture the Sunset Week
Days
July 1 – International Joke Day
July 2 – World UFO Day
July 4 – Independence Day (Observed Friday, July 3)
July 4-26 – Tour de France
July 6 – International Kissing Day
July 7 – World Chocolate Day
July 8 – National Video Games Day
July 11 – World Population Day
July 12 – Pecan Pie Day
July 14 – MLB All-Star Game
July 16 – Moon Landing Anniversary
July 17 – World Emoji Day
July 18 – Nelson Mandela International Day
July 20 – International Chess Day
July 20 – National Moon Day
July 21 – National Junk Food Day
July 24 – Amelia Earhart Day
July 26 – Aunt and Uncle Day
July 27 – Parents’ Day
July 28 – World Hepatitis Day
July 30 – International Day of Friendship
July 31 – World Ranger Day
Popular Hashtags For July
#NationalPostalWorkerDay
#WorldUFODay
#WorldEmojiDay
#DayOfFriendship
August
We’ve hit the hottest days by August as back-to-school looms, and we welcome the return of football.
While many are topping up their tans and making the most of the final summer days, August still provides lots of opportunities to align your content with wider events.
Make sure you’re using your marketing calendar to the fullest extent to post any sunny seasonal content promptly before fall arrives.
Monthly Observances
Back to School Month
National Breastfeeding Month
Family Fun Month
National Peach Month
Weekly Observances
August 1-7 – International Clown Week
August 3-9 – National Farmers’ Market Week
August 10-16 – National Smile Week
August 25-31 – Be Kind to Humankind Week
Days
August 1 – National Girlfriends Day
August 2 – NFL Hall of Fame Game & Pre-season
August 2 – National Friendship Day
August 7 – Purple Heart Day
August 7 – International Beer Day
August 8 – International Cat Day
August 9 – Book Lover’s Day
August 11 – National Son and Daughter Day
August 11 – Victory Day
August 13 – Left Hander’s Day
August 15 – Assumption of Mary
August 15 – National Honey Bee Day
August 19 – World Humanitarian Day
August 20 – National Radio Day
August 21 – Senior Citizens Day
August 26 – Women’s Equality Day
August 28 – Raksha Bandhan
August 30 – Frankenstein Day
August 30 – National Beach Day
Popular Hashtags For August
#InternationalCatDay
#NationalBookLoversDay
#WorldElephantDay
#LefthandersDay
#WorldPhotoDay
#WorldHumanitarianDay
#NationalLemonadeDay
#NationalDogDay
#WomensEqualityDay
September
As fall begins, some of the bigger events happening in September are Hispanic Heritage Month, Grandparents Day, and, of course, Labor Day.
There are also plenty of other events to inspire you, from Oktoberfest to National Yoga Month. Plus, there’s National Coffee Day for those who struggle to start their day without a caffeine fix.
Monthly Observances
Wilderness Month
National Food Safety Education Month
National Yoga Month
Whole Grains Month
Hispanic Heritage Month (September 15 – October 15)
Weekly Observances
September 7-13 – National Suicide Prevention Week
September 13-19 – National Indoor Plant Week
September 15-21 – Pollution Prevention Week
September 21-27 – National Dog Week
Days
September 2 – VJ Day
September 4 – National Wildlife Day
September 5 – International Day of Charity
September 6 – National Fight Procrastination Day
September 7 – Labor Day
September 8 – Pardon Day
September 11 – Patriot Day (9/11 remembrance)
September 12 – Video Games Day
September 13 – Uncle Sam Day
September 13 – National Grandparents Day
September 15 – Greenpeace Day
September 17 – Constitution Day
September 19 – Oktoberfest begins
September 20 – Yom Kippur
September 21 – International Day of Peace
September 22 – World Car-Free Day
September 23 – September Equinox
September 24 – World Bollywood Day
September 25 – Native American Day
September 27 – World Tourism Day
September 29 – National Coffee Day (US)
September 29 – Confucius Day
September 29 – World Heart Day
Popular Hashtags For September
#LaborDay
#NationalWildlifeDay
#CharityDay
#ReadABookDay
#911Day
#NationalVideoGamesDay
#TalkLikeAPirateDay
#PeaceDay
#CarFreeDay
#WorldRabiesDay
#GoodNeighborDay
#InternationalPodcastDay
October
It’s that time of year when pumpkin spice lattes roll around again.
While October is known to many as the spooky season, there’s much more to this month than just Halloween. There’s World Teachers’ Day, World Mental Health Day, and Spirit Day, to name a few, around which your organization can create content.
Monthly Observances
Breast Cancer Awareness Month
Bully Prevention Month
Halloween Safety Month
Financial Planning Month
National Pizza Month
Weekly Observances
October 5-11 – Fire Prevention Week
October 13-19 – Earth Science Week
October 19-25 – National Business Women’s Week
Days
October 1 – International Coffee Day
October 1 – World Vegetarian Day
October 3 – National Techies Day
October 5 – World Teachers’ Day
October 5 – Oktoberfest ends
October 5 – Child Health Day
October 10 – World Mental Health Day
October 11 – National Coming Out Day
October 12 – Indigenous Peoples’ Day
October 12 – Columbus Day
October 12 – Thanksgiving Day (Canada)
October 16 – World Food Day
October 16 – Spirit Day (Anti-bullying)
October 17 – Sweetest Day
October 24 – United Nations Day
October 24 – Make a Difference Day
October 30 – Mischief Night
October 31 – Halloween
Popular Hashtags For October
#InternationalCoffeeDay
#TechiesDay
#NationalTacoDay
#WorldSmileDay
#WorldTeachersDay
#WorldHabitatDay
#WorldMentalHealthDay
#BossesDay
#UNDay
#ChecklistDay
#Halloween
November
During the month in which we all give thanks, there’s also a wide range of causes you can support or raise awareness for, like Movember and America Recycles Day.
You should also mark your marketing calendar for arguably the biggest sales events of the year – Black Friday and Cyber Monday – which are sure to be on everyone’s radar.
Monthly Observances
Native American Heritage Month
Movember
World Vegan Month
Novel Writing Month
National Gratitude Month
Weekly Observances
November 17-21 – American Education Week
November 20-26 – Game and Puzzle Week
Days
November 1 – Day of the Dead/Día de los Muertos
November 1 – All Saints’ Day
November 1 – World Vegan Day
November 1 – Daylight Saving Time ends
November 3 – Melbourne Cup Day
November 8 – STEM Day
November 8 – Diwali
November 9 – World Freedom Day
November 10 – Marine Corps Birthday
November 11 – Veterans Day
November 13 – World Kindness Day
November 14 – World Diabetes Day
November 17 – National Entrepreneurs Day
November 24 – Evolution Day
November 26 – Thanksgiving Day
November 27 – Black Friday
November 28 – Native American Heritage Day
November 30 – Cyber Monday
Popular Hashtags For November
#WorldVeganDay
#NationalSandwichDay
#DaylightSavings
#CappuccinoDay
#STEMDay
#VeteransDay
#WKD
#WDD
#BeRecycled
#EntrepreneursDay
#Thanksgiving
#ShopSmall
December
December is here, and the end of the year is in sight.
Although 2027 is right around the corner, and you might want to start planning your content calendar for next year, don’t neglect your content in the run-up to the holidays.
Send your year off in style with marketing campaigns dedicated to events like Nobel Prize Day, Rosa Parks Day, Green Monday, and more.
You can even do a content wrap-up of your best moments from the year – and make sure to get your 2027 marketing calendar sorted early before the post-Christmas wind-down.
Monthly Observances
Human Rights Month
Operation Santa Paws
Safe Toys and Gifts Month
World Food Service Safety Month
Weekly Observances
December 4-12 – Hanukkah (Chanukah)
December 26-January 1 – Kwanzaa
Days
December 1 – World AIDS Day
December 1 – Rosa Parks Day
December 3 – International Day of Persons with Disabilities
December 6 – St. Nicholas Day
December 7 – Pearl Harbor Remembrance Day
December 7 – National Letter Writing Day
December 8 – Feast of the Immaculate Conception
December 10 – Nobel Prize Day
December 10 – Human Rights Day
December 11 – UNICEF Anniversary
December 12 – Hanukkah ends
December 15 – Bill of Rights Day
December 18 – National Twin Day
December 21 – Winter Solstice
December 22 – Forefathers Day
December 23 – Festivus
December 24 – Christmas Eve
December 25 – Christmas Day
December 26 – Kwanzaa begins
December 26 – Boxing Day
December 31 – New Year’s Eve
Popular Hashtags For December
#IDPWD
#NationalCookieDay
#NobelPrize
#WinterSolstice
#NYE
The Complete Marketing Calendar And Template To Plan 2026
A content plan mapped out months in advance gives you a reliable foundation to work from all year, without scrambling for ideas at the last minute.
Track what performs well throughout the year and use those insights to inform next year’s marketing calendar, so you can invest more heavily in the content themes that consistently deliver results.
“What is the threshold between keyword stuffing and being optimized? Is there a magic rule for how often to use your main keyword and related keywords in a 2,000-word page? Should the main keyword be in the Headers AND the body in the same section?”
Great question!
There is no such thing as “being optimized” when it comes to keyword repetition. It’s similar to looking at “authority” scores for domains: the optimization scores you get are measurements of what an SEO tool thinks matters, not what the actual search engines or LLM and AI systems use. The idea that a keyword needs to be repeated comes from an SEO concept called keyword density, which is an artifact of SEO tools.
Each tool has its own way of deciding whether you’ve repeated a word or phrase enough for it to be “SEO friendly,” and because people trust the tools, they trust that this is a valid ranking factor or signal for a search engine. It is not. Search engines do not pay attention to how many times a word appears on a page or in a paragraph, because rewarding repetition doesn’t produce a good experience.
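To see what those tools are actually measuring, here’s a minimal sketch of a keyword-density calculation in Python. The sample text and phrase are invented for illustration; nothing here reflects how a search engine scores a page.

```python
import re

def keyword_density(text: str, phrase: str) -> float:
    """Share of words accounted for by occurrences of `phrase`.
    This mirrors what density tools report; it is not a search
    engine signal."""
    words = re.findall(r"[a-z]+", text.lower())
    phrase_words = phrase.lower().split()
    if not words:
        return 0.0
    # Count non-overlapping occurrences of the phrase.
    hits, i = 0, 0
    while i <= len(words) - len(phrase_words):
        if words[i:i + len(phrase_words)] == phrase_words:
            hits += 1
            i += len(phrase_words)
        else:
            i += 1
    return hits * len(phrase_words) / len(words)

sample = "Blue shorts for summer. Our blue shorts are the best blue shorts."
print(f"{keyword_density(sample, 'blue shorts'):.0%}")  # 50% - clearly stuffed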
Panda reduced the effectiveness of low-quality, keyword-stuffed content, and Google’s later advancements, BERT and MUM, allowed better understanding of context, relationships between terms, and the overall structure of a page. Google is now far better at interpreting meaning without relying on repeated exact-match keywords.
Keywords help to send a signal to a search engine about the topic of the page. And they can be used in headers, within text, as internal links, within title tags, schema, and the URL structure. But worrying about using the keyword for SEO purposes can lead to trouble. So, let’s define keyword stuffing for the sake of this post.
Keyword stuffing is when you force a keyword or keyword phrase into content, headers, and URLs for the sole purpose of SEO.
By forcing a keyword into a post, or forcing it into headers, you hurt the user experience. Although the search engine will know what you want to rank for, the language won’t feel natural. Instead of worrying about how many times you say the keyword, think about synonyms and other ways to say things that are easy to understand. Many search engines are getting better and better at understanding how topics, words, sentences, and phrases relate to one another. You don’t have to repeat the same words over and over anymore.
If you Google the word “swimsuit,” you’ll likely see it in a couple of title tags, but you’ll also see “swimwear.” Now type in “bathing suits,” and you’ll likely not see that phrase in many of the title tags; instead, they’ll say “swimwear” and other synonyms, even though “bathing suits” is a popular name for the same product.
Now try “hairdresser near me,” and you’ll likely not see “hairdresser” in a lot of the results, but you will see “hair salon” and similar types of businesses. This is because search engines produce solutions to problems, and if they understand the page has the solution, you don’t need to keep repeating keywords.
For example, instead of saying “keyword stuffing” in this post, I could say “overusing phrases for SEO.” It means the same thing. Readers of this column would get bored pretty fast if I kept saying “keyword stuffing,” and by mixing it up, I can keep their interest while search engines can still determine the terms are one and the same. This also applies to header tags.
I don’t have any solid proof of this, but it seems to work well for our clients and the content we create, and it has worked for more than 10 years. If the main keyword phrase is in the H1 tag, whether it is a menu item or a blog post, we don’t worry about placing it in H2, H3, etc. I won’t be upset if the keyword shows up naturally, as that creates a good UX.
The theory here is that headers carry the theme and topic through the sections below them. If the top-level header has the word “blue” in it, I assume the theme “blue” carries through the page and applies to each H2 tag, since an H2 is a subtopic of “blue.” H2s for “blue” could be “t-shirts” and “shorts.”
If this is true, by having the H1 be “blue” and the H2 be “shorts,” a search engine will know they are “blue shorts,” and I feel very confident users will too. They clicked blue or found a SERP for blue clothing, and they clicked shorts from the menu or found them from scrolling.
If you stuff “blue” into each link and header, it is annoying for the user to see it over and over. But many sites that get penalized will have “blue cargo shorts,” “blue chino shorts,” “blue workout shorts,” etc. It looks nicer to just say the styles of shorts like “cargo” or “chino,” and search engines likely already know they’re blue because you had it in the H tag one level up. You also likely have the “blue” part in breadcrumbs, site structure, product descriptions, etc.
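To make the theme-inheritance idea concrete, here’s a minimal Python sketch, assuming a page outline represented as (level, text) tuples. It shows how each subheading can be read in the context of its ancestors, so “blue” never needs repeating.

```python
def implied_topics(outline):
    """Combine each heading with its ancestor headings, the way the
    theory above assumes a reader (or engine) interprets H2s under
    an H1. Outline items are (level, text) tuples."""
    stack = []  # one ancestor heading per level
    for level, text in outline:
        stack = stack[:level - 1]  # drop headings at this depth or deeper
        stack.append(text)
        yield " ".join(stack)

page = [(1, "Blue"), (2, "Cargo shorts"), (2, "Chino shorts")]
for topic in implied_topics(page):
    print(topic)
# Blue
# Blue Cargo shorts
# Blue Chino shorts
```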
One thing you definitely do not want to do is have a million footer links that match the navigation or are keyword-stuffed. This worked a long time ago, but now it is just spam. It doesn’t benefit the user; it is obvious to search engines you’re doing it for SEO. Sites that stuff keywords tend to use these outdated tactics too, so I want to include it here.
I hope this helps answer your question about overusing specific topics or phrases. Doing this only makes the tool happy; it does not mean you’ll be creating a good UX for users or search engines. If you focus on writing for your consumer and incorporate a keyword or phrase naturally, you’ll likely be rewarded.
When Emily Epstein shared her perspective on LinkedIn about how “people didn’t stop reading books when encyclopedias came out,” it sparked a conversation about the future of primary sources in an AI-driven world.
In this episode, Katie Morton, Editor-in-Chief of Search Engine Journal, and Emily Anne Epstein, Director of Content at Sigma, dig into her post and unpack what AI really means for publishers, content creators, and marketers now that AI tools present shortcuts to knowledge.
Their discussion highlights the importance of provenance, the layers involved in online knowledge acquisition, and the need for more transparent editorial standards.
If you’re a content creator, this episode can help you gain insight into how to provide value as the competition for attention becomes a competition for trust.
Read the full transcript of their conversation below:
Katie Morton: Hello, everybody. I’m Katie Morton, Editor-in-Chief of Search Engine Journal, and today I’m sitting down with Emily Anne Epstein, Director of Content at Sigma. Welcome, Emily.
Emily Anne Epstein: Thanks so much. I’m so excited to be here.
Katie: Me too. Thanks for chatting with me. So Emily wrote a really excellent post on LinkedIn that caught my attention. Emily, for our audience, would you mind summarizing that post for us?
Emily: So this should feel both shocking and non-shocking to everybody. But the idea is, people didn’t stop reading books when encyclopedias came out. And this is a response to the hysteria that’s going on with the way AI tools are functioning as summarizing devices for complicated and complex situations. And so the idea is, just because there’s a shortcut now to acquiring knowledge, it doesn’t mean we’re getting rid of the need for primary sources and original sources.
These two different types of knowledge acquisition exist together, and they layer on top of one another. You may start your book report with an encyclopedia or ChatGPT search, but what you find there doesn’t matter if you can’t back it up. You can’t just say in a book report, “I heard it in Encarta.” Where did the information come from? I think about the way this is going to transform search: There’s simply going to be layers now.
Maybe start your search with an AI tool, but you’ll need to finish somewhere else that organizes primary sources, provides deeper analysis, and even shows contradictions that go into creating knowledge.
Because a lot of what these synthesized summaries do is present a calm, “impartial” view of reality. But we all know that’s not true. All knowledge is biased in some way because it cannot be “all-containing.”
The Importance Of Provenance
Katie: I want to talk about something you mentioned in your LinkedIn post: provenance. What needs to happen, whether culturally, editorially, or socially, for “show me the source material” to become standard in AI-assisted search?
With Wikipedia or encyclopedias, ideally, people should still cite the original source, go deeper into the analysis, and be able to say, “Here’s where this information came from.” How do we get there so people aren’t just skimming surface-level summaries and taking them as gospel?
Emily: First, people need to use these tools, and there needs to be a reckoning with how reliable they are. Thinking about provenance means thinking about knowledge acquisition as triangulation. So, when I was a journalist, you have to balance hearsay, direct quotes, press releases, and social media.
You create your story from a variety of sources, so that way, you get something that’s in the middle and can explain multiple truths and realities. That comes from understanding that truth has never been linear, and reality is fracturing.
What AI does, even more advanced than that, is deliver personalized responses. People are prompting their models differently, so we’re all working from different sets of information and getting different answers. Once reality is fractured to that degree, knowing where something comes from – the provenance – becomes essential for context.
And triangulation won’t just be important for journalists; it’s going to be important for everyone because people make decisions based on the information that they receive.
If you get bad inputs, you’ll get bad outputs, make bad decisions, and that affects everything from your work to your housing. People will need to triangulate a better version of reality that is more accurate than what they’re getting from the first person or the first tool they asked.
Creators: Competing For Attention To Competing For Trust
Katie: So if AI becomes the top layer in how people access information – designed to hold attention within its own ecosystem – what does that mean for content creators and publishers? It feels like they’re creating a commodity that AI then repackages as its own.
How do you see that playing out for creators in terms of revenue and visibility?
Emily: Instead of competing for attention, creators and publishers will compete for trust. That means making editorial standards more transparent. They’re going to have to show the work that they’re doing. Because with most AI tools, you don’t see how they work; it’s a bit of a black box.
But if creators can serve as a “blockchain” (a verifiable ledger of information sources), showing their sources and methods, that will be their value.
Think about photography. When it first came out, it was considered a science. People thought photos were pure fact. Then, darkroom techniques like dodging and burning or combining multiple exposures showed that photos could lie.
And when photography became an art form, people realized that the photographer’s role was to provide a filter. That’s where we are with AI. There are filters on every piece of information that we receive.
And those organizations that make their filter transparent are going to be more successful, and people will return to them because again, they’re getting better information. They know where it’s coming from, so they can make better decisions and live better lives.
AI Hallucinations & Deepfakes
Emily: It was a shocking moment in the history of photography when people realized you could lie with photographs. And that’s sort of where we are right now. Everybody is using AI, and we know there are hallucinations, but we have to understand that we cannot trust this tool, generally speaking, unless it shows its work.
Katie: And the risks are real. We’re already seeing AI voiceovers and video deepfakes mimicking creators often without their consent.
Inspiring People To Go Deeper
Katie: In your post, you ended with “people still doing the work of deciding what’s enough.” In an attention economy of speed and convenience, how do we help people go deeper?
Emily: The idea that people don’t want to go deeper flies in the face of Wikipedia holes. People start with summarized information, but then click a citation, keep going further, watch another show, keep digging.
People want more of what they want. If you give them a breadcrumb of fascinating information, they’ll want more of that. Knowledge acquisition has an emotional side. It gives you dopamine hits: “I found that, that’s for me.”
And as content marketers, we have to provide that value for people where they say, ‘Wow, I am smarter because of this information. I like this brand because this brand has invested in my intelligence and my betterment.’
And for content creators, that needs to be the gold star.
Wrapping Up
Katie: Right on. For those who want to follow your work, where can they find you?
Emily: I’m dialoging and writing my thoughts on AI out loud and in public on LinkedIn. Come join me, and let’s think out loud together.
Katie: Sounds great. And I’m always at searchenginejournal.com. Thank you so much, Emily, for taking the time today.
Emily: Thank you!
For two decades, the arrangement between search engines and publishers was a symbiotic relationship where publishers allowed crawling, and search engines sent referral traffic back. That traffic helped to fund content creation for publishers through ads and subscriptions.
AI features are changing this, and the deal is starting to break down.
AI Overviews, ChatGPT, and answer engines keep users within their platform instead of sending them to source sites. The result is publishers are watching their traffic decline while AI companies crawl more content than ever.
New payment models are emerging to replace the old economics. Some involve usage-based revenue sharing, others are flat licensing deals worth millions, and a few have ended in court settlements. But the terms vary widely, and it’s unclear whether any model can sustain the content ecosystem that AI depends on.
This article examines the payment models taking shape, how different publishers are responding, and what SEO professionals should consider as the industry figures out sustainable economics.
The crawl-to-referral ratio shows how unbalanced this is. Cloudflare’s analysis tracks Google Search at roughly a 10:1 ratio, crawling about 10 pages for every referral sent back. OpenAI’s ratio was estimated at around 1,200:1 to 1,700:1.
Fewer pageviews mean fewer ad impressions, lower subscription conversions, and reduced affiliate revenue.
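A quick back-of-the-envelope calculation shows what those ratios mean in practice. The monthly crawl volume below is an invented placeholder, not a reported figure; only the ratios come from the analysis above.

```python
# Referrals implied by each crawl-to-referral ratio. The 1,000,000
# monthly crawls is a made-up illustration, not reported data.
monthly_crawls = 1_000_000
ratios = {
    "Google Search": 10,
    "OpenAI (low estimate)": 1_200,
    "OpenAI (high estimate)": 1_700,
}

for source, crawls_per_referral in ratios.items():
    referrals = monthly_crawls / crawls_per_referral
    print(f"{source}: ~{referrals:,.0f} referrals per {monthly_crawls:,} crawls")
# Google Search: ~100,000 referrals per 1,000,000 crawls
# OpenAI (low estimate): ~833 referrals per 1,000,000 crawls
# OpenAI (high estimate): ~588 referrals per 1,000,000 crawls
```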
Payment Models Taking Shape
Three payment models are emerging.
1. Usage-Based Revenue Sharing
Perplexity launched its Comet Plus program in 2025. The company shares subscription revenue with publishers after keeping a cut for compute costs, though the exact split isn’t disclosed.
These models tie payment to usage, but the pools stay small compared to traditional search revenue, and scaling depends on converting free users to paid subscribers.
2. Flat Licensing Deals
These arrangements bundle three rights: training data access using archives to improve models, real-time content display with attribution in ChatGPT, and technology access letting publishers use OpenAI tools.
AI companies need both historical archives and current content, but this creates tiers where publishers with vast archives can negotiate deals while smaller publishers lack leverage.
3. Court Settlements
Anthropic settled with authors for $1.5 billion after Judge William Alsup’s June ruling in Bartz v. Anthropic. The ruling held that training on legally purchased books was fair use, while downloading from pirate sites was infringement.
The settlement shows AI companies can afford to pay even while arguing in court they shouldn’t have to, and it provides a public benchmark other negotiations may reference, though specific terms remain sealed.
Publishers accepting deals cite new revenue streams, legal protection from copyright claims, influence over AI development, and a recognition that AI search adoption appears inevitable; many view early partnerships as positioning for future leverage.
Publishers Pursuing Litigation
The New York Times sued OpenAI and Microsoft in 2023. The complaint argues the companies created “a multi-billion-dollar for-profit business built in large part on the unlicensed exploitation of copyrighted works.”
Publishers refusing deals say the money is too low, worry that accepting bad terms now legitimizes those terms going forward, and point out that AI summaries directly compete with their work.
Trade Organization Positions
Danielle Coffey, CEO of News/Media Alliance, said Google’s AI Mode practices are “parasitic, unsustainable and pose a real existential threat.” She suggests that AI systems are only as good as the content used to train them.
Jason Kint of Digital Content Next noted that despite Google sending large monthly revenue checks through advertising, 78% of member digital revenue still comes from ads. Every point of search traffic lost “squeezes the budgets that fund investigative reporting.”
Both organizations demand that AI systems provide transparency, clearly attribute content, respect publishers’ roles, comply with competition laws, and not misrepresent original works.
The Emerging Division: Licensed Web Vs. Open Web
The payment model differences are creating two tiers of web content with different economics.
A “Licensed Web” consists of premium content behind APIs and licensing agreements. Publishers with vast archives, specialized expertise, or unique data sets are negotiating direct access deals with LLM companies. This content gets used for training and real-time retrieval with attribution and compensation.
The “Open Web” includes crawlable pages without licensing agreements: user-generated content, marketing material, commodity information, and sites lacking the leverage to negotiate terms. This content may still get crawled and used, but without direct compensation beyond minimal referral traffic.
This setup can lead to mismatched incentives. Publishers investing in differentiated, high-quality content may have licensing options to support their work. Meanwhile, those creating more easily replaceable information might struggle with commoditization, making it harder to find clear ways to earn revenue.
For practitioners, focus on developing your own research, unique data sets, specialized expertise, and original reporting. This increases both traditional search value and potential licensing value to AI platforms.
How Payment Models Are Reshaping SEO And Content Strategy
The shift from traffic to licensing is forcing changes across SEO.
The Citation Vs. Click Problem
Traditional SEO centered on rankings that drove clicks, but LLM citations work differently: content appears in AI answers with attribution but fewer click-throughs. Lily Ray believes SEO is no longer just about ranking and traffic.
Practitioners are now tracking engagement quality, conversion rates, branded search, and direct traffic alongside traditional metrics. Some are quantifying AI citations across ChatGPT, Perplexity, and other platforms. This provides visibility into brand mentions even when referrals don’t materialize.
Bot Access Becomes A Business Decision
Publishers today find themselves making decisions about blocking AI bots via robots.txt, choices that weren’t even on the table two years ago. The decision weighs AI visibility against potential traffic loss and the benefits of licensing.
Many content publishers are open to allowing bot access, valuing their presence in AI results more than guarding content that competitors also produce. News organizations prioritize speed and broad coverage for breaking stories, aiming to reach as many people as possible.
On the other hand, some publishers choose to restrict access to their high-value research and specialized insights, knowing that scarcity can give them stronger negotiating power. Those with paywalled analysis often block AI crawlers to protect their subscription models, ensuring they maintain control over their most valuable content.
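For illustration, here’s a minimal sketch of what a selective robots.txt policy looks like, verified with Python’s standard-library parser. GPTBot and Google-Extended are publicly documented AI-crawler tokens, but confirm current token names against each platform’s documentation before relying on them; the /research/ path is a placeholder.

```python
import urllib.robotparser

# A selective policy: block AI-training crawlers from premium research
# while leaving the rest of the site crawlable. Paths are placeholders.
robots_txt = """
User-agent: GPTBot
Disallow: /research/

User-agent: Google-Extended
Disallow: /research/

User-agent: *
Allow: /
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(robots_txt.splitlines())
print(parser.can_fetch("GPTBot", "/research/report.html"))  # False - blocked
print(parser.can_fetch("GPTBot", "/blog/post.html"))        # True - still crawlable
```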
ProRata and TollBit offer selective licensing as a middle ground. Publishers maintain AI visibility while getting paid. But AI companies haven’t widely adopted these platforms.
Measurement Systems Under Pressure
Traffic declines may trigger discussions with stakeholders who expect a recovery, and for sites that rely solely on advertising, this can be a difficult conversation to have.
Publishers are exploring alternative revenue models such as subscriptions, memberships, consulting, events, and affiliate partnerships, while also prioritizing email, newsletters, and apps.
Branded search remains more stable than overall traffic levels, emphasizing the importance of brand-building beyond search rankings.
Content Investment Questions
Payment uncertainty can make it hard to decide what content is worth investing in. Publishers with licensing deals might focus on what AI companies need for training or retrieval, while those without deals have to consider different factors.
The division between Licensed Web and Open Web influences these choices. Original research, unique data, and specialized expertise may justify different levels of investment compared to more common material.
Smaller publishers often lack the leverage of licensing. Creating high-quality content while competing with AI-generated summaries that don’t drive traffic raises ongoing questions about sustainability.
Content Sustainability Concerns
Revenue declines are forcing news organizations to cut staff, reducing investigative capacity and the production of original reporting.
The Society of Authors reports that 12,000+ members have written letters saying they “do not consent” to AI training, a signal that creative professionals may reconsider publishing if compensation doesn’t materialize.
More content is moving behind paywalls, which protects revenue but limits free information access. The News/Media Alliance warns that without fair compensation for publisher content, AI practices pose a significant threat to ongoing investment in journalism.
The challenge is circular: AI companies rely on publishers for high-quality training data, but AI systems that don’t generate traffic make it harder for publishers to fund content creation.
Right now, payment models might work well for big publishers who have more power, but mid-sized and small publishers face more uncertain financial situations.
Those with direct relationships to their audience and multiple sources of income are generally in a stronger position compared to those mainly relying on ads.
What’s Likely Next
Current LLM payment models don’t match what publishers earned from search traffic, and they also don’t reflect what AI companies extract through crawling.
Publishers are dividing into distinct camps, with some angling for deals while others are betting litigation will establish better terms than individual negotiations.
Trade organizations are pushing for regulatory solutions, but AI companies maintain their current approach works. OpenAI points to expanding partnerships and says deals provide fair value. Perplexity argues its revenue-sharing model aligns incentives. Google hasn’t announced plans beyond existing traffic-sharing arrangements.
What happens next depends on litigation outcomes, regulatory action, and whether market pressure forces AI platforms to improve terms.
Multiple paths forward remain possible, and for now, publishers face immediate decisions about bot access, content strategy, and revenue diversification without clarity on which approach will prove sustainable.
It’s officially the end of organic search as we know it. A recent survey reveals that 83% of consumers believe AI-powered search tools are more efficient than traditional search engines.
The days of simple search are long gone, and a profound transformation continues to sweep the search engine results pages (SERPs). The rise of AI-powered answer engines, from ChatGPT to Perplexity to Google’s AI Overviews, is rewriting the rules of online visibility.
Instead of returning traditional blue links or images, AI systems are returning immediate results. For marketing leaders, the question is no longer “How do we rank number one?” but rather “How do we become the top answer?”
This shift has eliminated the distance between the search and the solution. Customers no longer need to click through to find the information they’re seeking. And while zero-click searches are more prevalent and old metrics like keyword rankings are fading fast, the change also creates a massive opportunity for chief marketing officers to redefine SEO as a strategic growth function.
Yes, content remains king, but it must be rooted in a foundation that fuels authority, brand trust, and authenticity to serve the systems shaping what appears when a search is conducted. This isn’t just a new channel; it’s a new way of creating, structuring, and validating content.
In this post, we’ll dissect how to redesign content workflows for generative engines to ensure your content reigns supreme in an AI-first era.
What Generative Engines Changed And Why “Traditional SEO” Won’t Recover
When users ask generative search engines a question, they aren’t presented with a list of websites to click through; instead, they’re given a quick, synthesized answer. The source of the answer is cited, allowing users to click to learn more if they so choose. These citations are the new “rankings” and the elements most likely to be clicked.
In fact, research shows 60% of consumers click through at least sometimes after seeing an AI-generated overview in Google Search. A separate study found that 91% of frequent AI users turn to popular large language models (LLMs) such as ChatGPT for their searching needs.
While keyword optimization still holds importance in content marketing, generative engines are favoring expertise, brand authority, and structured data. For CMOs, the old metrics no longer necessarily equate to success. Visibility and impressions are no longer tied to website traffic, and success is now contingent upon citations, mentions, and verifiable authority signals.
The AI era signals a serious identity shift, one in which traditional SEO collides with AI-driven search. SEO can no longer be a mechanical checklist that sits under demand generation. It must integrate with a broader strategy for managing brand knowledge, ensuring that when AI pulls data to form an answer, your content is what it trusts most out of all the available options.
In this new search era, visibility can be measured in three ways:
Appearing in results or answers.
Being seen as a thought leader in your space by being cited or trusted as a credible source.
Driving influence, affinity, or conversions from your digital presence.
Traditional SEO is now only one piece of the content visibility puzzle. Generative SEO demands fluency across all three.
The CMO’s New Dilemma: AI As Both Channel And Competitor
Consumers have questions. Generative engines have the answers. With over half (56%) of consumers trusting generative AI as an education resource, generative engines are now mediators between your brand and your customers. They can influence purchases or sway customers toward your competition, depending on whether your content earns the engine’s trust.
For example, if a customer asks, “What’s the best CRM for enterprise brands?” and an AI engine suggests HubSpot’s content over your brand, the damage isn’t just a lost click but a missed opportunity to garner interest and trust with that motivated searcher. The hard truth is the Gen AI model didn’t see your content as relevant or reliable enough to deliver in its answer.
Generative engines are trained on content that already exists, meaning your competitors’ content, user reviews, forum discussions, and your own material are all fair game. That makes AI both a discovery channel and a competitor for audience attention. CMOs must recognize this duality and invest in structuring, amplifying, and revamping content workflows to match generative AI’s expectations. The goal isn’t to chase algorithms; it’s to shape content in a meaningful way so those algorithms trust your content as a source of truth.
Think of it this way: Traditional SEO practices taught you to optimize content for crawlers. With Generative SEO, you’re optimizing for the model’s memory.
How To Redesign SEO Content Workflows For The Generative Era
To win citations and influence AI-generated answers, it’s time to throw out your old playbooks and overhaul your workflows. That may mean rethinking how you plan content and how you measure performance. Out with the old and in with the new (and more successful).
From Keyword Targeting To Knowledge Modeling
Generative models go beyond understanding just keywords. They understand entities and relationships, too. To show up in coveted AI answers and to be the top choice, your content must reflect structured, interconnected knowledge.
Start by building a brand knowledge graph that maps people, products, and topics that define your expertise. Schema markup is also a must to show how these entities connect. Additionally, every piece of content you produce should reinforce your position within that network.
Long-tail keywords may be easier to target and rank for in traditional SEO; however, optimizing for AI search requires a shift in content workflows, one that targets “entity clusters” instead. Here’s what this might look like in practice: A software company wouldn’t only optimize content around the focus keyword phrase “best CRM integrations.” The writer should also define its relationship to concepts like “CRM,” “workflow automation,” “customer data,” and other related phrases.
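As a sketch of what that entity cluster might look like in markup, here’s a hypothetical JSON-LD object built in Python. The product, organization, and URL are placeholders; the schema.org types and properties are real, but treat the specific modeling choices as one illustration, not a prescription.

```python
import json

# A minimal schema.org sketch linking a hypothetical CRM product to
# related entities (the "entity cluster"). All names and URLs are
# placeholders.
product_page = {
    "@context": "https://schema.org",
    "@type": "SoftwareApplication",
    "name": "ExampleCRM",  # hypothetical product
    "applicationCategory": "BusinessApplication",
    "about": [
        {"@type": "Thing", "name": "CRM"},
        {"@type": "Thing", "name": "workflow automation"},
        {"@type": "Thing", "name": "customer data"},
    ],
    "publisher": {
        "@type": "Organization",
        "name": "Example Software Co.",       # placeholder
        "sameAs": ["https://www.example.com"],  # placeholder
    },
}

# Paste the output into a <script type="application/ld+json"> tag.
print(json.dumps(product_page, indent=2))
```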
From Content Volume To Verifiable Authority
It was once thought that the more content, the better. That’s no longer the case, as AI systems prefer and prioritize content that’s well-sourced, attributable, and authoritative. Content velocity is no longer the end game; producing stronger, more evidence-backed pieces is.
Marketing leaders should create an AI-readiness checklist for their content marketing team to ensure every piece of content is optimized for generative engines. Every article should include author credentials (job title, advanced degrees, and certifications), clear citations showing where the statistics or research came from, and verifiable claims. Reference independent studies and owned research where possible; AI models cross-validate multiple sources to determine what’s credible and reliable.
In short: Don’t publish faster. Publish smarter.
From Static Publishing To Dynamic Feedback
If one thing is certain, it’s that generative engines keep evolving, much like traditional search. What ranks well today may change entirely tomorrow. That’s why successful SEO teams are adopting an agile publishing cycle to stay on top of what’s working best. These teams are actively and consistently:
Testing which questions their audience asks in generative engines.
Tracking whether their content appears in those answers.
Refreshing content based on what’s being cited, summarized, or ignored.
Several tools are emerging to help you track your brand’s presence across ChatGPT, Perplexity, AI Overviews, and more, including SE Ranking, Peec AI, Profound, and Conductor. If you choose to forgo tools, you can run regular AI audits on your own to see how your brand is represented across engines by following the framework above. Treat that data like Search Console metrics: it’s your new visibility report.
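Even without dedicated tools, a simple audit log is enough to start. Here’s a minimal Python sketch of that bookkeeping; the engines, queries, answers, and brand name are all invented placeholders you’d replace with your own audit notes.

```python
from dataclasses import dataclass

@dataclass
class AuditRow:
    engine: str   # e.g., "ChatGPT", "Perplexity"
    query: str    # the question you asked
    answer: str   # the response text you captured
    brand: str    # the brand or domain you're checking for

    def brand_mentioned(self) -> bool:
        # Crude substring check; real audits would also log citations/links.
        return self.brand.lower() in self.answer.lower()

audit = [
    AuditRow("ChatGPT", "best CRM for enterprise brands",
             "Popular options include ExampleCRM and others...", "ExampleCRM"),
    AuditRow("Perplexity", "best CRM for enterprise brands",
             "Top picks: CompetitorCRM.", "ExampleCRM"),
]

for row in audit:
    status = "mentioned" if row.brand_mentioned() else "absent"
    print(f"{row.engine}: {status} for '{row.query}'")
```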
How To Measure SEO Success In An Answer-Driven World
Measuring SEO success across generative engines looks different from measuring traditional SEO. Traffic will always matter, but it’s no longer the sole proof of impact. For CMOs, understanding how to measure marketing’s impact is essential to demonstrate the value your team delivers to the organization’s mission.
Here’s how progressive CMOs are redefining SEO success:
AI Citations: How often your content is referenced within AI-generated responses.
Answer Visibility Share: The percentage of relevant queries where your content appears in an AI answer.
Zero-Click Exposure: Instances where your brand is visible in AI responses, even if users don’t visit your site.
Answer Referral Traffic: Visits that originate directly from AI-generated links, the new “clicks.”
Semantic Coverage: The breadth of related entities and subtopics your brand consistently appears for.
These metrics move SEO reporting from vanity numbers to visibility intelligence and are a more accurate representation of brand authority in the machine age.
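Two of these metrics fall straight out of an audit log like the one sketched above. Here’s a small worked example over invented placeholder data.

```python
# Compute a raw AI-citation count and Answer Visibility Share from
# audit records of (query, brand_appeared) pairs. Data is illustrative.
records = [
    ("best CRM for enterprise brands", True),
    ("CRM with workflow automation", False),
    ("how to unify customer data", True),
    ("top CRM integrations", False),
]

citations = sum(appeared for _, appeared in records)
visibility_share = citations / len(records)

print(f"AI citations: {citations}")                        # 2
print(f"Answer visibility share: {visibility_share:.0%}")  # 50%
```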
Future-Proof Your SEO For Generative Search
Generative search is just as volatile as traditional search, but volatility is fertile ground for innovation. Instead of resisting it, CMOs should treat SEO as an experimental function: a sandbox for continuously testing new ways to be discovered and trusted. SEO isn’t a set-it-and-forget-it discipline; it must change with time and testing.
CMOs should encourage their teams to A/B test content formats, schema implementations, and even phrasing to see what appears in AI-generated responses. Cross-pollinate SEO insights with PR, product, and customer experience. When your organization learns how AI represents your brand, it becomes a feedback loop that strengthens everything from messaging to market positioning.
In the near future, the term “organic search” will become something broader to encompass the fast-growing ecosystem of machine-mediated discovery. The brands that succeed won’t just optimize for keywords. They’ll build long-lasting trust.
The Next Evolution Of Search
The notion that AI is killing SEO is false. AI isn’t eliminating SEO but rather redefining what it means today. What used to be a tactical discipline is shifting to become a more strategic approach that requires understanding how your brand exists within digital knowledge systems. It’s straying from what’s comfortable and moving into largely uncharted territory.
The opportunity for marketing leaders is clear: It’s time to move past the known and venture into the somewhat elusive realm of generative answer engines. After all, Forrester predicts AI-powered search will drive 20% of all organic traffic by the end of 2025. At the end of the day, many of the traditional SEO best practices still apply: create content that’s verifiable, well-structured, and context-rich. The main mindset shift lies in how to measure generative engine success, not by rankings but by relevance in conversation.
In the age of AI answers, your brand doesn’t need to just be searchable; it needs to be knowable.