Jun 4 2026

How Can You Implement Entity Optimization Without Relying On Schema Markup? – Ask An SEO via @sejournal, @HelenPollitt1

For this week’s Ask An SEO, the question asked was:

“From a technical standpoint, what does ‘entity optimization’ actually mean beyond adding schema?”

To answer this, let’s first establish that, in information retrieval, an entity is a uniquely identifiable “thing” that exists independently of the words used to describe it. Entity optimization is about building connections and relationships between those concepts and “things” using an ecosystem called the Knowledge Graph, and ensuring your digital footprint aligns with your brand and products in a way that removes any ambiguity for search engines or large language models.

This is increasingly important now that LLMs are trying to build a picture of your company. Remember, LLMs likely rely on language modeling and relationships between concepts to generate responses. By strengthening your brand’s entities, you are increasing the likelihood of those responses being about your related products or services.

The Goal Of Entity Optimization

Entity optimization is critical for improving online discovery in our modern search world. To make sure you are optimizing your brands for bots and algorithms to understand them, you need to keep in mind the goals of entity optimization.

To Create A Stable, Unambiguous Identity

The primary goal of entity optimization is to create certainty around what an entity is and how it relates to other entities. It is important that when a website refers to a brand, and so does an online directory, the bots can tell they are definitely the same entity. This means keeping references to brands and their products consistent across the internet.

To Strengthen Machine-Readable Identity Across The Internet

Whereas it might be easy for humans to infer that a website referencing a brand but spelling the name wrong doesn’t refer to a completely different entity, that isn’t necessarily the case for search engines. Equally, a company that updates its address on its own website after a move, but leaves the old address on its suppliers’ websites, might give the bots enough reason to treat it as two different businesses, or as one business with two concurrent offices.

The goal of entity optimization is to make it easy for the bots to determine a brand’s identity online.

So When Someone Searches For Your Brand, They Receive Info About Your Whole Brand, Or What Is Linked To It

Entity optimization raises the likelihood that searchers of your brand will be given a full and complete picture of your organization. It will help the bots know that an “iPad” is a product of “Apple.” So when searchers look for [Apple products], the search engines and LLMs are confident that an iPad fits that criterion.

It Creates A Graph Of Data Points Where Everything Connects To Each Other And To The Main Brand

Essentially, entity optimization at its heart is about creating a graph of data points that are related to each other. Each of the “nodes” is a facet of your brand, i.e. the owners, the products, the office address, the blog authors. Entity optimization is about creating a spider web of all of the critical entities that are involved in your brand, connecting them to each other and to the main node that is your brand entity.

Why Schema Is Often The First Thought When Looking At Technical Entity Optimization Improvements

Given what we’ve identified entity optimization to be about, it’s easy to see why schema is often the first thought. Schema.org structured data markup is essentially a machine-readable labeling system. It gives context and meaning to words and media on a page. It shows what type of entity they refer to and can show what other entities they are linked to.

It’s also easy to implement. There are many guides online, or even CMS plug-ins that help with applying schema markup to websites and tests to show if it has been implemented properly.

On the face of it, it is a critical way of strengthening an entity’s identity. However, it might not be pulling the weight you think it is. Google doesn’t simply trust schema declarations; it cross-verifies them against off-site signals.

Schema Should Be Used To Reinforce Identity And Relationships

Schema markup is very helpful in entity optimization, but the key to utilizing it effectively for this is in the relationships it portrays.

Many schema types allow you to show how an entity is related to another. For example, “Thing” schema has a property called “sameAs” which is the “URL of a reference Web page that unambiguously indicates the item’s identity, e.g., the URL of the item’s Wikipedia page, Wikidata entry, or official website.” Say you are trying to communicate that the author “Jessica Smith” on your blog post and the “Jessica Smith” of a LinkedIn profile are, in fact, the same person. You could use the sameAs property to link to Jessica Smith’s LinkedIn profile. You could also do this to reference her biography page on another website, or a write-up in a journal.
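As a rough illustration of the sameAs idea, here is a minimal Python sketch that builds a schema.org Person object as JSON-LD. The helper name `person_jsonld` and both URLs are hypothetical placeholders, not taken from any real profile.

```python
import json

def person_jsonld(name, page_url, same_as):
    """Build a schema.org Person object whose sameAs links point at other profiles."""
    return {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "url": page_url,
        "sameAs": list(same_as),
    }

author = person_jsonld(
    "Jessica Smith",
    "https://www.example.com/authors/jessica-smith",        # hypothetical author page
    ["https://www.linkedin.com/in/jessica-smith-example"],  # hypothetical profile URL
)
markup = json.dumps(author, indent=2)
print(markup)
```

The printed JSON would normally sit inside a script element of type application/ld+json on the author page.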

For entity optimization, the key is to use schema to mirror a real knowledge graph. The purpose of it is to show connections and alternatives between entities. Yes, schema is helpful for generating rich results, but for entities, the purpose needs to be demonstrating relationships.

What You Can Do To Optimize Your Entities

So, if schema isn’t the be-all of entity optimization, what else can you do to help? Modern search systems attempt to reconcile multiple references into a single canonical entity. Whereas a lot of entity optimization happens external to your website, like through social profiles and third-party mentions, we’re going to look at what you can do specifically through the technical implementation of your site.

Technical Identifiers

Consistency is key both on- and off-site. From a tech standpoint, this can mean consistency in how objects or elements are referred to in the code, especially if they relate to the same entity. For example, always referring to the things your ecommerce site sells as “products” – whether in content or code – reinforces that they are, in fact, products and are referencing the same entity.

Similarly, using identifying codes like SKUs, ISBNs, and GTINs can help the search engines see the uniqueness of your different products. Using them consistently across your site, wherever your product is referenced, can aid in disambiguating products. Codes like these are unique to each product, whereas the words you use to describe them might not be.
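One way to audit identifier consistency is to check that each SKU is always paired with the same GTIN wherever it appears. The sketch below assumes you have already extracted product references into simple records. The field names, SKUs, and GTIN values are made-up placeholders.

```python
from collections import defaultdict

def find_identifier_conflicts(references):
    """Return SKUs that appear with more than one GTIN across the given references."""
    gtins_by_sku = defaultdict(set)
    for ref in references:
        gtins_by_sku[ref["sku"]].add(ref["gtin"])
    return {sku: sorted(gtins) for sku, gtins in gtins_by_sku.items() if len(gtins) > 1}

# Placeholder records; in practice these would come from a crawl export.
references = [
    {"source": "product page", "sku": "TS-001", "gtin": "00012345678905"},
    {"source": "category page", "sku": "TS-001", "gtin": "00012345678905"},
    {"source": "blog post", "sku": "TS-002", "gtin": "00012345678912"},
    {"source": "product page", "sku": "TS-002", "gtin": "00012345678929"},
]
conflicts = find_identifier_conflicts(references)
print(conflicts)
```

Any SKU that shows up in the result is being described with conflicting codes somewhere on the site.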

Co-Occurrence Patterns

Search engines are trying to identify if multiple references across the web are referring to the same entity. They compare signals such as names, addresses, social profiles, topical context, and co-occurrence patterns to verify identity.

In entity optimization, co-occurrence refers to the repeated appearance of two or more entities within the same contextual environment. Modern search engines and LLMs are understood to rely on “embeddings” to understand relationships between entities. Instead of looking only for exact keyword matches, they map concepts into “vector spaces” where related topics naturally sit closely together. If a website consistently discusses phones alongside memory, camera specs, and battery life, the search engines will begin to link those entities semantically.

Consider how often related entities occur near each other in your content or site architecture. Place entities in contexts that clearly describe their relationship. For example, “Samsung is a well-respected phone manufacturer, and its newest model, the Samsung Galaxy S26 series, is the flagship line-up for 2026.”

Also, place highly related entity names within the same pages, tables, and lists. Showing that two subjects are often referred to together denotes a relationship between them. For example, if you always refer to two of your products together, the yellow version and the blue version, it can help cement the idea that they are the same product just in different colorways.
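To get a feel for co-occurrence in your own content, you could count how many pages mention each pair of entities together. This is a deliberately simplified stand-in: it uses plain substring matching and raw counts, not embeddings, and the page texts are invented for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_counts(pages, entities):
    """Count how many pages mention each pair of entities together."""
    counts = Counter()
    for text in pages:
        lowered = text.lower()
        # Sort so each pair is always stored in the same order.
        present = sorted(e for e in entities if e.lower() in lowered)
        counts.update(combinations(present, 2))
    return counts

pages = [
    "The phone has a long battery life and excellent camera specs.",
    "Camera specs matter, but so does memory on a phone.",
    "Our returns policy covers every order.",
]
entities = ["phone", "memory", "camera specs", "battery life"]
counts = cooccurrence_counts(pages, entities)
for pair, n in counts.most_common():
    print(pair, n)
```

Pairs with high counts are the relationships your content is reinforcing; pairs you expected but do not see may point to gaps.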

Although it may sound a bit like SEO 101, the hierarchy of page content can be useful for the bots to understand relationships between concepts. The heading hierarchy signals relationships between entities and topics. For example, something being referenced in the H1 could indicate that it is key to the other entities that are nested under the H2 or H3.
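To review the heading hierarchy of a page, a small script can print its headings as an indented outline. This sketch uses Python's standard-library HTML parser; the sample markup is invented, and real pages with nested elements inside headings may need more careful handling.

```python
from html.parser import HTMLParser

class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for h1-h6 elements in document order."""

    def __init__(self):
        super().__init__()
        self.outline = []
        self._level = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.outline.append((self._level, "".join(self._buffer).strip()))
            self._level = None

html = "<h1>Phones</h1><h2>Samsung Galaxy</h2><h3>Battery life</h3><h2>Accessories</h2>"
parser = HeadingOutline()
parser.feed(html)
for level, text in parser.outline:
    print("  " * (level - 1) + text)
```

Reading the outline on its own is a quick check that the main entity sits at the top and related entities are nested beneath it.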

Entity-First Website Architecture

When looking at optimizing your website to aid with entity recognition, look at the foundation of how the site is structured.

Taxonomy

The structural hierarchy of a website can really reinforce the relationships between entities. Introducing a taxonomy to your site – essentially classifying elements like products or blog posts in their relationship to each other – can be a good method of showing the discrete entities in your site and their topical hierarchy.

A taxonomy replaces keyword targeting with a more comprehensive topical framework. By systematically linking subtopics back to a main category, you signal to search engines that your website possesses deep, structured expertise on a specific entity.

Internal Linking

Having a taxonomy in place makes internal linking a lot easier to plan. It enables you to reinforce relationships between entities by linking their pages together.

For example, a clothing store may have a taxonomy that lists all T-shirts under “clothing” and all dresses under “clothing.” By linking the “Clothing” page to both the “T-shirts” and “Dresses” pages, you are showing that they are both children of “Clothing” and siblings of each other.
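A taxonomy can be turned into an internal linking plan fairly mechanically. The sketch below assumes a simple parent-to-children mapping with hypothetical URL paths. Linking siblings directly to each other is an extra assumption here, added on top of the parent-to-child links described above.

```python
def taxonomy_links(taxonomy):
    """Return (source, target) internal links implied by a parent -> children mapping."""
    links = set()
    for parent, children in taxonomy.items():
        for child in children:
            links.add((parent, child))   # hub page links down to each child
            links.add((child, parent))   # child links back up to the hub
            for sibling in children:
                if sibling != child:
                    links.add((child, sibling))  # assumption: siblings cross-link
    return sorted(links)

# Hypothetical paths for the clothing store example.
taxonomy = {"/clothing/": ["/clothing/t-shirts/", "/clothing/dresses/"]}
links = taxonomy_links(taxonomy)
for source, target in links:
    print(source, "->", target)
```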

Breadcrumbs

This structural hierarchy can also be reinforced through breadcrumb links and schema. This clearly defines the hierarchy between parent and children entities.
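As a sketch of what breadcrumb schema can look like, the following builds a schema.org BreadcrumbList from an ordered trail. The helper name and the URLs are hypothetical.

```python
import json

def breadcrumb_jsonld(trail):
    """Build a schema.org BreadcrumbList from an ordered list of (name, url) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": [
            {"@type": "ListItem", "position": i, "name": name, "item": url}
            for i, (name, url) in enumerate(trail, start=1)
        ],
    }

trail = [
    ("Home", "https://www.example.com/"),
    ("Clothing", "https://www.example.com/clothing/"),
    ("T-shirts", "https://www.example.com/clothing/t-shirts/"),
]
breadcrumbs = breadcrumb_jsonld(trail)
print(json.dumps(breadcrumbs, indent=2))
```

The position values should follow the same parent-to-child order as the visible breadcrumb links on the page.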

Feeds

Feeds, such as a Google Merchant Feed, structure information about your products in an easily machine-readable format. As long as the information for a product contained in the feed matches the information for that product on the site, and elsewhere, it is another way of encouraging certainty. The feeds are a known quantity. They are structured data that the search engines are primed to receive and digest. As with everything SEO, entity optimization is about small, impactful signals. For ecommerce sites, a feed containing your product data is another nudge towards understanding the products as entities.
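A simple way to check that a feed and the site agree is to compare them field by field. This sketch assumes both the feed item and the on-page data have already been parsed into dictionaries. The field list and all values are illustrative.

```python
def feed_mismatches(feed_item, page_item, fields=("title", "price", "gtin", "availability")):
    """Return {field: (feed_value, page_value)} for every field that disagrees."""
    return {
        field: (feed_item.get(field), page_item.get(field))
        for field in fields
        if feed_item.get(field) != page_item.get(field)
    }

# Made-up records for one product.
feed_item = {"title": "Blue T-shirt", "price": "19.99 USD", "gtin": "00012345678905", "availability": "in_stock"}
page_item = {"title": "Blue T-shirt", "price": "24.99 USD", "gtin": "00012345678905", "availability": "in_stock"}

mismatches = feed_mismatches(feed_item, page_item)
print(mismatches)
```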

Entity Homes

Search engines often look for a canonical “entity home” that acts as the primary source of truth for a specific entity. A final way to technically aid in entity optimization is by creating these entity homes for your core entities on your site. Set up pages full of information about these entities, whether they are products, brands, or people. This shows they are important in their own right and enables the structural linking we mentioned earlier.

For example, author pages can help to demonstrate that each writer for the blog is a real, separate person. Linking up their articles to their author page can show the relationship between them and their works. Equally, linking their page to items on the site that speak to their area of expertise can help reinforce their specialisms – think linking your sports writer to the sports section of the site. Similarly, linking them to their external profile on social media or other journalism sites can show what else they have written.

Make It Easy To Crawl And Render

Remember, the key to ensuring your technical SEO aids entity optimization lies in whether the bots can access the content you need them to find.

This includes choosing Server-Side Rendering (SSR) for your important content so that LLM crawlers can easily access it. Although Googlebot can render JavaScript, it is good practice to keep critical content available in the initial HTML rather than relying on JavaScript to load it.

Keeping content easily accessible also tends to go hand in hand with faster server response times, which helps AI search bots fully crawl and map your entire entity ecosystem.

Where Do You Start With Entity Optimization?

Entity optimization isn’t a single tactic. It’s an ongoing process of making it easier for search engines and LLMs to understand who you are, what you offer, and how everything connects.

Schema is a good starting point, but it’s only one piece. The real work is in consistency across your code, your content, your site architecture, and your presence across the web. The more coherent and verifiable your entity signals are, the less ambiguity the bots have to deal with.

Start with the basics: consistent identifiers, clear taxonomy, entity homes for your core pages. Then look at how your content places related entities in context with each other. Small, deliberate signals add up.


Featured Image: Paulo Bobita/Search Engine Journal


May 21 2026

How To Stress-Test A Staging Environment To Surface Risks Pre-Launch – Ask An SEO via @sejournal, @HelenPollitt1

This week’s Ask An SEO question:

“How do you stress-test a staging environment to surface SEO risks before a large-scale launch?”

It is one of the most important questions to answer when considering rolling out new websites, migrations, or significant changes to your live site.

First, let’s look at the difference between a “staging” site and the “production” site.

The staging site is often also called the “development” site, “pre-production,” or another name that is specific to your company. It is a test site that is meant to mirror your live site as much as possible to help developers test changes in a safe, private environment before launching them.

The “production” site is your live site. It’s the one that is accessible to the general public and should be operating as close to perfectly as possible.

There are some instances where developers might deploy straight to the production site without testing on a staging site first. This can happen when there is no testing site to use, or when there is no way of mimicking the conditions to test without deploying the change to the live site. This is risky. If a deployment breaks something else in the code, it could critically affect the usability of the live site.

How To Stress-Test The Staging Environment

As SEOs, it is very important that we test deployments that could potentially impact SEO performance before they launch. Oftentimes, we find ourselves discovering deployments after they have already started to affect traffic and rankings. This is less than ideal, as it can take a while for Googlebot to pick up changes once a bad deployment has been fixed. It is far better to test how Googlebot might process changes before it is able to do so.

Mirror The Production Site As Closely As Possible

The most important aspect of the staging site is that it is as close to the production environment as possible. This is critical because it enables any testing that you do to reveal the same outcome as if you had run the test on the production environment.

Any deviations between the two environments need to be cataloged. These discrepancies need to be communicated so that testers know to pay special attention to the areas of the production site that differ from staging. Once the deployment goes live, testers can quickly ensure these areas of the production site are behaving as expected.
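Cataloging deviations can be as simple as diffing two sets of environment settings. In this sketch the setting names and values are hypothetical; substitute whatever your team actually tracks.

```python
MISSING = "<missing>"

def catalog_deviations(staging, production):
    """List (setting, staging_value, production_value) for settings that differ."""
    keys = sorted(set(staging) | set(production))
    return [
        (key, staging.get(key, MISSING), production.get(key, MISSING))
        for key in keys
        if staging.get(key, MISSING) != production.get(key, MISSING)
    ]

# Hypothetical environment settings.
staging = {"robots_meta": "noindex", "languages": ["en"], "cdn": False}
production = {"robots_meta": "index", "languages": ["en", "fr"], "cdn": True, "waf": True}

deviations = catalog_deviations(staging, production)
for setting, staging_value, production_value in deviations:
    print(f"{setting}: staging={staging_value!r} production={production_value!r}")
```

The resulting list doubles as the post-deployment checklist of areas that could not be fully tested on staging.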

Crawl The Site At Scale With Multiple User-Agents

One area that is often overlooked when stress-testing the staging environment is using several different user agents when crawling the site.

By using different agents, for example, mimicking Googlebot Smartphone and Googlebot Desktop, you are more likely to pick up technical issues with the site that aren’t obvious on first crawl. For instance, crawling as both desktop Googlebot and mobile Googlebot could show issues with rendering that are only occurring on mobile devices.

Make sure to crawl the site with user agents that are important for your specific industry. If you are targeting Google News as a channel, make sure to crawl the site as Googlebot-News. If images or videos are important to your SEO, crawl as Googlebot-Image and Googlebot-Video.

To put your staging site through its paces, make sure to crawl it with a mobile user agent, a desktop user agent, and spoof two search engine bots, e.g., Google and Bing. This way you are getting good coverage of the experiences of different, important bots. If possible, try to crawl as an LLM bot also.
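If your crawling tool does not offer the user agents you need, a small script can send the same request under several of them. This is a minimal sketch using Python's standard library. The staging URL is hypothetical, and the user-agent strings are illustrative and should be checked against each vendor's current documentation. The Chrome version in the smartphone string is left as the placeholder W.X.Y.Z.

```python
import urllib.error
import urllib.request

# Illustrative user-agent strings; verify against current vendor documentation.
USER_AGENTS = {
    "googlebot_desktop": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
    "googlebot_smartphone": (
        "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 "
        "(compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
    ),
    "bingbot": "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)",
}

def build_requests(url):
    """Prepare one request per user agent without sending anything yet."""
    return {
        name: urllib.request.Request(url, headers={"User-Agent": agent})
        for name, agent in USER_AGENTS.items()
    }

def fetch_status(request, timeout=10):
    """Send the request and return the HTTP status code, including error codes."""
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code

requests_by_agent = build_requests("https://staging.example.com/")  # hypothetical URL
for name, request in requests_by_agent.items():
    print(name, "->", request.get_header("User-agent"))
```

Calling `fetch_status` on each prepared request and comparing the results is a quick way to spot pages that respond differently per bot. Bear in mind that a spoofed user agent does not reproduce a bot's IP address, so sites that verify bots by IP may still behave differently.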

Check The Rendering

A good starting point when testing a staging environment before a large-scale deployment is rendering. Modern websites will often use a lot of JavaScript, which, while not inherently bad, can pose issues for some search bots in processing. For more information on how search bots process JavaScript, see this guide.

Set your crawling tool to include JavaScript rendering, and see what elements it can pick up. For example, can you see the header tags, meta title, schema markup? Then crawl the site again without JavaScript rendering enabled. Make sure those same elements are still available to the bots.

If in doubt, carry out some spot-checks on pages on the staging site. Inspect the Document Object Model (DOM) to see if the critical code elements are visible on first load of the page.
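One way to spot-check this is to compare the raw HTML response with the rendered DOM and list the elements that only exist after JavaScript has run. The sketch below does not render anything itself; it assumes you have saved both versions of a page from your crawler or browser. The sample markup is invented.

```python
from html.parser import HTMLParser

class SeoElements(HTMLParser):
    """Record whether a title, an h1, and JSON-LD markup appear in the HTML."""

    def __init__(self):
        super().__init__()
        self.found = {"title": False, "h1": False, "json_ld": False}

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "h1"):
            self.found[tag] = True
        elif tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self.found["json_ld"] = True

def elements_in(html):
    parser = SeoElements()
    parser.feed(html)
    return parser.found

def missing_without_js(raw_html, rendered_html):
    """Return elements present in the rendered DOM but absent from the raw HTML."""
    raw, rendered = elements_in(raw_html), elements_in(rendered_html)
    return sorted(name for name in rendered if rendered[name] and not raw[name])

raw_html = "<html><head><title>Shop</title></head><body><div id='app'></div></body></html>"
rendered_html = (
    "<html><head><title>Shop</title>"
    "<script type='application/ld+json'>{}</script></head>"
    "<body><div id='app'><h1>T-shirts</h1></div></body></html>"
)
js_only = missing_without_js(raw_html, rendered_html)
print(js_only)
```

Anything in the result depends on JavaScript to appear, so it deserves a closer look before launch.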

It is important that what you are seeing on the page is what the search bots are able to parse and render.

Test SEO Elements In Bulk And Across Page Types

Carrying out tests in bulk is important when testing a site before a large launch. When carrying out your tests, make sure they are across different page types and, if applicable, across languages.

If your site uses templates, make sure to test each of the templates that are critical to your SEO success. For example, on an ecommerce site, this means checking the category and product pages as a high priority.

For multilingual sites, ensure your tests are being run across different languages, and set a VPN to target the countries those languages are important for. Spoof those countries when running your crawls to make sure users will be seeing the correct language and content for their region. Although Googlebot frequently crawls from U.S.-based IP addresses, it also uses geo-distributed configurations, particularly for locale-adaptive or multilingual sites.

On your staging site, you may find that not all of the languages are represented, or perhaps there is a different localization process than what exists on production. This brings us back to the first point of needing the staging site to be as comparable to the production site as possible.

If it isn’t, in particular for localization elements, these need to be at the top of your post-deployment checks.

Benchmark Current Production Performance

It is worth remembering that your staging site may well be on a less performant server. This means that when conducting speed tests on staging, the results might be worse than if the tests were run on production. This can limit your ability to run meaningful checks before deployment.

To work around this, make sure to benchmark performance on production so that you can run the tests again quickly after deployment. This will mean waiting until the changes have gone live, but may be the only way to get an accurate understanding of areas like page load speed in situations where the staging server just isn’t as good as the production one.
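A benchmark is only useful if you can compare against it quickly after launch. The sketch below compares median response times per page before and after deployment. The 10% tolerance and all timings are made-up examples, not recommended thresholds.

```python
from statistics import median

def regressions(baseline_ms, post_deploy_ms, tolerance=0.10):
    """Return pages whose median response time grew by more than `tolerance`."""
    flagged = {}
    for page, before in baseline_ms.items():
        after = post_deploy_ms.get(page)
        if not after:
            continue  # page was not re-tested after deployment
        before_median, after_median = median(before), median(after)
        if after_median > before_median * (1 + tolerance):
            flagged[page] = (before_median, after_median)
    return flagged

# Made-up timings in milliseconds, three runs per page.
baseline_ms = {"/": [420, 450, 430], "/category/": [800, 780, 820]}
post_deploy_ms = {"/": [440, 445, 450], "/category/": [1000, 990, 1100]}

flagged = regressions(baseline_ms, post_deploy_ms)
print(flagged)
```

Using the median of several runs reduces the effect of a single slow response on the comparison.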

Test For Edge Cases

Developers will try to break their code when testing it; we should too. When testing your staging site before deployment, run it through some edge cases. In practice, this means thinking of scenarios that, although unlikely, are possible. For example:

I am visiting the website from the U.S., but my language is set to French. What language are the meta tags in?
I am viewing the website on a mobile device but have the viewport set to desktop. What content am I able to access that I couldn’t on mobile otherwise?
If I turn JavaScript off, can I still use the menu drop-downs?

Test For Previously Known Issues

Make sure previous issues haven’t been reintroduced into the code during the most recent work. Even if the mass deployment is for a small area, such as a new meta title template being rolled out, that’s not to say issues aren’t being reintroduced elsewhere.

Don’t test only for the item being changed, but check across critical SEO areas. In particular, if work has been done recently to improve pages on the site, check those will still be in place with this latest deployment.

Equally, if there are known bugs that have affected your SEO performance in the past, check for these even if the deployment isn’t related to them. It’s easy for bugs to sneak back into code, especially if they have been there before.
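Checks for previously known issues are easy to forget, so it can help to keep them as a small, repeatable script. In this sketch the two issues (a stray noindex tag and a missing canonical tag) are hypothetical examples of past bugs. The regular expressions are simplified and assume the name attribute comes before content.

```python
import re

# Simplified patterns; real markup may order attributes differently.
NOINDEX = re.compile(r'<meta[^>]*name=["\']robots["\'][^>]*content=["\'][^"\']*noindex', re.I)
CANONICAL = re.compile(r'<link[^>]*rel=["\']canonical["\']', re.I)

# Each check returns True when the known issue is present in the HTML.
KNOWN_ISSUES = {
    "noindex left on page": lambda html: bool(NOINDEX.search(html)),
    "canonical tag missing": lambda html: not CANONICAL.search(html),
}

def recurring_issues(pages):
    """Map each URL to the known issues that have reappeared in its HTML."""
    report = {}
    for url, html in pages.items():
        hits = [name for name, is_present in KNOWN_ISSUES.items() if is_present(html)]
        if hits:
            report[url] = hits
    return report

# Invented sample pages from a staging crawl.
pages = {
    "/a": '<head><meta name="robots" content="noindex, follow"><link rel="canonical" href="/a"></head>',
    "/b": "<head><title>B</title></head>",
    "/c": '<head><link rel="canonical" href="/c"></head>',
}
report = recurring_issues(pages)
print(report)
```

Adding a new entry to `KNOWN_ISSUES` each time a bug is fixed builds up a regression suite over time.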


May 12 2026

3 Actionable Ways Affiliate Managers & SEOs Can Keep Relevant – Ask An SEO via @sejournal, @rollerblader

This week’s Ask an SEO question is a bit different. The person wants to know how they can keep relevant and feel secure in their jobs with AI replacing people. In addition to this question, someone at the Digital Marketing EU conference in Lisbon asked, “How do we pay affiliates so we can use them for AIO/GEO?” during my presentation.

If you’re worried about losing your job because of AI, or feel like your role is becoming less relevant, know that there’s always a chance AI will replace you. That becomes less likely if you make yourself valuable and use AI as a booster rather than something that can replace you. This post is specifically for the cross-overs between affiliate marketing and SEO, with some touchpoints for PR and other channels.

First, here’s a quick definition of both to define roles specifically for this post:

Affiliate marketing is a combination of content creators (influencers), bloggers, media company listicles, detailed guides and blog posts, as well as coupon and cashback sites. Each of these can be used as a source and reference by large language models to retrieve information, determine sentiment, know what the brand does, and generate an output.
SEO is finding ways to gain visibility in search engines and now in LLM outputs/results. This is done through the context around a brand mention, websites, and creators, including links that can be followed or crawled (even if they are affiliate links or marked as sponsored) to discover new pages, and to help ensure the algorithm knows what the company does or sells, and for which types of audiences.

1. Be Aligned With Presenting The Brand Benefits

We’re already seeing customized outputs in LLMs and AI Overviews (to an extent) based on what the system knows about the person. It’s one of the reasons why a manager, an executive, and an executive assistant can each see different results for the same question about a company. Eventually, we can expect the products that surface in a search result or output to be personalized in a similar way.

They could be based on:

Estimated income level.
AI-known gender (especially in retail) based on shopping habits and engagements.
Location.
Language level, accent, slang, and tone used for the question.
The interactions in past sessions.

Having an aligned strategy and knowing the important talking points about the brand is a way to ensure that the LLM knows how to feature you when your service or product is relevant to a user. If you don’t make it clear and concise who your audience and customer base are, there is a strong chance the LLM will ignore you and share a big brand or a competitor that does.

For example, the user could ask which T-shirt is best for them without including price, designer, activity, age, gender, etc. The LLM or AIO will then look at the data it has on the customer and determine which brands match based on the information it has. In SEO, you would get a generic result. With machine learning and Retrieval Augmented Generation (RAG), the system is going to generate a factual and more relevant answer based on what it has from external sources combined with what it knows about the person.

SEOs can say, “Here’s what we’re not showing up for,” and list multiple selling points that are relevant for a user. The affiliate manager can then take this and ask affiliates doing reviews, creating videos, and building listicles to incorporate more of these selling points into the content to help the systems learn who the product or service is for, to build the knowledge base.

The above will likely help the SEO as there are now more references to these attributes, and the affiliate manager may benefit because the content is more relevant to a reader. If that reader is in the audience demographic, they now know the product or service is right for them, and there’s a better reason to click through and shop. In turn, the affiliate manager can have the SEO combine the talking points affiliates are using to sell into the website and app experience, which carries a more consistent flow from click to page and should help increase conversions.

2. Update Payment Models

The old affiliate model of paying a percentage or flat fee when a sale is made is outdated and has been for more than 10 years. That model doesn’t account for the lifetime value of a customer, touchpoint attribution, or other conversions. Examples include email sign-ups that turn into sales after the cookie life has expired, so no commission is paid, and social media follows that don’t track back to the affiliate when they convert.

More importantly, media buyers, link builders, AIO/GEO specialists, and PR people are now buying space on websites for their channels, but not engaging in a way that works across multiple channels. They focus on their own channels, so suddenly there are keyword-rich backlinks with unnatural language instead of natural ones, or branding statements and talking points instead of actual context about the feature and link.

This is where SEO and affiliate can combine forces to set the brand up for long-term success:

SEO can identify the affiliates that are getting sourced regularly in the LLMs and AI Overviews and track the list.
The affiliate manager can then reignite the relationship with the partner, and focus more heavily on them.
- Sometimes the partners are dormant or don’t drive revenue, so the managers don’t pay attention.
SEO & affiliate define a strategy that includes a media fee for a guaranteed placement with natural language on sourced pages and for advertorials or inclusions in topically relevant future content.
- This is pay-to-play and likely will be something that harms you in the future, but for now, it seems to be working well.
- If done organically and through actual, normal, and unbiased coverage (even with a payment), this could be legit and not harm you. It will depend if there is honesty in sharing negatives, ways to improve, and full editorial discretion.
Monitor and track progress.

The goal is to pay the affiliate for their work while ensuring that they continue to feature you as the LLMs are trusting them as a source of information for your industry or niche. This can include Facebook groups, social media influencers, blogs, associations, white papers, and studies, etc.

Updated payments and multiple options can generate more people signing up for your affiliate program and more active promotions. In some of the affiliate manager groups I participate in, one of the biggest questions we have is how to convince our companies or clients to update payment structures for modern times. This is the opportunity.

3. Cross-Recruit Link Building And Affiliate

Affiliate links are not backlinks; they are normally 307 redirects, have parameters on them, or have tracking set via JavaScript upon the exiting of a site. Search engines know what affiliate links are and will not count them as a trusted source like a solid natural backlink. They can follow them, so it is a safe bet that LLMs will follow suit, identify what is a natural mention vs. a paid placement, and weigh the website, mention, context, and value differently.
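When auditing a link profile, it can help to flag links that carry these affiliate traits. The sketch below is a crude heuristic, not how any search engine classifies links. The tracking parameter names are hypothetical, and a 307 status alone does not prove a link is an affiliate link.

```python
from urllib.parse import parse_qs, urlparse

# Hypothetical tracking parameter names; use the ones your affiliate network sets.
TRACKING_PARAMS = {"aff_id", "affiliate", "clickid"}

def affiliate_signals(href, rel="", redirect_status=None):
    """Return the list of affiliate-style traits found on a link."""
    signals = []
    if "sponsored" in rel.lower().split():
        signals.append("rel=sponsored")
    params = set(parse_qs(urlparse(href).query, keep_blank_values=True))
    if params & TRACKING_PARAMS:
        signals.append("tracking parameter")
    if redirect_status == 307:
        signals.append("307 redirect")
    return signals

tagged = affiliate_signals(
    "https://shop.example.com/p?aff_id=42", rel="sponsored nofollow", redirect_status=307
)
plain = affiliate_signals("https://shop.example.com/p")
print(tagged, plain)
```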

Affiliate managers can help clean up bad link profiles by inviting the websites where the links are harmful to become affiliates. The pitch is easy: “You’re already linking to us, why not get paid for the work you already did?” SEOs can stop losing backlinks and organic mentions by sharing their lists of sites with the affiliate managers, so the affiliate managers do not replace quality links with affiliate links.

In cases where the SEO cannot get the coverage, they can invite the person to the affiliate program, where, if they link from content that is relevant and has a person in the decision-making process, they can now make money. This blocks competitors from getting into the space and drives the user to your website when there may have been no brand mentions for anyone or links previously.

On top of this, it gets the website crawled and new pages discovered if it is a new product, vs. waiting for a spider to find it or a manual request to crawl and index.

This Is How You Can Use AI To Remain Relevant

There’s a lot more the two can do together to grow the company and the brand. Yes, AI can email for links and make recommendations on content and placements, but it will likely be seen as AI and cause the affiliates, creators, and publishers to ignore your brand.

Showing how, as a team, you’re increasing brand exposure, building a user base, and driving revenue while using AI to evaluate data and simplify the processes, is how you can secure your job because you are scaling the company in a way AI cannot, and using AI to be more efficient.


Apr 21 2026

What’s The Biggest Technical SEO Blind Spot From Over-Relying On Tools? – Ask An SEO via @sejournal, @HelenPollitt1

We are fortunate to have a wide range of SEO tools available, designed to help us understand how our websites might be crawled, indexed, used, and ranked. They often have a similar interface of bold charts, color-coded alerts, and a score that sums up the “health” of your website, for those of us high-achievers who love to be graded.

But these tools can be a curse as well as a blessing, so today’s question is a really important one:

“What’s the biggest technical SEO blind spot caused by SEOs over-relying on tools instead of raw data?”

It’s the false sense of completeness. The belief that the tool is showing you the full picture, when in reality, you’re only seeing a representative model of it.

Everything else (mis-prioritization, conflicting insights, and misguided fixes) flows from that single issue.

Why Technical SEO Tools “Feel Complete” But Aren’t

Technical SEO programs are a critical part of an SEO’s toolkit. They provide insight into how a website is functioning as well as how it may be perceived by users and search bots.

A Snippet In Time Of The State Of Your Website

With a lot of the tools currently on the market, you are presented with a snapshot of the website at the point you set the crawler or report to run. This is helpful for spot-checking issues and fixes. It can be highly beneficial in spotting technical issues that could cause problems in the future, before they have made an impact.

However, they don’t necessarily show how issues have developed over time, or what might be the root cause.

Prioritized List Of Issues

The tools often help to cut through the noise of data by providing prioritized lists of issues. They may even give you a checklist of items to address. This can be very helpful for marketers who haven’t got much experience in SEO and need a hand knowing where to start.

All of these give the illusion that the tool is showing a complete picture of how a search engine perceives your site. But it’s far from accurate.

What’s Missing From Technical SEO Tools

Every tool is constricted in some way. They apply their own crawl limits, assumptions about site structure, prioritization algorithms, and data sampling or aggregation.

Even when tools integrate with each other, they are still stitching together partial views.

By contrast, raw data shows what actually happened, not what could happen or what a tool infers.

In technical SEO, raw data can include:

Without these, you are often diagnosing a simulation of your site and not the real thing.

Joined Up Data

These tools will often only report on data from their own crawl findings. Sometimes it is possible to link tools together, so your crawler can ingest information from Google Search Console, or your keyword tracking tool uses information from Google Analytics. However, they are largely independent of each other.

This means you may well be missing critical information about your website by only looking at one or two of the tools. For a holistic understanding of a website’s potential or actual performance, multiple data sets may be needed.

For example, looking at a crawling tool will not necessarily give you clarity over how the website is currently being crawled by the search engines, just how it potentially could be crawled. For more accurate crawl data, you would need to look at the server log files.

Non-Comparable Metrics

The reverse of this issue is that using too many of these tools in parallel can lead to confusing perspectives on what is going well or not with the website. What do you do if the tools provide conflicting priorities? Or the number of issues doesn’t match up?

Looking at the data through the lens of the tool means there can be an extra layer added to the data that makes it not comparable. For example, sampling could be occurring, or a different prioritization algorithm used. This might result in two tools giving conflicting results or recommendations.

Some Tools Give Simulations Rather Than Actual Data

The other potential pitfall is that, sometimes, the data provided through these reports is simulated rather than actual data. Simulated “lab” data is not the same as actual bot or user data. This can lead to false assumptions and incorrect conclusions being drawn.

In this context, “simulated” doesn’t mean the data is fabricated. It means the tool is recreating conditions to estimate how a page might behave, rather than measuring what actually did happen.

A common example of lab vs. real data is found in speed tests. Tools like Lighthouse simulate page load performance under controlled conditions.

For example, a Lighthouse mobile test runs under throttled network conditions simulating a slow 4G connection. That lab result might show an LCP of 4.5s. But CrUX field data, reflecting real users across all their devices and connections, might show a 75th percentile LCP of 2.8s, because many of your actual visitors are on faster connections.

The lab result is helpful for debugging, but it doesn’t reflect the distribution of real user experiences in real-world scenarios.
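
To make the lab vs. field comparison concrete, here is a minimal Python sketch. It assumes you have raw LCP samples from your own real-user monitoring; the sample values are invented for illustration and are not real CrUX data, which is published in aggregated form.

```python
from statistics import quantiles

# Hypothetical field LCP samples in seconds (illustrative only, not real CrUX data).
field_lcp_seconds = [1.2, 1.5, 1.8, 2.0, 2.1, 2.4, 2.6, 2.8, 3.5, 5.0]

# The single lab measurement from the Lighthouse example above.
lab_lcp_seconds = 4.5


def percentile_75(samples):
    """Return the 75th percentile of a list of samples."""
    # quantiles(n=4) returns the three quartile cut points; index 2 is the 75th percentile.
    return quantiles(samples, n=4, method="inclusive")[2]


p75 = percentile_75(field_lcp_seconds)
print(f"Lab LCP: {lab_lcp_seconds}s, field p75 LCP: {p75:.2f}s")
```

One lab run is a single point, while the field figure summarizes a whole distribution, which is why the two numbers can legitimately disagree.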

Why This Is Important

Understanding the difference between the false sense of completeness shown through tools and the actual experience of users and bots shown through raw data can be critical.

As an example, a crawler could flag 200 pages with missing meta descriptions. It suggests you address these missing meta descriptions as a matter of urgency.

Looking at server logs reveals something different. Googlebot only crawls 50 of those pages. The remaining 150 are effectively undiscovered due to poor internal linking. GSC data shows impressions are concentrated on a small subset of the URLs.

If you follow the tool, you spend time writing 200 meta descriptions.

If you follow the raw data, you fix internal linking, thereby unlocking crawlability for 150 pages that currently don’t have visibility in the search engines at all.
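
The comparison in this example can be sketched in a few lines of Python. This assumes you have already exported two URL lists: the pages your crawler flagged and the pages Googlebot requested according to your logs. The function name and sample URLs are hypothetical.

```python
def split_by_crawl_status(flagged_urls, googlebot_crawled_urls):
    """Split tool-flagged URLs into those Googlebot has crawled and those it has not."""
    flagged = set(flagged_urls)
    crawled = set(googlebot_crawled_urls)
    # Flagged pages that appear in the logs, and flagged pages that never do.
    return sorted(flagged & crawled), sorted(flagged - crawled)


# Hypothetical sample data standing in for a crawler export and a log extract.
flagged = ["/guides/a/", "/guides/b/", "/guides/c/"]
crawled_by_googlebot = ["/guides/b/", "/"]

seen, unseen = split_by_crawl_status(flagged, crawled_by_googlebot)
print("Flagged and crawled:", seen)
print("Flagged but never crawled:", unseen)
```

The second list is where internal linking work is likely to matter more than the meta descriptions the tool asked for.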

The Risk Of This Completeness Blind Spot

The “completeness” blind spot, caused by over-reliance on technical tools, causes a lot of knock-on effects. Through the false sense of completeness, key aspects are overlooked. As a result, time and effort are misguided.

Losing Your Industry Context

Tools often make recommendations without the context of your industry or organization. When SEOs rely too much on the tools and not the data, they may not put on this additional contextual overlay that is important for a high-performing technical SEO strategy.

Optimizing For The Tool, Not Users

When following the recommendations of a tool rather than looking at the raw data itself, there can be a tendency to optimize for the “green tick” of the tool, and not what’s best for users. For example, any tool that provides a scoring system for technical health can lead SEOs to make changes to the site purely so the score goes up, even if it is actually detrimental to users or their search visibility.

Ignoring The Best Way Forward By Following The Tool

For complex situations that take a nuanced approach, there is a risk that overly relying on tools rather than the raw data can lead to SEOs ignoring the complexity of a situation in favor of following the tools’ recommendations. Think of times when you have needed to ignore a tool’s alerts or recommendations because following them would lead to pages on your site being indexed that shouldn’t, or pages being crawlable that you would rather not be. Without the overall context of your strategy for the site, tools cannot possibly know when a “noindex” is good or bad. Therefore, they tend to report in a very black-and-white manner, which can go against what is best for your site.

Final Thought

Overall, there is a very real risk that by accessing all of your technical SEO data only through tools, you may well be nudged towards taking actions that are not beneficial for your overall SEO goals at best, or at worst, you may be doing harm to your site.

Featured Image: Paulo Bobita/Search Engine Journal

Apr 14 2026

How To Break Through An Affiliate Site Plateau & Find New Growth – Ask An SEO via @sejournal, @rollerblader

This week’s Ask an SEO question is:

“I’ve been running an affiliate site for 2 years but hit a plateau. What advanced data analysis techniques can help me identify new growth opportunities that I might be missing?”

This is one of my favorite questions that come up at conferences and in the affiliate marketing programs we manage. Most of the time, the affiliate submits their site or niche, and I can give direct examples and opportunities. But for this, we want to keep everything anonymous, so I’ll share the processes and ideas so you and anyone else reading can implement them, no matter what industry, type of content, etc., you produce.

Breaking The Plateaus

There are a few plateaus affiliates face more than others, including:

Traffic stagnation.
New products and services for recommendation.
Revenue flat lines.
Topics to talk about.

These are the most common with this question, so I’ll focus on them. If anyone reading this has hit a different one and is looking for ways to overcome it, send the question through my author bio page here. If I’ve worked through it, I’ll do my best to answer it in an upcoming column.

Traffic Stagnation

If you have a website and traffic has stagnated because you dominate for all the main queries and topics, look outside your own writing and knowledge base for help. Instead of hiring writers to help with more content based on what exists within your platform, try funneling new visitors in from other platforms (websites, podcasts, apps, etc.) or bringing people in to create unique content for you by featuring them and asking them to promote it.

To find new topics, ideas, and questions people have, adding a forum or community can help bring new traffic and ideas to your website or community. Some search engines like Google tend to reward this authentic user-generated content, but it does come with a decent amount of manual labor with monitoring and quality control. The benefit here is you build a community that creates content for you.

Pro-tip: Add a prompt on the main website pages like “question not answered, click here to ask the community,” where it goes to the forum, or have it go to an answer box where you collect it and create a new guide. Similar to how Search Engine Journal has a “Submit Questions” section for me and other “Ask an” columnists.

The UGC can begin showing up in Google as well as LLMs like ChatGPT, Perplexity, and Claude, and you can begin getting new traffic in and a new user base. This can all be monetized. But maybe you don’t want the hassle and risk of a UGC platform; there are more options.

Take your top guides and articles and begin turning them into videos. A long-form video can help with YouTube and bringing in traffic; it can be uploaded to Skool if you create a course. Skool and other platforms let you charge a fee for access, and each chapter in the video can become a long-form video or a short that works for YouTube, TikTok, and likely Instagram. With the exception of the shorts, affiliate links can be used on all of these platforms. The benefit of videos is a lot of the platforms like YouTube can be steady streams of traffic vs. IG or TikTok where it only lasts for a couple days to a week.

Now begin adding text versions to social media platforms in ways that fit. LinkedIn allows long-form and encourages users to ask questions, answer polls, and then you can link to your website. Bluesky and X are short-form but allow quick and easy links to your website or pages, although the traffic is in short bursts. Pinterest is short form, but image-heavy, and a pin that is done well and gets attention can be consistent traffic for a year and sometimes longer.

Some partners decide they want to start podcasting. Every topic on your website can become a theme or session, or combined into a really strong one that becomes a course you can monetize. Find other people with complementary knowledge and/or who have audiences and invite them to participate. You’ll be helping to grow each other’s traffic and sharing expertise. Sometimes your guest may spur new content ideas for you, too.

New Products And Services And Fixing Revenue Flatlines

When you run out of products and services to promote, or you’re hitting the highest AOVs available, revenue begins to flatline. While you cannot control what happens on the merchant or lead gen website’s funnel, you can control how you make money. This is an affiliate post, so I won’t talk about driving higher EPCs and CPMs or getting pageviews to increase ad media; instead, it is about using affiliate links and offers.

Here’s where to begin looking.

Survey Your Audience Or Use Your Analytics For Demographics

Having your audience’s demographics, including age, urban/rural/suburban, likes and interests, and anything else, can make you a ton of money. If it turns out the majority have dogs and are urban, but you run a cooking site, add in pet-friendly matching recipes or toys for dogs that get them exercise and stimulation when they cannot be outside more regularly to burn energy.

If the same demographics are local-based, like a group for parents in New England, create snow day resources where you review family-friendly tabletop games for snow days, lists of local restaurants across the area that offer kids eat free or family deals, and affordable snowbird family vacations.

If your audience has a large portion in rural areas, think about the ingredients that are hard to come by in rural areas due to smaller grocery stores, then share online resources to access them. This is a low-hanging fruit item I see as recipe sites will focus on the tools and products, but they can also monetize ingredients.

Learn What Else They’re Into

Once you know who your audience is and where they skew demographically, survey them to find their interests. If you can’t get them to take surveys, even with incentives like gift cards or prizes via a drawing (assuming it’s legal where you and your audience live), look up free research documents and use your marketing skills to find hobbies, stores, and associations that have similar audiences.

Maybe your audience is 50-year-old suburbanites that love bird watching. You’ve already maxed out sportswear and hiking equipment, same with books on birds and binoculars. Maybe it turns out they’re also into photography, so you can sell cameras, photo storage solutions, ways to print the photos and sell them, editing software, and guides to using the camera and setting up different types of shots.

It could also turn out they love to travel. Create guides of where to go that are friendly for people 50-60 years old, including the types of birds that they could see in each spot along the route, and what to pack based on the season, as weather can change. You can now use affiliate links for hotels and airfare, travel supplies, camera bags for different climates, and ebooks or physical books with trail maps, travel guides, and bird watching books to check off the ones they see.

You do need to watch adding too much content that is not the core topic of your channel, so you don’t accidentally uncategorize your platforms for SEO or alienate your core reader base. When you go off topic too often, you chase away current and new subscribers while also confusing algorithms. This is easy to resolve with tech SEO by using meta robots or robots.txt, and having an editorial calendar, but that is a different topic.

Now you have new products and services to promote and new merchants to work with, and this leads to more affiliate sales through shopping guides, comparison grids, listicles, etc., increasing your revenue.

New Topics To Talk About

Above, I mentioned podcast guests, UGC, and a few ways that can spark new ideas for topics when you run out of things to talk about. So here are a few other ways I break writer’s block with the programs I work on for myself, for clients, and the affiliates in our programs.

AlsoAsked.com: You plug in a topic like “running shoes,” and it spits out a ton of potential questions about them. From there, I go to Google or an LLM and type it in, then I look to see what shows up. To go a step further, I may ask, “What are similar questions to this one?” or “What are complementary but different questions to this one?” as a second query to see what I may be missing.
Rank trackers: Take a URL for a blog or forum and plug it into a rank tracking tool. It’ll provide a list of keywords, questions, and phrases it shows up for.
Comments: Read the comments on YouTube videos for channels that are directly related to your business. These are things people want to know about and can be a way to get new traffic while breaking writer’s block.
AI and LLMs: Ask AI for a list of ideas that are related but not covered on your platform yet, and then have it double-verify. Not everything it recommends will be relevant, but it could spark ideas for you.

There are almost always solutions to preventing stagnation for affiliates, no matter if it is traffic, revenue, topics, or products and services to promote. You may need to expand your offerings to other types of products and services that match the same demographics or look to other platforms and competitors for content inspiration. I hope this helps, and thank you for asking.

Featured Image: Paulo Bobita/Search Engine Journal

Mar 19 2026

What Can Log File Data Tell Me That Tools Can’t? – Ask An SEO via @sejournal, @HelenPollitt1

For today’s Ask An SEO, we answer the question:

“As an SEO, should I be using log file data, and what can it tell me that tools can’t?”

What Are Log Files

Essentially, log files are the raw record of an interaction with a website. They are reported by the website’s server and typically include information about users and bots, the pages they interact with, and when.

Typically, log files will contain certain information, such as the IP address of the person or bot that interacted with the website, the user agent (i.e., Googlebot, or a browser if it is a human), the time of the interaction, the URL, and the server response code the URL provided.

Example log:

6.249.65.1 - - [19/Feb/2026:14:32:10 +0000] "GET /category/shoes/running-shoes/ HTTP/1.1" 200 15432 "-" "Mozilla/5.0 (Macintosh; Intel Mac OS X 14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36"

6.249.65.1 – This is the IP address of the user agent that hit the website.
19/Feb/2026:14:32:10 +0000 – This is the timestamp of the hit.
GET /category/shoes/running-shoes/ HTTP/1.1 – The HTTP method, the requested URL, and the protocol version.
200 – The HTTP status code.
15432 – The response size in bytes.
Mozilla/5.0 (Macintosh; Intel Mac OS X 14_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36 – The user agent (i.e., the bot or browser that requested the file)
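
The breakdown above can be sketched as a small Python parser. This is a minimal example that assumes the combined log format shown in the example line; your server may be configured to log different fields, so treat the regular expression and field names as assumptions to adapt.

```python
import re
from datetime import datetime

# Assumes the combined log format used in the example line above.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Turn one log line into a dict of fields, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # Some servers log "-" when no response body was sent.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry


example = (
    '6.249.65.1 - - [19/Feb/2026:14:32:10 +0000] '
    '"GET /category/shoes/running-shoes/ HTTP/1.1" 200 15432 "-" '
    '"Mozilla/5.0 (Macintosh; Intel Mac OS X 14_2) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/121.0.0.0 Safari/537.36"'
)
entry = parse_log_line(example)
print(entry["ip"], entry["url"], entry["status"], entry["size"])
```

Once each line is a dictionary, filtering by user agent, status code, or URL becomes ordinary data work rather than text searching.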

What Log Files Can Be Used For

Log files are the most accurate recording of how a user or a bot has navigated around your website. They are often considered the most authoritative record of interactions with your website, though CDN caching and infrastructure configuration can affect completeness.

What Search Engines Crawl

One of the most important uses of log files for SEO is to understand what pages on our site search engine bots are crawling.

Log files allow us to see which pages are getting crawled and at what frequency. They can help us validate if important pages are being crawled and whether often-changing pages are being crawled with an increased frequency compared to static pages.

Log files can be used to see if there is crawl waste, i.e., pages that you don’t want crawled at all, or not with any real frequency, taking up crawl time when a bot visits the site. For example, by looking at log files, you may identify that parameterized URLs or paginated pages are getting too much crawl attention compared to your core pages.
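
As a rough illustration of measuring crawl waste, the Python sketch below groups bot-requested URLs into parameterized, paginated, and core pages and reports each group's share of hits. The pagination pattern (`/page/2/`) is an assumption about URL structure; adjust it to match how your own site paginates.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Assumed pagination scheme, e.g. /blog/page/2/ (change to suit your site).
PAGINATION_PATTERN = re.compile(r"/page/\d+/?$")


def classify_url(url):
    """Label a URL as parameterized, paginated, or core."""
    parts = urlsplit(url)
    if parts.query:
        return "parameterized"
    if PAGINATION_PATTERN.search(parts.path):
        return "paginated"
    return "core"


def crawl_share(bot_urls):
    """Return each URL type's share of all bot hits, as a fraction of 1."""
    counts = Counter(classify_url(url) for url in bot_urls)
    total = sum(counts.values())
    return {kind: count / total for kind, count in counts.items()}


# Hypothetical URLs requested by a search bot, taken from parsed log entries.
bot_urls = ["/shoes/", "/shoes/?color=red", "/shoes/?color=blue", "/blog/page/2/"]
print(crawl_share(bot_urls))
```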

This information can be critical in identifying issues with page discovery and crawling.

True Crawl Budget Allocation

Log file analysis can give a true picture of crawl budget. It can help with the identification of which sections of a site are getting the most attention, and which are being neglected by the bots.

This can be critical in seeing if there are poorly linked pages on a site, or if those pages are being given less crawl priority than sections of the site with less importance.
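
One simple way to see where crawl attention is going is to count bot hits per top-level section of the site. This Python sketch assumes your sections map to the first folder in the URL path, which may not hold for every site.

```python
from collections import Counter


def hits_by_section(bot_paths):
    """Count bot hits per top-level directory, e.g. /category/ or /blog/."""
    sections = Counter()
    for path in bot_paths:
        # Ignore any query string, then keep the non-empty path segments.
        segments = [part for part in path.split("?")[0].split("/") if part]
        section = f"/{segments[0]}/" if segments else "/"
        sections[section] += 1
    return sections


# Hypothetical paths requested by a search bot, taken from parsed log entries.
bot_paths = [
    "/category/shoes/",
    "/category/shoes/running-shoes/",
    "/blog/how-to-lace-shoes/",
    "/category/bags/?sort=price",
    "/",
]
for section, hits in hits_by_section(bot_paths).most_common():
    print(section, hits)
```

Comparing these counts against the sections that matter most commercially shows quickly whether bots are spending their time in the right places.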

Log files can also be helpful after the completion of highly technical SEO work. For example, when a website has been migrated, viewing the log files can aid in identifying how quickly the changes to the site are being discovered.

Through log files, it’s also possible to determine if changes to a website’s structure have actually aided in crawl optimization.

When carrying out SEO experiments, it is necessary to know if a page that is a part of the experiment has been crawled by the bots or not, as this can determine whether the test experience has been seen by them. Log files can give that insight.

Crawl Behavior During Technical Issues

Log files can also be useful in detecting technical issues on a website. For example, there are instances where the status code reported by a crawling tool will not necessarily be the status code that a bot will receive when hitting a page. In that instance, log files would be the only way of identifying that with certainty.

Log files will enable you to see if bots are encountering temporary outages on the site, but also how long it takes them to re-encounter those same pages with the correct status once the issue has been fixed.
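
As a sketch of that recovery check, the Python function below takes one URL's bot hits in time order and measures the gap between the first server error the bot received and the next successful response. The input shape (a list of timestamp and status pairs) is an assumption; you would build it from your own parsed logs.

```python
from datetime import datetime


def recovery_time(hits):
    """Return the time between a bot's first 5xx hit and its next 200, or None.

    `hits` is a chronologically ordered list of (timestamp, status_code)
    pairs for a single URL and a single bot.
    """
    first_error = None
    for timestamp, status in hits:
        if status >= 500 and first_error is None:
            first_error = timestamp
        elif status == 200 and first_error is not None:
            return timestamp - first_error
    return None


# Hypothetical hits on one URL during and after an outage.
hits = [
    (datetime(2026, 2, 19, 10, 0), 200),
    (datetime(2026, 2, 19, 11, 0), 503),
    (datetime(2026, 2, 19, 12, 0), 503),
    (datetime(2026, 2, 19, 15, 30), 200),
]
print(recovery_time(hits))
```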

Bot Verification

One very helpful feature of log file analysis is in distinguishing between real bots and spoofed bots. This is how you can identify if bots are accessing your site under the guise of being from Google or Microsoft, but are actually from another company. This is important because bots may be getting around your site’s security measures by claiming to be a Googlebot, whereas, in fact, they are looking to carry out nefarious actions on your site, like scraping data.

By using log files, it’s possible to identify the IP range that a bot came from and check it against the known IP ranges of legitimate bots, like Googlebot. This can aid IT teams in providing security for a website without inadvertently blocking genuine search bots that need access to the website for SEO to be effective.
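
The IP range check can be sketched with Python's standard `ipaddress` module. The ranges below are placeholders taken from address blocks reserved for documentation; they are not real Googlebot ranges, so swap in the list the search engine publishes before using anything like this.

```python
import ipaddress

# Placeholder ranges from reserved documentation blocks, NOT real bot ranges.
KNOWN_BOT_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_from_known_range(ip, ranges=KNOWN_BOT_RANGES):
    """Return True if the IP address falls inside any of the given ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in ranges)


def find_spoofed(hits, bot_token="Googlebot"):
    """Return (ip, user_agent) pairs that claim to be the bot from an unknown IP."""
    return [
        (ip, user_agent)
        for ip, user_agent in hits
        if bot_token in user_agent and not is_from_known_range(ip)
    ]


# Hypothetical log extracts: IP address and user agent per hit.
hits = [
    ("192.0.2.10", "Mozilla/5.0 (compatible; Googlebot/2.1)"),
    ("203.0.113.5", "Mozilla/5.0 (compatible; Googlebot/2.1)"),
    ("203.0.113.9", "Mozilla/5.0 Chrome/121.0.0.0"),
]
print(find_spoofed(hits))
```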

Orphan Pages Discovery

Log files can be used to identify internal pages that tools didn’t detect. For example, Googlebot may know of a page through an external link to it, whereas a crawling tool would only be able to discover it through internal linking or through sitemaps.

Looking through log files can be useful for diagnosing orphan pages on your site that you were simply not aware of. This is also very helpful in identifying legacy URLs that should no longer be accessible via the site but may still be crawled. For example, HTTP URLs or subdomains that have not been migrated properly.
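
A minimal Python sketch of this check compares the URLs bots requested in your logs against the URLs your crawler and sitemaps know about. The function name and sample URLs are hypothetical, and anything it returns is only a candidate that needs a manual look.

```python
def find_orphan_candidates(logged_urls, crawler_urls, sitemap_urls=()):
    """Return URLs bots requested that neither the crawler nor the sitemaps contain."""
    known = set(crawler_urls) | set(sitemap_urls)
    return sorted(set(logged_urls) - known)


# Hypothetical exports from logs, a site crawl, and an XML sitemap.
logged = ["/", "/shoes/", "/old-campaign-2019/", "/shoes/"]
crawled = ["/", "/shoes/"]
sitemap = ["/", "/shoes/", "/bags/"]

print(find_orphan_candidates(logged, crawled, sitemap))
```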

What Other Tools Can’t Tell Us That Log Files Can

If you are currently not using log files, you may well be using other SEO tools to get you partway to the insight that log files can provide.

Analytics Software

Analytics software like Google Analytics can give you an indication of what pages exist on a website, even if bots aren’t necessarily able to access them.

Analytics platforms also give a lot of detail on user behavior across the website. They can give context as to which pages matter most for commercial goals and which are not performing.

They don’t, however, show information about non-user behavior. In fact, most analytics programs are designed to filter out bot behavior to ensure the data provided reflects human users only.

Although they are useful in determining the journey of users, they do not give any indication of the journey of bots. There is no way to determine which sequence of pages a search bot has visited or how often.

Google Search Console/Bing Webmaster Tools

The search engines’ search consoles will often give an overview of the technical health of a website, like crawl issues encountered and when pages were last crawled. However, crawl stats are aggregated and performance data is sampled for large sites. This means you may not be able to get information on specific pages you are interested in.

They also only give information about their bots. This means it can be difficult to bring bot crawl information together, and indeed to see the behavior of bots from companies that do not offer a tool like a search console.

Website Crawlers

Website crawling software can help with mimicking how a search bot might interact with your site, including what it can technically access and what it can’t. However, they do not show you what the bot actually accesses. They can give information on whether, in theory, a page could be crawled by a search bot, but do not give any real-time or historical data on whether the bot has accessed a page, when, or how frequently.

Website crawlers are also mimicking bot behavior in the conditions you are setting them, not necessarily the conditions the search bots are actually encountering. For example, without log files, it is difficult to determine how search bots navigated a site during a DDoS attack or a server outage.

Why You Might Not Use Log Files

There are many reasons why SEOs might not be using log files already.

Difficulty In Obtaining Them

Oftentimes, log files are not straightforward to get to. You may need to speak with your development team. Depending on whether that team is in-house or not, this may literally mean trying to track down who has access to the log files first.

For teams working agency-side, there is an added complexity of companies needing to transfer potentially sensitive information outside of the organization. Log files can include personally identifiable information, for example, IP addresses. For those subject to rules like GDPR, there may be some concern around sending these files to a third party. There may be a need to sanitize the data before sharing it. This can be a material cost of time and resources that a client may not want to spend simply to share their log files with their SEO agency.
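
One common sanitizing step is to truncate IP addresses before the files leave the organization. The Python sketch below zeroes the last part of each address and assumes the IP is the first field on every line, as in the combined log format. Whether truncation is enough for your compliance needs is a question for your legal or data protection team, not something this example settles.

```python
import ipaddress


def anonymize_ip(ip):
    """Zero the host part of an address: last octet for IPv4, last 80 bits for IPv6."""
    address = ipaddress.ip_address(ip)
    prefix = 24 if address.version == 4 else 48
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    return str(network.network_address)


def anonymize_log_line(line):
    """Replace the leading IP address of a log line with its truncated form."""
    ip, _, rest = line.partition(" ")
    return f"{anonymize_ip(ip)} {rest}"


line = '6.249.65.1 - - [19/Feb/2026:14:32:10 +0000] "GET / HTTP/1.1" 200 512 "-" "-"'
print(anonymize_log_line(line))
```

Keeping the network part of the address means range-based bot checks can often still be run on the sanitized file.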

User Interface Needs

Once you have access to log files, it isn’t all smooth sailing from there. You will need to understand what you are looking at. Log files in their raw form are simply text files containing string after string of data.

It isn’t something that is easily parsed. To truly make sense of log files, there is usually a need to invest in a program to help decipher them. These can range in price depending on whether they are programs designed to let you run a file through on an ad-hoc basis, or whether you are connecting your log files to them so they stream into the program continuously.

Storage Requirements

There is also a need to store log files. Alongside being secure for the reasons mentioned above, like GDPR, they can be very difficult to store for long periods due to how quickly they grow in size.

For a large ecommerce website, you might see log files reach hundreds of gigabytes over the course of a month. In those instances, it becomes a technical infrastructure issue to store them. Compressing the files can help with this. However, given that issues with search bots can take several months of data to diagnose, or require comparison over long time periods, these files can start to get too big to store cost-effectively.
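
Compression can be done with Python's standard library alone. This is a minimal sketch that writes a gzip copy next to the original file; the file name in the usage comment is hypothetical, and deleting the uncompressed original is left to you.

```python
import gzip
import shutil
from pathlib import Path


def compress_log(path):
    """Write a gzip-compressed copy of a log file next to it and return its path."""
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    # Stream the file in chunks so large logs are never loaded into memory at once.
    with source.open("rb") as raw, gzip.open(target, "wb") as packed:
        shutil.copyfileobj(raw, packed)
    return target


# Usage (hypothetical file name):
# compressed_path = compress_log("access.log")
```

Because log lines are highly repetitive text, they usually compress well, and `gzip.open` can read the compressed file back line by line without unpacking it to disk first.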

Perceived Technical Complexity

Once you have your log files in a decipherable format, cleaned and ready to use, you actually need to know what to do with them.

Many SEOs have a big barrier to using log files simply based on the fact they seem too technical to use. They are, after all, just strings of information about hits on the website. This can feel overwhelming.

Should SEOs Use Log Files?

Yes, if you can.

As mentioned above, there are many reasons why you may not be able to get hold of your log files and transform them into a usable data source. However, once you can, it will open up a whole new level of understanding of the technical health of your website and how bots interact with it.

There will be discoveries made that simply could not be achieved without log file data. The tools you are currently using may well get you part of the way there. They will never give you the full picture, however.

Featured Image: Paulo Bobita/Search Engine Journal

Feb 10 2026

Should I Optimize My Content Differently For Each Platform? – Ask An SEO via @sejournal, @rollerblader

This week’s Ask an SEO question is from an anonymous reader who asks:

“Should I be optimizing content differently for LinkedIn, Reddit, and traditional search engines? I’m seeing these platforms rank highly in Google results, but I’m not sure how to create a cohesive multi-platform SEO approach.”

Yes, you should absolutely be optimizing your content differently based on where you publish it, where you want to reach the audience, and the way they engage. This includes what you put out, what goes on your website, and what exists in your metadata. Each platform has a different user experience, and the people there go for different reasons, so your job with your content is to meet their needs where they are.

Metadata

For SEO purposes, you’re limited to a certain number of pixels for meta titles and descriptions in a search result, whereas on social media platforms, you’re limited to a different number of characters. This means your titles and descriptions need to be modified to fit the pixel or character lengths defined by the platforms, including Open Graph, rich pins, etc. The people on the platform may also be at different stages of their journey and be different audience demographics.
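
If you maintain metadata for several platforms, a quick length check can catch text that will be cut off. In this Python sketch, the field names and character limits are placeholders invented for illustration; they are not official platform limits, and pixel-based truncation in search results is not modeled here.

```python
# Placeholder limits for illustration only; check each platform's current guidance.
CHARACTER_LIMITS = {
    "search_title": 60,
    "og_title": 90,
    "pin_title": 100,
}


def over_limit(metadata, limits=CHARACTER_LIMITS):
    """Return how many characters each field exceeds its limit by."""
    return {
        field: len(text) - limits[field]
        for field, text in metadata.items()
        if field in limits and len(text) > limits[field]
    }


# Hypothetical metadata for one page.
page_metadata = {
    "search_title": "Running Shoes for Every Distance",
    "og_title": "x" * 95,
}
print(over_limit(page_metadata))
```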

If a platform has its own metadata elements and its audience is younger or skews towards one gender, cater the text and imagery in your metadata towards them. It’s worth seeing if that resonates better, but only if that is the majority from that platform. For search engines, it can be anyone and any demographic, so make it a strong sales pitch that is all-inclusive. Use your customer service and review data to find out what matters to them and use it in your messaging. The same goes for the images you use.

What fits on Pinterest won’t look good on LinkedIn, and an image for Google Discover may not work great on Instagram. Pinterest can display a vertical infographic and make it look great, but it will be illegible on platforms that have squares and landscape-oriented images. Resize, change the wording, and ensure the focal point on the image matches the platform it’ll be used on via your metadata.

Search engines and social algorithms look for different things as well. A search engine may allow some clickbait and salesy types of titles and metadata, but social media algorithms may penalize sites that do this. And each platform will be using and looking for different signals.

This is why you want to speak to the audiences on the platforms and focus on what the platform rewards, not just a search engine. Your customers on TikTok may be younger and use different wording than your customers on Facebook, but both will need a balance of the wording on your webpage. This is where using unique metadata by platform and purpose matters.

Content On Your Own Pages

Not every page on your website has to be for SEO, AIO, or GEO, and neither does the user experience. If the page is for an email blast or remarketing where you have strong calls to action, less text, and more conversion, you can noindex it or use a canonical link to the detailed new customer experience page. The same goes for SEO vs. social media visitors.

Someone from social media may need more of an education when buying a product because they didn’t set out that day to buy it; they were on social media to have fun. Someone looking for a product, product + review, or a comparison has a background on the product and wants a solution, so they went to a search engine to find one. This is where an educational vs. a conversion option can happen, and both can exist without competing, even though they’re optimized for the same keyword phrase.

The schema, the way wording is used, and elements on the page like an “add to cart” button above the fold help search engines to know the page is for conversions, while an H1, H2, and text with internal links to product and content pages mean it’s for educational purposes. Now apply this to the goal of what you want the person to do on the page, and keep in mind where they are coming from before they reach it.

You may want a more visual approach with video demonstrations or reviews, and options to shop and learn more for some experiences, vs. giving them the product and a buy now button. Both are optimized for the same keywords, but both are there for different visitors. This is where you deduplicate them using your SEO skills.

The keywords and phrases in your title tag and H1 tag will be similar, and the page will compete directly against the product or collection page, but it reflects how people from Snapchat and Reddit engage vs. someone from an email blast who knows how your brand behaves. So, set the canonical link to the main product page and/or add a meta robots noindex,nofollow. When you’re pushing your content out, share the version of the page to the platform it is designed for. Your site structure and robots.txt guide the search engines and AI to the pages meant for them, helping to eliminate the cannibalization.
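
To confirm a platform-specific page carries the signals you intended, you can read its canonical link and meta robots tag with Python's built-in HTML parser. This is a minimal sketch; the sample markup and the example.com URL are hypothetical.

```python
from html.parser import HTMLParser


class IndexingSignals(HTMLParser):
    """Collect the canonical URL and meta robots value from a page's HTML."""

    def __init__(self):
        super().__init__()
        self.canonical = None
        self.robots = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "link" and (attributes.get("rel") or "").lower() == "canonical":
            self.canonical = attributes.get("href")
        elif tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            self.robots = attributes.get("content")


def read_indexing_signals(html):
    """Return the canonical and robots values found in the HTML (None if absent)."""
    parser = IndexingSignals()
    parser.feed(html)
    return {"canonical": parser.canonical, "robots": parser.robots}


# Hypothetical markup for a social-traffic version of a product page.
sample_html = (
    "<html><head>"
    '<link rel="canonical" href="https://www.example.com/product/">'
    '<meta name="robots" content="noindex,nofollow">'
    "</head><body></body></html>"
)
print(read_indexing_signals(sample_html))
```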

It is the same content, the same purpose, and the same goal, just a unique format for the platform you want traffic from. I wouldn’t recommend this for everything because it is a ton of work, but for important pages, products, and services, it can make a difference to provide a better UX based on what the person and platform prefer.

What You Post To The Platform

Last is the content you post to the platform. Some allow hashtags; others prefer a lot of words, and platforms like X or Bluesky restrict the number of characters you can use unless you pay. The audiences on these platforms pay attention to and use different words, and the algorithms may reward or penalize content differently.

On LinkedIn and Reddit, you may want to share a portion of the post and a summary of what the person will learn, then encourage engagement and a click through to your website or app. On Facebook, you may do a snippet of text and a stronger call-to-action, as people aren’t there for networking and learning like they are on LinkedIn.

Reddit may also benefit from examples and trust builders, whereas YouTube Shorts is about a quick message that entices an interaction and ideally a click through. The written description on a YouTube Short may go ignored as it is hidden, so the video is more important message-wise. People on Reddit can also be looking for real human experiences, reviews, and comparisons from real customers. So, if you engage and publish your content, look at the topic of the forum and meet the user on the page at that specific stage of their journey.

The description still matters on most of these platforms because they are algorithm-based, and so are the search engines that feature their content. The content here acts like food for the algorithms along with user signals, so make sure you write something that properly matches the video’s content and follows best practices. If you’re publishing to Medium or Reddit and want to get the comparison queries, focus on unbiased and fair comparisons or reviews so Google surfaces it (disclosing you are one of the brands if you are). Then focus your own pages on conversion copy so that when the person is ready to buy a blue t-shirt, they see your conversion page.

You should change the content based on the platform, and even your own website, when the goal is to bring users in from a specific traffic source. Someone from social media may like a video, while someone from a search engine wants text. Just make sure you code and structure your pages correctly, and you cater the experience to the right platform so the users reach their correct experience.

This is not practical for every page, so do your best, and at a minimum, customize what you share publicly and what is in your metadata. Those are easy and fast enough to do at scale. Then pay attention to the UX on the page and make adjustments as needed.

Featured Image: Paulo Bobita/Search Engine Journal

Feb 5 2026

Is Your Internal Linking Helping Or Hurting Topical Authority? Ask An SEO via @sejournal, @HelenPollitt1

Today’s question is about understanding internal linking and how it can help or hinder a search engine’s perception of a page’s topical relevance and authority.

“How do you technically assess whether a site’s internal linking is diluting topical authority rather than strengthening it?”

What Is Topical Authority

In essence, topical authority is the concept of how a search engine may view a website’s ability to provide an authoritative answer for a topic, inferred from how consistently it covers that topic and how signals reinforce that coverage.

Although there is no single, standard metric for topical authority, it is a measure of a page or a whole website’s relevance to a specific knowledge area, and trustworthiness as a source of information.

How Is It Affected By Internal Links

Internal links are crucial in shaping topical authority. They influence how authority, relevance, and intent signals are distributed across a website or folder. If we think of backlinks as bringing topical authority into a website, internal linking then helps to disperse it across the site. Internal linking determines where that authority accumulates and aids search engines in interpreting a page’s topical focus.

Links that connect topically relevant pages together help to strengthen the perception of the destination page’s authority on a subject. Lots of links from pages that aren’t seemingly relevant to each other can dilute the destination’s topical authority.

Something that is central to understanding the role of internal links in shaping topical authority is PageRank. PageRank is an algorithmic system developed in the late ’90s by Google founders Larry Page and Sergey Brin. It was used to measure the importance of a page based on the nature and volume of the links pointing to it. We need to keep this concept in mind when considering the use of internal links to shape the perception of a page’s topical authority.

How Important Are Internal Links In Regard To Topical Authority?

There are several factors of internal links that can affect how beneficial they are in strengthening a page’s topical authority.

Does The Link Pass Authority?

The first aspect is whether the link is followable, or if it is marked as “rel=nofollow.” This also applies to other variations of the “nofollow” attribute value, like “rel=sponsored.” Note that these attributes are hints, not absolutes, and Google might ignore them in some cases.

The URL that the link is on, and the page it is pointing to, also need to be crawlable. If those pages are disallowed via the robots.txt, then the value of the authority will not pass, as the page will not be crawled for the internal link to be picked up by the search bots.
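To check crawlability for a batch of URLs, you can test them against your robots.txt rules in a script. This is a minimal sketch using Python’s built-in urllib.robotparser; the helper name is_crawlable and the sample rules are assumptions for illustration. Python’s parser uses simple prefix matching, so it may not interpret every pattern exactly the way Googlebot does.

```python
from urllib.robotparser import RobotFileParser


def is_crawlable(robots_txt, user_agent, url):
    """Return True if the robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content for the example hobby shop.
rules = """User-agent: *
Disallow: /account/
"""

source = "https://examplehobbyshop.com/crafts/"
target = "https://examplehobbyshop.com/account/wishlist/"

# Both ends of an internal link need to be crawlable for it to count.
print(is_crawlable(rules, "Googlebot", source))
print(is_crawlable(rules, "Googlebot", target))
```

In this sample, the source page is allowed and the target is disallowed, so a link between them would not be expected to pass value.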

Where Is It Placed On The Page?

Where a link is on the page could affect its authority. For example, links placed in the footer of every page on the site get weighted differently than those that sit within the page’s main content. Google’s Martin Splitt has explained that Google does treat content in different parts of the page differently when trying to understand the topic of a page, and it’s content that is perceived to be main content that is used most to help with that.

Google’s John Mueller recently answered a question about how links are valued in these different areas of a page. He said, “I don’t think there is anything quantifiably different about internal links in different parts of the page.” Although that may seem to contradict Splitt’s comments, remember that Mueller is addressing how the value of a link may be affected by its location on a page, whereas Splitt is discussing how location of content affects how it is weighted to determine topic.

Following this logic, a link’s placement in the main content of a page may affect how it passes topical relevancy.

What Is The Anchor Text?

The anchor text, or alt-text in cases where an image is linked, will help to inform the search engines of the nature of the page being linked to. The words that form the link are critical in helping the user and search engines know what to expect when they land on the page it takes them to. This context is another signal to the bots of the link destination’s relevancy to a subject.

What Is The Link Pointing From And To?

Similarly, if a link is on a page that is topically similar to the page being linked to, that also reinforces the topical authority of the destination page. If Page A on my fictitious hobby ecommerce site is about different craft hobbies, and Page B is about textile craft hobbies, it will help to reinforce Page B’s relevance to those seeking information about craft hobbies.

How To Assess Your Internal Linking Structure’s Effect On Topical Authority

Internal links can help a site’s topical authority by reinforcing the destination URLs’ topical relevance. They also help to ensure that any external authority signals are being passed to the correct internal pages.

There are calculations that could factor in the flow of link equity and authority through pages to assess the full impact of internal linking on a page’s topical authority. They would include assigning a value to the position of each link on the page, the click-depth from a topically relevant and authoritative page, and the topical authority of the links pointing to the page where the link is coming from.

It’s a lot of math.

Instead, I’m in favor of keeping it simple, and defining a process that will allow you to get enough of an understanding of your website’s topical authority to make decisions from.

By looking at a sample of pages from your site across different topics, or if you are particularly focused, just one area of your topical authority, you can get an idea of any issues.

1. Identify Where Your Pages Are Getting Their Internal Links From

First of all, crawl your site, taking a sample of URLs. Export all of the internal links pointing to those pages, including their anchor text and the URL the link is on.
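Most crawlers will export this for you, but if you want to see what is being collected, here is a rough sketch of pulling internal links out of a single page’s HTML with Python’s standard library. The class and function names are hypothetical, and treating “nofollow,” “sponsored,” and “ugc” as not followable is my simplifying assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

NOT_FOLLOWABLE = {"nofollow", "sponsored", "ugc"}


class LinkExtractor(HTMLParser):
    """Collect internal links with their anchor text and followable status."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc
        self.links = []
        self._current = None  # the internal link currently being read

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "a" and attr.get("href"):
            target = urljoin(self.page_url, attr["href"])
            if urlparse(target).netloc == self.host:
                rel = set((attr.get("rel") or "").lower().split())
                self._current = {
                    "source": self.page_url,
                    "target": target,
                    "anchor": "",
                    "followable": not (rel & NOT_FOLLOWABLE),
                }
        elif tag == "img" and self._current is not None:
            # For linked images, the alt text stands in for anchor text.
            self._current["anchor"] += attr.get("alt") or ""

    def handle_data(self, data):
        if self._current is not None:
            self._current["anchor"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            self._current["anchor"] = " ".join(self._current["anchor"].split())
            self.links.append(self._current)
            self._current = None


def extract_internal_links(page_url, html):
    parser = LinkExtractor(page_url)
    parser.feed(html)
    return parser.links


sample = (
    '<p><a href="/crafts/embroidery/intro-to-embroidery/">get started with embroidery</a> '
    '<a href="https://other.example/x">external</a> '
    '<a rel="nofollow" href="/login">log in</a></p>'
)
for link in extract_internal_links("https://examplehobbyshop.com/crafts/", sample):
    print(link)
```

Each row holds the source URL, the target URL, the anchor text, and whether the link is followable, which is everything the later steps need.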

2. Classify The URLs In Topic Clusters

Group all the pages into topical themes, e.g., for an ecommerce site that sells hobby equipment, “knitting, crochet, embroidery, and weaving” would all sit within “crafts” and the sub-category of “textile arts.” “Die cutting, digital cutting, laser cutting” would all sit within “crafts” and the sub-category of “cutting and engraving.”
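If your URL structure reflects your categories, a lookup on the URL path can do a first pass at this grouping. The sketch below assumes the hobby shop uses path segments like /crafts/embroidery/; the TOPIC_MAP contents and the classify_url name are illustrative, and pages that don’t match will still need manual review.

```python
from urllib.parse import urlparse

# Hypothetical mapping of URL path segments to (family, sub-category).
TOPIC_MAP = {
    "knitting": ("crafts", "textile arts"),
    "crochet": ("crafts", "textile arts"),
    "embroidery": ("crafts", "textile arts"),
    "weaving": ("crafts", "textile arts"),
    "die-cutting": ("crafts", "cutting and engraving"),
    "digital-cutting": ("crafts", "cutting and engraving"),
    "laser-cutting": ("crafts", "cutting and engraving"),
}


def classify_url(url):
    """Return (family, sub-category) from the first matching path segment."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    for segment in segments:
        if segment in TOPIC_MAP:
            return TOPIC_MAP[segment]
    return ("unclassified", "unclassified")


print(classify_url("https://examplehobbyshop.com/crafts/embroidery/intro-to-embroidery/"))
print(classify_url("https://examplehobbyshop.com/about-us/"))
```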

3. Analyze What Proportion Of Each URL’s Followable Internal Links Are From Within The Same Topic And Outside Of The Topic

Using the exported links, and looking at followable links only, match each link’s source URL against your topic clusters and mark it as “within” or “outside” the topical family of the page it points to.

Divide the volume of links that are from the same topic by the volume of links in total. For example, for examplehobbyshop.com/crafts/embroidery/intro-to-embroidery/, if the total number of internal links is 100 and the volume of internal links from categories that are within the “crafts” family is 60, then it would be 60/100 = 60%.

The rule I apply is: if around 75% or more of a URL’s internal links come from the same family, that suggests internal links are helping solidify topical authority. If it is lower than that, that suggests there could be some improvement.
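The same calculation can be sketched in a few lines of Python. The function names and the “outdoors” family are made up for illustration, and the 0.75 threshold is simply the rule of thumb above, not a number published by any search engine.

```python
def same_topic_share(source_families, target_family):
    """Share of followable internal links whose source page is in the target's family."""
    if not source_families:
        return 0.0
    within = sum(1 for family in source_families if family == target_family)
    return within / len(source_families)


def assess(share, threshold=0.75):
    """Apply the rule-of-thumb threshold to a same-topic share."""
    return "reinforcing" if share >= threshold else "room for improvement"


# 100 followable internal links: 60 from "crafts" pages, 40 from elsewhere.
families = ["crafts"] * 60 + ["outdoors"] * 40
share = same_topic_share(families, "crafts")
print(f"{share:.0%}", assess(share))
```

With the numbers from the example, the share comes out at 60%, which falls under the threshold.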

How To Assess How Your Links’ Anchor Text Is Contributing To Your Topical Authority

1. Extract The Anchor Text Of Links Pointing To Your URLs

When gathering the links pointing to a page, remove common links like static header navigation and footer links that stay the same on each page. Then, extract the anchor text or alt text for linked images.
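One way to approximate this filtering in a script is to drop any link that appears, with the same target and anchor text, on nearly every crawled page. This is only a heuristic sketch: the function name, the 90% cutoff, and the row format (dicts with “source,” “target,” and “anchor” keys) are my assumptions, and it needs a reasonably sized crawl to work, since on a tiny sample every link can look sitewide.

```python
from collections import defaultdict


def drop_sitewide_links(links, min_share=0.9):
    """Remove links whose (target, anchor) pair appears on most source pages."""
    sources = {link["source"] for link in links}
    pages_with = defaultdict(set)
    for link in links:
        pages_with[(link["target"], link["anchor"])].add(link["source"])
    cutoff = min_share * len(sources)
    return [
        link for link in links
        if len(pages_with[(link["target"], link["anchor"])]) < cutoff
    ]


# Hypothetical export: a footer link on every page plus one in-content link.
rows = [
    {"source": "/a/", "target": "/contact/", "anchor": "Contact us"},
    {"source": "/b/", "target": "/contact/", "anchor": "Contact us"},
    {"source": "/c/", "target": "/contact/", "anchor": "Contact us"},
    {"source": "/a/", "target": "/crafts/embroidery/", "anchor": "get started with embroidery"},
]
print(drop_sitewide_links(rows))
```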

2. Categorize The Relevance Of The Anchor Text Of Links

Next, you want to look at how on-topic the anchor text of the links is for the page they are linking to.

Classify each anchor text as “topically relevant,” “topically irrelevant,” or “generic.” Topically relevant anchor text will have great alignment with the subject of the linked-to page. Topically irrelevant anchor text will not show any useful reinforcement of the topic. “Generic” anchor text includes “click here” or pagination links.

For the URL, examplehobbyshop.com/crafts/embroidery/intro-to-embroidery/, the following internal links’ anchor text could be grouped as follows:

Topically relevant: “get started with embroidery,” “learn the tools needed to pick up embroidery,” “want to try another fibre craft?”

Topically irrelevant: “beginners’ guide,” “start a new hobby,” “try something new”

Generic: “click here,” “next,” “page 2”

The goal is to have a lot of links from topically relevant pages pointing to the URL using topically relevant anchor text.

Measure the relevance of the anchor text against the total volume of anchor text.

For example, if that page had 30 topically relevant anchor texts, 20 topically irrelevant, and 50 generic, of the total 100 internal links pointing to it, it would have a topically relevant anchor text score of 30%. So despite 60% of its internal links coming from topically relevant pages, only 30% of the links have topically relevant anchor text.
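Once each anchor text has a label, the score is a simple count. Here is a minimal sketch; the label strings and function name are my own, and the labelling itself still has to be done by a person or a process you trust.

```python
from collections import Counter


def anchor_label_shares(labels):
    """Return each label's share of the total anchor texts."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# The 100 internal links from the example above.
labels = ["relevant"] * 30 + ["irrelevant"] * 20 + ["generic"] * 50
shares = anchor_label_shares(labels)
for label, value in shares.items():
    print(f"{label}: {value:.0%}")
```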

3. Identify The Intent Mix Of The Anchor Text

Next, you want to identify the intent of the anchor text.

When grouping the anchor text by topical relevancy, also consider the intent behind the anchor text. For example, is it suggesting the page you will go to after clicking on it is informational, commercial, or transactional?

This matters because it can lead to dilution of the page intent. If there is a wide spread of intent shown through the anchor text, it can lead to confusion as to the purpose of the page being linked to.

Following on from the previous example, if some of the internal links had the anchor text “learn more about embroidery,” but others were more akin to “buy all the tools you need for your first embroidery project,” it’s not clear if examplehobbyshop.com/crafts/embroidery/intro-to-embroidery/ is an informational, commercial, or transactional page. This suggests the anchor text has a high intent mix, which is not ideal. If the majority of the anchor text were aligned with informational intent, it would have low intent mix.

Together, you want the anchor text to show high topical relevance, and low intent mix.
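There is no standard formula for intent mix, so treat the following as one possible way to put a number on it: the share of anchor texts that do not match the most common intent. The function name and the sample data are hypothetical.

```python
from collections import Counter


def intent_mix(intents):
    """Share of anchor texts that differ from the dominant intent (0 = fully aligned)."""
    counts = Counter(intents)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    dominant_count = counts.most_common(1)[0][1]
    return 1 - dominant_count / total


# Hypothetical labels for ten anchor texts pointing at the intro-to-embroidery page.
mostly_informational = ["informational"] * 8 + ["transactional"] * 2
evenly_split = ["informational"] * 5 + ["transactional"] * 5

print(round(intent_mix(mostly_informational), 2))
print(round(intent_mix(evenly_split), 2))
```

A lower number means the anchor text mostly agrees on what kind of page it is pointing to; a number near 0.5 or above means the signals are split.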

Final Thoughts

By the end of your analysis, you should have an idea of the topical relevance of the source pages of the internal links and how their anchor text aligns to both the topic and intent of the page being linked to.

Scaling this across a larger volume of URLs means you can start to see how topical relevance and authority are being strengthened or diluted via internal linking.

Once you have an idea of weaker areas of your site, you can begin to optimize anchor text and link sources to reinforce the value of the linked-to page as a source of authority on a subject.

Featured Image: Paulo Bobita/Search Engine Journal

Jan 8 2026

Ask An SEO: Can AI Systems & LLMs Render JavaScript To Read ‘Hidden’ Content? via @sejournal, @HelenPollitt1

For this week’s Ask An SEO, a reader asked:

“Is there any difference between how AI systems handle JavaScript-rendered or interactively hidden content compared to traditional Google indexing? What technical checks can SEOs do to confirm that all page critical information is available to machines?”

This is a great question because beyond the hype of LLM-optimization sits a very real technical challenge: ensuring your content can actually be found and read by the LLMs.

For several years now, SEOs have been fairly encouraged by Googlebot’s improvements in being able to crawl and render JavaScript-heavy pages. However, with the new AI crawlers, this might not be the case.

In this article, we’ll look at the differences between the two crawler types, and how to ensure your critical webpage content is accessible to both.

How Does Googlebot Render JavaScript Content?

Googlebot processes JavaScript in three main stages: crawling, rendering, and indexing. In simple terms, this is how each stage works:

Crawling

Googlebot will queue pages to be crawled when it discovers them on the web. Not every page that gets queued will be crawled, however, as Googlebot will check to see if crawling is allowed. For example, it will see if the page is blocked from crawling via a disallow command in the robots.txt.

If the page is not eligible to be crawled, then Googlebot will skip it, forgoing an HTTP request. If a page is eligible to be crawled, it will move to render the content.

Rendering

Googlebot will check if the page is eligible to be indexed by ensuring there are no requests to keep it from the index, for example, via a noindex meta tag. Googlebot will queue the page to be rendered. The rendering may happen within seconds, or it may remain in the queue for a longer period of time. Rendering is a resource-intensive process, and as such, it may not be instantaneous.

In the meantime, the bot will receive the DOM response; this is the content that is rendered before JavaScript is executed. This typically is the page HTML, which will be available as soon as the page is crawled.

Once the JavaScript is executed, Googlebot will receive the fully constructed page, the “browser render.”

Indexing

Eligible pages and information will be stored in the Google index and made available to serve as search results at the point of user query.

How Does Googlebot Handle Interactively Hidden Content?

Not all content is available to users when they first land on a page. For example, you may need to click through tabs to find supplementary content, or expand an accordion to see all of the information.

Googlebot doesn’t have the ability to switch between tabs, or to click open an accordion. So, making sure it can parse all the page’s information is important.

The way to do this is to make sure that the information is contained within the DOM on the first load of the page. Meaning, content may be “hidden from view” on the front end before clicking a button, but it’s not hidden in the code.

Think of it like this: The HTML content is “hidden in a box”; the JavaScript is the key to open the box. If Googlebot has to open the box, it may not see that content straightaway. However, if the server has opened the box before Googlebot requests it, then it should be able to get to that content via the DOM.

How To Improve The Likelihood That Googlebot Will Be Able To Read Your Content

The key to ensuring that content can be parsed by Googlebot is making it accessible without the need for the bot to render the JavaScript. One way of doing this is by forcing the rendering to happen on the server itself.

Server-side rendering is the process by which a webpage is rendered on the server rather than by the browser. This means an HTML file is prepared and sent to the user’s browser (or the search engine bot), and the content of the page is accessible to them without waiting for the JavaScript to load. This is because the server has essentially created a file that has rendered content in it already; the HTML and CSS are accessible immediately. Meanwhile, JavaScript files that are stored on the server can be downloaded by the browser.

This is opposed to client-side rendering, which requires the browser to fetch and compile the JavaScript before content is accessible on the webpage. This is a much lower lift for the server, which is why it is often favored by website developers, but it does mean that bots struggle to see the content on the page without rendering the JavaScript first.

How Do LLM Bots Render JavaScript?

Given what we now know about how Googlebot renders JavaScript, how does that differ from AI bots?

The most important element to understand about the following is that, unlike Googlebot, there is no “one” governing body that represents all the bots that might be encompassed under “LLM bots.” That is, what one bot might be capable of doing won’t necessarily be the standard for all.

The bots that scrape the web to power the knowledge bases of the LLMs are not the same as the bots that visit a page to bring back timely information to a user via a search engine.

And Claude’s bots do not have the same capability as OpenAI’s.

When we are considering how to ensure that AI bots can access our content, we have to cater to the lowest-capability bots.

Less is known about how LLM bots render JavaScript, mainly because, unlike Google, the AI bots are not sharing that information. However, some very smart people have been running tests to identify how each of the main LLM bots handles it.

Back in 2024, Vercel published an investigation into the JavaScript rendering capabilities of the main LLM bots, including OpenAI’s, Anthropic’s, Meta’s, ByteDance’s, and Perplexity’s. According to their study, none of those bots were able to render JavaScript. The only bots in the study that could were Gemini (leveraging Googlebot’s infrastructure), Applebot, and Common Crawl’s CCBot.

More recently, Glenn Gabe reconfirmed Vercel’s findings through his own in-depth analysis of how ChatGPT, Perplexity, and Claude handle JavaScript. He also runs through how to test your own website in the LLMs to see how they handle your content.

These are the most well-known bots, from some of the most heavily funded AI companies in this space. It stands to reason that if they are struggling with JavaScript, lesser-funded or more niche ones will be also.

How Do AI Bots Handle Interactively Hidden Content?

Not well. That is, if the interactive content requires some execution of JavaScript, they may struggle to parse it.

To ensure the bots are able to see content hidden behind tabs, or in accordions, it is prudent to ensure the content loads fully in the DOM without the need to execute JavaScript. Human visitors can still interact with the content to reveal it, but the bots won’t need to.

How To Check For JavaScript Rendering Issues

There are two very easy ways to check if Googlebot is able to render all the content on your page:

Check The DOM Through Developer Tools

The DOM (Document Object Model) is an interface for a webpage that represents the HTML page as a series of “nodes” and “objects.” It essentially links a webpage’s HTML source code to JavaScript, which enables the functionality of the webpage to work. In simple terms, think of a webpage as a family tree. Each element on a webpage is a “node” on the tree. So, a header tag (<h1>), a paragraph (<p>), and the body of the page itself (<body>) are all nodes on the family tree.

When a browser loads a webpage, it reads the HTML and turns it into the family tree (the DOM).

How To Check It

I’ll take you through this using Chrome’s Developer Tools as an example.

You can check the DOM of a page by going to your browser. Using Chrome, right-click and select “Inspect.” From there, make sure you’re in the “Elements” tab.

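To see if content is visible on your webpage without having to execute JavaScript, you can search for it here. If you find the content fully within the DOM when you first load the page (and don’t interact with it further), then it should be visible to Googlebot. Keep in mind that the “Elements” tab shows the DOM after the browser has run JavaScript, so for LLM bots, also check the source HTML as described later in this article.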

Use Google Search Console

To check if the content is visible specifically to Googlebot, you can use Google Search Console.

Choose the page you want to test and paste its URL into the “Inspect any URL” field. Search Console will then take you to another page where you can “Test live URL.” When you test a live page, you will be presented with another screen where you can opt to “View tested page.”

How To Check If An LLM Bot Can See Your Content

As per Glenn Gabe’s experiments, you can ask the LLMs themselves what they can read from a specific webpage. For example, you can prompt them to read the text of an article. They will respond with an explanation if they cannot due to JavaScript.

Viewing The Source HTML

If we are working to the lowest common denominator, it is prudent to assume, at this point, LLMs can’t read content in JavaScript. To be absolutely sure that the content of your page is readable to these bots, make sure it is in the source HTML. To check this, you can go to Chrome and right click on the page. From the menu, select “View page source.” If you can “find” the text in this code, you know it’s in the source HTML of the page.
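If you need to run this check across many pages, the same “find in source” test can be scripted. The sketch below takes the raw HTML as a string (fetched however you like, as long as no JavaScript is executed) and looks for a phrase in the text outside of script and style blocks. The class and function names are my own, and it is a rough check rather than a replica of how any particular bot parses a page.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text from raw HTML, ignoring script and style contents."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def in_static_html(html, phrase):
    """Return True if phrase appears in the page text without running JavaScript."""
    parser = TextExtractor()
    parser.feed(html)
    text = " ".join(" ".join(parser.chunks).split()).lower()
    return " ".join(phrase.split()).lower() in text


# Hypothetical page: the heading is in the HTML, the rest is injected by JavaScript.
raw = (
    '<html><body><h1>Intro to embroidery</h1><div id="app"></div>'
    '<script>document.getElementById("app").textContent = "Loaded by JavaScript";</script>'
    '</body></html>'
)
print(in_static_html(raw, "Intro to embroidery"))
print(in_static_html(raw, "Loaded by JavaScript"))
```

In this sample, the heading is found but the injected text is not, which is the situation where a non-rendering bot would likely miss content.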

What Does This Mean For Your Website?

Essentially, Googlebot has been developed over the years to be much better at handling JavaScript than the newer LLM bots. However, it’s really important to understand that the LLM bots are not trying to crawl and render the web in the same way as Googlebot. Don’t assume that they will ever try to mimic Googlebot’s behavior. Don’t consider them “behind” Googlebot. They are a different beast altogether.

For your website, this means you need to check if your page loads all the pertinent information in the DOM on the first load of the page to satisfy Googlebot’s needs. For the LLM bots, to be very sure the content is available to them, check your static HTML.

Featured Image: Paulo Bobita/Search Engine Journal

Dec 23 2025

Ask An SEO: What Is The Threshold Between Keyword Stuffing & Being Optimized? via @sejournal, @rollerblader

In this week’s Ask An SEO, Bre asks:

“What is the threshold between keyword stuffing and being optimized? Is there a magic rule for how often to use your main keyword and related keywords in a 2,000-word page? Should the main keyword be in the Headers AND the body in the same section?”

Great question!

There is no such thing as “being optimized” when it comes to keywords and repetitions. This is similar to looking at “authority” scores for domains. The optimization scores you get are measurements based on what an SEO tool thinks matters, not on what the actual search engines or LLM and AI systems use. The idea of a keyword needing to be repeated comes from an SEO concept called keyword density, which is a product of SEO tools.

Each tool would have a different way to say if you repeated a word or phrase enough for it to be “SEO friendly,” and because people trust the tools, they trust that this is a valid ranking factor or signal for a search engine. It is not, because the search engines do not pay attention to how many times a word is on a page or in a paragraph, as that doesn’t produce a good experience.

Panda reduced the effectiveness of low-quality, keyword-stuffed content, and Google’s later advancements, BERT and MUM, allowed better understanding of context, relationships between terms, and the overall structure of a page. Google is now far better at interpreting meaning without relying on repeated exact-match keywords.

With that said, keywords are important.

Keywords help to send a signal to a search engine about the topic of the page. And they can be used in headers, within text, as internal links, within title tags, schema, and the URL structure. But worrying about using the keyword for SEO purposes can lead to trouble. So, let’s define keyword stuffing for the sake of this post.

Keyword stuffing is when you force a keyword or keyword phrase into content, headers, and URLs for the sole purpose of SEO.

By forcing a keyword into a post, or forcing it into headers, you hurt the user experience. Although the search engine will know what you want to rank for, the language won’t feel natural. Instead of worrying about how many times you say the keyword, think about synonyms and other ways to say things that are easy to understand. Many search engines are getting better and better at understanding how topics, words, sentences, and phrases relate to one another. You don’t have to repeat the same words over and over anymore.

If you Google the word “swimsuit,” you’ll likely see it in a couple of title tags, but also see “swimwear.” Now type in “bathing suits,” and you’ll likely not see it in many of the title tags. Instead, the title tags will say “swimwear” and other synonyms, even though “bathing suits” is a popular name for the same product.

Now try “hairdresser near me,” and you’ll likely not see “hairdresser” in a lot of the results, but you will see “hair salon” and similar types of businesses. This is because search engines produce solutions to problems, and if they understand the page has the solution, you don’t need to keep repeating keywords.

For example, instead of saying “keyword stuffing” in this post, I could say “overusing phrases for SEO.” It means the same thing. Readers of this column will get bored pretty fast if I keep saying keyword stuffing, and by mixing it up, I can keep their interest, and search engines are still able to determine it is one and the same. This also applies to header tags.

I don’t have any solid proof of this, but it seems to work well for our clients and the content we create, and it has worked for more than 10 years. If the main keyword phrase is in the H1 tag, whether it is a menu item or a blog post, we don’t worry about placing it in H2, H3, etc. I won’t be upset if the keyword shows up naturally, as that creates a good UX.

The theory here is that headers carry the theme and topic through the sections below. If the top-level header has the word “blue” in it, I make the assumption that the theme “blue” carries through the page and applies to the H2 tag, as the H2 is a sub-topic of “blue.” H2s for “blue” could be “t-shirts” and “shorts.”

If this is true, by having the H1 be “blue” and the H2 be “shorts,” a search engine will know they are “blue shorts,” and I feel very confident users will too. They clicked blue or found a SERP for blue clothing, and they clicked shorts from the menu or found them from scrolling.

If you stuff “blue” into each link and header, it is annoying for the user to see it over and over. Many sites that get penalized will have “blue cargo shorts,” “blue chino shorts,” “blue workout shorts,” etc. It looks nicer to just say the styles of shorts like “cargo” or “chino,” and search engines likely already know they’re blue because you had it in the H tag one level up. You also likely have the “blue” part in breadcrumbs, site structure, product descriptions, etc.

One thing you definitely do not want to do is have a million footer links that match the navigation or are keyword-stuffed. This worked a long time ago, but now it is just spam. It doesn’t benefit the user, and it is obvious to search engines you’re doing it for SEO. Sites that stuff keywords tend to use these outdated tactics too, so I want to include it here.

I hope this helps answer your question about overusing specific topics or phrases. Doing this only makes the tool happy; it does not mean you’ll be creating a good UX for users or search engines. If you focus on writing for your consumer and incorporate a keyword or phrase naturally, you’ll likely be rewarded.

Featured Image: Paulo Bobita/Search Engine Journal
