Web Almanac Data Reveals CMS Plugins Are Setting Technical SEO Standards (Not SEOs) via @sejournal, @chrisgreenseo

If more than half the web runs on a content management system, then the majority of technical SEO standards are being positively shaped before an SEO even starts work on a site. That’s the lens I took into the 2025 Web Almanac SEO chapter (for clarity, I co-authored the 2025 Web Almanac SEO chapter referenced in this article).

Rather than asking how individual optimization decisions influence performance, I wanted to understand something more fundamental: How much of the web’s technical SEO baseline is determined by CMS defaults and the ecosystems around them?

SEO often feels intensely hands-on – perhaps too much so. We debate canonical logic, structured data implementation, crawl control, and metadata configuration as if each site were a bespoke engineering project. But when 50%+ of pages in the HTTP Archive dataset sit on CMS platforms, those platforms become the invisible standard-setters. Their defaults, constraints, and feature rollouts quietly define what “normal” looks like at scale.

This piece explores that influence using 2025 Web Almanac and HTTP Archive data, specifically:

  • How CMS adoption trends track with core technical SEO signals.
  • Where plugin ecosystems appear to shape implementation patterns.
  • And how emerging standards like llms.txt are spreading as a result.

The question is not whether SEOs matter. It’s whether we’ve been underestimating who sets the baseline for the modern web.

The Backbone Of Web Design

The 2025 CMS chapter of the Web Almanac recorded a milestone in CMS adoption: over 50% of pages now run on a CMS. In case you were unsold on how much of the web is carried by CMSs, over 50% of 16 million websites is a significant amount.

Screenshot from Web Almanac, February 2026

As for which CMSs are the most popular, this again may not be surprising, but it is worth reflecting on which has the most impact.

Image by author, February 2026

WordPress is still the most used CMS, by a long way, even if it has dropped marginally in the 2024 data. Shopify, Wix, Squarespace, and Joomla trail a long way behind, but they still have a significant impact, especially Shopify, on ecommerce specifically.

SEO Functions That Ship As Defaults In CMS Platforms

CMS platform defaults are important because – I believe – a lot of basic technical SEO standards exist either as default setups or thanks to the relatively small number of websites that have dedicated SEOs, or people who at least build to and work with SEO best practice.

When we talk about “best practice,” we’re on slightly shaky ground, as there isn’t a universal, prescriptive view on this one, but I would consider:

  • Descriptive “SEO-friendly” URLs.
  • Editable title and meta description.
  • XML sitemaps.
  • Canonical tags.
  • Editable meta robots directives.
  • Structured data – at least a basic level.
  • Robots.txt editing.

Of the main CMS platforms, here is what they – self-reportedly – have as “default.” Note: Some platforms – like Shopify – would say they’re SEO-friendly (and, to be honest, “good enough”), but many SEOs would argue that they’re not friendly enough to pass this test. I’m not weighing in on those nuances, but I’d say both Shopify and those SEOs make some good points.

| CMS | SEO-friendly URLs | Title & meta description UI | XML sitemap | Canonical tags | Robots meta support | Basic structured data | Robots.txt |
| --- | --- | --- | --- | --- | --- | --- | --- |
| WordPress | Yes | Partial (theme-dependent) | Yes | Yes | Yes | Limited (Article, BlogPosting) | No (plugin or server access required) |
| Shopify | Yes | Yes | Yes | Yes | Limited | Product-focused | Limited (editable via robots.txt.liquid, constrained) |
| Wix | Yes | Guided | Yes | Yes | Limited | Basic | Yes (editable in UI) |
| Squarespace | Yes | Yes | Yes | Yes | Limited | Basic | No (platform-managed, no direct file control) |
| Webflow | Yes | Yes | Yes | Yes | Yes | Manual JSON-LD | Yes (editable in settings) |
| Drupal | Yes | Partial (core) | Yes | Yes | Yes | Minimal (extensible) | Partial (module or server access) |
| Joomla | Yes | Partial | Yes | Yes | Yes | Minimal | Partial (server-level file edit) |
| Ghost | Yes | Yes | Yes | Yes | Yes | Article | No (server/config level only) |
| TYPO3 | Yes | Partial | Yes | Yes | Yes | Minimal | Partial (config or extension-based) |

Based on the above, I would say that most SEO basics can be covered by most CMSs “out of the box.” Whether they work well for you, and whether you can achieve the exact configuration your specific circumstances require, are two other important questions – ones which I am not taking on. However, it often comes down to these points:

  1. It is possible for these platforms to be used badly.
  2. It is possible that the business logic you need will break/not work with the above.
  3. There are many more advanced SEO features that aren’t out of the box but are just as important.

We are talking about foundations here, but when I reflect on what shipped as “default” 15+ years ago, progress has been made.

Fingerprints Of Defaults In The HTTP Archive Data

Given that a lot of CMSs ship with these standards, do these SEO defaults correlate with CMS adoption? In many ways, yes. Let’s explore this in the HTTP Archive data.

Canonical Tag Adoption Correlates With CMS

Combining canonical tag adoption data with (all) CMS adoption over the last four years, we can see that for both mobile and desktop, the trends seem to follow each other pretty closely.

Image by author, February 2026
Image by author, February 2026

Running a simple Pearson correlation over these elements, we can see this strong correlation even more clearly, for both canonical tag implementation and the presence of self-canonical URLs.
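For anyone who wants to reproduce this kind of check, the Pearson calculation is simple. The sketch below uses only the standard library; the yearly adoption figures are illustrative placeholders, not actual HTTP Archive values.

```python
from statistics import mean, stdev

def pearson(xs, ys):
    """Sample Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length series with 2+ points")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    return cov / (stdev(xs) * stdev(ys))

# Illustrative yearly adoption percentages (placeholders, not HTTP Archive data).
cms_adoption = [45.0, 47.0, 49.0, 50.1]        # % of pages on a CMS
canonical_adoption = [58.0, 60.5, 62.0, 65.0]  # % of pages with a canonical tag

print(round(pearson(cms_adoption, canonical_adoption), 3))
```

Two series that rise together, as these do, produce a coefficient near +1; a declining series against a rising one produces a negative value, as seen with mobile canonicalized URLs.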

Image by author, February 2026

What differs is the mobile correlation of canonicalized URLs; that seems to be a negative correlation on mobile and a lower (but still positive) correlation on desktop. A drop in canonicalized pages is largely causing this negative correlation, and the reasons behind this could be many (and harder to be sure of).

Canonical tags are a crucial element for technical SEO; their continued adoption does certainly seem to track the growth in CMS use, too.
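The self-canonical vs. canonicalized distinction used in this analysis can be sketched as a small classifier. This is a simplified illustration (a real pipeline would normalize URLs far more carefully), and the URLs are hypothetical.

```python
from html.parser import HTMLParser

class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag on a page."""
    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "link" and (a.get("rel") or "").lower() == "canonical" and a.get("href"):
            self.canonicals.append(a["href"])

def canonical_status(page_url, html):
    """Classify a page as 'none', 'self', or 'canonicalized' (pointing elsewhere)."""
    finder = CanonicalFinder()
    finder.feed(html)
    if not finder.canonicals:
        return "none"
    norm = lambda u: u.rstrip("/")  # minimal normalization, for illustration only
    return "self" if norm(finder.canonicals[0]) == norm(page_url) else "canonicalized"

html = '<html><head><link rel="canonical" href="https://example.com/page"></head></html>'
print(canonical_status("https://example.com/page", html))
```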

Schema.org Data Types Correlate With CMS

Schema.org types against CMS adoption show similar trends, but are less definitive overall. There are many different types of Schema.org, but if we plot CMS adoption against the ones most common to SEO concerns, we can observe a broadly rising picture.

Image by author, February 2026

With the exception of Schema.org WebSite, we can see CMS growth and structured data following similar trends.

But we must note that Schema.org adoption is considerably lower than CMS adoption overall. This could be because most CMS defaults are far less comprehensive with Schema.org. When we look at specific CMS examples (shortly), we’ll see far stronger links.

Schema.org implementation is still mostly intentional, specialist, and not as widespread as it could be. If I were a search engine or creating an AI Search tool, would I rely on universal adoption of these, seeing the data like this? Possibly not.
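For reference, a “basic level” of structured data usually means a small JSON-LD block in the page head, along the lines of the sketch below. This is a generic illustration with placeholder values, not any particular CMS’s actual output.

```python
import json

# A minimal schema.org Article block, similar in shape to what CMS defaults emit.
# All values are hypothetical placeholders.
article = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Example headline",
    "datePublished": "2026-02-01",
    "author": {"@type": "Person", "name": "Example Author"},
}
script_tag = '<script type="application/ld+json">%s</script>' % json.dumps(article)
print(script_tag)
```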

Robots.txt

Given that robots.txt is a single file that has some agreed standards behind it, its implementation is far simpler, so we could anticipate higher levels of adoption than Schema.org.

The presence of a robots.txt is pretty important, mostly to limit search engine crawling to specific areas of the site. We are starting to see an evolution – as we noted in the 2025 Web Almanac SEO chapter – with robots.txt used more as a governance piece rather than just housekeeping. A key sign that we’re using our core tools differently in the AI search world.

But before we consider the more advanced implementations, how much of a part does a CMS play in ensuring a robots.txt is present? Over the last four years, CMS platforms appear to be driving significantly more robots.txt files that serve a 200 response:

Image by author, February 2026

What is more curious, however, is the size of the robots.txt files. Non-CMS platforms have robots.txt files that are significantly larger.

Image by author, February 2026

Why could this be? Are non-CMS robots.txt files more advanced – longer files, more bespoke rules? Probably in some cases, but we’re missing another impact of a CMS’s standards: compliant (valid) robots.txt files.

A lot of robots.txt files serve a valid 200 response, but often they’re not txt files, or they’re redirecting to 404 pages or similar. When we limit this list to only files that contain user-agent declarations (as a proxy), we see a different story.
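The user-agent proxy used here can be sketched roughly as below. It is a heuristic, not a full robots.txt parser, and the sample responses are invented.

```python
def looks_like_robots_txt(body, content_type=""):
    """Heuristic: does a 200 response actually look like a robots.txt file?

    A genuine robots.txt should contain at least one user-agent declaration;
    HTML error pages and soft 404s served at /robots.txt will not.
    """
    if "html" in content_type.lower():
        return False
    return any(
        line.split(":", 1)[0].strip().lower() == "user-agent"
        for line in body.splitlines()
    )

valid = "User-agent: *\nDisallow: /private/\n"
invalid = "<html><body><h1>Page not found</h1></body></html>"
print(looks_like_robots_txt(valid), looks_like_robots_txt(invalid, "text/html"))
```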

Image by author, February 2026

Approaching 14% of robots.txt files served on non-CMS platforms are likely not even robots.txt files.

A robots.txt is easy to set up, but it is a conscious decision. If it’s forgotten or overlooked, it simply won’t exist. A CMS makes a robots.txt more likely to exist and, once it is in place, easier to manage and maintain – which IS key.

WordPress Specific Defaults

CMS platforms, it seems, cover the basics, but more advanced options – which arguably should also ship as defaults – often require additional SEO tools to enable.

Interrogating WordPress-specific sites within the HTTP Archive data is easiest, as we get the largest sample, and the Wappalyzer data gives a reliable way to judge the impact of WordPress-specific SEO tools.

From the Web Almanac, we can see which SEO tools are the most installed on WordPress sites.

Screenshot from Web Almanac, February 2026

For anyone working within SEO, this is unlikely to be surprising. If you are an SEO and have worked on WordPress, there is a high chance you have used one of the top three. What IS worth considering right now is that while Yoast SEO is by far the most prevalent within the data, it is seen on barely over 15% of sites. Even the most popular SEO plugin on the most popular CMS still has a relatively small share.

Of these top three plugins, let’s first consider how their “defaults” differ. These are similar to some of WordPress’s own, but we can see many more advanced features that come as standard.

| SEO Capability | All-in-One SEO | Yoast SEO | Rank Math |
| --- | --- | --- | --- |
| Title tag control | Yes (global + per-post) | Yes | Yes |
| Meta description control | Yes | Yes | Yes |
| Meta robots UI | Yes (index/noindex etc.) | Yes | Yes |
| Default meta robots output | Explicit index,follow | Explicit index,follow | Explicit index,follow |
| Canonical tags | Auto self-canonical | Auto self-canonical | Auto self-canonical |
| Canonical override (per URL) | Yes | Yes | Yes |
| Pagination canonical handling | Limited | Historically opinionated | More configurable |
| XML sitemap generation | Yes | Yes | Yes |
| Sitemap URL filtering | Basic | Basic | More granular |
| Inclusion of noindex URLs in sitemap | Possible by default | Historically possible | Configurable |
| Robots.txt editor | Yes (plugin-managed) | Yes | Yes |
| Robots.txt comments/signatures | Yes | Yes | Yes |
| Redirect management | Yes | Limited (free) | Yes |
| Breadcrumb markup | Yes | Yes | Yes |
| Structured data (JSON-LD) | Yes (templated) | Yes (templated) | Yes (templated, broad) |
| Schema type selection UI | Yes | Limited | Extensive |
| Schema output style | Plugin-specific | Plugin-specific | Plugin-specific |
| Content analysis/scoring | Basic | Heavy (readability + SEO) | Heavy (SEO score) |
| Keyword optimization guidance | Yes | Yes | Yes |
| Multiple focus keywords | Paid | Paid | Free |
| Social metadata (OG/Twitter) | Yes | Yes | Yes |
| Llms.txt generation | Yes – enabled by default | Yes – one-check enable | Yes – one-check enable |
| AI crawler controls | Via robots.txt | Via robots.txt | Via robots.txt |

Editable metadata, structured data, robots.txt, sitemaps, and, more recently, llms.txt are the most notable. It is worth noting that a lot of the functionality is more “back-end,” so not something we’d be as easily able to see in the HTTP Archive data.

Structured Data Impact From SEO Plugins

We can see (above) that structured data implementation and CMS adoption do correlate; what is more interesting here is to understand where the key drivers themselves are.

Viewing the HTTP Archive data with a simple segment (SEO plugins vs. no SEO plugins), the most recent data paints a stark picture.

Image by author, February 2026

When we limit the Schema.org @types to those most associated with SEO, it is really clear that some structured data types are pushed hard by SEO plugins. They are not completely absent elsewhere – people may be using lesser-known plugins or coding their own solutions – but ease of implementation is implicit in the data.

Robots Meta Support

Another finding from the SEO Web Almanac 2025 chapter was that “follow” and “index” directives were the most prevalent, even though they’re technically redundant, as having no meta robots directives is implicitly the same thing.

Screenshot from Web Almanac 2025, February 2026

Within the chapter’s number crunching itself, I didn’t dig much deeper, but knowing that all major WordPress SEO plugins have “index,follow” as the default, I was eager to see if I could make a stronger connection in the data.

Where SEO plugins were present on WordPress, “index, follow” was set on over 75% of root pages, vs. under 5% of WordPress sites without SEO plugins.

Image by author, February 2026

Given the ubiquity of WordPress and SEO plugins, this is likely a huge contributor to this particular configuration. While redundant, it isn’t wrong, but it is – again – a key example of how, when one or more of the main plugins establishes a de facto standard like this, it really shapes a significant portion of the web.
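To make the redundancy concrete: a meta robots tag whose directives merely restate crawler defaults adds nothing, because omitting the tag entirely has the same effect. A minimal sketch of that check:

```python
# Directives that merely restate crawler defaults; omitting the meta robots
# tag entirely has the same effect as declaring these.
DEFAULT_DIRECTIVES = {"index", "follow"}

def is_redundant_meta_robots(content):
    """True if a meta robots content attribute only restates the defaults."""
    directives = {d.strip().lower() for d in content.split(",") if d.strip()}
    return bool(directives) and directives <= DEFAULT_DIRECTIVES

print(is_redundant_meta_robots("index, follow"))   # restates defaults
print(is_redundant_meta_robots("noindex, follow")) # carries a real directive
```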

Diving Into LLMs.txt

Another key area of change in the 2025 Web Almanac was the introduction of the llms.txt file – not an explicit endorsement of the file, but rather a tacit acknowledgment that it is an important data point in the AI Search age.

From the 2025 data, just over 2% of sites had a valid llms.txt file and:

  • 39.6% of llms.txt files are related to All-in-One SEO.
  • 3.6% of llms.txt files are related to Yoast SEO.

This is not necessarily an intentional act by all those involved, especially as All-in-One SEO enables this by default (not an opt-in like Yoast and Rank Math).

Image by author, February 2026

Since the first data was gathered on July 25, 2025, a month-by-month view shows further growth. It is hard not to see this as growing confidence in this markup or, at least, that it is so easy to enable that more people are likely hedging their bets.
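For context, llms.txt is a plain Markdown file served from the site root. A minimal, hypothetical example is below; the exact sections and links a plugin generates will vary, and the site and URLs here are placeholders.

```markdown
# Example Store

> A short, plain-language summary of what this site is about and who it serves.

## Key pages

- [Products](https://example.com/products): Full catalog overview
- [Shipping policy](https://example.com/shipping): Delivery times and costs
```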

Conclusion

The Web Almanac data suggests that SEO, at a macro level, moves less because of individual SEOs and more because WordPress, Shopify, Wix, or a major plugin ships a default.

  • Canonical tags correlate with CMS growth.
  • Robots.txt validity improves with CMS governance.
  • Redundant “index,follow” directives proliferate because plugins make them explicit.
  • Even llms.txt is already spreading through plugin toggles before it has anything close to full consensus.

This doesn’t diminish the impact of SEO; it reframes it. Individual practitioners still create competitive advantage, especially in advanced configuration, architecture, content quality, and business logic. But the baseline state of the web, the technical floor on which everything else is built, is increasingly set by product teams shipping defaults to millions of sites.

Perhaps we should consider that if CMSs are the infrastructure layer of modern SEO, then plugin creators are de facto standard-setters. They deploy “best practice” before it becomes doctrine.

This is how it should work, but I am also not entirely comfortable with it. Plugins normalize implementation and even create new conventions simply by making them zero-cost. Redundant standards can endure simply because they can.

So the question is less about whether CMS platforms impact SEO. They clearly do. The more interesting question is whether we, as SEOs, are paying enough attention to where those defaults originate, how they evolve, and how much of the web’s “best practice” is really just the path of least resistance shipped at scale.

An SEO’s value should not be measured by the number of hours they spend discussing canonical tags, meta robots, and rules of sitemap inclusion. This should be standard and default. If you want to have an outsized impact on SEO, lobby an existing tool, create your own plugin, or drive interest to influence change in one.



Featured Image: Prostock-studio/Shutterstock

Agentic Commerce Optimization: A Technical Guide To Prepare For Google’s UCP via @sejournal, @alexmoss

In January, I wrote about the birth of agentic commerce through both Agentic Commerce Protocol (ACP) and Universal Commerce Protocol (UCP), and how this could impact us all as consumers, business owners, and SEOs. As we still sit on waitlists for both, this doesn’t mean that we can’t prepare for it.

UCP fixes a real-life problem for many, minimizing the fragmented commerce journey. Instead of building separate integrations for every agent platform, as we have mostly been doing in the past, you can now (theoretically) integrate once and work seamlessly with other tools and platforms.

But note here that, as opposed to ACP, which focuses more on the checkout → fulfillment → payment journey, UCP goes beyond this with six capabilities covering the entire commerce lifecycle.

This, of course, will impact an SEO’s ambit. As we shift from optimizing for clicks to optimizing for selection, we also need to ensure that it’s you/your client that is selected through data integrity, product signals, and AI-readable commerce capabilities. Structured data has always served an important role for the internet as a whole and will continue to be the driving force on how you can serve agents, crawlers, and humans in the best way possible.

I allude to a possible new acronym, “ACO” – Agentic Commerce Optimization – and the following could be considered the closest we can get to guidelines on how to undertake it.

UCP Isn’t Coming, It’s Here

UCP was only announced in January, but there’s already confirmation that its capabilities are rolling out. On Feb. 11, 2026, Vidhya Srinivasan (VP/GM of Advertising & Commerce at Google) announced that Wayfair and Etsy now use UCP so that you can purchase directly within AI Mode; the rollout was observed in the wild the next day by Brodie Clark.

UCP’s Six Layered Capabilities

On the day UCP was released, Google explained its methodology.

From this, I defined six core capabilities:

  1. Product Discovery – how agents find and surface your inventory during research.
  2. Cart Management – multi-item baskets, dynamic pricing, complex basket rules.
  3. Identity Linking – OAuth 2.0 authorization for personalized experiences and loyalty.
  4. Checkout – session creation, tax calculation, payment handling.
  5. Order Management – webhook-based lifecycle and logistical updates.
  6. Vertical Capabilities – extensible modules for specialized use cases like travel booking windows or subscription schedules.

UCP’s schema authoring guide shows how capabilities are defined through versioned JSON schemas, which act as the foundation of the protocol. When it comes to considering this as an SEO, properties such as offers, aggregateRating, and shippingDetails aren’t just for surfacing rich snippets, etc., for product discovery, they’re now what agents query during the entire process.

Schema Is, And Will Continue To Be, Essential

UCP’s technical specification uses its own JSON schema-based vocabulary. While UCP doesn’t build on schema.org directly, schema.org remains critical in the broader ecosystem. As Pascal Fleury said at Google Search Central Live in December, “schema is the glue that binds all these ontologies together.” UCP handles the transaction; schema.org helps agents decide who to transact with.

Ensure you’re on top of product schema and populate it as fully as you can. It may seem like SEO 101; regardless, audit all of this now to ensure you’re not missing anything when UCP fully rolls out.

This includes checks on:

  • Product schema (with complete coverage): All core fields: name, description, SKU, GTIN, brand, related images, and offers.
  • Offers must include: Price, priceCurrency, availability, URL, seller. Add aggregateRating and review to ensure you have positive third-party perspective.
  • Ensure all product variants output correctly.
  • Include shippingDetails with delivery estimates.
  • Organization and Brand: Assists with “Merchant of Record” verification. If you’re not an Organization, then fall back to Person.
  • Designated FAQPage: Ensure you have an FAQPage, as these can be incorporated alongside product-level FAQs and used as part of the agent’s decision-making.
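The checklist above can be pulled together into a single JSON-LD Product block. The sketch below is illustrative only – every value is a hypothetical placeholder – but the property names follow schema.org’s Product and Offer vocabulary.

```python
import json

# Hypothetical values throughout; swap in your real product data.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Cloud Runner Shoe",
    "description": "Lightweight road-running shoe.",
    "sku": "CRS-001",
    "gtin13": "0123456789012",
    "brand": {"@type": "Brand", "name": "ExampleBrand"},
    "image": ["https://example.com/img/crs-001.jpg"],
    "aggregateRating": {"@type": "AggregateRating", "ratingValue": "4.6", "reviewCount": "128"},
    "offers": {
        "@type": "Offer",
        "price": "129.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
        "url": "https://example.com/products/crs-001",
        "seller": {"@type": "Organization", "name": "Example Store"},
        "shippingDetails": {
            "@type": "OfferShippingDetails",
            "deliveryTime": {
                "@type": "ShippingDeliveryTime",
                "transitTime": {
                    "@type": "QuantitativeValue",
                    "minValue": 2, "maxValue": 5, "unitCode": "DAY",
                },
            },
        },
    },
}
print(json.dumps(product, indent=2))
```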

Prepare Your Merchant Center Feed

UCP will utilize your existing Merchant Center feed as the discovery layer. This means that beyond the normal on-site schema you provide, Merchant Center itself requires more details that you can populate within its platform.

  • Return policies (required to be a Merchant of Record): Complete all return costs, return windows, and policy links. These are used not just within the checkout and transactional areas, but are again a consideration for being selected at all. Advanced accounts need policies at each sub-account level.
  • Customer support information: Not only would initial information be offered to the customer, but entry-level support queries may be handled entirely by the agent, increasing customer satisfaction while freeing up customer support capacity.
  • Agentic checkout eligibility: Add the native_commerce attribute to your feed, as products are only eligible here if this is set up.
  • Product identifiers: Each product must have an ID that correlates to the product ID used with the checkout API.
  • Product consumer warnings: Any product warning should assert the consumer_notice attribute.

Google recommends that this be done through a supplemental data source in Merchant Center rather than modifying your primary feed, which would prevent incorrect formatting or other invalidation.
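A supplemental source is typically a small feed keyed on product ID. The sketch below is an assumption-laden illustration: the `native_commerce` attribute name comes from the text above, but the accepted values, column names, and exact format should be verified against Google’s Merchant Center documentation before uploading anything.

```python
import csv
import io

# Build a hypothetical tab-delimited supplemental feed that layers the agentic
# checkout attribute onto existing products without touching the primary feed.
buf = io.StringIO()
writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
writer.writerow(["id", "native_commerce"])
for product_id in ["CRS-001", "CRS-002"]:  # hypothetical product IDs
    writer.writerow([product_id, "true"])   # value format is an assumption
print(buf.getvalue())
```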

Lastly, double-check that the products you’re selling aren’t included in its product restrictions list; if you do offer restricted items, consider how to manage them alongside the abilities of UCP.

Optimizing Conversational Commerce Attributes

Within the UCP blog post announcement, Srinivasan introduced a way for more clarity with conversational commerce attributes:

“…we’re announcing dozens of new data attributes in Merchant Center designed for easy discovery in the conversational commerce era, on surfaces like AI Mode, Gemini and Business Agent. These new attributes complement retailers’ existing data feeds and go beyond traditional keywords to include things like answers to common product questions, compatible accessories or substitutes.”

These provide further clarity (and therefore minimize hallucinations) during the discovery process in order to be selected or ruled out.

Not only would this incorporate product and brand-related FAQs, but take this a step further to also consider:

  • Compatibility: Potential up-sell opportunities.
  • Substitution: An opportunity for dealing with out-of-stock items.
  • Related products: Great for cross-sell opportunities.

Furthermore, this can be used to become even more specific, moving beyond basic attributes to agent-parseable details. Now, if a product is “purple” on a basic level, “dark purple” or even something unobvious, such as “Wolf” (real example below), may be more appropriate for finer detail while still falling under “purple.” The same can be considered for sizes, materials (or a mixture of materials), etc.

Multi-Modal Fan-Out Selection

When executed well, optimizing for conversational commerce attributes will increase the possibility of selection within fan-out query results. When considering some of these attributes, it is worth looking at tools, such as WordLift’s Visual Fan-Out simulator, which illustrates how a single image decomposes into multiple search intents, revealing which attributes agents may prioritize when performing query fan-out. But how would this look?

As an example, I used one product image and drilled down through three horizons, using On’s Cloudsurfer Max (used with permission):

Cloudsurfer Max in the colour “Wolf”
Image credit: On

Using just one product image, this is what is presented on the surface:

Screenshot from WordLift’s Visual Fan-Out simulator, February 2026

The simulator immediately recognized that the product was On’s, and specifically from the Cloudsurfer range. Great start! Now let’s see what it sees over the horizon:

Screenshot from WordLift’s Visual Fan-Out simulator, February 2026
Screenshot from WordLift’s Visual Fan-Out simulator, February 2026
Screenshot from WordLift’s Visual Fan-Out simulator, February 2026

Here, you can draw inspiration or direction on how best to place yourself for potential and likely fan-out queries. With this example, I found it interesting that Horizon 2 mentions performance running gear as a large category; performing fan-out on that then surfaced related products around gear in general. This shows how widely LLMs consider selection and how you can present attributes to attract selection.

UCP’s Roadmap Is Expanding Into Multi-Verticals

UCP already plans to go beyond a single purchase, expanding beyond retail into travel, services, and other verticals. Its roadmap details several priorities over the coming year, including:

  • Multi‑item carts and complex baskets: Moving beyond single‑item checkout to native multi‑item carts, bundling, promotions, tax/shipping logic, and more realistic fulfillment handling.
  • Loyalty and account linking: Standardized loyalty program management and account linking so agents can apply points, member pricing, and benefits across merchants.
  • Post‑purchase support: Support for order tracking, returns, and customer‑service handoff so agents can manage customer support post-sale.
  • Personalization signals: Richer signals for cross‑sell/upsell, wishlists, history, and context‑based recommendations.
  • New verticals: Expansion beyond retail into travel, services, digital goods, and food/restaurant use cases via extensions to the protocol.

Each of the points above is worth further reading and consideration if this is something your brand may offer. And if you’re working within travel, services, digital goods, or hospitality, you need to be even more prepared to ensure eligibility as the protocol expands beyond retail.

Social Proof And Third-Party Perspective

Regardless of how well you may optimize on-site to prepare for UCP, all this data integrity still needs to be validated by trusted third-party sources.

Third-party platforms, such as Trustpilot and G2, appear to be frequently cited and trusted among most of the LLMs, so I’d still advise that you continue to collect those positive brand and product reviews in order to satisfy consensus, resulting in more opportunities to be selected during product discovery.

TL;DR – Prepare Now

If you own or manage any form of ecommerce site, now is the time to ensure you’re preparing for UCP’s rollout as soon as possible. It’s only a matter of time, and with AI Mode spreading into default experiences, getting ahead of the rollout is essential.

  1. Join the UCP waitlist.
  2. Prepare Merchant Center: return policies, native_commerce attribute.
  3. Ensure your developers research and understand the UCP documentation.
  4. Populate conversational attributes: question-answers, compatibility, substitutes.
  5. Audit and improve any schema where applicable.

This is moving faster than most previous commerce shifts, and brands that wait for full rollout signals will already be behind. This isn’t a short-term LLM gimmick but is part of the largest change in the ecommerce space.



Featured Image: Roman Samborskyi/Shutterstock

4 Pillars To Turn Your “Sticky-Taped” Tech Stack Into a Modern Publishing Engine

This post was sponsored by WP Engine. The opinions expressed in this article are the sponsor’s own.

In the race for audience attention, digital marketers at media companies often have one hand tied behind their backs. The mission is clear: drive sustainable revenue, increase engagement, and stay ahead of technological disruptions such as LLMs and AI agents.

Yet, for many media organizations, execution is throttled by a “sticky-taped stack”: a fragile patchwork of legacy CMS structures and ad-hoc plugins. For a digital marketing leader, this isn’t just a technical headache; it’s a direct hit to the bottom line.

It’s time to examine the Fragmentation Tax, and why a new publishing standard is required to reclaim growth.

Fragmentation Tax: How A Siloed CMS, Disconnected Data & Tech Debt Are Costing You Growth

The Fragmentation Tax is the hidden cost of operational inefficiency. It drains budgets, burns out teams, and stunts the ability to scale. For digital marketing and growth leads, this tax is paid in three distinct “currencies”:

1. Siloed Data & Strategic Blindness.

When your ad server, subscriber database, and content tools exist as siloed work streams, you lose the ability to see the full picture of the reader’s journey.

Without integrated attribution, marketers are forced to make strategic pivots based on vanity metrics like generic pageviews rather than true business intelligence, such as conversion funnels or long-term reader retention.

2. The Editorial Velocity Gap.

In the era of breaking news, being second is often the same as being last. If an editorial team is forced into complex, manual workflows because of a fragmented tech stack, content reaches the market too late to capture peak search volume or social trends. This friction creates a culture of caution precisely when marketing needs a culture of velocity to capture organic traffic.

3. Tech Debt vs. Innovation.

Tech debt is the future cost of rework created by choosing “quick-and-dirty” solutions. This is a silent killer of marketing budgets. Every hour an engineering team spends fixing plugin conflicts or managing security fires caused by a cobbled-together infrastructure is an hour stolen from innovation.

The 4 Publishing Pillars That Improve SEO & Monetization

To stop paying this tax, media organizations are moving away from treating their workflows as a collection of disparate parts. Instead, they are adopting a unified system that eliminates the friction between engineering, editorial, and growth.

A modern publishing standard addresses these marketing hurdles through four key operational pillars:

Pillar 1: Automated Governance (Built-In SEO & Tracking Integrity)

Marketing integrity relies on consistency.

In a fragmented system, SEO metadata, tracking pixels, and brand standards are often managed manually, leading to human error.

A unified approach embeds governance directly into the workflow.

By using automated checklists, organizations ensure that no article goes live until it meets defined standards, protecting the brand and ensuring every piece of content is optimized for discovery from the moment of publication.

Pillar 2: Fearless Iteration (Continuous SEO & CRO Optimization Without Risk)

High-traffic articles are a marketer’s most valuable asset. However, in a legacy stack, updating a live story to include, for instance, a Call-to-Action (CTA), is often a high-risk maneuver that could break site layouts.

A modern unified approach allows for “staged” edits, enabling teams to draft and review iterations on live content without forcing those changes live immediately. This allows for a continuous improvement cycle that protects the user experience and site uptime.

Pillar 3: Cross-Functional Collaboration (Reducing Workflow Bottlenecks Between Editorial, SEO & Engineering)

Any type of technology disruption requires a team to collaborate in real-time. The “Sticky-taped” approach often forces teams to work in separate tools, creating bottlenecks.

A modern unified standard utilizes collaborative editing, separating editorial functions into distinct areas for text, media, and metadata. This allows an SEO specialist or a growth marketer to optimize a story simultaneously with the journalist, ensuring the content is “market-ready” the instant it’s finished.

Pillar 4: Native Breaking News Capabilities (Capturing Real-Time Search Demand)

Late-breaking or real-time events, such as global geopolitical shifts or live sports, require in-the-moment storytelling to keep audiences informed, engaged, and on-site. Traditionally, “Live Blogs” relied on clunky third-party embeds that fragmented user data and slowed page loads.

A unified standard treats breaking news as a native capability, enabling rapid-fire updates that keep the audience glued to the brand’s own domain, maximizing ad impressions and subscription opportunities.

Conclusion: Trading Toil for Agility

Ultimately, shifting to a unified standard is about reducing inefficiencies caused by “fighting the tools.” By removing the technical toil that typically hides insights in siloed tools, media organizations can finally trade operational friction for strategic agility.

When your site’s foundation is solid and fast, editors can hit “publish” without worrying about things breaking. At the same time, marketers can test new ways to grow the audience without waiting weeks for developers to update code. This setup clears the way for everyone to move faster and focus on what actually matters: telling great stories and connecting with readers.

The era of stitching software together with “sticky tape” is over. For modern media companies to thrive amid constant digital disruption, infrastructure must be a launchpad, not a hindrance. By eliminating the Fragmentation Tax, marketing leaders can finally stop surviving and start growing.

Jason Konen is director of product management at WP Engine, a global web enablement company that empowers companies and agencies of all sizes to build, power, manage, and optimize their WordPressⓇ websites and applications with confidence.

Image Credits

Featured Image: Image by WP Engine. Used with permission.

In-Post Images: Image by WP Engine. Used with permission.

Hidden HTTP Page Can Cause Site Name Problems In Google via @sejournal, @MattGSouthern

Google’s John Mueller shared a case where a leftover HTTP homepage was causing unexpected site-name and favicon problems in search results.

The issue, which Mueller described on Bluesky, is easy to miss because Chrome automatically upgrades HTTP requests to HTTPS, hiding the HTTP version from normal browsing.

What Happened

Mueller described the case as “a weird one.” The site used HTTPS, but a server-default HTTP homepage was still accessible at the HTTP version of the domain.

Mueller wrote:

“A hidden homepage causing site-name & favicon problems in Search. This was a weird one. The site used HTTPS, however there was a server-default HTTP homepage remaining.”

The tricky part is that Chrome can upgrade HTTP navigations to HTTPS, which makes the HTTP version easy to miss in normal browsing. Googlebot doesn’t follow Chrome’s upgrade behavior.

Mueller explained:

“Chrome automatically upgrades HTTP to HTTPS so you don’t see the HTTP page. However, Googlebot sees and uses it to influence the sitename & favicon selection.”

Google’s site name system pulls the name and favicon from the homepage to determine what to display in search results. The system reads structured data from the website, title tags, heading elements, og:site_name, and other signals on the homepage. If Googlebot is reading a server-default HTTP page instead of the actual HTTPS homepage, it’s working with the wrong signals.

How To Check For This

Mueller suggested two ways to see what Googlebot sees.

First, he joked that you could use AI. Then he corrected himself.

Mueller wrote:

“No wait, curl on the command line. Or a tool like the structured data test in Search Console.”

Running curl http://yourdomain.com from the command line would show the raw HTTP response without Chrome’s auto-upgrade. If the response returns a server-default page instead of your actual homepage, that’s the problem.

If you want to see what Google retrieved and rendered, use the URL Inspection tool in Search Console and run a Live Test. Google’s site name documentation also notes that site names aren’t supported in the Rich Results Test.
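The comparison Mueller describes can be scripted. A minimal sketch (the regexes are deliberately simplified, not a full HTML parser) that flags when the page served at the HTTP URL would send Google different site-name signals than the HTTPS homepage:

```python
import re


def extract_site_signals(html: str) -> dict:
    """Pull the signals Google's site-name system reads from a homepage:
    the <title> and the og:site_name meta property."""
    title = re.search(r"<title[^>]*>(.*?)</title>", html, re.I | re.S)
    og = re.search(
        r'<meta[^>]+property=["\']og:site_name["\'][^>]+content=["\']([^"\']+)',
        html,
        re.I,
    )
    return {
        "title": title.group(1).strip() if title else None,
        "og:site_name": og.group(1).strip() if og else None,
    }


def signals_mismatch(http_html: str, https_html: str) -> bool:
    """True when the HTTP and HTTPS homepages would send Google
    different site-name signals."""
    return extract_site_signals(http_html) != extract_site_signals(https_html)
```

Feed it the response bodies from curl http://yourdomain.com and curl https://yourdomain.com; a mismatch means the two versions are advertising different names to Googlebot.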

Why This Matters

The display of site names and favicons in search results is something we’ve been documenting since Google first replaced title tags with site names in 2022. Since then, the system has gone through multiple growing pains. Google expanded site name support to subdomains in 2023, then spent nearly a year fixing a bug where site names on internal pages didn’t match the homepage.

This case introduces a new complication. The problem wasn’t in the structured data or the HTTPS homepage itself. It was a ghost page in the HTTP version, which you’d have no reason to check because your browser never showed it.

Google’s site name documentation explicitly mentions duplicate homepages, including HTTP and HTTPS versions, and recommends using the same structured data for both. Mueller’s case shows what can go wrong when an HTTP version contains content different from the HTTPS homepage you intended to serve.

The takeaway for troubleshooting site-name or favicon problems in search results is to check the HTTP version of your homepage directly. Don’t rely on what Chrome shows you.

Looking Ahead

Google’s site name documentation specifies that WebSite structured data must be on “the homepage of the site,” defined as the domain-level root URI. For sites running HTTPS, that means the HTTPS homepage is the intended source.

If your site name or favicon looks wrong in search results and your HTTPS homepage has the correct structured data, check whether an HTTP version of the homepage still exists. Use curl or the URL Inspection tool’s Live Test to view it directly. If a server-default page is sitting there, removing it or redirecting HTTP to HTTPS at the server level should resolve the issue.
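If a server-default page is indeed answering on port 80, a server-level redirect removes it. A sketch for nginx (Apache and other servers have equivalents; example.com is a placeholder for your domain):

```nginx
# Listen on port 80 only to redirect; the server-default page never renders.
server {
    listen 80;
    listen [::]:80;
    server_name example.com www.example.com;
    return 301 https://$host$request_uri;
}
```

After deploying, re-run the curl check to confirm the HTTP URL now returns a 301 to the HTTPS homepage rather than a page body.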

Google’s Crawl Team Filed Bugs Against WordPress Plugins via @sejournal, @MattGSouthern

Google’s crawl team has been filing bugs directly against WordPress plugins that waste crawl budget at scale.

Gary Illyes, Analyst at Google, shared the details on the latest Search Off the Record podcast. His team filed an issue against WooCommerce after identifying its add-to-cart URL parameters as a top source of crawl waste. WooCommerce picked up the bug and fixed it quickly.

Not every plugin developer has been as responsive. An issue filed against a separate action-parameter plugin is still sitting unclaimed. And Google says its outreach to the developer of a commercial calendar plugin that generates infinite URL paths fell on deaf ears.

What Google Found

The details come from Google’s internal year-end crawl issue report, which Illyes reviewed during the podcast with fellow Google Search Relations team member Martin Splitt.

Action parameters accounted for roughly 25% of all crawl issues reported in 2025. Only faceted navigation ranked higher, at 50%. Together, those two categories represent about three-quarters of every crawl issue Google flagged last year.

The problem with action parameters is that each one creates what appears to be a new URL by adding text like ?add_to_cart=true. Parameters can stack, doubling or tripling the crawlable URL space on a site.

Illyes said these parameters are often injected by CMS plugins rather than built intentionally by site owners.

The WooCommerce Fix

Google’s crawl team filed a bug report against the plugin, flagging the add-to-cart parameter behavior as a source of crawl waste affecting sites at scale.

Illyes describes how they identified the issue:

“So we would try to dig into like where are these coming from and then sometimes you can identify that perhaps these action parameters are coming from WordPress plug-ins because WordPress is quite a popular CMS content management system. And then you would find that yes, these plugins are the ones that add to cart and add to wish list.

“And then what you would do if you were a Gary is to try to see if they are open source in the sense that they have a repository where you can report bugs and issues and in both of these cases the answer was yes. So we would file issues against these plugins.”

WooCommerce responded and shipped a fix. Illyes noted the turnaround was fast, but other plugin developers with similar issues haven’t responded. Illyes didn’t name the other plugins.

He added:

“What I really, really loved is that the good folks at WooCommerce almost immediately picked up the issue and they solved it.”

Why This Matters

This is the same URL parameter problem Illyes warned about before and continued flagging. Google then formalized its faceted navigation guidelines into official documentation and revised its URL parameter best practices.

The data shows those warnings and documentation updates didn’t solve the problem because the same issues still dominate crawl reports.

The crawl waste is often baked into the plugin layer. That creates a real bind for websites with ecommerce plugins. Your crawl problems may not be your fault, but they’re still your responsibility to manage.

Illyes said Googlebot can’t determine whether a URL space is useful “unless it crawled a large chunk of that URL space.” By the time you notice the server strain, the damage is already happening.

Google consistently recommends robots.txt, as blocking parameter URLs proactively is more effective than waiting for symptoms.
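A robots.txt sketch for blocking action-parameter URL spaces proactively. The parameter names below mirror the add-to-cart example from the podcast and are illustrative; check your own crawl logs for the exact parameters your plugins emit:

```text
# robots.txt — keep crawlers out of action-parameter URL spaces.
# Parameter names are examples; match them to what your plugins generate.
User-agent: *
Disallow: /*?*add_to_cart=
Disallow: /*?*add-to-cart=
Disallow: /*?*add_to_wishlist=
```

Because parameters can stack, wildcard rules like these catch the parameter anywhere in the query string, not just as the first one.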

Looking Ahead

Google filing bugs against open-source plugins could help reduce crawl waste at the source. The full podcast episode with Illyes and Splitt is available with a transcript.

What is NLWeb (Natural Language Web)?

Natural language is quickly becoming the default way people interact with online tools. Instead of typing a few keywords, users now ask full questions, give detailed instructions, and are starting to expect clear, conversational answers. So, how can you make sure your content provides the answer to their question? Or better yet, how can you make it possible for them to interact with your website in a similar way? That’s where Microsoft’s NLWeb comes in. 

Meet NLWeb, Microsoft’s new open project

NLWeb, short for Natural Language Web, is an open project recently launched by Microsoft. The aim of this project is to bring conversational interfaces directly to websites, rather than users having to use an external chatbot that’s in control of what’s shown. Instead of relying on traditional navigation or search bars, NLWeb is designed to allow users to ask questions and explore content in a more personal, conversational way. 

At its core, NLWeb connects website content to AI-powered tools. It enables AI to understand what a website is about, what information it contains, and how that information should be interpreted for the purpose of returning personalized results. With this project, Microsoft is moving toward a more interoperable, standards-based, and open web that allows everyone to prepare their website for the future of search.  

This project was initiated and realized by R.V. Guha, CVP and Technical Fellow at Microsoft. Guha is one of the creators of widely used web standards such as RSS and Schema.org.  

How NLWeb works

NLWeb works by combining structured data, standardized APIs and AI models capable of understanding natural language. Every NLWeb instance acts as a Model Context Protocol (MCP) server, which makes your content discoverable for all the agents operating in the MCP ecosystem. This makes it easy for these agents to find your website.  

Using structured data, website owners then present their content in a machine-readable way. AI applications can then consume this data and answer user questions accurately by matching them to the most relevant information. The result is a conversational experience powered by existing content, delivered either directly on a website or through an online search tool: a conversational interface for both human users and the AI agents collecting information.
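The structured data involved is the same Schema.org markup many sites already publish. A minimal JSON-LD sketch (the product details are invented) of the kind of machine-readable content an AI application could consume:

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Trail Runner Waterproof",
  "description": "Lightweight trail-running shoe for wet conditions.",
  "offers": {
    "@type": "Offer",
    "price": "129.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
```

Markup like this lets an agent answer a natural-language question such as “what are the best trail shoes for wet conditions?” with your product rather than guessing from page text.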

An important thing to note is that NLWeb is an open project. It’s not a closed ecosystem, meaning that Microsoft wants to make it accessible to everyone. The idea is to make it easy for any website owner to create an intelligent, natural language experience for their site, while also preparing their content to interact with and be discovered by other online agents, such as AI tools and search engines.  

How does natural language work? 

Natural language simply refers to the way we speak and write. This means using full sentences that allow room for intent, context and nuance. More than keywords or short commands, natural language reflects how people think and what they are looking for exactly. 

To give you an example: a focus keyphrase might be running shoes trail. But using natural language, the request would look more like this: What are the best running shoes for trail running in wet conditions? 

Natural language in AI tools 

Modern AI tools are designed to understand this kind of input. The large language models behind these tools can analyze intent and context to generate responses that fulfill the given request. This is why conversational interfaces feel more intuitive than traditional search or forms. 

Tools like AI chat assistants, voice search, and even traditional search engines rely heavily on natural language understanding and users have quickly adapted to it. 

The current state of search 

The way people find information online is changing fast, a shift heavily influenced by AI-powered tools. We now expect personalized answers instead of a list of results to sort through ourselves. AI chatbots also give us the option to follow up on our original search query, which turns search into a conversation instead of a series of clicks.

Research from McKinsey & Company shows that AI adoption and natural language interfaces are becoming mainstream, with 50% of consumers already using AI-driven tools for information discovery. The majority even say it’s the top digital source they use to make buying decisions. As these habits continue to grow, websites that aren’t optimized for natural language risk becoming invisible in AI-generated answers. 

Why this is interesting for you 

The shift to natural language isn’t just a technical trend. As discussed above, it directly impacts your online visibility and competitive position. 

If users ask an AI system for information, only a handful of sources will be referenced in the response. This is because, like search engines, AI platforms also need to be able to read the information on your website. Being one of those sources can be the difference between being discovered or being overlooked. 

NLWeb collaborates with Yoast 

With NLWeb, you are communicating your website’s content clearly and in a standardized way. That means your brand, products, or expertise can appear in AI-powered answers instead of your competitors. To help as many website owners as possible benefit from this shift, Yoast is collaborating with NLWeb.   

The best part? If you’re a user of any of our Yoast plans designed for WordPress, you’re well ahead here. Yoast’s integration with NLWeb will roll out in phases, starting with functionality that helps our users using WordPress express their content in ways AI systems can interpret accurately, without any additional setup required. So sit tight and let us help you prepare your website for the new world of search! 

NLWeb aims to make your content understandable not just for people, but for the AI systems that are increasingly relevant to your website’s discovery. 

Read more: Yoast collaborates with Microsoft to help AI understand Open Web »

What is the open web?

The open web is the part of the internet built on open standards that anyone can use. This concept creates a democratic digital space where people can build on each other’s work without restrictions, just like how WordPress.org is built. For website owners, understanding and leveraging the open web is increasingly crucial. Especially with the rise of AI-powered systems and the general direction that online search is taking. So, let’s explore what the open web is and what it means for your website.

What is the open web?

The open web refers to the part of the internet built on open, shared standards that are available to everyone. It’s powered by technologies like HTTP, HTML, RSS, and Schema.org, which make it easy for websites and online systems to interact with each other. But it is more than just technical protocols. It also includes open-source code, public APIs, and the free flow of data and content across sites, services, and devices, creating a democratic digital space where people can build on each other’s work without heavy restrictions.

Because these standards are not owned or patented, the open web remains largely decentralized. This allows content to be accessed, understood, and reused across devices and platforms. This not only encourages innovation but also ensures that information is discoverable without being locked behind proprietary ecosystems.

The benefits of an open web

The open web is built on publicly available protocols that enable access, collaboration, and innovation at a global scale. 

The most important benefits include:

  • Collaboration and innovation: Open protocols enable developers to build on each other’s work without proprietary restrictions.
  • Accessibility: Users and AI agents alike can access and interact with web content regardless of device, platform, or underlying technology.
  • Democratization: No single company controls access to information, giving publishers greater autonomy.
  • Inclusion: The open web creates a more level playing field, where everyone gets a chance to participate in the digital economy.

The open web vs the deep web

To give you a better idea of what the open web is, it helps to know about the “deep web” and closed or “walled garden” platforms. The deep web covers content not indexed by search engines, while closed systems or walled gardens restrict access and keep data siloed.

On the open web, anyone can access information freely. A good example of that is Wikipedia. Accessible to anyone looking for information on a topic and anyone who wants to contribute to its content. Closed-off platforms, like proprietary apps or social media ecosystems, create places where content is only available if you pay or use a specific service. Well-known examples of this are social media platforms such as Facebook and Instagram. Another example is a news website that requires a paid subscription to get access.

In essence, the open web keeps information discoverable, accessible, and interoperable – instead of locked inside a handful of platforms.

AI and the open web

The popularity of AI-powered search makes open web principles more important than ever. Decentralized and accessible information allows AI tools to interact with content directly and use it freely to generate an answer for a user. 

“We believe the future of AI is grounded in the open web.” 

Ramanathan Guha, CVP and Technical Fellow at Microsoft. 

Microsoft’s open project NLWeb is a prime example. It provides a standardized layer that enables AI agents to discover, understand, and interact with websites efficiently, without needing separate integrations for every platform. 

What this means for website owners

For website owners, including small business owners, embracing the open web means making your content freely available in ways that AI can interpret. By using structured data standards like Schema.org, your website becomes discoverable to AI tools. Increasing your reach and ensuring that your content remains part of the future of search. 

Yoast and Microsoft: collaborating towards a more open web

Yoast is proud to collaborate with NLWeb, a Microsoft project that makes your content easier for AI agents to understand without extra effort from website owners, allowing your content to remain discoverable, reach a wider audience, and show up in AI-powered search results.

The open web strives toward an accessible web where content is available for everyone: a web where it doesn’t matter how big your website or marketing budget is, giving everyone the chance to be found and represented in AI-powered search. NLWeb helps turn this vision into reality by connecting today’s open web with tomorrow’s AI-driven search ecosystem.

Read more: Yoast collaborates with Microsoft to help AI understand Open Web »

The Hidden SEO Cost Of A Slow WordPress Site & How It Affects AI Visibility via @sejournal, @wp_rocket

This post was sponsored by WP Media. The opinions expressed in this article are the sponsor’s own.

You’ve built a WordPress site you’re proud of. The design is sharp, the content is solid, and you’re ready to compete. But there’s a hidden cost you might not have considered: a slow site doesn’t just hurt your SEO; it now affects your AI visibility too.

With AI-powered search platforms such as ChatGPT and Google’s AI Overviews and AI Mode reshaping how people discover information, speed has never mattered more. And optimizing for it might be simpler than you think.

The conventional wisdom? “Speed optimization is technical and complicated.” “It requires a developer.” “It’s not that big a deal anyway.” These myths spread because performance optimization is genuinely challenging. But dismissing it because it’s hard? That’s leaving lots of untapped revenue on the table.

Here’s what you need to know about the speed-SEO-AI connection, and how to get your site up to speed without having to reinvent yourself as a performance engineer.

Why Visitors Won’t Wait For Your Site To Load (And What It Costs You)

Let’s start with the basics. When’s the last time you waited patiently for a slow website to load? Exactly.


Google’s research shows that as page load time increases from one second to three seconds, the probability of a visitor bouncing increases by 32%. Push that to five seconds, and bounce probability jumps to 90%.

Think about it. You’re spending money on ads, content, and SEO to get people to your site, and then losing nearly half of them before they see anything because your pages load too slowly.

For e-commerce, the stakes are even higher:

  • A site loading in 1 second has a conversion rate 5x higher than one loading in 5 seconds.
  • 79% of shoppers who experience performance issues say they won’t return to buy again.
  • Every 1-second delay reduces customer satisfaction by 16%.

A slow site isn’t just losing one sale. It’s potentially losing you customers for life.

Website Speeds That AI and Visitors Expect

Google stopped being subtle about this in 2020. With the introduction of Core Web Vitals, page speed became an official ranking factor. If your WordPress site meets these benchmarks, you’re signaling quality to Google. If it doesn’t, you’re handing competitors an advantage.

Here’s the challenge: only 50% of WordPress sites currently meet Google’s Core Web Vitals standards.

That means half of WordPress websites have room to improve, and an opportunity to gain ground on competitors who haven’t prioritized performance.

The key metric to watch is Largest Contentful Paint (LCP): how quickly your main content loads. Google wants this under 2.5 seconds. Hit that target, and you’re in good standing.

What most site owners miss: speed improvements compound. Better Core Web Vitals leads to better rankings, which leads to more traffic, which leads to more conversions. The sites that optimize first capture that momentum.

The AI Visibility Advantage: Why Speed Matters More Than Ever

Here’s where it gets really interesting, and where early movers have an edge.

The rise of AI-powered search tools like ChatGPT, Perplexity, and Google’s AI Overviews is fundamentally changing how people discover information. And here’s what most haven’t realized yet: page speed influences AI visibility too.

A recent study by SE Ranking analyzed 129,000 domains across over 216,000 pages to identify what factors influence ChatGPT citations. The findings on page speed were striking:

  • Fast pages (FCP under 0.4 seconds): averaged 6.7 citations from ChatGPT
  • Slow pages (FCP over 1.13 seconds): averaged just 2.1 citations

That’s a threefold difference in AI visibility based largely on how fast your pages load.

Why does this matter? Because 50% of consumers use AI-powered search today in purchase decisions. Sites that load fast are more likely to be cited, recommended, and discovered by a growing audience that starts their search with AI.

The opportunity: Speed optimization now serves double duty. It boosts your traditional SEO and positions you for visibility in an AI-first search landscape.

How To Improve Page Speed Metrics & Increase AI Citations

Speed, SEO, and AI visibility are now deeply connected.

Every day your site underperforms, you’re missing opportunities.

Your Page Speed Optimization Roadmap

Here’s your action plan:

  1. Audit your current speed.
  2. Identify the bottlenecks.
  3. Implement a comprehensive solution. Rather than patching issues one plugin at a time, use an all-in-one performance tool that addresses caching, code optimization, and media loading together.
  4. Monitor and maintain. Speed isn’t a one-time fix. Track your metrics regularly to ensure you’re maintaining performance as you add content and features.

Step 1: Audit Your Current Website Speed

To identify the source of your slow website and build a baseline to test against, first perform a website speed test audit.

  1. Visit Google’s PageSpeed Insights tool.
  2. Compare your Core Web Vitals results scores to your industry’s CWV baseline.
  3. Identify which scores are lowest before moving to step 2.
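This audit can also be scripted against the public PageSpeed Insights v5 API, which returns the same Core Web Vitals data as the web tool. A minimal sketch (the endpoint is real; the helper name and the mobile default are our assumptions):

```python
from urllib.parse import urlencode

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def psi_request_url(page: str, strategy: str = "mobile", api_key: str = "") -> str:
    """Build a PageSpeed Insights v5 API request URL for auditing a page.
    An API key is optional for light use; automated, heavy use needs one."""
    params = {"url": page, "strategy": strategy, "category": "performance"}
    if api_key:
        params["key"] = api_key
    return f"{PSI_ENDPOINT}?{urlencode(params)}"
```

Fetching the resulting URL (e.g., with urllib.request.urlopen) returns JSON whose lighthouseResult.audits section includes the largest-contentful-paint audit, so you can log your LCP baseline over time.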

Step 2: Identify Your Page Speed Bottlenecks

Is it unoptimized images? Render-blocking JavaScript? Too many plugins? Understanding the issue helps you choose the right solution.

In fact, this is where most of your competitors drop the ball, allowing you to pick it up and outperform their websites on SERPs. For business owners focused on running their company, this often falls to the bottom of the priority list.

Why? Because traditional website speed optimization involves a daunting technical website testing checklist that includes, but isn’t limited to:

  • Implementing caching
  • Minifying CSS and JavaScript files
  • Lazy loading images and videos
  • Removing unused CSS
  • Delaying JavaScript execution
  • Optimizing your database
  • Configuring a CDN
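Some checklist items need nothing more than native HTML attributes. A sketch (file names are placeholders):

```html
<!-- Lazy-load below-the-fold media so it doesn't compete with the LCP element -->
<img src="gallery-photo.jpg" loading="lazy" width="800" height="450" alt="Product photo">

<!-- Defer non-critical JavaScript so it doesn't block rendering -->
<script src="analytics.js" defer></script>
```

Caching, unused-CSS removal, and CDN configuration, by contrast, usually require server or plugin support, which is where the tooling in step 3 comes in.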

Step 3: Implement Fixes & Best Practices

From here, each potential cause of a slow website and low CWV scores can be fixed:

The Easy Way: Use The WP Rocket Performance Plugin

Time To Implement: 3 minutes | Download WP Rocket

Rather than piecing together multiple plugins and manually tweaking settings, you get an all-in-one approach that handles the heavy lifting automatically. This is where purpose-built performance technology can change the game.

The endgame is to remove the complexity from WordPress optimization:

  • Instant results. For example, upon activation, WP Rocket implements 80% of web performance best practices without requiring any configuration. Page caching, GZIP compression, CSS and JS minification, and browser caching are just a few of the many optimizations that run in the background for you.
  • No coding required. Advanced features such as lazy-loading images, removing unused CSS, and delaying JavaScript are available via simple toggles.
  • Built-in compatibility. It’s designed to work with popular themes, plugins, page builders, and WooCommerce.
  • Performance tracking included. A built-in tool lets you monitor your speed improvements and Core Web Vitals scores without leaving your dashboard.

The goal isn’t to become a performance expert. It’s to have a fast website that supports your business objectives. When optimization happens in the background, you’re free to focus on what you actually do best.

For many, shifting tactics can cause confusion and unnecessary complexity. The right technology makes implementing them much easier and ensures you maximize AI visibility and website revenue.

A three-minute fix can make a huge difference to how your WordPress site performs.

Ready to get your site up to speed?


Image Credits

Featured Image: Image by WP Media. Used with permission.

In-Post Images: Image by WP Media. Used with permission.

Google’s Core Updates, Explained

Google released another Core Update to its search algorithm over the holidays. It was the most comprehensive update of 2025.

Google changes its algorithm frequently. Some are more widespread than others. Unlike Spam Updates, Core Updates generally do not penalize but, instead, alter how the algorithm treats certain queries and their intent.

For example, a Core Update may result in more “best of” listings (rather than product categories) in search results. Ecommerce sites may lose traffic, but not because of anything they’ve done, so no fix is required.

Yet a Core Update may result in higher rankings for certain types of content, which could prompt merchants to add those pages.

Core Updates can elevate a wide range of queries. The recent holiday update lowered the listings of large publishers and elevated niche sites. Search Engine Journal reported that Macy’s rankings decreased, while those of Columbia, The North Face, and Fragrance Market increased.

Content helpfulness

Google’s infamous Helpful Content algorithm is now part of its Core Updates and can, in theory, target an “unhelpful” site.

Google provides guidelines to human evaluators for what makes content helpful. It’s the best indicator for search optimizers as to Google’s definition of that term. To paraphrase from the guidelines:

  • Websites should place the most useful portions at the top of a page.
  • The amount of effort, originality, and skill determines the quality of the content.
  • Avoid unnecessary fluff or “filler” content that obscures what visitors are looking for.
  • Use clear titles and headings that inform, not oversell.

If a Core Update resulted in lost traffic, scrutinize your content helpfulness and on-page engagement.

How to recover

It’s often difficult to know why a Core Update lowered a site’s rankings. To diagnose, I typically start with the helpfulness of its pages and its overall engagement.

The first step is always to identify what was lost. Search Console will reveal the impacted queries:

  • Go to the full “Performance” report.
  • Choose “Compare” in the “More” filter.
  • Choose “Custom” and set start and end dates to expose the week before the change (early December for the most recent update) and the week after (beginning of January). Click “Apply.”
  • Sort the ensuing “Queries” column and the “Clicks Difference” column to see queries that now generate fewer clicks.

Select a before and after date range in Search Console to identify queries that generate fewer organic clicks.
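The before/after comparison from the steps above boils down to a per-query subtraction. A sketch of the “Clicks Difference” math, using invented query data:

```python
def query_click_deltas(before: dict, after: dict) -> list:
    """Replicate Search Console's 'Clicks Difference' column: for each query,
    clicks after the update minus clicks before, sorted worst-hit first."""
    queries = set(before) | set(after)
    deltas = [(q, after.get(q, 0) - before.get(q, 0)) for q in queries]
    return sorted(deltas, key=lambda item: item[1])
```

Exporting the two date ranges from Search Console as CSVs and feeding the click counts into a function like this surfaces the same losers the UI sort shows, but in a form you can archive and re-run after each update.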

Next, manually search Google for each affected query to determine if results shifted broadly or only for your page. The appearance of many new listings that answer a query in a new way may indicate a broad shift.

Semrush provides monthly snapshots of ranking URLs for each query. Refer to its archive to see how your overall SERPs have changed. If you see a widespread shift (i.e., 80% of listings are new for a given query), there is likely no fix needed. It’s Google changing its algorithm.

If only your site is downranked, examine the impacted pages and find ways to make them more helpful and engaging, such as:

  • Move the main portion, such as a quick answer to a search query, to the top.
  • Improve page structure and subheadings.
  • Remove ads, such as intrusive pop-ups, that block users from interacting with a page.
  • Add jump-to links that help visitors navigate the page.
  • Include social proof on the page.
  • Show the author’s name and bio.
  • Link to trusted sources.
  • Add helpful images and videos.
  • Update the page with recent data, trends, and stats (with sources).
  • Add explanatory sections, such as FAQs and definitions, tailored to the page’s purpose.

Helpfulness is subjective and vague. Nonetheless, consider your target audience and tailor your content accordingly.

Google announces only substantial Core Updates, those that affect many users. Lesser, unannounced updates occur more often and can result in recoveries.

Google On Phantom Noindex Errors In Search Console via @sejournal, @martinibuster

Google’s John Mueller recently answered a question about phantom noindex errors reported in Google Search Console. Mueller asserted that these reports may be real.

Noindex In Google Search Console

A noindex robots directive is one of the few commands that Google must obey, and one of the few ways a site owner can exercise control over Googlebot, Google's crawler.

And yet it’s not uncommon for Search Console to report being unable to index a page because of a noindex directive on a page that seemingly has no such directive, at least none that is visible in the HTML code.

When Google Search Console (GSC) reports “Submitted URL marked ‘noindex’,” it is reporting a seemingly contradictory situation:

  • The site asked Google to index the page via an entry in a Sitemap.
  • The page sent Google a signal not to index it (via a noindex directive).

It’s a confusing message: Search Console says a page is preventing Google from indexing it, yet the publisher or SEO can’t observe any such directive at the code level.
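As a refresher, a noindex can arrive two ways: via a meta tag in the HTML (including a Googlebot-specific one) or via an X-Robots-Tag HTTP header. A quick sketch of checking both, using only the standard library; the sample page and headers are made up:

```python
from html.parser import HTMLParser

class RobotsMetaFinder(HTMLParser):
    """Collects robots/googlebot meta tags, where a noindex can hide."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "meta" and a.get("name", "").lower() in ("robots", "googlebot"):
            self.directives.append(a.get("content", "").lower())

def has_noindex(html, headers):
    """True if the HTML or the HTTP headers carry a noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    meta_hit = any("noindex" in content for content in finder.directives)
    # The header check matters because a noindex sent via the
    # X-Robots-Tag header never appears in the page source.
    header_hit = "noindex" in headers.get("X-Robots-Tag", "").lower()
    return meta_hit or header_hit

# Hypothetical example: no meta noindex, but the header carries one.
sample_html = "<html><head><meta name='robots' content='index,follow'></head></html>"
sample_headers = {"X-Robots-Tag": "noindex"}
print(has_noindex(sample_html, sample_headers))  # True
```

A check like this, run against what the server actually serves, is the starting point before assuming the report is wrong.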

The person asking the question posted on Bluesky:

“For the past 4 months, the website has been experiencing a noindex error (in ‘robots’ meta tag) that refuses to disappear from Search Console. There is no noindex anywhere on the website nor robots.txt. We’ve already looked into this… What could be causing this error?”

Noindex Shows Only For Google

Google’s John Mueller answered the question, sharing that there was always a noindex showing to Google on the pages he examined where this kind of thing was happening.

Mueller responded:

“The cases I’ve seen in the past were where there was actually a noindex, just sometimes only shown to Google (which can still be very hard to debug). That said, feel free to DM me some example URLs.”

While Mueller didn’t elaborate on the possible causes, there are ways to troubleshoot this issue and find out what’s going on.

How To Troubleshoot Phantom Noindex Errors

It’s possible that something in the stack is serving a noindex only to Google. For example, a page may at one time have had a noindex on it, and a server-side cache (like a caching plugin) or a CDN (like Cloudflare) cached the HTTP headers from that time. The old noindex header would then be served to Googlebot (because it frequently visits the site) while a fresh version is served to the site owner.

Checking the HTTP headers is easy; there are many HTTP header checkers, like this one at KeyCDN or this one at SecurityHeaders.com.
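You can also check the headers yourself with a few lines of Python. A sketch using only the standard library; the live fetch is commented out because it requires network access, and the `interpret` helper is just an illustrative summary of the cases discussed here:

```python
import urllib.request

def fetch_robots_header(url, user_agent="Mozilla/5.0"):
    """Send a HEAD request and return (status code, X-Robots-Tag header)."""
    req = urllib.request.Request(url, method="HEAD")
    req.add_header("User-Agent", user_agent)
    with urllib.request.urlopen(req) as resp:
        return resp.status, resp.headers.get("X-Robots-Tag")

def interpret(status, robots_header):
    """Summarize what the response means for indexing."""
    if robots_header and "noindex" in robots_header.lower():
        return "noindex sent via HTTP header"
    if status == 520:
        return "Cloudflare 520 - request may be blocked"
    return "no noindex in headers"

# Live usage (requires network access):
# status, robots = fetch_robots_header("https://example.com/")
# print(interpret(status, robots))

print(interpret(200, None))       # no noindex in headers
print(interpret(200, "noindex"))  # noindex sent via HTTP header
```

Running your own check complements the online tools, since you control the user agent being sent.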

A 520 response code is an error Cloudflare returns when it receives an unexpected response from the origin, and it can show up when a user agent is being blocked.

Screenshot: 520 Cloudflare Response Code

Screenshot showing a 520 error response code

Below is a screenshot of a 200 server response code generated by Cloudflare:

Screenshot: 200 Server Response Code

I checked the same URL using two different header checkers: one returned a 520 (blocked) server response code and the other returned a 200 (OK) response code. That shows how differently Cloudflare can respond to something like a header checker. Ideally, check with several header checkers to see if there’s a consistent 520 response from Cloudflare.

When a web page is showing something exclusively to Google that is otherwise invisible to someone looking at the code, you need to get Google to look at the page for you, using an actual Google crawler from a Google IP address. The way to do this is by dropping the URL into Google’s Rich Results Test. Google will dispatch a crawler from a Google IP address, and if there’s something on the server (or a CDN) that’s showing a noindex, this will catch it. In addition to the structured data, the Rich Results Test will also provide the HTTP response and a snapshot of the web page showing exactly what the server serves to Google.

When you run a URL through the Google Rich Results Test, the request:

  • Originates from Google’s Data Centers: The bot uses an actual Google IP address.
  • Passes Reverse DNS Checks: If the server, security plugin, or CDN checks the IP, it will resolve back to googlebot.com or google.com.
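That reverse DNS check can also be done by hand when your logs show a visitor claiming to be Googlebot. A sketch of the standard verify-by-DNS pattern; the actual lookups require network access, so the hostname check is split out as a pure function:

```python
import socket

def is_google_hostname(hostname):
    """True if a hostname belongs to Google's crawl infrastructure."""
    return hostname.endswith(".googlebot.com") or hostname.endswith(".google.com")

def verify_googlebot_ip(ip):
    """Reverse-DNS the IP, check the domain, then forward-confirm it.
    Requires network access."""
    hostname = socket.gethostbyaddr(ip)[0]
    if not is_google_hostname(hostname):
        return False
    # Forward-confirm: the hostname must resolve back to the same IP;
    # otherwise the PTR record could be spoofed.
    return ip in socket.gethostbyname_ex(hostname)[2]

print(is_google_hostname("crawl-66-249-66-1.googlebot.com"))  # True
print(is_google_hostname("fake-googlebot.example.com"))       # False
```

The forward-confirmation step matters because anyone can set a PTR record claiming to be googlebot.com; only Google controls the forward resolution.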

If the page is blocked by noindex, the tool will be unable to provide any structured data results. It should show a status like “Page not eligible” or “Crawl failed.” If you see that, click the “View Details” link or expand the error section. It should show something like “Robots meta tag: noindex” or “‘noindex’ detected in ‘robots’ meta tag.”

This approach does not send the Googlebot user agent; it uses the Google-InspectionTool/1.0 user agent string. That means if the server is serving the noindex based on IP address, this method will catch it.

Another angle covers the situation where a rogue noindex is specifically served to the Googlebot user agent. You can spoof (mimic) the Googlebot user agent string with Google’s own User Agent Switcher extension for Chrome, or configure an app like Screaming Frog to identify itself as Googlebot; either should catch it.
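A Googlebot user-agent spoof can also be scripted in a few lines. A sketch with urllib; the request is only built here (the live fetch is commented out because it requires network access), and the UA string is Googlebot’s published desktop string:

```python
import urllib.request

# Googlebot's published desktop user-agent string.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

def request_as_googlebot(url):
    """Build a request that identifies itself as Googlebot."""
    req = urllib.request.Request(url)
    req.add_header("User-Agent", GOOGLEBOT_UA)
    return req

req = request_as_googlebot("https://example.com/")
print(req.get_header("User-agent"))  # prints the Googlebot UA string

# To actually fetch and scan the served HTML for a rogue noindex
# (requires network access):
# html = urllib.request.urlopen(req).read().decode()
# print("noindex" in html.lower())
```

Note that this catches user-agent-based cloaking only; servers that key off Google’s IP addresses will still see your real address, which is why the Rich Results Test step above remains necessary.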

Screenshot: Chrome User Agent Switcher

Phantom Noindex Errors In Search Console

These kinds of errors can be a pain to diagnose, but before you throw your hands up in the air, take some time to see if any of the steps outlined here help identify the hidden cause of the issue.

Featured Image by Shutterstock/AYO Production