Why Now’s The Time To Adopt Schema Markup via @sejournal, @marthavanberkel

There is no better time for organizations to prioritize Schema Markup.

Why is that so, you might ask?

First of all, Schema Markup (aka structured data) is not new.

Google has been awarding sites that implement structured data with rich results. If you haven’t taken advantage of rich results in search, it’s time to gain a higher click-through rate from these visual features in search.

Secondly, now that search is primarily driven by AI, helping search engines understand your content is more important than ever.

Schema Markup allows your organization to clearly articulate what your content means and how it relates to other things on your website.

The final reason to adopt Schema Markup is that, when done correctly, you can build a content knowledge graph, which is a critical enabler in the age of generative AI. Let’s dig in.

Schema Markup For Rich Results

Schema.org has been around since 2011. Back then, Google, Bing, Yahoo, and Yandex worked together to create the standardized Schema.org vocabulary so that website owners could translate their content into a form search engines can understand.

Since then, Google has incentivized websites to implement Schema Markup by awarding rich results to websites with certain types of markup and eligible content.

Websites that achieve these rich results tend to see higher click-through rates from the search engine results page.

In fact, Schema Markup is one of the most well-documented SEO tactics that Google explicitly tells you to implement. With so many things in SEO that have to be reverse-engineered, this one is straightforward and highly recommended.

You might have delayed implementing Schema Markup due to the lack of applicable rich results for your website. That might have been true at one point, but I’ve been doing Schema Markup since 2013, and the number of rich results available is growing.

Even though Google deprecated how-to rich results and changed the eligibility of FAQ rich results in August 2023, it introduced several new rich results in the months that followed – the most new rich results introduced in a single year!

These rich results include vehicle listing, course info, profile page, discussion forum, organization, vacation rental, and product variants.

There are now 35 rich results that you can use to stand out in search, and they apply to a wide range of industries such as healthcare, finance, and tech.

Here are some widely applicable rich results you should consider utilizing:

  • Breadcrumb.
  • Product.
  • Reviews.
  • JobPosting.
  • Video.
  • Profile Page.
  • Organization.
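
Breadcrumb markup, one of the simplest on this list, shows how little it takes to get started. Here is a minimal sketch of BreadcrumbList JSON-LD, with hypothetical page names and URLs:

  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "BreadcrumbList",
    "itemListElement": [
      {
        "@type": "ListItem",
        "position": 1,
        "name": "Books",
        "item": "https://example.com/books"
      },
      {
        "@type": "ListItem",
        "position": 2,
        "name": "Science Fiction",
        "item": "https://example.com/books/science-fiction"
      }
    ]
  }
  </script>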

With so many opportunities to take control of how you appear in search, it’s surprising that more websites haven’t adopted it.

A statistic from Web Data Commons’ October 2023 Extractions Report showed that only 50% of pages had structured data.

Of the pages with JSON-LD markup, these were the top types of entities found.

  • http://schema.org/ListItem (2,341,592,788 Entities)
  • http://schema.org/ImageObject (1,429,942,067 Entities)
  • http://schema.org/Organization (907,701,098 Entities)
  • http://schema.org/BreadcrumbList (817,464,472 Entities)
  • http://schema.org/WebSite (712,198,821 Entities)
  • http://schema.org/WebPage (691,208,528 Entities)
  • http://schema.org/Offer (623,956,111 Entities)
  • http://schema.org/SearchAction (614,892,152 Entities)
  • http://schema.org/Person (582,460,344 Entities)
  • http://schema.org/EntryPoint (502,883,892 Entities)

(Source: October 2023 Web Data Commons Report)

Most of the types on the list are related to the rich results mentioned above.

For example, ListItem and BreadcrumbList are required for the Breadcrumb Rich Result, SearchAction is required for Sitelink Search Box, and Offer is required for the Product Rich Result.

This tells us that most websites are using Schema Markup for rich results.

Even though these Schema.org types can help your site achieve rich results and stand out in search, they don’t necessarily tell search engines what each page is about in detail and help your site be more semantic.

Help AI Search Engines Understand Your Content

Have you ever seen your competitors’ sites using specific Schema.org types that are not found in Google’s structured data documentation (e.g., MedicalClinic, IndividualPhysician, or Service)?

The Schema.org vocabulary has over 800 types and properties to help websites explain what the page is about. However, Google’s structured data features only require a small subset of these properties for websites to be eligible for a rich result.

Many websites that solely implement Schema Markup to get rich results tend to be less descriptive with their Schema Markup.

AI search engines now look at the meaning and intent behind your content to provide users with more relevant search results.

Therefore, organizations that want to stay ahead should use more specific Schema.org types and leverage appropriate properties to help search engines better understand and contextualize their content. You can be descriptive with your content while still achieving rich results.

For example, each type (e.g. Article, Person, etc.) in the Schema.org vocabulary has 40 or more properties to describe the entity.

The properties are there to help you fully describe what the page is about and how it relates to other things on your website and the web. In essence, it’s asking you to describe the entity or topic of the page semantically.
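
To make that concrete, here is a sketch (with hypothetical names and URLs) of how properties and @id references connect the entities on a page – in this case, tying an article to its author and to the organization that employs the author and publishes the site:

  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Article",
    "@id": "https://example.com/schema-guide#article",
    "headline": "A Guide To Schema Markup",
    "author": {
      "@type": "Person",
      "@id": "https://example.com/team/jane-doe#person",
      "name": "Jane Doe",
      "worksFor": { "@id": "https://example.com/#organization" }
    },
    "publisher": {
      "@type": "Organization",
      "@id": "https://example.com/#organization",
      "name": "Example Inc."
    }
  }
  </script>

Because worksFor and publisher point to the same @id, a search engine can resolve both to a single Organization entity – exactly the kind of connection a content knowledge graph is made of.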

The word ‘semantic’ is about understanding the meaning of language.

Note that the word “understanding” is part of the definition. Fittingly, in October 2023, John Mueller of Google released a Search Update video. In this six-minute video, he leads with an update on Schema Markup.

For the first time, Mueller described Schema Markup as “a code you can add to your web pages, which search engines can use to better understand the content.”

While Mueller has historically spoken a lot about Schema Markup, he typically talked about it in the context of rich result eligibility. So, why the change?

This shift in thinking about Schema Markup for enhanced search engine understanding makes sense. With AI’s growing role and influence in search, we need to make it easy for search engines to consume and understand the content.

Take Control Of AI By Shaping Your Data With Schema Markup

Now, if being understood and standing out in search is not a good enough reason to get started, then doing it to help your enterprise take control of your content and prepare it for artificial intelligence is.

In February 2024, Gartner published a report on “30 Emerging Technologies That Will Guide Your Business Decisions,” highlighting generative AI and knowledge graphs as critical emerging technologies companies should invest in within the next 0-1 years.

Knowledge graphs are collections of relationships between entities defined using a standardized vocabulary that enables new knowledge to be gained by way of inferencing.

Good news! When you implement Schema Markup to define and connect the entities on your site, you are creating a content knowledge graph for your organization.

Thus, your organization gains a critical enabler for generative AI adoption while reaping its SEO benefits.

Learn more about building content knowledge graphs in my article, Extending Your Schema Markup From Rich Results to Knowledge Graphs.

We can also look at other experts in the knowledge graph field to understand the urgency of implementing Schema Markup.

In his LinkedIn post, Tony Seale, Knowledge Graph Architect at UBS in the UK, said,

“AI does not need to happen to you; organizations can shape AI by shaping their data.

It is a choice: We can allow all data to be absorbed into huge ‘data gravity wells’ or we can create a network of networks, each of us connecting and consolidating our data.”

The “network of networks” Seale refers to is the concept of knowledge graphs – the same knowledge graphs that can be built from your web data using semantic Schema Markup.

The AI revolution has only just begun, and there is no better time than now to shape your data, starting with your web content through the implementation of Schema Markup.

Use Schema Markup As The Catalyst For AI

In today’s digital landscape, organizations must invest in new technology to keep pace with the evolution of AI and search.

Whether your goal is to stand out on the SERP or ensure your content is understood as intended by Google and other search engines, the time to implement Schema Markup is now.

With Schema Markup, SEO pros can become heroes, enabling generative AI adoption through content knowledge graphs while delivering tangible benefits, such as increased click-through rates and improved search visibility.

Featured Image by author

Charts: Global M&A Trends Q2 2024

Worldwide mergers and acquisitions are expected to increase through 2024, with CEOs viewing acquisitions and divestitures as crucial for their immediate priorities. That’s according to the quarterly “CEO Outlook Pulse” survey from EY, the accounting and consulting firm.

EY surveyed 1,200 global executives and 300 institutional investors in March and April 2024 about their plans for capital allocation, investment, and business transformation.

According to EY’s data, M&A deals in Q1 2024 totaled $796 billion, a 36% increase from the same period in 2023. The purpose of most deals was to acquire technology, enhance production, or integrate startups.

Per the EY survey, divestitures, spinoffs, and IPOs will be the top M&A initiatives this year.


In addition, the primary M&A goals of CEOs are to acquire technology or product capabilities and benefit from innovative startups.

Accounting and consulting firm KPMG surveyed (PDF) managers of U.S. private equity firms in early 2024. According to the survey, healthcare, infrastructure, and life sciences deals will be their top targets this year.

Vulnerabilities In WooCommerce And Dokan Pro Plugins via @sejournal, @martinibuster

WooCommerce published an advisory about an XSS vulnerability while Wordfence simultaneously advised about a critical vulnerability in a WooCommerce plugin named Dokan Pro. The advisory about Dokan Pro warned that a SQL Injection vulnerability allows unauthenticated attackers to extract sensitive information from a website database.

Dokan Pro WordPress Plugin

The Dokan Pro plugin allows users to transform their WooCommerce website into a multi-vendor marketplace similar to sites like Amazon and Etsy. It currently has over 50,000 installations. Plugin versions up to and including 3.10.3 are vulnerable.

According to Wordfence, version 3.11.0 represents the fully patched and safest version.

WordPress.org lists the current number of installations of the lite version at over 50,000 and a total all-time number of installations of over 3 million. As of this writing, only 30.6% of installations were using the most up-to-date version, 3.11, which may mean that 69.4% of all Dokan Pro installations are vulnerable.

Screenshot Of Dokan Plugin Download Statistics

Changelog Doesn’t Show Vulnerability Patch

The changelog is what tells users of a plugin what’s contained in an update. Most plugin and theme makers will publish a clear notice that an update contains a vulnerability patch. According to Wordfence, the vulnerability affects versions up to and including 3.10.3. But the changelog notation for version 3.10.4, released April 25, 2024 (which is supposed to be patched), does not show that there’s a patch. It’s possible that the publisher of Dokan Pro and Dokan Lite didn’t want to alert hackers to the critical vulnerability.

Screenshot Of Dokan Pro Changelog

CVSS Score 10

The Common Vulnerability Scoring System (CVSS) is an open standard for assigning a score that represents the severity of a vulnerability. The severity score is based on how exploitable a vulnerability is and what impact it has, plus supplemental metrics such as safety and urgency, which together add up to a total score ranging from least severe (0) to most severe (10).

The Dokan Pro plugin received a CVSS score of 10, the highest level severity, which means that any users of the plugin are recommended to take immediate action.

Screenshot Of Dokan Pro Vulnerability Severity Score

Description Of Vulnerability

Dokan Pro was found to contain an Unauthenticated SQL Injection vulnerability. There are authenticated and unauthenticated vulnerabilities. Unauthenticated means that an attacker does not need to acquire user credentials in order to launch an attack. Between the two kinds of vulnerabilities, unauthenticated is the worst case scenario.

A WordPress SQL Injection vulnerability is one in which a plugin or theme allows an attacker to manipulate the database. The database is the heart of every WordPress website, storing every password, login name, post, theme, and plugin setting. A vulnerability that allows anyone to manipulate the database is severe – about as bad as it gets.

This is how Wordfence describes it:

“The Dokan Pro plugin for WordPress is vulnerable to SQL Injection via the ‘code’ parameter in all versions up to, and including, 3.10.3 due to insufficient escaping on the user supplied parameter and lack of sufficient preparation on the existing SQL query. This makes it possible for unauthenticated attackers to append additional SQL queries into already existing queries that can be used to extract sensitive information from the database.”
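
To illustrate the class of bug Wordfence is describing (a generic sketch, not Dokan’s actual code – the table name and query here are invented), compare a query that concatenates the user-supplied parameter with one that binds it through WordPress’s $wpdb->prepare():

  <?php
  // Vulnerable pattern: the user-supplied 'code' value is concatenated
  // straight into the SQL string, so input such as
  // "' UNION SELECT user_login, user_pass FROM wp_users -- "
  // appends an attacker-controlled query.
  global $wpdb;
  $code = $_GET['code'];
  $rows = $wpdb->get_results(
      "SELECT * FROM {$wpdb->prefix}coupons WHERE code = '$code'"
  );

  // Safer pattern: prepare() binds the value as data, never as SQL.
  $rows = $wpdb->get_results(
      $wpdb->prepare(
          "SELECT * FROM {$wpdb->prefix}coupons WHERE code = %s",
          $code
      )
  );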

Recommended Action For Dokan Pro Users

Users of the Dokan Pro plugin are advised to update their sites as soon as possible. It’s always prudent to test updates before they’re deployed to a live website, but due to the severity of this vulnerability, users should consider expediting this update.

WooCommerce published an advisory about a vulnerability that affects versions 8.8.0 and higher. The vulnerability is rated 5.4, a medium-level threat, and only affects users who have the Order Attribution feature enabled. Nevertheless, WooCommerce “strongly” recommends users update as soon as possible to the most current version (as of this writing), WooCommerce 8.9.3.

WooCommerce Cross Site Scripting (XSS) Vulnerability

The vulnerability that affects WooCommerce is a Cross Site Scripting (XSS) flaw, a type of vulnerability that depends on getting a user (like a WooCommerce store admin) to click a link.

According to WooCommerce:

“This vulnerability could allow for cross-site scripting, a type of attack in which a bad actor manipulates a link to include malicious content (via code such as JavaScript) on a page. This could affect anyone who clicks on the link, including a customer, the merchant, or a store admin.

…We are not aware of any exploits of this vulnerability. The issue was originally found through Automattic’s proactive security research program with HackerOne. Our support teams have received no reports of it being exploited and our engineering team analyses did not reveal it had been exploited.”

Should Web Hosts Be More Proactive?

Web developer and search marketing expert Adam J. Humphreys of Making 8, Inc. (LinkedIn profile) feels that web hosts should be more proactive about patching critical vulnerabilities, even though that may cause some sites to lose functionality if there’s a conflict with another plugin or theme in use.

Adam observed:

“The deeper issue is the fact that WordPress remains without auto updates and a constant vulnerability which is the illusion their sites are safe. Most core updates are not performed by hosts and almost every single host doesn’t perform any plugin updates even if they do them until a core update is performed. Then there is the fact most premium plugin updates will often not perform automatically. Many of which contain critical security patches.”

I asked if he meant a push update, where an update is forced onto a website.

“Correct, many hosts will not perform updates until a WordPress core update. Softaculous engineers confirmed this for me. WPEngine which claims fully managed updates doesn’t do it on the frequency to patch in a timely fashion for said plugins. WordPress without ongoing management is a vulnerability and yet half of all websites are made with it. This is an oversight by WordPress that should be addressed, in my opinion.”

Read more at Wordfence:

Dokan Pro <= 3.10.3 – Unauthenticated SQL Injection

Read the official WooCommerce vulnerability documentation:

WooCommerce Updated to Address Cross-site Scripting Vulnerability

Featured Image by Shutterstock/New Africa

Google Warns Of Quirk In Some Hreflang Implementations via @sejournal, @martinibuster

Google updated its hreflang documentation to note a quirk in how some websites use it, one that (presumably) can lead to unintended consequences in how Google processes the markup.

hreflang Link Tag Attributes

The link element is an HTML tag that can be used to communicate data to browsers and search engines about resources related to the webpage. Multiple kinds of resources can be linked this way, such as CSS, JS, favicons, and hreflang data.

In the case of the hreflang attribute (an attribute of the link element), the purpose is to specify the language (and optionally the region) of alternate versions of a page. All of these link elements belong in the <head> section of the document.

Quirk In hreflang

Google noticed that an unintended behavior occurs when publishers combine multiple alternate-version annotations in one link element, so it updated the hreflang documentation to make this more broadly known.

The changelog explains:

“Clarifying link tag attributes
What: Clarified in our hreflang documentation that link tags for denoting alternate versions of a page must not be combined in a single link tag.

Why: While debugging a report from a site owner we noticed we don’t have this quirk documented.”

What Changed In The Documentation

There was one change to the documentation that warns publishers and SEOs to watch out for this issue. Those who audit websites should take notice of this.

This is the old version of the documentation:

“Put your <link> tags near the top of the <head> element. At minimum, the <link> tags must be inside a well-formed <head> section, or before any items that might cause the <head> to be closed prematurely, such as <body> or a tracking pixel. If in doubt, paste code from your rendered page into an HTML validator to ensure that the links are inside the <head> element.”

This is the newly updated version:

“The <link> tags must be inside a well-formed <head> section of the HTML. If in doubt, paste code from your rendered page into an HTML validator to ensure that the links are inside the <head> element. Additionally, don’t combine link tags for alternate representations of the document; for example don’t combine hreflang annotations with other attributes such as media in a single <link> tag.”

Google’s documentation didn’t say what the consequence of the quirk is, but if Google was debugging it, then it must have caused some kind of issue. It’s a seemingly minor thing that could have an outsized impact.
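
To make the quirk concrete, here is an illustrative before-and-after with hypothetical URLs:

  <!-- Don't: hreflang combined with another alternate annotation
       (media) in a single link tag -->
  <link rel="alternate" hreflang="de"
        media="only screen and (max-width: 640px)"
        href="https://m.example.com/de/">

  <!-- Do: a separate link tag for each alternate representation -->
  <link rel="alternate" hreflang="de" href="https://example.com/de/">
  <link rel="alternate" media="only screen and (max-width: 640px)"
        href="https://m.example.com/">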

Read the newly updated documentation here:

Tell Google about localized versions of your page

Featured Image by Shutterstock/Mix and Match Studio

Want More Clicks? Use Simple Headlines, Study Advises via @sejournal, @MattGSouthern

A new study shows that readers prefer simple, straightforward headlines over complex ones.

The researchers, Hillary C. Shulman, David M. Markowitz, and Todd Rogers, did over 30,000 experiments with The Washington Post and Upworthy.

They found that readers are likelier to click on and read headlines with common, easy-to-understand words.

The study, published in Science Advances, suggests that people are naturally drawn to simpler writing.

In the crowded online world, plain headline language can help grab more readers’ attention.

Field Experiments and Findings

Between March 2021 and December 2022, researchers conducted experiments analyzing nearly 9,000 tests involving over 24,000 headlines.

Data from The Washington Post showed that simpler headlines had higher click-through rates.

The study found that using more common words, a simpler writing style, and more readable text led to more clicks.

In the screenshot below, you can see examples of headline tests conducted at The Washington Post.

Screenshot from: science.org, June 2024.

A follow-up experiment looked more closely at how people process news headlines.

This experiment used a signal detection task (SDT) to find that readers more closely read simpler headlines when presented with a set of headlines of varied complexity.

The finding that readers engage less deeply with complex writing suggests that simple writing can help publishers increase audience engagement even for complicated stories.

Professional Writers vs. General Readers

The study revealed a difference between professional writers and general readers.

A separate survey showed that journalists didn’t prefer simpler headlines.

This finding is important because it suggests that journalists may not intuitively know how their audiences will react to and engage with the headlines they write.

Implications For Publishers

As publishers compete for readers’ attention, simpler headline language could create an advantage.

Simplified writing makes content more accessible and engaging, even for complex articles.

To show how important this is, look at The Washington Post’s audience data from March 2021 to December 2022. They averaged around 70 million unique digital visitors per month.

If each visitor reads three articles, a 0.1 percentage point increase in click-through rate (from 2.0% to 2.1%) means roughly 200,000 more readers engaging with stories thanks to the simpler language (70 million visitors × 3 articles × 0.1% ≈ 210,000).

See also: Title Tag Optimization: A Complete How-to Guide

Why SEJ Cares

Google’s recurring message to websites is to create the best content for your readers. This study helps demonstrate what readers want from websites.

While writers and journalists may prefer more complex language, readers are more drawn to simpler, more straightforward headlines.

How This Can Help You

Using simpler headlines can increase the number of people who click on and read your stories.

The study shows that even a tiny increase in click-through rates means more readers.

Writing simple headlines also makes your content accessible to more people, including those who may not understand complex terminology or jargon.

To implement this, test different headline styles and analyze the data on what works best for your audience.


Featured Image: marekuliasz/Shutterstock

Optimize for rich results with the Rich Results Testing Tool

Google has many interesting free tools, but two important ones for helping you improve your site are Search Console and the Rich Results Testing Tool. Search Console helps you get a general feel for how your site is doing in the SERPs, plus keep an eye on any errors to fix and improvements to make. The other one, the Rich Results Testing Tool, helps you see which of your pages are eligible for rich results. Rich results are those highlighted search results like product and event listings.

Rich results are incredibly important in today’s world. Once you add structured data to your site, you might get a highlighted listing in the SERPs. This gives you an edge over your competitor, as highlighted listings tend to get more clicks. For many sites and types of content, it can make sense to target rich results.

Adding structured data to your courses might lead to highlights like this one

This post won’t detail how to get structured data on your site. If you’d like to dive into that, please read our ultimate guide to schema.org structured data, check out our free Structured data for beginners training or our Understanding structured data training course. You can also find out how Yoast SEO automatically applies structured data to your site.

Here, we look at how to verify your eligibility and what you can do to improve on that. Google’s Rich Results Testing Tool helps you check your pages to see if they have valid structured data applied to them and if they might be eligible for rich results. You’ll also find which rich results the page is eligible for and get a preview of how these would look for your content.

Using the Rich Results Testing Tool is very easy. There are two ways to get your insights: enter the URL of the page you want to test or the piece of code you want to test. The second option can be a piece of structured data or the full source code of a page, whichever you prefer.

While testing, you can also choose between a smartphone and a desktop crawler. Google defaults to the smartphone crawler since we live in a mobile-first world, people! Of course, you can switch to a desktop if needed. 

the homepage of the Rich Results Test with a big white bar to fill in the URL to test
Enter a URL or a piece of code to get going. You can also choose between a smartphone or desktop crawler.

There is a difference, of course. It is a good idea to use the URL option if your page is already online. You’ll see if the page is eligible for rich results, view a preview of these rich results, and check out the rendered HTML of the page. But there’s nothing you can ‘do’ in the code. The code option does let you do that.

an example of a valid rich result for courses in the Rich Results Test interface
This particular page has a valid Course list item and Course info and is, therefore, eligible for rich results — which you can see in the first screenshot.

Working with structured data code

If you paste a piece of JSON structured data into the code field and run the test, you get the same results as the URL option. However, you can now also use the code input field to edit your code to fix errors or improve the structured data by fixing warnings.

Did you know?

Do you know Yoast SEO comes with awesome free structured data blocks for how-to and FAQ content?

So, how do you go about this?

  1. Find and copy the code you want to test
  2. If it’s minified, unminify it for better readability
  3. Paste the code in the code field of the Rich Results Testing Tool
  4. Run the test

You’ll get a view similar to the one below.

Fields highlighted in orange are optional items you can add to flesh out your structured data and improve your chances of getting rich results
Code input is on the left; rich results test is on the right. You can now edit the code and quickly run the test after making those edits to see the changes.

Editing an event page

The page above is an event page; you’ll notice warnings in orange. Remember: red is an error, and orange is a warning. You have to fix an error for the markup to be valid, while a warning points to a possible improvement. Because this concerns a paid event, the page is missing an offers property. It’s also missing the optional fields performer, organizer, description, and image. We could add these to remove the warnings and round out this structured data listing – because more is better.

Look at Google’s documentation about events and find out how they’d like the offers to appear in the code. To keep it simple, you could copy the example code and adapt it to your needs. Find a good place for it in your structured data on the left-hand side of your Rich Results Testing Tool screen and paste the code.

You could expand the code until it looks something like this:
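
Below is a sketch with hypothetical event details and URLs, adding the offers, performer, organizer, description, and image fields mentioned above; your values will differ:

  <script type="application/ld+json">
  {
    "@context": "https://schema.org",
    "@type": "Event",
    "name": "Example Concert",
    "startDate": "2024-09-21T19:00:00+02:00",
    "endDate": "2024-09-21T23:00:00+02:00",
    "description": "An evening of live music at Example Hall.",
    "image": "https://example.com/photos/concert.jpg",
    "location": {
      "@type": "Place",
      "name": "Example Hall",
      "address": "Example Street 1, Example City"
    },
    "offers": {
      "@type": "Offer",
      "url": "https://example.com/tickets",
      "price": "30",
      "priceCurrency": "EUR",
      "availability": "https://schema.org/InStock",
      "validFrom": "2024-05-01T12:00:00+02:00"
    },
    "performer": {
      "@type": "PerformingGroup",
      "name": "The Examples"
    },
    "organizer": {
      "@type": "Organization",
      "name": "Example Events",
      "url": "https://example.com"
    }
  }
  </script>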

Rerun the test, and more sections should turn green. If not, you might have to check if you’ve correctly applied and closed your code.

Once you’ve validated your code and know it’s working, you can apply it to your pages. Remember that we’ve described a very simple way of validating your code, and there are other ways to scale this into production. But that’s not the goal of this article. Here, we’d like to give you a quick insight into structured data and what you can do with the Rich Results Testing Tool.

See a preview of your rich results

The preview option is one of the coolest things in the Rich Results Testing Tool. This gives you an idea of how that page or article will appear on Google. There are several rich results that you can test, like breadcrumbs, courses, job postings, recipes, and many more.

These previews aren’t just for show: you can use them to improve the look of your rich results. Maybe the images look weird, or the title is not very attractive. Use these insights to your advantage and get people to click your listings!

an example of a preview for a rich result in Google's Rich Results Test
Get a glimpse of how your rich result might appear in the SERPs

This is a short overview of what you can see and do with the Rich Results Testing Tool. Remember that your content is eligible for rich results if everything is green in the Rich Results Testing Tool and no errors are found. This does not — and we mean not — guarantee that Google will show rich results for this page. You’ll just have to wait and see.

Read more: Rich results are rocking the SERPs »


Google Launches Custom Event Data Import For GA4 via @sejournal, @MattGSouthern

Google announced a new feature for Google Analytics 4 (GA4), rolling out support for custom event data import.

This allows you to combine external data sources with existing GA4 data for more comprehensive reporting and analysis.

Google’s announcement reads:

“With this feature, you can use a combination of standard fields and event-scoped custom dimensions to join and analyze imported event metadata with your existing Analytics data.

You can then create custom reports for a more complete view of your Analytics data and imported event metadata.”

Custom Event Data Import: How It Works

Google’s help documentation describes the new capability:

“Custom event data import allows you to import and join data in ways that make sense to you. You have more flexibility in the choice of key and import dimensions.”

You begin the process by defining reporting goals and identifying any relevant external data sources not collected in Google Analytics.

You can then set up custom, event-scoped dimensions to use as “join keys” to link the imported data with Analytics data.

Mapping Fields & Uploading Data

Once the custom dimensions are configured, Google provides a detailed mapping interface for associating the external data fields with the corresponding Analytics fields and parameters.

This allows seamless integration of the two data sources.

Google’s help documentation reads:

“In the Key fields table, you’ll add the Analytics fields to join your imported data. In the Import fields table, you’ll select the external fields to include via the join key across both standard Analytics fields/dimensions and custom typed-in event parameters.”

After the data is uploaded through the import interface, Google notes it can take up to 24 hours for the integrated data set to become available in Analytics reports, audiences, and explorations.
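
As a hypothetical sketch (the field names here are invented for illustration, not Google’s), an uploaded CSV that uses an event-scoped custom dimension article_id as the join key and adds two metadata fields might look like this:

  article_id,author,content_category
  A-1001,Jane Doe,how-to
  A-1002,John Smith,news

Once joined, reports could break down engagement events by author or content_category, even though neither is collected on-site.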

Why SEJ Cares

GA4’s custom event data import feature creates opportunities for augmenting Google Analytics data with a business’s proprietary sources.

This allows you to leverage all available data, extract actionable insights, and optimize strategies.

How This Can Help You

Combining your data with Google’s analytics data can help in several ways:

  1. You can create a centralized data repository containing information from multiple sources for deeper insights.
  2. You can analyze user behavior through additional lenses by layering your internal data, such as customer details, product usage, marketing campaigns, etc., on top of Google’s engagement metrics.
  3. Combining analytics data with supplementary data allows you to define audience segments more granularly for targeted strategies.
  4. Using the new data fields and dimensions, you can build custom reports and dashboards tailored to your specific business.

For businesses using GA4, these expanded reporting possibilities can level up your data-driven decision-making.


Featured Image: Muhammad Alimaki/Shutterstock

David Vs. Goliath [Part 2]: Algorithm Updates Have Become The Biggest Risk In SEO via @sejournal, @Kevin_Indig

Boost your skills with Growth Memo’s weekly expert insights. Subscribe for free!

Taking a break from analyzing leaked Google ranking factors and AI Overviews, let’s come back to the question, “Do big sites get an unfair advantage in Google Search?”

In part 1 of David vs. Goliath, I found that bigger sites indeed grow faster than smaller sites, but likely not because they’re big but because they’ve found growth levers they can pull over a long time period.

  • The analysis of 1,000 winner and 1,000 loser sites shows that communities have gained disproportional SEO visibility over the last 12 months, while ecommerce retailers and publishers have lost the most.
  • Backlinks seem to have lost weight in Google’s ranking systems over the last 12 months, even though overperformers still have stronger backlink profiles than underperformers.
  • However, newcomer sites still have good chances to grow big, but not in established verticals.

The correlation between SEO visibility and the number of linking domains is strong but was higher in May 2023 (0.81) than in May 2024 (0.62). Sites that lost organic traffic showed lower correlations (0.39 in May 2023 and 0.41 in May 2024). Even though sites that gained organic visibility have more backlinks, the signal seems to have come down significantly over the last 12 months.

In the second part, I share more insights from the data about how and when sites grow or decline in SEO traffic. My goal is to colorize the modern, working approach to SEO and contrast the old, dying approach.

Insights:

  • Most sites lose during core algorithm updates but win outside of them.
  • Most sites grow linearly, not “exponentially.”
  • Tool + programmatic SEO works well.
  • High ad load and confusing design work poorly.
Image Credit: Lyna ™

Hard(Core) Algorithm Updates

During almost half of the year, you can expect at least one Google update to be rolling out.

According to the official page for ranking “incidents,” 2021-2023 had an average of 170 days of Google updates.

Days of the year with Google updates (Image Credit: Kevin Indig)

Keep in mind that these counts reflect only official updates and their announced roll-out windows. According to Danny Sullivan, the impact of updated ranking systems can come into effect well after roll-out, when new data is infused into the system.

So, to the folks saying “it’s a never-ending update” or “the update isn’t over”: search is always being updated.

Image Credit: Kevin Indig

As I wrote in Hard(Core) Algorithm Updates, algo updates become a growing challenge for Google as a user acquisition channel:

No platform has as many changes of requirements. Over the last 3 years, Google launched 8 Core, 19 major and 75-150 minor updates. The company mentions thousands of improvements every year.

Every platform improves its algorithms, but not as often as Google. Instagram launched 6 major algorithm changes over the last 3 years. LinkedIn launched 4.

The top 1,000 domains with the biggest traffic losses reflect the risk: When a domain loses organic traffic, it’s most likely due to a Google core algorithm update. A few examples:

  • In SaaS, applicant tracking software company Betterteam was caught by the September 2023 Helpful Content and October 2023 core update, likely because of too much programmatic “low-quality” content.
  • Hints from the Google ranking factor leak indicate a connection between brand searches, backlinks, and content. Whether that’s true or not and if we can influence it remains to be seen, but for Betterteam, brand searches have stagnated since March 2022 while the number of pages has been growing.
Image Credit: Kevin Indig
  • In ecommerce, big US retailers across all verticals (fashion, home, mega-retailers) have been on the decline since the August 2023 core update. More about that in a moment.
Image Credit: Kevin Indig
  • In publishing, sites like Movieweb have also started declining since August 2023. In this case, it’s interesting how Screenrant picks up market share but also dips during the March 2024 core update.
Image Credit: Kevin Indig

Overlapping algorithm updates make it near-impossible to understand what happened, which is a reverse engineering problem for SEO pros and also a guideline issue for anyone responsible for organic traffic. To understand what guideline you violated, you need to be able to understand what happened.

S-Curves Are Rare

It’s rare for a domain to grow exponentially (actually, sigmoidally), and the average of the top 1,000 domains by organic traffic growth shows linear growth as well. The upside is that growth is more predictable.

Image Credit: Kevin Indig

A great example of the modern approach to SEO in SaaS is the AI tool Quillbot. With a simple but effective design, the tool makes it easy for users to solve issues instead of reading about how to solve them.

Owner Learneo, who also owns Course Hero, saw consistent growth outside of Google algorithm updates. Like German startup DeepL, Quillbot has programmatic pages for translation queries like “translate Arabic to Hindi” or “translate German to English.” The combination of programmatic SEO and a tool works like a charm.

Public relations management tool Muck Rack has programmatic pages for every client (50,000+) in its /media-outlet/ folder, like muckrack.com/media-outlet/fintechzoom. Each page ranks for the client name and has a description, a few details about the company, and the latest press releases for fresh content. Despite not being a tool, the programmatic play works, and Google deems it valuable.

In ecommerce, brands saw the strongest growth.

A few examples:

  • Kay Jewelry (outlet).
  • Lenovo.
  • Steve Madden.
  • Sigma (photo).
  • Billabong.
  • Coleman.
  • Hanes.
  • Etc.

Obviously, there are exceptions on both fronts: brands that lost organic traffic and retailers that gained. We need more data, but it seems that Google has favored brands in the search results over retailers since August 2023.

In publishing, garagegymreviews.com is one of the few affiliate sites that has seen strong growth. It’s important to point out that the main channel of the business is YouTube.

Another example is fodors.com, a travel site that grew predominantly because of its community.

Line graph showing the growth of fodors.com and its subdomains from July 2022 to May 2024. Fodors.com shows the highest growth, followed by the community, world, and news subdomains. (Data by Ahrefs; Image Credit: Kevin Indig)

Algo Thrashing

We want a better Google, and Google seems to have taken notice. The response is stronger algorithms that can thrash sites into oblivion and have become the biggest risk in SEO.

At the same time, I haven’t noticed many sites growing due to algorithm updates, meaning the positive effect is indirect: Competitors might be losing traffic.

The big question to finish with is how to mitigate the risk of being hit by an algorithm update. While there is absolutely no guarantee, we can agree on what sites that are unaffected by updates have in common:

  • Allowing Google to index only high-quality pages.
  • Investing in content quality with expert writers and high effort (research, design).
  • Offering good design that makes content easy to read and answers quick to find.
  • Reducing ads as much as possible.

Google Search Status Dashboard


Google Quietly Ends Covid-Era Rich Results via @sejournal, @martinibuster

Google has removed the Covid-era structured data associated with the Home Activities rich results, which had allowed online events to be surfaced in Search since August 2020. The removal is noted in the Search documentation changelog.

Home Activities Rich Results

The structured data for the Home Activities rich results allowed providers of online livestreams, pre-recorded events and online events to be findable in Google Search.

The original documentation has been completely removed from the Google Search Central webpages and now redirects to a changelog notation explaining that the Home Activities rich result is no longer available for display.

The original purpose was to allow people to discover things to do from home while in quarantine, particularly online classes and events. Google’s rich results surfaced details of how to watch, description of the activities and registration information.

Providers of online events were required to use Event or Video structured data. Publishers and businesses with this kind of structured data should be aware that this rich result is no longer surfaced. However, it’s not necessary to remove the markup if doing so is a burden; publishing structured data that isn’t used for rich results won’t hurt anything.

The changelog for Google’s official documentation explains:

“Removing home activity documentation
What: Removed documentation on home activity structured data.

Why: The home activity feature no longer appears in Google Search results.”

Read more about Google’s Home Activities rich results:

Google Announces Home Activities Rich Results

Read the Wayback Machine’s archive of Google’s original announcement from 2020:

Home activities

Featured Image by Shutterstock/Olga Strel

Unlocking The Secrets Of Google Ad Auctions via @sejournal, @siliconvallaeys

In the world of search marketing, ad auction dynamics play a crucial role in determining ad placements and costs.

Since the DOJ trial against Google, a few elements of the ad auction have gained visibility in the advertising community.

Due to the nature of the trial, the nuances of the auction have been portrayed as serving primarily to increase ad costs. But while higher costs per click (CPCs) are rightfully viewed with skepticism, consider that they may be a side effect of something advertisers actually want.

I believe nobody should care about CPC.

Instead, the focus should be on cost per action (CPA), return on ad spend (ROAS), return on investment (ROI), or another metric more closely related to business outcomes than CPC.

If you disagree with that premise, you will disagree with the rest of my post. But if you are willing to consider that a higher CPC is not always a bad thing, read on to learn how to explain it to a boss or client who is always on your case about CPCs being too high.

We’ll explore key components of ad auctions, including ad rank thresholds and reserve prices, out-of-order promotions, Randomized Generalized Second-Price (RGSP) mechanisms, and pCTR normalizers to understand how these elements work to create a more effective advertising ecosystem.

But first, let’s cover some of the basics of the ad auction.

The Importance Of Ad Rank

Ad Rank is a fundamental component of ad auctions, balancing bid amounts with ad quality to determine ad placement on the search results page. The basic formula is:

Ad Rank = Max CPC × predicted CTR

This formula ensures that both bid amount and ad quality are considered when determining ad placement.

Predicted CTR (pCTR) is an estimate of how likely it is that an ad will be clicked when shown for a particular search query. This metric is critical because it reflects the ad’s relevance and expected performance.

pCTR Impacts Actual CPC

The actual cost-per-click (CPC) that advertisers pay in an ad auction is influenced by the predicted click-through rate (pCTR) of their ads.

Essentially, ads with higher pCTR can achieve better ad positions at a lower actual CPC compared to ads with lower pCTR.

This encourages advertisers to create highly relevant and engaging ads that align with user intent, as improving pCTR can lead to more efficient spending and better ad placements.
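
In its simplified textbook form (a sketch based on the long-published AdWords second-price formula, with pCTR standing in for Quality Score and ignoring the thresholds and normalizers discussed later), the math looks like this:

  actual CPC = (Ad Rank of the ad below you / your pCTR) + $0.01

With hypothetical numbers: an ad with a pCTR of 10 sitting above a competitor whose Ad Rank is 20 pays about 20 / 10 = $2.00, while an otherwise identical ad with a pCTR of 5 would have to pay about $4.00 to hold the same position.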

Google Ranks Ads Based On CPM

You read that right, and I haven’t gone mad. Since we’re exploring the dynamics of ad auctions and how they influence costs, a helpful point for advertisers to understand is that Google’s ad auction is not a CPC auction but rather a cost-per-thousand-impressions (CPM) auction.

That it’s not a CPC auction should be obvious. After all, pCTR is an equally important factor, and the ad with the highest MaxCPC doesn’t automatically win.

Advertisers bid a maximum CPC (or set a tROAS or tCPA, which gets turned into a MaxCPC at the time of each auction), and when that is combined with pCTR, you get an estimated CPM (eCPM).

The ad with the highest eCPM wins the auction. Since the ad with the highest ad rank wins the auction, we can see that ad rank and eCPM are interchangeable.
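
A quick worked example with hypothetical numbers: Ad A bids a MaxCPC of $1.00 with a pCTR of 2%, so its eCPM is 1.00 × 0.02 × 1,000 = $20. Ad B bids a MaxCPC of $3.00 with a pCTR of 0.5%, for an eCPM of 3.00 × 0.005 × 1,000 = $15. Ad A wins the auction despite bidding a third as much, because it is expected to earn more per thousand impressions.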

And by the way, any publisher can tell you that the best way to monetize a finite number of web visits is by maximizing the CPM, so it should make sense that Google wants to sell ads to the advertisers with the highest CPMs. I explain this in a video.

The Role Of pCTR In Ad Auctions

pCTR is a dynamic metric that influences ad placement and cost. It is calculated for each auction based on the specific context of the search query.

Advertisers with high pCTR benefit from lower CPCs and better ad positions, as the system rewards ads that are more relevant and provide a better user experience.

Understanding and optimizing relevance is crucial for advertisers. High-quality ads that resonate with users are more likely to achieve higher pCTR, reducing overall costs and improving campaign effectiveness.

This dynamic nature of pCTR ensures that advertisers continuously strive to improve ad quality, benefiting both users and advertisers.

Quality Score Is Not pCTR

Quality Score (QS) and predicted click-through rate (pCTR) are both critical components for advertisers, but they are not the same.

QS is a 1-10 integer representing the quality and relevance of an ad, taking into account factors such as ad relevance, landing page experience, and historical performance. It is a key performance indicator to help advertisers navigate their way to more relevant ads.

On the other hand, pCTR is a dynamic metric that estimates the likelihood of an ad being clicked for a specific search query.

It varies with each auction and reflects the ad’s expected performance in real time. While QS provides a broad assessment of ad quality, pCTR focuses specifically on predicting user engagement for individual auctions.

Now that I’ve covered the foundation of the ad auction, let’s explore the nuanced aspects that surfaced during the trial.

Thresholds And Reserve Prices

What Are Thresholds And Reserve Prices?

The ad auction is not as simple as ranking ads and then showing them from highest to lowest rank. There are thresholds that determine a number of things, including an ad’s eligibility for a more prominent location on the page and the reserve price for it to be shown at all.

These thresholds vary based on factors such as ad quality, position, user signals, and the specific topic of the search.

Google believes ads are information, too, and should help answer questions. So, there is a quality threshold an ad must meet before it can be shown above organic results.

This is why many searches have fewer than 4 ads above the search results. According to Google’s internal data, as of 2020, fewer than 2% of all searches on Google had 4 or more ads, regardless of position on the page.

How Thresholds And Reserve Prices Impact Costs

To explain this, we need to introduce the notion of an ad’s long-term value (LTV), a measure of the economic benefit of showing the ad minus the expected cost of showing it.

The economic benefit is the ad rank, or pCTR × Max CPC, i.e., how much Google predicts it will earn from showing the ad.

The cost of showing the ad is a prediction of the possibility that the ad will harm user experience and cause them to start avoiding future ads or suffer ad blindness.

The predicted negative impact is the threshold, or reserve price, for an ad. Only if its economic benefit exceeds the expected cost can the ad be shown. So if LTV > 0, the ad may show.

This means that ads may need to pay more than $0.01 (or the equivalent lowest currency in other markets) in order to appear, and that raises prices.
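
As a numeric sketch (hypothetical values): an ad with a MaxCPC of $2 and a pCTR of 4% has an economic benefit of 2 × 0.04 = $0.08 per impression. If Google’s models put the expected long-term cost of showing it (the risk of ad blindness and future ad avoidance) at $0.05, then LTV = 0.08 − 0.05 > 0 and the ad may show, with that $0.05 acting as its reserve price. If the predicted cost were $0.10 instead, the ad would be suppressed no matter how weak the competition.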

How Do Thresholds And Reserve Prices Benefit Advertisers?

If all second-price auction prices were determined by the next competitor, many advertisers would fall below the LTV > 0 thresholds even though they have a maxCPC that could get them above the threshold.

Google honors the advertiser’s wish to show their ad by collecting the CPC necessary to offset the predicted negative value of showing the ad.

You can think of the threshold as a hidden participant in the ad auction whose ad is tied to the position of the threshold. Beating this threshold raises the effective CPC an advertiser pays, but it also enables the advertiser to get their ads to show in scenarios where they otherwise may not have shown while paying no more than their maximum bid.

For example, in a scenario where your ad is the sole eligible contender, you may be required to pay the reserve price, which is influenced by the thresholds.

In a scenario without strong competition, a very good ad with high quality and a high MaxCPC could find itself unable to meet the threshold. To ensure the advertiser gets what they want, Google bumps their effective CPC so that they meet the threshold and their ad can be shown (LTV > 0).

Out-Of-Order Ad Promotion

Now that we understand reserve prices and thresholds, let’s look at a particular example that involves the threshold for ads to be shown at the top of the page.

What Is Out-Of-Order Ad Promotion?

Out-of-order ad promotion is when an ad with a lower Ad Rank is allowed to appear above an ad with a higher Ad Rank.

Let’s dive into this.

The thresholds have a relevance component; for example, Google may say that an ad can only be promoted to the top of the page if it has at least a certain level of relevance (pCTR).

Because Ad Rank is made up of MaxCPC and pCTR, it is possible that a lower-ranked ad (Ad B) could have a better pCTR but be stuck at the bottom of the page behind a higher-ranked ad (Ad A) with a lower pCTR.

If the pCTR promotion threshold was 5%, and Ad Rank was honored, neither of these ads could appear at the top of the page even though ad B has a high enough quality. It would be forced to stay behind Ad A in order to honor Ad Rank.

Ad    MaxCPC    pCTR    Ad Rank
A     $10       3%      30
B     $2        10%     20

In out-of-order promotion, ad B is allowed to jump over ad A.

How Out-Of-Order Ad Promotion Impacts Costs

When advertiser A’s low quality doesn’t meet the promotion threshold but advertiser B does meet it, rather than pushing both advertisers to the bottom of the page, advertiser B is allowed to be promoted out of order above advertiser A.

Now, advertiser B pays the CPC needed to beat the top of page threshold (reserve price) which is more than if they were left at the bottom of the page. It can also be more than if they had to beat the Ad Rank of Ad A.

How Out-Of-Order Ad Promotion Benefits Advertisers

Out-of-order ad promotion, where ads are promoted based on factors beyond just the bid amount, benefits advertisers. This approach considers various thresholds, including ad relevance, ensuring that high-quality ads have a chance to appear in top positions even if their Ad Ranks are not the highest.

This can help smaller advertisers with highly relevant ads compete effectively against larger competitors with bigger budgets.

By promoting ads based on relevance and quality, advertisers are incentivized to create more engaging and useful ads, ultimately leading to better user experiences and higher conversion rates.

Randomized Generalized Second-Price (RGSP)

What Is RGSP?

In a traditional second-price auction, the highest bidder wins the ad spot at the price of the second-highest bid.

But remember that the second price depends on pCTR, a number predicted with machine learning. Predictions are not precise, and it can happen that multiple advertisers are competing very closely, and the only thing that sets them apart is an ML-generated pCTR.

To ensure that inaccurate predictions don’t become self-reinforcing truths, ads can be randomly re-ordered. This introduces chances for experimentation that the ML algorithm can use to evaluate its accuracy and improve future predictions.

RGSP is also a system that helps ensure normalization is handled correctly. It’s hard to gather the data needed for normalization if ad positions never vary: you need to see the same ad’s performance when it wins and when it loses to identify how much of its performance is due to its inherent quality versus external factors like where it was shown.

How RGSP Impacts Costs

RGSP introduces an element of unpredictability, which encourages advertisers to bid their true value rather than strategically underbidding.

When ads are re-ordered and don’t follow the pure ad ranking mechanism, CPCs will be different, and that can raise prices for some advertisers.

How RGSP Helps Advertisers

This mechanism helps prevent ads with high predicted relevance from consistently hogging top positions, promoting a diverse range of ads. By fostering a competitive environment, RGSP mechanisms encourage advertisers to focus on ad quality and relevance, which can lead to better performance and higher return on investment (ROI).

It prevents ads with incorrectly predicted high pCTRs from unfairly remaining in top positions and beating newer ads with inaccurate low pCTRs.

Normalization Techniques

What Are Normalization Techniques?

Google’s normalization techniques ensure that ad rankings reflect relevance rather than being influenced by external factors like ad format or position.

By incorporating metrics such as projected click-through rate (pCTR) and adjusting for factors like ad format, the system creates a level playing field for all advertisers.

Ad rank is partially based on pCTR. But we know that CTR depends on a lot more than just the text of the ad itself. For example, all else being equal, ads in higher positions will get a higher CTR than those in lower positions. Ads with more visible lines of ad text will get higher CTRs than those with fewer lines of text.

Project Momiji works to normalize pCTRs so that a more appealing ad format doesn’t unfairly penalize advertisers whose ads didn’t get the same visual treatment.

How Normalization Techniques Impact Costs

When pCTR is normalized for ad formats and page position, some advertisers with high pCTRs will see a downward adjustment. This is to say that the high pCTR was driven in part by the inherent benefit of a more appealing ad format or a higher page position.

Advertisers should compete on a level playing field, so when this normalization happens, some advertisers will pay more than if the normalization hadn’t happened.

For example, an ad shown in position 1 with a pCTR of 10% may only have had a pCTR of 8% had it been shown in position 2. There’s an underlying ad-relevance pCTR that can be estimated by removing the boosts that come from factors outside the advertiser’s control, like ad format, position on the page, and the number of additional ads.

Google can then price all ads based on their normalized pCTR. So, in our example, if the pCTR for the auction is 10% but normalized for all factors, it would only be 8%, then the advertiser’s effective CPC will be higher.
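
Following that example through with hypothetical numbers: if pricing keys off the normalized 8% rather than the boosted 10%, the CPC required to clear any given eCPM threshold rises by a factor of 10/8, or 25%. An ad that would have paid $1.00 against its boosted pCTR pays roughly $1.25 against its normalized one.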

How Normalization Techniques Help Advertisers

Normalization techniques prevent unfair advantages stemming from superior positions or ad treatments, ensuring that ad pricing reflects true relevance. This approach benefits advertisers by promoting fair competition and encouraging investment in high-quality ads that align with user intent.

Focus Less On CPC

Understanding the intricacies of ad auction dynamics is crucial for advertisers seeking to optimize their campaigns and achieve better outcomes.

While higher CPCs might initially appear disadvantageous, they often result from mechanisms designed to promote ad quality, relevance, and a better user experience.

By focusing on metrics that truly matter, such as CPA, ROAS, and ROI, advertisers can better appreciate the benefits of these dynamics.

The components of the ad auction, from ad rank thresholds to out-of-order promotions and RGSP mechanisms, work together to create a competitive yet fair environment.

This encourages advertisers to continuously improve their ads, ultimately benefiting both their business and the users they aim to reach. By embracing these complexities and striving for high-quality, relevant ads, advertisers can navigate the ad auction landscape more effectively and achieve greater success in their digital marketing efforts.


Featured Image: ImageFlow/Shutterstock