Meta Plans A Less Punitive AI-Generated Content Policy via @sejournal, @martinibuster

Meta announced an update to its AI labeling policy, expanding its definition of “manipulated media” beyond AI-generated videos to include deceptive audio and images on Facebook, Instagram, and Threads.

An important feature of the new policy is its sensitivity to being perceived as restricting freedom of expression. Rather than removing problematic content, Meta is instead simply labeling it. Meta introduced two labels, “Made with AI” and “Imagined with AI,” to make clear what content was created or altered with AI.

New Warning Labels

Labeling of AI-generated content will rely on detecting industry-shared signals of AI authorship and on user self-disclosure:

“Our ‘Made with AI’ labels on AI-generated video, audio, and images will be based on our detection of industry-shared signals of AI images or people self-disclosing that they’re uploading AI-generated content”

Content that is significantly misleading may receive more prominent labels so that users have more context.

Harmful content that violates the Community Standards, such as content that incites violence, election interference, bullying, or harassment, will qualify for removal, regardless of whether it is human- or AI-generated.

Reason For Meta’s Updated Policy

The original AI labeling policy was created in 2020 and, because of the state of the technology at the time, it was narrowly confined to addressing deceptive videos (the kind that depicted public figures saying things they never did). Meta’s Oversight Board recognized that the technology has progressed to the point that a new policy was needed. The new policy accordingly expands to address AI-generated audio and images, in addition to videos.

Based On User Feedback

Meta’s process for updating its rules appears to have anticipated pushback from all sides. The new policy is based on extensive feedback from a wide range of stakeholders and input from the general public. The new policy also has the flexibility to bend if needed.

Meta explains:

“In Spring 2023, we began reevaluating our policies to see if we needed a new approach to keep pace with rapid advances… We completed consultations with over 120 stakeholders in 34 countries in every major region of the world. Overall, we heard broad support for labeling AI-generated content and strong support for a more prominent label in high-risk scenarios. Many stakeholders were receptive to the concept of people self-disclosing content as AI-generated.

…We also conducted public opinion research with more than 23,000 respondents in 13 countries and asked people how social media companies, such as Meta, should approach AI-generated content on their platforms. A large majority (82%) favor warning labels for AI-generated content that depicts people saying things they did not say.

…And the Oversight Board noted their recommendations were informed by consultations with civil-society organizations, academics, inter-governmental organizations and other experts.”

Collaboration And Consensus

Meta’s announcement explains that the company plans to keep the policy in step with the pace of technology by revisiting it with organizations like the Partnership on AI, governments, and non-governmental organizations.

Meta’s revised policy emphasizes the need for transparency and context for AI-generated content, holds that removal will be based on violations of its Community Standards, and makes labeling the preferred response to potentially problematic content.

Read Meta’s announcement

Our Approach to Labeling AI-Generated Content and Manipulated Media

Featured Image by Shutterstock/Boumen Japet

Google Explains How It Chooses Canonical Webpages via @sejournal, @martinibuster

In a Google Search Central video, Google’s Gary Illyes explained the part of webpage indexing that involves selecting canonicals: what a canonical means to Google, a thumbnail explanation of webpage signals, the centerpiece of a page, and what Google does with duplicates, which implies a new way of thinking about them.

What Is A Canonical Webpage?

There are several ways of considering what canonical means: the publisher’s and the SEO’s viewpoints from our side of the search box, and what canonical means from Google’s side.

Publishers identify what they feel is the “original” webpage, while the SEO’s conception of canonicals is about choosing the “strongest” version of a webpage for ranking purposes.

Canonicalization for Google is an entirely different thing from what publishers and SEOs think it is, so it’s good to hear it explained by a Googler like Gary Illyes.

Google’s official documentation about canonicalization uses the word deduplication to refer to the process of choosing a canonical and lists five typical reasons why a site might have duplicate pages.

Five Reasons For Duplicate Pages

  1. “Region variants: for example, a piece of content for the USA and the UK, accessible from different URLs, but essentially the same content in the same language
  2. Device variants: for example, a page with both a mobile and a desktop version
  3. Protocol variants: for example, the HTTP and HTTPS versions of a site
  4. Site functions: for example, the results of sorting and filtering functions of a category page
  5. Accidental variants: for example, the demo version of the site is accidentally left accessible to crawlers”

Canonicals can be considered in three different ways and there are at least five reasons for duplicate pages.
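To make a few of those variant types concrete, here is a simplified TypeScript sketch of collapsing protocol, device, and sort/filter URL variants onto one representative URL. This is my own illustration, not Google’s system, and the parameter names are invented for the example:

```typescript
// Simplified illustration (not Google's system): collapse common URL variants
// (protocol, m. subdomain, sort/filter parameters) onto one representative URL.
// The parameter names below are examples, not an authoritative list.
const FUNCTION_PARAMS = new Set(["sort", "filter", "order"]);

function normalizeUrlVariant(rawUrl: string): string {
  const url = new URL(rawUrl);
  url.protocol = "https:";                          // protocol variants
  url.hostname = url.hostname.replace(/^m\./, "");  // device (m.) variants
  for (const param of [...url.searchParams.keys()]) {
    if (FUNCTION_PARAMS.has(param)) {
      url.searchParams.delete(param);               // site-function variants
    }
  }
  return url.toString();
}

// These all normalize to "https://example.com/widgets/":
// normalizeUrlVariant("http://m.example.com/widgets/?sort=price");
// normalizeUrlVariant("https://example.com/widgets/?filter=blue");
```

Region variants and accidental variants usually can’t be resolved from the URL alone, which is why Google also compares the content itself, as described next.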

Gary describes one more way to think of canonicals.

Signals Are Used For Choosing Canonicals

Illyes shares one more definition of a canonical, this time from the indexing point of view, and talks about the signals that are used for selecting canonicals.

Gary explains:

“Google determines if the page is a duplicate of another already known page and which version should be kept in the index, the canonical version.

But in this context, the canonical version is the page from a group of duplicate pages that best represents the group according to the signals we’ve collected about each version.”

Gary stops to explain duplicate clustering and then returns to talking about signals a short while later.

He continued:

“For the most part, only canonical pages appear in Search results. But how do we know which page is canonical?

So once Google has the content of your page, or more specifically the main content or centerpiece of a page, it will group it with one or more pages featuring similar content, if any. This is duplicate clustering.”

I just want to stop here to note that Gary refers to the main content as the “centerpiece of a page,” which is interesting because there’s a concept introduced by Google’s Martin Splitt called the Centerpiece Annotation. Splitt didn’t really explain what the Centerpiece Annotation is, but this bit that Gary shared helps.

The following is the part of the video where Gary talks about what signals actually are.

Illyes explains what “signals” are:

“Then it compares a handful of signals it has already calculated for each page to select a canonical version.

Signals are pieces of information that the search engine collects about pages and websites, which are used for further processing.

Some signals are very straightforward, such as site owner annotations in HTML like rel=”canonical”, while others, like the importance of an individual page on the internet, are less straightforward.”
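To ground the “straightforward” example, here is a minimal TypeScript sketch (browser context assumed; an illustration, not anything Google uses) that reads the rel=canonical hint a site owner declares in a page’s HTML:

```typescript
// Minimal sketch: read the publisher-declared rel="canonical" hint from a page.
// This is the kind of "site owner annotation in HTML" Illyes mentions;
// it is a hint Google weighs alongside other signals, not a directive.
function getDeclaredCanonical(doc: Document): string | null {
  const link = doc.querySelector<HTMLLinkElement>('link[rel="canonical"]');
  return link ? link.href : null;
}

// In a browser console on any page:
// console.log(getDeclaredCanonical(document) ?? "no canonical hint declared");
```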

Duplicate Clusters Have One Canonical

Gary next explains that, for each cluster of duplicate pages, one page is chosen as the canonical that represents the cluster in the search results. Every cluster of duplicates has one canonical.

He continues:

“Each of the duplicate clusters will have a single version of the content selected as canonical.

This version will represent the content in Search results for all the other versions.

The other versions in the cluster become alternate versions that may be served in different contexts, like if the user is searching for a very specific page from the cluster.”
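As a way to picture the clustering and selection Gary describes, here is a simplified TypeScript sketch. The fingerprinting and the single signalScore number are stand-ins invented for the example; Google’s actual signals and grouping are far more involved:

```typescript
// Simplified sketch of duplicate clustering: group pages whose main content
// matches, then pick the highest-scoring page in each cluster as canonical.
// The fingerprint and signal score below are stand-ins, not Google's methods.
interface Page {
  url: string;
  mainContent: string;   // the "centerpiece" text
  signalScore: number;   // stand-in for the combined signals Gary mentions
}

function fingerprint(text: string): string {
  // Crude normalization so near-identical content hashes to the same key.
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

function selectCanonicals(pages: Page[]): Map<string, Page> {
  const clusters = new Map<string, Page[]>();
  for (const page of pages) {
    const key = fingerprint(page.mainContent);
    const group = clusters.get(key) ?? [];
    group.push(page);
    clusters.set(key, group);
  }
  // One canonical per cluster: the page with the strongest signals.
  const canonicals = new Map<string, Page>();
  for (const [key, group] of clusters) {
    canonicals.set(key, group.reduce((a, b) => (b.signalScore > a.signalScore ? b : a)));
  }
  return canonicals;
}
```

The point of the sketch is the shape of the process: duplicates are grouped first, and only then is one page per group promoted to canonical based on its signals.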

Alternate Versions Of Webpages

That last part is really interesting and important to consider because it can help a site rank for multiple variations of a keyword, particularly for ecommerce webpages.

Sometimes the content management system (CMS) creates duplicate webpages to account for variations of a product, like size or color, which can in turn change the description. Google can choose those variant pages to rank in the search results when a variant page more closely matches a search query.

This is important to think about because it might be tempting to redirect or noindex variant webpages to keep them out of the search index, out of fear of the (non-existent) keyword cannibalization problem. Adding a noindex to pages that are variants of one page can backfire because there are scenarios where those variant pages are the best ones to rank for a more nuanced search query that contains colors, sizes, or version numbers that differ from those on the canonical page.

Top Takeaways About Canonicals (And More) To Remember

There is a lot of information packed in Gary’s discussion of canonicals, including some side topics about the main content.

Here are seven takeaways to consider:

  1. The main content is referred to as the Centerpiece.
  2. Google calculates a “handful of signals” for each page it discovers.
  3. Signals are data used for “further processing” after webpages are discovered.
  4. Some signals are in the control of the publisher, like hints (and presumably directives). The hint that Illyes mentioned is the rel=canonical link element.
  5. Other signals are outside of the control of the publisher, like the importance of the page in the context of the Internet.
  6. Some duplicate pages can serve as alternate versions.
  7. Alternate versions of webpages can still rank and are useful for Google (and the publisher) for ranking purposes.

Watch the Search Central Episode about indexing:

How Google Search indexes pages

Featured image from Google video/altered by author

Google’s Indexing Process: When Is “Quality” Determined? via @sejournal, @MattGSouthern

In a recent video, Google’s Gary Illyes, a search team engineer, shared details about how the search engine assesses webpage quality during indexing.

This information is timely, as Google has steadily raised the bar for “quality” content.

Quality: A Key Factor in Indexing & Crawling Frequency

Illyes described the indexing stage, which involves analyzing a page’s textual content, tags, attributes, images, and videos.

During this stage, Google also calculates various signals that help determine the page’s quality and, consequently, its ranking in search results.

Illyes explains:

“The final step in indexing is deciding whether to include the page in Google’s index. This process, called index selection, largely depends on the page’s quality and the previously collected signals.”

This detail is especially relevant for publishers and SEO professionals struggling to get content indexed.

You could be doing everything right from a technical standpoint. However, your pages won’t get indexed if they don’t meet a certain quality threshold.

Further, Google has previously confirmed that high-quality content is crawled more frequently, which is crucial for staying competitive in search results.

One of Google’s goals for the year is to conserve crawling resources by prioritizing pages that “deserve” to be crawled, emphasizing the urgency of meeting Google’s quality standard.

Signals & Duplicate Content Handling

Illyes touched on how Google analyzes signals.

Some signals, like the rel=”canonical” annotation, are straightforward, while others, such as a page’s importance on the internet, are more complex.

Google also employs “duplicate clustering,” where similar pages are grouped, and a single canonical version is selected to represent the content in search results. The canonical version is determined by comparing the quality signals collected about each duplicate page.

Additional Indexing Insights

Along with the insight into quality assessment, Illyes shared these notable details:

  1. HTML Parsing and Semantic Issues: Illyes discussed how Google parses the HTML of a webpage and fixes any semantic issues encountered. If unsupported tags are used within the <head> element, it can cause indexing problems (see the sketch after this list).
  2. Main Content Identification: Illyes mentioned that Google focuses on the “main content or centerpiece of a page” when analyzing it. This suggests that optimizing the primary content of a webpage is more important than incremental technical changes.
  3. Index Storage: Illyes revealed that Google’s search database is spread across thousands of computers. This is interesting context regarding the scale of Google’s infrastructure.
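As an illustration of why a stray tag inside <head> can cause trouble, the sketch below uses a browser’s DOMParser (run it in a browser, or with a DOM implementation such as jsdom). It shows standard HTML-parsing behavior, not Google’s parser specifically: the invalid element closes <head> early, and tags that follow it, such as a canonical link, end up in <body>:

```typescript
// Sketch of how an HTML parser handles a stray tag inside <head> (browser
// DOMParser shown here, not Google's parser): the invalid element ends the
// head early, and tags that follow it, like a canonical link, land in <body>.
const html = `<!doctype html><html><head>
  <title>Example</title>
  <img src="logo.png">  <!-- not valid inside <head> -->
  <link rel="canonical" href="https://example.com/page">
</head><body></body></html>`;

const doc = new DOMParser().parseFromString(html, "text/html");
const canonical = doc.querySelector('link[rel="canonical"]');

console.log(doc.head.contains(canonical));  // false: the link was pushed out of <head>
console.log(doc.body.contains(canonical));  // true
```

In other words, a single unsupported element can silently move important annotations out of the <head>, where parsers (and, presumably, search engines) expect to find them.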

Watch the full video below:

Why SEJ Cares

As Google continues prioritizing high-quality content in its indexing and ranking processes, SEO professionals should be aware of how it assesses quality.

By understanding the factors that influence indexing, such as relevance, quality, and signal calculation, SEO professionals have a clearer idea of what to aim for to meet Google’s indexing threshold.

How This Can Help You

To ensure your content meets Google’s quality standards, consider the following actionable steps:

  1. Focus on creating comprehensive content that addresses your audience’s needs and pain points.
  2. Identify current search demand trends and align your content with these topics.
  3. Ensure your content is well-structured and easy to navigate.
  4. Implement schema markup and other structured data to help Google better understand context (a minimal example follows this list).
  5. Regularly update and refresh your content to maintain relevance and value.
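For step 4, here is a minimal sketch of what an Article structured-data block might look like, expressed in TypeScript and serialized as JSON-LD. The field values are placeholders, and the exact properties you need depend on the content type (schema.org documents the vocabulary):

```typescript
// Minimal sketch: an Article JSON-LD object (schema.org vocabulary) with
// placeholder values, serialized the way it would appear in a page's HTML.
const articleSchema = {
  "@context": "https://schema.org",
  "@type": "Article",
  headline: "Example Headline",
  datePublished: "2024-04-15",
  dateModified: "2024-04-20",
  author: { "@type": "Person", name: "Example Author" },
};

const jsonLdTag =
  `<script type="application/ld+json">\n${JSON.stringify(articleSchema, null, 2)}\n</script>`;

console.log(jsonLdTag);
```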

You can potentially increase your indexed pages and crawling frequency by prioritizing quality, relevance, and meeting search demand.


FAQ

What does Google’s ‘index selection’ process involve?

The index selection process is the final step in Google’s indexing, where it decides whether to include the page in the search index.

This decision is based on the page’s quality and various signals collected during the initial assessment.

If the page doesn’t meet the quality threshold set by Google, it risks not being indexed. For this reason, the emphasis on generating high-quality content is critical for visibility in Google’s search engine.

How does Google handle duplicate content, and what role do quality signals play in this process?

Google handles duplicate content through a process called “duplicate clustering,” where similar pages are grouped. Then, a canonical version is selected to represent the group in search results.

The canonical version is selected based on the quality signals associated with each duplicate page. These signals can include attributes like the proper use of the rel=”canonical” tag or more complex factors like a page’s perceived importance on the Internet.

Ultimately, the chosen canonical version reflects Google’s assessment of which page is most likely to provide the best value to users.


Featured Image: YouTube.com/GoogleSearchCentral, April 2024. 

New Study On Perplexity AI Offers Good News For SEO via @sejournal, @martinibuster

Research by BrightEdge shows that traffic to Perplexity is surging and where the opportunities lie for optimizing for that traffic, particularly for ecommerce.

What Is Perplexity AI?

Perplexity is a self-described Answer Engine founded by researchers and engineers from OpenAI, Facebook, Quora, Microsoft, and Databricks. It has backing from many of the most influential investors and engineers in Silicon Valley, which has helped propel it as one of the top innovators in the new wave of AI search engines.

Perplexity contains an index that is ranked by its own version of PageRank. It’s a combination of a search engine and a chatbot, with the chatbot serving as the interface for receiving queries and the AI on the backend. But it retains the functionality of a chatbot in that it can perform tasks like writing an essay.

What sets Perplexity.ai apart from competitors, as will be shown below, is that Perplexity shows generous amounts of citations to websites.

Surge In Traffic

One of the key insights from the BrightEdge research is that Perplexity.ai has surged in referral traffic by a whopping 40% since January, indicating that users are interested in trying something different from the usual ten blue links.

Using their proprietary BrightEdge Generative Parser, they were able to detect AI search experiences, which showed that users are fine with an AI search engine.

The takeaway here is that the search marketing industry is right back to where it was over two decades ago when Google first appeared. SEOs like myself and others were testing to see what activities were useful and which were not.

Most people in search came into it when the technology was relatively mature and don’t know what it’s like to confront the unknown or even how to do it. The only difference between then and now is that today we know about research papers and patents. Back then we had no idea until roughly 2005.

BrightEdge’s report reflected on this period of transition:

“For marketers, who rely on organic search strategies to reach customers, new AI-first search engines like Perplexity and ChatGPT signal a tectonic shift in how brands market and sell their products. However, the newness of these AI-driven platforms means they frequently undergo dynamic changes, making them difficult to track and adapt to.”

In an ad-supported model, the organic search results compete with advertising for the most valuable search queries, but that’s not the case with Perplexity. BrightEdge sees Perplexity as an opportunity for search marketers because it is an ad-free model that sends organic search traffic.

Overlap With Google Search Generative Experience (SGE)

An interesting data point that surfaced in BrightEdge’s research is that there was a “significant” overlap between Perplexity’s search results and Google’s SGE results. Perhaps not surprisingly, the strongest overlap was in health-related search queries, likely because there’s a limited number of sites that are qualified to create content on health and medical topics.

But what may sound discouraging is that Reddit shows up across most search query topics except for healthcare and finance, two YMYL (your money/your life) topics.

Overlap In B2B Search Results

Another area of overlap with Google’s SGE is Perplexity’s tendency to rank authoritative sites for topics like Healthcare and Education, and review and local search sites for Restaurants and Travel. Big brands like Yelp and TripAdvisor are winners in Perplexity.

Overlap In Travel Search Results

Yahoo, MarketWatch, and CNN are frequently seen in finance-related search queries.

There is less overlap for ecommerce queries, apart from Wikipedia and Amazon, which both search engines rank.

According to BrightEdge:

“Google uses Quora and Consumer Reporters for third-party product information, while Perplexity references Reddit. Overall, Perplexity is most likely to reference product sites, whereas Google SGE will also include informational resources such as lifestyle and news sites.”

That’s good news for ecommerce sites that sell actual products, a genuine bright spot.

BrightEdge cites the following opportunities with Perplexity:

  • Perplexity’s share of search is rising at a rate of 39% per month.
  • Perplexity’s search results offer an average of 5.28 website citations.
  • Perplexity AI shows more citations in Travel and Restaurant queries than Google SGE.
  • BrightEdge encourages search marketers to take advantage of opportunities in optimizing for AI-driven search engines.

Jim Yu, Founder and Executive Chairman of BrightEdge, said:

“Optimizing emerging search platforms is essential for marketers because their impact will be seismic – just 1% of the global organic search market equates to approximately $1.2B in ad revenue per year… AI-first engines are steadily gaining ground and carving out their own areas of expertise, making it critical for the marketing community to master multiple search platforms.

There is too much revenue at stake to get left behind, which is why we’re closely tracking the development of these engines and all things AI search – from traffic trends and queries to result quality and more.”

Read more about optimizing for Perplexity at BrightEdge:

The Ultimate Guide to Perplexity

Featured Image by Shutterstock/rafapress

Google Responds To Criticism Over Forums At Top Of Search Results via @sejournal, @MattGSouthern

Google’s discussions and forums carousel in search results has sparked concern among SEO professionals, who worry that the prominence of forum content could lead to misinformation and scams.

Google’s Search Liaison, Danny Sullivan, has acknowledged the issue and stated that feedback has been passed along for further evaluation.

Sullivan also addressed the broader concern regarding forum content, noting that while some may dislike it, many users appreciate and actively seek it out.

This article explores the implications of the new carousel and its potential opportunities and challenges.

Concerns Raised Regarding Forum Content In Search Results

The introduction of the discussions and forums carousel has made some question Google’s commitment to surfacing reliable information.

Lily Ray, a prominent figure in the SEO community, raised this issue on Twitter, stating, “Isn’t this a bit dangerous for Google?”

She pointed out that Reddit, in particular, has been “overtaken by affiliate spam and scammers.”

Google’s Response

In response, Sullivan explained that the carousel “appears automatically if the systems think it might be relevant and useful.”

However, some users pushed back on this explanation.

Twitter user @sc_kkw argued, “If they actively seek it out, let them. It’s much easier for a user to type ‘Reddit’ at the end of their search than it is for someone who doesn’t want forum answers to sift through and find a reputable website now.”

Sullivan maintained that the goal is to show relevant content, whether from forums, blogs, or websites.

He provided an example from a personal search experience where forum results quickly solved an issue with smart window blinds, demonstrating the potential value of this content.

Potential Improvements On The Way?

Sullivan assured Ray that her concern had been understood and passed on to the search team.

He outlined potential improvements, such as adjusting the frequency of forum content for specific queries or adding disclaimers to clarify that forum participants may not be medical professionals.

Why SEJ Cares

The inclusion of the discussions and forums carousel in search results, particularly for YMYL queries, has implications for both users and publishers:

  1. User trust: If forum content containing misinformation or scams appears prominently in search results, it could erode user trust in Google’s ability to provide reliable information.
  2. Discouraged publishers: SEO professionals and creators who have invested time and resources into creating high-quality, authoritative content may feel discouraged if forum content consistently outranks their work.
  3. Public health and well-being: The spread of misinformation through forum content could potentially harm users who rely on search results for accurate medical information.

How This Can Help You

Despite the concerns raised, the inclusion of forum content in search results can present opportunities, such as:

  1. Identify content gaps: Analyzing the questions and discussions in forum results can help you identify gaps in your content and create targeted, authoritative resources to address user needs.
  2. Engage with the community: Participating in relevant forums and providing helpful, accurate information can help establish your brand as a trustworthy authority in your niche, potentially increasing visibility and traffic.
  3. Adapt your content strategy: Consider incorporating user-generated content, such as expert interviews or case studies, to provide firsthand experiences and perspectives that users find valuable in forum discussions.

In Summary

Google’s discussions and forums carousel in search results has raised concerns among SEO professionals. Google acknowledged the feedback and is considering potential improvements.

This development presents challenges and opportunities for SEO professionals to identify content gaps, engage with the community, and adapt content strategies to serve users’ needs better.


Featured Image: pathdoc/Shutterstock

YouTube Analytics Update: Impressions From New & Returning Viewers via @sejournal, @MattGSouthern

YouTube introduces new impression insights, enabling channels to see a breakdown of new and returning viewers.

  • YouTube has introduced a new feature in Studio analytics that shows the number of impressions from new and returning viewers.
  • The new feature aims to help channels better understand their audience composition and reach.
  • You can use these insights to tailor your content strategy to cater to both new and returning viewers.

Google Publishes Tutorial On Identifying INP Issues via @sejournal, @MattGSouthern

Google’s new tutorial can help you identify and resolve Interaction to Next Paint (INP) issues.

  • Google has released a video tutorial on identifying Interaction to Next Paint (INP) issues using Chrome DevTools.
  • INP has recently replaced First Input Delay (FID) as a Core Web Vital.
  • Follow the steps outlined in the tutorial to assess and optimize your website’s INP score; a minimal field-measurement sketch follows below.
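As added context, here is a minimal sketch of measuring INP in the field with the open-source web-vitals library. This is my own example and an assumption about tooling; the tutorial itself focuses on Chrome DevTools:

```typescript
// Minimal field-measurement sketch using the web-vitals library (npm: web-vitals).
// Logs each INP value as it is reported; roughly 200 ms or less is considered "good".
import { onINP } from "web-vitals";

onINP((metric) => {
  console.log(`INP: ${Math.round(metric.value)} ms (${metric.rating})`);
  // In practice you would send metric.value and metric.rating to your analytics.
});
```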

Are The r/SEO Reddit Moderators Biased Against Google? via @sejournal, @martinibuster

A post in the r/SEO subreddit by Google’s Danny Sullivan that was meant to dispel a misinformed observation was apparently removed by a moderator with zero explanation and then later restored. This isn’t an isolated incident; posts by John Mueller have also been removed without explanation, giving the perception that r/SEO moderation is biased against Google to the point of actual hostility.

This isn’t the first time that a Googler’s posts have been removed. It’s happened to John Mueller, too.

It was bad enough that the original post misrepresented what SearchLiaison had said, but it was even worse that a moderator would remove a post by a Google representative that corrected the misinformation.

The question has to be asked, what value does the r/SEO subreddit have if it doesn’t allow Google representatives to respond to misinformation and to offer help?

Redditor Misinterprets Google

The original post was about one statement that was taken out of context of a much larger tweet by SearchLiaison.

The context that went over the Redditor’s head was that SearchLiaison was recommending that if publishers do things, they do them for their readers and not because they read somewhere that it’s good for ranking.

Here’s the context:

” You want to do things that make sense for your visitors, because what “shows Google” you have a great site is to be… a great site for your visitors not to add things you assume are just for Google.

Doing things you think are just for Google is falling behind what our ranking systems are trying to reward rather than being in front of them.”

SearchLiaison listed things that SEOs do because they think Google is going to rank it better.

A partial list of what was tweeted:

“- Something saying an “expert” reviewed the content because someone mistakenly believes that ranks them better

– Weird table-of-content things shoved at the top because who knows, along the way, somehow that became a thing I’m guessing people assume ranks you better

– The page has been updated within a few days, or even is fresh on the exact day, even though the content isn’t particularly needing anything fresh and probably someone did some really light rewrite and fresh date because they think that “shows Google” you have fresh content and will rank better.”

The Redditor commented:

“To me, it was a silly thing for Search Liaison to say because it is really lame to believe that using a TOC or not would make any difference to SERP ranking.

If you take his point further of not showing to Google, you might remove breadcrumbs, internal links and related posts. In other words, anything that is of SEO value.

So it was really nonsensical advice from Google.

But I’m sure many bloggers will take it as gospel and, in desperation, remove TOCs from their sites.”

Of course, as most anyone who is objective can see, SearchLiaison wasn’t advising anyone to remove the table of contents from their articles. He was just recommending doing what’s best for your users, which makes sense. If your users hate the table of contents, then it’s a good idea to remove it because it doesn’t make a difference to Google.

And that advice was actually a gift because it helps people avoid wasting time doing things that might annoy readers, which is never a good thing.

r/SEO Subreddittors Upvote Misinformation

The weird thing about that thread is that the misinformation gets upvoted and people who actually understand what’s going on are ignored.

Here’s an example of a post that totally misunderstands what SearchLiaison posted and repeats the misinformation, yet receives sixteen upvotes, while someone with the correct understanding is upvoted only five times.

This unhelpful post received 16 upvotes:

“I did not understand why he thought table of contents were not helpful. Even before we were using the Internet, we were using books and magazines table of contents to find what we were looking for… We do the same on long posts…”

And this got only five upvotes:

“He never said that tables of contents aren’t helpful. Sometimes they are.”

Screenshot of a misinformed post in the r/SEO subreddit getting more upvotes than a high quality post

Danny Sullivan’s Post Is Restored

Danny’s post in the r/SEO subreddit was subsequently restored. It was a thoughtful 1,120-word response. Why would a moderator of the r/SEO subreddit delete that? There is no good reason to delete it and easily at least a hundred good reasons to keep Danny’s post.

Partial Screenshot Of Danny’s 1,200 Word Response

John Mueller’s Posts Were Also Deleted

Others who write about SEO and I have noticed that John Mueller’s posts have gone missing, too. It’s been a practice at Search Engine Journal to take a snapshot of Mueller’s posts when writing about them because they tended to occasionally disappear.

Composite Image Of Four Of John Mueller’s Removed Posts

Composite image of four posts that were removed from the r/SEO subreddit

Is The R/SEO Subreddit Broken?

The inexcusable removal of posts by Danny Sullivan and John Mueller creates the perception that the r/SEO subreddit moderating team is biased against Google and does not welcome their contributions.

Did the moderators remove those posts because they are biased against Google? Did they remove the posts out of a misguided anti-spam link rule?

Whatever the reason for the action against the Googlers, it’s a very bad look for the r/SEO subreddit.

Featured Image by Shutterstock/Roman Samborskyi

Google’s Crawling Priorities: Insights From Analyst Gary Illyes via @sejournal, @MattGSouthern

In a recent statement on LinkedIn, Google Analyst Gary Illyes shared his mission for the year: to figure out how to crawl the web even less.

This comes on the heels of a Reddit post discussing the perception that Google is crawling less than in previous years.

While Illyes clarifies that Google is crawling roughly the same amount, he emphasizes the need for more intelligent scheduling and a focus on URLs that are more likely to deserve crawling.

Illyes’ statement aligns with the ongoing discussion among SEO professionals about the concept of a “crawl budget,” which assumes that sites must stay within a limited number of pages that search engines can crawl daily to get their pages indexed.

However, Google’s Search Relations team recently debunked this misconception in a podcast, explaining how Google prioritizes crawling based on various factors.

Crawling Prioritization & Search Demand

In a podcast published two weeks ago, Illyes explained how Google decides how much to crawl:

“If search demand goes down, then that also correlates to the crawl limit going down.”

While he didn’t provide a clear definition of “search demand,” it likely refers to search query demand from Google’s perspective. In other words, if there is a decrease in searches for a particular topic, Google may have less reason to crawl websites related to that topic.

Illyes also emphasized the importance of convincing search engines that a website’s content is worth fetching.

“If you want to increase how much we crawl, then you somehow have to convince search that your stuff is worth fetching, which is basically what the scheduler is listening to.”

Although Illyes didn’t elaborate on how to achieve this, one interpretation could be to ensure that content remains relevant to user trends and stays up to date.

Focus On Quality

Google previously clarified that a fixed “crawl budget” is largely a myth.

Instead, the search engine’s crawling decisions are dynamic and driven by content quality.

As Illyes put it:

“Scheduling is very dynamic. As soon as we get the signals back from search indexing that the quality of the content has increased across this many URLs, we would just start turning up demand.”

The Way Forward

Illyes’ mission to improve crawling efficiency by reducing the amount of crawling and bytes on the wire is a step toward a more sustainable and practical web.

As he seeks input from the community, Illyes invites suggestions for interesting internet drafts or standards from IETF or other standards bodies that could contribute to this effort.

“Decreasing crawling without sacrificing crawl-quality would benefit everyone,” he concludes.

Why SEJ Cares

Illyes’ statement on reducing crawling reinforces the need to focus on quality and relevance. SEO isn’t just about technical optimizations but also about creating valuable, user-centric content that satisfies search demand.

By understanding the dynamic nature of Google’s crawling decisions, we can all make more informed choices when optimizing our websites and allocating resources.

How This Can Help You

With the knowledge shared by Illyes, there are several actionable steps you can take:

  1. Prioritize quality: Focus on creating high-quality, relevant, and engaging content that satisfies user intent and aligns with current search demand.
  2. Keep content current: Regularly update and refresh your content to ensure it remains valuable to your target audience.
  3. Monitor search demand trends: Adapt your content strategy to address emerging trends and topics, ensuring your website remains relevant and worthy of crawling.
  4. Implement technical best practices: Ensure your website has a clean, well-structured architecture and a robust internal linking strategy to facilitate efficient crawling and indexing.

As you refine your SEO strategies, remember the key takeaways from Illyes’ statements and the insights Google’s Search Relations team provided.

With these insights, you’ll be equipped to succeed if and when Google reduces crawling frequency.


Featured Image: Skorzewiak/Shutterstock

Google: When To Fix Sites Hit By March 2024 Core Update via @sejournal, @martinibuster

Google’s John Mueller answered a question about whether the March Core Update was finished and whether it’s okay to begin fixing things in response to the update.

Core Update Question On Reddit

The person asking the question wanted to know if the core update was finished because they had experienced a 60% loss in traffic and were waiting for the update to finish before fixing things to make the site rank again.

“People advised me against making drastic changes to my blogs while the core update was ongoing. Unfortunately, I’ve experienced a significant loss, about 60% of my traffic, and now I’m determined to restore these numbers.
Do you have any tips for me? It appears that my pages, including (purchased) backlinks, have been most adversely affected!”

The advice that the Redditor received about waiting until after an update is finished before attempting to fix things is good advice… most of the time.

March 2024 Core Algorithm Update Is Not Over

Core algorithm updates are changes to the entire range of algorithms that are a part of search. The ranking part is one component of what constitutes Google’s core algorithm. And the ranking system itself is made up of multiple other components that are related to understanding search queries and webpages, weighting different factors depending on the context and meaning of the search query, relevance, quality, and page experience, among many other factors.

There are also spam-related systems such as SpamBrain. The core algorithm comprises many things, and the March 2024 Core Update is a particularly complex one, which may explain why it’s taking so long.

John Mueller responded by first acknowledging that the March Core Update is not over yet.

He explained:

“No, it’s not complete. It’ll be labeled complete when it’s finished rolling out.”

Should You Wait Until The Update Is Over?

Mueller next addresses the part of the question that is about whether the person should wait until the update is over to fix their site.

He answered:

“Regardless, if you have noticed things that are worth improving on your site, I’d go ahead and get things done. The idea is not to make changes just for search engines, right? Your users will be happy if you can make things better even if search engines haven’t updated their view of your site yet.”

John Mueller makes a valid point that any time is the right time to fix shortcomings that are discovered after a website self-assessment.

I’ve been working as a search marketer for 25 years, far longer than John Mueller ever has, so from that perspective I know that rankings tend to shift throughout an algorithm update. It’s not unusual that catastrophic ranking changes are reversed by the time an update is finished. “Fixing” something before the update has finished risks changing something that isn’t broken or in need of fixing.

However, in this specific instance, John Mueller’s advice to go ahead and fix what’s broken is absolutely correct because a problem the Redditor mentioned, paid links, is quite likely a contributing factor to the negative change in their rankings.

Optimizing For People

Mueller’s next advice is to focus on optimizing the website for people and not search engines. The emphasis of Mueller’s response was to encourage optimizing for “users” which means site visitors.

The remainder of Mueller’s response:

“Also, while I don’t know your site, one thing you can do regardless of anything is to work out how you can grow alternate sources of traffic, so that when search engines revamp their opinion of your site, you’ll have less strong fluctuations (make things more independent of search engines).

And, once you go down this path, you’ll probably also notice that you focus more on building out value for users (because you want them to come & visit & recommend on their own) – which is ultimately what search engines want too.”

Mueller’s response has a lot of merit because optimizing for people will align with how Google ranks websites.  It’s an approach to SEO that I call User Experience SEO.  User experience SEO is anticipating how content affects the user’s experience and satisfaction.

Using these principles I was able to anticipate by several years everything that was in Google’s Reviews Update. My clients with review websites were not caught by surprise by that update because I had anticipated everything in that update so they were ready for it when it happened.

Optimizing for people is not a shallow “make your site awesome” or “content is king” slogan. Optimizing for people is an actionable strategy for how to create and optimize websites with strong ranking power.

The recent U.S. government antitrust lawsuit against Google made clear that the Navboost signal, which tracks user interaction signals, is a powerful ranking factor. Google responds to user interaction signals, and one of the best ways of creating user interaction signals (as described in the Navboost Patent) is to create websites that cultivate positive responses.

Read the discussion on Reddit:

Is the March core update ended yet?