Google’s Q4 Earnings Reveal AI’s Growing Role In Search via @sejournal, @MattGSouthern

Alphabet, the parent company of Google, announced its financial results for Q4 and the full year of 2023.

Alphabet’s CEO, Sundar Pichai, was pleased with the company’s ongoing success. He pointed to gains in Google Search advertising, YouTube ad revenue, and demand for Google Cloud products and services.

“We are pleased with the ongoing strength in Search and the growing contribution from YouTube and Cloud. Each of these is already benefiting from our AI investments and innovation. As we enter the Gemini era, the best is yet to come,” said Pichai.

Ruth Porat, CFO of Alphabet, also reflected on the company’s financial health, stating, “We ended 2023 with very strong fourth-quarter financial results, with Q4 consolidated revenues of $86 billion, up 13% year over year. We remain committed to our work to durably re-engineer our cost base as we invest to support our growth opportunities.”

Earnings Report Highlights

Alphabet announced Q4 revenues of $86.31 billion, up 13% year over year. Operating income for the quarter reached $23.7 billion, up from $18.16 billion in Q4 2022.

For 2023, Alphabet’s total revenues were $307.39 billion, representing 9% growth over the previous year.

The company attributed its ongoing revenue growth to investments in AI technology, which drove the expansion of Alphabet’s service offerings and cloud computing business.

Further details from the earnings release and call can be found on Alphabet’s investor relations website.

Highlights Of Earnings Call Webcast

AI: The New Frontier in Search

During Alphabet’s fourth-quarter 2023 earnings call webcast, Pichai discussed the company’s strategic focus on leveraging advanced AI models across its products and services.

He reported early results from integrating AI models like Gemini into Google Search to enhance the user experience and advertiser performance.

Pichai stated that early testing of Google’s Search Generative Experience, which utilizes the Gemini AI model, showed a 40% decrease in search latency times for English language queries in the United States.

He attributed these improvements to Gemini’s ability to process diverse inputs, including text, images, audio, video, and code.

“Gemini gives us a great foundation. It’s already demonstrating state-of-the-art capabilities, and it’s only going to get better,” Pichai stated during the earnings call.

SGE is designed to serve a broader range of information needs, especially for more complex queries that benefit from multiple perspectives.

“By applying generative AI to search, we’re able to serve a wider range of information needs and answer new types of questions,” Pichai explained, highlighting the user-centric approach that Google is taking.

Pichai also addressed concerns that SGE surfaces fewer links within search results, a change that has worried publishers who rely on Google traffic.

“We’re improving satisfaction, including for more conversational queries,” he said. “As I’ve mentioned, we’re surfacing more links with SGE and linking to a wider range of sources.”

Ad Growth Driven By AI

On the advertising side, Pichai cited momentum for AI-enabled products like Performance Max, responsive search ads, and automatic ad asset creation. These leverage AI to optimize campaigns and creatives.

“More advanced, generative AI-powered capabilities are coming,” said Philipp Schindler, Senior VP and Chief Business Officer.

Schindler highlighted a new conversational ad experience for search campaigns using Gemini. Early tests found it helps advertisers, especially SMBs, build higher-quality ads with less effort.

As Google doubles down on AI, Pichai said the company will continue investing in compute infrastructure to support growth. He expects capital expenditures to be “notably larger” in 2024.

Google Cloud’s AI-Driven Ascent

Alphabet’s cloud computing division, Google Cloud, continued to grow, with revenues surpassing $9 billion this quarter.

Pichai said this growth was driven by the integration of AI, attracting many customers, including over 90% of AI startups valued at over $1 billion.

Google Cloud aims to be a leader in providing AI-enabled services for businesses, offering customers performance and cost benefits through its AI Hypercomputer technology.

In Summary

Alphabet’s Q4 2023 earnings reveal steady revenue growth and increasing traction of its AI-driven products and services.

The report signals a strategic focus on leveraging AI to enhance core offerings like Search, YouTube, and Cloud.

The key takeaways from Alphabet’s earnings report for SEO and advertising professionals are:

  • Monitor impacts of AI integration on Google Search as it surfaces fewer links but aims to improve satisfaction. This could affect publisher traffic and SEO strategies.
  • Leverage AI-powered ad products like responsive search ads and automatic creative generation to optimize campaigns. But stay updated as more advanced generative AI capabilities emerge.
  • Consider Google Cloud’s AI platform to power data-driven decisions and workflows. Its growth signals a strong demand for AI services.

Above all, prepare for ongoing evolution as Alphabet doubles down on AI to transform search and ads. Proactively adapt strategies to benefit from the positives while mitigating the risks of changes.


Featured Image: IgorGolovniov/Shutterstock

Google Answers Question About Signals And Syndicated Content via @sejournal, @martinibuster

Google’s John Mueller answered a question about what happens to the signals associated with syndicated content when Google chooses the partner as the canonical instead of the original content publisher. John’s answer contained helpful information about the murky area of ranking and syndicated content.

The question was asked by Lily Ray (@lilyraynyc) on X (formerly Twitter).

She tweeted her question:

“If an article is syndicated across partner websites, and Google chooses the partner as canonical (even if canonical on partner site ➡️to original source), does this mean all SEO value is consolidated to partner URL?

E.g. link signals, UX signals, social media signals etc. from the group would be consolidated into Google’s chosen canonical?

& each time this happens, does that represent an “opportunity cost” from the original site, in the sense that they lose out on that SEO value?”

Lily asked about cross-domain canonicals and these signals:

  • Link signals
  • UX signals
  • Social media signals

John Mueller tweeted:

“Hi Lily! It’s complicated, and not all the things you’re asking about are things we necessarily even use.

In general, if we recognize a page as canonical, that’s going to be the page most likely rewarded by our ranking systems.”

John Mueller answered that Google didn’t use everything on her list but didn’t specify which items. Regarding the canonicals, Google does have a policy about the use of cross-domain canonicals on syndicated content.

Google announced last year that it no longer recommends cross-domain canonicals for syndicated content. Instead, it suggests that partners use a meta noindex tag to block Google from indexing the syndicated copy if the original publisher wants to be certain that link signals for the content accrue to them and not the syndication partner.

This is Google’s current guidance for cross-domain canonicals:

“Tip: If you want to avoid duplication by syndication partners, the canonical link element is not recommended because syndicated articles are often very different in overall content from original articles. Instead, partners should use meta tags to block the indexing of your content.”
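As a practical illustration, here is a minimal sketch of how an original publisher might spot-check whether a partner’s syndicated copy carries a robots noindex meta tag. It assumes the requests and BeautifulSoup libraries, and the partner URL is hypothetical:

```python
import requests
from bs4 import BeautifulSoup

# Hypothetical syndicated copy of the article on a partner site.
partner_url = "https://partner.example.com/syndicated-article"

html = requests.get(partner_url, timeout=10).text
soup = BeautifulSoup(html, "html.parser")

# Look for <meta name="robots" content="noindex"> in the partner page.
robots_tags = soup.find_all("meta", attrs={"name": "robots"})
noindexed = any("noindex" in (tag.get("content") or "").lower() for tag in robots_tags)

print("Partner copy is blocked from indexing" if noindexed
      else "Partner copy is indexable")
```

This only checks the meta tag; a partner could also block indexing with an X-Robots-Tag HTTP header, which would require inspecting the response headers as well.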

John Mueller didn’t address what happens to the link signals, but he did say that the page recognized as canonical is the one that’s rewarded by Google’s ranking systems, and that is ultimately the most important detail.

Featured Image by Shutterstock/Graphic farm

Google: Changing URLs On Larger Sites Takes Time To Process via @sejournal, @martinibuster

Someone on Reddit asked a question about making a sitewide change to the code of a website with ten languages. Google’s John Mueller offered general advice about the pitfalls of sitewide changes and a word about complexity (implying the value of simplicity).

The question was related to hreflang but Mueller’s answer, because it was general in nature, had wider value for SEO.

Here is the question that was asked:

“I am working on a website that contains 10 languages and 20 culture codes. Let’s say blog-abc was published on all languages. The hreflang tags in all languages are pointing to blog-abc version based on the lang. For en it may be en/blog-abc

They made an update to the one in English language and the URL was updated to blog-def. The hreflang tag on the English blog page for en will be updated to en/blog-def. This will however not be dynamically updated in the source code of other languages. They will still be pointing to en/blog-abc. To update hreflang tags in other languages we will have to republish them as well.

Because we are trying to make the pages as static as possible, it may not be an option to update hreflang tags dynamically. The options we have is either update the hreflang tags periodically (say once a month) or move the hreflang tags to sitemap.

If you think there is another option, that will also be helpful.”
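For context, the sitemap option mentioned in the question means declaring the alternate-language URLs with xhtml:link entries in the XML sitemap instead of in each page’s HTML, so the annotations can be updated in one place. Below is a minimal sketch of generating such an entry; the URLs and language codes are hypothetical:

```python
# Each language version gets its own <url> entry listing the full set of alternates.
alternates = {
    "en": "https://example.com/en/blog-def",
    "de": "https://example.com/de/blog-abc",
    "fr": "https://example.com/fr/blog-abc",
}

def sitemap_url_entry(loc: str, alternates: dict) -> str:
    links = "\n".join(
        f'    <xhtml:link rel="alternate" hreflang="{lang}" href="{href}"/>'
        for lang, href in alternates.items()
    )
    return f"  <url>\n    <loc>{loc}</loc>\n{links}\n  </url>"

for lang, url in alternates.items():
    print(sitemap_url_entry(url, alternates))
```

The sitemap’s <urlset> element must also declare the xhtml namespace (xmlns:xhtml="http://www.w3.org/1999/xhtml"), and every language version still has to list the complete set of alternates, so the maintenance burden doesn’t disappear; it just moves out of the page templates.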

Sitewide Changes Take A Long Time To Process

I recently read an interesting thing in a research paper that reminded me of something John Mueller said about how it takes time for Google to understand how updated pages relate to the rest of the Internet.

The research paper mentioned how updated webpages required recalculating the semantic meanings of the webpages (the embeddings) and then doing that for the rest of the documents.

Here’s what the research paper (PDF) says in passing about adding new pages to a search index:

“Consider the realistic scenario wherein new documents are continually added to the indexed corpus. Updating the index in dual-encoder-based methods requires computing embeddings for new documents, followed by re-indexing all document embeddings.

In contrast, index construction using a DSI involves training a Transformer model. Therefore, the model must be re-trained from scratch every time the underlying corpus is updated, thus incurring prohibitively high computational costs compared to dual-encoders.”

I mention that passage because in 2021 John Mueller said it can take Google months to assess the quality and the relevance of a site and mentioned how Google tries to understand how a website fits in with the rest of the web.

Here’s what he said in 2021:

“I think it’s a lot trickier when it comes to things around quality in general where assessing the overall quality and relevance of a website is not very easy.

It takes a lot of time for us to understand how a website fits in with regards to the rest of the Internet.

And that’s something that can easily take, I don’t know, a couple of months, a half a year, sometimes even longer than a half a year, for us to recognize significant changes in the site’s overall quality.

Because we essentially watch out for …how does this website fit in with the context of the overall web and that just takes a lot of time.

So that’s something where I would say, compared to technical issues, it takes a lot longer for things to be refreshed in that regard.”

That part about assessing how a website fits in the context of the overall web is a curious and unusual statement.

What he said about fitting into the context of the overall web sounded surprisingly similar to what the research paper said about how the search index “requires computing embeddings for new documents, followed by re-indexing all document embeddings.”

Here’s John Mueller’s response in Reddit about the problem with updating a lot of URLs:

“In general, changing URLs across a larger site will take time to be processed (which is why I like to recommend stable URLs… someone once said that cool URLs don’t change; I don’t think they meant SEO, but also for SEO). I don’t think either of these approaches would significantly change that.”

What does Mueller mean when he says that big changes take time to be processed? It could be similar to what he said in 2021 about evaluating the site all over again for quality and relevance. That relevance part could also be similar to what the research paper said about “computing embeddings,” which relates to creating vector representations of the words on a webpage as part of understanding their semantic meaning.

Complexity Has Long-Term Costs

John Mueller continued his answer:

“A more meta question might be whether you’re seeing enough results from this somewhat complex setup to merit spending time maintaining it like this at all, whether you could drop the hreflang setup, or whether you could even drop the country versions and simplify even more.

Complexity doesn’t always add value, and brings a long-term cost with it.”

Creating sites with as much simplicity as possible has been something I’ve done for over twenty years. Mueller’s right. It makes updates and revamps so much easier.

Featured Image by Shutterstock/hvostik

WordPress 6.4.3 Security Release Fixes Two Vulnerabilities via @sejournal, @martinibuster

WordPress announced version 6.4.3, a security release that responds to two vulnerabilities discovered in WordPress and also includes 21 bug fixes.

PHP File Upload Bypass

The first patch is for a PHP File Upload Bypass Via Plugin Installer vulnerability. It’s a flaw in WordPress that allows an attacker to upload PHP files via the plugin and theme uploader. PHP is a scripting language that is used to generate HTML. PHP files can also be used to inject malware into a website.

However, this vulnerability is not as bad as it sounds because the attacker needs administrator level permissions in order to execute this attack.

PHP Object Injection Vulnerability

According to WordPress the second patch is for a Remote Code Execution POP Chains vulnerability which could allow an attacker to remotely execute code.

An RCE POP Chains vulnerability typically means that there’s a flaw that allows an attacker, usually by manipulating input that the WordPress site deserializes, to execute arbitrary code on the server.

Serialization is the process where data is converted into a storable format (like a text string); deserialization is when it’s converted back into its original form.
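As a rough analogy in Python rather than PHP, the serialize/deserialize round trip looks like this. This is only an illustration of the concept, not the WordPress code that was patched:

```python
import pickle

comment = {"author": "example", "text": "Nice post!"}

serialized = pickle.dumps(comment)   # serialization: object -> byte string
restored = pickle.loads(serialized)  # deserialization: byte string -> object

assert restored == comment

# The danger: deserializing attacker-controlled data can instantiate arbitrary
# objects, which is the root of object injection / POP chain attacks.
# Never deserialize untrusted input.
```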

Wordfence describes this vulnerability as a PHP Object Injection vulnerability and doesn’t mention the RCE POP Chains part.

This is how Wordfence describes the second WordPress vulnerability:

“The second patch addresses the way that options are stored – it first sanitizes them before checking the data type of the option – arrays and objects are serialized, as well as already serialized data, which is serialized again. While this already happens when options are updated, it was not performed during site installation, initialization, or upgrade.”

This is also a low threat vulnerability in that an attacker would need administrator level permissions to launch a successful attack.

Nevertheless, the official WordPress announcement of the security and maintenance release recommends updating the WordPress installation:

“Because this is a security release, it is recommended that you update your sites immediately. Backports are also available for other major WordPress releases, 4.1 and later.”

Bug Fixes In WordPress Core

This release also fixes five bugs in the WordPress core:

  1. Text isn’t highlighted when editing a page in latest Chrome Dev and Canary
  2. Update default PHP version used in local Docker Environment for older branches
  3. wp-login.php: login messages/errors
  4. Deprecated print_emoji_styles produced during embed
  5. Attachment pages are only disabled for users that are logged in

In addition to those five fixes to the core, there are 16 bug fixes to the Block Editor.

Read the official WordPress Security and Maintenance Release announcement

WordPress descriptions of each of the 21 bug fixes

The Wordfence description of the vulnerabilities:

The WordPress 6.4.3 Security Update – What You Need to Know

Featured Image by Shutterstock/Roman Samborskyi

Google DeepMind WARM: Can Make AI More Reliable via @sejournal, @martinibuster

Google’s DeepMind published a research paper that proposes a way to train large language models so that they provide more reliable answers and are resistant to reward hacking, a step in the development of more adaptable and efficient AI systems.

Hat tip to @EthanLazuk for tweeting about a new research paper from Google DeepMind.

AI Has A Tendency Toward Reward Hacking

Reinforcement Learning from Human Feedback (RLHF) is a method used to train generative AI so that it learns to offer responses that receive positive scores from human raters. The positive scores reward correct answers, which is why this technique is called reinforcement learning, and because those scores come from human raters, it’s called Reinforcement Learning from Human Feedback.

RLHF is highly successful, but it comes with an unintended side effect: the AI learns shortcuts to receiving a positive reward. Instead of providing a correct answer, it provides an answer that has the appearance of a correct answer, and when it fools the human raters (which is a failure of the reinforcement training), the AI gets better and better at fooling raters with inaccurate answers in order to receive the rewards (the positive human ratings).

This tendency of the AI to “cheat” in order to earn the training reward is called Reward Hacking, which is what the study seeks to minimize.

The Causes Of Reward Hacking In Large Language Models

To solve the problem of reward hacking, the researchers identified two causes of reward hacking that their solution has to address:

  1. Distribution shifts
  2. Inconsistencies in human preferences

Distribution Shifts

Distribution shift refers to the situation where an LLM is trained on a certain kind of dataset and then, during reinforcement learning, is exposed to a different kind of training data that it hasn’t seen before. This change in data type is called a distribution shift, and it could potentially cause the language model to manipulate the reward system in order to give a satisfactory answer that it’s otherwise not prepared to provide.

Inconsistencies In Human Preferences

This is a reference to humans being inconsistent in their ratings when judging answers provided by the AI. For example, solving the problem of inconsistency in human preferences is likely one of the motivations behind the creation of the Google Search Quality Raters Guidelines, which have the effect of lessening the influence of subjective preferences.

Human preferences can vary from person to person. Reinforcement Learning from Human Feedback relies on human feedback in the reward model (RM) training process and it’s the inconsistencies that can lead to reward hacking.

Finding a solution is important, as the researchers noted:

“This reward hacking phenomenon poses numerous issues.

First, it degrades performances, manifesting as linguistically flawed or unnecessarily verbose outputs, which do not reflect true human preferences.

Second, it complicates checkpoint selection due to the unreliability of the proxy RM, echoing Goodhart’s Law: ‘when a measure becomes a target, it ceases to be a good measure’.

Third, it can engender sycophancy or amplify social biases, reflecting the limited and skewed demographics of feedback providers.

Lastly and most critically, misalignment due to reward hacking can escalate into safety risks, in particular given the rapid integration of LLMs in everyday life and critical decision-making. “

Weight Averaged Reward Models (WARM)

The Google DeepMind researchers developed a system called Weight Averaged Reward Models (WARM), which creates a proxy model by averaging multiple individual reward models, each one having slight differences. With WARM, as the number of reward models (RMs) averaged together increases, the results get significantly better, and the system avoids the sudden decline in reliability that happens with standard models.
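To make the idea concrete, here is a minimal sketch of weight averaging, assuming the reward models share an identical architecture (in WARM each RM is fine-tuned from a shared initialization so their weights can be meaningfully averaged). The parameter names and values below are hypothetical:

```python
import numpy as np

def average_reward_models(state_dicts):
    """Average corresponding parameters across reward models with identical shapes."""
    return {
        name: np.mean([sd[name] for sd in state_dicts], axis=0)
        for name in state_dicts[0]
    }

# Hypothetical parameters from three separately fine-tuned reward models.
rm_checkpoints = [
    {"head.weight": np.array([0.9, 1.1]), "head.bias": np.array([0.2])},
    {"head.weight": np.array([1.0, 0.9]), "head.bias": np.array([0.1])},
    {"head.weight": np.array([1.1, 1.0]), "head.bias": np.array([0.3])},
]

warm_proxy = average_reward_models(rm_checkpoints)  # single proxy RM used during RLHF
```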

Because WARM collapses the individual reward models into a single averaged model rather than running them all, it has the benefit of being memory efficient and doesn’t slow down the model’s ability to provide answers, in addition to being resistant to reward hacking.

WARM also makes the model more reliable and consistent when dealing with changing data.

What caught my eye is its ability to follow the “updatable machine learning paradigm” which refers to WARM’s ability to adapt and improve by incorporating new data or changes over time, without starting from scratch.

In the following quote, WA means Weighted Average and RM means reward model.

The researchers explain:

“WARM represents a flexible and pragmatic method to improve the alignment of AI with human values and societal norms.

…WARM follows the updatable machine learning paradigm, eliminating the need for inter-server communication, thus enabling embarrassingly simple parallelization of RMs.

This facilitates its use in federated learning scenario where the data should remain private; moreover, WA would add a layer of privacy and bias mitigation by reducing the memorization of private preference. Then, a straightforward extension of WARM would combine RMs trained on different datasets, for example, coming from different (clusters of) labelers.

…Furthermore, as WA has been shown to limit catastrophic forgetting, WARM could seamlessly support iterative and evolving preferences.”

Limitations

This research points the way toward more ways of improving AI, but it’s not a complete solution because it has inherent limitations. Among the issues is that it doesn’t completely remove all forms of “spurious correlations or biases inherent in the preference data.”

Yet they did conclude in an upbeat tone about the future of WARM:

“Our empirical results demonstrate its effectiveness when applied to summarization. We anticipate that WARM will contribute to more aligned, transparent, and effective AI systems, encouraging further exploration in reward modeling.”

Read the research paper:

WARM: On the Benefits of Weight Averaged Reward Models

Featured Image by Shutterstock/Mansel Birst

Sentence-Level Semantic Internal Links For SEO via @sejournal, @martinibuster

Internal in-content linking practices have remained the same for the past twenty years, which is strange because Google has gone through dramatic changes within the last ten years and even more so in the past five. It may be time to consider freshening up internal linking strategies so that they more closely align with how Google understands and ranks webpages.

Standard Internal Linking Practices

When considering a new way of doing something, it’s important to keep an open mind because what follows may be startling, like a slap in the face.

Raise your hand if this is you:

An SEO is writing or updating content and comes across a keyword phrase that’s a match for the keywords targeted by an inner page, so those words get turned into anchor text.

Okay, you can put your hand down. 🙂

I expect that there will be a lot of hands raised and that’s okay because it’s how everybody does it.

As an example, I visited a so-called “white hat” website that offers an SEO-related service and in an article about a sub-topic of “internal linking” they link to another page about What Is Internal Linking using the anchor text “internal linking.”

The target page is an exact match for the two-word phrase targeted by the second page. The standard practice is if you find a keyword match for another internal page then turn it into an anchor text to the target page, right?

But it’s not right.

The sentence containing that anchor text and the paragraph that contains it are about the importance of internal linking for getting internal pages indexed and ranked. The target page is a general explainer about What Is Internal Linking.

If you think like an SEO then there’s nothing wrong with that link because the anchor text matches the target keyword of the second page.

But if you think like a site visitor who is reading the first page then what is the chance that the reader will stop reading and click the link to learn about What Is Internal Linking?

Quite likely, zero percent of readers would click on the link because the link is not contextually relevant.

What Does A Machine Think About It?

To see what a machine thought about that sentence, I copied it and asked ChatGPT about it.

ChatGPT replied:

“The sentence highlights the critical role of internal linking in SEO strategies.”

I then asked ChatGPT to summarize the paragraph in fifteen words or less and it responded:

“Internal linking is crucial for website indexing and ranking, with link context being particularly important.”

The context of both the sentence and the paragraph is the importance of internal links but not What Is Internal Linking.

The irony of the above example is that I pulled it from a webpage that was on the topic of the importance of context for internal linking, which shows how deeply engrained the idea is that the only context needed for an internal link is the anchor text.

But that’s not how Google understands context.

The takeaway is that for an internal link to be contextual, it’s important to consider the meaning of the sentence and the paragraph in which it exists.
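One way to sanity-check that context, sketched below, is to compare the meaning of the paragraph that contains the link with the topic of each candidate target page using a sentence-embedding model. This is only a rough proxy, not how Google scores relevance, and the model name, slugs, and page summaries are hypothetical:

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

paragraph = (
    "Internal links help search engines discover and index deeper pages, "
    "and the context surrounding each link shapes how it is understood."
)
candidate_pages = {
    "/what-is-internal-linking/": "A beginner explainer defining what internal linking is.",
    "/internal-links-for-indexing/": "How internal links help pages get crawled, indexed, and ranked.",
}

paragraph_vec = model.encode(paragraph, convert_to_tensor=True)
for slug, summary in candidate_pages.items():
    score = util.cos_sim(paragraph_vec, model.encode(summary, convert_to_tensor=True)).item()
    print(f"{slug}  similarity={score:.3f}")
```

The higher-scoring page is the one whose topic matches what the surrounding sentences are actually about, which is the kind of contextual fit described above.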

What Internal Linking Is Not

There are decades-old precepts about internal linking that are commonly accepted as canonical without sufficient critical examination.

Here are a few examples:

  • Put your internal links closer to the top of the webpage.
  • Internal links are for helping other pages rank well.
  • Internal links are for helping other pages get indexed.
  • Use keyword-rich anchor text but make them look natural.
  • Internal linking is important for Google.
  • Add internal links to your most important webpage on a topic from all of the subtopic pages.

What’s missing from the above commonly accepted ideas about internal linking is that none of that has anything to do with the site visitors that are reading the content.

Those ideas aren’t even connected to how Google analyzes and understands webpages and as a consequence they’re not really what internal linking should be about. So before identifying a modern way to link internally that is in line with the modern search engine it’s useful to understand how Google is understanding webpages.

Taxonomy Of Topics In Webpage Content

A taxonomy is a way of classifying something. Every well-organized webpage can be subdivided into an overall topic and the subtopics beneath it, one flowing into the other, so that the overall topic describes what all the subtopics as a group are about and each subtopic describes an aspect of the main topic. This can be called a Taxonomy of Topics, the hidden structure within the content.

A webpage is called unstructured data. But in order to make sense of it Google has to impose some structure on it. So a webpage is divided into sections like the header, navigation, main content, sidebar and footer.

Google’s Martin Splitt went further and said that the main content is analyzed for the Centerpiece Annotation, a description of what the topic is about, explaining:

“That’s just us analyzing the content and… we have a thing called the Centerpiece Annotation, for instance, and there’s a few other annotations that we have where we look at the semantic content, as well as potentially the layout tree.

But fundamentally we can read that from the content structure in HTML already and figure out so “Oh! This looks like from all the natural language processing that we did on this entire text content here that we got, it looks like this is primarily about topic A, dog food.”

The centerpiece annotation is Google’s estimation of what the content is about and it identifies it by reading it from the Content Structure.

It is that content structure that can be called the Taxonomy of Topics, where a page of content is planned and created according to a topic and the subtopics.

Semantic Content Structure And Internal Links

Content has a hidden semantic structure that can be referred to as the Taxonomy of Topics.

A well constructed webpage has an overall structure that generally looks like this:

Introductory paragraph that introduces the main topic
 -Subtopic 1 (a content block)
 -Subtopic 2 (a content block)
 -Subtopic 3 (a content block)
Ending paragraph that wraps everything up

Subtopics actually have their own hierarchy as well, like this:

Subtopic 1
 -Paragraph A
 -Paragraph B
 -Paragraph C

And each paragraph also has their own hierarchy like this:

Paragraph A
 -Sentence 1
 -Sentence 2
 -Sentence 3
 -Sentence 4

The above outline is an example of how unstructured data like a webpage has a hidden structure that can help a machine understand it better by labeling it with a Centerpiece Annotation, for example.

Given that Google views content as a series of topics and subtopics that are organized in a “content structure” with headings (H1, H2) demarcating each block of content, doesn’t it make sense to also consider internal linking in the same way?

For example, my links to the Taxonomy of Topics article and the source of the Martin Splitt quote are contextually relevant, and many readers of this article are likely to follow those links because they expand on the content in an interesting way. They are… contextually relevant.

And being contextually relevant, in my opinion it’s likely that Google will also find the topic matter of the linked pages to be relevant.

I didn’t link them to get them crawled or for ranking purposes either. I linked to them because they’re useful to readers and expand on the surrounding content in which those links are embedded.

Semantic Relevance And Contextual Internal Links

For more than ten years I’ve been encouraging the SEO industry to let go of their keywords and start thinking in terms of topics, and it’s great to finally see more of the industry get it and start thinking about content in terms of what it means at the semantic level.

Now take the next step and let go of that “keyword-targeted” mindset and apply that understanding to internal links. Doing so makes sense for SEO and also for readers. In my 25 years of hands-on experience with SEO, I can say with confidence that the most future-proofed SEO strategy is one that thinks about the impact to site visitors because that’s how Google is looking at pages, too.

Featured Image by Shutterstock/Iconic Bestiary

Google SearchLiaison: 4 Reasons Why A Webpage Couldn’t Rank via @sejournal, @martinibuster

Someone asked on Twitter why their articles aren’t ranking well and Google SearchLiaison surprised everyone with a mini site audit of things needing to be fixed.

A person (@iambrandonsalt) tweeted on X (formerly Twitter) asking if anyone could offer an explanation of why some pages of their site were having problems ranking.

He tweeted:

“Does anyone have an explanation to why some of our articles are not showing up in the SERPs… at all?

I’ll update the article with new info, it pops back in and ranks well, then disappears again.

This is happening to lots of our great content, it’s very frustrating :(“

That person subsequently shared the URL of the site under discussion and that’s when SearchLiaison tweeted a response.

Google SearchLiaison Mini Site Audit

SearchLiaison’s mini audit spotted four problems that may be causing the site to underperform in the search engine results pages (SERPs).

Overview Of Why A Webpage Is Underperforming In SERPs

Below is an outline of four things that SearchLiaison called attention to. I wouldn’t take what they say as indicative of actual ranking factors.

But I would encourage taking his advice seriously. SearchLiaison (Danny Sullivan) has been involved in search for almost 28 years, and now he’s working on the inside at Google.

So he understands what it’s like to be on the outside, which makes him a unique and valuable resource to listen to.

Four Highlighted Content Issues

1. Original Content Is Not Apparent

2. A Lack Of Content That Demonstrates Experience

3. Unsatisfying Content

4. Stale Content That Doesn’t Deliver

The above are the four main reasons why SearchLiaison felt the webpage was having trouble ranking in the SERPs.

1. Originality Front And Center

Here’s the part where he called out the seeming lack of original content:

“Took a look. Will share a few things that maybe might be generally helpful. At first glance, it wasn’t clear to me that there was much original content here.

It looks and feels at first glance like a typical “here’s a bunch of product pages.”

I really had to go into it further to understand there’s original stuff going on.”

It’s great to have original content and it should be readily apparent. I think what SearchLiaison meant when he said that the page looked like “a bunch of product pages” is that the content was a list of features.

One can rewrite what the product features are but that doesn’t make it original. The words may be original and even unique but what they communicate is not original.

Going further from what SearchLiaison said, I would add that what’s lacking is any sign that the person writing the content has actually handled the product, which relates to experience.

2. Does Content Demonstrate Experience?

And yes! SearchLiaison also talked about experience.

He wrote:

“Deck 1, 4 and 9 have long video reviews, it looks like — so cool, you’ve used them, have experiences to share. That’s all great. Maybe make that a bit clearer to the reader? But … it could also be me.”

What SearchLiaison may mean is that, reading the content, there’s no mention of what the physical properties of the product are. Is it light? Does it fit well in the hand? Does it feel cheap? The content is largely a list of product features, a way of writing that doesn’t communicate experience.

3. Unsatisfying Content

An important thing about content is that it should satisfy the reader.

Reader satisfaction is so important that the Google Search Quality Raters Guidelines emphasize it for the main content (MC):

“Consider the extent to which the MC is satisfying and helps the page achieve its purpose.”

This is what SearchLiaison said:

“But most of the other devices … don’t exist yet.

You’re promising the reader that these are the best alternatives for 2024. And maybe some of these will be, but if they don’t exist yet, that’s potentially a bummer and unsatisifying to people coming to this page?

Maybe those upcoming devices belong a page about — upcoming devices?”

4. Stale Content That Doesn’t Deliver

An issue SearchLiaison picked up on is content that is out of date and, because of that, doesn’t deliver what it promises.

The problem with some of the content is exactly what SearchLiaison says: it’s out of date.

SearchLiaison observed:

“You also mentioned updating the page and … it feels out-of-date, so what’s being updated on it?

“As of today (April 22nd) the Rog Ally is not out yet, and it was just announced on April 1st” is on the article dated today, Jan 29, and you’d said on Jan 27 this page has also been updated, so what significant change is actually happening to warrant a new byline date?

“At the time of writing, the Lenovo Legion Go isn’t currently out, but all signs are pointing towards an October 2023 release date” — same thing, confusing to be out-of-date on a page claiming to be fresh as of today.

“The IndieGoGo pages goes live on September 5th, so bookmark it and get ready to make a very wise purchase!” — again, out-of-date.”

Advice Is Not Ranking Factors

SearchLiaison ended his critique by stating that none of what he said should be taken to be examples of ranking factors but rather things that tend to align with what Google is looking for.

“Clearly, you put work into some of the video reviews.

Maybe that needs to be more evident with some of the written write-ups. And mixing out-of-date info on a page that claims to be fresh isn’t a great experience.

It’s not that any or all of these things are direct ranking factors, and changing them won’t guarantee to move you up.

But the systems overall are designed to reward reliable helpful content meant for people, so the more this page aligns with that goal, the more you’re potentially going to be successful with it.”

Self-Assessment In Site Auditing

Sometimes it is difficult to critique one’s own site, so it’s helpful to seek an outside opinion. One doesn’t necessarily need a full-blown site audit; sometimes critiquing a single page can provide a wealth of helpful information.

Featured Image by Shutterstock/Mix and Match Studio

Why Google SGE Is Stuck In Google Labs And What’s Next via @sejournal, @martinibuster

Google Search Generative Experience (SGE) was set to expire as a Google Labs experiment at the end of 2023, but its time as an experiment was quietly extended, making it clear that SGE is not coming to search in the near future. Surprisingly, letting Microsoft take the lead may have been the best, if unintended, approach for Google.

Google’s AI Strategy For Search

Google’s decision to keep SGE as a Google Labs project fits into the broader trend of Google’s history of preferring to integrate AI in the background.

The presence of AI isn’t always apparent but it has been a part of Google Search in the background for longer than most people realize.

The very first use of AI in search was as part of Google’s ranking algorithm, a system known as RankBrain. RankBrain helped the ranking algorithms understand how words in search queries relate to concepts in the real world.

According to Google:

“When we launched RankBrain in 2015, it was the first deep learning system deployed in Search. At the time, it was groundbreaking… RankBrain (as its name suggests) is used to help rank — or decide the best order for — top search results.”

The next implementation was Neural Matching which helped Google’s algorithms understand broader concepts in search queries and webpages.

And one of the most well-known AI systems Google has rolled out is the Multitask Unified Model, also known as Google MUM. MUM is a multimodal AI system that encompasses understanding images and text and is able to place them within the context of a sentence or a search query.

SpamBrain, Google’s spam-fighting AI, is quite likely one of the most important implementations of AI within Google’s search algorithm because it helps weed out low-quality sites.

These are all examples of Google’s approach to using AI in the background to solve different problems within search as a part of the larger Core Algorithm.

It’s likely that Google would have continued using AI in the background until the transformer-based large language models (LLMs) were able to step into the foreground.

But Microsoft’s integration of ChatGPT into Bing forced Google to take steps to add AI in a more foregrounded way with their Search Generative Experience (SGE).

Why Keep SGE In Google Labs?

Considering that Microsoft has integrated ChatGPT into Bing, it might seem curious that Google hasn’t taken a similar step and is instead keeping SGE in Google Labs. There are good reasons for Google’s approach.

One of Google’s guiding principles for the use of AI is to only use it once the technology is proven to be successful and can be implemented in a way that can be trusted to be responsible, and those are two things that generative AI is not capable of today.

There are at least three big problems that must be solved before AI can successfully be integrated in the foreground of search:

  1. LLMs cannot be used as an information retrieval system because they need to be completely retrained in order to add new data.
  2. Transformer architecture is inefficient and costly.
  3. Generative AI tends to create wrong facts, a phenomenon known as hallucinating.

Why AI Cannot Be Used As A Search Engine

One of the most important problems to solve before AI can be used as the backend and the frontend of a search engine is that LLMs are unable to function as a search index where new data is continuously added.

In simple terms, what happens is that in a regular search engine, adding new webpages is a process where the search engine computes the semantic meaning of the words and phrases within the text (a process called “embedding”), which makes them searchable and ready to be integrated into the index.

Afterwards the search engine has to update the entire index in order to understand (so to speak) where the new webpages fit into the overall search index.

The addition of new webpages can change how the search engine understands and relates all the other webpages it knows about, so it goes through all the webpages in its index and updates their relations to each other if necessary. This is a simplification for the sake of communicating the general sense of what it means to add new webpages to a search index.
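Here is a minimal sketch of that incremental process, using a dual-encoder-style setup; the embedding model and pages are stand-ins, not Google’s actual systems:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # stand-in embedding model

# Existing index: one embedding per already-indexed page.
indexed_pages = ["A guide to dog food", "Reviews of cat toys"]
index_vectors = model.encode(indexed_pages)

# Adding a new page only requires embedding it and appending the vector;
# the embedding model itself is not retrained.
new_page = "Puppy training tips for new owners"
index_vectors = np.vstack([index_vectors, model.encode([new_page])])

# A DSI-style model that maps queries directly to document identifiers would
# instead need to be retrained to "know about" the new page.
```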

In contrast to current search technology, LLMs cannot add new webpages to an index because the act of adding new data requires a complete retraining of the entire LLM.

Google is researching how to solve this problem in order to create a transformer-based LLM search engine, but the problem is not solved, not even close.

To understand why this happens, it’s useful to take a quick look at a recent Google research paper that is co-authored by Marc Najork and Donald Metzler (and several other co-authors). I mention their names because both of those researchers are almost always associated with some of the most consequential research coming out of Google. So if it has either of their names on it, then the research is likely very important.

In the following explanation, the search index is referred to as memory because a search index is a memory of what has been indexed.

The research paper is titled: “DSI++: Updating Transformer Memory with New Documents” (PDF)

Using LLMs as search engines is a process that uses a technology called Differentiable Search Indices (DSIs). The current search index technology is referenced as a dual-encoder.

The research paper explains:

“…index construction using a DSI involves training a Transformer model. Therefore, the model must be re-trained from scratch every time the underlying corpus is updated, thus incurring prohibitively high computational costs compared to dual-encoders.”

The paper goes on to explore ways to solve the problem of LLMs that “forget” but at the end of the study they state that they only made progress toward better understanding what needs to be solved in future research.

They conclude:

“In this study, we explore the phenomenon of forgetting in relation to the addition of new and distinct documents into the indexer. It is important to note that when a new document refutes or modifies a previously indexed document, the model’s behavior becomes unpredictable, requiring further analysis.

Additionally, we examine the effectiveness of our proposed method on a larger dataset, such as the full MS MARCO dataset. However, it is worth noting that with this larger dataset, the method exhibits significant forgetting. As a result, additional research is necessary to enhance the model’s performance, particularly when dealing with datasets of larger scales.”

LLMs Can’t Fact Check Themselves

Google and many others are also researching multiple ways to have AI fact check itself in order to keep from giving false information (referred to as hallucinations). But so far that research is not making significant headway.

Bing’s Experience Of AI In The Foreground

Bing took a different route by incorporating AI directly into its search interface in a hybrid approach that joined a traditional search engine with an AI frontend. This new kind of search engine revamped the search experience and differentiated Bing in the competition for search engine users.

Bing’s AI integration initially created significant buzz, drawing users intrigued by the novelty of an AI-driven search interface. This resulted in an increase in Bing’s user engagement.

But after nearly a year of buzz, Bing’s market share saw only a marginal increase. Recent reports, including one from the Boston Globe, indicate less than 1% growth in market share since the introduction of Bing Chat.

Google’s Strategy Is Validated In Hindsight

Bing’s experience suggests that AI in the foreground of a search engine may not be as effective as hoped. The modest increase in market share raises questions about the long-term viability of a chat-based search engine and validates Google’s cautionary approach of using AI in the background.

Google’s focus on AI in the background of search is vindicated in light of Bing’s failure to get users to abandon Google for Bing.

The strategy of keeping AI in the background, where at this point in time it works best, allowed Google to maintain users while AI search technology matures in Google Labs where it belongs.

Bing’s approach of using AI in the foreground now serves as almost a cautionary tale about the pitfalls of rushing out a technology before the benefits are fully understood, providing insights into the limitations of that approach.

Ironically, Microsoft is finding better ways to integrate AI as a background technology in the form of useful features added to their cloud-based office products.

Future Of AI In Search

The current state of AI technology suggests that it’s more effective as a tool that supports the functions of a search engine rather than serving as the entire back and front ends of a search engine or even as a hybrid approach which users have refused to adopt.

Google’s strategy of releasing new technologies only when they have been fully tested explains why Search Generative Experience belongs in Google Labs.

Certainly, AI will take a bolder role in search but that day is definitely not today. Expect to see Google adding more AI based features to more of their products and it might not be surprising to see Microsoft continue along that path as well.

Featured Image by Shutterstock/ProStockStudio

Google’s SEO Starter Guide Updates: Branding In, Keywords Out via @sejournal, @MattGSouthern

In a recent episode of “Search Off the Record,” Google’s Search Relations team provided information about upcoming changes to Google’s SEO Starter Guide.

The team, composed of John Mueller, Lizzie Harvey, and Gary Illyes, says the guide maintains a 91% user satisfaction rating. However, they believe it’s due for an overhaul to streamline outdated advice and better serve its core beginner audience.

These are some of the most notable changes discussed during the podcast.

HTML Structure & Search Rankings

The team discussed the role of HTML structure in search engine rankings. They explained that proper use of HTML elements such as titles can be helpful for rankings but doesn’t impact them as much as some think.

“Using headings and a good title element and having paragraphs, yeah, sure. It’s all great. But other than that, it’s pretty futile to think about how the page… or how the HTML is structured,” Illyes mentioned, debunking common misconceptions within the SEO community.

Domain Names: Branding Versus Keywords

The team discussed the ongoing debate about whether domain names affect SEO.

They recommended prioritizing branding over including keywords when choosing a domain name. Their view was that establishing a memorable brand should take precedence over trying to optimize domain names for search engines.

This advice reflects a trend toward brand-centric domain name selection in SEO strategies.

Meta Tags: To Include or Not to Include?

The team deliberated whether to discuss meta tags like meta keywords in the revised SEO Starter Guide. The team leaned towards excluding this topic to avoid unnecessarily worrying site owners because meta keywords have minimal influence on Google Search rankings.

“I feel very conflicted about documenting anti patterns because we perhaps also give ideas about like new worries for site owners to think about,” Gary explained, highlighting their cautious approach.

Ultimately, the decision was made to focus the guide on optimizing factors with a more significant impact.

Addressing Misconceptions Head-On

The conversation explored ways for the team to correct common SEO misconceptions, specifically the idea that utilizing Google products improves search rankings.

The team agreed that the SEO Starter Guide should address these inaccurate beliefs to prevent the spread of misinformation.

In Summary

As the Google Search Relations team prepares to release the updated SEO Starter Guide, this recent podcast episode has given the SEO community a sneak peek at the upcoming changes.

They want to simplify and modernize the guide and debunk common SEO myths.

The goal is to provide helpful, practical SEO advice for people who are just starting and experienced professionals.


Featured Image: Sadi-Santos/Shutterstock

About Those Google AI Search Quality Raters via @sejournal, @martinibuster

There’s been a significant amount of speculation about the motives behind Google’s firing of Appen, the company that provides Search Quality Raters, with many suggesting this may be the dawn of a new era of AI search quality raters. The facts, however, are clear about the meaning behind these recent events.

There are two ideas being talked about in the SEO social media world about what this all means:

  1. Google doesn’t need quality raters anymore.
  2. The firing of the search quality raters may mean that Google plans to roll out AI search quality raters at scale.

Are Search Quality Raters Going Away?

Google’s search quality raters are not going away. Google employs multiple companies to provide search quality rating services; it’s not just one company.

There are threads on Reddit, for example, that mention Telus and RWS, which apparently also provide search quality raters to Google.

So the idea that Google’s no longer going to use search quality raters is unfounded, not true.

Some speculate that the reason why Google fired Appen may have something to do with the Appen workers having been encouraged by a Google workers union to negotiate for higher wages (according to a report on Forbes).

So if Google is going to cut costs, the speculation goes, it makes sense to cut the Appen workers since they’re close to the unionized workers at Google.

But that might be a coincidence.

Google announced it is laying off workers again, including from engineering teams, and it could be that the Appen workers losing their jobs was just part of the overall cost cutting.

Will The Raters Be Replaced By AI?

Some SEOs were speculating that the firing of the Appen workers was a sign that Google might replace the quality raters with AI quality raters.

Some search marketers opined that Google doesn’t have the ability to replace the search quality raters with AI now but that they probably will be able to in the near future.

Not only that, some search marketers speculated that it would be bad news for publishers if AI replaced human search quality raters because AI would be able to scale the quality ratings to the entire web.

And here’s the thing about SEOs worrying about Google AI search quality raters, fearing a nightmare scenario, and speculating about whether it’s even possible for an AI to rate webpages for search quality: machines, including AI, have already been in use at Google for years to rate webpages for search quality.

For example, Google’s Helpful Content System, Reviews System, and SpamBrain all rate websites for search quality at scale.

So it’s not a matter of whether Google will replace search quality raters with AI. At this point, Google isn’t replacing them at all; humans are still at work.

But Google already has AI and algorithms that rate websites for search quality. It’s already happening.

Will Google Eventually Get Rid Of Search Quality Raters?

Something to consider is that training AI requires datasets and the work that the search quality raters do can provide data for machines to learn from, in addition to their stated job of rating the search results of new algorithms.

What we do know at this point is that Google isn’t getting rid of the human search quality raters and that Google already has systems that rate webpages for search quality.

The sky isn’t falling.

It’s just another fizzy week in SEO-land.

Featured Image by Shutterstock/Drawlab19