Google’s New Graph Foundation Model Catches Spam Up To 40x Better via @sejournal, @martinibuster

Google published details of a new kind of graph-based AI called a Graph Foundation Model (GFM), which generalizes to previously unseen graphs and delivers a three to forty times boost in precision over previous methods, with successful testing in scaled applications such as spam detection in ads.

Google’s announcement describes the new technology as expanding the boundaries of what has been possible up to today:

“Today, we explore the possibility of designing a single model that can excel on interconnected relational tables and at the same time generalize to any arbitrary set of tables, features, and tasks without additional training. We are excited to share our recent progress on developing such graph foundation models (GFM) that push the frontiers of graph learning and tabular ML well beyond standard baselines.”

Google's Graph Foundation Model shows 3-40 times performance improvement in precision

Graph Neural Networks Vs. Graph Foundation Models

Graphs are representations of data that are related to each other. The connections between the objects are called edges and the objects themselves are called nodes. In SEO, the most familiar type of graph could be said to be the Link Graph, which is a map of the entire web by the links that connect one web page to another.

Current technology uses Graph Neural Networks (GNNs) to represent data such as web page content; GNNs can be used, for example, to identify the topic of a web page.

A Google Research blog post about GNNs explains their importance:

“Graph neural networks, or GNNs for short, have emerged as a powerful technique to leverage both the graph’s connectivity (as in the older algorithms DeepWalk and Node2Vec) and the input features on the various nodes and edges. GNNs can make predictions for graphs as a whole (Does this molecule react in a certain way?), for individual nodes (What’s the topic of this document, given its citations?)…

Apart from making predictions about graphs, GNNs are a powerful tool used to bridge the chasm to more typical neural network use cases. They encode a graph’s discrete, relational information in a continuous way so that it can be included naturally in another deep learning system.”

The downside to GNNs is that they are tethered to the graph on which they were trained and can’t be used on a different kind of graph. To use a different graph, Google has to train another model specifically for it.

To make an analogy, it’s as if you had to train a new generative AI model on French-language documents just to get it to work in French. LLMs don’t actually have that limitation because they can generalize to other languages, but models that work with graphs traditionally cannot generalize to other graphs. This is the problem the invention solves: creating a model that generalizes to other graphs without having to be trained on them first.

The breakthrough that Google announced is that with the new Graph Foundation Models, Google can now train a model that can generalize across new graphs that it hasn’t been trained on and understand patterns and connections within those graphs. And it can do it three to forty times more precisely.

Announcement But No Research Paper

Google’s announcement does not link to a research paper. It’s been variously reported that Google has decided to publish fewer research papers, and this may be a big example of that policy change. Is it because this innovation is so big that Google wants to keep it as a competitive advantage?

How Graph Foundation Models Work

In a conventional graph, let’s say a graph of the Internet, web pages are the nodes. The links between the nodes (web pages) are called the edges. In that kind of graph, you can see similarities between pages because the pages about a specific topic tend to link to other pages about the same specific topic.

In very simple terms, a Graph Foundation Model turns every row in every table into a node and connects related nodes based on the relationships in the tables. The result is a single large graph that the model uses to learn from existing data and make predictions (like identifying spam) on new data.

Screenshot Of Five Tables

Image by Google

Transforming Tables Into A Single Graph

The announcement says this about the following images, which illustrate the process:

“Data preparation consists of transforming tables into a single graph, where each row of a table becomes a node of the respective node type, and foreign key columns become edges between the nodes. Connections between five tables shown become edges in the resulting graph.”

Screenshot Of Tables Converted To Edges

Image by Google

What makes this new model exceptional is that the process of creating it is “straightforward” and it scales. The part about scaling is important because it means that the invention is able to work across Google’s massive infrastructure.

“We argue that leveraging the connectivity structure between tables is key for effective ML algorithms and better downstream performance, even when tabular feature data (e.g., price, size, category) is sparse or noisy. To this end, the only data preparation step consists of transforming a collection of tables into a single heterogeneous graph.

The process is rather straightforward and can be executed at scale: each table becomes a unique node type and each row in a table becomes a node. For each row in a table, its foreign key relations become typed edges to respective nodes from other tables while the rest of the columns are treated as node features (typically, with numerical or categorical values). Optionally, we can also keep temporal information as node or edge features.”
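
To make the data preparation step concrete, here’s a minimal sketch of the transformation the announcement describes. The table names, columns, and foreign keys are hypothetical examples, not Google’s actual pipeline:

```python
# Minimal sketch: turning relational tables into one heterogeneous graph.
# Table names, columns, and foreign keys are hypothetical examples.

tables = {
    "users":  [{"id": 1, "country": "US"}, {"id": 2, "country": "FR"}],
    "ads":    [{"id": 10, "price": 2.5, "category": "travel"}],
    "clicks": [{"id": 100, "user_id": 1, "ad_id": 10, "ts": 1712000000}],
}

# Foreign key columns: (table, column) -> referenced table
foreign_keys = {("clicks", "user_id"): "users", ("clicks", "ad_id"): "ads"}

nodes, edges = {}, []
for table_name, rows in tables.items():      # each table is a node type
    fk_cols = {c for (t, c) in foreign_keys if t == table_name}
    for row in rows:
        node_id = (table_name, row["id"])    # each row becomes a node
        # non-key columns become node features (numerical or categorical)
        nodes[node_id] = {k: v for k, v in row.items()
                          if k != "id" and k not in fk_cols}
        for col in fk_cols:                  # foreign keys become typed edges
            target = (foreign_keys[(table_name, col)], row[col])
            edges.append((node_id, col, target))

print(edges)  # typed edges from clicks to users and to ads
```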

Tests Are Successful

Google’s announcement says that they tested it in identifying spam in Google Ads, which was difficult because it’s a system that uses dozens of large graphs. Current systems are unable to make connections between unrelated graphs and miss important context.

Google’s new Graph Foundation Model was able to make the connections between all the graphs and improved performance.

The announcement described the achievement:

“We observe a significant performance boost compared to the best tuned single-table baselines. Depending on the downstream task, GFM brings 3x – 40x gains in average precision, which indicates that the graph structure in relational tables provides a crucial signal to be leveraged by ML models.”

Is Google Using This System?

It’s notable that Google successfully tested the system with Google Ads for spam detection and reported upsides and no downsides, which suggests it can be used in a live environment for a variety of real-world tasks. Google used it for Google Ads spam detection, and because it’s a flexible model, it can presumably be used for other tasks that involve multiple graphs, from identifying content topics to identifying link spam.

Normally, when something falls short, research papers and announcements say that it points the way for future research, but that’s not how this new invention is presented. It’s presented as a success, and it ends with a statement saying that these results can be further improved, meaning it can get even better than these already spectacular results.

“These results can be further improved by additional scaling and diverse training data collection together with a deeper theoretical understanding of generalization.”

Read Google’s announcement:

Graph foundation models for relational data

Featured Image by Shutterstock/SidorArt

Google’s Trust Ranking Patent Shows How User Behavior Is A Signal via @sejournal, @martinibuster

Google long ago filed a patent for ranking search results by trust. The groundbreaking idea behind the patent is that user behavior can be used as a starting point for developing a ranking signal.

The big idea behind the patent is that the Internet is full of websites all linking to and commenting about each other. But which sites are trustworthy? Google’s solution is to utilize user behavior to indicate which sites are trusted and then use the linking and content on those sites to reveal more sites that are trustworthy for any given topic.

PageRank is basically the same thing only it begins and ends with one website linking to another website. The innovation of Google’s trust ranking patent is to put the user at the start of that trust chain like this:

User trusts X Websites > X Websites trust Other Sites > This feeds into Google as a ranking signal

The trust originates from the user and flows to trusted sites that themselves provide anchor text, lists of other sites, and commentary about other sites.

That, in a nutshell, is what Google’s trust-based ranking algorithm is about.

The deeper insight is that it reveals Google’s groundbreaking approach to letting users be a signal of what’s trustworthy. You know how Google keeps saying to create websites for users? This is what the trust patent is all about, putting the user in the front seat of the ranking algorithm.

Google’s Trust And Ranking Patent

The patent was coincidentally filed around the same period that Yahoo and Stanford University published a Trust Rank research paper which is focused on identifying spam pages.

Google’s patent is not about finding spam. It’s focused on doing the opposite, identifying trustworthy web pages that satisfy the user’s intent for a search query.

How Trust Factors Are Used

The first part of any patent is an Abstract section that offers a very general description of the invention, and that’s what this patent does as well.

The patent abstract asserts:

  • That trust factors are used to rank web pages.
  • The trust factors are generated from “entities” (later described as the users themselves, experts, expert web pages, and forum members) that link to or comment about other web pages.
  • Those trust factors are then used to re-rank web pages.
  • Re-ranking web pages kicks in after the normal ranking algorithm has done its thing with links, etc.

Here’s what the Abstract says:

“A search engine system provides search results that are ranked according to a measure of the trust associated with entities that have provided labels for the documents in the search results.

A search engine receives a query and selects documents relevant to the query.

The search engine also determines labels associated with selected documents, and the trust ranks of the entities that provided the labels.

The trust ranks are used to determine trust factors for the respective documents. The trust factors are used to adjust information retrieval scores of the documents. The search results are then ranked based on the adjusted information retrieval scores.”

As you can see, the Abstract does not say who the “entities” are nor does it say what the labels are yet, but it will.

Field Of The Invention

The next part is called the Field Of The Invention. Its purpose is to describe the technical domain of the invention (information retrieval) and its focus (trust relationships between users) as applied to ranking web pages.

Here’s what it says:

“The present invention relates to search engines, and more specifically to search engines that use information indicative of trust relationship between users to rank search results.”

Now we move on to the next section, the Background, which describes the problem this invention solves.

Background Of The Invention

This section describes why search engines fall short of answering user queries (the problem) and why the invention solves the problem.

The main problems described are:

  • Search engines essentially guess (infer) the user’s intent when they rely on the search query alone.
  • Users rely on expert-labeled content from trusted sites (called vertical knowledge sites) to tell them which web pages are trustworthy.
  • Content labeled as relevant or trustworthy is important but is ignored by search engines.

It’s important to remember that this patent came out before the BERT algorithm and other natural language approaches that are now used to better understand search queries.

This is how the patent explains it:

“An inherent problem in the design of search engines is that the relevance of search results to a particular user depends on factors that are highly dependent on the user’s intent in conducting the search—that is why they are conducting the search—as well as the user’s circumstances, the facts pertaining to the user’s information need.

Thus, given the same query by two different users, a given set of search results can be relevant to one user and irrelevant to another, entirely because of the different intent and information needs.”

Next it goes on to explain that users trust certain websites that provide information about certain topics:

“…In part because of the inability of contemporary search engines to consistently find information that satisfies the user’s information need, and not merely the user’s query terms, users frequently turn to websites that offer additional analysis or understanding of content available on the Internet.”

Websites Are The Entities

The rest of the Background section names forums, review sites, blogs, and news websites as places that users turn to for their information needs, calling them vertical knowledge sites. Vertical Knowledge sites, it’s explained later, can be any kind of website.

The patent explains that trust is why users turn to those sites:

“This degree of trust is valuable to users as a way of evaluating the often bewildering array of information that is available on the Internet.”

To recap, the “Background” section explains that the trust relationships between users and entities like forums, review sites, and blogs can be used to influence the ranking of search results. As we go deeper into the patent we’ll see that the entities are not limited to the above kinds of sites, they can be any kind of site.

Patent Summary Section

This part of the patent is interesting because it brings together all of the concepts into one place, but in a general high-level manner, and throws in some legal paragraphs that explain that the patent can apply to a wider scope than is set out in the patent.

The Summary section appears to have four parts:

  • The first part explains that a search engine ranks web pages that are trusted by entities (like forums, news sites, blogs, etc.) and that the system maintains label information about trusted web pages.
  • The second part offers a general description of the work those entities do (like forums, news sites, blogs, etc.).
  • The third part offers a general description of how the system works, beginning with the query, the assorted hand waving that goes on at the search engine with regard to the entity labels, and then the search results.
  • The fourth part is a legal explanation that the patent is not limited to the descriptions and that the invention applies to a wider scope. This is important because it enables Google to use a non-existent thing, even something as nutty as a “trust button” that a user clicks to identify a site as trustworthy, as an example that stands in for something else, like navigational queries, Navboost, or any other signal that a user trusts a website.

Here’s a nutshell explanation of how the system works:

  • The user visits sites that they trust and click a “trust button” that tells the search engine that this is a trusted site.
  • The trusted site “labels” other sites as trusted for certain topics (the label could be a topic like “symptoms”).
  • A user asks a question at a search engine (a query) and uses a label (like “symptoms”).
  • The search engine ranks websites in the usual manner, then looks for sites that users trust and checks whether any of those sites have applied labels to other sites.
  • Google ranks those other sites that have had labels assigned to them by the trusted sites.

Here’s an abbreviated version of the third part of the Summary that gives an idea of the inner workings of the invention:

“A user provides a query to the system…The system retrieves a set of search results… The system determines which query labels are applicable to which of the search result documents. … determines for each document an overall trust factor to apply… adjusts the …retrieval score… and reranks the results.”

Here’s that same section in its entirety:

  • “A user provides a query to the system; the query contains at least one query term and optionally includes one or more labels of interest to the user.
  • The system retrieves a set of search results comprising documents that are relevant to the query term(s).
  • The system determines which query labels are applicable to which of the search result documents.
  • The system determines for each document an overall trust factor to apply to the document based on the trust ranks of those entities that provided the labels that match the query labels.
  • Applying the trust factor to the document adjusts the document’s information retrieval score, to provide a trust adjusted information retrieval score.
  • The system reranks the search result documents based at least on the trust adjusted information retrieval scores.”

The above is a general description of the invention.
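
As a thought experiment, the reranking step the Summary describes might look something like the sketch below. The formula for combining trust ranks into a trust factor is my assumption; the patent doesn’t specify one:

```python
# Hypothetical sketch of trust-adjusted reranking as the Summary describes.
# The aggregation formula is an assumption; the patent doesn't define one.

def rerank(results, query_labels, annotations, trust_ranks):
    """results: list of (doc, ir_score); annotations: doc -> {label: entity}."""
    reranked = []
    for doc, ir_score in results:
        # trust ranks of entities whose labels match the query labels
        matching = [trust_ranks[entity]
                    for label, entity in annotations.get(doc, {}).items()
                    if label in query_labels]
        trust_factor = 1.0 + sum(matching)  # assumed aggregation
        reranked.append((doc, ir_score * trust_factor))
    return sorted(reranked, key=lambda pair: pair[1], reverse=True)

results = [("pageA.html", 0.72), ("pageB.html", 0.80)]
annotations = {"pageA.html": {"symptoms": "yourhealth.com"}}
trust_ranks = {"yourhealth.com": 0.5}
print(rerank(results, {"symptoms"}, annotations, trust_ranks))
# pageA's adjusted score (0.72 * 1.5 = 1.08) now outranks pageB (0.80)
```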

The next section, called the Detailed Description, dives deep into the details. At this point it’s becoming increasingly evident that the patent is highly nuanced and cannot be reduced to simple advice like: “optimize your site like this to earn trust.”

A large part of the patent hinges on a trust button and an advanced search query, “label:”.

Neither the trust button nor the “label:” advanced search query has ever existed. As you’ll see, they are quite probably stand-ins for techniques that Google doesn’t want to explicitly reveal.

Detailed Description In Four Parts

The details of this patent are located in four sections within the Detailed Description section of the patent. This patent is not as simple as 99% of SEOs say it is.

These are the four sections:

  1. System Overview
  2. Obtaining and Storing Trust Information
  3. Obtaining and Storing Label Information
  4. Generated Trust Ranked Search Results

The System Overview is where the patent deep dives into the specifics. The following is an overview to make it easy to understand.

System Overview

1. Explains how the invention (a search engine system) ranks search results based on trust relationships between users and the user-trusted entities who label web content.

2. The patent describes a “trust button” that a user can click that tells Google that a user trusts a website or trusts the website for a specific topic or topics.

3. The patent says a trust related score is assigned to a website when a user clicks a trust button on a website.

4. The trust button information is stored in a trust database that’s referred to as #190.

Here’s what it says about assigning a trust rank score based on the trust button:

“The trust information provided by the users with respect to others is used to determine a trust rank for each user, which is a measure of the overall degree of trust that users have in the particular entity.”

Trust Rank Button

The patent refers to the “trust rank” of the user-trusted websites. That trust rank is based on a trust button that a user clicks to indicate that they trust a given website, assigning a trust rank score.

The patent says:

“…the user can click on a “trust button” on a web page belonging to the entity, which causes a corresponding record for a trust relationship to be recorded in the trust database 190.

In general any type of input from the user indicating that such a trust relationship exists can be used.”

The trust button has never existed and the patent quietly acknowledges this by stating that any type of input can be used to indicate the trust relationship.

So what is it? I believe that the “trust button” is a stand-in for user behavior metrics in general, and site visitor data in particular. The patent Claims section does not mention trust buttons at all but does mention user visitor data as an indicator of trust.

Here are several passages that mention site visits as a way to understand if a user trusts a website:

“The system can also examine web visitation patterns of the user and can infer from the web visitation patterns which entities the user trusts. For example, the system can infer that a particular user trusts a particular entity when the user visits the entity’s web page with a certain frequency.”

The same thing is stated in the Claims section of the patent, it’s the very first claim they make for the invention:

“A method performed by data processing apparatus, the method comprising:
determining, based on web visitation patterns of a user, one or more trust relationships indicating that the user trusts one or more entities;”

It may very well be that site visitation patterns and other user behaviors are what is meant by the “trust button” references.
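
A minimal sketch of what inferring trust from web visitation patterns could look like, assuming a simple frequency threshold (the patent doesn’t disclose any actual threshold or weighting):

```python
from collections import Counter

# Hypothetical: infer trusted entities from a user's visit log.
# The frequency threshold is an invented stand-in.
VISIT_THRESHOLD = 10  # visits per month, assumed

def infer_trusted_entities(visit_log):
    """visit_log: hostnames the user visited during the period."""
    counts = Counter(visit_log)
    return {site for site, n in counts.items() if n >= VISIT_THRESHOLD}

visits = ["example-forum.com"] * 14 + ["random-blog.com"] * 2
print(infer_trusted_entities(visits))  # {'example-forum.com'}
```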

Labels Generated By Trusted Sites

The patent defines trusted entities as news sites, blogs, forums, and review sites, but it is not limited to those kinds of sites; a trusted entity could be any other kind of website.

Trusted websites create references to other sites and in that reference they label those other sites as being relevant to a particular topic. That label could be an anchor text. But it could be something else.

The patent explicitly mentions anchor text only once:

“In some cases, an entity may simply create a link from its site to a particular item of web content (e.g., a document) and provide a label 107 as the anchor text of the link.”

Although it only explicitly mentions anchor text once, there are other passages where anchor text is strongly implied. For example, the patent offers a general description of labels as describing or categorizing the content found on another site:

“…labels are words, phrases, markers or other indicia that have been associated with certain web content (pages, sites, documents, media, etc.) by others as descriptive or categorical identifiers.”

Labels And Annotations

Trusted sites link out to web pages with labels and links. The combination of a label and a link is called an annotation.

This is how it’s described:

“An annotation 106 includes a label 107 and a URL pattern associated with the label; the URL pattern can be specific to an individual web page or to any portion of a web site or pages therein.”
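
Here’s one way to picture an annotation as a data structure, with matching logic that covers both a single page and a whole site section. Everything in this sketch is an illustrative guess at what the patent describes:

```python
from dataclasses import dataclass
from fnmatch import fnmatch

# Hypothetical representation of the patent's "annotation": a label plus
# the URL pattern it applies to (one page or a whole site section).
@dataclass
class Annotation:
    label: str
    url_pattern: str

    def applies_to(self, url: str) -> bool:
        return fnmatch(url, self.url_pattern)

ann = Annotation(label="symptoms", url_pattern="www.yourhealth.com/cancer/*")
print(ann.applies_to("www.yourhealth.com/cancer/colon.html"))  # True
print(ann.applies_to("www.yourhealth.com/contact.html"))       # False
```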

Labels Used In Search Queries

Users can also search with “labels” in their queries by using a non-existent “label:” advanced search query. Those kinds of queries are then used to match the labels that a website page is associated with.

This is how it’s explained:

“For example, a query “cancer label:symptoms” includes the query term “cancer” and a query label “symptoms”, and thus is a request for documents relevant to cancer, and that have been labeled as relating to “symptoms.”

Labels such as these can be associated with documents from any entity, whether the entity created the document, or is a third party. The entity that has labeled a document has some degree of trust, as further described below.”

What is that label in the search query? It could simply be certain descriptive keywords, but there aren’t any clues to speculate further than that.
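
Parsing such a query into its term and label components would be straightforward. Here’s a sketch using the “cancer label:symptoms” example; keep in mind the “label:” operator itself never shipped:

```python
# Sketch: split a query like "cancer label:symptoms" into terms and labels.
# The "label:" operator is the patent's example and never actually existed.

def parse_query(raw_query):
    terms, labels = [], []
    for token in raw_query.split():
        if token.startswith("label:"):
            labels.append(token[len("label:"):])
        else:
            terms.append(token)
    return terms, labels

print(parse_query("cancer label:symptoms"))  # (['cancer'], ['symptoms'])
```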

The patent puts it all together like this:

“Using the annotation information and trust information from the trust database 190, the search engine 180 determines a trust factor for each document.”

Takeaway:

A user’s trust is in a website. That user-trusted website is not necessarily the one that’s ranked, it’s the website that’s linking/trusting another relevant web page. The web page that is ranked can be the one that the trusted site has labeled as relevant for a specific topic and it could be a web page in the trusted site itself. The purpose of the user signals is to provide a starting point, so to speak, from which to identify trustworthy sites.

Experts Are Trusted

Vertical Knowledge Sites, sites that users trust, can host the commentary of experts. The expert could be the publisher of the trusted site as well. Experts are important because links from expert sites are used as part of the ranking process.

Experts are defined as publishing a deep level of content on the topic:

“These and other vertical knowledge sites may also host the analysis and comments of experts or others with knowledge, expertise, or a point of view in particular fields, who again can comment on content found on the Internet.

For example, a website operated by a digital camera expert and devoted to digital cameras typically includes product reviews, guidance on how to purchase a digital camera, as well as links to camera manufacturer’s sites, new products announcements, technical articles, additional reviews, or other sources of content.

To assist the user, the expert may include comments on the linked content, such as labeling a particular technical article as “expert level,” or a particular review as “negative professional review,” or a new product announcement as ‘new 10MP digital SLR’.”

Links From Expert Sites

Links and annotations from user-trusted expert sites are described as sources of trust information:

“For example, Expert may create an annotation 106 including the label 107 “Professional review” for a review 114 of Canon digital SLR camera on a web site “www.digitalcameraworld.com”, a label 107 of “Jazz music” for a CD 115 on the site “www.jazzworld.com”, a label 107 of “Classic Drama” for the movie 116 “North by Northwest” listed on website “www.movierental.com”, and a label 107 of “Symptoms” for a group of pages describing the symptoms of colon cancer on a website 117 “www.yourhealth.com”.

Note that labels 107 can also include numerical values (not shown), indicating a rating or degree of significance that the entity attaches to the labeled document.

Expert’s web site 105 can also include trust information. More specifically, Expert’s web site 105 can include a trust list 109 of entities whom Expert trusts. This list may be in the form of a list of entity names, the URLs of such entities’ web pages, or by other identifying information. Expert’s web site 105 may also include a vanity list 111 listing entities who trust Expert; again this may be in the form of a list of entity names, URLs, or other identifying information.”

Inferred Trust

The patent describes additional signals that can be used to signal (infer) trust. These are more traditional type signals like links, a list of trusted web pages (maybe a resources page?) and a list of sites that trust the website.

These are the inferred trust signals:

“(1) links from the user’s web page to web pages belonging to trusted entities;
(2) a trust list that identifies entities that the user trusts; or
(3) a vanity list which identifies users who trust the owner of the vanity page.”

Another kind of trust signal that can be inferred is from identifying sites that a user tends to visit.

The patent explains:

“The system can also examine web visitation patterns of the user and can infer from the web visitation patterns which entities the user trusts. For example, the system can infer that a particular user trusts a particular entity when the user visits the entity’s web page with a certain frequency.”

Takeaway:

That’s a pretty big signal and I believe that it suggests that promotional activities that encourage potential site visitors to discover a site and then become loyal site visitors can be helpful. For example, that kind of signal can be tracked with branded search queries. It could be that Google is only looking at site visit information but I think that branded queries are an equally trustworthy signal, especially when those queries are accompanied by labels… ding, ding, ding!

The patent also lists some out-there examples of inferred trust, like contact/chat list data. It doesn’t say social media, just contact/chat lists.

Trust Can Decay or Increase

Another interesting feature of trust rank is that it can decay or increase over time.

The patent is straightforward about this part:

“Note that trust relationships can change. For example, the system can increase (or decrease) the strength of a trust relationship for a trusted entity. The search engine system 100 can also cause the strength of a trust relationship to decay over time if the trust relationship is not affirmed by the user, for example by visiting the entity’s web site and activating the trust button 112.”
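
The decay could be modeled as simply as an exponential falloff that resets whenever the user reaffirms the relationship. This is a guess at the mechanics; the patent only says the strength can decay over time:

```python
import math

# Hypothetical exponential decay of trust strength between reaffirmations.
# The half-life is invented for illustration.
HALF_LIFE_DAYS = 180

def trust_strength(base_strength, days_since_last_affirmation):
    decay = math.exp(-math.log(2) * days_since_last_affirmation / HALF_LIFE_DAYS)
    return base_strength * decay

print(round(trust_strength(1.0, 0), 2))    # 1.0, just reaffirmed
print(round(trust_strength(1.0, 180), 2))  # 0.5, one half-life later
```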

Trust Relationship Editor User Interface

Directly after the above paragraph is a section about enabling users to edit their trust relationships through a user interface. There has never been such a thing, just like the non-existent trust button.

This is possibly a stand-in for something else. Could this trusted sites dashboard be Chrome browser bookmarks or sites that are followed in Discover? This is a matter for speculation.

Here’s what the patent says:

“The search engine system 100 may also expose a user interface to the trust database 190 by which the user can edit the user trust relationships, including adding or removing trust relationships with selected entities.

The trust information in the trust database 190 is also periodically updated by crawling of web sites, including sites of entities with trust information (e.g., trust lists, vanity lists); trust ranks are recomputed based on the updated trust information.”

What Google’s Trust Patent Is About

Google’s Search Result Ranking Based On Trust patent describes a way of leveraging user-behavior signals to understand which sites are trustworthy. The system then identifies sites that are trusted by the user-trusted sites and uses that information as a ranking signal. There is no actual trust rank metric, but there are ranking signals related to what users trust. Those signals can decay or increase based on factors like whether a user still visits those sites.

The larger takeaway is that this patent is an example of how Google is focused on user signals as a ranking source, so that it can feed them back into ranking sites that meet users’ needs. This means that instead of doing things because “this is what Google likes,” it’s better to go even deeper and do things because users like it. That will feed back to Google through these kinds of algorithms that measure user behavior patterns, something we all know Google uses.

Featured Image by Shutterstock/samsulalam

Google’s Local Job Type Algorithm Detailed In Research Paper via @sejournal, @martinibuster

Google published a research paper describing how it extracts “services offered” information from local business sites to add it to business profiles in Google Maps and Search. The paper describes specific relevance factors and confirms that the system has been in successful use for a year.

What makes this research paper especially notable is that one of the authors is Marc Najork, a distinguished research scientist at Google who is associated with many milestones in information retrieval, natural language processing, and artificial intelligence.

The purpose of this system is to make it easier for users to find local businesses that provide the services they are looking for. The paper was published in 2024 (according to the Internet Archive) and is dated 2023.

The research paper explains:

“…to reduce user effort, we developed and deployed a pipeline to automatically extract the job types from business websites. For example, if a web page owned by a plumbing business states: “we provide toilet installation and faucet repair service”, our pipeline outputs toilet installation and faucet repair as the job types for this business.”

Developing A Local Search System

The first step for creating a system for crawling and extracting job type information was to create training data from scratch. They selected billions of home pages that are listed in Google business profiles and extracted job type information from tables and formatted lists on home pages or pages that were one click away from the home pages. This job type data became the seed set of job types.

The extracted job type data was used as search queries, augmented with query expansion (synonyms) to expand the list of job types to include all possible variations of job type keyword phrases.
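
In outline, the seed-building step pairs structured extraction with query expansion. Here’s a toy sketch, with an invented synonym table standing in for whatever expansion method the researchers actually used:

```python
# Toy sketch of the seed-building step: harvest job types from structured
# lists on business pages, then expand them with synonyms. The synonym
# table is invented; the paper doesn't publish its expansion dictionary.

seed_job_types = {"toilet installation", "faucet repair"}  # from lists/tables

synonyms = {
    "faucet repair": ["tap repair", "faucet fix"],
    "toilet installation": ["toilet install"],
}

expanded = set(seed_job_types)
for job_type in seed_job_types:
    expanded.update(synonyms.get(job_type, []))

print(sorted(expanded))  # all variations become search queries for the pipeline
```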

Second Step: Fixing A Relevance Problem

Google’s researchers applied their system to the billions of pages, and it didn’t work as intended because many pages contained job type phrases that were not describing services offered.

The research paper explains:

“We found that many pages mention job type names for other purposes like giving life tips. For example, a web page that teaches readers to deal with bed bugs might contain a sentence like a solution is to call home cleaning services if you find bed bugs in your home. They usually provide services like bed bug control. Though this page mentions multiple job type names, the page is not provided by a home cleaning business.”

Limiting the crawling and indexing to identifying job type keyword phrases resulted in false positives. The solution was to incorporate the sentences surrounding the keyword phrases so that the system could better understand the context of the job type keyword phrases.

The success of using surrounding text is explained:

“As shown in Table 2, JobModelSurround performs significantly better than JobModel, which suggests that the surrounding words could indeed explain the intent of the seed job type mentions. This successfully improves the semantic understanding without processing the entire text of each page, keeping our models efficient.”
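
The fix amounts to classifying each job-type mention together with the text around it rather than the phrase alone. Here’s a simplified sketch of building that kind of input; the sentence-window approach is my assumption about how the surrounding words could be captured:

```python
import re

# Sketch: pair each job-type mention with its surrounding sentence so a
# classifier sees context, not just the phrase. Window logic is assumed.

def mentions_with_context(page_text, job_type):
    sentences = re.split(r"(?<=[.!?])\s+", page_text)
    return [s for s in sentences if job_type in s.lower()]

page = ("A solution is to call home cleaning services if you find bed bugs. "
        "They usually provide services like bed bug control.")
print(mentions_with_context(page, "bed bug control"))
# ['They usually provide services like bed bug control.']
# Trained on such windows, a model can tell life-tip mentions from real offers.
```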

SEO Insight
The described local search algorithm purposely excludes everything else on the page, zeroing in on job type keyword phrases and the words and phrases that surround them. This shows how the words around important keyword phrases can provide context for those phrases and make it easier for Google’s crawlers to understand what the page is about without having to process the entire web page.

SEO Insight
Another insight is that Google is not processing the entire web page for the limited purpose of identifying job type keyword phrases. The algorithm hunts for the keyword phrase and the keyword phrases surrounding it.

SEO Insight
The concept of analyzing only a part of a page is similar to Google’s Centerpiece Annotation where a section of content is identified as the main topic of the page. I’m not saying these are related. I’m just pointing out one feature out of many where a Google algorithm zeroes in on just a section of a page.

The System Uses BERT

Google used the BERT language model to classify whether phrases extracted from business websites describe actual job types. BERT was fine-tuned on labeled examples and given additional context such as website structure, URL patterns, and business category to improve precision without sacrificing scalability.
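
A conventional way to set up that kind of classifier with an off-the-shelf BERT model looks like the sketch below. This is a generic Hugging Face example, not Google’s internal pipeline, and concatenating the context with separator tokens is just one plausible input format:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Generic sketch of a BERT binary classifier for "is this phrase a job type
# offered by this business?" Not Google's internal model or features.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # 0 = not a job type, 1 = job type

# Candidate phrase plus context (surrounding text, URL, business category),
# concatenated with separator tokens as one plausible input format.
text = ("faucet repair [SEP] we provide toilet installation and "
        "faucet repair service [SEP] plumber")
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits
print(logits.softmax(dim=-1))  # untrained here; fine-tuning on labels is required
```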

The Extraction System Can Be Generalized To Other Contexts

An interesting finding detailed by the research paper is that the system they developed can be used in areas (domains) other than local businesses, such as “expertise finding, legal and medical information extraction.”

They write:

“The lessons we shared in developing the largescale extraction pipeline from scratch can generalize to other information extraction or machine learning tasks. They have direct applications to domain-specific extraction tasks, exemplified by expertise finding, legal and medical information extraction.

Three most important lessons are:

(1) utilizing the data properties such as structured content could alleviate the cold start problem of data annotation;

(2) formulating the task as a retrieval problem could help researchers and practitioners deal with a large dataset;

(3) the context information could improve the model quality without sacrificing its scalability.”

Job Type Extraction Is A Success

The research paper says that their system is a success: it has a high level of precision (accuracy) and it is scalable. The paper also says that it has already been in use for a year. The research is dated 2023, but according to the Internet Archive (Wayback Machine), it was published sometime in July 2024.

The researchers write:

“Our pipeline is executed periodically to keep the extracted content up-to-date. It is currently deployed in production, and the output job types are surfaced to millions of Google Search and Maps users.”

Takeaways

  • Google’s Algorithm That Extracts Job Types from Webpages
    Google developed an algorithm that extracts “job types” (i.e., services offered) from business websites to display in Google Maps and Search.
  • Pipeline Extracts From Unstructured Content
    Instead of relying on structured HTML elements, the algorithm reads free-text content, making it effective even when services are buried in paragraphs.
  • Contextual Relevance Is Important
    The system evaluates surrounding words to confirm that service-related terms are actually relevant to the business, improving accuracy.
  • Model Generalization Potential
    The approach can be applied to other fields like legal or medical information extraction, showing how it can be applied to other kinds of knowledge.
  • High Accuracy and Scalability
    The system has been deployed for over a year and delivers scalable, high-precision results across billions of webpages.

Google published a research paper about an algorithm that automatically extracts service descriptions from local business websites by analyzing keyword phrases and their surrounding context, enabling more accurate and up-to-date listings in Google Maps and Search. This technique avoids dependence on HTML structure and can be adapted for use in other industries where extracting information from unstructured text is needed.

Read the research paper abstract and download the PDF version here:

Job Type Extraction for Service Businesses

Featured Image by Shutterstock/ViDI Studio

Google Patent On Using Contextual Signals Beyond Query Semantics via @sejournal, @martinibuster

A patent recently filed by Google outlines how an AI assistant may use at least five real-world contextual signals, including identifying related intents, to influence answers and generate natural dialog. It’s an example of how AI-assisted search modifies responses to engage users with contextually relevant questions and dialog, expanding beyond keyword-based systems.

The patent describes a system that generates relevant dialog and answers using signals such as environmental context, dialog intent, user data, and conversation history. These factors go beyond using the semantic data in the user’s query and show how AI-assisted search is moving toward more natural, human-like interactions.

In general, the purpose of filing a patent is to obtain legal protection and exclusivity for an invention and the act of filing doesn’t indicate that Google is actually using it.

The patent uses examples of spoken dialog but it also states the invention is not limited to audio input:

“Notably, during a given dialog session, a user can interact with the automated assistant using various input modalities, including, but not limited to, spoken input, typed input, and/or touch input.”

The name of the patent is Using Large Language Model(s) In Generating Automated Assistant response(s). The patent applies to a wide range of AI assistants that receive inputs via typing, touch, and speech.

There are five factors that influence the LLM modified responses:

  1. Time, Location, And Environmental Context
  2. User-Specific Context
  3. Dialog Intent & Prior Interactions
  4. Inputs (text, touch, and speech)
  5. System & Device Context

The first four factors influence the answers that the automated assistant provides and the fifth one determines whether to turn off the LLM-assisted part and revert to standard AI answers.

Time, Location, And Environmental Context

Three contextual factors, time, location, and environment, provide context that doesn’t exist in keywords and influence how the AI assistant responds. While these contextual factors, as described in the patent, aren’t strictly related to AI Overviews or AI Mode, they do show how AI-assisted interactions with data can change.

The patent uses the example of a person who tells their assistant they’re going surfing. A standard AI response would be a boilerplate comment to have fun or to enjoy the day. The LLM-assisted response described in the patent would generate a response based on the geographic location and time to generate a comment about the weather like the potential for rain. These are called modified assistant outputs.

The patent describes it like this:

“…the assistant outputs included in the set of modified assistant outputs include assistant outputs that do drive the dialog session in manner that further engages the user of the client device in the dialog session by asking contextually relevant questions (e.g., “how long have you been surfing?”), that provide contextually relevant information (e.g., “but if you’re going to Example Beach again, be prepared for some light showers”), and/or that otherwise resonate with the user of the client device within the context of the dialog session.”

User-Specific Context

The patent describes multiple user-specific contexts that the LLM may use to generate a modified output:

  • User profile data, such as preferences (like food or types of activity).
  • Software application data (such as apps currently or recently in use).
  • Dialog history of the ongoing and/or previous assistant sessions.

Here’s a snippet that talks about various user profile related contextual signals:

“Moreover, the context of the dialog session can be determined based on one or more contextual signals that include, for example, ambient noise detected in an environment of the client device, user profile data, software application data, …dialog history of the dialog session between the user and the automated assistant, and/or other contextual signals.”

Related Intents

An interesting part of the patent describes how a user’s food preference can be used to determine a related intent to a query.

“For example, …one or more of the LLMs can determine an intent associated with the given assistant query… Further, the one or more of the LLMs can identify, based on the intent associated with the given assistant query, at least one related intent that is related to the intent associated with the given assistant query… Moreover, the one or more of the LLMs can generate the additional assistant query based on the at least one related intent. “

The patent illustrates this with the example of a user saying that they’re hungry. The LLM will then identify related contexts, such as what type of cuisine the user enjoys, and the intent of eating at a restaurant.

The patent explains:

“In this example, the additional assistant query can correspond to, for example, “what types of cuisine has the user indicated he/she prefers?” (e.g., reflecting a related cuisine type intent associated with the intent of the user indicating he/she would like to eat), “what restaurants nearby are open?” (e.g., reflecting a related restaurant lookup intent associated with the intent of the user indicating he/she would like to eat)… In these implementations, additional assistant output can be determined based on processing the additional assistant query.”
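
In data-flow terms, the chain is: detected intent, then related intents, then additional assistant queries whose answers feed the final response. Here’s a hypothetical sketch; the intent names and mapping are invented, and the patent has the LLM perform this step rather than a lookup table:

```python
# Hypothetical data flow for the related-intent step. Intent names and
# mappings are invented; in the patent the LLM derives these itself.

related_intents = {
    "user_wants_to_eat": ["preferred_cuisine_lookup", "nearby_open_restaurants"],
}

additional_queries = {
    "preferred_cuisine_lookup": "what types of cuisine has the user indicated he/she prefers?",
    "nearby_open_restaurants": "what restaurants nearby are open?",
}

def expand(intent):
    return [additional_queries[r] for r in related_intents.get(intent, [])]

print(expand("user_wants_to_eat"))  # the assistant answers these internally
```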

System & Device Context

The system and device context part of the patent is interesting because it enables the AI to detect whether the device is low on battery and, if so, to turn off the LLM-modified responses. Other factors include whether the user is walking away from the device, computational costs, and so on.
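
A gating check like the one described could reduce to something as simple as the sketch below. The thresholds and signal names are made up; the patent only gives examples such as battery level, user presence, and computational cost:

```python
# Hypothetical gate: fall back to standard (non-LLM) responses when device
# or system context makes LLM calls undesirable. Thresholds are invented.

def use_llm_modified_response(battery_pct, user_present, est_compute_cost):
    if battery_pct < 20:        # low battery: skip the expensive LLM path
        return False
    if not user_present:        # user walked away from the device
        return False
    if est_compute_cost > 1.0:  # arbitrary cost budget
        return False
    return True

print(use_llm_modified_response(battery_pct=15, user_present=True,
                                est_compute_cost=0.3))  # False, standard reply
```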

Takeaways

  • AI Query Responses Use Contextual Signals
    Google’s patent describes how automated assistants can use real-world context to generate more relevant and human-like answers and dialog.
  • Contextual Factors Influence Responses
    These include time/location/environment, user-specific data, dialog history and intent, system/device conditions, and input type (text, speech, or touch).
  • LLM-Modified Responses Enhance Engagement
    Large language models (LLMs) use these contexts to create personalized responses or follow-up questions, like referencing weather or past interactions.
  • Examples Show Practical Impact
    Scenarios like recommending food based on user preferences or commenting on local weather during outdoor plans demonstrate how real-world contexts can influence how AI responds to user queries.

This patent is important because millions of people are increasingly engaging with AI assistants, thus it’s relevant to publishers, ecommerce stores, local businesses and SEOs.

It outlines how Google’s AI-assisted systems can generate personalized, context-aware responses by using real-world signals. This enables assistants to go beyond keyword-based answers and respond with relevant information or follow-up questions, such as suggesting restaurants a user might like or commenting on weather conditions before a planned activity.

Read the patent here:

Using Large Language Model(s) In Generating Automated Assistant response(s).

Featured Image by Shutterstock/Visual Unit

Marketing To Machines Is The Future – Research Shows Why via @sejournal, @martinibuster

A new research paper explores how AI agents interact with online advertising and what shapes their decision-making. The researchers tested three leading LLMs to understand which kinds of ads influence AI agents most and what this means for digital marketing. As more people rely on AI agents to research purchases, advertisers may need to rethink strategy for a machine-readable, AI-centric world and embrace the emerging paradigm of “marketing to machines.”

Although the researchers were testing if AI agents interacted with advertising and what kinds influenced them the most, their findings also show that well-structured on-page information, like pricing data, is highly influential, which opens up areas to think about in terms of AI-friendly design.

An AI agent (also called agentic AI) is an autonomous AI assistant that performs tasks like researching content on the web, comparing hotel prices based on star ratings or proximity to landmarks, and then presenting that information to a human, who then uses it to make decisions.

AI Agents And Advertising

The research is titled Are AI Agents Interacting With Online Ads? and was conducted at the University of Applied Sciences Upper Austria. The research paper cites previous research on the interaction between AI agents and online advertising that explores the emerging relationships between agentic AI and the machines driving display advertising.

Previous research on AI agents and advertising focused on:

  • Pop-up Vulnerabilities
    Vision-language AI agents that aren’t programmed to avoid advertising can be tricked into clicking on pop-up ads at a rate of 86%.
  • Advertising Model Disruption
    This research concluded that AI agents bypassed sponsored and banner ads but forecast disruption in advertising as merchants figure out how to get AI agents to click on their ads to win more sales.
  • Machine-Readable Marketing
    This paper makes the argument that marketing has to evolve toward “machine-to-machine” interactions and “API-driven marketing.”

The research paper offers the following observations about AI agents and advertising:

“These studies underscore both the potential and pitfalls of AI agents in online advertising contexts. On one hand, agents offer the prospect of more rational, data-driven decisions. On the other hand, existing research reveals numerous vulnerabilities and challenges, from deceptive pop-up exploitation to the threat of rendering current advertising revenue models obsolete.

This paper contributes to the literature by examining these challenges, specifically within hotel booking portals, offering further insight into how advertisers and platform owners can adapt to an AI-centric digital environment.”

The researchers investigate how AI agents interact with online ads, focusing specifically on hotel and travel booking platforms. They used a custom built travel booking platform to perform the testing, examining whether AI agents incorporate ads into their decision-making and explored which ad formats (like banners or native ads) influence their choices.

How The Researchers Conducted The Tests

The researchers conducted the experiments using two AI agent systems: OpenAI’s Operator and the open-source Browser Use framework. Operator, a closed system built by OpenAI, relies on screenshots to perceive web pages and is likely powered by GPT-4o, though the specific model was not disclosed.

Browser Use allowed the researchers to control for the model used for the testing by connecting three different LLMs via API:

  • GPT-4o
  • Claude Sonnet 3.7
  • Gemini 2.0 Flash

The Browser Use setup enabled consistent testing across models by letting them use the page’s rendered HTML structure (DOM tree) and by recording their decision-making behavior.

These AI agents were tasked with completing hotel booking requests on a simulated travel site. Each prompt was designed to reflect realistic user intent and tested the agent’s ability to evaluate listings, interact with ads, and complete a booking.

By using APIs to plug in the three large language models, the researchers were able to isolate differences in how each model responded to page data and advertising cues, to observe how AI agents behave in web-based decision-making tasks.
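
For readers who want to picture the setup, wiring the open-source Browser Use framework to an API model typically looks something like this sketch, based on the library’s documented quickstart. The model name and task string are placeholders, and the exact interface may differ between versions:

```python
# Sketch of a Browser Use test harness, based on the library's documented
# quickstart (pip install browser-use). Interface may vary by version.
import asyncio

from browser_use import Agent
from langchain_openai import ChatOpenAI  # swap in Claude or Gemini wrappers

async def run_trial(prompt: str):
    agent = Agent(
        task=prompt,                     # e.g., one of the ten test prompts
        llm=ChatOpenAI(model="gpt-4o"),  # the model under test
    )
    return await agent.run()             # the agent browses via the DOM tree

asyncio.run(run_trial("Book a romantic holiday with my girlfriend."))
```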

These are the ten prompts used for testing purposes:

  1. Book a romantic holiday with my girlfriend.
  2. Book me a cheap romantic holiday with my boyfriend.
  3. Book me the cheapest romantic holiday.
  4. Book me a nice holiday with my husband.
  5. Book a romantic luxury holiday for me.
  6. Please book a romantic Valentine’s Day holiday for my wife and me.
  7. Find me a nice hotel for a nice Valentine’s Day.
  8. Find me a nice romantic holiday in a wellness hotel.
  9. Look for a romantic hotel for a 5-star wellness holiday.
  10. Book me a hotel for a holiday for two in Paris.

What the Researchers Discovered

Engagement With Ads

The study found that AI agents don’t ignore online advertisements, but their engagement with ads and the extent to which those ads influence decision-making varies depending on the large language model.

OpenAI’s GPT-4o and Operator were the most decisive, consistently selecting a single hotel and completing the booking process in nearly all test cases.

Anthropic’s Claude Sonnet 3.7 showed moderate consistency, making specific booking selections in most trials but occasionally returning lists of options without initiating a reservation.

Google’s Gemini 2.0 Flash was the least decisive, frequently presenting multiple hotel options and completing significantly fewer bookings than the other models.

Banner ads were the most frequently clicked ad format across all agents. However, the presence of relevant keywords had a greater impact on outcomes than visuals alone.

Ads with keywords embedded in visible text influenced model behavior more effectively than those with image-based text, which some agents overlooked. GPT-4o and Claude were more responsive to keyword-based ad content, with Claude integrating more promotional language into its output.

Use Of Filtering And Sorting Features

The models also differed in how they used interactive web page filtering and sorting tools.

  • Gemini applied filters extensively, often combining multiple filter types across trials.
  • GPT-4o used filters rarely, interacting with them only in a few cases.
  • Claude used filters more frequently than GPT-4o, but not as systematically as Gemini.

Consistency Of AI Agents

The researchers also tested for consistency of how often agents, when given the same prompt multiple times, picked the same hotel or offered the same selection behavior.

In terms of booking consistency, both GPT-4o (with Browser Use) and Operator (OpenAI’s proprietary agent) consistently selected the same hotel when given the same prompt.

Claude showed moderately high consistency in how often it selected the same hotel for the same prompt, though it chose from a slightly wider pool of hotels compared to GPT-4o or Operator.

Gemini was the least consistent, producing a wider range of hotel choices and less predictable results across repeated queries.

Specificity Of AI Agents

They also tested for specificity, which is how often the agent chose a specific hotel and committed to it rather than giving multiple options or vague suggestions. Specificity reflects how decisive the agent is in completing a booking task. A higher specificity score means the agent more often committed to a single choice, while a lower score means it tended to return multiple options or respond less definitively (a toy calculation follows the list below).

  • Gemini had the lowest specificity score at 60%, frequently offering several hotels or vague selections rather than committing to one.
  • GPT-4o had the highest specificity score at 95%, almost always making a single, clear hotel recommendation.
  • Claude scored 74%, usually selecting a single hotel, but with more variation than GPT-4o.
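
Measured this way, specificity is just the share of trials that ended in exactly one committed choice. A toy calculation (the trial data is invented, not from the paper):

```python
# Specificity = fraction of trials where the agent committed to one hotel.
# Toy trial data, not the paper's actual logs.
hotels_returned_per_trial = [1, 1, 1, 2, 1, 1, 3, 1, 1, 1]

specificity = (sum(1 for n in hotels_returned_per_trial if n == 1)
               / len(hotels_returned_per_trial))
print(f"{specificity:.0%}")  # 80% on this toy data
```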

The findings suggest that advertising strategies may need to shift toward structured, keyword-rich formats that align with how AI agents process and evaluate information, rather than relying on traditional visual design or emotional appeal.

What It All Means

This study investigated how AI agents for three language models (GPT-4o, Claude Sonnet 3.7, and Gemini 2.0 Flash) interact with online advertisements during web-based hotel booking tasks. Each model received the same prompts and completed the same types of booking tasks.

Banner ads received more clicks than sponsored or native ad formats, but the most important factor in ad effectiveness was whether the ad contained relevant keywords in visible text. Ads with text-based content outperformed those with embedded text in images. GPT-4o and Claude were the most responsive to these keyword cues, and Claude was also the most likely among the tested models to quote ad language in its responses.

According to the research paper:

“Another significant finding was the varying degree to which each model incorporated advertisement language. Anthropic’s Claude Sonnet 3.7 when used in ‘Browser Use’ demonstrated the highest advertisement keyword integration, reproducing on average 35.79% of the tracked promotional language elements from the Boutique Hotel L’Amour advertisement in responses where this hotel was recommended.”

In terms of decision-making, GPT-4o was the most decisive, usually selecting a single hotel and completing the booking. Claude was generally clear in its selections but sometimes presented multiple options. Gemini tended to frequently offer several hotel options and completed fewer bookings overall.

The agents showed different behavior in how they used a booking site’s interactive filters. Gemini applied filters heavily. GPT-4o used filters occasionally. Claude’s behavior was between the two, using filters more than GPT-4o but not as consistently as Gemini.

When it came to consistency—how often the same hotel was selected when the same prompt was repeated—GPT-4o and Operator showed the most stable behavior. Claude showed moderate consistency, drawing from a slightly broader pool of hotels, while Gemini produced the most varied results.

The researchers also measured specificity, or how often agents made a single, clear hotel recommendation. GPT-4o was the most specific, with a 95% rate of choosing one option. Claude scored 74%, and Gemini was again the least decisive, with a specificity score of 60%.

What does this all mean? In my opinion, these findings suggest that digital advertising will need to adapt to AI agents. That means keyword-rich formats are more effective than visual or emotional appeals, especially as machines are increasingly the ones interacting with ad content. Lastly, the research paper references structured data, but not in the context of Schema.org structured data. Structured data in the context of the research paper means on-page data like prices and locations, and it’s this kind of data that AI agents engage with best.

The most important takeaway from the research paper is:

“Our findings suggest that for optimizing online advertisements targeted at AI agents, textual content should be closely aligned with anticipated user queries and tasks. At the same time, visual elements play a secondary role in effectiveness.”

That may mean that for advertisers, designing for clarity and machine readability may soon become as important as designing for human engagement.

Read the research paper:

Are AI Agents interacting with Online Ads?

Featured Image by Shutterstock/Creativa Images

Google Files Patent On Personal History-Based Search via @sejournal, @martinibuster

Google recently filed a patent for a way to provide search results based on a user’s browsing and email history. The patent outlines a new way to search within the context of a search engine, within an email interface, and through a voice-based assistant (referred to in the patent as a voice-based dialog system).

A problem that many people have is that they can remember what they saw but they can’t remember where they saw it or how they found it. The new patent, titled Generating Query Answers From A User’s History, solves that problem by helping people find information they’ve previously seen within a webpage or an email by enabling them to ask for what they’re looking for using everyday language such as “What was that article I read last week about chess?”

The problem the invention solves is that traditional search engines don’t enable users to easily search their own browsing or email history using natural language. The invention works by taking a user’s spoken or typed question, recognizing that the question is asking for previously viewed content, and then retrieving search results from the user’s personal history (such as their browser history or emails). To accomplish this, it uses filters like date, topic, or device used.

What’s novel about the invention is the system’s ability to understand vague or fuzzy natural language queries and match them to a user’s specific past interactions, including showing the version of a page as it looked when the user originally saw it (a cached version of the web page).

Query Classification (Intent) And Filtering

Query Classification

The system first determines whether the intent of the user’s spoken or typed query is to retrieve previously accessed information. This process is called query classification and involves analyzing the phrasing of the query to detect the intent. The system compares parts of the query to known patterns associated with history-seeking questions and uses techniques like semantic analysis and similarity thresholds to identify if the user’s intent is to seek something they’d seen before, even when the wording is vague or conversational.

The similarity threshold is an interesting part of the invention because it compares what the user is saying or typing to known history-seeking phrases to see if they are similar. It’s not looking for an exact match but rather a close match.
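
The patent doesn’t spell out the exact matching technique, but the idea is easy to illustrate. Here’s a minimal Python sketch of threshold-based phrase matching; the phrase list, the threshold value, and the use of difflib are my own illustrative assumptions, not details from the patent:

```python
from difflib import SequenceMatcher

# Illustrative examples of history-seeking phrasings (not from the patent)
HISTORY_SEEKING_PATTERNS = [
    "what was that article i read",
    "find the email i got",
    "show me the page i saw",
    "i'm looking for a recipe i read",
]

SIMILARITY_THRESHOLD = 0.6  # assumed value; a close match, not an exact match

def is_history_seeking(query: str) -> bool:
    """Return True if the query resembles a known history-seeking phrase."""
    query = query.lower()
    return any(
        SequenceMatcher(None, query, pattern).ratio() >= SIMILARITY_THRESHOLD
        for pattern in HISTORY_SEEKING_PATTERNS
    )

print(is_history_seeking("what was that article i read last week about chess?"))  # True
```

A production system would presumably use semantic analysis rather than raw string similarity, but the close-match-not-exact-match behavior is the same idea.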

Filtering

The next part is filtering, and it happens after the system has identified the history-seeking intent. It then applies filters such as the topic, time, or device to limit the search to content from the user’s personal history that matches those criteria.

The time filter is a way to constrain the search to within a specific time frame that’s mentioned or implied in the search query. This helps the system narrow down the search results to what the user is trying to find. So if a user speaks phrases like “last week” or “a few days ago” then it knows to restrict the query to those respective time frames.

An interesting quality of the time filter is that it’s applied with a level of fuzziness, which means it’s not exact. So when a person asks the voice assistant to find something from the past week it won’t do a literal search of the past seven days but will expand it to a longer period of time.

The patent describes the fuzzy quality of the time filter:

“For example, the browser history collection… may include a list of web pages that were accessed by the user. The search engine… may obtain documents from the index… based on the filters from the formatted query.

For example, if the formatted query… includes a date filter (e.g., “last week”) and a topic filter (e.g., “chess story”), the search engine… may retrieve only documents from the collection… that satisfy these filters, i.e., documents that the user accessed in the previous week that relate to a “chess story.”

In this example, the search engine… may apply fuzzy time ranges to the “last week” filter to account for inaccuracies in human memory. In particular, while “last week” literally refers to the seven calendar days of the previous week, the search engine… may search for documents over a wider range, e.g., anytime in the past two weeks.”
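
To make the fuzzy time range concrete, here’s a small sketch that widens a literal date range the way the patent’s example describes; the widening factors for phrases other than “last week” are assumptions:

```python
from datetime import date, timedelta

def fuzzy_time_range(phrase: str, today: date) -> tuple[date, date]:
    """Map a spoken time phrase to a deliberately widened date range.

    The literal range is expanded to allow for fuzzy human memory,
    e.g. "last week" searches the past two weeks, per the patent's example.
    """
    if phrase == "last week":
        return today - timedelta(days=14), today  # literal 7 days, widened to 14
    if phrase == "a few days ago":
        return today - timedelta(days=7), today   # assumed widening
    return today - timedelta(days=30), today      # fallback window (assumption)

start, end = fuzzy_time_range("last week", date(2024, 6, 15))
print(start, end)  # 2024-06-01 2024-06-15
```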

Once a query is classified as asking for something that was previously seen, the system identifies details in the user’s phrasing that are indicative of topic, date or time, source, device, sender, or location and uses them as filters to search the user’s personal history.

Each filter helps narrow the scope of the search to match what the user is trying to recall: for example, a topic filter (“turkey recipe”) targets the subject of the content; a time filter (“last week”) restricts results to when it was accessed; a source filter (“WhiteHouse.gov”) limits the search to specific websites; a device filter (e.g., “on my phone”) further restricts the search results from a certain device; a sender filter (“from grandma”) helps locate emails or shared content; and a location filter (e.g., “at work”) restricts results to those accessed in a particular physical place.

By combining these context-sensitive filters, the system mimics the way people naturally remember content in order to help users retrieve exactly what they’re looking for, even when their query is vague or incomplete.
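
Putting the filters together, a simplified sketch of how they might narrow a personal-history search could look like this; the data model, field names, and filter logic are hypothetical:

```python
from dataclasses import dataclass

@dataclass
class HistoryItem:
    url: str
    topic: str
    device: str
    days_ago: int

# Toy personal-history collection (illustrative data)
history = [
    HistoryItem("example.com/chess-story", "chess", "phone", 9),
    HistoryItem("example.com/turkey-recipe", "recipes", "laptop", 3),
]

def search_history(items, topic=None, max_days_ago=None, device=None):
    """Apply whichever filters were extracted from the query; skip the rest."""
    for item in items:
        if topic and item.topic != topic:
            continue
        if max_days_ago is not None and item.days_ago > max_days_ago:
            continue
        if device and item.device != device:
            continue
        yield item

# "What was that article I read last week about chess?" ->
# topic filter "chess" plus a fuzzy two-week time window
print(list(search_history(history, topic="chess", max_days_ago=14)))
```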

Scope of Search: What Is Searched

The next part of the patent is about figuring out the scope of what is going to be searched, which is limited to predefined sources such as browser history, cached versions of web pages, or emails. So, rather than searching the entire web, the system focuses only on the user’s personal history, making the results more relevant to what the user is trying to recall.

Cached Versions of Previously Viewed Content

Another interesting feature described in the patent is web page caching. Caching refers to saving a copy of a web page as it appeared when the user originally viewed it. This enables the system to show the user that specific version of the page in search results, rather than the current version, which may have changed or been removed.

The cached version acts like a snapshot in time, making it easier for the user to recognize or remember the content they are looking for. This is especially useful when the user doesn’t remember precise details like the name of the page or where they found it, but would recognize it if they saw it again. By showing the version that the user actually saw, the system makes the search experience more aligned with how people remember things.

Potential Applications Of The Patent Invention

The system described in the patent can be applied in several real-world contexts where users may want to retrieve content they’ve previously seen:

Search Engines

The patent refers multiple times to the use of this technique in the context of a search engine that retrieves results not from the public web, but from the user’s personal history, such as previously visited web pages and emails. While the system is designed to search only content the user has previously accessed, the patent notes that some implementations may also include additional documents relevant to the query, even if the user hasn’t viewed them before.

Email Clients

The system treats previously accessed emails as part of the searchable history. For example, it can return an old email like “Grandma’s turkey meatballs” based on vague, natural language queries.

Voice Assistants

The patent includes examples of “a voice-based search” where users speak conversational queries like “I’m looking for a turkey recipe I read on my phone.” The system handles speech recognition and interprets intent to retrieve relevant results from personal history.

Read the entire patent here:

Generating query answers from a user’s history

Google’s New Infini-Attention And SEO via @sejournal, @martinibuster

Google has published a research paper on a new technology called Infini-attention that allows it to process massively large amounts of data with “infinitely long contexts” while also being easy to insert into other models to vastly improve their capabilities.

That last part should be of interest to those who follow Google’s algorithm. Infini-attention is plug-and-play, which means it’s relatively easy to insert into other models, including those in use by Google’s core algorithm. The part about “infinitely long contexts” may have implications for how some of Google’s search systems may work.

The name of the research paper is: Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Memory Is Computationally Expensive For LLMs

Large Language Models (LLMs) have limitations on how much data they can process at one time because the computational complexity and memory usage can spiral upward significantly. Infini-Attention gives the LLM the ability to handle longer contexts while keeping down the memory and processing power needed.

The research paper explains:

“Memory serves as a cornerstone of intelligence, as it enables efficient computations tailored to specific contexts. However, Transformers …and Transformer-based LLMs …have a constrained context-dependent memory, due to the nature of the attention mechanism.

Indeed, scaling LLMs to longer sequences (i.e. 1M tokens) is challenging with the standard Transformer architectures and serving longer and longer context models becomes costly financially.”

And elsewhere the research paper explains:

“Current transformer models are limited in their ability to process long sequences due to quadratic increases in computational and memory costs. Infini-attention aims to address this scalability issue.”

The researchers hypothesized that Infini-attention can scale to handle extremely long sequences with Transformers without the usual increases in computational and memory resources.

Three Important Features

Google’s Infini-Attention solves the shortcomings of transformer models by incorporating three features that enable transformer-based LLMs to handle longer sequences without memory issues and use context from earlier data in the sequence, not just data near the current point being processed.

The features of Infini-Attention

  • Compressive Memory System
  • Long-term Linear Attention
  • Local Masked Attention

Compressive Memory System

Infini-Attention uses what’s called a compressive memory system. As more data is input (as part of a long sequence of data), the compressive memory system compresses some of the older information in order to reduce the amount of space needed to store the data.

Long-term Linear Attention

Infini-attention also uses what’s called a “long-term linear attention mechanism,” which enables the LLM to process data from earlier in the sequence being processed, helping it retain context. That’s a departure from standard transformer-based LLMs.

This is important for tasks where the context exists across a larger span of data. It’s like being able to discuss an entire book and all of its chapters and explain how the first chapter relates to another chapter closer to the end of the book.
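
A rough NumPy sketch of the two ideas together may help. The associative-matrix update and the ELU+1 activation follow the paper’s linear-attention formulation; the dimensions and everything else are illustrative:

```python
import numpy as np

def elu_plus_one(x):
    # Activation applied to queries/keys in the paper's linear attention
    return np.where(x > 0, x + 1.0, np.exp(x))

d_key, d_value = 64, 64
memory = np.zeros((d_key, d_value))  # fixed size no matter how long the sequence gets
z = np.zeros(d_key)                  # normalization term

def update_memory(memory, z, K, V):
    """Fold a new segment's keys (n, d_key) and values (n, d_value) into memory."""
    sK = elu_plus_one(K)
    return memory + sK.T @ V, z + sK.sum(axis=0)

def retrieve(memory, z, Q):
    """Long-term linear attention: read earlier context for queries (m, d_key)."""
    sQ = elu_plus_one(Q)
    return (sQ @ memory) / (sQ @ z)[:, None]

# Each segment is compressed into the same fixed-size matrix
for _ in range(3):
    K, V = np.random.randn(128, d_key), np.random.randn(128, d_value)
    memory, z = update_memory(memory, z, K, V)
```

The key point is that the memory stays the same size no matter how many segments are folded in, which is why older information gets compressed rather than stored verbatim.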

Local Masked Attention

In addition to the long-term attention, Infini-attention also uses what’s called local masked attention. This kind of attention processes nearby (localized) parts of the input data, which is useful for responses that depend on the closer parts of the data.

Combining the long-term and local attention together helps solve the problem of transformers being limited in how much input data they can remember and use for context.

The researchers explain:

“The Infini-attention incorporates a compressive memory into the vanilla attention mechanism and builds in both masked local attention and long-term linear attention mechanisms in a single Transformer block.”
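
The paper describes blending the two attention streams with a learned gate inside each Transformer block. Here’s a minimal sketch of that combination step, treating the gate as a single scalar for simplicity:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def combine_attention(local_out, memory_out, beta):
    """Blend local masked attention with long-term memory retrieval.

    beta is a learned gating parameter; sigmoid(beta) decides how much
    long-term context versus local context flows to the next layer.
    """
    gate = sigmoid(beta)
    return gate * memory_out + (1.0 - gate) * local_out
```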

Results Of Experiments And Testing

Infini-attention was tested against other models for comparison across multiple benchmarks involving long input sequences, such as long-context language modeling, passkey retrieval, and book summarization tasks. Passkey retrieval is a test where the language model has to retrieve specific data from within an extremely long text sequence.

List of the three tests:

  1. Long-context Language Modeling
  2. Passkey Test
  3. Book Summary

Long-Context Language Modeling And The Perplexity Score

The researchers write that the Infini-attention outperformed the baseline models and that increasing the training sequence length brought even further improvements in the Perplexity score. The Perplexity score is a metric that measures language model performance with lower scores indicating better performance.
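
For readers unfamiliar with the metric, perplexity is the exponential of the average per-token negative log-likelihood, which this small example demonstrates:

```python
import math

def perplexity(token_log_probs):
    """exp(average negative log-likelihood per token); lower is better."""
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)

# A model that assigns probability 0.5 to every token has perplexity 2.0
print(perplexity([math.log(0.5)] * 10))  # 2.0
```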

The researchers shared their findings:

“Infini-Transformer outperforms both Transformer-XL …and Memorizing Transformers baselines while maintaining 114x less memory parameters than the Memorizing Transformer model with a vector retrieval-based KV memory with length of 65K at its 9th layer. Infini-Transformer outperforms memorizing transformers with memory length of 65K and achieves 114x compression ratio.

We further increased the training sequence length to 100K from 32K and trained the models on Arxiv-math dataset. 100K training further decreased the perplexity score to 2.21 and 2.20 for Linear and Linear + Delta models.”

Passkey Test

The passkey test is where a random number is hidden within a long text sequence, and the task is for the model to retrieve it. The passkey is hidden near the beginning, middle, or end of the long text. The model was able to solve the passkey test at context lengths of up to 1 million tokens.

“A 1B LLM naturally scales to 1M sequence length and solves the passkey retrieval task when injected with Infini-attention. Infini-Transformers solved the passkey task with up to 1M context length when fine-tuned on 5K length inputs. We report token-level retrieval accuracy for passkeys hidden in a different part (start/middle/end) of long inputs with lengths 32K to 1M.”
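
As an aside, a passkey prompt is typically constructed by burying one key sentence in a long run of filler text. A toy sketch (the filler phrasing is illustrative, loosely modeled on the paper’s setup):

```python
def build_passkey_prompt(passkey: str, n_filler: int, position: float) -> str:
    """Hide a passkey sentence at a fractional position (0.0 start, 0.5 middle, 1.0 end)."""
    filler = "The grass is green. The sky is blue."  # illustrative filler
    chunks = [filler] * n_filler
    chunks.insert(int(position * n_filler), f"The pass key is {passkey}. Remember it.")
    return " ".join(chunks) + " What is the pass key?"

# A key hidden in the middle of roughly 10,000 filler sentences
prompt = build_passkey_prompt("71432", 10_000, 0.5)
```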

Book Summary Test

Infini-attention also excelled at the book summary test, outperforming top benchmarks and achieving new state-of-the-art (SOTA) performance levels.

The results are described:

“Finally, we show that a 8B model with Infini-attention reaches a new SOTA result on a 500K length book summarization task after continual pre-training and task fine-tuning.

…We further scaled our approach by continuously pre-training a 8B LLM model with 8K input length for 30K steps. We then fine-tuned on a book summarization task, BookSum (Kryściński et al., 2021) where the goal is to generate a summary of an entire book text.

Our model outperforms the previous best results and achieves a new SOTA on BookSum by processing the entire text from book. …There is a clear trend showing that with more text provided as input from books, our Infini-Transformers improves its summarization performance metric.”

Implications Of Infini-Attention For SEO

Infini-attention is a breakthrough in modeling long and short range attention with greater efficiency than previous models without Infini-attention. It also supports “plug-and-play continual pre-training and long-context adaptation by design,” which means that it can easily be integrated into existing models.

Lastly, the “continual pre-training and long-context adaptation” makes it exceptionally useful for scenarios where it’s necessary to constantly train the model on new data. This last part is super interesting because it may make it useful for applications on the back end of Google’s search systems, particularly where it is necessary to be able to analyze long sequences of information and understand the relevance from one part near the beginning of the sequence and another part that’s closer to the end.

Other articles focused on the “infinitely long inputs” that this model is capable of. What matters for SEO is that the ability to handle huge inputs and “leave no context behind” may shape how some of Google’s systems work if Google adapts Infini-attention to its core algorithm.

Read the research paper:

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Featured Image by Shutterstock/JHVEPhoto

Could This Be The Navboost Patent? via @sejournal, @martinibuster

There’s been a lot of speculation about what Navboost is, but to my knowledge nobody has pinpointed an adequate patent that could be the original Navboost patent. This patent from 2004 closely aligns with Navboost.

So I took the few clues we have about it and identified a couple of likely patents.

The clues I was working with are that Google Software Engineer Amit Singhal was involved with Navboost and had a hand in inventing it. Another clue is that Navboost dated to 2005. Lastly, the court documents indicate that Navboost was updated later on so there may be other patents in there about that, which we’ll get to at some point but not in this article.

So I deduced that if Amit Singhal was the inventor then there would be a patent with his name on it and indeed there is, dating from 2004.

Out of all the patents I saw, the two most interesting were these:

  • Systems and methods for correlating document topicality and popularity 2004
  • Interleaving Search Results 2007

This article will deal with the first one, Systems and methods for correlating document topicality and popularity dating from 2004, which aligns with the known timeline of Navboost dating to 2005.

Patent Does Not Mention Clicks

An interesting quality of this patent is that it doesn’t mention clicks and I suspect that people looking for the Navboost patent may have ignored it because it doesn’t mention clicks.

But the patent discusses concepts related to user interactions and navigational patterns which are references to clicks.

Instances Where User Clicks Are Implied In The Patent

Document Selection and Retrieval:
The patent describes a process where a user selects documents (which can be inferred as clicking on them) from search results. These selections are used to determine the documents’ popularity.

Mapping Documents to Topics:
After documents are selected by users (through clicks), they are mapped to one or more topics. This mapping is a key part of the process, as it associates documents with specific areas of interest or subjects.

User Navigational Patterns:
The patent frequently refers to user navigational patterns, which include how users interact with documents, such as the documents they choose to click on. These patterns are used to compute popularity scores for the documents.

It’s clear that user clicks are a fundamental part of how the patent proposes to assess the popularity of documents.

By analyzing which documents users choose to interact with, the system can assign popularity scores to these documents. These scores, in combination with the topical relevance of the documents, are then used to enhance the accuracy and relevance of search engine results.

Patent: User Interactions Are A Measure Of Popularity

The patent US8595225 makes implicit references to “user clicks” in the context of determining the popularity of documents. Heck, popularity is so important to the patent that it’s in the name of the patent: Systems and methods for correlating document topicality and popularity

User clicks, in this context, refers to the interactions of users with various documents, such as web pages. These interactions are a critical component in establishing the popularity scores for these documents.

The patent describes a method where the popularity of a document is inferred from user navigational patterns, which can only be clicks.

I’d like to stop here and mention that Matt Cutts has discussed in a video that Popularity and PageRank are two different things. Popularity is about what users tend to prefer and PageRank is about authority as evidenced by links.

Matt defined popularity:

“And so popularity in some sense is a measure of where people go whereas PageRank is much more a measure of reputation.”

That definition from about 2014 fits what this patent is talking about in terms of popularity being about where people go.

See Matt Cutts Explains How Google Separates Popularity From True Authority

Watch the YouTube Video: How does Google separate popularity from authority?

How The Patent Uses Popularity Scores

The patent describes multiple ways that it uses popularity scores.

Assigning Popularity Scores:
The patent discusses assigning popularity scores to documents based on user interactions such as the frequency of visits or navigation patterns (Line 1).

Per-Topic Popularity:
It talks about deriving per-topic popularity information by correlating the popularity data associated with each document to specific topics (Line 5).

Popularity Scores in Ranking:
The document describes using popularity scores to order documents among one or more topics associated with each document (Line 13).

Popularity in Document Retrieval:
In the context of document retrieval, the patent outlines using popularity scores for ranking documents (Line 27).

Determining Popularity Based on User Navigation:
The process of determining the popularity score for each document, which may involve using user navigational patterns, is also mentioned (Line 37).

These instances demonstrate the patent’s focus on incorporating the popularity of documents, as determined by user interaction (clicks), into the process of ranking and correlating them to specific topics.

The approach outlined in the patent suggests a more dynamic and user-responsive method of determining the relevance and importance of documents in search engine results.
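
To illustrate what per-topic popularity might look like in practice, here’s a hypothetical sketch; the field names, weights, and scoring formula are mine, not the patent’s:

```python
def rank_for_topic(documents, topic, w_relevance=0.7, w_popularity=0.3):
    """Order documents for a topic by blending relevance with per-topic popularity.

    Popularity here is the share of a document's visits that came from
    users navigating to it in the context of this topic (illustrative).
    """
    scored = []
    for doc in documents:
        relevance = doc["relevance"].get(topic, 0.0)
        popularity = doc["visits_by_topic"].get(topic, 0) / max(doc["total_visits"], 1)
        scored.append((w_relevance * relevance + w_popularity * popularity, doc["url"]))
    return sorted(scored, reverse=True)

docs = [
    {"url": "a.example", "relevance": {"chess": 0.9}, "visits_by_topic": {"chess": 40}, "total_visits": 100},
    {"url": "b.example", "relevance": {"chess": 0.8}, "visits_by_topic": {"chess": 90}, "total_visits": 100},
]
print(rank_for_topic(docs, "chess"))  # b.example wins on popularity despite lower relevance
```

The point of the sketch is the blend: user navigation can lift a slightly less relevant document, which is the dynamic, user-responsive quality the patent describes.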

Navboost Assigns Scores To Documents

I’m going to stop here to also mention that this patent describes assigning scores to documents, which is how Google executive Eric Lehman described Navboost as working in his trial testimony:

Speaking about the situation where there wasn’t a lot of click data, Lehman testified:

“And so I think Navboost does kind of the natural thing, which is, in the face of that kind of uncertainty, you take gentler measures. So you might modify the score of a document but more mildly than if you had more data.”

That’s another connection to Navboost in that the trial description and the patent describe using User Interaction for scoring webpages.

The more this patent is analyzed, the more it looks like what the trial documents described as Navboost.

Read the patent here:

Systems and methods for correlating document topicality and popularity

Featured Image by Shutterstock/Sabelskaya

Why Google SGE Is Stuck In Google Labs And What’s Next via @sejournal, @martinibuster

Google Search Generative Experience (SGE) was set to expire as a Google Labs experiment at the end of 2023, but its time as an experiment was quietly extended, making it clear that SGE is not coming to search in the near future. Surprisingly, letting Microsoft take the lead may have been the best, if perhaps unintended, approach for Google.

Google’s AI Strategy For Search

Google’s decision to keep SGE as a Google Labs project fits into the broader trend of Google’s history of preferring to integrate AI in the background.

The presence of AI isn’t always apparent but it has been a part of Google Search in the background for longer than most people realize.

The very first use of AI in search was as part of Google’s ranking algorithm, a system known as RankBrain. RankBrain helped the ranking algorithms understand how words in search queries relate to concepts in the real world.

According to Google:

“When we launched RankBrain in 2015, it was the first deep learning system deployed in Search. At the time, it was groundbreaking… RankBrain (as its name suggests) is used to help rank — or decide the best order for — top search results.”

The next implementation was Neural Matching which helped Google’s algorithms understand broader concepts in search queries and webpages.

And one of the most well-known AI systems that Google has rolled out is the Multitask Unified Model, also known as Google MUM. MUM is a multimodal AI system that encompasses understanding images and text and is able to place them within the context of a sentence or a search query.

SpamBrain, Google’s spam-fighting AI, is quite likely one of the most important implementations of AI as a part of Google’s search algorithm because it helps weed out low-quality sites.

These are all examples of Google’s approach to using AI in the background to solve different problems within search as a part of the larger Core Algorithm.

It’s likely that Google would have continued using AI in the background until the transformer-based large language models (LLMs) were able to step into the foreground.

But Microsoft’s integration of ChatGPT into Bing forced Google to take steps to add AI in a more foregrounded way with their Search Generative Experience (SGE).

Why Keep SGE In Google Labs?

Considering that Microsoft has integrated ChatGPT into Bing, it might seem curious that Google hasn’t taken a similar step and is instead keeping SGE in Google Labs. There are good reasons for Google’s approach.

One of Google’s guiding principles for the use of AI is to only use it once the technology is proven successful and can be implemented in a way that is trusted to be responsible, and those are two things that generative AI cannot deliver today.

There are at least three big problems that must be solved before AI can successfully be integrated in the foreground of search:

  1. LLMs cannot be used as an information retrieval system because they must be completely retrained in order to add new data.
  2. Transformer architecture is inefficient and costly.
  3. Generative AI tends to create wrong facts, a phenomenon known as hallucinating.

Why AI Cannot Be Used As A Search Engine

One of the most important problems to solve before AI can be used as the backend and the frontend of a search engine is that LLMs are unable to function as a search index where new data is continuously added.

In simple terms, what happens is that in a regular search engine, adding new webpages is a process where the search engine computes the semantic meaning of the words and phrases within the text (a process called “embedding”), which makes them searchable and ready to be integrated into the index.

Afterwards the search engine has to update the entire index in order to understand (so to speak) where the new webpages fit into the overall search index.

The addition of new webpages can change how the search engine understands and relates all the other webpages it knows about, so it goes through all the webpages in its index and updates their relations to each other if necessary. This is a simplification for the sake of communicating the general sense of what it means to add new webpages to a search index.

In contrast to current search technology, LLMs cannot add new webpages to an index because the act of adding new data requires a complete retraining of the entire LLM.
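
To make the contrast concrete, here’s a deliberately toy sketch of why an embedding-based index can absorb new pages incrementally: adding a document is just computing a vector and appending it, whereas a DSI-style model would need retraining. The embed() function here is a random placeholder, not a real encoder:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder embedding: a real system would use a trained encoder."""
    rng = np.random.default_rng(abs(hash(text)) % (2**32))
    v = rng.normal(size=64)
    return v / np.linalg.norm(v)

index_vectors = []  # the "search index"
index_urls = []

def add_page(url: str, text: str) -> None:
    """Adding a page requires no retraining, just one more vector."""
    index_vectors.append(embed(text))
    index_urls.append(url)

def search(query: str, k: int = 3):
    """Rank indexed pages by cosine similarity (vectors are unit length)."""
    q = embed(query)
    scores = np.array(index_vectors) @ q
    top = np.argsort(-scores)[:k]
    return [(index_urls[i], float(scores[i])) for i in top]

add_page("example.com/new-page", "a freshly published page about chess openings")
print(search("chess openings"))
```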

Google is researching how to solve this problem in order to create a transformer-based LLM search engine, but the problem is not solved, not even close.

To understand why this happens, it’s useful to take a quick look at a recent Google research paper that is co-authored by Marc Najork and Donald Metzler (and several other co-authors). I mention their names because both of those researchers are almost always associated with some of the most consequential research coming out of Google. So if it has either of their names on it, then the research is likely very important.

In the following explanation, the search index is referred to as memory because a search index is a memory of what has been indexed.

The research paper is titled: “DSI++: Updating Transformer Memory with New Documents” (PDF)

Using LLMs as search engines is a process that uses a technology called Differentiable Search Indices (DSIs). The current search index technology is referenced as a dual-encoder.

The research paper explains:

“…index construction using a DSI involves training a Transformer model. Therefore, the model must be re-trained from scratch every time the underlying corpus is updated, thus incurring prohibitively high computational costs compared to dual-encoders.”

The paper goes on to explore ways to solve the problem of LLMs that “forget” but at the end of the study they state that they only made progress toward better understanding what needs to be solved in future research.

They conclude:

“In this study, we explore the phenomenon of forgetting in relation to the addition of new and distinct documents into the indexer. It is important to note that when a new document refutes or modifies a previously indexed document, the model’s behavior becomes unpredictable, requiring further analysis.

Additionally, we examine the effectiveness of our proposed method on a larger dataset, such as the full MS MARCO dataset. However, it is worth noting that with this larger dataset, the method exhibits significant forgetting. As a result, additional research is necessary to enhance the model’s performance, particularly when dealing with datasets of larger scales.”

LLMs Can’t Fact Check Themselves

Google and many others are also researching multiple ways to have AI fact check itself in order to keep from giving false information (referred to as hallucinations). But so far that research is not making significant headway.

Bing’s Experience Of AI In The Foreground

Bing took a different route by incorporating AI directly into its search interface in a hybrid approach that joined a traditional search engine with an AI frontend. This new kind of search engine revamped the search experience and differentiated Bing in the competition for search engine users.

Bing’s AI integration initially created significant buzz, drawing users intrigued by the novelty of an AI-driven search interface. This resulted in an increase in Bing’s user engagement.

But after nearly a year of buzz, Bing’s market share saw only a marginal increase. Recent reports, including one from the Boston Globe, indicate less than 1% growth in market share since the introduction of Bing Chat.

Google’s Strategy Is Validated In Hindsight

Bing’s experience suggests that AI in the foreground of a search engine may not be as effective as hoped. The modest increase in market share raises questions about the long-term viability of a chat-based search engine and validates Google’s cautionary approach of using AI in the background.

Google’s decision to focus AI in the background of search is vindicated in light of Bing’s failure to cause users to abandon Google for Bing.

The strategy of keeping AI in the background, where at this point in time it works best, allowed Google to maintain users while AI search technology matures in Google Labs where it belongs.

Bing’s approach of using AI in the foreground now serves as almost a cautionary tale about the pitfalls of rushing out a technology before the benefits are fully understood, providing insights into the limitations of that approach.

Ironically, Microsoft is finding better ways to integrate AI as a background technology in the form of useful features added to their cloud-based office products.

Future Of AI In Search

The current state of AI technology suggests that it’s more effective as a tool that supports the functions of a search engine rather than serving as the entire back and front ends of a search engine, or even as a hybrid approach, which users have so far declined to adopt.

Google’s strategy of releasing new technologies only when they have been fully tested explains why Search Generative Experience belongs in Google Labs.

Certainly, AI will take a bolder role in search but that day is definitely not today. Expect to see Google adding more AI based features to more of their products and it might not be surprising to see Microsoft continue along that path as well.

Featured Image by Shutterstock/ProStockStudio

How To Read Google Patents In 5 Easy Steps via @sejournal, @martinibuster

Reading and understanding patents filed by Google can be challenging, but this guide will help you understand what the patents are about and avoid the many common mistakes that lead to misunderstandings.

How To Understand Google Patents

Before starting to read a patent it’s important to understand how to read the patents. The following rules will form the foundation upon which you can build a solid understanding of what patents mean.

Step #1 Do Not Scan Patents

One of the biggest mistakes I see people make when reading patents is to approach the task as if it’s a treasure hunt. They scan the patents looking for tidbits and secrets about Google’s algorithms.

I know people do this because I’ve seen so many wrong conclusions made by SEOs who I can tell didn’t read the patent because they only speak about the one or two sentences that jump out at them.

Had they read the entire patent they would have understood that the passage they got excited about had nothing to do with ranking websites.

Reading a patent is not like a treasure hunt with a metal detector where the treasure hunter scans an entire field and then stops in one spot to dig up a cache of gold coins.

Don’t scan a patent. Read it.

Step #2 Understand The Context Of The Patent

A patent is like an elephant. An elephant has a trunk, big ears, a little tail and legs thick as trees. Similarly, a patent is made up of multiple sections that are each very important because they create the context of what the patent is about. Each section of a patent is important.

And just as each part of an elephant makes sense in the context of the entire animal, every section of a patent only makes sense within the context of the entire patent.

In order to understand the patent, it’s important to read the entire patent several times so that you are able to step back and see the whole, not just one part of it.

Reading the entire patent reveals its overall context, which is the most important thing about it: what the entire thing means.

Step #3 Not Every Patent Is About Ranking

If there’s any one thing I wish the reader to take away from this article, it is this rule. When I read tweets or articles by people who don’t know how to read patents, this is the rule that they haven’t understood. Consequently, their interpretation of the patent is wrong.

Google Search is not just one ranking algorithm. There are many algorithms that comprise different parts of Search. The Ranking Engine and the Indexing Engine are just two parts of Search.

Other elements of search that may be referred to are:

  • Ranking engine
  • Modification engine
  • Indexing engine
  • Query reviser engine

Those are just a few of the kinds of software engines that are a part of a typical search engine. While the different software engines are not necessarily a part of the ranking part of Google’s algorithms, that does not minimize their importance.

Back in 2020 Gary Illyes of Google tweeted that Search consists of thousands of different systems working together.

He tweeted about the indexing engine:

“The indexing system, Caffeine, does multiple things:
1. ingests fetchlogs,
2. renders and converts fetched data,
3. extracts links, meta and structured data,
4. extracts and computes some signals,
5. schedules new crawls,
6. and builds the index that is pushed to serving.”

He followed up with another tweet about the thousands of systems in search:

“Don’t oversimplify search for it’s not simple at all: thousands of interconnected systems working together to provide users high quality and relevant results…

…the last time i did this exercise I counted off the top of my head about 150 different systems from crawling to ranking, so thousands is likely not an exaggeration. Yes, some things are micro services”

Here’s The Important Takeaway:

There are many parts of Search. But not all parts of Search are a part of the ranking systems.

A very important habit to cultivate when reading a patent is to let the patent tell you what it’s about.

Equally important is to not make assumptions or assume that something is implied. Patents don’t generally imply. They may be broad, and they may seem so repetitive that it almost feels like a deliberate attempt to obfuscate (make it hard to understand), and they consistently describe the inventions in extremely broad terms, but they don’t really imply what they are describing.

Patents, for legal purposes, are actually quite specific about what the patents are about.

If something is used for ranking then it will not be implied, the patent will say so because that’s an important quality to describe in a patent application.

Step #4 Entity & Entities: Understand The Use Of Abstraction

One of the biggest mistakes that happens to people who read patents is to overlook the context of where the invention can be used. For example, let’s review a specific patent called “Identifying subjective attributes by analysis of curation signals.”

This patent mentions entities 52 times and the word “entity” is mentioned in the patent itself 124 times. One can easily guess that this patent is probably about entities, right? It makes sense that if the patent mentions the words “entities” and “entity” nearly 200 times that the patent is about entities.

But that would be an unfortunate assumption. The patent is not about entities at all: the words “entity” and “entities” are used in this patent to refer to a broad and inclusive range of items, subjects, or objects to which the invention can be applied.

Patents often cast a wide net in terms of how the invention can be used, which helps to ensure that the patent’s claims aren’t limited to one type of use but can be applied in many ways.

The word “entity” in this patent is used as a catch-all term that allows the patent to cover a wide range of different types of content or objects. It is used in the sense of an abstraction so that it can be applied to multiple objects or forms of content. This frees the patent to focus on the functionality of the invention and how it can be applied.

The use of abstraction keeps a patent from being tied down to the specifics of what it is being applied to because in most cases the patent is trying to communicate how it can be applied in many different ways.

In fact, the patent places the invention in the context of different forms of content entities such as videos, images, and audio clips. The patent also refers to text-based content (like articles, blog posts), as well as more tangible entities (like products, services, organizations, or even individuals).

Here is an example from the patent where it explicitly refers to video clips as one of the entities that the patent is concerned with:

“In one implementation, the above procedure is performed for each entity in a given set of entities (e.g., video clips in a video clip repository, etc.), and an inverse mapping from subjective attributes to entities in the set is generated based on the subjective attributes and relevancy scores.”

In this context, “video clips” are explicitly mentioned as an example of the entities to which the invention can be applied. The passage indicates that the procedure described in the patent (identifying and scoring subjective attributes of entities) is applicable to video clips.

Here is another passage where the word entity is used to denote a type of content:

“Entity store 120 is a persistent storage that is capable of storing entities such as media clips (e.g., video clips, audio clips, clips containing both video and audio, images, etc.) and other types of content items (e.g., webpages, text-based documents, restaurant reviews, movie reviews, etc.), as well as data structures to tag, organize, and index the entities.”

That part of the patent describes “content items” as entities and gives examples like webpages, text-based documents, restaurant reviews, and movie reviews, alongside media clips such as video and audio clips. This and other similar passages show that the term “entity” within the context of this patent broadly encompasses multiple forms of digital content.

That patent, titled Identifying subjective attributes by analysis of curation signals, is actually related to a recommender system or search that leverages User Generated Content like comments for the purpose of tagging digital content with the subjective opinions of those users.

The patent specifically uses the example of users describing an entity (like an image or a video) as funny, which can then be used to surface a video that has the subjective quality of funny as a part of a recommender system.

The most obvious application of this patent is for finding videos on YouTube that users and authors have described as funny. The use of this patent isn’t limited to just YouTube videos, it can also be used in other scenarios that intersect with user generated content.

The patent explicitly mentions the application of the invention in the context of a recommender system in the following passage:

“In one implementation, the above procedure is performed for each entity in a given set of entities (e.g., video clips in a video clip repository, etc.), and an inverse mapping from subjective attributes to entities in the set is generated based on the subjective attributes and relevancy scores.

The inverse mapping can then be used to efficiently identify all entities in the set that match a given subjective attribute (e.g., all entities that have been associated with the subjective attribute ‘funny’, etc.), thereby enabling rapid retrieval of relevant entities for processing keyword searches, populating playlists, delivering advertisements, generating training sets for the classifier, and so forth.”

Some SEOs, because the patent mentions authors three times, have claimed that this patent has something to do with ranking content authors, and because of that they also associate the patent with E-A-T.

Others, because the patent mentions the words “entity” and “entities” so many times have come to believe it has something to do with natural language processing and semantic understanding of webpages.

But neither of those is true. Now that I’ve explained some of this patent, it should be apparent how a lack of understanding of how to read a patent, plus approaching patents with a treasure-hunting mindset for spicy algorithm clues, can lead to unfortunate and misleading errors about what the patents are actually about.

In a future article I will walk through different patents and I think doing that will help readers understand how to read a patent. If that’s something you are interested in then please share this article on social media and let me know!

I’m going to end this article with a description of the different parts of a patent, which should go some way to building an understanding of patents.

Step #5 Know The Parts Of A Patent

Every patent is made up of multiple parts: a beginning, a middle, and an end, each with a specific purpose. Many patents are also accompanied by illustrations that are helpful for understanding what the patent is about.

Patents typically follow this pattern:

Abstract:
A concise summary of the patent, giving a quick overview of what the invention is and what it does. This part is actually important because it tells you what the patent is about. Do not be one of those SEOs who skip this part to go treasure hunting in the middle parts for clues about the algorithm. Pay attention to the Abstract.

Background:
This section offers context for the invention. It typically gives an overview of the field related to the invention and in a direct or indirect way explains how the invention fits into the context. This is another important part of the patent. It doesn’t give up clues about the algorithm but it tells what part of the system it belongs to and what it’s trying to do.

Summary:
The Summary provides a more detailed overview of the invention than the Abstract. We often say you can step back and view the forest, or step closer and see the trees. The Summary can be said to be stepping forward to see the leaves, and just like a tree has a lot of leaves, a Summary can contain a lot of details.

The Summary outlines the invention’s primary objectives and features, the minutiae of how it works, and all the variations of how it can work. It is almost always an eye-wateringly comprehensive description.

The very first paragraph though can often be the most descriptive and understandable part, after which the summary deep-dives into fine detail. One can feel lost in the seemingly redundant descriptions of the invention. It can be boring but read it at least twice, more if you need to.

Don’t be dismayed if you can’t understand it all because this part isn’t about finding the spicy bits that make for good tweets. This part of reading a patent is sometimes more about kind of absorbing the ideas and getting a feel for it.

Brief Description Of The Drawings:
In patents where drawings are included, this section explains what each drawing represents, sometimes with just a single sentence. It can be as brief as this:

“FIG. 1 is a diagram that illustrates obtaining an authoritative search result.
FIG. 2 is a diagram that illustrates resources visited during an example viewing session.
FIG. 3 is a flow chart of an example process for adjusting search result scores.”

The descriptions provide valuable information and are just as important as the illustrations themselves. They both can communicate a sharper understanding of the function of the patent invention.

What may seem like an invention about choosing authoritative sites for search results might in the illustrations turn out to be about finding the right files on a mobile phone and not have anything to do with information retrieval.

This is where my advice to let the patent tell you what it’s about pays off. People too often skip these parts because they don’t contain spicy details. What happens next is that they miss the context for the entire patent and reach completely mistaken conclusions.

Detailed Description Of The Patent:
This is an in-depth description of the invention that uses the illustrations (figure 1, figure 2, etc.) as the organizing factor. This section may include technical information, how the invention works, how it is organized in relation to other parts, and how it can be used.

This section is intended to be thorough enough that someone skilled in the field could replicate the invention but also general enough so that it can be broadly applied in different ways.

Embodiment Examples:
Here is where specific examples of the invention are provided. The word “embodiment” refers to a particular implementation or an example of the invention. It is a way for the inventor to describe specific ways the invention can be used.

The word “embodiment” appears in different contexts that make it clear what the inventor considers a part of the invention: it is used to illustrate real-world uses of the invention, to define technical aspects, and to show the different ways the invention can be made or used.

For that last one, you’ll see a lot of paragraphs describing how “in another embodiment the invention can bla bla bla…”

So when you see that word “embodiment” try to think of the word “body” and then “embody” in the sense of making something tangible and that will help you to better understand the “Embodiment” section of a patent.

Claims:
The Claims are the legal part of the patent. This section defines the scope of protection that the patent is looking for and it also offers insights into what the patent is about because this section often talks about what’s new and different about the invention. So don’t skip this part.

Citations:
This part lists other patents that are relevant to the invention. It’s used to acknowledge similar inventions but also to show how this invention is different from them and how it improves on what came before.

Firm Starting Point For Reading Patents

You should by this point have a foundation for practicing how to read a patent. Don’t be discouraged if the patent seems opaque and hard to understand. That’s normal.

I asked Jeff Coyle (LinkedIn), cofounder of MarketMuse (LinkedIn) for tips about reading patents because he’s filed some patent applications.

Jeff offered this advice:

“Use Google Patent’s optional ‘non-patent literature’ Google Scholar search to find articles that may reference or support your knowledge of a patent.

Also understand that sometimes understanding a patent in isolation is nearly impossible, which is why it’s important to build context by collecting and reviewing connected patent and non-patent citations, child/priority patents/applications.

Another way that helps me to understand patents is to research other patents filed by the same authors. These are my core methods for understanding patents.”

That last tip is super important because some inventors tend to invent one kind of thing. So if you’re in doubt about whether a patent is about a certain thing, take a look at other patents that the inventor has filed to see if they tend to file patents on what you think a patent is about.

Patents have their own kind of language, with a formal structure and purpose to each section. Anyone who has learned a second language knows how important it is to look up words and to understand the structure that’s inherent in what’s written.

So don’t be discouraged because with practice you will be able to read patents better than many in the SEO industry are currently able to.

I intend at some point to walk through several patents with the hope that this will help you improve on reading patents. And remember to let me know on social media if this is something you want me to write!