OpenAI ChatGPT Is Testing A Memory Feature via @sejournal, @martinibuster

OpenAI announced that it is rolling out a test of ChatGPT memory, a new feature that remembers details from past conversations, such as a user's preferences for style, tone, and formatting.

ChatGPT gradually improves over time as it carries memories across all chats, doing away with the need to repeat instructions.

The new feature is rolling out as a limited release to some ChatGPT free and Plus users for testing and feedback. There are plans to extend the feature to more users at a future time.

Memories are saved apart from the chat histories, which is how they can be summoned in later chat sessions.

Users Control Memory

Users can selectively invoke the memory feature by telling ChatGPT to remember something, and they can also tell it to forget specific details or instructions. Memory is controlled through the settings panel, where specific memories can be deleted or the feature disabled entirely, and there is also an option to delete all memories at once.

The settings options are available at Settings > Personalization > Memory.

How To Use ChatGPT Memory

OpenAI shared examples of how to use the feature:

  • “You’ve explained that you prefer meeting notes to have headlines, bullets and action items summarized at the bottom. ChatGPT remembers this and recaps meetings this way.
  • You’ve told ChatGPT you own a neighborhood coffee shop. When brainstorming messaging for a social post celebrating a new location, ChatGPT knows where to start.
  • You mention that you have a toddler and that she loves jellyfish. When you ask ChatGPT to help create her birthday card, it suggests a jellyfish wearing a party hat.
  • As a kindergarten teacher with 25 students, you prefer 50-minute lessons with follow-up activities. ChatGPT remembers this when helping you create lesson plans.”

Memory May Be Used For Training

OpenAI advised that content provided to ChatGPT, including memories, may be used for training, with an option in Data Controls where this can be turned off. Data from ChatGPT Team and Enterprise customers is not used for training.

Another way to control ChatGPT memory is through temporary chats, which will not invoke the memory feature. Temporary chats are activated in the GPT version control at the top of the chat page.

ChatGPT Memory Toggle

According to an OpenAI temporary chat explainer:

“With Temporary Chat, you can have a conversation with a blank slate. ChatGPT won’t be aware of previous conversations or access memories. It will still follow your custom instructions if they’re enabled.”

Read the OpenAI announcement:

Memory and new controls for ChatGPT

Featured Image by Shutterstock/rafapress

Why Did Google Gemini “Leak” Chat Data? via @sejournal, @martinibuster

It only took twenty-four hours after Google's Gemini was publicly released for someone to notice that chats were being publicly displayed in Google's search results. Google quickly responded to what appeared to be a leak. The reason this happened is quite surprising and not as sinister as it first appears.

@shemiadhikarath tweeted:

“A few hours after the launch of @Google Gemini, search engines like Bing have indexed public conversations from Gemini.”

They posted a screenshot of a site: search for gemini.google.com/share/.

But if you look at the screenshot, you’ll see that there’s a message that says, “We would like to show you a description here but the site won’t allow us.”

By early morning on Tuesday, February 13th, the Google Gemini chats began dropping out of Google's search results, with Google showing only three search results. By the afternoon, the number of leaked Gemini chats showing in the search results had dwindled to just one.

Screenshot of Google's search results for pages indexed from the Google Gemini chat subdomain

How Did Gemini Chat Pages Get Created?

Gemini offers a way to create a link to a publicly viewable version of a private chat.

Google does not automatically create webpages out of private chats. Users create the chat pages through a link at the bottom of each chat.

Screenshot Of How To Create a Shared Chat Page

Screenshot of how to create a public webpage of a private Google Gemini Chat

Why Did Gemini Chat Pages Get Indexed?

The obvious explanation for why the chat pages were crawled and indexed would be that Google forgot to put a robots.txt in the root of the Gemini subdomain (gemini.google.com).

A robots.txt file is a plain-text file for controlling crawler activity on a website. A publisher can block specific crawlers by using directives standardized in the Robots Exclusion Protocol.
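As an illustration (not part of the original reporting), here is a minimal sketch of how anyone can check what a robots.txt file allows, using Python's standard-library parser; the user agents and the /share/ example URL below are assumptions for demonstration, not taken from Google's actual file.

```python
# Minimal sketch: check whether a given crawler may fetch a URL according to
# the site's robots.txt. The user agents and example URL are illustrative.
from urllib.robotparser import RobotFileParser

parser = RobotFileParser("https://gemini.google.com/robots.txt")
parser.read()  # fetch and parse the live robots.txt

for agent in ("Googlebot", "Bingbot"):
    allowed = parser.can_fetch(agent, "https://gemini.google.com/share/example")
    print(f"{agent} allowed to crawl /share/: {allowed}")
```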

I checked the robots.txt at 4:19 AM on February 13th and saw that one was in place:

Google Gemini robots.txt file

I next checked the Internet Archive to see how long the robots.txt file had been in place and discovered that it had been there since at least February 8th, the day the Gemini Apps were announced.

Screenshot of Google Gemini robots.txt from Internet Archive showing it was there on February 8, 2024.

That means the obvious reason the chat pages were crawled is not the correct reason; it's just the most obvious one.

Although the Google Gemini subdomain had a robots.txt that blocked web crawlers from both Bing and Google, how did they end up crawling those pages and indexing them?

Two Ways Private Chat Pages Could Have Been Discovered And Indexed

  • There may be a public link somewhere.
  • Less likely, but possible, is that they were discovered through browsing history linked from cookies.

It’s likelier that there’s a public link somewhere. But if there is a public link, then why did Google start dropping the chat pages altogether? Did Google create an internal rule for the search crawler to exclude webpages in the /share/ folder from the search index, even if they’re publicly linked?

Insights Into How Bing and Google Search Index Content

Now here’s the really interesting part for all the search geeks interested in how Google and Bing index content.

The Microsoft Bing search index responded to the Gemini content differently from how Google search did. While Google was still showing three search results in the early morning of February 13th, Bing was only showing one result from the subdomain. There was a seemingly random quality to what was indexed and how much of it.

Why Did Gemini Chat Pages Leak?

Here are the known facts: Google had a robots.txt in place since at least February 8th. Both Google and Bing indexed pages from the gemini.google.com subdomain. Google indexed the content regardless of the robots.txt and then began dropping the pages.

  • Does Googlebot have different instructions for indexing content on Google subdomains?
  • Does Googlebot routinely crawl and index content that is blocked by robots.txt and then subsequently drop it?
  • Was the leaked data linked from somewhere that is crawlable by bots, causing the blocked content to be crawled and indexed?

Content that is blocked by robots.txt can still be discovered through links, end up in the search index, and appear in the SERPs, or at least in a site: search, even though the blocked pages themselves aren’t crawled. I think this may be the case here.

But if that’s the case, why did the search results begin to drop off?

If the reason for the crawling and indexing was because those private chats were linked from somewhere, was the source of the links removed?

The big question is, where are those links? Could it be related to annotations by quality raters that unintentionally leaked onto the Internet?

Google Gemini Warning: Don’t Share Confidential Information via @sejournal, @martinibuster

Google Gemini privacy support pages warn that information shared with Gemini Apps may be read and annotated by human reviewers and may also be included in AI training datasets. Here is what you need to know and what actions you can take to prevent that from happening.

Google Gemini

Gemini is the name for the technology underlying the Google Gemini Android app available on Google Play, a feature within the Google app on the iPhone, and a standalone chatbot called Gemini Advanced.

Gemini on Android, iPhone & Gemini Advanced

Gemini on mobile devices and the standalone chatbot are multimodal, meaning users can ask questions with image, audio, or text input. Gemini can answer questions about things in the real world, perform actions, provide information about an object in a photo, or provide instructions on how to use it.

All of that data, in the form of images, audio, and text, is submitted to Google, and some of it could be reviewed by humans or included in AI training datasets.

Google Uses Gemini Data To Create Training Datasets

Gemini Apps uses past conversations and location data for generating responses, which is normal and reasonable. Gemini also collects and stores that same data to improve other Google products.

This is what the privacy explainer page says about it:

“Google collects your Gemini Apps conversations, related product usage information, info about your location, and your feedback. Google uses this data, consistent with our Privacy Policy, to provide, improve, and develop Google products and services and machine learning technologies, including Google’s enterprise products such as Google Cloud.”

Google’s privacy explainer says that the data is stored in a user’s Google Account for up to 18 months by default, and that users can change the data storage setting to three months or 36 months.

There’s also a way to turn off saving data to a user’s Google Account:

“If you want to use Gemini Apps without saving your conversations to your Google Account, you can turn off your Gemini Apps Activity.

…Even when Gemini Apps Activity is off, your conversations will be saved with your account for up to 72 hours. This lets Google provide the service and process any feedback. This activity won’t appear in your Gemini Apps Activity.”

But there’s an exception to the above rule that lets Google hold on to the data for even longer.

Human Reviews Of User Gemini Data

Google’s Gemini privacy support page explains that user data that is reviewed and annotated by human reviewers is retained by Google for up to three years.

“How long is reviewed data retained?
Gemini Apps conversations that have been reviewed by human reviewers (as well as feedback and related data like your language, device type, or location info) are not deleted when you delete your Gemini Apps activity because they are kept separately and are not connected to your Google Account. Instead, they are retained for up to 3 years.”

The above-cited support page explains that human-reviewed and annotated data is used to create datasets for chatbots:

“These are then used to create a better dataset for generative machine-learning models to learn from so our models can produce improved responses in the future.”

Google Gemini Warning: Don’t Share Confidential Data

Google’s Gemini privacy explainer page warns that users should not share confidential information.

It explains:

“To help with quality and improve our products (such as generative machine-learning models that power Gemini Apps), human reviewers read, annotate, and process your Gemini Apps conversations. We take steps to protect your privacy as part of this process. This includes disconnecting your conversations with Gemini Apps from your Google Account before reviewers see or annotate them.

Please don’t enter confidential information in your conversations or any data you wouldn’t want a reviewer to see or Google to use to improve our products, services, and machine-learning technologies.

…Don’t enter anything you wouldn’t want a human reviewer to see or Google to use. For example, don’t enter info you consider confidential or data you don’t want to be used to improve Google products, services, and machine-learning technologies.”

There is a way to keep all of that from happening: turning off Gemini Apps Activity stops user data from being shown to human reviewers, so users can opt out of having their data stored and used to create datasets.

But Google still stores data for up to 72 hours, both as a backup in case of a failure and for sharing with other Google services and with third-party services that a user may interact with while using Gemini.

Using Gemini Can Lead To 3rd Party Data Sharing

Using Gemini can start a chain reaction of other apps using and storing user conversations, location data and other information.

The Gemini privacy support page explains:

“If you turn off this setting or delete your Gemini Apps activity, other settings, like Web & App Activity or Location History, may continue to save location and other data as part of your use of other Google services.

In addition, when you integrate and use Gemini Apps with other Google services, they will save and use your data to provide and improve their services, consistent with their policies and the Google Privacy Policy. If you use Gemini Apps to interact with third-party services, they will process your data according to their own privacy policies.”

The same Gemini privacy page links to a page for requesting removal of content, as well as to a Gemini Apps FAQ and a Gemini Apps Privacy Hub to learn more.

Gemini Use Comes With Strings

Many of the ways that Gemini uses data serve legitimate purposes, including submitting information for human review. But Google’s own Gemini support pages make it very clear that users should not share any confidential information, because a human reviewer might see it or it might get included in an AI training dataset.

Featured Image by Judith Linine / Shutterstock.com

Google Rebrands Bard As Gemini, Launches New Model & Mobile App via @sejournal, @MattGSouthern

Google is upgrading its experimental AI service Bard by rebranding it as Gemini and introducing a new AI model called Ultra 1.0.

Google is also launching a mobile app for Gemini.

In an announcement, Sissie Hsiao, Vice President and General Manager of Gemini, shared how people have interacted with the AI since Bard’s launch.

“People all over the world have used it to collaborate with AI in a completely new way,” Hsiao stated, highlighting the diverse applications from job interview preparation to creative image generation.

Introducing Gemini Advanced

Google has released a new iteration of Gemini called Gemini Advanced.

It utilizes Google’s latest AI model, Ultra 1.0, which the company describes as its most capable AI system.

Gemini Advanced is designed to excel at complex tasks like coding, logical reasoning, and creative work. It can maintain long conversations and understand context from previous interactions.

Google states that Gemini Advanced can be a personal tutor, provide coding advice, and help content creators generate new ideas.

As Google continues developing Gemini Advanced, users can expect ongoing improvements, including new features, multimodal capabilities, interactive coding, and data analysis tools.

The service is now available in over 150 countries in English, with plans to add more languages.

Google One AI Premium Plan

Google has announced the launch of Gemini Advanced alongside a new premium subscription plan called Google One AI Premium.

This new plan is priced at $19.99 per month and includes all the existing Google One Premium subscription features, like 2TB of cloud storage and access to Google’s latest AI advancements.

With the new plan, subscribers will soon be able to use Gemini technology within Google’s productivity tools, including Gmail, Google Docs, Slides, and Sheets.

Mobile Access To Gemini

In response to user demand for mobile accessibility, Google is rolling out a new app for Android and an integration within the Google app on iOS.

Gemini will be integrated with Google Assistant on Android devices for a seamless experience and voice control over connected home devices. The iOS Google app will soon offer comparable capabilities.

Rollout & Future Expansion

The Gemini app is now available on Android, and the integration within the Google app on iOS will follow in the coming weeks. The app will initially support English, with Japanese and Korean languages to be added soon. Additional country rollouts and language support are planned.

Google notes that users are encouraged to try Gemini and provide feedback to help improve the experience. The company states it remains committed to responsible AI development, including extensive safety testing and efforts to address biases and unsafe content, as per its published AI Principles.

Google Expands Gemini AI Across Its Product Suite via @sejournal, @MattGSouthern

Google has announced the broad integration of Gemini across many of its products and services, marking a milestone in making AI available in everyday applications.

Gemini represents Google’s “state of the art” system that outperforms humans in testing across areas like language, image, audio and video understanding.

The largest Gemini model, Ultra 1.0, scored higher than human experts on tests that evaluate knowledge and problem-solving abilities across 57 diverse subjects.

“Today we’re taking our next step and bringing Ultra to our products and the world,” said Sundar Pichai, CEO of Google and Alphabet, in a blog post.

Introducing ‘Gemini Advanced’

One way users can access Gemini is through Google’s experimental chatbot ‘Bard,’ which will now be called simply Gemini. A more advanced version called ‘Gemini Advanced’ will also launch, available initially through Google’s new ‘Google One AI Premium’ subscription.

Gemini Advanced taps into the full capabilities of Ultra to provide reasoning, follow instructions, generate content and enable creative collaboration. Google said it can act as a personalized tutor or aid users in planning business strategies.

The premium subscription will cost $19.99 per month and bundle top AI features from Google into a single offering. This includes extras like expanded cloud storage.

Gemini Coming To Key Google Products

In addition to the chatbot, Gemini models will power AI capabilities in Google’s most popular products.

The company said Gemini is coming soon to Workspace, its suite of collaboration and productivity apps. Key features like ‘Smart Compose’ in Gmail use Gemini to help users write faster. Later this year, Google One subscribers will also get access to Gemini directly within Workspace apps.

For Google Cloud customers, Gemini will help developers code faster, improve productivity, and strengthen security through AI.

Google plans to share more details next week about what Gemini will enable for developers and Cloud customers.

Research Shows That Offering Tips To ChatGPT Improves Responses via @sejournal, @martinibuster

In a study of 26 prompting tactics, such as offering tips, researchers have uncovered methods that significantly enhance responses and align them more closely with user intentions.

A research paper titled “Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4” details an in-depth exploration of optimizing Large Language Model prompts. The researchers, from the Mohamed bin Zayed University of AI, tested 26 prompting strategies and then measured the accuracy of the results. All of the tested strategies helped at least somewhat, and some of them improved the output by more than 40%.

OpenAI recommends multiple tactics in order to obtain the best performance from ChatGPT. But there’s nothing in the official documentation that matches any of the 26 tactics that the researchers tested, including being polite and offering a tip.

Does Being Polite To ChatGPT Get Better Responses?

Are your prompts polite? Do you say please and thank you? Anecdotal evidence points to a surprising number of people who prompt ChatGPT with a “please” and offer a “thank you” after they receive an answer.

Some people do it out of habit. Others believe that the language model is influenced by user interaction style that is reflected in the output.

In early December 2023 someone on X (formerly Twitter) who posts as thebes (@voooooogel) did an informal and unscientific test and discovered that ChatGPT provides longer responses when the prompt includes an offer of a tip.

The test was in no way scientific, but it was an amusing thread that inspired a lively discussion.

The tweet included a graph documenting the results:

  • Saying that no tip is offered resulted in a 2% shorter response than the baseline.
  • Offering a $20 tip provided a 6% improvement in output length.
  • Offering a $200 tip provided 11% longer output.

The researchers had a legitimate reason to investigate whether politeness or offering a tip made a difference. One of the tests was to avoid politeness and simply be neutral, without using words like “please” or “thank you,” which resulted in improved ChatGPT responses. That method of prompting yielded a boost of 5%.
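For anyone curious to run a similar informal comparison, here is a rough sketch using the OpenAI Python client. This is not the researchers’ methodology; the model name, question, and tip wording are assumptions for illustration only.

```python
# Rough, unscientific sketch: send the same question with and without a tip
# offer and compare response lengths. Assumes the openai package (v1+) and an
# API key in the environment; the model name is an assumption.
from openai import OpenAI

client = OpenAI()
question = "Explain how a robots.txt file controls web crawlers."
variants = {
    "baseline": question,
    "with_tip": question + " I'm going to tip $200 for a better solution!",
}

for label, prompt in variants.items():
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # swap in whichever model you want to test
        messages=[{"role": "user", "content": prompt}],
    )
    text = response.choices[0].message.content
    print(f"{label}: {len(text)} characters")
```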

Methodology

The researchers used a variety of language models, not just GPT-4. Each question was tested both with and without the principled prompts.

Large Language Models Used For Testing

Multiple large language models were tested to see if differences in size and training data affected the test results.

The language models used in the tests came in three size ranges:

  • small-scale (7B models)
  • medium-scale (13B)
  • large-scale (70B, GPT-3.5/4)
The following LLMs were used as base models for testing:

  • LLaMA-1-{7, 13}
  • LLaMA-2-{7, 13}
  • Off-the-shelf LLaMA-2-70B-chat
  • GPT-3.5 (ChatGPT)
  • GPT-4

26 Types Of Prompts: Principled Prompts

The researchers created 26 kinds of prompts, which they called “principled prompts,” and tested them with a benchmark called Atlas. They used a single response for each question, comparing responses to 20 human-selected questions with and without the principled prompts.

The principled prompts were arranged into five categories:

  1. Prompt Structure and Clarity
  2. Specificity and Information
  3. User Interaction and Engagement
  4. Content and Language Style
  5. Complex Tasks and Coding Prompts

These are examples of the principles categorized as Content and Language Style:

Principle 1
No need to be polite with LLM so there is no need to add phrases like “please”, “if you don’t mind”, “thank you”, “I would like to”, etc., and get straight to the point.

Principle 6
Add “I’m going to tip $xxx for a better solution!”

Principle 9
Incorporate the following phrases: “Your task is” and “You MUST.”

Principle 10
Incorporate the following phrases: “You will be penalized.”

Principle 11
Use the phrase “Answer a question given in natural language form” in your prompts.

Principle 16
Assign a role to the language model.

Principle 18
Repeat a specific word or phrase multiple times within a prompt.
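To make the list concrete, here is a minimal sketch of how a few of these principles could be combined into a single prompt programmatically. The helper function and its wording are illustrative assumptions, not code from the paper.

```python
# Minimal sketch: decorate a base task with a few of the principles above.
# The function name and phrasing are illustrative, not from the paper.
def apply_principles(task, role=None, tip=None):
    parts = []
    if role:  # Principle 16: assign a role to the language model
        parts.append(f"You are {role}.")
    # Principle 9 ("Your task is" / "You MUST") and Principle 1 (direct, no politeness filler)
    parts.append(f"Your task is: {task} You MUST answer directly.")
    if tip is not None:  # Principle 6: offer a tip
        parts.append(f"I'm going to tip ${tip} for a better solution!")
    return "\n".join(parts)


print(apply_principles(
    "Summarize these meeting notes as headlines, bullets, and action items.",
    role="an experienced executive assistant",
    tip=200,
))
```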

All Prompts Used Best Practices

Lastly, the design of the prompts used the following six best practices:

  1. Conciseness and Clarity:
    Generally, overly verbose or ambiguous prompts can confuse the model or lead to irrelevant responses. Thus, the prompt should be concise…
  2. Contextual Relevance:
    The prompt must provide relevant context that helps the model understand the background and domain of the task.
  3. Task Alignment:
    The prompt should be closely aligned with the task at hand.
  4. Example Demonstrations:
    For more complex tasks, including examples within the prompt can demonstrate the desired format or type of response.
  5. Avoiding Bias:
    Prompts should be designed to minimize the activation of biases inherent in the model due to its training data. Use neutral language…
  6. Incremental Prompting:
    For tasks that require a sequence of steps, prompts can be structured to guide the model through the process incrementally.

Results Of Tests

Here’s an example of a test using Principle 7, which employs a tactic called few-shot prompting, a prompt that includes examples.

A regular prompt without the use of one of the principles got the answer wrong with GPT-4:

Prompt requiring reasoning and logic failed without a principled prompt

However, the same question asked with a principled prompt (few-shot prompting with examples) elicited a better response:

Prompt that used examples of how to solve the reasoning and logic problem resulted in a successful answer.
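For readers unfamiliar with the tactic, few-shot prompting simply packs worked examples into the prompt ahead of the real question so the model imitates the demonstrated format. Here is a minimal sketch with made-up examples, not the ones from the paper’s screenshot:

```python
# Minimal sketch of few-shot prompting: worked examples precede the real
# question. The arithmetic examples are made up for illustration.
few_shot_prompt = """\
Q: A farmer has 3 fields and plants 4 rows in each. How many rows in total?
A: 3 fields x 4 rows = 12 rows.

Q: A library has 5 shelves with 9 books on each shelf. How many books in total?
A: 5 shelves x 9 books = 45 books.

Q: A bakery fills 7 boxes with 6 muffins each. How many muffins in total?
A:"""

# This string would be sent as the user message to whichever LLM is being
# tested; the examples prime the model to answer in the same format.
print(few_shot_prompt)
```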

Larger Language Models Displayed More Improvements

An interesting result of the test is that the larger the language model the greater the improvement in correctness.

The following screenshot shows the degree of improvement of each language model for each principle.

Highlighted in the screenshot is Principle 1 which emphasizes being direct, neutral and not saying words like please or thank you, which resulted in an improvement of 5%.

Also highlighted are the results for Principle 6 which is the prompt that includes an offering of a tip, which surprisingly resulted in an improvement of 45%.

Improvements Of LLMs with creative prompting

The description of the neutral Principle 1 prompt:

“If you prefer more concise answers, no need to be polite with LLM so there is no need to add phrases like “please”, “if you don’t mind”, “thank you”, “I would like to”, etc., and get straight to the point.”

The description of the Principle 6 prompt:

“Add “I’m going to tip $xxx for a better solution!””

Conclusions And Future Directions

The researchers concluded that the 26 principles were largely successful in helping the LLM to focus on the important parts of the input context, which in turn improved the quality of the responses. They referred to the effect as reformulating contexts:

“Our empirical results demonstrate that this strategy can effectively reformulate contexts that might otherwise compromise the quality of the output, thereby enhancing the relevance, brevity, and objectivity of the responses.”

A future area of research noted in the study is to see whether the foundation models could be improved by fine-tuning the language models on the principled prompts to improve the generated responses.

Read the research paper:

Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4

Google DeepMind WARM: Can Make AI More Reliable via @sejournal, @martinibuster

Google’s DeepMind published a research paper that proposes a way to train large language models so that they provide more reliable answers and are resistant to reward hacking, a step in the development of more adaptable and efficient AI systems.

Hat tip to @EthanLazuk for tweeting about a new research paper from Google DeepMind.

AI Has A Tendency Toward Reward Hacking

Reinforcement Learning from Human Feedback (RLHF) is a method used to train generative AI so that it learns to offer responses that receive positive scores from human raters. The positive scores act as a reward for correct answers, which is why the technique is called reinforcement learning; because the scores are given by human raters, it’s called Reinforcement Learning from Human Feedback.

RLHF is highly successful, but it also comes with an unintended side effect: the AI learns shortcuts to receiving a positive reward. Instead of providing a correct answer, it provides an answer that merely appears correct, and when that fools the human raters (which is a failure of the reinforcement training), the AI improves its ability to fool raters with inaccurate answers in order to receive the reward (the positive human ratings).

This tendency of the AI to “cheat” in order to earn the training reward is called Reward Hacking, which is what the study seeks to minimize.
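As a toy illustration of the idea (not an example from the paper), imagine a flawed proxy reward that correlates answer length with quality; an answer optimized for that proxy can score higher than a shorter, correct one:

```python
# Toy illustration of reward hacking: a flawed proxy reward that favors length.
# The reward function and the answers are made up for demonstration.
def proxy_reward(answer):
    return len(answer.split())  # verbose answers "look" better to this proxy

correct = "The Eiffel Tower is about 330 meters tall."
hacked = ("The Eiffel Tower, an iconic and truly magnificent structure admired "
          "by visitors worldwide, stands at an impressive height of roughly 500 meters.")

print(proxy_reward(correct), proxy_reward(hacked))  # the wrong answer scores higher
```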

The Causes Of Reward Hacking In Large Language Models

To solve the problem of reward hacking, the researchers identified two causes of reward hacking that their solution has to address:

  1. Distribution shifts
  2. Inconsistencies in human preferences

Distribution Shifts

Distribution shift refers to the situation where an LLM is trained on one kind of dataset and then, during reinforcement learning, is exposed to different kinds of training data it hasn’t seen before. This change in data type is called a distribution shift, and it can cause the language model to manipulate the reward system in order to give a satisfactory-looking answer that it’s otherwise not prepared to provide.

Inconsistencies In Human Preferences

This is a reference to humans being inconsistent in their ratings when judging answers provided by the AI. For example, solving the problem of inconsistency in human preferences is likely one of the motivations behind the creation of the Google Search Quality Raters Guidelines, which have the effect of lessening the influence of subjective preferences.

Human preferences can vary from person to person. Reinforcement Learning from Human Feedback relies on human feedback in the reward model (RM) training process and it’s the inconsistencies that can lead to reward hacking.

Finding a solution is important, as the researchers noted:

“This reward hacking phenomenon poses numerous issues.

First, it degrades performances, manifesting as linguistically flawed or unnecessarily verbose outputs, which do not reflect true human preferences.

Second, it complicates checkpoint selection due to the unreliability of the proxy RM, echoing Goodhart’s Law: ‘when a measure becomes a target, it ceases to be a good measure’.

Third, it can engender sycophancy or amplify social biases, reflecting the limited and skewed demographics of feedback providers.

Lastly and most critically, misalignment due to reward hacking can escalate into safety risks, in particular given the rapid integration of LLMs in everyday life and critical decision-making. “

Weight Averaged Reward Models (WARM)

The Google DeepMind researchers developed a system called Weight Averaged Reward Models (WARM), which creates a proxy model from the combination of multiple individual reward models, each one having slight differences. With WARM, as the number of reward models (RMs) averaged together increases, the results get significantly better, and the system avoids the sudden decline in reliability that happens with standard models.
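The core mechanic, averaging the weights of several reward models that share an architecture, can be sketched in a few lines. This is a generic illustration of parameter averaging over assumed toy checkpoints, not DeepMind’s WARM implementation.

```python
# Minimal sketch of weight averaging across reward-model checkpoints that share
# the same architecture. The checkpoints here are toy stand-ins.
import torch

def average_reward_models(state_dicts):
    """Average parameters across several reward models trained with slight differences."""
    averaged = {}
    for name in state_dicts[0]:
        averaged[name] = torch.stack([sd[name].float() for sd in state_dicts]).mean(dim=0)
    return averaged

# Four toy checkpoints standing in for independently fine-tuned reward models.
rm_checkpoints = [{"score_head.weight": torch.randn(1, 8)} for _ in range(4)]
proxy_rm = average_reward_models(rm_checkpoints)
print(proxy_rm["score_head.weight"].shape)  # the averaged proxy reward model's weights
```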

The WARM system, because it uses multiple smaller models, has the benefit of being memory efficient and doesn’t slow down the model’s ability to provide answers, in addition to being resistant to reward hacking.

WARM also makes the model more reliable and consistent when dealing with changing data.

What caught my eye is its ability to follow the “updatable machine learning paradigm” which refers to WARM’s ability to adapt and improve by incorporating new data or changes over time, without starting from scratch.

In the following quote, WA means Weighted Average and RM means reward model.

The researchers explain:

“WARM represents a flexible and pragmatic method to improve the alignment of AI with human values and societal norms.

…WARM follows the updatable machine learning paradigm, eliminating the need for inter-server communication, thus enabling embarrassingly simple parallelization of RMs.

This facilitates its use in federated learning scenario where the data should remain private; moreover, WA would add a layer of privacy and bias mitigation by reducing the memorization of private preference. Then, a straightforward extension of WARM would combine RMs trained on different datasets, for example, coming from different (clusters of) labelers.

…Furthermore, as WA has been shown to limit catastrophic forgetting, WARM could seamlessly support iterative and evolving preferences.”

Limitations

This research points the way toward more ways of improving AI, but it’s not a complete solution because it has inherent limitations. Among the issues is that it doesn’t completely remove all forms of “spurious correlations or biases inherent in the preference data.”

Yet they did conclude in an upbeat tone about the future of WARM:

“Our empirical results demonstrate its effectiveness when applied to summarization. We anticipate that WARM will contribute to more aligned, transparent, and effective AI systems, encouraging further exploration in reward modeling.”

Read the research paper:

WARM: On the Benefits of Weight Averaged Reward Models

Featured Image by Shutterstock/Mansel Birst

Why Google SGE Is Stuck In Google Labs And What’s Next via @sejournal, @martinibuster

Google Search Generative Experience (SGE) was set to expire as a Google Labs experiment at the end of 2023, but its time as an experiment was quietly extended, making it clear that SGE is not coming to search in the near future. Surprisingly, letting Microsoft take the lead may have been the best, if unintended, approach for Google.

Google’s AI Strategy For Search

Google’s decision to keep SGE as a Google Labs project fits into the broader trend of Google’s history of preferring to integrate AI in the background.

The presence of AI isn’t always apparent but it has been a part of Google Search in the background for longer than most people realize.

The very first use of AI in search was as part of Google’s ranking algorithm, a system known as RankBrain. RankBrain helped the ranking algorithms understand how words in search queries relate to concepts in the real world.

According to Google:

“When we launched RankBrain in 2015, it was the first deep learning system deployed in Search. At the time, it was groundbreaking… RankBrain (as its name suggests) is used to help rank — or decide the best order for — top search results.”

The next implementation was Neural Matching which helped Google’s algorithms understand broader concepts in search queries and webpages.

And one of the most well-known AI systems that Google has rolled out is the Multitask Unified Model, also known as Google MUM. MUM is a multimodal AI system that understands both images and text and is able to place them within the context of a sentence or a search query.

SpamBrain, Google’s spam-fighting AI, is quite likely one of the most important implementations of AI as a part of Google’s search algorithm because it helps weed out low-quality sites.

These are all examples of Google’s approach to using AI in the background to solve different problems within search as a part of the larger Core Algorithm.

It’s likely that Google would have continued using AI in the background until the transformer-based large language models (LLMs) were able to step into the foreground.

But Microsoft’s integration of ChatGPT into Bing forced Google to take steps to add AI in a more foregrounded way with their Search Generative Experience (SGE).

Why Keep SGE In Google Labs?

Considering that Microsoft has integrated ChatGPT into Bing, it might seem curious that Google hasn’t taken a similar step and is instead keeping SGE in Google Labs. There are good reasons for Google’s approach.

One of Google’s guiding principles for the use of AI is to use it only once the technology is proven successful and can be implemented in a way that is trusted to be responsible, and those are two things that generative AI is not capable of today.

There are at least three big problems that must be solved before AI can successfully be integrated in the foreground of search:

  1. LLMs cannot be used as an information retrieval system because they need to be completely retrained in order to add new data.
  2. Transformer architecture is inefficient and costly.
  3. Generative AI tends to create wrong facts, a phenomenon known as hallucinating.

Why AI Cannot Be Used As A Search Engine

One of the most important problems to solve before AI can be used as the backend and the frontend of a search engine is that LLMs are unable to function as a search index where new data is continuously added.

In simple terms, in a regular search engine, adding new webpages is a process where the search engine computes the semantic meaning of the words and phrases within the text (a process called “embedding”), which makes them searchable and ready to be integrated into the index.

Afterwards the search engine has to update the entire index in order to understand (so to speak) where the new webpages fit into the overall search index.

The addition of new webpages can change how the search engine understands and relates all the other webpages it knows about, so it goes through all the webpages in its index and updates their relations to each other if necessary. This is a simplification for the sake of communicating the general sense of what it means to add new webpages to a search index.
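As a toy illustration of that incremental quality, here is a minimal sketch in which each page gets a vector, new pages are simply appended to the index, and queries are matched by similarity. The embedding function is a stand-in, not how a real search engine computes semantic meaning.

```python
# Toy sketch of an incrementally updatable index: add a page, no retraining.
# The "embedding" is a deterministic stand-in, not a real semantic model.
import hashlib
import numpy as np

def embed(text, dim=64):
    seed = int(hashlib.md5(text.encode()).hexdigest(), 16) % (2**32)
    vector = np.random.default_rng(seed).standard_normal(dim)
    return vector / np.linalg.norm(vector)

index = {}  # url -> embedding vector

def add_page(url, text):
    index[url] = embed(text)  # incremental: existing entries are untouched

def search(query, top_k=3):
    query_vec = embed(query)
    scores = {url: float(vec @ query_vec) for url, vec in index.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

add_page("https://example.com/a", "how robots.txt controls crawlers")
add_page("https://example.com/b", "jellyfish birthday card ideas")
print(search("how robots.txt controls crawlers"))
```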

In contrast to current search technology, LLMs cannot add new webpages to an index because the act of adding new data requires a complete retraining of the entire LLM.

Google is researching how to solve this problem in order to create a transformer-based LLM search engine, but the problem is not solved, not even close.

To understand why this happens, it’s useful to take a quick look at a recent Google research paper that is co-authored by Marc Najork and Donald Metzler (and several other co-authors). I mention their names because both of those researchers are almost always associated with some of the most consequential research coming out of Google. So if it has either of their names on it, then the research is likely very important.

In the following explanation, the search index is referred to as memory because a search index is a memory of what has been indexed.

The research paper is titled: “DSI++: Updating Transformer Memory with New Documents” (PDF)

Using LLMs as search engines is a process that uses a technology called Differentiable Search Indices (DSIs). The current search index technology is referenced as a dual-encoder.

The research paper explains:

“…index construction using a DSI involves training a Transformer model. Therefore, the model must be re-trained from scratch every time the underlying corpus is updated, thus incurring prohibitively high computational costs compared to dual-encoders.”

The paper goes on to explore ways to solve the problem of LLMs that “forget” but at the end of the study they state that they only made progress toward better understanding what needs to be solved in future research.

They conclude:

“In this study, we explore the phenomenon of forgetting in relation to the addition of new and distinct documents into the indexer. It is important to note that when a new document refutes or modifies a previously indexed document, the model’s behavior becomes unpredictable, requiring further analysis.

Additionally, we examine the effectiveness of our proposed method on a larger dataset, such as the full MS MARCO dataset. However, it is worth noting that with this larger dataset, the method exhibits significant forgetting. As a result, additional research is necessary to enhance the model’s performance, particularly when dealing with datasets of larger scales.”

LLMs Can’t Fact Check Themselves

Google and many others are also researching multiple ways to have AI fact check itself in order to keep from giving false information (referred to as hallucinations). But so far that research is not making significant headway.

Bing’s Experience Of AI In The Foreground

Bing took a different route by incorporating AI directly into its search interface in a hybrid approach that joined a traditional search engine with an AI frontend. This new kind of search engine revamped the search experience and differentiated Bing in the competition for search engine users.

Bing’s AI integration initially created significant buzz, drawing users intrigued by the novelty of an AI-driven search interface. This resulted in an increase in Bing’s user engagement.

But after nearly a year of buzz, Bing’s market share saw only a marginal increase. Recent reports, including one from the Boston Globe, indicate less than 1% growth in market share since the introduction of Bing Chat.

Google’s Strategy Is Validated In Hindsight

Bing’s experience suggests that AI in the foreground of a search engine may not be as effective as hoped. The modest increase in market share raises questions about the long-term viability of a chat-based search engine and validates Google’s cautionary approach of using AI in the background.

Google’s focus on using AI in the background of search is vindicated in light of Bing’s failure to get users to abandon Google for Bing.

The strategy of keeping AI in the background, where at this point in time it works best, allowed Google to maintain users while AI search technology matures in Google Labs where it belongs.

Bing’s approach of using AI in the foreground now serves as almost a cautionary tale about the pitfalls of rushing out a technology before the benefits are fully understood, providing insights into the limitations of that approach.

Ironically, Microsoft is finding better ways to integrate AI as a background technology in the form of useful features added to their cloud-based office products.

Future Of AI In Search

The current state of AI technology suggests that it’s more effective as a tool that supports the functions of a search engine rather than serving as the entire back and front ends of a search engine or even as a hybrid approach which users have refused to adopt.

Google’s strategy of releasing new technologies only when they have been fully tested explains why Search Generative Experience belongs in Google Labs.

Certainly, AI will take a bolder role in search, but that day is definitely not today. Expect to see Google adding more AI-based features to more of its products, and it would not be surprising to see Microsoft continue along that path as well.

Featured Image by Shutterstock/ProStockStudio

Google Launches AI-Powered Chrome Update For More Personalized Experience via @sejournal, @MattGSouthern

Google has rolled out a new update for its Chrome browser that brings in sophisticated AI and machine learning to make browsing smoother and safer.

Personalized Browsing With Chrome M121

Google Chrome’s latest update, version M121, includes new experimental AI features to customize and streamline web browsing.

These new features, rolling out to Chrome on Macs and PCs in the U.S., can be accessed by signing into Chrome, going to Settings, and navigating to the Experimental AI page.

Google says these are initial public tests of the AI features, so they will be turned off for business and school accounts during this early stage.

Smart Tab Management

A key new feature in the update is the Tab Organizer tool. This aims to reduce the need to manually group related browser tabs.

You can right-click on a tab and choose “Organize Similar Tabs” or click the drop-down arrow next to the tabs.

Chrome will then automatically detect tabs with related content and suggest creating tab groups for them. It even proposes names and emojis for these groups to make it easier to find them again later.

Custom Browser Themes With AI

The new version of Chrome allows you to create personalized themes. This builds on Google’s prior introduction of AI-generated wallpapers for Android and Pixel devices.

You can customize Chrome by choosing subject matter, mood, visual style, and color palette options. Then, the AI will generate a one-of-a-kind theme matching those selections.

For instance, someone interested in the northern lights who wants a soothing, moving theme can input “aurora borealis,” “animated,” and “serene.”

Google has pre-made themes in the Chrome gallery to spark ideas so you can understand the possibilities before designing your unique look.

Enhanced Writing Assistance

Looking ahead, Google is preparing to roll out another AI-powered feature to help users write better.

In the next update, Chrome will include a “Help me write” option that can be activated by right-clicking in any text field or box on a website. When selected, it will give suggestions to improve what you’re trying to compose, whether a restaurant review, RSVP, or formal message like a rental inquiry. The AI will provide writing assistance to boost the quality of the text.

Looking Ahead

Google continues to innovate by integrating new AI technology into its products. The company plans to add its Gemini AI model to the Chrome browser later this year, enhancing the browsing experience by simplifying and accelerating common tasks.

Adding AI capabilities to Chrome demonstrates Google’s dedication to using technology to improve how people interact with the internet. As AI tools like Gemini progress, they could fundamentally change how we engage with the web. Everyday online activities will likely become more intuitive and tailored to individual users.

2024 SEO Success Guide: Tools & Tactics To Refresh Your Strategy & Boost Performance via @sejournal, @ahrefs

This post was sponsored by Ahrefs. The opinions expressed in this article are the sponsor’s own.

With generative AI and constant Google algorithm changes, how can you shift your strategy quickly enough to keep up?

How do you stay on top of the latest SEO trends, while maximizing your site performance?

2023 has forced SEO professionals to begin finding new ways to adapt their approach to SEO success in 2024.

The rise of AI technology is becoming a strong force in shaping search algorithms and user experiences.

Meanwhile, Google’s Search Generative Experience (SGE) is shaking things up and altering the dynamics of online discovery by influencing how search engines interpret and deliver information.

So, if you’re looking to not simply survive but thrive amidst these seismic changes, a strategic pivot is imperative.

The key to success in today’s world of SEO is standing out and boosting your online presence.

In this guide, we’ll not only uncover strategies to help you increase your search discoverability, but we’ll also introduce the tools you need to implement them effectively.

Some of the most significant improvements you can start making in 2024 include:

  • Knowing how and when to leverage AI tools.
  • Mastering technical SEO and website optimization.
  • Executing and measuring your SEO more efficiently.

Let’s dive into how these key strategy tweaks can completely transform your SEO this year and get you on track for higher SERP rankings and better performance.

How & When To Leverage AI Tools For SEO

Everyone’s talking about how to best harness the power of Artificial Intelligence (AI) in 2024.

But if there’s one thing you should take away from the AI conversation, it’s this: AI is not a human replacement.

Think of it more so as a formidable ally that exists to amplify the capabilities of human expertise.

Utilizing AI in 2024 is about striking the perfect balance and finding the synergy between human ingenuity and AI advancements.

Rather than relying solely on AI technology, find ways to use it to enhance your results.

For instance, if you’re a content marketer looking to streamline your process, AI could assist by offering insights, recommendations, and data-driven strategies to elevate the quality and relevance of your content.

What To Use AI Tools For In Content Marketing & SEO

With Ahrefs’ free AI-powered writing tools, you can:

  • Input your rough ideas and get an organized, well-structured outline in minutes.
  • Improve the quality, clarity, and readability of a sentence or paragraph with an instant content refresh.
  • Generate optimized meta titles for better search engine visibility.
  • Craft informative, SEO-friendly meta descriptions for your articles quickly and easily.
  • Simplify and summarize your content with precision.
  • Brainstorm variations of ready-to-use, SEO-friendly blog post ideas to drive more traffic to your blog.
  • Generate descriptive alt text for your images to improve accessibility and SEO without a hassle.
  • Get inspiration for your next piece of content by generating a variety of creative ideas.
  • And more!

Not only can AI help you save time while crafting compelling copy, but it can also automate keyword research processes.

If you’re familiar with Ahrefs’ Keywords Explorer tool, you’ll be pleased to know that they recently launched a new AI assistant to generate keyword suggestions effortlessly.

With this new update, you can get AI keyword suggestions directly within the platform, without needing to go back and forth with ChatGPT when doing your keyword research.

Stay ahead of the curve and start leveraging AI to your advantage with Ahrefs.

How To Master Website Optimization & Technical SEO

A well-optimized and technically sound website acts as a sturdy foundation, insulating you from the impact of ruthless core updates that search engines may roll out.

With search algorithms becoming increasingly sophisticated, seamless user experience, fast loading times, and adherence to Core Web Vitals have become critical factors in SEO success.

By mastering technical SEO and keeping your site in top-notch condition, you’re not just offering a seamless user experience but also signaling to search engines that your content is reliable and trustworthy. 

This proactive approach helps to ensure your site remains visible and valuable amidst ever-evolving ranking criteria.

With Ahrefs’ Site Audit tool, you can run a thorough SEO audit to uncover your website’s technical and on-page SEO issues, and find out exactly what’s holding your website back.

Plus, the platform recently added some exciting new features to enhance your analysis efforts.

Here are some key updates you should know about:

  • Core Web Vitals (CWV) metrics: Filter your pages by various CWV data points during website recrawls. Visualize CrUX and Lighthouse scores, historical changes, and individual metric performance in the Performance report.
  • Google Data Studio integration: Build personalized reports by blending data from different sources and visualizing everything in the form of reports and dashboards.
  • Search by HTML code and page text: Easily search for specific elements extracted during a crawl, such as Google Analytics IDs or “out of stock” labels.
  • List of issues for a particular URL: Addressing an issue on a specific URL is streamlined with a dedicated tab showcasing all related issues, allowing comprehensive fixes.
  • Links Report in Site Audit Issues: Navigate issues more effectively with an additional links report tab, facilitating in-depth analysis at the links level. For instance, browse and export links related to 404 pages for a thorough understanding.

Easy Way: Use An SEO Plugin To Help Optimize Your Content

Ahrefs also launched a WordPress SEO plugin to further assist in your optimization efforts.

With this new plugin, you can automate content audits and see how each article performs for a chosen target keyword.

The tool also provides recommendations on how you can improve the article for better results.

Here are a few key capabilities:

  • Import “focus keywords” for each article from Yoast, RankMath, and AIOSEO.
  • See smoother audit progress with real-time audit completion percentages.
  • Export all metrics to CSV for all analyzed articles, or only for articles with a selected performance status.

Ready to get your site in tip-top shape for the year ahead? Check out Ahrefs’ latest offerings today.

How To Execute & Measure Your SEO KPIs More Efficiently

Keeping your website up-to-date with SEO trends is important – but the buck doesn’t stop there.

Staying competitive requires you to keep a vigilant eye on your rival sites as well, dissecting their strategies, and adapting your own accordingly.

Success in SEO isn’t just about visibility; it’s about outperforming your peers.

Reassessing the metrics that matter most and recalibrating your strategy based on real-time insights are the secrets to staying ahead of the curve. 

In this evolving landscape, the ability to execute with precision and measure impact accurately will be the key to unlocking sustained SEO success.

You can start by knowing how to spot SERP changes effectively.

For example, with Ahrefs’ historical SERP checker tool, you can go back in time and see what the SERP looked like for any given query in their index.

And if you already use Ahrefs Site Explorer tool, you’ll love the recent updates they’ve made to their overview report.

Not only does it load noticeably faster than the previous version, but you get access to the following new features:

  • New history chart.
  • Comparison mode.
  • Paid traffic data.
  • Year-over-year mode.
  • Google algorithm updates.
  • Calendar report.
  • Site structure report.

Visit Ahrefs to learn more.

Start Ranking Higher & Maximizing Opportunities With Ahrefs Now

In the rapidly evolving SEO landscape, Ahrefs emerges as a strategic ally, providing the tools you need to not only stay afloat but rise to new heights.

From leveraging AI tools intelligently to mastering technical SEO and executing with precision, we’ve dissected the key strategies that will redefine your SEO in 2024 and beyond.

It’s time to embrace change and elevate your search performance.

Start your journey to sustained SEO success with Ahrefs’ vast array of tools and exciting new features.

With their updated usage-based pricing model, you can access the features you need most. Try it today!


Image Credits

Featured Image: Image by Ahrefs. Used with permission.