Google Bard Director Talks Usage, Ethics, And Competitive Advantage via @sejournal, @MattGSouthern

In an exclusive interview with Search Engine Journal, Yury Pinsky, Director of Product Management at Google Bard, provides insight into the current status and future trajectory of Google’s experimental AI chatbot.

The interview offers a first-hand perspective on where Bard stands today, where Google hopes to take it in the future, and how Google is approaching challenges like potential bias and misinformation with this new AI tool.

User Feedback And Usage Patterns

Pinsky reported that the initial feedback on Bard since its launch has been positive.

He highlighted that people have quickly integrated Bard into their workflows to discover how to use it best.

“We’re hearing that across the world, people are eager to engage and collaborate with Bard. I think it’s rather interesting that we are going through this journey alongside our users as we both learn and discover how to make the most of generative AI together.”

Most people use Bard for developing concepts, programming, and grasping complicated subjects, Pinsky says:

“In terms of common themes or, more specifically, how people are using Bard, we are seeing most people use it for writing, i.e., finding the right words to use from an initial user’s idea, working through an idea, coding, and helping people understand complex topics.”

Bard Vs. Google Search

Regarding the relationship between Bard and Google Search, Pinsky highlighted that the two are distinct yet complementary products.

“Bard and Search are separate products; in fact, we see Bard as an experience that is complementary to Search. Bard can help boost your productivity, accelerate your ideas, and fuel your curiosity.”

Google Search has long been a tool for finding information, while Bard was created as a chatbot meant to boost user productivity and creativity.

Bard’s current focus is on how it can encourage creative thinking, not just a basic search users might conduct on Google.

Pinsky continues:

“Whereas the generative capabilities in Search [can] help people in their information journeys, staying true to our strong foundations of information quality and connecting people with a range of sources and perspectives [is crucial].

For now, we’re excited and focused on how people use Bard for creative exploration — in ways that are different from how they typically look for information with Google Search.”

Bard’s Strengths And Areas For Improvement

When asked about Bard’s strengths and weaknesses, Pinsky believes that Bard is skilled at being a creative partner, especially in drafting professional correspondence and the like.

“One of the huge benefits of Bard is that it’s a pretty effective creative collaborator. So for tasks like creating drafts of [something like] a professional letter … Bard can help you find the right words.”

Bard also helps users expand on their ideas to reach innovative solutions, Pinsky explains:

“Another way we’re seeing people use Bard is to generate ideas. Bard can help you from a starting point like you thinking about a trip with family to generating suggested places to visit – and with Extensions, generating options for flights and hotels.

I truly think where Bard excels is helping people build on their ideas to come up with creative conclusions.”

Pinsky acknowledges that, like other AI language models, Bard sometimes generates incorrect information, known as “hallucinating.”

He encourages people to use the feedback tool to identify inaccuracies, stressing Google’s dedication to transparency and accountability in responsibly developing AI.

“In terms of improvement, we’ve been transparent that hallucinations continue to be a known challenge of LLMs.

When we launched Bard, we published an overview by James Manyika, our head of technology and society, that examines many of these limitations and our approach to mitigating them.

We believe this transparency is important and critical to being responsible with generative AI.

So, we encourage people to use the thumbs down button and provide feedback if they see a hallucination or something that isn’t accurate. That’s one way Bard will learn and improve.”

Quality Assurance And Expansion

Pinsky discussed how Google constantly evaluates and improves Bard’s response quality through user feedback.

“Our user research teams spend a lot of time with our users to better understand the features that they resonate with, how our responses can be better, and how they are using Bard.

Additionally, our ‘thumbs up and down’ feature allows us to evaluate how good Bard is with responses and if we are making progress.”

He emphasized that expanding access to Bard aligns with Google’s mission, but maintaining high-quality responses and responsible AI development remains the top priority over speed of deployment.

“When we announced Bard in February, we initially opened it up to our trusted testers before making it more widely available to the public. These testers help provide critical feedback ahead of broader release.”

Ethical Measures For Bard

In addressing concerns about ethical issues such as bias and misinformation, Pinsky said that Bard was developed according to Google’s principles for ethical AI.

“We go to great lengths to build all of our products consistent with our AI Principles, where we’ve noted that we seek to avoid unjust impacts, including those related to … political/religious beliefs, race, ethnicity, gender, nationality, income, sexual orientation, and ability. We have taken the same approach with Bard.”

The team has taken steps to prevent unfair outcomes related to sensitive attributes when using Bard. The system is regularly checked by people who provide feedback and evaluation. If any problems come up, Pinsky says, Google can quickly take action to address them.

He continues:

“We’re using human feedback and evaluation to improve our systems, but like all LLM-based interfaces, Bard will make mistakes, and when we find that the experience isn’t performing in a way that aligns to our approach, we will work quickly to fix it.

More broadly speaking, finding ways to represent different viewpoints or prevent bias is something that society itself struggles with – it’s a very complex issue and something we continue to work on.”

Standing Out In The AI Market

When discussing how Google intends to differentiate Bard in an increasingly competitive space, Pinsky highlighted a people-focused strategy for standing out among chatbots powered by large language models (LLMs).

“I feel so lucky because it’s an exciting time to be working in this space — there is a vibrant ecosystem with lots of choice, which is great for consumers and the progress of technology.

At Google, we like to say: Put people first, and the rest will follow. And that’s exactly how we’ve approached and will continue to approach Bard – by focusing on what people tell us they want to do with the technology – and we believe this will be a key differentiator.

For instance, people told us they wanted to debug code – so we added coding capabilities. People wanted a more visual experience, so now Google Lens can analyze your photos. And people wanted help solving math problems – so we made the model smarter at logic and reasoning.”

He says Google aims to tailor Bard to users’ needs by integrating it with popular Google apps and services. Pinsky believes this integration and focus on the user experience will differentiate Bard from other LLMs.

“The ability for Bard to integrate the Google apps and tools you use every day with Extensions truly is a game changer. Also, Bard is the first LLM that can admit that it may not have all the right answers because, through the Google It button, people can double-check results.

Overall, I believe a key differentiator is that we hold ourselves to a high standard in AI responsibility and are taking a deliberate and thoughtful approach when bringing new forms of AI technologies to the world.”

Bard & Google’s Advertising Business

When asked how Bard could affect Google’s ad revenue, Pinsky explained that Bard is not focused on making money from ads. Instead, the priority is giving users a positive experience interacting with Bard.

“Our focus right now is not on ads monetization – it’s on building a great experience for people. And I want to reiterate our privacy commitment that people’s Bard conversations are not being used to show you ads.”

Bard’s Future

Pinsky talked about his view of the future potential of generative AI like Bard. He sees this technology as representing a new and exciting frontier for innovation.

“We truly think generative AI is the new frontier of innovation, especially as we bring its magical capabilities together with our products in a way that can truly help people.”

Though he didn’t provide specifics, it’s evident that Google is focused on enhancing Bard’s abilities and improving the system over time.

As Pinsky put it:

“We’re still at the beginning stages of unlocking the potential of this technology and I, for one, cannot wait to share even more features as we continue to develop in a bold and responsible way.”

Summary Of Key Takeaways

Here’s a summary of key information we learned from Pinsky throughout the interview:

  • People find Bard useful for brainstorming, coding, and learning activities.
  • Bard is positioned as a creative enhancement tool, while Google Search remains the primary source for information-finding.
  • Bard’s strengths include its collaborative abilities for drafting and exploring ideas.
  • Areas for improvement include reducing factual inaccuracies and “hallucinations.”
  • Google is taking a responsible approach with Bard, emphasizing ethical AI principles.
  • Differentiators for Bard include optimization for Google products/services and unique features like the “Google it” button.
  • Monetization is not the current focus – delivering user value is the priority.
  • More capabilities and features are actively being developed.

Looking Ahead

Bard shows promise as a collaborative tool for creativity and productivity. However, quality improvements are needed. Maintaining a focus on responsible development, user value, and ethical AI principles is critical for Google.

If the continued rollout is executed well, Bard could become a helpful AI assistant. But it remains an experimental technology, which means cautious optimism is warranted as capabilities evolve.


Featured image: Ascannio/Shutterstock

Google Gemini Launch Delayed Until 2024 via @sejournal, @kristileilani

Google has postponed the launch of Gemini, which was initially set for next week. According to sources for The Information, the delay is due to the AI’s inconsistent performance with non-English queries.

Why Was Google Gemini Delayed?

Google’s CEO, Sundar Pichai, reportedly canceled several events in California, New York, and Washington, where Gemini would have been unveiled.

These events were crucial to Google’s strategy, marking what could have been its most significant product introduction of the year.

The delay highlights the intense competition in the AI sector, particularly against OpenAI, Microsoft, and Meta.

GPT-4 Outperformed Gemini In Multilingual Tasks

Gemini, intended to be comparable to OpenAI’s GPT-4, has reportedly fallen short in handling some multilingual tasks effectively.

This shortcoming is particularly notable given Google’s global market presence and the importance of diverse language support in AI technologies.

Implications For Google’s AI Strategy

This development has implications beyond Gemini itself. Other Google products like Bard, Assistant, and Docs, which are expected to be enhanced by Gemini’s capabilities, may now receive those updates later than planned.

This setback contrasts with the growing popularity of Microsoft Copilot, which recently announced integration with OpenAI’s latest features, including GPTs.

Traditionally a leader in AI, Google is racing to match the pace of innovation set by OpenAI. While Google Bard’s capabilities continue to expand, it still lags behind the more advanced features available to premium users of ChatGPT.

Despite the delay, Google remains committed to advancing Gemini, with Pichai expressing a focus on ensuring its competitiveness and state-of-the-art capabilities.

As the company refines Gemini, how it will reshape the landscape of conversational AI and compete with rapidly advancing rivals like OpenAI remains to be seen.


Featured image: sdx15/Shutterstock

GPT Store To Launch Next Year, ChatGPT Updates Coming Soon via @sejournal, @kristileilani

OpenAI shares its plans for the GPT Store, enhancements to GPT Builder tools, privacy improvements, and updates coming to ChatGPT.

  • OpenAI has scheduled the launch of the GPT Store for early next year, aligning with its ongoing commitment to developing advanced AI technologies.
  • The GPT Builder tools have received substantial updates, including a more intuitive configuration interface and improved file handling capabilities.
  • Anticipation builds for upcoming updates to ChatGPT, highlighting OpenAI’s responsiveness to community feedback and dedication to AI innovation.

Firefox URL Tracking Removal – Is This A Trend To Watch? via @sejournal, @martinibuster

Firefox recently announced that it is offering users a choice on whether or not to include tracking information in copied URLs, which comes on the heels of iOS 17 blocking user tracking via URLs. The push to strip tracking information from URLs appears to be gaining momentum. Where is this all going, and should marketers be concerned?

Is it possible that blocking URL tracking parameters in the name of privacy will become a trend industrywide?

Firefox Announcement

Firefox recently announced that beginning in the Firefox Browser version 120.0, users will be able to select whether or not they want URLs that they copied to contain tracking parameters.

When users select a link to copy and click to raise the contextual menu for it, Firefox is now giving users a choice as to whether to copy the URL with or without the URL tracking parameters that might be attached to the URL.

Screenshot of the Firefox 120 contextual menu

According to the Firefox 120 announcement:

“Firefox supports a new “Copy Link Without Site Tracking” feature in the context menu which ensures that copied links no longer contain tracking information.”
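In practical terms, the feature amounts to rebuilding the copied URL without known tracking parameters. Here is a minimal Python sketch of that idea, using an illustrative set of click-ID parameters (fbclid, gclid, and so on); Firefox’s actual stripping list isn’t published in the announcement and may differ.

# Minimal sketch: rebuild a URL without common click-ID tracking parameters.
# The parameter set below is illustrative; Firefox maintains its own internal list.
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

TRACKING_PARAMS = {"fbclid", "gclid", "msclkid", "mc_eid", "ttclid"}

def strip_tracking(url: str) -> str:
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in TRACKING_PARAMS]
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_tracking("https://example.com/page?id=42&fbclid=abc123&gclid=xyz"))
# -> https://example.com/page?id=42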

Browser Trends For Privacy

All browsers, including Google’s Chrome and Chrome variants, are adding new features that make it harder for websites to track users online through referrer information embedded in a URL when a user clicks from one site and leaves through that click to visit another site.

This trend for privacy has been ongoing for many years, but it became more noticeable in 2020 when Chrome changed how referrer information is sent when users click links to visit other sites. Firefox and Safari followed with similar referrer behavior.

Whether the current Firefox implementation will be disruptive, or whether the impact is overblown, is somewhat beside the point.

The point is whether what Firefox and Apple have done to protect privacy is a trend, and whether that trend will extend to blocking of URL parameters that goes further than what Firefox recently implemented.

I asked Kenny Hyder, CEO of online marketing agency Pixel Main, what his thoughts are about the potential disruptive aspect of what Firefox is doing and whether it’s a trend.

Kenny answered:

“It’s not disruptive from Firefox alone, which only has a 3% market share. If other popular browsers follow suit it could begin to be disruptive to a limited degree, but easily solved from a marketer’s perspective.

If it became more intrusive and they blocked UTM tags, it would take a while for them all to catch on if you were to circumvent UTM tags by simply tagging things in a series of sub-directories, i.e., site.com/landing// etc.

Also, most savvy marketers are already integrating future proof workarounds for these exact scenarios.

A lot can be done with pixel based integrations rather than cookie based or UTM tracking. When set up properly they can actually provide better and more accurate tracking and attribution. Hence the name of my agency, Pixel Main.”

I think most marketers are aware that privacy is the trend. The good ones have already taken steps to keep it from becoming a problem while still respecting user privacy.
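For readers unfamiliar with the term, “pixel-based integrations” generally means the landing page itself reports the visit to a first-party collection endpoint, rather than depending on third-party cookies or on tracking parameters surviving a copy and paste. The sketch below is a generic illustration of that pattern, not any particular vendor’s setup; the /collect route and field names are hypothetical.

# Generic sketch of a first-party "pixel" collection endpoint using Flask.
# The /collect route and the query field names are hypothetical.
from flask import Flask, Response, request

app = Flask(__name__)

@app.route("/collect")
def collect():
    # The page's own JavaScript sends attribution data here, so it stays
    # first-party instead of riding along in shared or copied URLs.
    hit = {
        "campaign": request.args.get("c"),
        "click_id": request.args.get("cid"),
        "page": request.args.get("p"),
    }
    app.logger.info("pixel hit: %s", hit)  # forward to your analytics store here
    return Response(status=204)  # or serve a 1x1 transparent GIF

if __name__ == "__main__":
    app.run()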

Some URL Parameters Are Already Affected

For those who are on the periphery of what’s going on with browsers and privacy, it may come as a surprise that some tracking parameters are already affected by actions meant to protect user privacy.

Jonathan Cairo, Lead Solutions Engineer at Elevar, shared that a limited amount of tracking-related information is already stripped from URLs.

But he also explained that there are limits to how much information can be stripped from URLs because the resulting negative effects would cause important web browsing functionality to fail.

Jonathan explained:

“So far, we’re seeing a selective trend where some URL parameters, like ‘fbclid’ in Safari’s private browsing, are disappearing, while others, such as TikTok’s ‘ttclid’, remain.

UTM parameters are expected to stay since they focus on user segmentation rather than individual tracking, provided they are used as intended.

The idea of completely removing all URL parameters seems improbable, as it would disrupt key functionalities on numerous websites, including banking services and search capabilities.

Such a drastic move could lead users to switch to alternative browsers.

On the other hand, if only some parameters are eliminated, there’s the possibility of marketers exploiting the remaining ones for tracking purposes.

This raises the question of whether companies like Apple will take it upon themselves to prevent such use.

Regardless, even in a scenario where all parameters are lost, there are still alternative ways to convey click IDs and UTM information to websites.”

Brad Redding of Elevar agreed about the disruptive effect of going too far with removing URL tracking information:

“There is still too much basic internet functionality that relies on query parameters, such as logging in, password resets, etc, which are effectively the same as URL parameters in a full URL path.

So we believe the privacy crackdown is going to continue on known trackers by blocking their tracking scripts, cookies generated from them, and their ability to monitor user’s activity through the browser.

As this grows, the reliance on brands to own their first party data collection and bring consent preferences down to a user-level (vs session based) will be critical so they can backfill gaps in conversion data to their advertising partners outside of the browser or device.”

The Future Of Tracking, Privacy And What Marketers Should Expect

Elevar raises good points about how far browsers can go with blocking. Their view is that it is down to brands to own their first-party data collection and to adopt other strategies that accomplish analytics without compromising user privacy.

Given all the laws governing privacy and internet tracking that have been enacted around the world, it looks like privacy will continue to be a trend.

However, at this point in time, the advice is to keep monitoring how far browsers are going, but there is no expectation that things will get out of hand.

In the Hot Seat: Google’s Search Ads Stir Controversy On Questionable Websites via @sejournal, @brookeosmundson

A recent study published by Adalytics reports Google Search Partners ads appeared on content that doesn’t adhere to its publisher policies.

Per Google’s Publisher Policies, ads are not permitted to serve alongside content that:

  • Is illegal or promotes illegal activity
  • Infringes copyright
  • Sells or facilitates the sale of counterfeit products
  • Incites hatred or promotes discrimination against individuals or a group of people
  • Misrepresents, misstates, or conceals information about the publisher
  • Makes demonstrably false claims
  • And more.

The report found examples of search ads on far-leaning political websites despite advertisers’ attempts to add those domains to a block list.

The report raises questions about the lack of transparency and brand safety concerns for advertisers.

What is the Google Search Partner network?

Per Google’s definition, the Google Search Partner Network (GSP) is:

“A group of search-related websites and apps where your ads can appear.”

The GSP network is not new to Google Ads.

It was established in 2003 to expand ad reach beyond the Google search engine.

While Google has never published a complete list of websites that belong to the partner network, the Adalytics report found over 51,000 websites with the Google Custom Search engine JavaScript enabled.

This implies that websites running that particular JavaScript are part of the GSP network.
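As a rough illustration of that kind of detection, the sketch below fetches a page and checks for the Programmable Search Engine loader script; cse.google.com/cse.js is the URL used in Google’s standard embed snippet. Adalytics’ actual crawling methodology isn’t described in this article, so treat this as an assumption about what such a check might look like.

# Rough sketch: check whether a page loads the Google Programmable Search Engine script.
# This mirrors the general idea of the report's detection, not its exact methodology.
import requests

def loads_google_cse(url: str) -> bool:
    html = requests.get(url, timeout=10).text
    return "cse.google.com/cse.js" in html

print(loads_google_cse("https://example.com"))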

New Google Search campaigns are automatically opted into the 3rd-party network.

Advertisers can opt out in the campaign settings.

Digging into the compromising placement report

The Adalytics report includes multiple examples of advertisers finding their ads on questionable websites.

Image credit: Adalytics.io/blog/search-partners-transparency

The advertisers reported that they had previously blocked their ads from these websites.

If that is the case, this could mean that the “excluded placements” setting in Google Ads is not working as intended.

The report focused primarily on finding websites on the GSP that seem not to meet Google Publisher Policy terms and conditions, including:

  • Pornographic websites
  • Websites with copyright-violating material
  • Websites whose operators are located in countries where U.S. sanctions may apply.

Additionally, the findings included Google search ads paid for by the U.S. Treasury appearing on companies’ websites in countries including Iran and Russia.

This is significant because some of those companies, such as the Iranian Alloy Steel Company (IASCO), are subject to specific U.S. sanctions.

Ethical and transparency concerns

The thought-provoking report raises many concerns over the ability to trust Google.

The GSP is known to be a “black box” for advertisers because there is no transparency about who is allowed into the GSP.

Further, if Google wants to make strides in rebuilding trust with advertisers and the public in general, this report certainly puts that trust at risk.

When it comes to advertising, every dollar counts in today’s economy.

When marketers can’t trust where their ads are being shown, they may be inclined to move their dollars to other platforms.

In an X (formerly Twitter) thread, Dan Taylor, Vice President of Global Ads at Google, responded to the allegations in the report.

Google has been in the headlines, along with other tech giants, for antitrust lawsuits in the past few years.

What can advertisers do?

There are multiple ways advertisers can proactively combat ads shown on questionable content.

#1: Opt out of the Google Search Partner network on Search campaigns

Since this setting is at the campaign level, advertisers must manually go into each Search campaign to opt out of the GSP.

After navigating to your Search campaign, click “Settings” in the left-hand menu.

Google search partner network settings.

Uncheck the “Include Google search partners” box to opt out of the GSP.

#2: Review content suitability settings at the account level

It’s important to note that this setting applies to campaigns running on YouTube or Display.

Navigate to “Tools and settings >> Setup >> Content suitability”

Choose from:

  • Expanded inventory
  • Standard inventory
  • Limited inventory

Google Ads content suitability settings.

If you’re concerned about brand safety, it’s wise to choose “Limited inventory,” which has the most safeguards.

#3: Review where ads showed in Display campaigns

While Google won’t show details on where specific Search ads are shown on the 3rd-party network, your Display performance can help guide you.

To check placements on Display campaigns, navigate to a particular Display campaign.

On the left-hand menu, go to: “Content >> Where ads showed”

Display placements report in Google Ads

This report shows what domains your ads showed on.

If there are any questionable or poor-performing websites or apps, you can negate them at the campaign or account level.

Since we’re talking about gaining control over Search ads, you should negate these at the account level.

#4: Use advanced settings to negate additional content exclusions

These advanced settings are in the same spot as “content suitability.”

Advanced settings to exclude content in Google Ads.

Here, you can exclude content types, including:

  • Sensitive content
  • Types and labels
  • Themes
  • Keywords
  • Websites
  • Apps
  • YouTube channels or videos

If you’ve found questionable placements in Display campaigns, you can negate them at the account level instead of adding them to every campaign.

You can read the full Adalytics report here.


Featured Image: Krakenimages.com/Shutterstock

Researchers Extend GPT-4 With New Prompting Method via @sejournal, @martinibuster

Microsoft published a research study demonstrating how advanced prompting techniques can cause a generalist AI like GPT-4 to perform as well as or better than a specialist AI trained for a specific topic. The researchers discovered that they could make GPT-4 outperform Google’s Med-PaLM 2, a model explicitly trained for the medical domain.

Advanced Prompting Techniques

The results of this research confirm insights that advanced users of generative AI have discovered and are using to generate astonishing image or text output.

Advanced prompting is generally known as prompt engineering. While some may scoff that prompting can be so profound as to warrant the name engineering, the fact is that advanced prompting techniques are based on sound principles, and the results of this research study underline that fact.

For example, a technique used by the researchers, Chain of Thought (CoT) reasoning is one that many advanced generative AI users have discovered and used productively.

Chain of Thought prompting is a method outlined by Google around May 2022 that enables AI to divide a task into steps based on reasoning.

I wrote about Google’s research paper on Chain of Thought Reasoning, which showed how an AI could break a task down into steps, giving it the ability to solve a range of word problems (including math) and to perform commonsense reasoning.

These principles eventually worked their way into how generative AI users elicit high-quality output, whether creating images or text.

Peter Hatherley (Facebook profile), founder of Authored Intelligence web app suites, praised the utility of chain of thought prompting:

“Chain of thought prompting takes your seed ideas and turns them into something extraordinary.”

Peter also noted that he incorporates CoT into his custom GPTs in order to supercharge them.

Chain of Thought (CoT) prompting evolved from the discovery that simply asking a generative AI for something is not enough, because the output will often be less than ideal.

What CoT prompting does is to outline the steps the generative AI needs to take in order to get to the desired output.
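As a simple illustration, a chain-of-thought prompt either spells out the reasoning steps or asks the model to produce them before the final answer. The sketch below uses the OpenAI Python client; the model name and prompt wording are placeholders rather than anything taken from the research.

# Minimal chain-of-thought prompt sketch; model name and wording are placeholders.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

question = "A clinic sees 14 patients per hour for 6 hours. How many patients is that?"

response = client.chat.completions.create(
    model="gpt-4",  # placeholder model name
    messages=[
        {"role": "system", "content": "Answer the question. Think step by step, "
                                      "then give the final answer on its own line."},
        {"role": "user", "content": question},
    ],
)
print(response.choices[0].message.content)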

The breakthrough of the research is that using CoT reasoning plus two other techniques allowed them to achieve stunning levels of quality beyond what was known to be possible.

This technique is called Medprompt.

Medprompt Proves Value Of Advanced Prompting Techniques

The researchers tested their technique against four different foundation models:

  1. Flan-PaLM 540B
  2. Med-PaLM 2
  3. GPT-4
  4. GPT-4 MedPrompt

They used benchmark datasets created for testing medical knowledge. Some of these tests were for reasoning, some were questions from medical board exams.

Four Medical Benchmarking Datasets

  1. MedQA (PDF)
    Multiple choice question answering dataset
  2. PubMedQA (PDF)
    Yes/No/Maybe QA Dataset
  3. MedMCQA (PDF)
    Multi-Subject Multi-Choice Dataset
  4. MMLU (Massive Multitask Language Understanding) (PDF)
    This dataset consists of 57 tasks across multiple domains contained within the topics of Humanities, Social Science, and STEM (science, technology, engineering and math).
    The researchers only used the medical related tasks such as clinical knowledge, medical genetics, anatomy, professional medicine, college biology and college medicine.

GPT-4 using Medprompt absolutely bested all the competitors it was tested against across all four medical-related datasets.

Table: Medprompt performance scores exceeded those of the specialist foundation models it was tested against

Why Medprompt is Important

The researchers discovered that using CoT reasoning, together with other prompting strategies, could make a general foundation model like GPT-4 outperform specialist models that were trained in just one domain (area of knowledge).

What makes this research especially relevant for everyone who uses generative AI is that the Medprompt technique can be used to elicit high-quality output in any area of expertise, not just the medical domain.

The implication of this breakthrough is that it may not be necessary to expend vast amounts of resources training a specialist large language model to be an expert in a specific area.

One only needs to apply the principles of Medprompt in order to obtain outstanding generative AI output.

Three Prompting Strategies

The researchers described three prompting strategies:

  1. Dynamic few-shot selection
  2. Self-generated chain of thought
  3. Choice shuffle ensembling

Dynamic Few-Shot Selection

Dynamic few-shot selection enables the AI model to be prompted with relevant examples chosen from the training data for each task.

Few-shot learning is a way for the foundational model to learn and adapt to specific tasks with just a few examples.

In this method, models learn from a relatively small set of examples (as opposed to billions of examples), with the focus on ensuring the examples are representative of a wide range of questions relevant to the knowledge domain.

Traditionally, experts manually create these examples, but it’s challenging to ensure they cover all possibilities. An alternative, called dynamic few-shot learning, uses examples that are similar to the tasks the model needs to solve, examples that are chosen from a larger training dataset.

In the Medprompt technique, the researchers selected training examples that are semantically similar to a given test case. This dynamic approach is more efficient than traditional methods, as it leverages existing training data without requiring extensive updates to the model.
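To make that concrete, here is a small sketch of the selection step: rank the training questions by similarity to the test question and keep the top k as the few-shot block of the prompt. TF-IDF similarity is used here only so the sketch runs anywhere; the paper relied on neural text embeddings for the same ranking job.

# Sketch of dynamic few-shot selection. TF-IDF stands in for the neural
# embeddings used in the paper; the ranking idea is the same.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

train_examples = [
    ("Which vitamin deficiency causes scurvy?", "Vitamin C"),
    ("Which organ produces insulin?", "The pancreas"),
    ("What is a normal resting heart rate for adults?", "About 60-100 beats per minute"),
]
test_question = "A patient who eats no fruit or vegetables develops bleeding gums. Which deficiency is likely?"

questions = [q for q, _ in train_examples]
vectorizer = TfidfVectorizer().fit(questions + [test_question])
scores = cosine_similarity(vectorizer.transform([test_question]),
                           vectorizer.transform(questions))[0]

k = 2
few_shot = [train_examples[i] for i in scores.argsort()[::-1][:k]]
print(few_shot)  # these examples would be formatted into the prompt for this test case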

Self-Generated Chain of Thought

The Self-Generated Chain of Thought technique uses natural language statements to guide the AI model through a series of reasoning steps, automating the creation of chain-of-thought examples rather than relying on human experts.

The research paper explains:

“Chain-of-thought (CoT) uses natural language statements, such as “Let’s think step by step,” to explicitly encourage the model to generate a series of intermediate reasoning steps.

The approach has been found to significantly improve the ability of foundation models to perform complex reasoning.

Most approaches to chain-of-thought center on the use of experts to manually compose few-shot examples with chains of thought for prompting. Rather than rely on human experts, we pursued a mechanism to automate the creation of chain-of-thought examples.

We found that we could simply ask GPT-4 to generate chain-of-thought for the training examples using the following prompt:

Self-generated Chain-of-thought Template
## Question: {{question}}
{{answer_choices}}
## Answer
model generated chain of thought explanation
Therefore, the answer is [final model answer (e.g. A,B,C,D)]"

The researchers realized that this method could yield wrong results (known as hallucinated results). They solved this problem by asking GPT-4 to perform an additional verification step.

This is how the researchers did it:

“A key challenge with this approach is that self-generated CoT rationales have an implicit risk of including hallucinated or incorrect reasoning chains.

We mitigate this concern by having GPT-4 generate both a rationale and an estimation of the most likely answer to follow from that reasoning chain.

If this answer does not match the ground truth label, we discard the sample entirely, under the assumption that we cannot trust the reasoning.

While hallucinated or incorrect reasoning can still yield the correct final answer (i.e. false positives), we found that this simple label-verification step acts as an effective filter for false negatives.”
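Putting the template and the verification step together, a sketch of the filter might look like the following; the prompt wording, answer parsing, and model name are illustrative rather than the paper’s actual code.

# Sketch of self-generated chain of thought with the label-verification filter.
# Prompt wording, answer parsing, and model name are illustrative.
import re
from openai import OpenAI

client = OpenAI()

def generate_cot(question: str, answer_choices: str):
    prompt = (f"## Question: {question}\n{answer_choices}\n## Answer\n"
              "Explain your reasoning step by step, then write "
              "'Therefore, the answer is X' with the letter of your choice.")
    reply = client.chat.completions.create(
        model="gpt-4",  # placeholder
        messages=[{"role": "user", "content": prompt}],
    ).choices[0].message.content
    match = re.search(r"the answer is\s*\(?([A-D])", reply, re.IGNORECASE)
    return reply, (match.group(1).upper() if match else None)

def keep_training_example(question, answer_choices, ground_truth):
    rationale, predicted = generate_cot(question, answer_choices)
    # Discard the sample entirely if the self-generated reasoning lands on the wrong label.
    return (question, answer_choices, rationale) if predicted == ground_truth else None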

Choice Shuffling Ensemble

A problem with multiple-choice question answering is that foundation models (GPT-4 is a foundation model) can exhibit position bias.

Position bias has traditionally been described as the human tendency to select the top choices in a list.

For example, research has discovered that if users are presented with a list of search results, most people tend to select from the top results, even if the results are wrong. Surprisingly, foundation models exhibit the same behavior.

The researchers created a technique to combat position bias when the foundation model is faced with answering a multiple choice question.

This approach increases the diversity of responses by defeating what’s called “greedy decoding,” which is the behavior of foundation models like GPT-4 of choosing the most likely word or phrase in a series of words or phrases.

In greedy decoding, at each step of generating a sequence of words (or in the context of an image, pixels), the model chooses the likeliest word/phrase/pixel (aka token) based on its current context.

The model makes a choice at each step without consideration of the impact on the overall sequence.

Choice Shuffling Ensemble solves two problems:

  1. Position bias
  2. Greedy decoding

This is how it’s explained:

“To reduce this bias, we propose shuffling the choices and then checking consistency of the answers for the different sort orders of the multiple choice.

As a result, we perform choice shuffle and self-consistency prompting. Self-consistency replaces the naive single-path or greedy decoding with a diverse set of reasoning paths when prompted multiple times at some temperature > 0, a setting that introduces a degree of randomness in generations.

With choice shuffling, we shuffle the relative order of the answer choices before generating each reasoning path. We then select the most consistent answer, i.e., the one that is least sensitive to choice shuffling.

Choice shuffling has an additional benefit of increasing the diversity of each reasoning path beyond temperature sampling, thereby also improving the quality of the final ensemble.

We also apply this technique in generating intermediate CoT steps for training examples. For each example, we shuffle the choices some number of times and generate a CoT for each variant. We only keep the examples with the correct answer.”

So, by shuffling choices and judging the consistency of answers, this method not only reduces bias but also contributes to state-of-the-art performance in benchmark datasets, outperforming sophisticated specially trained models like Med-PaLM 2.
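A compressed sketch of that ensemble follows: shuffle the answer options, query the model once per ordering at a nonzero temperature, map each reply back to the original labels, and keep the answer that wins the most votes. The ask_model() callable is a stand-in for whatever prompting pipeline you already use (for example, the CoT prompt shown earlier).

# Sketch of choice-shuffle ensembling with a consistency vote.
# ask_model(question, choices_text) is a stand-in that returns the letter the
# model picked for the ordering it was shown, sampled at temperature > 0.
import random
from collections import Counter

def choice_shuffle_ensemble(question, choices, ask_model, n_rounds=5, seed=0):
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_rounds):
        order = list(range(len(choices)))
        rng.shuffle(order)
        shown = "\n".join(f"{chr(65 + pos)}. {choices[i]}" for pos, i in enumerate(order))
        picked_pos = ord(ask_model(question, shown).strip().upper()) - ord("A")
        votes[order[picked_pos]] += 1  # map the shown letter back to the original choice
    best_index, _ = votes.most_common(1)[0]
    return choices[best_index]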

Cross-Domain Success Through Prompt Engineering

Lastly, what makes this research paper incredible is that the wins are applicable not just to the medical domain: the technique can be used in any kind of knowledge context.

The researchers write:

“We note that, while Medprompt achieves record performance on medical benchmark datasets, the algorithm is general purpose and is not restricted to the medical domain or to multiple choice question answering.

We believe the general paradigm of combining intelligent few-shot exemplar selection, self-generated chain of thought reasoning steps, and majority vote ensembling can be broadly applied to other problem domains, including less constrained problem solving tasks.”

This is an important achievement because it means the outstanding results can be obtained on virtually any topic without going through the expense and time of intensively training a model on specific knowledge domains.

What Medprompt Means For Generative AI

Medprompt has revealed a new way to elicit enhanced model capabilities, making generative AI more adaptable and versatile across a range of knowledge domains for a lot less training and effort than previously understood.

The implications for the future of generative AI are profound, not to mention how this may influence the skill of prompt engineering.

Read the new research paper:

Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine (PDF)

Featured Image by Shutterstock/Asier Romero

ChatGPT: A Look Back At One Year Of AI Advances From OpenAI via @sejournal, @kristileilani

Today, ChatGPT celebrates one year since its launch in research preview.

From its humble beginnings, ChatGPT has continually pushed the boundaries of what we perceive as possible with generative AI for almost any task.

In this article, we take a journey through the past year, highlighting the significant milestones and updates that have shaped ChatGPT into the versatile and powerful tool it is today.

ChatGPT: From Research Preview To Customizable GPTs

This story unfolds over the course of nearly a year, beginning on November 30, 2022, when OpenAI announced the launch of its research preview of ChatGPT.

As users began to offer feedback, improvements began to arrive.

Before the holiday, on December 15, 2022, ChatGPT received general performance enhancements and new features for managing conversation history. To ensure quality, a daily message cap was also introduced.

As the calendar turned to January 9, 2023, ChatGPT saw improvements in factuality, and a notable feature was added to halt response generation mid-conversation, addressing user feedback and enhancing control.

Just a few weeks later, on January 30, the model was further upgraded for enhanced factuality and mathematical capabilities, broadening its scope of expertise.

February 2023 was a landmark month. On February 9, ChatGPT Plus was introduced, bringing new features and a faster ‘Turbo’ version to Plus users.

This was followed closely on February 13 with updates to the free plan’s performance and the international availability of ChatGPT Plus, featuring a faster version for Plus users.

March 14, 2023, marked a pivotal moment with the introduction of GPT-4 to ChatGPT Plus subscribers.

This new model featured advanced reasoning, complex instruction handling, and increased creativity.

Less than ten days later, on March 23, experimental AI plugins, including browsing and code interpreter capabilities, were made available to selected users.

On May 3, users gained the ability to turn off chat history and export data.

Plus users received early access to experimental web browsing and third-party plugins on May 12.

On May 24, the iOS app expanded to more countries with new features like shared links and Bing plugin, and an option to disable chat history on iOS.

June and July 2023 were filled with updates enhancing mobile app experiences and introducing new features.

The mobile app was updated with browsing features on June 22, and the browsing feature itself underwent temporary disablement for improvements on July 3.

The code interpreter feature rolled out in beta to Plus users on July 6.

Plus customers enjoyed doubled message limits for GPT-4 from July 19, and Custom instructions became available in beta to Plus users the next day.

July 25 saw the Android version of the ChatGPT app launch in selected countries.

As summer progressed, August 3 brought several small updates enhancing the user experience.

Custom instructions were extended to free users, except in the EU & UK, on August 9, and later to those regions on August 21.

The month concluded with the launch of ChatGPT Enterprise on August 28, offering advanced features and security for enterprise users.

Entering autumn, September 11 witnessed limited language support in the web interface.

Voice and image input capabilities in beta were introduced on September 25, further expanding ChatGPT’s interactive abilities.

An updated version of web browsing rolled out to Plus users on September 27.

The fourth quarter of 2023 began with integrating DALL·E 3 in beta on October 16, allowing for image generation from text prompts.

The browsing feature moved out of beta for Plus and Enterprise users on October 17. Customizable versions of ChatGPT, called GPTs, were introduced for specific tasks on November 6.

Finally, on November 21, the voice feature in ChatGPT was made available to all users, rounding off a year of significant advancements and broadening the horizons of AI interaction.

Looking Ahead: What’s Next For ChatGPT

The past year has been a testament to continuous innovation, but it is merely the prologue to a future rich with potential.

The upcoming year promises incremental improvements and leaps in AI capabilities, user experience, and integrative technologies that could redefine our interaction with digital assistants.

With a community of users and developers growing stronger and more diverse, the evolution of ChatGPT is poised to surpass expectations and challenge the boundaries of today’s AI landscape.

As we step into this next chapter, the possibilities seem limitless as generative AI continues to advance.

Screenshot from ChatGPT, November 2023

Featured image: photosince/Shutterstock

OpenAI Welcomes Back Sam Altman As CEO via @sejournal, @kristileilani

OpenAI has announced that Sam Altman is officially back as CEO, with Mira Murati reinstated as Chief Technology Officer (CTO).

This leadership reshuffle coincides with the formation of a new initial board, a strategic move that underscores OpenAI’s commitment to its pioneering mission in artificial intelligence (AI).

The newly formed board includes Bret Taylor as Chair, alongside Larry Summers and Adam D’Angelo, bringing diverse expertise to guide OpenAI’s future endeavors.

In his message to the company, Sam Altman emphasized the importance of resilience and the team’s hard work in navigating recent challenges.

Screenshot from X, November 2023

He expressed gratitude towards team members and partners, particularly highlighting the significant role of Microsoft.

Altman’s return signals a renewed focus on advancing AI research and safety initiatives, aligning with OpenAI’s long-term vision.

His leadership, coupled with Murati’s technological expertise and Greg Brockman’s continued role as President, sets a strong foundation for the company’s future growth.

Key priorities for the new leadership include enhancing research capabilities, deploying innovative products, and strengthening corporate governance.

These efforts aim to further OpenAI’s mission of ensuring that AI benefits humanity while addressing the evolving needs of users and stakeholders.

For marketers and SEO professionals, OpenAI’s developments hold particular significance.

The company’s technologies, like ChatGPT, have already transformed marketing strategies and search engine optimization.

Continued advancements in these areas promise new tools and insights for the industry.

In summary, OpenAI’s leadership transition, with Sam Altman at the helm, a new board, and a clear focus on research and governance, heralds an exciting era for the AI community.

This shift is poised to impact various sectors, offering fresh perspectives and innovative solutions in the world of AI and technology.

Featured image: Patrickx007/Shutterstock

Google Enhances Search With Support For Organization Meta Data via @sejournal, @MattGSouthern

Google is expanding support for organization structured data markup, allowing companies to provide additional details about themselves in search results.

With this update, Google will use the markup to showcase company details in knowledge panels and other visual elements on the search results page.

This helps users find essential information about organizations they search for.

What’s Changing?

Since 2013, Google has allowed sites to use “logo” and “url” markup to specify the image and link for their logo in search results.

Now, Google enables sites to include further information like their name, address, contact info, and business identifiers.

Google keeps the existing logo structured data guidelines and merges them with the broader organization markup documentation.

Changes In Search Console

The logo report in Google Search Console and tests in the Rich Results Test tool have transitioned to more comprehensive organization markup validations.

The Rich Results Test allows testing organization markup by submitting a URL or code snippet. This provides instant feedback on whether markup is implemented correctly.

What Do Websites Need To Do?

Sites with logo markup don’t need to change anything – Google will recognize it. However, Google encourages adding the new organization fields if applicable.

Providing additional organization details can make companies eligible to display expanded knowledge panels, like Google’s recently launched merchant panels.

For local businesses, Google recommends using both local business and organization markup. Online-only companies should utilize the “OnlineBusiness” subtype of organization markup.

Looking Ahead

These additions aim to make it simpler for businesses and organizations to help users find accurate details in Google Search results.

This update also exemplifies why it can be beneficial for sites to implement structured data markup even when Google doesn’t yet utilize it for rich result enhancements.

While specific markup properties may not drive visible improvements, they position sites to benefit when Google’s systems gain the ability to process and display new kinds of structured data. The work done today can power enhancements down the road.

As Google expands its structured data capabilities, markup that wasn’t previously supported may eventually become eligible for special search features.


Featured Image: G-Stock Studio/Shutterstock

Google SGE & Generative Summaries For Search Results Patent via @sejournal, @kristileilani

Google patent US11769017B1, “Generative Summaries for Search Results,” shared by X user @seostratega, appears to lay the groundwork for Google’s SGE and its integration of generative AI search summaries.

This article aims to provide a quick analysis of the patent, how it relates to Google’s SGE, and its implications for SEO professionals.

What Is Patent US11769017B1?

The patent describes a method for creating summaries of search results using large language models (LLMs).

These models are designed to understand the context and content of web pages, generating concise and relevant summaries.

It reimagines the search results page, allowing for more complex queries and delivering AI-powered overviews with links to further information.
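To make the general idea concrete (this is an illustration of LLM-generated result summaries, not the specific method claimed in the patent), a pipeline along these lines would pass the query and the top results’ content to a model and ask for a short, source-attributed overview. The model name and prompt are placeholders.

# Illustrative-only sketch of summarizing search results with an LLM.
# This is a generic pattern, not the method claimed in patent US11769017B1.
from openai import OpenAI

client = OpenAI()

def summarize_results(query, results):
    # results: list of (title, url, snippet) tuples from an ordinary search backend
    sources = "\n\n".join(f"[{i + 1}] {t} ({u})\n{s}" for i, (t, u, s) in enumerate(results))
    prompt = (f"Search query: {query}\n\nSources:\n{sources}\n\n"
              "Write a three-sentence overview of the topic using only these sources, "
              "citing them as [1], [2], ... so each claim links back to a page.")
    reply = client.chat.completions.create(
        model="gpt-4",  # placeholder
        messages=[{"role": "user", "content": prompt}],
    )
    return reply.choices[0].message.content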

How Does It Relate To Google SGE?

Using LLMs, Google SGE can generate AI-powered snapshots for queries, offering users an immediate understanding of a topic and avenues to delve deeper.

These snapshots are not isolated pieces of information but are corroborated by links to high-quality web sources, ensuring reliability and breadth in the information provided.

The technology underpinning US11769017B1 is integral to SGE, facilitating these concise overviews and ensuring reliable web sources back them.

Screenshot from Google Patents, November 2023

Key Takeaways For SEO

Integrating generative AI into search suggests a shift towards more nuanced and contextually rich content.

SEO strategies must adapt to prioritize content that aligns with these AI-driven summaries.

Start by focusing on creating comprehensive content. Ensure your content thoroughly covers topics and addresses users’ top questions. This holistic approach increases the likelihood of being featured in AI-generated summaries.

Write in a clear, concise manner while providing context. Content that is easily digestible and contextually rich is more likely to be favored by LLMs for summarization.

Optimize content for conversational queries and voice searches beyond keywords and phrases.

With SGE’s emphasis on reliable sources and the latest updates to E-E-A-T, building the authority and trustworthiness of your content is vital. This includes citing reputable sources and maintaining factual accuracy.

Stay current on emerging trends in generative AI and search, testing the latest updates to SGE in Search Labs and considering how they could influence search behavior. Adapting to these changes promptly can give you a competitive edge.

Creating multiple content formats, including text, audio, videos, and images, may increase visibility across different generative search experiences.

Most importantly, tailor your content to align with the potential intent behind search queries, as SGE is likely to prioritize content that best matches user intent.

Learn More About Google SGE With AI

Want to learn more about the patent’s technical details and how they relate to the inner workings of Google SGE?

Poe users can try the SGEPatentBot for free.

Screenshot from Poe, November 2023

ChatGPT Plus subscribers can try the SGEPatentReader, a custom GPT.

Screenshot from ChatGPT, November 2023

Conclusion

Google’s patent US11769017B1 and continued experimentation with SGE mark a shift towards more AI-driven, contextually aware search experiences.

For SEO professionals, adapting to these changes is crucial. By focusing on comprehensive, clear, and authoritative content and optimizing for conversational queries and user intent, SEO strategies can align with the evolving landscape of Google search, potentially leading to greater visibility and engagement in the SGE era.


Featured image: sdx15/Shutterstock