Google Gemini Sends More Traffic To Sites Than Perplexity: Report via @sejournal, @MattGSouthern

Google Gemini more than doubled its referral traffic to websites between November and January, according to SE Ranking data from more than 101,000 sites with Google Analytics installed.

The increase started in December, shortly after Google began rolling out Gemini 3 across its products. SE Ranking measured a 51% increase in December and a 42% increase in January, for a combined gain of about 115%.
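
The combined figure reflects compounding rather than simple addition, which a quick check confirms (the two monthly rates are the ones SE Ranking reported):

```python
# Two consecutive monthly gains compound multiplicatively:
# a 51% rise followed by a 42% rise is not 93%, but roughly 114-115%.
combined = (1 + 0.51) * (1 + 0.42) - 1
print(f"combined gain: {combined:.1%}")  # → combined gain: 114.4%
```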

For transparency, SE Ranking sells AI visibility tracking tools, and the data below comes from their own Google Analytics dataset.

Gemini Passes Perplexity

In January, SE Ranking’s data shows Gemini sent 29% more visitors to websites than Perplexity globally. In the U.S., the gap was wider at 41%.

Five months earlier, the positions were reversed. In August, Perplexity was sending roughly three times more referral traffic than Gemini, according to the same dataset.

ChatGPT’s Decline From Peak

ChatGPT’s referral traffic peaked in October and has fallen since then. SE Ranking measured an 8% drop in November and an 18% drop in December, with a partial recovery in January.

Even after the decline, ChatGPT still generates about 80% of all AI referral traffic to websites. ChatGPT’s lead over Gemini narrowed from roughly 22x in October to about 8x in January. That’s still a large gap.

Similarweb’s January data showed a similar pattern when measuring direct visits to chatbot sites. ChatGPT’s traffic share fell from 86% to 64% over the past year, while Gemini rose from 5% to 21%. The two datasets measure different things, but both show the same direction.

The Gemini 3 Connection

The timing of Gemini’s traffic increase lines up with Google’s rollout of Gemini 3 models.

Google released Gemini 3 Pro on November 18, Gemini 3 Deep Think on December 4, and Gemini 3 Flash on December 17. Flash became the default model in the Gemini app and in AI Mode for Search.

Before those releases, Gemini’s referral traffic had grown only modestly. SE Ranking’s data shows it grew at roughly 4% per month from January through October. The jump to roughly 47% average monthly growth in December and January represents about a 12x acceleration from the prior pace.

AI Traffic In Context

All AI platforms combined still account for a small share of overall web traffic. SE Ranking puts the figure at about 0.24% of global internet traffic as of January, up from 0.15% in 2025.

An earlier SE Ranking report covering 13,700 websites found Google generating 94% of organic traffic, with ChatGPT and Perplexity just starting to show up in referral reports. The new dataset is larger, at 101,574 sites across 250 markets, but uses the same GA-based methodology.

Why This Matters

Two months of growth from Gemini doesn’t predict where AI referral traffic will be by year’s end. The increase from November to January is measurable and correlates with a known product launch, but it’s too early to call it a sustained pattern.

The Perplexity milestone is more concrete. Gemini may now show up as a larger referral source than Perplexity in your own analytics. That’s worth checking.

Looking Ahead

SE Ranking says it will continue monitoring AI referral traffic through 2026. Google hasn’t disclosed referral traffic figures for Gemini or AI Mode directly. The next Similarweb AI Tracker update could provide a second data point on whether Gemini’s growth continued past January.


Featured Image: DANIEL CONSTANTE/Shutterstock

Google Begins Rolling Out March 2026 Core Update via @sejournal, @MattGSouthern

Google has started rolling out the March 2026 core update, according to a notice on the Google Search Status Dashboard. The company says the update began at 2:00 a.m. PT on March 27 and may take up to two weeks to finish.

What Google Confirmed

Google hasn’t provided additional details about what changed in this core update. As with previous core updates, the company’s public notice focuses on timing rather than specific ranking systems or categories of websites that may be affected.

The March 2026 core update follows the March 2026 spam update, which Google completed earlier this week.

Why This Matters

If you’re seeing ranking changes over the next several days, Google has now confirmed that a core update rollout is underway.

Core updates can lead to ranking changes across many types of websites while the rollout is in progress. Google hasn’t shared any more specific guidance for this update yet.

Looking Ahead

Google says the rollout may take up to two weeks, so rankings may continue to shift into early April. Until the update is complete, it may be difficult to tell whether visibility changes reflect a lasting reordering or temporary movement during the rollout.

Wikipedia Bans Use Of AI-Generated Content via @sejournal, @martinibuster

Wikipedia recently published guidelines prohibiting the use of AI to generate or rewrite articles, with two exceptions covering copyediting and translation. The guidelines acknowledge that identifying AI-generated content can’t be based on style signals and offer no further guidance on how LLM-based content will be identified.

Violation Of Wikipedia’s Core Content Policies

The new guidelines state that using LLMs violates several of Wikipedia’s core content policies, without actually naming them. But a look at those policies makes it reasonably clear which ones are being alluded to: the policy on verifiability, the policy against original research, and quite possibly the requirement for a neutral point of view.

The policy on verifiability requires that any content likely to be challenged must be attributable to a reliable published source that other editors can check. LLMs generate text without explicitly citing sources, and they also tend to hallucinate facts.

The policy on original research states:

“Wikipedia does not publish original thought: all material in Wikipedia must be attributable to a reliable, published source. Articles may not contain any new analysis or synthesis of published material that serves to advance a position not clearly advanced by the sources.”

LLMs, by design, generate a synthesis of published sources. As for neutral point of view, an LLM can place more weight on dominant viewpoints at the expense of minority ones. Most SEOs are aware that asking an LLM about SEO consistently produces answers that reflect the dominant, but not necessarily the most correct, point of view.

The new guidance makes two exceptions:

  1. “Editors are permitted to use LLMs to suggest basic copyedits to their own writing, and to incorporate some of them after human review, provided the LLM does not introduce content of its own. Caution is required, because LLMs can go beyond what you ask of them and change the meaning of the text such that it is not supported by the sources cited.
  2. Editors are permitted to use LLMs to translate articles from another language’s Wikipedia into the English Wikipedia, but must follow the guidance laid out at Wikipedia:LLM-assisted translation.”

As to identifying AI-generated content, the new Wikipedia guidelines suggest considering how well the content complies with the core content policies and auditing recent edits by any editor whose contributions are under suspicion.

Featured Image by Shutterstock/JarTee

Google Takes Search Live Global With Gemini 3.1 Flash Live via @sejournal, @MattGSouthern

Google is expanding Search Live to more than 200 countries and territories, bringing voice and camera conversations to AI Mode globally.

The expansion is powered by Gemini 3.1 Flash Live, a new audio model that Google calls its highest-quality yet. It’s inherently multilingual, so you can speak with Search in your preferred language without switching settings.

Search Live was previously limited to the U.S.

What’s Changing

Search Live lets you talk to Google Search inside AI Mode instead of typing a query. You ask a question out loud and get an audio response, then continue with follow-ups. Web links appear on screen alongside the voice responses.

The feature also supports camera input. Point your phone at a product label or a piece of equipment and ask Search about what it sees. Google Lens users can tap a “Live” option to start a conversation about what’s in the camera view.

With today’s expansion, both voice and camera capabilities are available in every market where AI Mode is active.

The New Model

Gemini 3.1 Flash Live replaces the previous audio model powering Search Live. Google published benchmark results alongside the announcement.

Gemini Live can now follow a conversation thread for twice as long as the previous model, according to Google, though the company didn’t specify what the previous limit was.

Beyond Search, 3.1 Flash Live is available to developers in preview through the Gemini Live API in Google AI Studio.

Why This Matters

Search Live turns search into a spoken conversation with camera input. Until now, the feature was limited to U.S. users. Today’s expansion makes it available in the markets where AI Mode is live, across more than 200 countries and territories.

There’s no public data yet on how many people use Search Live or how it affects query volume. But Google has been building toward this for the past year. The company launched Search Live in June, added video input in July, and upgraded to Gemini 2.5 Flash Native Audio in December. Each update expanded what the feature can do and who can use it.

Looking Ahead

Google didn’t announce additional Search Live features alongside this expansion. The focus is on geographic reach and the underlying model upgrade.

How the model performs in production across different languages and markets will be worth watching as adoption data becomes available.

Google Adds New Performance Max Controls And Reporting Features via @sejournal, @brookeosmundson

Google has announced a new set of updates to its Performance Max campaign type, focused on two areas advertisers have consistently asked for: more control over who campaigns prioritize, and better visibility into where budget is going.

The updates include first-party audience exclusions, budget reporting, expanded audience reporting, and placement reporting segmented by network.

Read on for more updates and what this means for your campaigns.

New First-Party Audience Exclusions

The first update Google announced was framed around more precise steering for your target audience.

Advertisers can now exclude specific first-party customer lists from Performance Max campaigns.

If your goal is acquiring net-new customers, excluding existing customer lists can help reduce wasted spend on people who may have converted anyway. It also creates a cleaner setup for evaluating whether Performance Max is actually contributing incremental value.

That said, this still depends heavily on how clean and current your first-party data is. If your customer match lists are outdated, incomplete, or poorly segmented, this feature won’t solve the problem by itself.

It also does not turn Performance Max into a precision audience campaign. Advertisers should still think of this as directional steering, not rigid targeting.

New Reporting Features Focused On Budget And Audience Visibility

The second part of Google’s update covers new reporting features.

The first is a budget report, now available directly within a Performance Max campaign, which helps forecast end-of-month spend and models how changes to the daily budget could impact performance.

Google is also expanding audience reporting with more detailed demographic and segment-level performance views, including breakdowns such as age range and gender.

Image credit: Google, March 2026

That should give advertisers more context around who the system is actually reaching, rather than just what overall campaign performance looks like.

The last reporting update covers network reports. Advertisers can now segment placement reports by network to show:

  • Where ads have served
  • More visibility to ensure brand safety across all Google-owned channels

The placement report lives under the “When and where ads showed” tab.

Why This Matters For Advertisers

Google has continued to deliver on its promise to provide more transparency to advertisers in these automated campaign types. The company is making Performance Max more useful for marketers trying to manage it more intentionally.

The first-party audience exclusion update gives advertisers a more practical way to support acquisition-focused strategies. Brands trying to reduce overlap between prospecting and retention efforts may find this especially helpful.

The reporting updates will likely have broader day-to-day value.

Budget reporting should make it easier to monitor pacing and explain monthly spend behavior, especially for teams working within strict budget expectations or reporting back to stakeholders.

Expanded audience reporting gives advertisers more context around who campaigns are actually reaching. That matters when conversion volume alone doesn’t tell the full story.

Network segmentation in placement reporting also adds a layer of visibility many advertisers have wanted for a long time, particularly those keeping a close eye on brand safety and placement quality.

Taken together, these updates give advertisers more visibility into how Performance Max is spending and who it’s reaching.

Looking Ahead

This rollout is more useful than groundbreaking, but that does not make it insignificant.

Google continues to fill in some of the operational gaps that have made Performance Max harder to manage than many advertisers would like.

For teams already using it, these updates should make campaign oversight a little easier.

For teams that have been frustrated by limited visibility, this is another step toward making Performance Max more workable in real account management.

Google’s March Spam Update Felt Muted But May Signal Bigger Changes via @sejournal, @martinibuster

Google’s March 2026 Spam Update was welcomed by many in the SEO community who were hoping for relief from listicles, AI content rewriters, and Google’s own AI Overviews that “rehash other people’s content.” The update unexpectedly finished in less than twenty-four hours and was met with a collective shrug. Yet despite its underwhelming nature, the update still yielded a few interesting insights and takeaways.

Hopeful SEOs

Google’s spam announcement was largely welcomed by SEOs hoping that spammy sites positioned above them would lose their rankings, but the muted response suggests the update didn’t land where people expected it to.

EmarketerZ expressed the hope that sites struggling under the weight of spammy sites ranking above them might have their comeback moment.

They tweeted:

“Google’s latest spam update might just be the comeback moment publishers have been waiting for—finally a shot at reclaiming the traffic they lost in the last one 🤣”

Over on LinkedIn Adrian M. responded to Google’s announcement by expressing that it’s about time, calling out fake engagement tactics as an area they’d like to see cleaned out.

They wrote:

“It was only a matter of time, and it’s exactly what the industry needed. Many SEO agencies have been relying on bot networks and residential proxies to simulate organic engagement and inflate their monthly reports. I’ve recently audited e-commerce servers pushed to the brink of crashing (503 errors) just by these automated, fake “add-to-cart” scripts masquerading as real users. This update will finally clean up the vanity metrics and force the market to return to genuine content marketing and real user acquisition. Excellent move by the Search team!”

Muted Response From Digital Marketers

Many SEOs who have been vocal about spammy GEO tactics and regular old spam jamming up the search results were oddly quiet through the duration of the spam update.

Glenn Gabe had this to say:

“Wait, what? The March 2026 Spam Update has completed rolling out. Damn, that was fast. :)”

The Google subreddit thread announcing the spam update drew only six responses, four of which were requests for a link to the official announcement. It’s fair to say the reaction there was muted.

The response over on the SEO subreddit was similar, with some of the comments doubting much of anything will change.

One person expressed the hope that this time AI-generated content farms will get wiped out.

They wrote:

“I’m betting on a big hit to AI-generated content farms and those super thin affiliate sites. google’s been hinting at this for a while, feels like it’s finally coming.”

But another Redditor nicknamed mrtornado79 responded with a big nah… and a useful insight.

“It’s been “finally coming” for three years. At this point it’s basically an SEO drinking game — spam update drops, someone says “this is the one that kills AI content farms,” nothing particularly dramatic happens, repeat.

Google called this a “normal spam update.” Not a paradigm shift. Not the AI content apocalypse. Normal.”

That point about the March Spam Update not being a paradigm shift was a good observation about Google’s understated announcement, and it probably explains why Google didn’t even update its spam update documentation.

A couple of SEO Facebook groups didn’t discuss the update at all, which is itself a comment on how SEOs feel about Google’s spam updates. It could also be a sign of how much wind has been taken out of the sails of low-level affiliate spammers and PBN sellers.

Wait, What… That Was It?

The end of the update was generally met by silence on many of the ongoing discussions across the Internet.

WebmasterWorld member Micha summed up the general sense of anticlimax best:

“Huh? The update is over?”

It’s quite possible that Redditor mrtornado79’s opinion that it was not going to be a paradigm shift was the best view of what just happened.

What May Happen Next

The big question now may not be what just happened but rather what is going to happen next.

I’ve always seen Google’s spam updates as a clearing of the table in preparation for the next course. If a core update follows soon, then that may be what this muted spam update was about. That can be anything from the introduction of new AI-driven features (like those title rewrites they were recently experimenting with) to something quiet that will barely be noticed, like an infrastructure change to accommodate something big and new.

What could Google implement over the coming months?

There have been two patents filed recently which I’ll be publishing information about soon.

1. User Journey Patent
The first one describes a machine learning system that determines how different types of content exposure influence a user’s likelihood of performing a specific action, such as making a purchase or signing up for a service. It’s a system to attribute portions of the final action to specific exposures to content or ads, even when multiple exposures occurred at different times.

2. Automatic Search Results Updates
This patent describes a system that automatically delivers better results to a user after their original search, without requiring them to search again, and it applies to both organic and AI-assisted search. It transforms search from a one-time activity into an information request that resolves over time. That makes it possible to ask about something that hasn’t happened or been announced yet, expanding the range of queries Google can answer.

My general impression of Spam Updates is that they are sometimes a prelude to changes elsewhere in Google’s core algorithm or related infrastructure. It may be an interesting month ahead.

Featured Image by Shutterstock/vchal

Google Analytics Launches Scenario Planner and Projections via @sejournal, @brookeosmundson

Google Analytics has launched Scenario Planner and Projections, two new features designed to help advertisers plan and monitor paid media budgets across channels.

The rollout is part of Google Analytics’ cross-channel budgeting feature, which is still in beta and not yet available to every Google Analytics property.

Read on to learn more about the tools, who’s eligible, and how advertisers can use them.

Introducing Scenario Planner and Projections

The rollout includes two separate tools built for different stages of campaign planning.

Scenario Planner is designed for future planning. It allows advertisers to model different budget allocations across channels and estimate how those changes may impact conversions, revenue, or return on investment. The tool is intended for building media plans ahead of campaign launches or defined planning periods.

Projections is designed for active campaigns. It helps advertisers evaluate whether current spend is pacing toward selected goals and where adjustments may be needed before the reporting period ends. This includes visibility into projected budget delivery, conversions, and revenue by channel.

Google says the tools are meant to be used together. Scenario Planner can be used to build a forward-looking budget plan, while Projections can be used to monitor how campaigns are tracking against that plan once they are live.

The feature is not limited to Google Ads data. Advertisers can incorporate campaign data from both Google and non-Google paid channels, provided cost data and integrations are properly configured.

There are, however, some requirements that may limit access. According to Google, eligibility requirements include:

  • At least one year of conversion data
  • Cost data for each channel that is compatible with Primary Channel Grouping
  • At least one year of campaign data from at least two channels (Google and non-Google)

Google also notes that both tools rely on modeled estimates based on historical performance, meaning outputs are directional rather than guaranteed.

Cross-channel budgeting is currently labeled as a beta feature, and Google notes it may not yet be available to all Google Analytics properties, though the company is working on expanding access to more accounts.

Why This Matters For Advertisers

For many teams, budget planning and performance analysis still happen in separate places.

Planning often lives in spreadsheets or internal forecasts, while performance is measured inside ad platforms and Google Analytics after the fact. That separation can make it harder to evaluate whether budget decisions are working in real time.

These tools bring some of that planning workflow into Google Analytics.

Advertisers now have a way to model budget allocation before campaigns begin and check pacing while campaigns are still running, using the same data source they rely on for performance reporting.

That could be useful for teams managing spend across multiple paid channels, particularly when trying to compare performance beyond a single platform’s recommendations.

At the same time, the usefulness of the feature will depend on data quality and setup. Advertisers with incomplete cost imports, limited historical data, or inconsistent conversion tracking may not be able to fully use the tools or may see less reliable projections.

What Comes Next

For advertisers already using Google Analytics as a central reporting tool, Scenario Planner and Projections may offer a more practical way to pressure-test budget decisions before and during campaign execution.

How useful the tools become in day-to-day planning will likely depend on how many advertisers qualify for access and how reliable the forecasting proves to be over time.

Google Begins Rolling Out The March 2026 Spam Update via @sejournal, @MattGSouthern

Google started rolling out the March 2026 spam update today, according to the Google Search Status Dashboard.

The update is global and in all languages, with a rollout that may take a few days.

What’s New

The Search Status Dashboard listed the update as an incident affecting ranking at 12:00 PM PT on March 24, with the release note posted at 12:18 PM PT.

Google’s description reads:

“Released the March 2026 spam update, which applies globally and to all languages. The rollout may take a few days to complete.”

Google hasn’t published a blog post or announced new spam policies with this rollout. So far, it seems to be a standard spam update, not a broader policy change like the March 2024 update, which added categories such as scaled content abuse, expired domain abuse, and site reputation abuse.

How Spam Updates Work

Google describes spam updates as improvements to spam-prevention systems like SpamBrain, targeting sites violating spam policies, which can lead to lower rankings or removal from search results.

Spam updates differ from core updates, which re-assess content quality. Spam updates enforce policies against violations like cloaking, link spam, and content abuse.

Sites affected by a spam update can recover, but recovery takes time. Google states improvements may only appear once automated systems detect compliance over months.

Context

This is Google’s first spam update since the August 2025 spam update, which ran from August 26 to September 22 and took nearly 27 days to complete. That update was characterized by SISTRIX as penalty-only, with affected spammy domains losing visibility but no broad ranking changes.

Google’s estimated timeline of “a few days” for the March 2026 update suggests a shorter rollout than recent spam updates, though timelines can stretch. The December 2024 spam update completed in seven days. The August 2025 update took nearly four weeks.

The March 2026 spam update comes about three weeks after the February Discover update finished rolling out.

Why This Matters

Ranking changes during spam update rollouts can happen quickly. Monitoring Search Console data over the next few days will help distinguish spam-related drops from normal fluctuation.

Google hasn’t announced new spam policy categories with this update, so the existing spam policies remain the relevant framework for evaluating any impact.

Looking Ahead

Google will update the Search Status Dashboard when the rollout is complete. Search Engine Journal will report on the completion and any observed effects.


Featured Image: Hurunaga Yuuka/Shutterstock

Google Adds AI & Bot Labels To Forum, Q&A Structured Data via @sejournal, @MattGSouthern

Google updated its Discussion Forum and Q&A Page structured data documentation, adding several new supported properties to both markup types.

The most notable addition is digitalSourceType, a property that lets forum and Q&A sites indicate when content was created by a trained AI model or another automated system.

Content Source Labeling Comes To Forum Markup

The new digitalSourceType property uses IPTC digital source enumeration values to indicate how content was created. Google supports two values:

  • TrainedAlgorithmicMediaDigitalSource for content created by a trained model, such as an LLM.
  • AlgorithmicMediaDigitalSource for content created by a simpler algorithmic process, such as an automatic reply bot.

The property is listed as recommended, not required, for both the DiscussionForumPosting and Comment types in the Discussion Forum docs, and for Question, Answer, and Comment types in the Q&A Page docs.

Google already uses similar IPTC source type values in its image metadata documentation to identify how images were created. The update extends that concept to text-based forum and Q&A content.
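
As a rough illustration (not an example from Google’s documentation), a forum post generated by an LLM might declare the property like this, sketched as a Python dict for clarity; the headline, author, and dates are placeholders:

```python
import json

# Hypothetical DiscussionForumPosting markup that labels the post as
# LLM-generated via the digitalSourceType value named in the update.
post_markup = {
    "@context": "https://schema.org",
    "@type": "DiscussionForumPosting",
    "headline": "Example thread title",                   # placeholder
    "author": {"@type": "Person", "name": "HelperBot"},   # placeholder
    "datePublished": "2026-03-01T09:00:00+00:00",         # placeholder
    "text": "An automatically generated reply.",
    "digitalSourceType": "TrainedAlgorithmicMediaDigitalSource",
}

# Serialize to the JSON-LD string that would sit in a script tag.
print(json.dumps(post_markup, indent=2))
```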

New Comment Count Property

Google added commentCount as a recommended property across both documentation pages. It lets sites declare the total number of comments on a post or answer, even when not all comments appear in the markup.

The Q&A Page documentation includes a new formula: answerCount + commentCount should equal the total number of replies of any type. This gives Google a clearer picture of thread activity on pages where comments are paginated or truncated.
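
The formula is simple enough to sanity-check before publishing markup. A minimal sketch, assuming you already know the page’s true reply count:

```python
# Verify the documented relationship for Q&A Page markup:
# answerCount + commentCount should equal the total replies of any type,
# even when some comments are paginated out of the markup itself.
def counts_consistent(answer_count: int, comment_count: int, total_replies: int) -> bool:
    return answer_count + comment_count == total_replies

# A question with 3 answers and 5 comments should declare 8 replies total.
assert counts_consistent(3, 5, 8)
# A mismatch signals the declared counts are out of sync with the page.
assert not counts_consistent(3, 5, 9)
print("count check ok")
```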

Expanded Shared Content Support

The Discussion Forum documentation expanded its sharedContent property. Previously, sharedContent accepted a generic CreativeWork type. The updated docs now explicitly list four supported subtypes:

  • WebPage for shared links.
  • ImageObject for posts where an image is the primary content.
  • VideoObject for posts where a video is the primary content.
  • DiscussionForumPosting or Comment for quoted or reposted content from other threads.

The addition of DiscussionForumPosting and Comment as accepted types is new. Google’s updated documentation includes a code example showing how to mark up a referenced comment with its URL, author, date, and text.
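
A sketch of what that could look like, again as a Python dict with placeholder URLs, names, and text (Google’s own example may differ):

```python
import json

# Hypothetical post that quotes a comment from another thread, using the
# newly supported Comment subtype under sharedContent.
repost_markup = {
    "@context": "https://schema.org",
    "@type": "DiscussionForumPosting",
    "author": {"@type": "Person", "name": "ExampleUser"},         # placeholder
    "datePublished": "2026-03-02T14:30:00+00:00",                 # placeholder
    "text": "Quoting an earlier reply for context:",
    "sharedContent": {
        "@type": "Comment",
        "url": "https://example.com/thread/123#comment-7",        # placeholder
        "author": {"@type": "Person", "name": "OriginalPoster"},  # placeholder
        "datePublished": "2026-02-28T10:15:00+00:00",
        "text": "The original comment being quoted.",
    },
}

print(json.dumps(repost_markup, indent=2))
```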

The image property description was also updated across both docs with a note about link preview images. Google now recommends placing link preview images inside the sharedContent field’s attached WebPage rather than in the post’s image field.

Why This Matters

For sites that publish a mix of human and machine-generated content, the digitalSourceType addition provides a structured way to communicate that to Google. The new properties are optional, and no existing implementations will break.

Google has not said how it will use the digitalSourceType data in its ranking or display systems. The documentation only describes it as a way to indicate content origin.

Looking Ahead

The update does not include changes to required properties, so existing forum and Q&A structured data implementations remain valid. Sites that want to adopt the new properties can add them incrementally.

Research: “You Are An Expert” Prompts Can Damage Factual Accuracy via @sejournal, @martinibuster

“You are an expert” persona prompting can harm performance as much as it helps. A new study shows that persona prompting improves alignment with human expectations but can reduce factual accuracy on knowledge-heavy tasks, with effects varying by task type and model. The takeaway is that persona prompting works better on some kinds of tasks than it does in others.

Persona Prompting

Persona prompting is a common way to shape how large language models respond, especially in applications where tone and alignment with human expectations matter. It is widely used because it improves how outputs read and feel. Given how widespread the technique is, it may come as a surprise that its actual effect on performance remains unclear: prior research shows inconsistent results, leaving it in doubt whether persona prompting helps or harms.

The researchers concluded that persona prompting is neither broadly beneficial nor harmful, and that its efficacy depends on the type of task.

They found:

  • It improves alignment-related outputs such as tone, formatting, and safety behavior
  • It degrades performance on tasks that rely on factual accuracy and reasoning

Based on this, the authors introduce PRISM (Persona Routing via Intent-based Self-Modeling), a method that applies personas selectively using intent-based routing instead of treating them as a default setting. Their findings show that persona prompting works best as a conditional tool and clarify when it helps and when it should be avoided.
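
The routing idea can be approximated in application code: classify the request’s intent first, then attach a persona only for task types where the study reports gains. The task categories and persona text below are illustrative placeholders, not PRISM’s actual implementation:

```python
# Sketch of conditional persona routing: prepend an expert persona only
# for task types where persona prompting reportedly helps, and send
# knowledge-heavy queries through without one.
PERSONA_HELPS = {"extraction", "stem", "reasoning", "writing", "roleplay"}

def route_prompt(user_query: str, task_type: str) -> list[dict]:
    """Build a chat message list, adding a persona only when it helps."""
    messages = []
    if task_type in PERSONA_HELPS:
        messages.append({
            "role": "system",
            "content": "You are an expert assistant for this task.",  # placeholder persona
        })
    messages.append({"role": "user", "content": user_query})
    return messages

# Knowledge-retrieval query: no persona, to avoid the accuracy penalty.
assert route_prompt("When was the transistor invented?", "knowledge")[0]["role"] == "user"

# Extraction task: persona prepended for the formatting/alignment gains.
assert route_prompt("Extract all dates from this text.", "extraction")[0]["role"] == "system"
print("routing sketch ok")
```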

Managing Behavioral Signals

In section three of the paper, the researchers say that expert personas have “useful behavioral signals” but that naïve use of persona prompting damages as much as it helps. They say this raises the question of whether those benefits can be separated from the harms and applied only where they improve results.

Behavioral signals influence LLM output. These signals are the reason persona prompting works. They drive improvements in tone, structure, safety behavior, and how well responses match expectations. Without them, there would be no benefit to persona prompting.

Yet, in a seeming paradox, the paper shows that those same signals interfere with tasks that depend on factual accuracy and reasoning. That is why the paper treats them as something to manage, not maximize.

These signals include:

  • Stylistic adaptation and tone matching: Adopting a professional or creative voice.
  • Structured formatting: Providing step-by-step or technical layouts.
  • Format adherence: Helping the model follow complex structures, like professional emails or step-by-step STEM explanations.
  • Intent following: Focusing the model on the user’s underlying goal, especially in tasks like data extraction.
  • Safety refusal: Identifying and declining harmful requests more effectively by adopting a “Safety Monitor” role.

Persona Prompt Wins

The paper found that persona prompts were a win in five out of eight categories of tasks:

  1. Extraction: +0.65 score increase.
  2. STEM: +0.60 score increase.
  3. Reasoning: +0.40 score increase.
  4. Writing: Improved through better stylistic adaptation.
  5. Roleplaying a domain expert: Improved through better tone matching.

Persona prompting won in the above categories because they depend more on style and clarity than on whether an answer is factually correct. The researchers also found that the longer and more detailed the persona prompt, the stronger the alignment and safety behaviors became.

Persona Prompt Failures

Conversely, expert personas consistently degraded performance in the remaining three (out of eight) categories because those tasks rely on precise fact retrieval or strict logic rather than style and clarity. The performance drops because adding a detailed expert persona essentially “distracts” the model by activating an “instruction-following mode” that prioritizes tone and style.

Activating an expert persona comes at the expense of “factual recall.” The model is so focused on trying to act like an expert that it neglects the information it learned during its initial training. That explains the drops in accuracy for facts and math.

Expert persona prompts performed worse in the following three categories:

  1. Math
  2. Coding
  3. Humanities (memorized factual knowledge)

The paper notes that on one of the knowledge benchmarks (MMLU), accuracy dropped from a 71.6% baseline to 68.0% even with the “minimum” persona, and fell further to 66.3% with the “long” persona.

They explained the safety improvements:

“More detailed persona descriptions provide richer alignment information, amplifying instruction-tuning behaviors proportionally.”

And showed why factual accuracy takes a hit:

“Persona Damages Pretraining Tasks
During pretraining, language models acquire capabilities such as factual knowledge memorization, classification, entity relationship recognition, and zero-shot reasoning. These abilities can be accessed without relying on instruction-tuning, and can be damaged by extra instruction-following context, such as expert persona prompts.”

Conclusions Reached

The researchers conclude that persona prompting consistently improves alignment-dependent tasks such as writing, roleplay, and safety behavior, while degrading performance on tasks that rely on pretraining-based knowledge, including math, coding, and general knowledge benchmarks.

They also found that a model’s sensitivity to personas scales with its training. Models that are more optimized to follow instructions are more “steerable,” which means they get the biggest boost in safety and tone, but they also suffer the largest drops in factual accuracy.

Takeaways

1. Be selective about using persona prompts:

  • Do not default to “You are an expert” prompts
  • Treat persona prompting as situational. Using it everywhere introduces hidden accuracy risks.

2. Persona prompting is effective for:

  • Writing quality
  • Tone
  • Formatting and organization
  • Readability

3. Tasks that don’t benefit from persona prompting and should instead use neutral prompting to preserve accuracy:

  • Fact-checking
  • Statistics
  • Technical explanations
  • Logic-heavy outputs
  • Research
  • SEO analysis

4. Remember these three findings:

  • Use persona prompting to generate content, then switch to a non-persona prompt (or a stricter mode) to verify facts.
  • Highly detailed “expert” prompts strengthen tone and clarity but reduce factual and knowledge accuracy.
  • “You are an expert” prompts may cause a model to prioritize sounding correct over actually being correct.

5. Match your prompts to the task:

  • Content creation: Persona helps
  • Analysis and validation: Persona hurts

The most effective approach is not a single prompt but a workflow that switches prompts depending on the task, similar to the researchers’ PRISM approach.
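Such a workflow can be sketched as a two-pass loop: generate with a persona, then verify with a neutral prompt. The `call_llm` stub below is a placeholder standing in for any real model API; the prompt wording is an illustrative assumption:

```python
# Sketch of the generate-then-verify workflow described above, using a
# placeholder call_llm() stub in place of any real model API.

def call_llm(prompt):
    # Placeholder: a real implementation would call a chat model here.
    return f"[model response to: {prompt[:40]}...]"

def generate_then_verify(topic):
    # Pass 1: persona prompt for tone, structure, and readability.
    draft = call_llm(
        f"You are an expert technical writer. "
        f"Draft an article section about {topic}."
    )
    # Pass 2: neutral prompt, no persona, to check factual accuracy.
    review = call_llm(
        f"List any factual errors in the following text:\n\n{draft}"
    )
    return draft, review

draft, review = generate_then_verify("persona prompting")
```

Keeping the verification pass persona-free follows the study's finding that neutral prompting better preserves factual recall.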

Read the research paper:
Expert Personas Improve LLM Alignment but Damage Accuracy: Bootstrapping Intent-Based Persona Routing with PRISM

Featured Image by Shutterstock/ImageFlow