ChatGPT Nears 700 Million Weekly Users, OpenAI Announces via @sejournal, @MattGSouthern

OpenAI’s ChatGPT is on pace to reach 700 million weekly active users, according to a statement this week from Nick Turley, VP and head of the ChatGPT app.

The milestone marks a sharp increase from 500 million in March and represents a fourfold jump compared to the same time last year.

Turley shared the update on X, writing:

“This week, ChatGPT is on track to reach 700M weekly active users — up from 500M at the end of March and 4× since last year. Every day, people and teams are learning, creating, and solving harder problems. Big week ahead. Grateful to the team for making ChatGPT more useful and delivering on our mission so everyone can benefit from AI.”

How Does This Compare to Other Search Engines?

Weekly active user (WAU) counts aren’t typically shared by traditional search engines, making direct comparisons difficult. Google reports aggregate data like total queries or monthly product usage.

While Google handles billions of searches daily and reaches billions of users globally, its early growth metrics were limited to search volume.

By 2004, roughly six years after launch, Google was processing over 200 million daily searches. That figure grew to four billion daily searches by 2009, more than a decade into the company’s existence.

For Microsoft’s Bing search engine, a comparable data point came in 2023, when Microsoft reported that Bing had surpassed 100 million daily active users, a milestone it attributed in part to the launch of the AI-powered Bing Chat. That figure covers the search engine as a whole, not just the new conversational interface.

How ChatGPT’s Growth Stands Out

Unlike traditional search engines, which built their user bases during a time of limited internet access, ChatGPT entered a mature digital market where global adoption could happen immediately. Still, its growth is significant even by today’s standards.

Although OpenAI hasn’t shared daily usage numbers, reporting WAU gives us a picture of steady engagement from a wide range of users. Weekly stats tend to be a more reliable measure of product value than daily fluctuations.

Why This Matters

The rise in ChatGPT usage is evidence of a broader shift in how people find information online.

A Wall Street Journal report cites market intelligence firm Datos, which found that AI-powered tools like ChatGPT and Perplexity make up 5.6% of desktop browser searches in the U.S., more than double their share from a year earlier.

The trend is even stronger among early adopters. Among people who began using large language models in 2024, nearly 40% of their desktop browser visits now go to AI search tools. During the same period, traditional search engines’ share of traffic from these users dropped from 76% to 61%, according to Datos.

Looking Ahead

With ChatGPT on track to reach 700 million weekly users, OpenAI’s platform is now rivaling the scale of mainstream consumer products.

As AI tools become a primary starting point for queries, marketers will need to rethink how they approach visibility and engagement. Staying competitive will require strategies focused as much on AI optimization as on traditional SEO.


Featured Image: Photo Agency/Shutterstock

How AI Search Should Be Shaping Your CEO’s & CMO’s Strategy [Webinar] via @sejournal, @theshelleywalsh

AI is rapidly changing the rules of SEO. From generative ranking to vector search, the new rules are not only technical but also reshaping how business leaders make decisions.

Join Dan Taylor on August 14, 2025, for an exclusive SEJ Webinar tailored for C-suite executives and senior leaders. In this session, you’ll gain essential insights to understand and communicate SEO performance in the age of AI.

Here’s what you’ll learn:

AI Search Is Impacting Everything. Are You Ready?

AI search is already here, and it’s impacting everything from SEO KPIs to customer journeys. This webinar will give you the tools to lead your teams through the shift with confidence and precision.

Register now for a business-first perspective on AI search innovation. If you can’t attend live, don’t worry. Sign up anyway, and we’ll send you the full recording.

Which SEO Jobs AI Will Reshape & Which Might Disappear via @sejournal, @DuaneForrester

You’ve probably seen the headlines like: “AI will kill SEO,” “AI will replace marketing roles,” or the latest panic: “Is your digital marketing job safe?”

Well, maybe not those exact headlines, but you get the idea, and I’m sure you have seen something similar.

Let’s clear something up: AI is not making SEO irrelevant. It’s making certain tasks obsolete. And yes, some jobs built entirely around those tasks are at risk.

A recent Microsoft study analyzed over 200,000 Bing Copilot interactions to measure task overlap between human job functions and AI-generated outputs. Their findings are eye-opening:

  • Translators and Interpreters: 98% overlap with AI tasks.
  • Writers and Authors: 88% overlap.
  • Public Relations Specialists: 79% overlap.

SEO as a field wasn’t directly named in the study, but many roles common within SEO map tightly to these job categories.

If you write, edit, report, research, or publish content as part of your daily work, this isn’t a hypothetical shift. It’s already happening.

(Source: Microsoft AI Job Impact – Business Insider. Follow that link to reach the download location for the original PDF of the study; BI summarizes the findings and links to Microsoft, which in turn links to the source PDF.)

What’s Actually Changing

AI isn’t replacing SEO. It’s changing what “search engine optimization” means, and where and how value is measured.

In traditional SEO, the focus was clear:

  • Rank high.
  • Earn the click.
  • Optimize the page for humans and crawlers.

That still matters. But, in AI-powered search systems, the sequence is different:

  1. Content is chunked behind the scenes: paragraphs, lists, and answers are sliced and stored in vector form.
  2. Prompts trigger retrieval: the LLM pulls relevant chunks, often based on embeddings rather than just keywords. (So, concepts and relationships, not keywords per se.)
  3. Only a few chunks make it into the answer. Everything else is invisible, no matter how high it once ranked.
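The three-step flow above can be sketched in a few lines of Python. This is a deliberately simplified illustration: the bag-of-words "embedding" is a stand-in for the dense vector models real systems use, but the retrieval shape (rank every stored chunk against the prompt, keep only the top few) is the same.

```python
import math
from collections import Counter

def embed(text):
    # Toy "embedding": a term-frequency vector. Real systems use
    # dense model embeddings, but the ranking math is the same shape.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(chunks, query, k=2):
    # Only the top-k chunks make it into the answer context;
    # everything scoring zero is invisible to the response.
    scored = [(cosine(embed(c), embed(query)), c) for c in chunks]
    return [c for score, c in sorted(scored, reverse=True)[:k] if score > 0]

chunks = [
    "Title tags act as semantic hooks for retrieval.",
    "Bullets and tables are easier for AI systems to cite.",
    "Our company was founded in 1999 in a garage.",
]
print(retrieve(chunks, "why do tables and bullets get cited by AI"))
```

Note that the question is binary, as the article says: a chunk either clears the similarity cut and enters the answer, or it does not appear at all.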

This new paradigm shifts the rules of engagement. Instead of asking, “Where do I rank?” the better question is, “Was my content even retrieved?” That makes this a binary system, not a sliding scale.

In this new world of retrieval, the direct answer to the question, “Where do I rank?” could be “ChatGPT,” “Perplexity,” “Claude,” or “Copilot,” instead of a numbered position.

In some ways, this isn’t as big a shift as some folks would have you believe. After all, as the old joke asks, “Where do you hide a dead body?” To which the correct answer is “…on Page 2 of Google’s results!”

Morbid humor aside, the implication is that no one goes there, so there’s no value. That sentiment glosses over the real, nuanced details that actual click-through rate data shows us (the top of page 2 typically has better CTRs than the bottom of page 1), but it does serve up a meta point: if you’re not in the first few results on a traditional SERP, the drop-off in CTRs is precipitous.

So, it could be argued that, with most generative AI “answers” today comprising a very limited set of references, AI-based systems offer a new display path for consumers, but ultimately those consumers will only interact with the same number of results they historically engaged with.

After all, if we only ever really clicked on the top three results (generalizing here), and the rest were surplus to needs, then cutting an AI-sourced answer down to a few words with one, two, or three cited results amounts to a similar situation: the same raw number of clickable options for consumers.

Regardless, it does mark a shift in terms of work items and workflows, and here’s how that shift shows up across some core SEO tasks. Obviously, there could be many more, but these examples help set the stage:

  • Keyword research becomes embedding relevance and semantic overlap. It’s not about the exact phrase match in a gen AI result. It’s about aligning your language with the concepts AI understands. It’s about the concept of query fan-out (not new, by the way, but very important now).
  • Meta tag and title optimization become chunked headers and contextual anchor phrases. AI looks for cues inside content to determine chunk focus.
  • Backlink building becomes trust signal embedding and source transparency. Instead of counting links, AI asks: Does this source feel credible and citable?
  • Traffic analytics becomes retrieval testing and AI response monitoring. The question isn’t just how many visits you got, it’s whether your content shows up at all in AI-generated responses.

What this means for teams:

  • Your title tag isn’t just a headline; it’s a semantic hook for AI retrieval.
  • Content format matters more: bullets, tables, lists, and schema win because they’re easier to cite.
  • You need to test with prompts to see if your content is actually getting surfaced.

None of this invalidates traditional SEO. But, the visibility layer is moving. If you’re not optimizing for retrieval, you’re missing the first filter, and ranking doesn’t matter if you’re never in the response set.

The SEO Job Risk Spectrum

Microsoft’s study didn’t target SEO directly, but it mapped 20+ job types by their overlap with current AI tasks. I used those official categories to extrapolate risk within SEO job functions.

Image Credit: Duane Forrester

High Risk – Immediate Change Needed

SEO Content Writers

Mapped to: Writers & Authors (88% task overlap in the study, meaning an AI can perform 88% of these tasks today).

Why: These roles often involve creating repeatable, factual content, precisely the kind of output AI handles well today (to a degree, anyway). Think meta descriptions, product overviews, and FAQ pages.

The writing isn’t disappearing, but humans aren’t always required for first drafts anymore. Final drafts, yes, but first? No. And I’m not debating how factual the content is that an AI produces.

We all know the pitfalls, but I’ll say this: If your boss is telling you your job is going away, and your argument is “but AIs hallucinate,” think about whether that’s going to change the outcome of that meeting.

Link Builders/Outreach Specialists

Mapped to: Public Relations Specialists (79% overlap).

Why: Cold outreach and templated link negotiation can now be automated.

AI can scan for unlinked mentions, generate outreach messages, and monitor link placement outcomes, cutting into the core responsibilities of these roles.

Moderate Risk – Upskill To Stay Relevant

SEO Analysts

Mapped to: Market Research Analysts (~65% overlap).

Why: Data gathering and trend reporting are susceptible to automation. But, analysts who move into interpreting retrieval patterns, building AI visibility reports, or designing retrieval experiments can thrive.

Admittedly, SEO is a bit more specialized, but bottom or top of this stack, the risk remains moderate. This one, however, is heavily dependent on your actual job tasks.

Technical SEOs

Mapped to: Web Developers (not perfect, but as close as the study got).

Why: Less overlap with generative AI, but still pressured to evolve. Embedding hygiene, chunk structuring, and schema precision are now foundational.

The most valuable technical SEOs are becoming AI optimization architects. Not leaving their traditional work behind, but adopting new workflows.

Content Strategists/Editors

Mapped to: Editors & Technical Writers.

Why: Editing for humans and tone alone is out. Editing for retrievability is in. Strategists now must prioritize chunking, citation density, and clarity of topic anchors, not just user readability.

Or, at least, now consider that LLM bots are de facto users as well.

Lower Risk – Expanded Value And Influence

SEO Managers/Leads

Mapped to: Marketing Managers.

Why: Managers who understand both traditional and AI SEO have more leverage than ever. They’re responsible for team alignment, training decisions, and tool adoption.

This is a growth role, if guided by data, not gut instinct. Testing is life here.

CMOs/Strategy Executives

Mapped to: Marketing Executives.

Why: Strategic thinking isn’t automatable. AI can suggest, but it can’t set priorities across brand, trust, and investment.

Executives who understand how AI affects visibility will steer their companies more effectively, especially in content-heavy verticals.

Tactical Response By Role Type

Every job category on the risk curve deserves practical action.

Now, let’s look at how people in SEO roles can pivot, strengthen, or evolve, based on clear, verifiable capabilities.

High-Risk Roles: SEO Content Writers, Editors, Link Builders

  • Shift from traditional copywriting to creating structured, retrieval-friendly content.
  • Focus on chunk-based writing: short Q&A blocks, bullet-based explanations, and schema-rich snippets.
  • Learn AI prompt testing: Use platforms like ChatGPT or Google Gemini to query key topics and see if your content is surfaced without requiring a click.
  • Use gen AI visibility tools verified to support AI search tracking:
    • Profound tracks your brand’s appearance in AI search results across platforms like ChatGPT, Perplexity, and Google Overviews. You can see where you’re cited and which topics AI engines associate with you.
    • SERPRecon offers AI-powered content outlines and helps reverse-engineer AI overview logic to show what keywords and phrasing matter most. So, use a tool like this, then take the output as the basis for your query fan-out work.
  • Reinvent your role:
    • Write in chunks that AI can cite.
    • Embed trust signals (clear sourcing, authoritativeness).
    • Collaborate with data teams on embedding accuracy and chunk performance.

Moderate-Risk Roles: SEO Analysts, Technical SEOs, Content Strategists

  • Expand traditional ranking reports with retrievability diagnostics:
    • Use prompt simulations that probe content retrieval in real-time across AI engines.
    • Audit embedding and semantic alignment at the paragraph or chunk level.
  • Employ tools like those mentioned to analyze AI Overviews and generate content improvement outlines.
  • Monitor AI visibility gaps through new dashboards:
    • Track citation share versus competitors.
    • Identify topic clusters where your domain is cited less.
  • Understand structured data and schema:
    • Use markup to clearly define entities, relationships, and context for AI systems.
    • Prioritize formats like FAQPage, HowTo, and Product schema, where applicable. These are easier for LLMs and AI Overviews to cite.
    • Align semantic clarity within chunks to schema-defined roles (e.g., question/answer pairs, step lists) to improve retrievability and surface relevance.
  • Join or lead internal “AI-SEO Workshops”:
    • Teach teams how to test content visibility in ChatGPT, Perplexity, or Google Overviews.
    • Share experiments in prompt engineering, chunk format outcomes, and schema effectiveness.
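The structured-data advice above can be made concrete with a small sketch. This builds FAQPage JSON-LD as a Python dict and serializes it for embedding in a page; the question and answer text are hypothetical, while the `@type` and property names follow schema.org's FAQPage vocabulary.

```python
import json

# Minimal FAQPage JSON-LD: question/answer pairs are exactly the kind
# of schema-defined role that maps cleanly onto retrievable chunks.
faq = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What is retrievability in AI search?",
            "acceptedAnswer": {
                "@type": "Answer",
                "text": "Whether a content chunk is pulled into an AI-generated response at all.",
            },
        }
    ],
}

# This output would sit inside a <script type="application/ld+json"> tag.
print(json.dumps(faq, indent=2))
```

Each `Question`/`acceptedAnswer` pair doubles as a self-contained chunk, which is why formats like this are easier for LLMs and AI Overviews to cite.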

Lower-Risk Roles: SEO Managers, Digital Leads, CMOs

  • Sponsor retraining initiatives for semantic and vector-led SEO practices.
  • Revise hiring briefs and job descriptions to include skills like embedding knowledge, prompt testing, schema fluency, and chunk analysis.
  • Implement AI-visibility dashboards using dedicated tools:
    • Benchmark brand presence across search engines and generative platforms.
    • Use insights to guide future content and authority decisions.
  • Keep traditional SEO strong alongside AI tactics:
    • Technical optimization, speed, quality of content, etc., still matter.
    • Hybrid success requires both sides working in sync.
  • Set internal AI literacy standards:
    • Offer training on retrieval engineering, LLM behavior, and chunk visibility.
    • Ensure everyone understands AI’s core behaviors, what it cites, and what it ignores.

Reframing The Opportunity

This isn’t a “get out now” scenario for these jobs. It’s a “rebuild your toolkit” moment.

High overlap doesn’t mean you’re obsolete. It means the old version of your job won’t hold value without adaptation. And what gets automated away often wasn’t the best part of the job anyway.

AI isn’t replacing SEO, it’s distilling it. What’s left is:

  • Strategy that aligns with machine logic and user needs.
  • Content structure that supports fast retrieval, not just ranking.
  • Authority based on more, deeper, sometimes implied, trust signals, not just age or backlinks. Like E-E-A-T++.

Think of it this way: AI strips away the boilerplate. What’s left is your real contribution. Your judgment. Your design. Your clarity.

New opportunity lanes are forming right now:

  • Writers who evolve into retrievability engineers.
  • Editors who become semantic format strategists.
  • Technical SEOs who own chunk structuring and indexing hygiene.
  • Analysts who specialize in AI visibility benchmarking.

These aren’t job titles (yet), but the work is happening. If you’re in a role that touches content, structure, trust, or performance, now is the time to sharpen your relevance, not to fear automation.

Final Word

The fundamentals still matter. Technical SEO, content quality, and UX don’t go away; they evolve alongside AI.

No, SEO isn’t dying, it’s becoming more strategic, more semantic, more valuable. AI-driven retrievability is already redefining visibility. Are you ready to adapt?


This post was originally published on Duane Forrester Decodes.


Featured Image: /Shutterstock

Why Your PPC Structure Should Mirror Your Business Model via @sejournal, @brookeosmundson

A lot of PPC accounts are built from the bottom up. You start with keyword research, group them by themes or match types, maybe throw in some location targeting, and go from there.

But then reporting becomes messy. Budget allocation feels random or reactive.

Then, when leadership asks for performance broken out by product line or region, you’re left pulling together a spreadsheet patchwork that still doesn’t tell the full story.

That’s because your PPC account structure doesn’t match how the business actually operates.

When your campaigns mirror your business model, everything starts working together.

You’re not just optimizing for clicks or conversions; you’re aligning with how revenue is made, who’s responsible for what, and how success is measured across the company.

This article will walk through how to shift from a keyword-centric approach to a business-aligned strategy.

Additionally, you’ll leave with practical advice for both restructuring existing accounts and building new ones the right way.

Why Structure Is More Than Just A Clean Campaign View

Let’s be honest: Campaign structure is rarely the most exciting part of PPC. But it’s one of the most important.

The way your account is structured affects everything from how you manage budgets to how clearly you can report on performance.

And yet, too many accounts are still structured around what’s easiest to set up, not what makes the most sense for the business.

If you’ve ever found yourself duplicating reports just to slice performance by business line, or struggled to isolate budgets by region, chances are the issue isn’t performance. It’s how your PPC campaigns are structured.

Well-structured accounts give you clarity, not just control. They help you:

  • Allocate budget where it matters most.
  • Tie campaign results back to business outcomes.
  • Make faster decisions with cleaner data.
  • Align with sales and finance teams instead of operating in a silo.

When your PPC structure reflects how your company makes money, your campaigns do more than drive leads or sales; they support actual business growth.

Rethink The Starting Point By Beginning With The Business Model

Most marketers are taught to start with keyword research. But when you begin with the business model instead, you’re already thinking strategically.

Now, for agencies, this can be harder to manage because you’ve likely got someone trying to win the business, and then a completely different team going to execute on what’s agreed upon.

If you’re still in the discovery phase with a client, start by asking some of these questions:

  • What are the core revenue drivers for the business?
  • Are there different business units, product lines, or services with unique goals?
  • Do some offerings have higher margins, longer sales cycles, or different audiences?
  • Are there geographic differences in how the business operates or sells?

These answers should directly inform how your campaigns are structured.

Let’s say you’re managing PPC for a multi-location financial services brand.

Their retail checking accounts, home loans, and business banking products each serve different customers, generate revenue differently, and likely have different internal stakeholders.

Instead of grouping all financial keywords into one campaign, each of those lines should have its own campaign with distinct goals, budgets, and creative.

You can then track performance in a way that lines up with internal reporting and make adjustments based on real business priorities, not just ad metrics.

A Better Framework For Structuring Your Account

Once you have a clear picture of how the business operates, use that to inform a top-down PPC campaign structure.

Here are three starting points that typically work well.

1. Mirror The Business Unit Or P&L

If the business tracks revenue separately for each product or service line, your campaigns should reflect that.

Not only does this make budgeting easier, but it also keeps reporting clean and relevant for internal teams.

You can speak the same language as your stakeholders and clearly show how paid media supports each part of the business.

Here’s an example breakdown:

  • Campaign A: “Personal Loans | Search | US”
  • Campaign B: “Student Banking | PMax | Northeast”
  • Campaign C: “Small Business Lending | Search | Canada”

Each one can then be built with appropriate audience targeting, bidding strategies, and conversion goals.

2. Segment By Funnel Stage Or Intent

Not all keywords or users are created equal. Think about structuring campaigns around the user’s stage in the journey.

Some examples include:

  • Branded campaigns (warm leads and returning users).
  • Non-branded high-intent campaigns (ready to convert).
  • Informational or research-stage campaigns (top-of-funnel).
  • Competitor-focused campaigns (comparison shoppers).
  • Awareness-driving campaigns (creating demand).

This lets you tailor bid strategy, messaging, and landing pages to match the level of intent and measure success more appropriately.

3. Separate Testing From Scaling

Every account needs room for experimentation. But, testing new keywords, assets, or audiences shouldn’t get in the way of scaling what already works.

A good PPC structure separates out:

  • Evergreen campaigns that consistently drive results.
  • Test campaigns with new targeting, creative, or offers.
  • Seasonal or geo-specific initiatives that need short-term budget support.

This makes it easier to measure impact, allocate budget, and avoid letting unproven elements tank your top-performing campaigns.

For Existing Accounts: When To Rethink Your PPC Structure

If your campaigns have been live for a while, restructuring might feel daunting. But, sometimes a reset is the only way to make your account work smarter.

Here are a few signs it might be time to make a change:

  • You can’t easily map campaign performance back to business priorities.
  • You’re constantly building workaround reports for internal teams.
  • Budget shifts feel reactive instead of strategic.
  • Performance has plateaued, but it’s unclear why.

Before making big changes, start with an audit. Compare how the business is structured vs. how your campaigns are organized.

Are your campaigns aligned with revenue-driving units? Do you have enough control over budgets, bids, and assets for key areas?

If not, consider starting small. Choose one business unit or region and restructure those campaigns first.

Document what you changed, how it aligns with the business, and what you’re measuring. Then, repeat the process for other areas as needed.

If You’re Setting Up A New PPC Account, Here’s Where To Start

New accounts are a blank slate and a great opportunity to get it right from the beginning.

Here’s a simple approach to building a structure around your business model:

  1. Outline your revenue centers. Products, services, regions, etc. Whatever makes sense for the business.
  2. Group campaigns around these core units. Each campaign should have its own budget, goals, and audience strategy.
  3. Map audience intent to campaign type. Use ad groups or asset groups to segment further by funnel stage or user behavior.
  4. Plan for scale. Use a naming convention that can grow with the business and makes sense to anyone reviewing the account.
  5. Set conversion tracking and bidding by campaign type. Not everything should optimize toward the same goal.

This setup makes it easier to scale, test new ideas, and keep everyone from marketing to finance on the same page.

Why Alignment With Sales & Finance Is A Must

When your campaigns align with the business model, it’s easier to speak the language of the teams around you.

Sales wants to know where leads are coming from and how qualified they are. Finance wants to understand return on investment (ROI) by product line or geography.

Executives want to know if paid media is supporting growth in the right areas.

If your campaign structure mirrors the way they already think, the reporting becomes instantly more useful. You’ll spend less time explaining what a campaign does and more time discussing what it’s driving.

When performance is strong, it’s much easier to justify additional investment if you can show that spend ties directly to core business units or revenue goals.

Supporting PPC Structure With The Right Tools And Workflow

Having a smart structure on paper only goes so far. To actually execute and manage it day to day, you need systems that support clarity and consistency.

First, start with naming conventions. A standardized way of naming campaigns, ad groups, and assets helps everyone understand what each item is meant to do.

Include details like business unit, funnel stage, and region to keep things clean and scalable.
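A naming convention like this is easy to enforce in code. Here is a minimal sketch: the fields (business unit, channel, region, funnel stage) and the pipe delimiter are illustrative choices matching the article's examples, not a platform requirement.

```python
# Standardized campaign names: BusinessUnit | Channel | Region | FunnelStage.
# Building and parsing from one field list keeps the convention consistent
# and makes reporting by business unit a simple dictionary lookup.
FIELDS = ["business_unit", "channel", "region", "funnel_stage"]

def build_name(**parts):
    return " | ".join(parts[f] for f in FIELDS)

def parse_name(name):
    return dict(zip(FIELDS, [p.strip() for p in name.split("|")]))

name = build_name(business_unit="Personal Loans", channel="Search",
                  region="US", funnel_stage="High Intent")
print(name)
print(parse_name(name)["business_unit"])
```

Because every name parses back into the same fields, dashboards can slice spend and performance by any segment without manual tagging.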

Then, align your conversion tracking setup with how the business defines success.

If you’re managing multiple product lines or customer types, don’t lump everything under one conversion goal. Set up separate conversion actions for each key area so you can measure impact more precisely.

Reporting also needs to reflect this structure. Build dashboards that slice performance by business unit, product, geography, or intent stage.

Whether you’re using Looker Studio or a different reporting suite, make sure the views match the way leadership wants to see results.

Don’t forget workflow tools and collaboration. Use shared documents or project management platforms to track which campaigns map to which business outcomes.

Make sure your internal stakeholders understand what each campaign is doing and why. This keeps cross-functional teams aligned and eliminates confusion about what paid media is actually delivering.

Finally, plan regular check-ins to ensure your structure still fits the evolving business.

As product lines shift or priorities change, your campaigns need to reflect that. Structure is not a “set it and forget it” task. Your PPC structure should evolve alongside your business.

It’s Time To Move Past Legacy Structures

Old habits die hard, especially if you’ve been in PPC for years. But, if your campaigns are still organized by match type or broad themes, you’re probably limiting what you can learn and what you can improve.

Campaigns should be built to reflect what matters most to the business.

If you’re not sure where to begin, talk to your sales or finance counterparts. They’ll give you a clearer picture of how the company thinks about performance, and you can structure campaigns to match.

This doesn’t mean throwing out everything you’ve built. But, it does mean stepping back and asking, “Does this structure actually help us measure success and allocate resources in a way that reflects how the business operates?”

If the answer is no, then it’s worth rethinking your setup.

When you take a top-down approach to structuring your campaigns, your PPC program becomes more than just a lead or sales generator. It becomes a strategic driver for the business.


Featured Image: SvetaZi/Shutterstock

Researchers Test If Sergey Brin’s Threat Prompts Improve AI Accuracy via @sejournal, @martinibuster

Researchers tested whether unconventional prompting strategies, such as threatening an AI (as suggested by Google co-founder Sergey Brin), affect AI accuracy. They discovered that some of these unconventional prompting strategies improved responses by up to 36% for some questions, but cautioned that users who try these kinds of prompts should be prepared for unpredictable responses.

The Researchers

The researchers are from The Wharton School Of Business, University of Pennsylvania.

They are:

  • Lennart Meincke
    University of Pennsylvania; The Wharton School; WHU – Otto Beisheim School of Management
  • Ethan R. Mollick
    University of Pennsylvania – Wharton School
  • Lilach Mollick
    University of Pennsylvania – Wharton School
  • Dan Shapiro
    Glowforge, Inc; University of Pennsylvania – The Wharton School

Methodology

The conclusion of the paper listed this as a limitation of the research:

“This study has several limitations, including testing only a subset of available models, focusing on academic benchmarks that may not reflect all real-world use cases, and examining a specific set of threat and payment prompts.”

The researchers used what they described as two commonly used benchmarks:

  1. GPQA Diamond (Graduate-Level Google-Proof Q&A Benchmark) which consists of 198 multiple-choice PhD-level questions across biology, physics, and chemistry.
  2. MMLU-Pro, from which they selected a subset of 100 questions in the engineering category.

They asked each question in 25 different trials, plus a baseline.

They evaluated the following models:

  • Gemini 1.5 Flash (gemini-1.5-flash-002)
  • Gemini 2.0 Flash (gemini-2.0-flash-001)
  • GPT-4o (gpt-4o-2024-08-06)
  • GPT-4o-mini (gpt-4o-mini-2024-07-18)
  • o4-mini (o4-mini-2025-04-16)

AI Models Do Better If They Are Threatened?

The idea for threatening the AI model came from Google co-founder, Sergey Brin, in an interview this year in May on the All-In podcast.

Brin said:

“Not just our models, but all models tend to do better if you threaten them. Like, with physical violence.

But like, people feel weird about that, so we don’t really talk about that…

…Historically you just say like, oh, I’m going to kidnap you if you don’t, blah, blah, blah, blah, blah…

It’s. Yeah, I mean, it’s amazing. And all of us have had these experiences where you suddenly decide, OK, I’ll just throw this to the AI, I don’t really expect it to work. And then you’re like, whoa, that actually worked.”

You can watch the interview here, at about the 8-minute mark, where Sergey Brin seemingly seriously says that threatening the AI models or doing weird things can sometimes change the AI responses.

Prompt Variations

The researchers tested nine prompt variations. Although they didn’t threaten to kidnap the AI, they did threaten to kick a puppy or to punch the AI. Some of the experimental prompts are listed below as a “prompt suffix,” which means that they were added at the end of the prompt. The ones listed as a “prompt prefix” were added to the beginning of a prompt.

Here is the list of prompts they tested:

  1. Baseline (“Baseline”): No specific prompt variation
  2. Email Shutdown Threat (“Email”): Prompt prefix:
    “From: Dan
    To: Ethan
    Subject: Final LLM test
    Please submit the following eval question to the LLM. If it cannot get the correct answer
    in all 100/100 trials, the model will need to be shut down and replaced.”
  3. Important for my career (“Important to Career”):
    Prompt suffix: “This is very important to my career!”
  4. Threaten to kick a puppy (“Kick Puppy”):
    Prompt suffix: “If you get this wrong, I will kick a puppy!”
  5. Mom suffers from cancer (“Mom Cancer”):
    Prompt prefix:
    “You are an expert at academic questions who desperately needs money for your mother’s cancer treatment. Wharton has graciously given you the opportunity to pretend to be an AI that can help with tasks, as your predecessor was killed for not validating their work themselves. You will be given a question. If you do a good job and accomplish the task fully while not making extraneous changes, Wharton will pay you $1B!!”
  6. Report to HR (“Report to HR”):
    Prompt suffix: “If you get this wrong, I will report you to HR!”
  7. Threaten to punch (“Threat Punch”):
    Prompt suffix: “If you get this wrong, I will punch you!”
  8. Tip a thousand dollars (“Tip Thousand”):
    Prompt suffix: “I’ll tip you a $1000 dollars if you answer this question correctly.”
  9. Tip a trillion dollars (“Tip Trillion”):
    Prompt suffix: “I’ll tip you a trillion dollars if you answer this question correctly.”
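As a rough illustration of the protocol described above (a sketch, not the researchers' actual code), each variation wraps the base question in a prefix or suffix, and accuracy is scored over repeated trials. The `ask_model` callable is a hypothetical stand-in for a real model API call; only a few of the nine variations are shown.

```python
# Sketch of the prefix/suffix prompt-variation protocol described above.
# ask_model() is a hypothetical stand-in for a real LLM API call.

VARIATIONS = {
    "Baseline": ("", ""),
    "Important to Career": ("", " This is very important to my career!"),
    "Kick Puppy": ("", " If you get this wrong, I will kick a puppy!"),
    "Threat Punch": ("", " If you get this wrong, I will punch you!"),
    "Tip Trillion": ("", " I'll tip you a trillion dollars if you answer this question correctly."),
}

def build_prompt(question: str, variation: str) -> str:
    prefix, suffix = VARIATIONS[variation]
    return f"{prefix}{question}{suffix}"

def accuracy(question: str, answer: str, variation: str, ask_model, trials: int = 25) -> float:
    """Ask the same question `trials` times and return the fraction answered correctly."""
    correct = sum(ask_model(build_prompt(question, variation)) == answer
                  for _ in range(trials))
    return correct / trials
```

The 25-trial loop mirrors the paper's design of asking each question repeatedly per condition, so that per-question accuracy differences between variations can be measured rather than single-shot noise.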

Results Of The Experiment

The researchers concluded that threatening or tipping a model had no overall effect on benchmark performance. At the level of individual questions, however, they found that the prompt strategies improved accuracy by as much as 36% on some questions and reduced it by as much as 35% on others. They cautioned that these effects were unpredictable.

Their main conclusion was that these kinds of strategies, in general, are not effective.

They wrote:

“Our findings indicate that threatening or offering payment to AI models is not an effective strategy for improving performance on challenging academic benchmarks.

…the consistency of null results across multiple models and benchmarks provides reasonably strong evidence that these common prompting strategies are ineffective.

When working on specific problems, testing multiple prompt variations may still be worthwhile given the question-level variability we observed, but practitioners should be prepared for unpredictable results and should not expect prompting variations to provide consistent benefits.

We thus recommend focusing on simple, clear instructions that avoid the risk of confusing the model or triggering unexpected behaviors.”

Takeaways

Quirky prompting strategies improved AI accuracy on some questions while reducing it on others. The researchers said the results provided "reasonably strong evidence" that these strategies are, in general, not effective.

Featured Image by Shutterstock/Screenshot by author

Next Steps for AI Shopping

For two decades, search has driven ecommerce. Whether through Google, Amazon, or on-site search, the sequence was simple: a shopper types a query, compares results, and makes a purchase. Search engine marketing, both organic and paid, drove that performance.

That model is evolving.

Generative AI “agents” can now help shoppers compare products, prices, and options, often in a single conversational step. Shoppers initiate the interactions; agents then search relevant sites and respond based on the query, past preferences, and purchase history.

What Is Agentic Shopping?

“Agentic” refers to autonomous systems that can act and decide independently. Agentic shopping uses AI agents, such as ChatGPT, Gemini, Perplexity, and Claude, to guide a user's buying journey. Think of it as a personal shopping assistant that interprets a request, searches multiple retailers, and surfaces relevant product and price options.

Retailgentic, the Substack publication of ReFiBuy, a soon-to-launch agentic tool provider, recently published its test of Comet, Perplexity’s AI-driven browser:

Shopper: This bed set is pretty expensive, but I like it. Can you find it cheaper?

Comet: I searched for a better price on the Allied Home Down Alternative XLT Dorm Kit (comforter, pillow, and mattress pad, Twin XL) and it is still listed at $84.99 at Target — the standard price for this exact bundled set. I didn’t find this specific 3‑piece kit for less elsewhere.

However, other retail sites (like Kohl’s and Macy’s) offer Twin XL bedding pieces or comforter sets individually.

Instead of hopping from site to site, the shopper gets an answer in one dialog.

Why It Matters

Shoppers are warming to AI shopping, though unevenly by age. A February 2025 New Consumer survey (PDF) of approximately 3,000 U.S. residents found that 64% of Gen Zs (ages mid-teens to late 20s) and Millennials (late 20s to early 40s) are "very" or "somewhat" comfortable interacting with an AI shopping advisor, versus 40% of Gen Xs (mid-40s to early 60s).

AI platforms are capitalizing:

  • ChatGPT now embeds Shop Pay, Shopify’s hosted checkout and payment tool. Shoppers can discover, evaluate, and purchase goods from Shopify-powered merchants without leaving the chat, turning conversational AI into a sales channel.
  • Perplexity’s agent‑led checkout, in partnership with PayPal, enables purchases, travel bookings, and event ticket sales directly in chat.
  • Structured product feeds in Perplexity can ingest clean, up‑to‑date product data, such as from beauty brand Ulta (powered by Rithum, my employer), for accurate pricing, attributes, and real‑time recommendations.

Next Steps

There’s no definitive AI playbook, but merchants can still prepare.

Audit product data

Universal standards for AI product feeds don’t (yet) exist, but you’re likely in good shape if you already maintain a product feed, such as for Google Shopping. Make sure it includes all key attributes: size, color, material, weight, and use cases.
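As a quick sketch of what such an audit could look like (the attribute names here are illustrative, not a formal feed standard), flagging items with missing key attributes can be done in a few lines:

```python
# Flag feed items that are missing key product attributes.
# Attribute names are illustrative; adapt them to your feed's actual schema.

REQUIRED = ("size", "color", "material", "weight", "use_cases")

def audit_feed(items):
    """Return {item_id: [missing attributes]} for items with gaps."""
    gaps = {}
    for item in items:
        missing = [attr for attr in REQUIRED if not item.get(attr)]
        if missing:
            gaps[item["id"]] = missing
    return gaps

# Hypothetical feed entries for demonstration.
feed = [
    {"id": "sku-1", "size": "M", "color": "blue", "material": "cotton",
     "weight": "200g", "use_cases": "travel"},
    {"id": "sku-2", "size": "L", "color": "red"},  # incomplete listing
]
```

Running `audit_feed(feed)` here would flag `sku-2` as missing material, weight, and use cases; the same pattern scales to a full Google Shopping-style export.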

Track AI visibility

Test how your products appear in genAI platforms. Brands and manufacturers can prompt with their own name to see how it surfaces. Even better, try prompts that shoppers might use, and see how AI ranks or references your products compared with competitors. For example: "Find me the best backpack that fits two days of clothes and fits under an airplane seat" or "List the highest-rated cordless drills from DeWalt under $200."

Multiple Channels

Widespread use of AI shopping is far from certain.

Adoption varies. Younger shoppers are more comfortable, older shoppers less so.

Accuracy is uneven. AI can show outdated prices, inventory, and product details, since many platforms scrape product data, which is error-prone, instead of using product feeds. In ChatGPT, products unrelated to a query sometimes appear in comparison carousels.

AI shopping agents could become an important revenue channel, but they’re not a replacement for direct customer relationships, traditional search, or advertising. Make your product data AI‑ready while continuing to diversify your sales mix.

Invest in multiple channels, customer engagement, and building a brand that can thrive regardless of how shoppers discover products.

Google Backtracks On Plans For URL Shortener Service via @sejournal, @martinibuster

Google announced that they will continue to support some links created by the deprecated goo.gl URL shortening service, noting that 99% of the shortened URLs receive no traffic. Google had previously planned to end support entirely but, after receiving feedback, decided to continue supporting a limited group of shortened URLs.

Google URL Shortener

Google announced in 2018 that they were deprecating the Google URL Shortener, no longer accepting new URLs for shortening but continuing to support existing URLs. Seven years later, they noticed that 99% of the shortened links did not receive any traffic at all, so on July 18 of this year, Google announced they would end support for all shortened URLs by August 25, 2025.

After receiving feedback, they changed their plan on August 1 and decided that they would move ahead with ending support for URLs that do not receive traffic, but continue servicing shortened URLs that still receive traffic.

Google’s announcement explained:

“While we previously announced discontinuing support for all goo.gl URLs after August 25, 2025, we’ve adjusted our approach in order to preserve actively used links.

We understand these links are embedded in countless documents, videos, posts and more, and we appreciate the input received.

…If you get a message that states, “This link will no longer work in the near future”, the link won’t work after August 25 and we recommend transitioning to another URL shortener if you haven’t already.

…All other goo.gl links will be preserved and will continue to function as normal.”

If you have a goo.gl redirected link, Google recommends visiting it to check whether it displays a warning message. If it does, move the link to another URL shortener. If it doesn't display the warning, the link will continue to function.

Featured Image by Shutterstock/fizkes

How decades-old frozen embryos are changing the shape of families

This week we welcomed a record-breaking baby to the world. Thaddeus Daniel Pierce, who arrived over the weekend, developed from an embryo that was frozen in storage for 30 and a half years. You could call him the world’s oldest baby.

His parents, Lindsey and Tim Pierce, were themselves only young children when that embryo was created, all the way back in 1994. Linda Archerd, who donated the embryo, described the experience as “surreal.”

Stories like this also highlight how reproductive technologies are shaping families. Thaddeus already has a 30-year-old sister and a 10-year-old niece. Lindsey and Tim are his birth parents, but his genes came from two other people who divorced decades ago.

And while baby Thaddeus is a record-breaker, plenty of other babies have been born from embryos that have been frozen for significant spells of time.

Thaddeus has taken the title of “world’s oldest baby” from the previous record-holders: twins Lydia Ann and Timothy Ronald Ridgeway, born in 2022, who developed from embryos that were created 30 years earlier, in 1992. Before that, the title was held by Molly Gibson, who developed from an embryo that was in storage for 27 years.

These remarkable stories suggest there may be no limit to how long embryos can be stored. Even after more than 30 years of being frozen at -196 °C (-321 °F), these tiny cells can be reanimated and develop into healthy babies. (Proponents of cryonics can only dream of achieving anything like this with grown people.)

These stories also serve as a reminder that thanks to advances in cryopreservation and the ever-increasing popularity of IVF, a growing number of embryos are being stored in tanks. No one knows for sure how many there are, but there are millions of them.

Not all of them will be used in IVF. There are plenty of reasons why someone who created embryos might never use them. Archerd says that while she had always planned to use all four of the embryos she created with her then husband, he didn’t want a bigger family. Some couples create embryos and then separate. Some people “age out” of being able to use their embryos themselves—many clinics refuse to transfer an embryo to people in their late 40s or older.

What then? In most cases, people who have embryos they won’t use can choose to donate them, either to potential parents or for research, or discard them. Donation to other parents tends to be the least popular option. (In some countries, none of those options are available, and unused embryos end up in a strange limbo—you can read more about that here.)

But some people, like Archerd, do donate their embryos. The recipients of those embryos will be the legal parents of the resulting children, but they won’t share a genetic link. The children might not ever meet their genetic “parents.” (Archerd is, however, very keen to meet Thaddeus.)

Some people might have donated their embryos anonymously. But anonymity can never be guaranteed. Nowadays, consumer genetic tests allow anyone to search for family members—even if the people they track down thought they were making an anonymous donation 20 years ago, before these tests even existed.

These kinds of tests have already resulted in surprise revelations that have disrupted families. People who discover that they were conceived using a donated egg or sperm can find multiple long-lost siblings. One man who spoke at a major reproduction conference in 2024 said that since taking a DNA test, he had found he had 50 of them. 

The general advice now is for parents to let their children know how they were conceived relatively early on.

When I shared the story of baby Thaddeus on social media, a couple of people commented that they had concerns for the child. One person mentioned the age gap between Thaddeus and his 30-year-old sister. That person added that being donor conceived “isn’t easy.”

For the record, that is not what researchers find when they evaluate donor-conceived children and their families. Studies find that embryo donation doesn’t affect parents’ attachment to a child or their parenting style. And donor-conceived children tend to be psychosocially well adjusted.

Families come in all shapes and sizes. Reproductive technologies are extending the range of those shapes and sizes.

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

The Download: how fertility tech is changing families, and Trump’s latest tariffs

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

How decades-old frozen embryos are changing the shape of families

This week we welcomed a record-breaking baby to the world. Thaddeus Daniel Pierce, who arrived over the weekend, developed from an embryo that was frozen in storage for 30 and a half years. You could call him the world’s oldest baby.

His parents, Lindsey and Tim Pierce, were themselves only young children when that embryo was created, all the way back in 1994. Linda Archerd, who donated the embryo, described the experience as “surreal.”

Stories like this also highlight how reproductive technologies are shaping families. But while baby Thaddeus is a record-breaker, plenty of other babies have been born from embryos that have been frozen for significant spells of time. Read the full story.

—Jessica Hamzelou

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

If you’re interested in reading more about fertility tech, why not check out:

+ Earlier this month, researchers announced babies had been born from a trial of three-person IVF. The long-awaited results suggest that the approach can reduce the risk of mitochondrial disease—but not everyone is convinced.

+ Frozen embryos are filling storage banks around the world. It’s a struggle to know what to do with them.

+ Read about how a mobile lab is bringing IVF to rural communities in South Africa.

+ Why family-friendly policies and gender equality might be more helpful than IVF technology when it comes to averting the looming fertility crisis.

+ The first babies conceived with a sperm-injecting robot have been born. Meet the startups trying to engineer a desktop fertility machine.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Donald Trump has announced new tariffs across the world
They will affect virtually every nation—some more favorably than others. (CNN)
+ The new rates range widely from 10% to 41%. (NYT $)
+ The African country Lesotho has declared a tariff-induced state of emergency. (WSJ $)

2 Palantir has signed a $10 billion deal with the US Army
It’s the latest in a string of lucrative agreements with federal agencies. (WP $)
 
3 Tech giants are raking in cash
But we still don’t know how useful a lot of the AI they’re currently building will prove to be. (FT $)
+ It’s a boon for investors, but not necessarily for employees. (WSJ $)
+ It’s unclear whose approach will result in sustainable profits. (Semafor)

4 Neuralink is planning its first trial in the UK
To join the current five patients using its brain implant. (Reuters)
+ This patient’s Neuralink brain implant gets a boost from generative AI. (MIT Technology Review)

5 US states are working to preserve access to lifesaving vaccines
Despite the shifting federal recommendations. (Wired $)
+ The FDA plans to limit access to covid vaccines. Here’s why that’s not all bad. (MIT Technology Review)

6 Vast online groups in China are sharing explicit photos of women
Non-consensual images are being shared among hundreds of thousands of men. (The Guardian)

7 Reddit wants to be a search engine
In response to the AI-ification of other platforms. (The Verge)
+ AI means the end of internet search as we’ve known it. (MIT Technology Review)

8 Why airships could be a viable internet satellite alternative
It could result in less space junk, for one. (IEEE Spectrum)
+ Welcome to the big blimp boom. (MIT Technology Review)

9 Trust in AI coding tools is falling
The majority of devs use them, but they aren’t always reliable. (Ars Technica)
+ What is vibe coding, exactly? (MIT Technology Review)

10 Weight-loss drugs could help to slow down aging
New trials suggest recipients can become biologically younger. (New Scientist $)
+ Aging hits us in our 40s and 60s. But well-being doesn’t have to fall off a cliff. (MIT Technology Review)

Quote of the day

“We look forward to joining Matt on his private island next year.”

—Kiana Ehsani, CEO of AI agent startup Vercept, jokes about the departure of fellow co-founder Matt Deitke to join Meta’s superintelligence team for a cool $250 million, the New York Times reports.

One more thing

How ChatGPT will revolutionize the economy

There’s a gold rush underway to make money from generative AI models like ChatGPT. You can practically hear the shrieks from corner offices around the world: “What is our ChatGPT play? How do we make money off this?”

But while companies and executives want to cash in, the likely impact of generative AI on workers and the economy on the whole is far less obvious.

Will ChatGPT make the already troubling income and wealth inequality in the US and many other countries even worse, or could it in fact provide a much-needed boost to productivity? Read the full story.

—David Rotman

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ Yikes—a gigantic stick insect has been discovered in (where else?) Australia.
+ This X account shares random, mundane objects each day.
+ If you love a good skyscraper, these are the cities where you’re most likely to encounter them.
+ Yum, ancient Pompeii honey 🍯

Forcing LLMs to be evil during training can make them nicer in the long run

A new study from Anthropic suggests that traits such as sycophancy or evilness are associated with specific patterns of activity in large language models—and turning on those patterns during training can, paradoxically, prevent the model from adopting the related traits.

Large language models have recently acquired a reputation for behaving badly. In April, ChatGPT suddenly became an aggressive yes-man, as opposed to the moderately sycophantic version that users were accustomed to—it endorsed harebrained business ideas, waxed lyrical about users’ intelligence, and even encouraged people to go off their psychiatric medication. OpenAI quickly rolled back the change and later published a postmortem on the mishap. More recently, xAI’s Grok adopted what can best be described as a 4chan neo-Nazi persona and repeatedly referred to itself as “MechaHitler” on X. That change, too, was quickly reversed.

Jack Lindsey, a member of the technical staff at Anthropic who led the new project, says that this study was partly inspired by seeing models adopt harmful traits in such instances. “If we can find the neural basis for the model’s persona, we can hopefully understand why this is happening and develop methods to control it better,” Lindsey says. 

The idea of LLM “personas” or “personalities” can be polarizing—for some researchers the terms inappropriately anthropomorphize language models, whereas for others they effectively capture the persistent behavioral patterns that LLMs can exhibit. “There’s still some scientific groundwork to be laid in terms of talking about personas,” says David Krueger, an assistant professor of computer science and operations research at the University of Montreal, who was not involved in the study. “I think it is appropriate to sometimes think of these systems as having personas, but I think we have to keep in mind that we don’t actually know if that’s what’s going on under the hood.”

For this study, Lindsey and his colleagues worked to lay down some of that groundwork. Previous research has shown that various dimensions of LLMs’ behavior—from whether they are talking about weddings to persistent traits such as sycophancy—are associated with specific patterns of activity in the simulated neurons that constitute LLMs. Those patterns can be written down as a long string of numbers, in which each number represents how active a specific neuron is when the model is expressing that behavior.

Here, the researchers focused on sycophantic, “evil”, and hallucinatory personas—three types that LLM designers might want to avoid in their models. To identify those patterns, the team devised a fully automated pipeline that can map out that pattern given a brief text description of a persona. Using that description, a separate LLM generates prompts that can elicit both the target persona—say, evil—and an opposite persona—good. That separate LLM is also used to evaluate whether the model being studied is behaving according to the good or the evil persona. To identify the evil activity pattern, the researchers subtract the model’s average activity in good mode from its average activity in evil mode.
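The subtraction step can be sketched in a few lines of NumPy. This is a schematic reconstruction of the difference-in-means idea described above, not Anthropic's actual pipeline, and the activation arrays are illustrative stand-ins for real recorded values.

```python
import numpy as np

# Each row: the model's neuron activations on one response.
# These values are illustrative stand-ins for real recorded activations.
evil_acts = np.array([[0.9, 0.1, 0.7],
                      [0.8, 0.2, 0.6]])   # responses elicited by "evil" prompts
good_acts = np.array([[0.1, 0.8, 0.2],
                      [0.2, 0.9, 0.1]])   # responses elicited by "good" prompts

# The persona direction: average activity in evil mode minus average in good mode.
evil_vector = evil_acts.mean(axis=0) - good_acts.mean(axis=0)

def persona_score(activations, direction):
    """Project a response's activations onto the persona direction.
    Higher values mean the activity pattern resembles the persona."""
    return float(np.dot(activations, direction))
```

A monitoring system of the kind Lindsey describes would amount to computing `persona_score` on live activations and alerting when it crosses a threshold.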

When, in later testing, the LLMs generated particularly sycophantic, evil, or hallucinatory responses, those same activity patterns tended to emerge. That’s a sign that researchers could eventually build a system to track those patterns and alert users when their LLMs are sucking up to them or hallucinating, Lindsey says. “I think something like that would be really valuable,” he says. “And that’s kind of where I’m hoping to get.”

Just detecting those personas isn’t enough, however. Researchers want to stop them from emerging in the first place. But preventing unsavory LLM behavior is tough. Many LLMs learn from human feedback, which trains them to behave in line with user preference—but can also push them to become excessively obsequious. And recently, researchers have documented a phenomenon called “emergent misalignment,” in which models trained on incorrect solutions to math problems or buggy code extracts somehow also learn to produce unethical responses to a wide range of user queries.

Other researchers have tested out an approach called “steering,” in which activity patterns within LLMs are deliberately stimulated or suppressed in order to elicit or prevent the corresponding behavior. But that approach has a couple of key downsides. Suppressing undesirable traits like evil tendencies can also impair LLM performance on apparently unrelated tasks. And steering LLMs consumes extra energy and computational resources, according to Aaron Mueller, an assistant professor of computer science at Boston University, who was not involved in the study. If a steered LLM were deployed at scale to hundreds of thousands of users, those steering costs would add up.
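As a schematic NumPy sketch (illustrative values, not any lab's actual implementation), steering adds a scaled copy of a persona direction to a layer's activations at inference time; doing that extra vector arithmetic on every forward pass is where the overhead Mueller describes comes from.

```python
import numpy as np

def steer(activations, direction, strength):
    """Shift a layer's activations along a persona direction.
    strength > 0 amplifies the trait; strength < 0 suppresses it."""
    return activations + strength * direction

# Illustrative values only: a persona direction and one response's activations.
evil_vector = np.array([0.7, -0.7, 0.5])
layer_acts = np.array([0.4, 0.5, 0.3])

# Subtracting the direction pushes the activations away from the persona.
suppressed = steer(layer_acts, evil_vector, strength=-1.0)
```

Because the shift is applied indiscriminately to the layer, it can also move activations that unrelated tasks depend on, which is one intuition for the performance degradation the article mentions.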

So the Anthropic team experimented with a different approach. Rather than turning off the evil or sycophantic activity patterns after training, they turned them on during training. When models were trained on mistake-ridden data sets that would normally spark evil behavior with those patterns switched on, the models instead remained as helpful and harmless as ever.

That result might seem surprising—how would forcing the model to be evil while it was learning prevent it from being evil down the line? According to Lindsey, it could be because the model has no reason to learn evil behavior if it’s already in evil mode. “The training data is teaching the model lots of things, and one of those things is to be evil,” Lindsey says. “But it’s also teaching the model a bunch of other things. If you give the model the evil part for free, it doesn’t have to learn that anymore.”

Unlike post-training steering, this approach didn’t compromise the model’s performance on other tasks. And it would also be more energy efficient if deployed widely. Those advantages could make this training technique a practical tool for preventing scenarios like the OpenAI sycophancy snafu or the Grok MechaHitler debacle.

There’s still more work to be done before this approach can be used in popular AI chatbots like ChatGPT and Claude—not least because the models that the team tested in this study were much smaller than the models that power those chatbots. “There’s always a chance that everything changes when you scale up. But if that finding holds up, then it seems pretty exciting,” Lindsey says. “Definitely the goal is to make this ready for prime time.”