How social media encourages the worst of AI boosterism

Demis Hassabis, CEO of Google DeepMind, summed it up in three words: “This is embarrassing.”  

Hassabis was replying on X to an overexcited post by Sébastien Bubeck, a research scientist at the rival firm OpenAI, announcing that two mathematicians had used OpenAI’s latest large language model, GPT-5, to find solutions to 10 unsolved problems in mathematics. “Science acceleration via AI has officially begun,” Bubeck crowed.

Put your math hats on for a minute, and let’s take a look at what this beef from mid-October was about. It’s a perfect example of what’s wrong with AI right now.

Bubeck was excited that GPT-5 seemed to have somehow solved a number of puzzles known as Erdős problems.

Paul Erdős, one of the most prolific mathematicians of the 20th century, left behind hundreds of puzzles when he died. To help keep track of which ones have been solved, Thomas Bloom, a mathematician at the University of Manchester, UK, set up erdosproblems.com, which lists more than 1,100 problems and notes that around 430 of them come with solutions. 

When Bubeck celebrated GPT-5’s breakthrough, Bloom was quick to call him out. “This is a dramatic misrepresentation,” he wrote on X. Bloom explained that a problem isn’t necessarily unsolved if this website does not list a solution. That simply means Bloom wasn’t aware of one. There are millions of mathematics papers out there, and nobody has read all of them. But GPT-5 probably has.

It turned out that instead of coming up with new solutions to 10 unsolved problems, GPT-5 had scoured the internet for 10 existing solutions that Bloom hadn’t seen before. Oops!

There are two takeaways here. One is that breathless claims about big breakthroughs shouldn’t be made via social media: Less knee jerk and more gut check.

The second is that GPT-5’s ability to find references to previous work that Bloom wasn’t aware of is also amazing. The hype overshadowed something that should have been pretty cool in itself.

Mathematicians are very interested in using LLMs to trawl through vast numbers of existing results, François Charton, a research scientist who studies the application of LLMs to mathematics at the AI startup Axiom Math, told me when I talked to him about this Erdős gotcha.

But literature search is dull compared with genuine discovery, especially to AI’s fervent boosters on social media. Bubeck’s blunder isn’t the only example.

In August, a pair of mathematicians showed that no LLM at the time was able to solve a math puzzle known as Yu Tsumura’s 554th Problem. Two months later, social media erupted with evidence that GPT-5 now could. “Lee Sedol moment is coming for many,” one observer commented, referring to the Go master who lost to DeepMind’s AI AlphaGo in 2016.

But Charton pointed out that solving Yu Tsumura’s 554th Problem isn’t a big deal to mathematicians. “It’s a question you would give an undergrad,” he said. “There is this tendency to overdo everything.”

Meanwhile, more sober assessments of what LLMs may or may not be good at are coming in. At the same time that mathematicians were fighting on the internet about GPT-5, two new studies came out that looked in depth at the use of LLMs in medicine and law (two fields that model makers have claimed their tech excels at). 

Researchers found that LLMs could make certain medical diagnoses, but they were flawed at recommending treatments. When it comes to law, researchers found that LLMs often give inconsistent and incorrect advice. “Evidence thus far spectacularly fails to meet the burden of proof,” the authors concluded.

But that’s not the kind of message that goes down well on X. “You’ve got that excitement because everybody is communicating like crazy—nobody wants to be left behind,” Charton said. X is where a lot of AI news drops first, it’s where new results are trumpeted, and it’s where key players like Sam Altman, Yann LeCun, and Gary Marcus slug it out in public. It’s hard to keep up—and harder to look away.

Bubeck’s post was only embarrassing because his mistake was caught. Not all errors are. Unless something changes researchers, investors, and non-specific boosters will keep teeing each other up. “Some of them are scientists, many are not, but they are all nerds,” Charton told me. “Huge claims work very well on these networks.”

*****

There’s a coda! I wrote everything you’ve just read above for the Algorithm column in the January/February 2026 issue of MIT Technology Review magazine (out very soon). Two days after that went to press, Axiom told me its own math model, AxiomProver, had solved two open Erdős problems (#124 and #481, for the math fans in the room). That’s impressive stuff for a small startup founded just a few months ago. Yup—AI moves fast!

But that’s not all. Five days later the company announced that AxiomProver had solved nine out of 12 problems in this year’s Putnam competition, a college-level math challenge that some people consider harder than the better-known International Math Olympiad (which LLMs from both Google DeepMind and OpenAI aced a few months back). 

The Putnam result was lauded on X by big names in the field, including Jeff Dean, chief scientist at Google DeepMind, and Thomas Wolf, cofounder at the AI firm Hugging Face. Once again familiar debates played out in the replies. A few researchers pointed out that while the International Math Olympiad demands more creative problem-solving, the Putnam competition tests math knowledge—which makes it notoriously hard for undergrads, but easier, in theory, for LLMs that have ingested the internet.

How should we judge Axiom’s achievements? Not on social media, at least. And the eye-catching competition wins are just a starting point. Determining just how good LLMs are at math will require a deeper dive into exactly what these models are doing when they solve hard (read: hard for humans) math problems.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Why it’s time to reset our expectations for AI

Can I ask you a question: How do you feel about AI right now? Are you still excited? When you hear that OpenAI or Google just dropped a new model, do you still get that buzz? Or has the shine come off it, maybe just a teeny bit? Come on, you can be honest with me.

Truly, I feel kind of stupid even asking the question, like a spoiled brat who has too many toys at Christmas. AI is mind-blowing. It’s one of the most important technologies to have emerged in decades (despite all its many many drawbacks and flaws and, well, issues).

At the same time I can’t help feeling a little bit: Is that it?

If you feel the same way, there’s good reason for it: The hype we have been sold for the past few years has been overwhelming. We were told that AI would solve climate change. That it would reach human-level intelligence. That it would mean we no longer had to work!

Instead we got AI slop, chatbot psychosis, and tools that urgently prompt you to write better email newsletters. Maybe we got what we deserved. Or maybe we need to reevaluate what AI is for.

That’s the reality at the heart of a new series of stories, published today, called Hype Correction. We accept that AI is still the hottest ticket in town, but it’s time to re-set our expectations.

As my colleague Will Douglas Heaven puts it in the package’s intro essay, “You can’t help but wonder: When the wow factor is gone, what’s left? How will we view this technology a year or five from now? Will we think it was worth the colossal costs, both financial and environmental?” 

Elsewhere in the package, James O’Donnell looks at Sam Altman, the ultimate AI hype man, through the medium of his own words. And Alex Heath explains the AI bubble, laying out for us what it all means and what we should look out for.

Michelle Kim analyzes one of the biggest claims in the AI hype cycle: that AI would completely eliminate the need for certain classes of jobs. If ChatGPT can pass the bar, surely that means it will replace lawyers? Well, not yet, and maybe not ever. 

Similarly, Edd Gent tackles the big question around AI coding. Is it as good as it sounds? Turns out the jury is still out. And elsewhere David Rotman looks at the real-world work that needs to be done before AI materials discovery has its breakthrough ChatGPT moment.

Meanwhile, Garrison Lovely spends time with some of the biggest names in the AI safety world and asks: Are the doomers still okay? I mean, now that people are feeling a bit less scared about their impending demise at the hands of superintelligent AI? And Margaret Mitchell reminds us that hype around generative AI can blind us to the AI breakthroughs we should really celebrate.

Let’s remember: AI was here before ChatGPT and it will be here after. This hype cycle has been wild, and we don’t know what its lasting impact will be. But AI isn’t going anywhere. We shouldn’t be so surprised that those dreams we were sold haven’t come true—yet.

The more likely story is that the real winners, the killer apps, are still to come. And a lot of money is being bet on that prospect. So yes: The hype could never sustain itself over the short term. Where we’re at now is maybe the start of a post-hype phase. In an ideal world, this hype correction will reset expectations. 

Let’s all catch our breath, shall we?

This story first appeared in The Algorithm, our weekly free newsletter all about AI. Sign up to read past editions here.

The State of AI: Welcome to the economic singularity

Welcome back to The State of AI, a new collaboration between the Financial Times and MIT Technology Review. Every Monday for the next two weeks, writers from both publications will debate one aspect of the generative AI revolution reshaping global power.

This week, Richard Waters, FT columnist and former West Coast editor, talks with MIT Technology Review’s editor at large David Rotman about the true impact of AI on the job market.

Bonus: If you’re an MIT Technology Review subscriber, you can join David and Richard, alongside MIT Technology Review’s editor in chief, Mat Honan, for an exclusive conversation live on Tuesday, December 9 at 1pm ET about this topic. Sign up to be a part here.

Richard Waters writes:

Any far-reaching new technology is always uneven in its adoption, but few have been more uneven than generative AI. That makes it hard to assess its likely impact on individual businesses, let alone on productivity across the economy as a whole.

At one extreme, AI coding assistants have revolutionized the work of software developers. Mark Zuckerberg recently predicted that half of Meta’s code would be written by AI within a year. At the other extreme, most companies are seeing little if any benefit from their initial investments. A widely cited study from MIT found that so far, 95% of gen AI projects produce zero return.

That has provided fuel for the skeptics who maintain that—by its very nature as a probabilistic technology prone to hallucinating—generative AI will never have a deep impact on business.

To many students of tech history, though, the lack of immediate impact is just the normal lag associated with transformative new technologies. Erik Brynjolfsson, then an assistant professor at MIT, first described what he called the “productivity paradox of IT” in the early 1990s. Despite plenty of anecdotal evidence that technology was changing the way people worked, it wasn’t showing up in the aggregate data in the form of higher productivity growth. Brynjolfsson’s conclusion was that it just took time for businesses to adapt.

Big investments in IT finally showed through with a notable rebound in US productivity growth starting in the mid-1990s. But that tailed off a decade later and was followed by a second lull.

Richard Waters and David Rotman

FT/MIT TECHNOLOGY REVIEW | ADOBE STOCK

In the case of AI, companies need to build new infrastructure (particularly data platforms), redesign core business processes, and retrain workers before they can expect to see results. If a lag effect explains the slow results, there may at least be reasons for optimism: Much of the cloud computing infrastructure needed to bring generative AI to a wider business audience is already in place.

The opportunities and the challenges are both enormous. An executive at one Fortune 500 company says his organization has carried out a comprehensive review of its use of analytics and concluded that its workers, overall, add little or no value. Rooting out the old software and replacing that inefficient human labor with AI might yield significant results. But, as this person says, such an overhaul would require big changes to existing processes and take years to carry out.

There are some early encouraging signs. US productivity growth, stuck at 1% to 1.5% for more than a decade and a half, rebounded to more than 2% last year. It probably hit the same level in the first nine months of this year, though the lack of official data due to the recent US government shutdown makes this impossible to confirm.

It is impossible to tell, though, how durable this rebound will be or how much can be attributed to AI. The effects of new technologies are seldom felt in isolation. Instead, the benefits compound. AI is riding earlier investments in cloud and mobile computing. In the same way, the latest AI boom may only be the precursor to breakthroughs in fields that have a wider impact on the economy, such as robotics. ChatGPT might have caught the popular imagination, but OpenAI’s chatbot is unlikely to have the final word.

David Rotman replies: 

This is my favorite discussion these days when it comes to artificial intelligence. How will AI affect overall economic productivity? Forget about the mesmerizing videos, the promise of companionship, and the prospect of agents to do tedious everyday tasks—the bottom line will be whether AI can grow the economy, and that means increasing productivity. 

But, as you say, it’s hard to pin down just how AI is affecting such growth or how it will do so in the future. Erik Brynjolfsson predicts that, like other so-called general purpose technologies, AI will follow a J curve in which initially there is a slow, even negative, effect on productivity as companies invest heavily in the technology before finally reaping the rewards. And then the boom. 

But there is a counterexample undermining the just-be-patient argument. Productivity growth from IT picked up in the mid-1990s but since the mid-2000s has been relatively dismal. Despite smartphones and social media and apps like Slack and Uber, digital technologies have done little to produce robust economic growth. A strong productivity boost never came.

Daron Acemoglu, an economist at MIT and a 2024 Nobel Prize winner, argues that the productivity gains from generative AI will be far smaller and take far longer than AI optimists think. The reason is that though the technology is impressive in many ways, the field is too narrowly focused on products that have little relevance to the largest business sectors.

The statistic you cite that 95% of AI projects lack business benefits is telling. 

Take manufacturing. No question, some version of AI could help; imagine a worker on the factory floor snapping a picture of a problem and asking an AI agent for advice. The problem is that the big tech companies creating AI aren’t really interested in solving such mundane tasks, and their large foundation models, mostly trained on the internet, aren’t all that helpful. 

It’s easy to blame the lack of productivity impact from AI so far on business practices and poorly trained workers. Your example of the executive of the Fortune 500 company sounds all too familiar. But it’s more useful to ask how AI can be trained and fine-tuned to give workers, like nurses and teachers and those on the factory floor, more capabilities and make them more productive at their jobs. 

The distinction matters. Some companies announcing large layoffs recently cited AI as the reason. The worry, however, is that it’s just a short-term cost-saving scheme. As economists like Brynjolfsson and Acemoglu agree, the productivity boost from AI will come when it’s used to create new types of jobs and augment the abilities of workers, not when it is used just to slash jobs to reduce costs. 

Richard Waters responds : 

I see we’re both feeling pretty cautious, David, so I’ll try to end on a positive note. 

Some analyses assume that a much greater share of existing work is within the reach of today’s AI. McKinsey reckons 60% (versus 20% for Acemoglu) and puts annual productivity gains across the economy at as much as 3.4%. Also, calculations like these are based on automation of existing tasks; any new uses of AI that enhance existing jobs would, as you suggest, be a bonus (and not just in economic terms).

Cost-cutting always seems to be the first order of business with any new technology. But we’re still in the early stages and AI is moving fast, so we can always hope.

Further reading

FT chief economics commentator Martin Wolf has been skeptical about whether tech investment boosts productivity but says AI might prove him wrong. The downside: Job losses and wealth concentration might lead to “techno-feudalism.”

The FT‘s Robert Armstrong argues that the boom in data center investment need not turn to bust. The biggest risk is that debt financing will come to play too big a role in the buildout.

Last year, David Rotman wrote for MIT Technology Review about how we can make sure AI works for us in boosting productivity, and what course corrections will be required.

David also wrote this piece about how we can best measure the impact of basic R&D funding on economic growth, and why it can often be bigger than you might think.

The State of AI: How war will be changed forever

Welcome back to The State of AI, a new collaboration between the Financial Times and MIT Technology Review. Every Monday, writers from both publications debate one aspect of the generative AI revolution reshaping global power.

In this conversation, Helen Warrell, FT investigations reporter and former defense and security editor, and James O’Donnell, MIT Technology Review’s senior AI reporter, consider the ethical quandaries and financial incentives around AI’s use by the military.

Helen Warrell, FT investigations reporter 

It is July 2027, and China is on the brink of invading Taiwan. Autonomous drones with AI targeting capabilities are primed to overpower the island’s air defenses as a series of crippling AI-generated cyberattacks cut off energy supplies and key communications. In the meantime, a vast disinformation campaign enacted by an AI-powered pro-Chinese meme farm spreads across global social media, deadening the outcry at Beijing’s act of aggression.

Scenarios such as this have brought dystopian horror to the debate about the use of AI in warfare. Military commanders hope for a digitally enhanced force that is faster and more accurate than human-directed combat. But there are fears that as AI assumes an increasingly central role, these same commanders will lose control of a conflict that escalates too quickly and lacks ethical or legal oversight. Henry Kissinger, the former US secretary of state, spent his final years warning about the coming catastrophe of AI-driven warfare.

Grasping and mitigating these risks is the military priority—some would say the “Oppenheimer moment”—of our age. One emerging consensus in the West is that decisions around the deployment of nuclear weapons should not be outsourced to AI. UN secretary-general António Guterres has gone further, calling for an outright ban on fully autonomous lethal weapons systems. It is essential that regulation keep pace with evolving technology. But in the sci-fi-fueled excitement, it is easy to lose track of what is actually possible. As researchers at Harvard’s Belfer Center point out, AI optimists often underestimate the challenges of fielding fully autonomous weapon systems. It is entirely possible that the capabilities of AI in combat are being overhyped.

Anthony King, Director of the Strategy and Security Institute at the University of Exeter and a key proponent of this argument, suggests that rather than replacing humans, AI will be used to improve military insight. Even if the character of war is changing and remote technology is refining weapon systems, he insists, “the complete automation of war itself is simply an illusion.”

Of the three current military use cases of AI, none involves full autonomy. It is being developed for planning and logistics, cyber warfare (in sabotage, espionage, hacking, and information operations; and—most controversially—for weapons targeting, an application already in use on the battlefields of Ukraine and Gaza. Kyiv’s troops use AI software to direct drones able to evade Russian jammers as they close in on sensitive sites. The Israel Defense Forces have developed an AI-assisted decision support system known as Lavender, which has helped identify around 37,000 potential human targets within Gaza. 

Helen Warrell and James O'Donnell

FT/MIT TECHNOLOGY REVIEW | ADOBE STOCK

There is clearly a danger that the Lavender database replicates the biases of the data it is trained on. But military personnel carry biases too. One Israeli intelligence officer who used Lavender claimed to have more faith in the fairness of a “statistical mechanism” than that of a grieving soldier.

Tech optimists designing AI weapons even deny that specific new controls are needed to control their capabilities. Keith Dear, a former UK military officer who now runs the strategic forecasting company Cassi AI, says existing laws are more than sufficient: “You make sure there’s nothing in the training data that might cause the system to go rogue … when you are confident you deploy it—and you, the human commander, are responsible for anything they might do that goes wrong.”

It is an intriguing thought that some of the fear and shock about use of AI in war may come from those who are unfamiliar with brutal but realistic military norms. What do you think, James? Is some opposition to AI in warfare less about the use of autonomous systems and really an argument against war itself? 

James O’Donnell replies:

Hi Helen, 

One thing I’ve noticed is that there’s been a drastic shift in attitudes of AI companies regarding military applications of their products. In the beginning of 2024, OpenAI unambiguously forbade the use of its tools for warfare, but by the end of the year, it had signed an agreement with Anduril to help it take down drones on the battlefield. 

This step—not a fully autonomous weapon, to be sure, but very much a battlefield application of AI—marked a drastic change in how much tech companies could publicly link themselves with defense. 

What happened along the way? For one thing, it’s the hype. We’re told AI will not just bring superintelligence and scientific discovery but also make warfare sharper, more accurate and calculated, less prone to human fallibility. I spoke with US Marines, for example, who tested a type of AI while patrolling the South Pacific that was advertised to analyze foreign intelligence faster than a human could. 

Secondly, money talks. OpenAI and others need to start recouping some of the unimaginable amounts of cash they’re spending on training and running these models. And few have deeper pockets than the Pentagon. And Europe’s defense heads seem keen to splash the cash too. Meanwhile, the amount of venture capital funding for defense tech this year has already doubled the total for all of 2024, as VCs hope to cash in on militaries’ newfound willingness to buy from startups. 

I do think the opposition to AI warfare falls into a few camps, one of which simply rejects the idea that more precise targeting (if it’s actually more precise at all) will mean fewer casualties rather than just more war. Consider the first era of drone warfare in Afghanistan. As drone strikes became cheaper to implement, can we really say it reduced carnage? Instead, did it merely enable more destruction per dollar?

But the second camp of criticism (and now I’m finally getting to your question) comes from people who are well versed in the realities of war but have very specific complaints about the technology’s fundamental limitations. Missy Cummings, for example, is a former fighter pilot for the US Navy who is now a professor of engineering and computer science at George Mason University. She has been outspoken in her belief that large language models, specifically, are prone to make huge mistakes in military settings.

The typical response to this complaint is that AI’s outputs are human-checked. But if an AI model relies on thousands of inputs for its conclusion, can that conclusion really be checked by one person?

Tech companies are making extraordinarily big promises about what AI can do in these high-stakes applications, all while pressure to implement them is sky high. For me, this means it’s time for more skepticism, not less. 

Helen responds:

Hi James, 

We should definitely continue to question the safety of AI warfare systems and the oversight to which they’re subjected—and hold political leaders to account in this area. I am suggesting that we also apply some skepticism to what you rightly describe as the “extraordinarily big promises” made by some companies about what AI might be able to achieve on the battlefield. 

There will be both opportunities and hazards in what the military is being offered by a relatively nascent (though booming) defense tech scene. The danger is that in the speed and secrecy of an arms race in AI weapons, these emerging capabilities may not receive the scrutiny and debate they desperately need.

Further reading:

Michael C. Horowitz, director of Perry World House at the University of Pennsylvania, explains the need for responsibility in the development of military AI systems in this FT op-ed.

The FT’s tech podcast asks what Israel’s defense tech ecosystem can tell us about the future of warfare 

This MIT Technology Review story analyzes how OpenAI completed its pivot to allowing its technology on the battlefield.

MIT Technology Review also uncovered how US soldiers are using generative AI to help scour thousands of pieces of open-source intelligence.

An AI adoption riddle

A few weeks ago, I set out on what I thought would be a straightforward reporting journey. 

After years of momentum for AI—even if you didn’t think it would be good for the world, you probably thought it was powerful enough to take seriously—hype for the technology had been slightly punctured. First there was the underwhelming release of GPT-5 in August. Then a report released two weeks later found that 95% of generative AI pilots were failing, which caused a brief stock market panic. I wanted to know: Which companies are spooked enough to scale back their AI spending?

I searched and searched for them. As I did, more news fueled the idea of an AI bubble that, if popped, would spell doom economy-wide. Stories spread about the circular nature of AI spending, layoffs, the inability of companies to articulate what exactly AI will do for them. Even the smartest people building modern AI systems were saying the tech has not progressed as much as its evangelists promised. 

But after all my searching, companies that took these developments as a sign to perhaps not go all in on AI were nowhere to be found. Or, at least, none that were willing to admit it. What gives? 

There are several interpretations of this one reporter’s quest (which, for the record, I’m presenting as an anecdote and not a representation of the economy), but let’s start with the easy ones. First is that this is a huge score for the “AI is a bubble” believers. What is a bubble if not a situation where companies continue to spend relentlessly even in the face of worrying news? The other is that underneath the bad headlines, there’s not enough genuinely troubling news about AI to convince companies they should pivot.

But it could also be that the unbelievable speed of AI progress and adoption has made me think industries are more sensitive to news than they perhaps should be. I spoke with Martha Gimbel, who leads the Yale Budget Lab and coauthored a report finding that AI has not yet changed anyone’s jobs. What I gathered is that Gimbel, like many economists, thinks on a longer time scale than anyone in the AI world is used to. 

“It would be historically shocking if a technology had had an impact as quickly as people thought that this one was going to,” she says. In other words, perhaps most of the economy is still figuring out what the hell AI even does, not deciding whether to abandon it. 

The other reaction I heard—particularly from the consultant crowd—is that when executives hear that so many AI pilots are failing, they indeed take it very seriously. They’re just not reading it as a failure of the technology itself. They instead point to pilots not moving quickly enough, companies lacking the right data to build better AI, or a host of other strategic reasons.

Even if there is incredible pressure, especially on public companies, to invest heavily in AI, a few have taken big swings on the technology only to pull back. The buy now, pay later company Klarna laid off staff and paused hiring in 2024, claiming it could use AI instead. Less than a year later it was hiring again, explaining that “AI gives us speed. Talent gives us empathy.” 

Drive-throughs, from McDonald’s to Taco Bell, ended pilots testing the use of AI voice assistants. The vast majority of Coca-Cola advertisements, according to experts I spoke with, are not made with generative AI, despite the company’s $1 billion promise. 

So for now, the question remains unanswered: Are there companies out there rethinking how much their bets on AI will pay off, or when? And if there are, what’s keeping them from talking out loud about it? (If you’re out there, email me!)

The three big unanswered questions about Sora

Last week OpenAI released Sora, a TikTok-style app that presents an endless feed of exclusively AI-generated videos, each up to 10 seconds long. The app allows you to create a “cameo” of yourself—a hyperrealistic avatar that mimics your appearance and voice—and insert other peoples’ cameos into your own videos (depending on what permissions they set). 

To some people who believed earnestly in OpenAI’s promise to build AI that benefits all of humanity, the app is a punchline. A former OpenAI researcher who left to build an AI-for-science startup referred to Sora as an “infinite AI tiktok slop machine.” 

That hasn’t stopped it from soaring to the top spot on Apple’s US App Store. After I downloaded the app, I quickly learned what types of videos are, at least currently, performing well: bodycam-style footage of police pulling over pets or various trademarked characters, including SpongeBob and Scooby Doo; deepfake memes of Martin Luther King Jr. talking about Xbox; and endless variations of Jesus Christ navigating our modern world. 

Just as quickly, I had a bunch of questions about what’s coming next for Sora. Here’s what I’ve learned so far.

Can it last?

OpenAI is betting that a sizable number of people will want to spend time on an app in which you can suspend your concerns about whether what you’re looking at is fake and indulge in a stream of raw AI. One reviewer put it this way: “It’s comforting because you know that everything you’re scrolling through isn’t real, where other platforms you sometimes have to guess if it’s real or fake. Here, there is no guessing, it’s all AI, all the time.”

This may sound like hell to some. But judging by Sora’s popularity, lots of people want it. 

So what’s drawing these people in? There are two explanations. One is that Sora is a flash-in-the-pan gimmick, with people lining up to gawk at what cutting-edge AI can create now (in my experience, this is interesting for about five minutes). The second, which OpenAI is betting on, is that we’re witnessing a genuine shift in what type of content can draw eyeballs, and that users will stay with Sora because it allows a level of fantastical creativity not possible in any other app. 

There are a few decisions down the pike that may shape how many people stick around: how OpenAI decides to implement ads, what limits it sets for copyrighted content (see below), and what algorithms it cooks up to decide who sees what. 

Can OpenAI afford it?

OpenAI is not profitable, but that’s not particularly strange given how Silicon Valley operates. What is peculiar, though, is that the company is investing in a platform for generating video, which is the most energy-intensive (and therefore expensive) form of AI we have. The energy it takes dwarfs the amount required to create images or answer text questions via ChatGPT.

This isn’t news to OpenAI, which has joined a half-trillion-dollar project to build data centers and new power plants. But Sora—which currently allows you to generate AI videos, for free, without limits—raises the stakes: How much will it cost the company? 

OpenAI is making moves toward monetizing things (you can now buy products directly through ChatGPT, for example). On October 3, its CEO, Sam Altman, wrote in a blog post that “we are going to have to somehow make money for video generation,” but he didn’t get into specifics. One can imagine personalized ads and more in-app purchases. 

Still, it’s concerning to imagine the mountain of emissions might result if Sora becomes popular. Altman has accurately described the emissions burden of one query to ChatGPT as impossibly small. What he has not quantified is what that figure is for a 10-second video generated by Sora. It’s only a matter of time until AI and climate researchers start demanding it. 

How many lawsuits are coming? 

Sora is awash in copyrighted and trademarked characters. It allows you to easily deepfake deceased celebrities. Its videos use copyrighted music. 

Last week, the Wall Street Journal reported that OpenAI has sent letters to copyright holders notifying them that they’ll have to opt out of the Sora platform if they don’t want their material included, which is not how these things usually work. The law on how AI companies should handle copyrighted material is far from settled, and it’d be reasonable to expect lawsuits challenging this. 

In last week’s blog post, Altman wrote that OpenAI is “hearing from a lot of rightsholders” who want more control over how their characters are used in Sora. He says that the company plans to give those parties more “granular control” over their characters. Still, “there may be some edge cases of generations that get through that shouldn’t,” he wrote.

But another issue is the ease with which you can use the cameos of real people. People can restrict who can use their cameo, but what limits will there be for what these cameos can be made to do in Sora videos? 

This is apparently already an issue OpenAI is being forced to respond to. The head of Sora, Bill Peebles, posted on October 5 that users can now restrict how their cameo can be used—preventing it from appearing in political videos or saying certain words, for example. How well will this work? Is it only a matter of time until someone’s cameo is used for something nefarious, explicit, illegal, or at least creepy, sparking a lawsuit alleging that OpenAI is responsible? 

Overall, we haven’t seen what full-scale Sora looks like yet (OpenAI is still doling out access to the app via invite codes). When we do, I think it will serve as a grim test: Can AI create videos so fine-tuned for endless engagement that they’ll outcompete “real” videos for our attention? In the end, Sora isn’t just testing OpenAI’s technology—it’s testing us, and how much of our reality we’re willing to trade for an infinite scroll of simulation.

The US may be heading toward a drone-filled future

On Thursday, I published a story about the police-tech giant Flock Safety selling its drones to the private sector to track shoplifters. Keith Kauffman, a former police chief who now leads Flock’s drone efforts, described the ideal scenario: A security team at a Home Depot, say, launches a drone from the roof that follows shoplifting suspects to their car. The drone tracks their car through the streets, transmitting its live video feed directly to the police. 

It’s a vision that, unsurprisingly, alarms civil liberties advocates. They say it will expand the surveillance state created by police drones, license-plate readers, and other crime tech, which has allowed law enforcement to collect massive amounts of private data without warrants. Flock is in the middle of a federal lawsuit in Norfolk, Virginia, that alleges just that. Read the full story to learn more

But the peculiar thing about the world of drones is that its fate in the US—whether the skies above your home in the coming years will be quiet, or abuzz with drones dropping off pizzas, inspecting potholes, or chasing shoplifting suspects—pretty much comes down to one rule. It’s a Federal Aviation Administration (FAA) regulation that stipulates where and how drones can be flown, and it is about to change.

Currently, you need a waiver from the FAA to fly a drone farther than you can see it. This is meant to protect the public and property from in-air collisions and accidents. In 2018, the FAA began granting these waivers for various scenarios, like search and rescues, insurance inspections, or police investigations. With Flock’s help, police departments can get waivers approved in just two weeks. The company’s private-sector customers generally have to wait 60 to 90 days.

For years, industries with a stake in drones—whether e-commerce companies promising doorstep delivery or medical transporters racing to move organs—have pushed the government to scrap the waiver system in favor of easier approval to fly beyond visual line of sight. In June, President Donald Trump echoed that call in an executive order for “American drone dominance,” and in August, the FAA released a new proposed rule.

The proposed rule lays out some broad categories for which drone operators are permitted to fly drones beyond their line of sight, including package delivery, agriculture, aerial surveying, and civic interest, which includes policing. Getting approval to fly beyond sight would become easier for operators from these categories, and would generally expand their range. 

Drone companies, and amateur drone pilots, see it as a win. But it’s a win that comes at the expense of privacy for the rest of us, says Jay Stanley, a senior policy analyst with the ACLU Speech, Privacy and Technology Project who served on the rule-making commission for the FAA.

“The FAA is about to open up the skies enormously, to a lot more [beyond visual line of sight] flights without any privacy protections,” he says. The ACLU has said that fleets of drones enable persistent surveillance, including of protests and gatherings, and impinge on the public’s expectations of privacy.

If you’ve got something to say about the FAA’s proposed rule, you can leave a public comment (they’re being accepted until October 6.) Trump’s executive order directs the FAA to release the final rule by spring 2026.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

The looming crackdown on AI companionship

As long as there has been AI, there have been people sounding alarms about what it might do to us: rogue superintelligence, mass unemployment, or environmental ruin from data center sprawl. But this week showed that another threat entirely—that of kids forming unhealthy bonds with AI—is the one pulling AI safety out of the academic fringe and into regulators’ crosshairs.

This has been bubbling for a while. Two high-profile lawsuits filed in the last year, against Character.AI and OpenAI, allege that companion-like behavior in their models contributed to the suicides of two teenagers. A study by US nonprofit Common Sense Media, published in July, found that 72% of teenagers have used AI for companionship. Stories in reputable outlets about “AI psychosis” have highlighted how endless conversations with chatbots can lead people down delusional spirals.

It’s hard to overstate the impact of these stories. To the public, they are proof that AI is not merely imperfect, but a technology that’s more harmful than helpful. If you doubted that this outrage would be taken seriously by regulators and companies, three things happened this week that might change your mind.

A California law passes the legislature

On Thursday, the California state legislature passed a first-of-its-kind bill. It would require AI companies to include reminders for users they know to be minors that responses are AI generated. Companies would also need to have a protocol for addressing suicide and self-harm and provide annual reports on instances of suicidal ideation in users’ conversations with their chatbots. It was led by Democratic state senator Steve Padilla, passed with heavy bipartisan support, and now awaits Governor Gavin Newsom’s signature. 

There are reasons to be skeptical of the bill’s impact. It doesn’t specify efforts companies should take to identify which users are minors, and lots of AI companies already include referrals to crisis providers when someone is talking about suicide. (In the case of Adam Raine, one of the teenagers whose survivors are suing, his conversations with ChatGPT before his death included this type of information, but the chatbot allegedly went on to give advice related to suicide anyway.)

Still, it is undoubtedly the most significant of the efforts to rein in companion-like behaviors in AI models, which are in the works in other states too. If the bill becomes law, it would strike a blow to the position OpenAI has taken, which is that “America leads best with clear, nationwide rules, not a patchwork of state or local regulations,” as the company’s chief global affairs officer, Chris Lehane, wrote on LinkedIn last week.

The Federal Trade Commission takes aim

The very same day, the Federal Trade Commission announced an inquiry into seven companies, seeking information about how they develop companion-like characters, monetize engagement, measure and test the impact of their chatbots, and more. The companies are Google, Instagram, Meta, OpenAI, Snap, X, and Character Technologies, the maker of Character.AI.

The White House now wields immense, and potentially illegal, political influence over the agency. In March, President Trump fired its lone Democratic commissioner, Rebecca Slaughter. In July, a federal judge ruled that firing illegal, but last week the US Supreme Court temporarily permitted the firing.

“Protecting kids online is a top priority for the Trump-Vance FTC, and so is fostering innovation in critical sectors of our economy,” said FTC chairman Andrew Ferguson in a press release about the inquiry. 

Right now, it’s just that—an inquiry—but the process might (depending on how public the FTC makes its findings) reveal the inner workings of how the companies build their AI companions to keep users coming back again and again. 

Sam Altman on suicide cases

Also on the same day (a busy day for AI news), Tucker Carlson published an hour-long interview with OpenAI’s CEO, Sam Altman. It covers a lot of ground—Altman’s battle with Elon Musk, OpenAI’s military customers, conspiracy theories about the death of a former employee—but it also includes the most candid comments Altman’s made so far about the cases of suicide following conversations with AI. 

Altman talked about “the tension between user freedom and privacy and protecting vulnerable users” in cases like these. But then he offered up something I hadn’t heard before.

“I think it’d be very reasonable for us to say that in cases of young people talking about suicide seriously, where we cannot get in touch with parents, we do call the authorities,” he said. “That would be a change.”

So where does all this go next? For now, it’s clear that—at least in the case of children harmed by AI companionship—companies’ familiar playbook won’t hold. They can no longer deflect responsibility by leaning on privacy, personalization, or “user choice.” Pressure to take a harder line is mounting from state laws, regulators, and an outraged public.

But what will that look like? Politically, the left and right are now paying attention to AI’s harm to children, but their solutions differ. On the right, the proposed solution aligns with the wave of internet age-verification laws that have now been passed in over 20 states. These are meant to shield kids from adult content while defending “family values.” On the left, it’s the revival of stalled ambitions to hold Big Tech accountable through antitrust and consumer-protection powers. 

Consensus on the problem is easier than agreement on the cure. As it stands, it looks likely we’ll end up with exactly the patchwork of state and local regulations that OpenAI (and plenty of others) have lobbied against. 

For now, it’s down to companies to decide where to draw the lines. They’re having to decide things like: Should chatbots cut off conversations when users spiral toward self-harm, or would that leave some people worse off? Should they be licensed and regulated like therapists, or treated as entertainment products with warnings? The uncertainty stems from a basic contradiction: Companies have built chatbots to act like caring humans, but they’ve postponed developing the standards and accountability we demand of real caregivers. The clock is now running out.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Help! My therapist is secretly using ChatGPT

In Silicon Valley’s imagined future, AI models are so empathetic that we’ll use them as therapists. They’ll provide mental-health care for millions, unimpeded by the pesky requirements for human counselors, like the need for graduate degrees, malpractice insurance, and sleep. Down here on Earth, something very different has been happening. 

Last week, we published a story about people finding out that their therapists were secretly using ChatGPT during sessions. In some cases it wasn’t subtle; one therapist accidentally shared his screen during a virtual appointment, allowing the patient to see his own private thoughts being typed into ChatGPT in real time. The model then suggested responses that his therapist parroted. 

It’s my favorite AI story as of late, probably because it captures so well the chaos that can unfold when people actually use AI the way tech companies have all but told them to.

As the writer of the story, Laurie Clarke, points out, it’s not a total pipe dream that AI could be therapeutically useful. Early this year, I wrote about the first clinical trial of an AI bot built specifically for therapy. The results were promising! But the secretive use by therapists of AI models that are not vetted for mental health is something very different. I had a conversation with Clarke to hear more about what she found. 

I have to say, I was really fascinated that people called out their therapists after finding out they were covertly using AI. How did you interpret the reactions of these therapists? Were they trying to hide it?

In all the cases mentioned in the piece, the therapist hadn’t provided prior disclosure of how they were using AI to their patients. So whether or not they were explicitly trying to conceal it, that’s how it ended up looking when it was discovered. I think for this reason, one of my main takeaways from writing the piece was that therapists should absolutely disclose when they’re going to use AI and how (if they plan to use it). If they don’t, it raises all these really uncomfortable questions for patients when it’s uncovered and risks irrevocably damaging the trust that’s been built.

In the examples you’ve come across, are therapists turning to AI simply as a time-saver? Or do they think AI models can genuinely give them a new perspective on what’s bothering someone?

Some see AI as a potential time-saver. I heard from a few therapists that notes are the bane of their lives. So I think there is some interest in AI-powered tools that can support this. Most I spoke to were very skeptical about using AI for advice on how to treat a patient. They said it would be better to consult supervisors or colleagues, or case studies in the literature. They were also understandably very wary of inputting sensitive data into these tools.

There is some evidence AI can deliver more standardized, “manualized” therapies like CBT [cognitive behavioral therapy] reasonably effectively. So it’s possible it could be more useful for that. But that is AI specifically designed for that purpose, not general-purpose tools like ChatGPT.

What happens if this goes awry? What attention is this getting from ethics groups and lawmakers?

At present, professional bodies like the American Counseling Association advise against using AI tools to diagnose patients. There could also be more stringent regulations preventing this in future. Nevada and Illinois, for example, have recently passed laws prohibiting the use of AI in therapeutic decision-making. More states could follow.

OpenAI’s Sam Altman said last month that “a lot of people effectively use ChatGPT as a sort of therapist,” and that to him, that’s a good thing. Do you think tech companies are overpromising on AI’s ability to help us?

I think that tech companies are subtly encouraging this use of AI because clearly it’s a route through which some people are forming an attachment to their products. I think the main issue is that what people are getting from these tools isn’t really “therapy” by any stretch. Good therapy goes far beyond being soothing and validating everything someone says. I’ve never in my life looked forward to a (real, in-person) therapy session. They’re often highly uncomfortable, and even distressing. But that’s part of the point. The therapist should be challenging you and drawing you out and seeking to understand you. ChatGPT doesn’t do any of these things. 

Read the full story from Laurie Clarke

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Can an AI doppelgänger help me do my job?

Everywhere I look, I see AI clones. On X and LinkedIn, “thought leaders” and influencers offer their followers a chance to ask questions of their digital replicas. OnlyFans creators are having AI models of themselves chat, for a price, with followers. “Virtual human” salespeople in China are reportedly outselling real humans. 

Digital clones—AI models that replicate a specific person—package together a few technologies that have been around for a while now: hyperrealistic video models to match your appearance, lifelike voices based on just a couple of minutes of speech recordings, and conversational chatbots increasingly capable of holding our attention. But they’re also offering something the ChatGPTs of the world cannot: an AI that’s not smart in the general sense, but that ‘thinks’ like you do. 

Who are they for? Delphi, a startup that recently raised $16 million from funders including Anthropic and actor/director Olivia Wilde’s venture capital firm, Proximity Ventures, helps famous people create replicas that can speak with their fans in both chat and voice calls. It feels like MasterClass—the platform for instructional seminars led by celebrities—vaulted into the AI age. On its website, Delphi writes that modern leaders “possess potentially life-altering knowledge and wisdom, but their time is limited and access is constrained.”

It has a library of official clones created by famous figures that you can speak with. Arnold Schwarzenegger, for example, told me, “I’m here to cut the crap and help you get stronger and happier,” before informing me cheerily that I’ve now been signed up to receive the Arnold’s Pump Club newsletter. Even if his or other celebrities’ clones fall short of Delphi’s lofty vision of spreading “personalized wisdom at scale,” they at least seem to serve as a funnel to find fans, build mailing lists, or sell supplements.

But what about for the rest of us? Could well-crafted clones serve as our stand-ins? I certainly feel stretched thin at work sometimes, wishing I could be in two places at once, and I bet you do too. I could see a replica popping into a virtual meeting with a PR representative, not to trick them into thinking it’s the real me, but simply to take a brief call on my behalf. A recording of this call might summarize how it went. 

To find out, I tried making a clone. Tavus, a Y Combinator alum that raised $18 million last year, will build a video avatar of you (plans start at $59 per month) that can be coached to reflect your personality and can join video calls. These clones have the “emotional intelligence of humans, with the reach of machines,” according to the company. “Reporter’s assistant” does not appear on the company’s site as an example use case, but it does mention therapists, physician’s assistants, and other roles that could benefit from an AI clone.

For Tavus’s onboarding process, I turned on my camera, read through a script to help it learn my voice (which also acted as a waiver, with me agreeing to lend my likeness to Tavus), and recorded one minute of me just sitting in silence. Within a few hours, my avatar was ready. Upon meeting this digital me, I found it looked and spoke like I do (though I hated its teeth). But faking my appearance was the easy part. Could it learn enough about me and what topics I cover to serve as a stand-in with minimal risk of embarrassing me?

Via a helpful chatbot interface, Tavus walked me through how to craft my clone’s personality, asking what I wanted the replica to do. It then helped me formulate instructions that became its operating manual. I uploaded three dozen of my stories that it could use to reference what I cover. It may have benefited from having more of my content—interviews, reporting notes, and the like—but I would never share that data for a host of reasons, not the least of which being that the other people who appear in it have not consented to their sides of our conversations being used to train an AI replica.

So in the realm of AI—where models learn from entire libraries of data—I didn’t give my clone all that much to learn from, but I was still hopeful it had enough to be useful. 

Alas, conversationally it was a wild card. It acted overly excited about story pitches I would never pursue. It repeated itself, and it kept saying it was checking my schedule to set up a meeting with the real me, which it could not do as I never gave it access to my calendar. It spoke in loops, with no way for the person on the other end to wrap up the conversation. 

These are common early quirks, Tavus’s cofounder Quinn Favret told me. The clones typically rely on Meta’s Llama model, which “often aims to be more helpful than it truly is,” Favret says, and developers building on top of Tavus’s platform are often the ones who set instructions for how the clones finish conversations or access calendars.

For my purposes, it was a bust. To be useful to me, my AI clone would need to show at least some basic instincts for understanding what I cover, and at the very least not creep out whoever’s on the other side of the conversation. My clone fell short.

Such a clone could be helpful in other jobs, though. If you’re an influencer looking for ways to engage with more fans, or a salesperson for whom work is a numbers game and a clone could give you a leg up, it might just work. You run the risk that your replica could go off the rails or embarrass the real you, but the tradeoffs might be reasonable. 

Favret told me some of Tavus’s bigger customers are companies using clones for health-care intake and job interviews. Replicas are also being used in corporate role-play, for practicing sales pitches or having HR-related conversations with employees, for example.

But companies building clones are promising that they will be much more than cold-callers or telemarketing machines. Delphi says its clones will offer “meaningful, personal interactions at infinite scale,” and Tavus says its replicas have “a face, a brain, and memories” that enable “meaningful face-to-face conversations.” Favret also told me a growing number of Tavus’s customers are building clones for mentorship and even decision-making, like AI loan officers who use clones to qualify and filter applicants.

Which is sort of the crux of it. Teaching an AI clone discernment, critical thinking, and taste—never mind the quirks of a specific person—is still the stuff of science fiction. That’s all fine when the person chatting with a clone is in on the bit (most of us know that Schwarzenegger’s replica, for example, will not coach me to be a better athlete).

But as companies polish clones with “human” features and exaggerate their capabilities, I worry that people chasing efficiency will start using their replicas at best for roles that are cringeworthy, and at worst for making decisions they should never be entrusted with. In the end, these models are designed for scale, not fidelity. They can flatter us, amplify us, even sell for us—but they can’t quite become us.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.