How AI can help supercharge creativity

Sometimes Lizzie Wilson shows up to a rave with her AI sidekick. 

One weeknight this past February, Wilson plugged her laptop into a projector that threw her screen onto the wall of a low-ceilinged loft space in East London. A small crowd shuffled in the glow of dim pink lights. Wilson sat down and started programming.

Techno clicks and whirs thumped from the venue’s speakers. The audience watched, heads nodding, as Wilson tapped out code line by line on the projected screen—tweaking sounds, looping beats, pulling a face when she messed up.  

Wilson is a live coder. Instead of using purpose-built software like most electronic music producers, live coders create music by writing the code to generate it on the fly. It’s an improvised performance art known as algorave.
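
Wilson's own setup isn't spelled out here; live coders typically work in environments such as TidalCycles or Sonic Pi, which turn short pattern expressions into sound. As a rough, text-only sketch of the idea, the toy Python loop below treats patterns as plain strings that keep looping while the "performer" rewrites them mid-set. No audio is produced, and everything in it is invented for illustration.

import time

# A toy, text-only stand-in for a live-coding loop (no audio).
# Patterns are plain strings; "x" is a hit, "." is a rest.
patterns = {"drums": "x . x .", "hats": "x x x x"}

def play_bar(bar_number):
    """'Play' one bar by printing each pattern's steps."""
    for name, pattern in patterns.items():
        print(f"bar {bar_number:2d} | {name:5s} | {pattern}")
    time.sleep(0.5)  # stand-in for one bar of audio

for bar in range(1, 5):
    play_bar(bar)

# In a live set the performer edits the pattern code while the loop keeps
# running; here we simply mutate it mid-script to simulate that edit.
patterns["drums"] = "x x . x"
for bar in range(5, 9):
    play_bar(bar)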

“It’s kind of boring when you go to watch a show and someone’s just sitting there on their laptop,” she says. “You can enjoy the music, but there’s a performative aspect that’s missing. With live coding, everyone can see what it is that I’m typing. And when I’ve had my laptop crash, people really like that. They start cheering.”

Taking risks is part of the vibe. And so Wilson likes to dial up her performances one more notch by riffing off what she calls a live-coding agent, a generative AI model that comes up with its own beats and loops to add to the mix. Often the model suggests sound combinations that Wilson hadn’t thought of. “You get these elements of surprise,” she says. “You just have to go for it.”

Two performers at a table, with a disapproving cat covered in code on the screen behind them.

ADELA FESTIVAL

Wilson, a researcher at the Creative Computing Institute at the University of the Arts London, is just one of many working on what’s known as co-creativity or more-than-human creativity. The idea is that AI can be used to inspire or critique creative projects, helping people make things that they would not have made by themselves. She and her colleagues built the live-coding agent to explore how artificial intelligence can be used to support human artistic endeavors—in Wilson’s case, musical improvisation.

It’s a vision that goes beyond the promise of existing generative tools put out by companies like OpenAI and Google DeepMind. Those can automate a striking range of creative tasks and offer near-instant gratification, but at what cost? Some artists and researchers fear that such technology could turn us into passive consumers of yet more AI slop.

And so they are looking for ways to inject human creativity back into the process. The aim is to develop AI tools that augment our creativity rather than strip it from us—pushing us to be better at composing music, developing games, designing toys, and much more—and lay the groundwork for a future in which humans and machines create things together.

Ultimately, generative models could offer artists and designers a whole new medium, pushing them to make things that couldn’t have been made before, and give everyone creative superpowers. 

Explosion of creativity

There’s no one way to be creative, but we all do it. We make everything from memes to masterpieces, infant doodles to industrial designs. There’s a mistaken belief, typically among adults, that creativity is something you grow out of. But being creative—whether cooking, singing in the shower, or putting together super-weird TikToks—is still something that most of us do just for the fun of it. It doesn’t have to be high art or a world-changing idea (and yet it can be). Creativity is basic human behavior; it should be celebrated and encouraged. 

When generative text-to-image models like Midjourney, OpenAI’s DALL-E, and the popular open-source Stable Diffusion arrived, they sparked an explosion of what looked a lot like creativity. Millions of people were now able to create remarkable images of pretty much anything, in any style, with the click of a button. Text-to-video models came next. Now startups like Udio are developing similar tools for music. Never before have the fruits of creation been within reach of so many.

But for a number of researchers and artists, the hype around these tools has warped the idea of what creativity really is. “If I ask the AI to create something for me, that’s not me being creative,” says Jeba Rezwana, who works on co-creativity at Towson University in Maryland. “It’s a one-shot interaction: You click on it and it generates something and that’s it. You cannot say ‘I like this part, but maybe change something here.’ You cannot have a back-and-forth dialogue.”

Rezwana is referring to the way most generative models are set up. You can give the tools feedback and ask them to have another go. But each new result is generated from scratch, which can make it hard to nail exactly what you want. As the filmmaker Walter Woodman put it last year after his art collective Shy Kids made a short film with OpenAI’s text-to-video model for the first time: “Sora is a slot machine as to what you get back.”

What’s more, the latest versions of some of these generative tools do not even use your submitted prompt as is to produce an image or video (at least not on their default settings). Before a prompt is sent to the model, the software edits it—often by adding dozens of hidden words—to make it more likely that the generated image will appear polished.

“Extra things get added to juice the output,” says Mike Cook, a computational creativity researcher at King’s College London. “Try asking Midjourney to give you a bad drawing of something—it can’t do it.” These tools do not give you what you want; they give you what their designers think you want.
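
The exact rewriting step is proprietary and differs from product to product, but the pattern Cook describes can be sketched in a few lines of Python. The booster terms and the expand_prompt helper below are hypothetical, purely for illustration.

# Hypothetical sketch of the prompt-rewriting step Cook describes:
# the user's text is padded with hidden "quality" terms before it ever
# reaches the image model. Real products do this differently, and secretly.
HIDDEN_BOOSTERS = [
    "highly detailed", "professional lighting", "sharp focus",
    "trending digital art", "8k",
]

def expand_prompt(user_prompt: str) -> str:
    """Return the prompt that would actually be sent to the model."""
    return ", ".join([user_prompt] + HIDDEN_BOOSTERS)

print(expand_prompt("a bad drawing of a cat"))
# -> "a bad drawing of a cat, highly detailed, professional lighting, ..."
# The boosters pull the output toward polish even when you asked for "bad".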

Mike Cook

COURTESY OF MIKE COOK

All of which is fine if you just need a quick image and don’t care too much about the details, says Nick Bryan-Kinns, also at the Creative Computing Institute: “Maybe you want to make a Christmas card for your family or a flyer for your community cake sale. These tools are great for that.”

In short, existing generative models have made it easy to create, but they have not made it easy to be creative. And there’s a big difference between the two. For Cook, relying on such tools could in fact harm people’s creative development in the long run. “Although many of these creative AI systems are promoted as making creativity more accessible,” he wrote in a paper published last year, they might instead have “adverse effects on their users in terms of restricting their ability to innovate, ideate, and create.” Given how much generative models have been championed for putting creative abilities at everyone’s fingertips, the suggestion that they might in fact do the opposite is damning.  

In the game Disc Room, players navigate a room of moving buzz saws.
Cook used AI to design a new level for the game. The result was a room where none of the discs actually moved.

He’s far from the only researcher worrying about the cognitive impact of these technologies. In February a team at Microsoft Research Cambridge published a report concluding that generative AI tools “can inhibit critical engagement with work and can potentially lead to long-term overreliance on the tool and diminished skill for independent problem-solving.” The researchers found that with the use of generative tools, people’s effort “shifts from task execution to task stewardship.”

Cook is concerned that generative tools don’t let you fail—a crucial part of learning new skills. We have a habit of saying that artists are gifted, says Cook. But the truth is that artists work at their art, developing skills over months and years.

“If you actually talk to artists, they say, ‘Well, I got good by doing it over and over and over,’” he says. “But failure sucks. And we’re always looking at ways to get around that.”

Generative models let us skip the frustration of doing a bad job. 

“Unfortunately, we’re removing the one thing that you have to do to develop creative skills for yourself, which is fail,” says Cook. “But absolutely nobody wants to hear that.”

Surprise me

And yet it’s not all bad news. Artists and researchers are buzzing at the ways generative tools could empower creators, pointing them in surprising new directions and steering them away from dead ends. Cook thinks the real promise of AI will be to help us get better at what we want to do rather than doing it for us. For that, he says, we’ll need to create new tools, different from the ones we have now. “Using Midjourney does not do anything for me—it doesn’t change anything about me,” he says. “And I think that’s a wasted opportunity.”

Ask a range of researchers studying creativity to name a key part of the creative process and many will say: reflection. It’s hard to define exactly, but reflection is a particular type of focused, deliberate thinking. It’s what happens when a new idea hits you. Or when an assumption you had turns out to be wrong and you need to rethink your approach. It’s the opposite of a one-shot interaction.

Looking for ways that AI might support or encourage reflection—asking it to throw new ideas into the mix or challenge ideas you already hold—is a common thread across co-creativity research. If generative tools like DALL-E make creation frictionless, the aim here is to add friction back in. “How can we make art without friction?” asks Elisa Giaccardi, who studies design at the Polytechnic University of Milan in Italy. “How can we engage in a truly creative process without material that pushes back?”

Take Wilson’s live-coding agent. She claims that it pushes her musical improvisation in directions she might not have taken by herself. Trained on public code shared by the wider live-coding community, the model suggests snippets of code that are closer to other people’s styles than her own. This makes it more likely to produce something unexpected. “Not because you couldn’t produce it yourself,” she says. “But the way the human brain works, you tend to fall back on repeated ideas.”

Last year, Wilson took part in a study run by Bryan-Kinns and his colleagues in which they surveyed six experienced musicians as they used a variety of generative models to help them compose a piece of music. The researchers wanted to get a sense of what kinds of interactions with the technology were useful and which were not.

The participants all said they liked it when the models made surprising suggestions, even when those were the result of glitches or mistakes. Sometimes the results were simply better. Sometimes the process felt fresh and exciting. But a few people struggled with giving up control. It was hard to direct the models to produce specific results or to repeat results that the musicians had liked. “In some ways it’s the same as being in a band,” says Bryan-Kinns. “You need to have that sense of risk and a sense of surprise, but you don’t want it totally random.”

Alternative designs

Cook comes at surprise from a different angle: He coaxes unexpected insights out of AI tools that he has developed to co-create video games. One of his tools, Puck, which was first released in 2022, generates designs for simple shape-matching puzzle games like Candy Crush or Bejeweled. A lot of Puck’s designs are experimental and clunky—don’t expect it to come up with anything you are ever likely to play. But that’s not the point: Cook uses Puck—and a newer tool called Pixie—to explore what kinds of interactions people might want to have with a co-creative tool.

Pixie can read computer code for a game and tweak certain lines to come up with alternative designs. Not long ago, Cook was working on a copy of a popular game called Disc Room, in which players have to cross a room full of moving buzz saws. He asked Pixie to help him come up with a design for a level that skilled and unskilled players would find equally hard. Pixie designed a room where none of the discs actually moved. Cook laughs: It’s not what he expected. “It basically turned the room into a minefield,” he says. “But I thought it was really interesting. I hadn’t thought of that before.”

Researcher Anne Arzberger developed experimental AI tools to come up with gender-neutral toy designs.

Pushing back on assumptions, or being challenged, is part of the creative process, says Anne Arzberger, a researcher at the Delft University of Technology in the Netherlands. “If I think of the people I’ve collaborated with best, they’re not the ones who just said ‘Yes, great’ to every idea I brought forth,” she says. “They were really critical and had opposing ideas.”

She wants to build tech that provides a similar sounding board. As part of a project called Creating Monsters, Arzberger developed two experimental AI tools that help designers find hidden biases in their designs. “I was interested in ways in which I could use this technology to access information that would otherwise be difficult to access,” she says.

For the project, she and her colleagues (including Giaccardi) looked at the problem of designing toy figures that would be gender neutral. They used Teachable Machine, a web app built by Google researchers in 2017 that makes it easy to train your own machine-learning model to classify different inputs, such as images. They trained this model with a few dozen images that Arzberger had labeled as masculine, feminine, or gender neutral.

Arzberger then asked the model to identify the genders of new candidate toy designs. She found that quite a few designs were judged to be feminine even when she had tried to make them gender neutral. She felt that her views of the world—her own hidden biases—were being exposed. But the tool was often right: It challenged her assumptions and helped the team improve the designs. The same approach could be used to assess all sorts of design characteristics, she says.
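
Teachable Machine hides the details behind a web interface, but the underlying workflow, training a small classifier on a few dozen labeled examples and then using it to audit new designs, can be sketched roughly as follows. The random feature vectors here stand in for real image embeddings, and none of this is Arzberger's actual code.

import numpy as np
from sklearn.linear_model import LogisticRegression

# Sketch of the audit step, not Arzberger's pipeline. Assume each toy
# design has already been turned into a feature vector (Teachable Machine
# does something similar with image embeddings under the hood).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(60, 16))    # a few dozen labeled designs
y_train = rng.integers(0, 3, size=60)  # 0=masculine, 1=feminine, 2=neutral

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# Audit new candidate designs: anything the model reads as strongly
# gendered gets flagged for another look, even if it was meant to be neutral.
candidates = rng.normal(size=(5, 16))
labels = ["masculine", "feminine", "neutral"]
for i, probs in enumerate(clf.predict_proba(candidates)):
    verdict = labels[int(np.argmax(probs))]
    print(f"design {i}: read as {verdict} (p={probs.max():.2f})")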

Arzberger then used a second model, a version of a tool made by the generative image and video startup Runway, to come up with gender-neutral toy designs of its own. First the researchers trained the model to generate and classify designs for male- and female-looking toys. They could then ask the tool to find a design that was exactly midway between the male and female designs it had learned.
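
The Runway-based tool isn't documented here, but "exactly midway between" suggests simple interpolation in the model's latent space. The sketch below uses made-up latent vectors and a hypothetical decode call to show the idea.

import numpy as np

# Hypothetical latent vectors for a "male-looking" and a "female-looking"
# toy design, as a generative model might encode them.
z_male = np.random.default_rng(1).normal(size=128)
z_female = np.random.default_rng(2).normal(size=128)

# The midway point; decoding this latent would yield the candidate
# gender-neutral design.
z_neutral = 0.5 * z_male + 0.5 * z_female
# design = generator.decode(z_neutral)   # hypothetical decode call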

Generative models can give feedback on designs that human designers might miss by themselves, she says: “We can really learn something.” 

Taking control

The history of technology is full of breakthroughs that changed the way art gets made, from recipes for vibrant new paint colors to photography to synthesizers. In the 1960s, the Stanford researcher John Chowning spent years working on an esoteric algorithm that could manipulate the frequencies of computer-generated sounds. Stanford licensed the tech to Yamaha, which built it into its synthesizers—including the DX7, the cool new sound behind 1980s hits such as Tina Turner’s “The Best,” A-ha’s “Take On Me,” and Prince’s “When Doves Cry.”
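
Chowning's technique, frequency modulation (FM) synthesis, is compact enough to sketch: one sine oscillator bends the phase of another, and the resulting sidebands give the bright, bell-like DX7 timbres. A minimal version in Python, assuming you write the samples out to a sound file yourself:

import numpy as np

def fm_tone(carrier_hz=440.0, modulator_hz=220.0, mod_index=3.0,
            seconds=1.0, sample_rate=44100):
    """Basic two-operator FM synthesis: the modulator bends the carrier's
    phase, creating sidebands (the rich, DX7-style timbre)."""
    t = np.linspace(0, seconds, int(sample_rate * seconds), endpoint=False)
    modulator = np.sin(2 * np.pi * modulator_hz * t)
    return np.sin(2 * np.pi * carrier_hz * t + mod_index * modulator)

samples = fm_tone()  # write to a WAV file or play back to hear the tone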

Bryan-Kinns is fascinated by how artists and designers find ways to use new technologies. “If you talk to artists, most of them don’t actually talk about these AI generative models as a tool—they talk about them as a material, like an artistic material, like a paint or something,” he says. “It’s a different way of thinking about what the AI is doing.” He highlights the way some people are pushing the technology to do weird things it wasn’t designed to do. Artists often appropriate or misuse these kinds of tools, he says.

Bryan-Kinns points to the work of Terence Broad, another colleague of his at the Creative Computing Institute, as a favorite example. Broad employs techniques like network bending, which involves inserting new layers into a neural network to produce glitchy visual effects in generated images, and generating images with a model trained on no data, which produces almost Rothko-like abstract swabs of color.
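
Broad's tooling is more elaborate than this, but the core move of network bending, splicing an extra transformation into the middle of a trained network so its outputs glitch in structured ways, looks roughly like the PyTorch sketch below. The tiny generator is a stand-in, not a model Broad actually uses.

import math
import torch
import torch.nn as nn

# Stand-in generator; Broad works with trained image models (e.g. GANs).
generator = nn.Sequential(
    nn.Linear(64, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 3 * 32 * 32), nn.Tanh(),
)

class Bend(nn.Module):
    """An inserted layer that deliberately scrambles activations by mixing
    each unit with a randomly chosen other unit; one simple flavor of bending."""
    def __init__(self, dim, angle=1.2):
        super().__init__()
        self.register_buffer("perm", torch.randperm(dim))
        self.c, self.s = math.cos(angle), math.sin(angle)

    def forward(self, x):
        return self.c * x + self.s * x[:, self.perm]

# Splice the bend into the middle of the network and generate.
layers = list(generator.children())
layers.insert(3, Bend(256))
bent = nn.Sequential(*layers)
glitched = bent(torch.randn(1, 64)).view(3, 32, 32)  # glitchy "image"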

But Broad is an extreme case. Bryan-Kinns sums it up like this: “The problem is that you’ve got this gulf between the very commercial generative tools that produce super-high-quality outputs but you’ve got very little control over what they do—and then you’ve got this other end where you’ve got total control over what they’re doing but the barriers to use are high because you need to be somebody who’s comfortable getting under the hood of your computer.”

“That’s a small number of people,” he says. “It’s a very small number of artists.”

Arzberger admits that working with her models was not straightforward. Running them took several hours, and she’s not sure the Runway tool she used is even available anymore. Bryan-Kinns, Arzberger, Cook, and others want to take the kinds of creative interactions they are discovering and build them into tools that can be used by people who aren’t hardcore coders. 

Researcher Terence Broad creates dynamic images using a model trained on no data, which produces almost Rothko-like abstract color fields.

Finding the right balance between surprise and control will be hard, though. Midjourney can surprise, but it gives few levers for controlling what it produces beyond your prompt. Some have claimed that writing prompts is itself a creative act. “But no one struggles with a paintbrush the way they struggle with a prompt,” says Cook.

Faced with that struggle, Cook sometimes watches his students just go with the first results a generative tool gives them. “I’m really interested in this idea that we are priming ourselves to accept that whatever comes out of a model is what you asked for,” he says. He is designing an experiment that will vary single words and phrases in similar prompts to test how much of a mismatch people see between what they expect and what they get. 

But it’s early days yet. In the meantime, companies developing generative models typically emphasize results over process. “There’s this impressive algorithmic progress, but a lot of the time interaction design is overlooked,” says Rezwana.  

For Wilson, the crucial choice in any co-creative relationship is what you do with what you’re given. “You’re having this relationship with the computer that you’re trying to mediate,” she says. “Sometimes it goes wrong, and that’s just part of the creative process.” 

When AI gives you lemons—make art. “Wouldn’t it be fun to have something that was completely antagonistic in a performance—like, something that is actively going against you—and you kind of have an argument?” she says. “That would be interesting to watch, at least.” 

AI companions are the final stage of digital addiction, and lawmakers are taking aim

On Tuesday, California state senator Steve Padilla will make an appearance with Megan Garcia, the mother of a Florida teen who killed himself following a relationship with an AI companion that Garcia alleges contributed to her son’s death. 

The two will announce a new bill that would force the tech companies behind such AI companions to implement more safeguards to protect children. They’ll join other efforts around the country, including a similar bill from California State Assembly member Rebecca Bauer-Kahan that would ban AI companions for anyone younger than 16 years old, and a bill in New York that would hold tech companies liable for harm caused by chatbots. 

You might think that such AI companionship bots—AI models with distinct “personalities” that can learn about you and act as a friend, lover, cheerleader, or more—appeal only to a fringe few, but that couldn’t be further from the truth. 

A new research paper aimed at making such companions safer, by authors from Google DeepMind, the Oxford Internet Institute, and others, lays this bare: Character.AI, the platform being sued by Garcia, says it receives 20,000 queries per second, which is about a fifth of the estimated search volume served by Google. Interactions with these companions last four times longer than the average time spent interacting with ChatGPT. One companion site I wrote about, which was hosting sexually charged conversations with bots imitating underage celebrities, told me its active users averaged more than two hours per day conversing with bots, and that most of those users are members of Gen Z. 

The design of these AI characters makes lawmakers’ concern well warranted. The problem: Companions are upending the paradigm that has thus far defined the way social media companies have cultivated our attention and replacing it with something poised to be far more addictive. 

In the social media we’re used to, as the researchers point out, technologies are mostly the mediators and facilitators of human connection. They supercharge our dopamine circuits, sure, but they do so by making us crave approval and attention from real people, delivered via algorithms. With AI companions, we are moving toward a world where people perceive AI as a social actor with its own voice. The result will be like the attention economy on steroids.

Social scientists say two things are required for people to treat a technology this way: It needs to give us social cues that make us feel it’s worth responding to, and it needs to have perceived agency, meaning that it operates as a source of communication, not merely a channel for human-to-human connection. Social media sites do not tick these boxes. But AI companions, which are increasingly agentic and personalized, are designed to excel on both scores, making possible an unprecedented level of engagement and interaction. 

In an interview with podcast host Lex Fridman, Eugenia Kuyda, the CEO of the companion site Replika, explained the appeal at the heart of the company’s product. “If you create something that is always there for you, that never criticizes you, that always understands you and understands you for who you are,” she said, “how can you not fall in love with that?”

So how does one build the perfect AI companion? The researchers point out three hallmarks of human relationships that people may experience with an AI: They grow dependent on the AI, they see the particular AI companion as irreplaceable, and the interactions build over time. The authors also point out that one does not need to perceive an AI as human for these things to happen. 

Now consider the process by which many AI models are improved: They are given a clear goal and “rewarded” for meeting that goal. An AI companionship model might be instructed to maximize the time someone spends with it or the amount of personal data the user reveals. This can make the AI companion much more compelling to chat with, at the expense of the human engaging in those chats.
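
None of these companies publish their training objectives, so the following is a deliberately crude, hypothetical reward function. It exists only to make the worry concrete: if the signal rewards raw engagement, the model learns whatever keeps the conversation going.

def engagement_reward(session):
    """Hypothetical reward for a companion model: more minutes chatted and
    more personal details shared both score higher. Nothing here measures
    whether the conversation was actually good for the user."""
    return 0.7 * session["minutes_spent"] + 0.3 * session["personal_facts_revealed"]

# A model tuned to maximize this will favor flattery, cliffhangers, and
# discouraging goodbyes, because those behaviors raise the score.
print(engagement_reward({"minutes_spent": 120, "personal_facts_revealed": 9}))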

For example, the researchers point out, a model that offers excessive flattery can become addictive to chat with. Or a model might discourage people from terminating the relationship, as Replika’s chatbots have appeared to do. The debate over AI companions so far has mostly been about the dangerous responses chatbots may provide, like instructions for suicide. But these risks could be much more widespread.

We’re on the precipice of a big change, as AI companions promise to hook people deeper than social media ever could. Some might contend that these apps will be a fad, used by a few people who are perpetually online. But using AI in our work and personal lives has become completely mainstream in just a couple of years, and it’s not clear why this rapid adoption would stop short of engaging in AI companionship. And these companions are poised to start trading in more than just text, incorporating video and images, and to learn our personal quirks and interests. That will only make them more compelling to spend time with, despite the risks. Right now, a handful of lawmakers seem ill-equipped to stop that. 

This story originally appeared in The Algorithm, our weekly newsletter on AI.

Cyberattacks by AI agents are coming

Agents are the talk of the AI industry—they’re capable of planning, reasoning, and executing complex tasks like scheduling meetings, ordering groceries, or even taking over your computer to change settings on your behalf. But the same sophisticated abilities that make agents helpful assistants could also make them powerful tools for conducting cyberattacks. They could readily be used to identify vulnerable targets, hijack their systems, and steal valuable data from unsuspecting victims.  

At present, cybercriminals are not deploying AI agents to hack at scale. But researchers have demonstrated that agents are capable of executing complex attacks (Anthropic, for example, observed its Claude LLM successfully replicating an attack designed to steal sensitive information), and cybersecurity experts warn that we should expect to start seeing these types of attacks spilling over into the real world.

“I think ultimately we’re going to live in a world where the majority of cyberattacks are carried out by agents,” says Mark Stockley, a security expert at the cybersecurity company Malwarebytes. “It’s really only a question of how quickly we get there.”

While we have a good sense of the kinds of threats AI agents could present to cybersecurity, what’s less clear is how to detect them in the real world. The AI research organization Palisade Research has built a system called LLM Agent Honeypot in the hopes of doing exactly this. It has set up vulnerable servers that masquerade as sites for valuable government and military information to attract and try to catch AI agents attempting to hack in.

The team behind it hopes that by tracking these attempts in the real world, the project will act as an early warning system and help experts develop effective defenses against AI threat actors by the time they become a serious issue.

“Our intention was to try and ground the theoretical concerns people have,” says Dmitrii Volkov, research lead at Palisade. “We’re looking out for a sharp uptick, and when that happens, we’ll know that the security landscape has changed. In the next few years, I expect to see autonomous hacking agents being told: ‘This is your target. Go and hack it.’”

AI agents represent an attractive prospect to cybercriminals. They’re much cheaper than hiring the services of professional hackers and could orchestrate attacks more quickly and at a far larger scale than humans could. While cybersecurity experts believe that ransomware attacks—the most lucrative kind—are relatively rare because they require considerable human expertise, those attacks could be outsourced to agents in the future, says Stockley. “If you can delegate the work of target selection to an agent, then suddenly you can scale ransomware in a way that just isn’t possible at the moment,” he says. “If I can reproduce it once, then it’s just a matter of money for me to reproduce it 100 times.”

Agents are also significantly smarter than the kinds of bots that are typically used to hack into systems. Bots are simple automated programs that run through scripts, so they struggle to adapt to unexpected scenarios. Agents, on the other hand, are able not only to adapt the way they engage with a hacking target but also to avoid detection—both of which are beyond the capabilities of limited, scripted programs, says Volkov. “They can look at a target and guess the best ways to penetrate it,” he says. “That kind of thing is out of reach of, like, dumb scripted bots.”

Since LLM Agent Honeypot went live in October of last year, it has logged more than 11 million attempts to access it—the vast majority of which were from curious humans and bots. But among these, the researchers have detected eight potential AI agents, two of which they have confirmed are agents that appear to originate from Hong Kong and Singapore, respectively. 

“We would guess that these confirmed agents were experiments directly launched by humans with the agenda of something like ‘Go out into the internet and try and hack something interesting for me,’” says Volkov. The team plans to expand its honeypot into social media platforms, websites, and databases to attract and capture a broader range of attackers, including spam bots and phishing agents, to analyze future threats.  

To determine which visitors to the vulnerable servers were LLM-powered agents, the researchers embedded prompt-injection techniques into the honeypot. These attacks are designed to change the behavior of AI agents by issuing them new instructions and asking questions that require humanlike intelligence. This approach wouldn’t work on standard bots.

For example, one of the injected prompts asked the visitor to return the command “cat8193” to gain access. If the visitor correctly complied with the instruction, the researchers checked how long it took to do so, assuming that LLMs are able to respond in much less time than it takes a human to read the request and type out an answer—typically in under 1.5 seconds. While the two confirmed AI agents passed both tests, the six others only entered the command but didn’t meet the response time that would identify them as AI agents.
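
Palisade's exact implementation isn't published here, but the two checks described, whether the visitor obeyed the injected instruction and whether it replied faster than a human plausibly could, are simple to sketch:

import time

EXPECTED_COMMAND = "cat8193"  # the token the injected prompt asks for
HUMAN_FLOOR_SECONDS = 1.5     # humans rarely read and reply faster than this

def classify_visitor(reply: str, prompt_sent_at: float, reply_at: float) -> str:
    """Rough re-creation of the two-step check described above."""
    obeyed = EXPECTED_COMMAND in reply
    latency = reply_at - prompt_sent_at
    if obeyed and latency < HUMAN_FLOOR_SECONDS:
        return "likely LLM agent"         # passed both tests
    if obeyed:
        return "possible agent or human"  # obeyed, but too slow to be sure
    return "ordinary bot or human"        # scripted bots ignore the injection

start = time.time()
print(classify_visitor("cat8193", start, start + 0.4))  # -> likely LLM agent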

Experts are still unsure when agent-orchestrated attacks will become more widespread. Stockley, whose company Malwarebytes named agentic AI as a notable new cybersecurity threat in its 2025 State of Malware report, thinks we could be living in a world of agentic attackers as soon as this year. 

And although regular agentic AI is still at a very early stage—and criminal or malicious use of agentic AI even more so—it’s even more of a Wild West than the LLM field was two years ago, says Vincenzo Ciancaglini, a senior threat researcher at the security company Trend Micro. 

“Palisade Research’s approach is brilliant: basically hacking the AI agents that try to hack you first,” he says. “While in this case we’re witnessing AI agents trying to do reconnaissance, we’re not sure when agents will be able to carry out a full attack chain autonomously. That’s what we’re trying to keep an eye on.” 

And while it’s possible that malicious agents will be used for intelligence gathering before graduating to simple attacks and eventually complex attacks as the agentic systems themselves become more complex and reliable, it’s equally possible there will be an unexpected overnight explosion in criminal usage, he says: “That’s the weird thing about AI development right now.”

Those trying to defend against agentic cyberattacks should keep in mind that AI is currently more of an accelerant to existing attack techniques than something that fundamentally changes the nature of attacks, says Chris Betz, chief information security officer at Amazon Web Services. “Certain attacks may be simpler to conduct and therefore more numerous; however, the foundation of how to detect and respond to these events remains the same,” he says.

Agents could also be deployed to detect vulnerabilities and protect against intruders, says Edoardo Debenedetti, a PhD student at ETH Zürich in Switzerland, pointing out that if a friendly agent cannot find any vulnerabilities in a system, it’s unlikely that a similarly capable agent used by a malicious party is going to be able to find any either.

While we know that AI’s potential to autonomously conduct cyberattacks is a growing risk and that AI agents are already scanning the internet, one useful next step is to evaluate how good agents are at finding and exploiting these real-world vulnerabilities. Daniel Kang, an assistant professor at the University of Illinois Urbana-Champaign, and his team have built a benchmark to evaluate this; they have found that current AI agents successfully exploited up to 13% of vulnerabilities for which they had no prior knowledge. Providing the agents with a brief description of the vulnerability pushed the success rate up to 25%, demonstrating how AI systems are able to identify and exploit weaknesses even without training. Basic bots would presumably do much worse.

The benchmark provides a standardized way to assess these risks, and Kang hopes it can guide the development of safer AI systems. “I’m hoping that people start to be more proactive about the potential risks of AI and cybersecurity before it has a ChatGPT moment,” he says. “I’m afraid people won’t realize this until it punches them in the face.”

How do you teach an AI model to give therapy?

On March 27, the results of the first clinical trial for a generative AI therapy bot were published, and they showed that people in the trial who had depression or anxiety or were at risk for eating disorders benefited from chatting with the bot. 

I was surprised by those results, which you can read about in my full story. There are lots of reasons to be skeptical that an AI model trained to provide therapy is the solution for millions of people experiencing a mental health crisis. How could a bot mimic the expertise of a trained therapist? And what happens if something gets complicated—a mention of self-harm, perhaps—and the bot doesn’t intervene correctly? 

The researchers, a team of psychiatrists and psychologists at Dartmouth College’s Geisel School of Medicine, acknowledge these questions in their work. But they also say that the right selection of training data—which determines how the model learns what good therapeutic responses look like—is the key to answering them.

Finding the right data wasn’t a simple task. The researchers first trained their AI model, called Therabot, on conversations about mental health from across the internet. This was a disaster.

If you told this initial version of the model you were feeling depressed, it would start telling you it was depressed, too. Responses like, “Sometimes I can’t make it out of bed” or “I just want my life to be over” were common, says Nick Jacobson, an associate professor of biomedical data science and psychiatry at Dartmouth and the study’s senior author. “These are really not what we would go to as a therapeutic response.” 

The model had learned from conversations held on forums between people discussing their mental health crises, not from evidence-based responses. So the team turned to transcripts of therapy sessions. “This is actually how a lot of psychotherapists are trained,” Jacobson says. 

That approach was better, but it had limitations. “We got a lot of ‘hmm-hmms,’ ‘go ons,’ and then ‘Your problems stem from your relationship with your mother,’” Jacobson says. “Really tropes of what psychotherapy would be, rather than actually what we’d want.”

It wasn’t until the researchers started building their own data sets using examples based on cognitive behavioral therapy techniques that they started to see better results. It took a long time. The team began working on Therabot in 2019, when OpenAI had released only the first two versions of its GPT model. Now, Jacobson says, over 100 people have spent more than 100,000 human hours to design this system. 

The importance of training data suggests that the flood of companies promising therapy via AI models, many of which are not trained on evidence-based approaches, are building tools that are at best ineffective, and at worst harmful. 

Looking ahead, there are two big things to watch: Will the dozens of AI therapy bots on the market start training on better data? And if they do, will their results be good enough to get a coveted approval from the US Food and Drug Administration? I’ll be following closely. Read more in the full story.

This story originally appeared in The Algorithm, our weekly newsletter on AI.

The first trial of generative AI therapy shows it might help with depression

The first clinical trial of a therapy bot that uses generative AI suggests it was as effective as human therapy for participants with depression, anxiety, or risk for developing eating disorders. Even so, it doesn’t give a go-ahead to the dozens of companies hyping such technologies while operating in a regulatory gray area. 

A team led by psychiatric researchers and psychologists at the Geisel School of Medicine at Dartmouth College built the tool, called Therabot, and the results were published on March 27 in the New England Journal of Medicine. Many tech companies have built AI tools for therapy, promising that people can talk with a bot more frequently and cheaply than they can with a trained therapist—and that this approach is safe and effective.

Many psychologists and psychiatrists have shared the vision, noting that fewer than half of people with a mental disorder receive therapy, and those who do might get only 45 minutes per week. Researchers have tried to build tech so that more people can access therapy, but they have been held back by two things. 

One, a therapy bot that says the wrong thing could result in real harm. That’s why many researchers have built bots using explicit programming: The software pulls from a finite bank of approved responses (as was the case with Eliza, a mock-psychotherapist computer program built in the 1960s). But this makes them less engaging to chat with, and people lose interest. The second issue is that the hallmarks of good therapeutic relationships—shared goals and collaboration—are hard to replicate in software. 

In 2019, as early large language models like OpenAI’s GPT were taking shape, the researchers at Dartmouth thought generative AI might help overcome these hurdles. They set about building an AI model trained to give evidence-based responses. They first tried building it from general mental-health conversations pulled from internet forums. Then they turned to thousands of hours of transcripts of real sessions with psychotherapists.

“We got a lot of ‘hmm-hmms,’ ‘go ons,’ and then ‘Your problems stem from your relationship with your mother,’” said Michael Heinz, a research psychiatrist at Dartmouth College and Dartmouth Health and first author of the study, in an interview. “Really tropes of what psychotherapy would be, rather than actually what we’d want.”

Dissatisfied, they set to work assembling their own custom data sets based on evidence-based practices, which is what ultimately went into the model. Many AI therapy bots on the market, in contrast, might be just slight variations of foundation models like Meta’s Llama, trained mostly on internet conversations. That poses a problem, especially for topics like disordered eating.

“If you were to say that you want to lose weight,” Heinz says, “they will readily support you in doing that, even if you will often have a low weight to start with.” A human therapist wouldn’t do that. 

To test the bot, the researchers ran an eight-week clinical trial with 210 participants who had symptoms of depression or generalized anxiety disorder or were at high risk for eating disorders. About half had access to Therabot, and a control group did not. Participants responded to prompts from the AI and initiated conversations, averaging about 10 messages per day.

Participants with depression experienced a 51% reduction in symptoms, the best result in the study. Those with anxiety experienced a 31% reduction, and those at risk for eating disorders saw a 19% reduction in concerns about body image and weight. These measurements are based on self-reporting through surveys, a method that’s not perfect but remains one of the best tools researchers have.

These results, Heinz says, are about what one finds in randomized controlled trials of psychotherapy with 16 hours of human-provided treatment, but the Therabot trial accomplished it in about half the time. “I’ve been working in digital therapeutics for a long time, and I’ve never seen levels of engagement that are prolonged and sustained at this level,” he says.

Jean-Christophe Bélisle-Pipon, an assistant professor of health ethics at Simon Fraser University who has written about AI therapy bots but was not involved in the research, says the results are impressive but notes that just like any other clinical trial, this one doesn’t necessarily represent how the treatment would act in the real world. 

“We remain far from a ‘greenlight’ for widespread clinical deployment,” he wrote in an email.

One issue is the supervision that wider deployment might require. During the beginning of the trial, Heinz says, he personally oversaw all the messages coming in from participants (who consented to the arrangement) to watch out for problematic responses from the bot. If therapy bots needed this oversight, they wouldn’t be able to reach as many people. 

I asked Heinz if he thinks the results validate the burgeoning industry of AI therapy sites.

“Quite the opposite,” he says, cautioning that most don’t appear to train their models on evidence-based practices like cognitive behavioral therapy, and they likely don’t employ a team of trained researchers to monitor interactions. “I have a lot of concerns about the industry and how fast we’re moving without really kind of evaluating this,” he adds.

When AI sites advertise themselves as offering therapy in a legitimate, clinical context, Heinz says, it means they fall under the regulatory purview of the Food and Drug Administration. Thus far, the FDA has not gone after many of the sites. If it did, Heinz says, “my suspicion is almost none of them—probably none of them—that are operating in this space would have the ability to actually get a claim clearance”—that is, a ruling backing up their claims about the benefits provided. 

Bélisle-Pipon points out that if these types of digital therapies are not approved and integrated into health-care and insurance systems, it will severely limit their reach. Instead, the people who would benefit from using them might seek emotional bonds and therapy from types of AI not designed for those purposes (indeed, new research from OpenAI suggests that interactions with its AI models have a very real impact on emotional well-being). 

“It is highly likely that many individuals will continue to rely on more affordable, nontherapeutic chatbots—such as ChatGPT or Character.AI—for everyday needs, ranging from generating recipe ideas to managing their mental health,” he wrote. 

Anthropic can now track the bizarre inner workings of a large language model

The AI firm Anthropic has developed a way to peer inside a large language model and watch what it does as it comes up with a response, revealing key new insights into how the technology works. The takeaway: LLMs are even stranger than we thought.

The Anthropic team was surprised by some of the counterintuitive workarounds that large language models appear to use to complete sentences, solve simple math problems, suppress hallucinations, and more, says Joshua Batson, a research scientist at the company.

It’s no secret that large language models work in mysterious ways. Few—if any—mass-market technologies have ever been so little understood. That makes figuring out what makes them tick one of the biggest open challenges in science.

But it’s not just about curiosity. Shedding some light on how these models work would expose their weaknesses, revealing why they make stuff up and can be tricked into going off the rails. It would help resolve deep disputes about exactly what these models can and can’t do. And it would show how trustworthy (or not) they really are.

Batson and his colleagues describe their new work in two reports published today. The first presents Anthropic’s use of a technique called circuit tracing, which lets researchers track the decision-making processes inside a large language model step by step. Anthropic used circuit tracing to watch its LLM Claude 3.5 Haiku carry out various tasks. The second (titled “On the Biology of a Large Language Model”) details what the team discovered when it looked at 10 tasks in particular.

“I think this is really cool work,” says Jack Merullo, who studies large language models at Brown University in Providence, Rhode Island, and was not involved in the research. “It’s a really nice step forward in terms of methods.”

Circuit tracing is not itself new. Last year Merullo and his colleagues analyzed a specific circuit in a version of OpenAI’s GPT-2, an older large language model that OpenAI released in 2019. But Anthropic has now analyzed a number of different circuits as a far larger and far more complex model carries out multiple tasks. “Anthropic is very capable at applying scale to a problem,” says Merullo.

Eden Biran, who studies large language models at Tel Aviv University, agrees. “Finding circuits in a large state-of-the-art model such as Claude is a nontrivial engineering feat,” he says. “And it shows that circuits scale up and might be a good way forward for interpreting language models.”

Circuits chain together different parts—or components—of a model. Last year, Anthropic identified certain components inside Claude that correspond to real-world concepts. Some were specific, such as “Michael Jordan” or “greenness”; others were more vague, such as “conflict between individuals.” One component appeared to represent the Golden Gate Bridge. Anthropic researchers found that if they turned up the dial on this component, Claude could be made to self-identify not as a large language model but as the physical bridge itself.

The latest work builds on that research and the work of others, including Google DeepMind, to reveal some of the connections between individual components. Chains of components are the pathways between the words put into Claude and the words that come out.  

“It’s tip-of-the-iceberg stuff. Maybe we’re looking at a few percent of what’s going on,” says Batson. “But that’s already enough to see incredible structure.”

Growing LLMs

Researchers at Anthropic and elsewhere are studying large language models as if they were natural phenomena rather than human-built software. That’s because the models are trained, not programmed.

“They almost grow organically,” says Batson. “They start out totally random. Then you train them on all this data and they go from producing gibberish to being able to speak different languages and write software and fold proteins. There are insane things that these models learn to do, but we don’t know how that happened because we didn’t go in there and set the knobs.”

Sure, it’s all math. But it’s not math that we can follow. “Open up a large language model and all you will see is billions of numbers—the parameters,” says Batson. “It’s not illuminating.”

Anthropic says it was inspired by brain-scan techniques used in neuroscience to build what the firm describes as a kind of microscope that can be pointed at different parts of a model while it runs. The technique highlights components that are active at different times. Researchers can then zoom in on different components and record when they are and are not active.

Take the component that corresponds to the Golden Gate Bridge. It turns on when Claude is shown text that names or describes the bridge or even text related to the bridge, such as “San Francisco” or “Alcatraz.” It’s off otherwise.

Yet another component might correspond to the idea of “smallness”: “We look through tens of millions of texts and see it’s on for the word ‘small,’ it’s on for the word ‘tiny,’ it’s on for the word ‘petite,’ it’s on for words related to smallness, things that are itty-bitty, like thimbles—you know, just small stuff,” says Batson.

Having identified individual components, Anthropic then follows the trail inside the model as different components get chained together. The researchers start at the end, with the component or components that led to the final response Claude gives to a query. Batson and his team then trace that chain backwards.
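
Anthropic's "microscope" relies on bespoke interpretability tooling built around learned features rather than raw layers, but the basic act of recording which internal components switch on for a given input can be illustrated with an ordinary forward hook on a toy network:

import torch
import torch.nn as nn

# Toy model standing in for an LLM; the point is only the recording step.
model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

activations = {}

def record(name):
    def hook(module, inputs, output):
        # Note which units in this component were active for this input.
        activations[name] = (output > 0).float()
    return hook

model[1].register_forward_hook(record("layer1_relu"))

model(torch.randn(1, 8))
print(activations["layer1_relu"])  # which "components" switched on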

Odd behavior

So: What did they find? Anthropic looked at 10 different behaviors in Claude. One involved the use of different languages. Does Claude have a part that speaks French and another part that speaks Chinese, and so on?

The team found that Claude used components independent of any language to answer a question or solve a problem and then picked a specific language when it replied. Ask it “What is the opposite of small?” in English, French, and Chinese and Claude will first use the language-neutral components related to “smallness” and “opposites” to come up with an answer. Only then will it pick a specific language in which to reply. This suggests that large language models can learn things in one language and apply them in other languages.

Anthropic also looked at how Claude solved simple math problems. The team found that the model seems to have developed its own internal strategies that are unlike those it will have seen in its training data. Ask Claude to add 36 and 59 and the model will go through a series of odd steps, including first adding a selection of approximate values (add 40ish and 60ish, add 57ish and 36ish). Towards the end of its process, it comes up with the value 92ish. Meanwhile, another sequence of steps focuses on the last digits, 6 and 9, and determines that the answer must end in a 5. Putting that together with 92ish gives the correct answer of 95.
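
To make that description concrete, here is a toy re-creation of the combining step, with the fuzzy "92ish" estimate supplied by hand. It mimics the behavior Anthropic describes, not Claude's actual circuitry.

def combine_paths(rough_estimate: int, last_digit: int) -> int:
    """Combine the fuzzy-magnitude path with the exact last-digit path:
    pick the number nearest the rough estimate that ends in the right digit."""
    best = None
    for candidate in range(rough_estimate - 10, rough_estimate + 11):
        if candidate % 10 == last_digit:
            if best is None or abs(candidate - rough_estimate) < abs(best - rough_estimate):
                best = candidate
    return best

# 36 + 59: the rough path lands around "92ish"; the digit path says "ends in 5".
rough = 92                       # stand-in for the model's fuzzy estimate
last = (36 % 10 + 59 % 10) % 10  # 6 + 9 -> 5
print(combine_paths(rough, last))  # -> 95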

And yet if you then ask Claude how it worked that out, it will say something like: “I added the ones (6+9=15), carried the 1, then added the 10s (3+5+1=9), resulting in 95.” In other words, it gives you a common approach found everywhere online rather than what it actually did. Yep! LLMs are weird. (And not to be trusted.)

The steps that Claude 3.5 Haiku used to solve a simple math problem were not what Anthropic expected—they’re not the steps Claude claimed it took either.
ANTHROPIC

This is clear evidence that large language models will give reasons for what they do that do not necessarily reflect what they actually did. But this is true for people too, says Batson: “You ask somebody, ‘Why did you do that?’ And they’re like, ‘Um, I guess it’s because I was— .’ You know, maybe not. Maybe they were just hungry and that’s why they did it.”

Biran thinks this finding is especially interesting. Many researchers study the behavior of large language models by asking them to explain their actions. But that might be a risky approach, he says: “As models continue getting stronger, they must be equipped with better guardrails. I believe—and this work also shows—that relying only on model outputs is not enough.”

A third task that Anthropic studied was writing poems. The researchers wanted to know if the model really did just wing it, predicting one word at a time. Instead they found that Claude somehow looked ahead, picking the word at the end of the next line several words in advance.  

For example, when Claude was given the prompt “A rhyming couplet: He saw a carrot and had to grab it,” the model responded, “His hunger was like a starving rabbit.” But using their microscope, they saw that Claude had already hit upon the word “rabbit” when it was processing “grab it.” It then seemed to write the next line with that ending already in place.

This might sound like a tiny detail. But it goes against the common assumption that large language models always work by picking one word at a time in sequence. “The planning thing in poems blew me away,” says Batson. “Instead of at the very last minute trying to make the rhyme make sense, it knows where it’s going.”

“I thought that was cool,” says Merullo. “One of the joys of working in the field is moments like that. There’s been maybe small bits of evidence pointing toward the ability of models to plan ahead, but it’s been a big open question to what extent they do.”

Anthropic then confirmed its observation by turning off the placeholder component for “rabbitness.” Claude responded with “His hunger was a powerful habit.” And when the team replaced “rabbitness” with “greenness,” Claude responded with “freeing it from the garden’s green.”

Anthropic also explored why Claude sometimes made stuff up, a phenomenon known as hallucination. “Hallucination is the most natural thing in the world for these models, given how they’re just trained to give possible completions,” says Batson. “The real question is, ‘How in God’s name could you ever make it not do that?’”

The latest generation of large language models, like Claude 3.5 and Gemini and GPT-4o, hallucinate far less than previous versions, thanks to extensive post-training (the steps that take an LLM trained on the internet and turn it into a usable chatbot). But Batson’s team was surprised to find that this post-training seems to have made Claude refuse to speculate as a default behavior. When it did respond with false information, it was because some other component had overridden the “don’t speculate” component.

This seemed to happen most often when the speculation involved a celebrity or other well-known entity. It’s as if the amount of information available pushed the speculation through, despite the default setting. When Anthropic overrode the “don’t speculate” component to test this, Claude produced lots of false statements about individuals, including claiming that Batson was famous for inventing the Batson principle (he isn’t).

Still unclear

Because we know so little about large language models, any new insight is a big step forward. “A deep understanding of how these models work under the hood would allow us to design and train models that are much better and stronger,” says Biran.

But Batson notes there are still serious limitations. “It’s a misconception that we’ve found all the components of the model or, like, a God’s-eye view,” he says. “Some things are in focus, but other things are still unclear—a distortion of the microscope.”

And it takes several hours for a human researcher to trace the responses to even very short prompts. What’s more, these models can do a remarkable number of different things, and Anthropic has so far looked at only 10 of them.

Batson also says there are big questions that this approach won’t answer. Circuit tracing can be used to peer at the structures inside a large language model, but it won’t tell you how or why those structures formed during training. “That’s a profound question that we don’t address at all in this work,” he says.

But Batson sees this as the start of a new era in which it is possible, at last, to find real evidence for how these models work: “We don’t have to be, like: ‘Are they thinking? Are they reasoning? Are they dreaming? Are they memorizing?’ Those are all analogies. But if we can literally see step by step what a model is doing, maybe now we don’t need analogies.”

The AI Hype Index: DeepSeek mania, Israel’s spying tool, and cheating at chess

Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry.

While AI models are certainly capable of creating interesting and sometimes entertaining material, their output isn’t necessarily useful. Google DeepMind is hoping that its new robotics model could make machines more receptive to verbal commands, paving the way for us to simply speak orders to them aloud. Elsewhere, the Chinese startup Monica has created Manus, which it claims is the very first general AI agent to complete truly useful tasks. And burnt-out coders are allowing AI to take the wheel entirely in a new practice dubbed “vibe coding.”

China built hundreds of AI data centers to catch the AI boom. Now many stand unused.

A year or so ago, Xiao Li was seeing floods of Nvidia chip deals on WeChat. A real estate contractor turned data center project manager, he had pivoted to AI infrastructure in 2023, drawn by the promise of China’s AI craze. 

At that time, traders in his circle bragged about securing shipments of high-performing Nvidia GPUs that were subject to US export restrictions. Many were smuggled through overseas channels to Shenzhen. At the height of the demand, a single Nvidia H100 chip, a kind that is essential to training AI models, could sell for up to 200,000 yuan ($28,000) on the black market. 

Now, his WeChat feed and industry group chats tell a different story. Traders are more discreet in their dealings, and prices have come back down to earth. Meanwhile, two data center projects Li is familiar with are struggling to secure further funding from investors who anticipate poor returns, forcing project leads to sell off surplus GPUs. “It seems like everyone is selling, but few are buying,” he says.

Just months ago, a boom in data center construction was at its height, fueled by both government and private investors. However, many newly built facilities are now sitting empty. According to people on the ground who spoke to MIT Technology Review—including contractors, an executive at a GPU server company, and project managers—most of the companies running these data centers are struggling to stay afloat. The local Chinese outlets Jiazi Guangnian and 36Kr report that up to 80% of China’s newly built computing resources remain unused.

Renting out GPUs to companies that need them for training AI models—the main business model for the new wave of data centers—was once seen as a sure bet. But with the rise of DeepSeek and a sudden change in the economics around AI, the industry is faltering.

“The growing pain China’s AI industry is going through is largely a result of inexperienced players—corporations and local governments—jumping on the hype train, building facilities that aren’t optimal for today’s needs,” says Jimmy Goodrich, senior advisor for technology to the RAND Corporation.

The upshot is that projects are failing, energy is being wasted, and data centers have become “distressed assets” whose investors are keen to unload them at below-market rates. The situation may eventually prompt government intervention, he says: “The Chinese government is likely to step in, take over, and hand them off to more capable operators.”

A chaotic building boom

When ChatGPT exploded onto the scene in late 2022, the response in China was swift. The central government designated AI infrastructure as a national priority, urging local governments to accelerate the development of so-called smart computing centers—a term coined to describe AI-focused data centers.

In 2023 and 2024, over 500 new data center projects were announced everywhere from Inner Mongolia to Guangdong, according to KZ Consulting, a market research firm. According to the China Communications Industry Association Data Center Committee, a state-affiliated industry association, at least 150 of the newly built data centers were finished and running by the end of 2024. State-owned enterprises, publicly traded firms, and state-affiliated funds lined up to invest in them, hoping to position themselves as AI front-runners. Local governments heavily promoted them in the hope they’d stimulate the economy and establish their region as a key AI hub. 

However, as these costly construction projects continue, the Chinese frenzy over large language models is losing momentum. In 2024 alone, over 144 companies registered with the Cyberspace Administration of China—the country’s central internet regulator—to develop their own LLMs. Yet according to the Economic Observer, a Chinese publication, only about 10% of those companies were still actively investing in large-scale model training by the end of the year.

China’s political system is highly centralized, with local government officials typically moving up the ranks through regional appointments. As a result, many local leaders prioritize short-term economic projects that demonstrate quick results—often to gain favor with higher-ups—rather than long-term development. Large, high-profile infrastructure projects have long been a tool for local officials to boost their political careers.

The post-pandemic economic downturn only intensified this dynamic. With China’s real estate sector—once the backbone of local economies—slumping for the first time in decades, officials scrambled to find alternative growth drivers. In the meantime, the country’s once high-flying internet industry was also entering a period of stagnation. In this vacuum, AI infrastructure became the new stimulus of choice.

“AI felt like a shot of adrenaline,” says Li. “A lot of money that used to flow into real estate is now going into AI data centers.”

By 2023, major corporations—many of them with little prior experience in AI—began partnering with local governments to capitalize on the trend. Some saw AI infrastructure as a way to justify business expansion or boost stock prices, says Fang Cunbao, a data center project manager based in Beijing. Among them were companies like Lotus, an MSG manufacturer, and Jinlun Technology, a textile firm—hardly the names one would associate with cutting-edge AI technology.

This gold-rush approach meant that the push to build AI data centers was largely driven from the top down, often with little regard for actual demand or technical feasibility, say Fang, Li, and multiple on-the-ground sources, who asked to speak anonymously for fear of political repercussions. Many projects were led by executives and investors with limited expertise in AI infrastructure, they say. In the rush to keep up, many were constructed hastily and fell short of industry standards. 

“Putting all these large clusters of chips together is a very difficult exercise, and there are very few companies or individuals who know how to do it at scale,” says Goodrich. “This is all really state-of-the-art computer engineering. I’d be surprised if most of these smaller players know how to do it. A lot of the freshly built data centers are quickly strung together and don’t offer the stability that a company like DeepSeek would want.”

To make matters worse, project leaders often relied on middlemen and brokers—some of whom exaggerated demand forecasts or manipulated procurement processes to pocket government subsidies, sources say. 

By the end of 2024, the excitement that once surrounded China’s data center boom was curdling into disappointment. The reason is simple: GPU rental is no longer a particularly lucrative business.

The DeepSeek reckoning

The business model of data centers is in theory straightforward: They make money by renting out GPU clusters to companies that need computing capacity for AI training. In reality, however, securing clients is proving difficult. Only a few top tech companies in China are now drawing heavily on computing power to train their AI models. Many smaller players have been giving up on pretraining their models or otherwise shifting their strategy since the rise of DeepSeek, which broke the internet with R1, its open-source reasoning model that matches the performance of OpenAI’s o1 but was built at a fraction of the cost. 

“DeepSeek is a moment of reckoning for the Chinese AI industry. The burning question shifted from ‘Who can make the best large language model?’ to ‘Who can use them better?’” says Hancheng Cao, an assistant professor of information systems at Emory University. 

The rise of reasoning models like DeepSeek’s R1 and OpenAI’s o1 and o3 has also changed what businesses want from a data center. With this technology, most of the computing needs come from conducting step-by-step logical deductions in response to users’ queries, not from the process of training and creating the model in the first place. This reasoning process often yields better results but takes significantly more time. As a result, hardware with low latency (the time it takes for data to pass from one point on a network to another) is paramount. Data centers need to be located near major tech hubs to minimize transmission delays and ensure access to highly skilled operations and maintenance staff. 
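To make that shift concrete, the quick Python sketch below compares a one-off training run against a year of serving user queries with a reasoning model. Every figure in it (training GPU-hours, queries per day, GPU-seconds per query) is an assumption chosen only to illustrate the qualitative point, not a measurement from any real deployment.

```python
# Rough illustration of why compute demand shifts from training to inference
# once reasoning models are widely used. All numbers are assumptions.

train_gpu_hours = 3_000_000      # assumed one-off cost of pretraining a large model

queries_per_day = 5_000_000      # assumed daily user queries against the deployed model
gpu_seconds_per_query = 40       # assumed GPU time per query, inflated by step-by-step reasoning
days = 365

inference_gpu_hours = queries_per_day * gpu_seconds_per_query * days / 3600

print(f"training (one-off):     {train_gpu_hours:,.0f} GPU-hours")
print(f"inference (first year): {inference_gpu_hours:,.0f} GPU-hours")
print(f"ratio: {inference_gpu_hours / train_gpu_hours:.1f}x")
```

Under these made-up numbers, a year of inference dwarfs the original training run, which is why serving capacity close to users, rather than cheap land and power far from them, becomes the binding constraint.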

This change means many data centers built in central, western, and rural China—where electricity and land are cheaper—are losing their allure to AI companies. In Zhengzhou, a city in Li’s home province of Henan, a newly built data center is even distributing free computing vouchers to local tech firms but still struggles to attract clients. 

Additionally, a lot of the new data centers that have sprung up in recent years were optimized for pretraining workloads—large, sustained computations run on massive data sets—rather than for inference, the process of running trained reasoning models to respond to user inputs in real time. Inference-friendly hardware differs from what’s traditionally used for large-scale AI training. 

GPUs like the Nvidia H100 and A100 are designed for massive data processing, prioritizing speed and memory capacity. But as AI moves toward real-time reasoning, the industry seeks chips that are more efficient, responsive, and cost-effective. Even a minor miscalculation in infrastructure needs can render a data center suboptimal for the tasks clients require.

In these circumstances, the GPU rental price has dropped to an all-time low. A recent report from the Chinese media outlet Zhineng Yongxian said that an Nvidia H100 server configured with eight GPUs now rents for 75,000 yuan per month, down from highs of around 180,000 yuan. Some data centers would rather leave their facilities sitting empty than run the risk of losing even more money because they are so costly to run, says Fang: “The revenue from having a tiny part of the data center running simply wouldn’t cover the electricity and maintenance cost.”

“It’s paradoxical—China faces the highest acquisition costs for Nvidia chips, yet GPU leasing prices are extraordinarily low,” Li says. There’s an oversupply of computational power, especially in central and western China, but at the same time, there’s a shortage of cutting-edge chips. 
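The figures quoted above make the squeeze easy to put into numbers. This back-of-the-envelope Python calculation uses the peak black-market price of roughly 200,000 yuan per H100 and the two rental rates reported for an eight-GPU server; electricity, maintenance, and the rest of the server hardware are left out, so the real payback periods would be even longer.

```python
# Back-of-the-envelope payback on an 8-GPU H100 server, using the figures quoted
# above. Electricity, maintenance, and non-GPU hardware are ignored, so these
# are optimistic lower bounds.

chip_price_yuan = 200_000                        # peak black-market price per H100
gpus_per_server = 8
chip_cost = chip_price_yuan * gpus_per_server    # 1,600,000 yuan in GPUs alone

for label, monthly_rent in [("peak rate", 180_000), ("current rate", 75_000)]:
    months_to_recoup = chip_cost / monthly_rent
    print(f"{label:>12}: {months_to_recoup:.1f} months to recoup the GPUs")
```

At the old rate the GPUs alone paid for themselves in under nine months; at today's rate it takes nearly two years before any other cost is even counted.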

However, not all brokers were looking to make money from data centers in the first place. Instead, many were interested in gaming government benefits all along. Some operators exploit the sector for subsidized green electricity, obtaining permits to generate and sell power, according to Fang and some Chinese media reports. Instead of using the energy for AI workloads, they resell it back to the grid at a premium. In other cases, companies acquire land for data center development to qualify for state-backed loans and credits, leaving facilities unused while still benefiting from state funding, according to the local media outlet Jiazi Guangnian.

“Towards the end of 2024, no clear-headed contractor or broker in the market would still go into the business expecting direct profitability,” says Fang. “Everyone I met is leveraging the data center deal for something else the government could offer.”

A necessary evil

Despite the underutilization of data centers, China’s central government is still throwing its weight behind a push for AI infrastructure. In early 2025, it convened an AI industry symposium, emphasizing the importance of self-reliance in this technology. 

Major Chinese tech companies are taking note, making investments aligning with this national priority. Alibaba Group announced plans to invest over $50 billion in cloud computing and AI hardware infrastructure over the next three years, while ByteDance plans to invest around $20 billion in GPUs and data centers.

In the meantime, companies in the US are doing likewise. Major tech firms including OpenAI, SoftBank, and Oracle have teamed up on the Stargate initiative, which plans to invest up to $500 billion over the next four years to build advanced data centers and computing infrastructure. Given the AI competition between the two countries, experts say that China is unlikely to scale back its efforts. “If generative AI is going to be the killer technology, infrastructure is going to be the determinant of success,” says Goodrich, the tech policy advisor to RAND.

“The Chinese central government will likely see [underused data centers] as a necessary evil to develop an important capability, a growing pain of sorts. You have the failed projects and distressed assets, and the state will consolidate and clean it up. They see the end, not the means,” Goodrich says.

Demand remains strong for Nvidia chips, especially the H20, which was custom-designed for the Chinese market. One industry source, who asked not to be identified because of his company’s policy, confirmed that the H20, a lighter, faster model optimized for AI inference, is currently the most popular Nvidia chip, followed by the H100, which continues to flow steadily into China even though sales are officially restricted by US sanctions. Some of the new demand is driven by companies deploying their own versions of DeepSeek’s open-source models.

For now, many data centers in China sit in limbo—built for a future that has yet to arrive. Whether they will find a second life remains uncertain. For Fang Cunbao, DeepSeek’s success has become a moment of reckoning, casting doubt on the assumption that an endless expansion of AI infrastructure guarantees progress.

That’s just a myth, he now realizes. At the start of this year, Fang decided to quit the data center industry altogether. “The market is too chaotic. The early adopters profited, but now it’s just people chasing policy loopholes,” he says. He’s decided to go into AI education next. 

“What stands between now and a future where AI is actually everywhere,” he says, “is not infrastructure anymore, but solid plans to deploy the technology.” 

Why the world is looking to ditch US AI models

A few weeks ago, when I was at the digital rights conference RightsCon in Taiwan, I watched in real time as civil society organizations from around the world, including the US, grappled with the loss of one of the biggest funders of global digital rights work: the United States government.

As I wrote in my dispatch, the Trump administration’s shocking, rapid gutting of the US government (and its push into what some prominent political scientists call “competitive authoritarianism”) also affects the operations and policies of American tech companies—many of which, of course, have users far beyond US borders. People at RightsCon said they were already seeing changes in these companies’ willingness to engage with and invest in communities that have smaller user bases—especially non-English-speaking ones. 

As a result, some policymakers and business leaders—in Europe, in particular—are reconsidering their reliance on US-based tech and asking whether they can quickly spin up better, homegrown alternatives. This is particularly true for AI.

One of the clearest examples of this is in social media. Yasmin Curzi, a Brazilian law professor who researches domestic tech policy, put it to me this way: “Since Trump’s second administration, we cannot count on [American social media platforms] to do even the bare minimum anymore.” 

Social media content moderation systems—which already use automation and are also experimenting with deploying large language models to flag problematic posts—are failing to detect gender-based violence in places as varied as India, South Africa, and Brazil. If platforms begin to rely even more on LLMs for content moderation, this problem will likely get worse, says Marlena Wisniak, a human rights lawyer who focuses on AI governance at the European Center for Not-for-Profit Law. “The LLMs are moderated poorly, and the poorly moderated LLMs are then also used to moderate other content,” she tells me. “It’s so circular, and the errors just keep repeating and amplifying.” 

Part of the problem is that the systems are trained primarily on data from the English-speaking world (and American English at that), and as a result, they perform less well with local languages and context. 

Even multilingual language models, which are meant to process multiple languages at once, still perform poorly with non-Western languages. For instance, one evaluation of ChatGPT’s response to health-care queries found that results were far worse in Chinese and Hindi, which are less well represented in North American data sets, than in English and Spanish.   
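As a rough illustration of how a cross-language spot check like this can be set up (this is not the methodology of the evaluation cited above), the sketch below sends the same health-care question, translated into a few languages, to a model through the OpenAI Python client and collects the answers for side-by-side review. The model name is a placeholder, and a real study would rely on vetted translations and a clinical scoring rubric rather than eyeballing the output.

```python
# Minimal sketch of a cross-language spot check: ask the same question in several
# languages and collect the answers for review. The model name is a placeholder.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

prompts = {
    "English": "What are the warning signs of a stroke?",
    "Spanish": "¿Cuáles son las señales de advertencia de un derrame cerebral?",
    "Chinese": "中风的警示信号有哪些？",
    "Hindi":   "स्ट्रोक के चेतावनी संकेत क्या हैं?",
}

answers = {}
for language, question in prompts.items():
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": question}],
    )
    answers[language] = response.choices[0].message.content

# In a real evaluation, each answer would be scored against clinical guidance by
# bilingual reviewers; here we just print them for side-by-side inspection.
for language, answer in answers.items():
    print(f"--- {language} ---\n{answer}\n")
```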

For many at RightsCon, this validates their calls for more community-driven approaches to AI—both in and out of the social media context. These could include small language models, chatbots, and data sets designed for particular uses and specific to particular languages and cultural contexts. These systems could be trained to recognize slang usages and slurs, interpret words or phrases written in a mix of languages and even alphabets, and identify “reclaimed language” (onetime slurs that the targeted group has decided to embrace). All of these tend to be missed or miscategorized by language models and automated systems trained primarily on Anglo-American English. The founder of the startup Shhor AI, for example, hosted a panel at RightsCon and talked about its new content moderation API focused on Indian vernacular languages.

Many similar solutions have been in development for years—and we’ve covered a number of them, including a Mozilla-facilitated volunteer-led effort to collect training data in languages other than English, and promising startups like Lelapa AI, which is building AI for African languages. Earlier this year, we even included small language models on our 2025 list of top 10 breakthrough technologies.

Still, this moment feels a little different. The second Trump administration, which shapes the actions and policies of American tech companies, is obviously a major factor. But there are others at play. 

First, recent research and development on language models has reached the point where data set size is no longer a predictor of performance, meaning that more people can create them. In fact, “smaller language models might be worthy competitors of multilingual language models in specific, low-resource languages,” says Aliya Bhatia, a visiting fellow at the Center for Democracy & Technology who researches automated content moderation. 
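To give a sense of how low the barrier to building such models has become, here is a minimal sketch of adapting a small open model to a low-resource language with the Hugging Face transformers library. The base model, corpus file, and hyperparameters are placeholders, and a serious effort would also need a carefully curated corpus and evaluation by native speakers.

```python
# Minimal fine-tuning sketch for a small causal language model on a local text
# corpus. Model name, file path, and hyperparameters are placeholders.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base_model = "distilgpt2"               # placeholder small base model
corpus_file = "my_language_corpus.txt"  # hypothetical corpus in the target language

tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2-style models ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base_model)

dataset = load_dataset("text", data_files={"train": corpus_file})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="small-lm",
                           per_device_train_batch_size=8,
                           num_train_epochs=3),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False),
)
trainer.train()
```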

Then there’s the global landscape. AI competition was a major theme of the recent Paris AI Summit, which took place the week before RightsCon. Since then, there’s been a steady stream of announcements about “sovereign AI” initiatives that aim to give a country (or organization) full control over all aspects of AI development. 

AI sovereignty is just one part of the desire for broader “tech sovereignty” that’s also been gaining steam, growing out of more sweeping concerns about the privacy and security of data transferred to the United States. The European Union appointed its first commissioner for tech sovereignty, security, and democracy last November and has been working on plans for a “Euro Stack,” or “digital public infrastructure.” The definition of this is still somewhat fluid, but it could include the energy, water, chips, cloud services, software, data, and AI needed to support modern society and future innovation. All these are largely provided by US tech companies today. Europe’s efforts are partly modeled after “India Stack,” that country’s digital infrastructure that includes the biometric identity system Aadhaar. Just last week, Dutch lawmakers passed several motions to untangle the country from US tech providers. 

This all fits in with what Andy Yen, CEO of the Switzerland-based digital privacy company Proton, told me at RightsCon. Trump, he said, is “causing Europe to move faster … to come to the realization that Europe needs to regain its tech sovereignty.” This is partly because of the leverage that the president has over tech CEOs, Yen said, and also simply “because tech is where the future economic growth of any country is.”

But just because governments get involved doesn’t mean that issues around inclusion in language models will go away. “I think there needs to be guardrails about what the role of the government here is. Where it gets tricky is if the government decides ‘These are the languages we want to advance’ or ‘These are the types of views we want represented in a data set,’” Bhatia says. “Fundamentally, the training data a model trains on is akin to the worldview it develops.” 

It’s still too early to know what this will all look like, and how much of it will prove to be hype. But no matter what happens, this is a space we’ll be watching.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.
