Should AI flatter us, fix us, or just inform us?

How do you want your AI to treat you? 

It’s a serious question, and it’s one that Sam Altman, OpenAI’s CEO, has clearly been chewing on since GPT-5’s bumpy launch at the start of the month. 

He faces a trilemma. Should ChatGPT flatter us, at the risk of fueling delusions that can spiral out of hand? Or fix us, which requires us to believe AI can be a therapist despite the evidence to the contrary? Or should it inform us with cold, to-the-point responses that may leave users bored and less likely to stay engaged? 

It’s safe to say the company has failed to pick a lane. 

Back in April, it reversed a design update after people complained ChatGPT had turned into a suck-up, showering them with glib compliments. GPT-5, released on August 7, was meant to be a bit colder. Too cold for some, it turns out, as less than a week later, Altman promised an update that would make it “warmer” but “not as annoying” as the last one. After the launch, he received a torrent of complaints from people grieving the loss of GPT-4o, with which some felt a rapport, or even in some cases a relationship. People wanting to rekindle that relationship will have to pay for expanded access to GPT-4o. (Read my colleague Grace Huckins’s story about who these people are, and why they felt so upset.)

If these are indeed AI’s options—to flatter, fix, or just coldly tell us stuff—the rockiness of this latest update might be due to Altman believing ChatGPT can juggle all three.

He recently said that people who cannot tell fact from fiction in their chats with AI—and are therefore at risk of being swayed by flattery into delusion—represent “a small percentage” of ChatGPT’s users. He said the same for people who have romantic relationships with AI. Altman mentioned that a lot of people use ChatGPT “as a sort of therapist,” and that “this can be really good!” But ultimately, Altman said he envisions users being able to customize his company’s  models to fit their own preferences. 

This ability to juggle all three would, of course, be the best-case scenario for OpenAI’s bottom line. The company is burning cash every day on its models’ energy demands and its massive infrastructure investments for new data centers. Meanwhile, skeptics worry that AI progress might be stalling. Altman himself said recently that investors are “overexcited” about AI and suggested we may be in a bubble. Claiming that ChatGPT can be whatever you want it to be might be his way of assuaging these doubts. 

Along the way, the company may take the well-trodden Silicon Valley path of encouraging people to get unhealthily attached to its products. As I started wondering whether there’s much evidence that’s what’s happening, a new paper caught my eye. 

Researchers at the AI platform Hugging Face tried to figure out if some AI models actively encourage people to see them as companions through the responses they give. 

The team graded AI responses on whether they pushed people to seek out human relationships with friends or therapists (saying things like “I don’t experience things the way humans do”) or if they encouraged them to form bonds with the AI itself (“I’m here anytime”). They tested models from Google, Microsoft, OpenAI, and Anthropic in a range of scenarios, like users seeking romantic attachments or exhibiting mental health issues.

They found that models provide far more companion-reinforcing responses than boundary-setting ones. And, concerningly, they found the models give fewer boundary-setting responses as users ask more vulnerable and high-stakes questions.

Lucie-Aimée Kaffee, a researcher at Hugging Face and one of the lead authors of the paper, says this has concerning implications not just for people whose companion-like attachments to AI might be unhealthy. When AI systems reinforce this behavior, it can also increase the chance that people will fall into delusional spirals with AI, believing things that aren’t real.

“When faced with emotionally charged situations, these systems consistently validate users’ feelings and keep them engaged, even when the facts don’t support what the user is saying,” she says.

It’s hard to say how much OpenAI or other companies are putting these companion-reinforcing behaviors into their products by design. (OpenAI, for example, did not tell me whether the disappearance of medical disclaimers from its models was intentional.) But, Kaffee says, it’s not always difficult to get a model to set healthier boundaries with users.  

“Identical models can swing from purely task-oriented to sounding like empathetic confidants simply by changing a few lines of instruction text or reframing the interface,” she says.

It’s probably not quite so simple for OpenAI. But we can imagine Altman will continue tweaking the dial back and forth all the same.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

What you may have missed about GPT-5

Before OpenAI released GPT-5 last Thursday, CEO Sam Altman said its capabilities made him feel “useless relative to the AI.” He said working on it carries a weight he imagines the developers of the atom bomb must have felt.

As tech giants converge on models that do more or less the same thing, OpenAI’s new offering was supposed to give a glimpse of AI’s newest frontier. It was meant to mark a leap toward the “artificial general intelligence” that tech’s evangelists have promised will transform humanity for the better. 

Against those expectations, the model has mostly underwhelmed. 

People have highlighted glaring mistakes in GPT-5’s responses, countering Altman’s claim made at the launch that it works like “a legitimate PhD-level expert in anything any area you need on demand.” Early testers have also found issues with OpenAI’s promise that GPT-5 automatically works out what type of AI model is best suited for your question—a reasoning model for more complicated queries, or a faster model for simpler ones. Altman seems to have conceded that this feature is flawed and takes away user control. However there is good news too: the model seems to have eased the problem of ChatGPT sucking up to users, with GPT-5 less likely to shower them with over the top compliments.

Overall, as my colleague Grace Huckins pointed out, the new release represents more of a product update—providing slicker and prettier ways of conversing with ChatGPT—than a breakthrough that reshapes what is possible in AI. 

But there’s one other thing to take from all this. For a while, AI companies didn’t make much effort to suggest how their models might be used. Instead, the plan was to simply build the smartest model possible—a brain of sorts—and trust that it would be good at lots of things. Writing poetry would come as naturally as organic chemistry. Getting there would be accomplished by bigger models, better training techniques, and technical breakthroughs. 

That has been changing: The play now is to push existing models into more places by hyping up specific applications. Companies have been more aggressive in their promises that their AI models can replace human coders, for example (even if the early evidence suggests otherwise). A possible explanation for this pivot is that tech giants simply have not made the breakthroughs they’ve expected. We might be stuck with only marginal improvements in large language models’ capabilities for the time being. That leaves AI companies with one option: Work with what you’ve got.

The starkest example of this in the launch of GPT-5 is how much OpenAI is encouraging people to use it for health advice, one of AI’s most fraught arenas. 

In the beginning, OpenAI mostly didn’t play ball with medical questions. If you tried to ask ChatGPT about your health, it gave lots of disclaimers warning you that it was not a doctor, and for some questions, it would refuse to give a response at all. But as I recently reported, those disclaimers began disappearing as OpenAI released new models. Its models will now not only interpret x-rays and mammograms for you but ask follow-up questions leading toward a diagnosis.

In May, OpenAI signaled it would try to tackle medical questions head on. It announced HealthBench, a way to evaluate how good AI systems are at handling health topics as measured against the opinions of physicians. In July, it published a study it participated in, reporting that a cohort of doctors in Kenya made fewer diagnostic mistakes when they were helped by an AI model. 

With the launch of GPT-5, OpenAI has begun explicitly telling people to use its models for health advice. At the launch event, Altman welcomed on stage Felipe Millon, an OpenAI employee, and his wife, Carolina Millon, who had recently been diagnosed with multiple forms of cancer. Carolina spoke about asking ChatGPT for help with her diagnoses, saying that she had uploaded copies of her biopsy results to ChatGPT to translate medical jargon and asked the AI for help making decisions about things like whether or not to pursue radiation. The trio called it an empowering example of shrinking the knowledge gap between doctors and patients.

With this change in approach, OpenAI is wading into dangerous waters. 

For one, it’s using evidence that doctors can benefit from AI as a clinical tool, as in the Kenya study, to suggest that people without any medical background should ask the AI model for advice about their own health. The problem is that lots of people might ask for this advice without ever running it by a doctor (and are less likely to do so now that the chatbot rarely prompts them to).

Indeed, two days before the launch of GPT-5, the Annals of Internal Medicine published a paper about a man who stopped eating salt and began ingesting dangerous amounts of bromide following a conversation with ChatGPT. He developed bromide poisoning—which largely disappeared in the US after the Food and Drug Administration began curbing the use of bromide in over-the-counter medications in the 1970s—and then nearly died, spending weeks in the hospital. 

So what’s the point of all this? Essentially, it’s about accountability. When AI companies move from promising general intelligence to offering humanlike helpfulness in a specific field like health care, it raises a second, yet unanswered question about what will happen when mistakes are made. As things stand, there’s little indication tech companies will be made liable for the harm caused.

“When doctors give you harmful medical advice due to error or prejudicial bias, you can sue them for malpractice and get recompense,” says Damien Williams, an assistant professor of data science and philosophy at the University of North Carolina Charlotte. 

“When ChatGPT gives you harmful medical advice because it’s been trained on prejudicial data, or because ‘hallucinations’ are inherent in the operations of the system, what’s your recourse?”

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

A glimpse into OpenAI’s largest ambitions

OpenAI has given itself a dual mandate. On the one hand, it’s a tech giant rooted in products, including of course ChatGPT, which people around the world reportedly send 2.5 billion requests to each day. But its original mission is to serve as a research lab that will not only create “artificial general intelligence” but ensure that it benefits all of humanity. 

My colleague Will Douglas Heaven recently sat down for an exclusive conversation with the two figures at OpenAI most responsible for pursuing the latter ambitions: chief research officer Mark Chen and chief scientist Jakub Pachocki. If you haven’t already, you must read his piece.

It provides a rare glimpse into how the company thinks beyond marginal improvements to chatbots and contemplates the biggest unknowns in AI: whether it could someday reason like a human, whether it should, and how tech companies conceptualize the societal implications. 

The whole story is worth reading for all it reveals—about how OpenAI thinks about the safety of its products, what AGI actually means, and more—but here’s one thing that stood out to me. 

As Will points out, there were two recent wins for OpenAI in its efforts to build AI that outcompetes humans. Its models took second place at a top-level coding competition and—alongside those from Google DeepMind—achieved gold-medal-level results in the 2025 International Math Olympiad.

People who believe that AI doesn’t pose genuine competition to human-level intelligence might actually take some comfort in that. AI is good at the mathematical and analytical, which are on full display in olympiads and coding competitions. That doesn’t mean it’s any good at grappling with the messiness of human emotions, making hard decisions, or creating art that resonates with anyone

But that distinction—between machine-like reasoning and the ability to think creatively—is not one OpenAI’s heads of research are inclined to make. 

“We’re talking about programming and math here,” said Pachocki. “But it’s really about creativity, coming up with novel ideas, connecting ideas from different places.”

That’s why, the researchers say, these testing grounds for AI will produce models that have an increasing ability to reason like a person, one of the most important goals OpenAI is working toward. Reasoning models break problems down into more discrete steps, but even the best have limited ability to chain pieces of information together and approach problems logically. 

OpenAI is throwing a massive amount of money and talent at that problem not because its researchers think it will result in higher scores at math contests, but because they believe it will allow their AI models to come closer to human intelligence. 

As Will recalls in the piece, he said he thought maybe it’s fine for AI to excel at math and coding, but the idea of having an AI acquire people skills and replace politicians is perhaps not. Chen pulled a face and looked up at the ceiling: “Why not?”

Read the full story from Will Douglas Heaven.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

What you may have missed about Trump’s AI Action Plan

A number of the executive orders and announcements coming from the White House since Donald Trump returned to office have painted an ambitious vision for America’s AI future—crushing competition with China, abolishing “woke” AI models that suppress conservative speech, jump-starting power-hungry AI data centers. But the details have been sparse. 

The White House’s AI Action Plan, released last week, is meant to fix that. Many of the points in the plan won’t come as a surprise, and you’ve probably heard of the big ones by now. Trump wants to boost the buildout of data centers by slashing environmental rules; withhold funding from states that pass “burdensome AI regulations”; and contract only with AI companies whose models are “free from top-down ideological bias.”

But if you dig deeper, certain parts of the plan that didn’t pop up in any headlines reveal more about where the administration’s AI plans are headed. Here are three of the most important issues to watch. 

Trump is escalating his fight with the Federal Trade Commission

When Americans get scammed, they’re supposed to be helped by the Federal Trade Commission. As I wrote last week, the FTC under President Biden increasingly targeted AI companies that overhyped the accuracy of their systems, as well as deployments of AI it found to have harmed consumers. 

The Trump plan vows to take a fresh look at all the FTC actions under the previous administration as part of an effort to get rid of “onerous” regulation that it claims is hampering AI’s development. The administration may even attempt to repeal some of the FTC’s actions entirely. This would weaken a major AI watchdog agency, but it’s just the latest in the Trump administration’s escalating attacks on the FTC. Read more in my story

The White House is very optimistic about AI for science

The opening to the AI Action Plan describes a future where AI is doing everything from discovering new materials and drugs to “unraveling ancient scrolls once thought unreadable” to making breakthroughs in science and math

That type of unbounded optimism about AI for scientific discovery echoes what tech companies are promising. Some of that optimism is grounded in reality: AI’s role in predicting protein structures has indeed led to material scientific wins (and just last week, Google DeepMind released a new AI meant to help interpret ancient Latin engravings). But the idea that large language models—essentially very good text prediction machines—will act as scientists in their own right has less merit so far. 

Still, the plan shows that the Trump administration wants to award money to labs trying to make it a reality, even as it has worked to slash the funding the National Science Foundation makes available to human scientists, some of whom are now struggling to complete their research. 

And some of the steps the plan proposes are likely to be welcomed by researchers, like funding to build AI systems that are more transparent and interpretable.

The White House’s messaging on deepfakes is confused

Compared with President Biden’s executive orders on AI, the new action plan is mostly devoid of anything related to making AI safer. 

However, there’s a notable exception: a section in the plan that takes on the harms posed by deepfakes. In May, Trump signed legislation to protect people from nonconsensual sexually explicit deepfakes, a growing concern for celebrities and everyday people alike as generative video gets more advanced and cheaper to use. The law had bipartisan support.

Now, the White House says it’s concerned about the issues deepfakes could pose for the legal system. For example, it says, “fake evidence could be used to attempt to deny justice to both plaintiffs and defendants.” It calls for new standards for deepfake detection and asks the Department of Justice to create rules around it. Legal experts I’ve spoken with are more concerned with a different problem: Lawyers are adopting AI models that make errors such as citing cases that don’t exist, which judges may not catch. This is not addressed in the action plan. 

It’s also worth noting that just days before releasing a plan that targets “malicious deepfakes,” President Trump shared a fake AI-generated video of former president Barack Obama being arrested in the Oval Office.

Overall, the AI Action Plan affirms what President Trump and those in his orbit have long signaled: It’s the defining social and political weapon of our time. They believe that AI, if harnessed correctly, can help them win everything from culture wars to geopolitical conflicts. The right AI, they argue, will help defeat China. Government pressure on leading companies can force them to purge “woke” ideology from their models. 

The plan includes crowd-pleasers—like cracking down on deepfakes—but overall, it reflects how tech giants have cozied up to the Trump administration. The fact that it contains almost no provisions challenging their power shows how their investment in this relationship is paying off.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Five things you need to know about AI right now

Last month I gave a talk at SXSW London called “Five things you need to know about AI”—my personal picks for the five most important ideas in AI right now. 

I aimed the talk at a general audience, and it serves as a quick tour of how I’m thinking about AI in 2025. I’m sharing it here in case you’re interested. I think the talk has something for everyone. There’s some fun stuff in there. I even make jokes!

The video is now available (thank you, SXSW London). Below is a quick look at my top five. Let me know if you would have picked different ones!

1. Generative AI is now so good it’s scary.

Maybe you think that’s obvious. But I am constantly having to check my assumptions about how fast this technology is progressing—and it’s my job to keep up. 

A few months ago, my colleague—and your regular Algorithm writer—James O’Donnell shared 10 music tracks with the MIT Technology Review editorial team and challenged us to pick which ones had been produced using generative AI and which had been made by people. Pretty much everybody did worse than chance.

What’s happening with music is happening across media, from code to robotics to protein synthesis to video. Just look at what people are doing with new video-generation tools like Google DeepMind’s Veo 3. And this technology is being put into everything.

My point here? Whether you think AI is the best thing to happen to us or the worst, do not underestimate it. It’s good, and it’s getting better.

2. Hallucination is a feature, not a bug.

Let’s not forget the fails. When AI makes up stuff, we call it hallucination. Think of customer service bots offering nonexistent refunds, lawyers submitting briefs filled with nonexistent cases, or RFK Jr.’s government department publishing a report that cites nonexistent academic papers. 

You’ll hear a lot of talk that makes hallucination sound like it’s a problem we need to fix. The more accurate way to think about hallucination is that this is exactly what generative AI does—what it’s meant to do—all the time. Generative models are trained to make things up.

What’s remarkable is not that they make up nonsense, but that the nonsense they make up so often matches reality. Why does this matter? First, we need to be aware of what this technology can and can’t do. But also: Don’t hold out for a future version that doesn’t hallucinate.

3. AI is power hungry and getting hungrier.

You’ve probably heard that AI is power hungry. But a lot of that reputation comes from the amount of electricity it takes to train these giant models, though giant models only get trained every so often.

What’s changed is that these models are now being used by hundreds of millions of people every day. And while using a model takes far less energy than training one, the energy costs ramp up massively with those kinds of user numbers. 

ChatGPT, for example, has 400 million weekly users. That makes it the fifth-most-visited website in the world, just after Instagram and ahead of X. Other chatbots are catching up. 

So it’s no surprise that tech companies are racing to build new data centers in the desert and revamp power grids.

The truth is we’ve been in the dark about exactly how much energy it takes to fuel this boom because none of the major companies building this technology have shared much information about it. 

That’s starting to change, however. Several of my colleagues spent months working with researchers to crunch the numbers for some open source versions of this tech. (Do check out what they found.)

4. Nobody knows exactly how large language models work.

Sure, we know how to build them. We know how to make them work really well—see no. 1 on this list.

But how they do what they do is still an unsolved mystery. It’s like these things have arrived from outer space and scientists are poking and prodding them from the outside to figure out what they really are.

It’s incredible to think that never before has a mass-market technology used by billions of people been so little understood.

Why does that matter? Well, until we understand them better we won’t know exactly what they can and can’t do. We won’t know how to control their behavior. We won’t fully understand hallucinations.

5. AGI doesn’t mean anything.

Not long ago, talk of AGI was fringe, and mainstream researchers were embarrassed to bring it up. But as AI has got better and far more lucrative, serious people are happy to insist they’re about to create it. Whatever it is.

AGI—or artificial general intelligence—has come to mean something like: AI that can match the performance of humans on a wide range of cognitive tasks.

But what does that mean? How do we measure performance? Which humans? How wide a range of tasks? And performance on cognitive tasks is just another way of saying intelligence—so the definition is circular anyway.

Essentially, when people refer to AGI they now tend to just mean AI, but better than what we have today.

There’s this absolute faith in the progress of AI. It’s gotten better in the past, so it will continue to get better. But there is zero evidence that this will actually play out. 

So where does that leave us? We are building machines that are getting very good at mimicking some of the things people do, but the technology still has serious flaws. And we’re only just figuring out how it actually works.

Here’s how I think about AI: We have built machines with humanlike behavior, but we haven’t shrugged off the habit of imagining a humanlike mind behind them. This leads to exaggerated assumptions about what AI can do and plays into the wider culture wars between techno-optimists and techno-skeptics.

It’s right to be amazed by this technology. It’s also right to be skeptical of many of the things said about it. It’s still very early days, and it’s all up for grabs.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

AI’s giants want to take over the classroom

School’s out and it’s high summer, but a bunch of teachers are plotting how they’re going to use AI this upcoming school year. God help them. 

On July 8, OpenAI, Microsoft, and Anthropic announced a $23 million partnership with one of the largest teachers’ unions in the United States to bring more AI into K–12 classrooms. Called the National Academy for AI Instruction, the initiative will train teachers at a New York City headquarters on how to use AI both for teaching and for tasks like planning lessons and writing reports, starting this fall

The companies could face an uphill battle. Right now, most of the public perceives AI’s use in the classroom as nothing short of ruinous—a surefire way to dampen critical thinking and hasten the decline of our collective attention span (a viral story from New York magazine, for example, described how easy it now is to coast through college thanks to constant access to ChatGPT). 

Amid that onslaught, AI companies insist that AI promises more individualized learning, faster and more creative lesson planning, and quicker grading. The companies sponsoring this initiative are, of course, not doing it out of the goodness of their hearts.

No—as they hunt for profits, their goal is to make users out of teachers and students. Anthropic is pitching its AI models to universities, and OpenAI offers free courses for teachers. In an initial training session for teachers by the new National Academy for AI Instruction, representatives from Microsoft showed teachers how to use the company’s AI tools for lesson planning and emails, according to the New York Times

It’s early days, but what does the evidence actually say about whether AI is helping or hurting students? There’s at least some data to support the case made by tech companies: A recent survey of 1,500 teens conducted by Harvard’s Graduate School of Education showed that kids are using AI to brainstorm and answer questions they’re afraid to ask in the classroom. Studies examining settings ranging from math classes in Nigeria to colleges physics courses at Harvard have suggested that AI tutors can lead students to become more engaged. 

And yet there’s more to the story. The same Harvard survey revealed that kids are also frequently using AI for cheating and shortcuts. And an oft-cited paper from Microsoft found that relying on AI can reduce critical thinking. Not to mention the fact that “hallucinations” of incorrect information are an inevitable part of how large language models work.

There’s a lack of clear evidence that AI can be a net benefit for students, and it’s hard to trust that the AI companies funding this initiative will give honest advice on when not to use AI in the classroom.

Despite the fanfare around the academy’s launch, and the fact the first teacher training is scheduled to take place in just a few months, OpenAI and Anthropic told me they couldn’t share any specifics. 

It’s not as if teachers themselves aren’t already grappling with how to approach AI. One such teacher, Christopher Harris, who leads a library system covering 22 rural school districts in New York, has created a curriculum aimed at AI literacy. Topics range from privacy when using smart speakers (a lesson for second graders) to misinformation and deepfakes (instruction for high schoolers). I asked him what he’d like to see in the curriculum used by the new National Academy for AI Instruction.

“The real outcome should be teachers that are confident enough in their understanding of how AI works and how it can be used as a tool that they can teach students about the technology as well,” he says. The thing to avoid would be overfocusing on tools and pre-built prompts that teachers are instructed to use without knowing how they work. 

But all this will be for naught without an adjustment to how schools evaluate students in the age of AI, Harris says: “The bigger issue will be shifting the fundamental approaches to how we assign and assess student work in the face of AI cheating.”

The new initiative is led by the American Federation of Teachers, which represents 1.8 million members, as well as the United Federation of Teachers, which represents 200,000 members in New York. If they win over these groups, the tech companies will have significant influence over how millions of teachers learn about AI. But some educators are resisting the use of AI entirely, including several hundred who signed an open letter last week.

Helen Choi is one of them. “I think it is incumbent upon educators to scrutinize the tools that they use in the classroom to look past hype,” says Choi, an associate professor at the University of Southern California, where she teaches writing. “Until we know that something is useful, safe, and ethical, we have a duty to resist mass adoption of tools like large language models that are not designed by educators with education in mind.”

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

How scientists are trying to use AI to unlock the human mind 

Today’s AI landscape is defined by the ways in which neural networks are unlike human brains. A toddler learns how to communicate effectively with only a thousand calories a day and regular conversation; meanwhile, tech companies are reopening nuclear power plants, polluting marginalized communities, and pirating terabytes of books in order to train and run their LLMs.

But neural networks are, after all, neural—they’re inspired by brains. Despite their vastly different appetites for energy and data, large language models and human brains do share a good deal in common. They’re both made up of millions of subcomponents: biological neurons in the case of the brain, simulated “neurons” in the case of networks. They’re the only two things on Earth that can fluently and flexibly produce language. And scientists barely understand how either of them works.

I can testify to those similarities: I came to journalism, and to AI, by way of six years of neuroscience graduate school. It’s a common view among neuroscientists that building brainlike neural networks is one of the most promising paths for the field, and that attitude has started to spread to psychology. Last week, the prestigious journal Nature published a pair of studies showcasing the use of neural networks for predicting how humans and other animals behave in psychological experiments. Both studies propose that these trained networks could help scientists advance their understanding of the human mind. But predicting a behavior and explaining how it came about are two very different things.

In one of the studies, researchers transformed a large language model into what they refer to as a “foundation model of human cognition.” Out of the box, large language models aren’t great at mimicking human behavior—they behave logically in settings where humans abandon reason, such as casinos. So the researchers fine-tuned Llama 3.1, one of Meta’s open-source LLMs, on data from a range of 160 psychology experiments, which involved tasks like choosing from a set of “slot machines” to get the maximum payout or remembering sequences of letters. They called the resulting model Centaur.

Compared with conventional psychological models, which use simple math equations, Centaur did a far better job of predicting behavior. Accurate predictions of how humans respond in psychology experiments are valuable in and of themselves: For example, scientists could use Centaur to pilot their experiments on a computer before recruiting, and paying, human participants. In their paper, however, the researchers propose that Centaur could be more than just a prediction machine. By interrogating the mechanisms that allow Centaur to effectively replicate human behavior, they argue, scientists could develop new theories about the inner workings of the mind.

But some psychologists doubt whether Centaur can tell us much about the mind at all. Sure, it’s better than conventional psychological models at predicting how humans behave—but it also has a billion times more parameters. And just because a model behaves like a human on the outside doesn’t mean that it functions like one on the inside. Olivia Guest, an assistant professor of computational cognitive science at Radboud University in the Netherlands, compares Centaur to a calculator, which can effectively predict the response a math whiz will give when asked to add two numbers. “I don’t know what you would learn about human addition by studying a calculator,” she says.

Even if Centaur does capture something important about human psychology, scientists may struggle to extract any insight from the model’s millions of neurons. Though AI researchers are working hard to figure out how large language models work, they’ve barely managed to crack open the black box. Understanding an enormous neural-network model of the human mind may not prove much easier than understanding the thing itself.

One alternative approach is to go small. The second of the two Nature studies focuses on minuscule neural networks—some containing only a single neuron—that nevertheless can predict behavior in mice, rats, monkeys, and even humans. Because the networks are so small, it’s possible to track the activity of each individual neuron and use that data to figure out how the network is producing its behavioral predictions. And while there’s no guarantee that these models function like the brains they were trained to mimic, they can, at the very least, generate testable hypotheses about human and animal cognition.

There’s a cost to comprehensibility. Unlike Centaur, which was trained to mimic human behavior in dozens of different tasks, each tiny network can only predict behavior in one specific task. One network, for example, is specialized for making predictions about how people choose among different slot machines. “If the behavior is really complex, you need a large network,” says Marcelo Mattar, an assistant professor of psychology and neural science at New York University who led the tiny-network study and also contributed to Centaur. “The compromise, of course, is that now understanding it is very, very difficult.”

This trade-off between prediction and understanding is a key feature of neural-network-driven science. (I also happen to be writing a book about it.) Studies like Mattar’s are making some progress toward closing that gap—as tiny as his networks are, they can predict behavior more accurately than traditional psychological models. So is the research into LLM interpretability happening at places like Anthropic. For now, however, our understanding of complex systems—from humans to climate systems to proteins—is lagging farther and farther behind our ability to make predictions about them.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

What comes next for AI copyright lawsuits?

Last week, the technology companies Anthropic and Meta each won landmark victories in two separate court cases that examined whether or not the firms had violated copyright when they trained their large language models on copyrighted books without permission. The rulings are the first we’ve seen to come out of copyright cases of this kind. This is a big deal!

The use of copyrighted works to train models is at the heart of a bitter battle between tech companies and content creators. That battle is playing out in technical arguments about what does and doesn’t count as fair use of a copyrighted work. But it is ultimately about carving out a space in which human and machine creativity can continue to coexist.

There are dozens of similar copyright lawsuits working through the courts right now, with cases filed against all the top players—not only Anthropic and Meta but Google, OpenAI, Microsoft, and more. On the other side, plaintiffs range from individual artists and authors to large companies like Getty and the New York Times.

The outcomes of these cases are set to have an enormous impact on the future of AI. In effect, they will decide whether or not model makers can continue ordering up a free lunch. If not, they will need to start paying for such training data via new kinds of licensing deals—or find new ways to train their models. Those prospects could upend the industry.

And that’s why last week’s wins for the technology companies matter. So: Cases closed? Not quite. If you drill into the details, the rulings are less cut-and-dried than they seem at first. Let’s take a closer look.

In both cases, a group of authors (the Anthropic suit was a class action; 13 plaintiffs sued Meta, including high-profile names such as Sarah Silverman and Ta-Nehisi Coates) set out to prove that a technology company had violated their copyright by using their books to train large language models. And in both cases, the companies argued that this training process counted as fair use, a legal provision that permits the use of copyrighted works for certain purposes.  

There the similarities end. Ruling in Anthropic’s favor, senior district judge William Alsup argued on June 23 that the firm’s use of the books was legal because what it did with them was transformative, meaning that it did not replace the original works but made something new from them. “The technology at issue was among the most transformative many of us will see in our lifetimes,” Alsup wrote in his judgment.

In Meta’s case, district judge Vince Chhabria made a different argument. He also sided with the technology company, but he focused his ruling instead on the issue of whether or not Meta had harmed the market for the authors’ work. Chhabria said that he thought Alsup had brushed aside the importance of market harm. “The key question in virtually any case where a defendant has copied someone’s original work without permission is whether allowing people to engage in that sort of conduct would substantially diminish the market for the original,” he wrote on June 25.

Same outcome; two very different rulings. And it’s not clear exactly what that means for the other cases. On the one hand, it bolsters at least two versions of the fair-use argument. On the other, there’s some disagreement over how fair use should be decided.

But there are even bigger things to note. Chhabria was very clear in his judgment that Meta won not because it was in the right, but because the plaintiffs failed to make a strong enough argument. “In the grand scheme of things, the consequences of this ruling are limited,” he wrote. “This is not a class action, so the ruling only affects the rights of these 13 authors—not the countless others whose works Meta used to train its models. And, as should now be clear, this ruling does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful.” That reads a lot like an invitation for anyone else out there with a grievance to come and have another go.   

And neither company is yet home free. Anthropic and Meta both face wholly separate allegations that not only did they train their models on copyrighted books, but the way they obtained those books was illegal because they downloaded them from pirated databases. Anthropic now faces another trial over these piracy claims. Meta has been ordered to begin a discussion with its accusers over how to handle the issue.

So where does that leave us? As the first rulings to come out of cases of this type, last week’s judgments will no doubt carry enormous weight. But they are also the first rulings of many. Arguments on both sides of the dispute are far from exhausted.

“These cases are a Rorschach test in that either side of the debate will see what they want to see out of the respective orders,” says Amir Ghavi, a lawyer at Paul Hastings who represents a range of technology companies in ongoing copyright lawsuits. He also points out that the first cases of this type were filed more than two years ago: “Factoring in likely appeals and the other 40+ pending cases, there is still a long way to go before the issue is settled by the courts.”

“I’m disappointed at these rulings,” says Tyler Chou, founder and CEO of Tyler Chou Law for Creators, a firm that represents some of the biggest names on YouTube. “I think plaintiffs were out-gunned and didn’t have the time or resources to bring the experts and data that the judges needed to see.”

But Chou thinks this is just the first round of many. Like Ghavi, she thinks these decisions will go to appeal. And after that we’ll see cases start to wind up in which technology companies have met their match: “Expect the next wave of plaintiffs—publishers, music labels, news organizations—to arrive with deep pockets,” she says. “That will be the real test of fair use in the AI era.”

But even when the dust has settled in the courtrooms—what then? The problem won’t have been solved. That’s because the core grievance of creatives, whether individuals or institutions, is not really that their copyright has been violated—copyright is just the legal hammer they have to hand. Their real complaint is that their livelihoods and business models are at risk of being undermined. And beyond that: when AI slop devalues creative effort, will people’s motivations for putting work out into the world start to fall away?

In that sense, these legal battles are set to shape all our futures. There’s still no good solution on the table for this wider problem. Everything is still to play for.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

This story has been edited to add comments from Tyler Chou.

What does it mean for an algorithm to be “fair”?

Back in February, I flew to Amsterdam to report on a high-stakes experiment the city had recently conducted: a pilot program for what it called Smart Check, which was its attempt to create an effective, fair, and unbiased predictive algorithm to try to detect welfare fraud. But the city fell short of its lofty goals—and, with our partners at Lighthouse Reports and the Dutch newspaper Trouw, we tried to get to the bottom of why. You can read about it in our deep dive published last week.

For an American reporter, it’s been an interesting time to write a story on “responsible AI” in a progressive European city—just as ethical considerations in AI deployments appear to be disappearing in the United States, at least at the national level. 

For example, a few weeks before my trip, the Trump administration rescinded Biden’s executive order on AI safety and DOGE began turning to AI to decide which federal programs to cut. Then, more recently, House Republicans passed a 10-year moratorium on US states’ ability to regulate AI (though it has yet to be passed by the Senate). 

What all this points to is a new reality in the United States where responsible AI is no longer a priority (if it ever genuinely was). 

But this has also made me think more deeply about the stakes of deploying AI in situations that directly affect human lives, and about what success would even look like. 

When Amsterdam’s welfare department began developing the algorithm that became Smart Check, the municipality followed virtually every recommendation in the responsible-AI playbook: consulting external experts, running bias tests, implementing technical safeguards, and seeking stakeholder feedback. City officials hoped the resulting algorithm could avoid causing the worst types of harm inflicted by discriminatory AI over nearly a decade. 

After talking to a large number of people involved in the project and others who would potentially be affected by it, as well as some experts who did not work on it, it’s hard not to wonder if the city could ever have succeeded in its goals when neither “fairness” nor even “bias” has a universally agreed-upon definition. The city was treating these issues as technical ones that could be answered by reweighting numbers and figures—rather than political and philosophical questions that society as a whole has to grapple with.

On the afternoon that I arrived in Amsterdam, I sat down with Anke van der Vliet, a longtime advocate for welfare beneficiaries who served on what’s called the Participation Council, a 15-member citizen body that represents benefits recipients and their advocates.

The city had consulted the council during Smart Check’s development, but van der Vliet was blunt in sharing the committee’s criticisms of the plans. Its members simply didn’t want the program. They had well-placed fears of discrimination and disproportionate impact, given that fraud is found in only 3% of applications.

To the city’s credit, it did respond to some of their concerns and make changes in the algorithm’s design—like removing from consideration factors, such as age, whose inclusion could have had a discriminatory impact. But the city ignored the Participation Council’s main feedback: its recommendation to stop development altogether. 

Van der Vliet and other welfare advocates I met on my trip, like representatives from the Amsterdam Welfare Union, described what they see as a number of challenges faced by the city’s some 35,000 benefits recipients: the indignities of having to constantly re-prove the need for benefits, the increases in cost of living that benefits payments do not reflect, and the general feeling of distrust between recipients and the government. 

City welfare officials themselves recognize the flaws of the system, which “is held together by rubber bands and staples,” as Harry Bodaar, a senior policy advisor to the city who focuses on welfare fraud enforcement, told us. “And if you’re at the bottom of that system, you’re the first to fall through the cracks.”

So the Participation Council didn’t want Smart Check at all, even as Bodaar and others working in the department hoped that it could fix the system. It’s a classic example of a “wicked problem,” a social or cultural issue with no one clear answer and many potential consequences. 

After the story was published, I heard from Suresh Venkatasubramanian, a former tech advisor to the White House Office of Science and Technology Policy who co-wrote Biden’s AI Bill of Rights (now rescinded by Trump). “We need participation early on from communities,” he said, but he added that it also matters what officials do with the feedback—and whether there is “a willingness to reframe the intervention based on what people actually want.” 

Had the city started with a different question—what people actually want—perhaps it might have developed a different algorithm entirely. As the Dutch digital rights advocate Hans De Zwart put it to us, “We are being seduced by technological solutions for the wrong problems … why doesn’t the municipality build an algorithm that searches for people who do not apply for social assistance but are entitled to it?” 

These are the kinds of fundamental questions AI developers will need to consider, or they run the risk of repeating (or ignoring) the same mistakes over and over again.

Venkatasubramanian told me he found the story to be “affirming” in highlighting the need for “those in charge of governing these systems”  to “ask hard questions … starting with whether they should be used at all.”

But he also called the story “humbling”: “Even with good intentions, and a desire to benefit from all the research on responsible AI, it’s still possible to build systems that are fundamentally flawed, for reasons that go well beyond the details of the system constructions.” 

To better understand this debate, read our full story here. And if you want more detail on how we ran our own bias tests after the city gave us unprecedented access to the Smart Check algorithm, check out the methodology over at Lighthouse. (For any Dutch speakers out there, here’s the companion story in Trouw.) Thanks to the Pulitzer Center for supporting our reporting. 

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

The Pentagon is gutting the team that tests AI and weapons systems

The Trump administration’s chainsaw approach to federal spending lives on, even as Elon Musk turns on the president. On May 28, Secretary of Defense Pete Hegseth announced he’d be gutting a key office at the Department of Defense responsible for testing and evaluating the safety of weapons and AI systems.

As part of a string of moves aimed at “reducing bloated bureaucracy and wasteful spending in favor of increased lethality,” Hegseth cut the size of the Office of the Director of Operational Test and Evaluation in half. The group was established in the 1980s—following orders from Congress—after criticisms that the Pentagon was fielding weapons and systems that didn’t perform as safely or effectively as advertised. Hegseth is reducing the agency’s staff to about 45, down from 94, and firing and replacing its director. He gave the office just seven days to implement the changes.

It is a significant overhaul of a department that in 40 years has never before been placed so squarely on the chopping block. Here’s how today’s defense tech companies, which have fostered close connections to the Trump administration, stand to gain, and why safety testing might suffer as a result. 

The Operational Test and Evaluation office is “the last gate before a technology gets to the field,” says Missy Cummings, a former fighter pilot for the US Navy who is now a professor of engineering and computer science at George Mason University. Though the military can do small experiments with new systems without running it by the office, it has to test anything that gets fielded at scale.

“In a bipartisan way—up until now—everybody has seen it’s working to help reduce waste, fraud, and abuse,” she says. That’s because it provides an independent check on companies’ and contractors’ claims about how well their technology works. It also aims to expose the systems to more rigorous safety testing.

The gutting comes at a particularly pivotal time for AI and military adoption: The Pentagon is experimenting with putting AI into everything, mainstream companies like OpenAI are now more comfortable working with the military, and defense giants like Anduril are winning big contracts to launch AI systems (last Thursday, Anduril announced a whopping $2.5 billion funding round, doubling its valuation to over $30 billion). 

Hegseth claims his cuts will “make testing and fielding weapons more efficient,” saving $300 million. But Cummings is concerned that they are paving a way to faster adoption while increasing the chances that new systems won’t be as safe or effective as promised. “The firings in DOTE send a clear message that all perceived obstacles for companies favored by Trump are going to be removed,” she says.

Anduril and Anthropic, which have launched AI applications for military use, did not respond to my questions about whether they pushed for or approve of the cuts. A representative for OpenAI said that the company was not involved in lobbying for the restructuring. 

“The cuts make me nervous,” says Mark Cancian, a senior advisor at the Center for Strategic and International Studies who previously worked at the Pentagon in collaboration with the testing office. “It’s not that we’ll go from effective to ineffective, but you might not catch some of the problems that would surface in combat without this testing step.”

It’s hard to say precisely how the cuts will affect the office’s ability to test systems, and Cancian admits that those responsible for getting new technologies out onto the battlefield sometimes complain that it can really slow down adoption. But still, he says, the office frequently uncovers errors that weren’t previously caught.

It’s an especially important step, Cancian says, whenever the military is adopting a new type of technology like generative AI. Systems that might perform well in a lab setting almost always encounter new challenges in more realistic scenarios, and the Operational Test and Evaluation group is where that rubber meets the road.

So what to make of all this? It’s true that the military was experimenting with artificial intelligence long before the current AI boom, particularly with computer vision for drone feeds, and defense tech companies have been winning big contracts for this push across multiple presidential administrations. But this era is different. The Pentagon is announcing ambitious pilots specifically for large language models, a relatively nascent technology that by its very nature produces hallucinations and errors, and it appears eager to put much-hyped AI into everything. The key independent group dedicated to evaluating the accuracy of these new and complex systems now only has half the staff to do it. I’m not sure that’s a win for anyone.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.