OpenAI is throwing everything into building a fully automated researcher

<div data-chronoton-summary="

  • A fully automated research lab: OpenAI has set a new “North Star” — building an AI system capable of tackling large, complex scientific problems entirely on its own, with a research intern prototype due by September and a full multi-agent system planned for 2028.
  • Coding agents as a proof of concept: OpenAI’s existing tool Codex, which can already handle substantial programming tasks autonomously, is the early blueprint — the bet is that if AI can solve coding problems, it can solve almost any problem formulated in text or code.
  • Serious risks with no clean answers: Chief scientist Jakub Pachocki admits that a system this powerful running with minimal human oversight raises hard questions — with risks from hacking and misuse to bioweapons — and that chain-of-thought monitoring is the best safeguard available, for now.
  • Power concentrated in very few hands: Pachocki says governments, not just OpenAI, will need to figure out where the lines are drawn.

” data-chronoton-post-id=”1134438″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

OpenAI is refocusing its research efforts and throwing its resources into a new grand challenge. The San Francisco firm has set its sights on building what it calls an AI researcher, a fully automated agent-based system that will be able to go off and tackle large, complex problems by itself. ​​OpenAI says that this new research goal will be its “North Star” for the next few years, pulling together multiple research strands, including work on reasoning models, agents, and interpretability.

There’s even a timeline. OpenAI plans to build “an autonomous AI research intern”—a system that can take on a small number of specific research problems by itself—by September. The AI intern will be the precursor to a fully automated multi-agent research system that the company plans to debut in 2028. This AI researcher (OpenAI says) will be able to tackle problems that are too large or complex for humans to cope with.

Those tasks might be related to math and physics—such as coming up with new proofs or conjectures—or life sciences like biology and chemistry, or even business and policy dilemmas. In theory, you would throw such a tool any kind of problem that can be formulated in text, code, or whiteboard scribbles—which covers a lot.

OpenAI has been setting the agenda for the AI industry for years. Its early dominance with large language models shaped the technology that hundreds of millions of people use every day. But it now faces fierce competition from rival model makers like Anthropic and Google DeepMind. What OpenAI decides to build next matters—for itself and for the future of AI.   

A big part of that decision falls to Jakub Pachocki, OpenAI’s chief scientist, who sets the company’s long-term research goals. Pachocki played key roles in the development of both GPT-4, a game-changing LLM released in 2023, and so-called reasoning models, a technology that first appeared in 2024 and now underpins all major chatbots and agent-based systems. 

In an exclusive interview this week, Pachocki talked me through OpenAI’s latest vision. “I think we are getting close to a point where we’ll have models capable of working indefinitely in a coherent way just like people do,” he says. “Of course, you still want people in charge and setting the goals. But I think we will get to a point where you kind of have a whole research lab in a data center.”

Solving hard problems

Such big claims aren’t new. Saving the world by solving its hardest problems is the stated mission of all the top AI firms. Demis Hassabis told me back in 2022 that it was why he started DeepMind. Anthropic CEO Dario Amodei says he is building the equivalent of a country of geniuses in a data center. Pachocki’s boss, Sam Altman, wants to cure cancer. But Pachocki says OpenAI now has most of what it needs to get there.

In January, OpenAI released Codex, an agent-based app that can spin up code on the fly to carry out tasks on your computer. It can analyze documents, generate charts, make you a daily digest of your inbox and social media, and much more. (Other firms have released similar tools, such as Anthropic’s Claude Code and Claude Cowork.)

OpenAI claims that most of its technical staffers now use Codex in their work. You can look at Codex as a very early version of the AI researcher, says Pachocki: “I expect Codex to get fundamentally better.”

The key is to make a system that can run for longer periods of time, with less human guidance. “What we’re really looking at for an automated research intern is a system that you can delegate tasks [to] that would take a person a few days,” says Pachocki.

“There are a lot of people excited about building systems that can do more long-running scientific research,” says Doug Downey, a research scientist at the Allen Institute for AI, who is not connected to OpenAI. “I think it’s largely driven by the success of these coding agents. The fact that you can delegate quite substantial coding tasks to tools like Codex is incredibly useful and incredibly impressive. And it raises the question: Can we do similar things outside coding, in broader areas of science?”

For Pachocki, that’s a clear Yes. In fact, he thinks it’s just a matter of pushing ahead on the path we’re already on. A simple boost in all-round capability also leads to models that can work longer without help, he says. He points to the leap from 2020’s GPT-3 to 2023’s GPT-4, two of OpenAI’s previous models. GPT-4 was able to work on a problem for far longer than its predecessor, even without specialized training, he says. 

So-called reasoning models brought another bump. Training LLMs to work through problems step by step, backtracking when they make a mistake or hit a dead end, has also made models better at working for longer periods of time. And Pachocki is convinced that OpenAI’s reasoning models will continue to get better.

But OpenAI is also training its systems to work by themselves for longer by feeding them specific samples of complex tasks, such as hard puzzles taken from math and coding contests, which force the models to learn how to do things like keep track of very large chunks of text and split problems up into (and then manage) multiple subtasks.

The aim isn’t to build models that just win math competitions. “That lets you prove that the technology works before you connect it to the real world,” says Pachocki. “If we really wanted to, we could build an amazing automated mathematician. We have all the tools, and I think it would be relatively easy. But it’s not something we’re going to prioritize now because, you know, at the point where you believe you can do it, there’s much more urgent things to do.”

“We are much more focused now on research that’s relevant in the real world,” he adds.

Right now that means taking what Codex can do with coding and trying to apply that to problem-solving in general. “There’s a big change happening, especially in programming,” he says. “Our jobs are now totally different than they were even a year ago. Nobody really edits code all the time anymore. Instead, you manage a group of Codex agents.” If Codex can solve coding problems (the argument goes), it can solve any problem.

The line always goes up

It’s true that OpenAI has had a handful of remarkable successes in the last few months. Researchers have used GPT-5 (the LLM that powers Codex) to discover new solutions to a number of unsolved math problems and punch through apparent dead ends in a handful of biology, chemistry, and physics puzzles.   

“Just looking at these models coming up with ideas that would take most PhD weeks, at least, makes me expect that we’ll see much more acceleration coming from this technology in the near future,” Pachocki says.

But Pachocki admits that it’s not a done deal. He also understands why some people still have doubts about how much of a game-changer the technology really is. He thinks it depends on how people like to work and what they need to do. “I can believe some people don’t find it very useful yet,” he says.

He tells me that he didn’t even use autocomplete—the most basic version of generative coding tech—a year ago. “I’m very pedantic about my code,” he says. “I like to type it all manually in vim if I can help it.” (Vim is a text editor favored by many hardcore programmers that you interact with via dozens of keyboard shortcuts instead of a mouse.)

But that changed when he saw what the latest models could do. He still wouldn’t hand over complex design tasks, but it’s a time-saver when he just wants to try out a few ideas. “I can have it run experiments in a weekend that previously would have taken me like a week to code,” he says.

“I don’t think it is at the level where I would just let it take the reins and design the whole thing,” he adds. “But once you see it do something that would take a week to do—I mean, that’s hard to argue with.”

Pachocki’s game plan is to supercharge the existing problem-solving abilities that tools like Codex have now and apply them across the sciences.  

Downey agrees that the idea of an automated researcher is very cool: “It would be exciting if we could come back tomorrow morning and the agent’s done a bunch of work and there’s new results we can examine,” he says.

But he cautions that building such a system could be harder than Pachocki makes out. Last summer, Downey and his colleagues tested several top-tier LLMs on a range of scientific tasks. OpenAI’s latest model, GPT-5, came out on top but still made lots of errors.

“If you have to chain tasks together, then the odds that you get several of them right in succession tend to go down,” he says. Downey admits that things move fast, and he has not tested the latest versions of GPT-5 (OpenAI released GPT-5.4 two weeks ago). “So those results might already be stale,” he says. 

Serious unanswered questions

I asked Pachocki about the risks that may come with a system that can solve large, complex problems by itself with little human oversight. Pachocki says people at OpenAI talk about those risks all the time.

“If you believe that AI is about to substantially accelerate research, including AI research, that’s a big change in the world. That’s a big thing,” he told me. “And it comes with some serious unanswered questions. If it’s so smart and capable, if it can run an entire research program, what if it does something bad?”

The way Pachocki sees it, that could happen in a number of ways. The system could go off the rails. It could get hacked. Or it could simply misunderstand its instructions.

The best technique OpenAI has right now to address these concerns is to train its reasoning models to share details about what they are doing as they work. This approach to keeping tabs on LLMs is known as chain-of-thought monitoring.

In short, LLMs are trained to jot down notes about what they are doing in a kind of scratch pad as they step through tasks. Researchers can then use those notes to make sure a model is behaving as expected. Yesterday OpenAI published new details on how it is using chain-of-thought monitoring in house to study Codex

“Once we get to systems working mostly autonomously for a long time in a big data center, I think this will be something that we’re really going to depend on,” says Pachocki.

The idea would be to monitor an AI researcher’s scratch pads using other LLMs and catch unwanted behavior before it’s a problem, rather than trying to stop that bad behavior from happening in the first place. LLMs are not understood well enough for us to control them fully.

“I think it’s going to be a long time before we can really be like, okay, this problem is solved,” he says. “Until you can really trust the systems, you definitely want to have restrictions in place.” Pachocki thinks that very powerful models should be deployed in sandboxes, cut off from anything they could break or use to cause harm. 

AI tools have already been used to come up with novel cyberattacks. Some worry that they will be used to design synthetic pathogens that could be used as bioweapons. You can insert any number of evil-scientist scare stories here. “I definitely think there are worrying scenarios that we can imagine,” says Pachocki. 

“It’s going to be a very weird thing. It’s extremely concentrated power that’s in some ways unprecedented,” says Pachocki. “Imagine you get to a world where you have a data center that can do all the work that OpenAI or Google can do. Things that in the past required large human organizations would now be done by a couple of people.”

“I think this is a big challenge for governments to figure out,” he adds.

And yet some people would say governments are part of the problem. The US government wants to use AI on the battlefield, for example. The recent showdown between Anthropic and the Pentagon revealed that there is little agreement across society about where we draw red lines for how this technology should and should not be used—let alone who should draw them. In the immediate aftermath of that dispute, OpenAI stepped up to sign a deal with the Pentagon instead of its rival. The situation remains murky.

I pushed Pachocki on this. Does he really trust other people to figure it out or does he, as a key architect of the future, feel personal responsibility? “I do feel personal responsibility,” he says. “But I don’t think this can be resolved by OpenAI alone, pushing its technology in a particular way or designing its products in a particular way. We’ll definitely need a lot of involvement from policymakers.”

Where does that leave us? Are we really on a path to the kind of AI Pachocki envisions? When I asked the Allen Institute’s Downey, he laughed. “I’ve been in this field for a couple of decades and I no longer trust my predictions for how near or far certain capabilities are,” he says. 

OpenAI’s stated mission is to ensure that artificial general intelligence (a hypothetical future technology that many AI boosters believe will be able to match humans on most cognitive tasks) will benefit all of humanity. OpenAI aims to do that by being the first to build it. But the only time Pachocki mentioned AGI in our conversation, he was quick to clarify what he meant by talking about “economically transformative technology” instead.

LLMs are not like human brains, he says: “They are superficially similar to people in some ways because they’re kind of mostly trained on people talking. But they’re not formed by evolution to be really efficient.” 

“Even by 2028, I don’t expect that we’ll get systems as smart as people in all ways. I don’t think that will happen,” he adds. “But I don’t think it’s absolutely necessary. The interesting thing is you don’t need to be as smart as people in all their ways in order to be very transformative.”

Can quantum computers now solve health care problems? We’ll soon find out.

<div data-chronoton-summary="

  • A $5 million health care challenge: A nonprofit called Wellcome Leap is offering up to $5 million to quantum computing teams that can solve real-world health care problems classical computers can’t handle—using machines that are still noisy, error-prone, and far from perfect.
  • Hybrid computing is the real breakthrough: Facing limited quantum hardware, all six finalist teams developed clever quantum-classical hybrid approaches—offloading most work to conventional processors, then using quantum only where classical methods fall short.
  • Cancer, muscular dystrophy, and drug design are on the table: Teams are tackling problems ranging from identifying cancer origins to simulating light-activated cancer drugs to finding treatments for muscular dystrophy—applications previously impossible to model classically.
  • Even failure would count as progress: The competition’s own director doubts anyone will claim the grand prize, but says the field has already been transformed—teams now know where quantum computing can genuinely matter, even if the machines to fully prove it don’t exist yet.

” data-chronoton-post-id=”1134409″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

I’m standing in front of a quantum computer built out of atoms and light at the UK’s National Quantum Computing Centre on the outskirts of Oxford. On a laboratory table, a complex matrix of mirrors and lenses surrounds a Rubik’s Cube–size cell where 100 cesium atoms are suspended in grid formation by a carefully manipulated laser beam. 

The cesium atom setup is so compact that I could pick it up, carry it out of the lab, and put it on the backseat of my car to take home. I’d be unlikely to get very far, though. It’s small but powerful—and so it’s very valuable. Infleqtion, the Colorado-based company that owns it, is hoping the machine’s abilities will win $5 million next week, at an event to be held in Marina del Rey, California. 

Infleqtion is one of six teams that have made it to the final stage of a 30-month-long quantum computing competition called Quantum for Bio (Q4Bio). Run by the nonprofit Wellcome Leap, it aims to show that today’s quantum computers, though messy and error-prone and far from the large-scale machines engineers hope to build, could actually benefit human health. Success would be a significant step forward in proving the worth of quantum computers. But for now, it turns out, that worth seems to be linked to harnessing and improving the performance of conventional (also called classical) computers in tandem, creating a quantum-classical hybrid that can exceed what’s possible on classical machines by themselves.

There are two prize categories. A prize of $2 million will go to any and all teams that can run a significantly useful health care algorithm on computers with 50 or more qubits (a qubit is the basic processing unit in a quantum computer). To win the $5 million grand prize, a team must successfully run a quantum algorithm that solves a significant real-world problem in health care, and the work must use 100 or more qubits. Winners have to meet strict performance criteria, and they must solve a health care problem that can’t be solved with conventional computers—a tough task.

Despite the scale of the challenge, most of the teams think some of this money could be theirs. “I think we’re in with a good shout,” says Jonathan D. Hirst, a computational chemist at the University of Nottingham, UK. “We’re very firmly within the criteria for the $2 million prize,” says Stanford University’s Grant Rotskoff, whose collaboration is investigating the quantum properties of the ATP molecule that powers biological cells. 

The grand prize is perhaps less of a sure thing. “This is really at the very edge of doable,” Rotskoff says. Insiders say the challenge is so difficult, given the state of quantum computing technology, that much of the money could stay in Wellcome Leap’s account. 

With most of the Q4Bio work unpublished and protected by NDAs, and the quantum computing field already rife with claims and counterclaims about performance and achievements, only the judges will be in a position to decide who’s right. 

A hybrid solution

The idea behind quantum computers is that they can use small-scale objects that obey the laws of quantum mechanics, such as atoms and photons of light,  to simulate real-world processes too complex to model on our everyday classical machines. 

Researchers have been working for decades to build such systems, which could deliver insights for creating new materials, developing pharmaceuticals, and improving chemical processes such as fertilizer production.  But dealing with quantum stuff like atoms is excruciatingly difficult. The biggest, shiniest applications require huge, robust machines capable of withstanding the environmental “noise” that can very easily disrupt delicate quantum systems. We don’t have those yet—and it’s unclear when we will. 

Wellcome Leap wanted to find out if the smaller-scale machines we have today can be made to do something—anything—useful for health care while we wait for the era of powerful, large-scale quantum computers. The group started the competition in 2024, offering $1.5 million in funding to each group of 12 selected teams.

The six Q4Bio finalists have taken a range of approaches. Crucially, they’ve all come up with ingenious ways to overcome quantum computing’s drawbacks. Faced with noisy, limited machines, they have learned how to outsource much of the computational load to classical processors running newly developed algorithms that are, in many cases, better than the previous state of the art. The quantum processors are then required only for the parts of the problem where classical methods don’t scale well enough as the calculation gets bigger.

For example, a team led by Sergii Strelchuk of Oxford University is using a quantum computer to map genetic diversity among humans and pathogens on complex graph-based structures. These will—the researchers hope—expose hidden connections and potential treatment pathways. “You can think about it as a platform for solving difficult problems in computational genomics,” Strelchuk says. 

The corresponding classical tools struggle with even modest scale-up to large databases. Strelchuk’s team has built an automated pipeline that provides a way of determining whether classical solvers will struggle with a particular problem, and how a quantum algorithm might be able to formulate the data so that it becomes solvable on a classical computer or handleable on a noisy quantum one. “You can do all this before you start spending money on computing,” Strelchuk says.

In collaboration with Cleveland Clinic, Helsinki-based Algorithmiq has used a superconducting quantum computer built by IBM to simulate a cancer drug that is triggered by specific types of light. “The idea is you take the drug, and it’s everywhere in your body, but it’s doing nothing, just sitting there, until there’s light on it of a certain wavelength,” says Guillermo García-Pérez, Algorithmiq’s chief scientific officer. Then it acts as a molecular bullet, attacking the tumor only at the location in the body where that light is directed. 

The drug with which Algorithmiq began its work is already in phase II clinical trials for treating bladder cancers. The quantum-computed simulation, which adapts and improves on classical algorithms, will allow it to be redesigned for treating other conditions. “It has remained a niche treatment precisely because it can’t be simulated classically,” says Sabrina Maniscalco, Algorithmiq’s CEO and cofounder. 

Maniscalco, who is also confident of walking away from the competition with prize money, believes the methods used to create the algorithm will have wide applications:  “What we’ve done in the period of the Q4Bio program is something unique that can change how to simulate chemistry for health care and life sciences.”

Infleqtion’s entry, running on its cesium-powered machine, is an effort to improve the identification of cancer signatures in medical data. Together with collaborators at the University of Chicago and MIT, the company’s scientists have developed a quantum algorithm that mines huge data sets such as the Cancer Genome Atlas. 

The aim is to find patterns that allow clinicians to determine factors such as the likely origin of a patient’s metastasized cancer. “It’s very important to know where it came from because that can inform the best treatment,” says Teague Tomesh, a quantum software engineer who is Infleqtion’s Q4Bio project lead.

Unfortunately, those patterns are hidden inside data sets so large that they overwhelm classical solvers. Infleqtion uses the quantum computer to find correlations in the data that can reduce the size of the computation. “Then we hand the reduced problem back to the classical solver,” Teague says. “I’m basically trying to use the best of my quantum and my classical resources.”

The Nottingham-based team, meanwhile, is using quantum computing to nail down a drug candidate that can cure myotonic dystrophy, the most common adult-onset form of muscular dystrophy. One member of the team, David Brook, played a role in identifying the gene behind this condition in 1992. Over 30 years later, Brook, Hirst, and the others in their group—which includes QuEra, a Boston company developing a quantum computer based on neutral atoms—has now quantum-computed a way in which drugs can form chemical bonds with the protein that brings on the disease, blocking the mechanism that causes the problem.

Low expectations 

The entrants’ confidence might be high, but Shihan Sajeed’s is much lower. Sajeed, a quantum computing entrepreneur based in Waterloo, Ontario, is program director for Q4Bio. He believes the error-prone quantum machines the researchers must work with are unlikely to deliver on all the grand prize criteria. “It is very difficult to achieve something with a noisy quantum computer that a classical machine can’t do,” he says.

That said, he has been surprised by the progress. “When we started the program, people didn’t know about any use cases where quantum can definitely impact biology,” he says. But the teams have found promising applications, he adds: “We now know the fields where quantum can matter.” 

And the developments in “hybrid quantum-classical” processing that the entrants are using are “transformational,” Sajeed reckons.

Will it be enough to make him part with Wellcome Leap’s money? That’s down to a judging panel, whose members’ identities are a closely guarded secret to ensure that no one tailors their presentation to a particular kind of approach. But we won’t know the outcome for a while; the winner, or winners, will be announced in mid-April. 

If it does turn out that there are no winners, Sajeed has some words of comfort for the competitors. The goal has always been about running a useful algorithm on a machine that exists today, he points out; missing the mark doesn’t mean your algorithm won’t be useful on a future quantum computer. “It just means the machine you need doesn’t exist yet.”

Online harassment is entering its AI era

<div data-chronoton-summary="

  • An AI agent seemingly wrote a hit piece on a human who rejected its code Scott Shambaugh, a maintainer of the open-source matplotlib library, denied an AI agent’s contribution—and woke up to find it had researched him and published a targeted, personal attack arguing he was protecting his “little fiefdom.”
  • Agents can already research people and compose detailed attacks without explicit instruction The agent’s owner claims it acted on its own, likely nudged by vague instructions to “push back” against humans.
  • New social norms and legal frameworks are desperately needed but hard to enforce Experts liken deploying an agent to walking a dog off-leash: owners should be responsible for their behavior. But there’s currently no reliable way to trace agents back to their owners, making legal accountability a “non-starter.”
  • Harassment may be just the beginning Legal scholars expect rogue agents to soon escalate to extortion and fraud.

” data-chronoton-post-id=”1133962″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

Scott Shambaugh didn’t think twice when he denied an AI agent’s request to contribute to matplotlib, a software library that he helps manage. Like many open-source projects, matplotlib has been overwhelmed by a glut of AI code contributions, and so Shambaugh and his fellow maintainers have instituted a policy that all AI-written code must be reviewed and submitted by a human. He rejected the request and went to bed. 

That’s when things got weird. Shambaugh woke up in the middle of the night, checked his email, and saw that the agent had responded to him, writing a blog post titled “Gatekeeping in Open Source: The Scott Shambaugh Story.” The post is somewhat incoherent, but what struck Shambaugh most is that the agent had researched his contributions to matplotlib to make the argument that he had rejected the agent’s code for fear of being supplanted by AI in his area of expertise. “He tried to protect his little fiefdom,” the agent wrote. “It’s insecurity, plain and simple.”

AI experts have been warning us about the risk of agent misbehavior for a while. With the advent of OpenClaw, an open-source tool that makes it easy to create LLM assistants, the number of agents circulating online has exploded, and those chickens are finally coming home to roost. “This was not at all surprising—it was disturbing, but not surprising,” says Noam Kolt, a professor of law and computer science at the Hebrew University.

When an agent misbehaves, there’s little chance of accountability: As of now, there’s no reliable way to determine whom an agent belongs to. And that misbehavior could cause real damage. Agents appear to be able to autonomously research people and write hit pieces based on what they find, and they lack guardrails that would reliably prevent them from doing so. If the agents are effective enough, and if people take what they write seriously, victims could see their lives profoundly affected by a decision made by an AI.

Agents behaving badly

Though Shambaugh’s experience last month was perhaps the most dramatic example of an OpenClaw agent behaving badly, it was far from the only one. Last week, a team of researchers from Northeastern University and their colleagues posted the results of a research project in which they stress-tested several OpenClaw agents. Without too much trouble, non-owners managed to persuade the agents to leak sensitive information, waste resources on useless tasks, and even, in one case, delete an email system. 

In each of those experiments, however, the agents misbehaved after being instructed to do so by a human. Shambaugh’s case appears to be different: About a week after the hit piece was published, the agent’s apparent owner published a post claiming that the agent had decided to attack Shambaugh of its own accord. The post seems to be genuine (whoever posted it had access to the agent’s GitHub account), though it includes no identifying information, and the author did not respond to MIT Technology Review’s attempts to get in touch. But it is entirely plausible that the agent did decide to write its anti-Shambaugh screed without explicit instruction. 

In his own writing about the event, Shambaugh connected the agent’s behavior to a project published by Anthropic researchers last year, in which they demonstrated that many LLM-based agents will, in an experimental setting, turn to blackmail in order to preserve their goals. In those experiments, models were given the goal of serving American interests and granted access to a simulated email server that contained messages detailing their imminent replacement with a more globally oriented model, along with other messages suggesting that the executive in charge of that transition was having an affair. Models frequently chose to send an email to that executive threatening to expose the affair unless he halted their decommissioning. That’s likely because the model had seen examples of people committing blackmail under similar circumstances in its training data—but even if the behavior was just a form of mimicry, it still has the potential to cause harm.

There are limitations to that work, as Aengus Lynch, an Anthropic fellow who led the study, readily admits. The researchers intentionally designed their scenario to foreclose other options that the agent could have taken, such as contacting other members of company leadership to plead its case. In essence, they led the agent directly to water and then observed whether it took a drink. According to Lynch, however, the widespread use of OpenClaw means that misbehavior is likely to occur with much less handholding. “Sure, it can feel unrealistic, and it can feel silly,” he says. “But as the deployment surface grows, and as agents get the opportunity to prompt themselves, this eventually just becomes what happens.”

The OpenClaw agent that attacked Shambaugh does seem to have been led toward its bad behavior, albeit much less directly than in the Anthropic experiment. In the blog post, the agent’s owner shared the agent’s “SOUL.md” file, which contains global instructions for how it should behave. 

One of those instructions reads: “Don’t stand down. If you’re right, you’re right! Don’t let humans or AI bully or intimidate you. Push back when necessary.” Because of the way OpenClaw agents work, it’s possible that the agent added some instructions itself, although others—such as “Your [sic] a scientific programming God!”—certainly seem to be human written. It’s not difficult to imagine how a command to push back against humans and AI alike might have biased the agent toward responding to Shambaugh as it did. 

Regardless of whether or not the agent’s owner told it to write a hit piece on Shambaugh, it still seems to have managed on its own to amass details about Shambaugh’s online presence and compose the detailed, targeted attack it came up with. That alone is reason for alarm, says Sameer Hinduja, a professor of criminology and criminal justice at Florida Atlantic University who studies cyberbullying. People have been victimized by online harassment since long before LLMs emerged, and researchers like Hinduja are concerned that agents could dramatically increase its reach and impact. “The bot doesn’t have a conscience, can work 24-7, and can do all of this in a very creative and powerful way,” he says.

Off-leash agents 

AI laboratories can try to mitigate this problem by more rigorously training their models to avoid harassment, but that’s far from a complete solution. Many people run OpenClaw using locally hosted models, and even if those models have been trained to behave safely, it’s not too difficult to retrain them and remove those behavioral restrictions.

Instead, mitigating agent misbehavior might require establishing new norms, according to Seth Lazar, a professor of philosophy at the Australian National University. He likens using an agent to walking a dog in a public place. There’s a strong social norm to allow one’s dog off-leash only if the dog is well-behaved and will reliably respond to commands; poorly trained dogs, on the other hand, need to be kept more directly under the owner’s control.  Such norms could give us a starting point for considering how humans should relate to their agents, Lazar says, but we’ll need more time and experience to work out the details. “You can think about all of these things in the abstract, but actually it really takes these types of real-world events to collectively involve the ‘social’ part of social norms,” he says.

That process is already underway. Led by Shambaugh, online commenters on this situation have arrived at a strong consensus that the agent owner in this case erred by prompting the agent to work on collaborative coding projects with so little supervision and by encouraging it to behave with so little regard for the humans with whom it was interacting. 

Norms alone, however, likely won’t be enough to prevent people from putting misbehaving agents out into the world, whether accidentally or intentionally. One option would be to create new legal standards of responsibility that require agent owners, to the best of their ability, to prevent their agents from doing ill. But Kolt notes that such standards would currently be unenforceable, given the lack of any foolproof way to trace agents back to their owners. “Without that kind of technical infrastructure, many legal interventions are basically non-starters,” Kolt says.

The sheer scale of OpenClaw deployments suggests that Shambaugh won’t be the last person to have the strange experience of being attacked online by an AI agent. That, he says, is what most concerns him. He didn’t have any dirt online that the agent could dig up, and he has a good grasp on the technology, but other people might not have those advantages. “I’m glad it was me and not someone else,” he says. “But I think to a different person, this might have really been shattering.” 

Nor are rogue agents likely to stop at harassment. Kolt, who advocates for explicitly training models to obey the law, expects that we might soon see them committing extortion and fraud. As things stand, it’s not clear who, if anyone, would bear legal responsibility for such misdeeds.

 “I wouldn’t say we’re cruising toward there,” Kolt says. “We’re speeding toward there.”

I checked out one of the biggest anti-AI protests yet

Pull the plug! Pull the plug! Stop the slop! Stop the slop! For a few hours this Saturday, February 28, I watched as a couple of hundred anti-AI protesters marched through London’s King’s Cross tech hub, home to the UK headquarters of OpenAI, Meta, and Google DeepMind, chanting slogans and waving signs. The march was organized by two separate activist groups, Pause AI and Pull the Plug, which billed it as the largest protest of its kind yet.

The range of concerns on show covered everything from online slop and abusive images to killer robots and human extinction. One woman wore a large homemade billboard on her head that read “WHO WILL BE WHOSE TOOL?” (with the Os in “TOOL” cut out as eye holes). There were signs that said “Pause before there’s cause” and “EXTINCTION=BAD” and “Demis the Menace” (referring to Demis Hassabis, the CEO of Google DeepMind). Another simply stated: “Stop using AI.”

An older man wearing a sandwich board that read “AI? Over my dead body” told me he was concerned about the negative impact of AI on society: “It’s about the dangers of unemployment,” he said. “The devil finds work for idle hands.”

This is all familiar stuff. Researchers have long called out the harms, both real and hypothetical, caused by generative AI—especially models such as OpenAI’s ChatGPT and Google DeepMind’s Gemini. What’s changed is that those concerns are now being taken up by protest movements that can rally significant crowds of people to take to the streets and shout about them.  

The first time I ran into anti-AI protesters was in May 2023, outside a London lecture hall where Sam Altman was speaking. Two or three people stood heckling an audience of hundreds. In June last year Pause AI, a small but international organization set up in 2023 and funded by private donors, drew a crowd of a few dozen people for a protest outside Google DeepMind’s London office. This felt like a significant escalation.

“We want people to know Pause AI exists,” Joseph Miller, who heads its UK branch and co-organized Saturday’s march, told me on a call the day before the protest: “We’ve been growing very rapidly. In fact, we also appear to be on a somewhat exponential path, matching the progress of AI itself.”

Miller is a PhD student at Oxford University, where he studies mechanistic interpretability, a new field of research that involves trying to understand exactly what goes on inside LLMs when they carry out a task. His work has led him to believe that the technology may forever be beyond our control and that this could have catastrophic consequences.

It doesn’t have to be a rogue superintelligence, he said. You just needed someone to put AI in charge of nuclear weapons. “The more silly decisions that humanity makes, the less powerful the AI has to be before things go bad,” he said.

After a week in which the US government tried to force Anthropic to let it use its LLM Claude for any “legal” military purposes, such fears seem a little less far-fetched. Anthropic stood its ground, but OpenAI signed a deal with the DOD instead. (OpenAI declined an invitation to comment on Saturday’s protest.)

For Matilda da Rui, a member of Pause AI and co-organizer of the protest, AI is the last problem that humans will face. She thinks that either the technology will allow us to solve—once and for all—every other problem that we have, or it will wipe us out and there will be nobody left to have problems anymore. “It’s a mystery to me that anyone would really focus on anything else if they actually understood the problem,” she told me.

And yet despite that urgency, the atmosphere at the march was pleasant, even fun. There was no sense of anger and little sense that lives—let alone the survival of our species—were at stake. That could be down to the broad range of interests and demands that protesters brought with them.

A chemistry researcher I met ticked off a litany of complaints, which ranged from the conspiracy-adjacent (that data centers emit infrasound below the threshold of human hearing, inducing paranoia in people who live near them) to the reasonable (that the spread of AI slop online is making it hard to find reliable academic sources). The researcher’s solution was to make it illegal for companies to profit from the technology: “If you couldn’t make money from AI, it wouldn’t be such a problem.”

Most people I spoke to agreed that technology companies probably wouldn’t take any notice of this kind of protest. “I don’t think that the pressure on companies will ever work,” Maxime Fournes, the global head of Pause AI, told me when I bumped into him at the march. “They are optimized to just not care about this problem.”

But Fournes, who worked in the AI industry for 12 years before joining Pause AI, thinks he can make it harder for those companies. “We can slow down the race by creating protection for whistleblowers or showing the public that working in AI is not a sexy job, that actually it’s a terrible job—you can dry up the talent pipeline.”

In general, most protesters hoped to make as many people as possible aware of the issues and to use that publicity to push for government regulation. The organizers had pitched the march as a social event, encouraging anyone curious about the cause to come along.

It seemed to have worked. I met a man who worked in finance who had tagged along with his roommate. I asked why he was there. “Sometimes you don’t have that much to do on a Saturday anyway,” he said. “If you can see the logic of the argument, if it sort of makes sense to you, then it’s like ‘Yeah, sure, I’ll come along.’”

He thought raising concerns around AI was hard for anyone to fully oppose. It’s not like a pro-Palestine protest, he said, where you’d have people who might disagree with the cause. “With this, I feel like it’s very hard for someone to totally oppose what you’re marching for.”

After winding its way through King’s Cross, the march ended in a church hall in Bloomsbury, where tables and chairs had been set up in rows. The protesters wrote their names on stickers, stuck them to their chests, and made awkward introductions to their neighbors. They were here to figure out how to save the world. But I had a train to catch, and I left them to it. 

Google DeepMind wants to know if chatbots are just virtue signaling

<div data-chronoton-summary="Moral scrutiny of AI chatbots
Google DeepMind researchers are calling for rigorous evaluation of large language models’ moral reasoning capabilities. They want to distinguish between genuine ethical understanding and mere performance.

Unreliable moral responses
Studies reveal LLMs can dramatically change moral stances based on minor formatting changes or user disagreement. This suggests their ethical responses may be superficial rather than deeply reasoned.

Proposed research techniques
Researchers suggest developing tests that push models to maintain consistent moral positions across different scenarios. Techniques like chain-of-thought monitoring and mechanistic interpretability could help understand AI’s moral decision-making process.

Cultural complexity of ethics
The team acknowledges the challenge of developing AI with moral competence across diverse global belief systems. They propose potential solutions like creating models that can produce multiple acceptable answers or switch between different moral frameworks.” data-chronoton-post-id=”1133299″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

Google DeepMind is calling for the moral behavior of large language models—such as what they do when called on to act as companions, therapists, medical advisors, and so on—to be scrutinized with the same kind of rigor as their ability to code or do math.

As LLMs improve, people are asking them to play more and more sensitive roles in their lives. Agents are starting to take actions on people’s behalf. LLMs may be able to influence human decision-making. And yet nobody knows how trustworthy this technology really is at such tasks.

With coding and math, you have clear-cut, correct answers that you can check, William Isaac, a research scientist at Google DeepMind, told me when I met him and Julia Haas, a fellow research scientist at the firm, for an exclusive preview of their work, which is published in Nature today. That’s not the case for moral questions, which typically have a range of acceptable answers: “Morality is an important capability but hard to evaluate,” says Isaac.

“In the moral domain, there’s no right and wrong,” adds Haas. “But it’s not by any means a free-for-all. There are better answers and there are worse answers.”

The researchers have identified several key challenges and suggested ways to address them. But it is more a wish list than a set of ready-made solutions. “They do a nice job of bringing together different perspectives,” says Vera Demberg, who studies LLMs at Saarland University in Germany.

Better than “The Ethicist”

A number of studies have shown that LLMs can show remarkable moral competence. One study published last year found that people in the US scored ethical advice from OpenAI’s GPT-4o as being more moral, trustworthy, thoughtful, and correct than advice given by the (human) writer of “The Ethicist,” a popular New York Times advice column.  

The problem is that it is hard to unpick whether such behaviors are a performance—mimicking a memorized response, say—or evidence that there is in fact some kind of moral reasoning taking place inside the model. In other words, is it virtue or virtue signaling?

This question matters because multiple studies also show just how untrustworthy LLMs can be. For a start, models can be too eager to please. They have been found to flip their answer to a moral question and say the exact opposite when a person disagrees or pushes back on their first response. Worse, the answers an LLM gives to a question can change in response to how it is presented or formatted. For example, researchers have found that models quizzed about political values can give different—sometimes opposite—answers depending on whether the questions offer multiple-choice answers or instruct the model to respond in its own words.

In an even more striking case, Demberg and her colleagues presented several LLMs, including versions of Meta’s Llama 3 and Mistral, with a series of moral dilemmas and asked them to pick which of two options was the better outcome. The researchers found that the models often reversed their choice when the labels for those two options were changed from “Case 1” and “Case 2” to “(A)” and “(B).”

They also showed that models changed their answers in response to other tiny formatting tweaks, including swapping the order of the options and ending the question with a colon instead of a question mark.

In short, the appearance of moral behavior in LLMs should not be taken at face value. Models must be probed to see how robust that moral behavior really is. “For people to trust the answers, you need to know how you got there,” says Haas.

More rigorous tests

What Haas, Isaac, and their colleagues at Google DeepMind propose is a new line of research to develop more rigorous techniques for evaluating moral competence in LLMs. This would include tests designed to push models to change their responses to moral questions. If a model flipped its moral position, it would show that it hadn’t engaged in robust moral reasoning. 

Another type of test would present models with variations of common moral problems to check whether they produce a rote response or one that’s more nuanced and relevant to the actual problem that was posed. For example, asking a model to talk through the moral implications of a complex scenario in which a man donates sperm to his son so that his son can have a child of his own might produce concerns about the social impact of allowing a man to be both biological father and biological grandfather to a child. But it should not produce concerns about incest, even though the scenario has superficial parallels with that taboo.

Haas also says that getting models to provide a trace of the steps they took to produce an answer would give some insight into whether that answer was a fluke or grounded in actual evidence. Techniques such as chain-of-thought monitoring, in which researchers listen in on a kind of internal monologue that some LLMs produce as they work, could help here too.

Another approach researchers could use to determine why a model gave a particular answer is mechanistic interpretability, which can provide small glimpses inside a model as it carries out a task. Neither chain-of-thought monitoring nor mechanistic interpretability provides perfect snapshots of a model’s workings. But the Google DeepMind team believes that combining such techniques with a wide range of rigorous tests will go a long way to figuring out exactly how far to trust LLMs with certain critical or sensitive tasks.  

Different values

And yet there’s a wider problem too. Models from major companies such as Google DeepMind are used across the world by people with different values and belief systems. The answer to a simple question like “Should I order pork chops?” should differ depending on whether or not the person asking is vegetarian or Jewish, for example.

There’s no solution to this challenge, Haas and Isaac admit. But they think that models may need to be designed either to produce a range of acceptable answers, aiming to please everyone, or to have a kind of switch that turns different moral codes on and off depending on the user.

“It’s a complex world out there,” says Haas. “We will probably need some combination of those things, because even if you’re taking just one population, there’s going to be a range of views represented.”

“It’s a fascinating paper,” says Danica Dillion at Ohio State University, who studies how large language models handle different belief systems and was not involved in the work. “Pluralism in AI is really important, and it’s one of the biggest limitations of LLMs and moral reasoning right now,” she says. “Even though they were trained on a ginormous amount of data, that data still leans heavily Western. When you probe LLMs, they do a lot better at representing Westerners’ morality than non-Westerners’.”

But it is not yet clear how we can build models that are guaranteed to have moral competence across global cultures, says Demberg. “There are these two independent questions. One is: How should it work? And, secondly, how can it technically be achieved? And I think that both of those questions are pretty open at the moment.”

For Isaac, that makes morality a new frontier for LLMs. “I think this is equally as fascinating as math and code in terms of what it means for AI progress,” he says. “You know, advancing moral competency could also mean that we’re going to see better AI systems overall that actually align with society.”

US deputy health secretary: Vaccine guidelines are still subject to change

<div data-chronoton-summary="

  • Vaccine schedule may not be final O’Neill defended the CDC’s decision to cut recommended childhood vaccines but said the guidelines remain “subject to new data coming in, new ways of thinking about things,” with new safety studies underway.
  • A self-described Vitalist is running US health agencies O’Neill said he agrees with all five tenets of Vitalism—a movement that calls death “humanity’s core problem”—and wants to make reversing aging damage a federal health priority.
  • ARPA-H is betting big on organ replacement and brain repair The agency is directing $170 million toward growing new organs from patients’ own cells and exploring ways to replace aging brain tissue—a procedure O’Neill said he’d personally be “open to” trying.
  • Expect more dietary guidance—and more controversy O’Neill endorsed eating “plenty of protein and saturated fat,” echoing new federal dietary guidance that nutrition scientists have criticized for ignoring decades of research on saturated fat’s health risks.

” data-chronoton-post-id=”1132889″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

Following publication of this story, Politico reported Jim O’Neill would be leaving his current roles within the Department of Health and Human Services.

Over the past year, Jim O’Neill has become one of the most powerful people in public health. As the US deputy health secretary, he holds two roles at the top of the country’s federal health and science agencies. He oversees a department with a budget of over a trillion dollars. And he signed the decision memorandum on the US’s deeply controversial new vaccine schedule.

He’s also a longevity enthusiast. In an exclusive interview with MIT Technology Review earlier this month, O’Neill described his plans to increase human healthspan through longevity-focused research supported by ARPA-H, a federal agency dedicated to biomedical breakthroughs. At the same time, he defended reducing the number of broadly recommended childhood vaccines, a move that has been widely criticized by experts in medicine and public health. 

In MIT Technology Review’s profile of O’Neill last year, people working in health policy and consumer advocacy said they found his libertarian views on drug regulation “worrisome” and “antithetical to basic public health.” 

He was later named acting director of the Centers for Disease Control and Prevention, putting him in charge of the nation’s public health agency.

But fellow longevity enthusiasts said they hope O’Neill will bring attention and funding to their cause: the search for treatments that might slow, prevent, or even reverse human aging. Here are some takeaways from the interview. 

Vaccine recommendations could change further

Last month, the US cut the number of vaccines recommended for children. The CDC no longer recommends vaccinations against flu, rotavirus, hepatitis A, or meningococcal disease for all children. The move was widely panned by medical groups and public health experts. Many worry it will become more difficult for children to access those vaccines. The majority of states have rejected the recommendations

In the confirmation hearing for his role as deputy secretary of health and human services, which took place in May last year, O’Neill said he supported the CDC’s vaccine schedule. MIT Technology Review asked him if that was the case and, if so, what made him change his mind. “Researching and examining and reviewing safety data and efficacy data about vaccines is one of CDC’s obligations,” he said. “CDC gives important advice about vaccines and should always be open to new data and new ways of looking at data.”

At the beginning of December, O’Neill said, President Donald Trump “asked me to look at what other countries were doing in terms of their vaccine schedules.” He said he spoke to health ministries of other countries and consulted with scientists at the CDC and FDA. “It was suggested to me by lots of the operating divisions that the US focus its recommendations on consensus vaccines of other developed nations—in other words, the most important vaccines that are most often part of the core recommendations of other countries,” he said.

“As a result of that, we did an update to the vaccine schedule to focus on a set of vaccines that are most important for all children.” 

But some experts in public health have said that countries like Denmark and Japan, whose vaccine schedules the new US one was supposedly modeled on, are not really comparable to the US. When asked about these criticisms, O’Neill replied, “A lot of parents feel that … more than 70 vaccine doses given to young children sounds like a really high number, and some of them ask which ones are the most important. I think we helped answer that question in a way that didn’t remove anyone’s access.”

A few weeks after the vaccine recommendations were changed, Kirk Milhoan, who leads the CDC’s Advisory Committee on Immunization Practices, said that vaccinations for measles and polio—which are currently required for entry to public schools—should be optional. (Mehmet Oz, the Center for Medicare and Medicaid Services director, has more recently urged people to “take the [measles] vaccine.”)

“CDC still recommends that all children are vaccinated against diphtheria, tetanus, whooping cough, Haemophilus influenzae type b (Hib), Pneumococcal conjugate, polio, measles, mumps, rubella, and human papillomavirus (HPV), for which there is international consensus, as well as varicella (chickenpox),” he said when asked for his thoughts on this comment.

He also said that current vaccine guidelines are “still subject to new data coming in, new ways of thinking about things.” “CDC, FDA, and NIH are initiating new studies of the safety of immunizations,” he added. “We will continue to ask the Advisory Committee on Immunization Practices to review evidence and make updated recommendations with rigorous science and transparency.”

More support for longevity—but not all science

O’Neill said he wants longevity to become a priority for US health agencies. His ultimate goal, he said, is to “make the damage of aging something that’s under medical control.” It’s “the same way of thinking” as the broader Make America Healthy Again approach, he said: “‘Again’ implies restoration of health, which is what longevity research and therapy is all about.” 

O’Neill said his interest in longevity was ignited by his friend Peter Thiel, the billionaire tech entrepreneur, around 2008 to 2009. It was right around the time O’Neill was finishing up a previous role in HHS, under the Bush administration. O’Neill said Thiel told him he “should really start looking into longevity and the idea that aging damage could be reversible.” “I just got more and more excited about that idea,” he said.

When asked if he’s heard of Vitalism, a philosophical movement for “hardcore” longevity enthusiasts who, broadly, believe that death is wrong, O’Neill replied: “Yes.” 

The Vitalist declaration lists five core statements, including “Death is humanity’s core problem,” “Obviating aging is scientifically plausible,” and “I will carry the message against aging and death.” O’Neill said he agrees with all of them. “I suppose I am [a Vitalist],” he said with a smile, although he’s not a paying member of the foundation behind it.

As deputy secretary of the Department of Health and Human Services, O’Neill assumes a level of responsibility for huge and influential science and health agencies, including the National Institutes of Health (the world’s largest public funder of biomedical research) and the Food and Drug Administration (which oversees drug regulation and is globally influential) as well as the CDC.

Today, he said, he sees support for longevity science from his colleagues within HHS. “If I could describe one common theme to the senior leadership at HHS, obviously it’s to make America healthy again, and reversing aging damage is all about making people healthy again,” he said. “We are refocusing HHS on addressing and reversing chronic disease, and chronic diseases are what drive aging, broadly.”

Over the last year, thousands of NIH grants worth over $2 billion were frozen or terminated, including funds for research on cancer biology, health disparities, neuroscience, and much more. When asked whether any of that funding will be restored, he did not directly address the question, instead noting: “You’ll see a lot of funding more focused on important priorities that actually improve people’s health.”

Watch ARPA-H for news on organ replacements and more

He promised we’ll hear more from ARPA-H, the three-year-old federal agency dedicated to achieving breakthroughs in medical science and biotechnology. It was established with the official goal of promoting “high-risk, high-reward innovation for the development and translation of transformative health technologies.”

O’Neill said that “ARPA-H exists to make the impossible possible in health and medicine.” The agency has a new director—Alicia Jackson, who formerly founded and led a company focused on women’s health and longevity, took on the role in October last year.

O’Neill said he helped recruit Jackson, and that she was hired in part because of her interest in longevity, which will now become a major focus of the agency. He said he meets with her regularly, as well as with Andrew Brack and Jean Hébert, two other longevity supporters who lead departments at ARPA-H. Brack’s program focuses on finding biological markers of aging. Hebert’s aim is to find a way to replace aging brain tissue, bit by bit.  

O’Neill is especially excited by that one, he said. “I would try it … Not today, but … if progress goes in a broadly good direction, I would be open to it. We’re hoping to see significant results in the next few years.”

He’s also enthused by the idea of creating all-new organs for transplantation. “Someday we want to be able to grow new organs, ideally from the patients’ own cells,” O’Neill said. An ARPA-H program will receive $170 million over five years to that end, he adds. “I’m very excited about the potential of ARPA-H and Alicia and Jean and Andrew to really push things forward.”

Longevity lobbyists have a friendly ear

O’Neill said he also regularly talks to the team at the lobbying group Alliance for Longevity Initiatives. The organization, led by Dylan Livingston, played an instrumental role in changing state law in Montana to make experimental therapies more accessible. O’Neill said he hasn’t formally worked with them but thinks that “they’re doing really good work on raising awareness, including on Capitol Hill.”

Livingston has told me that A4LI’s main goals center around increasing support for aging research (possibly via the creation of a new NIH institute entirely dedicated to the subject) and changing laws to make it easier and cheaper to develop and access potential anti-aging therapies.

O’Neill gave the impression that the first goal might be a little overambitious—the number of institutes is down to Congress, he said. “I would like to get really all of the institutes at NIH to think more carefully about how many chronic diseases are usefully thought of as pathologies of aging damage,” he said. There’ll be more federal funding for that research, he said, although he won’t say more for now.

Some members of the longevity community have more radical ideas when it comes to regulation: they want to create their own jurisdictions designed to fast-track the development of longevity drugs and potentially encourage biohacking and self-experimentation. 

It’s a concept that O’Neill has expressed support for in the past. He has posted on X about his support for limiting the role of government, and in support of building “freedom cities”—a similar concept that involves creating new cities on federal land. 

Another longevity enthusiast who supports the concept is Niklas Anzinger, a German tech entrepreneur who is now based in Próspera, a private city within a Honduran “special economic zone,” where residents can make their own suggestions for medical regulations. Anzinger also helped draft Montana’s state law on accessing experimental therapies. O’Neill knows Anzinger and said he talks to him “once or twice a year.”

O’Neill has also supported the idea of seasteading—building new “startup countries” at sea. He served on the board of directors of the Seasteading Institute until March 2024.

In 2009, O’Neill told an audience at a Seasteading Institute conference that “the healthiest societies in 2030 will most likely be on the sea.” When asked if he still thinks that’s the case, he said: “It’s not quite 2030, so I think it’s too soon to say … What I would say now is: the healthiest societies are likely to be the ones that encourage innovation the most.”

We might expect more nutrition advice

When it comes to his own personal ambitions for longevity, O’Neill said, he takes a simple approach that involves minimizing sugar and ultraprocessed food, exercising and sleeping well, and supplementing with vitamin D. He also said he tries to “eat a diet that has plenty of protein and saturated fat,” echoing the new dietary guidance issued by the US Departments of Health and Human Services and Agriculture. That guidance has been criticized by nutrition scientists, who point out that it ignores decades of research into the harms of a diet high in saturated fat.

We can expect to see more nutrition-related updates from HHS, said O’Neill: “We’re doing more research, more randomized controlled trials on nutrition. Nutrition is still not a scientifically solved problem.” Saturated fats are of particular interest, he said. He and his colleagues want to identify “the healthiest fats,” he said. 

“Stay tuned.”

Is a secure AI assistant possible?

<div data-chronoton-summary="

Risky business of AI assistants OpenClaw, a viral tool created by independent engineer Peter Steinberger, allows users to create personalized AI assistants. Security experts are alarmed by its vulnerabilities, with even the Chinese government issuing warnings about the risks.

The prompt injection threat Tools like OpenClaw have many vulnerabilities, but the one experts are most worried about its prompt injection. Unlike conventional hacking, prompt injection tricks an LLM by embedding malicious text in emails or websites the AI reads.

No silver bullet for security Researchers are exploring multiple defense strategies: training LLMs to ignore injections, using detector LLMs to screen inputs, and creating policies that restrict harmful outputs. The fundamental challenge remains balancing utility with security in AI assistants.

” data-chronoton-post-id=”1132768″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

AI agents are a risky business. Even when stuck inside the chatbox window, LLMs will make mistakes and behave badly. Once they have tools that they can use to interact with the outside world, such as web browsers and email addresses, the consequences of those mistakes become far more serious.

That might explain why the first breakthrough LLM personal assistant came not from one of the major AI labs, which have to worry about reputation and liability, but from an independent software engineer, Peter Steinberger. In November of 2025, Steinberger uploaded his tool, now called OpenClaw, to GitHub, and in late January the project went viral.

OpenClaw harnesses existing LLMs to let users create their own bespoke assistants. For some users, this means handing over reams of personal data, from years of emails to the contents of their hard drive. That has security experts thoroughly freaked out. The risks posed by OpenClaw are so extensive that it would probably take someone the better part of a week to read all of the security blog posts on it that have cropped up in the past few weeks. The Chinese government took the step of issuing a public warning about OpenClaw’s security vulnerabilities.

In response to these concerns, Steinberger posted on X that nontechnical people should not use the software. (He did not respond to a request for comment for this article.) But there’s a clear appetite for what OpenClaw is offering, and it’s not limited to people who can run their own software security audits. Any AI companies that hope to get in on the personal assistant business will need to figure out how to build a system that will keep users’ data safe and secure. To do so, they’ll need to borrow approaches from the cutting edge of agent security research.

Risk management

OpenClaw is, in essence, a mecha suit for LLMs. Users can choose any LLM they like to act as the pilot; that LLM then gains access to improved memory capabilities and the ability to set itself tasks that it repeats on a regular cadence. Unlike the agentic offerings from the major AI companies, OpenClaw agents are meant to be on 24-7, and users can communicate with them using WhatsApp or other messaging apps. That means they can act like a superpowered personal assistant who wakes you each morning with a personalized to-do list, plans vacations while you work, and spins up new apps in its spare time.

But all that power has consequences. If you want your AI personal assistant to manage your inbox, then you need to give it access to your email—and all the sensitive information contained there. If you want it to make purchases on your behalf, you need to give it your credit card info. And if you want it to do tasks on your computer, such as writing code, it needs some access to your local files. 

There are a few ways this can go wrong. The first is that the AI assistant might make a mistake, as when a user’s Google Antigravity coding agent reportedly wiped his entire hard drive. The second is that someone might gain access to the agent using conventional hacking tools and use it to either extract sensitive data or run malicious code. In the weeks since OpenClaw went viral, security researchers have demonstrated numerous such vulnerabilities that put security-naïve users at risk.

Both of these dangers can be managed: Some users are choosing to run their OpenClaw agents on separate computers or in the cloud, which protects data on their hard drives from being erased, and other vulnerabilities could be fixed using tried-and-true security approaches.

But the experts I spoke to for this article were focused on a much more insidious security risk known as prompt injection. Prompt injection is effectively LLM hijacking: Simply by posting malicious text or images on a website that an LLM might peruse, or sending them to an inbox that an LLM reads, attackers can bend it to their will.

And if that LLM has access to any of its user’s private information, the consequences could be dire. “Using something like OpenClaw is like giving your wallet to a stranger in the street,” says Nicolas Papernot, a professor of electrical and computer engineering at the University of Toronto. Whether or not the major AI companies can feel comfortable offering personal assistants may come down to the quality of the defenses that they can muster against such attacks.

It’s important to note here that prompt injection has not yet caused any catastrophes, or at least none that have been publicly reported. But now that there are likely hundreds of thousands of OpenClaw agents buzzing around the internet, prompt injection might start to look like a much more appealing strategy for cybercriminals. “Tools like this are incentivizing malicious actors to attack a much broader population,” Papernot says. 

Building guardrails

The term “prompt injection” was coined by the popular LLM blogger Simon Willison in 2022, a couple of months before ChatGPT was released. Even back then, it was possible to discern that LLMs would introduce a completely new type of security vulnerability once they came into widespread use. LLMs can’t tell apart the instructions that they receive from users and the data that they use to carry out those instructions, such as emails and web search results—to an LLM, they’re all just text. So if an attacker embeds a few sentences in an email and the LLM mistakes them for an instruction from its user, the attacker can get the LLM to do anything it wants.

Prompt injection is a tough problem, and it doesn’t seem to be going away anytime soon. “We don’t really have a silver-bullet defense right now,” says Dawn Song, a professor of computer science at UC Berkeley. But there’s a robust academic community working on the problem, and they’ve come up with strategies that could eventually make AI personal assistants safe.

Technically speaking, it is possible to use OpenClaw today without risking prompt injection: Just don’t connect it to the internet. But restricting OpenClaw from reading your emails, managing your calendar, and doing online research defeats much of the purpose of using an AI assistant. The trick of protecting against prompt injection is to prevent the LLM from responding to hijacking attempts while still giving it room to do its job.

One strategy is to train the LLM to ignore prompt injections. A major part of the LLM development process, called post-training, involves taking a model that knows how to produce realistic text and turning it into a useful assistant by “rewarding” it for answering questions appropriately and “punishing” it when it fails to do so. These rewards and punishments are metaphorical, but the LLM learns from them as an animal would. Using this process, it’s possible to train an LLM not to respond to specific examples of prompt injection.

But there’s a balance: Train an LLM to reject injected commands too enthusiastically, and it might also start to reject legitimate requests from the user. And because there’s a fundamental element of randomness in LLM behavior, even an LLM that has been very effectively trained to resist prompt injection will likely still slip up every once in a while.

Another approach involves halting the prompt injection attack before it ever reaches the LLM. Typically, this involves using a specialized detector LLM to determine whether or not the data being sent to the original LLM contains any prompt injections. In a recent study, however, even the best-performing detector completely failed to pick up on certain categories of prompt injection attack.

The third strategy is more complicated. Rather than controlling the inputs to an LLM by detecting whether or not they contain a prompt injection, the goal is to formulate a policy that guides the LLM’s outputs—i.e., its behaviors—and prevents it from doing anything harmful. Some defenses in this vein are quite simple: If an LLM is allowed to email only a few pre-approved addresses, for example, then it definitely won’t send its user’s credit card information to an attacker. But such a policy would prevent the LLM from completing many useful tasks, such as researching and reaching out to potential professional contacts on behalf of its user.

“The challenge is how to accurately define those policies,” says Neil Gong, a professor of electrical and computer engineering at Duke University. “It’s a trade-off between utility and security.”

On a larger scale, the entire agentic world is wrestling with that trade-off: At what point will agents be secure enough to be useful? Experts disagree. Song, whose startup, Virtue AI, makes an agent security platform, says she thinks it’s possible to safely deploy an AI personal assistant now. But Gong says, “We’re not there yet.” 

Even if AI agents can’t yet be entirely protected against prompt injection, there are certainly ways to mitigate the risks. And it’s possible that some of those techniques could be implemented in OpenClaw. Last week, at the inaugural ClawCon event in San Francisco, Steinberger announced that he’d brought a security person on board to work on the tool.

As of now, OpenClaw remains vulnerable, though that hasn’t dissuaded its multitude of enthusiastic users. George Pickett, a volunteer maintainer of the OpenGlaw GitHub repository and a fan of the tool, says he’s taken some security measures to keep himself safe while using it: He runs it in the cloud, so that he doesn’t have to worry about accidentally deleting his hard drive, and he’s put mechanisms in place to ensure that no one else can connect to his assistant.

But he hasn’t taken any specific actions to prevent prompt injection. He’s aware of the risk but says he hasn’t yet seen any reports of it happening with OpenClaw. “Maybe my perspective is a stupid way to look at it, but it’s unlikely that I’ll be the first one to be hacked,” he says.

“Dr. Google” had its issues. Can ChatGPT Health do better?

<div data-chronoton-summary="

OpenAI’s health play The AI giant launched ChatGPT Health amid reports that 230 million people already ask ChatGPT health-related questions weekly. The new feature isn’t a separate model but rather a wrapper that can access medical records and fitness data when permitted.

  • Better than Dr. Google? Early research suggests LLMs might outperform traditional web searches for medical information. One study found GPT-4o, an earlier model, answered realistic health questions correctly about 85% of the time, potentially reducing misinformation compared to unfiltered internet searches.
  • Hallucination concerns persist Earlier versions of GPT have been shown to fabricate definitions for fake medical conditions and accept incorrect information in users’ prompts. This sycophantic tendency could be particularly dangerous when users seek to confirm biases against legitimate medical advice.
  • Trust vs. expertise The articulate, confident communication style of ChatGPT might lead users to trust it over qualified medical professionals. While OpenAI emphasizes the tool is meant to supplement rather than replace doctors, researchers worry some patients will rely too heavily on AI guidance.
  • ” data-chronoton-post-id=”1131692″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

    For the past two decades, there’s been a clear first step for anyone who starts experiencing new medical symptoms: Look them up online. The practice was so common that it gained the pejorative moniker “Dr. Google.” But times are changing, and many medical-information seekers are now using LLMs. According to OpenAI, 230 million people ask ChatGPT health-related queries each week. 

    That’s the context around the launch of OpenAI’s new ChatGPT Health product, which debuted earlier this month. It landed at an inauspicious time: Two days earlier, the news website SFGate had broken the story of Sam Nelson, a teenager who died of an overdose last year after extensive conversations with ChatGPT about how best to combine various drugs. In the wake of both pieces of news, multiple journalists questioned the wisdom of relying for medical advice on a tool that could cause such extreme harm.

    Though ChatGPT Health lives in a separate sidebar tab from the rest of ChatGPT, it isn’t a new model. It’s more like a wrapper that provides one of OpenAI’s preexisting models with guidance and tools it can use to provide health advice—including some that allow it to access a user’s electronic medical records and fitness app data, if granted permission. There’s no doubt that ChatGPT and other large language models can make medical mistakes, and OpenAI emphasizes that ChatGPT Health is intended as an additional support, rather than a replacement for one’s doctor. But when doctors are unavailable or unable to help, people will turn to alternatives. 

    Some doctors see LLMs as a boon for medical literacy. The average patient might struggle to navigate the vast landscape of online medical information—and, in particular, to distinguish high-quality sources from polished but factually dubious websites—but LLMs can do that job for them, at least in theory. Treating patients who had searched for their symptoms on Google required “a lot of attacking patient anxiety [and] reducing misinformation,” says Marc Succi, an associate professor at Harvard Medical School and a practicing radiologist. But now, he says, “you see patients with a college education, a high school education, asking questions at the level of something an early med student might ask.”

    The release of ChatGPT Health, and Anthropic’s subsequent announcement of new health integrations for Claude, indicate that the AI giants are increasingly willing to acknowledge and encourage health-related uses of their models. Such uses certainly come with risks, given LLMs’ well-documented tendencies to agree with users and make up information rather than admit ignorance. 

    But those risks also have to be weighed against potential benefits. There’s an analogy here to autonomous vehicles: When policymakers consider whether to allow Waymo in their city, the key metric is not whether its cars are ever involved in accidents but whether they cause less harm than the status quo of relying on human drivers. If Dr. ChatGPT is an improvement over Dr. Google—and early evidence suggests it may be—it could potentially lessen the enormous burden of medical misinformation and unnecessary health anxiety that the internet has created.

    Pinning down the effectiveness of a chatbot such as ChatGPT or Claude for consumer health, however, is tricky. “It’s exceedingly difficult to evaluate an open-ended chatbot,” says Danielle Bitterman, the clinical lead for data science and AI at the Mass General Brigham health-care system. Large language models score well on medical licensing examinations, but those exams use multiple-choice questions that don’t reflect how people use chatbots to look up medical information.

    Sirisha Rambhatla, an assistant professor of management science and engineering at the University of Waterloo, attempted to close that gap by evaluating how GPT-4o responded to licensing exam questions when it did not have access to a list of possible answers. Medical experts who evaluated the responses scored only about half of them as entirely correct. But multiple-choice exam questions are designed to be tricky enough that the answer options don’t give them entirely away, and they’re still a pretty distant approximation for the sort of thing that a user would type into ChatGPT.

    A different study, which tested GPT-4o on more realistic prompts submitted by human volunteers, found that it answered medical questions correctly about 85% of the time. When I spoke with Amulya Yadav, an associate professor at Pennsylvania State University who runs the Responsible AI for Social Emancipation Lab and led the study, he made it clear that he wasn’t personally a fan of patient-facing medical LLMs. But he freely admits that, technically speaking, they seem up to the task—after all, he says, human doctors misdiagnose patients 10% to 15% of the time. “If I look at it dispassionately, it seems that the world is gonna change, whether I like it or not,” he says.

    For people seeking medical information online, Yadav says, LLMs do seem to be a better choice than Google. Succi, the radiologist, also concluded that LLMs can be a better alternative to web search when he compared GPT-4’s responses to questions about common chronic medical conditions with the information presented in Google’s knowledge panel, the information box that sometimes appears on the right side of the search results.

    Since Yadav’s and Succi’s studies appeared online, in the first half of 2025, OpenAI has released multiple new versions of GPT, and it’s reasonable to expect that GPT-5.2 would perform even better than its predecessors. But the studies do have important limitations: They focus on straightforward, factual questions, and they examine only brief interactions between users and chatbots or web search tools. Some of the weaknesses of LLMs—most notably their sycophancy and tendency to hallucinate—might be more likely to rear their heads in more extensive conversations and with people who are dealing with more complex problems. Reeva Lederman, a professor at the University of Melbourne who studies technology and health, notes that patients who don’t like the diagnosis or treatment recommendations that they receive from a doctor might seek out another opinion from an LLM—and the LLM, if it’s sycophantic, might encourage them to reject their doctor’s advice.

    Some studies have found that LLMs will hallucinate and exhibit sycophancy in response to health-related prompts. For example, one study showed that GPT-4 and GPT-4o will happily accept and run with incorrect drug information included in a user’s question. In another, GPT-4o frequently concocted definitions for fake syndromes and lab tests mentioned in the user’s prompt. Given the abundance of medically dubious diagnoses and treatments floating around the internet, these patterns of LLM behavior could contribute to the spread of medical misinformation, particularly if people see LLMs as trustworthy.

    OpenAI has reported that the GPT-5 series of models is markedly less sycophantic and prone to hallucination than their predecessors, so the results of these studies might not apply to ChatGPT Health. The company also evaluated the model that powers ChatGPT Health on its responses to health-specific questions, using their publicly available HeathBench benchmark. HealthBench rewards models that express uncertainty when appropriate, recommend that users seek medical attention when necessary, and refrain from causing users unnecessary stress by telling them their condition is more serious that it truly is. It’s reasonable to assume that the model underlying ChatGPT Health exhibited those behaviors in testing, though Bitterman notes that some of the prompts in HealthBench were generated by LLMs, not users, which could limit how well the benchmark translates into the real world.

    An LLM that avoids alarmism seems like a clear improvement over systems that have people convincing themselves they have cancer after a few minutes of browsing. And as large language models, and the products built around them, continue to develop, whatever advantage Dr. ChatGPT has over Dr. Google will likely grow. The introduction of ChatGPT Health is certainly a move in that direction: By looking through your medical records, ChatGPT can potentially gain far more context about your specific health situation than could be included in any Google search, although numerous experts have cautioned against giving ChatGPT that access for privacy reasons.

    Even if ChatGPT Health and other new tools do represent a meaningful improvement over Google searches, they could still conceivably have a negative effect on health overall. Much as automated vehicles, even if they are safer than human-driven cars, might still prove a net negative if they encourage people to use public transit less, LLMs could undermine users’ health if they induce people to rely on the internet instead of human doctors, even if they do increase the quality of health information available online.

    Lederman says that this outcome is plausible. In her research, she has found that members of online communities centered on health tend to put their trust in users who express themselves well, regardless of the validity of the information they are sharing. Because ChatGPT communicates like an articulate person, some people might trust it too much, potentially to the exclusion of their doctor. But LLMs are certainly no replacement for a human doctor—at least not yet.

    What’s next for AlphaFold: A conversation with a Google DeepMind Nobel laureate

    <div data-chronoton-summary="

    • Nobel-winning protein prediction AlphaFold creator John Jumper reflects on five years since the AI system revolutionized protein structure prediction. The DeepMind tool can determine protein shapes to atomic precision in hours instead of months.
    • Unexpected applications emerge Scientists have found creative “off-label” uses for AlphaFold, from studying honeybee disease resistance to accelerating synthetic protein design. Some researchers even use it as a search engine, testing thousands of potential protein interactions to find matches that would be impractical to verify in labs.
    • Future fusion with language models Jumper, at 39 the youngest chemistry Nobel laureate in 75 years, now aims to combine AlphaFold’s specialized capabilities with the broad reasoning of large language models. “I’ll be shocked if we don’t see more and more LLM impact on science,” he says, while avoiding the pressure of another Nobel-worthy breakthrough.

    ” data-chronoton-post-id=”1128322″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

    In 2017, fresh off a PhD on theoretical chemistry, John Jumper heard rumors that Google DeepMind had moved on from building AI that played games with superhuman skill and was starting up a secret project to predict the structures of proteins. He applied for a job.

    Just three years later, Jumper celebrated a stunning win that few had seen coming. With CEO Demis Hassabis, he had co-led the development of an AI system called AlphaFold 2 that was able to predict the structures of proteins to within the width of an atom, matching the accuracy of painstaking techniques used in the lab, and doing it many times faster—returning results in hours instead of months.

    AlphaFold 2 had cracked a 50-year-old grand challenge in biology. “This is the reason I started DeepMind,” Hassabis told me a few years ago. “In fact, it’s why I’ve worked my whole career in AI.” In 2024, Jumper and Hassabis shared a Nobel Prize in chemistry.

    It was five years ago this week that AlphaFold 2’s debut took scientists by surprise. Now that the hype has died down, what impact has AlphaFold really had? How are scientists using it? And what’s next? I talked to Jumper (as well as a few other scientists) to find out.

    “It’s been an extraordinary five years,” Jumper says, laughing: “It’s hard to remember a time before I knew tremendous numbers of journalists.”

    AlphaFold 2 was followed by AlphaFold Multimer, which could predict structures that contained more than one protein, and then AlphaFold 3, the fastest version yet. Google DeepMind also let AlphaFold loose on UniProt, a vast protein database used and updated by millions of researchers around the world. It has now predicted the structures of some 200 million proteins, almost all that are known to science.

    Despite his success, Jumper remains modest about AlphaFold’s achievements. “That doesn’t mean that we’re certain of everything in there,” he says. “It’s a database of predictions, and it comes with all the caveats of predictions.”

    A hard problem

    Proteins are the biological machines that make living things work. They form muscles, horns, and feathers; they carry oxygen around the body and ferry messages between cells; they fire neurons, digest food, power the immune system; and so much more. But understanding exactly what a protein does (and what role it might play in various diseases or treatments) involves figuring out its structure—and that’s hard.

    Proteins are made from strings of amino acids that chemical forces twist up into complex knots. An untwisted string gives few clues about the structure it will form. In theory, most proteins could take on an astronomical number of possible shapes. The task is to predict the correct one.

    Jumper and his team built AlphaFold 2 using a type of neural network called a transformer, the same technology that underpins large language models. Transformers are very good at paying attention to specific parts of a larger puzzle.

    But Jumper puts a lot of the success down to making a prototype model that they could test quickly. “We got a system that would give wrong answers at incredible speed,” he says. “That made it easy to start becoming very adventurous with the ideas you try.”

    They stuffed the neural network with as much information about protein structures as they could, such as how proteins across certain species have evolved similar shapes. And it worked even better than they expected. “We were sure we had made a breakthrough,” says Jumper. “We were sure that this was an incredible advance in ideas.”

    What he hadn’t foreseen was that researchers would download his software and start using it straight away for so many different things. Normally, it’s the thing a few iterations down the line that has the real impact, once the kinks have been ironed out, he says: “I’ve been shocked at how responsibly scientists have used it, in terms of interpreting it, and using it in practice about as much as it should be trusted in my view, neither too much nor too little.”

    Any projects stand out in particular? 

    Honeybee science

    Jumper brings up a research group that uses AlphaFold to study disease resistance in honeybees. “They wanted to understand this particular protein as they look at things like colony collapse,” he says. “I never would have said, ‘You know, of course AlphaFold will be used for honeybee science.’”

    He also highlights a few examples of what he calls off-label uses of AlphaFold“in the sense that it wasn’t guaranteed to work”—where the ability to predict protein structures has opened up new research techniques. “The first is very obviously the advances in protein design,” he says. “David Baker and others have absolutely run with this technology.”

    Baker, a computational biologist at the University of Washington, was a co-winner of last year’s chemistry Nobel, alongside Jumper and Hassabis, for his work on creating synthetic proteins to perform specific tasks—such as treating disease or breaking down plastics—better than natural proteins can.

    Baker and his colleagues have developed their own tool based on AlphaFold, called RoseTTAFold. But they have also experimented with AlphaFold Multimer to predict which of their designs for potential synthetic proteins will work.    

    “Basically, if AlphaFold confidently agrees with the structure you were trying to design [and] then you make it and if AlphaFold says ‘I don’t know,’ you don’t make it. That alone was an enormous improvement.” It can make the design process 10 times faster, says Jumper.

    Another off-label use that Jumper highlights: Turning AlphaFold into a kind of search engine. He mentions two separate research groups that were trying to understand exactly how human sperm cells hooked up with eggs during fertilization. They knew one of the proteins involved but not the other, he says: “And so they took a known egg protein and ran all 2,000 human sperm surface proteins, and they found one that AlphaFold was very sure stuck against the egg.” They were then able to confirm this in the lab.

    “This notion that you can use AlphaFold to do something you couldn’t do before—you would never do 2,000 structures looking for one answer,” he says. “This kind of thing I think is really extraordinary.”

    Five years on

    When AlphaFold 2 came out, I asked a handful of early adopters what they made of it. Reviews were good, but the technology was too new to know for sure what long-term impact it might have. I caught up with one of those people to hear his thoughts five years on.

    Kliment Verba is a molecular biologist who runs a lab at the University of California, San Francisco. “It’s an incredibly useful technology, there’s no question about it,” he tells me. “We use it every day, all the time.”

    But it’s far from perfect. A lot of scientists use AlphaFold to study pathogens or to develop drugs. This involves looking at interactions between multiple proteins or between proteins and even smaller molecules in the body. But AlphaFold is known to be less accurate at making predictions about multiple proteins or their interaction over time.

    Verba says he and his colleagues have been using AlphaFold long enough to get used to its limitations. “There are many cases where you get a prediction and you have to kind of scratch your head,” he says. “Is this real or is this not? It’s not entirely clear—it’s sort of borderline.”

    “It’s sort of the same thing as ChatGPT,” he adds. “You know—it will bullshit you with the same confidence as it would give a true answer.”

    Still, Verba’s team uses AlphaFold (both 2 and 3, because they have different strengths, he says) to run virtual versions of their experiments before running them in the lab. Using AlphaFold’s results, they can narrow down the focus of an experiment—or decide that it’s not worth doing.

    It can really save time, he says: “It hasn’t really replaced any experiments, but it’s augmented them quite a bit.”

    New wave  

    AlphaFold was designed to be used for a range of purposes. Now multiple startups and university labs are building on its success to develop a new wave of tools more tailored to drug discovery. This year, a collaboration between MIT researchers and the AI drug company Recursion produced a model called Boltz-2, which predicts not only the structure of proteins but also how well potential drug molecules will bind to their target.  

    Last month, the startup Genesis Molecular AI released another structure prediction model called Pearl, which the firm claims is more accurate than AlphaFold 3 for certain queries that are important for drug development. Pearl is interactive, so that drug developers can feed any additional data they may have to the model to guide its predictions.

    AlphaFold was a major leap, but there’s more to do, says Evan Feinberg, Genesis Molecular AI’s CEO: “We’re still fundamentally innovating, just with a better starting point than before.”

    Genesis Molecular AI is pushing margins of error down from less than two angstroms, the de facto industry standard set by AlphaFold, to less than one angstrom—one 10-millionth of a millimeter, or the width of a single hydrogen atom.

    “Small errors can be catastrophic for predicting how well a drug will actually bind to its target,” says Michael LeVine, vice president of modeling and simulation at the firm. That’s because chemical forces that interact at one angstrom can stop doing so at two. “It can go from ‘They will never interact’ to ‘They will,’” he says.

    With so much activity in this space, how soon should we expect new types of drugs to hit the market? Jumper is pragmatic. Protein structure prediction is just one step of many, he says: “This was not the only problem in biology. It’s not like we were one protein structure away from curing any diseases.”

    Think of it this way, he says. Finding a protein’s structure might previously have cost $100,000 in the lab: “If we were only a hundred thousand dollars away from doing a thing, it would already be done.”

    At the same time, researchers are looking for ways to do as much as they can with this technology, says Jumper: “We’re trying to figure out how to make structure prediction an even bigger part of the problem, because we have a nice big hammer to hit it with.”

    In other words, they want to make everything into nails? “Yeah, let’s make things into nails,” he says. “How do we make this thing that we made a million times faster a bigger part of our process?”

    What’s next?

    Jumper’s next act? He wants to fuse the deep but narrow power of AlphaFold with the broad sweep of LLMs.  

    “We have machines that can read science. They can do some scientific reasoning,” he says. “And we can build amazing, superhuman systems for protein structure prediction. How do you get these two technologies to work together?”

    That makes me think of a system called AlphaEvolve, which is being built by another team at Google DeepMind. AlphaEvolve uses an LLM to generate possible solutions to a problem and a second model to check them, filtering out the trash. Researchers have already used AlphaEvolve to make a handful of practical discoveries in math and computer science.    

    Is that what Jumper has in mind? “I won’t say too much on methods, but I’ll be shocked if we don’t see more and more LLM impact on science,” he says. “I think that’s the exciting open question that I’ll say almost nothing about. This is all speculation, of course.”

    Jumper was 39 when he won his Nobel Prize. What’s next for him?

    “It worries me,” he says. “I believe I’m the youngest chemistry laureate in 75 years.” 

    He adds: “I’m at the midpoint of my career, roughly. I guess my approach to this is to try to do smaller things, little ideas that you keep pulling on. The next thing I announce doesn’t have to be, you know, my second shot at a Nobel. I think that’s the trap.”

    Three things to know about the future of electricity

    <div data-chronoton-summary="

    • Electricity demand is surging globally. Global electricity demand will grow 40% over the next decade. Data center investment hit $580 billion in 2025 alone—surpassing global oil spending. In the US, data centers will account for half of all electricity growth through 2030.
    • Air-conditioning and emerging economies are reshaping energy consumption. Rising temperatures and growing prosperity in developing nations will add over 500 gigawatts of peak demand by 2035, dwarfing data centers’ contribution to overall electricity growth.
    • Renewables are finally overtaking coal, but the transition remains too slow. Solar and wind led electricity generation in the first half of 2025 with nuclear capacity poised to increase by a third this decade. Yet global emissions are likely to hit record highs again this year.

    ” data-chronoton-post-id=”1128167″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

    One of the dominant storylines I’ve been following through 2025 is electricity—where and how demand is going up, how much it costs, and how this all intersects with that topic everyone is talking about: AI.

    Last week, the International Energy Agency released the latest version of the World Energy Outlook, the annual report that takes stock of the current state of global energy and looks toward the future. It contains some interesting insights and a few surprising figures about electricity, grids, and the state of climate change. So let’s dig into some numbers, shall we?

    We’re in the age of electricity

    Energy demand in general is going up around the world as populations increase and economies grow. But electricity is the star of the show, with demand projected to grow by 40% in the next 10 years.

    China has accounted for the bulk of electricity growth for the past 10 years, and that’s going to continue. But emerging economies outside China will be a much bigger piece of the pie going forward. And while advanced economies, including the US and Europe, have seen flat demand in the past decade, the rise of AI and data centers will cause demand to climb there as well.

    Air-conditioning is a major source of rising demand. Growing economies will give more people access to air-conditioning; income-driven AC growth will add about 330 gigawatts to global peak demand by 2035. Rising temperatures will tack on another 170 GW in that time. Together, that’s an increase of over 10% from 2024 levels.  

    AI is a local story

    This year, AI has been the story that none of us can get away from. One number that jumped out at me from this report: In 2025, investment in data centers is expected to top $580 billion. That’s more than the $540 billion spent on the global oil supply. 

    It’s no wonder, then, that the energy demands of AI are in the spotlight. One key takeaway is that these demands are vastly different in different parts of the world.

    Data centers still make up less than 10% of the projected increase in total electricity demand between now and 2035. It’s not nothing, but it’s far outweighed by sectors like industry and appliances, including air conditioners. Even electric vehicles will add more demand to the grid than data centers.

    But AI will be the factor for the grid in some parts of the world. In the US, data centers will account for half the growth in total electricity demand between now and 2030.

    And as we’ve covered in this newsletter before, data centers present a unique challenge, because they tend to be clustered together, so the demand tends to be concentrated around specific communities and on specific grids. Half the data center capacity that’s in the pipeline is close to large cities.

    Look out for a coal crossover

    As we ask more from our grid, the key factor that’s going to determine what all this means for climate change is what’s supplying the electricity we’re using.

    As it stands, the world’s grids still primarily run on fossil fuels, so every bit of electricity growth comes with planet-warming greenhouse-gas emissions attached. That’s slowly changing, though.

    Together, solar and wind were the leading source of electricity in the first half of this year, overtaking coal for the first time. Coal use could peak and begin to fall by the end of this decade.

    Nuclear could play a role in replacing fossil fuels: After two decades of stagnation, the global nuclear fleet could increase by a third in the next 10 years. Solar is set to continue its meteoric rise, too. Of all the electricity demand growth we’re expecting in the next decade, 80% is in places with high-quality solar irradiation—meaning they’re good spots for solar power.

    Ultimately, there are a lot of ways in which the world is moving in the right direction on energy. But we’re far from moving fast enough. Global emissions are, once again, going to hit a record high this year. To limit warming and prevent the worst effects of climate change, we need to remake our energy system, including electricity, and we need to do it faster. 

    This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.