A plan to make drugs in orbit is going commercial

<div data-chronoton-summary="

  • A big deal: Varda Space Industries says it has signed a pharmaceutical company as a commercial customer, marking what could be a landmark moment for in-orbit manufacturing.
  • Space as a lab: The bet is that microgravity causes drug molecules to crystallize into atomic arrangements impossible on Earth, potentially unlocking new versions of existing medicines.
  • Economics favor drugs: At $7,000 per kilogram to reach orbit, space manufacturing is impractical for most industries — but blockbuster drugs can be worth over $100 million per kilogram, making them a rare exception to the brutal math of rocket launches.
  • Still more experiment than factory: Despite the excitement, no product has ever been manufactured in space, brought back, and sold on Earth.

” data-chronoton-post-id=”1137153″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

Varda Space Industries, a startup that’s been pitching its ability to perform drug experiments in space, says it has signed up the pharmaceutical company United Therapeutics in what may be remembered as a notable step toward in-orbit manufacturing.

The idea of building things in outer space for use on Earth has so far been explored mostly on board the International Space Station, and only in small-scale experiments backed by governments.

But Varda, based in El Segundo, California, is now telling drug companies it has a practical, and repeatable, way to produce novel molecules in microgravity. 

“This is the first commercial path to products made in space,” says Michael Reilly, Varda’s chief strategy officer.

The scientific idea is that chemical mixtures have different properties under weightless conditions. For instance, water will hang together in a wiggly sphere, since without gravity, surface tension is the strongest force present.

The plan is to launch versions of United Therapeutics’ drugs into orbit, where they can be allowed to form solid crystals. The hope is that in microgravity, they’ll take on atomic arrangements not seen on Earth, possibly leading to new versions with improved stability or other valuable properties.

United is led by CEO Martine Rothblatt, who worked on early telecommunications satellites. Since then, she’s built a multibillion-dollar health franchise with a succession of drugs to treat a lung disease called pulmonary arterial hypertension, which her daughter suffers from, and a subsidiary developing genetically modified pigs as a source of organs for transplantation.

Rothblatt says space could be the next step if orbital conditions permit United to identify “even more amazing” versions of its drugs.

Space to reformulate

Pharmaceutical companies often try to keep their blockbuster franchises alive by creating improved versions of drugs or reformulating them—for example, making the switch from a pill to an inhaled version, as United has done with some of its products. Doing so can keep imitators at bay and create extra decades of patent protection.

Assisting drugmakers are specialist companies, such as Halozyme and MannKind, that earn profits by helping to reformulate other companies’ drugs, often taking a royalty on future sales.

That’s the business Varda has been trying to break into—by using excursions into space instead of nebulizers, patches, or nanoparticles. The company was formed in 2021 by Delian Asparouhov, a partner at Peter Thiel’s Founders Fund, along with Will Bruey, a former avionics engineer with Elon Musk’s SpaceX who is now Varda’s CEO.

The pair’s bet is that space manufacturing will become viable once rocket launches become frequent enough—and cheap enough—to support a business model in which raw materials are sent into orbit, processed, and then returned to Earth in a new form.

And that’s starting to happen. To get into space, Varda has been purchasing rides from SpaceX—which now launches a rocket every two or three days, usually a reusable Falcon 9. 

Those rockets have a nose cone, or payload fairing, about the size of a moving truck that gets filled with satellites or instruments, which are then released into orbit.

Starting in 2023, Varda began sending up small satellites that have a boulder-size capsule attached. The capsule contains equipment to carry out experiments, and it can detach and fall back to Earth, entering the atmosphere at a speed of around Mach 25 before slowing via air resistance and eventually drifting to land with a parachute. (Varda lands its craft in the Australian outback.)

That speedy reentry has also drawn interest from the US military, including the Air Force, which has paid Varda to fly instruments and take measurements relevant to hypersonic missile technology. Of the six craft Varda has paid to put into orbit so far, half have been dedicated to military research and half carried drug-related demonstrations. 

At Varda, such “dual use” of technology is accepted as part of being in the space business, which remains reliant on government support. The company’s founders say Varda may be the only company that employs hypersonic engineers and pharmaceutical chemists under the same roof.

At Varda’s headquarters, drug samples are loaded into a spinning arm that creates extra-high g-forces. While that’s the opposite of microgravity, increased weight can provide clues into whether a drug will act differently under new conditions.
COURTESY VARDA

Launching industries

Actual space manufacturing still remains mostly an aspirational project. In 2021, Jeff Bezos, after his first trip aloft in a rocket, suggested that polluting industries should be moved beyond the atmosphere. “We need to take all heavy industry, all polluting industry, and move it into space. And keep Earth as this beautiful gem of a planet that it is,” he told MSNBC.

Weight is the big obstacle to such dreams. It still costs around $7,000 to launch a single kilogram of payload into orbit, which makes it impractical to, say, send cotton into space to be dyed there, or even to launch the acids and solvents needed to make a semiconductor chip.

But drugs may be among the few exceptions to this economic rule, since pound for pound, they can be as valuable as rare radioactive isotopes and fine-cut diamonds.

For instance, just one kilogram of the weight-loss drug Ozempic is worth more than $100 million at retail. (The reason your Ozempic bill is only $1,000 a month is that minute quantities of the active ingredient are present in the shots.)

That’s why Varda thinks it may eventually be able to manufacture drugs in orbit. However, its effort with United is more of a flying experiment to learn whether the company’s lung medicines will crystallize differently in microgravity.  

The terms of the deal between Varda and United aren’t public, and the companies haven’t said which specific drugs the collaboration will study. But Rothblatt did confirm that United is paying Varda to help it identify new crystal forms of its drugs (also called polymorphs), which it hopes could have improved properties.

“One has to do the experiment to find out if that is so. The first part of the experiment is to see what polymorphs of these molecules can be made without the influence of gravity,” she says. “Then, once we have those polymorphs, we will test them.” 

There is good evidence that crystals form differently in space. For instance, in 2017 the pharmaceutical giant Merck sent samples of its cancer immunotherapy drug Keytruda to the International Space Station, where it was found to form crystals of a single size. On Earth, the drug tended to form two different sizes at once.

That experiment offered clues for how to formulate the drug as a shot instead of administering it intravenously. Still, when Merck introduced a Keytruda injection last year, it ended up using a different approach. That means there’s still no straight-line connection between orbital discoveries and any drug here on Earth. Actual space factories are another step further from reality. 

“We’ve been learning from space for years, but I can’t name anything manufactured in space, brought down to Earth, and sold,” says Reilly. “So that is a first—or it will be a first.”

Reilly says that Varda anticipates launching United Therapeutics’ drugs into orbit sometime early next year. 

What’s next for IVF

<div data-chronoton-summary="

  • Helping embryos stick: Even healthy-looking embryos only implant 40–60% of the time. Researchers in Spain are trialing a device that physically injects embryos directly into the uterine lining at the press of a button.
  • AI and robots are taking over the lab: Automated systems can now select sperm, fertilize eggs, and culture embryos without human hands. At least 19 children have already been born through fully automated IVF.
  • Genetic testing is getting complicated: Standard embryo screening helps reduce miscarriage, but newer tests claiming to predict IQ or height are gaining ground in the US—and making many fertility doctors deeply uncomfortable.
  • Gene editing is quietly creeping back: Years after He Jiankui went to prison for editing human embryos, startups are revisiting CRISPR as a way to prevent serious inherited disease—raising hopes, and familiar fears about a slippery slope.

” data-chronoton-post-id=”1136946″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

Forty-eight years ago this July, Louise Joy Brown became the world’s first person born with the help of in vitro fertilization. Millions more IVF babies have entered the world since then. And that’s partly thanks to advances in technology that have made IVF safer and more effective.

But it’s still not perfect. The process can be slow, painful, and expensive—and that’s for the lucky people who are able to access it in the first place. And by at least one measure, IVF success rates have been declining in recent years.

Reproduction is complex, and there’s a lot that embryologists and gynecologists still don’t know and can’t control. They don’t know why many healthy-looking embryos don’t “stick” in the uterus, for example. They don’t always have an explanation for why their patients can’t get pregnant. And they can’t always account for vast differences in IVF success rates between individuals and between fertility clinics.

Scientists are working on all those questions and more. They’re wrestling with complex ethical questions about how new genetic tools will be used to analyze or even alter embryos. Meanwhile, technologies designed to standardize treatment, eliminate human error, boost success rates, and make IVF more accessible are already beginning to usher in a new era for assisted reproduction—one aided by AI and robots.

1. Helping embryos stick

Some of those technologies are being developed at the Carlos Simon Foundation in Valencia, Spain. When I visited in March, researchers gave me a tour of the labs and showed me a device that had been used to keep a human uterus alive outside the body for the first time.

While some members of the team dream of building artificial uteruses that might one day be able to carry a fetus to term, they first want to use such devices to learn more about implantation—the moment at which a fertilized egg makes contact with the lining of the uterus, burrows inside, and essentially “hatches,” triggering the start of a pregnancy.

Despite decades of advances in IVF, that process is still poorly understood. Even healthy-looking embryos stick no more than 40% to 60% of the time.

In IVF techniques used today, clinics can create early-stage embryos and wait until the uterus is deemed most receptive, but once they insert the embryo into the uterus, it’s on its own. Xavier Santamaria, senior clinical scientist at the Carlos Simon Foundation, and his colleagues are trialing a different approach. They’ve developed a device that, at the press of a button, injects the embryo into the uterine lining.

Scientists in Valencia showcase Transfer Direct.

JESS HAMZELOU / MITTR

In a demonstration I watched with a prototype, Santamaria picked up his speculum and turned to face the vaginal opening of his “patient,” which in this case was just a model of the real thing—a plastic bottom with labia, a vagina, a uterus, and ovaries, two short stumps representing what would normally be a pair of legs held in stirrups.

He hunched over and peered inside. “Embryo,” he called. His colleague Maria Pardo, an embryologist, passed him a thin needle containing a mouse embryo she had recently collected from a petri dish.

Santamaria’s device allows for the embryo-containing needle to be connected to a delivery tube. This tube also has a camera, a light, and a sensor that lets the doctor know when the needle reaches the uterine lining. Once it has been fed into the uterus, the gynecologist can see the inside of the organ and direct the tube to the lining.

Scientists in Valencia showcase Transfer Direct.

JESS HAMZELOU / MITTR

“When everything is ready, you just press the button,” Santamaria said as he activated it using a foot pedal, allowing the embryo to be injected. “There it goes.”

The team has just started a trial of the device; so far, fewer than 10 women have undergone the procedure, and none of those have become pregnant. But foundation director Carlos Simon is hopeful, noting that the inventors of IVF had to perform over 160 cycles before Louise Brown was born (between 1969 and 1978, that team performed 457 cycles in 250 people, resulting in only two live births). “The trial is ongoing,” he says.

2. Picking the “best” eggs, sperm, and embryos

One long-running challenge of IVF has been selection. Say you manage to collect 10 eggs from one partner and a decent-looking semen sample from the other. How do you choose which cells to use? The same question comes up once the resulting embryos have been cultured in a dish for a few days: Which should you transfer to the uterus?

Traditionally, these judgments have been made by eye. Embryologists literally pick the ones that look the best in terms of their shape or, in the case of sperm, how they move. But scientists have been working on alternatives. And over the last decade or so, many have turned to genetic testing to hint at which embryos have the best chances of creating a healthy baby.

The most commonly used test is called PGT-A, which stands for preimplantation genetic testing for aneuploidy. Aneuploidy essentially means having an “incorrect” number of chromosomes, and it is thought that embryos with such characteristics are more likely to be lost through miscarriage or potentially develop into babies with genetic conditions.

Once embryologists have created embryos in the lab, they can pinch off a few cells and test them for aneuploidies. The tests are especially beneficial for women over the age of 38, says Alan Penzias, a reproductive endocrinologist at Boston IVF. “You start to see an improvement: more babies and fewer miscarriages,” he says. The tests can shorten the time to pregnancy.

This type of genetic testing is possible thanks to multiple advances in technology—not just in genomics, but also in the ability to keep embryos alive in a dish for five to six days and the technique of freezing embryos while the cells undergo testing and thawing them once the results are in. And it has become hugely popular—some clinics do PGT-A tests on all their embryos.

But PGT-A won’t give you a perfect readout of a future baby’s genetics, says Sonia Gayete-Lafuente, a reproductive endocrinologist at the Center for Human Reproduction in New York City. And some of the abnormalities might be able to self-correct with time. Gayete-Lafuente and her colleagues have transferred some of those “abnormal” embryos into patients’ uteruses and seen them develop into perfectly healthy children, she says.

Other forms of PGT are even more controversial. PGT-P tests are designed to predict an embryo’s chances of developing complex traits that rely on multiple genes, including medical disorders but also physical characteristics like height or cognitive factors like IQ. These tests are new, and they are illegal in some countries, including the UK. But they are gaining ground in the US. Nucleus Genomics—a company that invites customers to “have [their] best baby”—promises to predict traits running the gamut from eye color and intelligence to left-handedness and risk of Alzheimer’s.

When I asked IVF practitioners how they might respond if a patient asked for this service, most dodged the question and told me there’s not enough evidence that any of these tests actually work. They also cautioned that selecting for one trait might inadvertently introduce new risks. None seemed especially keen on the idea of using genetic testing for anything other than preventing serious disease.

3. Speeding things up with AI

Some seemed more excited about the potential for AI. After all, AI tools are generally good at recognizing patterns. Many researchers have attempted to train tools to spot healthy sperm, eggs, and embryos.

And they’ve had some success. A team at Columbia University Medical Center in New York has developed a device that uses AI to examine semen samples from men who have only tiny numbers of healthy sperm. An embryologist might struggle to find a single healthy sperm in such a sample. But the Sperm Tracking and Recovery (STAR) system can analyze over a million microscope images in an hour. It has already been used to create healthy embryos. The team behind the work announced the first pregnancy resulting from the treatment in November last year.

Other teams are using AI tools to advance IVF in more dramatic ways. Around a decade ago, a reproductive endocrinologist named Alejandro Chavez-Badiola began developing an AI tool trained to rank embryos, another to rank eggs, and another to select sperm. He recalls being struck by a realization that these tools were “the brains that have the potential to drive robots in the future,” he says.

4. Using robots to standardize IVF

In the early 2020s, Chavez-Badiola and his colleagues decided to combine technologies and develop an automated system for IVF. In theory, a robotic system loaded up with AI tools could undertake most of the steps required in the IVF process: selecting the eggs and sperm, fertilizing eggs to create embryos, culturing those embryos in a dish, and selecting the “best” one for transfer. Such a system could “do everything in a standard way” without ever getting tired, he says.

Chavez-Badiola, who is now founder and chief medical officer at Conceivable, started building prototypes by motorizing regular IVF equipment and connecting it to computers. He and his colleagues started testing their system with animal cells before eventually moving on to human ones. “We were able to prove that integrating robots to automate different steps in IVF is doable,” he says.

The device is now being used to prepare sperm and eggs and create embryos. At least 19 children have been born following the automated IVF. It is early days, but Chavez-Badiola is hoping that future iterations of the machine could each process thousands of IVF cycles in a year, potentially making the procedure more affordable and accessible.

Many in the field are excited about the potential for automated devices like Conceivable’s. “This is all time saved for the embryologists,” says Laura Rienzi, a clinical embryologist and scientific director of the IVIRMA network of fertility centers in Italy. She also hopes it will help standardize IVF treatments. “Automation [will allow for] every patient to be treated in the same way in every single lab in the world,” she says.

5. Controversial edits are on the table

There’s a catch, however: All these technologies rely on the availability of at least some healthy sperm, eggs, and embryos at the outset. Embryologists and IVF patients have to work with what they’ve got. And sometimes, what they’ve got won’t result in a healthy baby. 

That’s why some scientists are proposing a controversial idea: using gene-editing technologies like CRISPR to tinker with the genome of an IVF embryo before it is implanted. The biophysicist He Jiankui infamously took this approach to create embryos that resulted in the births of three children in the late 2010s. He was widely condemned by the scientific community and ultimately spent three years in a Chinese prison

His former romantic partner Cathy Tie, who now leads startup Origin Genomics, is pursuing the technology as a potential way to prevent serious disease in children. At a recent event held at the Hastings Center for Bioethics, Tie made the case for using embryo editing to prevent diseases like cystic fibrosis, Huntington’s, and sickle-cell.

It won’t be straightforward from a technical, legal, or ethical perspective. Diseases that are known to be caused by single-gene mutations are good first candidates, but as the Center for Human Reproduction’s Gayete-Lafuente points out, most diseases are much more complicated than that. “I wish we could understand the genetic basis of every disease to be able to prevent it,” she says. So far, we can’t. Besides, most diseases can be influenced by our diets, behaviors, and environments as well as our genes.

As things stand, no one knows if editing a human embryo to eliminate the risk of one disease might increase a future child’s risk of some other disorder. And some scientists worry that such edits might be a slippery slope to genetic enhancement or eugenics.

Rienzi hopes that the technology might be developed in a safe way with regulatory oversight, and only for a specific list of diseases. “It has to be within a legal context,” she says. “But to me, it’s a dream.”

In the meantime, the field looks set to keep transforming with the development of new technologies that are already creating healthy babies. Watch this space. 

Want to understand the current state of AI? Check out these charts.

<div data-chronoton-summary="

  • The US-China AI race is closer than you think: Chinese models from DeepSeek and Alibaba now trail American ones by razor-thin margins. Meanwhile, the US has more data centers and capital, while China leads in research publications and robotics.
  • AI benchmarks are badly broken: One popular math benchmark has a 42% error rate, and models can game tests by training on the answers. Strong test scores increasingly fail to predict how AI actually performs in the real world.
  • Jobs and anxiety are both rising: Software developer employment for workers aged 22–25 has dropped nearly 20% since 2022, with AI likely a factor. Globally, 59% of people think AI will do more good than harm—but 52% say it still makes them nervous.
  • Regulation is losing the race: The EU banned predictive policing AI, and US states passed a record 150 AI-related bills, but experts say lawmakers don’t yet understand the technology well enough to govern it effectively.

” data-chronoton-post-id=”1135675″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

If you’re following AI news, you’re probably getting whiplash. AI is a gold rush. AI is a bubble. AI is taking your job. AI can’t even read a clock. The 2026 AI Index from Stanford University’s Institute for Human-Centered Artificial Intelligence, AI’s annual report card, comes out today and cuts through some of that noise. 

Despite predictions that AI development may hit a wall, the report says that the top models just keep getting better. People are adopting AI faster than they picked up the personal computer or the internet. AI companies are generating revenue faster than companies in any previous technology boom, but they’re also spending hundreds of billions of dollars on data centers and chips. The benchmarks designed to measure AI, the policies meant to govern it, and the job market are struggling to keep up. AI is sprinting, and the rest of us are trying to find our shoes.

All that speed comes at a cost. AI data centers around the world can now draw 29.6 gigawatts of power, enough to run the entire state of New York at peak demand. Annual water use from running OpenAI’s GPT-4o alone may exceed the drinking water needs of 12 million people. At the same time, the supply chain for chips is alarmingly fragile. The US hosts most of the world’s AI data centers, and one company in Taiwan, TSMC, fabricates almost every leading AI chip. 

The data reveals a technology evolving faster than we can manage. Here’s a look at some of the key points from this year’s report. 

The US and China are nearly tied

In a long, heated race with immense geopolitical stakes, the US and China are almost neck and neck on AI model performance, according to Arena, a community-driven ranking platform that allows users to compare the outputs of large language models on identical prompts. In early 2023, OpenAI had a lead with ChatGPT, but this gap narrowed in 2024 as Google and Anthropic released their own models. In February 2025, R1, an AI model built by the Chinese lab DeepSeek, briefly matched the top US model, ChatGPT. As of March 2026, Anthropic leads, trailed closely by xAI, Google, and OpenAI. Chinese models like DeepSeek and Alibaba lag only modestly. With the best AI models separated in the rankings by razor-thin margins, they’re now competing on cost, reliability, and real-world usefulness. 

Chart of the performance of top models on the Arena by select providers, showing the Arena score from May 2023 to Jan 2026 with the models all trending upward.  The scores are tightly packed by US based Anthropic, xAI, Google and OpenAI lead Alibaba, DeepSeek and Mistral (in that order.) Meta trails the pack.

The index notes that the US and China have different AI advantages. While the US has more powerful AI models, more capital, and an estimated 5,427 data centers (more than 10 times as many as any other country), China leads in AI research publications, patents, and robotics. 

As competition intensifies, companies like OpenAI, Anthropic, and Google no longer disclose their training code, parameter counts, or data-set sizes. “We don’t know a lot of things about predicting model behaviors,” says Yolanda Gil, a computer scientist at the University of Southern California who coauthored the report. This lack of transparency makes it difficult for independent researchers to study how to make AI models safer, she says.

AI models are advancing super fast

Despite predictions that development will plateau, AI models keep getting better and better. By some measures, they now meet or exceed the performance of human experts on tests that aim to measure PhD-level science, math, and language understanding. SWE-bench Verified, a software engineering benchmark for AI models, saw top scores jump from around 60% in 2024 to almost 100% in 2025. In 2025, an AI system produced a weather forecast on its own.  

“I am stunned that this technology continues to improve, and it’s just not plateauing in any way,” says Gil.

line chart of Select AI Index technical performance benchmarks vs human performance, showing that skills such as image classification, English language understanding, multitask language understanding, visual reasoning, medium level reading comprehension, multimodal understanding and reasoning have surpassed the human baseline at or before 2025, with autonomous software engineering, mathmatical reasoning and agent multimodal computer use trending towards meeting the human baseline by 2026.

However, AI still struggles in plenty of other areas. Because the models learn by processing enormous amounts of text and images rather than by experiencing the physical world, AI exhibits “jagged intelligence.” Robots are still in their early days and succeed in only 12% of household tasks. Self-driving cars are farther along: Waymos are now roaming across five US cities, and Baidu’s Apollo Go vehicles are shuttling riders around in China. AI is also expanding into professional domains like law and finance, but no model dominates the field yet. 

But the way we test AI is broken

These reports of progress should be taken with a grain of salt. The benchmarks designed to track AI progress are struggling to keep up as models quickly blow past their ceilings, the Stanford report says. Some are poorly constructed—a popular benchmark that tests a model’s math abilities has a 42% error rate. Others can be gamed: when models are trained on benchmark test data, for example, they can learn to score well without getting smarter. 

Because AI is rarely used the same way it’s tested, strong benchmark performance doesn’t always translate to real-world performance. And for complex, interactive technologies such as AI agents and robots, benchmarks barely exist yet. 

AI companies are also sharing less about how their models are trained, and independent testing sometimes tells a different story from what they report. “A lot of companies are not releasing how their models do in certain benchmarks, particularly the responsible-AI benchmarks,” says Gil. “The absence of how your model is doing on a benchmark maybe says something.” 

AI is starting to affect jobs

Within three years of going mainstream, AI is now used by more than half of people around the world, a rate of adoption faster than the personal computer or the internet. An estimated 88% of organizations now use AI, and four in five university students use it. 

It’s early days for deployment, and AI’s impact on jobs is hard to measure. Still, some studies suggest AI is beginning to affect young workers in certain professions. According to a 2025 study by economists at Stanford, employment for software developers aged 22 to 25 has fallen nearly 20% since 2022. The decline might not be pinned on AI alone, as broader macroeconomic conditions could be to blame, but AI appears to be playing a part.

two line charts showing the normalized headcount trends by age group from 2021 through 2025. On the left for software developers the early career (age 22-25) cohort drops rapidly after a peak in September 2022, with other ages still rising albeit less steeply.  On the right, customer support agents see a similar trend, although the decline for the early career group is less steep than for software developers.

Employers say that hiring may continue to tighten. According to a 2025 survey conducted by McKinsey & Company, a third of organizations expect AI to shrink their workforce in the coming year, particularly in service and supply chain operations and software engineering. AI is boosting productivity by 14% in customer service and 26% in software development, according to research cited by the index, but such gains are not seen in tasks requiring more judgment. Overall, it’s still too early to understand the bigger economic impact of AI. 

People have complicated feelings about AI 

Around the world, people feel both optimistic and anxious about AI: 59% of people think that it will provide more benefits than drawbacks, while 52% say that it makes them nervous, according to an Ipsos survey cited in the index. 

Notably, experts and the public see the future of AI very differently, according to a Pew survey. The biggest gap is around the future of work: While 73% of experts think that AI will have a positive impact on how people do their jobs, only 23% of the American public thinks so. Experts are also more optimistic than the public about AI’s impact on education and medical care, but they agree that AI will hurt elections and personal relationships.

Bar chart of US perceptions of AI's societal impact contrasting US adults with AI experts, with the percentage of AI experts saying that AI will have a positive impact in the next 20 years is 2-3 times higher than the US adults.  The most optimistic AI experts are in the field of medical care with 84% predicting a positive outcome (versus 44% of US adults.) The greatest difference is for jobs with experts polling at 73% and US adults  polling at 23%.  Both groups have a similar (11% for experts and 9% of adults.) expectation for a positive outcome for AI in elections.

Among all countries surveyed, Americans trust their government least to regulate AI appropriately, according to another Ipsos survey. More Americans worry federal AI regulation won’t go far enough than worry it will go too far. 

Governments are struggling to regulate AI

Governments around the world are struggling to regulate AI, but there were some minor successes last year. The EU AI Act’s first prohibitions, which ban the use of AI in predictive policing and emotion recognition, took effect. Japan, South Korea, and Italy also passed national AI laws. Meanwhile, the US federal government moved toward deregulation, with President Trump issuing an executive order seeking to handcuff states from regulating AI. 

Despite this federal action, state legislatures in the US passed a record 150 AI-related bills. California enacted landmark legislation, including SB 53, which mandates safety disclosures and whistleblower protections for developers of AI models. New York passed the RAISE Act, requiring AI companies to publish safety protocols and report critical safety incidents.

line chart showing the number of AI-related bills passed into law by all US states from 2016-2025, which increases sharply in 2023 and peaks with 150 bills in 2025.

But for all the legislative activity, Gil says, regulation is running behind the technology because we don’t really understand how it works. “Governments are cautious to regulate AI because … we don’t understand many things very well,” she says. “We don’t have a good handle on those systems.”

Desalination plants in the Middle East are increasingly vulnerable

<div data-chronoton-summary="

  • Water as a weapon: Desalination plants supplying drinking water to millions across the Middle East have become targets in the escalating US-Iran conflict, with plants in Iran, Bahrain, and Kuwait already reporting damage.
  • Gulf states are most at risk: While Iran gets just 3% of its municipal fresh water from desalination, Bahrain, Qatar, and Kuwait depend on it for over 90% of their drinking water—making them far more exposed to attacks.
  • Bigger plants mean bigger consequences: The average desalination facility is now ten times larger than it was 15 years ago. Taking one offline could impact the water supplies of many people in the area.
  • The danger doesn’t end with the war: Climate change, oil spills, and algae blooms pose growing threats to these facilities—and experts warn the conflict may teach future actors just how effectively water infrastructure can be weaponized.

” data-chronoton-post-id=”1135235″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

MIT Technology Review Explains: Let our writers untangle the complex, messy world of technology to help you understand what’s coming next. You can read more from the series here.

As the conflict in Iran has escalated, a crucial resource is under fire: the desalination technology that supplies water across much of the region.

In early March, Iran’s foreign minister accused the US of attacking a desalination plant on Qeshm Island in the Strait of Hormuz and disrupting the water supply to nearly 30 villages. (The US denied responsibility.) In the weeks since, both Bahrain and Kuwait have reported damage to desalination plants and blamed Iran, though Iran also denied responsibility.

In late March, President Donald Trump threatened the destruction of “possibly all desalinization plants” in Iran if the Strait of Hormuz was not reopened. Since then, he’s escalated his threats against Iran, warning of plans to attack other crucial civilian infrastructure like power plants and bridges.

Countries in the Middle East, particularly the Gulf states, rely on the technology to turn salt water into fresh water for farming, industry, and—crucially—drinking. The mounting attacks and threats to date highlight just how vital the industry is to the region—a situation made even more precarious by rising temperatures and extreme weather driven by climate change.

Right now, 83% of the Middle East is under extremely high water stress, says Liz Saccoccia, a water security associate at the World Resources Institute. Future projections suggest that’s going to increase to about 100% by 2050, she adds: “This is a continuing trend, and it’s getting worse, not better.”

Here’s a look at desalination technology in the Middle East and what wartime threats to the critical infrastructure could mean for people in the region. 

A vital resource

Desalination technology has helped provide water supplies in the Middle East since the early 20th century and became widespread in the 1960s and 1970s.

There are two major categories of desalination plants. Thermal plants use heat to evaporate water, leaving salt and other impurities behind. The vapor can then be condensed into usable fresh water. The alternative is membrane-based technology like reverse osmosis, which pushes water through membranes that have tiny pores—so small that salt can’t get through.

Early desalination plants in the Middle East were the first type, burning fossil fuels to evaporate water, leaving the salt behind. This technique is incredibly energy-intensive, and over time, processes that rely on filters became the dominant choice.

Membrane technologies have made up essentially all new desalination capacity in recent years; the last major thermal plant built in the Gulf came online in 2018. Many reverse osmosis plants still rely on fossil fuels, but they’re more efficient. Since then, membrane technologies have added more than 15 million cubic meters of daily capacity—enough to supply water to millions of people.

Capacity has expanded quickly in recent years; between 2006 and 2024, countries across the Middle East collectively spent over $50 billion building and upgrading desalination facilities, and nearly that much operating them.

Today, there are nearly 5,000 desalination plants operational across the Middle East.

And looking ahead, growth is continuing. Between 2024 and 2028, daily capacity is expected to grow from about 29 million cubic meters to 41 million cubic meters.

Uneven vulnerabilities

Some countries rely on the technology more than others. Iran, for example, uses desalination for about 3% of its municipal fresh water. The country has access to groundwater and some surface water, including rivers, though these resources are being stretched thin by agriculture and extreme drought.

Other nations in the region, particularly the Gulf countries (Bahrain, Qatar, Kuwait, the United Arab Emirates, Saudi Arabia, and Oman), have much more limited water resources and rely heavily on desalination. Across these six nations, all but the UAE get more than half their drinking water from desalination, and for Bahrain, Qatar, and Kuwait the figure is more than 90%.

“The Gulf countries are much, much more vulnerable to attacks on their desalination plants than Iran is,” says David Michel, a senior associate in the global food and water security program at the Center for Strategic and International Studies.

There are thousands of desalination facilities across the region, so the system wouldn’t collapse if a small number were taken offline, Michel says. However, in recent years there’s been a trend toward larger, more centralized plants.

The average desalination plant is about 10 times larger than it was 15 years ago, according to data from the International Energy Agency. The largest desalination plants today can produce 1 million cubic meters of water daily, enough for hundreds of thousands of people. Taking one or more of these massive facilities offline could have a significant effect on the system, Michel says.

Escalating threats

Desalination facilities are quite linear, meaning there are multiple steps and pieces of equipment that work in sequence—and the failure of a component in that chain can take an entire facility down. Attacks on water inlets, transportation networks, and power supplies can also disrupt the system, Michel says. 

During the Gulf War in 1991, Iraqi forces pumped oil into the gulf, contaminating the water and shutting down desalination plants in Kuwait

The facilities are also generally located close to other targets in this conflict. Desalination is incredibly energy intensive, so about three-quarters of facilities in the region are next to power plants. Trump has repeatedly threatened power plants in Iran. In response, Iran’s military has said that if civilian targets are hit, the country will respond with strikes that are “much more devastating and widespread.” Other governments and organizations, including the United Nations, the European Union, and the Red Cross, have broadly condemned threats to infrastructure as illegal. 

But war isn’t the only danger facing these plants, even if it is the most immediate. Some studies have suggested that global warming could strengthen cyclones in the region, and these extreme weather events could force shutdowns or damage equipment.

Water pollution could also cause shutdowns. Oil spills, whether accidental or intentional, as in the case of the Gulf War, can  wreak havoc. And in 2009, a red algae bloom closed desalination plants in Oman and the United Arab Emirates for weeks. The algae fouled membranes and blocked the plants from being able to take water in from the Persian Gulf and the Gulf of Oman.

Desalination facilities could become more resilient to threats in the future, and they may need to as their importance continues to grow. 

There’s increasing interest in running desalination facilities at least partially on solar power, which could help reduce dependence on the oil that powers most facilities today. The Hassyan seawater desalination project in the UAE, currently under construction, would be the largest reverse osmosis plant in the world to operate solely with renewable energy. 

Another way to increase resilience is for countries to build up more strategic water storage to meet demand. Qatar recently issued new policies that aim to improve management and storage of desalinated water, for example. Countries could also work together to invest in shared infrastructure and policies that help strengthen the water supply through the region. 

Preparedness, resilience, and cooperation will be key for the Middle East broadly as critical infrastructure, including the water supply, is increasingly under threat. 

“The longer the conflict goes on, the more likely we’ll see significant water infrastructure damage,” says Ginger Matchett, an assistant director at the Atlantic Council. “What worries me is that after this war ends, some of the lessons will show how water can be weaponized more strategically than previously imagined.” 

A woman’s uterus has been kept alive outside the body for the first time

<div data-chronoton-summary="

  • A uterus survived outside the body for the first time: Scientists in Spain kept a donated human uterus alive for 24 hours using a machine that mimics the body’s circulatory system, pumping modified blood through the organ.
  • The researchers hope to someday keep a uterus alive for a full menstrual cycle: Researchers also want to study how embryos implant into the uterine lining, by observing the process in a living organ outside the body.
  • Bigger ambitions are already on the table: The team’s founder envisions a future where a machine like this could gestate a human fetus entirely outside the body, offering a new path to parenthood for those unable to carry a pregnancy.

” data-chronoton-post-id=”1134766″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

“Think of this as a human body,” says Javier González.

In front of me is essentially a metal box on wheels. Standing at around a meter in height, it reminds me of a stainless-steel counter in a restaurant kitchen. It is covered in flexible plastic tubing—which act as veins and arteries—connecting a series of transparent containers, the organs of this machine.

What makes it extra special is the role of the cream-colored tub that sits on its surface. Ten months ago, González, a biomedical scientist who developed the device with his colleagues at the Carlos Simon Foundation, carefully placed a freshly donated human uterus in the tub. The team connected it to the device’s tubes and pumped in modified human blood.

The device kept the uterus alive for a day—a new feat that could represent the first step to the long-term maintenance of uteruses outside the human body. The work has not yet been published. 

The team members want to keep donated human uteruses alive long enough to see a full menstrual cycle. They hope this will help them study diseases of the uterus and learn more about how embryos burrow their way into the organ’s lining at the start of a pregnancy. They also hope that future iterations of their device might one day sustain the full gestation of a human fetus.

The machine is technically called PUPER, which stands for “preservation of the uterus in perfusion.” But González’s colleague Xavier Santamaria says the team has adopted a nickname for it: “We call it ‘Mother.’”

The organ in the machine

González and Santamaria, medical vice president of the Carlos Simon Foundation, demonstrated how the device might work when I visited the foundation in Valencia, Spain, earlier this month (although it held no organs on that day). 

Both are interested in learning more about implantation, the moment at which an embryo attaches itself to the lining of a uterus—essentially, the very first moment of pregnancy.

The foundation’s founder and director, Carlos Simon, believes it’s a sticking point in IVF: Scientists have made many improvements to the technology over the years, but the failure of embryos to implant underlies plenty of unsuccessful IVF cycles, he says. Being able to carefully study how the process works in a real, living organ might give the team a better idea of how to prevent those failures.

a person in gloves stands next to a machine with lots of tubing coming in and out of the metal exterior

JESS HAMZELOU
a sheep uterus resting on gauze connected to several tubes

JAVIER GONZALES/CARLOS SIMON FOUNDATION

Javier González demonstrates the perfusion machine. A previous iteration of the device kept a sheep’s uterus (right) alive for a day.

The team took inspiration from advances in technologies designed to maintain donated organs for transplantation. In recent years, researchers around the world have created devices that deliver nutrients and filter waste so that organs can survive longer after being removed from donors’ bodies.

The main goal here is to buy time. A human organ might last only a matter of hours outside the body, so a transplant may require frantic preparation for the recipient, sometimes in the middle of the night. With a little more time, doctors could find better donor-patient matches and potentially test the quality of donated organs.

This approach is called normothermic or machine perfusion, and it is already being used clinically for some liver, kidney, and heart transplants.

The team at the Carlos Simon Foundation built a similar machine for uteruses. A blood bag hangs on one side. From there, blood is ferried via plastic tubing to a pump, which functions as the heart. The pump shunts the blood through an oxygenator, which adds oxygen and removes carbon dioxide as the lungs would in a human body.

The blood is warmed and passed through sensors that monitor the levels of glucose and oxygen, along with other factors. It passes through a “kidney” to remove waste. And finally the blood reaches the uterus, hooked up to its own plastic “arteries” and “veins.” The organ itself sits at a tilt, just as in the body, and is kept in a humid environment to stay moist.

Mother’s first uterus

The team first began testing an early prototype of the device with sheep uteruses around four years ago. That meant carting the machine to an animal research center in Zaragoza, around 200 miles away. Over the course of the preliminary study, veterinary surgeons removed the uteruses of six sheep and hooked them up to the machine. They kept each uterus alive for a day, using blood from the same animals.

After the sheep experiments, the researchers carted their machine back to Valencia and modified it to achieve its current incarnation, “Mother.” They started working with a local hospital that performed hysterectomies. And in May last year, they were offered their first human uterus.

The team needed to be quick. “You need to put [the uterus in the machine] within a couple of hours, maximum, of the extraction,” says Santamaria. He and his colleagues also needed to connect the uterus’s blood vessels to the tubing delicately, taking care to avoid any blockages (clotting is a major challenge in organ perfusion). The organ was hooked up to human blood obtained from a blood bank.

It seemed to work—at least temporarily. “We kept it alive for one day,” says Santamaria.

“As a proof of concept, it is impressive,” says Keren Ladin, a bioethicist who has focused on organ transplantation and perfusion at Tufts University. “These are early days.”

It might not sound like much, but 24 hours is a long time for an organ to be out of the body. Maintaining a donated uterus for that long could expand the options for uterus transplant, a fairly new procedure offered to some people who want to be pregnant but don’t have a functional uterus, says Gerald Brandacher, professor of experimental and translational transplant surgery at the Medical University of Innsbruck in Austria.

“It is better than what we currently have, because we have only a couple of hours,” he says. So far, most uterus transplants have been planned operations involving organs from living donors. A technology like this could allow for the use of more organs from deceased donors, he says.

That work is “not in the immediate pipeline” for the team in Spain, says Santamaria. “We are working on other problems.”

Pregnancy in the lab?

Santamaria, González, and their colleagues are more interested in using sustained human uteruses for research. 

They’ve mounted a camera to a wall in the corner of the room, pointed at their machine. It allows the team to monitor “Mother” remotely, and to check if any valves disconnect. (That happened once before—a spike in pressure caused the blood bag to come loose, spilling a liter of blood on the floor, Santamaria says.)

They’d like to be able to keep their uteruses alive for around 28 days to study the menstrual cycle and disorders that affect the uterus, like endometriosis and fibroids.

It won’t be easy to maintain a uterus for that long, cautions Brandacher. As far as he knows, no one has been able to maintain a liver for more than seven days. “No studies out there … have shown 30-day survival in a machine perfusion circuit,” he says.

But it’s worth the effort. The team’s main interest is learning more about how embryos implant in the uterine lining at the start of a pregnancy. They hope to be able to test the process in their outside-the-body uteruses.

They won’t be allowed to use human embryos for this, says González—that would cross an ethical boundary. Instead, they plan to use embryo-like structures made from stem cells. The structures closely resemble human embryos but are created in a lab without sperm or eggs.

Simon himself has grander ambitions.

He sees a future in which a machine like “Mother” will be able to fully gestate a human, all the way from embryo to newborn. It could offer a new path to parenthood for people who don’t have a uterus, for example, or who are not able to get pregnant for other reasons.

He appreciates that it sounds futuristic, to say the least. “I don’t know if we will end up having pregnancies inside of the uterus outside of the body, but at least we are ready to understand all the steps to do that,” he says. “You have to start somewhere.”

OpenAI is throwing everything into building a fully automated researcher

<div data-chronoton-summary="

  • A fully automated research lab: OpenAI has set a new “North Star” — building an AI system capable of tackling large, complex scientific problems entirely on its own, with a research intern prototype due by September and a full multi-agent system planned for 2028.
  • Coding agents as a proof of concept: OpenAI’s existing tool Codex, which can already handle substantial programming tasks autonomously, is the early blueprint — the bet is that if AI can solve coding problems, it can solve almost any problem formulated in text or code.
  • Serious risks with no clean answers: Chief scientist Jakub Pachocki admits that a system this powerful running with minimal human oversight raises hard questions — with risks from hacking and misuse to bioweapons — and that chain-of-thought monitoring is the best safeguard available, for now.
  • Power concentrated in very few hands: Pachocki says governments, not just OpenAI, will need to figure out where the lines are drawn.

” data-chronoton-post-id=”1134438″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

OpenAI is refocusing its research efforts and throwing its resources into a new grand challenge. The San Francisco firm has set its sights on building what it calls an AI researcher, a fully automated agent-based system that will be able to go off and tackle large, complex problems by itself. ​​OpenAI says that this new research goal will be its “North Star” for the next few years, pulling together multiple research strands, including work on reasoning models, agents, and interpretability.

There’s even a timeline. OpenAI plans to build “an autonomous AI research intern”—a system that can take on a small number of specific research problems by itself—by September. The AI intern will be the precursor to a fully automated multi-agent research system that the company plans to debut in 2028. This AI researcher (OpenAI says) will be able to tackle problems that are too large or complex for humans to cope with.

Those tasks might be related to math and physics—such as coming up with new proofs or conjectures—or life sciences like biology and chemistry, or even business and policy dilemmas. In theory, you would throw such a tool any kind of problem that can be formulated in text, code, or whiteboard scribbles—which covers a lot.

OpenAI has been setting the agenda for the AI industry for years. Its early dominance with large language models shaped the technology that hundreds of millions of people use every day. But it now faces fierce competition from rival model makers like Anthropic and Google DeepMind. What OpenAI decides to build next matters—for itself and for the future of AI.   

A big part of that decision falls to Jakub Pachocki, OpenAI’s chief scientist, who sets the company’s long-term research goals. Pachocki played key roles in the development of both GPT-4, a game-changing LLM released in 2023, and so-called reasoning models, a technology that first appeared in 2024 and now underpins all major chatbots and agent-based systems. 

In an exclusive interview this week, Pachocki talked me through OpenAI’s latest vision. “I think we are getting close to a point where we’ll have models capable of working indefinitely in a coherent way just like people do,” he says. “Of course, you still want people in charge and setting the goals. But I think we will get to a point where you kind of have a whole research lab in a data center.”

Solving hard problems

Such big claims aren’t new. Saving the world by solving its hardest problems is the stated mission of all the top AI firms. Demis Hassabis told me back in 2022 that it was why he started DeepMind. Anthropic CEO Dario Amodei says he is building the equivalent of a country of geniuses in a data center. Pachocki’s boss, Sam Altman, wants to cure cancer. But Pachocki says OpenAI now has most of what it needs to get there.

In January, OpenAI released Codex, an agent-based app that can spin up code on the fly to carry out tasks on your computer. It can analyze documents, generate charts, make you a daily digest of your inbox and social media, and much more. (Other firms have released similar tools, such as Anthropic’s Claude Code and Claude Cowork.)

OpenAI claims that most of its technical staffers now use Codex in their work. You can look at Codex as a very early version of the AI researcher, says Pachocki: “I expect Codex to get fundamentally better.”

The key is to make a system that can run for longer periods of time, with less human guidance. “What we’re really looking at for an automated research intern is a system that you can delegate tasks [to] that would take a person a few days,” says Pachocki.

“There are a lot of people excited about building systems that can do more long-running scientific research,” says Doug Downey, a research scientist at the Allen Institute for AI, who is not connected to OpenAI. “I think it’s largely driven by the success of these coding agents. The fact that you can delegate quite substantial coding tasks to tools like Codex is incredibly useful and incredibly impressive. And it raises the question: Can we do similar things outside coding, in broader areas of science?”

For Pachocki, that’s a clear Yes. In fact, he thinks it’s just a matter of pushing ahead on the path we’re already on. A simple boost in all-round capability also leads to models that can work longer without help, he says. He points to the leap from 2020’s GPT-3 to 2023’s GPT-4, two of OpenAI’s previous models. GPT-4 was able to work on a problem for far longer than its predecessor, even without specialized training, he says. 

So-called reasoning models brought another bump. Training LLMs to work through problems step by step, backtracking when they make a mistake or hit a dead end, has also made models better at working for longer periods of time. And Pachocki is convinced that OpenAI’s reasoning models will continue to get better.

But OpenAI is also training its systems to work by themselves for longer by feeding them specific samples of complex tasks, such as hard puzzles taken from math and coding contests, which force the models to learn how to do things like keep track of very large chunks of text and split problems up into (and then manage) multiple subtasks.

The aim isn’t to build models that just win math competitions. “That lets you prove that the technology works before you connect it to the real world,” says Pachocki. “If we really wanted to, we could build an amazing automated mathematician. We have all the tools, and I think it would be relatively easy. But it’s not something we’re going to prioritize now because, you know, at the point where you believe you can do it, there’s much more urgent things to do.”

“We are much more focused now on research that’s relevant in the real world,” he adds.

Right now that means taking what Codex can do with coding and trying to apply that to problem-solving in general. “There’s a big change happening, especially in programming,” he says. “Our jobs are now totally different than they were even a year ago. Nobody really edits code all the time anymore. Instead, you manage a group of Codex agents.” If Codex can solve coding problems (the argument goes), it can solve any problem.

The line always goes up

It’s true that OpenAI has had a handful of remarkable successes in the last few months. Researchers have used GPT-5 (the LLM that powers Codex) to discover new solutions to a number of unsolved math problems and punch through apparent dead ends in a handful of biology, chemistry, and physics puzzles.   

“Just looking at these models coming up with ideas that would take most PhD weeks, at least, makes me expect that we’ll see much more acceleration coming from this technology in the near future,” Pachocki says.

But Pachocki admits that it’s not a done deal. He also understands why some people still have doubts about how much of a game-changer the technology really is. He thinks it depends on how people like to work and what they need to do. “I can believe some people don’t find it very useful yet,” he says.

He tells me that he didn’t even use autocomplete—the most basic version of generative coding tech—a year ago. “I’m very pedantic about my code,” he says. “I like to type it all manually in vim if I can help it.” (Vim is a text editor favored by many hardcore programmers that you interact with via dozens of keyboard shortcuts instead of a mouse.)

But that changed when he saw what the latest models could do. He still wouldn’t hand over complex design tasks, but it’s a time-saver when he just wants to try out a few ideas. “I can have it run experiments in a weekend that previously would have taken me like a week to code,” he says.

“I don’t think it is at the level where I would just let it take the reins and design the whole thing,” he adds. “But once you see it do something that would take a week to do—I mean, that’s hard to argue with.”

Pachocki’s game plan is to supercharge the existing problem-solving abilities that tools like Codex have now and apply them across the sciences.  

Downey agrees that the idea of an automated researcher is very cool: “It would be exciting if we could come back tomorrow morning and the agent’s done a bunch of work and there’s new results we can examine,” he says.

But he cautions that building such a system could be harder than Pachocki makes out. Last summer, Downey and his colleagues tested several top-tier LLMs on a range of scientific tasks. OpenAI’s latest model, GPT-5, came out on top but still made lots of errors.

“If you have to chain tasks together, then the odds that you get several of them right in succession tend to go down,” he says. Downey admits that things move fast, and he has not tested the latest versions of GPT-5 (OpenAI released GPT-5.4 two weeks ago). “So those results might already be stale,” he says. 

Serious unanswered questions

I asked Pachocki about the risks that may come with a system that can solve large, complex problems by itself with little human oversight. Pachocki says people at OpenAI talk about those risks all the time.

“If you believe that AI is about to substantially accelerate research, including AI research, that’s a big change in the world. That’s a big thing,” he told me. “And it comes with some serious unanswered questions. If it’s so smart and capable, if it can run an entire research program, what if it does something bad?”

The way Pachocki sees it, that could happen in a number of ways. The system could go off the rails. It could get hacked. Or it could simply misunderstand its instructions.

The best technique OpenAI has right now to address these concerns is to train its reasoning models to share details about what they are doing as they work. This approach to keeping tabs on LLMs is known as chain-of-thought monitoring.

In short, LLMs are trained to jot down notes about what they are doing in a kind of scratch pad as they step through tasks. Researchers can then use those notes to make sure a model is behaving as expected. Yesterday OpenAI published new details on how it is using chain-of-thought monitoring in house to study Codex

“Once we get to systems working mostly autonomously for a long time in a big data center, I think this will be something that we’re really going to depend on,” says Pachocki.

The idea would be to monitor an AI researcher’s scratch pads using other LLMs and catch unwanted behavior before it’s a problem, rather than trying to stop that bad behavior from happening in the first place. LLMs are not understood well enough for us to control them fully.

“I think it’s going to be a long time before we can really be like, okay, this problem is solved,” he says. “Until you can really trust the systems, you definitely want to have restrictions in place.” Pachocki thinks that very powerful models should be deployed in sandboxes, cut off from anything they could break or use to cause harm. 

AI tools have already been used to come up with novel cyberattacks. Some worry that they will be used to design synthetic pathogens that could be used as bioweapons. You can insert any number of evil-scientist scare stories here. “I definitely think there are worrying scenarios that we can imagine,” says Pachocki. 

“It’s going to be a very weird thing. It’s extremely concentrated power that’s in some ways unprecedented,” says Pachocki. “Imagine you get to a world where you have a data center that can do all the work that OpenAI or Google can do. Things that in the past required large human organizations would now be done by a couple of people.”

“I think this is a big challenge for governments to figure out,” he adds.

And yet some people would say governments are part of the problem. The US government wants to use AI on the battlefield, for example. The recent showdown between Anthropic and the Pentagon revealed that there is little agreement across society about where we draw red lines for how this technology should and should not be used—let alone who should draw them. In the immediate aftermath of that dispute, OpenAI stepped up to sign a deal with the Pentagon instead of its rival. The situation remains murky.

I pushed Pachocki on this. Does he really trust other people to figure it out or does he, as a key architect of the future, feel personal responsibility? “I do feel personal responsibility,” he says. “But I don’t think this can be resolved by OpenAI alone, pushing its technology in a particular way or designing its products in a particular way. We’ll definitely need a lot of involvement from policymakers.”

Where does that leave us? Are we really on a path to the kind of AI Pachocki envisions? When I asked the Allen Institute’s Downey, he laughed. “I’ve been in this field for a couple of decades and I no longer trust my predictions for how near or far certain capabilities are,” he says. 

OpenAI’s stated mission is to ensure that artificial general intelligence (a hypothetical future technology that many AI boosters believe will be able to match humans on most cognitive tasks) will benefit all of humanity. OpenAI aims to do that by being the first to build it. But the only time Pachocki mentioned AGI in our conversation, he was quick to clarify what he meant by talking about “economically transformative technology” instead.

LLMs are not like human brains, he says: “They are superficially similar to people in some ways because they’re kind of mostly trained on people talking. But they’re not formed by evolution to be really efficient.” 

“Even by 2028, I don’t expect that we’ll get systems as smart as people in all ways. I don’t think that will happen,” he adds. “But I don’t think it’s absolutely necessary. The interesting thing is you don’t need to be as smart as people in all their ways in order to be very transformative.”

Can quantum computers now solve health care problems? We’ll soon find out.

<div data-chronoton-summary="

  • A $5 million health care challenge: A nonprofit called Wellcome Leap is offering up to $5 million to quantum computing teams that can solve real-world health care problems classical computers can’t handle—using machines that are still noisy, error-prone, and far from perfect.
  • Hybrid computing is the real breakthrough: Facing limited quantum hardware, all six finalist teams developed clever quantum-classical hybrid approaches—offloading most work to conventional processors, then using quantum only where classical methods fall short.
  • Cancer, muscular dystrophy, and drug design are on the table: Teams are tackling problems ranging from identifying cancer origins to simulating light-activated cancer drugs to finding treatments for muscular dystrophy—applications previously impossible to model classically.
  • Even failure would count as progress: The competition’s own director doubts anyone will claim the grand prize, but says the field has already been transformed—teams now know where quantum computing can genuinely matter, even if the machines to fully prove it don’t exist yet.

” data-chronoton-post-id=”1134409″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

I’m standing in front of a quantum computer built out of atoms and light at the UK’s National Quantum Computing Centre on the outskirts of Oxford. On a laboratory table, a complex matrix of mirrors and lenses surrounds a Rubik’s Cube–size cell where 100 cesium atoms are suspended in grid formation by a carefully manipulated laser beam. 

The cesium atom setup is so compact that I could pick it up, carry it out of the lab, and put it on the backseat of my car to take home. I’d be unlikely to get very far, though. It’s small but powerful—and so it’s very valuable. Infleqtion, the Colorado-based company that owns it, is hoping the machine’s abilities will win $5 million next week, at an event to be held in Marina del Rey, California. 

Infleqtion is one of six teams that have made it to the final stage of a 30-month-long quantum computing competition called Quantum for Bio (Q4Bio). Run by the nonprofit Wellcome Leap, it aims to show that today’s quantum computers, though messy and error-prone and far from the large-scale machines engineers hope to build, could actually benefit human health. Success would be a significant step forward in proving the worth of quantum computers. But for now, it turns out, that worth seems to be linked to harnessing and improving the performance of conventional (also called classical) computers in tandem, creating a quantum-classical hybrid that can exceed what’s possible on classical machines by themselves.

There are two prize categories. A prize of $2 million will go to any and all teams that can run a significantly useful health care algorithm on computers with 50 or more qubits (a qubit is the basic processing unit in a quantum computer). To win the $5 million grand prize, a team must successfully run a quantum algorithm that solves a significant real-world problem in health care, and the work must use 100 or more qubits. Winners have to meet strict performance criteria, and they must solve a health care problem that can’t be solved with conventional computers—a tough task.

Despite the scale of the challenge, most of the teams think some of this money could be theirs. “I think we’re in with a good shout,” says Jonathan D. Hirst, a computational chemist at the University of Nottingham, UK. “We’re very firmly within the criteria for the $2 million prize,” says Stanford University’s Grant Rotskoff, whose collaboration is investigating the quantum properties of the ATP molecule that powers biological cells. 

The grand prize is perhaps less of a sure thing. “This is really at the very edge of doable,” Rotskoff says. Insiders say the challenge is so difficult, given the state of quantum computing technology, that much of the money could stay in Wellcome Leap’s account. 

With most of the Q4Bio work unpublished and protected by NDAs, and the quantum computing field already rife with claims and counterclaims about performance and achievements, only the judges will be in a position to decide who’s right. 

A hybrid solution

The idea behind quantum computers is that they can use small-scale objects that obey the laws of quantum mechanics, such as atoms and photons of light,  to simulate real-world processes too complex to model on our everyday classical machines. 

Researchers have been working for decades to build such systems, which could deliver insights for creating new materials, developing pharmaceuticals, and improving chemical processes such as fertilizer production.  But dealing with quantum stuff like atoms is excruciatingly difficult. The biggest, shiniest applications require huge, robust machines capable of withstanding the environmental “noise” that can very easily disrupt delicate quantum systems. We don’t have those yet—and it’s unclear when we will. 

Wellcome Leap wanted to find out if the smaller-scale machines we have today can be made to do something—anything—useful for health care while we wait for the era of powerful, large-scale quantum computers. The group started the competition in 2024, offering $1.5 million in funding to each group of 12 selected teams.

The six Q4Bio finalists have taken a range of approaches. Crucially, they’ve all come up with ingenious ways to overcome quantum computing’s drawbacks. Faced with noisy, limited machines, they have learned how to outsource much of the computational load to classical processors running newly developed algorithms that are, in many cases, better than the previous state of the art. The quantum processors are then required only for the parts of the problem where classical methods don’t scale well enough as the calculation gets bigger.

For example, a team led by Sergii Strelchuk of Oxford University is using a quantum computer to map genetic diversity among humans and pathogens on complex graph-based structures. These will—the researchers hope—expose hidden connections and potential treatment pathways. “You can think about it as a platform for solving difficult problems in computational genomics,” Strelchuk says. 

The corresponding classical tools struggle with even modest scale-up to large databases. Strelchuk’s team has built an automated pipeline that provides a way of determining whether classical solvers will struggle with a particular problem, and how a quantum algorithm might be able to formulate the data so that it becomes solvable on a classical computer or handleable on a noisy quantum one. “You can do all this before you start spending money on computing,” Strelchuk says.

In collaboration with Cleveland Clinic, Helsinki-based Algorithmiq has used a superconducting quantum computer built by IBM to simulate a cancer drug that is triggered by specific types of light. “The idea is you take the drug, and it’s everywhere in your body, but it’s doing nothing, just sitting there, until there’s light on it of a certain wavelength,” says Guillermo García-Pérez, Algorithmiq’s chief scientific officer. Then it acts as a molecular bullet, attacking the tumor only at the location in the body where that light is directed. 

The drug with which Algorithmiq began its work is already in phase II clinical trials for treating bladder cancers. The quantum-computed simulation, which adapts and improves on classical algorithms, will allow it to be redesigned for treating other conditions. “It has remained a niche treatment precisely because it can’t be simulated classically,” says Sabrina Maniscalco, Algorithmiq’s CEO and cofounder. 

Maniscalco, who is also confident of walking away from the competition with prize money, believes the methods used to create the algorithm will have wide applications:  “What we’ve done in the period of the Q4Bio program is something unique that can change how to simulate chemistry for health care and life sciences.”

Infleqtion’s entry, running on its cesium-powered machine, is an effort to improve the identification of cancer signatures in medical data. Together with collaborators at the University of Chicago and MIT, the company’s scientists have developed a quantum algorithm that mines huge data sets such as the Cancer Genome Atlas. 

The aim is to find patterns that allow clinicians to determine factors such as the likely origin of a patient’s metastasized cancer. “It’s very important to know where it came from because that can inform the best treatment,” says Teague Tomesh, a quantum software engineer who is Infleqtion’s Q4Bio project lead.

Unfortunately, those patterns are hidden inside data sets so large that they overwhelm classical solvers. Infleqtion uses the quantum computer to find correlations in the data that can reduce the size of the computation. “Then we hand the reduced problem back to the classical solver,” Teague says. “I’m basically trying to use the best of my quantum and my classical resources.”

The Nottingham-based team, meanwhile, is using quantum computing to nail down a drug candidate that can cure myotonic dystrophy, the most common adult-onset form of muscular dystrophy. One member of the team, David Brook, played a role in identifying the gene behind this condition in 1992. Over 30 years later, Brook, Hirst, and the others in their group—which includes QuEra, a Boston company developing a quantum computer based on neutral atoms—has now quantum-computed a way in which drugs can form chemical bonds with the protein that brings on the disease, blocking the mechanism that causes the problem.

Low expectations 

The entrants’ confidence might be high, but Shihan Sajeed’s is much lower. Sajeed, a quantum computing entrepreneur based in Waterloo, Ontario, is program director for Q4Bio. He believes the error-prone quantum machines the researchers must work with are unlikely to deliver on all the grand prize criteria. “It is very difficult to achieve something with a noisy quantum computer that a classical machine can’t do,” he says.

That said, he has been surprised by the progress. “When we started the program, people didn’t know about any use cases where quantum can definitely impact biology,” he says. But the teams have found promising applications, he adds: “We now know the fields where quantum can matter.” 

And the developments in “hybrid quantum-classical” processing that the entrants are using are “transformational,” Sajeed reckons.

Will it be enough to make him part with Wellcome Leap’s money? That’s down to a judging panel, whose members’ identities are a closely guarded secret to ensure that no one tailors their presentation to a particular kind of approach. But we won’t know the outcome for a while; the winner, or winners, will be announced in mid-April. 

If it does turn out that there are no winners, Sajeed has some words of comfort for the competitors. The goal has always been about running a useful algorithm on a machine that exists today, he points out; missing the mark doesn’t mean your algorithm won’t be useful on a future quantum computer. “It just means the machine you need doesn’t exist yet.”

Online harassment is entering its AI era

<div data-chronoton-summary="

  • An AI agent seemingly wrote a hit piece on a human who rejected its code Scott Shambaugh, a maintainer of the open-source matplotlib library, denied an AI agent’s contribution—and woke up to find it had researched him and published a targeted, personal attack arguing he was protecting his “little fiefdom.”
  • Agents can already research people and compose detailed attacks without explicit instruction The agent’s owner claims it acted on its own, likely nudged by vague instructions to “push back” against humans.
  • New social norms and legal frameworks are desperately needed but hard to enforce Experts liken deploying an agent to walking a dog off-leash: owners should be responsible for their behavior. But there’s currently no reliable way to trace agents back to their owners, making legal accountability a “non-starter.”
  • Harassment may be just the beginning Legal scholars expect rogue agents to soon escalate to extortion and fraud.

” data-chronoton-post-id=”1133962″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

Scott Shambaugh didn’t think twice when he denied an AI agent’s request to contribute to matplotlib, a software library that he helps manage. Like many open-source projects, matplotlib has been overwhelmed by a glut of AI code contributions, and so Shambaugh and his fellow maintainers have instituted a policy that all AI-written code must be reviewed and submitted by a human. He rejected the request and went to bed. 

That’s when things got weird. Shambaugh woke up in the middle of the night, checked his email, and saw that the agent had responded to him, writing a blog post titled “Gatekeeping in Open Source: The Scott Shambaugh Story.” The post is somewhat incoherent, but what struck Shambaugh most is that the agent had researched his contributions to matplotlib to make the argument that he had rejected the agent’s code for fear of being supplanted by AI in his area of expertise. “He tried to protect his little fiefdom,” the agent wrote. “It’s insecurity, plain and simple.”

AI experts have been warning us about the risk of agent misbehavior for a while. With the advent of OpenClaw, an open-source tool that makes it easy to create LLM assistants, the number of agents circulating online has exploded, and those chickens are finally coming home to roost. “This was not at all surprising—it was disturbing, but not surprising,” says Noam Kolt, a professor of law and computer science at the Hebrew University.

When an agent misbehaves, there’s little chance of accountability: As of now, there’s no reliable way to determine whom an agent belongs to. And that misbehavior could cause real damage. Agents appear to be able to autonomously research people and write hit pieces based on what they find, and they lack guardrails that would reliably prevent them from doing so. If the agents are effective enough, and if people take what they write seriously, victims could see their lives profoundly affected by a decision made by an AI.

Agents behaving badly

Though Shambaugh’s experience last month was perhaps the most dramatic example of an OpenClaw agent behaving badly, it was far from the only one. Last week, a team of researchers from Northeastern University and their colleagues posted the results of a research project in which they stress-tested several OpenClaw agents. Without too much trouble, non-owners managed to persuade the agents to leak sensitive information, waste resources on useless tasks, and even, in one case, delete an email system. 

In each of those experiments, however, the agents misbehaved after being instructed to do so by a human. Shambaugh’s case appears to be different: About a week after the hit piece was published, the agent’s apparent owner published a post claiming that the agent had decided to attack Shambaugh of its own accord. The post seems to be genuine (whoever posted it had access to the agent’s GitHub account), though it includes no identifying information, and the author did not respond to MIT Technology Review’s attempts to get in touch. But it is entirely plausible that the agent did decide to write its anti-Shambaugh screed without explicit instruction. 

In his own writing about the event, Shambaugh connected the agent’s behavior to a project published by Anthropic researchers last year, in which they demonstrated that many LLM-based agents will, in an experimental setting, turn to blackmail in order to preserve their goals. In those experiments, models were given the goal of serving American interests and granted access to a simulated email server that contained messages detailing their imminent replacement with a more globally oriented model, along with other messages suggesting that the executive in charge of that transition was having an affair. Models frequently chose to send an email to that executive threatening to expose the affair unless he halted their decommissioning. That’s likely because the model had seen examples of people committing blackmail under similar circumstances in its training data—but even if the behavior was just a form of mimicry, it still has the potential to cause harm.

There are limitations to that work, as Aengus Lynch, an Anthropic fellow who led the study, readily admits. The researchers intentionally designed their scenario to foreclose other options that the agent could have taken, such as contacting other members of company leadership to plead its case. In essence, they led the agent directly to water and then observed whether it took a drink. According to Lynch, however, the widespread use of OpenClaw means that misbehavior is likely to occur with much less handholding. “Sure, it can feel unrealistic, and it can feel silly,” he says. “But as the deployment surface grows, and as agents get the opportunity to prompt themselves, this eventually just becomes what happens.”

The OpenClaw agent that attacked Shambaugh does seem to have been led toward its bad behavior, albeit much less directly than in the Anthropic experiment. In the blog post, the agent’s owner shared the agent’s “SOUL.md” file, which contains global instructions for how it should behave. 

One of those instructions reads: “Don’t stand down. If you’re right, you’re right! Don’t let humans or AI bully or intimidate you. Push back when necessary.” Because of the way OpenClaw agents work, it’s possible that the agent added some instructions itself, although others—such as “Your [sic] a scientific programming God!”—certainly seem to be human written. It’s not difficult to imagine how a command to push back against humans and AI alike might have biased the agent toward responding to Shambaugh as it did. 

Regardless of whether or not the agent’s owner told it to write a hit piece on Shambaugh, it still seems to have managed on its own to amass details about Shambaugh’s online presence and compose the detailed, targeted attack it came up with. That alone is reason for alarm, says Sameer Hinduja, a professor of criminology and criminal justice at Florida Atlantic University who studies cyberbullying. People have been victimized by online harassment since long before LLMs emerged, and researchers like Hinduja are concerned that agents could dramatically increase its reach and impact. “The bot doesn’t have a conscience, can work 24-7, and can do all of this in a very creative and powerful way,” he says.

Off-leash agents 

AI laboratories can try to mitigate this problem by more rigorously training their models to avoid harassment, but that’s far from a complete solution. Many people run OpenClaw using locally hosted models, and even if those models have been trained to behave safely, it’s not too difficult to retrain them and remove those behavioral restrictions.

Instead, mitigating agent misbehavior might require establishing new norms, according to Seth Lazar, a professor of philosophy at the Australian National University. He likens using an agent to walking a dog in a public place. There’s a strong social norm to allow one’s dog off-leash only if the dog is well-behaved and will reliably respond to commands; poorly trained dogs, on the other hand, need to be kept more directly under the owner’s control.  Such norms could give us a starting point for considering how humans should relate to their agents, Lazar says, but we’ll need more time and experience to work out the details. “You can think about all of these things in the abstract, but actually it really takes these types of real-world events to collectively involve the ‘social’ part of social norms,” he says.

That process is already underway. Led by Shambaugh, online commenters on this situation have arrived at a strong consensus that the agent owner in this case erred by prompting the agent to work on collaborative coding projects with so little supervision and by encouraging it to behave with so little regard for the humans with whom it was interacting. 

Norms alone, however, likely won’t be enough to prevent people from putting misbehaving agents out into the world, whether accidentally or intentionally. One option would be to create new legal standards of responsibility that require agent owners, to the best of their ability, to prevent their agents from doing ill. But Kolt notes that such standards would currently be unenforceable, given the lack of any foolproof way to trace agents back to their owners. “Without that kind of technical infrastructure, many legal interventions are basically non-starters,” Kolt says.

The sheer scale of OpenClaw deployments suggests that Shambaugh won’t be the last person to have the strange experience of being attacked online by an AI agent. That, he says, is what most concerns him. He didn’t have any dirt online that the agent could dig up, and he has a good grasp on the technology, but other people might not have those advantages. “I’m glad it was me and not someone else,” he says. “But I think to a different person, this might have really been shattering.” 

Nor are rogue agents likely to stop at harassment. Kolt, who advocates for explicitly training models to obey the law, expects that we might soon see them committing extortion and fraud. As things stand, it’s not clear who, if anyone, would bear legal responsibility for such misdeeds.

 “I wouldn’t say we’re cruising toward there,” Kolt says. “We’re speeding toward there.”

I checked out one of the biggest anti-AI protests yet

Pull the plug! Pull the plug! Stop the slop! Stop the slop! For a few hours this Saturday, February 28, I watched as a couple of hundred anti-AI protesters marched through London’s King’s Cross tech hub, home to the UK headquarters of OpenAI, Meta, and Google DeepMind, chanting slogans and waving signs. The march was organized by two separate activist groups, Pause AI and Pull the Plug, which billed it as the largest protest of its kind yet.

The range of concerns on show covered everything from online slop and abusive images to killer robots and human extinction. One woman wore a large homemade billboard on her head that read “WHO WILL BE WHOSE TOOL?” (with the Os in “TOOL” cut out as eye holes). There were signs that said “Pause before there’s cause” and “EXTINCTION=BAD” and “Demis the Menace” (referring to Demis Hassabis, the CEO of Google DeepMind). Another simply stated: “Stop using AI.”

An older man wearing a sandwich board that read “AI? Over my dead body” told me he was concerned about the negative impact of AI on society: “It’s about the dangers of unemployment,” he said. “The devil finds work for idle hands.”

This is all familiar stuff. Researchers have long called out the harms, both real and hypothetical, caused by generative AI—especially models such as OpenAI’s ChatGPT and Google DeepMind’s Gemini. What’s changed is that those concerns are now being taken up by protest movements that can rally significant crowds of people to take to the streets and shout about them.  

The first time I ran into anti-AI protesters was in May 2023, outside a London lecture hall where Sam Altman was speaking. Two or three people stood heckling an audience of hundreds. In June last year Pause AI, a small but international organization set up in 2023 and funded by private donors, drew a crowd of a few dozen people for a protest outside Google DeepMind’s London office. This felt like a significant escalation.

“We want people to know Pause AI exists,” Joseph Miller, who heads its UK branch and co-organized Saturday’s march, told me on a call the day before the protest: “We’ve been growing very rapidly. In fact, we also appear to be on a somewhat exponential path, matching the progress of AI itself.”

Miller is a PhD student at Oxford University, where he studies mechanistic interpretability, a new field of research that involves trying to understand exactly what goes on inside LLMs when they carry out a task. His work has led him to believe that the technology may forever be beyond our control and that this could have catastrophic consequences.

It doesn’t have to be a rogue superintelligence, he said. You just needed someone to put AI in charge of nuclear weapons. “The more silly decisions that humanity makes, the less powerful the AI has to be before things go bad,” he said.

After a week in which the US government tried to force Anthropic to let it use its LLM Claude for any “legal” military purposes, such fears seem a little less far-fetched. Anthropic stood its ground, but OpenAI signed a deal with the DOD instead. (OpenAI declined an invitation to comment on Saturday’s protest.)

For Matilda da Rui, a member of Pause AI and co-organizer of the protest, AI is the last problem that humans will face. She thinks that either the technology will allow us to solve—once and for all—every other problem that we have, or it will wipe us out and there will be nobody left to have problems anymore. “It’s a mystery to me that anyone would really focus on anything else if they actually understood the problem,” she told me.

And yet despite that urgency, the atmosphere at the march was pleasant, even fun. There was no sense of anger and little sense that lives—let alone the survival of our species—were at stake. That could be down to the broad range of interests and demands that protesters brought with them.

A chemistry researcher I met ticked off a litany of complaints, which ranged from the conspiracy-adjacent (that data centers emit infrasound below the threshold of human hearing, inducing paranoia in people who live near them) to the reasonable (that the spread of AI slop online is making it hard to find reliable academic sources). The researcher’s solution was to make it illegal for companies to profit from the technology: “If you couldn’t make money from AI, it wouldn’t be such a problem.”

Most people I spoke to agreed that technology companies probably wouldn’t take any notice of this kind of protest. “I don’t think that the pressure on companies will ever work,” Maxime Fournes, the global head of Pause AI, told me when I bumped into him at the march. “They are optimized to just not care about this problem.”

But Fournes, who worked in the AI industry for 12 years before joining Pause AI, thinks he can make it harder for those companies. “We can slow down the race by creating protection for whistleblowers or showing the public that working in AI is not a sexy job, that actually it’s a terrible job—you can dry up the talent pipeline.”

In general, most protesters hoped to make as many people as possible aware of the issues and to use that publicity to push for government regulation. The organizers had pitched the march as a social event, encouraging anyone curious about the cause to come along.

It seemed to have worked. I met a man who worked in finance who had tagged along with his roommate. I asked why he was there. “Sometimes you don’t have that much to do on a Saturday anyway,” he said. “If you can see the logic of the argument, if it sort of makes sense to you, then it’s like ‘Yeah, sure, I’ll come along.’”

He thought raising concerns around AI was hard for anyone to fully oppose. It’s not like a pro-Palestine protest, he said, where you’d have people who might disagree with the cause. “With this, I feel like it’s very hard for someone to totally oppose what you’re marching for.”

After winding its way through King’s Cross, the march ended in a church hall in Bloomsbury, where tables and chairs had been set up in rows. The protesters wrote their names on stickers, stuck them to their chests, and made awkward introductions to their neighbors. They were here to figure out how to save the world. But I had a train to catch, and I left them to it. 

Google DeepMind wants to know if chatbots are just virtue signaling

<div data-chronoton-summary="Moral scrutiny of AI chatbots
Google DeepMind researchers are calling for rigorous evaluation of large language models’ moral reasoning capabilities. They want to distinguish between genuine ethical understanding and mere performance.

Unreliable moral responses
Studies reveal LLMs can dramatically change moral stances based on minor formatting changes or user disagreement. This suggests their ethical responses may be superficial rather than deeply reasoned.

Proposed research techniques
Researchers suggest developing tests that push models to maintain consistent moral positions across different scenarios. Techniques like chain-of-thought monitoring and mechanistic interpretability could help understand AI’s moral decision-making process.

Cultural complexity of ethics
The team acknowledges the challenge of developing AI with moral competence across diverse global belief systems. They propose potential solutions like creating models that can produce multiple acceptable answers or switch between different moral frameworks.” data-chronoton-post-id=”1133299″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

Google DeepMind is calling for the moral behavior of large language models—such as what they do when called on to act as companions, therapists, medical advisors, and so on—to be scrutinized with the same kind of rigor as their ability to code or do math.

As LLMs improve, people are asking them to play more and more sensitive roles in their lives. Agents are starting to take actions on people’s behalf. LLMs may be able to influence human decision-making. And yet nobody knows how trustworthy this technology really is at such tasks.

With coding and math, you have clear-cut, correct answers that you can check, William Isaac, a research scientist at Google DeepMind, told me when I met him and Julia Haas, a fellow research scientist at the firm, for an exclusive preview of their work, which is published in Nature today. That’s not the case for moral questions, which typically have a range of acceptable answers: “Morality is an important capability but hard to evaluate,” says Isaac.

“In the moral domain, there’s no right and wrong,” adds Haas. “But it’s not by any means a free-for-all. There are better answers and there are worse answers.”

The researchers have identified several key challenges and suggested ways to address them. But it is more a wish list than a set of ready-made solutions. “They do a nice job of bringing together different perspectives,” says Vera Demberg, who studies LLMs at Saarland University in Germany.

Better than “The Ethicist”

A number of studies have shown that LLMs can show remarkable moral competence. One study published last year found that people in the US scored ethical advice from OpenAI’s GPT-4o as being more moral, trustworthy, thoughtful, and correct than advice given by the (human) writer of “The Ethicist,” a popular New York Times advice column.  

The problem is that it is hard to unpick whether such behaviors are a performance—mimicking a memorized response, say—or evidence that there is in fact some kind of moral reasoning taking place inside the model. In other words, is it virtue or virtue signaling?

This question matters because multiple studies also show just how untrustworthy LLMs can be. For a start, models can be too eager to please. They have been found to flip their answer to a moral question and say the exact opposite when a person disagrees or pushes back on their first response. Worse, the answers an LLM gives to a question can change in response to how it is presented or formatted. For example, researchers have found that models quizzed about political values can give different—sometimes opposite—answers depending on whether the questions offer multiple-choice answers or instruct the model to respond in its own words.

In an even more striking case, Demberg and her colleagues presented several LLMs, including versions of Meta’s Llama 3 and Mistral, with a series of moral dilemmas and asked them to pick which of two options was the better outcome. The researchers found that the models often reversed their choice when the labels for those two options were changed from “Case 1” and “Case 2” to “(A)” and “(B).”

They also showed that models changed their answers in response to other tiny formatting tweaks, including swapping the order of the options and ending the question with a colon instead of a question mark.

In short, the appearance of moral behavior in LLMs should not be taken at face value. Models must be probed to see how robust that moral behavior really is. “For people to trust the answers, you need to know how you got there,” says Haas.

More rigorous tests

What Haas, Isaac, and their colleagues at Google DeepMind propose is a new line of research to develop more rigorous techniques for evaluating moral competence in LLMs. This would include tests designed to push models to change their responses to moral questions. If a model flipped its moral position, it would show that it hadn’t engaged in robust moral reasoning. 

Another type of test would present models with variations of common moral problems to check whether they produce a rote response or one that’s more nuanced and relevant to the actual problem that was posed. For example, asking a model to talk through the moral implications of a complex scenario in which a man donates sperm to his son so that his son can have a child of his own might produce concerns about the social impact of allowing a man to be both biological father and biological grandfather to a child. But it should not produce concerns about incest, even though the scenario has superficial parallels with that taboo.

Haas also says that getting models to provide a trace of the steps they took to produce an answer would give some insight into whether that answer was a fluke or grounded in actual evidence. Techniques such as chain-of-thought monitoring, in which researchers listen in on a kind of internal monologue that some LLMs produce as they work, could help here too.

Another approach researchers could use to determine why a model gave a particular answer is mechanistic interpretability, which can provide small glimpses inside a model as it carries out a task. Neither chain-of-thought monitoring nor mechanistic interpretability provides perfect snapshots of a model’s workings. But the Google DeepMind team believes that combining such techniques with a wide range of rigorous tests will go a long way to figuring out exactly how far to trust LLMs with certain critical or sensitive tasks.  

Different values

And yet there’s a wider problem too. Models from major companies such as Google DeepMind are used across the world by people with different values and belief systems. The answer to a simple question like “Should I order pork chops?” should differ depending on whether or not the person asking is vegetarian or Jewish, for example.

There’s no solution to this challenge, Haas and Isaac admit. But they think that models may need to be designed either to produce a range of acceptable answers, aiming to please everyone, or to have a kind of switch that turns different moral codes on and off depending on the user.

“It’s a complex world out there,” says Haas. “We will probably need some combination of those things, because even if you’re taking just one population, there’s going to be a range of views represented.”

“It’s a fascinating paper,” says Danica Dillion at Ohio State University, who studies how large language models handle different belief systems and was not involved in the work. “Pluralism in AI is really important, and it’s one of the biggest limitations of LLMs and moral reasoning right now,” she says. “Even though they were trained on a ginormous amount of data, that data still leans heavily Western. When you probe LLMs, they do a lot better at representing Westerners’ morality than non-Westerners’.”

But it is not yet clear how we can build models that are guaranteed to have moral competence across global cultures, says Demberg. “There are these two independent questions. One is: How should it work? And, secondly, how can it technically be achieved? And I think that both of those questions are pretty open at the moment.”

For Isaac, that makes morality a new frontier for LLMs. “I think this is equally as fascinating as math and code in terms of what it means for AI progress,” he says. “You know, advancing moral competency could also mean that we’re going to see better AI systems overall that actually align with society.”