OpenAI is throwing everything into building a fully automated researcher


  • A fully automated research lab: OpenAI has set a new “North Star” — building an AI system capable of tackling large, complex scientific problems entirely on its own, with a research intern prototype due by September and a full multi-agent system planned for 2028.
  • Coding agents as a proof of concept: OpenAI’s existing tool Codex, which can already handle substantial programming tasks autonomously, is the early blueprint — the bet is that if AI can solve coding problems, it can solve almost any problem formulated in text or code.
  • Serious risks with no clean answers: Chief scientist Jakub Pachocki admits that a system this powerful running with minimal human oversight raises hard questions — with risks from hacking and misuse to bioweapons — and that chain-of-thought monitoring is the best safeguard available, for now.
  • Power concentrated in very few hands: Pachocki says governments, not just OpenAI, will need to figure out where the lines are drawn.


OpenAI is refocusing its research efforts and throwing its resources into a new grand challenge. The San Francisco firm has set its sights on building what it calls an AI researcher, a fully automated agent-based system that will be able to go off and tackle large, complex problems by itself. OpenAI says that this new research goal will be its “North Star” for the next few years, pulling together multiple research strands, including work on reasoning models, agents, and interpretability.

There’s even a timeline. OpenAI plans to build “an autonomous AI research intern”—a system that can take on a small number of specific research problems by itself—by September. The AI intern will be the precursor to a fully automated multi-agent research system that the company plans to debut in 2028. This AI researcher (OpenAI says) will be able to tackle problems that are too large or complex for humans to cope with.

Those tasks might be related to math and physics—such as coming up with new proofs or conjectures—or to life sciences like biology and chemistry, or even to business and policy dilemmas. In theory, you could throw any problem that can be formulated in text, code, or whiteboard scribbles at such a tool—which covers a lot.

OpenAI has been setting the agenda for the AI industry for years. Its early dominance with large language models shaped the technology that hundreds of millions of people use every day. But it now faces fierce competition from rival model makers like Anthropic and Google DeepMind. What OpenAI decides to build next matters—for itself and for the future of AI.   

A big part of that decision falls to Jakub Pachocki, OpenAI’s chief scientist, who sets the company’s long-term research goals. Pachocki played key roles in the development of both GPT-4, a game-changing LLM released in 2023, and so-called reasoning models, a technology that first appeared in 2024 and now underpins all major chatbots and agent-based systems. 

In an exclusive interview this week, Pachocki talked me through OpenAI’s latest vision. “I think we are getting close to a point where we’ll have models capable of working indefinitely in a coherent way just like people do,” he says. “Of course, you still want people in charge and setting the goals. But I think we will get to a point where you kind of have a whole research lab in a data center.”

Solving hard problems

Such big claims aren’t new. Saving the world by solving its hardest problems is the stated mission of all the top AI firms. Demis Hassabis told me back in 2022 that it was why he started DeepMind. Anthropic CEO Dario Amodei says he is building the equivalent of a country of geniuses in a data center. Pachocki’s boss, Sam Altman, wants to cure cancer. But Pachocki says OpenAI now has most of what it needs to get there.

In January, OpenAI released Codex, an agent-based app that can spin up code on the fly to carry out tasks on your computer. It can analyze documents, generate charts, make you a daily digest of your inbox and social media, and much more. (Other firms have released similar tools, such as Anthropic’s Claude Code and Claude Cowork.)

OpenAI claims that most of its technical staffers now use Codex in their work. You can look at Codex as a very early version of the AI researcher, says Pachocki: “I expect Codex to get fundamentally better.”

The key is to make a system that can run for longer periods of time, with less human guidance. “What we’re really looking at for an automated research intern is a system that you can delegate tasks [to] that would take a person a few days,” says Pachocki.

“There are a lot of people excited about building systems that can do more long-running scientific research,” says Doug Downey, a research scientist at the Allen Institute for AI, who is not connected to OpenAI. “I think it’s largely driven by the success of these coding agents. The fact that you can delegate quite substantial coding tasks to tools like Codex is incredibly useful and incredibly impressive. And it raises the question: Can we do similar things outside coding, in broader areas of science?”

For Pachocki, that’s a clear Yes. In fact, he thinks it’s just a matter of pushing ahead on the path we’re already on. A simple boost in all-round capability also leads to models that can work longer without help, he says. He points to the leap from 2020’s GPT-3 to 2023’s GPT-4, two of OpenAI’s previous models. GPT-4 was able to work on a problem for far longer than its predecessor, even without specialized training, he says. 

So-called reasoning models brought another bump. Training LLMs to work through problems step by step, backtracking when they make a mistake or hit a dead end, has also made models better at working for longer periods of time. And Pachocki is convinced that OpenAI’s reasoning models will continue to get better.

But OpenAI is also training its systems to work by themselves for longer by feeding them specific samples of complex tasks, such as hard puzzles taken from math and coding contests, which force the models to learn how to do things like keep track of very large chunks of text and split problems up into (and then manage) multiple subtasks.

The aim isn’t to build models that just win math competitions. “That lets you prove that the technology works before you connect it to the real world,” says Pachocki. “If we really wanted to, we could build an amazing automated mathematician. We have all the tools, and I think it would be relatively easy. But it’s not something we’re going to prioritize now because, you know, at the point where you believe you can do it, there’s much more urgent things to do.”

“We are much more focused now on research that’s relevant in the real world,” he adds.

Right now that means taking what Codex can do with coding and trying to apply that to problem-solving in general. “There’s a big change happening, especially in programming,” he says. “Our jobs are now totally different than they were even a year ago. Nobody really edits code all the time anymore. Instead, you manage a group of Codex agents.” If Codex can solve coding problems (the argument goes), it can solve any problem.

The line always goes up

It’s true that OpenAI has had a handful of remarkable successes in the last few months. Researchers have used GPT-5 (the LLM that powers Codex) to discover new solutions to a number of unsolved math problems and punch through apparent dead ends in a handful of biology, chemistry, and physics puzzles.   

“Just looking at these models coming up with ideas that would take most PhD weeks, at least, makes me expect that we’ll see much more acceleration coming from this technology in the near future,” Pachocki says.

But Pachocki admits that it’s not a done deal. He also understands why some people still have doubts about how much of a game-changer the technology really is. He thinks it depends on how people like to work and what they need to do. “I can believe some people don’t find it very useful yet,” he says.

He tells me that he didn’t even use autocomplete—the most basic version of generative coding tech—a year ago. “I’m very pedantic about my code,” he says. “I like to type it all manually in vim if I can help it.” (Vim is a text editor favored by many hardcore programmers that you interact with via dozens of keyboard shortcuts instead of a mouse.)

But that changed when he saw what the latest models could do. He still wouldn’t hand over complex design tasks, but it’s a time-saver when he just wants to try out a few ideas. “I can have it run experiments in a weekend that previously would have taken me like a week to code,” he says.

“I don’t think it is at the level where I would just let it take the reins and design the whole thing,” he adds. “But once you see it do something that would take a week to do—I mean, that’s hard to argue with.”

Pachocki’s game plan is to supercharge the existing problem-solving abilities that tools like Codex have now and apply them across the sciences.  

Downey agrees that the idea of an automated researcher is very cool: “It would be exciting if we could come back tomorrow morning and the agent’s done a bunch of work and there’s new results we can examine,” he says.

But he cautions that building such a system could be harder than Pachocki makes out. Last summer, Downey and his colleagues tested several top-tier LLMs on a range of scientific tasks. OpenAI’s latest model, GPT-5, came out on top but still made lots of errors.

“If you have to chain tasks together, then the odds that you get several of them right in succession tend to go down,” he says. Downey admits that things move fast, and he has not tested the latest versions of GPT-5 (OpenAI released GPT-5.4 two weeks ago). “So those results might already be stale,” he says. 
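
Downey’s chaining concern is simple compounding: if each step succeeds with probability p and the steps are roughly independent, then n chained steps succeed with probability p^n. A minimal sketch, using an invented 90% per-step rate rather than any measured figure:

```python
# Rough illustration of the compounding-error point: even a strong per-step
# success rate decays quickly once tasks are chained. The 0.9 figure is an
# invented example, not a measured accuracy for any model.
per_step_success = 0.9

for n_steps in (1, 3, 5, 10):
    chained = per_step_success ** n_steps  # assumes steps succeed independently
    print(f"{n_steps:2d} chained steps -> {chained:.0%} end-to-end success")

# Output:
#  1 chained steps -> 90% end-to-end success
#  3 chained steps -> 73% end-to-end success
#  5 chained steps -> 59% end-to-end success
# 10 chained steps -> 35% end-to-end success
```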

Serious unanswered questions

I asked Pachocki about the risks that may come with a system that can solve large, complex problems by itself with little human oversight. Pachocki says people at OpenAI talk about those risks all the time.

“If you believe that AI is about to substantially accelerate research, including AI research, that’s a big change in the world. That’s a big thing,” he told me. “And it comes with some serious unanswered questions. If it’s so smart and capable, if it can run an entire research program, what if it does something bad?”

The way Pachocki sees it, that could happen in a number of ways. The system could go off the rails. It could get hacked. Or it could simply misunderstand its instructions.

The best technique OpenAI has right now to address these concerns is to train its reasoning models to share details about what they are doing as they work. This approach to keeping tabs on LLMs is known as chain-of-thought monitoring.

In short, LLMs are trained to jot down notes about what they are doing in a kind of scratch pad as they step through tasks. Researchers can then use those notes to make sure a model is behaving as expected. Yesterday OpenAI published new details on how it is using chain-of-thought monitoring in-house to study Codex.

“Once we get to systems working mostly autonomously for a long time in a big data center, I think this will be something that we’re really going to depend on,” says Pachocki.

The idea would be to monitor an AI researcher’s scratch pads using other LLMs and catch unwanted behavior before it becomes a problem, rather than trying to guarantee that bad behavior can never happen in the first place—LLMs are simply not understood well enough for us to control them fully.
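
To make the mechanism concrete, here is a minimal sketch of what such a monitoring loop could look like. Everything in it is hypothetical—worker_step and monitor_model are stand-in callables, not OpenAI APIs—and real monitoring is far more involved:

```python
# Hypothetical sketch of chain-of-thought monitoring: a second model reads the
# worker's scratch-pad notes and flags suspect reasoning before it is acted on.
# `worker_step` and `monitor_model` are stand-ins, not real APIs.

def run_with_monitor(task, worker_step, monitor_model, max_steps=50):
    scratchpad = []  # the worker model's running chain-of-thought notes
    for _ in range(max_steps):
        note, done = worker_step(task, scratchpad)  # one reasoning step
        scratchpad.append(note)
        # The monitor reviews the full reasoning trace, not just the final output.
        if monitor_model(task, scratchpad) == "flag":
            return {"status": "halted", "trace": scratchpad}  # escalate to humans
        if done:
            return {"status": "completed", "trace": scratchpad}
    return {"status": "step_limit", "trace": scratchpad}

# Toy usage: a worker that finishes in one step and a monitor that never flags.
result = run_with_monitor(
    "summarize data",
    worker_step=lambda task, pad: (f"step {len(pad) + 1}: working on {task}", True),
    monitor_model=lambda task, pad: "ok",
)
print(result["status"])  # -> completed
```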

“I think it’s going to be a long time before we can really be like, okay, this problem is solved,” he says. “Until you can really trust the systems, you definitely want to have restrictions in place.” Pachocki thinks that very powerful models should be deployed in sandboxes, cut off from anything they could break or use to cause harm. 

AI tools have already been used to come up with novel cyberattacks. Some worry that they will be used to design synthetic pathogens that could be used as bioweapons. You can insert any number of evil-scientist scare stories here. “I definitely think there are worrying scenarios that we can imagine,” says Pachocki. 

“It’s going to be a very weird thing. It’s extremely concentrated power that’s in some ways unprecedented,” says Pachocki. “Imagine you get to a world where you have a data center that can do all the work that OpenAI or Google can do. Things that in the past required large human organizations would now be done by a couple of people.”

“I think this is a big challenge for governments to figure out,” he adds.

And yet some people would say governments are part of the problem. The US government wants to use AI on the battlefield, for example. The recent showdown between Anthropic and the Pentagon revealed that there is little agreement across society about where we draw red lines for how this technology should and should not be used—let alone who should draw them. In the immediate aftermath of that dispute, OpenAI stepped up to sign a deal with the Pentagon instead of its rival. The situation remains murky.

I pushed Pachocki on this. Does he really trust other people to figure it out or does he, as a key architect of the future, feel personal responsibility? “I do feel personal responsibility,” he says. “But I don’t think this can be resolved by OpenAI alone, pushing its technology in a particular way or designing its products in a particular way. We’ll definitely need a lot of involvement from policymakers.”

Where does that leave us? Are we really on a path to the kind of AI Pachocki envisions? When I asked the Allen Institute’s Downey, he laughed. “I’ve been in this field for a couple of decades and I no longer trust my predictions for how near or far certain capabilities are,” he says. 

OpenAI’s stated mission is to ensure that artificial general intelligence (a hypothetical future technology that many AI boosters believe will be able to match humans on most cognitive tasks) will benefit all of humanity. OpenAI aims to do that by being the first to build it. But the only time Pachocki mentioned AGI in our conversation, he was quick to clarify what he meant by talking about “economically transformative technology” instead.

LLMs are not like human brains, he says: “They are superficially similar to people in some ways because they’re kind of mostly trained on people talking. But they’re not formed by evolution to be really efficient.” 

“Even by 2028, I don’t expect that we’ll get systems as smart as people in all ways. I don’t think that will happen,” he adds. “But I don’t think it’s absolutely necessary. The interesting thing is you don’t need to be as smart as people in all their ways in order to be very transformative.”

Why the world doesn’t recycle more nuclear waste

The prospect of making trash useful is always fascinating to me. Whether it’s used batteries, solar panels, or spent nuclear fuel, getting use out of something destined for disposal sounds like a win all around.

In nuclear energy, figuring out what to do with waste has always been a challenge, since the material needs to be dealt with carefully. In a new story, I dug into the question of what advanced nuclear reactors will mean for spent fuel waste. New coolants, fuels, and logistics popping up in companies’ designs could require some adjustments.

My reporting also helped answer another question that was lingering in my brain: Why doesn’t the world recycle more nuclear waste?

There’s still a lot of usable uranium in spent nuclear fuel when it’s pulled out of reactors. Getting more use out of the spent fuel could cut down on both waste and the need to mine new material, but the process is costly, complicated, and not 100% effective.

France has the largest and most established reprocessing program in the world today. The La Hague plant in northern France has the capacity to reprocess about 1,700 tons of spent fuel each year.

The plant uses a process called PUREX—spent fuel is dissolved in acid and goes through chemical processing to pull out the uranium and plutonium, which are then separated. The plutonium is used to make mixed oxide (or MOX) fuel, which can be used in a mixture to fuel conventional nuclear reactors or alone as fuel in some specialized designs. And the uranium can go on to be re-enriched and used in standard low-enriched uranium fuel.

Reprocessing can cut down on the total volume of high-level nuclear waste that needs special handling, says Allison Macfarlane, director of the school of public policy and global affairs at the University of British Columbia and a former chair of the NRC.

But there’s a bit of a catch. Today, the gold standard for permanent nuclear waste storage is a geological repository, a deep underground storage facility. Heat, not volume, is often the key limiting factor for how much material can be socked away in those facilities, depending on the specific repository. And spent MOX fuel gives off much more heat than conventional spent fuel, Macfarlane says. So even if there’s a smaller volume, the material might take up as much, or even more, space in a repository. 

It’s also tricky to make this a true loop: The uranium that’s produced from reprocessing is contaminated with isotopes that can be difficult to separate, Macfarlane says. Today, France essentially saves the uranium for possible future enrichment as a sort of strategic stockpile. (Historically, it’s also exported some to Russia for enrichment.) And while MOX fuel can be used in some reactors, once it is spent, it is technically challenging to reprocess. So today, the best case is that fuel could be used twice, not infinitely.

“Every responsible analyst understands that no matter what, no matter how good your recycling process is, you’re still going to need a geological repository in the end,” says Edwin Lyman, director of nuclear power safety at the Union of Concerned Scientists.

Reprocessing also has its downsides, Lyman adds. One risk comes from the plutonium created in the process, which can be used in nuclear weapons. France handles that risk with high security and by quickly turning the plutonium into MOX fuel.

Reprocessing is also quite expensive, and uranium supply isn’t meaningfully limited. “There’s no economic benefit to reprocessing at this time,” says Paul Dickman, a former Department of Energy and NRC official.

France bears the higher cost that comes with reprocessing largely for political reasons, he says. The country has no uranium resources of its own and imports all of its supply today. Reprocessing helps ensure its energy independence: “They’re willing to pay a national security premium.”

Japan is building a spent-fuel reprocessing facility of its own, though delays have plagued the project: construction started in 1993, and the plant was originally supposed to start up by 1997. It is now expected to open by 2027.

It’s possible that new technologies could make reprocessing more appealing, and agencies like the Department of Energy should do longer-term research on advanced separation technologies, Dickman says. Some companies working on advanced reactors say they plan to use alternative reprocessing methods in their fuel cycle.

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

Can quantum computers now solve health care problems? We’ll soon find out.


  • A $5 million health care challenge: A nonprofit called Wellcome Leap is offering up to $5 million to quantum computing teams that can solve real-world health care problems classical computers can’t handle—using machines that are still noisy, error-prone, and far from perfect.
  • Hybrid computing is the real breakthrough: Facing limited quantum hardware, all six finalist teams developed clever quantum-classical hybrid approaches—offloading most work to conventional processors, then using quantum only where classical methods fall short.
  • Cancer, muscular dystrophy, and drug design are on the table: Teams are tackling problems ranging from identifying cancer origins to simulating light-activated cancer drugs to finding treatments for muscular dystrophy—applications previously impossible to model classically.
  • Even failure would count as progress: The competition’s own director doubts anyone will claim the grand prize, but says the field has already been transformed—teams now know where quantum computing can genuinely matter, even if the machines to fully prove it don’t exist yet.


I’m standing in front of a quantum computer built out of atoms and light at the UK’s National Quantum Computing Centre on the outskirts of Oxford. On a laboratory table, a complex matrix of mirrors and lenses surrounds a Rubik’s Cube–size cell where 100 cesium atoms are suspended in grid formation by a carefully manipulated laser beam. 

The cesium atom setup is so compact that I could pick it up, carry it out of the lab, and put it on the backseat of my car to take home. I’d be unlikely to get very far, though. It’s small but powerful—and so it’s very valuable. Infleqtion, the Colorado-based company that owns it, is hoping the machine’s abilities will win $5 million next week, at an event to be held in Marina del Rey, California. 

Infleqtion is one of six teams that have made it to the final stage of a 30-month-long quantum computing competition called Quantum for Bio (Q4Bio). Run by the nonprofit Wellcome Leap, it aims to show that today’s quantum computers, though messy and error-prone and far from the large-scale machines engineers hope to build, could actually benefit human health. Success would be a significant step forward in proving the worth of quantum computers. But for now, it turns out, that worth seems to be linked to harnessing and improving the performance of conventional (also called classical) computers in tandem, creating a quantum-classical hybrid that can exceed what’s possible on classical machines by themselves.

There are two prize categories. A prize of $2 million will go to any and all teams that can run a significantly useful health care algorithm on computers with 50 or more qubits (a qubit is the basic processing unit in a quantum computer). To win the $5 million grand prize, a team must successfully run a quantum algorithm that solves a significant real-world problem in health care, and the work must use 100 or more qubits. Winners have to meet strict performance criteria, and they must solve a health care problem that can’t be solved with conventional computers—a tough task.

Despite the scale of the challenge, most of the teams think some of this money could be theirs. “I think we’re in with a good shout,” says Jonathan D. Hirst, a computational chemist at the University of Nottingham, UK. “We’re very firmly within the criteria for the $2 million prize,” says Stanford University’s Grant Rotskoff, whose collaboration is investigating the quantum properties of the ATP molecule that powers biological cells. 

The grand prize is perhaps less of a sure thing. “This is really at the very edge of doable,” Rotskoff says. Insiders say the challenge is so difficult, given the state of quantum computing technology, that much of the money could stay in Wellcome Leap’s account. 

With most of the Q4Bio work unpublished and protected by NDAs, and the quantum computing field already rife with claims and counterclaims about performance and achievements, only the judges will be in a position to decide who’s right. 

A hybrid solution

The idea behind quantum computers is that they can use small-scale objects that obey the laws of quantum mechanics, such as atoms and photons of light, to simulate real-world processes too complex to model on our everyday classical machines.

Researchers have been working for decades to build such systems, which could deliver insights for creating new materials, developing pharmaceuticals, and improving chemical processes such as fertilizer production. But dealing with quantum stuff like atoms is excruciatingly difficult. The biggest, shiniest applications require huge, robust machines capable of withstanding the environmental “noise” that can very easily disrupt delicate quantum systems. We don’t have those yet—and it’s unclear when we will.

Wellcome Leap wanted to find out if the smaller-scale machines we have today can be made to do something—anything—useful for health care while we wait for the era of powerful, large-scale quantum computers. The group started the competition in 2024, offering $1.5 million in funding to each of the 12 selected teams.

The six Q4Bio finalists have taken a range of approaches. Crucially, they’ve all come up with ingenious ways to overcome quantum computing’s drawbacks. Faced with noisy, limited machines, they have learned how to outsource much of the computational load to classical processors running newly developed algorithms that are, in many cases, better than the previous state of the art. The quantum processors are then required only for the parts of the problem where classical methods don’t scale well enough as the calculation gets bigger.
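
In outline, that division of labor looks something like the sketch below—a runnable toy, not any team’s real pipeline, with every function written as an illustrative placeholder:

```python
# A toy showing the shape of the quantum-classical hybrid loop. Every function
# here is a placeholder written for illustration; the finalists' real pipelines
# are mostly unpublished and differ from team to team.

def classical_preprocess(problem: list[float]) -> list[float]:
    # Classical code shrinks the instance (filtering, pruning, symmetry
    # reduction) before any quantum hardware is touched.
    return [x for x in problem if abs(x) > 1e-6]

def quantum_subroutine(instance: list[float]) -> float:
    # Stand-in for the one kernel that scales badly classically and would be
    # dispatched to a small, noisy quantum processor (on the order of 100 qubits).
    return sum(instance)  # stub: a real pipeline runs a circuit here

def solve_hybrid(problem: list[float]) -> float:
    reduced = classical_preprocess(problem)  # classical does the bulk of the work
    core = quantum_subroutine(reduced)       # quantum only where classical scales poorly
    return core / max(len(reduced), 1)       # classical postprocessing of raw results

print(solve_hybrid([0.0, 1.5, 2.5, 0.0, 4.0]))  # -> 2.666...
```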

For example, a team led by Sergii Strelchuk of Oxford University is using a quantum computer to map genetic diversity among humans and pathogens on complex graph-based structures. These will—the researchers hope—expose hidden connections and potential treatment pathways. “You can think about it as a platform for solving difficult problems in computational genomics,” Strelchuk says. 

The corresponding classical tools struggle with even a modest scale-up to large databases. Strelchuk’s team has built an automated pipeline that determines whether classical solvers will struggle with a particular problem and how a quantum algorithm might reformulate the data so that it becomes solvable on a classical computer—or tractable on a noisy quantum one. “You can do all this before you start spending money on computing,” Strelchuk says.

In collaboration with Cleveland Clinic, Helsinki-based Algorithmiq has used a superconducting quantum computer built by IBM to simulate a cancer drug that is triggered by specific types of light. “The idea is you take the drug, and it’s everywhere in your body, but it’s doing nothing, just sitting there, until there’s light on it of a certain wavelength,” says Guillermo García-Pérez, Algorithmiq’s chief scientific officer. Then it acts as a molecular bullet, attacking the tumor only at the location in the body where that light is directed. 

The drug with which Algorithmiq began its work is already in phase II clinical trials for treating bladder cancers. The quantum-computed simulation, which adapts and improves on classical algorithms, will allow it to be redesigned for treating other conditions. “It has remained a niche treatment precisely because it can’t be simulated classically,” says Sabrina Maniscalco, Algorithmiq’s CEO and cofounder. 

Maniscalco, who is also confident of walking away from the competition with prize money, believes the methods used to create the algorithm will have wide applications: “What we’ve done in the period of the Q4Bio program is something unique that can change how to simulate chemistry for health care and life sciences.”

Infleqtion’s entry, running on its cesium-powered machine, is an effort to improve the identification of cancer signatures in medical data. Together with collaborators at the University of Chicago and MIT, the company’s scientists have developed a quantum algorithm that mines huge data sets such as the Cancer Genome Atlas. 

The aim is to find patterns that allow clinicians to determine factors such as the likely origin of a patient’s metastasized cancer. “It’s very important to know where it came from because that can inform the best treatment,” says Teague Tomesh, a quantum software engineer who is Infleqtion’s Q4Bio project lead.

Unfortunately, those patterns are hidden inside data sets so large that they overwhelm classical solvers. Infleqtion uses the quantum computer to find correlations in the data that can reduce the size of the computation. “Then we hand the reduced problem back to the classical solver,” Tomesh says. “I’m basically trying to use the best of my quantum and my classical resources.”

The Nottingham-based team, meanwhile, is using quantum computing to nail down a drug candidate that can cure myotonic dystrophy, the most common adult-onset form of muscular dystrophy. One member of the team, David Brook, played a role in identifying the gene behind the condition in 1992. More than 30 years later, Brook, Hirst, and the others in their group—which includes QuEra, a Boston company developing a quantum computer based on neutral atoms—have used quantum computation to work out how drugs can form chemical bonds with the protein that brings on the disease, blocking the mechanism that causes the problem.

Low expectations 

The entrants’ confidence might be high, but Shihan Sajeed’s is much lower. Sajeed, a quantum computing entrepreneur based in Waterloo, Ontario, is program director for Q4Bio. He believes the error-prone quantum machines the researchers must work with are unlikely to deliver on all the grand prize criteria. “It is very difficult to achieve something with a noisy quantum computer that a classical machine can’t do,” he says.

That said, he has been surprised by the progress. “When we started the program, people didn’t know about any use cases where quantum can definitely impact biology,” he says. But the teams have found promising applications, he adds: “We now know the fields where quantum can matter.” 

And the developments in “hybrid quantum-classical” processing that the entrants are using are “transformational,” Sajeed reckons.

Will it be enough to make him part with Wellcome Leap’s money? That’s down to a judging panel, whose members’ identities are a closely guarded secret to ensure that no one tailors their presentation to a particular kind of approach. But we won’t know the outcome for a while; the winner, or winners, will be announced in mid-April. 

If it does turn out that there are no winners, Sajeed has some words of comfort for the competitors. The goal has always been about running a useful algorithm on a machine that exists today, he points out; missing the mark doesn’t mean your algorithm won’t be useful on a future quantum computer. “It just means the machine you need doesn’t exist yet.”

What do new nuclear reactors mean for waste?

MIT Technology Review Explains: Let our writers untangle the complex, messy world of technology to help you understand what’s coming next. You can read more from the series here.

The way the world currently deals with nuclear waste is as creative as it is varied: Drown it in water pools, encase it in steel, bury it hundreds of meters underground. 

These methods are how the nuclear industry safely manages the 10,000 metric tons of spent fuel waste that reactors produce as they churn out 10% of the world’s electricity every year. But as new nuclear designs emerge, they could introduce new wrinkles for nuclear waste management.  

Most operating reactors at nuclear power plants today follow a similar basic blueprint: They’re fueled with low-enriched uranium and cooled with water, and they’re mostly gigantic, sited at central power plants. But a large menu of new reactor designs that could come online in the next few years will likely require tweaks to ensure that existing systems can handle their waste.

“There’s no one answer about whether this panoply of new reactors and fuel types are going to make waste management any easier,” says Edwin Lyman, director of nuclear power safety at the Union of Concerned Scientists.

A nuclear disposal playbook

Nuclear waste can be roughly split into two categories: low-level waste, like contaminated protective equipment from hospitals and research centers, and high-level waste, which requires more careful handling.

The vast majority by volume is low-level waste. This material can be stored onsite and often, once its radioactivity has decayed enough, largely handled like regular trash (with some additional precautions). High-level waste, on the other hand, is much more radioactive and often quite hot. This second category consists largely of spent fuel, a combination of materials including uranium-235, which is the fissile portion of nuclear fuel—the part that can sustain the chain reaction required for nuclear power plants to work. The material also contains fission products—the sometimes radioactive by-products created when atoms split and release energy.

Many experts agree that the best long-term solution for spent fuel and other high-level nuclear waste is a geologic repository—essentially, a very deep, very carefully managed hole in the ground. Finland is the furthest along with plans to build one, and its site on the southwest coast of the country should be operational this year.

The US designated a site for a geological repository in the 1980s, but political conflict has stalled progress. So today, used fuel in the US is stored onsite at operational and shuttered nuclear power plants. Once it’s removed from a reactor, it’s typically placed into wet storage, essentially submerged in pools of water to cool down. The material can then be put in protective cement and steel containers called dry casks, a stage known as dry storage.

Experts say the industry won’t need to entirely rewrite this playbook for the new reactor designs.  

“The way we’re going to manage spent fuel is going to be largely the same,” says Erik Cothron, manager of research and strategy at the Nuclear Innovation Alliance, a nonprofit think tank focused on the nuclear industry. “I don’t stay up late at night worried about how we’re going to manage spent fuel.” 

But new designs and materials could require some engineering solutions. And there’s a huge range of reactor designs, meaning there’s an equally wide range of potential waste types to handle.

Unusual waste

Some new nuclear reactors will look quite similar to operating models, so their spent fuel will be managed in much the same way that it is today. But others use novel materials as coolants and fuels. 

“Unusual materials will create unusual waste,” says Syed Bahauddin Alam, an assistant professor of nuclear, plasma, and radiological engineering at the University of Illinois Urbana-Champaign.

Some advanced designs could increase the volume of material that needs to be handled as high-level waste. Take reactors that use TRISO (tri-structural isotropic) fuel, for example. TRISO contains a uranium kernel surrounded by several layers of protective material and then embedded in graphite shells. The graphite that encases TRISO will likely be lumped together with the rest of the spent fuel, making the waste much bulkier than current fuel.

Today, separating those layers would be difficult and expensive, according to a 2024 report from the Nuclear Innovation Alliance. That means the entire package would be lumped together as high-level waste.  

The company X-energy is designing high-temperature gas-cooled reactors that use TRISO fuel. It has already submitted plans for dealing with spent fuel to the Nuclear Regulatory Commission, which oversees reactors in the US. The fuel’s form could actually help with waste management: The protective shells used in TRISO eliminate X-energy’s need for wet storage, allowing for dry storage from day one, according to the company.

Liquid-fueled molten-salt reactors, another new type, could increase waste volume too. In these designs, fuel and coolant are not kept separate as in most reactors; instead, the fuel is dissolved directly into a molten salt that’s used as the coolant. That means the entire vat of molten salt would need to be handled as high-level waste.

On the other hand, some other reactor designs could produce a smaller volume of spent fuel, but that isn’t necessarily a smaller problem. Fast reactors, for example, achieve a higher burn-up, consuming more of the fissile material and extracting more energy from their fuel. That means spent fuel from these reactors typically has a higher concentration of fission products and emits more heat. And that heat could be the killer factor for designing waste solutions. 

Spent fuel needs to be kept relatively cool, so it doesn’t melt and release hazardous by-products. Too much heat in a repository could also damage the surrounding rock. “Heat is what really drives how much you can put inside a repository,” says Paul Dickman, a former Department of Energy and NRC official.

Some spent fuel could require chemical processing prior to disposal, says Allison Macfarlane, director of the school of public policy and global affairs at the University of British Columbia and a former chair of the NRC. That could add complication and cost.

In fast reactors cooled by sodium metal, for example, the coolant can get into the fuel and fuse to its casing. Separation could be tricky, and sodium is highly reactive with water, so the spent fuel will require specialized treatment.

TerraPower’s Natrium reactor, a sodium fast reactor that received a construction permit from the NRC in early March, is designed to safely manage this challenge, says Jeffrey Miller, senior vice president for business development at TerraPower. The company has a plan to blow nitrogen over the material before it’s put into wet storage pools, removing the sodium.

Location, location, location

Regardless of what materials are used, even just changing the size of reactors and where they’re sited could introduce complications for waste management. 

Some new reactors are essentially smaller versions of the large reactors used today. These small modular reactors and microreactors may produce waste that can be handled in the same way as waste from today’s conventional reactors. But for places like the US, where waste is stored onsite, it would be impractical to have a large number of small sites, each hosting its own waste.

Some companies are looking at sending their microreactors, and the waste material they produce, back to a single location, potentially the same one where reactors are manufactured.

Companies should be required to think carefully about waste and to design in management protocols, and they should be held responsible for the waste they produce, UBC’s Macfarlane says.

She also notes that so far, planning for waste has relied on research and modeling, and the reality will become clear only once the reactors are actually operational. As she puts it: “These reactors don’t exist yet, so we don’t really know a whole lot, in great gory detail, about the waste they’re going to produce.”

The Pentagon is planning for AI companies to train on classified data, defense official says

The Pentagon is discussing plans to set up secure environments for generative AI companies to train military-specific versions of their models on classified data, MIT Technology Review has learned. 

AI models like Anthropic’s Claude are already used to answer questions in classified settings; applications include analyzing targets in Iran. But allowing models to train on and learn from classified data would be a new development that presents unique security risks. It would mean sensitive intelligence like surveillance reports or battlefield assessments could become embedded into the models themselves, and it would bring AI firms into closer contact with classified data than before. 

Training versions of AI models on classified data is expected to make them more accurate and effective at certain tasks, according to a US defense official who spoke on background with MIT Technology Review. The news comes as demand for more powerful models is high: The Pentagon has reached agreements with OpenAI and Elon Musk’s xAI to operate their models in classified settings and is implementing a new agenda to become “an ‘AI-first’ warfighting force” as the conflict with Iran escalates. (The Pentagon did not comment on its AI training plans as of publication time.)

Training would be done in a secure data center that’s accredited to host classified government projects, and where a copy of an AI model is paired with classified data, according to two people familiar with how such operations work. Though the Department of Defense would remain the owner of the data, personnel from AI companies might in rare cases access the data if they have appropriate security clearance, the official said. 

Before allowing this new training, though, the official said, the Pentagon intends to evaluate how accurate and effective models are when trained on nonclassified data, like commercially available satellite imagery. 

The military has long used computer vision models, an older form of AI, to identify objects in images and footage it collects from drones and airplanes, and federal agencies have awarded contracts to companies to train AI models on such content. And AI companies building large language models (LLMs) and chatbots have created versions of their models fine-tuned for government work, like Anthropic’s Claude Gov, which are designed to operate across more languages and in secure environments. But the official’s comments are the first indication that AI companies building LLMs, like OpenAI and xAI, could train government-specific versions of their models directly on classified data.

Aalok Mehta, who directs the Wadhwani AI Center at the Center for Strategic and International Studies and previously led AI policy efforts at Google and OpenAI, says training on classified data, as opposed to just answering questions about it, would present new risks. 

The biggest of these, he says, is that classified information these models train on could be resurfaced to anyone using the model. That would be a problem if lots of different military departments, all with different classification levels and needs for information, were to share the same AI. 

“You can imagine, for example, a model that has access to some sort of sensitive human intelligence—like the name of an operative—leaking that information to a part of the Defense Department that isn’t supposed to have access to that information,” Mehta says. That could create a security risk for the operative, one that’s difficult to perfectly mitigate if a particular model is used by more than one group within the military.

However, Mehta says, it’s not as hard to keep information contained from the broader world: “If you set this up right, you will have very little risk of that data being surfaced on the general internet or back to OpenAI.” The government has some of the infrastructure for this already; the security giant Palantir has won sizable contracts for building a secure environment through which officials can ask AI models about classified topics without sending the information back to AI companies. But using these systems for training is still a new challenge. 

The Pentagon, spurred by a memo from Defense Secretary Pete Hegseth in January, has been racing to incorporate more AI. It has been used in combat, where generative AI has ranked lists of targets and recommended which to strike first, and in more administrative roles, like drafting contracts and reports.

There are lots of tasks currently handled by human analysts that the military might want to train leading AI models to perform, and doing so would require access to classified data, Mehta says. That could include learning to identify subtle clues in an image the way an analyst does, or connecting new information with historical context. The classified data could be pulled from the unfathomable amounts of text, audio, images, and video, in many languages, that intelligence services collect.

It’s really hard to say which specific military tasks would require AI models to train on such data, Mehta cautions, “because obviously the Defense Department has lots of incentives to keep that information confidential, and they don’t want other countries to know what kind of capabilities we have exactly in that space.”

If you have information about the military’s use of AI, you can share it securely via Signal (username jamesodonnell.22).

Where OpenAI’s technology could show up in Iran

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

It’s been just over two weeks since OpenAI reached a controversial agreement to allow the Pentagon to use its AI in classified environments. There are still pressing questions about what exactly OpenAI’s agreement allows for; Sam Altman said the military can’t use his company’s technology to build autonomous weapons, but the agreement really just demands that the military follow its own (quite permissive) guidelines about such weapons. OpenAI’s other main claim, that the agreement will prevent use of its technology for domestic surveillance, appears equally dubious.

It’s unclear what OpenAI’s motivations are. It’s not the first tech giant to embrace military contracts it had once vowed never to enter into, but the speed of the pivot was notable. Perhaps it’s just about money; OpenAI is spending lots on AI training and is on the hunt for more revenue (from sources including ads). Or perhaps Altman truly believes the ideological framing he often invokes: that liberal democracies (and their militaries) must have access to the most powerful AI to compete with China.

The more consequential question is what happens next. OpenAI has decided it is comfortable operating right in the messy heart of combat, just as the US escalates its strikes against Iran (with AI playing a larger role in that than ever before). So where exactly could OpenAI’s tech show up in this fight? And which applications will its customers (and employees) tolerate?

Targets and strikes

Though its Pentagon agreement is in place, it’s unclear when OpenAI’s technology will be ready for classified environments, since it must be integrated with other tools the military uses (Elon Musk’s xAI, which recently struck its own deal with the Pentagon, is expected to go through the same process with its AI model Grok). But there’s pressure to move quickly because of controversy around the technology in use to date: After Anthropic refused to grant the Pentagon blanket permission to use its AI for “any lawful use,” President Trump ordered the military to stop using it, and Anthropic was designated a supply chain risk by the Pentagon. (Anthropic is fighting the designation in court.)

If the Iran conflict is still underway by the time OpenAI’s tech is in the system, what could it be used for? A recent conversation I had with a defense official suggests it might look something like this: A human analyst could put a list of potential targets into the AI model and ask it to analyze the information and prioritize which to strike first. The model could account for logistics information, like where particular planes or supplies are located. It could analyze lots of different inputs in the form of text, image, and video. 

A human would then be responsible for manually checking these outputs, the official said. But that raises an obvious question: If a person is truly double-checking AI’s outputs, how is it speeding up targeting and strike decisions?

For years the military has been using another AI system, called Maven, which can handle things like automatically analyzing drone footage to identify possible targets. It’s likely that OpenAI’s models, like Anthropic’s Claude, will offer a conversational interface on top of that, allowing users to ask for interpretations of intelligence and recommendations for which targets to strike first. 

It’s hard to overstate how new this is: AI has long done analysis for the military, drawing insights out of oceans of data. But using generative AI’s advice about which actions to take in the field is being tested in earnest for the first time in Iran.

Drone defense

At the end of 2024, OpenAI announced a partnership with Anduril, which makes both drones and counter-drone technologies for the military. The agreement said OpenAI would work with Anduril to do time-sensitive analysis of drones attacking US forces and help take them down. An OpenAI spokesperson told me at the time that this didn’t violate the company’s policies, which prohibited “systems designed to harm others,” because the technology was being used to target drones and not people. 

Anduril provides a suite of counter-drone technologies to military bases around the world (though the company declined to tell me whether its systems are deployed near Iran). Neither company has provided updates on how the project has developed since it was announced. However, Anduril has long trained its own AI models to analyze camera footage and sensor data to identify threats; what it focuses less on are conversational AI systems that allow soldiers to query those systems directly or receive guidance in natural language—an area where OpenAI’s models may fit.

The stakes are high. Six US service members were killed in Kuwait on March 1 following an Iranian drone attack that was not intercepted by US air defenses. 

Anduril’s interface, called Lattice, is where soldiers can control everything from drone defenses to missiles and autonomous submarines. And the company is winning massive contracts—$20 billion from the US Army just last week—to connect its systems with legacy military equipment and layer AI on them. If OpenAI’s models prove useful to Anduril, Lattice is designed to incorporate them quickly across this broader warfare stack. 

Back-office AI

In December, Defense Secretary Pete Hegseth started encouraging millions of people in more administrative roles in the military—contracts, logistics, purchasing—to use a new AI tool. Called GenAI.mil, it provided a way for personnel to securely access commercial AI models and use them for the same sorts of things as anyone in the business world. 

Google Gemini was one of the first to be available. In January, the Pentagon announced that xAI’s Grok was going to be added to the GenAI.mil platform as well, despite incidents in which the model had spread antisemitic content and created nonconsensual deepfakes. OpenAI followed in February, with the company announcing that its models would be used for drafting policy documents and contracts and assisting with administrative support of missions.

Anyone using ChatGPT for unclassified tasks on this platform is unlikely to have much sway over sensitive decisions in Iran, but the prospect of OpenAI deploying on the platform is important in another way. It serves the all-in attitude toward AI that Hegseth has been pushing relentlessly across the Pentagon (even if many early users aren’t entirely sure what they’re supposed to use it for). The message is that AI is transforming every aspect of how the US fights, from targeting decisions down to paperwork. And OpenAI is increasingly winning a piece of it all.

Future AI chips could be built on glass

Human-made glass is thousands of years old. But it’s now poised to find its way into the AI chips used in the world’s newest and largest data centers. This year, a South Korean company called Absolics is planning to start commercial production of special glass panels designed to make next-generation computing hardware more powerful and energy efficient. Other companies, including Intel, are also pushing forward in this area. If all goes well, such glass technology could reduce the energy demands of the sorts of high-performance computing chips used in AI data centers—and it could eventually do the same for consumer laptops and mobile devices if production costs fall.

The idea is to use glass as the substrate, or layer, on which multiple silicon chips are connected. This form of “packaging” is an increasingly popular way to build computing hardware, because it lets engineers combine specialized chips designed for specific functions into a single system. But it presents challenges, including the fact that hardworking chips can run so hot they physically warp the substrate they’re built on. This can lead to misaligned components and may reduce how efficiently the chips can be cooled, leading to damage or premature failure. 

“As AI workloads surge and package sizes expand, the industry is confronting very real mechanical constraints that impact the trajectory of high-performance computing,” says Deepak Kulkarni, a senior fellow at the chip design company Advanced Micro Devices (AMD). “One of the most fundamental is warpage.”

That’s where glass comes in. It can handle the added heat better than existing substrates, and it will let engineers keep shrinking chip packages—which will make them faster and more energy efficient. It “unlocks the ability to keep scaling package footprints without hitting a mechanical wall,” says Kulkarni. 

Momentum is building behind the shift. Absolics has finished building a factory in the US that is dedicated to producing glass substrates for advanced chips and expects to begin commercial manufacturing this year. The US semiconductor manufacturer Intel is working toward incorporating glass in its next-generation chip packages, and its research has spurred other companies in the chip packaging supply chain to invest in it as well. South Korean and Chinese companies are among the early adopters. “Historically, this is not the first attempt to adopt glass in semiconductor packaging,” says Bilal Hachemi, senior technology and market analyst at the market research firm Yole Group. “But this time, the ecosystem is more solid and wider; the need for glass-based [technology] is sharper.” 

Fragile but mighty

Chip packaging has relied on organic substrates such as fiberglass-reinforced epoxy since the 1990s, says Rahul Manepalli, vice president of advanced packaging at Intel. But electrochemical complications limit how closely designers can place drilled holes to create copper-coated signal and power connections between the chips and the rest of the system. Chip designers must also account for the unpredictable shrinkage and distortion that organic substrates undergo as chips heat up and cool down. “We realized about a decade ago that we are going to have some limitations with organic substrates,” says Manepalli.

These glass substrate test units were photographed at an Intel facility in Chandler, Arizona, in 2023.
INTEL CORPORATION

Glass may help overcome a lot of these limitations. Its thermal stability could allow engineers to create 10 times more connections per millimeter than organic substrates, says Manepalli. With denser connections, Intel’s designers can then stuff 50% more silicon chips into the same package area, improving computational capability. The denser connections also enable more efficient routing for the copper wires that deliver power to the chip. And the fact that glass dissipates heat more efficiently allows for chip designs that reduce overall power consumption. 

“The benefits of glass core substrates are undeniable,” says Manepalli. “It’s clear that the benefits will drive the industry to make this happen sooner rather than later, and we want to be one of the first ones who do it.” 

However, working with glass creates its own challenges. For one thing, it’s fragile. Glass substrates for data center chip packages are made from panels that are only about 700 micrometers to 1.4 millimeters thick, which leaves them susceptible to cracking or even shattering, says Manepalli. Researchers at Intel and other organizations have spent years figuring out how to use other materials and special tools to integrate the glass panels safely into semiconductor manufacturing processes. 

Now, Manepalli says, Intel’s research and development teams are reliably fabricating glass panels and churning out test chip packages that incorporate glass—and in early 2025 they demonstrated that a functional device with a glass core substrate could boot up the Windows operating system. It’s a significant improvement from the early testing days, when hundreds of glass panels got cracked every couple of days, he says.

Semiconductor manufacturers already use glass for more limited purposes, such as temporary support structures for silicon wafers. But the independent market research firm IDTechEx estimates there’s a big market for glass substrates, one that could boost the semiconductor market for glass from $1 billion in 2025 to as much as $4.4 billion by 2036. 

The material could have additional benefits if it takes off. Glass can be made astoundingly smooth—5,000 times smoother than organic substrates. This would eliminate defects that can arise as metal gets layered onto semiconductors, says Xiaoxi He, a research analyst at IDTechEx. Defects in these layers can worsen chips’ performance or even render them unusable.  

Glass could also help speed the movement of data. The material can guide light, which means chip designers could use it to build high-speed signal pathways directly into the substrate. Glass “holds enormous potential for the future of energy-efficient AI compute,” says Kulkarni at AMD, because a light-based system could move signals around with far less energy than the “power-hungry” copper pathways that are currently used to carry signals between chips in a package.

A panel pivot

Early research on glass packaging started at the 3D Systems Packaging Research Center at the Georgia Institute of Technology in 2009. The university eventually partnered with Absolics, a subsidiary of SKC, a South Korean company that produces chemicals and advanced materials. SKC constructed a semiconductor facility for manufacturing glass substrates in Covington, Georgia, in 2024, and that same year the glass substrate partnership between Absolics and Georgia Tech was awarded two grants—worth a combined $175 million—through the US government’s CHIPS for America program, established under the administration of President Joe Biden.

An Absolics employee monitors production of an early version of the company’s glass substrate.
COURTESY OF ABSOLICS INC

Now Absolics is moving toward commercialization; it plans to start manufacturing small quantities of glass substrates for customers this year. The company has led the way in commercializing glass substrates, says Yongwon Lee, a research engineer at Georgia Tech who is not directly involved in the commercial partnership with Absolics.

Absolics says its facility can currently produce a maximum of 12,000 square meters of glass panels a year. That’s enough, Lee estimates, to provide glass substrates for between 2 million and 3 million chip packages the size of Nvidia’s H100 GPU.
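A rough calculation shows what that capacity implies per package. This is a sketch of the arithmetic behind Lee's estimate, not his or Absolics's actual yield model:

```python
# Sanity check: 12,000 m^2 of glass panels spread across 2-3 million
# chip packages per year. Illustrative arithmetic only; a real yield
# model would also account for cutting waste and defective panels.
panel_area_cm2 = 12_000 * 10_000    # 12,000 m^2, at 10,000 cm^2 per m^2
low, high = 2_000_000, 3_000_000    # packages per year, Lee's range

print(f"{panel_area_cm2 / high:.0f}-{panel_area_cm2 / low:.0f} cm^2 of glass per package")
# -> 40-60 cm^2, roughly the footprint of a large GPU package substrate
```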

But the company isn’t alone. Lee says that multiple large manufacturers, including Samsung Electronics, Samsung Electro-Mechanics, and LG Innotek, have “significantly accelerated” their research and pilot production efforts in glass packaging over the past year. “This trend suggests that the glass substrate ecosystem is evolving from a single early mover to a broader industrial race,” he says.

Other companies are pivoting to play more specialized roles in the glass substrate supply chain. In 2025, JNTC, a company that makes electrical connectors and tempered glass for electronics, established a facility in South Korea that’s capable of producing 10,000 semi-finished glass panels per month. Such panels include drilled holes for vertical electrical connections (known as through-glass vias) and thin metal layers coating the glass, but they require additional manufacturing work before they can be installed in chip packages.

Last year, that South Korean facility began taking orders to supply semi-finished glass to both specialized substrate companies and semiconductor manufacturers. The company plans to expand the facility’s production in 2026 and open an additional manufacturing line in Vietnam in 2027. Moves like these show how quickly glass substrate technology is progressing from prototype to commercialization—and how many tech players are betting that glass could be a surprisingly strong foundation for the future of computing and AI.

Brutal times for the US battery industry

Just a few years ago, the battery industry was hot, hot, hot. There was a seemingly infinite number of companies popping up, with shiny new chemistries and massive fundraising rounds. My biggest problem was sifting through the pile to pick the most exciting news to cover.

That tide has turned, and in 2026, what’s in seemingly unlimited supply isn’t battery success stories but stumbles and straight-up implosions. Companies are failing, investors are pulling back, and batteries, especially for EVs, aren’t looking so hot anymore. On Monday, Steve Levine at The Information (paywalled link) reported that 24M Technologies, a battery company founded in 2010, was shutting down and would auction off its property.

The company itself has been silent, but this is the latest in a string of bad signs, and it’s a big one—at one point 24M was worth over $1 billion, and the company’s innovations could have worked with existing technology. So where does that leave the battery industry?

Many buzzy battery startups in recent years have been trying to sell some new, innovative chemistry to compete with lithium-ion batteries, the status quo that powers phones, laptops, electric vehicles, and even grid storage arrays today. Think sodium-ion batteries and solid-state cells.

24M wasn’t trying to sell a departure from lithium-ion but improvements that could work with the tech. One of the company’s major innovations was its manufacturing process, which involved essentially smearing materials onto sheets of metal to form the electrodes, a simpler and potentially cheaper technique than the standard one. 

The layers in the company’s batteries were thicker, which cut down on some of the inactive materials in cells and improved the energy density. That allows more energy to be stored in a smaller package, boosting the range of EVs—the company famously had a goal of a 1,000-mile battery (about 1,600 kilometers).

We’re still thin on details of what exactly went down at 24M and what comes next for its tech. The company didn’t respond to questions sent to its official press email, and nobody picked up the phone when I called. 24M cofounder and MIT professor Yet-Ming Chiang declined to speak on the record.

For those who have been closely following the battery industry, more bad news isn’t too surprising. It feels as if everyone is short on money these days, and as purse strings tighten, there’s less interest in novel ideas. “It just feels like there’s not a lot of appetite for innovation,” says Kara Rodby, a technical principal at Volta Energy Technologies, a venture capital firm that focuses on the energy storage industry.

Natron Energy, one of the leading sodium-ion startups in the US, shut down operations in September last year. Ample, an EV battery-swapping company, filed for bankruptcy in December 2025.  

There were always going to be failures from the recent battery boom. Money was flowing to all sorts of companies, some pitching truly wild ideas. But what recent months have made clear is that the battery market is turning brutal, even for the relatively safe bets.

Because 24M’s technology was designed to work with existing lithium-ion chemistry, it could have been an attractive candidate for existing battery companies to license or even acquire. “It’s a great example of something that should have been easier,” Rodby says.

The gutting of major components of the Inflation Reduction Act, key legislation in the US that provided funding and incentives for batteries and EVs, certainly hasn’t helped. The EV market in the US is cooling off, with automakers canceling EV models and slashing factory plans.

There are bright spots. China’s battery industry is thriving, and its battery and EV giants are looking ever more dominant. The market for stationary energy storage is also still showing signs of growth, even in the US.

But overall, it’s not looking great. 

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

A defense official reveals how AI chatbots could be used for targeting decisions

The US military might use generative AI systems to rank lists of targets and make recommendations—which would be vetted by humans—about which to strike first, according to a Defense Department official with knowledge of the matter. The disclosure about how the military may use AI chatbots comes as the Pentagon faces scrutiny over a strike on an Iranian school, which it is still investigating.  

A list of possible targets might be fed into a generative AI system that the Pentagon is fielding for classified settings. Then, said the official, who requested to speak on background with MIT Technology Review to discuss sensitive topics, humans might ask the system to analyze the information and prioritize the targets while accounting for factors like where aircraft are currently located. Humans would then be responsible for checking and evaluating the results and recommendations. OpenAI’s ChatGPT and xAI’s Grok could, in theory, be the models used for this type of scenario in the future, as both companies recently reached agreements for their models to be used by the Pentagon in classified settings.

The official described this as an example of how things might work but would not confirm or deny whether it represents how AI systems are currently being used.

Other outlets have reported that Anthropic’s Claude has been integrated into existing military AI systems and used in operations in Iran and Venezuela, but the official’s comments add insight into the specific role chatbots may play, particularly in accelerating the search for targets. They also shed light on the way the military is deploying two different AI technologies, each with distinct limitations.

Since at least 2017, the US military has been working on a “big data” initiative called Maven. It uses older types of AI, particularly computer vision, to analyze the oceans of data and imagery collected by the Pentagon. Maven might take thousands of hours of aerial drone footage, for example, and algorithmically identify targets. A 2024 report from Georgetown University showed soldiers using the system to select and vet targets, which sped up the approval process for those targets. Soldiers interacted with Maven through an interface with a battlefield map and dashboard, which might highlight potential targets in one color and friendly forces in another.

The official’s comments suggest that generative AI is now being added as a conversational chatbot layer—one the military may use to find and analyze data more quickly as it makes decisions like which targets to prioritize. 

Generative AI systems, like those that underpin ChatGPT, Claude, and Grok, are a fundamentally different technology from the AI that has primarily powered Maven. Built on large language models, they are much less battle-tested. And while Maven’s interface forced users to directly inspect and interpret data on the map, the outputs produced by generative AI models are easier to access but harder to verify. 

The use of generative AI for such decisions is reducing the time the targeting process requires, the official added, but declined to provide details when asked how much speed can really be gained if humans must still spend time double-checking a model’s outputs.

The use of military AI systems is under increased public scrutiny following the recent strike on a girls’ school in Iran in which more than 100 children died. Multiple news outlets have reported that the strike was from a US missile, though the Pentagon has said it is still under investigation. And while the Washington Post has reported that Claude and Maven have been involved in targeting decisions in Iran, there is no evidence yet to explain what role generative AI systems played, if any. The New York Times reported on Wednesday that a preliminary investigation found outdated targeting data to be partly responsible for the strike. 

The Pentagon has been ramping up its use of AI across operations in recent months. It started offering nonclassified use of generative AI models, for tasks like analyzing contracts or writing presentations, to millions of service members back in December through an effort called GenAI.mil. But only a few generative AI models have been approved by the Pentagon for classified use. 

The first was Anthropic’s Claude, which in addition to its use in Iran was reportedly used in the operations to capture Venezuelan leader Nicolas Maduro in January. But following recent disagreements between the Pentagon and Anthropic over whether Anthropic could restrict the military’s use of its AI, the Defense Department designated the company a supply chain risk and President Trump demanded on social media that the government stop using its AI products within six months. Anthropic is fighting the designation in court. 

OpenAI announced an agreement on February 28 for the military to use its technologies in classified settings. Elon Musk’s company xAI has also reached a deal for the Pentagon to use its model Grok in such settings. OpenAI has said its agreement with the Pentagon came with limitations, though the practical effectiveness of those limitations is not clear. 

If you have information about the military’s use of AI, you can share it securely via Signal (username jamesodonnell.22).

Hustlers are cashing in on China’s OpenClaw AI craze

Feng Qingyang had always hoped to launch his own company, but he never thought this would be how—or that the day would come this fast. 

Feng, a 27-year-old software engineer based in Beijing, started tinkering in January with OpenClaw, a popular new open-source AI tool that can take over a device and autonomously complete tasks for a user. He was immediately hooked, and before long he was helping other curious tech workers with less technical proficiency install the AI agent.

Feng soon realized this could be a lucrative opportunity. By the end of January, he had set up a page on Xianyu, a secondhand shopping site, advertising “OpenClaw installation support.” “No need to know coding or complex terms. Fully remote,” reads the posting. “Anyone can quickly own an AI assistant, available within 30 minutes.” 

At the same time, the broader Chinese public was beginning to catch on—and the tool, which had begun as a niche interest among tech workers, started to evolve into a popular sensation.

Feng quickly became inundated with requests, and he started chatting with customers and managing orders late into the night. At the end of February, he quit his job. His side gig has now grown into a full-fledged professional operation with over 100 employees. So far, the store has handled 7,000 orders, each worth about 248 RMB (approximately $34).
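For a sense of scale, that order volume adds up fast. A rough tally of the store's gross revenue, assuming an exchange rate of about 7.3 RMB to the dollar (the rate is an assumption, not a reported figure):

```python
# Rough gross revenue for Feng's store, based on the figures in the
# article. The exchange rate is an assumption; the article reports
# only the per-order price (248 RMB, or about $34).
orders = 7_000
price_rmb = 248
rmb_per_usd = 7.3  # assumed rate, consistent with 248 RMB ~= $34

total_rmb = orders * price_rmb
print(f"~{total_rmb:,} RMB, or about ${total_rmb / rmb_per_usd:,.0f}")
# -> ~1,736,000 RMB, or about $237,808 in gross revenue
```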

“Opportunities are always fleeting,” says Feng. “As programmers, we are the first to feel the winds shift.”

Feng is among a small cohort of savvy early adopters turning China’s OpenClaw craze into cash. As users with little technical background want in, a cottage industry of people offering installation services and preconfigured hardware has sprung up to meet them. The sudden rise of these tinkerers and impromptu consultants shows just how eager the general public in China is to adopt cutting-edge AI—even when there are huge security risks.

A “lobster craze”

“Have you raised a lobster yet?” 

Xie Manrui, a 36-year-old software engineer in Shenzhen, says he has heard this question nonstop over the past month. “Lobster” is the nickname Chinese users have given to OpenClaw—a reference to its logo.

Xie, like Feng, has been experimenting with OpenClaw since January. He’s built new open-source tools on top of the ecosystem, including one that visualizes the agent’s progress as an animated little desktop worker and another that lets users voice-chat with it. 

“I’ve met so many new people through ‘lobster raising,’” says Xie. “Many are lawyers or doctors, with little technical background, but all dedicated to learning new things.”

Lobsters are indeed popping up everywhere in China right now—on and offline. In February, for instance, the entrepreneur and tech influencer Fu Sheng hosted a livestream showing off OpenClaw’s capabilities that got 20,000 views. And just last weekend, Xie attended three different OpenClaw events in Shenzhen, each drawing more than 500 people. These self-organized, unofficial gatherings feature power users, influencers, and sometimes venture capitalists as speakers. The biggest event Xie attended, on March 7, drew more than 1,000 people; in the packed venue, he says, people were shoulder to shoulder, with many attendees unable to even get a seat.

Now China’s AI giants are starting to piggyback on the trend too, promoting their models, APIs, and cloud services (which can be used with OpenClaw), as well as their own OpenClaw-like agents. Earlier this month, Tencent held a public event offering free installation support for OpenClaw, drawing long lines of people waiting for help, including elderly users and children.

This sudden burst in popularity has even prompted local governments to get involved. Earlier this month the government of Longgang, a district in Shenzhen, released several policies to support OpenClaw-related ventures, including free computing credits and cash rewards for standout projects. Other cities, including Wuxi, have begun rolling out similar measures.

These policies only catalyze what’s already in the air. “It was not until my father, who is 77, asked me to help install a ‘lobster’ for him that I realized this thing is truly viral,” says Henry Li, a software engineer based in Beijing. 

A programmer gold rush

What’s making this moment particularly lucrative for people with technical skills, like Feng, is that so many people want OpenClaw, but far fewer have the skills to set it up. Installation requires a level of technical knowledge most people do not possess, from typing commands into a black terminal window to navigating unfamiliar developer platforms. On the hardware side, an older or budget laptop may struggle to run it smoothly. And if the tool is not installed on a device separate from someone’s everyday computer, or if the data accessible to OpenClaw is not properly partitioned, the user’s privacy could be at risk—opening the door to data leaks and even malicious attacks.

Chris Zhao, known as “Qi Shifu” online, organizes OpenClaw social media groups and events in Beijing. On apps like Rednote and Jike, Zhao routinely shares his thoughts on AI, and he asks other interested users to leave their WeChat ID so he can invite them to a semi-private group chat. The proof required to join is a screenshot that shows your “lobster” up and running. Zhao says that even in group chats for experienced users, hardware and cloud setup remain a constant topic of discussion.

The relatively high bar for setting up OpenClaw has generated a sense of exclusivity, creating a natural opening for a service industry to spring up around it. On Chinese e-commerce platforms like Taobao and JD, a simple search for “OpenClaw” now returns hundreds of listings, most of them installation guides and technical support packages aimed at nontechnical users, priced anywhere from 100 to 700 RMB (approximately $15 to $100). At the higher end, many vendors even offer to come help you in person.

Like Feng, most providers of these services are early adopters with some technical ability who are looking for a side gig. But as demand has surged, some have found themselves overwhelmed. Xie, the developer in Shenzhen who created tools to layer on OpenClaw, was asked by a friend who runs one such business to help out over the weekend; the friend had a customer who worked in e-commerce and had little technical experience, so Xie had to show up in person to get it done. He walked away with 600 RMB ($87) for the afternoon.

The growing demand has also pushed vendors like Feng to expand quickly. He has now standardized his operation into tiers: a basic installation, a custom package where users can make specific requests like configuring a preferred chat app, and an ongoing tutoring service for those who want a hand to hold as they find their footing with the technology.

Other vendors in China are making money combining OpenClaw with hardware. Li Gong, a Shenzhen-based seller of refurbished Mac computers, was among the first online sellers to do this—offering Mac minis and MacBooks with OpenClaw preinstalled. Because OpenClaw is designed to operate with deep access to a hard drive and can run continuously in the background unattended, many users prefer to install it on a separate device rather than on the one they use every day. This would help prevent bad actors from infiltrating the program and immediately gaining access to a wide swathe of someone’s personal information. Many turn to secondhand or refurbished options to keep the cost down. Li says that in the last two weeks, orders have increased eightfold.

Though OpenClaw itself is a new technology, the general practice of buying software bundles, downloading third-party packages, and seeking out modified devices is nothing new for many Chinese internet users, says Tianyu Fang, a PhD candidate studying the history of technology at Harvard University. Many users pay for one-off IT support services for tasks from installing Adobe software to jailbreaking a Kindle.

Still, not everyone is getting swept up. Jiang Yunhui, a tech worker based in Ningbo, worries that ordinary users who struggle with setup may not be the right audience for a technology that is still effectively in testing. 

“The hype in first-tier cities can be a little overblown,” he says. “The agent is still a proof of concept, and I doubt it would be of any life-changing use to the average person for now.” He argues that using it safely and getting anything meaningful out of it requires a level of technical fluency and independent judgment that most new users simply don’t have yet.

He’s not alone in his concerns. On March 10, the Chinese cybersecurity regulator CNCERT issued a warning about the security and data risks tied to OpenClaw, saying it heightens users’ exposure to data breaches.

Despite the potential pitfalls, though, China’s enthusiasm for OpenClaw doesn’t seem to be slowing.

Feng, now flush with the earnings from his operation, wants to use the momentum—and the capital—to keep building out his own venture with AI tools at the center of it.

“With OpenClaw and other AI agents, I want to see if I can run a one-person company,” he says. “I’m giving myself one year.”