A new company plans to use Earth as a chemical reactor

Forget massive steel tanks—some scientists want to make chemicals with the help of rocks deep beneath Earth’s surface.

New research shows that ammonia, a chemical crucial for fertilizer, can be produced from rocks at temperatures and pressures that are common in the subsurface. The research was published today in Joule, and MIT Technology Review can exclusively report that a new company, called Addis Energy, was founded to commercialize the process.

Ammonia is used in most fertilizers and is a vital part of our modern food system. It’s also being considered for use as a green fuel in industries like transoceanic shipping. The problem is that current processes used to make ammonia require a lot of energy and produce huge amounts of the greenhouse gases that cause climate change—over 1% of the global total. The new study finds that the planet’s internal conditions can be used to produce ammonia in a much cleaner process. 

“Earth can be a factory for chemical production,” says Iwnetim Abate, an MIT professor and author of the new study.

This idea could be a major change for the chemical industry, which today relies on huge facilities running reactions at extremely high temperatures and pressures to make ammonia.

The key ingredients for ammonia production are sources of nitrogen and hydrogen. Much of the focus on cleaner production methods currently lies in finding new ways to make hydrogen, since that chemical makes up the bulk of ammonia’s climate footprint, says Patrick Molloy, a principal at the nonprofit research agency Rocky Mountain Institute. 

Recently, researchers and companies have located naturally occurring deposits of hydrogen underground. Iron-rich rocks tend to drive reactions that produce the gas, and these natural deposits could provide a source of low-cost, low-emissions hydrogen.

While geologic hydrogen is still in its infancy as an industry, some researchers are hoping to help the process along by stimulating production of hydrogen underground. With the right rocks, heat, and a catalyst, you can produce hydrogen cheaply and without emitting large amounts of climate pollution.

Hydrogen can be difficult to transport, though, so Abate was interested in going one step further by letting the conditions underground do the hard work in powering chemical reactions that transform hydrogen and nitrogen into ammonia. “As you dig, you get heat and pressure for free,” he says.

To test out how this might work, Abate and his team crushed up iron-rich minerals and added nitrates (a nitrogen source), water (a hydrogen source), and a catalyst to help reactions along in a small reactor in the lab. They found that even at relatively low temperatures and pressures, they could make ammonia in a matter of hours. If the process were scaled up, the researchers estimate, one well could produce 40,000 kilograms of ammonia per day. 
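The Joule paper's exact reaction pathway isn't spelled out in this story, but the broad redox chemistry can be sketched schematically: iron(II) in the crushed minerals supplies electrons, oxidizing to iron(III), while nitrate is reduced to ammonia with water serving as the proton source. A hedged, textbook-style way to write the half-reactions (illustrative only, not the mechanism reported in the paper):

```latex
% Schematic half-reactions -- illustrative, not the reported mechanism.
% Electron source: oxidation of iron(II) in the mineral surface
\mathrm{Fe^{2+} \longrightarrow Fe^{3+} + e^-}
% Nitrate reduced to ammonia (an 8-electron transfer), water as proton source
\mathrm{NO_3^- + 9\,H^+ + 8\,e^- \longrightarrow NH_3 + 3\,H_2O}
```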

While the reactions tend to go faster at high temperature and pressure, the researchers found that ammonia production could be an economically viable process even at 130 °C (266 °F) and a little over two atmospheres of pressure, conditions that would be accessible at depths reachable with existing drilling technology. 

While the reactions work in the lab, there’s a lot of work to do to determine whether, and how, the process might actually work in the field. One thing the team will need to figure out is how to keep reactions going, because in the reaction that forms ammonia, the surface of the iron-rich rocks will be oxidized, leaving them in a state where they can’t keep reacting. But Abate says the team is working on controlling how thick the unusable layer of rock is, and its composition, so the chemical reactions can continue.

To commercialize this work, Abate is cofounding a company called Addis Energy with $4.25 million in pre-seed funds from investors including Engine Ventures. His cofounders include Michael Alexander and Charlie Mitchell (who have both spent time in the oil and gas industry) and Yet-Ming Chiang, an MIT professor and serial entrepreneur. The company will work on scaling up the research, including finding potential sites with the geological conditions to produce ammonia underground. 

The good news for scale-up efforts is that much of the necessary technology already exists in oil and gas operations, says Alexander, Addis’s CEO. A field-deployed system will involve drilling, pumping fluid down into the ground, and extracting other fluids from beneath the surface, all very common operations in that industry. “There’s novel chemistry that’s wrapped in an oil and gas package,” he says. 

The team will also work on refining cost estimates for the process and gaining a better understanding of safety and sustainability, Abate says. Ammonia is a toxic industrial chemical, but it’s common enough for there to be established procedures for handling, storing, and transporting it, says RMI’s Molloy.

Judging from the researchers’ early estimates, ammonia produced with this method could cost up to $0.55 per kilogram. That’s more than ammonia produced with fossil fuels today ($0.40/kg), but the technique would likely be less expensive than other low-emissions methods of producing the chemical. Tweaks to the process, including using nitrogen from the air instead of nitrates, could help cut costs further, even as low as $0.20/kg. 
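Using only the figures quoted in this story, a quick back-of-envelope sketch shows what those per-kilogram prices imply for one well's estimated 40,000 kg daily output (the labels and pairing of prices to scenarios are this sketch's own framing):

```python
# Back-of-envelope arithmetic using only figures quoted in the article.
DAILY_OUTPUT_KG = 40_000  # researchers' estimate for one well, kg ammonia/day

prices_per_kg = {
    "geologic ammonia (upper estimate)": 0.55,
    "fossil-fuel ammonia (today)": 0.40,
    "geologic, optimized (air-derived nitrogen)": 0.20,
}

for label, price in prices_per_kg.items():
    # Daily value of one well's output at each quoted price point
    print(f"{label}: ${DAILY_OUTPUT_KG * price:,.0f} per well per day")
```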

New approaches to making ammonia could be crucial for climate efforts. “It’s a chemical that’s essential to our way of life,” says Karthish Manthiram, a professor at Caltech who studies electrochemistry, including alternative ammonia production methods.

The team’s research appears to be designed with scalability in mind from the outset, and using Earth itself as a reactor is the kind of thinking needed to accelerate the long-term journey to sustainable chemical production, Manthiram adds.

While the company focuses on scale-up efforts, there’s plenty of fundamental work left for Abate and other labs to do to understand what’s going on during the reactions at the atomic level, particularly at the interface between the rocks and the reacting fluid. 

Research in the lab is exciting, but it’s only the first step, Abate says. The next one is seeing if this actually works in the field. 

Correction: Due to a unit typo in the journal article, a previous version of this story misstated the amount of ammonia each well could theoretically produce. The estimate is 40,000 kilograms of ammonia per day, not 40,000 tons.

There can be no winners in a US-China AI arms race

The United States and China are entangled in what many have dubbed an “AI arms race.” 

In the early days of this standoff, US policymakers drove an agenda centered on “winning” the race, mostly from an economic perspective. In recent months, leading AI labs such as OpenAI and Anthropic got involved in pushing the narrative of “beating China” in what appeared to be an attempt to align themselves with the incoming Trump administration. The belief that the US can win in such a race was based mostly on the early advantage it had over China in advanced GPU compute resources and the effectiveness of AI’s scaling laws.

But now it appears that access to large quantities of advanced compute resources is no longer the defining or sustainable advantage many had thought it would be. In fact, the capability gap between leading US and Chinese models has essentially disappeared, and in one important way the Chinese models may now have an advantage: They are able to achieve near equivalent results while using only a small fraction of the compute resources available to the leading Western labs.    

The AI competition is increasingly being framed within narrow national security terms, as a zero-sum game, and influenced by assumptions that a future war between the US and China, centered on Taiwan, is inevitable. The US has employed “chokepoint” tactics to limit China’s access to key technologies like advanced semiconductors, and China has responded by accelerating its efforts toward self-sufficiency and indigenous innovation, which is causing US efforts to backfire.

Recently even outgoing US Secretary of Commerce Gina Raimondo, a staunch advocate for strict export controls, finally admitted that using such controls to hold back China’s progress on AI and advanced semiconductors is a “fool’s errand.” Ironically, the unprecedented export control packages targeting China’s semiconductor and AI sectors have unfolded alongside tentative bilateral and multilateral engagements to establish AI safety standards and governance frameworks—highlighting a paradoxical desire of both sides to compete and cooperate. 

When we consider this dynamic more deeply, it becomes clear that the real existential threat ahead is not from China, but from the weaponization of advanced AI by bad actors and rogue groups who seek to create broad harms, gain wealth, or destabilize society. As with nuclear arms, China, as a nation-state, must be careful about using AI-powered capabilities against US interests, but bad actors, including extremist organizations, would be much more likely to abuse AI capabilities with little hesitation. Given the asymmetric nature of AI technology, which is much like cyberweapons, it is very difficult to fully prevent and defend against a determined foe who has mastered its use and intends to deploy it for nefarious ends. 

Given the ramifications, it is incumbent on the US and China as global leaders in developing AI technology to jointly identify and mitigate such threats, collaborate on solutions, and cooperate on developing a global framework for regulating the most advanced models—instead of erecting new fences, small or large, around AI technologies and pursuing policies that deflect focus from the real threat.

It is now clearer than ever that despite the high stakes and escalating rhetoric, there will not and cannot be any long-term winners if the intense competition continues on its current path. Instead, the consequences could be severe—undermining global stability, stalling scientific progress, and leading both nations toward a dangerous technological brinkmanship. This is particularly salient given the importance of Taiwan and the global foundry leader TSMC in the AI stack, and the increasing tensions around the high-tech island. 

Heading blindly down this path will bring the risk of isolation and polarization, threatening not only international peace but also the vast potential benefits AI promises for humanity as a whole.

Historical narratives, geopolitical forces, and economic competition have all contributed to the current state of the US-China AI rivalry. A recent report from the US-China Economic and Security Review Commission, for example, frames the entire issue in binary terms, focused on dominance or subservience. This “winner takes all” logic overlooks the potential for global collaboration and could even provoke a self-fulfilling prophecy by escalating conflict. Under the new Trump administration this dynamic will likely become more accentuated, with increasing discussion of a Manhattan Project for AI and redirection of US military resources from Ukraine toward China.

Fortunately, a glimmer of hope for a responsible approach to AI collaboration appeared when Donald Trump posted on January 17 that he had restarted direct dialogue with Chairman Xi Jinping on various areas of collaboration, adding that, given their past cooperation, the two countries should continue to be “partners and friends.” The outcome of the TikTok drama, which puts Trump at odds with sharp China critics in his own administration and Congress, will be a preview of how his efforts to put US-China relations on a less confrontational trajectory play out.

The promise of AI for good

Western mass media usually focuses on attention-grabbing issues described in terms like the “existential risks of evil AI.” Unfortunately, the AI safety experts who get the most coverage often recite the same narratives, scaring the public. In reality, no credible research shows that more capable AI will become increasingly evil. We need to challenge the current false dichotomy of pure accelerationism versus doomerism to allow for a model more like collaborative acceleration.

It is important to note the significant difference between the way AI is perceived in Western developed countries and developing countries. In developed countries the public sentiment toward AI is 60% to 70% negative, while in the developing markets the positive ratings are 60% to 80%. People in the latter places have seen technology transform their lives for the better in the past decades and are hopeful AI will help solve the remaining issues they face by improving education, health care, and productivity, thereby elevating their quality of life and giving them greater world standing. What Western populations often fail to realize is that those same benefits could directly improve their lives as well, given the high levels of inequity even in developed markets. Consider what progress would be possible if we reallocated the trillions that go into defense budgets each year to infrastructure, education, and health-care projects. 

Once we get to the next phase, AI will help us accelerate scientific discovery, develop new drugs, extend our health span, reduce our work obligations, and ensure access to high-quality education for all. This may sound idealistic, but given current trends, most of this can become a reality within a generation, and maybe sooner. To get there we’ll need more advanced AI systems, which will be a much more challenging goal if we divide up compute/data resources and research talent pools. Almost half of all top AI researchers globally (47%) were born or educated in China, according to industry studies. It’s hard to imagine how we could have gotten where we are without the efforts of Chinese researchers. Active collaboration with China on joint AI research could be pivotal to supercharging progress with a major infusion of quality training data and researchers. 

The escalating AI competition between the US and China poses significant threats to both nations and to the entire world. The risks inherent in this rivalry are not hypothetical—they could lead to outcomes that threaten global peace, economic stability, and technological progress. Framing the development of artificial intelligence as a zero-sum race undermines opportunities for collective advancement and security. Rather than succumb to the rhetoric of confrontation, it is imperative that the US and China, along with their allies, shift toward collaboration and shared governance.

Our recommendations for policymakers:

  1. Reduce national security dominance over AI policy. Both the US and China must recalibrate their approach to AI development, moving away from viewing AI primarily as a military asset. This means reducing the emphasis on national security concerns that currently dominate every aspect of AI policy. Instead, policymakers should focus on civilian applications of AI that can directly benefit their populations and address global challenges, such as health care, education, and climate change. The US also needs to investigate how to implement a possible universal basic income program as job displacement from AI adoption becomes a bigger issue domestically. 
  2. Promote bilateral and multilateral AI governance. Establishing a robust dialogue between the US, China, and other international stakeholders is crucial for the development of common AI governance standards. This includes agreeing on ethical norms, safety measures, and transparency guidelines for advanced AI technologies. A cooperative framework would help ensure that AI development is conducted responsibly and inclusively, minimizing risks while maximizing benefits for all.
  3. Expand investment in detection and mitigation of AI misuse. The risk of AI misuse by bad actors, whether through misinformation campaigns, telecom, power, or financial system attacks, or cybersecurity attacks with the potential to destabilize society, is the biggest existential threat to the world today. Dramatically increasing funding for and international cooperation in detecting and mitigating these risks is vital. The US and China must agree on shared standards for the responsible use of AI and collaborate on tools that can monitor and counteract misuse globally.
  4. Create incentives for collaborative AI research. Governments should provide incentives for academic and industry collaborations across borders. By creating joint funding programs and research initiatives, the US and China can foster an environment where the best minds from both nations contribute to breakthroughs in AI that serve humanity as a whole. This collaboration would help pool talent, data, and compute resources, overcoming barriers that neither country could tackle alone. A global effort akin to the CERN for AI will bring much more value to the world, and a peaceful end, than a Manhattan Project for AI, which is being promoted by many in Washington today.
  5. Establish trust-building measures. Both countries need to prevent misinterpretations of AI-related actions as aggressive or threatening. They could do this via data-sharing agreements, joint projects in nonmilitary AI, and exchanges between AI researchers. Reducing import restrictions for civilian AI use cases, for example, could help the nations rebuild some trust and make it possible for them to discuss deeper cooperation on joint research. These measures would help build transparency, reduce the risk of miscommunication, and pave the way for a less adversarial relationship.
  6. Support the development of a global AI safety coalition. A coalition that includes major AI developers from multiple countries could serve as a neutral platform for addressing ethical and safety concerns. This coalition would bring together leading AI researchers, ethicists, and policymakers to ensure that AI progresses in a way that is safe, fair, and beneficial to all. This effort should not exclude China, as it remains an essential partner in developing and maintaining a safe AI ecosystem.
  7. Shift the focus toward AI for global challenges. It is crucial that the world’s two AI superpowers use their capabilities to tackle global issues, such as climate change, disease, and poverty. By demonstrating the positive societal impacts of AI through tangible projects and presenting it not as a threat but as a powerful tool for good, the US and China can reshape public perception of AI.

Our choice is stark but simple: We can proceed down a path of confrontation that will almost certainly lead to mutual harm, or we can pivot toward collaboration, which offers the potential for a prosperous and stable future for all. Artificial intelligence holds the promise to solve some of the greatest challenges facing humanity, but realizing this potential depends on whether we choose to race against each other or work together.

The opportunity to harness AI for the common good is a chance the world cannot afford to miss.


Alvin Wang Graylin

Alvin Wang Graylin is a technology executive, author, investor, and pioneer with over 30 years of experience shaping innovation in AI, XR (extended reality), cybersecurity, and semiconductors. Currently serving as global vice president at HTC, Graylin was the company’s China president from 2016 to 2023. He is the author of Our Next Reality.

Paul Triolo

Paul Triolo is a partner for China and technology policy lead at DGA-Albright Stonebridge Group. He advises clients in technology, financial services, and other sectors as they navigate complex political and regulatory matters in the US, China, the European Union, India, and around the world.

OpenAI has upped its lobbying efforts nearly sevenfold

OpenAI spent $1.76 million on government lobbying in 2024 and $510,000 in the last three months of the year alone, according to a new disclosure filed on Tuesday—a significant jump from 2023, when the company spent just $260,000 on Capitol Hill. The company also disclosed a new in-house lobbyist, Meghan Dorn, who worked for five years for Senator Lindsey Graham and started at OpenAI in October. The filing also shows activity related to two new pieces of legislation in the final months of the year: the House’s AI Advancement and Reliability Act, which would set up a government center for AI research, and the Senate’s Future of Artificial Intelligence Innovation Act, which would create shared benchmark tests for AI models.

OpenAI did not respond to questions about its lobbying efforts.

But perhaps more important, the disclosure is a clear signal of the company’s arrival as a political player, as its first year of serious lobbying ends and Republican control of Washington begins. While OpenAI’s lobbying spending is still dwarfed by its peers’—Meta tops the list of Big Tech spenders, with more than $24 million in 2024—the uptick comes as it and other AI companies have helped redraw the shape of AI policy.

For the past few years, AI policy has been something like a whack-a-mole response to the risks posed by deepfakes and misinformation. But over the last year, AI companies have started to position the success of the technology as pivotal to national security and American competitiveness, arguing that the government must therefore support the industry’s growth. As a result, OpenAI and others now seem poised to gain access to cheaper energy, lucrative national security contracts, and a more lax regulatory environment that’s unconcerned with the minutiae of AI safety.

While the big players seem more or less aligned on this grand narrative, messy divides on other issues are still threatening to break through the harmony on display at President Trump’s inauguration this week.

AI regulation really began in earnest after ChatGPT launched in November 2022. At that point, “a lot of the conversation was about responsibility,” says Liana Keesing, campaigns manager for technology reform at Issue One, a democracy nonprofit that tracks Big Tech’s influence.

Companies were asked what they’d do about sexually abusive deepfake images and election disinformation. “Sam Altman did a very good job coming in and painting himself early as a supporter of that process,” Keesing says.

OpenAI started its official lobbying effort around October 2023, hiring Chan Park—a onetime Senate Judiciary Committee counsel and Microsoft lobbyist—to lead the effort. Lawmakers, particularly then Senate majority leader Chuck Schumer, were vocal about wanting to curb these particular harms; OpenAI hired Schumer’s former legal counsel, Reginald Babin, as a lobbyist, according to data from OpenSecrets. This past summer, the company hired the veteran political operative Chris Lehane as its head of global policy.

OpenAI’s previous disclosures confirm that the company’s lobbyists subsequently focused much of last year on legislation like the No Fakes Act and the Protect Elections from Deceptive AI Act. The bills did not materialize into law. But as the year went on, the regulatory goals of AI companies began to change. “One of the biggest shifts that we’ve seen,” Keesing says, “is that they’ve really started to focus on energy.”

In September, Altman, along with leaders from Nvidia, Anthropic, and Google, visited the White House and pitched the vision that US competitiveness in AI will depend on subsidized energy infrastructure to train the best models. Altman proposed to the Biden administration the construction of multiple five-gigawatt data centers, which would each consume as much electricity as New York City.

Around the same time, companies like Meta and Microsoft started to say that nuclear energy would provide the path forward for AI, announcing deals aimed at firing up new nuclear power plants.

It seems likely OpenAI’s policy team was already planning for this particular shift. In April, the company hired lobbyist Matthew Rimkunas, who worked for Bill Gates’s sustainable energy effort Breakthrough Energies and, before that, spent 16 years working for Senator Graham; the South Carolina Republican serves on the Senate subcommittee that manages nuclear safety.

This new AI energy race is inseparable from the positioning of AI as essential for national security and US competitiveness with China. OpenAI laid out its position in a blog post in October, writing, “AI is a transformational technology that can be used to strengthen democratic values or to undermine them. That’s why we believe democracies should continue to take the lead in AI development.” Then in December, the company went a step further and reversed its policy against working with the military, announcing it would develop AI models with the defense-tech company Anduril to help take down drones around military bases.

That same month, Sam Altman said during an interview with The Free Press that the Biden administration was “not that effective” in shepherding AI: “The things that I think should have been the administration’s priorities, and I hope will be the next administration’s priorities, are building out massive AI infrastructure in the US, having a supply chain in the US, things like that.”

That characterization glosses over the CHIPS Act, a $52 billion stimulus to the domestic chips industry that is, at least on paper, aligned with Altman’s vision. (It also preceded an executive order Biden issued just last week, to lease federal land to host the type of gigawatt-scale data centers that Altman had been asking for.)

Intentionally or not, Altman’s posture aligned him with the growing camaraderie between President Trump and Silicon Valley. Mark Zuckerberg, Elon Musk, Jeff Bezos, and Sundar Pichai all sat directly behind Trump’s family at the inauguration on Monday, and Altman also attended. Many of them had also made sizable donations to Trump’s inaugural fund, with Altman personally throwing in $1 million.

It’s easy to view the inauguration as evidence that these tech leaders are aligned with each other, and with other players in Trump’s orbit. But there are still some key dividing lines that will be worth watching. Notably, there’s the clash over H-1B visas, which allow many noncitizen AI researchers to work in the US. Musk and Vivek Ramaswamy (who is, as of this week, no longer a part of the so-called Department of Government Efficiency) have been pushing for that visa program to be expanded. This sparked backlash from some allies of the Trump administration, perhaps most loudly from Steve Bannon.

Another fault line is the battle between open- and closed-source AI. Google and OpenAI prevent anyone from knowing exactly what’s in their most powerful models, often arguing that this keeps them from being used improperly by bad actors. Musk has sued OpenAI and Microsoft over the issue, alleging that closed-source models are antithetical to OpenAI’s hybrid nonprofit structure. Meta, whose Llama model is open-source, recently sided with Musk in that lawsuit. Venture capitalist and Trump ally Marc Andreessen echoed these criticisms of OpenAI on X just hours after the inauguration. (Andreessen has also said that making AI models open-source “makes overbearing regulations unnecessary.”)

Finally, there are the battles over bias and free speech. The vastly different approaches that social media companies have taken to moderating content—including Meta’s recent announcement that it would end its US fact-checking program—raise questions about whether the way AI models are moderated will continue to splinter too. Musk has lamented what he calls the “wokeness” of many leading models, and Andreessen said on Tuesday that “Chinese LLMs are much less censored than American LLMs” (though that’s not quite true, given that many Chinese AI models have government-mandated censorship in place that forbids particular topics). Altman has been more equivocal: “No two people are ever going to agree that one system is perfectly unbiased,” he told The Free Press.

It’s only the start of a new era in Washington, but the White House has been busy. It has repealed many executive orders signed by President Biden, including the landmark order on AI that imposed rules for government use of the technology (while it appears to have kept Biden’s order on leasing land for more data centers). Altman is busy as well. OpenAI, Oracle, and SoftBank reportedly plan to spend up to $500 billion on a joint venture for new data centers; the project was announced by President Trump, with Altman standing alongside. And according to Axios, Altman will also be part of a closed-door briefing with government officials on January 30, reportedly about OpenAI’s development of a powerful new AI agent.

The second wave of AI coding is here

Ask people building generative AI what generative AI is good for right now—what they’re really fired up about—and many will tell you: coding.

“That’s something that’s been very exciting for developers,” Jared Kaplan, chief scientist at Anthropic, told MIT Technology Review this month: “It’s really understanding what’s wrong with code, debugging it.”

Copilot, a tool built on top of OpenAI’s large language models and launched by Microsoft-backed GitHub in 2022, is now used by millions of developers around the world. Millions more turn to general-purpose chatbots like Anthropic’s Claude, OpenAI’s ChatGPT, and Google DeepMind’s Gemini for everyday help.

“Today, more than a quarter of all new code at Google is generated by AI, then reviewed and accepted by engineers,” Alphabet CEO Sundar Pichai claimed on an earnings call in October: “This helps our engineers do more and move faster.” Expect other tech companies to catch up, if they haven’t already.

It’s not just the big beasts rolling out AI coding tools. A bunch of new startups have entered this buzzy market too. Newcomers such as Zencoder, Merly, Cosine, Tessl (valued at $750 million within months of being set up), and Poolside (valued at $3 billion before it even released a product) are all jostling for their slice of the pie. “It actually looks like developers are willing to pay for copilots,” says Nathan Benaich, an analyst at investment firm Air Street Capital: “And so code is one of the easiest ways to monetize AI.”

Such companies promise to take generative coding assistants to the next level. Instead of providing developers with a kind of supercharged autocomplete, like most existing tools, this next generation can prototype, test, and debug code for you. The upshot is that developers could essentially turn into managers, who may spend more time reviewing and correcting code written by a model than writing it from scratch themselves.

But there’s more. Many of the people building generative coding assistants think that they could be a fast track to artificial general intelligence (AGI), the hypothetical superhuman technology that a number of top firms claim to have in their sights.

“The first time we will see a massively economically valuable activity to have reached human-level capabilities will be in software development,” says Eiso Kant, CEO and cofounder of Poolside. (OpenAI has already boasted that its latest o3 model beat the company’s own chief scientist in a competitive coding challenge.)

Welcome to the second wave of AI coding.

Correct code

Software engineers talk about two types of correctness. There’s the sense in which a program’s syntax (its grammar) is correct—meaning all the words, numbers, and mathematical operators are in the right place. This matters a lot more than grammatical correctness in natural language. Get one tiny thing wrong in thousands of lines of code and none of it will run.

    The first generation of coding assistants is now pretty good at producing code that’s correct in this sense. Trained on billions of pieces of code, these models have assimilated the surface-level structures of many types of programs.

    But there’s also the sense in which a program’s function is correct: Sure, it runs, but does it actually do what you wanted it to? It’s that second level of correctness that the new wave of generative coding assistants are aiming for—and this is what will really change the way software is made.
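To make the distinction concrete, here is a toy example (ours, not from any of these companies): a Python function that is syntactically flawless, so the interpreter runs it without complaint, yet functionally wrong.

```python
def average(numbers):
    """Intended to return the mean of a list of numbers."""
    total = 0
    for n in numbers:
        total += n
    return total  # bug: forgot to divide by len(numbers)

# Syntactically correct: this parses and runs without error.
# Functionally incorrect: it prints 12, not the mean, 4.
print(average([2, 4, 6]))
```

A first-generation assistant is good at avoiding the first kind of mistake; the second kind is the harder target.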

    “Large language models can write code that compiles, but they may not always write the program that you wanted,” says Alistair Pullen, a cofounder of Cosine. “To do that, you need to re-create the thought processes that a human coder would have gone through to get that end result.”

    The problem is that the data most coding assistants have been trained on—the billions of pieces of code taken from online repositories—doesn’t capture those thought processes. It represents a finished product, not what went into making it. “There’s a lot of code out there,” says Kant. “But that data doesn’t represent software development.”

    What Pullen, Kant, and others are finding is that to build a model that does a lot more than autocomplete—one that can come up with useful programs, test them, and fix bugs—you need to show it a lot more than just code. You need to show it how that code was put together.  

    In short, companies like Cosine and Poolside are building models that don’t just mimic what good code looks like—whether it works well or not—but mimic the process that produces such code in the first place. Get it right and the models will come up with far better code and far better bug fixes. 

    Breadcrumbs

    But you first need a data set that captures that process—the steps that a human developer might take when writing code. Think of these steps as a breadcrumb trail that a machine could follow to produce a similar piece of code itself.

    Part of that is working out what materials to draw from: Which sections of the existing codebase are needed for a given programming task? “Context is critical,” says Zencoder founder Andrew Filev. “The first generation of tools did a very poor job on the context, they would basically just look at your open tabs. But your repo [code repository] might have 5000 files and they’d miss most of it.”

    Zencoder has hired a bunch of search engine veterans to help it build a tool that can analyze large codebases and figure out what is and isn’t relevant. This detailed context reduces hallucinations and improves the quality of code that large language models can produce, says Filev: “We call it repo grokking.”

    Cosine also thinks context is key. But it draws on that context to create a new kind of data set. The company has asked dozens of coders to record what they were doing as they worked through hundreds of different programming tasks. “We asked them to write down everything,” says Pullen: “Why did you open that file? Why did you scroll halfway through? Why did you close it?” They also asked coders to annotate finished pieces of code, marking up sections that would have required knowledge of other pieces of code or specific documentation to write.

    Cosine then takes all that information and generates a large synthetic data set that maps the typical steps coders take, and the sources of information they draw on, to finished pieces of code. They use this data set to train a model to figure out what breadcrumb trail it might need to follow to produce a particular program, and then how to follow it.  

    Poolside, based in San Francisco, is also creating a synthetic data set that captures the process of coding, but it leans more on a technique called RLCE—reinforcement learning from code execution. (Cosine uses this too, but to a lesser degree.)

    RLCE is analogous to the technique used to make chatbots like ChatGPT slick conversationalists, known as RLHF—reinforcement learning from human feedback. With RLHF, a model is trained to produce text that’s more like the kind human testers say they favor. With RLCE, a model is trained to produce code that’s more like the kind that does what it is supposed to do when it is run (or executed).  
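Poolside has not published the details of its training loop, but the core idea of an execution-based reward can be sketched in a few lines: run each candidate program against a battery of tests and reward it in proportion to how many it passes. Everything below, including the `solution` naming convention, is our illustrative assumption, not Poolside's code.

```python
def execution_reward(candidate_source, test_cases):
    """Score a generated program by actually executing it against tests."""
    namespace = {}
    try:
        exec(candidate_source, namespace)  # run the candidate's definitions
    except Exception:
        return 0.0  # code that doesn't even run earns no reward
    func = namespace.get("solution")
    if func is None:
        return 0.0  # the expected entry point is missing
    passed = 0
    for args, expected in test_cases:
        try:
            if func(*args) == expected:
                passed += 1
        except Exception:
            pass  # a crash on one test simply earns no credit for it
    return passed / len(test_cases)  # fraction of tests passed

candidate = "def solution(x, y):\n    return x + y"
print(execution_reward(candidate, [((1, 2), 3), ((0, 0), 0)]))  # 1.0
```

A reinforcement learning loop would then nudge the model toward outputs that score higher on this kind of signal, just as RLHF nudges a chatbot toward responses human raters prefer.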

    Gaming the system

    Cosine and Poolside both say they are inspired by the approach DeepMind took with its game-playing model AlphaZero. AlphaZero was given the steps it could take—the moves in a game—and then left to play against itself over and over again, figuring out via trial and error which sequences of moves were winning and which were not.

    “They let it explore moves at every possible turn, simulate as many games as you can throw compute at—that led all the way to beating Lee Sedol,” says Pengming Wang, a founding scientist at Poolside, referring to the Korean Go grandmaster whom AlphaZero’s predecessor, AlphaGo, beat in 2016. Before Poolside, Wang worked at Google DeepMind on applications of AlphaZero beyond board games, including FunSearch, a version trained to solve advanced math problems.

    When that AlphaZero approach is applied to coding, the steps involved in producing a piece of code—the breadcrumbs—become the available moves in a game, and a correct program becomes winning that game. Left to play by itself, a model can improve far faster than a human could. “A human coder tries and fails one failure at a time,” says Kant. “Models can try things 100 times at once.”

    A key difference between Cosine and Poolside is that Cosine uses a custom version of GPT-4o provided by OpenAI, which makes it possible to train on a larger data set than the base model can cope with, while Poolside is building its own large language model from scratch.

    Poolside’s Kant thinks that training a model on code from the start will give better results than adapting an existing model that has sucked up not only billions of pieces of code but most of the internet. “I’m perfectly fine with our model forgetting about butterfly anatomy,” he says.  

    Cosine claims that its generative coding assistant, called Genie, tops the leaderboard on SWE-Bench, a standard set of tests for coding models. Poolside is still building its model but claims that what it has so far already matches the performance of GitHub’s Copilot.

    “I personally have a very strong belief that large language models will get us all the way to being as capable as a software developer,” says Kant.

    Not everyone takes that view, however.

    Illogical LLMs

    To Justin Gottschlich, the CEO and founder of Merly, large language models are the wrong tool for the job—period. He invokes his dog: “No amount of training for my dog will ever get him to be able to code, it just won’t happen,” he says. “He can do all kinds of other things, but he’s just incapable of that deep level of cognition.”  

    Having worked on code generation for more than a decade, Gottschlich has a similar sticking point with large language models. Programming requires the ability to work through logical puzzles with unwavering precision. No matter how well large language models may learn to mimic what human programmers do, at their core they are still essentially statistical slot machines, he says: “I can’t train an illogical system to become logical.”

    Instead of training a large language model to generate code by feeding it lots of examples, Merly does not show its system human-written code at all. That’s because to really build a model that can generate code, Gottschlich argues, you need to work at the level of the underlying logic that code represents, not the code itself. Merly’s system is therefore trained on an intermediate representation—something like the machine-readable notation that most programming languages get translated into before they are run.

    Gottschlich won’t say exactly what this looks like or how the process works. But he throws out an analogy: There’s this idea in mathematics that the only numbers that have to exist are prime numbers, because you can calculate all other numbers using just the primes. “Take that concept and apply it to code,” he says.

    Not only does this approach get straight to the logic of programming; it’s also fast, because millions of lines of code are reduced to a few thousand lines of intermediate language before the system analyzes them.
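Gottschlich won’t reveal what Merly’s representation looks like, but a familiar example of this kind of intermediate notation is Python bytecode, which the standard library’s `dis` module can display. (This analogy is ours, not a description of Merly’s system.)

```python
import dis

def square(x):
    return x * x

# Show the intermediate representation the Python interpreter actually
# executes: a short list of instructions (loads of x, a multiply, a
# return) that sits several steps closer to the program's logic than
# the source text does.
dis.dis(square)
```

Working at this level strips away naming, formatting, and other surface detail, leaving something closer to the bare logic of the program.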

    Shifting mindsets

    What you think of these rival approaches may depend on what you want generative coding assistants to be.  

    In November, Cosine banned its engineers from using tools other than its own products. It is now seeing the impact of Genie on its own engineers, who often find themselves watching the tool as it comes up with code for them. “You now give the model the outcome you would like, and it goes ahead and worries about the implementation for you,” says Yang Li, another Cosine cofounder.

    Pullen admits that it can be baffling, requiring a switch of mindset. “We have engineers doing multiple tasks at once, flitting between windows,” he says. “While Genie is running code in one, they might be prompting it to do something else in another.”

    These tools also make it possible to prototype multiple versions of a system at once. Say you’re developing software that needs a payment system built in. You can get a coding assistant to simultaneously try out several different options—Stripe, Mango, Checkout—instead of having to code them by hand one at a time.

    Genie can be left to fix bugs around the clock. Most software teams use bug-reporting tools that let people upload descriptions of errors they have encountered. Genie can read these descriptions and come up with fixes. Then a human just needs to review them before updating the code base.

    No single human understands the trillions of lines of code in today’s biggest software systems, says Li, “and as more and more software gets written by other software, the amount of code will only get bigger.”

    This will make coding assistants that maintain that code for us essential. “The bottleneck will become how fast humans can review the machine-generated code,” says Li.

    How do Cosine’s engineers feel about all this? According to Pullen, at least, just fine. “If I give you a hard problem, you’re still going to think about how you want to describe that problem to the model,” he says. “Instead of writing the code, you have to write it in natural language. But there’s still a lot of thinking that goes into that, so you’re not really taking the joy of engineering away. The itch is still scratched.”

    Some may adapt faster than others. Cosine likes to invite potential hires to spend a few days coding with its team. A couple of months ago it asked one such candidate to build a widget that would let employees share cool bits of software they were working on to social media. 

    The task wasn’t straightforward, requiring working knowledge of multiple sections of Cosine’s millions of lines of code. But the candidate got it done in a matter of hours. “This person who had never seen our code base turned up on Monday and by Tuesday afternoon he’d shipped something,” says Li. “We thought it would take him all week.” (They hired him.)

    But there’s another angle too. Many companies will use this technology to cut down on the number of programmers they hire. Li thinks we will soon see tiers of software engineers. At one end there will be elite developers with million-dollar salaries who can diagnose problems when the AI goes wrong. At the other end, smaller teams of 10 to 20 people will do a job that once required hundreds of coders. “It will be like how ATMs transformed banking,” says Li.

    “Anything you want to do will be determined by compute and not head count,” he says. “I think it’s generally accepted that the era of adding another few thousand engineers to your organization is over.”

    Warp drives

    Indeed, for Gottschlich, machines that can code better than humans are going to be essential. For him, that’s the only way we will build the vast, complex software systems that he thinks we will eventually need. Like many in Silicon Valley, he anticipates a future in which humans move to other planets. That’s only going to be possible if we get AI to build the software required, he says: “Merly’s real goal is to get us to Mars.”

    Gottschlich prefers to talk about “machine programming” rather than “coding assistants,” because he thinks that term frames the problem the wrong way. “I don’t think that these systems should be assisting humans—I think humans should be assisting them,” he says. “They can move at the speed of AI. Why restrict their potential?”

    “There’s this cartoon called The Flintstones where they have these cars, but they only move when the drivers use their feet,” says Gottschlich. “This is sort of how I feel most people are doing AI for software systems.”

    “But what Merly’s building is, essentially, spaceships,” he adds. He’s not joking. “And I don’t think spaceships should be powered by humans on a bicycle. Spaceships should be powered by a warp engine.”

    If that sounds wild—it is. But there’s a serious point to be made about what the people building this technology think the end goal really is.

    Gottschlich is not an outlier with his galaxy-brained take. Despite their focus on products that developers will want to use today, most of these companies have their sights on a far bigger payoff. Visit Cosine’s website and the company introduces itself as a “Human Reasoning Lab.” It sees coding as just the first step toward a more general-purpose model that can mimic human problem-solving in a number of domains.

    Poolside has similar goals: The company states upfront that it is building AGI. “Code is a way of formalizing reasoning,” says Kant.

    Wang invokes agents. Imagine a system that can spin up its own software to do any task on the fly, he says. “If you get to a point where your agent can really solve any computational task that you want through the means of software—that is a display of AGI, essentially.”

    Down here on Earth, such systems may remain a pipe dream. And yet software engineering is changing faster than many at the cutting edge expected. 

    “We’re not at a point where everything’s just done by machines, but we’re definitely stepping away from the usual role of a software engineer,” says Cosine’s Pullen. “We’re seeing the sparks of that new workflow—what it means to be a software engineer going into the future.”

    What to expect from Neuralink in 2025

    MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

    In November, a young man named Noland Arbaugh announced he’d be livestreaming from his home for three days straight. His broadcast was in some ways typical fare: a backyard tour, video games, meet mom.

    The difference is that Arbaugh, who is paralyzed, has thin electrode-studded wires installed in his brain, which he used to move a computer mouse on a screen, click menus, and play chess. The implant, called N1, was installed last year by neurosurgeons working with Neuralink, Elon Musk’s brain-interface company.

    The possibility of listening to neurons and using their signals to move a computer cursor was first demonstrated more than 20 years ago in a lab setting. Now, Arbaugh’s livestream is an indicator that Neuralink is a whole lot closer to creating a plug-and-play experience that can restore people’s daily ability to roam the web and play games, giving them what the company has called “digital freedom.”

    But this is not yet a commercial product. The current studies are small-scale—they are true experiments, explorations of how the device works and how it can be improved. For instance, at some point last year, more than half the electrode-studded “threads” inserted into Arbaugh’s brain retracted, and his control over the device worsened; Neuralink rushed to implement fixes so he could use his remaining electrodes to move the mouse.

    Neuralink did not reply to emails seeking comment, but here is what our analysis of its public statements leads us to expect from the company in 2025.

    More patients

    How many people will get these implants? Elon Musk keeps predicting huge numbers. In August, he posted on X: “If all goes well, there will be hundreds of people with Neuralinks within a few years, maybe tens of thousands within five years, millions within 10 years.”

    In reality, the actual pace is slower—a lot slower. That’s because in a study of a novel device, it’s typical for the first patients to be staged months apart, to allow time to monitor for problems. 

    Neuralink has publicly announced that two people have received an implant: Arbaugh and a man referred to only as “Alex,” who received his in July or August. 

    Then, on January 8, Musk disclosed during an online interview that there was now a third person with an implant. “We’ve got now three patients, three humans with Neuralinks implanted, and they are all working … well,” Musk said. During 2025, he added, “we expect to hopefully do, I don’t know, 20 or 30 patients.”

    Barring major setbacks, expect the pace of implants to increase—although perhaps not as fast as Musk says. In November, Neuralink updated its US trial listing to include space for five volunteers (up from three), and it also opened a trial in Canada with room for six. Counting these two studies alone, Neuralink would carry out at least two more implants by the end of 2025 and eight more by the end of 2026.

    However, by opening further international studies, Neuralink could increase the pace of the experiments.

    Better control

    So how good is Arbaugh’s control over the mouse? You can get an idea by trying a game called Webgrid, where you try to click quickly on a moving target. The program translates your speed into a measure of information transfer: bits per second. 

    Neuralink claims Arbaugh reached a rate of over nine bits per second, doubling the old brain-interface record. The median able-bodied user scores around 10 bits per second, according to Neuralink.
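Neuralink has not published its exact formula, but throughput in target-selection tasks is conventionally computed along these lines: the information carried by each selection (the base-2 log of the number of possible targets minus one) times the net rate of correct selections. The function below is our illustrative reconstruction, not Neuralink’s code.

```python
import math

def bits_per_second(num_targets, correct, incorrect, seconds):
    """Information throughput for a grid-selection task.

    One common convention: each selection among num_targets choices
    conveys log2(num_targets - 1) bits, and wrong clicks are netted
    against right ones. (An assumption, not Neuralink's published math.)
    """
    net = max(correct - incorrect, 0)
    return math.log2(num_targets - 1) * net / seconds

# Hypothetical session: 1,025 targets, 60 clean selections in a minute.
print(bits_per_second(1025, 60, 0, 60))  # 10.0 — about the able-bodied median
```

By this yardstick, Arbaugh’s nine-plus bits per second puts him within striking distance of typical mouse users.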

    And yet during his livestream, Arbaugh complained that his mouse control wasn’t very good because his “model” was out of date. It was a reference to how his imagined physical movements get mapped to mouse movements. That mapping degrades over hours and days, and to recalibrate it, he has said, he spends as long as 45 minutes doing a set of retraining tasks on his monitor, such as imagining moving a dot from a center point to the edge of a circle.

    Noland Arbaugh stops to calibrate during a livestream on X
    @MODDEDQUAD VIA X

    Improving the software that sits between Arbaugh’s brain and the mouse is a big area of focus for Neuralink—one where the company is still experimenting and making significant changes. Among the goals: cutting the recalibration time to a few minutes. “We want them to feel like they are in the F1 [Formula One] car, not the minivan,” Bliss Chapman, who leads the BCI software team, told the podcaster Lex Fridman last year.

    Device changes

    Before Neuralink ever seeks approval to sell its brain interface, it will have to lock in a final device design that can be tested in a “pivotal trial” involving perhaps 20 to 40 patients, to show it really works as intended. That type of study could itself take a year or two to carry out and hasn’t yet been announced.

    In fact, Neuralink is still tweaking its implant in significant ways—for instance, by trying to increase the number of electrodes or extend the battery life. This month, Musk said the next human tests would be using an “upgraded Neuralink device.”

    The company is also still developing the surgical robot, called R1, that’s used to implant the device. It functions like a sewing machine: A surgeon uses R1 to thread the electrode wires into people’s brains. According to Neuralink’s job listings, improving the R1 robot and making the implant process entirely automatic is a major goal of the company. That’s partly to meet Musk’s predictions of a future where millions of people have an implant, since there wouldn’t be enough neurosurgeons in the world to put them all in manually. 

    “We want to get to the point where it’s one click,” Neuralink president Dongjin Seo told Fridman last year.

    Robot arm

    Late last year, Neuralink opened a companion study through which it says some of its existing implant volunteers will get to try using their brain activity to control not only a computer mouse but other types of external devices, including an “assistive robotic arm.”

    We haven’t yet seen what Neuralink’s robotic arm looks like—whether it’s a tabletop research device or something that could be attached to a wheelchair and used at home to complete daily tasks.

    But it’s clear such a device could be helpful. During Arbaugh’s livestream he frequently asked other people to do simple things for him, like brush his hair or put on his hat.

    Arbaugh demonstrates the use of Imagined Movement Control.
    @MODDEDQUAD VIA X

    And using brains to control robots is definitely possible—although so far only in a controlled research setting. In tests using a different brain implant, carried out at the University of Pittsburgh in 2012, a paralyzed woman named Jan Scheuermann was able to use a robot arm to stack blocks and plastic cups about as well as a person who’d had a severe stroke—impressive, since she couldn’t actually move her own limbs.

    There are several practical obstacles to using a robot arm at home. One is developing a robot that’s safe and useful. Another, as noted by Wired, is that the calibration steps to maintain control over an arm that can make 3D movements and grasp objects could be onerous and time consuming.

    Vision implant

    In September, Neuralink said it had received “breakthrough device designation” from the FDA for a version of its implant that could be used to restore limited vision to blind people. The system, which it calls Blindsight, would work by sending electrical impulses directly into a volunteer’s visual cortex, producing spots of light called phosphenes. If there are enough spots, they can be organized into a simple, pixelated form of vision, as previously demonstrated by academic researchers.

    The FDA designation is not the same as permission to start the vision study. Instead, it’s a promise by the agency to speed up review steps, including agreements around what a trial should look like. Right now, it’s impossible to guess when a Neuralink vision trial could start, but it won’t necessarily be this year. 

    More money

    Neuralink last raised money in 2023, collecting around $325 million from investors in a funding round that valued the company at over $3 billion, according to Pitchbook. Ryan Tanaka, who publishes a podcast about the company, Neura Pod, says he thinks Neuralink will raise more money this year and that the valuation of the private company could double.

    Fighting regulators

    Neuralink has attracted plenty of scrutiny from news reporters, animal-rights campaigners, and even fraud investigators at the Securities and Exchange Commission. Many of the questions surround its treatment of test animals and whether it rushed to try the implant in people.

    More recently, Musk has started using his X platform to badger and bully heads of state and was named by Donald Trump to co-lead a so-called Department of Government Efficiency, which Musk says will “get rid of nonsensical regulations” and potentially gut some DC agencies. 

    During 2025, watch for whether Musk uses his digital bullhorn to give health regulators pointed feedback on how they’re handling Neuralink.

    Other efforts

    Don’t forget that Neuralink isn’t the only company working on brain implants. A company called Synchron has one that’s inserted into the brain through a blood vessel, which it’s also testing in human trials of brain control over computers. Other companies, including Paradromics, Precision Neuroscience, and BlackRock Neurotech, are also developing advanced brain-computer interfaces.

    Special thanks to Ryan Tanaka of Neura Pod for pointing us to Neuralink’s public announcements and projections.

    Interest in nuclear power is surging. Is it enough to build new reactors?

    This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

    Lately, the vibes have been good for nuclear power. Public support is building, and public and private funding have made the technology more economical in key markets. There’s also a swell of interest from major companies looking to power their data centers. 

    These shifts have been great for existing nuclear plants. We’re seeing efforts to boost their power output, extend the lifetime of old reactors, and even reopen facilities that have shut down. That’s good news for climate action, because nuclear power plants produce consistent electricity with very low greenhouse-gas emissions.

    I covered all these trends in my latest story, which digs into what’s next for nuclear power in 2025 and beyond. But as I spoke with experts, one central question kept coming up for me: Will all of this be enough to actually get new reactors built?

    To zoom in on some of these trends, let’s take a look at the US, which has the largest fleet of nuclear reactors in the world (and the oldest, with an average age of over 42 years).

    In recent years we’ve seen a steady improvement in public support for nuclear power in the US. Today, around 56% of Americans support more nuclear power, up from 43% in 2020, according to a Pew Research poll.

    The economic landscape has also shifted in favor of the technology. The Inflation Reduction Act of 2022 includes tax credits specifically for operating nuclear plants, aimed at keeping them online. Qualifying plants can receive up to $15 per megawatt-hour, provided they meet certain labor requirements. (For context, in 2021, its last full year of operation, Palisades in Michigan generated over 7 million megawatt-hours.) 
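Some back-of-envelope arithmetic with the article’s figures shows what that credit can be worth to a single plant (illustrative only; the actual credit is reduced as a plant’s electricity revenues rise):

```python
# Rough annual value of the nuclear production tax credit for a plant
# the size of Palisades, using the figures cited above.
credit_per_mwh = 15            # dollars per megawatt-hour, maximum rate
annual_output_mwh = 7_000_000  # Palisades' 2021 generation, per the article

annual_credit = credit_per_mwh * annual_output_mwh
print(f"${annual_credit:,}")   # $105,000,000 a year at the full rate
```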

    Big Tech has also provided an economic boost for the industry—tech giants like Microsoft, Meta, Google, and Amazon are all making deals to get in on nuclear.

    These developments have made existing (or recently closed) nuclear power plants a hot commodity. Plants that might have been candidates for decommissioning just a few years ago are now candidates for license extension. Plants that have already shut down are seeing a potential second chance at life.

    There’s also the potential to milk more power out of existing facilities through changes called uprates, which allow plants to produce more energy by tweaking existing instruments and power generation systems. The US Nuclear Regulatory Commission has approved uprates totaling six gigawatts over the past two decades. That’s a small but significant fraction (about 6%) of the roughly 97 gigawatts of nuclear on the grid today. 

    Any reactors kept online, reopened, or ramped up spell good news for emissions. But expanding the nuclear fleet in the US will require not just making the most of existing assets, but building new reactors. 

    We’ll probably also need new reactors just to maintain the current fleet, since so many reactors are scheduled to be retired in the next couple of decades. Will the enthusiasm for keeping old plants running also translate into building new ones? 

    In much of the world (China being a notable exception), building new nuclear capacity has historically been expensive and slow. It’s easy to point at Plant Vogtle in the US: The third and fourth reactors at that facility began construction in 2009. They were originally scheduled to start up in 2016 and 2017, at a cost of around $14 billion. They actually came online in 2023 and 2024, and the total cost of the project was north of $30 billion.

    Some advanced technology has promised to fix the problems in nuclear power. Small modular reactors could help cut cost and construction times, and next-generation reactors promise safety and efficiency improvements that could translate to cheaper, quicker construction. Realistically, though, getting these first-of-their-kind projects off the ground will still require a lot of money and a sustained commitment to making them happen. “The next four years are make or break for advanced nuclear,” says Jessica Lovering, cofounder at the Good Energy Collective, a policy research organization that advocates for the use of nuclear energy.  

    There are a few factors that could help the progress we’ve seen recently in nuclear extend to new builds. For one, public support from the US Department of Energy includes not only tax credits but public loans and grants for demonstration projects, which can be a key stepping stone to commercial plants that generate electricity for the grid. 

    Changes to the regulatory process could also help. The Advance Act, passed in 2024, aims to spruce up the Nuclear Regulatory Commission (NRC) in the hope of making the approval process more efficient (currently, it can take up to five years to complete). 

    “If you can see the NRC really start to modernize toward a more efficient, effective, and predictable regulator, it really helps the case for a lot of these commercial projects, because the NRC will no longer be seen as this barrier to innovation,” says Patrick White, research director at the Nuclear Innovation Alliance, a nonprofit think tank. We should start to see changes from that legislation this year, though what happens could depend on the Trump administration.

    The next few years are crucial for next-generation nuclear technology, and how the industry fares between now and the end of the decade could be very telling when it comes to how big a role this technology plays in our longer-term efforts to decarbonize energy. 


    Now read the rest of The Spark

    Related reading

    For more on what’s next for nuclear power, check out my latest story.

    One key trend I’m following is efforts to reopen shuttered nuclear plants. Here’s how to do it.  

    Kairos Power is working to build molten-salt-cooled reactors, and we named the company to our list of 10 Climate Tech Companies to watch in 2024.  

    Another thing 

    Devastating wildfires have been ravaging Southern California. Here’s a roundup of some key stories about the blazes. 

    → Strong winds have continued this week, bringing with them the threat of new fires. Here’s a page with live updates on the latest. (Washington Post)

    → Officials are scouring the spot where the deadly Palisades fire started to better understand how it was sparked. (New York Times)

    → Climate change didn’t directly start the fires, but global warming did contribute to how intensely they burned and how quickly they spread. (Axios)

    → The LA fires show that controlled burns aren’t a cure-all when it comes to preventing wildfires. (Heatmap News)

    → Seawater is a last resort when it comes to fighting fires, since it’s corrosive and can harm the environment when dumped on a blaze. (Wall Street Journal)

    Keeping up with climate  

    US emissions cuts stalled last year, despite strong growth in renewables. The cause: After staying flat or falling for two decades, electricity demand is rising. (New York Times)

    With Donald Trump set to take office in the US next week, many are looking to state governments as a potential seat of climate action. Here’s what to look for in states including Texas, California, and Massachusetts. (Inside Climate News)

    The US could see as many as 80 new gas-fired power plants built by 2030. The surge comes as demand for power from data centers, including those powering AI, is ballooning. (Financial Times)

    Global sales of EVs and plug-in hybrids were up 25% in 2024 from the year before. China, the world’s largest EV market, is a major engine behind the growth. (Reuters)

    A massive plant to produce low-emissions steel could be in trouble. Steelmaker SSAB has pulled out of talks on federal funding for a plant in Mississippi. (Canary Media)

    Some solar panel companies have turned to door-to-door sales. Things aren’t always so sunny for those involved. (Wired)

    Deciding the fate of “leftover” embryos

    This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

    Over the past few months, I’ve been working on a piece about IVF embryos. The goal of in vitro fertilization is to create babies via a bit of lab work: Trigger the release of lots of eggs, introduce them to sperm in a lab, transfer one of the resulting embryos into a person’s uterus, and cross your fingers for a healthy pregnancy. Sometimes it doesn’t work. But often it does. For the article, I explored what happens to the healthy embryos that are left over.

    I spoke to Lisa Holligan, who had IVF in the UK around five years ago. Holligan donated her “genetically abnormal” embryos for scientific research. But she still has one healthy embryo frozen in storage. And she doesn’t know what to do with it.

    She’s not the only one struggling with the decision. “Leftover” embryos are kept frozen in storage tanks, where they sit in little straws, invisible to the naked eye, their growth paused in a state of suspended animation. What happens next is down to personal choice—but that choice can be limited by a complex web of laws and ethical and social factors.

    These days, responsible IVF clinics will always talk to people about the possibility of having leftover embryos before they begin treatment. Intended parents will sign a form indicating what they would like to happen to those embryos. Typically, that means deciding early on whether they might like any embryos they don’t end up using to be destroyed or donated, either to someone else trying to conceive or for research.

    But it can be really difficult to make these decisions before you’ve even started treatment. People seeking fertility treatment will usually have spent a long time trying to get pregnant. They are hoping for healthy embryos, and some can’t imagine having any left over—or how they might feel about them.

    For a lot of people, embryos are not just balls of cells. They hold the potential for life, after all. Some people see them as children, waiting to be born. Some even name their embryos, or call them their “freezer babies.” Others see them as the product of a long, exhausting, and expensive IVF journey.

    Holligan says that she initially considered donating her embryo to another person, but her husband disagreed. He saw the embryo as their child and said he wouldn’t feel comfortable with giving it up to another family. “I started having these thoughts about a child coming to me when they’re older, saying they’ve had a terrible life, and [asking] ‘Why didn’t you have me?’” she told me.

    Holligan lives in the UK, where you can store your embryos for up to 55 years. Destroying or donating them are also options. That’s not the case in other countries. In Italy, for example, embryos cannot be destroyed or donated. Any that are frozen will remain that way forever, unless the law changes at some point.

    In the US, regulations vary by state. The patchwork of laws means that one state can bestow a legal status on embryos, giving them the same rights as children, while another might have no legislation in place at all.

    No one knows for sure how many embryos are frozen in storage tanks, but the figure is thought to be somewhere between 1 million and 10 million in the US alone. Some of these embryos have been in storage for years or decades. In some cases, the intended parents have deliberately chosen this, opting to pay hundreds of dollars per year in fees.

    But in other cases, clinics have lost touch with their clients. Many of these former clients have stopped paying for the storage of their embryos, but without up-to-date consent forms, clinics can be reluctant to destroy them. What if the person comes back and wants to use those embryos after all?

    “Most clinics, if they have any hesitation or doubt or question, will err on the side of holding on to those embryos and not discarding them,” says Sigal Klipstein, a reproductive endocrinologist at InVia Fertility Center in Chicago, who also chairs the ethics committee of the American Society for Reproductive Medicine. “Because it’s kind of like a one-way ticket.”

    Klipstein thinks one of the reasons why some embryos end up “abandoned” in storage is that the people who created them can’t bring themselves to destroy them. “It’s just very emotionally difficult for someone who has wanted so much to have a family,” she tells me.

    Klipstein says she regularly talks to her patients about what to do with leftover embryos. Even people who make the decision with confidence can change their minds, she says. “We’ve all had those patients who have discarded embryos and then come back six months or a year later and said: ‘Oh, I wish I had those embryos,’” she tells me. “Those [embryos may have been] their best chance of pregnancy.”

    Those who do want to discard their embryos have options. Often, the embryos will simply be exposed to air and then disposed of. But some clinics will also offer to transfer them at a time or place where a pregnancy is extremely unlikely to result. This “compassionate transfer,” as it is known, might be viewed as a more “natural” way to dispose of the embryo.

    But it’s not for everyone. Holligan has experienced multiple miscarriages and wonders if a compassionate transfer might feel similar. She wonders if it might just end up “putting [her] body and mind through unnecessary stress.”

    Ultimately, for Holligan and many others in a similar position, the choice remains a difficult one. “These are … very desired embryos,” says Klipstein. “The purpose of going through IVF was to create embryos to make babies. And [when people] have these embryos, and they’ve completed their family plan, they’re in a place they couldn’t have imagined.”


    Now read the rest of The Checkup

    Read more from MIT Technology Review‘s archive

    Our relationship with embryos is unique, and a bit all over the place. That’s partly because we can’t agree on their moral status. Are they more akin to people or property, or something in between? Who should get to decide their fate? While we get to the bottom of these sticky questions, millions of embryos are stuck in suspended animation—some of them indefinitely.

    It is estimated that over 12 million babies have been born through IVF. The development of the Nobel Prize–winning technology behind the procedure relied on embryo research. Some worry that donating embryos for research can be onerous—and that valuable embryos are being wasted as a result.

    Fertility rates around the world are dropping below the levels needed to maintain stable populations. But IVF can’t save us from a looming fertility crisis. Gender equality and family-friendly policies are much more likely to prove helpful.

    Two years ago, the US Supreme Court overturned Roe v. Wade, a legal decision that protected the right to abortion. Since then, abortion bans have been enacted in multiple states. But in November of last year, some states voted to extend and protect access to abortion, and voters in Missouri supported overturning the state’s ban.

    Last year, a ruling by the Alabama Supreme Court that embryos count as children ignited fears over access to fertility treatments in a state that had already banned abortion. The move could also have implications for the development of technologies like artificial uteruses and synthetic embryos, my colleague Antonio Regalado wrote at the time.

    From around the web

    It’s not just embryos that are frozen as part of fertility treatments. Eggs, sperm, and even ovarian and testicular tissue can be stored too. A man who had immature testicular tissue removed and frozen before undergoing chemotherapy as a child 16 years ago had the tissue reimplanted in a world first, according to the team at University Hospital Brussels that performed the procedure around a month ago. The tissue was placed into the man’s testicle and scrotum, and scientists will wait a year before testing to see if he is successfully producing sperm. (UZ Brussel)

    The Danish pharmaceutical company Novo Nordisk makes half the world’s insulin. Now it is better known as the manufacturer of the semaglutide drug Ozempic. How will the sudden shift affect the production and distribution of these medicines around the world? (Wired)

    The US has not done enough to prevent the spread of the H5N1 virus in dairy cattle. The response to bird flu is a national embarrassment, argues Katherine J. Wu. (The Atlantic)

    Elon Musk has said that if all goes well, millions of people will have brain-computer devices created by his company Neuralink implanted within 10 years. In reality, progress is slower: so far, three people have received the devices, according to Musk. My colleague Antonio Regalado predicts what we can expect from Neuralink in 2025. (MIT Technology Review)

    We need to protect the protocol that runs Bluesky

    Last week, when Mark Zuckerberg announced that Meta would be ending third-party fact-checking, it was a shocking pivot, but not exactly surprising. It’s just the latest example of a billionaire flip-flop affecting our social lives on the internet. 

    After January 6, 2021, Zuckerberg bragged to Congress about Facebook’s “industry-leading fact-checking program” and banned Donald Trump from the platform. But just two years later, he welcomed Trump back. And last year Zuckerberg was privately reassuring the conservative congressman Jim Jordan that Meta would no longer demote questionable content while it’s being fact-checked. 

    Now, not only is Meta ending fact-checking completely; it is loosening rules around hate speech, allowing horrendous personal attacks on migrants and trans people, for example, on its platforms. 

    And Zuckerberg isn’t the only social media CEO careening all over the road: Elon Musk, since buying Twitter in 2022 and touting free speech as “the bedrock of a functioning democracy,” has suspended journalists, restored tens of thousands of banned users (including white nationalists), brought back political advertising, and weakened verification and harassment policies. 

    Unfortunately, these capricious billionaires can do whatever they want because of an ownership model that privileges singular, centralized control in exchange for shareholder returns.

    And this has led to a constantly shifting digital environment in which people can lose their communication pathways and livelihoods in a second, with no recourse, as opaque rules change. 

    The internet doesn’t need to be like this. As luck would have it, a new way is emerging just in time. 

    If you’ve heard of Bluesky, you’ve probably heard of it as a clone of Twitter where liberals can take refuge. But under the hood it’s structured fundamentally differently—in a way that could point us to a healthier internet for everyone, regardless of politics or identity. 

    Just like email, Bluesky sits on top of an open protocol, in this case known as the AT Protocol. In practice, that means that anyone can build on it. Just as you wouldn’t need anyone’s permission to start a newsletter company built on email, people are starting to share remixed versions of their social media feeds, built on Bluesky. This sounds like a small thing, but think about all the harms enabled by social media companies’ algorithms in the last decade: insurrection, radicalization, self-harm, bullying. Bluesky enables users to collaborate on verification and moderation by sharing block lists and labels. Letting people shape their own experience of social media is nothing short of revolutionary. 

    And importantly, if you decide that you don’t agree with Bluesky’s design and moderation decisions, you can build something else on the same infrastructure and use that instead. This is fundamentally different from the dominant, centralized social media that has prevailed until now.

    At the core of Bluesky’s philosophy is the idea that instead of being centralized in the hands of one person or institution, social media governance should obey the principle of subsidiarity. The Nobel Prize–winning economist Elinor Ostrom found, through studying grassroots solutions to local environmental problems around the world, that some problems are best solved locally, while others are best solved at a higher level. 

    In terms of content moderation, posts related to child sexual abuse or terrorism are best handled by professionals trained to help keep millions or billions safe. But a lot of decisions about speech can be solved in each community, or even user by user as people assemble Bluesky block lists. 

    So all the right elements are currently in place at Bluesky to usher in this new architecture for social media: independent ownership, newfound popularity, a stark contrast with other dominant platforms, and right-minded leadership. But challenges remain, and we can’t count on Bluesky to do this right without support. 

    Critics have pointed out that Bluesky has yet to turn a profit and is currently running on venture capital, the same corporate structure that brought us Facebook, Twitter, and other social media companies. As of now, there’s no option to exit Bluesky and take your data and network with you, because there are no other servers that run the AT Protocol. Bluesky CEO Jay Graber deserves credit for her stewardship so far, and for attempting to avoid the dangers of advertising incentives. But the process by which capitalism degrades tech products is so predictable that Cory Doctorow coined a now-popular term for it: enshittification.

    That’s why we need to act now to secure the foundation of this digital future and make it enshittification-proof. This week, prominent technologists started a new project, which we at New_ Public are supporting, called Free Our Feeds. There are three parts: First, Free Our Feeds wants to create a nonprofit foundation to govern and protect the AT Protocol, outside of Bluesky the company. We also need to build redundant servers so all users can leave with their data or build anything they want—regardless of policies set by Bluesky. Finally, we need to spur the development of a whole ecosystem built on this tech with seed money and expertise. 

    It’s worth noting that this is not a hostile takeover: Bluesky and Graber recognize the importance of this effort and have signaled their approval. But the point is, it can’t rely on them. To free us from fickle billionaires, some of the power has to reside outside Bluesky, Inc. 

    If we get this right, so much is possible. Not too long ago, the internet was full of builders and people working together: the open web. Email. Podcasts. Wikipedia is one of the best examples—a collaborative project to create one of the web’s best free, public resources. And the reason we still have it today is the infrastructure built up around it: The nonprofit Wikimedia Foundation protects the project and insulates it from the pressures of capitalism. When’s the last time we collectively built anything as good?

    We can shift the balance of power and reclaim our social lives from these companies and their billionaires. This is an opportunity to bring much more independence, innovation, and local control to our online conversations. We can finally build the “Wikipedia of social media,” or whatever we want. But we need to act, because the future of the internet can’t depend on whether one of the richest men on Earth wakes up on the wrong side of the bed. 

    Eli Pariser is the author of The Filter Bubble and Co-Director of New_ Public, a nonprofit R&D lab that’s working to reimagine social media. 

    Deepti Doshi is a Co-Director of New_ Public and was a director at Meta.

    OpenAI has created an AI model for longevity science

    When you think of AI’s contributions to science, you probably think of AlphaFold, the Google DeepMind protein-folding program that earned its creator a Nobel Prize last year.

    Now OpenAI says it’s getting into the science game too—with a model for engineering proteins.

    The company says it has developed a language model that dreams up proteins capable of turning regular cells into stem cells—and that it has handily beaten humans at the task.

    The work represents OpenAI’s first model focused on biological data and its first public claim that its models can deliver unexpected scientific results. As such, it is a step toward determining whether or not AI can make true discoveries, which some argue is a major test on the pathway to “artificial general intelligence.”

    Last week, OpenAI CEO Sam Altman said he was “confident” his company knows how to build an AGI, adding that “superintelligent tools could massively accelerate scientific discovery and innovation well beyond what we are capable of doing on our own.” 

    The protein engineering project started a year ago when Retro Biosciences, a longevity research company based in San Francisco, approached OpenAI about working together.

    That link-up did not happen by chance. Altman personally funded Retro with $180 million, as MIT Technology Review first reported in 2023.

    Retro has the goal of extending the normal human lifespan by 10 years. For that, it studies what are called Yamanaka factors. Those are a set of proteins that, when added to a human skin cell, will cause it to morph into a young-seeming stem cell, a type that can produce any other tissue in the body. 

    It’s a phenomenon that researchers at Retro, and at richly funded companies like Altos Labs, see as the possible starting point for rejuvenating animals, building human organs, or providing supplies of replacement cells.

    But such cell “reprogramming” is not very efficient. It takes several weeks, and less than 1% of cells treated in a lab dish will complete the rejuvenation journey.

    OpenAI’s new model, called GPT-4b micro, was trained to suggest ways to re-engineer the protein factors to increase their function. According to OpenAI, researchers used the model’s suggestions to change two of the Yamanaka factors to be more than 50 times as effective—at least according to some preliminary measures. 

    “Just across the board, the proteins seem better than what the scientists were able to produce by themselves,” says John Hallman, an OpenAI researcher.

    Hallman and OpenAI’s Aaron Jaech, as well as Rico Meinl from Retro, were the model’s lead developers.

    Outside scientists won’t be able to tell if the results are real until they’re published, something the companies say they are planning. Nor is the model available for wider use—it’s still a bespoke demonstration, not an official product launch.

    “This project is meant to show that we’re serious about contributing to science,” says Jaech. “But whether those capabilities will come out to the world as a separate model or whether they’ll be rolled into our mainline reasoning models—that’s still to be determined.”

    The model does not work the same way as Google’s AlphaFold, which predicts what shape proteins will take. Since the Yamanaka factors are unusually floppy and unstructured proteins, OpenAI said, they called for a different approach, which its large language models were suited to.

    The model was trained on examples of protein sequences from many species, as well as information on which proteins tend to interact with one another. While that’s a lot of data, it’s just a fraction of what OpenAI’s flagship chatbots were trained on, making GPT-4b an example of a “small language model” that works with a focused data set.

    Once Retro scientists were given the model, they tried to steer it to suggest possible redesigns of the Yamanaka proteins. The prompting tactic used is similar to the “few-shot” method, in which a user queries a chatbot by providing a series of examples with answers, followed by an example for the bot to respond to.
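    The few-shot pattern is easy to illustrate in generic terms: the prompt pairs each example input with its answer, then ends with an unanswered input for the model to complete. The sketch below is a schematic illustration only; the truncated sequences and the wording are invented placeholders, not data or prompts from the OpenAI/Retro work.

```python
# Generic illustration of few-shot prompting. Each example pairs an
# input with its answer; the final query is left blank for the model.
# All sequences here are invented placeholders.

examples = [
    ("MSKGEELFT...", "MSKGAELFT..."),  # (original fragment, improved variant)
    ("GDVNGHKFS...", "GDVQGHKFS..."),
]

prompt_lines = ["Suggest an improved variant for each protein sequence."]
for original, improved in examples:
    prompt_lines.append(f"Original: {original}")
    prompt_lines.append(f"Improved: {improved}")

# The final query: the model is expected to fill in the blank answer.
prompt_lines.append("Original: MVSKGEELF...")
prompt_lines.append("Improved:")

prompt = "\n".join(prompt_lines)
print(prompt)
```

    The examples prime the model on the input-output relationship, so its completion of the last, unanswered line follows the same pattern.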

    Although genetic engineers have ways to direct the evolution of molecules in the lab, they can usually test only so many possibilities. And even a protein of typical length can be changed in a nearly infinite number of ways, since it’s built from hundreds of amino acids and each position can hold any of 20 varieties.
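    A back-of-the-envelope calculation shows just how vast that search space is. The sketch below simply computes 20 raised to the length of a hypothetical 300-residue protein; the specific length is an illustrative assumption, not a figure from the study.

```python
# Rough scale of the protein search space described above: each
# position in a chain can be any of 20 standard amino acids, so a
# chain of n residues has 20**n possible sequences.

def sequence_space(n_residues: int) -> int:
    """Number of distinct sequences for a chain of n_residues."""
    return 20 ** n_residues

# A hypothetical, typical-length protein of 300 amino acids:
digits = len(str(sequence_space(300)))
print(f"20^300 is a number with {digits} digits")  # a 391-digit number
```

    Even testing billions of variants per day, a lab could never enumerate more than a vanishing sliver of that space, which is why model-guided suggestions are attractive.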

    OpenAI’s model, however, often spits out suggestions in which a third of the amino acids in the proteins are changed.

    [Images: fibroblasts on day 1; cells reprogrammed with SOX2, KLF4, OCT4, and MYC on day 10; and cells reprogrammed with RetroSOX, RetroKLF, OCT4, and MYC on day 10. Credit: OPENAI]

    “We threw this model into the lab immediately and we got real-world results,” says Retro’s CEO, Joe Betts-Lacroix. He says the model’s ideas were unusually good, leading to improvements over the original Yamanaka factors in a substantial fraction of cases.

    Vadim Gladyshev, a Harvard University aging researcher who consults with Retro, says better ways of making stem cells are needed. “For us, it would be extremely useful. [Skin cells] are easy to reprogram, but other cells are not,” he says. “And to do it in a new species—it’s often extremely different, and you don’t get anything.” 

    How exactly GPT-4b arrives at its guesses is still unclear, as is often the case with AI models. “It’s like when AlphaGo crushed the best human at Go, but it took a long time to find out why,” says Betts-Lacroix. “We are still figuring out what it does, and we think the way we apply this is only scratching the surface.”

    OpenAI says no money changed hands in the collaboration. But because the work could benefit Retro—whose biggest investor is Altman—the announcement may add to questions swirling around the OpenAI CEO’s side projects.

    Last year, the Wall Street Journal said Altman’s wide-ranging investments in private tech startups amount to an “opaque investment empire” that is “creating a mounting list of potential conflicts,” since some of these companies also do business with OpenAI.

    In Retro’s case, simply being associated with Altman, OpenAI, and the race toward AGI could boost its profile and increase its ability to hire staff and raise funds. Betts-Lacroix did not answer questions about whether the early-stage company is currently in fundraising mode. 

    OpenAI says Altman was not directly involved in the work and that it never makes decisions based on Altman’s other investments. 

    Meta’s new AI model can translate speech from more than 100 languages

    Meta has released a new AI model that can translate speech from 101 different languages. It represents a step toward real-time, simultaneous interpretation, where words are translated as soon as they come out of someone’s mouth. 

    Typically, translation models for speech use a multistep approach. First they translate speech into text. Then they translate that text into text in another language. Finally, that translated text is turned into speech in the new language. This method can be inefficient, and at each step, errors and mistranslations can creep in. But Meta’s new model, called SeamlessM4T, enables more direct translation from speech in one language to speech in another. The model is described in a paper published today in Nature.
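    The cascaded pipeline described above can be sketched with toy dictionary lookups standing in for each model stage. This is not Meta’s code or any real speech API; it only shows why a mistake at any of the three stages propagates to the final output.

```python
# Toy sketch of a cascaded speech-translation pipeline. Dictionary
# lookups stand in for the three models; filenames stand in for audio.

ASR = {"hola mundo.wav": "hola mundo"}    # stage 1: speech -> source text
MT = {"hola mundo": "hello world"}        # stage 2: source text -> target text
TTS = {"hello world": "hello world.wav"}  # stage 3: target text -> speech

def cascaded_translate(audio_clip: str) -> str:
    text = ASR[audio_clip]       # transcribe the source speech
    translated = MT[text]        # translate the transcript
    return TTS[translated]       # synthesize speech in the target language

print(cascaded_translate("hola mundo.wav"))  # -> hello world.wav
```

    A direct speech-to-speech model collapses these three stages into one, which is what makes SeamlessM4T both faster and less prone to compounding errors.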

    Seamless can translate text with 23% more accuracy than the top existing models. And although another model, Google’s AudioPaLM, can technically translate more languages—113 of them, versus 101 for Seamless—it can translate them only into English. SeamlessM4T can translate into 36 other languages.

    The key is a process called parallel data mining, which scans crawled web data for instances where the audio in a video matches a subtitle in another language. The model learned to associate those sounds in one language with the matching pieces of text in another. This opened up a whole new trove of translation examples for the model.

    “Meta has done a great job having a breadth of different things they support, like text-to-speech, speech-to-text, even automatic speech recognition,” says Chetan Jaiswal, a professor of computer science at Quinnipiac University, who was not involved in the research. “The mere number of languages they are supporting is a tremendous achievement.”

    Human translators are still a vital part of the translation process, the researchers say in the paper, because they can grapple with diverse cultural contexts and make sure the same meaning is conveyed from one language into another. This step is important, says Lynne Bowker, Canada Research Chair in Translation, Technologies and Society at Université Laval in Quebec, who didn’t work on Seamless. “Languages are a reflection of cultures, and cultures have their own ways of knowing things,” she says. 

    When it comes to applications like medicine or law, machine translations need to be thoroughly checked by a human, she says. If not, misunderstandings can result. For example, when Google Translate was used to translate public health information about the covid-19 vaccine from the Virginia Department of Health in January 2021, it translated “not mandatory” in English into “not necessary” in Spanish, changing the whole meaning of the message.

    AI models have many more examples to train on in some languages than in others. This means current speech-to-speech models may be able to translate a language like Greek into English, where there may be many examples, but cannot translate from Swahili to Greek. The team behind Seamless aimed to solve this problem by pre-training the model on millions of hours of spoken audio in different languages. This pre-training allowed it to recognize general patterns in language, making it easier to process less widely spoken languages because it already had some baseline for what spoken language is supposed to sound like.  

    The system is open-source, which the researchers hope will encourage others to build upon its current capabilities. But some are skeptical of how useful it may be compared with available alternatives. “Google’s translation model is not as open-source as Seamless, but it’s way more responsive and fast, and it doesn’t cost academics anything,” says Jaiswal.

    The most exciting thing about Meta’s system is that it points to the possibility of instant interpretation across languages in the not-too-distant future—like the Babel fish in Douglas Adams’ cult novel The Hitchhiker’s Guide to the Galaxy. SeamlessM4T is faster than existing models but still not instant. That said, Meta claims to have a newer version of Seamless that’s as fast as human interpreters. 

    “While having this kind of delayed translation is okay and useful, I think simultaneous translation will be even more useful,” says Kenny Zhu, director of the Arlington Computational Linguistics Lab at the University of Texas at Arlington, who is not affiliated with the new research.