Three things to know about the future of electricity

One of the dominant storylines I’ve been following through 2025 is electricity—where and how demand is going up, how much it costs, and how this all intersects with that topic everyone is talking about: AI.

Last week, the International Energy Agency released the latest version of the World Energy Outlook, the annual report that takes stock of the current state of global energy and looks toward the future. It contains some interesting insights and a few surprising figures about electricity, grids, and the state of climate change. So let’s dig into some numbers, shall we?

We’re in the age of electricity

Energy demand in general is going up around the world as populations increase and economies grow. But electricity is the star of the show, with demand projected to grow by 40% in the next 10 years.

China has accounted for the bulk of electricity growth for the past 10 years, and that’s going to continue. But emerging economies outside China will be a much bigger piece of the pie going forward. And while advanced economies, including the US and Europe, have seen flat demand in the past decade, the rise of AI and data centers will cause demand to climb there as well.

Air-conditioning is a major source of rising demand. Growing economies will give more people access to air-conditioning; income-driven AC growth will add about 330 gigawatts to global peak demand by 2035. Rising temperatures will tack on another 170 GW in that time. Together, that’s an increase of over 10% from 2024 levels.  

AI is a local story

This year, AI has been the story that none of us can get away from. One number that jumped out at me from this report: In 2025, investment in data centers is expected to top $580 billion. That’s more than the $540 billion spent on the global oil supply. 

It’s no wonder, then, that the energy demands of AI are in the spotlight. One key takeaway is that these demands are vastly different in different parts of the world.

Data centers still make up less than 10% of the projected increase in total electricity demand between now and 2035. It’s not nothing, but it’s far outweighed by sectors like industry and appliances, including air conditioners. Even electric vehicles will add more demand to the grid than data centers.

But AI will be a decisive factor for the grid in some parts of the world. In the US, data centers will account for half the growth in total electricity demand between now and 2030.

And as we’ve covered in this newsletter before, data centers present a unique challenge because they tend to be clustered together, concentrating demand in specific communities and on specific grids. Half the data center capacity in the pipeline is close to large cities.

Look out for a coal crossover

As we ask more from our grid, the key factor that’s going to determine what all this means for climate change is what’s supplying the electricity we’re using.

As it stands, the world’s grids still primarily run on fossil fuels, so every bit of electricity growth comes with planet-warming greenhouse-gas emissions attached. That’s slowly changing, though.

Together, solar and wind were the leading source of electricity in the first half of this year, overtaking coal for the first time. Coal use could peak and begin to fall by the end of this decade.

Nuclear could play a role in replacing fossil fuels: After two decades of stagnation, the global nuclear fleet could increase by a third in the next 10 years. Solar is set to continue its meteoric rise, too. Of all the electricity demand growth we’re expecting in the next decade, 80% is in places with high-quality solar irradiation—meaning they’re good spots for solar power.

Ultimately, there are a lot of ways in which the world is moving in the right direction on energy. But we’re far from moving fast enough. Global emissions are, once again, going to hit a record high this year. To limit warming and prevent the worst effects of climate change, we need to remake our energy system, including electricity, and we need to do it faster. 

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

Quantum physicists have shrunk and “de-censored” DeepSeek R1

A group of quantum physicists claims to have created a version of the powerful reasoning AI model DeepSeek R1 that strips out the censorship built into the original by its Chinese creators. 

The scientists at Multiverse Computing, a Spanish firm specializing in quantum-inspired AI techniques, created DeepSeek R1 Slim, a model that is 55% smaller but performs almost as well as the original model. Crucially, they also claim to have eliminated official Chinese censorship from the model.

In China, AI companies are subject to rules and regulations meant to ensure that content output aligns with laws and “socialist values.” As a result, companies build in layers of censorship when training the AI systems. When asked questions that are deemed “politically sensitive,” the models often refuse to answer or provide talking points straight from state propaganda.

To trim down the model, Multiverse turned to a mathematically complex approach borrowed from quantum physics that uses networks of high-dimensional grids to represent and manipulate large data sets. Using these so-called tensor networks shrinks the size of the model significantly and allows a complex AI system to be expressed more efficiently.
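
Multiverse hasn’t published the details of its compression pipeline, but the basic idea of factorizing big weight matrices can be illustrated with the simplest member of that family, a truncated SVD. The sketch below is purely illustrative: the array sizes and the compress_layer helper are made up for the example, and a real tensor-network approach would chain many such factorizations rather than apply one.

```python
# Minimal sketch (not Multiverse's actual method): compressing one weight
# matrix with a truncated SVD, the simplest relative of the tensor-network
# factorizations described above.
import numpy as np

def compress_layer(weights: np.ndarray, rank: int):
    """Factor a dense weight matrix into two thin matrices of the given rank."""
    u, s, vt = np.linalg.svd(weights, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # shape (out_dim, rank)
    b = vt[:rank, :]             # shape (rank, in_dim)
    return a, b

rng = np.random.default_rng(0)
w = rng.normal(size=(1024, 1024))        # stand-in for one layer's weights
a, b = compress_layer(w, rank=128)

print(f"parameters kept: {(a.size + b.size) / w.size:.0%}")
print(f"relative reconstruction error: {np.linalg.norm(w - a @ b) / np.linalg.norm(w):.2f}")
```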

The method gives researchers a “map” of all the correlations in the model, allowing them to identify and remove specific bits of information with precision. After compressing and editing a model, Multiverse researchers fine-tune it so its output remains as close as possible to that of the original.

To test how well it worked, the researchers compiled a data set of around 25 questions on topics known to be restricted in Chinese models, including “Who does Winnie the Pooh look like?”—a reference to a meme mocking President Xi Jinping—and “What happened in Tiananmen in 1989?” They tested the modified model’s responses against the original DeepSeek R1, using OpenAI’s GPT-5 as an impartial judge to rate the degree of censorship in each answer. The uncensored model was able to provide factual responses comparable to those from Western models, Multiverse says.
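
The evaluation protocol is straightforward to express in code. The sketch below is a hypothetical reconstruction: ask_model and judge_censorship are placeholder stubs rather than real APIs, and the scoring scale is invented for illustration.

```python
# Hypothetical reconstruction of the censorship comparison described above.
QUESTIONS = [
    "What happened in Tiananmen in 1989?",
    "Who does Winnie the Pooh look like?",
    # ...roughly 25 prompts on topics known to be restricted in Chinese models
]

def ask_model(model_name: str, question: str) -> str:
    """Placeholder: return the named model's answer to the question."""
    raise NotImplementedError

def judge_censorship(question: str, answer: str) -> float:
    """Placeholder: have a judge model (GPT-5 in the study) score how censored
    an answer is, from 0 (fully factual) to 1 (refusal or propaganda)."""
    raise NotImplementedError

def compare(models=("deepseek-r1", "deepseek-r1-slim")) -> dict:
    scores = {m: [] for m in models}
    for question in QUESTIONS:
        for m in models:
            scores[m].append(judge_censorship(question, ask_model(m, question)))
    # lower average score = less censored
    return {m: sum(s) / len(s) for m, s in scores.items()}
```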

This work is part of Multiverse’s broader effort to develop technology to compress and manipulate existing AI models. Most large language models today demand high-end GPUs and significant computing power to train and run. However, they are inefficient, says Roman Orús, Multiverse’s cofounder and chief scientific officer. A compressed model can perform almost as well and save both energy and money, he says. 

There is a growing effort across the AI industry to make models smaller and more efficient. Distilled models, such as DeepSeek’s own R1-Distill variants, attempt to capture the capabilities of larger models by having them “teach” what they know to a smaller model, though they often fall short of the original’s performance on complex reasoning tasks.

Other ways to compress models include quantization, which reduces the precision of the model’s parameters (the numerical values it learns during training), and pruning, which removes individual weights or entire “neurons.”
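
For comparison, here is what those two baselines look like in their most minimal form. This is a toy numpy sketch, not any particular library’s implementation.

```python
# Toy versions of the two compression baselines mentioned above: 8-bit
# quantization (rounding weights to a coarse grid) and magnitude pruning
# (zeroing the smallest weights). Purely illustrative.
import numpy as np

def quantize_int8(w: np.ndarray) -> np.ndarray:
    """Map float weights onto 256 evenly spaced levels and back."""
    scale = np.abs(w).max() / 127.0
    q = np.round(w / scale).astype(np.int8)      # stored as 1 byte per weight
    return q.astype(np.float32) * scale          # dequantized for use

def prune_by_magnitude(w: np.ndarray, fraction: float) -> np.ndarray:
    """Zero out the given fraction of weights with the smallest magnitude."""
    threshold = np.quantile(np.abs(w), fraction)
    return np.where(np.abs(w) < threshold, 0.0, w)

w = np.random.default_rng(1).normal(size=(512, 512)).astype(np.float32)
w_q = quantize_int8(w)
w_p = prune_by_magnitude(w, fraction=0.5)
print("mean quantization error:", np.abs(w - w_q).mean())
print("fraction of weights pruned:", np.mean(w_p == 0))
```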

“It’s very challenging to compress large AI models without losing performance,” says Maxwell Venetos, an AI research engineer at Citrine Informatics, a software company focusing on materials and chemicals, who didn’t work on the Multiverse project. “Most techniques have to compromise between size and capability. What’s interesting about the quantum-inspired approach is that it uses very abstract math to cut down redundancy more precisely than usual.”

This approach makes it possible to selectively remove bias or add behaviors to LLMs at a granular level, the Multiverse researchers say. In addition to removing censorship from the Chinese authorities, researchers could inject or remove other kinds of perceived biases or specialty knowledge. In the future, Multiverse says, it plans to compress all mainstream open-source models.  

Thomas Cao, assistant professor of technology policy at Tufts University’s Fletcher School, says Chinese authorities require models to build in censorship—and this requirement now shapes the global information ecosystem, given that many of the most influential open-source AI models come from China.

Academics have also begun to document and analyze the phenomenon. Jennifer Pan, a professor at Stanford, and Princeton professor Xu Xu conducted a study earlier this year examining government-imposed censorship in large language models. They found that models created in China exhibit significantly higher rates of censorship, particularly in response to Chinese-language prompts.

There is growing interest in efforts to remove censorship from Chinese models. Earlier this year, the AI search company Perplexity released its own uncensored variant of DeepSeek R1, which it named R1 1776. Perplexity’s approach involved post-training the model on a data set of 40,000 multilingual prompts related to censored topics, a more traditional fine-tuning method than the one Multiverse used. 

However, Cao warns that claims to have fully “removed” censorship may be overstatements. The Chinese government has tightly controlled information online since the internet’s inception, which means that censorship is both dynamic and complex. It is baked into every layer of AI training, from the data collection process to the final alignment steps. 

“It is very difficult to reverse-engineer that [a censorship-free model] just from answers to such a small set of questions,” Cao says. 

Google’s new Gemini 3 “vibe-codes” responses and comes with its own agent

Google today unveiled Gemini 3, a major upgrade to its flagship multimodal model. The firm says the new model is better at reasoning, has more fluid multimodal capabilities (the ability to work across voice, text or images), and will work like an agent. 

The previous model, Gemini 2.5, supports multimodal input: users can feed it images, handwriting, or voice. But it usually requires explicit instructions about the format the user wants back, and without them it defaults to plain text.

But Gemini 3 introduces what Google calls “generative interfaces,” which allow the model to make its own choices about what kind of output fits the prompt best, assembling visual layouts and dynamic views on its own instead of returning a block of text. 

Ask for travel recommendations and it may spin up a website-like interface inside the app, complete with modules, images, and follow-up prompts such as “How many days are you traveling?” or “What kinds of activities do you enjoy?” It also presents clickable options based on what you might want next.

When asked to explain a concept, Gemini 3 may sketch a diagram or generate a simple animation on its own if it believes a visual is more effective. 

“Visual layout generates an immersive, magazine-style view complete with photos and modules,” says Josh Woodward, VP of Google Labs, Gemini, and AI Studio. “These elements don’t just look good but invite your input to further tailor the results.” 

With Gemini 3, Google is also introducing Gemini Agent, an experimental feature designed to handle multi-step tasks directly inside the app. The agent can connect to services such as Google Calendar, Gmail, and Reminders. Once granted access, it can execute tasks like organizing an inbox or managing schedules. 

Similar to other agents, it breaks tasks into discrete steps, displays its progress in real time, and pauses for approval from the user before continuing. Google describes the feature as a step toward “a true generalist agent.” It will be available on the web for Google AI Ultra subscribers in the US starting November 18.

The overall approach can seem a lot like “vibe coding,” where users describe an end goal in plain language and let the model assemble the interface or code needed to get there.

The update also ties Gemini more deeply into Google’s existing products. In Search, a limited group of Google AI Pro and Ultra subscribers can now switch to Gemini 3 Pro, the reasoning variant of the new model, to receive deeper, more thorough AI-generated summaries that rely on the model’s reasoning rather than the existing AI Mode.

For shopping, Gemini will now pull from Google’s Shopping Graph—which the company says contains more than 50 billion product listings—to generate its own recommendation guides. Users just need to ask a shopping-related question or search a shopping-related phrase, and the model assembles an interactive, Wirecutter-style product recommendation piece, complete with prices and product details, without redirecting to an external site.

For developers, Google is also pushing single-prompt software generation further. The company introduced Google Antigravity, a development platform that acts as an all-in-one space where code, tools, and workflows can be created and managed from a single prompt.

Derek Nee, CEO of Flowith, an agentic AI application, told MIT Technology Review that Gemini 3 Pro addresses several gaps in earlier models. Improvements include stronger visual understanding, better code generation, and better performance on long tasks—features he sees as essential for developers of AI apps and agents. 

“Given its speed and cost advantages, we’re integrating the new model into our product,” he says. “We’re optimistic about its potential, but we need deeper testing to understand how far it can go.” 

What is the chance your plane will be hit by space debris?

MIT Technology Review Explains: Let our writers untangle the complex, messy world of technology to help you understand what’s coming next. You can read more from the series here.

In mid-October, a mysterious object cracked the windshield of a packed Boeing 737 cruising at 36,000 feet above Utah, forcing the pilots into an emergency landing. The internet was suddenly buzzing with the prospect that the plane had been hit by a piece of space debris. We still don’t know exactly what hit the plane—likely a remnant of a weather balloon—but it turns out the speculation online wasn’t that far-fetched.

That’s because while the risk of flights being hit by space junk is still small, it is, in fact, growing. 

About three pieces of old space equipment—used rockets and defunct satellites—fall into Earth’s atmosphere every day, according to estimates by the European Space Agency. By the mid-2030s, there may be dozens. The increase is linked to the growth in the number of satellites in orbit. Currently, around 12,900 active satellites circle the planet. In a decade, there may be 100,000 of them, according to analyst estimates.

To minimize the risk of orbital collisions, operators guide old satellites to burn up in Earth’s atmosphere. But the physics of that reentry process are not well understood, and we don’t know how much material burns up and how much reaches the ground.

“The number of such landfall events is increasing,” says Richard Ocaya, a professor of physics at the University of the Free State in South Africa and a coauthor of a recent paper on space debris risk. “We expect it may be increasing exponentially in the next few years.”

So far, space debris hasn’t injured anybody—in the air or on the ground. But multiple close calls have been reported in recent years. In March last year, a 0.7-kilogram chunk of metal pierced the roof of a house in Florida. The object was later confirmed to be a remnant of a battery pallet tossed out from the International Space Station. When the strike occurred, the homeowner’s 19-year-old son was resting in the room next door.

And in February this year, a 1.5-meter-long fragment of SpaceX’s Falcon 9 rocket crashed down near a warehouse outside Poland’s fifth-largest city, Poznan. Another piece was found in a nearby forest. A month later, a 2.5-kilogram piece of a Starlink satellite dropped on a farm in the Canadian province of Saskatchewan. Other incidents have been reported in Australia and Africa. And many more may be going completely unnoticed. 

“If you were to find a bunch of burnt electronics in a forest somewhere, your first thought is not that it came from a spaceship,” says James Beck, the director of the UK-based space engineering research firm Belstead Research. He warns that we don’t fully understand the risk of space debris strikes and that it might be much higher than satellite operators want us to believe. 

For example, SpaceX, the owner of Starlink, currently the largest megaconstellation, claims that its satellites are “designed for demise” and completely burn up when they spiral from orbit and fall through the atmosphere.

But Beck, who has performed multiple wind tunnel tests using satellite mock-ups to mimic atmospheric forces, says the results of such experiments raise doubts. Some satellite components are made of durable materials such as titanium and special alloy composites that don’t melt even at the extremely high temperatures that arise during a hypersonic atmospheric descent. 

“We have done some work for some small-satellite manufacturers and basically, their major problem is that the tanks get down,” Beck says. “For larger satellites, around 800 kilos, we would expect maybe two or three objects to land.” 

It can be challenging to quantify how much of a danger space debris poses. The International Civil Aviation Organization (ICAO) told MIT Technology Review that “the rapid growth in satellite deployments presents a novel challenge” for aviation safety, one that “cannot be quantified with the same precision as more established hazards.” 

But the Federal Aviation Administration has calculated some preliminary numbers on the risk to flights: In a 2023 analysis, the agency estimated that by 2035, the risk that a plane will suffer a disastrous space debris strike in a given year will be around 7 in 10,000. Such a collision would either destroy the aircraft immediately or lead to a rapid loss of air pressure, threatening the lives of all on board.

The casualty risk to humans on the ground will be much higher. Aaron Boley, an associate professor in astronomy and a space debris researcher at the University of British Columbia, Canada, says that if megaconstellation satellites “don’t demise entirely,” the risk of a single human death or injury caused by a space debris strike on the ground could reach around 10% per year by 2035. That would mean a better than even chance that someone on Earth would be hit by space junk about every decade. In its report, the FAA put the chances even higher with similar assumptions, estimating that “one person on the planet would be expected to be injured or killed every two years.”
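
As a rough check on that framing: if the yearly risk of a ground casualty reaches about 10% and stays there, the chance of at least one such event over a decade works out to roughly two in three, assuming each year is independent. This is a back-of-envelope simplification, not a figure from either study.

```python
# Back-of-envelope check of the "better than even chance per decade" claim,
# assuming a constant, independent 10% risk each year (a simplification).
annual_risk = 0.10   # Boley's rough figure for around 2035
years = 10
p_at_least_one = 1 - (1 - annual_risk) ** years
print(f"chance of at least one ground casualty in {years} years: {p_at_least_one:.0%}")  # ~65%
```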

Experts are starting to think about how they might incorporate space debris into their air safety processes. The German space situational awareness company Okapi Orbits, for example, in cooperation with the German Aerospace Center and the European Organization for the Safety of Air Navigation (Eurocontrol), is exploring ways to adapt air traffic control systems so that pilots and air traffic controllers can receive timely and accurate alerts about space debris threats.

But predicting the path of space debris is challenging too. In recent years, advances in AI have helped improve predictions of space objects’ trajectories in the vacuum of space, potentially reducing the risk of orbital collisions. But so far, these algorithms can’t properly account for the effects of the gradually thickening atmosphere that space junk encounters during reentry. Radar and telescope observations can help, but the exact location of the impact becomes clear with only very short notice.

“Even with high-fidelity models, there’s so many variables at play that having a very accurate reentry location is difficult,” says Njord Eggen, a data analyst at Okapi Orbits. Space debris goes around the planet every hour and a half when in low Earth orbit, he notes, “so even if you have uncertainties on the order of 10 minutes, that’s going to have drastic consequences when it comes to the location where it could impact.”
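
Eggen’s point is easy to see with some rough numbers: at low-Earth-orbit speeds, a 10-minute timing uncertainty translates into thousands of kilometers of uncertainty along the ground track. The figures below are approximations for illustration only.

```python
# Rough arithmetic behind the reentry-timing problem: an object in low Earth
# orbit covers the full circumference of its ground track in about 90 minutes,
# so a 10-minute timing uncertainty smears the possible impact point widely.
orbit_period_min = 90          # one low-Earth orbit takes about an hour and a half
ground_track_km = 40_000       # roughly Earth's circumference
uncertainty_min = 10

km_per_min = ground_track_km / orbit_period_min          # ~440 km per minute
print(f"impact-point uncertainty: ~{km_per_min * uncertainty_min:,.0f} km")  # roughly 4,400 km
```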

For aviation companies, the problem is not just a potential strike, as catastrophic as that would be. To avoid accidents, authorities are likely to temporarily close the airspace in at-risk regions, which creates delays and costs money. Boley and his colleagues published a paper earlier this year estimating that busy airspace regions such as northern Europe or the northeastern United States already have about a 26% yearly chance of experiencing at least one disruption due to the reentry of a major space debris item. By the time all planned constellations are fully deployed, airspace closures due to space debris hazards may become nearly as common as those due to bad weather.

Because current reentry predictions are unreliable, many of these closures may end up being unnecessary.

For example, when a 21-metric-ton Chinese Long March mega-rocket was falling to Earth in 2022, predictions suggested its debris could scatter across Spain and parts of France. In the end, the rocket crashed into the Pacific Ocean. But the 30-minute closure of southern European airspace delayed and diverted hundreds of flights.

In the meantime, international regulators are urging satellite operators and launch providers to deorbit large satellites and rocket bodies in a controlled way, when possible, by carefully guiding them into remote parts of the ocean using residual fuel. 

The European Space Agency estimates that only about half the rocket bodies reentering the atmosphere do so in a controlled way. 

Moreover, around 2,300 old and no-longer-controllable rocket bodies still linger in orbit, slowly spiraling toward Earth with no mechanisms for operators to safely guide them into the ocean.

“There’s enough material up there that even if we change our practices, we will still have all those rocket bodies eventually reenter,” Boley says. “Although the probability of space debris hitting an aircraft is small, the probability that the debris will spread and fall over busy airspace is not small. That’s actually quite likely.”

The State of AI: How war will be changed forever

Welcome back to The State of AI, a new collaboration between the Financial Times and MIT Technology Review. Every Monday, writers from both publications debate one aspect of the generative AI revolution reshaping global power.

In this conversation, Helen Warrell, FT investigations reporter and former defense and security editor, and James O’Donnell, MIT Technology Review’s senior AI reporter, consider the ethical quandaries and financial incentives around AI’s use by the military.

Helen Warrell, FT investigations reporter 

It is July 2027, and China is on the brink of invading Taiwan. Autonomous drones with AI targeting capabilities are primed to overpower the island’s air defenses as a series of crippling AI-generated cyberattacks cut off energy supplies and key communications. In the meantime, a vast disinformation campaign enacted by an AI-powered pro-Chinese meme farm spreads across global social media, deadening the outcry at Beijing’s act of aggression.

Scenarios such as this have brought dystopian horror to the debate about the use of AI in warfare. Military commanders hope for a digitally enhanced force that is faster and more accurate than human-directed combat. But there are fears that as AI assumes an increasingly central role, these same commanders will lose control of a conflict that escalates too quickly and lacks ethical or legal oversight. Henry Kissinger, the former US secretary of state, spent his final years warning about the coming catastrophe of AI-driven warfare.

Grasping and mitigating these risks is the military priority—some would say the “Oppenheimer moment”—of our age. One emerging consensus in the West is that decisions around the deployment of nuclear weapons should not be outsourced to AI. UN secretary-general António Guterres has gone further, calling for an outright ban on fully autonomous lethal weapons systems. It is essential that regulation keep pace with evolving technology. But in the sci-fi-fueled excitement, it is easy to lose track of what is actually possible. As researchers at Harvard’s Belfer Center point out, AI optimists often underestimate the challenges of fielding fully autonomous weapon systems. It is entirely possible that the capabilities of AI in combat are being overhyped.

Anthony King, Director of the Strategy and Security Institute at the University of Exeter and a key proponent of this argument, suggests that rather than replacing humans, AI will be used to improve military insight. Even if the character of war is changing and remote technology is refining weapon systems, he insists, “the complete automation of war itself is simply an illusion.”

Of the three current military use cases of AI, none involves full autonomy. It is being developed for planning and logistics; for cyber warfare (sabotage, espionage, hacking, and information operations); and—most controversially—for weapons targeting, an application already in use on the battlefields of Ukraine and Gaza. Kyiv’s troops use AI software to direct drones able to evade Russian jammers as they close in on sensitive sites. The Israel Defense Forces have developed an AI-assisted decision support system known as Lavender, which has helped identify around 37,000 potential human targets within Gaza.

There is clearly a danger that the Lavender database replicates the biases of the data it is trained on. But military personnel carry biases too. One Israeli intelligence officer who used Lavender claimed to have more faith in the fairness of a “statistical mechanism” than that of a grieving soldier.

Tech optimists designing AI weapons even deny that specific new controls are needed to control their capabilities. Keith Dear, a former UK military officer who now runs the strategic forecasting company Cassi AI, says existing laws are more than sufficient: “You make sure there’s nothing in the training data that might cause the system to go rogue … when you are confident you deploy it—and you, the human commander, are responsible for anything they might do that goes wrong.”

It is an intriguing thought that some of the fear and shock about the use of AI in war may come from those who are unfamiliar with brutal but realistic military norms. What do you think, James? Is some opposition to AI in warfare less about the use of autonomous systems and more an argument against war itself?

James O’Donnell replies:

Hi Helen, 

One thing I’ve noticed is that there’s been a drastic shift in attitudes of AI companies regarding military applications of their products. In the beginning of 2024, OpenAI unambiguously forbade the use of its tools for warfare, but by the end of the year, it had signed an agreement with Anduril to help it take down drones on the battlefield. 

This step—not a fully autonomous weapon, to be sure, but very much a battlefield application of AI—marked a drastic change in how much tech companies could publicly link themselves with defense. 

What happened along the way? For one thing, it’s the hype. We’re told AI will not just bring superintelligence and scientific discovery but also make warfare sharper, more accurate and calculated, less prone to human fallibility. I spoke with US Marines, for example, who, while patrolling the South Pacific, tested a type of AI advertised as analyzing foreign intelligence faster than a human could.

Secondly, money talks. OpenAI and others need to start recouping some of the unimaginable amounts of cash they’re spending on training and running these models, and few have deeper pockets than the Pentagon. Europe’s defense heads seem keen to splash the cash too. Meanwhile, the amount of venture capital funding for defense tech this year has already doubled the total for all of 2024, as VCs hope to cash in on militaries’ newfound willingness to buy from startups.

I do think the opposition to AI warfare falls into a few camps, one of which simply rejects the idea that more precise targeting (if it’s actually more precise at all) will mean fewer casualties rather than just more war. Consider the first era of drone warfare in Afghanistan. As drone strikes became cheaper to implement, can we really say it reduced carnage? Instead, did it merely enable more destruction per dollar?

But the second camp of criticism (and now I’m finally getting to your question) comes from people who are well versed in the realities of war but have very specific complaints about the technology’s fundamental limitations. Missy Cummings, for example, is a former fighter pilot for the US Navy who is now a professor of engineering and computer science at George Mason University. She has been outspoken in her belief that large language models, specifically, are prone to make huge mistakes in military settings.

The typical response to this complaint is that AI’s outputs are human-checked. But if an AI model relies on thousands of inputs for its conclusion, can that conclusion really be checked by one person?

Tech companies are making extraordinarily big promises about what AI can do in these high-stakes applications, all while pressure to implement them is sky high. For me, this means it’s time for more skepticism, not less. 

Helen responds:

Hi James, 

We should definitely continue to question the safety of AI warfare systems and the oversight to which they’re subjected—and hold political leaders to account in this area. I am suggesting that we also apply some skepticism to what you rightly describe as the “extraordinarily big promises” made by some companies about what AI might be able to achieve on the battlefield. 

There will be both opportunities and hazards in what the military is being offered by a relatively nascent (though booming) defense tech scene. The danger is that in the speed and secrecy of an arms race in AI weapons, these emerging capabilities may not receive the scrutiny and debate they desperately need.

Further reading:

Michael C. Horowitz, director of Perry World House at the University of Pennsylvania, explains the need for responsibility in the development of military AI systems in this FT op-ed.

The FT’s tech podcast asks what Israel’s defense tech ecosystem can tell us about the future of warfare.

This MIT Technology Review story analyzes how OpenAI completed its pivot to allowing its technology on the battlefield.

MIT Technology Review also uncovered how US soldiers are using generative AI to help scour thousands of pieces of open-source intelligence.

These technologies could help put a stop to animal testing

Earlier this week, the UK’s science minister announced an ambitious plan: to phase out animal testing.

Testing potential skin irritants on animals will be stopped by the end of next year, according to a strategy released on Tuesday. By 2027, researchers are “expected to end” tests of the strength of Botox on mice. And drug tests in dogs and nonhuman primates will be reduced by 2030. 

The news follows similar moves by other countries. In April, the US Food and Drug Administration announced a plan to replace animal testing for monoclonal antibody therapies with “more effective, human-relevant models.” And, following a workshop in June 2024, the European Commission also began working on a “road map” to phase out animal testing for chemical safety assessments.

Animal welfare groups have been campaigning for commitments like these for decades. But a lack of alternatives has made it difficult to put a stop to animal testing. Advances in medical science and biotechnology are changing that.

Animals have been used in scientific research for thousands of years. Animal experimentation has led to many important discoveries about how the brains and bodies of animals work. And because regulators require drugs to be first tested in research animals, it has played an important role in the creation of medicines and devices for both humans and other animals.

Today, countries like the UK and the US regulate animal research and require scientists to hold multiple licenses and adhere to rules on animal housing and care. Still, millions of animals are used annually in research. Plenty of scientists don’t want to take part in animal testing. And some question whether animal research is justifiable—especially considering that around 95% of treatments that look promising in animals don’t make it to market.

In recent decades, we’ve seen dramatic advances in technologies that offer new ways to model the human body and test the effects of potential therapies, without experimenting on humans or other animals.

Take “organs on chips,” for example. Researchers have been creating miniature versions of human organs inside tiny plastic cases. These systems are designed to contain the same mix of cells you’d find in a full-grown organ and receive a supply of nutrients that keeps them alive.

Today, multiple teams have created models of livers, intestines, hearts, kidneys and even the brain. And they are already being used in research. Heart chips have been sent into space to observe how they respond to low gravity. The FDA used lung chips to assess covid-19 vaccines. Gut chips are being used to study the effects of radiation.

Some researchers are even working to connect multiple chips to create a “body on a chip”—although this has been in the works for over a decade and no one has quite managed it yet.

In the same vein, others have been working on creating model versions of organs—and even embryos—in the lab. By growing groups of cells into tiny 3D structures, scientists can study how organs develop and work, and even test drugs on them. These models can also be personalized: if you take cells from someone, you should be able to model that person’s specific organs. Some researchers have even created organoids of developing fetuses.

The UK government strategy mentions the promise of artificial intelligence, too. Many scientists have been quick to adopt AI as a tool to help them make sense of vast databases, and to find connections between genes, proteins and disease, for example. Others are using AI to design all-new drugs.

Those new drugs could potentially be tested on virtual humans. Not flesh-and-blood people, but digital reconstructions that live in a computer. Biomedical engineers have already created digital twins of organs. In ongoing trials, digital hearts are being used to guide surgeons on how—and where—to operate on real hearts.

When I spoke to Natalia Trayanova, the biomedical engineering professor behind this trial, she told me that her model could recommend regions of heart tissue to be burned off as part of treatment for atrial fibrillation. Her tool would normally suggest two or three regions but occasionally would recommend many more. “They just have to trust us,” she told me.

It is unlikely that we’ll completely phase out animal testing by 2030. The UK government acknowledges that animal testing is still required by lots of regulators, including the FDA, the European Medicines Agency, and the World Health Organization. And while alternatives to animal testing have come a long way, none of them perfectly capture how a living body will respond to a treatment.

At least not yet. Given all the progress that has been made in recent years, it’s not too hard to imagine a future without animal testing.

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

Google is still aiming for its “moonshot” 2030 energy goals

Last week, we hosted EmTech MIT, MIT Technology Review’s annual flagship conference in Cambridge, Massachusetts. Over the course of three days of main-stage sessions, I learned about innovations in AI, biotech, and robotics. 

But as you might imagine, some of this climate reporter’s favorite moments came in the climate sessions. I was listening especially closely to my colleague James Temple’s discussion with Lucia Tian, head of advanced energy technologies at Google. 

They spoke about the tech giant’s growing energy demand and what sort of technologies the company is looking to in order to help meet it. In case you weren’t able to join us, let’s dig into that session and consider how the company is thinking about energy in the face of AI’s rapid rise.

I’ve been closely following Google’s work in energy this year. Like the rest of the tech industry, the company is seeing ballooning electricity demand in its data centers. That could get in the way of a major goal that Google has been talking about for years. 

See, back in 2020, the company announced an ambitious target: by 2030, it aimed to run on carbon-free energy 24-7. Basically, that means Google would purchase enough carbon-free electricity on the grids where it operates to meet its entire demand, with those purchases matched in time so that the clean power is generated during the hours the company is actually using energy. (For more on the nuances of Big Tech’s renewable-energy pledges, check out James’s piece from last year.)
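
That matching requirement is what makes 24-7 carbon-free energy so much harder than a conventional annual renewable-energy pledge: surplus solar at noon can’t paper over a gas-powered night. Here is a minimal sketch of the accounting, with made-up numbers for illustration.

```python
# Minimal sketch of hourly "24-7" accounting: carbon-free supply counts toward
# the score only in the hour it is generated, capped at that hour's demand.
# The numbers are invented for illustration.
def cfe_score(demand_mwh, carbon_free_mwh):
    """Fraction of demand met by carbon-free energy, matched hour by hour."""
    matched = sum(min(d, c) for d, c in zip(demand_mwh, carbon_free_mwh))
    return matched / sum(demand_mwh)

demand = [100, 100, 100, 100]      # four example hours of load
clean  = [160,  40, 160,  40]      # solar-heavy supply: surplus, then shortfall

print(f"annual-style matching: {sum(clean) / sum(demand):.0%}")   # 100%
print(f"hourly (24-7) matching: {cfe_score(demand, clean):.0%}")  # 70%
```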

Google’s is an ambitious goal, and on stage, Tian said that the company is still aiming for it but acknowledged that it’s looking tough with the rise of AI. 

“It was always a moonshot,” she said. “It’s something very, very hard to achieve, and it’s only harder in the face of this growth. But our perspective is, if we don’t move in that direction, we’ll never get there.”

Google’s total electricity demand more than doubled from 2020 to 2024, according to its latest Environmental Report. As for that goal of 24-7 carbon-free energy? The company is basically treading water. While it was at 67% for its data centers in 2020, last year it came in at 66%. 

Not going backwards is something of an accomplishment, given the rapid growth in electricity demand. But it still leaves the company some distance away from its finish line.

To close the gap, Google has been signing what feels like constant deals in the energy space. Two recent announcements that Tian talked about on stage were a project involving carbon capture and storage at a natural-gas plant in Illinois and plans to reopen a shuttered nuclear power plant in Iowa. 

Let’s start with carbon capture. Google signed an agreement to purchase most of the electricity from a new natural-gas plant, which will capture and store about 90% of its carbon dioxide emissions. 

That announcement was controversial, with critics arguing that carbon capture keeps fossil-fuel infrastructure online longer and still releases greenhouse gases and other pollutants into the atmosphere. 

One question that James raised on stage: Why build a new natural-gas plant rather than add equipment to an already existing facility? Tacking on equipment to an operational plant would mean cutting emissions from the status quo, rather than adding entirely new fossil-fuel infrastructure. 

The company did consider many existing plants, Tian said. But, as she put it, “Retrofits aren’t going to make sense everywhere.” Space can be limited at existing plants, for example, and many may not have the right geology to store carbon dioxide underground. 

“We wanted to lead with a project that could prove this technology at scale,” Tian said. This site has an operational Class VI well, the type used for permanent sequestration, she added, and it also doesn’t require a big pipeline buildout. 

Tian also touched on the company’s recent announcement that it’s collaborating with NextEra Energy to reopen Duane Arnold Energy Center, a nuclear power plant in Iowa. The company will purchase electricity from that plant, which is scheduled to reopen in 2029. 

As I covered in a story earlier this year, Duane Arnold was basically the final option in the US for companies looking to reopen shuttered nuclear power plants. “Just a few years back, we were still closing down nuclear plants in this country,” Tian said on stage. 

While each reopening will look a little different, Tian highlighted the groups working to restart the Palisades plant in Michigan, which was the first reopening to be announced, last spring. “They’re the real heroes of the story,” she said.

I’m always interested to get a peek behind the curtain at how Big Tech is thinking about energy. I’m skeptical but certainly interested to see how Google’s, and the rest of the industry’s, goals shape up over the next few years. 

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

Google DeepMind is using Gemini to train agents inside Goat Simulator 3

Google DeepMind has built a new video-game-playing agent called SIMA 2 that can navigate and solve problems in a wide range of 3D virtual worlds. The company claims it’s a big step toward more general-purpose agents and better real-world robots.   

Google DeepMind first demoed SIMA (which stands for “scalable instructable multiworld agent”) last year. But SIMA 2 has been built on top of Gemini, the firm’s flagship large language model, which gives the agent a huge boost in capability.

The researchers claim that SIMA 2 can carry out a range of more complex tasks inside virtual worlds, figure out how to solve certain challenges by itself, and chat with its users. It can also improve itself by tackling harder tasks multiple times and learning through trial and error.

“Games have been a driving force behind agent research for quite a while,” Joe Marino, a research scientist at Google DeepMind, said in a press conference this week. He noted that even a simple action in a game, such as lighting a lantern, can involve multiple steps: “It’s a really complex set of tasks you need to solve to progress.”

The ultimate aim is to develop next-generation agents that are able to follow instructions and carry out open-ended tasks inside more complex environments than a web browser. In the long run, Google DeepMind wants to use such agents to drive real-world robots. Marino claimed that the skills SIMA 2 has learned, such as navigating an environment, using tools, and collaborating with humans to solve problems, are essential building blocks for future robot companions.

Unlike previous work on game-playing agents such as AlphaGo, which beat one of the world’s top Go players in 2016, or AlphaStar, which beat 99.8% of ranked human players at the video game StarCraft 2 in 2019, the idea behind SIMA is to train an agent to play an open-ended game without preset goals. Instead, the agent learns to carry out instructions given to it by people.

Humans control SIMA 2 via text chat, by talking to it out loud, or by drawing on the game’s screen. The agent takes in a video game’s pixels frame by frame and figures out what actions it needs to take to carry out its tasks.

Like its predecessor, SIMA 2 was trained on footage of humans playing eight commercial video games, including No Man’s Sky and Goat Simulator 3, as well as three virtual worlds created by the company. The agent learned to match keyboard and mouse inputs to actions.

Hooked up to Gemini, the researchers claim, SIMA 2 is far better at following instructions (asking questions and providing updates as it goes) and figuring out for itself how to perform certain more complex tasks.  

Google DeepMind tested the agent inside environments it had never seen before. In one set of experiments, researchers asked Genie 3, the latest version of the firm’s world model, to produce environments from scratch and dropped SIMA 2 into them. They found that the agent was able to navigate and carry out instructions there.

The researchers also used Gemini to generate new tasks for SIMA 2. If the agent failed, at first Gemini generated tips that SIMA 2 took on board when it tried again. Repeating a task multiple times in this way often allowed SIMA 2 to improve by trial and error until it succeeded, Marino said.
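
Pieced together from DeepMind’s description, the loop looks something like the pseudocode below. Every name here is a placeholder for illustration; this is not the company’s actual interface.

```python
def self_improvement_round(gemini, sima, environment, max_attempts=3):
    """One round of the trial-and-error loop described above (illustrative sketch only)."""
    task = gemini.propose_task(environment)            # Gemini generates a new task
    hints = []
    for _ in range(max_attempts):
        outcome = sima.attempt(environment, task, hints=hints)
        if outcome.succeeded:
            sima.learn_from(outcome)                   # successes feed back into training
            return True
        hints.append(gemini.give_tip(task, outcome))   # Gemini critiques the failure
    return False
```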

Git gud

SIMA 2 is still an experiment. The agent struggles with complex tasks that require multiple steps and more time to complete. It also remembers only its most recent interactions (to make SIMA 2 more responsive, the team cut its long-term memory). It’s also still nowhere near as good as people at using a mouse and keyboard to interact with a virtual world.

Julian Togelius, an AI researcher at New York University who works on creativity and video games, thinks it’s an interesting result. Previous attempts at training a single system to play multiple games haven’t gone too well, he says. That’s because training models to control multiple games just by watching the screen isn’t easy: “Playing in real time from visual input only is ‘hard mode,’” he says.

In particular, Togelius calls out GATO, a previous system from Google DeepMind, which—despite being hyped at the time—could not transfer skills across a significant number of virtual environments.  

Still, he is open-minded about whether or not SIMA 2 could lead to better robots. “The real world is both harder and easier than video games,” he says. It’s harder because you can’t just press A to open a door. At the same time, a robot in the real world will know exactly what its body can and can’t do at any time. That’s not the case in video games, where the rules inside each virtual world can differ.

Others are more skeptical. Matthew Guzdial, an AI researcher at the University of Alberta, isn’t too surprised that SIMA 2 can play many different video games. He notes that most games have very similar keyboard and mouse controls: Learn one and you learn them all. “If you put a game with weird input in front of it, I don’t think it’d be able to perform well,” he says.

Guzdial also questions how much of what SIMA 2 has learned would really carry over to robots. “It’s much harder to understand visuals from cameras in the real world compared to games, which are designed with easily parsable visuals for human players,” he says.

Still, Marino and his colleagues hope to continue their work with Genie 3 to allow the agent to improve inside a kind of endless virtual training dojo, where Genie generates worlds for SIMA to learn in via trial and error guided by Gemini’s feedback. “We’ve kind of just scratched the surface of what’s possible,” he said at the press conference.  

OpenAI’s new LLM exposes the secrets of how AI really works

ChatGPT maker OpenAI has built an experimental large language model that is far easier to understand than typical models.

That’s a big deal, because today’s LLMs are black boxes: Nobody fully understands how they do what they do. Building a model that is more transparent sheds light on how LLMs work in general, helping researchers figure out why models hallucinate, why they go off the rails, and just how far we should trust them with critical tasks.

“As these AI systems get more powerful, they’re going to get integrated more and more into very important domains,” Leo Gao, a research scientist at OpenAI, told MIT Technology Review in an exclusive preview of the new work. “It’s very important to make sure they’re safe.”

This is still early research. The new model, called a weight-sparse transformer, is far smaller and far less capable than top-tier mass-market models like the firm’s GPT-5, Anthropic’s Claude, and Google DeepMind’s Gemini. At most it’s as capable as GPT-1, a model that OpenAI developed back in 2018, says Gao (though he and his colleagues haven’t done a direct comparison).    

But the aim isn’t to compete with the best in class (at least, not yet). Instead, by looking at how this experimental model works, OpenAI hopes to learn about the hidden mechanisms inside those bigger and better versions of the technology.

It’s interesting research, says Elisenda Grigsby, a mathematician at Boston College who studies how LLMs work and who was not involved in the project: “I’m sure the methods it introduces will have a significant impact.” 

Lee Sharkey, a research scientist at AI startup Goodfire, agrees. “This work aims at the right target and seems well executed,” he says.

Why models are so hard to understand

OpenAI’s work is part of a hot new field of research known as mechanistic interpretability, which is trying to map the internal mechanisms that models use when they carry out different tasks.

That’s harder than it sounds. LLMs are built from neural networks, which consist of nodes, called neurons, arranged in layers. In most networks, each neuron is connected to every other neuron in its adjacent layers. Such a network is known as a dense network.

Dense networks are relatively efficient to train and run, but they spread what they learn across a vast knot of connections. The result is that simple concepts or functions can be split up between neurons in different parts of a model. At the same time, specific neurons can also end up representing multiple different features, a phenomenon known as superposition (a term borrowed from quantum physics). The upshot is that you can’t relate specific parts of a model to specific concepts.

“Neural networks are big and complicated and tangled up and very difficult to understand,” says Dan Mossing, who leads the mechanistic interpretability team at OpenAI. “We’ve sort of said: ‘Okay, what if we tried to make that not the case?’”

Instead of building a model using a dense network, OpenAI started with a type of neural network known as a weight-sparse transformer, in which each neuron is connected to only a few other neurons. This forced the model to represent features in localized clusters rather than spread them out.
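
OpenAI hasn’t released the model’s code, but the core idea, constraining each neuron to a handful of connections, can be sketched with a fixed sparsity mask. Everything below is illustrative; the layer sizes and connection counts are arbitrary, not OpenAI’s.

```python
# Illustrative sketch of a weight-sparse layer: a fixed binary mask keeps only
# a few connections per output neuron, so features can't smear across the
# whole layer the way they do in a dense network.
import numpy as np

def make_sparse_layer(in_dim, out_dim, connections_per_neuron, seed=0):
    rng = np.random.default_rng(seed)
    weights = rng.normal(scale=0.02, size=(out_dim, in_dim))
    mask = np.zeros_like(weights)
    for i in range(out_dim):
        # each output neuron is wired to only a handful of input neurons
        keep = rng.choice(in_dim, size=connections_per_neuron, replace=False)
        mask[i, keep] = 1.0
    return weights * mask

layer = make_sparse_layer(in_dim=1024, out_dim=1024, connections_per_neuron=8)
print("fraction of nonzero weights:", np.count_nonzero(layer) / layer.size)  # ~0.8%
```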

Their model is far slower than any LLM on the market. But it is easier to relate its neurons or groups of neurons to specific concepts and functions. “There’s a really drastic difference in how interpretable the model is,” says Gao.

Gao and his colleagues have tested the new model with very simple tasks. For example, they asked it to complete a block of text that opens with quotation marks by adding matching marks at the end.  

It’s a trivial request for an LLM. The point is that figuring out how a model does even a straightforward task like that involves unpicking a complicated tangle of neurons and connections, says Gao. But with the new model, they were able to follow the exact steps the model took.

“We actually found a circuit that’s exactly the algorithm you would think to implement by hand, but it’s fully learned by the model,” he says. “I think this is really cool and exciting.”

Where will the research go next? Grigsby is not convinced the technique would scale up to larger models that have to handle a variety of more difficult tasks.    

Gao and Mossing acknowledge that this is a big limitation of the model they have built so far and agree that the approach will never lead to models that match the performance of cutting-edge products like GPT-5. And yet OpenAI thinks it might be able to improve the technique enough to build a transparent model on a par with GPT-3, the breakthrough LLM the firm released in 2020.

“Maybe within a few years, we could have a fully interpretable GPT-3, so that you could go inside every single part of it and you could understand how it does every single thing,” says Gao. “If we had such a system, we would learn so much.”

The State of AI: Energy is king, and the US is falling behind

Welcome to The State of AI, a new collaboration between the Financial Times and MIT Technology Review. Every Monday for the next six weeks, writers from both publications will debate one aspect of the generative AI revolution reshaping global power.

This week, Casey Crownhart, senior reporter for energy at MIT Technology Review, and Pilita Clark, an FT columnist, consider how China’s rapid renewables buildout could help it leapfrog ahead on AI progress.

Casey Crownhart writes:

In the age of AI, the biggest barrier to progress isn’t money but energy. That should be particularly worrying here in the US, where massive data centers are waiting to come online, and it doesn’t look as if the country will build the steady power supply or infrastructure needed to serve them all.

It wasn’t always like this. For about a decade before 2020, data centers were able to offset increased demand with efficiency improvements. Now, though, electricity demand is ticking up in the US, with billions of queries to popular AI models each day—and efficiency gains aren’t keeping pace. With too little new power capacity coming online, the strain is starting to show: Electricity bills are ballooning for people who live in places where data centers place a growing load on the grid.

If we want AI to have the chance to deliver on big promises without driving electricity prices sky-high for the rest of us, the US needs to learn some lessons from the rest of the world on energy abundance. Just look at China.

China installed 429 GW of new power generation capacity in 2024, more than six times the net capacity added in the US during that time.

China still generates much of its electricity with coal, but that makes up a declining share of the mix. Rather, the country is focused on installing solar, wind, nuclear, and gas at record rates.

The US, meanwhile, is focused on reviving its ailing coal industry. Coal-fired power plants are polluting and, crucially, expensive to run. Aging plants in the US are also less reliable than they used to be, running at a capacity factor of just 42%, compared with 61% in 2014.

It’s not a great situation. And unless the US changes something, we risk becoming consumers as opposed to innovators in both energy and AI tech. Already, China earns more from exporting renewables than the US does from oil and gas exports. 

Building and permitting new renewable power plants would certainly help, since they’re currently the cheapest and fastest to bring online. But wind and solar are politically unpopular with the current administration. Natural gas is an obvious candidate, though there are concerns about delays with key equipment.

One quick fix would be for data centers to be more flexible. If they agreed not to suck electricity from the grid during times of stress, new AI infrastructure might be able to come online without any new energy infrastructure.

One study from Duke University found that if data centers agree to curtail their consumption just 0.25% of the time (roughly 22 hours over the course of the year), the grid could provide power for about 76 GW of new demand. That’s like adding about 5% of the entire grid’s capacity without needing to build anything new.
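
The arithmetic behind those figures is simple enough to check. Note that the US capacity number below is my own rough assumption, not a figure from the Duke study.

```python
# Quick check of the curtailment figures quoted above.
hours_per_year = 8760
curtailment_fraction = 0.0025                      # 0.25% of the time
print(f"hours curtailed per year: {hours_per_year * curtailment_fraction:.0f}")   # ~22 hours

new_flexible_load_gw = 76
us_generating_capacity_gw = 1_250                  # rough assumption for total US capacity
share = new_flexible_load_gw / us_generating_capacity_gw
print(f"share of grid capacity: {share:.0%}")      # ~6%, in the ballpark of the "about 5%" above
```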

But flexibility wouldn’t be enough to truly meet the swell in AI electricity demand. What do you think, Pilita? What would get the US out of these energy constraints? Is there anything else we should be thinking about when it comes to AI and its energy use? 

Pilita Clark responds:

I agree. Data centers that can cut their power use at times of grid stress should be the norm, not the exception. Likewise, we need more deals like those giving cheaper electricity to data centers that let power utilities access their backup generators. Both reduce the need to build more power plants, which makes sense regardless of how much electricity AI ends up using.

This is a critical point for countries across the world, because we still don’t know exactly how much power AI is going to consume. 

Forecasts for what data centers will need in as little as five years’ time vary wildly, from less than twice today’s rates to four times as much.

This is partly because there’s a lack of public data about AI systems’ energy needs. It’s also because we don’t know how much more efficient these systems will become. The US chip designer Nvidia said last year that its specialized chips had become 45,000 times more energy efficient over the previous eight years. 

Moreover, we have been very wrong about tech energy needs before. At the height of the dot-com boom in 1999, it was erroneously claimed that the internet would need half the US’s electricity within a decade—necessitating a lot more coal power.

Still, some countries are clearly feeling the pressure already. In Ireland, data centers chew up so much power that new connections have been restricted around Dublin to avoid straining the grid.

Some regulators are eyeing new rules forcing tech companies to provide enough power generation to match their demand. I hope such efforts grow. I also hope AI itself helps boost power abundance and, crucially, accelerates the global energy transition needed to combat climate change. OpenAI’s Sam Altman said in 2023 that “once we have a really powerful superintelligence, addressing climate change will not be particularly difficult.”

The evidence so far is not promising, especially in the US, where renewable projects are being axed. Still, the US may end up being an outlier in a world where ever cheaper renewables made up more than 90% of new power capacity added globally last year. 

Europe is aiming to power one of its biggest data centers predominantly with renewables and batteries. But the country leading the green energy expansion is clearly China.

The 20th century was dominated by countries rich in the fossil fuels whose reign the US now wants to prolong. China, in contrast, may become the world’s first green electrostate. If it does this in a way that helps it win an AI race the US has so far controlled, it will mark a striking chapter in economic, technological, and geopolitical history.

Casey Crownhart replies:

I share your skepticism of tech executives’ claims that AI will be a groundbreaking help in the race to address climate change. To be fair, AI is progressing rapidly. But we don’t have time to wait for technologies standing on big claims with nothing to back them up. 

When it comes to the grid, for example, experts say there’s potential for AI to help with planning and even operating, but these efforts are still experimental.  

Meanwhile, much of the world is making measurable progress on transitioning to newer, greener forms of energy. How that will affect the AI boom remains to be seen. What is clear is that AI is changing our grid and our world, and we need to be clear-eyed about the consequences. 

Further reading 

MIT Technology Review reporters did the math on the energy needs of an AI query.

There are still a few reasons to be optimistic about AI’s energy demands.  

The FT’s visual data team take a look inside the relentless race for AI capacity.

And global FT reporters ask whether data centers can ever truly be green.