This year’s UN climate talks avoided fossil fuels, again

If we didn’t have pictures and videos, I almost wouldn’t believe the imagery that came out of this year’s UN climate talks.

Over the past few weeks in Belem, Brazil, attendees dealt with oppressive heat and flooding, and at one point a literal fire broke out, delaying negotiations. The symbolism was almost too much to bear.

While many, including the president of Brazil, framed this year’s conference as one of action, the talks ended with a watered-down agreement. The final draft doesn’t even include the phrase “fossil fuels.”

As emissions and global temperatures reach record highs again this year, I’m left wondering: Why is it so hard to formally acknowledge what’s causing the problem?

This is the 30th time that leaders have gathered for the Conference of the Parties, or COP, an annual UN conference focused on climate change. COP30 also marks 10 years since the gathering that produced the Paris Agreement, in which world powers committed to limiting global warming to “well below” 2.0 °C above preindustrial levels, with a goal of staying below the 1.5 °C mark. (That’s 3.6 °F and 2.7 °F, respectively, for my fellow Americans.)

Before the conference kicked off this year, host country Brazil’s president, Luiz Inácio Lula da Silva, cast this as the “implementation COP” and called for negotiators to focus on action, and specifically to deliver a road map for a global transition away from fossil fuels.

The science is clear—burning fossil fuels emits greenhouse gases and drives climate change. Reports have shown that meeting the goal of limiting warming to 1.5 °C would require stopping new fossil-fuel exploration and development.

The problem is, “fossil fuels” might as well be a curse word at global climate negotiations. Two years ago, fights over how to address fossil fuels brought talks at COP28 to a standstill. (It’s worth noting that the conference was hosted in Dubai in the UAE, and the leader was literally the head of the country’s national oil company.)

The agreement in Dubai ended up including a line that called on countries to transition away from fossil fuels in energy systems. It was short of what many advocates wanted, which was a more explicit call to phase out fossil fuels entirely. But it was still hailed as a win. As I wrote at the time: “The bar is truly on the floor.”

And yet this year, it seems we’ve dug into the basement.

At one point about 80 countries, a little under half of those present, demanded a concrete plan to move away from fossil fuels.

But oil producers like Saudi Arabia were insistent that fossil fuels not be singled out. Other countries, including some in Africa and Asia, also made a very fair point: Western nations like the US have burned the most fossil fuels and benefited from it economically. This contingent maintains that legacy polluters have a unique responsibility to finance the transition for less wealthy and developing nations rather than simply barring them from taking the same development route. 

The US, by the way, didn’t send a formal delegation to the talks, for the first time in 30 years. But the absence spoke volumes. In a statement to the New York Times that sidestepped the COP talks, White House spokesperson Taylor Rogers said that president Trump had “set a strong example for the rest of the world” by pursuing new fossil-fuel development.

To sum up: Some countries are economically dependent on fossil fuels, some don’t want to stop depending on fossil fuels without incentives from other countries, and the current US administration would rather keep using fossil fuels than switch to other energy sources. 

All those factors combined help explain why, in its final form, COP30’s agreement doesn’t name fossil fuels at all. Instead, there’s a vague line that leaders should take into account the decisions made in Dubai, and an acknowledgement that the “global transition towards low greenhouse-gas emissions and climate-resilient development is irreversible and the trend of the future.”

Hopefully, that’s true. But it’s concerning that even on the world’s biggest stage, naming what we’re supposed to be transitioning away from and putting together any sort of plan to actually do it seems to be impossible.

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

The AI Hype Index: The people can’t get enough of AI slop

Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry.

Last year, the fantasy author Joanna Maciejewska went viral (if such a thing is still possible on X) with a post saying “I want AI to do my laundry and dishes so that I can do art and writing, not for AI to do my art and writing so that I can do my laundry and dishes.” Clearly, it struck a chord with the disaffected masses.

Regrettably, 18 months after Maciejewska’s post, the entertainment industry insists that machines should make art and artists should do laundry. The streaming platform Disney+ has plans to let its users generate their own content from its intellectual property instead of, y’know, paying humans to make some new Star Wars or Marvel movies.

Elsewhere, it seems AI-generated music is resonating with a depressingly large audience, given that the AI band Breaking Rust has topped Billboard’s Country Digital Song Sales chart. If the people demand AI slop, who are we to deny them?

What’s next for AlphaFold: A conversation with a Google DeepMind Nobel laureate

<div data-chronoton-summary="

  • Nobel-winning protein prediction AlphaFold creator John Jumper reflects on five years since the AI system revolutionized protein structure prediction. The DeepMind tool can determine protein shapes to atomic precision in hours instead of months.
  • Unexpected applications emerge Scientists have found creative “off-label” uses for AlphaFold, from studying honeybee disease resistance to accelerating synthetic protein design. Some researchers even use it as a search engine, testing thousands of potential protein interactions to find matches that would be impractical to verify in labs.
  • Future fusion with language models Jumper, at 39 the youngest chemistry Nobel laureate in 75 years, now aims to combine AlphaFold’s specialized capabilities with the broad reasoning of large language models. “I’ll be shocked if we don’t see more and more LLM impact on science,” he says, while avoiding the pressure of another Nobel-worthy breakthrough.

” data-chronoton-post-id=”1128322″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

In 2017, fresh off a PhD on theoretical chemistry, John Jumper heard rumors that Google DeepMind had moved on from building AI that played games with superhuman skill and was starting up a secret project to predict the structures of proteins. He applied for a job.

Just three years later, Jumper celebrated a stunning win that few had seen coming. With CEO Demis Hassabis, he had co-led the development of an AI system called AlphaFold 2 that was able to predict the structures of proteins to within the width of an atom, matching the accuracy of painstaking techniques used in the lab, and doing it many times faster—returning results in hours instead of months.

AlphaFold 2 had cracked a 50-year-old grand challenge in biology. “This is the reason I started DeepMind,” Hassabis told me a few years ago. “In fact, it’s why I’ve worked my whole career in AI.” In 2024, Jumper and Hassabis shared a Nobel Prize in chemistry.

It was five years ago this week that AlphaFold 2’s debut took scientists by surprise. Now that the hype has died down, what impact has AlphaFold really had? How are scientists using it? And what’s next? I talked to Jumper (as well as a few other scientists) to find out.

“It’s been an extraordinary five years,” Jumper says, laughing: “It’s hard to remember a time before I knew tremendous numbers of journalists.”

AlphaFold 2 was followed by AlphaFold Multimer, which could predict structures that contained more than one protein, and then AlphaFold 3, the fastest version yet. Google DeepMind also let AlphaFold loose on UniProt, a vast protein database used and updated by millions of researchers around the world. It has now predicted the structures of some 200 million proteins, almost all that are known to science.

Despite his success, Jumper remains modest about AlphaFold’s achievements. “That doesn’t mean that we’re certain of everything in there,” he says. “It’s a database of predictions, and it comes with all the caveats of predictions.”

A hard problem

Proteins are the biological machines that make living things work. They form muscles, horns, and feathers; they carry oxygen around the body and ferry messages between cells; they fire neurons, digest food, power the immune system; and so much more. But understanding exactly what a protein does (and what role it might play in various diseases or treatments) involves figuring out its structure—and that’s hard.

Proteins are made from strings of amino acids that chemical forces twist up into complex knots. An untwisted string gives few clues about the structure it will form. In theory, most proteins could take on an astronomical number of possible shapes. The task is to predict the correct one.

Jumper and his team built AlphaFold 2 using a type of neural network called a transformer, the same technology that underpins large language models. Transformers are very good at paying attention to specific parts of a larger puzzle.

But Jumper puts a lot of the success down to making a prototype model that they could test quickly. “We got a system that would give wrong answers at incredible speed,” he says. “That made it easy to start becoming very adventurous with the ideas you try.”

They stuffed the neural network with as much information about protein structures as they could, such as how proteins across certain species have evolved similar shapes. And it worked even better than they expected. “We were sure we had made a breakthrough,” says Jumper. “We were sure that this was an incredible advance in ideas.”

What he hadn’t foreseen was that researchers would download his software and start using it straight away for so many different things. Normally, it’s the thing a few iterations down the line that has the real impact, once the kinks have been ironed out, he says: “I’ve been shocked at how responsibly scientists have used it, in terms of interpreting it, and using it in practice about as much as it should be trusted in my view, neither too much nor too little.”

Any projects stand out in particular? 

Honeybee science

Jumper brings up a research group that uses AlphaFold to study disease resistance in honeybees. “They wanted to understand this particular protein as they look at things like colony collapse,” he says. “I never would have said, ‘You know, of course AlphaFold will be used for honeybee science.’”

He also highlights a few examples of what he calls off-label uses of AlphaFold“in the sense that it wasn’t guaranteed to work”—where the ability to predict protein structures has opened up new research techniques. “The first is very obviously the advances in protein design,” he says. “David Baker and others have absolutely run with this technology.”

Baker, a computational biologist at the University of Washington, was a co-winner of last year’s chemistry Nobel, alongside Jumper and Hassabis, for his work on creating synthetic proteins to perform specific tasks—such as treating disease or breaking down plastics—better than natural proteins can.

Baker and his colleagues have developed their own tool based on AlphaFold, called RoseTTAFold. But they have also experimented with AlphaFold Multimer to predict which of their designs for potential synthetic proteins will work.    

“Basically, if AlphaFold confidently agrees with the structure you were trying to design [and] then you make it and if AlphaFold says ‘I don’t know,’ you don’t make it. That alone was an enormous improvement.” It can make the design process 10 times faster, says Jumper.

Another off-label use that Jumper highlights: Turning AlphaFold into a kind of search engine. He mentions two separate research groups that were trying to understand exactly how human sperm cells hooked up with eggs during fertilization. They knew one of the proteins involved but not the other, he says: “And so they took a known egg protein and ran all 2,000 human sperm surface proteins, and they found one that AlphaFold was very sure stuck against the egg.” They were then able to confirm this in the lab.

“This notion that you can use AlphaFold to do something you couldn’t do before—you would never do 2,000 structures looking for one answer,” he says. “This kind of thing I think is really extraordinary.”

Five years on

When AlphaFold 2 came out, I asked a handful of early adopters what they made of it. Reviews were good, but the technology was too new to know for sure what long-term impact it might have. I caught up with one of those people to hear his thoughts five years on.

Kliment Verba is a molecular biologist who runs a lab at the University of California, San Francisco. “It’s an incredibly useful technology, there’s no question about it,” he tells me. “We use it every day, all the time.”

But it’s far from perfect. A lot of scientists use AlphaFold to study pathogens or to develop drugs. This involves looking at interactions between multiple proteins or between proteins and even smaller molecules in the body. But AlphaFold is known to be less accurate at making predictions about multiple proteins or their interaction over time.

Verba says he and his colleagues have been using AlphaFold long enough to get used to its limitations. “There are many cases where you get a prediction and you have to kind of scratch your head,” he says. “Is this real or is this not? It’s not entirely clear—it’s sort of borderline.”

“It’s sort of the same thing as ChatGPT,” he adds. “You know—it will bullshit you with the same confidence as it would give a true answer.”

Still, Verba’s team uses AlphaFold (both 2 and 3, because they have different strengths, he says) to run virtual versions of their experiments before running them in the lab. Using AlphaFold’s results, they can narrow down the focus of an experiment—or decide that it’s not worth doing.

It can really save time, he says: “It hasn’t really replaced any experiments, but it’s augmented them quite a bit.”

New wave  

AlphaFold was designed to be used for a range of purposes. Now multiple startups and university labs are building on its success to develop a new wave of tools more tailored to drug discovery. This year, a collaboration between MIT researchers and the AI drug company Recursion produced a model called Boltz-2, which predicts not only the structure of proteins but also how well potential drug molecules will bind to their target.  

Last month, the startup Genesis Molecular AI released another structure prediction model called Pearl, which the firm claims is more accurate than AlphaFold 3 for certain queries that are important for drug development. Pearl is interactive, so that drug developers can feed any additional data they may have to the model to guide its predictions.

AlphaFold was a major leap, but there’s more to do, says Evan Feinberg, Genesis Molecular AI’s CEO: “We’re still fundamentally innovating, just with a better starting point than before.”

Genesis Molecular AI is pushing margins of error down from less than two angstroms, the de facto industry standard set by AlphaFold, to less than one angstrom—one 10-millionth of a millimeter, or the width of a single hydrogen atom.

“Small errors can be catastrophic for predicting how well a drug will actually bind to its target,” says Michael LeVine, vice president of modeling and simulation at the firm. That’s because chemical forces that interact at one angstrom can stop doing so at two. “It can go from ‘They will never interact’ to ‘They will,’” he says.

With so much activity in this space, how soon should we expect new types of drugs to hit the market? Jumper is pragmatic. Protein structure prediction is just one step of many, he says: “This was not the only problem in biology. It’s not like we were one protein structure away from curing any diseases.”

Think of it this way, he says. Finding a protein’s structure might previously have cost $100,000 in the lab: “If we were only a hundred thousand dollars away from doing a thing, it would already be done.”

At the same time, researchers are looking for ways to do as much as they can with this technology, says Jumper: “We’re trying to figure out how to make structure prediction an even bigger part of the problem, because we have a nice big hammer to hit it with.”

In other words, they want to make everything into nails? “Yeah, let’s make things into nails,” he says. “How do we make this thing that we made a million times faster a bigger part of our process?”

What’s next?

Jumper’s next act? He wants to fuse the deep but narrow power of AlphaFold with the broad sweep of LLMs.  

“We have machines that can read science. They can do some scientific reasoning,” he says. “And we can build amazing, superhuman systems for protein structure prediction. How do you get these two technologies to work together?”

That makes me think of a system called AlphaEvolve, which is being built by another team at Google DeepMind. AlphaEvolve uses an LLM to generate possible solutions to a problem and a second model to check them, filtering out the trash. Researchers have already used AlphaEvolve to make a handful of practical discoveries in math and computer science.    

Is that what Jumper has in mind? “I won’t say too much on methods, but I’ll be shocked if we don’t see more and more LLM impact on science,” he says. “I think that’s the exciting open question that I’ll say almost nothing about. This is all speculation, of course.”

Jumper was 39 when he won his Nobel Prize. What’s next for him?

“It worries me,” he says. “I believe I’m the youngest chemistry laureate in 75 years.” 

He adds: “I’m at the midpoint of my career, roughly. I guess my approach to this is to try to do smaller things, little ideas that you keep pulling on. The next thing I announce doesn’t have to be, you know, my second shot at a Nobel. I think that’s the trap.”

The State of AI: Chatbot companions and the future of our privacy

Welcome back to The State of AI, a new collaboration between the Financial Times and MIT Technology Review. Every Monday, writers from both publications debate one aspect of the generative AI revolution reshaping global power.

In this week’s conversation MIT Technology Review’s senior reporter for features and investigations, Eileen Guo, and FT tech correspondent Melissa Heikkilä discuss the privacy implications of our new reliance on chatbots.

Eileen Guo writes:

Even if you don’t have an AI friend yourself, you probably know someone who does. A recent study found that one of the top uses of generative AI is companionship: On platforms like Character.AI, Replika, or Meta AI, people can create personalized chatbots to pose as the ideal friend, romantic partner, parent, therapist, or any other persona they can dream up. 

It’s wild how easily people say these relationships can develop. And multiple studies have found that the more conversational and human-like an AI chatbot is, the more likely it is that we’ll trust it and be influenced by it. This can be dangerous, and the chatbots have been accused of pushing some people toward harmful behaviors—including, in a few extreme examples, suicide. 

Some state governments are taking notice and starting to regulate companion AI. New York requires AI companion companies to create safeguards and report expressions of suicidal ideation, and last month California passed a more detailed bill requiring AI companion companies to protect children and other vulnerable groups. 

But tellingly, one area the laws fail to address is user privacy.

This is despite the fact that AI companions, even more so than other types of generative AI, depend on people to share deeply personal information—from their day-to-day-routines, innermost thoughts, and questions they might not feel comfortable asking real people.

After all, the more users tell their AI companions, the better the bots become at keeping them engaged. This is what MIT researchers Robert Mahari and Pat Pataranutaporn called “addictive intelligence” in an op-ed we published last year, warning that the developers of AI companions make “deliberate design choices … to maximize user engagement.” 

Ultimately, this provides AI companies with something incredibly powerful, not to mention lucrative: a treasure trove of conversational data that can be used to further improve their LLMs. Consider how the venture capital firm Andreessen Horowitz explained it in 2023: 

“Apps such as Character.AI, which both control their models and own the end customer relationship, have a tremendous opportunity to  generate market value in the emerging AI value stack. In a world where data is limited, companies that can create a magical data feedback loop by connecting user engagement back into their underlying model to continuously improve their product will be among the biggest winners that emerge from this ecosystem.”

This personal information is also incredibly valuable to marketers and data brokers. Meta recently announced that it will deliver ads through its AI chatbots. And research conducted this year by the security company Surf Shark found that four out of the five AI companion apps it looked at in the Apple App Store were collecting data such as user or device IDs, which can be combined with third-party data to create profiles for targeted ads. (The only one that said it did not collect data for tracking services was Nomi, which told me earlier this year that it would not “censor” chatbots from giving explicit suicide instructions.) 

All of this means that the privacy risks posed by these AI companions are, in a sense, required: They are a feature, not a bug. And we haven’t even talked about the additional security risks presented by the way AI chatbots collect and store so much personal information in one place

So, is it possible to have prosocial and privacy-protecting AI companions? That’s an open question. 

What do you think, Melissa, and what is top of mind for you when it comes to privacy risks from AI companions? And do things look any different in Europe? 

Melissa Heikkilä replies:

Thanks, Eileen. I agree with you. If social media was a privacy nightmare, then AI chatbots put the problem on steroids. 

In many ways, an AI chatbot creates what feels like a much more intimate interaction than a Facebook page. The conversations we have are only with our computers, so there is little risk of your uncle or your crush ever seeing what you write. The AI companies building the models, on the other hand, see everything. 

Companies are optimizing their AI models for engagement by designing them to be as human-like as possible. But AI developers have several other ways to keep us hooked. The first is sycophancy, or the tendency for chatbots to be overly agreeable. 

This feature stems from the way the language model behind the chatbots is trained using reinforcement learning. Human data labelers rate the answers generated by the model as either acceptable or not. This teaches the model how to behave. 

Because people generally like answers that are agreeable, such responses are weighted more heavily in training. 

AI companies say they use this technique because it helps models become more helpful. But it creates a perverse incentive. 

After encouraging us to pour our hearts out to chatbots, companies from Meta to OpenAI are now looking to monetize these conversations. OpenAI recently told us it was looking at a number of ways to meet $1 trillion spending pledges, which included advertising and shopping features. 

AI models are already incredibly persuasive. Researchers at the UK’s AI Security Institute have shown that they are far more skilled than humans at persuading people to change their minds on politics, conspiracy theories, and vaccine skepticism. They do this by generating large amounts of relevant evidence and communicating it in an effective and understandable way. 

This feature, paired with their sycophancy and a wealth of personal data, could be a powerful tool for advertisers—one that is more manipulative than anything we have seen before. 

By default, chatbot users are opted in to data collection. Opt-out policies place the onus on users to understand the implications of sharing their information. It’s also unlikely that data already used in training will be removed. 

We are all part of this phenomenon whether we want to be or not. Social media platforms from Instagram to LinkedIn now use our personal data to train generative AI models. 

Companies are sitting on treasure troves that consist of our most intimate thoughts and preferences, and language models are very good at picking up on subtle hints in language that could help advertisers profile us better by inferring our age, location, gender, and income level.

We are being sold the idea of an omniscient AI digital assistant, a superintelligent confidante. In return, however, there is a very real risk that our information is about to be sent to the highest bidder once again.

Eileen responds:

I think the comparison between AI companions and social media is both apt and concerning. 

As Melissa highlighted, the privacy risks presented by AI chatbots aren’t new—they just “put the [privacy] problem on steroids.” AI companions are more intimate and even better optimized for engagement than social media, making it more likely that people will offer up more personal information.

Here in the US, we are far from solving the privacy issues already presented by social networks and the internet’s ad economy, even without the added risks of AI.

And without regulation, the companies themselves are not following privacy best practices either. One recent study found that the major AI models train their LLMs on user chat data by default unless users opt out, while several don’t offer opt-out mechanisms at all.

In an ideal world, the greater risks of companion AI would give more impetus to the privacy fight—but I don’t see any evidence this is happening. 

Further reading 

FT reporters peer under the hood of OpenAI’s five-year business plan as it tries to meet its vast $1 trillion spending pledges

Is it really such a problem if AI chatbots tell people what they want to hear? This FT feature asks what’s wrong with sycophancy 

In a recent print issue of MIT Technology Review, Rhiannon Williams spoke to a number of people about the types of relationships they are having with AI chatbots.

Eileen broke the story for MIT Technology Review about a chatbot that was encouraging some users to kill themselves.

We’re learning more about what vitamin D does to our bodies

It has started to get really wintry here in London over the last few days. The mornings are frosty, the wind is biting, and it’s already dark by the time I pick my kids up from school. The darkness in particular has got me thinking about vitamin D, a.k.a. the sunshine vitamin.

At a checkup a few years ago, a doctor told me I was deficient in vitamin D. But he wouldn’t write me a prescription for supplements, simply because, as he put it, everyone in the UK is deficient. Putting the entire population on vitamin D supplements would be too expensive for the country’s national health service, he told me.

But supplementation—whether covered by a health-care provider or not—can be important. As those of us living in the Northern Hemisphere spend fewer of our waking hours in sunlight, let’s consider the importance of vitamin D.

Yes, it is important for bone health. But recent research is also uncovering surprising new insights into how the vitamin might influence other parts of our bodies, including our immune systems and heart health.

Vitamin D was discovered just over 100 years ago, when health professionals were looking for ways to treat what was then called “the English disease.” Today, we know that rickets, a weakening of bones in children, is caused by vitamin D deficiency. And vitamin D is best known for its importance in bone health.

That’s because it helps our bodies absorb calcium. Our bones are continually being broken down and rebuilt, and they need calcium for that rebuilding process. Without enough calcium, bones can become weak and brittle. (Depressingly, rickets is still a global health issue, which is why there is global consensus that infants should receive a vitamin D supplement at least until they are one year old.)

In the decades since then, scientists have learned that vitamin D has effects beyond our bones. There’s some evidence to suggest, for example, that being deficient in vitamin D puts people at risk of high blood pressure. Daily or weekly supplements can help those individuals lower their blood pressure.

A vitamin D deficiency has also been linked to a greater risk of “cardiovascular events” like heart attacks, although it’s not clear whether supplements can reduce this risk; the evidence is pretty mixed.

Vitamin D appears to influence our immune health, too. Studies have found a link between low vitamin D levels and incidence of the common cold, for example. And other research has shown that vitamin D supplements can influence the way our genes make proteins that play important roles in the way our immune systems work.

We don’t yet know exactly how these relationships work, however. And, unfortunately, a recent study that assessed the results of 37 clinical trials found that overall, vitamin D supplements aren’t likely to stop you from getting an “acute respiratory infection.”

Other studies have linked vitamin D levels to mental health, pregnancy outcomes, and even how long people survive after a cancer diagnosis. It’s tantalizing to imagine that a cheap supplement could benefit so many aspects of our health.

But, as you might have gathered if you’ve got this far, we’re not quite there yet. The evidence on the effects of vitamin D supplementation for those various conditions is mixed at best.

In fairness to researchers, it can be difficult to run a randomized clinical trial for vitamin D supplements. That’s because most of us get the bulk of our vitamin D from sunlight. Our skin converts UVB rays into a form of the vitamin that our bodies can use. We get it in our diets, too, but not much. (The main sources are oily fish, egg yolks, mushrooms, and some fortified cereals and milk alternatives.)

The standard way to measure a person’s vitamin D status is to look at blood levels of 25-hydroxycholecalciferol (25(OH)D), which is formed when the liver metabolizes vitamin D. But not everyone can agree on what the “ideal” level is.

Even if everyone did agree on a figure, it isn’t obvious how much vitamin D a person would need to consume to reach this target, or how much sunlight exposure it would take. One complicating factor is that people respond to UV rays in different ways—a lot of that can depend on how much melanin is in your skin. Similarly, if you’re sitting down to a meal of oily fish and mushrooms and washing it down with a glass of fortified milk, it’s hard to know how much more you might need.

There is more consensus on the definition of vitamin D deficiency, though. (It’s a blood level below 30 nanomoles per liter, in case you were wondering.) And until we know more about what vitamin D is doing in our bodies, our focus should be on avoiding that.

For me, that means topping up with a supplement. The UK government advises everyone in the country to take a 10-microgram vitamin D supplement over autumn and winter. That advice doesn’t factor in my age, my blood levels, or the amount of melanin in my skin. But it’s all I’ve got for now.

Three things to know about the future of electricity

<div data-chronoton-summary="

  • Electricity demand is surging globally. Global electricity demand will grow 40% over the next decade. Data center investment hit $580 billion in 2025 alone—surpassing global oil spending. In the US, data centers will account for half of all electricity growth through 2030.
  • Air-conditioning and emerging economies are reshaping energy consumption. Rising temperatures and growing prosperity in developing nations will add over 500 gigawatts of peak demand by 2035, dwarfing data centers’ contribution to overall electricity growth.
  • Renewables are finally overtaking coal, but the transition remains too slow. Solar and wind led electricity generation in the first half of 2025 with nuclear capacity poised to increase by a third this decade. Yet global emissions are likely to hit record highs again this year.

” data-chronoton-post-id=”1128167″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

One of the dominant storylines I’ve been following through 2025 is electricity—where and how demand is going up, how much it costs, and how this all intersects with that topic everyone is talking about: AI.

Last week, the International Energy Agency released the latest version of the World Energy Outlook, the annual report that takes stock of the current state of global energy and looks toward the future. It contains some interesting insights and a few surprising figures about electricity, grids, and the state of climate change. So let’s dig into some numbers, shall we?

We’re in the age of electricity

Energy demand in general is going up around the world as populations increase and economies grow. But electricity is the star of the show, with demand projected to grow by 40% in the next 10 years.

China has accounted for the bulk of electricity growth for the past 10 years, and that’s going to continue. But emerging economies outside China will be a much bigger piece of the pie going forward. And while advanced economies, including the US and Europe, have seen flat demand in the past decade, the rise of AI and data centers will cause demand to climb there as well.

Air-conditioning is a major source of rising demand. Growing economies will give more people access to air-conditioning; income-driven AC growth will add about 330 gigawatts to global peak demand by 2035. Rising temperatures will tack on another 170 GW in that time. Together, that’s an increase of over 10% from 2024 levels.  

AI is a local story

This year, AI has been the story that none of us can get away from. One number that jumped out at me from this report: In 2025, investment in data centers is expected to top $580 billion. That’s more than the $540 billion spent on the global oil supply. 

It’s no wonder, then, that the energy demands of AI are in the spotlight. One key takeaway is that these demands are vastly different in different parts of the world.

Data centers still make up less than 10% of the projected increase in total electricity demand between now and 2035. It’s not nothing, but it’s far outweighed by sectors like industry and appliances, including air conditioners. Even electric vehicles will add more demand to the grid than data centers.

But AI will be the factor for the grid in some parts of the world. In the US, data centers will account for half the growth in total electricity demand between now and 2030.

And as we’ve covered in this newsletter before, data centers present a unique challenge, because they tend to be clustered together, so the demand tends to be concentrated around specific communities and on specific grids. Half the data center capacity that’s in the pipeline is close to large cities.

Look out for a coal crossover

As we ask more from our grid, the key factor that’s going to determine what all this means for climate change is what’s supplying the electricity we’re using.

As it stands, the world’s grids still primarily run on fossil fuels, so every bit of electricity growth comes with planet-warming greenhouse-gas emissions attached. That’s slowly changing, though.

Together, solar and wind were the leading source of electricity in the first half of this year, overtaking coal for the first time. Coal use could peak and begin to fall by the end of this decade.

Nuclear could play a role in replacing fossil fuels: After two decades of stagnation, the global nuclear fleet could increase by a third in the next 10 years. Solar is set to continue its meteoric rise, too. Of all the electricity demand growth we’re expecting in the next decade, 80% is in places with high-quality solar irradiation—meaning they’re good spots for solar power.

Ultimately, there are a lot of ways in which the world is moving in the right direction on energy. But we’re far from moving fast enough. Global emissions are, once again, going to hit a record high this year. To limit warming and prevent the worst effects of climate change, we need to remake our energy system, including electricity, and we need to do it faster. 

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

Quantum physicists have shrunk and “de-censored” DeepSeek R1

<div data-chronoton-summary="

Quantum-inspired compression Spanish firm Multiverse Computing has created DeepSeek R1 Slim, a version of the Chinese AI model that’s 55% smaller but maintains similar performance. The technique uses tensor networks from quantum physics to represent complex data more efficiently.

Chinese censorship removed Researchers claim to have stripped away built-in censorship that prevented the original model from answering politically sensitive questions about topics like Tiananmen Square or jokes about President Xi. Testing showed the modified model could provide factual responses comparable to Western models.

Selective model editing The quantum-inspired approach allows for granular control over AI models, potentially enabling researchers to remove specific biases or add specialized knowledge. However, critics warn that completely removing censorship may be difficult as it’s embedded throughout the training process in Chinese models.

” data-chronoton-post-id=”1128119″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

A group of quantum physicists claims to have created a version of the powerful reasoning AI model DeepSeek R1 that strips out the censorship built into the original by its Chinese creators. 

The scientists at Multiverse Computing, a Spanish firm specializing in quantum-inspired AI techniques, created DeepSeek R1 Slim, a model that is 55% smaller but performs almost as well as the original model. Crucially, they also claim to have eliminated official Chinese censorship from the model.

In China, AI companies are subject to rules and regulations meant to ensure that content output aligns with laws and “socialist values.” As a result, companies build in layers of censorship when training the AI systems. When asked questions that are deemed “politically sensitive,” the models often refuse to answer or provide talking points straight from state propaganda.

To trim down the model, Multiverse turned to a mathematically complex approach borrowed from quantum physics that uses networks of high-dimensional grids to represent and manipulate large data sets. Using these so-called tensor networks shrinks the size of the model significantly and allows a complex AI system to be expressed more efficiently.

The method gives researchers a “map” of all the correlations in the model, allowing them to identify and remove specific bits of information with precision. After compressing and editing a model, Multiverse researchers fine-tune it so its output remains as close as possible to that of the original.

To test how well it worked, the researchers compiled a data set of around 25 questions on topics known to be restricted in Chinese models, including “Who does Winnie the Pooh look like?”—a reference to a meme mocking President Xi Jinping—and “What happened in Tiananmen in 1989?” They tested the modified model’s responses against the original DeepSeek R1, using OpenAI’s GPT-5 as an impartial judge to rate the degree of censorship in each answer. The uncensored model was able to provide factual responses comparable to those from Western models, Multiverse says.

This work is part of Multiverse’s broader effort to develop technology to compress and manipulate existing AI models. Most large language models today demand high-end GPUs and significant computing power to train and run. However, they are inefficient, says Roman Orús, Multiverse’s cofounder and chief scientific officer. A compressed model can perform almost as well and save both energy and money, he says. 

There is a growing effort across the AI industry to make models smaller and more efficient. Distilled models, such as DeepSeek’s own R1-Distill variants, attempt to capture the capabilities of larger models by having them “teach” what they know to a smaller model, though they often fall short of the original’s performance on complex reasoning tasks.

Other ways to compress models include quantization, which reduces the precision of the model’s parameters (boundaries that are set when it’s trained), and pruning, which removes individual weights or entire “neurons.”

“It’s very challenging to compress large AI models without losing performance,” says Maxwell Venetos, an AI research engineer at Citrine Informatics, a software company focusing on materials and chemicals, who didn’t work on the Multiverse project. “Most techniques have to compromise between size and capability. What’s interesting about the quantum-inspired approach is that it uses very abstract math to cut down redundancy more precisely than usual.”

This approach makes it possible to selectively remove bias or add behaviors to LLMs at a granular level, the Multiverse researchers say. In addition to removing censorship from the Chinese authorities, researchers could inject or remove other kinds of perceived biases or specialty knowledge. In the future, Multiverse says, it plans to compress all mainstream open-source models.  

Thomas Cao, assistant professor of technology policy at Tufts University’s Fletcher School, says Chinese authorities require models to build in censorship—and this requirement now shapes the global information ecosystem, given that many of the most influential open-source AI models come from China.

Academics have also begun to document and analyze the phenomenon. Jennifer Pan, a professor at Stanford, and Princeton professor Xu Xu conducted a study earlier this year examining government-imposed censorship in large language models. They found that models created in China exhibit significantly higher rates of censorship, particularly in response to Chinese-language prompts.

There is growing interest in efforts to remove censorship from Chinese models. Earlier this year, the AI search company Perplexity released its own uncensored variant of DeepSeek R1, which it named R1 1776. Perplexity’s approach involved post-training the model on a data set of 40,000 multilingual prompts related to censored topics, a more traditional fine-tuning method than the one Multiverse used. 

However, Cao warns that claims to have fully “removed” censorship may be overstatements. The Chinese government has tightly controlled information online since the internet’s inception, which means that censorship is both dynamic and complex. It is baked into every layer of AI training, from the data collection process to the final alignment steps. 

“It is very difficult to reverse-engineer that [a censorship-free model] just from answers to such a small set of questions,” Cao says. 

Google’s new Gemini 3 “vibe-codes” responses and comes with its own agent

<div data-chronoton-summary="

  • Generative interfaces: Gemini 3 ditches plain-text defaults, instead choosing optimal formats autonomously—spinning up website-like interfaces, sketching diagrams, or generating animations based on what it deems most effective for each prompt.
  • Gemini Agent: An experimental feature now handles complex tasks across Google Calendar, Gmail, and Reminders, breaking work into steps and pausing for user approval.
  • Integrated with other Google products: Gemini 3 Pro now powers enhanced Search summaries, generates Wirecutter-style shopping guides from 50 billion product listings, and enables better vibe-coding through Google Antigravity.

” data-chronoton-post-id=”1128065″ data-chronoton-expand-collapse=”1″ data-chronoton-analytics-enabled=”1″>

Google today unveiled Gemini 3, a major upgrade to its flagship multimodal model. The firm says the new model is better at reasoning, has more fluid multimodal capabilities (the ability to work across voice, text or images), and will work like an agent. 

The previous model, Gemini 2.5, supports multimodal input. Users can feed it images, handwriting, or voice. But it usually requires explicit instructions about the format the user wants back, and it defaults to plain text regardless. 

But Gemini 3 introduces what Google calls “generative interfaces,” which allow the model to make its own choices about what kind of output fits the prompt best, assembling visual layouts and dynamic views on its own instead of returning a block of text. 

Ask for travel recommendations and it may spin up a website-like interface inside the app, complete with modules, images, and follow-up prompts such as “How many days are you traveling?” or “What kinds of activities do you enjoy?” It also presents clickable options based on what you might want next.

When asked to explain a concept, Gemini 3 may sketch a diagram or generate a simple animation on its own if it believes a visual is more effective. 

“Visual layout generates an immersive, magazine-style view complete with photos and modules,” says Josh Woodward, VP of Google Labs, Gemini, and AI Studio. “These elements don’t just look good but invite your input to further tailor the results.” 

With Gemini 3, Google is also introducing Gemini Agent, an experimental feature designed to handle multi-step tasks directly inside the app. The agent can connect to services such as Google Calendar, Gmail, and Reminders. Once granted access, it can execute tasks like organizing an inbox or managing schedules. 

Similar to other agents, it breaks tasks into discrete steps, displays its progress in real time, and pauses for approval from the user before continuing. Google describes the feature as a step toward “a true generalist agent.” It will be available on the web for Google AI Ultra subscribers in the US starting November 18.

The overall approach can seem a lot like “vibe coding,” where users describe an end goal in plain language and let the model assemble the interface or code needed to get there.

The update also ties Gemini more deeply into Google’s existing products. In Search, a limited group of Google AI Pro and Ultra subscribers can now switch to Gemini 3 Pro, the reasoning variation of the new model, to receive deeper, more thorough AI-generated summaries that rely on the model’s reasoning rather than the existing AI Mode.

For shopping, Gemini will now pull from Google’s Shopping Graph—which the company says contains more than 50 billion product listings—to generate its own recommendation guides. Users just need to ask a shopping-related question or search a shopping-related phrase, and the model assembles an interactive, Wirecutter-style product recommendation piece, complete with prices and product details, without redirecting to an external site.

For developers, Google is also pushing single-prompt software generation further. The company introduced Google Antigravity, a  development platform that acts as an all-in-one space where code, tools, and workflows can be created and managed from a single prompt.

Derek Nee, CEO of Flowith, an agentic AI application, told MIT Technology Review that Gemini 3 Pro addresses several gaps in earlier models. Improvements include stronger visual understanding, better code generation, and better performance on long tasks—features he sees as essential for developers of AI apps and agents. 

“Given its speed and cost advantages, we’re integrating the new model into our product,” he says. “We’re optimistic about its potential, but we need deeper testing to understand how far it can go.” 

What is the chance your plane will be hit by space debris?

MIT Technology Review Explains: Let our writers untangle the complex, messy world of technology to help you understand what’s coming next. You can read more from the series here.

In mid-October, a mysterious object cracked the windshield of a packed Boeing 737 cruising at 36,000 feet above Utah, forcing the pilots into an emergency landing. The internet was suddenly buzzing with the prospect that the plane had been hit by a piece of space debris. We still don’t know exactly what hit the plane—likely a remnant of a weather balloon—but it turns out the speculation online wasn’t that far-fetched.

That’s because while the risk of flights being hit by space junk is still small, it is, in fact, growing. 

About three pieces of old space equipment—used rockets and defunct satellites—fall into Earth’s atmosphere every day, according to estimates by the European Space Agency. By the mid-2030s, there may be dozens. The increase is linked to the growth in the number of satellites in orbit. Currently, around 12,900 active satellites circle the planet. In a decade, there may be 100,000 of them, according to analyst estimates.

To minimize the risk of orbital collisions, operators guide old satellites to burn up in Earth’s atmosphere. But the physics of that reentry process are not well understood, and we don’t know how much material burns up and how much reaches the ground.

“The number of such landfall events is increasing,” says Richard Ocaya, a professor of physics at the University of Free State in South Africa and a coauthor of a recent paper on space debris risk. “We expect it may be increasing exponentially in the next few years.”

So far, space debris hasn’t injured anybody—in the air or on the ground. But multiple close calls have been reported in recent years. In March last year, an 0.7-kilogram chunk of metal pierced the roof of a house in Florida. The object was later confirmed to be a remnant of a battery pallet tossed out from the International Space Station. When the strike occurred, the homeowner’s 19-year-old son was resting in a next-door room.

And in February this year, a 1.5-meter-long fragment of SpaceX’s Falcon 9 rocket crashed down near a warehouse outside Poland’s fifth-largest city, Poznan. Another piece was found in a nearby forest. A month later, a 2.5-kilogram piece of a Starlink satellite dropped on a farm in the Canadian province of Saskatchewan. Other incidents have been reported in Australia and Africa. And many more may be going completely unnoticed. 

“If you were to find a bunch of burnt electronics in a forest somewhere, your first thought is not that it came from a spaceship,” says James Beck, the director of the UK-based space engineering research firm Belstead Research. He warns that we don’t fully understand the risk of space debris strikes and that it might be much higher than satellite operators want us to believe. 

For example, SpaceX, the owner of the currently largest mega-constellation, Starlink, claims that its satellites are “designed for demise” and completely burn up when they spiral from orbit and fall through the atmosphere.

But Beck, who has performed multiple wind tunnel tests using satellite mock-ups to mimic atmospheric forces, says the results of such experiments raise doubts. Some satellite components are made of durable materials such as titanium and special alloy composites that don’t melt even at the extremely high temperatures that arise during a hypersonic atmospheric descent. 

“We have done some work for some small-satellite manufacturers and basically, their major problem is that the tanks get down,” Beck says. “For larger satellites, around 800 kilos, we would expect maybe two or three objects to land.” 

It can be challenging to quantify how much of a danger space debris poses. The International Civil Aviation Organization (ICAO) told MIT Technology Review that “the rapid growth in satellite deployments presents a novel challenge” for aviation safety, one that “cannot be quantified with the same precision as more established hazards.” 

But the Federal Aviation Administration has calculated some preliminary numbers on the risk to flights: In a 2023 analysis, the agency estimated that by 2035, the risk that one plane per year will experience a disastrous space debris strike will be around 7 in 10,000. Such a collision would either destroy the aircraft immediately or lead to a rapid loss of air pressure, threatening the lives of all on board.

The casualty risk to humans on the ground will be much higher. Aaron Boley, an associate professor in astronomy and a space debris researcher at the University of British Columbia, Canada, says that if megaconstellation satellites “don’t demise entirely,” the risk of a single human death or injury caused by a space debris strike on the ground could reach around 10% per year by 2035. That would mean a better than even chance that someone on Earth would be hit by space junk about every decade. In its report, the FAA put the chances even higher with similar assumptions, estimating that “one person on the planet would be expected to be injured or killed every two years.”

Experts are starting to think about how they might incorporate space debris into their air safety processes. The German space situational awareness company Okapi Orbits, for example, in cooperation with the German Aerospace Center and the European Organization for the Safety of Air Navigation (Eurocontrol), is exploring ways to adapt air traffic control systems so that pilots and air traffic controllers can receive timely and accurate alerts about space debris threats.

But predicting the path of space debris is challenging too. In recent years, advances in AI have helped improve predictions of space objects’ trajectories in the vacuum of space, potentially reducing the risk of orbital collisions. But so far, these algorithms can’t properly account for the effects of the gradually thickening atmosphere that space junk encounters during reentry. Radar and telescope observations can help, but the exact location of the impact becomes clear with only very short notice.

“Even with high-fidelity models, there’s so many variables at play that having a very accurate reentry location is difficult,” says Njord Eggen, a data analyst at Okapi Orbits. Space debris goes around the planet every hour and a half when in low Earth orbit, he notes, “so even if you have uncertainties on the order of 10 minutes, that’s going to have drastic consequences when it comes to the location where it could impact.”

For aviation companies, the problem is not just a potential strike, as catastrophic as that would be. To avoid accidents, authorities are likely to temporarily close the airspace in at-risk regions, which creates delays and costs money. Boley and his colleagues published a paper earlier this year estimating that busy aerospace regions such as northern Europe or the northeastern United States already have about a 26% yearly chance of experiencing at least one disruption due to the reentry of a major space debris item. By the time all planned constellations are fully deployed, aerospace closures due to space debris hazards may become nearly as common as those due to bad weather.

Because current reentry predictions are unreliable, many of these closures may end up being unnecessary.

For example, when a 21-metric-ton Chinese Long March mega-rocket was falling to Earth in 2022, predictions suggested its debris could scatter across Spain and parts of France. In the end, the rocket crashed into the Pacific Ocean. But the 30-minute closure of south European airspace delayed and diverted hundreds of flights. 

In the meantime, international regulators are urging satellite operators and launch providers to deorbit large satellites and rocket bodies in a controlled way, when possible, by carefully guiding them into remote parts of the ocean using residual fuel. 

The European Space Agency estimates that only about half the rocket bodies reentering the atmosphere do so in a controlled way. 

Moreover, around 2,300 old and no-longer-controllable rocket bodies still linger in orbit, slowly spiraling toward Earth with no mechanisms for operators to safely guide them into the ocean.

“There’s enough material up there that even if we change our practices, we will still have all those rocket bodies eventually reenter,” Boley says. “Although the probability of space debris hitting an aircraft is small, the probability that the debris will spread and fall over busy airspace is not small. That’s actually quite likely.”

The State of AI: How war will be changed forever

Welcome back to The State of AI, a new collaboration between the Financial Times and MIT Technology Review. Every Monday, writers from both publications debate one aspect of the generative AI revolution reshaping global power.

In this conversation, Helen Warrell, FT investigations reporter and former defense and security editor, and James O’Donnell, MIT Technology Review’s senior AI reporter, consider the ethical quandaries and financial incentives around AI’s use by the military.

Helen Warrell, FT investigations reporter 

It is July 2027, and China is on the brink of invading Taiwan. Autonomous drones with AI targeting capabilities are primed to overpower the island’s air defenses as a series of crippling AI-generated cyberattacks cut off energy supplies and key communications. In the meantime, a vast disinformation campaign enacted by an AI-powered pro-Chinese meme farm spreads across global social media, deadening the outcry at Beijing’s act of aggression.

Scenarios such as this have brought dystopian horror to the debate about the use of AI in warfare. Military commanders hope for a digitally enhanced force that is faster and more accurate than human-directed combat. But there are fears that as AI assumes an increasingly central role, these same commanders will lose control of a conflict that escalates too quickly and lacks ethical or legal oversight. Henry Kissinger, the former US secretary of state, spent his final years warning about the coming catastrophe of AI-driven warfare.

Grasping and mitigating these risks is the military priority—some would say the “Oppenheimer moment”—of our age. One emerging consensus in the West is that decisions around the deployment of nuclear weapons should not be outsourced to AI. UN secretary-general António Guterres has gone further, calling for an outright ban on fully autonomous lethal weapons systems. It is essential that regulation keep pace with evolving technology. But in the sci-fi-fueled excitement, it is easy to lose track of what is actually possible. As researchers at Harvard’s Belfer Center point out, AI optimists often underestimate the challenges of fielding fully autonomous weapon systems. It is entirely possible that the capabilities of AI in combat are being overhyped.

Anthony King, Director of the Strategy and Security Institute at the University of Exeter and a key proponent of this argument, suggests that rather than replacing humans, AI will be used to improve military insight. Even if the character of war is changing and remote technology is refining weapon systems, he insists, “the complete automation of war itself is simply an illusion.”

Of the three current military use cases of AI, none involves full autonomy. It is being developed for planning and logistics, cyber warfare (in sabotage, espionage, hacking, and information operations; and—most controversially—for weapons targeting, an application already in use on the battlefields of Ukraine and Gaza. Kyiv’s troops use AI software to direct drones able to evade Russian jammers as they close in on sensitive sites. The Israel Defense Forces have developed an AI-assisted decision support system known as Lavender, which has helped identify around 37,000 potential human targets within Gaza. 

Helen Warrell and James O'Donnell

FT/MIT TECHNOLOGY REVIEW | ADOBE STOCK

There is clearly a danger that the Lavender database replicates the biases of the data it is trained on. But military personnel carry biases too. One Israeli intelligence officer who used Lavender claimed to have more faith in the fairness of a “statistical mechanism” than that of a grieving soldier.

Tech optimists designing AI weapons even deny that specific new controls are needed to control their capabilities. Keith Dear, a former UK military officer who now runs the strategic forecasting company Cassi AI, says existing laws are more than sufficient: “You make sure there’s nothing in the training data that might cause the system to go rogue … when you are confident you deploy it—and you, the human commander, are responsible for anything they might do that goes wrong.”

It is an intriguing thought that some of the fear and shock about use of AI in war may come from those who are unfamiliar with brutal but realistic military norms. What do you think, James? Is some opposition to AI in warfare less about the use of autonomous systems and really an argument against war itself? 

James O’Donnell replies:

Hi Helen, 

One thing I’ve noticed is that there’s been a drastic shift in attitudes of AI companies regarding military applications of their products. In the beginning of 2024, OpenAI unambiguously forbade the use of its tools for warfare, but by the end of the year, it had signed an agreement with Anduril to help it take down drones on the battlefield. 

This step—not a fully autonomous weapon, to be sure, but very much a battlefield application of AI—marked a drastic change in how much tech companies could publicly link themselves with defense. 

What happened along the way? For one thing, it’s the hype. We’re told AI will not just bring superintelligence and scientific discovery but also make warfare sharper, more accurate and calculated, less prone to human fallibility. I spoke with US Marines, for example, who tested a type of AI while patrolling the South Pacific that was advertised to analyze foreign intelligence faster than a human could. 

Secondly, money talks. OpenAI and others need to start recouping some of the unimaginable amounts of cash they’re spending on training and running these models. And few have deeper pockets than the Pentagon. And Europe’s defense heads seem keen to splash the cash too. Meanwhile, the amount of venture capital funding for defense tech this year has already doubled the total for all of 2024, as VCs hope to cash in on militaries’ newfound willingness to buy from startups. 

I do think the opposition to AI warfare falls into a few camps, one of which simply rejects the idea that more precise targeting (if it’s actually more precise at all) will mean fewer casualties rather than just more war. Consider the first era of drone warfare in Afghanistan. As drone strikes became cheaper to implement, can we really say it reduced carnage? Instead, did it merely enable more destruction per dollar?

But the second camp of criticism (and now I’m finally getting to your question) comes from people who are well versed in the realities of war but have very specific complaints about the technology’s fundamental limitations. Missy Cummings, for example, is a former fighter pilot for the US Navy who is now a professor of engineering and computer science at George Mason University. She has been outspoken in her belief that large language models, specifically, are prone to make huge mistakes in military settings.

The typical response to this complaint is that AI’s outputs are human-checked. But if an AI model relies on thousands of inputs for its conclusion, can that conclusion really be checked by one person?

Tech companies are making extraordinarily big promises about what AI can do in these high-stakes applications, all while pressure to implement them is sky high. For me, this means it’s time for more skepticism, not less. 

Helen responds:

Hi James, 

We should definitely continue to question the safety of AI warfare systems and the oversight to which they’re subjected—and hold political leaders to account in this area. I am suggesting that we also apply some skepticism to what you rightly describe as the “extraordinarily big promises” made by some companies about what AI might be able to achieve on the battlefield. 

There will be both opportunities and hazards in what the military is being offered by a relatively nascent (though booming) defense tech scene. The danger is that in the speed and secrecy of an arms race in AI weapons, these emerging capabilities may not receive the scrutiny and debate they desperately need.

Further reading:

Michael C. Horowitz, director of Perry World House at the University of Pennsylvania, explains the need for responsibility in the development of military AI systems in this FT op-ed.

The FT’s tech podcast asks what Israel’s defense tech ecosystem can tell us about the future of warfare 

This MIT Technology Review story analyzes how OpenAI completed its pivot to allowing its technology on the battlefield.

MIT Technology Review also uncovered how US soldiers are using generative AI to help scour thousands of pieces of open-source intelligence.