How to fix a Windows PC affected by the global outage

MIT Technology Review Explains: Let our writers untangle the complex, messy world of technology to help you understand what’s coming next. You can read more here.

Windows PCs have crashed in a major IT outage around the world, bringing airlines, major banks, TV broadcasters, health-care providers, and other businesses to a standstill.

Airlines including United, Delta, and American have been forced to ground and delay flights, stranding passengers in airports, while the UK broadcaster Sky News was temporarily pulled off air. Meanwhile, banking customers in Europe, Australia, and India have been unable to access their online accounts. Doctors’ offices and hospitals in the UK have lost access to patient records and appointment scheduling systems.

The problem stems from a defect in a single content update for Windows machines from the cybersecurity provider CrowdStrike. George Kurtz, CrowdStrike’s CEO, says that the company is actively working with customers affected.

“This is not a security incident or cyberattack,” he said in a statement on X. “The issue has been identified, isolated and a fix has been deployed. We refer customers to the support portal for the latest updates and will continue to provide complete and continuous updates on our website.” CrowdStrike pointed MIT Technology Review to its blog with additional updates for customers.

What caused the issue?

The issue originates from a faulty update from CrowdStrike, which has knocked affected servers and PCs offline and caused some Windows workstations to display the “blue screen of death” when users attempt to boot them. Mac and Linux hosts are not affected.

The update was intended for CrowdStrike’s Falcon software, which is “endpoint detection and response” software designed to protect companies’ computer systems from cyberattacks and malware. But instead of working as expected, the update caused computers running Windows software to crash and fail to reboot. Home PCs running Windows are less likely to have been affected, because CrowdStrike is predominantly used by large organizations. Microsoft did not immediately respond to a request for comment.

“The CrowdStrike software works at the low-level operating system layer. Issues at this level make the OS not bootable,” says Lukasz Olejnik, an independent cybersecurity researcher and consultant, and author of Philosophy of Cybersecurity.

Not all computers running Windows were affected in the same way, he says, pointing out that if a machine’s systems had been turned off at the time CrowdStrike pushed out the update (which has since been withdrawn), it wouldn’t have received it.

For the machines running systems that received the mangled update and were rebooted, an automated update from CrowdStrike’s server management infrastructure should suffice, he says.

“But in thousands or millions of cases, this may require manual human intervention,” he adds. “That means a really bad weekend ahead for plenty of IT staff.”

How to manually fix your affected computer

There is a known workaround for Windows computers, but it requires administrative access to their systems. If you’re affected and have that high level of access, CrowdStrike has recommended the following steps:

1. Boot Windows into safe mode or the Windows Recovery Environment.

2. Navigate to the C:\Windows\System32\drivers\CrowdStrike directory.

3. Locate the file matching “C-00000291*.sys” and delete it.

4. Boot the machine normally.
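Steps 2 and 3 above amount to a pattern match followed by a delete. The sketch below illustrates only that logic; it is not a CrowdStrike tool, and Python is not normally available in safe mode or the Windows Recovery Environment. The function name, the dry-run flag, and the file names in the demo are assumptions made for illustration. Only the file pattern comes from the steps above.

```python
import tempfile
from pathlib import Path

# Pattern quoted in the recommended steps; everything else is illustrative.
FAULTY_PATTERN = "C-00000291*.sys"


def remove_faulty_channel_files(directory: Path, dry_run: bool = True) -> list[Path]:
    """Find files matching the faulty pattern; delete them unless dry_run is set."""
    matches = sorted(directory.glob(FAULTY_PATTERN))
    for path in matches:
        if not dry_run:
            path.unlink()
    return matches


# Demo against a throwaway directory with made-up file names,
# so nothing on a real system is touched.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp)
    (folder / "C-00000291-example.sys").write_text("faulty")
    (folder / "C-00000285-example.sys").write_text("unrelated")

    would_delete = remove_faulty_channel_files(folder, dry_run=True)
    print([p.name for p in would_delete])

    remove_faulty_channel_files(folder, dry_run=False)
    print(sorted(p.name for p in folder.iterdir()))
```

The dry run lists what would be removed before anything is deleted, which is a sensible habit when working inside a system drivers folder.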

Sounds simple, right? But while the above fix is fairly easy to administer, it requires someone to enter it physically, meaning IT teams will need to track down remote machines that have been affected, says Andrew Dwyer of the Department of Information Security at Royal Holloway, University of London.

“We’ve been quite lucky that this is an outage and not an exploitation by a criminal gang or another state,” he says. “It also shows how easy it is to inflict quite significant global damage if you get into the right part of the IT supply chain.”

While fixing the problem is going to cause headaches for IT teams for the next week or so, it’s highly unlikely to cause significant long-term damage to the affected systems—which would not have been the case if it had been ransomware rather than a bungled update, he says.

“If this was a piece of ransomware, there could have been significant outages for months,” he adds. “Without endpoint detection software, many organizations would be in a much more vulnerable place. But they’re critical nodes in the system that have a lot of access to the computer systems that we use.”

Google, Amazon and the problem with Big Tech’s climate claims

Last week, Amazon trumpeted that it had purchased enough clean electricity to cover the energy demands of all the offices, data centers, grocery stores, and warehouses across its global operations, seven years ahead of its sustainability target. 

That news closely followed Google’s acknowledgment that the soaring energy demands of its AI operations helped ratchet up its corporate emissions by 13% last year—and that it had backed away from claims that it was already carbon neutral.

If you were to take the announcements at face value, you’d be forgiven for believing that Google is stumbling while Amazon is speeding ahead in the race to clean up climate pollution. 

But while both companies are coming up short in their own ways, Google’s approach to driving down greenhouse-gas emissions is now arguably more defensible. 

In fact, there’s a growing consensus that how a company gets to net zero is more important than how fast it does so. And a new school of thought is emerging that moves beyond the net-zero model of corporate climate action, arguing that companies should focus on achieving broader climate impacts rather than trying to balance out every ton of carbon dioxide they emit. 

But to understand why, let’s first examine how the two tech giants’ approaches stack up, and where company climate strategies often go wrong.

Perverse incentives

The core problem is that the costs and complexity of net-zero emissions plans, which require companies to cut or cancel out every ton of climate pollution across their supply chains, can create perverse incentives. Corporate sustainability officers often end up pursuing the quickest, cheapest ways of cleaning up a company’s pollution on paper, rather than the most reliable ways of reducing its emissions in the real world. 

That may mean buying inexpensive carbon credits to offset ongoing pollution from their direct operations or that of their suppliers, rather than undertaking the tougher task of slashing those emissions at the source. Those programs can involve paying other parties to plant trees, restore coastal ecosystems, or alter agriculture practices in ways that purport to reduce emissions or pull carbon dioxide out of the air. The snag is, numerous studies and investigative stories have shown that such efforts often overstate the climate benefits, sometimes wildly.  

Net-zero goals can also compel companies to buy what are known as renewable energy credits (RECs), which ostensibly support additional generation of renewable electricity but raise similar concerns that the climate gains are overstated.

The argument for RECs is that companies often can’t purchase a pure stream of clean electricity to power their operations, since grid operators rely on a mix of natural gas, coal, solar, wind, and other sources. But if those businesses provide money or an indication of demand that spurs developers to build new renewables projects and generate more clean electricity than they would have otherwise, the companies can then claim this cancels out ongoing pollution from the electricity they use.

Experts, however, are less and less convinced of the value of RECs at this stage.

The claim that clean-energy projects wouldn’t have been built without that added support is increasingly unconvincing in a world where those facilities can easily compete in the marketplace on their own, Emily Grubert, an associate professor at Notre Dame, previously told me. And if a company’s purchase of such credits doesn’t bring about changes that reduce the emissions in the atmosphere, it can’t balance out the company’s ongoing pollution. 

‘Creative accounting’

For its part, Amazon is relying on both carbon credits and RECs. 

In its sustainability report, the company says that it reached its clean-electricity targets and drove down emissions by improving energy efficiency, buying more carbon-free power, building renewables projects at its facilities, and supporting such projects around the world. It did this in part by “purchasing additional environmental attributes (such as renewable energy credits) to signal our support for renewable energy in the grids where we operate, in line with the expected generation of the projects we have contracted.”

But there’s yet another issue that can arise when a company pays for clean power that it’s not directly consuming, whether through RECs or through power purchase agreements made before a project is built: Merely paying for renewable electricity generation that occurred at some point, somewhere in the world, isn’t the same as procuring the amount of electricity that the company consumed in the specific places and times that it did so. As you may have heard, the sun stops shining and the wind stops blowing, even as Amazon workers and operations keep grinding around the world and around the clock. 

Paying a solar-farm operator some additional money for producing electricity it was already going to generate in the middle of the day doesn’t in any meaningful way reverse the emissions that an Amazon fulfillment center or server farm produces by, say, drawing electricity from a natural-gas power plant two states away in the middle of the night. 
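A small worked example may make the gap concrete. The figures below are hypothetical and chosen only for round arithmetic: a site that draws a flat 10 megawatt-hours every hour, and a clean-power purchase delivered entirely during eight midday hours. The function names are illustrative and do not correspond to any company's accounting method.

```python
# Hypothetical hourly figures in MWh for a single day; not real data.
consumption = [10.0] * 24  # flat, round-the-clock demand
clean_purchased = [30.0 if 9 <= hour < 17 else 0.0 for hour in range(24)]


def volumetric_match(consumption, clean):
    """Share of total use covered when only the daily totals are compared."""
    return min(1.0, sum(clean) / sum(consumption))


def hourly_match(consumption, clean):
    """Share of use covered when clean power must arrive in the same hour."""
    matched = sum(min(used, bought) for used, bought in zip(consumption, clean))
    return matched / sum(consumption)


print(f"Totals-only matching: {volumetric_match(consumption, clean_purchased):.0%}")
print(f"Hour-by-hour matching: {hourly_match(consumption, clean_purchased):.0%}")
```

Under these assumptions the totals line up exactly (240 MWh bought against 240 MWh used), so the totals-only figure is 100%. Hour by hour, only the eight midday hours are covered, which works out to about 33%.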

“The reality on the ground is that its data centers are driving up demand for fossil fuels,” argued a report last week from Amazon Employees for Climate Justice, a group of workers that has been pushing the company to take more aggressive action on climate change. 

The organization said that a significant share of Amazon’s RECs aren’t driving development of new projects. It also stressed that those payments and projects often aren’t generating electricity in the same areas and at the same times that Amazon is consuming power.

The employee group estimates that 78% of Amazon’s US energy comes from nonrenewable sources and accuses the company of using “creative accounting” to claim it’s reached its clean-electricity goals.

To its credit, Amazon is investing billions of dollars in renewables, electrifying its fleet of delivery vehicles, and otherwise making real strides in reducing its waste and emissions. In addition, it’s lobbying US legislators to make it easier to permit electric transmission projects, funding more reliable forms of carbon removal, and working to diversify its mix of electricity sources. The company also insists it’s being careful and selective about the types of carbon offsets it supports, investing only in “additional, quantifiable, real, permanent, and socially beneficial” projects.

“Amazon is focused on making the grid cleaner and more reliable for everyone,” the company said in response to an inquiry from MIT Technology Review. “An emissions-first approach is the fastest, most cost-effective and scalable way to leverage corporate clean-energy procurement to help decarbonize global power grids. This includes procuring renewable energy in locations and countries that still rely heavily on fossil fuels to power their grids, and where energy projects can have the biggest impact on carbon reduction.”

The company has adopted what’s known as a “carbon matching” approach (which it lays out further here), stressing that it wants to be sure the emissions reduced through its investments in renewables equal or exceed the emissions it continues to produce. 

But a recent study led by Princeton researchers found that carbon matching had a “minimal impact” on long-term power system emissions, because it rarely helps get projects built or clean energy generated where those things wouldn’t have happened anyway.

“It’s an offsetting scheme at its core,” Wilson Ricks, an author of the study and an energy systems researcher at Princeton, said of the method, without commenting on Amazon specifically. 

(Meta, Salesforce, and General Motors have also embraced this model, the study notes.)

The problem in asserting that a company is effectively running entirely on clean electricity, when it’s not doing so directly and may not be doing so completely, is that it takes off any pressure to finish the job for real. 

Backing off claims of carbon neutrality

Google has made its own questionable climate claims over the years as well, and it faces growing challenges as the energy it uses for artificial intelligence soars. 

But it is striving to address its power consumption in arguably more defensible ways and now appears to be taking some notable course-correcting steps, according to its recent sustainability report.

Google says that it’s no longer buying carbon credits that purport to prevent emissions. With this change, it has also backed away from the claim that it had already achieved carbon neutrality across its operations years ago.

“We’re no longer procuring carbon avoidance credits year-over-year to compensate for our annual operational emissions,” the company told MIT Technology Review in a statement. “We’re instead focusing on accelerating an array of carbon solutions and partnerships that will help us work toward our net-zero goal, while simultaneously helping develop broader solutions to mitigate climate change.”

Notably, that includes funding the development of more expensive but possibly more reliable ways of pulling greenhouse gas out of the atmosphere through direct air capture machines or other methods. The company pledged $200 million to Frontier, an effort to pay in advance for one billion tons of carbon dioxide that startups will eventually draw down and store. 

Those commitments may not allow the company to make any assertions about its own emissions today, and some of the early-stage approaches it funds might not work at all. But the hope is that these sorts of investments could help stand up a carbon removal industry, which studies find may be essential for keeping warming in check over the coming decades. 

Clean power around the clock

In addition, for several years now Google has worked to purchase or otherwise support generation of clean power in the areas where it operates and across every hour that it consumes electricity—an increasingly popular approach known as 24/7 carbon-free energy.

The idea is that this will stimulate greater development of what grid operators increasingly need: forms of carbon-free energy that can run at all hours of the day (commonly called “firm generation”), matching up with the actual hour-by-hour energy demands of corporations. That can include geothermal plants, nuclear reactors, hydroelectric plants, and more.

More than 150 organizations and governments have now signed the 24/7 Carbon-Free Energy Compact, a pledge to ensure that clean-electricity purchases match up hourly with their consumption. Those include Google, Microsoft, SAP, and Rivian.

The Princeton study notes that hourly matching is more expensive than other approaches but finds that it drives “significant reductions in system-level CO2 emissions” while “incentivizing advanced clean firm generation and long-duration storage technologies that would not otherwise see market uptake.”

In Google’s case, pursuing 24/7 matching has steered the company to support more renewables projects in the areas where it operates and to invest in more energy storage projects. It has also entered into purchase agreements with power plants that can deliver carbon-free electricity around the clock. These include several deals with Fervo Energy, an enhanced-geothermal startup.

The company says its goal is to achieve net-zero emissions across its supply chains by 2030, with all its electricity use synced up, hour by hour, with clean sources across every grid it operates on.

Energy-hungry AI

Which brings us back to the growing problem of AI energy consumption.

Jonathan Koomey, an independent researcher studying the energy demands of computing, argues that the hue and cry over rising electricity use for AI is overblown. He notes that AI accounts for only a sliver of overall energy consumption from information technology, which produces about 1.4% of global emissions.

But major data center companies like Google, Amazon, and others will need to make significant changes to ensure that they stay ahead of rising AI-driven energy use while keeping on track with their climate goals.

They will have to improve overall energy efficiency, procure more clean energy, and use their clout as major employers to push utilities to increase carbon-free generation in the areas where they operate, he says. But the clear focus must be on directly cutting corporate climate pollution, not mucking around with RECs and offsets.

“Reduce your emissions; that’s it,” Koomey says. “We need actual, real, meaningful emissions reductions, not trading around credits that have, at best, an ambiguous effect.”

Google says it’s already making progress on its AI footprint, while stressing that it’s leveraging artificial intelligence to find ways to drive down climate pollution across sectors. Those include efforts like Tapestry, a project within the company’s X “moonshot factory” to create more efficient and reliable electricity grids, as well as a Google Research collaboration to determine airline flight paths that produce fewer heat-trapping cirrus clouds.

“AI holds immense promise to drive climate action,” the company said in its report.

The contribution model

The contrasting approaches of Google and Amazon call to mind an instructive hypothetical that a team of carbon market researchers sketched out in a paper this January. They noted that one company could do the hard, expensive work of directly eliminating nearly every ton of its emissions, while another could simply buy cheap offsets to purportedly address all of its own. In that case the first company would have done more actual good for the climate, but only the latter would be able to say it had reached its net-zero target.

Given these challenges and the perverse incentives driving companies toward cheap offsets, the authors have begun arguing for a different approach, known as the “contribution model.”

Like Koomey and others, they stress that companies should dedicate most of their money and energy to directly cutting their emissions as much as possible. But they assert that companies should adopt a new way of dealing with what’s left over (either because that remaining pollution is occurring outside their direct operations or because there are not yet affordable, emissions-free alternatives).

Instead of trying to cancel out every ongoing ton of emissions, a company might pick a percentage of its revenue or set a defensible carbon price on those tons, and then dedicate all that money toward achieving the maximum climate benefit the money can buy, says Libby Blanchard, a research scholar at the University of Cambridge. (She coauthored the paper on the contribution model with Barbara Haya of the University of California, Berkeley, and Bill Anderegg at the University of Utah.)

That could mean funding well-managed forestry projects that help trap carbon dioxide, protect biodiversity, and improve air and water quality. It could mean supporting research and development on the technologies still needed to slow global warming and efforts to scale them up, as Google seems to be doing. Or it could even mean lobbying for stricter climate laws, since few things can drive change as quickly as public policy. 

But the key difference is that the company won’t be able to claim that those actions canceled out every ton of remaining emissions—only that it took real, responsible steps to “contribute” to addressing the problem of climate change. 

The hope is that this approach frees companies to focus on the quality of the projects they fund, not the quantity of cheap offsets they buy, Blanchard says.

It could “replace this race to the bottom with a race to the top,” she says.

As with any approach put before profit-motivated companies that employ ranks of savvy accountants and attorneys, there will surely be ways to abuse this method in the absence of appropriate safeguards and oversight.

And plenty of companies may refuse to adopt it, since they won’t be able to claim they’ve achieved net-zero emissions, which has become the de facto standard for corporate climate action.

But Blanchard says there’s one obvious incentive for them to move away from that goal.

“There’s way less risk that they’ll be sued or accused of greenwashing,” she says.

What are AI agents? 

When ChatGPT was first released, everyone in AI was talking about the new generation of AI assistants. But over the past year, that excitement has turned to a new target: AI agents. 

Agents featured prominently at Google’s annual I/O conference in May, when the company unveiled its new AI agent called Astra, which allows users to interact with it using audio and video. OpenAI’s new GPT-4o model has also been called an AI agent.

And it’s not just hype, although there is definitely some of that too. Tech companies are plowing vast sums into creating AI agents, and their research efforts could usher in the kind of useful AI we have been dreaming about for decades. Many experts, including Sam Altman, say they are the next big thing.   

But what are they? And how can we use them? 

How are they defined? 

It is still early days for research into AI agents, and the field does not have a definitive definition for them. But simply, they are AI models and algorithms that can autonomously make decisions in a dynamic world, says Jim Fan, a senior research scientist at Nvidia who leads the company’s AI agents initiative. 

The grand vision for AI agents is a system that can execute a vast range of tasks, much like a human assistant. In the future, it could help you book your vacation, but it will also remember if you prefer swanky hotels, so it will only suggest hotels that have four stars or more and then go ahead and book the one you pick from the range of options it offers you. It will then also suggest flights that work best with your calendar, and plan the itinerary for your trip according to your preferences. It could make a list of things to pack based on that plan and the weather forecast. It might even send your itinerary to any friends it knows live in your destination and invite them along. In the workplace, it could analyze your to-do list and execute tasks from it, such as sending calendar invites, memos, or emails.

One vision for agents is that they are multimodal, meaning they can process language, audio, and video. For example, in Google’s Astra demo, users could point a smartphone camera at things and ask the agent questions. The agent could respond to text, audio, and video inputs. 

These agents could also make processes smoother for businesses and public organizations, says David Barber, the director of the University College London Centre for Artificial Intelligence. For example, an AI agent might be able to function as a more sophisticated customer service bot. The current generation of language-model-based assistants can only generate the next likely word in a sentence. But an AI agent would have the ability to act on natural-language commands autonomously and process customer service tasks without supervision. For example, the agent would be able to analyze customer complaint emails and then know to check the customer’s reference number, access databases such as customer relationship management and delivery systems to see whether the complaint is legitimate, and process it according to the company’s policies, Barber says. 
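The customer service workflow Barber describes can be sketched as a chain of steps. The code below is a toy under loud assumptions: fixed rules stand in for the language model that would read the email and choose what to do, a small dictionary stands in for the delivery database, and the reference format, record contents, and refund policy are all invented for illustration.

```python
import re

# Hypothetical stand-in for a delivery system; a real agent would query live databases.
DELIVERY_RECORDS = {
    "REF-1001": {"delivered": False},
    "REF-1002": {"delivered": True},
}


def extract_reference(email_text):
    """Pull out a reference number in the made-up REF-<digits> format, if present."""
    match = re.search(r"REF-\d+", email_text)
    return match.group(0) if match else None


def handle_complaint(email_text):
    """Walk the steps: read the email, check the reference, consult records, apply policy."""
    reference = extract_reference(email_text)
    if reference is None:
        return "ask_customer_for_reference"
    record = DELIVERY_RECORDS.get(reference)
    if record is None:
        return "escalate_to_human"
    # Invented policy: refund only when the records show the parcel never arrived.
    return "escalate_to_human" if record["delivered"] else "issue_refund"


print(handle_complaint("My parcel REF-1001 never arrived."))
print(handle_complaint("Order REF-1002 was damaged."))
print(handle_complaint("Where is my order?"))
```

The point of the sketch is the shape of the task: several lookups and a decision chained together without a person in the loop. What would make a system an agent in Barber's sense is doing this from a plain-language instruction rather than from hand-written rules.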

Broadly speaking, there are two different categories of agents, says Fan: software agents and embodied agents. 

Software agents run on computers or mobile phones and use apps, much as in the travel agent example above. “Those agents are very useful for office work or sending emails or having this chain of events going on,” he says. 

Embodied agents are agents that are situated in a 3D world such as a video game, or in a robot. These kinds of agents might make video games more engaging by letting people play with nonplayer characters controlled by AI. These sorts of agents could also help build more useful robots that could help us with everyday tasks at home, such as folding laundry and cooking meals. 

Fan was part of a team that built an embodied AI agent called MineDojo in the popular computer game Minecraft. Using a vast trove of data collected from the internet, Fan’s AI agent was able to learn new skills and tasks that allowed it to freely explore the virtual 3D world and complete complex tasks such as encircling llamas with fences or scooping lava into a bucket. Video games are good proxies for the real world, because they require agents to understand physics, reasoning, and common sense. 

In a new paper, which has not yet been peer-reviewed, researchers at Princeton say that AI agents tend to have three different characteristics. AI systems are considered “agentic” if they can pursue difficult goals in complex environments without being instructed. They also qualify if they can be instructed in natural language and act autonomously without supervision. And finally, the term “agent” can also apply to systems that are able to use tools, such as web search or programming, or are capable of planning.

Are they a new thing?

The term “AI agents” has been around for years and has meant different things at different times, says Chirag Shah, a computer science professor at the University of Washington. 

There have been two waves of agents, says Fan. The current wave is thanks to the language model boom and the rise of systems such as ChatGPT. 

The previous wave was in 2016, when Google DeepMind introduced AlphaGo, its AI system that can play—and win—the game Go. AlphaGo was able to make decisions and plan strategies. This relied on reinforcement learning, a technique that rewards AI algorithms for desirable behaviors. 

“But these agents were not general,” says Oriol Vinyals, vice president of research at Google DeepMind. They were created for very specific tasks—in this case, playing Go. The new generation of foundation-model-based AI makes agents more universal, as they can learn from the world humans interact with. 

“You feel much more that the model is interacting with the world and then giving back to you better answers or better assisted assistance or whatnot,” says Vinyals. 

What are the limitations? 

There are still many open questions that need to be answered. Kanjun Qiu, CEO and founder of the AI startup Imbue, which is working on agents that can reason and code, likens the state of agents to where self-driving cars were just over a decade ago. They can do stuff, but they’re unreliable and still not really autonomous. For example, a coding agent can generate code, but it sometimes gets it wrong, and it doesn’t know how to test the code it’s creating, says Qiu. So humans still need to be actively involved in the process. AI systems still can’t fully reason, which is a critical step in operating in a complex and ambiguous human world.

“We’re nowhere close to having an agent that can just automate all of these chores for us,” says Fan. Current systems “hallucinate and they also don’t always follow instructions closely,” Fan says. “And that becomes annoying.”  

Another limitation is that after a while, AI agents lose track of what they are working on. AI systems are limited by their context windows, meaning the amount of data they can take into account at any given time. 
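A toy illustration of why a fixed context window makes a system lose track of a task: once the window is full, every new word pushes the oldest one out. The class below is a deliberate simplification. It counts whitespace-separated words, whereas real models count subword tokens and have far larger limits, and the class and method names are made up for this sketch.

```python
from collections import deque


class ToyContextWindow:
    """Keeps only the most recent `max_tokens` words, like a very small context window."""

    def __init__(self, max_tokens):
        # A deque with maxlen silently discards the oldest items once it is full.
        self.tokens = deque(maxlen=max_tokens)

    def add(self, text):
        self.tokens.extend(text.split())

    def visible(self):
        return " ".join(self.tokens)


window = ToyContextWindow(max_tokens=5)
window.add("book a hotel in Lisbon")
window.add("then find flights")
print(window.visible())  # in Lisbon then find flights
```

After the second instruction arrives, the words "book a hotel" are no longer visible, so a system relying only on this window would no longer know what it was originally asked to do.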

“ChatGPT can do coding, but it’s not able to do long-form content well. But for human developers, we look at an entire GitHub repository that has tens if not hundreds of lines of code, and we have no trouble navigating it,” says Fan. 

To tackle this problem, Google has increased its models’ capacity to process data, which allows users to have longer interactions with them in which they remember more about past interactions. The company said it is working on making its context windows infinite in the future.

For embodied agents such as robots, there are even more limitations. There is not enough training data to teach them, and researchers are only just starting to harness the power of foundation models in robotics. 

So amid all the hype and excitement, it’s worth bearing in mind that research into AI agents is still in its very early stages, and it will likely take years until we can experience their full potential. 

That sounds cool. Can I try an AI agent now? 

Sort of. You’ve most likely tried their early prototypes, such as OpenAI’s ChatGPT and GPT-4. “If you’re interacting with software that feels smart, that is kind of an agent,” says Qiu. 

Right now the best agents we have are systems with very narrow and specific use cases, such as coding assistants, customer service bots, or workflow automation software like Zapier, she says. But these are a far cry from a universal AI agent that can do complex tasks. 

“Today we have these computers and they’re really powerful, but we have to micromanage them,” says Qiu. 

OpenAI’s ChatGPT plug-ins, which allow people to create AI-powered assistants for web browsers, were an attempt at agents, says Qiu. But these systems are still clumsy, unreliable, and not capable of reasoning, she says. 

Despite that, these systems will one day change the way we interact with technology, Qiu believes, and it is a trend people need to pay attention to. 

“It’s not like, ‘Oh my God, all of a sudden we have AGI’ … but more like ‘Oh my God, my computer can do way more than it did five years ago,’” she says.

Why does AI hallucinate?

The World Health Organization’s new chatbot launched on April 2 with the best of intentions. 

A fresh-faced virtual avatar backed by GPT-3.5, SARAH (Smart AI Resource Assistant for Health) dispenses health tips in eight different languages, 24/7, about how to eat well, quit smoking, de-stress, and more, for millions around the world.

But like all chatbots, SARAH can flub its answers. It was quickly found to give out incorrect information. In one case, it came up with a list of fake names and addresses for nonexistent clinics in San Francisco. The World Health Organization warns on its website that SARAH may not always be accurate.

Here we go again. Chatbot fails are now a familiar meme. Meta’s short-lived scientific chatbot Galactica made up academic papers and generated wiki articles about the history of bears in space. In February, Air Canada was ordered to honor a refund policy invented by its customer service chatbot. Last year, a lawyer was fined for submitting court documents filled with fake judicial opinions and legal citations made up by ChatGPT. 

This tendency to make things up—known as hallucination—is one of the biggest obstacles holding chatbots back from more widespread adoption. Why do they do it? And why can’t we fix it?

Magic 8 Ball

To understand why large language models hallucinate, we need to look at how they work. The first thing to note is that making stuff up is exactly what these models are designed to do. When you ask a chatbot a question, it draws its response from the large language model that underpins it. But it’s not like looking up information in a database or using a search engine on the web. 

Peel open a large language model and you won’t see ready-made information waiting to be retrieved. Instead, you’ll find billions and billions of numbers. It uses these numbers to calculate its responses from scratch, producing new sequences of words on the fly. A lot of the text that a large language model generates looks as if it could have been copy-pasted from a database or a real web page. But as in most works of fiction, the resemblances are coincidental. A large language model is more like an infinite Magic 8 Ball than an encyclopedia. 

Large language models generate text by predicting the next word in a sequence. If a model sees “the cat sat,” it may guess “on.” That new sequence is fed back into the model, which may now guess “the.” Go around again and it may guess “mat”—and so on. That one trick is enough to generate almost any kind of text you can think of, from Amazon listings to haiku to fan fiction to computer code to magazine articles and so much more. As Andrej Karpathy, a computer scientist and cofounder of OpenAI, likes to put it: large language models learn to dream internet documents. 
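The guess-and-feed-back loop can be sketched in a few lines of Python. This is a toy illustration, not how a real model is built: the `NEXT_WORD` table and the `generate` function are hypothetical stand-ins, and a real model weighs the whole sequence rather than only the last word.

```python
# Toy stand-in for a language model: maps the last word seen to a guess
# for the next word. A real model computes its guess from billions of
# learned numbers and considers the whole sequence, not just one word.
NEXT_WORD = {
    "sat": "on",
    "on": "the",
    "the": "mat",
}

def generate(prompt, max_new_words=3):
    """Repeatedly guess the next word and feed the longer sequence back in."""
    words = prompt.split()
    for _ in range(max_new_words):
        guess = NEXT_WORD.get(words[-1])
        if guess is None:  # the toy table has no guess for this word
            break
        words.append(guess)
    return " ".join(words)

print(generate("the cat sat"))  # prints "the cat sat on the mat"
```

Each pass through the loop adds one word, and the only input to the next pass is the text produced so far.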

Think of the billions of numbers inside a large language model as a vast spreadsheet that captures the statistical likelihood that certain words will appear alongside certain other words. The values in the spreadsheet get set when the model is trained, a process that adjusts those values over and over again until the model’s guesses mirror the linguistic patterns found across terabytes of text taken from the internet. 
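To make the spreadsheet picture concrete, here is a minimal sketch that counts which words follow which in a tiny, made-up corpus. It is only an analogy in code: real models do not store literal counts but learn their numbers through repeated adjustment during training, and the `corpus` and `count_followers` names are ours.

```python
from collections import Counter, defaultdict

# A tiny invented "training set"; real models learn from terabytes of text.
corpus = "the cat sat on the mat . the dog sat on the rug ."

def count_followers(text):
    """Count how often each word is immediately followed by each other word."""
    table = defaultdict(Counter)
    words = text.split()
    for current, following in zip(words, words[1:]):
        table[current][following] += 1
    return table

table = count_followers(corpus)
print(table["sat"])  # every "sat" in this corpus is followed by "on"
print(table["the"])  # "the" is followed by several different words
```

A row such as `table["the"]` is the kind of statistical pattern the spreadsheet analogy refers to: after "the," several continuations are plausible, and some are more common than others.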

To guess a word, the model simply runs its numbers. It calculates a score for each word in its vocabulary that reflects how likely that word is to come next in the sequence in play. The word with the best score wins. In short, large language models are statistical slot machines. Crank the handle and out pops a word. 
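A rough sketch of that scoring step, with invented numbers: the `scores` below are hypothetical, and the conversion from scores to probabilities uses softmax, the standard function for that job. The sketch shows two ways to choose a word. Taking the top score every time gives the same word on every run, while drawing at random in proportion to the probabilities is closer to the slot-machine picture and is how chatbots are commonly run in practice.

```python
import math
import random

# Hypothetical raw scores for a four-word vocabulary after "the cat sat".
scores = {"on": 4.0, "down": 2.5, "quietly": 1.0, "banana": -2.0}

def to_probabilities(raw_scores):
    """Softmax: turn arbitrary scores into probabilities that sum to 1."""
    top = max(raw_scores.values())  # subtract the max for numerical stability
    exps = {word: math.exp(score - top) for word, score in raw_scores.items()}
    total = sum(exps.values())
    return {word: value / total for word, value in exps.items()}

def pick_best(probs):
    """Always take the highest-probability word."""
    return max(probs, key=probs.get)

def pick_at_random(probs, rng):
    """Crank the handle: draw one word, weighted by its probability."""
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words], k=1)[0]

probs = to_probabilities(scores)
rng = random.Random(0)  # fixed seed so the sketch is repeatable
print(pick_best(probs))
print([pick_at_random(probs, rng) for _ in range(5)])
```

With weighted draws, an unlikely word such as "banana" will rarely come up, but nothing rules it out.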

It’s all hallucination

The takeaway here? It’s all hallucination, but we only call it that when we notice it’s wrong. The problem is, large language models are so good at what they do that what they make up looks right most of the time. And that makes trusting them hard. 

Can we control what large language models generate so they produce text that’s guaranteed to be accurate? These models are far too complicated for their numbers to be tinkered with by hand. But some researchers believe that training them on even more text will continue to reduce their error rate. This is a trend we’ve seen as large language models have gotten bigger and better. 

Another approach involves asking models to check their work as they go, breaking responses down step by step. Known as chain-of-thought prompting, this has been shown to increase the accuracy of a chatbot’s output. It’s not possible yet, but future large language models may be able to fact-check the text they are producing and even rewind when they start to go off the rails.

But none of these techniques will stop hallucinations fully. As long as large language models are probabilistic, there is an element of chance in what they produce. Roll 100 dice and you’ll get a pattern. Roll them again and you’ll get another. Even if the dice are, like large language models, weighted to produce some patterns far more often than others, the results still won’t be identical every time. Even one error in 1,000—or 100,000—adds up to a lot of errors when you consider how many times a day this technology gets used. 
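A back-of-the-envelope calculation shows how quickly small error rates add up, and a quick dice experiment shows why two runs rarely match. The daily volume of 10 million responses is an assumption chosen for illustration, not a reported figure, and the dice weights are invented.

```python
import random

def expected_errors(responses_per_day, one_error_in):
    """Expected wrong answers per day if one response in `one_error_in` is wrong."""
    return responses_per_day / one_error_in

# Assumed volume, for illustration only.
DAILY_RESPONSES = 10_000_000
for one_in in (1_000, 100_000):
    errors = expected_errors(DAILY_RESPONSES, one_in)
    print(f"1 error in {one_in:,}: about {errors:,.0f} errors a day")

def roll_weighted_dice(n, rng):
    """Roll n dice loaded toward six (weights are made up)."""
    faces = [1, 2, 3, 4, 5, 6]
    weights = [1, 1, 1, 1, 1, 5]
    return rng.choices(faces, weights=weights, k=n)

rng = random.Random()
first = roll_weighted_dice(100, rng)
second = roll_weighted_dice(100, rng)
print(first == second)  # almost certainly False, despite the loaded dice
```

At the assumed volume, one error in 1,000 works out to about 10,000 wrong answers a day, and one in 100,000 still leaves about 100.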

The more accurate these models become, the more we will let our guard down. Studies show that the better chatbots get, the more likely people are to miss an error when it happens.  

Perhaps the best fix for hallucination is to manage our expectations about what these tools are for. When the lawyer who used ChatGPT to generate fake documents was asked to explain himself, he sounded as surprised as anyone by what had happened. “I heard about this new site, which I falsely assumed was, like, a super search engine,” he told a judge. “I did not comprehend that ChatGPT could fabricate cases.” 

Here’s the defense tech at the center of US aid to Israel, Ukraine, and Taiwan

After weeks of drawn-out congressional debate over how much the United States should spend on conflicts abroad, President Joe Biden signed a $95.3 billion aid package into law on Wednesday.

The bill will send a significant quantity of supplies to Ukraine and Israel, while also supporting Taiwan with submarine technology to aid its defenses against China. It’s also sparked renewed calls for stronger crackdowns on Iranian-produced drones. 

Though much of the money will go toward replenishing fairly standard munitions and supplies, the spending bill provides a window into US strategies around four key defense technologies that continue to reshape how today’s major conflicts are being fought.

For a closer look at the military technology at the center of the aid package, I spoke with Andrew Metrick, a fellow with the defense program at the Center for a New American Security, a think tank.

Ukraine and the role of long-range missiles

Ukraine has long sought the Army Tactical Missile System (ATACMS), a long-range ballistic missile made by Lockheed Martin. It debuted in Operation Desert Storm in Iraq in 1991, and it is 13 feet high, two feet wide, and over 3,600 pounds. It can use GPS to accurately hit targets 190 miles away. 

Last year, President Biden was apprehensive about sending such missiles to Ukraine, as US stockpiles of the weapons were relatively low. In October, the administration changed tack. The US sent shipments of ATACMS, a move celebrated by President Volodymyr Zelensky of Ukraine, but they came with restrictions: the missiles were older models with a shorter range, and Ukraine was instructed not to fire them into Russian territory, only Ukrainian territory. 

This week, just hours before the new aid package was signed, multiple news outlets reported that the US had secretly sent more powerful long-range ATACMS to Ukraine several weeks before. They were used on Tuesday, April 23, to target a Russian airfield in Crimea and Russian troops in Berdiansk, 50 miles southwest of Mariupol.

The long range of the weapons has proved essential for Ukraine, says Metrick. “It allows the Ukrainians to strike Russian targets at ranges for which they have very few other options,” he says. That means being able to hit locations like supply depots, command centers, and airfields behind Russia’s front lines in Ukraine. This capacity has grown more important as Ukraine’s troop numbers have waned, Metrick says.

Replenishing Israel’s Iron Dome

On April 13, Iran launched its first-ever direct attack on Israeli soil. In the attack, which Iran says was retaliation for Israel’s airstrike on its embassy in Syria, hundreds of missiles were lobbed into Israeli airspace. Many of them were neutralized by the web of cutting-edge missile launchers dispersed throughout Israel that can automatically detonate incoming strikes before they hit land. 

One of those systems is Israel’s Iron Dome, in which radar systems detect projectiles and then signal units to launch defensive missiles that detonate the target high in the sky before it strikes populated areas. Israel’s other system, called David’s Sling, works a similar way but can identify rockets coming from a greater distance, upwards of 180 miles. 

Both systems are hugely costly to research and build, and the new US aid package allocates $15 billion to replenish their missile stockpile. The missiles can cost anywhere from $100,000 to $10 million each, and a system like Iron Dome might fire them daily during intense periods of conflict. 

The aid comes as funding for Israel has grown more contentious amid the dire conditions faced by displaced Palestinians in Gaza. While the spending bill worked its way through Congress, increasing numbers of Democrats sought to put conditions on the military aid to Israel, particularly after an Israeli air strike on April 1 killed seven aid workers from World Central Kitchen, an international food charity. The funding package does provide $9 billion in humanitarian assistance for the conflict, but the efforts to impose conditions for Israeli military aid failed. 

Taiwan and underwater defenses against China

A rising concern for the US defense community—and a subject of “wargaming” simulations that Metrick has carried out—is an amphibious invasion of Taiwan from China. The rising risk of that scenario has driven the US to build and deploy larger numbers of advanced submarines, Metrick says. A bigger fleet of these submarines would be more likely to keep attacks from China at bay, thereby protecting Taiwan.

The trouble is that the US shipbuilding effort, experts say, is too slow. It’s been hampered by budget cuts and labor shortages, but the new aid bill aims to jump-start it. It will provide $3.3 billion to do so, specifically for the production of Columbia-class submarines, which carry nuclear weapons, and Virginia-class submarines, which carry conventional weapons. 

Though these funds aim to support Taiwan by building up the US supply of submarines, the package also includes more direct support, like $2 billion to help it purchase weapons and defense equipment from the US. 

The US’s Iranian drone problem 

Shahed drones are used almost daily on the Russia-Ukraine battlefield, and Iran launched more than 100 against Israel earlier this month. Produced by Iran and resembling model planes, the drones are fast, cheap, and lightweight, capable of being launched from the back of a pickup truck. They’re used frequently for potent one-way attacks, where they detonate upon reaching their target. US experts say the technology is tipping the scales toward Russian and Iranian military groups and their allies. 

The trouble with combating them is partly one of cost. Shooting down the drones, which can be bought for as little as $40,000, can cost millions in ammunition.
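The cost mismatch is simple arithmetic. The sketch below reuses two figures from this article: the low-end drone price of $40,000 and the $100,000 to $10 million range quoted earlier for Israeli interceptor missiles. Pairing one interceptor with one drone is a simplifying assumption, and the function name is ours.

```python
DRONE_COST = 40_000                          # low-end Shahed price cited in the article
INTERCEPTOR_COSTS = (100_000, 10_000_000)    # range cited for interceptor missiles

def cost_exchange_ratio(interceptor_cost, drone_cost=DRONE_COST):
    """Dollars the defender spends for each dollar the attacker spends."""
    return interceptor_cost / drone_cost

for cost in INTERCEPTOR_COSTS:
    ratio = cost_exchange_ratio(cost)
    print(f"${cost:,} interceptor vs ${DRONE_COST:,} drone: {ratio:g}x")
```

Under these assumptions the defender spends between 2.5 and 250 times what the attacker does for each drone shot down.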

“Shooting down Shaheds with an expensive missile is not, in the long term, a winning proposition,” Metrick says. “That’s what the Iranians, I think, are banking on. They can wear people down.”

This week’s aid package renewed White House calls for stronger sanctions aimed at curbing production of the drones. The United Nations previously passed rules restricting any drone-related material from entering or leaving Iran, but those expired in October. The US now wants them reinstated. 

Even if that happens, it’s unlikely the rules would do much to contain the Shahed’s dominance. The components of the drones are not all that complex or hard to obtain to begin with, and experts say that Iran has built a sprawling global supply chain to acquire the materials needed to manufacture them and has worked with Russia to build factories. 

“Sanctions regimes are pretty dang leaky,” Metrick says. “They [Iran] have friends all around the world.”

How virtual power plants are shaping tomorrow’s energy system

For more than a century, the prevalent image of power plants has been characterized by towering smokestacks, endless coal trains, and loud spinning turbines. But the plants powering our future will look radically different—in fact, many may not have a physical form at all. Welcome to the era of virtual power plants (VPPs).

The shift from conventional energy sources like coal and gas to variable renewable alternatives such as solar and wind means the decades-old way we operate the energy system is changing. 

Governments and private companies alike are now counting on VPPs’ potential to help keep costs down and stop the grid from becoming overburdened. 

Here’s what you need to know about VPPs—and why they could be the key to helping us bring more clean power and energy storage online.

What are virtual power plants and how do they work?

A virtual power plant is a system of distributed energy resources—like rooftop solar panels, electric vehicle chargers, and smart water heaters—that work together to balance energy supply and demand on a large scale. They are usually run by local utility companies that oversee this balancing act.

A VPP is a way of “stitching together” a portfolio of resources that can help the grid respond to high energy demand while reducing the energy system’s carbon footprint, says Rudy Shankar, director of Lehigh University’s Energy Systems Engineering.

The “virtual” nature of VPPs comes from their lack of a central physical facility, like a traditional coal or gas plant. By generating electricity and balancing the energy load, the aggregated batteries and solar panels provide many of the functions of conventional power plants.

They also have unique advantages.

Kevin Brehm, a manager at Rocky Mountain Institute who focuses on carbon-free electricity, says comparing VPPs to traditional plants is a “helpful analogy,” but VPPs “do certain things differently and therefore can provide services that traditional power plants can’t.”

One significant difference is VPPs’ ability to shape consumers’ energy use in real time. Unlike conventional power plants, VPPs can communicate with distributed energy resources and allow grid operators to control the demand from end users.

For example, smart thermostats linked to air conditioning units can adjust home temperatures and manage how much electricity the units consume. On hot summer days these thermostats can pre-cool homes before peak hours, when air conditioning usage surges. Staggering cooling times can help prevent abrupt demand hikes that might overwhelm the grid and cause outages. Similarly, electric vehicle chargers can adapt to the grid’s requirements by either supplying or utilizing electricity. 
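A small simulation illustrates why staggering matters. Every number here is an assumption chosen for illustration: 1,000 homes, air conditioners that draw 3 kilowatts, and cooling runs that last two hours. The `peak_demand_kw` helper is hypothetical and is not drawn from any utility’s software.

```python
from collections import Counter

AC_KW = 3.0      # assumed draw of one air conditioner while running
RUN_HOURS = 2    # assumed length of each cooling run

def peak_demand_kw(start_hours):
    """Peak combined AC load, given the hour each home starts cooling."""
    load = Counter()
    for start in start_hours:
        for hour in range(start, start + RUN_HOURS):
            load[hour] += AC_KW
    return max(load.values())

homes = 1_000
everyone_at_5pm = [17] * homes
# Staggered: spread start times evenly from 1 p.m. to 6 p.m.
staggered = [13 + (i % 6) for i in range(homes)]

print(peak_demand_kw(everyone_at_5pm))  # 3000.0
print(peak_demand_kw(staggered))        # 1002.0
```

The homes use the same total energy in both cases. Only the timing changes, and in this sketch that alone cuts the peak by about two-thirds.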

These distributed energy sources connect to the grid through communication technologies like Wi-Fi, Bluetooth, and cellular services. In aggregate, adding VPPs can increase overall system resilience. By coordinating hundreds of thousands of devices, VPPs have a meaningful impact on the grid—they shape demand, supply power, and keep the electricity flowing reliably.
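How a VPP operator might decide which devices to call on can be sketched as follows. This is a deliberately naive, greedy approach under stated assumptions, not a description of any real operator’s system: the `Battery` type, the `dispatch` function, and the home IDs are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Battery:
    home_id: str
    available_kw: float  # power this battery can export right now

def dispatch(batteries, needed_kw):
    """Greedy sketch: call on the largest batteries first until the request is met.

    Returns the per-home plan and any shortfall left uncovered.
    """
    plan = {}
    remaining = needed_kw
    for battery in sorted(batteries, key=lambda b: b.available_kw, reverse=True):
        if remaining <= 0:
            break
        share = min(battery.available_kw, remaining)
        plan[battery.home_id] = share
        remaining -= share
    return plan, max(remaining, 0.0)

fleet = [Battery("home-a", 5.0), Battery("home-b", 3.0), Battery("home-c", 7.0)]
plan, shortfall = dispatch(fleet, needed_kw=10.0)
print(plan, shortfall)  # {'home-c': 7.0, 'home-a': 3.0} 0.0
```

A real system would also have to weigh battery charge levels, customer preferences, and network limits, which this sketch ignores.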

How popular are VPPs now?

Until recently, VPPs were mostly used to control consumer energy use. But because solar and battery technology has evolved, utilities can now use them to supply electricity back to the grid when needed.

In the United States, the Department of Energy estimates VPP capacity at around 30 to 60 gigawatts. This represents about 4% to 8% of peak electricity demand nationwide, a minor fraction within the overall system. However, some states and utility companies are moving quickly to add more VPPs to their grids.
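Those two ranges can be cross-checked against each other. Dividing the capacity by its share of peak demand gives the nationwide peak that the figures imply. The function name below is ours, and the result is a derived estimate rather than a number reported by the Department of Energy.

```python
def implied_peak_gw(vpp_gw, share_of_peak):
    """Peak demand implied by a VPP capacity and its share of that peak."""
    return vpp_gw / share_of_peak

low = implied_peak_gw(30, 0.04)   # 30 GW at 4% of peak
high = implied_peak_gw(60, 0.08)  # 60 GW at 8% of peak
print(f"{low:.0f} GW, {high:.0f} GW")
```

Both ends of the range point to a national peak of roughly 750 gigawatts, so the two sets of figures are consistent with each other.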

Green Mountain Power, Vermont’s largest utility company, made headlines last year when it expanded its subsidized home battery program. Customers have the option to lease a Tesla home battery at a discounted rate or purchase their own, receiving assistance of up to $10,500, if they agree to share stored energy with the utility as required. The Vermont Public Utility Commission, which approved the program, said it can also provide emergency power during outages.

In Massachusetts, three utility companies (National Grid, Eversource, and Cape Light Compact) have implemented a VPP program that pays customers in exchange for utility control of their home batteries.

Meanwhile, in Colorado efforts are underway to launch the state’s first VPP system. The Colorado Public Utilities Commission is urging Xcel Energy, its largest utility company, to develop a fully operational VPP pilot by this summer.

Why are VPPs important for the clean energy transition?

Grid operators must meet the annual or daily “peak load,” the moment of highest electricity demand. To do that, they often resort to gas “peaker” plants, which remain dormant most of the year and can be switched on in times of high demand. VPPs will reduce the grid’s reliance on these plants.

The Department of Energy currently aims to expand national VPP capacity to 80 to 160 GW by 2030. That’s roughly equivalent to 80 to 160 fossil fuel plants that need not be built, says Brehm.

Many utilities say VPPs can lower energy bills for consumers in addition to reducing emissions. Research suggests that leveraging distributed sources during peak demand is up to 60% more cost effective than relying on gas plants.

Another significant, if less tangible, advantage of VPPs is that they encourage people to be more involved in the energy system. Usually, customers merely receive electricity. Within a VPP system, they both consume power and contribute it back to the grid. This dual role can improve their understanding of the grid and get them more invested in the transition to clean energy.

What’s next for VPPs?

The capacity of distributed energy sources is expanding rapidly, according to the Department of Energy, owing to the widespread adoption of electric vehicles, charging stations, and smart home devices. Connecting these to VPP systems enhances the grid’s ability to balance electricity demand and supply in real time. Better AI can also help VPPs become more adept at coordinating diverse assets, says Shankar.

Regulators are also coming on board. The National Association of Regulatory Utility Commissioners has started holding panels and workshops to educate its members about VPPs and how to implement them in their states. The California Energy Commission is set to fund research exploring the benefits of integrating VPPs into its grid system. This kind of interest from regulators is new but promising, says Brehm.

Still, hurdles remain. Enrolling in a VPP can be confusing for consumers because the process varies among states and companies. Simplifying it for people will help utility companies make the most of distributed energy resources such as EVs and heat pumps. Standardizing the deployment of VPPs can also speed up their growth nationally by making it easier to replicate successful projects across regions.

“It really comes down to policy,” says Brehm. “The technology is in place. We are continuing to learn about how to best implement these solutions and how to interface with consumers.”