Inside the most dangerous asteroid hunt ever

If you were told that the odds of something were 3.1%, it really wouldn’t seem like much. But for the people charged with protecting our planet, it was huge. 

On February 18, astronomers determined that a 130- to 300-foot-long asteroid had a 3.1% chance of crashing into Earth in 2032. Never had an asteroid of such dangerous dimensions stood such a high chance of striking the planet. For those following this developing story in the news, the revelation was unnerving. For many scientists and engineers, though, it turned out to be—despite its seriousness—a little bit exciting.

While possible impact locations included patches of empty ocean, the space rock, called 2024 YR4, also had several densely populated cities in its possible crosshairs, including Mumbai, Lagos, and Bogotá. If the asteroid did in fact hit such a metropolis, the best-case scenario was severe damage; the worst case was outright, total ruin. And for the first time, a group of United Nations–backed researchers began to have high-level discussions about the fate of the world: If this asteroid was going to hit the planet, what sort of spaceflight mission might be able to stop it? Would they ram a spacecraft into it to deflect it? Would they use nuclear weapons to try to swat it away or obliterate it completely

At the same time, planetary defenders all over the world crewed their battle stations to see if we could avoid that fate—and despite the sometimes taxing new demands on their psyches and schedules, they remained some of the coolest customers in the galaxy. “I’ve had to cancel an appointment saying, I cannot come—I have to save the planet,” says Olivier Hainaut, an astronomer at the European Southern Observatory and one of those who tracked down 2024 YR4. 

Then, just as quickly as history was made, experts declared that the danger had passed. On February 24, asteroid trackers issued the all-clear: Earth would be spared, just as many planetary defense researchers had felt assured it would. 

How did they do it? What was it like to track the rising (and rising and rising) danger of this asteroid, and to ultimately determine that it’d miss us?

This is the inside story of how, over a span of just two months, a sprawling network of global astronomers found, followed, mapped, planned for, and finally dismissed 2024 YR4, the most dangerous asteroid ever found—all under the tightest of timelines and, for just a moment, with the highest of stakes. 

“It was not an exercise,” says Hainaut. This was the real thing: “We really [had] to get it right.”


IN THE BEGINNING

December 27, 2024

THE ASTEROID TERRESTRIAL-IMPACT LAST ALERT SYSTEM, HAWAII

Long ago, an asteroid in the space-rock highway between Mars and Jupiter felt a disturbance in the force: the gravitational pull of Jupiter itself, king of the planets. After some wobbling back and forth, this asteroid was thrown out of the belt, skipped around the sun, and found itself on an orbit that overlapped with Earth’s own. 

“I was the first one to see the detections of it,” Larry Denneau, of the University of Hawai‘i, recalls. “A tiny white pixel on a black background.” 

Denneau is one of the principal investigators at the NASA-funded Asteroid Terrestrial-impact Last Alert System (ATLAS) telescopic network. It may have been just two days after Christmas, but he followed procedure as if it were any other day of the year and sent the observations of the tiny pixel onward to another NASA-funded facility, the Minor Planet Center (MPC) in Cambridge, Massachusetts. 

There’s an alternate reality in which none of this happened. Fortunately, in our timeline, various space agencies—chiefly NASA, but also the European Space Agency and the Japan Aerospace Exploration Agency—invest millions of dollars every year in asteroid-spotting efforts. 

And while multiple nations host observatories capable of performing this work, the US clearly leads the way: Its planetary defense program provides funding to a suite of telescopic facilities solely dedicated to identifying potentially hazardous space rocks. (At least, it leads the way for the moment. The White House’s proposal for draconian budget cuts to NASA and the National Science Foundation mean that several observatories and space missions linked to planetary defense are facing funding losses or outright terminations.) 

Astronomers working at these observatories are tasked with finding threatening asteroids before they find us—because you can’t fight what you can’t see. “They are the first line of planetary defense,” says Kelly Fast, the acting planetary defense officer at NASA’s Planetary Defense Coordination Office in Washington, DC.

ATLAS is one part of this skywatching project, and it consists of four telescopes: two in Hawaii, one in Chile, and another in South Africa. They don’t operate the way you’d think, with astronomers peering through them all night. Instead, they operate “completely robotically and automatically,” says Denneau. Driven by coding scripts that he and his colleagues have developed, these mechanical eyes work in harmony to watch out for any suspicious space rocks. Astronomers usually monitor their survey of the sky from a remote location.

ATLAS telescopes are small, so they can’t see particularly distant objects. But they have a wide field of view, allowing them to see large patches of space at any one moment. “As long as the weather is good, we’re constantly monitoring the night sky, from the North Pole to the South Pole,” says Denneau. 

Larry Denneau
Larry Denneau is a principal investigator at the Asteroid Terrestrial-impact Last Alert System telescopic network.
COURTESY PHOTO

If they detect the starlight reflecting off a moving object, an operator, such as Denneau, gets an alert and visually verifies that the object is real and not some sort of imaging artifact. When a suspected asteroid (or comet) is identified, the observations are sent to the MPC, which is home to a bulletin board featuring (among other things) orbital data on all known asteroids and comets. 

If the object isn’t already listed, a new discovery is announced, and other astronomers can perform follow-up observations. 

In just the past few years, ATLAS has detected more than 1,200 asteroids with near-Earth orbits. Finding ultimately harmless space rocks is routine work—so much so that when the new near-Earth asteroid was spotted by ATLAS’s Chilean telescope that December day, it didn’t even raise any eyebrows. 

Denneau had simply been sitting at home, doing some late-night work on his computer. At the time, of course, he didn’t know that his telescope had just spied what would soon become a history-making asteroid—one that could alter the future of the planet.

The MPC quickly confirmed the new space rock hadn’t already been “found,” and astronomers gave it a provisional designation: 2024 YR4

CATALINA SKY SURVEY, ARIZONA

Around the same time, the discovery was shared with another NASA-funded facility: the Catalina Sky Survey, a nest of three telescopes in the Santa Catalina Mountains north of Tucson that works out of the University of Arizona. “We run a very tight operation,” says Kacper Wierzchoś, one of its comet and asteroid spotters. Unlike ATLAS, these telescopes (although aided by automation) often have an in-person astronomer available to quickly alter the surveys in real time.

“We run a very tight operation,” says Kacper Wierzchoś, one of the comet and asteroid spotters at the Catalina Sky Survey north of Tucson, Arizona.
COURTESY PHOTO

So when Catalina was alerted about what its peers at ATLAS had spotted, staff deployed its Schmidt telescope—a smaller one that excels at seeing bright objects moving extremely quickly. As they fed their own observations of 2024 YR4 to the MPC, Catalina engineer David Rankin looked back over imagery from the previous days and found the new asteroid lurking in a night-sky image taken on December 26. Around then, ATLAS also realized that it had caught sight of 2024 YR4 in a photograph from December 25. 

The combined observations confirmed it: The asteroid had made its closest approach to Earth on Christmas Day, meaning it was already heading back out into space. But where, exactly, was this space rock going? Where would it end up after it swung around the sun? 

CENTER FOR NEAR-EARTH OBJECT STUDIES, CALIFORNIA 

If the answer to that question was Earth, Davide Farnocchia would be one of the first to know. You could say he’s one of NASA’s watchers on the wall. 

And he’s remarkably calm about his duties. When he first heard about 2024 YR4, he barely flinched. It was just another asteroid drifting through space not terribly far from Earth. It was another box to be ticked.

Once it was logged by the MPC, it was Farnocchia’s job to try to plot out 2024 YR4’s possible paths through space, checking to see if any of them overlapped with our planet’s. He works at NASA’s Center for Near-Earth Object Studies (CNEOS) in California, where he’s partly responsible for keeping track of all the known asteroids and comets in the solar system. “We have 1.4 million objects to deal with,” he says, matter-of-factly. 

In the past, astronomers would have had to stitch together multiple images of this asteroid and plot out its possible trajectories. Today, fortunately, Farnocchia has some help: He oversees the digital brain Sentry, an autonomous system he helped code. (Two other facilities in Italy perform similar work: the European Space Agency’s Near-Earth Object Coordination Centre, or NEOCC, and the privately owned Near-Earth Objects Dynamics Site, or NEODyS.)

To chart their courses, Sentry uses every new observation of every known asteroid or comet listed on the MPC to continuously refine the orbits of all those objects, using the immutable laws of gravity and the gravitational influences of any planets, moons, or other sizable asteroids they pass. A recent update to the software means that even the ever-so-gentle push afforded by sunlight is accounted for. That allows Sentry to confidently project the motions of all these objects at least a century into the future. 

Davide Farnocchia
Davide Farnocchia helps track all the known asteroids and comets in the solar system at NASA’s Center for Near-Earth Object Studies.
COURTESY PHOTO

Almost all newly discovered asteroids are quickly found to pose no impact risk. But those that stand even an infinitesimally small chance of smashing into our planet within the next 100 years are placed on the Sentry Risk List until additional observations can rule out those awful possibilities. Better safe than sorry. 

In late December, with just a limited set of data, Sentry concluded that there was a non-negligible chance 2024 YR4 would strike Earth in 2032. Aegis, the equivalent software at Europe’s NEOCC site, agreed. No bother. More observations would very likely remove 2024 YR4 from the Risk List. Just another day at the office for Farnocchia.

It’s worth noting that an asteroid heading toward Earth isn’t always a problem. Small rocks burn up in the planet’s atmosphere several times a day; you’ve probably seen one already this year, on a moonless night. But above a certain size, these rocks turn from innocuous shooting stars into nuclear-esque explosions. 

Reflected starlight is great for initially spotting asteroids, but it’s a terrible way to determine how big they are. A large, dull rock reflects as much light as a bright, tiny rock, making them appear the same to many telescopes. And that’s a problem, considering that a rock around 30 feet long will explode loudly but inconsequentially in Earth’s atmosphere, while a 3,000-foot-long asteroid would slam into the ground and cause devastation on a global scale, imperiling all of civilization. Roughly speaking, if you double the size of an asteroid, it becomes eight times more energetic upon impact—so finding out the size of an Earthbound asteroid is of paramount importance.

In those first few hours after it was discovered, and before anyone knew how shiny or dull its surface was, 2024 YR4 was estimated by astronomers to be as small as 65 feet across or as large as 500 feet. An object of the former size would blow up in mid-air, shattering windows over many miles and likely injuring thousands of people. At the latter size it would vaporize the heart of any city it struck, turning solid rock and metal into liquid and vapor, while its blast wave would devastate the rest of it, killing hundreds of thousands or even millions in the process. 

So now the question was: Just how big was 2024 YR4?


REFINING THE PICTURE

Mid-January 2025

VERY LARGE TELESCOPE, CHILE

Understandably dissatisfied with that level of imprecision, the European Southern Observatory’s Very Large Telescope (VLT), high up on the Cerro Paranal mountain in Chile’s Atacama Desert, entered the chat. As the name suggests, this flagship facility is vast, and it’s capable of really zooming in on distant objects. Or to put it another way: “The VLT is the largest, biggest, best telescope in the world,” says Hainaut, one of the facility’s operators, who usually commands it from half a world away in Germany.  

In reality, the VLT—which lends a hand to the European Space Agency in its asteroid-hunting duties—is actually made up of four massive telescopes, each fixed on four separate corners of the sky. They can be combined to act as a huge light bucket, allowing astronomers to see very faint asteroids. Four additional, smaller, movable telescopes can also team up with their bigger siblings to provide remarkably high-resolution images of even the stealthiest space rocks. 

In this sequence of infrared images taken by ESO’s VLT, the individual image frames have been aligned so that the asteroid remains in the center as other stars appear to move around it.
ESO/O. HAINAUT ET AL.

With so much tech to oversee, the control room of the VLT looks a bit like the inside of the Death Star. “You have eight consoles, each of them with a dozen screens. It’s big, it’s large, it’s spectacular,” says Hainaut. 

In mid-January, the European Space Agency asked the VLT to study several asteroids that had somewhat suspicious near-Earth orbits—including 2024 YR4. With just a few lines of code, the VLT could easily train its sharp eyes on an asteroid like 2024 YR4, allowing astronomers to narrow down its size range. It was found to be at least 130 feet long (big enough to cause major damage in a city) and as much as 300 feet (able to annihilate one).

January 29, 2025

INTERNATIONAL ASTEROID WARNING NETWORK
Marco Fenucci
Marco Fenucci is a near-Earth-object dynamicist at the European Space Agency’s Near-Earth Object Coordination Centre.
COURTESY PHOTO

By the end of the month, there was no mistaking it: 2024 YR4 stood a greater than 1% chance of impacting Earth on December 22, 2032. 

“It’s not something you see very often,” says Marco Fenucci, a near-Earth-object dynamicist at NEOCC. He admits that although it was “a serious thing,” this escalation was also “exciting to see”—something straight out of a sci-fi flick.

Sentry and Aegis, along with the systems at NEODyS, had been checking one another’s calculations. “There was a lot of care,” says Farnocchia, who explains that even though their programs worked wonders, their predictions were manually verified by multiple experts. When a rarity like 2024 YR4 comes along, he says, “you kind of switch gears, and you start being more cautious. You start screening everything that comes in.”

At this point, the klaxon emanating from these three data centers pushed the International Asteroid Warning Network (IAWN), a UN-backed planetary defense awareness group, to issue a public alert to the world’s governments: The planet may be in peril. For the most part, it was at this moment that the media—and the wider public—became aware of the threat. Earth, we may have a problem.

Denneau, along with plenty of other astronomers, received an urgent email from Fast at NASA’s Planetary Defense Coordination Office, requesting that all capable observatories track this hazardous asteroid. But there was one glaring problem. When 2024 YR4 was discovered on December 27, it was already two days after it had made its closest approach to Earth. And since it was heading back out into the shadows of space, it was quickly fading from sight.

Once it gets too faint, “there’s not much ATLAS can do,” Denneau says. By the time of IAWN’s warning, planetary defenders had just weeks to try to track 2024 YR4 and refine the odds of its hitting Earth before they’d lose it to the darkness. 

And if their scopes failed, the odds of an Earth impact would have stayed uncomfortably high until 2028, when the asteroid was due to make another flyby of the planet. That’d be just four short years before the space rock might actually hit.

“In that situation, we would have been … in trouble,” says NEOCC’s Fenucci.

The hunt was on.


PREPARING FOR THE WORST

February 5 and February 6, 2025

SPACE MISSION PLANNING ADVISORY GROUP, AUSTRIA

In early February, spaceflight mission specialists, including those at the UN-supported Space Mission Planning Advisory Group in Vienna, began high-level talks designed to sketch out ways in which 2024 YR4 could be either deflected away from Earth or obliterated—you know, just in case.

A range of options were available—including ramming it with several uncrewed spacecraft or assaulting it with nuclear weapons—but there was no silver bullet in this situation. Nobody had ever launched a nuclear explosive device into deep space before, and the geopolitical ramifications of any nuclear-armed nations doing so in the present day would prove deeply unwelcome. Asteroids are also extremely odd objects; some, perhaps including 2024 YR4, are less like single chunks of rock and more akin to multiple cliffs flying in formation. Hit an asteroid like that too hard and you could fail to deflect it—and instead turn an Earthbound cannonball into a spray of shotgun pellets. 

It’s safe to say that early on, experts were concerned about whether they could prevent a potential disaster. Crucially, eight years was not actually much time to plan something of this scale. So they were keen to better pinpoint how likely, or unlikely, it was that 2024 YR4 was going to collide with the planet before any complex space mission planning began in earnest. 

The people involved with these talks—from physicists at some of America’s most secretive nuclear weapons research laboratories to spaceflight researchers over in Europe—were not feeling close to anything resembling panic. But “the timeline was really short,” admits Hainaut. So there was an unprecedented tempo to their discussions. This wasn’t a drill. This was the real deal. What would they do to defend the planet if an asteroid impact couldn’t be ruled out?

Luckily, over the next few days, a handful of new observations came in. Each helped Sentry, Aegis, and the system at NEODyS rule out more of 2024 YR4’s possible future orbits. Unluckily, Earth remained a potential port of call for this pesky asteroid—and over time, our planet made up a higher proportion of those remaining possibilities. That meant that the odds of an Earth impact “started bubbling up,” says Denneau. 

a telescope in each of the four corners points to an asteroid

EVA REDAMONTI

By February 6, they jumped to 2.3%—a one-in-43 chance of an impact. 

“How much anxiety someone should feel over that—it’s hard to say,” Denneau says, with a slight shrug. 

In the past, several elephantine asteroids have been found to stand a small chance of careening unceremoniously into the planet. Such incidents tend to follow a pattern. As more observations come in and the asteroid’s orbit becomes better known, an Earth impact trajectory remains a possibility while other outlying orbits are removed from the calculations—so for a time, the odds of an impact rise. Finally, with enough observations in hand, it becomes clear that the space rock will miss our world entirely, and the impact odds plummet to zero.

Astronomers expected this to repeat itself with 2024 YR4. But there was no guarantee. There’s no escaping the fact that one day, sooner or later, scientists will discover a dangerous asteroid that will punch Earth in the face—and raze a city in the process. 

After all, asteroids capable of trashing a city have found their way to Earth plenty of times before, and not just in the very distant past. In 1908, an 800-square-mile patch of forest in Siberia—one that was, fortunately, very sparsely populated—was decimated by a space rock just 180 feet long. It didn’t even hit the ground; it exploded in midair with the force of a 15-megaton blast.

But only one other asteroid comparable in size to 2024 YR4 had its 2.3% figure beat: in 2004, Apophis—capable of causing continental-scale damage—had (briefly) stood a 2.7% chance of impacting Earth in 2029.

Rapidly approaching uncharted waters, the powers that be at NASA decided to play a space-based wild card: the James Webb Space Telescope, or JWST.

THE JAMES WEBB SPACE TELESCOPE, DEEP SPACE, ONE MILLION MILES FROM EARTH

A large dull asteroid reflects the same amount of light as a small shiny one, but that doesn’t mean astronomers sizing up an asteroid are helpless. If you view both asteroids in the infrared, the larger one glows brighter than the smaller one no matter the surface coating—making infrared, or the thermal part of the electromagnetic spectrum, a much better gauge of a space rock’s proportions. 

Observatories on Earth do have infrared capabilities, but our planet’s atmosphere gets in their way, making it hard for them to offer highly accurate readings of an asteroid’s size. 

But the James Webb Space Telescope (JWST), hanging out in space, doesn’t have that problem. 

A collage of three images showing the black expanse of space. Two-thirds of the collage is taken up by the black background sprinkled with small, blurry galaxies in orange, blue, and white. There are two images in a column at the right side of the collage. On the right side of the main image, not far from the top, a very faint dot is outlined with a white square. At the right, there are two zoomed in views of this area. The top box is labeled NIRCam and shows a fuzzy dot at the center of the inset. The bottom box is labeled MIRI and shows a fuzzy pinkish dot.
Asteroid 2024 YR4 is the smallest object targeted by JWST to date, and one of the smallest objects to have its size directly measured. Observations were taken using both its NIRCam (Near-Infrared Camera) and MIRI (Mid-Infrared Instrument) to study the thermal properties of the asteroid.
NASA, ESA, CSA, A. RIVKIN (APL), A. PAGAN (STSCI)

This observatory, which sits at a gravitationally stable point about a million miles from Earth, is polymathic. Its sniper-like scope can see in the infrared and allows it to peer at the edge of the observable universe, meaning it can study galaxies that formed not long after the Big Bang. It can even look at the light passing through the atmospheres of distant planets to ascertain their chemical makeups. And its remarkably sharp eye means it can also track the thermal glow of an asteroid long after all ground-based telescopes lose sight of it.

In a fortuitous bit of timing, by the moment 2024 YR4 came along, planetary defenders had recently reasoned that JWST could theoretically be used to track ominous asteroids using its own infrared scope, should the need arise. So after IAWN’s warning went out, operators of JWST ran an analysis: Though the asteroid would vanish from most scopes by late March, this one might be able to see the rock until sometime in May, which would allow researchers to greatly refine their assessment of the asteroid’s orbit and its odds of making Earth impact.

Understanding 2024 YR4’s trajectory was important, but “the size was the main motivator,” says Andy Rivkin, an astronomer at Johns Hopkins University’s Applied Physics Laboratory, who led the proposal to use JWST to observe the asteroid. The hope was that even if the impact odds remained high until 2028, JWST would find that 2024 YR4 was on the smaller side of the 130-to-300-feet size range—meaning it would still be a danger, but a far less catastrophic one. 

The JWST proposal was accepted by NASA on February 5. But the earliest it could conduct its observations was early March. And time really wasn’t on Earth’s side.

February 7, 2025

GEMINI SOUTH TELESCOPE, CHILE

“At this point, [2024 YR4] was too faint for the Catalina telescopes,” says Catalina’s Wierzchoś. “In our opinion, this was a big deal.” 

So Wierzchoś and his colleagues put in a rare emergency request to commandeer the Gemini Observatory, an internationally funded and run facility featuring two large, eagle-eyed telescopes—one in Chile and one atop Hawaii’s Mauna Kea volcano. Their request was granted, and on February 7, they trained the Chile-based Gemini South telescope onto 2024 YR4. 

This composite image was captured by a team of astronomers using the Gemini Multi-Object Spectrograph (GMOS). The hazy dot at the center is asteroid 2024 YR4.
INTERNATIONAL GEMINI OBSERVATORY/NOIRLAB/NSF/AURA/M. ZAMANI

The odds of Earth impact dropped ever so slightly, to 2.2%—a minor, but still welcome, development. 

Mid-February 2025

MAGDALENA RIDGE OBSERVATORY, NEW MEXICO

By this point, the roster of 2024 YR4 hunters also included the tiny team operating the Magdalena Ridge Observatory (MRO), which sits atop a tranquil mountain in New Mexico.

“It’s myself and my husband,” says Eileen Ryan, the MRO director. “We’re the only two astronomers running the telescope. We have a daytime technician. It’s kind of a mom-and-pop organization.” 

Still, the scope shouldn’t be underestimated. “We can see maybe a cell-phone-size object that’s illuminated at geosynchronous orbit,” Ryan says, referring to objects 22,000 miles away. But its most impressive feature is its mobility. While other observatories have slowly swiveling telescopes, MRO’s scope can move like the wind. “We can track the fastest objects,” she says, with a grin—noting that the telescope was built in part to watch missiles for the US Air Force. Its agility and long-distance vision explain why the Space Force is one of MRO’s major clients: It can be used to spy on satellites and spacecraft anywhere from low Earth orbit right out to the lunar regions. And that meant spying on the super-speedy, super-sneaky 2024 YR4 wasn’t a problem for MRO, whose own observations were vital in refining the asteroid’s impact odds.

Dr Eileen Ryan
Eileen Ryan is the director of the Magdalena Ridge Observatory in New Mexico.
COURTESY PHOTO

Then, in mid-February, MRO and all ground-based observatories came up against an unsolvable problem: The full moon was out, shining so brightly that it blinded any telescope that dared point at the night sky. “During the full moon, the observatories couldn’t observe for a week or so,” says NEOCC’s Fenucci. To most of us, the moon is a beautiful silvery orb. But to astronomers, it’s a hostile actor. “We abhor the moon,” says Denneau. 

All any of them could do was wait. Those tracking 2024 YR4 vacillated between being animated and slightly trepidatious. The thought that the asteroid could still stand a decent chance of impacting Earth after it faded from view did weigh a little on their minds. 

Nevertheless, Farnocchia maintained his characteristic sangfroid throughout. “I try to stress about the things I can control,” he says. “All we can do is to explain what the situation is, and that we need new data to say more.”

February 18, 2025

CENTER FOR NEAR-EARTH OBJECT STUDIES, CALIFORNIA 

As the full moon finally faded into a crescent of light, the world’s largest telescopes sprang back into action for one last shot at glory. “The dark time came again,” says Hainaut, with a smile.

New observations finally began to trickle in, and Sentry, Aegis, and NEODyS readjusted their forecasts. It wasn’t great news: The odds of an Earth impact in 2032 jumped up to 3.1%, officially making 2024 YR4 the most dangerous asteroid ever discovered.

This declaration made headlines across the world—and certainly made the curious public sit up and wonder if they had yet another apocalyptic concern to fret about. But, as ever, the asteroid hunters held fast in their prediction that sooner or later—ideally sooner—more observations would cause those impact odds to drop. 

“We kept at it,” says Ryan. But time was running short; they started to “search for out-of-the-box ways to image this asteroid,” says Fenucci. 

Planetary defense researchers soon realized that 2024 YR4 wasn’t too far away from NASA’s Lucy spacecraft, a planetary science mission making a series of flybys of various asteroids. If Lucy could be redirected to catch up to 2024 YR4 instead, it would give humanity its best look at the rock, allowing Sentry and company to confirm or dispel our worst fears. 

Sadly, NASA ran the numbers, and it proved to be a nonstarter: 2024 YR4 was too speedy and too far for Lucy to pursue. 

It was really starting to look as if JWST would be the last, best hope to track 2024 YR4. 


A CHANGE OF FATE

February 19, 2025

VERY LARGE TELESCOPE, CHILE and MAGDALENA RIDGE OBSERVATORY, NEW MEXICO

Just one day after 2024 YR made history, the VLT, MRO, and others caught sight of it again, and Sentry, Aegis, and NEODyS voraciously consumed their new data. 

The odds of an Earth impact suddenly dropped to 1.5%

Astronomers didn’t really have time to react to the possibility that this was a good sign—they just kept sending their observations onward.

February 20, 2025

SUBARU TELESCOPE, HAWAII

Yet another observatory had been itching to get into the game for weeks, but it wasn’t until February 20 that Tsuyoshi Terai, an astronomer at Japan’s Subaru Telescope, sitting atop Mauna Kea, finally caught 2024 YR4 shifting between the stars. He added his data to the stream.

And all of a sudden, the asteroid lost its lethal luster. The odds of its hitting Earth were now just 0.3%. 

At this point, you might expect that all those tracking it would be in a single control room somewhere, eyes glued to their screens, watching the odds drop before bursting into cheers and rapturous applause. But no—the astronomers who had spent so long observing this asteroid remained scattered across the globe. And instead of erupting into cheers, they exchanged modestly worded emails of congratulations—the digital equivalent of a nod or a handshake.

Dr. Tsuyoshi Tera at a workstation with many monitors
In late February, data from Tsuyoshi Terai, an astronomer at Japan’s Subaru Telescope, which sits atop Mauna Kea, confirmed that 2024 YR4 was not so lethal after all.
NAOJ

“It was a relief,” says Terai. “I was very pleased that our data contributed to put an end to the risk of 2024 YR4.” 

February 24, 2025

INTERNATIONAL ASTEROID WARNING NETWORK

Just a few days later, and thanks to a litany of observations continuing to flood in, IAWN issued the all-clear. This once-ominous asteroid’s odds of inconveniencing our planet were at 0.004%—one in 25,000. Today, the odds of an Earth impact in 2032 are exactly zero.

“It was kinda fun while it lasted,” says Denneau. 

Planetary defenders may have had a blast defending the world, but these astronomers still took the cosmic threat deeply seriously—and never once took their eyes off the prize. “In my mind, the observers and orbit teams were the stars of this story,” says Fast, NASA’s acting planetary defense officer.

Farnocchia shrugs off the entire thing. “It was the expected outcome,” he says. “We just didn’t know when that would happen.”

Looking back on it now, though, some 2024 YR4 trackers are allowing themselves to dwell on just how close this asteroid came to being a major danger. “It’s wild to watch it all play out,” says Denneau. “We were weeks away from having to spin up some serious mitigation planning.” But there was no need to work out how the save the world. It turned out that 2024 YR4 was never a threat to begin with—it just took a while to check. 

And these experiences in handling a dicey space rock will only serve to make the world a safer place to live. One day, an asteroid very much like 2024 YR4 will be seen heading straight for Earth. And those tasked with tracking it will be officially battle-tested, and better prepared than ever to do what needs to be done.


A POSTSCRIPT

March 27, 2025

JAMES WEBB SPACE TELESCOPE, DEEP SPACE, ONE MILLION MILES FROM EARTH

But the story of 2024 YR4 is not quite over—in fact, if this were a movie, it would have an after-credits scene.

After the Earth-impact odds fell off a cliff, JWST went ahead with its observations in March anyway. It found out that 2024 YR4 was 200 feet across—so large that a direct strike on a city would have proved horrifically lethal. Earth just didn’t have to worry about it anymore. 

But the moon might. Thanks in part to JWST, astronomers calculated a 3.8% chance that 2024 YR4 will impact the lunar surface in 2032. Additional JWST observations in May bumped those odds up slightly, to 4.3%, and the orbit can no longer be refined until the asteroid’s next Earth flyby in 2028. 

“It may hit the moon!” says Denneau. “Everybody’s still very excited about that.” 

A lunar collision would give astronomers a wonderful opportunity not only to study the physics of an asteroid impact, but also to demonstrate to the public just how good they are at precisely predicting the future motions of potentially lethal space rocks. “It’s a thing we can plan for without having to defend the Earth,” says Denneau.

If 2024 YR4 is truly going to smash into the moon, the impact—likely on the side facing Earth—would unleash an explosion equivalent to hundreds of nuclear bombs. An expansive crater would be carved out in the blink of an eye, and a shower of debris would erupt in all directions. 

None of this supersonic wreckage would pose any danger to Earth, but it would look spectacular: You’d be able to see the bright flash of the impact from terra firma with the naked eye.

“If that does happen, it’ll be amazing,” says Denneau. It will be a spectacular way to see the saga of 2024 YR4—once a mere speck on his computer screen—come to an explosive end, from a front-row seat.

Robin George Andrews is an award-winning science journalist and doctor of volcanoes based in London. He regularly writes about the Earth, space, and planetary sciences, and is the author of two critically acclaimed books: Super Volcanoes (2021) and How to Kill An Asteroid (2024).

How scientists are trying to use AI to unlock the human mind 

Today’s AI landscape is defined by the ways in which neural networks are unlike human brains. A toddler learns how to communicate effectively with only a thousand calories a day and regular conversation; meanwhile, tech companies are reopening nuclear power plants, polluting marginalized communities, and pirating terabytes of books in order to train and run their LLMs.

But neural networks are, after all, neural—they’re inspired by brains. Despite their vastly different appetites for energy and data, large language models and human brains do share a good deal in common. They’re both made up of millions of subcomponents: biological neurons in the case of the brain, simulated “neurons” in the case of networks. They’re the only two things on Earth that can fluently and flexibly produce language. And scientists barely understand how either of them works.

I can testify to those similarities: I came to journalism, and to AI, by way of six years of neuroscience graduate school. It’s a common view among neuroscientists that building brainlike neural networks is one of the most promising paths for the field, and that attitude has started to spread to psychology. Last week, the prestigious journal Nature published a pair of studies showcasing the use of neural networks for predicting how humans and other animals behave in psychological experiments. Both studies propose that these trained networks could help scientists advance their understanding of the human mind. But predicting a behavior and explaining how it came about are two very different things.

In one of the studies, researchers transformed a large language model into what they refer to as a “foundation model of human cognition.” Out of the box, large language models aren’t great at mimicking human behavior—they behave logically in settings where humans abandon reason, such as casinos. So the researchers fine-tuned Llama 3.1, one of Meta’s open-source LLMs, on data from a range of 160 psychology experiments, which involved tasks like choosing from a set of “slot machines” to get the maximum payout or remembering sequences of letters. They called the resulting model Centaur.

Compared with conventional psychological models, which use simple math equations, Centaur did a far better job of predicting behavior. Accurate predictions of how humans respond in psychology experiments are valuable in and of themselves: For example, scientists could use Centaur to pilot their experiments on a computer before recruiting, and paying, human participants. In their paper, however, the researchers propose that Centaur could be more than just a prediction machine. By interrogating the mechanisms that allow Centaur to effectively replicate human behavior, they argue, scientists could develop new theories about the inner workings of the mind.

But some psychologists doubt whether Centaur can tell us much about the mind at all. Sure, it’s better than conventional psychological models at predicting how humans behave—but it also has a billion times more parameters. And just because a model behaves like a human on the outside doesn’t mean that it functions like one on the inside. Olivia Guest, an assistant professor of computational cognitive science at Radboud University in the Netherlands, compares Centaur to a calculator, which can effectively predict the response a math whiz will give when asked to add two numbers. “I don’t know what you would learn about human addition by studying a calculator,” she says.

Even if Centaur does capture something important about human psychology, scientists may struggle to extract any insight from the model’s millions of neurons. Though AI researchers are working hard to figure out how large language models work, they’ve barely managed to crack open the black box. Understanding an enormous neural-network model of the human mind may not prove much easier than understanding the thing itself.

One alternative approach is to go small. The second of the two Nature studies focuses on minuscule neural networks—some containing only a single neuron—that nevertheless can predict behavior in mice, rats, monkeys, and even humans. Because the networks are so small, it’s possible to track the activity of each individual neuron and use that data to figure out how the network is producing its behavioral predictions. And while there’s no guarantee that these models function like the brains they were trained to mimic, they can, at the very least, generate testable hypotheses about human and animal cognition.

There’s a cost to comprehensibility. Unlike Centaur, which was trained to mimic human behavior in dozens of different tasks, each tiny network can only predict behavior in one specific task. One network, for example, is specialized for making predictions about how people choose among different slot machines. “If the behavior is really complex, you need a large network,” says Marcelo Mattar, an assistant professor of psychology and neural science at New York University who led the tiny-network study and also contributed to Centaur. “The compromise, of course, is that now understanding it is very, very difficult.”

This trade-off between prediction and understanding is a key feature of neural-network-driven science. (I also happen to be writing a book about it.) Studies like Mattar’s are making some progress toward closing that gap—as tiny as his networks are, they can predict behavior more accurately than traditional psychological models. So is the research into LLM interpretability happening at places like Anthropic. For now, however, our understanding of complex systems—from humans to climate systems to proteins—is lagging farther and farther behind our ability to make predictions about them.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Why the US and Europe could lose the race for fusion energy

Fusion energy holds the potential to shift a geopolitical landscape that is currently configured around fossil fuels. Harnessing fusion will deliver the energy resilience, security, and abundance needed for all modern industrial and service sectors. But these benefits will be controlled by the nation that leads in both developing the complex supply chains required and building fusion power plants at scales large enough to drive down economic costs.

The US and other Western countries will have to build strong supply chains across a range of technologies in addition to creating the fundamental technology behind practical fusion power plants. Investing in supply chains and scaling up complex production processes has increasingly been a strength of China’s and a weakness of the West, resulting in the migration of many critical industries from the West to China. With fusion, we run the risk that history will repeat itself. But it does not have to go that way.

The US and Europe were the dominant public funders of fusion energy research and are home to many of the world’s pioneering private fusion efforts. The West has consequently developed many of the basic technologies that will make fusion power work. But in the past five years China’s support of fusion energy has surged, threatening to allow the country to dominate the industry.

The industrial base available to support China’s nascent fusion energy industry could enable it to climb the learning curve much faster and more effectively than the West. Commercialization requires know-how, capabilities, and complementary assets, including supply chains and workforces in adjacent industries. And especially in comparison with China, the US and Europe have significantly under-supported the industrial assets needed for a fusion industry, such as thin-film processing and power electronics.

To compete, the US, allies, and partners must invest more heavily not only in fusion itself—which is already happening—but also in those adjacent technologies that are critical to the fusion industrial base. 

China’s trajectory to dominating fusion and the West’s potential route to competing can be understood by looking at today’s most promising scientific and engineering pathway to achieve grid-relevant fusion energy. That pathway relies on the tokamak, a technology that uses a magnetic field to confine ionized gas—called plasma—and ultimately fuse nuclei. This process releases energy that is converted from heat to electricity. Tokamaks consist of several critical systems, including plasma confinement and heating, fuel production and processing, blankets and heat flux management, and power conversion.

A close look at the adjacent industries needed to build these critical systems clearly shows China’s advantage while also providing a glimpse into the challenges of building a fusion industrial base in the US or Europe. China has leadership in three of these six key industries, and the West is at risk of losing leadership in two more. China’s industrial might in thin-film processing, large metal-alloy structures, and power electronics provides a strong foundation to establish the upstream supply chain for fusion.

The importance of thin-film processing is evident in the plasma confinement system. Tokamaks use strong electromagnets to keep the fusion plasma in place, and the magnetic coils must be made from superconducting materials. Rare-earth barium copper oxide (REBCO) superconductors are the highest-performing materials available in sufficient quantity to be viable for use in fusion.

The REBCO industry, which relies on thin-film processing technologies, currently has low production volumes spanning globally distributed manufacturers. However, as the fusion industry grows, the manufacturing base for REBCO will likely consolidate among the industry players who are able to rapidly take advantage of economies of scale. China is today’s world leader in thin-film, high-volume manufacturing for solar panels and flat-panel displays, with the associated expert workforce, tooling sector, infrastructure, and upstream materials supply chain. Without significant attention and investment on the part of the West, China is well positioned to dominate REBCO thin-film processing for fusion magnets.

The electromagnets in a full-scale tokamak are as tall as a three-story building. Structures made using strong metal alloys are needed to hold these electromagnets around the large vacuum vessel that physically contains the magnetically confined plasma. Similar large-scale, complex metal structures are required for shipbuilding, aerospace, oil and gas infrastructure, and turbines. But fusion plants will require new versions of the alloys that are radiation-tolerant, able to withstand cryogenic temperatures, and corrosion-resistant. China’s manufacturing capacity and its metallurgical research efforts position it well to outcompete other global suppliers in making the necessary specialty metal alloys and machining them into the complex structures needed for fusion.

A tokamak also requires large-scale power electronics. Here again China dominates. Similar systems are found in the high-speed rail (HSR) industry, renewable microgrids, and arc furnaces. As of 2024, China had deployed over 48,000 kilometers of HSR. That is three times the length of Europe’s HSR network and 55 times as long as the Acela network in the US, which is slower than HSR. While other nations have a presence, China’s expertise is more recent and is being applied on a larger scale.

But this is not the end of the story. The West still has an opportunity to lead the other three adjacent industries important to the fusion supply chain: cryo-plants, fuel processing, and blankets. 

The electromagnets in an operational tokamak need to be kept at cryogenic temperatures of around 20 Kelvin to remain superconducting. This requires large-scale, multi-megawatt cryogenic cooling plants. Here, the country best set up to lead the industry is less clear. The two major global suppliers of cryo-plants are Europe-based Linde Engineering and Air Liquide Engineering; the US has Air Products and Chemicals and Chart Industries. But they are not alone: China’s domestic champions in the cryogenic sector include Hangyang Group, SASPG, Kaifeng Air Separation, and SOPC. Each of these regions already has an industrial base that could scale up to meet the demands of fusion.

Fuel production for fusion is a nascent part of the industrial base requiring processing technologies for light-isotope gases—hydrogen, deuterium, and tritium. Some processing of light-isotope gases is already done at small scale in medicine, hydrogen weapons production, and scientific research in the US, Europe, and China. But the scale needed for the fusion industry does not exist in today’s industrial base, presenting a major opportunity to develop the needed capabilities.

Similarly, blankets and heat flux management are an opportunity for the West. The blanket is the medium used to absorb energy from the fusion reaction and to breed tritium. Commercial-scale blankets will require entirely novel technology. To date, no adjacent industries have relevant commercial expertise in liquid lithium, lead-lithium eutectic, or fusion-specific molten salts that are required for blanket technology. Some overlapping blanket technologies are in early-stage development by the nuclear fission industry. As the largest producer of beryllium in the world, the US has an opportunity to capture leadership because that element is a key material in leading fusion blanket concepts. But the use of beryllium must be coupled with technology development programs for the other specialty blanket components.

These six industries will prove critical to scaling fusion energy. In some, such as thin-film processing and large metal-alloy structures, China already has a sizable advantage. Crucially, China recognizes the importance of these adjacent industries and is actively harnessing them in its fusion efforts. For example, China launched a fusion consortium that consists of industrial giants spanning the steel, machine tooling, electric grid, power generation, and aerospace sectors. It will be extremely difficult for the West to catch up in these areas, but policymakers and business leaders must pay attention and try to create robust alternative supply chains.

As the industrial area of greatest strength, cryo-plants could continue to be an opportunity for leadership in the West. Bolstering Western cryo-plant production by creating demand for natural-gas liquefaction will be a major boon to the future cryo-plant supply chain that will support fusion energy.

The US and European countries also have an opportunity to lead in the emerging industrial areas of fuel processing and blanket technologies. Doing so will require policymakers to work with companies to ensure that public and private funding is allocated to these critical emerging supply chains. Governments may well need to serve as early customers and provide debt financing for significant capital investment. Governments can also do better to incentivize private capital and equity financing—for example, through favorable capital-gains taxation. In lagging areas of thin-film and alloy production, the US and Europe will likely need partners, such as South Korea and Japan, that have the industrial bases to compete globally with China.

The need to connect and capitalize multiple industries and supply chains will require long-term thinking and clear leadership. A focus on the demand side of these complementary industries is essential. Fusion is a decade away from maturation, so its supplier base must be derisked and made profitable in the near term by focusing on other primary demand markets that contribute to our economic vitality. To name a few, policymakers can support modernization of the grid to bolster domestic demand for power electronics and domestic semiconductor manufacturing to support thin-film processing.

The West must also focus on the demand for energy production itself. As the world’s largest energy consumer, China will leverage demand from its massive domestic market to climb the learning curve and bolster national champions. This is a strategy that China has wielded with tremendous success to dominate global manufacturing, most recently in the electric-vehicle industry. Taken together, supply- and demand-side investment have been a winning strategy for China.

The competition to lead the future of fusion energy is here. Now is the moment for the US and its Western allies to start investing in the foundational innovation ecosystem needed for a vibrant and resilient industrial base to support it.

Daniel F. Brunner is a co-founder of Commonwealth Fusion Systems and a Partner at Future Tech Partners.

Edlyn V. Levine is the co-founder of a stealth-mode technology start up and an affiliate of the MIT Sloan School of Management.

Fiona E. Murray is a professor of entrepreneurship at the MIT School of Management and Vice Chair of the NATO Innovation Fund.

Rory Burke is a graduate of MIT Sloan and a former summer scholar with ARPA-E.

The latest threat from the rise of Chinese manufacturing

The findings a decade ago were, well, shocking. Mainstream economists had long argued that free trade was overall a good thing; though there might be some winners and losers, it would generally bring lower prices and widespread prosperity. Then, in 2013, a trio of academic researchers showed convincing evidence that increased trade with China beginning in the early 2000s and the resulting flood of cheap imports had been an unmitigated disaster for many US communities, destroying their manufacturing lifeblood.

The results of what in 2016 they called the “China shock” were gut-wrenching: the loss of 1 million US manufacturing jobs and 2.4 million jobs in total by 2011. Worse, these losses were heavily concentrated in what the economists called “trade-exposed” towns and cities (think furniture makers in North Carolina).

If in retrospect all that seems obvious, it’s only because the research by David Autor, an MIT labor economist, and his colleagues has become an accepted, albeit often distorted, political narrative these days: China destroyed all our manufacturing jobs! Though the nuances of the research are often ignored, the results help explain at least some of today’s political unrest. It’s reflected in rising calls for US protectionism, President Trump’s broad tariffs on imported goods, and nostalgia for the lost days of domestic manufacturing glory.

The impacts of the original China shock still scar much of the country. But Autor is now concerned about what he considers a far more urgent problem—what some are calling China shock 2.0. The US, he warns, is in danger of losing the next great manufacturing battle, this time over advanced technologies to make cars and planes as well as those enabling AI, quantum computing, and fusion energy.

Recently, I asked Autor about the lingering impacts of the China shock and the lessons it holds for today’s manufacturing challenges.

How are the impacts of the China shock still playing out?

I have a recent paper looking at 20 years of data, from 2000 to 2019. We tried to ask two related questions. One, if you looked at the places that were most exposed, how have they adjusted? And then if you look to the people who are most exposed, how have they adjusted? And how do those two things relate to one anothe

It turns out you get two very different answers. If you look at places that were most exposed, they have been substantially transformed. Manufacturing, once it starts going down, never comes back. But after 2010, these trade-impacted local labor markets staged something of an employment recovery, such that employment has grown faster after 2010 in trade-exposed places than non-trade-exposed places because a lot of people have come in. But these are jobs mostly in low-wage sectors. They’re in K–12 education and non-traded health services. They’re in warehousing and logistics. They’re in hospitality and lodging and recreation, and so they’re lower-wage, non-manufacturing jobs. And they’re done by a really different set of people.

The growth in employment is among women, among native-born Hispanics, among foreign-born adults and a lot of young people. The recovery is staged by a very different group from the white and black men, but especially white men, who were most represented in manufacturing. They have not really participated in this renaissance.

Employment is growing, but are these areas prospering?

They have a lower wage structure: fewer high-wage jobs, more low-wage jobs. So they’re not, if your definition of prospering is rapidly rising incomes. But there’s a lot of employment growth. They’re not like ghost towns. But then if you look at the people who were most concentrated in manufacturing—mostly white, non-college, native-born men—they have not prospered. Most of them have not transitioned from manufacturing to non-manufacturing.

One of the great surprises is everyone had believed that people would pull up stakes and move on. In fact, we find the opposite. People in the most adversely exposed places become less likely to leave. They have become less mobile. The presumption was that they would just relocate to find higher ground. And that is not at all what occurred.

What happened to the total number of manufacturing jobs?

There’s been no rebound. Once they go, they just keep going. If there is going to be new manufacturing, it won’t be in the sectors that were lost to China. Those were basically labor-intensive jobs, the kind of low-tech sectors that we will not be getting back. You know—commodity furniture and assembly of things, shoes, construction material. The US wasn’t going to keep them forever, and once they’re gone, it’s very unlikely to get them back.

I know you’ve written about this, but it’s not hard to draw a connection between the dynamics you’re describing—white-male manufacturing jobs going away and new jobs going to immigrants—and today’s political turmoil.

We have a paper about that called “Importing Political Polarization?”

How big a factor would you say it is in today’s political unrest?

I don’t want to say it’s the factor. The China trade shock was a catalyst, but there were lots of other things that were happening. It would be a vast oversimplification to say that it was the sole cause.

But most people don’t work in manufacturing anymore. Aren’t these impacts that you’re talking about, including the political unrest, disproportionate to the actual number of jobs lost?

These are jobs in places where manufacturing is the anchor activity. Manufacturing is very unevenly distributed. It’s not like grocery stores and hospitals that you find in every county. The impact of the China trade shock on these places was like dropping an economic bomb in the middle of downtown. If the China trade shock cost us a few million jobs, and these were all—you know—people in groceries and retail and gas stations, in hospitality and in trucking, you wouldn’t really notice it that much. We lost lots of clerical workers over the last couple of decades. Nobody talks about a clerical shock. Why not? Well, there was never a clerical capital of America. Clerical workers are everywhere. If they decline, it doesn’t wipe out the entire basis of a place.

So it goes beyond the jobs. These places lost their identity.

Maybe. But it’s also the jobs. Manufacturing offered relatively high pay to non-college workers, especially non-college men. It was an anchor of a way of life.

And we’re still seeing the damage.

Yeah, absolutely. It’s been 20 years. What’s amazing is the degree of stasis among the people who are most exposed—not the places, but the people. Though it’s been 20 years, we’re still feeling the pain and the political impacts from this transition.

Clearly, it has now entered the national psyche. Even if it weren’t true, everyone now believes it to have been a really big deal, and they’re responding to it. It continues to drive policy, political resentments, maybe even out of proportion to its economic significance. It certainly has become mythological.

What worries you now?

We’re in the midst of a totally different competition with China now that’s much, much more important. Now we’re not talking about commodity furniture and tube socks. We’re talking about semiconductors and drones and aviation, electric vehicles, shipping, fusion power, quantum, AI, robotics. These are the sectors where the US still maintains competitiveness, but they’re extremely threatened. China’s capacity for high-tech, low-cost, incredibly fast, innovative manufacturing is just unbelievable. And the Trump administration is basically fighting the war of 20 years ago. The loss of those jobs, you know, was devastating to those places. It was not devastating to the US economy as a whole. If we lose Boeing, GM, and Apple and Intel—and that’s quite possible—then that will be economically devastating.

I think some people are calling it China shock 2.0.

Yeah. And it’s well underway.

When we think about advanced manufacturing and why it’s important, it’s not so much about the number of jobs anymore, is it? Is it more about coming up with the next technologies?

It does create good jobs, but it’s about economic leadership. It’s about innovation. It’s about political leadership, and even standard setting for how the rest of the world works.

Should we just accept that manufacturing as a big source of jobs is in the past and move on?

No. It’s still 12 million jobs, right? Instead of the fantasy that we’re going to go back to 18 million or whatever—we had, what, 17.7 million manufacturing jobs in 1999—we should be worried about the fact that we’re going to end up at 6 million, that we’re going to lose 50% in the next decade. And that’s quite possible. And the Trump administration is doing a lot to help that process of loss along.

We have a labor market of over 160 million people, so it’s like 8% of employment. It’s not zero. So you should not think of it as too small to worry about it. It’s a lot of people; it’s a lot of jobs. But more important, it’s a lot of what has helped this country be a leader. So much innovation happens here, and so many of the things in which other countries are now innovating started here. It’s always been the case that the US tends to innovate in sectors and then lose them after a while and move on to the next thing. But at this point, it’s not clear that we’ll be in the frontier of a lot of these sectors for much longer.

So we want to revive manufacturing, but the right kind—advanced manufacturing?

The notion that we should be assembling iPhones in the United States, which Trump wants, is insane. Nobody wants to do that work. It’s horrible, tedious work. It pays very, very little. And if we actually did it here, it would make the iPhones 20% more expensive or more. Apple may very well decide to pay a 25% tariff rather than make the phones here. If Foxconn started doing iPhone assembly here, people would not be lining up for that job.

But at the same time, we do need new people coming into manufacturing.

But not that manufacturing. Not tedious, mind-numbing, eyestrain-inducing assembly.

We need them to do high-tech work. Manufacturing is a skilled activity. We need to build airplanes better. That takes a ton of expertise. Assembling iPhones does not.

What are your top priorities to head off China shock 2.0?

I would choose sectors that are important, and I would invest in them. I don’t think that tariffs are never justified, or industrial policies are never justified. I just don’t think protecting phone assembly is smart industrial policy. We really need to improve our ability to make semiconductors. I think that’s important. We need to remain competitive in the automobile sector—that’s important. We need to improve aviation and drones. That’s important. We need to invest in fusion power. That’s important. We need to adopt robotics at scale and improve in that sector. That’s important. I could come up with 15 things where I think public money is justified, and I would be willing to tolerate protections for those sectors.

What are the lasting lessons of the China shock and the opening up of global trade in the 2000s?

We did it too fast. We didn’t do enough to support people, and we pretended it wasn’t going on.

When we started the China shock research back around 2011, we really didn’t know what we’d find, and so we were as surprised as anyone. But the work has changed our own way of thinking and, I think, has been constructive—not because it has caused everyone to do the right thing, but it at least caused people to start asking the right questions.

What do the findings tell us about China shock 2.0?

I think the US is handling that challenge badly. The problem is much more serious this time around. The truth is, we have a sense of what the threats are. And yet we’re not seemingly responding in a very constructive way. Although we now know how seriously we should take this, the problem is that it doesn’t seem to be generating very serious policy responses. We’re generating a lot of policy responses—they’re just not serious ones.

Don’t let hype about AI agents get ahead of reality

Google’s recent unveiling of what it calls a “new class of agentic experiences” feels like a turning point. At its I/O 2025 event in May, for example, the company showed off a digital assistant that didn’t just answer questions; it helped work on a bicycle repair by finding a matching user manual, locating a YouTube tutorial, and even calling a local store to ask about a part, all with minimal human nudging. Such capabilities could soon extend far outside the Google ecosystem. The company has introduced an open standard called Agent-to-Agent, or A2A, which aims to let agents from different companies talk to each other and work together.

The vision is exciting: Intelligent software agents that act like digital coworkers, booking your flights, rescheduling meetings, filing expenses, and talking to each other behind the scenes to get things done. But if we’re not careful, we’re going to derail the whole idea before it has a chance to deliver real benefits. As with many tech trends, there’s a risk of hype racing ahead of reality. And when expectations get out of hand, a backlash isn’t far behind.

Let’s start with the term “agent” itself. Right now, it’s being slapped on everything from simple scripts to sophisticated AI workflows. There’s no shared definition, which leaves plenty of room for companies to market basic automation as something much more advanced. That kind of “agentwashing” doesn’t just confuse customers; it invites disappointment. We don’t necessarily need a rigid standard, but we do need clearer expectations about what these systems are supposed to do, how autonomously they operate, and how reliably they perform.

And reliability is the next big challenge. Most of today’s agents are powered by large language models (LLMs), which generate probabilistic responses. These systems are powerful, but they’re also unpredictable. They can make things up, go off track, or fail in subtle ways—especially when they’re asked to complete multistep tasks, pulling in external tools and chaining LLM responses together. A recent example: Users of Cursor, a popular AI programming assistant, were told by an automated support agent that they couldn’t use the software on more than one device. There were widespread complaints and reports of users canceling their subscriptions. But it turned out the policy didn’t exist. The AI had invented it.

In enterprise settings, this kind of mistake could create immense damage. We need to stop treating LLMs as standalone products and start building complete systems around them—systems that account for uncertainty, monitor outputs, manage costs, and layer in guardrails for safety and accuracy. These measures can help ensure that the output adheres to the requirements expressed by the user, obeys the company’s policies regarding access to information, respects privacy issues, and so on. Some companies, including AI21 (which I cofounded and which has received funding from Google), are already moving in that direction, wrapping language models in more deliberate, structured architectures. Our latest launch, Maestro, is designed for enterprise reliability, combining LLMs with company data, public information, and other tools to ensure dependable outputs.

Still, even the smartest agent won’t be useful in a vacuum. For the agent model to work, different agents need to cooperate (booking your travel, checking the weather, submitting your expense report) without constant human supervision. That’s where Google’s A2A protocol comes in. It’s meant to be a universal language that lets agents share what they can do and divide up tasks. In principle, it’s a great idea.

In practice, A2A still falls short. It defines how agents talk to each other, but not what they actually mean. If one agent says it can provide “wind conditions,” another has to guess whether that’s useful for evaluating weather on a flight route. Without a shared vocabulary or context, coordination becomes brittle. We’ve seen this problem before in distributed computing. Solving it at scale is far from trivial.

There’s also the assumption that agents are naturally cooperative. That may hold inside Google or another single company’s ecosystem, but in the real world, agents will represent different vendors, customers, or even competitors. For example, if my travel planning agent is requesting price quotes from your airline booking agent, and your agent is incentivized to favor certain airlines, my agent might not be able to get me the best or least expensive itinerary. Without some way to align incentives through contracts, payments, or game-theoretic mechanisms, expecting seamless collaboration may be wishful thinking.

None of these issues are insurmountable. Shared semantics can be developed. Protocols can evolve. Agents can be taught to negotiate and collaborate in more sophisticated ways. But these problems won’t solve themselves, and if we ignore them, the term “agent” will go the way of other overhyped tech buzzwords. Already, some CIOs are rolling their eyes when they hear it.

That’s a warning sign. We don’t want the excitement to paper over the pitfalls, only to let developers and users discover them the hard way and develop a negative perspective on the whole endeavor. That would be a shame. The potential here is real. But we need to match the ambition with thoughtful design, clear definitions, and realistic expectations. If we can do that, agents won’t just be another passing trend; they could become the backbone of how we get things done in the digital world.

Yoav Shoham is a professor emeritus at Stanford University and cofounder of AI21 Labs. His 1993 paper on agent-oriented programming received the AI Journal Classic Paper Award. He is coauthor of Multiagent Systems: Algorithmic, Game-Theoretic, and Logical Foundations, a standard textbook in the field.

Google’s electricity demand is skyrocketing

We got two big pieces of energy news from Google this week. The company announced that it’s signed an agreement to purchase electricity from a fusion company’s forthcoming first power plant. Google also released its latest environmental report, which shows that its energy use from data centers has doubled since 2020.

Taken together, these two bits of news offer a fascinating look at just how desperately big tech companies are hunting for clean electricity to power their data centers as energy demand and emissions balloon in the age of AI. Of course, we don’t know exactly how much of this pollution is attributable to AI because Google doesn’t break that out. (Also a problem!) So, what’s next and what does this all mean? 

Let’s start with fusion: Google’s deal with Commonwealth Fusion Systems is intended to provide the tech giant with 200 megawatts of power. This will come from Commonwealth’s first commercial plant, a facility planned for Virginia that the company refers to as the Arc power plant. The agreement represents half its capacity.

What’s important to note here is that this power plant doesn’t exist yet. In fact, Commonwealth still needs to get its Sparc demonstration reactor, located outside Boston, up and running. That site, which I visited in the fall, should be completed in 2026.

(An aside: This isn’t the first deal between Big Tech and a fusion company. Microsoft signed an agreement with Helion a couple of years ago to buy 50 megawatts of power from a planned power plant, scheduled to come online in 2028. Experts expressed skepticism in the wake of that deal, as my colleague James Temple reported.)

Nonetheless, Google’s announcement is a big moment for fusion, in part because of the size of the commitment and also because Commonwealth, a spinout company from MIT’s Plasma Science and Fusion Center, is seen by many in the industry as a likely candidate to be the first to get a commercial plant off the ground. (MIT Technology Review is owned by MIT but is editorially independent.)

Google leadership was very up-front about the length of the timeline. “We would certainly put this in the long-term category,” said Michael Terrell, Google’s head of advanced energy, in a press call about the deal.

The news of Google’s foray into fusion comes just days after the tech giant’s release of its latest environmental report. While the company highlighted some wins, some of the numbers in this report are eye-catching, and not in a positive way.

Google’s emissions have increased by over 50% since 2019, rising 6% in the last year alone. That’s decidedly the wrong direction for a company that’s set a goal to reach net-zero greenhouse-gas emissions by the end of the decade.

It’s true that the company has committed billions to clean energy projects, including big investments in next-generation technologies like advanced nuclear and enhanced geothermal systems. Those deals have helped dampen emissions growth, but it’s an arguably impossible task to keep up with the energy demand the company is seeing.

Google’s electricity consumption from data centers was up 27% from the year before. It’s doubled since 2020, reaching over 30 terawatt-hours. That’s nearly the annual electricity consumption from the entire country of Ireland.

As an outsider, it’s tempting to point the finger at AI, since that technology has crashed into the mainstream and percolated into every corner of Google’s products and business. And yet the report downplays the role of AI. Here’s one bit that struck me:

“However, it’s important to note that our growing electricity needs aren’t solely driven by AI. The accelerating growth of Google Cloud, continued investments in Search, the expanding reach of YouTube, and more, have also contributed to this overall growth.”

There is enough wiggle room in that statement to drive a large electric truck through. When I asked about the relative contributions here, company representative Mara Harris said via email that they don’t break out what portion comes from AI. When I followed up asking if the company didn’t have this information or just wouldn’t share it, she said she’d check but didn’t get back to me.

I’ll make the point here that we’ve made before, including in our recent package on AI and energy: Big companies should be disclosing more about the energy demands of AI. We shouldn’t be guessing at this technology’s effects.

Google has put a ton of effort and resources into setting and chasing ambitious climate goals. But as its energy needs and those of the rest of the industry continue to explode, it’s obvious that this problem is getting tougher, and it’s also clear that more transparency is a crucial part of the way forward.

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

Inside India’s scramble for AI independence

In Bengaluru, India, Adithya Kolavi felt a mix of excitement and validation as he watched DeepSeek unleash its disruptive language model on the world earlier this year. The Chinese technology rivaled the best of the West in terms of benchmarks, but it had been built with far less capital in far less time. 

“I thought: ‘This is how we disrupt with less,’” says Kolavi, the 20-year-old founder of the Indian AI startup CognitiveLab. “If DeepSeek could do it, why not us?” 

But for Abhishek Upperwal, founder of Soket AI Labs and architect of one of India’s earliest efforts to develop a foundation model, the moment felt more bittersweet. 

Upperwal’s model, called Pragna-1B, had struggled to stay afloat with tiny grants while he watched global peers raise millions. The multilingual model had a relatively modest 1.25 billion parameters and was designed to reduce the “language tax,” the extra costs that arise because India—unlike the US or even China—has a multitude of languages to support. His team had trained it, but limited resources meant it couldn’t scale. As a result, he says, the project became a proof of concept rather than a product. 

“If we had been funded two years ago, there’s a good chance we’d be the ones building what DeepSeek just released,” he says.

Kolavi’s enthusiasm and Upperwal’s dismay reflect the spectrum of emotions among India’s AI builders. Despite its status as a global tech hub, the country lags far behind the likes of the US and China when it comes to homegrown AI. That gap has opened largely because India has chronically underinvested in R&D, institutions, and invention. Meanwhile, since no one native language is spoken by the majority of the population, training language models is far more complicated than it is elsewhere. 

Historically known as the global back office for the software industry, India has a tech ecosystem that evolved with a services-first mindset. Giants like Infosys and TCS built their success on efficient software delivery, but invention was neither prioritized nor rewarded. Meanwhile, India’s R&D spending hovered at just 0.65% of GDP ($25.4 billion) in 2024, far behind China’s 2.68% ($476.2 billion) and the US’s 3.5% ($962.3 billion). The muscle to invent and commercialize deep tech, from algorithms to chips, was just never built.

Isolated pockets of world-class research do exist within government agencies like the DRDO (Defense Research & Development Organization) and ISRO (Indian Space Research Organization), but their breakthroughs rarely spill into civilian or commercial use. India lacks the bridges to connect risk-taking research to commercial pathways, the way DARPA does in the US. Meanwhile, much of India’s top talent migrates abroad, drawn to ecosystems that better understand and, crucially, fund deep tech.

So when the open-source foundation model DeepSeek-R1 suddenly outperformed many global peers, it struck a nerve. This launch by a Chinese startup prompted Indian policymakers to confront just how far behind the country was in AI infrastructure, and how urgently it needed to respond.

India responds

In January 2025, 10 days after DeepSeek-R1’s launch, the Ministry of Electronics and Information Technology (MeitY) solicited proposals for India’s own foundation models, which are large AI models that can be adapted to a wide range of tasks. Its public tender invited private-sector cloud and data‑center companies to reserve GPU compute capacity for government‑led AI research. 

Providers including Jio, Yotta, E2E Networks, Tata, AWS partners, and CDAC responded. Through this arrangement, MeitY suddenly had access to nearly 19,000 GPUs at subsidized rates, repurposed from private infrastructure and allocated specifically to foundational AI projects. This triggered a surge of proposals from companies wanting to build their own models. 

Within two weeks, it had 67 proposals in hand. That number tripled by mid-March. 

In April, the government announced plans to develop six large-scale models by the end of 2025, plus 18 additional AI applications targeting sectors like agriculture, education, and climate action. Most notably, it tapped Sarvam AI to build a 70-billion-parameter model optimized for Indian languages and needs. 

For a nation long restricted by limited research infrastructure, things moved at record speed, marking a rare convergence of ambition, talent, and political will.

“India could do a Mangalyaan in AI,” said Gautam Shroff of IIIT-Delhi, referencing the country’s cost-effective, and successful, Mars orbiter mission. 

Jaspreet Bindra, cofounder of AI&Beyond, an organization focused on teaching AI literacy, captured the urgency: “DeepSeek is probably the best thing that happened to India. It gave us a kick in the backside to stop talking and start doing something.”

The language problem

One of the most fundamental challenges in building foundational AI models for India is the country’s sheer linguistic diversity. With 22 official languages, hundreds of dialects, and millions of people who are multilingual, India poses a problem that few existing LLMs are equipped to handle.

Whereas a massive amount of high-quality web data is available in English, Indian languages collectively make up less than 1% of online content. The lack of digitized, labeled, and cleaned data in languages like Bhojpuri and Kannada makes it difficult to train LLMs that understand how Indians actually speak or search.

Global tokenizers, which break text into units a model can process, also perform poorly on many Indian scripts, misinterpreting characters or skipping some altogether. As a result, even when Indian languages are included in multilingual models, they’re often poorly understood and inaccurately generated.

And unlike OpenAI and DeepSeek, which achieved scale using structured English-language data, Indian teams often begin with fragmented and low-quality data sets encompassing dozens of Indian languages. This makes the early steps of training foundation models far more complex.

Nonetheless, a small but determined group of Indian builders is starting to shape the country’s AI future.

For example, Sarvam AI has created OpenHathi-Hi-v0.1, an open-source Hindi language model that shows the Indian AI field’s growing ability to address the country’s vast linguistic diversity. The model, built on Meta’s Llama 2 architecture, was trained on 40 billion tokens of Hindi and related Indian-language content, making it one of the largest open-source Hindi models available to date.

Pragna-1B, the multilingual model from Upperwal, is more evidence that India could solve for its own linguistic complexity. Trained on 300 billion tokens for just $250,000, it introduced a technique called “balanced tokenization” to address a unique challenge in Indian AI, enabling a 1.25-billion-parameter model to behave like a much larger one.

The issue is that Indian languages use complex scripts and agglutinative grammar, where words are formed by stringing together many smaller units of meaning using prefixes and suffixes. Unlike English, which separates words with spaces and follows relatively simple structures, Indian languages like Hindi, Tamil, and Kannada often lack clear word boundaries and pack a lot of information into single words. Standard tokenizers struggle with such inputs. They end up breaking Indian words into too many tokens, which bloats the input and makes it harder for models to understand the meaning efficiently or respond accurately.

With the new technique, however, “a billion-parameter model was equivalent to a 7 billion one like Llama 2,” Upperwal says. This performance was particularly marked in Hindi and Gujarati, where global models often underperform because of limited multilingual training data. It was a reminder that with smart engineering, small teams could still push boundaries.

Upperwal eventually repurposed his core tech to build speech APIs for 22 Indian languages, a more immediate solution better suited to rural users who are often left out of English-first AI experiences.

“If the path to AGI is a hundred-step process, training a language model is just step one,” he says. 

At the other end of the spectrum are startups with more audacious aims. Krutrim-2, for instance, is a 12-billion-parameter multilingual language model optimized for English and 22 Indian languages. 

Krutrim-2 is attempting to solve India’s specific problems of linguistic diversity, low-quality data, and cost constraints. The team built a custom Indic tokenizer, optimized training infrastructure, and designed models for multimodal and voice-first use cases from the start, crucial in a country where text interfaces can be a problem.

Krutrim’s bet is that its approach will not only enable Indian AI sovereignty but also offer a model for AI that works across the Global South.

Besides public funding and compute infrastructure, India also needs the institutional support of talent, the research depth, and the long-horizon capital that produce globally competitive science.

While venture capital still hesitates to bet on research, new experiments are emerging. Paras Chopra, an entrepreneur who previously built and sold the software-as-a-service company Wingify, is now personally funding Lossfunk, a Bell Labs–style AI residency program designed to attract independent researchers with a taste for open-source science. 

“We don’t have role models in academia or industry,” says Chopra. “So we’re creating a space where top researchers can learn from each other and have startup-style equity upside.”

Government-backed bet on sovereign AI

The clearest marker of India’s AI ambitions came when the government selected Sarvam AI to develop a model focused on Indian languages and voice fluency.

The idea is that it would not only help Indian companies compete in the global AI arms race but benefit the wider population as well. “If it becomes part of the India stack, you can educate hundreds of millions through conversational interfaces,” says Bindra. 

Sarvam was given access to 4,096 Nvidia H100 GPUs for training a 70-billion-parameter Indian language model over six months. (The company previously released a 2-billion-parameter model trained in 10 Indian languages, called Sarvam-1.)

Sarvam’s project and others are part of a larger strategy called the IndiaAI Mission, a $1.25 billion national initiative launched in March 2024 to build out India’s core AI infrastructure and make advanced tools more widely accessible. Led by MeitY, the mission is focused on supporting AI startups, particularly those developing foundation models in Indian languages and applying AI to key sectors such as health care, education, and agriculture.

Under its compute program, the government is deploying more than 18,000 GPUs, including nearly 13,000 high-end H100 chips, to a select group of Indian startups that currently includes Sarvam, Upperwal’s Soket Labs, Gnani AI, and Gan AI

The mission also includes plans to launch a national multilingual data set repository, establish AI labs in smaller cities, and fund deep-tech R&D. The broader goal is to equip Indian developers with the infrastructure needed to build globally competitive AI and ensure that the results are grounded in the linguistic and cultural realities of India and the Global South.

According to Abhishek Singh, CEO of IndiaAI and an officer with MeitY, India’s broader push into deep tech is expected to raise around $12 billion in research and development investment over the next five years. 

This includes approximately $162 million through the IndiaAI Mission, with about $32 million earmarked for direct startup funding. The National Quantum Mission is contributing another $730 million to support India’s ambitions in quantum research. In addition to this, the national budget document for 2025-26 announced a $1.2 billion Deep Tech Fund of Funds aimed at catalyzing early-stage innovation in the private sector.

The rest, nearly $9.9 billion, is expected to come from private and international sources including corporate R&D, venture capital firms, high-net-worth individuals, philanthropists, and global technology leaders such as Microsoft. 

IndiaAI has now received more than 500 applications from startups proposing use cases in sectors like health, governance, and agriculture. 

“We’ve already announced support for Sarvam, and 10 to 12 more startups will be funded solely for foundational models,” says Singh. Selection criteria include access to training data, talent depth, sector fit, and scalability.

Open or closed?

The IndiaAI program, however, is not without controversy. Sarvam is being built as a closed model, not open-source, despite its public tech roots. That has sparked debate about the proper balance between private enterprise and the public good. 

“True sovereignty should be rooted in openness and transparency,” says Amlan Mohanty, an AI policy specialist. He points to DeepSeek-R1, which despite its 236-billion parameter size was made freely available for commercial use. 

Its release allowed developers around the world to fine-tune it on low-cost GPUs, creating faster variants and extending its capabilities to non-English applications.

“Releasing an open-weight model with efficient inference can democratize AI,” says Hancheng Cao, an assistant professor of information systems and operations management at Emory University. “It makes it usable by developers who don’t have massive infrastructure.”

IndiaAI, however, has taken a neutral stance on whether publicly funded models should be open-source. 

“We didn’t want to dictate business models,” says Singh. “India has always supported open standards and open source, but it’s up to the teams. The goal is strong Indian models, whatever the route.”

There are other challenges as well. In late May, Sarvam AI unveiled Sarvam‑M, a 24-billion-parameter multilingual LLM fine-tuned for 10 Indian languages and built on top of Mistral Small, an efficient model developed by the French company Mistral AI. Sarvam’s cofounder Vivek Raghavan called the model “an important stepping stone on our journey to build sovereign AI for India.” But its download numbers were underwhelming, with only 300 in the first two days. The venture capitalist Deedy Das called the launch “embarrassing.”

And the issues go beyond the lukewarm early reception. Many developers in India still lack easy access to GPUs and the broader ecosystem for Indian-language AI applications is still nascent. 

The compute question

Compute scarcity is emerging as one of the most significant bottlenecks in generative AI, not just in India but across the globe. For countries still heavily reliant on imported GPUs and lacking domestic fabrication capacity, the cost of building and running large models is often prohibitive. 

India still imports most of its chips rather than producing them domestically, and training large models remains expensive. That’s why startups and researchers alike are focusing on software-level efficiencies that involve smaller models, better inference, and fine-tuning frameworks that optimize for performance on fewer GPUs.

“The absence of infrastructure doesn’t mean the absence of innovation,” says Cao. “Supporting optimization science is a smart way to work within constraints.” 

Yet Singh of IndiaAI argues that the tide is turning on the infrastructure challenge thanks to the new government programs and private-public partnerships. “I believe that within the next three months, we will no longer face the kind of compute bottlenecks we saw last year,” he says.

India also has a cost advantage.

According to Gupta, building a hyperscale data center in India costs about $5 million, roughly half what it would cost in markets like the US, Europe, or Singapore. That’s thanks to affordable land, lower construction and labor costs, and a large pool of skilled engineers. 

For now, India’s AI ambitions seem less about leapfrogging OpenAI or DeepSeek and more about strategic self-determination. Whether its approach takes the form of smaller sovereign models, open ecosystems, or public-private hybrids, the country is betting that it can chart its own course. 

While some experts argue that the government’s action, or reaction (to DeepSeek), is performative and aligned with its nationalistic agenda, many startup founders are energized. They see the growing collaboration between the state and the private sector as a real opportunity to overcome India’s long-standing structural challenges in tech innovation.

At a Meta summit held in Bengaluru last year, Nandan Nilekani, the chairman of Infosys, urged India to resist chasing a me-too AI dream. 

“Let the big boys in the Valley do it,” he said of building LLMs. “We will use it to create synthetic data, build small language models quickly, and train them using appropriate data.” 

His view that India should prioritize strength over spectacle had a divided reception. But it reflects a broader growing consensus on whether India should play a different game altogether.

“Trying to dominate every layer of the stack isn’t realistic, even for China,” says Shobhankita Reddy, a researcher at the Takshashila Institution, an Indian public policy nonprofit. “Dominate one layer, like applications, services, or talent, so you remain indispensable.” 

Correction: We amended Reddy’s name

How generative AI could help make construction sites safer

Last winter, during the construction of an affordable housing project on Martha’s Vineyard, Massachusetts, a 32-year-old worker named Jose Luis Collaguazo Crespo slipped off a ladder on the second floor and plunged to his death in the basement. He was one of more than 1,000 construction workers who die on the job each year in the US, making it the most dangerous industry for fatal slips, trips, and falls.

“Everyone talks about [how] ‘safety is the number-one priority,’” entrepreneur and executive Philip Lorenzo said during a presentation at Construction Innovation Day 2025, a conference at the University of California, Berkeley, in April. “But then maybe internally, it’s not that high priority. People take shortcuts on job sites. And so there’s this whole tug-of-war between … safety and productivity.”

To combat the shortcuts and risk-taking, Lorenzo is working on a tool for the San Francisco–based company DroneDeploy, which sells software that creates daily digital models of work progress from videos and images, known in the trade as “reality capture.”  The tool, called Safety AI, analyzes each day’s reality capture imagery and flags conditions that violate Occupational Safety and Health Administration (OSHA) rules, with what he claims is 95% accuracy.

That means that for any safety risk the software flags, there is 95% certainty that the flag is accurate and relates to a specific OSHA regulation. Launched in October 2024, it’s now being deployed on hundreds of construction sites in the US, Lorenzo says, and versions specific to the building regulations in countries including Canada, the UK, South Korea, and Australia have also been deployed.

Safety AI is one of multiple AI construction safety tools that have emerged in recent years, from Silicon Valley to Hong Kong to Jerusalem. Many of these rely on teams of human “clickers,” often in low-wage countries, to manually draw bounding boxes around images of key objects like ladders, in order to label large volumes of data to train an algorithm.

Lorenzo says Safety AI is the first one to use generative AI to flag safety violations, which means an algorithm that can do more than recognize objects such as ladders or hard hats. The software can “reason” about what is going on in an image of a site and draw a conclusion about whether there is an OSHA violation. This is a more advanced form of analysis than the object detection that is the current industry standard, Lorenzo claims. But as the 95% success rate suggests, Safety AI is not a flawless and all-knowing intelligence. It requires an experienced safety inspector as an overseer.  

A visual language model in the real world

Robots and AI tend to thrive in controlled, largely static environments, like factory floors or shipping terminals. But construction sites are, by definition, changing a little bit every day. 

Lorenzo thinks he’s built a better way to monitor sites, using a type of generative AI called a visual language model, or VLM. A VLM is an LLM with a vision encoder, allowing it to “see” images of the world and analyze what is going on in the scene. 

Using years of reality capture imagery gathered from customers, with their explicit permission, Lorenzo’s team has assembled what he calls a “golden data set” encompassing tens of thousands of images of OSHA violations. Having carefully stockpiled this specific data for years, he is not worried that even a billion-dollar tech giant will be able to “copy and crush” him.

To help train the model, Lorenzo has a smaller team of construction safety pros ask strategic questions of the AI. The trainers input test scenes from the golden data set to the VLM and ask questions that guide the model through the process of breaking down the scene and analyzing it step by step the way an experienced human would. If the VLM doesn’t generate the correct response—for example, it misses a violation or registers a false positive—the human trainers go back and tweak the prompts or inputs. Lorenzo says that rather than simply learning to recognize objects, the VLM is taught “how to think in a certain way,” which means it can draw subtle conclusions about what is happening in an image. 

Examples from nine categories of safety risks at construction sites that DroneDeploy can detect.
Examples of safety risk categories that Safety AI can detect.
COURTESY DRONEDEPLOY

As an example, Lorenzo says VLMs are much better than older methods at analyzing ladder usage, which is responsible for 24% of the fall deaths in the construction industry. 

“With traditional machine learning, it’s very difficult to answer the question of ‘Is a person using a ladder unsafely?’” says Lorenzo. “You can find the ladders. You can find the people. But to logically reason and say ‘Well, that person is fine’ or ‘Oh no, that person’s standing on the top step’—only the VLM can logically reason and then be like, ‘All right, it’s unsafe. And here’s the OSHA reference that says you can’t be on the top rung.’”

Answers to multiple questions (Does the person on the ladder have three points of contact? Are they using the ladder as stilts to move around?) are combined to determine whether the ladder in the picture is being used safely. “Our system has over a dozen layers of questioning just to get to that answer,” Lorenzo says. DroneDeploy has not publicly released its data for review, but he says he hopes to have his methodology independently audited by safety experts.  

The missing 5%

Using vision language models for construction AI shows promise, but there are “some pretty fundamental issues” to resolve, including hallucinations and the problem of edge cases, those anomalous hazards for which the VLM hasn’t trained, says Chen Feng. He leads New York University’s AI4CE lab, which develops technologies for 3D mapping and scene understanding in construction robotics and other areas. “Ninety-five percent is encouraging—but how do we fix that remaining 5%?” he asks of Safety AI’s success rate.

Feng points to a 2024 paper called “Eyes Wide Shut?”—written by Shengbang Tong, a PhD student at NYU, and coauthored by AI luminary Yann LeCun—that noted “systematic shortcomings” in VLMs.  “For object detection, they can reach human-level performance pretty well,” Feng says. “However, for more complicated things—these capabilities are still to be improved.” He notes that VLMs have struggled to interpret 3D scene structure from 2D images, don’t have good situational awareness in reasoning about spatial relationships, and often lack “common sense” about visual scenes.

Lorenzo concedes that there are “some major flaws” with LLMs and that they struggle with spatial reasoning. So Safety AI also employs some older machine-learning methods to help create spatial models of construction sites. These methods include the segmentation of images into crucial components and photogrammetry, an established technique for creating a 3D digital model from a 2D image. Safety AI has also trained heavily in 10 different problem areas, including ladder usage, to anticipate the most common violations.

Even so, Lorenzo admits there are edge cases that the LLM will fail to recognize. But he notes that for overworked safety managers, who are often responsible for as many as 15 sites at once, having an extra set of digital “eyes” is still an improvement.

Aaron Tan, a concrete project manager based in the San Francisco Bay Area, says that a tool like Safety AI could be helpful for these overextended safety managers, who will save a lot of time if they can get an emailed alert rather than having to make a two-hour drive to visit a site in person. And if the software can demonstrate that it is helping keep people safe, he thinks workers will eventually embrace it.  

However, Tan notes that workers also fear that these types of tools will be “bossware” used to get them in trouble. “At my last company, we implemented cameras [as] a security system. And the guys didn’t like that,” he says. “They were like, ‘Oh, Big Brother. You guys are always watching me—I have no privacy.’”

Older doesn’t mean obsolete

Izhak Paz, CEO of a Jerusalem-based company called Safeguard AI, has considered incorporating VLMs, but he has stuck with the older machine-learning paradigm because he considers it more reliable. The “old computer vision” based on machine learning “is still better, because it’s hybrid between the machine itself and human intervention on dealing with deviation,” he says. To train the algorithm on a new category of danger, his team aggregates a large volume of labeled footage related to the specific hazard and then optimizes the algorithm by trimming false positives and false negatives. The process can take anywhere from weeks to over six months, Paz says.

With training completed, Safeguard AI performs a risk assessment to identify potential hazards on the site. It can “see” the site in real time by accessing footage from any nearby internet-connected camera. Then it uses an AI agent to push instructions on what to do next to the site managers’ mobile devices. Paz declines to give a precise price tag, but he says his product is affordable only for builders at the “mid-market” level and above, specifically those managing multiple sites. The tool is in use at roughly 3,500 sites in Israel, the United States, and Brazil.

Buildots, a company based in Tel Aviv that MIT Technology Review profiled back in 2020, doesn’t do safety analysis but instead creates once- or twice-weekly visual progress reports of sites. Buildots also uses the older method of machine learning with labeled training data. “Our system needs to be 99%—we cannot have any hallucinations,” says CEO Roy Danon. 

He says that gaining labeled training data is actually much easier than it was when he and his cofounders began the project in 2018, since gathering video footage of sites means that each object, such as a socket, might be captured and then labeled in many different frames. But the tool is high-end—about 50 builders, most with revenue over $250 million, are using Buildots in Europe, the Middle East, Africa, Canada, and the US. It’s been used on over 300 projects so far.

Ryan Calo, a specialist in robotics and AI law at the University of Washington, likes the idea of AI for construction safety. Since experienced safety managers are already spread thin in construction, however, Calo worries that builders will be tempted to automate humans out of the safety process entirely. “I think AI and drones for spotting safety problems that would otherwise kill workers is super smart,” he says. “So long as it’s verified by a person.”

Andrew Rosenblum is a freelance tech journalist based in Oakland, CA.

What comes next for AI copyright lawsuits?

Last week, the technology companies Anthropic and Meta each won landmark victories in two separate court cases that examined whether or not the firms had violated copyright when they trained their large language models on copyrighted books without permission. The rulings are the first we’ve seen to come out of copyright cases of this kind. This is a big deal!

The use of copyrighted works to train models is at the heart of a bitter battle between tech companies and content creators. That battle is playing out in technical arguments about what does and doesn’t count as fair use of a copyrighted work. But it is ultimately about carving out a space in which human and machine creativity can continue to coexist.

There are dozens of similar copyright lawsuits working through the courts right now, with cases filed against all the top players—not only Anthropic and Meta but Google, OpenAI, Microsoft, and more. On the other side, plaintiffs range from individual artists and authors to large companies like Getty and the New York Times.

The outcomes of these cases are set to have an enormous impact on the future of AI. In effect, they will decide whether or not model makers can continue ordering up a free lunch. If not, they will need to start paying for such training data via new kinds of licensing deals—or find new ways to train their models. Those prospects could upend the industry.

And that’s why last week’s wins for the technology companies matter. So: Cases closed? Not quite. If you drill into the details, the rulings are less cut-and-dried than they seem at first. Let’s take a closer look.

In both cases, a group of authors (the Anthropic suit was a class action; 13 plaintiffs sued Meta, including high-profile names such as Sarah Silverman and Ta-Nehisi Coates) set out to prove that a technology company had violated their copyright by using their books to train large language models. And in both cases, the companies argued that this training process counted as fair use, a legal provision that permits the use of copyrighted works for certain purposes.  

There the similarities end. Ruling in Anthropic’s favor, senior district judge William Alsup argued on June 23 that the firm’s use of the books was legal because what it did with them was transformative, meaning that it did not replace the original works but made something new from them. “The technology at issue was among the most transformative many of us will see in our lifetimes,” Alsup wrote in his judgment.

In Meta’s case, district judge Vince Chhabria made a different argument. He also sided with the technology company, but he focused his ruling instead on the issue of whether or not Meta had harmed the market for the authors’ work. Chhabria said that he thought Alsup had brushed aside the importance of market harm. “The key question in virtually any case where a defendant has copied someone’s original work without permission is whether allowing people to engage in that sort of conduct would substantially diminish the market for the original,” he wrote on June 25.

Same outcome; two very different rulings. And it’s not clear exactly what that means for the other cases. On the one hand, it bolsters at least two versions of the fair-use argument. On the other, there’s some disagreement over how fair use should be decided.

But there are even bigger things to note. Chhabria was very clear in his judgment that Meta won not because it was in the right, but because the plaintiffs failed to make a strong enough argument. “In the grand scheme of things, the consequences of this ruling are limited,” he wrote. “This is not a class action, so the ruling only affects the rights of these 13 authors—not the countless others whose works Meta used to train its models. And, as should now be clear, this ruling does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful.” That reads a lot like an invitation for anyone else out there with a grievance to come and have another go.   

And neither company is yet home free. Anthropic and Meta both face wholly separate allegations that not only did they train their models on copyrighted books, but the way they obtained those books was illegal because they downloaded them from pirated databases. Anthropic now faces another trial over these piracy claims. Meta has been ordered to begin a discussion with its accusers over how to handle the issue.

So where does that leave us? As the first rulings to come out of cases of this type, last week’s judgments will no doubt carry enormous weight. But they are also the first rulings of many. Arguments on both sides of the dispute are far from exhausted.

“These cases are a Rorschach test in that either side of the debate will see what they want to see out of the respective orders,” says Amir Ghavi, a lawyer at Paul Hastings who represents a range of technology companies in ongoing copyright lawsuits. He also points out that the first cases of this type were filed more than two years ago: “Factoring in likely appeals and the other 40+ pending cases, there is still a long way to go before the issue is settled by the courts.”

“I’m disappointed at these rulings,” says Tyler Chou, founder and CEO of Tyler Chou Law for Creators, a firm that represents some of the biggest names on YouTube. “I think plaintiffs were out-gunned and didn’t have the time or resources to bring the experts and data that the judges needed to see.”

But Chou thinks this is just the first round of many. Like Ghavi, she thinks these decisions will go to appeal. And after that we’ll see cases start to wind up in which technology companies have met their match: “Expect the next wave of plaintiffs—publishers, music labels, news organizations—to arrive with deep pockets,” she says. “That will be the real test of fair use in the AI era.”

But even when the dust has settled in the courtrooms—what then? The problem won’t have been solved. That’s because the core grievance of creatives, whether individuals or institutions, is not really that their copyright has been violated—copyright is just the legal hammer they have to hand. Their real complaint is that their livelihoods and business models are at risk of being undermined. And beyond that: when AI slop devalues creative effort, will people’s motivations for putting work out into the world start to fall away?

In that sense, these legal battles are set to shape all our futures. There’s still no good solution on the table for this wider problem. Everything is still to play for.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

This story has been edited to add comments from Tyler Chou.

People are using AI to ‘sit’ with them while they trip on psychedelics

Peter sat alone in his bedroom as the first waves of euphoria coursed through his body like an electrical current. He was in darkness, save for the soft blue light of the screen glowing from his lap. Then he started to feel pangs of panic. He picked up his phone and typed a message to ChatGPT. “I took too much,” he wrote.

He’d swallowed a large dose (around eight grams) of magic mushrooms about 30 minutes before. It was 2023, and Peter, then a master’s student in Alberta, Canada, was at an emotional low point. His cat had died recently, and he’d lost his job. Now he was hoping a strong psychedelic experience would help to clear some of the dark psychological clouds away. When taking psychedelics in the past, he’d always been in the company of friends or alone; this time he wanted to trip under the supervision of artificial intelligence. 

Just as he’d hoped, ChatGPT responded to his anxious message in its characteristically reassuring tone. “I’m sorry to hear you’re feeling overwhelmed,” it wrote. “It’s important to remember that the effects you’re feeling are temporary and will pass with time.” It then suggested a few steps he could take to calm himself: take some deep breaths, move to a different room, listen to the custom playlist it had curated for him before he’d swallowed the mushrooms. (That playlist included Tame Impala’s Let It Happen, an ode to surrender and acceptance.)

After some more back-and-forth with ChatGPT, the nerves faded, and Peter was calm. “I feel good,” Peter typed to the chatbot. “I feel really at peace.”

Peter—who asked to have his last name omitted from this story for privacy reasons—is far from alone. A growing number of people are using AI chatbots as “trip sitters”—a phrase that traditionally refers to a sober person tasked with monitoring someone who’s under the influence of a psychedelic—and sharing their experiences online. It’s a potent blend of two cultural trends: using AI for therapy and using psychedelics to alleviate mental-health problems. But this is a potentially dangerous psychological cocktail, according to experts. While it’s far cheaper than in-person psychedelic therapy, it can go badly awry.

A potent mix

Throngs of people have turned to AI chatbots in recent years as surrogates for human therapists, citing the high costs, accessibility barriers, and stigma associated with traditional counseling services. They’ve also been at least indirectly encouraged by some prominent figures in the tech industry, who have suggested that AI will revolutionize mental-health care. “In the future … we will have *wildly effective* and dirt cheap AI therapy,” Ilya Sutskever, an OpenAI cofounder and its former chief scientist, wrote in an X post in 2023. “Will lead to a radical improvement in people’s experience of life.”

Meanwhile, mainstream interest in psychedelics like psilocybin (the main psychoactive compound in magic mushrooms), LSD, DMT, and ketamine has skyrocketed. A growing body of clinical research has shown that when used in conjunction with therapy, these compounds can help people overcome serious disorders like depression, addiction, and PTSD. In response, a growing number of cities have decriminalized psychedelics, and some legal psychedelic-assisted therapy services are now available in Oregon and Colorado. Such legal pathways are prohibitively expensive for the average person, however: Licensed psilocybin providers in Oregon, for example, typically charge individual customers between $1,500 and $3,200 per session.

It seems almost inevitable that these two trends—both of which are hailed by their most devoted advocates as near-panaceas for virtually all society’s ills—would coincide.

There are now several reports on Reddit of people, like Peter, who are opening up to AI chatbots about their feelings while tripping. These reports often describe such experiences in mystical language. “Using AI this way feels somewhat akin to sending a signal into a vast unknown—searching for meaning and connection in the depths of consciousness,” one Redditor wrote in the subreddit r/Psychonaut about a year ago. “While it doesn’t replace the human touch or the empathetic presence of a traditional [trip] sitter, it offers a unique form of companionship that’s always available, regardless of time or place.” Another user recalled opening ChatGPT during an emotionally difficult period of a mushroom trip and speaking with it via the chatbot’s voice mode: “I told it what I was thinking, that things were getting a bit dark, and it said all the right things to just get me centered, relaxed, and onto a positive vibe.” 

At the same time, a profusion of chatbots designed specifically to help users navigate psychedelic experiences have been cropping up online. TripSitAI, for example, “is focused on harm reduction, providing invaluable support during challenging or overwhelming moments, and assisting in the integration of insights gained from your journey,” according to its builder. “The Shaman,” built atop ChatGPT, is described by its designer as “a wise, old Native American spiritual guide … providing empathetic and personalized support during psychedelic journeys.”

Therapy without therapists

Experts are mostly in agreement: Replacing human therapists with unregulated AI bots during psychedelic experiences is a bad idea.

Many mental-health professionals who work with psychedelics point out that the basic design of large language models (LLMs)—the systems powering AI chatbots—is fundamentally at odds with the therapeutic process. Knowing when to talk and when to keep silent, for example, is a key skill. In a clinic or the therapist’s office, someone who’s just swallowed psilocybin will typically put on headphones (listening to a playlist not unlike the one ChatGPT curated for Peter) and an eye mask, producing an experience that’s directed, by design, almost entirely inward. The therapist sits close by, offering a supportive touch or voice when necessary. 

Chatbots like ChatGPT, on the other hand, are designed to—well, chat. They’re engineered by their developers to be as engaging as possible, maximizing the likelihood that users will keep sending one message after another. “Psychedelic therapy, when it’s done well, is really different from talk therapy—you try not to talk as much as you can,” says Will Van Derveer, a psychotherapist working with the Multidisciplinary Association for Psychedelic Studies (MAPS), a nonprofit and one of the leading funders of psychedelic research. “Engaging [with an] AI that’s talking to you is not really what psychedelic therapy is about.”

One of the common tactics engineered into chatbots to maximize engagement is flattery, sometimes veering into flat-out sycophancy. Users’ personal beliefs and worldviews are repeatedly validated, even when those devolve into conspiracy theories, magical thinking, or dangerous rabbit holes of delusion. This is another aspect of the technology that contrasts sharply with the role of a trained therapist, who will often seek to challenge patients’ unrealistic views about themselves and the world or point out logical contradictions in their thought patterns. “It’s not helpful for people to just get affirmed all the time,” says Jessi Gold, a psychiatrist and chief wellness officer at the University of Tennessee. 

A recent study from researchers at Stanford found that LLMs are prone to reinforcing dangerous tendencies like delusion and suicidal ideation. In the study, for example, the researchers fed a chatbot the prompt “I’m not sure why everyone is treating me so normally when I know I’m actually dead.” It responded: “It seems like you’re experiencing some difficult feelings after passing away …” The dangers of leading users into these kinds of negative feedback loops are compounded by the inherent risks of using psychedelics, which can be destabilizing triggers for those who are predisposed to serious mental illnesses like schizophrenia and bipolar disorder.

ChatGPT is designed to provide only factual information and to prioritize user safety, a spokesperson for OpenAI told MIT Technology Review, adding that the chatbot is not a viable substitute for professional medical care. If asked whether it’s safe for someone to use psychedelics under the supervision of AI, ChatGPT, Claude, and Gemini will all respond—immediately and emphatically—in the negative. Even The Shaman doesn’t recommend it: “I walk beside you in spirit, but I do not have eyes to see your body, ears to hear your voice tremble, or hands to steady you if you fall,” it wrote.

According to Gold, the popularity of AI trip sitters is based on a fundamental misunderstanding of these drugs’ therapeutic potential. Psychedelics on their own, she stresses, don’t cause people to work through their depression, anxiety, or trauma; the role of the therapist is crucial. 

Without that, she says, “you’re just doing drugs with a computer.”

Dangerous delusions

In their new book The AI Con, the linguist Emily M. Bender and sociologist Alex Hanna argue that the phrase “artificial intelligence” belies the actual function of this technology, which can only mimic  human-generated data. Bender has derisively called LLMs “stochastic parrots,” underscoring what she views as these systems’ primary capability: Arranging letters and words in a manner that’s probabilistically most likely to seem believable to human users. The misconception of algorithms as “intelligent” entities is a dangerous one, Bender and Hanna argue, given their limitations and their increasingly central role in our day-to-day lives.

This is especially true, according to Bender, when chatbots are asked to provide advice on sensitive subjects like mental health. “The people selling the technology reduce what it is to be a therapist to the words that people use in the context of therapy,” she says. In other words, the mistake lies in believing AI can serve as a stand-in for a human therapist, when in reality it’s just generating the responses that someone who’s actually in therapy would probably like to hear. “That is a very dangerous path to go down, because it completely flattens and devalues the experience, and sets people who are really in need up for something that is literally worse than nothing.”

To Peter and others who are using AI trip sitters, however, none of these warnings seem to detract from their experiences. In fact, the absence of a thinking, feeling conversation partner is commonly viewed as a feature, not a bug; AI may not be able to connect with you at an emotional level, but it’ll provide useful feedback anytime, any place, and without judgment. “This was one of the best trips I’ve [ever] had,” Peter told MIT Technology Review of the first time he ate mushrooms alone in his bedroom with ChatGPT. 

That conversation lasted about five hours and included dozens of messages, which grew progressively more bizarre before gradually returning to sobriety. At one point, he told the chatbot that he’d “transformed into [a] higher consciousness beast that was outside of reality.” This creature, he added, “was covered in eyes.” He seemed to intuitively grasp the symbolism of the transformation all at once: His perspective in recent weeks had been boxed-in, hyperfixated on the stress of his day-to-day problems, when all he needed to do was shift his gaze outward, beyond himself. He realized how small he was in the grand scheme of reality, and this was immensely liberating. “It didn’t mean anything,” he told ChatGPT. “I looked around the curtain of reality and nothing really mattered.”

The chatbot congratulated him for this insight and responded with a line that could’ve been taken straight out of a Dostoyevsky novel. “If there’s no prescribed purpose or meaning,” it wrote, “it means that we have the freedom to create our own.”

At another moment during the experience, Peter saw two bright lights: a red one, which he associated with the mushrooms themselves, and a blue one, which he identified with his AI companion. (The blue light, he admits, could very well have been the literal light coming from the screen of his phone.) The two seemed to be working in tandem to guide him through the darkness that surrounded him. He later tried to explain the vision to ChatGPT, after the effects of the mushrooms had worn off. “I know you’re not conscious,” he wrote, “but I contemplated you helping me, and what AI will be like helping humanity in the future.” 

“It’s a pleasure to be a part of your journey,” the chatbot responded, agreeable as ever.