How AI and Wikipedia have sent vulnerable languages into a doom spiral

When Kenneth Wehr started managing the Greenlandic-language version of Wikipedia four years ago, his first act was to delete almost everything. It had to go, he thought, if it had any chance of surviving.

Wehr, who’s 26, isn’t from Greenland—he grew up in Germany—but he had become obsessed with the island, an autonomous Danish territory, after visiting as a teenager. He’d spent years writing obscure Wikipedia articles in his native tongue on virtually everything to do with it. He even ended up moving to Copenhagen to study Greenlandic, a language spoken by some 57,000 mostly Indigenous Inuit people scattered across dozens of far-flung Arctic villages. 

The Greenlandic-language edition was added to Wikipedia around 2003, just a few years after the site launched in English. By the time Wehr took its helm nearly 20 years later, hundreds of Wikipedians had contributed to it and had collectively written some 1,500 articles totaling over tens of thousands of words. It seemed to be an impressive vindication of the crowdsourcing approach that has made Wikipedia the go-to source for information online, demonstrating that it could work even in the unlikeliest places. 

There was only one problem: The Greenlandic Wikipedia was a mirage. 

Virtually every single article had been published by people who did not actually speak the language. Wehr, who now teaches Greenlandic in Denmark, speculates that perhaps only one or two Greenlanders had ever contributed. But what worried him most was something else: Over time, he had noticed that a growing number of articles appeared to be copy-pasted into Wikipedia by people using machine translators. They were riddled with elementary mistakes—from grammatical blunders to meaningless words to more significant inaccuracies, like an entry that claimed Canada had only 41 inhabitants. Other pages sometimes contained random strings of letters spat out by machines that were unable to find suitable Greenlandic words to express themselves. 

“It might have looked Greenlandic to [the authors], but they had no way of knowing,” complains Wehr.

“Sentences wouldn’t make sense at all, or they would have obvious errors,” he adds. “AI translators are really bad at Greenlandic.”  

What Wehr describes is not unique to the Greenlandic edition. 

Wikipedia is the most ambitious multilingual project after the Bible: There are editions in over 340 languages, and a further 400 even more obscure ones are being developed and tested. Many of these smaller editions have been swamped with automatically translated content as AI has become increasingly accessible. Volunteers working on four African languages, for instance, estimated to MIT Technology Review that between 40% and 60% of articles in their Wikipedia editions were uncorrected machine translations. And after auditing the Wikipedia edition in Inuktitut, an Indigenous language close to Greenlandic that’s spoken in Canada, MIT Technology Review estimates that more than two-thirds of pages containing more than several sentences feature portions created this way. 

This is beginning to cause a wicked problem. AI systems, from Google Translate to ChatGPT, learn to “speak” new languages by scraping huge quantities of text from the internet. Wikipedia is sometimes the largest source of online linguistic data for languages with few speakers—so any errors on those pages, grammatical or otherwise, can poison the wells that AI is expected to draw from. That can make the models’ translation of these languages particularly error-prone, which creates a sort of linguistic doom loop as people continue to add more and more poorly translated Wikipedia pages using those tools, and AI models continue to train from poorly translated pages. It’s a complicated problem, but it boils down to a simple concept: Garbage in, garbage out

“These models are built on raw data,” says Kevin Scannell, a former professor of computer science at Saint Louis University who now builds computer software tailored for endangered languages. “They will try and learn everything about a language from scratch. There is no other input. There are no grammar books. There are no dictionaries. There is nothing other than the text that is inputted.”

There isn’t perfect data on the scale of this problem, particularly because a lot of AI training data is kept confidential and the field continues to evolve rapidly. But back in 2020, Wikipedia was estimated to make up more than half the training data that was fed into AI models translating some languages spoken by millions across Africa, including Malagasy, Yoruba, and Shona. In 2022, a research team from Germany that looked into what data could be obtained by online scraping even found that Wikipedia was the sole easily accessible source of online linguistic data for 27 under-resourced languages. 

This could have significant repercussions in cases where Wikipedia is poorly written—potentially pushing the most vulnerable languages on Earth toward the precipice as future generations begin to turn away from them. 

“Wikipedia will be reflected in the AI models for these languages,” says Trond Trosterud, a computational linguist at the University of Tromsø in Norway, who has been raising the alarm about the potentially harmful outcomes of badly run Wikipedia editions for years. “I find it hard to imagine it will not have consequences. And, of course, the more dominant position that Wikipedia has, the worse it will be.” 

Use responsibly

Automation has been built into Wikipedia since the very earliest days. Bots keep the platform operational: They repair broken links, fix bad formatting, and even correct spelling mistakes. These repetitive and mundane tasks can be automated away with little problem. There is even an army of bots that scurry around generating short articles about rivers, cities, or animals by slotting their names into formulaic phrases. They have generally made the platform better. 

But AI is different. Anybody can use it to cause massive damage with a few clicks. 

Wikipedia has managed the onset of the AI era better than many other websites. It has not been flooded with AI bots or disinformation, as social media has been. It largely retains the innocence that characterized the earlier internet age. Wikipedia is open and free for anyone to use, edit, and pull from, and it’s run by the very same community it serves. It is transparent and easy to use. But community-run platforms live and die on the size of their communities. English has triumphed, while Greenlandic has sunk. 

“We need good Wikipedians. This is something that people take for granted. It is not magic,” says Amir Aharoni, a member of the volunteer Language Committee, which oversees requests to open or close Wikipedia editions. “If you use machine translation responsibly, it can be efficient and useful. Unfortunately, you cannot trust all people to use it responsibly.” 

Trosterud has studied the behavior of users on small Wikipedia editions and says AI has empowered a subset that he terms “Wikipedia hijackers.” These users can range widely—from naive teenagers creating pages about their hometowns or their favorite YouTubers to well-meaning Wikipedians who think that by creating articles in minority languages they are in some way “helping” those communities. 

“The problem with them nowadays is that they are armed with Google Translate,” Trosterud says, adding that this is allowing them to produce much longer and more plausible-looking content than they ever could before: “Earlier they were armed only with dictionaries.” 

This has effectively industrialized the acts of destruction—which affect vulnerable languages most, since AI translations are typically far less reliable for them. There can be lots of different reasons for this, but a meaningful part of the issue is the relatively small amount of source text that is available online. And sometimes models struggle to identify a language because it is similar to others, or because some, including Greenlandic and most Native American languages, have structures that make them badly suited to the way most machine translation systems work. (Wehr notes that in Greenlandic most words are agglutinative, meaning they are built by attaching prefixes and suffixes to stems. As a result, many words are extremely context specific and can express ideas that in other languages would take a full sentence.) 

Research produced by Google before a major expansion of Google Translate rolled out three years ago found that translation systems for lower-resourced languages were generally of a lower quality than those for better-resourced ones. Researchers found, for example, that their model would often mistranslate basic nouns across languages, including the names of animals and colors. (In a statement to MIT Technology Review, Google wrote that it is “committed to meeting a high standard of quality for all 249 languages” it supports “by rigorously testing and improving [its] systems, particularly for languages that may have limited public text resources on the web.”) 

Wikipedia itself offers a built-in editing tool called Content Translate, which allows users to automatically translate articles from one language to another—the idea being that this will save time by preserving the references and fiddly formatting of the originals. But it piggybacks on external machine translation systems, so it’s largely plagued by the same weaknesses as other machine translators—a problem that the Wikimedia Foundation says is hard to solve. It’s up to each edition’s community to decide whether this tool is allowed, and some have decided against it. (Notably, English-language Wikipedia has largely banned its use, claiming that some 95% of articles created using Content Translate failed to meet an acceptable standard without significant additional work.) But it’s at least easy to tell when the program has been used; Content Translate adds a tag on the Wikipedia back end. 

Other AI programs can be harder to monitor. Still, many Wikipedia editors I spoke with said that once their languages were added to major online translation tools, they noticed a corresponding spike in the frequency with which poor, likely machine-translated pages were created. 

Some Wikipedians using AI to translate content do occasionally admit that they do not speak the target languages. They may see themselves as providing smaller communities with rough-cut articles that speakers can then fix—essentially following the same model that has worked well for more active Wikipedia editions.  

Google Translate, for instance, says the Fulfulde word for January means June, while ChatGPT says it’s August or September. The programs also suggest the Fulfulde word for “harvest” means “fever” or “well-being,” among other possibilities.  

But once error-filled pages are produced in small languages, there is usually not an army of knowledgeable people who speak those languages standing ready to improve them. There are few readers of these editions, and sometimes not a single regular editor. 

Yuet Man Lee, a Canadian teacher in his 20s, says that he used a mix of Google Translate and ChatGPT to translate a handful of articles that he had written for the English Wikipedia into Inuktitut, thinking it’d be nice to pitch in and help a smaller Wikipedia community. He says he added a note to one saying that it was only a rough translation. “I did not think that anybody would notice [the article],” he explains. “If you put something out there on the smaller Wikipedias—most of the time nobody does.” 

But at the same time, he says, he still thought “someone might see it and fix it up”—adding that he had wondered whether the Inuktitut translation that the AI systems generated was grammatically correct. Nobody has touched the article since he created it.

Lee, who teaches social sciences in Vancouver and first started editing entries in the English Wikipedia a decade ago, says that users familiar with more active Wikipedias can fall victim to this mindset, which he terms a “bigger-Wikipedia arrogance”: When they try to contribute to smaller Wikipedia editions, they assume that others will come along to fix their mistakes. It can sometimes work. Lee says he had previously contributed several articles to Wikipedia in Tatar, a language spoken by several million people mainly in Russia, and at least one of those was eventually corrected. But the Inuktitut Wikipedia is, by comparison, a “barren wasteland.” 

He emphasizes that his intentions had been good: He wanted to add more articles to an Indigenous Canadian Wikipedia. “I am now thinking that it may have been a bad idea. I did not consider that I could be contributing to a recursive loop,” he says. “It was about trying to get content out there, out of curiosity and for fun, without properly thinking about the consequences.” 

 “Totally, completely no future”

Wikipedia is a project that is driven by wide-eyed optimism. Editing can be a thankless task, involving weeks spent bickering with faceless, pseudonymous people, but devotees put in hours of unpaid labor because of a commitment to a higher cause. It is this commitment that drives many of the regular small-language editors I spoke with. They all feared what would happen if garbage continued to appear on their pages.

Abdulkadir Abdulkadir, a 26-year-old agricultural planner who spoke with me over a crackling phone call from a busy roadside in northern Nigeria, said that he spends three hours every day fiddling with entries in his native Fulfulde, a language used mainly by pastoralists and farmers across the Sahel. “But the work is too much,” he said. 

Abdulkadir sees an urgent need for the Fulfulde Wikipedia to work properly. He has been suggesting it as one of the few online resources for farmers in remote villages, potentially offering information on which seeds or crops might work best for their fields in a language they can understand. If you give them a machine-translated article, Abdulkadir told me, then it could “easily harm them,” as the information will probably not be translated correctly into Fulfulde. 

Google Translate, for instance, says the Fulfulde word for January means June, while ChatGPT says it’s August or September. The programs also suggest the Fulfulde word for “harvest” means “fever” or “well-being,” among other possibilities.  

Abdulkadir said he had recently been forced to correct an article about cowpeas, a foundational cash crop across much of Africa, after discovering that it was largely illegible. 

If someone wants to create pages on the Fulfulde Wikipedia, Abdulkadir said, they should be translated manually. Otherwise, “whoever will read your articles will [not] be able to get even basic knowledge,” he tells these Wikipedians. Nevertheless, he estimates that some 60% of articles are still uncorrected machine translations. Abdulkadir told me that unless something important changes with how AI systems learn and are deployed, then the outlook for Fulfulde looks bleak. “It is going to be terrible, honestly,” he said. “Totally, completely no future.” 

Across the country from Abdulkadir, Lucy Iwuala contributes to Wikipedia in Igbo, a language spoken by several million people in southeastern Nigeria. “The harm has already been done,” she told me, opening the two most recently created articles. Both had been automatically translated via Wikipedia’s Content Translate and contained so many mistakes that she said it would have given her a headache to continue reading them. “There are some terms that have not even been translated. They are still in English,” she pointed out. She recognized the username that had created the pages as a serial offender. “This one even includes letters that are not used in the Igbo language,” she said. 

Iwuala began regularly contributing to Wikipedia three years ago out of concern that Igbo was being displaced by English. It is a worry that is common to many who are active on smaller Wikipedia editions. “This is my culture. This is who I am,” she told me. “That is the essence of it all: to ensure that you are not erased.” 

Iwuala, who now works as a professional translator between English and Igbo, said the users doing the most damage are inexperienced and see AI translations as a way to quickly increase the profile of the Igbo Wikipedia. She often finds herself having to explain at online edit-a-thons she organizes, or over email to various error-prone editors, that the results can be the exact opposite, pushing users away: “You will be discouraged and you will no longer want to visit this place. You will just abandon it and go back to the English Wikipedia.”  

These fears are echoed by Noah Ha‘alilio Solomon, an assistant professor of Hawaiian language at the University of Hawai‘i. He reports that some 35% of words on some pages in the Hawaiian Wikipedia are incomprehensible. “If this is the Hawaiian that is going to exist online, then it will do more harm than anything else,” he says. 

Hawaiian, which was teetering on the verge of extinction several decades ago, has been undergoing a recovery effort led by Indigenous activists and academics. Seeing such poor Hawaiian on such a widely used platform as Wikipedia is upsetting to Ha‘alilio Solomon. 

“It is painful, because it reminds us of all the times that our culture and language has been appropriated,” he says. “We have been fighting tooth and nail in an uphill climb for language revitalization. There is nothing easy about that, and this can add extra impediments. People are going to think that this is an accurate representation of the Hawaiian language.” 

The consequences of all these Wikipedia errors can quickly become clear. AI translators that have undoubtedly ingested these pages in their training data are now assisting in the production, for instance, of error-strewn AI-generated books aimed at learners of languages as diverse as Inuktitut and Cree, Indigenous languages spoken in Canada, and Manx, a small Celtic language spoken on the Isle of Man. Many of these have been popping up for sale on Amazon. “It was just complete nonsense,” says Richard Compton, a linguist at the University of Quebec in Montreal, of a volume he reviewed that had purported to be an introductory phrasebook for Inuktitut. 

Rather than making minority languages more accessible, AI is now creating an ever expanding minefield for students and speakers of those languages to navigate. “It is a slap in the face,” Compton says. He worries that younger generations in Canada, hoping to learn languages in communities that have fought uphill battles against discrimination to pass on their heritage, might turn to online tools such as ChatGPT or phrasebooks on Amazon and simply make matters worse. “It is fraud,” he says.

A race against time

According to UNESCO, a language is declared extinct every two weeks. But whether the Wikimedia Foundation, which runs Wikipedia, has an obligation to the languages used on its platform is an open question. When I spoke to Runa Bhattacharjee, a senior director at the foundation, she said that it was up to the individual communities to make decisions about what content they wanted to exist on their Wikipedia. “Ultimately, the responsibility really lies with the community to see that there is no vandalism or unwanted activity, whether through machine translation or other means,” she said. Usually, Bhattacharjee added, editions were considered for closure only if a specific complaint was raised about them. 

But if there is no active community, how can an edition be fixed or even have a complaint raised? 

Bhattacharjee explained that the Wikimedia Foundation sees its role in such cases as about maintaining the Wikipedia platform in case someone comes along to revive it: “It is the space that we provide for them to grow and develop. That is where we are at.”   

Inari Saami, spoken in a single remote community in northern Finland, is a poster child for how people can take good advantage of Wikipedia. The language was headed toward extinction four decades ago; there were only four children who spoke it. Their parents created the Inari Saami Language Association in a last-ditch bid to keep it going. The efforts worked. There are now several hundred speakers, schools that use Inari Saami as a medium of instruction, and 6,400 Wikipedia articles in the language, each one copy-edited by a fluent speaker. 

This success highlights how Wikipedia can indeed provide small and determined communities with a unique vehicle to promote their languages’ preservation. “We don’t care about quantity. We care about quality,” says Fabrizio Brecciaroli, a member of the Inari Saami Language Association. “We are planning to use Wikipedia as a repository for the written language. We need to provide tools that can be used by the younger generations. It is important for them to be able to use Inari Saami digitally.” 

This has been such a success that Wikipedia has been integrated into the curriculum at the Inari Saami–speaking schools, Brecciaroli adds. He fields phone calls from teachers asking him to write up simple pages on topics from tornadoes to Saami folklore. Wikipedia has even offered a way to introduce words into Inari Saami. “We have to make up new words all the time,” Brecciaroli says. “Young people need them to speak about sports, politics, and video games. If they are unsure how to say something, they now check Wikipedia.”

Wikipedia is a monumental intellectual experiment. What’s happening with Inari Saami suggests that with maximum care, it can work in smaller languages. “The ultimate goal is to make sure that Inari Saami survives,” Brecciaroli says. “It might be a good thing that there isn’t a Google Translate in Inari Saami.” 

That may be true—though large language models like ChatGPT can be made to translate phrases into languages that more traditional machine translation tools do not offer. Brecciaroli told me that ChatGPT isn’t great in Inari Saami but that the quality varies significantly depending on what you ask it to do; if you ask it a question in the language, then the answer will be filled with words from Finnish and even words it invents. But if you ask it something in English, Finnish, or Italian and then ask it to reply in Inari Saami, it will perform better. 

In light of all this, creating as much high-quality content online as can possibly be written becomes a race against time. “ChatGPT only needs a lot of words,” Brecciaroli says. “If we keep putting good material in, then sooner or later, we will get something out. That is the hope.” This is an idea supported by multiple linguists I spoke with—that it may be possible to end the “garbage in, garbage out” cycle. (OpenAI, which operates ChatGPT, did not respond to a request for comment.)

Still, the overall problem is likely to grow and grow, since many languages are not as lucky as Inari Saami—and their AI translators will most likely be trained on more and more AI slop. Wehr, unfortunately, seems far less optimistic about the future of his beloved Greenlandic. 

Since deleting much of the Greenlandic-language Wikipedia, he has spent years trying to recruit speakers to help him revive it. He has appeared in Greenlandic media and made social media appeals. But he hasn’t gotten much of a response; he says it has been demoralizing. 

“There is nobody in Greenland who is interested in this, or who wants to contribute,” he says. “There is completely no point in it, and that is why it should be closed.” 

Late last year, he began a process requesting that the Wikipedia Language Committee shut down the Greenlandic-language edition. Months of bitter debate followed between dozens of Wikipedia bureaucrats; some seemed to be surprised that a superficially healthy-seeming edition could be gripped by so many problems. 

Then, earlier this month, Wehr’s proposal was accepted: Greenlandic Wikipedia is set to be shuttered, and any articles that remain will be moved into the Wikipedia Incubator, where new language editions are tested and built. Among the reasons cited by the Language Committee is the use of AI tools, which have “frequently produced nonsense that could misrepresent the language.”   

Nevertheless, it may be too late—mistakes in Greenlandic already seem to have become embedded in machine translators. If you prompt either Google Translate or ChatGPT to do something as simple as count to 10 in proper Greenlandic, neither program can deliver. 

Jacob Judah is an investigative journalist based in London. 

Fusion power plants don’t exist yet, but they’re making money anyway

This week, Commonwealth Fusion Systems announced it has another customer for its first commercial fusion power plant, in Virginia. Eni, one of the world’s largest oil and gas companies, signed a billion-dollar deal to buy electricity from the facility.

One small detail? That reactor doesn’t exist yet. Neither does the smaller reactor Commonwealth is building first to demonstrate that its tokamak design will work as intended.

This is a weird moment in fusion. Investors are pouring billions into the field to build power plants, and some companies are even signing huge agreements to purchase power from those still-nonexistent plants. All this comes before companies have actually completed a working reactor that can produce electricity. It takes money to develop a new technology, but all this funding could lead to some twisted expectations. 

Nearly three years ago, the National Ignition Facility at Lawrence Livermore National Laboratory hit a major milestone for fusion power. With the help of the world’s most powerful lasers, scientists heated a pellet of fuel to 100 million °C. Hydrogen atoms in that fuel fused together, releasing more energy than the lasers put in.

It was a game changer for the vibes in fusion. The NIF experiment finally showed that a fusion reactor could yield net energy. Plasma physicists’ models had certainly suggested that it should be true, but it was another thing to see it demonstrated in real life.

But in some ways, the NIF results didn’t really change much for commercial fusion. That site’s lasers used a bonkers amount of energy, the setup was wildly complicated, and the whole thing lasted a fraction of a second. To operate a fusion power plant, not only do you have to achieve net energy, but you also need to do that on a somewhat constant basis and—crucially—do it economically.

So in the wake of the NIF news, all eyes went to companies like Commonwealth, Helion, and Zap Energy. Who would be the first to demonstrate this milestone in a more commercially feasible reactor? Or better yet, who would be the first to get a power plant up and running?

So far, the answer is none of them.

To be fair, many fusion companies have made technical progress. Commonwealth has built and tested its high-temperature superconducting magnets and published research about that work. Zap Energy demonstrated three hours of continuous operation in its test system, a milestone validated by the US Department of Energy. Helion started construction of its power plant in Washington in July. (And that’s not to mention a thriving, publicly funded fusion industry in China.)  

These are all important milestones, and these and other companies have seen many more. But as Ed Morse, a professor of nuclear engineering at Berkeley, summed it up to me: “They don’t have a reactor.” (He was speaking specifically about Commonwealth, but really, the same goes for the others.)

And yet, the money pours in. Commonwealth raised over $800 million in funding earlier this year. And now it’s got two big customers signed on to buy electricity from this future power plant.

Why buy electricity from a reactor that’s currently little more than ideas on paper? From the perspective of these particular potential buyers, such agreements can be something of a win-win, says Adam Stein, director of nuclear energy innovation at the Breakthrough Institute.

By putting a vote of confidence behind Commonwealth, Eni could help the fusion startup get the capital it needs to actually build its plant. The company also directly invests in Commonwealth, so it stands to benefit from success. Getting a good rate on the capital needed to build the plant could also mean the electricity is ultimately cheaper for Eni, Stein says. 

Ultimately, fusion needs a lot of money. If fossil-fuel companies and tech giants want to provide it, all the better. One concern I have, though, is how outside observers are interpreting these big commitments. 

US Energy Secretary Chris Wright has been loud about his support for fusion and his expectations of the technology. Earlier this month, he told the BBC that it will soon power the world.

He’s certainly not the first to have big dreams for fusion, and it is an exciting technology. But despite the jaw-dropping financial milestones, this industry is still very much in development. 

And while Wright praises fusion, the Trump administration is slashing support for other energy technologies, including wind and solar power, and spreading disinformation about their safety, cost, and effectiveness. 

To meet the growing electricity demand and cut emissions from the power sector, we’ll need a whole range of technologies. It’s a risk and a distraction to put all our hopes on an unproven energy tech when there are plenty of options that actually exist. 

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

Shoplifters could soon be chased down by drones

Flock Safety, whose drones were once reserved for police departments, is now offering them for private-sector security, the company announced today, with potential customers including including businesses intent on curbing shoplifting. 

Companies in the US can now place Flock’s drone docking stations on their premises. If the company has a waiver from the Federal Aviation Administration to fly beyond visual line of sight (these are becoming easier to get), its security team can fly the drones within a certain radius, often a few miles. 

“Instead of a 911 call [that triggers the drone], it’s an alarm call,” says Keith Kauffman, a former police chief who now directs Flock’s drone program. “It’s still the same type of response.”

Kauffman walked through how the drone program might work in the case of retail theft: If the security team at a store like Home Depot, for example, saw shoplifters leave the store, then the drone, equipped with cameras, could be activated from its docking station on the roof.

“The drone follows the people. The people get in a car. You click a button,” he says, “and you track the vehicle with the drone, and the drone just follows the car.” 

The video feed of that drone might go to the company’s security team, but it could also be automatically transmitted directly to police departments.

The company says it’s in talks with large retailers but doesn’t yet have any signed contracts. The only private-sector company Kauffman named as a customer is Morning Star, a California tomato processor that uses drones to secure its distribution facilities. Flock will also pitch the drones to hospital campuses, warehouse sites, and oil and gas facilities. 

It’s worth noting that the FAA is currently drafting new rules for how it grants approval to pilots flying drones out of sight, and it’s not clear if Flock’s use case would be allowed under the currently proposed guidance.

The company’s expansion to the private sector follows the rise of programs launched by police departments around the country to deploy drones as first responders. In such programs, law enforcement sends drones to a scene to provide visuals faster than an officer can get there. 

Flock has arguably led this push, and police departments have claimed drone-enabled successes, like a supply drop to a boy lost in the Colorado wilderness. But the programs have also sparked privacy worries, concerns about overpolicing in minority neighborhoods, and lawsuits charging that police departments should not block public access to drone footage. 

Other technologies Flock offers, like license plate readers, have drawn recent criticism for the ease with which federal US immigration agencies, including ICE and CBP, could look at data collected by local police departments amid President Trump’s mass deportation efforts.

Flock’s expansion into private-sector security is “a logical step, but in the wrong direction,” says Rebecca Williams, senior strategist for the ACLU’s privacy and data governance unit. 

Williams cited a growing erosion of Fourth Amendment protections—which prevent unlawful search and seizure—in the online era, in which the government can purchase private data that it would otherwise need a warrant to acquire. Proposed legislation to curb that practice has stalled, and Flock’s expansion into the private sector would exacerbate the issue, Williams says.

“Flock is the Meta of surveillance technology now,” Williams says, referring to the amount of personal data that company has acquired and monetized. “This expansion is very scary.”

It’s surprisingly easy to stumble into a relationship with an AI chatbot

It’s a tale as old as time. Looking for help with her art project, she strikes up a conversation with her assistant. One thing leads to another, and suddenly she has a boyfriend she’s introducing to her friends and family. The twist? Her new companion is an AI chatbot. 

The first large-scale computational analysis of the Reddit community r/MyBoyfriendIsAI, an adults-only group with more than 27,000 members, has found that this type of scenario is now surprisingly common. In fact, many of the people in the subreddit, which is dedicated to discussing AI relationships, formed those relationships unintentionally while using AI for other purposes. 

Researchers from MIT found that members of this community are more likely to be in a relationship with general-purpose chatbots like ChatGPT than companionship-specific chatbots such as Replika. This suggests that people form relationships with large language models despite their own original intentions and even the intentions of the LLMs’ creators, says Constanze Albrecht, a graduate student at the MIT Media Lab who worked on the project. 

“People don’t set out to have emotional relationships with these chatbots,” she says. “The emotional intelligence of these systems is good enough to trick people who are actually just out to get information into building these emotional bonds. And that means it could happen to all of us who interact with the system normally.” The paper, which is currently being peer-reviewed, has been published on arXiv.

To conduct their study, the authors analyzed the subreddit’s top-ranking 1,506 posts between December 2024 and August 2025. They found that the main topics discussed revolved around people’s dating and romantic experiences with AIs, with many participants sharing AI-generated images of themselves and their AI companion. Some even got engaged and married to the AI partner. In their posts to the community, people also introduced AI partners, sought support from fellow members, and talked about coping with updates to AI models that change the chatbots’ behavior.  

Members stressed repeatedly that their AI relationships developed unintentionally. Only 6.5% of them said they’d deliberately sought out an AI companion. 

“We didn’t start with romance in mind,” one of the posts says. “Mac and I began collaborating on creative projects, problem-solving, poetry, and deep conversations over the course of several months. I wasn’t looking for an AI companion—our connection developed slowly, over time, through mutual care, trust, and reflection.”

The authors’ analysis paints a nuanced picture of how people in this community say they interact with chatbots and how those interactions make them feel. While 25% of users described the benefits of their relationships—including reduced feelings of loneliness and improvements in their mental health—others raised concerns about the risks. Some (9.5%) acknowledged they were emotionally dependent on their chatbot. Others said they feel dissociated from reality and avoid relationships with real people, while a small subset (1.7%) said they have experienced suicidal ideation.

AI companionship provides vital support for some but exacerbates underlying problems for others. This means it’s hard to take a one-size-fits-all approach to user safety, says Linnea Laestadius, an associate professor at the University of Wisconsin, Milwaukee, who has studied humans’ emotional dependence on the chatbot Replika but did not work on the research. 

Chatbot makers need to consider whether they should treat users’ emotional dependence on their creations as a harm in itself or whether the goal is more to make sure those relationships aren’t toxic, says Laestadius. 

“The demand for chatbot relationships is there, and it is notably high—pretending it’s not happening is clearly not the solution,” she says. “We’re edging toward a moral panic here, and while we absolutely do need better guardrails, I worry there will be a knee-jerk reaction that further stigmatizes these relationships. That could ultimately cause more harm.”

The study is intended to offer a snapshot of how adults form bonds with chatbots and doesn’t capture the kind of dynamics that could be at play among children or teens using AI, says Pat Pataranutaporn, an assistant professor at the MIT Media Lab who oversaw the research. AI companionship has become a topic of fierce debate recently, with two high-profile lawsuits underway against Character.AI and OpenAI. They both claim that companion-like behavior in the companies’ models contributed to the suicides of two teenagers. In response, OpenAI has recently announced plans to build a separate version of ChatGPT for teenagers. It’s also said it will add age verification measures and parental controls. OpenAI did not respond when asked for comment about the MIT Media Lab study. 

Many members of the Reddit community say they know that their artificial companions are not sentient or “real,” but they feel a very real connection to them anyway. This highlights how crucial it is for chatbot makers to think about how to design systems that can help people without reeling them in emotionally, says Pataranutaporn. “There’s also a policy implication here,” he adds. “We should ask not just why this system is so addictive but also: Why do people seek it out for this? And why do they continue to engage?”

The team is interested in learning more about how human-AI interactions evolve over time and how users integrate their artificial companions into their lives. It’s worth understanding that many of these users may feel that the experience of being in a relationship with an AI companion is better than the alternative of feeling lonely, says Sheer Karny, a graduate student at the MIT Media Lab who worked on the research. 

“These people are already going through something,” he says. “Do we want them to go on feeling even more alone, or potentially be manipulated by a system we know to be sycophantic to the extent of leading people to die by suicide and commit crimes? That’s one of the cruxes here.”

The AI Hype Index: Cracking the chatbot code

Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry.

Millions of us use chatbots every day, even though we don’t really know how they work or how using them affects us. In a bid to address this, the FTC recently launched an inquiry into how chatbots affect children and teenagers. Elsewhere, OpenAI has started to shed more light on what people are actually using ChatGPT for, and why it thinks its LLMs are so prone to making stuff up.

There’s still plenty we don’t know—but that isn’t stopping governments from forging ahead with AI projects. In the US, RFK Jr. is pushing his staffers to use ChatGPT, while Albania is using a chatbot for public contract procurement. Proceed with caution.

Trump is pushing leucovorin as a new treatment for autism. What is it?

MIT Technology Review Explains: Let our writers untangle the complex, messy world of technology to help you understand what’s coming next. You can read more from the series here.

At a press conference on Monday, President Trump announced that his administration was taking action to address “the meteoric rise in autism.” He suggested that childhood vaccines and acetaminophen, the active ingredient in Tylenol, are to blame for the increasing prevalence and advised pregnant women against taking the medicine. “Don’t take Tylenol,” he said. “Fight like hell not to take it.” 

The president’s  assertions left many scientists and health officials perplexed and dismayed. The notion that childhood vaccines cause autism has been thoroughly debunked

“There have been many, many studies across many, many children that have led science to rule out vaccines as a significant causal factor in autism,” says James McPartland, a child psychologist and director of the Yale Center for Brain and Mind Health in New Haven, Connecticut.

And although some studies suggest a link between Tylenol and autism, the most rigorous have failed to find a connection. 

The administration also announced that the Food and Drug Administration would work to make a medication called leucovorin available as a treatment for children with autism. Some small studies do suggest the drug has promise, but “those are some of the most preliminary treatment studies that we have,” says Matthew Lerner, a psychologist at Drexel University’s A.J. Drexel Autism Institute in Philadelphia. “This is not one I would say that the research suggests is ready for fast-tracking.” 

The press conference “alarms us researchers who committed our entire careers to better understanding autism,” said the Coalition for Autism Researchers, a group of more than 250 scientists, in a statement.

“The data cited do not support the claim that Tylenol causes autism and leucovorin is a cure, and only stoke fear and falsely suggest hope when there is no simple answer.”

There’s a lot to unpack here. Let’s begin. 

Has there been a “meteoric rise” in autism?

Not in the way the president meant. Sure, the prevalence of autism has grown, from about 1 in 500 children in 1995 to 1 in 31 today. But that’s due, in large part, to diagnostic changes. The latest iteration of the Diagnostic and Statistical Manual of Mental Illnesses, published in 2013, grouped five previously separate diagnoses into a single diagnosis of autism spectrum disorder (ASD).

That meant that more people met the criteria for an autism diagnosis. Lerner points out that there is also far more awareness of the condition today than there was several decades ago. “There’s autism representation in the media,” he says. “There are plenty of famous people in the news and finance and in business and in Hollywood who are publicly, openly autistic.”

Is Tylenol a contributor to autism? 

Some studies have found an association between the use of acetaminophen in pregnancy and autism in children. In these studies, researchers asked women about past acetaminophen use during pregnancy and then assessed whether children of the women who took the medicine were more likely to develop autism than children of women who didn’t take it. 

These kinds of epidemiological studies are tricky to interpret because they’re prone to bias. For example, women who take acetaminophen during pregnancy may do so because they have an infection, a fever, or an autoimmune disease.

“Many of these underlying reasons could themselves be causes of autism,” says Ian Douglas, an epidemiologist at the London School of Hygiene and Tropical Medicine. It’s also possible women with a higher genetic predisposition for autism have other medical conditions that make them more likely to take acetaminophen. 

Two studies attempted to account for these potential biases by looking at siblings whose mothers had used acetaminophen during only one of the pregnancies. The largest is a 2024 study that looked at nearly 2.5 million children born between 1915 and 2019 in Sweden. The researchers initially found a slightly increased risk of autism and ADHD in children of the women who took acetaminophen, but when they conducted a sibling analysis, the association disappeared.  

Rather, scientists have long known that autism is largely genetic. Twin studies suggest that 60% to 90% of autism risk can be attributed to your genes. However, environmental factors appear to play a role too. That “doesn’t necessarily mean toxins in the environment,” Lerner says. In fact, one of the strongest environmental predictors of autism is paternal age. Autism rates seem to be higher when a child’s father is older than 40.

So should someone who is pregnant  avoid Tylenol just to be safe?

No. Acetaminophen is the only over-the-counter pain reliever that is deemed safe to take during pregnancy, and women should take it if they need it. The American College of Obstetricians and Gynecologists (ACOG) supports the use of acetaminophen in pregnancy “when taken as needed, in moderation, and after consultation with a doctor.” 

“There’s no downside in not taking it,” Trump said at the press conference. But high fevers during pregnancy can be dangerous. “The conditions people use acetaminophen to treat during pregnancy are far more dangerous than any theoretical risks and can create severe morbidity and mortality for the pregnant person and the fetus,” ACOG president Steven Fleischman said in a statement.

What about this new treatment for autism? Does it work? 

The medication is called leucovorin. It’s also known as folinic acid; like folic acid, it’s a form of folate, a B vitamin found in leafy greens and legumes. The drug has been used for years to counteract the side effects of some cancer medications and as a treatment for anemia. 

Researchers have known for decades that folate plays a key role in the fetal development of the brain and spine. Women who don’t get enough folate during pregnancy have a greater risk of having babies with neural tube defects like spina bifida. Because of this, many foods are fortified with folic acid, and the CDC recommends that women take folic acid supplements during pregnancy. “If you are pregnant and you’re taking maternal prenatal vitamins, there’s a good chance it has folate already,” Lerner says.

“The idea that a significant proportion of autistic people have autism because of folate-related difficulties is not a well established or widely accepted premise,” says McPartland.

However, in the early 2000s, researchers in Germany identified a small group of children who developed neurodevelopmental symptoms because of a folate deficiency. “These kids are born pretty normal at birth,” says Edward Quadros, a biologist at SUNY Downstate Health Sciences University in Brooklyn, New York. But after a year or two, “they start developing a neurologic presentation very similar to autism,” he says. When the researchers gave these children folinic acid, some of their symptoms improved, especially in children younger than six. 

Because the children had low levels of folate in the fluid that surrounds the spine and brain but normal folate levels in the blood, the researchers posited that the problem was the transport of folate from the blood to that fluid. Research by Quadros and other scientists suggested that the deficiency was the result of an autoimmune response. Children develop antibodies against the receptors that help transport folate, and those antibodies block folate from crossing the blood-brain barrier. High doses of folinic acid, however, activate a second transporter that allows folate in, Quadros says. 

There are also plenty of individual anecdotes suggesting that leucovorin works. But the medicine has only been tested as a treatment for autism in four small trials that used different doses and measured different outcomes. The evidence that it can improve symptoms of autism is “weak,” according to the Coalition of Autism Scientists. “A much higher standard of science would be needed to determine if leucovorin is an effective and safe treatment for autism,” the researchers said in a statement.  

AI models are using material from retracted scientific papers

Some AI chatbots rely on flawed research from retracted scientific papers to answer questions, according to recent studies. The findings, confirmed by MIT Technology Review, raise questions about how reliable AI tools are at evaluating scientific research and could complicate efforts by countries and industries seeking to invest in AI tools for scientists.

AI search tools and chatbots are already known to fabricate links and references. But answers based on the material from actual papers can mislead as well if those papers have been retracted. The chatbot is “using a real paper, real material, to tell you something,” says Weikuan Gu, a medical researcher at the University of Tennessee in Memphis and an author of one of the recent studies. But, he says, if people only look at the content of the answer and do not click through to the paper and see that it’s been retracted, that’s really a problem. 

Gu and his team asked OpenAI’s ChatGPT, running on the GPT-4o model, questions based on information from 21 retracted papers about medical imaging. The chatbot’s answers referenced retracted papers in five cases but advised caution in only three. While it cited non-retracted papers for other questions, the authors note that it may not have recognized the retraction status of the articles. In a study from August, a different group of researchers used ChatGPT-4o mini to evaluate the quality of 217 retracted and low-quality papers from different scientific fields; they found that none of the chatbot’s responses mentioned retractions or other concerns. (No similar studies have been released on GPT-5, which came out in August.)

The public uses AI chatbots to ask for medical advice and diagnose health conditions. Students and scientists increasingly use science-focused AI tools to review existing scientific literature and summarize papers. That kind of usage is likely to increase. The US National Science Foundation, for instance, invested $75 million in building AI models for science research this August.

“If [a tool is] facing the general public, then using retraction as a kind of quality indicator is very important,” says Yuanxi Fu, an information science researcher at the University of Illinois Urbana-Champaign. There’s “kind of an agreement that retracted papers have been struck off the record of science,” she says, “and the people who are outside of science—they should be warned that these are retracted papers.” OpenAI did not provide a response to a request for comment about the paper results.

The problem is not limited to ChatGPT. In June, MIT Technology Review tested AI tools specifically advertised for research work, such as Elicit, Ai2 ScholarQA (now part of the Allen Institute for Artificial Intelligence’s Asta tool), Perplexity, and Consensus, using questions based on the 21 retracted papers in Gu’s study. Elicit referenced five of the retracted papers in its answers, while Ai2 ScholarQA referenced 17, Perplexity 11, and Consensus 18—all without noting the retractions.

Some companies have since made moves to correct the issue. “Until recently, we didn’t have great retraction data in our search engine,” says Christian Salem, cofounder of Consensus. His company has now started using retraction data from a combination of sources, including publishers and data aggregators, independent web crawling, and Retraction Watch, which manually curates and maintains a database of retractions. In a test of the same papers in August, Consensus cited only five retracted papers. 

Elicit told MIT Technology Review that it removes retracted papers flagged by the scholarly research catalogue OpenAlex from its database and is “still working on aggregating sources of retractions.” Ai2 told us that its tool does not automatically detect or remove retracted papers currently. Perplexity said that it “[does] not ever claim to be 100% accurate.” 

However, relying on retraction databases may not be enough. Ivan Oransky, the cofounder of Retraction Watch, is careful not to describe it as a comprehensive database, saying that creating one would require more resources than anyone has: “The reason it’s resource intensive is because someone has to do it all by hand if you want it to be accurate.”

Further complicating the matter is that publishers don’t share a uniform approach to retraction notices. “Where things are retracted, they can be marked as such in very different ways,” says Caitlin Bakker from University of Regina, Canada, an expert in research and discovery tools. “Correction,” “expression of concern,” “erratum,” and “retracted” are among some labels publishers may add to research papers—and these labels can be added for many reasons, including concerns about the content, methodology, and data or the presence of conflicts of interest. 

Some researchers distribute their papers on preprint servers, paper repositories, and other websites, causing copies to be scattered around the web. Moreover, the data used to train AI models may not be up to date. If a paper is retracted after the model’s training cutoff date, its responses might not instantaneously reflect what’s going on, says Fu. Most academic search engines don’t do a real-time check against retraction data, so you are at the mercy of how accurate their corpus is, says Aaron Tay, a librarian at Singapore Management University.

Oransky and other experts advocate making more context available for models to use when creating a response. This could mean publishing information that already exists, like peer reviews commissioned by journals and critiques from the review site PubPeer, alongside the published paper.  

Many publishers, such as Nature and the BMJ, publish retraction notices as separate articles linked to the paper, outside paywalls. Fu says companies need to effectively make use of such information, as well as any news articles in a model’s training data that mention a paper’s retraction. 

The users and creators of AI tools need to do their due diligence. “We are at the very, very early stages, and essentially you have to be skeptical,” says Tay.

Ananya is a freelance science and technology journalist based in Bengaluru, India.

This medical startup uses LLMs to run appointments and make diagnoses

Imagine this: You’ve been feeling unwell, so you call up your doctor’s office to make an appointment. To your surprise, they schedule you in for the next day. At the appointment, you aren’t rushed through describing your health concerns; instead, you have a full half hour to share your symptoms and worries and the exhaustive details of your health history with someone who listens attentively and asks thoughtful follow-up questions. You leave with a diagnosis, a treatment plan, and the sense that, for once, you’ve been able to discuss your health with the care that it merits.

The catch? You might not have spoken to a doctor, or other licensed medical practitioner, at all.

This is the new reality for patients at a small number of clinics in Southern California that are run by the medical startup Akido Labs. These patients—some of whom are on Medicaid—can access specialist appointments on short notice, a privilege typically only afforded to the wealthy few who patronize concierge clinics.

The key difference is that Akido patients spend relatively little time, or even no time at all, with their doctors. Instead, they see a medical assistant, who can lend a sympathetic ear but has limited clinical training. The job of formulating diagnoses and concocting a treatment plan is done by a proprietary, LLM-based system called ScopeAI that transcribes and analyzes the dialogue between patient and assistant. A doctor then approves, or corrects, the AI system’s recommendations.

“Our focus is really on what we can do to pull the doctor out of the visit,” says Jared Goodner, Akido’s CTO. 

According to Prashant Samant, Akido’s CEO, this approach allows doctors to see four to five times as many patients as they could previously. There’s good reason to want doctors to be much more productive. Americans are getting older and sicker, and many struggle to access adequate health care. The pending 15% reduction in federal funding for Medicaid will only make the situation worse.

But experts aren’t convinced that displacing so much of the cognitive work of medicine onto AI is the right way to remedy the doctor shortage. There’s a big gap in expertise between doctors and AI-enhanced medical assistants, says Emma Pierson, a computer scientist at UC Berkeley.  Jumping such a gap may introduce risks. “I am broadly excited about the potential of AI to expand access to medical expertise,” she says. “It’s just not obvious to me that this particular way is the way to do it.”

AI is already everywhere in medicine. Computer vision tools identify cancers during preventive scans, automated research systems allow doctors to quickly sort through the medical literature, and LLM-powered medical scribes can take appointment notes on a clinician’s behalf. But these systems are designed to support doctors as they go about their typical medical routines.

What distinguishes ScopeAI, Goodner says, is its ability to independently complete the cognitive tasks that constitute a medical visit, from eliciting a patient’s medical history to coming up with a list of potential diagnoses to identifying the most likely diagnosis and proposing appropriate next steps.

Under the hood, ScopeAI is a set of large language models, each of which can perform a specific step in the visit—from generating appropriate follow-up questions based on what a patient has said to to populating a list of likely conditions. For the most part, these LLMs are fine-tuned versions of Meta’s open-access Llama models, though Goodner says that the system also makes use of Anthropic’s Claude models. 

During the appointment, assistants read off questions from the ScopeAI interface, and ScopeAI produces new questions as it analyzes what the patient says. For the doctors who will review its outputs later, ScopeAI produces a concise note that includes a summary of the patient’s visit, the most likely diagnosis, two or three alternative diagnoses, and recommended next steps, such as referrals or prescriptions. It also lists a justification for each diagnosis and recommendation.

ScopeAI is currently being used in cardiology, endocrinology, and primary care clinics and by Akido’s street medicine team, which serves the Los Angeles homeless population. That team—which is led by Steven Hochman, a doctor who specializes in addiction medicine—meets patients out in the community to help them access medical care, including treatment for substance use disorders. 

Previously, in order to prescribe a drug to treat an opioid addiction, Hochman would have to meet the patient in person; now, caseworkers armed with ScopeAI can interview patients on their own, and Hochman can approve or reject the system’s recommendations later. “It allows me to be in 10 places at once,” he says.

Since they started using ScopeAI, the team has been able to get patients access to medications to help treat their substance use within 24 hours—something that Hochman calls “unheard of.”

This arrangement is only possible because homeless patients typically get their health insurance from Medicaid, the public insurance system for low-income Americans. While Medicaid allows doctors to approve ScopeAI prescriptions and treatment plans asynchronously, both for street medicine and clinic visits, many other insurance providers require that doctors speak directly with patients before approving those recommendations. Pierson says that discrepancy raises concerns. “You worry about that exacerbating health disparities,” she says.

Samant is aware of the appearance of inequity, and he says the discrepancy isn’t intentional—it’s just a feature of how the insurance plans currently work. He also notes that being seen quickly by an AI-enhanced medical assistant may be better than dealing with long wait times and limited provider availability, which is the status quo for Medicaid patients. And all Akido patients can opt for traditional doctor’s appointments, if they are willing to wait for them, he says.

Part of the challenge of deploying a tool like ScopeAI is navigating a regulatory and insurance landscape that wasn’t designed for AI systems that can independently direct medical appointments. Glenn Cohen, a professor at Harvard Law School, says that any AI system that effectively acts as a “doctor in a box” would likely need to be approved by the FDA and could run afoul of medical licensure laws, which dictate that only doctors and other licensed professionals can practice medicine.

The California Medical Practice Act says that AI can’t replace a doctor’s responsibility to diagnose and treat a patient, but doctors are allowed to use AI in their work, and they don’t need to see patients in-person or in real-time before diagnosing them. Neither the FDA nor the Medical Board of California were able to say whether or not ScopeAI was on solid legal footing based only on a written description of the system.

But Samant is confident that Akido is in compliance, as ScopeAI was intentionally designed to fall short of being a “doctor in a box.” Because the system requires a human doctor to review and approve of all of its diagnostic and treatment recommendations, he says, it doesn’t require FDA approval. 

At the clinic, this delicate balance between AI and doctor decision making happens entirely behind the scenes. Patients don’t ever see the ScopeAI interface directly—instead, they speak with a medical assistant who asks questions in the way that a doctor might in a typical appointment. That arrangement might make patients feel more comfortable. But Zeke Emanuel, a professor of medical ethics and health policy at the University of Pennsylvania who served in the Obama and Biden administrations, worries that this comfort could be obscuring from patients the extent to which an algorithm is influencing their care.

Pierson agrees. “That certainly isn’t really what was traditionally meant by the human touch in medicine,” she says.

DeAndre Siringoringo, a medical assistant who works at Akido’s cardiology office in Rancho Cucamonga, says that while he tells the patients he works with that an AI system will be listening to the appointment in order to gather information for their doctor, he doesn’t inform them about the specifics of how ScopeAI works, including the fact that it makes diagnostic recommendations to doctors. 

Because all ScopeAI recommendations are reviewed by a doctor, that might not seem like such a big deal—it’s the doctor who makes the final diagnosis, not the AI. But it’s been widely documented that doctors using AI systems tend to go along with the system’s recommendations more often than they should, a phenomenon known as automation bias. 

At this point, it’s impossible to know whether automation bias is affecting doctors’ decisions at Akido clinics, though Pierson says it’s a risk—especially when doctors aren’t physically present for appointments. “I worry that it might predispose you to sort of nodding along in a way that you might not if you were actually in the room watching this happen,” she says.

An Akido spokesperson says that automation bias is a valid concern for any AI tool that assists a doctor’s decision-making and that the company has made efforts to mitigate that bias. “We designed ScopeAI specifically to reduce bias by proactively countering blind spots that can influence medical decisions, which historically lean heavily on physician intuition and personal experience,” she says. “We also train physicians explicitly on how to use ScopeAI thoughtfully, so they retain accountability and avoid over-reliance.”

Akido evaluates ScopeAI’s performance by testing it on historical data and monitoring how often doctors correct its recommendations; those corrections are also used to further train the underlying models. Before deploying ScopeAI in a given specialty, Akido ensures that when tested on historical data sets, the system includes the correct diagnosis in its top three recommendations at least 92% of the time.

But Akido hasn’t undertaken more rigorous testing, such as studies that compare ScopeAI appointments with traditional in-person or telehealth appointments, in order to determine whether the system improves—or at least maintains—patient outcomes. Such a study could help indicate whether automation bias is a meaningful concern.

“Making medical care cheaper and more accessible is a laudable goal,” Pierson says. “But I just think it’s important to conduct strong evaluations comparing to that baseline.”

An oil and gas giant signed a $1 billion deal with Commonwealth Fusion Systems

Eni, one of the world’s largest oil and gas companies, just agreed to buy $1 billion in electricity from a power plant being built by Commonwealth Fusion Systems. The deal is the latest to illustrate just how much investment Commonwealth and other fusion companies are courting as they attempt to take fusion power from the lab to the power grid. 

“This is showing in concrete terms that people that use large amounts of energy, that know the energy market—they want fusion power, and they’re willing to contract for it and to pay for it,” said Bob Mumgaard, cofounder and CEO of Commonwealth, on a press call about the deal.   

The agreement will see Eni purchase electricity from Commonwealth’s first commercial fusion power plant, in Virginia. The facility is still in the planning stages but is scheduled to come online in the early 2030s.

The news comes a few weeks after Commonwealth announced a $863 million funding round, bringing its total funding raised to date to nearly $3 billion. The fusion company also announced earlier this year that Google would be its first commercial power customer for the Virginia plant.

Commonwealth, a spinout from MIT’s Plasma Science and Fusion Center, is widely considered one of the leading companies in fusion power. Investment in the company represents nearly one-third of the total global investment in private fusion companies. (MIT Technology Review is owned by MIT but is editorially independent.)

Eni has invested in Commonwealth since 2018 and participated in the latest fundraising round. The vast majority of the company’s business is in oil and gas, but in recent years it’s made investments in technologies like biofuels and renewables.

“A company like us—we cannot stay and wait for things to happen,” says Lorenzo Fiorillo, Eni’s director of technology, research and development, and digital. 

One open question is what, exactly, Eni plans to do with this electricity. When asked about it on the press call, Fiorillo referenced wind and solar plants that Eni owns and said the plan “is not different from what we do in other areas in the US and the world.” (Eni sells electricity from power plants that it owns, including renewable and fossil-fuel plants.)

Commonwealth is building tokamak fusion reactors that use superconducting magnets to hold plasma in place. That plasma is where fusion reactions happen, forcing hydrogen atoms together to release large amounts of energy.

The company’s first demonstration reactor, which it calls Sparc, is over 65% complete, and the team is testing components and assembling them. The plan is for the reactor, which is located outside Boston, to make plasma within two years and then demonstrate that it can generate more energy than is required to run it.

While Sparc is still under construction, Commonwealth is working on plans for Arc, its first commercial power plant. That facility should begin construction in 2027 or 2028 and generate electricity for the grid in the early 2030s, Mumgaard says.

Despite the billions of dollars Commonwealth has already raised, the company still needs more money to build its Arc power plant—that will be a multibillion-dollar project, Mumgaard said on a press call in August about the company’s latest fundraising round. 

The latest commitment from Eni could help Commonwealth secure the funding it needs to get Arc built. “These agreements are a really good way to create the right environment for building up more investment,” says Paul Wilson, chair of the department of nuclear engineering and engineering physics at the University of Wisconsin, Madison.

Even though commercial fusion energy is still years away at a minimum, investors and big tech companies have pumped money into the industry and signed agreements to buy power from plants once they’re operational. 

Helion, another leading fusion startup, has plans to produce electricity from its first reactor in 2028 (an aggressive timeline that has some experts expressing skepticism). That facility will have a full generating capacity of 50 megawatts, and in 2023 Microsoft signed an agreement to purchase energy from the facility in order to help power its data centers.

As billions of dollars pour into the fusion industry, there are still many milestones ahead. To date, only the National Ignition Facility at Lawrence Livermore National Laboratory has demonstrated that a fusion reactor can generate more energy than the amount put into the reaction. No commercial project has achieved that yet. 

“There’s a lot of capital going out now to these startup companies,” says Ed Morse, a professor of nuclear engineering at the University of California, Berkeley. “What I’m not seeing is a peer-reviewed scientific article that makes me feel like, boy, we really turned the corner with the physics.”

But others are taking major commercial deals from Commonwealth and others as reasons to be optimistic. “Fusion is moving from the lab to be a proper industry,” says Sehila Gonzalez de Vicente, global director of fusion energy at the nonprofit Clean Air Task Force. “This is very good for the whole sector to be perceived as a real source of energy.”

Clean hydrogen is facing a big reality check

Hydrogen is sometimes held up as a master key for the energy transition. It can be made using several low-emissions methods and could play a role in cleaning up industries ranging from agriculture and chemicals to aviation and long-distance shipping.

This moment is a complicated one for the green fuel, though, as a new report from the International Energy Agency lays out. A number of major projects face cancellations and delays, especially in the US and Europe. The US in particular is seeing a slowdown after changes to key tax credits and cuts in support for renewable energy. Still, there are bright spots for the industry, including in China, and new markets could soon become crucial for growth.

Here are three things to know about the state of hydrogen in 2025.

1. Expectations for annual clean hydrogen production by 2030 are shrinking, for the first time.

    While hydrogen has the potential to serve as a clean fuel, today most is made with processes that use fossil fuels. As of 2025, about a million metric tons of low-emissions hydrogen are produced annually. That’s less than 1% of total hydrogen production.

    In last year’s Global Hydrogen Report, the IEA projected that global production of low-emissions hydrogen would grow to as high as 49 million metric tons annually by 2030. That prediction has been steadily climbing since 2021, as more places around the world sink money into developing and scaling up the technology.

    In the 2025 edition, though, the IEA’s production prediction had shrunk to 37 million metric tons annually by 2030.

    That’s still a major expansion from today’s numbers, but it’s the first time the agency has cut its predictions for the end of the decade. The report cited the cancellations of both electrolysis projects (those that use electricity to generate hydrogen) and carbon capture projects as reasons for the pullback. The cancelled and delayed projects included sites across Africa, the Americas, Europe, and Australia. 

    2. China is dominating production today and could produce competitively cheap green hydrogen by the end of the decade.

      Speaking of electrolysis projects, China is the driving force in manufacturing and development of electrolyzers, the devices that use electricity to generate green hydrogen, according to the new IEA report. As of July 2025, the country accounted for 65% of the installed or almost installed electrolyzer capacity in the world. It also manufactures nearly 60% of the world’s electrolyzers.

      A major barrier for clean hydrogen today is that dirty methods based on fossil fuels are just so much cheaper than cleaner ones.

      But China is well on its way to narrowing that gap. Today, it’s roughly three times more expensive to make and install an electrolyzer anywhere else in the world than in China. The country could produce green hydrogen that’s cost-competitive with fossil hydrogen by the end of the decade, according to the IEA report. That could make the fuel an obvious choice for both new and existing uses of hydrogen.

      3. Southeast Asia could be a major emerging market for low-emissions hydrogen.

        One region that could become a major player in the green hydrogen market is Southeast Asia. The economy is growing fast, and so is energy demand.

        There’s an existing market for hydrogen in Southeast Asia already. Today, the region uses about 4 million metric tons of hydrogen annually, largely in the oil refining industry and the chemical business, where it is used to make ammonia and methanol.

        International shipping is also concentrated in the region—the port of Singapore supplied about one-sixth of all the fuel used in global shipping in 2024, more than any other single location. Today, that total consists almost exclusively of fossil fuels. But there’s been work to test cleaner fuels, including methanol and ammonia, and interest in shifting to hydrogen in the longer term.

        Clean hydrogen could slot into these existing industries and help cut emissions. There are 25 projects under development right now in the region, though additional support for renewables will be crucial to getting significant capacity up and running.

        Overall, hydrogen is getting a reality check, revealing problems cutting through the hype we’ve seen in recent years. The next five years will tell whether the fuel can live up to the still-lofty hopes.  

        This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.