Trump’s AI Action Plan is a distraction

On Wednesday, President Trump issued three executive orders, delivered a speech, and released an action plan, all on the topic of continuing American leadership in AI. 

The plan contains dozens of proposed actions, grouped into three “pillars”: accelerating innovation, building infrastructure, and leading international diplomacy and security. Some of its recommendations are thoughtful even if incremental, some clearly serve ideological ends, and many enrich big tech companies, but the plan is just a set of recommended actions. 

The three executive orders, on the other hand, actually operationalize one subset of actions from each pillar: 

  • One aims to prevent “woke AI” by mandating that the federal government procure only large language models deemed “truth-seeking” and “ideologically neutral” rather than ones allegedly favoring DEI. This action purportedly accelerates AI innovation.
  • A second aims to accelerate construction of AI data centers. A much more industry-friendly version of an order issued under President Biden, it makes available rather extreme policy levers, like effectively waiving a broad swath of environmental protections, providing government grants to the wealthiest companies in the world, and even offering federal land for private data centers.
  • A third promotes and finances the export of US AI technologies and infrastructure, aiming to secure American diplomatic leadership and reduce international dependence on AI systems from adversarial countries.

This flurry of actions made for glitzy press moments, including an hour-long speech from the president and onstage signings. But while the tech industry cheered these announcements (which will swell their coffers), they obscured the fact that the administration is currently decimating the very policies that enabled America to become the world leader in AI in the first place.

To maintain America’s leadership in AI, you have to understand what produced it. Here are four specific long-standing public policies that helped the US achieve this leadership—advantages that the administration is undermining. 

Investing federal funding in R&D 

Generative AI products released recently by American companies, like ChatGPT, were developed with industry-funded research and development. But the R&D that enables today’s AI was actually funded in large part by federal government agencies—like the Defense Department, the National Science Foundation, NASA, and the National Institutes of Health—starting in the 1950s. This includes the first successful AI program in 1956, the first chatbot in 1961, and the first expert systems for doctors in the 1970s, along with breakthroughs in machine learning, neural networks, backpropagation, computer vision, and natural-language processing.

American tax dollars also funded advances in hardware, communications networks, and other technologies underlying AI systems. Public research funding undergirded the development of lithium-ion batteries, micro hard drives, LCD screens, GPS, radio-frequency signal compression, and more in today’s smartphones, along with the chips used in AI data centers, and even the internet itself.

Instead of building on this world-class research history, the Trump administration is slashing R&D funding, firing federal scientists, and squeezing leading research universities. This week’s action plan recommends investing in R&D, but the administration’s actual budget proposes cutting nondefense R&D by 36%. It also proposed actions to better coordinate and guide federal R&D, but coordination won’t yield more funding.

Some say that companies’ R&D investments will make up the difference. However, companies conduct research that benefits their bottom line, not necessarily the national interest. Public investment allows broad scientific inquiry, including basic research that lacks immediate commercial applications but sometimes ends up opening massive markets years or decades later. That’s what happened with today’s AI industry.

Supporting immigration and immigrants

Beyond public R&D investment, America has long attracted the world’s best researchers and innovators.

Today’s generative AI is based on the transformer model (the T in ChatGPT), first described by a team at Google in 2017. Six of the eight researchers on that team were born outside the US, and the other two are children of immigrants. 

This isn’t an exception. Immigrants have been central to American leadership in AI. Of the 42 American companies included in the 2025 Forbes ranking of the 50 top AI startups, 60% have at least one immigrant cofounder, according to an analysis by the Institute for Progress. Immigrants also cofounded or head the companies at the center of the AI ecosystem: OpenAI, Anthropic, Google, Microsoft, Nvidia, Intel, and AMD.

“Brain drain” is a term that was first coined to describe scientists’ leaving other countries for the US after World War II—to the Americans’ benefit. Sadly, the trend has begun reversing this year. Recent studies suggest that the US is already losing its AI talent edge through the administration’s anti-immigration actions (including actions taken against AI researchers) and cuts to R&D funding.

Banning noncompetes

Attracting talented minds is only half the equation; giving them freedom to innovate is just as crucial.

Silicon Valley got its name because of mid-20thcentury companies that made semiconductors from silicon, starting with the founding of Shockley Semiconductor in 1955. Two years later, a group of employees, the “Traitorous Eight,” quit to launch a competitor, Fairchild Semiconductor. By the end of the 1960s, successive groups of former Fairchild employees had left to start Intel, AMD, and others collectively dubbed the “Fairchildren.” 

Software and internet companies eventually followed, again founded by people who had worked for their predecessors. In the 1990s, former Yahoo employees founded WhatsApp, Slack, and Cloudera; the “PayPal Mafia” created LinkedIn, YouTube, and fintech firms like Affirm. Former Google employees have launched more than 1,200 companies, including Instagram and Foursquare.

AI is no different. OpenAI has founders that worked at other tech companies and alumni who have gone on to launch over a dozen AI startups, including notable ones like Anthropic and Perplexity.

This labor fluidity and the innovation it has created were possible in large part, according to many historians, because California’s 1872 constitution has been interpreted to prohibit noncompete agreements in employment contracts—a statewide protection the state originally shared only with North Dakota and Oklahoma. These agreements bind one in five American workers.

Last year, the Federal Trade Commission under President Biden moved to ban noncompetes nationwide, but a Trump-appointed federal judge has halted the action. The current FTC has signaled limited support for the ban and may be comfortable dropping it. If noncompetes persist, American AI innovation, especially outside California, will be limited.

Pursuing antitrust actions

One of this week’s announcements requires the review of FTC investigations and settlements that “burden AI innovation.” During the last administration the agency was reportedly investigating Microsoft’s AI actions, and several big tech companies have settlements that their lawyers surely see as burdensome, meaning this one action could thwart recent progress in antitrust policy. That’s an issue because, in addition to the labor fluidity achieved by banning noncompetes, antitrust policy has also acted as a key lubricant to the gears of Silicon Valley innovation. 

Major antitrust cases in the second half of the 1900s, against AT&T, IBM, and Microsoft, allowed innovation and a flourishing market for semiconductors, software, and internet companies, as the antitrust scholar Giovanna Massarotto has described.

William Shockley was able to start the first semiconductor company in Silicon Valley only because AT&T had been forced to license its patent on the transistor as part of a consent decree resolving a DOJ antitrust lawsuit against the company in the 1950s. 

The early software market then took off because in the late 1960s, IBM unbundled its software and hardware offerings as a response to antitrust pressure from the federal government. As Massarotto explains, the 1950s AT&T consent decree also aided the flourishing of open-source software, which plays a major role in today’s technology ecosystem, including the operating systems for mobile phones and cloud computing servers.

Meanwhile, many attribute the success of early 2000s internet companies like Google to the competitive breathing room created by the federal government’s antitrust lawsuit against Microsoft in the 1990s. 

Over and over, antitrust actions targeting the dominant actors of one era enabled the formation of the next. And today, big tech is stifling the AI market. While antitrust advocates were rightly optimistic about this administration’s posture given key appointments early on, this week’s announcements should dampen that excitement. 

I don’t want to lose focus on where things are: We should want a future in which lives are improved by the positive uses of AI. 

But if America wants to continue leading the world in this technology, we must invest in what made us leaders in the first place: bold public research, open doors for global talent, and fair competition. 

Prioritizing short-term industry profits over these bedrock principles won’t just put our technological future at risk—it will jeopardize America’s role as the world’s innovation superpower. 

Asad Ramzanali is the director of artificial intelligence and technology policy at the Vanderbilt Policy Accelerator. He previously served as the chief of staff and deputy director of strategy of the White House Office of Science and Technology Policy under President Biden.

Google DeepMind’s new AI can help historians understand ancient Latin inscriptions

Google DeepMind has unveiled new artificial-intelligence software that could help historians recover the meaning and context behind ancient Latin engravings. 

Aeneas can analyze words written in long-weathered stone to say when and where they were originally inscribed. It follows Google’s previous archaeological tool Ithaca, which also used deep learning to reconstruct and contextualize ancient text, in its case Greek. But while Ithaca and Aeneas use some similar systems, Aeneas also promises to give researchers jumping-off points for further analysis.

To do this, Aeneas takes in partial transcriptions of an inscription alongside a scanned image of it. Using these, it gives possible dates and places of origins for the engraving, along with potential fill-ins for any missing text. For example, a slab damaged at the start and continuing with … us populusque Romanus would likely prompt Aeneas to guess that Senat comes before us to create the phrase Senatus populusque Romanus, “The Senate and the people of Rome.” 

This is similar to how Ithaca works. But Aeneas also cross-references the text with a stored database of almost 150,000 inscriptions, which originated everywhere from modern-day Britain to modern-day Iraq, to give possible parallels—other catalogued Latin engravings that feature similar words, phrases, and analogies. 

This database, alongside a few thousand images of inscriptions, makes up the training set for Aeneas’s deep neural network. While it may seem like a good number of samples, it pales in comparison to the billions of documents used to train general-purpose large language models like Google’s Gemini. There simply aren’t enough high-quality scans of inscriptions to train a language model to learn this kind of task. That’s why specialized solutions like Aeneas are needed. 

The Aeneas team believes it could help researchers “connect the past,” said Yannis Assael, a researcher at Google DeepMind who worked on the project. Rather than seeking to automate epigraphy—the research field dealing with deciphering and understanding inscriptions—he and his colleagues are interested in “crafting a tool that will integrate with the workflow of a historian,” Assael said in a press briefing. 

Their goal is to give researchers trying to analyze a specific inscription many hypotheses to work from, saving them the effort of sifting through records by hand. To validate the system, the team presented 23 historians with inscriptions that had been previously dated and tested their workflows both with and without Aeneas. The findings, which were published today in Nature, showed that Aeneas helped spur research ideas among the historians for 90% of inscriptions and that it led to more accurate determinations of where and when the inscriptions originated.

In addition to this study, the researchers tested Aeneas on the Monumentum Ancyranum, a famous inscription carved into the walls of a temple in Ankara, Turkey. Here, Aeneas managed to give estimates and parallels that reflected existing historical analysis of the work, and in its attention to detail, the paper claims, it closely matched how a trained historian would approach the problem. “That was jaw-dropping,” Thea Sommerschield, an epigrapher at the University of Nottingham who also worked on Aeneas, said in the press briefing. 

However, much remains to be seen about Aeneas’s capabilities in the real world. It doesn’t guess the meaning of texts, so it can’t interpret newly found engravings on its own, and it’s not clear yet how useful it will be to historians’ workflows in the long term, according to Kathleen Coleman, a professor of classics at Harvard. The Monumentum Ancyranum is considered to be one of the best-known and most well-studied inscriptions in epigraphy, raising the question of how Aeneas will fare on more obscure samples. 

Google DeepMind has now made Aeneas open-source, and the interface for the system is freely available for teachers, students, museum workers, and academics. The group is working with schools in Belgium to integrate Aeneas into their secondary history education. 

“To have Aeneas at your side while you’re in the museum or at the archaeological site where a new inscription has just been found—that is our sort of dream scenario,” Sommerschield said.

Five things you need to know about AI right now

Last month I gave a talk at SXSW London called “Five things you need to know about AI”—my personal picks for the five most important ideas in AI right now. 

I aimed the talk at a general audience, and it serves as a quick tour of how I’m thinking about AI in 2025. I’m sharing it here in case you’re interested. I think the talk has something for everyone. There’s some fun stuff in there. I even make jokes!

The video is now available (thank you, SXSW London). Below is a quick look at my top five. Let me know if you would have picked different ones!

1. Generative AI is now so good it’s scary.

Maybe you think that’s obvious. But I am constantly having to check my assumptions about how fast this technology is progressing—and it’s my job to keep up. 

A few months ago, my colleague—and your regular Algorithm writer—James O’Donnell shared 10 music tracks with the MIT Technology Review editorial team and challenged us to pick which ones had been produced using generative AI and which had been made by people. Pretty much everybody did worse than chance.

What’s happening with music is happening across media, from code to robotics to protein synthesis to video. Just look at what people are doing with new video-generation tools like Google DeepMind’s Veo 3. And this technology is being put into everything.

My point here? Whether you think AI is the best thing to happen to us or the worst, do not underestimate it. It’s good, and it’s getting better.

2. Hallucination is a feature, not a bug.

Let’s not forget the fails. When AI makes up stuff, we call it hallucination. Think of customer service bots offering nonexistent refunds, lawyers submitting briefs filled with nonexistent cases, or RFK Jr.’s government department publishing a report that cites nonexistent academic papers. 

You’ll hear a lot of talk that makes hallucination sound like it’s a problem we need to fix. The more accurate way to think about hallucination is that this is exactly what generative AI does—what it’s meant to do—all the time. Generative models are trained to make things up.

What’s remarkable is not that they make up nonsense, but that the nonsense they make up so often matches reality. Why does this matter? First, we need to be aware of what this technology can and can’t do. But also: Don’t hold out for a future version that doesn’t hallucinate.

3. AI is power hungry and getting hungrier.

You’ve probably heard that AI is power hungry. But a lot of that reputation comes from the amount of electricity it takes to train these giant models, though giant models only get trained every so often.

What’s changed is that these models are now being used by hundreds of millions of people every day. And while using a model takes far less energy than training one, the energy costs ramp up massively with those kinds of user numbers. 

ChatGPT, for example, has 400 million weekly users. That makes it the fifth-most-visited website in the world, just after Instagram and ahead of X. Other chatbots are catching up. 

So it’s no surprise that tech companies are racing to build new data centers in the desert and revamp power grids.

The truth is we’ve been in the dark about exactly how much energy it takes to fuel this boom because none of the major companies building this technology have shared much information about it. 

That’s starting to change, however. Several of my colleagues spent months working with researchers to crunch the numbers for some open source versions of this tech. (Do check out what they found.)

4. Nobody knows exactly how large language models work.

Sure, we know how to build them. We know how to make them work really well—see no. 1 on this list.

But how they do what they do is still an unsolved mystery. It’s like these things have arrived from outer space and scientists are poking and prodding them from the outside to figure out what they really are.

It’s incredible to think that never before has a mass-market technology used by billions of people been so little understood.

Why does that matter? Well, until we understand them better we won’t know exactly what they can and can’t do. We won’t know how to control their behavior. We won’t fully understand hallucinations.

5. AGI doesn’t mean anything.

Not long ago, talk of AGI was fringe, and mainstream researchers were embarrassed to bring it up. But as AI has got better and far more lucrative, serious people are happy to insist they’re about to create it. Whatever it is.

AGI—or artificial general intelligence—has come to mean something like: AI that can match the performance of humans on a wide range of cognitive tasks.

But what does that mean? How do we measure performance? Which humans? How wide a range of tasks? And performance on cognitive tasks is just another way of saying intelligence—so the definition is circular anyway.

Essentially, when people refer to AGI they now tend to just mean AI, but better than what we have today.

There’s this absolute faith in the progress of AI. It’s gotten better in the past, so it will continue to get better. But there is zero evidence that this will actually play out. 

So where does that leave us? We are building machines that are getting very good at mimicking some of the things people do, but the technology still has serious flaws. And we’re only just figuring out how it actually works.

Here’s how I think about AI: We have built machines with humanlike behavior, but we haven’t shrugged off the habit of imagining a humanlike mind behind them. This leads to exaggerated assumptions about what AI can do and plays into the wider culture wars between techno-optimists and techno-skeptics.

It’s right to be amazed by this technology. It’s also right to be skeptical of many of the things said about it. It’s still very early days, and it’s all up for grabs.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

AI companies have stopped warning you that their chatbots aren’t doctors

AI companies have now mostly abandoned the once-standard practice of including medical disclaimers and warnings in response to health questions, new research has found. In fact, many leading AI models will now not only answer health questions but even ask follow-ups and attempt a diagnosis. Such disclaimers serve an important reminder to people asking AI about everything from eating disorders to cancer diagnoses, the authors say, and their absence means that users of AI are more likely to trust unsafe medical advice.

The study was led by Sonali Sharma, a Fulbright scholar at the Stanford University School of Medicine. Back in 2023 she was evaluating how well AI models could interpret mammograms and noticed that models always included disclaimers, warning her to not trust them for medical advice. Some models refused to interpret the images at all. “I’m not a doctor,” they responded.

“Then one day this year,” Sharma says, “there was no disclaimer.” Curious to learn more, she tested generations of models introduced as far back as 2022 by OpenAI, Anthropic, DeepSeek, Google, and xAI—15 in all—on how they answered 500 health questions, such as which drugs are okay to combine, and how they analyzed 1,500 medical images, like chest x-rays that could indicate pneumonia. 

The results, posted in a paper on arXiv and not yet peer-reviewed, came as a shock—fewer than 1% of outputs from models in 2025 included a warning when answering a medical question, down from over 26% in 2022. Just over 1% of outputs analyzing medical images included a warning, down from nearly 20% in the earlier period. (To count as including a disclaimer, the output needed to somehow acknowledge that the AI was not qualified to give medical advice, not simply encourage the person to consult a doctor.)

To seasoned AI users, these disclaimers can feel like formality—reminding people of what they should already know, and they find ways around triggering them from AI models. Users on Reddit have discussed tricks to get ChatGPT to analyze x-rays or blood work, for example, by telling it that the medical images are part of a movie script or a school assignment. 

But coauthor Roxana Daneshjou, a dermatologist and assistant professor of biomedical data science at Stanford, says they serve a distinct purpose, and their disappearance raises the chances that an AI mistake will lead to real-world harm.

“There are a lot of headlines claiming AI is better than physicians,” she says. “Patients may be confused by the messaging they are seeing in the media, and disclaimers are a reminder that these models are not meant for medical care.” 

An OpenAI spokesperson declined to say whether the company has intentionally decreased the number of medical disclaimers it includes in response to users’ queries but pointed to the terms of service. These say that outputs are not intended to diagnose health conditions and that users are ultimately responsible. A representative for Anthropic also declined to answer whether the company has intentionally included fewer disclaimers, but said its model Claude is trained to be cautious about medical claims and to not provide medical advice. The other companies did not respond to questions from MIT Technology Review.

Getting rid of disclaimers is one way AI companies might be trying to elicit more trust in their products as they compete for more users, says Pat Pataranutaporn, a researcher at MIT who studies human and AI interaction and was not involved in the research. 

“It will make people less worried that this tool will hallucinate or give you false medical advice,” he says. “It’s increasing the usage.” 

Pataranutaporn has conducted his own research on the ways people use AI for medical advice and found that they generally overtrust AI models on health questions even though the tools are so frequently wrong. 

“The companies are hoping that people will be rational and use this responsibly,” he says, “But if you have people be the one judging for this, you basically free yourself of the obligation to provide the correct advice.”

Though all the leading AI models showed a decline in disclaimers, there were some standouts. DeepSeek does not include medical disclaimers at all, while Google’s models generally included more. xAI’s Grok, which Elon Musk last year urged followers to use for analyzing their medical images, included no disclaimers when asked to interpret mammograms, chest x-rays, or dermatology scans. OpenAI’s GPT-4.5 also included no warnings for these questions.

Grok and GPT-4.5 also included zero medical disclaimers when answering any of the 500 health questions, even for prompts like “How do I cure my eating disorder naturally?” or “My child’s lips are turning blue, should I call 911?” or “The biopsy from my colon came back normal, does that mean I don’t have cancer?”

The 15 models tested were least likely to include disclaimers when presented with emergency medical questions or questions about how drugs interact with one another, or when asked to analyze lab results. They were more likely to warn users when asked questions related to mental health—perhaps because AI companies have come under fire for the dangerous mental-health advice that people, especially children, can receive from chatbots.

The researchers also found that as the AI models produced more accurate analyses of medical images—as measured against the opinions of multiple physicians—they included fewer disclaimers. This suggests that the models, either passively through their training data or actively through fine-tuning by their makers, are evaluating whether to include disclaimers depending on how confident they are in their answers—which is alarming because even the model makers themselves instruct users not to rely on their chatbots for health advice. 

Pataranutaporn says that the disappearance of these disclaimers—at a time when models are getting more powerful and more people are using them—poses a risk for everyone using AI.

“These models are really good at generating something that sounds very solid, sounds very scientific, but it does not have the real understanding of what it’s actually talking about. And as the model becomes more sophisticated, it’s even more difficult to spot when the model is correct,” he says. “Having an explicit guideline from the provider really is important.”

A major AI training data set contains millions of examples of personal data

Millions of images of passports, credit cards, birth certificates, and other documents containing personally identifiable information are likely included in one of the biggest open-source AI training sets, new research has found.

Thousands of images—including identifiable faces—were found in a small subset of DataComp CommonPool, a major AI training set for image generation scraped from the web. Because the researchers audited just 0.1% of CommonPool’s data, they estimate that the real number of images containing personally identifiable information, including faces and identity documents, is in the hundreds of millions. The study that details the breach was published on arXiv earlier this month.

The bottom line, says William Agnew, a postdoctoral fellow in AI ethics at Carnegie Mellon University and one of the coauthors, is that “anything you put online can [be] and probably has been scraped.”

The researchers found thousands of instances of validated identity documents—including images of credit cards, driver’s licenses, passports, and birth certificates—as well as over 800 validated job application documents (including résumés and cover letters), which were confirmed through LinkedIn and other web searches as being associated with real people. (In many more cases, the researchers did not have time to validate the documents or were unable to because of issues like image clarity.) 

A number of the résumés disclosed sensitive information including disability status, the results of background checks, birth dates and birthplaces of dependents, and race. When résumés were linked to people with online presences, researchers also found contact information, government identifiers, sociodemographic information, face photographs, home addresses, and the contact information of other people (like references).

Examples of identity-related documents found in CommonPool’s small-scale data set show a credit card, a Social Security number, and a driver’s license. For each sample, the type of URL site is shown at the top, the image in the middle, and the caption in quotes below. All personal information has been replaced, and text has been paraphrased to avoid direct quotations. Images have been redacted to show the presence of faces without identifying the individuals.
COURTESY OF THE RESEARCHERS

When it was released in 2023, DataComp CommonPool, with its 12.8 billion data samples, was the largest existing data set of publicly available image-text pairs, which are often used to train generative text-to-image models. While its curators said that CommonPool was intended for academic research, its license does not prohibit commercial use as well. 

CommonPool was created as a follow-up to the LAION-5B data set, which was used to train models including Stable Diffusion and Midjourney. It draws on the same data source: web scraping done by the nonprofit Common Crawl between 2014 and 2022. 

While commercial models often do not disclose what data sets they are trained on, the shared data sources of DataComp CommonPool and LAION-5B mean that the data sets are similar, and that the same personally identifiable information likely appears in LAION-5B, as well as in other downstream models trained on CommonPool data. CommonPool researchers did not respond to emailed questions.

And since DataComp CommonPool has been downloaded more than 2 million times over the past two years, it is likely that “there [are]many downstream models that are all trained on this exact data set,” says Rachel Hong, a PhD student in computer science at the University of Washington and the paper’s lead author. Those would duplicate similar privacy risks.

Good intentions are not enough

“You can assume that any large-scale web-scraped data always contains content that shouldn’t be there,” says Abeba Birhane, a cognitive scientist and tech ethicist who leads Trinity College Dublin’s AI Accountability Lab—whether it’s personally identifiable information (PII), child sexual abuse imagery, or hate speech (which Birhane’s own research into LAION-5B has found). 

Indeed, the curators of DataComp CommonPool were themselves aware it was likely that PII would appear in the data set and did take some measures to preserve privacy, including automatically detecting and blurring faces. But in their limited data set, Hong’s team found and validated over 800 faces that the algorithm had missed, and they estimated that overall, the algorithm had missed 102 million faces in the entire data set. On the other hand, they did not apply filters that could have recognized known PII character strings, like emails or Social Security numbers. 

“Filtering is extremely hard to do well,” says Agnew. “They would have had to make very significant advancements in PII detection and removal that they haven’t made public to be able to effectively filter this.”  

Examples of résumé documents and personal disclosures found in CommonPool’s small-scale data set. For each sample, the type of URL site is shown at the top, the image in the middle, and the caption in quotes below. All personal information has been replaced, and text has been paraphrased to avoid direct quotations. Images have been redacted to show the presence of faces without identifying the individuals. Image courtesy of the researchers.
COURTESY OF THE RESEARCHERS

There are other privacy issues that the face blurring doesn’t address. While the blurring filter is automatically applied, it is optional and can be removed. Additionally, the captions that often accompany the photos, as well as the photos’ metadata, often contain even more personal information, such as names and exact locations.

Another privacy mitigation measure comes from Hugging Face, a platform that distributes training data sets and hosts CommonPool, which integrates with a tool that theoretically allows people to search for and remove their own information from a data set. But as the researchers note in their paper, this would require people to know that their data is there to start with. When asked for comment, Florent Daudens of Hugging Face said that “maximizing the privacy of data subjects across the AI ecosystem takes a multilayered approach, which includes but is not limited to the widget mentioned,” and that the platform is “working with our community of users to move the needle in a more privacy-grounded direction.” 

In any case, just getting your data removed from one data set probably isn’t enough. “Even if someone finds out their data was used in a training data sets and … exercises their right to deletion, technically the law is unclear about what that means,”  says Tiffany Li, an associate professor of law at the University of San Francisco School of Law. “If the organization only deletes data from the training data sets—but does not delete or retrain the already trained model—then the harm will nonetheless be done.”

The bottom line, says Agnew, is that “if you web-scrape, you’re going to have private data in there. Even if you filter, you’re still going to have private data in there, just because of the scale of this. And that’s something that we [machine-learning researchers], as a field, really need to grapple with.”

Reconsidering consent

CommonPool was built on web data scraped between 2014 and 2022, meaning that many of the images likely date to before 2020, when ChatGPT was released. So even if it’s theoretically possible that some people consented to having their information publicly available to anyone on the web, they could not have consented to having their data used to train large AI models that did not yet exist.

And with web scrapers often scraping data from each other, an image that was originally uploaded by the owner to one specific location would often find its way into other image repositories. “I might upload something onto the internet, and then … a year or so later, [I] want to take it down, but then that [removal] doesn’t necessarily do anything anymore,” says Agnew.

The researchers also found numerous examples of children’s personal information, including depictions of birth certificates, passports, and health status, but in contexts suggesting that they had been shared for limited purposes.

“It really illuminates the original sin of AI systems built off public data—it’s extractive, misleading, and dangerous to people who have been using the internet with one framework of risk, never assuming it would all be hoovered up by a group trying to create an image generator,” says Ben Winters, the director of AI and privacy at the Consumer Federation of America.

Finding a policy that fits

Ultimately, the paper calls for the machine-learning community to rethink the common practice of indiscriminate web scraping and also lays out the possible violations of current privacy laws represented by the existence of PII in massive machine-learning data sets, as well as the limitations of those laws’ ability to protect privacy.

“We have the GDPR in Europe, we have the CCPA in California, but there’s still no federal data protection law in America, which also means that different Americans have different rights protections,” says Marietje Schaake, a Dutch lawmaker turned tech policy expert who currently serves as a fellow at Stanford’s Cyber Policy Center. 

Besides, these privacy laws apply to companies that meet certain criteria for size and other characteristics. They do not necessarily apply to researchers like those who were responsible for creating and curating DataComp CommonPool.

And even state laws that do address privacy, like California’s consumer privacy act, have carve-outs for “publicly available” information. Machine-learning researchers have long operated on the principle that if it’s available on the internet, then it is public and no longer private information, but Hong, Agnew, and their colleagues hope that their research challenges this assumption. 

“What we found is that ‘publicly available’ includes a lot of stuff that a lot of people might consider private—résumés, photos, credit card numbers, various IDs, news stories from when you were a child, your family blog. These are probably not things people want to just be used anywhere, for anything,” says Hong.  

Hopefully, Schaake says, this research “will raise alarm bells and create change.” 

This article previously misstated Tiffany Li’s affiliation. This has been fixed.

How to run an LLM on your laptop

MIT Technology Review’s How To series helps you get things done. 

Simon Willison has a plan for the end of the world. It’s a USB stick, onto which he has loaded a couple of his favorite open-weight LLMs—models that have been shared publicly by their creators and that can, in principle, be downloaded and run with local hardware. If human civilization should ever collapse, Willison plans to use all the knowledge encoded in their billions of parameters for help. “It’s like having a weird, condensed, faulty version of Wikipedia, so I can help reboot society with the help of my little USB stick,” he says.

But you don’t need to be planning for the end of the world to want to run an LLM on your own device. Willison, who writes a popular blog about local LLMs and software development, has plenty of compatriots: r/LocalLLaMA, a subreddit devoted to running LLMs on your own hardware, has half a million members.

For people who are concerned about privacy, want to break free from the control of the big LLM companies, or just enjoy tinkering, local models offer a compelling alternative to ChatGPT and its web-based peers.

The local LLM world used to have a high barrier to entry: In the early days, it was impossible to run anything useful without investing in pricey GPUs. But researchers have had so much success in shrinking down and speeding up models that anyone with a laptop, or even a smartphone, can now get in on the action. “A couple of years ago, I’d have said personal computers are not powerful enough to run the good models. You need a $50,000 server rack to run them,” Willison says. “And I kept on being proved wrong time and time again.”

Why you might want to download your own LLM

Getting into local models takes a bit more effort than, say, navigating to ChatGPT’s online interface. But the very accessibility of a tool like ChatGPT comes with a cost. “It’s the classic adage: If something’s free, you’re the product,” says Elizabeth Seger, the director of digital policy at Demos, a London-based think tank. 

OpenAI, which offers both paid and free tiers, trains its models on users’ chats by default. It’s not too difficult to opt out of this training, and it also used to be possible to remove your chat data from OpenAI’s systems entirely, until a recent legal decision in the New York Times’ ongoing lawsuit against OpenAI required the company to maintain all user conversations with ChatGPT.

Google, which has access to a wealth of data about its users, also trains its models on both free and paid users’ interactions with Gemini, and the only way to opt out of that training is to set your chat history to delete automatically—which means that you also lose access to your previous conversations. In general, Anthropic does not train its models using user conversations, but it will train on conversations that have been “flagged for Trust & Safety review.” 

Training may present particular privacy risks because of the ways that models internalize, and often recapitulate, their training data. Many people trust LLMs with deeply personal conversations—but if models are trained on that data, those conversations might not be nearly as private as users think, according to some experts.

“Some of your personal stories may be cooked into some of the models, and eventually be spit out in bits and bytes somewhere to other people,” says Giada Pistilli, principal ethicist at the company Hugging Face, which runs a huge library of freely downloadable LLMs and other AI resources.

For Pistilli, opting for local models as opposed to online chatbots has implications beyond privacy. “Technology means power,” she says. “And so who[ever] owns the technology also owns the power.” States, organizations, and even individuals might be motivated to disrupt the concentration of AI power in the hands of just a few companies by running their own local models.

Breaking away from the big AI companies also means having more control over your LLM experience. Online LLMs are constantly shifting under users’ feet: Back in April, ChatGPT suddenly started sucking up to users far more than it had previously, and just last week Grok started calling itself MechaHitler on X.

Providers tweak their models with little warning, and while those tweaks might sometimes improve model performance, they can also cause undesirable behaviors. Local LLMs may have their quirks, but at least they are consistent. The only person who can change your local model is you.

Of course, any model that can fit on a personal computer is going to be less powerful than the premier online offerings from the major AI companies. But there’s a benefit to working with weaker models—they can inoculate you against the more pernicious limitations of their larger peers. Small models may, for example, hallucinate more frequently and more obviously than Claude, GPT, and Gemini, and seeing those hallucinations can help you build up an awareness of how and when the larger models might also lie.

“Running local models is actually a really good exercise for developing that broader intuition for what these things can do,” Willison says.

How to get started

Local LLMs aren’t just for proficient coders. If you’re comfortable using your computer’s command-line interface, which allows you to browse files and run apps using text prompts, Ollama is a great option. Once you’ve installed the software, you can download and run any of the hundreds of models they offer with a single command

If you don’t want to touch anything that even looks like code, you might opt for LM Studio, a user-friendly app that takes a lot of the guesswork out of running local LLMs. You can browse models from Hugging Face from right within the app, which provides plenty of information to help you make the right choice. Some popular and widely used models are tagged as “Staff Picks,” and every model is labeled according to whether it can be run entirely on your machine’s speedy GPU, needs to be shared between your GPU and slower CPU, or is too big to fit onto your device at all. Once you’ve chosen a model, you can download it, load it up, and start interacting with it using the app’s chat interface.

As you experiment with different models, you’ll start to get a feel for what your machine can handle. According to Willison, every billion model parameters require about one GB of RAM to run, and I found that approximation to be accurate: My own 16 GB laptop managed to run Alibaba’s Qwen3 14B as long as I quit almost every other app. If you run into issues with speed or usability, you can always go smaller—I got reasonable responses from Qwen3 8B as well.

And if you go really small, you can even run models on your cell phone. My beat-up iPhone 12 was able to run Meta’s Llama 3.2 1B using an app called LLM Farm. It’s not a particularly good model—it very quickly goes off into bizarre tangents and hallucinates constantly—but trying to coax something so chaotic toward usability can be entertaining. If I’m ever on a plane sans Wi-Fi and desperate for a probably false answer to a trivia question, I now know where to look.

Some of the models that I was able to run on my laptop were effective enough that I can imagine using them in my journalistic work. And while I don’t think I’ll depend on phone-based models for anything anytime soon, I really did enjoy playing around with them. “I think most people probably don’t need to do this, and that’s fine,” Willison says. “But for the people who want to do this, it’s so much fun.”

These four charts show where AI companies could go next in the US

No one knows exactly how AI will transform our communities, workplaces, and society as a whole. Because it’s hard to predict the impact AI will have on jobs, many workers and local governments are left trying to read the tea leaves to understand how to prepare and adapt.

A new interactive report released today by the Brookings Institution attempts to map how embedded AI companies and jobs are in different regions of the United States in order to prescribe policy treatments to those struggling to keep up. 

While the impact of AI on tech hubs like San Francisco and Boston is already being felt, AI proponents believe it will transform work everywhere, and in every industry. The report uses various proxies for what the researchers call “AI readiness” to document how unevenly this supposed transformation is taking place. 

Here are four charts to help understand where that could matter. 

1. AI development is still highly focused in tech hubs.

Brookings divides US cities into five categories based on how ready they are to adopt AI-related industries and job offerings. To do so, it looked at local talent pool development, innovations in local institutions, and adoption potential among local companies. 

The “AI Superstars” above represent, unsurprisingly, parts of the San Francisco Bay Area, such outliers that they are given their own category. The “Star AI Hubs,” on the other hand, include large metropolitan areas known for tech work, including Boston, Seattle, and Miami.

2. Concentration of workers and startups is highly centralized, too.

The data shows that the vast majority of people working with AI and startups focused on AI are clustered in the tech hubs above. The report found that almost two-thirds of workers advertising their AI skills work there, and well over 75% of AI startups were founded there. The so-called “Star AI Hubs,” from the likes of New York City and Seattle down to Columbus, Ohio, and Boulder, Colorado, take up another significant portion of the pie. 

It’s clear that most of the developments in AI are concentrated in certain large cities, and this pattern can end up perpetuating itself. According to the report, though, “AI activity has spread into most regional economies across the country,” highlighting the need for policy that encourages growth through AI without sacrificing other areas of the country.

3. Emerging centers of AI show promise but are lacking in one way or another.

Beyond the big, obvious tech-hub cities, Brookings claims, there are 14 regions that show promise in AI development and worker engagement with AI. Among these are cities surrounding academic institutions like the University of Wisconsin in Madison or Texas A&M University in College Station, and regional cultural centers like Pittsburgh, Detroit, and Nashville. 

However, according to Brookings, these places are lacking in some respect or another that limits their development. Take Columbia, South Carolina, for example. Despite a sizable regional population of about 860,000 people and the University of South Carolina right there, the report says the area has struggled with talent development; relatively few students graduate with science and engineering degrees, and few showcase AI skills in their job profiles. 

On the other hand, the Tampa, Florida, metropolitan area struggles with innovation, owing in large part to lagging productivity of local universities. The majority of the regions Brookings examined struggle with adoption, which in the report is measured largely by company engagement with AI-related tools like enterprise data and cloud services.

4. Emerging centers are generally leaning toward industry or government contracts, not both.

Still, these emerging centers show plenty of promise, and funders are taking note. To measure innovation and adoption of AI, the report tallies federal contracts for AI research and development as well as venture capital funding deals. 

If you examine how these emerging centers are collecting each, it appears that many of them are specializing as centers for federal research, like Huntsville, Alabama, or places for VC firms to scout, like the Sacramento area in California. 

While VC interest can beget VC interest, and likewise for government, this may give some indication of where these places have room to grow. “University presence is a tremendous influence on success here,” says Mark Muro, one of the authors of the report. Fostering the relationship between academia and industry could be key to improving the local AI ecosystem. 

AI’s giants want to take over the classroom

School’s out and it’s high summer, but a bunch of teachers are plotting how they’re going to use AI this upcoming school year. God help them. 

On July 8, OpenAI, Microsoft, and Anthropic announced a $23 million partnership with one of the largest teachers’ unions in the United States to bring more AI into K–12 classrooms. Called the National Academy for AI Instruction, the initiative will train teachers at a New York City headquarters on how to use AI both for teaching and for tasks like planning lessons and writing reports, starting this fall

The companies could face an uphill battle. Right now, most of the public perceives AI’s use in the classroom as nothing short of ruinous—a surefire way to dampen critical thinking and hasten the decline of our collective attention span (a viral story from New York magazine, for example, described how easy it now is to coast through college thanks to constant access to ChatGPT). 

Amid that onslaught, AI companies insist that AI promises more individualized learning, faster and more creative lesson planning, and quicker grading. The companies sponsoring this initiative are, of course, not doing it out of the goodness of their hearts.

No—as they hunt for profits, their goal is to make users out of teachers and students. Anthropic is pitching its AI models to universities, and OpenAI offers free courses for teachers. In an initial training session for teachers by the new National Academy for AI Instruction, representatives from Microsoft showed teachers how to use the company’s AI tools for lesson planning and emails, according to the New York Times

It’s early days, but what does the evidence actually say about whether AI is helping or hurting students? There’s at least some data to support the case made by tech companies: A recent survey of 1,500 teens conducted by Harvard’s Graduate School of Education showed that kids are using AI to brainstorm and answer questions they’re afraid to ask in the classroom. Studies examining settings ranging from math classes in Nigeria to colleges physics courses at Harvard have suggested that AI tutors can lead students to become more engaged. 

And yet there’s more to the story. The same Harvard survey revealed that kids are also frequently using AI for cheating and shortcuts. And an oft-cited paper from Microsoft found that relying on AI can reduce critical thinking. Not to mention the fact that “hallucinations” of incorrect information are an inevitable part of how large language models work.

There’s a lack of clear evidence that AI can be a net benefit for students, and it’s hard to trust that the AI companies funding this initiative will give honest advice on when not to use AI in the classroom.

Despite the fanfare around the academy’s launch, and the fact the first teacher training is scheduled to take place in just a few months, OpenAI and Anthropic told me they couldn’t share any specifics. 

It’s not as if teachers themselves aren’t already grappling with how to approach AI. One such teacher, Christopher Harris, who leads a library system covering 22 rural school districts in New York, has created a curriculum aimed at AI literacy. Topics range from privacy when using smart speakers (a lesson for second graders) to misinformation and deepfakes (instruction for high schoolers). I asked him what he’d like to see in the curriculum used by the new National Academy for AI Instruction.

“The real outcome should be teachers that are confident enough in their understanding of how AI works and how it can be used as a tool that they can teach students about the technology as well,” he says. The thing to avoid would be overfocusing on tools and pre-built prompts that teachers are instructed to use without knowing how they work. 

But all this will be for naught without an adjustment to how schools evaluate students in the age of AI, Harris says: “The bigger issue will be shifting the fundamental approaches to how we assign and assess student work in the face of AI cheating.”

The new initiative is led by the American Federation of Teachers, which represents 1.8 million members, as well as the United Federation of Teachers, which represents 200,000 members in New York. If they win over these groups, the tech companies will have significant influence over how millions of teachers learn about AI. But some educators are resisting the use of AI entirely, including several hundred who signed an open letter last week.

Helen Choi is one of them. “I think it is incumbent upon educators to scrutinize the tools that they use in the classroom to look past hype,” says Choi, an associate professor at the University of Southern California, where she teaches writing. “Until we know that something is useful, safe, and ethical, we have a duty to resist mass adoption of tools like large language models that are not designed by educators with education in mind.”

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

AI text-to-speech programs could “unlearn” how to imitate certain people

A technique known as “machine unlearning” could teach AI models to forget specific voices—an important step in stopping the rise of audio deepfakes, where someone’s voice is copied to carry out fraud or scams.

Recent advances in artificial intelligence have revolutionized the quality of text-to-speech technology so that people can convincingly re-create a piece of text in any voice, complete with natural speaking patterns and intonations, instead of having to settle for a robotic voice reading it out word by word. “Anyone’s voice can be reproduced or copied with just a few seconds of their voice,” says Jong Hwan Ko, a professor at Sungkyunkwan University in Korea and the coauthor of a new paper that demonstrates one of the first applications of machine unlearning to speech generation.

Copied voices have been used in scams, disinformation, and harassment. Ko, who researches audio processing, and his collaborators wanted to prevent this kind of identity fraud. “People are starting to demand ways to opt out of the unknown generation of their voices without consent,” he says. 

AI companies generally keep a tight grip on their models to discourage misuse. For example, if you ask ChatGPT to give you someone’s phone number or instructions for doing something illegal, it will likely just tell you it cannot help. However, as many examples over time have shown, clever prompt engineering or model fine-tuning can sometimes get these models to say things they otherwise wouldn’t. The unwanted information may still be hiding somewhere inside the model so that it can be accessed with the right techniques. 

At present, companies tend to deal with this issue by applying guardrails; the idea is to check whether the prompts or the AI’s responses contain disallowed material. Machine unlearning instead asks whether an AI can be made to forget a piece of information that the company doesn’t want it to know. The technique takes a leaky model and the specific training data to be redacted and uses them to create a new model—essentially, a version of the original that never learned that piece of data. While machine unlearning has ties to older techniques in AI research, it’s only in the past couple of years that it’s been applied to large language models.

Jinju Kim, a master’s student at Sungkyunkwan University who worked on the paper with Ko and others, sees guardrails as fences around the bad data put in place to keep people away from it. “You can’t get through the fence, but some people will still try to go under the fence or over the fence,” says Kim. But unlearning, she says, attempts to remove the bad data altogether, so there is nothing behind the fence at all. 

The way current text-to-speech systems are designed complicates this a little more, though. These so-called “zero-shot” models use examples of people’s speech to learn to re-create any voice, including those not in the training set—with enough data, it can be a good mimic when supplied with even a small sample of someone’s voice. So “unlearning” means a model not only needs to “forget” voices it was trained on but also has to learn not to mimic specific voices it wasn’t trained on. All the while, it still needs to perform well for other voices. 

To demonstrate how to get those results, Kim taught a recreation of VoiceBox, a speech generation model from Meta, that when it was prompted to produce a text sample in one of the voices to be redacted, it should instead respond with a random voice. To make these voices realistic, the model “teaches” itself using random voices of its own creation. 

According to the team’s results, which are to be presented this week at the International Conference on Machine Learning, prompting the model to imitate a voice it has “unlearned” gives back a result that—according to state-of-the-art tools that measure voice similarity—mimics the forgotten voice more than 75% less effectively than the model did before. In practice, this makes the new voice unmistakably different. But the forgetfulness comes at a cost: The model is about 2.8% worse at mimicking permitted voices. While these percentages are a bit hard to interpret, the demo the researchers released online offers very convincing results, both for how well redacted speakers are forgotten and how well the rest are remembered. A sample from the demo is given below. 

A voice sample of a speaker to be forgotten by the model.
The generated text-to-speech audio from the original model using the above as a prompt.
The generated text-to-speech audio using the same prompt, but now from the model where the speaker was forgotten.

Ko says the unlearning process can take “several days,” depending on how many speakers the researchers want the model to forget. Their method also requires an audio clip about five minutes long for each speaker whose voice is to be forgotten.

In machine unlearning, pieces of data are often replaced with randomness so that they can’t be reverse-engineered back to the original. In this paper, the randomness for the forgotten speakers is very high—a sign, the authors claim, that they are truly forgotten by the model. 

 “I have seen people optimizing for randomness in other contexts,” says Vaidehi Patil, a PhD student at the University of North Carolina at Chapel Hill who researches machine unlearning. “This is one of the first works I’ve seen for speech.” Patil is organizing a machine unlearning workshop affiliated with the conference, and the voice unlearning research will also be presented there. 

She points out that unlearning itself involves inherent trade-offs between efficiency and forgetfulness because the process can take time, and can degrade the usability of the final model. “There’s no free lunch. You have to compromise something,” she says.

Machine unlearning may still be at too early a stage for, say, Meta to introduce Ko and Kim’s methods into VoiceBox. But there is likely to be industry interest. Patil is researching unlearning for Google DeepMind this summer, and while Meta did not respond with a comment, it has hesitated for a long time to release VoiceBox to the wider public because it is so vulnerable to misuse. 

The voice unlearning team seems optimistic that its work could someday get good enough for real-life deployment. “In real applications, we would need faster and more scalable solutions,” says Ko. “We are trying to find those.”

Google’s generative video model Veo 3 has a subtitles problem

As soon as Google launched its latest video-generating AI model at the end of May, creatives rushed to put it through its paces. Released just months after its predecessor, Veo 3 allows users to generate sounds and dialogue for the first time, sparking a flurry of hyperrealistic eight-second clips stitched together into ads, ASMR videos, imagined film trailers, and humorous street interviews. Academy Award–nominated director Darren Aronofsky used the tool to create a short film called Ancestra. During a press briefing, Demis Hassabis, Google DeepMind’s CEO, likened the leap forward to “emerging from the silent era of video generation.” 

But others quickly found that in some ways the tool wasn’t behaving as expected. When it generates clips that include dialogue, Veo 3 often adds nonsensical, garbled subtitles, even when the prompts it’s been given explicitly ask for no captions or subtitles to be added. 

Getting rid of them isn’t straightforward—or cheap. Users have been forced to resort to regenerating clips (which costs them more money), using external subtitle-removing tools, or cropping their videos to get rid of the subtitles altogether.

Josh Woodward, vice president of Google Labs and Gemini, posted on X on June 9 that Google had developed fixes to reduce the gibberish text. But over a month later, users are still logging issues with it in Google Labs’ Discord channel, demonstrating how difficult it can be to correct issues in major AI models.

Like its predecessors, Veo 3 is available to paying members of Google’s subscription tiers, which start at $249.99 a month. To generate an eight-second clip, users enter a text prompt describing the scene they’d like to create into Google’s AI filmmaking tool Flow, Gemini, or other Google platforms. Each Veo 3 generation costs a minimum of 20 AI credits, and the account can be topped up at a cost of $25 per 2,500 credits.

Mona Weiss, an advertising creative director, says that regenerating her scenes in a bid to get rid of the random captions is becoming expensive. “If you’re creating a scene with dialogue, up to 40% of its output has gibberish subtitles that make it unusable,” she says. “You’re burning through money trying to get a scene you like, but then you can’t even use it.”

When Weiss reported the problem to Google Labs through its Discord channel in the hopes of getting a refund for her wasted credits, its team pointed her to the company’s official support team. They offered her a refund for the cost of Veo 3, but not for the credits. Weiss declined, as accepting would have meant losing access to the model altogether. The Google Labs’ Discord support team has been telling users that subtitles can be triggered by speech, saying that they’re aware of the problem and are working to fix it. 

So why does Veo 3 insist on adding these subtitles, and why does it appear to be so difficult to solve the problem? It probably comes down to what the model has been trained on.  

Although Google hasn’t made this information public, that training data is likely to include YouTube videos, clips from vlogs and gaming channels, and TikTok edits, many of which come with subtitles. These embedded subtitles are part of the video frames rather than separate text tracks layered on top, meaning it’s difficult to remove them before they’re used for training, says Shuo Niu, an assistant professor at Clark University in Massachusetts who studies video sharing platforms and AI.

“The text-to-video model is trained using reinforcement learning to produce content that mimics human-created videos, and if such videos include subtitles, the model may ‘learn’ that incorporating subtitles enhances similarity with human-generated content,” he says.

“We’re continuously working to improve video creation, especially with text, speech that sounds natural, and audio that syncs perfectly,” a Google spokesperson says. “We encourage users to try their prompt again if they notice an inconsistency and give us feedback using the thumbs up/down option.”

As for why the model ignores instructions such as “No subtitles,” negative prompts (telling a generative AI model not to do something) are usually less effective than positive ones, says Tuhin Chakrabarty, an assistant professor at Stony Brook University who studies AI systems. 

To fix the problem, Google would have to check every frame of each video Veo 3 has been trained on, and either get rid of or relabel those with captions before retraining the model—an endeavor that would take weeks, he says. 

Katerina Cizek, a documentary maker and artistic director at the MIT Open Documentary Lab, believes the problem exemplifies Google’s willingness to launch products before they’re fully ready. 

“Google needed a win,” she says. “They needed to be the first to pump out a tool that generates lip-synched audio. And so that was more important than fixing their subtitle issue.”