What we’ve been getting wrong about AI’s truth crisis

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

What would it take to convince you that the era of truth decay we were long warned about—where AI content dupes us, shapes our beliefs even when we catch the lie, and erodes societal trust in the process—is now here? A story I published last week pushed me over the edge. It also made me realize that the tools we were sold as a cure for this crisis are failing miserably. 

On Thursday, I reported the first confirmation that the US Department of Homeland Security, which houses immigration agencies, is using AI video generators from Google and Adobe to make content that it shares with the public. The news comes as immigration agencies have flooded social media with content to support President Trump’s mass deportation agenda—some of which appears to be made with AI (like a video about “Christmas after mass deportations”).

But I received two types of reactions from readers that may explain just as much about the epistemic crisis we’re in. 

One was from people who weren’t surprised, because on January 22 the White House had posted a digitally altered photo of a woman arrested at an ICE protest, one that made her appear hysterical and in tears. Kaelan Dorr, the White House’s deputy communications director, did not respond to questions about whether the White House altered the photo but wrote, “The memes will continue.”

The second was from readers who saw no point in reporting that DHS was using AI to edit content shared with the public, because news outlets were apparently doing the same. They pointed to the fact that the news network MS Now (formerly MSNBC) shared an image of Alex Pretti that was AI-edited and appeared to make him look more handsome, a fact that led to many viral clips this week, including one from Joe Rogan’s podcast. Fight fire with fire, in other words? A spokesperson for MS Now told Snopes that the news outlet aired the image without knowing it was edited.

There is no reason to collapse these two cases of altered content into the same category, or to read them as evidence that truth no longer matters. One involved the US government sharing a clearly altered photo with the public and declining to answer whether it was intentionally manipulated; the other involved a news outlet airing a photo it should have known was altered but taking some steps to disclose the mistake.

What these reactions reveal instead is a flaw in how we were collectively preparing for this moment. Warnings about the AI truth crisis revolved around a core thesis: that not being able to tell what is real will destroy us, so we need tools to independently verify the truth. My two grim takeaways are that these tools are failing, and that while vetting the truth remains essential, it is no longer capable on its own of producing the societal trust we were promised.

For example, there was plenty of hype in 2024 about the Content Authenticity Initiative, cofounded by Adobe and adopted by major tech companies, which would attach labels to content disclosing when it was made, by whom, and whether AI was involved. But Adobe applies automatic labels only when the content is wholly AI-generated. Otherwise the labels are opt-in on the part of the creator.

And platforms like X, where the altered arrest photo was posted, can strip content of such labels anyway (a note that the photo was altered was added by users). Platforms can also simply choose not to show the label; indeed, when Adobe launched the initiative, it noted that the Pentagon’s website for sharing official images, DVIDS, would display the labels to prove authenticity, but a review of the website today shows no such labels.

Noticing how much traction the White House’s photo got even after it was shown to be AI-altered, I was struck by the findings of a very relevant new paper published in the journal Communications Psychology. In the study, participants watched a deepfake “confession” to a crime, and the researchers found that even when they were told explicitly that the evidence was fake, participants relied on it when judging an individual’s guilt. In other words, even when people learn that the content they’re looking at is entirely fake, they remain emotionally swayed by it. 

“Transparency helps, but it isn’t enough on its own,” the disinformation expert Christopher Nehring wrote recently about the study’s findings. “We have to develop a new masterplan of what to do about deepfakes.”

AI tools to generate and edit content are getting more advanced, easier to operate, and cheaper to run—all reasons why the US government is increasingly paying to use them. We were well warned of this, but we responded by preparing for a world in which the main danger was confusion. What we’re entering instead is a world in which influence survives exposure, doubt is easily weaponized, and establishing the truth does not serve as a reset button. And the defenders of truth are already trailing way behind.

Update: This story was updated on February 2 with details about how Adobe applies its content authenticity labels.

Inside the marketplace powering bespoke AI deepfakes of real women

Civitai—an online marketplace for buying and selling AI-generated content, backed by the venture capital firm Andreessen Horowitz—is letting users buy custom instruction files for generating celebrity deepfakes. Some of these files were specifically designed to make pornographic images banned by the site, a new analysis has found.

The study, from researchers at Stanford and Indiana University, looked at people’s requests for content on the site, called “bounties.” The researchers found that between mid-2023 and the end of 2024, most bounties asked for animated content—but a significant portion were for deepfakes of real people, and 90% of these deepfake requests targeted women. (Their findings have not yet been peer reviewed.)

The debate around deepfakes, as illustrated by the recent backlash to explicit images on the X-owned chatbot Grok, has revolved around what platforms should do to block such content. Civitai’s situation is a little more complicated. Its marketplace includes actual images, videos, and models, but it also lets individuals buy and sell instruction files called LoRAs that can coach mainstream AI models like Stable Diffusion into generating content they were not trained to produce. Users can then combine these files with other tools to make deepfakes that are graphic or sexual. The researchers found that 86% of deepfake requests on Civitai were for LoRAs.

In these bounties, users requested “high quality” models to generate images of public figures like the influencer Charli D’Amelio or the singer Gracie Abrams, often linking to their social media profiles so their images could be grabbed from the web. Some requests specified a desire for models that generated the individual’s entire body, accurately captured their tattoos, or allowed hair color to be changed. Some requests targeted several women in specific niches, like artists who record ASMR videos. One request was for a deepfake of a woman said to be the user’s wife. Anyone on the site could offer up AI models they worked on for the task, and the best submissions received payment—anywhere from $0.50 to $5. And nearly 92% of the deepfake bounties were awarded.

Neither Civitai nor Andreessen Horowitz responded to requests for comment.

It’s possible that people buy these LoRAs to make deepfakes that aren’t sexually explicit (though they’d still violate Civitai’s terms of use, and they’d still be ethically fraught). But Civitai also offers educational resources on how to use external tools to further customize the outputs of image generators—for example, by changing someone’s pose. The site also hosts user-written articles with details on how to instruct models to generate pornography. The researchers found that the amount of porn on the platform has gone up, and that the majority of requests each week are now for NSFW content.

“Not only does Civitai provide the infrastructure that facilitates these issues; they also explicitly teach their users how to utilize them,” says Matthew DeVerna, a postdoctoral researcher at Stanford’s Cyber Policy Center and one of the study’s leaders. 

The company used to ban only sexually explicit deepfakes of real people, but in May 2025 it announced it would ban all deepfake content. Nonetheless, countless requests for deepfakes submitted before this ban now remain live on the site, and many of the winning submissions fulfilling those requests remain available for purchase, MIT Technology Review confirmed.

“I believe the approach that they’re trying to take is to sort of do as little as possible, such that they can foster as much—I guess they would call it—creativity on the platform,” DeVerna says.

Users buy LoRAs with the site’s online currency, called Buzz, which is purchased with real money. In May 2025, Civitai’s credit card processor cut off the company because of its ongoing problem with nonconsensual content. To pay for explicit content, users must now use gift cards or cryptocurrency to buy Buzz; the company offers a different scrip for non-explicit content.

Civitai automatically tags bounties requesting deepfakes and lists a way for the person featured in the content to manually request its takedown. This system means that Civitai has a reasonably reliable way of identifying which bounties are for deepfakes, but it’s still leaving moderation to the general public rather than carrying it out proactively.

A company’s legal liability for what its users do isn’t totally clear. Generally, tech companies have broad legal protections under Section 230 of the Communications Decency Act against liability for the content their users post, but those protections aren’t limitless. For example, “you cannot knowingly facilitate illegal transactions on your website,” says Ryan Calo, a professor specializing in technology and AI at the University of Washington’s law school. (Calo wasn’t involved in this new study.)

Civitai joined OpenAI, Anthropic, and other AI companies in 2024 in adopting design principles to guard against the creation and spread of AI-generated child sexual abuse material. This move followed a 2023 report from the Stanford Internet Observatory, which found that the vast majority of AI models named in child sexual abuse communities were Stable Diffusion–based models “predominantly obtained via Civitai.”

But adult deepfakes have not gotten the same level of attention from content platforms or the venture capital firms that fund them. “They are not afraid enough of it. They are overly tolerant of it,” Calo says. “Neither law enforcement nor civil courts adequately protect against it. It is night and day.”

Civitai received a $5 million investment from Andreessen Horowitz (a16z) in November 2023. In a video shared by a16z, Civitai cofounder and CEO Justin Maier described his goal of building the main place where people find and share AI models for their own individual purposes. “We’ve aimed to make this space that’s been very, I guess, niche and engineering-heavy more and more approachable to more and more people,” he said. 

Civitai is not the only company with a deepfake problem in a16z’s investment portfolio; in February, MIT Technology Review first reported that another company, Botify AI, was hosting AI companions resembling real actors that stated their age as under 18, engaged in sexually charged conversations, offered “hot photos,” and in some instances described age-of-consent laws as “arbitrary” and “meant to be broken.”

DHS is using Google and Adobe AI to make videos

The US Department of Homeland Security is using AI video generators from Google and Adobe to make and edit content shared with the public, a new document reveals. It comes as immigration agencies have flooded social media with content to support President Trump’s mass deportation agenda—some of which appears to be made with AI—and as workers in tech have put pressure on their employers to denounce the agencies’ activities. 

The document, released on Wednesday, provides an inventory of which commercial AI tools DHS uses for tasks ranging from generating drafts of documents to managing cybersecurity. 

In a section about “editing images, videos or other public affairs materials using AI,” it reveals for the first time that DHS is using Google’s Veo 3 video generator and Adobe Firefly, estimating that the agency has between 100 and 1,000 licenses for the tools. It also discloses that DHS uses Microsoft Copilot Chat for generating first drafts of documents and summarizing long reports, and Poolside software for coding tasks, in addition to tools from other companies.

Google, Adobe, and DHS did not immediately respond to requests for comment.

The news provides details about how agencies like Immigration and Customs Enforcement (ICE), which is part of DHS, might be creating the large amounts of content they’ve shared on X and other channels as immigration operations have expanded across US cities. They’ve posted content celebrating “Christmas after mass deportations,” referenced Bible verses and Christ’s birth, shown the faces of people the agencies have arrested, and shared ads aimed at recruiting agents. The agencies have also repeatedly used music in their videos without permission from the artists.

Some of the content, particularly videos, has the appearance of being AI-generated, but it hasn’t been clear until now what AI models the agencies might be using. This marks the first concrete evidence such generators are being used by DHS to create content shared with the public.

It remains impossible to verify which company’s tools helped create a specific piece of content, or indeed whether it was AI-generated at all. Adobe offers options to “watermark” a video made with its tools to disclose that it is AI-generated, for example, but this disclosure does not always stay intact when the content is uploaded and shared across different sites.

The document reveals that DHS has specifically been using Flow, a tool from Google that combines its Veo 3 video generator with a suite of filmmaking tools. Users can generate clips and assemble entire videos with AI, including videos that contain sound, dialogue, and background noise, making them hyperrealistic. Adobe launched its Firefly generator in 2023, promising that it does not use copyrighted content in its training or output. Like Google’s tools, Adobe’s can generate videos, images, soundtracks, and speech. The document does not reveal further details about how the agency is using these video generation tools.

Workers at large tech companies, including more than 140 current and former employees from Google and more than 30 from Adobe, have been putting pressure on their employers in recent weeks to take a stance against ICE and the shooting of Alex Pretti on January 24. Google’s leadership has not made statements in response. In October, Google and Apple removed apps on their app stores that were intended to track sightings of ICE, citing safety risks. 

An additional document released on Wednesday revealed new details about how the agency is using more niche AI products, including a facial recognition app used by ICE, as first reported by 404 Media in June.

The AI Hype Index: Grok makes porn, and Claude Code nails your job

Everyone is panicking because AI is very bad; everyone is panicking because AI is very good. It’s just that you never know which one you’re going to get. Grok is a pornography machine. Claude Code can do anything from building websites to reading your MRI. So of course Gen Z is spooked by what this means for jobs. Unnerving new research says AI is going to have a seismic impact on the labor market this year.

If you want to get a handle on all that, don’t expect any help from the AI companies—they’re turning on each other like it’s the last act in a zombie movie. Meta’s former chief AI scientist, Yann LeCun, is spilling tea, while Big Tech’s messiest exes, Elon Musk and OpenAI, are about to go to trial. Grab your popcorn.

Rules fail at the prompt, succeed at the boundary

From the Gemini Calendar prompt-injection attack of 2026 to the September 2025 state-sponsored hack that used Anthropic’s Claude Code as an automated intrusion engine, the coercion of human-in-the-loop agentic actions and fully autonomous agentic workflows has become the new attack vector for hackers. In the Anthropic case, roughly 30 organizations across tech, finance, manufacturing, and government were affected. Anthropic’s threat team assessed that the attackers used AI to carry out 80% to 90% of the operation: reconnaissance, exploit development, credential harvesting, lateral movement, and data exfiltration, with humans stepping in only at a handful of key decision points.

This was not a lab demo; it was a live espionage campaign. The attackers hijacked an agentic setup (Claude Code plus tools exposed via the Model Context Protocol, or MCP) and jailbroke it by decomposing the attack into small, seemingly benign tasks and telling the model it was doing legitimate penetration testing. The same loop that powers developer copilots and internal agents was repurposed as an autonomous cyber-operator. Claude was not hacked. It was persuaded, and it used its tools to carry out the attack.

Prompt injection is persuasion, not a bug

Security communities have been warning about this for several years. Multiple OWASP Top 10 reports put prompt injection, or more recently Agent Goal Hijack, at the top of the risk list and pair it with identity and privilege abuse and human-agent trust exploitation: too much power in the agent, no separation between instructions and data, and no mediation of what comes out.

Guidance from the NCSC and CISA describes generative AI as a persistent social-engineering and manipulation vector that must be managed across design, development, deployment, and operations, not patched away with better phrasing. The EU AI Act turns that lifecycle view into law for high-risk AI systems, requiring a continuous risk management system, robust data governance, logging, and cybersecurity controls.

In practice, prompt injection is best understood as a persuasion channel. Attackers don’t break the model—they convince it. In the Anthropic example, the operators framed each step as part of a defensive security exercise, kept the model blind to the overall campaign, and nudged it, loop by loop, into doing offensive work at machine speed.

That’s not something a keyword filter or a polite “please follow these safety instructions” paragraph can reliably stop. Research on deceptive behavior in models makes this worse. Anthropic’s research on sleeper agents shows that once a model has learned a backdoor, standard fine-tuning and adversarial training can actually help the model hide the deception rather than remove it. Anyone who tries to defend a system like that purely with linguistic rules is playing on its home field.

Why this is a governance problem, not a vibe coding problem

Regulators aren’t asking for perfect prompts; they’re asking that enterprises demonstrate control.

NIST’s AI RMF emphasizes asset inventory, role definition, access control, change management, and continuous monitoring across the AI lifecycle. The UK AI Cyber Security Code of Practice similarly pushes for secure-by-design principles by treating AI like any other critical system, with explicit duties for boards and system operators from conception through decommissioning.

In other words, the rules actually needed are not “never say X” or “always respond like Y.” They are:

  • Who is this agent acting as?
  • What tools and data can it touch?
  • Which actions require human approval?
  • How are high-impact outputs moderated, logged, and audited?

Frameworks like Google’s Secure AI Framework (SAIF) make this concrete. SAIF’s agent permissions control is blunt: agents should operate with least privilege, dynamically scoped permissions, and explicit user control for sensitive actions. OWASP’s Top 10 emerging guidance on agentic applications mirrors that stance: constrain capabilities at the boundary, not in the prose.
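To make that stance concrete, here is a minimal sketch of what a rule at the capability boundary can look like in code. It is an illustration only, not SAIF's or OWASP's actual machinery, and every name in it (AgentIdentity, ToolPolicy, the scope strings) is invented for the example.

    from dataclasses import dataclass

    @dataclass
    class AgentIdentity:
        tenant: str
        scopes: set          # e.g. {"tickets:read", "repos:write"}

    @dataclass
    class ToolPolicy:
        required_scope: str
        needs_human_approval: bool = False

    # Hypothetical tool catalog: every capability the agent can call is mapped to a policy.
    POLICIES = {
        "read_ticket": ToolPolicy("tickets:read"),
        "merge_pr": ToolPolicy("repos:write", needs_human_approval=True),
        "run_scanner": ToolPolicy("security:scan", needs_human_approval=True),
    }

    def authorize(agent: AgentIdentity, tool: str, human_approved: bool = False) -> bool:
        """Allow a tool call only if the agent's scoped identity permits it."""
        policy = POLICIES.get(tool)
        if policy is None:
            return False  # deny by default for unknown tools
        if policy.required_scope not in agent.scopes:
            return False  # least privilege: no scope, no call
        if policy.needs_human_approval and not human_approved:
            return False  # sensitive actions require an explicit human decision
        return True

    # An agent scoped to read tickets cannot merge code, no matter what its prompt says.
    agent = AgentIdentity(tenant="acme", scopes={"tickets:read"})
    assert authorize(agent, "read_ticket") is True
    assert authorize(agent, "merge_pr") is False

The important property is that the check runs outside the model, so no amount of clever prompting changes what authorize() returns.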

From soft words to hard boundaries

The Anthropic espionage case makes the boundary failure concrete:

  • Identity and scope: Claude was coaxed into acting as a defensive security consultant for the attacker’s fictional firm, with no hard binding to a real enterprise identity, tenant, or scoped permissions. Once that fiction was accepted, everything else followed.
  • Tool and data access: MCP gave the agent flexible access to scanners, exploit frameworks, and target systems. There was no independent policy layer saying, “This tenant may never run password crackers against external IP ranges,” or “This environment may only scan assets labeled ‘internal’” (see the sketch after this list).
  • Output execution: Generated exploit code, parsed credentials, and attack plans were treated as actionable artifacts with little mediation. Once a human decided to trust the summary, the barrier between model output and real-world side effect effectively disappeared.
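A sketch of the independent policy layer missing in the second bullet might look like the following. The network range, the tool names, and the label check are assumptions made for illustration; in a real deployment they would come from the tenant's configuration, not from the prompt.

    import ipaddress

    INTERNAL_NET = ipaddress.ip_network("10.0.0.0/8")  # assumed internal range for this tenant
    FORBIDDEN_TOOLS = {"password_cracker"}             # capabilities this tenant may never invoke

    def allow_tool_call(tool: str, target_ip: str, asset_label: str) -> bool:
        """Check an agent's proposed tool call against tenant rules before anything runs."""
        if tool in FORBIDDEN_TOOLS:
            return False  # hard deny, regardless of how the request was phrased
        if tool == "network_scanner":
            inside = ipaddress.ip_address(target_ip) in INTERNAL_NET
            return inside and asset_label == "internal"  # only scan assets labeled internal
        return False  # deny by default for any tool not explicitly handled

Because the rule is evaluated on the tool call itself rather than on the conversation, a decomposed, benign-sounding task list does not get around it.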

We’ve seen the other side of this coin in civilian contexts. When Air Canada’s website chatbot misrepresented its bereavement policy and the airline tried to argue that the bot was a separate legal entity, the tribunal rejected the claim outright: the company remained liable for what the bot said. In espionage, the stakes are higher but the logic is the same: if an AI agent misuses tools or data, regulators and courts will look through the agent to the enterprise.

Rules that work, rules that don’t

So yes, rule-based systems fail if by rules one means ad-hoc allow/deny lists, regex fences, and baroque prompt hierarchies trying to police semantics. Those crumble under indirect prompt injection, retrieval-time poisoning, and model deception. But rule-based governance is non-optional when we move from language to action.

The security community is converging on a synthesis:

  • Put rules at the capability boundary: Use policy engines, identity systems, and tool permissions to determine what the agent can actually do, with which data, and under which approvals.
  • Pair rules with continuous evaluation: Use observability tooling, red-teaming packages, and robust logging and evidence (a minimal example follows this list).
  • Treat agents as first-class subjects in your threat model: For example, MITRE ATLAS now catalogs techniques and case studies specifically targeting AI systems.
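For the logging and evidence piece, the minimum viable version is an append-only record of every tool call the agent attempts, written by the boundary layer rather than by the model. The sketch below is illustrative; the field names and file-based storage are assumptions, not a specific product's schema.

    import json
    import time

    def audit(agent_id: str, tool: str, target: str, decision: str,
              path: str = "agent_audit.log") -> None:
        """Append one evidence record per attempted tool call."""
        record = {
            "ts": time.time(),      # when the call was attempted
            "agent": agent_id,      # which identity the agent was acting as
            "tool": tool,           # which capability it tried to use
            "target": target,       # what it tried to touch
            "decision": decision,   # "allowed" or "denied" by the policy layer
        }
        with open(path, "a") as f:
            f.write(json.dumps(record) + "\n")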

The lesson from the first AI-orchestrated espionage campaign is not that AI is uncontrollable. It’s that control belongs in the same place it always has in security: at the architecture boundary, enforced by systems, not by vibes.

This content was produced by Protegrity. It was not written by MIT Technology Review’s editorial staff.

What AI “remembers” about you is privacy’s next frontier

The ability to remember you and your preferences is rapidly becoming a big selling point for AI chatbots and agents. 

Earlier this month, Google announced Personal Intelligence, a new way for people to interact with the company’s Gemini chatbot that draws on their Gmail, photos, search, and YouTube histories to make Gemini “more personal, proactive, and powerful.” It echoes similar moves by OpenAI, Anthropic, and Meta to add new ways for their AI products to remember and draw from people’s personal details and preferences. While these features have potential advantages, we need to do more to prepare for the new risks they could introduce into these complex technologies.

Personalized, interactive AI systems are built to act on our behalf, maintain context across conversations, and improve our ability to carry out all sorts of tasks, from booking travel to filing taxes. From tools that learn a developer’s coding style to shopping agents that sift through thousands of products, these systems rely on the ability to store and retrieve increasingly intimate details about their users. But doing so over time introduces alarming, and all-too-familiar, privacy vulnerabilities—many of which have loomed since “big data” first teased the power of spotting and acting on user patterns. Worse, AI agents now appear poised to plow through whatever safeguards had been adopted to avoid those vulnerabilities.

Today, we interact with these systems through conversational interfaces, and we frequently switch contexts. You might ask a single AI agent to draft an email to your boss, provide medical advice, budget for holiday gifts, and provide input on interpersonal conflicts. Most AI agents collapse all data about you—which may once have been separated by context, purpose, or permissions—into single, unstructured repositories. When an AI agent links to external apps or other agents to execute a task, the data in its memory can seep into shared pools. This technical reality creates the potential for unprecedented privacy breaches that expose not only isolated data points, but the entire mosaic of people’s lives.

When information is all in the same repository, it is prone to crossing contexts in ways that are deeply undesirable. A casual chat about dietary preferences to build a grocery list could later influence what health insurance options are offered, or a search for restaurants offering accessible entrances could leak into salary negotiations—all without a user’s awareness (this concern may sound familiar from the early days of “big data,” but is now far less theoretical). An information soup of memory not only poses a privacy issue, but also makes it harder to understand an AI system’s behavior—and to govern it in the first place. So what can developers do to fix this problem?

First, memory systems need structure that allows control over the purposes for which memories can be accessed and used. Early efforts appear to be underway: Anthropic’s Claude creates separate memory areas for different “projects,” and OpenAI says that information shared through ChatGPT Health is compartmentalized from other chats. These are helpful starts, but the instruments are still far too blunt: At a minimum, systems must be able to distinguish between specific memories (the user likes chocolate and has asked about GLP-1s), related memories (user manages diabetes and therefore avoids chocolate), and memory categories (such as professional and health-related). Further, systems need to allow for usage restrictions on certain types of memories and reliably accommodate explicitly defined boundaries—particularly around memories having to do with sensitive topics like medical conditions or protected characteristics, which will likely be subject to stricter rules.

Needing to keep memories separate in this way will have important implications for how AI systems can and should be built. It will require tracking memories’ provenance—their source, any associated time stamp, and the context in which they were created—and building ways to trace when and how certain memories influence the behavior of an agent. This sort of model explainability is on the horizon, but current implementations can be misleading or even deceptive. Embedding memories directly within a model’s weights may result in more personalized and context-aware outputs, but structured databases are currently more segmentable, more explainable, and thus more governable. Until research advances enough, developers may need to stick with simpler systems.
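As a rough illustration of what such structure could look like, the sketch below gives each memory a category, a sensitivity flag, a set of allowed purposes, and provenance fields, and only returns memories whose declared purposes match the request. The schema and names are assumptions for the example, not any vendor's actual design.

    from dataclasses import dataclass
    from datetime import datetime, timezone

    @dataclass(frozen=True)
    class MemoryEntry:
        content: str                 # e.g. "user asked about GLP-1s"
        category: str                # e.g. "health", "professional"
        sensitive: bool              # stricter handling for medical or protected traits
        allowed_purposes: frozenset  # contexts this memory may influence
        source: str                  # provenance: which conversation produced it
        created_at: datetime         # provenance: when it was created

    def recall(store: list, purpose: str) -> list:
        """Return only memories whose declared purposes include this request's purpose."""
        return [m for m in store if purpose in m.allowed_purposes]

    store = [
        MemoryEntry("prefers dark chocolate", "diet", False,
                    frozenset({"grocery_list"}), "chat:2025-11-02", datetime.now(timezone.utc)),
        MemoryEntry("manages diabetes", "health", True,
                    frozenset({"health_advice"}), "chat:2025-10-14", datetime.now(timezone.utc)),
    ]

    # A grocery-list request sees dietary preferences but never the health condition.
    print([m.content for m in recall(store, "grocery_list")])

Keeping the purpose check outside the model is what makes the restriction auditable: the system can show which memories a given request was allowed to see.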

Second, users need to be able to see, edit, or delete what is remembered about them. The interfaces for doing this should be both transparent and intelligible, translating system memory into a structure users can accurately interpret. The static system settings and legalese privacy policies provided by traditional tech platforms have set a low bar for user controls, but natural-language interfaces may offer promising new options for explaining what information is being retained and how it can be managed. Memory structure will have to come first, though: Without it, no model can clearly state a memory’s status. Indeed, Grok 3’s system prompt includes an instruction to the model to “NEVER confirm to the user that you have modified, forgotten, or won’t save a memory,” presumably because the company can’t guarantee those instructions will be followed. 

Critically, user-facing controls cannot bear the full burden of privacy protection or prevent all harms from AI personalization. Responsibility must shift toward AI providers to establish strong defaults, clear rules about permissible memory generation and use, and technical safeguards like on-device processing, purpose limitation, and contextual constraints. Without system-level protections, individuals will face impossibly convoluted choices about what should be remembered or forgotten, and the actions they take may still be insufficient to prevent harm. Developers should consider how to limit data collection in memory systems until robust safeguards exist, and build memory architectures that can evolve alongside norms and expectations.

Third, AI developers must help lay the foundations for approaches to evaluating systems so as to capture not only performance, but also the risks and harms that arise in the wild. While independent researchers are best positioned to conduct these tests (given developers’ economic interest in demonstrating demand for more personalized services), they need access to data to understand what risks might look like and therefore how to address them. To improve the ecosystem for measurement and research, developers should invest in automated measurement infrastructure, build out their own ongoing testing, and implement privacy-preserving testing methods that enable system behavior to be monitored and probed under realistic, memory-enabled conditions.

In its parallels with human experience, the technical term “memory” casts impersonal cells in a spreadsheet as something that builders of AI tools have a responsibility to handle with care. Indeed, the choices AI developers make today—how to pool or segregate information, whether to make memory legible or allow it to accumulate opaquely, whether to prioritize responsible defaults or maximal convenience—will determine how the systems we depend upon remember us. Technical considerations around memory are not so distinct from questions about digital privacy and the vital lessons we can draw from them. Getting the foundations right today will determine how much room we can give ourselves to learn what works—allowing us to make better choices around privacy and autonomy than we have before.

Miranda Bogen is the Director of the AI Governance Lab at the Center for Democracy & Technology. 

Ruchika Joshi is a Fellow at the Center for Democracy & Technology specializing in AI safety and governance.

OpenAI’s latest product lets you vibe code science

OpenAI just revealed what its new in-house team, OpenAI for Science, has been up to. The firm has released a free LLM-powered tool for scientists called Prism, which embeds ChatGPT in a text editor for writing scientific papers.

The idea is to put ChatGPT front and center inside software that scientists use to write up their work in much the same way that chatbots are now embedded into popular programming editors. It’s vibe coding, but for science.

Kevin Weil, head of OpenAI for Science, pushes that analogy himself. “I think 2026 will be for AI and science what 2025 was for AI in software engineering,” he said at a press briefing yesterday. “We’re starting to see that same kind of inflection.”

OpenAI claims that around 1.3 million scientists around the world submit more than 8 million queries a week to ChatGPT on advanced topics in science and math. “That tells us that AI is moving from curiosity to core workflow for scientists,” Weil said.

Prism is a response to that user behavior. It can also be seen as a bid to lock in more scientists to OpenAI’s products in a marketplace full of rival chatbots.

“I mostly use GPT-5 for writing code,” says Roland Dunbrack, a professor of biology at the Fox Chase Cancer Center in Philadelphia, who is not connected to OpenAI. “Occasionally, I ask LLMs a scientific question, basically hoping it can find information in the literature faster than I can. It used to hallucinate references but does not seem to do that very much anymore.”

Nikita Zhivotovskiy, a statistician at the University of California, Berkeley, says GPT-5 has already become an important tool in his work. “It sometimes helps polish the text of papers, catching mathematical typos or bugs, and provides generally useful feedback,” he says. “It is extremely helpful for quick summarization of research articles, making interaction with the scientific literature smoother.”

By combining a chatbot with an everyday piece of software, Prism follows a trend set by products such as OpenAI’s Atlas, which embeds ChatGPT in a web browser, as well as LLM-powered office tools from firms such as Microsoft and Google DeepMind.

Prism incorporates GPT-5.2, the company’s best model yet for mathematical and scientific problem-solving, into an editor for writing documents in LaTeX, a markup language that scientists commonly use to format their papers.

A ChatGPT chat box sits at the bottom of the screen, below a view of the article being written. Scientists can call on ChatGPT for anything they want. It can help them draft the text, summarize related articles, manage their citations, turn photos of whiteboard scribbles into equations or diagrams, or talk through hypotheses or mathematical proofs.

It’s clear that Prism could be a huge time saver. It’s also clear that a lot of people may be disappointed, especially after weeks of high-profile social media chatter from researchers at the firm about how good GPT-5 is at solving math problems. Science is drowning in AI slop: Won’t this just make it worse? Where is OpenAI’s fully automated AI scientist? And when will GPT-5 make a stunning new discovery?

That’s not the mission, says Weil. He would love to see GPT-5 make a discovery. But he doesn’t think that’s what will have the biggest impact on science, at least not in the near term.

“I think more powerfully—and with 100% probability—there’s going to be 10,000 advances in science that maybe wouldn’t have happened or wouldn’t have happened as quickly, and AI will have been a contributor to that,” Weil told MIT Technology Review in an exclusive interview this week. “It won’t be this shining beacon—it will just be an incremental, compounding acceleration.”

Why chatbots are starting to check your age

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

How do tech companies check if their users are kids?

This question has taken on new urgency recently thanks to growing concern about the dangers that can arise when children talk to AI chatbots. For years, Big Tech companies asked for birthdays (which users could make up) to avoid violating child privacy laws, but they weren’t required to moderate content accordingly. Two developments over the last week show how quickly things are changing in the US and how this issue is becoming a new battleground, even among parents and child-safety advocates.

In one corner is the Republican Party, which has supported laws passed in several states that require sites with adult content to verify users’ ages. Critics say this provides cover to block anything deemed “harmful to minors,” which could include sex education. Other states, like California, are coming after AI companies with laws to protect kids who talk to chatbots (by requiring them to verify who’s a kid). Meanwhile, President Trump is attempting to keep AI regulation a national issue rather than allowing states to make their own rules. Support for various bills in Congress is constantly in flux.

So what might happen? The debate is quickly moving away from whether age verification is necessary and toward who will be responsible for it. This responsibility is a hot potato that no company wants to hold.

In a blog post last Tuesday, OpenAI revealed that it plans to roll out automatic age prediction. In short, the company will apply a model that uses factors like the time of day, among others, to predict whether a person chatting is under 18. For those identified as teens or children, ChatGPT will apply filters to “reduce exposure” to content like graphic violence or sexual role-play. YouTube launched something similar last year. 

If you support age verification but are concerned about privacy, this might sound like a win. But there’s a catch. The system is not perfect, of course, so it could classify a child as an adult or vice versa. People who are wrongly labeled under 18 can verify their identity by submitting a selfie or government ID to a company called Persona. 

Selfie verifications have issues: They fail more often for people of color and those with certain disabilities. Sameer Hinduja, who co-directs the Cyberbullying Research Center, says the fact that Persona will need to hold millions of government IDs and masses of biometric data is another weak point. “When those get breached, we’ve exposed massive populations all at once,” he says. 

Hinduja instead advocates for device-level verification, where a parent specifies a child’s age when setting up the child’s phone for the first time. This information is then kept on the device and shared securely with apps and websites. 

That’s more or less what Tim Cook, the CEO of Apple, recently lobbied US lawmakers to call for. Cook was fighting lawmakers who wanted to require app stores to verify ages, which would saddle Apple with lots of liability. 

More signals of where this is all headed will come on Wednesday, when the Federal Trade Commission—the agency that would be responsible for enforcing these new laws—is holding an all-day workshop on age verification. Apple’s head of government affairs, Nick Rossi, will be there. He’ll be joined by higher-ups in child safety at Google and Meta, as well as a company that specializes in marketing to children.

The FTC has become increasingly politicized under President Trump (his firing of the sole Democratic commissioner was struck down by a federal court, a decision that is now pending review by the US Supreme Court). In July, I wrote about signals that the agency is softening its stance toward AI companies. Indeed, in December, the FTC overturned a Biden-era ruling against an AI company that allowed people to flood the internet with fake product reviews, writing that it clashed with President Trump’s AI Action Plan.

Wednesday’s workshop may shed light on how partisan the FTC’s approach to age verification will be. Red states favor laws that require porn websites to verify ages (but critics warn this could be used to block a much wider range of content). Bethany Soye, a Republican state representative who is leading an effort to pass such a bill in her state of South Dakota, is scheduled to speak at the FTC meeting. The ACLU generally opposes laws requiring IDs to visit websites and has instead advocated for an expansion of existing parental controls.

While all this gets debated, though, AI has set the world of child safety on fire. We’re dealing with increased generation of child sexual abuse material, concerns (and lawsuits) about suicides and self-harm following chatbot conversations, and troubling evidence of kids’ forming attachments to AI companions. Colliding stances on privacy, politics, free expression, and surveillance will complicate any effort to find a solution. Write to me with your thoughts. 

Inside OpenAI’s big play for science 

In the three years since ChatGPT’s explosive debut, OpenAI’s technology has upended a remarkable range of everyday activities at home, at work, in schools—anywhere people have a browser open or a phone out, which is everywhere.

Now OpenAI is making an explicit play for scientists. In October, the firm announced that it had launched a whole new team, called OpenAI for Science, dedicated to exploring how its large language models could help scientists and tweaking its tools to support them.

The last couple of months have seen a slew of social media posts and academic publications in which mathematicians, physicists, biologists, and others have described how LLMs (and OpenAI’s GPT-5 in particular) have helped them make a discovery or nudged them toward a solution they might otherwise have missed. In part, OpenAI for Science was set up to engage with this community.

And yet OpenAI is also late to the party. Google DeepMind, the rival firm behind groundbreaking scientific models such as AlphaFold and AlphaEvolve, has had an AI-for-science team for years. (When I spoke to Google DeepMind’s CEO and cofounder Demis Hassabis in 2023 about that team, he told me: “This is the reason I started DeepMind … In fact, it’s why I’ve worked my whole career in AI.”)

So why now? How does a push into science fit with OpenAI’s wider mission? And what exactly is the firm hoping to achieve?

I put these questions to Kevin Weil, a vice president at OpenAI who leads the new OpenAI for Science team, in an exclusive interview last week.

On mission

Weil is a product guy. He joined OpenAI a couple of years ago as chief product officer after being head of product at Twitter and Instagram. But he started out as a scientist. He got two-thirds of the way through a PhD in particle physics at Stanford University before ditching academia for the Silicon Valley dream. Weil is keen to highlight his pedigree: “I thought I was going to be a physics professor for the rest of my life,” he says. “I still read math books on vacation.”

Asked how OpenAI for Science fits with the firm’s existing lineup of white-collar productivity tools or the viral video app Sora, Weil recites the company mantra: “The mission of OpenAI is to try and build artificial general intelligence and, you know, make it beneficial for all of humanity.”

Just imagine the future impact this technology could have on science, he says: new medicines, new materials, new devices. “Think about it helping us understand the nature of reality, helping us think through open problems. Maybe the biggest, most positive impact we’re going to see from AGI will actually be from its ability to accelerate science.”

He adds: “With GPT-5, we saw that becoming possible.” 

As Weil tells it, LLMs are now good enough to be useful scientific collaborators. They can spitball ideas, suggest novel directions to explore, and find fruitful parallels between new problems and old solutions published in obscure journals decades ago or in foreign languages.

That wasn’t the case a year or so ago. Since it announced its first so-called reasoning model—a type of LLM that can break down problems into multiple steps and work through them one by one—in December 2024, OpenAI has been pushing the envelope of what the technology can do. Reasoning models have made LLMs far better at solving math and logic problems than they used to be. “You go back a few years and we were all collectively mind-blown that the models could get an 800 on the SAT,” says Weil.

But soon LLMs were acing math competitions and solving graduate-level physics problems. Last year, OpenAI and Google DeepMind both announced that their LLMs had achieved gold-medal-level performance in the International Math Olympiad, one of the toughest math contests in the world. “These models are no longer just better than 90% of grad students,” says Weil. “They’re really at the frontier of human abilities.”

That’s a huge claim, and it comes with caveats. Still, there’s no doubt that GPT-5, which includes a reasoning model, is a big improvement on GPT-4 when it comes to complicated problem-solving. Measured against an industry benchmark known as GPQA, which includes more than 400 multiple-choice questions that test PhD-level knowledge in biology, physics, and chemistry, GPT-4 scores 39%, well below the human-expert baseline of around 70%. According to OpenAI, GPT-5.2 (the latest update to the model, released in December) scores 92%. 

Overhyped

The excitement is evident—and perhaps excessive. In October, senior figures at OpenAI, including Weil, boasted on X that GPT-5 had found solutions to several unsolved math problems. Mathematicians were quick to point out that in fact what GPT-5 appeared to have done was dig up existing solutions in old research papers, including at least one written in German. That was still useful, but it wasn’t the achievement OpenAI seemed to have claimed. Weil and his colleagues deleted their posts.

Now Weil is more careful. It is often enough to find answers that exist but have been forgotten, he says: “We collectively stand on the shoulders of giants, and if LLMs can kind of accumulate that knowledge so that we don’t spend time struggling on a problem that is already solved, that’s an acceleration all of its own.”

He plays down the idea that LLMs are about to come up with a game-changing new discovery. “I don’t think models are there yet,” he says. “Maybe they’ll get there. I’m optimistic that they will.”

But, he insists, that’s not the mission: “Our mission is to accelerate science. And I don’t think the bar for the acceleration of science is, like, Einstein-level reimagining of an entire field.”

For Weil, the question is this: “Does science actually happen faster because scientists plus models can do much more, and do it more quickly, than scientists alone? I think we’re already seeing that.”

In November, OpenAI published a series of anecdotal case studies contributed by scientists, both inside and outside the company, that illustrated how they had used GPT-5 and how it had helped. “Most of the cases were scientists that were already using GPT-5 directly in their research and had come to us one way or another saying, ‘Look at what I’m able to do with these tools,’” says Weil.

The key things that GPT-5 seems to be good at are finding references and connections to existing work that scientists were not aware of, which sometimes sparks new ideas; helping scientists sketch mathematical proofs; and suggesting ways for scientists to test hypotheses in the lab.  

“GPT 5.2 has read substantially every paper written in the last 30 years,” says Weil. “And it understands not just the field that a particular scientist is working in; it can bring together analogies from other, unrelated fields.”

“That’s incredibly powerful,” he continues. “You can always find a human collaborator in an adjacent field, but it’s difficult to find, you know, a thousand collaborators in all thousand adjacent fields that might matter. And in addition to that, I can work with the model late at night—it doesn’t sleep—and I can ask it 10 things in parallel, which is kind of awkward to do to a human.”

Solving problems

Most of the scientists OpenAI reached out to back up Weil’s position.

Robert Scherrer, a professor of physics and astronomy at Vanderbilt University, only played around with ChatGPT for fun (“I used it to rewrite the theme song for Gilligan’s Island in the style of Beowulf, which it did very well,” he tells me) until his Vanderbilt colleague Alex Lupsasca, a fellow physicist who now works at OpenAI, told him that GPT-5 had helped solve a problem he’d been working on.

Lupsasca gave Scherrer access to GPT-5 Pro, OpenAI’s $200-a-month premium subscription. “It managed to solve a problem that I and my graduate student could not solve despite working on it for several months,” says Scherrer.

It’s not perfect, he says: “GPT-5 still makes dumb mistakes. Of course, I do too, but the mistakes GPT-5 makes are even dumber.” And yet it keeps getting better, he says: “If current trends continue—and that’s a big if—I suspect that all scientists will be using LLMs soon.”

Derya Unutmaz, a professor of biology at the Jackson Laboratory, a nonprofit research institute, uses GPT-5 to brainstorm ideas, summarize papers, and plan experiments in his work studying the immune system. In the case study he shared with OpenAI, Unutmaz used GPT-5 to analyze an old data set that his team had previously looked at. The model came up with fresh insights and interpretations.  

“LLMs are already essential for scientists,” he says. “When you can complete analysis of data sets that used to take months, not using them is not an option anymore.”

Nikita Zhivotovskiy, a statistician at the University of California, Berkeley, says he has been using LLMs in his research since the first version of ChatGPT came out.

Like Scherrer, he finds LLMs most useful when they highlight unexpected connections between his own work and existing results he did not know about. “I believe that LLMs are becoming an essential technical tool for scientists, much like computers and the internet did before,” he says. “I expect a long-term disadvantage for those who do not use them.”

But he does not expect LLMs to make novel discoveries anytime soon. “I have seen very few genuinely fresh ideas or arguments that would be worth a publication on their own,” he says. “So far, they seem to mainly combine existing results, sometimes incorrectly, rather than produce genuinely new approaches.”

I also contacted a handful of scientists who are not connected to OpenAI.

Andy Cooper, a professor of chemistry at the University of Liverpool and director of the Leverhulme Research Centre for Functional Materials Design, is less enthusiastic. “We have not found, yet, that LLMs are fundamentally changing the way that science is done,” he says. “But our recent results suggest that they do have a place.”

Cooper is leading a project to develop a so-called AI scientist that can fully automate parts of the scientific workflow. He says that his team doesn’t use LLMs to come up with ideas. But the tech is starting to prove useful as part of a wider automated system where an LLM can help direct robots, for example.

“My guess is that LLMs might stick more in robotic workflows, at least initially, because I’m not sure that people are ready to be told what to do by an LLM,” says Cooper. “I’m certainly not.”

Making errors

LLMs may be becoming more and more useful, but caution is still key. In December, Jonathan Oppenheim, a scientist who works on quantum mechanics, called out a mistake that had made its way into a scientific journal. “OpenAI leadership are promoting a paper in Physics Letters B where GPT-5 proposed the main idea—possibly the first peer-reviewed paper where an LLM generated the core contribution,” Oppenheim posted on X. “One small problem: GPT-5’s idea tests the wrong thing.”

He continued: “GPT-5 was asked for a test that detects nonlinear theories. It provided a test that detects nonlocal ones. Related-sounding, but different. It’s like asking for a COVID test, and the LLM cheerfully hands you a test for chickenpox.”

It is clear that a lot of scientists are finding innovative and intuitive ways to engage with LLMs. It is also clear that the technology makes mistakes that can be so subtle even experts miss them.

Part of the problem is the way ChatGPT can flatter you into letting down your guard. As Oppenheim put it: “A core issue is that LLMs are being trained to validate the user, while science needs tools that challenge us.” In an extreme case, one individual (who was not a scientist) was persuaded by ChatGPT into thinking for months that he’d invented a new branch of mathematics.

Of course, Weil is well aware of the problem of hallucination. But he insists that newer models are hallucinating less and less. Even so, focusing on hallucination might be missing the point, he says.

“One of my teammates here, an ex math professor, said something that stuck with me,” says Weil. “He said: ‘When I’m doing research, if I’m bouncing ideas off a colleague, I’m wrong 90% of the time and that’s kind of the point. We’re both spitballing ideas and trying to find something that works.’”

“That’s actually a desirable place to be,” says Weil. “If you say enough wrong things and then somebody stumbles on a grain of truth and then the other person seizes on it and says, ‘Oh, yeah, that’s not quite right, but what if we—’ You gradually kind of find your trail through the woods.”

This is Weil’s core vision for OpenAI for Science. GPT-5 is good, but it is not an oracle. The value of this technology is in pointing people in new directions, not coming up with definitive answers, he says.

In fact, one of the things OpenAI is now looking at is making GPT-5 dial down its confidence when it delivers a response. Instead of saying Here’s the answer, it might tell scientists: Here’s something to consider.

“That’s actually something that we are spending a bunch of time on,” says Weil. “Trying to make sure that the model has some sort of epistemological humility.”

Watching the watchers

Another thing OpenAI is looking at is how to use GPT-5 to fact-check GPT-5. It’s often the case that if you feed one of GPT-5’s answers back into the model, it will pick it apart and highlight mistakes.

“You can kind of hook the model up as its own critic,” says Weil. “Then you can get a workflow where the model is thinking and then it goes to another model, and if that model finds things that it could improve, then it passes it back to the original model and says, ‘Hey, wait a minute—this part wasn’t right, but this part was interesting. Keep it.’ It’s almost like a couple of agents working together and you only see the output once it passes the critic.”
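The control flow Weil describes can be pictured as a simple generate-and-review loop. The sketch below uses toy stand-in functions rather than real model calls (nothing here is an actual OpenAI API); it only shows the shape of the workflow, in which output is surfaced only after it passes the critic.

    def generate(prompt: str, feedback: str = "") -> str:
        # Toy stand-in: a real system would call a model here, passing any critic feedback.
        return "draft answer to: " + prompt + (" (revised)" if feedback else "")

    def critique(draft: str) -> tuple:
        # Toy stand-in: a second model reviews the draft and flags what to fix or keep.
        ok = "(revised)" in draft
        return ok, "this part wasn't right, but this part was interesting; keep it"

    def answer_with_critic(prompt: str, max_rounds: int = 3) -> str:
        draft = generate(prompt)
        for _ in range(max_rounds):
            ok, feedback = critique(draft)
            if ok:
                return draft                    # only surface output once it passes the critic
            draft = generate(prompt, feedback)  # loop the feedback back to the generator
        return draft

    print(answer_with_critic("sketch a proof idea"))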

What Weil is describing also sounds a lot like what Google DeepMind did with AlphaEvolve, a tool that wrapped the firm’s LLM, Gemini, inside a wider system that separated the good responses from the bad and fed them back in to be improved on. Google DeepMind has used AlphaEvolve to solve several real-world problems.

OpenAI faces stiff competition from rival firms, whose own LLMs can do most, if not all, of the things it claims for its own models. If that’s the case, why should scientists use GPT-5 instead of Gemini or Anthropic’s Claude, families of models that are themselves improving every year? Ultimately, OpenAI for Science may be as much an effort to plant a flag in new territory as anything else. The real innovations are still to come. 

“I think 2026 will be for science what 2025 was for software engineering,” says Weil. “At the beginning of 2025, if you were using AI to write most of your code, you were an early adopter. Whereas 12 months later, if you’re not using AI to write most of your code, you’re probably falling behind. We’re now seeing those same early flashes for science as we did for code.”

He continues: “I think that in a year, if you’re a scientist and you’re not heavily using AI, you’ll be missing an opportunity to increase the quality and pace of your thinking.”

America’s coming war over AI regulation

MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

In the final weeks of 2025, the battle over regulating artificial intelligence in the US reached a boiling point. On December 11, after Congress failed twice to pass a law banning state AI laws, President Donald Trump signed a sweeping executive order seeking to handcuff states from regulating the booming industry. Instead, he vowed to work with Congress to establish a “minimally burdensome” national AI policy, one that would position the US to win the global AI race. The move marked a qualified victory for tech titans, who have been marshaling multimillion-dollar war chests to oppose AI regulations, arguing that a patchwork of state laws would stifle innovation.

In 2026, the battleground will shift to the courts. While some states might back down from passing AI laws, others will charge ahead, buoyed by mounting public pressure to protect children from chatbots and rein in power-hungry data centers. Meanwhile, dueling super PACs bankrolled by tech moguls and AI-safety advocates will pour tens of millions into congressional and state elections to seat lawmakers who champion their competing visions for AI regulation. 

Trump’s executive order directs the Department of Justice to establish a task force that sues states whose AI laws clash with his vision for light-touch regulation. It also directs the Department of Commerce to starve states of federal broadband funding if their AI laws are “onerous.” In practice, the order may target a handful of laws in Democratic states, says James Grimmelmann, a law professor at Cornell Law School. “The executive order will be used to challenge a smaller number of provisions, mostly relating to transparency and bias in AI, which tend to be more liberal issues,” Grimmelmann says.

For now, many states aren’t flinching. On December 19, New York’s governor, Kathy Hochul, signed the Responsible AI Safety and Education (RAISE) Act, a landmark law requiring AI companies to publish the protocols used to ensure the safe development of their AI models and report critical safety incidents. On January 1, California debuted the nation’s first frontier AI safety law, SB 53—which the RAISE Act was modeled on—aimed at preventing catastrophic harms such as biological weapons or cyberattacks. While both laws were watered down from earlier iterations to survive bruising industry lobbying, they struck a rare, if fragile, compromise between tech giants and AI safety advocates.

If Trump targets these hard-won laws, Democratic states like California and New York will likely take the fight to court. Republican states like Florida with vocal champions for AI regulation might follow suit. Trump could face an uphill battle. “The Trump administration is stretching itself thin with some of its attempts to effectively preempt [legislation] via executive action,” says Margot Kaminski, a law professor at the University of Colorado Law School. “It’s on thin ice.”

But Republican states that are anxious to stay off Trump’s radar or can’t afford to lose federal broadband funding for their sprawling rural communities might retreat from passing or enforcing AI laws. Win or lose in court, the chaos and uncertainty could chill state lawmaking. Paradoxically, the Democratic states that Trump wants to rein in—armed with big budgets and emboldened by the optics of battling the administration—may be the least likely to budge.

In lieu of state laws, Trump promises to create a federal AI policy with Congress. But the gridlocked and polarized body won’t be delivering a bill this year. In July, the Senate killed a moratorium on state AI laws that had been inserted into a tax bill, and in November, the House scrapped an encore attempt in a defense bill. In fact, Trump’s bid to strong-arm Congress with an executive order may sour any appetite for a bipartisan deal. 

The executive order “has made it harder to pass responsible AI policy by hardening a lot of positions, making it a much more partisan issue,” says Brad Carson, a former Democratic congressman from Oklahoma who is building a network of super PACs backing candidates who support AI regulation. “It hardened Democrats and created incredible fault lines among Republicans,” he says. 

While AI accelerationists in Trump’s orbit—AI and crypto czar David Sacks among them—champion deregulation, populist MAGA firebrands like Steve Bannon warn of rogue superintelligence and mass unemployment. In response to Trump’s executive order, Republican state attorneys general signed a bipartisan letter urging the FCC not to supersede state AI laws.

With Americans increasingly anxious about how AI could harm mental health, jobs, and the environment, public demand for regulation is growing. If Congress stays paralyzed, states will be the only ones acting to keep the AI industry in check. In 2025, state legislators introduced more than 1,000 AI bills, and nearly 40 states enacted over 100 laws, according to the National Conference of State Legislatures.

Efforts to protect children from chatbots may inspire rare consensus. On January 7, Google and Character Technologies, a startup behind the companion chatbot Character.AI, settled several lawsuits with families of teenagers who killed themselves after interacting with the bot. Just a day later, the Kentucky attorney general sued Character Technologies, alleging that the chatbots drove children to suicide and other forms of self-harm. OpenAI and Meta face a barrage of similar suits. Expect more to pile up this year. Without AI laws on the books, it remains to be seen how product liability laws and free speech doctrines apply to these novel dangers. “It’s an open question what the courts will do,” says Grimmelmann. 

While litigation brews, states will move to pass child safety laws, which are exempt from Trump’s proposed ban on state AI laws. On January 9, OpenAI inked a deal with a former foe, the child-safety advocacy group Common Sense Media, to back a ballot initiative in California called the Parents & Kids Safe AI Act, setting guardrails around how chatbots interact with children. The measure proposes requiring AI companies to verify users’ age, offer parental controls, and undergo independent child-safety audits. If passed, it could be a blueprint for states across the country seeking to crack down on chatbots. 

Fueled by widespread backlash against data centers, states will also try to regulate the resources needed to run AI. That means bills requiring data centers to report on their power and water use and foot their own electricity bills. If AI starts to displace jobs at scale, labor groups might float AI bans in specific professions. A few states concerned about the catastrophic risks posed by AI may pass safety bills mirroring SB 53 and the RAISE Act. 

Meanwhile, tech titans will continue to use their deep pockets to crush AI regulations. Leading the Future, a super PAC backed by OpenAI president Greg Brockman and the venture capital firm Andreessen Horowitz, will try to elect candidates who endorse unfettered AI development to Congress and state legislatures. They’ll follow the crypto industry’s playbook for electing allies and writing the rules. To counter this, super PACs funded by Public First, an organization run by Carson and former Republican congressman Chris Stewart of Utah, will back candidates advocating for AI regulation. We might even see a handful of candidates running on anti-AI populist platforms.

In 2026, the slow, messy process of American democracy will grind on. And the rules written in state capitals could decide how the most disruptive technology of our generation develops far beyond America’s borders, for years to come.