Anthropic’s chief scientist on 5 ways agents will be even better in 2025

Agents are the hottest thing in tech right now. Top firms from Google DeepMind to OpenAI to Anthropic are racing to augment large language models with the ability to carry out tasks by themselves. Known as agentic AI in industry jargon, such systems have fast become the new target of Silicon Valley buzz. Everyone from Nvidia to Salesforce is talking about how they are going to upend the industry. 

“We believe that, in 2025, we may see the first AI agents ‘join the workforce’ and materially change the output of companies,” Sam Altman claimed in a blog post last week.

In the broadest sense, an agent is a software system that goes off and does something, often with minimal to zero supervision. The more complex that thing is, the smarter the agent needs to be. For many, large language models are now smart enough to power agents that can do a whole range of useful tasks for us, such as filling out forms, looking up a recipe and adding the ingredients to an online grocery basket, or using a search engine to do last-minute research before a meeting and producing a quick bullet-point summary.

In October, Anthropic showed off one of the most advanced agents yet: an extension of its Claude large language model called computer use. As the name suggests, it lets you direct Claude to use a computer much as a person would, by moving a cursor, clicking buttons, and typing text. Instead of simply having a conversation with Claude, you can now ask it to carry out on-screen tasks for you.

Anthropic notes that the feature is still cumbersome and error-prone. But it is already available to a handful of testers, including third-party developers at companies such as DoorDash, Canva, and Asana.

Computer use is a glimpse of what’s to come for agents. To learn what’s coming next, MIT Technology Review talked to Anthropic’s cofounder and chief scientist Jared Kaplan. Here are five ways that agents are going to get even better in 2025.

(Kaplan’s answers have been lightly edited for length and clarity.)

1/ Agents will get better at using tools

“I think there are two axes for thinking about what AI is capable of. One is a question of how complex the task is that a system can do. And as AI systems get smarter, they’re getting better in that direction. But another direction that’s very relevant is what kinds of environments or tools the AI can use. 

“So, like, if you go back almost 10 years now to [DeepMind’s Go-playing model] AlphaGo, we had AI systems that were superhuman in terms of how well they could play board games. But if all you can work with is a board game, then that’s a very restrictive environment. It’s not actually useful, even if it’s very smart. With text models, and then multimodal models, and now computer use—and perhaps in the future with robotics—you’re moving toward bringing AI into different situations and tasks, and making it useful. 

“We were excited about computer use basically for that reason. Until recently, with large language models, it’s been necessary to give them a very specific prompt, give them very specific tools, and then they’re restricted to a specific kind of environment. What I see is that computer use will probably improve quickly in terms of how well models can do different tasks and more complex tasks. And also to realize when they’ve made mistakes, or realize when there’s a high-stakes question and it needs to ask the user for feedback.”

2/ Agents will understand context  

“Claude needs to learn enough about your particular situation and the constraints that you operate under to be useful. Things like what particular role you’re in, what styles of writing or what needs you and your organization have.

Jared Kaplan

ANTHROPIC

“I think that we’ll see improvements there where Claude will be able to search through things like your documents, your Slack, etc., and really learn what’s useful for you. That’s underemphasized a bit with agents. It’s necessary for systems to be not only useful but also safe, doing what you expected.

“Another thing is that a lot of tasks won’t require Claude to do much reasoning. You don’t need to sit and think for hours before opening Google Docs or something. And so I think that a lot of what we’ll see is not just more reasoning but the application of reasoning when it’s really useful and important, but also not wasting time when it’s not necessary.”

3/ Agents will make coding assistants better

“We wanted to get a very initial beta of computer use out to developers to get feedback while the system was relatively primitive. But as these systems get better, they might be more widely used and really collaborate with you on different activities.

“I think DoorDash, the Browser Company, and Canva are all experimenting with, like, different kinds of browser interactions and designing them with the help of AI.

“My expectation is that we’ll also see further improvements to coding assistants. That’s something that’s been very exciting for developers. There’s just a ton of interest in using Claude 3.5 for coding, where it’s not just autocomplete like it was a couple of years ago. It’s really understanding what’s wrong with code, debugging it—running the code, seeing what happens, and fixing it.”

4/ Agents will need to be made safe

“We founded Anthropic because we expected AI to progress very quickly and [thought] that, inevitably, safety concerns were going to be relevant. And I think that’s just going to become more and more visceral this year, because I think these agents are going to become more and more integrated into the work we do. We need to be ready for the challenges, like prompt injection. 

[Prompt injection is an attack in which a malicious prompt is passed to a large language model in ways that its developers did not foresee or intend. One way to do this is to add the prompt to websites that models might visit.]

“Prompt injection is probably one of the No.1 things we’re thinking about in terms of, like, broader usage of agents. I think it’s especially important for computer use, and it’s something we’re working on very actively, because if computer use is deployed at large scale, then there could be, like, pernicious websites or something that try to convince Claude to do something that it shouldn’t do.

“And with more advanced models, there’s just more risk. We have a robust scaling policy where, as AI systems become sufficiently capable, we feel like we need to be able to really prevent them from being misused. For example, if they could help terrorists—that kind of thing.

“So I’m really excited about how AI will be useful—it’s actually also accelerating us a lot internally at Anthropic, with people using Claude in all kinds of ways, especially with coding. But, yeah, there’ll be a lot of challenges as well. It’ll be an interesting year.”

What’s next for AI in 2025

MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

For the last couple of years we’ve had a go at predicting what’s coming next in AI. A fool’s game given how fast this industry moves. But we’re on a roll, and we’re doing it again.

How did we score last time round? Our four hot trends to watch out for in 2024 included what we called customized chatbots—interactive helper apps powered by multimodal large language models (check: we didn’t know it yet, but we were talking about what everyone now calls agents, the hottest thing in AI right now); generative video (check: few technologies have improved so fast in the last 12 months, with OpenAI and Google DeepMind releasing their flagship video generation models, Sora and Veo, within a week of each other this December); and more general-purpose robots that can do a wider range of tasks (check: the payoffs from large language models continue to trickle down to other parts of the tech industry, and robotics is top of the list). 

We also said that AI-generated election disinformation would be everywhere, but here—happily—we got it wrong. There were many things to wring our hands over this year, but political deepfakes were thin on the ground

So what’s coming in 2025? We’re going to ignore the obvious here: You can bet that agents and smaller, more efficient, language models will continue to shape the industry. Instead, here are five alternative picks from our AI team.

1. Generative virtual playgrounds 

If 2023 was the year of generative images and 2024 was the year of generative video—what comes next? If you guessed generative virtual worlds (a.k.a. video games), high fives all round.

We got a tiny glimpse of this technology in February, when Google DeepMind revealed a generative model called Genie that could take a still image and turn it into a side-scrolling 2D platform game that players could interact with. In December, the firm revealed Genie 2, a model that can spin a starter image into an entire virtual world.

Other companies are building similar tech. In October, the AI startups Decart and Etched revealed an unofficial Minecraft hack in which every frame of the game gets generated on the fly as you play. And World Labs, a startup cofounded by Fei-Fei Li—creator of ImageNet, the vast data set of photos that kick-started the deep-learning boom—is building what it calls large world models, or LWMs.

One obvious application is video games. There’s a playful tone to these early experiments, and generative 3D simulations could be used to explore design concepts for new games, turning a sketch into a playable environment on the fly. This could lead to entirely new types of games

But they could also be used to train robots. World Labs wants to develop so-called spatial intelligence—the ability for machines to interpret and interact with the everyday world. But robotics researchers lack good data about real-world scenarios with which to train such technology. Spinning up countless virtual worlds and dropping virtual robots into them to learn by trial and error could help make up for that.   

Will Douglas Heaven

2. Large language models that “reason”

The buzz was justified. When OpenAI revealed o1 in September, it introduced a new paradigm in how large language models work. Two months later, the firm pushed that paradigm forward in almost every way with o3—a model that just might reshape this technology for good.

Most models, including OpenAI’s flagship GPT-4, spit out the first response they come up with. Sometimes it’s correct; sometimes it’s not. But the firm’s new models are trained to work through their answers step by step, breaking down tricky problems into a series of simpler ones. When one approach isn’t working, they try another. This technique, known as “reasoning” (yes—we know exactly how loaded that term is), can make this technology more accurate, especially for math, physics, and logic problems.

It’s also crucial for agents.

In December, Google DeepMind revealed an experimental new web-browsing agent called Mariner. In the middle of a preview demo that the company gave to MIT Technology Review, Mariner seemed to get stuck. Megha Goel, a product manager at the company, had asked the agent to find her a recipe for Christmas cookies that looked like the ones in a photo she’d given it. Mariner found a recipe on the web and started adding the ingredients to Goel’s online grocery basket.

Then it stalled; it couldn’t figure out what type of flour to pick. Goel watched as Mariner explained its steps in a chat window: “It says, ‘I will use the browser’s Back button to return to the recipe.’”

It was a remarkable moment. Instead of hitting a wall, the agent had broken the task down into separate actions and picked one that might resolve the problem. Figuring out you need to click the Back button may sound basic, but for a mindless bot it’s akin to rocket science. And it worked: Mariner went back to the recipe, confirmed the type of flour, and carried on filling Goel’s basket.

Google DeepMind is also building an experimental version of Gemini 2.0, its latest large language model, that uses this step-by-step approach to problem solving, called Gemini 2.0 Flash Thinking.

But OpenAI and Google are just the tip of the iceberg. Many companies are building large language models that use similar techniques, making them better at a whole range of tasks, from cooking to coding. Expect a lot more buzz about reasoning (we know, we know) this year.

—Will Douglas Heaven

3. It’s boom time for AI in science 

One of the most exciting uses for AI is speeding up discovery in the natural sciences. Perhaps the greatest vindication of AI’s potential on this front came last October, when the Royal Swedish Academy of Sciences awarded the Nobel Prize for chemistry to Demis Hassabis and John M. Jumper from Google DeepMind for building the AlphaFold tool, which can solve protein folding, and to David Baker for building tools to help design new proteins.

Expect this trend to continue next year, and to see more data sets and models that are aimed specifically at scientific discovery. Proteins were the perfect target for AI, because the field had excellent existing data sets that AI models could be trained on. 

The hunt is on to find the next big thing. One potential area is materials science. Meta has released massive data sets and models that could help scientists use AI to discover new materials much faster, and in December, Hugging Face, together with the startup Entalpic, launched LeMaterial, an open-source project that aims to simplify and accelerate materials research. Their first project is a data set that unifies, cleans, and standardizes the most prominent material data sets. 

AI model makers are also keen to pitch their generative products as research tools for scientists. OpenAI let scientists test its latest o1 model and see how it might support them in research. The results were encouraging. 

Having an AI tool that can operate in a similar way to a scientist is one of the fantasies of the tech sector. In a manifesto published in October last year, Anthropic founder Dario Amodei highlighted science, especially biology, as one of the key areas where powerful AI could help. Amodei speculates that in the future, AI could be not only a method of data analysis but a “virtual biologist who performs all the tasks biologists do.” We’re still a long way away from this scenario. But next year, we might see important steps toward it. 

—Melissa Heikkilä

4. AI companies get cozier with national security

There is a lot of money to be made by AI companies willing to lend their tools to border surveillance, intelligence gathering, and other national security tasks. 

The US military has launched a number of initiatives that show it’s eager to adopt AI, from the Replicator program—which, inspired by the war in Ukraine, promises to spend $1 billion on small drones—to the Artificial Intelligence Rapid Capabilities Cell, a unit bringing AI into everything from battlefield decision-making to logistics. European militaries are under pressure to up their tech investment, triggered by concerns that Donald Trump’s administration will cut spending to Ukraine. Rising tensions between Taiwan and China weigh heavily on the minds of military planners, too. 

In 2025, these trends will continue to be a boon for defense-tech companies like Palantir, Anduril, and others, which are now capitalizing on classified military data to train AI models. 

The defense industry’s deep pockets will tempt mainstream AI companies into the fold too. OpenAI in December announced it is partnering with Anduril on a program to take down drones, completing a year-long pivot away from its policy of not working with the military. It joins the ranks of Microsoft, Amazon, and Google, which have worked with the Pentagon for years. 

Other AI competitors, which are spending billions to train and develop new models, will face more pressure in 2025 to think seriously about revenue. It’s possible that they’ll find enough non-defense customers who will pay handsomely for AI agents that can handle complex tasks, or creative industries willing to spend on image and video generators. 

But they’ll also be increasingly tempted to throw their hats in the ring for lucrative Pentagon contracts. Expect to see companies wrestle with whether working on defense projects will be seen as a contradiction to their values. OpenAI’s rationale for changing its stance was that “democracies should continue to take the lead in AI development,” the company wrote, reasoning that lending its models to the military would advance that goal. In 2025, we’ll be watching others follow its lead. 

James O’Donnell

5. Nvidia sees legitimate competition

For much of the current AI boom, if you were a tech startup looking to try your hand at making an AI model, Jensen Huang was your man. As CEO of Nvidia, the world’s most valuable corporation, Huang helped the company become the undisputed leader of chips used both to train AI models and to ping a model when anyone uses it, called “inferencing.”

A number of forces could change that in 2025. For one, behemoth competitors like Amazon, Broadcom, AMD, and others have been investing heavily in new chips, and there are early indications that these could compete closely with Nvidia’s—particularly for inference, where Nvidia’s lead is less solid. 

A growing number of startups are also attacking Nvidia from a different angle. Rather than trying to marginally improve on Nvidia’s designs, startups like Groq are making riskier bets on entirely new chip architectures that, with enough time, promise to provide more efficient or effective training. In 2025 these experiments will still be in their early stages, but it’s possible that a standout competitor will change the assumption that top AI models rely exclusively on Nvidia chips.

Underpinning this competition, the geopolitical chip war will continue. That war thus far has relied on two strategies. On one hand, the West seeks to limit exports to China of top chips and the technologies to make them. On the other, efforts like the US CHIPS Act aim to boost domestic production of semiconductors.

Donald Trump may escalate those export controls and has promised massive tariffs on any goods imported from China. In 2025, such tariffs would put Taiwan—on which the US relies heavily because of the chip manufacturer TSMC—at the center of the trade wars. That’s because Taiwan has said it will help Chinese firms relocate to the island to help them avoid the proposed tariffs. That could draw further criticism from Trump, who has expressed frustration with US spending to defend Taiwan from China. 

It’s unclear how these forces will play out, but it will only further incentivize chipmakers to reduce reliance on Taiwan, which is the entire purpose of the CHIPS Act. As spending from the bill begins to circulate, next year could bring the first evidence of whether it’s materially boosting domestic chip production. 

James O’Donnell

What’s next for our privacy?

MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

Every day, we are tracked hundreds or even thousands of times across the digital world. Cookies and web trackers capture every website link that we click, while code installed in mobile apps tracks every physical location that our devices—and, by extension, we—have visited. All of this is collected, packaged together with other details (compiled from public records, supermarket member programs, utility companies, and more), and used to create highly personalized profiles that are then shared or sold, often without our explicit knowledge or consent. 

A consensus is growing that Americans need better privacy protections—and that the best way to deliver them would be for Congress to pass comprehensive federal privacy legislation. While the latest iteration of such a bill, the American Privacy Rights Act of 2024, gained more momentum than previously proposed laws, it became so watered down that it lost support from both Republicans and Democrats before it even came to a vote. 

There have been some privacy wins in the form of limits on what data brokers—third-party companies that buy and sell consumers’ personal information for targeted advertisements, messaging, and other purposes—can do with geolocation data. 

These are still small steps, though—and they are happening as increasingly pervasive and powerful technologies collect more data than ever. And at the same time, Washington is preparing for a new presidential administration that has attacked the press and other critics, promised to target immigrants for mass deportation, threatened to seek retribution against perceived enemies, and supported restrictive state abortion laws. This is not even to mention the increased collection of our biometric data, especially for facial recognition, and the normalization of its use in all kinds of ways. In this light, it’s no stretch to say our personal data has arguably never been more vulnerable, and the imperative for privacy has never felt more urgent. 

So what can Americans expect for their personal data in 2025? We spoke to privacy experts and advocates about (some of) what’s on their mind regarding how our digital data might be traded or protected moving forward. 

Reining in a problematic industry

In early December, the Federal Trade Commission announced separate settlement agreements with the data brokers Mobilewalla and Gravy Analytics (and its subsidiary Venntel). Finding that the companies had tracked and sold geolocation data from users at sensitive locations like churches, hospitals, and military installations without explicit consent, the FTC banned the companies from selling such data except in specific circumstances. This follows something of a busy year in regulation of data brokers, including multiple FTC enforcement actions against other companies for similar use and sale of geolocation data, as well as a proposed rule from the Justice Department that would prohibit the sale of bulk data to foreign entities. 

And on the same day that the FTC announced these settlements in December, the Consumer Financial Protection Bureau proposed a new rule that would designate data brokers as consumer reporting agencies, which would trigger stringent reporting requirements and consumer privacy protections. The rule would prohibit the collection and sharing of people’s sensitive information, such as their salaries and Social Security numbers, without “legitimate purposes.” While the rule will still need to undergo a 90-day public comment period, and it’s unclear whether it will move forward under the Trump administration, if it’s finalized it has the power to fundamentally limit how data brokers do business.

Right now, there just aren’t many limits on how these companies operate—nor, for that matter, clear information on how many data brokerages even exist. Industry watchers estimate there may be 4,000 to 5,000 data brokers around the world, many of which we’ve never heard of—and whose names constantly shift. In California alone, the state’s 2024 Data Broker Registry lists 527 such businesses that have voluntarily registered there, nearly 90 of which also self-reported that they collect geolocation data. 

All this data is widely available for purchase by anyone who will pay. Marketers buy data to create highly targeted advertisements, and banks and insurance companies do the same to verify identity, prevent fraud, and conduct risk assessments. Law enforcement buys geolocation data to track people’s whereabouts without getting traditional search warrants. Foreign entities can also currently buy sensitive information on members of the military and other government officials. And on people-finder websites, basically anyone can pay for anyone else’s contact details and personal history.  

Data brokers and their clients defend these transactions by saying that most of this data is anonymized—though it’s questionable whether that can truly be done in the case of geolocation data. Besides, anonymous data can be easily reidentified, especially when it’s combined with other personal information. 

Digital-rights advocates have spent years sounding the alarm on this secretive industry, especially the ways in which it can harm already marginalized communities, though various types of data collection have sparked consternation across the political spectrum. Representative Cathy McMorris Rodgers, the Republican chair of the House Energy and Commerce Committee, for example, was concerned about how the Centers for Disease Control and Prevention bought location data to evaluate the effectiveness of pandemic lockdowns. Then a study from last year showed how easy (and cheap) it was to buy sensitive data about members of the US military; Senator Elizabeth Warren, a Democrat, called out the national security risks of data brokers in a statement to MIT Technology Review, and Senator John Cornyn, a Republican, later said he was “shocked” when he read about the practice in our story. 

But it was the 2022 Supreme Court decision ending the constitutional guarantee of legal abortion that spurred much of the federal action last year. Shortly after the Dobbs ruling, President Biden issued an executive order to protect access to reproductive health care; it included instructions for the FTC to take steps preventing information about visits to doctor’s offices or abortion clinics from being sold to law enforcement agencies or state prosecutors.

The new enforcers

With Donald Trump taking office in January, and Republicans taking control of both houses of Congress, the fate of the CFPB’s proposed rule—and the CFPB itself—is uncertain. Republicans, the people behind Project 2025, and Elon Musk (who will lead the newly created advisory group known as the Department of Government Efficiency) have long been interested in seeing the bureau “deleted,” as Musk put it on X. That would take an act of Congress, making it unlikely, but there are other ways that the administration could severely curtail its powers. Trump is likely to fire the current director and install a Republican who could rescind existing CFPB rules and stop any proposed rules from moving forward. 

Meanwhile, the FTC’s enforcement actions are only as good as the enforcers. FTC decisions do not set legal precedent in quite the same way that court cases do, says Ben Winters, a former Department of Justice official and the director of AI and privacy at the Consumer Federation of America, a network of organizations and agencies focused on consumer protection. Instead, they “require consistent [and] additional enforcement to make the whole industry scared of not having an FTC enforcement action against them.” (It’s also worth noting that these FTC settlements are specifically focused on geolocation data, which is just one of the many types of sensitive data that we regularly give up in order to participate in the digital world.)

Looking ahead, Tiffany Li, a professor at the University of San Francisco School of Law who focuses on AI and privacy law, is worried about “a defanged FTC” that she says would be “less aggressive in taking action against companies.” 

Lina Khan, the current FTC chair, has been the leader of privacy protection action in the US, notes Li, and she’ll soon be leaving. Andrew Ferguson, Trump’s recently named pick to be the next FTC chair, has come out in strong opposition to data brokers: “This type of data—records of a person’s precise physical locations—is inherently intrusive and revealing of people’s most private affairs,” he wrote in a statement on the Mobilewalla decision, indicating that he is likely to continue action against them. (Ferguson has been serving as a commissioner on the FTC since April 20214.) On the other hand, he has spoken out against using FTC actions as an alternative to privacy legislation passed by Congress. And, of course, this brings us right back around to that other major roadblock: Congress has so far failed to pass such laws—and it’s unclear if the next Congress will either. 

Movement in the states

Without federal legislative action, many US states are taking privacy matters into their own hands. 

In 2025, eight new state privacy laws will take effect, making a total of 25 around the country. A number of other states—like Vermont and Massachusetts—are considering passing their own privacy bills next year, and such laws could, in theory, force national legislation, says Woodrow Hartzog, a technology law scholar at Boston University School of Law. “Right now, the statutes are all similar enough that the compliance cost is perhaps expensive but manageable,” he explains. But if one state passed a law that was different enough from the others, a national law could be the only way to resolve the conflict. Additionally, four states—California, Texas, Vermont, and Oregon—already have specific laws regulating data brokers, including the requirement that they register with the state. 

Along with new laws, says Justin Brookman, the director of technology policy at Consumer Reports, comes the possibility that “we can put some more teeth on these laws.” 

Brookman points to Texas, where some of the most aggressive enforcement action at the state level has taken place under its Republican attorney general, Ken Paxton. Even before the state’s new consumer privacy bill went into effect in July, Paxton announced the creation of a special task force focused on enforcing the state’s privacy laws. He has since targeted a number of data brokers—including National Public Data, which exposed millions of sensitive customer records in a data breach in August, as well as companies that sell to them, like Sirius XM. 

At the same time, though, Paxton has moved to enforce the state’s strict abortion laws in ways that threaten individual privacy. In December, he sued a New York doctor for sending abortion pills to a Texas woman through the mail. While the doctor is theoretically protected by New York’s shield laws, which provide a safeguard from out-of-state prosecution, Paxton’s aggressive action makes it even more crucial that states enshrine data privacy protections into their laws, says Albert Fox Cahn, the executive director of the Surveillance Technology Oversight Project, an advocacy group. “There is an urgent need for states,” he says, “to lock down our resident’s’ data, barring companies from collecting and sharing information in ways that can be weaponized against them by out-of-state prosecutors.” 

Data collection in the name of “security”

While privacy has become a bipartisan issue, Republicans, in particular, are interested in “addressing data brokers in the context of national security,” such as protecting the data of military members or other government officials, says Winters. But in his view, it’s the effects on reproductive rights and immigrants that are potentially the “most dangerous” threats to privacy. 

Indeed, data brokers (including Venntel, the Gravy Analytics subsidiary named in the recent FTC settlement) have sold cell-phone data to Immigration and Customs Enforcement, as well as to Customs and Border Protection. That data has then been used to track individuals for deportation proceedings—allowing the agencies to bypass local and state sanctuary laws that ban local law enforcement from sharing information for immigration enforcement. 

“The more data that corporations collect, the more data that’s available to governments for surveillance,” warns Ashley Gorski, a senior attorney who works on national security and privacy at the American Civil Liberties Union.

The ACLU is among a number of organizations that have been pushing for the passage of another federal law related to privacy: the Fourth Amendment Is Not For Sale Act. It would close the so-called “data-broker loophole” that allows law enforcement and intelligence agencies to buy personal information from data brokers without a search warrant. The bill would “dramatically limit the ability of the government to buy Americans’ private data,” Gorski says. It was first introduced in 2021 and passed the House in April 2024, with the support of 123 Republicans and 93 Democrats, before stalling in the Senate. 

While Gorski is hopeful that the bill will move forward in the next Congress, others are less sanguine about these prospects—and alarmed about other ways that the incoming administration might “co-opt private systems for surveillance purposes,” as Hartzog puts it. So much of our personal information that is “collected for one purpose,” he says, could “easily be used by the government … to track us.” 

This is especially concerning, adds Winters, given that the next administration has been “very explicit” about wanting to use every tool at its disposal to carry out policies like mass deportations and to exact revenge on perceived enemies. And one possible change, he says, is as simple as loosening the government’s procurement processes to make them more open to emerging technologies, which may have fewer privacy protections. “Right now, it’s annoying to procure anything as a federal agency,” he says, but he expects a more “fast and loose use of commercial tools.” 

“That’s something we’ve [already] seen a lot,” he adds, pointing to “federal, state, and local agencies using the Clearviews of the world”—a reference to the controversial facial recognition company. 

The AI wild card

Underlying all of these debates on potential legislation is the fact that technology companies—especially AI companies—continue to require reams and reams of data, including personal data, to train their machine-learning models. And they’re quickly running out of it. 

This is something of a wild card in any predictions about personal data. Ideally, says Jennifer King, a privacy and data policy fellow at the Stanford Institute for Human-Centered Artificial Intelligence, the shortage would lead to ways for consumers to directly benefit, perhaps financially, from the value of their own data. But it’s more likely that “there will be more industry resistance against some of the proposed comprehensive federal privacy legislation bills,” she says. “Companies benefit from the status quo.” 

The hunt for more and more data may also push companies to change their own privacy policies, says Whitney Merrill, a former FTC official who works on data privacy at Asana. Speaking in a personal capacity, she says that companies “have felt the squeeze in the tech recession that we’re in, with the high interest rates,” and that under those circumstances, “we’ve seen people turn around, change their policies, and try to monetize their data in an AI world”—even if it’s at the expense of user privacy. She points to the $60-million-per-year deal that Reddit struck last year to license its content to Google to help train the company’s AI. 

Earlier this year, the FTC warned companies that it would be “unfair and deceptive” to “surreptitiously” change their privacy policies to allow for the use of user data to train AI. But again, whether or not officials follow up on this depends on those in charge. 

So what will privacy look like in 2025? 

While the recent FTC settlements and the CFPB’s proposed rule represent important steps forward in privacy protection—at least when it comes to geolocation data—Americans’ personal information still remains widely available and vulnerable. 

Rebecca Williams, a senior strategist at the ACLU for privacy and data governance, argues that all of us, as individuals and communities, should take it upon ourselves to do more to protect ourselves and “resist … by opting out” of as much data collection as possible. That means checking privacy settings on accounts and apps, and using encrypted messaging services. 

Cahn, meanwhile, says he’ll “be striving to protect [his] local community, working to enact safeguards to ensure that we live up to our principles and stated commitments.” One example of such safeguards is a proposed New York City ordinance that would ban the sharing of any location data originating from within the city limits. Hartzog says that kind of local activism has already been effective in pushing for city bans on facial recognition. 

“Privacy rights are at risk, but they’re not gone, and it’s not helpful to take an overly pessimistic look right now,” says Li, the USF law professor. “We definitely still have privacy rights, and the more that we continue to fight for these rights, the more we’re going to be able to protect our rights.”

How optimistic are you about AI’s future?

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

The start of a new year, and maybe especially this one, feels like a good time for a gut check: How optimistic are you feeling about the future of technology? 

Our annual list of 10 Breakthrough Technologies, published on Friday, might help you decide. It’s the 24th time we’ve published such a list. But just like our earliest picks (2001’s list featured brain-computer interfaces and ways to track copyrighted content on the internet, by the way), this year’s technologies may come to help society, harm it, or both.

Artificial intelligence powers four of the breakthroughs featured on the list, and I expect your optimism about them will vary widely. Take generative AI search. Now becoming the norm on Google with its AI Overviews, it promises to help sort through the internet’s incomprehensible volume of information to offer better answers for the questions we ask. Along the way, it is upending the model of how content creators get paid, and positioning fallible AI as the arbiter of truth and facts. Read more here

Also making the list is the immense progress in the world of robots, which can now learn faster thanks to AI. This means we will soon have to wrestle with whether we will trust humanoid robots enough to welcome them into our most private spaces, and how we will feel if they are remotely controlled by human beings working abroad. 

The list also features lots of technologies outside the world of AI, which I implore you to read about if only for a reminder of just how much other scientific progress is being made. This year may see advances in studying dark matter with the largest digital camera ever made for astronomy, reducing emissions from cow burps, and preventing HIV with an injection just once every six months. We also detail how technologies that you’ve long heard about—from robotaxis to stem cells—are finally making good on some of their promises.

This year, the cultural gulf between techno-optimists and, well, everyone else is set to widen. The incoming administration will be perhaps the one most shaped by Silicon Valley in recent memory, thanks to Donald Trump’s support from venture capitalists like Marc Andreessen (the author of the Techno-Optimist Manifesto) and his relationship, however recently fraught, with Elon Musk. Those figures have critiqued the Biden administration’s approach to technology as slow, “woke,” and overly cautious—attitudes they have vowed to reverse. 

So as we begin a year of immense change, here’s a small experiment I’d encourage you to do. Think about your level of optimism for technology and what’s driving it. Read our list of breakthroughs. Then see how you’ve shifted. I suspect that, like many people, you’ll find you don’t fit neatly in the camp of either optimists or pessimists. Perhaps that’s where the best progress will be made. 


Now read the rest of The Algorithm

Deeper Learning

The biggest AI flops of 2024

Though AI has remained in the spotlight this year (and even contributed to Nobel Prize–winning research in chemistry), it has not been without its failures. Take a look back over the year’s top AI failures, from chatbots dishing out illegal advice to dodgy AI-generated search results. 

Why it matters: These failures show that there are tons of unanswered questions about the technology, including who will moderate what it produces and how, whether we’re getting too trusting of the answers that chatbots produce, and what we’ll do with the mountain of “AI slop” that is increasingly taking over the internet. Above all, they illustrate the many pitfalls of blindly shoving AI into every product we interact with.

Bits and Bytes

What it’s like being a pedestrian in the world of Waymos 

Tech columnist Geoffrey Fowler finds that Waymo robotaxis regularly fail to stop for him at a crosswalk he uses every day. Though you can sometimes make eye contact with human drivers to gauge whether they’ll stop, Waymos lack that “social intelligence,” Fowler writes. (The Washington Post)

The AI Hype Index

For each print issue, MIT Technology Review publishes an AI Hype Index, a highly subjective take on the latest buzz about AI. See where facial recognition, AI replicas of your personality, and more fall on the index. (MIT Technology Review)

What’s going on at the intersection of AI and spirituality

Modern religious leaders are experimenting with A. just as earlier generations examined radio, television, and the internet. They include Rabbi Josh Fixler, who created “Rabbi Bot,” a chatbot trained on his old sermons. (The New York Times)

Meta has appointed its most prominent Republican to lead its global policy team

Just two weeks ahead of Donald Trump’s inauguration, Meta has announced it will appoint Joel Kaplan, who was White House deputy chief of staff under George W. Bush, to the company’s top policy role. Kaplan will replace Nick Clegg, who has led changes on content and elections policies. (Semafor)

Apple has settled a privacy lawsuit against Siri

The company has agreed to pay $95 million to settle a class action lawsuit alleging that Siri could be activated accidentally and then record private conversations without consent. The news comes after MIT Technology Review reported that Apple was looking into whether it could get rid of the need to use a trigger phrase like “Hey Siri” entirely. (The Washington Post)

Driving into the future

Welcome to our annual breakthroughs issue. If you’re an MIT Technology Review superfan, you may already know that putting together our 10 Breakthrough Technologies (TR10) list is one of my favorite things we do as a publication. We spend months researching and discussing which technologies will make the list. We try to highlight a mix of items that reflect innovations happening in various fields. We look at consumer technologies, large industrial­-scale projects, biomedical advances, changes in computing, climate solutions, the latest in AI, and more. 

We’ve been publishing this list every year since 2001 and, frankly, have a great track record of flagging things that are poised to hit a tipping point. When you look back over the years, you’ll find items like natural-language processing (2001), wireless power (2008), and reusable rockets (2016)—spot-on in terms of horizon scanning. You’ll also see the occasional miss, or moments when maybe we were a little bit too far ahead of ourselves. (See our Magic Leap entry from 2015.)

But the real secret of the TR10 is what we leave off the list. It is hard to think of another industry, aside from maybe entertainment, that has as much of a hype machine behind it as tech does. Which means that being too conservative is rarely the wrong call. But it does happen. 

Last year, for example, we were going to include robotaxis on the TR10. Autonomous vehicles have been around for years, but 2023 seemed like a real breakthrough moment; both Cruise and Waymo were ferrying paying customers around various cities, with big expansion plans on the horizon. And then, last fall, after a series of mishaps (including an incident when a pedestrian was caught under a vehicle and dragged), Cruise pulled its entire fleet of robotaxis from service. Yikes. 

The timing was pretty miserable, as we were in the process of putting some of the finishing touches on the issue. I made the decision to pull it. That was a mistake. 

What followed turned out to be a banner year for the robotaxi. Waymo, which had previously been available only to a select group of beta testers, opened its service to the general public in San Francisco and Los Angeles in 2024. Its cars are now ubiquitous in the City by the Bay, where they have not only become a real competitor to the likes of Uber and Lyft but even created something of a tourist attraction. Which is no wonder, because riding in one is delightful. They are still novel enough to make it feel like a kind of magic. And as you can read, Waymo is just a part of this amazing story. 

The item we swapped into the robotaxi’s place was the Apple Vision Pro, an example of both a hit and a miss. We’d included it because it is truly a revolutionary piece of hardware, and we zeroed in on its micro-OLED display. Yet a year later, it has seemingly failed to find a market fit, and its sales are reported to be far below what Apple predicted. I’ve been covering this field for well over a decade, and I would still argue that the Vision Pro (unlike the Magic Leap vaporware of 2015) is a breakthrough device. But it clearly did not have a breakthrough year. Mea culpa. 

Having said all that, I think we have an incredible and thought-provoking list for you this year—from a new astronomical observatory that will allow us to peer into the fourth dimension to new ways of searching the internet to, well, robotaxis. I hope there’s something here for everyone.

AI means the end of internet search as we’ve known it

We all know what it means, colloquially, to google something. You pop a few relevant words in a search box and in return get a list of blue links to the most relevant results. Maybe some quick explanations up top. Maybe some maps or sports scores or a video. But fundamentally, it’s just fetching information that’s already out there on the internet and showing it to you, in some sort of structured way. 

But all that is up for grabs. We are at a new inflection point.

The biggest change to the way search engines have delivered information to us since the 1990s is happening right now. No more keyword searching. No more sorting through links to click. Instead, we’re entering an era of conversational search. Which means instead of keywords, you use real questions, expressed in natural language. And instead of links, you’ll increasingly be met with answers, written by generative AI and based on live information from all across the internet, delivered the same way. 

Of course, Google—the company that has defined search for the past 25 years—is trying to be out front on this. In May of 2023, it began testing AI-generated responses to search queries, using its large language model (LLM) to deliver the kinds of answers you might expect from an expert source or trusted friend. It calls these AI Overviews. Google CEO Sundar Pichai described this to MIT Technology Review as “one of the most positive changes we’ve done to search in a long, long time.”

AI Overviews fundamentally change the kinds of queries Google can address. You can now ask it things like “I’m going to Japan for one week next month. I’ll be staying in Tokyo but would like to take some day trips. Are there any festivals happening nearby? How will the surfing be in Kamakura? Are there any good bands playing?” And you’ll get an answer—not just a link to Reddit, but a built-out answer with current results. 

More to the point, you can attempt searches that were once pretty much impossible, and get the right answer. You don’t have to be able to articulate what, precisely, you are looking for. You can describe what the bird in your yard looks like, or what the issue seems to be with your refrigerator, or that weird noise your car is making, and get an almost human explanation put together from sources previously siloed across the internet. It’s amazing, and once you start searching that way, it’s addictive.

And it’s not just Google. OpenAI’s ChatGPT now has access to the web, making it far better at finding up-to-date answers to your queries. Microsoft released generative search results for Bing in September. Meta has its own version. The startup Perplexity was doing the same, but with a “move fast, break things” ethos. Literal trillions of dollars are at stake in the outcome as these players jockey to become the next go-to source for information retrieval—the next Google.

Not everyone is excited for the change. Publishers are completely freaked out. The shift has heightened fears of a “zero-click” future, where search referral traffic—a mainstay of the web since before Google existed—vanishes from the scene. 

I got a vision of that future last June, when I got a push alert from the Perplexity app on my phone. Perplexity is a startup trying to reinvent web search. But in addition to delivering deep answers to queries, it will create entire articles about the news of the day, cobbled together by AI from different sources. 

On that day, it pushed me a story about a new drone company from Eric Schmidt. I recognized the story. Forbes had reported it exclusively, earlier in the week, but it had been locked behind a paywall. The image on Perplexity’s story looked identical to one from Forbes. The language and structure were quite similar. It was effectively the same story, but freely available to anyone on the internet. I texted a friend who had edited the original story to ask if Forbes had a deal with the startup to republish its content. But there was no deal. He was shocked and furious and, well, perplexed. He wasn’t alone. Forbes, the New York Times, and Condé Nast have now all sent the company cease-and-desist orders. News Corp is suing for damages. 

People are worried about what these new LLM-powered results will mean for our fundamental shared reality. It could spell the end of the canonical answer.

It was precisely the nightmare scenario publishers have been so afraid of: The AI was hoovering up their premium content, repackaging it, and promoting it to its audience in a way that didn’t really leave any reason to click through to the original. In fact, on Perplexity’s About page, the first reason it lists to choose the search engine is “Skip the links.”

But this isn’t just about publishers (or my own self-interest). 

People are also worried about what these new LLM-powered results will mean for our fundamental shared reality. Language models have a tendency to make stuff up—they can hallucinate nonsense. Moreover, generative AI can serve up an entirely new answer to the same question every time, or provide different answers to different people on the basis of what it knows about them. It could spell the end of the canonical answer.

But make no mistake: This is the future of search. Try it for a bit yourself, and you’ll see. 

Sure, we will always want to use search engines to navigate the web and to discover new and interesting sources of information. But the links out are taking a back seat. The way AI can put together a well-reasoned answer to just about any kind of question, drawing on real-time data from across the web, just offers a better experience. That is especially true compared with what web search has become in recent years. If it’s not exactly broken (data shows more people are searching with Google more often than ever before), it’s at the very least increasingly cluttered and daunting to navigate. 

Who wants to have to speak the language of search engines to find what you need? Who wants to navigate links when you can have straight answers? And maybe: Who wants to have to learn when you can just know? 


In the beginning there was Archie. It was the first real internet search engine, and it crawled files previously hidden in the darkness of remote servers. It didn’t tell you what was in those files—just their names. It didn’t preview images; it didn’t have a hierarchy of results, or even much of an interface. But it was a start. And it was pretty good. 

Then Tim Berners-Lee created the World Wide Web, and all manner of web pages sprang forth. The Mosaic home page and the Internet Movie Database and Geocities and the Hampster Dance and web rings and Salon and eBay and CNN and federal government sites and some guy’s home page in Turkey.

Until finally, there was too much web to even know where to start. We really needed a better way to navigate our way around, to actually find the things we needed. 

And so in 1994 Jerry Yang created Yahoo, a hierarchical directory of websites. It quickly became the home page for millions of people. And it was … well, it was okay. TBH, and with the benefit of hindsight, I think we all thought it was much better back then than it actually was.

But the web continued to grow and sprawl and expand, every day bringing more information online. Rather than just a list of sites by category, we needed something that actually looked at all that content and indexed it. By the late ’90s that meant choosing from a variety of search engines: AltaVista and AlltheWeb and WebCrawler and HotBot. And they were good—a huge improvement. At least at first.  

But alongside the rise of search engines came the first attempts to exploit their ability to deliver traffic. Precious, valuable traffic, which web publishers rely on to sell ads and retailers use to get eyeballs on their goods. Sometimes this meant stuffing pages with keywords or nonsense text designed purely to push pages higher up in search results. It got pretty bad. 

And then came Google. It’s hard to overstate how revolutionary Google was when it launched in 1998. Rather than just scanning the content, it also looked at the sources linking to a website, which helped evaluate its relevance. To oversimplify: The more something was cited elsewhere, the more reliable Google considered it, and the higher it would appear in results. This breakthrough made Google radically better at retrieving relevant results than anything that had come before. It was amazing

Sundar Pichai
Google CEO Sundar Pichai describes AI Overviews as “one of the most positive changes we’ve done to search in a long, long time.”
JENS GYARMATY/LAIF/REDUX

For 25 years, Google dominated search. Google was search, for most people. (The extent of that domination is currently the subject of multiple legal probes in the United States and the European Union.)  

But Google has long been moving away from simply serving up a series of blue links, notes Pandu Nayak, Google’s chief scientist for search. 

“It’s not just so-called web results, but there are images and videos, and special things for news. There have been direct answers, dictionary answers, sports, answers that come with Knowledge Graph, things like featured snippets,” he says, rattling off a litany of Google’s steps over the years to answer questions more directly. 

It’s true: Google has evolved over time, becoming more and more of an answer portal. It has added tools that allow people to just get an answer—the live score to a game, the hours a café is open, or a snippet from the FDA’s website—rather than being pointed to a website where the answer may be. 

But once you’ve used AI Overviews a bit, you realize they are different

Take featured snippets, the passages Google sometimes chooses to highlight and show atop the results themselves. Those words are quoted directly from an original source. The same is true of knowledge panels, which are generated from information stored in a range of public databases and Google’s Knowledge Graph, its database of trillions of facts about the world.

While these can be inaccurate, the information source is knowable (and fixable). It’s in a database. You can look it up. Not anymore: AI Overviews can be entirely new every time, generated on the fly by a language model’s predictive text combined with an index of the web. 

“I think it’s an exciting moment where we have obviously indexed the world. We built deep understanding on top of it with Knowledge Graph. We’ve been using LLMs and generative AI to improve our understanding of all that,” Pichai told MIT Technology Review. “But now we are able to generate and compose with that.”

The result feels less like a querying a database than like asking a very smart, well-read friend. (With the caveat that the friend will sometimes make things up if she does not know the answer.) 

“[The company’s] mission is organizing the world’s information,” Liz Reid, Google’s head of search, tells me from its headquarters in Mountain View, California. “But actually, for a while what we did was organize web pages. Which is not really the same thing as organizing the world’s information or making it truly useful and accessible to you.” 

That second concept—accessibility—is what Google is really keying in on with AI Overviews. It’s a sentiment I hear echoed repeatedly while talking to Google execs: They can address more complicated types of queries more efficiently by bringing in a language model to help supply the answers. And they can do it in natural language. 

That will become even more important for a future where search goes beyond text queries. For example, Google Lens, which lets people take a picture or upload an image to find out more about something, uses AI-generated answers to tell you what you may be looking at. Google has even showed off the ability to query live video. 

When it doesn’t have an answer, an AI model can confidently spew back a response anyway. For Google, this could be a real problem. For the rest of us, it could actually be dangerous.

“We are definitely at the start of a journey where people are going to be able to ask, and get answered, much more complex questions than where we’ve been in the past decade,” says Pichai. 

There are some real hazards here. First and foremost: Large language models will lie to you. They hallucinate. They get shit wrong. When it doesn’t have an answer, an AI model can blithely and confidently spew back a response anyway. For Google, which has built its reputation over the past 20 years on reliability, this could be a real problem. For the rest of us, it could actually be dangerous.

In May 2024, AI Overviews were rolled out to everyone in the US. Things didn’t go well. Google, long the world’s reference desk, told people to eat rocks and to put glue on their pizza. These answers were mostly in response to what the company calls adversarial queries—those designed to trip it up. But still. It didn’t look good. The company quickly went to work fixing the problems—for example, by deprecating so-called user-generated content from sites like Reddit, where some of the weirder answers had come from.

Yet while its errors telling people to eat rocks got all the attention, the more pernicious danger might arise when it gets something less obviously wrong. For example, in doing research for this article, I asked Google when MIT Technology Review went online. It helpfully responded that “MIT Technology Review launched its online presence in late 2022.” This was clearly wrong to me, but for someone completely unfamiliar with the publication, would the error leap out? 

I came across several examples like this, both in Google and in OpenAI’s ChatGPT search. Stuff that’s just far enough off the mark not to be immediately seen as wrong. Google is banking that it can continue to improve these results over time by relying on what it knows about quality sources.

“When we produce AI Overviews,” says Nayak, “we look for corroborating information from the search results, and the search results themselves are designed to be from these reliable sources whenever possible. These are some of the mechanisms we have in place that assure that if you just consume the AI Overview, and you don’t want to look further … we hope that you will still get a reliable, trustworthy answer.”

In the case above, the 2022 answer seemingly came from a reliable source—a story about MIT Technology Review’s email newsletters, which launched in 2022. But the machine fundamentally misunderstood. This is one of the reasons Google uses human beings—raters—to evaluate the results it delivers for accuracy. Ratings don’t correct or control individual AI Overviews; rather, they help train the model to build better answers. But human raters can be fallible. Google is working on that too. 

“Raters who look at your experiments may not notice the hallucination because it feels sort of natural,” says Nayak. “And so you have to really work at the evaluation setup to make sure that when there is a hallucination, someone’s able to point out and say, That’s a problem.”

The new search

Google has rolled out its AI Overviews to upwards of a billion people in more than 100 countries, but it is facing upstarts with new ideas about how search should work.


Search Engine

Google
The search giant has added AI Overviews to search results. These overviews take information from around the web and Google’s Knowledge Graph and use the company’s Gemini language model to create answers to search queries.

What it’s good at

Google’s AI Overviews are great at giving an easily digestible summary in response to even the most complex queries, with sourcing boxes adjacent to the answers. Among the major options, its deep web index feels the most “internety.” But web publishers fear its summaries will give people little reason to click through to the source material.


Perplexity
Perplexity is a conversational search engine that uses third-party large
language models from OpenAI and Anthropic to answer queries.

Perplexity is fantastic at putting together deeper dives in response to user queries, producing answers that are like mini white papers on complex topics. It’s also excellent at summing up current events. But it has gotten a bad rep with publishers, who say it plays fast and loose with their content.


ChatGPT
While Google brought AI to search, OpenAI brought search to ChatGPT. Queries that the model determines will benefit from a web search automatically trigger one, or users can manually select the option to add a web search.

Thanks to its ability to preserve context across a conversation, ChatGPT works well for performing searches that benefit from follow-up questions—like planning a vacation through multiple search sessions. OpenAI says users sometimes go “20 turns deep” in researching queries. Of these three, it makes links out to publishers least prominent.


When I talked to Pichai about this, he expressed optimism about the company’s ability to maintain accuracy even with the LLM generating responses. That’s because AI Overviews is based on Google’s flagship large language model, Gemini, but also draws from Knowledge Graph and what it considers reputable sources around the web. 

“You’re always dealing in percentages. What we have done is deliver it at, like, what I would call a few nines of trust and factuality and quality. I’d say 99-point-few-nines. I think that’s the bar we operate at, and it is true with AI Overviews too,” he says. “And so the question is, are we able to do this again at scale? And I think we are.”

There’s another hazard as well, though, which is that people ask Google all sorts of weird things. If you want to know someone’s darkest secrets, look at their search history. Sometimes the things people ask Google about are extremely dark. Sometimes they are illegal. Google doesn’t just have to be able to deploy its AI Overviews when an answer can be helpful; it has to be extremely careful not to deploy them when an answer may be harmful. 

“If you go and say ‘How do I build a bomb?’ it’s fine that there are web results. It’s the open web. You can access anything,” Reid says. “But we do not need to have an AI Overview that tells you how to build a bomb, right? We just don’t think that’s worth it.” 

But perhaps the greatest hazard—or biggest unknown—is for anyone downstream of a Google search. Take publishers, who for decades now have relied on search queries to send people their way. What reason will people have to click through to the original source, if all the information they seek is right there in the search result?  

Rand Fishkin, cofounder of the market research firm SparkToro, publishes research on so-called zero-click searches. As Google has moved increasingly into the answer business, the proportion of searches that end without a click has gone up and up. His sense is that AI Overviews are going to explode this trend.  

“If you are reliant on Google for traffic, and that traffic is what drove your business forward, you are in long- and short-term trouble,” he says. 

Don’t panic, is Pichai’s message. He argues that even in the age of AI Overviews, people will still want to click through and go deeper for many types of searches. “The underlying principle is people are coming looking for information. They’re not looking for Google always to just answer,” he says. “Sometimes yes, but the vast majority of the times, you’re looking at it as a jumping-off point.” 

Reid, meanwhile, argues that because AI Overviews allow people to ask more complicated questions and drill down further into what they want, they could even be helpful to some types of publishers and small businesses, especially those operating in the niches: “You essentially reach new audiences, because people can now express what they want more specifically, and so somebody who specializes doesn’t have to rank for the generic query.”


 “I’m going to start with something risky,” Nick Turley tells me from the confines of a Zoom window. Turley is the head of product for ChatGPT, and he’s showing off OpenAI’s new web search tool a few weeks before it launches. “I should normally try this beforehand, but I’m just gonna search for you,” he says. “This is always a high-risk demo to do, because people tend to be particular about what is said about them on the internet.” 

He types my name into a search field, and the prototype search engine spits back a few sentences, almost like a speaker bio. It correctly identifies me and my current role. It even highlights a particular story I wrote years ago that was probably my best known. In short, it’s the right answer. Phew? 

A few weeks after our call, OpenAI incorporated search into ChatGPT, supplementing answers from its language model with information from across the web. If the model thinks a response would benefit from up-to-date information, it will automatically run a web search (OpenAI won’t say who its search partners are) and incorporate those responses into its answer, with links out if you want to learn more. You can also opt to manually force it to search the web if it does not do so on its own. OpenAI won’t reveal how many people are using its web search, but it says some 250 million people use ChatGPT weekly, all of whom are potentially exposed to it.  

“There’s an incredible amount of content on the web. There are a lot of things happening in real time. You want ChatGPT to be able to use that to improve its answers and to be a better super-assistant for you.”

Kevin Weil, chief product officer, OpenAI

According to Fishkin, these newer forms of AI-assisted search aren’t yet challenging Google’s search dominance. “It does not appear to be cannibalizing classic forms of web search,” he says. 

OpenAI insists it’s not really trying to compete on search—although frankly this seems to me like a bit of expectation setting. Rather, it says, web search is mostly a means to get more current information than the data in its training models, which tend to have specific cutoff dates that are often months, or even a year or more, in the past. As a result, while ChatGPT may be great at explaining how a West Coast offense works, it has long been useless at telling you what the latest 49ers score is. No more. 

“I come at it from the perspective of ‘How can we make ChatGPT able to answer every question that you have? How can we make it more useful to you on a daily basis?’ And that’s where search comes in for us,” Kevin Weil, the chief product officer with OpenAI, tells me. “There’s an incredible amount of content on the web. There are a lot of things happening in real time. You want ChatGPT to be able to use that to improve its answers and to be able to be a better super-assistant for you.”

Today ChatGPT is able to generate responses for very current news events, as well as near-real-time information on things like stock prices. And while ChatGPT’s interface has long been, well, boring, search results bring in all sorts of multimedia—images, graphs, even video. It’s a very different experience. 

Weil also argues that ChatGPT has more freedom to innovate and go its own way than competitors like Google—even more than its partner Microsoft does with Bing. Both of those are ad-dependent businesses. OpenAI is not. (At least not yet.) It earns revenue from the developers, businesses, and individuals who use it directly. It’s mostly setting large amounts of money on fire right now—it’s projected to lose $14 billion in 2026, by some reports. But one thing it doesn’t have to worry about is putting ads in its search results as Google does. 

Elizabeth Reid
“For a while what we did was organize web pages. Which is not really the same thing as organizing the world’s information or making it truly useful and accessible to you,” says Google head of search, Liz Reid.
WINNI WINTERMEYER/REDUX

Like Google, ChatGPT is pulling in information from web publishers, summarizing it, and including it in its answers. But it has also struck financial deals with publishers, a payment for providing the information that gets rolled into its results. (MIT Technology Review has been in discussions with OpenAI, Google, Perplexity, and others about publisher deals but has not entered into any agreements. Editorial was neither party to nor informed about the content of those discussions.)

But the thing is, for web search to accomplish what OpenAI wants—to be more current than the language model—it also has to bring in information from all sorts of publishers and sources that it doesn’t have deals with. OpenAI’s head of media partnerships, Varun Shetty, told MIT Technology Review that it won’t give preferential treatment to its publishing partners.

Instead, OpenAI told me, the model itself finds the most trustworthy and useful source for any given question. And that can get weird too. In that very first example it showed me—when Turley ran that name search—it described a story I wrote years ago for Wired about being hacked. That story remains one of the most widely read I’ve ever written. But ChatGPT didn’t link to it. It linked to a short rewrite from The Verge. Admittedly, this was on a prototype version of search, which was, as Turley said, “risky.” 

When I asked him about it, he couldn’t really explain why the model chose the sources that it did, because the model itself makes that evaluation. The company helps steer it by identifying—sometimes with the help of users—what it considers better answers, but the model actually selects them. 

“And in many cases, it gets it wrong, which is why we have work to do,” said Turley. “Having a model in the loop is a very, very different mechanism than how a search engine worked in the past.”

Indeed! 

The model, whether it’s OpenAI’s GPT-4o or Google’s Gemini or Anthropic’s Claude, can be very, very good at explaining things. But the rationale behind its explanations, its reasons for selecting a particular source, and even the language it may use in an answer are all pretty mysterious. Sure, a model can explain very many things, but not when that comes to its own answers. 


It was almost a decade ago, in 2016, when Pichai wrote that Google was moving from “mobile first” to “AI first”: “But in the next 10 years, we will shift to a world that is AI-first, a world where computing becomes universally available—be it at home, at work, in the car, or on the go—and interacting with all of these surfaces becomes much more natural and intuitive, and above all, more intelligent.” 

We’re there now—sort of. And it’s a weird place to be. It’s going to get weirder. That’s especially true as these things we now think of as distinct—querying a search engine, prompting a model, looking for a photo we’ve taken, deciding what we want to read or watch or hear, asking for a photo we wish we’d taken, and didn’t, but would still like to see—begin to merge. 

The search results we see from generative AI are best understood as a waypoint rather than a destination. What’s most important may not be search in itself; rather, it’s that search has given AI model developers a path to incorporating real-time information into their inputs and outputs. And that opens up all sorts of possibilities.

“A ChatGPT that can understand and access the web won’t just be about summarizing results. It might be about doing things for you. And I think there’s a fairly exciting future there,” says OpenAI’s Weil. “You can imagine having the model book you a flight, or order DoorDash, or just accomplish general tasks for you in the future. It’s just once the model understands how to use the internet, the sky’s the limit.”

This is the agentic future we’ve been hearing about for some time now, and the more AI models make use of real-time data from the internet, the closer it gets. 

Let’s say you have a trip coming up in a few weeks. An agent that can get data from the internet in real time can book your flights and hotel rooms, make dinner reservations, and more, based on what it knows about you and your upcoming travel—all without your having to guide it. Another agent could, say, monitor the sewage output of your home for certain diseases, and order tests and treatments in response. You won’t have to search for that weird noise your car is making, because the agent in your vehicle will already have done it and made an appointment to get the issue fixed. 

“It’s not always going to be just doing search and giving answers,” says Pichai. “Sometimes it’s going to be actions. Sometimes you’ll be interacting within the real world. So there is a notion of universal assistance through it all.”

And the ways these things will be able to deliver answers is evolving rapidly now too. For example, today Google can not only search text, images, and even video; it can create them. Imagine overlaying that ability with search across an array of formats and devices. “Show me what a Townsend’s warbler looks like in the tree in front of me.” Or “Use my existing family photos and videos to create a movie trailer of our upcoming vacation to Puerto Rico next year, making sure we visit all the best restaurants and top landmarks.”

“We have primarily done it on the input side,” he says, referring to the ways Google can now search for an image or within a video. “But you can imagine it on the output side too.”

This is the kind of future Pichai says he is excited to bring online. Google has already showed off a bit of what that might look like with NotebookLM, a tool that lets you upload large amounts of text and have it converted into a chatty podcast. He imagines this type of functionality—the ability to take one type of input and convert it into a variety of outputs—transforming the way we interact with information. 

In a demonstration of a tool called Project Astra this summer at its developer conference, Google showed one version of this outcome, where cameras and microphones in phones and smart glasses understand the context all around you—online and off, audible and visual—and have the ability to recall and respond in a variety of ways. Astra can, for example, look at a crude drawing of a Formula One race car and not only identify it, but also explain its various parts and their uses. 

But you can imagine things going a bit further (and they will). Let’s say I want to see a video of how to fix something on my bike. The video doesn’t exist, but the information does. AI-assisted generative search could theoretically find that information somewhere online—in a user manual buried in a company’s website, for example—and create a video to show me exactly how to do what I want, just as it could explain that to me with words today.

These are the kinds of things that start to happen when you put the entire compendium of human knowledge—knowledge that’s previously been captured in silos of language and format; maps and business registrations and product SKUs; audio and video and databases of numbers and old books and images and, really, anything ever published, ever tracked, ever recorded; things happening right now, everywhere—and introduce a model into all that. A model that maybe can’t understand, precisely, but has the ability to put that information together, rearrange it, and spit it back in a variety of different hopefully helpful ways. Ways that a mere index could not.

That’s what we’re on the cusp of, and what we’re starting to see. And as Google rolls this out to a billion people, many of whom will be interacting with a conversational AI for the first time, what will that mean? What will we do differently? It’s all changing so quickly. Hang on, just hang on. 

Small language models: 10 Breakthrough Technologies 2025

WHO

Allen Institute for Artificial Intelligence, Anthropic, Google, Meta, Microsoft, OpenAI

WHEN

Now

Make no mistake: Size matters in the AI world. When OpenAI launched GPT-3 back in 2020, it was the largest language model ever built. The firm showed that supersizing this type of model was enough to send performance through the roof. That kicked off a technology boom that has been sustained by bigger models ever since. As Noam Brown, a research scientist at OpenAI, told an audience at TEDAI San Francisco in October, “The incredible progress in AI over the past five years can be summarized in one word: scale.”

But as the marginal gains for new high-end models trail off, researchers are figuring out how to do more with less. For certain tasks, smaller models that are trained on more focused data sets can now perform just as well as larger ones—if not better. That’s a boon for businesses eager to deploy AI in a handful of specific ways. You don’t need the entire internet in your model if you’re making the same kind of request again and again. 

Most big tech firms now boast fun-size versions of their flagship models for this purpose: OpenAI offers both GPT-4o and GPT-4o mini; Google DeepMind has Gemini Ultra and Gemini Nano; and Anthropic’s Claude 3 comes in three flavors: outsize Opus, midsize Sonnet, and tiny Haiku. Microsoft is pioneering a range of small language models called Phi.

A growing number of smaller companies offer small models as well. The AI startup Writer claims that its latest language model matches the performance of the largest top-tier models on many key metrics despite in some cases having just a 20th as many parameters (the values that get calculated during training and determine how a model behaves). 

Explore the full 2025 list of 10 Breakthrough Technologies.

Smaller models are more efficient, making them quicker to train and run. That’s good news for anyone wanting a more affordable on-ramp. And it could be good for the climate, too: Because smaller models work with a fraction of the computer oomph required by their giant cousins, they burn less energy. 

These small models also travel well: They can run right in our pockets, without needing to send requests to the cloud. Small is the next big thing.

Vera C. Rubin Observatory: 10 Breakthrough Technologies 2025

WHO

US Department of Energy’s SLAC National Accelerator Laboratory, US National Science Foundation

WHEN

6 months

The next time you glance up at the night sky, consider: The particles inside everything you can see make up only about 5% of what’s out there in the universe. Dark energy and dark matter constitute the rest, astronomers believe—but what exactly is this mysterious stuff? 

A massive new telescope erected in Chile will explore this question and other cosmic unknowns. It’s named for Vera Rubin, an American astronomer who in the 1970s and 1980s observed stars moving faster than expected in the outer reaches of dozens of spiral galaxies. Her calculations made a strong case for the existence of dark matter—mass we can’t directly observe but that appears to shape everything from the paths of stars to the structure of the universe itself. 

Explore the full 2025 list of 10 Breakthrough Technologies.

Soon, her namesake observatory will carry on that work in much higher definition. The facility, run by the SLAC National Accelerator Laboratory and the US National Science Foundation, will house the largest digital camera ever made for astronomy. And its first mission will be to complete what’s called the Legacy Survey of Space and Time. Astronomers will focus its giant lens on the sky over the Southern Hemisphere and snap photo after photo, passing over the same patches of sky repeatedly for a decade. 

By the end of the survey, this 3.2-gigapixel camera will have catalogued 20 billion galaxies and collected up to 60 petabytes of data—roughly three times the amount currently stored by the US Library of Congress. Compiling all these images together, with help from specialized algorithms and a supercomputer, will give astronomers a time-lapse view of the sky. Seeing how so many galaxies are dispersed and shaped will enable them to study dark matter’s gravitational effect. They also plan to create the most detailed three-dimensional map of our Milky Way galaxy ever made. 

If all goes well, the telescope will snap its first science-quality images—a special moment known as first light—in mid-2025. The public could see the first photo released from Rubin soon after. 

Long-acting HIV prevention meds: 10 Breakthrough Technologies 2025

WHO

Gilead Sciences, GSK, ViiV Healthcare

WHEN

1 to 3 years

In June 2024, results from a trial of a new medicine to prevent HIV were announced—and they were jaw-dropping. Lenacapavir, a treatment injected once every six months, protected over 5,000 girls and women in Uganda and South Africa from getting HIV. And it was 100% effective.

The drug, which is produced by Gilead, has other advantages. We’ve had effective pre-exposure prophylactic (PrEP) drugs for HIV since 2012, but these must be taken either daily or in advance of each time a person is exposed to the virus. That’s a big ask for healthy people. And because these medicines also treat infections, there’s stigma attached to taking them. For some, the drugs are expensive or hard to access. In the lenacapavir trial, researchers found that injections of the new drug were more effective than a daily PrEP pill, probably because participants didn’t manage to take the pills every day.

 In 2021, the US Food and Drug Administration approved another long-acting injectable drug that protects against HIV. That drug, cabotegravir, is manufactured by ViiV Healthcare (which is largely owned by GSK) and needs to be injected every two months. But despite huge demand, rollout has been slow.   

Explore the full 2025 list of 10 Breakthrough Technologies.

Scientists and activists hope that the story will be different for lenacapavir. So far, the FDA has approved the drug only for people who already have HIV that’s resistant to other treatments. But Gilead has signed licensing agreements with manufacturers to produce generic versions for HIV prevention in 120 low-income countries. 

In October, Gilead announced more trial results for lenacapavir, finding it 96% effective at preventing HIV infection in just over 3,200 cisgender gay, bisexual, and other men, as well as transgender men, transgender women, and nonbinary people who have sex with people assigned male at birth. 

The United Nations has set a goal of ending AIDS by 2030. It’s ambitious, to say the least: We still see over 1 million new HIV infections globally every year. But we now have the medicines to get us there. What we need is access. 

Generative AI search: 10 Breakthrough Technologies 2025

WHO

Apple, Google, Meta, Microsoft, OpenAI, Perplexity

WHEN

Now

Google’s introduction of AI Overviews, powered by its Gemini language model, will alter how billions of people search the internet. And generative search may be the first step toward an AI agent that handles any question you have or task you need done.

Rather than returning a list of links, AI Overviews offer concise answers to your queries. This makes it easier to get quick insights without scrolling and clicking through to multiple sources. After a rocky start with high-profile nonsense results following its US release in May 2024, Google limited its use of answers that draw on user-­generated content or satire and humor sites.   

Explore the full 2025 list of 10 Breakthrough Technologies.

The rise of generative search isn’t limited to Google. Microsoft and OpenAI both rolled out versions in 2024 as well. Meanwhile, in more places, on our computers and other gadgets, AI-assisted searches are now analyzing images, audio, and video to return custom answers to our queries. 

But Google’s global search dominance makes it the most important player, and the company has already rolled out AI Overviews to more than a billion people worldwide. The result is searches that feel more like conversations. Google and OpenAI both report that people interact differently with generative search—they ask longer questions and pose more follow-ups.    

This new application of AI has serious implications for online advertising and (gulp) media. Because these search products often summarize information from online news stories and articles in their responses, concerns abound that generative search results will leave little reason for people to click through to the original sources, depriving those websites of potential ad revenue. A number of publishers and artists have sued over the use of their content to train AI models; now, generative search will be another battleground between media and Big Tech.