Generative AI search: 10 Breakthrough Technologies 2025

WHO

Apple, Google, Meta, Microsoft, OpenAI, Perplexity

WHEN

Now

Google’s introduction of AI Overviews, powered by its Gemini language model, will alter how billions of people search the internet. And generative search may be the first step toward an AI agent that handles any question you have or task you need done.

Rather than returning a list of links, AI Overviews offer concise answers to your queries. This makes it easier to get quick insights without scrolling and clicking through to multiple sources. After a rocky start with high-profile nonsense results following its US release in May 2024, Google limited its use of answers that draw on user-generated content or satire and humor sites.


The rise of generative search isn’t limited to Google. Microsoft and OpenAI both rolled out versions in 2024 as well. Meanwhile, in more and more places, AI-assisted searches on our computers and other gadgets are now analyzing images, audio, and video to return custom answers to our queries.

But Google’s global search dominance makes it the most important player, and the company has already rolled out AI Overviews to more than a billion people worldwide. The result is searches that feel more like conversations. Google and OpenAI both report that people interact differently with generative search—they ask longer questions and pose more follow-ups.    

This new application of AI has serious implications for online advertising and (gulp) media. Because these search products often summarize information from online news stories and articles in their responses, concerns abound that generative search results will leave little reason for people to click through to the original sources, depriving those websites of potential ad revenue. A number of publishers and artists have sued over the use of their content to train AI models; now, generative search will be another battleground between media and Big Tech.

Fast-learning robots: 10 Breakthrough Technologies 2025

WHO

Agility, Amazon, Covariant, Robust, Toyota Research Institute

WHEN

Now

Generative AI is causing a paradigm shift in how robots are trained. It’s now clear how we might finally build the sort of truly capable robots that have for decades remained the stuff of science fiction. 

Robotics researchers are no strangers to artificial intelligence—it has for years helped robots detect objects in their path, for example. But a few years ago, roboticists began marveling at the progress being made in large language models. Makers of those models could feed them massive amounts of text—books, poems, manuals—and then fine-tune them to generate text based on prompts. 


The idea of doing the same for robotics was tantalizing—but incredibly complicated. It’s one thing to use AI to create sentences on a screen, but another thing entirely to use it to coach a physical robot in how to move about and do useful things.

Now, roboticists have made major breakthroughs in that pursuit. One was figuring out how to combine different sorts of data and then make it all useful and legible to a robot. Take washing dishes as an example. You can collect data from someone washing dishes while wearing sensors. Then you can combine that with teleoperation data from a human doing the same task with robotic arms. On top of all that, you can also scrape the internet for images and videos of people doing dishes.

By merging these data sources properly into a new AI model, it’s possible to train a robot that, though not perfect, has a massive head start over those trained with more manual methods. Seeing so many ways that a single task can be done makes it easier for AI models to improvise, and to surmise what a robot’s next move should be in the real world. 

It’s a breakthrough that’s set to redefine how robots learn. Robots that work in commercial spaces like warehouses are already using such advanced training methods, and the lessons we learn from those experiments could lay the groundwork for smart robots that help out at home. 

The AI Hype Index: Robot pets, simulated humans, and Apple’s AI text summaries

Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry.

More than 70 countries went to the polls in 2024. The good news is that this year of global elections turned out to be largely free from any major deepfake campaigns or AI manipulation. Instead we saw lots of AI slop: buff Trump, Elon as ultra-Chad, California as catastrophic wasteland. While some worry that development of large language models is slowing down, you wouldn’t know it from the steady drumbeat of new products, features, and services rolling out from itty-bitty startups and massive incumbents alike. So what’s for real and what’s just a lot of hallucinatory nonsense? 

The biggest AI flops of 2024

The past 12 months have been undeniably busy for those working in AI. There have been more successful product launches than we can count, and even Nobel Prizes. But it hasn’t always been smooth sailing.

AI is an unpredictable technology, and the increasing availability of generative models has led people to test their limits in new, weird, and sometimes harmful ways. These were some of 2024’s biggest AI misfires. 

AI slop infiltrated almost every corner of the internet

Generative AI makes creating reams of text, images, videos, and other types of material a breeze. Because it takes just a few seconds from entering a prompt to having your model of choice spit out the result, these models have become a quick, easy way to produce content on a massive scale. And 2024 was the year we started calling this (generally poor quality) media what it is—AI slop.

This low-stakes way of creating AI slop means it can now be found in pretty much every corner of the internet: from the newsletters in your inbox and books sold on Amazon, to ads and articles across the web and shonky pictures on your social media feeds. The more emotionally evocative these pictures are (wounded veterans, crying children, a signal of support in the Israel-Palestine conflict) the more likely they are to be shared, resulting in higher engagement and ad revenue for their savvy creators.

AI slop isn’t just annoying—its rise poses a genuine problem for the future of the very models that helped to produce it. Because those models are trained on data scraped from the internet, the increasing number of junky websites containing AI garbage means there’s a very real danger that models’ output and performance will get steadily worse.

AI art is warping our expectations of real events

2024 was also the year that the effects of surreal AI images started seeping into our real lives. Willy’s Chocolate Experience, a wildly unofficial immersive event inspired by Roald Dahl’s Charlie and the Chocolate Factory, made headlines across the world in February after its fantastical AI-generated marketing materials gave visitors the impression it would be much grander than the sparsely decorated warehouse its producers created.

Similarly, hundreds of people lined the streets of Dublin for a Halloween parade that didn’t exist. A Pakistan-based website used AI to create a list of events in the city, which was shared widely across social media ahead of October 31. Although the SEO-baiting site (myspirithalloween.com) has since been taken down, both events illustrate how misplaced public trust in AI-generated material online can come back to haunt us.

Grok allows users to create images of pretty much any scenario

The vast majority of major AI image generators have guardrails—rules that dictate what AI models can and can’t do—to prevent users from creating violent, explicit, illegal, and other types of harmful content. Sometimes these guardrails are just meant to make sure that no one makes blatant use of others’ intellectual property. But Grok, the assistant made by Elon Musk’s AI company xAI, ignores almost all of these principles, in line with Musk’s rejection of what he calls “woke AI.”

Whereas other image models will generally refuse to create images of celebrities, copyrighted material, violence, or terrorism—unless they’re tricked into ignoring these rules—Grok will happily generate images of Donald Trump firing a bazooka, or Mickey Mouse holding a bomb. While it draws the line at generating nude images, its refusal to play by the rules undermines other companies’ efforts to steer clear of creating problematic material.

Sexually explicit deepfakes of Taylor Swift circulated online

In January, non-consensual deepfake nudes of singer Taylor Swift started circulating on social media, including X and Facebook. A Telegram community tricked Microsoft’s AI image generator Designer into making the explicit images, demonstrating how guardrails can be circumvented even when they are in place. 

While Microsoft quickly closed the system’s loopholes, the incident shone a light on the platforms’ poor content-moderation policies, after posts containing the images circulated widely and remained live for days. But the most chilling takeaway is how powerless we still are to fight non-consensual deepfake porn. While watermarking and data-poisoning tools can help, they’ll need to be adopted much more widely to make a difference.

Business chatbots went haywire

As AI becomes more widespread, businesses are racing to adopt generative tools to save time and money, and to maximize efficiency. The problem is that chatbots make stuff up and can’t be relied upon to always provide accurate information.

Air Canada found this out the hard way after its chatbot advised a customer to follow a bereavement refund policy that didn’t exist. In February, a Canadian small-claims tribunal upheld the customer’s legal complaint, despite the airline’s assertion that the chatbot was a “separate legal entity that is responsible for its own actions.”

In other high-profile examples of how chatbots can do more harm than good, delivery firm DPD’s bot cheerfully swore and called itself useless with little prompting, while a different bot set up to provide New Yorkers with accurate information about their city’s government ended up dispensing guidance on how to break the law.

AI gadgets aren’t exactly setting the market alight

Hardware assistants are something the AI industry tried, and failed, to crack in 2024. Humane attempted to sell customers on the promise of the Ai Pin, a wearable lapel computer, but even slashing its price failed to boost weak sales. The Rabbit R1, a ChatGPT-based personal assistant device, suffered a similar fate, following a rash of critical reviews and reports that it was slow and buggy. Both products seemed to be trying to solve a problem that did not actually exist. 

AI search summaries went awry

Have you ever added glue to a pizza, or eaten a small rock? These are just some of the outlandish suggestions that Google’s AI Overviews feature gave web users in May after the search giant added generated responses to the top of search results. Because AI systems can’t tell the difference between a factually correct news story and a joke post on Reddit, users raced to find the strangest responses AI Overviews could generate.

But AI summaries can also have serious consequences. A new iPhone feature that groups app notifications together and creates summaries of their contents recently generated a false BBC News headline. The summary falsely stated that Luigi Mangione, who has been charged with the murder of health insurance CEO Brian Thompson, had shot himself. The same feature had previously created a headline claiming that Israeli prime minister Benjamin Netanyahu had been arrested, which was also incorrect. These kinds of errors can inadvertently spread misinformation and undermine trust in news organizations.

The humans behind the robots

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Here’s a question. Imagine that, for $15,000, you could purchase a robot to pitch in with all the mundane tasks in your household. The catch (aside from the price tag) is that for 80% of those tasks, the robot’s AI training isn’t good enough for it to act on its own. Instead, it’s aided by a remote assistant working from the Philippines to help it navigate your home and clear your table or put away groceries. Would you want one?

That’s the question at the center of my story for our magazine, published online today, on whether we will trust humanoid robots enough to welcome them into our most private spaces, particularly if they’re part of an asymmetric labor arrangement in which workers in low-wage countries perform physical tasks for us in our homes through robot interfaces. In the piece, I wrote about one robotics company called Prosper and its massive effort—bringing in former Pixar designers and professional butlers—to design a trustworthy household robot named Alfie. It’s quite a ride. Read the story here.

There’s one larger question that the story raises, though, about just how profound a shift in labor dynamics robotics could bring in the coming years. 

For decades, robots have found success on assembly lines and in other somewhat predictable environments. Then, in the last couple of years, robots started being able to learn tasks more quickly thanks to AI, and that has broadened their applications to tasks in more chaotic settings, like picking orders in warehouses. But a growing number of well-funded companies are pushing for an even more monumental shift. 

Prosper and others are betting that they don’t have to build a perfect robot that can do everything on its own. Instead, they can build one that’s pretty good, but receives help from remote operators anywhere in the world. If that works well enough, they’re hoping to bring robots into jobs that most of us would have guessed couldn’t be automated: the work of hotel housekeepers, care providers in hospitals, or domestic help. “Almost any indoor physical labor” is on the table, Prosper’s founder and CEO, Shariq Hashme, told me. 

Until now, we’ve mostly thought about automation and outsourcing as two separate forces that can affect the labor market. Jobs might be outsourced overseas or lost to automation, but not both. A job that couldn’t be sent offshore and could not yet be fully automated by machines, like cleaning a hotel room, wasn’t going anywhere. Now, advancements in robotics are promising that employers can outsource such a job to low-wage countries without needing the technology to fully automate it. 

It’s a tall order, to be clear. Robots, as advanced as they’ve gotten, may find it difficult to move around complex environments like hotels and hospitals, even with assistance. That will take years to change. However, robots will only get more nimble, as will the systems that enable them to be controlled from halfway around the world. Eventually, the bets made by these companies may pay off.

What would that mean? First, the labor movement’s battle with AI—which this year has focused its attention on automation at ports and generative AI’s theft of artists’ work—will have a whole new front to fight on. It won’t just be dock workers, delivery drivers, and actors seeking contracts to protect their jobs from automation—it will be hospitality and domestic workers too, along with many others.

Second, our expectations of privacy would radically shift. People buying those hypothetical household robots would have to be comfortable with the idea that someone they have never met is seeing their dirty laundry—literally and figuratively.

Some of those changes might happen sooner rather than later. For robots to learn how to navigate places effectively, they need training data, and this year has already seen a race to collect new data sets to help them learn. To achieve their ambitions for teleoperated robots, companies will expand their search for training data to hospitals, workplaces, hotels, and more. 


Now read the rest of The Algorithm

Deeper Learning

This is where the data to build AI comes from

AI developers often don’t really know or share much about the sources of the data they are using, and the Data Provenance Initiative, a group of over 50 researchers from both academia and industry, wanted to fix that. They dug into nearly 4,000 public data sets spanning over 600 languages, 67 countries, and three decades to understand what’s feeding today’s top AI models, and how that will affect the rest of us.

Why it matters: AI is being incorporated into everything, and what goes into the AI models determines what comes out. However, the team found that AI’s data practices risk concentrating power overwhelmingly in the hands of a few dominant technology companies, a shift from how AI models were being trained just a decade ago. Over 90% of the data sets that the researchers analyzed came from Europe and North America, and over 70% of data for both speech and image data sets comes from YouTube. This concentration means that AI models are unlikely to “capture all the nuances of humanity and all the ways that we exist,” says Sara Hooker, a researcher involved in the project. Read more from Melissa Heikkilä.

Bits and Bytes

In the shadows of Arizona’s data center boom, thousands live without power

As new research shows that AI’s emissions have soared, Arizona is expanding plans for AI data centers while rejecting plans to finally provide electricity to parts of the Navajo Nation’s land. (Washington Post)

AI is changing how we study bird migration

After decades of frustration, machine-learning tools are unlocking a treasure trove of acoustic data for ecologists. (MIT Technology Review)

OpenAI unveils a more advanced reasoning model in race with Google

The new o3 model, unveiled during a livestreamed event on Friday, spends more time computing an answer before responding to user queries, with the goal of solving more complex multi-step problems. (Bloomberg)

How your car might be making roads safer

Researchers say data from long-haul trucks and General Motors cars is critical for addressing traffic congestion and road safety. Data privacy experts have concerns. (New York Times)

The next generation of neural networks could live in hardware

Networks programmed directly into computer chip hardware can identify images faster, and use much less energy, than the traditional neural networks that underpin most modern AI systems. That’s according to work presented at a leading machine learning conference in Vancouver last week.

Neural networks, from GPT-4 to Stable Diffusion, are built by wiring together perceptrons, which are highly simplified simulations of the neurons in our brains. In very large numbers, perceptrons are powerful, but they also consume enormous volumes of energy—so much that Microsoft has penned a deal that will reopen Three Mile Island to power its AI advancements.

Part of the trouble is that perceptrons are just software abstractions—running a perceptron network on a GPU requires translating that network into the language of hardware, which takes time and energy. Building a network directly from hardware components does away with a lot of those costs. One day, they could even be built directly into chips used in smartphones and other devices, dramatically reducing the need to send data to and from servers.

Felix Petersen, who did this work as a postdoctoral researcher at Stanford University, has a strategy for making that happen. He designed networks composed of logic gates, which are some of the basic building blocks of computer chips. Made up of a few transistors apiece, logic gates accept two bits—1s or 0s—as inputs and, according to a rule determined by their specific pattern of transistors, output a single bit. Just like perceptrons, logic gates can be chained up into networks. And running logic-gate networks is cheap, fast, and easy: in his talk at the Neural Information Processing Systems (NeurIPS) conference, Petersen said that they consume less energy than perceptron networks by a factor of hundreds of thousands.
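
To make the mechanism concrete, here is a minimal sketch of what a logic-gate network is; it is an illustration written for this piece, not Petersen’s code, and the gate choices and wiring are arbitrary. Every node applies one of the 16 possible two-input Boolean functions, and nodes are chained so that earlier outputs feed later inputs.

```python
# Toy logic-gate network: each node picks one of the 16 two-input Boolean
# functions and applies it to two earlier values (raw inputs or node outputs).

# Gate i outputs bit k of i for input pattern k, where k = 2*a + b.
def gate(i, a, b):
    truth_table = [(i >> k) & 1 for k in range(4)]
    return truth_table[2 * a + b]

# A tiny, hand-wired "network": (gate index, first input slot, second input slot).
# Slots 0-2 are the raw input bits; each node's output is appended as a new slot.
network = [
    (8, 0, 1),   # node 0: AND of inputs x0, x1    (gate 8 is AND in this encoding)
    (14, 1, 2),  # node 1: OR of inputs x1, x2     (gate 14 is OR)
    (6, 3, 4),   # node 2: XOR of node 0 and node 1 (gate 6 is XOR)
]

def run(network, inputs):
    values = list(inputs)                 # slots 0..n-1 hold the raw input bits
    for g, a, b in network:               # each node appends its output bit
        values.append(gate(g, values[a], values[b]))
    return values[-1]                     # the final node is the network's output

print(run(network, [1, 0, 1]))  # -> 1
```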

Logic-gate networks don’t perform nearly as well as traditional neural networks on tasks like image labeling. But the approach’s speed and efficiency make it promising, according to Zhiru Zhang, a professor of electrical and computer engineering at Cornell University. “If we can close the gap, then this could potentially open up a lot of possibilities on this edge of machine learning,” he says.

Petersen didn’t go looking for ways to build energy-efficient AI networks. He came to logic gates through an interest in “differentiable relaxations,” or strategies for wrangling certain classes of mathematical problems into a form that calculus can solve. “It really started off as a mathematical and methodological curiosity,” he says.

Backpropagation, the training algorithm that made the deep-learning revolution possible, was an obvious use case for this approach. Because backpropagation runs on calculus, it can’t be used directly to train logic-gate networks. Logic gates only work with 0s and 1s, and calculus demands answers about all the fractions in between. Petersen devised a way to “relax” logic-gate networks enough for backpropagation by creating functions that work like logic gates on 0s and 1s but also give answers for intermediate values. He ran simulated networks with those gates through training and then converted the relaxed logic-gate network back into something that he could implement in computer hardware.

One challenge with this approach is that training the relaxed networks is tough. Each node in the network could end up as any one of 16 different logic gates, and the 16 probabilities associated with each of those gates must be tracked and continually adjusted. That takes a huge amount of time and energy—during his NeurIPS talk, Petersen said that training his networks takes hundreds of times longer than training conventional neural networks on GPUs. At universities, which can’t afford to amass hundreds of thousands of GPUs, that amount of GPU time can be tough to swing—Petersen developed these networks, in collaboration with his colleagues, at Stanford University and the University of Konstanz. “It definitely makes the research tremendously hard,” he says.
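
One common way to set up such a relaxation, sketched below as an assumption rather than as Petersen’s published implementation, is to treat each real-valued input as the probability of being 1, compute every gate’s expected output, and give each node 16 trainable logits whose softmax mixes those outputs. Backpropagation then adjusts the logits, and after training each node is snapped to its most probable gate. The toy demo trains a single node to behave like XOR.

```python
# Sketch of a "relaxed" logic-gate node (illustrative; not Petersen's released code).
import torch
import torch.nn as nn

def soft_gate_outputs(a, b):
    """Expected output of all 16 two-input gates when a, b in [0, 1] are
    interpreted as probabilities of being 1."""
    # Probability of each input pattern (0,0), (0,1), (1,0), (1,1).
    p = torch.stack([(1 - a) * (1 - b), (1 - a) * b, a * (1 - b), a * b], dim=-1)
    # Row i holds gate i's output bit for each of the four input patterns.
    tables = torch.tensor([[(i >> k) & 1 for k in range(4)] for i in range(16)],
                          dtype=p.dtype)
    return p @ tables.T                               # shape (..., 16)

class RelaxedGateNode(nn.Module):
    def __init__(self):
        super().__init__()
        self.logits = nn.Parameter(torch.zeros(16))   # one logit per candidate gate

    def forward(self, a, b):
        probs = torch.softmax(self.logits, dim=0)     # distribution over the 16 gates
        return (soft_gate_outputs(a, b) * probs).sum(dim=-1)

    def harden(self):
        """After training, commit to the single most probable gate."""
        return int(torch.argmax(self.logits))

# Train one node to act like XOR on the four possible bit pairs.
node = RelaxedGateNode()
opt = torch.optim.Adam(node.parameters(), lr=0.1)
x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([0., 1., 1., 0.])
for _ in range(300):
    opt.zero_grad()
    loss = ((node(x[:, 0], x[:, 1]) - y) ** 2).mean()
    loss.backward()
    opt.step()
print(node.harden())  # most likely 6, the index of XOR in this encoding
```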

Once the network has been trained, though, things get way, way cheaper. Petersen compared his logic-gate networks with a cohort of other ultra-efficient networks, such as binary neural networks, which use simplified perceptrons that can process only binary values. The logic-gate networks did just as well as these other efficient methods at classifying images in the CIFAR-10 data set, which includes 10 different categories of low-resolution pictures, from “frog” to “truck.” They achieved this with fewer than a tenth of the logic gates required by those other methods, and in less than a thousandth of the time. Petersen tested his networks using programmable computer chips called FPGAs, which can be used to emulate many different potential patterns of logic gates; implementing the networks in non-programmable ASIC chips would reduce costs even further, because programmable chips need to use more components in order to achieve their flexibility.

Farinaz Koushanfar, a professor of electrical and computer engineering at the University of California, San Diego, says she isn’t convinced that logic-gate networks will be able to perform when faced with more realistic problems. “It’s a cute idea, but I’m not sure how well it scales,” she says. She notes that the logic-gate networks can only be trained approximately, via the relaxation strategy, and approximations can fail. That hasn’t caused issues yet, but Koushanfar says that it could prove more problematic as the networks grow. 

Nevertheless, Petersen is ambitious. He plans to continue pushing the abilities of his logic-gate networks, and he hopes, eventually, to create what he calls a “hardware foundation model.” A powerful, general-purpose logic-gate network for vision could be mass-produced directly on computer chips, and those chips could be integrated into devices like personal phones and computers. That could reap enormous energy benefits, Petersen says. If those networks could effectively reconstruct photos and videos from low-resolution information, for example, then far less data would need to be sent between servers and personal devices. 

Petersen acknowledges that logic-gate networks will never compete with traditional neural networks on performance, but that isn’t his goal. Making something that works, and that is as efficient as possible, should be enough. “It won’t be the best model,” he says. “But it should be the cheapest.”

Accelerating AI innovation through application modernization

Business applications powered by AI are revolutionizing customer experiences, accelerating the speed of business, and driving employee productivity. In fact, according to research firm Frost & Sullivan’s 2024 Global State of AI report, 89% of organizations believe AI and machine learning will help them grow revenue, boost operational efficiency, and improve customer experience.

Take, for example, Vodafone. The telecommunications company is using a suite of Azure AI services, such as Azure OpenAI Service, to deliver real-time, hyper-personalized experiences across all of its customer touchpoints, including its digital chatbot TOBi. Naga Surendran, senior director of product marketing for Azure Application Services at Microsoft, says that by leveraging AI to increase customer satisfaction, Vodafone has managed to resolve 70% of its first-stage inquiries through AI-powered digital channels. It has also boosted the productivity of support agents by providing them with access to AI capabilities that mirror those of Microsoft Copilot, an AI-powered productivity tool.

“The result is a 20-point increase in net promoter score,” he says. “These benefits are what’s driving AI infusion into every business process and application.”

Yet realizing measurable business value from AI-powered applications requires a new game plan. Legacy application architectures simply aren’t capable of meeting the high demands of AI-enhanced applications. Rather, the time is now for organizations to modernize their infrastructure, processes, and application architectures using cloud native technologies to stay competitive.

The time is now for modernization

Today’s organizations exist in an era of geopolitical shifts, growing competition, supply chain disruptions, and evolving consumer preferences. AI applications can help by supporting innovation, but only if they have the flexibility to scale when needed. Fortunately, by modernizing applications, organizations can achieve the agile development, scalability, and fast compute performance needed to support rapid innovation and accelerate the delivery of AI applications. David Harmon, director of software development for AMD, says companies “really want to make sure that they can migrate their current [environment] and take advantage of all the hardware changes as much as possible.” The result is not only a reduction in the overall development lifecycle of new applications but a speedy response to changing world circumstances.

Beyond building and deploying intelligent apps quickly, modernizing applications, data, and infrastructure can significantly improve customer experience. Consider, for example, Coles, an Australian supermarket that invested in modernization and is using data and AI to deliver dynamic e-commerce experiences to its customers both online and in-store. With Azure DevOps, Coles has shifted from monthly to weekly deployments of applications while, at the same time, reducing build times by hours. What’s more, by aggregating views of customers across multiple channels, Coles has been able to deliver more personalized customer experiences. In fact, according to a 2024 CMSWire Insights report, there is a significant rise in the use of AI across the digital customer experience toolset, with 55% of organizations now using it to some degree, and more beginning their journey.

But even the most carefully designed applications are vulnerable to cybersecurity attacks. If given the opportunity, bad actors can extract sensitive information from machine learning models or maliciously infuse AI systems with corrupt data. “AI applications are now interacting with your core organizational data,” says Surendran. “Having the right guardrails is important to make sure the data is secure and built on a platform that enables you to do that.” The good news is that modern cloud-based architectures can deliver robust security, data governance, and AI guardrails like content safety to protect AI applications from security threats and ensure compliance with industry standards.

The answer to AI innovation

New challenges, from demanding customers to ill-intentioned hackers, call for a new approach to modernizing applications. “You have to have the right underlying application architecture to be able to keep up with the market and bring applications faster to market,” says Surendran. “Not having that foundation can slow you down.”

Enter cloud native architecture. As organizations increasingly adopt AI to accelerate innovation and stay competitive, there is a growing urgency to rethink how applications are built and deployed in the cloud. By adopting cloud native architectures, Linux, and open source software, organizations can better facilitate AI adoption and create a flexible platform purpose-built for AI and optimized for the cloud. Harmon explains that open source software creates options: “And the overall open source ecosystem just thrives on that. It allows new technologies to come into play.”

Application modernization also ensures optimal performance, scale, and security for AI applications. That’s because modernization goes beyond just lifting and shifting application workloads to cloud virtual machines. Rather, a cloud native architecture is inherently designed to provide developers with the following features:

  • The flexibility to scale to meet evolving needs
  • Better access to the data needed to drive intelligent apps
  • Access to the right tools and services to build and deploy intelligent applications easily
  • Security embedded into an application to protect sensitive data

Together, these cloud capabilities ensure organizations derive the greatest value from their AI applications. “At the end of the day, everything is about performance and security,” says Harmon. Cloud is no exception.

What’s more, Surendran notes that “when you leverage a cloud platform for modernization, organizations can gain access to AI models faster and get to market faster with building AI-powered applications. These are the factors driving the modernization journey.”

Best practices in play

For all the benefits of application modernization, there are steps organizations must take to ensure both technological and operational success. They are:

Train employees for speed. As modern infrastructure accelerates the development and deployment of AI-powered applications, developers must be prepared to work faster and smarter than ever. For this reason, Surendran warns, “Employees must be skilled in modern application development practices to support the digital business needs.” This includes developing expertise in working with loosely coupled microservices to build scalable, flexible applications and AI integrations.

Start with an assessment. Large enterprises are likely to have “hundreds of applications, if not thousands,” says Surendran. As a result, organizations must take the time to evaluate their application landscape before embarking on a modernization journey. “Starting with an assessment is super important,” continues Surendran. “Understanding, taking inventory of the different applications, which team is using what, and what this application is driving from a business process perspective is critical.”

Focus on quick wins. Modernization is a huge, long-term transformation in how companies build, deliver, and support applications. Most businesses are still learning and developing the right strategy to support innovation. For this reason, Surendran recommends focusing on quick wins while also working on a larger application estate transformation. “You have to show a return on investment for your organization and business leaders,” he says. For example, modernize some apps quickly with re-platforming and then infuse them with AI capabilities.

Partner up. “Modernization can be daunting,” says Surendran. Selecting the right strategy, process, and platform to support innovation is only the first step. Organizations must also “bring on the right set of partners to help them go through change management and the execution of this complex project.”

Address all layers of security. Organizations must be unrelenting when it comes to protecting their data. According to Surendran, this means adopting a multi-layer approach to security that includes: security by design, in which products and services are developed from the get-go with security in mind; security by default, in which protections exist at every layer and interaction where data exists; and security by ongoing operations, which means using the right tools and dashboards to govern applications throughout their lifecycle.

A look to the future

Most organizations are already aware of the need for application modernization. But with the arrival of AI comes the startling revelation that modernization efforts must be done right, and that AI applications must be built and deployed for greater business impact. Adopting a cloud native architecture can help by serving as a platform for enhanced performance, scalability, security, and ongoing innovation. “As soon as you modernize your infrastructure with a cloud platform, you have access to these rapid innovations in AI models,” says Surendran. “It’s about being able to continuously innovate with AI.”

Read more about how to accelerate app and data estate readiness for AI innovation with Microsoft Azure and AMD. Explore Linux on Azure.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

AI is changing how we study bird migration

A small songbird soars above Ithaca, New York, on a September night. He is one of 4 billion birds, a great annual river of feathered migration across North America. Midair, he lets out what ornithologists call a nocturnal flight call to communicate with his flock. It’s the briefest of signals, barely 50 milliseconds long, emitted in the woods in the middle of the night. But humans have caught it nevertheless, with a microphone topped by a focusing funnel. Moments later, software called BirdVoxDetect, the result of a collaboration between New York University, the Cornell Lab of Ornithology, and École Centrale de Nantes, identifies the bird and classifies it to the species level.

Biologists like Cornell’s Andrew Farnsworth had long dreamed of snooping on birds this way. In a warming world increasingly full of human infrastructure that can be deadly to them, like glass skyscrapers and power lines, migratory birds are facing many existential threats. Scientists rely on a combination of methods to track the timing and location of their migrations, but each has shortcomings. Doppler radar, with the weather filtered out, can detect the total biomass of birds in the air, but it can’t break that total down by species. GPS tags on individual birds and careful observations by citizen-scientist birders help fill in that gap, but tagging birds at scale is an expensive and invasive proposition. And there’s another key problem: Most birds migrate at night, when it’s more difficult to identify them visually and while most birders are in bed. For over a century, acoustic monitoring has hovered tantalizingly out of reach as a method that would solve ornithologists’ woes.

In the late 1800s, scientists realized that migratory birds made species-specific nocturnal flight calls—“acoustic fingerprints.” When microphones became commercially available in the 1950s, scientists began recording birds at night. Farnsworth led some of this acoustic ecology research in the 1990s. But even then it was challenging to spot the short calls, some of which are at the edge of the frequency range humans can hear. Scientists ended up with thousands of tapes they had to scour in real time while looking at spectrograms that visualize audio. Though digital technology made recording easier, the “perpetual problem,” Farnsworth says, “was that it became increasingly easy to collect an enormous amount of audio data, but increasingly difficult to analyze even some of it.”

Then Farnsworth met Juan Pablo Bello, director of NYU’s Music and Audio Research Lab. Fresh off a project using machine learning to identify sources of urban noise pollution in New York City, Bello agreed to take on the problem of nocturnal flight calls. He put together a team including the French machine-listening expert Vincent Lostanlen, and in 2015, the BirdVox project was born to automate the process. “Everyone was like, ‘Eventually, when this nut is cracked, this is going to be a super-rich source of information,’” Farnsworth says. But in the beginning, Lostanlen recalls, “there was not even a hint that this was doable.” It seemed unimaginable that machine learning could approach the listening abilities of experts like Farnsworth.

“Andrew is our hero,” says Bello. “The whole thing that we want to imitate with computers is Andrew.”

They started by training BirdVoxDetect, a neural network, to ignore faults like low buzzes caused by rainwater damage to microphones. Then they trained the system to detect flight calls, which differ between (and even within) species and can easily be confused with the chirp of a car alarm or a spring peeper. The challenge, Lostanlen says, was similar to the one a smart speaker faces when listening for its unique “wake word,” except in this case the distance from the target noise to the microphone is far greater (which means much more background noise to compensate for). And, of course, the scientists couldn’t choose a unique sound like “Alexa” or “Hey Google” for their trigger. “For birds, we don’t really make that choice. Charles Darwin made that choice for us,” he jokes. Luckily, they had a lot of training data to work with—Farnsworth’s team had hand-annotated thousands of hours of recordings collected by the microphones in Ithaca.

With BirdVoxDetect trained to detect flight calls, another difficult task lay ahead: teaching it to classify the detected calls by species, which few expert birders can do by ear. To deal with uncertainty, and because there is no training data for every species, they decided on a hierarchical system. For example, for a given call, BirdVoxDetect might be able to identify the bird’s order and family, even if it’s not sure about the species—just as a birder might at least identify a call as that of a warbler, whether yellow-rumped or chestnut-sided. In training, the neural network was penalized less when it mixed up birds that were closer on the taxonomical tree.
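
One simple way to implement that kind of taxonomy-aware penalty, sketched here as an illustration rather than as the BirdVox team’s exact loss function, is to weight each possible misclassification by how far apart the true and predicted labels sit on the taxonomic tree. The species names and distances below are hypothetical placeholders.

```python
# Illustrative taxonomy-aware loss: confusing two warblers costs less than
# confusing a warbler with a sparrow. (Not the BirdVox team's exact loss.)
import torch
import torch.nn.functional as F

# Hypothetical toy taxonomy: species -> (family, order).
taxonomy = {
    "yellow_rumped_warbler":  ("parulidae",     "passeriformes"),
    "chestnut_sided_warbler": ("parulidae",     "passeriformes"),
    "white_throated_sparrow": ("passerellidae", "passeriformes"),
}
species = list(taxonomy)

def tax_distance(a, b):
    """0 = same species, 1 = same family, 2 = same order, 3 = otherwise."""
    if a == b:
        return 0.0
    (fa, oa), (fb, ob) = taxonomy[a], taxonomy[b]
    if fa == fb:
        return 1.0
    return 2.0 if oa == ob else 3.0

# cost[i, j]: penalty for predicting species j when the true species is i.
cost = torch.tensor([[tax_distance(t, p) for p in species] for t in species])

def hierarchical_loss(logits, target_idx):
    """Expected taxonomic cost of the model's predicted distribution."""
    probs = F.softmax(logits, dim=-1)
    return (probs * cost[target_idx]).sum(dim=-1).mean()

# Usage: logits for two calls whose true labels are both warblers.
logits = torch.randn(2, len(species))
targets = torch.tensor([0, 1])
print(hierarchical_loss(logits, targets))
```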

Last August, capping off eight years of research, the team published a paper detailing BirdVoxDetect’s machine-learning algorithms. They also released the software as a free, open-source product for ornithologists to use and adapt. In a test on a full season of migration recordings totaling 6,671 hours, the neural network detected 233,124 flight calls. In a 2022 study in the Journal of Applied Ecology, the team that tested BirdVoxDetect found acoustic data as effective as radar for estimating total biomass.

BirdVoxDetect works on a subset of North American migratory songbirds. But through “few-shot” learning, it can be trained to detect other, similar birds with just a few training examples. It’s like learning a language similar to one you already speak, Bello says. With cheap microphones, the system could be expanded to places around the world without birders or Doppler radar, even in vastly different recording conditions. “If you go to a bioacoustics conference and you talk to a number of people, they all have different use cases,” says Lostanlen. The next step for bioacoustics, he says, is to create a foundation model, like the ones scientists are working on for natural-language processing and image and video analysis, that would be reconfigurable for any species—even beyond birds. That way, scientists won’t have to build a new BirdVoxDetect for every animal they want to study.
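
The story doesn’t spell out how that few-shot adaptation works under the hood, but a common pattern, sketched below purely as an illustration, is to keep a pretrained feature extractor frozen and fit only a small classification head on the handful of labeled clips of the new species.

```python
# Generic few-shot adaptation sketch (an assumption, not BirdVoxDetect's actual API):
# freeze a pretrained audio feature extractor and train only a small head.
import torch
import torch.nn as nn

class FewShotAdapter(nn.Module):
    def __init__(self, backbone, feature_dim, num_new_species):
        super().__init__()
        self.backbone = backbone                  # pretrained, kept frozen
        for p in self.backbone.parameters():
            p.requires_grad = False
        self.head = nn.Linear(feature_dim, num_new_species)  # only this is trained

    def forward(self, audio_batch):
        with torch.no_grad():
            features = self.backbone(audio_batch)
        return self.head(features)

# Stand-in backbone for the demo; in practice this would be the pretrained
# detector's embedding network fed with real spectrogram features.
backbone = nn.Sequential(nn.Linear(1024, 128), nn.ReLU())
model = FewShotAdapter(backbone, feature_dim=128, num_new_species=3)

# "Few shots": three labeled example clips per new species (placeholder data).
clips = torch.randn(9, 1024)
labels = torch.tensor([0, 0, 0, 1, 1, 1, 2, 2, 2])
opt = torch.optim.Adam(model.head.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()
for _ in range(100):
    opt.zero_grad()
    loss = loss_fn(model(clips), labels)
    loss.backward()
    opt.step()
```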

The BirdVox project is now complete, but scientists are already building on its algorithms and approach. Benjamin Van Doren, a migration biologist at the University of Illinois Urbana-Champaign who worked on BirdVox, is using Nighthawk, a new user-friendly neural network based on both BirdVoxDetect and the popular birdsong ID app Merlin, to study birds migrating over Chicago and elsewhere in North and South America. And Dan Mennill, who runs a bioacoustics lab at the University of Windsor, says he’s excited to try Nighthawk on flight calls his team currently hand-annotates after they’re recorded by microphones on the Canadian side of the Great Lakes. One weakness of acoustic monitoring is that unlike radar, a single microphone can’t detect the altitude of a bird overhead or the direction in which it is moving. Mennill’s lab is experimenting with an array of eight microphones that can triangulate to solve that problem. Sifting through recordings has been slow. But with Nighthawk, the analysis will speed up dramatically.

With birds and other migratory animals under threat, Mennill says, BirdVoxDetect came at just the right time. Knowing exactly which birds are flying over in real time can help scientists keep tabs on how species are doing and where they’re going. That can inform practical conservation efforts like “Lights Out” initiatives that encourage skyscrapers to go dark at night to prevent bird collisions. “Bioacoustics is the future of migration research, and we’re really just getting to the stage where we have the right tools,” he says. “This ushers us into a new era.”

Christian Elliott is a science and environmental reporter based in Illinois.  

This is where the data to build AI comes from

AI is all about data. Reams and reams of data are needed to train algorithms to do what we want, and what goes into the AI models determines what comes out. But here’s the problem: AI developers and researchers don’t really know much about the sources of the data they are using. AI’s data collection practices are immature compared with the sophistication of AI model development. Massive data sets often lack clear information about what is in them and where it came from. 

The Data Provenance Initiative, a group of over 50 researchers from both academia and industry, wanted to fix that. They wanted to know, very simply: Where does the data to build AI come from? They audited nearly 4,000 public data sets spanning over 600 languages, 67 countries, and three decades. The data came from 800 unique sources and nearly 700 organizations. 

Their findings, shared exclusively with MIT Technology Review, show a worrying trend: AI’s data practices risk concentrating power overwhelmingly in the hands of a few dominant technology companies. 

In the early 2010s, data sets came from a variety of sources, says Shayne Longpre, a researcher at MIT who is part of the project. 

The data came not just from encyclopedias and the web, but also from sources such as parliamentary transcripts, earnings calls, and weather reports. Back then, AI data sets were specifically curated and collected from different sources to suit individual tasks, Longpre says.

Then transformers, the architecture underpinning language models, were invented in 2017, and the AI sector started seeing performance get better the bigger the models and data sets were. Today, most AI data sets are built by indiscriminately hoovering material from the internet. Since 2018, the web has been the dominant source for data sets used in all media, such as audio, images, and video, and a gap between scraped data and more curated data sets has emerged and widened.

“In foundation model development, nothing seems to matter more for the capabilities than the scale and heterogeneity of the data and the web,” says Longpre. The need for scale has also boosted the use of synthetic data massively.

The past few years have also seen the rise of multimodal generative AI models, which can generate videos and images. Like large language models, they need as much data as possible, and the best source for that has become YouTube. 

For video models, over 70% of data for both speech and image data sets comes from one source.

This could be a boon for Alphabet, Google’s parent company, which owns YouTube. Whereas text is distributed across the web and controlled by many different websites and platforms, video data is extremely concentrated in one platform.

“It gives a huge concentration of power over a lot of the most important data on the web to one company,” says Longpre. 

And because Google is also developing its own AI models, its massive advantage also raises questions about how the company will make this data available for competitors, says Sarah Myers West, the co–executive director at the AI Now Institute.

“It’s important to think about data not as though it’s sort of this naturally occurring resource, but it’s something that is created through particular processes,” says Myers West.

“If the data sets on which most of the AI that we’re interacting with reflect the intentions and the design of big, profit-motivated corporations—that’s reshaping the infrastructures of our world in ways that reflect the interests of those big corporations,” she says.

This monoculture also raises questions about how accurately the human experience is portrayed in the data set and what kinds of models we are building, says Sara Hooker, the vice president of research at the technology company Cohere, who is also part of the Data Provenance Initiative.

People upload videos to YouTube with a particular audience in mind, and the way people act in those videos is often intended for very specific effect. “Does [the data] capture all the nuances of humanity and all the ways that we exist?” says Hooker. 

Hidden restrictions

AI companies don’t usually share what data they used to train their models. One reason is that they want to protect their competitive edge. The other is that because of the complicated and opaque way data sets are bundled, packaged, and distributed, they likely don’t even know where all the data came from.

They also probably don’t have complete information about any constraints on how that data is supposed to be used or shared. The researchers at the Data Provenance Initiative found that data sets often have restrictive licenses or terms attached to them, which should limit their use for commercial purposes, for example.

“This lack of consistency across the data lineage makes it very hard for developers to make the right choice about what data to use,” says Hooker.

It also makes it almost impossible to be completely certain you haven’t trained your model on copyrighted data, adds Longpre.

More recently, companies such as OpenAI and Google have struck exclusive data-sharing deals with publishers, major forums such as Reddit, and social media platforms on the web. But this becomes another way for them to concentrate their power.

“These exclusive contracts can partition the internet into various zones of who can get access to it and who can’t,” says Longpre.

The trend benefits the biggest AI players, who can afford such deals, at the expense of researchers, nonprofits, and smaller companies, who will struggle to get access. The largest companies also have the best resources for crawling data sets.

“This is a new wave of asymmetric access that we haven’t seen to this extent on the open web,” Longpre says.

The West vs. the rest

The data that is used to train AI models is also heavily skewed to the Western world. Over 90% of the data sets that the researchers analyzed came from Europe and North America, and fewer than 4% came from Africa. 

“These data sets are reflecting one part of our world and our culture, but completely omitting others,” says Hooker.

The dominance of the English language in training data is partly explained by the fact that the internet is still over 90% in English, and there are still a lot of places on Earth where there’s really poor internet connection or none at all, says Giada Pistilli, principal ethicist at Hugging Face, who was not part of the research team. But another reason is convenience, she adds: Putting together data sets in other languages and taking other cultures into account requires conscious intention and a lot of work. 

The Western focus of these data sets becomes particularly clear with multimodal models. When an AI model is prompted for the sights and sounds of a wedding, for example, it might only be able to represent Western weddings, because that’s all that it has been trained on, Hooker says. 

This reinforces biases and could lead to AI models that push a certain US-centric worldview, erasing other languages and cultures.

“We are using these models all over the world, and there’s a massive discrepancy between the world we’re seeing and what’s invisible to these models,” Hooker says. 

AI’s search for more energy is growing more urgent

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

If you drove by one of the 2,990 data centers in the United States, you’d probably think little more than “Huh, that’s a boring-looking building.” You might not even notice it at all. However, these facilities underpin our entire digital world, and they are responsible for tons of greenhouse-gas emissions. New research shows just how much those emissions have skyrocketed during the AI boom. 

Since 2018, carbon emissions from data centers in the US have tripled, according to new research led by a team at the Harvard T.H. Chan School of Public Health. That puts data centers slightly below domestic commercial airlines as a source of this pollution.

That leaves a big problem for the world’s leading AI companies, which are caught between pressure to meet their own sustainability goals and the relentless competition in AI that’s leading them to build bigger models requiring tons of energy. The trend toward ever more energy-intensive new AI models, including video generators like OpenAI’s Sora, will only send those numbers higher. 

A growing coalition of companies is looking toward nuclear energy as a way to power artificial intelligence. Meta announced on December 3 it was looking for nuclear partners, and Microsoft is working to restart the Three Mile Island nuclear plant by 2028. Amazon signed nuclear agreements in October. 

However, nuclear plants take ages to come online. And though public support has increased in recent years, and president-elect Donald Trump has signaled support, only a slight majority of Americans say they favor more nuclear plants to generate electricity. 

Though OpenAI CEO Sam Altman pitched the White House in September on an unprecedented effort to build more data centers, the AI industry is looking far beyond the United States. Countries in Southeast Asia, like Malaysia, Indonesia, Thailand, and Vietnam, are all courting AI companies, hoping to be their new data center hubs. 

In the meantime, AI companies will continue to use up power from their current sources, which are far from renewable. Since so many data centers are located in coal-producing regions, like Virginia, the “carbon intensity” of the energy they use is 48% higher than the national average. The researchers found that 95% of data centers in the US are built in places with sources of electricity that are dirtier than the national average. Read more about the new research here.


Deeper Learning

We saw a demo of the new AI system powering Anduril’s vision for war

We’re living through the first drone wars, but AI is poised to change the future of warfare even more drastically. I saw that firsthand during a visit to a test site in Southern California run by Anduril, the maker of AI-powered drones, autonomous submarines, and missiles. Anduril has built a way for the military to command much of its hardware—from drones to radars to unmanned fighter jets—from a single computer screen. 

Why it matters: Anduril, other companies in defense tech, and growing numbers of people within the Pentagon itself are increasingly adopting a new worldview: A future “great power” conflict—military jargon for a global war involving multiple countries—will not be won by the entity with the most advanced drones or firepower, or even the cheapest firepower. It will be won by whoever can sort through and share information the fastest. The Pentagon is betting lots of energy and money that AI—despite its flaws and risks—will be what puts the US and its allies ahead in that fight. Read more here.

Bits and Bytes

Bluesky has an impersonator problem 

The platform’s rise has brought with it a surge of crypto scammers, as my colleague Melissa Heikkilä experienced firsthand. (MIT Technology Review)

Tech’s elite make large donations to Trump ahead of his inauguration 

Leaders in Big Tech, who have been lambasted by Donald Trump, have made sizable donations to his inauguration committee. (The Washington Post)

Inside the premiere of the first commercially streaming AI-generated movies

The films, according to writer Jason Koebler, showed the telltale flaws of AI-generated video: dead eyes, vacant expressions, unnatural movements, and a reliance on voice-overs, since dialogue doesn’t work well. The company behind the films is confident viewers will stomach them anyway. (404 Media)

Meta asked California’s attorney general to stop OpenAI from becoming for-profit

Meta now joins Elon Musk in alleging that OpenAI has improperly enjoyed the benefits of nonprofit status while developing its technology. (Wall Street Journal)

How Silicon Valley is disrupting democracy

Two books explore the price we’ve paid for handing over unprecedented power to Big Tech—and explain why it’s imperative we start taking it back. (MIT Technology Review)