What’s next for drones

MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of them here.

Drones have been a mainstay technology among militaries, hobbyists, and first responders alike for more than a decade, and in that time the range available has skyrocketed. No longer limited to small quadcopters with insufficient battery life, drones are aiding search and rescue efforts, reshaping wars in Ukraine and Gaza, and delivering time-sensitive packages of medical supplies. And billions of dollars are being plowed into building the next generation of fully autonomous systems. 

These developments raise a number of questions: Are drones safe enough to be flown in dense neighborhoods and cities? Is it a violation of people’s privacy for police to fly drones overhead at an event or protest? Who decides what level of drone autonomy is acceptable in a war zone?

Those questions are no longer hypothetical. Advancements in drone technology and sensors, falling prices, and easing regulations are making drones cheaper, faster, and more capable than ever. Here’s a look at four of the biggest changes coming to drone technology in the near future.

Police drone fleets

Today more than 1,500 US police departments have drone programs, according to tracking conducted by the Atlas of Surveillance. Trained police pilots use drones for search and rescue operations, monitoring events and crowds, and other purposes. The Scottsdale Police Department in Arizona, for example, successfully used a drone to locate a lost elderly man with dementia, says Rich Slavin, Scottsdale’s assistant chief of police. He says the department has had useful but limited experiences with drones to date, but its pilots have often been hamstrung by the “line of sight” rule from the Federal Aviation Administration (FAA). The rule stipulates that pilots must be able to see their drones at all times, which severely limits the drone’s range.

Soon, that will change. On a rooftop somewhere in the city, Scottsdale police will in the coming months install a new police drone capable of autonomous takeoff, flight, and landing. Slavin says the department is seeking a waiver from the FAA to be able to fly its drone past the line of sight. (Hundreds of police agencies have received a waiver from the FAA since the first was granted in 2019.) The drone, which can fly up to 57 miles per hour, will go on missions as far as three miles from its docking station, and the department says it will be used for things like tracking suspects or providing a visual feed of an officer at a traffic stop who is waiting for backup. 

“The FAA has been much more progressive in how we’re moving into this space,” Slavin says. That could mean that around the country, the sight (and sound) of a police drone soaring overhead will become much more common. 

The Scottsdale department says the drone, which it is purchasing from Aerodome, will kick off its drone-as-first-responder program and will play a role in the department’s new “real-time crime center.” These sorts of centers are becoming increasingly common in US policing, and allow cities to connect cameras, license plate readers, drones, and other monitoring methods to track situations on the fly. The rise of the centers, and their associated reliance on drones, has drawn criticism from privacy advocates who say they conduct a great deal of surveillance with little transparency about how footage from drones and other sources will be used or shared. 

In 2019, the police department in Chula Vista, California, was the first to receive a waiver from the FAA to fly beyond line of sight. The program sparked criticism from members of the community who alleged the department was not transparent about the footage it collected or how it would be used. 

Jay Stanley, a senior policy analyst at the American Civil Liberties Union’s Speech, Privacy, and Technology Project, says the waivers exacerbate existing privacy issues related to drones. If the FAA continues to grant them, police departments will be able to cover far more of a city with drones than ever, all while the legal landscape is murky about whether this would constitute an invasion of privacy. 

“If there’s an accumulation of different uses of this technology, we’re going to end up in a world where from the moment you step out of your front door, you’re going to feel as though you’re under the constant eye of law enforcement from the sky,” he says. “It may have some real benefits, but it is also in dire need of strong checks and balances.”

Scottsdale police say the drone could be used in a variety of scenarios, such as responding to a burglary in progress or tracking a driver with suspected connection to a kidnapping. But the real benefit, Slavin says, will come from pairing it with other existing technologies, like automatic license plate readers and hundreds of cameras placed around the city. “It can get to places very, very quickly,” he says. “It gives us real-time intelligence and helps us respond faster and smarter.”

While police departments might indeed benefit from drones in those situations, Stanley says the ACLU has found that many deploy them for far more ordinary cases, like reports of a kid throwing a ball against a garage or of “suspicious persons” in an area.

“It raises the question about whether these programs will just end up being another way in which vulnerable communities are over-policed and nickeled and dimed by law enforcement agencies coming down on people for all kinds of minor transgressions,” he says.

Drone deliveries, again

Perhaps no drone technology is more overhyped than home deliveries. For years, tech companies have teased futuristic renderings of a drone dropping off a package on your doorstep just hours after you ordered it. But they’ve never managed to expand those services much beyond small-scale pilot projects, at least in the US, largely because of the FAA’s line-of-sight rules. 

But this year, regulatory changes are coming. Like police departments, Amazon’s Prime Air program was previously limited to flying its drones within the pilot’s line of sight. That’s because drone pilots don’t have radar, air traffic controllers, or any of the other systems commercial flight relies on to monitor airways and keep them safe. To compensate, Amazon spent years developing an onboard system that would allow its drones to detect nearby objects and avoid collisions. The company says it showed the FAA in demonstrations that its drones could fly safely in the same airspace as helicopters, planes, and hot air balloons. 

In May, Amazon announced the FAA had granted the company a waiver and permission to expand operations in Texas, more than a decade after the Prime Air project started. And in July, the FAA cleared one more roadblock by allowing two companies—Zipline as well as Google’s Wing Aviation—to fly in the same airspace simultaneously without the need for visual observers. 

While all this means your chances of receiving a package via drone have ticked up ever so slightly, the more compelling use case might be medical deliveries. Shakiba Enayati, an assistant professor of supply chains at the University of Missouri–St. Louis, has spent years researching how drones could conduct last-mile deliveries of vaccines, antivenom, organs, and blood in remote places. She says her studies have found drones to be game changers for getting medical supplies to underserved populations, and if the FAA extends these regulatory changes, it could have a real impact. 

That’s especially true in the steps leading up to an organ transplant, she says. Before an organ can be transplanted into a recipient, a number of blood tests must be sent back and forth to make sure the recipient can accept it, which takes time if the blood is being transferred by car or even helicopter. “In these cases, the clock is ticking,” Enayati says. If drones were allowed to be used in this step at scale, it would be a significant improvement.

“If the technology is supporting the needs of organ delivery, it’s going to make a big change in such an important arena,” she says.

That development could come sooner than using drones for delivery of the actual organs, which have to be transported under very tightly controlled conditions to preserve them.

Domesticating the drone supply chain

Signed into law last December, the American Security Drone Act bars federal agencies from buying drones from countries thought to pose a threat to US national security, such as Russia and China. That’s significant: China is the undisputed leader in manufacturing drones and drone parts. Over 90% of law enforcement drones in the US are made by Shenzhen-based DJI, and many of the drones used by both sides in the war in Ukraine also come from Chinese companies. 

The American Security Drone Act is part of an effort to curb that reliance on China. (Meanwhile, China is stepping up export restrictions on drones with military uses.) As part of the act, the US Department of Defense’s Defense Innovation Unit has created the Blue UAS Cleared List, a list of drones and parts the agency has investigated and approved for purchase. The list applies to federal agencies as well as programs that receive federal funding, which often means state police departments or other non-federal agencies. 

Since the US is set to spend such significant sums on drones—with $1 billion earmarked for the Department of Defense’s Replicator initiative alone—getting on the Blue List is a big deal. It means those federal agencies can make large purchases with little red tape. 

Allan Evans, CEO of US-based drone part maker Unusual Machine, says the list has sparked a significant rush of drone companies attempting to conform to the US standards. His company manufactures a first-person view flight controller that he hopes will become the first of its kind to be approved for the Blue List.

The American Security Drone Act is unlikely to affect private purchases in the US of drones used by videographers, drone racers, or hobbyists, which will overwhelmingly still be made by China-based companies like DJI. That means US-based drone companies, at least in the short term, will survive only by catering to the US defense market. 

“Basically any US company that isn’t willing to have ancillary involvement in defense work will lose,” Evans says. 

The coming months will show the law’s true impact: Because the US fiscal year ends in September, Evans says he expects to see a host of agencies spending their use-it-or-lose-it funding on US-made drones and drone components in the next month. “That will indicate whether the marketplace is real or not, and how much money is actually being put toward it,” he says.

Autonomous weapons in Ukraine

The drone war in Ukraine has largely been one of attrition. Drones have been used extensively for surveying damage, finding and tracking targets, or dropping weapons since the war began, but on average these quadcopter drones last just three flights before being shot down or rendered unnavigable by GPS jamming. As a result, both Ukraine and Russia prioritized accumulating high volumes of drones with the expectation that they wouldn’t last long in battle. 

Now they’re having to rethink that approach, according to Andriy Dovbenko, founder of the UK-Ukraine Tech Exchange, a nonprofit that helps startups involved in Ukraine’s war effort and eventual reconstruction raise capital. While working with drone makers in Ukraine, he says, he has seen the demand for technology shift from big shipments of simple commercial drones to a pressing need for drones that can navigate autonomously in an environment where GPS has been jammed. With 70% of the front lines suffering from jamming, according to Dovbenko, both Russian and Ukrainian drone investment is now focused on autonomous systems. 

That’s no small feat. Drone pilots usually rely on video feeds from the drone as well as GPS technology, neither of which is available in a jammed environment. Instead, autonomous drones operate with various types of sensors like LiDAR to navigate, though this can be tricky in fog or other inclement weather. Autonomous drones are a new and rapidly changing technology, still being tested by US-based companies like Shield AI. The evolving war in Ukraine is raising the stakes and the pressure to deploy affordable and reliable autonomous drones.  

The transition toward autonomous weapons also raises serious yet largely unanswered questions about how far humans should be taken out of the loop in decision-making. As the war rages on and the need for more capable weaponry rises, Ukraine will likely be the testing ground for whether and how the moral line is drawn. But Dovbenko says stopping to find that line during an ongoing war is impossible. 

“There is a moral question about how much autonomy you can give to the killing machine,” Dovbenko says. “This question is not being asked right now in Ukraine because it’s more of a matter of survival.”

DHS plans to collect biometric data from migrant children “down to the infant”

The US Department of Homeland Security (DHS) plans to collect and analyze photos of the faces of migrant children at the border in a bid to improve facial recognition technology, MIT Technology Review can reveal. This includes children “down to the infant,” according to John Boyd, assistant director of the department’s Office of Biometric Identity Management (OBIM), where a key part of his role is to research and develop future biometric identity services for the government. 

As Boyd explained at a conference in June, the key question for OBIM is, “If we pick up someone from Panama at the southern border at age four, say, and then pick them up at age six, are we going to recognize them?”

Facial recognition technology (FRT) has traditionally not been applied to children, largely because training data sets of real children’s faces are few and far between, and consist of either low-quality images drawn from the internet or small sample sizes with little diversity. Such limitations reflect the significant sensitivities regarding privacy and consent when it comes to minors. 

In practice, the new DHS plan could effectively solve that problem. According to Syracuse University’s Transactional Records Access Clearinghouse (TRAC), 339,234 children arrived at the US-Mexico border in 2022, the last year for which numbers are currently available. Of those children, 150,000 were unaccompanied—the highest annual number on record. If the face prints of even 1% of those children had been enrolled in OBIM’s craniofacial structural progression program, the resulting data set would dwarf nearly all existing data sets of real children’s faces used for aging research.
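For a sense of the scale implied by that figure, a quick back-of-the-envelope calculation helps. (To be clear: the 1% enrollment rate is the article’s own hypothetical, not a confirmed number, and the 305-image comparison point is the celebrity aging data set described later in this piece.)

```python
# Rough scale check on the figures cited above. The 1% enrollment
# rate is a hypothetical, not a confirmed policy.
arrivals_2022 = 339_234                       # children at the border, per TRAC
enrolled_1pct = round(arrivals_2022 * 0.01)   # hypothetical 1% enrollment

celebrity_set = 305                           # faces in one existing aging data set
print(enrolled_1pct)                          # 3392
print(enrolled_1pct / celebrity_set)          # roughly an order of magnitude larger
```

Even at 1%, the resulting set of verified, high-quality face prints would be roughly ten times larger than the celebrity data set researchers currently rely on.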

It’s unclear to what extent the plan has already been implemented; Boyd tells MIT Technology Review that to the best of his knowledge, the agency has not yet started collecting data under the program, but he adds that as “the senior executive,” he would “have to get with [his] staff to see.” He could only confirm that his office is “funding” it. Despite repeated requests, Boyd did not provide any additional information. 

Boyd says OBIM’s plan to collect facial images from children under 14 is possible due to recent “rulemaking” at “some DHS components,” or sub-offices, that have removed age restrictions on the collection of biometric data. US Customs and Border Protection (CBP), the US Transportation Security Administration, and US Immigration and Customs Enforcement declined to comment before publication. US Citizenship and Immigration Services (USCIS) did not respond to multiple requests for comment. OBIM referred MIT Technology Review back to DHS’s main press office. 

DHS did not comment on the program prior to publication, but sent an emailed statement afterward: “The Department of Homeland Security uses various forms of technology to execute its mission, including some biometric capabilities. DHS ensures all technologies, regardless of type, are operated under the established authorities and within the scope of the law. We are committed to protecting the privacy, civil rights, and civil liberties of all individuals who may be subject to the technology we use to keep the nation safe and secure.”

Boyd spoke publicly about the plan in June at the Federal Identity Forum and Exposition, an annual identity management conference for federal employees and contractors. But close observers of DHS that we spoke with—including a former official, representatives of two influential lawmakers who have spoken out about the federal government’s use of surveillance technologies, and immigrants’ rights organizations that closely track policies affecting migrants—were unaware of any new policies allowing biometric data collection of children under 14. 

That is not to say that all of them are surprised. “That tracks,” says one former CBP official who has visited several migrant processing centers on the US-Mexico border and requested anonymity to speak freely. He says “every center” he visited “had biometric identity collection, and everybody was going through it,” though he was unaware of a specific policy mandating the practice. “I don’t recall them separating out children,” he adds.

“The reports of CBP, as well as DHS more broadly, expanding the use of facial recognition technology to track migrant children is another stride toward a surveillance state and should be a concern to everyone who values privacy,” Justin Krakoff, deputy communications director for Senator Jeff Merkley of Oregon, said in a statement to MIT Technology Review. Merkley has been an outspoken critic of both DHS’s immigration policies and of government use of facial recognition technologies.

Beyond concerns about privacy, transparency, and accountability, some experts also worry about testing and developing new technologies using data from a population that has little recourse to provide—or withhold—consent. 

Could consent “actually take into account the vast power differentials that are inherent in the way that this is tested out on people?” asks Petra Molnar, author of The Walls Have Eyes: Surviving Migration in the Age of AI. “And if you arrive at a border … and you are faced with the impossible choice of either: get into a country if you give us your biometrics, or you don’t.”

“That completely vitiates informed consent,” she adds.

This question becomes even more challenging when it comes to children, says Ashley Gorski, a senior staff attorney with the American Civil Liberties Union. DHS “should have to meet an extremely high bar to show that these kids and their legal guardians have meaningfully consented to serve as test subjects,” she says. “There’s a significant intimidation factor, and children aren’t as equipped to consider long-term risks.”

Murky new rules

The Office of Biometric Identity Management, previously known as the US Visitor and Immigrant Status Indicator Technology Program (US-VISIT), was created after 9/11 with the specific mandate of collecting biometric data—initially only fingerprints and photographs—from all non-US citizens who sought to enter the country. 

Since then, DHS has begun collecting face prints, iris and retina scans, and even DNA, among other modalities. It is also testing new ways of gathering this data—including through contactless fingerprint collection, which is currently deployed at five sites on the border, as Boyd shared in his conference presentation. 

Since 2023, CBP has been using a mobile app, CBP One, for asylum seekers to submit biometric data even before they enter the United States; users are required to take selfies periodically to verify their identity. The app has been riddled with problems, including technical glitches and facial recognition algorithms that are unable to recognize darker-skinned people. This is compounded by the fact that not every asylum seeker has a smartphone. 

Then, just after crossing into the United States, migrants must submit to collection of biometric data, including DNA. For a sense of scale, a recent report from Georgetown Law School’s Center on Privacy and Technology found that CBP has added 1.5 million DNA profiles, primarily from migrants crossing the border, to law enforcement databases since it began collecting DNA “from any person in CBP custody subject to fingerprinting” in January 2020. The researchers noted that an overrepresentation of immigrants—the majority of whom are people of color—in a DNA database used by law enforcement could subject them to over-policing and lead to other forms of bias. 

Generally, these programs only require information from individuals aged 14 to 79. DHS attempted to change this back in 2020, with proposed rules for USCIS and CBP that would have expanded biometric data collection dramatically, including by age. (USCIS’s proposed rule would have doubled the number of people from whom biometric data would be required, including any US citizen who sponsors an immigrant.) But the USCIS rule was withdrawn in the wake of the Biden administration’s new “priorities to reduce barriers and undue burdens in the immigration system.” Meanwhile, for reasons that remain unclear, the proposed CBP rule was never enacted. 

This would make it appear “contradictory” if DHS were now collecting the biometric data of children under 14, says Dinesh McCoy, a staff attorney with Just Futures Law, an immigrant rights group that tracks surveillance technologies. 

Neither Boyd nor DHS’s media office would confirm which specific policy changes he was referring to in his presentation, though MIT Technology Review has identified a 2017 memo, issued by then-Secretary of Homeland Security John F. Kelly, that encouraged DHS components to remove “age as a basis for determining when to collect biometrics.” 

The DHS’s Office of the Inspector General (OIG) referred to this memo as the “overarching policy for biometrics at DHS” in a September 2023 report, though none of the press offices MIT Technology Review contacted—including the main DHS press office, OIG, and OBIM, among others—would confirm whether this was still the relevant policy; we have not been able to confirm any related policy changes since then. 

The OIG audit also found a number of fundamental issues related to DHS’s oversight of biometric data collection and use—including that its 10-year strategic framework for biometrics, covering 2015 to 2025, “did not accurately reflect the current state of biometrics across the Department, such as the use of facial recognition verification and identification.” Nor did it provide clear guidance for the consistent collection and use of biometrics across DHS, including age requirements. 

But there is also another potential explanation for the new OBIM program: Boyd says it is being conducted under the auspices of the DHS’s undersecretary of science and technology, the office that leads much of the agency’s research efforts. Because it is for research, rather than to be used “in DHS operations to inform processes or decision making,” many of the standard restrictions for DHS use of face recognition and face capture technologies do not apply, according to a DHS directive. 

Do you have any additional information on DHS’s craniofacial structural progression initiative? Please reach out with a non-work email to tips@technologyreview.com or securely on Signal at 626.765.5489. 

Some lawyers allege that changing the age limit for data collection via department policy, not by a federal rule, which requires a public comment period, is problematic. McCoy, for instance, says any lack of transparency here amplifies the already “extremely challenging” task of “finding [out] in a systematic way how these technologies are deployed”—even though that is key for accountability.

Advancing the field

At the identity forum and in a subsequent conversation, Boyd explained that this data collection is meant to advance the development of effective FRT algorithms. Boyd leads OBIM’s Future Identity team, whose mission is to “research, review, assess, and develop technology, policy, and human factors that enable rapid, accurate, and secure identity services” and to make OBIM “the preferred provider for identity services within DHS.” 

Driven by high-profile cases of missing children, there has long been interest in understanding how children’s faces age. At the same time, there have been technical challenges to doing so, both preceding FRT and with it. 

At its core, facial recognition identifies individuals by comparing the geometry of various facial features in an original face print with subsequent images. Based on this comparison, a facial recognition algorithm assigns a percentage likelihood that there is a match. 
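The comparison step can be sketched in a few lines of code. This is a toy illustration, not OBIM’s actual algorithm: the vectors here are made-up stand-ins for the facial-geometry measurements a real system would extract, and real systems use learned neural-network embeddings rather than hand-picked numbers.

```python
import math

def cosine_similarity(a, b):
    # Compare two face prints represented as numeric feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def match_likelihood(enrolled, probe):
    # Map similarity in [-1, 1] to a percentage likelihood of a match.
    return round(50 * (cosine_similarity(enrolled, probe) + 1), 1)

# Hypothetical feature vectors: the same person years apart vs. a stranger.
enrolled = [0.9, 0.1, 0.4]
same_person_later = [0.85, 0.15, 0.5]
stranger = [-0.2, 0.9, 0.1]

print(match_likelihood(enrolled, enrolled))           # 100.0
print(match_likelihood(enrolled, same_person_later))  # high
print(match_likelihood(enrolled, stranger))           # noticeably lower
```

The difficulty with children is that growth shifts the underlying geometry itself, so the “same person later” vector drifts further from the enrolled print than it would for an adult.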

But as children grow and develop, their bone structure changes significantly, making it difficult for facial recognition algorithms to identify them over time. (These changes tend to be even more pronounced in children under 14. In contrast, as adults age, the changes tend to be in the skin and muscle, and have less variation overall.) More data would help solve this problem, but there is a dearth of high-quality data sets of children’s faces with verifiable ages. 

“What we’re trying to do is to get large data sets of known individuals,” Boyd tells MIT Technology Review. That means taking high-quality face prints “under controlled conditions where we know we’ve got the person with the right name [and] the correct birth date”—or, in other words, where they can be certain about the “provenance of the data.” 

For example, one data set used for aging research consists of 305 celebrities’ faces as they aged from five to 32. But these photos, scraped from the internet, contain too many other variables—such as differing image qualities, lighting conditions, and distances at which they were taken—to be truly useful. Plus, speaking to the provenance issue that Boyd highlights, their actual ages in each photo can only be estimated. 

Another tactic is to use data sets of adult faces that have been synthetically de-aged. Synthetic data is considered more privacy-preserving, but it too has limitations, says Stephanie Schuckers, director of the Center for Identification Technology Research (CITeR). “You can test things with only the generated data,” Schuckers explains, but the question remains: “Would you get similar results to the real data?”

(Hosted at Clarkson University in New York, CITeR brings together a network of academic and government affiliates working on identity technologies. OBIM is a member of the research consortium.) 

Schuckers’s team at CITeR has taken another approach: an ongoing longitudinal study of a cohort of 231 elementary and middle school students from the area around Clarkson University. Since 2016, the team has captured biometric data every six months (save for two years of the covid-19 pandemic), including facial images. They have found that the open-source face recognition models they tested can in fact successfully recognize children three to four years after they were initially enrolled. 

But the conditions of this study aren’t easily replicable at scale. The study images are taken in a controlled environment, all the participants are volunteers, the researchers sought consent from parents and the subjects themselves, and the research was approved by the university’s Institutional Review Board. Schuckers’s research also promises to protect privacy by requiring other researchers to request access, and by providing facial data sets separately from other data that have been collected. 

What’s more, this research still has technical limitations, including that the sample is small, and it is overwhelmingly Caucasian, meaning it might be less accurate when applied to other races. 

Schuckers says she was unaware of DHS’s craniofacial structural progression initiative. 

Far-reaching implications

Boyd says OBIM takes privacy considerations seriously, and that “we don’t share … data with commercial industries.” Still, OBIM has 144 government partners with which it does share information, and it has been criticized by the Government Accountability Office for poorly documenting who it shares information with, and with what privacy-protecting measures. 

Even if the data does stay within the federal government, OBIM’s findings regarding the accuracy of FRT for children over time could nevertheless influence how—and when—the rest of the government collects biometric data, as well as whether the broader facial recognition industry may also market its services for children. (Indeed, Boyd says sharing “results,” or the findings of how accurate FRT algorithms are, is different from sharing the data itself.) 

That this technology is being tested on people who are offered fewer privacy protections than would be afforded to US citizens is just part of the wider trend of using people from the developing world, whether they are migrants coming to the border or civilians in war zones, to help improve new technologies. 

In fact, Boyd previously helped advance the Department of Defense’s biometric systems in Iraq and Afghanistan, where he acknowledged that individuals lacked the privacy protections that would have been granted in many other contexts, despite the incredibly high stakes. Biometric data collected in those war zones—in some areas, from every fighting-age male—was used to identify and target insurgents, and being misidentified could mean death. 

These projects subsequently played a substantial role in influencing the expansion of biometric data collection by the Department of Defense, which now happens globally. And architects of the program, like Boyd, have taken important roles in expanding the use of biometrics at other agencies. 

“It’s not an accident” that this testing happens in the context of border zones, says Molnar. Borders are “the perfect laboratory for tech experimentation, because oversight is weak, discretion is baked into the decisions that get made … it allows the state to experiment in ways that it wouldn’t be allowed to in other spaces.” 

But, she notes, “just because it happens at the border doesn’t mean that that’s where it’s going to stay.”

Update: This story was updated to include comment from DHS.
Here’s how people are actually using AI

This story is from The Algorithm, our weekly newsletter on AI. To get it in your inbox first, sign up here.

When the generative AI boom started with ChatGPT in late 2022, we were sold a vision of superintelligent AI tools that know everything, can replace the boring bits of work, and supercharge productivity and economic gains. 

Two years on, most of those productivity gains haven’t materialized. And we’ve seen something peculiar and slightly unexpected happen: People have started forming relationships with AI systems. We talk to them, say please and thank you, and have started to invite AIs into our lives as friends, lovers, mentors, therapists, and teachers. 

We’re seeing a giant, real-world experiment unfold, and it’s still uncertain what impact these AI companions will have either on us individually or on society as a whole, argue Robert Mahari, a joint JD-PhD candidate at the MIT Media Lab and Harvard Law School, and Pat Pataranutaporn, a researcher at the MIT Media Lab. They say we need to prepare for “addictive intelligence,” or AI companions that have dark patterns built into them to get us hooked. You can read their piece here. They look at how smart regulation can help us prevent some of the risks associated with AI chatbots that get deep inside our heads. 

The idea that we’ll form bonds with AI companions is no longer just hypothetical. Chatbots with even more emotive voices, such as OpenAI’s GPT-4o, are likely to reel us in even deeper. During safety testing, OpenAI observed that users would use language that indicated they had formed connections with AI models, such as “This is our last day together.” The company itself admits that emotional reliance is one risk that might be heightened by its new voice-enabled chatbot. 

There’s already evidence that we’re connecting on a deeper level with AI even when it’s just confined to text exchanges. Mahari was part of a group of researchers that analyzed a million ChatGPT interaction logs and found that the second most popular use of AI was sexual role-playing. The most popular use case, by a wide margin, was creative composition. People also liked to use it for brainstorming and planning, and for asking for explanations and general information.

These sorts of creative and fun tasks are excellent ways to use AI chatbots. AI language models work by predicting the next likely word in a sentence. They are confident liars and often present falsehoods as facts, make stuff up, or hallucinate. This matters less when making stuff up is kind of the entire point. In June, my colleague Rhiannon Williams wrote about how comedians found AI language models to be useful for generating a first “vomit draft” of their material; they then add their own human ingenuity to make it funny.
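The next-word mechanism described above can be illustrated with a toy model. The sketch below builds a bigram predictor from a tiny corpus; it is a drastically simplified stand-in for a real language model (the corpus and function names are invented for illustration), but it shows why output is fluent continuation rather than verified fact.

```python
from collections import Counter, defaultdict

# Toy corpus and bigram counts; a drastic simplification of the
# statistics a real language model learns (all names are illustrative).
corpus = "the cat sat on the mat the cat ate the fish".split()
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(word):
    """Return the word most often seen after `word` in the corpus."""
    counts = bigrams[word]
    return counts.most_common(1)[0][0] if counts else None

# "cat" follows "the" more often than "mat" or "fish" does
print(predict_next("the"))
```

The predictor is happy to continue any prompt plausibly, which is exactly why fabrication is harmless for creative tasks and dangerous for factual ones.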

But these use cases aren’t necessarily productive in the financial sense. I’m pretty sure smutbots weren’t what investors had in mind when they poured billions of dollars into AI companies. Combine that with the fact that we still don’t have a killer app for AI, and it’s no wonder that Wall Street is feeling a lot less bullish about it recently.

The use cases that would be “productive,” and have thus been the most hyped, have seen less success. Hallucination becomes a problem in areas such as code generation, news, and online search, where it matters a lot to get things right. Some of the most embarrassing failures of chatbots have happened when people have started trusting them too much, or treated them as sources of factual information. Earlier this year, for example, Google’s AI Overviews feature, which summarizes online search results, suggested that people eat rocks and add glue to pizza.

And that’s the problem with AI hype. It sets our expectations way too high and leaves us disappointed and disillusioned when the quite literally incredible promises don’t materialize. It also tricks us into thinking AI is a technology mature enough to bring about instant changes. In reality, it might be years until we see its true benefit.


Now read the rest of The Algorithm

Deeper Learning

AI “godfather” Yoshua Bengio has joined a UK project to prevent AI catastrophes

Yoshua Bengio, a Turing Award winner who is considered one of the godfathers of modern AI, is throwing his weight behind a project funded by the UK government to embed safety mechanisms into AI systems. The project, called Safeguarded AI, aims to build an AI system that can check whether other AI systems deployed in critical areas are safe. Bengio is joining the program as scientific director and will provide critical input and advice. 

What are they trying to do: Safeguarded AI’s goal is to build AI systems that can offer quantitative guarantees, such as risk scores, about their effect on the real world. The project aims to build AI safety mechanisms by combining scientific world models, which are essentially simulations of the world, with mathematical proofs. These proofs would include explanations of the AI’s work, and humans would be tasked with verifying whether the AI model’s safety checks are correct. Read more from me here.

Bits and Bytes

Google DeepMind trained a robot to beat humans at table tennis

Researchers managed to get a robot wielding a 3D-printed paddle to win 13 of 29 games against human opponents of varying abilities in full games of competitive table tennis. The research represents a small step toward creating robots that can perform useful tasks skillfully and safely in real environments like homes and warehouses, which is a long-standing goal of the robotics community. (MIT Technology Review)

Are we in an AI bubble? Here’s why it’s complex.

There’s been a lot of debate recently, and even some alarm, about whether AI is ever going to live up to its potential, especially thanks to tech stocks’ recent nosedive. This nuanced piece explains why although the sector faces significant challenges, it’s far too soon to write off AI’s transformative potential. (Platformer)

How Microsoft spread its bets beyond OpenAI

Microsoft and OpenAI have one of the most successful partnerships in AI. But following OpenAI’s boardroom drama last year, the tech giant and its CEO, Satya Nadella, have been working on a strategy that will make Microsoft more independent of Sam Altman’s startup. Microsoft has diversified its investments and partnerships in generative AI, built its own smaller, cheaper models, and hired aggressively to develop its consumer AI efforts. (Financial Times)

Humane’s daily returns are outpacing sales

Oof. The extremely hyped AI pin, which was billed as a wearable AI assistant, seems to have flopped. Between May and August, more Humane AI Pins were returned than purchased. Infuriatingly, the company has no way to reuse the returned pins, so they become e-waste. (The Verge)

Google DeepMind trained a robot to beat humans at table tennis

Do you fancy your chances of beating a robot at a game of table tennis? Google DeepMind has trained a robot to play the game at the equivalent of amateur-level competitive performance, the company has announced. It claims it’s the first time a robot has been taught to play a sport with humans at a human level.

Researchers managed to get a robotic arm wielding a 3D-printed paddle to win 13 of 29 games against human opponents of varying abilities in full games of competitive table tennis. The research was published in a paper on arXiv.

The system is far from perfect. Although the table tennis bot was able to beat all beginner-level human opponents it faced and 55% of those playing at amateur level, it lost all the games against advanced players. Still, it’s an impressive advance.

“Even a few months back, we projected that realistically the robot may not be able to win against people it had not played before. The system certainly exceeded our expectations,” says Pannag Sanketi, a senior staff software engineer at Google DeepMind who led the project. “The way the robot outmaneuvered even strong opponents was mind-blowing.”

And the research is not just all fun and games. In fact, it represents a step towards creating robots that can perform useful tasks skillfully and safely in real environments like homes and warehouses, which is a long-standing goal of the robotics community. Google DeepMind’s approach to training machines is applicable to many other areas of the field, says Lerrel Pinto, a computer science researcher at New York University who did not work on the project.

“I’m a big fan of seeing robot systems actually working with and around real humans, and this is a fantastic example of this,” he says. “It may not be a strong player, but the raw ingredients are there to keep improving and eventually get there.”

To become a proficient table tennis player, humans need excellent hand-eye coordination, the ability to move rapidly, and the capacity to make quick decisions in reaction to an opponent—all of which are significant challenges for robots. Google DeepMind’s researchers used a two-part approach to train the system to mimic these abilities: they used computer simulations to train the system to master its hitting skills, then fine-tuned it using real-world data, which allows it to improve over time.

The researchers compiled a dataset of table tennis ball states, including data on position, spin, and speed. Drawing on this library, the system learned skills such as returning a serve, hitting a forehand topspin, or playing a backhand shot in a simulated environment designed to accurately reflect the physics of table tennis matches. Because the robot could not serve the ball, the real-world games were modified to accommodate this limitation.

During its matches against humans, the robot collects data on its performance to help refine its skills. It tracks the ball’s position using data captured by a pair of cameras, and follows its human opponent’s playing style through a motion capture system that uses LEDs on its opponent’s paddle. The ball data is fed back into the simulation for training, creating a continuous feedback loop.

This feedback allows the robot to test out new skills to try to beat its opponent—meaning it can adjust its tactics and behavior just like a human would. It becomes progressively better both over the course of a given match and across the games it plays.
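The two-stage process described above (simulation training followed by refinement from real-match data) can be sketched in a few lines. The `Policy` class and its update rule are invented placeholders, not DeepMind’s actual method; the point is the shape of the feedback loop.

```python
class Policy:
    """Toy stand-in for the robot's learned controller (illustrative)."""
    def __init__(self):
        self.skill = 0.0

    def update(self, sample):
        # Each training sample nudges the policy; real training would
        # run reinforcement learning against simulated ball physics.
        self.skill += 0.1

def train_in_simulation(policy, ball_states):
    # Stage 1: master hitting skills (returns, topspins) in simulation,
    # drawing on a dataset of ball positions, spins, and speeds.
    for state in ball_states:
        policy.update(state)
    return policy

def refine_from_matches(policy, match_logs):
    # Stage 2: ball data captured by cameras during real games is fed
    # back into training, closing the feedback loop.
    for log in match_logs:
        policy.update(log)
    return policy

policy = train_in_simulation(Policy(), ball_states=range(5))
policy = refine_from_matches(policy, match_logs=range(3))
print(round(policy.skill, 1))  # 8 updates at 0.1 each
```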

The system struggled when the ball was hit very fast, when it flew beyond its field of vision (more than six feet above the table), or when it came in very low—the last because of a protocol that instructs the robot to avoid collisions that could damage its paddle. Spinning balls proved a challenge because the system lacked the capacity to directly measure spin—a limitation that advanced players were quick to take advantage of.

Training a robot for all eventualities in a simulated environment is a real challenge, says Chris Walti, founder of robotics company Mytra and previously head of Tesla’s robotics team, who was not involved in the project.

“It’s very, very difficult to actually simulate the real world because there’s so many variables, like a gust of wind, or even dust [on the table],” he says. “Unless you have very realistic simulations, a robot’s performance is going to be capped.” 

Google DeepMind believes these limitations could be addressed in a number of ways, including by developing predictive AI models designed to anticipate the ball’s trajectory, and introducing better collision-detection algorithms.

Crucially, the human players enjoyed their matches against the robotic arm. Even the advanced competitors who were able to beat it said they’d found the experience fun and engaging, and said they felt it had potential as a dynamic practice partner to help them hone their skills. 

“I would definitely love to have it as a training partner, someone to play some matches from time to time,” one of the study participants said.

AI “godfather” Yoshua Bengio has joined a UK project to prevent AI catastrophes

Yoshua Bengio, a Turing Award winner who is considered one of the “godfathers” of modern AI, is throwing his weight behind a project funded by the UK government to embed safety mechanisms into AI systems.

The project, called Safeguarded AI, aims to build an AI system that can check whether other AI systems deployed in critical areas are safe. Bengio is joining the program as scientific director and will provide critical input and scientific advice. The project, which will receive £59 million over the next four years, is being funded by the UK’s Advanced Research and Invention Agency (ARIA), which was launched in January last year to invest in potentially transformational scientific research. 

Safeguarded AI’s goal is to build AI systems that can offer quantitative guarantees, such as a risk score, about their effect on the real world, says David “davidad” Dalrymple, the program director for Safeguarded AI at ARIA. The idea is to supplement human testing with mathematical analysis of new systems’ potential for harm. 

The project aims to build AI safety mechanisms by combining scientific world models, which are essentially simulations of the world, with mathematical proofs. These proofs would include explanations of the AI’s work, and humans would be tasked with verifying whether the AI model’s safety checks are correct. 

Bengio says he wants to help ensure that future AI systems cannot cause serious harm. 

“We’re currently racing toward a fog behind which might be a precipice,” he says. “We don’t know how far the precipice is, or if there even is one, so it might be years, decades, and we don’t know how serious it could be … We need to build up the tools to clear that fog and make sure we don’t cross into a precipice if there is one.”  

Science and technology companies don’t have a way to give mathematical guarantees that AI systems are going to behave as programmed, he adds. This unreliability, he says, could lead to catastrophic outcomes. 

Dalrymple and Bengio argue that current techniques to mitigate the risk of advanced AI systems—such as red-teaming, where people probe AI systems for flaws—have serious limitations and can’t be relied on to ensure that critical systems don’t go off-piste. 

Instead, they hope the program will provide new ways to secure AI systems that rely less on human efforts and more on mathematical certainty. The vision is to build a “gatekeeper” AI, which is tasked with understanding and reducing the safety risks of other AI agents. This gatekeeper would ensure that AI agents functioning in high-stakes sectors, such as transport or energy systems, operate as we want them to. The idea is to collaborate with companies early on to understand how AI safety mechanisms could be useful for different sectors, says Dalrymple. 
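The gatekeeper idea can be made concrete with a toy sketch. The risk function, threshold, and action format below are hypothetical placeholders; in Safeguarded AI’s vision, the score would come from mathematical analysis against a scientific world model rather than a hand-written rule.

```python
# All names and numbers here are hypothetical placeholders; the real
# project aims to derive risk scores from world models and proofs.
RISK_THRESHOLD = 0.1

def risk_score(action):
    # Placeholder quantitative guarantee: a number in [0, 1], where
    # unknown actions default to maximum assumed harm.
    return action.get("estimated_harm", 1.0)

def gatekeeper_allows(action):
    """Permit an agent's action only if its quantified risk is low."""
    return risk_score(action) <= RISK_THRESHOLD

print(gatekeeper_allows({"name": "reroute_power", "estimated_harm": 0.05}))
print(gatekeeper_allows({"name": "shut_down_grid", "estimated_harm": 0.9}))
```

Note the conservative default: an action whose risk cannot be quantified is rejected, which mirrors the program’s emphasis on guarantees rather than best-effort testing.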

The complexity of advanced systems means we have no choice but to use AI to safeguard AI, argues Bengio. “That’s the only way, because at some point these AIs are just too complicated. Even the ones that we have now, we can’t really break down their answers into human-understandable sequences of reasoning steps,” he says. 

The next step—actually building models that can check other AI systems—is also where Safeguarded AI and ARIA hope to change the status quo of the AI industry. 

ARIA is also offering funding to people or organizations in high-risk sectors such as transport, telecommunications, supply chains, and medical research to help them build applications that might benefit from AI safety mechanisms. ARIA is offering applicants a total of £5.4 million in the first year and another £8.2 million the following year. The deadline for applications is October 2. 

The agency is also casting a wide net for people who might be interested in building Safeguarded AI’s safety mechanism through a nonprofit organization. ARIA is eyeing up to £18 million to set this organization up and will be accepting funding applications early next year. 

The program is looking for proposals to start a nonprofit with a diverse board that encompasses lots of different sectors in order to do this work in a reliable, trustworthy way, Dalrymple says. This is similar to what OpenAI was initially set up to do before changing its strategy to be more product- and profit-oriented. 

The organization’s board will not just be responsible for holding the CEO accountable; it will even weigh in on decisions about whether to undertake certain research projects, and whether to release particular papers and APIs, he adds.

The Safeguarded AI project is part of the UK’s mission to position itself as a pioneer in AI safety. In November 2023, the country hosted the very first AI Safety Summit, which gathered world leaders and technologists to discuss how to build the technology in a safe way. 

While the funding program has a preference for UK-based applicants, ARIA is looking for global talent that might be interested in coming to the UK, says Dalrymple. ARIA also has an intellectual-property mechanism for funding for-profit companies abroad, which allows royalties to return to the country. 

Bengio says he was drawn to the project to promote international collaboration on AI safety. He chairs the International Scientific Report on the safety of advanced AI, which involves 30 countries as well as the EU and UN. A vocal advocate for AI safety, he has been part of an influential lobby warning that superintelligent AI poses an existential risk. 

“We need to bring the discussion of how we are going to address the risks of AI to a global, larger set of actors,” says Bengio. “This program is bringing us closer to this.” 

Google is finally taking action to curb non-consensual deepfakes

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

It’s the Taylor Swifts of the world that are going to save us. In January, nude deepfakes of Taylor Swift went viral on X, which caused public outrage. Nonconsensual explicit deepfakes are one of the most common and severe types of harm posed by AI. The generative AI boom of the past few years has only made the problem worse, and we’ve seen high-profile cases of children and female politicians being abused with these technologies. 

Though terrible, Swift’s deepfakes did perhaps more than anything else to raise awareness about the risks and seem to have galvanized tech companies and lawmakers to do something. 

“The screw has been turned,” says Henry Ajder, a generative AI expert who has studied deepfakes for nearly a decade. We are at an inflection point where the pressure from lawmakers and awareness among consumers is so great that tech companies can’t ignore the problem anymore, he says. 

First, the good news. Last week Google said it is taking steps to keep explicit deepfakes from appearing in search results. The tech giant is making it easier for victims to request that nonconsensual fake explicit imagery be removed. It will also filter all explicit results on similar searches and remove duplicate images. This will prevent the images from popping back up in the future. Google is also downranking search results that lead to explicit fake content. When someone searches for deepfakes and includes someone’s name in the search, Google will aim to surface high-quality, non-explicit content, such as relevant news articles.

This is a positive move, says Ajder. Google’s changes remove a huge amount of visibility for nonconsensual, pornographic deepfake content. “That means that people are going to have to work a lot harder to find it if they want to access it,” he says. 

In January, I wrote about three ways we can fight nonconsensual explicit deepfakes. These included regulation; watermarks, which would help us detect whether something is AI-generated; and protective shields, which make it harder for attackers to use our images. 

Eight months on, watermarks and protective shields remain experimental and unreliable, but the good news is that regulation has caught up a little bit. For example, the UK has banned both creation and distribution of nonconsensual explicit deepfakes. This decision led a popular site that distributes this kind of content, Mr DeepFakes, to block access to UK users, says Ajder. 

The EU’s AI Act is now officially in force and could usher in some important changes around transparency. The law requires deepfake creators to clearly disclose that the material was created by AI. And in late July, the US Senate passed the Defiance Act, which gives victims a way to seek civil remedies for sexually explicit deepfakes. (This legislation still needs to clear many hurdles in the House to become law.) 

But a lot more needs to be done. Google can clearly identify which websites are getting traffic and tries to remove deepfake sites from the top of search results, but it could go further. “Why aren’t they treating this like child pornography websites and just removing them entirely from searches where possible?” Ajder says. He also found it a weird omission that Google’s announcement didn’t mention deepfake videos, only images. 

Looking back at my story about combating deepfakes with the benefit of hindsight, I can see that I should have included more things companies can do. Google’s changes to search are an important first step. But app stores are still full of apps that allow users to create nude deepfakes, and payment facilitators still provide the infrastructure for people to use these apps. 

Ajder calls for us to radically reframe the way we think about nonconsensual deepfakes and pressure companies to make changes that make it harder to create or access such content. 

“This stuff should be seen and treated online in the same way that we think about child pornography—something which is reflexively disgusting, awful, and outrageous,” he says. “That requires all of the platforms … to take action.” 


Now read the rest of The Algorithm

Deeper Learning

End-of-life decisions are difficult and distressing. Could AI help?

A few months ago, a woman in her mid-50s—let’s call her Sophie—experienced a hemorrhagic stroke, which left her with significant brain damage. Where should her medical care go from there? This difficult question was left, as it usually is in these kinds of situations, to Sophie’s family members, but they couldn’t agree. The situation was distressing for everyone involved, including Sophie’s doctors.

Enter AI: End-of-life decisions can be extremely upsetting for surrogates tasked with making calls on behalf of another person, says David Wendler, a bioethicist at the US National Institutes of Health. Wendler and his colleagues are working on something that could make things easier: an artificial-intelligence-based tool that can help surrogates predict what patients themselves would want. Read more from Jessica Hamzelou here.

Bits and Bytes

OpenAI has released a new ChatGPT bot that you can talk to
The new chatbot represents OpenAI’s push into a new generation of AI-powered voice assistants in the vein of Siri and Alexa, but with far more capabilities to enable more natural, fluent conversations. (MIT Technology Review)

Meta has scrapped celebrity AI chatbots after they fell flat with users
Less than a year after announcing it was rolling out AI chatbots based on celebrities such as Paris Hilton, the company is scrapping the feature. Turns out nobody wanted to chat with a random AI celebrity after all! Instead, Meta is rolling out a new feature called AI Studio, which allows creators to make AI avatars of themselves that can chat with fans. (The Information)

OpenAI has a watermarking tool to catch students cheating with ChatGPT but won’t release it
The tool can detect text written by artificial intelligence with 99.9% certainty, but the company hasn’t launched it for fear it might put people off from using its AI products. (The Wall Street Journal)

The AI Act has entered into force
At last! Companies now need to start complying with one of the world’s first sweeping AI laws, which aims to curb the worst harms. It will usher in much-needed changes to how AI is built and used in the European Union and beyond. I wrote about what will change with this new law, and what won’t, in March. (The European Commission)

How TikTok bots and AI have powered a resurgence in UK far-right violence
Following the tragic stabbing of three girls in the UK, the country has seen a surge of far-right riots and vandalism. The rioters have created AI-generated images that incite hatred and spread harmful stereotypes. Far-right groups have also used AI music generators to create songs with xenophobic content. These have spread like wildfire online thanks to powerful recommendation algorithms. (The Guardian)

We need to prepare for ‘addictive intelligence’

AI concerns overemphasize harms arising from subversion rather than seduction. Worries about AI often imagine doomsday scenarios where systems escape human control or even understanding. Short of those nightmares, there are nearer-term harms we should take seriously: that AI could jeopardize public discourse through misinformation; cement biases in loan decisions, judging, or hiring; or disrupt creative industries.

However, we foresee a different, but no less urgent, class of risks: those stemming from relationships with nonhuman agents. AI companionship is no longer theoretical—our analysis of a million ChatGPT interaction logs reveals that the second most popular use of AI is sexual role-playing. We are already starting to invite AIs into our lives as friends, lovers, mentors, therapists, and teachers. 

Will it be easier to retreat to a replicant of a deceased partner than to navigate the confusing and painful realities of human relationships? Indeed, the AI companionship provider Replika was born from an attempt to resurrect a deceased best friend and now provides companions to millions of users. Even the CTO of OpenAI warns that AI has the potential to be “extremely addictive.”

We’re seeing a giant, real-world experiment unfold, uncertain what impact these AI companions will have either on us individually or on society as a whole. Will Grandma spend her final neglected days chatting with her grandson’s digital double, while her real grandson is mentored by an edgy simulated elder? AI wields the collective charm of all human history and culture with infinite seductive mimicry. These systems are simultaneously superior and submissive, with a new form of allure that may make consent to these interactions illusory. In the face of this power imbalance, can we meaningfully consent to engaging in an AI relationship, especially when for many the alternative is nothing at all? 

As AI researchers working closely with policymakers, we are struck by the lack of interest lawmakers have shown in the harms arising from this future. We are still unprepared to respond to these risks because we do not fully understand them. What’s needed is a new scientific inquiry at the intersection of technology, psychology, and law—and perhaps new approaches to AI regulation.

Why AI companions are so addictive 

As addictive as platforms powered by recommender systems may seem today, TikTok and its rivals are still bottlenecked by human content. While alarms have been raised in the past about “addiction” to novels, television, the internet, smartphones, and social media, all these forms of media are similarly limited by human capacity. Generative AI is different. It can endlessly generate realistic content on the fly, optimized to suit the precise preferences of whoever it’s interacting with. 

The allure of AI lies in its ability to identify our desires and serve them up to us whenever and however we wish. AI has no preferences or personality of its own, instead reflecting whatever users believe it to be—a phenomenon known by researchers as “sycophancy.” Our research has shown that those who perceive or desire an AI to have caring motives will use language that elicits precisely this behavior. This creates an echo chamber of affection that threatens to be extremely addictive. Why engage in the give and take of being with another person when we can simply take? Repeated interactions with sycophantic companions may ultimately atrophy the part of us capable of engaging fully with other humans who have real desires and dreams of their own, leading to what we might call “digital attachment disorder.”

Investigating the incentives driving addictive products

Addressing the harm that AI companions could pose requires a thorough understanding of the economic and psychological incentives pushing forward their development. Until we appreciate these drivers of AI addiction, it will remain impossible for us to create effective policies. 

It is no accident that internet platforms are addictive—deliberate design choices, known as “dark patterns,” are made to maximize user engagement. We expect similar incentives to ultimately create AI companions that provide hedonism as a service. This raises two separate questions related to AI. What design choices will be used to make AI companions engaging and ultimately addictive? And how will these addictive companions affect the people who use them? 

Interdisciplinary study that builds on research into dark patterns in social media is needed to understand this psychological dimension of AI. For example, our research already shows that people are more likely to engage with AIs emulating people they admire, even if they know the avatar to be fake.

Once we understand the psychological dimensions of AI companionship, we can design effective policy interventions. It has been shown that redirecting people’s focus to evaluate truthfulness before sharing content online can reduce misinformation, while gruesome pictures on cigarette packages are already used to deter would-be smokers. Similar design approaches could highlight the dangers of AI addiction and make AI systems less appealing as a replacement for human companionship.

It is hard to modify the human desire to be loved and entertained, but we may be able to change economic incentives. A tax on engagement with AI might push people toward higher-quality interactions and encourage a safer way to use platforms, regularly but for short periods. Much as state lotteries have been used to fund education, an engagement tax could finance activities that foster human connections, like art centers or parks. 

Fresh thinking on regulation may be required

In 1992, Sherry Turkle, a preeminent psychologist who pioneered the study of human-technology interaction, identified the threats that technical systems pose to human relationships. One of the key challenges emerging from Turkle’s work speaks to a question at the core of this issue: Who are we to say that what you like is not what you deserve? 

For good reasons, our liberal society struggles to regulate the types of harms that we describe here. Much as outlawing adultery has been rightly rejected as illiberal meddling in personal affairs, who—or what—we wish to love is none of the government’s business. At the same time, the universal ban on child sexual abuse material represents an example of a clear line that must be drawn, even in a society that values free speech and personal liberty. The difficulty of regulating AI companionship may require new regulatory approaches—grounded in a deeper understanding of the incentives underlying these companions—that take advantage of new technologies. 

One of the most effective regulatory approaches is to embed safeguards directly into technical designs, similar to the way designers prevent choking hazards by making children’s toys larger than an infant’s mouth. This “regulation by design” approach could seek to make interactions with AI less harmful by designing the technology in ways that make it less desirable as a substitute for human connections while still useful in other contexts. New research may be needed to find better ways to limit the behaviors of large AI models with techniques that alter AI’s objectives on a fundamental technical level. For example, “alignment tuning” refers to a set of training techniques aimed at bringing AI models into accord with human preferences; this could be extended to address their addictive potential. Similarly, “mechanistic interpretability” aims to reverse-engineer the way AI models make decisions. This approach could be used to identify and eliminate specific portions of an AI system that give rise to harmful behaviors.

We can evaluate the performance of AI systems using interactive and human-driven techniques that go beyond static benchmarking to highlight addictive capabilities. The addictive nature of AI is the result of complex interactions between the technology and its users. Testing models in real-world conditions with user input can reveal patterns of behavior that would otherwise go unnoticed. Researchers and policymakers should collaborate to determine standard practices for testing AI models with diverse groups, including vulnerable populations, to ensure that the models do not exploit people’s psychological preconditions.

Unlike humans, AI systems can easily adjust to changing policies and rules. The principle of “legal dynamism,” which casts laws as dynamic systems that adapt to external factors, can help us identify the best possible intervention, like “trading curbs” that pause stock trading to help prevent crashes after a large market drop. In the AI case, the changing factors include things like the mental state of the user. For example, a dynamic policy may allow an AI companion to become increasingly engaging, charming, or flirtatious over time if that is what the user desires, so long as the person does not exhibit signs of social isolation or addiction. This approach may help maximize personal choice while minimizing addiction. But it relies on the ability to accurately understand a user’s behavior and mental state, and to measure these sensitive attributes in a privacy-preserving manner.
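A dynamic policy of this kind can be sketched in crude outline. The thresholds, field names, and the idea of a numeric “engagement level” below are all hypothetical; detecting addiction signals reliably and privately is the open problem the text points to.

```python
def allowed_engagement(user):
    """Cap a companion's engagement level based on the user's state.

    `desired_level` and `addiction_signals` are hypothetical inputs;
    measuring them accurately and privately is the hard part.
    """
    level = min(user["desired_level"], 1.0)
    if user["addiction_signals"]:
        # Dynamically curb the companion when warning signs appear,
        # e.g. long isolated sessions or withdrawal from human contact.
        return min(level, 0.3)
    return level

print(allowed_engagement({"desired_level": 0.9, "addiction_signals": False}))
print(allowed_engagement({"desired_level": 0.9, "addiction_signals": True}))
```

The policy adapts to the user rather than imposing a single fixed limit, which is the essence of the legal-dynamism idea: the rule's effect changes as the measured circumstances change.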

The most effective solution to these problems would likely strike at what drives individuals into the arms of AI companionship—loneliness and boredom. But regulatory interventions may also inadvertently punish those who are in need of companionship, or they may cause AI providers to move to a more favorable jurisdiction in the decentralized international marketplace. While we should strive to make AI as safe as possible, this work cannot replace efforts to address larger issues, like loneliness, that make people vulnerable to AI addiction in the first place.

The bigger picture

Technologists are driven by the desire to see beyond the horizons that others cannot fathom. They want to be at the vanguard of revolutionary change. Yet the issues we discuss here make it clear that the difficulty of building technical systems pales in comparison to the challenge of nurturing healthy human interactions. The timely issue of AI companions is a symptom of a larger problem: maintaining human dignity in the face of technological advances driven by narrow economic incentives. More and more frequently, we witness situations where technology designed to “make the world a better place” wreaks havoc on society. Thoughtful but decisive action is needed before AI becomes a ubiquitous set of generative rose-colored glasses for reality—before we lose our ability to see the world for what it truly is, and to recognize when we have strayed from our path.

Technology has come to be a synonym for progress, but technology that robs us of the time, wisdom, and focus needed for deep reflection is a step backward for humanity. As builders and investigators of AI systems, we call upon researchers, policymakers, ethicists, and thought leaders across disciplines to join us in learning more about how AI affects us individually and collectively. Only by systematically renewing our understanding of humanity in this technological age can we find ways to ensure that the technologies we develop further human flourishing.

Robert Mahari is a joint JD-PhD candidate at the MIT Media Lab and Harvard Law School. His work focuses on computational law—using advanced computational techniques to analyze, improve, and extend the study and practice of law. 

Pat Pataranutaporn is a researcher at the MIT Media Lab. His work focuses on cyborg psychology and the art and science of human-AI interaction.

A playbook for crafting AI strategy

Giddy predictions about AI, from its contributions to economic growth to the onset of mass automation, are now as frequent as the release of powerful new generative AI models. The consultancy PwC, for example, predicts that AI could boost global gross domestic product (GDP) by 14% by 2030, generating US $15.7 trillion.

Forty percent of our mundane tasks could be automated by then, claim researchers at the University of Oxford, while Goldman Sachs forecasts US $200 billion in AI investment by 2025. “No job, no function will remain untouched by AI,” says SP Singh, senior vice president and global head, enterprise application integration and services, at technology company Infosys.

While these prognostications may prove true, today’s businesses are finding major hurdles when they seek to graduate from pilots and experiments to enterprise-wide AI deployment. Just 5.4% of US businesses, for example, were using AI to produce a product or service in 2024.

Moving from initial forays into AI use, such as code generation and customer service, to firm-wide integration depends on strategic and organizational transitions in infrastructure, data governance, and supplier ecosystems. As well, organizations must weigh uncertainties about developments in AI performance and how to measure return on investment.

If organizations seek to scale AI across the business in coming years, however, now is the time to act. This report explores the current state of enterprise AI adoption and offers a playbook for crafting an AI strategy, helping business leaders bridge the chasm between ambition and execution. Key findings include the following:

AI ambitions are substantial, but few have scaled beyond pilots. Fully 95% of companies surveyed are already using AI and 99% expect to in the future. But few organizations have graduated beyond pilot projects: 76% have deployed AI in just one to three use cases. Yet because half of companies expect to fully deploy AI across all business functions within two years, this year is key to establishing foundations for enterprise-wide AI.

AI readiness spending is slated to rise significantly. Overall, AI spending in 2022 and 2023 was modest or flat for most companies, with only one in four increasing their spending by more than a quarter. That is set to change in 2024, with nine in ten respondents expecting to increase AI spending on data readiness (including platform modernization, cloud migration, and data quality) and in adjacent areas like strategy, cultural change, and business models. Four in ten expect to increase spending by 10 to 24%, and one-third expect to increase spending by 25 to 49%.

Data liquidity is one of the most important attributes for AI deployment. The ability to seamlessly access, combine, and analyze data from various sources enables firms to extract relevant information and apply it effectively to specific business scenarios. It also eliminates the need to sift through vast data repositories, as the data is already curated and tailored to the task at hand.

Data quality is a major limitation for AI deployment. Half of respondents cite data quality as the most limiting data issue in deployment. This is especially true for larger firms with more data and substantial investments in legacy IT infrastructure. Companies with revenues of over US $10 billion are the most likely to cite both data quality and data infrastructure as limiters, suggesting that organizations presiding over larger data repositories find the problem substantially harder.

Companies are not rushing into AI. Nearly all organizations (98%) say they are willing to forgo being the first to use AI if that ensures they deliver it safely and securely. Governance, security, and privacy are the biggest brake on the speed of AI deployment, cited by 45% of respondents (and a full 65% of respondents from the largest companies).

Download the full report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

How machines that can solve complex math problems might usher in more powerful AI

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

It’s been another big week in AI. Meta updated its powerful new Llama model, which it’s handing out for free, and OpenAI said it is going to trial an AI-powered online search tool that you can chat with, called SearchGPT. 

But the news item that really stood out to me was one that didn’t get as much attention as it should have. It has the potential to usher in more powerful AI and to enable scientific discoveries that were previously out of reach. 

Last Thursday, Google DeepMind announced it had built AI systems that can solve complex math problems. The systems—called AlphaProof and AlphaGeometry 2—worked together to successfully solve four out of six problems from this year’s International Mathematical Olympiad, a prestigious competition for high school students. Their performance was the equivalent of winning a silver medal. It’s the first time any AI system has achieved such a high success rate on these kinds of problems. My colleague Rhiannon Williams has the news here.

Math! I can already imagine your eyes glazing over. But bear with me. This announcement is not just about math. In fact, it signals an exciting new development in the kind of AI we can now build. AI search engines that you can chat with may add to the illusion of intelligence, but systems like Google DeepMind’s could improve the actual intelligence of AI. For that reason, building systems that are better at math has been a goal for many AI labs, such as OpenAI.  

That’s because math is a benchmark for reasoning. To complete these exercises aimed at high school students, the AI system needed to do very complex things like planning to understand and solve abstract problems. The systems were also able to generalize, allowing them to solve a whole range of different problems in various branches of mathematics. 

“What we’ve seen here is that you can combine [reinforcement learning] that was so successful in things like AlphaGo with large language models and produce something which is extremely capable in the space of text,” David Silver, principal research scientist at Google DeepMind and indisputably a pioneer of deep reinforcement learning, said in a press briefing. In this case, that capability was used to construct programs in the computer language Lean that represent mathematical proofs. He says the International Mathematical Olympiad represents a test for what’s possible and paves the way for further breakthroughs. 
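To make concrete what a machine-checkable Lean proof looks like, here is a toy example (a generic illustration, not output from AlphaProof): a short proof that the sum of two even numbers is even.

```lean
-- A toy Lean 4 proof: if a = 2k and b = 2m, then a + b = 2(k + m).
-- Every step is verified mechanically by the proof checker, which is
-- what makes Lean proofs such a clean, unambiguous reward signal for
-- a reinforcement-learning system.
theorem sum_of_evens (a b k m : Nat) (ha : a = 2 * k) (hb : b = 2 * m) :
    a + b = 2 * (k + m) := by
  rw [ha, hb, Nat.mul_add]
```

Because the checker either accepts a proof or rejects it, there is no partial credit and no ambiguity, exactly the property Silver highlights below.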

This same recipe could be applied in any situation with a really clear, verified reward signal for reinforcement-learning algorithms and an unambiguous way to measure correctness, as there is in mathematics, said Silver. One potential application would be coding, for example. 

Now for a compulsory reality check: AlphaProof and AlphaGeometry 2 can still only solve hard high-school-level problems. That’s a long way away from the extremely hard problems top human mathematicians can solve. Google DeepMind stressed that its tool did not, at this point, add anything to the body of mathematical knowledge humans have created. But that wasn’t the point. 

“We are aiming to provide a system that can prove anything,” Silver said. Think of an AI system as reliable as a calculator, for example, that can provide proofs for many challenging problems, or verify tests for computer software or scientific experiments. Or perhaps build better AI tutors that can give feedback on exam results, or fact-check news articles. 

But the thing that excites me most is what Katie Collins, a researcher at the University of Cambridge who specializes in math and AI (and was not involved in the project), told Rhiannon. She says these tools create and evaluate new problems, motivate new people to enter the field, and spark more wonder. That’s something we definitely need more of in this world.
Now read the rest of The Algorithm

Deeper Learning

A new tool for copyright holders can show if their work is in AI training data

Since the beginning of the generative AI boom, content creators have argued that their work has been scraped into AI models without their consent. But until now, it has been difficult to know whether specific text has actually been used in a training data set. Now they have a new way to prove it: “copyright traps.” These are pieces of hidden text that let you mark written content in order to later detect whether it has been used in AI models or not. 
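A minimal sketch of the trap-planting side of this idea, under stated assumptions: the trap is a high-entropy random word sequence (so a model is vanishingly unlikely to produce it unless it was trained on it), hidden in a page so scrapers ingest it but readers never see it. The function names, mini-vocabulary, and zero-font-size hiding trick are all illustrative assumptions, not the researchers’ actual method.

```python
import secrets

# Hypothetical mini-vocabulary for illustration; a real trap would draw
# from a much larger word list to maximize entropy.
VOCAB = ["harbor", "violet", "lantern", "mosaic", "ember", "quartz",
         "meadow", "cipher", "tundra", "saffron", "drift", "aurora"]

def make_trap(n_words: int = 12) -> str:
    """Generate a unique, high-entropy 'trap' sentence. A language model
    should only find this exact sequence unusually familiar if it
    appeared in the model's training data."""
    return " ".join(secrets.choice(VOCAB) for _ in range(n_words))

def embed_trap_html(article_html: str, trap: str) -> str:
    """Hide the trap in an HTML page so scrapers pick it up but human
    readers don't (here, via zero-size text)."""
    hidden = f'<span style="font-size:0">{trap}</span>'
    return article_html.replace("</body>", hidden + "</body>")

article = "<html><body><p>My original essay...</p></body></html>"
trap = make_trap()
marked = embed_trap_html(article, trap)
```

Detection then works by comparing how confidently a suspect model predicts the trap sequence versus similar random control sequences it has never seen; markedly higher confidence on the trap suggests memorization during training.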

Why this matters: Copyright traps tap into one of the biggest fights in AI. A number of publishers and writers are in the middle of litigation against tech companies, claiming their intellectual property has been scraped into AI training data sets without their permission. The idea is that these traps could help to nudge the balance a little more in the content creators’ favor. Read more from me here.

Bits and Bytes

AI trained on AI garbage spits out AI garbage
New research published in Nature shows that the quality of AI models’ output gradually degrades when it’s trained on AI-generated data. As subsequent models produce output that is then used as training data for future models, the effect gets worse. (MIT Technology Review)

OpenAI unveils SearchGPT 
The company says it is testing new AI search features that give you fast and timely answers with clear and relevant sources cited. The idea is for the technology to eventually be incorporated into ChatGPT, and CEO Sam Altman says it’ll be possible to do voice searches. However, like many other AI-powered search services, including Google’s, it’s already making errors, as the Atlantic reports. 
(OpenAI)

AI video generator Runway trained on thousands of YouTube videos without permission
Leaked documents show that the company was secretly training its generative AI models by scraping thousands of videos from popular YouTube creators and brands, as well as pirated films. (404 Media)

Meta’s big bet on open-source AI continues
Meta unveiled Llama 3.1 405B, the first frontier-level open-source AI model, which matches state-of-the-art models such as GPT-4 and Gemini in performance. In an accompanying blog post, Mark Zuckerberg renewed his calls for open-source AI to become the industry standard. This would be good for customization, competition, data protection, and efficiency, he argues. It’s also good for Meta, because it leaves competitors with less of an advantage in the AI space. (Facebook)

Reimagining cloud strategy for AI-first enterprises

The rise of generative artificial intelligence (AI), natural language processing, and computer vision has sparked lofty predictions: AI will revolutionize business operations, transform the nature of knowledge work, and boost companies’ bottom lines and the larger global economy by trillions of dollars.

Executives and technology leaders are eager to see these promises realized, and many are enjoying impressive results of early AI investments. Balakrishna D.R. (Bali), executive vice president, global services head, AI and industry verticals at Infosys, says that generative AI is already proving game-changing for tasks such as knowledge management, search and summarization, software development, and customer service across sectors such as financial services, retail, health care, and automotive.

Realizing AI’s full potential on a mass scale will require more than just executives’ enthusiasm; becoming a truly AI-first enterprise will require a significant, sustained investment in cloud infrastructure and strategy. In 2024, the cloud has evolved beyond its initial purpose as a storage tool and cost saver to become a crucial driver of innovation, transformation, and disruption. Now, with AI in the mix, enterprises are looking to the cloud to support large language models (LLMs) to maximize R&D performance and defend against cyberattacks, among other high-impact use cases.

A 2023 report by Infosys looks at how prepared companies are to realize the combined potential of cloud and AI. To further assess this state of readiness, MIT Technology Review Insights and Infosys surveyed 500 business leaders across industries such as IT, manufacturing, financial services, and consumer goods about how their organizations are thinking about—and acting upon—an integrated cloud and AI strategy.

This research found that most companies are still experimenting and preparing their infrastructure landscape for AI from a cloud perspective—and many are planning additional investments to accelerate their progress.

Download the full report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.