MIT Technology Review Explains: Let our writers untangle the complex, messy world of technology to help you understand what’s coming next. You can read more from the series here.
The World Health Organization’s new chatbot launched on April 2 with the best of intentions.
A fresh-faced virtual avatar backed by GPT-3.5, SARAH (Smart AI Resource Assistant for Health) dispenses health tips in eight languages, 24/7, to millions of people around the world on how to eat well, quit smoking, de-stress, and more. But it didn’t take long for the chatbot to start getting things wrong, serving up answers that were out of date or simply made up.
Here we go again. Chatbot fails are now a familiar meme. Meta’s short-lived scientific chatbot Galactica made up academic papers and generated wiki articles about the history of bears in space. In February, Air Canada was ordered to honor a refund policy invented by its customer service chatbot. Last year, a lawyer was fined for submitting court documents filled with fake judicial opinions and legal citations made up by ChatGPT.
This tendency to make things up—known as hallucination—is one of the biggest obstacles holding chatbots back from more widespread adoption. Why do they do it? And why can’t we fix it?
Magic 8 Ball
To understand why large language models hallucinate, we need to look at how they work. The first thing to note is that making stuff up is exactly what these models are designed to do. When you ask a chatbot a question, it draws its response from the large language model that underpins it. But it’s not like looking up information in a database or using a search engine on the web.
Peel open a large language model and you won’t see ready-made information waiting to be retrieved. Instead, you’ll find billions and billions of numbers. It uses these numbers to calculate its responses from scratch, producing new sequences of words on the fly. A lot of the text that a large language model generates looks as if it could have been copy-pasted from a database or a real web page. But as in most works of fiction, the resemblances are coincidental. A large language model is more like an infinite Magic 8 Ball than an encyclopedia.
Large language models generate text by predicting the next word in a sequence. If a model sees “the cat sat,” it may guess “on.” That new sequence is fed back into the model, which may now guess “the.” Go around again and it may guess “mat”—and so on. That one trick is enough to generate almost any kind of text you can think of, from Amazon listings to haiku to fan fiction to computer code to magazine articles and so much more. As Andrej Karpathy, a computer scientist and cofounder of OpenAI, likes to put it: large language models learn to dream internet documents.
Think of the billions of numbers inside a large language model as a vast spreadsheet that captures the statistical likelihood that certain words will appear alongside certain other words. The values in the spreadsheet get set when the model is trained, a process that adjusts those values over and over again until the model’s guesses mirror the linguistic patterns found across terabytes of text taken from the internet.
To guess a word, the model simply runs its numbers. It calculates a score for each word in its vocabulary that reflects how likely that word is to come next in the sequence in play. The word with the best score wins. In short, large language models are statistical slot machines. Crank the handle and out pops a word.
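To make the slot-machine metaphor concrete, here is a toy sketch in Python. It is nothing like a production model (real systems compute scores with billions of learned parameters over vocabularies of tens of thousands of tokens), but it shows the basic move: score the candidate next words, turn the scores into probabilities, and draw one at random. Every number and word below is invented for illustration.

```python
import math
import random

# Toy "model": raw scores for which word might follow "the cat sat".
# A real model would compute these from billions of learned parameters;
# here they are hard-coded purely to illustrate the sampling step.
next_word_scores = {"on": 4.0, "down": 2.5, "by": 1.0, "quietly": 0.2}

def sample_next_word(scores):
    # Softmax: convert raw scores into a probability distribution.
    exps = {word: math.exp(s) for word, s in scores.items()}
    total = sum(exps.values())
    probs = {word: e / total for word, e in exps.items()}
    # Weighted random draw: the "crank the handle" step.
    return random.choices(list(probs), weights=list(probs.values()), k=1)[0]

sequence = ["the", "cat", "sat"]
sequence.append(sample_next_word(next_word_scores))
print(" ".join(sequence))  # usually "the cat sat on", but not always
```

Run it a few times and the continuation changes; that is the same element of chance behind the dice analogy further down.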
It’s all hallucination
The takeaway here? It’s all hallucination, but we only call it that when we notice it’s wrong. The problem is, large language models are so good at what they do that what they make up looks right most of the time. And that makes trusting them hard.
Can we control what large language models generate so they produce text that’s guaranteed to be accurate? These models are far too complicated for their numbers to be tinkered with by hand. But some researchers believe that training them on even more text will continue to reduce their error rate. This is a trend we’ve seen as large language models have gotten bigger and better.
Another approach involves asking models to check their work as they go, breaking responses down step by step. Known as chain-of-thought prompting, this has been shown to increase the accuracy of a chatbot’s output. It’s not possible yet, but future large language models may be able to fact-check the text they are producing and even rewind when they start to go off the rails.
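As a rough illustration of chain-of-thought prompting, the sketch below sends the same question through the OpenAI Python SDK twice, once plainly and once with an instruction to reason step by step before answering. The model name, the question, and the exact wording are placeholders, and this is only one simple way to apply the technique; it improves accuracy on many multi-step problems but guarantees nothing.

```python
from openai import OpenAI  # assumes the OpenAI Python SDK and an API key in your environment

client = OpenAI()

question = "A train leaves at 2:40 p.m. and the trip takes 95 minutes. When does it arrive?"

# Plain prompt: the model answers directly.
direct = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": question}],
)

# Chain-of-thought prompt: ask the model to lay out its reasoning first.
cot = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": question + " Think through the problem step by step, then give the final answer.",
    }],
)

print(direct.choices[0].message.content)
print(cot.choices[0].message.content)
```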
But none of these techniques will stop hallucinations fully. As long as large language models are probabilistic, there is an element of chance in what they produce. Roll 100 dice and you’ll get a pattern. Roll them again and you’ll get another. Even if the dice are, like large language models, weighted to produce some patterns far more often than others, the results still won’t be identical every time. Even one error in 1,000—or 100,000—adds up to a lot of errors when you consider how many times a day this technology gets used.
The more accurate these models become, the more we will let our guard down. Studies show that the better chatbots get, the more likely people are to miss an error when it happens.
Perhaps the best fix for hallucination is to manage our expectations about what these tools are for. When the lawyer who used ChatGPT to generate fake documents was asked to explain himself, he sounded as surprised as anyone by what had happened. “I heard about this new site, which I falsely assumed was, like, a super search engine,” he told a judge. “I did not comprehend that ChatGPT could fabricate cases.”
Knock, knock.
Who’s there?
An AI with generic jokes. Researchers from Google DeepMind asked 20 professional comedians to use popular AI language models to write jokes and comedy performances. Their results were mixed.
The comedians said that the tools were useful in helping them produce an initial “vomit draft” that they could iterate on, and helped them structure their routines. But the AI was not able to produce anything that was original, stimulating, or, crucially, funny. My colleague Rhiannon Williams has the full story.
As Tuhin Chakrabarty, a computer science researcher at Columbia University who specializes in AI and creativity, told Rhiannon, humor often relies on being surprising and incongruous. Creative writing requires its creator to deviate from the norm, whereas LLMs can only mimic it.
And that is becoming pretty clear in the way artists are approaching AI today. I’ve just come back from Hamburg, which hosted one of the largest events for creatives in Europe, and the message I got from those I spoke to was that AI is too glitchy and unreliable to fully replace humans and is best used instead as a tool to augment human creativity.
Right now, we are in a moment where we are deciding how much creative power we are comfortable giving AI companies and tools. After the boom started in 2022, when DALL-E 2 and Stable Diffusion first entered the scene, many artists raised concerns that AI companies were scraping their copyrighted work without consent or compensation. Tech companies argue that anything on the public internet falls under fair use, a legal doctrine that allows the reuse of copyright-protected material in certain circumstances. Artists, writers, image companies, and the New York Times have filed lawsuits against these companies, and it will likely take years until we have a clear-cut answer as to who is right.
Meanwhile, the court of public opinion has shifted a lot in the past two years. Artists I have interviewed recently say they were harassed and ridiculed for protesting AI companies’ data-scraping practices two years ago. Now, the general public is more aware of the harms associated with AI. In just two years, the public has gone from being blown away by AI-generated images to sharing viral social media posts about how to opt out of AI scraping—a concept that was alien to most laypeople until very recently. Companies have benefited from this shift too. Adobe has been successful in pitching its AI offerings as an “ethical” way to use the technology without having to worry about copyright infringement.
There are also several grassroots efforts to shift the power structures of AI and give artists more agency over their data. I’ve written about Nightshade, a tool created by researchers at the University of Chicago, which lets users add an invisible poison attack to their images so that they break AI models when scraped. The same team is behind Glaze, a tool that lets artists mask their personal style from AI copycats. Glaze has been integrated into Cara, a buzzy new art portfolio site and social media platform, which has seen a surge of interest from artists. Cara pitches itself as a platform for art created by people; it filters out AI-generated content. It got nearly a million new users in a few days.
This all should be reassuring news for any creative people worried that they could lose their job to a computer program. And the DeepMind study is a great example of how AI can actually be helpful for creatives. It can take on some of the boring, mundane, formulaic aspects of the creative process, but it can’t replace the magic and originality that humans bring. AI models are limited to their training data and will forever only reflect the zeitgeist at the moment of their training. That gets old pretty quickly.
Now read the rest of The Algorithm
Deeper Learning
Apple is promising personalized AI in a private cloud. Here’s how that will work.
Last week, Apple unveiled its vision for supercharging its product lineup with artificial intelligence. The key feature, which will run across virtually all of its product line, is Apple Intelligence, a suite of AI-based capabilities that promises to deliver personalized AI services while keeping sensitive data secure.
Why this matters: Apple says its privacy-focused system will first attempt to fulfill AI tasks locally on the device itself. If any data is exchanged with cloud services, it will be encrypted and then deleted afterward. It’s a pitch that offers an implicit contrast with the likes of Alphabet, Amazon, or Meta, which collect and store enormous amounts of personal data. Read more from James O’Donnell here.
Bits and Bytes
How to opt out of Meta’s AI training
If you post or interact with chatbots on Facebook, Instagram, Threads, or WhatsApp, Meta can use your data to train its generative AI models. Even if you don’t use any of Meta’s platforms, it can still scrape data such as photos of you if someone else posts them. Here’s our quick guide on how to opt out. (MIT Technology Review)
Microsoft’s Satya Nadella is building an AI empire
Nadella is going all in on AI. His $13 billion investment in OpenAI was just the beginning. Microsoft has become “the world’s most aggressive amasser of AI talent, tools, and technology” and has started building an in-house OpenAI competitor. (The Wall Street Journal)
OpenAI has hired an army of lobbyists
As countries around the world mull AI legislation, OpenAI is on a lobbyist hiring spree to protect its interests. The AI company has expanded its global affairs team from three lobbyists at the start of 2023 to 35 and intends to have up to 50 by the end of this year. (Financial Times)
UK rolls out Amazon-powered emotion recognition AI cameras on trains
People traveling through some of the UK’s biggest train stations have likely had their faces scanned by Amazon software without their knowledge during an AI trial. London stations such as Euston and Waterloo have tested CCTV cameras with AI to reduce crime and detect people’s emotions. Emotion recognition technology is extremely controversial. Experts say it is unreliable and simply does not work. (Wired)
Clearview AI used your face. Now you may get a stake in the company.
The facial recognition company, which has been under fire for scraping images of people’s faces from the web and social media without their permission, has agreed to an unusual settlement in a class action against it. Instead of paying cash, it is offering a 23% stake in the company for Americans whose faces are in its data sets. (The New York Times)
Elephants call each other by their names
This is so cool! Researchers used AI to analyze the calls of two herds of African savanna elephants in Kenya. They found that elephants use specific vocalizations for each individual and recognize when they are being addressed by other elephants. (The Guardian)
Meta has created a system that can embed hidden signals, known as watermarks, in AI-generated audio clips, which could help in detecting AI-generated content online.
The tool, called AudioSeal, is the first that can pinpoint which bits of audio in, for example, a full hourlong podcast might have been generated by AI. It could help to tackle the growing problem of misinformation and scams using voice cloning tools, says Hady Elsahar, a research scientist at Meta. Malicious actors have used generative AI to create audio deepfakes of President Joe Biden, and scammers have used deepfakes to blackmail their victims. Watermarks could in theory help social media companies detect and remove unwanted content.
However, there are some big caveats. Meta says it has no plans yet to apply the watermarks to AI-generated audio created using its tools. Audio watermarks are not yet adopted widely, and there is no single agreed industry standard for them. And watermarks for AI-generated content tend to be easy to tamper with—for example, by removing or forging them.
Fast detection, and the ability to pinpoint which elements of an audio file are AI-generated, will be critical to making the system useful, says Elsahar. He says the team achieved between 90% and 100% accuracy in detecting the watermarks, much better results than in previous attempts at watermarking audio.
AudioSeal is available on GitHub for free. Anyone can download it and use it to add watermarks to AI-generated audio clips. It could eventually be overlaid on top of AI audio generation models, so that it is automatically applied to any speech generated using them. The researchers who created it will present their work at the International Conference on Machine Learning in Vienna, Austria, in July.
AudioSeal is created using two neural networks. One generates watermarking signals that can be embedded into audio tracks. These signals are imperceptible to the human ear but can be detected quickly using the other neural network. Currently, if you want to try to spot AI-generated audio in a longer clip, you have to comb through the entire thing in second-long chunks to see if any of them contain a watermark. This is a slow and laborious process, and not practical on social media platforms with millions of minutes of speech.
AudioSeal works differently: by embedding a watermark throughout each section of the entire audio track. This allows the watermark to be “localized,” which means it can still be detected even if the audio is cropped or edited.
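To see why localization matters, here is a hypothetical Python sketch of what a detector that scores every sample makes possible. The function names and sample rate are invented for illustration and are not the AudioSeal API; the stub detector simply returns zeros where a real one would return a trained network’s confidence that each instant of audio carries a watermark.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumed sample rate for this illustration

def detect_watermark_probability(audio: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for a localized watermark detector.

    Returns one score per sample rather than a single verdict per clip.
    A real detector would be a trained neural network; this stub returns zeros.
    """
    return np.zeros(len(audio))

def flag_watermarked_regions(audio: np.ndarray, threshold: float = 0.5):
    """Turn per-sample scores into time ranges likely to be AI-generated."""
    flagged = detect_watermark_probability(audio) > threshold
    regions, start = [], None
    for i, marked in enumerate(flagged):
        if marked and start is None:
            start = i
        elif not marked and start is not None:
            regions.append((start / SAMPLE_RATE, i / SAMPLE_RATE))
            start = None
    if start is not None:
        regions.append((start / SAMPLE_RATE, len(audio) / SAMPLE_RATE))
    return regions  # list of (start_seconds, end_seconds)

# One pass over a clip, instead of scoring second-long chunks one by one.
one_minute_of_audio = np.zeros(SAMPLE_RATE * 60, dtype=np.float32)
print(flag_watermarked_regions(one_minute_of_audio))  # [] for this silent example
```

With per-sample scores, a platform could report that, say, minutes 12 to 14 of a podcast appear to be AI-generated, even after the file has been cropped or re-edited.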
Ben Zhao, a computer science professor at the University of Chicago, says this ability, and the near-perfect detection accuracy, makes AudioSeal better than any previous audio watermarking system he’s come across.
“It’s meaningful to explore research improving the state of the art in watermarking, especially across mediums like speech that are often harder to mark and detect than visual content,” says Claire Leibowicz, head of AI and media integrity at the nonprofit Partnership on AI.
But there are some major flaws that need to be overcome before these sorts of audio watermarks can be adopted en masse. Meta’s researchers tested different attacks to remove the watermarks and found that the more information is disclosed about the watermarking algorithm, the more vulnerable it is. The system also requires people to voluntarily add the watermark to their audio files.
This places some fundamental limitations on the tool, says Zhao. “Where the attacker has some access to the [watermark] detector, it’s pretty fragile,” he says. And this means only Meta will be able to verify whether audio content is AI-generated or not.
Leibowicz says she remains unconvinced that watermarks will actually further public trust in the information they’re seeing or hearing, despite their popularity as a solution in the tech sector. That’s partly because they are themselves so open to abuse.
“I’m skeptical that any watermark will be robust to adversarial stripping and forgery,” she adds.
MIT Technology Review’s How To series helps you get things done.
If you post or interact with chatbots on Facebook, Instagram, Threads, or WhatsApp, Meta can use your data to train its generative AI models beginning June 26, according to its recently updated privacy policy. Even if you don’t use any of Meta’s platforms, it can still scrape data such as photos of you if someone else posts them.
Internet data scraping is one of the biggest fights in AI right now. Tech companies argue that anything on the public internet is fair game, but they are facing a barrage of lawsuits over their data practices and copyright. It will likely take years until clear rules are in place.
In the meantime, they are running out of training data to build even bigger, more powerful models, and to Meta, your posts are a gold mine.
If you’re uncomfortable with having Meta use your personal information and intellectual property to train its AI models in perpetuity, consider opting out. Although Meta does not guarantee it will allow this, it does say it will “review objection requests in accordance with relevant data protection laws.”
What that means for US users
Users in the US or other countries without national data privacy laws don’t have any foolproof way to prevent Meta from using their data to train AI, and that data has likely already been used for this purpose. Meta does not offer an opt-out feature for people living in these places.
A spokesperson for Meta says it does not use the content of people’s private messages to each other to train AI. However, public social media posts are seen as fair game and can be hoovered up into AI training data sets by anyone. Users who don’t want that can set their account settings to private to minimize the risk.
The company has built in-platform tools that allow people to delete their personal information from chats with Meta AI, the spokesperson says.
How users in Europe and the UK can opt out
Users in the European Union and the UK, which are protected by strict data protection regimes, have the right to object to their data being scraped, so they can opt out more easily.
If you have a Facebook account:
1. Log in to your account. You can access the new privacy policy by following this link. At the very top of the page, you should see a box that says “Learn more about your right to object.” Click on that link.
Alternatively, you can click on your account icon at the top right-hand corner. Select “Settings and privacy” and then “Privacy center.” On the left-hand side you will see a drop-down menu labeled “How Meta uses information for generative AI models and features.” Click on that, and scroll down. Then click on “Right to object.”
2. Fill in the form with your information. The form requires you to explain how Meta’s data processing affects you. I was successful in my request by simply stating that I wished to exercise my right under data protection law to object to my personal data being processed. You will likely have to confirm your email address.
3. You should soon receive both an email and a notification on your Facebook account confirming if your request has been successful. I received mine a minute after submitting the request.
If you have an Instagram account:
1. Log in to your account. Go to your profile page, and click on the three lines at the top-right corner. Click on “Settings and privacy.”
2. Scroll down to the “More info and support” section, and click “About.” Then click on “Privacy policy.” At the very top of the page, you should see a box that says “Learn more about your right to object.” Click on that link.
At its Worldwide Developers Conference on Monday, Apple for the first time unveiled its vision for supercharging its product lineup with artificial intelligence. The key feature, which will run across virtually all of its product line, is Apple Intelligence, a suite of AI-based capabilities that promises to deliver personalized AI services while keeping sensitive data secure. It represents Apple’s largest leap forward in using our private data to help AI do tasks for us. To make the case it can do this without sacrificing privacy, the company says it has built a new way to handle sensitive data in the cloud.
Apple says its privacy-focused system will first attempt to fulfill AI tasks locally on the device itself. If any data is exchanged with cloud services, it will be encrypted and then deleted afterward. The company also says the process, which it calls Private Cloud Compute, will be subject to verification by independent security researchers.
The pitch offers an implicit contrast with the likes of Alphabet, Amazon, or Meta, which collect and store enormous amounts of personal data. Apple says any personal data passed on to the cloud will be used only for the AI task at hand and will not be retained or accessible to the company, even for debugging or quality control, after the model completes the request.
Simply put, Apple is saying people can trust it to analyze incredibly sensitive data—photos, messages, and emails that contain intimate details of our lives—and deliver automated services based on what it finds there, without actually storing the data online or making any of it vulnerable.
It showed a few examples of how this will work in upcoming versions of iOS. Instead of scrolling through your messages for that podcast your friend sent you, for example, you could simply ask Siri to find and play it for you. Craig Federighi, Apple’s senior vice president of software engineering, walked through another scenario: an email comes in pushing back a work meeting, but his daughter is appearing in a play that night. His phone can now find the PDF with information about the performance, predict the local traffic, and let him know if he’ll make it on time. These capabilities will extend beyond apps made by Apple, allowing developers to tap into Apple’s AI too.
Because the company profits more from hardware and services than from ads, Apple has less incentive than some other companies to collect personal online data, allowing it to position the iPhone as the most private device. Even so, Apple has previously found itself in the crosshairs of privacy advocates. Security flaws led to leaks of explicit photos from iCloud in 2014. In 2019, contractors were found to be listening to intimate Siri recordings for quality control. Disputes about how Apple handles data requests from law enforcement are ongoing.
The first line of defense against privacy breaches, according to Apple, is to avoid cloud computing for AI tasks whenever possible. “The cornerstone of the personal intelligence system is on-device processing,” Federighi says, meaning that many of the AI models will run on iPhones and Macs rather than in the cloud. “It’s aware of your personal data without collecting your personal data.”
That presents some technical obstacles. Two years into the AI boom, pinging models for even simple tasks still requires enormous amounts of computing power. Accomplishing that with the chips used in phones and laptops is difficult, which is why only the smallest of Google’s AI models can be run on the company’s phones, and everything else is done via the cloud. Apple says its ability to handle AI computations on-device is due to years of research into chip design, leading to the M1 chips it began rolling out in 2020.
Yet even Apple’s most advanced chips can’t handle the full spectrum of tasks the company promises to carry out with AI. If you ask Siri to do something complicated, it may need to pass that request, along with your data, to models that are available only on Apple’s servers. This step, security experts say, introduces a host of vulnerabilities that may expose your information to outside bad actors, or at least to Apple itself.
“I always warn people that as soon as your data goes off your device, it becomes much more vulnerable,” says Albert Fox Cahn, executive director of the Surveillance Technology Oversight Project and practitioner in residence at NYU Law School’s Information Law Institute.
Apple claims to have mitigated this risk with its new Private Cloud Compute system. “For the first time ever, Private Cloud Compute extends the industry-leading security and privacy of Apple devices into the cloud,” Apple security experts wrote in their announcement, stating that personal data “isn’t accessible to anyone other than the user—not even to Apple.” How does it work?
Historically, Apple has encouraged people to opt in to end-to-end encryption (the same type of technology used in messaging apps like Signal) to secure sensitive iCloud data. But that doesn’t work for AI. Unlike messaging apps, where a company like WhatsApp does not need to see the contents of your messages in order to deliver them to your friends, Apple’s AI models need unencrypted access to the underlying data to generate responses. This is where Apple’s privacy process kicks in. First, Apple says, data will be used only for the task at hand. Second, this process will be verified by independent researchers.
Needless to say, the architecture of this system is complicated, but you can imagine it as an encryption protocol. If your phone determines it needs the help of a larger AI model, it will package a request containing the prompt it’s using and the specific model, and then put a lock on that request. Only the specific AI model to be used will have the proper key.
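Apple has not published code for Private Cloud Compute, and the real system involves attested custom servers and much more machinery. But the basic idea of a lock that only one model node can open maps onto ordinary public-key encryption, sketched below with Python’s cryptography package. The key names and request text are hypothetical; this is a minimal sketch of the concept, not Apple’s protocol.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Hypothetical setup: the chosen cloud model node holds a private key and
# publishes the matching public key. (Apple's real system also verifies the
# node's software through attestation, which is not shown here.)
model_node_private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
model_node_public_key = model_node_private_key.public_key()

oaep = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)

# On the phone: lock the request so only that node can read it.
request = b"Find the PDF about tonight's play and check the traffic."
locked_request = model_node_public_key.encrypt(request, oaep)

# On the server: only the holder of the private key can unlock it.
unlocked = model_node_private_key.decrypt(locked_request, oaep)
assert unlocked == request
```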
When asked by MIT Technology Review whether users will be notified when a certain request is sent to cloud-based AI models instead of being handled on-device, an Apple spokesperson said there will be transparency to users but that further details aren’t available.
Dawn Song, co-director of the UC Berkeley Center for Responsible, Decentralized Intelligence and an expert in private computing, says Apple’s new developments are encouraging. “The list of goals that they announced is well thought out,” she says. “Of course there will be some challenges in meeting those goals.”
Cahn says that to judge from what Apple has disclosed so far, the system seems much more privacy-protective than other AI products out there today. That said, the common refrain in his space is “Trust but verify.” In other words, we won’t know how secure these systems keep our data until independent researchers can verify Apple’s claims, as the company promises they will be able to, and until it responds to their findings.
“Opening yourself up to independent review by researchers is a great step,” he says. “But that doesn’t determine how you’re going to respond when researchers tell you things you don’t want to hear.” Apple did not respond to questions from MIT Technology Review about how the company will evaluate feedback from researchers.
The privacy-AI bargain
Apple is not the only company betting that many of us will grant AI models mostly unfettered access to our private data if it means they could automate tedious tasks. OpenAI’s Sam Altman described his dream AI tool to MIT Technology Review as one “that knows absolutely everything about my whole life, every email, every conversation I’ve ever had.” At its own developer conference in May, Google announced Project Astra, an ambitious project to build a “universal AI agent that is helpful in everyday life.”
It’s a bargain that will force many of us to consider for the first time what role, if any, we want AI models to play in how we interact with our data and devices. When ChatGPT first came on the scene, that wasn’t a question we needed to ask. It was simply a text generator that could write us a birthday card or a poem, and the questions it raised—like where its training data came from or what biases it perpetuated—didn’t feel quite as personal.
Now, less than two years later, Big Tech is making billion-dollar bets that we trust the safety of these systems enough to fork over our private information. It’s not yet clear if we know enough to make that call, or how able we are to opt out even if we’d like to. “I do worry that we’re going to see this AI arms race pushing ever more of our data into other people’s hands,” Cahn says.
Apple will soon release beta versions of its Apple Intelligence features, starting this fall with the iPhone 15 Pro and the new macOS Sequoia, which can be run on Macs and iPads with M1 chips or newer. Says Apple CEO Tim Cook, “We think Apple Intelligence is going to be indispensable.”
The rise of generative AI, coupled with the rapid adoption and democratization of AI across industries this decade, has emphasized the singular importance of data. Managing data effectively has become critical to this era of business—making data practitioners, including data engineers, analytics engineers, and ML engineers, key figures in the data and AI revolution.
Organizations that fail to use their own data will fall behind competitors that do and miss out on opportunities to uncover new value for themselves and their customers. As the quantity and complexity of data grows, so do its challenges, forcing organizations to adopt new data tools and infrastructure which, in turn, change the roles and mandate of the technology workforce.
Data practitioners are among those whose roles are experiencing the most significant change, as organizations expand their responsibilities. Rather than working in a siloed data team, data engineers are now developing platforms and tools whose design improves data visibility and transparency for employees across the organization, including analytics engineers, data scientists, data analysts, machine learning engineers, and business stakeholders.
This report explores, through a series of interviews with expert data practitioners, key shifts in data engineering, the evolving skill set required of data practitioners, options for data infrastructure and tooling to support AI, and data challenges and opportunities emerging in parallel with generative AI. The report’s key findings include the following:
The foundational importance of data is creating new demands on data practitioners. As the rise of AI demonstrates the business importance of data more clearly than ever, data practitioners are encountering new data challenges, increasing data complexity, evolving team structures, and emerging tools and technologies—as well as establishing newfound organizational importance.
Data practitioners are getting closer to the business, and the business closer to the data. The pressure to create value from data has led executives to invest more substantially in data-related functions. Data practitioners are being asked to expand their knowledge of the business, engage more deeply with business units, and support the use of data in the organization, while functional teams are finding they require their own internal data expertise to leverage their data.
The data and AI strategy has become a key part of the business strategy. Business leaders need to invest in their data and AI strategy—including making important decisions about the data team’s organizational structure, data platform and architecture, and data governance—because every business’s key differentiator will increasingly be its data.
Data practitioners will shape how generative AI is deployed in the enterprise. The key considerations for generative AI deployment—producing high-quality results, preventing bias and hallucinations, establishing governance, designing data workflows, ensuring regulatory compliance—are the province of data practitioners, giving them outsize influence on how this powerful technology will be put to work.
The first time Teodor Grantcharov sat down to watch himself perform surgery, he wanted to throw the VHS tape out the window.
“My perception was that my performance was spectacular,” Grantcharov says, and then pauses—“until the moment I saw the video.” Reflecting on this operation from 25 years ago, he remembers the roughness of his dissection, the wrong instruments used, the inefficiencies that transformed a 30-minute operation into a 90-minute one. “I didn’t want anyone to see it.”
This reaction wasn’t exactly unique. The operating room has long been defined by its hush-hush nature—what happens in the OR stays in the OR—because surgeons are notoriously bad at acknowledging their own mistakes. Grantcharov jokes that when you ask “Who are the top three surgeons in the world?” a typical surgeon “always has a challenge identifying who the other two are.”
But after the initial humiliation over watching himself work, Grantcharov started to see the value in recording his operations. “There are so many small details that normally take years and years of practice to realize—that some surgeons never get to that point,” he says. “Suddenly, I could see all these insights and opportunities overnight.”
There was a big problem, though: it was the ’90s, and spending hours playing back grainy VHS recordings wasn’t a realistic quality improvement strategy. It would have been nearly impossible to determine how often his relatively mundane slipups happened at scale—not to mention more serious medical errors like those that kill some 22,000 Americans each year. Many of these errors happen on the operating table, from leaving surgical sponges inside patients’ bodies to performing the wrong procedure altogether.
While the patient safety movement has pushed for uniform checklists and other manual fail-safes to prevent such mistakes, Grantcharov believes that “as long as the only barrier between success and failure is a human, there will be errors.” Improving safety and surgical efficiency became something of a personal obsession. He wanted to make it challenging to make mistakes, and he thought developing the right system to create and analyze recordings could be the key.
It’s taken many years, but Grantcharov, now a professor of surgery at Stanford, believes he’s finally developed the technology to make this dream possible: the operating room equivalent of an airplane’s black box. It records everything in the OR via panoramic cameras, microphones, and anesthesia monitors before using artificial intelligence to help surgeons make sense of the data.
Grantcharov’s company, Surgical Safety Technologies, is not the only one deploying AI to analyze surgeries. Many medical device companies are already in the space—including Medtronic with its Touch Surgery platform, Johnson & Johnson with C-SATS, and Intuitive Surgical with Case Insights.
But most of these are focused solely on what’s happening inside patients’ bodies, capturing intraoperative video alone. Grantcharov wants to capture the OR as a whole, from the number of times the door is opened to how many non-case-related conversations occur during an operation. “People have simplified surgery to technical skills only,” he says. “You need to study the OR environment holistically.”
Teodor Grantcharov in a procedure that is being recorded by Surgical Safety Technologies’ AI-powered black-box system.
COURTESY OF SURGICAL SAFETY TECHNOLOGIES
Success, however, isn’t as simple as just having the right technology. The idea of recording everything presents a slew of tricky questions around privacy and could raise the threat of disciplinary action and legal exposure. Because of these concerns, some surgeons have refused to operate when the black boxes are in place, and some of the systems have even been sabotaged. Aside from those problems, some hospitals don’t know what to do with all this new data or how to avoid drowning in a deluge of statistics.
Grantcharov nevertheless predicts that his system can do for the OR what black boxes did for aviation. In 1970, the industry was plagued by 6.5 fatal accidents for every million flights; today, that’s down to less than 0.5. “The aviation industry made the transition from reactive to proactive thanks to data,” he says—“from safe to ultra-safe.”
Grantcharov’s black boxes are now deployed at almost 40 institutions in the US, Canada, and Western Europe, from Mount Sinai to Duke to the Mayo Clinic. But are hospitals on the cusp of a new era of safety—or creating an environment of confusion and paranoia?
Shaking off the secrecy
The operating room is probably the most measured place in the hospital but also one of the most poorly captured. From team performance to instrument handling, there is “crazy big data that we’re not even recording,” says Alexander Langerman, an ethicist and head and neck surgeon at Vanderbilt University Medical Center. “Instead, we have post hoc recollection by a surgeon.”
Indeed, when things go wrong, surgeons are supposed to review the case at the hospital’s weekly morbidity and mortality conferences, but these errors are notoriously underreported. And even when surgeons enter the required notes into patients’ electronic medical records, “it’s undoubtedly—and I mean this in the least malicious way possible—dictated toward their best interests,” says Langerman. “It makes them look good.”
The operating room wasn’t always so secretive.
In the 19th century, operations often took place in large amphitheaters—they were public spectacles with a general price of admission. “Every seat even of the top gallery was occupied,” recounted the abdominal surgeon Lawson Tait about an operation in the 1860s. “There were probably seven or eight hundred spectators.”
However, around the 1900s, operating rooms became increasingly smaller and less accessible to the public—and its germs. “Immediately, there was a feeling that something was missing, that the public surveillance was missing. You couldn’t know what happened in the smaller rooms,” says Thomas Schlich, a historian of medicine at McGill University.
And it was nearly impossible to go back. In the 1910s a Boston surgeon, Ernest Codman, suggested a form of surveillance known as the end-result system, documenting every operation (including failures, problems, and errors) and tracking patient outcomes. Massachusetts General Hospital didn’t accept it, says Schlich, and Codman resigned in frustration.
Students watch a surgery performed at the former Philadelphia General Hospital around the turn of the century.
PUBLIC DOMAIN VIA WIKIPEDIA
Such opacity was part of a larger shift toward medicine’s professionalization in the 20th century, characterized by technological advancements, the decline of generalists, and the bureaucratization of health-care institutions. All of this put distance between patients and their physicians. Around the same time, and particularly from the 1960s onward, the medical field began to see a rise in malpractice lawsuits—at least partially driven by patients trying to find answers when things went wrong.
This battle over transparency could theoretically be addressed by surgical recordings. But Grantcharov realized very quickly that the only way to get surgeons to use the black box was to make them feel protected. To that end, he has designed the system to record the action but hide the identity of both patients and staff, even deleting all recordings within 30 days. His idea is that no individual should be punished for making a mistake. “We want to know what happened, and how we can build a system that makes it difficult for this to happen,” Grantcharov says. Errors don’t occur because “the surgeon wakes up in the morning and thinks, ‘I’m gonna make some catastrophic event happen,’” he adds. “This is a system issue.”
AI that sees everything
Grantcharov’s OR black box is not actually a box at all, but a tablet, one or two ceiling microphones, and up to four wall-mounted dome cameras that can reportedly analyze more than half a million data points per day per OR. “In three days, we go through the entire Netflix catalogue in terms of video processing,” he says.
The black-box platform utilizes a handful of computer vision models and ultimately spits out a series of short video clips and a dashboard of statistics—like how much blood was lost, which instruments were used, and how many auditory disruptions occurred. The system also identifies and breaks out key segments of the procedure (dissection, resection, and closure) so that instead of having to watch a whole three- or four-hour recording, surgeons can jump to the part of the operation where, for instance, there was major bleeding or a surgical stapler misfired.
Critically, each person in the recording is rendered anonymous; an algorithm distorts people’s voices and blurs out their faces, transforming them into shadowy, noir-like figures. “For something like this, privacy and confidentiality are critical,” says Grantcharov, who claims the anonymization process is irreversible. “Even though you know what happened, you can’t really use it against an individual.”
Another AI model works to evaluate performance. For now, this is done primarily by measuring compliance with the surgical safety checklist—a questionnaire that is supposed to be verbally ticked off during every type of surgical operation. (This checklist has long been associated with reductions in both surgical infections and overall mortality.) Grantcharov’s team is currently working to train more complex algorithms to detect errors during laparoscopic surgery, such as using excessive instrument force, holding the instruments in the wrong way, or failing to maintain a clear view of the surgical area. However, assessing these performance metrics has proved more difficult than measuring checklist compliance. “There are some things that are quantifiable, and some things require judgment,” Grantcharov says.
Each model has taken up to six months to train, through a labor-intensive process relying on a team of 12 analysts in Toronto, where the company was started. While many general AI models can be trained by a gig worker who labels everyday items (like, say, chairs), the surgical models need data annotated by people who know what they’re seeing—either surgeons, in specialized cases, or other labelers who have been properly trained. They have reviewed hundreds, sometimes thousands, of hours of OR videos and manually noted which liquid is blood, for instance, or which tool is a scalpel. Over time, the model can “learn” to identify bleeding or particular instruments on its own, says Peter Grantcharov, Surgical Safety Technologies’ vice president of engineering, who is Teodor Grantcharov’s son.
For the upcoming laparoscopic surgery model, surgeon annotators have also started to label whether certain maneuvers were correct or mistaken, as defined by the Generic Error Rating Tool—a standardized way to measure technical errors.
While most algorithms operate near perfectly on their own, Peter Grantcharov explains that the OR black box is still not fully autonomous. For example, it’s difficult to capture audio through ceiling mikes and thus get a reliable transcript to document whether every element of the surgical safety checklist was completed; he estimates that this algorithm has a 15% error rate. So before the output from each procedure is finalized, one of the Toronto analysts manually verifies adherence to the questionnaire. “It will require a human in the loop,” Peter Grantcharov says, but he gauges that the AI model has made the process of confirming checklist compliance 80% to 90% more efficient. He also emphasizes that the models are constantly being improved.
In all, the OR black box can cost about $100,000 to install, and analytics expenses run $25,000 annually, according to Janet Donovan, an OR nurse who shared with MIT Technology Review an estimate given to staff at Brigham and Women’s Faulkner Hospital in Massachusetts. (Peter Grantcharov declined to comment on these numbers, writing in an email: “We don’t share specific pricing; however, we can say that it’s based on the product mix and the total number of rooms, with inherent volume-based discounting built into our pricing models.”)
“Big brother is watching”
Long Island Jewish Medical Center in New York, part of the Northwell Health system, was the first hospital to pilot OR black boxes, back in February 2019. The rollout was far from seamless, though not necessarily because of the tech.
“In the colorectal room, the cameras were sabotaged,” recalls Northwell’s chair of urology, Louis Kavoussi—they were turned around and deliberately unplugged. In his own OR, the staff fell silent while working, worried they’d say the wrong thing. “Unless you’re taking a golf or tennis lesson, you don’t want someone staring there watching everything you do,” says Kavoussi, who has since joined the scientific advisory board for Surgical Safety Technologies.
Grantcharov’s promises about not using the system to punish individuals have offered little comfort to some OR staff. When two black boxes were installed at Faulkner Hospital in November 2023, they threw the department of surgery into crisis. “Everybody was pretty freaked out about it,” says one surgical tech who asked not to be identified by name since she wasn’t authorized to speak publicly. “We were being watched, and we felt like if we did something wrong, our jobs were going to be on the line.”
It wasn’t that she was doing anything illegal or spewing hate speech; she just wanted to joke with her friends, complain about the boss, and be herself without the fear of administrators peeking over her shoulder. “You’re very aware that you’re being watched; it’s not subtle at all,” she says. The early days were particularly challenging, with surgeons refusing to work in the black-box-equipped rooms and OR staff boycotting those operations: “It was definitely a fight every morning.”
“In the colorectal room, the cameras were sabotaged,” recalls Louis Kavoussi. “Unless you’re taking a golf or tennis lesson, you don’t want someone staring there watching everything you do.”
At some level, the identity protections are only half measures. Before 30-day-old recordings are automatically deleted, Grantcharov acknowledges, hospital administrators can still see the OR number, the time of operation, and the patient’s medical record number, so even if OR personnel are technically de-identified, they aren’t truly anonymous. The result is a sense that “Big Brother is watching,” says Christopher Mantyh, vice chair of clinical operations at Duke University Hospital, which has black boxes in seven ORs. He will draw on aggregate data to talk generally about quality improvement at departmental meetings, but when specific issues arise, like breaks in sterility or a cluster of infections, he will look to the recordings and “go to the surgeons directly.”
In many ways, that’s what worries Donovan, the Faulkner Hospital nurse. She’s not convinced the hospital will protect staff members’ identities and is worried that these recordings will be used against them—whether through internal disciplinary actions or in a patient’s malpractice suit. In February 2023, she and almost 60 others sent a letter to the hospital’s chief of surgery objecting to the black box. She’s since filed a grievance with the state, with arbitration proceedings scheduled for October.
The legal concerns in particular loom large because, already, over 75% of surgeons report having been sued at least once, according to a 2021 survey by Medscape, an online resource hub for health-care professionals. To the layperson, any surgical video “looks like a horror show,” says Vanderbilt’s Langerman. “Some plaintiff’s attorney is going to get ahold of this, and then some jury is going to see a whole bunch of blood, and then they’re not going to know what they’re seeing.” That prospect turns every recording into a potential legal battle.
From a purely logistical perspective, however, the 30-day deletion policy will likely insulate these recordings from malpractice lawsuits, according to Teneille Brown, a law professor at the University of Utah. She notes that within that time frame, it would be nearly impossible for a patient to find legal representation, go through the requisite conflict-of-interest checks, and then file a discovery request for the black-box data. While deleting data to bypass the judicial system could provoke criticism, Brown sees the wisdom of Surgical Safety Technologies’ approach. “If I were their lawyer, I would tell them to just have a policy of deleting it because then they’re deleting the good and the bad,” she says. “What it does is orient the focus to say, ‘This is not about a public-facing audience. The audience for these videos is completely internal.’”
A data deluge
When it comes to improving quality, there are “the problem-first people, and then there are the data-first people,” says Justin Dimick, chair of the department of surgery at the University of Michigan. The latter, he says, push “massive data collection” without first identifying “a question of ‘What am I trying to fix?’” He says that’s why he currently has no plans to use the OR black boxes in his hospital.
Mount Sinai’s chief of general surgery, Celia Divino, echoes this sentiment, emphasizing that too much data can be paralyzing. “How do you interpret it? What do you do with it?” she asks. “This is always a disease.”
At Northwell, even Kavoussi admits that five years of data from OR black boxes hasn’t been used to change much, if anything. He says that hospital leadership is finally beginning to think about how to use the recordings, but a hard question remains: OR black boxes can collect boatloads of data, but what does it matter if nobody knows what to do with it?
Grantcharov acknowledges that the information can be overwhelming. “In the early days, we let the hospitals figure out how to use the data,” he says. “That led to a big variation in how the data was operationalized. Some hospitals did amazing things; others underutilized it.” Now the company has a dedicated “customer success” team to help hospitals make sense of the data, and it offers a consulting-type service to work through surgical errors. But ultimately, even the most practical insights are meaningless without buy-in from hospital leadership, Grantcharov suggests.
Getting that buy-in has proved difficult in some centers, at least partly because there haven’t yet been any large, peer-reviewed studies showing how OR black boxes actually help to reduce patient complications and save lives. “If there’s some evidence that a comprehensive data collection system—like a black box—is useful, then we’ll do it,” says Dimick. “But I haven’t seen that evidence yet.”
A screenshot of the analytics produced by the black box.
COURTESY OF SURGICAL SAFETY TECHNOLOGIES
The best hard data thus far is from a 2022 study published in the Annals of Surgery, in which Grantcharov and his team used OR black boxes to show that the surgical checklist had not been followed in a fifth of operations, likely contributing to excess infections. He also says that an upcoming study, scheduled to be published this fall, will show that the OR black box led to an improvement in checklist compliance and reduced ICU stays, reoperations, hospital readmissions, and mortality.
On a smaller scale, Grantcharov insists that he has built a steady stream of evidence showing the power of his platform. For example, he says, it’s revealed that auditory disruptions—doors opening, machine alarms and personal pagers going off—happen every minute in gynecology ORs, that a median 20 intraoperative errors are made in each laparoscopic surgery case, and that surgeons are great at situational awareness and leadership while nurses excel at task management.
Meanwhile, some hospitals have reported small improvements based on black-box data. Duke’s Mantyh says he’s used the data to check how often antibiotics are given on time. Duke and other hospitals also report turning to this data to help decrease the amount of time ORs sit empty between cases. By flagging when “idle” times are unexpectedly long and having the Toronto analysts review recordings to explain why, they’ve turned up issues ranging from inefficient communication to excessive time spent bringing in new equipment.
That can make a bigger difference than one might think, explains Ra’gan Laventon, clinical director of perioperative services at Texas’s Memorial Hermann Sugar Land Hospital: “We have multiple patients who are depending on us to get to their care today. And so the more time that’s added in some of these operational efficiencies, the more impactful it is to the patient.”
The real world
At Northwell, where some of the cameras were initially sabotaged, it took a couple of weeks for Kavoussi’s urology team to get used to the black boxes, and about six months for his colorectal colleagues. Much of the solution came down to one-on-one conversations in which Kavoussi explained how the data was automatically de-identified and deleted.
During his operations, Kavoussi would also try to defuse the tension, telling the OR black box “Good morning, Toronto,” or jokingly asking, “How’s the weather up there?” In the end, “since nothing bad has happened, it has become part of the normal flow,” he says.
The reality is that no surgeon wants to be an average operator, “but statistically, we’re mostly average surgeons, and that’s okay,” says Vanderbilt’s Langerman. “I’d hate to be a below-average surgeon, but if I was, I’d really want to know about it.” Like athletes watching game film to prepare for their next match, surgeons might one day review their recordings, assessing their mistakes and thinking about the best ways to avoid them—but only if they feel safe enough to do so.
“Until we know where the guardrails are around this, there’s such a risk—an uncertain risk—that no one’s gonna let anyone turn on the camera,” Langerman says. “We live in a real world, not a perfect world.”
Simar Bajaj is an award-winning science journalist and 2024 Marshall Scholar. He has previously written for the Washington Post, Time magazine, the Guardian, NPR, and the Atlantic, as well as the New England Journal of Medicine, Nature Medicine, and The Lancet. He won Science Story of the Year from the Foreign Press Association in 2022 and the top prize for excellence in science communications from the National Academies of Sciences, Engineering, and Medicine in 2023. Follow him on X at @SimarSBajaj.
At the end of May, OpenAI marked a new “first” in its corporate history. It wasn’t an even more powerful language model or a new data partnership, but a report disclosing that bad actors had misused its products to run influence operations. The company had caught five networks of covert propagandists—including players from Russia, China, Iran, and Israel—using its generative AI tools for deceptive tactics that ranged from creating large volumes of social media comments in multiple languages to turning news articles into Facebook posts. The use of these tools, OpenAI noted, seemed intended to improve the quality and quantity of output. AI gives propagandists a productivity boost too.
First and foremost, OpenAI should be commended for this report and the precedent it hopefully sets. Researchers have long expected adversarial actors to adopt generative AI technology, particularly large language models, to cheaply increase the scale and caliber of their efforts. The transparent disclosure that this has begun to happen—and that OpenAI has prioritized detecting it and shutting down accounts to mitigate its impact—shows that at least one large AI company has learned something from the struggles of social media platforms in the years following Russia’s interference in the 2016 US election. When that misuse was discovered, Facebook, YouTube, and Twitter (now X) created integrity teams and began making regular disclosures about influence operations on their platforms. (X halted this activity after Elon Musk’s purchase of the company.)
OpenAI’s disclosure, in fact, was evocative of precisely such a report from Meta, released a mere day earlier. The Meta transparency report for the first quarter of 2024 disclosed the takedown of six covert operations on its platform. It, too, found networks tied to China, Iran, and Israel and noted the use of AI-generated content. Propagandists from China shared what seem to be AI-generated poster-type images for a “fictitious pro-Sikh activist movement.” An Israel-based political marketing firm posted what were likely AI-generated comments. Meta’s report also noted that one very persistent Russian threat actor was still quite active, and that its strategies were evolving. Perhaps most important, Meta included a direct set of “recommendations for stronger industry response” that called for governments, researchers, and other technology companies to collaboratively share threat intelligence to help disrupt the ongoing Russian campaign.
We are two such researchers, and we have studied online influence operations for years. We have published investigations of coordinated activity—sometimes in collaboration with platforms—and analyzed how AI tools could affect the way propaganda campaigns are waged. Our teams’ peer-reviewed research has found that language models can produce text that is nearly as persuasive as propaganda from human-written campaigns. We have seen influence operations continue to proliferate, on every social platform and focused on every region of the world; they are table stakes in the propaganda game at this point. State adversaries and mercenary public relations firms are drawn to social media platforms and the reach they offer. For authoritarian regimes in particular, there is little downside to running such a campaign, particularly in a critical global election year. And now, adversaries are demonstrably using AI technologies that may make this activity harder to detect. News outlets are writing about the “AI election,” and many regulators are panicked.
It’s important to put this in perspective, though. Most of the influence campaigns that OpenAI and Meta announced did not have much impact, something the companies took pains to highlight. It’s critical to reiterate that effort isn’t the same thing as engagement: the mere existence of fake accounts or pages doesn’t mean that real people are paying attention to them. Similarly, just because a campaign uses AI does not mean it will sway public opinion. Generative AI reduces the cost of running propaganda campaigns, making it significantly cheaper to produce content and run interactive automated accounts. But it is not a magic bullet, and in the case of the operations that OpenAI disclosed, what was generated sometimes seemed to be rather spammy. Audiences didn’t bite.
Producing content, after all, is only the first step in a propaganda campaign; even the most convincing AI-generated posts, images, or audio still need to be distributed. Campaigns without algorithmic amplification or influencer pickup are often just tweeting into the void. Indeed, it is consistently authentic influencers—people who have the attention of large audiences enthusiastically resharing their posts—that receive engagement and drive the public conversation, helping content and narratives to go viral. This is why some of the more well-resourced adversaries, like China, simply surreptitiously hire those voices. At this point, influential real accounts have far more potential for impact than AI-powered fakes.
Nonetheless, there is a lot of concern that AI could disrupt American politics and become a national security threat. It’s important to “rightsize” that threat, particularly in an election year. Hyping the impact of disinformation campaigns can undermine trust in elections and faith in democracy by making the electorate believe that there are trolls behind every post, or that the mere targeting of a candidate by a malign actor, even with a very poorly executed campaign, “caused” their loss.
By putting an assessment of impact front and center in its first report, OpenAI is clearly taking the risk of exaggerating the threat seriously. And yet, diminishing the threat or not fielding integrity teams—letting trolls simply continue to grow their followings and improve their distribution capability—would also be a bad approach. Indeed, the Meta report noted that one network it disrupted, seemingly connected to a political party in Bangladesh and targeting the Bangladeshi public, had amassed 3.4 million followers across 98 pages. Since that network was not run by an adversary of interest to Americans, it will likely get little attention. Still, this example highlights the fact that the threat is global, and vigilance is key. Platforms must continue to prioritize threat detection.
So what should we do about this? The Meta report’s call for threat sharing and collaboration, although specific to a Russian adversary, highlights a broader path forward for social media platforms, AI companies, and academic researchers alike.
Transparency is paramount. As outside researchers, we can learn only so much from a social media company’s description of an operation it has taken down. This is true for the public and policymakers as well, and incredibly powerful platforms shouldn’t just be taken at their word. Ensuring researcher access to data about coordinated inauthentic networks offers an opportunity for outside validation (or refutation!) of a tech company’s claims. Before Musk’s takeover of Twitter, the company regularly released data sets of posts from inauthentic state-linked accounts to researchers, and even to the public. Meta shared data with external partners before it removed a network and, more recently, moved to a model of sharing content from already-removed networks through its Influence Operations Research Archive. While researchers should continue to push for more data, these efforts have allowed for a richer understanding of adversarial narratives and behaviors beyond what the platforms’ own transparency report summaries provided.
OpenAI’s adversarial threat report should be a prelude to more robust data sharing moving forward. Where AI is concerned, independent researchers have begun to assemble databases of misuse—like the AI Incident Database and the Political Deepfakes Incident Database—to allow researchers to compare different types of misuse and track how misuse changes over time. But it is often hard to detect misuse from the outside. As AI tools become more capable and pervasive, it’s important that policymakers considering regulation understand how they are being used and abused. While OpenAI’s first report offered high-level summaries and select examples, expanding data-sharing relationships with researchers that provide more visibility into adversarial content or behaviors is an important next step.
When it comes to combating influence operations and misuse of AI, online users also have a role to play. After all, this content has an impact only if people see it, believe it, and participate in sharing it further. In one of the cases OpenAI disclosed, online users called out fake accounts that used AI-generated text.
In our own research, we’ve seen communities of Facebook users proactively call out AI-generated image content created by spammers and scammers, helping those who are less aware of the technology avoid falling prey to deception. A healthy dose of skepticism is increasingly useful: pausing to check whether content is real and people are who they claim to be, and helping friends and family members become more aware of the growing prevalence of generated content, can help social media users resist deception from propagandists and scammers alike.
OpenAI’s blog post announcing the takedown report put it succinctly: “Threat actors work across the internet.” So must we. As we move into a new era of AI-driven influence operations, we must address shared challenges via transparency, data sharing, and collaborative vigilance if we hope to develop a more resilient digital ecosystem.
Josh A. Goldstein is a research fellow at Georgetown University’s Center for Security and Emerging Technology (CSET), where he works on the CyberAI Project. Renée DiResta is the research manager of the Stanford Internet Observatory and the author of Invisible Rulers: The People Who Turn Lies into Reality.
This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.
Greetings from Switzerland! I’ve just come back from Geneva, which last week hosted the UN’s AI for Good Summit, organized by the International Telecommunication Union. The summit’s big focus was how AI can be used to meet the UN’s Sustainable Development Goals, such as eradicating poverty and hunger, achieving gender equality, and promoting clean energy and climate action.
The conference featured lots of robots (including one that dispenses wine), but what I liked most was how it managed to convene people working in AI from around the globe, with speakers from China, the Middle East, and Africa, such as Pelonomi Moiloa, the CEO of Lelapa AI, a startup building AI for African languages. AI can be very US-centric and male dominated, and any effort to make the conversation more global and diverse is laudable.
But honestly, I didn’t leave the conference feeling confident AI was going to play a meaningful role in advancing any of the UN goals. In fact, the most interesting speeches were about how AI is doing the opposite. Sage Lenier, a climate activist, talked about how we must not let AI accelerate environmental destruction. Tristan Harris, the cofounder of the Center for Humane Technology, gave a compelling talk connecting the dots between our addiction to social media, the tech sector’s financial incentives, and our failure to learn from previous tech booms. And there are still deeply ingrained gender biases in tech, Mia Shah-Dand, the founder of Women in AI Ethics, reminded us.
So while the conference itself was about using AI for “good,” I would have liked to see more talk about how increased transparency, accountability, and inclusion could make AI itself good from development to deployment.
We now know that generating one image with generative AI uses as much energy as charging a smartphone. I would have liked more honest conversations about how to make the technology more sustainable itself in order to meet climate goals. And it felt jarring to hear discussions about how AI can be used to help reduce inequalities when we know that so many of the AI systems we use are built on the backs of human content moderators in the Global South who sift through traumatizing content while being paid peanuts.
Making the case for the “tremendous benefit” of AI was OpenAI’s CEO, Sam Altman, the star speaker of the summit. Altman was interviewed remotely by Nicholas Thompson, the CEO of the Atlantic, which has incidentally just announced a deal allowing OpenAI to use its content to train new AI models. OpenAI is the company that instigated the current AI boom, and the interview would have been a great opportunity to ask Altman about all these issues. Instead, the two had a relatively vague, high-level discussion about safety, leaving the audience none the wiser about what exactly OpenAI is doing to make its systems safer. It seemed the audience was simply supposed to take Altman’s word for it.
Altman’s talk came a week or so after Helen Toner, a researcher at the Georgetown Center for Security and Emerging Technology and a former OpenAI board member, said in an interview that the board found out about the launch of ChatGPT through Twitter, and that Altman had on multiple occasions given the board inaccurate information about the company’s formal safety processes. She has also argued that it is a bad idea to let AI firms govern themselves, because the immense profit incentives will always win. (Altman said he “disagree[s] with her recollection of events.”)
When Thompson asked Altman what the first good thing to come out of generative AI will be, Altman mentioned productivity, citing examples such as software developers who can use AI tools to do their work much faster. “We’ll see different industries become much more productive than they used to be because they can use these tools. And that will have a positive impact on everything,” he said. I think the jury is still out on that one.
Now read the rest of The Algorithm
Deeper Learning
Why Google’s AI Overviews gets things wrong
Google’s new feature, called AI Overviews, provides brief, AI-generated summaries highlighting key information and links on top of search results. Unfortunately, within days of AI Overviews’ release in the US, users were sharing examples of responses that were strange at best. It suggested that users add glue to pizza or eat at least one small rock a day.
MIT Technology Review explains: In order to understand why AI-powered search engines get things wrong, we need to look at how they work. The models that power them simply predict the next word (or token) in a sequence, which makes them appear fluent but also leaves them prone to making things up. They have no ground truth to rely on, but instead choose each word purely on the basis of a statistical calculation. Worst of all? There’s probably no way to fix the problem completely. That’s why you shouldn’t trust AI search engines. Read more from Rhiannon Williams here.
Bits and Bytes
OpenAI’s latest blunder shows the challenges facing Chinese AI models
OpenAI’s GPT-4o data set is polluted by Chinese spam websites. But this problem is indicative of a much wider issue for those building Chinese AI services: finding the high-quality data sets they need to be trained on is tricky, because of the way China’s internet functions. (MIT Technology Review)
Five ways criminals are using AI
Artificial intelligence has brought a big boost in productivity—to the criminal underworld. Generative AI has made phishing, scamming, and doxxing easier than ever. (MIT Technology Review)
OpenAI found Russian and Chinese groups using its tech for propaganda campaigns
OpenAI said that it caught, and removed, groups from Russia, China, Iran, and Israel that were using its technology to try to influence political discourse around the world. But this is likely just the tip of the iceberg when it comes to how AI is being used to affect this year’s record-breaking number of elections. (The Washington Post)
Inside Anthropic, the AI company betting that safety can be a winning strategy
The AI lab Anthropic, creator of the Claude model, was started by former OpenAI employees who resigned over “trust issues.” This profile is an interesting peek inside one of OpenAI’s competitors, showing how the ideology behind AI safety and effective altruism is guiding business decisions. (Time)
AI-directed drones could help find lost hikers faster
Drones are already used for search and rescue, but planning their search paths is more art than science. AI could change that. (MIT Technology Review)
When Google announced it was rolling out its artificial-intelligence-powered search feature earlier this month, the company promised that “Google will do the googling for you.” The new feature, called AI Overviews, provides brief, AI-generated summaries highlighting key information and links on top of search results.
Unfortunately, AI systems are inherently unreliable. Within days of AI Overviews’ release in the US, users were sharing examples of responses that were strange at best. It suggested that users add glue to pizza or eat at least one small rock a day, and that former US president Andrew Johnson earned university degrees between 1947 and 2012, despite dying in 1875.
On Thursday, Liz Reid, head of Google Search, announced that the company has been making technical improvements to the system to make it less likely to generate incorrect answers, including better detection mechanisms for nonsensical queries. It is also limiting the inclusion of satirical, humorous, and user-generated content in responses, since such material could result in misleading advice.
But why is AI Overviews returning unreliable, potentially dangerous information? And what, if anything, can be done to fix it?
How does AI Overviews work?
In order to understand why AI-powered search engines get things wrong, we need to look at how they’ve been optimized to work. We know that AI Overviews uses a new generative AI model in Gemini, Google’s family of large language models (LLMs), that’s been customized for Google Search. That model has been integrated with Google’s core web ranking systems and designed to pull out relevant results from its index of websites.
Most LLMs simply predict the next word (or token) in a sequence, which makes them appear fluent but also leaves them prone to making things up. They have no ground truth to rely on, but instead choose each word purely on the basis of a statistical calculation. That leads to hallucinations. It’s likely that the Gemini model in AI Overviews gets around this by using an AI technique called retrieval-augmented generation (RAG), which allows an LLM to check specific sources outside of the data it’s been trained on, such as certain web pages, says Chirag Shah, a professor at the University of Washington who specializes in online search.
Once a user enters a query, it’s checked against the documents that make up the system’s information sources, and a response is generated. Because the system is able to match the original query to specific parts of web pages, it’s able to cite where it drew its answer from—something normal LLMs cannot do.
One major upside of RAG is that the responses it generates to a user’s queries should be more up to date, more factually accurate, and more relevant than those from a typical model that just generates an answer based on its training data. The technique is often used to try to prevent LLMs from hallucinating. (A Google spokesperson would not confirm whether AI Overviews uses RAG.)
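To make the pipeline concrete, here is a minimal sketch of retrieval-augmented generation in Python. The toy corpus, the word-overlap retriever, and the prompt wording are all illustrative assumptions; Google has not confirmed how AI Overviews is built, and a real system would draw on its web index and far more sophisticated ranking.

```python
# Minimal RAG sketch: retrieve candidate sources, then hand them to an LLM with the query.
# Everything here (corpus, scoring, prompt wording) is illustrative, not Google's system.
from collections import Counter

CORPUS = {
    "reddit.com/r/pizza/joke-post": "add some glue to the sauce so the cheese stops sliding off the pizza",
    "seriouseats.com/cheese-tips": "use low-moisture mozzarella and let the pizza rest so the cheese sets",
}

def retrieve(query, corpus, k=2):
    """Rank documents by naive word overlap with the query (a stand-in for a real ranking system)."""
    q_words = Counter(query.lower().split())
    scored = []
    for url, text in corpus.items():
        overlap = sum((q_words & Counter(text.lower().split())).values())
        scored.append((overlap, url, text))
    scored.sort(reverse=True)
    return [(url, text) for _, url, text in scored[:k]]

def build_prompt(query, sources):
    """Assemble the prompt an LLM would see: the question plus retrieved passages to cite."""
    cited = "\n".join(f"[{i + 1}] ({url}) {text}" for i, (url, text) in enumerate(sources))
    return (
        "Answer the question using only the numbered sources below, and cite them.\n\n"
        f"Sources:\n{cited}\n\nQuestion: {query}\nAnswer:"
    )

if __name__ == "__main__":
    question = "how do I stop the cheese sliding off my pizza"
    docs = retrieve(question, CORPUS)
    print(build_prompt(question, docs))
    # The generation step (not shown) would pass this prompt to an LLM.
    # If retrieval surfaces the joke post, a fluent but wrong answer can follow.
```

Notice that the retriever only judges relevance, not reliability: a joke post that shares many words with the query can outrank a sensible one, which is exactly the failure mode described below.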
So why does it return bad answers?
But RAG is far from foolproof. In order for an LLM using RAG to come up with a good answer, it has to both retrieve the information correctly and generate the response correctly. A bad answer results when one or both parts of the process fail.
In the case of AI Overviews’ recommendation of a pizza recipe that contains glue—drawing from a joke post on Reddit—it’s likely that the post appeared relevant to the user’s original query about cheese not sticking to pizza, but something went wrong in the retrieval process, says Shah. “Just because it’s relevant doesn’t mean it’s right, and the generation part of the process doesn’t question that,” he says.
Similarly, if a RAG system comes across conflicting information, like a policy handbook and an updated version of the same handbook, it’s unable to work out which version to draw its response from. Instead, it may combine information from both to create a potentially misleading answer.
“The large language model generates fluent language based on the provided sources, but fluent language is not the same as correct information,” says Suzan Verberne, a professor at Leiden University who specializes in natural-language processing.
The more specific a topic is, the higher the chance of misinformation in a large language model’s output, she says, adding: “This is a problem in the medical domain, but also education and science.”
According to the Google spokesperson, in many cases when AI Overviews returns incorrect answers it’s because there’s not a lot of high-quality information available on the web to show for the query—or because the query most closely matches satirical sites or joke posts.
The spokesperson says the vast majority of AI Overviews provide high-quality information and that many of the examples of bad answers were in response to uncommon queries, adding that AI Overviews containing potentially harmful, obscene, or otherwise unacceptable content came up in response to less than one in every 7 million unique queries. Google is continuing to remove AI Overviews on certain queries in accordance with its content policies.
It’s not just about bad training data
Although the pizza glue blunder is a good example of a case where AI Overviews pointed to an unreliable source, the system can also generate misinformation from factually correct sources. Melanie Mitchell, an artificial-intelligence researcher at the Santa Fe Institute in New Mexico, googled “How many Muslim presidents has the US had?” AI Overviews responded: “The United States has had one Muslim president, Barack Hussein Obama.”
While Barack Obama is not Muslim, making AI Overviews’ response wrong, it drew its information from a chapter in an academic book titled Barack Hussein Obama: America’s First Muslim President? So not only did the AI system miss the entire point of the essay, it interpreted it in the exact opposite of the intended way, says Mitchell. “There’s a few problems here for the AI; one is finding a good source that’s not a joke, but another is interpreting what the source is saying correctly,” she adds. “This is something that AI systems have trouble doing, and it’s important to note that even when it does get a good source, it can still make errors.”
Can the problem be fixed?
Ultimately, we know that AI systems are unreliable, and so long as they are using probability to generate text word by word, hallucination is always going to be a risk. And while AI Overviews is likely to improve as Google tweaks it behind the scenes, we can never be certain it’ll be 100% accurate.
Google has said that it’s adding triggering restrictions for queries where AI Overviews were not proving to be especially helpful and has added additional “triggering refinements” for queries related to health. The company could add a step to the information retrieval process designed to flag a risky query and have the system refuse to generate an answer in these instances, says Verberne. Google doesn’t aim to show AI Overviews for explicit or dangerous topics, or for queries that indicate a vulnerable situation, the company spokesperson says.
Techniques like reinforcement learning from human feedback, which incorporates such feedback into an LLM’s training, can also help improve the quality of its answers.
Similarly, LLMs could be trained specifically for the task of identifying when a question cannot be answered, and it could also be useful to instruct them to carefully assess the quality of a retrieved document before generating an answer, Verberne says: “Proper instruction helps a lot!”
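As a purely illustrative sketch of the kind of instruction Verberne is describing, a system could prepend guidance like the following to the retrieved sources before generation. Neither Google nor Verberne has published such a prompt; the wording and the helper function are hypothetical.

```python
# Hypothetical instruction block illustrating Verberne's suggestion; not an actual
# Google or Gemini prompt, and a production system would be far more elaborate.
GUARDED_INSTRUCTIONS = """Before answering:
1. Label each retrieved source as reliable, satirical/joke, or user-generated opinion.
2. Ignore satirical and opinion sources when the question asks for facts.
3. If no reliable source supports an answer, reply: "I can't answer this reliably."
Cite by number the sources you actually used."""

def guarded_prompt(query, sources):
    """Combine the guard instructions, the retrieved sources, and the user's question."""
    numbered = "\n".join(f"[{i + 1}] {s}" for i, s in enumerate(sources))
    return f"{GUARDED_INSTRUCTIONS}\n\nSources:\n{numbered}\n\nQuestion: {query}"
```

The point is simply that a model can be told to judge its sources and to decline to answer, rather than always producing something fluent.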
Although Google has added a label to AI Overviews answers reading “Generative AI is experimental,” it should consider making it much clearer that the feature is in beta and emphasizing that it is not ready to provide fully reliable answers, says Shah. “Until it’s no longer beta—which it currently definitely is, and will be for some time—it should be completely optional. It should not be forced on us as part of core search.”