Chatbots are surprisingly effective at debunking conspiracy theories

It’s become a truism that facts alone don’t change people’s minds. Perhaps nowhere is this more clear than when it comes to conspiracy theories: Many people believe that you can’t talk conspiracists out of their beliefs. 

But that’s not necessarily true. It turns out that many conspiracy believers do respond to evidence and arguments—information that is now easy to deliver in the form of a tailored conversation with an AI chatbot.

In research we published in the journal Science this year, we had over 2,000 conspiracy believers engage in a roughly eight-minute conversation with DebunkBot, a model we built on top of OpenAI’s GPT-4 Turbo (the most up-to-date GPT model at that time). Participants began by writing out, in their own words, a conspiracy theory that they believed and the evidence that made the theory compelling to them. Then we instructed the AI model to persuade the user to stop believing in that conspiracy and adopt a less conspiratorial view of the world. A three-round back-and-forth text chat with the AI model (lasting 8.4 minutes on average) led to a 20% decrease in participants’ confidence in the belief, and about one in four participants—all of whom believed the conspiracy theory beforehand—indicated that they did not believe it after the conversation. This effect held true for both classic conspiracies (think the JFK assassination or the moon landing hoax) and more contemporary politically charged ones (like those related to the 2020 election and covid-19).
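For readers curious about the mechanics, here is a minimal sketch of the kind of loop such an experiment might use, assuming the openai Python client; the model name, system-prompt wording, and use of input() for the participant’s turns are illustrative stand-ins, not the study’s actual materials.

```python
# Minimal sketch of a three-round debunking chat (illustrative only; the model
# name, prompts, and input() turns are stand-ins, not the study's materials).
# Assumes the openai package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

def debunk_conversation(conspiracy: str, evidence: str, rounds: int = 3) -> list:
    """Run a short back-and-forth in which the model uses facts and evidence
    to try to persuade the user to reconsider the stated belief."""
    messages = [
        {"role": "system", "content": (
            "The user believes this conspiracy theory:\n" + conspiracy +
            "\n\nTheir reasons for believing it:\n" + evidence +
            "\n\nUsing accurate facts and evidence, respectfully persuade the "
            "user to adopt a less conspiratorial view."
        )},
        {"role": "user", "content": "Here is why I find this convincing: " + evidence},
    ]
    for _ in range(rounds):
        reply = client.chat.completions.create(model="gpt-4-turbo", messages=messages)
        answer = reply.choices[0].message.content
        print("\nAI:", answer)
        messages.append({"role": "assistant", "content": answer})
        messages.append({"role": "user", "content": input("\nYour response: ")})
    return messages
```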


This story is part of MIT Technology Review’s series “The New Conspiracy Age,” on how the present boom in conspiracy theories is reshaping science and technology.


This is good news, given the outsize role that unfounded conspiracy theories play in today’s political landscape. So while there are widespread and legitimate concerns that generative AI is a potent tool for spreading disinformation, our work shows that it can also be part of the solution. 

Even people who began the conversation absolutely certain that their conspiracy was true, or who indicated that it was highly important to their personal worldview, showed marked decreases in belief. Remarkably, the effects were very durable; we followed up with participants two months later and saw just as big a reduction in conspiracy belief as we did immediately after the conversations. 

Our experiments indicate that many believers are relatively rational but misinformed, and getting them timely, accurate facts can have a big impact. Conspiracy theories can make sense to reasonable people who have simply never heard clear, non-conspiratorial explanations for the events they’re fixated on. This may seem surprising. But many conspiratorial claims, while wrong, seem reasonable on the surface and require specialized, esoteric knowledge to evaluate and debunk. 

For example, 9/11 deniers often point to the claim that jet fuel doesn’t burn hot enough to melt steel as evidence that airplanes were not responsible for bringing down the Twin Towers—but the chatbot responds by pointing out that although this is true, the American Institute of Steel Construction says jet fuel does burn hot enough to reduce the strength of steel by over 50%, which is more than enough to cause such towers to collapse. 

Although we have greater access to factual information than ever before, it is extremely difficult to search that vast corpus of knowledge efficiently. Finding the truth that way requires knowing what to google—or who to listen to—and being sufficiently motivated to seek out conflicting information. There are large time and skill barriers to conducting such a search every time we hear a new claim, and so it’s easy to take conspiratorial content we stumble upon at face value. And most would-be debunkers at the Thanksgiving table make elementary mistakes that AI avoids: Do you know the melting point and tensile strength of steel offhand? And when your relative calls you an idiot as you try to correct them, can you maintain your composure? 

With enough effort, humans would almost certainly be able to research and deliver facts like the AI in our experiments. And in a follow-up experiment, we found that the AI debunking was just as effective if we told participants they were talking to an expert rather than an AI. So it’s not that the debunking effect is AI-specific. Generally speaking, facts and evidence delivered by humans would also work. But it would require a lot of time and concentration for a human to come up with those facts. Generative AI can do the cognitive labor of fact-checking and rebutting conspiracy claims much more efficiently. 

In another large follow-up experiment, we found that what drove the debunking effect was specifically the facts and evidence the model provided: Factors like letting people know the chatbot was going to try to talk them out of their beliefs didn’t reduce its efficacy, whereas telling the model to try to persuade its chat partner without using facts and evidence totally eliminated the effect. 

Although the foibles and hallucinations of these models are well documented, our results suggest that debunking efforts are widespread enough on the internet to keep the conspiracy-focused conversations roughly accurate. When we hired a professional fact-checker to evaluate GPT-4’s claims, they rated more than 99% of those claims as true and found no evidence of political bias. Also, in the few cases where participants named conspiracies that turned out to be true (like MK Ultra, the CIA’s human experimentation program that began in the 1950s), the AI chatbot confirmed their accurate belief rather than erroneously talking them out of it.

To date, largely by necessity, interventions to combat conspiracy theorizing have been mainly prophylactic—aiming to prevent people from going down the rabbit hole rather than trying to pull them back out. Now, thanks to advances in generative AI, we have a tool that can change conspiracists’ minds using evidence. 

Bots prompted to debunk conspiracy theories could be deployed on social media platforms to engage with those who share conspiratorial content—including other AI chatbots that spread conspiracies. Google could also link debunking AI models to search engines to provide factual answers to conspiracy-related queries. And instead of arguing with your conspiratorial uncle over the dinner table, you could just pass him your phone and have him talk to AI. 

Of course, there are much deeper implications here for how we as humans make sense of the world around us. It is widely argued that we now live in a “post-truth” world, where polarization and politics have eclipsed facts and evidence. By that account, our passions trump truth, logic-based reasoning is passé, and the only way to effectively change people’s minds is via psychological tactics like presenting compelling personal narratives or changing perceptions of the social norm. If so, the typical, discourse-based work of living together in a democracy is fruitless.

But facts aren’t dead. Our findings about conspiracy theories are the latest—and perhaps most extreme—in an emerging body of research demonstrating the persuasive power of facts and evidence. For example, while it was once believed that correcting falsehoods that align with one’s politics would just cause people to dig in and believe them even more, this idea of a “backfire” has itself been debunked: Many studies consistently find that corrections and warning labels reduce belief in, and sharing of, falsehoods—even among those who most distrust the fact-checkers making the corrections. Similarly, evidence-based arguments can change partisans’ minds on political issues, even when they are actively reminded that the argument goes against their party leader’s position. And simply reminding people to think about whether content is accurate before they share it can substantially reduce the spread of misinformation. 

And if facts aren’t dead, then there’s hope for democracy—though this arguably requires a consensus set of facts from which rival factions can work. There is indeed widespread partisan disagreement on basic facts, and a disturbing level of belief in conspiracy theories. Yet this doesn’t necessarily mean our minds are inescapably warped by our politics and identities. When faced with evidence—even inconvenient or uncomfortable evidence—many people do shift their thinking in response. And so if it’s possible to disseminate accurate information widely enough, perhaps with the help of AI, we may be able to reestablish the factual common ground that is missing from society today.

You can try our debunking bot yourself at debunkbot.com.

Thomas Costello is an assistant professor in social and decision sciences at Carnegie Mellon University. His research integrates psychology, political science, and human-computer interaction to examine where our viewpoints come from, how they differ from person to person, and why they change—as well as the sweeping impacts of artificial intelligence on these processes.

Gordon Pennycook is the Dorothy and Ariz Mehta Faculty Leadership Fellow and associate professor of psychology at Cornell University. He examines the causes and consequences of analytic reasoning, exploring how intuitive versus deliberative thinking shapes decision-making to understand errors underlying issues such as climate inaction, health behaviors, and political polarization.

David Rand is a professor of information science, marketing and management communication, and psychology at Cornell University. He uses approaches from computational social science and cognitive science to explore how human-AI dialogue can correct inaccurate beliefs, why people share falsehoods, and how to reduce political polarization and promote cooperation.

DeepSeek may have found a new way to improve AI’s ability to remember


An AI model released by the Chinese AI company DeepSeek uses new techniques that could significantly improve AI’s ability to “remember.”

Released last week, the optical character recognition (OCR) model works by extracting text from an image and turning it into machine-readable words. This is the same technology that powers scanner apps, translation of text in photos, and many accessibility tools. 

OCR is already a mature field with numerous high-performing systems, and according to the paper and some early reviews, DeepSeek’s new model performs on par with top models on key benchmarks.

But researchers say the model’s main innovation lies in how it processes information—specifically, how it stores and retrieves memories. Improving how AI models “remember” information could reduce the computing power they need to run, thus mitigating AI’s large (and growing) carbon footprint. 

Currently, most large language models break text down into thousands of tiny units called tokens. This turns the text into representations that models can understand. However, these tokens quickly become expensive to store and compute with as conversations with end users grow longer. When a user chats with an AI for lengthy periods, this challenge can cause the AI to forget things it’s been told and get information muddled, a problem some call “context rot.”

The new methods developed by DeepSeek (and published in its latest paper) could help to overcome this issue. Instead of storing words as tokens, its system packs written information into image form, almost as if it’s taking a picture of pages from a book. This allows the model to retain nearly the same information while using far fewer tokens, the researchers found. 
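To see why this can save space, here is a rough back-of-the-envelope sketch in Python; the page size, patch size, and compression ratio are assumptions made for illustration, not figures from DeepSeek’s paper.

```python
# Back-of-the-envelope comparison of text tokens vs. visual tokens for one page.
# All numbers are illustrative assumptions, not figures from the DeepSeek paper.

CHARS_PER_TEXT_TOKEN = 4          # common rule of thumb for English text
PAGE_CHARS = 3_000                # a dense printed page, roughly 500 words

# Suppose the page is rendered to an image, split into a grid of patches, and a
# compression stage merges several patches into each visual token.
IMAGE_SIZE = (1024, 1024)         # rendered page resolution (assumed)
PATCH = 16                        # pixels per patch side (assumed)
COMPRESSION = 16                  # patches merged per visual token (assumed)

text_tokens = PAGE_CHARS / CHARS_PER_TEXT_TOKEN
raw_patches = (IMAGE_SIZE[0] // PATCH) * (IMAGE_SIZE[1] // PATCH)
visual_tokens = raw_patches / COMPRESSION

print(f"text tokens:   ~{text_tokens:.0f}")    # ~750
print(f"visual tokens: ~{visual_tokens:.0f}")  # ~256 under these assumptions
```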

Essentially, the OCR model is a test bed for these new methods that permit more information to be packed into AI models more efficiently. 

Besides using visual tokens instead of just text tokens, the model is built on a type of tiered compression that is not unlike how human memories fade: Older or less critical content is stored in a slightly more blurry form in order to save space. Despite that, the paper’s authors argue, this compressed content can still remain accessible in the background while maintaining a high level of system efficiency.
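A toy sketch of that fading idea, assuming Pillow for image handling: recent pages are kept at full resolution while older ones are stored progressively smaller. The tier boundaries and scale factors below are invented for illustration and are not DeepSeek’s actual scheme.

```python
# Toy illustration of tiered "fading" storage for rendered pages. The tier
# boundaries and scale factors are invented for this sketch.
from PIL import Image

def store_page(page: Image.Image, age_in_turns: int) -> Image.Image:
    """Downscale older pages so they cost fewer visual tokens to keep around."""
    if age_in_turns < 10:        # recent: keep full resolution
        scale = 1.0
    elif age_in_turns < 100:     # older: half resolution (a quarter of the pixels)
        scale = 0.5
    else:                        # oldest: quarter resolution (~6% of the pixels)
        scale = 0.25
    width, height = page.size
    return page.resize((max(1, int(width * scale)), max(1, int(height * scale))))

# A freshly rendered 1024x1024 page vs. the same page 200 turns later.
page = Image.new("L", (1024, 1024), color=255)
print(store_page(page, age_in_turns=2).size)    # (1024, 1024)
print(store_page(page, age_in_turns=200).size)  # (256, 256)
```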

Text tokens have long been the default building block in AI systems. Using visual tokens instead is unconventional, and as a result, DeepSeek’s model is quickly capturing researchers’ attention. Andrej Karpathy, the former Tesla AI chief and a founding member of OpenAI, praised the paper on X, saying that images may ultimately be better than text as inputs for LLMs. Text tokens might be “wasteful and just terrible at the input,” he wrote. 

Manling Li, an assistant professor of computer science at Northwestern University, says the paper offers a new framework for addressing the existing challenges in AI memory. “While the idea of using image-based tokens for context storage isn’t entirely new, this is the first study I’ve seen that takes it this far and shows it might actually work,” Li says.

The method could open up new possibilities in AI research and applications, especially in creating more useful AI agents, says Zihan Wang, a PhD candidate at Northwestern University. He believes that since conversations with AI are continuous, this approach could help models remember more and assist users more effectively.

The technique can also be used to produce more training data for AI models. Model developers are currently grappling with a severe shortage of quality text to train systems on. But the DeepSeek paper says that the company’s OCR system can generate over 200,000 pages of training data a day on a single GPU.

The model and paper, however, are only an early exploration of using image tokens rather than text tokens for AI memorization. Li says she hopes to see visual tokens applied not just to memory storage but also to reasoning. Future work, she says, should explore how to make AI’s memory fade in a more dynamic way, akin to how we can recall a life-changing moment from years ago but forget what we ate for lunch last week. Currently, even with DeepSeek’s methods, AI tends to forget and remember in a very linear way—recalling whatever was most recent, but not necessarily what was most important, she says. 

Despite its attempts to keep a low profile, DeepSeek, based in Hangzhou, China, has built a reputation for pushing the frontier in AI research. The company shocked the industry at the start of this year with the release of DeepSeek-R1, an open-source reasoning model that rivaled leading Western systems in performance despite using far fewer computing resources. 

The AI Hype Index: Data centers’ neighbors are pivoting to power blackouts

Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry.

Just about all businesses these days seem to be pivoting to AI, even when they don’t seem to know exactly why they’re investing in it—or even what it really does. “Optimization,” “scaling,” and “maximizing efficiency” are convenient buzzwords bandied about to describe what AI can achieve in theory, but for most of AI companies’ eager customers, the hundreds of billions of dollars they’re pumping into the industry aren’t adding up. And maybe they never will.

This month’s news doesn’t exactly cast the technology in a glowing light either. A bunch of NGOs and aid agencies are using AI models to generate images of fake suffering people to guilt their Instagram followers. AI translators are pumping out low-quality Wikipedia pages in the languages most vulnerable to going extinct. And thanks to the construction of new AI data centers, lots of people living in their shadows are being forced into their own sort of pivot—fighting back against the power blackouts and water shortages the data centers cause. How’s that for optimization?

Building a high performance data and AI organization (2nd edition)

Four years is a lifetime when it comes to artificial intelligence. Since the first edition of this study was published in 2021, AI’s capabilities have been advancing at speed, and the advances have not slowed since generative AI’s breakthrough. For example, multimodality—the ability to process information not only as text but also as audio, video, and other unstructured formats—is becoming a common feature of AI models. AI’s capacity to reason and act autonomously has also grown, and organizations are now starting to work with AI agents that can do just that.

Amid all the change, there remains a constant: the quality of an AI model’s outputs is only ever as good as the data that feeds it. Data management technologies and practices have also been advancing, but the second edition of this study suggests that most organizations are not leveraging those fast enough to keep up with AI’s development. As a result of that and other hindrances, relatively few organizations are delivering the desired business results from their AI strategy. No more than 2% of senior executives we surveyed rate their organizations highly in terms of delivering results from AI.

To determine the extent to which organizational data performance has improved as generative AI and other AI advances have taken hold, MIT Technology Review Insights surveyed 800 senior data and technology executives. We also conducted in-depth interviews with 15 technology and business leaders.

Key findings from the report include the following:

Few data teams are keeping pace with AI. Organizations are doing no better today at delivering on data strategy than in pre-generative AI days. Among those surveyed in 2025, 12% are self-assessed data “high achievers” compared with 13% in 2021. Shortages of skilled talent remain a constraint, but teams also struggle with accessing fresh data, tracing lineage, and dealing with security complexity—important requirements for AI success.

Partly as a result, AI is not fully firing yet. There are even fewer “high achievers” when it comes to AI. Just 2% of respondents rate their organizations’ AI performance highly today in terms of delivering measurable business results. In fact, most are still struggling to scale generative AI. While two thirds have deployed it, only 7% have done so widely.

Download the report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff. It was researched, designed, and written by human writers, editors, analysts, and illustrators. AI tools that may have been used were limited to secondary production processes that passed thorough human review.

An AI adoption riddle

A few weeks ago, I set out on what I thought would be a straightforward reporting journey. 

After years of momentum for AI—even if you didn’t think it would be good for the world, you probably thought it was powerful enough to take seriously—hype for the technology had been slightly punctured. First there was the underwhelming release of GPT-5 in August. Then a report released two weeks later found that 95% of generative AI pilots were failing, which caused a brief stock market panic. I wanted to know: Which companies are spooked enough to scale back their AI spending?

I searched and searched for them. As I did, more news fueled the idea of an AI bubble that, if popped, would spell doom economy-wide. Stories spread about the circular nature of AI spending, about layoffs, and about companies’ inability to articulate what exactly AI will do for them. Even the smartest people building modern AI systems were saying the tech had not progressed as much as its evangelists promised. 

But after all my searching, companies that took these developments as a sign to perhaps not go all in on AI were nowhere to be found. Or, at least, none that were willing to admit it. What gives? 

There are several interpretations of this one reporter’s quest (which, for the record, I’m presenting as an anecdote and not a representation of the economy), but let’s start with the easy ones. First is that this is a huge score for the “AI is a bubble” believers. What is a bubble if not a situation where companies continue to spend relentlessly even in the face of worrying news? The other is that underneath the bad headlines, there’s not enough genuinely troubling news about AI to convince companies they should pivot.

But it could also be that the unbelievable speed of AI progress and adoption has made me think industries are more sensitive to news than they perhaps should be. I spoke with Martha Gimbel, who leads the Yale Budget Lab and coauthored a report finding that AI has not yet changed anyone’s jobs. What I gathered is that Gimbel, like many economists, thinks on a longer time scale than anyone in the AI world is used to. 

“It would be historically shocking if a technology had had an impact as quickly as people thought that this one was going to,” she says. In other words, perhaps most of the economy is still figuring out what the hell AI even does, not deciding whether to abandon it. 

The other reaction I heard—particularly from the consultant crowd—is that when executives hear that so many AI pilots are failing, they indeed take it very seriously. They’re just not reading it as a failure of the technology itself. They instead point to pilots not moving quickly enough, companies lacking the right data to build better AI, or a host of other strategic reasons.

Even though there is incredible pressure, especially on public companies, to invest heavily in AI, a few have taken big swings on the technology only to pull back. The buy now, pay later company Klarna laid off staff and paused hiring in 2024, claiming it could use AI instead. Less than a year later it was hiring again, explaining that “AI gives us speed. Talent gives us empathy.” 

Drive-throughs, from McDonald’s to Taco Bell, ended pilots testing the use of AI voice assistants. The vast majority of Coca-Cola advertisements, according to experts I spoke with, are not made with generative AI, despite the company’s $1 billion promise. 

So for now, the question remains unanswered: Are there companies out there rethinking how much their bets on AI will pay off, or when? And if there are, what’s keeping them from talking out loud about it? (If you’re out there, email me!)

“We will never build a sex robot,” says Mustafa Suleyman


Mustafa Suleyman, CEO of Microsoft AI, is trying to walk a fine line. On the one hand, he thinks that the industry is taking AI in a dangerous direction by building chatbots that present as human: He worries that people will be tricked into seeing life instead of lifelike behavior. In August, he published a much-discussed post on his personal blog that urged his peers to stop trying to make what he called “seemingly conscious artificial intelligence,” or SCAI.

On the other hand, Suleyman runs a product shop that must compete with those peers. Last week, Microsoft announced a string of updates to its Copilot chatbot, designed to boost its appeal in a crowded market in which customers can pick and choose between a pantheon of rival bots that already includes ChatGPT, Perplexity, Gemini, Claude, DeepSeek, and more.

I talked to Suleyman about the tension at play when it comes to designing our interactions with chatbots and his ultimate vision for what this new technology should be.

One key Copilot update is a group-chat feature that lets multiple people talk to the chatbot at the same time. A big part of the idea seems to be to stop people from falling down a rabbit hole in a one-on-one conversation with a yes-man bot. Another feature, called Real Talk, lets you tailor how much Copilot pushes back, dialing down the sycophancy so that the chatbot challenges what you say more often.

Copilot also got a memory upgrade, so that it can now remember your upcoming events or long-term goals and bring up things that you told it in past conversations. And then there’s Mico, an animated yellow blob—a kind of Chatbot Clippy—that Microsoft hopes will make Copilot more accessible and engaging for new and younger users.  

Microsoft says the updates were designed to make Copilot more expressive, engaging, and helpful. But I’m curious how far those features can be pushed without starting down the SCAI path that Suleyman has warned about.  

Suleyman’s concerns about SCAI come at a time when we are starting to hear more and more stories about people being led astray by chatbots that are too engaging, too expressive, too helpful. OpenAI is being sued by the parents of a teenager who they allege was talked into killing himself by ChatGPT. There’s even a growing scene that celebrates romantic relationships with chatbots.

With all that in mind, I wanted to dig a bit deeper into Suleyman’s views. Because a couple of years ago he gave a TED Talk in which he told us that the best way to think about AI is as a new kind of digital species. Doesn’t that kind of hype feed the misperceptions Suleyman is now concerned about?  

In our conversation, Suleyman told me what he was trying to get across in that TED Talk, why he really believes SCAI is a problem, and why Microsoft would never build sex robots (his words). He had a lot of answers, but he left me with more questions.

Our conversation has been edited for length and clarity.

In an ideal world, what kind of chatbot do you want to build? You’ve just launched a bunch of updates to Copilot. How do you get the balance right when you’re building a chatbot that has to compete in a market in which people seem to value humanlike interaction, but you also say you want to avoid seemingly conscious AI?

It’s a good question. With group chat, this will be the first time that a large group of people will be able to speak to an AI at the same time. It really is a way of emphasizing that AIs shouldn’t be drawing you out of the real world. They should be helping you to connect, to bring in your family, your friends, to have community groups, and so on.

That is going to become a very significant differentiator over the next few years. My vision of AI has always been one where an AI is on your team, in your corner.

This is a very simple, obvious statement, but it isn’t about exceeding and replacing humanity—it’s about serving us. That should be the test of technology at every step. Does it actually, you know, deliver on the quest of civilization, which is to make us smarter and happier and more productive and healthier and stuff like that?

So we’re just trying to build features that constantly remind us to ask that question, and remind our users to push us on that issue.

Last time we spoke, you told me that you weren’t interested in making a chatbot that would role-play personalities. That’s not true of the wider industry. Elon Musk’s Grok is selling that kind of flirty experience. OpenAI has said it’s interested in exploring new adult interactions with ChatGPT. There’s a market for that. And yet this is something you’ll just stay clear of?

Yeah, we will never build sex robots. Sad in a way that we have to be so clear about that, but that’s just not our mission as a company. The joy of being at Microsoft is that for 50 years, the company has built, you know, software to empower people, to put people first.

Sometimes, as a result, that means the company moves slower than other startups and is more deliberate and more careful. But I think that’s a feature, not a bug, in this age, when being attentive to potential side effects and longer-term consequences is really important.

And that means what, exactly?

We’re very clear on, you know, trying to create an AI that fosters a meaningful relationship. It’s not that it’s trying to be cold and anodyne—it cares about being fluid and lucid and kind. It definitely has some emotional intelligence.

So where does it—where do you—draw those boundaries?

Our newest chat model, which is called Real Talk, is a little bit more sassy. It’s a bit more cheeky, it’s a bit more fun, it’s quite philosophical. It’ll happily talk about the big-picture questions, the meaning of life, and so on. But if you try and flirt with it, it’ll push back and it’ll be very clear—not in a judgmental way, but just, like: “Look, that’s not for me.”

There are other places where you can go to get that kind of experience, right? And I think that’s just a decision we’ve made as a company.

Is a no-flirting policy enough? Because if the idea is to stop people even imagining an entity, a consciousness, behind the interactions, you could still get that with a chatbot that wanted to keep things SFW. You know, I can imagine some people seeing something that’s not there even with a personality that’s saying, hey, let’s keep this professional.

Here’s a metaphor to try to make sense of it. We hold each other accountable in the workplace. There’s an entire architecture of boundary management, which essentially sculpts human behavior to fit a mold that’s functional and not irritating.

The same is true in our personal lives. The way that you interact with your third cousin is very different to the way you interact with your sibling. There’s a lot to learn from how we manage boundaries in real human interactions.

It doesn’t have to be either a complete open book of emotional sensuality or availability—drawing people into a spiraled rabbit hole of intensity—or, like, a cold dry thing. There’s a huge spectrum in between, and the craft that we’re learning as an industry and as a species is to sculpt these attributes.

And those attributes obviously reflect the values of the companies that design them. And I think that’s where Microsoft has a lot of strengths, because our values are pretty clear, and that’s what we’re standing behind.

A lot of people seem to like personalities. Some of the backlash to GPT-5, for example, was because the previous model’s personality had been taken away. Was it a mistake for OpenAI to have put a strong personality there in the first place, to give people something that they then missed?

No, personality is great. My point is that we’re trying to sculpt personality attributes in a more fine-grained way, right?

Like I said, Real Talk is a cool personality. It’s quite different to normal Copilot. We are also experimenting with Mico, which is this visual character, that, you know, people—some people—really love. It’s much more engaging. It’s easier to talk to about all kinds of emotional questions and stuff.

I guess this is what I’m trying to get straight. Features like Mico are meant to make Copilot more engaging and nicer to use, but it seems to go against the idea of doing whatever you can to stop people thinking there’s something there that you are actually having a friendship with.

Yeah. I mean, it doesn’t stop you necessarily. People want to talk to somebody, or something, that they like. And we know that if your teacher is nice to you at school, you’re going to be more engaged. The same with your manager, the same with your loved ones. And so emotional intelligence has always been a critical part of the puzzle, so it’s not to say that we don’t want to pursue it.

It’s just that the craft is in trying to find that boundary. And there are some things which we’re saying are just off the table, and there are other things which we’re going to be more experimental with. Like, certain people have complained that they don’t get enough pushback from Copilot—they want it to be more challenging. Other people aren’t looking for that kind of experience—they want it to be a basic information provider. The task for us is just learning to disentangle what type of experience to give to different people.

I know you’ve been thinking about how people engage with AI for some time. Was there an inciting incident that made you want to start this conversation in the industry about seemingly conscious AI?

I could see that there was a group of people emerging in the academic literature who were taking the question of moral consideration for artificial entities very seriously. And I think it’s very clear that if we start to do that, it would detract from the urgent need to protect the rights of many humans that already exist, let alone animals.

If you grant AI rights, that implies—you know—fundamental autonomy, and it implies that it might have free will to make its own decisions about things. So I’m really trying to frame a counter to that, which is that it won’t ever have free will. It won’t ever have complete autonomy like another human being.

AI will be able to take actions on our behalf. But these models are working for us. You wouldn’t want a pack of, you know, wolves wandering around that weren’t tame and that had complete freedom to go and compete with us for resources and weren’t accountable to humans. I mean, most people would think that was a bad idea and that you would want to go and kill the wolves.

Okay. So the idea is to stop some movement that’s calling for AI welfare or rights before it even gets going, by making sure that we don’t build AI that appears to be conscious? What about not building that kind of AI because certain vulnerable people may be tricked by it in a way that may be harmful? I mean, those seem to be two different concerns.

I think the test is going to be in the kinds of features the different labs put out and in the types of personalities that they create. Then we’ll be able to see how that’s affecting human behavior.

But is it a concern of yours that we are building a technology that might trick people into seeing something that isn’t there? I mean, people have claimed they’ve seen sentience inside far less sophisticated models than we have now. Or is that just something that some people will always do?

It’s possible. But my point is that a responsible developer has to do our best to try and detect these patterns emerging in people as quickly as possible and not take it for granted that people are going to be able to disentangle those kinds of experiences themselves.

When I read your post about seemingly conscious AI, I was struck by a line that says: “We must build AI for people; not to be a digital person.” It made me think of a TED Talk you gave last year where you say that the best way to think about AI is as a new kind of digital species. Can you help me understand why talking about this technology as a digital species isn’t a step down the path of thinking about AI models as digital persons or conscious entities?

I think the difference is that I’m trying to offer metaphors that make it easier for people to understand where things might be headed, and therefore how to avert that and how to control it.

Okay.

It’s not to say that we should do those things. It’s just pointing out that this is the emergence of a technology which is unique in human history. And if you just assume that it’s a tool or just a chatbot or a dumb— you know, I kind of wrote that TED Talk in the context of a lot of skepticism. And I think it’s important to be clear-eyed about what’s coming so that one can think about the right guardrails.

And yet, if you’re telling me this technology is a new digital species, I have some sympathy for the people who say, well, then we need to consider welfare.

I wouldn’t. [He starts laughing.] Just not in the slightest. No way. It’s not a direction that any of us want to go in.

No, that’s not what I meant. I don’t think chatbots should have welfare. I’m saying I’d have some sympathy for where such people were coming from when they hear, you know, Mustafa Suleyman tell them that this thing he’s building was a new digital species. I’d understand why they might then say that they wanted to stand up for it. I’m saying the words we use matter, I guess.

The rest of the TED Talk was all about how to contain AI and how not to let this species take over, right? That was the whole point of setting it up as, like, this is what’s coming. I mean, that’s what my whole book [The Coming Wave, published in 2023] was about—containment and alignment and stuff like that. There’s no point in pretending that it’s something that it’s not and then building guardrails and boundaries that don’t apply because you think it’s just a tool.

Honestly, it does have the potential to recursively self-improve. It does have the potential to set its own goals. Those are quite profound things. No other technology we’ve ever invented has that. And so, yeah, I think that it is accurate to say that it’s like a digital species, a new digital species. That’s what we’re trying to restrict to make sure it’s always in service of people. That’s the target for containment.

Finding return on AI investments across industries

The market is now officially three years past the launch of ChatGPT, and many pundit bylines have shifted to terms like “bubble” to explain why generative AI has not delivered material returns outside a handful of technology suppliers. 

In September, the MIT NANDA report made waves because the soundbite every author and influencer picked up on was that 95% of AI pilots fail to scale or deliver clear, measurable ROI. McKinsey had earlier published similar findings, suggesting that agentic AI would be the way forward to achieve large operational benefits for enterprises. And at The Wall Street Journal’s Technology Council Summit, AI technology leaders recommended that CIOs stop worrying about AI’s return on investment, arguing that gains are difficult to measure and that any attempt to measure them would get the numbers wrong. 

This places technology leaders in a precarious position—robust tech stacks already sustain their business operations, so what is the upside to introducing new technology? 

For decades, deployment strategies have followed a consistent cadence: tech operators avoid destabilizing business-critical workflows just to swap out individual components of their tech stacks. A better or cheaper technology is not meaningful, for example, if it puts your disaster recovery at risk. 

The price might rise when a new buyer takes over mature middleware, but losing part of your enterprise data midway through a transition to a new technology is far more costly than paying a premium for a stable technology you have run your business on for 20 years.

So, how do enterprises get a return on investing in the latest tech transformation?

First principle of AI: Your data is your value

Most of the articles about AI data relate to engineering tasks to ensure that an AI model infers against business data in repositories that represent past and present business realities. 

However, one of the most widely deployed use cases in enterprise AI begins with uploading file attachments along with a prompt. Doing so narrows the model’s focus to the content of the uploaded files, speeding up accurate responses and reducing the number of prompts needed to get the best answer. 
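As a rough sketch of this pattern, the snippet below approximates a file attachment by inlining a document’s text into the prompt, assuming the openai Python client; the model name, file path, and prompt wording are placeholders rather than a recommendation of any particular vendor setup.

```python
# Sketch of grounding a prompt in a document's contents (placeholders throughout;
# this inlines the file rather than using any vendor's attachment feature).
from pathlib import Path
from openai import OpenAI

client = OpenAI()

def ask_about_document(doc_path: str, question: str) -> str:
    """Constrain the model to answer only from the supplied document."""
    doc_text = Path(doc_path).read_text(encoding="utf-8")
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder model name
        messages=[
            {"role": "system", "content":
                "Answer using only the document provided. "
                "If the answer is not in the document, say so."},
            {"role": "user", "content": f"Document:\n{doc_text}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

# Example with a hypothetical file:
# print(ask_about_document("q3_contract_summary.txt", "What are the renewal terms?"))
```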

This tactic relies on sending your proprietary business data to an AI model, so there are two important considerations to address in parallel with data preparation: first, governing your system for appropriate confidentiality; and second, developing a deliberate negotiation strategy with model vendors, who cannot advance their frontier models without access to non-public data such as yours. 

Recently, Anthropic and OpenAI completed massive deals with enterprise data platforms and owners because there is not enough high-value primary data publicly available on the internet. 

Most enterprises would automatically prioritize the confidentiality of their data and design business workflows to protect trade secrets. From an economic point of view, though, and especially considering how costly every model API call really is, exchanging selective access to your data for services or price offsets may be the right strategy. Rather than treating model purchasing and onboarding as a typical supplier/procurement exercise, think through the potential for mutual benefit: advancing your supplier’s model and your business’s adoption of it in tandem.

Second principle of AI: Boring by design

According to Information is Beautiful, 182 new generative AI models were introduced to the market in 2024 alone. When GPT-5 arrived in 2025, many models released 12 to 24 months earlier were rendered unavailable until subscription customers threatened to cancel: their previously stable AI workflows were built on models that no longer worked. Their tech providers assumed customers would be excited about the newest models and did not realize the premium that business workflows place on stability. Video gamers, by contrast, are happy to upgrade their custom builds throughout the lifespan of their gaming rigs, and will overhaul an entire system just to play a newly released title. 

That behavior, however, does not translate to run-rate business operations. While many employees may use the latest models for document processing or generating content, back-office operations can’t sustain swapping out a tech stack three times a week to keep up with the latest model drops. Back-office work is boring by design.

The most successful AI deployments have focused on business problems unique to the organization, often running in the background to accelerate or augment mundane but mandated tasks. Relieving legal or expense auditors of manually cross-checking individual reports, while keeping the final decision in a human’s hands, combines the best of both. 

The important point is that none of these tasks require constant updates to the latest model to deliver value. This is also an area where abstracting your business workflows from direct model APIs can offer additional long-term stability while preserving the option to update or upgrade the underlying engines at the pace of your business.
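As a sketch of what that abstraction can look like, the snippet below puts a thin internal interface between workflows and whatever engine sits behind it; the interface and class names are hypothetical, not an existing library.

```python
# Hypothetical abstraction layer: workflows call ModelClient, never a vendor SDK
# directly, so the underlying engine can be swapped without rewriting workflows.
from typing import Protocol

class ModelClient(Protocol):
    def generate(self, prompt: str) -> str: ...

class VendorModelClient:
    """Wraps a hosted model API behind the internal interface (details omitted)."""
    def generate(self, prompt: str) -> str:
        raise NotImplementedError("call the vendor SDK here")

class CannedTestClient:
    """Stub used in tests and demos; returns a fixed reply."""
    def generate(self, prompt: str) -> str:
        return "2 expense lines exceed policy limits; flagged for human review."

def triage_expense_report(report_text: str, model: ModelClient) -> str:
    # The workflow depends only on ModelClient, so updating or swapping the
    # engine is a configuration change, not a code change.
    return model.generate("List policy exceptions in this report:\n" + report_text)

print(triage_expense_report("Hotel: $900/night ...", CannedTestClient()))
```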

Third principle of AI: Mini-van economics

The best way to avoid upside-down economics is to design systems to align to the users rather than vendor specs and benchmarks. 

Too many businesses continue to fall into the trap of buying new gear or new cloud service types based on new supplier-led benchmarks rather than starting their AI journey from what their business can consume, at what pace, on the capabilities they have deployed today. 

While Ferrari marketing is effective and those automobiles are truly magnificent, they drive the same speed through school zones and lack trunk space for groceries. Keep in mind that every remote server and model a user touches layers on cost, so design for frugality by reconfiguring workflows to minimize spending on third-party services. 

Too many companies have found that their customer-support AI workflows add millions of dollars in operational run-rate costs, and then add more development time and cost to rework the implementation for OpEx predictability. Meanwhile, companies that decided a system running at the pace a human can read—less than 50 tokens per second—was fast enough have been able to deploy scaled-out AI applications with minimal additional overhead.
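A minimal sketch of that design choice: cap the rate at which output streams to a user at roughly human reading speed, so serving capacity (and spend) can be sized to that pace rather than to peak model throughput. The 50-tokens-per-second figure comes from the paragraph above; the rest is illustrative.

```python
# Illustrative throttle: emit tokens no faster than a human reads, so serving
# capacity can be sized to ~50 tokens/second per user rather than peak throughput.
import time
from typing import Iterable, Iterator

def paced(tokens: Iterable[str], max_tokens_per_second: float = 50.0) -> Iterator[str]:
    interval = 1.0 / max_tokens_per_second
    for token in tokens:
        yield token
        time.sleep(interval)  # simple fixed pacing; real code would track elapsed time

# Example: stream a canned support reply at reading speed.
for tok in paced("Your refund was issued on the 14th and should post within 3 days.".split()):
    print(tok, end=" ", flush=True)
print()
```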

There are many aspects of this new automation technology to unpack. The best guidance is to start practical, design for independence from underlying technology components so that stable applications are not disrupted over the long term, and leverage the fact that AI makes your business data valuable to the advancement of your tech suppliers’ goals.

This content was produced by Intel. It was not written by MIT Technology Review’s editorial staff.

Redefining data engineering in the age of AI

As organizations weave AI into more of their operations, senior executives are realizing data engineers hold a central role in bringing these initiatives to life. After all, AI only delivers when you have large amounts of reliable and well-managed, high-quality data. Indeed, this report finds that data engineers play a pivotal role in their organizations as enablers of AI. And in so doing, they are integral to the overall success of the business.

According to the results of a survey of 400 senior data and technology executives, conducted by MIT Technology Review Insights, data engineers have become influential in areas that extend well beyond their traditional remit as pipeline managers. The technology is also changing how data engineers work, with the balance of their time shifting from core data management tasks toward AI-specific activities.

As their influence grows, so do the challenges data engineers face. A major one is dealing with greater complexity, as more advanced AI models elevate the importance of managing unstructured data and real-time pipelines. Another challenge is managing expanding workloads; data engineers are being asked to do more today than ever before, and that’s not likely to change.

Key findings from the report include the following:

  • Data engineers are integral to the business. This is the view of 72% of the surveyed technology leaders—and 86% of those in the survey’s biggest organizations, where AI maturity is greatest. It is a view held especially strongly among executives in financial services and manufacturing companies.
  • AI is changing everything data engineers do. The share of time data engineers spend each day on AI projects has nearly doubled in the past two years, from an average of 19% in 2023 to 37% in 2025, according to our survey. Respondents expect this figure to continue rising to an average of 61% in two years’ time. This is also contributing to bigger data engineer workloads; most respondents (77%) see these growing increasingly heavy.

Download the report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

This content was researched, designed, and written by human writers, editors, analysts, and illustrators. This includes the writing of surveys and collection of data for surveys. AI tools that may have been used were limited to secondary production processes that passed thorough human review.

Dispatch: Partying at one of Africa’s largest AI gatherings

It’s late August in Rwanda’s capital, Kigali, and people are filling a large hall at one of Africa’s biggest gatherings of minds in AI and machine learning. The room is draped in white curtains, and a giant screen blinks with videos created with generative AI. A classic East African folk song by the Tanzanian singer Saida Karoli plays loudly on the speakers.

Friends greet each other as waiters serve arrowroot crisps and sugary mocktails. A man and a woman wearing leopard skins atop their clothes sip beer and chat; many women are in handwoven Ethiopian garb with red, yellow, and green embroidery. The crowd teems with life. “The best thing about the Indaba is always the parties,” computer scientist Nyalleng Moorosi tells me. Indaba means “gathering” in Zulu, and Deep Learning Indaba, where we’re meeting, is an annual AI conference where Africans present their research and technologies they’ve built.

Moorosi is a senior researcher at the Distributed AI Research Institute and has dropped in for the occasion from the mountain kingdom of Lesotho. Dressed in her signature “Mama Africa” headwrap, she makes her way through the crowded hall.

Moments later, a cheerful set of Nigerian music begins to play over the speakers. Spontaneously, people pop up and gather around the stage, waving flags of many African nations. Moorosi laughs as she watches. “The vibe at the Indaba—the community spirit—is really strong,” she says, clapping.

Moorosi is one of the founding members of the Deep Learning Indaba, which began in 2017 from a nucleus of 300 people gathered in Johannesburg, South Africa. Since then, the event has expanded into a prestigious pan-African movement with local chapters in 50 countries.

This year, nearly 3,000 people applied to join the Indaba; about 1,300 were accepted. They hail primarily from English-speaking African countries, but this year I noticed a new influx from Chad, Cameroon, the Democratic Republic of Congo, South Sudan, and Sudan. 

Moorosi tells me that the main “prize” for many attendees is to be hired by a tech company or accepted into a PhD program. Indeed, the organizations I’ve seen at the event include Microsoft Research’s AI for Good Lab, Google, the Mastercard Foundation, and the Mila–Quebec AI Institute. But she hopes to see more homegrown ventures create opportunities within Africa.

That evening, before the dinner, we’d both attended a panel on AI policy in Africa. Experts discussed AI governance and called for those developing national AI strategies to seek more community engagement. People raised their hands to ask how young Africans could access high-level discussions on AI policy, and whether Africa’s continental AI strategy was being shaped by outsiders. Later, in conversation, Moorosi told me she’d like to see more African priorities (such as African Union–backed labor protections, mineral rights, or safeguards against exploitation) reflected in such strategies. 

On the last day of the Indaba, I ask Moorosi about her dreams for the future of AI in Africa. “I dream of African industries adopting African-built AI products,” she says, after a long moment. “We really need to show our work to the world.” 

Abdullahi Tsanni is a science writer based in Senegal who specializes in narrative features.