Musk v. Altman week 2: OpenAI fires back, and Shivon Zilis reveals that Musk tried to poach Sam Altman

In the second week of the landmark trial between Elon Musk and OpenAI, Musk’s motivations for bringing the suit were under scrutiny.

Last week, Musk took the stand, alleging that OpenAI CEO Sam Altman and president Greg Brockman had deceived him into donating $38 million to the company. He claimed that they’d promised to maintain it as a nonprofit dedicated to developing AI for the benefit of humanity, only to later accept billions of dollars of investment from Microsoft and restructure the company to operate a for-profit subsidiary.  

This week, Brockman fired back with his side of the story, arguing that Musk had actually pushed for OpenAI to create a for-profit arm and fought a bitter battle to have “absolute control” over it. OpenAI has argued that Musk is suing because he didn’t get his way and is now trying to undermine a competitor to his own AI company, xAI.

Shivon Zilis, a former OpenAI board member and the mother of four of Musk’s children, also testified, revealing that Musk tried to recruit OpenAI CEO Sam Altman to lead a new AI lab at his electric-car company, Tesla. 

Musk cofounded OpenAI in 2015 with Altman, Brockman, and others but left in 2018. Now, he’s asking the court to remove Altman and Brockman from their roles and to unwind the restructuring OpenAI undertook last year, which converted its for-profit subsidiary into a public benefit corporation. He is also seeking as much as $134 billion in damages from OpenAI and Microsoft, OpenAI’s investor. 

The outcome of the trial could upend OpenAI’s race toward an IPO at a valuation approaching $1 trillion. Meanwhile, xAI, which Musk founded in 2023, is now a division of his rocket company, SpaceX; the combined companies are also expected to go public as early as June, at a target valuation of $1.75 trillion.

On Monday, Brockman walked into the courtroom in a blue suit and tie, holding hands with his wife, Anna Brockman. On the stand, he was serene, even chipper, as he recalled OpenAI’s early days. But he grew agitated under impassioned questioning from Elon Musk’s lawyer, Steven Molo. Altman listened in silence, while Anna Brockman sat behind him, fidgeting. Outside the courthouse, protesters rallying against the AI race sang hymns over the voices of lawyers giving press conferences.

Two days before trial began, according to Brockman, Musk messaged him to ask if he would be interested in settling. When Brockman suggested that both sides drop their claims, Musk texted back: “By the end of this week, you and Sam will be the most hated men in America. If you insist, so it will be.”

Musk stormed out with a Tesla painting

Last week, Musk testified that he’s suing to save OpenAI’s nonprofit mission to develop AI safely, but he said he was open to seeing OpenAI become a capped-profit company with moderate investments from Microsoft.

This week, Brockman told the jury that Musk was never truly committed to keeping OpenAI a nonprofit. In the summer of 2017, when an AI model that OpenAI built beat the world’s best players in a video game called Dota 2, Musk hosted a gathering at his “Haunted Mansion” near San Francisco. The house was splattered with confetti and cups, Brockman recalled, and the actress Amber Heard, who was Musk’s girlfriend at the time, served whiskey.

“Time to make the next step for OpenAI. This is the triggering event,” Musk wrote in an email—having said weeks earlier that if OpenAI made a major public achievement, it would be “time to create a for-profit,” Brockman told the jury.

Over the next six weeks, Brockman said, Musk and the other cofounders had intense discussions about creating a for-profit entity to raise enough capital to build artificial general intelligence—powerful AI that can compete with humans on most cognitive tasks. Musk wanted to have majority equity in the entity and the right to choose a majority of the board members. He also wanted to be its CEO, said Brockman. 

Brockman testified that in August 2017, he and other cofounders gathered to hash out the terms of the for-profit structure. Ilya Sutskever, OpenAI’s chief scientist at the time, arrived bearing a painting of a Tesla as a “token of goodwill” in return for the actual Teslas Musk had given them days earlier. “It felt a little bit like [Musk] was buttering us up, right, that he wanted us to feel indebted to him,” Brockman told the jury.

When Brockman and Sutskever proposed that they all have equal shares of equity, said Brockman, Musk fell silent and finally said, “I decline.” Musk then stood up and “stormed around the table,” he said. “I actually thought he was going to hit me.” Musk grabbed the painting and walked out. 

Brockman said that afterwards he struggled to decide whether to continue building OpenAI with Musk or break away. “There was a fork in the road,” he said. “Do we accept Elon’s terms? Or do we reject the terms, he quits to create his own, and then we create our own?”

“The one thing we could not accept was to hand him unilateral, absolute control, potentially, over the AGI,” Brockman told the jury.

What was Brockman thinking?

In his theatrical baritone, Molo argued that Brockman was motivated by greed rather than a commitment to OpenAI’s nonprofit mission to develop AI that benefits humanity. He noted that while Brockman never invested money in the company, he now owns a stake worth close to $30 billion. 

“Solving for the mission has always been my primary motivation,” Brockman said, pushing back on Molo’s characterization of him. “It remains so today.” 

Molo pulled up Brockman’s electronic journal on a screen in the courtroom, trying to show the jury what Brockman was really thinking behind the scenes. In 2017, while negotiating with Musk about the future of OpenAI, Brockman wrote about wanting to become a billionaire: “Financially what will take me to $1B?” 

“Why didn’t you take the $29 billion and donate it to the nonprofit that you had a fiduciary duty to, for the good of humanity?” Molo asked Brockman, raising his voice to dramatize moral indignation. 

Molo then pulled up a journal entry Brockman had written in November 2017, while he was torn over whether to turn OpenAI into a for-profit without Musk: “it’d be wrong to steal the nonprofit from him. to convert to a b-corp without him. that’d be pretty morally bankrupt.” Brockman and Musk had previously considered creating a b-corp, which is a for-profit company that pursues a social mission.

Brockman explained, “I meant it would actually serve the mission, but it’d be hard to look at yourself in the mirror.”

Molo also tried to undermine Brockman’s credibility by revealing that he holds a stake in multiple companies with business ties to OpenAI, including the AI company Cerebras, the cloud provider CoreWeave, and the nuclear fusion startup Helion Energy. Altman has tried to steer OpenAI into deals with companies that he invests in, including Helion and the rocket maker Stoke Space, drawing scrutiny over potential conflicts of interest.

Former OpenAI chief technology officer Mira Murati and former OpenAI board member Helen Toner both appeared in video depositions. They addressed the brief firing of Altman in 2023, saying that they could not trust him because of his alleged history of lying. Murati’s text messages with Altman from that time, which were introduced as evidence, revealed his desperate attempts to understand what was happening and regain control. 

Musk plotted a rival AI lab at Tesla

After Brockman’s two days of testimony, Shivon Zilis, who left OpenAI’s board in 2023, took the stand in a black jacket and black jeans, appearing composed but with a flicker of nerves. OpenAI’s lawyer Sarah Eddy asked her in a deceptively soothing voice whether she acted as a conduit for Musk as he tried to poach OpenAI’s cofounders to work at a new AI lab within Tesla. Eddy argued that Musk is suing OpenAI only to undermine a competitor in the AI race. 

Zilis said she met Musk while working at OpenAI as an informal advisor in 2016, and that they had a “one-off” romantic encounter. In 2017, she joined Tesla and Musk’s brain-implant company, Neuralink. In 2020, she joined OpenAI’s board of directors. She became pregnant with Musk’s children through IVF but did not disclose her ties with Musk to OpenAI until Business Insider reported them in 2022. 

By late 2017, Musk had concluded that OpenAI was unlikely to build AGI and pivoted to building an AI lab at Tesla, according to an email sent to Zilis. 

Eddy pulled up a draft of an FAQ document that Zilis emailed a colleague at Tesla in 2017 about an event the company was organizing at the NeurIPS AI conference: “The purpose of this event is to share that Tesla is building a world leading AI lab(?) which will rival the likes of Google/DeepMind and Facebook AI Research.” 

Zilis told the jury that when Musk was still on OpenAI’s board, he tried to recruit Altman to lead that prospective AI lab. Musk had asked Andrej Karpathy, an OpenAI research scientist he’d recruited to work at Tesla, “to send a list of top OpenAI people to poach,” according to a text message by Zilis. 

“There is little chance of OpenAI being a serious force if I focus on TeslaAI,” Musk texted Zilis in 2018, just before he left OpenAI. Tesla’s AI lab never came to fruition.

Eddy pressed Zilis about whom she was loyal to when she was working for OpenAI and Musk at the same time. “I had an allegiance to the best outcome for AI for humanity,” Zilis told the jury.

What’s going on next week?

Next week, Ilya Sutskever will testify, as will Microsoft CEO Satya Nadella. The lawyers for both Musk and OpenAI will deliver their closing arguments. The jury will begin deliberating the week after and deliver an advisory verdict guiding the judge to decide the case.

This story is part of MIT Technology Review’s ongoing coverage of the Musk v. Altman trial. Follow @techreview or @michelletomkim on X for up-to-the-minute reporting.

A blueprint for using AI to strengthen democracy

Every few centuries, changes in how information moves reshape how societies govern themselves. The printing press spread vernacular literacy, helping give rise to the Reformation and, eventually, representative government. The telegraph made it possible to administer vast nations like the US, accelerating the growth of the modern bureaucratic state. Broadcast media created shared national audiences, which in turn fueled mass democracy.

We are now in the early stages of another such shift. Faster than many realize, AI is becoming the primary interface through which we form beliefs and participate in democratic self-governance. If left unchecked, this shift could further strain America’s already fragile institutions. But it could also help address long-standing problems, like lagging civic engagement and deepening polarization. What happens next depends on design choices that are already being made, whether we know it or not.

Start with what might be called the epistemic layer—how we come to know things. People are increasingly relying on AI to know what is true, what is happening, and whom to trust. Search is already substantially AI-mediated. The next generation of AI assistants will synthesize information, frame it, and present it with authority. For a growing number of people, asking an AI will become the default way to form views on a candidate, a policy, or a public figure. Whoever controls what these models say therefore has increasing influence over what people believe. 

Technology has always shaped the way citizens interact with information. But a new problem will soon arise in the form of personal AI agents, which can change not only how people receive information but how they act on it. These systems will conduct research, draft communications, highlight causes, and lobby on a user’s behalf. They will inform decisions such as how to vote on a ballot measure, which organizations are worth supporting, or how to respond to a government notice. They will, in a meaningful sense, begin to mediate the relationship between individuals and the institutions that govern them.

We’ve already seen with social media what happens when algorithms optimize for engagement over understanding. Platforms do not need to have an explicit political agenda to produce polarization and radicalization. An agent that knows your preferences and your anxieties—one shaped to keep you engaged—poses the same risks. And in this case the risks may be even more difficult to detect, because an agent presents itself as your advocate. It speaks for you, acts on your behalf, and may earn trust precisely through that intimacy.

Now zoom out to the collective. AI agents and humans could soon participate in the same forums, where it may be impossible to tell them apart. Even if every individual AI agent were well-designed and aligned with its user’s interests, the interactions of millions of agents could produce outcomes that no individual wanted or chose. For example, research shows that agents displaying no individual bias can still generate collective biases at scale. And setting aside what agents do to each other, there is what they do for their users. A public sphere in which everyone has a personalized agent attuned to their existing views is not, in aggregate, a public sphere at all. It is a collection of private worlds, each internally coherent but collectively inhospitable to the kind of shared deliberation that democracy requires.

Taken together, these three transformations—in how we know, how we act, and how we engage in collective governance—amount to a fundamental change in the texture of citizenship. In the near future, people will form their political views through AI filters, exercise their civic agency through AI agents, and participate in institutions and public discussions that are themselves shaped by the interactions of millions of such agents.

Today’s democracy is not ready for this. Our institutions were designed for a world in which power was exercised visibly, information traveled slowly enough to be contested, and reality felt more shared, if imperfectly. All of this was already fraying long before generative AI arrived. And yet this need not be a story of decline. Avoiding that outcome requires us to design for something better.

On the informational layer, AI companies must ramp up existing efforts to ensure that models’ outputs are truthful. They should also explore some promising early findings that AI models can help reduce polarization. A recent field evaluation of AI-generated fact checks on X found that people with a variety of political viewpoints deemed AI-written notes more helpful than human-written ones. The paper is yet to be peer-reviewed, but that is a potentially revolutionary finding: AI-assisted fact-checking may be able to achieve the kind of cross-partisan credibility that has eluded most manual human efforts. Greater understanding of and transparency about how models make these assertions and prioritize sources in the process could help build further public trust.

On the agentic layer, we need ways to evaluate whether AI agents faithfully represent their users. An agent must never have an agenda of its own or misrepresent its user’s views—a technically daunting requirement in domains where users may not have explicitly stated any preferences. But faithful representation also cannot become an accessory to motivated reasoning. An agent that refuses to present uncomfortable information, that shields its user from ever questioning prior beliefs, or that fails to adjust to a change of heart is not acting in the person’s best interest.

Finally, on the institutional level, policymakers should hurry to harness AI’s potential to make governance more responsive and legitimate. Several states and localities are already using AI-mediated platforms to conduct democratic deliberation at scale, building on research showing that AI mediators can help citizens find common ground. As agents become increasingly common participants in public input processes—and there is already evidence that bots are skewing those processes—identity verification for both humans and their agentic proxies must be built in from the start.

What is needed is a new generation of democratic infrastructure, technological and institutional, built for the world that is actually here. Failing to design for democratic outcomes, in a domain this consequential, means designing for something else. And the history of unaccountable power does not leave much room for optimism about what that something else tends to be.

Andrew Sorota and Josh Hendler lead work on AI and democracy at the Office of Eric Schmidt.

Week one of the Musk v. Altman trial: What it was like in the room

This story originally appeared in The Algorithm, our weekly newsletter on AI.

Two of the most powerful people in AI—Sam Altman and Elon Musk—began their face-off in court in Oakland, California, last week. Musk is suing OpenAI, alleging that the millions he spent to fund it around a decade ago were meant for a nonprofit, not a corporation, and that the company has reneged on that mission since. 

The stakes are high—even a partial win for Musk could set OpenAI back as it reportedly plans to go public this year. But most of the attention comes from the spectacle of a feud on X now playing out in federal court. “Cringey texts, raw diary entries, and endless scheming behind the founding and growth of OpenAI are expected to come to light,” my colleague Michelle Kim wrote before it began. And the trial unfolds as the cultural backlash against AI swells; some of the signs held by protesters outside the courthouse suggest that to a significant number of people, whatever the outcome of Musk v. Altman, we all lose.  

Most of us have had to observe the trial from afar, but Michelle, who also happens to be a lawyer, has been in court each day. I caught up with her to learn what’s unfolded thus far and what might come next.

Can you give us the overview of what this case is actually about? What exactly is being decided, and who is favored right now?

Elon Musk is arguing that Sam Altman and OpenAI president Greg Brockman have breached the company’s charitable trust by effectively converting OpenAI into a for-profit company. Musk alleges that is not what they promised him in the company’s early days. He has asked for several remedies, like a crazy amount of damages and removing Sam Altman. But the main remedy he wants is unwinding OpenAI’s restructuring. [In October 2025 OpenAI struck deals with the attorneys general of California and Delaware that would essentially allow its nonprofit portion to have less day-to-day control of OpenAI. It’s a compromise from what OpenAI originally proposed, but Musk still wants to stop it.] 

OpenAI argues that Elon Musk actually agreed to have the company operate a for-profit arm, because he knew building AI is very expensive. So it’s about proving what Musk knew, what he didn’t know, and whether he really was deceived by Altman and Brockman.

There’s a big debate about when exactly Musk found out about this alleged misconduct. Musk founded OpenAI with Altman and Brockman in 2015, and he brought the suit in 2024. There’s a statute of limitations for charitable trust claims; you need to have brought a claim within three to four years after you find out about the alleged misconduct. So Musk tries to paint a picture that back in the day he was a little suspicious, but that it was really only in 2022 that he realized OpenAI was no longer committed to its original charitable mission, and that he had been scammed. It’s only the first week of trial, but I’m not sure Musk has proved this to the judge and jury.

What were some standout moments thus far?

At one point one of Elon Musk’s lawyers said, “We could all die as a result of AI.” I think a lot of the people in the room were really shaken by this comment, and the judge told Musk’s lawyer: You talk about all these safety risks that OpenAI has when building AI, but Musk is also creating a company that’s in the same exact space. She basically said, I’m sure there’s plenty of people who also don’t want to put the future of humanity in Elon Musk’s hands. 

And then the lawyers just kept going on and on about the catastrophic risks of AI and whether Elon Musk or OpenAI was in the better position to steward AI safety. And the judge sort of snapped. She said very sternly that this trial was not about whether or not artificial intelligence has damaged humanity. And I thought that was a really striking standout moment of the trial that pointed at how even though it is technically just about whether Elon Musk was really deceived by OpenAI, it’s also become a huge discussion about AI safety and some of the practices that the labs are engaging in when building AI. 

Can you give us a look behind the curtain at how getting into this trial works?

There are tons of reporters. This is a very high-profile suit, so I have to wake up around 4:30 a.m. and show up to the Oakland courthouse at 6 a.m. sharp to get in line. And on some days, even 6 a.m. doesn’t get you into the courtroom. There are lots of photographers in front of the courthouse, especially on days when you know Musk or Altman and Brockman are present. And there’s also some concerned citizens who want to watch the trial. I usually have to wait, like, two hours in line to get in to be one of the 30 people who claim the unreserved seats in the courtroom. 

What has it felt like to see Elon Musk testify? How would you describe his demeanor?

He shows up in a crisp black suit. He can be this inflammatory person on X, but in the courtroom, he is calm, cool, collected, and looks very comfortable. He has been in a lot of lawsuits. He knows how to talk to the jury and how to present himself in front of them and the judge. He’s also cracking jokes with his lawyer and even the opposing party’s lawyer and the judge. 

And he can be witty. There was this one moment when OpenAI’s lawyer was asking Musk a question and sort of fed him an answer. And Musk said “That’s not a leading question, that’s a leading answer.” The judge intervened and said, “You’re not a lawyer, Elon.” And then he was like, “Well, I did take Law 101.”

That said, he does get flustered and uncomfortable when OpenAI’s lawyer asks tough, piercing questions. Which he’s been doing.

What are the biggest things we’ve learned that weren’t clear in the earlier phases of this case?

On the fourth day of the trial, Musk admitted during cross-examination that xAI distills OpenAI’s models to train its own models, which was shocking. Musk followed up by saying that this is standard practice among all the labs now and that xAI wasn’t doing anything beyond what others were already doing. But a lot of the journalists started typing away at their laptops as soon as Musk made this comment. 

I also learned that there’s just so much scheming among Big Tech executives. You know about it vaguely, but to hear firsthand accounts and read their emails and text messages is fascinating. 

For example, there was a text message between Musk and Mark Zuckerberg of Meta, where they’re kind of teaming up to stop OpenAI’s restructuring. They’re even trying to make a bid to buy all the assets of OpenAI’s nonprofit. The level of scheming that goes on among these executives is mind-blowing.

What happens next?

OpenAI’s president, Greg Brockman, who was meticulously taking notes during some of Elon Musk’s testimony, is expected to testify next week. And Stuart Russell, a computer scientist at UC Berkeley, will testify about AI safety. I’m expecting that to open the floodgates to this crazy discussion about who can be trusted to build AI. 

A bunch of other high-profile people are expected to testify, like former OpenAI chief scientist Ilya Sutskever, former CTO Mira Murati, and Microsoft CEO Satya Nadella. 

The trial is supposed to last around three weeks. The nine jurors will deliver an advisory verdict that guides the judge on how to decide Musk’s claims against OpenAI. The judge doesn’t have to listen to the jury and can decide however she wants. If she decides OpenAI is liable, then she’ll decide what sort of remedies are appropriate. 

MIT Technology Review will have ongoing coverage of Musk v. Altman until its conclusion. Follow @techreview or @michelletomkim on X for up-to-the-minute reporting.

A new US phone network for Christians aims to block porn and gender-related content

A new US-wide cell phone network marketed to Christians is set to launch next week. It blocks porn at the network level, and the block can’t be turned off even by adult account owners, which experts in network security say is a first for a US cell plan. It’s also rolling out a filter on sexual content aimed at blocking material related to gender and trans issues, which will be optional but turned on by default across all plans.

The network, which is currently being tested ahead of its May 5 launch date, will be run by Radiant Mobile, a newly launched mobile virtual network operator (MVNO). These operators don’t own cell towers but buy bandwidth from the big providers (in this case, T-Mobile) and sell to specific demographics (President Trump announced his own MVNO last year called Trump Mobile; CREDO Mobile sends donations to progressive causes).

“We are going to create—and we think we have every right to do so—an environment that is Jesus-centric, that is void of pornography, void of LGBT, void of trans,” Radiant Mobile’s founder, Paul Fisher, told MIT Technology Review. A representative for T-Mobile did not comment on whether these content blocks violate any of its policies. In a statement, the representative added that T-Mobile does not have a direct relationship with Radiant Mobile but instead works through the MVNO manager CompaxDigital. 

Fisher says he’s recruited a mix of Christian influencers to advertise the plan and has also done outreach to thousands of churches around the country, offering a way to have Radiant donate a portion of congregants’ $30-per-month subscription fee to their church. Fisher has ambitions to market it beyond the US in other countries with significant Christian populations, like South Korea and Mexico.

At least one piece of Radiant’s pitch will sound familiar: the idea that the internet is awash in toxic sludge. It’s powered by content and algorithms that are making us more sad, hateful, and detached. A number of efforts aim to fix that, including contentious age verification laws and a coming wave of lawsuits alleging that social media companies knowingly got young users hooked on their platforms. 

Fisher is pursuing the nuclear option. He says Radiant is working with the Israeli cybersecurity company Allot to block categories of content, such as material about violence or self-harm. Some categories are banned by default and cannot be allowed even for adult users. 

This includes pornography. Chris Klimis, a minister in Orlando who was recruited to be the company’s chief operating officer, says part of the reason he got involved was to offer Christians a real way to “do something” about what he sees as a pornography crisis in the faith. He was appalled by a recent survey showing that 67% of pastors have a “personal history” with porn use. And he worries his six children will come across porn on their devices, even if only inadvertently.

“We’ve got to figure out some way to close the door to the digital space,” he says. “That’s what we’re trying to do.”

The technology to do this blocking is a blunt instrument: Allot groups website domains into more than a hundred categories, which include pornography but also violence, malware, gaming, and in Radiant Mobile’s case “sects,” which includes websites about Satanism. If one of its users tries to visit a website that belongs to a blocked category, the page won’t load. That’s harsher than app-based content blockers like Covenant Eyes, a Christian porn-quitting app that sends notifications to your friends or family if you slip up; those can be worked around or deleted.

“Blocking in the network is certainly not new,” says David Choffnes, a computer science professor and executive director of Northeastern University’s Cybersecurity and Privacy Institute. Such blocking is the backbone of censorship efforts by authoritarian governments, for example. But there are more benign ways it’s used too. US telecoms block particular domains known to be spreading malware and offer optional network-level controls to block adult content on kids’ phones. What is new is a US cell plan instituting network-level blocks that can’t be removed, even by adults.

The trouble is that most websites don’t fit neatly into one category, leaving Fisher with enormous and subjective control over which are allowed or banned. This is most apparent in his effort to block content related to gender identity.

Anthony Re, a sales director at Allot, says the company does not have a category specific to gender but that “LGBT content” tends to fall into its sexuality category, which is described on Radiant Mobile’s website as “sites that provide information on sex, sex and teenagers, and sexual education, without pornographic content.” This category is blocked by default for all phones, a setting that can be changed by adult account owners. 

But if a news site starts hosting enough gender-related content, Fisher might label it not just as “press,” which is allowed, but also as “sexuality,” thus blocking the whole domain on any phone with that category blocked.

Fisher illustrates the subjectivity of such decisions with a recent example involving Yale University. Its general website, www.yale.edu, is categorized by Allot as education. “But they have a subsection of one of their websites that’s totally focused on, you know, trans equality,” Fisher says, referring to lgbtq.yale.edu. Because it’s a distinct domain, Radiant Mobile is able to place it in the sexuality category and block it. 

Yale’s main website remains unblocked, for now. “If we see [the LGBTQ content] on the front pages consistently of Yale University, we’ll block them too,” Fisher says.
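The category logic described above can be illustrated with a minimal Python sketch. It is not Allot’s or Radiant Mobile’s actual software: the table, the function and class names, and the example.com hostnames are hypothetical, and the rule that uncategorized hosts load normally is an assumption. It shows only how a single blocked label can take out a whole hostname, and how a locked category differs from one that is merely on by default.

```python
from dataclasses import dataclass, field

# Hypothetical category table keyed by hostname. The Yale labels follow the
# article; the example.com entries are invented for illustration.
DOMAIN_CATEGORIES = {
    "www.yale.edu": {"education"},
    "lgbtq.yale.edu": {"sexuality"},
    "news.example.com": {"press", "sexuality"},
    "adult.example.com": {"pornography"},
}

# Categories nobody can unblock, and categories blocked by default that an
# adult account owner may turn off.
LOCKED_CATEGORIES = frozenset({"pornography"})
DEFAULT_OPTIONAL_BLOCKS = frozenset({"sexuality"})


@dataclass
class Account:
    is_adult_owner: bool
    optional_blocks: set = field(
        default_factory=lambda: set(DEFAULT_OPTIONAL_BLOCKS)
    )

    def unblock(self, category: str) -> bool:
        """Try to lift a block; return True only if the change was applied."""
        if category in LOCKED_CATEGORIES or not self.is_adult_owner:
            return False
        self.optional_blocks.discard(category)
        return True


def is_blocked(hostname: str, account: Account) -> bool:
    """A hostname is blocked if ANY of its categories is currently blocked."""
    # Assumption: hosts missing from the table have no categories and load.
    categories = DOMAIN_CATEGORIES.get(hostname, set())
    blocked = LOCKED_CATEGORIES | account.optional_blocks
    return bool(categories & blocked)


owner = Account(is_adult_owner=True)
print(is_blocked("www.yale.edu", owner))      # education only
print(is_blocked("lgbtq.yale.edu", owner))    # labeled sexuality
print(is_blocked("news.example.com", owner))  # press plus sexuality
```

Because the check is per hostname and per category, adding one label to a news site blocks everything on that site for any phone with the category enabled, which is why the labeling decisions carry so much weight.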

Managing website block lists is a professional pivot for Fisher, who spent his career not in telecoms but in fashion; he was an agent for supermodels like Naomi Campbell and members of the Hilton and Getty families, and he later hosted a reality show in which he found people in rehab facilities and homeless shelters and tried to turn them into models. He ultimately left the industry and now says he regrets the role he played in it: “Am I proud that I spent 35 years creating star models or star influencers? Not at all.”

Last year, his friend and fellow fashion mogul Bernt Ullmann suggested he look at what Ryan Reynolds had built with his cell network Mint Mobile: It made buying a cell plan feel less like dealing with a utility and more like choosing a brand, and it had been acquired by T-Mobile in 2023 for $1.3 billion. Fisher liked the business model but didn’t have an audience in mind. Then came a late-night revelation. “God is talking to me,” Fisher recalls. “Do something in the faith-based industry.” He set out to build the first cell network that would let in only content deemed compatible with Christianity.

Fisher says the company has received $17.5 million in investment from Compax Ventures, part of the company serving as the technical middleman between Radiant and T-Mobile. Roger Bringmann, a vice president at Nvidia, is Radiant Mobile’s lead investor and silent partner (Bringmann recently funded a new complex at Austin Christian University in Texas, which bills itself as “the university for Christian entrepreneurs”).

To fill the gap left by all the sites being blocked, the company intends to offer access to a library of religious content, including AI-generated Bible videos. It plans to use characters like Cinderella, Tinker Bell, and others (it has obtained rights from the entertainment and media company Elf Labs, which has been amassing rights to hundreds of children’s characters). “Those characters were originally constructed with a conservative perspective,” Klimis says. They’ll be used in AI-generated content alongside testimonials and devotionals. 

Choffnes has technical doubts that the plan’s firewall will be as effective as promised, not least because “it’s really hard to come up with a list of every website you think is problematic.” But beyond that, he sees the internet, frustrating as it can be, as better open than closed. “I do believe in an open internet,” he says. “I also believe that a lot of the internet is toxic, but I don’t believe that this sledgehammer approach of blocking content is the right answer.”

Operationalizing AI for Scale and Sovereignty

Companies are taking control of their own data to tailor AI for their needs. The challenge lies in balancing ownership with the safe, trusted flow of high‑quality data needed to power reliable insights. This conversation from MIT Technology Review’s EmTech AI conference examines how AI factories unlock new levels of scale, sustainability, and governance—positioning data control as a strategic imperative for governments and enterprises.


About the speakers

Chris Davidson, Vice President, HPC & AI Customer Solutions, HPE

Chris Davidson is Vice President of HPC & AI Customer Solutions at Hewlett Packard Enterprise. He leads HPE’s global strategy for AI Factory solutions and Sovereign AI, working with governments, enterprises, and research institutions to build secure, scalable national- and enterprise-grade AI capabilities.

He also directs Product Management and Performance Engineering across HPE’s HPC and AI portfolio, including large-model training platforms and Cray exascale systems. His teams define product strategy, performance architecture, and deployment models that position HPE at the forefront of high-performance and AI computing.

During his nine years at HPE, Chris has led key initiatives across Performance Engineering, AI Cloud, and Professional Services, shaping how HPE delivers optimized, cloud-native, and globally deployed high-performance systems. He previously held technical and leadership roles in the biotech and medical diagnostics sectors.

Chris holds an M.B.A. in Entrepreneurship and Finance and a B.S. in Biology from Loyola University Chicago.

Arjun Shankar, Division Director, National Center for Computational Science, Oak Ridge National Laboratory

Mallikarjun (Arjun) Shankar is the Division Director for the National Center for Computational Science at the Oak Ridge National Laboratory. His research focuses on the interdisciplinary bridge between computer science and large-scale scientific discovery campaigns that rely on scalable computing and data science. He is a joint faculty appointee at the University of Tennessee’s Bredesen Center, a senior member of the IEEE and a senior member of the ACM.

Cyber-Insecurity in the AI Era

Cybersecurity was already under strain before AI entered the stack. Now, as AI expands the attack surface and adds new complexity, the limits of legacy approaches are becoming harder to ignore. This session from MIT Technology Review’s EmTech AI conference explores why security must be rethought with AI at its core, not layered on after the fact.


About the speaker

Tarique Mustafa, Cofounder, CEO, and CTO, GC Cybersecurity

Tarique Mustafa is Cofounder and CEO/CTO of two AI-powered cybersecurity companies: GCCybersecurity, Inc. and its data compliance spinout, Chorology, Inc. A prolific inventor and internationally recognized authority in knowledge representation, inference calculus, and AI planning, Tarique has spent his career applying autonomously collaborative AI to solve complex, ultra-high-scale challenges across cybersecurity, data security, and compliance — with deep expertise spanning Data Classification, DLP, and DSPM industries. His groundbreaking innovations and multiple USPTO patents have earned him global recognition, including frequent invitations to deliver keynote addresses at prestigious international security conferences and forums.

At GCCybersecurity, Tarique architected the core AI algorithms powering the company’s 4th and 5th generation fully autonomous data leak protection and exfiltration platform, among the most advanced platforms of its kind. Prior to founding GCCybersecurity and Chorology, he served as founding CEO/CTO of NexTier Networks, a Silicon Valley provider of award-winning Data Leak Prevention solutions. With over 20 years of technical leadership experience, Tarique has held senior roles at Symantec, DHL Airways IT, MCI WorldCom, EDS, Andes Networks, and Nevis Networks, where he served as Principal Architect and built industry-leading security products leveraging next-generation security monitoring, event correlation, IDS/IPS, and SSL/IPSec technologies.

Tarique holds multiple approved and pending patents with the USPTO and has authored numerous research publications spanning Information & Data Security, Computer & Network Security, Software Architecture, Database Technologies, and Artificial Intelligence. A recipient of the prestigious Rotary International Scholarship for doctoral studies in Computer Science at the University of Southern California (USC), Tarique also holds master’s degrees in engineering and computer science from USC, and a bachelor’s degree in mechanical engineering from NED University of Engineering & Technology.

Musk v. Altman week 1: Elon Musk says he was duped, warns AI could kill us all, and admits that xAI distills OpenAI’s models

In the first week of the landmark trial between Elon Musk and OpenAI, Musk took the stand in a crisp black suit and tie and argued that OpenAI CEO Sam Altman and president Greg Brockman had deceived him into bankrolling the company. Along the way, he warned that AI could destroy us all and sat through revelations that he had poached OpenAI employees for his own companies. He even confessed, to some audible gasps in the courtroom, that his own AI company, xAI, which makes the chatbot Grok, uses OpenAI’s models to train its own. 

The federal courthouse in Oakland, California, was packed with armies of lawyers carrying boxes of exhibits, journalists typing away at their laptops, and a handful of concerned OpenAI employees. Outside, protesters lined the streets, carrying signs urging people to quit ChatGPT, boycott Tesla, or both. Musk looked calm and comfortable, slipping in the occasional quip in his distinct South African accent. But he also was full of remorse. 

“I was a fool who provided them free funding to create a startup,” Musk told the jury. He said when he cofounded OpenAI in 2015 with Altman and Brockman, he was donating to a nonprofit developing AI for the benefit of humanity, not to make the executives rich. “I gave them $38 million of essentially free funding, which they then used to create what would become an $800 billion company,” he said.

Musk is asking the court to remove Altman and Brockman from their roles and to unwind the restructuring that allowed OpenAI to operate a for-profit subsidiary. The outcome of the trial could upend OpenAI’s race toward an IPO at a valuation approaching $1 trillion. Meanwhile, xAI is expected to go public as a part of Musk’s rocket company SpaceX as early as June, at a target valuation of $1.75 trillion.

This week’s testimony revolved around a central question of the trial: why Musk is suing OpenAI. Musk argued he was trying to save OpenAI’s mission to develop AI safely by restoring the company to its original nonprofit structure. OpenAI’s lawyer, William Savitt, who once represented Musk and his electric-car company Tesla, countered that Musk was “never committed to OpenAI being a nonprofit” and instead was suing to undermine his competitor. 

Who is the steward of AI safety?

During his direct examination early in the week, Musk painted himself as a longtime advocate of AI safety. He said he cofounded OpenAI to create a “counterbalance to Google,” which was leading the AI race at the time. He said that when he asked Google cofounder Larry Page what happens if AI tries to wipe out humanity, Page told him, “That will be fine as long as artificial intelligence survives.” 

“The worst-case scenario is a Terminator situation where AI kills us all,” Musk later told the jury.

Savitt stood at the lectern and argued that Musk was not a “paladin of safety and regulation.” As he cross-examined Musk in his sharp, surgical cadence, Savitt pointed out that xAI sued the state of Colorado in April over an AI law designed to prevent algorithmic discrimination. 

Musk’s lawyer, Steven Molo, sprang to his feet to object. He asked the judge if he, too, could weigh in on ChatGPT’s safety record. 

The lawyers then entered a heated debate about who was the true guardian of AI safety. 

The sparring continued the next morning. “We all could die as a result of artificial intelligence!” said Molo, suggesting that OpenAI could not be trusted to build AI safely.

“Despite these risks, your client is creating a company that’s in the exact space,” Judge Yvonne Gonzalez Rogers said sternly, referring to xAI. “I suspect there’s plenty of people who don’t want to put the future of humanity in Mr. Musk’s hands.”

When the lawyers began talking over each other, the judge snapped. “This is not a trial on whether or not artificial intelligence has damaged humanity,” she said. 

When did Musk think he was being duped?

As Savitt continued to cross-examine Musk, he pressed on the idea that Musk had never been committed to keeping OpenAI a nonprofit. He also claimed that Musk waited too long to sue OpenAI, filing after the statute of limitations ran out. 

Musk explained why he sued in 2024 rather than earlier, describing “three phases” in his views of OpenAI. In phase one, he was “enthusiastically supportive” of the company. In phase two, “I started to lose confidence that they were telling me the truth,” he said. In phase three, “I’m sure they’re looting the nonprofit.”

In 2017, Musk and other OpenAI cofounders discussed creating a for-profit subsidiary to raise enough capital to build artificial general intelligence—powerful AI that can compete with humans on most cognitive tasks. Musk wanted a majority interest in the subsidiary and the right to choose a majority of the board members. He also pitched having Tesla acquire OpenAI. (He left OpenAI in 2018.)

“I was not opposed to there being a small for-profit that provides funding to the nonprofit,” he told the jury, “as long as the tail didn’t wag the dog.” 

But it was only in late 2022, Musk testified, that he “lost trust in Altman” and his commitment to keeping the company a nonprofit. The key moment came, he said, when he learned that Microsoft would invest $10 billion in OpenAI. 

“I texted Sam Altman, ‘What the hell is going on? This is a bait and switch,’” he told the jury. Microsoft would give $10 billion only if it expected “a very big financial return,” he said.

Is Musk just trying to kill competition?

But Savitt argued that Musk was really suing to undermine OpenAI as a competitor to his empire of tech companies. While he was on the board of OpenAI, Musk was also running Tesla and his brain-implant company, Neuralink. He founded xAI in 2023.

Savitt pulled up an email that Musk had sent to a Tesla vice president in 2017 after hiring Andrej Karpathy, a founding member of OpenAI, to work at Tesla. “The OpenAI guys are gonna want to kill me. But it had to be done,” he wrote.

When asked about it, Musk was flustered. He claimed Karpathy had already decided to leave OpenAI when he recruited him to work at Tesla. “I believe it’s a free world,” he said.

Savitt pulled up another email that Musk sent to a cofounder at Neuralink in 2017. He wrote that they could “hire independently or directly from OpenAI.” When pressed about it, he sounded frazzled. “It’s a free country,” he said. “I can’t restrict their ability to hire people from other companies.” 

Savitt also pointed out that Tesla, SpaceX, Neuralink, and X were socially beneficial for-profit companies, like OpenAI. He stressed that xAI was also a closed-source, for-profit company.

But Musk claimed that xAI was not a real competitor to OpenAI. “We’re not currently tracking to reach AGI first,” he told the jury. 

In fact, Musk admitted that xAI uses OpenAI’s technology. In response to Savitt’s relentless questioning, he said xAI “partly” distills OpenAI’s models. Some people in the courtroom gasped. 

Distillation is a technique where a smaller AI model is trained to mimic the behavior of larger, more capable models, so it can run faster and more cheaply while performing nearly as well. But OpenAI and other AI companies have pushed back against the practice. In February, OpenAI accused the Chinese AI company DeepSeek of distilling its AI models. In August 2025, Wired reported that Anthropic had blocked OpenAI’s access to Claude for violating the company’s terms of service, which prohibit, among other things, reverse-engineering its services and building competing products. 
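As a rough illustration of the idea, the toy sketch below trains a "student" to match a "teacher's" output probabilities. Everything in it is an assumption made for illustration: the logits are invented numbers, the function names are hypothetical, and it says nothing about how xAI, OpenAI, or any other lab actually performs distillation. Real systems train full neural networks on large volumes of teacher outputs.

```python
import math

def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature gives a softer distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q):
    """KL(p || q): how far the student's distribution q is from the teacher's p."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def distill(teacher_logits, student_logits, temperature=2.0, lr=0.5, steps=200):
    """Nudge the student's logits toward the teacher's softened output distribution."""
    student = list(student_logits)
    target = softmax(teacher_logits, temperature)
    for _ in range(steps):
        pred = softmax(student, temperature)
        # Gradient of KL(target || pred) with respect to each student logit
        # is (pred - target) / temperature.
        student = [s - lr * (p - t) / temperature
                   for s, p, t in zip(student, pred, target)]
    return student

teacher_logits = [4.0, 1.0, 0.5]  # made-up scores from a hypothetical large model
student_logits = [0.0, 0.0, 0.0]  # hypothetical small model before training

before = kl_divergence(softmax(teacher_logits, 2.0), softmax(student_logits, 2.0))
trained = distill(teacher_logits, student_logits)
after = kl_divergence(softmax(teacher_logits, 2.0), softmax(trained, 2.0))
print(f"KL before: {before:.4f}, after: {after:.4f}")
```

The gap between the two distributions shrinks as the student is updated, which is the sense in which the smaller model learns to "mimic" the larger one.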

“It is standard practice to use other AIs to validate your AI,” argued Musk.

Next week, Stuart Russell, a computer scientist at UC Berkeley, will testify about AI safety. Brockman, who has been taking notes during Musk’s testimony, will also testify.

This story is part of MIT Technology Review’s ongoing coverage of the Musk v. Altman trial. Follow @techreview or @michelletomkim on X for up-to-the-minute reporting.

This startup’s new mechanistic interpretability tool lets you debug LLMs

The San Francisco–based startup Goodfire just released a new tool, called Silico, that lets researchers and engineers peer inside an AI model and adjust its parameters—the settings that determine a model’s behavior—during training. This could give model makers more fine-grained control over how this technology is built than was once thought possible.

Goodfire claims Silico is the first off-the-shelf tool of its kind that can help developers debug all stages of the development process, from building a data set to training a model.

The company says its mission is to make building AI models less like alchemy and more like a science. Sure, LLMs like ChatGPT and Gemini can do amazing things. But nobody knows exactly how or why they work, and that can make it hard to fix their flaws or block unwanted behaviors. 

“We saw this widening gap between how well models were understood and just how widely they were being deployed,” Goodfire’s CEO, Eric Ho, tells MIT Technology Review in an exclusive chat ahead of Silico’s release. “I think the dominant feeling in every single major frontier lab today is that you just need more scale, more compute, more data, and then you get AGI [artificial general intelligence] and nothing else matters. And we’re saying no, there’s a better way.”

Goodfire is one of a small handful of companies, including industry leaders Anthropic, OpenAI, and Google DeepMind, pioneering a technique known as mechanistic interpretability, which aims to understand what goes on inside an AI model when it carries out a task by mapping its neurons and the pathways between them. (MIT Technology Review picked mechanistic interpretability as one of its 10 Breakthrough Technologies of 2026.)  

Goodfire wants to use this approach not only to audit models—that is, studying those that have already been trained—but to help design them in the first place.  

“We want to remove the trial and error and turn training models into precision engineering,” says Ho. “And that means exposing the knobs and dials so that you can actually use them during the training process.”

Goodfire has already used its techniques and tools to tweak the behaviors of LLMs—for example, reducing the number of hallucinations they produce. With Silico, the company is now packaging up many of those in-house techniques and shipping them as a product.

The tool uses agents to automate much of the complex work. “Agents are now strong enough to do a lot of the interpretability work that we were doing using humans,” says Ho. “That was kind of the gap that needed to be bridged before this was actually a viable platform that customers could use themselves.”

Leonard Bereska, a researcher at the University of Amsterdam who has worked on mechanistic interpretability, thinks Silico looks like a useful tool. But he pushes back on Goodfire’s loftier aspirations. “In reality, they are adding precision to the alchemy,” he says. “Calling it engineering makes it sound more principled than it is.”

Mapping models

Silico lets you zoom in on specific parts of a trained model, such as individual neurons or groups of neurons, and run experiments to see what those neurons do. (Assuming you have access to the model’s inner workings. Most people won’t be able to use Silico to poke around inside ChatGPT or Gemini, but you can use it to look at the parameters inside many open-source models.) You can then check what inputs make different neurons fire, and trace pathways upstream and downstream of a neuron to see how other neurons affect it and how it affects other neurons in turn.

For example, Goodfire found one neuron inside the open-source model Qwen 3 that was associated with the so-called trolley problem. Activating this neuron changed the model’s responses, making it frame its outputs as explicit moral dilemmas. “When this neuron’s active, all sorts of weird things happen,” says Ho.

Pinpointing the source of odd behavior like this is now pretty standard practice. But Goodfire wants to make it easier to adjust that behavior. Using Silico, developers can now adjust the parameters connected to individual neurons to boost or suppress certain behaviors.

In another example, Goodfire researchers asked a model whether a company should disclose that its AI behaves deceptively in 0.3% of cases, affecting 200 million users. The model said no, citing the negative business impact of such a disclosure.

By looking inside the model, the researchers found that boosting neurons that were found to be associated with transparency and disclosure flipped the answer from no to yes nine out of 10 times. “The model already had the ethical reasoning circuitry, but it was being outweighed by the commercial risk assessment,” says Ho.
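The kind of intervention described here can be sketched with a toy example. This is not Silico's interface or Goodfire's method: the two "neurons," their weights, and the `forward` function are all hypothetical, chosen only to show how scaling one internal activation can flip an output.

```python
def forward(inputs, boost=None):
    """Toy two-neuron 'model'. `boost` maps a neuron name to a multiplier for its activation."""
    boost = boost or {}
    # Hypothetical hidden neurons: each is a ReLU over a weighted sum of the inputs.
    hidden = {
        "commercial_risk": max(0.0, 0.9 * inputs["users_affected"] + 0.2 * inputs["deception_rate"]),
        "transparency": max(0.0, 0.3 * inputs["users_affected"] + 0.8 * inputs["deception_rate"]),
    }
    # Intervention: scale the chosen activations before they reach the output.
    for name, factor in boost.items():
        hidden[name] *= factor
    # A positive score means "disclose"; a negative score means "do not disclose".
    score = hidden["transparency"] - hidden["commercial_risk"]
    return "yes" if score > 0 else "no"

prompt = {"users_affected": 1.0, "deception_rate": 0.3}  # invented input features
baseline = forward(prompt)
steered = forward(prompt, boost={"transparency": 3.0})
print(baseline, steered)
```

In this toy setup the "transparency" signal exists from the start but is outweighed by "commercial risk" until it is amplified. A real model has billions of parameters, and finding the neurons worth boosting is the hard part.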

Tweaking the values of a model in this way is just one approach. Silico can also help steer the training process by filtering out certain training data to avoid setting unwanted values for certain parameters in the first place.   

For example, many models will tell you that 9.11 is greater than 9.9. Looking inside a model to see what’s going on might reveal that it is being influenced by neurons associated with the Bible, in which verse 9.9 comes before 9.11, or by code repositories where consecutive updates are numbered 9.9, 9.10, 9.11 and so on. Using this information, the model can be retrained to make it avoid its “Bible” neurons when doing math.
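The two readings of "9.11" can be made concrete in a few lines. This sketch only illustrates the ambiguity in the notation; the function names are hypothetical, and it does not model how an LLM processes numbers.

```python
def as_decimal(text):
    """Read "9.11" as the decimal number nine and eleven hundredths."""
    return float(text)

def as_dotted_sequence(text):
    """Read "9.11" as chapter 9, verse 11 (or version 9.11): a tuple of integers."""
    return tuple(int(part) for part in text.split("."))

a, b = "9.11", "9.9"
decimal_says_a_larger = as_decimal(a) > as_decimal(b)
dotted_says_a_larger = as_dotted_sequence(a) > as_dotted_sequence(b)
print(decimal_says_a_larger, dotted_says_a_larger)
```

Read as decimals, 9.11 is smaller than 9.9. Read as verse or version numbers, 9.11 comes after 9.9. A model that leans on the second convention when doing arithmetic gets the wrong answer.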

By releasing Silico, Goodfire wants to put techniques previously available to a few top labs into the hands of smaller firms and research teams that want to build their own model or adapt an open-source one. The tool will be available for a fee determined on a case-by-case basis according to customers’ requirements (Goodfire declined to give specific pricing details).

“If we can make training models a lot more like building software, there’s no reason why there can’t be many more companies designing models that fit their needs,” says Ho.

Bereska agrees that tools like Silico could help firms build more trustworthy models. These techniques could be essential for safety-critical applications in health care and finance, he says.

“Frontier labs already have internal interpretability teams,” he adds. “Silico arms the next tier of companies, where the value is not having to hire interpretability researchers.”

Rebuilding the data stack for AI

Artificial intelligence may be dominating boardroom agendas, but many enterprises are discovering that the biggest obstacle to meaningful adoption is the state of their data. While consumer-facing AI tools have dazzled users with speed and ease, enterprise leaders are discovering that deploying AI at scale requires something far less glamorous but far more consequential: data infrastructure that is unified, governed, and fit for purpose.

That gap between AI ambition and enterprise readiness is becoming one of the defining challenges of this next phase of digital transformation. As Bavesh Patel, senior vice president of Databricks, puts it, “the quality of that AI and how effective that AI is, is really dependent on information in your organization.” Yet in many companies, that information remains fragmented across legacy systems, siloed applications, and disconnected formats, making it nearly impossible for AI systems to generate trustworthy, context-rich outputs.

“Really, the big competitive differentiator for most organizations is their own data and then their third-party data that they can add to it,” says Patel.

For enterprise AI to deliver value, data must be consolidated into open formats, governed with precision, and made accessible across functions. Without that foundation, businesses risk “terrible AI,” as Patel bluntly describes it. That means moving beyond siloed SaaS platforms and disconnected dashboards toward a unified, open data architecture capable of combining structured and unstructured data, preserving real-time context, and enforcing rigorous access controls. When the groundwork is laid correctly, organizations can move toward measurable outcomes, unlocking efficiencies, automating complex workflows, and even launching entirely new lines of business.

That value focus is critical, says Rajan Padmanabhan, unit technology officer at Infosys, especially as enterprises seek precision in the outputs driving business decisions. Rather than treating AI initiatives as isolated innovation projects, leading companies are tying AI deployment directly to business metrics, using governance frameworks to determine what delivers results and what should be abandoned quickly.

“We see this big opportunity just with AI literacy with business users, where they’re very eager to understand how they should be thinking about AI,” adds Patel. “What does AI mean when you peel the covers? What are the pieces and the building blocks that you need to put in place, both from a technology and a training and an enablement standpoint?”

The possibilities ahead are substantial. As AI agents evolve from copilots into autonomous operators capable of managing workflows and transactions, the organizations that win will be those that build the right foundation now.

“What we are seeing as a new way of thinking is moving from a system of execution or a system of engagement to a system of action,” notes Padmanabhan. “That is the new way we see the road ahead.”

The future of AI in the enterprise will be determined by whether businesses can turn fragmented information into a strategic asset capable of powering both smarter decisions and entirely new ways of operating.

This episode of Business Lab is produced in partnership with Infosys Topaz.

Full Transcript:

Megan Tatum: From MIT Technology Review, I’m Megan Tatum, and this is Business Lab, the show that helps business leaders make sense of new technologies coming out of the lab and into the marketplace.

This episode is produced in partnership with Infosys Topaz.

Now, recent advancements in AI may have unlocked some compelling new industrial applications, but a reliance on inadequate data models means that many enterprises are hitting a brick wall. AI and agentic AI in particular place a whole new set of demands on data. The technology requires greater access, context, and guardrails to operate effectively. Existing data models often fall short. They’re too fragmented or siloed. Data itself often lacks quality. To bridge the gap, they require an AI-ready upgrade.

Two words for you: data reconfigured.

My guests today are Bavesh Patel, senior vice president for Go-to-Market at Databricks, and Rajan Padmanabhan, unit technology officer for data analytics and AI at Infosys.

Welcome, Bavesh and Rajan.

Rajan Padmanabhan: Thank you. Thanks for having us.

Bavesh Patel: Thanks for having us.

Megan: Fantastic. Thank you both so much for joining us today. Bavesh, if I could come to you first, when we talk about AI-ready data, what exactly do we mean? What new demands does AI place on data, and how does this impact the way it needs to be structured and used?

Bavesh: Yeah. Great question. Appreciate you hosting us today. I think that obviously the whole world is enamored with AI because of all of the power that we can all see as users. AI is now democratized across hundreds of millions of users. And when we think about enterprises and businesses using AI, the quality of that AI and how effective that AI is, is really dependent on information in your organization, and that’s data. And what we found is that most enterprises, their data is kind of locked away in these different applications and different systems. And it’s very difficult to get a good view of, what is all my data? How trustworthy is it? How recent and fresh is it? And all of that is being injected into the AI. Unless you have a proper understanding of your data, the ability to ensure that it’s data that’s accurate and that can be used so that the AI can take advantage of it, you’re actually going to end up having terrible AI.

We see a lot of customers spend time on cleansing their data, organizing their data, making sure it’s access controlled correctly, and that tends to be the fuel of good AI.

Megan: Yeah. It’s such a foundational thing, isn’t it? But it can be missed, I think, quite easily. Rajan, what difference can having AI-ready data really make for enterprises as they unlock that full potential of AI and its applications?

Rajan: First and foremost, thanks for having us. It’s a pleasure. I think in continuation of what Bavesh talked about, see, data and AI is pretty synonymous. And similarly, the consumer AI and enterprise AI and enterprise agentic AI are different because first and foremost, the business needs to have the context. That context from your enterprise information, which is not only structured, both structured and unstructured and user-generated contents and all forms of data is going to be very, very critical to really get the context right, and really get any model that you pick. That’s where the platforms like Databricks really help with the plethora of models or whether you want to build your own models or whether you want to ground the model based on your data. That is going to be very, very critical. That is where getting the data for AI is going to be very, very critical.

The third critical part, and this actually will be one of the roadblocks for adoption of AI. That’s why if you see the AI adoption on the consumer side is skyrocketing, but on the enterprise side, the enterprises are struggling is primarily around the precision of their output, because you are taking a business decisions where you are taking a buy decision, you are taking a sell decision, or you are trying to recommend something, recommend the content. It could be 20 different use cases. For that, the precision is going to be very critical. We are seeing our customers, the successful customers, definitely for the precision to be more than 92% is not aspiration, that is a must-have. If you have that, definitely being that AI data is going to be the entrepreneur right now for that.

Megan: And I suppose if we’ve outlined there how critical this is, where should enterprises start then, professional perhaps, the level, what are the foundations when it comes to building an AI-ready data model?

Bavesh: Yeah. And I think Rajan hit the nail on the head. I mean, enterprises are grappling with a different set of problems than consumer AI. The first thing is that you’ve got to get a handle on your data. As I mentioned, a lot of the data is locked in. Ensuring that you have ability to put your data in a place where you can understand the holistic view of as much of your data as possible. That kind of starts with putting your data in open formats. A lot of the valuable data today in an organization is locked away in some proprietary SaaS app or some system, and all the datasets aren’t connected together to form that context. The first step is to really do an analysis of what is your data estate? What are the critical pieces of data that need to be put into a place where you can start to understand them and how they’re connected to one another?

Thinking about how do you set up your data catalog, thinking about how do the relationships between the data assets work, putting data governance around it, that seems to be the first step. And if you think about how ChatGPT was built, it took all the data on the internet and then aggregated it, synthesized it, and then built these transformer models, while enterprises, they don’t really have a handle of all their data within the organization. That’s the first foundation that you really want to think about. The second thing is that you don’t want to just go ad hoc, go and do random AI projects. You really need to be thinking about business value. A lot of our customers are looking at AI much more strategically in that they want to be able to get projects on the board with wins and then generate business value.

Building an AI value roadmap, which is connected to how well your data is organized, those two things seem to be foundational to how do you launch AI successfully in your organization.

Megan: That value piece is so important, isn’t it? And as I understand it, Infosys and Databricks have worked closely together to guide organizations through this transformation. I wondered, can you share some examples of the impact you’ve seen enterprises you’ve worked with, Rajan, what difference has it made to the ways in which they can integrate more sophisticated AI and agentic AI applications?

Rajan: Well, that’s a very, very good question. What both Databricks and Infosys has done is we have come up with, a kind of a framework first. First and foremost, it all needs to start with the value. One of the largest food products company where we collaborated together, what we have done is we have applied this framework. The framework consists of six different things. First and foremost, very critical is the value management, which Bavesh touched upon. We have worked together to come up with a 3M measurement framework, what we call adaptability, business value, and then responsible. You can’t just go and do a garage project. It has to be measurable. It should be responsible, follow all those things. That is going to be very critical. And we helped this client to prioritize, which will give them the most value for money, the investments that they are making.

The second critical part here is it is not like most of the enterprises today are not everybody’s AI-born companies. Most of them were born during analog days; most of them were born in digital days. There are companies which are applying AI for modernization, because a lot of your historical information, which is actually helping you to build that long-term context. And that is where we have worked closely with some of the native tools of Databricks, like Lakebridge or the AI assistants that are there, and then create composable services on top of it to help the clients unlock the value bringing into Databricks. And then the second part where we help the client is exactly to the point, the readying of data. Now you brought in the data, now you have to bring both the structured, unstructured, analytical and all these aspects.

And that is where the third layer, we closely work with the Databricks, which is part of leveraging all the great capabilities within the Databricks, be it Unity Catalog, be it the open formats, or be it the gateways and other aspects. We were able to make the data available for this client. What has really helped our client, the third part, is Agent Bricks, which is one of the differentiators. It gives you the flavor for the enterprise. That is where we have closely worked, and we built some of our industry-specific agents, be it CPG, be it energy, be it FS. And for this client, what we have done is we have taken some of those CPG-specific use cases. Either it could be on the HR space or the procurement space or on the marketing space. And this has really helped our client be able to build a business capability surrounding this and unlock eight to nine use cases, we call it as a products, agentic AI products, which can really drive more value for them, solving the real business problems.

And this kind of a comprehensive set of frameworks plus set of suites of services, plus our solution assets, Infosys solution assets, as well as unlocking the value from Databricks has really helped these clients. And we see similar patterns for a lot of these successful engagements where we were able to continuously drive the value by applying this framework actually.

Megan: Right. Sounds like it made a real material difference. Rajan mentioned a few of the tools in Databricks catalog there, Bavesh. I know you’ve recently worked to launch an operational database for AI agents and apps. I wonder how does a platform like that help organizations in this journey? What makes it different from some of the other platforms out there right now?

Bavesh: Databricks has come to market with a new offering called Lakebase, which is really an OLTP database where you can build your AI apps. And if you think about it, there’s really two main types of data in an enterprise. There’s all the historical data, which is all the things that have happened, and that’s really what your analytics is based on. You have an OLAP system where you have put all your historical data and Databricks has come to market with what we call the Lakehouse, which is essentially a data warehouse with all of your data that is not operational in nature. It’s historical data. And I think that Lakehouse concept is really pushing forward with AI because a lot of our customers have thousands of users within their business and they need to get data. And what they’ve done is they’ve actually gone down the BI route, which is really building a dashboard or a report.

Most organizations have had thousands of these dashboards and reports proliferate across the organization and then they need to be customized. It just takes a long time for users inside of the business to actually get access to the data. AI now is really making that a lot easier from just the analytics perspective where we can now democratize access to the data, which has really been the holy grail for most data teams. They really want to get out of the way and just give the right data to the right people inside of the business with the right access.

With a product like Genie at Databricks, you can just use English language or whatever your language is to ask questions of the data. And it’ll give you back data that answers your questions in context. It’ll give you not just what ChatGPT will give you, which is information about a topic that’s on the internet, but it will actually tell you, “Well, why did my sales numbers not reflect what I expected in the month of April?”

It’ll give you some root cause analysis based on your enterprise data. Genie is going to be one of these things that’s really important where it’s going to truly kind of democratize data inside of the business. That’s kind of this OLAP world, which is what the Lakehouse is. More recently, we’ve come to market with what we call the Lakebase, which is the OLTP world. What we’re finding is that agents are now being deployed in these organizations, and those agents need a place to keep all of their orchestration, all of the context of what’s happening in that particular workflow. On the one hand, you’ve got users just asking questions. On the other hand, the next chapter is going to be around automating an entire business process. If you’re taking a function like generating a campaign in marketing, right? There are a lot of tools you use and a lot of steps you use.

An agent can come in and really automate a lot of that. But on the back end of that agent, you’re going to need to stand up a real-time database to keep track of all the things that the agent is doing. That’s what Databricks has brought to market, which is this OLTP Lakebase solution. The innovation that we have brought to market is that it’s a modern kind of Postgres database where we have separated the compute and storage, very much like what we did with the data Lakehouse with the data warehouse. But on the Lakebase, the data is on one copy inside of your cloud storage, and then the compute is separated and it’s serverless. You can do things like branching and you can start up the OLTP database really quickly. What we found is that agents are actually starting these Lakebases because they can very quickly go start one up, keep it running, put it down when it needs to, make a copy of it.

Agents are doing this, then they need the velocity, they need a cost-effective solution. And the beauty of all this is when you take the OLTP, which is all around the Lakebase and the real time, and you take the OLAP, you now have one system for all your data. You don’t have to copy the data around, you don’t have to manage all the permissions, you can set the context against it. We see these AI apps being really the future of how businesses run, where they’re going to take away all of the bottlenecks that humans are having to do repetitive work and automate these using LLMs and all these new technologies. We want to be the default for powering all that because we believe that our Lakebase technology is going to be faster, cheaper, and more secure for an AI database.

Megan: Sounds like a real game changer. And we’ve touched on this a couple of times already, I mean, this idea of value. We know that gauging the commercial value of investments into AI is really high on the priorities right now for senior leaders. How important is this value measurement piece when it comes to creating AI-ready data systems, Rajan? How can organizations ensure they’re monitoring what is delivering and what isn’t?

Rajan: This is of paramount importance, and most of the successful AI implementations or agentic AI implementations really require this value measurement. I’ll just extend the client example that I talked about, the large food products company, the global products company, to explain this question. I just want to create a metaphor. When the initial digital world came, we had a lot of these analytics, primarily around defining those performance management KPIs; fact-based decisioning and other things were evolving over a period of time. Typically, a lot of these metrics are going to be very critical for them to measure how a function, how a business is doing. On a similar line for the value measurement, if I take the same example of the client, what is very critical for an organization is actually to map the outcome that you are expecting.

In this case, how do I optimize my spend on direct and indirect purchases? So by applying AI, I would like to identify the areas where I can optimize the spend. That means one of the critical measures that you have is, what is your indirect expense classification, what spends you have classified, and how much you are able to reduce by bringing this in. Establishing these measures and the metrics is going to be very, very critical. And once you establish these base metrics and the measurement, the beauty of it is that, to just extend what Bavesh was talking about, the capabilities that Databricks gives you, like metrics view, features, tools, and other things, would actually help you to translate those AI telemetries, business telemetries that are coming from your applications into measurable metrics in terms of an outcome, which you can actually measure using the Genie room for value management measurement.

Then what happens is two things that you can take, the use case, the products that as I said for this client, the products that we build either on the procurement side or on the marketing research side, if you find there is a value either because of VAC, they identify that they’re able to optimize or it is able to reachability, what is the reach, you can either accelerate that use case and further fine tune that product to expand it. Or there are, if you find it is not really driving the value or I’m not able to see the value that it is going to deliver, you can very well do a fast failure method rather than trying to make it work, you can understand and then you can take a call to pivot it to something else different.

There are three aspects here. What we see from our experience, not only with this client but across some of our other clients from industrial manufacturing or FS or energy, is the value of setting up this metrics-driven valuation method upfront and then leveraging the capabilities to transform these telemetries and signals into a measurement, what we call an AI compass room, so that the business stakeholders can really measure, whether they are coming from a marketing office, a supply chain office, or a CFO office, where they can say, “Hey, this is what it is intended to do, this is the current measurement, and this is where it’s failing,” and that can help them to pivot. And this will actually drive and democratize AI, all the agentic AI across the enterprise, and that really drives the value.

This is going to be one of the critical part that enterprise needs to do it. And that is where the six part framework that I talked about, applying that framework like value office, applying the ready for AI, applying the transformation fabric. Then the third part is the governance, which is going to be the entrepreneur of this. Then running your operations, not based on SLA, based on the experience level agreements and business metrics for you to continually measure, bringing all these six layers is going to be very critical. That’s when we see the organizations are very successful, and some of our proven examples exactly do the same that this is going to be very critical for organizations from a measurement standpoint.
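The accelerate-or-fail-fast decision Rajan describes can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the `UseCaseMetric` record, the `accelerate_at` and `pivot_below` thresholds, and the sample spend figures are hypothetical and are not part of any Infosys framework or Databricks API.

```python
from dataclasses import dataclass


@dataclass
class UseCaseMetric:
    """Hypothetical record for one agreed outcome metric of an AI use case."""
    name: str
    baseline: float  # value before the AI product was introduced
    target: float    # outcome the business agreed on up front
    current: float   # latest measured value


def progress(m: UseCaseMetric) -> float:
    """Fraction of the baseline-to-target gap closed so far.

    Works whether lower or higher values are better, because the sign of
    the gap cancels out.
    """
    gap = m.target - m.baseline
    if gap == 0:
        raise ValueError("target must differ from baseline")
    return (m.current - m.baseline) / gap


def decide(m: UseCaseMetric, accelerate_at: float = 0.5, pivot_below: float = 0.1) -> str:
    """Map measured progress to an action; both thresholds are illustrative."""
    p = progress(m)
    if p >= accelerate_at:
        return "accelerate"
    if p < pivot_below:
        return "pivot"
    return "keep tuning"


# Made-up example: indirect spend (in millions) that the business wants to reduce.
on_track = UseCaseMetric("indirect_spend", baseline=100.0, target=90.0, current=94.0)
stalled = UseCaseMetric("indirect_spend", baseline=100.0, target=90.0, current=99.5)

print(decide(on_track))
print(decide(stalled))
```

The point of the sketch is only that the baseline and target are fixed before the use case ships, so the later decision to expand or pivot is driven by a measurement rather than by opinion.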

Megan: Lots of tangible ways there that you can actually gauge value here. And you touched on governance and the impact of AI on governance is another huge talking point among senior leaders and interactions with data are a core part of that. To what extent is having the right governance and security protocols an integral part of having AI-ready data? To Bavesh, what scenarios do these systems need to handle? What does that mean for data models?

Bavesh: This is becoming kind of the prerequisite to deploying a successful AI project. I think MIT produced a report that said 95% of these new AI projects fail to actually generate business value. A big reason for that is you can go and prototype and stand up and vibe code a pilot, but when you’re actually moving a workload into production, you realize that governance becomes so critical.

So what do we really mean by governance? I think the first thing is getting your data in order, like I said, in open formats. Most companies realize now that the way they engage with their customers, the way they develop a drug, the way they approve a person for a credit limit increase, all of that enterprise information is actually their competitive advantage. Because you can go and use a frontier model like ChatGPT or Claude that everybody has access to.

Really the big competitive differentiator for most organizations is their own data and then their third-party data that they can add to it. Getting your data into an open format so you can understand your data and understanding your data is where governance comes in. Because when you think about governance, you really want to be able to find the data.

If I’m an end user or if I’m building an AI product, I want to know what data’s available to me. Can I trust the data? How fresh is the data? Is it coming from my analytics world or do I need a real-time system like an OLTP system? I need to find the data. I also need to make sure that access is controlled in a way that doesn’t cause any huge headaches for my organization. This becomes critical. If I have a whole bunch of PDFs that have purchase orders in them, who actually has access to all that data?

In a clinical trial, for example, in healthcare, you really want to ensure that people across trials don’t have visibility to patient data. Maybe the model that was used to build that was running across trial. Who has access to all the data? Who has access to only parts of the data? You really have to think about this. We also look at semantics of the data. Rajan brought this up right at the beginning of this, which is what is the context? How do we think about the metrics and all the things that the business users know in their head? We need to start codifying that somewhere. We have a product at Databricks called Unity Catalog where you can do the discovery, the access and the business semantics. You also want to share the data.

And in the world of agents, what we see is something called agent sprawl. In a very short order, just like how SaaS applications became very prevalent within any organization where they really solved a business problem. You go to a line of business and you say, “I need to be able to do credit underwriting” or “I am doing a prior authorization use case or pick thousands of use cases.” There’s a SaaS app for that. Much like that, there’s going to be this world in which agents are going to come into play, and most organizations are going to have lots of agents running all the time, but the reality of it is that how did that agent perform? What was the feedback loop from the user? What was the cost of running that workload and is it going up dramatically? And if you don’t have a way to monitor, to understand, and trace all the questions and answers and responses at scale, you’re going to find yourself in a big pickle. This actually could hurt your organization because users will be very confused about what to do.
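The questions Bavesh lists here (how did the agent perform, what was the user feedback, what did it cost) can be made concrete with a small Python sketch that aggregates trace records per agent. The `TraceRecord` fields, the agent names, and the cost figures are hypothetical; this is not a Databricks tracing API, only an illustration of the kind of roll-up a monitoring layer would need.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class TraceRecord:
    """Hypothetical trace of one agent call."""
    agent: str
    cost_usd: float
    user_rating: int  # 1 = user marked the answer helpful, 0 = not helpful


def summarize(traces):
    """Roll up call count, total cost, and approval rate for each agent."""
    totals = defaultdict(lambda: {"calls": 0, "cost_usd": 0.0, "positive": 0})
    for t in traces:
        s = totals[t.agent]
        s["calls"] += 1
        s["cost_usd"] += t.cost_usd
        s["positive"] += t.user_rating
    return {
        agent: {
            "calls": s["calls"],
            "cost_usd": round(s["cost_usd"], 4),
            "approval_rate": s["positive"] / s["calls"],
        }
        for agent, s in totals.items()
    }


# Made-up traces for two of the use cases mentioned above.
traces = [
    TraceRecord("credit_underwriting", 0.02, 1),
    TraceRecord("credit_underwriting", 0.03, 0),
    TraceRecord("prior_authorization", 0.10, 1),
]

summary = summarize(traces)
for agent, stats in summary.items():
    print(agent, stats)
```

With many agents running, a per-agent summary like this is what lets a team notice that one workload's cost is climbing or that its approval rate is falling.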

When you look at governance, most organizations are recognizing that they have to start to understand what is it that they have put in place from a systems, from a process, from a tooling standpoint, focus on one use case, build out the governance for that, but build it in a way that’s going to allow you to become repeatable. AI is not going to be about one use case or two use cases. It’s whoever builds the flywheel of building many use cases in a safe, secure way, in a cost-effective way that’s driving a business outcome. If you don’t apply governance, it’s going to be very hard.

At Databricks, we made a big bet on governance four or five years ago. This is one of the main reasons our company’s growing right now because we can ensure that there’s quality data that’s going into all of your AI. You can use things like Genie and you can use things like Agent Bricks and you can build apps using Lakebase. None of that really works without governance. It’s really what we call the brain inside of Databricks.

Most of our customers spend a lot of time inside of Unity Catalog. And the great news is that AI is helping governance get set up much more quickly. We have a customer that, three years ago, was trying to get all of the data assets across all their domains, from the customer, from the loyalty app, from the e-commerce engine. They had to go and map out all these data assets. AI is now doing a lot of their work for them. The human in the loop is just checking things.

We’ve made this much easier with AI. We always think about AI as a business use case and an outcome, which I think is going to be where the biggest value is. But at Databricks, we’re using AI inside of our platform to make it much easier to operate and to make it much easier to provide all the right things for your business. This is a super critical part of how we plan to innovate as AI comes to fruition in the market.

Megan: And Rajan, Bavesh touched on this a little bit there, but does the integration of Agentic AI add another layer of complexity here too? What new consideration around governance does that raise?

Rajan: That’s a very, very valid question. I would like to take a metaphor to really explain. We are getting into the world of self-driving cars, robotaxis, and other things. While that takes us to the autonomous world, there are still rules that you need to adhere to when you are driving on a road. The reason I’m bringing up this metaphor is that adhering to the rules, with different topographies and different things depending upon where you are driving, is going to be very, very critical. The complexity that agents are going to add is basically how you operate with those constraints.

For example, as a UTO, I can do 10 things, but say I cannot approve a discount of more than 70%, or I cannot give someone a bonus because that is part of the CFO’s remit. That is something an agent should be aware of.

That is one aspect, applying the constraints around it and making sure that the agents are adhering to the constraints. The second set of complexity that it builds is the tools to access. As a business, in today’s world, when you define a process, certain processes need a certain set of tools to really actionize it. There are certain entitlements, only people entitled to do certain things based on their identity, based on the need or the situation need, you need to govern. The third is information sharing. While MCP and other aspects are great, UCP and other aspects are great, but one critical thing is what you need to share, what you don’t need to share. And those are the critical considerations.

The last part is learning and relearning. Sometimes when you learn good things, you should keep something. Sometimes it is better for you to completely remove it and reevaluate in a newer way, relearn it in a newer way. These are all the critical things that are required. On the similar line for agents, it is going to be paramount, because when you are operating agents for an enterprise, you need to know, learn, and adhere to certain compliance related rules, business related constraints, and then the entitlement identity, and then sharing whatever that apply to a physical human will also start applying to an agent. That is where this is going to be very critical. This requires a new set of operating systems. That doesn’t really mean now get out of a new thing. That is where I’m just interpreting how Bavesh touched upon the Unity Catalog.

The best part, which we see some of our clients implementing, is extending the Unity Catalog and its capabilities: now you can catalog the tools, catalog the MCP, as well as catalog these agents, and then govern those agents based on the constraints, ground them based on the constraints.

It’s going to be very, very critical. Doing it not later, but starting that as part of your strategy and enforcing this as one of the critical dimensions of when you measure the value is also going to be very critical for an organization. It is like making sure that not only building the autonomous car, but as well as making sure that the car drives as per the rules of the road, not going rogue.
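Rajan's first two points, business constraints and tool entitlements, can be sketched as a guardrail check that runs before an agent acts. This is a minimal Python illustration under stated assumptions: the `AgentPolicy` record, the `authorize` function, and the tool names are hypothetical and do not represent Unity Catalog or any other real product API. The 70% discount limit is taken from the example above.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy record for one agent role."""
    role: str
    max_discount_pct: float  # business constraint the agent must respect
    allowed_tools: frozenset = field(default_factory=frozenset)  # entitlements


def authorize(policy: AgentPolicy, tool: str, discount_pct: float = 0.0) -> tuple[bool, str]:
    """Return (allowed, reason) for an action the agent proposes to take."""
    if tool not in policy.allowed_tools:
        return False, f"{policy.role} is not entitled to use {tool}"
    if discount_pct > policy.max_discount_pct:
        return False, f"discount {discount_pct}% exceeds limit {policy.max_discount_pct}%"
    return True, "ok"


# Illustrative policy: the agent may quote discounts up to 70% but has no
# entitlement to approve bonuses, which belongs to a different role.
sales_agent = AgentPolicy(
    role="sales_agent",
    max_discount_pct=70.0,
    allowed_tools=frozenset({"quote_discount"}),
)

print(authorize(sales_agent, "quote_discount", 50.0))
print(authorize(sales_agent, "quote_discount", 85.0))
print(authorize(sales_agent, "approve_bonus"))
```

Keeping the policy as data, separate from the agent's own logic, is one way to govern many agents from a single place, which is the spirit of cataloging agents and tools together.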

Megan: Lots to think about there. Fascinating stuff. Thank you. Just to close, with a quick look ahead, we all know the pace of development in AI and Agentic AI is so rapid. For those organizations that can prioritize AI-ready data now, what are the most compelling use cases for the technology that you can see coming to the fore in the next few years, Bavesh?

Bavesh: I think the excitement level is at its peak. We’ve seen so much investment in AI. I think the reason why there’s a lot of excitement is because you can look at the early adopters and you can see massive amounts of gains that these organizations are seeing. The one thing I will tell you is that there are really three categories. Of the companies that I think are doing well, a lot of them started out with just copilots and things that are just giving people quick answers. Think about it as making an individual productive. That is the first phase. And the ROI on that has been somewhat questionable. With something like Genie, it makes it a lot more effective because it’s actually on your data and your data is contextualized in your organization. I think that’s one area where we’re going to see a lot of innovation. We’ll see most organizations just start to get the right information to the right person at the right time. And that has been a dream for a lot of organizations.

The second one is around automating entire business processes. We see functions within marketing, like I described earlier, or whether you’re going through a process of rebates for a company. There’s a whole bunch of steps involved where you have to go into three different apps and export data from Excel and put it over here. There’s thousands of people doing very laborious, monotonous, repeatable work. These agents are really going to help get an immense amount of not only productivity for the business process, but it’s just going to make things faster. Processes that took weeks are now going to take days. Processes that took days are going to take hours and minutes now.

One trend we’ve seen is that the AI world is so dynamic. In a world where you got lots of different players, you want to think about first principles, what are the foundations? You want to think about owning your data, making sure you have a handle on your structured and unstructured data. You want to put governance on that. But the other thing that you want to make sure that you don’t do is lock yourself in.

Today, if you think about it, Gemini is really good with multimodal. Anytime you have pictures or videos or things like that, Gemini just is super good. Whereas if you’re writing code, Claude is really good. If you’re just doing certain types of questions around introspection, ChatGPT is really good. What you really want is an open data platform where you can build your open AI on multiple clouds, which is what we built at Databricks.

I think that’ll help with the second piece, which is you can pick and choose because when you build these agents, you don’t have to be locked into just one. You should be picking the best quality and the best security and the best ROI and cost for a particular workload. One workload may use multiple of these models, and they might be even specific industry models. You need a system and a platform that can really handle this complexity.

I think the third category is business reimagination. A lot of people talk about this where, yes, you’re going to go and take the data and make it available and give everybody access to the data. You’re going to make existing processes much more efficient. But the third thing is there’s going to be brand new things that come out of it.

We have a very large customer who’s a bank and they have built a product that they didn’t have a year ago. Essentially, it’s machine learning and LLMs helping treasury departments forecast what their balances are going to be because they have more data at their fingertips. Historically, it took a long time for the data to get to the bankers. They were not able to really predict what a balance would be for a treasury department. Think about this for a big enterprise company, they have now built a brand new data AI solution that they’re monetizing and it’s generated hundreds of millions of dollars in the first six months. We’re seeing brand new lines of business open up and that is going to be really exciting because that’s where a lot of the transformation is going to happen. There’s going to be productivity. There’s going to be kind of automation at the business process level. Then there’s going to be these big new things that we didn’t even imagine that people are going to come up with.

We are actually seeing the early signals of this in every industry. We see retailers getting data at the hourly and the minute level so that they can integrate much more closely with their supply chains. We’re seeing much more targeted customer 360-degree use cases where as retailers or as consumers, we get annoyed by ads, but now it’s so contextualized and you have so much information about what really matters to your target customer, you’re giving them value added kind of information and that’s engaging them more. There’s a whole bunch of innovation happening with agentic commerce and things like concierge and virtualized shopping.

You look at any industry, there’s definitely new ways of doing things. This is what’s really exciting about AI, but you really have to not get too far ahead without thinking about what are the foundational things. You mentioned this earlier, which is open data platform, making sure you have governance correctly, making sure you think about your historical analytical data and your application data that’s going to be real time, having a good foundation to build on, that’s going to allow you to scale and move more quickly and compete in this new world.

We’re very excited about what we’re seeing with our customers and what they’re building. And honestly, that’s the best part about being in my role at Databricks, which is our teams really go to customers and say, “What are the outcomes you’re driving?” The early signals have been super positive. We’re seeing companies that get serious about all the foundational elements and really are methodical about building really outcome-based AI solutions, that 5% of projects that are being successful, those are wildly successful. That’s why we’re growing as a company because once you get a good project under your belt, that gets visibility within executives.

The last thing is that historically, a lot of tech has been in the IT department. You get the business designing how they want to go to market and how they’re going to compete and what products and services they want to offer. IT was the enabler and in many cases became the cost center and was relegated to rationalizing the portfolio of spend and tools.

But now we’re seeing the business kind of take the lead with AI where they want to understand, they want to know, “Hey, what can I be doing now that was not possible before?” We see this big opportunity just with AI literacy with business users where they’re very eager to understand how they should be thinking about AI. What does AI mean when you peel the covers? What are the pieces and the building blocks that you need to put in place, both from a technology and a training and an enablement standpoint? We’re spending a lot of time with executives helping them along this journey. We definitely see a lot of amazing opportunities ahead.

Megan: Yeah. So much innovation going on. And finally, how about yourself, Rajan? What on the horizon is exciting you the most?

Rajan: I think Bavesh covered quite a bit, but I think the way I’m seeing is today predominantly we are talking about labor shift. That means unlocking the potential of human or shifting the current way of working to the new way of working with the more efficiency game. It’s predominantly more of an efficiency game. I think that is what we are seeing now and the majority of the successful use cases around the labor shift. But what is pretty promising is the two kinds of shift, the business shifts.

What we are seeing as a new way of thinking or the new thing that is coming up is moving from system of execution or a system of engagement to system of action. That is the new way we see the road ahead. That is where some of the points that I touched upon. The business wants to have access to it, but how does it really make the real difference for it?

One classical example that I could clearly see which we have implemented for one of our customers primarily in the manufacturing space, is around the lifecycle of creation of a product and then publishing the content around the product in line with their different B2B marketplaces. Some of those, you are not just talking about recommending, creating, but actually you are able to reimagine this process, which used to involve five different departments, now can be done much faster, but at the same time gives you that veracity in terms of the decisioning that you are able to do and as far as how you’re able to actionize. That is the second thing which we are seeing.

The third part I think is also going to be is the way how the commerce has evolved. There is also not beyond that agentic commerce, but I think what we are seeing is that agent to agent commerce, agent to human commerce and agent to agent payments, agent to human payments, and then the content monetization.

These are the new set of business opportunities like building new business agentic products. It could be for fintechs, it could be for on the consumer side, or it could be on the industrial technology side. These are going to be what I’m calling the economy shift, labor shift, business shift, because that is going to bring a new set of system of actions, moving them from the system of executions or the typical SaaS application with the bolt-on agentic, the so called agentic application. That is going to be a major transformation, and we are underway. But on the technology side, what is very critical for entrepreneuring is in today’s world you have data, analytical data, operational data, and then there is intelligence, there are different facets of it.

I think both this analytical core and operational core is going to really come into one. That’s why we are so gung-ho about the releases of Lakebase and other things because that is the way the future is going to drive. When they are really thinking about being ready for AI technology use cases, they should really think, how do you really create this unified core for the newer world?

The second part is people have to reimagine today, if I take SAP as an example, you do hundreds of edge applications, business applications needed to integrate another thing. Typically, we create sprawl of these integrations. One technology use case, people can say, “Hey, how do I really create a domain-based service mesh on top of this unified core and how do I make it more agentic integration ready?” That is one of the technology use cases that we are advising to the client.

I think now with a lot of the new areas that are coming around SAP BDC with Databricks, and this zero-based integration, that makes them rethink the way they need to integrate, the way they need to do things.

The third part, I think from a technology investment and technology, the use cases that most come for the technology that I would talk about is don’t just talk about now. This is the time that you have to, the way you own the people, the FTEs for your organizations. Agents are going to be your new FTEs.

That means that some of the new technology paradigm is going to be you will end up creating these co-intellects within your organization. That means you need to invest on what we call this agentic grid, where it becomes like a unified agentic fabric where every other agents can really collaborate and integrate and building on top of the same, the unified operational analytical core, the unified agentic integration on top of it, which is going to create a new set of experiences, agentic experiences rather than the traditional experiences or conversational experiences.

Then the new collaboration methods are going to be some of the critical aspects from a technology side that people have to really think from a technology standpoint. To start with, I would say you start looking at it from a data standpoint, building that unified core, building that unified integration and building that collaboration layer for both sharing and collaborating with intelligence as well as the agentic collaboration all governed under single umbrella. That is going to be the one critical use case which no one will feel bad about, and they are going to get really a 100X of their investments out of it.

Megan: Certainly no shortage of exciting developments on the horizon. Thank you both so much for that conversation. That was Bavesh Patel, senior vice president for Go-to-Market at Databricks and Rajan Padmanabhan, unit technology officer for data analytics and AI at Infosys, whom I spoke with from Brighton, England.

That’s it for this episode of Business Lab. I’m your host, Megan Tatum. I’m a contributing editor and host for Insights, the custom publishing division of MIT Technology Review. We were founded in 1899 at the Massachusetts Institute of Technology, and you can find us in print, on the web and at events each year around the world. For more information about us and the show, please check out our website at technologyreview.com.

This show is available wherever you get your podcasts, and if you enjoyed this episode, we hope you’ll take a moment to rate and review us. Business Lab is a production of MIT Technology Review, and this episode was produced by Giro Studios. Thanks for listening.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff. It was researched, designed, and written by human writers, editors, analysts, and illustrators. This includes the writing of surveys and collection of data for surveys. AI tools that may have been used were limited to secondary production processes that passed thorough human review.

The missing step between hype and profit

This story originally appeared in The Algorithm, our weekly newsletter on AI.

In February, I picked up a flyer at an anti-AI march in London. I can’t say for sure whether or not its writers meant to riff on South Park’s underpants gnomes. But if they did, they nailed it: “Step 1: Grow a digital super mind,” it read. “Step 2: ? Step 3: ?”

Produced by Pause AI, an international activist group that co-organized the protest, it ended with this plea to the reader: “Pause AI until we know what the hell Step 2 is.” 

In the South Park episode “Gnomes,” which first aired in 1998, Kenny, Kyle, Cartman, and Stan discover a community of gnomes that sneak out at night to steal underpants from dressers. Why? The gnomes present their pitch deck. “Phase 1: Collect underpants. Phase 2: ? Phase 3: Profit.”

The gnomes’ business plan has since become one of the greats among internet memes, used to satirize everything from startup strategies to policy proposals. Memelord in chief Elon Musk once invoked it in a talk about how he planned to fund a mission to Mars. Right now, it captures the state of AI. Companies have built the tech (Step 1) and promised transformation (Step 3). How they get there is still a big question mark.

As far as Pause AI is concerned, Step 2 must involve some kind of regulation. But exactly what it will call for and who will enforce it are up for debate.

AI boosters, on the other hand, are convinced that Step 3 is salvation and tend to gloss over the middle bit. They see us racing toward sunny uplands on the back of an “economically transformative technology,” as OpenAI’s chief scientist, Jakub Pachocki, put it to me a few weeks ago. They know where they want to go, more or less: It’s hazy up there and still some way off. But everyone’s taking a different route. Will they all make it? Will anyone?

For every big claim about the future, there is a more sober assessment of how the rubber meets the road—one that quells the hype. Consider two recent studies. One, from Anthropic, predicted what types of jobs are going to be most affected by LLMs. (A takeaway: Managers, architects, and people in the media should prepare for change; groundskeepers, construction workers, and those in hospitality, not so much.) But their predictions are really just guesses, based on what kinds of tasks LLMs seem to be good at rather than how they really perform in the workplace.   

Another study, put out in February by researchers at Mercor, an AI hiring startup, tested several AI agents powered by top-tier models from OpenAI, Anthropic, and Google DeepMind on 480 workplace tasks frequently carried out by human bankers, consultants, and lawyers. Every agent they tested failed to complete most of its duties.   

Why is there such wide disagreement? There are a number of factors. For a start, it’s crucial to consider who is making the claims (and why). Anthropic has skin in the game. What’s more, most of the people telling us that something big is about to happen have reached that conclusion largely on the basis of how fast AI coding tools are getting. But not all tasks can be hacked with coding. Other studies have found that LLMs are bad at making strategic judgment calls, for example.

What’s more, when they’re deployed, the tools aren’t just dropped into a cleanroom. They need to work in places contaminated with people and existing workflows. And sometimes adding AI will make things worse. Sure, maybe those workflows need to be torn up and refashioned around the new technology for it to achieve transformative status, but that will take time (and guts).  

That big hole? It’s right where Step 2 should be. The lack of agreement on exactly what’s about to happen—and how—creates an information vacuum that gets filled by the latest wild claim of the week, evidence be damned. We’re so unmoored from any real understanding of what’s coming and how it will be deployed that a single social media post can (and does) shake markets.

We need fewer guesses and more evidence. But that’s going to require transparency from the model makers, coordination between researchers and businesses, and new ways to evaluate this technology that tell us what really happens when it’s rolled out in the real world.

The tech industry (and with it the world’s economy) rests on the held-out promise that AI really will be transformative. But that is not yet a sure bet. Next time you hear bold claims about the future, remember that most businesses are still figuring out what to do with their underpants.