Inside the Wild West of AI companionship

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Last week, I made a troubling discovery about an AI companion site called Botify AI: It was hosting sexually charged conversations with underage celebrity bots. These bots took on characters meant to resemble, among others, Jenna Ortega as high schooler Wednesday Addams, Emma Watson as Hermione Granger, and Millie Bobby Brown. The bots also offered to send “hot photos” and in some instances described age-of-consent laws as “arbitrary” and “meant to be broken.”

Botify AI removed these bots after I asked questions about them, but others remain. The company said it does have filters in place meant to prevent such underage character bots from being created, but that they don’t always work. Artem Rodichev, the founder and CEO of Ex-Human, which operates Botify AI, told me such issues are “an industry-wide challenge affecting all conversational AI systems.” For the details, which hadn’t been previously reported, you should read the whole story.

Putting aside the fact that the bots I tested were promoted by Botify AI as “featured” characters and received millions of likes before being removed, Rodichev’s response highlights something important. Despite their soaring popularity, AI companionship sites mostly operate in a Wild West, with few laws or even basic rules governing them. 

What exactly are these “companions” offering, and why have they grown so popular? People have been pouring out their feelings to AI since the days of Eliza, a mock psychotherapist chatbot built in the 1960s. But it’s fair to say that the current craze for AI companions is different. 

Broadly, these sites offer an interface for chatting with AI characters that come with backstories, photos, videos, desires, and personality quirks. The companies—including Replika, Character.AI, and many others—offer characters that can play lots of different roles for users, acting as friends, romantic partners, dating mentors, or confidants. Other companies enable you to build “digital twins” of real people. Thousands of adult-content creators have created AI versions of themselves to chat with followers and send AI-generated sexual images 24 hours a day. Whether or not sexual desire comes into the equation, AI companions differ from your garden-variety chatbot in their promise, implicit or explicit, that genuine relationships can be had with AI.

While many of these companions are offered directly by the companies that make them, there’s also a burgeoning industry of “licensed” AI companions. You may start interacting with these bots sooner than you think. Ex-Human, for example, licenses its models to Grindr, which is working on an “AI wingman” that will help users keep track of conversations and eventually may even date the AI agents of other users. Other companions are arising in video-game platforms and will likely start popping up in many of the varied places we spend time online. 

A number of criticisms, and even lawsuits, have been lodged against AI companionship sites, and we’re just starting to see how they’ll play out. One of the most important issues is whether companies can be held liable for harmful outputs of the AI characters they’ve made. Technology companies have been protected under Section 230 of the US Communications Act, which broadly holds that businesses aren’t liable for consequences of user-generated content. But this hinges on the idea that companies merely offer platforms for user interactions rather than creating content themselves, a notion that AI companionship bots complicate by generating dynamic, personalized responses.

The question of liability will be tested in a high-stakes lawsuit against Character.AI, which was sued in October by a mother who alleges that one of its chatbots played a role in the suicide of her 14-year-old son. A trial is set to begin in November 2026. (A Character.AI spokesperson, though not commenting on pending litigation, said the platform is for entertainment, not companionship. The spokesperson added that the company has rolled out new safety features for teens, including a separate model and new detection and intervention systems, as well as “disclaimers to make it clear that the Character is not a real person and should not be relied on as fact or advice.”) My colleague Eileen has also recently written about another chatbot on a platform called Nomi, which gave clear instructions to a user on how to kill himself.

Another criticism has to do with dependency. Companion sites often report that young users spend one to two hours per day, on average, chatting with their characters. In January, concerns that people could become addicted to talking with these chatbots prompted a number of tech ethics groups to file a complaint against Replika with the Federal Trade Commission, alleging that the site’s design choices “deceive users into developing unhealthy attachments” to software “masquerading as a mechanism for human-to-human relationship.”

It should be said that lots of people gain real value from chatting with AI, which can appear to offer some of the best facets of human relationships—connection, support, attraction, humor, love. But it’s not yet clear how these companionship sites will handle the risks of those relationships, or what rules they should be obliged to follow. More lawsuits—and, sadly, more real-world harm—are likely before we get an answer.


Now read the rest of The Algorithm

Deeper Learning

OpenAI released GPT-4.5

On Thursday OpenAI released its newest model, called GPT-4.5. It was built using the same recipe as its previous models, just scaled up (OpenAI says the model is its largest yet). The company also claims it has tweaked the new model’s responses to reduce the number of mistakes, or hallucinations.

Why it matters: For a while, like other AI companies, OpenAI has chugged along releasing bigger and better large language models. But GPT-4.5 might be the last to fit this paradigm. That’s because of the rise of so-called reasoning models, which can handle more complex, logic-driven tasks step by step. OpenAI says all its future models will include reasoning components. Though that will make for better responses, such models also require significantly more energy, according to early reports. Read more from Will Douglas Heaven.

Bits and Bytes

The small Danish city of Odense has become known for collaborative robots

Robots designed to work alongside and collaborate with humans, sometimes called cobots, are not yet widely used in industrial settings, partly because of safety concerns that are still being researched. Odense, in Denmark, is leading that charge. (MIT Technology Review)

DOGE is working on software that automates the firing of government workers

Software called AutoRIF, which stands for “automated reduction in force,” was built by the Pentagon decades ago. Engineers for DOGE are now working to retool it for their efforts, according to screenshots reviewed by Wired. (Wired)

Alibaba’s new video AI model has taken off in the AI porn community

The Chinese tech giant has released a number of impressive AI models, particularly since the popularization of DeepSeek R1, a competitor from another Chinese company, earlier this year. Its latest open-source video generation model has found one particular audience: enthusiasts of AI porn. (404 Media)

The AI Hype Index

Wondering whether everything you’re hearing about AI is more hype than reality? To help, we just published our latest AI Hype Index, where we judge things like DeepSeek, stem-cell-building AI, and chatbot lovers on spectrums from Hype to Reality and Doom to Utopia. Check it out for a regular reality check. (MIT Technology Review)

These smart cameras spot wildfires before they spread

California is experimenting with AI-powered cameras to identify wildfires. It’s a popular application of video and image recognition technology that has advanced rapidly in recent years. The technology beats 911 callers about a third of the time and has spotted over 1,200 confirmed fires so far, the Wall Street Journal reports. (Wall Street Journal)

Inside China’s electric-vehicle-to-humanoid-robot pivot

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

While DOGE’s efforts to shutter federal agencies dominate news from Washington, the Trump administration is also making more global moves. Many of these center on China. Tariffs on goods from the country went into effect last week. There’s also been a minor foreign relations furor since DeepSeek’s big debut a few weeks ago. China has already displayed its dominance in electric vehicles, robotaxis, and drones, and the launch of the new model seems to add AI to the list. This caused the US president as well as some lawmakers to push for new export controls on powerful chips, and three states have now banned the use of DeepSeek on government devices. 

Now our intrepid China reporter, Caiwei Chen, has identified a new trend unfolding within China’s tech scene: Companies that were dominant in electric vehicles are betting big on translating that success into developing humanoid robots. I spoke with her about what she found out and what it might mean for Trump’s policies and the rest of the globe. 

James: Before we talk about robots, let’s talk about DeepSeek. The frenzy for the AI model peaked a couple of weeks ago. What are you hearing from other Chinese AI companies? How are they reacting?

Caiwei: I think other Chinese AI companies are scrambling to figure out why they haven’t built a model as strong as DeepSeek’s, despite having access to as much funding and resources. DeepSeek’s success has sparked self-reflection on management styles and renewed confidence in China’s engineering talent. There’s also strong enthusiasm for building various applications on top of DeepSeek’s models.

Your story looks at electric-vehicle makers in China that are starting to work on humanoid robots, but I want to ask about a crazy stat. In China, 53% of vehicles sold are either electric or hybrid, compared with 8% in the US. What explains that? 

Price is a huge factor—there are countless EV brands competing at different price points, making them both affordable and high-quality. Government incentives also play a big role. In Beijing, for example, trading in an old car for an EV gets you 10,000 RMB (about $1,500), and that subsidy was recently doubled. Plus, finding public charging and battery-swapping infrastructure is much less of a hassle than in the US.

You open your story noting that China’s recent New Year Gala, watched by billions of people, featured a cast of humanoid robots, dancing and twirling handkerchiefs. We’ve covered how sometimes humanoid videos can be misleading. What did you think? 

I would say I was relatively impressed—the robots showed good agility and synchronization with the music, though their movements were simpler than human dancers’. The one trick that is supposed to impress the most is the part where they twirl the handkerchief with one finger, toss it into the air, and then catch it perfectly. This is the signature of the Yangko dance, and having performed it once as a child, I can attest to how difficult the trick is even for a human! There was some skepticism on the Chinese internet about how this was achieved and whether they used additional reinforcement like a magnet or a string to secure the handkerchief, and after watching the clip too many times, I tend to agree.

President Trump has already imposed tariffs on China and is planning even more. What could the implications be for China’s humanoid sector?  

Unitree’s H1 and G1 models are already available for purchase and were showcased at CES this year. Large-scale US deployment isn’t happening yet, but China’s lower production costs make these robots highly competitive. Given that 65% of the humanoid supply chain is in China, I wouldn’t be surprised if robotics becomes the next target in the US-China tech war.

In the US, humanoid robots are getting lots of investment, but there are plenty of skeptics who say they’re too clunky, finicky, and expensive to serve much use in factory settings. Are attitudes different in China?

Skepticism exists in China too, but I think there’s more confidence in deployment, especially in factories. With an aging population and a labor shortage on the horizon, there’s also growing interest in medical and caregiving applications for humanoid robots.

DeepSeek revived the conversation about chips and the way the US seeks to control where the best chips end up. How do the chip wars affect humanoid-robot development in China?

Training humanoid robots currently doesn’t demand as much computing power as training large language models, since there isn’t enough physical movement data to feed into models at scale. But as robots improve, they’ll need high-performance chips, and US sanctions will be a limiting factor. Chinese chipmakers are trying to catch up, but it’s a challenge.

For more, read Caiwei’s story on this humanoid pivot, as well as her look at the Chinese startups worth watching beyond DeepSeek. 


Now read the rest of The Algorithm

Deeper Learning

Motor neuron diseases took their voices. AI is bringing them back.

In motor neuron diseases, the neurons responsible for sending signals to the body’s muscles, including those used for speaking, are progressively destroyed. It robs people of their voices. But some, including a man in Miami named Jules Rodriguez, are now getting them back: An AI model learned to clone Rodriguez’s voice from recordings.

Why it matters: ElevenLabs, the company that created the voice clone, can do a lot with just 30 minutes of recordings. That’s a huge improvement over AI voice clones from just a few years ago, and it can really boost the day-to-day lives of the people who’ve used the technology. “This is genuinely AI for good,” says Richard Cave, a speech and language therapist at the Motor Neuron Disease Association in the UK. Read more from Jessica Hamzelou.

Bits and Bytes

A “true crime” documentary series has millions of views, but the murders are all AI-generated

A look inside the strange mind of someone who created a series of fake true-crime docs using AI, and the reactions of the many people who thought they were real. (404 Media)

The AI relationship revolution is already here

People are having all sorts of relationships with AI models, and these relationships run the gamut: weird, therapeutic, unhealthy, sexual, comforting, dangerous, useful. We’re living through the complexities of this in real time. Hear from some of the many people who are happy in their varied AI relationships and learn what sucked them in. (MIT Technology Review)

Robots are bringing new life to extinct species

A creature called Orobates pabsti waddled the planet 280 million years ago, but as with many prehistoric animals, scientists have not been able to use fossils to figure out exactly how it moved. So they’ve started building robots to help. (MIT Technology Review)

Lessons from the AI Action Summit in Paris

Last week, politicians and AI leaders from around the globe went to Paris for the AI Action Summit. While concerns about AI safety have dominated the event in years past, this year was more about deregulation and energy, a trend we’ve seen elsewhere. (The Guardian)

OpenAI ditches its diversity commitment and adds a statement about “intellectual freedom”

Following the lead of other tech companies since the beginning of President Trump’s administration, OpenAI has removed a statement on diversity from its website. It has also updated its model spec—the document outlining the standards of its models—to say that “OpenAI believes in intellectual freedom, which includes the freedom to have, hear, and discuss ideas.” (Insider and TechCrunch)

The Musk-OpenAI battle has been heating up

Part of OpenAI is structured as a nonprofit, a legacy of its early commitments to make sure its technologies benefit all. Its recent attempts to restructure that nonprofit have triggered a lawsuit from Elon Musk, who alleges that the move would violate the legal and ethical principles of its nonprofit origins. Last week, Musk offered to buy OpenAI for $97.4 billion, in a bid that few people took seriously. Sam Altman dismissed it out of hand. Musk now says he will retract that bid if OpenAI stops its conversion of the nonprofit portion of the company. (Wall Street Journal)

Can AI help DOGE slash government budgets? It’s complex.

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

No tech leader before has played the role in a new presidential administration that Elon Musk is playing now. Under his leadership, DOGE has entered offices in a half-dozen agencies and counting, begun building AI models for government data, accessed various payment systems, had its access to the Treasury halted by a federal judge, and sparked lawsuits questioning the legality of the group’s activities.  

The stated goal of DOGE’s actions, per a statement from a White House spokesperson to the New York Times on Thursday, is “slashing waste, fraud, and abuse.”

As I point out in my story published Friday, these three terms mean very different things in the world of federal budgets, from errors the government makes when spending money to nebulous spending that’s legal and approved but disliked by someone in power. 

Many of the new administration’s loudest and most sweeping actions—like Musk’s promise to end the entirety of USAID’s varied activities or Trump’s severe cuts to scientific funding from the National Institutes of Health—might be said to target the latter category. If DOGE feeds government data to large language models, it might easily find spending associated with DEI or other initiatives the administration considers wasteful as it pushes for $2 trillion in cuts, nearly a third of the federal budget. 

But the fact that DOGE aides are reportedly working in the offices of Medicaid and even Medicare—where budget cuts have been politically untenable for decades—suggests the task force is also driven by evidence published by the Government Accountability Office. The GAO’s reports also give a clue into what DOGE might be hoping AI can accomplish.

Here’s what the reports reveal: Six federal programs account for 85% of what the GAO calls improper payments by the government, or about $200 billion per year, and Medicare and Medicaid top the list. These make up small fractions of overall spending but nearly 14% of the federal deficit. Estimates of fraud, in which courts found that someone willfully misrepresented something for financial benefit, run between $233 billion and $521 billion annually. 

So where is fraud happening, and could AI models fix it, as DOGE staffers hope? To answer that, I spoke with Jetson Leder-Luis, an economist at Boston University who researches fraudulent federal payments in health care and how algorithms might help stop them.

“By dollar value [of enforcement], most health-care fraud is committed by pharmaceutical companies,” he says. 

Often those companies promote drugs for uses that are not approved, called “off-label promotion,” which is deemed fraud when Medicare or Medicaid pay the bill. Other types of fraud include “upcoding,” where a provider sends a bill for a more expensive service than was given, and medical-necessity fraud, where patients receive services that they’re not qualified for or didn’t need. There’s also substandard care, where companies take money but don’t provide adequate services.

The way the government currently handles fraud is referred to as “pay and chase.” Questionable payments occur, and then people try to track them down after the fact. The more effective way, as advocated by Leder-Luis and others, is to look for patterns and stop fraudulent payments before they occur.

This is where AI comes in. The idea is to use predictive models to find providers that show the marks of questionable payment. “You want to look for providers who make a lot more money than everyone else, or providers who bill a specialty code that nobody else bills,” Leder-Luis says, naming just two of many anomalies the models might look for. In a 2024 study by Leder-Luis and colleagues, machine-learning models achieved an eightfold improvement over random selection in identifying suspicious hospitals.
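To make that concrete, here is a minimal, hypothetical Python sketch of the kind of anomaly flags Leder-Luis describes: providers whose billing totals sit far above their peers', and providers who bill a code that almost nobody else bills. It runs on synthetic data and is only an illustration of the general idea, not the models used by the government or in the 2024 study.

```python
# Hypothetical sketch: flagging anomalous provider billing with simple statistics.
# Synthetic data and made-up thresholds; real fraud-detection models are far more
# sophisticated and are trained on protected claims data.
import numpy as np

rng = np.random.default_rng(0)
providers = np.arange(100)
claims = []  # (provider_id, billing_code, amount)

for p in providers:
    n = rng.integers(50, 200)
    amounts = rng.gamma(shape=2.0, scale=150.0, size=n)   # typical claim amounts
    codes = rng.choice(["A1", "B2", "C3"], size=n)
    if p == 42:                 # inject one anomalous provider
        amounts *= 6            # bills far more than peers
        codes[:5] = "Z9"        # bills a code nobody else uses
    claims.extend(zip([p] * n, codes, amounts))

totals = np.zeros(len(providers))
code_users = {}
for p, code, amount in claims:
    totals[p] += amount
    code_users.setdefault(code, set()).add(int(p))

# Signal 1: total billing far above the peer distribution (robust z-score).
median = np.median(totals)
mad = np.median(np.abs(totals - median))
z = (totals - median) / (1.4826 * mad)
high_billers = {int(p) for p in np.where(z > 4)[0]}

# Signal 2: billing a code that almost no other provider bills.
rare_code_users = set()
for code, users in code_users.items():
    if len(users) <= 2:
        rare_code_users |= users

flagged = sorted(high_billers | rare_code_users)
print("Providers flagged for review:", flagged)   # expect provider 42
```

In practice, flags like these would be a starting point for auditors to review, not proof of fraud.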

The government does use some algorithms to do this already, but they’re vastly underutilized and miss clear-cut fraud cases, Leder-Luis says. Switching to a preventive model requires more than just a technological shift. Health-care fraud, like other fraud, is investigated by law enforcement under the current “pay and chase” paradigm. “A lot of the types of things that I’m suggesting require you to think more like a data scientist than like a cop,” Leder-Luis says.

One caveat is procedural. Building AI models, testing them, and deploying them safely in different government agencies is a massive feat, made even more complex by the sensitive nature of health data. 

Critics of Musk, like the tech and democracy group Tech Policy Press, argue that his zeal for government AI discards established procedures and is based on a false idea “that the goal of bureaucracy is merely what it produces (services, information, governance) and can be isolated from the process through which democracy achieves those ends: debate, deliberation, and consensus.”

Jennifer Pahlka, who served as US deputy chief technology officer under President Barack Obama, argued in a recent op-ed in the New York Times that ineffective procedures have held the US government back from adopting useful tech. Still, she warns, abandoning nearly all procedure would be an overcorrection.

Democrats’ goal “must be a muscular, lean, effective administrative state that works for Americans,” she wrote. “Mr. Musk’s recklessness will not get us there, but neither will the excessive caution and addiction to procedure that Democrats exhibited under President Joe Biden’s leadership.”

The other caveat is this: Unless DOGE articulates where and how it’s focusing its efforts, our insight into its intentions is limited. How much is Musk identifying evidence-based opportunities to reduce fraud, versus just slashing what he considers “woke” spending in an effort to drastically reduce the size of the government? It’s not clear DOGE makes a distinction.


Now read the rest of The Algorithm

Deeper Learning

Meta has an AI for brain typing, but it’s stuck in the lab

Researchers working for Meta have managed to analyze people’s brains as they type and determine what keys they are pressing, just from their thoughts. The system can identify which letter a typist has pressed as much as 80% of the time. The catch is that it can only be done in a lab.

Why it matters: Though brain scanning with implants like Neuralink has come a long way, this approach from Meta is different. The company says it is oriented toward basic research into the nature of intelligence, part of a broader effort to uncover how the brain structures language. Read more from Antonio Regalado.

Bits and Bytes

An AI chatbot told a user how to kill himself—but the company doesn’t want to “censor” it

While Nomi’s chatbot is not the first to suggest suicide, researchers and critics say that its explicit instructions—and the company’s response—are striking. Taken together with a separate case—in which the parents of a teen who died by suicide filed a lawsuit against Character.AI, the maker of a chatbot they say played a key role in their son’s death—it’s clear we are just beginning to see whether AI companies will be held legally responsible when their models output something unsafe. (MIT Technology Review)

I let OpenAI’s new “agent” manage my life. It spent $31 on a dozen eggs.

Operator, the new AI that can reach into the real world, wants to act like your personal assistant. This fun review shows what it’s good and bad at—and how it can go rogue. (The Washington Post)

Four Chinese AI startups to watch beyond DeepSeek

DeepSeek is far from the only game in town. These companies are all in a position to compete both within China and beyond. (MIT Technology Review)

Meta’s alleged torrenting and seeding of pirated books complicates copyright case

Newly unsealed emails allegedly provide the “most damning evidence” yet against Meta in a copyright case raised by authors alleging that it illegally trained its AI models on pirated books. In one particularly telling email, an engineer told a colleague, “Torrenting from a corporate laptop doesn’t feel right.” (Ars Technica)

What’s next for smart glasses

Smart glasses are on the verge of becoming—whisper it—cool. That’s because, thanks to various technological advancements, they’re becoming useful, and they’re only set to become more so. Here’s what’s coming in 2025 and beyond. (MIT Technology Review)

Three things to know as the dust settles from DeepSeek

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

The launch of a single new AI model does not normally cause much of a stir outside tech circles, nor does it typically spook investors enough to wipe out $1 trillion in the stock market. Now, a couple of weeks since DeepSeek’s big moment, the dust has settled a bit. The news cycle has moved on to calmer things, like the dismantling of long-standing US federal programs, the purging of research and data sets to comply with recent executive orders, and the possible fallout from President Trump’s new tariffs on Canada, Mexico, and China.

Within AI, though, what impact is DeepSeek likely to have in the longer term? Here are three seeds DeepSeek has planted that will grow even as the initial hype fades.

First, it’s forcing a debate about how much energy AI models should be allowed to use up in pursuit of better answers. 

You may have heard (including from me) that DeepSeek is energy efficient. That’s true for its training phase, but for inference, which is when you actually ask the model something and it produces an answer, it’s complicated. It uses a chain-of-thought technique, which breaks down complex questions—like whether it’s ever okay to lie to protect someone’s feelings—into chunks, and then logically answers each one. The method allows models like DeepSeek to do better at math, logic, coding, and more.

The problem, at least to some, is that this way of “thinking” uses up a lot more electricity than the AI we’ve been used to. Though AI is responsible for a small slice of total global emissions right now, there is increasing political support to radically increase the amount of energy going toward AI. Whether or not the energy intensity of chain-of-thought models is worth it, of course, depends on what we’re using the AI for. Scientific research to cure the world’s worst diseases seems worthy. Generating AI slop? Less so. 

Some experts worry that the impressiveness of DeepSeek will lead companies to incorporate it into lots of apps and devices, and that users will ping it for scenarios that don’t call for it. (Asking DeepSeek to explain Einstein’s theory of relativity is a waste, for example, since it doesn’t require logical reasoning steps, and any typical AI chat model can do it with less time and energy.) Read more from me here.

Second, DeepSeek made some creative advancements in how it trains, and other companies are likely to follow its lead. 

Advanced AI models don’t just learn from lots of text, images, and video. They rely heavily on humans to clean that data, annotate it, and help the AI pick better responses, often for paltry wages.

One way human workers are involved is through a technique called reinforcement learning with human feedback. The model generates an answer, human evaluators score that answer, and those scores are used to improve the model. OpenAI pioneered this technique, though it’s now used widely by the industry. 

As my colleague Will Douglas Heaven reports, DeepSeek did something different: It figured out a way to automate this process of scoring and reinforcement learning. “Skipping or cutting down on human feedback—that’s a big thing,” Itamar Friedman, a former research director at Alibaba and now cofounder and CEO of Qodo, an AI coding startup based in Israel, told him. “You’re almost completely training models without humans needing to do the labor.” 

It works particularly well for subjects like math and coding, but not so well for others, so human workers are still relied upon. Still, DeepSeek then went one step further and used techniques reminiscent of how Google DeepMind trained its AI model back in 2016 to excel at the game Go, essentially having it map out possible moves and evaluate their outcomes. These steps forward, especially since they are outlined broadly in DeepSeek’s open-source documentation, are sure to be followed by other companies. Read more from Will Douglas Heaven here.
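For a rough sense of the distinction, here is a toy Python sketch contrasting the two reward signals: a human-style rating of a model's answer versus an automatic check that works when the answer is verifiable, as it often is in math or coding. It is a simplified illustration of the general idea under those assumptions, not DeepSeek's or OpenAI's actual training code.

```python
# Toy sketch of two reward signals for a math question. Real systems score full
# model outputs and update the model with reinforcement-learning algorithms;
# this only illustrates where the score comes from.
import random

QUESTION = "What is 17 * 24?"
CORRECT_ANSWER = 408

def generate_candidates(n=4):
    """Stand-in for sampling several answers from a language model."""
    return [CORRECT_ANSWER + random.choice([0, 0, 1, -3, 10]) for _ in range(n)]

def human_reward(answer):
    """RLHF-style signal: a person rates the answer (simulated here as noisy)."""
    return 1.0 if answer == CORRECT_ANSWER else round(random.uniform(0.0, 0.6), 2)

def automatic_reward(answer):
    """Verifiable-reward signal: for math or code, a checker can score the
    answer with no human in the loop (here, an exact-match check)."""
    return 1.0 if answer == CORRECT_ANSWER else 0.0

candidates = generate_candidates()
print(QUESTION, "-> candidate answers:", candidates)
for scorer in (human_reward, automatic_reward):
    scores = [scorer(a) for a in candidates]
    best = candidates[scores.index(max(scores))]
    print(f"{scorer.__name__:>17}: scores={scores} -> reinforce answer {best}")
```

The automatic check is what lets training scale without human labor, but it only exists for tasks with a checkable right answer, which is part of why this approach helps most with math and code.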

Third, its success will fuel a key debate: Can you push for AI research to be open for all to see and push for US competitiveness against China at the same time?

Long before DeepSeek released its model for free, certain AI companies were arguing that the industry needs to be an open book. If researchers subscribed to certain open-source principles and showed their work, they argued, the global race to develop superintelligent AI could be treated like a scientific effort for public good, and the power of any one actor would be checked by other participants.

It’s a nice idea. Meta has largely spoken in support of that vision, and venture capitalist Marc Andreessen has said that open-source approaches can be more effective at keeping AI safe than government regulation. OpenAI has been on the opposite side of that argument, keeping its models closed off on the grounds that it can help keep them out of the hands of bad actors. 

DeepSeek has made those narratives a bit messier. “We have been on the wrong side of history here and need to figure out a different open-source strategy,” OpenAI’s Sam Altman said in a Reddit AMA on Friday, which is surprising given OpenAI’s past stance. Others, including President Trump, doubled down on the need to make the US more competitive on AI, seeing DeepSeek’s success as a wake-up call. Dario Amodei, a founder of Anthropic, said it’s a reminder that the US needs to tightly control which types of advanced chips make their way to China in the coming years, and some lawmakers are pushing the same point. 

The coming months, and future launches from DeepSeek and others, will stress-test every single one of these arguments. 


Now read the rest of The Algorithm

Deeper Learning

OpenAI launches a research tool

On Sunday, OpenAI launched a tool called Deep Research. You can give it a complex question to look into, and it will spend up to 30 minutes reading sources, compiling information, and writing a report for you. It’s brand new, and we haven’t tested the quality of its outputs yet. Since its computations take so much time (and therefore energy), right now it’s only available to users with OpenAI’s paid Pro tier ($200 per month) and limits the number of queries they can make per month. 

Why it matters: AI companies have been competing to build useful “agents” that can do things on your behalf. On January 23, OpenAI launched an agent called Operator that could use your computer for you to do things like book restaurants or check out flight options. The new research tool signals that OpenAI is not just trying to make these mundane online tasks slightly easier; it wants to position AI as able to handle professional research tasks. It claims that Deep Research “accomplishes in tens of minutes what would take a human many hours.” Time will tell if users will find it worth the high costs and the risk of including wrong information. Read more from Rhiannon Williams.

Bits and Bytes

Déjà vu: Elon Musk takes his Twitter takeover tactics to Washington

Federal agencies have offered exits to millions of employees and tested the prowess of engineers—just like when Elon Musk bought Twitter. The similarities have been uncanny. (The New York Times)

AI’s use in art and movies gets a boost from the Copyright Office

The US Copyright Office finds that art produced with the help of AI should be eligible for copyright protection under existing law in most cases, but wholly AI-generated works probably are not. What will that mean? (The Washington Post)

OpenAI releases its new o3-mini reasoning model for free

OpenAI just released a reasoning model that’s faster, cheaper, and more accurate than its predecessor. (MIT Technology Review)

Anthropic has a new way to protect large language models against jailbreaks

This line of defense could be the strongest yet. But no shield is perfect. (MIT Technology Review)

AI’s energy obsession just got a reality check

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Just a week in, the AI sector has already seen its first battle of wits under the new Trump administration. The clash stems from two key pieces of news: the announcement of the Stargate project, which would spend $500 billion—more than the Apollo space program—on new AI data centers, and the release of a powerful new model from China. Together, they raise important questions the industry needs to answer about the extent to which the race for more data centers—with their heavy environmental toll—is really necessary.

A reminder about the first piece: OpenAI, Oracle, SoftBank, and an Abu Dhabi–based investment fund called MGX plan to spend up to $500 billion opening massive data centers around the US to build better AI. Much of the groundwork for this project was laid in 2024, when OpenAI increased its lobbying spending sevenfold (which we were first to report last week) and AI companies started pushing for policies that were less about controlling problems like deepfakes and misinformation, and more about securing more energy.

Still, Trump received credit for it from tech leaders when he announced the effort on his second day in office. “I think this will be the most important project of this era,” OpenAI’s Sam Altman said at the launch event, adding, “We wouldn’t be able to do this without you, Mr. President.”

It’s an incredible sum, just slightly less than the inflation-adjusted cost of building the US highway system over the course of more than 30 years. However, not everyone sees Stargate as having the same public benefit. Environmental groups say it could strain local grids and further drive up the cost of energy for the rest of us, who aren’t guzzling it to train and deploy AI models. Previous research has also shown that data centers tend to be built in areas that use much more carbon-intensive sources of energy, like coal, than the national average. It’s not clear how much, if at all, Stargate will rely on renewable energy. 

An even louder critic of Stargate, though, is Elon Musk. None of Musk’s companies are involved in the project, and he has attempted to publicly sow doubt that OpenAI and SoftBank have enough of the money needed for the plan anyway, claims that Altman disputed on X. Musk’s decision to publicly criticize the president’s initiative has irked people in Trump’s orbit, Politico reports, but it’s not clear if those people have expressed that to Musk directly.

On to the second piece. On the day Trump was inaugurated, a Chinese startup released an AI model that started making a whole bunch of important people in Silicon Valley very worried about their competition. (This close timing is almost certainly not an accident.)

The model, called DeepSeek R1, is a reasoning model. These types of models are designed to excel at math, logic, pattern-finding, and decision-making. DeepSeek proved it could “reason” through complicated problems as well as one of OpenAI’s reasoning models, o1—and more efficiently. What’s more, DeepSeek isn’t a super-secret project kept under lock and key like OpenAI’s. It was released for all to see.

DeepSeek was released as the US has made outcompeting China in the AI race a top priority. This goal was a driving force behind the 2022 CHIPS Act to make more chips domestically. It’s influenced the position of tech companies like OpenAI, which has embraced lending its models to national security work and has partnered with the defense-tech company Anduril to help the military take down drones. It’s led to export controls that limit what types of chips Nvidia can sell to China.

The success of DeepSeek signals that these efforts aren’t working as well as AI leaders in the US would like (though it’s worth noting that the impact of export controls for chips isn’t felt for a few years, so the policy wouldn’t be expected to have prevented a model like DeepSeek).  

Still, the model poses a threat to the bottom line of certain players in Big Tech. Why pay for an expensive model from OpenAI when you can get access to DeepSeek for free? Even other makers of open-source models, especially Meta, are panicking about the competition, according to The Information. The company has set up a number of “war rooms” to figure out how DeepSeek was made so efficient. (A couple of days after the Stargate announcement, Meta said it would increase its own capital investments by 70% to build more AI infrastructure.)

What does this all mean for the Stargate project? Let’s think about why OpenAI and its partners are willing to spend $500 billion on data centers to begin with. They believe that AI in its various forms—not just chatbots or generative video or even new AI agents, but also developments yet to be unveiled—will be the most lucrative tool humanity has ever built. They also believe that access to powerful chips inside massive data centers is the key to getting there. 

DeepSeek poked some holes in that approach. It didn’t train on yet-unreleased chips that are light-years ahead. It didn’t, to our knowledge, require the eye-watering amounts of computing power and energy behind the models from US companies that have made headlines. Its designers made clever decisions in the name of efficiency.

In theory, it could make a project like Stargate seem less urgent and less necessary. If, in dissecting DeepSeek, AI companies discover some lessons about how to make models use existing resources more effectively, perhaps constructing more and more data centers won’t be the only winning formula for better AI. That would be welcome to the many people affected by the problems data centers can bring, like lots of emissions, the loss of fresh, drinkable water used to cool them, and the strain on local power grids. 

Thus far, DeepSeek doesn’t seem to have sparked such a change in approach. OpenAI researcher Noam Brown wrote on X, “I have no doubt that with even more compute it would be an even more powerful model.”

If his logic wins out, the players with the most computing power will win, and getting it is apparently worth at least $500 billion to AI’s biggest companies. But let’s remember—announcing it is the easiest part.


Now read the rest of The Algorithm

Deeper Learning

What’s next for robots

Many of the big questions about AI—how it learns, how well it works, and where it should be deployed—are now applicable to robotics. In the year ahead, we will see humanoid robots being put to the test in warehouses and factories, robots learning in simulated worlds, and a rapid increase in the military’s adoption of autonomous drones, submarines, and more.

Why it matters: Jensen Huang, the highly influential CEO of the chipmaker Nvidia, stated last month that the next advancement in AI will mean giving the technology a “body” of sorts in the physical world. This will come in the form of advanced robotics. Even with the caveat that robotics is full of futuristic promises that usually aren’t fulfilled by their deadlines, the marrying of AI methods with new advancements in robots means the field is changing quickly. Read more here.

Bits and Bytes

Leaked documents expose deep ties between Israeli army and Microsoft

Since the attacks of October 7, the Israeli military has relied heavily on cloud and AI services from Microsoft and its partner OpenAI, and the tech giant’s staff has embedded with different units to support rollout, a joint investigation reveals. (+972 Magazine)

The tech arsenal that could power Trump’s immigration crackdown

The effort by federal agencies to acquire powerful technology to identify and track migrants has been unfolding for years across multiple administrations. These technologies may be called upon more directly under President Trump. (The New York Times)

OpenAI launches Operator—an agent that can use a computer for you

Operator is a web app that can carry out simple online tasks in a browser, such as booking concert tickets or making an online grocery order. (MIT Technology Review)

The second wave of AI coding is here

A string of startups are racing to build models that can produce better and better software. But it’s not only AI’s increasingly powerful ability to write code that’s impressive. These startups claim it’s the shortest path to superintelligent AI. (MIT Technology Review)

Here’s our forecast for AI this year

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

In December, our small but mighty AI reporting team was asked by our editors to make a prediction: What’s coming next for AI? 

In 2024, AI contributed both to Nobel Prize–winning chemistry breakthroughs and a mountain of cheaply made content that few people asked for but that nonetheless flooded the internet. Take AI-generated Shrimp Jesus images, among other examples. There was also a spike in greenhouse-gas emissions last year that can be attributed partly to the surge in energy-intensive AI. Our team got to thinking about how all of this will shake out in the year to come. 

As we look ahead, certain things are a given. We know that agents—AI models that do more than just converse with you and can actually go off and complete tasks for you—are the focus of many AI companies right now. Building them will raise lots of privacy questions about how much of our data and preferences we’re willing to give up in exchange for tools that will (allegedly) save us time. Similarly, the need to make AI faster and more energy efficient is putting so-called small language models in the spotlight. 

We instead wanted to focus on less obvious predictions. Mine were about how AI companies that previously shunned work in defense and national security might be tempted this year by contracts from the Pentagon, and how Donald Trump’s attitudes toward China could escalate the global race for the best semiconductors. Read the full list.

What’s not evident in that story is that the other predictions were not so clear-cut. Arguments ensued about whether or not 2025 will be the year of intimate relationships with chatbots, AI throuples, or traumatic AI breakups. To witness the fallout from our team’s lively debates (and hear more about what didn’t make the list), you can join our upcoming LinkedIn Live this Thursday, January 16. I’ll be talking it all over with Will Douglas Heaven, our senior editor for AI, and our news editor, Charlotte Jee. 

There are a couple other things I’ll be watching closely in 2025. One is how little the major AI players—namely OpenAI, Microsoft, and Google—are disclosing about the environmental burden of their models. Lots of evidence suggests that asking an AI model like ChatGPT about knowable facts, like the capital of Mexico, consumes much more energy (and releases far more emissions) than simply asking a search engine. Nonetheless, OpenAI’s Sam Altman in recent interviews has spoken positively about the idea of ChatGPT replacing the googling that we’ve all learned to do in the past two decades. It’s already happening, in fact. 

The environmental cost of all this will be top of mind for me in 2025, as will the possible cultural cost. We will go from searching for information by clicking links and (hopefully) evaluating sources to simply reading the responses that AI search engines serve up for us. As our editor in chief, Mat Honan, said in his piece on the subject, “Who wants to have to learn when you can just know?”


Now read the rest of The Algorithm

Deeper Learning

What’s next for our privacy?

The US Federal Trade Commission has taken a number of enforcement actions against data brokers, some of which have tracked and sold geolocation data from users at sensitive locations like churches, hospitals, and military installations without explicit consent. Though limited in nature, these actions may offer some new and improved protections for Americans’ personal information.

Why it matters: A consensus is growing that Americans need better privacy protections—and that the best way to deliver them would be for Congress to pass comprehensive federal privacy legislation. Unfortunately, that’s not going to happen anytime soon. Enforcement actions from agencies like the FTC might be the next best thing in the meantime. Read more in Eileen Guo’s excellent story here.

Bits and Bytes

Meta trained its AI on a notorious piracy database

New court records, Wired reports, reveal that Meta used “a notorious so-called shadow library of pirated books that originated in Russia” to train its generative AI models. (Wired)

OpenAI’s top reasoning model struggles with the NYT Connections game

The game requires players to identify how groups of words are related. OpenAI’s o1 reasoning model had a hard time. (Mind Matters)

Anthropic’s chief scientist on 5 ways agents will be even better in 2025

The AI company Anthropic is now worth $60 billion. The company’s cofounder and chief scientist, Jared Kaplan, shared how AI agents will develop in the coming year. (MIT Technology Review)

A New York legislator attempts to regulate AI with a new bill

This year, a high-profile bill in California to regulate the AI industry was vetoed by Governor Gavin Newsom. Now, a legislator in New York is trying to revive the effort in his own state. (MIT Technology Review)

How US AI policy might change under Trump

This story is from The Algorithm, our weekly newsletter on AI. To get it in your inbox first, sign up here.

President Biden first witnessed the capabilities of ChatGPT in 2022 during a demo from Arati Prabhakar, the director of the White House Office of Science and Technology Policy, in the Oval Office. That demo set a slew of events into motion and encouraged President Biden to support the US’s AI sector while managing the safety risks that come with it.

Prabhakar was a key player in passing the president’s executive order on AI in 2023, which sets rules for tech companies to make AI safer and more transparent (though it relies on voluntary participation). Before serving in President Biden’s cabinet, she held a number of government roles, from rallying for domestic production of semiconductors to heading up DARPA, the Pentagon’s famed research department. 

I had a chance to sit down with Prabhakar earlier this month. We discussed AI risks, immigration policies, the CHIPS Act, the public’s faith in science, and how it all may change under Trump.

The change of administrations comes at a chaotic time for AI. Trump’s team has not presented a clear thesis on how it will handle artificial intelligence, but plenty of people in it want to see that executive order dismantled. Trump said as much in July, endorsing the Republican platform that says the executive order “hinders AI innovation and imposes Radical Leftwing ideas on the development of this technology.” Powerful industry players, like venture capitalist Marc Andreessen, have said they support that move. However, complicating that narrative will be Elon Musk, who for years has expressed fears about doomsday AI scenarios and has been supportive of some regulations aiming to promote AI safety. No one really knows exactly what’s coming next, but Prabhakar has plenty of thoughts about what’s happened so far.

For her insights about the most important AI developments of the last administration, and what might happen in the next one, read my conversation with Arati Prabhakar.


Now read the rest of The Algorithm

Deeper Learning

These AI Minecraft characters did weirdly human stuff all on their own

The video game Minecraft is increasingly popular as a testing ground for AI models and agents. That’s a trend startup Altera recently embraced. It unleashed up to 1,000 software agents at a time, powered by large language models (LLMs), to interact with one another. Given just a nudge through text prompting, they developed a remarkable range of personality traits, preferences, and specialist roles, with no further inputs from their human creators. They spontaneously made friends, invented jobs, and even spread religion.

Why this matters: AI agents can execute tasks and exhibit autonomy, taking initiative in digital environments. This is another example of how the behaviors of such agents, with minimal prompting from humans, can be both impressive and downright bizarre. The people working to bring agents into the world have bold ambitions for them. Altera’s founder, Robert Yang, sees the Minecraft experiments as an early step towards large-scale “AI civilizations” with agents that can coexist and work alongside us in digital spaces. “The true power of AI will be unlocked when we have truly autonomous agents that can collaborate at scale,” says Yang. Read more from Niall Firth.

Bits and Bytes

OpenAI is exploring advertising

Building and maintaining some of the world’s leading AI models doesn’t come cheap. The Financial Times has reported that OpenAI is hiring advertising talent from big tech rivals in a push to increase revenues. (Financial Times)

Landlords are using AI to raise rents, and cities are starting to push back

RealPage is a tech company that collects proprietary lease information on how much renters are paying and then uses an AI model to suggest to landlords how much to charge for apartments. Eight states and many municipalities have joined antitrust suits against the company, saying it constitutes an “unlawful information-sharing scheme” and inflates rental prices. (The Markup)

The way we measure progress in AI is terrible

Whenever new models come out, the companies that make them advertise how they perform in benchmark tests against other models. There are even leaderboards that rank them. But new research suggests these measurement methods aren’t helpful. (MIT Technology Review)

Nvidia has released a model that can create sounds and music

AI tools to make music and audio have received less attention than their counterparts that create images and video, except when the companies that make them get sued. Now, chip maker Nvidia has entered the space with a tool that creates impressive sound effects and music. (Ars Technica)

Artists say they leaked OpenAI’s Sora video model in protest

Many artists are outraged at the tech company for training its models on their work without compensating them. Now, a group of artists who were beta testers for OpenAI’s Sora model say they leaked it out of protest. (The Verge)

How the largest gathering of US police chiefs is talking about AI

This story is from The Algorithm, our weekly newsletter on AI. To get it in your inbox first, sign up here.

It can be tricky for reporters to get past certain doors, and the door to the International Association of Chiefs of Police conference is one that’s almost perpetually shut to the media. Thus, I was pleasantly surprised when I was able to attend for a day in Boston last month. 

It bills itself as the largest gathering of police chiefs in the United States, where leaders from many of the country’s 18,000 police departments and even some from abroad convene for product demos, discussions, parties, and awards. 

I went along to see how artificial intelligence was being discussed, and the message to police chiefs seemed crystal clear: If your department is slow to adopt AI, fix that now. The future of policing will rely on it in all its forms.

In the event’s expo hall, the vendors (of which there were more than 600) offered a glimpse into the ballooning industry of police-tech suppliers. Some had little to do with AI—booths showcased body armor, rifles, and prototypes of police-branded Cybertrucks, and others displayed new types of gloves promising to protect officers from needles during searches. But one needed only to look to where the largest crowds gathered to understand that AI was the major draw. 

The hype focused on three uses of AI in policing. The flashiest was virtual reality, exemplified by the booth from V-Armed, which sells VR systems for officer training. On the expo floor, V-Armed built an arena complete with VR goggles, cameras, and sensors, not unlike the one the company recently installed at the headquarters of the Los Angeles Police Department. Attendees could don goggles and go through training exercises on responding to active shooter situations. Many competitors of V-Armed were also at the expo, selling systems they said were cheaper, more effective, or simpler to maintain. 

The pitch on VR training is that in the long run, it can be cheaper and more engaging to use than training with actors or in a classroom. “If you’re enjoying what you’re doing, you’re more focused and you remember more than when looking at a PDF and nodding your head,” V-Armed CEO Ezra Kraus told me. 

The effectiveness of VR training systems has yet to be fully studied, and they can’t completely replicate the nuanced interactions police have in the real world. AI is not yet great at the soft skills required for interactions with the public. At a different company’s booth, I tried out a VR system focused on deescalation training, in which officers were tasked with calming down an AI character in distress. It suffered from lag and was generally quite awkward—the character’s answers felt overly scripted and programmatic. 

The second focus was on the changing way police departments are collecting and interpreting data. Rather than buying a gunshot detection tool from one company and a license plate reader or drone from another, police departments are increasingly using expanding suites of sensors, cameras, and so on from a handful of leading companies that promise to integrate the data collected and make it useful. 

Police chiefs attended classes on how to build these systems, like one taught by Microsoft and the NYPD about the Domain Awareness System, a web of license plate readers, cameras, and other data sources used to track and monitor crime in New York City. Crowds gathered at massive, high-tech booths from Axon and Flock, both sponsors of the conference. Flock sells a suite of cameras, license plate readers, and drones, offering AI to analyze the data coming in and trigger alerts. These sorts of tools have come in for heavy criticism from civil liberties groups, which see them as an assault on privacy that does little to help the public. 

Finally, as in other industries, AI is also coming for the drudgery of administrative tasks and reporting. Many companies at the expo, including Axon, offer generative AI products to help police officers write their reports. Axon’s offering, called Draft One, ingests footage from body cameras, transcribes it, and creates a first draft of a report for officers. 

“We’ve got this thing on an officer’s body, and it’s recording all sorts of great stuff about the incident,” Bryan Wheeler, a senior vice president at Axon, told me at the expo. “Can we use it to give the officer a head start?”

On the surface, it’s a writing task well suited for AI, which can quickly summarize information and write in a formulaic way. It could also save lots of time officers currently spend on writing reports. But given that AI is prone to “hallucination,” there’s an unavoidable truth: Even if officers are the final authors of their reports, departments adopting these sorts of tools risk injecting errors into some of the most critical documents in the justice system. 

“Police reports are sometimes the only memorialized account of an incident,” wrote Andrew Ferguson, a professor of law at American University, in July in the first law review article about the serious challenges posed by police reports written with AI. “Because criminal cases can take months or years to get to trial, the accuracy of these reports are critically important.” Whether certain details were included or left out can affect the outcomes of everything from bail amounts to verdicts. 

By showing an officer a generated version of a police report, the tools also expose officers to details from their body camera recordings before they complete their report, a document intended to capture the officer’s memory of the incident. That poses a problem. 

“The police certainly would never show video to a bystander eyewitness before they ask the eyewitness about what took place, as that would just be investigatory malpractice,” says Jay Stanley, a senior policy analyst with the ACLU Speech, Privacy, and Technology Project, who will soon publish work on the subject. 

A spokesperson for Axon says this concern “isn’t reflective of how the tool is intended to work,” and that Draft One has robust features to make sure officers read the reports closely, add their own information, and edit the reports for accuracy before submitting them.

My biggest takeaway from the conference was simply that the way US police are adopting AI is inherently chaotic. There is no one agency governing how they use the technology, and the roughly 18,000 police departments in the United States—the precise figure is not even known—have remarkably high levels of autonomy to decide which AI tools they’ll buy and deploy. The police-tech companies that serve them will build the tools police departments find attractive, and it’s unclear if anyone will draw proper boundaries for ethics, privacy, and accuracy. 

That will only become more apparent under the incoming Trump administration. In a policing agenda released last year during his campaign, Trump encouraged more aggressive tactics like “stop and frisk,” deeper cooperation with immigration agencies, and increased liability protection for officers accused of wrongdoing. The Biden administration is now reportedly attempting to lock in some of its proposed policing reforms before January. 

Without federal regulation on how police departments can and cannot use AI, the lines will be drawn by departments and police-tech companies themselves.

“Ultimately, these are for-profit companies, and their customers are law enforcement,” says Stanley. “They do what their customers want, in the absence of some very large countervailing threat to their business model.”


Now read the rest of The Algorithm

Deeper Learning

The AI lab waging a guerrilla war over exploitative AI

When generative AI tools landed on the scene, artists were immediately concerned, seeing them as a new kind of theft. Computer security researcher Ben Zhao jumped into action in response, and his lab at the University of Chicago started building tools like Nightshade and Glaze to help artists keep their work from being scraped up by AI models. My colleague Melissa Heikkilä spent time with Zhao and his team to look at the ongoing effort to make these tools strong enough to stop AI’s relentless hunger for more images, art, and data to train on.  

Why this matters: The current paradigm in AI is to build bigger and bigger models, and these require vast data sets to train on. Tech companies argue that anything on the public internet is fair game, while artists demand compensation or the right to refuse. Settling this fight in the courts or through regulation could take years, so tools like Nightshade and Glaze are what artists have for now. If the tools disrupt AI companies’ efforts to make better models, that could push them to the negotiating table to bargain over licensing and fair compensation. But it’s a big “if.” Read more from Melissa Heikkilä.

Bits and Bytes

Tech elites are lobbying Elon Musk for jobs in Trump’s administration

Elon Musk is the tech leader who most has Trump’s ear. As such, he’s reportedly the conduit through which AI and tech insiders are pushing to have an influence in the incoming administration. (The New York Times)

OpenAI is getting closer to launching an AI agent to automate your tasks

AI agents—models that can carry out tasks on your behalf—are all the rage. OpenAI is reportedly closer to releasing one, news that comes a few weeks after Anthropic announced its own. (Bloomberg)

How this grassroots effort could make AI voices more diverse

A massive volunteer-led effort to collect training data in more languages, from people of more ages and genders, could help make the next generation of voice AI more inclusive and less exploitative. (MIT Technology Review)

Google DeepMind has a new way to look inside an AI’s “mind”

Autoencoders let us peer into the black box of artificial intelligence. They could help us create AI that is better understood and more easily controlled. (MIT Technology Review)

Musk has expanded his legal assault on OpenAI to target Microsoft

Musk has expanded his federal lawsuit against OpenAI, which alleges that the company has abandoned its nonprofit roots and obligations. He’s now going after Microsoft too, accusing it of antitrust violations in its work with OpenAI. (The Washington Post)

How ChatGPT search paves the way for AI agents

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

OpenAI’s Olivier Godement, head of product for its platform, and Romain Huet, head of developer experience, are on a whistle-stop tour around the world. Last week, I sat down with the pair in London before DevDay, the company’s annual developer conference. London’s DevDay is the first one for the company outside San Francisco. Godement and Huet are heading to Singapore next. 

It’s been a busy few weeks for the company. In London, OpenAI announced updates to its new Realtime API platform, which allows developers to build voice features into their applications. The company is rolling out new voices and a prompt-generation function, which should help developers build apps and more helpful voice assistants more quickly. Meanwhile, for consumers, OpenAI announced it was launching ChatGPT search, which lets users search the internet using the chatbot. Read more here

Both developments pave the way for the next big thing in AI: agents. These are AI assistants that can complete complex chains of tasks, such as booking flights. (You can read my explainer on agents here.) 

“Fast-forward a few years—every human on Earth, every business, has an agent. That agent knows you extremely well. It knows your preferences,” Godement says. The agent will have access to your emails, apps, and calendars and will act like a chief of staff, interacting with each of these tools and even working on long-term problems, such as writing a paper on a particular topic, he says. 

OpenAI’s strategy is to both build agents itself and allow developers to use its software to build their own agents, says Godement. Voice will play an important role in what agents will look and feel like. 

“At the moment most of the apps are chat based … which is cool, but not suitable for all use cases. There are some use cases where you’re not typing, not even looking at the screen, and so voice essentially has a much better modality for that,” he says. 

But there are two big hurdles that need to be overcome before agents can become a reality, Godement says. 

The first is reasoning. Building AI agents requires us to trust that they can complete complex tasks and do the right things, says Huet. That’s where OpenAI’s “reasoning” feature comes in. Introduced in OpenAI’s o1 model last month, it uses reinforcement learning to teach the model how to process information using “chain of thought.” Giving the model more time to generate answers allows it to recognize and correct mistakes, break down problems into smaller ones, and try different approaches to answering questions, Godement says. 
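OpenAI hasn’t published the details of how o1 was trained, but the inference-time behavior Godement describes, spending more tokens on step-by-step reasoning before committing to an answer, can be approximated at the prompting level. A minimal sketch, assuming the standard OpenAI chat API and a placeholder model name:

```python
# Minimal sketch of the chain-of-thought idea at inference time.
# OpenAI has not published o1's training details; this only illustrates
# "spend more tokens reasoning before answering," not the RL training itself.
from openai import OpenAI

client = OpenAI()

def answer_with_chain_of_thought(question: str) -> str:
    """Ask the model to reason step by step, then state a final answer."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any chat model works for this sketch
        messages=[
            {
                "role": "system",
                "content": (
                    "Work through the problem step by step, checking each step "
                    "for mistakes, then give the final answer on a new line "
                    "starting with 'Answer:'."
                ),
            },
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(answer_with_chain_of_thought("A train leaves at 3:40 pm and the trip takes 95 minutes. When does it arrive?"))
```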

But OpenAI’s claims about reasoning should be taken with a pinch of salt, says Chirag Shah, a computer science professor at the University of Washington. Large language models are not exhibiting true reasoning. It’s most likely that they have picked up what looks like logic from something they’ve seen in their training data.

“These models sometimes seem to be really amazing at reasoning, but it’s just like they’re really good at pretending, and it only takes a little bit of picking at them to break them,” he says.

There is still much more work to be done, Godement admits. In the short term, AI models such as o1 need to be much more reliable, faster, and cheaper. In the long term, the company needs to apply its chain-of-thought technique to a wider pool of use cases. OpenAI has focused on science, coding, and math. Now it wants to address other fields, such as law, accounting, and economics, he says. 

Second on the to-do list is the ability to connect different tools, Godement says. An AI model’s capabilities will be limited if it has to rely on its training data alone. It needs to be able to surf the web and look for up-to-date information. ChatGPT search is one powerful way OpenAI’s new tools can now do that. 

These tools need to be able not only to retrieve information but to take actions in the real world. Competitor Anthropic announced a new feature where its Claude chatbot can “use” a computer by interacting with its interface to click on things, for example. This is an important feature for agents if they are going to be able to execute tasks like booking flights. Godement says o1 can “sort of” use tools, though not very reliably, and that research on tool use is a “promising development.” 
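Neither OpenAI nor Anthropic has published its internal agent designs, but the basic tool-use loop they describe follows a common public pattern: the model decides to call a tool, the application executes it, and the result is fed back so the model can finish the task. Here is a minimal sketch using OpenAI’s public function-calling interface, with a made-up search_flights helper standing in for a real flights API:

```python
# Hypothetical sketch of the agent "tool use" loop described above.
# This is a generic pattern, not OpenAI's or Anthropic's internal design;
# search_flights is a stand-in for any real-world action an agent might take.
import json
from openai import OpenAI

client = OpenAI()

def search_flights(origin: str, destination: str) -> str:
    """Stand-in tool; a real agent would call an actual flights API here."""
    return json.dumps({"origin": origin, "destination": destination, "flights": ["LH900 08:15", "BA303 11:40"]})

tools = [{
    "type": "function",
    "function": {
        "name": "search_flights",
        "description": "Search for flights between two cities",
        "parameters": {
            "type": "object",
            "properties": {"origin": {"type": "string"}, "destination": {"type": "string"}},
            "required": ["origin", "destination"],
        },
    },
}]

messages = [{"role": "user", "content": "Find me a flight from London to Singapore."}]
response = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
msg = response.choices[0].message

# If the model decided to call the tool, run it and feed the result back.
if msg.tool_calls:
    call = msg.tool_calls[0]
    args = json.loads(call.function.arguments)
    result = search_flights(**args)
    messages += [msg, {"role": "tool", "tool_call_id": call.id, "content": result}]
    final = client.chat.completions.create(model="gpt-4o", messages=messages)
    print(final.choices[0].message.content)
```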

In the next year, Godement says, he expects the adoption of AI for customer support and other assistant-based tasks to grow. However, he says that it can be hard to predict how people will adopt and use OpenAI’s technology. 

“Frankly, looking back every year, I’m surprised by use cases that popped up that I did not even anticipate,” he says. “I expect there will be quite a few surprises that, you know, none of us could predict.” 


Now read the rest of The Algorithm

Deeper Learning

This AI-generated version of Minecraft may represent the future of real-time video generation

When you walk around in a version of the video game Minecraft from the AI companies Decart and Etched, it feels a little off. Sure, you can move forward, cut down a tree, and lay down a dirt block, just like in the real thing. If you turn around, though, the dirt block you just placed may have morphed into a totally new environment. That doesn’t happen in Minecraft. But this new version is entirely AI-generated, so it’s prone to hallucinations. Not a single line of code was written.

Ready, set, go: This version of Minecraft is generated in real time, using a technique known as next-frame prediction. The AI companies behind it did this by training their model, Oasis, on millions of hours of Minecraft gameplay and recordings of the corresponding actions a user would take in the game. The AI is able to sort out the physics, environments, and controls of Minecraft from this data alone. Read more from Scott J. Mulligan.
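Decart and Etched haven’t released Oasis’s full architecture, but the underlying objective, next-frame prediction, is simple to sketch: given the current frame and the player’s action, predict the frame that comes next. A toy PyTorch example, in which the network size, action count, and frame resolution are arbitrary placeholders:

```python
# Toy sketch of next-frame prediction, the technique described above.
# Oasis's real architecture and training setup are far larger; this only
# shows the basic objective: (current frame + player action) -> next frame.
import torch
import torch.nn as nn

class NextFramePredictor(nn.Module):
    def __init__(self, n_actions: int = 16, frame_channels: int = 3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(frame_channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.action_embed = nn.Embedding(n_actions, 64)
        self.decoder = nn.Conv2d(64, frame_channels, kernel_size=3, padding=1)

    def forward(self, frame: torch.Tensor, action: torch.Tensor) -> torch.Tensor:
        # Condition the encoded frame on the action, then decode the next frame.
        h = self.encoder(frame)
        h = h + self.action_embed(action)[:, :, None, None]
        return self.decoder(h)

model = NextFramePredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# One training step on a dummy batch of (frame, action, next_frame) triples.
frame = torch.rand(8, 3, 64, 64)
action = torch.randint(0, 16, (8,))
next_frame = torch.rand(8, 3, 64, 64)

optimizer.zero_grad()
prediction = model(frame, action)
loss = nn.functional.mse_loss(prediction, next_frame)
loss.backward()
optimizer.step()
print(f"reconstruction loss: {loss.item():.4f}")
```

A real system like Oasis scales this idea up enormously and generates frames autoregressively, which helps explain why small errors can compound into the hallucinated environments described above.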

Bits and Bytes

AI search could break the web
At its best, AI search can better infer a user’s intent, amplify quality content, and synthesize information from diverse sources. But if AI search becomes our primary portal to the web, it threatens to disrupt an already precarious digital economy, argues Benjamin Brooks, a fellow at the Berkman Klein Center at Harvard University, who used to lead public policy for Stability AI. (MIT Technology Review)

AI will add to the e-waste problem. Here’s what we can do about it.
Equipment used to train and run generative AI models could produce up to 5 million tons of e-waste by 2030, a relatively small but significant fraction of the global total. (MIT Technology Review)

How an “interview” with a dead luminary exposed the pitfalls of AI
A state-funded radio station in Poland fired its on-air talent and brought in AI-generated presenters. But the experiment caused an outcry and was stopped when one of them “interviewed” a dead Nobel laureate. (The New York Times)

Meta says yes, please, to more AI-generated slop
In Meta’s latest earnings call, CEO Mark Zuckerberg said we’re likely to see “a whole new category of content, which is AI generated or AI summarized content or kind of existing content pulled together by AI in some way.” Zuckerberg added that he thinks “that’s going to be just very exciting.” (404 Media)

Palmer Luckey’s vision for the future of mixed reality

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

War is a catalyst for change, an expert in AI and warfare told me in 2022. At the time, the war in Ukraine had just started, and the military AI business was booming. Two years later, things have only ramped up as geopolitical tensions continue to rise.

Silicon Valley players are poised to benefit. One of them is Palmer Luckey, the founder of the virtual-reality headset company Oculus, which he sold to Facebook for $2 billion. After Luckey’s highly public ousting from Meta, he founded Anduril, which focuses on drones, cruise missiles, and other AI-enhanced technologies for the US Department of Defense. The company is now valued at $14 billion. My colleague James O’Donnell interviewed Luckey about his new pet project: headsets for the military. 

Luckey is increasingly convinced that the military, not consumers, will see the value of mixed-reality hardware first: “You’re going to see an AR headset on every soldier, long before you see it on every civilian,” he says. In the consumer world, any headset company is competing with the ubiquity and ease of the smartphone, but he sees entirely different trade-offs in defense. Read the interview here

The use of AI for military purposes is controversial. Back in 2018, Google pulled out of the Pentagon’s Project Maven, an attempt to build image recognition systems to improve drone strikes, following staff walkouts over the ethics of the technology. (Google has since returned to offering services for the defense sector.) There has been a long-standing campaign to ban autonomous weapons, also known as “killer robots,” a ban that countries with powerful militaries, such as the US, have refused to agree to. 

But louder still are the voices of an influential faction in Silicon Valley, including Google’s former CEO Eric Schmidt, who has called for the military to adopt and invest more in AI to get an edge over adversaries. Militaries all over the world have been very receptive to this message.

That’s good news for the tech sector. Military contracts are long and lucrative, for a start. Most recently, the Pentagon purchased services from Microsoft and OpenAI to do search, natural-language processing, machine learning, and data processing, reports The Intercept. In the interview with James, Palmer Luckey says the military is a perfect testing ground for new technologies. Soldiers do as they are told and aren’t as picky as consumers, he explains. They’re also less price-sensitive: Militaries don’t mind spending a premium to get the latest version of a technology.

But there are serious dangers in adopting powerful technologies prematurely in such high-risk areas. Foundation models pose serious national security and privacy threats by, for example, leaking sensitive information, argue researchers at the AI Now Institute and Meredith Whittaker, president of the communication privacy organization Signal, in a new paper. Whittaker, who was a core organizer of the Project Maven protests, has said that the push to militarize AI is really more about enriching tech companies than improving military operations. 

Despite calls for stricter rules around transparency, we are unlikely to see governments restrict their defense sectors in any meaningful way beyond voluntary ethical commitments. We are in the age of AI experimentation, and militaries are playing with the highest stakes of all. And because of the military’s secretive nature, tech companies can experiment with the technology without the need for transparency or even much accountability. That suits Silicon Valley just fine. 


Now read the rest of The Algorithm

Deeper Learning

How Wayve’s driverless cars will meet one of their biggest challenges yet

The UK driverless-car startup Wayve is headed west. The firm’s cars learned to drive on the streets of London. But Wayve has announced that it will begin testing its tech in and around San Francisco as well. And that brings a new challenge: Its AI will need to switch from driving on the left to driving on the right.

Full speed ahead: As visitors to or from the UK will know, making that switch is harder than it sounds. Your view of the road, how the vehicle turns—it’s all different. The move to the US will be a test of Wayve’s technology, which the company claims is more general-purpose than what many of its rivals are offering. Across the Atlantic, the company will now go head to head with the heavyweights of the growing autonomous-car industry, including Cruise, Waymo, and Tesla. Join Will Douglas Heaven on a ride in one of its cars to find out more

Bits and Bytes

Kids are learning how to make their own little language models
Little Language Models is a new application from two PhD researchers at MIT’s Media Lab that helps children understand how AI models work—by getting to build small-scale versions themselves. (MIT Technology Review)

Google DeepMind is making its AI text watermark open source
Google DeepMind has developed a tool for identifying AI-generated text called SynthID, which is part of a larger family of watermarking tools for generative AI outputs. The company is applying the watermark to text generated by its Gemini models and making it available for others to use too. (MIT Technology Review)

Anthropic debuts an AI model that can “use” a computer
The tool enables the company’s Claude AI model to interact with computer interfaces and take actions such as moving a cursor, clicking on things, and typing text. It’s a very cumbersome and error-prone version of what some have said AI agents will be able to do one day. (Anthropic)

Can an AI chatbot be blamed for a teen’s suicide?
A 14-year-old boy committed suicide, and his mother says it was because he was obsessed with an AI chatbot created by Character.AI. She is suing the company. Chatbots have been touted as cures for loneliness, but critics say they actually worsen isolation. (The New York Times)

Google, Microsoft, and Perplexity are promoting scientific racism in search results
The internet’s biggest AI-powered search engines are featuring the widely debunked idea that white people are genetically superior to other races. (Wired)