Google’s new tool lets large language models fact-check their responses

As long as chatbots have been around, they have made things up. Such “hallucinations” are an inherent part of how AI models work. However, they’re a big problem for companies betting big on AI, like Google, because they make the responses these systems generate unreliable.

Google is releasing a tool today to address the issue. Called DataGemma, it uses two methods to help large language models fact-check their responses against reliable data and cite their sources more transparently to users. 

The first of the two methods is called Retrieval-Interleaved Generation (RIG), which acts as a sort of fact-checker. If a user prompts the model with a question—like “Has the use of renewable energy sources increased in the world?”—the model will come up with a “first draft” answer. Then RIG identifies what portions of the draft answer could be checked against Google’s Data Commons, a massive repository of data and statistics from reliable sources like the United Nations or the Centers for Disease Control and Prevention. Next, it runs those checks and replaces any incorrect original guesses with correct facts. It also cites its sources to the user.
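
In pseudocode terms, a RIG-style pass might look something like the sketch below. This is only an illustration of the idea described above, not Google’s implementation; the helper functions and the figures inside them are hypothetical stand-ins for the model and the Data Commons lookup.

```python
# Minimal sketch of a RIG-style fact-checking pass (illustrative only).
# None of these helpers are real Google APIs; they stand in for the model
# and the Data Commons lookup described in the article.

def draft_answer(prompt: str) -> str:
    """Hypothetical: the LLM's unverified 'first draft' response."""
    return "Renewable energy's share of global electricity reached about 25% in 2022."

def find_checkable_claims(draft: str) -> list[dict]:
    """Hypothetical: spans in the draft that map to a statistical query."""
    return [{"span": "about 25%", "query": "share of renewable electricity, world, 2022"}]

def query_data_commons(query: str) -> dict:
    """Hypothetical: look up the statistic and its source in a trusted repository."""
    return {"value": "about 30%", "source": "Example statistical agency"}

def rig_answer(prompt: str) -> str:
    draft = draft_answer(prompt)
    for claim in find_checkable_claims(draft):
        fact = query_data_commons(claim["query"])
        # Replace the model's guess with the retrieved figure and cite it.
        draft = draft.replace(claim["span"], f"{fact['value']} (source: {fact['source']})")
    return draft

print(rig_answer("Has the use of renewable energy sources increased in the world?"))
```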

The second method, which is commonly used in other large language models, is called Retrieval-Augmented Generation (RAG). Consider a prompt like “What progress has Pakistan made against global health goals?” In response, the model examines which data in the Data Commons could help it answer the question, such as information about access to safe drinking water, hepatitis B immunizations, and life expectancies. With those figures in hand, the model then builds its answer on top of the data and cites its sources.
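
A RAG-style flow works the other way around: fetch the relevant statistics first, then generate on top of them. Again, the sketch below is purely illustrative, with hypothetical helpers and made-up figures rather than real Data Commons calls.

```python
# Minimal sketch of a RAG-style flow (illustrative; not Google's code).
# retrieve_statistics and generate_with_context are hypothetical stand-ins.

def retrieve_statistics(question: str) -> list[str]:
    """Hypothetical: pull relevant figures from a statistical repository."""
    return [
        "Access to safe drinking water: 90% of population (illustrative figure)",
        "Hepatitis B immunization coverage: 85% of infants (illustrative figure)",
    ]

def generate_with_context(question: str, context: list[str]) -> str:
    """Hypothetical: the LLM answers using only the retrieved data, citing it."""
    bullet_list = "\n".join(f"- {fact}" for fact in context)
    return f"Answer to '{question}', grounded in:\n{bullet_list}"

question = "What progress has Pakistan made against global health goals?"
print(generate_with_context(question, retrieve_statistics(question)))
```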

“Our goal here was to use Data Commons to enhance the reasoning of LLMs by grounding them in real-world statistical data that you could source back to where you got it from,” says Prem Ramaswami, head of Data Commons at Google. Doing so, he says, will “create more trustable, reliable AI.”

DataGemma is available only to researchers for now, but Ramaswami says access could widen after more testing. If it works as hoped, it could be a real boon for Google’s plan to embed AI deeper into its search engine.

However, it comes with a host of caveats. First, the usefulness of the methods is limited by whether the relevant data is in the Data Commons, which is more of a data repository than an encyclopedia. It can tell you the GDP of Iran, but it’s unable to confirm the date of the First Battle of Fallujah or when Taylor Swift released her most recent single. In fact, Google’s researchers found that with about 75% of the test questions, the RIG method was unable to obtain any usable data from the Data Commons. And even if helpful data is indeed housed in the Data Commons, the model doesn’t always formulate the right questions to find it. 

Second, there is the question of accuracy. When testing the RAG method, researchers found that the model gave incorrect answers 6% to 20% of the time. Meanwhile, the RIG method pulled the correct stat from Data Commons only about 58% of the time (though that’s a big improvement over the 5% to 17% accuracy rate of Google’s large language models when they’re not pinging Data Commons). 

Ramaswami says DataGemma’s accuracy will improve as it gets trained on more and more data. The initial version has been trained on only about 700 questions, and fine-tuning the model required his team to manually check each individual fact it generated. To further improve the model, the team plans to increase that data set from hundreds of questions to millions.

Chatbots can persuade people to stop believing in conspiracy theories

The internet has made it easier than ever before to encounter and spread conspiracy theories. And while some are harmless, others can be deeply damaging, sowing discord and even leading to unnecessary deaths.

Now, researchers believe they’ve uncovered a new tool for combating false conspiracy theories: AI chatbots. Researchers from MIT Sloan and Cornell University found that chatting about a conspiracy theory with a large language model (LLM) reduced people’s belief in it by about 20%—even among participants who claimed that their beliefs were important to their identity. The research is published today in the journal Science.

The findings could represent an important step forward in how we engage with and educate people who espouse such baseless theories, says Yunhao (Jerry) Zhang, a postdoctoral fellow affiliated with the Psychology of Technology Institute who studies AI’s impacts on society.

“They show that with the help of large language models, we can—I wouldn’t say solve it, but we can at least mitigate this problem,” he says. “It points out a way to make society better.” 

Few interventions have been proven to change conspiracy theorists’ minds, says Thomas Costello, a research affiliate at MIT Sloan and the lead author of the study. Part of what makes it so hard is that different people tend to latch on to different parts of a theory. This means that while presenting certain bits of factual evidence may work on one believer, there’s no guarantee that it’ll prove effective on another.

That’s where AI models come in, he says. “They have access to a ton of information across diverse topics, and they’ve been trained on the internet. Because of that, they have the ability to tailor factual counterarguments to particular conspiracy theories that people believe.”

The team tested its method by asking 2,190 crowdsourced workers to participate in text conversations with GPT-4 Turbo, OpenAI’s latest large language model.

Participants were asked to share details about a conspiracy theory they found credible, why they found it compelling, and any evidence they felt supported it. These answers were used to tailor responses from the chatbot, which the researchers had prompted to be as persuasive as possible.

Participants were also asked to indicate how confident they were that their conspiracy theory was true, on a scale from 0 (definitely false) to 100 (definitely true), and then rate how important the theory was to their understanding of the world. Afterwards, they entered into three rounds of conversation with the AI bot. The researchers chose three to make sure they could collect enough substantive dialogue.

After each conversation, participants were asked the same rating questions. The researchers followed up with all the participants 10 days after the experiment, and then two months later, to assess whether their views had changed following the conversation with the AI bot. The participants reported a 20% reduction of belief in their chosen conspiracy theory on average, suggesting that talking to the bot had fundamentally changed some people’s minds.

“Even in a lab setting, 20% is a large effect on changing people’s beliefs,” says Zhang. “It might be weaker in the real world, but even 10% or 5% would still be very substantial.”

The authors sought to safeguard against AI models’ tendency to make up information—known as hallucinating—by employing a professional fact-checker to evaluate the accuracy of 128 claims the AI had made. Of these, 99.2% were found to be true, while 0.8% were deemed misleading. None were found to be completely false. 

One explanation for this high degree of accuracy is that a lot has been written about conspiracy theories on the internet, making them very well represented in the model’s training data, says David G. Rand, a professor at MIT Sloan who also worked on the project. The adaptable nature of GPT-4 Turbo means it could easily be connected to different platforms for users to interact with in the future, he adds.

“You could imagine just going to conspiracy forums and inviting people to do their own research by debating the chatbot,” he says. “Similarly, social media could be hooked up to LLMs to post corrective responses to people sharing conspiracy theories, or we could buy Google search ads against conspiracy-related search terms like ‘Deep State.’”

The research upended the authors’ preconceived notions about how receptive people were to solid evidence debunking not only conspiracy theories, but also other beliefs that are not rooted in good-quality information, says Gordon Pennycook, an associate professor at Cornell University who also worked on the project. 

“People were remarkably responsive to evidence. And that’s really important,” he says. “Evidence does matter.”

Google says it’s made a quantum computing breakthrough that reduces errors

Google researchers claim to have made a breakthrough in quantum error correction, one that could pave the way for quantum computers that finally live up to the technology’s promise.

Proponents of quantum computers say the machines will be able to benefit scientific discovery in fields ranging from particle physics to drug and materials design—if only their builders can make the hardware behave as intended. 

One major challenge has been that quantum computers can store or manipulate information incorrectly, preventing them from executing algorithms that are long enough to be useful. The new research from Google Quantum AI and its academic collaborators demonstrates that they can actually add components to reduce these errors. Previously, because of limitations in engineering, adding more components to the quantum computer tended to introduce more errors. Ultimately, the work bolsters the idea that error correction is a viable strategy toward building a useful quantum computer. Some critics had doubted that it was an effective approach, according to physicist Kenneth Brown of Duke University, who was not involved in the research. 

“This error correction stuff really works, and I think it’s only going to get better,” wrote Michael Newman, a member of the Google team, on X. (Google, which posted the research to the preprint server arXiv in August, declined to comment on the record for this story.) 

Quantum computers encode data using objects that behave according to the principles of quantum mechanics. In particular, they store information not only as 1s and 0s, as a conventional computer does, but also in “superpositions” of 1 and 0. Storing information in the form of these superpositions and manipulating their value using quantum interactions such as entanglement (a way for particles to be connected even over long distances) allows for entirely new types of algorithms.

In practice, however, developers of quantum computers have found that errors quickly creep in because the components are so sensitive. A quantum computer represents 1, 0, or a superposition by putting one of its components in a particular physical state, and it is too easy to accidentally alter those states. A component then ends up in a physical state that does not correspond to the information it’s supposed to represent. These errors accumulate over time, which means that the quantum computer cannot deliver accurate answers for long algorithms without error correction.

To perform error correction, researchers must encode information in the quantum computer in a distinctive way. Quantum computers are made of individual components known as physical qubits, which can be made from a variety of different materials, such as single atoms or ions. In Google’s case, each physical qubit consists of a tiny superconducting circuit that must be kept at an extremely cold temperature. 

Early experiments on quantum computers stored each unit of information in a single physical qubit. Now researchers, including Google’s team, have begun experimenting with encoding each unit of information in multiple physical qubits. They refer to this constellation of physical qubits as a single “logical” qubit, which can represent 1, 0, or a superposition of the two. By design, the single “logical” qubit can hold onto a unit of information more robustly than a single “physical” qubit can. Google’s team corrects the errors in the logical qubit using an algorithm known as a surface code, which makes use of the logical qubit’s constituent physical qubits.

In the new work, Google made a single logical qubit out of varying numbers of physical qubits. Crucially, the researchers demonstrated that a logical qubit composed of 105 physical qubits suppressed errors more effectively than a logical qubit composed of 72 qubits. That suggests that putting increasing numbers of physical qubits together into a logical qubit “can really suppress the errors,” says Brown. This charts a potential path to building a quantum computer with a low enough error rate to perform a useful algorithm, although the researchers have yet to demonstrate they can put multiple logical qubits together and scale up to a larger machine. 
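
The intuition behind that result can be captured with the standard textbook model of surface-code scaling: below a threshold error rate, each increase in the code distance (roughly, the number of physical qubits devoted to one logical qubit) suppresses the logical error rate by a constant factor. The numbers in the sketch below are illustrative, not Google’s measured values.

```python
# Illustrative only: the textbook scaling model for surface-code error
# suppression, not Google's reported numbers. Below the threshold, each
# step up in code distance d cuts the logical error rate by a roughly
# constant factor (often called Lambda).

def logical_error_rate(physical_error: float, threshold: float, distance: int, prefactor: float = 0.1) -> float:
    """Approximate per-cycle logical error rate for a distance-d surface code."""
    return prefactor * (physical_error / threshold) ** ((distance + 1) // 2)

# Hypothetical numbers chosen purely for illustration.
p, p_th = 0.003, 0.01
for d in (3, 5, 7):
    print(f"distance {d}: logical error ~ {logical_error_rate(p, p_th, d):.2e}")
```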

The researchers also report that the lifetime of the logical qubit exceeds the lifetime of its best constituent physical qubit by a factor of 2.4. Put another way, Google’s work essentially demonstrates that it can store data in a reliable quantum “memory.”

However, this demonstration is just a first step toward an error-corrected quantum computer, says Jay Gambetta, the vice president of IBM’s quantum initiative. He points out that while Google has demonstrated a more robust quantum memory, it has not performed any logical operations on the information stored in that memory. 

“At the end of the day, what matters is: How big of a quantum circuit could you run?” he says. (A “quantum circuit” is a series of logic operations executed on a quantum computer.) “And do you have a path to show how you’re going to run bigger and bigger quantum circuits?”

IBM, whose quantum computers are also composed of qubits made of superconducting circuits, is taking an error-correction approach that differs from Google’s surface code method. The company thinks this method, known as a low-density parity-check code, will be easier to scale, with each logical qubit requiring fewer physical qubits to achieve comparable error suppression rates. By 2026, IBM intends to demonstrate that it can make 12 logical qubits out of 244 physical qubits, says Gambetta.

Other researchers are exploring other promising approaches, too. Instead of superconducting circuits, a team affiliated with the Boston-based quantum computing company QuEra uses neutral atoms as physical qubits. Earlier this year, it published in Nature a study showing that it had executed algorithms using up to 48 logical qubits made of rubidium atoms.

Gambetta cautions researchers to be patient and not to overhype the progress. “I just don’t want the field to think error correction is done,” he says. Hardware development simply takes a long time because the cycle of designing, building, and troubleshooting is time consuming, especially when compared with software development. “I don’t think it’s unique to quantum,” he says. 

To execute algorithms with guaranteed practical utility, a quantum computer needs to perform around a billion logical operations, says Brown. “And no one’s near a billion operations yet,” he says. Another milestone would be to create a quantum computer with 100 logical qubits, which QuEra has set as a goal for 2026. A quantum computer of that size would be capable of simulations beyond the reach of classical computers. Google scientists have made a single high-quality logical qubit—but the next step is to show that they can actually do something with it.

Why a ruling against the Internet Archive threatens the future of America’s libraries

I was raised in the 1980s and ’90s, and for my generation and generations before us, the public library was an equalizing force in every town, helping anyone move toward the American dream. In Chantilly, Virginia, where I grew up, it didn’t matter if you didn’t have a computer or your parents lacked infinite money for tutors—you could get a lifetime’s education for free at the public library. A ruling from the US Second Circuit against the Internet Archive and in favor of publisher Hachette has just thrown that promise of equality into doubt by limiting libraries’ access to digital lending.

To understand why this is so important to the future of libraries, you first have to understand the dire state of library e-book lending. 

Libraries have traditionally operated on a basic premise: Once they purchase a book, they can lend it out to patrons as much (or as little) as they like. Library copies often come from publishers, but they can also come from donations, used book sales, or other libraries. However the library obtains the book, once the library legally owns it, it is theirs to lend as they see fit. 

Not so for digital books. To make licensed e-books available to patrons, libraries have to pay publishers multiple times over. First, they must subscribe (for a fee) to aggregator platforms such as Overdrive. Aggregators, like streaming services such as HBO’s Max, have total control over adding or removing content from their catalogue. Content can be removed at any time, for any reason, without input from your local library. The decision happens not at the community level but at the corporate one, thousands of miles from the patrons affected. 

Then libraries must purchase each individual copy of each individual title that they want to offer as an e-book. These e-book copies are not only priced at a steep markup—up to 300% over consumer retail—but are also time- and loan-limited, meaning the files self-destruct after a certain number of loans. The library then needs to repurchase the same book, at a new price, in order to keep it in stock. 

This upending of the traditional order puts massive financial strain on libraries and the taxpayers that fund them. It also opens up a world of privacy concerns; while libraries are restricted in the reader data they can collect and share, private companies are under no such obligation.

Some libraries have turned to another solution: controlled digital lending, or CDL, a process by which a library scans the physical books it already has in its collection, makes secure digital copies, and lends those out on a one-to-one “owned to loaned” ratio. The Internet Archive was an early pioneer of this technique.

When the digital copy is loaned, the physical copy is sequestered from borrowing; when the physical copy is checked out, the digital copy becomes unavailable. The benefits to libraries are obvious: delicate books can be circulated without fear of damage, volumes can be moved off-site for facilities work without interrupting patron access, and older and endangered works become searchable and can get a second chance at life. Library patrons, who fund their local library’s purchases with their tax dollars, also benefit from the ability to freely access the books.
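
The one-to-one “owned to loaned” rule is simple enough to express in a few lines of code. The sketch below is purely illustrative and is not any library’s actual lending system.

```python
# Illustrative sketch of controlled digital lending's one-to-one rule:
# for each owned copy, only one format (physical or digital) circulates
# at a time. Not any library's actual system.

from dataclasses import dataclass

@dataclass
class OwnedCopy:
    title: str
    physical_out: bool = False
    digital_out: bool = False

    def lend_digital(self) -> bool:
        # The digital scan is only available while the physical copy stays on the shelf.
        if self.physical_out or self.digital_out:
            return False
        self.digital_out = True
        return True

    def lend_physical(self) -> bool:
        # Likewise, the physical book can't circulate while its scan is checked out.
        if self.physical_out or self.digital_out:
            return False
        self.physical_out = True
        return True

copy = OwnedCopy("Example Title")
print(copy.lend_digital())   # True: digital loan succeeds
print(copy.lend_physical())  # False: physical copy is sequestered until the scan comes back
```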

Publishers are, unfortunately, not fans of this model, and in 2020 four of them sued the Internet Archive over its CDL program. The suit ultimately focused on the Internet Archive’s lending of 127 books that were already commercially available through licensed aggregators. The publisher plaintiffs accused the Internet Archive of mass copyright infringement, while the Internet Archive argued that its digitization and lending program was a fair use. The trial court sided with the publishers, and on September 4, the Court of Appeals for the Second Circuit affirmed that decision with some alterations to the underlying reasoning.

This decision harms libraries. It locks them into an e-book ecosystem designed to extract as much money as possible while harvesting (and reselling) reader data en masse. It leaves local communities’ reading habits at the mercy of curatorial decisions made by four dominant publishing companies thousands of miles away. It steers Americans away from one of the few remaining bastions of privacy protection and funnels them into a surveillance ecosystem that, like Big Tech, becomes more dangerous with each passing data breach. And by increasing the price for access to knowledge, it puts up even more barriers between underserved communities and the American dream.

It doesn’t stop there. This decision also renders the fair use doctrine—legally crucial in everything from parody to education to news reporting—almost unusable. And while there were occasional moments of sanity (such as recognizing that a “Donate here” button does not magically turn a nonprofit into a commercial enterprise), this decision fractured, rather than clarified, the law. 

If the courts won’t recognize CDL-based library lending as fair use, then the next step falls to Congress. Libraries are in crisis, caught between shrinking budgets and growing demand for services. Congress must act now to ensure that a pillar of equality in our communities isn’t sacrificed on the altar of profit. 

Chris Lewis is president and CEO of Public Knowledge, a consumer advocacy group that works to shape technology policy in the public interest. Public Knowledge promotes freedom of expression, an open internet, and access to affordable communications tools and creative works.

What impact will AI have on video game development?

This story is from The Algorithm, our weekly newsletter on AI. To get it in your inbox first, sign up here.

Video game development has long been plagued by fear of the “crunch”—essentially, being forced to work overtime on a game to meet a deadline. In the early days of video games, the crunch was often viewed as a rite of passage: In the last days before release, an obsessed group of scrappy developers would work late into the night to perfect their dream game. 

However, nowadays the crunch is less likely to be glamorized than to be seen as a form of exploitation that risks causing mental illness and burnout. Part of the issue is that crunch time used to be just before a game launched, but now whole game development periods are “crunchy.” With games getting more expensive, companies are incentivized to make even more short-term profits by squeezing developers. 

But what if AI could help to alleviate game-development hell? It may already be happening. According to a recent poll by a16z, 87% of studios are using generative AI tools like Midjourney to create in-game environments. Others are using these tools for game testing or bug hunting, while Ubisoft is experimenting with using AI to create different basic dialogue options.

And even more help is coming. A tool developed by the team at Roblox aims to allow developers to make 3D environments and scenes in an instant with nothing but text prompts. Typically, creating an environment may take a week for a small game or much longer for a studio project, depending on how complex the designs are. But Roblox aims to let developers almost instantly bring their personal vision to life. 

For example, let’s say you wanted your game to be set in a spaceship with the interior design of a Buddhist temple. You’d just put that into a prompt—“Create a spaceship …”—and BAM! Your one-of-a-kind environment would be generated immediately.

The technology behind this can be used for any 3D environment, not just Roblox. My article here goes into more depth, but essentially, if ChatGPT’s tokens are words, the Roblox system’s tokens are 3D cubes that form a larger scene, allowing the 3D generation equivalent of what ChatGPT can do for text. This means the model could potentially be used to generate a whole city in the Grand Theft Auto universe. That said, the demo I saw from Roblox was far smaller, generating only a racetrack. So more realistically, I imagine it would be used to build one aspect of a city in Grand Theft Auto, like a stadium—at least for now.

Roblox claims you’re also able to modify a scene with prompts. So let’s say you get bored of the Buddhist temple aesthetic. You can prompt the model again—“Make the spaceship interior a forest”—and within an instant, all the Buddhist statues will turn to trees.

A lot of these types of things can already be done manually, of course, but it can take a lot of time. Ideally, this kind of technology will allow 3D artists to offload some of the tedium of their job to an AI. (Though some of them may argue that building the environment is creatively fulfilling—maybe even one of their favorite parts of their job. Having an AI spawn an environment in an instant may take away some of the joy of slowly discovering an environment as you build it.)

Personally, I’m fairly skeptical of AI in video games. As a former developer myself, I cringe a little bit when I hear about AI being used to write dialogue for characters. I worry about terribly stilted results and the possibility that writers will lose their jobs. In the same vein, I worry about putting 3D artists out of work and ending up with 3D environments that look off, or obviously generated by AI without care or thought.

It’s clear that the big AI wave is crashing upon us. And whether it leads to better work-life balance for game developers is going to be determined by how these systems are implemented. Will developers have a tool to reduce tedium and eliminate repetitive tasks, or will they have fewer colleagues, and new colleagues who insist on using words like “delves” and “showcasing” in every other sentence? 

Now read the rest of The Algorithm


Deeper learning

AI is already being used in games to eliminate inappropriate language
This new Roblox development comes after the company introduced AI to analyze in-game voice chat in real time last fall. Other games, like Call of Duty, have implemented similar systems. If the AI determines that a player is using foul language, it will issue a warning, and then a ban if restricted words keep coming. 
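
The escalation logic described here is conceptually simple. The sketch below is an illustrative strike-counting policy, not Roblox’s or Call of Duty’s actual moderation code, and the threshold is hypothetical.

```python
# Illustrative escalation policy (not any game's actual system):
# flagged language first triggers warnings, repeated flags trigger a ban.

from collections import Counter

WARN_LIMIT = 2  # hypothetical threshold before a ban

strikes = Counter()

def handle_flagged_message(player: str) -> str:
    """Record a strike for the player and return the action to take."""
    strikes[player] += 1
    return "warning" if strikes[player] <= WARN_LIMIT else "ban"

for _ in range(3):
    print(handle_flagged_message("player123"))  # warning, warning, ban
```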

Why this matters: As we’ve written previously, content moderation with AI has proved to be tricky. It seems like an obvious way to make good use of the technology’s ability to look at masses of information and make quick assessments, but AI still has a hard time with nuance and cultural contexts. That hasn’t stopped it from being implemented in video games, which have been and will continue to be one of the testing grounds for the latest innovations in AI. My colleague Niall explains in his recent piece how it could make virtual worlds more immersive and flexible.

Bits and bytes

What this futuristic Olympics video says about the state of generative AI
Filmmaker Josh Kahn used AI to create a short video that imagines what an Olympics in LA might look like in the year 3028, which he shared exclusively with MIT Technology Review. The short demonstrates AI’s immense power for video creation, but it also highlights some of the issues with using the technology for that purpose. 
(MIT Technology Review)

A Dutch regulator has slapped Clearview AI with a $33 million fine 
Years ago, Clearview AI scraped images of people from the internet without their permission. Now the Dutch data protection authority has fined the company, saying that Clearview’s database is illegal because it violates individuals’ right to privacy. Clearview hasn’t paid past fines and doesn’t plan to pay this one, claiming that Dutch authorities have no jurisdiction over the company since it doesn’t do business in the Netherlands. The Dutch are considering holding Clearview’s directors personally financially liable.
(The Verge)

How OpenAI is changing
OpenAI continues to evolve; recent moves include adding the former director of the US National Security Agency to its board and considering plans to restructure the company to be more attractive for investors. Additionally, there are talks over a new investment into OpenAI that would value it at over $100 billion. It sure feels like a long time since OpenAI could credibly claim to just be a research lab. 
(The New York Times)

NaNoWriMo says condemning AI is “classist and ableist”
The organizers of the “write a book in a month” challenge have got themselves into hot water recently, with a big backlash against their decision to support the use of AI for writers. They’ve countered the haters by claiming that opposing the use of AI in writing is both classist and ableist, as some people require extra assistance and accommodation from AI tools. 
(404 Media)

2024 Innovator of the Year: Shawn Shan builds tools to help artists fight back against exploitative AI

Shawn Shan is one of MIT Technology Review’s 2024 Innovators Under 35. Meet the rest of this year’s honorees. 

When image-generating models such as DALL-E 2, Midjourney, and Stable Diffusion kick-started the generative AI boom in early 2022, artists started noticing odd similarities between AI-generated images and those they’d created themselves. Many found that their work had been scraped into massive data sets and used to train AI models, which then produced knockoffs in their creative style. Many also lost work when potential clients used AI tools to generate images instead of hiring artists, and others were asked to use AI themselves and received lower rates. 

Now artists are fighting back. And some of the most powerful tools they have were built by Shawn Shan, 26, a PhD student in computer science at the University of Chicago (and MIT Technology Review’s 2024 Innovator of the Year). 

Shan got his start in AI security and privacy as an undergraduate there and participated in a project that built Fawkes, a tool to protect faces from facial recognition technology. But it was conversations with artists who had been hurt by the generative AI boom that propelled him into the middle of one of the biggest fights in the field. Soon after learning about the impact on artists, Shan and his advisors Ben Zhao (who made our Innovators Under 35 list in 2006) and Heather Zheng (who was on the 2005 list) decided to build a tool to help. They gathered input from more than a thousand artists to learn what they needed and how they would use any protective technology. 

Shan coded the algorithm behind Glaze, a tool that lets artists mask their personal style from AI mimicry. Glaze came out in early 2023, and last October, Shan and his team introduced another tool called Nightshade, which adds an invisible layer of “poison” to images to hinder image-generating AI models if they attempt to incorporate those images into their data sets. If enough poisoned images are drawn into a machine-learning model’s training data, they could permanently break the model and make its outputs unpredictable. Both algorithms work by adding invisible changes to the pixels of images that disrupt the way machine-learning models interpret them.

The response to Glaze was both “overwhelming and stressful,” Shan says. The team received backlash from generative AI boosters on social media, and there were several attempts to break the protections.  

But artists loved it. Glaze has been downloaded nearly 3.5 million times (and Nightshade over 700,000). It has also been integrated into the popular new art platform Cara, allowing artists to embed its protection in their work when they upload their images. And Glaze received a distinguished paper award and the Internet Defense Prize at the Usenix Security Symposium, a top computer security conference.

Shan’s work has also allowed artists to be creative online again, says Karla Ortiz, an artist who has worked with him and the team to build Glaze and is part of a class action lawsuit against generative AI companies for copyright violation. 

“They do it because they’re passionate about a community that’s been … taken advantage of [and] exploited, and they’re just really invested in it,” says Ortiz. 

It was Shan, Zhao says, who first understood what kinds of protections artists were looking for and realized that the work they did together on Fawkes could help them build Glaze. Zhao describes Shan’s technical abilities as some of the strongest he’s ever seen, but what really sets him apart, he says, is his ability to connect dots across disciplines. “These are the kinds of things that you really can’t train,” Zhao adds.  

Shan says he wants to tilt the power balance back from large corporations to people. 

“Right now, the AI powerhouses are all private companies, and their job is not to protect people and society,” he says. “Their job is to make shareholders happy.” He aims to show, through his work on Glaze and Nightshade, that AI companies can collaborate with artists and help them benefit from AI or empower them to opt out. Some firms are looking into how they could use the tools to protect their intellectual property. 

Next, Shan wants to build tools to help regulators audit AI models and enforce laws. He also plans to further develop Glaze and Nightshade in ways that could make them easier to apply to other industries, such as gaming, music, or journalism. “I will be in [this] project for life,” he says.

Watch Shan talk about what’s next for his work in a recent interview by Amy Nordrum, MIT Technology Review’s executive editor.

This story has been updated.

To be more useful, robots need to become lazier

Robots perceive the world around them very differently from the way humans do. 

When we walk down the street, we know what we need to pay attention to—passing cars, potential dangers, obstacles in our way—and what we don’t, like pedestrians walking in the distance. Robots, on the other hand, treat all the information they receive about their surroundings with equal importance. Driverless cars, for example, have to continuously analyze data about things around them whether or not they are relevant. This keeps drivers and pedestrians safe, but it draws on a lot of energy and computing power. What if there’s a way to cut that down by teaching robots what they should prioritize and what they can safely ignore?

That’s the principle underpinning “lazy robotics,” a field of study championed by René van de Molengraft, a professor at Eindhoven University of Technology in the Netherlands. He believes that teaching all kinds of robots to be “lazier” with their data could help pave the way for machines that are better at interacting with things in their real-world environments, including humans. Essentially, the more efficient a robot can be with information, the better.

Van de Molengraft’s lazy robotics is just one approach researchers and robotics companies are now taking as they train their robots to complete actions successfully, flexibly, and in the most efficient manner possible.

Teaching them to be smarter when they sift through the data they gather and then de-prioritize anything that’s safe to overlook will help make them safer and more reliable—a long-standing goal of the robotics community.

Simplifying tasks in this way is necessary if robots are to become more widely adopted, says Van de Molengraft, because their current energy usage won’t scale—it would be prohibitively expensive and harmful to the environment. “I think that the best robot is a lazy robot,” he says. “They should be lazy by default, just like we are.”

Learning to be lazier

Van de Molengraft has hit upon a fun way to test these efforts out: teaching robots to play soccer. He recently led his university’s autonomous robot soccer team, Tech United, to victory at RoboCup, an annual international robotics and AI competition that tests robots’ skills on the soccer field. Soccer is a tough challenge for robots, because both scoring and blocking goals require quick, controlled movements, strategic decision-making, and coordination. 

Learning to focus and tune out distractions around them, much as the best human soccer players do, will make robots not only more energy efficient (especially those powered by batteries) but also more likely to make smarter decisions in dynamic, fast-moving situations.

Tech United’s robots used several “lazy” tactics to give them an edge over their opponents during the RoboCup. One approach involved creating a “world model” of a soccer pitch that identifies and maps out its layout and line markings—things that remain the same throughout the game. This frees the battery-powered robots from constantly scanning their surroundings, which would waste precious power. Each robot also shares what its camera is capturing with its four teammates, creating a broader view of the pitch to help keep track of the fast-moving ball. 
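
The core “lazy” trick is to treat static features of the pitch as a lookup in a precomputed world model and spend per-frame processing only on things that move. The sketch below illustrates that split; it is not Tech United’s actual code, and the landmarks and labels are hypothetical.

```python
# Illustrative sketch of the "lazy" split (not Tech United's actual code):
# static pitch features live in a precomputed world model, so per-frame
# processing is spent only on objects that can move.

STATIC_WORLD_MODEL = {          # hypothetical pitch landmarks, in meters
    "center_circle": (0.0, 0.0),
    "goal_left": (-9.0, 0.0),
    "goal_right": (9.0, 0.0),
}

DYNAMIC_KINDS = {"ball", "robot", "human"}

def process_frame(detections):
    """Keep only detections worth re-processing; static ones are already known."""
    return [d for d in detections if d["kind"] in DYNAMIC_KINDS]

frame = [
    {"kind": "goal_left", "pos": (-9.1, 0.2)},  # ignored: position already in the world model
    {"kind": "ball", "pos": (2.0, -1.5)},        # processed every frame
    {"kind": "robot", "pos": (-3.0, 0.5)},       # processed every frame
]
print("known landmarks:", STATIC_WORLD_MODEL)
print("objects to track:", process_frame(frame))
```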

Previously, the robots needed a precise, pre-coded trajectory to move around the pitch. Now Van de Molengraft and his team are experimenting with having them choose their own paths to a specified destination. This saves the energy needed to track a specific journey and helps the robots cope with obstacles they may encounter along the way.

The group also successfully taught the squad to execute “penetrating passes”—where a robot shoots toward an open region in the field and communicates to the best-positioned member of its team to receive it—and skills such as receiving or passing the ball within configurations such as triangles. Giving the robots access to world models built using data from the surrounding environment allows them to execute their skills anywhere on the pitch, instead of just in specific spots.

Beyond the soccer pitch

While soccer is a fun way to test how successful these robotics methods are, other researchers are also working on the problem of efficiency—and dealing with much higher stakes.

Making robots that work in warehouses better at prioritizing different data inputs is essential to ensuring that they can operate safely around humans and be relied upon to complete tasks, for example. If the machines can’t manage this, companies could end up with a delayed shipment, damaged goods, an injured human worker—or worse, says Chris Walti, the former head of Tesla’s robotics division. 

Walti left the company to set up his own firm after witnessing how challenging it was to get robots to simply move materials around. His startup, Mytra, designs fully autonomous machines that use computer vision and an AI reinforcement-learning system to give them awareness of the robots closest to them and to help them reason and collaborate to complete tasks (like moving a broken pallet) in much more computationally efficient ways.

The majority of mobile robots in warehouses today are controlled by a single central “brain” that dictates the paths they follow, meaning a robot has to wait for instructions before it can do anything. Not only is this approach difficult to scale, but it consumes a lot of central computing power and requires very dependable communication links.

Mytra believes it’s hit upon a significantly more efficient approach, which acknowledges that individual robots don’t really need to know what hundreds of other robots are doing on the other side of the warehouse. Its machine-learning system cuts down on this unnecessary data, and the computing power it would take to process it, by simulating the optimal route each robot can take through the warehouse to perform its task. This enables them to act much more autonomously. 
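
One way to picture this is a robot that plans with only its local neighborhood in mind, ignoring robots far across the warehouse. The sketch below is a toy illustration of that idea, not Mytra’s system; the radius, penalty, and candidate routes are all hypothetical.

```python
# Illustrative sketch (not Mytra's system): a robot plans using only the
# positions of nearby robots, ignoring the rest of the fleet, and picks the
# cheapest candidate route that avoids local conflicts.

from math import dist

NEIGHBOR_RADIUS = 5.0  # hypothetical: only robots within 5 m matter

def choose_route(routes, my_pos, fleet):
    nearby = [p for p in fleet if dist(my_pos, p) <= NEIGHBOR_RADIUS]

    def cost(route):
        length = sum(dist(a, b) for a, b in zip(route, route[1:]))
        conflicts = sum(1 for cell in route for p in nearby if dist(cell, p) < 1.0)
        return length + 100 * conflicts  # heavy penalty for local collisions

    return min(routes, key=cost)

routes = [
    [(0, 0), (1, 0), (2, 0)],                              # direct, but blocked
    [(0, 0), (0, 1.5), (1, 1.5), (2, 1.5), (2, 0)],        # short detour
]
# One robot sits on the direct route; another is far away and gets ignored.
print(choose_route(routes, my_pos=(0, 0), fleet=[(1.0, 0.05), (40.0, 40.0)]))
```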

“In the context of soccer, being efficient allows you to score more goals. In the context of manufacturing, being efficient is even more important because it means a system operates more reliably,” he says. “By providing robots with the ability to act and think autonomously and efficiently, you’re also optimizing the efficiency and the reliability of the broader operation.”

While simplifying the types of information that robots need to process is a major challenge, inroads are being made, says Daniel Polani, a professor from the University of Hertfordshire in the UK who specializes in replicating biological processes in artificial systems. He’s also a fan of the RoboCup challenge—in fact, he leads his university’s Bold Hearts robot soccer team, which made it to the second round of this year’s RoboCup’s humanoid league.

“Organisms try not to process information that they don’t need to because that processing is very expensive, in terms of metabolic energy,” he says. Polani is interested in applying these lessons from biology to the vast networks that power robots, to make them more efficient with their information. Simply reducing the amount of information a robot is allowed to process would just make it weaker at certain tasks, he says. Instead, robots should learn to use the data they have in more intelligent ways.

Simplifying software

Amazon, which has more than 750,000 robots, the largest such fleet in the world, is also interested in using AI to help its machines make smarter, safer, and more efficient decisions. Amazon’s robots mostly fall into two categories: mobile robots that move stock, and robotic arms designed to handle objects. The AI systems that power these machines collect millions of data points every day to help train them to complete their tasks. For example, they must learn which item to grasp and move from a pile, or how to safely avoid human warehouse workers. These processes require a lot of computing power, which the new techniques can help minimize.

Generally, robotic arms and similar “manipulation” robots use machine learning to figure out how to identify objects, for example. Then they follow hard-coded rules or algorithms to decide how to act. With generative AI, these same robots can predict the outcome of an action before even attempting it, so they can choose the action most likely to succeed or determine the best possible approach to grasping an object that needs to be moved. 
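
In outline, outcome prediction turns action selection into a scoring problem: estimate the probability that each candidate action will succeed and execute the best one. The sketch below is a toy illustration of that loop, not Amazon’s software; the scoring function is a random stand-in for a learned model.

```python
# Illustrative sketch of outcome-prediction-based action selection
# (not Amazon's system): score each candidate grasp with a predicted
# success probability and execute the best one.

import random

def predict_success(grasp: dict) -> float:
    """Hypothetical stand-in for a learned model that scores a grasp."""
    random.seed(grasp["id"])          # deterministic fake score for the demo
    return random.uniform(0.2, 0.95)

candidate_grasps = [{"id": i, "angle_deg": a} for i, a in enumerate((0, 45, 90))]
best = max(candidate_grasps, key=predict_success)
print(f"Executing grasp {best['id']} at {best['angle_deg']} degrees "
      f"(predicted success {predict_success(best):.2f})")
```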

These learning systems are much more scalable than traditional methods of training robots, and the combination of generative AI and massive data sets helps streamline the sequencing of a task and cut out layers of unnecessary analysis. That’s where the savings in computing power come in. “We can simplify the software by asking the models to do more,” says Michael Wolf, a principal scientist at Amazon Robotics. “We are entering a phase where we’re fundamentally rethinking how we build autonomy for our robotic systems.”

Achieving more by doing less

This year’s RoboCup competition may be over, but Van de Molengraft isn’t resting on his laurels after his team’s resounding success. “There’s still a lot of computational activities going on in each of the robots that are not per se necessary at each moment in time,” he says. He’s already starting work on new ways to make his robotic team even lazier to gain an edge on its rivals next year.  

Although current robots are still nowhere near able to match the energy efficiency of humans, he’s optimistic that researchers will continue to make headway and that we’ll start to see a lot more lazy robots that are better at their jobs. But it won’t happen overnight. “Increasing our robots’ awareness and understanding so that they can better perform their tasks, be it football or any other task in basically any domain in human-built environments—that’s a continuous work in progress,” he says.

A brief guide to the greenhouse gases driving climate change

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

For the last week or so, I’ve been obsessed with a gas that I’d never given much thought to before. Sulfur hexafluoride (SF6) is used in high-voltage equipment on the grid. It’s also, somewhat inconveniently, a monster greenhouse gas. 

Greenhouse gases are those that trap heat in the atmosphere. SF6 and other fluorinated gases can be thousands of times more powerful at warming the planet than carbon dioxide, and yet, because they tend to escape in relatively small amounts, we hardly ever talk about them. Taken alone, their effects might be minor compared with those of carbon dioxide, but together, these gases add significantly to the challenge of addressing climate change. 

For more on the specifics of sulfur hexafluoride, check out my story from earlier this week. And in the meantime, here’s a quick cheat sheet on the most important greenhouse gases you need to know about. 

Carbon dioxide: The leading actor

I couldn’t in good conscience put together a list of greenhouse gases and not at least mention the big one. Human activities released 37.4 billion tons of carbon dioxide into the atmosphere in 2023. It’s the most abundant greenhouse gas we emit, and the most significant one driving climate change. 

It’s difficult to nail down exactly how long CO2 stays in the atmosphere, since the gas participates in a global carbon cycle—some will immediately be soaked up by oceans, forests, or other ecosystems, while the rest lingers in the atmosphere for centuries. 

Carbon dioxide comes from nearly every corner of our economy—the largest source is power plants, followed by transportation and then industrial activities. 

Methane: The flash in the pan

Methane is also a powerful contributor to climate change, making up about 30% of the warming we’ve experienced to date, even though carbon dioxide is roughly 200 times more abundant in the atmosphere. 

What’s most different about methane is that the gas is very short-lived, having a lifetime of somewhere around a decade in the atmosphere before it breaks down. But in that time, methane can cause about 86 times more warming than an equivalent amount of carbon dioxide. (Quick side note: Comparisons of greenhouse gases are usually made over a specific period of time, since gases all have different lifetimes and there’s no one number that can represent the complexity of atmospheric chemistry and physics.)
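
That is also why emissions are often reported in “CO2 equivalent”: tons of a gas multiplied by its warming potential over a chosen time horizon. Here is a rough worked example, using approximate multipliers in line with the figures above; the exact values depend on the time horizon chosen.

```python
# Back-of-the-envelope CO2-equivalent conversion. The warming-potential
# multipliers below are approximate and depend heavily on the chosen time
# horizon, as noted in the article.

GWP = {"methane_20yr": 86, "methane_100yr": 28}  # rough multipliers vs. CO2

def co2_equivalent(tons_of_gas: float, multiplier: float) -> float:
    return tons_of_gas * multiplier

tons_methane = 10.0
print(co2_equivalent(tons_methane, GWP["methane_20yr"]))   # 860 tons CO2e over 20 years
print(co2_equivalent(tons_methane, GWP["methane_100yr"]))  # 280 tons CO2e over 100 years
```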

Methane’s largest sources are the fossil-fuel industry, agriculture, and waste. Cutting down leaks from the process of extracting oil and gas is one of the most straightforward and currently available ways to slim down methane emissions. There’s a growing movement to track methane more accurately—with satellites, among other techniques—and hold accountable the oil and gas companies that are releasing the most. 

Nitrous oxide: No laughing matter

You may have come across nitrous oxide at the dentist, where it might be called “laughing gas.” But its effects on climate change are serious, as the gas makes up about 6% of warming to date.

Nitrous oxide emissions come almost entirely from agriculture. Applying certain nitrogen-based fertilizers can release the gas as bacteria break those chemicals down. Emissions can also come from burning certain agricultural wastes. 

Nitrous oxide emissions grew roughly 40% from 1980 to 2020. The gas lasts in the atmosphere for roughly a century, and over that time it can trap over 200 times more heat than carbon dioxide does in the same period. 

Cutting down on these emissions will largely require careful adjustment of soil management practices in agriculture. Decreasing use of synthetic fertilizers, applying the fertilizer we do use more efficiently, and choosing products that eliminate as many emissions as possible will be the main levers we can pull.

Fluorinated gases: The quiet giants

Last but certainly not least, fluorinated gases are some of the most powerful greenhouse gases that we emit. A variety of them fall under this umbrella, including hydrofluorocarbons (HFCs), perfluorocarbons (PFCs), and SF6. They last for centuries (or even millennia) in the atmosphere and have some eye-popping effects, with each having at least 10,000 times more global warming potential than carbon dioxide. 

HFCs are refrigerants, used in air conditioners, refrigerators, and similar appliances. One major area of research in heat pumps seeks alternative refrigerants that don’t have the same potential to warm the planet. The chemicals are also used in aerosol cans (think hair spray), as well as in fire retardants and solvents. 

SF6 is used in high-voltage power equipment, and it’s the single worst greenhouse gas assessed by the Intergovernmental Panel on Climate Change, clocking in at 23,500 times more powerful than carbon dioxide over the course of a century. Scientists are trying to find alternatives, but it’s turning out to be a difficult switch—as you’ll see if you read my latest story.

The good news is that we know change is possible when it comes to fluorinated gases. We’ve already moved away from one category, chlorofluorocarbons (CFCs). These were generally used in the same industries that use HFCs today, but they had the nasty habit of tearing a hole in the ozone layer. The 1987 Montreal Protocol successfully spurred a phaseout of CFCs, and we would be on track for significantly more warming without the change.


Now read the rest of The Spark

Related reading

Some scientists want to speed up or encourage chemical reactions that remove methane from the atmosphere, including researchers and companies who aim to spray iron particles above the ocean.

Methane can come from food waste, and some companies want to capture that gas and use it for energy instead of allowing it to escape into the atmosphere.

Carbon dioxide emissions from aviation are only one source of the industry’s climate impact. Planes also leave behind contrails, clouds that form when water vapor condenses around particles in their exhaust, and those clouds are a huge cause of the warming from air travel. Rerouting planes could help.

Another thing

We’re inching closer to climate tipping points, thresholds where ecosystems and planetary processes can create feedback loops or rapid shifts. A UK research agency just launched a $106 million effort to develop early warning systems that could alert us if we get dangerously close to these tipping points. 

The agency will focus on two main areas: the melting of the Greenland Ice Sheet and the weakening of the North Atlantic Subpolar Gyre. Read more about the program’s goals in my colleague James Temple’s latest story.

Keeping up with climate  

Volkswagen has thrown over $20 billion at EV, battery, and software startups over the past six years. Experts aren’t sure this shotgun approach is helping the automaker compete on electric cars. (The Information)

We’re finally starting to understand how clouds affect climate change. Clouds reflect light back into space, but they also trap heat in the atmosphere. Researchers are starting to puzzle out how this will add up in our future climate. (New Scientist)

Vehicles in the US just keep getting bigger, and the trend is deadly. Larger vehicles are safer for their occupants but more dangerous for everyone around them. (The Economist)

→ Big cars can also be a problem for climate change, since they require bigger batteries and more power to get around. (MIT Technology Review)

The plant-based-meat industry has had trouble converting consumers in the US, and sales are on the decline. Now advocates are appealing to Congress for help. (Vox)

Last Energy wants to build small nuclear reactors, and the startup just secured $40 million in funding. The company is claiming that it can meet aggressive timelines and says it’ll bring its first reactor online as early as 2026 in Europe. (Canary Media)

There could be 43 million tons of wind turbine blades in landfills by 2050. Researchers say they’ve found alternative materials for the blades that could make them recyclable. (New York Times)

→ Other research aims to recycle the fiberglass in current blades using chemical methods. (MIT Technology Review)

The last coal-fired power plant in the UK is set to shut down at the end of the month. The facility just accepted its final fuel delivery. (BBC)

How plants could mine metals from the soil

Nickel may not grow on trees—but there’s a chance it could someday be mined using plants. Many plant species naturally soak up metal and concentrate it in their tissues, and new funding will support research on how to use that trait for plant-based mining, or phytomining. 

Seven phytomining projects just received $9.9 million in funding from the US Department of Energy’s Advanced Research Projects Agency for Energy (ARPA-E). The goal is to better understand which plants could help with mining and determine how researchers can tweak them to get our hands on all the critical metals we’ll need in the future.

Metals like nickel, crucial for the lithium-ion batteries used in electric vehicles, are in high demand. But building new mines to meet that demand can be difficult because the mining industry has historically faced community backlash, often over environmental concerns. New mining technologies could help diversify the supply of crucial metals and potentially offer alternatives to traditional mines.  

“Everyone wants to talk about opening a new gigafactory, but no one wants to talk about opening a new mine,” says Philseok Kim, program director at ARPA-E for the phytomining project. The agency saw a need for sustainable, responsible new mining technologies, even if they’re a major departure from what’s currently used in the industry. Phytomining is a prime example. “It’s a crazy idea,” Kim says.

Roughly 750 species of plants are known to be hyperaccumulators, meaning they soak up large amounts of metals and hold them within their tissues, Kim says. The plants, which tend to absorb these metals along with other nutrients in the soil, have adapted to tolerate them.

Of the species known to take in and concentrate metals, more than two-thirds do so with nickel. While nickel is generally toxic to plants at high concentrations, these species have evolved to thrive in nickel-rich soils, which are common in some parts of the world where geologic processes have brought the metal to the surface. 

Even in hyperaccumulators, the overall level of nickel in a plant’s tissues would still be relatively small—something like one milligram of metal for every gram of dried plant material. But burning a dried plant (which largely removes the organic material) can result in ash that’s roughly 25% nickel or even higher.

The sheer number of nickel-tolerant plants, plus the metal’s importance for energy technologies, made it the natural focus for early research, Kim says.

But while plants already have a head start on nickel mining, it wouldn’t be feasible to start commercial operations with them today. The most efficient known hyperaccumulators might be able to produce 50 to 100 kilograms of nickel per hectare of land each year, Kim says. That would yield enough of the metal for just two to four EV batteries, on average, and require more land than a typical soccer field. The research program will aim to boost that yield to at least 250 kilograms per hectare in an attempt to improve the prospects for economical mining.
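
Reading those figures back, the arithmetic implies roughly 25 kilograms of nickel per EV battery, so hitting the program’s target would mean about ten batteries’ worth of nickel per hectare per year. A quick back-of-the-envelope check (all values approximate):

```python
# Rough arithmetic implied by the figures in this story (all values approximate).

nickel_kg_per_hectare_today = 100   # upper end for today's best hyperaccumulators
batteries_per_hectare_today = 4     # per the article, at that yield
nickel_per_battery_kg = nickel_kg_per_hectare_today / batteries_per_hectare_today  # ~25 kg

target_yield_kg = 250               # the program's per-hectare, per-year goal
print(target_yield_kg / nickel_per_battery_kg)  # ~10 batteries' worth of nickel per hectare
```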

The seven projects being funded will aim to increase production in several ways. Some of the researchers are hunting for species that accumulate nickel even more efficiently than known species. One candidate is vetiver, a perennial grass that grows deep roots. It’s known to accumulate metals like lead and is often used in cleanup projects, so it could be a good prospect for soaking up other metals like nickel, says Rupali Datta, a biology researcher at Michigan Technological University and head of one of the projects.

Another awardee will examine over 100,000 herbarium samples—preserved and catalogued plant specimens. Using a technique called x-ray fluorescence scanning, the researchers will look for nickel in those plants’ tissues in the hopes of identifying new hyperaccumulator species. 

Other researchers are looking to boost the mining talents of known nickel hyperaccumulators. One problem with many of the established options is that they don’t have very high biomass—in other words, they’re small. So even if the plant has a relatively high concentration of nickel in its tissues, each plant will collect only a small amount of the metal. Researchers want to tweak the known hyperaccumulators to plump them up—for example, by giving them bigger root systems that would allow them to reach deeper into the soil for metal.

Another potential way to improve nickel uptake is to change the plants’ growth cycle. Most perennial plants will basically stop growing once they flower, says Richard Amasino, a biochemistry researcher at the University of Wisconsin–Madison. So one of his goals for the project is figuring out a way to delay flowering in Odontarrhena, a genus of plants with bright yellow flowers, so they have more time to soak up nickel before they quit growing for the season.

Researchers are also working with these known target species to make sure they won’t become invasive in the places they’re planted. For example, Odontarrhena are native to Europe, and researchers want to make sure they wouldn’t run wild and disrupt natural ecosystems if they’re brought to the US or other climates where they’d grow well.

Hyperaccumulating plants are already used in mineral exploration, but they likely won’t be able to produce the high volumes of nickel we mine today, Simon Jowitt, director of the Center for Research in Economic Geology at the University of Nevada, Reno, said in an email. But plants might be a feasible solution for dealing with mine waste, he said. 

There’s also the question of what will happen once plants suck up the metals from a given area of soil. According to Jowitt, that layer may need to be removed to access more metal from the lower layers after a crop is planted and harvested. 

In addition to identifying and altering target species, researchers on all these projects need to gain a better understanding of where plants might be grown and whether and how natural processes like groundwater movement might replenish target metals in the soil, Kim says. Also, scientists will need to analyze the environmental sustainability of phytomining, he adds. For example, burning plants to produce nickel-rich ash will lead to greenhouse-gas emissions.

Even so, addressing climate change is all about making and installing things, Kim adds, and we need lots of materials to do that. Phytomining may be able to help in the future. “This is something we believe is possible,” Kim says, “but it’s extremely hard.”

Roblox is launching a generative AI that builds 3D environments in a snap

Roblox plans to roll out a generative AI tool that will let creators make whole 3D scenes just using text prompts, it announced today. 

Once it’s up and running, developers on the hugely popular online game platform will be able to simply write “Generate a race track in the desert,” for example, and the AI will spin one up. Users will also be able to modify scenes or expand their scope—say, to change a daytime scene to night or switch the desert for a forest. 

Although developers can already create scenes like this manually in the platform’s creator studio, Roblox claims its new generative AI model will make the changes happen in a fraction of the time. It also claims that it will give developers with minimal 3D art skills the ability to craft more compelling environments. The firm didn’t give a specific date for when the tool will be live.

Developers are already excited. “Instead of sitting and doing it by hand, now you can test different approaches,” says Marcus Holmström, CEO of The Gang, a company that builds some of the top games on Roblox.  “For example, if you’re going to build a mountain, you can do different types of mountains, and on the fly, you can change it. Then we would tweak it and fix it manually so it fits. It’s going to save a lot of time.”

Roblox’s new tool works by “tokenizing” the 3D blocks that make up its millions of in-game worlds, or treating them as units that can be assigned a numerical value on the basis of how likely they are to come next in a sequence. This is similar to the way in which a large language model handles words or fractions of words. If you put “The capital of France is …” into a large language model like GPT-4, for example, it assesses what the next token is most likely to be. In this case, it would be “Paris.” Roblox’s system handles 3D blocks in much the same way to create the environment, block by most likely next block. 
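
Stripped to its essentials, “next block” prediction works like the toy sketch below: given the blocks placed so far, pick the most probable next block, append it, and repeat. The probabilities here are invented for illustration and have nothing to do with Roblox’s actual model.

```python
# Illustrative sketch of "next block" prediction (not Roblox's model):
# just as a language model picks the most probable next word, a 3D model
# picks the most probable next block given the blocks placed so far.

# Hypothetical learned probabilities for what follows a partial desert racetrack.
next_block_probs = {
    "asphalt_straight": 0.55,
    "asphalt_curve": 0.30,
    "sand_dune": 0.10,
    "cactus": 0.05,
}

scene = ["start_line", "asphalt_straight"]
next_block = max(next_block_probs, key=next_block_probs.get)
scene.append(next_block)
print(scene)  # the scene grows one most-likely block at a time
```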

Finding a way to do this has been difficult, for a couple of reasons. One, there’s far less data for 3D environments than there is for text. To train its models, Roblox has had to rely on user-generated data from creators as well as external data sets. 

“Finding high-quality 3D information is difficult,” says Anupam Singh, vice president of AI and growth engineering at Roblox. “Even if you get all the data sets that you would think of, being able to predict the next cube requires it to have literally three dimensions, X, Y, and Z.”

The lack of 3D data can create weird situations, where objects appear in unusual places—a tree in the middle of your racetrack, for example. To get around this issue, Roblox will use a second AI model that has been trained on more plentiful 2D data, pulled from open-source and licensed data sets, to check the work of the first one. 

Basically, while one AI is making a 3D environment, the 2D model will convert the new environment to 2D and assess whether or not the image is logically consistent. If the images don’t make sense and you have, say, a cat with 12 arms driving a racecar, the 3D AI generates a new block again and again until the 2D AI “approves.”
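
That generate-and-check loop can be sketched in a few lines. The helpers below are hypothetical stand-ins for the two models Roblox describes, not its actual code.

```python
# Illustrative sketch of the generate-and-check loop described above:
# a 3D generator proposes a block, a 2D model sanity-checks the result,
# and generation is retried until the check passes.

import random

def propose_block(scene: list[str]) -> str:
    """Hypothetical 3D generator: proposes the next block."""
    return random.choice(["tree", "asphalt_straight", "asphalt_curve"])

def looks_consistent(scene: list[str], block: str) -> bool:
    """Hypothetical 2D checker: reject a tree in the middle of a racetrack."""
    return not (block == "tree" and "asphalt_straight" in scene)

def generate_next_block(scene: list[str], max_tries: int = 10) -> str:
    for _ in range(max_tries):
        block = propose_block(scene)
        if looks_consistent(scene, block):
            return block
    return "asphalt_straight"  # hypothetical fallback if no proposal passes

scene = ["start_line", "asphalt_straight"]
scene.append(generate_next_block(scene))
print(scene)
```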

Roblox game designers will still need to be involved in crafting fun game environments for the platform’s millions of players, says Chris Totten, an associate professor in the animation game design program at Kent State University. “A lot of level generators will produce something that’s plain and flat. You need a human guiding hand,” he says. “It’s kind of like people trying to do an essay with ChatGPT for a class. It is also going to open up a conversation about what does it mean to do good, player-responsive level design?”

The new tool is part of Roblox’s push to integrate AI into all its processes. The company currently has 250 AI models live. One AI analyzes voice chat in real time and screens for bad language, instantly issuing reprimands and possible bans for repeated infractions.

Roblox plans to open-source its 3D foundation model so that it can be modified and used as a basis for innovation. “We’re doing it in open source, which means anybody, including our competitors, can use this model,” says Singh. 

Getting it into as many hands as possible also opens creative possibilities for developers who are not as skilled at creating Roblox environments. “There are a lot of developers that are working alone, and for them, this is going to be a game changer, because now they don’t have to try to find someone else to work with,” says Holmström.