Three ways we can fight deepfake porn

Last week, sexually explicit images of Taylor Swift, one of the world’s biggest pop stars, went viral online. Millions of people viewed nonconsensual deepfake porn of Swift on the social media platform X, formerly known as Twitter. X has since taken the drastic step of blocking all searches for Taylor Swift to try to get the problem under control. 

This is not a new phenomenon: deepfakes have been around for years. However, the rise of generative AI has made it easier than ever to create deepfake pornography and sexually harass people using AI-generated images and videos. 

Of all types of harm related to generative AI, nonconsensual deepfakes affect the largest number of people, with women making up the vast majority of those targeted, says Henry Ajder, an AI expert who specializes in generative AI and synthetic media.

Thankfully, there is some hope. New tools and laws could make it harder for attackers to weaponize people’s photos, and they could help us hold perpetrators accountable. 

Here are three ways we can combat nonconsensual deepfake porn. 

WATERMARKS

Social media platforms sift through the posts that are uploaded onto their sites and take down content that goes against their policies. But this process is patchy at best and misses a lot of harmful content, as the Swift videos on X show. It is also hard to distinguish between authentic and AI-generated content. 

One technical solution could be watermarks. Watermarks hide an invisible signal in images that helps computers identify whether they are AI generated. For example, Google has developed a system called SynthID, which uses neural networks to modify pixels in images and adds a watermark that is invisible to the human eye. That mark is designed to be detectable even if the image is edited or screenshotted. In theory, these tools could help companies improve their content moderation and spot fake content, including nonconsensual deepfakes, faster.
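
To make the idea concrete, here is a deliberately simplified sketch of how an invisible watermark can be hidden in an image and checked for later. It is not how SynthID works; Google’s system uses neural networks and is built to survive edits and screenshots. The bit pattern and function names below are invented purely for illustration.

```python
# A toy illustration of invisible watermarking: hide a known bit pattern in the
# least significant bits of an image's pixel values, then check for it later.
# Real systems such as SynthID are far more robust; this only shows the basic
# embed-and-detect idea.
import numpy as np

WATERMARK = np.array([1, 0, 1, 1, 0, 0, 1, 0], dtype=np.uint8)  # hypothetical signature

def embed(image: np.ndarray) -> np.ndarray:
    """Write the signature into the least significant bits of the first pixel values."""
    marked = image.copy()                 # assumes a uint8 image array
    flat = marked.reshape(-1)
    flat[: len(WATERMARK)] = (flat[: len(WATERMARK)] & 0xFE) | WATERMARK
    return marked

def detect(image: np.ndarray) -> bool:
    """Return True if the signature is present in the image's first pixel values."""
    flat = image.reshape(-1)
    return bool(np.array_equal(flat[: len(WATERMARK)] & 1, WATERMARK))

if __name__ == "__main__":
    photo = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
    print(detect(photo))          # almost certainly False for an unmarked image
    print(detect(embed(photo)))   # True
```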

Pros: Watermarks could be a useful tool that makes it easier and quicker to identify AI-generated content and flag toxic posts that should be taken down. Including watermarks in all images by default would also make it harder for attackers to create nonconsensual deepfakes to begin with, says Sasha Luccioni, a researcher at the AI startup Hugging Face who has studied bias in AI systems.

Cons: These systems are still experimental and not widely used. And a determined attacker can still tamper with them. Companies are also not applying the technology to all images across the board. Users of Google’s Imagen AI image generator can choose whether they want their AI-generated images to have the watermark, for example. All these factors limit their usefulness in fighting deepfake porn. 

PROTECTIVE SHIELDS

At the moment, all the images we post online are fair game for anyone to use to create a deepfake. And because the latest image-making AI systems are so sophisticated, it is growing harder to prove that AI-generated content is fake. 

But a slew of new defensive tools allow people to protect their images from AI-powered exploitation by making them look warped or distorted in AI systems. 

One such tool, called PhotoGuard, was developed by researchers at MIT. It works like a protective shield by altering the pixels in photos in ways that are invisible to the human eye. When someone uses an AI app like the image generator Stable Diffusion to manipulate an image that has been treated with PhotoGuard, the result will look unrealistic. Fawkes, a similar tool developed by researchers at the University of Chicago, cloaks images with hidden signals that make it harder for facial recognition software to recognize faces. 
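
As a rough illustration of what these shields do, the sketch below nudges every pixel by an amount too small to notice, chosen so that a model’s internal representation of the image shifts as much as possible. It is not PhotoGuard’s or Fawkes’s actual method: the “encoder” here is just a random linear map standing in for a real image model, and all of the numbers are arbitrary.

```python
# A toy sketch of a protective perturbation: small, bounded pixel changes that
# maximally shift a model's feature representation of the image. Tools like
# PhotoGuard target the encoders of real diffusion models; here a random linear
# map stands in for the encoder, purely to show the optimization loop.
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(128, 32 * 32 * 3))          # stand-in "image encoder"

def features(flat_image: np.ndarray) -> np.ndarray:
    return W @ flat_image

def protect(image: np.ndarray, budget: float = 4 / 255, steps: int = 50) -> np.ndarray:
    """Perturb pixels within +/- budget so the encoder's features move away from the original."""
    x0 = image.reshape(-1)
    delta = rng.uniform(-budget, budget, size=x0.shape)   # random start inside the budget
    step = budget / 10
    for _ in range(steps):
        grad = 2 * W.T @ (W @ delta)                      # gradient of the feature-space distance
        delta = np.clip(delta + step * np.sign(grad), -budget, budget)
    return np.clip(x0 + delta, 0.0, 1.0).reshape(image.shape)

photo = rng.uniform(size=(32, 32, 3))                     # stand-in photo, values in [0, 1]
shielded = protect(photo)
print(np.abs(shielded - photo).max())                     # tiny per-pixel change
print(np.linalg.norm(features(shielded.reshape(-1)) - features(photo.reshape(-1))))
```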

Another new tool, called Nightshade, could help people fight back against having their images used in AI systems. The tool, developed by researchers at the University of Chicago, applies an invisible layer of “poison” to images. It was developed to protect artists from having their copyrighted images scraped by tech companies without their consent. However, in theory, it could be used on any image its owner doesn’t want scraped by AI systems. When tech companies scrape such training material off the internet without consent, the poisoned images break the AI model: images of cats could become dogs, and images of Taylor Swift could become dogs too. 

Pros: These tools make it harder for attackers to use our images to create harmful content. They show some promise in providing private individuals with protection against AI image abuse, especially if dating apps and social media companies apply them by default, says Ajder. 

“We should all be using Nightshade for every image we post on the internet,” says Luccioni. 

Cons: These defensive shields work on the latest generation of AI models. But there is no guarantee future versions won’t be able to override these protective mechanisms. They also don’t work on images that are already online, and they are harder to apply to images of celebrities, as famous people don’t control which photos of them are uploaded online. 

“It’s going to be this giant game of cat and mouse,” says Rumman Chowdhury, who runs the ethical AI consulting and auditing company Parity Consulting. 

REGULATION

Technical fixes go only so far. The only thing that will lead to lasting change is stricter regulation, says Luccioni. 

Taylor Swift’s viral deepfakes have put new momentum behind efforts to clamp down on deepfake porn. The White House said the incident was “alarming” and urged Congress to take legislative action. Thus far, the US has had a piecemeal, state-by-state approach to regulating the technology. For example, California and Virginia have banned the creation of pornographic deepfakes made without consent. New York and Virginia also ban the distribution of this sort of content. 

However, we could finally see action on a federal level. A new bipartisan bill that would make sharing fake nude images a federal crime was recently reintroduced in the US Congress. A deepfake porn scandal at a New Jersey high school has also pushed lawmakers to respond with a bill called the Preventing Deepfakes of Intimate Images Act. The attention Swift’s case has brought to the problem might drum up more bipartisan support. 

Lawmakers around the world are also pushing stricter laws for the technology. The UK’s Online Safety Act, passed last year, outlaws the sharing of deepfake porn material, but not its creation. Perpetrators could face up to six months of jail time. 

In the European Union, a bunch of new bills tackle the problem from different angles. The sweeping AI Act requires deepfake creators to clearly disclose that the material was created by AI, and the Digital Services Act will require tech companies to remove harmful content much more quickly. 

China’s deepfake law, which entered into force in 2023, goes the furthest. In China, deepfake creators need to take steps to prevent the use of their services for illegal or harmful purposes, ask for consent from users before making their images into deepfakes, authenticate people’s identities, and label AI-generated content. 

Pros: Regulation will offer victims recourse, hold creators of nonconsensual deepfake pornography accountable, and create a powerful deterrent. It also sends a clear message that creating nonconsensual deepfakes is not acceptable. Laws and public awareness campaigns making it clear that people who create this sort of deepfake porn are sex offenders could have a real impact, says Ajder. “That would change the slightly blasé attitude that some people have toward this kind of content as not harmful or not a real form of sexual abuse,” he says. 

Cons: It will be difficult to enforce these sorts of laws, says Ajder. With current techniques, it will be hard for victims to identify who has assaulted them and build a case against that person. The person creating the deepfakes might also be in a different jurisdiction, which makes prosecution more difficult. 

Dear Taylor Swift, we’re sorry about those explicit deepfakes

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

Hi, Taylor.

I can only imagine how you must be feeling after sexually explicit deepfake videos of you went viral on X. Disgusted. Distressed, perhaps. Humiliated, even. 

I’m really sorry this is happening to you. Nobody deserves to have their image exploited like that. But if you aren’t already, I’m asking you to be furious. 

Furious that this is happening to you and so many other women and marginalized people around the world. Furious that our current laws are woefully inept at protecting us from violations like this. Furious that men (because let’s face it, it’s mostly men doing this) can violate us in such an intimate way and walk away unscathed and unidentified. Furious that the companies that enable this material to be created and shared widely face no consequences either, and can profit off such a horrendous use of their technology. 

Deepfake porn has been around for years, but its latest incarnation is its worst one yet. Generative AI has made it ridiculously easy and cheap to create realistic deepfakes. And nearly all deepfakes are made for porn. Only one image plucked off social media is enough to generate something passable. Anyone who has ever posted or had a photo published of them online is a sitting duck. 

First, the bad news. At the moment, we have no good ways to fight this. I just published a story looking at three ways we can combat nonconsensual deepfake porn, which include watermarks and data-poisoning tools. But the reality is that there is no neat technical fix for this problem. The fixes we do have are still experimental and haven’t been adopted widely by the tech sector, which limits their power. 

The tech sector has thus far been unwilling or unmotivated to make changes that would prevent such material from being created with their tools or shared on their platforms. That is why we need regulation. 

People with power, like yourself, can fight with money and lawyers. But low-income women, women of color, women fleeing abusive partners, women journalists, and even children are all seeing their likeness stolen and pornified, with no way to seek justice or support. Any one of your fans could be hurt by this development. 

The good news is that the fact that this happened to you means politicians in the US are listening. You have a rare opportunity, and momentum, to push through real, actionable change. 

I know you fight for what is right and aren’t afraid to speak up when you see injustice. There will be intense lobbying against any rules that would affect tech companies. But you have a platform and the power to convince lawmakers across the board that rules to combat these sorts of deepfakes are a necessity. Tech companies and politicians need to know that the days of dithering are over. The people creating these deepfakes need to be held accountable. 

You once caused an actual earthquake. Winning the fight against nonconsensual deepfakes would have an even more earth-shaking impact.

What’s next for robotaxis in 2024

MIT Technology Review’s What’s Next series looks across industries, trends, and technologies to give you a first look at the future. You can read the rest of our series here.

In 2023, it almost felt as if the promise of robotaxis was soon to be fulfilled. Hailing a robotaxi had briefly become the new trendy thing to do in San Francisco, as simple and everyday as ordering a delivery via app. However, that dream crashed and burned in October, when a serious accident in downtown San Francisco involving a vehicle belonging to Cruise, one of the leading US robotaxi companies, ignited distrust, casting a long shadow over the technology’s future. 

Following that and another accident, the state of California suspended Cruise’s operations there indefinitely, and the National Highway Traffic Safety Administration launched an investigation of the company. Since then, Cruise has pulled all its vehicles from the road and laid off 24% of its workforce.

Despite that, other robotaxi companies are still forging ahead. In half a dozen cities in the US and China, fleets of robotaxis run by companies such as Waymo and Baidu are still serving anyone who would like to try them. Regulators in places like San Francisco, Phoenix, Beijing, and Shanghai now allow these vehicles to drive without human safety operators. 

However, other perils loom. Robotaxi companies need to make a return on the vast sums that have been invested into getting them up and running. Until robotaxis become cheaper, they can’t meaningfully compete with conventional taxis and Uber. Yet at the same time, if companies try to increase adoption too fast, they risk following in Cruise’s footsteps. Waymo, another major robotaxi operator, has been going more slowly and cautiously. But no one is immune to accidents. 

“If they have an accident, it’s going to be big news, and it will hurt everyone,” says Missy Cummings, a professor and director of the Mason Autonomy and Robotics Center at George Mason University. “That’s the big lesson of this year. The whole industry is on thin ice.”

MIT Technology Review talked to experts about how to understand the challenges facing the robotaxi industry. Here’s how they expect it to change in 2024.

Money, money, money

After years of testing robotaxis on the road, companies have demonstrated that a version of the autonomous driving technology is ready today, though with some heavy asterisks. They operate only within strict, pre-set geographical boundaries; while some cars no longer have a human operator in the driver’s seat, they still require remote operators to take control in case of emergencies; and they are limited to warmer climates, because snow can be challenging for the cars’ cameras and sensors. 

“From what has been disclosed publicly, these systems still rely on some remote human supervision to operate safely. This is why I am calling them automated rather than autonomous,” says Ramanarayan Vasudevan, an associate professor of robotics and mechanical engineering at the University of Michigan.

The problem is that this version of automated driving is much more costly than traditional taxis. A robotaxi ride can be “several orders of magnitude more expensive than what it costs other taxi companies,” he says. “Unfortunately I don’t think the technology will dramatically change in the coming year to really drive down that cost.”

That higher ticket price will inevitably suppress demand. If robotaxi companies want to keep customers—not just riders curious to try the service for the first time—they need to make it cheaper than other forms of transportation. 

Bryant Walker Smith, an associate professor of law at the University of South Carolina, echoes this concern. “These companies are competing with an Uber driver who, in any estimate, makes less than minimum wage, has a midpriced car, and probably maintains it themselves,” he says. 

By way of contrast, robotaxis are expensive vehicles packed full of cameras, sensors, and advanced software systems, and they require constant monitoring and help from humans. For now, it is almost impossible for them to compete with ride-sharing services, at least until many more robotaxis hit the road.

And as robotaxi companies keep burning the cash from investors, concerns are growing that they are not getting enough in return for their vast expenditure, says Smith. That means even more pressure to produce results, while balancing the potential revenues and costs. 

The resistance to scaling up

In the US, there are currently four cities where people can take a robotaxi: San Francisco, Phoenix, Los Angeles, and Las Vegas. 

The terms differ by city. Some require you to sign up for a waitlist first, which could take months to clear, while others only operate the vehicles in a small area.

Expanding robotaxi services into a new city involves a huge upfront effort and cost: the new area has to be thoroughly mapped (and that map has to be kept up to date), and the operator has to buy more autonomous vehicles to keep up with demand. 

Also, cars whose autonomous systems are geared toward, say, San Francisco have a limited ability to adapt to Austin, says Cummings, who’s researching how to measure this type of adaptability. “If I’m looking at that as a basic research question, it probably means the companies haven’t learned something important yet,” she says. 

These factors have combined to cause renewed concern about robotaxis’ profitability. Even after Cruise removed its vehicles from the road, Waymo, the other major robotaxi company in the US, hasn’t jumped in to fill the vacuum. Since each robotaxi ride currently costs the company more money than it makes, there’s hardly an appetite for endless expansion.

Worldwide development

It’s not just the US where robotaxis are being researched, tested, and even deployed. 

China is the other leader right now, and it is proceeding on roughly the same timeline as the US. In 2023, a few cities in China, including Beijing and Shanghai, received government clearance to run robotaxis on the road without any safety operators. However, the cars can only run in certain small and relatively remote areas of the cities, making the service tricky to access for most people.

The Middle East is also quickly gaining a foothold in the sector, with the help of Chinese and American companies. Saudi Arabia invested $100 million in the Chinese robotaxi startup Pony.AI to bring its cars to Neom, a futuristic city it’s constructing that will supposedly be built with all the latest technologies. Meanwhile, Dubai and Abu Dhabi are competing with each other to become the first city in the Middle East to pilot driverless vehicles on the road, with vehicles made by Cruise and the Chinese company WeRide.

Chinese robotaxi companies face the same central challenge as their US peers: proving their profitability. A push to monetize permeated the Chinese industry in 2023 and launched a new trend: Chinese self-driving companies are now racing to sell their autopilot systems to other companies. This lets them make some quick cash by repackaging their technologies into less advanced but more in-demand services, like urban autopilot systems that can be sold to carmakers.

Meanwhile, robotaxi development in Europe has lagged behind, partly because countries there prefer deploying autonomous vehicles in mass transit. While Germany, the UK, and France have seen robotaxis running road tests, commercial operations remain a distant hope. 

Lessons from Cruise’s fiasco

Cruise’s dreadful experience points to one major remaining roadblock for robotaxis: they still sometimes behave erratically. When a human driver (in a non-autonomous vehicle) hit a pedestrian in San Francisco in October and drove away from the scene, a passing Cruise car then ran over the victim and dragged her 20 feet before stopping. 

“We are deeply concerned that more people will be killed, more first responders will be obstructed, more sudden stops will happen,” says Cathy Chase, president of Advocates for Highway and Auto Safety, an activist group based in Washington, DC. “We are not against autonomous vehicles. We are concerned about the unsafe deployment and a rush to the market at the expense of the traveling public.”

These companies are simply not reporting enough data to show us how safe their vehicles are, she says. While they are required to submit data to the National Highway Traffic Safety Administration, the data is heavily redacted in the name of protecting trade secrets before it’s released to the public. Some federal bills proposed in the last year, which haven’t passed, could even lighten these reporting requirements, Chase says.

“If there’s a silver lining in this accident, it’s that people were forced to reckon with the fact that these operations are not simple and not that straightforward,” Cummings says. It will likely cause the industry to rely more on remote human operators, something that could have changed the Cruise vehicle’s response in the October accident. But introducing more humans will further tip the balance away from profitability.

Meanwhile, Cruise was accused by the California Public Utilities Commission of misleading the public and regulators about its involvement in the incident. “If we cannot trust these companies, then they have no business on our roads,” says Smith. 

A Cruise spokesperson told MIT Technology Review the company has no updates to share currently but pointed to a blog post from November saying it had hired third-party law firms and technology consultants to review the accident and Cruise’s responses to the regulators. In a settlement proposal to CPUC, Cruise also offered to share more data, including “collision reporting as well as regular reports detailing incidents involving stopped AVs.”

The future of Cruise remains unclear, and so does the company’s original plan to launch operations in several more cities soon. Meanwhile, though, Waymo is applying to expand its services in Los Angeles while taking its vehicles to the highways of Phoenix. Zoox, an autonomous-driving startup owned by Amazon, could launch commercial service in Las Vegas this year. For residents of these cities, more and more robotaxis may be on the road in 2024.

Correction: The story has been updated to clarify that Cruise’s October 2 accident was not fatal. The victim was hospitalized with serious injuries but survived.

How China is regulating robotaxis

This story first appeared in China Report, MIT Technology Review’s newsletter about technology in China. Sign up to receive it in your inbox every Tuesday.

Not many technologies have had such a roller-coaster ride in the past year as robotaxis. In just a few months they went from San Francisco’s new darling to a national scandal after Cruise, one of the leading companies in the business, was involved in a serious accident.

This morning, I published a story looking at where the industry is going in the new year. Other than rebuilding the public trust that was lost last year, robotaxi companies are also struggling to find a realistic business model. It’s not a moonshot idea anymore, but it’s also not quite a feasible business yet.

The piece mostly focuses on the situation in the US, which is experiencing the most uncertainty because of that Cruise incident. But obviously, I’ve also been paying close attention to Chinese companies in this field. And this newsletter will be a crash course to bring you up to date on how the Chinese peers of Cruise and Waymo are doing now.

The US and China have been on similar timelines when it comes to autonomous-driving technology.

It’s hard to make a direct comparison because these companies don’t usually operate in the same market. Cruise and Waymo don’t exist in China at all; while some Chinese companies have set up research labs in the US and have been testing their vehicles here, they have no plan to compete in the US domestic market. 

But there are so many developments happening in parallel. In the last two years, companies in both countries obtained permits to remove safety operators from the cars, charge passengers for the ride, expand their services to airports, and extend their operating hours to 24/7. 

What it means is that robotaxis are nearly as accessible in China right now as they are in the US. If you are determined to try them out, there are several Chinese cities where you can find these self-driving vehicles on the streets—although you still can’t really use them for a regular commute.

And now, American and Chinese robotaxi companies are both under pressure to start making money. Considering the advanced hardware and software in a robotaxi and the human intervention still needed to operate and maintain them, robotaxi rides have much higher costs than taxi rides today, which makes the business hard to scale up. 

I first heard about this in the summer of 2023, when I noticed that Chinese carmakers had suddenly started aggressively rolling out autopilot functions (similar to Tesla’s FSD). One of the reasons behind that push turned out to be that Chinese autonomous-driving companies wanted to generate quicker revenues. Instead of waiting for fully autonomous vehicles, like robotaxis, to make money by themselves, they’d rather package the technologies into driver-assistance systems and sell to the market now.

One difference between the robotaxi rollouts in the two countries has to do with regulation. Chinese regulators are known for their hands-on approach and the tendency to rein in new technologies early. Robotaxis have been no exception.

In December 2023, China’s first regulation on commercial operation of autonomous vehicles went into effect. It sets some ground rules for different kinds of vehicles: roboshuttles or robotrucks still need to have in-car safety operators, while robotaxis can use remote operators. The ratio of robotaxis to remote operators cannot exceed 3:1, and operators need to pass certain skill tests. There are also rules specifying what data the companies need to report when accidents happen.

Like any first stab at regulations, the document still seems relatively abstract at this point. But it does put China one step ahead of the US, where national legislation on robotaxis could take much longer to come, and it’s the individual states taking all the steps for now. It will be interesting to see how the different regulatory approaches will affect the industry in the future.

On that note, we might be able to see American and Chinese robotaxi companies compete directly soon—not on their home turfs, but in the Middle East. In Abu Dhabi, the capital city of the United Arab Emirates, robotaxis made by the Chinese company WeRide have been carrying out free test rides. Just a two-hour drive away in Dubai, robotaxis made by Cruise are also being tested, even though the same vehicles have been pulled off the roads in the US. 

Maybe we will soon be able to ride a Chinese robotaxi and an American robotaxi on the same day. If you get the chance to do that, definitely let me know how it goes!

Do you think the Chinese hands-on regulatory approach will benefit the industry or not? Tell me your thoughts at zeyi@technologyreview.com.

Catch up with China

1. China’s population declined for the second year in a row, according to new data released by the government. (NBC News)

2. AI scientists and policy experts from the US and China held secretive talks in Geneva last year to align their views on AI safety. (Financial Times $)

3. More than one million Chinese people have emigrated since 2019, transforming the communities in neighboring Asian countries. (Bloomberg $)

4. China’s internet regulator is investigating what data Shein could share overseas as the fast-fashion company prepares to go public in the US. (Wall Street Journal $)

5. CJPL, a Chinese research lab hunting for dark matter, just opened its second phase and became the world’s deepest and largest underground lab. (Nature)

6. The Chinese government vows to rein in the overproduction of electric cars and the “disorderly competition behaviors” among domestic EV companies. (Financial Times $)

Lost in translation

Once determined to reduce its reliance on coal, China is ramping up coal production and consumption again. According to the Chinese publication Caixin Weekly, China produced 4.66 billion tons of coal in 2023, an all-time high; it also imported 0.47 billion tons of coal that year, a 61.8% increase from the year before and another record. 

In 2020, the Chinese government was taking measures to phase out coal power plants. But the war between Russia and Ukraine pushed up global energy prices and left China worried that shifting to renewable energy could disrupt its economic stability. As a result, more coal power plants have entered construction in China over the past few years, causing a rebound in coal consumption and production.

One more thing

Today in gadgets nobody asked for: A Chinese party-owned outlet designed a power bank–slash–speaker. Called the “‘Xi Jinping’s “The Governance of China” Volumes 1-4’ Ideology Power Bank,” the speaker reads you 72 political essays explaining the Chinese president’s ideology while it also charges your phone. What, you want to buy one? Sorry, it’s exclusively given to party cadres and government employees.

How satellite images and AI could help fight spatial apartheid in South Africa  

Raesetje Sefala grew up sharing a bedroom with her six siblings in a cramped township in the Limpopo province of South Africa. The township’s inhabitants, predominantly Black people, had inadequate access to schools, health care, parks, and hospitals. 

But just a few miles away in Limpopo, white families lived in big, attractive houses, with easy access to all these things. The physical division of communities along economic and racial lines, so that townships are close enough for the people living there to commute to work but too far to easily access essential services, is just one damaging inheritance from South Africa’s era of apartheid.

The older Sefala became, the more she peppered her father with questions about the visible racial segregation of their neighborhood: “Why is it like this?”

Now, at 28, she is helping do something about it. Alongside computer scientists Nyalleng Moorosi and Timnit Gebru at the nonprofit Distributed AI Research Institute (DAIR), which Gebru set up in 2021, she is deploying computer vision tools and satellite images to analyze the impacts of racial segregation in housing, with the ultimate hope that their work will help to reverse it.

“We still see previously marginalized communities’ lives not improving,” says Sefala. Though she was never alive during the apartheid regime, she has still been affected by its awful enduring legacy: “It’s just very unequal, very frustrating.”

In South Africa, the government census categorizes both wealthier suburbs and townships, a creation of apartheid and typically populated by Black people, as “formal residential neighborhoods.” That census is used to allocate public spending, and when they are lumped together with richer areas, townships are in effect hidden, disproportionately excluding the people living there from access to resources such as health services, education centers, and green spaces. This issue is commonly known as spatial apartheid. 

Raesetje Sefala is deploying satellite images and AI to map out spatial apartheid in South Africa. (Photo: Hannah Yoon)

Sefala and her team have spent the last three years building a data set that maps out townships in order to study how neighborhoods are changing in terms of population and size. The hope is that it could help them see whether or not people’s lives in townships have improved since the legal dissolution of apartheid.

They did it by collecting millions of satellite images of all nine provinces in South Africa, and geospatial data from the government that shows the location of different neighborhoods and buildings across the country. Then they used all this data to train machine-learning models and build an AI system that can label specific areas as wealthy, non-wealthy, non-residential, or vacant land. 
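
As a rough illustration of that workflow (not DAIR’s actual code, models, or data), the sketch below trains a small image classifier to sort satellite tiles into the four categories the team uses. Random tensors stand in for the imagery and neighborhood labels.

```python
# A minimal sketch of the kind of pipeline described above: a classifier that
# labels satellite-image tiles as wealthy, non-wealthy, non-residential, or
# vacant land. Random tensors stand in for the satellite imagery and the
# government-derived labels, purely to show the structure.
import torch
from torch import nn

CLASSES = ["wealthy", "non_wealthy", "non_residential", "vacant"]

model = nn.Sequential(                       # a small CNN over 64x64 RGB tiles
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
    nn.Flatten(),
    nn.Linear(32 * 16 * 16, len(CLASSES)),
)

optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Stand-in batch: 8 satellite tiles and their neighborhood labels.
tiles = torch.rand(8, 3, 64, 64)
labels = torch.randint(0, len(CLASSES), (8,))

for _ in range(5):                           # tiny training loop for illustration
    optimizer.zero_grad()
    loss = loss_fn(model(tiles), labels)
    loss.backward()
    optimizer.step()

with torch.no_grad():
    predicted = model(tiles).argmax(dim=1)
print([CLASSES[i] for i in predicted])
```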

In 2021, they discovered that over 70% of South African land is vacant, and they saw how much less land is allocated to townships than to suburbs. It was a confirmation of the inequalities they had expected to see, but the staggering quantity of vacant land still took them aback, says Sefala.

Now they’re sharing the data set with researchers and public service institutions, including nonprofits and civic organizations working to identify land that could be used for public services and housing. DAIR plans to make the data set free and accessible on its website from February 2.

“The work fits squarely into our research paradigm to put data using AI into the hands of marginalized groups,” says Gebru. 

While dismantling spatial apartheid may take a lifetime, Sefala hopes to use the tools they have developed to fuel systemic change and social justice. “We want the work to push the government to start labeling these townships so that we can begin to tackle real issues of resource allocation,” she says.

Data for change  

Moorosi, who now co-advises Sefala at DAIR, first hired her at the South African Council for Scientific and Industrial Research (CSIR) in 2018. Sefala “was absolutely brilliant and fully understood the concept of machine learning,” she says. And Moorosi made her realize that she was not alone in worrying about the impacts of spatial apartheid and neighborhood segregation.

South Africa is the world’s most unequal country, according to the World Bank. Nearly three decades after the end of apartheid, its brutal legacy continues to rob millions of Black South Africans of basic rights, including jobs, education, and access to health care. “It impacts every aspect of people’s lives,” says Nick Budlender, an urban policy researcher at Ndifuna Ukwazi, a nonprofit that advocates for urban land justice in Cape Town.

Sefala’s work is beginning to make its way into the hands of South African institutions and researchers. Earlier this month, DAIR shared its data with a South African policy think tank, the Human Sciences Research Council (HSRC), which is using the information to advise the government on budget allocations for HIV treatment programs across the country. “If they don’t know where the townships are—how fast the population is growing—it makes it difficult for them to allocate resources that are realistic,” says Sefala.

But perhaps the biggest impact her work could have would be to help provide information to organizations fighting for justice in urban planning, especially in the face of South Africa’s worsening housing crisis. For example, in Cape Town, the most racially segregated city in the world, about 14% of households live in informal settlements—unplanned areas without adequate shelter and infrastructure. If some of the vast tracts of public land were turned into affordable public housing, many people would not have to live in informal settlements, advocates say.  

However, the lack of publicly available information on public land in the city perpetuates a government myth that the city lacks vacant land.

“We have a real dearth of quality data,” says Budlender, and that makes it a lot harder to advocate for the use of public land to build public housing and services like hospitals. Last September, after five years of research, Ndifuna Ukwazi launched a digital interactive map, known as the People’s Land Map, displaying 2,700 parcels of vacant and underutilized public land in Cape Town. 

Its aim is to demonstrate that there’s ample public land available to help address the housing crisis. “When we have called for the development of affordable housing, the government has often responded by saying that there isn’t land available. By developing the map we have conclusively proven that this isn’t the case,” says Budlender.

Sefala says that they hope to share their data to support the work of Ndifuna Ukwazi. And Budlender is excited about the possibilities it could open. “It offers a real opportunity to track and show evidence on how townships are changing, and to shape policy,” he says. “Policy is only ever as good as the data that it is based on.” 

These days Sefala travels all over South Africa, giving talks to policymakers, advocates, and students. When she walks through the streets of Johannesburg, she often stops and stares at the huge gated houses and ponders the difference between townships and rich neighborhoods.

“Townships are terribly poor, and it was part of my reality,” she says. “But I’m happy doing something about it.”       

Google DeepMind’s new AI system can solve complex geometry problems

Google DeepMind has created an AI system that can solve complex geometry problems. It’s a significant step towards machines with more human-like reasoning skills, experts say. 

Geometry, and mathematics more broadly, have challenged AI researchers for some time. Compared with text-based AI models, there is significantly less training data for mathematics because it is symbol driven and domain specific, says Thang Wang, a coauthor of the research, which is published in Nature today.

Solving mathematics problems requires logical reasoning, something that most current AI models aren’t great at. This demand for reasoning is why mathematics serves as an important benchmark to gauge progress in AI intelligence, says Wang.

DeepMind’s program, named AlphaGeometry, combines a language model with a type of AI called a symbolic engine, which uses symbols and logical rules to make deductions. Language models excel at recognizing patterns and predicting subsequent steps in a process. However, their reasoning lacks the rigor required for mathematical problem-solving. The symbolic engine, on the other hand, is based purely on formal logic and strict rules, which allows it to guide the language model toward rational decisions. 

These two approaches, responsible for creative thinking and logical reasoning respectively, work together to solve difficult mathematical problems. This closely mimics how humans work through geometry problems, combining their existing understanding with explorative experimentation. 

DeepMind says it tested AlphaGeometry on 30 geometry problems at the same level of difficulty found at the International Mathematical Olympiad, a competition for top high school mathematics students. It completed 25 within the time limit. The previous state-of-the-art system, developed by the Chinese mathematician Wen-Tsün Wu in 1978, completed only 10.

“This is a really impressive result,” says Floris van Doorn, a mathematics professor at the University of Bonn, who was not involved in the research. “I expected this to still be multiple years away.”

DeepMind says this system demonstrates AI’s ability to reason and discover new mathematical knowledge.

“This is another example that reinforces how AI can help us advance science and better understand the underlying processes that determine how the world works,” said Quoc V. Le, a scientist at Google DeepMind and one of the authors of the research, at a press conference.

When presented with a geometry problem, AlphaGeometry first attempts to generate a proof using its symbolic engine, driven by logic. If it cannot do so using the symbolic engine alone, the language model adds a new point or line to the diagram. This opens up additional possibilities for the symbolic engine to continue searching for a proof. This cycle continues, with the language model adding helpful elements and the symbolic engine testing new proof strategies, until a verifiable solution is found.
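
That loop can be sketched schematically. The code below is not DeepMind’s; the “symbolic engine” and “language model” are toy stand-ins, included only to show how the two components hand control back and forth until a proof is found.

```python
# A schematic sketch of the alternating loop described above. The symbolic
# engine and language model here are toy stand-ins: deduce with strict rules,
# and when stuck, let the language model add a new auxiliary construction.
from dataclasses import dataclass, field

@dataclass
class Diagram:
    constructions: list = field(default_factory=list)

def symbolic_engine_prove(diagram: Diagram, goal: str):
    # Toy rule: pretend a proof only falls out after two auxiliary constructions.
    if len(diagram.constructions) >= 2:
        return f"proof of {goal} using {diagram.constructions}"
    return None

def language_model_suggest_construction(diagram: Diagram, goal: str) -> str:
    # A real language model would propose a point or line likely to help.
    return f"auxiliary_point_{len(diagram.constructions) + 1}"

def solve(goal: str, max_iterations: int = 10):
    diagram = Diagram()
    for _ in range(max_iterations):
        proof = symbolic_engine_prove(diagram, goal)      # strict logical deduction
        if proof is not None:
            return proof                                  # verifiable proof found
        suggestion = language_model_suggest_construction(diagram, goal)
        diagram.constructions.append(suggestion)          # widen the search space
    return None

print(solve("angle ABC equals angle ACB"))
```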

To train AlphaGeometry’s language model, the researchers had to create their own training data to compensate for the scarcity of existing geometric data. They generated nearly half a billion random geometric diagrams and fed them to the symbolic engine. This engine analyzed each diagram and produced statements about their properties. These statements were organized into 100 million synthetic proofs to train the language model.

Roman Yampolskiy, an associate professor of computer science and engineering at the University of Louisville who was not involved in the research, says that AlphaGeometry’s ability shows a significant advancement toward more “sophisticated, human-like problem-solving skills in machines.” 

“Beyond mathematics, its implications span across fields that rely on geometric problem-solving, such as computer vision, architecture, and even theoretical physics,” said Yampolskiy in an email.

However, there is room for improvement. While AlphaGeometry can solve problems found in “elementary” mathematics, it remains unable to grapple with the sorts of advanced, abstract problems taught at university.

“Mathematicians would be really interested if AI can solve problems that are posed in research mathematics, perhaps by having new mathematical insights,” said van Doorn.

Wang says the goal is to apply a similar approach to broader math fields. “Geometry is just an example for us to demonstrate that we are on the verge of AI being able to do deep reasoning,” he says.

A new AI-based risk prediction system could help catch deadly pancreatic cancer cases earlier

A new AI system could help detect the most common form of pancreatic cancer, new research has found.

Pancreatic cancer is a difficult disease to detect. The pancreas itself is hidden by other organs in the abdomen, making it tough to spot tumors during tests. Patients also rarely experience symptoms in the early stages, meaning that the majority of cases are diagnosed at an advanced stage—once it’s already spread to other parts of the body. This makes it much harder to cure.

As a result, it’s essential to try to catch pancreatic cancer at the earliest stage possible. A team of researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) worked with Limor Appelbaum, a staff scientist in the department of radiation oncology at the Beth Israel Deaconess Medical Center in Boston, to develop an AI system that predicts a patient’s likelihood of developing pancreatic ductal adenocarcinoma (PDAC), the most common form of the cancer. 

The system outperformed current diagnostic standards and could someday be used in a clinical setting to identify patients who could benefit from early screening or testing, helping catch the disease earlier and save lives. The research was published in the journal eBioMedicine last month.

The researchers’ goal was a model capable of predicting a patient’s risk of being diagnosed with PDAC in the next six to 18 months, making early-stage detection and cure more likely. To develop it, they examined existing electronic health records.

The resulting system, known as PRISM, consists of two AI models. The first uses artificial neural networks to spot patterns in the data, which include patients’ ages, medical history, and lab results. It then calculates a risk score for an individual patient. The second AI model was fed the same data to generate a score, but used a simpler algorithm.

The researchers fed the two models anonymized data from 6 million electronic health records, 35,387 of which were PDAC cases, from 55 health-care organizations in the US.

The team used the models to evaluate patients’ PDAC risk every 90 days until there was no longer sufficient data or the patient was diagnosed with pancreatic cancer. They followed up on all enrolled patients from six months after their first risk evaluation until 18 months after their last risk evaluation to see if they were diagnosed with PDAC in that time. 
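
As a rough illustration of the two-model idea (not PRISM itself), the sketch below fits a small neural network and a simpler logistic-regression model on synthetic patient features and has each produce a risk score for a hypothetical new patient. The features, data, and outcome rule are invented for the example.

```python
# A rough sketch of a two-model risk setup: one neural network and one simpler
# model, each turning tabular patient features into a risk score. PRISM's real
# features, architectures, and training data are far richer; everything here is
# synthetic and purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(42)

# Hypothetical features: age, a lab value, and two binary history flags.
n_patients = 5000
X = np.column_stack([
    rng.normal(60, 12, n_patients),          # age
    rng.normal(1.0, 0.3, n_patients),        # a lab result
    rng.integers(0, 2, n_patients),          # diabetes history (0/1)
    rng.integers(0, 2, n_patients),          # smoking history (0/1)
])
# Synthetic outcome: diagnosed within the 6-to-18-month window (a rare event).
logits = 0.04 * (X[:, 0] - 60) + 0.8 * X[:, 2] - 5.0
y = rng.random(n_patients) < 1 / (1 + np.exp(-logits))

neural_net = MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=500).fit(X, y)
simple_model = LogisticRegression(max_iter=1000).fit(X, y)

new_patient = np.array([[72, 1.4, 1, 0]])
print("NN risk score:", neural_net.predict_proba(new_patient)[0, 1])
print("LR risk score:", simple_model.predict_proba(new_patient)[0, 1])
```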

Among people who developed pancreatic cancer, the neural network identified 35% of them as high risk six to 18 months before their diagnosis, which the authors say is a significant improvement over current screening systems. For most of the general population, there is no recommended screening routine for pancreatic cancer the way there is for breast or colon cancer, and the current standard screening criteria catch around 10% of cases.

Given how important it is to detect the disease at the earliest stage possible, the system looks promising, says Michael Goggins, a professor of pathology and pancreatic cancer specialist at Johns Hopkins University School of Medicine, who was not involved in the project.

“It would be anticipated that such a model would improve the current landscape,” he says. “But it really needs to be very early to make the biggest impact.”

It’s possible that some people could have developed advanced pancreatic cancer within the six-to-18-month window, meaning it could be too late to treat them effectively by the time they’ve received a risk assessment, he says.

While this particular study is retrospective, looking at existing data and tasking the models with making hypothetical predictions, the team has started work on a study that will gather data on existing patients, compute their risk factors, and wait to see how accurate the predictions are, says Martin Rinard, a professor of electrical engineering and computer science at MIT, who worked on the project.

In the past, other AI models built with data from a particular hospital sometimes didn’t work as well when provided with data from another hospital, he points out. That could be down to all kinds of reasons, such as different populations, procedures, and practices.

“Because we have what is coming close to data from essentially a very significant fraction of the entire population of the United States, we have hopes that the model will work better across organizations and not be tied to a specific organization,” he says. “And because we’re working with so many organizations, it also gives us a bigger training set.”

In the future, PRISM could be deployed in two ways, says Appelbaum.

First, it could help single out patients for pancreatic cancer testing. Second, it could offer a broader type of screening, prompting people without symptoms to take a blood or saliva test that may indicate whether they need further testing.

“There are tens of thousands of these models for different cancers out there, but most of them are stuck in the literature,” she adds. “I think we have the pathway to take them to clinical practice, which is why I started all of this—so that we can actually get it to people and detect cancer early. It has the potential to save many, many lives.”

Outperforming competitors as a data-driven organization

In 2006, British mathematician Clive Humby said, “data is the new oil.” While the phrase is almost a cliché, the advent of generative AI is breathing new life into this idea. A global study on the Future of Enterprise Data & AI by WNS Triange and Corinium Intelligence shows 76% of C-suite leaders and decision-makers are planning or implementing generative AI projects. 

Harnessing the potential of data through AI is essential in today’s business environment. A McKinsey report says data-driven organizations demonstrate EBITDA increases of up to 25%. AI-driven data strategy can boost growth and realize untapped potential by increasing alignment with business objectives, breaking down silos, prioritizing data governance, democratizing data, and incorporating domain expertise.

“Companies need to have the necessary data foundations, data ecosystems, and data culture to embrace an AI-driven operating model,” says Akhilesh Ayer, executive vice president and global head of AI, analytics, data, and research practice at WNS Triange, a unit of business process management company WNS Global Services.

A unified data ecosystem

Embracing an AI-driven operating model requires companies to make data the foundation of their business. Business leaders need to ensure “every decision-making process is data-driven, so that individual judgment-based decisions are minimized,” says Ayer. This makes real-time data collection essential. “For example, if I’m doing fraud analytics for a bank, I need real-time data of a transaction,” explains Ayer. “Therefore, the technology team will have to enable real-time data collection for that to happen.” 

Real-time data is just one element of a unified data ecosystem. Ayer says an all-round approach is necessary. Companies need clear direction from senior management; well-defined control of data assets; cultural and behavioral changes; and the ability to identify the right business use cases and assess the impact they’ll create. 

Aligning business goals with data initiatives  

An AI-driven data strategy will only boost competitiveness if it underpins primary business goals. Ayer says companies must determine their business goals before deciding what to do with data. 

One way to start, Ayer explains, is a data-and-AI maturity audit or a planning exercise to determine whether an enterprise needs a data product roadmap. This can determine if a business needs to “re-architect the way data is organized or implement a data modernization initiative,” he says. 

The demand for personalization, convenience, and ease in the customer experience is a central and differentiating factor. How businesses use customer data is particularly important for maintaining a competitive advantage, and can fundamentally transform business operations. 

Ayer cites WNS Triange’s work with a retail client as an example of how evolving customer expectations drive businesses to make better use of data. The retailer wanted greater value from multiple data assets to improve customer experience. In a data triangulation exercise while modernizing the company’s data with cloud and AI, WNS Triange created a unified data store with personalization models to increase return on investment and reduce marketing spend. “Greater internal alignment of data is just one way companies can directly benefit and offer an improved customer experience,” says Ayer. 

Weeding out silos 

Regardless of an organization’s data ambitions, few manage to thrive without clear and effective communication. Modern data practices have process flows or application programming interfaces that enable reliable, consistent communication between departments to ensure secure and seamless data-sharing, says Ayer. 

This is essential to breaking down silos and maintaining buy-in. “When companies encourage business units to adopt better data practices through greater collaboration with other departments and data ecosystems, every decision-making process becomes automatically data-driven,” explains Ayer.  

WNS Triange helped a well-established insurer remove departmental silos and establish better communication channels. Silos were entrenched. The company had multiple business lines in different locations and legacy data ecosystems. WNS Triange brought them together and secured buy-in for a common data ecosystem. “The silos are gone and there’s the ability to cross leverage,” says Ayer. “As a group, they decide what prioritization they should take; which data program they need to pick first; and which businesses should be automated and modernized.”

Data ownership beyond IT

Removing silos is not always straightforward. In many organizations, data sits in different departments. To improve decision-making, Ayer says, businesses can unite underlying data from various departments and broaden data ownership. One way to do this is to integrate the underlying data and treat this data as a product. 

While IT can lay out the system architecture and design, primary data ownership shifts to business users. They understand what data is needed and how to use it, says Ayer. “This means you give the ownership and power of insight-generation to the users,” he says. 

This data democratization enables employees to adopt data processes and workflows that cultivate a healthy data culture. Ayer says companies are investing in training in this area. “We’ve even helped a few companies design the necessary training programs that they need to invest in,” he says. 

Tools for data decentralization

Data mesh and data fabric, powered by AI, empower businesses to decentralize data ownership, nurture the data-as-a-product concept, and create a more agile business. 

For organizations adopting a data fabric model, it’s crucial to include a data ingestion framework to manage new data sources. “Dynamic data integration must be enabled because it’s new data with a new set of variables,” says Ayer. “How it integrates with an existing data lake or warehouse is something that companies should consider.” 

Ayer cites WNS Triange’s collaboration with a travel client as an example of improving data control. The client had various business lines in different countries, meaning controlling data centrally was difficult and ineffective. WNS Triange deployed a data mesh and data fabric ecosystem that allowed for federated governance controls. This boosted data integration and automation, enabling the organization to become more data-centric and efficient. 

A governance structure for all

“Governance controls can be federated, which means that while central IT designs the overall governance protocols, you hand over some of the governance controls to different business units, such as data-sharing, security, and privacy, making data deployment more seamless and effective,” says Ayer. 

AI-powered data workflow automation can add precision and improve downstream analytics. For example, Ayer says, in screening insurance claims for fraud, when an insurer’s data ecosystem and workflows are fully automated, instantaneous AI-driven fraud assessments are possible. 

“The ability to process a fresh claim, bring it into a central data ecosystem, match the policyholder’s information with the claim’s data, and make sure that the claim-related information passes through a model to give a recommendation, and then push back that recommendation into the company’s workflow is the phenomenal experience of improving downstream analytics,” Ayer says. 
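
As a concrete, if highly simplified, illustration of that kind of workflow, the sketch below ingests a claim, matches it to a policyholder record in a stand-in data store, scores it with a toy fraud model, and returns a recommendation. Every name, field, and threshold in it is hypothetical.

```python
# A simplified sketch of an automated claims workflow: ingest a fresh claim,
# match it to the policyholder's record in a central data store, run it through
# a fraud model, and push the recommendation back to the claims system. The
# data store, model, and thresholds are all illustrative.
POLICYHOLDERS = {                      # stand-in for a central data ecosystem
    "P-1001": {"tenure_years": 8, "prior_claims": 0},
    "P-2002": {"tenure_years": 1, "prior_claims": 4},
}

def fraud_score(claim: dict, policyholder: dict) -> float:
    """A toy model: new customers with many prior claims and large amounts score higher."""
    score = 0.1
    score += 0.15 * policyholder["prior_claims"]
    score += 0.2 if policyholder["tenure_years"] < 2 else 0.0
    score += 0.3 if claim["amount"] > 10_000 else 0.0
    return min(score, 1.0)

def process_claim(claim: dict) -> dict:
    policyholder = POLICYHOLDERS[claim["policy_id"]]        # match claim to record
    score = fraud_score(claim, policyholder)                # model-driven assessment
    recommendation = "investigate" if score > 0.5 else "approve"
    return {"claim_id": claim["claim_id"], "score": score, "recommendation": recommendation}

print(process_claim({"claim_id": "C-9", "policy_id": "P-2002", "amount": 15_000}))
```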

Data-driven organizations of the future

A well-crafted data strategy aligned with clear business objectives can seamlessly integrate AI tools and technologies into organizational infrastructure. This helps ensure competitive advantage in the digital age. 

To benefit from any data strategy, organizations must continuously overcome barriers such as legacy data platforms, slow adoption, and cultural resistance. “It’s extremely critical that employees embrace it for the betterment of themselves, customers, and other stakeholders,” Ayer points out. “Organizations can stay data-driven by aligning data strategy with business goals, ensuring stakeholders’ buy-in and employees’ empowerment for smoother adoption, and using the right technologies and frameworks.” 

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

Deploying high-performance, energy-efficient AI

Although AI is by no means a new technology, there have been massive and rapid investments in it and in large language models. However, the high-performance computing that powers these rapidly growing AI tools — and enables record automation and operational efficiency — also consumes a staggering amount of energy. With the proliferation of AI comes the responsibility to deploy it responsibly and with an eye to sustainability during hardware and software R&D as well as within data centers.

“Enterprises need to be very aware of the energy consumption of their digital technologies, how big it is, and how their decisions are affecting it,” says Zane Ball, corporate vice president and general manager of data center platform engineering and architecture at Intel.

One of the key drivers of a more sustainable AI is modularity, says Ball. Modularity breaks down subsystems of a server into standard building blocks, defining interfaces between those blocks so they can work together. This system can reduce the amount of embodied carbon in a server’s hardware components and allows for components of the overall ecosystem to be reused, subsequently reducing R&D investments.

Downsizing infrastructure within data centers, hardware, and software can also help enterprises reach greater energy efficiency without compromising function or performance. While very large AI models require megawatts of supercomputing power, smaller, fine-tuned models that operate within a specific knowledge domain can maintain high performance at far lower energy consumption. 

“You give up that kind of amazing general purpose use like when you’re using ChatGPT-4 and you can ask it everything from 17th century Italian poetry to quantum mechanics, if you narrow your range, these smaller models can give you equivalent or better kind of capability, but at a tiny fraction of the energy consumption,” says Ball.

The opportunities for greater energy efficiency within AI deployment will only expand over the next three to five years. Ball forecasts significant hardware optimization strides, the rise of AI factories — facilities that train AI models on a large scale while modulating energy consumption based on its availability — as well as the continued growth of liquid cooling, driven by the need to cool the next generation of powerful AI innovations.

“I think making those solutions available to our customers is starting to open people’s eyes how energy efficient you can be while not really giving up a whole lot in terms of the AI use case that you’re looking for.”

This episode of Business Lab is produced in partnership with Intel.

Full Transcript

Laurel Ruma: From MIT Technology Review, I’m Laurel Ruma and this is Business Lab, the show that helps business leaders make sense of new technologies coming out of the lab and into the marketplace.

Our topic is building a better AI architecture. Going green isn’t for the faint of heart, but it’s also a pressing need for many, if not all enterprises. AI provides many opportunities for enterprises to make better decisions, so how can it also help them be greener?

Two words for you: sustainable AI.

My guest is Zane Ball, corporate vice president and general manager of data center platform engineering and architecture at Intel.

This podcast is produced in partnership with Intel.

Welcome Zane.

Zane Ball: Good morning.

Laurel: So to set the stage for our conversation, let’s start off with the big topic. As AI transforms businesses across industries, it brings the benefits of automation and operational efficiency, but that high-performance computing also consumes more energy. Could you give an overview of the current state of AI infrastructure and sustainability at the large enterprise level?

Zane: Absolutely. I think it helps to just kind of really zoom out big picture, and if you look at the history of IT services maybe in the last 15 years or so, obviously computing has been expanding at a very fast pace. And the good news about that history of the last 15 years or so, is while computing has been expanding fast, we’ve been able to contain the growth in energy consumption overall. There was a great study a couple of years ago in Science Magazine that talked about how compute had grown by maybe 550% over a decade, but that we had just increased electricity consumption by a few percent. So those kind of efficiency gains were really profound. So I think the way to kind of think about it is computing’s been expanding rapidly, and that of course creates all kinds of benefits in society, many of which reduce carbon emissions elsewhere.

But we’ve been able to do that without growing electricity consumption all that much. And that’s kind of been possible because of things like Moore’s Law: silicon has been improving every couple of years, we make devices smaller, they consume less power, things get more efficient. That’s part of the story. Another big part of this story is the advent of these hyperscale data centers. So really, really large-scale computing facilities, finding all kinds of economies of scale and efficiencies, high utilization of hardware, not a lot of idle hardware sitting around. That also was a very meaningful energy efficiency gain. And then finally this development of virtualization, which allowed even more efficient utilization of hardware. So those three things together allowed us to kind of accomplish something really remarkable. And during that time, we also had AI starting to come in; I think since about 2015, AI workloads started to play a pretty significant role in digital services of all kinds.

But then just about a year ago, ChatGPT happens and we have a non-linear shift in the environment, and suddenly large language models, probably not news to anyone listening to this podcast, have pivoted to the center, and there’s just a breakneck investment across the industry to build very, very fast. And what is also driving that is that not only is everyone rushing to take advantage of this amazing large language model kind of technology, but that technology itself is evolving very quickly. And in fact, as is also quite well known, these models are growing in size at a rate of about 10x per year. So the amount of compute required is really sort of staggering. And when you think of all the digital services in the world now being infused with AI use cases with very large models, and then those models themselves growing 10x per year, we’re looking at something that’s not very similar to that last decade where our efficiency gains and our greater consumption were almost penciling out.

Now we’re looking at something I think that’s not going to pencil out. And we’re really facing a really significant growth in energy consumption in these digital services. And I think that’s concerning. And I think that means that we’ve got to take some strong actions across the industry to get on top of this. And I think just the very availability of electricity at this scale is going to be a key driver. But of course many companies have net-zero goals. And I think as we pivot into some of these AI use cases, we’ve got work to do to square all of that together.

Laurel: Yeah, as you mentioned, the challenges are trying to develop sustainable AI and making data centers more energy efficient. So could you describe what modularity is and how a modularity ecosystem can power a more sustainable AI?

Zane: Yes, I think over the last three or four years, there’ve been a number of initiatives. Intel’s played a big part of this as well of re-imagining how servers are engineered into modular components. And really modularity for servers is just exactly as it sounds. We break different subsystems of the server down into some standard building blocks, define some interfaces between those standard building blocks so that they can work together. And that has a number of advantages. Number one, from a sustainability point of view, it lowers the embodied carbon of those hardware components. Some of these hardware components are quite complex and very energy intensive to manufacture. So imagine a 30 layer circuit board, for example, is a pretty carbon intensive piece of hardware. I don’t want the entire system, if only a small part of it needs that kind of complexity. I can just pay the price of the complexity where I need it.

And by being intelligent about how we break up the design in different pieces, we bring that embodied carbon footprint down. The reuse of pieces also becomes possible. So when we upgrade a system, maybe to a new telemetry approach or a new security technology, there’s just a small circuit board that has to be replaced versus replacing the whole system. Or maybe a new microprocessor comes out and the processor module can be replaced without investing in new power supplies, new chassis, new everything. And so that circularity and reuse becomes a significant opportunity. And so that embodied carbon aspect, which is about 10% of carbon footprint in these data centers can be significantly improved. And another benefit of the modularity, aside from the sustainability, is it just brings R&D investment down. So if I’m going to develop a hundred different kinds of servers, if I can build those servers based on the very same building blocks just configured differently, I’m going to have to invest less money, less time. And that is a real driver of the move towards modularity as well.

Laurel: So what are some of those techniques and technologies like liquid cooling and ultrahigh dense compute that large enterprises can use to compute more efficiently? And what are their effects on water consumption, energy use, and overall performance as you were outlining earlier as well?

Zane: Yeah, those are two I think very important opportunities. And let’s just take them one at a time. In the emerging AI world, I think liquid cooling is probably one of the most important low hanging fruit opportunities. So in an air cooled data center, a tremendous amount of energy goes into fans and chillers and evaporative cooling systems. And that is actually a significant part. So if you move a data center to a fully liquid cooled solution, this is an opportunity of around 30% of energy consumption, which is sort of a wow number. I think people are often surprised just how much energy is burned. And if you walk into a data center, you almost need ear protection because it’s so loud and the hotter the components get, the higher the fan speeds get, and the more energy is being burned in the cooling side and liquid cooling takes a lot of that off the table.

What offsets that is liquid cooling is a bit complex. Not everyone is fully able to utilize it. There’s more upfront costs, but actually it saves money in the long run. So the total cost of ownership with liquid cooling is very favorable, and as we’re engineering new data centers from the ground up, liquid cooling is a really exciting opportunity, and I think the faster we can move to liquid cooling, the more energy that we can save. But it’s a complicated world out there. There’s a lot of different situations, a lot of different infrastructures to design around. So we shouldn’t trivialize how hard that is for an individual enterprise. One of the other benefits of liquid cooling is we get out of the business of evaporating water for cooling. A lot of North America data centers are in arid regions and use large quantities of water for evaporative cooling.

That is good from an energy consumption point of view, but the water consumption can be really extraordinary. I’ve seen numbers getting close to a trillion gallons of water per year in North America data centers alone. And then in humid climates like in Southeast Asia or eastern China for example, that evaporative cooling capability is not as effective and so much more energy is burned. And so if you really want to get to really aggressive energy efficiency numbers, you just can’t do it with evaporative cooling in those humid climates. And so those geographies are kind of the tip of the spear for moving into liquid cooling.
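For intuition on that “around 30%” figure, the following back-of-envelope calculation uses illustrative power usage effectiveness (PUE) values; they are assumptions for the sake of the arithmetic, not numbers Ball cites.

```python
# PUE = total facility energy / IT equipment energy. Assumed values below are
# illustrative: a typical air-cooled facility versus a well-engineered
# liquid-cooled one, serving the same IT load.
AIR_COOLED_PUE = 1.5
LIQUID_COOLED_PUE = 1.1
IT_LOAD_MW = 10.0

air_total = IT_LOAD_MW * AIR_COOLED_PUE        # 15.0 MW drawn from the grid
liquid_total = IT_LOAD_MW * LIQUID_COOLED_PUE  # 11.0 MW drawn from the grid

savings = 1 - liquid_total / air_total
print(f"Facility energy reduction: {savings:.0%}")  # ~27%, in line with "around 30%"
```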

The other opportunity you mentioned was density and bringing higher and higher density of computing has been the trend for decades. That is effectively how Moore’s Law has been pushing us forward. And I think it’s just important to realize that’s not done yet. As much as we think about racks of GPUs and accelerators, we can still significantly improve energy consumption with higher and higher density traditional servers that allow us to pack what might’ve been a whole row of racks into a single rack of computing in the future. And those are substantial savings. And at Intel, we’ve announced an upcoming processor that has 288 CPU cores, and 288 cores in a single package enables us to build racks with as many as 11,000 CPU cores. So the energy savings there is substantial, not just because those chips are very, very efficient, but because the amount of networking equipment and ancillary things around those systems is a lot less because you’re using those resources more efficiently with those very high dense components. So continuing, and perhaps even accelerating, our path to this ultra-high dense kind of computing is going to help us get to the energy savings we need maybe to accommodate some of those larger models that are coming.

Laurel: Yeah, that definitely makes sense. And this is a good segue into this other part of it, which is how data centers and hardware as well as software can collaborate to create more energy-efficient technology without compromising function. So how can enterprises invest in more energy-efficient hardware, such as hardware-aware software, and, as you were mentioning earlier, large language models or LLMs with smaller downsized infrastructure but still reap the benefits of AI?

Zane: I think there are a lot of opportunities, and maybe the most exciting one that I see right now is that even as we’re pretty wowed and blown away by what these really large models are able to do, even though they require tens of megawatts of super compute power to do, you can actually get a lot of those benefits with far smaller models as long as you’re content to operate them within some specific knowledge domain. So we’ve often referred to these as expert models. So take for example an open source model like the Llama 2 that Meta produced. So there’s like a 7 billion parameter version of that model. There’s also, I think, a 13 and 70 billion parameter versions of that model compared to a GPT-4, maybe something like a trillion element model. So it’s far, far, far smaller, but when you fine tune that model with data to a specific use case, so if you’re an enterprise, you’re probably working on something fairly narrow and specific that you’re trying to do.

Maybe it’s a customer service application or it’s a financial services application, and you as an enterprise have a lot of data from your operations, that’s data that you own and you have the right to use to train the model. And so even though that’s a much smaller model, when you train it on that domain specific data, the domain specific results can be quite good in some cases even better than the large model. So you give up that kind of amazing general purpose use like when you’re using ChatGPT-4 and you can ask it everything from 17th century Italian poetry to quantum mechanics, if you narrow your range, these smaller models can give you equivalent or better kind of capability, but at a tiny fraction of the energy consumption.

And we’ve demonstrated a few times, even with just a standard Intel Xeon two socket server with some of the AI acceleration technologies we have in those systems, you can actually deliver quite a good experience. And that’s without even any GPUs involved in the system. So that’s just good old-fashioned servers and I think that’s pretty exciting.

That also means the technology’s quite accessible, right? So you may be an enterprise, you have a general purpose infrastructure that you use for a lot of things, you can use that for AI use cases as well. And if you’ve taken advantage of these smaller models that fit within infrastructure we already have or infrastructure that you can easily obtain. And so those smaller models are pretty exciting opportunities. And I think that’s probably one of the first things the industry will adopt to get energy consumption under control is just right sizing the model to the activity to the use case that we’re targeting. I think there’s also… you mentioned the concept of hardware-aware software. I think that the collaboration between hardware and software has always been an opportunity for significant efficiency gains.

I mentioned early on in this conversation how virtualization was one of the pillars that gave us that kind of fantastic result over the last 15 years. And that was very much exactly that. That’s bringing some deep collaboration between the operating system and the hardware to do something remarkable. And a lot of the acceleration that exists in AI today actually is a similar kind of thinking, but that’s not really the end of the hardware software collaboration. We can deliver quite stunning results in encryption and in memory utilization in a lot of areas. And I think that that’s got to be an area where the industry is ready to invest. It is very easy to have plug and play hardware where everyone programs at a super high level language, nobody thinks about the impact of their software application downstream. I think that’s going to have to change. We’re going to have to really understand how our application designs are impacting energy consumption going forward. And it isn’t purely a hardware problem. It’s got to be hardware and software working together.
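As one deliberately simplified illustration of that hardware-software collaboration, the sketch below assumes the open-source Intel Extension for PyTorch package is installed and uses a toy model; it is not the specific configuration Ball describes, and the call signatures should be checked against the library’s documentation.

```python
# A minimal sketch of "hardware-aware" CPU inference using Intel Extension for
# PyTorch (assumed installed as intel_extension_for_pytorch). The model and
# input shapes are placeholders; the point is that a one-line optimization pass
# plus a lower-precision dtype lets the software exploit the CPU's acceleration
# features instead of ignoring them.
import torch
import intel_extension_for_pytorch as ipex
from torch import nn

model = nn.Sequential(nn.Linear(512, 512), nn.ReLU(), nn.Linear(512, 10)).eval()

# Apply Intel's operator and weight-layout optimizations, using bfloat16 where
# the hardware supports it.
model = ipex.optimize(model, dtype=torch.bfloat16)

example = torch.randn(8, 512)
with torch.no_grad(), torch.autocast("cpu", dtype=torch.bfloat16):
    output = model(example)
print(output.shape)
```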

Laurel: And you’ve outlined so many of these different kinds of technologies. So how can enterprise adoption of things like modularity and liquid cooling and hardware-aware software be incentivized to actually make use of all these new technologies?

Zane: A year ago, I worried a lot about that question. How do we get people who are developing new applications to just be aware of the downstream implications? One of the benefits of this revolution in the last 12 months is I think just availability of electricity is going to be a big challenge for many enterprises as they seek to adopt some of these energy intensive applications. And I think the hard reality of energy availability is going to bring some very strong incentives very quickly to attack these kinds of problems.

But I do think beyond that like a lot of areas in sustainability, accounting is really important. There’s a lot of good intentions. There’s a lot of companies with net-zero goals that they’re serious about. They’re willing to take strong actions against those goals. But if you can’t accurately measure what your impact is either as an enterprise or as a software developer, I think you have to kind of find where the point of action is, where does the rubber meet the road where a micro decision is being made. And if the carbon impact of that is understood at that point, then I think you can see people take the actions to take advantage of the tools and capabilities that are there to get a better result. And so I know there’s a number of initiatives in the industry to create that kind of accounting, and especially for software development, I think that’s going to be really important.

Laurel: Well, it’s also clear there’s an imperative for enterprises that are trying to take advantage of AI to curb that energy consumption as well as meet their environmental, social, and governance or ESG goals. So what are the major challenges that come with making more sustainable AI and computing transformations?

Zane: It’s a complex topic, and I think we’ve already touched on a couple of them. Just as I was just mentioning, definitely getting software developers to understand their impact within the enterprise. And if I’m an enterprise that’s procuring my applications and software, maybe cloud services, I need to make sure that accounting is part of my procurement process, that in some cases that’s gotten easier. In some cases, there’s still work to do. If I’m operating my own infrastructure, I really have to look at liquid cooling, for example, an adoption of some of these more modern technologies that let us get to significant gains in energy efficiency. And of course, really looking at the use cases and finding the most energy efficient architecture for that use case. For example, like using those smaller models that I was talking about. Enterprises need to be very aware of the energy consumption of their digital technologies, how big it is and how their decisions are affecting it.

Laurel: So could you offer an example or use case of one of those energy efficient AI driven architectures and how AI was subsequently deployed for it?

Zane: Yes. I think that some of the best examples I’ve seen in the last year were really around these smaller models where Intel did an example that we published around financial services, and we found that something like three hours of fine-tuning training on financial services data allowed us to create a chatbot solution that performed in an outstanding manner on a standard Xeon processor. And I think making those solutions available to our customers is starting to open people’s eyes to how energy efficient you can be while not really giving up a whole lot in terms of the AI use case that you’re looking for. And so I think we need to just continue to get those examples out there. We have a number of collaborations, such as with Hugging Face on open source models, enabling those solutions on our products. Our Gaudi2 accelerator has also performed very well from a performance per watt point of view, as has the Xeon processor itself. So those are great opportunities.

Laurel: And then how do you envision the future of AI and sustainability in the next three to five years? There seems like so much opportunity here.

Zane: I think there’s going to be so much change in the next three to five years. I hope no one holds me to what I’m about to say, but I think there are some pretty interesting trends out there. One thing, I think, to think about is the trend of AI factories. So training a model is a little bit of an interesting activity that’s distinct from what we normally think of as real time digital services. You have real time digital service like Vinnie, the app on your iPhone that’s connected somewhere in the cloud, and that’s a real time experience. And it’s all about 99.999% uptime, short latencies to deliver that user experience that people expect. But AI training is different. It’s a little bit more like a factory. We produce models as a product and then the models are used to create the digital services. And that I think becomes an important distinction.

So I can actually build some giant gigawatt facility somewhere that does nothing but train models on a large scale. I can partner with the infrastructure of the electricity providers and utilities much like an aluminum plant or something would do today, where I actually modulate my energy consumption with its availability. Or maybe I take advantage of solar or wind power availability, where I can modulate when I’m consuming power and when I’m not. And so I think we’re going to see some really large scale kinds of efforts like that, and those AI factories could be very, very efficient: they can be liquid cooled and they can be closely coupled to the utility infrastructure. I think that’s a pretty exciting opportunity, even as it’s an acknowledgement that there’s going to be gigawatts and gigawatts of AI training going on. The second opportunity, I think, in this three to five years: I do think liquid cooling will become far more pervasive.

I think that will be driven by the need to cool the next generation of accelerators and GPUs, which will make it a requirement, but then we will be able to build that technology out and scale it more ubiquitously for all kinds of infrastructure. And that will let us shave huge amounts of gigawatts out of the infrastructure, save hundreds of billions of gallons of water annually. I think that’s incredibly exciting. And if I just… the innovation on the model size as well, so much has changed in just the last five years with large language models like ChatGPT, let’s not assume there’s not going to be even bigger change in the next three to five years. What are the new problems that are going to be solved, new innovations? So I think as the costs and impact of AI are being felt more substantively, there’ll be a lot of innovation on the model side and people will come up with new ways of cracking some of these problems and there’ll be new exciting use cases that come about.

Finally, I think on the hardware side, there will be new AI architectures. From an acceleration point of view today, a lot of AI performance is limited by memory bandwidth and networking bandwidth between the various accelerator components. And I don’t think we’re anywhere close to having an optimized AI training system or AI inferencing systems. I think the discipline is moving faster than the hardware and there’s a lot of opportunity for optimization. So I think we’ll see significant differences in networking, significant differences in memory solutions over the next three to five years, and certainly over the next 10 years that I think can open up a substantial set of improvements.

And of course, Moore’s Law itself continues to advance, with advanced packaging technologies and new transistor types that allow us to build ever more ambitious pieces of silicon, which will have substantially higher energy efficiency. So all of those things I think will be important. Whether we can keep up with our energy efficiency gains with the explosion in AI functionality, I think that’s the real question and it’s just going to be a super interesting time. I think it’s going to be a very innovative time in the computing industry over the next few years.

Laurel: And we’ll have to see. Zane, thank you so much for joining us on the Business Lab.

Zane: Thank you.

Laurel: That was Zane Ball, corporate vice president and general manager of data center platform engineering and architecture at Intel, who I spoke with from Cambridge, Massachusetts, the home of MIT and MIT Technology Review.

That’s it for this episode of Business Lab. I’m your host, Laurel Ruma. I’m the director of Insights, the custom publishing division of MIT Technology Review. We were founded in 1899 at the Massachusetts Institute of Technology, and you can also find us in print, on the web, and at events each year around the world. For more information about us and the show, please check out our website at technologyreview.com.

This show is available wherever you get your podcasts. If you enjoyed this episode, we hope you’ll take a moment to rate and review us. Business Lab is a production of MIT Technology Review. This episode was produced by Giro Studios. Thanks for listening.


This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

AI for everything: 10 Breakthrough Technologies 2024

WHO

Google, Meta, Microsoft, OpenAI

WHEN

Now

When OpenAI launched a free web app called ChatGPT in November 2022, nobody knew what was coming. But that low-key release changed everything.

By January, ChatGPT had become the fastest-growing web app ever, offering anyone with a browser access to one of the most powerful neural networks ever built. We were dazzled and disturbed.  

And that was only the start. In February, Microsoft and Google revealed rival plans to combine chatbots with search—plans that reimagined our daily interactions with the internet.  

Early demos weren’t great. Microsoft’s Bing Chat went off the rails, quick to churn out nonsense. Google’s Bard was caught making a factual error in its promo pitch. But the genie wasn’t going back in its bottle, no matter how weird it was. 

Microsoft and Google have since moved beyond search to put chatbot-based assistants into the hands of billions of people via their office software. The tech promises to summarize emails and meetings; draft reports and replies; generate whole slide decks—titles, bullet points, and pictures—in seconds.

Microsoft and Meta released image-making models that let users generate shareable images of anything with a click. Cue a nonstop stream of zany mash-ups—and dozens of posts about Mickey Mouse and SpongeBob SquarePants flying a plane into the Twin Towers.

Google’s new phones now use AI to let you edit photos to a degree never seen before, exchanging sad faces for happy ones and overcast afternoons for perfect sunsets.

Never has such radical new technology gone from experimental prototype to consumer product so fast and at such scale. What’s clear is that we haven’t even begun to make sense of it all, let alone reckon with its impact.

Is the shine coming off? Maybe. With each release, the astonishing becomes more mundane. But 2023’s legacy is clear: billions have now looked AI in the face. Now we need to figure out exactly what’s looking back.