Robotaxis are here. It’s time to decide what to do about them

In some San Francisco neighborhoods, at certain hours of the night, it seems as if one in 10 cars on the road has no driver behind the wheel. 

These are not experimental test vehicles, and this is not a drill. Many of San Francisco’s ghostly driverless cars are commercial robotaxis, directly competing with taxis, Uber and Lyft, and public transit. They are a real, albeit still marginal, part of the city’s transportation system. And the companies that operate them, Cruise and Waymo, appear poised to continue expanding their services in San Francisco, Austin, Phoenix, and perhaps even Los Angeles in the coming months. 

I spent the past year covering robotaxis for the San Francisco Examiner and have taken nearly a dozen rides in Cruise driverless cars over the past few months. During my reporting, I’ve been struck by the lack of urgency in the public discourse about robotaxis. I’ve come to believe that most people, including many powerful decision makers, are not aware of how quickly this industry is advancing, or how severe the near-term labor and transportation impacts could be. 

Hugely important decisions about robotaxis are being made in relative obscurity by appointed agencies like the California Public Utilities Commission. Legal frameworks remain woefully inadequate: in the Golden State, cities have no regulatory authority over the robotaxis that ply their streets, and police legally cannot cite them for moving violations. 

It’s high time for the public and its elected representatives to play a more active role in shaping the future of this new technology. Like it or not, robotaxis are here. Now comes the difficult work of deciding what to do about them. 

After years of false promises, it’s now widely acknowledged that the dream of owning your very own sleep/gaming/makeup mobility pod remains years, if not decades, away. Tesla’s misleadingly named Autopilot system, the closest thing to autonomous driving in a mass-market car, is under investigation by both the National Highway Traffic Safety Administration and the Justice Department. 

Media coverage of robotaxis has been rightfully skeptical. Journalists (myself included) have highlighted strange robo-behavior, concerning software failures, and Cruise and Waymo’s lack of transparency about their data. Cruise’s driverless vehicles, in particular, have shown an alarming tendency to inexplicably stop in the middle of the road, blocking traffic for extended periods of time. San Francisco officials have documented at least 92 such incidents in just six months, including three that disrupted emergency responders.

These critical stories, though important, obscure the general trend, which has been moving steadily in the robotaxi industry’s favor. Over the past few years, Cruise and Waymo have cleared several major regulatory hurdles, expanded into new markets, and racked up over a million relatively uneventful, truly driverless miles each in major American cities. 

Robotaxis are operationally quite different from personally owned autonomous vehicles, and they are in a much better position for commercial deployment. They can be unleashed within a strictly limited area where they’re well trained; their use can be closely monitored by the company that designed them; and they can immediately be pulled off the road in bad weather or if there’s another issue.

Unfortunately, there is no standard, government-approved framework for evaluating the safety of autonomous vehicles. In a paper covering its first million “rider-only” miles, Waymo reported two police-reportable crashes (with no injuries) and 18 minor contact events, about half of which involved a human driver hitting a stationary Waymo. The company cautions against direct comparisons with human drivers because there are rarely analogous data sets. Cruise, on the other hand, claims that its robotaxis experienced 53% fewer collisions than the typical human ride-hail driver in San Francisco over their first million driverless miles, and 73% fewer collisions with a meaningful risk of injury.

While not perfect, my most recent Cruise ride, in April, was sufficiently close to the experience of riding with a responsible human driver that I momentarily forgot I was in a robotaxi. The mere fact that these vehicles are programmed to follow traffic laws and the speed limit automatically makes them feel like safer drivers than a large percentage of humans on the road.

It remains to be seen whether robotaxis are ready for deployment on a significant scale, or what the metric for determining readiness would even be. But barring a significant shift in momentum, like an economic shock, a horrific tragedy, or a dramatic political pivot, robotaxis are positioned to continue their roll. This is enough to warrant a broader discussion of how they will change cities and society.

Cruise and Waymo are close to being authorized to provide all-day commercial robotaxi service throughout virtually all of San Francisco. That could immediately have a considerable economic impact on the city’s taxi and ride-hail drivers. The same goes for every other city where Cruise and Waymo set up shop. The prospect of automating professional drivers out of existence is not theoretical anymore. It’s a very real possibility in the near future. 

Robotaxis also have huge near-term implications for transportation policy. This technology could make automotive transportation so cheap and easy that people decide to make more trips by car, increasing congestion and undermining public transportation. Traffic could be made even worse, San Francisco officials fear, by robotaxis double-parking as they await passengers, since the cars lack the situational awareness to judge where, and for how long, it’s appropriate to stop.

The emergence of robotaxis adds urgency to fraught questions in labor and transportation policy that will need to be addressed sooner or later. Should workers be protected from displacement, or be somehow compensated if they are displaced? Should cars have free rein in the most congested, transit-accessible parts of cities? Should electric vehicles continue to be exempt from the gas taxes that pay for road maintenance? 

As technology accelerates, public policy should accelerate along with it. But in order to keep up, the public needs to have a clear-eyed view of just how quickly the future could arrive.

I ordered a bubble tea by drone in Shenzhen

China Report is MIT Technology Review’s newsletter about technology developments in China. Sign up to receive it in your inbox every Tuesday.

Last week, I told you about my adventure at Tencent’s customer service center. But the quest to get my QQ account back wasn’t the only reason I went to Shenzhen. While I was in China, I learned that the dominant Chinese food delivery platform, Meituan, has been flying delivery drones in the city for more than a year now, and I wanted to check it out myself.

I found that the reality of drone delivery is still far from ideal, and people may be turned away by the steep learning curve. But at the same time, it was an exciting experience—the prospect of routine drone delivery feels more realistic than it’s ever been.

Meituan currently operates more than a hundred drones from five delivery hubs (or launchpads) in the city. Together, they completed over 100,000 orders in 2022. While the platform itself can deliver basically anything, from dinner to medicine to fresh flowers to electronic devices, the drones are mostly used for food and drinks. 

Why? Because Chinese people care about the temperature of their meals, Mao Yinian, head of Meituan’s drone delivery department, tells me. “People care about it greatly—whether they can receive a hot meal or a cup of iced bubble tea in time. But when it comes to other [types of products], people don’t mind if it arrives 30 minutes faster or slower,” he says. Since Meituan’s drone flight routes are all automated—and the drones never run into traffic—it’s easier to precisely control the time it takes for the meal to be delivered. The drones usually arrive within seconds of the estimated time.

To have a cup of bubble tea delivered exactly when you want it? As a bubble tea enthusiast, all I can say is sign me up. But when I tried it out, I found out it’s not as simple as it sounds.

The first obstacle: the drones don’t deliver to your doorstep. Instead, they deliver to one of a dozen pickup locations scattered around the city—vending-machine-size kiosks that function as both a landing pad for the drone and storage for your package if you’re late to pick it up. 

A Meituan pickup kiosk at the entrance of a residential neighborhood.
ZEYI YANG

Here began my first attempt. After looking up all Meituan pickup locations on the map, I chose one near the subway station I was at. I ordered an iced coconut tea latte, which was specifically marked in the app as being deliverable by a drone. I paid and began waiting in excitement.

Nope. I immediately got a text telling me that “because of a system upgrade,” my order would be delivered by a human courier instead. Was it because of the bad weather? There had been a rainstorm in Shenzhen that morning, and the sky was still covered with dark clouds. But when I checked with a representative at Meituan, she said the drones were working. 

It turns out, she told me, I had ordered from a restaurant in a different district, and there were no drone routes that flew from there to the kiosk I wanted to send my order to. There’s no way to know that from the app, she said. 

That evening, I tried it a second time. As directed, this time I chose a pickup kiosk in the same district as the restaurant. In fact, they were only a few hundred feet apart. That would surely work, right? 

I ordered an avocado strawberry yogurt smoothie and again received a text immediately after the purchase was made. “Drone deliveries are not operational at this time of the day. It will be delivered by a human courier instead,” I was told. I later learned that drones only deliver until 7 p.m. every day. I was 30 minutes too late.

It wasn’t a promising start. But as it happened, I had arranged to visit one of the company’s drone launchpads the next day. So I got the chance to take an inside look at the operation.

The launchpad sits on the rooftop of a five-floor mall. I visited just after the lunch rush, met with some Meituan employees, and saw that humans and robots are equally important in making every delivery possible. I had wondered whether drones were deployed to each restaurant to pick up the food. No—Meituan workers pick up food from the vendors, bring it to the rooftop to package it, and load it onto the drones. Workers also need to change the drones’ batteries.

A Meituan worker seals a customer’s order into a cardboard box.
ZEYI YANG
The rooftop launchpad, with four drones parked.
ZEYI YANG

This launchpad services three nearby pickup kiosks. The rooftop area is divided into three zones, each with huge QR codes painted on the floor to mark the exact landing positions for the drones.

Once I learned about the logistics involved, it was clear Meituan had made some compromises in order to make drone delivery work in densely populated areas. Arrangements like having the drones deliver to pickup kiosks instead of straight to your home may be less convenient for customers, but they also reduce the risk that drones will get trapped in difficult locations or injure people. It’s a model for other companies working on drone delivery, and you can read more about what I learned in a story on Meituan’s efforts I published this morning.

When I left the launchpad, I made one last order, from that very site to one of the three kiosks it serves. I felt confident that I’d learned everything I could about the service. Standing by the kiosk, I could even predict what direction the drone would come from, having already watched several of them complete the route from the other end.

Indeed, at exactly the time that the app predicted, the drone came and landed on the kiosk. I typed in my phone number on a screen, and after what sounded like robotic arms moving, a door lifted up, allowing me to retrieve a cardboard box. Inside was my order: an iced orange black tea, sealed in an insulating bag. My drink hadn’t spilled, and it was still cold. And I had finally accomplished my goal of getting a drone delivery in Shenzhen.

Do you think delivery companies should invest in developing drone delivery systems? Let me know your thoughts at zeyi@technologyreview.com.

Catch up with China

1. The Chinese government said it found “relatively serious” cybersecurity risks in products sold by the American memory-chip company Micron. (Bloomberg $)

2. A data visualization of the supply chain for lithium-ion batteries explains why the world still relies on China to make batteries for electric vehicles. (New York Times $)

3. Chinese researchers surpassed their American peers for the first time in contributions to a range of natural-science research journals, according to an academic publication index compiled by Nature. (Nature)

4. Police departments in China have spent millions of dollars developing geographic information systems to improve their surveillance capacity. (China Digital Times)

5. China’s standup comedy industry has been shaken by the possibility of nationwide censorship—all because of a joke by one comedian about the Chinese military. (Reuters $)

6. The business of “expert network” consultancy—paying industry experts for information that might benefit companies and investors—has become a top anti-espionage concern for the Chinese government. (Wall Street Journal $). As a result, executives at the US consultancy firm Mintz are rushing to leave Hong Kong after the company was involved in a police probe. (Reuters $)

7. Meet the astronomer who wrote over 2,000 letters in response to Chinese UFO fans, trying to make sense of their UFO sighting experiences. (Sixth Tone)

8. Montana banned TikTok in the state, and TikTok is now suing it. (Semafor)

Lost in translation

If you frequently see someone on Chinese social media with the alias “Momo” and the avatar of a cartoonish dinosaur in pink, you are not meeting just one individual, but a group of people sharing an online identity to avoid being recognized in real life. According to the Chinese tech publication 36Kr, some young social media users in China are increasingly scared by the doxxing incidents they’ve seen online. To protect their privacy, they are giving up on individualized account settings and adopting a common identity, using the same default avatar generated by one Chinese social platform and pretending to be the same person. The feeling of group anonymity makes them feel more comfortable sharing their opinions online.

But it is not a perfect solution. Some “Momos” are gatekeeping who gets to be one of them—they ask that people using the avatar support the same social causes (and since they can’t enforce it, they openly attack people they don’t like). At the same time, people are finding it difficult to hold these anonymous users accountable when they post extreme opinions. The community that promised to be a safe space has turned out to be full of fights and politics too.

One more thing

What can you do if your billion-dollar tech startup fails? Well, you can always open a coffee shop instead. As Bloomberg recently reported, Dai Wei, the founder of the famed Chinese dockless bike-sharing company Ofo, which once put millions of bikes on the streets in China but has been on the edge of bankruptcy in recent years, is behind a new coffee chain in New York City called About Time Coffee. The café actually shares quite a few similarities with Dai’s last startup—both offer steep discounts to attract customers and have drawn generous investment. The café brand has already received more than $10 million from investors.

Food delivery by drone is just part of daily life in Shenzhen

My iced tea arrived from the sky.

In a buzzy urban area in Shenzhen, China, sandwiched between several skyscrapers, I watched as a yellow-and-black drone descended onto a pickup kiosk by the street. The top of the vending-machine-size kiosk opened up for the drone to land, and a white cardboard box containing my drink was placed inside. When I had made the delivery order on my phone half an hour before, the app noted that it would arrive by drone at 2:03 p.m., and that was exactly when it came.

How I got my iced tea from the drone.
ZEYI YANG

The drone delivery service I was trying out is operated by Meituan, China’s most popular food delivery platform. In 2022, the company engaged some 6 million gig delivery workers to deliver billions of orders. But the company has also been developing drone delivery since 2017. And in Shenzhen, a southern city that’s home to a mature drone supply chain, Meituan has been regularly operating such delivery routes for the last year and a half.

Many big corporations have had their eyes on drone delivery: Amazon first proposed doing it in 2013, but its progress has been limited by regulations and a lack of demand. Wing, owned by Google’s parent company Alphabet, has had more success, operating drone deliveries on three continents. And Walmart is backing several drone startups to experiment with delivering its products.

What differentiates Meituan from these American peers is that it has chosen to offer drone delivery in what is potentially the most challenging environment: dense urban neighborhoods. It’s an approach that makes sense in China, where most people live in high-rise apartment buildings in populous cities, and many of them order food delivery on a daily basis. 

To make the service work in a dense city, Meituan doesn’t have the drones deliver directly to your doorstep. Instead, the company has set up pickup kiosks close to residential or office buildings. Drones drop off deliveries at the kiosks, which can hold several packages at once. The process may be less convenient for customers, but it allows every drone to fly a predetermined route, from one launchpad to one kiosk, making the task of navigating urban areas much easier. 

In 2022, Meituan made more than 100,000 drone deliveries in Shenzhen. My own experience wasn’t seamless. The first time I tried to use the service, I accidentally ordered from a restaurant that was too far away. My second attempt failed because I had unwittingly ordered after hours (the drones go to bed at 7 p.m.). 

But for some Shenzhen residents and vendors, delivery by drone is no longer a novelty—it’s just part of their daily routine. Meituan’s progress shows that regular drone delivery in cities is possible, even though it requires making some compromises when it comes to user experience. How does the magic happen? I visited one of the company’s drone launchpads to see how it’s done.

The rooftop “airport”

Meituan launches its drones in Shenzhen from five delivery hubs. My tea actually came from one that was only a few hundred feet away, on the rooftop of a gigantic shopping mall. The roof has been turned into an airport for the drones and a handful of support staff.

When I visited in April, there were about 10 drones parked on the rooftop, and two or three either taking off or landing. I had just missed the lunch peak, I was told by a Meituan employee, and the drones and humans there were mostly resting and recharging in anticipation of the dinner peak.

The workflow is a mix of human and automated labor. Once the drone delivery system gets an order (customers order specific items marked for drone delivery in the company’s app), a human runner picks up the order from the restaurant, a few flights down in the shopping mall, and brings it to the launchpad. The runner places the food and drinks in a standardized cardboard box, weighs it to make sure it’s not too heavy, seals the box, and hands it off to a different worker who specializes in dealing with the drones. The second worker places the box under a drone and waits for the drone to lock it in.

One worker sealed the package before another worker took it to the drone.
ZEYI YANG

Everything after that is highly automated, says Mao Yinian, the director of drone delivery services at Meituan. The drones’ movements are controlled by a central algorithm, and the routes are predetermined. “You can know in advance, at every precise second, where each drone will be and how fast its speed is, so the customers can expect the arrival time with a deviation of two seconds, instead of three minutes or even 10 minutes (when it comes to traditional delivery),” he tells MIT Technology Review.

The company has a centralized control room in Shenzhen, where staff can take control of a drone in an emergency. There are now more than a hundred drones that can be deployed for deliveries in the city. On average, one operator is watching 10 drones at the same time.

Not all human labor can or should be replaced by machines, Mao says. But the company has plans to automate even more of the delivery process. For example, Mao would like to see robots take over the work of loading packages onto drones and changing their batteries: “Our ground crew may have to bend over a hundred times a day to load the package and change the batteries. Human bodies are not designed for such movements.”

“Our vision is to turn the [launchpad] into a fully automated factory assembly line,” he says. “The only work for humans is to place the nonstandardized food and drinks into a standardized packaging box, and then there’s no more work for humans.”

Regulatory and economic constraints

Today, there are few technical obstacles left for drone delivery of food and packages, says Jonathan Roberts, a professor of robotics at Queensland University of Technology in Australia, who has researched drones since 1999. “We definitely can do reliable drone delivery, but whether it makes financial sense is a little bit hard to know,” he says.

Regulation often determines where companies choose to set up shop. In 2002, Australia was the first country in the world to introduce legislation on unmanned aerial vehicles, as drones are technically called. The law allowed universities and companies to conduct drone experiments as long as they obtained official licenses. “So [Australia] was the perfect place then to do testing,” says Roberts. That’s why Alphabet’s Wing tested and launched its drone deliveries in Australia before trying them in any other country.

It was a similar story for Meituan and the city of Shenzhen, which has a strong drone manufacturing supply chain and a municipal government that has been particularly friendly toward the industry. On a national policy level, the central government has also permitted Shenzhen, one of the country’s designated Special Economic Zones, to have more flexibility when it comes to commercial drone legislation.

That’s why Meituan has chosen Shenzhen to carry out the majority of its drone delivery experiments so far. The company has just established a new route in Shanghai, and it has occasionally deployed drones in other cities, but Shenzhen will remain the center of its drone activity. 

Regulations only determine whether drone delivery is permitted, however. Economics determines whether it can actually happen—and whether it can be sustainable.

A number of companies, like Wing, have chosen to start testing their operations in suburban neighborhoods, where residents are well-off but traditional delivery isn’t efficient. That model is hard to replicate in China, where most people are urban dwellers. Some Chinese companies, like the e-commerce platform JD and the logistics company SF Express, opted to go first to rural villages, where ground transportation infrastructure is underdeveloped and drones can fill in a natural gap. 

That approach may not make sense if you’re trying to make as much money as possible from drone delivery: “If you look at the total numbers of deliveries in rural areas and in urban areas, you can see they differ by maybe two orders of magnitude,” Mao says. But the safety risks for drone operation in rural areas are lower. 

“The industry used to avoid urban areas because the technology was not advanced enough to guarantee it’s safe,” Mao says. By the time he joined Meituan to head the drone delivery team in 2019, about six years after other companies had piloted rural drone delivery programs in China, he made the judgment that the technology had become safe enough to operate in cities.

There have been no reports of safety incidents with Meituan’s drones so far. Across the world, delivery drones haven’t injured any humans, but they do occasionally crash, sometimes causing bush fires and power outages.

Meituan has made technical adjustments to make sure its drones can safely fly in cities, like opting for wing designs that are more stable in strong winds and developing its own navigation system based on computer vision to complement weak GPS signals between buildings. In February, the company obtained a license to offer commercial drone delivery in urban areas—a stamp of approval from China’s aviation authority. But gaining the residents’ complete trust will be a longer process, Mao says: “We need to explain to them, either through education or demonstrations where they can see the drones fly, that we can guarantee it’s safe.”

Drones vs. humans?

Some vendors and customers have already gotten used to the Meituan drones. 

I spoke to a restaurant server at the mall who said her restaurant was one of the first adopters of the drone delivery service. (She asked to be kept anonymous because she didn’t have permission to speak to the media.) The drones used to be unable to deliver during rainy days, but then the technology improved. Nowadays, the restaurant can fill dozens of orders through drone delivery every day.

One reason she likes drone delivery is that the service is more predictable, while the behavior of delivery workers can vary. “The problem of [delivery workers] stealing food from the customer’s order is very serious,” she says. When customers complain to the restaurant that they didn’t receive the food they ordered, the restaurant bears the burden of correcting the problem.

“If it really becomes a mature technology, it will be so much more efficient,” she says. “But also, a lot of people across the country would lose their jobs.” 

The same preference for drones over delivery workers can also be heard from customers. Not long after I got my iced tea from the drone, a second drone arrived at the same pickup spot. Wang, a tech worker from a nearby office who wished to be identified only by her last name, came with two friends to pick up the fruit she’d ordered. She makes such orders almost daily and finds it quite convenient.

“Compared to ordinary deliveries, it’s quicker and more sustainable, since the cardboard packages can be recycled. Plus, I don’t have to communicate with the delivery workers,” she said. Her attitude reflects a common tension between city dwellers and gig workers, who often come from rural areas.

Mao says Meituan is not planning to replace all delivery workers; he says the main goal is for drones to complement humans. They might deliver packages to places workers can’t go, like tourist attractions that require ticketed entries, or perform urgent tasks that would be difficult for humans to pull off.

In an ideal future, drones may make up 5% or 10% of all delivery orders, Mao says. But a precise target isn’t what he’s after—he says he’s more interested in making sure that drone delivery actually adds value for customers and becomes an easy-to-use delivery method. 

Meituan’s drone delivery still has some growing to do before it feels seamless: only a few vendors are available, and there are just a dozen kiosks in Shenzhen. Mao expects the service to become much more widespread in the city in three to five years.

As for the sci-fi vision of drone delivery straight to your window? “In the longer run, I believe it will become true, but that could be 20 to 30 years from now,” Mao says. “Because it would take 20 to 30 years to update urban infrastructure, particularly when it comes to residential buildings.”

How Roomba testers’ private images ended up on Facebook

A Roomba recorded a woman on the toilet. How did screenshots end up on social media?

This episode, we go behind the scenes of an MIT Technology Review investigation that uncovered how sensitive photos taken by an AI-powered vacuum were leaked and landed on the internet.

We meet:

  • Eileen Guo, MIT Technology Review
  • Albert Fox Cahn, Surveillance Technology Oversight Project

Credits:

This episode was reported by Eileen Guo and produced by Emma Cillekens and Anthony Green. It was hosted by Jennifer Strong and edited by Amanda Silverman and Mat Honan. This show is mixed by Garret Lang with original music from Garret Lang and Jacob Gorski. Artwork by Stephanie Arnett.

Full transcript:

[TR ID]

Jennifer: As more and more companies put artificial intelligence into their products, they need data to train their systems.

And we don’t typically know where that data comes from. 

But sometimes a company takes the mere act of using a product as consent to use our data to improve its products and services.

Consider a device in a home, where setting it up involves just one person consenting on behalf of every person who enters… and anyone living there—or just visiting—might be unknowingly recorded.

I’m Jennifer Strong and this episode we bring you a Tech Review investigation of training data… that was leaked from inside homes around the world. 

[SHOW ID] 

Jennifer: Last year someone reached out to a reporter I work with… and flagged some pretty concerning photos that were floating around the internet. 

Eileen Guo: They were essentially, pictures from inside people’s homes that were captured from low angles, sometimes had people and animals in them that didn’t appear to know that they were being recorded in most cases.

Jennifer: This is investigative reporter Eileen Guo.

And based on what she saw… she thought the photos might have been taken by an AI-powered vacuum.

Eileen Guo: They looked like, you know, they were taken from ground level and pointing up so that you could see whole rooms, the ceilings, whoever happened to be in them…

Jennifer: So she set to work investigating. It took months.  

Eileen Guo: So first we had to confirm whether or not they came from robot vacuums, as we suspected. And from there, we also had to then whittle down which robot vacuum it came from. And what we found was that they came from the largest manufacturer, by the number of sales of any robot vacuum, which is iRobot, which produces the Roomba.

Jennifer: It raised questions about whether or not these photos had been taken with consent… and how they wound up on the internet. 

In one of them, a woman is sitting on a toilet.

So our colleague looked into it, and she found the images weren’t of customers… they were iRobot employees… and people the company calls ‘paid data collectors’.

In other words, the people in the photos were beta testers… and they’d agreed to participate in this process… although it wasn’t totally clear what that meant. 

Eileen Guo: They’re really not as clear as you would think about what the data is ultimately being used for, who it’s being shared with and what other protocols or procedures are going to be keeping them safe—other than a broad statement that this data will be safe.

Jennifer: She doesn’t believe the people who gave permission to be recorded really knew what they agreed to.

Eileen Guo: They understood that the robot vacuums would be taking videos from inside their houses, but they didn’t understand that, you know, they would then be labeled and viewed by humans or they didn’t understand that they would be shared with third parties outside of the country. And no one understood that there was a possibility at all that these images could end up on Facebook and Discord, which is how they ultimately got to us.

Jennifer: The investigation found these images were leaked by some data labelers in the gig economy.

At the time they were working for a data labeling company (hired by iRobot) called Scale AI.

Eileen Guo: It’s essentially very low paid workers that are being asked to label images to teach artificial intelligence how to recognize what it is that they’re seeing. And so the fact that these images were shared on the internet was just incredibly surprising, given how sensitive they were.

Jennifer: Labeling these images with relevant tags is called data annotation. 

The process makes it easier for computers to understand and interpret the data in the form of images, text, audio, or video.

And it’s used in everything from flagging inappropriate content on social media to helping robot vacuums recognize what’s around them. 

Eileen Guo: The most useful datasets to train algorithms is the most realistic, meaning that it’s sourced from real environments. But to make all of that data useful for machine learning, you actually need a person to go through and look at whatever it is, or listen to whatever it is, and categorize and label and otherwise just add context to each bit of data. You know, for self driving cars, it’s, it’s an image of a street and saying, this is a stoplight that is turning yellow, this is a stoplight that is green. This is a stop sign. 

Jennifer: But there’s more than one way to label data. 

Eileen Guo: If iRobot chose to, they could have gone with other models in which the data would have been safer. They could have gone with outsourcing companies that may be outsourced, but people are still working out of an office instead of on their own computers. And so their work process would be a little bit more controlled. Or they could have actually done the data annotation in house. But for whatever reason, iRobot chose not to go either of those routes.

Jennifer: When Tech Review got in contact with the company—which makes the Roomba—they confirmed the 15 images we’ve been talking about did come from their devices, but from pre-production devices. Meaning these machines weren’t released to consumers.

Eileen Guo: They said that they started an investigation into how these images leaked. They terminated their contract with Scale AI, and also said that they were going to take measures to prevent anything like this from happening in the future. But they really wouldn’t tell us what that meant.  

Jennifer: These days, the most advanced robot vacuums can efficiently move around the room while also making maps of areas being cleaned. 

Plus, they recognize certain objects on the floor and avoid them. 

It’s why these machines no longer drive through certain kinds of messes… like dog poop for example.

But what’s different about these leaked training images is the camera isn’t pointed at the floor…  

Eileen Guo: Why do these cameras point diagonally upwards? Why do they know what’s on the walls or the ceilings? How does that help them navigate around the pet waste, or the phone cords or the stray sock or whatever it is. And that has to do with some of the broader goals that iRobot has and other robot vacuum companies has for the future, which is to be able to recognize what room it’s in, based on what you have in the home. And all of that is ultimately going to serve the broader goals of these companies which is create more robots for the home and all of this data is going to ultimately help them reach those goals.

Jennifer: In other words… This data collection might be about building new products altogether.

Eileen Guo: These images are not just about iRobot. They’re not just about test users. It’s this whole data supply chain, and this whole new point where personal information can leak out that consumers aren’t really thinking of or aware of. And the thing that’s also scary about this is that as more companies adopt artificial intelligence, they need more data to train that artificial intelligence. And where is that data coming from? Is.. is a really big question.

Jennifer: Because in the US, companies aren’t required to disclose that… and privacy policies usually have some version of a line that allows consumer data to be used to improve products and services… which includes training AI. Often, we opt in simply by using the product.

Eileen Guo: So it’s a matter of not even knowing that this is another place where we need to be worried about privacy, whether it’s robot vacuums, or Zoom or anything else that might be gathering data from us.

Jennifer: One option we expect to see more of in the future… is the use of synthetic data… or data that doesn’t come directly from real people. 

And she says companies like Dyson are starting to use it.

Eileen Guo: There’s a lot of hope that synthetic data is the future. It is more privacy protecting because you don’t need real world data. There have been early research that suggests that it is just as accurate if not more so. But most of the experts that I’ve spoken to say that that is anywhere from like 10 years to multiple decades out.

Jennifer: You can find links to our reporting in the show notes… and you can support our journalism by going to tech review dot com slash subscribe.

We’ll be back… right after this.

[MIDROLL]

Albert Fox Cahn: I think this is yet another wake up call that regulators and legislators are way behind in actually enacting the sort of privacy protections we need.

Albert Fox Cahn: My name’s Albert Fox Cahn. I’m the Executive Director of the Surveillance Technology Oversight Project.  

Albert Fox Cahn: Right now it’s the Wild West and companies are kind of making up their own policies as they go along for what counts as a ethical policy for this type of research and development, and, you know, quite frankly, they should not be trusted to set their own ground rules and we see exactly why with this sort of debacle, because here you have a company getting its own employees to sign these ludicrous consent agreements that are just completely lopsided. Are, to my view, almost so bad that they could be unenforceable all while the government is basically taking a hands off approach on what sort of privacy protection should be in place. 

Jennifer: He’s an anti-surveillance lawyer… a fellow at Yale and with Harvard’s Kennedy School.

And he describes his work as constantly fighting back against the new ways people’s data gets taken or used against them.

Albert Fox Cahn: What we see in here are terms that are designed to protect the privacy of the product, that are designed to protect the intellectual property of iRobot, but actually have no protections at all for the people who have these devices in their home. One of the things that’s really just infuriating for me about this is you have people who are using these devices in homes where it’s almost certain that a third party is going to be videotaped and there’s no provision for consent from that third party. One person is signing off for every single person who lives in that home, who visits that home, whose images might be recorded from within the home. And additionally, you have all these legal fictions in here like, oh, I guarantee that no minor will be recorded as part of this. Even though as far as we know, there’s no actual provision to make sure that people aren’t using these in houses where there are children.

Jennifer: And in the US, it’s anyone’s guess how this data will be handled.

Albert Fox Cahn: When you compare this to the situation we have in Europe where you actually have, you know, comprehensive privacy legislation where you have, you know, active enforcement agencies and regulators that are constantly pushing back at the way companies are behaving. And you have active trade unions that would prevent this sort of a testing regime with a employee most likely. You know, it’s night and day. 

Jennifer: He says having employees work as beta testers is problematic… because they might not feel like they have a choice.

Albert Fox Cahn: The reality is that when you’re an employee, oftentimes you don’t have the ability to meaningfully consent. You oftentimes can’t say no. And so instead of volunteering, you’re being voluntold to bring this product into your home, to collect your data. And so you’ll have this coercive dynamic where I just don’t think, you know, at, at, from a philosophical perspective, from an ethics perspective, that you can have meaningful consent for this sort of an invasive testing program by someone who is in an employment arrangement with the person who’s, you know, making the product.

Jennifer: Our devices already monitor our data… from smartphones to washing machines. 

And that’s only going to get more common as AI gets integrated into more and more products and services.

Albert Fox Cahn: We see evermore money being spent on evermore invasive tools that are capturing data from parts of our lives that we once thought were sacrosanct. I do think that there is just a growing political backlash against this sort of technological power, this surveillance capitalism, this sort of, you know, corporate consolidation.  

Jennifer: And he thinks that pressure is going to lead to new data privacy laws in the US. Partly because this problem is going to get worse.

Albert Fox Cahn: And when we think about the sort of data labeling that goes on the sorts of, you know, armies of human beings that have to pour over these recordings in order to transform them into the sorts of material that we need to train machine learning systems. There then is an army of people who can potentially take that information, record it, screenshot it, and turn it into something that goes public. And, and so, you know, I, I just don’t ever believe companies when they claim that they have this magic way of keeping safe all of the data we hand them, there’s this constant potential harm when we’re, especially when we’re dealing with any product that’s in its early training and design phase.

[CREDITS]

Jennifer: This episode was reported by Eileen Guo, produced by Emma Cillekens and Anthony Green, edited by Amanda Silverman and Mat Honan. And it’s mixed by Garret Lang, with original music from Garret Lang and Jacob Gorski.

Thanks for listening, I’m Jennifer Strong.

Roomba testers feel misled after intimate images ended up on Facebook

When Greg unboxed a new Roomba robot vacuum cleaner in December 2019, he thought he knew what he was getting into. 

He would allow the preproduction test version of iRobot’s Roomba J series device to roam around his house, let it collect all sorts of data to help improve its artificial intelligence, and provide feedback to iRobot about his user experience.

He had done this all before. Outside of his day job as an engineer at a software company, Greg had been beta-testing products for the past decade. He estimates that he’s tested over 50 products in that time—everything from sneakers to smart home cameras. 

“I really enjoy it,” he says. “The whole idea is that you get to learn about something new, and hopefully be involved in shaping the product, whether it’s making a better-quality release or actually defining features and functionality.”

But what Greg didn’t know—and does not believe he consented to—was that iRobot would share test users’ data in a sprawling, global data supply chain, where everything (and every person) captured by the devices’ front-facing cameras could be seen, and perhaps annotated, by low-paid contractors outside the United States who could screenshot and share images at their will. 

Greg, who asked that we identify him only by his first name because he signed a nondisclosure agreement with iRobot, is not the only test user who feels dismayed and betrayed. 

Nearly a dozen people who participated in iRobot’s data collection efforts between 2019 and 2022 have come forward in the weeks since MIT Technology Review published an investigation into how the company uses images captured from inside real homes to train its artificial intelligence. The participants have shared similar concerns about how iRobot handled their data—and whether those practices conform with the company’s own data protection promises. After all, the agreements go both ways, and whether or not the company legally violated its promises, the participants feel misled. 

“There is a real concern about whether the company is being deceptive if people are signing up for this sort of highly invasive type of surveillance and never fully understand … what they’re agreeing to,” says Albert Fox Cahn, the executive director of the Surveillance Technology Oversight Project.

The company’s failure to adequately protect test user data feels like “a clear breach of the agreement on their side,” Greg says. It’s “a failure … [and] also a violation of trust.” 

Now, he wonders, “where is the accountability?” 

The blurry line between testers and consumers

Last month MIT Technology Review revealed how iRobot collects photos and videos from the homes of test users and employees and shares them with data annotation companies, including San Francisco–based Scale AI, which hire far-flung contractors to label the data that trains the company’s artificial-intelligence algorithms. 

We found that in one 2020 project, gig workers in Venezuela were asked to label objects in a series of images of home interiors, some of which included individuals—their faces visible to the data annotators. These workers then shared at least 15 images—including shots of a minor and of a woman sitting on the toilet—to social media groups where they gathered to talk shop. We know about these particular images because the screenshots were subsequently shared with us, but our interviews with data annotators and researchers who study data annotation suggest they are unlikely to be the only ones that made their way online; it’s not uncommon for sensitive images, videos, and audio to be shared with labelers. 

Shortly after MIT Technology Review contacted iRobot for comment on the photos last fall, the company terminated its contract with Scale AI. 

Nevertheless, in a LinkedIn post in response to our story, iRobot CEO Colin Angle did not acknowledge that the mere fact that these images, and the faces of test users, were visible to human gig workers was a reason for concern. Rather, he wrote, making such images available was actually necessary to train iRobot’s object recognition algorithms: “How do our robots get so smart? It starts during the development process, and as part of that, through the collection of data to train machine learning algorithms.” Besides, he pointed out, the images came not from customers but from “paid data collectors and employees” who had signed consent agreements.

In the LinkedIn post and in statements to MIT Technology Review, Angle and iRobot have repeatedly emphasized that no customer data was shared and that “participants are informed and acknowledge how the data will be collected.” 

This attempt to clearly delineate between customers and beta testers—and how those people’s data will be treated—has been confounding to many testers, who say they consider themselves part of iRobot’s broader community and feel that the company’s comments are dismissive. Greg and the other testers who reached out also strongly dispute any implication that by volunteering to test a product, they have signed away all their privacy. 

What’s more, the line between tester and consumer is not so clear cut. At least one of the testers we spoke with enjoyed his test Roomba so much that he later purchased the device. 

This is not an anomaly; rather, converting beta testers to customers and evangelists for the product is something Centercode, the company that recruited the participants on behalf of iRobot, actively tries to promote: “It’s hard to find better potential brand ambassadors than in your beta tester community. They’re a great pool of free, authentic voices that can talk about your launched product to the world, and their (likely techie) friends,” it wrote in a marketing blog post.

To Greg, iRobot has “failed spectacularly” in its treatment of the testing community, particularly in its silence over the privacy breach. iRobot says it has notified individuals whose photos appeared in the set of 15 images, but it did not respond to a question about whether it would notify other individuals who had taken part in its data collection. The participants who reached out to us said they have not received any kind of notice from the company. 

“If your credit card information … was stolen at Target, Target doesn’t notify the one person who has the breach,” he adds. “They send out a notification that there was a breach, this is what happened, [and] this is how they’re handling it.” 

Inside the world of beta testing

The journey of iRobot’s AI-powering data points starts on testing platforms like Betabound, which is run by Centercode. The technology company, based in Laguna Hills, California, recruits volunteers to test out products and services for its clients—primarily consumer tech companies. (iRobot spokesperson James Baussmann confirmed that the company has used Betabound but said that “not all of the paid data collectors were recruited via Betabound.” Centercode did not respond to multiple requests for comment.) 

As early adopters, beta testers are often more tech savvy than the average consumer. They are enthusiastic about gadgets and, like Greg, sometimes work in the technology sector themselves—so they are often well aware of the standards around data protection. 

A review of all 6,200 test opportunities listed on Betabound’s website as of late December shows that iRobot has been testing on the platform since at least 2017. The latest project, which is specifically recruiting German testers, started just last month. 

iRobot’s vacuums are far from the only devices of their kind being tested on the platform. There are over 300 tests listed for other “smart” devices powered by AI, including “a smart microwave with Alexa support,” as well as multiple other robot vacuums.

The first step for potential testers is to fill out a profile on the Betabound website. They can then apply for specific opportunities as they’re announced. If accepted by the company running the test, testers sign numerous agreements before they are sent the devices. 

Betabound testers are not paid, as the platform’s FAQ for testers notes: “Companies cannot expect your feedback to be honest and reliable if you’re being paid to give it.” Rather, testers might receive gift cards, a chance to keep their test devices free of charge, or complimentary production versions delivered after the device they tested goes to market. 

iRobot, however, did not allow testers to keep their devices, nor did they receive final products. Instead, the beta testers told us that they received gift cards in amounts ranging from $30 to $120 for running the robot vacuums multiple times a week over multiple weeks. (Baussmann says that “with respect to the amount paid to participants, it varies depending upon the work involved.”) 

For some testers, this compensation was disappointing—“even before considering … my naked ass could now be on the Internet,” as B, a tester we’re identifying only by his first initial, wrote in an email. He called iRobot “cheap bastards” for the $30 gift card that he received for his data, collected daily over three months. 

What users are really agreeing to 

When MIT Technology Review reached out to iRobot for comment on the set of 15 images last fall, the company emphasized that each image had a corresponding consent agreement. It would not, however, share the agreements with us, citing “legal reasons.” Instead, the company said the agreement required an “acknowledgment that video and images are being captured during cleaning jobs” and that “the agreement encourages paid data collectors to remove anything they deem sensitive from any space the robot operates in, including children.”

Test users have since shared with MIT Technology Review copies of their agreements with iRobot. These comprise several different forms—a general Betabound agreement and a “global test agreement for development robots,” as well as agreements on nondisclosure, test participation, and product loan. There are also agreements for some of the specific tests being run.

The text of iRobot’s global test agreement from 2019, copied into a new document to protect the identity of test users.

The forms do contain the language iRobot previously laid out, while also spelling out the company’s own commitments on data protection for test users. But they provide little clarity on what exactly that means, especially how the company will handle user data after it’s collected and whom the data will be shared with.

The “global test agreement for development robots,” similar versions of which were independently shared by a half-dozen individuals who signed them between 2019 and 2022, contains the bulk of the information on privacy and consent. 

In the short document of roughly 1,300 words, iRobot notes that it is the controller of information, which comes with legal responsibilities under the EU’s GDPR to ensure that data is collected for legitimate purposes and securely stored and processed. Additionally, it states, “iRobot agrees that third-party vendors and service providers selected to process [personal information] will be vetted for privacy and data security, will be bound by strict confidentiality, and will be governed by the terms of a Data Processing Agreement,” and that users “may be entitled to additional rights under applicable privacy laws where [they] reside.”

It’s this section of the agreement that Greg believes iRobot breached. “Where in that statement is the accountability that iRobot is proposing to the testers?” he asks. “I completely disagree with how offhandedly this is being responded to.”

What’s more, all test participants had to agree that their data could be used for machine learning and object detection training. Specifically, the global test agreement’s section on “use of research information” required an acknowledgment that “text, video, images, or audio … may be used by iRobot to analyze statistics and usage data, diagnose technology problems, enhance product performance, product and feature innovation, market research, trade presentations, and internal training, including machine learning and object detection.” 

What isn’t spelled out here is that iRobot carries out the machine-learning training through human data labelers who teach the algorithms, click by click, to recognize the individual elements captured in the raw data. In other words, the agreements shared with us never explicitly mention that personal images will be seen and analyzed by other humans. 

Baussmann, iRobot’s spokesperson, said that the language we highlighted “covers a variety of testing scenarios” and is not specific to images sent for data annotation. “For example, sometimes testers are asked to take photos or videos of a robot’s behavior, such as when it gets stuck on a certain object or won’t completely dock itself, and send those photos or videos to iRobot,” he wrote, adding that “for tests in which images will be captured for annotation purposes, there are specific terms that are outlined in the agreement pertaining to that test.” 

He also wrote that “we cannot be sure the people you have spoken with were part of the development work that related to your article,” though he notably did not dispute the veracity of the global test agreement, which ultimately allows all test users’ data to be collected and used for machine learning. 

What users really understand

When we asked privacy lawyers and scholars to review the consent agreements and shared with them the test users’ concerns, they saw the documents and the privacy violations that ensued as emblematic of a broken consent framework that affects us all—whether we are beta testers or regular consumers. 

Experts say companies are well aware that people rarely read privacy policies closely, if we read them at all. But what iRobot’s global test agreement attests to, says Ben Winters, a lawyer with the Electronic Privacy Information Center who focuses on AI and human rights, is that “even if you do read it, you still don’t get clarity.”

Rather, “a lot of this language seems to be designed to exempt the company from applicable privacy laws, but none of it reflects the reality of how the product operates,” says Cahn, pointing to the robot vacuums’ mobility and the impossibility of controlling where potentially sensitive people or objects—in particular children—are at all times in their own home. 

Ultimately, that “place[s] much of the responsibility … on the end user,” notes Jessica Vitak, an information scientist at the University of Maryland’s College of Information Studies who studies best practices in research and consent policies. Yet it doesn’t give them a true accounting of “how things might go wrong,” she says—“which would be very valuable information when deciding whether to participate.”

Not only does it put the onus on the user; it also leaves it to that single person to “unilaterally affirm the consent of every person within the home,” explains Cahn, even though “everyone who lives in a house that uses one of these devices will potentially be put at risk.”

All of this lets the company shirk its true responsibility as a data controller, adds Deirdre Mulligan, a professor in the School of Information at UC Berkeley. “A device manufacturer that is a data controller” can’t simply “offload all responsibility for the privacy implications of the device’s presence in the home to an employee” or other volunteer data collectors. 

Some participants did admit that they hadn’t read the consent agreement closely. “I skimmed the [terms and conditions] but didn’t notice the part about sharing *video and images* with a third party—that would’ve given me pause,” one tester, who used the vacuum for three months last year, wrote in an email. 

Before testing his Roomba, B said, he had “perused” the consent agreement and “figured it was a standard boilerplate: ‘We can do whatever the hell we want with what we collect, and if you don’t like that, don’t participate [or] use our product.’” He added, “Admittedly, I just wanted a free product.”

Still, B expected that iRobot would offer some level of data protection—not that the “company that made us swear up and down with NDAs that we wouldn’t share any information” about the tests would “basically subcontract their most intimate work to the lowest bidder.”

Notably, many of the test users who reached out—even those who say they read the full global test agreement, as well as myriad other agreements, including ones applicable to all consumers—still lacked a clear understanding of what collecting their data actually meant or how exactly that data would be processed and used. 

What they did understand often depended more on their own awareness of how artificial intelligence is trained than on anything communicated by iRobot. 

One tester, Igor, who asked to be identified only by his first name, works in IT for a bank; he considers himself to have “above average training in cybersecurity” and has built his own internet infrastructure at home, allowing him to self-host sensitive information on his own servers and monitor network traffic. He said he did understand that videos would be taken from inside his home and that they would be tagged. “I felt that the company handled the disclosure of the data collection responsibly,” he wrote in an email, pointing to both the consent agreement and the device’s prominently placed sticker reading “video recording in process.” But, he emphasized, “I’m not an average internet user.” 

Photo of iRobot’s preproduction Roomba J series device.
COURTESY OF IROBOT

For many testers, the greatest shock from our story was how the data would be handled after collection—including just how much humans would be involved. “I assumed it [the video recording] was only for internal validation if there was an issue as is common practice (I thought),” another tester who asked to be anonymous wrote in an email. And as B put it, “It definitely crossed my mind that these photos would probably be viewed for tagging within a company, but the idea that they were leaked online is disconcerting.” 

“Human review didn’t surprise me,” Greg adds, but “the level of human review did … the idea, generally, is that AI should be able to improve the system 80% of the way … and the remainder of it, I think, is just on the exception … that [humans] have to look at it.” 

Even the participants who were comfortable with having their images viewed and annotated, like Igor, said they were uncomfortable with how iRobot processed the data after the fact. The consent agreement, Igor wrote, “doesn’t excuse the poor data handling” and “the overall storage and control that allowed a contractor to export the data.”

Multiple US-based participants, meanwhile, expressed concerns about their data being transferred out of the country. The global agreement, they noted, had language for participants “based outside of the US” saying that “iRobot may process Research Data on servers not in my home country … including those whose laws may not offer the same level of data protection as my home country”—but the agreement did not have any corresponding information for US-based participants on how their data would be processed. 

“I had no idea that the data was going overseas,” one US-based participant wrote to MIT Technology Review—a sentiment repeated by many. 

Once data is collected, whether from test users or from customers, people ultimately have little to no control over what the company does with it next—including, for US users, sharing their data overseas.

US users, in fact, have few privacy protections even in their home country, notes Cahn, which is why the EU has laws to protect data from being transferred outside the EU—and to the US specifically. “Member states have to take such extensive steps to protect data being stored in that country. Whereas in the US, it’s largely the Wild West,” he says. “Americans have no equivalent protection against their data being stored in other countries.” 

For some testers, this compensation was disappointing—“even before considering … my naked ass could now be on the Internet.”

Many testers themselves are aware of the broader issues around data protection in the US, which is why they chose to speak out. 

“Outside of regulated industries like banking and health care, the best thing we can probably do is create significant liability for data protection failure, as only hard economic incentives will make companies focus on this,” wrote Igor, the tester who works in IT at a bank. “Sadly the political climate doesn’t seem like anything could pass here in the US. The best we have is the public shaming … but that is often only reactionary and catches just a small percentage of what’s out there.”

In the meantime, in the absence of change and accountability—whether from iRobot itself or pushed by regulators—Greg has a message for potential Roomba buyers. “I just wouldn’t buy one, flat out,” he says, because he feels “iRobot is not handling their data security model well.” 

And on top of that, he warns, they’re “really dismissing their responsibility as vendors to … notify [or] protect customers—which in this case include the testers of these products.”

Lam Thuy Vo contributed research. 

Correction: This piece has been updated to clarify what iRobot CEO Colin Angle wrote in a LinkedIn post in response to faces appearing in data collection.

A Roomba recorded a woman on the toilet. How did screenshots end up on Facebook?

In the fall of 2020, gig workers in Venezuela posted a series of images to online forums where they gathered to talk shop. The photos were mundane, if sometimes intimate, household scenes captured from low angles—including some you really wouldn’t want shared on the Internet. 

In one particularly revealing shot, a young woman in a lavender T-shirt sits on the toilet, her shorts pulled down to mid-thigh.

The images were not taken by a person, but by development versions of iRobot’s Roomba J7 series robot vacuum. They were then sent to Scale AI, a startup that contracts workers around the world to label audio, photo, and video data used to train artificial intelligence. 

They were the sorts of scenes that internet-connected devices regularly capture and send back to the cloud—though usually with stricter storage and access controls. Yet earlier this year, MIT Technology Review obtained 15 screenshots of these private photos, which had been posted to closed social media groups. 

The photos vary in type and in sensitivity. The most intimate image we saw was the series of video stills featuring the young woman on the toilet, her face blocked in the lead image but unobscured in the grainy scroll of shots below. In another image, a boy who appears to be eight or nine years old, and whose face is clearly visible, is sprawled on his stomach across a hallway floor. A triangular flop of hair spills across his forehead as he stares, with apparent amusement, at the object recording him from just below eye level.

The other shots show rooms from homes around the world, some occupied by humans, one by a dog. Furniture, décor, and objects located high on the walls and ceilings are outlined by rectangular boxes and accompanied by labels like “tv,” “plant_or_flower,” and “ceiling light.” 

iRobot—the world’s largest vendor of robotic vacuums, which Amazon recently acquired for $1.7 billion in a pending deal—confirmed that these images were captured by its Roombas in 2020. All of them came from “special development robots with hardware and software modifications that are not and never were present on iRobot consumer products for purchase,” the company said in a statement. They were given to “paid collectors and employees” who signed written agreements acknowledging that they were sending data streams, including video, back to the company for training purposes. According to iRobot, the devices were labeled with a bright green sticker that read “video recording in progress,” and it was up to those paid data collectors to “remove anything they deem sensitive from any space the robot operates in, including children.”

In other words, by iRobot’s estimation, anyone whose photos or video appeared in the streams had agreed to let their Roombas monitor them. iRobot declined to let MIT Technology Review view the consent agreements and did not make any of its paid collectors or employees available to discuss their understanding of the terms.

While the images shared with us did not come from iRobot customers, consumers regularly consent to having their data monitored to varying degrees on devices ranging from iPhones to washing machines. It’s a practice that has only grown more common over the past decade, as data-hungry artificial intelligence has been increasingly integrated into a whole new array of products and services. Much of this technology is based on machine learning, a technique that uses large troves of data—including our voices, faces, homes, and other personal information—to train algorithms to recognize patterns. The most useful data sets are the most realistic, making data sourced from real environments, like homes, especially valuable. Often, we opt in simply by using the product, as noted in privacy policies with vague language that gives companies broad discretion in how they disseminate and analyze consumer information. 

Did you participate in iRobot’s data collection efforts? We’d love to hear from you. Please reach out at tips@technologyreview.com. 

The data collected by robot vacuums can be particularly invasive. They have “powerful hardware, powerful sensors,” says Dennis Giese, a PhD candidate at Northeastern University who studies the security vulnerabilities of Internet of Things devices, including robot vacuums. “And they can drive around in your home—and you have no way to control that.” This is especially true, he adds, of devices with advanced cameras and artificial intelligence—like iRobot’s Roomba J7 series.

This data is then used to build smarter robots whose purpose may one day go far beyond vacuuming. But to make these data sets useful for machine learning, individual humans must first view, categorize, label, and otherwise add context to each bit of data. This process is called data annotation.

“There’s always a group of humans sitting somewhere—usually in a windowless room, just doing a bunch of point-and-click: ‘Yes, that is an object or not an object,’” explains Matt Beane, an assistant professor in the technology management program at the University of California, Santa Barbara, who studies the human work behind robotics.
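
To make that point-and-click work concrete, here is a minimal, hypothetical Python sketch of what a single annotated frame might look like once a labeler has drawn boxes like the ones described above. The field names and label set are illustrative only, not iRobot’s or Scale AI’s actual schema.

```python
# A hypothetical record for one labeled frame. Each annotation is a single
# human judgment: a bounding box in pixel coordinates plus a class label.
annotated_frame = {
    "image_id": "frame_000123",     # illustrative identifier
    "source": "development_robot",  # a test device, not a consumer unit
    "width": 640,
    "height": 480,
    "annotations": [
        {"bbox": [412, 38, 110, 64], "label": "tv"},
        {"bbox": [95, 301, 58, 122], "label": "plant_or_flower"},
        {"bbox": [280, 12, 90, 40],  "label": "ceiling light"},
    ],
}

def to_training_examples(frame):
    """Flatten one annotated frame into (box, label) pairs for a detector."""
    return [(tuple(a["bbox"]), a["label"]) for a in frame["annotations"]]

print(to_training_examples(annotated_frame))
```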

The 15 images shared with MIT Technology Review are just a tiny slice of a sweeping data ecosystem. iRobot has said that it has shared over 2 million images with Scale AI and an unknown number more with other data annotation platforms; the company has confirmed that Scale is just one of the data annotators it has used. 

James Baussmann, iRobot’s spokesperson, said in an email the company had “taken every precaution to ensure that personal data is processed securely and in accordance with applicable law,” and that the images shared with MIT Technology Review were “shared in violation of a written non-disclosure agreement between iRobot and an image annotation service provider.” In an emailed statement a few weeks after we shared the images with the company, iRobot CEO Colin Angle said that “iRobot is terminating its relationship with the service provider who leaked the images, is actively investigating the matter, and [is] taking measures to help prevent a similar leak by any service provider in the future.” The company did not respond to additional questions about what those measures were. 

Ultimately, though, this set of images represents something bigger than any one individual company’s actions. They speak to the widespread, and growing, practice of sharing potentially sensitive data to train algorithms, as well as the surprising, globe-spanning journey that a single image can take—in this case, from homes in North America, Europe, and Asia to the servers of Massachusetts-based iRobot, from there to San Francisco–based Scale AI, and finally to Scale’s contracted data workers around the world (including, in this instance, Venezuelan gig workers who posted the images to private groups on Facebook, Discord, and elsewhere). 

Together, the images reveal a whole data supply chain—and new points where personal information could leak out—that few consumers are even aware of. 

“It’s not expected that human beings are going to be reviewing the raw footage,” emphasizes Justin Brookman, director of tech policy at Consumer Reports and former policy director of the Federal Trade Commission’s Office of Technology Research and Investigation. iRobot would not say whether data collectors were aware that humans, in particular, would be viewing these images, though the company said the consent form made clear that “service providers” would be.

“It’s not expected that human beings are going to be reviewing the raw footage.”

“We literally treat machines differently than we treat humans,” adds Jessica Vitak, an information scientist and professor at the University of Maryland’s communication department and its College of Information Studies. “It’s much easier for me to accept a cute little vacuum, you know, moving around my space [than] somebody walking around my house with a camera.” 

And yet, that’s essentially what is happening. It’s not just a robot vacuum watching you on the toilet—a person may be looking too. 

The robot vacuum revolution 

Robot vacuums weren’t always so smart. 

The earliest model, the Swiss-made Electrolux Trilobite, came to market in 2001. It used ultrasonic sensors to locate walls and plot cleaning patterns; additional bump sensors on its sides and cliff sensors at the bottom helped it avoid running into objects or falling off stairs. But these sensors were glitchy, leading the robot to miss certain areas or repeat others. The result was unfinished and unsatisfactory cleaning jobs. 

The next year, iRobot released the first-generation Roomba, which relied on similar basic bump sensors and turn sensors. Much cheaper than its competitor, it became the first commercially successful robot vacuum.

The most basic models today still operate similarly, while midrange cleaners incorporate better sensors and other navigational techniques like simultaneous localization and mapping to find their place in a room and chart out better cleaning paths. 

Higher-end devices have moved on to computer vision, a subset of artificial intelligence that approximates human sight by training algorithms to extract information from images and videos, and/or lidar, a laser-based sensing technique used by NASA and widely considered the most accurate—but most expensive—navigational technology on the market today. 

Computer vision depends on high-definition cameras, and by our count, around a dozen companies have incorporated front-facing cameras into their robot vacuums for navigation and object recognition—as well as, increasingly, home monitoring. This includes the top three robot vacuum makers by market share: iRobot, which has 30% of the market and has sold over 40 million devices since 2002; Ecovacs, with about 15%; and Roborock, which has about another 15%, according to the market intelligence firm Strategy Analytics. It also includes familiar household appliance makers like Samsung, LG, and Dyson, among others. In all, some 23.4 million robot vacuums were sold in Europe and the Americas in 2021 alone, according to Strategy Analytics. 

From the start, iRobot went all in on computer vision, and its first device with such capabilities, the Roomba 980, debuted in 2015. It was also the first of iRobot’s Wi-Fi-enabled devices, as well as its first that could map a home, adjust its cleaning strategy on the basis of room size, and identify basic obstacles to avoid. 

Computer vision “allows the robot to … see the full richness of the world around it,” says Chris Jones, iRobot’s chief technology officer. It allows iRobot’s devices to “avoid cords on the floor or understand that that’s a couch.” 

But for computer vision in robot vacuums to truly work as intended, manufacturers need to train it on high-quality, diverse data sets that reflect the huge range of what they might see. “The variety of the home environment is a very difficult task,” says Wu Erqi, the senior R&D director of Beijing-based Roborock. Road systems “are quite standard,” he says, so for makers of self-driving cars, “you’ll know how the lane looks … [and] how the traffic sign looks.” But each home interior is vastly different. 

“The furniture is not standardized,” he adds. “You cannot expect what will be on your ground. Sometimes there’s a sock there, maybe some cables”—and the cables may look different in the US and China. 

A family bent over a vacuum; light emitting from the vacuum shines on their obscured faces.

MATTHIEU BOUREL

MIT Technology Review spoke with or sent questions to 12 companies selling robot vacuums and found that they respond to the challenge of gathering training data differently. 

In iRobot’s case, over 95% of its image data set comes from real homes, whose residents are either iRobot employees or volunteers recruited by third-party data vendors (which iRobot declined to identify). People using development devices agree to allow iRobot to collect data, including video streams, as the devices are running, often in exchange for “incentives for participation,” according to a statement from iRobot. The company declined to specify what these incentives were, saying only that they varied “based on the length and complexity of the data collection.” 

The remaining training data comes from what iRobot calls “staged data collection,” in which the company builds models that it then records.

iRobot has also begun offering regular consumers the opportunity to opt in to contributing training data through its app, where people can choose to send specific images of obstacles to company servers to improve its algorithms. iRobot says that if a customer participates in this “user-in-the-loop” training, as it is known, the company receives only these specific images, and no others. Baussmann, the company representative, said in an email that such images have not yet been used to train any algorithms. 

In contrast to iRobot, Roborock said that it either “produce[s] [its] own images in [its] labs” or “work[s] with third-party vendors in China who are specifically asked to capture & provide images of objects on floors for our training purposes.” Meanwhile, Dyson, which sells two high-end robot vacuum models, said that it gathers data from two main sources: “home trialists within Dyson’s research & development department with a security clearance” and, increasingly, synthetic, or AI-generated, training data. 

Most robot vacuum companies MIT Technology Review spoke with explicitly said they don’t use customer data to train their machine-learning algorithms. Samsung did not respond to questions about how it sources its data (though it wrote that it does not use Scale AI for data annotation), while Ecovacs calls the source of its training data “confidential.” LG and Bosch did not respond to requests for comment.

“You have to assume that people … ask each other for help. The policy always says that you’re not supposed to, but it’s very hard to control.” 

Some clues about other methods of data collection come from Giese, the IoT hacker, whose office at Northeastern is piled high with robot vacuums that he has reverse-engineered, giving him access to their machine-learning models. Some are produced by Dreame, a relatively new Chinese company based in Shenzhen that sells affordable, feature-rich devices. 

Giese found that Dreame vacuums have a folder labeled “AI server,” as well as image upload functions. Companies often say that “camera data is never sent to the cloud and whatever,” Giese says, but “when I had access to the device, I was basically able to prove that it’s not true.” Even if they didn’t actually upload any photos, he adds, “[the function] is always there.”  

Dreame manufactures robot vacuums that are also rebranded and sold by other companies—an indication that this practice could be employed by other brands as well, says Giese. 

Dreame did not respond to emailed questions about the data collected from customer devices, but in the days following MIT Technology Review’s initial outreach, the company began changing its privacy policies, including those related to how it collects personal information, and pushing out multiple firmware updates.

But without either an explanation from companies themselves or a way, besides hacking, to test their assertions, it’s hard to know for sure what they’re collecting from customers for training purposes.

How and why our data ends up halfway around the world

With the raw data required for machine-learning algorithms comes the need for labor, and lots of it. That’s where data annotation comes in. A young but growing industry, data annotation is projected to reach $13.3 billion in market value by 2030. 

The field took off largely to meet the huge need for labeled data to train the algorithms used in self-driving vehicles. Today, data labelers, who are often low-paid contract workers in the developing world, help power much of what we take for granted as “automated” online. They keep the worst of the Internet out of our social media feeds by manually categorizing and flagging posts, improve voice recognition software by transcribing low-quality audio, and help robot vacuums recognize objects in their environments by tagging photos and videos. 

Among the myriad companies that have popped up over the past decade, Scale AI has become the market leader. Founded in 2016, it built a business model around contracting with remote workers in less-wealthy nations at cheap project- or task-based rates on Remotasks, its proprietary crowdsourcing platform. 

In 2020, Scale posted a new assignment there: Project IO. It featured images captured from the ground and angled upwards at roughly 45 degrees, and showed the walls, ceilings, and floors of homes around the world, as well as whatever happened to be in or on them—including people, whose faces were clearly visible to the labelers. 

Labelers discussed Project IO in Facebook, Discord, and other groups that they had set up to share advice on handling delayed payments, talk about the best-paying assignments, or request assistance in labeling tricky objects. 

iRobot confirmed that the 15 images posted in these groups and subsequently sent to MIT Technology Review came from its devices, sharing a spreadsheet listing the specific dates on which they were captured (between June and November 2020), the countries they came from (the United States, Japan, France, Germany, and Spain), and the serial numbers of the devices that produced the images, as well as a column indicating that a consent form had been signed by each device’s user. (Scale AI confirmed that 13 of the 15 images came from “an R&D project [it] worked on with iRobot over two years ago,” though it declined to clarify the origins of or offer additional information on the other two images.)

iRobot says that sharing images in social media groups violates Scale’s agreements with it, and Scale says that contract workers sharing these images breached their own agreements. 

“The underlying problem is that your face is like a password you can’t change. Once somebody has recorded the ‘signature’ of your face, they can use it forever to find you in photos or video.” 

But such actions are nearly impossible to police on crowdsourcing platforms. 

When I ask Kevin Guo, the CEO of Hive, a Scale competitor that also depends on contract workers, if he is aware of data labelers sharing content on social media, he is blunt. “These are distributed workers,” he says. “You have to assume that people … ask each other for help. The policy always says that you’re not supposed to, but it’s very hard to control.” 

That means that it’s up to the service provider to decide whether or not to take on certain work. For Hive, Guo says, “we don’t think we have the right controls in place given our workforce” to effectively protect sensitive data. Hive does not work with any robot vacuum companies, he adds. 

“It’s sort of surprising to me that [the images] got shared on a crowdsourcing platform,” says Olga Russakovsky, the principal investigator at Princeton University’s Visual AI Lab and a cofounder of the group AI4All. Keeping the labeling in house, where “folks are under strict NDAs” and “on company computers,” would keep the data far more secure, she points out.

In other words, relying on far-flung data annotators is simply not a secure way to protect data. “When you have data that you’ve gotten from customers, it would normally reside in a database with access protection,” says Pete Warden, a leading computer vision researcher and a PhD student at Stanford University. But with machine-learning training, customer data is all combined “in a big batch,” widening the “circle of people” who get access to it.

Screenshots shared with MIT Technology Review of data annotation in progress

For its part, iRobot says that it shares only a subset of training images with data annotation partners, flags any image with sensitive information, and notifies the company’s chief privacy officer if sensitive information is detected. Baussmann calls this situation “rare,” and adds that when it does happen, “the entire video log, including the image, is deleted from iRobot servers.”

The company specified, “When an image is discovered where a user is in a compromising position, including nudity, partial nudity, or sexual interaction, it is deleted—in addition to ALL other images from that log.” It did not clarify whether this flagging would be done automatically by algorithm or manually by a person, or why that did not happen in the case of the woman on the toilet.

iRobot policy, however, does not deem faces sensitive, even if the people are minors. 

“In order to teach the robots to avoid humans and images of humans”—a feature that it has promoted to privacy-wary customers—the company “first needs to teach the robot what a human is,” Baussmann explained. “In this sense, it is necessary to first collect data of humans to train a model.” The implication is that faces must be part of that data.

But facial images may not actually be necessary for algorithms to detect humans, according to William Beksi, a computer science professor who runs the Robotic Vision Laboratory at the University of Texas at Arlington: human detector models can recognize people based “just [on] the outline (silhouette) of a human.” 

“If you were a big company, and you were concerned about privacy, you could preprocess these images,” Beksi says. For example, you could blur human faces before they even leave the device and “before giving them to someone to annotate.”

“It does seem to be a bit sloppy,” he concludes, “especially to have minors recorded in the videos.” 
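
Beksi’s suggestion is straightforward to sketch. The Python snippet below, using the open-source OpenCV library, shows one hypothetical way a device could blur detected faces before a frame ever leaves it; a production system would need a far more reliable detector, since any face the classifier misses would be uploaded unblurred.

```python
# A minimal sketch of on-device face blurring, assuming OpenCV and its
# bundled Haar cascade detector. Illustrative only; not any vendor's code.
import cv2

detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def blur_faces(frame):
    """Blur every detected face region in a BGR frame, in place."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    for (x, y, w, h) in detector.detectMultiScale(gray, scaleFactor=1.1,
                                                  minNeighbors=5):
        # Replace the face region with a heavily blurred copy of itself.
        frame[y:y+h, x:x+w] = cv2.GaussianBlur(frame[y:y+h, x:x+w], (51, 51), 0)
    return frame

# e.g., before any upload step: upload(blur_faces(camera_frame))
```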

In the case of the woman on the toilet, a data labeler made an effort to preserve her privacy by placing a black circle over her face. But in no other images featuring people were identities obscured, either by the data labelers themselves, by Scale AI, or by iRobot. That includes the image of the young boy sprawled on the floor.

Baussmann explained that iRobot protected “the identity of these humans” by “decoupling all identifying information from the images … so if an image is acquired by a bad actor, they cannot map backwards to identify the person in the image.”

But capturing faces is inherently privacy-violating, argues Warden. “The underlying problem is that your face is like a password you can’t change,” he says. “Once somebody has recorded the ‘signature’ of your face, they can use it forever to find you in photos or video.” 

AI labels over the illustrated faces of a family

MATTHIEU BOUREL

Additionally, “lawmakers and enforcers in privacy would view biometrics, including faces, as sensitive information,” says Jessica Rich, a privacy lawyer who served as director of the FTC’s Bureau of Consumer Protection between 2013 and 2017. This is especially the case if any minors are captured on camera, she adds: “Getting consent from the employee [or testers] isn’t the same as getting consent from the child. The employee doesn’t have the capacity to consent to data collection about other individuals—let alone the children that appear to be implicated.” Rich says she wasn’t referring to any specific company in these comments. 

In the end, the real problem is arguably not that the data labelers shared the images on social media. Rather, it’s that this type of AI training set—specifically, one depicting faces—is far more common than most people understand, notes Milagros Miceli, a sociologist and computer scientist who has been interviewing distributed workers contracted by data annotation companies for years. Miceli has spoken to multiple labelers who have seen similar images, taken from the same low vantage points and sometimes showing people in various stages of undress. 

The data labelers found this work “really uncomfortable,” she adds. 

Surprise: you may have agreed to this 

Robot vacuum manufacturers themselves recognize the heightened privacy risks presented by on-device cameras. “When you’ve made the decision to invest in computer vision, you do have to be very careful with privacy and security,” says Jones, iRobot’s CTO. “You’re giving this benefit to the product and the consumer, but you also have to be treating privacy and security as a top-order priority.”

In fact, iRobot tells MIT Technology Review it has implemented many privacy- and security-protecting measures in its customer devices, including using encryption, regularly patching security vulnerabilities, limiting and monitoring internal employee access to information, and providing customers with detailed information on the data that it collects. 

But there is a wide gap between the way companies talk about privacy and the way consumers understand it. 

It’s easy, for instance, to conflate privacy with security, says Jen Caltrider, the lead researcher behind Mozilla’s “*Privacy Not Included” project, which reviews consumer devices for both privacy and security. Data security refers to a product’s physical and cyber security, or how vulnerable it is to a hack or intrusion, while data privacy is about transparency—knowing and being able to control the data that companies have, how it is used, why it is shared, whether and for how long it’s retained, and how much a company is collecting to start with. 

Conflating the two is convenient, Caltrider adds, because “security has gotten better, while privacy has gotten way worse” since she began tracking products in 2017. “The devices and apps now collect so much more personal information,” she says. 

Company representatives also sometimes lean on subtle distinctions, like the difference between “sharing” data and selling it, that make their privacy practices particularly hard for non-experts to parse. When a company says it will never sell your data, that doesn’t mean it won’t use it or share it with others for analysis.

These expansive definitions of data collection are often acceptable under companies’ vaguely worded privacy policies, virtually all of which contain some language permitting the use of data for the purposes of “improving products and services”—language that Rich calls so broad as to “permit basically anything.”

“Developers are not traditionally very good [at] security stuff.” Their attitude becomes “Try to get the functionality, and if the functionality is working, ship the product. And then the scandals come out.” 

Indeed, MIT Technology Review reviewed 12 robot vacuum privacy policies, and all of them, including iRobot’s, contained similar language on “improving products and services.” Most of the companies to which MIT Technology Review reached out for comment did not respond to questions on whether “product improvement” would include machine-learning algorithms. But Roborock and iRobot say it would. 

And because the United States lacks a comprehensive data privacy law—instead relying on a mishmash of state laws, most notably the California Consumer Privacy Act—these privacy policies are what shape companies’ legal responsibilities, says Brookman. “A lot of privacy policies will say, you know, we reserve the right to share your data with select partners or service providers,” he notes. That means consumers are likely agreeing to have their data shared with additional companies, whether they are familiar with them or not.

Brookman explains that the legal barriers companies must clear to collect data directly from consumers are fairly low. The FTC, or state attorneys general, may step in if there are either “unfair” or “deceptive” practices, he notes, but these are narrowly defined: unless a privacy policy specifically says “Hey, we’re not going to let contractors look at your data” and they share it anyway, Brookman says, companies are “probably okay on deception, which is the main way” for the FTC to “enforce privacy historically.” Proving that a practice is unfair, meanwhile, carries additional burdens—including proving harm. “The courts have never really ruled on it,” he adds.

Most companies’ privacy policies do not even mention the audiovisual data being captured, with a few exceptions. iRobot’s privacy policy notes that it collects audiovisual data only if an individual shares images via its mobile app. LG’s privacy policy for the camera- and AI-enabled Hom-Bot Turbo+ explains that its app collects audiovisual data, including “audio, electronic, visual, or similar information, such as profile photos, voice recordings, and video recordings.” And the privacy policy for Samsung’s Jet Bot AI+ Robot Vacuum with lidar and its Powerbot R7070, both of which have cameras, says the company will collect “information you store on your device, such as photos, contacts, text logs, touch interactions, settings, and calendar information” and “recordings of your voice when you use voice commands to control a Service or contact our Customer Service team.” Meanwhile, Roborock’s privacy policy makes no mention of audiovisual data, though company representatives tell MIT Technology Review that consumers in China have the option to share it. 

iRobot cofounder Helen Greiner, who now runs a startup called Tertill that sells a garden-weeding robot, emphasizes that in collecting all this data, companies are not trying to violate their customers’ privacy. They’re just trying to build better products—or, in iRobot’s case, “make a better clean,” she says. 

Still, even the best efforts of companies like iRobot clearly leave gaps in privacy protection. “It’s less like a maliciousness thing, but just incompetence,” says Giese, the IoT hacker. “Developers are not traditionally very good [at] security stuff.” Their attitude becomes “Try to get the functionality, and if the functionality is working, ship the product.” 

“And then the scandals come out,” he adds.

Robot vacuums are just the beginning

The appetite for data will only increase in the years ahead. Vacuums are just a tiny subset of the connected devices that are proliferating across our lives, and the biggest names in robot vacuums—including iRobot, Samsung, Roborock, and Dyson—are vocal about ambitions much grander than automated floor cleaning. Robotics, including home robotics, has long been the real prize.  

Consider how Mario Munich, then the senior vice president of technology at iRobot, explained the company’s goals back in 2018. In a presentation on the Roomba 980, the company’s first computer-vision vacuum, he showed images from the device’s vantage point—including one of a kitchen with a table, chairs, and stools—next to how they would be labeled and perceived by the robot’s algorithms. “The challenge is not with the vacuuming. The challenge is with the robot,” Munich explained. “We would like to know the environment so we can change the operation of the robot.” 

This bigger mission is evident in what Scale’s data annotators were asked to label—not items on the floor that should be avoided (a feature that iRobot promotes), but items like “cabinet,” “kitchen countertop,” and “shelf,” which together help the Roomba J series device recognize the entire space in which it operates. 

The companies making robot vacuums are already investing in other features and devices that will bring us closer to a robotics-enabled future. The latest Roombas can be voice controlled through Nest and Alexa, and they recognize over 80 different objects around the home. Meanwhile, Ecovacs’s Deebot X1 robot vacuum has integrated the company’s proprietary voice assistance, while Samsung is one of several companies developing “companion robots” to keep humans company. Miele, which sells the RX2 Scout Home Vision, has turned its focus toward other smart appliances, like its camera-enabled smart oven.

And if iRobot’s $1.7 billion acquisition by Amazon moves forward—pending approval by the FTC, which is considering the merger’s effect on competition in the smart-home marketplace—Roombas are likely to become even more integrated into Amazon’s vision for the always-on smart home of the future.

Perhaps unsurprisingly, public policy is starting to reflect the growing public concern with data privacy. From 2018 to 2022, there was a marked increase in the number of states considering and passing privacy protections, such as the California Consumer Privacy Act and the Illinois Biometric Information Privacy Act. At the federal level, the FTC is considering new rules to crack down on harmful commercial surveillance and lax data security practices—including those used in training data. In two cases, the FTC has taken action against the undisclosed use of customer data to train artificial intelligence, ultimately forcing the companies, Weight Watchers International and the photo app developer Everalbum, to delete both the data collected and the algorithms built from it. 

Still, none of these piecemeal efforts address the growing data annotation market and its proliferation of companies based around the world or contracting with global gig workers, who operate with little oversight, often in countries with even fewer data protection laws. 

When I spoke this summer to Greiner, she said that she personally was not worried about iRobot’s implications for privacy—though she understood why some people might feel differently. Ultimately, she framed privacy in terms of consumer choice: anyone with real concerns could simply not buy that device. 

“Everybody needs to make their own privacy decisions,” she told me. “And I can tell you, overwhelmingly, people make the decision to have the features as long as they are delivered at a cost-effective price point.”

But not everyone agrees with this framework, in part because it is so challenging for consumers to make fully informed choices. Consent should be more than just “a piece of paper” to sign or a privacy policy to glance through, says Vitak, the University of Maryland information scientist. 

True informed consent means “that the person fully understands the procedure, they fully understand the risks … how those risks will be mitigated, and … what their rights are,” she explains. But this rarely happens in a comprehensive way—especially when companies market adorable robot helpers promising clean floors at the click of a button.

Do you have more information about how companies collect data to train AI? Did you participate in data collection efforts by iRobot or other robot vacuum companies? We’d love to hear from you and will respect requests for anonymity. Please reach out at tips@technologyreview.com or securely on Signal at 626.765.5489. 

Additional research by Tammy Xu.

Watch this robot dog scramble over tricky terrain just by using its camera

When Ananye Agarwal took his dog out for a walk up and down the steps in the local park near Carnegie Mellon University, other dogs stopped in their tracks. 

That’s because Agarwal’s dog was a robot—and a special one at that. Unlike other robots, which tend to rely heavily on an internal map to get around, his robot uses a built-in camera. Agarwal, a PhD student at Carnegie Mellon, is one of a group of researchers that has developed a technique allowing robots to walk on tricky terrain using computer vision and reinforcement learning. The researchers hope their work will help make it easier for robots to be deployed in the real world.  

Unlike existing robots on the market, such as Boston Dynamics’ Spot, which moves around using internal maps, this robot uses cameras alone to guide its movements in the wild, says Ashish Kumar, a graduate student at UC Berkeley, who is one of the authors of a paper describing the work; it’s due to be presented at the Conference on Robot Learning next month. Other attempts to use cues from cameras to guide robot movement have been limited to flat terrain, but this team managed to get its robot to walk up stairs, climb on stones, and hop over gaps. 

Grid of clips of the robot dog walking on stairs

COURTESY OF THE RESEARCHERS

The four-legged robot is first trained to move around different environments in a simulator, so it has a general idea of what walking in a park or up and down stairs is like. When it’s deployed in the real world, visuals from a single camera in the front of the robot guide its movement. The robot learns to adjust its gait to navigate things like stairs and uneven ground using reinforcement learning, an AI technique that allows systems to improve through trial and error. 
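
For readers unfamiliar with the technique, here is a toy Python sketch of that trial-and-error loop, using tabular Q-learning on a made-up one-dimensional “terrain.” It illustrates the general idea only; the researchers’ actual system trains a far more complex policy on camera input inside a physics simulator.

```python
# Toy Q-learning: an agent learns, by trial and error, to reach position 10
# on a hypothetical 1-D terrain. Purely illustrative of reinforcement learning.
import random

class Env:
    """Made-up environment: move right with small (+1) or big (+2) steps."""
    def reset(self):
        self.pos = 0
        return self.pos

    def step(self, action):  # action: 0 = small step, 1 = big step
        self.pos += 1 if action == 0 else 2
        done = self.pos >= 10
        reward = 1.0 if done else -0.01  # small cost per move, payoff at goal
        return self.pos, reward, done

env = Env()
q = {}  # state -> estimated value of each action
epsilon, alpha, gamma = 0.1, 0.5, 0.9

for episode in range(500):
    state, done = env.reset(), False
    while not done:
        values = q.setdefault(state, [0.0, 0.0])
        # Explore occasionally; otherwise take the action that looks best so far.
        if random.random() < epsilon:
            action = random.randrange(2)
        else:
            action = values.index(max(values))
        next_state, reward, done = env.step(action)
        next_values = q.setdefault(next_state, [0.0, 0.0])
        # Nudge the estimate toward observed reward plus discounted future value.
        target = reward + gamma * max(next_values) * (not done)
        values[action] += alpha * (target - values[action])
        state = next_state
```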

Removing the need for an internal map makes the robot more robust, because it is no longer constrained by potential errors in a map, says Deepak Pathak, an assistant professor at Carnegie Mellon, who was part of the team. 

It is extremely difficult for a robot to translate raw pixels from a camera into the kind of precise and balanced movement needed to navigate its surroundings, says Jie Tan, a research scientist at Google, who was not involved in the study. He says the work is the first time he’s seen a small and low-cost robot demonstrate such impressive mobility. 

The team has achieved a “breakthrough in robot learning and autonomy,” says Guanya Shi, a researcher at the University of Washington who studies machine learning and robotic control, who also was not involved in the research. 

Akshara Rai, a research scientist at Facebook AI Research who works on machine learning and robotics, and was not involved in this work, agrees. 

“This work is a promising step toward building such perceptive legged robots and deploying them in the wild,” says Rai.

However, while the team’s work is helpful for improving how the robot walks, it won’t help the robot work out where to go in advance, Rai says. “Navigation is important for deploying robots in the real world,” she says.

More work is needed before the robot dog will be able to prance around parks or fetch things in the house. While the robot may understand depth through its front camera, it cannot cope with situations such as slippery ground or tall grass, Tan says; it could step into puddles or get stuck in mud.