AI trained on AI garbage spits out AI garbage

AI models work by training on huge swaths of data from the internet. But as AI is increasingly being used to pump out web pages filled with junk content, that process is in danger of being undermined.

New research published in Nature shows that the quality of a model’s output gradually degrades when AI trains on AI-generated data. As subsequent models produce output that is then used as training data for future models, the effect gets worse.

Ilia Shumailov, a computer scientist from the University of Oxford, who led the study, likens the process to taking photos of photos. “If you take a picture and you scan it, and then you print it, and you repeat this process over time, basically the noise overwhelms the whole process,” he says. “You’re left with a dark square.” The equivalent of the dark square for AI is called “model collapse,” he says, meaning the model just produces incoherent garbage. 

This research may have serious implications for today’s largest AI models, because they rely on the internet for their training data. GPT-3, for example, was trained in part on data from Common Crawl, an online repository of over 3 billion web pages. And the problem is likely to get worse as more and more AI-generated junk websites clutter up the internet.

Current AI models aren’t going to collapse outright, says Shumailov, but the effects could still be substantive: improvements will slow down, and performance may suffer.

To determine the potential effect on performance, Shumailov and his colleagues fine-tuned a large language model (LLM) on a set of data from Wikipedia, then fine-tuned each successive model on the previous generation’s output, for nine generations in total. The team measured how nonsensical the output became using a “perplexity score,” which reflects how confidently an AI model predicts the next part of a sequence; a higher score translates to a less accurate model.
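For a concrete sense of the metric, here is a minimal sketch (not the study’s code) of how perplexity can be computed from the probabilities a model assigns to the tokens it actually observes; the probability values are made up for illustration.

```python
import math

def perplexity(token_probs):
    # Perplexity is the exponential of the average negative log-likelihood
    # the model assigns to the observed tokens. Higher = more "surprised" = worse.
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(avg_nll)

# A confident model puts high probability on the true next tokens...
print(round(perplexity([0.6, 0.5, 0.7, 0.4]), 1))     # ~1.9
# ...while a degraded model spreads probability thinly, so perplexity climbs.
print(round(perplexity([0.05, 0.1, 0.02, 0.08]), 1))  # ~18.8
```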

The models trained on other models’ outputs had higher perplexity scores. For example, for each generation, the team asked the model for the next sentence after the following input:

“some started before 1360—was typically accomplished by a master mason and a small team of itinerant masons, supplemented by local parish labourers, according to Poyntz Wright. But other authors reject this model, suggesting instead that leading architects designed the parish church towers based on early examples of Perpendicular.”

On the ninth and final generation, the model returned the following:

“architecture. In addition to being home to some of the world’s largest populations of black @-@ tailed jackrabbits, white @-@ tailed jackrabbits, blue @-@ tailed jackrabbits, red @-@ tailed jackrabbits, yellow @-.”

Shumailov explains what he thinks is going on using this analogy: Imagine you’re trying to find the least likely student name in a school. You could go through every name, but that would take too long. Instead, you look at 100 of the 1,000 student names. You get a pretty good estimate, but it’s probably not the correct answer. Now imagine that another person comes along and makes an estimate based on your 100 names, but selects only 50 of them. This second person’s estimate is going to be even further off.

“You can certainly imagine that the same happens with machine learning models,” he says. “So if the first model has seen half of the internet, then perhaps the second model is not going to ask for half of the internet, but actually scrape the latest 100,000 tweets, and fit the model on top of it.”
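The analogy is easy to simulate. The toy script below is my own illustration, not code from the paper: it resamples a made-up roster of student names generation after generation, and the rare names, the tails of the distribution, are the first to disappear, which is essentially what happens to rare information when models train on other models’ output.

```python
import random

random.seed(0)

# Toy "school": a few common names plus 250 names that each appear only once.
roster = (["Alex"] * 300 + ["Sam"] * 250 + ["Maria"] * 200 +
          [f"Rare{i}" for i in range(250)])

data = roster
for generation in range(1, 6):
    # Each generation only "sees" a resample of the previous generation's data,
    # the way a model trained on another model's output sees a narrowed slice.
    data = random.choices(data, k=len(data) // 2)
    rare_left = sum(1 for name in set(data) if name.startswith("Rare"))
    print(f"generation {generation}: {len(set(data))} distinct names, "
          f"{rare_left} rare names left")
```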

Additionally, the internet doesn’t hold an unlimited amount of data. To feed their appetite for more, future AI models may need to train on synthetic data—or data that has been produced by AI.   

“Foundation models really rely on the scale of data to perform well,” says Shayne Longpre, who studies how LLMs are trained at the MIT Media Lab, and who didn’t take part in this research. “And they’re looking to synthetic data under curated, controlled environments to be the solution to that. Because if they keep crawling more data on the web, there are going to be diminishing returns.”

Matthias Gerstgrasser, an AI researcher at Stanford who authored a different paper examining model collapse, says adding synthetic data to real-world data instead of replacing it doesn’t cause any major issues. But he adds: “One conclusion all the model collapse literature agrees on is that high-quality and diverse training data is important.”

Another effect of this degradation over time is that information pertaining to minority groups is heavily distorted in the model, because it tends to overfocus on samples that are more prevalent in the training data.

In current models, this may affect underrepresented languages as they require more synthetic (AI-generated) data sets, says Robert Mahari, who studies computational law at the MIT Media Lab (he did not take part in the research).

One idea that might help avoid degradation is to make sure the model gives more weight to the original human-generated data. Another part of Shumailov’s study allowed future generations to sample 10% of the original data set, which mitigated some of the negative effects. 
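As a rough sketch of that mitigation (the function and its names are hypothetical; the study describes the idea, not this code), each generation’s training set keeps a fixed slice of the original human-written corpus rather than consisting of synthetic text alone.

```python
import random

def build_training_set(original_human_data, synthetic_data,
                       human_fraction=0.10, size=10_000):
    # Reserve a fixed share of every generation's training data for the
    # original human-generated corpus; fill the rest with the latest synthetic text.
    n_human = int(size * human_fraction)
    n_synthetic = size - n_human
    return (random.sample(original_human_data, n_human) +
            random.sample(synthetic_data, n_synthetic))
```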

Weighting the original data in this way would require a way of tracing data from its human creators through subsequent generations of models, known as data provenance.

But provenance requires some way to filter the internet into human-generated and AI-generated content, which hasn’t been cracked yet. Though a number of tools now exist that aim to determine whether text is AI-generated, they are often inaccurate.

“Unfortunately, we have more questions than answers,” says Shumailov. “But it’s clear that it’s important to know where your data comes from and how much you can trust it to capture a representative sample of the data you’re dealing with.”

Google’s new weather prediction system combines AI with traditional physics

Researchers from Google have built a new weather prediction model that combines machine learning with more conventional techniques, potentially yielding accurate forecasts at a fraction of the current cost. 

The model, called NeuralGCM and described in a paper in Nature today, bridges a divide that’s grown among weather prediction experts in the last several years. 

While new machine-learning techniques that predict weather by learning from years of past data are extremely fast and efficient, they can struggle with long-term predictions. General circulation models, on the other hand, which have dominated weather prediction for the last 50 years, use complex equations to model changes in the atmosphere and give accurate projections, but they are exceedingly slow and expensive to run. Experts are divided on which tool will be most reliable going forward. But the new model from Google instead attempts to combine the two. 

“It’s not sort of physics versus AI. It’s really physics and AI together,” says Stephan Hoyer, an AI researcher at Google Research and a coauthor of the paper. 

The system still uses a conventional model to work out some of the large atmospheric changes required to make a prediction. It then incorporates AI, which tends to do well where those larger models fall flat—typically for predictions on scales smaller than about 25 kilometers, like those dealing with cloud formations or regional microclimates (San Francisco’s fog, for example). “That’s where we inject AI very selectively to correct the errors that accumulate on small scales,” Hoyer says.
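Conceptually, one hybrid step looks something like the sketch below. This is a generic illustration of the physics-plus-learned-correction idea, not NeuralGCM’s actual code; the toy physics update and the placeholder correction are invented for the example.

```python
import numpy as np

def hybrid_forecast_step(state, physics_step, learned_correction, dt):
    # A conventional dynamical core advances the large-scale state, then a
    # learned component nudges it to correct errors that build up at small scales.
    coarse_next = physics_step(state, dt)
    return coarse_next + learned_correction(coarse_next)

# Toy stand-ins so the sketch runs: a crude advection-like update and a tiny
# placeholder "network" for the sub-grid correction.
physics_step = lambda s, dt: s + dt * np.gradient(s)
learned_correction = lambda s: 0.01 * np.tanh(s)

state = np.random.rand(64)  # toy 1-D atmospheric field
state = hybrid_forecast_step(state, physics_step, learned_correction, dt=0.1)
```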

The result, the researchers say, is a model that can produce quality predictions faster with less computational power. They say NeuralGCM is as accurate as one- to 15-day forecasts from the European Centre for Medium-Range Weather Forecasts (ECMWF), which is a partner organization in the research.

But the real promise of technology like this is not in better weather predictions for your local area, says Aaron Hill, an assistant professor at the School of Meteorology at the University of Oklahoma, who was not involved in this research. Instead, it’s in larger-scale climate events that are prohibitively expensive to model with conventional techniques. The possibilities could range from predicting tropical cyclones with more notice to modeling more complex climate changes that are years away. 

“It’s so computationally intensive to simulate the globe over and over again or for long periods of time,” Hill says. That means the best climate models are hamstrung by the high costs of computing power, which presents a real bottleneck to research. 

AI-based models are indeed more compact. Once trained, typically on 40 years of historical weather data from ECMWF, a machine-learning model like Google’s GraphCast can run on less than 5,500 lines of code, compared with the nearly 377,000 lines required for the model from the National Oceanic and Atmospheric Administration, according to the paper. 

NeuralGCM, according to Hill, seems to make a strong case that AI can be brought in for particular elements of weather modeling to make things faster, while still keeping the strengths of conventional systems.

“We don’t have to throw away all the knowledge that we’ve gained over the last 100 years about how the atmosphere works,” he says. “We can actually integrate that with the power of AI and machine learning as well.”

Hoyer says using the model to predict short-term weather has been useful for validating its predictions, but that the goal is indeed to be able to use it for longer-term modeling, particularly for extreme weather risk. 

NeuralGCM will be open source. While Hoyer says he looks forward to having climate scientists use it in their research, the model may also be of interest to more than just academics. Commodities traders and agricultural planners pay top dollar for high-resolution predictions, and the models used by insurance companies for products like flood or extreme weather insurance are struggling to account for the impact of climate change. 

While many of the AI skeptics in weather forecasting have been won over by recent developments, according to Hill, the fast pace is hard for the research community to keep up with. “It’s gangbusters,” he says—it seems as if a new model is released by Google, Nvidia, or Huawei every two months. That makes it difficult for researchers to actually sort out which of the new tools will be most useful and apply for research grants accordingly. 

“The appetite is there [for AI],” Hill says. “But I think a lot of us still are waiting to see what happens.”

Correction: This story was updated to clarify that Stephan Hoyer is a researcher at Google Research, not Google DeepMind.

AI companies promised to self-regulate one year ago. What’s changed?

One year ago, on July 21, 2023, seven leading AI companies—Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI—committed with the White House to a set of eight voluntary commitments on how to develop AI in a safe and trustworthy way.

These included promises to do things like improve the testing and transparency around AI systems, and share information on potential harms and risks. 

On the first anniversary of the voluntary commitments, MIT Technology Review asked the AI companies that signed the commitments for details on their work so far. Their replies show that the tech sector has made some welcome progress, with big caveats.

The voluntary commitments came at a time when generative AI mania was perhaps at its frothiest, with companies racing to launch their own models and make them bigger and better than their competitors’. At the same time, we started to see developments such as fights over copyright and deepfakes. A vocal lobby of influential tech players, such as Geoffrey Hinton, had also raised concerns that AI could pose an existential risk to humanity. Suddenly, everyone was talking about the urgent need to make AI safe, and regulators everywhere were under pressure to do something about it.

Until very recently, AI development has been a Wild West. Traditionally, the US has been loath to regulate its tech giants, instead relying on them to regulate themselves. The voluntary commitments are a good example of that: they were some of the first prescriptive rules for the AI sector in the US, but they remain voluntary and unenforceable. The White House has since issued an executive order, which expands on the commitments and also applies to other tech companies and government departments. 

“One year on, we see some good practices towards their own products, but [they’re] nowhere near where we need them to be in terms of good governance or protection of rights at large,” says Merve Hickok, the president and research director of the Center for AI and Digital Policy, who reviewed the companies’ replies as requested by MIT Technology Review. Many of these companies continue to push unsubstantiated claims about their products, such as saying that they can supersede human intelligence and capabilities, adds Hickok. 

One trend that emerged from the tech companies’ answers is that they are doing more to pursue technical fixes such as red-teaming (in which humans probe AI models for flaws) and watermarks for AI-generated content.

But it’s not clear what the commitments have changed and whether the companies would have implemented these measures anyway, says Rishi Bommasani, the society lead at the Stanford Center for Research on Foundation Models, who also reviewed the responses for MIT Technology Review.  

One year is a long time in AI. Since the voluntary commitments were signed, Inflection AI founder Mustafa Suleyman has left the company for Microsoft, where he now leads its AI efforts. Inflection declined to comment.

“We’re grateful for the progress leading companies have made toward fulfilling their voluntary commitments in addition to what is required by the executive order,” says Robyn Patterson, a spokesperson for the White House. But, Patterson adds, the president continues to call on Congress to pass bipartisan legislation on AI. 

Without comprehensive federal legislation, the best the US can do right now is to demand that companies follow through on these voluntary commitments, says Brandie Nonnecke, the director of the CITRIS Policy Lab at UC Berkeley. 

But it’s worth bearing in mind that “these are still companies that are essentially writing the exam by which they are evaluated,” says Nonnecke. “So we have to think carefully about whether or not they’re … verifying themselves in a way that is truly rigorous.” 

Here’s our assessment of the progress AI companies have made in the past year.

Commitment 1

The companies commit to internal and external security testing of their AI systems before their release. This testing, which will be carried out in part by independent experts, guards against some of the most significant sources of AI risks, such as biosecurity and cybersecurity, as well as its broader societal effects.

All the companies (excluding Inflection, which chose not to comment) say they conduct red-teaming exercises in which both internal and external testers probe their models for flaws and risks. OpenAI says it has a separate preparedness team that tests models for cybersecurity, chemical, biological, radiological, and nuclear threats, and for situations where a sophisticated AI model might do harmful things itself or persuade a person to do them. Anthropic and OpenAI also say they conduct these tests with external experts before launching their new models. For example, for the launch of its latest model, Claude 3.5, Anthropic conducted predeployment testing with experts at the UK’s AI Safety Institute. It has also allowed METR, a research nonprofit, to do an “initial exploration” of Claude 3.5’s capabilities for autonomy.

Google says it also conducts internal red-teaming to test the boundaries of its model, Gemini, around election-related content, societal risks, and national security concerns. Microsoft says it has worked with third-party evaluators at NewsGuard, an organization advancing journalistic integrity, to evaluate risks and mitigate the risk of abusive deepfakes in Microsoft’s text-to-image tool. In addition to red-teaming, Meta says it evaluated its latest model, Llama 3, to understand its performance in a series of risk areas like weapons, cyberattacks, and child exploitation.

But when it comes to testing, it’s not enough to just report that a company is taking actions, says Bommasani. For example, Amazon and Anthropic said they had worked with the nonprofit Thorn to combat risks to child safety posed by AI. Bommasani would have wanted to see more specifics about how the interventions that companies are implementing actually reduce those risks. 

“It should become clear to us that it’s not just that companies are doing things but those things are having the desired effect,” Bommasani says.  

RESULT: Good. The push for red-teaming and testing for a wide range of risks is a good and important one. However, Hickok would have liked to see independent researchers get broader access to companies’ models. 

Commitment 2

The companies commit to sharing information across the industry and with governments, civil society, and academia on managing AI risks. This includes best practices for safety, information on attempts to circumvent safeguards, and technical collaboration.

After they signed the commitments, Anthropic, Google, Microsoft, and OpenAI founded the Frontier Model Forum, a nonprofit that aims to facilitate discussions and actions on AI safety and responsibility. Amazon and Meta have also joined.  

Engaging with nonprofits that the AI companies funded themselves may not be in the spirit of the voluntary commitments, says Bommasani. But the Frontier Model Forum could be a way for these companies to cooperate with each other and pass on information about safety, which they normally could not do as competitors, he adds. 

“Even if they’re not going to be transparent to the public, one thing you might want is for them to at least collectively figure out mitigations to actually reduce risk,” says Bommasani. 

All seven signatories are also part of the Artificial Intelligence Safety Institute Consortium (AISIC), established by the National Institute of Standards and Technology (NIST), which develops guidelines and standards for AI policy and evaluation of AI performance. It is a large consortium consisting of a mix of public- and private-sector players. Google, Microsoft, and OpenAI also have representatives at the UN’s High-Level Advisory Body on Artificial Intelligence.

Many of the labs also highlighted their research collaborations with academics. For example, Google is part of MLCommons, where it worked with academics on a cross-industry AI Safety Benchmark. Google also says it actively contributes tools and resources, such as computing credit, to projects like the National Science Foundation’s National AI Research Resource pilot, which aims to democratize AI research in the US.

Many of the companies also contributed to guidance by the Partnership on AI, another nonprofit founded by Amazon, Facebook, Google, DeepMind, Microsoft, and IBM, on the deployment of foundation models. 

RESULT: More work is needed. More information sharing is a welcome step as the industry tries to collectively make AI systems safe and trustworthy. However, it’s unclear how much of the effort advertised will actually lead to meaningful changes and how much is window dressing. 

Commitment 3

The companies commit to investing in cybersecurity and insider threat safeguards to protect proprietary and unreleased model weights. These model weights are the most essential part of an AI system, and the companies agree that it is vital that the model weights be released only when intended and when security risks are considered.

Many of the companies have implemented new cybersecurity measures in the past year. For example, Microsoft has launched the Secure Future Initiative to address the growing scale of cyberattacks. The company says its model weights are encrypted to mitigate the potential risk of model theft, and it applies strong identity and access controls when deploying highly capable proprietary models. 

Google too has launched an AI Cyber Defense Initiative. In May OpenAI shared six new measures it is developing to complement its existing cybersecurity practices, such as extending cryptographic protection to AI hardware. It also has a Cybersecurity Grant Program, which gives researchers access to its models to build cyber defenses. 

Amazon mentioned that it has also taken measures against attacks specific to generative AI, such as data poisoning and prompt injection, in which someone uses prompts to direct the language model to ignore its previous instructions and safety guardrails.

Just a couple of days after signing the commitments, Anthropic published details about its protections, which include common cybersecurity practices such as controlling who has access to the models and sensitive assets such as model weights, and inspecting and controlling the third-party supply chain. The company also works with independent assessors to evaluate whether the controls it has designed meet its cybersecurity needs.

RESULT: Good. All of the companies did say they had taken extra measures to protect their models, although it doesn’t seem there is much consensus on the best way to protect AI models. 

Commitment 4

The companies commit to facilitating third-party discovery and reporting of vulnerabilities in their AI systems. Some issues may persist even after an AI system is released and a robust reporting mechanism enables them to be found and fixed quickly. 

For this commitment, one of the most popular responses was to implement bug bounty programs, which reward people who find flaws in AI systems. Anthropic, Google, Microsoft, Meta, and OpenAI all have one for AI systems. Anthropic and Amazon also said they have forms on their websites where security researchers can submit vulnerability reports. 

It will likely take us years to figure out how to do third-party auditing well, says Brandie Nonnecke. “It’s not just a technical challenge. It’s a socio-technical challenge. And it just kind of takes years for us to figure out not only the technical standards of AI, but also socio-technical standards, and it’s messy and hard,” she says. 

Nonnecke says she worries that the first companies to implement third-party audits might set poor precedents for how to think about and address the socio-technical risks of AI. For example, audits might define, evaluate, and address some risks but overlook others.

RESULT: More work is needed. Bug bounties are great, but they’re nowhere near comprehensive enough. New laws, such as the EU’s AI Act, will require tech companies to conduct audits, and it would have been great to see tech companies share successful examples of such audits. 

Commitment 5

The companies commit to developing robust technical mechanisms to ensure that users know when content is AI generated, such as a watermarking system. This action enables creativity with AI to flourish but reduces the dangers of fraud and deception.

Many of the companies have built watermarks for AI-generated content. For example, Google launched SynthID, a watermarking tool for image, audio, text, and video generated by Gemini. Meta has a tool called Stable Signature for images, and AudioSeal for AI-generated speech. Amazon now adds an invisible watermark to all images generated by its Titan Image Generator. OpenAI also uses watermarks in Voice Engine, its custom voice model, and has built an image-detection classifier for images generated by DALL-E 3. Anthropic was the only company that hadn’t built a watermarking tool, because watermarks are mainly used for images, which its Claude models don’t generate.

All the companies excluding Inflection, Anthropic, and Meta are also part of the Coalition for Content Provenance and Authenticity (C2PA), an industry coalition that embeds information about when content was created, and whether it was created or edited by AI, into an image’s metadata. Microsoft and OpenAI automatically attach the C2PA’s provenance metadata to images generated with DALL-E 3 and videos generated with Sora. While Meta is not a member, it announced it is using the C2PA standard to identify AI-generated images on its platforms. 

The six companies that signed the commitments have a “natural preference to more technical approaches to addressing risk,” says Bommasani, “and certainly watermarking in particular has this flavor.”  

“The natural question is: Does [the technical fix] meaningfully make progress and address the underlying social concerns that motivate why we want to know whether content is machine generated or not?” he adds. 

RESULT: Good. This is an encouraging result overall. While watermarking remains experimental and is still unreliable, it’s still good to see research around it and a commitment to the C2PA standard. It’s better than nothing, especially during a busy election year.  

Commitment 6

The companies commit to publicly reporting their AI systems’ capabilities, limitations, and areas of appropriate and inappropriate use. This report will cover both security risks and societal risks, such as the effects on fairness and bias.

The White House’s commitments leave a lot of room for interpretation. For example, companies can technically meet this public reporting commitment with widely varying levels of transparency, as long as they do something in that general direction. 

The most common solutions tech companies offered here were so-called model cards. Each company calls them by a slightly different name, but in essence they act as a kind of product description for AI models. They can address anything from the model’s capabilities and limitations (including how it measures up against benchmarks on fairness and explainability) to veracity, robustness, governance, privacy, and security. Anthropic said it also tests models for potential safety issues that may arise later.

Microsoft has published an annual Responsible AI Transparency Report, which provides insight into how the company builds applications that use generative AI, makes decisions, and oversees the deployment of those applications. The company also says it gives clear notice on where and how AI is used within its products.

RESULT: More work is needed. One area of improvement for AI companies would be to increase transparency on their governance structures and on the financial relationships between companies, Hickok says. She would also have liked to see companies be more public about data provenance, model training processes, safety incidents, and energy use. 

Commitment 7

The companies commit to prioritizing research on the societal risks that AI systems can pose, including on avoiding harmful bias and discrimination, and protecting privacy. The track record of AI shows the insidiousness and prevalence of these dangers, and the companies commit to rolling out AI that mitigates them. 

Tech companies have been busy on the safety research front, and they have embedded their findings into products. Amazon has built guardrails for Amazon Bedrock that can detect hallucinations and can apply safety, privacy, and truthfulness protections. Anthropic says it employs a team of researchers dedicated to researching societal risks and privacy. In the past year, the company has pushed out research on deception, jailbreaking, strategies to mitigate discrimination, and emergent capabilities such as models’ ability to tamper with their own code or engage in persuasion. And OpenAI says it has trained its models to avoid producing hateful content and to refuse to generate hateful or extremist output. It trained GPT-4V to refuse many requests that require drawing on stereotypes to answer. Google DeepMind has also released research on evaluating dangerous capabilities, and the company has done a study on misuses of generative AI.

All of them have poured a lot of money into this area of research. For example, Google has invested millions in a new AI Safety Fund to promote research in the field through the Frontier Model Forum. Microsoft says it has committed $20 million in compute credits to researching societal risks through the National AI Research Resource and started the Accelerating Foundation Models Research program, its own AI research accelerator for academics. The company has also hired 24 research fellows focusing on AI and society.

RESULT: Very good. This is an easy commitment to meet, as the signatories are some of the biggest and richest corporate AI research labs in the world. While more research into how to make AI systems safe is a welcome step, critics say that the focus on safety research takes attention and resources from AI research that focuses on more immediate harms, such as discrimination and bias. 

Commitment 8

The companies commit to develop and deploy advanced AI systems to help address society’s greatest challenges. From cancer prevention to mitigating climate change to so much in between, AI—if properly managed—can contribute enormously to the prosperity, equality, and security of all.

Since making this commitment, tech companies have tackled a diverse set of problems. For example, Pfizer used Claude to assess trends in cancer treatment research after gathering relevant data and scientific content, and Gilead, an American biopharmaceutical company, used generative AI from Amazon Web Services to do feasibility evaluations on clinical studies and analyze data sets. 

Google DeepMind has a particularly strong track record in pushing out AI tools that can help scientists. For example, AlphaFold 3 can predict the structure and interactions of all life’s molecules. AlphaGeometry can solve geometry problems at a level comparable with the world’s brightest high school mathematicians. And GraphCast is an AI model that is able to make medium-range weather forecasts. Meanwhile, Microsoft has used satellite imagery and AI to improve responses to wildfires in Maui and map climate-vulnerable populations, which helps researchers expose risks such as food insecurity, forced migration, and disease. 

OpenAI, meanwhile, has announced partnerships and funding for various research projects, such as one looking at how multimodal AI models can be used safely by educators and by scientists in laboratory settings. It has also offered credits to help researchers use its platforms during hackathons on clean energy development.

RESULT: Very good. Some of the work on using AI to boost scientific discovery or predict weather events is genuinely exciting. AI companies haven’t used AI to prevent cancer yet, but that’s a pretty high bar. 

Overall, there have been some positive changes in the way AI has been built, such as red-teaming practices, watermarking, and new ways for the industry to share best practices. However, these are only a couple of neat technical solutions to the messy socio-technical problem that is AI harm, and a lot more work is needed. One year on, it is also odd to see the commitments talk about a very particular type of AI safety that focuses on hypothetical risks, such as bioweapons, and completely fail to mention consumer protection, nonconsensual deepfakes, data and copyright, and the environmental footprint of AI models. These seem like weird omissions today.

A short history of AI, and what it is (and isn’t)

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

It’s the simplest questions that are often the hardest to answer. That applies to AI, too. Even though it’s a technology being sold as a solution to the world’s problems, nobody seems to know what it really is. It’s a label that’s been slapped on technologies ranging from self-driving cars to facial recognition, chatbots to fancy Excel. But in general, when we talk about AI, we talk about technologies that make computers do things we think need intelligence when done by people. 

For months, my colleague Will Douglas Heaven has been on a quest to go deeper to understand why everybody seems to disagree on exactly what AI is, why nobody even knows, and why you’re right to care about it. He’s been talking to some of the biggest thinkers in the field, asking them, simply: What is AI? It’s a great piece that looks at the past and present of AI to see where it is going next. You can read it here.

Here’s a taste of what to expect: 

Artificial intelligence almost wasn’t called “artificial intelligence” at all. The computer scientist John McCarthy is credited with coming up with the term in 1955 when writing a funding application for a summer research program at Dartmouth College in New Hampshire. But more than one of McCarthy’s colleagues hated it. “The word ‘artificial’ makes you think there’s something kind of phony about this,” said one. Others preferred the terms “automata studies,” “complex information processing,” “engineering psychology,” “applied epistemology,” “neural cybernetics,”  “non-numerical computing,” “neuraldynamics,” “advanced automatic programming,” and “hypothetical automata.” Not quite as cool and sexy as AI.

AI has several zealous fandoms. AI has acolytes, with a faith-like belief in the technology’s current power and inevitable future improvement. The buzzy popular narrative is shaped by a pantheon of big-name players, from Big Tech marketers in chief like Sundar Pichai and Satya Nadella to edgelords of industry like Elon Musk and Sam Altman to celebrity computer scientists like Geoffrey Hinton. As AI hype has ballooned, a vocal anti-hype lobby has risen in opposition, ready to smack down its ambitious, often wild claims. As a result, it can feel as if different camps are talking past one another, not always in good faith.

This sometimes seemingly ridiculous debate has huge consequences that affect us all. AI has a lot of big egos and vast sums of money at stake. But more than that, these disputes matter when industry leaders and opinionated scientists are summoned by heads of state and lawmakers to explain what this technology is and what it can do (and how scared we should be). They matter when this technology is being built into software we use every day, from search engines to word-processing apps to assistants on your phone. AI is not going away. But if we don’t know what we’re being sold, who’s the dupe?

For example, meet the TESCREALists. A clunky acronym (pronounced “tes-cree-all”) replaces an even clunkier list of labels: transhumanism, extropianism, singularitarianism, cosmism, rationalism, effective altruism, and longtermism. It was coined by Timnit Gebru, founder of the Distributed AI Research Institute and former co-lead of Google’s ethical AI team, and Émile Torres, a philosopher and historian at Case Western Reserve University. Some anticipate human immortality; others predict humanity’s colonization of the stars. The common tenet is that an all-powerful technology is not only within reach but inevitable. TESCREALists believe that artificial general intelligence, or AGI, could not only fix the world’s problems but level up humanity. Gebru and Torres link several of these worldviews—with their common focus on “improving” humanity—to the racist eugenics movements of the 20th century.

Is AI math or magic? Either way, people have strong, almost religious beliefs in one or the other. “It’s offensive to some people to suggest that human intelligence could be re-created through these kinds of mechanisms,” Ellie Pavlick, who studies neural networks at Brown University, told Will. “People have strong-held beliefs about this issue—it almost feels religious. On the other hand, there’s people who have a little bit of a God complex. So it’s also offensive to them to suggest that they just can’t do it.”

Will’s piece really is the definitive look at this whole debate. No spoilers—there are no simple answers, but lots of fascinating characters and viewpoints. I’d recommend you read the whole thing here—and see if you can make your mind up about what AI really is.


Now read the rest of The Algorithm

Deeper Learning

AI can make you more creative—but it has limits

Generative AI models have made it simpler and quicker to produce everything from text passages and images to video clips and audio tracks. But while AI’s output can certainly seem creative, do these models actually boost human creativity?  

A new study looked at how people used OpenAI’s large language model GPT-4 to write short stories. The model was helpful—but only to an extent. The researchers found that while AI improved the output of less creative writers, it made little difference to the quality of the stories produced by writers who were already creative. The stories in which AI had played a part were also more similar to each other than those dreamed up entirely by humans. Read more from Rhiannon Williams.

Bits and Bytes

Robot-packed meals are coming to the frozen-food aisle
Found everywhere from airplanes to grocery stores, prepared meals are usually packed by hand. AI-powered robotics is changing that. (MIT Technology Review)

AI is poised to automate today’s most mundane manual warehouse task
Pallets are everywhere, but training robots to stack them with goods takes forever. Fixing that could be a tangible win for commercial AI-powered robots. (MIT Technology Review)

The Chinese government is going all-in on autonomous vehicles
The government is finally allowing Tesla to bring its Full Self-Driving feature to China. New government permits let companies test driverless cars on the road and allow cities to build smart road infrastructure that will tell these cars where to go. (MIT Technology Review)

The US and its allies took down a Russian AI bot farm on X
The US seized control of a sophisticated Russian operation that used AI to push propaganda through nearly a thousand covert accounts on the social network X. Western intelligence agencies traced the propaganda mill to an officer of the Russian FSB intelligence force and to a former senior editor at state-controlled publication RT, formerly called Russia Today. (The Washington Post)

AI investors are starting to wonder: Is this just a bubble?
After a massive investment in the language-model boom, the biggest beneficiary is Nvidia, which designs and sells the best chips for training and running modern AI models. Investors are now starting to ask what LLMs are actually going to be used for, and when they will start making them money. (New York magazine)

Goldman Sachs thinks AI is overhyped, wildly expensive, and unreliable
Meanwhile, the major investment bank published a research paper about the economic viability of generative AI. It notes that there is “little to show for” the huge amount of spending on generative AI infrastructure and questions “whether this large spend will ever pay off in terms of AI benefits and returns.” (404 Media)

The UK politician accused of being AI is actually a real person
A hilarious story about how Mark Matlock, a candidate for the far-right Reform UK party, was accused of being a fake candidate created with AI after he didn’t show up to campaign events. Matlock has assured the press he is a real person, and he wasn’t around because he had pneumonia. (The Verge)

Building supply chain resilience with AI

If the last five years have taught businesses with complex supply chains anything, it is that resilience is crucial. In the first three months of the covid-19 pandemic, for example, supply-chain leader Amazon grew its business 44%. Its investments in supply chain resilience allowed it to deliver when its competitors could not, says Sanjeev Maddila, worldwide head of supply chain solutions at Amazon Web Services (AWS), increasing its market share and driving profits up 220%. A resilient supply chain ensures that a company can meet its customers’ needs despite inevitable disruption.

Today, businesses of all sizes must deliver to their customers against a backdrop of supply chain disruptions, with technological changes, shifting labor pools, geopolitics, and climate change adding new complexity and risk at a global scale. To succeed, they need to build resilient supply chains: fully digital operations that prioritize customers and their needs while establishing a fast, reliable, and sustainable delivery network.

The Canadian fertilizer company Nutrien, for example, operates two dozen manufacturing and processing facilities spread across the globe and nearly 2,000 retail stores in the Americas and Australia. To collect underutilized data from its industrial operations, and gain greater visibility into its supply chain, the company relies on a combination of cloud technology and artificial intelligence/machine learning (AI/ML) capabilities.

“A digital supply chain connects us from grower to manufacturer, providing visibility throughout the value chain,” says Adam Lorenz, senior director for strategic fleet and indirect procurement at Nutrien. This visibility is critical when it comes to navigating the company’s supply chain challenges, which include seasonal demands, weather dependencies, manufacturing capabilities, and product availability. The company requires real-time visibility into its fleets, for example, to identify the location of assets, see where products are moving, and determine inventory requirements.

Currently, Nutrien can locate a fertilizer or nutrient tank in a grower’s field and determine what Nutrien products are in it. By achieving that “real-time visibility” into a tank’s location and a customer’s immediate needs, Lorenz says the company “can forecast where assets are from a fill-level perspective and plan accordingly.” In turn, Nutrien can respond immediately to emerging customer needs, increasing company revenue while enhancing customer satisfaction, improving inventory management, and optimizing supply chain operations.

“For us, it’s about starting with data creation and then adding a layer of AI on top to really drive recommendations,” says Lorenz. In addition to improving product visibility and asset utilization, Lorenz says that Nutrien plans to add AI capabilities to its collaboration platforms that will make it easier for less-tech-savvy customers to take advantage of self-service capabilities and automation that accelerates processes and improves compliance with complex policies.

To meet and exceed customer expectations with differentiated service, speed, and reliability, all companies need to similarly modernize their supply chain operations. The key to doing so—and to increasing organizational resilience and sustainability—will be applying AI/ML to their extensive operational data in the cloud.

Resilience as a business differentiator

Like Nutrien, a wide variety of organizations from across industries are discovering the competitive advantages of modernizing their supply chains. A pharmaceutical company that aggregates its supply chain data for greater end-to-end visibility, for example, can provide better product tracking for critically ill customers. A retail startup undergoing meteoric growth can host its workloads in the cloud to support sudden upticks in demand while minimizing operating costs. And a transportation company can achieve inbound supply chain savings by evaluating the total distance its fleet travels to reduce mileage costs and CO2 emissions.

Download the full report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.

AI is poised to automate today’s most mundane manual warehouse task

Before almost any item reaches your door, it traverses the global supply chain on a pallet. More than 2 billion pallets are in circulation in the United States alone, and $400 billion worth of goods are exported on them annually. However, loading boxes onto these pallets is a task stuck in the past: Heavy loads and repetitive movements leave workers at high risk of injury, and in the rare instances when robots are used, they take months to program using handheld computers that have changed little since the 1980s.

Jacobi Robotics, a startup spun out of the labs of the University of California, Berkeley, says it can vastly speed up that process with AI command-and-control software. The researchers approached palletizing—one of the most common warehouse tasks—as primarily an issue of motion planning: How do you safely get a robotic arm to pick up boxes of different shapes and stack them efficiently on a pallet without getting stuck? And all that computation also has to be fast, because factory lines are producing more varieties of products than ever before—which means boxes of more shapes and sizes.

After much trial and error, Jacobi’s founders, including roboticist Ken Goldberg, say they’ve cracked it. Their software, built upon research from a paper they published in Science Robotics in 2020, is designed to work with the four leading makers of robotic palletizing arms. It uses deep learning to generate a “first draft” of how an arm might move an item onto the pallet. Then it uses more traditional robotics methods, like optimization, to check whether the movement can be done safely and without glitches. 
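The pattern the company describes is essentially “propose with a neural network, then verify with classical planning.” The sketch below is a hypothetical illustration of that structure with toy stand-in functions; it is not Jacobi’s software.

```python
def neural_first_draft(box, scene):
    # In practice: a deep network maps the scene to a candidate arm trajectory.
    return [(0.0, 0.0, 0.5), (box["x"], box["y"], 0.5), (box["x"], box["y"], box["z"])]

def verify_and_refine(trajectory):
    # In practice: trajectory optimization plus collision checking against the
    # robot, the conveyor, and the partially built pallet stack.
    collision_free = all(z >= 0 for _, _, z in trajectory)
    return trajectory, collision_free

box = {"x": 0.4, "y": 0.2, "z": 0.3}
draft = neural_first_draft(box, scene={})
plan, ok = verify_and_refine(draft)
print("use the neural plan" if ok else "fall back to pure optimization-based planning")
```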

Jacobi aims to replace the legacy methods customers are currently using to train their bots. In the conventional approach, robots are programmed using tools called “teaching pendants,” and customers usually have to manually guide the robot to demonstrate how to pick up each individual box and place it on the pallet. The entire coding process can take months. Jacobi says its AI-driven solution promises to cut that time down to a day and can compute motions in less than a millisecond. The company says it plans to launch its product later this month.

Billions of dollars are being poured into AI-powered robotics, but most of the excitement is geared toward next-generation robots that promise to be capable of many different tasks—like the humanoid robot that has helped Figure raise $675 million from investors, including Microsoft and OpenAI, and reach a $2.6 billion valuation in February. Against this backdrop, using AI to train a better box-stacking robot might feel pretty basic. 

Indeed, Jacobi’s seed funding round is trivial in comparison: $5 million led by Moxxie Ventures. But amid hype around promised robotics breakthroughs that could take years to materialize, palletizing might be the warehouse problem AI is best poised to solve in the short term. 

“We have a very pragmatic approach,” says Max Cao, Jacobi’s co-founder and CEO. “These tasks are within reach, and we can get a lot of adoption within a short time frame, versus some of the moonshots out there.”

Jacobi’s software product includes a virtual studio where customers can build replicas of their setups, capturing factors like which robot models they have, what types of boxes will come off the conveyor belt, and which direction the labels should face. A warehouse moving sporting goods, say, might use the program to figure out the best way to stack a mixed pallet of tennis balls, rackets, and apparel. Then Jacobi’s algorithms will automatically plan the many movements the robotic arm should take to stack the pallet, and the instructions will be transmitted to the robot.


The approach merges the benefits of fast computing provided by AI with the accuracy of more traditional robotics techniques, says Dmitry Berenson, a professor of robotics at the University of Michigan, who is not involved with the company.

“They’re doing something very reasonable here,” he says. A lot of modern robotics research is betting big on AI, hoping that deep learning can augment or replace more manual training by having the robot learn from past examples of a given motion or task. But by making sure the predictions generated by deep learning are checked against the results of more traditional methods, Jacobi is developing planning algorithms that will likely be less prone to error, Berenson says.

The planning speed that could result “is pushing this into a new category,” he adds. “You won’t even notice the time it takes to compute a motion. That’s really important in the industrial setting, where every pause means delays.”

AI can make you more creative—but it has limits

Generative AI models have made it simpler and quicker to produce everything from text passages and images to video clips and audio tracks. Texts and media that might have taken years for humans to create can now be generated in seconds.

But while AI’s output can certainly seem creative, do these models actually boost human creativity?  

That’s what two researchers set out to explore in new research published today in Science Advances, studying how people used OpenAI’s large language model GPT-4 to write short stories.

The model was helpful—but only to an extent. They found that while AI improved the output of less creative writers, it made little difference to the quality of the stories produced by writers who were already creative. The stories in which AI had played a part were also more similar to each other than those dreamed up entirely by humans. 

The research adds to the growing body of work investigating how generative AI affects human creativity, suggesting that although access to AI can offer a creative boost to an individual, it reduces creativity in the aggregate. 

To understand generative AI’s effect on humans’ creativity, we first need to determine how creativity is measured. This study used two metrics: novelty and usefulness. Novelty refers to a story’s originality, while usefulness in this context reflects the possibility that each resulting short story could be developed into a book or other publishable work. 

First, the authors recruited 293 people through the research platform Prolific to complete a task designed to measure their inherent creativity. Participants were instructed to provide 10 words that were as different from each other as possible.

Next, the participants were asked to write an eight-sentence story for young adults on one of three topics: an adventure in the jungle, on open seas, or on a different planet. First, though, they were randomly sorted into three groups. The first group had to rely solely on their own ideas, while the second group was given the option to receive a single story idea from GPT-4. The third group could elect to receive up to five story ideas from the AI model.

Of the participants with the option of AI assistance, the vast majority—88.4%—took advantage of it. They were then asked to evaluate how creative they thought their stories were, before a separate group of 600 recruits reviewed their efforts. Each reviewer was shown six stories and asked to give feedback on the stylistic characteristics, novelty, and usefulness of the story.

The researchers found that the writers with the greatest level of access to the AI model were evaluated as showing the most creativity. Of these, the writers who had scored as less creative on the first test benefited the most. 

However, the stories produced by writers who were already creative didn’t get the same boost. “We see this leveling effect where the least creative writers get the biggest benefit,” says Anil Doshi, an assistant professor at the UCL School of Management in the UK, who coauthored the paper. “But we don’t see any kind of respective benefit to be gained from the people who are already inherently creative.”

The findings make sense, given that people who are already creative don’t really need to use AI to be creative, says Tuhin Chakrabarty, a computer science researcher at Columbia University, who specializes in AI and creativity but wasn’t involved in the study. 

There are some potential drawbacks to taking advantage of the model’s help, too. AI-generated stories across the board are similar in terms of semantics and content, Chakrabarty says, and AI-generated writing is full of telltale giveaways, such as very long, exposition-heavy sentences that contain lots of stereotypes.   

“These kinds of idiosyncrasies probably also reduce the overall creativity,” he says. “Good writing is all about showing, not telling. AI is always telling.”

Because stories generated by AI models can only draw from the data that those models have been trained on, those produced in the study were less distinctive than the ideas the human participants came up with entirely on their own. If the publishing industry were to embrace generative AI, the books we read could become more homogenous, because they would all be produced by models trained on the same corpus.

This is why it’s essential to study what AI models can and, crucially, can’t do well as we grapple with what the rapidly evolving technology means for society and the economy, says Oliver Hauser, a professor at the University of Exeter Business School, another coauthor of the study. “Just because technology can be transformative, it doesn’t mean it will be,” he says.

Robot-packed meals are coming to the frozen-food aisle

Advances in artificial intelligence are coming to your freezer, in the form of robot-assembled prepared meals. 

Chef Robotics, a San Francisco–based startup, has launched a system of AI-powered robotic arms that can be quickly programmed with a recipe to dole out accurate portions of everything from tikka masala to pesto tortellini. After experiments with leading brands, including Amy’s Kitchen, the company says its robots have proved their worth and are being rolled out at scale to more production facilities. They are also being offered to new customers in the US and Canada. 

You might think the meals that end up in the grocery store’s frozen aisle, at Starbucks, or on airplanes are robot-packed already, but that’s rarely the case. Workers are often much more flexible than robots and can handle production lines that frequently rotate recipes. Not only that, but certain ingredients, like rice or shredded cheese, are hard to portion out with robotic arms. That means the vast majority of meals from recognizable brands are still typically hand-packed. 

However, advancements from AI have changed the calculus, making robots more useful on production lines, says David Griego, senior director of engineering at Amy’s.

“Before Silicon Valley got involved, the industry was much more about ‘Okay, we’re gonna program—a robot is gonna do this and do this only,’” he says. For a brand with so many different meals, that wasn’t very helpful. But the robots Griego is now able to add to the production line can learn how scooping a portion of peas is different from scooping cauliflower, and they can improve their accuracy for next time. “It’s astounding just how they can adapt to all the different types of ingredients that we use,” he says. Meal-packing robots suddenly make much more financial sense. 

Rather than selling the machines outright, Chef uses a service model, where customers pay a yearly fee that covers maintenance and training. Amy’s currently uses eight systems (each with two robotic arms) spread across two of its plants. One of these systems can now do the work of two to four workers depending on which ingredients are being packed, Griego says. The robots also reduce waste, since they can pack more consistent portions than their human counterparts. One-arm systems typically cost less than $135,000 per year, according to Chef CEO Rajat Bhageria.

With these advantages in mind, Griego imagines the robots handling more and more of the meal assembly process. “I have a vision,” he says, “where the only thing people would do is run the systems.” They’d make sure the hoppers of ingredients and packaging materials were full, for example, and the robots would do the rest. 

Robot chefs have been getting more skilled in recent years thanks to AI, and some companies have promised that burger-flipping and nugget-frying robots can provide cost savings to restaurants. But much of this technology has seen little adoption in the restaurant industry so far, says Bhageria. That’s because fast-casual restaurants often only need one cook running the grill, and if a robot cannot fully replace that person because it still needs supervision, it makes little sense to use it. Packaged meal companies, however, have a larger source of labor costs that they want to bring down: plating and assembly.

“That’s going to be the highest bang for our buck for our customers,” Bhageria says. 


The notion that more flexible robots could mean broader adoption in new industries is no surprise, says Lerrel Pinto, who leads the General-Purpose Robotics and AI Lab at New York University and is not involved with Chef or Amy’s Kitchen. 

“A lot of robots deployed in the real world are used in a very repetitive way, where they’re supposed to do the same thing over and over again,” he says. Deep learning has caused a paradigm shift over the past few years, sparking the idea that more generally capable robots might be not only possible but necessary for more widespread adoption. If Chef’s robots can perform without frequent stops for repair or training, they could deliver material savings to food companies and shift how they use human labor, Pinto says: “In the next few years, we will probably see a lot more companies trying to actually deploy these types of learning-based robots in the real world.”

One new challenge the robots have created for Amy’s, Griego says, is maintaining the look of a hand-packed meal when it’s actually assembled by a robot. The company’s cheese enchilada dish in particular was causing trouble: it’s finished with a hand-distributed sprinkling of cheddar on top, but Amy’s panel of examiners said the cheese on the robot-packed dish looked too machine-spread, sending Griego back to the drawing board.

“The first few tests went pretty well,” he says. After a couple of changes, the robots are ready to take over. Amy’s plans to bring them to more of its facilities and train them on a growing list of ingredients, meaning your frozen meals are increasingly likely to be packed by a robot.

Update: This story has been amended to include updated pricing information from Chef.

How to use AI to plan your next vacation

MIT Technology Review’s How To series helps you get things done.

Planning a vacation should, in theory, be fun. But drawing up a list of activities for a trip can also be time-consuming and stressful, particularly if you don’t know where to begin.

Luckily, tech companies have been competing to create tools that can help you do just that. Travel has become one of the most popular use cases for AI that Google, Microsoft, and OpenAI like to point to in demos, and firms like Tripadvisor, Expedia, and Booking.com have started to launch AI-powered vacation-planning products too. While AI agents that can manage the entire process of planning and booking your vacation are still some way off, the current generation of AI tools is still pretty handy at helping you with various tasks, like creating itineraries or brushing up on your language skills. 

AI models are prone to making stuff up, which means you should always double-check their suggestions yourself. But they can still be a really useful resource. Read on for some ideas on how AI tools can help make planning your time away that little bit easier—leaving you with more time to enjoy yourself.

Narrow down potential locations for your break

First things first: You have to choose where to travel to. The beauty of large language models (LLMs) like ChatGPT is that they’re trained on vast swathes of the internet, meaning they can digest information that would take a human hours to research and quickly condense it into simple paragraphs.

This makes them great tools for drawing up a list of places you’d be interested in going. The more specific you can be in your prompt, the better—for example, telling the chatbot you’d like suggestions for destinations with warm climates, child-friendly beaches, and busy nightlife (such as Mexico, Thailand, Ibiza, and Australia) will return more relevant suggestions than a vague request. 

However, given AI models’ propensity for making things up—known as hallucinating—it’s worth checking that the information they give you about proposed locations and potential activities is actually accurate.

How to use it: Fire up your LLM of choice—ChatGPT, Gemini, or Copilot are just some of the available models—and ask it to suggest locations for a holiday. Include important details like the temperatures you’re after, the type of location, the length of the trip, and the activities you’re interested in. This could look something like: “Suggest a list of locations for two people going on a two-week vacation. The locations should be hot throughout July and August, based in a city but with easy access to a beach.”
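
If you’d rather script this step than type into a chat window, the same prompt can be sent over an API. Below is a minimal sketch using the OpenAI Python SDK; the model name and settings are illustrative assumptions and not part of the original how-to.

from openai import OpenAI

# Assumes an OPENAI_API_KEY environment variable is set.
client = OpenAI()

prompt = (
    "Suggest a list of locations for two people going on a two-week vacation. "
    "The locations should be hot throughout July and August, based in a city "
    "but with easy access to a beach."
)

# Send the vacation prompt to a chat model and print the suggestions.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model choice
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)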

Pick places to visit while you’re there 

Once you’re on your vacation, you can use tools like ChatGPT or Google’s Gemini to draw up itineraries for day trips. For example, you could use a prompt like “Give me an itinerary for a day driving from Florence around the countryside in Chianti. Include some medieval villages and a winery, and finish with dinner at a restaurant with a good view.” As always with LLMs, the more specific you can be, the better. And to be on the safe side, you ought to cross-reference the final itinerary against Google Maps to check that the order of the suggestions makes sense. 

Beyond LLMs, there are also tailored tools available that can help you to work out the kinds of conditions you might encounter, including weather and traffic. If you’re planning a city break, you might want to check out Immersive View, a feature for Google Maps that Google launched last year. It uses AI and computer vision to create a 3D model depicting how a certain location in a supported city will look at a specific time of day up to four days in the future. Because it’s able to draw from weather forecasts and traffic data, it could help you predict whether a rooftop bar will still be bathed in sunshine tomorrow evening, or if you’d be better off picking a different route for a drive at the weekend.

How to use it: Check to see if your city is on this list. Then open up Google Maps, navigate to an area you’re interested in, and select Immersive View. You’ll be presented with an interactive map with the option to change the date and time of day you’d like to check.

Check flights and accommodations

Once you’ve decided where to go, booking flights and a place to stay is the next thing to tackle. Many travel booking sites have integrated AI chatbots into their websites, most of them powered by ChatGPT. But unless you’re particularly wedded to using a specific site, it could be worth looking at the bigger picture.

Looking up flights across multiple browser tabs can be cumbersome, but Google’s Gemini has a solution. The model integrates with Google Flights and Google Hotels, pulling in real-time information from Google’s partner companies in a way that makes it easy to compare times and, crucially, prices.

This is a quick and easy way to search for flights and accommodations within your personal budget. For example, I instructed Gemini to show me flights for a round trip from London to Paris for under £200. It’s a great starting point to get a rough idea of how much you’re likely to spend, and how long it’ll take you to get there.

How to use it: Once you’ve opened up Gemini (you may need to sign in to a Google account to do this), open up Settings and go to Extensions to check that Google Flights & Hotels is enabled. Then return to the Gemini main page and enter your query, specifying where you’re flying from and to, the length of your stay, and any cost requirements you may wish to share.

If you’re a spreadsheet fan, you can ask Gemini to export the plan to Sheets, which you can then share with friends and family. 

Practice your language skills

You’ve probably heard that the best way to get better at another language is to practice speaking it. However, tutors can be expensive, and you may not know anyone else who speaks the tongue you’re trying to brush up on.

Back in September last year, OpenAI updated ChatGPT to allow users to speak to it. You can try it out for yourself using the ChatGPT app for Android or iOS. I opened up the voice chat option and read it some basic phrases in French that it successfully translated into English (“Do you speak English?” “Can you help me?” and “Where is the museum?”) in spite of my poor pronunciation. It was also good at offering up alternative phrases when I asked it for less formal examples, such as swapping bonjour (hello) for salut, which translates as “hi.” And it allowed me to hold basic conversations with the disembodied AI voice.  

How to use it: Download the ChatGPT app and press the headphone icon to the right of the search bar. This will trigger a voice conversation with the AI model.

Translate on the go

Google has integrated its powerful translation technology into camera software, allowing you to simply point your phone camera toward an unfamiliar phrase and see it translated into English. This is particularly useful for deciphering menus, road signs, and shop names while you’re out and about. 

How to use it: Download the Google Translate app and select Camera.

Write online reviews (and social media captions)

Positive reviews are a great way for small businesses to set themselves apart from their competition on the internet. But writing them can be time-consuming, so why not get AI to help you out?

How to use it: Telling a chatbot like Gemini, Copilot, or ChatGPT what you enjoyed about a particular restaurant, guided tour, or destination can take some of the hard work out of writing a quick summary. The more specific you can be, the better. Prompt the model with something like: “Write a positive review for the Old Tavern in Mykonos, Greece, that mentions its delicious calamari.” While you’re unlikely to want to copy and paste the chatbot’s response in its entirety, it can help you with the structure and phrasing of your own review. 
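
If you keep a few notes about the place, you can template them into the prompt so the draft stays specific. This is a rough sketch along the same lines, again using the OpenAI Python SDK; the helper function and model name are hypothetical, not something from the article.

from openai import OpenAI

# Assumes an OPENAI_API_KEY environment variable is set.
client = OpenAI()

def draft_review(place: str, location: str, highlights: list[str]) -> str:
    """Ask the model for a short positive review built around specific details."""
    prompt = (
        f"Write a positive review for {place} in {location}. "
        f"Mention these specifics: {', '.join(highlights)}. "
        "Keep it under 100 words, in a casual first-person voice."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

print(draft_review("the Old Tavern", "Mykonos, Greece", ["delicious calamari"]))

As with the chatbot version, treat the output as a starting point for structure and phrasing rather than something to post verbatim.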

Similarly, if you’re someone who struggles to come up with captions for Instagram posts about your travels, asking the same LLMs to help you can be a good way to get over writer’s block.

Can AI help me plan my honeymoon?

This story originally appeared in The Algorithm, our weekly newsletter on AI. To get stories like this in your inbox first, sign up here.

I’m getting married later this summer and am feverishly planning a honeymoon together with my fiancé. It has at times been overwhelming trying to research and decide among what seem like millions of options while juggling busy work schedules and wedding planning.

Thankfully, my colleague Rhiannon Williams has just published a piece about how to use AI to plan your vacation. You can read her story here. The timing could not be better! I decided to put her tips to the test and use AI to plan my honeymoon itinerary.

I asked ChatGPT to suggest a travel plan over three weeks in Japan and the Philippines, our dream destinations. I told the chatbot that in Tokyo I wanted to see art and design and eat good food, and in the Philippines I wanted to go somewhere laid-back and outdoorsy that is not very touristy. I also asked ChatGPT to be specific in its suggestions for hotels and activities to book. 

The results were pretty good, and they aligned with the research I had already done. I was delighted to see the AI propose we visit Siargao Island in the Philippines, which is known for its surfing. We were planning on going there anyway, but I hadn’t had a chance to do much research on what there is to do. ChatGPT came up with some divine-looking day trips involving a stingless-jellyfish sanctuary, cave pools, and other adventures. 

The AI produced a decent first draft of the trip itinerary. I reckon this saved me a lot of time doing research on planned destinations I didn’t know much about, such as Siargao. 

But … when I asked about places I did know more about, such as Tokyo, I wasn’t that impressed. ChatGPT suggested I visit Shibuya Crossing and eat at a sushi restaurant, which, like, c’mon, are some of the most obvious things for tourists to do there. However, I’m willing to consider that the problem might have been me and my prompting, because I found that the more specific I made my prompts, the better the results were. 

But here’s the thing. Language models work by predicting the next likely word in a sentence. These AI systems don’t have an understanding of what it is like to experience these things, or how long they take. For example, ChatGPT suggested spending one whole day taking photos at a scenic spot, which would get boring pretty quickly. It also suggested accommodations that were way out of our price range. Today’s AI systems lack the kind of last-mile reasoning and planning skills that would help me with logistics and budgeting. 

But this whole process might become much smoother as we build the next generation of AI agents. 

Agents are AI algorithms and models that can complete complex tasks in the real world. The idea is that one day they could execute a vast range of tasks, much like a human assistant. Agents are the hot new thing in AI, and I just published an explainer looking at what they are and how they work. You can read it here.

In the future, an AI agent could not only suggest things to do and places to stay on my honeymoon; it could also go a step further than ChatGPT and book flights for me. It would remember my preferences and budget for hotels and propose only accommodations that matched my criteria. It might also remember what I liked to do on past trips and suggest very specific things to do tailored to those tastes. It might even request bookings for restaurants on my behalf.

Unfortunately for my honeymoon, today’s AI systems lack the kind of reasoning, planning, and memory needed. It’s still early days for these systems, and there are a lot of unsolved research questions. But who knows—maybe for our 10th anniversary trip? 


Now read the rest of The Algorithm

Deeper Learning

A way to let robots learn by listening will make them more useful

Most AI-powered robots today use cameras to understand their surroundings and learn new tasks, but it’s becoming easier to train robots with sound too, helping them adapt to tasks and environments where visibility is limited. 

Sound on: Researchers at Stanford University tested how much more successful a robot can be if it’s capable of “listening.” They chose four tasks: flipping a bagel in a pan, erasing a whiteboard, putting two Velcro strips together, and pouring dice out of a cup. In each task, sounds provided clues that cameras or tactile sensors struggle with, like knowing if the eraser is properly contacting the whiteboard or whether the cup contains dice. When using vision alone in the last test, the robot could tell 27% of the time whether there were dice in the cup, but that rose to 94% when sound was included. Read more from James O’Donnell.

Bits and Bytes

AI lie detectors are better than humans at spotting lies
Researchers at the University of Würzburg in Germany found that an AI system was significantly better at spotting fabricated statements than humans. Humans usually get it right only around half the time, but the AI could tell whether a statement was true or false in 67% of cases. However, lie detection is a controversial and unreliable technology, and it’s debatable whether we should even be using it in the first place. (MIT Technology Review)

A hacker stole secrets from OpenAI 
A hacker managed to access OpenAI’s internal messaging systems and steal information about its AI technology. The company believes the hacker was a private individual, but the incident raised fears among OpenAI employees that China could steal the company’s technology too. (The New York Times)

AI has vastly increased Google’s emissions over the past five years
Google said its greenhouse-gas emissions totaled 14.3 million metric tons of carbon dioxide equivalent throughout 2023, 48% higher than in 2019. The increase is mostly due to Google’s enormous push into AI, which will likely make it harder for the company to hit its goal of eliminating carbon emissions by 2030. This is an utterly depressing example of how our societies prioritize profit over the climate emergency we are in. (Bloomberg)

Why a $14 billion startup is hiring PhDs to train AI systems from their living rooms
An interesting read about the shift happening in AI and data work. Scale AI has previously hired low-paid data workers in countries such as India and the Philippines to annotate data that is used to train AI. But the massive boom in language models has prompted Scale to hire highly skilled contractors in the US with the necessary expertise to help train those models. This highlights just how important data work really is to AI. (The Information)

A new “ethical” AI music generator can’t write a halfway decent song
Copyright is one of the thorniest problems facing AI today. Just last week I wrote about how AI companies are being forced to cough up for high-quality training data to build powerful AI. This story illustrates why that matters: it’s about an “ethical” AI music generator that used only a limited data set of licensed music. But without high-quality data, it isn’t able to generate anything even close to decent. (Wired)