Why we need an AI safety hotline

In the past couple of years, regulators have been caught off guard again and again as tech companies compete to launch ever more advanced AI models. It’s only a matter of time before labs release another round of models that pose new regulatory challenges. We’re likely just weeks away, for example, from OpenAI’s release of GPT-5, which promises to push AI capabilities further than ever before. As it stands, it seems there’s little anyone can do to delay or prevent the release of a model that poses excessive risks.

Testing AI models before they’re released is a common approach to mitigating certain risks, and it may help regulators weigh up the costs and benefits—and potentially block models from being released if they’re deemed too dangerous. But the accuracy and comprehensiveness of these tests leave a lot to be desired. AI models may “sandbag” an evaluation—hiding some of their capabilities to avoid raising safety concerns. Evaluations also suffer from limited scope: current tests are unlikely to reliably uncover the full set of risks posed by any one model, let alone every risk that warrants further investigation. There’s also the question of who conducts the evaluations and how their biases may influence testing efforts. For those reasons, evaluations need to be used alongside other governance tools.

One such tool could be internal reporting mechanisms within the labs. Ideally, employees should feel empowered to regularly and fully share their AI safety concerns with their colleagues, and they should feel those colleagues can then be counted on to act on the concerns. However, there’s growing evidence that, far from being promoted, open criticism is becoming rarer in AI labs. Just three months ago, 13 former and current workers from OpenAI and other labs penned an open letter expressing fear of retaliation if they attempt to disclose questionable corporate behaviors that fall short of breaking the law. 

How to sound the alarm

In theory, external whistleblower protections could play a valuable role in the detection of AI risks. These could protect employees fired for disclosing corporate actions, and they could help make up for inadequate internal reporting mechanisms. Nearly every state has a public policy exception to at-will employment termination—in other words, terminated employees can seek recourse against their employers if they were retaliated against for calling out unsafe or illegal corporate practices. However, in practice this exception offers employees few assurances. Judges tend to favor employers in whistleblower cases. The likelihood of AI labs’ surviving such suits seems particularly high given that society has yet to reach any sort of consensus as to what qualifies as unsafe AI development and deployment. 

These and other shortcomings explain why the aforementioned 13 AI workers, including ex-OpenAI employee William Saunders, called for a novel “right to warn.” Companies would have to offer employees an anonymous process for disclosing risk-related concerns to the lab’s board, a regulatory authority, and an independent third body made up of subject-matter experts. The ins and outs of this process have yet to be figured out, but it would presumably be a formal, bureaucratic mechanism. The board, regulator, and third party would all need to make a record of the disclosure. It’s likely that each body would then initiate some sort of investigation. Subsequent meetings and hearings also seem like a necessary part of the process. Yet if Saunders is to be taken at his word, what AI workers really want is something different. 

When Saunders went on the Big Technology Podcast to outline his ideal process for sharing safety concerns, his focus was not on formal avenues for reporting established risks. Instead, he indicated a desire for some intermediate, informal step. He wants a chance to receive neutral, expert feedback on whether a safety concern is substantial enough to go through a “high stakes” process such as a right-to-warn system. Current government regulators, as Saunders says, could not serve that role. 

For one thing, they likely lack the expertise to help an AI worker think through safety concerns. What’s more, few workers will pick up the phone if they know it’s a government official on the other end—that sort of call may be “very intimidating,” as Saunders himself said on the podcast. Instead, he envisages being able to call an expert to discuss his concerns. In an ideal scenario, he’d be told that the risk in question does not seem that severe or likely to materialize, freeing him up to return to whatever he was doing with more peace of mind. 

Lowering the stakes

What Saunders is asking for in this podcast isn’t a right to warn, then, as that suggests the employee is already convinced there’s unsafe or illegal activity afoot. What he’s really calling for is a gut check—an opportunity to verify whether a suspicion of unsafe or illegal behavior seems warranted. The stakes would be much lower, so the regulatory response could be lighter. The third party responsible for weighing up these gut checks could be a much more informal one. For example, AI PhD students, retired AI industry workers, and other individuals with AI expertise could volunteer for an AI safety hotline. They could be tasked with quickly and expertly discussing safety matters with employees via a confidential and anonymous phone conversation. Hotline volunteers would have familiarity with leading safety practices, as well as extensive knowledge of what options, such as right-to-warn mechanisms, may be available to the employee. 

As Saunders indicated, few employees will likely want to go from 0 to 100 with their safety concerns—straight from colleagues to the board or even a government body. They are much more likely to raise their issues if an intermediate, informal step is available.

Studying examples elsewhere

The details of how precisely an AI safety hotline would work deserve more debate among AI community members, regulators, and civil society. For the hotline to realize its full potential, for instance, it may need some way to escalate the most urgent, verified reports to the appropriate authorities. How to ensure the confidentiality of hotline conversations is another matter that needs thorough investigation. How to recruit and retain volunteers is another key question. Given leading experts’ broad concern about AI risk, some may be willing to participate simply out of a desire to lend a hand. Should too few folks step forward, other incentives may be necessary. The essential first step, though, is acknowledging this missing piece in the puzzle of AI safety regulation. The next step is looking for models to emulate in building out the first AI hotline. 

One place to start is with ombudspersons. Other industries have recognized the value of identifying these neutral, independent individuals as resources for evaluating the seriousness of employee concerns. Ombudspersons exist in academia, nonprofits, and the private sector. The distinguishing attribute of these individuals and their staffers is neutrality—they have no incentive to favor one side or the other, and thus they’re more likely to be trusted by all. A glance at the use of ombudspersons in the federal government shows that when they are available, issues may be raised and resolved sooner than they would be otherwise.

This concept is relatively new. The US Department of Commerce established the first federal ombudsman in 1971. The office was tasked with helping citizens resolve disputes with the agency and investigating agency actions. Other agencies, including the Social Security Administration and the Internal Revenue Service, soon followed suit. A retrospective review of these early efforts concluded that effective ombudspersons can meaningfully improve citizen-government relations. On the whole, ombudspersons were associated with an uptick in voluntary compliance with regulations and cooperation with the government.

An AI ombudsperson or safety hotline would surely have different tasks and staff from an ombudsperson in a federal agency. Nevertheless, the general concept is worthy of study by those advocating safeguards in the AI industry. 

A right to warn may play a role in getting AI safety concerns aired, but we need to set up more intermediate, informal steps as well. An AI safety hotline is low-hanging regulatory fruit. A pilot made up of volunteers could be organized in relatively short order and provide an immediate outlet for those, like Saunders, who merely want a sounding board.

Kevin Frazier is an assistant professor at St. Thomas University College of Law and senior research fellow in the Constitutional Studies Program at the University of Texas at Austin.

Why OpenAI’s new model is such a big deal

This story is from The Algorithm, our weekly newsletter on AI. To get it in your inbox first, sign up here.

Last weekend, I got married at a summer camp, and during the day our guests competed in a series of games inspired by the show Survivor that my now-wife and I orchestrated. When we were planning the games in August, we wanted one station to be a memory challenge, where our friends and family would have to memorize part of a poem and then relay it to their teammates so they could re-create it with a set of wooden tiles. 

I thought OpenAI’s GPT-4o, its leading model at the time, would be perfectly suited to help. I asked it to create a short wedding-themed poem, with the constraint that each letter could only appear a certain number of times so we could make sure teams would be able to reproduce it with the provided set of tiles. GPT-4o failed miserably. The model repeatedly insisted that its poem worked within the constraints, even though it didn’t. It would correctly count the letters only after the fact, while continuing to deliver poems that didn’t fit the prompt. Without the time to meticulously craft the verses by hand, we ditched the poem idea and instead challenged guests to memorize a series of shapes made from colored tiles. (That ended up being a total hit with our friends and family, who also competed in dodgeball, egg tosses, and capture the flag.)    
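For what it’s worth, the constraint itself is easy to check mechanically; the hard part for the model was generating text that satisfies it. Here is a minimal sketch of such a checker in Python (the tile inventory and sample phrases are invented for illustration, not the ones from our game):

```python
from collections import Counter

def fits_tiles(text: str, tiles: dict[str, int]) -> bool:
    """Return True if every letter in `text` can be spelled with the
    available tiles (case-insensitive, ignoring spaces/punctuation)."""
    needed = Counter(c for c in text.lower() if c.isalpha())
    return all(tiles.get(letter, 0) >= count for letter, count in needed.items())

# Hypothetical tile inventory: 4 of each vowel, 2 of each consonant
tiles = {c: (4 if c in "aeiou" else 2) for c in "abcdefghijklmnopqrstuvwxyz"}

print(fits_tiles("love is a game", tiles))             # → True
print(fits_tiles("forever and ever and ever", tiles))  # → False (too many e's and r's)
```

Verifying a candidate poem takes a few lines; GPT-4o’s failure was that it couldn’t generate under the constraint, then misreported the counts when asked to check its own work.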

However, last week OpenAI released a new model called o1 (previously referred to under the code name “Strawberry” and, before that, Q*) that blows GPT-4o out of the water for this type of purpose.

Unlike previous models that are well suited for language tasks like writing and editing, OpenAI o1 is focused on multistep “reasoning,” the type of process required for advanced mathematics, coding, or other STEM-based questions. It uses a “chain of thought” technique, according to OpenAI. “It learns to recognize and correct its mistakes. It learns to break down tricky steps into simpler ones. It learns to try a different approach when the current one isn’t working,” the company wrote in a blog post on its website.

OpenAI’s tests point to resounding success. The model ranks in the 89th percentile on questions from the competitive coding organization Codeforces and would be among the top 500 high school students in the USA Math Olympiad, which covers geometry, number theory, and other math topics. The model is also trained to answer PhD-level questions in subjects ranging from astrophysics to organic chemistry. 

In math olympiad questions, the new model is 83.3% accurate, versus 13.4% for GPT-4o. In the PhD-level questions, it averaged 78% accuracy, compared with 69.7% from human experts and 56.1% from GPT-4o. (In light of these accomplishments, it’s unsurprising the new model was pretty good at writing a poem for our nuptial games, though still not perfect; it used more Ts and Ss than instructed to.)

So why does this matter? The bulk of LLM progress until now has been language-driven, resulting in chatbots or voice assistants that can interpret, analyze, and generate words. But in addition to getting lots of facts wrong, such LLMs have failed to demonstrate the types of skills required to solve important problems in fields like drug discovery, materials science, coding, or physics. OpenAI’s o1 is one of the first signs that LLMs might soon become genuinely helpful companions to human researchers in these fields. 

It’s a big deal because it brings “chain-of-thought” reasoning in an AI model to a mass audience, says Matt Welsh, an AI researcher and founder of the LLM startup Fixie. 

“The reasoning abilities are directly in the model, rather than one having to use separate tools to achieve similar results. My expectation is that it will raise the bar for what people expect AI models to be able to do,” Welsh says.

That said, it’s best to take OpenAI’s comparisons to “human-level skills” with a grain of salt, says Yves-Alexandre de Montjoye, an associate professor in math and computer science at Imperial College London. It’s very hard to meaningfully compare how LLMs and people go about tasks such as solving math problems from scratch.

Also, AI researchers say that measuring how well a model like o1 can “reason” is harder than it sounds. If it answers a given question correctly, is that because it successfully reasoned its way to the logical answer? Or was it aided by a sufficient starting point of knowledge built into the model? The model “still falls short when it comes to open-ended reasoning,” Google AI researcher François Chollet wrote on X.

Finally, there’s the price. This reasoning-heavy model doesn’t come cheap. Though access to some versions of the model is included in premium OpenAI subscriptions, developers using o1 through the API will pay three times as much as they pay for GPT-4o—$15 per 1 million input tokens in o1, versus $5 for GPT-4o. The new model also won’t be most users’ first pick for more language-heavy tasks, where GPT-4o continues to be the better option, according to OpenAI’s user surveys. 
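To put that pricing in concrete terms, here’s a quick back-of-the-envelope calculation using only the input-token prices quoted above (output tokens are priced separately and ignored in this sketch):

```python
# Published input-token prices at o1's launch, in USD per 1 million tokens
PRICE_PER_MTOK = {"o1": 15.00, "gpt-4o": 5.00}

def input_cost(model: str, tokens: int) -> float:
    """Dollar cost of `tokens` input tokens on `model`."""
    return PRICE_PER_MTOK[model] * tokens / 1_000_000

# A hypothetical workload of 100 million input tokens per month:
print(input_cost("o1", 100_000_000))      # → 1500.0
print(input_cost("gpt-4o", 100_000_000))  # → 500.0
```

At that volume, routing everything through o1 rather than GPT-4o adds $1,000 a month on input tokens alone, which is why OpenAI pitches it for hard reasoning tasks rather than everyday language work.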

What will it unlock? We won’t know until researchers and labs have the access, time, and budget to tinker with the new model and find its limits. But it’s surely a sign that the race for models that can outreason humans has begun.

Now read the rest of The Algorithm


Deeper learning

Chatbots can persuade people to stop believing in conspiracy theories

Researchers believe they’ve uncovered a new tool for combating false conspiracy theories: AI chatbots. Researchers from MIT Sloan and Cornell University found that chatting about a conspiracy theory with a large language model (LLM) reduced people’s belief in it by about 20%—even among participants who claimed that their beliefs were important to their identity. 

Why this matters: The findings could represent an important step forward in how we engage with and educate people who espouse such baseless theories, says Yunhao (Jerry) Zhang, a postdoc fellow affiliated with the Psychology of Technology Institute who studies AI’s impacts on society. “They show that with the help of large language models, we can—I wouldn’t say solve it, but we can at least mitigate this problem,” he says. “It points out a way to make society better.” Read more from Rhiannon Williams here.

Bits and bytes

Google’s new tool lets large language models fact-check their responses

Called DataGemma, it uses two methods to help LLMs check their responses against reliable data and cite their sources more transparently to users. (MIT Technology Review)

Meet the radio-obsessed civilian shaping Ukraine’s drone defense 

Since Russia’s invasion, Serhii “Flash” Beskrestnov has become an influential, if sometimes controversial, force—sharing expert advice and intel on the ever-evolving technology that’s taken over the skies. His work may determine the future of Ukraine, and wars far beyond it. (MIT Technology Review)

Tech companies have joined a White House commitment to prevent AI-generated sexual abuse imagery

The pledges, signed by firms like OpenAI, Anthropic, and Microsoft, aim to “curb the creation of image-based sexual abuse.” The companies promise to set limits on what models will generate and to remove nude images from training data sets where possible.  (Fortune)

OpenAI is now valued at $150 billion

The valuation arose out of talks it’s currently engaged in to raise $6.5 billion. Given that OpenAI is becoming increasingly costly to operate, and could lose as much as $5 billion this year, it’s tricky to see how it all adds up. (The Information)

There are more than 120 AI bills in Congress right now

More than 120 bills related to regulating artificial intelligence are currently floating around the US Congress.

They’re pretty varied. One aims to improve knowledge of AI in public schools, while another is pushing for model developers to disclose what copyrighted material they use in their training.  Three deal with mitigating AI robocalls, while two address biological risks from AI. There’s even a bill that prohibits AI from launching a nuke on its own.

The flood of bills is indicative of the desperation Congress feels to keep up with the rapid pace of technological improvements. “There is a sense of urgency. There’s a commitment to addressing this issue, because it is developing so quickly and because it is so crucial to our economy,” says Heather Vaughan, director of communications for the US House of Representatives Committee on Science, Space, and Technology.

Because of the way Congress works, the majority of these bills will never make it into law. But simply taking a look at all the different bills that are in motion can give us insight into policymakers’ current preoccupations: where they think the dangers are, what each party is focusing on, and more broadly, what vision the US is pursuing when it comes to AI and how it should be regulated.

That’s why, with help from the Brennan Center for Justice, which created a tracker with all the AI bills circulating in various committees in Congress right now, MIT Technology Review has taken a closer look to see if there’s anything we can learn from this legislative smorgasbord. 

As you can see, it can seem as if Congress is trying to do everything at once when it comes to AI. To get a better sense of what may actually pass, it’s useful to look at what bills are moving along to potentially become law. 

A bill typically needs to pass a committee, or a smaller body of Congress, before it is voted on by the whole Congress. Many will fall short at this stage, while others will simply be introduced and then never spoken of again. This happens because there are so many bills presented in each session, and not all of them are given equal consideration. If the leaders of a party don’t feel a bill from one of its members can pass, they may not even try to push it forward. And then, depending on the makeup of Congress, a bill’s sponsor usually needs to get some members of the opposite party to support it for it to pass. In the current polarized US political climate, that task can be herculean. 

Congress has passed legislation on artificial intelligence before. Back in 2020, the National AI Initiative Act was part of the Defense Authorization Act, which invested resources in AI research and provided support for public education and workforce training on AI.

And some of the current bills are making their way through the system. The Senate Commerce Committee pushed through five AI-related bills at the end of July. One would authorize the newly formed US AI Safety Institute (AISI) to create test beds and voluntary guidelines for AI models. The others focused on expanding education on AI, establishing public computing resources for AI research, and criminalizing the publication of deepfake pornography. The next step would be to put the bills on the congressional calendar to be voted on, debated, or amended.

“The US AI Safety Institute, as a place to have consortium building and easy collaboration between corporate and civil society actors, is amazing. It’s exactly what we need,” says Yacine Jernite, an AI researcher at Hugging Face.

The progress of these bills is a positive development, says Varun Krovi, executive director of the Center for AI Safety Action Fund. “We need to codify the US AI Safety Institute into law if you want to maintain our leadership on the global stage when it comes to standards development,” he says. “And we need to make sure that we pass a bill that provides computing capacity required for startups, small businesses, and academia to pursue AI.”

Following the Senate’s lead, the House Committee on Science, Space, and Technology just passed nine more bills regarding AI on September 11. Those bills focused on improving education on AI in schools, directing the National Institute of Standards and Technology (NIST) to establish guidelines for artificial-intelligence systems, and expanding the workforce of AI experts. These bills were chosen because they have a narrower focus and thus might not get bogged down in big ideological battles on AI, says Vaughan.

“It was a day that culminated from a lot of work. We’ve had a lot of time to hear from members and stakeholders. We’ve had years of hearings and fact-finding briefings on artificial intelligence,” says Representative Haley Stevens, one of the Democratic members of the House committee.

Many of the bills specify that any guidance they propose for the industry is nonbinding and that the goal is to work with companies to ensure safe development rather than curtail innovation. 

For example, one of the bills from the House, the AI Development Practices Act, directs NIST to establish “voluntary guidance for practices and guidelines relating to the development … of AI systems” and a “voluntary risk management framework.” Another bill, the AI Advancement and Reliability Act, has similar language. It supports “the development of voluntary best practices and technical standards” for evaluating AI systems. 

“Each bill contributes to advancing AI in a safe, reliable, and trustworthy manner while fostering the technology’s growth and progress through innovation and vital R&D,” committee chairman Frank Lucas, an Oklahoma Republican, said in a press release on the bills coming out of the House.

“It’s emblematic of the approach that the US has taken when it comes to tech policy. We hope that we would move on from voluntary agreements to mandating them,” says Krovi.

Avoiding mandates is a practical matter for the House committee. “Republicans don’t go in for mandates for the most part. They generally aren’t going to go for that. So we would have a hard time getting support,” says Vaughan. “We’ve heard concerns about stifling innovation, and that’s not the approach that we want to take.” When MIT Technology Review asked about the origin of these concerns, they were attributed to unidentified “third parties.” 

And fears of slowing innovation don’t just come from the Republican side. “What’s most important to me is that the United States of America is establishing aggressive rules of the road on the international stage,” says Stevens. “It’s concerning to me that actors within the Chinese Communist Party could outpace us on these technological advancements.”

But these bills come at a time when big tech companies have ramped up lobbying efforts on AI. “Industry lobbyists are in an interesting predicament—their CEOs have said that they want more AI regulation, so it’s hard for them to visibly push to kill all AI regulation,” says David Evan Harris, who teaches courses on AI ethics at the University of California, Berkeley. “On the bills that they don’t blatantly try to kill, they instead try to make them meaningless by pushing to transform the language in the bills to make compliance optional and enforcement impossible.”

“A [voluntary commitment] is something that is also only accessible to the largest companies,” says Jernite at Hugging Face, claiming that sometimes the ambiguous nature of voluntary commitments allows big companies to set definitions for themselves. “If you have a voluntary commitment—that is, ‘We’re going to develop state-of-the-art watermarking technology’—you don’t know what state-of-the-art means. It doesn’t come with any of the concrete things that make regulation work.”

“We are in a very aggressive policy conversation about how to do this right, and how this carrot and stick is actually going to work,” says Stevens, indicating that Congress may ultimately draw red lines that AI companies must not cross.

There are other interesting insights to be gleaned from looking at the bills all together. Two-thirds of the AI bills are sponsored by Democrats. This isn’t too surprising, since some House Republicans have claimed to want no AI regulations, believing that guardrails will slow down progress.

The topics of the bills (as specified by Congress) are dominated by science, tech, and communications (28%), commerce (22%), updating government operations (18%), and national security (9%). Topics that don’t receive much attention include labor and employment (2%), environmental protection (1%), and civil rights, civil liberties, and minority issues (1%).

The lack of a focus on equity and minority issues came into view during the Senate markup session at the end of July. Senator Ted Cruz, a Republican, added an amendment that explicitly prohibits any action “to ensure inclusivity and equity in the creation, design, or development of the technology.” Cruz said regulatory action might slow US progress in AI, allowing the country to fall behind China.

On the House side, there was also a hesitation to work on bills dealing with biases in AI models. “None of our bills are addressing that. That’s one of the more ideological issues that we’re not moving forward on,” says Vaughan.

The lead Democrat on the House committee, Representative Zoe Lofgren, told MIT Technology Review, “It is surprising and disappointing if any of my Republican colleagues have made that comment about bias in AI systems. We shouldn’t tolerate discrimination that’s overt and intentional any more than we should tolerate discrimination that occurs because of bias in AI systems. I’m not really sure how anyone can argue against that.”

After publication, Vaughan clarified that “[Bias] is one of the bigger, more cross-cutting issues, unlike the narrow, practical bills we considered that week. But we do care about bias as an issue,” and she expects it to be addressed within an upcoming House Task Force report.

One issue that may rise above the partisan divide is deepfakes. The Defiance Act, one of several bills addressing them, is cosponsored by a Democratic senator, Amy Klobuchar, and a Republican senator, Josh Hawley. Deepfakes have already been abused in elections; for example, someone faked Joe Biden’s voice for a robocall to tell citizens not to vote. And the technology has been weaponized to victimize people by incorporating their images into pornography without their consent. 

“I certainly think that there is more bipartisan support for action on these issues than on many others,” says Daniel Weiner, director of the Brennan Center’s Elections & Government Program. “But it remains to be seen whether that’s going to win out against some of the more traditional ideological divisions that tend to arise around these issues.” 

Although none of the current slate of bills have resulted in laws yet, the task of regulating any new technology, and specifically advanced AI systems that no one entirely understands, is difficult. The fact that Congress is making any progress at all may be surprising in itself. 

“Congress is not sleeping on this by any stretch of the means,” says Stevens. “We are evaluating and asking the right questions and also working alongside our partners in the Biden-Harris administration to get us to the best place for the harnessing of artificial intelligence.”

Update: We added further comments from the Republican spokesperson.

The Download: Congress’s AI bills, and Snap’s new AR spectacles

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

There are more than 120 AI bills in Congress right now

More than 120 bills related to regulating artificial intelligence are currently floating around the US Congress. This flood of bills is indicative of the desperation Congress feels to keep up with the rapid pace of technological improvements. 

Because of the way Congress works, the majority of these bills will never make it into law. But simply taking a look at them all can give us insight into policymakers’ current preoccupations: where they think the dangers are, what each party is focusing on, and more broadly, what vision the US is pursuing when it comes to AI and how it should be regulated.

That’s why, with help from the Brennan Center for Justice, we’ve created a tracker with all the AI bills circulating in various committees in Congress right now, to see if there’s anything we can learn from this legislative smorgasbord. Read the full story.

—Scott J Mulligan

Here’s what I made of Snap’s new augmented-reality Spectacles

Snap has announced a new version of its Spectacles: AR glasses that could finally deliver on the promises that devices like Magic Leap, or HoloLens, or even Google Glass, made many years ago.

Our editor-in-chief Mat Honan got to try them out a couple of weeks ago. He found they packed a pretty impressive punch, layering visual information and applications directly on their see-through lenses and making objects appear as if they are in the real world—if you don’t mind looking a little goofy, that is. Read Mat’s full thoughts here.

Google is funding an AI-powered satellite constellation that will spot wildfires faster

What’s happening: Early next year, Google and its partners plan to launch the first in a series of satellites that together would provide close-up, frequently refreshed images of wildfires around the world, offering data that could help firefighters battle blazes more rapidly, effectively, and safely.

Why it matters: The images and analysis will be provided free to fire agencies around the world, helping to improve understanding of where fires are, where they’re moving, and how hot they’re burning. The information could help agencies stamp out small fires before they turn into raging infernos, place limited firefighting resources where they’ll do the most good, and evacuate people along the safest paths. Read the full story.

—James Temple

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 California has passed three election deepfake laws
But only one will take effect in time for the presidential election in November. (NYT $)
+ The bills also protect actors from AI impersonation without their consent. (WP $)

2 How did thousands of Hezbollah pagers explode simultaneously?
The devices were probably intercepted by hackers during shipment. (WSJ $)
+ Here’s everything we know about the attack so far. (Vox)
+ Small lithium batteries alone don’t tend to cause this much damage. (404 Media)
+ Exploding comms devices are nothing new. (FT $)

3 Instagram has introduced new accounts specifically for teens
In response to increasing pressure over Meta’s minor protection policies. (BBC)
+ Parents will be given greater control over their activities. (The Guardian)
+ Here’s how to set up the new restricted accounts. (WP $)

4 Google has won its bid to overturn a €1.5 billion fine from the EU
But the court said it stands by the majority of the previous findings. (CNBC)
+ But the ruling can still be appealed in the Court of Justice. (Bloomberg $)
+ Meanwhile, Meta’s antitrust woes are escalating. (FT $)

5 SpaceX has been accused of breaking launch rules 
And the US Federal Aviation Administration wants to slap it with a hefty fine. (WP $)

6 Electric cars now outnumber petrol cars in Norway
It’s particularly impressive given the country’s history as an oil producer. (The Guardian)
+ Why full EVs, not hybrids, are the future. (Economist $)
+ Three frequently asked questions about EVs, answered. (MIT Technology Review)

7 Our understanding of the universe is still up in the air
What looked like a breakthrough in physics actually might not be at all. (New Scientist $)
+ Why is the universe so complex and beautiful? (MIT Technology Review)

8 Tech’s middle managers are having a tough time
They’re losing their jobs left, right and center. (Insider $)

9 YouTube astrology is booming in Pakistan
Amid economic and political turmoil, Pakistanis are seeking answers in the stars. (Rest of World)

10 Not everything bad is AI-generated
But what’s AI-generated is often bad. (NY Mag $)

Quote of the day

“I’d rather go back to school than work in an office again.”

—CJ Felli, a system development engineer for Amazon Web Services, is not happy about the company’s back-to-the-office directive, Quartz reports.

The big story

What’s next for the world’s fastest supercomputers

September 2023

When the Frontier supercomputer came online last year, it marked the dawn of so-called exascale computing, with machines that can execute an exaflop—or a quintillion (10^18) floating point operations a second.

Since then, scientists have geared up to make more of these blazingly fast computers: several exascale machines are due to come online in the US and Europe in 2024.

But speed itself isn’t the endgame. Researchers hope to pursue previously unanswerable questions about nature—and to design new technologies in areas from transportation to medicine. Read the full story.

—Sophia Chen

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or tweet ’em at me.)

+ These Ocean Photographer of the Year winning images are simply stunning 🐋($)
+ Here’s where you’ll have the best chance of finding a fossilized shark tooth in the US.
+ Vans are back in style, as if they ever went out of it.
+ Potatoes are great every which way, but here’s how long to boil them for that perfect al dente bite.

AI-generated content doesn’t seem to have swayed recent European elections 

AI-generated falsehoods and deepfakes seem to have had no effect on election results in the UK, France, and the European Parliament this year, according to new research. 

Since the beginning of the generative-AI boom, there has been widespread fear that AI tools could boost bad actors’ ability to spread fake content with the potential to interfere with elections or even sway the results. Such worries were particularly heightened this year, when billions of people were expected to vote in over 70 countries. 

Those fears seem to have been unwarranted, says Sam Stockwell, the researcher at the Alan Turing Institute who conducted the study. He focused on three elections over a four-month period from May to August 2024, collecting data on public reports and news articles on AI misuse. Stockwell identified 16 cases of AI-enabled falsehoods or deepfakes that went viral during the UK general election and only 11 cases in the EU and French elections combined, none of which appeared to definitively sway the results. The fake AI content was created by both domestic actors and groups linked to hostile countries such as Russia. 

These findings are in line with recent warnings from experts that the focus on election interference is distracting us from deeper and longer-lasting threats to democracy.   

AI-generated content seems to have been ineffective as a disinformation tool in most European elections this year so far. This, Stockwell says, is because most of the people who were exposed to the disinformation already believed its underlying message (for example, that levels of immigration to their country are too high). Stockwell’s analysis showed that people who were actively engaging with these deepfake messages by resharing and amplifying them had some affiliation or previously expressed views that aligned with the content. So the material was more likely to strengthen preexisting views than to influence undecided voters. 

Tried-and-tested election interference tactics, such as flooding comment sections with bots and exploiting influencers to spread falsehoods, remained far more effective. Bad actors mostly used generative AI to rewrite news articles with their own spin or to create more online content for disinformation purposes. 

“AI is not really providing much of an advantage for now, as existing, simpler methods of creating false or misleading information continue to be prevalent,” says Felix Simon, a researcher at the Reuters Institute for Journalism, who was not involved in the research. 

However, it’s hard to draw firm conclusions about AI’s impact upon elections at this stage, says Samuel Woolley, a disinformation expert at the University of Pittsburgh. That’s in part because we don’t have enough data.

“There are less obvious, less trackable, downstream impacts related to uses of these tools that alter civic engagement,” he adds.

Stockwell agrees: Early evidence from these elections suggests that AI-generated content could be more effective for harassing politicians and sowing confusion than changing people’s opinions on a large scale. 

Politicians in the UK, such as former prime minister Rishi Sunak, were targeted by AI deepfakes that, for example, showed them promoting scams or admitting to financial corruption. Female candidates were also targeted with nonconsensual sexual deepfake content, intended to disparage and intimidate them. 

“There is, of course, a risk that in the long run, the more that political candidates are on the receiving end of online harassment, death threats, deepfake pornographic smears—that can have a real chilling effect on their willingness to, say, participate in future elections, but also obviously harm their well-being,” says Stockwell. 

Perhaps more worrying, Stockwell says, his research indicates that people are increasingly unable to discern the difference between authentic and AI-generated content in the election context. Politicians are also taking advantage of that. For example, political candidates in the European Parliament elections in France have shared AI-generated content amplifying anti-immigration narratives without disclosing that it had been made with AI. 

“This covert engagement, combined with a lack of transparency, presents in my view a potentially greater risk to the integrity of political processes than the use of AI by the general population or so-called ‘bad actors,’” says Simon. 

Charts: Ecommerce Revenue Forecasts U.S., Global

The International Trade Administration (ITA), an agency of the U.S. Department of Commerce, projects global B2B ecommerce sales to reach $36.2 trillion by 2026, a 50% increase from 2023. The ITA’s mission is to promote trade and investment, strengthen the competitiveness of U.S. industry, and ensure fair trade and compliance with trade laws and agreements.

Gross merchandise value is total sales over a specified period, typically measured quarterly or yearly.

The ITA projects B2C ecommerce revenue to reach $5.5 trillion by 2027, a 14.4% compound annual growth rate from 2024. Although fashion and consumer electronics are the largest sectors, pharmaceuticals is the fastest-growing category.


Statista tracks the leading online shopping categories by revenue, both globally and within the United States. Electronics account for a substantial share of global ecommerce sales, with projected spending reaching $922.5 billion. Fashion and apparel follow, ranking second among the top online shopping categories.

Per Statista, the top ecommerce categories in the U.S. reflect global trends, with fashion emerging as the top revenue category, projected to generate $162.9 billion in revenue in 2024.
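The growth figures above imply base-year values that the article doesn’t state. As a rough sanity check of the cited projections (illustrative arithmetic only, not ITA data):

```python
# Back-of-envelope check of the ITA projections cited above.
# All dollar figures are in trillions; the base-year values are implied,
# not reported.

def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1

# Global B2B: $36.2T by 2026, described as a 50% increase from 2023.
b2b_2023 = 36.2 / 1.50                 # implied 2023 base, ~$24.1T
b2b_cagr = cagr(b2b_2023, 36.2, 3)     # ~14.5% per year

# Global B2C: $5.5T by 2027 at a 14.4% CAGR from 2024.
b2c_2024 = 5.5 / (1 + 0.144) ** 3      # implied 2024 base, ~$3.7T

print(f"Implied 2023 B2B base: ${b2b_2023:.1f}T")
print(f"B2B CAGR 2023-2026: {b2b_cagr:.1%}")
print(f"Implied 2024 B2C base: ${b2c_2024:.1f}T")
```
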

Google Revamps Entire Crawler Documentation via @sejournal, @martinibuster

Google has launched a major revamp of its Crawler documentation, shrinking the main overview page and splitting content into three new, more focused pages.  Although the changelog downplays the changes, there is an entirely new section and essentially a rewrite of the entire crawler overview page. The additional pages allow Google to increase the information density of all the crawler pages and improve topical coverage.

What Changed?

Google’s documentation changelog notes two changes, but there are actually many more.

Here are some of the changes:

  • Added an updated user agent string for the GoogleProducer crawler
  • Added content encoding information
  • Added a new section about technical properties

The technical properties section contains entirely new information that didn’t previously exist. There are no changes to the crawler behavior, but by creating three topically specific pages Google is able to add more information to the crawler overview page while simultaneously making it smaller.

This is the new information about content encoding (compression):

“Google’s crawlers and fetchers support the following content encodings (compressions): gzip, deflate, and Brotli (br). The content encodings supported by each Google user agent is advertised in the Accept-Encoding header of each request they make. For example, Accept-Encoding: gzip, deflate, br.”

There is additional information about crawling over HTTP/1.1 and HTTP/2, plus a statement about their goal being to crawl as many pages as possible without impacting the website server.
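The gzip and deflate encodings in the quote above are available in Python’s standard library, which makes it easy to see what a server honoring a crawler’s Accept-Encoding header would do. A minimal sketch (Brotli, the “br” encoding, needs a third-party package and is omitted here):

```python
import gzip
import zlib

# A server that honors Accept-Encoding: gzip, deflate, br compresses the
# response body with one of the advertised encodings. HTTP "deflate" is the
# zlib container format, which zlib.compress produces.
body = b"<html><body>Hello, Googlebot!</body></html>" * 50

gzipped = gzip.compress(body)     # Content-Encoding: gzip
deflated = zlib.compress(body)    # Content-Encoding: deflate

# A crawler decompresses the body on receipt; round-tripping is lossless.
assert gzip.decompress(gzipped) == body
assert zlib.decompress(deflated) == body

print(f"original: {len(body)} bytes, "
      f"gzip: {len(gzipped)}, deflate: {len(deflated)}")
```

Compression like this is why crawl bandwidth costs are lower than raw page sizes suggest: repetitive HTML shrinks dramatically under either encoding.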

What Is The Goal Of The Revamp?

The change to the documentation was made because the overview page had grown large, and additional crawler information would have made it even larger. A decision was made to break the page into three subtopics so that the specific crawler content can continue to grow while making room for more general information on the overview page. Spinning off subtopics into their own pages is a brilliant solution to the problem of how best to serve users.

This is how the documentation changelog explains the change:

“The documentation grew very long which limited our ability to extend the content about our crawlers and user-triggered fetchers.

…Reorganized the documentation for Google’s crawlers and user-triggered fetchers. We also added explicit notes about what product each crawler affects, and added a robots.txt snippet for each crawler to demonstrate how to use the user agent tokens. There were no meaningful changes to the content otherwise.”

The changelog downplays the changes by describing them as a reorganization, but the crawler overview was substantially rewritten, in addition to the creation of three brand-new pages.

While the content remains substantially the same, the division of it into sub-topics makes it easier for Google to add more content to the new pages without continuing to grow the original page. The original page, called Overview of Google crawlers and fetchers (user agents), is now truly an overview with more granular content moved to standalone pages.

Google published three new pages:

  1. Common crawlers
  2. Special-case crawlers
  3. User-triggered fetchers

1. Common Crawlers

As the title says, these are common crawlers, some of which are associated with GoogleBot, including the Google-InspectionTool, which uses the GoogleBot user agent. All of the bots listed on this page obey robots.txt rules.

These are the documented Google crawlers:

  • Googlebot
  • Googlebot Image
  • Googlebot Video
  • Googlebot News
  • Google StoreBot
  • Google-InspectionTool
  • GoogleOther
  • GoogleOther-Image
  • GoogleOther-Video
  • Google-CloudVertexBot
  • Google-Extended

2. Special-Case Crawlers

These crawlers are associated with specific products, crawl by agreement with users of those products, and operate from IP addresses that are distinct from the GoogleBot crawler IP addresses.

List of Special-Case Crawlers:

  • AdSense
    User Agent for Robots.txt: Mediapartners-Google
  • AdsBot
    User Agent for Robots.txt: AdsBot-Google
  • AdsBot Mobile Web
    User Agent for Robots.txt: AdsBot-Google-Mobile
  • APIs-Google
    User Agent for Robots.txt: APIs-Google
  • Google-Safety
    User Agent for Robots.txt: Google-Safety
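The changelog notes that Google added a robots.txt snippet for each crawler to demonstrate the user agent tokens. A hypothetical example using the tokens listed above (the paths are illustrative, not from Google’s documentation):

```
# Keep AdSense's crawler out of a staging area
User-agent: Mediapartners-Google
Disallow: /staging/

# AdsBot variants must be named explicitly; they don't follow
# the generic "User-agent: *" group
User-agent: AdsBot-Google
User-agent: AdsBot-Google-Mobile
Disallow: /private/
```
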

3. User-Triggered Fetchers

The User-triggered Fetchers page covers bots that are activated by user request, explained like this:

“User-triggered fetchers are initiated by users to perform a fetching function within a Google product. For example, Google Site Verifier acts on a user’s request, or a site hosted on Google Cloud (GCP) has a feature that allows the site’s users to retrieve an external RSS feed. Because the fetch was requested by a user, these fetchers generally ignore robots.txt rules. The general technical properties of Google’s crawlers also apply to the user-triggered fetchers.”

The documentation covers the following bots:

  • Feedfetcher
  • Google Publisher Center
  • Google Read Aloud
  • Google Site Verifier

Takeaway:

Google’s crawler overview page had become overly comprehensive and possibly less useful, because people don’t always need a comprehensive page; they’re often just interested in specific information. The overview page is now less specific but easier to understand. It serves as an entry point where users can drill down to more specific subtopics related to the three kinds of crawlers.

This change offers insight into how to freshen up a page that might be underperforming because it has become too comprehensive. Breaking a comprehensive page into standalone pages allows the subtopics to address specific users’ needs and possibly makes them more useful should they rank in the search results.

I would not say that the change reflects anything in Google’s algorithm; it only reflects how Google updated its documentation to make it more useful and to set it up for adding even more information.

Read Google’s New Documentation

Overview of Google crawlers and fetchers (user agents)

List of Google’s common crawlers

List of Google’s special-case crawlers

List of Google user-triggered fetchers

Featured Image by Shutterstock/Cast Of Thousands

HubSpot Rolls Out AI-Powered Marketing Tools via @sejournal, @MattGSouthern

HubSpot announced a push into AI this week at its annual Inbound marketing conference, launching “Breeze.”

Breeze is an artificial intelligence layer integrated across the company’s marketing, sales, and customer service software.

According to HubSpot, the goal is to provide marketers with easier, faster, and more unified solutions as digital channels become oversaturated.

Karen Ng, VP of Product at HubSpot, tells Search Engine Journal in an interview:

“We’re trying to create really powerful tools for marketers to rise above the noise that’s happening now with a lot of this AI-generated content. We might help you generate titles or a blog content…but we do expect kind of a human there to be a co-assist in that.”

Breeze AI Covers Copilot, Workflow Agents, Data Enrichment

The Breeze layer includes three main components.

Breeze Copilot

An AI assistant that provides personalized recommendations and suggestions based on data in HubSpot’s CRM.

Ng explained:

“It’s a chat-based AI companion that assists with tasks everywhere – in HubSpot, the browser, and mobile.”

Breeze Agents

A set of four agents that can automate entire workflows like content generation, social media campaigns, prospecting, and customer support without human input.

Ng added the following context:

“Agents allow you to automate a lot of those workflows. But it’s still, you know, we might generate for you a content backlog. But taking a look at that content backlog, and knowing what you publish is still a really important key of it right now.”

Breeze Intelligence

Combines HubSpot customer data with third-party sources to build richer profiles.

Ng stated:

“It’s really important that we’re bringing together data that can be trusted. We know your AI is really only as good as the data that it’s actually trained on.”

Addressing AI Content Quality

While prioritizing AI-driven productivity, Ng acknowledged the need for human oversight of AI content:

“We really do need eyes on it still…We think of that content generation as still human-assisted.”

Marketing Hub Updates

Beyond Breeze, HubSpot is updating Marketing Hub with tools like:

  • Content Remix to repurpose videos into clips, audio, blogs, and more
  • AI video creation via integration with HeyGen
  • YouTube and Instagram Reels publishing
  • Improved marketing analytics and attribution

The announcements signal HubSpot’s AI-driven vision for unifying customer data.

But as Ng tells us, “We definitely think a lot about the data sources…and then also understand your business.”

HubSpot’s updates are rolling out now, with some in public beta.


Featured Image: Poetra.RH/Shutterstock

How To Choose The Right Bid Strategy For Lead Generation Campaigns via @sejournal, @adsliaison

Welcome back to our series on getting started with value-based bidding for lead gen marketers!

We’ve discussed evaluating whether it makes sense for your business, setting your data strategy, and assigning the right values for your conversions.

Now, we’re going to cover the final step before activating your value-based bidding strategy: choosing the right bid strategy for your lead generation campaigns.

The big benefit of value-based bidding is that it allows you to prioritize conversions that are most likely to drive higher revenue or achieve your specific business goals, such as sales, profit margins, or customer lifetime value.

By assigning different values to different conversion actions, you gain greater control over your bidding strategy and optimize for conversions that deliver the most significant impact.

Whether it’s a purchase, a lead, or a specific action on your website, value-based bidding ensures that your bids reflect the true worth of each conversion, enabling you to maximize your return on investment (ROI).

The bidding strategy you select to optimize for value depends on a few factors. Check out this two-minute video for a quick overview, and then keep reading to dive deeper.

Which Value-Based Bidding Strategy Should You Choose?

With value-based bidding, Smart Bidding predicts the value of a potential conversion with each auction.

  • If the bid strategy determines that an impression is likely to generate a conversion with high value, it will place a higher bid.
  • If the bid strategy determines that the impression isn’t likely to generate a high-value conversion, it’ll place a lower bid.

Value-based bidding can use data from all of your campaigns, including the conversion values you are reporting, to optimize performance.

It also uses real-time signals, such as device, browser, location, and time of day, and can adjust bids based on whether or not someone is on one of your remarketing lists.
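The bid logic described above can be illustrated with a toy model. Smart Bidding’s actual signals and models are not public, so this is only a sketch of why a higher predicted conversion value leads to a higher bid under a target ROAS:

```python
def auction_bid(p_conversion: float, predicted_value: float,
                target_roas: float) -> float:
    """Hypothetical sketch: bid up to the expected conversion value
    divided by the target ROAS. This is NOT Google's actual model;
    it only illustrates the value-weighted behavior described above."""
    expected_value = p_conversion * predicted_value
    return expected_value / target_roas

# A likely high-value impression earns a higher bid than a low-value one,
# at the same predicted conversion rate and a 400% (4.0) target ROAS.
high = auction_bid(p_conversion=0.05, predicted_value=500.0, target_roas=4.0)
low = auction_bid(p_conversion=0.05, predicted_value=50.0, target_roas=4.0)
print(f"high-value bid: ${high:.2f}, low-value bid: ${low:.2f}")
```

Dividing by the target ROAS is what ties spend control to the target rather than to a budget cap, which is the point made in the “uncapped budget” section below.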

To start bidding for value, ensure the following:

  • Measure at least two unique conversion values optimized for your business.
  • Have at least 15 conversions at the account level in the past 30 days. (Note: Demand Gen should have at least 50 conversions in the past 35 days, with at least 10 in the last seven days or 100 conversions in the past 35 days.)

You’ve got two bidding strategy options to tell Google how you want to optimize for value:

  • Maximize conversion value.
  • Maximize conversion value with a target ROAS.

Here’s a quick way to think about each before we dig in further:

Maximize conversion value
  • Goal: Maximize conversion value for a specific budget.
  • When: Your priority is to maximize value and spend the budget, and you don’t have a specific ROI target.

Maximize conversion value with a target ROAS
  • Goal: Maximize conversion value for a target return on ad spend.
  • When: You have a specific ROI target.

Maximize Conversion Value

This option focuses on maximizing conversion value within a defined budget.

It’s suited for advertisers who prioritize driving the highest possible value within a specific allocated budget.

Advertisers often start with this before moving to a target ROAS strategy.

Maximize Conversion Value At A Target ROAS

This option allows you to set a specific target return on ad spend (ROAS) and instructs Google Ads to optimize your bids to achieve that target while maximizing conversion value.

Target ROAS: Why Your Budget Should Be Uncapped

When bidding with a target ROAS, your campaign budget should not be limited or capped.

That may sound scary at first, but let me reassure you that it doesn’t mean you don’t have control over your campaign spend!

Your ROAS target is the lever that manages your spend.

With a target ROAS, you’re telling Google to optimize for value at that specific target rather than find as much value within a specific budget.

So, when your budget is limited, it potentially prevents the system from having the flexibility to find the next conversion at your target.

Setting Your ROAS Targets

You can choose whether to use a recommended target ROAS or set your own.

When setting ROAS targets, use the last 30 days’ return on ad spend as a benchmark.

Google’s target ROAS recommendations are calculated based on your actual ROAS over the last few weeks.

This recommendation excludes performance from the last few days to account for conversions that may take more than a day to complete following an ad click or interaction (such as an engaged view).

You can find more details on target ROAS here.
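The benchmark described above, your last 30 days’ return on ad spend, is straightforward to compute from your own reporting. A minimal sketch (the report rows are hypothetical):

```python
# ROAS = conversion value / ad spend. Google Ads expresses target ROAS as a
# percentage, so $5 of value per $1 spent corresponds to a 500% target.
daily = [
    # (ad_spend, conversion_value) -- hypothetical rows from a
    # last-30-days campaign report
    (120.0, 540.0),
    (95.0, 610.0),
    (140.0, 505.0),
]

spend = sum(s for s, _ in daily)
value = sum(v for _, v in daily)
roas = value / spend

print(f"Benchmark ROAS: {roas:.2f} ({roas:.0%} as a Google Ads target)")
```

Computing the benchmark yourself is also a useful cross-check against Google’s recommended target, which, as noted above, excludes the last few days to allow for conversion lag.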

Get Started With An Experiment

While you can launch value-based bidding directly, you may want to start with a small test using a campaign experiment. This allows you to compare the performance of value-based bidding against your existing bidding strategy and make data-driven decisions.

You have two options to create a campaign experiment:

One-Click Experiment From Recommendations Page

You may see suggestions on implementing value-based bidding on your Recommendations page.

The recommendation to “Bid more efficiently with Maximize conversion value” will show if our simulations identify that your account is measuring two or more unique conversion values and will likely benefit from this strategy.

From this recommendation, you can create a one-click experiment to test the impact of value-based bidding on a specific campaign.

Custom Experiment

You also have the option to create a more tailored experiment to test value-based bidding in your campaign.

Be sure to choose a campaign that receives sufficient traffic and conversions to generate statistically significant results.

Configure the experiment to use value-based bidding, while the original campaign continues to use your existing bidding strategy.

See the instructions here to set up a custom experiment.

How To Jumpstart Value-Based Bidding

You can employ strategies to jumpstart the system, such as initially setting a low ROAS target or starting with Maximize conversion value without a ROAS target.

If you opt to start with Maximize conversion value without a target, be sure that your budgets are aligned with your daily spend goals.

Allow A Ramp-Up Period Before You Optimize

Once you’ve launched value-based bidding, give the system a ramp-up period of two weeks or three conversion cycles. This allows Google Ads to learn and optimize your bids effectively.

When measuring performance, be sure to exclude this initial period from your analysis to obtain accurate insights.

We’ve now covered all the basics for getting started with value-based bidding.

In our last segment, we’ll discuss monitoring and optimizing your performance to drive value for your business.

More resources:


Featured Image: Sammby/Shutterstock