Google DeepMind has a new way to look inside an AI’s “mind”

AI has led to breakthroughs in drug discovery and robotics and is in the process of entirely revolutionizing how we interact with machines and the web. The only problem is we don’t know exactly how it works, or why it works so well. We have a fair idea, but the details are too complex to unpick. That’s a problem: It could lead us to deploy an AI system in a highly sensitive field like medicine without realizing it may have critical flaws embedded in its workings.

A team at Google DeepMind that studies something called mechanistic interpretability has been working on new ways to let us peer under the hood. At the end of July, it released Gemma Scope, a tool to help researchers understand what is happening when AI is generating an output. The hope is that if we have a better understanding of what is happening inside an AI model, we’ll be able to control its outputs more effectively, leading to better AI systems in the future.

“I want to be able to look inside a model and see if it’s being deceptive,” says Neel Nanda, who runs the mechanistic interpretability team at Google DeepMind. “It seems like being able to read a model’s mind should help.”

Mechanistic interpretability, also known as “mech interp,” is a new research field that aims to understand how neural networks actually work. At the moment, very basically, we put inputs into a model in the form of a lot of data, and we get a bunch of model weights at the end of training. These are the parameters that determine how a model makes decisions. We have some idea of what’s happening between the inputs and the model weights: Essentially, the AI is finding patterns in the data and drawing conclusions from them, but those patterns can be incredibly complex and often very hard for humans to interpret.

It’s like a teacher reviewing the answers to a complex math problem on a test. The student—the AI, in this case—wrote down the correct answer, but the work looks like a bunch of squiggly lines. This example assumes the AI is always getting the correct answer, but that’s not always true; the AI student may have found an irrelevant pattern that it’s assuming is valid. For example, some current AI systems will give you the result that 9.11 is bigger than 9.8. Different methods developed in the field of mechanistic interpretability are beginning to shed a little bit of light on what may be happening, essentially making sense of the squiggly lines.

“A key goal of mechanistic interpretability is trying to reverse-engineer the algorithms inside these systems,” says Nanda. “We give the model a prompt, like ‘Write a poem,’ and then it writes some rhyming lines. What is the algorithm by which it did this? We’d love to understand it.”

To find features—or categories of data that represent a larger concept—in its AI model, Gemma, DeepMind ran a tool known as a “sparse autoencoder” on each of the model’s layers. You can think of a sparse autoencoder as a microscope that zooms in on those layers and lets you look at their details. For example, if you prompt Gemma about a chihuahua, it will trigger the “dogs” feature, lighting up what the model knows about “dogs.” The autoencoder is called “sparse” because it limits the number of neurons that activate at once, pushing for a more efficient and generalized representation of the data.
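To make the idea concrete, here is a minimal sketch of a sparse autoencoder’s forward pass in Python. This is not DeepMind’s implementation: the layer width, feature count, and weights are all made up (and untrained), whereas a real sparse autoencoder learns its weights by reconstructing a model’s activations under a sparsity penalty, across thousands of dimensions and features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: real SAEs attach to layers thousands of units wide
# and learn tens of thousands to millions of features.
d_model, d_features = 16, 64

# Randomly initialized weights stand in for a trained autoencoder; in
# practice these are learned by reconstructing activations under an L1
# sparsity penalty, which drives most feature activations to exactly zero.
W_enc = rng.normal(scale=0.1, size=(d_model, d_features))
b_enc = np.zeros(d_features)
W_dec = rng.normal(scale=0.1, size=(d_features, d_model))

def sae_features(activation):
    """Encode: map one layer activation to non-negative feature activations."""
    return np.maximum(activation @ W_enc + b_enc, 0.0)

def sae_reconstruct(features):
    """Decode: approximately rebuild the original activation."""
    return features @ W_dec

activation = rng.normal(size=d_model)  # stand-in for one internal activation vector
feats = sae_features(activation)       # shape (64,): the "microscope" view
recon = sae_reconstruct(feats)         # shape (16,): the rebuilt activation
```

The interpretable objects are the individual entries of `feats`: when one of them fires strongly on prompts about chihuahuas, terriers, and puppies, researchers label it a “dogs” feature.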

The tricky part of sparse autoencoders is deciding how granular you want to get. Think again about the microscope. You can magnify something to an extreme degree, but it may make what you’re looking at impossible for a human to interpret. But if you zoom too far out, you may be limiting what interesting things you can see and discover. 

DeepMind’s solution was to run sparse autoencoders of different sizes, varying the number of features each autoencoder was asked to find. The goal was not for DeepMind’s researchers to thoroughly analyze the results on their own. Gemma and the autoencoders are open-source, so this project was aimed more at spurring interested researchers to look at what the sparse autoencoders found and, hopefully, gain new insights into the model’s internal logic. Since DeepMind ran autoencoders on each layer of its model, a researcher could map the progression from input to output to a degree we haven’t seen before.

“This is really exciting for interpretability researchers,” says Josh Batson, a researcher at Anthropic. “If you have this model that you’ve open-sourced for people to study, it means that a bunch of interpretability research can now be done on the back of those sparse autoencoders. It lowers the barrier to entry to people learning from these methods.”

Neuronpedia, a platform for mechanistic interpretability, partnered with DeepMind in July to build a demo of Gemma Scope that you can play around with right now. In the demo, you can test out different prompts and see how the model breaks up your prompt and what activations your prompt lights up. You can also mess around with the model. For example, if you turn the feature about dogs way up and then ask the model a question about US presidents, Gemma will find some way to weave in random babble about dogs, or the model may just start barking at you.
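Turning a feature “way up” corresponds to a simple vector operation: adding a large multiple of that feature’s decoder direction to the model’s internal activation before it flows onward. The sketch below is illustrative only, with made-up sizes and a random matrix standing in for a trained autoencoder’s decoder.

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up sizes; W_dec stands in for a trained SAE's decoder matrix.
d_model, d_features = 16, 64
W_dec = rng.normal(scale=0.1, size=(d_features, d_model))

def steer(activation, feature_idx, strength):
    """Push an activation along one feature's decoder direction.

    The decoder row for a feature is the direction in activation space
    that the autoencoder associates with a concept (say, "dogs");
    adding a large multiple of it mimics turning that feature way up.
    """
    return activation + strength * W_dec[feature_idx]

activation = rng.normal(size=d_model)  # one internal activation vector
steered = steer(activation, feature_idx=3, strength=10.0)
```

With the intervention applied at every step of generation, the concept bleeds into whatever the model says next, which is why an over-amplified “dogs” feature produces dog babble in answers about US presidents.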

One interesting thing about sparse autoencoders is that they are unsupervised, meaning they find features on their own. That leads to surprising discoveries about how the models break down human concepts. “My personal favorite feature is the cringe feature,” says Joseph Bloom, science lead at Neuronpedia. “It seems to appear in negative criticism of text and movies. It’s just a great example of tracking things that are so human on some level.” 

You can search for concepts on Neuronpedia and it will highlight what features are being activated on specific tokens, or words, and how strongly each one is activated. “If you read the text and you see what’s highlighted in green, that’s when the model thinks the cringe concept is most relevant. The most active example for cringe is somebody preaching at someone else,” says Bloom.

Some features are proving easier to track than others. “One of the most important features that you would want to find for a model is deception,” says Johnny Lin, founder of Neuronpedia. “It’s not super easy to find: ‘Oh, there’s the feature that fires when it’s lying to us.’ From what I’ve seen, it hasn’t been the case that we can find deception and ban it.”

DeepMind’s research is similar to what another AI company, Anthropic, did back in May with Golden Gate Claude. It used sparse autoencoders to find the parts of its model, Claude, that lit up when discussing the Golden Gate Bridge in San Francisco. It then amplified the activations related to the bridge to the point where Claude identified not as Claude, an AI model, but as the physical Golden Gate Bridge, and would respond to prompts as the bridge.

Although it may just seem quirky, mechanistic interpretability research may prove incredibly useful. “As a tool for understanding how the model generalizes and what level of abstraction it’s working at, these features are really helpful,” says Batson.

For example, a team led by Samuel Marks, now at Anthropic, used sparse autoencoders to find features that showed a particular model was associating certain professions with a specific gender. They then turned off these gender features to reduce bias in the model. This experiment was done on a very small model, so it’s unclear whether the approach will transfer to a much larger one.

Mechanistic interpretability research can also give us insights into why AI makes errors. In the case of the assertion that 9.11 is larger than 9.8, researchers from Transluce saw that the question was triggering the parts of an AI model related to Bible verses and September 11. The researchers concluded the AI could be interpreting the numbers as dates, asserting the later date, 9/11, as greater than 9/8. And in a lot of books like religious texts, section 9.11 comes after section 9.8, which may be why the AI thinks of it as greater. Once they knew why the AI made this error, the researchers tuned down the AI’s activations on Bible verses and September 11, which led to the model giving the correct answer when prompted again on whether 9.11 is larger than 9.8.

There are also other potential applications. Currently, a system-level prompt is built into LLMs to deal with situations like users who ask how to build a bomb. When you ask ChatGPT a question, the model is first secretly prompted by OpenAI to refrain from telling you how to make bombs or do other nefarious things. But it’s easy for users to jailbreak AI models with clever prompts, bypassing any restrictions. 

If the creators of the models are able to see where in an AI the bomb-building knowledge is, they can theoretically turn off those nodes permanently. Then even the most cleverly written prompt wouldn’t elicit an answer about how to build a bomb, because the AI would literally have no information about how to build a bomb in its system.

This type of granularity and precise control are easy to imagine but extremely hard to achieve with the current state of mechanistic interpretability. 

“A limitation is the steering [influencing a model by adjusting its activations] is just not working that well, and so when you steer to reduce violence in a model, it ends up completely lobotomizing its knowledge in martial arts. There’s a lot of refinement to be done in steering,” says Lin. The knowledge of “bomb making,” for example, isn’t just a simple on-and-off switch in an AI model. It is most likely woven into multiple parts of the model, and turning it off would probably involve hampering the AI’s knowledge of chemistry. Any tinkering may have benefits but also significant trade-offs.

That said, if we are able to dig deeper and peer more clearly into the “mind” of AI, DeepMind and others are hopeful that mechanistic interpretability could represent a plausible path to alignment—the process of making sure AI is actually doing what we want it to do.

What’s on the table at this year’s UN climate conference

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

It’s time for a party—the Conference of the Parties, that is. Talks kicked off this week at COP29 in Baku, Azerbaijan. Running for a couple of weeks each year, the global summit is the largest annual meeting on climate change.

The issue on the table this time around: Countries need to agree to set a new goal on how much money should go to developing countries to help them finance the fight against climate change. Complicating things? A US president-elect whose approach to climate is very different from that of the current administration (understatement of the century).

This is a big moment that could set the tone for what the next few years of the international climate world looks like. Here’s what you need to know about COP29 and how Donald Trump’s election is coloring things.

The UN COP meetings are an annual chance for nearly 200 nations to get together to discuss (and hopefully act on) climate change. Greatest hits from the talks include the Paris Agreement, a 2015 global accord that set a goal to limit global warming to 1.5 °C (2.7 °F) above preindustrial levels.

This year, the talks are in Azerbaijan, a petrostate if there ever was one. Oil and gas production makes up over 90% of the country’s export revenue and nearly half its GDP as of 2022. A perfectly ironic spot for a global climate summit!

The biggest discussion this year centers on global climate finance—specifically, how much of it is needed to help developing countries address climate change and adapt to changing conditions. The current goal, set in 2009, is for industrialized countries to provide $100 billion each year to developing nations. The deadline was 2020, and that target was actually met for the first time in 2022, according to the Organization for Economic Cooperation and Development, which keeps track of total finance via reports from contributing countries. Currently, most of that funding is in the form of public loans and grants.

The thing is, that $100 billion number was somewhat arbitrary—in Paris in 2015, countries agreed that a new, larger target should be set in 2025 to take into account how much countries actually need.

It’s looking as if the magic number is somewhere around $1 trillion each year. However, it remains to be seen how this goal will end up shaking out, because there are disagreements about basically every part of this. What should the final number be? What kind of money should count—just public funds, or private investments as well? Which nations should pay? How long will this target stand? What, exactly, would this money be going toward?

Working out all those details is why nations are gathering right now. But one shadow looming over these negotiations is the impending return of Donald Trump.

As I covered last week, Trump’s election will almost certainly result in less progress on cutting emissions than we might have seen under a more climate-focused administration. But arguably an even bigger deal than domestic progress (or lack thereof) will be how Trump shifts the country’s climate position on the international stage.

The US has emitted more carbon pollution into the atmosphere than any other country, it currently leads the world in per capita emissions, and it’s the world’s richest economy. If anybody should be a leader at the table in talks about climate finance, it’s the US. And yet, Trump is coming into power soon, and we’ve all seen this film before. 

Last time Trump was in office, he pulled the US out of the Paris Agreement. He’s made promises to do it again—and could go one step further by backing out of the UN Framework Convention on Climate Change (UNFCCC) altogether. If leaving the Paris Agreement is walking away from the table, withdrawing from the UNFCCC is like hopping on a rocket and blasting in a different direction. It’s a more drastic action and could be tougher to reverse in the future, though experts also aren’t sure if Trump could technically do this on his own.

The uncertainty of what happens next in the US is a cloud hanging over these negotiations. “This is going to be harder because we don’t have a dynamic and pushy and confident US helping us on climate action,” said Camilla Born, an independent climate advisor and former UK senior official at COP26, during an online event last week hosted by Carbon Brief.

Some experts are confident that others will step up to fill the gap. “There are many drivers of climate action beyond the White House,” said Mohamed Adow, founding director of Power Shift Africa, at the Carbon Brief event.

If I could characterize the current vibe in the climate world, it’s uncertainty. But the negotiations over the next couple of weeks could provide clues to what we can expect for the next few years. Just how much will a Trump presidency slow global climate action? Will the European Union step up? Could this cement the rise of China as a climate leader? We’ll be watching it all.


Now read the rest of The Spark

Related reading

In case you want some additional context from the last few years of these meetings, here’s my coverage of last year’s fight at COP28 over a transition away from fossil fuels, and a newsletter about negotiations over the “loss and damages” fund at COP27.

For the nitty-gritty details about what’s on the table at COP29, check out this very thorough explainer from Carbon Brief.


Another thing

Trump’s election will have significant ripple effects across the economy and our lives. His victory is a tragic loss for climate progress, as my colleague James Temple wrote in an op-ed last week. Give it a read, if you haven’t already, to dig into some of the potential impacts we might see over the next four years and beyond. 

Keeping up with climate  

The US Environmental Protection Agency finalized a rule to fine oil and gas companies for methane emissions. The fee was part of the Inflation Reduction Act of 2022. (Associated Press)
→ This rule faces a cloudy future under the Trump administration; industry groups are already talking about repealing it. (NPR)

Speaking of the EPA, Donald Trump chose Lee Zeldin, a former Republican congressman from New York, to lead the agency. Zeldin isn’t particularly known for climate or economic policy. (New York Times)

Oil giant BP is scaling back its early-stage hydrogen projects. The company revealed in an earnings report that it’s canceling 18 such projects and currently plans to greenlight between five and 10. (TechCrunch)

Investors betting against renewable energy scored big last week, earning nearly $1.2 billion as stocks in that sector tumbled. (Financial Times)

Lithium iron phosphate batteries are taking over the world, or at least electric vehicles. These lithium-ion batteries are cheaper and longer-lasting than their nickel-containing cousins, though they also tend to be heavier. (Canary Media)
→ I wrote about this trend last year in a newsletter about batteries and their ingredients. (MIT Technology Review)

The US unveiled plans to triple its nuclear energy capacity by 2050. That’s an additional 200 gigawatts’ worth of consistently available power. (Bloomberg)

Five subsea cables that can help power millions of homes just got the green light in Great Britain. The projects will help connect the island to other power grids, as well as to offshore wind farms in Dutch and Belgian waters. (The Guardian)

How this grassroots effort could make AI voices more diverse

We are on the cusp of a voice AI boom, with tech companies such as Apple and OpenAI rolling out the next generation of artificial-intelligence-powered assistants. But the default voices for these assistants are often white American—British, if you’re lucky—and most definitely speak English. They represent only a tiny proportion of the many dialects and accents in the English language, which spans many regions and cultures. And if you’re one of the billions of people who don’t speak English, bad luck: These tools don’t sound nearly as good in other languages.

This is because the data that has gone into training these models is limited. In AI research, most data used to train models is extracted from the English-language internet, which reflects Anglo-American culture. But there is a massive grassroots effort underway to change this status quo and bring more transparency and diversity to what AI sounds like: Mozilla’s Common Voice initiative. 

The data set Common Voice has created over the past seven years is one of the most useful resources for people wanting to build voice AI. It has seen a massive spike in downloads, partly thanks to the current AI boom; it recently hit the 5 million mark, up from 38,500 in 2020. Creating this data set has not been easy, mainly because the data collection relies on an army of volunteers. Their numbers have also jumped, from just under 500,000 in 2020 to over 900,000 in 2024. But by giving its data away, some members of this community argue, Mozilla is encouraging volunteers to effectively do free labor for Big Tech. 

Since 2017, volunteers for the Common Voice project have collected a total of 31,000 hours of voice data in around 180 languages as diverse as Russian, Catalan, and Marathi. If you’ve used a service that uses audio AI, it’s likely been trained at least partly on Common Voice. 

Mozilla’s cause is a noble one. As AI is increasingly integrated into our lives and the ways we communicate, it becomes more important that the tools we interact with sound like us. The technology could break down communication barriers and help convey information in a compelling way to, for example, people who can’t read. But instead, an intense focus on English risks entrenching a new colonial world order and wiping out languages entirely.

“It would be such an own goal if, rather than finally creating truly multimodal, multilingual, high-performance translation models and making a more multilingual world, we actually ended up forcing everybody to operate in, like, English or French,” says EM Lewis-Jong, a director for Common Voice. 

Common Voice is open source, which means anyone can see what has gone into the data set, and users can do whatever they want with it for free. This kind of transparency is unusual in AI data governance. Most large audio data sets simply aren’t publicly available, and many consist of data that has been scraped from sites like YouTube, according to research conducted by a team from the University of Washington, Carnegie Mellon University, and Northwestern University.

The vast majority of language data is collected by volunteers such as Bülent Özden, a researcher from Turkey. Since 2020, he has been not only donating his voice but also raising awareness around the project to get more people to donate. He recently spent two months working full-time on correcting data and checking for typos in Turkish. For him, improving AI models is not the only motivation to do this work.

“I’m doing it to preserve cultures, especially low-resource [languages],” Özden says. He tells me he has recently started collecting samples of Turkey’s smaller languages, such as Circassian and Zaza.

However, as I dug into the data set, I noticed that the coverage of languages and accents is very uneven. There are only 22 hours of Finnish voices from 231 people. In comparison, the data set contains 3,554 hours of English from 94,665 speakers. Some languages, such as Korean and Punjabi, are even less well represented. Even though they have tens of millions of speakers, they account for only a couple of hours of recorded data. 

This imbalance has emerged because data collection efforts are started from the bottom up by language communities themselves, says Lewis-Jong. 

“We’re trying to give communities what they need to create their own AI training data sets. We have a particular focus on doing this for language communities where there isn’t any data, or where maybe larger tech organizations might not be that interested in creating those data sets,” Lewis-Jong says. They hope that with the help of volunteers and various bits of grant funding, the Common Voice data set will have close to 200 languages by the end of the year.

Common Voice’s permissive license means that many companies rely on it—for example, the Swedish startup Mabel AI, which builds translation tools for health-care providers. One of the first languages the company used was Ukrainian; it built a translation tool to help Ukrainian refugees interact with Swedish social services, says Karolina Sjöberg, Mabel AI’s founder and CEO. The team has since expanded to other languages, such as Arabic and Russian. 

The problem with a lot of other audio data is that it consists of people reading from books or texts. The result is very different from how people really speak, especially when they are distressed or in pain, Sjöberg says. Because anyone can submit sentences to Common Voice for others to read aloud, Mozilla’s data set also includes sentences that are more colloquial and feel more natural, she says.

Not that it is perfectly representative. The Mabel AI team soon found out that most voice data in the languages it needed was donated by younger men, which is fairly typical for the data set. 

“The refugees that we intended to use the app with were really anything but younger men,” Sjöberg says. “So that meant that the voice data that we needed did not quite match the voice data that we had.” The team started collecting its own voice data from Ukrainian women, as well as from elderly people. 

Unlike other data sets, Common Voice asks participants to share their gender and details about their accent. Making sure different genders are represented is important to fight bias in AI models, says Rebecca Ryakitimbo, a Common Voice fellow who created the project’s gender action plan. More diversity leads not only to better representation but also to better models. Systems that are trained on narrow and homogeneous data tend to spew stereotyped and harmful results.

“We don’t want a case where we have a chatbot that is named after a woman but does not give the same response to a woman as it would a man,” she says. 

Ryakitimbo has collected voice data in Kiswahili in Tanzania, Kenya, and the Democratic Republic of Congo. She tells me she wanted to collect voices from a socioeconomically diverse set of Kiswahili speakers and has reached out to women young and old living in rural areas, who might not always be literate or even have access to devices. 

This kind of data collection is challenging. The importance of collecting AI voice data can feel abstract to many people, especially if they aren’t familiar with the technologies. Ryakitimbo and volunteers would approach women in settings where they felt safe to begin with, such as presentations on menstrual hygiene, and explain how the technology could, for example, help disseminate information about menstruation. For women who did not know how to read, the team read out sentences that they would repeat for the recording. 

The Common Voice project is bolstered by the belief that languages form a really important part of identity. “We think it’s not just about language, but about transmitting culture and heritage and treasuring people’s particular cultural context,” says Lewis-Jong. “There are all kinds of idioms and cultural catchphrases that just don’t translate,” they add. 

Common Voice is the only audio data set where English doesn’t dominate, says Willie Agnew, a researcher at Carnegie Mellon University who has studied audio data sets. “I’m very impressed with how well they’ve done that and how well they’ve made this data set that is actually pretty diverse,” Agnew says. “It feels like they’re way far ahead of almost all the other projects we looked at.” 

I spent some time verifying the recordings of other Finnish speakers on the Common Voice platform. As their voices echoed in my study, I felt surprisingly touched. We had all gathered around the same cause: making AI data more inclusive, and making sure our culture and language were properly represented in the next generation of AI tools.

But I had some big questions about what would happen to my voice if I donated it. Once it was in the data set, I would have no control over how it might be used afterwards. The tech sector isn’t exactly known for giving people proper credit, and the data is available for anyone’s use.

“As much as we want it to benefit the local communities, there’s a possibility that also Big Tech could make use of the same data and build something that then comes out as the commercial product,” says Ryakitimbo. Though Mozilla does not share who has downloaded Common Voice, Lewis-Jong tells me Meta and Nvidia have said that they have used it.

Open access to this hard-won and rare language data is not something all minority groups want, says Harry H. Jiang, a researcher at Carnegie Mellon University, who was part of the team doing audit research. For example, Indigenous groups have raised concerns. 

“Extractivism” is something that Mozilla has been thinking about a lot over the past 18 months, says Lewis-Jong. Later this year the project will work with communities to pilot alternative licenses, including the Nwulite Obodo Open Data License, which was created by researchers at the University of Pretoria for sharing African data sets more equitably. For example, people who want to download the data might be asked to write a request with details on how they plan to use it, and they might be allowed to license it only for certain products or for a limited time. Users might also be asked to contribute to community projects that support poverty reduction, says Lewis-Jong.

Lewis-Jong says the pilot is a learning exercise to explore whether people will want data with alternative licenses, and whether they are sustainable for communities managing them. The hope is that it could lead to something resembling “open source 2.0.”

In the end, I decided to donate my voice. I received a list of phrases to say, sat in front of my computer, and hit Record. One day, I hope, my effort will help a company or researcher build voice AI that sounds less generic, and more like me. 

This story has been updated.

The Download: diversifying AI voices, and a science-fiction glimpse into the future

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

How this grassroots effort could make AI voices more diverse

We are on the cusp of a voice AI boom, as tech companies roll out the next generation of artificial-intelligence-powered assistants. But the default voices for these assistants are often white American—British, if you’re lucky—and most definitely speak English. And if you’re one of the billions of people who don’t speak English, bad luck: These tools don’t sound nearly as good in other languages.

This is because the data that has gone into training these models is limited. In AI research, most data used to train models is extracted from the English-language internet, which reflects Anglo-American culture. But there is a massive grassroots effort underway to change this status quo and bring more transparency and diversity to what AI sounds like. Read the full story.

—Melissa Heikkilä

Azalea: a science-fiction story

Fancy some fiction to read this weekend? If you enjoy sci-fi, check out this story written by Paolo Bacigalupi, featured in the latest edition of our print magazine. It imagines a future shaped by climate change—read it for yourself here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Cruise has admitted to falsifying a crash report
The report failed to mention that its robotaxi dragged a pedestrian after striking her. (San Francisco Chronicle)
+ The firm has been fined $500,000 to resolve the criminal charges. (WP $)

2 The US plans to investigate Microsoft’s cloud business 
As the Biden administration prepares to hand over power to Donald Trump’s team. (FT $)

3 Silicon Valley hates regulation. So does Trump.
AI and energy ventures could be the first to prosper under lighter-touch governance. (WP $)
+ Peter Thiel claims the tech industry is fed up with ‘wokeness.’ (Insider $)

4 Elon Musk’s cost-cutting team will be working 80+ hours a week
And you’ll need to subscribe to X to apply. (WSJ $)
+ As if that wasn’t appealing enough, the positions are also unpaid. (NBC News)
+ The ‘lucky’ workers can expect a whole lot of meetings. (Bloomberg $)

5 The trolls are in charge now
And it’s increasingly unclear what’s a joke and what’s an actual threat. (The Atlantic $)
+ It’s possible, but not guaranteed, that Trump’s more controversial cabinet picks will be defeated in the Senate. (New Yorker $)

6 How to keep abortion plans private in the age of Trump
Reproductive rights are under threat. Here’s how to protect them. (The Markup)

7 The first mechanical qubit is here
And mechanical quantum computers could be the first to benefit. (IEEE Spectrum)
+ Quantum computing is taking on its biggest challenge: noise. (MIT Technology Review)

8 Can Bluesky recapture the old Twitter’s magic?
No algorithms, no interfering billionaires. (Vox)
+ More than one million new users joined the platform earlier this week. (TechCrunch)

9 Weight-loss drugs could help to treat chronic pain
And could present a safer alternative to opioids. (New Scientist $)
+ Weight-loss injections have taken over the internet. But what does this mean for people IRL? (MIT Technology Review)

10 These are the most expensive photographs ever taken
The first human-taken pictures from space are truly awe-inspiring. (The Guardian)

Quote of the day

“It feels like it’s a platform for and by real people.”

—US politician Alexandria Ocasio-Cortez tells the Washington Post about the appeal of Bluesky as users join the social network after abandoning X.

The big story

How environmental DNA is giving scientists a new way to understand our world

February 2024

Environmental DNA is a relatively inexpensive, widespread, potentially automated way to observe the diversity and distribution of life.

Unlike previous techniques, which could identify DNA from, say, a single organism, the method also collects the swirling cloud of other genetic material that surrounds it. It can serve as a surveillance tool, offering researchers a means of detecting the seemingly undetectable.

By sampling eDNA, or mixtures of genetic material in water, soil, ice cores, cotton swabs, or practically any environment imaginable, even thin air, it is now possible to search for a specific organism or assemble a snapshot of all the organisms in a given place.

It offers a thrilling — and potentially chilling — way to collect information about organisms, including humans, as they go about their everyday business. Read the full story.

—Peter Andrey Smith

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or tweet ’em at me.)

+ Smells like punk spirit.
+ If you’ve been feeling creaky lately (and who hasn’t), give these mobility exercises a go.
+ Talk about a glow up—these beautiful locations really do emanate light.
+ It’s the truly chilling collab we never knew we needed: Bon Jovi has joined forces with Mr Worldwide himself, Pitbull.

Why the term “women of childbearing age” is problematic

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

Every journalist has favorite topics. Regular Checkup readers might already know some of mine, which include the quest to delay or reverse human aging, and new technologies for reproductive health and fertility. So when I saw trailers for The Substance, a film centered on one middle-aged woman’s attempt to reexperience youth, I had to watch it.

I won’t spoil the movie for anyone who hasn’t seen it yet (although I should warn that it is not for the squeamish, or anyone with an aversion to gratuitous close-ups of bums and nipples). But a key premise of the film involves harmful attitudes toward female aging.

“Hey, did you know that a woman’s fertility starts to decrease by the age of 25?” a powerful male character asks early in the film. “At 50, it just stops,” he later adds. He never explains what stops, exactly, but to the viewer the message is pretty clear: If you’re a woman, your worth is tied to your fertility. Once your fertile window is over, so are you.

The insidious idea that women’s bodies are, above all else, vessels for growing children has plenty of negative consequences for us all. But it has also set back scientific research and health policy.

Earlier this week, I chatted about this with Alana Cattapan, a political scientist at the University of Waterloo in Ontario, Canada. Cattapan has been exploring the concept of “women of reproductive age”—a descriptor that is ubiquitous in health research and policy.

The idea for the research project came to her when the Zika virus was making headlines around eight years ago. “I was planning on going to the Caribbean for a trip related to my partner’s research, and I kept getting advice that women of reproductive age shouldn’t go,” she told me. At the time, Zika was being linked to microcephaly—unusually small heads—in newborn babies. It was thought that the virus was affecting key stages of fetal development.

Cattapan wasn’t pregnant. And she wasn’t planning on becoming pregnant at the time. So why was she being advised to stay away from areas with the virus?

The experience got her thinking about the ways in which attitudes toward our bodies are governed by the idea of potential pregnancy. Take, for example, biomedical research on the causes and treatment of disease. Women’s health has lagged behind men’s as a focus of such work, for multiple reasons. Male bodies have long been considered the “default” human form, for example. And clinical trials have historically been designed in ways that make them less accessible for women.

Fears about the potential effects of drugs on fetuses have also played a significant role in keeping people who have the potential to become pregnant out of studies. “Scientific research has excluded women of ‘reproductive age,’ or women who might potentially conceive, in a blanket way,” says Cattapan. “The research that we have on many, many drugs does not include women and certainly doesn’t include women in pregnancy.”  

This lack of research goes some way to explaining why women are much more likely to experience side effects from drugs—some of them fatal. Over the last couple of decades, greater effort has been made to include people with ovaries and uteruses in clinical research. But we still have a long way to go.

Women are also often subjected to medical advice designed to protect a potential fetus, whether they are pregnant or not. Official guidelines on how much mercury-containing fish it is safe to eat can be different for “women of childbearing age,” according to the US Environmental Protection Agency, for example. And in 2021, the World Health Organization used the same language to describe people who should be a focus of policies to reduce alcohol consumption.

The takeaway message is that it’s women who should be thinking about fetal health, says Cattapan. Not the industries producing these chemicals or the agencies that regulate them. Not even the men who contribute to a pregnancy. Just women who stand a chance of getting pregnant, whether they intend to or not. “It puts the onus of the health of future generations squarely on the shoulders of women,” she says.

Another problem is the language itself. The term “women of reproductive age” typically includes women between 15 and 44. Women at one end of that spectrum will have very different bodies and a very different set of health risks from those at the other. And the term doesn’t account for people who might be able to get pregnant but don’t necessarily identify as female.

In other cases it is overly broad. In the context of the Zika virus, for example, it was not all women between the ages of 15 and 44 who should have considered taking precautions. The travel advice didn’t apply to people who’d had hysterectomies or did not have sex with men, for example, says Cattapan. “Precision here matters,” she says. 

More nuanced health advice would be helpful in cases like these. Guidelines often read as though they’re written for people assumed to be stupid, she adds. “I don’t think that needs to be the case.”

Another thing

On Thursday, president-elect Donald Trump said that he will nominate Robert F. Kennedy Jr. to lead the US Department of Health and Human Services. The news was not entirely a surprise, given that Trump had told an audience at a campaign rally that he would let Kennedy “go wild” on health, “the foods,” and “the medicines.”

The role would give Kennedy some control over multiple agencies, including the Food and Drug Administration, which regulates medicines in the US, and the Centers for Disease Control and Prevention, which coordinates public health advice and programs.

That’s extremely concerning to scientists, doctors, and health researchers, given Kennedy’s positions on evidence-based medicine, including his antivaccine stance. A few weeks ago, in a post on X, he referred to the FDA’s “aggressive suppression of psychedelics, peptides, stem cells, raw milk, hyperbaric therapies, chelating compounds, ivermectin, hydroxychloroquine, vitamins, clean foods, sunshine, exercise, nutraceuticals and anything else that advances human health and can’t be patented by Pharma.”  

“If you work for the FDA and are part of this corrupt system, I have two messages for you,” continued the post. “1. Preserve your records, and 2. Pack your bags.”

There’s a lot to unpack here. But briefly, we don’t yet have good evidence that mind-altering psychedelic drugs are the mental-health cure-alls some claim they are. There’s not enough evidence to support the many unapproved stem-cell treatments sold by clinics throughout the US and beyond, either. These “treatments” can be dangerous.

Health agencies are currently warning against the consumption of raw unpasteurized milk, because it might carry the bird flu virus that has been circulating in US dairy farms. And it’s far too simplistic to lump all vitamins together—some might be of benefit to some people, but not everyone needs supplements, and high doses can be harmful.

Kennedy’s 2021 book The Real Anthony Fauci has already helped spread misinformation about AIDS. Here at MIT Technology Review, we’ll continue our work reporting on whatever comes next. Watch this space.


Now read the rest of The Checkup

Read more from MIT Technology Review’s archive

The tech industry has a gender problem, as the Gamergate and various #MeToo scandals made clear. A new generation of activists is hoping to remedy it.

Male and female immune systems work differently, which is another reason why it’s vital to study women and female animals as well as males.

Both of the above articles were published in the Gender issue of MIT Technology Review magazine. You can read more from that issue online here.

Women are more likely to receive abuse online. My colleague Charlotte Jee spoke to the technologists working on an alternative way to interact online: a feminist internet.

From around the web 

The scientific community and biopharma investors are reacting to the news of Robert F. Kennedy Jr.’s nomination to lead the Department of Health and Human Services. “It’s hard to see HHS functioning,” said one biotech analyst. (STAT)

Virologist Beata Halassy successfully treated her own breast cancer with viruses she grew in the lab. She has no regrets. (Nature)

Could diet influence the growth of endometriosis lesions? Potentially, according to research in mice fed high-fat, low-fiber “Western” diets. (BMC Medicine)

Last week, 43 female rhesus macaque monkeys escaped from a lab in South Carolina. The animals may have a legal claim to freedom. (Vox)

Health Crisis Drives $50 Million Supplement CEO

Dean Brennan says a diet of beer, pizza, and fast food led to his ulcerative colitis. His doctors diagnosed it years ago in his twenties and told him he’d need medications for life. But Brennan decided otherwise.

“I didn’t want to take lifelong medication,” he told me. “It sparked my passion for health and led me to want to help others.”

Fast forward to 2024, and Brennan is the CEO of Heart & Soil, a nutritional supplement company doing $50 million in annual revenue.

In our recent conversation, he addressed his journey to Heart & Soil, key supplement ingredients, supply chain challenges, and more. The entire audio is embedded below. The transcript is edited for clarity and length.

Eric Bandholz: Give us a rundown of what you do.

Dean Brennan: I’m the CEO of Heart & Soil, a nutritional supplements company. I entered ecommerce in 2020 with no experience, coming from a background in filmmaking.

I got involved with the company from my personal health journey. In my twenties, I was diagnosed with ulcerative colitis, and doctors told me I’d need medication for life. I grew up eating home-cooked, natural foods, although in college I consumed a lot of beer, pizza, and fast food.

I didn’t want to take lifelong medication. It sparked my passion for health and led me to want to help others who suffer from conditions like psoriasis, Crohn’s disease, and eczema.

Heart & Soil offers supplements containing nature-based multivitamins made from bovine organs sourced from regenerative farms, initially in New Zealand and now also from the U.S.

Bandholz: How did you get connected with Heart & Soil?

Brennan: I was aware of Paul Saladino, our founder, but not the company. He’s a board-certified physician and a nutrition specialist. I followed him on social media while experimenting with a carnivore diet. I admired his ability to simplify complex health concepts and share them in an engaging way.

In 2020, I met Paul by chance, along with two employees who are now our chief research officer and head of operations. At the time, the company hadn’t launched yet, and I offered feedback on its prototype product. Initially, I wasn’t looking for a position in the company, but I was passionate about their mission.

Later that year, after my persistence, Paul brought me on board the day the company launched. I printed shipping labels and prepared the orders. Within three months, I had worked my way into a bigger role.

The team was small then — Paul, me, and three others. We worked out of a rental house in West Austin, packing and shipping supplements ourselves. We grew quickly. Paul realized his expertise was podcasting and researching, not operations. He assigned those responsibilities to me by January 2021.

Bandholz: How did you earn Paul’s trust so quickly?

Brennan: It was a gradual transition. Paul left for a trip to Africa. Then there was a massive ice storm in Austin, and he couldn’t return. Eventually, he went to Costa Rica and decided to stay there, leaving me to run the business. I think he trusted me because I showed up every day, worked hard, and didn’t ask for anything.

The transition was easy. I was nervous about how the team would react, but they were all on board. We’ve worked well together ever since.

Bandholz: How do you spread awareness beyond Paul’s podcast audience?

Brennan: Only about 30% of our customers come from Paul’s audience, with the same percentage coming through word of mouth. Our product works, and we’ve received hundreds of customer success stories. One of our strengths is personalized customer service. Our team of health guides offers one-on-one support, which has led to word-of-mouth referrals. People often tell others about us, even if they haven’t purchased our products themselves.

We also started another podcast called Radical Health Radio, and we’re producing films for YouTube. Our documentary on seed oils will be released next month.

Bandholz: What’s your supply chain like?

Brennan: Our long-term goal is to build a U.S.-based supply chain to produce all the organs needed for our supplements. In 2020, nothing like this existed in the U.S., so we sourced from New Zealand, where regenerative farming is common. But we’ve worked hard over the last four years to develop U.S. suppliers, supporting American farmers.

There’s a huge education gap in the U.S. regarding organ consumption. Around the world, most cultures consume organs regularly. We hope that educating consumers can drive demand for better products and ingredients.

When consumers ask for healthier alternatives, large companies will have to respond. This movement isn’t just about our products but about supporting sustainable farming practices and improving public health.

Bandholz: Where can people buy your supplements and follow you?

Brennan: Our ecommerce site is Heartandsoil.co. You can follow me on X and LinkedIn.

WP Engine Escalates Legal Battle With Automattic and Mullenweg via @sejournal, @martinibuster

WP Engine escalated its Federal complaint by citing Automattic’s publication of the WP Engine Tracker website as evidence that Automattic intends to harm WP Engine and has exposed its customers to potential cybercrime. The updated complaint incorporates recent actions by Mullenweg to further strengthen WP Engine’s case.

A spokesperson for WP Engine issued a statement to Search Engine Journal about the WP Engine Tracker website:

“Automattic’s wrongful and reckless publication of customer’s information without their consent underscores why we have moved for a preliminary injunction. WP Engine has requested the immediate takedown of this information and looks forward to the November 26th hearing on the injunction.”

Legal Complaint Amended With More Evidence

WP Engine (WPE) filed a complaint in Federal court seeking a preliminary injunction to prevent Matt Mullenweg and Automattic from continuing actions that harm WPE’s business and its relationships with its customers. That complaint was amended with further details to support the allegations against Mullenweg and Automattic.

The legal complaint begins by stating in general terms what gives rise to their claim:

“This is a case about abuse of power, extortion, and greed.”

It then becomes progressively more specific, introducing evidence of how Automattic and Mullenweg continue their “bad acts unabated” for the purpose of harming WP Engine (WPE).

The amended claim adds the following, quoting Mullenweg himself:

“Since then, Defendants have continued to escalate their war, unleashing a campaign to steal WPE’s software, customers and employees. Indeed, just days ago, Defendants were unambiguous about their future plans:”

This is the statement Mullenweg made that is quoted in the amended complaint:

“[S]ince this started [with WPE] they’ve had uh, we estimate tens of thousands of customers leave. . . . So, um you know, I think over the next few weeks, they’re actually gonna lose far more than 8% of their business . . . we’re at war with them. We’re . . . going to go brick by brick and take . . . take every single one of their customers . . . if they weren’t around guess what? . . . We’d happily have those customers, and in fact we’re getting a lot of them.”

WP Engine Tracker Site Used As Evidence

Automattic recently created a website on the WordPressEngineTracker.com domain, called WP Engine Tracker, that encourages WP Engine customers to leave, linking to promotions that offer discounts and promise a smooth transition to other web hosts.

WPE states that the WP Engine Tracker website is part of a campaign to encourage WPE customers to abandon it, writing:

“Defendants also created a webpage at wordpress.org offering “Promotions and Coupons” to convince WPE customers to stop doing business with WPE and switch over to Automattic’s competitor hosting companies like wordpress.com and Pressable; they later added links to other competitors as well.”

The WP Engine Tracker website calls attention to the number of sites that have abandoned WP Engine (WPE) since Matt Mullenweg’s September 21st public denunciation of WP Engine and the start of his “nuclear” war against the web host. The amended Federal lawsuit points to the September 21st date listed on that site as additional evidence tying Automattic to a campaign to harm WP Engine’s business.

The legal document explains:

“Just last week, in an apparent effort to brag about how successful they have been in harming WPE, Defendants created a website—www.wordpressenginetracker.com—that “list[s] . . . every domain hosted by @wpengine, which you can see decline every day. 15,080 sites have left already since September 21st.

September 21 was not selected randomly. It is the day after Defendants’ self-proclaimed nuclear war began – an admission that these customer losses were caused by Defendants’ wrongful actions. In this extraordinary attack on WPE and its customers, Defendants included on their disparaging website a downloadable file of ‘all [WPE] sites ready for a new home’—that is, WPE’s customer list, literally inviting others to target and poach WPE’s clients while Defendants’ attacks on WPE continued.”

The purpose of the above allegations is to build evidence lending credence to WP Engine’s claim that Automattic is actively trying to harm WP Engine’s business.

WPE Accuses Automattic Of Additional Harms

Another new allegation against Automattic is that the spreadsheet offered for download on the WP Engine Tracker website includes sensitive information that is not publicly available and could cause direct harm to WPE customers.

The amended Federal lawsuit explains:

“Worse, this downloadable file contains private information regarding WPE’s customers’ domain names, including development, test, and pre-production servers—many of which are not intended to be accessed publicly and contain sensitive or private information. Many of these servers are intentionally not indexed or otherwise included in public search results because the servers are not safe, secure or production-ready and not intended to be accessed by the general public.

By disclosing this information to the general public, Defendants put these development, test, and pre-production domains at risk for hacking and unauthorized access.”

WP Engine Tracker Site Part Of A Larger Strategy

WPE’s amended complaint alleges that the WP Engine Tracker site is just one part of a larger strategy to harm WP Engine’s business, a strategy that also includes encouraging WPE employees to resign.

The updated document adds the following new allegations as evidence of WPE’s claims:

“Not content with interfering with WPE’s customer relations, Automattic has recently escalated its tactics by actively recruiting hundreds of WPE employees, in an apparent effort to weaken WPE by sowing doubts about the company’s future and enticing WPE’s employees to join Automattic:”

The document includes a screenshot of an email solicitation apparently sent to an employee that encourages them to join Automattic.

Screenshot Of Evidence Presented In Amended Complaint

Escalation Of Federal Complaint

WP Engine’s amended complaint against Mullenweg and Automattic invokes the Sherman Act (prohibiting monopolization to maintain a competitive marketplace), the Lanham Act (governing trademarks, false advertising, and unfair competition), and the Computer Fraud and Abuse Act (addressing unauthorized computer access and cybercrimes). The amendments tie recent actions by Mullenweg and Automattic—such as the creation of the WP Engine Tracker website—directly to their claims, turning Mullenweg’s attacks on WP Engine into evidence.

Read the amended Federal complaint here: (PDF).

Featured Image by Shutterstock/chaiyapruek youprasert

Bad & Toxic Backlinks You Should Avoid via @sejournal, @BennyJamminS

Link building is a complicated art form with many different tactics and approaches.

Despite being one of the most mature processes in SEO, there’s still much disagreement about what makes a “bad” or “good” link building strategy, including effectiveness vs. risk, and what tactics Google can detect or punish a website for.

This post will help you determine what to avoid when link building or vetting the tactics of a new service provider.

I’m not going to claim to put any disagreements to rest, and if you’re a particularly experiment-minded SEO you might find this post a little on the conservative side.

As with all things in the industry, there’s inconsistency between what Google says and what works, and everyone benefits from those who experiment and push boundaries.

But I’m taking a conservative approach that follows Google’s guidelines closely for two core reasons:

  • This post is for readers looking for reliable and sustainable strategies. I don’t advise that you use experimental or high-risk tactics when it comes to link building if you don’t already know what you’re doing and what the risks are.
  • You should take the guidelines as a statement of intent, not absolute or current truth. Even if a link building tactic that goes against Google’s guidelines works now, there is reason to believe that Google intends to address it.

Types Of Unnatural Links

An unnatural link is any link that is created to manipulate search engines or that violates Google’s spam policies.

The following are some of the most common types of unnatural links.

Buying Or Selling Links

There is nothing fundamentally wrong with paying for a link or exchanging some kind of product or service for a link as long as the nature of the relationship is disclosed and the links are not for SEO purposes.

Buying, exchanging, or trading for links for SEO is the problem. Links for SEO are supposed to be a choice influenced only by the content on the page.

If your content is highly valued and people choose to link to it for that reason, then you deserve SEO benefits.

When you enter money or value exchanges into that dynamic, it breaks the ideal purpose of SEO links and introduces a high potential for manipulation. In such cases, Google requires marking the link as rel=nofollow or rel=sponsored so that the links do not pass SEO value. As long as you or the parties linking to you do this, for the most part, there’s no problem.

Here is an example of implementing nofollow and sponsored attributes:
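A minimal sketch, using placeholder URLs:

```html
<!-- Paid or sponsored placement: rel="sponsored" tells search engines not to pass ranking credit -->
<a href="https://example.com/product" rel="sponsored">Partner product</a>

<!-- A link you don't want to vouch for: rel="nofollow" is the general-purpose opt-out -->
<a href="https://example.com/page" rel="nofollow">Referenced page</a>

<!-- The attributes can be combined in one rel value -->
<a href="https://example.com/deal" rel="sponsored nofollow">Affiliate offer</a>
```

Google treats these rel values as hints about the nature of a link; rel="sponsored" is its preferred attribute for paid placements, with rel="nofollow" as an acceptable fallback.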

Here are some ways that buying or selling links can fall afoul of Google’s spam policies:

  • Text advertisements with links that pass SEO signals because they haven’t been identified with “nofollow” or “sponsored.”
  • Paying for articles that include links that pass SEO signals.

Another way to buy links is to pay someone to create them for you. In this case, a service provider does the work of creating assets, reaching out to acquire links, or both. As long as this service provider doesn’t engage in shady tactics of their own and doesn’t give you links on domains that they own, this is totally fine.

Keep in mind that the “buying” and “selling” definitions are not limited to an exchange of currency.

They describe any kind of relationship where something is exchanged for a link, such as a product.

As Matt Cutts explained in 2014, Google aligns pretty closely with the FTC on what it understands to be a “material connection” between a link provider and link recipient:

  • If a party receives enough value to reasonably change their behavior, a material connection must be disclosed.
    • A pen or a t-shirt likely won’t change behavior (unless received for the explicit purpose of reviewing / linking to it).
    • A direct payment for a link, a gift card, or a product with a high dollar value likely changes behavior and incentivizes a link.
    • An item loaned has different implications than an item given.
  • Consider the intended audience: if you’re giving things away for reasons other than to acquire links (for example as part of a conference attendance gift package), then disclosure might be necessary, but it might not be strictly necessary to ask all those people to mark links as sponsored if they choose to talk about it.
  • Consider whether a link relationship would be surprising: it makes sense that a movie reviewer might see a movie for free. It makes less sense that a tech reporter would get to keep a laptop they’re reporting on without disclosure.

Link Exchange Agreements

Link exchanges are similar to buying links because they involve an exchange of value.

Mutual linking happens often, and when it occurs organically, it’s no problem. It makes perfect sense for some websites to link back and forth.

But you need to watch out for any kind of agreement. “Link for link” is a no-go, and if you do it often enough, it can become easy to spot.

The thing about links is that any time you give or get a link for a reason other than the value and relevance of the link itself, it’s easy to spot – likely easier than you think.

The occasional bit of back rubbing isn’t a big deal. When given a few different choices of websites to reference, it makes sense that people would choose those they already know or have existing relationships with.

That’s generally fine. The problem comes when you enter into specific agreements: You link to me, and I’ll link to you.

The video below explains the difference between a link that’s an editorial choice and a link that’s based on an agreement.

Private Blog Networks

Private blog networks (PBNs) are networks of sites created to artificially inflate the rankings of one specific central website.

Basically, one entity controls an entire network of websites and can use a few specific linking methods to pass authority and SEO value around.

This network can then be used to artificially inflate the rankings of other websites by linking out to them.

In order for this tactic to work, all the websites need to have relationships or be owned by the same entity.

This is a pretty clear violation of Google’s guidelines, and it’s also pretty easy to spot.

Sites that are part of these networks can be penalized, and if you’re a little too lax with user-generated content on your site, you could find yourself accidentally becoming part of one.

If you accept any kind of content from external parties, scrutinize it carefully, especially links. Skip down to “How To Spot Shady Links” to find out more.

Unnatural Links From Forums, Blog Comments, And Other User-Generated Content

User-generated content is tricky when it comes to links. Ideally, a random person loves your content so much that they use you as a reference. Not so ideal is faking it.

Comments, forums, blogs, guestbooks, and even sites like Reddit might be tempting sources for links, and in the right context, they can absolutely be part of a healthy backlink profile. You can even link to yourself if you’re genuinely engaging in a relevant discussion. Google doesn’t consider all comment links and UGC links to be spam.

However, it’s a bad idea to try and engineer these links as part of a mass strategy.

The first thing to keep in mind is that many user-generated content (UGC) websites have blanket nofollow attributes on outgoing links. It’s an old tactic, so many high-quality communities moderate UGC heavily. This means that doing this effectively requires effort. The big question to ask yourself is: does the comment add genuine value to the community?
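For reference, Google also supports a dedicated rel="ugc" attribute for links in comments and forum posts, and many platforms apply it (or nofollow) automatically. A sketch with a placeholder URL:

```html
<!-- Typical markup a forum or comment system applies to a user-submitted link -->
<a href="https://example.com/commenters-site" rel="ugc nofollow">commenter's site</a>
```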

Most commonly, people build these links by using bots to post automatically. Automated posting generally isn’t valuable, and you’ll be flagged and moderated out of those communities.

Automated Link Syndication

There are tons of ways to automate links, but Google considers automating links at scale to be spam.

There are plenty of ways to safely automate your content processes, but we aren’t talking about that. We’re talking about using automation to post content externally from your website purely to acquire SEO links.

From automated article spinners to bots that will post comments and social media posts, if you’re intentionally building links “at scale,” then the chances are high that you’re building toxic links.

This could look like an automated press release or directory posting. It could look like low-quality article directories, which are often filled with spammy content that is widely distributed.

Generative AI has enabled new forms of automation for links and content, so it’s important to consider the overall principles in Google’s and the FTC’s guidelines when you evaluate novel functions and strategies.

Links In Distributed Widgets

People sometimes engage in automated link building by adding links to widgets distributed to multiple websites. Google clarified its stance on this and provided examples of manipulative widgets.

This kind of link building is pretty easy to spot, and it’s pretty clear that these types of links don’t add value.

Using Expired Domains To Build Links

Expired domain abuse is another tactic Google is wise to, but that doesn’t stop people from trying it.

One way that expired domains can be used to build unnatural links is by purchasing one and then redirecting it to another website. The idea is that all of the authority and backlinks belonging to the expired domain will be forwarded through the redirect. Don’t do this.

Any Link Can Be Bad If It’s Lazy Enough

Does the automated press release spam mean you shouldn’t send press releases? No!

Does the prevalence of poor-quality directories mean you can’t use directories in a high-quality way? Also no!

This goes for many link building strategies. There’s usually a high-effort, valuable version and a low-effort, spammy version.

Take guest posting as an example.

If you’re an expert in your field and take the time to write useful content aligned with E-E-A-T best practices, that’s valuable.

If you want to reach new audiences, you could send that post to a website with a large reach. It makes sense for that website to then link back to you as a reference for readers if they like your writing and want to learn more.

This is an ideal linking relationship. A website has chosen your content because it provides value to its readers and links to you as the source of the expertise.

But when one party turns lazy, this becomes toxic.

A website might decide that, for whatever reason, it makes sense to start allowing poor-quality content with links.

Maybe it starts charging or uses a big catalog of content to build an affiliate strategy.

On the other side, link builders might generate poor-quality content with links and post it on websites that either don’t mind or don’t know better. Or they might try to sneak those links past sites with stricter editorial guidelines.

When one side of the equation gets lazy, guest posting becomes a manipulative linking strategy.

The Risk Of Manual Actions

The most likely risk of an unnatural link is that it will be a waste of time and/or money.

If you build a link for SEO that goes against Google’s guidelines, algorithms will simply ignore it either immediately or at an unspecified time in the future when they discover it.

If you have many toxic links and you’re using a strategy that the algorithms don’t immediately catch, this can open you up to a sudden reduction in SEO effectiveness.

At some point, Google will likely release an update that improves how the algorithms detect the links.

When that happens, if you have many of them, the adjustment can significantly impact your rankings and traffic. This can look like a targeted penalty, but generally, it isn’t.

Google uses automated systems and manual actions to punish toxic and spammy link building, but generally, you’re safe from this action unless you’re intentionally using these tactics on a large scale.

On the other hand, you can receive specific penalties for unnatural links, both those pointing to your site and those going out from it.

Unnatural links manual action notification in Search Console.

Links To Your Site Vs. Links From Your Site

If you host unnatural links from your site to other sites, you may be hit with a manual action. This indicates to Google that you’re on the supply side of the ecosystem it’s trying to stop.

A large number of unnatural links coming from your website could cause Google to decide it doesn’t trust you and issue a penalty. This will be communicated to you in Google Search Console. These penalties can be reversed, but generally this requires you to fix the problems and submit a request for reevaluation.

This video from Google about unnatural links from your site explains more. It’s your responsibility to ensure that your site does not host unnatural links. Remember: “A natural link is an editorial choice.”

For example, if you use your domains to host bad link tactics and sell links to others, you’re at a high risk of receiving a manual penalty from Google that suppresses or removes your website from the Search index.

You can also receive a manual penalty for unnatural links to your website. This seems less likely, because there are many cases where it wouldn’t be fair to punish a website for incoming links. However, you might still receive a manual penalty if Google is confident that you are trying to manipulate your ranking.

This video from Google about unnatural links to your site has more information.

How To Spot Shady Links

A good link is a genuine interaction of trust between two parties.

Spotting shady links is actually pretty easy, especially when there’s a pattern.

If you’re auditing your backlink profile or putting a potential service provider through their paces, here are some signs to look for.

1. New or young sites on blogging domains.

If you notice links from blogging subdomains (e.g., blogger.com) to your website, this is a sign that someone was building shady links to your website — especially if the links aren’t directly relevant, appear in high numbers without a nofollow attribute, or in some cases carry your website or brand name in the blog itself.
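When auditing a backlink export, you can triage this pattern programmatically. The sketch below (hypothetical host list and URLs, for illustration only) flags any backlink whose source sits on a free blogging domain or one of its subdomains.

```python
from urllib.parse import urlparse

# Hypothetical list of free-blogging hosts to flag; adjust for your own audit.
BLOG_HOSTS = ("blogspot.com", "wordpress.com", "weebly.com", "tumblr.com")

def flag_blog_subdomain_links(backlink_urls):
    """Return the backlinks whose source lives on a free blogging (sub)domain."""
    flagged = []
    for url in backlink_urls:
        host = urlparse(url).netloc.lower()
        # Match the bare host or any subdomain of it (e.g. foo.blogspot.com).
        if any(host == h or host.endswith("." + h) for h in BLOG_HOSTS):
            flagged.append(url)
    return flagged

backlinks = [
    "https://myniche-reviews.blogspot.com/post-1",  # suspicious: free blog subdomain
    "https://www.nytimes.com/article",              # normal editorial link
]
print(flag_blog_subdomain_links(backlinks))
```

A high count of flagged URLs, particularly from blogs created around the same time, is the kind of pattern worth raising with a link building provider.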

This is a good indication of a private blog network (PBN).

You should ask a link building service provider whether they create new websites to build links. This is a red flag.

2. Many unnatural links from unrelated forums.

Links like this can indicate automated link building with bots. Generally, using UGC sites to build links is against the terms of service of those websites.

Usually, the strategy involves pretending to be a genuine user. If you have to pretend you’re someone you’re not, it’s a shady link.

3. Links from irrelevant websites and directories.

Relevance really does matter with links, and if you’re looking through a link profile and see domains that just don’t make sense, they bear investigation. For example, if you are a recipe publisher, a link from a plumber’s article is highly irrelevant. That means it was likely the result of an unnatural link building technique.

However, if you add your website to relevant directories that have value from the users’ perspective, this can be totally fine. For example, you should add your restaurant website to Yelp, which is used by 32 million active users who look for reviews before booking a reservation. Check our list of directories that still matter.

If you want to learn more about link building and its many pitfalls, check out SEJ’s ebook The Dark Side Of Link Building.

Featured Image: Jakub Krechowicz/Shutterstock

6 Web Hosts Ranked By Core Web Vitals: One Outperforms All via @sejournal, @martinibuster

HTTPArchive is offering a new technology comparison dashboard, currently in beta testing. Users can now view real-world web hosting performance scores for Core Web Vitals. We compare six web hosts and find one that consistently performs better across nearly all metrics.

About HTTPArchive

HTTPArchive tracks websites through crawling and with data collected in the Chrome User Experience Report (CrUX). It publishes reports about the technologies that power websites, including Core Web Vitals performance of content management systems like WordPress and Wix.

New Technology Comparison Dashboard – Beta

HTTPArchive has new reports under development, one of which is a comparison of Core Web Vitals and Lighthouse performance scores by web host. HTTPArchive also tracks the median page weight by web host, but that report is still under development and in beta testing.

The new reports allow comparison by web host. There isn’t data yet for many web hosts, but there is for the following six. Comparing web hosts by Core Web Vitals is not a totally fair comparison. A web host like IONOS might host many thousands of small and local sites which might not be resource intensive.

So with those caveats, here are the six web hosts under comparison:

  1. Bluehost
  2. GoDaddy
  3. HostGator
  4. IONOS
  5. SiteGround
  6. WP Engine

Core Web Vitals By Web Host

The following is a list of web hosts ranked by the percentage of sites hosted at each that pass Core Web Vitals. HTTPArchive says this report is still under development and, as previously mentioned, the percentages don’t necessarily reflect the quality of the web hosts themselves, but rather the quality of the sites hosted there.

This is the description of the CWV metric scores:

“Passes Core Web Vitals
The percentage of origins passing all three Core Web Vitals (LCP, INP, CLS) with a good experience. Note that if an origin is missing INP data, it’s assessed based on the performance of the remaining metrics.”
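That pass/fail logic can be expressed compactly. The sketch below assumes Google’s published “good” thresholds (LCP ≤ 2.5 s, CLS ≤ 0.1, INP ≤ 200 ms) and mirrors the report’s handling of missing INP data; the function name is my own, not HTTPArchive’s.

```python
def passes_core_web_vitals(lcp_s, cls, inp_ms=None):
    """True if all available Core Web Vitals meet the 'good' thresholds.

    lcp_s  -- Largest Contentful Paint, in seconds (good: <= 2.5)
    cls    -- Cumulative Layout Shift score (good: <= 0.1)
    inp_ms -- Interaction to Next Paint, in ms (good: <= 200);
              when None, the origin is assessed on the remaining metrics,
              as the HTTPArchive description notes.
    """
    checks = [lcp_s <= 2.5, cls <= 0.1]
    if inp_ms is not None:
        checks.append(inp_ms <= 200)
    return all(checks)

print(passes_core_web_vitals(2.1, 0.05, 180))  # good on all three -> True
print(passes_core_web_vitals(3.0, 0.05))       # fails LCP -> False
```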

It’s interesting to see that the number one web host is a managed WordPress host, because that may indicate the platform itself is better optimized than a general web host. The following scores are based on a snapshot taken at the beginning of September.

Core Web Vitals Scores In Descending Order

  • WP Engine 70%
  • GoDaddy 67%
  • SiteGround 65%
  • HostGator 58%
  • IONOS 58%
  • Bluehost 45%

Largest Contentful Paint (LCP)

LCP measures the perceived page loading speed, how fast the page appears to load for a site visitor.

HTTPArchive defines this metric:

“Largest Contentful Paint (LCP) is an important, stable Core Web Vital metric for measuring perceived load speed because it marks the point in the page load timeline when the page’s main content has likely loaded—a fast LCP helps reassure the user that the page is useful. Good experiences are less than or equal to 2.5 seconds.”

WP Engine again comes out on top, perhaps indicating the quality of the sites hosted on that platform as well as the performance optimizations that are a key element of that web host.

LCP Scores In Descending Order

  • WP Engine 79%
  • GoDaddy 78%
  • SiteGround 75%
  • HostGator 69%
  • IONOS 69%
  • Bluehost 52%

Cumulative Layout Shift (CLS)

HTTPArchive also provides a comparison of the six web hosts by the CLS score. CLS measures how much a web page shifts around as it’s rendered in a web browser. A score of 0.1 or less for 75% of visitors is recommended. The percentages for each of the web hosts were all higher than the 75% minimum. This time WP Engine is tied for first place with HostGator.

CLS Scores In Descending Order

  • WP Engine 88%
  • HostGator 88%
  • Bluehost 87%
  • SiteGround 86%
  • IONOS 85%
  • GoDaddy 84%

First Contentful Paint (FCP)

FCP measures how long it takes for content to become visible. A low FCP means that the content is rendered quickly. The number one ranked web host for FCP turns out to be GoDaddy, ahead by a significant margin of 6 points. WP Engine comes in second, followed by SiteGround.

FCP Scores In Descending Order

  • GoDaddy 73%
  • WP Engine 67%
  • SiteGround 62%
  • IONOS 60%
  • HostGator 57%
  • Bluehost 39%

Time To First Byte (TTFB)

TTFB measures how long it takes to download the first byte of a resource after the browser requests it. GoDaddy scores at the top of the list again.

TTFB In Descending Order

  • GoDaddy 59%
  • IONOS 45%
  • WP Engine 39%
  • HostGator 38%
  • SiteGround 37%
  • Bluehost 25%

Interaction to Next Paint (INP)

This metric represents the overall responsiveness of the entire web page.

HTTPArchive explains what this score means:

“INP is a metric that assesses a page’s overall responsiveness to user interactions by observing the latency of all click, tap, and keyboard interactions that occur throughout the lifespan of a user’s visit to a page. The final INP value is the longest interaction observed, ignoring outliers. A good experience is less than or equal to 200ms.”

The scores are the percentage of pages that provide a good INP experience. WP Engine is back on top for INP but the other five web hosts are not far behind.

INP Scores In Descending Order

  • WP Engine 95%
  • SiteGround 94%
  • Bluehost 92%
  • GoDaddy 90%
  • HostGator 89%
  • IONOS 88%

Lighthouse Performance Score

Lighthouse is an open source auditing tool that scores web pages for performance, SEO, and other metrics. The performance scores for the six web hosts are fairly close to each other, clustering on either side of a performance score of 40.

This is HTTPArchive’s description of this score:

“In general, only metrics contribute to your Lighthouse Performance score, not the results of Opportunities or Diagnostics.”

Interestingly, HostGator ranks the highest for the Lighthouse Performance score, with GoDaddy and IONOS tied for second place. The remaining three tied for third place, one point behind second. Nevertheless, HostGator was the clear winner for the Lighthouse Performance score metric.

Lighthouse Performance Scores

  • HostGator 43
  • GoDaddy 40
  • IONOS 40
  • Bluehost 39
  • SiteGround 39
  • WP Engine 39

HostGator, which placed mid-pack for Core Web Vitals, scores at the top of the list for the Lighthouse Performance metric. WP Engine is clustered with two other web hosts scoring 39 points.

Lighthouse Accessibility Scores

The accessibility scores are clustered similarly to the performance scores, on either side of a score of 85.

This is how HTTPArchive describes this metric:

“The Lighthouse Accessibility score is a weighted average of all accessibility audits. Weighting is based on axe user impact assessments. Each accessibility audit is pass or fail. Unlike the Performance audits, a page doesn’t get points for partially passing an accessibility audit.”

Accessibility Scores In Descending Order

  • GoDaddy 87
  • Bluehost 86
  • WP Engine 86
  • SiteGround 86
  • HostGator 85
  • IONOS 85

Lighthouse SEO Scores

The SEO scores were even more tightly clustered, with GoDaddy scoring the highest of the six web hosts under comparison.

HTTPArchive describes what the SEO Score is measuring:

“These checks ensure that your page is following basic search engine optimization advice. There are many additional factors Lighthouse does not score here that may affect your search ranking, including performance on Core Web Vitals.”

SEO Scores In Descending Order:

  • GoDaddy 91
  • Bluehost 88
  • WP Engine 88
  • HostGator 88
  • IONOS 88
  • SiteGround 88

Lighthouse Best Practices Score

The last score is interesting because it measures whether the hosted sites are built with web development best practices. HTTPArchive doesn’t explain at this time what those best practices are.

Here’s the description of this score:

“This ensures that your page is built using modern web development best practices.”

Best Practices Scores In Descending Order

  • Bluehost 79
  • HostGator 79
  • SiteGround 79
  • WP Engine 77
  • GoDaddy 77
  • IONOS 77

Takeaway

HTTPArchive is expanding what it measures. The performance dashboard is still in beta and under development, meaning it may have bugs but is ready for a public preview. It’s interesting to see a managed WordPress host come out on top. The scores will become more meaningful once more managed web hosts can be compared against each other. Nevertheless, this is a good start.

Visit the new dashboard here and provide your feedback to make it better.

Featured Image by Shutterstock/TierneyMJ

New Ecommerce Tools: November 14, 2024

This week, our rundown of new tools from companies offering services to ecommerce merchants includes updates on holiday marketing campaigns, drone deliveries, analytics and insights, search, video generators, and several AI-based platforms.

Got an ecommerce product release? Email releases@practicalecommerce.com.

New Tools for Merchants

Amazon opens Virtual Holiday Shop, a 3D shopping experience. Amazon has launched Virtual Holiday Shop, a virtual shopping experience that uses three-dimensional technology powered by the Amazon Beyond virtual store. Inside the shop, visitors experience music, animations, and a guided search for gifts. Visitors can add products directly to a cart and then check out as usual. Per Amazon, the Virtual Holiday Shop spotlights selections of the top 100-plus gifts, stocking stuffers, holiday decor, and premium products, including customer favorites.

Web page showing Amazon's Virtual Holiday Shop

Amazon’s Virtual Holiday Shop

WPForms launches AI-powered form builder for WordPress. WPForms, a WordPress plugin, has released an AI-powered form builder to automatically generate customizable forms for contact-us, surveys, registrations, and feedback. Users describe what they want through an AI chatbot, and WPForms AI generates a form. The builder can translate entire forms into multiple languages, automatically set up conditional logic, and tweak or adjust forms afterward.

eBay mobile app adds traffic and performance data. Performance Insights is now live on the eBay mobile app to help merchants understand and improve their businesses. With Performance Insights, sellers can view real-time traffic graphs, track listing views, and monitor click rates and traffic sources — all on the go.

CapCut launches a video content platform for ecommerce merchants. Short-form editing app CapCut by ByteDance has launched Commerce Pro, a platform for ecommerce sellers and creators to produce and scale ads and branded content. The AI video generator instantly converts the URL of a product into ad videos with links to the products. AI-generated presenters assist with product demonstrations, explainer videos, and more. AI models can virtually try on products and generate photos for showcasing.

Home page of CapCut Commerce

CapCut Commerce

Buy-now, pay-later provider Affirm expands to the U.K. Affirm, a U.S.-based fintech firm, has launched its BNPL loans in the U.K. According to Affirm, the U.K. offering will include both interest-free and interest-bearing payment options. Interest on its plans will be fixed and calculated on the original principal amount, so it won’t increase or compound. The U.K. expansion is Affirm’s first outside of the U.S. and Canada.

Coveo partners with Shopify on scalable AI search and commerce experiences. Coveo, a commerce experience platform that leverages search and generative AI, has partnered with Shopify to bring its AI capabilities to enterprise customers. Coveo says its platform enables Shopify enterprise merchants to manage AI models and strategies for search relevance, personalization, real-time recommendations, unified indexing, and generative shopper experiences for product discovery and session optimization.

Contentsquare expands its analytics platform with advanced AI features. Contentsquare, an analytics provider, has upgraded its AI-driven Experience Intelligence platform to help marketing, product, and tech teams work more efficiently — with flexible purchasing options for businesses of all sizes. According to Contentsquare, the genAI CoPilot offers immediate insights and recommended next steps, summaries of customer sentiment, automatic session replays, and more. Contentsquare has also added heatmaps, enhanced customer feedback, and expanded analysis.

Home page of Contentsquare

Contentsquare

Zenapse launches AI-powered marketing platform on Google Cloud Marketplace. Zenapse, an AI-powered marketing platform utilizing emotional intelligence, has launched the ZenImpact Optimization Studio on Google Cloud Marketplace. Zenapse states Google Cloud Marketplace users can access its AI-driven psychographic signals, which analyze consumer thoughts, feelings, and beliefs, combined with demographic and behavioral data, to predict in real-time which content, products, and offers will resonate most with an audience. This launch will help marketers enhance business outcomes and gain deeper customer insights.

DreamHost partners with ecommerce solution provider Ecwid by Lightspeed. DreamHost, a provider of web hosting and managed WordPress services, has partnered with Ecwid by Lightspeed, enabling individuals and businesses to set up online stores and scale their businesses through multiple channels. According to DreamHost, the partnership means customers can quickly set up an online store without technical expertise. The solution includes built-in real-time reporting, marketing tools, and integrations for scaling. DreamHost customers have immediate access to Ecwid’s free tier.

Brizy launches a page builder for Shopify. Brizy, a London-based developer of website-building tools, has launched a landing page builder for Shopify store owners. With its drag-and-drop interface, users can design custom pages, product showcases, and marketing materials without needing any coding skills. For a limited time, Shopify users can try Brizy’s free plan. Brizy’s library includes over 90 templates and advanced elements such as countdowns, pop-ups, and alert bars.

Amazon expands drone delivery in Arizona. Amazon is expanding Prime Air drone delivery in the West Valley of metro Phoenix. Customers who live near Amazon’s Same-Day site in Tolleson, Arizona, and purchase an eligible item weighing five pounds or less can have it delivered by drone in under an hour. Tolleson’s Same-Day Delivery site is a hybrid — part fulfillment center and part delivery station. Amazon’s new MK30 drones will deploy from the facility.

Photo of an Amazon Prime Air drone

Amazon Prime Air: Phoenix Metro Area Drone Deliveries