How Twitter’s “Teacher Li” became the central hub of China protest information

As protests against rigid covid control measures in China engulfed social media in the past week, one Twitter account has emerged as the central source of information: @李老师不是你老师 (“Teacher Li Is Not Your Teacher”). People everywhere in China have sent protest footage and real-time updates to the account through private messages, and it has posted them on their behalf—taking care to keep the sources anonymous during a period of widespread fear and uncertainty.

There’s just one man behind the account: Li, a Chinese painter based in Italy, who requested to be identified only by his last name in light of the security risks. He has never received training in journalism, but that hasn’t stopped him from operating what’s essentially become a one-person newsroom. 

At the peak of activity over the weekend, Li was receiving dozens of submissions every second, and he did his best to filter out unreliable information in a matter of moments. It was a totally new experience—even though he’d spent the past year posting anonymous submissions from his followers. While he has long talked about Chinese social issues online, sometime in 2021 he started receiving private messages on Weibo, the Chinese equivalent to Twitter (which is banned in China), from people asking him to post on their behalf. They feared exposing their own identities. 

His posts would get removed by Chinese censors, and by February, his account was banned. Over the next two months, another 49 of his accounts were suspended. But his followers generously allowed him to borrow their phone numbers to keep registering for more. In April 2022, after he could no longer access new Weibo accounts, he finally moved to Twitter. There, he quickly grew a large following of international accounts and Chinese people accessing the blocked social media platform via VPN. 

Then, last week, workers in a Foxconn factory in Zhengzhou started a violent confrontation with management, and Li started monitoring the situation through Chinese social media and follower submissions. He slept only three hours that night.

More protests then broke out over the weekend in major Chinese cities, and Li once again posted real-time protest footage—aiming to help people within China get information so they could decide if they wanted to join in, and also to inform people outside China about what was really going on. “Even though they’re not in China right this second, things are happening, and they’re watching,” Li told me. 

His Twitter account is now the hub for information on the protests, having gained over 600,000 followers in the past tumultuous week alone. 

The demanding work, though, has taken a toll. Within China, mentions of his account name have been censored on social media platforms, including Weibo and WeChat. He is getting death threats and insults in DMs. And police have visited his family back in China. 

But the anxiety has been mixed with a feeling of liberation as he feels he’s finally able to say the name of Xi Jinping on social media without fear. Li joked that his Twitter avatar, which is a doodle of a cat, has become the most famous and most dangerous cat of his time. 

Over a long conversation early this week, Li told me about what it’s like to be under such immense pressure and how challenging it can be to remain objective. Though this work has occupied almost every waking minute, he told me, he finally forced himself to start taking breaks on Monday—which led to a surprise encounter that warmed his heart. 

Here’s Li’s story, told in his own words. The following transcript has been translated, lightly edited, and rearranged for clarity. —Zeyi Yang

On lending his voice to people who are afraid 

This account is, in essence, the same as many other ordinary Twitter users’. I talk about my life, some topics related to my profession, and, of course, social issues.

But this account also carries another purpose. I don’t know when it started, but gradually I began receiving submissions. People would contact me through private messages, send me what’s happening, or their own stories, and hope I could post that for them. 

I think this may be a phenomenon that emerged from the increasingly strong internet and speech controls on Chinese digital platforms since Xi Jinping came to power. People are afraid to say things directly on the internet, even if their accounts are anonymous. But they still have the desire for expression, so they want someone else to say it for them. 

It was the same on Weibo. Last year, at a time when I only had less than 10,000 followers, people slowly realized that this person could speak [for them], so they came to me. Then, when news broke [in February] about the mother who gave birth to eight children [Editor’s note: She was a trafficking victim who was found chained in a shed], I helped someone publish a submission about how he wanted to find his sister. That was reposted over 30,000 times on Weibo, and then my account was banned. 

In the months after my account was gone, I kept registering new accounts, and they kept getting banned. In about two months, I had 50 accounts suspended. The fastest record was when it took 10 minutes for an account to disappear [after registration]. As soon as [censors] blew up my account, I would immediately start a new one.

My followers—I don’t know why—were able to immediately find me, so I would gain thousands of followers in an instant. It ended when [the regulators] seemed to find the website where I bought those Weibo accounts and suspended that website, after which I couldn’t access any more accounts.

I was really moved during that period. Weibo verifies your identity through your phone number. But a lot of online friends just lent me their phone numbers and said: “It’s okay, Teacher Li, you can use my number for verification. That’s fine.” That really touched me. But later I couldn’t get a new account, so I had to move to Twitter. 

My [Twitter] account was registered in 2020, but I actually started using it in April [2022]. From the very beginning, I have always been sent the latest news. I don’t know why, but there’s always someone on the ground who can send something to me immediately, including about the incident in Shanghai where people held up a white banner [in October]. Slowly, the number of followers grew.

Before I reported on the Foxconn incident, I had about 140,000 followers, and then it got to 190,000 when I finished reporting on Foxconn. I lost count of how many followers I have now. [Editor’s note: By the time of our interview, Li had over 670,000 followers on Twitter; by the time of publishing, the number had increased to over 784,000.]

On becoming a one-man newsroom (on only a few hours of sleep)

These days I sleep for about five hours, and I’m focusing on [Twitter] for the rest of the day. There’s no one else. Even my girlfriend is not involved—just me.

In fact, the day I was online for the longest time was not in the past few days; it was during the Foxconn [protest]. Because the situation was developing so quickly, if they didn’t stop, I couldn’t stop. It didn’t cross my mind that since this had nothing to do with me, I could go to sleep. I never had that thought.

The fire in Urumqi [which sparked the broader wave of protests] has actually triggered a lot of empathy from the public. The possibility of a fire is really a concern for everyone, because all of them have at one point been locked in their home and not allowed to go out. 

In every similar news event in the past, no matter whether the government was responsible, it would always censor the news. After having their mouths sealed again and again, people became furious. There is always going to be a last straw, no matter what it ends up being. If it didn’t happen today, it would happen tomorrow, or the day after tomorrow. 

I thought [the protest in] Xinjiang on the night of the 26th was a moment to be remembered in history, but it turned out that was just the beginning of the story.

Particularly when protesters chanted the slogans that originated from the Sitong Bridge protest, I was like, “Oh no, it’s going to be a very, very serious thing if people are shouting these slogans in the center of Shanghai.” I had to document it in a neutral and objective manner, because if not, it could soon be forgotten, even on Twitter. I thought, “I need to take up the baton immediately,” and then I started doing it without thinking too much. 

It’s hard to describe the feeling that came after. It’s like everyone is coming to you and all kinds of information from all over the world is converging toward you and [people are] telling you: Hey, what’s happening here; hey, what’s happening there; do you know, this is what’s happening in Guangzhou; I’m in Wuhan, Wuhan is doing this; I’m in Beijing, and I’m following the big group and walking together. Suddenly all the real-time information is being submitted to me, and I don’t know how to describe that feeling. But there was also no time to think about it. 

My heart was beating very fast, and my hands and my brain were constantly switching between several software programs—because you know, you can’t save a video with Twitter’s web version. So I was constantly switching software, editing the video, exporting it, and then posting it on Twitter. [Editor’s note: Li adds subtitles, blocks out account information, and compiles shorter videos into one.] By the end, there was no time to edit the videos anymore. If someone shot and sent over a 12-second WeChat video, I would just use it as is. That’s it. 

I got the largest amount of [private messages] around 6:00 p.m. on Sunday night. At that time, there were many people on the street in five major cities in China: Beijing, Shanghai, Chengdu, Wuhan, and Guangzhou. So I basically was receiving a dozen private messages every second. In the end, I couldn’t even screen the information anymore. I saw it, I clicked on it, and if it was worth posting, I posted it.

People all over the country are telling me about their real-time situations. In order for more people not to be in danger, they went to the [protest] sites themselves and sent me what was going on there. Like, some followers were riding bikes near the presidential palace in Nanjing, taking pictures, and telling me about the situation in the city. And then they asked me to inform everyone to be cautious. I think that’s a really moving thing.

It’s like I have gradually become an anchor sitting in a TV studio, getting endless information from reporters on the scene all over the country. For example, on Monday in Hangzhou, there were five or six people updating me on the latest news simultaneously. But there was a break because all of them were fleeing when the police cleared the venue. 

On the importance of staying objective 

There are a lot of tweets that embellish the truth. From their point of view, they think it’s the right thing to do. They think you have to maximize the outrage so that there can be a revolt. But for me, I think we need reliable information. We need to know what’s really going on, and that’s the most important thing. If we were doing it for the emotion, then in the end I really would have been part of the “foreign influence,” right? 

But if there is a news account outside China that can record what’s happening objectively, in real time, and accurately, then people inside the Great Firewall won’t have doubts anymore. At this moment, in this quite extreme situation of a continuous news blackout, to be able to have an account that can keep posting news from all over the country at a speed of almost one tweet every few seconds is actually a morale boost for everyone. 

Chinese people grow up with patriotism, so they become shy or don’t dare to say something directly or oppose something directly. That’s why the crowd was singing the national anthem and waving the red flag, the national flag [during protests]. You have to understand that the Chinese people are patriotic. Even when they are demanding things [from the government], they do it with that sentiment. 

So one reason they are willing to pass on information to me is that they know that I am reporting it in a neutral, objective, and truthful way. But for other accounts, they are afraid of messaging them because what if it’s true—as they are told in China—that they are being taken advantage of by foreign forces? 

You can understand it like this: they want to voice their opposition, but they also don’t want it to be too radical. They want to stay in the middle. So I’m actually that middle point. I will report on what happens, but I will only report on what happens and not say a word more. That’s probably why I’ve become the central hub. Of course, I’ve become the central hub also because I’ve kept posting and posting.

So I try to only report on whatever information I receive, but it’s hard to do that now, because there are so many submissions. And to fact-check one thing, I may need videos from several different angles. 

For example, last night there were rumors of a shooting in Wuhan, a shooting in Chengdu, and a shooting in Xi’an, but I didn’t find any videos that I could use to verify them, so I didn’t end up posting anything. Well, that resulted in some Twitter users thinking I might be deliberately covering up some faults by the police.

And now there’s a somewhat awkward situation where some people in China think I’m inciting these things and some people abroad think I’m a big China propagandist. That’s a very difficult spot to be in. When you choose to stand in the middle, you are definitely under pressure from both sides, but that’s okay.

On dealing with digital chaos and deception

Since I basically had no time to think and was just posting every few seconds, the feed became very dense and very chaotic. Some people sent me the same videos repetitively. There were also many videos that originated from me, and then spread to other platforms like WeChat Moments, and were later sent back to me. Maybe this post was about Beijing, the next was Guangzhou, and the next one was Shanghai. There was no way for people to know at once whether the video in their hands had been sent or not, so they had to resend it to me. Maybe the video was taken at 9:00, but they sent it to me at almost 12:00 and thought it was in real time.

The fake video I got the most Sunday night was probably one where a police car was driving under an overpass and running over people. I must have watched it 60 or 70 times. Every time, it says that it was the Sitong Bridge or something. But that footage was actually not taken in China. Many people are willing to believe these videos, or they just want to believe that something big has happened.

One big crisis I experienced Monday morning was that—I don’t know who it was and whether it was someone [on the Chinese government’s side], but they kept sending me false news. There were some messages about things that happened, but not at the places they claimed. Then there were some messages that you could tell were fake immediately. Maybe they were hoping to take me down in this way. 

There are always people in my private messages who want me to post a call to action, or people who want me to summarize the slogans and post them, or declare what people should do, but I have not crossed that line. I believe everyone has a mission for themselves, and my mission is to report on what happened. If I suddenly joined those [activists], I would have really become—particularly since I wasn’t there on the ground—the one giving commands. If people died in the end, then the blood would be on me, because I directed them to act. So I don’t think that should be the case. I can only do the reporting. 

But I think in the end, I will inevitably be the one to blame. Even if I don’t do it, people will assume I’m guilty. 

If I can keep my independence till the end, then I can be a candle, a torch, just standing there on my own.

On the mental toll the work is taking

I just finished graduate school, so technically I’m a recent graduate, and I was just dragged into this thing out of nowhere and suddenly found myself with a role in it. I don’t know how to feel. I’m just anxious. I don’t know what will happen to me. Of course, I’m afraid that one day a car is going to run toward me when I’m crossing the road and fake a traffic accident or something. But I only worry about it when I turn off the computer. When I’m sitting in front of it, I don’t have time to think about myself.

It’s mostly just exhausting. I forced myself to take a break today. Usually, I just sit there, start, and keep going until the end, and I hardly ever get up.

But today I started to get some threats, and I became more stressed. You have to be afraid because you have seen so much and know so much. So today, I’m forcing myself to take a vacation. It’s not much of a vacation, I guess, but I spent a long time just walking.

Today has also been quite amazing. 

I got this death threat last night, but I don’t know where it’s from. It just said, “We already know where you live. You just wait.” I didn’t have time to take a screenshot then, because that message was quickly pushed down by other messages. I took one look and it was gone, immediately. But since then, it has been heavy on my mind.

Then this morning when I went out to buy cat food, I stood in front of the peephole and checked repeatedly whether someone was standing outside. On the way, I kept checking if someone was tailing me. And after I returned home, there was some weird movement in the stairs, so I put down everything by the door and stood in front of the peephole for 10 minutes, but never saw anyone. 

Then I thought to myself, I can’t do this forever—I have to make the person leave. I was thinking I would start livestreaming, find them, and then ask them to leave. But it turned out that there was no one. It was a tiny, tiny, tiny kitten. I don’t know why it was hiding there, but I took it home. And now my girlfriend is feeding it. This is the amazing thing that happened today. I’m considering naming it Urumqi.

I forgot whether it started when Xi Jinping came to power [in 2013], but I’ve been feeling quite aggrieved. All these years, I’ve been constantly, repeatedly censoring myself and staying cautious just so I can keep talking.

But just yesterday, suddenly, I’m not afraid anymore. I had no time to think about it, and I just kept posting. The simple version of what happened is: When they shouted out “Xi Jinping, step down,” I suddenly felt it didn’t matter anymore. I can report this thing. I can type these words. If they aren’t afraid to say it, then I’m also not afraid to type it. That’s it.

You know what these three characters mean when they are typed out. It’s completely different [from other words]. At that moment, I suddenly felt like I’m dead, I’m alive, I’m liberated, and I’m aggrieved, all at the same time. It was a very, very complicated feeling.

Your microbiome ages as you do—and that’s a problem

This article is from The Checkup, MIT Technology Review’s weekly biotech newsletter.

We’re all crawling with bugs. Our bodies host plenty of distinct ecosystems made up of microbes, fungi, and other organisms. They are crucial to our well-being. Shifts in the microbiome have been linked to a whole host of diseases. Look after your bugs and they’ll look after you, the theory goes.

These ecosystems appear to change as we age—and these changes can potentially put us at increased risk of age-related diseases. So how can we best look after them as we get old? And could an A-grade ecosystem help fend off diseases and help us lead longer, healthier lives?

It’s a question I’ve been pondering this week, partly because I know a few people who have been put on antibiotics for winter infections. These drugs—lifesaving though they can be—can cause mass destruction of gut microbes, wiping out the good along with the bad. How might people who take them best restore a healthy ecosystem afterwards?

I also came across a recent study in which scientists looked at thousands of samples of people’s gut microbe populations to see how they change with age. The standard approach to working out what microbes are living in a person’s gut is to look at feces. The idea is that when we have a bowel movement, we shed plenty of gut bacteria. Scientists can find out which species and strains of bacteria are present to get an estimate of what’s in your intestines.

In this study, a team based at University College Cork in Ireland analyzed data that had already been collected from 21,000 samples of human feces. These had come from people all over the world, including Europe, North and South America, Asia, and Africa. Nineteen nationalities were represented. The samples were all from adults between 18 and 100. 

The authors of this study wanted to get a better handle on what makes for a “good” microbiome, especially as we get older. It has been difficult for microbiologists to work this out. We do know that some bacteria can produce compounds that are good for our guts. Some seem to aid digestion, for example, while others lower inflammation.
 
But when it comes to the ecosystem as a whole, things get more complicated. At the moment, the accepted wisdom is that variety seems to be a good thing—the more microbial diversity, the better. Some scientists believe that unique microbiomes also have benefits, and that a collection of microbes that differs from the norm can keep you healthy.
 
The team looked at how the microbiomes of younger people compared with those of older people, and how they appeared to change with age. The scientists also looked at how the microbial ecosystems varied with signs of unhealthy aging, such as cognitive decline, frailty, and inflammation.
 
They found that the microbiome does seem to change with age, and that, on the whole, the ecosystems in our guts do tend to become more unique—it looks as though we lose aspects of a general “core” microbiome and stray toward a more individual one.
 
But this isn’t necessarily a good thing. In fact, this uniqueness seems to be linked to unhealthy aging and the development of those age-related symptoms listed above, which we’d all rather stave off for as long as possible. And measuring diversity alone doesn’t tell us much about whether the bugs in our guts are helpful or not in this regard.
 
The findings back up what these researchers and others have seen before, challenging the notion that uniqueness is a good thing. Another team has come up with a good analogy, which is known as the Anna Karenina principle of the microbiome: “All happy microbiomes look alike; each unhappy microbiome is unhappy in its own way.”
 
Of course, the big question is: What can we do to maintain a happy microbiome? And will it actually help us stave off age-related diseases?
 
There’s plenty of evidence to suggest that, on the whole, a diet with plenty of fruit, vegetables, and fiber is good for the gut. A couple of years ago, researchers found that after 12 months on a Mediterranean diet—one rich in olive oil, nuts, legumes, and fish, as well as fruit and veg—older people saw changes in their microbiomes that might benefit their health. These changes have been linked to a lowered risk of developing frailty and cognitive decline.
 
But at the individual level, we can’t really be sure of the impact that changes to our diets will have. Probiotics are a good example; you can chug down millions of microbes, but that doesn’t mean that they’ll survive the journey to your gut. Even if they do get there, we don’t know if they’ll be able to form niches in the existing ecosystem, or if they might cause some kind of unwelcome disruption. Some microbial ecosystems might respond really well to fermented foods like sauerkraut and kimchi, while others might not.
 
I personally love kimchi and sauerkraut. If they do turn out to support my microbiome in a way that protects me against age-related diseases, then that’s just the icing on the less-microbiome-friendly cake.

To read more, check out these stories from the Tech Review archive:
 
At-home microbiome tests can tell you which bugs are in your poo, but not much more than that, as Emily Mullin found.
 
Industrial-scale fermentation is one of the technologies transforming the way we produce and prepare our food, according to these experts.
 
Can restricting your calorie intake help you live longer? It seems to work for monkeys, as Katherine Bourzac wrote in 2009. 
 
Adam Piore bravely tried caloric restriction himself to find out if it might help people, too. Teaser: even if you live longer on the diet, you will be miserable doing so. 

From around the web:

Would you pay $15,000 to save your cat’s life? More people are turning to expensive surgery to extend the lives of their pets. (The Atlantic)
 
The World Health Organization will now start using the term “mpox” in place of “monkeypox,” which will be phased out over the next year. (WHO)
 
After three years in prison, He Jiankui—the scientist behind the infamous “CRISPR babies”—is attempting a comeback. (STAT)
 
Tech that allows scientists to listen in on the natural world is revealing some truly amazing discoveries. Who knew that Amazonian sea turtles make more than 200 distinct sounds? And that they start making sounds before they even hatch? (The Guardian)
 
These recordings provide plenty of inspiration for musicians. Whale song is particularly popular. (The New Yorker)
 
Scientists are using tiny worms to diagnose pancreatic cancer. The test, launched in Japan, could be available in the US next year. (Reuters)

Chaiz Aims to Disrupt Car Warranty Biz

Consumers love price comparison marketplaces. Example niches include airline tickets, hotel accommodations, and shipping carriers. Chaiz, a startup, aims to achieve the same success with extended automotive warranties.

Ryan Hartman is Chaiz’s chief marketing officer. He told me, “Our mission is to provide auto-repair protection that doesn’t break the bank.”

He and I recently discussed the need for lower-priced warranties, the challenges of reaching consumers, raising capital, and more. The transcript is edited for clarity and length.

Eric Bandholz: Tell us about Chaiz.

Ryan Hartman: Chaiz is the first online comparison marketplace for automotive repair warranties. The company launched about a year ago. I joined shortly after as chief marketing officer. I’m one of three co-founders, plus we have an angel investor.

Before Chaiz I was head of growth for The Zebra, a leading insurance comparison marketplace.

Our mission is to provide auto-repair protection that doesn’t break the bank. Most people know about CarShield. It spends about $170 million on advertising annually — $160 million on TV alone. CarShield is a call center, a middleman. They sell insurance through a company called American Auto Shield that underwrites it, and CarShield marks it up.

Our model is direct relationships with multiple providers and passing the savings to the consumer.

Our long-term plan is to work through dealerships. They make most of their money on finance and insurance and net only about $500 on the vehicle itself. Unfortunately, they see us as a threat. But more folks are buying cars online now. We see an opportunity with the Carvanas of the world for a white-label solution with us.

We offer service for about 90% of U.S. vehicles on the road. The only limitation is California, which has restrictive insurance laws, so we don’t operate in that state.

Bandholz: How are you getting in front of prospects?

Hartman: Most of what we’re doing now is search engine optimization. We’ve made good strides. We write articles to educate consumers about what an extended warranty covers and what could go wrong on a road trip. We explain the difference between normal car insurance and repair protection.

Sixty percent of Americans can’t afford a breakdown that’s over $1,000. So our product is worthwhile. But the industry’s history is shady deals and a lack of transparency. So we have a lot of explaining to do.

We are trying to raise about $1 million in new equity. But it’s a tough market now on the venture capital front. We’re going after smaller investment firms that focus on seed and pre-seed. We’re hearing very positive feedback.

Bandholz: How do you pull in investors?

Hartman: It would be easier if our revenue was growing — $10,000 one month, $20,000 the next. We’d have money in the bank.

Still, we’ve got a good story because we’ve got eight providers on the site right now, including three of the top five. We signed a deal with Endurance, the second-largest direct-to-consumer warranty provider. We just signed Olive.com, and Protect My Car will launch soon. Investors want to see buy-in from the providers.

Bandholz: Where can people learn more about Chaiz and reach out to you?

Hartman: I’m on LinkedIn. Our site is Chaiz.com.

Is IP Address A Google Ranking Factor? via @sejournal, @kristileilani

Does the IP address of your website’s server affect your rankings in search results? According to some sources around the internet, your IP address is a ranking signal used by Google.

But does your IP address have the potential to help or harm your rankings in search? Continue reading to learn whether IP addresses are a Google ranking factor.

The Claim: IP Address As A Ranking Factor

Articles on the internet from reputable marketing sites claim that Google has over 200 “known” ranking factors.

These lists often claim that a flagged IP address can hurt rankings, or that links carry more value when they come from separate C-class IP addresses.

Screenshot from HubSpot.com, June 2022

Fortunately, these lists sparked numerous conversations with Google employees about the validity of IP addresses as ranking factors in Google’s algorithm.


The Evidence Against IP Address As A Ranking Factor

In 2010, Matt Cutts, former head of Google’s webspam team, was asked if the ranking of a client’s website would be affected by spammy websites on the same server.

His response:

“On the list of things that I worry about, that would not be near the top. So I understand, and Google understands that shared web hosting happens. You can’t really control who else is on that IP address or class c subnet.”

Ultimately, Google decided if they took action on an IP address or Class C subnet, the spammers would just move to another IP address. Therefore, it wouldn’t be the most efficient way to tackle the issue.
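For readers unsure what sharing a "Class C subnet" or "C block" means in practice: in SEO usage the term is generally taken to mean IPv4 addresses that share their first three octets, which is a /24 network. The sketch below only illustrates that assumption. The function name `same_c_block` is hypothetical, the addresses come from reserved documentation ranges, and none of this reflects anything Google is known to compute.

```python
import ipaddress


def same_c_block(ip_a: str, ip_b: str) -> bool:
    """Return True if two IPv4 addresses fall in the same /24 network.

    Assumption: "C block" is read as "same first three octets" (/24).
    """
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip_a}/24", strict=False)
    return ipaddress.ip_address(ip_b) in network


# Two sites on the same shared host range vs. a site hosted elsewhere.
print(same_c_block("203.0.113.10", "203.0.113.200"))
print(same_c_block("203.0.113.10", "198.51.100.7"))
```

The first call compares two addresses in 203.0.113.0/24 and the second compares addresses from different ranges. This also shows why Cutts's point holds: on shared hosting, a site owner has no say over which other sites land in the same block.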

Cutts did note one exception: a case where an IP address hosted 26,000 spam sites and a single non-spammy site, which invited more scrutiny. But he reiterated that this was an exceptional outlier.

In 2011, a tweet from Kaspar Szymanski, another former member of Google’s webspam team, noted that Google has the right to take action when free hosts have been massively spammed.

In 2016, during a Google Webmaster Central Office Hours, John Mueller, Search Advocate at Google, was asked if having all of a group’s websites on the same c block of IP addresses was a problem.

He answered:

“No, that’s perfectly fine. So that’s not something where you artificially need to buy IP address blocks to just shuffle things around.

And especially if you are on a CDN, then maybe you’ll end up on an IP address block that’s used by other companies. Or if you’re on shared hosting, then these things happen. That’s not something you need to artificially move around.”

In March 2018, Mueller was asked if an IP change with a different geo-location would affect SEO. He responded:

“If you move to a server in a different location? Usually not. We get enough geotargeting information otherwise, e.g., from the TLD & geotargeting settings in Search Console.”

A few months later, Mueller replied to a tweet asking if Google still counted bad neighborhoods as a ranking signal and if a dedicated IP was necessary.

“Shared IP addresses are fine for search! Lots of hosting / CDN environments use them.”

In October 2018, Mueller was asked if the IP address location mattered for a site’s rankings. His response was simply, “Nope.”

A few tweets later, within the same Twitter thread, another user commented that IP addresses mattered regarding backlinks. Mueller again responded with a simple “Nope.”

In June 2019, Mueller received a question about Google Search Console showing a website’s IP address instead of a domain name. His answer:

“Usually, getting your IP addresses indexed is a bad idea. IP addresses are often temporary.”

He suggested that the user ensure the IP address redirects to their domain.

A few months later, when asked if links from IP addresses were bad, Mueller tweeted:

“Links from IP addresses are absolutely fine. Most of the time, it means the server wasn’t set up well (we canonicalized to the IP address rather than the hostname, easy to fix with redirects & rel=canonical), but that’s just a technical detail. It doesn’t mean they’re bad.”

In early 2020, when asked about getting links from different IP addresses, Mueller said that the bad part was the user was making the backlinks themselves – not the IP addresses.

Then, in June, Mueller was asked what happens if a website on an IP address bought links. Would there be an IP-level action taken?

“Shared hosting & CDNs on a single IP is really common. Having some bad sites on an IP doesn’t make everything on that IP bad.”

In September, during a discussion about bad neighborhoods affecting search rankings, Mueller stated:

“I’m not aware of any ranking algorithm that would take IPs like that into account. Look at Blogger. There are great sites that do well (ignoring on-page limitations, etc.), and there are terrible sites hosted there. It’s all the same infrastructure, the same IP addresses.”

In November, Gary Illyes, Chief of Sunshine and Happiness at Google, shared a fun fact.

“Fun fact: changing a site’s underlaying infrastructure like servers, IPs, you name it, can change how fast and often Googlebot crawls from said site. That’s because it actually detects that something changed, which prompts it to relearn how fast and often it can crawl.”

While it’s interesting information, it seems to impact crawling and not ranking. Crawling is, of course, required to rank, but crawling is not a ranking factor.

In 2021, a Twitter user asked if IP canonicalization could positively affect SEO. Mueller replied:

“Unless folks are linking to your site’s IP address (which would be unexpected), this wouldn’t have any effect on SEO.”

Later in December, when asked if an IP address instead of a hostname looks unusual when Google evaluates a link’s quality, Mueller stated, “Ip addresses are fine. The internet has tons of them.”

If you’re worried about your IP address or hosting company, the consensus seems to be: Don’t worry.

Our Verdict: IP Address Is Not A Ranking Factor Anymore

Is IP Address A Google Ranking Factor?

Maybe in the past, Google experimented with IP-level actions against spammy websites.

But it must have found this ineffective because we are not seeing any confirmation from Google representatives that IP addresses, shared hosting, and bad neighborhoods are a part of the algorithm.

Therefore, we can conclude for now that IP addresses are not a ranking factor.


Featured Image: Paulo Bobita/Search Engine Journal

How To Achieve 7-Figures With Your Law Firm Marketing Website via @sejournal, @xandervalencia

Many law firms are simply leasing space when it comes to their online marketing.

Whether it’s Google pay-per-click (PPC) ads, Facebook Ads, or social media, these channels often yield only temporary wins. Once you pull the investment, your results go away entirely.

Your website, on the other hand, can be a 24/7 selling tool for your law firm practice. It can effectively become your greatest asset, getting leads and cases while you sleep.

In this guide, we’ll talk about how to turn your website into the ultimate marketing tool for your law firm practice and generate seven figures in revenue for your business.

A Well-Optimized Law Firm Website Can Yield Huge Results

With your law firm’s website, you can use content marketing to your advantage to generate lucrative results for your business. Content and SEO allow you to attract users organically and convert traffic passively into new cases for your law firm.

As an example, a high-ranking webpage in a competitive market getting 1,000 users per month can get huge results:

  • Convert visitors at 2-5% = 20-50 leads.
  • Convert even 10-20% of leads = 2-10 cases.
  • Average $8,000 revenue per case = $16,000-$80,000 monthly revenue from one page.

Over the course of a year, this could lead to high six-figures to seven-figures in revenue!
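To make the arithmetic above easy to rerun with your own numbers, here is a minimal Python sketch of the same funnel. The variable and function names are hypothetical, and the rates are simply the illustrative figures from the list, not benchmarks.

```python
# Illustrative figures from the example above (assumptions, not benchmarks)
monthly_visitors = 1_000
visitor_to_lead = (0.02, 0.05)  # 2-5% of visitors become leads
lead_to_case = (0.10, 0.20)     # 10-20% of leads become cases
revenue_per_case = 8_000        # average revenue per case, in dollars


def monthly_funnel(visitors, lead_rate, case_rate, revenue):
    """Return (leads, cases, revenue) for one month of traffic."""
    leads = visitors * lead_rate
    cases = leads * case_rate
    return leads, cases, cases * revenue


low = monthly_funnel(monthly_visitors, visitor_to_lead[0], lead_to_case[0], revenue_per_case)
high = monthly_funnel(monthly_visitors, visitor_to_lead[1], lead_to_case[1], revenue_per_case)

for label, (leads, cases, revenue) in (("low", low), ("high", high)):
    print(f"{label}: {leads:.0f} leads, {cases:.0f} cases, "
          f"${revenue:,.0f}/month, ${revenue * 12:,.0f}/year")
```

Under these assumptions, the single page brings in roughly $16,000 to $80,000 a month, or about $192,000 to $960,000 over a year.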

The Foundations Of A Revenue-Generating Law Firm Website

At its core, your law firm website should serve to speak to the needs, struggles, and interests of your target audience. It should be laser-focused on your practice area, who you serve, and what you have to offer.

With this in mind, a well-crafted website content strategy should define:

  • Your business goals (the cases you want).
  • What competitors are doing.
  • What pages to write and keywords to target.
  • How to use your content budget.
  • Your editorial calendar.
  • The purpose/intent of each page.
  • PR and backlink strategy.

Below, we’ll dive deeper into how to develop this strategy, build out amazing content, and achieve your seven-figure revenue goals.

1. Define The Cases You Want

The first step to developing a successful website marketing strategy is to define the types of legal cases you want.

This activity will help you determine the types of people you want to reach, the type of content you should create, and the types of SEO keywords you need to target.

That way, you end up marketing to a more specific subset of potential clients, rather than a broad range of users.

Not sure where to set your focus? Here are a few questions that might help:

  • Which of your cases are the most profitable?
  • What types of cases are you not getting enough of?
  • In what markets are you strongest?
  • In which markets do you want to improve?
  • Are there any practice areas you want to explore?

At the end of this activity, you might decide that you want to attract more family law cases, foreclosure law cases, or DUI cases – whatever it is, getting hyper-focused on the types of cases you want to attract will only make your website marketing even stronger.

2. Identify Your Top Competitors

One of the best ways to “hack” your website marketing strategy is to figure out what’s working for your competitors.

By “competitors” we mean law firms that are working to attract the types of cases you’re trying to attract, at the same level at which your law firm is currently operating.

I say this because I see many law firms trying to beat and outrank the “big” fish, and this can feel like a losing battle. You want to set your sights on your closest competitors, rise above them, and then get more competitive with your strategy.

Here are a few ways to identify your closest competitors:

  • Conduct a Google search of your legal practice area + your service area (e.g., “family law Kirkland”, “DUI lawyer LA”, “Denver probate attorney” etc.). Take note of the top-ranking domains (i.e., websites).
  • Use SEO tools like Semrush or Ahrefs to search your domain name. These tools will often surface close competitors to your domain.
  • Using the same tools above, conduct organic research on your domain to see what keywords you are already ranking for. Search these keywords in Google and see what other domains come up.
  • Use these tools to determine the domain authority (DA) of your domain. Compare this to the other top-ranking domains to see which domains have an authority score that’s similar to your own.

Be sure to look at your known business competitors as well.

These may or may not be ranking well in Google Search, but it’s still worth a peek to see if they are targeting any high-priority keywords that your website should be targeting.

3. Conduct A Content Audit Of Your Website

Your next step is to conduct an audit of your current website. This will allow you to take stock of what content is performing well, and what content requires improvement.

First, start with your main service pages.

Use SEO tools like Semrush or Ahrefs again to review the rank (position), performance, and keywords of each page. Identify any pages that are ranking low, or not at all.

Then, find “low-hanging fruit” pages. These are the pages that are ranking around position 5-10. They require less effort to optimize to reach those higher rank positions – compared to pages ranking at, say, position 59.

Next, use the same tools to conduct a “gap analysis” (most SEO tools have this feature).

This compares your website’s performance to that of your closest competitors. It will show you a list of keywords that your competitors are ranking for that your website is not ranking for at all.
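If you export ranking data from your SEO tool, both checks can also be reproduced in a few lines of Python with Pandas. This is only a rough sketch: the dataframes, column names, and keywords below are made up for illustration, and your tool's export will look different.

```python
import pandas as pd

# Hypothetical ranking exports; the column names are assumptions
ours = pd.DataFrame({
    "Keyword": ["dui lawyer la", "dui penalties", "dui expungement"],
    "Position": [7, 59, 3],
})
competitor = pd.DataFrame({
    "Keyword": ["dui lawyer la", "dui checkpoint rights", "first dui offense"],
    "Position": [2, 4, 6],
})

# "Low-hanging fruit": pages already ranking in positions 5-10 (inclusive)
low_hanging = ours[ours["Position"].between(5, 10)]

# Gap analysis: keywords the competitor ranks for that we do not rank for at all
gap = competitor[~competitor["Keyword"].isin(ours["Keyword"])]

print(low_hanging)
print(gap)
```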

Finally, create an inventory of what pages you already have, which need to be revised, and which you need to create. Doing so will help you stay organized and stay on task when developing your content strategy.

4. Plan Your Content Silos

By this step, you will have a pretty good idea of what pages you already have, and which pages are “missing” from your strategy (based on the list of keywords you are not yet targeting).

From here, you will plan what’s called “content silos”.

Here is the basic process:

  • Review an existing service page (if you have one) and optimize it as best you can. Ideally, this is a page that’s already performing well and is otherwise a “low-hanging fruit” page.
  • If you don’t have any existing service pages, create one based on one of your high-priority keywords. Again, this should be a keyword that is meant to attract your preferred type of cases.
  • Next, build a “silo” of content around your main page. In other words, create new pages that are topically related to your main service page, but that target slightly different keywords (ideally, “long-tail”, lower competition keywords).
  • Add internal links between these pages and your primary service page.
  • Over time, build backlinks to these pages (through guest posting, PR, content marketing, etc.)

Below is an example of a content silo approach for “personal injury:”

Personal Injury law content silo. Image from author, November 2022

5. Identify Supporting Topics

As part of your website content strategy, you’ll then want to create other supporting content pieces. This should be content that provides value to your potential clients.

FAQs, blogs, and other service pages can support your main pages.

For example, if you are a DUI lawyer, you might want to publish an FAQ page that addresses the main questions clients have about DUI law, or a blog post titled “What to Do When You Get a DUI.”

There are a few tools you can use to research supporting topics:

  • Semrush – Use this tool to identify untapped keywords, content topics, and more.
  • AlsoAsked – Identify other questions people have searched for relevant to your primary topic.
  • Answer the Public – Use this search listening tool to identify topics and questions related to your practice area.

Below is an example of how the full content silo can come together for “Los Angeles Car Accident Lawyer:”

Accident lawyer content silo. Image from author, November 2022

6. Build An Editorial Calendar

Once you have all of your content ideas down on paper, it’s time to develop your editorial calendar.

This is essentially a plan of what content you need to create, when you want to publish it, and what keywords you plan to target.

This can be as simple as a Google Sheet or as fancy as a project management tool (like Monday.com or Asana).

Here are a few tips to get you started:

  • Always prioritize main pages. These should be the first content pieces you create on your website.
  • Create or revise your main pages and monitor their performance. Use Google Analytics and other SEO tools to keep your eye on how your content is performing.
  • Depending on budget and urgency, you might start with all main pages, or go silo by silo. Determine which service pages are most important to you. You can create all of your main pages at once, or develop the entire silo as you go.
  • Keep a record of your target keywords. Just because you “optimize” for them doesn’t mean your content will automatically rank for your target keywords. In your editorial calendar, keep track of the keywords you wish to target – by page – so you have a record of your original SEO strategy.

What Makes A Winning Law Firm Website Strategy?

The key to achieving seven figures with your law firm website is content.

Content allows you to target your ideal clients, attract your preferred cases, engage your audience, and so much more.

A well-thought-out content strategy will empower your website to achieve more for your business than any other marketing channel could!

Above, I outline a few steps to developing this type of winning strategy. But, achieving excellence takes time.

I recommend keeping your eye on the prize, monitoring performance, and making updates as you go along.

This will help you reach your desired result.


Featured Image: PanuShot/Shutterstock

Transitioning From Excel To Python: Essential Functions For SEO Data Analysis via @sejournal, @williamjnye

Learning to code, whether with Python, JavaScript, or another programming language, has a whole host of benefits, including the ability to work with larger datasets and automate repetitive tasks.

But despite the benefits, many SEO professionals are yet to make the transition – and I completely understand why! It isn’t an essential skill for SEO, and we’re all busy people.

If you’re pressed for time, and you already know how to accomplish a task within Excel or Google Sheets, then changing tack can feel like reinventing the wheel.

When I first started coding, I initially only used Python for tasks that I couldn’t accomplish in Excel – and it’s taken several years to get to the point where it’s my de facto choice for data processing.

Looking back, I’m incredibly glad that I persisted, but at times it was a frustrating experience, with many an hour spent scanning threads on Stack Overflow.

This post is designed to spare other SEO pros the same fate.

Within it, we’ll cover the Python equivalents of the most commonly used Excel formulas and features for SEO data analysis – all of which are available within a Google Colab notebook linked in the summary.

Specifically, you’ll learn the equivalents of:

  • LEN.
  • Drop Duplicates.
  • Text to Columns.
  • SEARCH/FIND.
  • CONCATENATE.
  • Find and Replace.
  • LEFT/MID/RIGHT.
  • IF.
  • IFS.
  • VLOOKUP.
  • COUNTIF/SUMIF/AVERAGEIF.
  • Pivot Tables.

Amazingly, to accomplish all of this, we’ll primarily be using a singular library – Pandas – with a little help in places from its big brother, NumPy.

Prerequisites

For the sake of brevity, there are a few things we won’t be covering today, including:

  • Installing Python.
  • Basic Pandas, like importing CSVs, filtering, and previewing dataframes.

If you’re unsure about any of this, then Hamlet’s guide on Python data analysis for SEO is the perfect primer.

Now, without further ado, let’s jump in.

LEN

LEN provides a count of the number of characters within a string of text.

For SEO specifically, a common use case is to measure the length of title tags or meta descriptions to determine whether they’ll be truncated in search results.

Within Excel, if we wanted to count the second cell of column A, we’d enter:

=LEN(A2)

LEN formula excel. Screenshot from Microsoft Excel, November 2022

Python isn’t too dissimilar, as we can rely on the inbuilt len function, which can be combined with Pandas’ loc[] to access a specific row of data within a column:

len(df['Title'].loc[0])

In this example, we’re getting the length of the first row in the “Title” column of our dataframe.

len function python. Screenshot from VS Code, November 2022

Finding the length of a cell isn’t that useful for SEO, though. Normally, we’d want to apply a function to an entire column!

In Excel, this would be achieved by selecting the formula cell on the bottom right-hand corner and either dragging it down or double-clicking.

When working with a Pandas dataframe, we can use str.len to calculate the length of rows within a series, then store the results in a new column:

df['Length'] = df['Title'].str.len()

The str.len method is a ‘vectorized’ operation, which is designed to be applied simultaneously to a series of values. We’ll use these operations extensively throughout this article, as they almost universally end up being faster than a loop.

Another common application of LEN is to combine it with SUBSTITUTE to count the number of words in a cell:

=LEN(TRIM(A2))-LEN(SUBSTITUTE(A2," ",""))+1

In Pandas, we can achieve this by combining the str.split and str.len functions together:

df['No. Words'] = df['Title'].str.split().str.len()

We’ll cover str.split in more detail later, but essentially, what we’re doing is splitting our data based upon whitespaces within the string, then counting the number of component parts.

word count Python. Screenshot from VS Code, November 2022
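To see both calculations side by side, here is a small self-contained sketch. The titles are invented for illustration, and the None entry is included to show how a missing value behaves: the result for that row is a null value rather than an error.

```python
import pandas as pd

# Made-up titles for illustration; None stands in for a missing title
df = pd.DataFrame({"Title": [
    "Buy Mens Running Shoes Online",
    "Shoes",
    None,
]})

# Character count per row (the LEN equivalent)
df["Length"] = df["Title"].str.len()

# Word count per row: split on whitespace, then count the parts
df["No. Words"] = df["Title"].str.split().str.len()

print(df)
```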

Dropping Duplicates

Excel’s ‘Remove Duplicates’ feature provides an easy way to remove duplicate values within a dataset, either by deleting entirely duplicate rows (when all columns are selected) or removing rows with the same values in specific columns.

Excel drop duplicates. Screenshot from Microsoft Excel, November 2022

In Pandas, this functionality is provided by drop_duplicates.

To drop duplicate rows within a dataframe, type:

df.drop_duplicates(inplace=True)

To drop rows based on duplicates within a singular column, include the subset parameter:

df.drop_duplicates(subset='column', inplace=True)

Or specify multiple columns within a list:

df.drop_duplicates(subset=['column','column2'], inplace=True)

One addition above that’s worth calling out is the presence of the inplace parameter. Including inplace=True allows us to overwrite our existing dataframe without needing to create a new one.

There are, of course, times when we want to preserve our raw data. In this case, we can assign our deduped dataframe to a different variable:

df2 = df.drop_duplicates(subset='column')
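The sketch below runs these variations on a tiny made-up dataframe so you can compare the results. It also uses the keep parameter, which isn't covered above: it controls which of the duplicate rows survives and defaults to the first.

```python
import pandas as pd

# Made-up data: the first two rows are identical, and /a appears three times
df = pd.DataFrame({
    "URL": ["/a", "/a", "/b", "/a"],
    "Keyword": ["shoes", "shoes", "boots", "trainers"],
})

# Rows that are identical in every column are collapsed into one
full_dedupe = df.drop_duplicates()

# One row per URL, keeping the first occurrence (the default)
by_url_first = df.drop_duplicates(subset="URL")

# One row per URL, keeping the last occurrence instead
by_url_last = df.drop_duplicates(subset="URL", keep="last")

print(full_dedupe, by_url_first, by_url_last, sep="\n\n")
```

Because none of these calls use inplace=True, the original df is left untouched.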

Text To Columns

Another everyday essential, the ‘text to columns’ feature can be used to split a text string based on a delimiter, such as a slash, comma, or whitespace.

As an example, splitting a URL into its domain and individual subfolders.

Excel text to columns. Screenshot from Microsoft Excel, November 2022

When dealing with a dataframe, we can use the str.split function, which creates a list for each entry within a series. This can be converted into multiple columns by setting the expand parameter to True:

df['URL'].str.split(pat='/', expand=True)

str split Python. Screenshot from VS Code, November 2022

As is often the case, our URLs in the image above have been broken up into inconsistent columns, because they don’t feature the same number of folders.

This can make things tricky when we want to save our data within an existing dataframe.

Specifying the n parameter limits the number of splits, allowing us to create a specific number of columns:

df[['Domain', 'Folder1', 'Folder2', 'Folder3']] = df['URL'].str.split(pat='/', expand=True, n=3)

Another option is to use pop to remove your column from the dataframe, perform the split, and then re-add it with the join function:

df = df.join(df.pop('Split').str.split(pat='/', expand=True))

Duplicating the URL to a new column before the split allows us to preserve the full URL. We can then rename the new columns:

df['Split'] = df['URL']

df = df.join(df.pop('Split').str.split(pat='/', expand=True))

df.rename(columns = {0:'Domain', 1:'Folder1', 2:'Folder2', 3:'Folder3', 4:'Parameter'}, inplace=True)

Split pop join functions Python. Screenshot from VS Code, November 2022
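As a quick illustration of how the n parameter changes the output, here is a short sketch using two made-up URLs of different depths. The "Domain" and "Path" column names are just examples.

```python
import pandas as pd

# Made-up URLs with a different number of folders
df = pd.DataFrame({"URL": [
    "example.com/blog/seo/python",
    "example.com/about",
]})

# Without n, the number of columns follows the longest URL,
# and shorter URLs are padded with null values
parts = df["URL"].str.split(pat="/", expand=True)

# With n=1, only the first slash is used, so there are always two columns
df[["Domain", "Path"]] = df["URL"].str.split(pat="/", n=1, expand=True)

print(parts)
print(df)
```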

CONCATENATE

The CONCAT function allows users to combine multiple strings of text, such as when generating a list of keywords by adding different modifiers.

In this case, we’re adding “mens” and whitespace to column A’s list of product types:

=CONCAT($F$1," ",A2)

concat Excel. Screenshot from Microsoft Excel, November 2022

Assuming we’re dealing with strings, the same can be achieved in Python using the addition operator (+):

df['Combined'] = 'mens' + ' ' + df['Keyword']

Or specify multiple columns of data:

df['Combined'] = df['Subdomain'] + df['URL']

concat Python. Screenshot from VS Code, November 2022

Pandas has a dedicated concat function, but this is more useful when trying to combine multiple dataframes with the same columns.

For instance, if we had multiple exports from our favorite link analysis tool:

df = pd.read_csv('data.csv')
df2 = pd.read_csv('data2.csv')
df3 = pd.read_csv('data3.csv')

dflist = [df, df2, df3]

df = pd.concat(dflist, ignore_index=True)
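If you don't have CSV files to hand, the same behavior can be tried with small in-memory dataframes. The data below is invented, and the second call is there only to show what ignore_index=True changes.

```python
import pandas as pd

# Stand-ins for three CSV exports that share the same columns
df = pd.DataFrame({"Domain": ["a.com"], "Links": [10]})
df2 = pd.DataFrame({"Domain": ["b.com"], "Links": [5]})
df3 = pd.DataFrame({"Domain": ["c.com"], "Links": [7]})

# ignore_index=True renumbers the rows 0..n-1 in the combined dataframe
combined = pd.concat([df, df2, df3], ignore_index=True)

# Without it, each row keeps the index it had in its source dataframe
kept_index = pd.concat([df, df2, df3])

print(combined)
print(kept_index)
```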

SEARCH/FIND

The SEARCH and FIND formulas provide a way of locating a substring within a text string.

These commands are commonly combined with ISNUMBER to create a Boolean column that helps filter down a dataset, which can be extremely helpful when performing tasks like log file analysis, as explained in this guide. E.g.:

=ISNUMBER(SEARCH("searchthis",A2))

isnumber search Excel. Screenshot from Microsoft Excel, November 2022

The difference between SEARCH and FIND is that FIND is case-sensitive, whereas SEARCH is not.

The equivalent Pandas function, str.contains, is case-sensitive by default:

df['Journal'] = df['URL'].str.contains('engine', na=False)

Case insensitivity can be enabled by setting the case parameter to False:

df['Journal'] = df['URL'].str.contains('engine', case=False, na=False)

In either scenario, including na=False will prevent null values from being returned within the Boolean column.

One massive advantage of using Pandas here is that, unlike Excel, regex is natively supported by this function – as it is in Google Sheets via REGEXMATCH.

Chain together multiple substrings by using the pipe character, also known as the OR operator:

df['Journal'] = df['URL'].str.contains('engine|search', na=False)
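The three variations can be compared on a small made-up dataset. The None entry shows what na=False does: the missing value comes back as False rather than as a null.

```python
import pandas as pd

# Made-up URLs; None stands in for a missing value
df = pd.DataFrame({"URL": [
    "https://www.searchenginejournal.com/",
    "https://example.com/Search-Engine/",
    None,
]})

# Case-sensitive by default: "Engine" does not match "engine"
case_sensitive = df["URL"].str.contains("engine", na=False)

# case=False ignores capitalization
case_insensitive = df["URL"].str.contains("engine", case=False, na=False)

# The pipe character matches either substring
either = df["URL"].str.contains("journal|example", na=False)

print(case_sensitive.tolist(), case_insensitive.tolist(), either.tolist())
```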

Find And Replace

Excel’s “Find and Replace” feature provides an easy way to individually or bulk replace one substring with another.

find replace Excel. Screenshot from Microsoft Excel, November 2022

When processing data for SEO, we’re most likely to select an entire column and “Replace All.”

The SUBSTITUTE formula provides another option here and is useful if you don’t want to overwrite the existing column.

As an example, we can change the protocol of a URL from HTTP to HTTPS, or remove it by replacing it with nothing.

When working with dataframes in Python, we can use str.replace:

df['URL'] = df['URL'].str.replace('http://', 'https://')

Or:

df['URL'] = df['URL'].str.replace('http://', '') # replace with nothing

Again, unlike Excel, regex can be used – like with Google Sheets’ REGEXREPLACE:

df['URL'] = df['URL'].str.replace('http://|https://', '', regex=True)

Alternatively, if you want to replace multiple substrings with different values, you can use Pandas’ replace method and provide a list.

This prevents you from having to chain multiple str.replace functions:

df['URL'] = df['URL'].replace(['http://', 'https://'], ['https://www.', 'https://www.'], regex=True)
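Here is a short sketch of literal versus regex replacement on two made-up URLs. The regex argument is passed explicitly in both calls because its default for str.replace has differed between Pandas releases, so spelling it out keeps the behavior predictable.

```python
import pandas as pd

# Made-up URLs, one per protocol
urls = pd.Series(["http://example.com", "https://example.com/page"])

# Literal replacement: upgrade the protocol
upgraded = urls.str.replace("http://", "https://", regex=False)

# Regex replacement: strip either protocol from the start of the string
stripped = urls.str.replace(r"^https?://", "", regex=True)

print(upgraded.tolist())
print(stripped.tolist())
```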

LEFT/MID/RIGHT

Extracting a substring within Excel requires the usage of the LEFT, MID, or RIGHT functions, depending on where the substring is located within a cell.

Let’s say we want to extract the root domain and subdomain from a URL:

=MID(A2,FIND(":",A2,4)+3,FIND("/",A2,9)-FIND(":",A2,4)-3)

left mid right Excel. Screenshot from Microsoft Excel, November 2022

Using a combination of MID and multiple FIND functions, this formula is ugly, to say the least – and things get a lot worse for more complex extractions.

Again, Google Sheets does this better than Excel, because it has REGEXEXTRACT.

What a shame that when you feed it larger datasets, it melts faster than a Babybel on a hot radiator.

Thankfully, Pandas offers str.extract, which works in a similar way:

df['Domain'] = df['URL'].str.extract('.*://?([^/]+)')

str extract Python. Screenshot from VS Code, November 2022

Combine with fillna to prevent null values, as you would in Excel with IFERROR:

df['Domain'] = df['URL'].str.extract('.*://?([^/]+)').fillna('-')
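To check what the pattern captures, it can be run against a few made-up values, including one that isn't a URL at all. The sketch adds expand=False, which isn't used above: with a single capture group, it makes str.extract return a series rather than a one-column dataframe.

```python
import pandas as pd

# Made-up values; the last one deliberately has no protocol
df = pd.DataFrame({"URL": [
    "https://www.example.com/blog/post",
    "http://shop.example.co.uk/",
    "not a url",
]})

# Capture everything between the protocol and the next slash;
# rows that don't match return a null, which fillna swaps for "-"
df["Domain"] = (
    df["URL"].str.extract(r".*://?([^/]+)", expand=False).fillna("-")
)

print(df)
```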

IF

IF statements allow you to return different values, depending on whether or not a condition is met.

To illustrate, suppose that we want to create a label for keywords that are ranking within the top three positions.

Excel IF. Screenshot from Microsoft Excel, November 2022

Rather than using Pandas in this instance, we can lean on NumPy and the where function (remember to import NumPy, if you haven’t already):

df['Top 3'] = np.where(df['Position'] <= 3, 'Top 3', 'Not Top 3')

Multiple conditions can be used for the same evaluation by combining them with the AND/OR operators (& and |) and enclosing the individual criteria within round brackets:

df['Top 3'] = np.where((df['Position'] <= 3) & (df['Position'] != 0), 'Top 3', 'Not Top 3')

In the above, we’re returning “Top 3” for any keywords with a ranking less than or equal to three, excluding any keywords ranking in position zero.
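A runnable version with a few made-up positions shows the effect of the second condition. The brackets around each comparison matter: in Python, & and | bind more tightly than <= and !=, so leaving them out changes how the expression is evaluated.

```python
import numpy as np
import pandas as pd

# Made-up rankings; position 0 represents an excluded "position zero" result
df = pd.DataFrame({"Keyword": ["a", "b", "c", "d"], "Position": [0, 1, 3, 8]})

df["Top 3"] = np.where(
    (df["Position"] <= 3) & (df["Position"] != 0),  # & is AND, | would be OR
    "Top 3",
    "Not Top 3",
)

print(df)
```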

IFS

Sometimes, rather than specifying multiple conditions for the same evaluation, you may want multiple conditions that return different values.

In this case, the best solution is using IFS:

=IFS(B2<=3,"Top 3",B2<=10,"Top 10",B2<=20,"Top 20")

IFS Excel. Screenshot from Microsoft Excel, November 2022

Again, NumPy provides us with the best solution when working with dataframes, via its select function.

With select, we can create a list of conditions, choices, and an optional value for when all of the conditions are false:

conditions = [df['Position'] <= 3, df['Position'] <= 10, df['Position'] <=20]

choices = ['Top 3', 'Top 10', 'Top 20']

df['Rank'] = np.select(conditions, choices, 'Not Top 20')
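One detail worth knowing is that, as with IFS, the first condition that evaluates to true wins, so the conditions should run from narrowest to broadest. A small sketch with made-up positions:

```python
import numpy as np
import pandas as pd

# Made-up positions, one for each bucket
df = pd.DataFrame({"Position": [2, 7, 15, 42]})

# Ordered from narrowest to broadest: a position of 2 also satisfies
# the second and third conditions, but the first match is the one used
conditions = [df["Position"] <= 3, df["Position"] <= 10, df["Position"] <= 20]
choices = ["Top 3", "Top 10", "Top 20"]

df["Rank"] = np.select(conditions, choices, default="Not Top 20")

print(df)
```

If the conditions were listed in the opposite order, every position of 20 or better would be labeled "Top 20".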

It’s also possible to have multiple conditions for each of the evaluations.

Let’s say we’re working with an ecommerce retailer with product listing pages (PLPs) and product display pages (PDPs), and we want to label the type of branded pages ranking within the top 10 results.

The easiest solution here is to look for specific URL patterns, such as a subfolder or extension, but what if competitors have similar patterns?

In this scenario, we could do something like this:

conditions = [(df['URL'].str.contains('/category/')) & (df['Brand Rank'] > 0),
(df['URL'].str.contains('/product/')) & (df['Brand Rank'] > 0),
(~df['URL'].str.contains('/product/')) & (~df['URL'].str.contains('/category/')) & (df['Brand Rank'] > 0)]

choices = ['PLP', 'PDP', 'Other']

df['Brand Page Type'] = np.select(conditions, choices, None)

Above, we’re using str.contains to evaluate whether or not a URL in the top 10 matches our brand’s pattern, then using the “Brand Rank” column to exclude any competitors.

In this example, the tilde sign (~) indicates a negative match. In other words, we’re saying we want every brand URL that doesn’t match the pattern for a “PDP” or “PLP” to match the criteria for ‘Other.’

Lastly, None is included because we want non-brand results to return a null value.

np select Python. Screenshot from VS Code, November 2022

VLOOKUP

VLOOKUP is an essential tool for joining together two distinct datasets on a common column.

In this case, adding the URLs within column N to the keyword, position, and search volume data in columns A-C, using the shared “Keyword” column:

=VLOOKUP(A2,M:N,2,FALSE)

vlookup Excel. Screenshot from Microsoft Excel, November 2022

To do something similar with Pandas, we can use merge.

Replicating the functionality of an SQL join, merge is an incredibly powerful function that supports a variety of different join types.

For our purposes, we want to use a left join, which will maintain our first dataframe and only merge in matching values from our second dataframe:

mergeddf = df.merge(df2, how='left', on='Keyword')

One added advantage of performing a merge over a VLOOKUP is that the shared data doesn’t have to be in the first column of the second dataset, as is also the case with the newer XLOOKUP.

It will also pull in multiple rows of data rather than only the first match it finds.
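Both points can be seen in a small sketch with made-up data. The dataframe names are hypothetical; note that "Keyword" is the second column of the lookup table, and that one keyword has two matching URLs.

```python
import pandas as pd

# Made-up keyword data (the "left" dataframe)
keywords = pd.DataFrame({
    "Keyword": ["mens shoes", "mens boots", "mens hats"],
    "Position": [3, 8, 12],
})

# Made-up lookup table; the shared column is not the first column
urls = pd.DataFrame({
    "URL": ["/shoes", "/shoes-sale", "/boots"],
    "Keyword": ["mens shoes", "mens shoes", "mens boots"],
})

# Left join: every keyword is kept, and every matching URL is pulled in
merged = keywords.merge(urls, how="left", on="Keyword")

print(merged)
```

Here "mens shoes" appears twice in the output (once per matching URL), and "mens hats" is kept with a null URL. If the extra rows would be a problem, merge also accepts a validate argument, such as validate="many_to_one", which raises an error when the lookup table contains duplicate keys.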

One common issue when using the function is for unwanted columns to be duplicated. This occurs when multiple shared columns exist, but you attempt to match using one.

To prevent this – and improve the accuracy of your matches – you can specify a list of columns:

mergeddf = df.merge(df2, how='left', on=['Keyword', 'Search Volume'])

In certain scenarios, you may actively want these columns to be included. For instance, when attempting to merge multiple monthly ranking reports:

mergeddf = (
    df.merge(df2, on='Keyword', how='left', suffixes=('', '_october'))
    .merge(df3, on='Keyword', how='left', suffixes=('', '_september'))
)

The above code snippet executes two merges to join together three dataframes with the same columns – which are our rankings for November, October, and September.

By labeling the months within the suffix parameters, we end up with a much cleaner dataframe that clearly displays the month, as opposed to the defaults of _x and _y seen in the earlier example.

multi merge Python. Screenshot from VS Code, November 2022

COUNTIF/SUMIF/AVERAGEIF

In Excel, if you want to perform a statistical function based on a condition, you’re likely to use either COUNTIF, SUMIF, or AVERAGEIF.

Commonly, COUNTIF is used to determine how many times a specific string appears within a dataset, such as a URL.

We can accomplish this by declaring the ‘URL’ column as our range, then the URL within an individual cell as our criteria:

=COUNTIF(D:D,D2)

Excel countif. Screenshot from Microsoft Excel, November 2022

In Pandas, we can achieve the same outcome by using the groupby function:

df.groupby('URL')['URL'].count()

Python groupby. Screenshot from VS Code, November 2022

Here, the column declared within the round brackets indicates the individual groups, and the column listed in the square brackets is where the aggregation (i.e., the count) is performed.

The output we’re receiving isn’t perfect for this use case, though, because it’s consolidated the data.

Typically, when using Excel, we’d have the URL count inline within our dataset. Then we can use it to filter to the most frequently listed URLs.

To do this, use transform and store the output in a column:

df['URL Count'] = df.groupby('URL')['URL'].transform('count')

Python groupby transform. Screenshot from VS Code, November 2022
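The difference between the two outputs is easiest to see side by side. The sketch below uses a small made-up dataset.

```python
import pandas as pd

# Made-up data: /a ranks for three keywords, /b for one
df = pd.DataFrame({
    "Keyword": ["k1", "k2", "k3", "k4"],
    "URL": ["/a", "/b", "/a", "/a"],
})

# Consolidated output: one row per unique URL
summary = df.groupby("URL")["URL"].count()

# Inline output: one value per original row, like COUNTIF filled down a column
df["URL Count"] = df.groupby("URL")["URL"].transform("count")

print(summary)
print(df)
```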

You can also apply custom functions to groups of data by using a lambda (anonymous) function:

df['Google Count'] = df.groupby(['URL'])['URL'].transform(lambda x: x[x.str.contains('google')].count())

In our examples so far, we’ve been using the same column for our grouping and aggregations, but we don’t have to. Similarly to COUNTIFS/SUMIFS/AVERAGEIFS in Excel, it’s possible to group using one column, then apply our statistical function to another.

Going back to the earlier search engine results page (SERP) example, we may want to count all ranking PDPs on a per-keyword basis and return this number alongside our existing data:

df['PDP Count'] = df.groupby(['Keyword'])['URL'].transform(lambda x: x[x.str.contains('/product/|/prd/|/pd/')].count())

Python groupby countifs. Screenshot from VS Code, November 2022

In Excel parlance, this would look something like:

=SUM(COUNTIFS(A:A,[@Keyword],D:D,{"*/product/*","*/prd/*","*/pd/*"}))
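As a rough worked example of the per-keyword PDP count, the sketch below runs the same groupby and transform pattern on a handful of made-up rows. The keywords, domains, and the pdp_pattern variable name are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical SERP data; keywords and URLs are made up for illustration.
df = pd.DataFrame({
    "Keyword": ["red shoes", "red shoes", "red shoes", "blue boots", "blue boots"],
    "URL": [
        "https://example.com/product/red-shoe",
        "https://example.com/blog/shoe-guide",
        "https://example.org/pd/123",
        "https://example.com/category/boots",
        "https://example.net/prd/blue-boot",
    ],
})

# Regex alternation: a URL matches if it contains any of the three paths.
pdp_pattern = "/product/|/prd/|/pd/"

# Group by Keyword, then count the URLs in each group that match the pattern.
df["PDP Count"] = df.groupby("Keyword")["URL"].transform(
    lambda urls: urls[urls.str.contains(pdp_pattern)].count()
)

print(df[["Keyword", "URL", "PDP Count"]])
```

Each row receives the PDP total for its own keyword, which mirrors how the Excel formula returns a value on every row of the table.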

Pivot Tables

Last, but by no means least, it’s time to talk pivot tables.

In Excel, a pivot table is likely to be our first port of call if we want to summarise a large dataset.

For instance, when working with ranking data, we may want to identify which URLs appear most frequently, and their average ranking position.

Excel pivot table (screenshot from Microsoft Excel, November 2022)

Again, Pandas has its own pivot table equivalent, but if all you want is a count of each unique value within a column, this can be accomplished using the value_counts function:

count = df['URL'].value_counts()
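A quick sketch of what value_counts returns, using a made-up URL column (the paths are assumptions, not real ranking data):

```python
import pandas as pd

# Hypothetical URL column for illustration.
df = pd.DataFrame({"URL": ["/a", "/b", "/a", "/c", "/a", "/b"]})

# Returns a Series indexed by unique URL, sorted from most to least frequent.
count = df["URL"].value_counts()

# The first index entry is therefore the most frequently listed URL.
top_url = count.index[0]

print(count)
print(top_url)
```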

Using groupby is also an option.

Earlier in the article, performing a groupby that aggregated our data wasn’t what we wanted – but it’s precisely what’s required here:

grouped = df.groupby('URL').agg(
    url_frequency=('Keyword', 'count'),
    avg_position=('Position', 'mean'),
)

grouped.reset_index(inplace=True)
Python groupby pivot (screenshot from VS Code, November 2022)
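To show how this named aggregation behaves, and how it might be extended, here is a small sketch with invented ranking rows. The best_position column and the final sort are additions for illustration and are not part of the article's original snippet.

```python
import pandas as pd

# Hypothetical ranking data; values are made up for illustration.
df = pd.DataFrame({
    "Keyword": ["k1", "k2", "k3", "k4"],
    "URL": ["/a", "/a", "/b", "/a"],
    "Position": [1, 3, 5, 8],
})

grouped = df.groupby("URL").agg(
    url_frequency=("Keyword", "count"),   # how many keywords each URL ranks for
    avg_position=("Position", "mean"),    # average ranking position
    best_position=("Position", "min"),    # extra aggregate, added for illustration
)

# Turn the URL index back into a column and list the most frequent URLs first.
grouped = grouped.reset_index().sort_values("url_frequency", ascending=False)

print(grouped)
```

Each keyword argument to agg names an output column and pairs it with a source column and an aggregation, so adding another summary is a one-line change.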

Two aggregate functions have been applied in the example above, but this could easily be expanded upon, and 13 different types are available.

There are, of course, times when we do want to use pivot_table, such as when performing multi-dimensional operations.

To illustrate what this means, let’s reuse the ranking groupings we made using conditional statements and attempt to display the number of times a URL ranks within each group.

ranking_groupings = df.groupby(['URL', 'Grouping']).agg(
    url_frequency=('Keyword', 'count'),
)
Python groupby grouping (screenshot from VS Code, November 2022)

This isn’t the best format to use, as multiple rows have been created for each URL.

Instead, we can use pivot_table, which will display the data in different columns:

pivot = pd.pivot_table(
    df,
    index=['URL'],
    columns=['Grouping'],
    aggfunc='size',
    fill_value=0,
)
Python pivot table (screenshot from VS Code, November 2022)
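The sketch below runs the same pivot_table call on a few invented rows so the wide output is easier to picture. The Grouping labels ("Top 3", "Top 10", "Top 20") are assumptions standing in for the ranking groups created earlier in the article.

```python
import pandas as pd

# Hypothetical data; the Grouping labels are assumed for illustration.
df = pd.DataFrame({
    "Keyword": ["k1", "k2", "k3", "k4", "k5"],
    "URL": ["/a", "/a", "/b", "/a", "/b"],
    "Grouping": ["Top 3", "Top 10", "Top 3", "Top 3", "Top 20"],
})

# One row per URL, one column per Grouping; each cell is the number of rows
# in that URL/Grouping combination, with 0 where no rows exist.
pivot = pd.pivot_table(
    df,
    index=["URL"],
    columns=["Grouping"],
    aggfunc="size",
    fill_value=0,
)

print(pivot)
```

Compared with the groupby output, each URL now occupies a single row, which is easier to scan and to export back to a spreadsheet.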

Final Thoughts

Whether you’re looking for inspiration to start learning Python, or are already leveraging it in your SEO workflows, I hope that the above examples help you along on your journey.

As promised, you can find a Google Colab notebook with all of the code snippets here.

In truth, we’ve barely scratched the surface of what’s possible, but understanding the basics of Python data analysis will give you a solid base upon which to build.


What 2022 SEO Shifts Could Mean For 2023 & Beyond [Webinar] via @sejournal, @lorenbaker

Have you ever felt overwhelmed by Google’s seemingly constant algorithm updates? If so, you’re certainly not alone.

Many SEO professionals are reeling from Google’s whirlwind of a year, with eight confirmed and several unconfirmed updates that have dropped in 2022.

And with so much volatility in search this past year, it can often feel like you’re scrambling to keep up.

But what does the chaos of 2022 mean for 2023? Can we expect more updates? Will we see more testing?

How can you get on the front end of Google’s new rollouts and make sure you’re prepared for the changes to come?

How can you adapt your SEO strategy to keep it fresh and relevant?

For SEO pros looking to get ahead of the curve, our next webinar focuses on how to handle frequent algorithm changes and market shifts.

Join Pat Reinhart, VP of Customer Success at Conductor, for an in-depth recap of this year’s biggest SEO insights, as well as expert predictions for what 2023 may hold.

Key Takeaways From This Upcoming Google Algorithm Webinar

  • What a crazy 2022 for Google means for 2023.
  • How the growth of social media search will impact strategy next year.
  • What the popularity of visual search will mean going forward.

Trends To Watch For In 2023

As technology continues to evolve and new digital trends emerge, the SEO community must quickly adapt.

With image search becoming more prominent and Google starting to prioritize short-form videos on mobile SERPs, visual content is predicted to make a major impact on search rankings going forward.

Between the rise of social media and the explosion of short-form video content, there are several factors expected to have a major impact on SEO in 2023.

Not only are people sharing more on social platforms now, but an increasing number of people are relying on social media search to find what they’re looking for online.

This trend, plus the growing popularity of visual search, should be key considerations in your SEO strategy for next year.

Discover more insights on how these trends could affect SEO next year by signing up for this webinar.

Optimize Your SEO Strategy

If you struggled to keep up with this year’s frequent search engine updates, the SEO predictions you’ll discover in this webinar could be a game-changer for your business.

If you want to stay competitive in 2023, it’s time to take action and start optimizing your SEO strategy.

Register for this webinar and learn more about how Google’s recent past will inform the future.

Elon Musk has created a toxic mess for the LGBTQ+ community. I would know.

A mere day after Elon Musk reactivated Rep. Marjorie Taylor Greene’s Twitter account, she tweeted that I’m a “communist groomer,” presumably because I’m a gay Jewish Democratic elected official from San Francisco. 

Greene’s tweet also promoted her proposed federal law to ban gender-affirming care for transgender youth and to make it effectively impossible for adult transgender people to receive that care. In the past when Greene has gone after me with homophobic or transphobic tropes, I’ve received increased abuse on social media, but this was an escalation beyond what I’m used to. And that escalation, which was especially pronounced after the Club Q massacre, was due less to Greene than to Twitter’s new owner, Elon Musk.

Since finalizing his purchase of Twitter, Musk has brought some of the platform’s most notorious banned users back to the flock. Shortly before he restored Greene’s account, he reactivated the accounts of Donald Trump and Kanye West (of “death con 3 on Jewish people” fame). He’s also reinstated the accounts of Project Veritas, which had engaged in severe doxxing; James Lindsay, who popularized the “OK groomer” hashtag, opined that Joe McCarthy hadn’t gone far enough, and referred to a Jewish person as “Dr. Lampshades” (a Holocaust myth that holds that Jewish skin was used to make lampshades); and Andrew Tate, who said that rape victims bear responsibility for getting raped.

Musk is now promising—based on a Twitter “poll” that was reportedly mobbed with extremist 4chan users—to reactivate any suspended account that didn’t violate the law or generate egregious spam. That could be quite the motley crew: for example, Nick Fuentes, a white supremacist who said “the Jews had better start being nice to people like us, because what comes out of this is going to be a lot uglier and a lot worse for them”; Milo Yiannopoulos, who worked closely with Nazi and white supremacist leaders, was Sieg Heil saluted by Nazis, used antisemitic words as passwords, and recently posted about the “Jewish powers that be who hate Jesus Christ, hate our country, and see us all as disposable cattle according to their ‘holy’ book” (Yiannopoulos interns for Greene); and an endless cast of lesser-known insurrectionists, bigots, and online harassers. And given that Trump absolutely broke the law by inciting people to violent insurrection, Musk’s “violate the law” exclusion appears to be quite limited.

While Twitter is a small platform compared with other major social media, this shift matters tremendously. Twitter punches way above its weight class. It is an incredibly important platform for our democracy—a place where ideas and information germinate, spread, and break out of Twitter itself into broader media and public perception. Whether for politics, media, science, medicine, history, or pretty much any other subject area, Twitter has become an epicenter of public discourse in American life. 

Make no mistake: the reinstatement of these accounts will make Twitter far more toxic than it was before. The people previously banned from Twitter are not just benign trolls. Many have engaged in aggressive antisemitic, homophobic, transphobic, or racist harassment campaigns; are doxxers; are egregious purveyors of misinformation that risks violence or promotes vaccine lies; or have incited or continue to incite insurrection. Bringing them back not only forgives their past behavior, it validates and enshrines their rhetoric as pillars of Twitter’s platform going forward.

Musk’s reinstatement effort appears to stem from his assertion that he is a “free speech absolutist.” Putting aside that he’s banned multiple progressive accounts that parodied him—parody being one of the most powerful and essential forms of free speech—his free speech absolutism is actually about free hate speech, free harassment speech, and free incitement speech. Combined with his decimation of Twitter’s content moderation staff, Twitter will quickly become the free-for-all hellscape Musk insists he wants to avoid.

If Twitter becomes a right-wing cesspool—even if it’s just a more benign version of 4chan—its role as a democratizing host to global conversations will quickly collapse, as people who don’t think Fuentes or other white supremacists and Nazis are awesome flee the platform.

More tangibly for Twitter users—and for those who are not on Twitter but are nevertheless targeted on the increasingly unmoderated platform—an antisemitic, racist, homophobic, transphobic, xenophobic, threatening Twitter cesspool puts a lot of people in actual physical danger. I say this based on personal experience, as that gay Jewish Democrat from San Francisco.

Over the past several years, I’ve received thousands of death threats, overwhelmingly on or stemming from social media, largely in response to my work advancing LGBTQ+ civil rights, with a secondary source being my work to expand vaccine access.

The threats and harassment started when I wrote a law to repeal several felonies that singled out people living with HIV for harsh criminal treatment (felonies that didn’t apply to people with any other serious infectious diseases). The social media threats and harassment then exploded when I authored a law—supported by law enforcement, civil rights organizations, and victim advocacy groups—to end discrimination against LGBTQ+ young people when determining who should be included on California’s sex offender registry. That bill started the QAnon slander campaign tidal wave against me, describing me as a “pedophile” and “groomer.” The threats and harassment flared up again when I drafted legislation to allow transgender kids and their families to seek refuge in California if they are being criminalized in states that seek to ban gender-affirming care for trans youth, like Texas and Alabama, and when I pursued legislation to allow teenagers to get vaccinated without parental consent and protect their own health.

The threats and harassment directed at me on social media have been breathtaking. I’ve been doxxed. I’ve been repeatedly threatened with decapitation and rape. I’ve been told that the sender would come find me with a gun. I received a bomb threat that led to the police sweeping my home with a bomb-sniffing dog. Several threats, either from or almost certainly inspired by social media, resulted in criminal prosecutions and convictions for those who issued them. For the first time in my life, I had to testify before a jury—against a man who was threatening my very existence.

As I received these waves of death threats, I learned a lot about the various social media platforms and how they handle the problem. YouTube was the slowest to address the threats and harassment. Meta (mostly Instagram but also Facebook) was initially quite slow to take action but got better over time. Twitter was the most responsive and proactive, but I’m confident that, going forward, it won’t be any better than the other platforms. It’ll likely be much worse for people like me.

Yet as bad as it’s been for me, I’m one of the lucky ones. I’m privileged because I have resources. I have a platform and a role where I can highlight this issue, as I’m doing in this piece. 

The same can’t be said about the vast majority of people who are threatened, stalked, harassed, or doxxed on Twitter and other platforms and whose lives will get worse as Musk empties out the Twitter equivalent of the Phantom Zone, allowing vicious, bigoted, and even violent harassers, Nazis, and white supremacists to return. 

School board members, teachers, and librarians are being targeted by extremists claiming these educators are “grooming” their kids to be transgender or teaching them critical race theory. Progressive activists’ home addresses are being posted online, as are pictures of community leaders’ families. Public health leaders are viciously harassed and threatened by anti-vaxxers, and physicians are harassed and threatened by elements of the anti-choice movement.

Suffice it to say that for every prominent public figure like me who’s getting harassed and threatened, thousands of people are suffering in silence.

Elon Musk owns Twitter, and he has the power to shape and change it. Yet Twitter is so much more than a private asset. It matters to our democracy and public discourse. And it matters in terms of whether people are safe. Musk lives in a rarefied world. He is, in fact, the richest man in the world. He has access to every conceivable resource—security, investigators, or whatever else he needs.

Most of us don’t have those resources. As Musk plays his chaotic Twitter game, we’re the ones left suffering the consequences.

Scott Wiener is a California state senator who represents San Francisco and northern San Mateo County.

What Shanghai protesters want and fear

China Report is MIT Technology Review’s newsletter about technology developments in China. Sign up to receive it in your inbox every Tuesday.

The past week has meant many sleepless nights for people in China, and for people like me who are intently watching from afar. 

You may have seen that nearly three years after the pandemic started, protests have erupted across the country. In Beijing, Shanghai, Urumqi, Guangzhou, Wuhan, Chengdu, and more cities and towns, hundreds of people have taken to the streets to mourn the lives lost in an apartment fire in Urumqi and to demand that the government roll back its strict pandemic policies, which many blame for trapping those who died. 

It’s remarkable. It’s likely the largest grassroots protest in China in decades, and it’s happening at a time when the Chinese government is better than ever at monitoring and suppressing dissent.

Videos of these protests have been shared in real time on social media—on both Chinese and American platforms, even though the latter are technically blocked in the country—and they have quickly become international front-page news. However, discussions among foreigners have too often reduced the protests to the most sensational clips, particularly ones in which protesters directly criticize President Xi Jinping or the ruling party.

The reality is more complicated. As in any spontaneous protest, different people want different things. Some only want to abolish the zero-covid policies, while others have made direct calls for freedom of speech or a change of leadership. 

I talked to two Shanghai residents who attended the protests to understand what they experienced firsthand, why they went, and what’s making them anxious about the thought of going again. Both have requested we use only their surnames, to avoid political retribution.

Zhang, who went to the first protest in Shanghai after midnight on Saturday, told me he was motivated by a desire to let people know his discontent. “Not everyone can silently suffer from your actions,” he told me, referring to government officials. “No. People’s lives have been really rough, and you should reflect on yourself.”

In the hour that he was there, Zhang said, protesters were mostly chanting slogans that stayed close to opposing zero-covid policies—like the now-famous line “Say no to covid tests, yes to food. No to lockdowns, yes to freedom,” which came from a protest by one Chinese citizen, Peng Lifa, right before China’s heavily guarded party congress meeting last month. 

While Peng hasn’t been seen in public since, his slogans have been heard and seen everywhere in China over the past week. Relaxing China’s strict pandemic control measures, which often don’t reflect a scientific understanding of the virus, is the most essential—and most agreed-upon—demand. 

One picture that’s been circulating widely on Chinese social media since Monday is a good example of these more pragmatic calls. Among six demands listed, it asks the government to apologize for unreasonable covid policies, to stop exaggerating the risks of contracting covid, to abandon QR-code-based pandemic surveillance measures, and to resume allowing everyday activities like dining in restaurants and going to movie theaters.   

It was really only later that night (or, more accurately, early the next morning, around 3 a.m.), that the chants got more radical and more political, when some people directly called for the Chinese Communist Party and Xi to step down. Zhang had already left by then, but from home he saw videos on social media. 

Chen, another Shanghai resident, went to the second protest on Sunday afternoon in the same location and heard much of the same as Zhang. She said that while everyone echoed the demands for relaxing the testing system and increasing freedom, there were some chants explicitly mentioning Xi or the Communist Party. These, she said, were noticeably less loud. 

Chen agreed that people have the right to say whatever they want, but she worried that it may divert the public’s attention from what she sees as the core message: “It’s unnecessary to shout out too radical political slogans from the beginning. It’s too radical.” 

The people protesting are clearly not a monolith. And, to be fair, it is the first time many of them are participating in a protest in real life; they are just learning how it works. They came out of their homes because they have been genuinely disturbed by the increased covid control measures. Even after the Chinese government announced a policy to loosen restrictions in early November, the reality on the ground hasn’t really changed. In some cities, local government officials have doubled down on controls. When people hit the streets, they might be thinking of the things that are closest to their lives and not what that means on a higher political level. 

It’s understandable that the rare direct criticism of China’s top leadership has raised more eyebrows overseas and made it into newspaper headlines. But it has also stirred worries that this organic, homegrown movement will be painted as foreign interference. In fact, that’s already happening. Some Chinese pro-government influencers have highlighted the anti-Xi slogans to claim that foreign actors are pushing a “color revolution.” 

(Other protesters argue that the legitimacy of the protests would be doubted regardless of whether the slogans were radical or not. Smearing protesters as foreign actors is an old rule in the Chinese information-control playbook.) 

So what’s going to happen next? We don’t know how long the protests are going to continue, but they have become much harder to organize and attend since the Chinese police gradually reacted to the events and increased their enforcement activities. 

While Zhang has friends who worry that protesters are being pushed to become more radical as the demonstrations continue, that in particular does not trouble him. He told me he thinks it’s perfectly fine for people to have a range of thoughts and feelings. “[If you don’t agree], you can just choose not to say it,” Zhang said. “In protests, there are always going to be slogans that are too radical. You can either choose peaceful demonstrations and not say anything; or if you are speaking out, then don’t be afraid.”

What does worry him is how China’s well-oiled state surveillance system can be easily deployed against these protesters—an important part of the risk calculation for anyone who has participated and who still wants to go. Zhang read on social media that protesters in Beijing suspect their health code data has been used against them to determine who showed up. There are also reports of police checking people’s phones in Shanghai, which deeply concerned Chen and made her take a different route to work on Monday to avoid the police presence.

Chen said she worries about going to a protest again and ending up alone and falling victim to the police. But she would go if enough people showed up; she wants to, because the experience of the past days has taught her that protests really matter. 

Back in October, when Peng Lifa staged that single-person protest, Chen thought it would go unnoticed. But seeing so many people in different cities chanting the same words that Peng wrote has convinced her that protests, no matter how small, can get the message across in today’s China. “These fights have meaningful results,” she said. “The [results] may not show up the next day, but they will.”

What else do you want to know about the protests? Write me at zeyi@technologyreview.com

Catch up with China

1. What else you need to know about the protests in China:

  • A Uyghur living in exile confirmed that five of his relatives died in the Urumqi fire, which inspired the nationwide protests. (AP)
  • Twitter, with its massively reduced anti-propaganda team, is struggling with the rise of porn spam that has obscured search results on what’s happening in Chinese cities. (Washington Post $)
  • Blank sheets of white paper have become the new protest symbol. (Wall Street Journal $)
  • Last week, in a separate but related protest, workers in a Foxconn factory in China clashed, sometimes violently, with security forces over salary changes and covid-infection concerns. (CNN)

2. China plans to revise its antitrust law, adding many new rules targeting tech platforms. (South China Morning Post $)

3. Four Chinese immigrants working on a marijuana farm in Oklahoma were recently killed. (NBC News)

  • While it’s too early to know if it was the case in this incident, during the pandemic thousands of Chinese immigrants living on the West Coast were lured and trafficked to cannabis farms in New Mexico, Oklahoma, and the Navajo Nation. (Searchlight New Mexico)

4. The Vatican was taken by surprise by the installation of a bishop in China in a diocese that the church does not recognize. (Vatican News)

5. Serbian police bought and used Huawei-made surveillance equipment to identify fugitives and record videos of protesters. (Radio Free Europe)

6. Chinese company Sino Biopharm announced it has successfully developed three mRNA vaccines to prevent monkeypox. (News Medical)

7. Popular video games like World of Warcraft and Overwatch will no longer be playable in China after a deal between Activision Blizzard and the Chinese company NetEase fell through. (BBC)

8. China may be the biggest climate polluter today, but data shows the US is responsible for the most emissions throughout history. (MIT Technology Review)

Lost in translation

When three Chinese artists found themselves in a centralized quarantine facility in Sichuan, they decided to turn eight days in solitary into an art experiment. 

A collage of the art pieces by three artists.

As Chinese publication Bingdian Weekly reported, Meng Lichao, Chen Yu, and Yang Yang were supposed to attend an art festival in early November, but a last-minute covid case in the hotel where they were staying meant all three artists had to be transferred to a quarantine facility. Since they were missing the festival, they decided to put up art exhibitions in their individual rooms instead. Meng drew doodles over every inch of the walls and made an audio installation mixing EDM music and audio samples that say “You are being monitored.” Chen printed out surveillance camera footage of fellow residents opening their doors without management’s approval that had been shared in an attempt to publicly shame them. Yang made a collage on the wall with medical waste trash bags, cotton swabs, and food packaging from his quarantine meals.

In the end, since it’s a quarantine facility, no one could come in to see the art in their rooms except for the next batch of residents, who arrived just hours after they left.

One more thing

Who says you can’t find peace and serenity in your phone? Young Chinese people are using apps that simulate “wooden fish”—a special woodblock that Buddhist monks knock rhythmically in ceremonies—to purify themselves of sins and acquire “merit scores.” Well, most of the time it’s more of a tongue-in-cheek joke for these people than a serious religious practice. But app developers have since come up with different variations of digital wooden fish, sometimes gamifying the practice and allowing users to compete with friends for the highest merit score.

Screenshot of a video where someone knocks on the wooden fish on an iPad screen.