Controversial CRISPR scientist promises “no more gene-edited babies” until society comes around

He Jiankui, the Chinese biophysicist whose controversial 2018 experiment led to the birth of three gene-edited children, says he’s returned to work on the concept of altering the DNA of people at conception, but with a difference. 

This time around, he says, he will restrict his research to animals and nonviable human embryos. He will not try to create a pregnancy, at least until society comes to accept his vision for “genetic vaccines” against common diseases.

“There will be no more gene-edited babies. There will be no more pregnancies,” he said during an online roundtable discussion hosted by MIT Technology Review, during which He answered questions from biomedicine editor Antonio Regalado, editor in chief Mat Honan, and our subscribers.

During the interview, He defended his past research and said the “only regret” he had was the difficulties he had caused to his wife and two daughters. He spent three years in prison after a court found him guilty of breaking regulations, but since his release in 2022 he has sought to stage a scientific comeback.

He says he currently has a private lab in the city of Sanya, in Hainan province, where he works on gene therapy for rare disease as well as laboratory tests to determine how, one day, babies could be born resistant to ever developing Alzheimer’s disease.

The Chinese scientist said he’s receiving financial support from individuals in the US and China, and from Chinese companies, and has received an offer to form a research company in Silicon Valley. He declined to name his investors.

Read the full transcript of the event below.

Mat Honan: Hello, everybody. Thanks for joining us today. My name is Mat Honan. I’m the editor in chief here at MIT Technology Review. I’m really thrilled to host what’s going to be, I think, a great discussion today. I’m joined by Antonio Regalado, our senior editor for biomedicine, and He Jiankui, who goes by the name JK. 

JK is a biophysicist. He’s based in China, and he used CRISPR to edit the genes of human embryos, which ultimately resulted in the first children born whose DNA had been tailored using gene editing. Welcome to you both.

To our audience tuning in today, I wanted to let you know if you’ve got questions for us, please do ask them in the chat window. We’ve got a packed discussion planned, but we will get to as many of those as we can throughout. Antonio, I think I’m going to start with you, if we can. You’re the one who broke this story six years ago. Why don’t you set the stage for what we’re going to be talking about here today, and why it’s important.

Antonio Regalado: Mat, thank you.

The subject is genome editing. Of course, it’s a technology for changing the DNA inside of individual cells, including embryos. It’s hard to overstate its importance. I put it up there with the invention of the transistor and artificial intelligence.

And why do I think so? Well, genome editing gives humans control, or at least the ability to try and direct the very processes that brought us about as a species. So it’s that profound.

Getting to JK’s story. In 2018 we had a scoop—he might call it a leak—in which we described his experiment, which, as Mat said, was to edit human embryos to delete a particular gene called CCR5 with the goal of rendering the children, of which there were three, immune to HIV, which their fathers had and which is a source of stigma in China. So that was the project.

Of course our story set off, you know, immediate chaos. Voices were raised all over the world—many critical, a few in support. But one of the consequences was that JK and his team, the parents and the doctors, did not have the ability to tell their own story—in JK’s case because he was, in fact, detained and has completed a term in prison. So we’re happy to have him here to answer my questions and those of our subscribers. JK, thank you for being here. 

Several people, including Professor Michael Waitzkin of Duke University, would like to know what the situation is with the three children. What do you know about their health, and where is this information coming from?

He Jiankui: Lulu, Nana, and the third gene-edited baby—they were healthy and are living a normal, peaceful, undisturbed life. They are as happy as any other people, any other children in kindergarten. I have maintained a constant connection with their parents.

Antonio Regalado: I see. JK, on X, you recently made a comment about one of the parents—now a single mother—who you said you were supporting financially. What can you tell us about that situation? What kind of obligations do you have to these children, and are you able to meet those obligations?

He Jiankui: So the third genetic baby—the parents divorced, so the girl is with her mother. You know, a single mother, a single-parent family—life is not easy. So in the last two years, I’m providing some financial support, but I’m not sure it’s the right thing to do or whether it’s ethical, because I’m a scientist or a doctor, and she is a volunteer or patient. For scientists or doctors to provide financial support to the volunteer or patient—is it correct? Is it the right thing to do, and is it ethical? That’s something I’m not sure of. So I have this question, actually.

Antonio Regalado: Interesting. Well, there’s a lot of ethical dilemmas here, and one of them is about your publications, the scientific publications which you prepared and which describe the experiment. So a two-part question for you. 

First of all, setting the ethics aside, some people who criticized your experiment still want to know the result. They would like to know if it worked. Are the children resistant to HIV or not? So part one of the question is: Are you able to make a measurement on their blood, or is anybody able to make a measurement that would show if the experiment worked? And second part of the question: Do you intend to publish your paper, including as a preprint or as a white paper?

He Jiankui: So I always believe that scientific research must be open and transparent, so I am willing to publish my papers, which I wrote six years ago.

It was rejected by Nature, for some reason. But even today, I would say that I’m willing to publish these two papers in a peer-reviewed journal. It has to be peer-reviewed; that is the standard way to publish in a paper.

The other thing is whether the baby is resistant to HIV. Actually, several years ago, when we designed the experiment, we already collected the [umbilical] cord blood when they were born. We collected cord blood from the babies, and our original experiment design was to challenge the cord blood with the HIV virus to see whether they are actually resistant to HIV. But this experiment never happened, because when the news broke out, there has been no way to do any experiment since then. 

I would say I am happy to share my results to the whole world.

Mat Honan: Thanks, Antonio. Let me start with a question from a reader, Karen Jones. She asks, with so much controversy around breaking the law in China, she wanted to know about your credibility. And it reminds me of something that I’m curious about myself. What are the professional consequences of your work? Are you still able to work in China? Are you still able to do experiments with CRISPR?

He Jiankui: Yes, I continue my research in the lab. I have a lab in Sanya [Hainan province], and also previously a lab in Wuhan.

My current work is on gene editing to cure genetic disease such as Duchenne muscular dystrophy and several other genetic diseases. And all this is done by somatic gene therapy, which means this is not working on human embryos.

Mat Honan: I think that leads [to] a question that we have from another reader, Sophie, who wanted to know if you plan to do more gene editing in humans.

He Jiankui: So I have proposed a research project using human embryo gene editing to prevent Alzheimer’s disease. I posted this proposal last year on Twitter. So my goal is we’re going to test the embryo gene editing in mice and monkeys, and in human nonviable embryos. Again, it’s nonviable embryos. There will be no more gene-edited babies. There will be no more pregnancies. We’re going to stop at human nonviable embryos. So our goal is to see if we could prevent Alzheimer’s for offspring or the next generation, because Alzheimer’s has no cure currently.

Mat Honan: I see. And then my last question before I move it back to Antonio. I’m curious if you plan to continue working in China, or if you think that you will ultimately relocate somewhere else. Do you plan to do this work elsewhere? 

He Jiankui: Some investors from Silicon Valley proposed to invest in me to start a company in the United States, with research done both in the United States and in China. This is a very interesting proposal, and I am considering it. I would be happy to work in the United States if there’s good opportunity.

Mat Honan: Let me just remind our readers—if you do have questions, you could put them in the chat and we will try to get to them. But in the meantime, Antonio, back over to you, please.

Antonio Regalado: Definitely, I’m curious about what your plans are. Yesterday Stat News reported some of the answers to today’s questions. They said that you have established yourself in the province of Hainan in China. So what kind of facility do you have there? Do you have a lab, or are you doing research? And where is the financial support coming from?

He Jiankui: So here I have an independent private research lab with a few people. We get funding from both the United States and also from China to support me to carry on the research on the gene therapy for Duchenne muscular dystrophy, for high cholesterol, and some other genetic diseases. 

Antonio Regalado: Could you be more specific about where the funding is coming from? I mean, who is funding you, or what types of people are funding this research? 

He Jiankui:  There are people in the United States who made a donation to me. I’m not going to disclose the name and amount. Also the Chinese people, including some companies, are providing funding to me.

Antonio Regalado: I wonder if you could sketch out for us—I know people are interested—where you think all this [is] going to lead. With a long enough time frame—10 years, 20 years, 30 years—do you think the technology will be in use to change embryos, and how will it be used? What is the larger plan that you see?

He Jiankui: I would say in 50 years, like in 2074, embryo gene editing will be as common as IVF babies to prevent all the genetic disease we know today. So the babies born at that time will be free of genetic disease.

Antonio Regalado: You’re working on Alzheimer’s. This is a gene variant that was described in 2012 by deCode Genetics. This is one of these variants that is protective—it would protect against Alzheimer’s. Strictly speaking, it’s not a genetic disease. So what about the role of protective variants, or what could be called improvements to health?

He Jiankui: Well, I decided to do Alzheimer’s disease because my mother has Alzheimer’s. So I’m going to have Alzheimer’s too, and maybe my daughter and my granddaughter. So I want to do something to change it. 

There’s no cure for Alzheimer’s today. I don’t know for how many years that will be true. But what we can do is: Since some people in Europe are at a very low risk [for] Alzheimer’s, why don’t we just make some modifications so our next generation also have this protective allele, so they have a low risk of Alzheimer’s or maybe are free of Alzheimer’s. That’s my goal.

Antonio Regalado: Well, a couple of questions. Will any country permit this? I mean, genome editing, producing genome-edited children, was made formally illegal in China, I think in 2021. And it’s prohibited in the United States in another way. So where can you go, or where will you go to further this technology?

He Jiankui:  I believe society will eventually accept that embryo gene editing is a good thing because it improves human health. So I’m waiting for society to accept that. My current research is not doing any gene-edited baby or any pregnancy. What I do is a basic research in mice, monkeys, or human nonviable embryos. We only do basic research, but I’m certain that one day society will accept embryo gene editing.

Mat Honan: That raises a question for me. We’re talking about HIV or Alzheimer’s, but there are other aspects of this as well. You could be doing something where you’re optimizing for intelligence or optimizing for physical performance. And I’m curious where you think this leads, and if you think that there is a moral issue around, say, parents who are allowed to effectively design their children by editing their genes.

He Jiankui: Well, I advise you to read the paper I published in 2018 in the CRISPR Journal. It’s my personal thinking of the ethical guidelines for embryo gene editing. It was retracted by the CRISPR Journal. But I proposed that the embryo gene editing should only be used for disease. It should never be used for a nontherapeutic purpose, like making people smarter, stronger, or beautiful.

Mat Honan:  Do you not think that becomes inevitable, though, if gene-editing embryos becomes common?

He Jiankui: Society will decide that. 

Mat Honan: Moving on: You said that you were only working with animals or with nonviable embryos. Are there other people who you think are working with human embryos, with viable human embryos, or that you know of, or have heard about, continuing with that kind of work?

He Jiankui: Well, I don’t know yet. Actually, many scientists are keeping their distance from me. But there are people from somewhere, an island in Honduras or maybe some small East European country, inviting me to do that. And I refused. I refused. I will only do research in the United States and China or other major countries.

Mat Honan: So the short answer is, that sounded almost like a yes to me? You think that it is happening? Is that correct?

He Jiankui: I’m not answering that. 

Mat Honan: Okay, fair enough. I’m going to move on to some reader questions here while we have the time. You mentioned basically having society come around to seeing that this is necessary work. Ravi asks: What type of regulatory framework do you believe is necessary to ensure responsible development and applications of this technology? You had mentioned limiting to therapeutic purposes. Are there other frameworks you think should be in place?

He Jiankui: I’m not answering this question.

Mat Honan: What do you think should be in place in terms of regulation?

He Jiankui: Well, there are a lot of regulations. I personally comply with all the laws, regulations, and international ethics for my work. 

Mat Honan: I see. Go ahead, Antonio. 

Antonio Regalado: Let me just jump in with a related question. You talked about offers of funding from the United States, from Silicon Valley—offers of funding to support you. Is that to create a company, and how would accepting investment from entrepreneurs to start a company change public perception about the technology?

He Jiankui: Well, it was designed as a company registered in the United States and headquartered in the United States.

Antonio Regalado: But do you think that starting a company will make people more enthusiastic or interested in this technology?

He Jiankui: Well, for me, I would certainly be more happy to get an offer from the United States [if it came] from a university or research institution. I would be happy for that, but it’s not happening. But, well, a company started doing some basic research, and that’s also a good contribution.

Antonio Regalado: Getting back to the initial experiment—obviously, it’s been criticized a great deal. And I am just wondering, looking back, which of those criticisms do you accept? Which do you disagree with? Do you have regrets about the experiment?

He Jiankui: The only regret I have is to my family, my wife and my two daughters. In the last few years, they are living in a very difficult situation. I won’t let that happen again.

Antonio Regalado: The technology is viewed as controversial. I’m talking about embryo editing. So it’s a little bit surprising to me that you would return to it. Surprising and interesting. So why is it that you have decided to pursue this vision, this project, despite the problems? I mean, you’re still working on it. What is your motivation?

He Jiankui: Our stance is always for us to do something to benefit mankind.

Antonio Regalado: Speaking of mankind, or humankind, I did have a question about evolution. The gene edits that you made to CCR5 and now are working on to another gene in Alzheimer’s—these are natural mutations that occur in some populations, you mentioned in Europe. They’ve been discovered through population genetics. Studies of a large number of people can find these genetic variations that are protective, or believed to be protective, against disease. In the natural course of evolution, those might spread, right? But it would take hundreds of thousands of years. So with gene editing, you can introduce such a change into an embryo, I guess, in a matter of minutes.

So the question I have is: Is this an evolutionary project? Is it human technology being used to take over from evolution?

He Jiankui: I’m not interested in evolution. Evolution takes thousands of years. I only care about the people surrounding me—my family, and also the patients who would come to find me. What I want to do is help those people, help people in this living world. I’m not interested in evolution.

Antonio Regalado: Mat, any other question from the audience you’d like to throw in?

Mat Honan: Yeah, let me get to one from Rez, who’s asking: What do you see as the major hurdles in advancing CRISPR to more general health-care use cases? What do you see as the big barriers there?

He Jiankui:  If you’re talking about somatic gene therapy, the bottleneck, of course, is delivery. Without breakthroughs in delivery technology, somatic gene therapy is heading toward a dead end. For the embryo gene editing, the bottleneck, of course, is: How long will it take people to accept new technology? Because as humans, we are always conservative. We are always worried about the new things, and it takes time for people to accept new technology. 

Mat Honan: I wanted to get a question from Robert that goes back to our earlier discussion here, which is: What was your initial motivation to take this step with the three children?

He Jiankui: So several years ago, I went to a village in the center of China where more than 30% of people are infected with HIV. Back to the 1990s, many years ago, people sold blood, and it did something [spread HIV]. When I was there, I saw that there’s a very small kindergarten, only designed for the children of HIV patients. Why did that happen? Other public schools won’t take them. I felt that there’s a kind of discrimination to these children. And what I want to do is to do something to change it. If the HIV patient—if their children are not just free from but actually immune to HIV, then it will help them to go back to the society. For me, it’s just like a vaccine. It’s one vaccine to protect them for a lifetime. 

Mat Honan: I see we’re running short on time here, and I do want to try to get to some more of our reader questions. I know Antonio has a last one as well. If you do have questions, please put them in the chat. And from Joseph, he wants to know: You say that you think that the society will come around. What do you think will be the first types of embryo DNA edits that would be acceptable to the medical community or to society at large?

He Jiankui: Very recently, a patient flew here to visit me in my office. They are a couple, they are over 40 years old. They want to have a baby and already did IVF. They have embryos, but the embryos have a problem with a chromosome. So this embryo is not good. So one thing, apparently, we could do to help them is to correct the chromosome problem so they can have a healthy embryo, so they can have children. We’re not creating any immunity to anything—it’s just to restore the health of the embryo. And I believe that would be a good start.

Mat Honan: Thank you, JK. Antonio, back over to you. 

Antonio Regalado:  JK, I’m curious about your relationship to the government in China, the central government. You were punished, but on the other hand, you’re free to continue to talk about science and do research. Does the government support you and your ideas? Are you a member of the political party? Have you been offered membership? What is your relationship to the government?

He Jiankui: Next question.

Antonio Regalado: Next question? Okay. Interesting. We’ll have to postpone that one for another day.

Mat, anything else? I think we’re coming up against time, and I’m wondering if we have reader questions. I have one here that I could ask, which is about the new technologies in CRISPR. People want to know where this technology is going, in terms of the methods. You used CRISPR to delete a gene. But CRISPR itself is constantly being improved. There are new tools. So in your lab, in your experiments, what gene-editing technology are you employing?

He Jiankui:  So six years ago, we were using the original CRISPR-Cas9 invented by Jennifer Doudna. But today, we are moving on to base editing, invented by David Liu. The base editing, it’s safe in embryos. It won’t cut the DNA or break it—just small changes. So we no longer use CRISPR-Cas9. We’re using base editing.

Antonio Regalado: And can you tell me the nature of the genetic change that you’re experimenting with or would like to make in these cells to make them resistant to Alzheimer’s? How big a change are you making with this base editor, or trying to make with it?

He Jiankui: So to make people protected against Alzheimer’s, we just need a single base change in the whole human 3 billion letters of DNA. We just change one letter of it to protect people from Alzheimer’s.

Antonio Regalado: And how soon do you think that this could be in use? I mean, it sounds interesting. If I had a child, I might want them to be immune to Alzheimer’s. So this is quite an interesting proposal. What is the time frame in years—if it works in the lab—before it could be implemented in IVF clinics?

He Jiankui: I would say there’s the basic research that could be finished in two years. I won’t move on to the human trial. That’s not my role. It’s determined by society whether to accept it or not. And that’s the ethical side. 

Antonio Regalado: A last question on this from a reader. The question is: How do you prove the benefits? Of course, you can make a genetic change. You can even create a person with a genetic change. But if it’s for Alzheimer’s, it’s going to take 70 years before you know and can prove the results. So how can you prove its medical benefit? Or how can you predict the medical benefit?

He Jiankui: So one thing is that we can observe it in the natural world. There are already thousands of people with this mutation. It helps them against Alzheimer’s. It naturally exists in the population, in humans, so that’s a natural human experiment. And also we could do it in mice. We could use Alzheimer’s model mice and then to modulate DNA to see the results.

You might argue that it takes many years to develop Alzheimer’s, but in society, we’ve done a lot with the HPV vaccine against certain women’s cancers. Cancer takes many years to happen, but they take the HPV vaccine at age eight or seven.

Mat Honan: Thank you so much. JK and Antonio, we are slightly past time here, and I’m going to go ahead and wrap it up. Thank you very much for joining us today, to both of you. And I also want to thank all of our subscribers who tuned in today. I do hope that we see you again next month at our Roundtable in August. It’s our subscriber-only series. And I hope you enjoyed today. Thanks, everybody. 

Antonio Regalado: Thank you, JK.

He Jiankui: Thank you. 

Mystic Gum Sees Early DTC Success

Braxton Manley first appeared on the podcast in 2021. As a college student, he had launched Braxley Bands, a maker of Apple Watch bands. Last year he returned with an update on that business after operational and sales challenges.

He’s back, having launched his latest company, Mystic, a direct-to-consumer maker of health-focused chewing gum. In our recent conversation, we discuss the origins of Mystic, marketing plans, early successes, and more.

The transcript is edited for length and clarity.

Eric Bandholz: How’s business?

Braxton Manley: Braxley Bands, our Apple Watch band company, is surviving in a challenging climate. We’re operating from a profit-first mentality. We grow as much as possible and, based on the prior month’s profit-and-loss statement, scale back if needed. It’s multiple scale-ups, then pull-backs. My brother Zach and I run the business, working remotely. We haven’t taken a salary in a while and are focused on the business’s long-term stability.

I’m involved with three direct-to-consumer ecommerce businesses now. My fiancée, Maddie, started Peace Love Hormones about three years ago. It’s a direct-to-consumer supplement brand for women’s hormone health. I have an executive role there, functioning as CEO so that Maddie can pursue her doctorate in herbal medicine and focus on the product. I focus on the marketing and operations.

Our third business, Mystic, just launched. It’s chewing gum for women made with sap from a mastic tree, which grows on a Greek island and has a ton of health benefits.

We’re trying to build a family holding company to operate multiple DTC businesses. At this point, they’re all relatively humble — six and seven figures in annual revenue.

Bandholz: Tell me about Mystic.

Manley: It’s square chunks of organic gum. It costs $38 for a can. It’s a beauty product for women and is categorized that way on TikTok. It’s different from regular gum. It’s not sweet at all. It’s palate-cleansing. It relieves indigestion and promotes oral health. You can develop an appreciation for the flavor.

The business is six months old. We’ve been fulfilling orders for just a week. The beginning stage was figuring out what the logo would look like. We did a beta test last year. We invested about $3,000 and ended up selling $20,000 worth. We realized we had a viable product.

We then raised $90,000 from friends and family. We developed custom packaging and produced 5,000 gum units — enough to make our first $200,000 in revenue.

Bandholz: How are you marketing the product?

Manley: Well, we’re a week into fulfilling orders. So it is fresh. We’ve spent much time on a TikTok Shop. We believe TikTok is a good product fit.

Affiliates are important to us too. Maddie, my fiancée, is an Instagram creator in the health and wellness space. She has an incredible community, which produced our first Mystic orders — about $5,000 in revenue. By Q4, we’ll be doing six figures monthly. This can scale quickly.

We sell recurring orders, but we’re not using the terms “subscribers” or “subscriptions.” Instead, we sell memberships to a gum-chewing club. We have cool hats, a club logo, and patches. The idea is to build a culture. We will charge more for our first subscription and less for renewals. It’s $38 for a one-time order or $30 to join the club for recurring shipments.

Bandholz: Where can people buy the gum and follow you?

Manley: Go to MysticGum.com. You can follow me on X, @Braxtonmanley, or LinkedIn.

Google Cautions On Blocking GoogleOther Bot via @sejournal, @martinibuster

Google’s Gary Illyes answered a question about the non-search features that the GoogleOther crawler supports, then added a caution about the consequences of blocking GoogleOther.

What Is GoogleOther?

GoogleOther is a generic crawler created by Google for the various purposes that fall outside of those of bots that specialize for Search, Ads, Video, Images, News, Desktop and Mobile. It can be used by internal teams at Google for research and development in relation to various products.

The official description of GoogleOther is:

“GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development.”

Something that may be surprising is that there are actually three kinds of GoogleOther crawlers.

Three Kinds Of GoogleOther Crawlers

  1. GoogleOther
    Generic crawler for public URLs
  2. GoogleOther-Image
    Optimized to crawl public image URLs
  3. GoogleOther-Video
    Optimized to crawl public video URLs

All three GoogleOther crawlers can be used for research and development purposes. That’s just one purpose that Google publicly acknowledges that all three versions of GoogleOther could be used for.

What Non-Search Features Does GoogleOther Support?

Google doesn’t say what specific non-search features GoogleOther supports, probably because it doesn’t really “support” a specific feature. It exists for research and development crawling, which could support a new product or an improvement to a current one. It’s a highly open and generic purpose.

This is the question that Gary read aloud:

“What non-search features does GoogleOther crawling support?”

Gary Illyes answered:

“This is a very topical question, and I think it is a very good question. Besides what’s in the public I don’t have more to share.

GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development.

Historically Googlebot was used for this, but that kind of makes things murky and less transparent, so we launched GoogleOther so you have better controls over what your site is crawled for.

That said GoogleOther is not tied to a single product, so opting out of GoogleOther crawling might affect a wide range of things across the Google universe; alas, not Search, search is only Googlebot.”

It Might Affect A Wide Range Of Things

Gary is clear that blocking GoogleOther wouldn’t have an effect on Google Search because Googlebot is the crawler used for indexing content. So if blocking any of the three versions of GoogleOther is something a site owner wants to do, then it should be okay to do that without a negative effect on search rankings.

But Gary also cautioned about the outcome of blocking GoogleOther, saying that it might affect other products and services across Google. He didn’t state which other products it could affect, nor did he elaborate on the pros or cons of blocking GoogleOther.
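
The split Gary describes, where Search relies on Googlebot while GoogleOther serves other teams, can be checked against a draft robots.txt before deploying it. The following is a minimal sketch using Python’s standard-library `urllib.robotparser`. It is not Google’s own parser, so its matching may differ from Google’s in edge cases. The `example.com` URL and the contents of `ROBOTS_TXT` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft robots.txt: deny the three GoogleOther tokens,
# leave every other crawler (including Googlebot) unrestricted.
ROBOTS_TXT = """\
User-agent: GoogleOther
User-agent: GoogleOther-Image
User-agent: GoogleOther-Video
Disallow: /

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
url = "https://example.com/some-page"  # placeholder URL

# Map each user-agent token to whether the draft lets it fetch the URL.
agents = ("Googlebot", "GoogleOther", "GoogleOther-Image", "GoogleOther-Video")
results = {agent: parser.can_fetch(agent, url) for agent in agents}

for agent, allowed in results.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

With this draft, the three GoogleOther tokens are denied while Googlebot stays allowed, which matches the point that blocking GoogleOther should not by itself affect Search crawling.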

Pros And Cons Of Blocking GoogleOther

Whether or not to block GoogleOther doesn’t necessarily have a straightforward answer. Several considerations bear on whether doing so makes sense.

Pros

Inclusion in research for a future Google product that’s related to search (maps, shopping, images, a new feature in search) could be useful. A site included in that kind of research might be one of the few chosen to test a feature that could increase its earnings.

Another consideration is that blocking GoogleOther to save on server resources is not necessarily a valid reason because GoogleOther doesn’t seem to crawl so often that it makes a noticeable impact.

If blocking Google from using site content for AI is a concern then blocking GoogleOther will have no impact on that at all. GoogleOther has nothing to do with crawling for Google Gemini apps or Vertex AI, including any future products that will be used for training associated language models. The bot for that specific use case is Google-Extended.
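
Because the AI-training opt-out uses a different token from GoogleOther, the two can be tested independently. The sketch below again relies on Python’s standard-library `urllib.robotparser` rather than Google’s own parser. The draft file, the helper name `is_allowed`, and the `example.com` URL are illustrative assumptions.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical draft: opt out of Google-Extended only; all other crawlers stay open.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


url = "https://example.com/article"  # placeholder URL
extended_allowed = is_allowed(ROBOTS_TXT, "Google-Extended", url)
other_allowed = is_allowed(ROBOTS_TXT, "GoogleOther", url)
googlebot_allowed = is_allowed(ROBOTS_TXT, "Googlebot", url)

print(extended_allowed, other_allowed, googlebot_allowed)
```

In this draft only the Google-Extended token is denied. A rule aimed at GoogleOther would leave Google-Extended untouched, and the reverse holds too.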

Cons

On the other hand it might not be helpful to allow GoogleOther if it’s being used to test something related to fighting spam and there’s something the site has to hide.

It’s possible that a site owner might not want to participate if GoogleOther comes crawling for market research or for training machine learning models (for internal purposes) that are unrelated to public-facing products like Gemini and Vertex.

Allowing GoogleOther to crawl a site for unknown purposes is like giving Google a blank check to use your site data in any way they see fit outside of training public-facing LLMs or purposes related to named bots like Googlebot.

Takeaway

Should you block GoogleOther? It’s a coin toss. There are potential benefits, but in general there isn’t enough information to make an informed decision.

Listen to the Google SEO Office Hours podcast at the 1:30 minute mark:

Featured Image by Shutterstock/Cast Of Thousands

Reddit Limits Search Engine Access, Google Remains Exception via @sejournal, @MattGSouthern

Reddit has recently tightened its grip on who can access its content, blocking major search engines from indexing recent posts and comments.

This move has sparked discussions in the SEO and digital marketing communities about the future of content accessibility and AI training data.

What’s Happening?

First reported by 404 Media, Reddit updated its robots.txt file, preventing most web crawlers from accessing its latest content.

Google, however, remains an exception, likely due to a $60 million deal that allows the search giant to use Reddit’s content for AI training.

Brent Csutoras, founder of Search Engine Journal, offers some context:

“Since taking on new investors and starting their pathway to IPO, Reddit has moved away from being open-source and allowing anyone to scrape their content and use their APIs without paying.”

The Google Exception

Currently, Google is the only major search engine able to display recent Reddit results when users search with “site:reddit.com.”

This exclusive access sets Google apart from competitors like Bing and DuckDuckGo.

Why This Matters

For users who rely on appending “Reddit” to their searches to find human-generated answers, this change means they’ll be limited to using Google or search engines that pull from Google’s index.

It presents new challenges for SEO professionals and marketers in monitoring and analyzing discussions on one of the internet’s largest platforms.

The Bigger Picture

Reddit’s move aligns with a broader trend of content creators and platforms seeking compensation for using their data in AI training.

As Csutoras points out:

“Publications, artists, and entertainers have been suing OpenAI and other AI companies, blocking AI companies, and fighting to avoid using public content for AI training.”

What’s Next?

While this development may seem surprising, Csutoras suggests it’s a logical step for Reddit.

He notes:

“It seems smart on Reddit’s part, especially since similar moves in the past have allowed them to IPO and see strong growth for their valuation over the last two years.”


FAQ

What is the recent change Reddit has made regarding content accessibility?

Reddit has updated its robots.txt file to block major search engines from indexing its latest posts and comments. This change exempts Google due to a $60 million deal, allowing Google to use Reddit’s content for AI training purposes.

Why does Google have exclusive access to Reddit’s latest content?

Google has exclusive access to Reddit’s latest content because of a $60 million deal that allows Google to use Reddit’s content for AI training. This agreement sets Google apart from other search engines like Bing and DuckDuckGo, which are unable to index new Reddit posts and comments.

What broader trend does Reddit’s recent move reflect?

Reddit’s decision to limit search engine access aligns with a larger trend where content creators and platforms seek compensation for the use of their data in AI training. Many publications, artists, and entertainers are taking similar actions to either block or demand compensation from AI companies using their content.


Featured Image: Mamun sheikh K/Shutterstock

What Can AI Do For Healthcare Marketing In 2024? via @sejournal, @CallRail

This post was sponsored by CallRail. The opinions expressed in this article are the sponsor’s own.

Artificial intelligence (AI) has huge potential for healthcare practices. It can assist with diagnosis and treatment, as well as administrative and marketing tasks. Yet, many practices are still wary of using AI, especially regarding marketing.

The reality is that AI is here to stay, and many healthcare practices are beginning to use the technology. According to one recent study, 89% of healthcare professionals surveyed said that they were at least evaluating AI products, experimenting with them, or had implemented AI.

To help you determine whether using AI is right for your healthcare practice, let’s take a look at some of the pros and cons of using AI while marketing.

The Pros And Cons Of AI For Healthcare Practices

Healthcare practices that choose to implement AI in safe and appropriate ways to help them with their marketing and patient experience efforts can reap many benefits, including more leads, conversions, and satisfied patients. In fact, 41% of healthcare organizations say their marketing team already uses AI.

Patients also expect healthcare practices to begin to implement AI in a number of ways. In one dentistry study, patients overall showed a positive attitude toward using AI. So, what’s holding your practice back from adding new tools and finding new use cases for AI? Let’s take a look at common concerns.

Con #1: Data Security And Privacy Concerns

Let’s get one of the biggest concerns with AI and healthcare out of the way first. Healthcare practices must follow all privacy and security regulations related to patients’ protected health information (PHI) to maintain HIPAA compliance.

So, concerns over whether AI can be used in a way that doesn’t interfere with HIPAA compliance are valid. In addition, there are also concerns about the open-source nature of popular GenAI models, which means sensitive practice data might be exposed to competitors or even hackers.

Pro #1: AI Can Help You Get More Value From Your Data Securely

While there are valid concerns about how AI algorithms make decisions and data privacy concerns, AI can also be used to enrich data to help you achieve your marketing goals while still keeping it protected.

With appropriate guardrails and omission procedures in place, you can apply AI to gain insights from data that matters to you without putting sensitive data at risk.

For example, our CallRail Labs team is helping marketers remove their blind spots by using AI to analyze and detect critical context clues that help you qualify which calls are your best leads so you can follow up promptly.

At the same time, we know how important it is for healthcare companies to keep PHI secure, which is why we integrate with healthcare privacy platforms like Freshpaint. It can help you bridge the gap between patient privacy and digital marketing.

In addition, our AI-powered Healthcare Plan automatically redacts sensitive patient-protected health information from call transcripts, enforces obligatory log-outs to prevent PHI from becoming public, provides full audit trail logging, and even features unique logins and credentials for every user, which helps eliminate the potential for PHI to be accidentally exposed to employees who don’t need access to that information.

Con #2: AI Is Impersonal

Having a good patient experience is important to almost all patients, and according to one survey, 52% of patients said a key part of a good patient experience is being treated with respect. Almost as many (46%) said they want to be addressed as a person. Given these concerns, handing over content creation or customer interactions to AI can feel daunting. While an AI-powered chatbot might be more efficient than a human in a call center, you also don’t want patients to feel like you’ve delegated customer service to a robot. Trust is the key to building patient relationships.

Pro #2: AI Can Improve The Patient Experience

Worries over AI making patient interactions feel impersonal are reasonable, but just like any other type of tool, it’s how you use AI that matters. There are ways to deploy AI that can actually enhance the patient experience and, by doing so, give your healthcare practice an advantage over your competitors.

The answer isn’t in offloading customer interaction to chatbots. But AI can help you analyze customer interactions to make customer service more efficient and helpful.

With CallRail’s AI-powered Premium Conversation Intelligence™, which transcribes, summarizes, and analyzes each call, you can quickly assess your patients’ needs and concerns and respond appropriately with a human touch. For instance, Premium Conversation Intelligence can identify and extract common keywords and topics from call transcripts. This data reveals recurring themes, such as frequently asked questions, common complaints, and popular services. A healthcare practice could then use these insights to tailor their marketing campaigns to address the most pressing patient concerns.

Con #3: AI Seems Too Complicated To Use

Let’s face it: new technology is risky, and for healthcare practices especially, risk is scary. With AI, some of the risk comes from its perceived complexity. Identifying the right use cases for your practice, selecting the right tools, training your staff, and changing workflows can all feel quite daunting. Figuring this out takes time and money. And, if there aren’t clear use cases and ROI attached, the long-term benefits may not be worth the short-term impact on business.

Pro #3: AI Can Save Time And Money

Using a computer or a spreadsheet for the first time probably also felt complicated – and on the front end, took some time to learn. However, you know that using these tools, compared to pen, paper, and calculators, has saved an enormous amount of time, making the upfront investment clearly worth it. Compared to many technologies, AI tools are often intuitive and only require you to learn a few simple things like writing prompts, refining prompts, reviewing reports, etc. Even if it takes some time to learn new AI tools, the time savings will be worth it once you do.

To get the greatest return on investment, focus on AI solutions that take care of time-intensive tasks to free up time for innovation. With the right use cases and tools, AI can help solve complexity without adding complexity. For example, with Premium Conversation Intelligence, our customers spend 60% less time analyzing calls each week, and they’re using that time to train staff better, increase their productivity, and improve the patient experience.

Con #4: AI Marketing Can Hurt Your Brand

Many healthcare practices are excited to use GenAI tools to accelerate creative marketing efforts, like social media image creation and article writing. But consumers are less excited. In fact, consumers are more likely to say that the use of AI makes them distrusting (40%), rather than trusting (19%), of a brand. In a market where trust is the most important factor for patients when choosing healthcare providers, there is caution and hesitancy around using GenAI for marketing.

Pro #4: AI Helps Make Your Marketing Better

While off-brand AI images shared on social media can be bad brand marketing, there are many ways AI can elevate your marketing efforts without impacting the brand perception. From uncovering insights to improving your marketing campaigns and maximizing the value of each marketing dollar spent to increasing lead conversion rates and decreasing patient churn, AI can help you tackle these problems faster and better than ever.

At CallRail, we’re using AI to tackle complex challenges like multi-conversation insights. CallRail can give marketers instant access to a 3-6 sentence summary for each call, average call sentiment, notable trends behind positive and negative interactions, and a summary of commonly asked questions. Such analysis would take hours and hours for your marketing team to do manually, but with AI, you have call insights at your fingertips to help drive messaging and keyword decisions that can improve your marketing attribution and the patient experience.

Con #5: Adapting AI Tools Might Cause Disruption

As a modern healthcare practice, your tech stack is the engine that runs your business. When onboarding any new technology, there are always concerns about how well it will integrate with existing technology and tools you use and whether it supports HIPAA compliance. There may also be concern about how AI tools can fit into your existing workflows without causing disruption.

Pro #5: AI Helps People Do Their Jobs Better

Pairing the right AI tool for roles with repetitive tasks can be a win for your staff and your practice. For example, keeping up with healthcare trends is important for marketers to improve messaging and campaigns.

An AI-powered tool that analyzes conversations and provides call highlights can help healthcare marketers identify keyword and Google Ad opportunities so they can focus on implementing the most successful marketing strategy rather than listening to hours of call recordings. In addition, CallRail’s new AI-powered Convert Assist helps healthcare marketers provide a better patient experience. With AI-generated call coaching, marketers can identify what went well and what to improve after every conversation.

What’s more, with a solution like CallRail, which offers a Healthcare Plan and will sign a business associate agreement (BAA), you are assured that we will comply with HIPAA controls within our service offerings to ensure that your call tracking doesn’t expose you to potential fines or litigation. Moreover, we also integrate with other marketing tools, like Google Ads, GA4, and more, making it easy to integrate our solution into your existing technologies and workflows.

Let CallRail Show You The Pros Of AI

If you’re still worried about using AI in your healthcare practice, start with a trusted solution like CallRail that has proven ROI for AI-powered tools and a commitment to responsible AI development. You can talk to CallRail’s experts or test the product out for yourself with a 14-day free trial.


Image Credits

Featured Image: Image by CallRail. Used with permission.

Find Keyword Cannibalization Using OpenAI’s Text Embeddings With Examples via @sejournal, @vahandev

This new series of articles focuses on working with LLMs to scale your SEO tasks. We hope to help you integrate AI into SEO so you can level up your skills.

We hope you enjoyed the previous article and understand what vectors, vector distance, and text embeddings are.

Following this, it’s time to flex your “AI knowledge muscles” by learning how to use text embeddings to find keyword cannibalization.

We will start with OpenAI’s text embedding models and compare them.

Model | Dimensionality | Pricing | Notes
text-embedding-ada-002 | 1536 | $0.10 per 1M tokens | Great for most use cases.
text-embedding-3-small | 1536 | $0.02 per 1M tokens | Faster and cheaper but less accurate.
text-embedding-3-large | 3072 | $0.13 per 1M tokens | More accurate for complex long text-related tasks, slower.

(*Tokens can be thought of roughly as words.)

But before we start, you need to install Python and Jupyter on your computer.

Jupyter is a web-based tool for professionals and researchers. It allows you to perform complex data analysis and machine learning model development in many programming languages.

Don’t worry – it’s really easy and takes little time to finish the installations. And remember, ChatGPT is your friend when it comes to programming.

In a nutshell:

  • Download and install Python.
  • Open your Windows command line or terminal on Mac.
  • Type these commands: pip install jupyterlab and pip install notebook
  • Run Jupyter with this command: jupyter lab

We will use Jupyter to experiment with text embeddings; you’ll see how fun it is to work with!

But before we start, you must sign up for OpenAI’s API and set up billing by adding funds to your balance.

OpenAI API billing settings

Once you’ve done that, set up email notifications to inform you when your spending exceeds a certain amount under Usage limits.

Then, obtain API keys under Dashboard > API keys, which you should keep private and never share publicly.
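One common way to keep a key out of your notebook is to read it from an environment variable instead of pasting it into the code. The snippet below is a minimal sketch of that idea; the helper name load_api_key is hypothetical, and OPENAI_API_KEY is assumed here as the variable name because it is the one the official Python library looks for by default.

```python
import os

def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending an empty key to the API.
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key

# Example usage (only works once the variable is set in your shell):
# api_key = load_api_key()
```

With this approach, the notebook can be shared or committed without exposing the key itself.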

OpenAI API keys

Now, you have all the necessary tools to start playing with embeddings.

  • Open your computer command terminal and type jupyter lab.
  • You should see something like the below image pop up in your browser.
  • Click on Python 3 under Notebook.
jupyter lab

In the opened window, you will write your code.

As a small task, let’s group similar URLs from a CSV. The sample CSV has two columns: URL and Title. Our script’s task will be to group URLs with similar semantic meanings based on the title so we can consolidate those pages into one and fix keyword cannibalization issues.

Here are the steps you need to do:

Install required Python libraries with the following commands in your PC’s terminal (or in Jupyter notebook)

pip install pandas openai scikit-learn numpy unidecode

The ‘openai’ library is required to interact with the OpenAI API to get embeddings, and ‘pandas’ is used for data manipulation and handling CSV file operations.

The ‘scikit-learn’ library is necessary for calculating cosine similarity, and ‘numpy’ is essential for numerical operations and handling arrays. Lastly, unidecode is used to clean text.

Then, download the sample sheet as a CSV, rename the file to pages.csv, and upload it to your Jupyter folder where your script is located.

Set your OpenAI API key to the key you obtained in the step above, and copy-paste the code below into the notebook.

Run the code by clicking the play triangle icon at the top of the notebook.


import pandas as pd
from openai import OpenAI
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np
import csv
from unidecode import unidecode

# Function to clean text
def clean_text(text: str) -> str:
    # First, replace known mis-encoded (mojibake) sequences with their correct equivalents
    replacements = {
        'â€“': '–',      # en dash
        'â€™': '’',      # right single quotation mark
        'â€œ': '“',      # left double quotation mark
        'â€\x9d': '”',   # right double quotation mark
        'â€˜': '‘',      # left single quotation mark
        'â€”': '—'       # em dash
    }
    for old, new in replacements.items():
        text = text.replace(old, new)
    # Then, use unidecode to transliterate any remaining non-ASCII characters
    text = unidecode(text)
    return text

# Load the CSV file with UTF-8 encoding from the root of the Jupyter project folder
df = pd.read_csv('pages.csv', encoding='utf-8')

# Clean the 'Title' column to remove unwanted symbols
df['Title'] = df['Title'].apply(clean_text)

# Set your OpenAI API key (client interface of the openai library, version 1.0 or later)
client = OpenAI(api_key='your-api-key-goes-here')

# Function to get embeddings
def get_embedding(text):
    response = client.embeddings.create(input=[text], model="text-embedding-ada-002")
    return response.data[0].embedding

# Generate embeddings for all titles
df['embedding'] = df['Title'].apply(get_embedding)

# Create a matrix of embeddings
embedding_matrix = np.vstack(df['embedding'].values)

# Compute cosine similarity matrix
similarity_matrix = cosine_similarity(embedding_matrix)

# Define similarity threshold
similarity_threshold = 0.9  # since threshold is 0.1 for dissimilarity

# Create a list to store groups
groups = []

# Keep track of visited indices
visited = set()

# Group similar titles based on the similarity matrix
for i in range(len(similarity_matrix)):
    if i not in visited:
        # Find all similar titles
        similar_indices = np.where(similarity_matrix[i] >= similarity_threshold)[0]
        
        # Log comparisons
        print(f"\nChecking similarity for '{df.iloc[i]['Title']}' (Index {i}):")
        print("-" * 50)
        for j in range(len(similarity_matrix)):
            if i != j:  # Ensure that a title is not compared with itself
                similarity_value = similarity_matrix[i, j]
                comparison_result = 'greater' if similarity_value >= similarity_threshold else 'less'
                print(f"Compared with '{df.iloc[j]['Title']}' (Index {j}): similarity = {similarity_value:.4f} ({comparison_result} than threshold)")

        # Add these indices to visited
        visited.update(similar_indices)
        # Add the group to the list
        group = df.iloc[similar_indices][['URL', 'Title']].to_dict('records')
        groups.append(group)
        print(f"\nFormed Group {len(groups)}:")
        for item in group:
            print(f"  - URL: {item['URL']}, Title: {item['Title']}")

# Check if groups were created
if not groups:
    print("No groups were created.")

# Define the output CSV file
output_file = 'grouped_pages.csv'

# Write the results to the CSV file with UTF-8 encoding
with open(output_file, 'w', newline='', encoding='utf-8') as csvfile:
    fieldnames = ['Group', 'URL', 'Title']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    
    writer.writeheader()
    for group_index, group in enumerate(groups, start=1):
        for page in group:
            cleaned_title = clean_text(page['Title'])  # Ensure no unwanted symbols in the output
            writer.writerow({'Group': group_index, 'URL': page['URL'], 'Title': cleaned_title})
            print(f"Writing Group {group_index}, URL: {page['URL']}, Title: {cleaned_title}")

print(f"Output written to {output_file}")

This code reads a CSV file, ‘pages.csv,’ containing titles and URLs, which you can easily export from your CMS or get by crawling a client website using Screaming Frog.

Then, it cleans mis-encoded characters from the titles, generates embedding vectors for each title using OpenAI’s API, calculates the similarity between the titles, groups similar titles together, and writes the grouped results to a new CSV file, ‘grouped_pages.csv.’
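To see the grouping step in isolation, without calling the API, here is a small sketch that applies the same greedy logic to a toy similarity matrix. The function name group_by_threshold and the matrix values are made up for illustration; they are not output from any of the models.

```python
def group_by_threshold(similarity, threshold=0.9):
    """Greedy grouping: each unvisited row starts a group containing
    every row whose similarity to it is at or above the threshold."""
    groups, visited = [], set()
    for i, row in enumerate(similarity):
        if i in visited:
            continue
        members = [j for j, value in enumerate(row) if value >= threshold]
        visited.update(members)
        groups.append(members)
    return groups

# Toy 4x4 similarity matrix (made-up values): rows 0 and 2 are near-duplicates,
# row 1 only shares some wording with row 0.
toy = [
    [1.00, 0.86, 0.92, 0.40],
    [0.86, 1.00, 0.80, 0.35],
    [0.92, 0.80, 1.00, 0.42],
    [0.40, 0.35, 0.42, 1.00],
]

print(group_by_threshold(toy, threshold=0.9))   # [[0, 2], [1], [3]]
print(group_by_threshold(toy, threshold=0.85))  # [[0, 1, 2], [3]]
```

Lowering the threshold in this toy case pulls the loosely related row 1 into the first group, which is the kind of false positive a stricter threshold is meant to avoid.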

In the keyword cannibalization task, we use a similarity threshold of 0.9, which means if cosine similarity is less than 0.9, we will consider articles as different. To visualize this in a simplified two-dimensional space, it will appear as two vectors with an angle of approximately 26 degrees between them.
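The link between a threshold and an angle is the inverse cosine. As a quick sketch (the helper name similarity_to_degrees is just for illustration), you can convert any threshold you are considering into degrees:

```python
import math

def similarity_to_degrees(cosine_similarity: float) -> float:
    """Convert a cosine similarity value into the angle between two vectors, in degrees."""
    return math.degrees(math.acos(cosine_similarity))

for threshold in (0.9, 0.85, 0.5):
    print(f"{threshold}: {similarity_to_degrees(threshold):.1f} degrees")
```

The smaller the angle, the closer the two titles are in meaning according to the embedding model.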


In your case, you may want to use a different threshold, like 0.85 (approximately 32 degrees between them), and run it on a sample of your data to evaluate the results and the overall quality of matches. If it is unsatisfactory, you can increase the threshold to make it more strict for better precision.

You can install ‘matplotlib’ via terminal.

pip install matplotlib

And use the Python code below in a separate Jupyter notebook to visualize cosine similarities in two-dimensional space on your own. Try it; it’s fun!


import matplotlib.pyplot as plt
import numpy as np

# Define the angle for cosine similarity of 0.9. Change here to your desired value. 
theta = np.arccos(0.9)

# Define the vectors
u = np.array([1, 0])
v = np.array([np.cos(theta), np.sin(theta)])

# Define the 45 degree rotation matrix
rotation_matrix = np.array([
    [np.cos(np.pi/4), -np.sin(np.pi/4)],
    [np.sin(np.pi/4), np.cos(np.pi/4)]
])

# Apply the rotation to both vectors
u_rotated = np.dot(rotation_matrix, u)
v_rotated = np.dot(rotation_matrix, v)

# Plotting the vectors
plt.figure()
plt.quiver(0, 0, u_rotated[0], u_rotated[1], angles='xy', scale_units='xy', scale=1, color='r')
plt.quiver(0, 0, v_rotated[0], v_rotated[1], angles='xy', scale_units='xy', scale=1, color='b')

# Setting the plot limits to only positive ranges
plt.xlim(0, 1.5)
plt.ylim(0, 1.5)

# Adding labels and grid
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid(True)
plt.title('Visualization of Vectors with Cosine Similarity of 0.9')

# Show the plot
plt.show()

I usually use 0.9 and higher for identifying keyword cannibalization issues, but you may need to set it to 0.5 when dealing with old article redirects, as old articles may not have nearly identical fresher counterparts, only partially related ones.

For redirects, it may also be better to concatenate the meta description with the title rather than using the title alone.

So, it depends on the task you are performing. We will review how to implement redirects in a separate article later in this series.

Now, let’s review the results with the three models mentioned above and see how they were able to identify close articles from our data sample from Search Engine Journal’s articles.

Data Sample

From the list, we already see that the 2nd and 4th articles cover the same topic on ‘meta tags.’ The articles in the 5th and 7th rows are pretty much the same – discussing the importance of H1 tags in SEO – and can be merged.

The article in the 3rd row doesn’t have any similarities with any of the articles in the list but has common words like “Tag” or “SEO.”

The article in the 6th row is again about H1, but not exactly the same as H1’s importance to SEO. Instead, it represents Google’s opinion on whether they should match.

Articles on the 8th and 9th rows are quite close but still different; they can be combined.

text-embedding-ada-002

By using ‘text-embedding-ada-002,’ we precisely found the 2nd and 4th articles with a cosine similarity of 0.92 and the 5th and 7th articles with a similarity of 0.91.

Screenshot from Jupyter log showing cosine similarities

And it generated output with grouped URLs by using the same group number for similar articles (colors are applied manually for visualization purposes).

Output sheet with grouped URLs

For the 2nd and 3rd articles, which have common words “Tag” and “SEO” but are unrelated, the cosine similarity was 0.86. This shows why a high similarity threshold of 0.9 or greater is necessary. If we set it to 0.85, it would be full of false positives and could suggest merging unrelated articles.

text-embedding-3-small

By using ‘text-embedding-3-small,’ quite surprisingly, it didn’t find any matches per our similarity threshold of 0.9 or higher.

For the 2nd and 4th articles, cosine similarity was 0.76, and for the 5th and 7th articles, it was 0.77.

To better understand this model through experimentation, I’ve added a slightly modified version of the 1st row with ’15’ vs. ’14’ to the sample.

  1. “14 Most Important Meta And HTML Tags You Need To Know For SEO”
  2. “15 Most Important Meta And HTML Tags You Need To Know For SEO”
An example which shows text-embedding-3-small results

By contrast, ‘text-embedding-ada-002’ gave 0.98 cosine similarity between those versions.

Title 1 | Title 2 | Cosine Similarity
14 Most Important Meta And HTML Tags You Need To Know For SEO | 15 Most Important Meta And HTML Tags You Need To Know For SEO | 0.92
14 Most Important Meta And HTML Tags You Need To Know For SEO | Meta Tags: What You Need To Know For SEO | 0.76

Here, we see that this model is not a good fit for comparing titles.

text-embedding-3-large

This model’s dimensionality is 3072, which is twice that of ‘text-embedding-3-small’ and ‘text-embedding-ada-002,’ which both have 1536 dimensions.

As it has more dimensions than the other models, we could expect it to capture semantic meaning with higher precision.

However, it gave the 2nd and 4th articles cosine similarity of 0.70 and the 5th and 7th articles similarity of 0.75.

I’ve tested it again with slightly modified versions of the first article with ’15’ vs. ’14’ and without ‘Most Important’ in the title.

  1. “14 Most Important Meta And HTML Tags You Need To Know For SEO”
  2. “15 Most Important Meta And HTML Tags You Need To Know For SEO”
  3. “14 Meta And HTML Tags You Need To Know For SEO”

Title 1 | Title 2 | Cosine Similarity
14 Most Important Meta And HTML Tags You Need To Know For SEO | 15 Most Important Meta And HTML Tags You Need To Know For SEO | 0.95
14 Most Important Meta And HTML Tags You Need To Know For SEO | 14 Meta And HTML Tags You Need To Know For SEO | 0.93
14 Most Important Meta And HTML Tags You Need To Know For SEO | Meta Tags: What You Need To Know For SEO | 0.70
15 Most Important Meta And HTML Tags You Need To Know For SEO | 14 Meta And HTML Tags You Need To Know For SEO | 0.86

So we can see that ‘text-embedding-3-large’ is underperforming compared to ‘text-embedding-ada-002’ when we calculate cosine similarities between titles.

I want to note that the accuracy of ‘text-embedding-3-large’ increases with the length of the text, but ‘text-embedding-ada-002’ still performs better overall.

Another approach could be to strip away stop words from the text. Removing these can sometimes help focus the embeddings on more meaningful words, potentially improving the accuracy of tasks like similarity calculations.

The best way to determine whether removing stop words improves accuracy for your specific task and dataset is to empirically test both approaches and compare the results.
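If you want to try the stop-word approach, a minimal sketch could look like the following. The stop-word list here is a tiny illustrative assumption, not a standard list; real projects often use a fuller one from a library.

```python
import re

# A tiny illustrative stop-word list (an assumption); extend or replace it for real use.
STOP_WORDS = {"a", "an", "and", "for", "of", "the", "to", "you", "what", "in", "on"}

def strip_stop_words(text: str) -> str:
    """Remove stop words and punctuation from a title, keeping the word order."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    kept = [word for word in words if word.lower() not in STOP_WORDS]
    return " ".join(kept)

print(strip_stop_words("Meta Tags: What You Need To Know For SEO"))
# Meta Tags Need Know SEO
```

You could then generate embeddings for both the original and the stripped titles and compare which version groups your sample data more sensibly.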

Conclusion

With these examples, you have learned how to work with OpenAI’s embedding models and can already perform a wide range of tasks.

For similarity thresholds, you need to experiment with your own datasets and see which thresholds make sense for your specific task by running it on smaller samples of data and performing a human review of the output.

Please note that the code we have in this article is not optimal for large datasets since you need to create text embeddings of articles every time there is a change in your dataset to evaluate against other rows.

To make it efficient, we must use vector databases and store embedding information there once generated. We will cover how to use vector databases very soon and change the code sample here to use a vector database.
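Until then, a simple way to avoid paying for the same embedding twice is to cache results locally. The sketch below is not a vector database, just a small JSON file cache; the class name EmbeddingCache and the embed_fn argument are hypothetical, and embed_fn stands in for whatever function you use to fetch an embedding (for example, get_embedding from the script above).

```python
import hashlib
import json
from pathlib import Path

class EmbeddingCache:
    """Stores embeddings in a JSON file so each distinct text is embedded only once."""

    def __init__(self, path, embed_fn):
        self.path = Path(path)
        self.embed_fn = embed_fn  # any function that maps text to a list of floats
        # Load previously saved embeddings if the cache file already exists.
        self.store = json.loads(self.path.read_text()) if self.path.exists() else {}

    def get(self, text):
        # Use a hash of the text as the key so long titles stay manageable.
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self.store:
            self.store[key] = self.embed_fn(text)
            self.path.write_text(json.dumps(self.store))
        return self.store[key]

# Example usage (assumes get_embedding is defined as in the script above):
# cache = EmbeddingCache("embeddings_cache.json", get_embedding)
# df["embedding"] = df["Title"].apply(cache.get)
```

This keeps reruns cheap on small datasets, but it rewrites the whole file on every new entry, so it is only a stopgap compared to a proper vector database.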


Featured Image: BestForBest/Shutterstock

How To Draft A Social Media Policy via @sejournal, @lorenbaker

Today’s social media is different from a decade ago.

Higher usage, increased prominence, and other signs of a platform’s success also mean brands take on higher risks when utilizing these channels; a social media policy is no longer an “extra.”

A well-crafted social media policy uses clearly outlined rules and best practices to guide employees and those accessing the brand’s profiles in using social media platforms effectively and appropriately.

Everyone involved in the brand’s public image should clearly understand what is expected of them.

A policy helps them understand how to conduct themselves in a way that aligns with a brand’s values, missions, and goals, propelling the company forward.

And you’ll be better able to avoid security breaches, legal issues, reputational damage, and PR crises.

Getting Started With A Social Media Policy

Before creating a comprehensive social media policy, you must understand that it is impossible to prepare for every possibility.

However, talking to others in the organization can help you consider needs and issues you may not otherwise consider.

For example, your customer service team will understand how your audience sees your brand, uses social media, and what they need from your brand’s channels.

Your IT team will know what is available to manage channel security and how to manage security issues.

So, you’ll need to put together a social media policy team. Not everyone should have a say over every element of the policy, but representatives should be able to provide input and ideas.

Your social media policy team should include representatives from:

  • HR department.
  • Leadership.
  • Customer service.
  • Social media team.
  • Employees from other teams.
  • Design team.
  • IT or website management team.
  • Brand advocates or spokespersons.
  • Marketing team.
  • Loyal customers.

Once you have a team in place, you can start crafting your brand’s social media policy.

7 Steps For Creating An Effective Social Media Policy

The actual writing can be completed in seven steps, followed by four steps for implementation. But be warned: The last four steps are as important as the first seven.

Leave one out – fail to implement, update, or enforce your policy, for example – and your social media policy may be unable to guide your team and protect your brand.

To help you craft your policy, we’ve included an explanation of each step.

You’ll also find a downloadable pdf here with a list of questions to ask during each step to get you started.

1. Scope And Purpose

The first step to crafting an effective social media policy is to understand why you’re creating the document in the first place.

By clarifying the document’s purpose with everyone, you’ll increase the policy’s use and make it easy for team members to understand who should use it and when.

While a social media policy generally covers any social interaction or platform, including company blog comments and social platforms, listing these locations specifically will reduce confusion and act as a documented list of which platforms are approved and owned by your brand – as well as which platforms your employees or team members may use.

Decide what situations your policy should cover and who should be guided by it. Make it clear what circumstances fall under the policy and which do not.

2. Identify Risks

Social media use is full of risks, but many specific risks (and the ones that often catch a brand unaware) are unique to you.

For example, if you’re in the finance industry, you may have SEC, FINRA, and other rules and regulations to follow. Those in healthcare will have HIPAA and other laws and guidelines.

Aside from that, you will also face the standard risks of PR crises, security risks, intellectual property violations, and others.

List as many general risks as possible to help you determine what your social media policy should include.

3. Cover The Basics

With the list of risks in mind, it’s time to start outlining the various processes and guidelines team members will follow.

Outline the content that can and can’t be shared on the company’s accounts. Decide who will access these accounts and what security features must be in place.

Decide if you’ll allow comments on all your updates, what you will and won’t allow in those comments, and how you’ll handle any comments or posts you remove.

Develop a process for granting and revoking access to your accounts.

And decide the rules and guidelines that employees and others will need to follow when sharing brand-related content (or identifying themselves as connected with the company).

4. Define Who Is Responsible

Often, errors are made or issues go unaddressed not because employees don’t know how to handle them, but because they are unsure who is responsible and which process the company wants them to follow.

So, for example, decide who is responsible for monitoring, listening, responding, and managing your social media profiles, promotions, and paid ads.

Decide and outline approval processes, reporting mechanisms or systems, posting limits, and other details.

And don’t forget to consider processes outside of the usual social media processes, such as what will happen when someone takes time off and who will be responsible for social media training.

5. Address Legal Considerations And Regulations

While you may or may not be regulated by industry regulators such as the FCC, you will undoubtedly need to follow data privacy laws, rules surrounding intellectual property, and advertising rules.

In your social media policy, you need to outline the general idea of these rules and what those utilizing social media need to know.

Note: While some of these rules may seem obvious to you, they won’t be obvious to everyone.

Don’t leave important ones out. Make more comprehensive documents (that are in plain language and easy to use in a hurry) available.

6. Voice And Style

Brands are delicate. To keep updates and content consistent, detail and explain the voice and style the company’s channels should have.

Provide users with a wealth of examples of updates that are and are not acceptable. You may also want to include links to official style guides.

Lastly, make sure the goal of your social media channels is clear. Will your brand respond to audience inquiries or offer customer service via social media?

7. Crisis Response

No matter how careful or prepared you are, the worst-case scenario is inevitable. Eventually, a crisis will arise, so you need to be ready.

What should happen if an advertising or intellectual property rule has been violated?

What if a PR disaster occurs, or a post runs afoul of some other rule, regulation, law, or guideline?

You should also have a clear process to follow if an account or user has been compromised. Include links and email addresses for each network’s support team so they can be contacted immediately.

You also need to consider PR issues outside of social media.

If a tragedy occurs, for example, how will you communicate with vendors, customers, and the general public? Who is responsible for crafting that message, and who needs to approve it?

Putting Your Social Media Policy Into Action

This process doesn’t end when you have a final social media policy draft.

Even the best social media policy is useless if it isn’t implemented, used, maintained, and enforced.

8. Social Media Policy Implementation

If you want employees and team members to follow the social media policy, it must be easily available and distributed to everyone.

Email it and announce it through internal channels. Walk through the document in a video. Make sure everyone is aware that it has been completed and made available.

Store it somewhere that’s easy for others to access. You also need to ensure that it is added to onboarding packages and provided to anyone who may communicate on behalf of or promote the brand.

(You may wish to craft an external version for customers, your target audience, and other external parties.)

9. Resources And Assets

One of the easiest and most efficient ways to encourage employees to share company news and information while avoiding issues is to make approved assets available.

Provide everyone with logos, approved images, discount codes, and other resources in a location that’s easily accessed.

To make it even easier for employees to share updates, consider having an internal communication channel that notifies everyone of news, newly published information, and fresh assets.

10. Maintaining Your Social Media Policy

Social media and your needs change quickly. And while your social media policy won’t need daily or weekly updates, it will still require regular updating.

Let it get outdated, and it could cause more harm than good.

Imagine, for example, that you have an outdated security protocol in place when one of your accounts is compromised.

Schedule social media policy updates in advance to ensure they get done. Each update is also the perfect opportunity to remind everyone of the document and help refresh their awareness of its processes and guidelines.

11. Utilizing Your Social Media Policy

The successful utilization of a social media policy begins with proper training.

While not all employees will need to understand all the processes, everyone should have a basic understanding of the guidelines within the policy and how it applies to them.

Lastly, ensure that the policy you’ve invested time and effort into creating is enforced.

Schedule regular searches and audits to ensure compliance and be sure to deal with anything that fails to meet the guidelines appropriately.

Conclusion

While a social media policy does require an upfront investment of your time, it is vital in today’s world.

This simple policy document will help you avoid and prepare for a crisis while arming your brand with the resources and knowledge it needs to deal with issues as they arise.

Do this well, and you’ll find that understanding what is and isn’t allowed will help encourage employees to promote your brand on social media.


Featured Image: Rawpixel.com/Shutterstock

Top 7 Most Emotionally Engaging Olympics Ads (P&G Campaigns Are Winning) via @sejournal, @gregjarboe

With the 2024 Olympic Games in Paris officially opening today, DAIVID used its advanced content testing platform to see which ads from the global sporting event have elicited the most intense positive emotions of all time.

Procter & Gamble (P&G) dominates DAIVID’s chart, with five of the top seven ads – including the top three positions.

So, the rest of the search and marketing community will want to figure out what the American multinational consumer goods corporation headquartered in Cincinnati, Ohio has understood for more than a dozen years.

1. P&G Thank You, Mom – Sochi 2014 Olympic Winter Games

A P&G 2014 Winter Olympics campaign honoring the crucial support mothers provide to athletes is the most emotionally engaging Olympic ad ever.

This accolade comes from DAIVID, a creative effectiveness platform, which found that the “Pick Them Back Up” campaign evoked the strongest positive emotions among viewers.

“P&G Thank You, Mom | Pick Them Back Up | Sochi 2014 Olympic Winter Games” led the chart with 59.6% of viewers responding with intense positive emotions. As the video’s description says, “For teaching us that falling only makes us stronger. For giving us the encouragement to try again. Thank you, Mom.”

2. P&G – Thank You, Mom – The Winter Olympics (2018)

Following the (emotional) success from 2014, Thank You, Mom – The Winter Olympics (2018) was close behind in second place with a score of 59.5%.

This video guides the viewer through moms supporting their kids with their dreams and through their circumstances – whether it be bias over color, religion, disability, or sexual orientation.

3. P&G ‘Thank You Mom’ Commercial: “Best Job” (London 2012 Olympics)

P&G’s ad from the London 2012 Olympics took third place, with 58.4% of viewers responding with intense positive emotions.

In this 2012 edition of Procter & Gamble’s ad campaign, supportive mothers take their children to practices and help the kids deal with setbacks on their way to becoming successful Olympic athletes.

4. National Lottery Funded Athletes – TV Extended Version

The UK’s National Lottery ad, “National Lottery funded athletes – TV advert Extended Version,” took fourth place with a score of 56.9%.

It was inspired by the story of 800-meter runner Jenny Meadows’ mother and showcased how National Lottery funding supports British athletes in achieving their dreams.

5. P&G ‘Thank You, Mom’ Campaign Ad: Strong (Rio 2016 Olympics)

Another from P&G’s Thank You, Mom series for the Rio 2016 Olympics was placed fifth, with 55.9% of viewers responding with intense positive emotions.

In this two-minute commercial, P&G features supportive mothers helping their children persevere through difficult circumstances on their way to becoming Olympic champions.

The brand positions itself as the “Proud sponsor of Moms” and uses the tagline: “It takes someone strong to make someone strong. Thank you, Mom.”

6. We’re The Superhumans – Rio Paralympics 2016

Channel 4, a British free-to-air public broadcast television channel, took sixth place with its “Superhumans” trailer for the Rio Paralympics 2016. The 3-minute video ad got a score of 55.7%.

7. Procter & Gamble – Your Goodness Is Your Greatness

Your Goodness is Your Greatness from P&G took seventh place, with 55.5% of viewers responding with intense positive emotions.

Now, P&G was founded in 1837 by William Procter and James Gamble. Do you think this gave them a head start on the rest of the field?

DAIVID CEO’s Insights

In a press release, Ian Forrester, CEO and founder of DAIVID, said:

“When it comes to emotional Olympic campaigns, no brand has ever gone faster, higher or stronger than P&G.

The company’s incredible tributes to the role mums play in helping to put future Olympic champions on the path to Games glory really tug at the emotional heartstrings and are capable of turning even the most cynical viewers into emotional wrecks.

‘Pick Them Back Up’ is a worthy gold winner, generating some of the most intense feelings of positivity we’ve ever seen for an ad.”

He added, “It’s also great to see Channel 4’s sensational campaign, ‘We’re The Superhumans’ in the top 6. Generating incredibly intense feelings of inspiration, the ad has played a crucial role in putting the Paralympics firmly in the hearts and minds of viewers all around the world.”

What can I add?

I’ve known Forrester since September 2012, when he joined the Unruly Group as global insight lead. And I talked with him several times over the next six years about Unruly’s Viral Spiral charts, which showed which video ads were among the most shared.

So, I’ve learned that Forrester has the kind of Experience, Expertise, Authoritativeness, and Trustworthiness (E-E-A-T) that not only Google talks about, but skeptical journalists and bloggers value, too.

That’s why I’ve quoted him – along with other video gurus – in articles like “What’s The Alternative To Spending $7 Million On A Super Bowl Ad?” as well as “How To Make A Video Go Viral.”

And that’s why I’ve cited DAIVID’s critical data and strategic insights in articles like “The Best 5 Super Bowl Ads in 2024 (Brands That Got It Right)” as well as “39 Emotions Digital Marketers Can Use In Advertising.”

But if you want to figure out what P&G already understands, then it’s worth spending a few moments learning more about DAIVID’s methodology.

Check Out DAIVID’s Methodology

Based in London, DAIVID leverages technologies like facial coding, eye tracking, and computer vision to help advertisers enhance the emotional and business impact of their campaigns.

Their platform allows marketers to assess and improve ad effectiveness on a large scale using advanced data analysis methods.

DAIVID’s study of the most emotionally engaging Olympics ads utilized its Self-Serve solution, trained on millions of consumer data points, to predict the emotional reactions and attention levels ads would generate, along with their potential brand and business impacts.

The analysis involved 56 Olympic ads, excluding those from the current Paris Olympics.

Watch For Yourself To See Why These Videos Trigger Emotion

So, watch the seven ads above and see for yourself what kind of video content triggers intense positive emotions in viewers. You may see something that I might have missed.

But the next time you want to know if your ad creative is working, test it. I know, talking about testing social videos the way that Madison Avenue once tested TV commercials seems like pie in the sky.

But with AI as your co-pilot, making creative testing affordable, you can fix problems and identify solutions faster and more easily than you could back in the old days.

Okay, this may not bring tears to your eyes – like “Pick Them Back Up” probably will – but it can help you catch up with P&G, which already has a 12-year head start.


Featured Image: Gorodenkoff/Shutterstock

Beer, hydrogen, and heat: Why the US is still trying to make mirror-magnified solar energy work

The US is continuing its decades-long effort to commercialize a technology that converts sunlight into heat, funding a series of new projects using that energy to brew beer, produce low-carbon fuels, or keep grids running.

On July 25, the Department of Energy will announce it is putting $33 million into nine pilot or demonstration projects based on concentrating solar thermal power, MIT Technology Review can report exclusively. The technology uses large arrays of mirrors to concentrate sunlight onto a receiver, where it’s used to heat up molten salt, ceramic particles, or other materials that can store that energy for extended periods. 

“Under the Biden-Harris administration, DOE continues to invest in the next-generation solar technologies we need to tackle the climate crisis and ensure American scientific innovation remains the envy of the world,” Energy Secretary Jennifer Granholm said in a statement.

The DOE has been funding efforts to get concentrated solar energy off the ground since at least the 1970s. The idea was initially driven in part by the quest to develop more renewable, domestic sources of energy during the oil crisis of that era. 

But early commercial efforts to produce clean electricity based on this technology have been bedeviled by high costs, low output, and other challenges. 

Researchers continued to try to drive the field forward, in part by moving to higher-temperature systems that are more efficient and switching to new types of materials that can withstand them. The focus of the concentrating solar field has also shifted away from using the technology to produce electricity—a job that its solar photovoltaic cousin now does incredibly effectively, cheaply, and on a massive scale—and toward using it to provide the heat needed for various industrial processes or as a form of very long-duration energy storage for grids. 

Indeed, a core promise of the technology is that heat can be stored more efficiently than electricity, potentially offering an alternative to very expensive large-scale battery plants. This could be especially useful for dealing with prolonged dips in renewable generation as solar, wind, and other fluctuating sources come to produce a larger and larger share of electricity.

Among the awardees:

  • More than $7 million of the DOE funds will support a project at Firestone Walker Brewery in Paso Robles, California, which will tap into solar thermal energy to produce the steam needed for its lineup of IPAs and other beers.
  • Another $6 million will go to Premier Resource Management’s planned concentrating solar power plant in Bakersfield, California, which would store thermal energy in retired fracking sites.
  • Researchers at West Virginia University, who are working with NASA, secured $5 million to explore the use of solar thermal to produce a clean form of hydrogen, a fuel as well as a feedstock in the production of fertilizer, steel, and other industrial goods.

The DOE funds pilot and demonstration projects in the hopes of kick-starting commercialization of emerging energy technologies, helping research groups or companies to refine them, scale them up, and drive down costs.

In the case of concentrating solar thermal, costs still need to fall by about half to “really unlock broader applications,” says Becca Jones-Albertus, director of DOE’s Solar Energy Technologies Office.

But she says the department continues to invest in the development of the technology because it remains one of the most promising ways to address three big areas where the world still needs better solutions to cut climate-warming emissions: long-duration grid storage, industrial heat, and steady forms of carbon-free electricity.

Google DeepMind’s AI systems can now solve complex math problems

AI models can easily generate essays and other types of text. However, they’re nowhere near as good at solving math problems, which tend to involve logical reasoning—something that’s beyond the capabilities of most current AI systems.

But that may finally be changing. Google DeepMind says it has trained two specialized AI systems to solve complex math problems involving advanced reasoning. The systems—called AlphaProof and AlphaGeometry 2—worked together to successfully solve four out of six problems from this year’s International Mathematical Olympiad (IMO), a prestigious competition for high school students. They won the equivalent of a silver medal at the event.

It’s the first time any AI system has ever achieved such a high success rate on these kinds of problems. “This is great progress in the field of machine learning and AI,” says Pushmeet Kohli, vice president of research at Google DeepMind, who worked on the project. “No such system has been developed until now which could solve problems at this success rate with this level of generality.” 

There are a few reasons math problems that involve advanced reasoning are difficult for AI systems to solve. These types of problems often require forming and drawing on abstractions. They also involve complex hierarchical planning, as well as setting subgoals, backtracking, and trying new paths. All these are challenging for AI. 

“It is often easier to train a model for mathematics if you have a way to check its answers (e.g., in a formal language), but there is comparatively less formal mathematics data online compared to free-form natural language (informal language),” says Katie Collins, a researcher at the University of Cambridge who specializes in math and AI but was not involved in the project. 

Bridging this gap was Google DeepMind’s goal in creating AlphaProof, a reinforcement-learning-based system that trains itself to prove mathematical statements in the formal programming language Lean. The key is a version of DeepMind’s Gemini AI that’s fine-tuned to automatically translate math problems phrased in natural, informal language into formal statements, which are easier for the AI to process. This created a large library of formal math problems with varying degrees of difficulty.
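For readers unfamiliar with Lean, a formal statement and its machine-checkable proof look roughly like the following. This is a toy illustration in Lean 4, not an IMO problem or anything produced by AlphaProof, and the theorem name is hypothetical.

```lean
-- A toy formal statement: addition of natural numbers is commutative.
-- The proof term reuses a lemma from Lean's core library.
theorem toy_add_comm (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```

Because Lean checks every step, a proof that compiles is known to be correct, which is what makes formal statements useful as a training signal.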

Automating the process of translating data into formal language is a big step forward for the math community, says Wenda Li, a lecturer in hybrid AI at the University of Edinburgh, who peer-reviewed the research but was not involved in the project. 

“We can have much greater confidence in the correctness of published results if they are able to formulate this proving system, and it can also become more collaborative,” he adds.

The Gemini model works alongside AlphaZero—the reinforcement-learning model that Google DeepMind trained to master games such as Go and chess—to prove or disprove millions of mathematical problems. The more problems it has successfully solved, the better AlphaProof has become at tackling problems of increasing complexity.

Although AlphaProof was trained to tackle problems across a wide range of mathematical topics, AlphaGeometry 2—an improved version of a system that Google DeepMind announced in January—was optimized to tackle problems relating to movements of objects and equations involving angles, ratios, and distances. Because it was trained on significantly more synthetic data than its predecessor, it was able to take on much more challenging geometry questions.

To test the systems’ capabilities, Google DeepMind researchers tasked them with solving the six problems given to humans competing in this year’s IMO and proving that the answers were correct. AlphaProof solved two algebra problems and one number theory problem, one of which was the competition’s hardest. AlphaGeometry 2 successfully solved a geometry question, but two questions on combinatorics (an area of math focused on counting and arranging objects) were left unsolved.   

“Generally, AlphaProof performs much better on algebra and number theory than combinatorics,” says Alex Davies, a research engineer on the AlphaProof team. “We are still working to understand why this is, which will hopefully lead us to improve the system.”

Two renowned mathematicians, Tim Gowers and Joseph Myers, checked the systems’ submissions. They awarded each of their four correct answers full marks (seven out of seven), giving the systems a total of 28 points out of a maximum of 42. A human participant earning this score would be awarded a silver medal and just miss out on gold, the threshold for which starts at 29 points. 
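The scoring arithmetic above can be checked with a short sketch. The variable names are illustrative, and the figures are taken from the paragraph above.

```python
POINTS_PER_PROBLEM = 7   # full marks for one problem
TOTAL_PROBLEMS = 6       # problems set at the IMO
GOLD_THRESHOLD = 29      # this year's gold cutoff, as reported above

solved = 4               # problems the two systems solved with full marks
score = solved * POINTS_PER_PROBLEM
max_score = TOTAL_PROBLEMS * POINTS_PER_PROBLEM
points_short_of_gold = GOLD_THRESHOLD - score

print(score, max_score, points_short_of_gold)  # 28 42 1
```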

This is the first time any AI system has been able to achieve a medal-level performance on IMO questions. “As a mathematician, I find it very impressive, and a significant jump from what was previously possible,” Gowers said during a press conference. 

Myers agreed that the systems’ math answers represent a substantial advance over what AI could previously achieve. “It will be interesting to see how things scale and whether they can be made faster, and whether it can extend to other sorts of mathematics,” he said.

Creating AI systems that can solve more challenging mathematics problems could pave the way for exciting human-AI collaborations, helping mathematicians to both solve and invent new kinds of problems, says Collins. This in turn could help us learn more about how we humans tackle math.

“There is still much we don’t know about how humans solve complex mathematics problems,” she says.