The race to clean up heavy-duty trucks

This article is from The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

Truckers have to transport massive loads long distances, every single day, under intense time pressure—and they rely on the semi-trucks they drive to get the job done. Their diesel engines spew not only greenhouse gas emissions that cause climate change, but also nitrogen oxides, which can be extremely harmful to human health.

Cleaning up trucking, especially the biggest trucks, presents a massive challenge. That’s why some companies are trying to ease the industry into change. For my most recent story, I took a look at Range Energy, a startup that’s adding batteries to the trailers of semi-trucks. If the electrified trailers are attached to diesel trucks, they can improve the fuel economy. If they’re added to zero-emissions vehicles powered by batteries or hydrogen, they could boost range and efficiency. 

During my reporting, I learned more about what’s holding back progress in trucking and how experts are thinking about a few different technologies that could help.

The entire transportation sector is slowly shifting toward electrification: EVs are hitting the road in increasing numbers, making up 18% of sales of new passenger vehicles in 2023.

Trucks may very well follow suit—nearly 350 models of zero-emissions medium- and heavy-duty trucks are already available worldwide, according to data from CALSTART. “I do see a lot of strength and demand in the battery electric space in particular,” says Stephanie Ly, senior manager for e-mobility strategy and manufacturing engagement at the World Resources Institute.

But battery-powered trucks will pose a few major challenges as they take to the roads. First, and perhaps most crucially, is their cost. Battery-powered trucks, especially big models like semi-trucks, will be significantly more expensive than diesel versions today.

There may be good news on this front: When you consider the cost of refueling and maintenance, it’s looking like electric trucks could soon compete with diesel. By 2030, the total cost of ownership of a battery electric long-haul truck will likely be lower than that of a diesel one in the US, according to a 2023 report from the International Council on Clean Transportation. The report looked at a number of states including California, Georgia, and New York, and found that the relatively high upfront costs for electric trucks are balanced out by lower operating expenses.
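
The report’s bottom line rests on total-cost-of-ownership arithmetic: purchase price plus lifetime fuel and maintenance costs. Here is a minimal sketch of that calculation, using made-up placeholder figures rather than the ICCT’s actual numbers:

```python
# Rough total-cost-of-ownership (TCO) sketch for a long-haul truck.
# All figures below are illustrative placeholders, not the ICCT report's numbers.

def tco(upfront_cost, energy_cost_per_mile, maintenance_per_mile,
        miles_per_year, years):
    """Upfront price plus lifetime energy and maintenance costs, in dollars."""
    operating = (energy_cost_per_mile + maintenance_per_mile) * miles_per_year * years
    return upfront_cost + operating

diesel = tco(upfront_cost=150_000, energy_cost_per_mile=0.70,
             maintenance_per_mile=0.20, miles_per_year=100_000, years=10)
electric = tco(upfront_cost=300_000, energy_cost_per_mile=0.40,
               maintenance_per_mile=0.12, miles_per_year=100_000, years=10)

print(f"diesel:   ${diesel:,.0f}")    # $1,050,000
print(f"electric: ${electric:,.0f}")  # $820,000
```

With a big enough gap in per-mile costs, even a much higher sticker price can be recovered over a decade of heavy mileage.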

Another significant challenge for battery-powered trucking is weight: The larger the vehicle, the bigger the battery. That could be a problem given current regulations, which typically limit the weight of a rig both for safety reasons and to prevent wear and tear on roads (in the US, it’s 80,000 pounds). Operators tend to want to maximize the amount of goods they can carry in each load, so the added weight of a battery might not be welcome.
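
The payload trade-off is simple subtraction against the 80,000-pound gross limit. A sketch with assumed (not industry-sourced) tare and battery weights:

```python
# How battery weight eats into legal payload, in pounds.
# The tare and battery weights here are illustrative assumptions.

GROSS_LIMIT_LB = 80_000  # US federal gross vehicle weight limit

def max_payload(tractor_trailer_tare_lb, battery_lb=0):
    """Cargo weight allowed once the vehicle and battery are accounted for."""
    return GROSS_LIMIT_LB - tractor_trailer_tare_lb - battery_lb

diesel_payload = max_payload(tractor_trailer_tare_lb=33_000)
electric_payload = max_payload(tractor_trailer_tare_lb=33_000, battery_lb=8_000)

print(diesel_payload)                      # 47000
print(electric_payload)                    # 39000
print(diesel_payload - electric_payload)   # 8000 pounds of lost cargo capacity
```

Every pound of battery is a pound of goods an operator can’t carry, which is why the weight question looms so large.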

Finally, there’s the question of how far trucks can go, and how often they’ll need to stop. Time is money for truck drivers and fleet operators. Batteries will need to pack more energy into a smaller space so that trucks can have a long enough range to run their routes. Charging is another huge piece here—if drivers do need to stop to charge their trucks, they’ll need much more powerful chargers to enable them to top off quickly. That could present challenges for the grid, and operators might need to upgrade infrastructure in certain places to allow the huge amounts of power that would be needed for fast charging of massive batteries. 
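
The charger-power requirement falls straight out of power = energy / time. A sketch, assuming a hypothetical 900 kWh pack topped off during a 45-minute break:

```python
# Average charger power needed to refill a truck battery in a given window.
# The pack size and charging window below are illustrative assumptions.

def required_charger_kw(pack_kwh, charge_fraction, hours):
    """Average power (kW) to restore `charge_fraction` of the pack in `hours`."""
    return pack_kwh * charge_fraction / hours

# e.g. refill 80% of a 900 kWh pack over a 45-minute (0.75-hour) break
power_kw = required_charger_kw(pack_kwh=900, charge_fraction=0.8, hours=0.75)
print(f"{power_kw:.0f} kW")  # 960 kW -- roughly a megawatt per truck
```

A single fast-charging truck stop serving several rigs at once could draw as much power as a small town, which is the grid challenge in a nutshell.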

All these challenges for battery electric trucks add up. “What companies are really looking for is something they can swap out,” says Thomas Walker, transportation technology manager at the Clean Air Task Force. And right now, he says, we’re just not quite in a spot where batteries are a clean and obvious switch.

That’s why some experts say we should keep our options open when it comes to technologies for future heavy-duty trucks, and that includes hydrogen. 

Batteries are currently beating out hydrogen in the race to clean up transportation, as I covered in a story earlier this year. For most vehicles and most people, batteries simply make more sense than hydrogen, for reasons that include everything from available infrastructure to fueling cost. 

But heavy-duty trucks are a different beast: Heavier vehicles, bigger batteries, higher power charging, and longer distances might tip the balance in favor of hydrogen. (There are some big “ifs” here, including whether hydrogen prices will get low enough to make hydrogen-powered vehicles economical.) 

For a sector as tough to decarbonize as heavy-duty trucking, we need all the help we can get. As Walker puts it, “It’s key that you start off with a lot of options and then narrow it down, rather than trying to pick which one’s going to win, because we really don’t know.”


Now read the rest of The Spark

Related reading

To learn more about Range Energy and how its electrified trailers could help transform trucking in the near future, check out my latest story here

Hydrogen is losing the race to power cleaner cars, but heavy-duty trucks might represent a glimmer of hope for the technology. Dig into why in my story from earlier this year

Getting the grid ready for fleets of electric trucks is going to be a big challenge. But for some short-distance vehicles in certain areas, we may actually be good to go already, as I reported in 2021

Urban Sky microballoon pictured shortly after deployment near Breckenridge, Colorado. (Courtesy Urban Sky)

Two more things

Spotting wildfires early and keeping track of them can be tough. Now one company wants to monitor blazes using high-altitude balloons. Next month in Colorado, Urban Sky is deploying balloons that are about as big as vans; with no human pilot aboard, they’ll keep watch at much finer resolution than satellites can achieve. Read more about fire-tracking balloons in this story from Sarah Scoles

A new forecasting model attempts to marry conventional techniques with AI to better predict the weather. The model from Google uses physics to work out larger atmospheric forces, then tags in AI for the smaller stuff. Check out the details in the latest from my colleague James O’Donnell

Keeping up with climate  

Small rocky nodules in the deep sea might be a previously undiscovered source of oxygen. They contain metals such as lithium and are a potential target for deep-sea mining efforts. (Nature)

→ Polymetallic nodules are roughly the size and shape of potatoes, and they may be the future of mining for renewable energy. (MIT Technology Review)

A 350-foot-long blade from a wind turbine off the coast of Massachusetts broke off last week, and hunks of fiberglass have been washing up on local beaches. The incident is a setback for a struggling offshore wind industry, and we’re still not entirely sure what happened. (Heatmap News)

A new report shows that low-emissions steel- and iron-making processes are on the rise. But coal-powered operations are still growing too, threatening progress in the industry. (Canary Media)

Sunday, July 21, was likely the world’s hottest day in recorded history (so far). It edged out a record set just last year. (The Guardian)

Plastic forks, cups, and single-use packages are sometimes stamped with nice-sounding labels like “compostable,” “biodegradable,” or just “Earth-friendly.” But that doesn’t mean you can stick the items in your backyard compost pile—these marketing terms are basically the Wild West. (Washington Post)

While EVs are indisputably better than gas-powered cars in terms of climate emissions, they are heavier, meaning they wear through tires faster. The resulting particulate pollution presents a new challenge, one a startup company is trying to address with new tires designed for electric vehicles. (Canary Media)

Public fast chargers are popping up nearly everywhere in the US—at this pace, they’ll outnumber gas stations by 2030. And deployment is only expected to speed up. (Bloomberg)

PsiQuantum plans to build the biggest quantum computing facility in the US

The quantum computing firm PsiQuantum is partnering with the state of Illinois to build the largest US-based quantum computing facility, the company announced today. 

The firm, which has headquarters in California, says it aims to house a quantum computer containing up to 1 million quantum bits, or qubits, within the next 10 years. At the moment, the largest quantum computers have around 1,000 qubits. 

Quantum computers promise to do a wide range of tasks, from drug discovery to cryptography, at record-breaking speeds. Companies are using different approaches to build the systems and working hard to scale them up. Both Google and IBM, for example, make the qubits out of superconducting material. IonQ makes qubits by trapping ions using electromagnetic fields. PsiQuantum is building qubits from photons.  

A major benefit of photonic quantum computing is the ability to operate at higher temperatures than superconducting systems. “Photons don’t feel heat and they don’t feel electromagnetic interference,” says Pete Shadbolt, PsiQuantum’s cofounder and chief scientific officer. This imperturbability makes the technology easier and cheaper to test in the lab, Shadbolt says. 

It also reduces the cooling requirements, which should make the technology more energy efficient and easier to scale up. PsiQuantum’s computer can’t be operated at room temperature, because it needs superconducting detectors to locate photons and perform error correction. But those sensors only need to be cooled to a few kelvins, or a little under -450 °F. While that’s an icy temperature, it is still far easier to achieve than the tens of millikelvins that superconducting qubit systems demand.
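
For reference, the temperatures convert as follows (standard kelvin-to-Fahrenheit arithmetic; the millikelvin figure for superconducting qubits is a typical ballpark, not a number from PsiQuantum):

```python
# Kelvin to Fahrenheit: F = (K - 273.15) * 9/5 + 32

def kelvin_to_fahrenheit(k):
    return (k - 273.15) * 9 / 5 + 32

print(round(kelvin_to_fahrenheit(4)))     # -452  (photon-detector territory)
print(round(kelvin_to_fahrenheit(0.02)))  # -460  (millikelvin range, a common
                                          #        ballpark for superconducting qubits)
```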

The company has opted not to build small-scale quantum computers (such as IBM’s Condor, which uses a little over 1,100 qubits). Instead it is aiming to manufacture and test what it calls “intermediate systems.” These include chips, cabinets, and superconducting photon detectors. PsiQuantum says it is targeting these larger-scale systems in part because smaller devices are unable to adequately correct errors and operate at a realistic price point.  

Getting smaller-scale systems to do useful work has been an area of active research. But “just in the last few years, we’ve seen people waking up to the fact that small systems are not going to be useful,” says Shadbolt. In order to adequately correct the inevitable errors, he says, “you have to build a big system with about a million qubits.” The approach conserves resources, he says, because the company doesn’t spend time piecing together smaller systems. But skipping over them makes PsiQuantum’s technology difficult to compare to what’s already on the market. 
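
The million-qubit figure follows from error-correction overhead: a commonly cited ballpark for surface-code-style schemes is on the order of 1,000 physical qubits per error-corrected logical qubit. The ratio below is that generic assumption, not PsiQuantum’s published figure:

```python
# Rough error-correction overhead arithmetic.
# The 1,000:1 physical-to-logical ratio is a commonly cited ballpark for
# surface-code error correction, not PsiQuantum's own number.

def logical_qubits(physical_qubits, physical_per_logical=1_000):
    """Error-corrected logical qubits obtainable from a physical-qubit budget."""
    return physical_qubits // physical_per_logical

print(logical_qubits(1_000_000))  # 1000 logical qubits from a million physical ones
print(logical_qubits(1_100))      # 1 -- why today's ~1,000-qubit machines fall short
```

At that overhead, a thousand-qubit machine yields roughly one fault-tolerant qubit, which is the argument for skipping straight to a million.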

The company won’t share details about the exact timeline of the Illinois project, which will include a collaboration with the University of Chicago and several other Illinois universities. It does say it is hoping to break ground on a similar facility in Brisbane, Australia, next year and hopes that facility, which will house its own large-scale quantum computer, will be fully operational by 2027. “We expect Chicago to follow thereafter in terms of the site being operational,” the company said in a statement.

“It’s all or nothing [with PsiQuantum], which doesn’t mean it’s invalid,” says Christopher Monroe, a computer scientist at Duke University and ex-IonQ employee. “It’s just hard to measure progress along the way, so it’s a very risky kind of investment.”

Significant hurdles lie ahead. Building the infrastructure for this facility, particularly for the cooling system, will be the slowest and most expensive aspect of the construction. And when the facility is finally constructed, there will need to be improvements in the quantum algorithms run on the computers. Shadbolt says the current algorithms are far too expensive and resource intensive. 

The sheer complexity of the construction project might seem daunting. “This could be the most complex quantum optical electronic system humans have ever built, and that’s hard,” says Shadbolt. “We take comfort in the fact that it resembles a supercomputer or a data center, and we’re building it using the same fabs, the same contract manufacturers, and the same engineers.”

Correction: we have updated the story to reflect that the partnership is only with the state of Illinois and its universities, and not a national lab

Update: we added comments from Christopher Monroe

How our genome is like a generative AI model

This article first appeared in The Checkup, MIT Technology Review’s weekly biotech newsletter. To receive it in your inbox every Thursday, and read articles like this first, sign up here.

What does the genome do? You might have heard that it is a blueprint for an organism. Or that it’s a bit like a recipe. But building an organism is much more complex than constructing a house or baking a cake.

This week I came across an idea for a new way to think about the genome—one that borrows from the field of artificial intelligence. Two researchers are arguing that we should think about it as being more like a generative model, a form of AI that can generate new things.

You might be familiar with such AI tools—they’re the ones that can create text, images, or even films from various prompts. Do our genomes really work in the same way? It’s a fascinating idea. Let’s explore.

When I was at school, I was taught that the genome is essentially a code for an organism. It contains the instructions needed to make the various proteins we need to build our cells and tissues and keep them working. It made sense to me to think of the human genome as being something like a program for a human being.

But this metaphor falls apart once you start to poke at it, says Kevin Mitchell, a neurogeneticist at Trinity College in Dublin, Ireland, who has spent a lot of time thinking about how the genome works.

A computer program is essentially a sequence of steps, each controlling a specific part of development. In human terms, this would be like having a set of instructions to start by building a brain, then a head, and then a neck, and so on. That’s just not how things work.

Another popular metaphor likens the genome to a blueprint for the body. But a blueprint is essentially a plan for what a structure should look like when it is fully built, with each part of the diagram representing a bit of the final product. Our genomes don’t work this way either.

It’s not as if you’ve got a gene for an elbow and a gene for an eyebrow. Multiple genes are involved in the development of multiple body parts. The functions of genes can overlap, and the same genes can work differently depending on when and where they are active. It’s far more complicated than a blueprint.

Then there’s the recipe metaphor. In some ways, this is more accurate than the analogy of a blueprint or program. It might be helpful to think about our genes as a set of ingredients and instructions, and to bear in mind that the final product is also at the mercy of variations in the temperature of the oven or the type of baking dish used, for example. Identical twins are born with the same DNA, after all, but they are often quite different by the time they’re adults.

But the recipe metaphor is too vague, says Mitchell. Instead, he and his colleague Nick Cheney at the University of Vermont are borrowing concepts from AI to capture what the genome does. Mitchell points to generative AI models like Midjourney and DALL-E, both of which can generate images from text prompts. These models work by capturing elements of existing images to create new ones.

Say you write a prompt for an image of a horse. The models have been trained on a huge number of images of horses, and these images are essentially compressed to allow the models to capture certain elements of what you might call “horsiness.” The AI can then construct a new image that contains these elements.

We can think about genetic data in a similar way. According to this model, we might consider evolution to be the training data. The genome is the compressed data—the set of information that can be used to create the new organism. It contains the elements we need, but there’s plenty of scope for variation. (There are lots more details about the various aspects of the model in the paper, which has not yet been peer-reviewed.)

Mitchell thinks it’s important to get our metaphors in order when we think about the genome. New technologies are allowing scientists to probe ever deeper into our genes and the roles they play. They can now study how all the genes are expressed in a single cell, for example, and how this varies across every cell in an embryo.

“We need to have a conceptual framework that will allow us to make sense of that,” says Mitchell. He hopes that the concept will aid the development of mathematical models that might help us better understand the intricate relationships between genes and the organisms they end up being part of—in other words, exactly how components of our genome contribute to our development.


Now read the rest of The Checkup

Read more from MIT Technology Review’s archive:

Last year, researchers built a new human genome reference designed to capture the diversity among us. They called it the “pangenome,” as Antonio Regalado reported.

Generative AI has taken the world by storm. Will Douglas Heaven explored six big questions that will determine the future of the technology.

A Disney director tried to use AI to generate a soundtrack in the style of Hans Zimmer. It wasn’t as good as the real thing, as Melissa Heikkilä found.

Melissa has also reported on how much energy it takes to create an image using generative AI. Turns out it’s about the same as charging your phone. 

What is AI? No one can agree, as Will found in his recent deep dive on the topic.

From around the web

Evidence from more than 1,400 rape cases in Maryland, some from as far back as 1977, is set to be processed by the end of the year, thanks to a new law. The state still has more than 6,000 untested rape kits. (ProPublica)

How well is your brain aging? A new tool has been designed to capture a person’s brain age based on an MRI scan, one that accounts for the possible effects of traumatic brain injuries. (NeuroImage)

Iran has reported the country’s first locally acquired cases of dengue, a viral infection spread by mosquitoes. There are concerns it could spread. (WHO)

IVF is expensive, and add-ons like endometrial scratching (which literally involves scratching the lining of the uterus) are not supported by strong evidence. Is the fertility industry profiting from vulnerability? (The Lancet)

Up to 2 million Americans are getting their supply of weight loss drugs like Wegovy or Zepbound from compounding pharmacies. They’re a fraction of the price of brand-name Big Pharma drugs, but there are some safety concerns. (KFF Health News)

The Download: AI’s math solutions, and brewing beer with sunlight

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

Google DeepMind’s new AI systems can now solve complex math problems

AI models can easily generate essays and other types of text. However, they’re nowhere near as good at solving math problems, which tend to involve logical reasoning—something that’s beyond the capabilities of most current AI systems.

But that may finally be changing. Google DeepMind says it has trained two specialized AI systems to solve complex math problems involving advanced reasoning. The systems worked together to successfully solve four out of six problems from this year’s International Mathematical Olympiad, a prestigious competition for high school students.

They won the equivalent of a silver medal, marking the first time any AI system has ever achieved such a high success rate on these kinds of problems. Read the full story.

—Rhiannon Williams

Why the US is still trying to make mirror-magnified solar energy work

The US is continuing its decades-long effort to commercialize a technology that converts sunlight into heat, funding a series of new projects using that energy to brew beer, produce low-carbon fuels, or keep grids running.

The Department of Energy has announced it is putting $33 million into nine pilot projects based on concentrating solar thermal power, MIT Technology Review can report exclusively. The technology uses large arrays of mirrors to concentrate sunlight onto a receiver, where it’s used to heat up molten salt, ceramic particles, or other materials that can store that energy for extended periods. 

But early commercial efforts to produce clean electricity based on this technology have been bedeviled by high costs, low output, and other challenges. Read the full story.

—James Temple

“Copyright traps” could tell writers if an AI has scraped their work

Since the beginning of the generative AI boom, content creators have argued that their work has been scraped into AI models without their consent. But until now, it has been difficult to know whether specific text has actually been used in a training data set. 

Now they have a new way to prove it: “copyright traps” developed by a team at Imperial College London, pieces of hidden text that allow writers and publishers to subtly mark their work in order to later detect whether it has been used in AI models or not. Read the full story.

—Melissa Heikkilä

How our genome is like a generative AI model

What does the genome do? You might have heard that it is a blueprint for an organism. Or that it’s a bit like a recipe. But building an organism is much more complex than constructing a house or baking a cake.

This week I came across an idea for a new way to think about the genome—one that borrows from the field of artificial intelligence. Two researchers are arguing that we should think about it as being more like a generative model, a form of AI that can generate new things.

You might be familiar with such AI tools—they’re the ones that can create text, images, or even films from various prompts. But do our genomes really work in the same way? Read the full story.

—Jessica Hamzelou

This story is from The Checkup, our weekly health and biotech newsletter. Sign up to receive it in your inbox every Thursday.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 OpenAI’s search engine is here
And it’s already getting stuff wrong. (The Atlantic $)
+ SearchGPT will eventually be folded into ChatGPT. (WP $)
+ Its launch is a clear threat to Google’s long-held search engine dominance. (Wired $)
+ Why you shouldn’t trust AI search engines. (MIT Technology Review)

2 The chip industry’s workers are demanding better treatment
As the sector’s profits soar, its employees aren’t seeing the benefits. (WSJ $)

3 What studying the human brain can teach us about AI
Trying to understand why AI does the things it does is key to controlling it. (Vox)
+ What is AI? (MIT Technology Review)

4 Russia is throttling access to YouTube
It’s looking as though a total ban is imminent. (Bloomberg $)

5 Robots are finally becoming more useful
And it’s all thanks to AI. (FT $)
+ Is robotics about to have its own ChatGPT moment? (MIT Technology Review)

6 Voice actors are striking against video game companies
They claim the firms have learnt nothing from the prior strikes against film and TV. (NYT $)
+ They want studios to seek actors’ consent for using their voices with AI. (Bloomberg $)

7 Identifying all of Mexico’s dead bodies is a forensic crisis
Scientists are doing their best to harness tech to their cause. (New Yorker $)
+ The mothers of Mexico’s missing are using social media to search for mass graves. (MIT Technology Review)

8 New Jersey is angling to become a major AI hub
Bruce Springsteen’s hometown wants a slice of those hefty new tax credits. (Wired $)
+ The $100 billion bet that a postindustrial US city can reinvent itself as a high-tech hub. (MIT Technology Review)

9 Mexico’s delivery workers are sick of food orders
It’s less waiting around, and fewer irate customers. (Rest of World)

10 How to find serenity in a plant-identifying app
Take a minute to step outside and smell the roses. (The Guardian)

Quote of the day

“Just hug your IT folks.”

—Jerry Leever, an IT director at accounting, tax and advisory firm GHJ, explains to the Washington Post what it was like attempting to handle last week’s CrowdStrike meltdown. 

The big story

Bright LEDs could spell the end of dark skies

August 2022

Scientists have known for years that light pollution is growing and can harm both humans and wildlife. In people, increased exposure to light at night disrupts sleep cycles and has been linked to cancer and cardiovascular disease, while wildlife suffers from interruptions to its reproductive patterns and increased danger.

Astronomers, policymakers, and lighting professionals are all working to find ways to reduce light pollution. Many of them advocate installing light-emitting diodes, or LEDs, in outdoor fixtures such as city streetlights, mainly for their ability to direct light to a targeted area.

But the high initial investment and durability of modern LEDs mean cities need to get the transition right the first time or potentially face decades of consequences. Read the full story.

—Shel Evergreen

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or tweet ’em at me.)

+ Lady Gaga! Celine Dion! Snoop Dogg! It’s safe to say tonight’s Paris Olympics opening ceremony is going to be suitably bonkers.
+ Although nothing is ever going to top London 2012’s opening.
+ Candace Bushnell, you will never not be fabulous.
+ Who doesn’t love a good Kubrick stare?

Controversial CRISPR scientist promises “no more gene-edited babies” until society comes around

He Jiankui, the Chinese biophysicist whose controversial 2018 experiment led to the birth of three gene-edited children, says he’s returned to work on the concept of altering the DNA of people at conception, but with a difference. 

This time around, he says, he will restrict his research to animals and nonviable human embryos. He will not try to create a pregnancy, at least until society comes to accept his vision for “genetic vaccines” against common diseases.

“There will be no more gene-edited babies. There will be no more pregnancies,” he said during an online roundtable discussion hosted by MIT Technology Review, during which He answered questions from biomedicine editor Antonio Regalado, editor in chief Mat Honan, and our subscribers.

During the interview, He defended his past research and said the “only regret” he had was the difficulties he had caused to his wife and two daughters. He spent three years in prison after a court found him guilty of breaking regulations, but since his release in 2022 he has sought to stage a scientific comeback.

He says he currently has a private lab in the city of Sanya, in Hainan province, where he works on gene therapy for rare diseases as well as laboratory tests to determine how, one day, babies could be born resistant to ever developing Alzheimer’s disease.

The Chinese scientist said he’s receiving financial support from individuals in the US and China, and from Chinese companies, and has received an offer to form a research company in Silicon Valley. He declined to name his investors.

Read the full transcript of the event below.

Mat Honan: Hello, everybody. Thanks for joining us today. My name is Mat Honan. I’m the editor in chief here at MIT Technology Review. I’m really thrilled to host what’s going to be, I think, a great discussion today. I’m joined by Antonio Regalado, our senior editor for biomedicine, and He Jiankui, who goes by the name JK. 

JK is a biophysicist based in China, and he used CRISPR to edit the genes of human embryos, which ultimately resulted in the first children born whose DNA had been tailored using gene editing. Welcome to you both.

To our audience tuning in today, I wanted to let you know if you’ve got questions for us, please do ask them in the chat window. We’ve got a packed discussion planned, but we will get to as many of those as we can throughout. Antonio, I think I’m going to start with you, if we can. You’re the one who broke this story six years ago. Why don’t you set the stage for what we’re going to be talking about here today, and why it’s important.

Antonio Regalado: Mat, thank you.

The subject is genome editing. Of course, it’s a technology for changing the DNA inside of individual cells, including embryos. It’s hard to overstate its importance. I put it up there with the invention of the transistor and artificial intelligence.

And why do I think so? Well, genome editing gives humans control, or at least the ability to try and direct the very processes that brought us about as a species. So it’s that profound.

Getting to JK’s story. In 2018 we had a scoop—he might call it a leak—in which we described his experiment, which, as Mat said, was to edit human embryos to delete a particular gene called CCR5 with the goal of rendering the children, of which there were three, immune to HIV, which their fathers had and which is a source of stigma in China. So that was the project.

Of course our story set off, you know, immediate chaos. Voices were raised all over the world—many critical, a few in support. But one of the consequences was that JK and his team, the parents and the doctors, did not have the ability to tell their own story—in JK’s case because he was, in fact, detained and has completed a term in prison. So we’re happy to have him here to answer my questions and those of our subscribers. JK, thank you for being here. 

Several people, including Professor Michael Waitzkin of Duke University, would like to know what the situation is with the three children. What do you know about their health, and where is this information coming from?

He Jiankui: Lulu, Nana, and the third gene-edited baby—they were healthy and are living a normal, peaceful, undisturbed life. They are as happy as any other people, any other children in kindergarten. I have maintained a constant connection with their parents.

Antonio Regalado: I see. JK, on X, you recently made a comment about one of the parents—now a single mother—who you said you were supporting financially. What can you tell us about that situation? What kind of obligations do you have to these children, and are you able to meet those obligations?

He Jiankui: So the third gene-edited baby—the parents divorced, so the girl is with her mother. You know, a single mother, a single-parent family—life is not easy. So in the last two years, I’m providing some financial support, but I’m not sure it’s the right thing to do or whether it’s ethical, because I’m a scientist or a doctor, and she is a volunteer or patient. For scientists or doctors to provide financial support to the volunteer or patient—is it correct? Is it the right thing to do, and is it ethical? That’s something I’m not sure of. So I have this question, actually.

Antonio Regalado: Interesting. Well, there’s a lot of ethical dilemmas here, and one of them is about your publications, the scientific publications which you prepared and which describe the experiment. So a two-part question for you. 

First of all, setting the ethics aside, some people who criticized your experiment still want to know the result. They would like to know if it worked. Are the children resistant to HIV or not? So part one of the question is: Are you able to make a measurement on their blood, or is anybody able to make a measurement that would show if the experiment worked? And second part of the question: Do you intend to publish your paper, including as a preprint or as a white paper?

He Jiankui: So I always believe that scientific research must be open and transparent, so I am willing to publish my papers, which I wrote six years ago.

It was rejected by Nature, for some reason. But even today, I would say that I’m willing to publish these two papers in a peer-reviewed journal. It has to be peer-reviewed; that is the standard way to publish a paper.

The other thing is whether the baby is resistant to HIV. Actually, several years ago, when we designed the experiment, we already collected the [umbilical] cord blood when they were born. We collected cord blood from the babies, and our original experimental design was to challenge the cord blood with the HIV virus to see whether they are actually resistant to HIV. But this experiment never happened, because once the news broke, there was no way to do any further experiments.

I would say I am happy to share my results with the whole world.

Mat Honan: Thanks, Antonio. Let me start with a question from a reader, Karen Jones. She asks, with so much controversy around breaking the law in China, she wanted to know about your credibility. And it reminds me of something that I’m curious about myself. What are the professional consequences of your work? Are you still able to work in China? Are you still able to do experiments with CRISPR?

He Jiankui: Yes, I continue my research in the lab. I have a lab in Sanya [Hainan province], and also previously a lab in Wuhan.

My current work is on gene editing to cure genetic disease such as Duchenne muscular dystrophy and several other genetic diseases. And all this is done by somatic gene therapy, which means this is not working on human embryos.

Mat Honan: I think that leads [to] a question that we have from another reader, Sophie, who wanted to know if you plan to do more gene editing in humans.

He Jiankui: So I have proposed a research project using human embryo gene editing to prevent Alzheimer’s disease. I posted this proposal last year on Twitter. So my goal is we’re going to test the embryo gene editing in mice and monkeys, and in human nonviable embryos. Again, it’s nonviable embryos. There will be no more gene-edited babies. There will be no more pregnancies. We’re going to stop at human nonviable embryos. So our goal is to see if we could prevent Alzheimer’s for offspring or the next generation, because Alzheimer’s has no cure currently.

Mat Honan: I see. And then my last question before I move it back to Antonio. I’m curious if you plan to continue working in China, or if you think that you will ultimately relocate somewhere else. Do you plan to do this work elsewhere? 

He Jiankui: Some investors from Silicon Valley proposed to invest in me to start a company in the United States, with research done both in the United States and in China. This is a very interesting proposal, and I am considering it. I would be happy to work in the United States if there’s good opportunity.

Mat Honan: Let me just remind our readers—if you do have questions, you could put them in the chat and we will try to get to them. But in the meantime, Antonio, back over to you, please.

Antonio Regalado: Definitely, I’m curious about what your plans are. Yesterday Stat News reported some of the answers to today’s questions. They said that you have established yourself in the province of Hainan in China. So what kind of facility do you have there? Do you have a lab, or are you doing research? And where is the financial support coming from?

He Jiankui: So here I have an independent private research lab with a few people. We get funding from both the United States and also from China to support me to carry on the research on the gene therapy for Duchenne muscular dystrophy, for high cholesterol, and some other genetic diseases. 

Antonio Regalado: Could you be more specific about where the funding is coming from? I mean, who is funding you, or what types of people are funding this research? 

He Jiankui:  There are people in the United States who made a donation to me. I’m not going to disclose the name and amount. Also the Chinese people, including some companies, are providing funding to me.

Antonio Regalado: I wonder if you could sketch out for us—I know people are interested—where you think all this [is] going to lead. With a long enough time frame—10 years, 20 years, 30 years—do you think the technology will be in use to change embryos, and how will it be used? What is the larger plan that you see?

He Jiankui: I would say in 50 years, like in 2074, embryo gene editing will be as common as IVF babies to prevent all the genetic disease we know today. So the babies born at that time will be free of genetic disease.

Antonio Regalado: You’re working on Alzheimer’s. This is a gene variant that was described in 2012 by deCode Genetics. This is one of these variants that is protective—it would protect against Alzheimer’s. Strictly speaking, it’s not a genetic disease. So what about the role of protective variants, or what could be called improvements to health?

He Jiankui: Well, I decided to do Alzheimer’s disease because my mother has Alzheimer’s. So I’m going to have Alzheimer’s too, and maybe my daughter and my granddaughter. So I want to do something to change it. 

There’s no cure for Alzheimer’s today. I don’t know for how many years that will be true. But what we can do is: Since some people in Europe are at a very low risk [for] Alzheimer’s, why don’t we just make some modifications so our next generation also have this protective allele, so they have a low risk of Alzheimer’s or maybe are free of Alzheimer’s. That’s my goal.

Antonio Regalado: Well, a couple of questions. Will any country permit this? I mean, genome editing, producing genome-edited children, was made formally illegal in China, I think in 2021. And it’s prohibited in the United States in another way. So where can you go, or where will you go to further this technology?

He Jiankui:  I believe society will eventually accept that embryo gene editing is a good thing because it improves human health. So I’m waiting for society to accept that. My current research is not doing any gene-edited baby or any pregnancy. What I do is a basic research in mice, monkeys, or human nonviable embryos. We only do basic research, but I’m certain that one day society will accept embryo gene editing.

Mat Honan: That raises a question for me. We’re talking about HIV or Alzheimer’s, but there are other aspects of this as well. You could be doing something where you’re optimizing for intelligence or optimizing for physical performance. And I’m curious where you think this leads, and if you think that there is a moral issue around, say, parents who are allowed to effectively design their children by editing their genes.

He Jiankui: Well, I advise you to read the paper I published in 2018 in the CRISPR Journal. It’s my personal thinking of the ethical guidelines for embryo gene editing. It was retracted by the CRISPR Journal. But I proposed that the embryo gene editing should only be used for disease. It should never be used for a nontherapeutic purpose, like making people smarter, stronger, or beautiful.

Mat Honan:  Do you not think that becomes inevitable, though, if gene-editing embryos becomes common?

He Jiankui: Society will decide that. 

Mat Honan: Moving on: You said that you were only working with animals or with nonviable embryos. Are there other people who you think are working with human embryos, with viable human embryos, or that you know of, or have heard about, continuing with that kind of work?

He Jiankui: Well, I don’t know yet. Actually, many scientists are keeping their distance from me. But there are people from somewhere, an island in Honduras or maybe some small East European country, inviting me to do that. And I refused. I refused. I will only do research in the United States and China or other major countries.

Mat Honan: So the short answer is, that sounded almost like a yes to me? You think that it is happening? Is that correct?

He Jiankui: I’m not answering that. 

Mat Honan: Okay, fair enough. I’m going to move on to some reader questions here while we have the time. You mentioned basically having society come around to seeing that this is necessary work. Ravi asks: What type of regulatory framework do you believe is necessary to ensure responsible development and applications of this technology? You had mentioned limiting to therapeutic purposes. Are there other frameworks you think should be in place?

He Jiankui: I’m not answering this question.

Mat Honan: What you think should be in place in terms of regulation?

He Jiankui: Well, there are a lot of regulations. I personally comply with all the laws, regulations, and international ethics for my work. 

Mat Honan: I see. Go ahead, Antonio. 

Antonio Regalado: Let me just jump in with a related question. You talked about offers of funding from the United States, from Silicon Valley—offers of funding to support you. Is that to create a company, and how would accepting investment from entrepreneurs to start a company change public perception about the technology?

He Jiankui: Well, it was designed as a company registered in the United States and headquartered in the United States.

Antonio Regalado: But do you think that starting a company will make people more enthusiastic or interested in this technology?

He Jiankui: Well, for me, I would certainly be more happy to get an offer from the United States [if it came] from a university or research institution. I would be happy for that, but it’s not happening. But, well, a company started doing some basic research, and that’s also a good contribution.

Antonio Regalado: Getting back to the initial experiment—obviously, it’s been criticized a great deal. And I am just wondering, looking back, which of those criticisms do you accept? Which do you disagree with? Do you have regrets about the experiment?

He Jiankui: The only regret I have is to my family, my wife and my two daughters. In the last few years, they are living in a very difficult situation. I won’t let that happen again.

Antonio Regalado: The technology is viewed as controversial. I’m talking about embryo editing. So it’s a little bit surprising to me that you would return to it. Surprising and interesting. So why is it that you have decided to pursue this vision, this project, despite the problems? I mean, you’re still working on it. What is your motivation?

He Jiankui: Our stance is always for us to do something to benefit mankind.

Antonio Regalado: Speaking of mankind, or humankind, I did have a question about evolution. The gene edits that you made to CCR5 and now are working on to another gene in Alzheimer’s—these are natural mutations that occur in some populations, you mentioned in Europe. They’ve been discovered through population genetics. Studies of a large number of people can find these genetic variations that are protective, or believed to be protective, against disease. In the natural course of evolution, those might spread, right? But it would take hundreds of thousands of years. So with gene editing, you can introduce such a change into an embryo, I guess, in a matter of minutes.

So the question I have is: Is this an evolutionary project? Is it human technology being used to take over from evolution?

He Jiankui: I’m not interested in evolution. Evolution takes thousands of years. I only care about the people surrounding me—my family, and also the patients who would come to find me. What I want to do is help those people, help people in this living world. I’m not interested in evolution.

Antonio Regalado: Mat, any other question from the audience you’d like to throw in?

Mat Honan: Yeah, let me get to one from Rez, who’s asking: What do you see as the major hurdles in advancing CRISPR to more general health-care use cases? What do you see as the big barriers there?

He Jiankui:  If you’re talking about somatic gene therapy, the bottleneck, of course, is delivery. Without breakthroughs in delivery technology, somatic gene therapy is heading toward a dead end. For the embryo gene editing, the bottleneck, of course, is: How long will it take people to accept new technology? Because as humans, we are always conservative. We are always worried about the new things, and it takes time for people to accept new technology. 

Mat Honan: I wanted to get a question from Robert that goes back to our earlier discussion here, which is: What was your initial motivation to take this step with the three children?

He Jiankui: So several years ago, I went to a village in central China where more than 30% of people are infected with HIV. Back in the 1990s, many years ago, people sold blood, and that [spread HIV]. When I was there, I saw that there’s a very small kindergarten, designed only for the children of HIV patients. Why did that happen? Other public schools won’t take them. I felt that there’s a kind of discrimination against these children, and what I want to do is something to change it. If the children of HIV patients are not just free from but actually immune to HIV, then it will help them go back into society. For me, it’s just like a vaccine. It’s one vaccine to protect them for a lifetime.

Mat Honan: I see we’re running short on time here, and I do want to try to get to some more of our reader questions. I know Antonio has a last one as well. If you do have questions, please put them in the chat. And from Joseph, he wants to know: You say that you think that the society will come around. What do you think will be the first types of embryo DNA edits that would be acceptable to the medical community or to society at large?

He Jiankui: Very recently, a patient flew here to visit me in my office. They are a couple, they are over 40 years old. They want to have a baby and already did IVF. They have embryos, but the embryos have a problem with a chromosome. So this embryo is not good. So one thing, apparently, we could do to help them is to correct the chromosome problem so they can have a healthy embryo, so they can have children. We’re not creating any immunity to anything—it’s just to restore the health of the embryo. And I believe that would be a good start.

Mat Honan: Thank you, JK. Antonio, back over to you. 

Antonio Regalado:  JK, I’m curious about your relationship to the government in China, the central government. You were punished, but on the other hand, you’re free to continue to talk about science and do research. Does the government support you and your ideas? Are you a member of the political party? Have you been offered membership? What is your relationship to the government?

He Jiankui: Next question.

Antonio Regalado: Next question? Okay. Interesting. We’ll have to postpone that one for another day.

Mat, anything else? I think we’re coming up against time, and I’m wondering if we have reader questions. I have one here that I could ask, which is about the new technologies in CRISPR. People want to know where this technology is going, in terms of the methods. You used CRISPR to delete a gene. But CRISPR itself is constantly being improved. There are new tools. So in your lab, in your experiments, what gene-editing technology are you employing?

He Jiankui: So six years ago, we were using the original CRISPR-Cas9 invented by Jennifer Doudna. But today we are moving on to base editing, invented by David Liu. Base editing is safe in embryos. It won’t cut the DNA or break it—just small changes. So we no longer use CRISPR-Cas9. We’re using base editing.

Antonio Regalado: And can you tell me the nature of the genetic change that you’re experimenting with or would like to make in these cells to make them resistant to Alzheimer’s? How big a change are you making with this base editor, or trying to make with it?

He Jiankui: So to make people protected against Alzheimer’s, we just need a single base change in the whole human 3 billion letters of DNA. We just change one letter of it to protect people from Alzheimer’s.

Antonio Regalado: And how soon do you think that this could be in use? I mean, it sounds interesting. If I had a child, I might want them to be immune to Alzheimer’s. So this is quite an interesting proposal. What is the time frame in years—if it works in the lab—before it could be implemented in IVF clinics?

He Jiankui: I would say there’s the basic research that could be finished in two years. I won’t move on to the human trial. That’s not my role. It’s determined by society whether to accept it or not. And that’s the ethical side. 

Antonio Regalado: A last question on this from a reader. The question is: How do you prove the benefits? Of course, you can make a genetic change. You can even create a person with a genetic change. But if it’s for Alzheimer’s, it’s going to take 70 years before you know and can prove the results. So how can you prove its medical benefit? Or how can you predict the medical benefit?

He Jiankui: So one thing is that we can observe it in the natural world. There are already thousands of people with this mutation. It helps protect them against Alzheimer’s. It naturally exists in the human population, so that’s a natural human experiment. And we could also do it in mice: use Alzheimer’s model mice and then modulate the DNA to see the results.

You might argue that it takes many years to develop Alzheimer’s, but as a society we’ve done a lot with the HPV vaccine against certain women’s cancers. Cancer takes many years to develop, but people take the HPV vaccine at age seven or eight.

Mat Honan: Thank you so much. JK and Antonio, we are slightly past time here, and I’m going to go ahead and wrap it up. Thank you very much for joining us today, to both of you. And I also want to thank all of our subscribers who tuned in today. I do hope that we see you again next month at our Roundtable in August. It’s our subscriber-only series. And I hope you enjoyed today. Thanks, everybody. 

Antonio Regalado: Thank you, JK.

He Jiankui: Thank you. 

Mystic Gum Sees Early DTC Success

Braxton Manley first appeared on the podcast in 2021. As a college student, he had launched Braxley Bands, a maker of Apple Watch bands. Last year he returned with an update on that business after operational and sales challenges.

He’s back, having launched his latest company, Mystic, a direct-to-consumer maker of health-focused chewing gum. In our recent conversation, we discuss the origins of Mystic, marketing plans, early successes, and more.

The entire audio is embedded below. The transcript is edited for length and clarity.

Eric Bandholz: How’s business?

Braxton Manley: Braxley Bands, our Apple Watch band company, is surviving in a challenging climate. We’re operating from a profit-first mentality. We grow as much as possible and, based on the prior month’s profit-and-loss statement, scale back if needed. It’s multiple scale-ups, then pull-backs. My brother Zach and I run the business, working remotely. We haven’t taken a salary in a while and are focused on the business’s long-term stability.

I’m involved with three direct-to-consumer ecommerce businesses now. My fiancée, Maddie, started Peace Love Hormones about three years ago. It’s a direct-to-consumer supplement brand for women’s hormone health. I have an executive role there, functioning as CEO so that Maddie can pursue her doctorate in herbal medicine and focus on the product. I focus on the marketing and operations.

Our third business, Mystic, just launched. It’s chewing gum for women made with sap from a mastic tree, which grows on a Greek island and has a ton of health benefits.

We’re trying to build a family holding company to operate multiple DTC businesses. At this point, they’re all relatively humble — six and seven figures in annual revenue.

Bandholz: Tell me about Mystic.

Manley: It’s square chunks of organic gum. It costs $38 for a can. It’s a beauty product for women and is categorized that way on TikTok. It’s different from regular gum. It’s not sweet at all. It’s palate-cleansing. It relieves indigestion and promotes oral health. You can develop an appreciation for the flavor.

The business is six months old. We’ve been fulfilling orders for just a week. The beginning stage was figuring out what the logo would look like. We did a beta test last year. We invested about $3,000 and ended up selling $20,000 worth. We realized we had a viable product.

We then raised $90,000 from friends and family. We developed custom packaging and produced 5,000 gum units — enough to make our first $200,000 in revenue.

Bandholz: How are you marketing the product?

Manley: Well, we’re a week into fulfilling orders, so it’s fresh. We’ve spent a lot of time on TikTok Shop. We believe TikTok is a good fit for the product.

Affiliates are important to us too. Maddie, my fiancée, is an Instagram creator in the health and wellness space. She has an incredible community, which produced our first Mystic orders — about $5,000 in revenue. By Q4, we’ll be doing six figures monthly. This can scale quickly.

We sell recurring orders, but we’re not using the terms “subscribers” or “subscriptions.” Instead, we sell memberships to a gum-chewing club. We have cool hats, a club logo, and patches. The idea is to build a culture. We will charge more for our first subscription and less for renewals. It’s $38 for a one-time order or $30 to join the club for recurring shipments.

Bandholz: Where can people buy the gum and follow you?

Manley: Go to MysticGum.com. You can follow me on X, @Braxtonmanley, or LinkedIn.

Google Cautions On Blocking GoogleOther Bot via @sejournal, @martinibuster

Google’s Gary Illyes answered a question about the non-search features that the GoogleOther crawler supports, then added a caution about the consequences of blocking GoogleOther.

What Is GoogleOther?

GoogleOther is a generic crawler created by Google for purposes that fall outside those of the specialized crawlers for Search, Ads, Video, Images, News, Desktop, and Mobile. It can be used by internal teams at Google for research and development related to various products.

The official description of GoogleOther is:

“GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development.”

Something that may be surprising is that there are actually three kinds of GoogleOther crawlers.

Three Kinds Of GoogleOther Crawlers

  1. GoogleOther
    Generic crawler for public URLs
  2. GoogleOther-Image
    Optimized to crawl public image URLs
  3. GoogleOther-Video
    Optimized to crawl public video URLs

All three GoogleOther crawlers can be used for research and development purposes. That is just one of the purposes Google publicly acknowledges for all three versions of GoogleOther.
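Because each variant announces its own user-agent token, a site owner who wants to opt out of all three can address them together in robots.txt. A minimal sketch (the tokens below are the ones Google documents; the blanket `Disallow` is just one possible policy):

```txt
# Block all three GoogleOther crawlers site-wide
User-agent: GoogleOther
User-agent: GoogleOther-Image
User-agent: GoogleOther-Video
Disallow: /
```

Per the Robots Exclusion Protocol, several User-agent lines can share one group of rules, and a group naming a crawler specifically takes precedence over the wildcard `*` group for that crawler.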

What Non-Search Features Does GoogleOther Support?

Google doesn’t say which specific non-search features GoogleOther supports, probably because it doesn’t really “support” any single feature. It exists for research and development crawling, which could be in support of a new product or an improvement to a current one; its purpose is deliberately open and generic.

This is the question, as Gary read it:

“What non-search features does GoogleOther crawling support?”

Gary Illyes answered:

“This is a very topical question, and I think it is a very good question. Besides what’s in the public I don’t have more to share.

GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development.

Historically Googlebot was used for this, but that kind of makes things murky and less transparent, so we launched GoogleOther so you have better controls over what your site is crawled for.

That said GoogleOther is not tied to a single product, so opting out of GoogleOther crawling might affect a wide range of things across the Google universe; alas, not Search, search is only Googlebot.”

It Might Affect A Wide Range Of Things

Gary is clear that blocking GoogleOther wouldn’t have an effect on Google Search, because Googlebot is the crawler used for indexing content. So if a site owner wants to block any of the three versions of GoogleOther, they should be able to do so without a negative effect on search rankings.

But Gary also cautioned about the outcome of blocking GoogleOther, saying it could have an effect on other products and services across Google. He didn’t state which products could be affected, nor did he elaborate on the pros and cons of blocking GoogleOther.

Pros And Cons Of Blocking GoogleOther

Whether to block GoogleOther doesn’t have a straightforward answer. Several considerations determine whether doing so makes sense.

Pros

Inclusion in research for a future Google product related to search (maps, shopping, images, a new search feature) could be useful. A site that allows GoogleOther might be one of the few chosen to test a feature that could ultimately increase its earnings.

Another consideration is that blocking GoogleOther to save server resources is not necessarily a valid reason, because GoogleOther doesn’t seem to crawl often enough to make a noticeable impact.

If blocking Google from using site content for AI is the concern, then blocking GoogleOther will have no impact on that at all. GoogleOther has nothing to do with crawling for Google Gemini apps or Vertex AI, including any future products built on the associated language models. The bot for that specific use case is Google-Extended.
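In other words, a site owner whose only concern is generative AI training would target Google-Extended rather than GoogleOther. A sketch:

```txt
# Opt out of Gemini/Vertex AI training; Search and GoogleOther are unaffected
User-agent: Google-Extended
Disallow: /
```

Google-Extended is a control token honored in robots.txt rather than a crawler that fetches pages itself, so this rule changes how already-crawled content may be used, not what gets fetched.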

Cons

On the other hand it might not be helpful to allow GoogleOther if it’s being used to test something related to fighting spam and there’s something the site has to hide.

It’s possible that a site owner might not want to participate if GoogleOther comes crawling for market research or for training machine learning models (for internal purposes) that are unrelated to public-facing products like Gemini and Vertex.

Allowing GoogleOther to crawl a site for unknown purposes is like giving Google a blank check to use your site data in any way it sees fit, outside of training public-facing LLMs or the purposes tied to named bots like Googlebot.

Takeaway

Should you block GoogleOther? It’s a coin toss. There are potential benefits, but in general there isn’t enough information to make an informed decision.

Listen to the Google SEO Office Hours podcast at the 1:30 mark:

Featured Image by Shutterstock/Cast Of Thousands

Reddit Limits Search Engine Access, Google Remains Exception via @sejournal, @MattGSouthern

Reddit has recently tightened its grip on who can access its content, blocking major search engines from indexing recent posts and comments.

This move has sparked discussions in the SEO and digital marketing communities about the future of content accessibility and AI training data.

What’s Happening?

First reported by 404 Media, Reddit updated its robots.txt file, preventing most web crawlers from accessing its latest content.

Google, however, remains an exception, likely due to a $60 million deal that allows the search giant to use Reddit’s content for AI training.
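Reddit’s actual file isn’t reproduced here, but the general robots.txt pattern for denying all compliant crawlers while carving out an exception looks like this (a hypothetical illustration, not Reddit’s real configuration):

```txt
# Hypothetical default-deny policy with one permitted crawler
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
```

A crawler follows the most specific user-agent group that matches it, so Googlebot here would obey its own group and ignore the wildcard block, while every other compliant bot would be shut out.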

Brent Csutoras, founder of Search Engine Journal, offers some context:

“Since taking on new investors and starting their pathway to IPO, Reddit has moved away from being open-source and allowing anyone to scrape their content and use their APIs without paying.”

The Google Exception

Currently, Google is the only major search engine able to display recent Reddit results when users search with “site:reddit.com.”

This exclusive access sets Google apart from competitors like Bing and DuckDuckGo.

Why This Matters

For users who rely on appending “Reddit” to their searches to find human-generated answers, this change means they’ll be limited to using Google or search engines that pull from Google’s index.

It presents new challenges for SEO professionals and marketers in monitoring and analyzing discussions on one of the internet’s largest platforms.

The Bigger Picture

Reddit’s move aligns with a broader trend of content creators and platforms seeking compensation for using their data in AI training.

As Csutoras points out:

“Publications, artists, and entertainers have been suing OpenAI and other AI companies, blocking AI companies, and fighting to avoid using public content for AI training.”

What’s Next?

While this development may seem surprising, Csutoras suggests it’s a logical step for Reddit.

He notes:

“It seems smart on Reddit’s part, especially since similar moves in the past have allowed them to IPO and see strong growth for their valuation over the last two years.”


FAQ

What is the recent change Reddit has made regarding content accessibility?

Reddit has updated its robots.txt file to block major search engines from indexing its latest posts and comments. This change exempts Google due to a $60 million deal, allowing Google to use Reddit’s content for AI training purposes.

Why does Google have exclusive access to Reddit’s latest content?

Google has exclusive access to Reddit’s latest content because of a $60 million deal that allows Google to use Reddit’s content for AI training. This agreement sets Google apart from other search engines like Bing and DuckDuckGo, which are unable to index new Reddit posts and comments.

What broader trend does Reddit’s recent move reflect?

Reddit’s decision to limit search engine access aligns with a larger trend where content creators and platforms seek compensation for the use of their data in AI training. Many publications, artists, and entertainers are taking similar actions to either block or demand compensation from AI companies using their content.


Featured Image: Mamun sheikh K/Shutterstock

What Can AI Do For Healthcare Marketing In 2024? via @sejournal, @CallRail

This post was sponsored by CallRail. The opinions expressed in this article are the sponsor’s own.

Artificial intelligence (AI) has huge potential for healthcare practices. It can assist with diagnosis and treatment, as well as administrative and marketing tasks. Yet, many practices are still wary of using AI, especially regarding marketing.

The reality is that AI is here to stay, and many healthcare practices are beginning to use the technology. According to one recent study, 89% of healthcare professionals surveyed said that they were at least evaluating AI products, experimenting with them, or had implemented AI.

To help you determine whether using AI is right for your healthcare practice, let’s take a look at some of the pros and cons of using AI while marketing.

The Pros And Cons Of AI For Healthcare Practices

Healthcare practices that choose to implement AI in safe and appropriate ways to help them with their marketing and patient experience efforts can reap many benefits, including more leads, conversions, and satisfied patients. In fact, 41% of healthcare organizations say their marketing team already uses AI.

Patients also expect healthcare practices to begin to implement AI in a number of ways. In one dentistry study, patients overall showed a positive attitude toward using AI. So, what’s holding your practice back from adding new tools and finding new use cases for AI? Let’s take a look at common concerns.

Con #1: Data Security And Privacy Concerns

Let’s get one of the biggest concerns with AI and healthcare out of the way first. Healthcare practices must follow all privacy and security regulations related to patients’ protected health information (PHI) to maintain HIPAA compliance.

So, concerns over whether AI can be used in a way that doesn’t interfere with HIPAA compliance are valid. There are also concerns about how popular GenAI models handle the data submitted to them, which means sensitive practice data could be exposed to competitors or even hackers.

Pro #1: AI Can Help You Get More Value From Your Data Securely

While there are valid concerns about how AI algorithms make decisions and data privacy concerns, AI can also be used to enrich data to help you achieve your marketing goals while still keeping it protected.

With appropriate guardrails and omission procedures in place, you can apply AI to gain insights from data that matters to you without putting sensitive data at risk.

For example, our CallRail Labs team is helping marketers remove their blind spots by using AI to analyze and detect critical context clues that help you qualify which calls are your best leads so you can follow up promptly.

At the same time, we know how important it is for healthcare companies to keep PHI secure, which is why we integrate with healthcare privacy platforms like Freshpaint. It can help you bridge the gap between patient privacy and digital marketing.

In addition, our AI-powered Healthcare Plan automatically redacts sensitive protected health information from call transcripts, enforces automatic log-outs to keep PHI from being exposed, and provides full audit trail logging. It even features unique logins and credentials for every user, which helps prevent PHI from being accidentally exposed to employees who don’t need access to that information.

Con #2: AI Is Impersonal

Having a good patient experience is important to almost all patients, and according to one survey, 52% of patients said a key part of a good patient experience is being treated with respect. Almost as many (46%) said they want to be addressed as a person. Given these concerns, handing over content creation or customer interactions to AI can feel daunting. While an AI-powered chatbot might be more efficient than a human in a call center, you also don’t want patients to feel like you’ve delegated customer service to a robot. Trust is the key to building patient relationships.

Pro #2: AI Can Improve The Patient Experience

Worries over AI making patient interactions feel impersonal are reasonable, but just like any other type of tool, it’s how you use AI that matters. There are ways to deploy AI that can actually enhance the patient experience and, by doing so, give your healthcare practice an advantage over your competitors.

The answer isn’t in offloading customer interaction to chatbots. But AI can help you analyze customer interactions to make customer service more efficient and helpful.

With CallRail’s AI-powered Premium Conversation Intelligence™, which transcribes, summarizes, and analyzes each call, you can quickly assess your patients’ needs and concerns and respond appropriately with a human touch. For instance, Premium Conversation Intelligence can identify and extract common keywords and topics from call transcripts. This data reveals recurring themes, such as frequently asked questions, common complaints, and popular services. A healthcare practice could then use these insights to tailor their marketing campaigns to address the most pressing patient concerns.

Con #3: AI Seems Too Complicated To Use

Let’s face it: new technology is risky, and for healthcare practices especially, risk is scary. With AI, some of the risk comes from its perceived complexity. Identifying the right use cases for your practice, selecting the right tools, training your staff, and changing workflows can all feel quite daunting. Figuring this out takes time and money. And, if there aren’t clear use cases and ROI attached, the long-term benefits may not be worth the short-term impact on business.

Pro #3: AI Can Save Time And Money

Using a computer or a spreadsheet for the first time probably also felt complicated, and took some time to learn up front. But you know that using these tools, compared with pen, paper, and calculators, has saved an enormous amount of time, making the upfront investment clearly worthwhile. Compared with many technologies, AI tools are often intuitive and only require you to learn a few simple skills: writing prompts, refining prompts, reviewing reports, and so on. Even if it takes some time to learn new AI tools, the time savings will be worth it once you do.

To get the greatest return on investment, focus on AI solutions that take care of time-intensive tasks to free up time for innovation. With the right use cases and tools, AI can help solve complexity without adding complexity. For example, with Premium Conversation Intelligence, our customers spend 60% less time analyzing calls each week, and they’re using that time to train staff better, increase their productivity, and improve the patient experience.

Con #4: AI Marketing Can Hurt Your Brand

Many healthcare practices are excited to use GenAI tools to accelerate creative marketing efforts, like social media image creation and article writing. But consumers are less excited. In fact, consumers are more likely to say that the use of AI makes them distrusting (40%), rather than trusting (19%), of a brand. In a market where trust is the most important factor for patients when choosing healthcare providers, there is caution and hesitancy around using GenAI for marketing.

Pro #4: AI Helps Make Your Marketing Better

While off-brand AI images shared on social media can be bad brand marketing, there are many ways AI can elevate your marketing efforts without impacting the brand perception. From uncovering insights to improving your marketing campaigns and maximizing the value of each marketing dollar spent to increasing lead conversion rates and decreasing patient churn, AI can help you tackle these problems faster and better than ever.

At CallRail, we’re using AI to tackle complex challenges like multi-conversation insights. CallRail can give marketers instant access to a 3-6 sentence summary for each call, average call sentiment, notable trends behind positive and negative interactions, and a summary of commonly asked questions. Such analysis would take hours and hours for your marketing team to do manually, but with AI, you have call insights at your fingertips to help drive messaging and keyword decisions that can improve your marketing attribution and the patient experience.

Con #5: Adopting AI Tools Might Cause Disruption

As a modern healthcare practice, your tech stack is the engine that runs your business. When onboarding any new technology, there are always concerns about how well it will integrate with existing technology and tools you use and whether it supports HIPAA compliance. There may also be concern about how AI tools can fit into your existing workflows without causing disruption.

Pro #5: AI Helps People Do Their Jobs Better

Pairing the right AI tool with roles that involve repetitive tasks can be a win for your staff and your practice. For example, keeping up with healthcare trends is important for marketers who want to improve messaging and campaigns.

An AI-powered tool that analyzes conversations and provides call highlights can help healthcare marketers identify keyword and Google Ad opportunities so they can focus on implementing the most successful marketing strategy rather than listening to hours of call recordings. In addition, CallRail’s new AI-powered Convert Assist helps healthcare marketers provide a better patient experience. With AI-generated call coaching, marketers can identify what went well and what to improve after every conversation.

What’s more, with a solution like CallRail, which offers a Healthcare Plan and will sign a business associate agreement (BAA), you are assured that we will comply with HIPAA controls within our service offerings to ensure that your call tracking doesn’t expose you to potential fines or litigation. Moreover, we also integrate with other marketing tools, like Google Ads, GA4, and more, making it easy to integrate our solution into your existing technologies and workflows.

Let CallRail Show You The Pros Of AI

If you’re still worried about using AI in your healthcare practice, start with a trusted solution like CallRail that has proven ROI for AI-powered tools and a commitment to responsible AI development. You can talk to CallRail’s experts or test the product out for yourself with a 14-day free trial.


Image Credits

Featured Image: Image by CallRail. Used with permission.

Find Keyword Cannibalization Using OpenAI’s Text Embeddings With Examples via @sejournal, @vahandev

This new series of articles focuses on working with LLMs to scale your SEO tasks. We hope to help you integrate AI into SEO so you can level up your skills.

We hope you enjoyed the previous article and understand what vectors, vector distance, and text embeddings are.

Following this, it’s time to flex your “AI knowledge muscles” by learning how to use text embeddings to find keyword cannibalization.

We will start with OpenAI’s text embeddings and compare them.

| Model | Dimensionality | Pricing | Notes |
| --- | --- | --- | --- |
| text-embedding-ada-002 | 1536 | $0.10 per 1M tokens | Great for most use cases. |
| text-embedding-3-small | 1536 | $0.02 per 1M tokens | Faster and cheaper, but less accurate. |
| text-embedding-3-large | 3072 | $0.13 per 1M tokens | More accurate for complex, long-text tasks; slower. |

(*Tokens can be thought of, roughly, as words.)
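As a quick sanity check on budget, you can estimate cost before embedding anything. The sketch below is a hypothetical helper that approximates tokens as words and uses per-million-token prices like those in the table above (verify against OpenAI’s current pricing page):

```python
def estimate_embedding_cost(titles, model="text-embedding-ada-002"):
    # Hypothetical helper: approximates token count as word count.
    # Real tokenizers usually produce somewhat more tokens than words.
    price_per_1m_tokens = {  # USD; check OpenAI's pricing page for current values
        "text-embedding-ada-002": 0.10,
        "text-embedding-3-small": 0.02,
        "text-embedding-3-large": 0.13,
    }
    tokens = sum(len(title.split()) for title in titles)
    return tokens / 1_000_000 * price_per_1m_tokens[model]

# 10,000 thirteen-word titles ≈ 130k tokens ≈ $0.013 with ada-002
title = "14 Most Important Meta And HTML Tags You Need To Know For SEO"
print(estimate_embedding_cost([title] * 10_000))
```

Even a site with tens of thousands of pages costs only a few cents to embed, so pricing is rarely the deciding factor between these models.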

But before we start, you need to install Python and Jupyter on your computer.

Jupyter is a web-based tool for professionals and researchers. It allows you to perform complex data analysis and machine learning model development using any programming language.

Don’t worry – it’s really easy and takes little time to finish the installations. And remember, ChatGPT is your friend when it comes to programming.

In a nutshell:

  • Download and install Python.
  • Open your Windows command line or terminal on Mac.
  • Type these commands: pip install jupyterlab and pip install notebook.
  • Run Jupyter with this command: jupyter lab.

We will use Jupyter to experiment with text embeddings; you’ll see how fun it is to work with!

But before we start, you must sign up for OpenAI’s API and set up billing by adding funds to your balance.

OpenAI API billing settings

Once you’ve done that, set up email notifications to inform you when your spending exceeds a certain amount under Usage limits.

Then, obtain API keys under Dashboard > API keys, which you should keep private and never share publicly.

OpenAI API keys

Now, you have all the necessary tools to start playing with embeddings.

  • Open your computer command terminal and type jupyter lab.
  • You should see something like the below image pop up in your browser.
  • Click on Python 3 under Notebook.
Jupyter Lab

In the opened window, you will write your code.

As a small task, let’s group similar URLs from a CSV. The sample CSV has two columns: URL and Title. Our script’s task will be to group URLs with similar semantic meanings based on the title so we can consolidate those pages into one and fix keyword cannibalization issues.

Here are the steps you need to do:

Install the required Python libraries with the following command in your terminal (or in a Jupyter notebook):

pip install pandas openai scikit-learn numpy unidecode

The ‘openai’ library is required to interact with the OpenAI API to get embeddings, and ‘pandas’ is used for data manipulation and handling CSV file operations.

The ‘scikit-learn’ library is necessary for calculating cosine similarity, and ‘numpy’ is essential for numerical operations and handling arrays. Lastly, unidecode is used to clean text.

Then, download the sample sheet as a CSV, rename the file to pages.csv, and upload it to your Jupyter folder where your script is located.

Set your OpenAI API key to the key you obtained in the step above, and copy-paste the code below into the notebook.

Run the code by clicking the play triangle icon at the top of the notebook.


import pandas as pd
import openai
from sklearn.metrics.pairwise import cosine_similarity
import numpy as np
import csv
from unidecode import unidecode

# Function to clean text
def clean_text(text: str) -> str:
    # First, replace known mojibake sequences (UTF-8 text that was decoded as
    # cp1252) with their correct equivalents
    replacements = {
        'â€“': '–',   # en dash
        'â€™': '’',   # right single quotation mark
        'â€œ': '“',   # left double quotation mark
        'â€\x9d': '”',  # right double quotation mark
        'â€˜': '‘',   # left single quotation mark
        'â€': '—'     # em dash remnant; must come last, as it is a prefix of the keys above
    }
    for old, new in replacements.items():
        text = text.replace(old, new)
    # Then, use unidecode to transliterate any remaining problematic Unicode characters
    text = unidecode(text)
    return text

# Load the CSV file with UTF-8 encoding from the root of your Jupyter project folder
df = pd.read_csv('pages.csv', encoding='utf-8')

# Clean the 'Title' column to remove unwanted symbols
df['Title'] = df['Title'].apply(clean_text)

# Set your OpenAI API key
openai.api_key = 'your-api-key-goes-here'

# Function to get embeddings (uses the legacy openai<1.0 SDK interface)
def get_embedding(text):
    response = openai.Embedding.create(input=[text], model="text-embedding-ada-002")
    return response['data'][0]['embedding']

# Generate embeddings for all titles
df['embedding'] = df['Title'].apply(get_embedding)

# Create a matrix of embeddings
embedding_matrix = np.vstack(df['embedding'].values)

# Compute cosine similarity matrix
similarity_matrix = cosine_similarity(embedding_matrix)

# Define similarity threshold
similarity_threshold = 0.9  # titles with cosine similarity >= 0.9 are grouped as duplicates

# Create a list to store groups
groups = []

# Keep track of visited indices
visited = set()

# Group similar titles based on the similarity matrix
for i in range(len(similarity_matrix)):
    if i not in visited:
        # Find all similar titles
        similar_indices = np.where(similarity_matrix[i] >= similarity_threshold)[0]
        
        # Log comparisons
        print(f"nChecking similarity for '{df.iloc[i]['Title']}' (Index {i}):")
        print("-" * 50)
        for j in range(len(similarity_matrix)):
            if i != j:  # Ensure that a title is not compared with itself
                similarity_value = similarity_matrix[i, j]
                comparison_result = 'greater' if similarity_value >= similarity_threshold else 'less'
                print(f"Compared with '{df.iloc[j]['Title']}' (Index {j}): similarity = {similarity_value:.4f} ({comparison_result} than threshold)")

        # Add these indices to visited
        visited.update(similar_indices)
        # Add the group to the list
        group = df.iloc[similar_indices][['URL', 'Title']].to_dict('records')
        groups.append(group)
        print(f"nFormed Group {len(groups)}:")
        for item in group:
            print(f"  - URL: {item['URL']}, Title: {item['Title']}")

# Check if groups were created
if not groups:
    print("No groups were created.")

# Define the output CSV file
output_file = 'grouped_pages.csv'

# Write the results to the CSV file with UTF-8 encoding
with open(output_file, 'w', newline='', encoding='utf-8') as csvfile:
    fieldnames = ['Group', 'URL', 'Title']
    writer = csv.DictWriter(csvfile, fieldnames=fieldnames)
    
    writer.writeheader()
    for group_index, group in enumerate(groups, start=1):
        for page in group:
            cleaned_title = clean_text(page['Title'])  # Ensure no unwanted symbols in the output
            writer.writerow({'Group': group_index, 'URL': page['URL'], 'Title': cleaned_title})
            print(f"Writing Group {group_index}, URL: {page['URL']}, Title: {cleaned_title}")

print(f"Output written to {output_file}")

This code reads a CSV file, ‘pages.csv,’ containing titles and URLs, which you can easily export from your CMS or get by crawling a client website using Screaming Frog.

Then, it cleans the titles from non-UTF characters, generates embedding vectors for each title using OpenAI’s API, calculates the similarity between the titles, groups similar titles together, and writes the grouped results to a new CSV file, ‘grouped_pages.csv.’

In the keyword cannibalization task, we use a similarity threshold of 0.9, which means if cosine similarity is less than 0.9, we will consider articles as different. To visualize this in a simplified two-dimensional space, it will appear as two vectors with an angle of approximately 25 degrees between them.
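You can confirm the angle figure directly: for unit vectors, a cosine similarity s corresponds to an angle of arccos(s). A minimal check:

```python
import numpy as np

def similarity_to_degrees(s: float) -> float:
    # For unit vectors, cosine similarity s corresponds to an angle of arccos(s).
    return float(np.degrees(np.arccos(s)))

print(round(similarity_to_degrees(0.9), 1))   # ≈ 25.8 degrees
print(round(similarity_to_degrees(0.85), 1))  # ≈ 31.8 degrees
```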


In your case, you may want to use a different threshold, like 0.85 (approximately 31 degrees between them), and run it on a sample of your data to evaluate the results and the overall quality of matches. If it is unsatisfactory, you can increase the threshold to make it more strict for better precision.

You can install ‘matplotlib’ via the terminal:

pip install matplotlib

And use the Python code below in a separate Jupyter notebook to visualize cosine similarities in two-dimensional space on your own. Try it; it’s fun!


import matplotlib.pyplot as plt
import numpy as np

# Define the angle for cosine similarity of 0.9. Change here to your desired value. 
theta = np.arccos(0.9)

# Define the vectors
u = np.array([1, 0])
v = np.array([np.cos(theta), np.sin(theta)])

# Define the 45 degree rotation matrix
rotation_matrix = np.array([
    [np.cos(np.pi/4), -np.sin(np.pi/4)],
    [np.sin(np.pi/4), np.cos(np.pi/4)]
])

# Apply the rotation to both vectors
u_rotated = np.dot(rotation_matrix, u)
v_rotated = np.dot(rotation_matrix, v)

# Plotting the vectors
plt.figure()
plt.quiver(0, 0, u_rotated[0], u_rotated[1], angles='xy', scale_units='xy', scale=1, color='r')
plt.quiver(0, 0, v_rotated[0], v_rotated[1], angles='xy', scale_units='xy', scale=1, color='b')

# Setting the plot limits to only positive ranges
plt.xlim(0, 1.5)
plt.ylim(0, 1.5)

# Adding labels and grid
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
plt.grid(True)
plt.title('Visualization of Vectors with Cosine Similarity of 0.9')

# Show the plot
plt.show()

I usually use 0.9 and higher for identifying keyword cannibalization issues, but you may need to lower it to around 0.5 when matching old articles for redirects, since an old article often has no nearly identical fresh counterpart, only a partially related one.

For redirects, it may also be better to embed the title concatenated with the meta description, rather than the title alone.

So, it depends on the task you are performing. We will review how to implement redirects in a separate article later in this series.
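As a sketch of that idea, the helper below builds the text to embed from the title plus the meta description when one is present. It assumes your export has a ‘Meta Description’ column; the column name is hypothetical, so adjust it to match your CSV:

```python
import pandas as pd

def build_embedding_text(row) -> str:
    # Concatenate title and meta description; fall back to the title alone
    # when the meta description is missing. 'Meta Description' is a
    # hypothetical column name -- adjust it to match your export.
    title = str(row["Title"]).strip()
    meta = row.get("Meta Description")
    meta = "" if pd.isna(meta) else str(meta).strip()
    return f"{title}. {meta}" if meta else title

df = pd.DataFrame({
    "Title": ["Meta Tags: What You Need To Know For SEO", "HTML Tag Basics"],
    "Meta Description": ["A practical guide to meta tags.", None],
})
df["embedding_text"] = df.apply(build_embedding_text, axis=1)
print(df["embedding_text"].tolist())
```

You would then pass df['embedding_text'] to get_embedding instead of df['Title'].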

Now, let’s review the results with the three models mentioned above and see how they were able to identify close articles from our data sample from Search Engine Journal’s articles.

Data sample

From the list, we already see that the 2nd and 4th articles cover the same topic on ‘meta tags.’ The articles in the 5th and 7th rows are pretty much the same – discussing the importance of H1 tags in SEO – and can be merged.

The article in the 3rd row doesn’t have any similarities with any of the articles in the list but has common words like “Tag” or “SEO.”

The article in the 6th row is again about H1, but not exactly the same as H1’s importance to SEO. Instead, it represents Google’s opinion on whether they should match.

Articles on the 8th and 9th rows are quite close but still different; they can be combined.

text-embedding-ada-002

By using ‘text-embedding-ada-002,’ we precisely found the 2nd and 4th articles with a cosine similarity of 0.92 and the 5th and 7th articles with a similarity of 0.91.

Screenshot from Jupyter log showing cosine similarities

And it generated output with grouped URLs, using the same group number for similar articles (colors are applied manually for visualization purposes).

Output sheet with grouped URLs

For the 2nd and 3rd articles, which have common words “Tag” and “SEO” but are unrelated, the cosine similarity was 0.86. This shows why a high similarity threshold of 0.9 or greater is necessary. If we set it to 0.85, it would be full of false positives and could suggest merging unrelated articles.
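To see the effect of the threshold concretely, here is a toy filter over the similarity values reported above (the pair labels are shorthand for the articles, not real data):

```python
# Pairwise cosine similarities reported above (labels are shorthand).
pairs = {
    ("article 2", "article 4"): 0.92,  # same topic: meta tags
    ("article 5", "article 7"): 0.91,  # same topic: H1 tags
    ("article 2", "article 3"): 0.86,  # unrelated; only share words like "Tag" and "SEO"
}

def merge_candidates(pairs, threshold):
    # Return the pairs whose similarity meets or exceeds the threshold.
    return [pair for pair, sim in pairs.items() if sim >= threshold]

print(merge_candidates(pairs, 0.9))   # only the two true duplicates
print(merge_candidates(pairs, 0.85))  # also pulls in the unrelated pair
```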

text-embedding-3-small

Quite surprisingly, ‘text-embedding-3-small’ didn’t find any matches at our similarity threshold of 0.9 or higher.

For the 2nd and 4th articles, cosine similarity was 0.76, and for the 5th and 7th articles it was 0.77.

To better understand this model through experimentation, I’ve added a slightly modified version of the 1st row with ’15’ vs. ’14’ to the sample.

  1. “14 Most Important Meta And HTML Tags You Need To Know For SEO”
  2. “15 Most Important Meta And HTML Tags You Need To Know For SEO”
An example which shows text-embedding-3-small results

By contrast, ‘text-embedding-ada-002’ gave a cosine similarity of 0.98 between those versions.

| Title 1 | Title 2 | Cosine Similarity |
| --- | --- | --- |
| 14 Most Important Meta And HTML Tags You Need To Know For SEO | 15 Most Important Meta And HTML Tags You Need To Know For SEO | 0.92 |
| 14 Most Important Meta And HTML Tags You Need To Know For SEO | Meta Tags: What You Need To Know For SEO | 0.76 |

Here, we see that this model is not quite a good fit for comparing titles.

text-embedding-3-large

This model’s dimensionality is 3072, twice that of ‘text-embedding-3-small’ and ‘text-embedding-ada-002’, which both have 1536 dimensions.

As it has more dimensions than the other models, we could expect it to capture semantic meaning with higher precision.

However, it gave the 2nd and 4th articles cosine similarity of 0.70 and the 5th and 7th articles similarity of 0.75.

I’ve tested it again with slightly modified versions of the first article with ’15’ vs. ’14’ and without ‘Most Important’ in the title.

  1. “14 Most Important Meta And HTML Tags You Need To Know For SEO”
  2. “15 Most Important Meta And HTML Tags You Need To Know For SEO”
  3. “14 Meta And HTML Tags You Need To Know For SEO”
| Title 1 | Title 2 | Cosine Similarity |
| --- | --- | --- |
| 14 Most Important Meta And HTML Tags You Need To Know For SEO | 15 Most Important Meta And HTML Tags You Need To Know For SEO | 0.95 |
| 14 Most Important Meta And HTML Tags You Need To Know For SEO | 14 Meta And HTML Tags You Need To Know For SEO | 0.93 |
| 14 Most Important Meta And HTML Tags You Need To Know For SEO | Meta Tags: What You Need To Know For SEO | 0.70 |
| 15 Most Important Meta And HTML Tags You Need To Know For SEO | 14 Meta And HTML Tags You Need To Know For SEO | 0.86 |

So we can see that ‘text-embedding-3-large’ is underperforming compared to ‘text-embedding-ada-002’ when we calculate cosine similarities between titles.

I want to note that the accuracy of ‘text-embedding-3-large’ increases with the length of the text, but ‘text-embedding-ada-002’ still performs better overall.

Another approach could be to strip away stop words from the text. Removing these can sometimes help focus the embeddings on more meaningful words, potentially improving the accuracy of tasks like similarity calculations.

The best way to determine whether removing stop words improves accuracy for your specific task and dataset is to empirically test both approaches and compare the results.
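A minimal sketch of that experiment, using a small hand-picked stop-word list (a hypothetical subset; libraries such as NLTK ship fuller lists):

```python
# Hand-picked subset of English stop words; not a standard list.
STOP_WORDS = {"a", "an", "and", "the", "to", "for", "you", "of", "in", "on", "with"}

def remove_stop_words(text: str) -> str:
    # Drop stop words (case-insensitively) and rejoin the remaining words.
    return " ".join(w for w in text.split() if w.lower() not in STOP_WORDS)

print(remove_stop_words("14 Most Important Meta And HTML Tags You Need To Know For SEO"))
# → 14 Most Important Meta HTML Tags Need Know SEO
```

Embed both the raw and the stripped titles on a sample, then compare which version separates your known duplicates from your known non-duplicates more cleanly.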

Conclusion

With these examples, you have learned how to work with OpenAI’s embedding models and can already perform a wide range of tasks.

For similarity thresholds, you need to experiment with your own datasets and see which thresholds make sense for your specific task by running it on smaller samples of data and performing a human review of the output.

Please note that the code in this article is not optimal for large datasets, since you would need to regenerate text embeddings every time your dataset changes in order to evaluate new rows against the rest.

To make it efficient, we must use vector databases and store embedding information there once generated. We will cover how to use vector databases very soon and change the code sample here to use a vector database.

More resources: 


Featured Image: BestForBest/Shutterstock