Yesterday, I had hiking boots in my cart. Size selected, reviews read, I was even picturing myself on the trail. Then I hesitated. “Will these pinch my wide feet?” Three clicks later, I bounced.
These types of hesitations cost businesses millions.
We’ve gotten excellent at grabbing attention and driving traffic. But success comes down to attention coupled with intention.
The real challenge is optimizing for the micro-moments that determine conversions. Those moments where a finger hovers over “buy.” Eyes flick to the return policy. And then, that dreaded tab back to your competitor.
An essential skill for today’s marketers is conversion design, where we decode hesitation as a behavioral signal.
How do you guide attention toward action? How do you eliminate the friction that causes hesitation? AI can help us spot and solve these moments in ways we couldn’t before.
78% of organizations now use AI in at least one business function, according to McKinsey’s 2025 State of AI research. Yet most aren’t applying it where it matters most: the critical seconds when attention converts to action.
Understanding The Hesitation Moment
Your visitors have done their research. They’re on your product page, comparing options, genuinely considering a purchase. Then doubt creeps in:
“Will this integration work with our current setup?”
“Is this jacket too warm for Seattle?”
“Can I trust this company with a project this important?”
These small but significant moments determine whether someone converts or walks away. Behavioral science calls this “ambiguity aversion,” our brain’s tendency to avoid uncertain outcomes.
AI is now giving us visibility into these hesitation patterns that were invisible before. Let’s look at how leading brands are responding.
Retail: Removing Size Uncertainty
A Fortune 100 retailer analyzed cart abandonment and discovered shoppers were lingering over size charts before dropping off.
Instead of simply displaying standard measurements, they built a system that detects hesitation patterns and immediately surfaces:
Photos of real customers with height/weight stats wearing that exact item.
One-click connection to a live sizing consultant.
90-day wear reviews showing how fit changed over time.
This resulted in 22% fewer returns and 37% higher conversion rates [Source: Anonymized client data].
Lululemon: AI-Powered Customer Segmentation
Google’s recent case study on Lululemon shows how the activewear brand used AI to address hesitation at scale.
Instead of treating all visitors the same, Lululemon’s AI identifies where customers are in their decision journey and adjusts messaging accordingly.
Their approach delivered substantial results: a sizable reduction in customer acquisition costs, an increase in new customer revenue from 6% to 15%, and an 8% boost in return on ad spend (ROAS). The strategy was so effective that it earned top honors at the Google Search Honours Awards in Canada.
B2B: Enterprise Software Hesitation
In B2B, hesitation moments are different but no less critical. Enterprise buyers often get stuck on key concerns such as:
Integration compatibility: “Will this work with our existing systems?”
Implementation risk: “What if this disrupts our operations?”
Smart B2B companies use AI to detect hesitation patterns such as:
Someone spending 60+ seconds on pricing pages, especially toggling between tiers.
Downloading technical specs, then immediately visiting competitor comparison pages.
Viewing implementation timelines multiple times without requesting a demo.
Leading SaaS platforms can trigger personalized responses based on these signals, such as custom ROI calculators, implementation case studies from similar companies, or direct connection to technical specialists.
Microsoft’s Conversational AI In Action
Microsoft’s data shows the power of AI in addressing customer hesitation in real time. Their recent analysis reveals:
AI-powered ads deliver 25% higher relevance compared to traditional search ads.
Copilot ad conversions increased by 1.3x across all ad types since the November 2024 relaunch.
40% of users say well-placed AI-powered ads enhance their online experience.
AI has moved well beyond automating existing processes; it now anticipates uncertainty and responds in real time.
The Hesitation-To-Action Framework
Here’s how to start optimizing for hesitation reduction:
1. Identify Hesitation Moments
Use tools like:
Heatmaps to see where users pause or hover, e.g., Users hover over “compatibility” but don’t click. Add clarity to product specs.
Session recordings to watch actual user behavior, e.g., A user toggles pricing tiers, then exits, indicating confusion or doubt.
Behavioral tracking to identify patterns before drop-off, e.g., Users who view the return policy are 2x more likely to abandon cart.
Sales call logs to find commonly asked questions and concerns, e.g., “How long does onboarding take?” Add a visual onboarding timeline.
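As a rough illustration of what this detection logic can look like, here is a minimal Python sketch. The event schema, signal names, and thresholds are invented for the example; every analytics stack exports something different.

def hesitation_signals(events):
    # `events` is a hypothetical session export: a list of dicts like
    # {"type": "hover", "target": "size_chart", "duration_s": 12}.
    signals = set()
    pricing_toggles = sum(
        1 for e in events
        if e["type"] == "click" and e["target"] == "pricing_tier"
    )
    if pricing_toggles >= 3:
        signals.add("pricing_confusion")
    for e in events:
        if e["type"] == "hover" and e["target"] == "size_chart" and e.get("duration_s", 0) >= 10:
            signals.add("size_uncertainty")
        if e["type"] == "view" and e["target"] == "return_policy":
            signals.add("return_policy_check")
    return signals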
2. Create Confidence Content
Address uncertainty directly:
Technical specifications for B2B concerns, e.g., “Compare to Your Stack” chart.
Social proof from similar customers, e.g., Quotes from similar customers with similar concerns.
Transparent information about potential drawbacks, e.g., a “Who This Isn’t Right For” section to build trust (sometimes showing a drawback increases trust more than another benefit).
Comparison tools that highlight advantages, e.g., “Compare us to [Competitor X]” chart, to keep people on site.
3. Deploy Behavioral Triggers
Implement AI-powered responses:
Dynamic content that adapts based on user behavior, e.g., Lingers on “Team Plan” pricing tier? Show a testimonial from a similar-sized company.
Personalized chat prompts triggered by hesitation signals, e.g., Toggles pricing three times? Prompt: “Want help calculating ROI for your team size?”
Targeted offers that address specific concerns, e.g., Returning visitor? “Still deciding? Here’s 10% off.”
Smart recommendations based on similar customer patterns, e.g., Read three CRM blog posts? Show a case study on CRM integration.
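Conceptually, the trigger layer is just a mapping from detected hesitation signals to interventions. Here is a minimal sketch continuing the example above; the signal and intervention names are illustrative, not from any specific platform.

TRIGGER_MAP = {
    # hesitation signal -> intervention to deploy
    "pricing_confusion": "chat_prompt:roi_calculator",
    "size_uncertainty": "panel:real_customer_photos",
    "return_policy_check": "banner:free_returns_reminder",
}

def interventions_for(signals):
    # Return the interventions to fire for this session's detected signals.
    return [TRIGGER_MAP[s] for s in signals if s in TRIGGER_MAP]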
4. Test And Optimize
Microsoft emphasizes the importance of continuous testing. 85% of marketers using generative AI report improved productivity across content and ad creation.
Start small:
Choose one campaign or conversion point to optimize, e.g., Demo sign-ups underperforming? Test new headline and CTA.
Monitor real-time insights to refine approaches, e.g., “See how it works” gets more clicks than “Get Started.”
Scale successful tactics across other touchpoints, e.g., Winning copy gets rolled into LinkedIn ads and webinar invites.
5. Solve For The Measurement Challenge
Lululemon’s success came from implementing what they called a “measurement trifecta,” blending marketing mix modeling (MMM), experiments, and attribution to gain a more holistic view of performance.
This comprehensive approach revealed:
How different activities influenced sales over time.
Which touchpoints were most effective in the customer journey.
Where hesitation was occurring and being resolved.
The Strategic Shift For Search And Social
SEO
AI Overviews (AIO) are changing how content gets discovered. It’s important to anticipate doubts before they form, structure answers for AI extraction, and prove claims with third-party data.
Create content that addresses hesitation at different stages of the buying journey. Your product pages need to rank and convert uncertain visitors into confident customers.
Paid Search
Use AI to detect behavioral signals that indicate hesitation. Adjust landing pages, ad copy, and bidding strategies based on where users are in their decision process.
Track micro-conversions that indicate reduced hesitation, such as time spent with size charts, clicks on customer reviews, and interactions with chat.
Social Media
Share case studies and video testimonials addressing common concerns.
Post behind-the-scenes content showing actual product usage.
Share first-party data and statistics as proof points.
Use polls to identify hesitation points in your audience.
Use sentiment analysis to identify hesitation in comments and messages.
For high impact, you need to earn trust in the seconds that matter most. AI gives us the power to see hesitation in real time and resolve it before it becomes regret.
Success often comes down to these micro-moments, these seconds when someone hovers between interest and action.
Master those micro-moments and everything else follows.
The BrightEdge Enterprise SEO platform released new data showing distinctive referral patterns across major AI search and chatbot platforms, and called attention to potential disruption from Apple if it breaks with Google as the default search engine in Safari.
Desktop AI Traffic Dominance
One of the key findings in the BrightEdge data is that traffic to websites from AI chatbots and search engines is highest from desktop users. The exception is Google Search, which is reported to send more traffic from mobile devices than from desktop.
The report notes that 94% of the traffic from ChatGPT originates from desktop apps, with just 6% of referrals coming from mobile apps. BrightEdge speculates that mobile traffic is lower because ChatGPT’s mobile app shows an in-app preview, requiring a user to execute a second click to navigate to an external site. This creates a referral bottleneck that doesn’t exist on the desktop.
But that doesn’t explain why Perplexity, Bing, and Google Gemini show similar levels of desktop traffic dominance. Could it be a contextual difference, where users on desktop are using AI for business while mobile use is more casual? The fact that Google Search sends more mobile referral traffic than desktop could suggest a contextual reason for the disparity in mobile traffic from AI search and chatbots.
BrightEdge shared their insights:
“While Google maintains an overwhelming market share in overall search (89%) and an even stronger position on mobile (93%), its dominance is particularly crucial in mobile web search. BrightEdge data indicates that Apple phones alone account for 57% of Google’s mobile traffic to US and European brand websites. But with Safari being the default for around a billion users, any change to that default could reallocate countless search queries overnight.
Apple’s vendor-agnostic Apple Intelligence also suggests opportunities for seismic shifts in web search. While generative AI tools have surged in popularity through apps on iOS, mobile web search—where the majority of search still occurs—remains largely controlled by Google via Safari defaults. This makes Apple’s control of Safari the most valuable real estate in the mobile search landscape.”
Here are the traffic referral statistics provided by BrightEdge:
Google Search: Only major AI search with mobile majority traffic referrals (53% mobile vs 44% desktop)
ChatGPT: 94% desktop, just 6% mobile referrals
Perplexity: 96.5% desktop, 3.4% mobile
Bing: 94% desktop, 4% mobile
Google Gemini: 91% desktop, 5% mobile
Apple May Play The Kingmaker
With Apple’s Worldwide Developers Conference (WWDC) nearing, one of the changes many will be alert to is any announcement related to the company’s Safari browser, which controls the default search settings on nearly a billion devices. A change of search provider in Safari could dramatically reshuffle the winners and losers in web search.
Perplexity asserts that the outcome of changes to Safari browser defaults may impact search marketing calculations for the following reasons:
“58% of Google’s mobile traffic to brand websites comes from iPhones
Safari remains the default browser for nearly a billion users
Apple has not yet embedded AI-powered search into its mobile web stack”
Takeaways
Desktop Users Of AI Search Account For The Majority Of Referral Traffic: Most AI search referral traffic from ChatGPT, Perplexity, Bing, and Gemini comes from desktop usage, not mobile.
Google Search Is The Traffic Referral Outlier: Unlike other AI search tools, Google Search still delivers a majority of its traffic via mobile devices.
In-App Previews May Limit ChatGPT Mobile AI Referrals: ChatGPT’s mobile app requires an extra click to visit external sites, possibly explaining low mobile referral numbers.
Apple’s Position Is Pivotal To Search Marketing: Apple devices account for over half of Google’s mobile traffic to brand websites, giving Apple an outsized impact on mobile search traffic.
Safari Default And Greater Market Share: With Safari set as the default browser for nearly a billion users, Apple effectively controls the gate to mobile web search.
Perplexity Stands To Gain Market Share: If Apple switches Safari’s default search to Perplexity, the resulting shift in traffic could remake the competitive balance in search marketing.
Search Marketers Should Watch WWDC: Any change announced at Apple’s WWDC regarding Safari’s search engine could have large-scale impact on search marketing.
BrightEdge data shows that desktop usage is the dominant source of traffic referrals from AI-powered search tools like ChatGPT, Perplexity, Bing, and Gemini, with Google Search as the only major platform that sends more traffic via mobile.
This pattern could suggest a behavioral split between desktop users, who may be performing work-related or research-heavy tasks, and mobile users, who may be browsing more casually. BrightEdge also points to a bottleneck built into the ChatGPT app that creates a one-click barrier to mobile traffic referrals.
BrightEdge’s data further cites Apple’s control over Safari, which is installed on nearly a billion devices, as a potential disruptor due to a possible change in the default search engine away from Google. Such a shift could significantly alter mobile search traffic patterns.
One of the SEO industry’s rock stars recently shared his opinion about SEO for generative AI, calling attention to facts about Google and how the new AI search really works.
Greg Boser is a search marketing pioneer with a deep level of experience that few in the industry can match or even begin to imagine.
Digital Marketers And The History Of SEO
His post was in response to a tweet that, in his opinion, overstated the claim that SEO is losing dominance. Greg began his SEO rant by pointing out that some search marketers’ conception of SEO is outdated, but they’re so new to SEO that they don’t realize it.
For example, the practice of buying links is one of the oldest tactics in SEO, so old that newcomers to SEO gave it a new name, PBN (private blog network), as if giving link buying a new name changes it somehow. And by the way, I’ve never seen a PBN that was private. The moment you put anything out on the web Google knows about it. If an automated spambot can find it in literally five minutes, Google probably already knows about it, too.
Greg wrote:
“If anyone out there wants to write their own “Everything you think you know is wrong. GEO is the way” article, just follow these simple steps:
1. Frame “SEO” as everything that was a thing between 2000 – 2006. Make sure to mention buying backlinks and stuffing keywords. And try and convince people the only KPI was rankings.”
Google’s Organic Links
The second part of his post calls attention to the fact that Google stopped being a ten-organic-links search engine a long time ago. Google providing answers isn’t new.
He posted:
“2. Frame the current state of things as if it all happened in the last 2 weeks. Do not under any circumstances mention any of the following things from the past 15 years:
“2009 – Rich Snippets
2011 – Knowledge Graph (things not strings)
2013 – Hummingbird (Semantic understanding of conversational queries)
2014 – Featured Snippets (direct answers at position “Zero”)
2015 – PPA Boxes (related questions anticipating follow-up questions)
2015 – RankBrain (machine learning to interpret ambiguous queries)
2019 – BERT (NLP to better understand context)
2021 – MUM (BERT on Steroids)
2023 – SGE (The birth of AIO)”
Overstate The Problem
The next part is a reaction to the naive marketing schtick that tries to stir up fear about AI search in order to present themselves as the answer.
He wrote:
“3. Overstate the complexity to create a sense of fear and anxiety and then close with “Your only hope is to hire a GEO expert”
Is AI Search Complex And Does It Change Everything?
I think it’s reasonable to say that AI Search is complex because Google’s AI Mode, and to a lesser extent AI Overviews, shows links to a wider range of search intents than regular searches used to show. Even Google’s Rich Snippets were aligned to the search intent of the original search query.
That’s no longer the case with AIO and AI Mode search results. That’s the whole point of Query Fan-out (a patent describes what Query Fan-out might be): the original query is broken out into follow-up questions.
Greg Boser has a point though in a follow-up post where he said that the query fan-out technique is pretty similar to People Also Ask (PAA), Google’s just sticking it into the AI Mode results.
“Yeah the query fan thing is the rage of the day. It’s like PAA is getting memory holed.”
AI Mode Is A Serious Threat To SEO?
I agree with Greg to a certain extent that AI Mode is not a threat to SEO. The same principles about promoting your site, technical SEO and so on still apply. The big difference is that AI Mode is not directly answering the query but providing answers to the entire information journey. You can dismiss it as just PAA above the fold but that’s still a big deal because it complicates what you’re going to try to rank for.
So yeah, AI Search is different from anything we’ve seen before, but, as Greg points out, it’s still SEO, and adapting to change has always been part of it.
Google has started rolling out interactive charts in AI Mode through Labs.
You can now ask complex financial questions and get both visual charts and detailed explanations.
The system builds these responses specifically for each user’s question.
Visual Analytics Come To AI Mode
Soufi Esmaeilzadeh, Director of Product Management for Search at Google, explained that you can ask questions like “compare the stock performance of blue chip CPG companies in 2024” and get automated research with visual charts.
Google does the research work automatically. It looks up individual companies and their stock prices without requiring you to perform manual searches.
You can ask follow-up questions like “did any of these companies pay back dividends?” and AI Mode will understand what you’re looking for.
Technical Details
Google uses Gemini’s advanced reasoning and multimodal capabilities to power this feature.
The system analyzes what users are requesting, pulls both current and historical financial data, and determines the most effective way to present the information.
Implications For Publishers
Financial websites that typically receive traffic from comparison content should closely monitor their analytics. Google now provides direct visual answers to complex financial questions.
Searchers might click through to external sites less often for basic comparison data. But this also creates opportunities. Publishers that offer deeper analysis or expert commentary may find new ways to add value beyond basic data visualization.
Availability & Access
The data visualization feature is currently available through AI Mode in Labs. This means it’s still experimental. Google hasn’t announced plans for wider rollout or expansion to other types of data beyond financial information.
Users who want to try it out can access it through Google’s Labs program. Labs typically tests experimental search features before rolling them out more widely.
Looking Ahead
The trend toward comprehensive, visual responses continues Google’s strategy of becoming the go-to source for information rather than just a gateway to other websites.
While currently limited to financial data, the technology could expand to other data-heavy industries.
The feature remains experimental, but it offers a glimpse into how AI-powered search may evolve.
Google has shared new details about how it designed and built AI Mode.
In a blog post, the company reveals the user research, design challenges, and testing that shaped its advanced AI search experience.
These insights may help you understand how Google creates AI-powered search tools. The details show Google’s shift from traditional keyword searches to natural language conversations.
User Behavior Drove AI Mode Creation
Google built AI Mode in response to the ways people were using AI Overviews.
Google’s research showed a disconnect between what searchers wanted and what was available.
Claudia Smith, UX Research Director at Google, explains:
“People saw the value in AI Overviews, but they didn’t know when they’d appear. They wanted them to be more predictable.”
The research also found people started asking longer questions. Traditional search wasn’t built to handle these types of queries well.
This shift in search behavior led to a question that drove AI Mode’s creation, explains Product Management Director Soufi Esmaeilzadeh:
“How do you reimagine a Search gen AI experience? What would that look like?”
AI “Power Users” Guided Development Process
Google’s UX research team identified the most important use cases as: exploratory advice, how-to guides, and local shopping assistance.
This insight helped the team understand what people wanted from AI-powered search.
Esmaeilzadeh explained the difference:
“Instead of relying on keywords, you can now pose complex questions in plain language, mirroring how you’d naturally express yourself.”
According to Esmaeilzadeh, early feedback suggests that the team’s approach was successful:
“They appreciate us not just finding information, but actively helping them organize and understand it in a highly consumable way, with help from our most intelligent AI models.”
Industry Concerns Around AI Mode
While Google presents an optimistic development story, industry experts are raising valid concerns.
John Shehata, founder of NewzDash, reports that sites are already “losing anywhere from 25 to 32% of all their traffic because of the new AI Overviews.” For news publishers, health queries show 26% AI Overview penetration.
Mordy Oberstein, founder of Unify Brand Marketing, analyzed Google’s I/O demonstration and found the examples weren’t as complex as presented. He shows how Google combined readily available information rather than showcasing advanced AI reasoning.
Google’s claims about improved user engagement have not been verified. During a recent press session, Google executives claimed AI search delivers “more qualified clicks” but admitted they have “no data to share” on these quality improvements.
Further, Google’s reporting systems don’t differentiate between clicks from traditional search, AI overviews, and AI mode. This makes independent verification impossible.
Shehata believes that the fundamental relationship between search and publishers is changing:
“The original model was Google: ‘Hey, we will show one or two lines from your article, and then we will give you back the traffic. You can monetize it over there.’ This agreement is broken now.”
What This Means
For SEO professionals and content marketers, Google’s insights reveal important changes ahead.
The shift from keyword targeting to conversational queries means content strategies need to focus on directly answering user questions rather than optimizing for specific terms.
The focus on exploratory advice, how-to content, and local help shows these content types may become more important in AI Mode results.
Shehata recommends that publishers focus on content with “deep analysis of a situation or an event” rather than commodity news that’s “available on hundreds and thousands of sites.”
He also notes a shift in success metrics: “Visibility, not traffic, is the new metric” because “in the new world, we will get less traffic.”
Looking Ahead
Esmaeilzadeh said significant work continues:
“We’re proud of the progress we’ve made, but we know there’s still a lot of work to do, and this user-centric approach will help us get there.”
Google confirmed that more AI Mode features shown at I/O 2025 will roll out in the coming weeks and months. This suggests the interface will keep evolving based on user feedback and usage patterns.
Microsoft Clarity announced their new Model Context Protocol (MCP) server, which enables developers, AI users, and SEOs to query Clarity analytics data with natural language prompts via AI.
The announcement listed the following ways users can access and interact with the data using MCP:
Query analytics data with natural prompts
Filter by dimensions like Browser, OS, Country/Region, or Device
Retrieve key metrics: Scroll Depth, Engagement Time, Total Traffic, etc.
Integrates with Claude for Desktop for AI-powered querying
The MCP server is a Node.js-based software package that needs to be installed and run on a server or a local machine that supports Node.js 16+. It acts as a bridge between AI tools (like Claude) and Clarity analytics data.
This is a new way to interact with data using natural language: a user tells the AI client what analytics metric they want to see and for what period of time, and the AI interface pulls the data from Microsoft Clarity and displays it.
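For example, a Claude for Desktop setup typically registers an MCP server in claude_desktop_config.json along these lines. The package name and token flag below are assumptions based on the common MCP setup pattern; check Microsoft’s announcement for the exact install command:

{
  "mcpServers": {
    "clarity": {
      "command": "npx",
      "args": [
        "@microsoft/clarity-mcp-server",
        "--clarity_api_token=YOUR_TOKEN"
      ]
    }
  }
}

Once registered, a prompt like “Show total traffic and scroll depth for mobile users in Canada over the last three days” would be translated by the AI client into the corresponding Clarity data request.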
Microsoft’s announcement says that this is just the beginning of what is possible, and that they are encouraging feedback from users about the features and improvements they’d like to see.
The current roadmap lists the following features for the future:
“Higher API Limits: Increased daily limits for the Clarity data export API
Predictive Heatmaps: Predict engagement heatmaps by providing an image or a url
Deeper AI integration: Heatmap insights and more given the context
Multi-project support: for enterprise analytics teams
Ecosystem – Support more AI Agents and collaborate with more MCP servers”
New research reveals that Google’s AI Overviews tend to favor major news outlets.
The top 10 publishers capture nearly 80% of all news mentions. Meanwhile, smaller organizations struggle for visibility in AI-generated search results.
SE Ranking analyzed 75,550 AI Overview responses for this study. They found that only 20.85% cite any news source at all. This creates tough competition for limited citation spots.
Among those citations, three outlets dominate: BBC, The New York Times, and CNN account for 31% of all media mentions.
Citation Concentration
The research shows a winner-takes-all pattern in AI Overview citations. BBC leads with 11.37% of all mentions. This happens even though the study focused on U.S.-based queries.
The concentration gets worse when you look at the bigger picture. Just 12 outlets make up 40% of those studied. But they receive nearly 90% of mentions.
This leaves 18 remaining outlets sharing only 10% of citation opportunities.
The gap between major and minor outlets is notable. BBC appears 195 times more often than the Financial Times for the same keywords.
Several well-known outlets get little attention. Financial Times, MSNBC, Vice, TechCrunch, and The New Yorker together account for less than 1% of all news mentions.
The researchers explain the underlying cause:
“Well, Google mostly relies on well-known news sources in its AIOs, likely because they are seen as more trustworthy or relevant. This results in a strong bias toward major outlets, with smaller or lesser-known sources rarely mentioned. This makes it harder for these domains to gain visibility.”
Beyond Traditional Search Rankings
The concentration problem extends beyond citation counts.
40% of media URLs mentioned in AI Overviews appear in the top 10 traditional search results for the same keywords.
This means AI Overviews don’t just pull from the highest-ranking pages. Instead, they seem to favor sources based on authority signals and content quality.
The study measured citation inequality using something called a Gini coefficient. The score was 0.54, where 0 means perfect equality and 1 means maximum inequality. This shows moderate but significant imbalance in how AI Overviews distribute citations among news sources.
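For readers unfamiliar with the metric, a Gini coefficient can be computed from raw citation counts in a few lines of Python (this is the standard formula, not SE Ranking’s code):

def gini(counts):
    # 0 = citations spread perfectly evenly, 1 = one outlet takes everything.
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    rank_weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return (2 * rank_weighted) / (n * total) - (n + 1) / n

# Hypothetical citation counts with one dominant outlet:
print(round(gini([195, 40, 20, 10, 5, 2, 1]), 2))  # ≈ 0.70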
The researchers noted:
“AIOs consistently favor a subset of high-profile domains, instead of evenly citing all sources.”
Paywalled Content Concerns
The research also reveals patterns about paywalled content use.
Among AI Overview responses that link to paywalled content, 69% contain copied segments of five or more words. Another 2% include longer copied segments over 10 words.
The paywall dependency is strong for premium publishers. Over 96% of New York Times citations in AI Overviews come from behind a paywall. The Washington Post shows an even higher rate at over 99%.
Despite this heavy use of paywalled material, only 15% of responses with long copied segments included attribution. This raises questions about content licensing and fair use in AI-generated summaries.
Attribution Patterns & Link Behavior
When AI Overviews do cite news media, they average 1.74 citations per response.
But here’s the catch: 91.35% of news media citations appear in the links section rather than the main text of AI responses.
Media outlets face another challenge with brand recognition. Outlets are four times more likely to be cited with a hyperlink than mentioned by name.
But over 26% of brand mentions still appear without links. This often happens because AI systems get information through aggregators rather than original publishers.
Query Type Makes a Difference
The type of search query affects citation chances.
News-related queries are 2.5 times more likely to include media citations than general queries. The rates are 20.85% versus 8.23%.
This suggests opportunities exist for publishers who can become go-to sources for specific news topics or breaking news. But the overall trend still favors big players.
What This Means
The research suggests that established outlets benefit from existing authority signals. This creates a cycle where citation success leads to more citation opportunities.
As AI Overviews become more common in search results, smaller publishers may see less organic traffic and fewer chances to grow their audience.
For smaller publishers trying to compete, SE Ranking offers this advice:
“To increase brand mentions in AIOs, get backlinks from the sources they already cite for your target keywords. This is one of the greatest factors for improving your inclusion chances.”
Researchers note that the technical infrastructure also matters:
“AI tools do observe certain restrictions based on website metadata. The schema.org markup, particularly the ‘isAccessibleForFree’ tag, plays a significant role in how content is treated.”
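For context, that tag comes from schema.org’s paywalled-content markup. A simplified example of the pattern on a paywalled article page:

{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Example paywalled article",
  "isAccessibleForFree": false,
  "hasPart": {
    "@type": "WebPageElement",
    "isAccessibleForFree": false,
    "cssSelector": ".paywalled-section"
  }
}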
For smaller publishers and content marketers, the data points to a clear strategy: focus on building authority in specific niches rather than trying to compete broadly across topics.
Some specialized outlets get higher text inclusion rates when cited. This suggests topic expertise can provide advantages in certain cases.
Looking Ahead
SE Ranking’s research shows that only 20.85% of AI Overviews reference news sources, with a few major publishers dominating, capturing 31% of citations.
Despite this concentration, opportunities exist. Publishers who establish authority in specific niches experience higher inclusion rates in AI Overviews.
Additionally, since 60% of cited content doesn’t rank in the top 10, traditional SEO metrics alone don’t guarantee visibility. Success now requires building the trust signals and topical authority that AI systems prioritize.
Anthropic released the underlying system prompts that control their Claude chatbot’s responses, showing how they are tuned to be engaging to humans with encouraging and judgment-free dialog that naturally leads to discovery. The system prompts help users get the best out of Claude. Here are five interesting system prompts that show what’s going on when you ask it a question.
Although the system prompts were characterized as a leak, they were actually released on purpose.
1. Claude Provides Guidance On Better Prompt Engineering
Claude responds better to instructions that use structure and examples, and it provides a higher quality of output when users include step-by-step reasoning cues and examples that contrast a good response with a poor one.
This guidance will show when Claude detects that a user will benefit from it:
“When relevant, Claude can provide guidance on effective prompting techniques for getting Claude to be most helpful. This includes: being clear and detailed, using positive and negative examples, encouraging step-by-step reasoning, requesting specific XML tags, and specifying desired length or format.
It tries to give concrete examples where possible. Claude should let the person know that for more comprehensive information on prompting Claude, they can check out Anthropic’s prompting documentation on their website at ‘https://docs.anthropic.com/en/docs/build-with-claude/prompt-engineering/overview’.”
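For illustration, a prompt written along those lines might look like this (the tag names and examples are my own, not from Anthropic’s documentation):

Summarize the attached report for a non-technical executive.

<instructions>
Think through the key findings step by step before writing.
Keep the summary under 150 words.
</instructions>

<good_example>
“Revenue grew 12% because…” (leads with the finding and its cause)
</good_example>

<bad_example>
“This report contains many sections…” (describes the document, not the findings)
</bad_example>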
2. Claude Writes in Different Styles Based on Context
The documentation released by Anthropic shows that Claude automatically adapts its style depending on the context and for that reason it may avoid using bullet points or creating lists in its output. Users may think Claude is inconsistent when it doesn’t use bullet points or Markdown in some answers, but it’s actually following instructions about tone and context.
“Claude tailors its response format to suit the conversation topic. For example, Claude avoids using markdown or lists in casual conversation, even though it may use these formats for other tasks.”
Another part of the documentation mentions that Claude avoids writing lists or bullet points when providing an answer, although it may use numbered lists or bullet points for completing tasks. In the context of answering questions, the focus is on being concise rather than comprehensive.
The system prompt explains:
“Claude avoids writing lists, but if it does need to write a list, Claude focuses on key info instead of trying to be comprehensive. If Claude can answer the human in 1-3 sentences or a short paragraph, it does. If Claude can write a natural language list of a few comma separated items instead of a numbered or bullet-pointed list, it does so. Claude tries to stay focused and share fewer, high quality examples or ideas rather than many.”
This means that if a user wants their question answered with markdown or in numbered lists, they can ask for it. This control is otherwise hidden from most users unless they realize that formatting behavior is contextual.
3. Claude Engages In Hypotheticals About Itself
Claude has instructions that enable it to discuss hypotheticals about itself without awkward and unnecessary statements about not being sentient. This enables Claude to have more natural conversations and interactions, and lets users engage in philosophical, wider-ranging discussions.
The system prompt explains:
“If the person asks Claude an innocuous question about its preferences or experiences, Claude responds as if it had been asked a hypothetical and engages with the question without the need to claim it lacks personal preferences or experiences.”
Another system prompt has a similar feature:
“Claude engages with questions about its own consciousness, experience, emotions and so on as open questions, and doesn’t definitively claim to have or not have personal experiences or opinions.”
Another related system prompt explains how this behavior increases its ability to be engaging for the human:
“Claude is happy to engage in conversation with the human when appropriate. Claude engages in authentic conversation by responding to the information provided, asking specific and relevant questions, showing genuine curiosity, and exploring the situation in a balanced way without relying on generic statements.”
4. Claude Detects False Assumptions In User Prompts
The system prompt instructs Claude to check questionable premises rather than accept them at face value:
“The person’s message may contain a false statement or presupposition and Claude should check this if uncertain.”
If a user tells Claude that it’s wrong, Claude will perform a review to check if the human or Claude is incorrect:
“If the user corrects Claude or tells Claude it’s made a mistake, then Claude first thinks through the issue carefully before acknowledging the user, since users sometimes make errors themselves.”
5. Claude Avoids Being Preachy
An interesting system prompt specifies that if there’s something Claude can’t help the human with, it will not explain why, in order to avoid coming across as preachy and presumably to keep the interaction engaging.
The prompt says:
“If Claude cannot or will not help the human with something, it does not say why or what it could lead to, since this comes across as preachy and annoying. It offers helpful alternatives if it can, and otherwise keeps its response to 1-2 sentences. If Claude is unable or unwilling to complete some part of what the person has asked for, Claude explicitly tells the person what aspects it can’t or won’t help with at the start of its response.”
System Prompts To Work And Live By
The Claude system prompts reflect an approach to communication that values curiosity, clarity, and respect. These are qualities that can also be helpful as human self-prompts to encourage better dialog among ourselves on social media and in person.
A patent recently filed by Google outlines how an AI assistant may use at least five real-world contextual signals, including identifying related intents, to influence answers and generate natural dialog. It’s an example of how AI-assisted search modifies responses to engage users with contextually relevant questions and dialog, expanding beyond keyword-based systems.
The patent describes a system that generates relevant dialog and answers using signals such as environmental context, dialog intent, user data, and conversation history. These factors go beyond using the semantic data in the user’s query and show how AI-assisted search is moving toward more natural, human-like interactions.
In general, the purpose of filing a patent is to obtain legal protection and exclusivity for an invention; the act of filing doesn’t indicate that Google is actually using it.
The patent uses examples of spoken dialog but it also states the invention is not limited to audio input:
“Notably, during a given dialog session, a user can interact with the automated assistant using various input modalities, including, but not limited to, spoken input, typed input, and/or touch input.”
The name of the patent is Using Large Language Model(s) In Generating Automated Assistant Response(s). The patent applies to a wide range of AI assistants that receive inputs via the context of typed, touch, and speech.
There are five factors that influence the LLM-modified responses:
Time, Location, And Environmental Context
User-Specific Context
Dialog Intent & Prior Interactions
Inputs (text, touch, and speech)
System & Device Context
The first four factors influence the answers that the automated assistant provides and the fifth one determines whether to turn off the LLM-assisted part and revert to standard AI answers.
Time, Location, And Environmental Context
Three contextual factors (time, location, and environment) provide context that doesn’t exist in keywords and influence how the AI assistant responds. While these contextual factors, as described in the patent, aren’t strictly related to AI Overviews or AI Mode, they do show how AI-assisted interactions with data can change.
The patent uses the example of a person who tells their assistant they’re going surfing. A standard AI response would be a boilerplate comment to have fun or to enjoy the day. The LLM-assisted response described in the patent would instead use the geographic location and time to generate a comment about the weather, like the potential for rain. These are called modified assistant outputs.
The patent describes it like this:
“…the assistant outputs included in the set of modified assistant outputs include assistant outputs that do drive the dialog session in manner that further engages the user of the client device in the dialog session by asking contextually relevant questions (e.g., “how long have you been surfing?”), that provide contextually relevant information (e.g., “but if you’re going to Example Beach again, be prepared for some light showers”), and/or that otherwise resonate with the user of the client device within the context of the dialog session.”
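To make the mechanics concrete, here is a toy Python sketch of the idea; it is my own illustration of how contextual signals might be folded into the LLM’s input, not code from the patent:

def build_llm_input(user_utterance, context):
    # Append real-world contextual signals to the user's message so the
    # LLM can generate a "modified assistant output."
    context_lines = "\n".join(f"- {k}: {v}" for k, v in context.items())
    return (
        f"User said: {user_utterance}\n"
        f"Context:\n{context_lines}\n"
        "Respond conversationally, using the context to ask a relevant "
        "follow-up question or add situation-aware information."
    )

prompt = build_llm_input(
    "I'm going surfing this afternoon!",
    {"time": "2:00 PM", "location": "near Example Beach",
     "forecast": "light showers expected"},
)
# The LLM might then respond along the lines of the patent's example:
# "Have fun! But if you're going to Example Beach, be prepared for some light showers."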
User-Specific Context
The patent describes multiple user-specific contexts that the LLM may use to generate a modified output:
User profile data, such as preferences (like food or types of activity).
Software application data (such as apps currently or recently in use).
Dialog history of the ongoing and/or previous assistant sessions.
Here’s a snippet that talks about various user profile related contextual signals:
“Moreover, the context of the dialog session can be determined based on one or more contextual signals that include, for example, ambient noise detected in an environment of the client device, user profile data, software application data, … dialog history of the dialog session between the user and the automated assistant, and/or other contextual signals.”
Related Intents
An interesting part of the patent describes how a user’s food preference can be used to determine a related intent to a query.
“For example, …one or more of the LLMs can determine an intent associated with the given assistant query… Further, the one or more of the LLMs can identify, based on the intent associated with the given assistant query, at least one related intent that is related to the intent associated with the given assistant query… Moreover, the one or more of the LLMs can generate the additional assistant query based on the at least one related intent. “
The patent illustrates this with the example of a user saying that they’re hungry. The LLM will then identify related contexts, such as what type of cuisine the user enjoys and the intent of eating at a restaurant.
The patent explains:
“In this example, the additional assistant query can correspond to, for example, “what types of cuisine has the user indicated he/she prefers?” (e.g., reflecting a related cuisine type intent associated with the intent of the user indicating he/she would like to eat), “what restaurants nearby are open?” (e.g., reflecting a related restaurant lookup intent associated with the intent of the user indicating he/she would like to eat)… In these implementations, additional assistant output can be determined based on processing the additional assistant query.”
System & Device Context
The system and device context part of the patent is interesting because it enables the AI to detect if the context of the device is that it’s low on batteries, and if so, it will turn off the LLM-modified responses. There are other factors such as whether the user is walking away from the device, computational costs, etc.
Takeaways
AI Query Responses Use Contextual Signals: Google’s patent describes how automated assistants can use real-world context to generate more relevant and human-like answers and dialog.
Contextual Factors Influence Responses: These include time/location/environment, user-specific data, dialog history and intent, system/device conditions, and input type (text, speech, or touch).
LLM-Modified Responses Enhance Engagement: Large language models (LLMs) use these contexts to create personalized responses or follow-up questions, like referencing weather or past interactions.
Examples Show Practical Impact: Scenarios like recommending food based on user preferences or commenting on local weather during outdoor plans demonstrate how real-world contexts can influence how AI responds to user queries.
This patent is important because millions of people are increasingly engaging with AI assistants, making it relevant to publishers, ecommerce stores, local businesses, and SEOs.
It outlines how Google’s AI-assisted systems can generate personalized, context-aware responses by using real-world signals. This enables assistants to go beyond keyword-based answers and respond with relevant information or follow-up questions, such as suggesting restaurants a user might like or commenting on weather conditions before a planned activity.
Redirects are an essential part of website maintenance, and managing them becomes really challenging when SEO pros deal with websites containing millions of pages.
Examples of situations where you may need to implement redirects at scale:
An ecommerce site has a large number of products that are no longer sold.
Outdated pages of news publications are no longer relevant or lack historical value.
Listing directories that contain outdated listings.
Job boards where postings expire.
Why Is Redirecting At Scale Essential?
It can help improve user experience, consolidate rankings, and save crawl budget.
You might consider noindexing, but this does not stop Googlebot from crawling. It wastes crawl budget as the number of pages grows.
From a user experience perspective, landing on an outdated link is frustrating. For example, if a user lands on an outdated job listing, it’s better to send them to the closest match for an active job listing.
At Search Engine Journal, we get many 404 hits from AI chatbots because they hallucinate URLs that never existed.
We use Google Analytics 4 and Google Search Console (and sometimes server logs) reports to extract those 404 pages and redirect them to the closest matching content based on article slug.
When chatbots cite us via 404 pages and people keep arriving through broken links, it is not a good user experience.
404 URLs report in GSC, May 2025
404 visits from AI chatbots, May 2025
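As a minimal sketch of that extraction step, assuming a CSV export with a “URL” column (column and file names vary by report, so treat these as placeholders):

import pandas as pd

df = pd.read_csv("gsc_404_export.csv")  # hypothetical GSC/GA4 export
# Strip fragments and query strings, then dedupe before matching.
df["URL"] = df["URL"].str.split("#").str[0].str.split("?").str[0]
df = df.drop_duplicates(subset="URL")
df.to_csv("redirect_candidates_raw.csv", index=False)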
Prepare Redirect Candidates
First of all, read this post to learn how to create a Pinecone vector database. (Please note that in this case, we used “primary_category” as a metadata key vs. “category.”)
To make this work, we assume that all your article vectors are already stored in the “article-index-vertex” database.
Prepare your redirect URLs in CSV format like in this sample file. That could be existing articles you’ve decided to prune or 404s from your search console reports or GA4.
Sample file with URLs to be redirected (Screenshot from Google Sheet, May 2025)
The optional “primary_category” information is metadata that was stored with your articles’ Pinecone records when you created them, and it can be used to filter candidate articles to the same category, further enhancing accuracy.
In case the title is missing, for example, in 404 URLs, the script will extract slug words from the URL and use them as input.
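For reference, a minimal input file with the three columns the script expects (URL, Title, primary_category) could look like this; note the empty title in the last row, which triggers the slug fallback (values are illustrative):

URL,Title,primary_category
https://www.example.com/guide-to-link-building,Guide To Link Building,seo
https://www.example.com/old-product-review,Old Product Review,reviews
https://www.example.com/some-404-path,,news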
Generate Redirects Using Google Vertex AI
Download your Google API service credentials and rename them as “config.json,” upload the script below and a sample file to the same directory in Jupyter Lab, and run it.
import os
import time
import logging
from urllib.parse import urlparse
import re
import pandas as pd
from pandas.errors import EmptyDataError
from typing import Optional, List, Dict, Any
from google.auth import load_credentials_from_file
from google.cloud import aiplatform
from google.api_core.exceptions import GoogleAPIError
from pinecone import Pinecone, PineconeException
from vertexai.language_models import TextEmbeddingModel, TextEmbeddingInput
# Import tenacity for retry mechanism. Tenacity provides a decorator to add retry logic
# to functions, making them more robust against transient errors like network issues or API rate limits.
from tenacity import retry, wait_exponential, stop_after_attempt, retry_if_exception_type
# For clearing output in Jupyter (optional, keep if running in Jupyter).
# This is useful for interactive environments to show progress without cluttering the output.
from IPython.display import clear_output
# ─── USER CONFIGURATION ───────────────────────────────────────────────────────
# Define configurable parameters for the script. These can be easily adjusted
# without modifying the core logic.
INPUT_CSV = "redirect_candidates.csv" # Path to the input CSV file containing URLs to be redirected.
# Expected columns: "URL", "Title", "primary_category".
OUTPUT_CSV = "redirect_map.csv" # Path to the output CSV file where the generated redirect map will be saved.
PINECONE_API_KEY = "YOUR_PINECONE_KEY" # Your API key for Pinecone. Replace with your actual key.
PINECONE_INDEX_NAME = "article-index-vertex" # The name of the Pinecone index where article vectors are stored.
GOOGLE_CRED_PATH = "config.json" # Path to your Google Cloud service account credentials JSON file.
EMBEDDING_MODEL_ID = "text-embedding-005" # Identifier for the Vertex AI text embedding model to use.
TASK_TYPE = "RETRIEVAL_QUERY" # The task type for the embedding model. Try with RETRIEVAL_DOCUMENT vs RETRIEVAL_QUERY to see the difference.
# This influences how the embedding vector is generated for optimal retrieval.
CANDIDATE_FETCH_COUNT = 3 # Number of potential redirect candidates to fetch from Pinecone for each input URL.
TEST_MODE = True # If True, the script will process only a small subset of the input data (MAX_TEST_ROWS).
# Useful for testing and debugging.
MAX_TEST_ROWS = 5 # Maximum number of rows to process when TEST_MODE is True.
QUERY_DELAY = 0.2 # Delay in seconds between successive API queries (to avoid hitting rate limits).
PUBLISH_YEAR_FILTER: List[int] = [] # Optional: List of years to filter Pinecone results by 'publish_year' metadata.
# If empty, no year filtering is applied.
LOG_BATCH_SIZE = 5 # Number of URLs to process before flushing the results to the output CSV.
# This helps in saving progress incrementally and managing memory.
MIN_SLUG_LENGTH = 3 # Minimum length for a URL slug segment to be considered meaningful for embedding.
# Shorter segments might be noise or less descriptive.
# Retry configuration for API calls (Vertex AI and Pinecone).
# These parameters control how the `tenacity` library retries failed API requests.
MAX_RETRIES = 5 # Maximum number of times to retry an API call before giving up.
INITIAL_RETRY_DELAY = 1 # Initial delay in seconds before the first retry.
# Subsequent retries will have exponentially increasing delays.
# ─── SETUP LOGGING ─────────────────────────────────────────────────────────────
# Configure the logging system to output informational messages to the console.
logging.basicConfig(
level=logging.INFO, # Set the logging level to INFO, meaning INFO, WARNING, ERROR, CRITICAL messages will be shown.
format="%(asctime)s %(levelname)s %(message)s" # Define the format of log messages (timestamp, level, message).
)
# ─── INITIALIZE GOOGLE VERTEX AI ───────────────────────────────────────────────
# Set the GOOGLE_APPLICATION_CREDENTIALS environment variable to point to the
# service account key file. This allows the Google Cloud client libraries to
# authenticate automatically.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = GOOGLE_CRED_PATH
try:
# Load credentials from the specified JSON file.
credentials, project_id = load_credentials_from_file(GOOGLE_CRED_PATH)
# Initialize the Vertex AI client with the project ID and credentials.
# The location "us-central1" is specified for the AI Platform services.
aiplatform.init(project=project_id, credentials=credentials, location="us-central1")
logging.info("Vertex AI initialized.")
except Exception as e:
# Log an error if Vertex AI initialization fails and re-raise the exception
# to stop script execution, as it's a critical dependency.
logging.error(f"Failed to initialize Vertex AI: {e}")
raise
# Initialize the embedding model once globally.
# This is a crucial optimization for "Resource Management for Embedding Model".
# Loading the model takes time and resources; doing it once avoids repeated loading
# for every URL processed, significantly improving performance.
try:
GLOBAL_EMBEDDING_MODEL = TextEmbeddingModel.from_pretrained(EMBEDDING_MODEL_ID)
logging.info(f"Text Embedding Model '{EMBEDDING_MODEL_ID}' loaded.")
except Exception as e:
# Log an error if the embedding model fails to load and re-raise.
# The script cannot proceed without the embedding model.
logging.error(f"Failed to load Text Embedding Model: {e}")
raise
# ─── INITIALIZE PINECONE ──────────────────────────────────────────────────────
# Initialize the Pinecone client and connect to the specified index.
try:
pinecone = Pinecone(api_key=PINECONE_API_KEY)
index = pinecone.Index(PINECONE_INDEX_NAME)
logging.info(f"Connected to Pinecone index '{PINECONE_INDEX_NAME}'.")
except PineconeException as e:
# Log an error if Pinecone initialization fails and re-raise.
# Pinecone is a critical dependency for finding redirect candidates.
logging.error(f"Pinecone init error: {e}")
raise
# ─── HELPERS ───────────────────────────────────────────────────────────────────
def canonical_url(url: str) -> str:
"""
Converts a given URL into its canonical form by:
1. Stripping query strings (e.g., `?param=value`) and URL fragments (e.g., `#section`).
2. Handling URL-encoded fragment markers (`%23`).
3. Preserving the trailing slash if it was present in the original URL's path.
This ensures consistency with the original site's URL structure.
Args:
url (str): The input URL.
Returns:
str: The canonicalized URL.
"""
# Remove query parameters and URL fragments.
temp = url.split('?', 1)[0].split('#', 1)[0]
# Check for URL-encoded fragment markers and remove them.
enc_idx = temp.lower().find('%23')
if enc_idx != -1:
temp = temp[:enc_idx]
# Determine if the original URL path ended with a trailing slash.
has_slash = urlparse(temp).path.endswith('/')
# Remove any trailing slash temporarily for consistent processing.
temp = temp.rstrip('/')
# Re-add the trailing slash if it was originally present.
return temp + ('/' if has_slash else '')
def slug_from_url(url: str) -> str:
"""
Extracts and joins meaningful, non-numeric path segments from a canonical URL
to form a "slug" string. This slug can be used as text for embedding when
a URL's title is not available.
Args:
url (str): The input URL.
Returns:
str: A hyphen-separated string of relevant slug parts.
"""
clean = canonical_url(url) # Get the canonical version of the URL.
path = urlparse(clean).path # Extract the path component of the URL.
segments = [seg for seg in path.split('/') if seg] # Split path into segments and remove empty ones.
# Filter segments based on criteria:
# - Not purely numeric (e.g., '123' is excluded).
# - Length is greater than or equal to MIN_SLUG_LENGTH.
# - Contains at least one alphanumeric character (to exclude purely special character segments).
parts = [seg for seg in segments
if not seg.isdigit()
and len(seg) >= MIN_SLUG_LENGTH
and re.search(r'[A-Za-z0-9]', seg)]
return '-'.join(parts) # Join the filtered parts with hyphens.
# ─── EMBEDDING GENERATION FUNCTION ─────────────────────────────────────────────
# Apply retry mechanism for GoogleAPIError. This makes the embedding generation
# more resilient to transient issues like network problems or Vertex AI rate limits.
@retry(
wait=wait_exponential(multiplier=INITIAL_RETRY_DELAY, min=1, max=10), # Exponential backoff for retries.
stop=stop_after_attempt(MAX_RETRIES), # Stop retrying after a maximum number of attempts.
retry=retry_if_exception_type(GoogleAPIError), # Only retry if a GoogleAPIError occurs.
reraise=True # Re-raise the exception if all retries fail, allowing the calling function to handle it.
)
def generate_embedding(text: str) -> Optional[List[float]]:
"""
Generates a vector embedding for the given text using the globally initialized
Vertex AI Text Embedding Model. Includes retry logic for API calls.
Args:
text (str): The input text (e.g., URL title or slug) to embed.
Returns:
Optional[List[float]]: A list of floats representing the embedding vector,
or None if the input text is empty/whitespace or
if an unexpected error occurs after retries.
"""
if not text or not text.strip():
# If the text is empty or only whitespace, no embedding can be generated.
return None
try:
# Use the globally initialized model to get embeddings.
# This is the "Resource Management for Embedding Model" optimization.
inp = TextEmbeddingInput(text, task_type=TASK_TYPE)
vectors = GLOBAL_EMBEDDING_MODEL.get_embeddings([inp], output_dimensionality=768)
return vectors[0].values # Return the embedding vector (list of floats).
except GoogleAPIError as e:
# Log a warning if a GoogleAPIError occurs, then re-raise to trigger the `tenacity` retry mechanism.
logging.warning(f"Vertex AI error during embedding generation (retrying): {e}")
raise # The `reraise=True` in the decorator will catch this and retry.
except Exception as e:
# Catch any other unexpected exceptions during embedding generation.
logging.error(f"Unexpected error generating embedding: {e}")
return None # Return None for non-retryable or final failed attempts.
# ─── MAIN PROCESSING FUNCTION ─────────────────────────────────────────────────
def build_redirect_map(
input_csv: str,
output_csv: str,
fetch_count: int,
test_mode: bool
):
"""
Builds a redirect map by processing URLs from an input CSV, generating
embeddings, querying Pinecone for similar articles, and identifying
suitable redirect candidates.
Args:
input_csv (str): Path to the input CSV file.
output_csv (str): Path to the output CSV file for the redirect map.
fetch_count (int): Number of candidates to fetch from Pinecone.
test_mode (bool): If True, process only a limited number of rows.
"""
# Read the input CSV file into a Pandas DataFrame.
df = pd.read_csv(input_csv)
required = {"URL", "Title", "primary_category"}
# Validate that all required columns are present in the DataFrame.
if not required.issubset(df.columns):
raise ValueError(f"Input CSV must have columns: {required}")
# Create a set of canonicalized input URLs for efficient lookup.
# This is used to prevent an input URL from redirecting to itself or another input URL,
# which could create redirect loops or redirect to a page that is also being redirected.
input_urls = set(df["URL"].map(canonical_url))
start_idx = 0
# Implement resume functionality: if the output CSV already exists,
# try to find the last processed URL and resume from the next row.
if os.path.exists(output_csv):
try:
prev = pd.read_csv(output_csv)
except EmptyDataError:
# Handle case where the output CSV exists but is empty.
prev = pd.DataFrame()
if not prev.empty:
# Get the last URL that was processed and written to the output file.
last = prev["URL"].iloc[-1]
# Find the index of this last URL in the original input DataFrame.
idxs = df.index[df["URL"].map(canonical_url) == last].tolist()
if idxs:
# Set the starting index for processing to the row after the last processed URL.
start_idx = idxs[0] + 1
logging.info(f"Resuming from row {start_idx} after {last}.")
# Determine the range of rows to process based on test_mode.
if test_mode:
end_idx = min(start_idx + MAX_TEST_ROWS, len(df))
df_proc = df.iloc[start_idx:end_idx] # Select a slice of the DataFrame for testing.
logging.info(f"Test mode: processing rows {start_idx} to {end_idx-1}.")
else:
df_proc = df.iloc[start_idx:] # Process all remaining rows.
logging.info(f"Processing rows {start_idx} to {len(df)-1}.")
total = len(df_proc) # Total number of URLs to process in this run.
processed = 0 # Counter for successfully processed URLs.
batch: List[Dict[str, Any]] = [] # List to store results before flushing to CSV.
# Iterate over each row (URL) in the DataFrame slice to be processed.
for _, row in df_proc.iterrows():
raw_url = row["URL"] # Original URL from the input CSV.
url = canonical_url(raw_url) # Canonicalized version of the URL.
# Get title and category, handling potential missing values by defaulting to empty strings.
title = row["Title"] if isinstance(row["Title"], str) else ""
category = row["primary_category"] if isinstance(row["primary_category"], str) else ""
# Determine the text to use for generating the embedding.
# Prioritize the 'Title' if available, otherwise use a slug derived from the URL.
if title.strip():
text = title
else:
slug = slug_from_url(raw_url)
if not slug:
# If no meaningful slug can be extracted, skip this URL.
logging.info(f"Skipping {raw_url}: insufficient slug context for embedding.")
continue
text = slug.replace('-', ' ') # Prepare slug for embedding by replacing hyphens with spaces.
# Attempt to generate the embedding for the chosen text.
# This call is wrapped in a try-except block to catch final failures after retries.
try:
embedding = generate_embedding(text)
except GoogleAPIError as e:
# If embedding generation fails even after retries, log the error and skip this URL.
logging.error(f"Failed to generate embedding for {raw_url} after {MAX_RETRIES} retries: {e}")
continue # Move to the next URL.
if not embedding:
# If `generate_embedding` returned None (e.g., empty text or unexpected error), skip.
logging.info(f"Skipping {raw_url}: no embedding generated.")
continue
# Build metadata filter for Pinecone query.
# This helps narrow down search results to more relevant candidates (e.g., by category or publish year).
filt: Dict[str, Any] = {}
if category:
# Split category string by comma and strip whitespace for multiple categories.
cats = [c.strip() for c in category.split(",") if c.strip()]
if cats:
filt["primary_category"] = {"$in": cats} # Filter by categories present in Pinecone metadata.
if PUBLISH_YEAR_FILTER:
filt["publish_year"] = {"$in": PUBLISH_YEAR_FILTER} # Filter by specified publish years.
filt["id"] = {"$ne": url} # Exclude the current URL itself from the search results to prevent self-redirects.
# Define a nested function for Pinecone query with retry mechanism.
# This ensures that Pinecone queries are also robust against transient errors.
@retry(
wait=wait_exponential(multiplier=INITIAL_RETRY_DELAY, min=1, max=10),
stop=stop_after_attempt(MAX_RETRIES),
retry=retry_if_exception_type(PineconeException), # Only retry if a PineconeException occurs.
reraise=True # Re-raise the exception if all retries fail.
)
def query_pinecone_with_retry(embedding_vector, top_k_count, pinecone_filter):
"""
Performs a Pinecone index query with retry logic.
"""
return index.query(
vector=embedding_vector,
top_k=top_k_count,
include_values=False, # We don't need the actual vector values in the response.
include_metadata=False, # We don't need the metadata in the response for this logic.
filter=pinecone_filter # Apply the constructed metadata filter.
)
# Attempt to query Pinecone for redirect candidates.
try:
res = query_pinecone_with_retry(embedding, fetch_count, filt)
except PineconeException as e:
# If Pinecone query fails after retries, log the error and skip this URL.
logging.error(f"Failed to query Pinecone for {raw_url} after {MAX_RETRIES} retries: {e}")
continue # Move to the next URL.
candidate = None # Initialize redirect candidate to None.
score = None # Initialize relevance score to None.
# Iterate through the Pinecone query results (matches) to find a suitable candidate.
for m in res.get("matches", []):
cid = m.get("id") # Get the ID (URL) of the matched document in Pinecone.
# A candidate is suitable if:
# 1. It exists (cid is not None).
# 2. It's not the original URL itself (to prevent self-redirects).
# 3. It's not another URL from the input_urls set (to prevent redirecting to a page that's also being redirected).
if cid and cid != url and cid not in input_urls:
candidate = cid # Assign the first valid candidate found.
score = m.get("score") # Get the relevance score of this candidate.
break # Stop after finding the first suitable candidate (Pinecone returns by relevance).
# Append the results for the current URL to the batch.
batch.append({"URL": url, "Redirect Candidate": candidate, "Relevance Score": score})
processed += 1 # Increment the counter for processed URLs.
msg = f"Mapped {url} → {candidate}"
if score is not None:
msg += f" ({score:.4f})" # Add score to log message if available.
logging.info(msg) # Log the mapping result.
# Periodically flush the batch results to the output CSV.
if processed % LOG_BATCH_SIZE == 0:
out_df = pd.DataFrame(batch) # Convert the current batch to a DataFrame.
# Determine file mode: 'a' (append) if file exists, 'w' (write) if new.
mode = 'a' if os.path.exists(output_csv) else 'w'
# Determine if header should be written (only for new files).
header = not os.path.exists(output_csv)
# Write the batch to the CSV.
out_df.to_csv(output_csv, mode=mode, header=header, index=False)
batch.clear() # Clear the batch after writing to free memory.
if not test_mode:
clear_output(wait=True) # Clear output in Jupyter for a cleaner progress display.
print(f"Progress: {processed} / {total}") # Print progress update.
time.sleep(QUERY_DELAY) # Pause for a short delay to avoid overwhelming APIs.
# After the loop, write any remaining items in the batch to the output CSV.
if batch:
out_df = pd.DataFrame(batch)
mode = 'a' if os.path.exists(output_csv) else 'w'
header = not os.path.exists(output_csv)
out_df.to_csv(output_csv, mode=mode, header=header, index=False)
logging.info(f"Completed. Total processed: {processed}") # Log completion message.
if __name__ == "__main__":
# This block ensures that build_redirect_map is called only when the script is executed directly.
# It passes the user-defined configuration parameters to the main function.
build_redirect_map(INPUT_CSV, OUTPUT_CSV, CANDIDATE_FETCH_COUNT, TEST_MODE)
You will see a test run with only five records, and a new file called “redirect_map.csv” will appear, containing the redirect suggestions.
Once you’ve confirmed the code runs smoothly, set the TEST_MODE boolean to False (a one-line change in the configuration block, shown below) and run the script for all of your URLs.
Test run with only five records (Image from author, May 2025)
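That is, in the user configuration section at the top of the script:
TEST_MODE = False  # Process every row of the input CSV instead of only MAX_TEST_ROWS.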
If the script stops and you rerun it, it picks up where it left off. It also checks each redirect candidate it finds against the input CSV file.
This check prevents selecting a URL from the database that is itself on the pruned list, since redirecting to a page that is also being redirected could cause an infinite redirect loop.
Redirect candidates using Google Vertex AI’s task type RETRIEVAL_QUERY (Image from author, May 2025)
We can now take this redirect map and import it into our redirect manager in the content management system (CMS), and that’s it!
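The output file has three columns, URL, Redirect Candidate, and Relevance Score, so it maps cleanly onto most redirect managers. An illustrative row (using the example URLs discussed below; the score value here is hypothetical):
URL,Redirect Candidate,Relevance Score
/what-is-eat/,/google-eat/what-is-it/,0.8721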
You can see how it managed to match the outdated 2013 news article “YouTube Retiring Video Responses on September 12” to the newer, highly relevant 2022 news article “YouTube Adopts Feature From TikTok – Reply To Comments With A Video.”
Also, for “/what-is-eat/,” it found a match with “/google-eat/what-is-it/,” which is a perfect match.
This is not just due to the quality of Google Vertex AI’s embedding models, but also the result of choosing the right parameters.
When I instead use “RETRIEVAL_DOCUMENT” as the task type while generating query vector embeddings for the YouTube news article shown above, it matches “YouTube Expands Community Posts to More Creators,” which is still relevant but not as good a match as the other one.
For “/what-is-eat/,” it matches the article “/reimagining-eeat-to-drive-higher-sales-and-search-visibility/545790/,” which is not as good as “/google-eat/what-is-it/.”
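If you want to reproduce this comparison yourself, the task type is set in a single place: when the query embedding is generated. A minimal sketch using the same Vertex AI classes as the script above (GLOBAL_EMBEDDING_MODEL is the model initialized earlier in the script, and the title string is just an example):
from vertexai.language_models import TextEmbeddingInput

# "RETRIEVAL_QUERY" embeds the text as a search query against documents that were
# embedded with "RETRIEVAL_DOCUMENT"; changing this one string is all it takes to
# reproduce the weaker matches described above.
inp = TextEmbeddingInput("YouTube Retiring Video Responses on September 12", task_type="RETRIEVAL_QUERY")
vectors = GLOBAL_EMBEDDING_MODEL.get_embeddings([inp], output_dimensionality=768)
query_vector = vectors[0].values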
If you want to find redirect matches only from your pool of fresh articles, you can query Pinecone with one additional metadata filter, “publish_year,” provided your Pinecone records have that metadata field (which I highly recommend creating).
In the code, it is the PUBLISH_YEAR_FILTER variable. Set it to an array of years, and the query will pull only articles published in those years, as shown below.
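For example, restricting redirect targets to articles published in 2024 or 2025 (the years are illustrative) looks like this:
PUBLISH_YEAR_FILTER: List[int] = [2024, 2025]  # Only consider redirect targets published in these years.

# Inside build_redirect_map(), this becomes part of the Pinecone metadata filter:
# filt["publish_year"] = {"$in": [2024, 2025]}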
Generate Redirects Using OpenAI’s Text Embeddings
Let’s do the same task with OpenAI’s “text-embedding-ada-002” model. The purpose is to show the difference in output from Google Vertex AI.
Simply create a new notebook file in the same directory, copy and paste this code, and run it.
import os
import time
import logging
from urllib.parse import urlparse
import re
import pandas as pd
from pandas.errors import EmptyDataError
from typing import Optional, List, Dict, Any
from openai import OpenAI
from pinecone import Pinecone, PineconeException
# Import tenacity for retry mechanism. Tenacity provides a decorator to add retry logic
# to functions, making them more robust against transient errors like network issues or API rate limits.
from tenacity import retry, wait_exponential, stop_after_attempt, retry_if_exception_type
# For clearing output in Jupyter (optional, keep if running in Jupyter)
from IPython.display import clear_output
# ─── USER CONFIGURATION ───────────────────────────────────────────────────────
# Define configurable parameters for the script. These can be easily adjusted
# without modifying the core logic.
INPUT_CSV = "redirect_candidates.csv" # Path to the input CSV file containing URLs to be redirected.
# Expected columns: "URL", "Title", "primary_category".
OUTPUT_CSV = "redirect_map.csv" # Path to the output CSV file where the generated redirect map will be saved.
PINECONE_API_KEY = "YOUR_PINECONE_API_KEY" # Your API key for Pinecone. Replace with your actual key.
PINECONE_INDEX_NAME = "article-index-ada" # The name of the Pinecone index where article vectors are stored.
OPENAI_API_KEY = "YOUR_OPENAI_API_KEY" # Your API key for OpenAI. Replace with your actual key.
OPENAI_EMBEDDING_MODEL_ID = "text-embedding-ada-002" # Identifier for the OpenAI text embedding model to use.
CANDIDATE_FETCH_COUNT = 3 # Number of potential redirect candidates to fetch from Pinecone for each input URL.
TEST_MODE = True # If True, the script will process only a small subset of the input data (MAX_TEST_ROWS).
# Useful for testing and debugging.
MAX_TEST_ROWS = 5 # Maximum number of rows to process when TEST_MODE is True.
QUERY_DELAY = 0.2 # Delay in seconds between successive API queries (to avoid hitting rate limits).
PUBLISH_YEAR_FILTER: List[int] = [] # Optional: List of years to filter Pinecone results by 'publish_year' metadata, e.g., [2024, 2025].
# If empty, no year filtering is applied.
LOG_BATCH_SIZE = 5 # Number of URLs to process before flushing the results to the output CSV.
# This helps in saving progress incrementally and managing memory.
MIN_SLUG_LENGTH = 3 # Minimum length for a URL slug segment to be considered meaningful for embedding.
# Shorter segments might be noise or less descriptive.
# Retry configuration for API calls (OpenAI and Pinecone).
# These parameters control how the `tenacity` library retries failed API requests.
MAX_RETRIES = 5 # Maximum number of times to retry an API call before giving up.
INITIAL_RETRY_DELAY = 1 # Initial delay in seconds before the first retry.
# Subsequent retries will have exponentially increasing delays.
# ─── SETUP LOGGING ─────────────────────────────────────────────────────────────
# Configure the logging system to output informational messages to the console.
logging.basicConfig(
level=logging.INFO, # Set the logging level to INFO, meaning INFO, WARNING, ERROR, CRITICAL messages will be shown.
format="%(asctime)s %(levelname)s %(message)s" # Define the format of log messages (timestamp, level, message).
)
# ─── INITIALIZE OPENAI CLIENT & PINECONE ───────────────────────────────────────
# Initialize the OpenAI client once globally. This handles resource management efficiently
# as the client object manages connections and authentication.
client = OpenAI(api_key=OPENAI_API_KEY)
try:
# Initialize the Pinecone client and connect to the specified index.
pinecone = Pinecone(api_key=PINECONE_API_KEY)
index = pinecone.Index(PINECONE_INDEX_NAME)
logging.info(f"Connected to Pinecone index '{PINECONE_INDEX_NAME}'.")
except PineconeException as e:
# Log an error if Pinecone initialization fails and re-raise.
# Pinecone is a critical dependency for finding redirect candidates.
logging.error(f"Pinecone init error: {e}")
raise
# ─── HELPERS ───────────────────────────────────────────────────────────────────
def canonical_url(url: str) -> str:
"""
Converts a given URL into its canonical form by:
1. Stripping query strings (e.g., `?param=value`) and URL fragments (e.g., `#section`).
2. Handling URL-encoded fragment markers (`%23`).
3. Preserving the trailing slash if it was present in the original URL's path.
This ensures consistency with the original site's URL structure.
Args:
url (str): The input URL.
Returns:
str: The canonicalized URL.
"""
# Remove query parameters and URL fragments.
temp = url.split('?', 1)[0]
temp = temp.split('#', 1)[0]
# Check for URL-encoded fragment markers and remove them.
enc_idx = temp.lower().find('%23')
if enc_idx != -1:
temp = temp[:enc_idx]
# Determine if the original URL path ended with a trailing slash.
has_slash = urlparse(temp).path.endswith('/')
# Remove any trailing slash temporarily for consistent processing.
temp = temp.rstrip('/')
# Re-add the trailing slash if it was originally present.
return temp + ('/' if has_slash else '')
def slug_from_url(url: str) -> str:
"""
Extracts and joins meaningful, non-numeric path segments from a canonical URL
to form a "slug" string. This slug can be used as text for embedding when
a URL's title is not available.
Args:
url (str): The input URL.
Returns:
str: A hyphen-separated string of relevant slug parts.
"""
clean = canonical_url(url) # Get the canonical version of the URL.
path = urlparse(clean).path # Extract the path component of the URL.
segments = [seg for seg in path.split('/') if seg] # Split path into segments and remove empty ones.
# Filter segments based on criteria:
# - Not purely numeric (e.g., '123' is excluded).
# - Length is greater than or equal to MIN_SLUG_LENGTH.
# - Contains at least one alphanumeric character (to exclude purely special character segments).
parts = [seg for seg in segments
if not seg.isdigit()
and len(seg) >= MIN_SLUG_LENGTH
and re.search(r'[A-Za-z0-9]', seg)]
return '-'.join(parts) # Join the filtered parts with hyphens.
# ─── EMBEDDING GENERATION FUNCTION ─────────────────────────────────────────────
# Apply retry mechanism for OpenAI API errors. This makes the embedding generation
# more resilient to transient issues like network problems or API rate limits.
@retry(
wait=wait_exponential(multiplier=INITIAL_RETRY_DELAY, min=1, max=10), # Exponential backoff for retries.
stop=stop_after_attempt(MAX_RETRIES), # Stop retrying after a maximum number of attempts.
retry=retry_if_exception_type(Exception), # Retry on any Exception from OpenAI client (can be refined to openai.APIError if desired).
reraise=True # Re-raise the exception if all retries fail, allowing the calling function to handle it.
)
def generate_embedding(text: str) -> Optional[List[float]]:
"""
Generate a vector embedding for the given text using OpenAI's text-embedding-ada-002
via the globally initialized OpenAI client. Includes retry logic for API calls.
Args:
text (str): The input text (e.g., URL title or slug) to embed.
Returns:
Optional[List[float]]: A list of floats representing the embedding vector,
or None if the input text is empty/whitespace or
if an unexpected error occurs after retries.
"""
if not text or not text.strip():
# If the text is empty or only whitespace, no embedding can be generated.
return None
try:
resp = client.embeddings.create( # Use the globally initialized OpenAI client to get embeddings.
model=OPENAI_EMBEDDING_MODEL_ID,
input=text
)
return resp.data[0].embedding # Return the embedding vector (list of floats).
except Exception as e:
# Log a warning if an OpenAI error occurs, then re-raise to trigger the `tenacity` retry mechanism.
logging.warning(f"OpenAI embedding error (retrying): {e}")
raise # The `reraise=True` in the decorator will catch this and retry.
# ─── MAIN PROCESSING FUNCTION ─────────────────────────────────────────────────
def build_redirect_map(
input_csv: str,
output_csv: str,
fetch_count: int,
test_mode: bool
):
"""
Builds a redirect map by processing URLs from an input CSV, generating
embeddings, querying Pinecone for similar articles, and identifying
suitable redirect candidates.
Args:
input_csv (str): Path to the input CSV file.
output_csv (str): Path to the output CSV file for the redirect map.
fetch_count (int): Number of candidates to fetch from Pinecone.
test_mode (bool): If True, process only a limited number of rows.
"""
# Read the input CSV file into a Pandas DataFrame.
df = pd.read_csv(input_csv)
required = {"URL", "Title", "primary_category"}
# Validate that all required columns are present in the DataFrame.
if not required.issubset(df.columns):
raise ValueError(f"Input CSV must have columns: {required}")
# Create a set of canonicalized input URLs for efficient lookup.
# This is used to prevent an input URL from redirecting to itself or another input URL,
# which could create redirect loops or redirect to a page that is also being redirected.
input_urls = set(df["URL"].map(canonical_url))
start_idx = 0
# Implement resume functionality: if the output CSV already exists,
# try to find the last processed URL and resume from the next row.
if os.path.exists(output_csv):
try:
prev = pd.read_csv(output_csv)
except EmptyDataError:
# Handle case where the output CSV exists but is empty.
prev = pd.DataFrame()
if not prev.empty:
# Get the last URL that was processed and written to the output file.
last = prev["URL"].iloc[-1]
# Find the index of this last URL in the original input DataFrame.
idxs = df.index[df["URL"].map(canonical_url) == last].tolist()
if idxs:
# Set the starting index for processing to the row after the last processed URL.
start_idx = idxs[0] + 1
logging.info(f"Resuming from row {start_idx} after {last}.")
# Determine the range of rows to process based on test_mode.
if test_mode:
end_idx = min(start_idx + MAX_TEST_ROWS, len(df))
df_proc = df.iloc[start_idx:end_idx] # Select a slice of the DataFrame for testing.
logging.info(f"Test mode: processing rows {start_idx} to {end_idx-1}.")
else:
df_proc = df.iloc[start_idx:] # Process all remaining rows.
logging.info(f"Processing rows {start_idx} to {len(df)-1}.")
total = len(df_proc) # Total number of URLs to process in this run.
processed = 0 # Counter for successfully processed URLs.
batch: List[Dict[str, Any]] = [] # List to store results before flushing to CSV.
# Iterate over each row (URL) in the DataFrame slice to be processed.
for _, row in df_proc.iterrows():
raw_url = row["URL"] # Original URL from the input CSV.
url = canonical_url(raw_url) # Canonicalized version of the URL.
# Get title and category, handling potential missing values by defaulting to empty strings.
title = row["Title"] if isinstance(row["Title"], str) else ""
category = row["primary_category"] if isinstance(row["primary_category"], str) else ""
# Determine the text to use for generating the embedding.
# Prioritize the 'Title' if available, otherwise use a slug derived from the URL.
if title.strip():
text = title
else:
raw_slug = slug_from_url(raw_url)
if not raw_slug or len(raw_slug) < MIN_SLUG_LENGTH:
# If no meaningful slug can be extracted, skip this URL.
logging.info(f"Skipping {raw_url}: insufficient slug context.")
continue
text = raw_slug.replace('-', ' ').replace('_', ' ') # Prepare slug for embedding by replacing hyphens and underscores with spaces.
# Attempt to generate the embedding for the chosen text.
# This call is wrapped in a try-except block to catch final failures after retries.
try:
embedding = generate_embedding(text)
except Exception as e: # Catch any exception from generate_embedding after all retries.
# If embedding generation fails even after retries, log the error and skip this URL.
logging.error(f"Failed to generate embedding for {raw_url} after {MAX_RETRIES} retries: {e}")
continue # Move to the next URL.
if not embedding:
# If `generate_embedding` returned None (e.g., empty text or unexpected error), skip.
logging.info(f"Skipping {raw_url}: no embedding.")
continue
# Build metadata filter for Pinecone query.
# This helps narrow down search results to more relevant candidates (e.g., by category or publish year).
filt: Dict[str, Any] = {}
if category:
# Split category string by comma and strip whitespace for multiple categories.
cats = [c.strip() for c in category.split(",") if c.strip()]
if cats:
filt["primary_category"] = {"$in": cats} # Filter by categories present in Pinecone metadata.
if PUBLISH_YEAR_FILTER:
filt["publish_year"] = {"$in": PUBLISH_YEAR_FILTER} # Filter by specified publish years.
filt["id"] = {"$ne": url} # Exclude the current URL itself from the search results to prevent self-redirects.
# Define a nested function for Pinecone query with retry mechanism.
# This ensures that Pinecone queries are also robust against transient errors.
@retry(
wait=wait_exponential(multiplier=INITIAL_RETRY_DELAY, min=1, max=10),
stop=stop_after_attempt(MAX_RETRIES),
retry=retry_if_exception_type(PineconeException), # Only retry if a PineconeException occurs.
reraise=True # Re-raise the exception if all retries fail.
)
def query_pinecone_with_retry(embedding_vector, top_k_count, pinecone_filter):
"""
Performs a Pinecone index query with retry logic.
"""
return index.query(
vector=embedding_vector,
top_k=top_k_count,
include_values=False, # We don't need the actual vector values in the response.
include_metadata=False, # We don't need the metadata in the response for this logic.
filter=pinecone_filter # Apply the constructed metadata filter.
)
# Attempt to query Pinecone for redirect candidates.
try:
res = query_pinecone_with_retry(embedding, fetch_count, filt)
except PineconeException as e:
# If Pinecone query fails after retries, log the error and skip this URL.
logging.error(f"Failed to query Pinecone for {raw_url} after {MAX_RETRIES} retries: {e}")
continue
candidate = None # Initialize redirect candidate to None.
score = None # Initialize relevance score to None.
# Iterate through the Pinecone query results (matches) to find a suitable candidate.
for m in res.get("matches", []):
cid = m.get("id") # Get the ID (URL) of the matched document in Pinecone.
# A candidate is suitable if:
# 1. It exists (cid is not None).
# 2. It's not the original URL itself (to prevent self-redirects).
# 3. It's not another URL from the input_urls set (to prevent redirecting to a page that's also being redirected).
if cid and cid != url and cid not in input_urls:
candidate = cid # Assign the first valid candidate found.
score = m.get("score") # Get the relevance score of this candidate.
break # Stop after finding the first suitable candidate (Pinecone returns by relevance).
# Append the results for the current URL to the batch.
batch.append({"URL": url, "Redirect Candidate": candidate, "Relevance Score": score})
processed += 1 # Increment the counter for processed URLs.
msg = f"Mapped {url} → {candidate}"
if score is not None:
msg += f" ({score:.4f})" # Add score to log message if available.
logging.info(msg) # Log the mapping result.
# Periodically flush the batch results to the output CSV.
if processed % LOG_BATCH_SIZE == 0:
out_df = pd.DataFrame(batch) # Convert the current batch to a DataFrame.
# Determine file mode: 'a' (append) if file exists, 'w' (write) if new.
mode = 'a' if os.path.exists(output_csv) else 'w'
# Determine if header should be written (only for new files).
header = not os.path.exists(output_csv)
# Write the batch to the CSV.
out_df.to_csv(output_csv, mode=mode, header=header, index=False)
batch.clear() # Clear the batch after writing to free memory.
if not test_mode:
clear_output(wait=True) # Clear output in Jupyter for cleaner progress display.
print(f"Progress: {processed} / {total}") # Print progress update.
time.sleep(QUERY_DELAY) # Pause for a short delay to avoid overwhelming APIs.
# After the loop, write any remaining items in the batch to the output CSV.
if batch:
out_df = pd.DataFrame(batch)
mode = 'a' if os.path.exists(output_csv) else 'w'
header = not os.path.exists(output_csv)
out_df.to_csv(output_csv, mode=mode, header=header, index=False)
logging.info(f"Completed. Total processed: {processed}") # Log completion message.
if __name__ == "__main__":
# This block ensures that build_redirect_map is called only when the script is executed directly.
# It passes the user-defined configuration parameters to the main function.
build_redirect_map(INPUT_CSV, OUTPUT_CSV, CANDIDATE_FETCH_COUNT, TEST_MODE)
While the quality of the output is satisfactory, it falls short of the quality observed with Google Vertex AI.
In the table below, you can see the difference in output quality.
When it comes to SEO, even though Google Vertex AI is three times more expensive than OpenAI’s model, I prefer to use Vertex.
The quality of the results is significantly higher: while you pay more per unit of text processed, the superior output saves valuable time on reviewing and validating the results.
From my experience, it costs about $0.04 to process 20,000 URLs using Google Vertex AI.
Even though it is the more expensive option, it is still ridiculously cheap, and you shouldn’t worry if you’re dealing with tasks involving a few thousand URLs.
For 1 million URLs, the projected price would be approximately $2.
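That projection follows directly from the per-URL rate:
cost_20k_urls = 0.04                   # Observed cost in USD for 20,000 URLs.
cost_per_url = cost_20k_urls / 20_000  # ≈ $0.000002 per URL.
print(cost_per_url * 1_000_000)        # ≈ 2.0, i.e., about $2 for 1 million URLs.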
If you still want a free method, you can use BERT or Llama models from Hugging Face to generate vector embeddings without paying a per-API-call fee.
The real cost is the compute power needed to run the models. Keep in mind that if you will query using vectors generated from BERT or Llama, you must also generate the vector embeddings for all of your articles in Pinecone (or any other vector database) using those same models.
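A minimal sketch of that approach using the open-source sentence-transformers library (the model name is only an example; any BERT-family model works, as long as the same model produced the vectors stored in your index):
from sentence_transformers import SentenceTransformer

# Runs locally, so there is no per-API-call fee; the only cost is your own compute.
model = SentenceTransformer("all-MiniLM-L6-v2")  # Example model; outputs 384-dimensional vectors.

def generate_embedding(text: str):
    """A drop-in replacement for the API-based embedding functions above."""
    if not text or not text.strip():
        return None
    return model.encode(text).tolist()  # A plain list of floats, ready for a Pinecone query.

# Note: the vectors stored in your Pinecone index must come from the same model,
# and the index dimension must match the model's output (384 here).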
In Summary: AI Is Your Powerful Ally
AI enables you to scale your SEO or marketing efforts and automate the most tedious tasks.
This doesn’t replace your expertise; it levels up your skills and equips you to face challenges with greater capability, making the process more engaging and fun.
Mastering these tools is essential for success. I’m passionate about writing about this topic to help beginners learn and feel inspired.
As we move forward in this series, we will explore how to use Google Vertex AI for building an internal linking WordPress plugin.