Google’s Search Advocate, John Mueller, shared insights on diagnosing widespread crawling issues.
This guidance was shared in response to a disruption reported by Adrian Schmidt on LinkedIn. Google’s crawler stopped accessing several of his domains at the same time.
Despite the interruption, Schmidt noted that live tests via Search Console continued to function without error messages.
Investigations indicated no increase in 5xx errors or issues with robots.txt requests.
What could the problem be?
Mueller’s Response
Addressing the situation, Mueller pointed to shared infrastructure as the likely cause:
“If it shared across a bunch of domains and focuses on something like crawling, it’s probably an issue with a shared piece of infrastructure. If it’s already recovering, at least it’s not urgent anymore and you have a bit of time to poke at recent changes / infrastructure logs.”
Infrastructure Investigation
All affected sites used Cloudflare as their CDN, which raised some eyebrows.
When asked about debugging, Mueller recommended checking Search Console data to determine whether DNS or failed requests were causing the problem.
Mueller stated:
“The crawl stats in Search Console will also show a bit more, perhaps help decide between say DNS vs requests failing.”
He also pointed out that the timing was a key clue:
“If it’s all at exactly the same time, it wouldn’t be robots.txt, and probably not DNS.”
Impact on Search Results
Regarding search visibility concerns, Mueller reassured site owners that this type of disruption wouldn't cause any problems:
“If this is from today, and it just lasted a few hours, I wouldn’t expect any visible issues in search.”
Why This Matters
When Googlebot suddenly stops crawling across numerous sites simultaneously, it can be challenging to identify the root cause.
While temporary crawling pauses might not immediately impact search rankings, they can disrupt Google’s ability to discover and index new content.
The incident highlights a vulnerability organizations might face without realizing it, especially those relying on shared infrastructure.
How This Can Help You
If Googlebot suddenly stops crawling your sites:
Check if the problem hits multiple sites at once
Look at your shared infrastructure first
Use Search Console data to narrow down the cause
Don’t rule out DNS just because regular traffic looks fine
Keep an eye on your logs
For anyone running multiple sites behind a CDN, make sure you:
Have good logging set up
Watch your crawl rates
Know who to call when things go sideways
Keep tabs on your infrastructure provider
This post was sponsored by DebugBear. The opinions expressed in this article are the sponsor’s own.
From a user’s perspective, a slow website can be incredibly frustrating, creating a poor experience. But the impact of sluggish load times goes deeper than just user frustration.
Poor page performance affects search rankings, overall site engagement, E-E-A-T, and conversion rates, resulting in abandoned sessions, lost sales, and damaged trust.
Sure, the Chrome User Experience Report (CrUX) and Google's Core Web Vitals (CWV) reports can indicate there's an issue, but that's about it. They don't provide enough detail to identify, troubleshoot, and fix the issue.
And fixing these issues is vital to your digital success.
Image from DebugBear, October 2024
This article explores why slow websites are bad for user experience (UX), the challenges that cause them, and how advanced page performance tools can help fix these issues in ways that basic tools can’t.
UX, Brand Perception & Beyond
While often at the bottom of a technical SEO checklist, site speed is critical for UX. Sites that load in one second convert 2.5 to 3 times more than sites that take five seconds to load.
And yet, today, an estimated 14% of B2C ecommerce websites take five seconds or more to load.
These numbers become even more pronounced for mobile users, for whom pages load 70.9% slower. Mobile users see 31% fewer pageviews and a 4.8% higher average bounce rate per session.
According to a recent Google study, 53% of mobile users will abandon a page if it takes more than three seconds to load.
Poor page experience can negatively affect other aspects of your site, too:
Search Rankings – Google considers page experience, of which Core Web Vitals and page performance are a part, when ranking web pages.
User Trust – Poor performing pages fail to meet a potential customer’s expectations. They are often perceived by users as the brand inconveniencing them, introducing stress, negative emotions, and a loss of a sense of control to the buying process. Slower pages can also cause users to forget information gained from previous pages, reducing the effectiveness of advertising, copy, and branding campaigns between clicks.
User Retention – Site visitors who experience slow load times may never return, reducing retention rates and customer loyalty.
Why Basic Page Performance Tools Don’t Fully Solve The Problem
Tools like Google PageSpeed Insights or Lighthouse give valuable insights into how your website performs, but they can often be limited. They tell you that there’s an issue but often fall short of explaining what caused it or how to fix it.
Google’s Chrome User Experience Report (CrUX) and Core Web Vitals have become essential in tracking website performance and user experience.
These metrics—Largest Contentful Paint (LCP), First Input Delay (FID), and Cumulative Layout Shift (CLS)—offer valuable insights into how users perceive a website’s speed and stability.
However, CrUX and Core Web Vitals only tell part of the story. They indicate that a problem exists but don’t show the root cause or offer an immediate path for improvement.
For instance, your LCP might be poor, but without deeper page speed analysis, you wouldn’t know whether it’s due to an unoptimized image, a slow server response, or third-party scripts.
Image from DebugBear, October 2024
Here’s where DebugBear stands out. DebugBear digs deeper, offering more granular data and unique features that basic tools don’t provide.
Continuous Monitoring and Historical Data – Many speed testing tools only offer snapshots of performance data. DebugBear, on the other hand, allows for continuous monitoring over time, providing an ongoing view of your site’s performance. This is crucial for detecting issues that crop up unexpectedly or tracking the effectiveness of your optimizations.
Granular Breakdown by Device, Location, and Browser – Basic tools often provide aggregated data, which hides the differences between user experiences across various devices, countries, and network conditions. DebugBear lets you drill down to see how performance varies, allowing you to optimize for specific user segments.
Pinpointing Content Elements Causing Delays – One of DebugBear’s standout features is its ability to show exactly which content elements—images, scripts, or third-party code—are slowing down your website. Rather than wasting hours digging through code and experimenting with trial and error, DebugBear highlights the specific elements causing delays, allowing for targeted, efficient fixes.
Why You Need Continuous Page Speed Testing
One of the biggest pitfalls in web performance optimization is relying on single-point speed tests.
Image from DebugBear, October 2024
Running a one-time test may give you a snapshot of performance at that moment, but it doesn’t account for fluctuations caused by different factors, such as traffic spikes, varying user devices, or changes to site content.
Without continuous testing, you risk spending hours (or even days) trying to identify the root cause of performance issues.
DebugBear solves this problem by continuously tracking page speed across different devices and geographies, offering detailed reports that can be easily shared with team members or stakeholders.
If a performance dip occurs, DebugBear provides the data necessary to quickly identify and rectify the issue, saving you from the endless trial-and-error process of manual debugging.
Without tools like DebugBear, you’re left with only a high-level view of your website’s performance.
This means hours of trying to guess the underlying issues based on broad metrics, with no real insight into what’s dragging a site down.
Different Users Experience Performance Differently
Not all users experience your website’s performance in the same way.
Device type, geographic location, and network speed can significantly affect load times and interaction delays.
For example, a user on a fast fiberoptic connection in the U.S. may have a completely different experience than someone on a slower mobile network in India.
This variance in user experience can be hidden in aggregate data, leading you to believe your site is performing well when a significant portion of your audience is actually struggling with slow speeds.
Here’s why breaking down performance data by device, country, and browser matters:
Device-Specific Optimizations – Some elements, like large images or animations, may perform well on desktop but drag down speeds on mobile.
Geographic Performance Variations – International users may experience slower speeds due to server location or network conditions. DebugBear can highlight these differences and help you optimize your content delivery network (CDN) strategy.
Browser Differences – Different browsers may handle elements like JavaScript and CSS in different ways, impacting performance. DebugBear’s breakdown by browser ensures you’re not overlooking these subtleties.
Without this granular insight, you risk alienating segments of your audience and overlooking key areas for optimization.
And troubleshooting these issues becomes an expensive nightmare.
Just ask SiteCare.
WordPress web development and optimization service provider SiteCare uses DebugBear to quickly troubleshoot a full range of WordPress sites, solve performance issues faster, and monitor them for changes. That helps it deliver high-quality service to its clients while saving thousands of hours and dollars every year.
DebugBear offers these breakdowns, providing a clear view of how your website performs for all users, not just a select few.
Real User Monitoring: The Key To Accurate Performance Insights
In addition to synthetic testing (which mimics user interactions), real user monitoring (RUM) is another powerful feature technical SEOs and marketing teams will find valuable.
While synthetic tests offer valuable controlled insights, they don’t always reflect the real-world experiences of your users.
RUM captures data from actual users as they interact with your site, providing real-time, accurate insights into what’s working and what isn’t.
For instance, real user monitoring can help you:
Identify performance issues unique to specific user segments.
Detect trends that may not be visible in synthetic tests, such as network issues or slow third-party scripts.
Measure the actual experience users are having on your website, not just the theoretical one.
Without real user monitoring, you might miss critical issues that only surface under specific conditions, like a heavy user load or slow mobile networks.
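Here is a minimal sketch of how RUM data is commonly collected with the open-source web-vitals library (a generic illustration rather than DebugBear's own snippet; the CDN URL, version, and /analytics endpoint are placeholders you would swap for your own setup):

<script type="module">
  // Collect Core Web Vitals from real visitors and beacon them to your own endpoint.
  import {onLCP, onINP, onCLS} from 'https://unpkg.com/web-vitals@4?module';

  function sendToAnalytics(metric) {
    // Each metric object carries a name (e.g. "LCP"), a value, and a rating.
    const body = JSON.stringify({name: metric.name, value: metric.value, rating: metric.rating});
    // sendBeacon keeps reporting reliable even while the page is unloading.
    if (!(navigator.sendBeacon && navigator.sendBeacon('/analytics', body))) {
      fetch('/analytics', {body, method: 'POST', keepalive: true});
    }
  }

  onLCP(sendToAnalytics);
  onINP(sendToAnalytics);
  onCLS(sendToAnalytics);
</script>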
If you’re not using continuous page speed testing and in-depth reports, you’re flying blind.
You may see an overall decline in performance without understanding why, or you could miss opportunities for optimization that only reveal themselves under specific conditions.
The result?
Wasted time, frustrated users, lost conversions, and a website that doesn’t perform up to its potential.
DebugBear solves this by offering both continuous monitoring and granular breakdowns, making it easier to troubleshoot issues quickly and accurately.
With detailed reports, you’ll know exactly what to fix and where to focus your optimization efforts, significantly cutting down on the time spent searching for problems.
Headless CMSs are becoming more popular as companies want more flexible content management tools.
These systems separate the content creation from how it’s displayed, giving more options for publishing content across different platforms.
In this article, we’re reviewing ten top headless CMS options. We’re examining them based on the following criteria:
How well they work with other tools.
How easy they are for developers to use.
How flexible they are for organizing content.
We’ve included enterprise-level systems and free, open-source ones to cover different needs and technical skill levels. This comparison should help teams understand their options and figure out which platforms might work best for their specific projects.
Here are the best headless CMS platforms to consider.
Why Should You Use A Headless CMS Platform?
Headless CMS platforms have become increasingly popular for good reasons.
They offer several advantages over traditional content management systems, including:
Flexibility: Publish once, display anywhere – from websites to apps to smart devices.
Developer-Friendly: Devs can use their favorite tools, speeding up project timelines.
Performance: API-driven content delivery often means quicker load times.
Future-Proof: Easily adapt to new tech without overhauling your whole system.
Security: Backend separation reduces exposure to potential threats.
Scaling: Grow your content or expand to new markets without hitting technical roadblocks.
Teamwork: Features like real-time editing streamline the content creation process.
While not a one-size-fits-all solution, a headless CMS is worth considering if you need a flexible, scalable content system that can keep up with changing tech.
10 Best Headless CMS Platforms
1. Sanity
Screenshot from: sanity.io, July 2024.
Sanity is a good fit for small to medium-sized teams prioritizing real-time collaboration and a customizable content studio.
In practice, Sanity works well for structuring and managing content, such as product information, for e-commerce sites. Its GROQ query language allows for flexible content retrieval and filtering.
Why It’s Good:
Real-time collaboration with live multi-user editing.
Flexible, schema-driven content modeling for structured content.
Customizable Content Studio for intuitive editorial experience.
Pay-as-you-go pricing is suitable for scaling projects.
Potential Drawbacks:
Steeper learning curve for non-technical users compared to traditional CMSs.
Requires development resources for initial setup and ongoing maintenance.
Tips and Tricks:
Use GROQ queries for flexible content retrieval and filtering.
Leverage Portable Text for rich text content with embedded objects.
Customize input components in Content Studio for optimal editorial UX.
Best Hack:
Set up a real-time preview using Next.js and Sanity’s Listener. Changes in the Content Studio will instantly update the front end, providing a true WYSIWYG editing experience for content teams.
2. Storyblok
Screenshot from: storyblok.com, July 2024.
Storyblok’s key selling point is its visual editor, which empowers marketers to create and manage content independently.
It’s been successfully used for multilingual corporate sites. The visual editor simplifies content updates, while features like field-level translations streamline localization.
Why It’s Good:
Visual editor for drag-and-drop page building.
Reusable block system for modular content creation.
Field-level translations for multi-language content.
Roles and permissions for content governance.
Potential Drawbacks:
Some technical setup is required to fully leverage the visual editor.
Pricing can add up for larger-scale projects with higher content needs.
Tips and Tricks:
Utilize nested components for flexible, reusable page sections.
Set up Language Fallbacks to gracefully handle missing translations.
Integrate with Sketch/Figma for a design-to-CMS workflow.
Best Hack:
With Storyblok’s Visual Composer within Content Stages, you can build and preview pages in a draft stage before pushing them live.
3. Hygraph
Screenshot from: hygraph.com, July 2024.
Formerly GraphCMS, Hygraph stands out for its GraphQL-native approach and Content Federation capabilities for unifying data from disparate sources.
Hygraph has been used to power blogs with user-generated content. Authors can submit posts via forms, which are managed in Hygraph. Content Federation allows merging this with e-commerce data to create shoppable posts.
Why It’s Good:
Intuitive GraphQL API for efficient content querying.
Visual schema builder for modeling content without code.
Content Federation is used to unify data from multiple sources.
Webhooks and API extensions for custom integrations.
Potential Drawbacks:
Content preview capabilities are limited compared to some other headless CMSs.
GraphQL learning curve for teams new to this query language.
Tips and Tricks:
Leverage remote fields to enrich content with external API data.
Use Scheduled Publishing to automate content releases.
Set up Webhooks to trigger external actions based on content changes.
Best Hack:
Combine Hygraph’s Content Stages with Vercel’s Preview Deployments. Configure a webhook to trigger a new preview deployment whenever content is updated in a draft stage. This allows stakeholders to review changes in a live environment before publishing.
4. Contentful
Screenshot from: contentful.com, July 2024.
Contentful is an API-first headless CMS geared towards large organizations with complex content requirements.
It’s well-suited for global, multi-site content platforms. Content types and relationships allow for well-structured, multi-locale content. The app framework enables customizing the editorial experience.
Why It’s Good:
Flexible content modeling with validations and relationships.
App framework for extending functionality.
Localization features for multi-language content.
Multiple API options (REST, GraphQL, Content Delivery API).
Potential Drawbacks:
Higher learning curve due to extensive features and concepts.
Pricing can be prohibitive for smaller projects.
Tips and Tricks:
Use Link Fields to model content relationships and references.
Leverage Environment Aliases to streamline content promotion across spaces.
Extend editorial capabilities with app framework and UI extensions.
Best Hack:
Integrate Contentful with Gatsby Cloud for a powerful JAMstack setup. Contentful’s webhook triggers a new build in Gatsby whenever content changes, ensuring the static front end displays the latest published content.
5. Contentstack
Screenshot from: contentstack.com, July 2024.
Contentstack is another enterprise-focused headless CMS, with strengths in content localization and publishing workflows.
It’s been successfully used to build global product catalogs. Content types mirror hierarchical product data, varying by locale. Workflows enable synchronized multi-region releases, with granular roles and permissions ensuring proper governance.
Why It’s Good:
Roles & permissions for fine-grained access control.
Localization features for managing global, multi-language content.
Content scheduling and versioning capabilities.
Stackable per-locale publishing for phasing releases across regions.
Potential Drawbacks:
UI can feel complex for content teams used to simpler traditional CMSs.
There are some limitations around content relationships compared to other enterprise CMSs.
Tips and Tricks:
Utilize the pre-built Field Editor extensions to customize the editorial interface.
Leverage webhooks to trigger external processes or rebuild content changes.
Use the CDN-powered Content Delivery APIs for optimal performance.
Best Hack:
Combine Contentstack’s Release Workflows with Netlify’s Deploy Previews. Create a workflow automatically generating a deploy preview whenever content is moved to a new workflow stage.
6. Strapi
Screenshot from: strapi.co, July 2024.
Strapi is a popular open-source headless CMS that prioritizes developer experience and customization.
It’s been used as a backend for developer-focused SaaS applications. Custom content types mirror application data structures, while the plugin system allows application-specific features to be added to the admin panel.
Why It’s Good:
Fully customizable API with the ability to define endpoints and logic.
Extensive plugin library for extending functionality.
Open-source, self-hosted option for maximum control.
GraphQL support is available in addition to REST and JSON:API.
Potential Drawbacks:
It requires more technical setup and maintenance than hosted solutions.
Less built-in functionality out-of-the-box compared to paid options.
Tips and Tricks:
Utilize Community and Custom Plugins to add project-specific features.
Leverage Strapi’s JWT authentication to secure API access.
Use Strapi’s Draft System for content staging and approval workflows.
Best Hack:
Deploy Strapi on Heroku with PostgreSQL database. Set up GitHub integration in Heroku to automatically deploy from a repository branch. Any custom plugin code changes pushed to GitHub will trigger a rebuild of the Strapi backend on Heroku.
7. Directus
Screenshot from: directus.io, July 2024.
Directus is a unique headless CMS that sits on top of new or existing SQL databases, instantly providing RESTful API and GraphQL endpoints.
It’s proven helpful for projects with existing SQL databases. Installing Directus on top provides an instant content management layer and API without data migration. The platform allows for tailoring the admin panel to project needs.
Why It’s Good:
Instantly generates APIs from new and existing SQL databases.
Supports various SQL databases (MySQL, PostgreSQL, MS-SQL, Oracle).
Extensible platform with custom endpoints, webhooks, and event hooks.
Open Data Platform for powering data-driven projects beyond typical CMS use cases.
Potential Drawbacks:
Requires SQL database skills for initial setup and data modeling.
Less content-focused out-of-the-box compared to other headless options.
Tips and Tricks:
Use Directus Fields to shape how your database data is presented in the admin.
Leverage Flows for automating data processing tasks on backend events.
Enable Data Accountability for granular activity tracking and versioning.
Best Hack:
Connect Directus to Metabase for powerful data visualization. By leveraging Directus’ SQL mirroring, Metabase can query and chart your content data directly, enabling advanced content analytics and reporting dashboards.
8. Umbraco Heartcore
Screenshot from: umbraco.com/products/umbraco-heartcore/, July 2024.
Umbraco Heartcore is the headless offering from the team behind the popular open-source Umbraco CMS.
It’s a natural fit for existing Umbraco users looking to adopt a headless architecture.
Document Types and Templates from the standard Umbraco CMS can be reused in Heartcore, easing the transition for content editors and .NET developers.
Why It’s Good:
Smooth transition path for existing Umbraco users.
.NET-based for integrating with Microsoft stack.
Umbraco Cloud for managed hosting and deployment.
Familiar, feature-rich editor experience.
Potential Drawbacks:
Less adoption compared to some other headless options.
Requires .NET development skills for implementation.
Tips and Tricks:
Use Umbraco’s Compositions for reusable content structures.
Leverage Umbraco Deploy for moving content and configuration between environments.
Explore the Umbraco Marketplace for editor and developer tools.
Best Hack:
Integrate Umbraco Heartcore with Azure Cognitive Services for AI-powered content features. By calling Azure’s APIs from an Umbraco Controller, you can add capabilities like automated tagging, sentiment analysis, and image recognition to your editorial workflow.
9. Kontent.ai
Screenshot from: kontent.ai, July 2024.
Kontent.ai differentiates itself with features geared towards content operations and planning. It’s been successfully used for large-scale content marketing initiatives.
The content calendar provides a high-level overview of content across channels while ‘Assignments’ and ‘Tasks’ keep teams in sync.
Why It’s Good:
Structured, reusable content models and components.
Collaboration features like suggestions and comments.
AI-powered content insights for optimizing content performance.
Potential Drawbacks:
Relatively higher pricing compared to some other options.
Less developer-focused compared to highly extensible options like Strapi.
Tips and Tricks:
Use Linked Items to connect related content across your project.
Set up Webhooks to trigger builds or notifications on content changes.
Customize the Web Spotlight interface for a tailored, inline preview experience.
Best Hack:
Sync Kontent with Airtable for a versatile content planning setup. Use the Kontent webhooks to send content data to Airtable. Then, leverage Airtable’s views, filters, and fields to plan and track your content production pipeline.
10. Prismic
Screenshot from: prismic.io, July 2024.
Prismic is a developer-friendly headless CMS focusing on slice-based content modeling and multi-language support.
It’s been used for multilingual documentation sites with modern frameworks like Nuxt.
Prismic’s slices allow for componentizing common elements like code snippets and callouts, while the localization features enable translation management.
Why It’s Good:
Content Slices for modular, reusable page components.
Strong multi-language features with built-in translation management.
API browsers and development kits for popular frameworks.
Simple, clean UI for content editors.
Potential Drawbacks:
Fewer collaboration and workflow features compared to some enterprise options.
Slice-based model can take some getting used to for traditional CMS users.
Tips and Tricks:
Utilize Slice Machine for local Slice development and testing.
Set up preview URLs to preview on-page content before publishing.
Use Prismic’s Integration Fields with tools like Shopify and Magento for e-commerce projects.
Best Hack:
Integrate Prismic with Cloudinary for advanced media management. Using a webhook to send Prismic image data to Cloudinary, you can leverage Cloudinary’s transformation and optimization capabilities while managing images within your CMS.
Choosing The Right Headless CMS For Your Needs
To select the ideal headless CMS, consider these factors:
Content Complexity: Match your content structure needs with platform capabilities (e.g., Contentful for complex hierarchies, Strapi for customizable architecture).
Team Skills: Align with your team’s tech expertise (e.g., Hygraph for GraphQL users, Directus for SQL pros).
Localization: For multilingual content, prioritize platforms with strong localization features, such as Contentstack or Contentful.
Integration: Ensure compatibility with your existing tech stack through APIs, SDKs, or plugins.
Collaboration: For large teams, consider platforms with advanced workflow management (e.g., Kontent.ai) or real-time collaboration (e.g., Sanity).
Pricing and Scalability: Balance cost with features and growth potential. Open-source options like Strapi offer savings, while enterprise solutions like Contentful provide high-end features.
Explore options, run proof of concepts (POCs), and gather team feedback before deciding.
A well-chosen headless CMS and ongoing optimization will equip you to handle current and future content challenges.
Google Chrome has officially ended support for the First Input Delay (FID) metric, marking a transition to prioritizing Interaction to Next Paint (INP).
The announcement by Rick Viscomi, who oversees web performance developer relations for the Chrome team, confirms INP as the core metric for evaluating interaction responsiveness.
Today’s the day: Chrome ends support for FID
If you’re still relying on it in Chrome tools, your workflows WILL BREAK
The following tools will stop reporting FID data over the next few days:
PageSpeed Insights
Chrome User Experience Report (CrUX)
web-vitals.js
Web Vitals extension
Background
The move to replace FID with INP stems from limitations in capturing the full scope of interaction responsiveness on the web.
FID only measured the delay between a user’s input and the browser’s response, overlooking other critical phases.
INP takes a more holistic approach by measuring the entire process, from user input to visual updates on the screen.
Transition Period
While the web-vitals.js library will receive a version bump (5.0) to accommodate the change, most other tools will stop reporting FID data without a version update.
The CrUX BigQuery project will remove FID-related fields from its schema starting with the 202409 dataset, scheduled for release in October.
To aid developers in the transition, the Chrome team is also retiring the “Optimize FID” documentation, redirecting users to the updated “Optimize INP” guidance.
We’re also shutting down the old Optimize FID article
Now with better APIs and metrics, there's no reason to optimize ONLY the input delay phase of an interaction. Instead, focus on the entire UX from input to paint https://t.co/DMzeFUelfm
Here are some steps to take in light of the transition from FID to INP:
Familiarize yourself with the INP metric by reviewing the official documentation on web.dev. Understand how INP measures the full lifecycle of an interaction from input to visual update.
Audit your site’s current INP performance using tools like PageSpeed Insights or real-user monitoring services that support INP. Identify areas where interaction responsiveness needs improvement.
Consult the “Optimize INP” guidance on web.dev for best practices on reducing input delay, optimizing event handling, minimizing layout thrashing, and other techniques to enhance INP.
Update any performance monitoring tools or custom scripts that currently rely on the deprecated FID metric to use INP instead. For web-vitals.js users, be prepared for the breaking change in version 5.0 (see the sketch after this list).
If leveraging the CrUX BigQuery dataset, plan to update data pipelines to handle the schema changes, removing FID fields after the 202409 release in October.
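For the web-vitals.js step above, a minimal sketch of an INP-only snippet after the 5.0 change might look like this (the CDN URL and version are placeholders; adjust them to however you already load the library, and replace console.log with your own reporting call):

<script type="module">
  // onFID is gone as of web-vitals 5.0; onINP reports Interaction to Next Paint instead.
  import {onINP} from 'https://unpkg.com/web-vitals@5?module';

  onINP((metric) => {
    // metric.value is in milliseconds; metric.rating is "good", "needs-improvement", or "poor".
    console.log(metric.name, metric.value, metric.rating);
  });
</script>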
By taking these steps, you can ensure a smooth migration to INP.
Breadcrumbs are a navigational feature for your website, and they can greatly impact SEO and user experience.
Many websites still don’t implement breadcrumbs, which is a huge mistake. Not only do breadcrumbs impact SEO, but they are also pretty easy to implement.
Here’s what you need to know about breadcrumbs, how they impact SEO, and common mistakes to avoid.
What Are Breadcrumbs In SEO?
Breadcrumbs are automated internal links that allow users to track their location on a website and their distance from the homepage.
You’ll usually find them at the top of a website or just under the navigation bar.
Just like internal links, they help keep users on a website and help them find the information they are looking for. If users feel disoriented, they can use breadcrumb links to go one level up and continue their journey on the website rather than clicking the browser's back button.
Here’s an example of breadcrumbs from eBay’s website:
Screenshot from eBay, June 2024
It shows exactly what categories I clicked on to land on the page I am viewing.
The breadcrumbs make it easy to backtrack to a previous page if I need to.
4 Common Types Of Breadcrumbs
Not all breadcrumbs are created equal!
There are four main types of breadcrumbs, each with their own purpose.
Before adding breadcrumbs to your site, determine which type will be the best fit for user experience.
1. Location-Based Breadcrumbs
These are the most common type of breadcrumbs. They tell users where they are in the site structure and how to get back to the homepage.
For example: Home > California > San Francisco
Screenshot from cars.com, June 2024
2. Attribute-Based Breadcrumbs
These breadcrumbs are commonly used on ecommerce sites to show what attributes the user has clicked.
For example: Home > Shoes > Hiking > Womens
Screenshot from eBay, June 2024
Please note how smartly eBay handles breadcrumbs for attributes when the trail is too long.
It shows the last three items following the home page and truncates previous ones under a three-dot menu; you can see all previous items in the breadcrumbs upon clicking.
3. Forward Or Look-Ahead Breadcrumbs
This type of breadcrumb not only shows the user’s current path within a website’s hierarchy but also provides a preview of the next steps they can take.
Here is an example from the Statista website, which illustrates how useful it can be by giving users a preview of other sections of the subsection.
Screenshot from Statista, June 2024
4. History-Based Breadcrumbs
This type of breadcrumb is rarely used and shows users what other pages on the site they have visited, similar to a browser history.
For example, if you were searching for SEO news and read three different articles, the breadcrumbs might look like this: Home > SEO article 1 > SEO article 2 > Current page.
But I recommend avoiding this type. Users may reach the same destination through different journeys, which means they would see a different breadcrumb trail each time, and that can be confusing.
If you’re unsure breadcrumbs are worth the hassle (spoiler, they totally are!), then you’ll want to read the section below.
1. Breadcrumbs Improve UX
Breadcrumbs make it easier for users to navigate a website and encourage them to browse other sections.
For example, if you want to learn more about Nestle, you head to its site and end up on the Nestle company history page.
Screenshot from Nestle, June 2024
Using its breadcrumbs, you can easily navigate back to About Us, History, or even its homepage.
It’s a handy way to help users easily find what they are looking for – and hopefully draw them deeper into your website.
2. Keep People Onsite Longer
Bounce rate is not a ranking factor. But keeping users from bouncing can still help SEO as it helps users click and navigate through the website, an engagement signal that Google uses for ranking purposes.
Say, you are looking for a new pair of sneakers on Adidas’s website.
Screenshot from Adidas, June 2024
Using Adidas’s breadcrumbs, you can easily navigate back to the boots category and look for a different pair.
This is great for Adidas because it will likely keep you from returning to Google and landing on another shoe website.
That’s the power of the humble breadcrumb!
A case study on Moz shows what happened when it added breadcrumbs to a site and made several other changes.
Sessions drastically increased in just a few months.
Screenshot from Moz, June 2024
Granted, they also added meta descriptions and eliminated a few other UX issues, but breadcrumbs also played a part.
3. Breadcrumbs Improve Internal Linking
Breadcrumbs are not just a navigational utility; they play a crucial role in enhancing a website's internal linking structure. Google uses breadcrumbs to understand the relationship between pages that sit deeper in the site structure.
As discussed, breadcrumbs make site navigation easier, but they do more than that: Google can display them as rich snippets in the search results.
Screenshot from Google.com
But this doesn't happen until you mark up your breadcrumbs with structured data so Google can pick them up and surface them in search engine results pages (SERPs).
Here is a JSON-LD structured data example for a breadcrumb trail like the one shown in the rich snippet above (the URLs and names below are placeholders you would swap for your own pages):
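<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "BreadcrumbList",
  "itemListElement": [
    {
      "@type": "ListItem",
      "position": 1,
      "name": "Home",
      "item": "https://www.example.com/"
    },
    {
      "@type": "ListItem",
      "position": 2,
      "name": "Shoes",
      "item": "https://www.example.com/shoes/"
    },
    {
      "@type": "ListItem",
      "position": 3,
      "name": "Hiking"
    }
  ]
}
</script>
The last item represents the current page, which is why its "item" URL can be left out.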
Here is a breakdown of each attribute in the breadcrumb JSON-LD schema.
| Attribute | Description |
| --- | --- |
| @context | Tells search engines where to find the definitions of the structured data vocabulary. |
| @type | Defines the type of schema used, in this case "BreadcrumbList". |
| itemListElement | An array of list items, one per breadcrumb. |
| itemListElement[position] | Indicates the position of the breadcrumb in the list, starting from 1. |
| itemListElement[item] | The URL of the breadcrumb's target page. |
| itemListElement[name] | The visible name of the breadcrumb as it appears to users. |
Please note that you can't game Google by adding structured data to the website without an actual breadcrumb visible to users.
If Google detects that kind of manipulation, which violates its guidelines, you may receive a manual action. That won't necessarily cause a drop in rankings, but your website will no longer be eligible for any kind of rich snippets in search results.
So, the golden rule is that every schema markup you have on the website has to exist on the page and be visible to users.
4 Common Mistakes When Using Breadcrumbs For SEO
Implementing breadcrumbs is a straightforward way to improve a site’s SEO and provide better UX.
However, sometimes, implementing breadcrumbs could cause more harm than good.
Here are a few breadcrumb mistakes you’ll want to avoid.
1. Don’t Go Too Big or Too Small – Aim For Just Right
Breadcrumbs should be easy to see but unobtrusive.
A slightly smaller font is fine, but text that is too small will be hard to see and hard to tap on mobile devices.
Position them at the top of the page, beneath the hero image, or just above the H1 title so they are easy to find.
2. Don’t Just Repeat Your Navigation Bar
If the breadcrumbs just duplicate what is already in your navbar, they might not serve any additional purpose.
There’s no need to add more coding (and take up room!) if it doesn’t help.
3. Don’t Ditch Your Navigation Bar In Favor Of Breadcrumbs
While you don’t want to repeat navigation, you also don’t want to rely entirely on breadcrumbs.
They serve as a supplement, not a replacement for other navigational features.
4. Use The Right Type Of Breadcrumbs
Location breadcrumbs are the most common type, but they might not be the best choice for your site.
Don’t use location breadcrumbs if your site doesn’t use a nested structure where most pages fit under a few categories.
In that case, history-based breadcrumbs might be more beneficial.
How To Implement Breadcrumbs In WordPress
Breadcrumbs are an incredibly useful navigation element for both users and search engines — and they are easy to add to your site.
Here are a few ways to add these useful features to your site.
Screenshot from Yoast SEO, June 2024
Use Yoast SEO: If you already use Yoast, adding breadcrumbs only takes a few steps. Simply log in and enable breadcrumbs in the plugin's settings.
WordPress Plugins: If you use WordPress, several plugins can add breadcrumbs in a few steps. I like Breadcrumb NavXT because it is easy to implement and generates locational breadcrumbs that can be customized as needed.
WooCommerce Breadcrumb Plugin: If you have an ecommerce site that uses WooCommerce, consider its breadcrumb plugin, which allows you to restyle the built-in WooCommerce breadcrumbs.
Finally, your site builder or WordPress theme might have a built-in breadcrumb feature.
Shopify, Wix, or Squarespace sites have built-in features you can enable on their settings page.
Breadcrumbs Are An Easy-to-Grasp Way To Navigate Your Website
Think of breadcrumbs as the butter to your bread. The Kermit to your Miss Piggy. The animal sauce to your In N’ Out burger.
You get the point.
Breadcrumbs are a simple change that can help your site stand out on the search results page.
Though they won’t guarantee a significant boost to SERPs, they are helpful to users and search engines alike.
As an added bonus, breadcrumbs are easy to implement using a plugin like Yoast.
In just a few clicks, you could make your site easier to navigate and maybe rank higher in SERPs.
Google has released an update to its Publisher Tag Ads Library, introducing a new feature to improve Interaction to Next Paint (INP) scores.
The update focuses on yielding during out-of-viewport ad slot insertions when using Single Request Architecture (SRA).
INP Improvement: Focus On Ad Loading Efficiency
The new feature allows for more strategic ad loading, particularly for ad slots not immediately visible to users.
The ad library prioritizes more immediate content and interactions by yielding to these out-of-viewport insertions, potentially improving INP scores.
Gilberto Cocchi was first to notice this update:
Google Publisher Tag Ads Library just released an INP specific improvement Yielding on out of viewport ad slots insertions via SRA. Publishers can also decide to yield on every slot including the in-viewport ones by using the adYield Config option. pic.twitter.com/ODpe3tkC0D
Google has also introduced an adYield Config option, giving publishers additional control over ad loading behavior.
This setting allows publishers to extend yielding to all ad slots, including those within the viewport, offering more flexibility in managing site performance.
Potential Impact On INP Scores
The update may affect INP scores, a Core Web Vital metric that measures page responsiveness to user interactions.
Lower INP scores generally indicate better performance, which can influence search engine rankings and user experience.
Upcoming August CrUX Report
The full impact of this update will become more apparent with the release of the next Chrome User Experience (CrUX) report, expected on September 10th.
This report will provide data on INP measurements across websites using the updated Google Publisher Tag Ads Library.
It should provide concrete data on how this update affects real-world INP scores.
INP’s Relevance For Publishers
Since its introduction as a Core Web Vital, INP has become an important metric.
It reflects a site’s responsiveness to user actions and can influence user engagement.
As Google continues emphasizing page experience in ranking systems, INP improvements could affect search visibility.
Implementing The New Feature
Publishers can access this new functionality by updating their Google Publisher Tag implementation.
The adYield Config options are detailed in the library’s documentation. Google advises testing various configurations to determine the best setup for individual site needs.
This update to the Google Publisher Tag Ads Library represents an effort to balance ad delivery, site performance, and user experience in digital publishing.
FAQ
How does the new Google Publisher Tag Ads Library update improve Interaction to Next Paint (INP) scores?
This update improves smart ad loading, especially for ads off-screen. It prioritizes visible content and user interactions to boost INP scores, potentially helping SEO.
The new adYield Config lets publishers extend ad-yielding to all ad slots, including visible ones, for better performance control.
What is the adYield Config option, and how does it benefit publishers?
Google’s new adYield Config setting gives publishers better control over ad loading. It extends yield to all ad slots, even those immediately visible.
Key benefits:
More ad loading control
Flexible performance management
Potential UX and page responsiveness boost
This could indirectly improve INP scores and search visibility.
What is the potential impact of the Google Publisher Tag Ads Library update on INP scores?
This update aims to boost INP scores by delaying ad insertions outside the visible screen area. Better INP scores mean more responsive pages, which can impact search rankings and user experience. Publishers who use this update might see better search visibility.
The full impact will be shown in the next CrUX report, due September 10th.
Choosing the right website builder may depend on your goals. They have a variety of features, and some platforms excel in areas that others don’t.
Not all builders will fit if you need advanced SEO or ecommerce capabilities.
We compared 10 website builders based on price, data limits, core use cases, and whether they provide domains.
The 10 Best Website Builders Compared
| Website Builder | Starting Price | Free Option | Premium Content Gates | Limits | Free Domain | Great For | Extras We Like |
| --- | --- | --- | --- | --- | --- | --- | --- |
| WordPress.com | $9/month | Yes | Yes | 1-50 GB | Yes (annual plans only) | Blogging and text-based sites | Easily work between the .com and self-hosted sites; customizability |
| Wix | $17/month | Yes | Yes | 2 GB-Unlimited | Yes | Small businesses & entrepreneurs | Educational programs and support; scheduling; ad management; email campaigns |
| Duda | $25/month | 14 days | Yes | 1-4 sites | No | Getting started | Excellent help and support; Zapier integration; multiple language sites; content library and free assets |
| HubSpot | $15/month | Yes | Yes | Up to 30 pages on the free plan | No | Scaling | Conversational bots; wide range of free tools for sales, marketing, and services; extensive site and business owner education; mobile app |
| Squarespace | $25/month | 14 days | Yes | Unlimited bandwidth, 30 minutes of video storage | Yes (annual plans only) | Quick, no-fuss sites | Custom product creation without worrying about fulfillment and shipping; integrated ecommerce on larger plans |
| Webflow | $18/month | Yes | Yes | Starts with 1 GB bandwidth and 50 CMS items | Yes | Designers & agencies | Schema markup and structured search support; pre-built interactions |
| IONOS | $6/month | No | No | 50-75 GB | Yes | Small businesses on a budget | Affordable; competitor tracking; online booking included; built-in privacy and SSL |
| Shopify | $5/month | 3 days | No | Unlimited products, bandwidth, and online storage | No | Ecommerce | Wide range of ecommerce features; large app store for extensions |
| Weebly | $12/month | Yes | No | Unlimited storage | Yes | Beginners | Ease of use; built-in SEO tools |
| Hostinger | $2.99/month | No | No | 25,000 visits, 100 GB SSD storage, 400,000 files | Yes | Budget sites | Very affordable plans; 24/7 customer support |
10 Best Website Builders For 2024
1. WordPress.com
Screenshot from WordPress.com, June 2024
With 62.7% of the market share held between WordPress.com and .org, WordPress is the largest and most prominent website builder.
Key Features
Over 50,000 plugins and 8,000 themes for customization.
Ability to transition between hosted and self-hosted options.
With paid plans, custom domains, site security, and advanced features are available.
Benefits & SEO Highlights
User-friendly interface suitable for beginners.
Flexibility to create various types of websites.
Built-in SEO tools and options to optimize your site for search engines.
Cost
$0-$70/month ($0-$45/month, billed annually), plus custom options.
2. Wix
Screenshot from Wix.com, June 2024
Wix controls only 4% of the CMS market, but that small number translates into hundreds of millions of users and makes it one of the most popular website builders.
It offers ease of use and flexibility, making it suitable for creating professional websites with expanded functionality.
Key Features
Customizable templates with drag-and-drop editing.
Wide range of elements and third-party apps for added functionality.
Comprehensive business solutions, including ecommerce and marketing tools.
Benefits & SEO Highlights
Suitable for beginners and those needing advanced features.
Schema.org is a collection of vocabulary (or schemas) used to apply structured data markup to web pages and content. Correctly applying schema can improve SEO outcomes through rich snippets.
Structured data markup is translated by platforms such as Google and Microsoft to provide enhanced rich results (or rich snippets) in search engine results pages or emails. For example, you can markup your ecommerce product pages with variants schema to help Google understand product variations.
Schema.org is an independent project that has helped establish structured data consistency across the internet. It began collaborating with search engines such as Google, Yahoo, Bing, and Yandex back in 2011.
The Schema vocabulary can be applied to pages through encodings such as RDFa, Microdata, and JSON-LD. JSON-LD schema is preferred by Google as it is the easiest to apply and maintain.
Does Schema Markup Improve Your Search Rankings?
Schema is not a ranking factor.
However, your webpage becomes eligible for rich snippets in SERPs only when you use schema markup. This can enhance your search visibility and increase CTR on your webpage from search results.
Schema can also be used to build a knowledge graph of entities and topics. Using semantic markup in this way aligns your website with how AI algorithms categorize entities, assisting search engines in understanding your website and content.
“Most webmasters are familiar with HTML tags on their pages. Usually, HTML tags tell the browser how to display the information included in the tag. For example,
<h1>Avatar</h1> tells the browser to display the text string "Avatar" in a heading 1 format.
However, the HTML tag doesn’t give any information about what that text string means—“Avatar” could refer to the hugely successful 3D movie, or it could refer to a type of profile picture—and this can make it more difficult for search engines to intelligently display relevant content to a user.”
In other words, search engines need additional information to help them figure out what the webpage is about.
You can even link your entities directly to sites like Wikipedia or Google’s knowledge graph to build explicit connections. Using Schema this way can have positive SEO results, according to Martha van Berkel, CEO of Schema App:
“At Schema App, we’ve tested how entity linking can impact SEO. We found that disambiguating entities like places resulted in pages performing better on [near me] and other location-based search queries.
Our experiments also showed that entity linking can help pages show up for more relevant non-branded search queries, increasing click-through rates to the pages.
Here’s an example of entity linking. If your page talks about “Paris”, it can be confusing to search engines because there are several cities in the world named Paris.
If you are talking about the city of Paris in Ontario, Canada, you can use the sameAs property to link the Paris entity on your site to the known Paris, Ontario entity on Wikipedia, Wikidata, and Google’s Knowledge Graph.”
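As an illustration of that advice, a simplified JSON-LD sketch for a page about Paris, Ontario could look like this (the Wikidata URL is a placeholder to replace with the entity's actual identifier):

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Place",
  "name": "Paris",
  "description": "The community of Paris in Ontario, Canada.",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Paris,_Ontario",
    "https://www.wikidata.org/wiki/REPLACE_WITH_ENTITY_ID"
  ]
}
</script>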
By helping search engines understand content, you are assisting them in saving resources (especially important when you have a large website with millions of pages) and increasing the chances for your content to be interpreted properly and ranked well. While this may not be a ranking factor directly, Schema helps your SEO efforts by giving search engines the best chance of interpreting your content correctly, giving users the best chance of discovering it.
What Is Schema Markup Used For?
Some of the most popular uses of schema are the rich result types supported by Google and other search engines.
You may have an object type that has a schema.org definition but is not supported by search engines.
In such cases, it is still worth implementing it: search engines may start supporting it in the future, and you will already have the markup in place.
Types Of Schema Encoding: JSON-LD, Microdata, & RDFa
There are three primary formats for encoding schema markup:
JSON-LD.
Microdata.
RDFa.
Google recommends JSON-LD as the preferred format for structured data; Microdata and RDFa are still supported.
In certain circumstances, it isn't possible to implement JSON-LD due to technical infrastructure limitations (such as an old content management system). In these cases, the only option is to mark up the HTML via Microdata or RDFa.
You can now mix JSON-LD and Microdata formats by matching the @id attribute of the JSON-LD schema with the itemid attribute of the Microdata schema. This approach helps reduce the HTML size of your pages.
For example, in a FAQ section with extensive text, you can use Microdata for the content and JSON-LD for the structured data without duplicating the text, thus avoiding an increase in page size. We will dive deeper into this below when discussing each format in detail.
1. JSON-LD Schema Format
JSON-LD encodes data using JSON, making it easy to integrate structured data into web pages. JSON-LD allows connecting different schema types using a graph with @ids, improving data integration and reducing redundancy.
Let’s look at an example. Let’s say that you own a store that sells high-quality routers. If you were to look at the source code of your homepage, you would likely see something like this:
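A simplified sketch (the business name, address, phone number, and hours below are placeholders):

<div class="business-info">
  <h1>Example Router Store</h1>
  <p>Your source for high-quality routers and networking gear.</p>
  <p>123 Main Street, Springfield</p>
  <p>Call us: (555) 555-0199</p>
  <p>Open Monday to Saturday, 9:00-18:00; Sunday, 11:00-16:00</p>
</div>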
Once you dive into the code, you'll want to find the portion of your webpage that describes what your business offers. In this example, that data can be found between the two <div> tags.
The following JSON-LD formatted text will mark up the information within that HTML fragment; you would typically include it in your webpage's <head> section.
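A sketch of that JSON-LD, using the same placeholder details:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Store",
  "name": "Example Router Store",
  "description": "Your source for high-quality routers and networking gear.",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Main Street",
    "addressLocality": "Springfield"
  },
  "telephone": "(555) 555-0199",
  "openingHoursSpecification": [
    {
      "@type": "OpeningHoursSpecification",
      "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"],
      "opens": "09:00",
      "closes": "18:00"
    },
    {
      "@type": "OpeningHoursSpecification",
      "dayOfWeek": "Sunday",
      "opens": "11:00",
      "closes": "16:00"
    }
  ]
}
</script>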
This snippet of code defines your business as a store via the attribute "@type": "Store".
Then, it details its location, contact information, hours of operation from Monday to Saturday, and different operational hours for Sunday.
By structuring your webpage data this way, you provide critical information directly to search engines, which can improve how they index and display your site in search results. Just like adding tags in the initial HTML, inserting this JSON-LD script tells search engines specific aspects of your business.
Let's review another example: a WebPage schema connected with Organization and Author schemas via @id. JSON-LD is the format Google and other search engines recommend because it's extremely flexible, and this is a great example of why.
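A sketch of such a graph, with example.com placeholders standing in for real URLs and names:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "WebSite",
      "@id": "https://www.example.com/#website",
      "url": "https://www.example.com/",
      "name": "Example News",
      "publisher": { "@id": "https://www.example.com/#organization" }
    },
    {
      "@type": "Organization",
      "@id": "https://www.example.com/#organization",
      "name": "Example Company",
      "alternateName": "Example Co.",
      "url": "https://www.example.com/",
      "foundingDate": "2000",
      "slogan": "Innovation at its best",
      "telephone": "+1-800-555-1212",
      "founder": { "@id": "https://www.example.com/#founder" }
    },
    {
      "@type": "Person",
      "@id": "https://www.example.com/#founder",
      "name": "Jane Smith"
    },
    {
      "@type": "WebPage",
      "@id": "https://www.example.com/news/example-article/#webpage",
      "url": "https://www.example.com/news/example-article/",
      "isPartOf": { "@id": "https://www.example.com/#website" }
    },
    {
      "@type": "NewsArticle",
      "@id": "https://www.example.com/news/example-article/#article",
      "headline": "Example Headline",
      "isPartOf": { "@id": "https://www.example.com/news/example-article/#webpage" },
      "mainEntityOfPage": { "@id": "https://www.example.com/news/example-article/#webpage" },
      "author": { "@id": "https://www.example.com/#founder" },
      "publisher": { "@id": "https://www.example.com/#organization" }
    }
  ]
}
</script>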
In the example:
Website links to the organization as the publisher with @id.
The organization is described with detailed properties.
WebPage links to the WebSite with isPartOf.
NewsArticle links to the WebPage with isPartOf, and back to the WebPage with mainEntityOfPage, and includes the author property via @id.
You can see how graph nodes are linked to each other using the "@id" attribute. This way, we inform Google that it is a webpage published by the publisher described in the schema.
The use of hashes (#) for IDs is optional. You should only ensure that different schema types don’t have the same ID by accident. Adding custom hashes (#) can be helpful, as it provides an extra layer of insurance that they will not be repeated.
You may wonder why we use "@id" to connect graph nodes. Can't we just drop the organization, author, and webpage schemas separately on the same page and assume it's intuitive that they are connected?
The issue is that Google and other search engines cannot reliably interpret these connections unless they are explicitly linked using @id.
Adding additional schema types to the graph is as easy as stacking Lego bricks. Say we want to add an image to the schema:
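For example, an ImageObject node like this one (the URL and dimensions are placeholders):

{
  "@type": "ImageObject",
  "@id": "https://www.example.com/news/example-article/#primaryimage",
  "url": "https://www.example.com/images/example-article.jpg",
  "width": 1200,
  "height": 675
}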
As you already know from the NewsArticle schema, you add it to the schema graph as its own node and link to it via @id.
As you do that, it will have this structure:
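Only the affected nodes of the graph are shown here; the NewsArticle now points at the image via @id:

{
  "@type": "NewsArticle",
  "@id": "https://www.example.com/news/example-article/#article",
  "image": { "@id": "https://www.example.com/news/example-article/#primaryimage" }
},
{
  "@type": "ImageObject",
  "@id": "https://www.example.com/news/example-article/#primaryimage",
  "url": "https://www.example.com/images/example-article.jpg",
  "width": 1200,
  "height": 675
}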
Quite easy, isn’t it? Now that you understand the main principle, you can build your own schema based on the content you have on your website.
And since we live in the age of AI, you may also want to use ChatGPT or other chatbots to help you build any schema you want.
2. Microdata Schema Format
Microdata is a set of tags that aims to make annotating HTML elements with machine-readable tags much easier.
However, the one downside to using Microdata is that you have to mark every individual item within the body of your webpage. As you can imagine, this can quickly get messy.
Take a look at this sample HTML code, which marks up the same organization and founder that appear in the JSON-LD graph above:
<div itemscope itemtype="https://schema.org/Organization">
  <h2>Our Company</h2>
  <p><span itemprop="name">Example Company</span>, also known as <span itemprop="alternateName">Example Co.</span>, is a leading innovator in the tech industry.</p>
  <p>Founded in <span itemprop="foundingDate">2000</span>, we have grown to a team of 200 dedicated employees.</p>
  <p>Our slogan is: "<span itemprop="slogan">Innovation at its best</span>".</p>
  <p>Contact us at <span itemprop="telephone">+1-800-555-1212</span> for customer service.</p>
  <h2>Our Founder</h2>
  <p itemprop="founder" itemscope itemtype="https://schema.org/Person">Our founder, <span itemprop="name">Jane Smith</span>, is a pioneer in the tech industry.</p>
</div>
This example shows how complicated it becomes compared to JSON-LD since the markup is spread over HTML. Let’s understand what is in the markup.
You can see <div> tags like:
<div itemscope>
By adding this tag, we're stating that the HTML code contained between the <div> blocks identifies a specific item.
Next, we have to identify what that item is by using the ‘itemtype’ attribute to identify the type of item (Person).
An item type comes in the form of a URL (such as https://schema.org/Person). If, for example, you were marking up a product, you would use http://schema.org/Product.
To make things easier, you can browse a list of item types and view extensions to identify the specific entity you're looking for. Keep in mind that this list is not all-encompassing but only includes ones that are supported by Google, so there is a possibility that you won't find the item type for your specific niche.
It may look complicated, but Schema.org provides examples of how to use the different item types so you can see what the code is supposed to do.
Don’t worry; you won’t be left out in the cold trying to figure this out on your own!
To use Google's Structured Data Markup Helper, just select your item type, paste in the URL of the target page or the content you want to target, and then highlight the different elements so that you can tag them.
3. RDFa Schema Format
RDFa is an acronym for Resource Description Framework in Attributes. Essentially, RDFa is an extension to HTML5 designed to aid users in marking up structured data.
RDFa isn't much different from Microdata: RDFa attributes are added to the preexisting HTML in the body of your webpage. For familiarity, we'll look at the same example as above.
The HTML for the same JSON-LD news article will look like:
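Here is a simplified sketch of such RDFa markup (the headline, names, and date are placeholders):

<div vocab="https://schema.org/" typeof="NewsArticle">
  <h1 property="headline">Example Headline</h1>
  <time property="datePublished" content="2024-05-15">May 15, 2024</time>
  <div property="author" typeof="Person">
    Written by <span property="name">Jane Smith</span>
  </div>
  <div property="publisher" typeof="Organization">
    Published by <span property="name">Example Company</span>
  </div>
</div>

Notice how the author and publisher are nested with their own typeof, rather than being referenced by ID as in JSON-LD.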
Unlike Microdata, which identifies each item type with a full URL, RDFa declares the vocabulary once with the vocab attribute and then uses one or more words in the typeof attribute to classify types, for example:
<div vocab="http://schema.org/" typeof="WebPage">
If you wish to type the value of a property further (for example, to say that an author is a Person), add the typeof attribute alongside the property attribute.
Let's compare JSON-LD, Microdata, and RDFa side by side. The @type attribute in JSON-LD is equivalent to the itemtype attribute in Microdata and the typeof attribute in RDFa. Similarly, a JSON-LD property name is the equivalent of the itemprop attribute in Microdata and the property attribute in RDFa.
Attribute Name | JSON-LD      | Microdata              | RDFa
Type           | @type        | itemtype               | typeof
ID             | @id          | itemid                 | resource
Property       | propertyName | itemprop               | property
Name           | name         | itemprop="name"        | property="name"
Description    | description  | itemprop="description" | property="description"
For further explanation, you can visit Schema.org to check lists and view examples. You can find which kinds of elements are defined as properties and which are defined as types.
To help, every page on Schema.org provides examples of how to apply tags properly. Of course, you can also fall back on Google’s Structured Data Testing Tool.
4. Mixing Different Formats Of Structured Data With JSON-LD
If you use JSON-LD but certain parts of a page are impractical to describe in it, you can mix schema formats and link them together via @id.
For example, if you have live blogging on the website and a JSON-LD schema, including all live blogging items in the JSON schema would mean having the same content twice on the page, which may increase HTML size and affect First Contentful Paint and Largest Contentful Paint page speed metrics.
You can solve this either by generating the JSON-LD dynamically with JavaScript when the page loads (sketched below) or by marking up the live blogging HTML via the Microdata format and then linking it to your JSON-LD schema in the head section via "@id".
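For the first option, a minimal sketch of injecting JSON-LD with JavaScript might look like this; the LiveBlogPosting object here is illustrative and would normally be built from your live blog data:

<script>
  // Build the JSON-LD object (in practice, generated from the live blog data).
  const liveBlogSchema = {
    "@context": "https://schema.org",
    "@type": "LiveBlogPosting",
    "@id": "https://www.example.com/live-blog-page/#live-blog",
    "headline": "Live Blog Headline",
    "liveBlogUpdate": [
      {
        "@type": "BlogPosting",
        "datePublished": "2023-11-06T13:45:00-05:00",
        "articleBody": "OpenAI is taking the first step in gradual deployment of GPTs."
      }
    ]
  };

  // Inject it into the <head> as a JSON-LD script tag.
  const schemaTag = document.createElement("script");
  schemaTag.type = "application/ld+json";
  schemaTag.textContent = JSON.stringify(liveBlogSchema);
  document.head.appendChild(schemaTag);
</script>

Keep in mind that search engines only see markup injected this way after rendering your JavaScript, which is one reason the Microdata route below can be preferable for critical content.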
Here is an example of the second approach.
Say we have this HTML with Microdata markup, carrying itemid="https://www.example.com/live-blog-page/#live-blog" (shown here in simplified form with representative LiveBlogPosting properties):
<div itemscope itemtype="https://schema.org/LiveBlogPosting" itemid="https://www.example.com/live-blog-page/#live-blog">
  <h2 itemprop="headline">Live Blog Headline</h2>
  <p itemprop="description">Explore the biggest announcements from DevDay</p>
  <div itemprop="liveBlogUpdate" itemscope itemtype="https://schema.org/BlogPosting">
    <time itemprop="datePublished" datetime="2023-11-06T13:45:00-05:00">1:45 PM ET Nov 6, 2023</time>
    <p itemprop="articleBody">OpenAI is taking the first step in gradual deployment of GPTs – tailored ChatGPT for a specific purpose – for safety purposes.</p>
  </div>
  <div itemprop="liveBlogUpdate" itemscope itemtype="https://schema.org/BlogPosting">
    <time itemprop="datePublished" datetime="2023-11-06T13:44:00-05:00">1:44 PM ET Nov 6, 2023</time>
    <p itemprop="articleBody">ChatGPT now uses GPT-4 turbo with current knowledge. It also knows which tool to choose for a task with GPT-4 All Tools.</p>
  </div>
  <div itemprop="liveBlogUpdate" itemscope itemtype="https://schema.org/BlogPosting">
    <time itemprop="datePublished" datetime="2023-11-06T13:43:00-05:00">1:43 PM ET Nov 6, 2023</time>
    <p itemprop="articleBody">Microsoft CEO Satya Nadella joined Altman to announce deeper partnership with OpenAI to help developers bring more AI advancements.</p>
  </div>
</div>
We can link to it from the sample JSON-LD example we had like this:
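One way is to reference the Microdata item's itemid from the JSON-LD graph; the mainEntity property used here is an assumption, one reasonable linking choice among several:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "WebPage",
      "@id": "https://www.example.com/live-blog-page/",
      "url": "https://www.example.com/live-blog-page/",
      "mainEntity": { "@id": "https://www.example.com/live-blog-page/#live-blog" }
    }
  ]
}
</script>

Because the @id matches the itemid in the Microdata block, validators and search engines treat both snippets as describing the same entity without duplicating the content.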
If you copy and paste the HTML and JSON-LD examples above into the schema validator tool, you will see that they validate properly.
The SEO Impact Of Structured Data
This article explored the different schema encoding types and all the nuances regarding structured data implementation.
Schema is much easier to apply than it seems, and it’s a best practice you must incorporate into your webpages. While you won’t receive a direct boost in your SEO rankings for implementing Schema, it can:
Make your pages eligible to appear in rich results.
Ensure your pages get seen by the right users more often.
Avoid confusion and ambiguity.
The work may seem tedious. However, given time and effort, properly implemented Schema markup is good for your website and can lead to better user journeys thanks to the more accurate information you supply to search engines.
Image Credits
Featured Image: Paulo Bobita. Screenshot taken by author.
Whether you are an SEO pro, marketer, or web developer, you might often need to change your browser’s user-agent to test different things.
For example, imagine you're running a macOS-specific campaign. To find out whether your campaign is running properly and not targeting Linux users, changing your browser's user-agent can help you test it.
Changing user-agents is almost a daily task for web developers, as they need to test how websites behave in different browsers and devices.
What Is A User-Agent?
A user-agent is an HTTP request header string identifying browsers, applications, or operating systems that connect to the server.
Browsers have user-agents, and so do bots and crawlers, such as Googlebot and the Google AdSense crawler.
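For example, Chrome on Windows currently sends a user-agent string along these lines (the exact version numbers will vary):

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36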
Screenshot by author, May 2024
Here, we will learn how to change your browser’s user-agent.
The process is called user-agent spoofing.
Spoofing occurs when a browser or other client sends a user-agent HTTP header that differs from its real identity.
While the term may be alarming, this is not a dangerous activity and will not cause you any problems. (Feel free to spoof your user-agent as much as you want.)
How To Change Your User-Agent On Chrome & Edge
Since Microsoft Edge is now using Chromium, the settings for both Chrome and Edge are the same.
1. Right Click Anywhere On Webpage > Inspect
Alternatively, you can use Ctrl + Shift + I on Windows and Cmd + Opt + J on Mac.
Screenshot by author, May 2024
2. Choose More Tools > Network Conditions
Click on the three vertical dots in the upper right corner.
Screenshot by author, May 2024
3. Uncheck Select Automatically Checkbox
Screenshot by author, May 2024
4. Choose One Among The Built-In User-Agents List
Screenshot by author, May 2024
If the user-agent you want isn't in the list, you can enter any string you want in the field below it.
For example, you can enter the following (Googlebot’s user-agent) into the custom field:
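Googlebot's desktop crawler identifies itself with this user-agent string:

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)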
This may be useful for SEO professionals who want to check whether a website is cloaking, i.e., showing one version of a page to Googlebot and a different version to regular visitors.
User-agents are easy to spoof, and anyone can use these simple steps to alter them.
This feature is useful for testing web apps against various devices, especially when the HTML is different for mobile or tablet devices.
It is a cost-efficient way to test websites, since you don't need many physical devices to run your tests.
However, certain issues may appear on a real device but not when you test by changing the user-agent in a browser emulator.
In that case, if you want to test on multiple real devices, I suggest using Browserstack, which offers testing opportunities on almost all devices.
FAQ
What is a user agent?
A user-agent is an HTTP request header string identifying the browser, application, or operating system that connects to the server. Browsers are not the only clients with user-agents; bots and search engine crawlers, such as Googlebot and Google AdSense, have them as well.
What is user-agent spoofing?
When a browser or any other client sends a user-agent HTTP header that differs from its real identity, that is called user-agent spoofing.
How does changing the user-agent help SEO professionals?
SEO professionals may find changing the user-agent to be a critical part of their audit process. It is beneficial for several reasons:
Identifying cloaking issues: By mimicking different user-agents, such as Googlebot, SEO experts can uncover whether a website presents different content to search engines than users, which violates search engine guidelines.
Compatibility: It ensures web applications are compatible across various browsers and devices.
User Experience: Developers can optimize the user experience by understanding how content is rendered on different systems.
Debugging: Changing the user-agent can help pinpoint browser-specific issues.
Quality Assurance: It’s an essential step in quality assurance and helps maintain the integrity and performance of a website.
Can changing your browser’s user-agent pose a security risk?
No, changing your browser’s user-agent, commonly called user-agent spoofing, does not inherently pose a security risk. While the term “spoofing” might suggest malicious intent, this practice in the context of user-agents is harmless. It is a tool for developers and marketers to test how websites and applications interact with various devices and browsers.
This post was sponsored by JetOctopus. The opinions expressed in this article are the sponsor’s own.
If you manage a large website with over 10,000 pages, you can likely appreciate the unique SEO challenges that come with such scale.
Sure, the traditional tools and tactics — keyword optimization, link building, etc. — are important to establish a strong foundation and maintain basic SEO hygiene.
However, they may not fully address the technical complexities of site visibility for search bots and the dynamic needs of a large enterprise website.
This is where log analyzers become crucial. An SEO log analyzer monitors and analyzes server access logs to give you real insights into how search engines interact with your website. It allows you to take strategic action that satisfies both search crawlers and users, leading to stronger returns on your efforts.
In this post, you’ll learn what a log analyzer is and how it can enable your enterprise SEO strategy to achieve sustained success. But first, let’s take a quick look at what makes SEO tricky for big websites with thousands of pages.
The Unique SEO Challenges For Large Websites
Managing SEO for a website with over 10,000 pages isn’t just a step up in scale; it’s a whole different ball game.
Relying on traditional SEO tactics alone limits your site's potential for organic growth. You can have the best titles and content on your pages, but if Googlebot can't crawl them effectively, those pages will be ignored and may never get ranked.
Image created by JetOctopus, May 2024
For big websites, the sheer volume of content and pages makes it difficult to ensure every (important) page is optimized for visibility to Googlebot. Then, the added complexity of an elaborate site architecture often leads to significant crawl budget issues. This means Googlebot is missing crucial pages during its crawls.
Image created by JetOctopus, May 2024
Furthermore, big websites are more vulnerable to technical glitches — such as unexpected tweaks in the code from the dev team — that can impact SEO. This often exacerbates other issues like slow page speeds due to heavy content, broken links in bulk, or redundant pages that compete for the same keywords (keyword cannibalization).
All in all, these issues that come with size necessitate a more robust approach to SEO. One that can adapt to the dynamic nature of big websites and ensure that every optimization effort is more meaningful toward the ultimate goal of improving visibility and driving traffic.
This strategic shift is where the power of an SEO log analyzer becomes evident, providing granular insights that help prioritize high-impact actions. The primary action is to treat Googlebot as your website's most important user: until Googlebot accesses your important pages, they won't rank or drive traffic.
What Is An SEO Log Analyzer?
An SEO log analyzer is essentially a tool that processes and analyzes the data generated by web servers every time a page is requested. It tracks how search engine crawlers interact with a website, providing crucial insights into what happens behind the scenes. A log analyzer can identify which pages are crawled, how often, and whether any crawl issues occur, such as Googlebot being unable to access important pages.
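For example, a single Googlebot request in a typical access log (combined log format) looks something like the entry below; the IP, path, and timestamp are illustrative:

66.249.66.1 - - [12/May/2024:06:25:24 +0000] "GET /category/product-page/ HTTP/1.1" 200 5123 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

A log analyzer aggregates millions of lines like this to show which URLs Googlebot requests, how often, and with what response codes.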
By analyzing these server logs, log analyzers help SEO teams understand how a website is actually seen by search engines. This enables them to make precise adjustments to enhance site performance, boost crawl efficiency, and ultimately improve SERP visibility.
Put simply, a deep dive into the logs data helps discover opportunities and pinpoint issues that might otherwise go unnoticed in large websites.
But why exactly should you focus your efforts on treating Googlebot as your most important visitor?
Why is crawl budget a big deal?
Let’s look into this.
Optimizing Crawl Budget For Maximum SEO Impact
Crawl budget refers to the number of pages a search engine bot — like Googlebot — will crawl on your site within a given timeframe. Once a site’s budget is used up, the bot will stop crawling and move on to other websites.
Crawl budgets vary for every website; your site's budget is determined by Google based on a range of factors, such as the site's size, performance, frequency of updates, and links. When you optimize these factors strategically, you can increase your crawl budget and speed up the ranking of new pages and content.
As you’d expect, making the most of this budget ensures that your most important pages are frequently visited and indexed by Googlebot. This typically translates into better rankings (provided your content and user experience are solid).
And here’s where a log analyzer tool makes itself particularly useful by providing detailed insights into how crawlers interact with your site. As mentioned earlier, it allows you to see which pages are being crawled and how often, helping identify and resolve inefficiencies such as low-value or irrelevant pages that are wasting valuable crawl resources.
An advanced log analyzer like JetOctopus offers a complete view of all the stages from crawling and indexation to getting organic clicks. Its SEO Funnel covers all the main stages, from your website being visited by Googlebot to being ranked in the top 10 and bringing in organic traffic.
Image created by JetOctopus, May 2024
As you can see above, the tabular view shows how many pages are open to indexation versus those closed from indexation. Understanding this ratio is crucial because if commercially important pages are closed from indexation, they will not appear in subsequent funnel stages.
The next stage examines the number of pages crawled by Googlebot: "green pages" are those crawled and within the site structure, while "gray pages" are visited by Googlebot but sit outside the structure (often orphan pages or pages accidentally excluded from the structure), indicating potential crawl budget waste. Hence, it's vital to analyze this part of your crawl budget for optimization.
The later stages include analyzing what percentage of pages are ranked in Google SERPs, how many of these rankings are in the top 10 or top three, and, finally, the number of pages receiving organic clicks.
Overall, the SEO funnel gives you concrete numbers, with links to lists of URLs for further analysis, such as indexable vs. non-indexable pages and how crawl budget waste is occurring. It is an excellent starting point for crawl budget analysis, allowing a way to visualize the big picture and get insights for an impactful optimization plan that drives tangible SEO growth.
Put simply, by prioritizing high-value pages — ensuring they are free from errors and easily accessible to search bots — you can greatly improve your site’s visibility and ranking.
Using an SEO log analyzer, you can understand exactly what should be optimized on pages that crawlers are ignoring, work on them, and thus attract Googlebot visits. A log analyzer also helps optimize other crucial aspects of your website:
Image created by JetOctopus, May 2024
Detailed Analysis of Bot Behavior: Log analyzers allow you to dissect how search bots interact with your site by examining factors like the depth of their crawl, the number of internal links on a page, and the word count per page. This detailed analysis provides you with the exact to-do items for optimizing your site’s SEO performance.
Improves Internal Linking and Technical Performance: Log analyzers provide detailed insights into the structure and health of your site. They help identify underperforming pages and optimize the internal links placement, ensuring a smoother user and crawler navigation. They also facilitate the fine-tuning of content to better meet SEO standards, while highlighting technical issues that may affect site speed and accessibility.
Aids in Troubleshooting JavaScript and Indexation Challenges: Big websites, especially eCommerce, often rely heavily on JavaScript for dynamic content. In the case of JS websites, the crawling process is lengthy. A log analyzer can track how well search engine bots are able to render and index JavaScript-dependent content, underlining potential pitfalls in real-time. It also identifies pages that are not being indexed as intended, allowing for timely corrections to ensure all relevant content can rank.
Helps Optimize Distance from Index (DFI): The concept of Distance from Index (DFI) refers to the number of clicks required to reach any given page from the home page. A lower DFI is generally better for SEO as it means important content is easier to find, both by users and search engine crawlers. Log analyzers help map out the navigational structure of your site, suggesting changes that can reduce DFI and improve the overall accessibility of key content and product pages.
Besides, historical log data offered by a log analyzer can be invaluable. It helps make your SEO performance not only understandable but also predictable. Analyzing past interactions allows you to spot trends, anticipate future hiccups, and plan more effective SEO strategies.
With JetOctopus, you benefit from no volume limits on logs, enabling comprehensive analysis without the fear of missing out on crucial data. This approach is fundamental in continually refining your strategy and securing your site’s top spot in the fast-evolving landscape of search.
Real-World Wins Using Log Analyzer
Big websites in various industries have leveraged log analyzers to attain and maintain top spots on Google for profitable keywords, which has significantly contributed to their business growth.
For example, Skroutz, Greece’s biggest marketplace website with over 1 million sessions daily, set up a real-time crawl and log analyzer tool that helped them know things like:
Does Googlebot crawl pages that have more than two filters activated?
How extensively does Googlebot crawl a particularly popular category?
What are the main URL parameters that Googlebot crawls?
Does Googlebot visit pages with filters like “Size,” which are typically marked as nofollow?
This ability to see real-time visualization tables and historical log data spanning over ten months for monitoring Googlebot crawls effectively enabled Skroutz to find crawling loopholes and decrease index size, thus optimizing its crawl budget.
Eventually, they also saw a reduced time for new URLs to be indexed and ranked — instead of taking 2-3 months to index and rank new URLs, the indexing and ranking phase took only a few days.
This strategic approach to technical SEO using log files has helped Skroutz cement its position as one of the top 1000 websites globally according to SimilarWeb, and the fourth most visited website in Greece (after Google, Facebook, and Youtube) with over 70% share of its traffic from organic search.
Image created by JetOctopus, May 2024
Another case in point is DOM.RIA, Ukraine’s popular real estate and rental listing website, which doubled the Googlebot visits by optimizing their website’s crawl efficiency. As their site structure is huge and elaborate, they needed to optimize the crawl efficiency for Googlebot to ensure the freshness and relevance of content appearing in Google.
Initially, they implemented a new sitemap to improve the indexing of deeper directories. Despite these efforts, Googlebot visits remained low.
By using JetOctopus to analyze their log files, DOM.RIA identified and addressed issues with their internal linking and DFI. They then created mini-sitemaps for poorly scanned directories (such as for the city, including URLs for streets, districts, metro, etc.) while assigning meta tags with links to pages that Googlebot often visits. This strategic change resulted in a more than twofold increase in Googlebot activity on these crucial pages within two weeks.
Image created by JetOctopus, May 2024
Getting Started With An SEO Log Analyzer
Now that you know what a log analyzer is and what it can do for big websites, let’s take a quick look at the steps involved in logs analysis.
Here is an overview of using an SEO log analyzer like JetOctopus for your website:
Integrate Your Logs: Begin by integrating your server logs with a log analysis tool. This step is crucial for capturing all data related to site visits, which includes every request made to the server.
Identify Key Issues: Use the log analyzer to uncover significant issues such as server errors (5xx), slow load times, and other anomalies that could be affecting user experience and site performance. This step involves filtering and sorting through large volumes of data to focus on high-impact problems.
Fix the Issues: Once problems are identified, prioritize and address these issues to improve site reliability and performance. This might involve fixing broken links, optimizing slow-loading pages, and correcting server errors.
Combine with Crawl Analysis: Merge log analysis data with crawl data. This integration allows for a deeper dive into crawl budget analysis and optimization. Analyze how search engines crawl your site and adjust your SEO strategy to ensure that your most valuable pages receive adequate attention from search bots.
And that’s how you can ensure that search engines are efficiently indexing your most important content.
Conclusion
As you can see, the strategic use of log analyzers is more than just a technical necessity for large-scale websites. Optimizing your site’s crawl efficiency with a log analyzer can immensely impact your SERP visibility.
For CMOs managing large-scale websites, embracing a log analyzer and crawler toolkit like JetOctopus is like getting an extra tech SEO analyst that bridges the gap between SEO data integration and organic traffic growth.