Google: Host Resources On Different Hostname To Save Crawl Budget via @sejournal, @MattGSouthern

Google Search Central has launched a new series called “Crawling December” to provide insights into how Googlebot crawls and indexes webpages.

Google will publish a new article each week this month exploring various aspects of the crawling process that are not often discussed but can significantly impact website crawling.

The first post in the series covers the basics of crawling and sheds light on essential yet lesser-known details about how Googlebot handles page resources and manages crawl budgets.

Crawling Basics

Today’s websites are complex due to advanced JavaScript and CSS, making them harder to crawl than old HTML-only pages. Googlebot works like a web browser but on a different schedule.

When Googlebot visits a webpage, it first downloads the HTML from the main URL, which may link to JavaScript, CSS, images, and videos. Then, Google’s Web Rendering Service (WRS) uses Googlebot to download these resources to create the final page view.

Here are the steps in order:

  1. Initial HTML download
  2. Processing by the Web Rendering Service
  3. Resource fetching
  4. Final page construction

Crawl Budget Management

Crawling extra resources can reduce the main website’s crawl budget. To help with this, Google says that “WRS tries to cache every resource (JavaScript and CSS) used in the pages it renders.”

It’s important to note that the WRS cache lasts up to 30 days and is not influenced by the HTTP caching rules set by developers.

This caching strategy helps to save a site’s crawl budget.

Recommendations

This post gives site owners tips on how to optimize their crawl budget:

  1. Reduce Resource Use: Use fewer resources to create a good user experience. This helps save crawl budget when rendering a page.
  2. Host Resources Separately: Place resources on a different hostname, like a CDN or subdomain. This can help shift the crawl budget burden away from your main site.
  3. Use Cache-Busting Parameters Wisely: Be careful with cache-busting parameters. Changing resource URLs can make Google recheck them, even if the content is the same. This can waste your crawl budget.
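One common way to avoid wasteful cache busting is to derive the version parameter from the file’s content rather than from a timestamp or random value, so the URL (and therefore Google’s cached copy) only changes when the resource itself does. A minimal sketch in Python (the path and helper name are invented for illustration):

```python
import hashlib

def fingerprint_url(path: str, content: bytes) -> str:
    """Build an asset URL whose version parameter is derived from the
    file's content, so the URL changes only when the content does."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    return f"{path}?v={digest}"

# The same content always yields the same URL, keeping Google's cache warm...
css_v1 = fingerprint_url("/assets/site.css", b"body { color: #222; }")
css_v1_again = fingerprint_url("/assets/site.css", b"body { color: #222; }")
# ...and only an actual change to the file busts the cache.
css_v2 = fingerprint_url("/assets/site.css", b"body { color: #333; }")
```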

Also, Google warns that blocking resource crawling with robots.txt can be risky.

If Google can’t access a necessary resource for rendering, it may have trouble getting the page content and ranking it properly.
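As a sketch of the risk (the paths here are hypothetical), a robots.txt that blocks script and style directories can prevent Google’s renderer from seeing the full page, while a safer configuration keeps render-critical resources crawlable:

```text
# Risky: if pages need these files to render, Google may not be able
# to see the full page content.
# User-agent: Googlebot
# Disallow: /assets/js/
# Disallow: /assets/css/

# Safer: disallow only genuinely private or low-value paths and keep
# JavaScript and CSS crawlable.
User-agent: *
Disallow: /internal/
Allow: /assets/
```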

Monitoring Tools

The Search Central team says the best way to see what resources Googlebot is crawling is by checking a site’s raw access logs.

You can identify Googlebot by its IP address using the ranges published in Google’s developer documentation.
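As a minimal sketch, addresses pulled from your access logs can be checked with Python’s standard ipaddress module. The two networks below are examples only, drawn from the 66.249.64.0/19 block Googlebot has historically used; always load the current list from the googlebot.json file in Google’s developer documentation.

```python
import ipaddress

# Example ranges only; consult Google's published googlebot.json for the
# authoritative, regularly updated list.
GOOGLEBOT_RANGES = [
    ipaddress.ip_network("66.249.64.0/27"),
    ipaddress.ip_network("66.249.64.32/27"),
]

def is_googlebot_ip(ip: str) -> bool:
    """Return True if an address from the access logs falls inside one
    of the (sample) Googlebot IP ranges above."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in GOOGLEBOT_RANGES)

# An address inside 66.249.64.0/27 matches; an unrelated address does not.
hit = is_googlebot_ip("66.249.64.5")
miss = is_googlebot_ip("203.0.113.9")
```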

Why This Matters

This post clarifies three key points that impact how Google finds and processes your site’s content:

  • Resource management directly affects your crawl budget, so hosting scripts and styles on CDNs can help preserve it.
  • Google caches resources for 30 days regardless of your HTTP cache settings, which helps conserve your crawl budget.
  • Blocking critical resources in robots.txt can backfire by preventing Google from properly rendering your pages.

Understanding these mechanics helps SEOs and developers make better decisions about resource hosting and accessibility – choices that directly impact how well Google can crawl and index their sites.


Featured Image: ArtemisDiana/Shutterstock

Why Is Google Losing Market Share In The EU? via @sejournal, @Kevin_Indig

Most marketers don’t realize that Google has been losing search market share in EU countries.

Image Credit: Kevin Indig

The drop in market share comes at a time when Google’s business is under siege:

  • The DoJ recommended separating Google from Chrome and Android amid a lawsuit against Alphabet. (I summarized the lawsuit and potential outcomes in Monopoly.)
  • The Justice Department runs a separate lawsuit against Google’s advertising business.
  • Canada just sued Google over anti-competitive practices in online ads.
  • ChatGPT, Perplexity & Co are growing mind and market share. (I covered the meteoric rise of ChatGPT in ChatGPT Search.)
  • Google faces heavy regulation in the EU from the DMA (Digital Markets Act), which I wrote about in 2 Internets.

So, the question is two-fold: How much does the drop in market share matter, and what is the driver?

The short answer is that the drop matters more than Alphabet might like to admit.

It gives oxygen to competitors and weakens the body in the fight against external agents. Google’s revenue is still strong, but advertising market share is declining.

A mix of regulation, competitors, and negative sentiment toward Google seem responsible for the drop.

The implication is that marketers increasingly need to track and optimize for more search engines, but a more fragmented playing field could also be an opportunity for more referral traffic from search engines to websites.

What Is Going On With Google?

Image Credit: Kevin Indig

Google’s market share over the last 10 years dipped by 5.6 pp (percentage points) in France and 3.3 pp in Germany.

StatCounter has never recorded such a low share since it began measuring in January 2009.

France and Germany are not the only ones. Most EU countries saw Google’s market share dip over the last five years (mobile):

  • Austria: -4.1 pp.
  • Poland: -3.1 pp.
  • Switzerland: -2.3 pp.
  • Netherlands: -2.1 pp.
  • Denmark: -1.5 pp.

Zooming further in doesn’t change things either. Google’s market share change over the last 12 months (mobile):

  • France: -4.6 pp.
  • Austria: -3.2 pp.
  • Poland: -2.4 pp.
  • Germany: -2.1 pp.
  • Switzerland: -1.3 pp.
  • Netherlands: -1.0 pp.
  • Denmark: -1.0 pp.

What’s going on? The picture becomes clearer when we look at when the trend changes. There are two inflection points: November 2018 and April 2024.

Image Credit: Kevin Indig

The data shows a shift away from Google starting around April 2024, a month after choice screens for browsers and search engines were introduced on Android and Apple devices.

In other words, Google is no longer automatically the default search engine on mobile and desktop devices. We’re starting to see the results.

However, not all countries see a dip. Why?

Why Are Some Countries Flat?

Google’s market share isn’t down in every EU country, e.g.:

  • Portugal.
  • Spain.
  • Italy.
  • Ireland.

How come? These countries are part of Europe, and users see a choice screen.

The answer is devices. The countries listed above lost market share on desktop but not mobile.

Image Credit: Kevin Indig

This pattern holds across the EU: over the last five years, Google lost 2.1 pp of market share on mobile compared to 10 pp on desktop.

Why?

A big part of the reason is the exclusive distribution agreement with Apple.

Windows is the dominant desktop operating system, with over 75% share in the EU, largely because of its domination in corporate computing. macOS has only 15.1%.

While Android (Google’s operating system) also has the majority market share on mobile with 66.5%, Apple’s iOS has 33%.

And since Google pays Apple a reported $20 billion a year to be the default search engine on Apple devices, its position on mobile in the EU was more solid – until the DMA forced choice screens in March 2024.

Image Credit: Kevin Indig

But what about countries that show a decline in Google’s market share before March? Way before!

Why Does The Dip Start Earlier In Some Countries?

Image Credit: Kevin Indig

Google lost market share in countries like Germany and Portugal as early as November 2018. So, there must be something else going on besides choice screens and device-specific dynamics.

Two things happened in 2018: First, GDPR, the European data protection law, came into effect in May 2018. Second, the EU fined Alphabet €4.34 billion for antitrust violations related to Android’s market dominance.

Neither event directly decreased Google’s market share, but both set off a period of mistrust toward Google that gave space to smaller competitors like DuckDuckGo and Bing.

Europeans are much more privacy-sensitive, which means regulatory fines and privacy laws influence consumer behavior much more than in the U.S.

For example, the European privacy search engine StartPage gets 56% of searches from the EU and 21% from the U.S.

Users visit Google less because of privacy concerns. In November 2018, France declared that some ministries would stop using Google as their default search engine.

Choice screens and public perception are the biggest drivers behind Google’s decline. Google sends less referral traffic to websites. So, what is the effect?

Who Wins What Google Loses?

Image Credit: Kevin Indig

The biggest winner of Google’s decline is Bing, the perennial runner-up among search engines.

It’s very possible that ChatGPT and its close affiliation with Microsoft gave Bing a bigger boost in Europe than originally assumed, but Bing is also the second choice in consumers’ minds.

Now, these numbers are still peanuts, and search engines like DuckDuckGo, Ecosia, and QWANT license search results from Bing and Google. So, you could say that Google and Bing win, after all.

However, Ecosia and QWANT are working on a joint web index to become independent from other search engines.

How much longer until DuckDuckGo and others announce their own index as well? When the alpha gets weaker, the smaller animals smell the opportunity.

Despite the decline in market share, Google’s search revenue is still growing impressively fast at its scale. Why?

  1. Market share doesn’t have to correlate with search volume or monetizable queries.
  2. There are more mobile than desktop searches, and mobile searches drop to a smaller degree.
  3. Google still dominates in other markets – the EU alone might not be enough to put a dent in Google’s revenue that the company couldn’t compensate for elsewhere.
  4. Google has ramped up search monetization faster than its market share has dropped.

Relative ad revenue growth, which is predicted to fall below 50% next year, could be a better indicator than absolute growth.

I also want to point out a caveat in the data: StatCounter gathers data by measuring referral traffic on 1.5 million sites. If Google sends less traffic out to websites and keeps more of it on its own properties, that could skew the numbers.

What Are The Implications?

Google’s dropping market share in the EU, combined with potential antitrust remedies (like a forced end to the distribution agreement with Apple) and more competition, will likely fragment Search further.

In other words, we might optimize for more search engines (again). Most of them might function similarly in ranking but might need site owners to take dedicated indexing actions, such as integrating with Bing’s IndexNow.
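As an illustration of what such an integration involves, an IndexNow bulk submission is just a JSON POST to the IndexNow endpoint. The sketch below builds the request body without sending it (host, key, and URL are hypothetical):

```python
import json

def indexnow_payload(host: str, key: str, urls: list[str]) -> str:
    """Build the JSON body for an IndexNow bulk submission.
    POST it to https://api.indexnow.org/indexnow with
    Content-Type: application/json; the key must also be served at
    https://<host>/<key>.txt so the endpoint can verify ownership."""
    return json.dumps({
        "host": host,
        "key": key,
        "urlList": urls,
    })

body = indexnow_payload(
    "www.example.com",
    "a1b2c3d4e5f6",  # hypothetical key for illustration
    ["https://www.example.com/new-post"],
)
```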

We’ve already dusted off our Bing Webmaster Tools when it turned out ChatGPT is using Bing results for its search feature. What’s next? Perplexity webmaster tools? With Bing’s market share growing, SEO professionals should pay more attention to it.

Other search engines don’t have webmaster tools yet – to my surprise. What better way to foster a relationship with site owners than a portal? But with increasingly independent indices, that could become a reality soon.

Ironically, the monopoly lawsuit against Google comes just as the company gets more competition. A 1% market share of a giant like Alphabet can create a unicorn with $1.75 billion in ARR.

Browsers play a critical role in the search engine wars. The DoJ is pushing for Google to divest Chrome, and OpenAI is working on its own browser.

In my opinion, OpenAI should buy Arc. Either way, browsers are the ultimate internet user interface and offer more user information than search engines can chew.

I want to be clear that I don’t think Google is doomed to fail. Google has all the ingredients to come out on top in the “new AI world.” The only reason it will fail is by standing in its own way.




Featured Image: Paulo Bobita/Search Engine Journal

Enhance your SEO skills through experimentation

The world of SEO keeps evolving and changing, which is why it’s important to keep developing your own skills. An excellent way to do this is via hands-on experimentation. In this post, I’ll share three valuable lessons I’ve learned from my previous ventures.

Where it all started

A bit of background: I started experimenting with SEO in 1999 without realizing it, when I created a South Park fan website. It was an early foray into the fundamentals of HTML, and I had fun running different experiments on the site. I discovered that by manipulating meta keywords, I could influence search rankings. That tactic wouldn’t fly nowadays, but it’s still remarkable that I learned about SEO this way rather than through the more predictable entry of my first professional jobs!

It didn’t stop there, though. I kept learning by starting my own businesses and creating my own websites and plugins, which gave me invaluable insight into customer behavior, product development, and marketing. Plus, I gained a deeper understanding of website structures and functionalities, which we all know is invaluable for technical SEO.

Tip 1: Embrace experimentation

It’s unsurprising, then, that my first piece of advice is: embrace experimentation. That’s how I learned most of what I know. Simply start by experimenting on your own personal website or create a new site to work with. If you use tools like LocalWP, you can freely experiment without impacting live websites. 

And don’t shy away from getting your hands dirty with code! Writing code might seem daunting at first, but I promise you it pays off. I taught myself coding in PHP around 2002 and figured it out quite quickly, approaching code like a puzzle I needed to solve. If I could figure it out on my own during my teenage years (when the technology was in a much earlier stage), then you can too. 

Explore new technologies and platforms

We all know WordPress is great. I think so too. It’s a truly unique and amazing platform to get started with, because it allows you to extend and experiment with plugins and to create custom websites to your heart’s desire.

In recent years, more CMSs (content management systems) have launched and seriously upped their game for the wider market. Whilst many of them are good for simpler needs, my preference always returns to WordPress, because my experiments and attempts to scale eventually hit a wall with other CMSs.

Create that website for someone else

After you’ve experimented and gained an understanding of websites and SEO, people you know may start asking you to build a site for them, or to help out with one they already have. Whilst this may sometimes seem annoying at the time, it’s a great opportunity to work on a live website and build a real use case for your skills.

Working with different people and businesses will make sure you encounter different challenges and opportunities to develop new skills. This will ultimately enhance your SEO capabilities.

Tip 2: The importance of a customer-centric mindset

One venture that taught me a great deal was the bar I owned with my wife. Whilst it was far from SEO, I learned lessons there that I still apply in my job today.

It’s the same with any business, online or physical. If you understand who your customer is, you can create content and products that resonate with them. This will make them much more likely to become your customers. With a physical business, it’s easier to engage directly with the customer, but in the digital world this can be more challenging. You can learn a lot by engaging with individual customers or end-users directly through a video call or meeting them in real life—try to do this for your clients or the company you work for.

An interesting story of brand loyalty: one day, the bar received a one-star review on TripAdvisor. The reviewer said they were happy with their visit in general – great service and wine – but complained that there was a dog in the bar. The rating seemed unfair considering that the dog was three tables away from the customer and that it’s a dog-friendly bar (as most are in the suburbs). However, this does happen to businesses from time to time, and we replied to the review. Back at the bar, some regular customers noticed the review and decided to add their own – all five stars. Three days later, the one-star review was removed. This brought our average rating up, which also improved our ranking within TripAdvisor.

This really brought home that not only can a disproportionately negative review have real consequences for a business and its owners, but also showed how brand loyalty counts for so much.

By nurturing and maintaining a relationship with your audience, people will talk about you online and offline. 

Remember NFTs? Non-fungible tokens are a form of digital asset recorded on a blockchain, and they were extremely popular during 2020-2022. You may have seen a couple of them, such as Bored Ape Yacht Club (a generative NFT collection) or the single NFT by Beeple that sold for $69.3m.

During the peak of its popularity, I co-founded an NFT marketing agency. One SEO tactic I used was to create a landing page on my existing agency’s site to sell the service, leveraging that site’s established relevance and authority. As a result, we began ranking faster than any other agency attempting the same, whilst also building out the new agency’s own site in parallel. Building something from the ground up is a long process but still worth it: even the new site eventually ranked independently and earned its own authority.

Avoid putting all your eggs in one trendy basket

Whilst the NFT marketing agency gave me a lot of invaluable experience and garnered new connections, the trend—and therefore the business—didn’t last.

This experience highlighted the limitations of niche trends for me. It was a great learning experience, but it taught me that trends are usually not a solid foundation for long-term goals. Whilst it’s great to go “all in” on a new venture, make sure your current one remains supported, or balance both until one reaches the point where you can make yourself redundant in the other.

Get experimenting!

I hope this post helps nudge you to explore beyond business as usual. After all, the best way to enhance your SEO and other professional skills is by experimenting!


Structured Data In 2024: Key Patterns Reveal The Future Of AI Discovery [Data Study] via @sejournal, @cyberandy

The structured data landscape has undergone significant transformation in 2024, driven by the rise of AI-powered search, the growing importance of machine-readable content, and the need to ground large language models in factual data.

According to the latest edition of the HTTP Archive’s Web Almanac, which analyzes structured data across 16.9 million websites, there is a clear shift from traditional SEO implementation to more sophisticated knowledge graph development that powers AI discovery systems.

While Google deprecated certain rich results like FAQs and HowTos in 2023, it simultaneously introduced an unprecedented number of new structured data types, including vehicle listings, course info, vacation rentals, profile pages, and 3D product models.

In February 2024, it expanded support for product variants and GS1 Digital Link, followed by the beta launch of structured data carousels in March.

This rapid evolution signals a maturing ecosystem where structured data serves not just search visibility but also forms the foundation for factual AI responses, training language models, and enhanced digital product experiences.

Analysis and Methodology

The insights presented in this article are based on the 2024 edition of the Structured Data chapter of the HTTP Archive’s Web Almanac. The annual report analyzes the state of the web by evaluating structured data implementation across 16.9 million websites. The underlying datasets are publicly queryable on BigQuery in the `httparchive.all.*` tables (date = '2024-06-01'), and the analysis relies on tools like WebPageTest, Lighthouse, and Wappalyzer to capture metrics on structured data formats, adoption trends, and performance.

Structured Data Adoption Trends

The analysis reveals compelling growth across major structured data formats:

  • JSON-LD reaches 41% adoption (+7% YoY).
  • RDFa maintains leadership with 66% presence (+3% YoY).
  • Open Graph implementation grows to 64% (+5% YoY).
  • X (Twitter) meta tag usage increases to 45% (+8% YoY).

This widespread adoption indicates that organizations are investing in structured data not just for search visibility, but also to enable AI and crawlers to understand and enhance their digital experiences.

AI Discovery And Knowledge Graphs

The relationship between structured data and AI systems is evolving in complex ways.

While many generative AI search engines are still developing their approach to leveraging structured data, established platforms like Bing Copilot, Google Gemini, and specialized tools like SearchGPT already seem to demonstrate the value of entity-based understanding, particularly for local queries and factual validation.

Training And Entity Understanding

Generative AI search engines are trained on vast datasets that include structured data markup, influencing how they:

  • Recognize and categorize entities (products, locations, organizations).
  • Ground responses. We see this in systems like DataGemma that use structured data to ground responses in verifiable facts.
  • Understand relationships between different data points. This is particularly evident when schema.org is used for aggregating datasets from authoritative sources worldwide.
  • Process specific query types like local business and product searches.

This training shapes how AI systems interpret and respond to queries, particularly visible in:

  • Local business queries where entity attributes match structured data patterns.
  • Product queries that reflect merchant-provided structured data.
  • Knowledge panel information that aligns with entity definitions.

Search Engine Integration

Different platforms demonstrate structured data influence through:

  • Traditional Search: Rich results and knowledge panels directly powered by structured data.
  • AI Search Integration:
    • Bing Copilot showing enhanced results for structured entities.
    • Google Gemini reflecting knowledge graph information.
    • Specialized engines like Perplexity.ai demonstrating entity understanding in location queries.
    • Google’s latest experiment: an AI Sales Assistant integrated into the SERP for shopping queries (this is huge; spotted by SERP Alert on X).
WordLift’s Entity Knowledge Graph Panel on Google Search – Foundation Year.
Asking “When was WordLift founded?” to Google Gemini.

Here is an example of Gemini and Google Search sharing the same factoid.

AI Sales Assistant through a ‘Shop’ CTA on branded sitelinks.

Data Validation And Verification

Structured data provides verification mechanisms through:

  • Knowledge Graphs: Systems like Google’s Data Commons use structured data for fact verification.
  • Training Sets: Schema.org markup creates reliable training examples for entity recognition.
  • Validation Pipelines: Content generation tools, like WordLift, use structured data to verify AI outputs.

The key distinction is that structured data doesn’t directly influence LLM responses, but rather shapes AI search engines through:

  1. Training data that includes structured markup.
  2. Entity class definitions that guide understanding.
  3. Integration with traditional search rich results.

This makes structured data implementation increasingly important for visibility across both traditional and AI-powered search platforms.

As we enter this new era of AI Discovery, investing in structured data isn’t just about SEO anymore – it’s about building the semantic layer that enables machines to truly understand and accurately represent who you are.

Semantic SEO Evolution: From Structured Data To Semantic Data

The practice of SEO has evolved into Semantic SEO, going beyond traditional keyword optimization to embrace semantic understanding:

Entity-Based Optimization

  • Focus on clear entity definitions and relationships.
  • Implementation of comprehensive entity attributes.
  • Strategic use of sameAs properties for entity disambiguation.
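For example, a hypothetical organization (all names and URLs below are placeholders) might use sameAs to point at its authoritative profiles for entity disambiguation:

```json
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Corp",
  "url": "https://www.example.com/",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Example_Corp",
    "https://www.wikidata.org/wiki/Q0000000",
    "https://www.linkedin.com/company/example-corp"
  ]
}
```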

Content Networks

  • Development of interconnected content clusters.
  • Clear attribution and authorship markup.
  • Rich media relationship definitions.

Key Implementation Patterns In JSON-LD

Content Publishing

Analysis of structured data patterns across millions of websites reveals three dominant implementation trends for content publishers.

JSON-LD patterns for content publishers. (Image from author, November 2024)

Website Structure & Navigation (+6 Million Implementations)

The dominance of WebPage → isPartOf → WebSite (5.8 million) and WebPage → breadcrumb → BreadcrumbList (4.8 million) relationships demonstrates that major websites prioritize clear site architecture and navigation paths.

Site structure remains the foundation of structured data implementation, suggesting that search engines heavily rely on these signals for understanding content hierarchy.

Content Attribution & Authority

Strong patterns emerge around content attribution:

  • Article → author → Person (925,000).
  • Article → publisher → Organization (597,000).
  • BlogPosting → author → Person (217,000).

This focus on authorship and organizational attribution reflects the increasing importance of E-E-A-T signals and content authority in search algorithms.
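A minimal JSON-LD sketch of this attribution pattern (names and URLs are placeholders) looks like:

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example Headline",
  "author": {
    "@type": "Person",
    "name": "Jane Doe",
    "url": "https://www.example.com/authors/jane-doe"
  },
  "publisher": {
    "@type": "Organization",
    "name": "Example Publisher",
    "logo": {
      "@type": "ImageObject",
      "url": "https://www.example.com/logo.png"
    }
  }
}
```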

Rich Media Integration

Consistent implementation of image markup across content types:

  • WebPage → primaryImageOfPage → ImageObject (3 million).
  • Article → image → ImageObject (806,000).

The high frequency of media relationships indicates that publishers recognize the value of structured visual content for both search visibility and user experience.

The data suggests publishers are moving beyond basic SEO markup to create comprehensive machine-readable content graphs that support both traditional search and emerging AI discovery systems.

Local Business & Retail

Analysis of local business structured data implementation reveals three critical pattern groups that dominate location-based markup.

JSON-LD patterns for local business and retail. (Image from author, November 2024)

Location & Accessibility (1.4+ Million Implementations)

High adoption of physical location markup demonstrates its fundamental importance:

  • LocalBusiness → address → PostalAddress (745,000).
  • Place → address → PostalAddress (658,000).
  • Organization → contactPoint → ContactPoint (334,000).
  • LocalBusiness → openingHoursSpecification (519,000).

The strong presence of these basic operational details suggests they are core ranking factors for local search visibility.

Geographic Precision

Significant implementation of geo-coordinates shows focus on precise location:

  • Place → geo → GeoCoordinates (231,000).
  • LocalBusiness → geo → GeoCoordinates (205,000).

This dual approach to location (address + coordinates) indicates search engines value precise geographic positioning for local search accuracy.
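A minimal LocalBusiness sketch combining both location patterns with opening hours (all details below are invented for illustration):

```json
{
  "@context": "https://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Café",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "1 Example Street",
    "addressLocality": "Berlin",
    "postalCode": "10115",
    "addressCountry": "DE"
  },
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": 52.5321,
    "longitude": 13.3849
  },
  "openingHoursSpecification": [{
    "@type": "OpeningHoursSpecification",
    "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday"],
    "opens": "09:00",
    "closes": "18:00"
  }]
}
```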

Trust Signals

A smaller but notable pattern group focuses on reputation:

  • LocalBusiness → review → Review (94,000).
  • LocalBusiness → aggregateRating → AggregateRating (70,000).
  • LocalBusiness → photos → ImageObject (42,000).
  • LocalBusiness → makesOffer → Offer (56,000).

While less frequently implemented, these trust-building elements create richer local business entities that support both search visibility and user decision-making.

Ecommerce

Analysis of ecommerce structured data reveals sophisticated implementation patterns that focus on product discovery and conversion optimization.

JSON-LD patterns for ecommerce websites. (Image from author, November 2024)

Core Product Information (4.7+ Million Implementations)

The dominance of basic product markup shows its fundamental importance:

  • Product → offers → Offer (3.1 million).
  • Offer → seller → Organization (2.2 million).
  • Product → mainEntityOfPage → WebPage (1.5 million).

This high adoption rate of core product relationships indicates their critical role in product discovery and merchant visibility.

Trust & Social Proof

Significant implementation of review-related markup:

  • Product → review → Review (490,000).
  • Product → aggregateRating → AggregateRating (201,000).
  • Review → reviewRating → Rating (110,000).

The substantial presence of review markup suggests social proof remains crucial for ecommerce conversion.
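A sketch of the core product pattern together with rating markup (all values are placeholders):

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Example Widget",
  "brand": { "@type": "Brand", "name": "ExampleBrand" },
  "offers": {
    "@type": "Offer",
    "price": "19.99",
    "priceCurrency": "EUR",
    "availability": "https://schema.org/InStock",
    "seller": { "@type": "Organization", "name": "Example Shop" }
  },
  "aggregateRating": {
    "@type": "AggregateRating",
    "ratingValue": "4.6",
    "reviewCount": "128"
  }
}
```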

Enhanced Product Context

Rich product attribute implementation shows a focus on detailed product information:

  • Product → brand → Brand (315,000).
  • Product → additionalProperty → PropertyValue (253,000).
  • Product → image → ImageObject (182,000).
  • Offer → shippingDetails → OfferShippingDetails (151,000).
  • Offer → priceSpecification → PriceSpecification (42,000).
  • AggregateOffer → offers → Offer (69,000).

This layered approach to product attributes creates comprehensive product entities that support both search visibility and user decision-making.

Future Outlook

The role of structured data is expanding beyond its traditional function as an SEO tool for powering rich snippets and specific search features. In the age of AI discovery, structured data is becoming a critical enabler for machine understanding, transforming how content is interpreted and connected across the web. This shift is driving the industry to think beyond Google-centric optimization, embracing structured data as a core component of a semantic and AI-integrated web.

Structured data provides the scaffolding for creating interconnected, machine-readable frameworks, which are vital for emerging AI applications such as conversational search, knowledge graphs, and (Graph) retrieval-augmented generation (GraphRAG or RAG) systems. This evolution calls for a dual approach: leveraging actionable schema types for immediate SEO benefits (rich results) while investing in comprehensive, descriptive schemas that build a broader data ecosystem.

The future lies in the intersection of structured data, semantic modeling, and AI-driven content discovery systems. By adopting a more holistic view, organizations can move from using structured data as a tactical SEO addition to positioning it as a strategic layer for powering AI interactions and ensuring findability across diverse platforms.

Credits And Acknowledgements

This analysis wouldn’t be possible without the dedicated work of the HTTP Archive team and Web Almanac contributors.

The complete Web Almanac Structured Data chapter offers even deeper insights into the evolving landscape of structured data implementation.

As we move toward an AI-powered future, the strategic importance of structured data will continue to grow.



Featured Image: Koto Amatsukami/Shutterstock

Scan of 140K Sites Reveals Best WordPress Plugins via @sejournal, @martinibuster

An analysis of 140,000 sites hosted on managed WordPress host Kinsta revealed the WordPress plugins that users judge to be the best. These findings highlight how publishers prioritize performance, SEO, and user experience.

10. Schema.org Structured Data – Schema Pro – 1.75%

Adding structured data is critical for SEO and in general for making it clear for search engines and AI what the content is about. Only 1.75% of the 140,000 sites scanned by Kinsta use a standalone Schema plugin. The reason may be that users are satisfied with the structured data functionalities offered by SEO Plugins.

The Schema Pro WordPress plugin offers a wider selection of structured data types than most SEO plugins and it also offers the capability to add custom structured data automatically across the entire site targeted to specific kinds of posts or at the individual page level.

9. XML Sitemap Generator for Google Plugin – 2.17%

Sitemaps help encourage search engines to crawl web pages efficiently and in a timely manner. But only 2.17% of sites use this plugin, likely because a basic version of the functionality is native to the WordPress core and every WordPress SEO plugin provides it.

Like the dedicated Schema Pro plugin, the XML Sitemap Generator for Google offers greater flexibility than the built-in sitemap generators found in most SEO plugins. But with only 2.17% adoption, it's clear that SEO plugins are a good-enough fit for most WordPress users, and the extra flexibility matters mainly for edge cases.

8. Broken Link Checker – 3.27%

The Broken Link Checker plugin does exactly what its name suggests, but it is not commonly used in this sample of sites. Google Search Console offers a report of 404 errors discovered by Googlebot, which indicates broken internal and external links, and the same check can be accomplished with a desktop app like Screaming Frog.

The Broken Link Checker plugin offers a cloud-based scanner and a local checker that uses website server resources to monitor the entire website for broken links.

7. SEOPress – 4.81%

SEOPress is the seventh most popular plugin in the sample of 140,000 sites hosted on Kinsta. It's a fairly popular all-in-one SEO plugin, with 300,000+ installations, that facilitates content optimization, schema implementation, and redirection management.

6. All in One SEO – 5.11%

The sixth most popular plugin in the sample is widely used across the web, with 3+ million installations. On Kinsta, it's installed on 5.11% of sites.

5. Imagify – 11.62%

Imagify is an image optimizer that reduces image file sizes to improve website loading time; just under 12% of sites on Kinsta have installed it. The popularity of these kinds of plugins may reflect that the average WordPress user lacks image-optimization skills, even though optimizing an image before uploading it is easy.

4. Rank Math – 18.32%

Rank Math is a highly popular SEO plugin with over 3 million installations worldwide. So it’s not surprising to see that almost 20% of sites hosted on Kinsta use it.

3. WP Rocket – 19.10%

The #3 most popular plugin is for performance optimization, demonstrating how important website performance is for publishers. WP Rocket performs file minification (making code smaller by removing blank spaces), lazy loading, and database optimization. WP Rocket made its plugin compatible with Kinsta so that caching is handled at the server level by Kinsta instead of at the PHP level by the plugin. Handling caching at the server level is faster, uses fewer server resources, and is one of the benefits of a managed WordPress host.
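As a toy illustration of what the minification step does (this is not WP Rocket's actual implementation, which does far more), a naive CSS minifier might strip comments and collapse whitespace like this:

```python
import re

def minify_css(css: str) -> str:
    """Naive CSS minifier: strips comments and collapses whitespace.
    A toy sketch only; production minifiers handle many more cases."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.S)  # remove /* comments */
    css = re.sub(r"\s+", " ", css)                   # collapse runs of whitespace
    css = re.sub(r"\s*([{}:;,])\s*", r"\1", css)     # trim around punctuation
    return css.strip()

original = """
/* header styles */
h1 {
    color: #333;
    margin: 0 auto;
}
"""
minified = minify_css(original)
print(minified)  # → h1{color:#333;margin:0 auto;}
```

Fewer bytes over the wire means faster page loads, which is the whole point of plugins in this category.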

2. Redirection – 26.85%

The Redirection plugin is used by almost 27% of users, which is curious because redirection is something that can be handled in the built-in redirection manager tool in Kinsta’s dashboard. I use the Redirection plugin on some of my sites and it does a lot more than redirects. The plugin features 404 error reporting which alerts users to a problem like a typo in the URL of an external or internal link, which can be fixed by redirecting the typo to the correct URL. The plugin can also set security headers, which is useful for strengthening site security.

1. Yoast – 57.95%

Yoast is the most popular plugin according to the scan Kinsta performed on 140,000 sites. In a way, that's not surprising: Yoast is installed on more than 10 million websites and is a trusted brand.

Takeaways:

The choice of plugins suggests what concerns WordPress users the most: SEO, website performance, and proper site functioning.

Search Optimization

SEO is a strong concern for WordPress users, with a combined 86.19% of sites employing an SEO plugin. The small percentage that installs a dedicated structured data plugin (1.75%) or XML sitemap generator (2.17%) indicates that most users are satisfied with the built-in features of their SEO plugins.

Website Performance

Over 30% of managed WordPress host users in the surveyed sample of 140,000 sites are concerned enough about performance optimization to install plugins (WP Rocket 19.10% and Imagify 11.62%).

Site Health Maintenance

Over 30% of site owners care about keeping their sites working properly, as evidenced by the number who install the Redirection (26.85%) and Broken Link Checker (3.27%) plugins.

Brand Trustworthiness

Over 95% of WordPress users turn to a name brand plugin:

  • Yoast 57.95%
  • WP Rocket 19.10%
  • Rank Math 18.32%

These findings suggest that trust and reliability, comprehensive functionality, and ease of use are important factors guiding the choice of WordPress plugins. It's possible that word-of-mouth recommendations and brand awareness also play a role.

Notable in this survey of plugins that users consider the best is that security plugins did not make the list. This is likely because Kinsta provides built-in WordPress security, including two firewalls and enterprise-level DDoS protection.

Featured Image by Shutterstock/Asier Romero

Maximize SEO Efforts: How To Fix Website Issues That Drain Time, Money & Performance

This post was sponsored by Bluehost. The opinions expressed in this article are the sponsor’s own.

Your website’s hosting is more than a technical decision.

It’s a cornerstone of your business’s online success that impacts everything from site speed and uptime to customer trust and overall branding.

Yet, many businesses stick with subpar hosting providers, often unaware of how much it’s costing them in time, money, and lost opportunities.

The reality is that bad hosting doesn’t just frustrate you. It frustrates your customers, hurts conversions, and can even damage your brand reputation.

The good news?

Choosing the right host can turn hosting into an investment that works for you, not against you.

Let’s explore how hosting affects your bottom line, identify common problems, and discuss what features you should look for to maximize your return on investment.

1. Start By Auditing Your Website’s Hosting Provider

The wrong hosting provider can quickly eat away at your time & efficiency.

In fact, time is the biggest cost of an insufficient hosting provider.

To start out, ask yourself:

  • Is Your Bounce Rate High?
  • Are Customers Not Converting?
  • Is Revenue Down?

If you answered yes to any of those questions, and no amount of on-page optimization seems to make a difference, it may be time to audit your website host.

Why Audit Your Web Host?

Frequent downtime, poor support, and slow server response times can disrupt workflows and create frustration for both your team and your visitors.

From an SEO & marketing perspective, a sluggish website often leads to:

  • Increased bounce rates.
  • Missed customer opportunities.
  • Wasted time troubleshooting technical issues.

Could you find workarounds for some of these problems? Sure. But they take time and money, too.

The more dashboards and tools you use, the more time you spend managing it all, and the more opportunities you’ll miss out on.

For example, hosts offering integrated domain and hosting management make overseeing your website easier and reduce administrative hassles.

Bluehost’s integrated domain services simplify website management by bringing all your hosting and domain tools into one intuitive platform.

2. Check If Your Hosting Provider Is Causing Slow Site Load Speeds

Your website is often the first interaction a customer has with your brand.

A fast, reliable website reflects professionalism and trustworthiness.

Customers associate smooth experiences with strong brands, while frequent glitches or outages send a message that you’re not dependable.

Your hosting provider should enhance your brand’s reputation, not detract from it.

How To Identify & Measure Slow Page Load Speeds

Identifying and measuring slow site and page loading speeds starts with using tools designed to analyze performance, such as Google PageSpeed Insights, GTmetrix, or Lighthouse.

These tools provide metrics like First Contentful Paint (FCP) and Largest Contentful Paint (LCP), which help you see how quickly key elements of your page load.

Pay attention to your site’s Time to First Byte (TTFB), a critical indicator of how fast your server responds to requests.

Regularly test your site’s performance across different devices, browsers, and internet connections to identify bottlenecks. High bounce rates or short average session durations in analytics reports can also hint at speed issues.
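A quick way to interpret those measurements is to compare them against the good / needs-improvement / poor bands Google publishes for these metrics. The sketch below hard-codes the commonly cited cutoffs (in seconds) and is only a rough triage aid, not a substitute for the tools above:

```python
# Commonly cited thresholds from Google's web.dev guidance:
# (good cutoff, needs-improvement cutoff), both in seconds.
THRESHOLDS = {
    "FCP": (1.8, 3.0),
    "LCP": (2.5, 4.0),
    "TTFB": (0.8, 1.8),
}

def rate_metric(name: str, value_s: float) -> str:
    """Classify a measured value into Google's rating bands."""
    good, poor = THRESHOLDS[name]
    if value_s <= good:
        return "good"
    if value_s <= poor:
        return "needs improvement"
    return "poor"

# Example: a 3.1 s LCP lands in the middle band.
print(rate_metric("LCP", 3.1))   # → needs improvement
print(rate_metric("TTFB", 0.5))  # → good
```

A consistently high TTFB across tests is the clearest signal that the server itself, rather than your page assets, is the bottleneck.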

Bandwidth limitations can create bottlenecks for growing websites, especially during traffic spikes.

How To Find A Fast Hosting Provider

Opt for hosting providers that offer unmetered or scalable bandwidth to ensure seamless performance even during periods of high demand.

Cloud hosting is designed to deliver exceptional site and page load speeds, ensuring a seamless experience for your visitors and boosting your site’s SEO.

With advanced caching technology and optimized server configurations, Bluehost Cloud accelerates content delivery to provide fast, reliable performance even during high-traffic periods.

Its scalable infrastructure ensures your website maintains consistent speeds as your business grows, while a global Content Delivery Network (CDN) helps reduce latency for users around the world.

With Bluehost Cloud, you can trust that your site will load quickly and keep your audience engaged.

3. Check If Your Site Has Frequent Or Prolonged Downtime

Measuring and identifying downtime starts with having the right tools and a clear understanding of your site’s performance.

Tools like uptime monitoring services can track when your site is accessible and alert you to outages in real time.

You should also look at patterns.

Frequent interruptions or prolonged periods of unavailability are red flags. Check your server logs for error codes and timestamps that indicate when the site was down.

Tracking how quickly your hosting provider responds and resolves issues is also helpful, as slow resolutions can compound the problem.

Remember, even a few minutes of downtime during peak traffic hours can lead to lost revenue and customer trust, so understanding and monitoring downtime is critical for keeping your site reliable.

No matter how feature-packed your hosting provider is, unreliable uptime or poor support can undermine its value. These two factors are critical for ensuring a high-performing, efficient website.

What Your Hosting Server Should Have For Guaranteed Uptime

A Service Level Agreement (SLA) guarantees uptime, response time, and resolution time, ensuring that your site remains online and functional. Look for hosting providers that back their promises with a 100% uptime SLA.
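The gap between uptime percentages is easy to underestimate. A short calculation shows how much monthly downtime each SLA tier actually permits:

```python
def downtime_allowed(uptime_pct: float, period_hours: float = 30 * 24) -> float:
    """Minutes of downtime permitted per period (default: a 30-day month)."""
    return (1 - uptime_pct / 100) * period_hours * 60

# A "three nines" (99.9%) SLA still allows ~43 minutes of downtime a month.
print(round(downtime_allowed(99.9), 1))   # → 43.2
print(round(downtime_allowed(99.99), 1))  # → 4.3
print(downtime_allowed(100.0))            # → 0.0
```

Only a 100% SLA commits the provider to zero allowed downtime, which is why that clause is worth looking for in the contract.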

Bluehost Cloud offers a 100% uptime SLA and 24/7 priority support, giving you peace of mind that your website will remain operational and any issues will be addressed promptly.

Our team of WordPress experts ensures quick resolutions to technical challenges, reducing downtime and optimizing your hosting ROI.

4. Check Your Host For Security Efficacy

Strong security measures protect your customers and show them you value their privacy and trust.

A single security breach can ruin your brand’s image, especially if customer data is compromised.

Hosts that lack built-in security features like SSL certificates, malware scanning, and regular backups leave your site vulnerable.

How Hosting Impacts Security

Security breaches don’t just affect your website. They affect your customers.

Whether it’s stolen data, phishing attacks, or malware, these breaches can erode trust and cause long-term damage to your business.

Recovering from a security breach is expensive and time-consuming. It often involves hiring specialists, paying fines, and repairing the damage to your reputation.

Is Your Hosting Provider Lacking Proactive Security Measures?

Assessing and measuring security vulnerabilities or a lack of proactive protection measures begins with a thorough evaluation of your hosting provider’s features and practices.

  1. Review Included Security Tools

Start by reviewing whether your provider includes essential security tools such as SSL certificates, malware scanning, firewalls, and automated backups in their standard offerings.

If these are missing or come as costly add-ons, your site may already be at risk.

  2. Use Vulnerability Scanning Tools To Check For Weaknesses

Next, use website vulnerability scanning tools like Sucuri, Qualys SSL Labs, or SiteLock to identify potential weaknesses, such as outdated software, unpatched plugins, or misconfigured settings.

These tools can flag issues like weak encryption, exposed directories, or malware infections.

Monitor your site for unusual activity, such as unexpected traffic spikes or changes to critical files, which could signal a breach.
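One simple check in that spirit is comparing a response's headers against the security headers scanners typically flag. The sketch below evaluates a hypothetical header dict rather than making a live request, and the list of required headers is a common baseline, not an exhaustive audit:

```python
# Security headers commonly flagged by scanners when absent.
REQUIRED_HEADERS = [
    "Strict-Transport-Security",
    "Content-Security-Policy",
    "X-Content-Type-Options",
    "X-Frame-Options",
]

def missing_security_headers(response_headers: dict) -> list:
    """Return the recommended security headers absent from a response."""
    present = {h.lower() for h in response_headers}
    return [h for h in REQUIRED_HEADERS if h.lower() not in present]

# Hypothetical response headers from a site under review:
headers = {
    "Content-Type": "text/html",
    "Strict-Transport-Security": "max-age=31536000",
}
print(missing_security_headers(headers))
```

In practice you would feed in the headers from a real HTTPS response and treat any non-empty result as a follow-up item for your host or developer.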

  3. Make Sure The Host Also Routinely Scans For & Eliminates Threats

It’s also crucial to evaluate how your hosting provider handles updates and threat prevention.

  • Do they offer automatic updates to patch vulnerabilities?
  • Do they monitor for emerging threats and take steps to block them proactively?

A good hosting provider takes a proactive approach to security, offering built-in protections that reduce your risks.

Look for hosting providers that include automatic SSL encryption, regular malware scans, and daily backups. These features not only protect your site but also give you peace of mind.

Bluehost offers robust security tools as part of its standard WordPress hosting package, ensuring your site stays protected without extra costs. With built-in SSL certificates and daily backups, Bluehost Cloud keeps your site secure and your customers’ trust intact.

5. Audit Your WordPress Hosting Provider’s Customer Support

Is your host delivering limited or inconsistent customer support?

Limited or inconsistent customer support can turn minor issues into major roadblocks. When hosting providers fail to offer timely, knowledgeable assistance, you’re left scrambling to resolve problems that could have been easily fixed.

Delayed responses or unhelpful support can lead to prolonged downtime, slower page speeds, and unresolved security concerns, all of which impact your business and reputation.

Reliable hosting providers should offer 24/7 priority support through multiple channels, such as chat and phone, so you can get expert help whenever you need it.

Consistent, high-quality support is essential for keeping your website running smoothly and minimizing disruptions.

Bluehost takes customer service to the next level with 24/7 priority support available via phone, chat, and email. Our team of knowledgeable experts specializes in WordPress, providing quick and effective solutions to keep your site running smoothly.

Whether you’re troubleshooting an issue, setting up your site, or optimizing performance, Bluehost’s dedicated support ensures you’re never left navigating challenges alone.

Bonus: Check Your Host For Hidden Costs For Essential Hosting Features

Hidden costs for essential hosting features can quickly erode the value of a seemingly affordable hosting plan. These include:

  • Backups.
  • SSL certificates.
  • Additional bandwidth.

What Does This Look Like?

For example, daily backups, which are vital for recovery after data loss or cyberattacks, may come with an unexpected monthly fee.

Similarly, SSL certificates, which are essential for encrypting data and maintaining trust with visitors, are often sold as expensive add-ons.

If your site experiences traffic spikes, additional bandwidth charges can catch you off guard, adding to your monthly costs.

Many providers, as you likely have seen, lure customers in with low entry prices, only to charge extra for services that are critical to your website’s functionality and security.

These hidden expenses not only strain your budget but also create unnecessary complexity in managing your site.

A reliable hosting provider includes these features as part of their standard offering, ensuring you have the tools you need without the surprise bills.

Which Hosting Provider Does Not Charge For Essential Features?

Bluehost is a great option, as their pricing is upfront.

Bluehost includes crucial tools like daily automated backups, SSL certificates, and unmetered bandwidth in their standard plans.

This means you won’t face surprise fees for the basic functionalities your website needs to operate securely and effectively.

Whether you’re safeguarding your site from potential data loss, ensuring encrypted, trustworthy connections for your visitors, or relying on unmetered bandwidth to handle traffic surges without penalty, you gain the flexibility to scale without worrying about extra charges.

We even give WordPress users the option to bundle premium plugins together to help you save even more.

By including these features upfront, Bluehost simplifies your WordPress hosting experience and helps you maintain a predictable budget, freeing you to focus on growing your business instead of worrying about unexpected hosting costs.

Transitioning To A Better Hosting Solution: What To Consider

Switching hosting providers might seem daunting, but the right provider can make the process simple and cost-effective. Here are key considerations for transitioning to a better hosting solution:

Migration Challenges

Migrating your site to a new host can involve technical hurdles, including transferring content, preserving configurations, and minimizing downtime. A hosting provider with dedicated migration support can make this process seamless.

Cost of Switching Providers

Many businesses hesitate to switch hosts due to the cost of ending a contract early. To offset these expenses, search for hosting providers that offer migration incentives, such as contract buyouts or credit for remaining fees.

Why Bluehost Cloud Stands Out

Bluehost Cloud provides comprehensive migration support, handling every detail of the transfer to ensure a smooth transition.

Plus, our migration promotion includes $0 switching costs and credit for remaining contracts, making the move to Bluehost not only hassle-free but also financially advantageous.

Your hosting provider plays a pivotal role in the success of your WordPress site. By addressing performance issues, integrating essential features, and offering reliable support, you can maximize your hosting ROI and create a foundation for long-term success.

If your current hosting provider is falling short, it’s time to evaluate your options. Bluehost Cloud delivers performance-focused features, 100% uptime, premium support, and cost-effective migration services, ensuring your WordPress site runs smoothly and efficiently.

In addition, Bluehost has been a trusted partner of WordPress since 2005, working closely to create a hosting platform tailored to the unique needs of WordPress websites.

Beyond hosting, Bluehost empowers users through education, offering webinars, masterclasses, and resources like the WordPress Academy to help you maximize your WordPress experience and build successful websites.

Take control of your website’s performance and ROI. Visit the Bluehost Migration Page to learn how Bluehost Cloud can elevate your hosting experience.

This article has been sponsored by Bluehost, and the views presented herein represent the sponsor’s perspective.


Image Credits

Featured Image: Image by Bluehost. Used with permission.

What the departing White House chief tech advisor has to say on AI

President Biden’s administration will end within two months, and likely to depart with him is Arati Prabhakar, the top mind for science and technology in his cabinet. She has served as Director of the White House Office of Science and Technology Policy since 2022 and was the first to demonstrate ChatGPT to the president in the Oval Office. Prabhakar was instrumental in passing the president’s executive order on AI in 2023, which sets guidelines for tech companies to make AI safer and more transparent (though it relies on voluntary participation). 

The incoming Trump administration has not presented a clear thesis of how it will handle AI, but plenty of people in it will want to see that executive order nullified. Trump said as much in July, endorsing the 2024 Republican Party Platform that says the executive order “hinders AI innovation and imposes Radical Leftwing ideas on the development of this technology.” Venture capitalist Marc Andreessen has said he would support such a move. 

However, complicating that narrative will be Elon Musk, who for years has expressed fears about doomsday AI scenarios, and has been supportive of some regulations aiming to promote AI safety. 

As she prepares for the end of the administration, I sat down with Prabhakar and asked her to reflect on President Biden’s AI accomplishments, and how AI risks, immigration policies, the CHIPS Act and more could change under Trump.  

This conversation has been edited for length and clarity.

Every time a new AI model comes out, there are concerns about how it could be misused. As you think back to what were hypothetical safety concerns just two years ago, which ones have come true?

We identified a whole host of risks when large language models burst on the scene, and the one that has fully manifested in horrific ways is deepfakes and image-based sexual abuse. We’ve worked with our colleagues at the Gender Policy Council to urge industry to step up and take some immediate actions, which some of them are doing. There are a whole host of things that can be done—payment processors could actually make sure people are adhering to their Terms of Use. They don’t want to be supporting [image-based sexual abuse] and they can actually take more steps to make sure that they’re not. There’s legislation pending, but that’s still going to take some time.

Have there been risks that didn’t pan out to be as concerning as you predicted?

At first there was a lot of concern expressed by the AI developers about biological weapons. When people did the serious benchmarking about how much riskier that was compared with someone just doing Google searches, it turns out, there’s a marginally worse risk, but it is marginal. If you haven’t been thinking about how bad actors can do bad things, then the chatbots look incredibly alarming. But you really have to say, compared to what?

For many people, there’s a knee-jerk skepticism about the Department of Defense or police agencies going all in on AI. I’m curious what steps you think those agencies need to take to build trust.

If consumers don’t have confidence that the AI tools they’re interacting with are respecting their privacy, are not embedding bias and discrimination, that they’re not causing safety problems, then all the marvelous possibilities really aren’t going to materialize. Nowhere is that more true than national security and law enforcement. 

I’ll give you a great example. Facial recognition technology is an area where there have been horrific, inappropriate uses: take a grainy video from a convenience store and identify a black man who has never even been in that state, who’s then arrested for a crime he didn’t commit. (Editor’s note: Prabhakar is referring to this story). Wrongful arrests based on a really poor use of facial recognition technology, that has got to stop. 

In stark contrast to that, when I go through security at the airport now, it takes your picture and compares it to your ID to make sure that you are the person you say you are. That’s a very narrow, specific application that’s matching my image to my ID, and the sign tells me—and I know from our DHS colleagues that this is really the case—that they’re going to delete the image. That’s an efficient, responsible use of that kind of automated technology. Appropriate, respectful, responsible—that’s where we’ve got to go.

Were you surprised at the AI safety bill getting vetoed in California?

I wasn’t. I followed the debate, and I knew that there were strong views on both sides. I think what was expressed, that I think was accurate, by the opponents of that bill, is that it was simply impractical, because it was an expression of desire about how to assess safety, but we actually just don’t know how to do those things. No one knows. It’s not a secret, it’s a mystery. 

To me, it really reminds us that while all we want is to know how safe, effective and trustworthy a model is, we actually have very limited capacity to answer those questions. Those are actually very deep research questions, and a great example of the kind of public R&D that now needs to be done at a much deeper level.

Let’s talk about talent. Much of the recent National Security Memorandum on AI was about how to help the right talent come from abroad to the US to work on AI. Do you think we’re handling that in the right way?

It’s a hugely important issue. This is the ultimate American story, that people have come here throughout the centuries to build this country, and it’s as true now in science and technology fields as it’s ever been. We’re living in a different world. I came here as a small child because my parents came here in the early 1960s from India, and in that period, there were very limited opportunities [to emigrate to] many other parts of the world. 

One of the good pieces of news is that there is much more opportunity now. The other piece of news is that we do have a very critical strategic competition with the People’s Republic of China, and that makes it more complicated to figure out how to continue to have an open door for people who come seeking America’s advantages, while making sure that we continue to protect critical assets like our intellectual property. 

Do you think the divisive debates around immigration, especially around the time of the election, may hurt the US ability to bring the right talent into the country?

Because we’ve been stalled as a country on immigration for so long, what is caught up in that is our ability to deal with immigration for the STEM fields. It’s collateral damage.

Has the CHIPS Act been successful?

I’m a semiconductor person starting back with my graduate work. I was astonished and delighted when, after four decades, we actually decided to do something about the fact that semiconductor manufacturing capability got very dangerously concentrated in just one part of the world [Taiwan]. So it was critically important that, with the President’s leadership, we finally took action. And the work that the Commerce Department has done to get those manufacturing incentives out, I think they’ve done a terrific job.

One of the main beneficiaries so far of the CHIPS Act has been Intel. There’s varying degrees of confidence in whether it is going to deliver on building a domestic chip supply chain in the way that the CHIPS Act intended. Is it risky to put a lot of eggs in one basket for one chip maker?

I think the most important thing I see in terms of the industry with the CHIPS Act is that today we’ve got not just Intel, but TSMC, Samsung, SK Hynix and Micron. These are the five companies whose products and processes are at the most advanced nodes in semiconductor technology. They are all now building in the US. There’s no other part of the world that’s going to have all five of those. An industry is bigger than a company. I think when you look at the aggregate, that’s a signal to me that we’re on a very different track.

You are the President’s chief advisor for science and technology. I want to ask about the cultural authority that science has, or doesn’t have, today. RFK Jr. is the pick for health secretary, and in some ways, he captures a lot of frustration that Americans have about our healthcare system. In other ways, he has many views that can only be described as anti-science. How do you reflect on the authority that science has now?

I think it’s important to recognize that we live in a time when trust in institutions has declined across the board, though trust in science remains relatively high compared with what’s happened in other areas. But it’s very much part of this broader phenomenon, and I think that the scientific community has some roles [to play] here. The fact of the matter is that despite America having the best biomedical research that the world has ever seen, we don’t have robust health outcomes. Three dozen countries have longer life expectancies than America. That’s not okay, and that disconnect between advancing science and changing people’s lives is just not sustainable. The pact that science and technology and R&D makes with the American people is that if we make these public investments, it’s going to improve people’s lives and when that’s not happening, it does erode trust. 

Is it fair to say that that gap—between the expertise we have in the US and our poor health outcomes—explains some of the rise in conspiratorial thinking, in the disbelief of science?

It leaves room for that. Then there’s a quite problematic rejection of facts. It’s troubling if you’re a researcher, because you just know that what’s being said is not true. The thing that really bothers me is [that the rejection of facts] changes people’s lives, and it’s extremely dangerous and harmful. Think about if we lost herd immunity for some of the diseases for which we right now have fairly high levels of vaccination. It was an ugly world before we tamed infectious disease with the vaccines that we have. 

This manga publisher is using Anthropic’s AI to translate Japanese comics into English

A Japanese publishing startup is using Anthropic’s flagship large language model Claude to help translate manga into English, allowing the company to churn out a new title for a Western audience in just a few days rather than the two to three months it would take a team of humans.

Orange was founded by Shoko Ugaki, a manga superfan who (according to VP of product Rei Kuroda) has some 10,000 titles in his house. The company now wants more people outside Japan to have access to them. “I hope we can do a great job for our readers,” says Kuroda.

A page from a Manga comic in both Japanese and translated English.
Orange’s Japanese-to-English translation of Neko Oji: Salaryman reincarnated as a kitten!
IMAGES COURTESY ORANGE / YAJIMA

But not everyone is happy. The firm has angered a number of manga fans who see the use of AI to translate a celebrated and traditional art form as one more front in the ongoing battle between tech companies and artists. “However well-intentioned this company might be, I find the idea of using AI to translate manga distasteful and insulting,” says Casey Brienza, a sociologist and author of the book Manga in America: Transnational Book Publishing and the Domestication of Japanese Comics.

Manga is a form of Japanese comic that has been around for more than a century. Hit titles are often translated into other languages and find a large global readership, especially in the US. Some, like Battle Angel Alita or One Piece, are turned into anime (animated versions of the comics) or live-action shows and become blockbuster movies and top Netflix picks. The US manga market was worth around $880 million in 2023 but is expected to reach $3.71 billion by 2030, according to some estimates. “It’s a huge growth market right now,” says Kuroda.

Orange wants a part of that international market. Only around 2% of titles published in Japan make it to the US, says Kuroda. As Orange sees it, the problem is that manga takes human translators too long to translate. By building AI tools to automate most of the tasks involved—including extracting Japanese text from a comic’s panels, translating it into English, generating a new font, pasting the English back into the comic, and checking for mistranslations and typos—the company says it can publish a translated manga title in around one-tenth the time it takes human translators and illustrators working by hand.

Humans still keep a close eye on the process, says Kuroda: “Honestly, AI makes mistakes. It sometimes misunderstands Japanese. It makes mistakes with artwork. We think humans plus AI is what’s important.”

Superheroes, aliens, cats

Manga is a complex art form. Stories are told via a mix of pictures and words, which can be descriptions or characters’ voices or sound effects, sometimes in speech bubbles and sometimes scrawled across the page. Single sentences can be split across multiple panels.

There are also diverse themes and narratives, says Kuroda: “There’s the student romance, mangas about gangs and murders, superheroes, aliens, cats.” Translations must capture the cultural nuance in each story. “This complexity makes localization work highly challenging,” he says.

Orange often starts with nothing more than the scanned image of a page. Its system first identifies which parts of the page contain Japanese text, copies them, and erases the text from each panel. These snippets of text are then combined into whole sentences and passed to the translation module, which not only translates the text into English but keeps track of where on the page each snippet came from. Because Japanese and English have very different word orders, the snippets must be reordered, and the new English text placed in different spots from where the Japanese appeared, all without disrupting the sequence of images.
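The steps described above can be sketched in code. This is a hypothetical illustration only, since Orange’s actual system is proprietary: every class, function, and behavior here is an assumption made for the sake of showing the shape of the pipeline, with stand-ins where the real system would call OCR and translation models.

```python
# Hypothetical sketch of a manga-translation pipeline of the kind the
# article describes. All names and behaviors are illustrative assumptions.
from dataclasses import dataclass


@dataclass
class Snippet:
    panel: int   # which panel the text fragment came from
    order: int   # reading order within that panel
    text: str    # the extracted Japanese text


def merge_snippets(snippets: list[Snippet]) -> str:
    """Combine per-bubble fragments into one sentence in reading order."""
    ordered = sorted(snippets, key=lambda s: (s.panel, s.order))
    return "".join(s.text for s in ordered)


def translate(japanese: str) -> str:
    """Stand-in for the LLM translation call (the article says Orange
    uses Claude); here we just tag the text so the flow is visible."""
    return f"<EN:{japanese}>"


def place_on_page(english: str, snippets: list[Snippet]) -> dict[int, str]:
    """Map translated text back onto panels. English word order differs
    from Japanese, so a real system must re-split and re-letter the text;
    this sketch naively assigns the whole sentence to the first panel."""
    first = min(s.panel for s in snippets)
    return {first: english}


# A sentence split across two panels, as the article notes can happen:
snippets = [Snippet(panel=2, order=1, text="ねこ"),
            Snippet(panel=1, order=1, text="おじ")]
sentence = merge_snippets(snippets)
print(place_on_page(translate(sentence), snippets))
```

The hard parts the sketch glosses over, as the article makes clear, are exactly the ones that still need human review: detecting stylized lettering, splitting the English naturally across bubbles, and preserving cultural nuance.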

“Generally, the images are the most important part of the story,” says Frederik Schodt, an award-winning manga translator who published his first translation in 1977. “Any language cannot contradict the images, so you can’t take many of the liberties that you might in translating a novel. You can’t rearrange paragraphs or change things around much.”


Orange tried several large language models, including its own, developed in-house, before picking Claude 3.5. “We’re always evaluating new models,” says Kuroda. “Right now Claude gives us the most natural tone.”

Claude also has an agent framework that lets several sub-models work together on an overall task. Orange uses this framework to juggle the multiple steps in the translation process.

Orange distributes its translations via an app called Emaqi (a pun on “emaki,” the ancient Japanese illustrated scrolls that are considered a precursor to manga). It also wants to be a translator-for-hire for US publishers.

But Orange has not been welcomed by all US fans. When it showed up at Anime NYC, a US anime convention, this summer, the Japanese-to-English translator Jan Mitsuko Cash tweeted: “A company like Orange has no place at the convention hosting the Manga Awards, which celebrates manga and manga professionals in the industry. If you agree, please encourage @animenyc to ban AI companies from exhibiting or hosting panels.”  

Brienza takes the same view. “Work in the culture industries, including translation, which ultimately is about translating human intention, not mere words on a page, can be poorly paid and precarious,” she says. “If this is the way the wind is blowing, I can only grieve for those who will go from making little money to none.”

Some have also called Orange out for cutting corners. “The manga uses stylized text to represent the inner thoughts that the [protagonist] can’t quite voice,” another fan tweeted. “But Orange didn’t pay a redrawer or letterer to replicate it properly. They also just skip over some text entirely.”

Orange distributes its translations via an app called Emaqi (available only in the US and Canada for now)
EMAQI

Everyone at Orange understands that manga translation is a sensitive issue, says Kuroda: “We believe that human creativity is absolutely irreplaceable, which is why all AI-assisted work is rigorously reviewed, refined, and finalized by a team of people.”  

Orange also claims that the authors it has translated are on board with its approach. “I’m genuinely happy with how the English version turned out,” says Kenji Yajima, one of the authors Orange has worked with, referring to the company’s translation of his title Neko Oji: Salaryman reincarnated as a kitten! (see images). “As a manga artist, seeing my work shared in other languages is always exciting. It’s a chance to connect with readers I never imagined reaching before.”

Schodt sees the upside too. He notes that the US is flooded with poor-quality, unofficial fan-made translations. “The number of pirated translations is huge,” he says. “It’s like a parallel universe.”

He thinks using AI to streamline translation is inevitable. “It’s the dream of many companies right now,” he says. “But it will take a huge investment.” He believes that really good translation will require large language models trained specifically on manga: “It’s not something that one small company is going to be able to pull off.”

“Whether this will prove economically feasible right now is anyone’s guess,” says Schodt. “There is a lot of advertising hype going on, but the readers will have the final judgment.”

The Download: words of wisdom from the departing White House tech advisor, and controversial AI manga translation

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

What the departing White House chief tech advisor has to say on AI

President Biden’s administration will end within two months, and likely to depart with him is Arati Prabhakar, the top mind for science and technology in his cabinet. She has served as Director of the White House Office of Science and Technology Policy since 2022 and was the first to demonstrate ChatGPT to the president in the Oval Office. 

Prabhakar was instrumental in passing the president’s executive order on AI in 2023, which sets guidelines for tech companies to make AI safer and more transparent (though it relies on voluntary participation).

As she prepares for the end of the administration, MIT Technology Review sat down with Prabhakar and asked her to reflect on President Biden’s AI accomplishments, and how the approach to AI risks, immigration policies, the CHIPS Act and more could change under Trump. Read the full story.

—James O’Donnell

This manga publisher is using Anthropic’s AI to translate Japanese comics into English

A Japanese publishing startup is using Anthropic’s flagship large language model Claude to help translate manga into English, allowing the company to churn out a new title for a Western audience in just a few days rather than the 2-3 months it would take a team of humans.

But not everyone is happy about it. The firm has angered a number of manga fans who see the use of AI to translate a celebrated and traditional art form as one more front in the ongoing battle between tech companies and artists. Read the full story.

—Will Douglas Heaven

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 The US has announced more restrictions on chip exports to China
It’s the third round of crackdowns on the industry in as many years. (Reuters)
+ It’s not just China-based companies that could suffer, either. (WP $)
+ The delayed announcement gave China the chance to stockpile affected chips. (WSJ $)
+ Meanwhile, computer scientists in the West are trying to make peace. (Economist $)
+ What’s next in chips. (MIT Technology Review)

2 Donald Trump’s administration is full of pseudo-influencers
They’re capitalizing on their fame to make big bucks ahead of the inauguration. (WP $)
+ A lot of his cabinet also happen to be billionaires. (NY Mag $)

3 We’re not prepared for a clean energy future
It seems energy authorities keep underestimating how much clean power the world really wants. (Vox)
+ Why artificial intelligence and clean energy need each other. (MIT Technology Review)

4 Ads could start cropping up in ChatGPT
OpenAI is on a revenue drive, and advertising is an obvious cash source. (FT $)
+ Elon Musk is doing all he can to prevent it becoming a for-profit business. (Bloomberg $)

5 Chemistry students in Mexico are being lured into making fentanyl
Cartels are offering young chemists large sums to make the drug even more potent. (NYT $)
+ Deaths from fentanyl are falling—and it looks like it’s because of supply changes. (FT $)
+ Anti-opioid groups are cautiously optimistic about Trump’s new tariffs. (The Guardian)

6 BYD isn’t just an EV company these days
It’s carved out an unlikely side gig assembling Apple’s iPads. (WSJ $)
+ BYD has also experimented with shipping for its colossal car consignments. (MIT Technology Review)

7 Our organs age at different rates
And AI is giving us a window into understanding why. (New Scientist $)
+ Aging hits us in our 40s and 60s. But well-being doesn’t have to fall off a cliff. (MIT Technology Review)

8 The unbearable mundanity of home DNA tests
The likelihood of them revealing anything interesting is actually pretty low. (The Guardian)
+ How to… delete your 23andMe data. (MIT Technology Review)

9 This website is full of random, barely-watched home videos
Which one you’ll be served is anyone’s guess. (WP $)

10 Brain rot is the Oxford word of the year
Specifically in the context of spending too long looking at nonsense online. (BBC)

Quote of the day

“It’s like trying to prevent a fisherman from catching bigger fish simply by denying him bigger fishing poles. He’ll get there in the end.”

—Meghan Harris, an export control expert at consultancy Beacon Global Strategies, explains the limits of the US government’s plans to curb China’s chipmaking to the Financial Times.

The big story

The quest to build wildfire-resistant homes

April 2023

With each devastating wildfire in the US West, officials consider new methods or regulations that might save homes or lives the next time.

In the parts of California where the hillsides meet human development, and where the state has suffered recurring seasonal fire tragedies, that search for new means of survival has especially high stakes.

Many of these methods are low cost and low tech, but no less truly innovative. In fact, the hardest part to tackle may not be materials engineering, but social change. Read the full story.

—Susie Cagle

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or tweet ’em at me.)

+ This Instagram account is a treasure trove of bygone mobile phones.
+ The newly renovated Notre Dame cathedral is really quite something.
+ Bad news: we’re probably not going to find alien life any time soon 😭
+ Think you know grilled cheese? This recipe might make you question everything you know and hold dear.

Moving generative AI into production

Generative AI has taken off. Since the introduction of ChatGPT in November 2022, businesses have flocked to large language models (LLMs) and generative AI models looking for solutions to their most complex and labor-intensive problems. The promise that customer service could be turned over to highly trained chat platforms that could recognize a customer’s problem and present user-friendly technical feedback, for example, or that companies could break down and analyze their troves of unstructured data, from videos to PDFs, has fueled massive enterprise interest in the technology. 

This hype is moving into production. The share of businesses that use generative AI in at least one business function nearly doubled this year to 65%, according to McKinsey. The vast majority of organizations (91%) expect generative AI applications to increase their productivity, with IT, cybersecurity, marketing, customer service, and product development among the most impacted areas, according to Deloitte. 

Yet, difficulty successfully deploying generative AI continues to hamper progress. Companies know that generative AI could transform their businesses—and that failing to adopt will leave them behind—but they are faced with hurdles during implementation. This leaves two-thirds of business leaders dissatisfied with progress on their AI deployments. And while, in Q3 2023, 79% of companies said they planned to deploy generative AI projects in the next year, only 5% reported having use cases in production in May 2024. 

“We’re just at the beginning of figuring out how to productize AI deployment and make it cost effective,” says Rowan Trollope, CEO of Redis, a maker of real-time data platforms and AI accelerators. “The cost and complexity of implementing these systems is not straightforward.”

Estimates of the eventual GDP impact of generative AI range from just under $1 trillion to a staggering $4.4 trillion annually, with projected productivity impacts comparable to those of the Internet, robotic automation, and the steam engine. Yet, while the promise of accelerated revenue growth and cost reductions remains, the path to get to these goals is complex and often costly. Companies need to find ways to efficiently build and deploy AI projects with well-understood components at scale, says Trollope.

Download the full report.

This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.