5 Things To Consider Before A Site Migration via @sejournal, @martinibuster

One of the scariest SEO tasks is a site migration because the stakes are so high and there are pitfalls at every step. Here are five tips that will help keep a site migration on track to a successful outcome.

Site Migrations Are Not One Thing

Site migrations are not one thing; they are actually many different scenarios, and the only thing they have in common is that there is always something that can go wrong.

Here are examples of some of the different kinds of site migrations:

  • Migration to a new template
  • Migrating to a new web host
  • Merging two different websites
  • Migrating to a new domain name
  • Migrating to a new site architecture
  • Migrating to a new content management system (CMS)
  • Migrating to a new WordPress site builder

There are many ways a site can change and more ways for those changes to result in a negative outcome.

The following is not a site migration checklist. It’s five suggestions for things to consider.

1. Prepare For Migration: Download Everything

Rule number one is to prepare for the site migration. One of my biggest concerns is making sure the old version of the website is properly documented.

These are some of the ways to document a website:

  • Download the database and save it in at least two places. I like to have a backup of the backup stored on a second device.
  • Download all the website files. Again, I prefer to save a backup of the backup stored on a second device.
  • Crawl the site, save the crawl, and export it as a CSV or an XML sitemap. I prefer to have redundant backups just in case something goes wrong.

An important thing to remember about downloading files by FTP is that there are two formats for downloading files: ASCII and Binary.

  1. Use ASCII for downloading files that contain code, like CSS, JS, PHP and HTML.
  2. Use Binary for media like images, videos and zip files.

Fortunately, most modern FTP software has an automatic setting that can distinguish between the two kinds of files. A common mishap is downloading image files in ASCII format, which results in corrupted images.
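As a rough illustration of the distinction (the extension list and helper names here are my own assumptions, not a standard), Python's built-in ftplib makes the ASCII/binary choice explicit:

```python
import os
from ftplib import FTP

# Extensions treated as plain text, safe for ASCII-mode transfer.
# This list is illustrative -- extend it for your own site's file types.
TEXT_EXTENSIONS = {".css", ".js", ".php", ".html", ".htm", ".txt", ".xml"}

def transfer_mode(filename: str) -> str:
    """Return 'ascii' for code/text files and 'binary' for everything else."""
    ext = os.path.splitext(filename)[1].lower()
    return "ascii" if ext in TEXT_EXTENSIONS else "binary"

def download(ftp: FTP, filename: str) -> None:
    """Download one file from an open FTP connection in the right mode."""
    if transfer_mode(filename) == "ascii":
        # retrlines strips line endings, so re-add them when writing.
        with open(filename, "w", encoding="utf-8") as f:
            ftp.retrlines(f"RETR {filename}", lambda line: f.write(line + "\n"))
    else:
        # retrbinary streams raw bytes, leaving images and archives intact.
        with open(filename, "wb") as f:
            ftp.retrbinary(f"RETR {filename}", f.write)
```

In practice your FTP client handles this automatically, but the sketch shows why a misclassified image ends up corrupted: ASCII mode rewrites line endings inside the file's bytes.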

So always check that your files are properly downloaded and not corrupted. If a third party or a client is handling the migration and downloading the files, consider downloading a copy for yourself as well. That way, if their download fails, you'll still have an uncorrupted copy backed up.

The most important rule about backups: You can never have too many backups!
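Redundant backups only help if the copies are actually identical. One way to verify that, sketched minimally in Python (the helper names are my own), is to compare the copies by checksum:

```python
import hashlib
from pathlib import Path

def file_checksum(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()

def backups_match(primary: Path, secondary: Path) -> bool:
    """True if two backup copies are byte-for-byte identical."""
    return file_checksum(primary) == file_checksum(secondary)
```

Running this against the database dump on both devices confirms that neither copy was corrupted in transit, which is exactly the failure mode the ASCII/binary mix-up causes.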

2. Crawl The Website

Do a complete crawl of the website. Create a backup of the crawl. Then create a backup of the backup and store it on a separate hard drive.

After the site migration, this crawl data can be used to generate a list of the old URLs for recrawling, to identify any that are missing (404), failing to redirect, or redirecting to the wrong webpage. Screaming Frog also has a list mode that can crawl a list of URLs saved in different formats, including an XML sitemap or URLs pasted directly into a text field. This is a way to crawl a specific batch of URLs as opposed to crawling a site from link to link.
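Once that post-migration crawl is done, the results can be triaged programmatically. This minimal Python sketch (the function and category names are illustrative, not from any particular tool; the status and final URL would come from your crawler or HTTP client) classifies each old URL:

```python
from typing import Optional

def classify_redirect(status: int, final_url: Optional[str], expected_url: str) -> str:
    """Classify what happened to one old URL after the migration.

    status       -- HTTP status code returned when requesting the old URL
    final_url    -- the URL the old address ultimately resolved to
                    (None if the server answered without redirecting)
    expected_url -- where the migration redirect map says it should land
    """
    if status == 404:
        return "missing"          # old URL now returns a 404
    if final_url is None:
        return "not redirected"   # no redirect was issued at all
    if final_url != expected_url:
        return "wrong destination"
    return "ok"
```

Grouping the old URLs by these categories turns a raw crawl export into a prioritized fix list: "missing" and "not redirected" pages bleed link equity fastest.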

3. Tips For Migrating To A New Template

Website redesigns can be a major source of anguish when they go wrong. On paper, migrating a site to a new template should be a one-to-one change with minimal issues. In practice, that's not always the case. For one, no template can be used off the shelf; it has to be modified to conform to what's needed, which can mean removing and/or altering the code.

Search marketing expert Nigel Mordaunt (LinkedIn), who recently sold his search marketing agency, has experience migrating over a hundred sites and has important considerations for migrating to a new WordPress template.

This is Nigel’s advice:

“Check that all images have the same URL, alt text and image titles, especially if you’re using new images.

Templates sometimes have hard-coded heading elements, especially in the footer and sidebars. Those should be styled with CSS, not with H tags. I had this problem with a template once where the ranks had moved unexpectedly, then found that the Contact Us and other navigation links were all marked up to H2. I think that was more of a problem a few years ago. But still, some themes have H tags hard coded in places that aren’t ideal.

Make sure that all URLs are the exact same, a common mistake. Also, if planning to change content then check that the staging environment has been noindexed then after the site goes live make sure that the newly uploaded live site no longer contains the noindex robots meta tag.

If changing content, then be prepared for the site to perhaps be re-evaluated by Google. Depending on the size of the site, even if the changes are positive, it may take several weeks to be rewarded, and in some cases several months. The client needs to be informed of this before the migration.

Also, check that analytics and tracking codes have been inserted into the new site, review all image sizes to make sure there are no new images that are huge and haven’t been scaled down. You can easily check the image sizes and heading tags with a post-migration Screaming Frog crawl. I can’t imagine doing any kind of site migration without Screaming Frog.”
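The staging noindex check Nigel describes can be automated. Below is a minimal Python sketch using only the standard library (the helper names are my own, and a real check would first fetch the live page's HTML) that scans a page's markup for a robots noindex directive:

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the content of any <meta name="robots"> tags in a page."""

    def __init__(self):
        super().__init__()
        self.robots_content = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.robots_content.append((attrs.get("content") or "").lower())

def has_noindex(html: str) -> bool:
    """True if the page carries a robots noindex directive in its markup."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return any("noindex" in content for content in parser.robots_content)
```

Run it against the staging site before launch (it should return True) and against the live site after launch (it should return False); note it only inspects the meta tag, not the X-Robots-Tag HTTP header, which would need a separate check.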

4. Advice For Migrating To A New Web Host

Mark Barrera (LinkedIn), VP SEO, Newfold Digital (parent company of Bluehost), had this to say about crawling before a site migration in preparation for a migration to a new web host:

“Thoroughly crawl your existing site to identify any indexing or technical SEO issues prior to the move.

Maintain URL Structure (If Possible): Changing URL structures can confuse search engines and damage your link equity. If possible, keep your URLs the same.

301 Redirects: 301 Redirects are your friend. Search engines need to be informed that your old content now lives at a new address. Implementing 301 redirects from any old URLs to their new counterparts preserves link equity and avoids 404 errors for both users and search engine crawlers.

Performance Optimization: Ensure your new host provides a fast and reliable experience. Site speed is important for user experience.

Be sure to do a final walkthrough of your new site before doing your actual cutover. Visually double-check your homepage, any landing pages, and your most popular search hits. Review any checkout/cart flows, comment/review chains, images, and any outbound links to your other sites or your partners.

SSL Certificate: A critical but sometimes neglected aspect of hosting migrations is the SSL certificate setup. Ensuring that your new host supports and correctly implements your existing SSL certificate—or provides a new one without causing errors is vital. SSL/TLS not only secures your site but also impacts SEO. Any misconfiguration during migration can lead to warnings in browsers, which deter visitors and can temporarily impact rankings.

Post migration, it’s crucial to benchmark server response times not just from one location, but regionally or globally, especially if your audience is international. Sometimes, a new hosting platform might show great performance in one area but lag in other parts of the world. Such discrepancies can affect page load times, influencing bounce rates and search rankings. “
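To put numbers behind that regional comparison, response-time samples (gathered with any HTTP client or monitoring service from each region) can be summarized side by side. A minimal Python sketch, with names of my own choosing:

```python
import statistics

def summarize_latencies(samples_ms: dict) -> dict:
    """Summarize response-time samples (in milliseconds) per region.

    samples_ms maps a region name to a list of measured response times.
    Returns the median and an approximate 95th-percentile value per
    region, so regional outliers stand out at a glance.
    """
    summary = {}
    for region, samples in samples_ms.items():
        ordered = sorted(samples)
        # Nearest-rank percentile; crude but dependency-free.
        p95_index = max(0, int(len(ordered) * 0.95) - 1)
        summary[region] = {
            "median_ms": statistics.median(ordered),
            "p95_ms": ordered[p95_index],
        }
    return summary
```

Comparing medians across regions before and after the cutover makes the kind of geographic lag Mark describes visible in one table instead of anecdotes.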

5. Accept Limitations

Ethan Lazuk, SEO Strategist & Consultant, Ethan Lazuk Consulting, LLC, (LinkedIn, Twitter) offers an interesting perspective on site migrations: anticipate the limitations a client may impose on what you are able to do. It can be frustrating when a client pushes back on advice, and it's important to listen to their reasons for doing so.

I have consulted over Zoom with companies whose SEO departments had concerns about what an external SEO wanted to do. Seeking third-party confirmation of a site migration plan is a reasonable thing to do, so if the internal SEO department has concerns about the plan, it's not a bad idea to have a trustworthy third party take a look at it.

Ethan shared his experience:

“The most memorable and challenging site migrations I’ve been a part of involved business decisions that I had no control over.

As SEOs, we can create a smart migration plan. We can follow pre- and post-launch checklists, but sometimes, there are legal restrictions or other business realities behind the scenes that we have to work around.

Not having access to a DNS, being restricted from using a brand’s name or certain content, having to use an intermediate domain, and having to work days, weeks, or months afterward to resolve any issues once the internal business situations have changed are just a few of the tricky migration issues I’ve encountered.

The best way to handle these situations that require working around client restrictions is to button up the SEO tasks you can control, set honest expectations for how the business issues could impact performance after the migration, and stay vigilant with monitoring post-launch data and using it to advocate for resources you need to finish the job.”

Different Ways To Migrate A Website

Site migrations are a pain and should be approached with caution. I’ve done many different kinds of migrations for my own sites and have assisted clients with theirs. I’m currently moving thousands of webpages from a folder to the root, a job complicated by multiple redirects that have to be reconfigured, and I’m not looking forward to it. But migrations are sometimes unavoidable, so it’s best to step up to them after careful consideration.

Featured Image by Shutterstock/Krakenimages.com

WordPress on Your Desktop: Studio By WordPress & Other Free Tools via @sejournal, @martinibuster

WordPress announced the rollout of Studio by WordPress, a new local development tool that makes it easy for publishers to develop and update websites locally on their desktop or laptop, and that is also useful for learning how to use WordPress. Learn about Studio and other platforms that make it easy to develop websites with WordPress right on your desktop.

Local Development Environments

Local environments are like web hosting spaces on the desktop that can be used to set up a WordPress site. They’re a fantastic way to try out new WordPress themes and plugins and learn how they work without messing up a live website or publishing something to the web that might get accidentally indexed by Google. They are also useful for testing, offline, whether an updated plugin causes a conflict with other plugins before committing to updating the plugins on a live website.

Studio joins a list of popular local development environments that are specific to WordPress, as well as more advanced platforms that can be used for WordPress on the desktop and offer greater flexibility and options, but may be harder for non-developers to use.

Desktop WordPress Development Environments

There are currently a few local environments that are specific to WordPress. The advantage of using a dedicated WordPress environment is that it makes it easy to start creating with WordPress for those who only need to work with WordPress sites and nothing more complicated than that.

Studio By WordPress.com

Studio is an open source project that allows developers and publishers to set up a WordPress site on their desktop in order to design, test or learn how to use WordPress.

According to the WordPress announcement:

“Say goodbye to manual tool configuration, slow site setup, and clunky local development workflows, and say hello to Studio by WordPress.com, our new, free, open source local WordPress development environment.

Once you have a local site running, you can access WP Admin, the Site Editor, global styles, and patterns, all with just one click—and without needing to remember and enter a username or password.”

The goal of Studio is to be a simple and fast way to create WordPress sites on the desktop. It’s currently available for Mac, and a Windows version is coming soon.

Download the Mac version here.

Other Popular WordPress Local Development Environments

DevKinsta

DevKinsta, developed by the managed web host Kinsta, is another development environment that’s specifically dedicated to quickly designing and testing WordPress sites on the desktop. It’s a popular choice that many developers endorse.

That makes it a great tool for publishers, SEOs, and developers who just want a tool that does one thing: create WordPress sites. This makes DevKinsta a solid consideration for anyone who is serious about developing WordPress sites or just wants to learn how to use WordPress, especially the latest Gutenberg Blocks environment.

Download DevKinsta for free here.

Local WP

Local WP is a popular desktop development environment specifically made for WordPress users by WP Engine, a managed WordPress hosting provider.

Useful Features of Local WP

Local WP has multiple features that make it useful beyond simply developing and testing WordPress websites.

  • Image Optimizer
    It features a free image optimizer add-on that optimizes images on your desktop, which should be popular with those who don’t have their own image optimization tools.
  • Upload Backups
    Another handy feature is the ability to upload backups to Dropbox and Google Drive.
  • Link Checker
    The tool has a built-in link checker that scans your local version of the website to identify broken links. This is a great way to check a site offline without using server resources and potentially slowing down your live site.
  • Import & Export Sites
    This has the super-handy ability to import and export WordPress website files, so you can work on a copy of your current WordPress site on your desktop, test out new plugins or themes, and, when you’re ready, upload the files to your live website.

Advanced Local Development Environments

There are other local development environments that are not specific for WordPress but are nonetheless useful for designing and testing WordPress sites on the desktop. These tools are more advanced and are popular with developers who appreciate the freedom and options available in these platforms.

DDEV with Docker

An open source app that uses Docker software containerization to quickly install a content management system and start working, without having to deal with the Docker learning curve.

Download DDEV With Docker here.

Laragon

Laragon is a free local development environment that was recommended to me by someone who is an advanced coder because they said that it’s easy to use and fairly intuitive. They were right. I’ve used it and have had good experiences with it. It’s not a WordPress-specific tool so that must be kept in mind.

Laragon describes itself as an easy-to-use alternative to XAMPP and WAMP.

Download Laragon here.

MAMP

MAMP is a local development platform that’s popular with advanced coders and is available for Mac and Windows.

David McCan (Facebook profile), a WordPress trainer who writes about advanced WordPress topics on WebTNG, shared his experience with MAMP.

“MAMP is pretty easy to setup and it provides a full range of features. I currently have 51 local sites which are development versions of my production sites, that I use for testing plugins, and periodically use for new beta versions of WordPress core. It is easy to clone sites also. I haven’t noticed any system slowdown or lag.”

WAMP And XAMPP

WAMP is a Windows only development environment that’s popular with developers and WordPress theme and plugin publishers.

XAMPP is a PHP development platform that can be used on Linux, Mac, and Windows desktops.

Download WAMP here.

Download XAMPP here.

So Many Local Development Platforms

Studio by WordPress.com is an exciting new local development platform and I’m looking forward to trying it out. But it’s not the only one so it may be useful to try out different solutions to see which one works best for you.

Read more about Studio by WordPress:

Meet Studio by WordPress.com—a fast, free way to develop locally with WordPress

Featured Image by Shutterstock/Wpadington

Big Update To Google’s Ranking Drop Documentation via @sejournal, @martinibuster

Google updated its guidance with five changes on how to debug ranking drops. The new version contains over 400 more words addressing small and large ranking drops. There’s room to quibble with some of the changes, but overall the revised version is a step up from what it replaced.

Change #1: Downplays Fixing Traffic Drops

The opening sentence was changed so that it offers less hope for bouncing back from an algorithmic traffic drop. Google also joined two sentences into one in the revised version of the documentation.

The documentation previously said that most traffic drops can be reversed and that identifying the reasons for a drop isn’t straightforward. The part about “most of them can be reversed” was completely removed.

Here are the original two sentences:

“A drop in organic Search traffic can happen for several reasons, and most of them can be reversed. It may not be straightforward to understand what exactly happened to your site”

Now there’s no hope offered that “most of them can be reversed” and more emphasis on the fact that understanding what happened is not straightforward.

This is the new guidance:

“A drop in organic Search traffic can happen for several reasons, and it may not be straightforward to understand what exactly happened to your site.”

Change #2: Security Or Spam Issues

Google updated the traffic graph illustrations so that they precisely align with the causes of each kind of traffic decline.

The previous version of the graph was labeled:

“Site-level technical issue (Manual Action, strong algorithmic changes)”

The problem with the previous label is that manual actions and strong algorithmic changes are not technical issues; the new version fixes that.

The updated version now reads:

“Large drop from an algorithmic update, site-wide security or spam issue”

Change #3: Technical Issues

There’s one more change to a graph label, also to make it more accurate.

This is how the previous graph was labeled:

“Page-level technical issue (algorithmic changes, market disruption)”

The updated graph is now labeled:

“Technical issue across your site, changing interests”

Now the graph and label are more specific, describing a sitewide change, and “changing interests” is more general, covering a wider range of changes than market disruption. Changing interests includes market disruption (where a new product makes a previous one obsolete or less desirable), but it also includes products that go out of style or lose their trendiness.

Change #4: Google Adds New Guidance For Algorithmic Changes

The biggest change by far is the brand-new section for algorithmic changes, which replaces two smaller sections: one about policy violations and manual actions, and a second about algorithm changes.

The old version of this one section had 108 words. The updated version contains 443 words.

A section that’s particularly helpful is where the guidance splits algorithmic update damage into two categories.

Two New Categories:

  • Small drop in position? For example, dropping from position 2 to 4.
  • Large drop in position? For example, dropping from position 4 to 29.

The two new categories are perfect and align with what I’ve seen in the search results for sites that have lost rankings. The reasons for moving up and down within the top ten are different from the reasons why a site drops completely out of the top ten.

I don’t agree with the guidance for large drops. They recommend reviewing your site, which is good advice for some sites that have lost rankings. But in other cases there is nothing wrong with the site, and this is where less experienced SEOs tend to struggle, because there is nothing on the site to fix. Recommendations for improving EEAT, adding author bios, or filing link disavows don’t help in those cases; the problem is something else.

Here is the new guidance for debugging search position drops:

Algorithmic update
Google is always improving how it assesses content and updating its search ranking and serving algorithms accordingly; core updates and other smaller updates may change how some pages perform in Google Search results. We post about notable improvements to our systems on our list of ranking updates page; check it to see if there’s anything that’s applicable to your site.

If you suspect a drop in traffic is due to an algorithmic update, it’s important to understand that there might not be anything fundamentally wrong with your content. To determine whether you need to make a change, review your top pages in Search Console and assess how they were ranking:

Small drop in position? For example, dropping from position 2 to 4.
Large drop in position? For example, dropping from position 4 to 29.

Keep in mind that positions aren’t static or fixed in place. Google’s search results are dynamic in nature because the open web itself is constantly changing with new and updated content. This constant change can cause both gains and drops in organic Search traffic.

Small drop in position
A small drop in position is when there’s a small shift in position in the top results (for example, dropping from position 2 to 4 for a search query). In Search Console, you might see a noticeable drop in traffic without a big change in impressions.

Small fluctuations in position can happen at any time (including moving back up in position, without you needing to do anything). In fact, we recommend avoiding making radical changes if your page is already performing well.

Large drop in position
A large drop in position is when you see a notable drop out of the top results for a wide range of terms (for example, dropping from the top 10 results to position 29).

In cases like this, self-assess your whole website overall (not just individual pages) to make sure it’s helpful, reliable and people-first. If you’ve made changes to your site, it may take time to see an effect: some changes can take effect in a few days, while others could take several months. For example, it may take months before our systems determine that a site is now producing helpful content in the long term. In general, you’ll likely want to wait a few weeks to analyze your site in Search Console again to see if your efforts had a beneficial effect on ranking position.

Keep in mind that there’s no guarantee that changes you make to your website will result in noticeable impact in search results. If there’s more deserving content, it will continue to rank well with our systems.”

Change #5: Trivial Changes

The rest of the changes are relatively trivial but nonetheless make the documentation more precise.

For example, one of the headings was changed from this:

You recently moved your site

To this new heading:

Site moves and migrations

Google’s Updated Ranking Drops Documentation

Google’s updated documentation is well thought out, but I think the recommendations for large algorithmic drops are helpful in some cases and not in others. I have 25 years of SEO experience and have experienced every single Google algorithm update. There are certain updates where the problem is not solved by trying to fix things, and Google’s guidance used to be that sometimes there’s nothing to fix. The documentation is better, but in my opinion it can be improved even further.

Read the new documentation here:

Debugging drops in Google Search traffic

Review the previous documentation:

Internet Archive Wayback Machine: Debugging drops in Google Search traffic

Featured Image by Shutterstock/Tomacco

Google March 2024 Core Update Officially Completed A Week Ago via @sejournal, @MattGSouthern

Google has officially completed its March 2024 Core Update, ending over a month of ranking volatility across the web.

However, Google didn’t confirm the rollout’s conclusion on its data anomaly page until April 26—a whole week after the update was completed on April 19.

Many in the SEO community had been speculating for days about whether the turbulent update had wrapped up.

The delayed transparency exemplifies Google’s communication issues with publishers and the need for clarity during core updates.

Google March 2024 Core Update Timeline & Status

First announced on March 5, the core algorithm update was completed on April 19, taking 45 days to roll out.

Unlike more routine core refreshes, Google warned this one was more complex.

Google’s documentation reads:

“As this is a complex update, the rollout may take up to a month. It’s likely there will be more fluctuations in rankings than with a regular core update, as different systems get fully updated and reinforce each other.”

The aftershocks were tangible, with some websites reporting losses of over 60% of their organic search traffic, according to data from industry observers.

The ripple effects also led to the deindexing of hundreds of sites that were allegedly violating Google’s guidelines.

Addressing Manipulation Attempts

In its official guidance, Google highlighted the criteria it looks for when targeting link spam and manipulation attempts:

  • Creating “low-value content” purely to garner manipulative links and inflate rankings.
  • Links intended to boost sites’ rankings artificially, including manipulative outgoing links.
  • The “repurposing” of expired domains with radically different content to game search visibility.

The updated guidelines warn:

“Any links that are intended to manipulate rankings in Google Search results may be considered link spam. This includes any behavior that manipulates links to your site or outgoing links from your site.”

John Mueller, a Search Advocate at Google, responded to the turbulence by advising publishers not to make rash changes while the core update was ongoing.

However, he suggested sites could proactively fix issues like unnatural paid links.

Mueller stated on Reddit:

“If you have noticed things that are worth improving on your site, I’d go ahead and get things done. The idea is not to make changes just for search engines, right? Your users will be happy if you can make things better even if search engines haven’t updated their view of your site yet.”

Emphasizing Quality Over Links

The core update made notable changes to how Google ranks websites.

Most significantly, Google reduced the importance of links in determining a website’s ranking.

In contrast to the description of links as “an important factor in determining relevancy,” Google’s updated spam policies stripped away the “important” designation, simply calling links “a factor.”

This change aligns with Google’s Gary Illyes’ statements that links aren’t among the top three most influential ranking signals.

Instead, Google is giving more weight to quality, credibility, and substantive content.

Consequently, long-running campaigns favoring low-quality link acquisition and keyword optimizations have been demoted.

With the update complete, SEOs and publishers are left to audit their strategies and websites to ensure alignment with Google’s new perspective on ranking.


Featured Image: Rohit-Tripathi/Shutterstock

FAQ

After the update, what steps should websites take to align with Google’s new ranking criteria?

After Google’s March 2024 Core Update, websites should:

  • Improve the quality, trustworthiness, and depth of their website content.
  • Stop heavily focusing on getting as many links as possible and prioritize relevant, high-quality links instead.
  • Fix any shady or spam-like SEO tactics on their sites.
  • Carefully review their SEO strategies to ensure they follow Google’s new guidelines.

Google Declares It The “Gemini Era” As Revenue Grows 15% via @sejournal, @MattGSouthern

Alphabet Inc., Google’s parent company, announced its first quarter 2024 financial results today.

While Google reported double-digit growth in key revenue areas, the focus was on its AI developments, dubbed the “Gemini era” by CEO Sundar Pichai.

The Numbers: 15% Revenue Growth, Operating Margins Expand

Alphabet reported Q1 revenues of $80.5 billion, a 15% increase year-over-year, exceeding Wall Street’s projections.

Net income was $23.7 billion, with diluted earnings per share of $1.89. Operating margins expanded to 32%, up from 25% in the prior year.

Ruth Porat, Alphabet’s President and CFO, stated:

“Our strong financial results reflect revenue strength across the company and ongoing efforts to durably reengineer our cost base.”

Google’s core advertising units, such as Search and YouTube, drove growth. Google advertising revenues hit $61.7 billion for the quarter.

The Cloud division also maintained momentum, with revenues of $9.6 billion, up 28% year-over-year.

Pichai highlighted that YouTube and Cloud are expected to exit 2024 at a combined $100 billion annual revenue run rate.

Generative AI Integration in Search

Google experimented with AI-powered features in Search Labs before recently introducing AI overviews into the main search results page.

Regarding the gradual rollout, Pichai states:

“We are being measured in how we do this, focusing on areas where gen AI can improve the Search experience, while also prioritizing traffic to websites and merchants.”

Pichai reported that Google’s generative AI features have already served billions of queries:

“We’ve already served billions of queries with our generative AI features. It’s enabling people to access new information, to ask questions in new ways, and to ask more complex questions.”

Google reports increased Search usage and user satisfaction among those interacting with the new AI overview results.

The company also highlighted its “Circle to Search” feature on Android, which allows users to circle objects on their screen or in videos to get instant AI-powered answers via Google Lens.

Reorganizing For The “Gemini Era”

As part of the AI roadmap, Alphabet is consolidating all teams building AI models under the Google DeepMind umbrella.

Pichai revealed that, through hardware and software improvements, the company has reduced machine costs associated with its generative AI search results by 80% over the past year.

He states:

“Our data centers are some of the most high-performing, secure, reliable and efficient in the world. We’ve developed new AI models and algorithms that are more than one hundred times more efficient than they were 18 months ago.”

How Will Google Make Money With AI?

Alphabet sees opportunities to monetize AI through its advertising products, Cloud offerings, and subscription services.

Google is integrating Gemini into ad products like Performance Max. The company’s Cloud division is bringing “the best of Google AI” to enterprise customers worldwide.

Google One, the company’s subscription service, surpassed 100 million paid subscribers in Q1 and introduced a new premium plan featuring advanced generative AI capabilities powered by Gemini models.

Future Outlook

Pichai outlined six key advantages positioning Alphabet to lead the “next wave of AI innovation”:

  1. Research leadership in AI breakthroughs like the multimodal Gemini model
  2. Robust AI infrastructure and custom TPU chips
  3. Integrating generative AI into Search to enhance the user experience
  4. A global product footprint reaching billions
  5. Streamlined teams and improved execution velocity
  6. Multiple revenue streams to monetize AI through advertising and cloud

With upcoming events like Google I/O and Google Marketing Live, the company is expected to share further updates on its AI initiatives and product roadmap.


Featured Image: Sergei Elagin/Shutterstock

Google Stresses The Need To Fact Check AI-Generated Content via @sejournal, @MattGSouthern

In a recent episode of Google’s Search Off The Record podcast, team members got hands-on with Gemini to explore creating SEO-related content.

However, their experiment raised concerns over factual inaccuracies when relying on AI tools without proper vetting.

The discussion involved Lizzi Harvey, Gary Illyes, and John Mueller taking turns utilizing Gemini to write sample social media posts on technical SEO concepts.

As they analyzed Gemini’s output, Illyes highlighted a limitation shared by all AI tools:

“My bigger problem with pretty much all generative AI is the factuality – you always have to fact check whatever they are spitting out. That kind of scares me that now we are just going to read it live, and maybe we are going to say stuff that is not even true.”

Outdated SEO Advice Exposed

The concerns stemmed from an AI-generated tweet suggesting using rel=”prev/next” for pagination – a technique that Google has deprecated.

Gemini suggested publishing the following tweet:

“Pagination causing duplicate content headaches? Use rel=prev, rel=next to guide Google through your content sequences. #technicalSEO, #GoogleSearch.”

Harvey immediately identified the advice as outdated. Mueller confirmed that rel=prev and rel=next are no longer supported:

“It’s gone. It’s gone. Well, I mean, you can still use it. You don’t have to make it gone. It’s just ignored.”

Earlier in the podcast, Harvey warned that inaccuracies could result from outdated information in training data.

Harvey stated:

“If there’s enough myth circulating or a certain thought about something or even outdated information that has been blogged about a lot, it might come up in our exercise today, potentially.”

Sure enough, it took only a short time for outdated information to come up.

Human Oversight Still Critical

While the Google Search Relations team saw the potential for AI-generated content, their discussion stressed the need for human fact-checking.

Illyes’ concerns reflect the broader discourse around responsible AI adoption. Human oversight is necessary to prevent the spread of misinformation.

As generative AI use increases, remember that its output can’t be blindly trusted without verification from subject matter experts.

Why SEJ Cares

While AI-powered tools can potentially aid in content creation and analysis, as Google’s own team illustrated, a healthy degree of skepticism is warranted.

Blindly deploying generative AI to create content can result in publishing outdated or harmful information that could negatively impact your SEO and reputation.

Hear the full podcast episode below:


FAQ

How can inaccurate AI-generated content affect my SEO efforts?

Using AI-generated content for your website can be risky for SEO because the AI might include outdated or incorrect information.

Search engines like Google favor high-quality, accurate content, so publishing unverified AI-produced material can hurt your website’s search rankings. For example, if the AI promotes outdated practices like using the rel=”prev/next” tag for pagination, it can mislead your audience and search engines, damaging your site’s credibility and authority.

It’s essential to carefully fact-check and validate AI-generated content with experts to ensure it follows current best practices.

How can SEO and content marketers ensure the accuracy of AI-generated output?

To ensure the accuracy of AI-generated content, companies should:

  • Have a thorough review process involving subject matter experts
  • Have specialists check that the content follows current guidelines and industry best practices
  • Fact-check any data or recommendations from the AI against reliable sources
  • Stay updated on the latest developments to identify outdated information produced by AI


Featured Image: Screenshot from YouTube.com/GoogleSearchCentral, April 2024. 

Google’s New Infini-Attention And SEO via @sejournal, @martinibuster

Google has published a research paper on a new technology called Infini-attention that allows it to process massively large amounts of data with “infinitely long contexts” while also being capable of being easily inserted into other models to vastly improve their capabilities.

That last part should be of interest to those who are interested in Google’s algorithm. Infini-Attention is plug-and-play, which means it’s relatively easy to insert into other models, including those in use by Google’s core algorithm. The part about “infinitely long contexts” may have implications for how some of Google’s search systems may work.

The name of the research paper is: Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Memory Is Computationally Expensive For LLMs

Large Language Models (LLMs) have limitations on how much data they can process at one time because computational complexity and memory usage can spiral upward significantly. Infini-Attention gives the LLM the ability to handle longer contexts while keeping down the memory and processing power needed.

The research paper explains:

“Memory serves as a cornerstone of intelligence, as it enables efficient computations tailored to specific contexts. However, Transformers …and Transformer-based LLMs …have a constrained context-dependent memory, due to the nature of the attention mechanism.

Indeed, scaling LLMs to longer sequences (i.e. 1M tokens) is challenging with the standard Transformer architectures and serving longer and longer context models becomes costly financially.”

And elsewhere the research paper explains:

“Current transformer models are limited in their ability to process long sequences due to quadratic increases in computational and memory costs. Infini-attention aims to address this scalability issue.”

The researchers hypothesized that Infini-attention can scale to handle extremely long sequences with Transformers without the usual increases in computational and memory resources.

Three Important Features

Google’s Infini-Attention solves the shortcomings of transformer models by incorporating three features. These enable transformer-based LLMs to handle longer sequences without memory issues and to use context from earlier in the sequence, not just data near the current point being processed.

The features of Infini-Attention:

  • Compressive Memory System
  • Long-term Linear Attention
  • Local Masked Attention

Compressive Memory System

Infini-Attention uses what’s called a compressive memory system. As more data is input (as part of a long sequence of data), the compressive memory system compresses some of the older information in order to reduce the amount of space needed to store the data.

Long-term Linear Attention

Infini-attention also uses what’s called “long-term linear attention mechanisms,” which enable the LLM to process data from earlier in the sequence being processed, allowing it to retain context. That’s a departure from standard transformer-based LLMs.

This is important for tasks where the context spans a larger body of data. It’s like being able to discuss an entire book, all of its chapters, and explain how the first chapter relates to another chapter closer to the end of the book.

Local Masked Attention

In addition to the long-term attention, Infini-attention also uses what’s called local masked attention. This kind of attention processes nearby (localized) parts of the input data, which is useful for responses that depend on the closer parts of the data.

Combining long-term and local attention helps solve the problem of transformers being limited in how much input data they can remember and use for context.

The researchers explain:

“The Infini-attention incorporates a compressive memory into the vanilla attention mechanism and builds in both masked local attention and long-term linear attention mechanisms in a single Transformer block.”

Results Of Experiments And Testing

Infini-attention was tested against other models across multiple benchmarks involving long input sequences, such as long-context language modeling, passkey retrieval, and book summarization. Passkey retrieval is a test where the language model has to retrieve specific data from within an extremely long text sequence.

List of the three tests:

  1. Long-context Language Modeling
  2. Passkey Test
  3. Book Summary

Long-Context Language Modeling And The Perplexity Score

The researchers write that Infini-attention outperformed the baseline models and that increasing the training sequence length brought even further improvements in the perplexity score. The perplexity score is a metric that measures language model performance, with lower scores indicating better performance.
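As a quick illustration of the metric (this example is not from the paper), perplexity is the exponential of the average negative log-likelihood the model assigns to each token:

```python
import math

def perplexity(token_log_probs):
    """Perplexity = exp of the average negative log-likelihood per token."""
    n = len(token_log_probs)
    avg_nll = -sum(token_log_probs) / n
    return math.exp(avg_nll)

# A model that assigns every token probability 0.5 has perplexity ≈ 2,
# while a perfect model (probability 1.0 per token) has perplexity 1.
print(perplexity([math.log(0.5)] * 10))  # ≈ 2.0
print(perplexity([0.0] * 5))             # 1.0
```

Intuitively, a perplexity of 2 means the model is, on average, as uncertain as if it were choosing between two equally likely tokens, which is why lower scores indicate better performance.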

The researchers shared their findings:

“Infini-Transformer outperforms both Transformer-XL …and Memorizing Transformers baselines while maintaining 114x less memory parameters than the Memorizing Transformer model with a vector retrieval-based KV memory with length of 65K at its 9th layer. Infini-Transformer outperforms memorizing transformers with memory length of 65K and achieves 114x compression ratio.

We further increased the training sequence length to 100K from 32K and trained the models on Arxiv-math dataset. 100K training further decreased the perplexity score to 2.21 and 2.20 for Linear and Linear + Delta models.”

Passkey Test

The passkey test is where a random number is hidden within a long text sequence, and the model must retrieve it. The passkey is hidden near the beginning, middle, or end of the long text. The model was able to solve the passkey test at lengths of up to 1 million tokens.

“A 1B LLM naturally scales to 1M sequence length and solves the passkey retrieval task when injected with Infini-attention. Infini-Transformers solved the passkey task with up to 1M context length when fine-tuned on 5K length inputs. We report token-level retrieval accuracy for passkeys hidden in a different part (start/middle/end) of long inputs with lengths 32K to 1M.”
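The benchmark’s setup can be sketched with a toy generator (this is an illustrative reconstruction, not the paper’s code; the function name and filler text are made up):

```python
import random

def make_passkey_example(where="middle", n_filler=200):
    """Toy version of the passkey-retrieval benchmark: a random number is
    buried in repetitive filler text, and the model must recall it.
    The real benchmark uses sequences of up to 1M tokens."""
    passkey = random.randint(10000, 99999)
    key_line = f"The pass key is {passkey}. Remember it."
    filler = ["The grass is green. The sky is blue."] * n_filler
    # Hide the key near the start, middle, or end of the sequence.
    pos = {"start": 0, "middle": n_filler // 2, "end": n_filler}[where]
    filler.insert(pos, key_line)
    prompt = " ".join(filler) + " What is the pass key?"
    return prompt, passkey

prompt, key = make_passkey_example("middle")
print(str(key) in prompt)  # True — the key is somewhere in the long prompt
```

Scoring is then just a matter of checking whether the model’s answer contains the hidden number, which is why the paper can report token-level retrieval accuracy by key position.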

Book Summary Test

Infini-attention also excelled at the book summary test, outperforming top benchmarks and achieving new state-of-the-art (SOTA) performance.

The results are described:

“Finally, we show that a 8B model with Infini-attention reaches a new SOTA result on a 500K length book summarization task after continual pre-training and task fine-tuning.

…We further scaled our approach by continuously pre-training a 8B LLM model with 8K input length for 30K steps. We then fine-tuned on a book summarization task, BookSum (Kry´sci´nski et al., 2021) where the goal is to generate a summary of an entire book text.

Our model outperforms the previous best results and achieves a new SOTA on BookSum by processing the entire text from book. …There is a clear trend showing that with more text provided as input from books, our Infini-Transformers improves its summarization performance metric.”

Implications Of Infini-Attention For SEO

Infini-attention is a breakthrough in modeling long- and short-range attention with greater efficiency than previous models. It also supports “plug-and-play continual pre-training and long-context adaptation by design,” which means that it can easily be integrated into existing models.

Lastly, the “continual pre-training and long-context adaptation” makes it exceptionally useful for scenarios where it’s necessary to constantly train the model on new data. That’s interesting because it may make Infini-attention useful for applications on the back end of Google’s search systems, particularly where it’s necessary to analyze long sequences of information and understand how a part near the beginning of the sequence relates to another part closer to the end.

Other articles focused on the “infinitely long inputs” this model is capable of. What’s relevant to SEO is that the ability to handle huge inputs and “leave no context behind” matters to search marketing, because it hints at how some of Google’s systems might work if Google adapted Infini-attention to its core algorithm.

Read the research paper:

Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention

Featured Image by Shutterstock/JHVEPhoto

Google Crawler Documentation Has A New IP List via @sejournal, @martinibuster

Google updated their Googlebot and crawler documentation to add a range of IPs for bots triggered by users of Google products. The names of the feeds have switched, which is important for publishers who whitelist Google-controlled IP addresses. The change will be useful for publishers who want to block scrapers that use Google’s cloud and other crawlers not directly associated with Google itself.

New List Of IP Addresses

Google says that the list contains IP ranges that have long been in use, so they’re not new IP address ranges.

There are two kinds of IP address ranges:

  1. IP ranges that are initiated by users but controlled by Google and resolve to a Google.com hostname.
    These are tools like Google Site Verifier and presumably the Rich Results Tester Tool.
  2. IP ranges that are initiated by users but not controlled by Google and resolve to a gae.googleusercontent.com hostname.
    These are apps that are on Google Cloud or Apps Scripts that are called from Google Sheets.

The lists that correspond to each category are different now.

Previously the list that corresponded to Google IP addresses was this one: special-crawlers.json (resolving to gae.googleusercontent.com)

Now the “special crawlers” list corresponds to crawlers that are not controlled by Google.

“IPs in the user-triggered-fetchers.json object resolve to gae.googleusercontent.com hostnames. These IPs are used, for example, if a site running on Google Cloud (GCP) has a feature that requires fetching external RSS feeds on the request of the user of that site.”

The new list that corresponds to Google controlled crawlers is: 

user-triggered-fetchers-google.json

“Tools and product functions where the end user triggers a fetch. For example, Google Site Verifier acts on the request of a user. Because the fetch was requested by a user, these fetchers ignore robots.txt rules.

Fetchers controlled by Google originate from IPs in the user-triggered-fetchers-google.json object and resolve to a google.com hostname.”

The list of IPs from Google Cloud and App crawlers that Google doesn’t control can be found here:

https://developers.google.com/static/search/apis/ipranges/user-triggered-fetchers.json

The list of IPs from Google that are triggered by users and controlled by Google is here:

https://developers.google.com/static/search/apis/ipranges/user-triggered-fetchers-google.json
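As a sketch of how a publisher might consume these lists, the snippet below fetches the published JSON and checks whether a given IP falls inside its ranges. The JSON field names (`prefixes`, `ipv4Prefix`, `ipv6Prefix`) are an assumption based on the usual format of Google’s IP-range files, so verify them against the live file:

```python
import ipaddress
import json
import urllib.request

FETCHERS_URL = (
    "https://developers.google.com/static/search/apis/ipranges/"
    "user-triggered-fetchers-google.json"
)

def load_prefixes(url):
    """Fetch the published JSON and return its list of CIDR prefixes.

    Assumes the usual shape of Google's IP-range files: a "prefixes"
    array whose entries carry an "ipv4Prefix" or "ipv6Prefix" key.
    """
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return [p.get("ipv4Prefix") or p.get("ipv6Prefix") for p in data["prefixes"]]

def ip_in_ranges(ip, prefixes):
    """True if the IP falls inside any of the given CIDR ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(p) for p in prefixes if p)

# Example with hypothetical documentation ranges, not real Google data:
sample = ["192.0.2.0/24", "2001:db8::/32"]
print(ip_in_ranges("192.0.2.77", sample))    # True
print(ip_in_ranges("198.51.100.5", sample))  # False
```

A firewall or allowlist rule built this way should re-fetch the JSON periodically, since Google notes the ranges can be updated.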

New Section Of Content

There is a new section of content that explains what the new list is about.

“Fetchers controlled by Google originate from IPs in the user-triggered-fetchers-google.json object and resolve to a google.com hostname. IPs in the user-triggered-fetchers.json object resolve to gae.googleusercontent.com hostnames. These IPs are used, for example, if a site running on Google Cloud (GCP) has a feature that requires fetching external RSS feeds on the request of the user of that site.”

The documentation lists the reverse DNS masks as ***-***-***-***.gae.googleusercontent.com or google-proxy-***-***-***-***.google.com, corresponding to the user-triggered-fetchers.json and user-triggered-fetchers-google.json lists.
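A minimal sketch of the reverse-DNS check the documentation describes is below. The classification rules mirror the quoted hostname patterns; a production check should also forward-resolve the returned hostname and confirm it maps back to the original IP:

```python
import socket

def classify_hostname(hostname):
    """Classify a reverse-DNS hostname per the documented patterns."""
    if hostname.endswith(".google.com"):
        return "google-controlled"
    if hostname.endswith(".gae.googleusercontent.com"):
        return "user-controlled (GCP)"
    return "not a Google fetcher"

def verify_fetcher(ip):
    """Reverse-DNS lookup, then classify. A complete verification would
    also forward-resolve the hostname and confirm it maps back to ip."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)
    except OSError:
        return "lookup failed"
    return classify_hostname(hostname)

print(classify_hostname("google-proxy-66-249-83-1.google.com"))
# google-controlled
```

The IP-list check and this hostname check complement each other: the JSON ranges answer “could this IP be a Google fetcher,” while reverse DNS answers “does this specific request actually resolve to a Google hostname.”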

Google Changelog

Google’s changelog explained the changes like this:

“Exporting an additional range of Google fetcher IP addresses
What: Added an additional list of IP addresses for fetchers that are controlled by Google products, as opposed to, for example, a user controlled Apps Script. The new list, user-triggered-fetchers-google.json, contains IP ranges that have been in use for a long time.

Why: It became technically possible to export the ranges.”

Read the updated documentation:
Verifying Googlebot and other Google crawlers

Read the old documentation:
Archive.org – Verifying Googlebot and other Google crawlers

Featured Image by Shutterstock/JHVEPhoto

DeepL Write: New AI Editor Improves Content Quality via @sejournal, @martinibuster

DeepL, the makers of the DeepL Translator, announced a new product called DeepL Write, a real-time AI editor powered by their own Large Language Model (LLM) that improves content at the draft stage while preserving the tone and voice of the writer.

Unlike many other AI writing tools, DeepL Write is not a content generator. It’s an editor that suggests which words to choose and how best to phrase ideas, and it proofreads your documents so they sound professional and strike the right tone and voice, plus the usual spelling, grammar, and punctuation improvements.

According to DeepL:

“Unlike common generative AI tools that auto-populate text, or rules-based grammar correction tools, DeepL Write Pro acts as a creative assistant to writers in the drafting process, elevating their text with real-time, AI-powered suggestions on word choice, phrasing, style, and tone.

This unique approach sparks a creative synergy between the user and the AI that transforms text while preserving the writer’s authentic voice. DeepL Write Pro’s strength lies in its ability to give writers a sophisticated boost in their communication, regardless of language proficiency—empowering them to find the perfect words for any situation or audience.”

Enterprise Grade Security

DeepL Write also comes with TLS (Transport Layer Security) encryption. TLS is a protocol used to encrypt data sent between an app and a server. It’s commonly used for email and instant messaging, and it’s also the technology behind HTTPS, which keeps websites secure.

In addition to keeping documents secure, DeepL Write also comes with a text deletion feature to ensure that nothing is stored online.

Standalone and With DeepL Translator Integration

DeepL Write is available as a standalone app and as part of a suite together with DeepL Translator. The integration with DeepL Translator makes it an advanced tool for creating documentation that can be rewritten into another language in the right tone and style.

At this time DeepL Write Pro is available in English and German, with more languages becoming available soon.

The standalone product is available in a free version with limited text improvements and a Pro version that costs $10.99 per month.

DeepL Write Pro comes with the following features:

  • Maximum data security
  • Unlimited text improvements
  • Unlimited use of alternatives
  • Unlimited use of writing styles
  • Team administration

There is also an enterprise tier, DeepL Write for Business, for organizations that need it for 50 or more users.

DeepL Write Pro

Many publishers and search marketers who depended on AI to generate content have reported losing rankings during the last Google core algorithm update in March. Naturally, many publishers are hesitant to use AI for generating content.

DeepL Write Pro offers an alternative use of AI for content in the form of a virtual editor that can help to polish up a human’s writing and help make it more concise, professional and in the correct style and tone.

One of the things that stands between a passable document and a great one is good writing, and a good editor is what elevates content to a higher quality. Given the modest price and the value that a good editor provides, the timing for this kind of product couldn’t be better.

Read more at DeepL Write Pro

Featured Image by Shutterstock/one photo

Google Further Postpones Third-Party Cookie Deprecation In Chrome via @sejournal, @MattGSouthern

Google has again delayed its plan to phase out third-party cookies in the Chrome web browser. The latest postponement comes after ongoing challenges in reconciling feedback from industry stakeholders and regulators.

The announcement was made in Google and the UK’s Competition and Markets Authority (CMA) joint quarterly report on the Privacy Sandbox initiative, scheduled for release on April 26.

Chrome’s Third-Party Cookie Phaseout Pushed To 2025

Google states it “will not complete third-party cookie deprecation during the second half of Q4” this year as planned.

Instead, the tech giant aims to begin deprecating third-party cookies in Chrome “starting early next year,” assuming an agreement can be reached with the CMA and the UK’s Information Commissioner’s Office (ICO).

The statement reads:

“We recognize that there are ongoing challenges related to reconciling divergent feedback from the industry, regulators and developers, and will continue to engage closely with the entire ecosystem. It’s also critical that the CMA has sufficient time to review all evidence, including results from industry tests, which the CMA has asked market participants to provide by the end of June.”

Continued Engagement With Regulators

Google reiterated its commitment to “engaging closely with the CMA and ICO” throughout the process and hopes to conclude discussions this year.

This marks the third delay to Google’s plan to deprecate third-party cookies, initially aiming for a Q3 2023 phaseout before pushing it back to late 2024.

The postponements reflect the challenges in transitioning away from cross-site user tracking while balancing privacy and advertiser interests.

Transition Period & Impact

In January, Chrome began restricting third-party cookie access for 1% of users globally. This percentage was expected to gradually increase until 100% of users were covered by Q3 2024.

However, the latest delay gives websites and services more time to migrate away from third-party cookie dependencies through Google’s limited “deprecation trials” program.

The trials offer temporary cookie access extensions until December 27, 2024, for non-advertising use cases that can demonstrate direct user impact and functional breakage.

While easing the transition, the trials have strict eligibility rules. Advertising-related services are ineligible, and origins matching known ad-related domains are rejected.

Google states the program aims to address functional issues rather than relieve general data collection inconveniences.

Publisher & Advertiser Implications

The repeated delays highlight the potential disruption for digital publishers and advertisers relying on third-party cookie tracking.

Industry groups have raised concerns that restricting cross-site tracking could push websites toward more opaque privacy-invasive practices.

However, privacy advocates view the phaseout as crucial in preventing covert user profiling across the web.

With the latest postponement, all parties have more time to prepare for the eventual loss of third-party cookies and adopt Google’s proposed Privacy Sandbox APIs as replacements.


Featured Image: Novikov Aleksey/Shutterstock