Getty Images Updated Generative AI Pushes Boundaries Of What’s Possible via @sejournal, @martinibuster

Getty Images announced an updated AI model for its image generator that produces images faster and at higher quality. The changes benefit users of both Generative AI by Getty Images and Generative AI by iStock.

Fully Licensed High Quality Images

Getty’s AI image generator is trained on Getty’s own content, which means that all generated images can be fully licensed and commercial use is indemnified. Users can therefore license the images without ethical worries about how the AI models were trained.

High Quality Image Generation And Modification

A benefit of the updated Generative AI by Getty Images is that both generated images and existing stock images can be edited and modified by the AI. An image can easily be extended horizontally or vertically, and individual elements, including the entire background of the image, can be added or removed.

This solves many problems for publishers looking for images with specific qualities, because they can now more easily edit images to fit their exact needs without turning to expensive image editing software or SaaS tools.

These are some of the features users can take advantage of:

  • Industry-leading generation speed: Image generation speeds set to reach around 6 seconds, doubling the performance of the previous model, putting it at the forefront of the industry.
  • Advanced 4K generation detail: Enhanced detail and fidelity in generated images, with advanced upscaling and increased 4K generation detail.
  • Expanded support and adherence for more detailed prompts: A higher level of detail in prompts results in images that more closely match the descriptions provided in the text prompt.
  • Longer prompts: Supports more complex and longer prompts, up to 250 words.
  • Advanced camera controls: Greater control over output using shot type and depth of field.

Create Your Own AI Model

Enterprise level customers have the ability to fine-tune their own AI image generator models by training them with their own images. This means that customers can create AI generated images based on their products, models, and other image assets that are exclusive and proprietary to them.

Getty Images Democratizes High Quality Images

Getty’s announcement represents a milestone in the business of stock images, enabling both pro and enthusiast level users to create and modify images at a level that was unthinkable only a few years ago.

Read more at:

Generate AI images and modify iStock imagery with ease

Featured Image by Shutterstock/rafapress

Why WordPress 6.6.1 Was Flagged For Trojan Malware via @sejournal, @martinibuster

Multiple user reports have surfaced warning that the latest version of WordPress is triggering trojan alerts and at least one person reported that a web host locked down a website because of the file. What really happened turned into a learning experience.

Antivirus Flags Trojan In Official WordPress 6.6.1 Download

The first report was filed in the official WordPress.org help forums, where a user reported that the native antivirus in Windows 11 (Windows Defender) flagged the WordPress zip file they had downloaded from WordPress.org as containing a trojan.

This is the text of the original post:

“Windows Defender shows that the latest wordpress-6.6.1zip has Trojan:Win32/Phish!MSR virus when i try downloading from the official wp site

it shows the same virus notification when updating from within the WordPress dashboard of my site

Is this a false positive?”

They also posted screenshots of the trojan warning, which listed the status as “Quarantine failed” and stated that the WordPress 6.6.1 zip file “is dangerous and executes commands from an attacker.”

Screenshot of the Windows Defender alert flagging a trojan in the WordPress 6.6.1 zip file

Someone else confirmed that they were having the same issue, noting that a string of code within one of the CSS files (style code that governs the look of a website, including colors) was triggering the warning.

They posted:

“I am experiencing the same issue. It seems to occur with the file wp-includes\css\dist\block-library\style.min.css. It appears that a specific string in the CSS file is being detected as a Trojan virus. I would like to allow it, but I think I should wait for an official response before doing so. Is there anyone who can provide an official answer?”

Unexpected “Solution”

A false positive is a result that is reported as positive when it is not actually positive for whatever is being tested for. WordPress users soon began to suspect that the Windows Defender trojan alert was a false positive.

An official WordPress GitHub ticket was filed where the cause was identified as an insecure URL (http versus https) that’s referenced from within the CSS style sheet. A URL is not commonly considered a part of a CSS file so that may be why Windows Defender flagged this specific CSS file as containing a trojan.

Here’s the part where things went off in an unexpected direction. Someone opened another WordPress GitHub ticket to document a proposed fix for the insecure URL, which should have been the end of the story but it ended up leading to a discovery about what was really going on.

The insecure URL that needed fixing was this one:

http://www.w3.org/2000/svg

So the person who opened the ticket updated the file to reference the HTTPS version of the URL, but a nuance had been overlooked.

The (‘insecure’) URL is not a link to a source of files (and therefore not insecure) but rather an identifier that defines the scope of the Scalable Vector Graphics (SVG) language within XML.

So the problem ultimately was not something wrong with the code in WordPress 6.6.1, but rather an issue with Windows Defender, which failed to recognize the string as an XML namespace and mistakenly flagged it as a URL linking to downloadable files.
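For illustration, namespace URIs like this typically appear in CSS when an SVG image is embedded as a data URI. The snippet below is a simplified sketch of that pattern with a hypothetical class name, not the exact rule from WordPress’s style.min.css (where the special characters are also percent-encoded):

  /* Simplified, hypothetical example: the xmlns value is an XML namespace
     identifier declaring the SVG vocabulary; the browser does not
     download anything from it. */
  .example-block {
    background-image: url('data:image/svg+xml;charset=utf-8,<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 24 24"><path d="M5 12h14"/></svg>');
  }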

Takeaway

The false positive trojan alert by Windows Defender and the subsequent discussion were a learning moment for many people (including myself!) about a relatively arcane bit of coding knowledge regarding the XML namespace for SVG files.

Read the original report:

Virus Issue :wordpress-6.6.1.zip shows a virus from windows defender

Google May Rely Less On Hreflang, Shift To Auto Language Detection via @sejournal, @MattGSouthern

In the latest episode of Google’s “Search Off The Record” podcast, a member of the Search Relations team suggested that Google may be moving towards automatically detecting language versions of web pages, potentially reducing the need for manual hreflang annotations.

Google’s Stance On Automatic Language Detection

Gary Illyes, a Google analyst, believes that search engines should rely less on annotations like hreflang and more on automatically learned signals.

Illyes stated during the podcast:

“Ultimately, I would want less and less annotations, site annotations, and more automatically learned things.”

He argued that this approach is more reliable than the current system of manual annotations.

Illyes elaborated on the existing capabilities of Google’s systems:

“Almost ten years ago, we could already do that, and this was what, almost ten years ago.”

Illyes emphasized the potential for improvement in this area:

“If, almost ten years ago, we could already do that quite reliably, then why would we not be able to do it now.”

The Current State Of Hreflang Implementation

The discussion also touched on the current state of hreflang implementation.

According to data cited in the podcast, only about 9% of websites currently use hreflang annotations on their home pages.

This relatively low adoption rate might be a factor in Google’s consideration of alternative methods for detecting language and regional targeting.
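For context, hreflang annotations are typically link elements in a page’s head (they can also be supplied via HTTP headers or an XML sitemap). A minimal illustrative example, using placeholder URLs rather than any real site, looks like this:

  <!-- Hypothetical URLs; each language version should list all alternates, including itself. -->
  <link rel="alternate" hreflang="en-us" href="https://example.com/en-us/" />
  <link rel="alternate" hreflang="de-de" href="https://example.com/de-de/" />
  <link rel="alternate" hreflang="x-default" href="https://example.com/" />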

Potential Challenges & Overrides

While advocating for automatic detection, Illyes acknowledged that website owners should be able to override automatic detections if necessary.

He conceded, “I think we should have overrides,” recognizing the need for manual control in some situations.

The Future Of Multilingual SEO

While no official changes have been announced, this discussion provides insight into the potential future direction of Google’s approach to multilingual and multi-regional websites.

Stay tuned for any official updates from Google on this topic.

What This Means For You

This potential shift in Google’s language detection and targeting approach could have significant implications for website owners and SEO professionals.

It could reduce the technical burden of implementing hreflang annotations, particularly for large websites with multiple language versions.

The top takeaways from this discussion include the following:

  1. It’s advisable to continue following Google’s current guidelines on implementing hreflang annotations.
  2. Ensure that your multilingual content is high-quality and accurately translated. This will likely remain crucial regardless of how Google detects language versions.
  3. While no immediate changes are planned, be ready to adapt your SEO strategy if Google moves towards more automatic language detection.
  4. If you’re planning a new multilingual site or restructuring an existing one, consider a clear and logical structure that makes language versions obvious, as this may help with automatic detection.

Remember, while automation may increase, having a solid understanding of international SEO principles will remain valuable for optimizing your global web presence.

Listen to the full podcast episode below:

Google Insights: Can Incorrect Hreflang Tags Hurt SEO? via @sejournal, @MattGSouthern

In a recent episode of Google’s Search Off The Record podcast, Gary Illyes, a member of Google’s Search Relations team, addressed concerns about incorrect hreflang implementation and its potential impact on SEO.

Hreflang Errors: Less Problematic Than Expected?

During the discussion, Illyes was asked about the consequences of mismatched hreflang annotations and actual page content.

Specifically, he addressed scenarios where a page might be incorrectly labeled as one language while containing content in another.

Illyes stated:

“As far as I remember, I worked on the parsing implementation plus the promotion implementation of hreflang, and back then, it didn’t cause problems.”

However, he also noted that his direct experience with this was from around 2016, adding the following:

“That’s a few years back… since then, we changed so many things that I would have to check whether it causes problems.”

Language Demotion & Country Promotion

Providing further context, Illyes explained Google’s approach to language and country relevance:

“When I spelled out LDCP, I said the language demotion country promotion. So, for example, if someone is searching in German and your page is in English, then you would get a negative demotion in the search results.”

This suggests that while incorrect hreflang implementation might not directly cause problems, the actual language of the content still plays a vital role in search relevance.
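As a purely hypothetical illustration of that scenario (the URL below is a placeholder), a mismatch would be an annotation like the following pointing at a page whose visible content is actually written in English:

  <!-- Hypothetical: the annotation claims German for Germany, but the page content is English. -->
  <link rel="alternate" hreflang="de-de" href="https://example.com/produkte/" />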

Exceptions To Language Matching

Interestingly, Illyes pointed out that there are exceptions to strict language matching:

“It’s less relevant to the query to the person unless you are searching for something like ‘how do you spell banana’… Because then it doesn’t really matter… well no it does… it still matters but… because you’re searching for something in English, so we would think okay you want some page that explains how to spell banana in English, not German.”

What This Means For You

Understanding how Google handles hreflang and language mismatches can help inform international SEO strategies.

While Google’s systems appear to be somewhat forgiving of hreflang errors, the actual language of the content remains a key factor in search relevance.

Here are the top takeaways:

  1. While incorrect hreflang implementation may not directly penalize your site, it’s still best practice to ensure your annotations accurately reflect your content.
  2. The actual language of your content appears to be more important than hreflang annotations for search relevance.
  3. For specific queries, like spelling or language-learning topics, Google may be more flexible in presenting content in various languages.

As Illyes noted, Google’s systems have changed over time. Continue to monitor official Google documentation and announcements for the most up-to-date best practices in international SEO.

Listen to the full podcast episode below:


Featured Image: Longfin Media/Shutterstock

Google Hints Lowering SEO Value Of Country Code Top-Level Domains via @sejournal, @MattGSouthern

In a recent episode of Google’s Search Off The Record podcast, the company’s Search Relations team hinted at potential changes in how country-code top-level domains (ccTLDs) are valued for SEO.

This revelation came during a discussion on internationalization and hreflang implementation.

The Fading Importance Of ccTLDs

Gary Illyes, a senior member of Google’s Search Relations team, suggested that the localization boost traditionally associated with ccTLDs may soon be over.

Illyes stated:

“I think eventually, like in years’ time, that [ccTLD benefit] will also fade away.”

He explained that ccTLDs are becoming less reliable indicators of a website’s geographic target audience.

Creative Use Of ccTLDs For Branding

According to Illyes, the primary reason for this shift is the creative use of ccTLDs for branding purposes rather than geographic targeting.

He elaborated:

“Think about the all the funny domain names that you can buy nowadays like the .ai. I think that’s Antigua or something… It doesn’t say anything anymore about the country… it doesn’t mean that the content is for the country.”

Illyes further explained the historical context and why this change is occurring:

“One of the main algorithms that do the whole localization thing… is called something like LDCP – language demotion country promotion. So basically if you have like a .de, then for users in Germany you would get like a slight boost with your .de domain name. But nowadays, with .co or whatever .de, which doesn’t relate to Germany anymore, it doesn’t really make sense for us to like automatically apply that little boost because it’s ambiguous what the target is.”

The Impact On SEO Strategies

This change in perspective could have implications for international SEO strategies.

Traditionally, many businesses have invested in ccTLDs to gain a perceived advantage in local search results.

If Google stops using ccTLDs as a strong signal for geographic relevance, this could alter how companies approach their domain strategy for different markets.

Marketing Value Of ccTLDs

However, Illyes also noted that from a marketing perspective, there might still be some value in purchasing ccTLDs:

“I think from a marketing perspective there’s still some value in buying the ccTLDs and if I… if I were to run some… like a new business, then I would try to buy the country TLDs when I can, when like it’s monetarily feasible, but I would not worry too much about it.”

What This Means For You

As search engines become more capable of understanding content and context, traditional signals like ccTLDs may carry less weight.

This could lead to a more level playing field for websites, regardless of their domain extension.

Here are some top takeaways:

  1. If you’ve invested heavily in country-specific domains for SEO purposes, it may be time to reassess this strategy.
  2. Should the importance of ccTLDs decrease, proper implementation of hreflang tags becomes crucial for indicating language and regional targeting.
  3. While the SEO benefits may diminish, ccTLDs can still have branding and marketing value.
  4. Watch for official announcements or changes in Google’s documentation regarding using ccTLDs and international SEO best practices.

While no immediate changes were announced, this discussion provides valuable insight into the potential future direction of international SEO.

Listen to the full podcast episode below:

Google Advises Caution With AI Generated Answers via @sejournal, @martinibuster

Google’s Gary Illyes cautioned about the use of Large Language Models (LLMs), affirming the importance of checking authoritative sources before accepting any answers from an LLM. His answer was given in the context of a question, but curiously, he didn’t publish what that question was.

LLM Answer Engines

Based on what Gary Illyes said, it’s clear that the context of his recommendation is the use of AI for answering queries. The statement comes in the wake of OpenAI’s announcement of SearchGPT, an AI search engine prototype it is testing. It may be that his statement is not related to that announcement and is just a coincidence.

Gary first explained how LLMs craft answers to questions and mentioned how a technique called “grounding” can improve the accuracy of AI generated answers, although it’s not 100% perfect and mistakes still slip through. Grounding is a way to connect a database of facts, knowledge, and web pages to an LLM. The goal is to anchor the AI generated answers in authoritative facts.
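As a rough sketch of the idea (not how Google or any particular vendor implements it; the function names and the tiny knowledge base below are hypothetical), grounding amounts to retrieving vetted facts and handing them to the model as context rather than relying on the model’s training data alone:

  # Hypothetical sketch of grounding: retrieve vetted facts, then use them
  # as the basis for the answer instead of relying on the model alone.
  def retrieve_facts(question: str) -> list[str]:
      knowledge_base = {
          "capital of australia": "Canberra is the capital of Australia.",
      }
      return [fact for topic, fact in knowledge_base.items() if topic in question.lower()]

  def answer_with_grounding(question: str) -> str:
      facts = retrieve_facts(question)
      if not facts:
          return "No authoritative source found; verify the model's answer elsewhere."
      # A real system would pass these facts to the LLM as context;
      # returning them directly keeps the source of the answer explicit.
      return " ".join(facts)

  print(answer_with_grounding("What is the capital of Australia?"))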

This is what Gary posted:

“Based on their training data LLMs find the most suitable words, phrases, and sentences that align with a prompt’s context and meaning.

This allows them to generate relevant and coherent responses. But not necessarily factually correct ones. YOU, the user of these LLMs, still need to validate the answers based on what you know about the topic you asked the LLM about or based on additional reading on resources that are authoritative for your query.

Grounding can help create more factually correct responses, sure, but it’s not perfect; it doesn’t replace your brain. The internet is full of intended and unintended misinformation, and you wouldn’t believe everything you read online, so why would you LLM responses?

Alas. This post is also online and I might be an LLM. Eh, you do you.”

AI Generated Content And Answers

Gary’s LinkedIn post is a reminder that LLMs generate answers that are contextually relevant to the questions that are asked but that contextual relevance isn’t necessarily factually accurate.

Authoritativeness and trustworthiness are important qualities of the kind of content Google tries to rank. It is therefore in publishers’ best interest to consistently fact check content, especially AI generated content, in order to avoid inadvertently becoming less authoritative. The need to verify facts also holds true for those who use generative AI for answers.

Read Gary’s LinkedIn Post:

Answering something from my inbox here

Featured Image by Shutterstock/Roman Samborskyi

Google Cautions On Blocking GoogleOther Bot via @sejournal, @martinibuster

Google’s Gary Illyes answered a question about the non-search features that the GoogleOther crawler supports, then added a caution about the consequences of blocking GoogleOther.

What Is GoogleOther?

GoogleOther is a generic crawler created by Google for purposes that fall outside those of the bots that specialize in Search, Ads, Video, Images, News, Desktop, and Mobile crawling. It can be used by internal teams at Google for research and development related to various products.

The official description of GoogleOther is:

“GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development.”

Something that may be surprising is that there are actually three kinds of GoogleOther crawlers.

Three Kinds Of GoogleOther Crawlers

  1. GoogleOther
    Generic crawler for public URLs
  2. GoogleOther-Image
    Optimized to crawl public image URLs
  3. GoogleOther-Video
    Optimized to crawl public video URLs

All three GoogleOther crawlers can be used for research and development purposes, though that is just one of the purposes Google publicly acknowledges they could be used for.

What Non-Search Features Does GoogleOther Support?

Google doesn’t say what specific non-search features GoogleOther supports, probably because it doesn’t really “support” a specific feature. It exists for research and development crawling, which could be in support of a new product or an improvement to a current product; its purpose is deliberately open-ended and generic.

This is the question, as read out by Gary:

“What non-search features does GoogleOther crawling support?”

Gary Illyes answered:

“This is a very topical question, and I think it is a very good question. Besides what’s in the public I don’t have more to share.

GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development.

Historically Googlebot was used for this, but that kind of makes things murky and less transparent, so we launched GoogleOther so you have better controls over what your site is crawled for.

That said GoogleOther is not tied to a single product, so opting out of GoogleOther crawling might affect a wide range of things across the Google universe; alas, not Search, search is only Googlebot.”

It Might Affect A Wide Range Of Things

Gary is clear that blocking GoogleOther wouldn’t have an effect on Google Search because Googlebot is the crawler used for indexing content. So if blocking any of the three versions of GoogleOther is something a site owner wants to do, it should be okay to do that without a negative effect on search rankings.
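For illustration, a site owner who chose to do that might add rules along these lines to robots.txt. This is a simplified sketch, not a recommendation, and the Googlebot group is included only to make the contrast explicit:

  # Illustrative sketch: blocks the generic GoogleOther crawlers
  # while Googlebot, which Search uses, remains unaffected.
  User-agent: GoogleOther
  User-agent: GoogleOther-Image
  User-agent: GoogleOther-Video
  Disallow: /

  User-agent: Googlebot
  Allow: /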

But Gary also cautioned about the outcome of blocking GoogleOther, saying that it could affect other products and services across Google. He didn’t state which other products could be affected, nor did he elaborate on the pros or cons of blocking GoogleOther.

Pros And Cons Of Blocking GoogleOther

Whether or not to block GoogleOther doesn’t necessarily have a straightforward answer. There are several considerations that determine whether doing so makes sense.

Pros

Inclusion in research for a future Google product that’s related to search (Maps, Shopping, Images, or a new feature in search) could be useful. A site included in that kind of research might be one of the few chosen to test a feature that turns out to be good for the site, for example by increasing its earnings.

Another consideration is that blocking GoogleOther to save on server resources is not necessarily a valid reason because GoogleOther doesn’t seem to crawl so often that it makes a noticeable impact.

If blocking Google from using site content for AI is a concern then blocking GoogleOther will have no impact on that at all. GoogleOther has nothing to do with crawling for Google Gemini apps or Vertex AI, including any future products that will be used for training associated language models. The bot for that specific use case is Google-Extended.

Cons

On the other hand it might not be helpful to allow GoogleOther if it’s being used to test something related to fighting spam and there’s something the site has to hide.

It’s possible that a site owner might not want to participate if GoogleOther comes crawling for market research or for training machine learning models (for internal purposes) that are unrelated to public-facing products like Gemini and Vertex.

Allowing GoogleOther to crawl a site for unknown purposes is like giving Google a blank check to use your site data in any way they see fit outside of training public-facing LLMs or purposes related to named bots like GoogleBot.

Takeaway

Should you block GoogleOther? It’s a coin toss. There are potential benefits, but in general there isn’t enough information to make an informed decision.

Listen to the Google SEO Office Hours podcast at the 1:30 minute mark:

Featured Image by Shutterstock/Cast Of Thousands

Reddit Limits Search Engine Access, Google Remains Exception via @sejournal, @MattGSouthern

Reddit has recently tightened its grip on who can access its content, blocking major search engines from indexing recent posts and comments.

This move has sparked discussions in the SEO and digital marketing communities about the future of content accessibility and AI training data.

What’s Happening?

First reported by 404 Media, Reddit updated its robots.txt file, preventing most web crawlers from accessing its latest content.

Google, however, remains an exception, likely due to a $60 million deal that allows the search giant to use Reddit’s content for AI training.
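Reddit’s actual robots.txt isn’t reproduced here, and it isn’t confirmed that the Google exception is implemented through robots.txt at all, but the general pattern of disallowing all crawlers while carving out one user agent looks something like this (purely illustrative):

  # Purely illustrative pattern; not Reddit's actual robots.txt.
  User-agent: Googlebot
  Allow: /

  User-agent: *
  Disallow: /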

Brent Csutoras, founder of Search Engine Journal, offers some context:

“Since taking on new investors and starting their pathway to IPO, Reddit has moved away from being open-source and allowing anyone to scrape their content and use their APIs without paying.”

The Google Exception

Currently, Google is the only major search engine able to display recent Reddit results when users search with “site:reddit.com.”

This exclusive access sets Google apart from competitors like Bing and DuckDuckGo.

Why This Matters

For users who rely on appending “Reddit” to their searches to find human-generated answers, this change means they’ll be limited to using Google or search engines that pull from Google’s index.

It presents new challenges for SEO professionals and marketers in monitoring and analyzing discussions on one of the internet’s largest platforms.

The Bigger Picture

Reddit’s move aligns with a broader trend of content creators and platforms seeking compensation for using their data in AI training.

As Csutoras points out:

“Publications, artists, and entertainers have been suing OpenAI and other AI companies, blocking AI companies, and fighting to avoid using public content for AI training.”

What’s Next?

While this development may seem surprising, Csutoras suggests it’s a logical step for Reddit.

He notes:

“It seems smart on Reddit’s part, especially since similar moves in the past have allowed them to IPO and see strong growth for their valuation over the last two years.”


FAQ

What is the recent change Reddit has made regarding content accessibility?

Reddit has updated its robots.txt file to block major search engines from indexing its latest posts and comments. This change exempts Google due to a $60 million deal, allowing Google to use Reddit’s content for AI training purposes.

Why does Google have exclusive access to Reddit’s latest content?

Google has exclusive access to Reddit’s latest content because of a $60 million deal that allows Google to use Reddit’s content for AI training. This agreement sets Google apart from other search engines like Bing and DuckDuckGo, which are unable to index new Reddit posts and comments.

What broader trend does Reddit’s recent move reflect?

Reddit’s decision to limit search engine access aligns with a larger trend where content creators and platforms seek compensation for the use of their data in AI training. Many publications, artists, and entertainers are taking similar actions to either block or demand compensation from AI companies using their content.


Featured Image: Mamun sheikh K/Shutterstock