In a post on the r/TechSEO subreddit, Google’s Search Advocate, John Mueller, responded to a Reddit user asking how to increase localized traffic to a European Union (EU) domain hosted in the United States (US).
The user’s client, who owns a .com domain and a .eu domain, hopes to increase targeted traffic to the latter. However, the user is concerned that the site’s server location could reduce the domain’s visibility in international search results.
Here are the five things Mueller suggested the user should focus on – or safely ignore – to increase localized traffic from the EU.
1. Utilize Hreflang Tags
Mueller’s first recommendation is the use of hreflang tags. These tags are instrumental in directing users from various European countries to the EU subdomain, making the site more accessible and relevant to the European audience.
This approach is crucial for a site targeting multiple regions with potentially overlapping or similar content. He emphasizes that hreflang should connect major European countries to the EU domain, while visitors from everywhere else default to the .com.
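As a rough illustration, here is a minimal TypeScript sketch of the kind of hreflang annotations this setup implies. The domains, locales, and path are hypothetical, and real hreflang tags must also be mirrored on every alternate page.

```typescript
// Hypothetical example: point major EU locales at the .eu domain and let
// x-default send everyone else to the .com.
const euLocales = ["de-DE", "fr-FR", "es-ES", "it-IT", "nl-NL"]; // assumed target markets

function hreflangTags(path: string): string[] {
  const tags = euLocales.map(
    (locale) =>
      `<link rel="alternate" hreflang="${locale}" href="https://www.example.eu${path}" />`
  );
  // x-default catches visitors who don't match any locale above
  tags.push(
    `<link rel="alternate" hreflang="x-default" href="https://www.example.com${path}" />`
  );
  return tags;
}

hreflangTags("/products/").forEach((tag) => console.log(tag));
```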
2. Server Location Isn’t A Factor
Secondly, Mueller downplays the importance of server location.
Contrary to the common belief that server proximity to the target audience enhances performance, he suggests that server location is not a significant factor, giving site owners more flexibility in hosting decisions.
3. Canonical Tags Can Prevent Content Duplication
The third point addresses the issue of content duplication, particularly when the same language is used across multiple domains.
Mueller advises using canonical tags carefully in such scenarios to avoid Google interpreting the content as duplicate. Alternatively, slight variations in content across these domains can help distinguish them.
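As a sketch, a canonical tag is a single link element in the page head; the URL here is hypothetical. Note that when hreflang is in play, each language/region alternate normally keeps a self-referencing canonical rather than pointing across domains.

```typescript
// Minimal sketch: declare which URL is the preferred (canonical) version
// of a page when the same content exists at more than one address.
function canonicalTag(canonicalUrl: string): string {
  return `<link rel="canonical" href="${canonicalUrl}" />`;
}

// Example: a duplicate copy of an English guide declares the .com version canonical.
console.log(canonicalTag("https://www.example.com/guide/"));
```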
4. Support Local Currency With Google Shopping Feeds
Mueller’s fourth suggestion is to use Google Shopping feeds that support local currencies. This approach involves optimizing product listings for visibility in Google’s Shopping search results, an effective tool for reaching a broader European audience and enhancing e-commerce performance.
5. Focus On The Homepage And High-Level Pages
Lastly, Mueller suggests the best results can be achieved by focusing on the homepage and other high-level pages.
This strategy implies that a comprehensive site-wide overhaul may not be necessary; instead, prioritizing key pages can lead to substantial improvements in traffic with efficient resource allocation.
Conclusion
Mueller’s advice is useful for marketing and SEO professionals aiming to expand their reach within the EU market.
By implementing Mueller’s strategies, businesses can enhance their website’s visibility and relevance in European search results, thereby driving targeted traffic and potentially boosting conversions.
These tactics align with the latest SEO best practices and offer practical solutions for multinational digital marketing.
Google’s Gary Illyes answered a question about site structure, explaining why a hierarchical site structure is good for SEO and why a flat site structure is okay for simple websites.
Flat Site Structure
A flat site structure is when every page has a link from the home page, thus every page is one click away from the home page.
It’s called a flat structure because, if the linking structure were visualized, it would be flat, with every page linked on one level beneath the home page.
The flat site structure came about during a time when people used to obtain links from web directories and from reciprocal linking (where two sites agree to link to each other).
That kind of link building created massive amounts of links to the home page but not so much to the inner pages. What SEOs did to maximize how much PageRank was distributed across the website was to create the flat structure, which passed PageRank down to every page, giving every page the maximum ability to rank during a time when high-PageRank pages tended to rank better.
Not long after that strategy was invented, Google dampened the influence of PageRank as a ranking factor so that more relevant sites with lower PageRank scores had a chance to rank. That pretty much removed any reason for using a flat site structure.
Hierarchical Site Structure
A hierarchy is a way of organizing something by order of importance. In the case of a site structure, the most general level of the site topic sits at the very top, with the webpages becoming more specific (granular) the lower down the site structure you go.
A hierarchical site structure allows you to create categories that fit into a topic theme.
The home page can be a general topic that represents the entire site, like Science.
The next level down are thematic categories like Astronomy, Botany, Geology, Meteorology, and Psychology.
The next level down from Astronomy can be Astrophysics, Cosmology, Observational Astronomy, Planetary Science, and Stellar Astronomy. Each level contains articles that share the theme of its category.
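In URL terms, that hierarchy might look like this hypothetical sketch, where each directory level corresponds to a category:

```typescript
// Hypothetical URL paths reflecting the Science-site hierarchy described above.
const sciencePaths: string[] = [
  "/",                                                    // home page: Science
  "/astronomy/",                                          // category
  "/astronomy/astrophysics/",                             // subcategory
  "/astronomy/astrophysics/black-hole-thermodynamics/",   // article
];

sciencePaths.forEach((path) => console.log(path));
```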
Also Known As The Taxonomic Site Structure
The hierarchical site structure is also known as a taxonomic structure, in which things are categorized into hierarchical groups. In the early 2000s, web directories were said to be organized in a taxonomic structure, with the home page representing the main topic (Web Directory) and linking down to successively more granular (specific) categories and webpages deeper into the site.
Also Known As The Pyramid Site Structure
Hierarchical and taxonomic site structures were also known as the pyramid site structure. The top of the pyramid represents the home page, which generally ranks for the topic of the entire website, and beneath it sit successively more granular (specific) categories and webpages the further down a person clicks.
Silo Site Structure
There was a fourth visualization of site architecture, the silo, that was contemporaneous with the previous three. Like them, it had a general topic at the top and successively more granular (specific) categories and webpages beneath it.
The visualization is different for each form of site structure, but they all describe the exact same hierarchical way of organizing a website by categories and topics (also known as themes), with the general topic at the top becoming progressively more specific the lower down one goes.
The person asking the question asked:
“Which category structure: hierarchical or flat structure for my website?”
Gary Illyes answered:
“I think this largely depends on the site’s size.
For a large site it’s likely better to have a hierarchical structure; that will allow you to do funky stuff on just one section, and will also allow search engines to potentially treat different sections differently, especially when it comes to crawling.
For example, having a /news/ section for newsy content and /archives/ for old content would allow search engines to crawl /news/ faster than the other directory. If you put everything in one directory, that’s hardly possible.”
Hierarchical Site Structure
Gary offers a great reason to use a hierarchical site structure: it gives Google the opportunity to treat different sections differently, such as regarding one section of a website as dealing with the topic of news. Every category of a website can be a different topic, which helps Google separate the site into sections and understand what each one is about.
The hierarchical site structure, so good that SEOs had to give it four names.
Listen to the office hours hangout at the 1:35 mark.
Google’s search results have been hit by a spam attack for the past few days in what can only be described as completely out of control. Many domains are ranking for hundreds of thousands of keywords each, an indication that the scale of this attack could easily reach into the millions of keyword phrases.
Surprisingly, many of the domains have only been registered within the past 24-48 hours.
This recently came to my attention from a series of posts by Bill Hartzer (LinkedIn profile), where he published a link graph generated by the Majestic backlinks tool that exposed the link networks of several of the spam sites.
The link graph that he posted showed scores of websites tightly interlinking with each other, which is a fairly typical pattern for spammy link networks.
Screenshot Of Tightly Interlinked Network
Image by Bill Hartzer via Majestic
Bill and I talked about the spam sites over Facebook messenger and we both agreed that although the spammers put a lot of work into creating a backlink network, the links weren’t actually responsible for the high rankings.
Bill said:
“This, in my opinion, is partly the fault of Google, who appears to be putting more emphasis on content rather than links.”
I agree 100% that Google is putting more emphasis on content than links. But my thoughts are that the spam links are there so that Googlebot can discover the spam pages and index them, even if just for one or two days.
Once indexed the spam pages are likely exploiting what I consider two loopholes in Google’s algorithms, which I talk about next.
Out of Control Spam in Google SERPs
Multiple sites are ranking for longtail phrases that are somewhat easy to rank for, as well as phrases with a local search component, which are also easy to rank for.
Longtail phrases are keyword phrases that people do use, but exceedingly rarely. The longtail is a concept that’s been around for almost twenty years and was popularized by a 2006 book called The Long Tail: Why the Future of Business is Selling Less of More.
Spammers are able to rank for these rarely searched phrases because there is little competition for those phrases, which makes it easy to rank.
So if a spammer creates millions of pages targeting longtail phrases, those pages can start ranking for hundreds of thousands of keywords within a short period of time.
Companies like Amazon use the principle of the longtail to sell hundreds of thousands of individual products a day, which is different from selling one product a hundred thousand times per day.
That’s what the spammers are exploiting, the ease of ranking for longtail phrases.
The second thing that the spammers are exploiting is the loophole that’s inherent in Local Search.
The local search algorithm is not the same as the algorithm for ranking non-local keywords.
The examples that have come to light are variations of Craigslist and related keywords.
Examples are phrases like Craigslist auto parts, Craigslist rooms to rent, Craigslist for sale by owner and thousands of other keywords, most of which don’t use the word Craigslist.
The scale of the spam is huge, and it goes far beyond keywords with the word “Craigslist” in them.
What The Spam Page Looks Like
It’s impossible to see what the spam pages look like by visiting them with a browser.
I tried to see the source code of the sites that rank in Google but all of the spam sites automatically redirect to another domain.
I next entered the spam URL into the W3C link checker to visit the website but the W3C bot couldn’t see the site either.
So I changed my browser user agent to identify itself as Googlebot but the spam site still redirected me.
That indicated that the site was not checking if the user agent was Googlebot.
The spam site was instead checking visitor IP addresses. If the visitor’s IP address belonged to Google, the spam page displayed its content to Googlebot.
All other visitors got a redirect to other domains that displayed sketchy content.
In order to see the HTML of the website I had to visit with a Google IP address. So I used Google’s Rich Results tester to visit the spam site and record the HTML of the page.
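For comparison, the legitimate way to confirm that an IP address really belongs to Googlebot is Google’s documented reverse-and-forward DNS check. Here’s a minimal Node/TypeScript sketch of that verification; the IP shown is just an example from a published Googlebot range.

```typescript
// Node.js sketch of Google's documented Googlebot verification:
// reverse-DNS the IP, check the host ends in googlebot.com or google.com,
// then forward-resolve that host and confirm it maps back to the same IP.
import { promises as dns } from "node:dns";

async function isGooglebotIp(ip: string): Promise<boolean> {
  try {
    const hosts = await dns.reverse(ip);
    for (const host of hosts) {
      if (host.endsWith(".googlebot.com") || host.endsWith(".google.com")) {
        const addrs = await dns.resolve(host); // forward lookup (A records)
        if (addrs.includes(ip)) return true;
      }
    }
  } catch {
    // Unresolvable IPs are treated as non-Googlebot
  }
  return false;
}

isGooglebotIp("66.249.66.1").then((ok) => console.log(ok)); // IP from a known Googlebot range
```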
I showed Bill Hartzer how to extract the HTML by using the Rich Results tester and he immediately went off to tweet about it, lol. Dang!
The Rich Results Tester has an option to show the HTML of a webpage. So I copied the HTML, pasted it into a text file, then saved it as an HTML file.
Screenshot Of HTML Provided By Rich Results Tool
I next edited the HTML file to remove any JavaScript then saved the file again.
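If you want to reproduce that step, here is a quick-and-dirty TypeScript sketch. The file names are hypothetical, and a proper HTML parser would be more robust than a regex.

```typescript
// Strip <script>...</script> blocks from a saved HTML file so the page
// renders without running the spam site's redirect code.
import { readFileSync, writeFileSync } from "node:fs";

const html = readFileSync("spam-page.html", "utf8"); // hypothetical file name
const withoutScripts = html.replace(/<script[\s\S]*?<\/script>/gi, "");
writeFileSync("spam-page-clean.html", withoutScripts);
```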
I was now able to see what the webpage looks like to Google:
Screenshot Of Spam Webpage
One Domain Ranks For 300,000+ Keywords
Bill sent me a spreadsheet containing a list of keyword phrases that just one of the spam sites ranked for. One spam site, just one of them, ranked for over 300,000 keyword phrases.
Screenshot Showing Keywords For One Domain
There were a lot of Craigslist keyword phrases, but there were also other longtail phrases, many of which contained a local search element. As I mentioned, it’s easy to rank for longtail phrases and easy to rank for local search phrases; combine the two kinds of phrases and it’s really easy to rank.
Why Does This Spam Technique Work?
Local search uses a different algorithm than the non-local algorithm. For example, a local site, in general, doesn’t need a lot of links to rank for a query. The pages just need the right kinds of keywords to trigger a local search algorithm and rank it for a geographic area.
So if you search for “Craigslist auto parts” that’s going to trigger the local search algorithm and because it’s longtail it’s not going to take too much to rank it.
This has been an ongoing problem for many years. Several years ago, a website was able to rank for “Rhinoplasty Plano, Texas” with pages that contained Latin content and headings in English. Rhinoplasty is a longtail local search, and Plano, Texas, is a relatively small town. Ranking for that rhinoplasty keyword phrase was so easy that the Latin-language website was able to do it.
Google has known about this spam problem since at least December 19th, as acknowledged in a tweet by Danny Sullivan.
Yes, I already passed that one on to the search team. Here’s a peek. And it’s being looked at. pic.twitter.com/vJH3EisnXD
In 2023, AI became a part of the user interfaces of top search engines, social media networks, advertising platforms, productivity software, and SEO tools.
After a year of non-stop breaking news about generative AI research, large language models (LLMs), stealth AI startups, GPT wrappers, and AI integrations, it may be hard to determine which solution is best suited for your objectives.
But if you make a list of challenges you face throughout your workday, you’ll find there is – or soon will be – an AI that makes handling those challenges simpler.
This quick guide offers over 100 of the best AI chatbots, tools, solutions, and training for everyone in SEO, from individual consultants to enterprise marketing teams.
Why AI Matters For SEO
Simply put, AI is or will be in just about everything you and your clients use.
Search engines like Google have been using AI for decades in ranking algorithms.
Bing offers AI-powered chat, summaries, and image creation capabilities.
Google is experimenting with similar features in the Search Generative Experience (SGE) and Notes.
The top advertising, business, marketing, productivity, and SEO platforms are integrating generative AI chat and tools.
Custom AI tools can be promoted as free or essential resources to build links, increase brand visibility, and generate leads/sales.
Observing how AI chatbots conduct web searches and learning how each search system works can provide insight into how to optimize content for both search engines and AI.
Screenshot from Perplexity, December 2023
That’s especially true as Bing AI changes how search works, Perplexity strives to modernize PageRank for better answers, and Cohere aims to improve search results with rerank technology.
Screenshot from Cohere, December 2023
Important Disclaimers About Using AI
With any AI tool or online service, it’s important to remember not to share confidential or sensitive information.
Humans may review the information you share for quality assurance purposes.
Information you share may be included in training data for future models. Some AI platforms allow you to opt out of training data in your user profiles, account settings, developer profiles, AI project settings, chat settings, etc.
Because AI tools use information scraped from the internet as part of the training data, AI-generated responses may not be accurate or unique.
Always fact-check generative AI content.
Never trust generative AI content for medical, legal, or similar advice.
Know what protections (if any) are available from your AI tools for users, subscribers, businesses, enterprises, and developers accused of copyright infringement.
Some AI tools are available for research or preview only. The content generated by them is not intended for publishing, resale, or commercial use.
While never enjoyable, it’s important to review the terms of service, content use policies, and other usage policies for tools you use, especially when you rely on them for personal or professional tasks.
The Top AI Chatbots
AI chatbots have come a long way since OpenAI released the ChatGPT “research preview” in November 2022.
Many can help you with analysis, content creation, coding, documents, images, research, summarization, and other complex tasks.
Best of all, a few – Bard, Copilot, and Pi – are free.
ChatGPT, Claude, Perplexity, and Poe offer free plans and premium subscriptions starting at $20 per month.
ChatGPT gives subscribers access to GPT-4, DALL·E, file uploads, beta features, web browsing with Bing, over 1,000 third-party plugins, the ability to create custom chatbots (GPTs), and soon, a store full of GPTs.
Claude by Anthropic offers a larger context window than most, making it better for tasks like summarizing large chunks of text and analyzing uploaded documents. Subscribers get access to the latest model and beta features.
Google Bard (Gemini Pro) connects to Google Docs, Google Drive, Gmail, YouTube, Flights, Hotels, and Maps.
Microsoft Copilot (GPT-4) uses the web and plugins from Instacart, Kayak, Klarna, and Shop.
Perplexity provides the best answers to questions with deep search and sources. Open-source models from Meta and Mistral AI are available in Labs. Subscribers can switch between GPT-4, Claude, Gemini Pro, and online LLMs.
Pi (Inflection-1) distinguishes itself as a more emotionally intelligent and personalized chat.
Poe hosts 27 official chatbots powered by the latest models from Anthropic, Google, Meta, Mistral AI, OpenAI, and Stability AI. It also allows you to create custom chatbots on many of those platforms. Subscribers get more access to the latest models.
Screenshot from Poe, December 2023
Learn Prompt Engineering
Learn how to get the most out of AI chatbots with the following free courses and guides:
Six Prompt Engineering Strategies For Better Results (OpenAI)
Best Prompting Tips For Text & Image Generation Bots (Poe)
AI Chatbots On Social Media
While Bard may have access to YouTube content, other social networks have built AI chatbots directly into their platforms.
Meta launched a Meta AI chatbot that can generate images and search the web with Bing on Messenger, Instagram, and WhatsApp.
Quora integrates the ChatGPT bot from its AI platform, Poe, into some answers.
Snapchat’s My AI was one of the first AI chatbots that users could direct-message and integrate into group chats.
X, formerly known as Twitter, launched Grok for Premium+ subscribers, which has access to X posts/tweets as “sources” for responses.
ByteDance, the parent company of TikTok, was working on an AI model, but allegedly did so in a way that violated OpenAI’s and Microsoft’s terms of service. This may delay the platform’s AI plans and affect the future availability of the CapCut plugin for ChatGPT.
While not a social media platform, you can also chat with Pi by Inflection on Messenger, Instagram, and WhatsApp.
AI Chatbots With Context
As you start to use AI chatbots more frequently, you may find yourself repeating certain details to get better responses.
Some platforms give you the option to add basic details about yourself, your work, and what you are trying to accomplish so that AI provides the best response for your needs.
ChatGPT lets users give context and specific directions with Custom Instructions.
Perplexity lets users offer context, specific directions, location, and language preferences in the AI Profile.
Custom Chatbots With Minimal Coding
When you need an AI chatbot that is more tailored to your needs, you can create custom chatbots in under ten minutes.
ChatGPT allows subscribers to create GPTs – chatbots that use GPT-4 with web browsing, DALL·E, and code analysis with custom instructions, knowledge files, and actions to complete specific tasks.
Microsoft Copilot Studio allows 365 users to create custom Copilot experiences.
Poe users create custom chatbots using eight of the most popular AI models. You can give each chatbot specific instructions to follow for each conversation and knowledge files to reference.
SEO Benefits Of Creating Custom AI Chatbots And GPTs
Even if you don’t need a custom AI chatbot for yourself, think about it from an SEO perspective.
Making a free tool or inexpensive solution for you or your clients that can be promoted on a webpage = links, brand visibility, and a new tool for lead generation and sales.
You can also create a new product/income stream for your clients.
Learn How To Build Custom AI Chatbots
Learn how to build custom chatbots and GPTs with these free courses and guides:
Quickstart Guide to Building Generative AI Copilots (Microsoft)
Built-In AI Chatbots And Features
You don’t always have to find a new tool to get AI assistance. Many popular business tools, software, and online services have built-in AI chatbots and features.
In addition to having AI with direct access to your company’s data, you have the confidence that comes with a platform you already trust for reliability and security.
The following are examples of platforms with scalable AI features for advertisers, marketers, agencies, teams, and enterprises.
With Zapier (and plans starting at $20 a month), you can connect your ChatGPT account, Claude (via API), or Cohere (via API) to thousands of popular business applications to create automated workflows.
For example, I can create a workflow that connects Gmail, ChatGPT, and Slack, and runs whenever a new email matching a search is received.
Screenshot from Zapier, December 2023
You can create advanced automated workflows that use your ChatGPT account or your OpenAI account with the Assistants API.
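For those who prefer code over a no-code builder, here is a minimal TypeScript sketch of the same email-summary-to-Slack idea using the OpenAI Node library and a Slack incoming webhook. The webhook URL is a placeholder, and the script assumes OPENAI_API_KEY is set in the environment.

```typescript
// Summarize an incoming email with the OpenAI API and post the summary
// to a Slack incoming webhook.
import OpenAI from "openai";

const openai = new OpenAI(); // reads process.env.OPENAI_API_KEY

async function summarizeToSlack(emailBody: string): Promise<void> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [
      { role: "system", content: "Summarize this email in two sentences." },
      { role: "user", content: emailBody },
    ],
  });
  const summary = completion.choices[0].message.content ?? "(no summary)";

  // Placeholder webhook URL - replace with your own Slack incoming webhook
  await fetch("https://hooks.slack.com/services/XXX/YYY/ZZZ", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: summary }),
  });
}
```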
Actions For GPTs
Remember the custom AI chatbots mentioned earlier that ChatGPT subscribers can build with minimal coding?
If you’re willing to do a small amount of coding, Zapier also offers actions that connect GPTs to Zapier’s platform of app integrations.
Train AI Models And Build Generative AI Applications
If none of the AI chatbots and tools mentioned thus far provide the level of customization you need, then you may want to work with the following for enterprise AI solutions.
A common misconception is that AI is going to replace people at their jobs.
The reality is that AI could do most of the work that companies have outsourced for decades.
It’s the people who are willing to use AI to optimize workflows and increase productivity who will replace those who can’t or won’t adapt to the future of work with AI.
A publisher took to Twitter to share their reaction to what they felt was essentially theft of their content for the benefit of Google, with little to no benefit to the publisher.
Google’s response was surprising and probably not what publishers and SEOs expected.
The publisher showed a screenshot of a branded search for things to do in Denver that featured content taken directly from their site.
“They are doing this across all travel searches – unbranded and branded alike.
Example: “Mexico Travel Tips” – they have an AI answer & also a rich result that basically just re-creates an entire blog post, including our stolen photos.
Again, I am IN that Mexico packing photo!”
“Like how is it legal for Google to just essentially create entire blog posts from creators’ content and images?
I literally have a law degree from the top law school in the world, and even I can’t figure it out!
Fair use does NOT apply if you’re using the content to compete directly against the creator, which they clearly are. I can’t sit outside a movie theatre, project the movie on a wall, earn money from it, and claim fair use.
I spent SO much time taking those photos in Denver.
It was 10+ full days worth of work for me and partner Clara, going around the city to photograph everything. $100s of money spent in attraction admission fees, gas, parking.
Now Google just gets to extract all that value?
How much does Google get to take before creators say “enough is enough”?
How hard does the water have to boil before the frog jumps?
The comments show it is a prisoner’s dilemma as long as Google has a monopoly on search …”
Google Responds
Google’s SearchLiaison (aka Danny Sullivan) responded with an explanation of what’s going on. They explained how the rich result that uses the entirety of the publisher’s content also features a link back to the publisher’s webpage.
Wisely, SearchLiaison didn’t insist that Google was in the right. Instead, their response was sympathetic to the plight of the publisher.
SearchLiaison likely understood how the publisher felt because, unlike many Googlers, Danny Sullivan was himself a publisher for decades. He, probably more than any other Googler, knows what it’s like to be on the other side of Google’s fence.
“Hey Nate, this got flagged to my attention. I’ll pass along the feedback to the team. Pretty sure this isn’t a new feature. Elsewhere in the thread, you talk about it being an AI answer, and I’m pretty sure that’s not the case, either. It’s a way to refine an initial query and browse into more results.
With the example you point out, when you expand the listing, your image is there with a credit. If you click, a preview with a larger view comes up, and that lets people visit the site. Personally, I’m not a fan of the preview-to-click.
I think it should click directly to the site (feedback I’ve shared internally before, and I’ll do this again). But it’s making use of how Google Images operates, where there’s a larger preview that helps people decide if an image is relevant to their search query. Your site is also listed there, too. Click on that, people get to your site.”
“If you don’t want your images to appear in Google Search, this explains how to block them: https://developers.google.com/search/docs/crawling-indexing/prevent-images-on-your-page
I suspect you’d prefer an option to not have them appear as thumbnails in particular features. We don’t have that type of granular control, but I’ll also pass the feedback on.”
“I appreciate your thoughts and concerns. I do. The intention overall is to make search better, which includes ensuring people do indeed continue to the open web — because we know for us to thrive, the open web needs to thrive.
But I can also appreciate that this might not seem obvious from how some of the features display.
I’m going to be sharing these concerns with the search team, because they’re important.
You and other creators that are producing good content (and when you’re ranking in the top results, that’s us saying it’s good content) should feel we are supporting you.
We need to look at how what we say and how our features operate ensure you feel that way.
I’ll be including your response as part of this.”
Are Google’s Rich Results Unfair?
There’s a legal definition of what’s fair, and it may be that Google has a legal right to use website content in a manner that gives the impression that Google is “stealing” the content from a publisher to outrank that publisher with their own content.
But there’s also a subjective common sense definition of fair play that you feel in your heart. Maybe it’s that notion of fairness that many publishers feel when Google appears to use their content in a way that seems to benefit Google more than it does the publisher.
Is this one of those situations that fits into the paradigm of just because you can doesn’t mean that you should?
Researchers uncover data leaks in Google Tag Manager (GTM) as well as security vulnerabilities, arbitrary script injections and instances of consent for data collection enabled by default. A legal analysis identifies potential violations of EU data protection law.
There are many troubling revelations including that server-side GTM “obstructs compliance auditing endeavors from regulators, data protection officers, and researchers…”
GTM, developed by Google in 2012 to assist publishers in implementing third-party JavaScript scripts, is currently used on as many as 28 million websites. The research study evaluates both versions of GTM, the Client-side and the newer Server-side GTM that was introduced in 2020.
The analysis, undertaken by researchers and legal experts, revealed a number of issues inherent to the GTM architecture.
An examination of 78 Client-side Tags, 8 Server-side Tags, and two Consent Management Platforms (CMPs), revealed hidden data leaks, instances of Tags bypassing GTM permission systems in order to inject scripts, and consent set to enabled by default without any user interaction.
A significant finding pertains to the Server-side GTM. Server-side GTM works by loading and executing tags on a remote server, which creates the perception of the absence of third parties on the website. However, the study showed that this architecture allows tags running on the server to clandestinely share users’ data with third parties, circumventing browser restrictions and security measures like the Content-Security-Policy (CSP).
Methodology Used In Research On GTM Data Leaks
The researchers are from Centre Inria de l’Université, Centre Inria d’Université Côte d’Azur, and Utrecht University.
The methodology used by the researchers was to buy a domain and install GTM on a live website.
The research paper explains in detail:
“To conduct experiments and set up the GTM infrastructure, we bought a domain – we call it example.com here – and created a public website containing one basic webpage with a paragraph of text and an HTML login form. We have included a login form since Senol et al. …have recently found that user input is often leaked from the forms, so we decided to test whether Tags may be responsible for such leakage.
The website and the Server-side GTM infrastructure were hosted on a virtual machine we rented on the Microsoft Azure cloud computing platform located in a data center in the EU.
…We used the ‘profiles’ functionality of the browser to start every experiment in a fresh environment, devoid from cookies, local storage and other technologies that maintain a state.
The browser, visiting the website, was run on a computer connected to the Internet through an institutional network in the EU.
To create Client- and Server-side GTM installations, we created a new Google account, logged into it and followed the suggested steps in the official GTM documentation.”
The results of the analysis contain multiple critical findings, including that the “Google Tag” facilitates collecting multiple types of users’ data without consent and at the time of analysis it presented a security vulnerability.
Data Collection Is Hidden From Publishers
Another discovery was the extent of data collection by the “Pinterest Tag,” which garnered a significant amount of user data without disclosing it to the Publisher.
What some may find disturbing is that publishers who deploy these tags may not only be unaware of the data leaks but that the tools they rely on to help them monitor data collection don’t notify them of these issues.
The researchers documented their findings:
“We observe that the data sent by the Pinterest Tag is not visible to the Publisher on the Pinterest website, where we logged in to observe Pinterest’s disclosure about collected data.
Moreover, we find that the data collected by the Google Tag about form interaction is not shown in the Google Analytics dashboard.
This finding demonstrates that for such Tags, Publishers are not aware of the data collected by the Tags that they select.”
Injections of Third Party Scripts
Google Tag Manager has a feature for controlling tags, including third-party tags, called Web Containers. The tags can run inside a sandbox that limits their functionalities. The sandbox also uses a permission system, with one permission called inject_script that allows a script to download and run any (arbitrary) script outside of the Web Container.
The inject_script permission allows the tag to bypass the GTM permission system to gain access to all browser APIs and DOM.
Screenshot Illustrating Script Injection
The researchers analyzed 78 officially supported Client-side tags and discovered 11 tags that don’t have the inject_script permission but can inject arbitrary scripts. Seven of those eleven tags were provided by Google.
They write:
“11 out of 78 official Client-side tags inject a third-party script into the DOM bypassing the GTM permission system; and GTM “Consent Mode” enables some of the consent purposes by default, even before the user has interacted with the consent banner.”
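To make the finding concrete, this is a generic illustration (not the researchers’ code) of what arbitrary script injection looks like in the browser. Once a tag can run code like this, the injected script executes with full access to the page’s DOM and APIs, outside any sandbox limits; the URL is hypothetical.

```typescript
// Generic illustration of arbitrary script injection: create a script element
// pointing at an external URL and attach it to the page, causing the browser
// to download and execute that code with full page privileges.
function injectScript(src: string): void {
  const el = document.createElement("script");
  el.src = src; // e.g., a third-party URL the publisher never vetted
  document.head.appendChild(el);
}

injectScript("https://third-party.example/tracker.js"); // hypothetical URL
```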
The situation is even worse because it’s not just a privacy vulnerability, it’s also a security vulnerability.
The research paper explains the meaning of what they uncovered:
“This finding shows that the GTM permission system implemented in the Web Container sandbox allows Tags to insert arbitrary, uncontrolled scripts, thus opening potential security and privacy vulnerabilities to the website. We have disclosed this finding to Google via their Bug Bounty online system.”
Consent Management Platforms (CMP)
Consent Management Platforms (CMP) are a technology for managing what consent users have granted in terms of their privacy. This is a way to manage ad personalization, user data storage, analytics data storage and so on.
Google’s documentation for CMP usage states that setting the consent mode defaults is the responsibility of the marketers and publishers who use the GTM.
The defaults can be set to deny ad personalization by default, for example.
The documentation states:
“Set consent defaults

We recommend setting a default value for each consent type you are using.
The consent state values in this article are only examples. You are responsible for making sure that default consent mode is set for each of your measurement products to match your organization’s policy.”
What the researchers discovered is that CMPs for Client-side GTMs are loaded in an undefined state on the webpage and that becomes problematic when a CMP does not load default variables (referred to as undefined variables).
The problem is that GTM considers undefined variables to mean that users have given their consent to all of the undefined variables, even though the user has not consented in any way.
The researchers explained what’s happening:
“Surprisingly, in this case, GTM considers all such undefined variables to be accepted by the end user, even though the end user has not interacted with the consent banner of the CMP yet.
Among two CMPs tested (see §3.1.1), we detected this behavior for the Consentmanager CMP.
This CMP sets a default value for only two consent variables – analytics_storage and ad_storage – leaving three GTM consent variables – security_storage, personalization_storage, functionality_storage – and consent variables specific to this CMP – e.g., cmp_purpose_c56, which corresponds to the “Social Media” purpose – in an undefined state.
These extra variables are hence considered granted by GTM. As a result, all the Tags that depend on these four consent variables get executed even without user consent.”
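A practical mitigation on the publisher side, per Google’s Consent Mode documentation, is to set an explicit default for every consent type before any tag fires, so nothing is left undefined. Here is a minimal sketch; gtag is declared only for type-checking, since on a real page it is defined by the Google tag snippet.

```typescript
// Sketch of Consent Mode defaults, set before any tag fires. Explicitly
// denying every consent type avoids the "undefined means granted" behavior
// the researchers describe.
declare function gtag(...args: unknown[]): void;

gtag("consent", "default", {
  ad_storage: "denied",
  analytics_storage: "denied",
  functionality_storage: "denied",
  personalization_storage: "denied",
  security_storage: "denied",
});
```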
Legal Implications
The research paper notes that European Union privacy laws, like the General Data Protection Regulation (GDPR) and the ePrivacy Directive (ePD), regulate the processing of user data and the use of tracking technologies and impose significant fines for violations, such as requiring consent for the storage of cookies and other tracking technologies.
A legal analysis of the Client-Side GTM flagged a total of seven potential violations.
Seven Potential Violations Of Data Protection Laws
Potential violation 1. CMP scanners often miss purposes
Potential violation 2. Mapping CMP purposes to GTM consent variables is not compliant.
Potential violation 3. GTM purposes are limited to client-side storage.
Potential violation 4. GTM purposes are neither specific nor explicit.
Potential violation 5. Defaulting consent variables to “accepted” means that Tags run without consent.
Potential violation 6. Google Tag sends data independently of user’s consent decisions.
Potential violation 7. GTM allows Tag Providers to inject scripts exposing end users to security risks.
Legal analysis of Server-Side GTM
The researchers write that the findings raise legal concerns about GTM in its current state. They assert that the system introduces more legal challenges than resolutions, complicating compliance efforts and posing a challenge for regulators to monitor effectively.
These are some of the factors that caused concern about the ability to comply with regulations:
Complying with data subject rights is hard for the publisher: For both Client- and Server-Side GTM, there is no easy way for a publisher to comply with a request for access to collected data, as required by Article 15 of the GDPR. The publisher would have to manually track down every Data Collector to comply with that legal request.
Built-in consent raises trust issues: When using tags with built-in consent, publishers are forced to trust that Tag Providers actually implement the built-in consent within the code. There’s no easy way for a publisher to review the code to verify whether the Tag Provider is ignoring the consent and collecting user information anyway. Reviewing the code is impossible for official tags that are sandboxed within the gtm.js script. The researchers state that reviewing the code for compliance “requires heavy reverse engineering.”
Server-side GTM is invisible to regulatory monitoring and auditing: The researchers write that Server-side GTM obstructs compliance auditing because the data collection occurs remotely on a server.
Consent is hard to configure on GTM Server Containers: Consent management tools are missing in GTM Server Containers, which prevents CMPs from displaying the purposes and the Data Collectors as required by regulations.
Auditing is described as highly difficult:
“Moreover, auditing and monitoring is exclusively attainable by only contacting the Publisher to grant access to the configuration of the GTM Server Container.
Furthermore, the Publisher is able to change the configuration of the GTM Server Container at any point in time (e.g., before any regulatory investigation), masking any compliance check.”
Conclusion: GTM Has Pitfalls And Flaws
The researchers gave GTM poor marks for security and non-compliant defaults, stating that it introduces more legal issues than solutions, complicates compliance with regulations, and makes it hard for regulators to monitor for compliance.
Google recently launched a new design for Labs with 12 artificial intelligence (AI) experiments, tools, and projects, showcasing a commitment to advancing AI technology.
Screenshot from Labs.Google, December 2023
This redesign highlights a dozen AI products that could transform how AI is used in most workflows and organizations.
From search to productivity, here are the AI experiments you can try or sign up for on the newly designed Google Labs website.
12 Google AI Experiments
Here are the 12 AI experiments from Google that have been appearing in popular products, including Search, Workspace, and YouTube.
1. Google Search Generative Experience (SGE)
This feature – now notably missing the ending date of December 2023 – quickly summarizes topics, generates fresh ideas, and simplifies follow-up research. It represents a significant advancement in how users can interact with and leverage AI for efficient information retrieval.
Screenshot from Labs.Google, December 2023
2. TextFX For AI Writing Assistance
TextFX is an AI assistant for the creative writing process. While offering vast possibilities, users should note the need for accuracy checks.
Available from Google’s AI Test Kitchen, TextFX utilizes Google’s PaLM 2 and incorporates artistic insights to enhance artistic expression.
Screenshot from Labs.Google, December 2023
3. Google Bard With Extensions
Google Bard, now powered by Gemini Pro, seamlessly integrates with Google services like YouTube, Gmail, and Maps through extensions available in English, Japanese, and Korean.
Screenshot from Google Bard, December 2023
4. NotebookLM For Research
NotebookLM, powered by Google AI, changes how you read, take notes, ask questions, and organize ideas, offering assistance – and sources.
I have saved a lot of PDFs, Google Docs, and ebooks that I want to read but never got the chance to. When NotebookLM came out, I created folders for business, philosophy, and lifestyle.
For marketers, this could be customized in the future for meeting transcripts, book quotes, novel chapters, or corporate documents, providing a rich resource for research and idea generation.
Screenshot from Labs.Google, December 2023
In a Lab Session, video journalist Cleo Abram demonstrates how NotebookLM can assist with research by aiding complex tasks such as memory retrieval and fact-checking.
5. Duet AI For Google Workspace
Duet AI for Google Workspace enables collaborative AI-assisted writing, visualization, and organization, enhancing creative workflows. It works to simplify common tasks in Gmail, Docs, Sheets, Slides, and Meet.
To use Duet AI, users need an eligible Google Workspace plan. Those not currently using Workspace can begin by signing up for it.
Screenshot from Labs.Google, December 2023
6. A Game To Improve Image Prompting Skills
Say What You See is an AI experiment that teaches the art of image prompting.
The Google Arts & Culture Lab team, with the aid of Google AI, has created a series of images, and your task is to describe what you observe.
Your descriptions will be used to generate a new image that is inspired by the one you’re looking at. You’ll get three chances for each image to reach a certain level of visual similarity.
Keep an eye out for helpful tips that will guide you in refining your prompts.
Screenshot from Labs.Google, December 2023
7. AI Experiments On YouTube
Exclusive to YouTube Premium members, this feature provides early access to experimental AI functionalities.
This option for Premium members gives marketers and advertisers a way to prepare video content to adapt to new YouTube features, such as AI-generated conversation topics and summaries.
8. Project IDX
Project IDX could become an invaluable tool for developers, offering a quick and seamless transition into development workflows within Google Cloud.
Additionally, it enhances app optimization for multiple platforms, offering previews and simulators for different environments.
A standout feature is its generative AI assistance from Google’s Codey, which aids in code generation, completion, and translation, streamlining the development process significantly for SEO professionals.
9. Instrument Playground with MusicFX
This AI experiment allows users to generate, play, and compose music using AI.
It features over 100 instruments from around the globe, allowing for the generation of a 20-second sound clip tailored to specific moods or themes.
It also comes with a variety of modes, including Ambient, Beat, or Pitch, and an Advanced mode with a Sequencer for intricate compositions.
Screenshot from Labs.Google, December 2023
Developed from Google AI’s MusicLM research, this innovative platform offers a novel way for marketers to create distinctive, AI-generated soundtracks that resonate with their brand and audience.
10. Magic Compose For Text Replies On Android
For those uncertain about text replies, Magic Compose suggests responses with the appropriate tone and context.
11. AI Script Editor For Google Home
The Script Editor’s experimental features allow users to create advanced home automations using AI, removing the need for coding expertise.
Screenshot from Labs.Google, December 2023
12. Magic Editor For Google Photos
Google Photos’ Magic Editor offers AI photo editing, which should give advertisers better tools for editing ad campaign media faster.
Conclusion
In addition to the new design, Google shared a link to its Discord for AI Test Kitchen experiments and called for innovative AI experiment submissions, seeking projects that push the boundaries of what code can do and inspire other coders.
The bright new layout for Labs.Google and the range of AI experiments highlight Google’s relentless pursuit of innovation and its commitment to shaping a future where AI plays a pivotal role in everyday life.
It should also give marketers an idea of what new AI features will appear in search and what tools Google could have to offer for business, creativity, and productivity soon.
In 2023, the WordPress community witnessed a significant milestone in website performance, with Core Web Vitals (CWV) showing significant improvements for both mobile and desktop users.
This article delves into the specifics of these improvements, exploring their implications and the evolving landscape of web performance within the WordPress ecosystem.
What Are Core Web Vitals?
Core Web Vitals are a set of specific metrics designed to measure the quality of user experience on web pages. This set of metrics is also a confirmed ranking factor for Google Search.
As part of Google’s broader Web Vitals initiative, the metrics focus on loading performance, interactivity, and visual stability. They apply to all web pages and are important for site owners to measure and optimize.
There are three key metrics within CWV:
Largest Contentful Paint (LCP) evaluates loading performance. A good user experience is indicated when the LCP occurs within 2.5 seconds of when the page starts loading.
First Input Delay (FID) measures the interactivity of a page. For a good user experience, the FID should be 100 milliseconds or less.
Cumulative Layout Shift (CLS) assesses the visual stability of a page. A good user experience is maintained if the page has a CLS of 0.1 or less.
These metrics are designed to be measurable in real-world scenarios, reflecting critical aspects of user experience.
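For site owners who want to see these metrics from real users, Google’s open-source web-vitals JavaScript library reports them in the field. Here is a minimal TypeScript sketch, assuming the web-vitals npm package (v3, current as of this writing) is installed:

```typescript
// Collect the three Core Web Vitals from real users (field data) and log them.
import { onLCP, onFID, onCLS } from "web-vitals";

onLCP((metric) => console.log("LCP (ms):", metric.value)); // good: <= 2500 ms
onFID((metric) => console.log("FID (ms):", metric.value)); // good: <= 100 ms
onCLS((metric) => console.log("CLS:", metric.value));      // good: <= 0.1
```

In production, the callback would typically send the values to an analytics endpoint instead of the console.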
In addition to these, there are other vital metrics:
Time to First Byte (TTFB) and First Contentful Paint (FCP) are key aspects of the loading experience and help diagnose issues with LCP.
Total Blocking Time (TBT) is important for diagnosing potential interactivity issues impacting FID.
While important, they are not part of the Core Web Vitals set because they are either not field-measurable or do not directly reflect a user-centric outcome.
WordPress Core Web Vitals Improve In 2023
WordPress CWV improved substantially in 2023.
Screenshot from WordPress, December 2023
The mobile CWV passing rate increased by 8.13 percentage points, rising from 28.31% to 36.44%.
Similarly, the desktop CWV passing rate improved by 8.25 percentage points, moving from 32.55% to 40.80%.
This improvement is significant, considering the base values from which these percentages increased.
In relative terms, the new passing rates are approximately 29% higher than the previous ones on mobile (8.13 / 28.31 ≈ 0.29) and 25% higher on desktop (8.25 / 32.55 ≈ 0.25).
This progress outstrips the improvements made in the previous year, when mobile CWV improved by 6.99 percentage points and desktop by 6.25 points.
A line chart illustrates the gradual improvement of WordPress’s mobile CWV passing rate over the year, with a slight dip between March and April 2023 due to a change in the Largest Contentful Paint (LCP) algorithm calculation.
CWV Metrics For Mobile
The improvement in individual CWV metrics on mobile platforms is noteworthy.
Screenshot from WordPress, December 2023
The mobile LCP passing rate rose by 8.89 percentage points, the CLS passing rate by 4.22 points, and the FID passing rate by 0.87 points.
LCP experienced the largest increase, aligning with the WordPress performance team’s focus on this metric, considering it had the lowest base passing rate.
Despite a modest increase in FID, its already high passing rate makes this less concerning.
The TTFB rate, while not a Core Web Vital metric, is integral to LCP and received attention in 2023.
Screenshot from WordPress, December 2023
The mobile TTFB passing rate improved by 3.10 percentage points, and the desktop rate by 3.53 points.
Impact Of WordPress 2023 Releases
The release of WordPress versions 6.2, 6.3, and 6.4 focused on improvements in load time performance, particularly impacting LCP and TTFB metrics.
For each version, data was compiled comparing sites before and after updating to the new version.
This approach, though not a strict A/B comparison, helped reduce noise and provide clearer insights.
For instance, the release of WordPress 6.2 showed a 0.01 percentage point improvement in mobile LCP and 0.65 points in mobile TTFB.
Screenshot from WordPress, December 2023
Version 6.3 brought more significant improvements, with a 4.72 percentage point increase in mobile LCP.
Screenshot from WordPress, December 2023
The release of WordPress 6.4 also contributed to the improvements, albeit more modestly.
Screenshot from WordPress, December 2023
How WordPress Core Web Vitals Impact The Web
WordPress’s high usage rate means its performance has a substantial effect on the overall web.
Screenshot from WordPress, December 2023
In 2023, WordPress’s improvement in CWV passing rates exceeded those of non-WordPress sites.
For example, the mobile CWV passing rate for non-WordPress sites improved by 3.68 percentage points, compared to WordPress’s 8.13 points. This demonstrates WordPress’s significant role in enhancing web performance.
Interaction To Next Paint Arrives In March 2024
Looking forward to 2024, WordPress faces new challenges and opportunities.
Interaction to Next Paint (INP), which replaces FID as a Core Web Vital in March 2024, is a more comprehensive measure of interactivity, and its introduction is expected to lower overall CWV passing rates.
Screenshot from WordPress, December 2023
The WordPress performance team is considering this in their planning for 2024, inviting community contributions to their roadmap.
Next Steps
As a marketing professional, it is essential to stay current with the latest developments in Core Web Vitals, considering the implications for website performance and SEO.
With the upcoming shift to INP in 2024, it’s vital to prepare for these changes and consider how they might affect your website’s performance metrics.
Given this change, WordPress developers and site owners should start focusing on optimizing for INP. Prioritizing INP means optimizing your site to ensure that it responds quickly and smoothly to user interactions.
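A practical first step is simply to start collecting INP field data now. Here is a small sketch using the same web-vitals library (v3 or later), with the caveat that where you send the data is up to you:

```typescript
// Collect INP from real users so slow interactions can be found and fixed
// before INP becomes a Core Web Vital in March 2024.
import { onINP } from "web-vitals";

onINP((metric) => {
  // "good" INP is 200 ms or less; metric.rating reports the bucket
  console.log("INP (ms):", metric.value, metric.rating);
});
```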
Another suggestion was to explore more ways to improve TTFB.
This may include optimizing hosting environments, using caching strategies, or adjusting content delivery networks, rather than just focusing on the server response time within the WordPress core.
Google updated its spam policies for web search and its guide to ranking systems to clarify how it handles sites that receive a high volume of removal requests for non-consensual explicit imagery.
The changes are to the policy that specifically mentions sites that charge for the removal of negative information, but the guidance also states that Google will demote content on other sites that practice the same pattern of behavior.
Thus, a report about one site can trigger demotions of other sites that have similar kinds of exploitative removal practices.
Background Of Google’s Policies On Non-Consensual Explicit Imagery
The kind of imagery Google is demoting sites for is intimate images shared publicly on a website without the consent of the person pictured.
These are the changes to Google’s spam and demotion guidance:
Original wording:
“If we process a high volume of personal information removals involving a site with exploitative removal practices, we demote other content from the site in our results.
We also look to see if the same pattern of behavior is happening with other sites and, if so, apply demotions to content on those sites.
We may apply similar demotion practices for sites that receive a high volume of doxxing content removals.
Furthermore, we have automatic protections designed to prevent non-consensual explicit personal images from ranking highly in response to queries involving names.”
This was added:
“removals or non-consensual explicit imagery removals.”
The new version now reads (with the added wording at the end):
“If we process a high volume of personal information removals involving a site with exploitative removal practices, we demote other content from the site in our results.
We also look to see if the same pattern of behavior is happening with other sites and, if so, apply demotions to content on those sites.
We may apply similar demotion practices for sites that receive a high volume of doxxing content removals or non-consensual explicit imagery removals.”
Perhaps of interest to some is the removal of a reference to automatic protections that prevent this kind of content from ranking highly.
This is what was removed:
“Furthermore, we have automatic protections designed to prevent non-consensual explicit personal images from ranking highly in response to queries involving names.”
Why did Google remove that passage?
Is it because it said too much or because the system no longer exists? Or was it removed because it was redundant with the part that already mentions demotions?
I think it’s the latter, that it was removed because it was redundant.
Search Ranking Systems Guidance Updated
A similar edit was made to Google’s Guide to Search Ranking Systems where the same sentence about the “automatic protections” was entirely removed, possibly because it was redundant.
But new wording was added to the last sentence detailing what would trigger removal-related demotions in Google’s search results.
The additional reason for demotion is sites that experience a high volume of “non-consensual explicit imagery” removal requests.
The updated passage, with the section about “automatic systems” removed, now reads like this:
“Personal information removals: If we process a high volume of personal information removals involving a site with exploitative removal practices, we demote other content from the site in our results. We also look to see if the same pattern of behavior is happening with other sites and, if so, apply demotions to content on those sites. We may apply similar demotion practices for sites that receive a high volume of doxxing content removals or non-consensual explicit imagery removals.”
Google’s John Mueller answered the question of whether company blogs are eligible to show up in Google News.
Google News
Google News can be a tremendous source of readers who are interested in news, and that traffic can serve as the foundation for a steady stream of advertising income.
And while “news” is generally found on news websites, is it possible for a company blog to show up in Google News?
That’s the question someone asked in the recent Google Office Hours for December 2023.
The person asking the question asked:
“Are company-owned blogs eligible to be included in the Google News feed?”
Google’s John Mueller answered:
“I can’t speak directly for Google News, since I work on Search, which is somewhat separate, but looking at their content policies, I don’t see anything specific to company blogs.”
Mueller is right that there isn’t anything specific to company blogs in the policies. However, those who are interested in getting into Google News, including a company blog, should take a peek at the content policies because there are recommendations that are useful to know.
Aside from policies about not misleading readers and so on, there are other requirements to be aware of in the content policies for news websites:
“Clear dates and bylines
Information about the authors, publication, and publisher
Information about the company or network behind the content
Contact information”
But is having that enough for a company blog to get into Google News?
Getting Into Google News
It may be possible to get into Google News by passively publishing news content in a company blog. Google’s Publisher Center help page says that Google can automatically discover news content.
However, it may be useful to be more proactive by visiting Google’s Publisher Center, where there’s a way to submit a URL for consideration for inclusion in Google News.
Does Google Publish Company Blogs In News?
Google does indeed publish news content from company blogs.
For example, here’s a screenshot of a security company whose content is in Google News.
Screenshot Of Google News
And here’s a screenshot of a company webpage from Adobe that is also showing up in Google News:
Clearly, dedicated news sites dominate Google News. But company websites that publish news are featured as well, so it is possible.
Watch the Google Office Hours question and answer at the 15:37 minute mark: