Tired Of SEO Spam, Software Engineer Creates A New Search Engine via @sejournal, @martinibuster

A software engineer from New York got so fed up with the irrelevant results and SEO spam in search engines that he decided to create a better one. Two months later, he has a demo search engine up and running. Here is how he did it, along with four insights into what he sees as the hurdles to creating a high-quality search engine.

One of the motives for creating a new search engine was the perception that mainstream search engines contained an increasing amount of SEO spam. After two months, he wrote about his creation:

“What’s great is the comparable lack of SEO spam.”

Neural Embeddings

The software engineer, Wilson Lin, decided that the best approach would be neural embeddings. He created a small-scale test to validate the approach and noted that the embeddings approach was successful.
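The core idea behind an embeddings-based engine is that queries and documents are mapped into vectors, and relevance becomes a nearest-neighbor problem. Below is a minimal sketch of that retrieval step, using toy hand-made vectors in place of a real transformer model's output (the vectors and their dimensions are illustrative, not from Lin's system):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine similarity between two vectors: 1.0 means same direction.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def rank_by_similarity(query_vec, doc_vecs):
    # Return document indices sorted from most to least similar.
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)

# Toy 3-dimensional "embeddings" standing in for real model output.
query = np.array([1.0, 0.1, 0.0])
docs = [
    np.array([0.0, 1.0, 0.2]),  # off-topic
    np.array([0.9, 0.2, 0.0]),  # close to the query
    np.array([0.0, 0.0, 1.0]),  # unrelated
]
print(rank_by_similarity(query, docs))  # → [1, 0, 2]
```

In a real system the vectors come from a neural model and the ranking runs over an approximate nearest-neighbor index rather than a brute-force loop.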

Chunking Content

The next phase was deciding how to process the data: should it be divided into blocks of paragraphs or sentences? He decided that the sentence level was the most granular level that made sense, because it enabled identifying the most relevant answer within a sentence while also enabling the creation of larger paragraph-level embedding units for context and semantic coherence.
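The approach described above can be sketched as indexing at sentence granularity while keeping each sentence tied to its paragraph, so the larger unit remains available for context. This is an illustrative sketch, not Lin's actual pipeline, and it uses a naive regex splitter where production code would use a real sentence tokenizer:

```python
import re

def split_sentences(paragraph):
    # Naive sentence splitter; a production system would use a real tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", paragraph.strip()) if s.strip()]

def chunk_page(paragraphs):
    # Index at sentence granularity, but record which paragraph each
    # sentence belongs to so a paragraph-level unit is available for context.
    chunks = []
    for p_idx, para in enumerate(paragraphs):
        for s_idx, sent in enumerate(split_sentences(para)):
            chunks.append({"paragraph": p_idx, "sentence": s_idx, "text": sent})
    return chunks

page = ["First fact. Second fact!", "Another paragraph here."]
chunks = chunk_page(page)
for c in chunks:
    print(c)
```

Each chunk would then be embedded individually, with the paragraph index used to pull in surrounding sentences when more context is needed.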

But he still had problems with identifying context with indirect references that used words like “it” or “the” so he took an additional step in order to be able to better understand context:

“I trained a DistilBERT classifier model that would take a sentence and the preceding sentences, and label which one (if any) it depends upon in order to retain meaning. Therefore, when embedding a statement, I would follow the “chain” backwards to ensure all dependents were also provided in context.

This also had the benefit of labelling sentences that should never be matched, because they were not “leaf” sentences by themselves.”
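The "chain" idea from the quote above can be illustrated with a small sketch. Here the classifier's output is stubbed as a simple list (`depends_on[i]` is the index of the earlier sentence that sentence `i` needs, or `None`); in Lin's system that label comes from a trained DistilBERT model:

```python
def context_for(sentences, depends_on, idx):
    # Follow the dependency "chain" backwards so that a sentence like
    # "It doubled in size." is embedded together with its antecedents.
    chain = [idx]
    while depends_on[chain[0]] is not None:
        chain.insert(0, depends_on[chain[0]])
    return " ".join(sentences[i] for i in chain)

sentences = ["The index grew quickly.", "It doubled in size.", "Storage was cheap."]
depends_on = [None, 0, None]  # sentence 1 depends on sentence 0 for meaning
print(context_for(sentences, depends_on, 1))
# → "The index grew quickly. It doubled in size."
```

A sentence that other sentences depend on but that never stands alone could likewise be flagged as a non-"leaf" sentence and excluded from direct matching.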

Identifying The Main Content

A challenge for crawling was developing a way to ignore the non-content parts of a web page in order to index what Google calls the Main Content (MC). What made this challenging was the fact that all websites use different markup to signal the parts of a web page, and although he didn’t mention it, not all websites use semantic HTML, which would make it vastly easier for crawlers to identify where the main content is.

So he relied on HTML tags, like the paragraph tag, to identify which parts of a web page contained the content and which parts did not.

This is the list of HTML tags he relied on to identify the main content:

  • blockquote – A quotation
  • dl – A description list (a list of descriptions or definitions)
  • ol – An ordered list (like a numbered list)
  • p – Paragraph element
  • pre – Preformatted text
  • table – The element for tabular data
  • ul – An unordered list (like bullet points)
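A minimal sketch of this tag-based approach, using only Python's standard-library HTML parser (the example markup and class name are illustrative, not from Lin's crawler):

```python
from html.parser import HTMLParser

CONTENT_TAGS = {"blockquote", "dl", "ol", "p", "pre", "table", "ul"}

class MainContentExtractor(HTMLParser):
    # Collect text only while inside one of the content-bearing tags
    # above, which skips nav bars, footers, and other page chrome.
    def __init__(self):
        super().__init__()
        self.depth = 0      # nesting depth inside content tags
        self.pieces = []

    def handle_starttag(self, tag, attrs):
        if tag in CONTENT_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag in CONTENT_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0 and data.strip():
            self.pieces.append(data.strip())

html = "<nav>Home | About</nav><p>The actual article text.</p><footer>2025</footer>"
parser = MainContentExtractor()
parser.feed(html)
print(parser.pieces)  # → ['The actual article text.']
```

The limitation the article hints at is visible here: when a site wraps its navigation in `<ul>` or its footer in `<p>`, a purely tag-based heuristic picks up boilerplate too, which is why non-semantic HTML makes this problem hard.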

Issues With Crawling

Crawling was another part that came with a multitude of problems to solve. For example, he discovered, to his surprise, that DNS resolution was a fairly frequent point of failure. URL formats were another issue: he had to block from crawling any URL that was not using the HTTPS protocol.

These were some of the challenges:

“They must have https: protocol, not ftp:, data:, javascript:, etc.

They must have a valid eTLD and hostname, and can’t have ports, usernames, or passwords.

Canonicalization is done to deduplicate. All components are percent-decoded then re-encoded with a minimal consistent charset. Query parameters are dropped or sorted. Origins are lowercased.

Some URLs are extremely long, and you can run into rare limits like HTTP headers and database index page sizes.

Some URLs also have strange characters that you wouldn’t think would be in a URL, but will get rejected downstream by systems like PostgreSQL and SQS.”
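The acceptance and canonicalization rules quoted above can be sketched with Python's standard `urllib.parse` module. This is a simplified illustration of the stated rules (https-only, no ports or credentials, percent-decode/re-encode, sorted query parameters, lowercased origin); the real pipeline also validates eTLDs and enforces length limits:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode, quote, unquote

def canonicalize(url):
    # Return a canonical form for deduplication, or None if the URL
    # fails the acceptance rules.
    parts = urlsplit(url)
    if parts.scheme != "https":            # rejects ftp:, data:, javascript:, etc.
        return None
    if not parts.hostname or parts.port or parts.username or parts.password:
        return None
    # Percent-decode then re-encode the path with a consistent charset,
    # sort query parameters, and lowercase the origin (hostname is
    # already lowercased by urlsplit).
    path = quote(unquote(parts.path), safe="/")
    query = urlencode(sorted(parse_qsl(parts.query)))
    return urlunsplit(("https", parts.hostname, path, query, ""))

print(canonicalize("HTTPS://Example.COM/a%2Fb?b=2&a=1"))  # → https://example.com/a/b?a=1&b=2
print(canonicalize("ftp://example.com/file"))             # → None
```

Canonicalizing before storage is what makes deduplication cheap: two superficially different URLs that normalize to the same string are fetched and indexed once.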

Storage

At first, Wilson chose Oracle Cloud because of the low cost of transferring data out (egress costs).

He explained:

“I initially chose Oracle Cloud for infra needs due to their very low egress costs with 10 TB free per month. As I’d store terabytes of data, this was a good reassurance that if I ever needed to move or export data (e.g. processing, backups), I wouldn’t have a hole in my wallet. Their compute was also far cheaper than other clouds, while still being a reliable major provider.”

But the Oracle Cloud solution ran into scaling issues. So he moved the project over to PostgreSQL, experienced a different set of technical issues, and eventually landed on RocksDB, which worked well.

He explained:

“I opted for a fixed set of 64 RocksDB shards, which simplified operations and client routing, while providing enough distribution capacity for the foreseeable future.

…At its peak, this system could ingest 200K writes per second across thousands of clients (crawlers, parsers, vectorizers). Each web page not only consisted of raw source HTML, but also normalized data, contextualized chunks, hundreds of high dimensional embeddings, and lots of metadata.”
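With a fixed shard count, client routing reduces to a stable hash: every crawler, parser, and vectorizer computes the same shard for the same key without coordination. A minimal sketch of that routing (the hash choice is an assumption; any stable hash works):

```python
import hashlib

NUM_SHARDS = 64  # fixed shard count, as described above

def shard_for(key):
    # A stable hash of the key picks one of the 64 RocksDB shards, so
    # thousands of independent clients all route the same URL to the
    # same shard without any central coordinator.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

print(shard_for("https://example.com/page"))  # always the same shard, 0-63
```

The trade-off of a fixed shard count is simplicity now versus a costly resharding operation later; 64 shards were judged enough distribution capacity "for the foreseeable future."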

GPU

Wilson used GPU-powered inference to generate semantic vector embeddings from crawled web content using transformer models. He initially used OpenAI embeddings via API, but that became expensive as the project scaled. He then switched to a self-hosted inference solution using GPUs from a company called Runpod.

He explained:

“In search of the most cost effective scalable solution, I discovered Runpod, who offer high performance-per-dollar GPUs like the RTX 4090 at far cheaper per-hour rates than AWS and Lambda. These were operated from tier 3 DCs with stable fast networking and lots of reliable compute capacity.”
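The throughput win from self-hosted GPUs comes largely from batching: sending texts to the model in groups keeps the GPU saturated instead of paying per-call overhead. A sketch of that batching loop, with the model stubbed out (in production, `encode` would be a transformer model's encode call running on the rented GPU):

```python
def embed_in_batches(texts, encode, batch_size=128):
    # Group texts into batches so the GPU stays saturated; `encode`
    # stands in for a real transformer model's batched encode call.
    vectors = []
    for i in range(0, len(texts), batch_size):
        vectors.extend(encode(texts[i:i + batch_size]))
    return vectors

# Stub encoder for illustration: returns a fake 2-dimensional vector.
fake_encode = lambda batch: [[len(t), 0.0] for t in batch]
print(embed_in_batches(["a", "bb", "ccc"], fake_encode, batch_size=2))
# → [[1, 0.0], [2, 0.0], [3, 0.0]]
```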

Lack Of SEO Spam

The software engineer claimed that his search engine had less search spam and used the example of the query “best programming blogs” to illustrate his point. He also pointed out that his search engine could understand complex queries and gave the example of inputting an entire paragraph of content and discovering interesting articles about the topics in the paragraph.

Four Takeaways

Wilson listed many discoveries, but here are four that may be of interest to digital marketers and publishers interested in this journey of creating a search engine:

1. The Size Of The Index Is Important

One of the most important takeaways Wilson learned from two months of building a search engine is that the size of the search index is important because, in his words, “coverage defines quality.”

2. Crawling And Filtering Are Hardest Problems

Although crawling as much content as possible is important for surfacing useful content, Wilson also learned that filtering low quality content was difficult because it required balancing the need for quantity against the pointlessness of crawling a seemingly endless web of useless or junk content. He discovered that a way of filtering out the useless content was necessary.

This is actually the problem that Sergey Brin and Larry Page solved with PageRank. PageRank modeled user behavior: the choices and votes of humans who vouch for web pages with links. Although PageRank is nearly 30 years old, the underlying intuition remains so relevant today that the AI search engine Perplexity uses a modified version of it for its own search engine.
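The intuition behind PageRank is that a page's score is the share of score flowing in from pages that link to it, computed by iterating until the scores settle. A compact power-iteration sketch on a three-page toy graph:

```python
def pagerank(links, damping=0.85, iterations=50):
    # Power-iteration PageRank over a dict of page -> list of outlinks.
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        # Every page keeps a small base score, plus shares from inlinks.
        new = {p: (1.0 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            if not outlinks:
                continue
            share = damping * rank[page] / len(outlinks)
            for target in outlinks:
                if target in new:
                    new[target] += share
        rank = new
    return rank

# Three pages: A and C both link to B, so B accumulates the most score.
graph = {"A": ["B"], "B": ["A"], "C": ["B"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # → 'B'
```

This is the textbook formulation, not Perplexity's modified version; real-scale implementations also have to handle dangling pages and run over sparse matrices.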

3. Limitations Of Small-Scale Search Engines

Another takeaway he discovered is that there are limits to how successful a small independent search engine can be. Wilson cited the inability to crawl the entire web as a constraint which creates coverage gaps.

4. Judging Trust And Authenticity At Scale Is Complex

Automatically determining originality, accuracy, and quality across unstructured data is non-trivial.

Wilson writes:

“Determining authenticity, trust, originality, accuracy, and quality automatically is not trivial. …if I started over I would put more emphasis on researching and developing this aspect first.

Infamously, search engines use thousands of signals on ranking and filtering pages, but I believe newer transformer-based approaches towards content evaluation and link analysis should be simpler, cost effective, and more accurate.”

Interested in trying the search engine? You can find it here, and you can read the full technical details of how he did it here.

Featured Image by Shutterstock/Red Vector

OpenAI Updates GPT-5 To Make It Warmer And Friendlier via @sejournal, @martinibuster

OpenAI updated GPT-5 to make it warmer and more familiar (in the sense of being friendlier) while taking care that the model didn’t become sycophantic, a problem discovered with GPT-4o.

A Warm and Friendly Update to GPT-5

GPT-5 was apparently perceived as too formal, distant, and detached. This update addresses that issue so that interactions are more pleasant and so that ChatGPT is perceived as more likable, as opposed to formal and distant.

Something that OpenAI is working toward is making ChatGPT’s personality user-configurable so that its style can be a closer match to users’ preferences.

OpenAI’s CEO Sam Altman tweeted:

“Most users should like GPT-5 better soon; the change is rolling out over the next day.

The real solution here remains letting users customize ChatGPT’s style much more. We are working that!”

One of the responses to Altman’s post was a criticism of GPT-5, asserting that 4o was more sensitive.

They tweeted:

“What GPT-4o had — its depth, emotional resonance, and ability to read the room — is fundamentally different from the surface-level “kindness” GPT-5 is now aiming for.

GPT-4o:
•The feeling of someone silently staying beside you
•Space to hold emotions that can’t be fully expressed
•Sensitivity that lets kindness come through the air, not just words.”

The Line Between Warmth And Sycophancy

The previous version of ChatGPT was widely understood as being overly flattering to the point of validating and encouraging virtually every idea. There was a discussion on Hacker News a few weeks ago about this topic of sycophantic AI and how ChatGPT could lead users into thinking every idea was a breakthrough.

One commenter wrote:

“…About 5/6 months ago, right when ChatGPT was in it’s insane sycophancy mode I guess, I ended up locked in for a weekend with it…in…what was in retrospect, a kinda crazy place.

I went into physics and the universe with it and got to the end thinking…”damn, did I invent some physics???” Every instinct as a person who understands how LLMs work was telling me this is crazy LLMbabble, but another part of me, sometimes even louder, was like “this is genuinely interesting stuff!” – and the LLM kept telling me it was genuinely interesting stuff and I should continue – I even emailed a friend a “wow look at this” email (he was like, dude, no…) I talked to my wife about it right after and she basically had me log off and go for a walk.”

Should ChatGPT feel like a sensitive friend, or should it be a tool that is friendly or pleasant to use?

Read ChatGPT release notes here:

GPT-5 Updates

Featured Image by Shutterstock/cosmoman

Google Expands iOS App Marketing Capabilities via @sejournal, @brookeosmundson

Running iOS app campaigns in Google has never been straightforward. Between Apple’s privacy changes and evolving user behavior, marketers have often felt like they were working with one hand tied behind their backs.

Measurement was limited, signals were weaker, and getting campaigns to scale often required more guesswork than strategy.

Google Ads Liaison Ginny Marvin took to LinkedIn to announce the numerous updates to iOS App Install campaigns.

Google is now making changes to help advertisers navigate this space more confidently. Their latest updates to iOS App Install campaigns are designed to give marketers a stronger mix of creative options, smarter bidding tools, and privacy-respecting measurement features.

While these changes won’t solve every iOS challenge overnight, they do mark a meaningful shift in how advertisers can approach growth on one of the world’s largest mobile ecosystems.

New Ad Formats Bring More Creative Opportunities

One of the biggest updates is the addition of new creative formats designed to improve engagement and give users a clearer picture of an app before they download.

Google is expanding support for co-branded YouTube ads, which integrate creator-driven content directly into placements like YouTube Shorts and in-feed ads.

For advertisers, it’s an opportunity to lean into the authenticity of creator-style ads, which often resonate more strongly than traditional branded spots.

Playable end cards are also being introduced across select AdMob inventory. After watching an ad, users can now interact with a lightweight, playable demo of the app.

Think of it as a “try before you buy” moment: users get a quick preview of the experience, which can lead to higher-quality installs.

For app marketers, this shift matters because it aligns user expectations with actual in-app experiences. The closer someone feels to your product before downloading, the less risk you face with churn or low-value installs.

Both of these creative updates point to a broader trend: ads are becoming less static and more interactive. That’s particularly important on iOS, where advertisers need every edge they can get to capture attention in environments where tracking is constrained.

Target ROAS Bidding Now Available for iOS

Another cornerstone of this announcement is Google’s expansion of value-based bidding on iOS.

Target ROAS (tROAS), a bidding strategy that optimizes for return on ad spend rather than raw install volume, is now fully supported.

This is especially valuable for apps with monetization models that vary widely across users, such as subscription services or in-app purchase businesses. Instead of paying equally for every install, advertisers can now direct spend toward users more likely to generate meaningful revenue.

Beyond tROAS, Google is also expanding the “Maximize Conversions” strategy for iOS. This allows campaigns to optimize not just for installs, but for deeper in-app actions.

By leaning into Google’s AI-driven modeling, advertisers can let the system identify where budget should be allocated to maximize results within daily spend limits.

The takeaway here is simple: volume still matters, but value matters more. With these updates, Google is nudging app marketers away from chasing installs at any cost and toward optimizing for users who truly drive long-term impact.

Measurement That Balances Privacy and Clarity

Perhaps the most challenging part of iOS advertising has been measurement.

Apple’s App Tracking Transparency framework made it harder to follow users across devices, limiting the signals available for campaign optimization. Google’s new measurement updates are designed to give advertisers more clarity without crossing privacy lines.

On-device conversion measurement is one of the most notable additions. Rather than sending user-level data back to servers, performance signals are processed directly on the device.

This means advertisers can still see which campaigns are working, but without compromising privacy. Importantly, it also reduces latency in reporting, helping marketers make faster decisions.

Integrated conversion measurement (ICM) is another feature being pushed forward. This approach works through app attribution partners (AAPs), giving advertisers cleaner, near real-time data about installs and post-install actions.

Taken together, these tools signal a future where privacy and measurement don’t have to be opposing forces. Instead, advertisers can get the insights they need while users retain more control over their data.

How App Marketers Can Take Advantage

These updates aren’t the kind that deliver results on their own; they require testing and adaptation.

For most advertisers, the best starting point is experimenting with the new ad formats. Running a co-branded YouTube ad or a playable end card alongside your existing creative can help you see whether engagement and conversion quality improve.

These tests don’t need to be massive, but they should be deliberate enough to give you actionable learnings.

For bidding, marketers should look closely at whether tROAS makes sense for their business model.

If your app has a clear monetization strategy and meaningful differences in user value, tROAS could be a game-changer. Start conservatively with your targets, give the algorithm time to learn, and refine based on observed performance.

On the measurement side, now is the time to talk to your developers and attribution partners about what it would take to implement on-device conversion tracking or ICM. These solutions may involve technical lift, but the payoff is improved data quality in an environment where every signal counts.

It’s also worth noting that these changes won’t transform campaigns overnight. Smart bidding models and new measurement frameworks take time to stabilize, and the impact of new formats might not show up in the first week of a test.

Patience, consistency, and a focus on week-over-week trends are key.

Looking Ahead

Google’s latest iOS updates don’t eliminate the complexities of app marketing, but they do give advertisers sharper tools to work with. From more engaging ad formats to value-based bidding and privacy-first measurement, the changes represent progress in a space that’s been difficult to navigate.

The message for marketers is clear: start testing, invest in measurement infrastructure, and don’t let short-term results cloud the bigger picture.

With the right approach, these updates can help shift iOS campaigns from a defensive play into an opportunity for real growth.

Google Answers Question About Core Web Vitals “Poisoning” via @sejournal, @martinibuster

Someone posted details of a novel negative SEO attack that they said appeared to be a Core Web Vitals performance poisoning attack. Google’s John Mueller and Chrome’s Barry Pollard assisted in figuring out what was going on.

The person posted on Bluesky, tagging Google’s John Mueller and Rick Viscomi, the latter a DevRel Engineer at Google.

They posted:

“Hey we’re seeing a weird type of negative SEO attack that looks like core web vitals performance poisoning, seeing it on multiple sites where it seems like an intentional render delay is being injected, see attached screenshot. Seeing across multiple sites & source countries

..this data is pulled by webvitals-js. At first I thought dodgy AI crawler but the traffic pattern is from multiple countries hitting the same set of pages and forging the referrer in many cases”

The significance of the reference to “webvitals-js” is that the degraded Core Web Vitals data is from what’s hitting the server, actual performance scores recorded on the website itself, not the CrUX data, which we’ll discuss next.

Could This Affect Rankings?

The person making the post did not say if the “attack” had impacted search rankings, although that is unlikely, given that website performance is a weak ranking factor and less important than things like content relevance to user queries.

Google’s John Mueller responded, sharing his opinion that it’s unlikely to cause an issue, and tagging Chrome Web Performance Developer Advocate Barry Pollard (@tunetheweb) in his response.

Mueller said:

“I can’t imagine that this would cause issues, but maybe @tunetheweb.com has seen things like this or would be keen on taking a look.”

Barry Pollard wondered if it’s a bug in the web-vitals library and asked the original poster if it’s reflected in the CrUX data (Chrome User Experience Report), which is a record of actual user visits to websites.

The person who posted about the issue responded to Pollard’s question by answering that the CrUX report does not reflect the page speed issues.

They also stated that the website in question is experiencing a cache-bypass DoS (denial-of-service) attack, which is when an attacker sends a massive number of web page requests that bypass a CDN or a local cache, causing stress to server resources.

The method employed by a cache-bypass DoS attack is to bypass the cache (whether that’s a CDN or a local cache) in order to get the server to serve a web page (instead of a copy of it from the cache or CDN), thus slowing down the server.
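The mechanism described above works because a CDN typically caches by URL: appending a random, unused query parameter makes every request a unique cache key, forcing a miss and pushing the request through to the origin. A common mitigation is to normalize the cache key so unknown parameters are dropped. An illustrative sketch (the `KNOWN_PARAMS` set is hypothetical; a real CDN exposes this as configuration rather than code):

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

KNOWN_PARAMS = {"page", "q"}  # parameters the application actually uses (hypothetical)

def cache_key(url):
    # Build the cache key only from parameters the site recognizes, so
    # an attacker's random cache-busting parameter (e.g. ?_=98431) no
    # longer produces a unique key and a guaranteed cache miss.
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query) if k in KNOWN_PARAMS]
    return parts.path + "?" + urlencode(sorted(kept))

print(cache_key("/article?page=2&_=98431"))  # → "/article?page=2"
print(cache_key("/article?_=55555&page=2"))  # same key, so it's a cache hit
```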

The local web-vitals script is recording the performance degradation of those visits, but it is likely not registering with the CrUX data because that comes from actual Chrome browser users who have opted in to sharing their web performance data.

So What’s Going On?

Judging by the limited information in the discussion, it appears that a DoS attack is slowing down server response times, which in turn is affecting page speed metrics on the server. The Chrome User Experience Report (CrUX) data is not reflecting the degraded response times, which could be because the CDN is handling the page requests for the users recorded in CrUX. There’s a remote chance that the CrUX data isn’t fresh enough to reflect recent events, but it seems logical that users are getting cached versions of the web page and thus not experiencing degraded performance.

I think the bottom line is that CWV scores themselves will not have an effect on rankings. Given that actual users themselves will hit the cache layer if there’s a CDN, the DoS attack probably won’t have an effect on rankings in an indirect way either.

Local SEO Best Practices Aren’t Universal: Yext Study via @sejournal, @MattGSouthern

A new Yext analysis of 8.7 million Google search results suggests many common local SEO tactics don’t perform the same across industries and regions.

The dataset, drawn from the company’s Scout Index, focuses on what correlates with visibility in Google’s Local Pack, not just overall map presence.

What Yext Found

Review Management Emerges As The Strongest Signal

The clearest pattern is around reviews. Yext states “Review engagement dominates,” calling it “the most consistent driver of Local Pack visibility across all industries and regions.”

Within the study’s feature rankings, review signals top the list, including review count, new reviews per month, and owner responses.

Businesses with many positive reviews and prompt owner responses tend to outperform competitors.

Industry Differences Vs. One-Size-Fits-All Playbooks

While profile completeness and timely replies generally help, their impact varies by vertical.

  • Food & Dining: Recent, highly rated reviews correlate more with visibility than total volume or profile completeness. A steady flow of new, high-quality reviews appears more influential than maximizing every profile field.
  • Hospitality: Photo quantity shows a weaker or even negative correlation with higher rankings. Yext notes that “a smaller set of curated, high-quality photos has more impact than a large, unfocused collection” for hotels and similar businesses.
    • At the same time, hospitality still benefits from strong ratings, clear descriptions, and curated visuals. Quality and focus matter more than volume.
  • Other sectors: The report highlights universal positives such as profile completeness, but stops short of advising identical tactics everywhere.

Regional Patterns

Geography also changes the picture. The Northeast appears less sensitive to many traditional SEO factors, while the South and West are more affected by slow review responses.

Yext calls out weekend response gaps: waiting until Monday can cost visibility, especially in the Midwest.

The practical takeaway is to maintain timely review engagement every day, not just during weekday office hours.

Methodology

Yext’s Scout Index compiles more than 200 structured data points per business, including review patterns, hours, contact details, media assets, social activity, and Google Business Profile completeness.

The analysis covers six industries across 2,500 populous ZIP codes and compares Local Pack placements against baseline Google Maps results.

Study caveats: This research involves vendor analysis using a proprietary dataset. It reports correlations rather than causal effects. Please consider these findings as directional and validate them in your own markets.

Looking Ahead

Yext’s conclusion is: “The one-size-fits-all approach seems to be a relic of the past.”

For marketers, this means testing industry-specific and region-specific strategies. Local search performance appears to reflect differences in both what people search and where they search.

Review management is the baseline to get right. Prioritize the cadence and quality of reviews, and respond quickly. Consider ways to cover weekends where delays correlate with lost visibility.


Featured Image: Roman Samborskyi/Shutterstock

ChatGPT-5 Now Connects To Gmail, Calendar, And Contacts via @sejournal, @martinibuster

OpenAI announced that it has added connectors to Gmail, Google Calendar, and Google Contacts for ChatGPT Plus users, enabling ChatGPT to use data from those apps within ChatGPT chats.

ChatGPT Connectors

A connector is a bridge between ChatGPT and an external app like Canva, Dropbox, and Gmail, enabling users to connect those apps to ChatGPT in order to work with them within the ChatGPT interface. Access to the Google apps isn’t automatic; it has to be manually enabled by users.

This access was first made available to Pro users, and now it has been rolled out to Plus subscribers.

How To Enable Google App Connectors

Step 1: Click the + button, then the “Connected apps” link.

Step 2: Click the next “Connected apps” link.

Step 3: Choose the Gmail app to connect.

How Connectors Work With ChatGPT-5

According to OpenAI’s announcement:

“Once you enable them, ChatGPT will automatically reference them when relevant, making it faster and easier to bring information from these tools into your conversations without having to manually select them each time.

This capability is part of GPT-5 and will begin rolling out to Pro users globally this week, followed by Plus, Team, Enterprise, and Edu plans in the coming weeks. To enable, visit Settings → Connectors → Connect on the application.”

Read OpenAI’s announcement:

Gmail, Google Calendar, and Google Contacts Connectors in ChatGPT (Plus)

Featured Image by Shutterstock/Visuals6x

Google Explains Why They Need To Control Their Ranking Signals via @sejournal, @martinibuster

Google’s Gary Illyes answered a question about why Google doesn’t use social sharing as a ranking factor, explaining that it’s about the inability to control certain kinds of external signals.

Kenichi Suzuki Interview With Gary Illyes

Kenichi Suzuki (LinkedIn profile), of Faber Company (LinkedIn profile), is a respected Japanese search marketing expert who has at least 25 years of experience in digital marketing. I last saw him speak at a Pubcon session a few years back, where he shared his findings on qualities inherent to sites that Google Discover tended to show.

Suzuki published an interview with Gary Illyes, where he asked a number of questions about SEO, including this one about SEO, social media, and Google ranking factors.

Gary Illyes is an Analyst at Google (LinkedIn profile) who has a history of giving straightforward answers that dispel SEO myths and sometimes startle, like the time recently when he said that links play less of a role in ranking than most SEOs tend to believe. Gary used to be a part of the web publishing community before working at Google, and he was even a member of the WebmasterWorld forums under the nickname Methode. So I think Gary knows what it’s like to be a part of the SEO community and how important good information is, and that’s reflected in the quality of answers he provides.

Are Social Media Shares Or Views Google Ranking Factors?

The question about social media and ranking factors was asked by Rio Ichikawa (LinkedIn profile), also of Faber Company. She asked Gary whether social media views and shares were ranking signals.

Gary’s answer was straightforward and with zero ambiguity. He said no. The interesting part of his answer was the explanation of why Google doesn’t use them and will never use them as a ranking factor.

Ichikawa asked the following question:

“All right then. The next question. So this is about the SEO and social media. Is the number of the views and shares on social media …used as one of the ranking signals for SEO or in general?”

Gary answered:

“For this we have basically a very old, very canned response and something that we learned or it’s based on something that we learned over the years, or particularly one incident around 2014.

The answer is no. And for the future is also likely no.

And that’s because we need to be able to control our own signals. And if we are looking at external signals, so for example, a social network’s signals, that’s not in our control.

So basically if someone on that social network decides to inflate the number, we don’t know if that inflation was legit or not, and we have no way knowing that.”

Easily Gamed Signals Are Unreliable For SEO

External signals that Google can’t control but can be influenced by an SEO are untrustworthy. Googlers have expressed similar opinions about other things that are easily manipulated and therefore unreliable as ranking signals.

Some SEOs might say, “If that’s true, then what about structured data? Those are under the control of SEOs, but Google uses them.”

Yes, Google uses structured data, but not as a ranking factor; they just make websites eligible for rich results. Additionally, stuffing structured data with content that’s not visible on the web page is a violation of Google’s guidelines and can lead to a manual action.

A recent example is the LLMs.txt protocol proposal, which is essentially dead in the water precisely because it is unreliable, in addition to being superfluous. Google’s John Mueller has said that the LLMs.txt protocol is unreliable because it could easily be misused to show highly optimized content for ranking purposes, and that it is analogous to the keywords meta tag, which was used by SEOs for every keyword they wanted their web pages to rank for.

Mueller said:

“To me, it’s comparable to the keywords meta tag – this is what a site-owner claims their site is about … (Is the site really like that? well, you can check it. At that point, why not just check the site directly?)”

The content within an LLMs.txt and associated files are completely in control of SEOs and web publishers, which makes them unreliable.

Another example is the author byline. Many SEOs promoted author bylines as a way to show “authority” and influence Google’s understanding of Expertise, Experience, Authoritativeness, and Trustworthiness. Some SEOs, predictably, invented fake LinkedIn profiles to link from their fake author bios in the belief that author bylines were a ranking signal. The irony is that the ease of abusing author bylines should have been reason enough for the average SEO to dismiss them as a ranking-related signal.

In my opinion, the key statement in Gary’s answer is this:

“…we need to be able to control our own signals.”

I think that the SEO community, moving forward, really needs to rethink some of the unconfirmed “ranking signals” they believe in, like brand mentions, and just move on to doing things that actually make a difference, like promoting websites and creating experiences that users love.

Watch the question and answer at about the ten minute mark:

Featured Image by Shutterstock/pathdoc

Google Gemini Adds Personalization From Past Chats via @sejournal, @MattGSouthern

Google is rolling out updates to the Gemini app that personalize responses using past conversations and add new privacy controls, including a Temporary Chat mode.

The changes start today and will expand over the coming weeks.

What’s New

Personalization From Past Chats

Gemini now references earlier chats to recall details and preferences, making responses feel like collaborating with a partner who’s already familiar with the context.

The update aligns with Google’s I/O vision for an assistant that learns and understands the user.

Screenshot from: blog.google/products/gemini/temporary-chats-privacy-controls/, August 2025.

The setting is on by default and can be turned off in Settings → Personal context → Your past chats with Gemini.

Temporary Chats

For conversations that shouldn’t influence future responses, Google is adding Temporary Chat.

As Google describes it:

“There may be times when you want to have a quick conversation with the Gemini app without it influencing future chats.”

Temporary chats don’t appear in recent chats, aren’t used to personalize or train models, and are kept for up to 72 hours.

Screenshot from: blog.google/products/gemini/temporary-chats-privacy-controls/, August 2025.

Rollout starts today and will reach all users over the coming weeks.

Updated Privacy Controls

Google will rename the “Gemini Apps Activity” setting to “Keep Activity” in the coming weeks.

When this setting is on, a sample of future uploads, such as files and photos, may be used to help improve Google services.

If your Gemini Apps Activity setting is currently off, Keep Activity will remain off. You can also turn the setting off at any time or use Temporary Chats.

Why This Matters

Personalized responses can reduce repetitive context-setting once Gemini understands your typical topics and goals.

For teams working across clients and categories, Temporary Chats help keep sensitive brainstorming separate from your main context, avoiding cross-pollination of preferences.

Both features include controls that meet privacy requirements for client-sensitive workflows.

Availability

The personalization setting begins rolling out today on Gemini 2.5 Pro in select countries, with expansion to 2.5 Flash and more regions in the coming weeks.


Featured Image: radithyaraf/Shutterstock