Google Confirms It Uses Something Similar To MUVERA via @sejournal, @martinibuster

Google’s Gary Illyes answered questions during the recent Search Central Live Deep Dive in Asia about whether Google uses the new Multi‑Vector Retrieval via Fixed‑Dimensional Encodings (MUVERA) retrieval method and whether it uses Graph Foundation Models.

MUVERA

Google recently announced MUVERA in a blog post and a research paper: a method that improves retrieval by turning complex multi-vector search into fast single-vector search. It compresses sets of token embeddings into fixed-dimensional vectors that closely approximate their original similarity. This lets it use optimized single-vector search methods to quickly find good candidates, then re-rank them using exact multi-vector similarity. Compared to older systems like PLAID, MUVERA is faster, retrieves fewer candidates, and still improves recall, making it a practical solution for large-scale retrieval.

The key points about MUVERA are:

  • MUVERA converts multi-vector sets into fixed vectors using Fixed Dimensional Encodings (FDEs), which are single-vector representations of multi-vector sets.
  • These FDEs (Fixed Dimensional Encodings) match the original multi-vector comparisons closely enough to support accurate retrieval.
  • MUVERA retrieval uses MIPS (Maximum Inner Product Search), an established search technique used in retrieval, making it easier to deploy at scale.
  • Reranking: After using fast single-vector search (MIPS) to quickly narrow down the most likely matches, MUVERA re-ranks them using Chamfer similarity, a more detailed multi-vector comparison method. This final step restores the full accuracy of multi-vector retrieval, so you get both speed and precision.
  • MUVERA is able to find more of the precisely relevant documents with a lower processing time than the state-of-the-art retrieval baseline (PLAID) it was compared to.
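The two-stage flow described in the points above can be illustrated with a small sketch. This is not Google’s implementation: it is a simplified, pure-Python approximation that assumes SimHash-style random hyperplanes for bucketing, sums query vectors and averages document vectors within each bucket, and uses a brute-force scan in place of a real MIPS index. The function names (chamfer, fde, retrieve) and the toy data are hypothetical, and refinements such as repetitions, dimensionality reduction, and the handling of empty buckets are left out.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def chamfer(query_vecs, doc_vecs):
    """Chamfer similarity: match each query vector to its best-scoring
    document vector, then sum those best scores."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)


def make_planes(dim, num_planes, seed=0):
    """Random hyperplanes that split the embedding space into buckets."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def bucket_id(vec, planes):
    """SimHash-style bucket: one bit per hyperplane side."""
    bits = 0
    for plane in planes:
        bits = (bits << 1) | (1 if dot(vec, plane) >= 0 else 0)
    return bits


def fde(vecs, planes, is_query):
    """Collapse a set of vectors into one fixed-length vector.
    Queries sum the vectors in each bucket; documents average them.
    Empty buckets stay at zero (a simplification)."""
    dim = len(vecs[0])
    num_buckets = 2 ** len(planes)
    sums = [[0.0] * dim for _ in range(num_buckets)]
    counts = [0] * num_buckets
    for v in vecs:
        b = bucket_id(v, planes)
        counts[b] += 1
        for i, x in enumerate(v):
            sums[b][i] += x
    out = []
    for b in range(num_buckets):
        scale = 1.0 if (is_query or counts[b] == 0) else 1.0 / counts[b]
        out.extend(x * scale for x in sums[b])
    return out


def retrieve(query_vecs, docs, planes, k):
    """Stage 1: rank documents by FDE inner product (brute-force MIPS).
    Stage 2: re-rank the top k candidates with exact Chamfer similarity."""
    q_fde = fde(query_vecs, planes, is_query=True)
    by_fde = sorted(
        docs,
        key=lambda name: dot(q_fde, fde(docs[name], planes, is_query=False)),
        reverse=True,
    )
    candidates = by_fde[:k]
    return sorted(candidates, key=lambda name: chamfer(query_vecs, docs[name]), reverse=True)


# Hypothetical toy data: two documents with two token embeddings each.
planes = make_planes(dim=2, num_planes=2)
docs = {"a": [[1.0, 0.0], [0.0, 1.0]], "b": [[-1.0, 0.0], [0.0, -1.0]]}
print(retrieve([[1.0, 0.0]], docs, planes, k=2))
```

In a production setting the first stage would presumably run against an approximate nearest-neighbor index rather than a full scan; the sketch only shows why one fixed-length vector per document is enough to produce candidates for re-ranking.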

Google Confirms That They Use MUVERA

José Manuel Morgal put the question to Google’s Gary Illyes, who jokingly asked what MUVERA was and then confirmed that Google uses a version of it.

This is how José described the question and answer:

“An article has been published in Google Research about MUVERA and there is an associated paper. Is it currently in production in Search?

His response was to ask me what MUVERA was haha and then he commented that they use something similar to MUVERA but they don’t name it like that.”

Does Google Use Graph Foundation Models (GFMs)?

Google recently published a blog announcement about an AI breakthrough called a Graph Foundation Model.

Google’s Graph Foundation Model (GFM) is a type of AI that learns from relational databases by turning them into graphs, where rows become nodes and the connections between tables become edges.

Unlike older models (machine learning models and graph neural networks (GNNs)) that only work on one dataset, GFMs can handle new databases with different structures and features without retraining on the new data. GFMs use a large AI model to learn how data points relate across tables. This lets GFMs find patterns that regular models miss, and they perform much better in tasks like detecting spam in Google’s scaled systems. GFMs are a big step forward because they bring foundation-model flexibility to complex structured data.

Graph Foundation Models represent a notable achievement because their improvements are not incremental. They are an order-of-magnitude improvement, with performance gains of 3x to 40x in average precision.

José next asked Illyes if Google uses Graph Foundation Models and Gary again jokingly feigned not knowing what José was talking about.

He related the question and answer:

“An article has been published in Google Research about Graph Foundation Models for data, this time there are not paper associated with it. Is it currently in production in Search?

His answer was the same as before, asking me what Graph Foundation Models for data was, and he thought it was not in production. He did not know because there are not associated paper and on the other hand, he commented me that he did not control what is published in Google Research blog.”

Gary expressed his opinion that Graph Foundation Models are not currently used in Search. At this point, that’s the best information we have.

Is GFM Ready For Scaled Deployment?

The official Graph Foundation Model announcement says it was tested in an internal task, spam detection in ads, which strongly suggests that real internal systems and data were used, not just academic benchmarks or simulations.

Here is what Google’s announcement says:

“Operating at Google scale means processing graphs of billions of nodes and edges where our JAX environment and scalable TPU infrastructure particularly shines. Such data volumes are amenable for training generalist models, so we probed our GFM on several internal classification tasks like spam detection in ads, which involves dozens of large and connected relational tables. Typical tabular baselines, albeit scalable, do not consider connections between rows of different tables, and therefore miss context that might be useful for accurate predictions. Our experiments vividly demonstrate that gap.”

Takeaways

Google’s Gary Illyes confirmed that a form of MUVERA is in use at Google. His answer about GFM is less definitive: as José relates it, Gary said he thinks it is not in production, which is an opinion rather than a confirmation.

Featured Image by Shutterstock/Krakenimages.com

Chrome Trial Aims To Fix Core Web Vitals For JavaScript-Heavy Sites via @sejournal, @MattGSouthern

Google Chrome is testing a new way to measure Core Web Vitals in Single Page Applications (SPAs), addressing a long-standing blind spot in performance tracking that affects SEO audits and ranking signals.

Starting with Chrome 139, developers can opt into an origin trial for the Soft Navigations API. This enables measurement of metrics like LCP, CLS, and INP even when a page updates content without a full reload.

Why This Matters For SEO

SPAs are popular for speed and interactivity, but they’ve been notoriously difficult to monitor using tools like Lighthouse, field data in CrUX, or real user monitoring scripts.

That’s because SPAs often update the page using JavaScript without triggering a traditional navigation. As a result, Google’s measurement systems and most performance tools miss those updates when calculating Core Web Vitals.

This new API aims to close that gap, giving you a clearer picture of how your site performs in the real world, especially after a user clicks or navigates within an app-like interface.

What The New API Does

Chrome’s Soft Navigations API uses built-in heuristics to detect when a soft navigation happens. For example:

  • A user clicks a link
  • The page URL updates
  • The DOM visibly changes and triggers a paint

When these conditions are met, Chrome now treats it as a navigation event for performance measurement, even though no full page load occurred.

The API introduces new metrics and enhancements, including:

  • interaction-contentful-paint – lets you measure Largest Contentful Paint after a soft navigation
  • navigationId – added to performance entries so metrics can be tied to specific navigations (crucial when URLs change mid-interaction)
  • Extensions to layout shift, event timing, and INP to work across soft navigations
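To illustrate why navigationId matters, here is a minimal sketch of how a RUM backend might attribute paint entries to individual soft navigations once they have been beaconed from the browser. The payload shape (dictionaries with entryType, navigationId, startTime, and url keys) is an assumption made for this example, not a documented format, and treating the latest interaction-contentful-paint candidate as the value to report is likewise an assumption.

```python
def last_icp_per_navigation(entries):
    """For each navigationId, keep the latest interaction-contentful-paint
    candidate. Entries of other types are ignored."""
    latest = {}
    for entry in entries:
        if entry["entryType"] != "interaction-contentful-paint":
            continue
        nav_id = entry["navigationId"]
        if nav_id not in latest or entry["startTime"] > latest[nav_id]["startTime"]:
            latest[nav_id] = entry
    return latest


# Hypothetical beacon payload covering two soft navigations in one session.
beacon = [
    {"entryType": "interaction-contentful-paint", "navigationId": "nav-1", "startTime": 1200.0, "url": "/products"},
    {"entryType": "interaction-contentful-paint", "navigationId": "nav-1", "startTime": 1450.0, "url": "/products"},
    {"entryType": "layout-shift", "navigationId": "nav-1", "startTime": 1500.0, "url": "/products"},
    {"entryType": "interaction-contentful-paint", "navigationId": "nav-2", "startTime": 5300.0, "url": "/cart"},
]

for nav_id, entry in last_icp_per_navigation(beacon).items():
    print(nav_id, entry["url"], entry["startTime"])
```

Without an identifier shared across entries, the paint at 5300 ms could not be reliably separated from the earlier /products view, which is the attribution problem the navigationId field is meant to solve.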

How To Try It

You can test this feature today in Chrome 139 using either:

  • Local testing: Enable chrome://flags/#soft-navigation-heuristics
  • Origin trial: Add a token to your site via meta tag or HTTP header to collect real user data

Chrome recommends enabling the “Advanced Paint Attribution” flag for the most complete data.

Things To Keep In Mind

Chrome’s Barry Pollard, who leads this initiative, emphasizes the API is still experimental:

“Wanna measure Core Web Vitals for SPAs?

Well we’ve been working on the Soft Navigations API for that and we’re launching a new origin trial from Chrome 139.

Take it for a run on your app, and see if it correctly detects soft navigations on your application and let us know if it doesn’t!”

Here’s what else you should know:

  • Metrics may not be supported in older Chrome versions or other browsers
  • Your RUM provider may need to support navigationId and interaction-contentful-paint for tracking
  • Some edge cases, like automatic redirects or replaceState() usage, may not register as navigations

Looking Ahead

This trial is a step toward making Core Web Vitals more accurate for modern JavaScript-heavy websites.

While the API isn’t yet integrated into Chrome’s public performance reports like CrUX, that could change if the trial proves successful.

If your site relies on React, Vue, Angular, or other SPA frameworks, now’s your chance to test how well Chrome’s new approach captures user experience.


Featured Image: Roman Samborskyi/Shutterstock

Merging SEO And Content Using Your Knowledge Graph to AI-Proof Content via @sejournal, @marthavanberkel

New AI platforms, powered by generative technologies like Google’s Gemini, Microsoft’s Copilot, Grok, and countless specialized chatbots, are rapidly becoming the front door for digital discovery.

We’ve entered an era of machine-led discovery, where AI systems aggregate, summarize, and contextualize content across multiple platforms.

Users today no longer follow a linear journey from keyword to website. Instead, they engage in conversations and move fluidly between channels and experiences.

These shifts are being driven by new types of digital engagement, including:

  • AI-generated overviews, such as AI Overviews in Google, that pull data from many sources.
  • Conversational search, such as ChatGPT and Gemini, where follow-up questions replace traditional browsing.
  • Social engagement, with platforms like TikTok equipped with their own generative search features, engaging entire generations in interactive journeys of discovery.

The result is a new definition of discoverability and a need to rethink how you manage your brand across these experiences.

It’s not enough to optimize your brand’s website for search engines. You must ensure your website content is machine-consumable and semantically connected to appear in AI-generated results.

This is why forward-thinking organizations are turning to schema markup (structured data) and building content knowledge graphs to manage the data layer that powers both traditional search and emerging AI platforms.

Semantic structured data transforms your content into a machine-readable network of information, enabling your brand to be recognized, connected, and potentially included in AI-driven experiences across channels.

In this article, we’ll explore how SEO and content teams can partner to build a content knowledge graph that fuels discoverability in the age of AI, and why this approach is critical for enterprise brands aiming to future-proof their digital presence.

Why Schema Markup Is Your Strategic Data Layer

You may be asking, “Schema markup – is that not just for rich results (visual changes in SERP)?”

Schema markup is no longer just a technical SEO tactic for achieving rich results; it can also be used to define the content on your website and its relationship to other entities within your brand.

When you apply markup in a connected way, AI systems and search engines can make more accurate inferences, which leads to better matching against user queries or prompts.

In May 2025, Google and Microsoft both reiterated that structured data makes your content “machine-readable,” which makes it eligible for certain features. [Editor’s note: Gary Illyes recently said to avoid excessive use and that Schema is not a ranking factor.]

Schema markup can be a strategic foundation for creating a data layer that feeds AI systems. While schema markup is a technical SEO approach, it all starts with content.

When You Implement Schema Markup, You’re:

Defining Entities

Schema markup clarifies the “things” your content is about, such as products, services, people, locations, and more.

It provides precise tags that help machines recognize and categorize your content accurately.

Establishing Relationships

Beyond defining individual entities (a.k.a. topics), schema markup describes how those entities connect to each other and to broader topics across the web.

This creates a web of meaning that mirrors how humans understand context and relationships.

Providing Machine-Readable Context

Schema markup helps make your content machine-readable.

It enables search engines and AI tools to confidently identify, interpret, and surface your content in relevant contexts, which can help your brand appear where it is most relevant.

Enterprise SEO and content teams can work together to implement schema markup to create a content knowledge graph, a structured representation of your brand’s expertise, offerings, and topic authority.

When you do this, the data you put into search and AI platforms is ready for large language models (LLMs) to make accurate inferences, which can help with consumer visibility.

What Is A Content Knowledge Graph?

A content knowledge graph organizes your website’s data into a network of interconnected entities and topics, all defined by implementing schema markup based on the Schema.org vocabulary. This graph serves as a digital map of your brand’s expertise and topical authority.

Imagine your website as a library. Without a knowledge graph, AI systems trying to read your site have to sift through thousands of pages, hoping to piece together meaning from scattered words and phrases.

With a content knowledge graph:

  • Entities are defined. Machines can be told precisely who, what, and where you’re talking about.
  • Topics are connected. Machines can better understand and infer how subjects relate. For example, machines can infer that “cardiology” encompasses entities like heart disease, cholesterol, or specific medical procedures.
  • Content becomes query-ready. Your content becomes structured data that AI can reference, cite, and include in responses.

When your content is organized into a knowledge graph, you’re effectively supplying AI platforms with information about your products, services, and expertise.

This becomes a powerful control point for how your brand is represented in AI search experiences.

Rather than leaving it to chance how AI systems interpret your web content, you can help to proactively shape the narrative and ensure machines have the right signals to potentially include your brand in conversations, summaries, and recommendations.

Your organization’s leaders should be aware this is now a strategic issue, not just a technical one.

A content knowledge graph gives you some influence over how your organization’s expertise and authority are recognized and distributed by AI systems, which can impact discoverability, reputation, and competitive advantage in a rapidly evolving digital landscape.

This structure can improve your chances of appearing in AI-generated answers and equips your content and SEO teams with data-driven insights to guide your content strategy and optimization efforts.

How Enterprise SEO And Content Teams Can Build A Content Knowledge Graph

Here’s how enterprise teams can operationalize a content knowledge graph to future-proof discoverability and unify SEO and content strategies:

1. Define What You Want To Be Known For

Enterprise brands should start by identifying their core topical authority areas. Ask:

  • Which topics matter most to our audience and brand?
  • Where do we want to be the recognized authority?
  • What new topics are emerging in our industry that we should own?

These strategic priorities shape the pillars of your content knowledge graph.

2. Use Schema Markup To Define Key Entities

Next, use schema markup to:

  • Identify key entities tied to your priority topics, such as products, services, people, places, or concepts.
  • Connect those entities to each other through Schema.org properties, such as “about,” “mentions,” or “sameAs.”
  • Ensure consistent entity definitions across your entire site so that AI systems can reliably identify and understand entities and their relationships.

This is how your content becomes machine-readable and more likely to be accurately included in AI-driven results and recommendations.
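As a rough illustration of the properties named above, the following sketch builds JSON-LD for a single page that is “about” one entity and “mentions” another, with “sameAs” pointing each entity at an external reference. The example.com URLs, the helper names, and the choice of the generic Thing type are placeholders for this example; a real implementation would pick the most specific Schema.org types that fit the content.

```python
import json


def entity(entity_id, name, same_as):
    """A reusable entity node. The stable @id lets every page refer
    to the same entity definition."""
    return {"@type": "Thing", "@id": entity_id, "name": name, "sameAs": same_as}


def article_markup(page_url, headline, about, mentions):
    """JSON-LD for one article, connected to entities via about/mentions."""
    return {
        "@context": "https://schema.org",
        "@type": "Article",
        "@id": page_url + "#article",
        "headline": headline,
        "about": about,
        "mentions": mentions,
    }


# Hypothetical entities and page for a healthcare brand.
cardiology = entity(
    "https://example.com/#cardiology",
    "Cardiology",
    ["https://en.wikipedia.org/wiki/Cardiology"],
)
cholesterol = entity(
    "https://example.com/#cholesterol",
    "Cholesterol",
    ["https://en.wikipedia.org/wiki/Cholesterol"],
)

markup = article_markup(
    "https://example.com/blog/heart-health",
    "Five Questions To Ask Your Cardiologist",
    about=cardiology,
    mentions=[cholesterol],
)

# The serialized string is what would go inside a
# <script type="application/ld+json"> tag on the page.
print(json.dumps(markup, indent=2))
```

Reusing the same @id for an entity on every page that references it is one way to keep entity definitions consistent across a site.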

3. Audit Your Existing Content Against Your Content Knowledge Graph

Instead of just tracking keywords, enterprises should audit their content based on entity coverage:

  • Are all priority entities represented on your site?
  • Do you have “entity homes” (pillar pages) that serve as authoritative hubs for those priority entities?
  • Where are there gaps in entity coverage that could limit your presence in search and AI responses?
  • What content opportunities exist to improve coverage of priority entities where these gaps have been identified?

A thorough audit provides a clear roadmap for aligning your content strategy with how machines interpret and surface information, ensuring your brand has the potential to be discoverable in evolving AI-driven search experiences.
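The audit questions above amount to comparing a list of priority entities against what the site’s pages actually cover. A minimal sketch follows, assuming a hypothetical page inventory in which each page lists the entities it covers and, optionally, the entity it serves as the “entity home” for; the field names are invented for illustration.

```python
def audit_entity_coverage(priority_entities, pages):
    """Report priority entities that no page covers, and covered
    entities that still lack a dedicated entity home (pillar page)."""
    covered = set()
    homes = set()
    for page in pages:
        covered.update(page["entities"])
        if page.get("entity_home"):
            homes.add(page["entity_home"])
    return {
        "missing": sorted(e for e in priority_entities if e not in covered),
        "no_entity_home": sorted(
            e for e in priority_entities if e in covered and e not in homes
        ),
    }


# Hypothetical inventory for a healthcare site.
priority = ["cardiology", "cholesterol", "heart disease"]
pages = [
    {"url": "/cardiology", "entities": {"cardiology", "heart disease"}, "entity_home": "cardiology"},
    {"url": "/blog/heart-health", "entities": {"heart disease"}},
]

print(audit_entity_coverage(priority, pages))
```

Here “cholesterol” would surface as a coverage gap, while “heart disease” is covered but has no pillar page of its own.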

4. Create Pillar Pages And Fill Content Gaps

Based on your findings from Step 3, create dedicated pillar pages for high-priority entities where needed. These become the authoritative source that:

  • Defines the entity.
  • Links to supporting content, including case studies, blog posts, or service pages.
  • Signals to search engines and AI systems where to find reliable information about that entity.

Supporting content can then be created to expand on subtopics and related entities that link back to these pillar pages, ensuring comprehensive coverage of topics.

5. Measure Performance By Entity And Topic

Finally, enterprises should track how well their content performs at the entity and topic levels:

  • Which entities drive impressions and clicks in AI-powered search results?
  • Are there emerging entities gaining traction in your industry that you should cover?
  • How does your topical authority compare to competitors?

This data-driven approach enables continuous optimization, helping you to stay visible as AI search evolves.

Why SEO And Content Teams Are The Heroes Of The AI Search Evolution

In this new landscape, where AI generates answers before users ever reach your website, schema markup and content knowledge graphs provide a critical control point.

They enable your brand to signal its authority to machines, support the possibility of accurate inclusion in AI results and overviews, and inform SEO and content investment based on data, not guesswork.

For enterprise organizations, this isn’t just an SEO tactic; it’s a strategic imperative that could protect visibility and brand presence in the new digital ecosystem.

So, the question remains: What does your brand want to be known for?

Your content knowledge graph is the infrastructure that ensures AI systems, and by extension, your future customers, know the answer.


Featured Image: Urbanscape/Shutterstock

2025 Core Web Vitals Challenge: WordPress Versus Everyone via @sejournal, @martinibuster

The Core Web Vitals Technology Report shows the top-ranked content management systems by Core Web Vitals (CWV) for the month of June (July’s statistics aren’t out yet). The breakout star this year is an e-commerce platform, which is notable because shopping sites generally have poor performance due to the heavy JavaScript and image loads necessary to provide shopping features.

This comparison also looks at the Interaction to Next Paint (INP) scores because they don’t mirror the CWV scores. INP measures how quickly a website responds visually after a user interacts with it. The phrase “next paint” refers to the moment the browser visually updates the page in response to a user’s interaction.

A poor INP score can mean that users will be frustrated with the site because it’s perceived as unresponsive. A good INP score correlates with a better user experience because of how quickly the website performs.

Core Web Vitals Technology Report

The HTTP Archive Technology Report combines two public datasets:

  1. Chrome UX Report (CrUX)
  2. HTTP Archive

1. Chrome UX Report (CrUX)
CrUX obtains its data from Chrome users who opt into providing usage statistics reporting as they browse over 8 million websites. This data includes performance on Core Web Vitals metrics and is aggregated into monthly datasets.

2. HTTP Archive
HTTP Archive obtains its data from lab tests by tools like WebPageTest and Lighthouse that analyze how pages are built and whether they follow performance best practices. Together, these datasets show how websites perform and what technologies they use.

The CWV Technology Report combines data from HTTP Archive (which tracks websites through lab-based crawling and testing) and CrUX (which collects real-user performance data from Chrome users), and that’s where the Core Web Vitals performance data of content management systems comes from.

#1 Ranked Core Web Vitals (CWV) Performer

The top-performing content management system is Duda. A remarkable 83.63% of websites on the Duda platform received a good CWV score. Duda has consistently ranked #1, and this month continues that trend.

For Interaction to Next Paint scores, Duda ranks in the second position.

#2 Ranked CWV CMS: Shopify

The next position is occupied by Shopify. 75.22% of Shopify websites received a good CWV score.

This is extraordinary because shopping sites are typically burdened with excessive JavaScript to power features like product filters, sliders, image effects, and other tools that shoppers rely on to make their choices. Shopify, however, appears to have largely solved those issues and is outperforming other platforms, like Wix and WordPress.

In terms of INP, Shopify is ranked #3, in the upper half of the rankings.

#3 Ranked CMS For CWV: Wix

Wix comes in third place, just behind Shopify. 70.76% of Wix websites received a good CWV score. In terms of INP scores, 86.82% of Wix sites received a good INP score. That puts them in fourth place for INP.

#4 Ranked CMS: Squarespace

67.66% of Squarespace sites had a good CWV score, putting them in fourth place for CWV, just a few percentage points behind the No. 3 ranked Wix.

That said, Squarespace ranks No. 1 for INP, with a total of 95.85% of Squarespace sites achieving a good INP score. That’s a big deal because INP is a strong indicator of a good user experience.

#5 Ranked CMS: Drupal

59.07% of sites on the Drupal platform had a good CWV score. That’s more than half of sites, considerably lower than Duda’s 83.63% score but higher than WordPress’s score.

But when it comes to the INP score, Drupal ranks last, with only 85.5% of sites scoring a good INP score.

#6 Ranked CMS: WordPress

Only 43.44% of WordPress sites had a good CWV score. That’s over fifteen percentage points lower than fifth-ranked Drupal. So WordPress isn’t just last in terms of CWV performance; it’s last by a wide margin.

WordPress performance hasn’t been getting better this year either. It started 2025 at 42.58%, then went up a few points in April to 44.93%, then fell back to 43.44%, finishing June at less than one percentage point higher than where it started the year.

WordPress is in fifth place for INP scores, with 85.89% of WordPress sites achieving a good INP score, just 0.39 points above Drupal, which is in last place.

But that’s not the whole story about the WordPress INP scores. WordPress started the year with a score of 86.05% and ended June with a slightly lower score.

INP Rankings By CMS

Here are the rankings for INP, with the percentage of sites exhibiting a good INP score next to the CMS name:

  1. Squarespace 95.85%
  2. Duda 93.35%
  3. Shopify 89.07%
  4. Wix 86.82%
  5. WordPress 85.89%
  6. Drupal 85.5%

As you can see, positions 3–6 are all bunched together in the eighty percent range, with only a 3.57 percentage point difference between the last-placed Drupal and the third-ranked Shopify. So, clearly, all the content management systems deserve a trophy for INP scores. Those are decent scores, especially for Shopify, which earned a second-place ranking for CWV and third place for INP.
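The gaps quoted in this comparison can be reproduced from the percentages listed in the article. The short sketch below simply re-derives the ranking and the differences; the variable names are arbitrary.

```python
# Share of sites with a good INP score (percent), as listed above.
inp_good = {
    "Squarespace": 95.85,
    "Duda": 93.35,
    "Shopify": 89.07,
    "Wix": 86.82,
    "WordPress": 85.89,
    "Drupal": 85.5,
}

# Share of sites with a good CWV score (percent), from the sections above.
cwv_good = {
    "Duda": 83.63,
    "Shopify": 75.22,
    "Wix": 70.76,
    "Squarespace": 67.66,
    "Drupal": 59.07,
    "WordPress": 43.44,
}

inp_ranking = sorted(inp_good, key=inp_good.get, reverse=True)
cwv_ranking = sorted(cwv_good, key=cwv_good.get, reverse=True)

# Gap between third-ranked Shopify and last-ranked Drupal for INP.
inp_spread = round(inp_good["Shopify"] - inp_good["Drupal"], 2)

# Gap between WordPress and Drupal for INP and for CWV.
wp_drupal_inp_gap = round(inp_good["WordPress"] - inp_good["Drupal"], 2)
drupal_wp_cwv_gap = round(cwv_good["Drupal"] - cwv_good["WordPress"], 2)

print(inp_ranking)
print(cwv_ranking)
print(inp_spread, wp_drupal_inp_gap, drupal_wp_cwv_gap)
```

The results match the figures in the text: a 3.57-point spread across INP positions 3 to 6, a 0.39-point INP gap between WordPress and Drupal, and a CWV gap between Drupal and WordPress of just over fifteen points.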

Takeaways

  • Duda Is #1
    Duda leads in Core Web Vitals (CWV) performance, with 83.63% of sites scoring well, maintaining its top position.
  • Shopify Is A Strong Performer
    Shopify ranks #2 for CWV, a surprising performance given the complexity of e-commerce platforms, and scores well for INP.
  • Squarespace #1 For User Experience
    Squarespace ranks #1 for INP, with 95.85% of its sites showing good responsiveness, indicating an excellent user experience.
  • WordPress Performance Scores Are Stagnant
    WordPress lags far behind, with only 43.44% of sites passing CWV and no signs of positive momentum.
  • Drupal Also Lags
    Drupal ranks last in INP and fifth in CWV, with over half its sites passing but still underperforming against most competitors.
  • INP Scores Are Generally High Across All CMSs
    Overall INP scores are close among the bottom four platforms, suggesting that INP scores are relatively high across all content management systems.

Find the Looker Studio rankings here (you must be logged into a Google account to view them).

Featured Image by Shutterstock/Krakenimages.com

OpenAI Is Pulling Shared ChatGPT Chats From Google Search via @sejournal, @MattGSouthern

OpenAI has rolled back a feature that allowed ChatGPT conversations shared via link to appear in Google Search results.

The company confirms it has disabled the toggle that enabled shared chats to be “discoverable” by search engines and is working to remove existing indexed links.

Shared Chats Were “Short-Lived Experiment”

When users shared a ChatGPT conversation using the platform’s built-in “Share” button, they were given the option to make the chat visible in search engines.

That feature, introduced quietly earlier this year, caused concern after thousands of personal chats started showing up in search results.

Fast Company first reported the issue, finding over 4,500 shared ChatGPT links indexed by Google, some containing personally identifiable information such as names, resumes, emotional reflections, and confidential work content.

In a statement, OpenAI confirms:

“We just removed a feature from [ChatGPT] that allowed users to make their conversations discoverable by search engines, such as Google. This was a short-lived experiment to help people discover useful conversations. This feature required users to opt-in, first by picking a chat to share, then by clicking a checkbox for it to be shared with search engines (see below).

Ultimately we think this feature introduced too many opportunities for folks to accidentally share things they didn’t intend to, so we’re removing the option. We’re also working to remove indexed content from the relevant search engines. This change is rolling out to all users through tomorrow morning.

Security and privacy are paramount for us, and we’ll keep working to maximally reflect that in our products and features.”

How the Feature Worked

By default, shared ChatGPT links were accessible only to people with the URL. But users could choose to toggle on discoverability, allowing search engines like Google to index the conversation.

That setting has now been removed, and previously shared chats will no longer be indexed going forward. However, OpenAI cautions that already-indexed content may still appear in search results temporarily due to caching.

Importantly, deleting a conversation from your ChatGPT history does not delete the public share link or remove it from search engines.

Why It Matters

The discoverability toggle was intended to encourage people to reuse outputs generated in ChatGPT, but the company acknowledges it came with unintended privacy tradeoffs.

Even though OpenAI offered explicit controls over visibility, many people may not have understood the implications of enabling search indexing.

This is a reminder to be cautious about what kinds of information you enter into AI chatbots. Although a chat starts out private, features like sharing, logging, or model training can create paths for that content to be exposed publicly.

Looking Ahead

OpenAI says it’s working with Google and other search engines to remove indexed shared links and is reassessing how public sharing features are handled in ChatGPT.

If you’ve shared a ChatGPT conversation in the past, you can check your visibility settings and delete shared links through the ChatGPT Shared Links dashboard.

Featured Image: Mehaniq/Shutterstock

The two people shaping the future of OpenAI’s research

For the past couple of years, OpenAI has felt like a one-man brand. With his showbiz style and fundraising glitz, CEO Sam Altman overshadows all other big names on the firm’s roster. Even his bungled ouster ended with him back on top—and more famous than ever. But look past the charismatic frontman and you get a clearer sense of where this company is going. After all, Altman is not the one building the technology on which its reputation rests. 

That responsibility falls to OpenAI’s twin heads of research—chief research officer Mark Chen and chief scientist Jakub Pachocki. Between them, they share the role of making sure OpenAI stays one step ahead of powerhouse rivals like Google.

I sat down with Chen and Pachocki for an exclusive conversation during a recent trip the pair made to London, where OpenAI set up its first international office in 2023. We talked about how they manage the inherent tension between research and product. We also talked about why they think coding and math are the keys to more capable all-purpose models; what they really mean when they talk about AGI; and what happened to OpenAI’s superalignment team, set up by the firm’s cofounder and former chief scientist Ilya Sutskever to prevent a hypothetical superintelligence from going rogue, which disbanded soon after he quit. 

In particular, I wanted to get a sense of where their heads are at in the run-up to OpenAI’s biggest product release in months: GPT-5.

Reports are out that the firm’s next-generation model will be launched in August. OpenAI’s official line—well, Altman’s—is that it will release GPT-5 “soon.” Anticipation is high. The leaps OpenAI made with GPT-3 and then GPT-4 raised the bar of what was thought possible with this technology. And yet delays to the launch of GPT-5 have fueled rumors that OpenAI has struggled to build a model that meets its own—not to mention everyone else’s—expectations.

But expectation management is part of the job for a company that for the last several years has set the agenda for the industry. And Chen and Pachocki set the agenda inside OpenAI.

Twin peaks 

The firm’s main London office is in St James’s Park, a few hundred meters east of Buckingham Palace. But I met Chen and Pachocki in a conference room in a coworking space near King’s Cross, which OpenAI keeps as a kind of pied-à-terre in the heart of London’s tech neighborhood (Google DeepMind and Meta are just around the corner). OpenAI’s head of research communications, Laurance Fauconnet, sat with an open laptop at the end of the table. 

Chen, who was wearing a maroon polo shirt, is clean-cut, almost preppy. He’s media trained and comfortable talking to a reporter. (That’s him flirting with a chatbot in the “Introducing GPT-4o” video.) Pachocki, in a black elephant-logo tee, has more of a TV-movie hacker look. He stares at his hands a lot when he speaks.

But the pair are a tighter double act than they first appear. Pachocki summed up their roles. Chen shapes and manages the research teams, he said. “I am responsible for setting the research roadmap and establishing our long-term technical vision.”

“But there’s fluidity in the roles,” Chen said. “We’re both researchers, we pull on technical threads. Whatever we see that we can pull on and fix, that’s what we do.”

Chen joined the company in 2018 after working as a quantitative trader at the Wall Street firm Jane Street Capital, where he developed machine-learning models for futures trading. At OpenAI he spearheaded the creation of DALL-E, the firm’s breakthrough generative image model. He then worked on adding image recognition to GPT‑4 and led the development of Codex, the generative coding model that powers GitHub Copilot.

Pachocki left an academic career in theoretical computer science to join OpenAI in 2017 and replaced Sutskever as chief scientist in 2024. He is the key architect of OpenAI’s so-called reasoning models—especially o1 and o3—which are designed to tackle complex tasks in science, math, and coding. 

When we met they were buzzing, fresh off the high of two new back-to-back wins for their company’s technology.

On July 16, one of OpenAI’s large language models came in second in the AtCoder World Tour Finals, one of the world’s most hardcore programming competitions. On July 19, OpenAI announced that one of its models had achieved gold-medal-level results on the 2025 International Math Olympiad, one of the world’s most prestigious math contests.

The math result made headlines, not only because of OpenAI’s remarkable achievement, but because rival Google DeepMind revealed two days later that one of its models had achieved the same score in the same competition. Google DeepMind had played by the competition’s rules and waited for its results to be checked by the organizers before making an announcement; OpenAI had in effect marked its own answers.

For Chen and Pachocki, the result speaks for itself. Anyway, it’s the programming win they’re most excited about. “I think that’s quite underrated,” Chen told me. A gold medal result in the International Math Olympiad puts you somewhere in the top 20 to 50 competitors, he said. But in the AtCoder contest OpenAI’s model placed in the top two: “To break into a really different tier of human performance—that’s unprecedented.”

Ship, ship, ship!

People at OpenAI still like to say they work at a research lab. But the company is very different from the one it was before the release of ChatGPT three years ago. The firm is now in a race with the biggest and richest technology companies in the world and valued at $300 billion. Envelope-pushing research and eye-catching demos no longer cut it. It needs to ship products and get them into people’s hands—and boy, it does. 

OpenAI has kept up a run of new releases—putting out major updates to its GPT-4 series, launching a string of generative image and video models, and introducing the ability to talk to ChatGPT with your voice. Six months ago it kicked off a new wave of so-called reasoning models with its o1 release, soon followed by o3. And last week it released its browser-using agent Operator to the public. It now claims that more than 400 million people use its products every week and submit 2.5 billion prompts a day. 

OpenAI’s incoming CEO of applications, Fidji Simo, plans to keep up the momentum. In a memo to the company, she told employees she is looking forward to “helping get OpenAI’s technologies into the hands of more people around the world,” where they will “unlock more opportunities for more people than any other technology in history.” Expect the products to keep coming.

I asked how OpenAI juggles open-ended research and product development. “This is something we have been thinking about for a very long time, long before ChatGPT,” Pachocki said. “If we are actually serious about trying to build artificial general intelligence, clearly there will be so much that you can do with this technology along the way, so many tangents you can go down that will be big products.” In other words, keep shaking the tree and harvest what you can.

A talking point that comes up with OpenAI folks is that putting experimental models out into the world was a necessary part of research. The goal was to make people aware of how good this technology had become. “We want to educate people about what’s coming so that we can participate in what will be a very hard societal conversation,” Altman told me back in 2022. The makers of this strange new technology were also curious what it might be for: OpenAI was keen to get it into people’s hands to see what they would do with it.

Is that still the case? They answered at the same time. “Yeah!” Chen said. “To some extent,” Pachocki said. Chen laughed: “No, go ahead.” 

“I wouldn’t say research iterates on product,” said Pachocki. “But now that models are at the edge of the capabilities that can be measured by classical benchmarks and a lot of the long-standing challenges that we’ve been thinking about are starting to fall, we’re at the point where it really is about what the models can do in the real world.”

Like taking on humans in coding competitions. The person who beat OpenAI’s model at this year’s AtCoder contest, held in Japan, was a programmer named Przemysław Dębiak, also known as Psyho. The contest was a puzzle-solving marathon in which competitors had 10 hours to find the most efficient way to solve a complex coding problem. After his win, Psyho posted on X: “I’m completely exhausted … I’m barely alive.”  

Chen and Pachocki have strong ties to the world of competitive coding. Both have competed in international coding contests in the past and Chen coaches the USA Computing Olympiad team. I asked whether that personal enthusiasm for competitive coding colors their sense of how big a deal it is for a model to perform well at such a challenge.

They both laughed. “Definitely,” said Pachocki. “So: Psyho is kind of a legend. He’s been the number one competitor for many years. He’s also actually a friend of mine—we used to compete together in these contests.” Dębiak also used to work with Pachocki at OpenAI.

When Pachocki competed in coding contests he favored those that focused on shorter problems with concrete solutions. But Dębiak liked longer, open-ended problems without an obvious correct answer.

“He used to poke fun at me, saying that the kind of contest I was into will be automated long before the ones he liked,” Pachocki recalled. “So I was seriously invested in the performance of this model in this latest competition.”

Pachocki told me he was glued to the late-night livestream from Tokyo, watching his model come in second: “Psyho resists for now.” 

“We’ve tracked the performance of LLMs on coding contests for a while,” said Chen. “We’ve watched them become better than me, better than Jakub. It feels something like Lee Sedol playing Go.”

Lee is the master Go player who lost a series of matches to DeepMind’s game-playing model AlphaGo in 2016. The results stunned the international Go community and led Lee to give up professional play. Last year he told the New York Times: “Losing to AI, in a sense, meant my entire world was collapsing … I could no longer enjoy the game.” And yet, unlike Lee, Chen and Pachocki are thrilled to be surpassed.   

But why should the rest of us care about these niche wins? It’s clear that this technology—designed to mimic and, ultimately, stand in for human intelligence—is being built by people whose idea of peak intelligence is acing a math contest or holding your own against a legendary coder. Is it a problem that this view of intelligence is skewed toward the mathematical, analytical end of the scale?

“I mean, I think you are right that—you know, selfishly, we do want to create models which accelerate ourselves,” Chen told me. “We see that as a very fast factor to progress.”  

The argument researchers like Chen and Pachocki make is that math and coding are the bedrock for a far more general form of intelligence, one that can solve a wide range of problems in ways we might not have thought of ourselves. “We’re talking about programming and math here,” said Pachocki. “But it’s really about creativity, coming up with novel ideas, connecting ideas from different places.”

Look at the two recent competitions: “In both cases, there were problems which required very hard, out-of-the-box thinking. Psyho spent half the programming competition thinking and then came up with a solution that was really novel and quite different from anything that our model looked at.”

“This is really what we’re after,” Pachocki continued. “How do we get models to discover this sort of novel insight? To actually advance our knowledge? I think they are already capable of that in some limited ways. But I think this technology has the potential to really accelerate scientific progress.” 

I returned to the question about whether the focus on math and programming was a problem, conceding that maybe it’s fine if what we’re building are tools to help us do science. We don’t necessarily want large language models to replace politicians and have people skills, I suggested.

Chen pulled a face and looked up at the ceiling: “Why not?”

What’s missing

OpenAI was founded with a level of hubris that stood out even by Silicon Valley standards, boasting about its goal of building AGI back when talk of AGI still sounded kooky. OpenAI remains as gung-ho about AGI as ever, and it has done more than most to make AGI a mainstream multibillion-dollar concern. It’s not there yet, though. I asked Chen and Pachocki what they think is missing.

“I think the way to envision the future is to really, deeply study the technology that we see today,” Pachocki said. “From the beginning, OpenAI has looked at deep learning as this very mysterious and clearly very powerful technology with a lot of potential. We’ve been trying to understand its bottlenecks. What can it do? What can it not do?”  

At the current cutting edge, Chen said, are reasoning models, which break down problems into smaller, more manageable steps, but even they have limits: “You know, you have these models which know a lot of things but can’t chain that knowledge together. Why is that? Why can’t it do that in a way that humans can?”

OpenAI is throwing everything at answering that question.

“We are probably still, like, at the very beginning of this reasoning paradigm,” Pachocki told me. “Really, we are thinking about how to get these models to learn and explore over the long term and actually deliver very new ideas.”

Chen pushed the point home: “I really don’t consider reasoning done. We’ve definitely not solved it. You have to read so much text to get a kind of approximation of what humans know.”

OpenAI won’t say what data it uses to train its models or give details about their size and shape—only that it is working hard to make all stages of the development process more efficient.

Those efforts make them confident that so-called scaling laws—which suggest that models will continue to get better the more compute you throw at them—show no sign of breaking down.

“I don’t think there’s evidence that scaling laws are dead in any sense,” Chen insisted. “There have always been bottlenecks, right? Sometimes they’re to do with the way models are built. Sometimes they’re to do with data. But fundamentally it’s just about finding the research that breaks you through the current bottleneck.” 

The faith in progress is unshakeable. I brought up something Pachocki had said about AGI in an interview with Nature in May: “When I joined OpenAI in 2017, I was still among the biggest skeptics at the company.” He looked doubtful. 

“I’m not sure I was skeptical about the concept,” he said. “But I think I was—” He paused, looking at his hands on the table in front of him. “When I joined OpenAI, I expected the timelines to be longer to get to the point that we are now.”

“There’s a lot of consequences of AI,” he said. “But the one I think the most about is automated research. When we look at human history, a lot of it is about technological progress, about humans building new technologies. The point when computers can develop new technologies themselves seems like a very important, um, inflection point.

“We already see these models assist scientists. But when they are able to work on longer horizons—when they’re able to establish research programs for themselves—the world will feel meaningfully different.”

For Chen, that ability for models to work by themselves for longer is key. “I mean, I do think everyone has their own definitions of AGI,” he said. “But this concept of autonomous time—just the amount of time that the model can spend making productive progress on a difficult problem without hitting a dead end—that’s one of the big things that we’re after.”

It’s a bold vision—and far beyond the capabilities of today’s models. But I was nevertheless struck by how Chen and Pachocki made AGI sound almost mundane. Compare this with how Sutskever responded when I spoke to him 18 months ago. “It’s going to be monumental, earth-shattering,” he told me. “There will be a before and an after.” Faced with the immensity of what he was building, Sutskever switched the focus of his career from designing better and better models to figuring out how to control a technology that he believed would soon be smarter than himself.

Two years ago Sutskever set up what he called a superalignment team that he would co-lead with another OpenAI safety researcher, Jan Leike. The claim was that this team would funnel a full fifth of OpenAI’s resources into figuring out how to control a hypothetical superintelligence. Today, most of the people on the superalignment team, including Sutskever and Leike, have left the company and the team no longer exists.   

When Leike quit, he said it was because the team had not been given the support he felt it deserved. He posted this on X: “Building smarter-than-human machines is an inherently dangerous endeavor. OpenAI is shouldering an enormous responsibility on behalf of all of humanity. But over the past years, safety culture and processes have taken a backseat to shiny products.” Other departing researchers shared similar statements.

I asked Chen and Pachocki what they make of such concerns. “A lot of these things are highly personal decisions,” Chen said. “You know, a researcher can kind of, you know—”

He started again. “They might have a belief that the field is going to evolve in a certain way and that their research is going to pan out and is going to bear fruit. And, you know, maybe the company doesn’t reshape in the way that you want it to. It’s a very dynamic field.”

“A lot of these things are personal decisions,” he repeated. “Sometimes the field is just evolving in a way that is less consistent with the way that you’re doing research.”

But alignment, both of them insist, is now part of the core business rather than the concern of one specific team. According to Pachocki, these models don’t work at all unless they work as you expect them to. There’s also little desire to focus on aligning a hypothetical superintelligence with your objectives when doing so with existing models is already enough of a challenge.

“Two years ago the risks that we were imagining were mostly theoretical risks,” Pachocki said. “The world today looks very different, and I think a lot of alignment problems are now very practically motivated.”

Still, experimental technology is being spun into mass-market products faster than ever before. Does that really never lead to disagreements between the two of them?

“I am often afforded the luxury of really kind of thinking about the long term, where the technology is headed,” Pachocki said. “Contending with the reality of the process—both in terms of people and also, like, the broader company needs—falls on Mark. It’s not really a disagreement, but there is a natural tension between these different objectives and the different challenges that the company is facing that materializes between us.”

Chen jumped in: “I think it’s just a very delicate balance.”  

Correction: we have removed a line referring to an Altman message on X about GPT-5.

The Download: OpenAI’s future research, and US climate regulation is under threat

This is today’s edition of The Download, our weekday newsletter that provides a daily dose of what’s going on in the world of technology.

The two people shaping the future of OpenAI’s research

—Will Douglas Heaven

For the past couple of years, OpenAI has felt like a one-man brand. With his showbiz style and fundraising glitz, CEO Sam Altman overshadows all other big names on the firm’s roster.

But Altman is not the one building the technology on which its reputation rests. That responsibility falls to OpenAI’s twin heads of research—chief research officer Mark Chen and chief scientist Jakub Pachocki. Between them, they share the role of making sure OpenAI stays one step ahead of powerhouse rivals like Google.

I recently sat down with Chen and Pachocki for an exclusive conversation which covered everything from how they manage the inherent tension between research and product, to what they really mean when they talk about AGI, to what happened to OpenAI’s superalignment team. 

I also wanted to get a sense of where their heads are at in the run-up to OpenAI’s biggest product release in months: GPT-5. Read the full story.

An EPA rule change threatens to gut US climate regulations

The mechanism that allows the US federal government to regulate climate change is on the chopping block.

On Tuesday, US Environmental Protection Agency administrator Lee Zeldin announced that the agency is taking aim at the endangerment finding, a 2009 rule that’s essentially the tentpole supporting federal greenhouse-gas regulations.

This might sound like an obscure legal situation, but it’s a really big deal for climate policy in the US. So let’s look at what this rule says now, what the proposed change looks like, and what it all means. Read the full story.

—Casey Crownhart

This story is part of MIT Technology Review’s “America Undone” series, examining how the foundations of US success in science and innovation are currently under threat. You can read the rest here.

It appeared first in The Spark, MIT Technology Review’s weekly climate newsletter. To receive it in your inbox every Wednesday, sign up here.

The AI Hype Index: The White House’s war on “woke AI”

Separating AI reality from hyped-up fiction isn’t always easy. That’s why we’ve created the AI Hype Index—a simple, at-a-glance summary of everything you need to know about the state of the industry. Take a look at this month’s edition of the index here.

The must-reads

I’ve combed the internet to find you today’s most fun/important/scary/fascinating stories about technology.

1 Trump has announced a new US health care records system 
Experts warn the initiative could leave patients’ medical records open to abuse. (NYT $)
+ Big Tech has pledged to work with providers and health systems. (The Hill)

2 China says it’s worried Nvidia’s chips have serious security issues
Just as the company sought to resume sales in the country. (Reuters)
+ Experts reportedly found the chips featured location tracking tech. (FT $)

3 Mark Zuckerberg believes superintelligence “is now in sight”
Although he didn’t illuminate what it even means. (The Guardian)
+ Zuckerberg has taken a leaf out of the Altman playbook. (NY Mag $)
+ Don’t expect Meta to open source any of those superintelligent models. (TechCrunch)
+ Tech billionaires are making a risky bet with humanity’s future. (MIT Technology Review)

4 NASA is in turmoil
Without a permanent leader, workers are leaving in their thousands. (WP $)

5 Google removed negative articles about a tech CEO from search results
After someone made fraudulent requests using its Refresh Outdated Content Tool. (404 Media)
+ They exploited a bug in the tool to get pages removed. (Ars Technica)

6 How AI has transformed data center design
They need to accommodate a lot more heat and power than they used to. (FT $)
+ A proposed Wyoming data center would use more electricity than its homes. (Ars Technica)
+ Apple manufacturer Foxconn wants to get involved in building data centers. (CNBC)
+ Should we be moving data centers to space? (MIT Technology Review)

7 AI agents can probe websites for security weaknesses
Especially shoddily-constructed vibe-coded ones. (Wired $)
+ Cyberattacks by AI agents are coming. (MIT Technology Review)

8 New forms of life have been filmed at the ocean’s deepest points
The abundance of life was amazing, the Chinese-led research team says. (BBC)
+ Meet the divers trying to figure out how deep humans can go. (MIT Technology Review)

9 TikTok is adding Footnotes to its clips
As AI-generated videos become even harder to spot. (The Verge)
+ This fake viral clip of rabbits on a trampoline is a great example. (404 Media)

10 What it’s like to attend an Elon Musk fan fest
X Takeover promised to unite Tesla and SpaceX-heads alike. (Insider $)
+ Some people who definitely aren’t fans: neighbors of Tesla’s diner. (404 Media)

Quote of the day

“Patients across America should be very worried that their medical records are going to be used in ways that harm them and their families.”

—Lawrence Gostin, a Georgetown University law professor specializing in public health, warns of the potential repercussions of the Trump administration’s new health data tracking system, the Associated Press reports.

One more thing

The cost of building the perfect wave

For nearly as long as surfing has existed, surfers have been obsessed with the search for the perfect wave.

While this hunt has taken surfers from tropical coastlines to icebergs, these days that search may take place closer to home. That is, at least, the vision presented by developers and boosters in the growing industry of surf pools, spurred by advances in wave-­generating technology that have finally created artificial waves surfers actually want to ride.

But there’s a problem: some of these pools are in drought-ridden areas, and face fierce local opposition. At the core of these fights is a question that’s also at the heart of the sport: What is the cost of finding, or now creating, the perfect wave—and who will have to bear it? Read the full story.

—Eileen Guo

We can still have nice things

A place for comfort, fun and distraction to brighten up your day. (Got any ideas? Drop me a line or skeet ’em at me.)

+ Maybe airplane food isn’t so bad after all.
+ An unwitting metal detectorist uncovered some ancient armor in the Czech Republic that may have been worn during the Trojan war.
+ Talking of the siege of Troy, tickets for Christopher Nolan’s retelling of The Odyssey are already selling out a year before it’s released.
+ This fun website refreshes every few seconds with a new picture of someone pointing at your mouse pointer.

Charts: U.S. Manufacturing Trends Q3 2025

U.S. manufacturing activity showed modest improvement in June 2025, according to data released by the Institute for Supply Management (PDF).

The Institute’s Manufacturing Purchasing Managers Index (PMI) derives from monthly survey responses collected from purchasing and supply executives at more than 400 industrial firms. The overall PMI is a weighted composite of five seasonally adjusted indicators: new orders (30%), production (25%), employment (20%), supplier deliveries (15%), and inventories (10%).
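The weighted composite described above can be sketched in a few lines of Python. This is a minimal illustration, not the Institute's own calculation: the weights are the ones stated in the text, while the sub-index readings and the `composite_index` helper are hypothetical and exist only to show the arithmetic.

```python
# Weights as stated in the text above (they sum to 1.0).
WEIGHTS = {
    "new_orders": 0.30,
    "production": 0.25,
    "employment": 0.20,
    "supplier_deliveries": 0.15,
    "inventories": 0.10,
}


def composite_index(sub_indexes, weights=WEIGHTS):
    """Return the weighted average of the sub-index readings."""
    if set(sub_indexes) != set(weights):
        raise ValueError("sub-indexes and weights must cover the same components")
    total_weight = sum(weights.values())
    weighted_sum = sum(sub_indexes[name] * weights[name] for name in weights)
    return weighted_sum / total_weight


# Hypothetical readings, invented for illustration only (not survey data).
example = {
    "new_orders": 46.0,
    "production": 50.0,
    "employment": 45.0,
    "supplier_deliveries": 54.0,
    "inventories": 49.0,
}

pmi = composite_index(example)
print(round(pmi, 1))
```

With these made-up inputs the composite lands below 50 even though two components sit at or above it, because the heavier-weighted new orders and employment readings pull the average down.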

The June PMI rose to 49.0, up from 48.5 in May, surpassing forecasts.

Since 1968 the U.S. Federal Reserve Bank in Philadelphia has conducted a monthly survey targeting approximately 250 manufacturers in its district of Delaware, southern New Jersey, and eastern and central Pennsylvania. Respondents report on the business conditions and various aspects of activity at their facilities, such as employment, work hours, new and backlogged orders, shipments, inventory levels, and delivery times.

The July 2025 survey, conducted from the 7th to the 14th, solicited executives’ expectations for changes in operational and labor costs for the current year. Most anticipate rising costs in all expense categories throughout 2025.

The National Association of Manufacturers (NAM) is the largest such organization in the United States, representing small and large manufacturers in every sector and in all 50 states.

Trade uncertainties and rising costs are the leading concerns for manufacturers, according to NAM’s Q1 2025 “Manufacturers’ Outlook Survey” (PDF) of approximately 250 firms, released in March.

Moreover, manufacturers are increasingly focusing on digital transformation. NAM introduced a question in March to measure the importance manufacturers are giving to digitally transforming their operations. Over one-third of respondents (36.8%) plan to moderately prioritize digital transformation in the coming year.

5 Ecommerce Platforms for Startups 2025

Ecommerce entrepreneurs getting started in 2025 should look for easy-to-use platforms that won’t bust the budget.

The ecommerce industry is maturing. The first online shop was likely the Boston Computer Exchange, launched in 1982. Amazon and eBay turn 30 this year.

For the most part, the technology underpinning online selling, such as displaying products or accepting payments, is standard and reliable. The difference in ecommerce platforms is not core functionality.

Bootstrapped and focused: launching an ecommerce business from the living room.

Key Elements

Entrepreneurs with limited capital and no technical help require ecommerce tools that work out of the box, grow with the business, and stay out of the way.

Thus new, bootstrapped ecommerce merchants should seek six platform characteristics, as follows:

  • Easy to set up. It should be possible to launch in hours without hiring a designer or developer.
  • Easy to maintain. Changing prices, updating products, and adding pages should be small tasks.
  • Inexpensive or free. New entrepreneurs should invest in inventory and advertising. An ecommerce platform should be a marginal cost at worst.
  • Comprehensive. The platform should perform all ecommerce functions with the stability described above.
  • Able to scale. As it grows, a shop will need more features and throughput; the platform should be viable through a few million in annual sales.
  • Easy to market. New stores depend on advertising. This means having good landing pages, email capture forms, and discount features such as coupon codes. Organic traffic from search engines and generative AI platforms is beyond reach, so startups require features that turn paid clicks into customers.

5 Platforms

No list of ecommerce platforms makes everyone happy. It’s especially true for the makers of the many capable software solutions not mentioned below. But here are my five recommendations for new ecommerce stores.

Shopify

Shopify, the leading industry platform, is built for growth. The Basic plan is $39.99 per month or $359 annually, plus transaction fees. That investment delivers a fully fledged online store with unlimited products, loads of payment integrations, and access to the platform’s massive app ecosystem.

A Shopify store can reasonably be up and running in less than an hour, and the platform can scale with it to the enterprise level.

Some parts of the platform may be confusing for a novice, but Shopify works for most ecommerce businesses.

Square Online

Ecommerce businesses sometimes originate from in-person experiences with point-of-sale software. This might be a business that sells handmade jewelry at local art shows, has an Etsy shop, and wants its own ecommerce site. Why not pick a platform from a supplier you already work with?

The Square Online platform is low-cost to start (just transaction fees) and integrates directly with Square’s POS system. The service is built on the Weebly platform, which Square acquired in 2018, and is simple to use and understand.

Ecwid by Lightspeed

The term “ecommerce platform” evolved with the software it described. Years ago, the industry called these tools “shopping carts,” and many could bolt on to just about any website someone had built.

Ecwid is both a freestanding platform and a bolt-on to an existing site, shopping cart style. So, the gardening blog turned organic seed seller doesn’t require a new website; it needs only to add Ecwid. The same is true for the social influencer selling directly from a profile page.

The Ecwid add-on is free for five products and remains reasonably priced as a store scales.

Wix and Squarespace

Drag-and-drop editors make it easy for some of the least technical ecommerce operators to produce functional and attractive online stores.

My final two recommended ecommerce platforms — Wix and Squarespace — are head-to-head competitors known for remarkable ease of use and clean website designs.

These platforms appeal to startup founders who want to prioritize branding, design, and speed to market without hiring developers. Ecommerce functionality is built in, and templates come optimized for mobile and desktop, although neither platform is ideal for scaling.

Both cost less than $30 per month to start.

Success

Every new, bootstrapped ecommerce entrepreneur would love to win the Google lottery and have hundreds of eager shoppers flood in, but organic search is not a viable way to drive site traffic to a new store.

A new shop, with relatively thin content, cannot compete for transactional intent keyword phrases against long-established ecommerce sellers.

Regardless of the ecommerce platform, success will come from paid or at least active customer acquisition.