The Shared Mechanism

Why "Optimizing for LLMs" *probably* also improves your SEO

The to-dos aren't new, but for many, the insights are. That's all right. But let's look into what the relationship between 'optimizing for LLMs' and 'optimizing for search' really is.

Because there's a claim floating around the GEO space that goes something like this: "Optimizing your content for LLMs also helps your traditional SEO." It sounds convenient. It would be cool if true. And I believe it's mostly right, but for reasons that are more nuanced and less proven than the claim states.

This article breaks down what we actually know, what we can reasonably infer, and where the evidence runs out. And why I don't really care about that.

The claim #

When you make content more extractable for LLMs (clearer passages, resolved references, explicit entity relationships, self-contained statements) you also make it more discoverable and rankable in traditional Google Search.

I've been developing an LLM Utility Analysis framework over the past year, working with enterprise clients in the Netherlands and Belgium. The framework evaluates content across five dimensions: structural fitness, selection criteria, extractability, entity and propositional completeness, and natural language quality. Every recommendation I make through this framework improves the same qualities that Google has been rewarding since at least 2020.

But does that correlation mean causation?

What we actually know #

Better put: what I actually know. I haven't read everything, and I might (probably will) misunderstand things. So here's the caveat: this is what I have seen and what I have deduced. Please tell me if I'm being stupid; I'm just here to learn.

Google evaluates passages independently (since 2021) #

Google's Passage Ranking update, launched in February 2021, was the first major signal that page-level evaluation was giving way to passage-level evaluation. Google's SearchLiaison confirmed the scope: the change affects 7% of search queries worldwide.

Martin Splitt, Developer Advocate at Google, explained it clearly in a discussion with SEO professionals: "We try to help those who are not necessarily familiar with SEO or how to structure their content. Lots of people end up creating these long-winded pages that have a hard time ranking for anything because everything is so diluted."

The mechanism is straightforward. Google's algorithm can now score individual passages within a page and rank them independently for specific queries. A well-structured passage buried in a longer article can surface for a niche query, even if the page as a whole targets a broader topic.

This is exactly what LLM optimization demands. When I evaluate content through my extractability lens, I'm asking: "Does this paragraph function as a self-contained unit? Would it make sense if extracted without surrounding context?" Google has been asking the same question for five years.
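That question can be expressed as a tiny sketch. The similarity function below is a toy Jaccard word-overlap, a hypothetical stand-in for Google's actual (embedding-based) scorer; the only point it illustrates is the pattern of scoring each passage independently of its page.

```python
def tokens(text: str) -> set[str]:
    return set(text.lower().split())

def jaccard(a: set[str], b: set[str]) -> float:
    # Toy similarity: word-set overlap (stand-in for a real scorer).
    return len(a & b) / len(a | b) if a | b else 0.0

def best_passage(page_passages: list[str], query: str) -> tuple[str, float]:
    # Score every passage independently and surface the best match,
    # regardless of what the page as a whole targets.
    q = tokens(query)
    scored = [(p, jaccard(tokens(p), q)) for p in page_passages]
    return max(scored, key=lambda pair: pair[1])

page = [
    "Our company was founded in 1998 and values customer service.",
    "To reset a forgotten router password, hold the reset button for ten seconds.",
    "We also sell cables, adapters, and mounting kits.",
]
passage, score = best_passage(page, "how to reset a router password")
```

A self-contained passage buried mid-article can win a niche query this way even when the page's overall relevance is mediocre.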

Google uses vector embeddings at the passage level #

Mike King, founder of iPullRank and one of the more technically rigorous voices in SEO, has documented how Google's infrastructure now creates vector representations of queries, pages, passages, authors, entities, websites, and users. These embeddings underpin both traditional ranking and AI Overviews.

King's own empirical work demonstrates this concretely. In his analysis of chunking and passage structure, he split passages, re-measured vector similarity, and found improvements across all distance measures. Adding a heading to a paragraph improved cosine similarity by 17.54%. The implication: better passage structure doesn't just help humans parse content; it changes how the content embeds mathematically, which directly affects retrieval.

His broader point is worth quoting: "The main attribute that yields better content performance is better structure irrespective of why you do it." This applies to traditional ranking, AI Overviews, and third-party LLM citation simultaneously, because they share retrieval infrastructure.
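King's heading experiment can be miniaturized. The "embedding" below is a toy bag-of-words count vector, an assumption standing in for Google's real models; what carries over is only the mechanism: text in, vector out, cosine against the query vector, and prepending a heading moves the number.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words counts (not a real language model).
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

query = embed("migrate wordpress site to new host")
body = "Export the database, copy the files, then update DNS records."
heading = "How to migrate a WordPress site to a new host"

without_heading = cosine(query, embed(body))
with_heading = cosine(query, embed(heading + " " + body))
```

The body alone shares no vocabulary with the query and scores zero; adding the heading changes how the passage embeds, and the similarity jumps. That is the structural effect King measured, in miniature.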

The grounding budget is fixed and rank-dependent #

Dan Petrovic at DEJAN AI has produced some of the most concrete data on how Google selects content for its AI systems. His research on grounding chunks, based on analysis of 7,060 queries and 2,275 tokenized pages, reveals a fixed budget: approximately 2,000 words total per query, distributed across sources by relevance rank.

The rank-based allocation is steep. The #1 source gets a median of 531 words (28% of the budget). The #5 source gets 266 words (13%). And coverage drops as page size increases: pages under 5,000 characters get 66% coverage, while pages over 20,000 characters get only 12%.

Petrovic's conclusion ("density beats length") aligns perfectly with what LLM optimization requires. Pages under 1,000 words get 61% coverage (most of the page gets selected), while pages over 3,000 words get only 13% coverage. A tight, well-structured page with high propositional density outperforms a sprawling page that covers more ground but dilutes its extractable core.
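A fixed, rank-weighted budget is easy to picture in code. The 1/rank weighting below is an illustrative assumption, not DEJAN AI's actual formula; only the shape matters: one shared word budget, steeply skewed toward the top-ranked source.

```python
BUDGET_WORDS = 2000  # approximate per-query grounding budget (Petrovic)

def allocate(ranked_sources: list[str]) -> dict[str, int]:
    # Give each source a share proportional to 1/rank -- an assumed
    # weighting that mimics the steep drop-off Petrovic observed.
    weights = {s: 1 / (i + 1) for i, s in enumerate(ranked_sources)}
    total = sum(weights.values())
    return {s: round(BUDGET_WORDS * w / total) for s, w in weights.items()}

shares = allocate(["a.com", "b.com", "c.com", "d.com", "e.com"])
```

Whatever the true weighting, the consequence is the same: a long page competes for a small, fixed slice, so density determines how much of it actually gets grounded.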

AI Overviews mostly cite top-ranking pages, but not exclusively #

Ahrefs' study of 1.9 million citations from 1 million AI Overviews (July 2025) found that 76% of cited URLs also rank in the top 10 of traditional Google results. The primary cited source typically holds a median ranking of position 2. Approximately 10% of AI Overview citations come from pages ranking outside page one entirely.

This is a more nuanced picture than "AI ignores rankings", but the 10% matters. It suggests that passage-level quality operates as a partially independent selection criterion. A page ranking in position 15 can still be cited if its passages are more extractable and relevant for a specific sub-query. Google's "fan-out queries" mechanism (where the AI generates more specific sub-queries to find answers) explains how lower-ranking pages get selected for niche aspects of a broader topic.

Meanwhile, for third-party AI assistants, the overlap is dramatically lower. Ahrefs found that only 12% of URLs cited by ChatGPT, Gemini, and Copilot appear in Google's top 10 for the same prompt. These systems clearly operate on different selection logic, making passage-level quality even more important for cross-platform AI visibility.

Olaf Kopp's LLM Readability framework #

Olaf Kopp developed a framework around two concepts: LLM Readability and Chunk Relevance. His RAG process model breaks down how AI systems select content: Information Retrieval → Source Qualification (E-E-A-T filters) → Chunk Extraction → Context Provision → Generation.

The critical insight from his work is that these steps happen sequentially. Your page first needs to pass source qualification (traditional authority signals). Then your individual chunks compete on relevance and readability. Kopp's claim: even if your document isn't the most relevant in the source set overall, your chunks can still win citation if they're more relevant or better structured than competitors' chunks.

This isn't just about LLMs. The same chunk-level competition plays out in Google's featured snippets, passage ranking, and AI Overviews. The underlying evaluation mechanism is shared.
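Kopp's sequential model can be sketched as a two-stage filter. The function names, the authority threshold, and the relevance scorer below are all illustrative assumptions, not Kopp's code; the point is the order of operations: pages qualify first, then chunks compete individually.

```python
def relevance(chunk: str, query: str) -> float:
    # Toy chunk relevance: fraction of query words present in the chunk.
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q)

def select_chunks(sources, query, authority_floor=0.5, context_limit=3):
    # Steps 1-2: retrieval + source qualification (E-E-A-T style filter).
    qualified = [s for s in sources if s["authority"] >= authority_floor]
    # Step 3: chunk extraction -- every chunk from every qualified source
    # now competes on its own, regardless of its page's overall standing.
    scored = [(c, relevance(c, query)) for s in qualified for c in s["chunks"]]
    # Step 4: context provision -- only the top chunks reach the model.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [c for c, _ in scored[:context_limit]]

sources = [
    {"authority": 0.9, "chunks": ["Pricing starts at ten euros per month.",
                                  "Contact sales for enterprise plans."]},
    {"authority": 0.6, "chunks": ["Annual billing gives two months free."]},
    {"authority": 0.2, "chunks": ["Annual billing discount details here."]},
]
winners = select_chunks(sources, "annual billing discount")
```

Note what happens: the low-authority page never reaches chunk competition, and the mid-authority page's well-matched chunk beats the high-authority page's off-topic ones. That is Kopp's claim in executable form.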

Putting it all together: the mechanistic argument #

The argument for "LLM optimization helps SEO" isn't based on a single study proving the causal link. It's based on a shared mechanism.

Google's ranking and Google's AI systems use the same embedding infrastructure. Dense retrieval (which converts both queries and passages into vector representations and measures their similarity) underlies passage ranking, AI Overviews, featured snippets, and the Knowledge Graph. When you improve a passage's clarity, entity coverage, and self-containment, you change how it embeds. That changed embedding affects retrieval across all of Google's systems simultaneously.

The academic research supports this convergence. Dense X Retrieval (Chen et al., EMNLP 2024) demonstrates that propositions (atomic, self-contained statements averaging about 11 words) outperform full passages as retrieval units. Content that decomposes cleanly into propositions is more retrievable in any dense retrieval system, whether that system powers a traditional search ranking or an LLM citation.

HippoRAG (NeurIPS 2024) shows that content with explicit entity relationships enables graph-based retrieval, finding answers that span multiple documents through entity connections rather than surface-level text similarity. This matters for Google's Knowledge Graph as much as it matters for third-party AI systems.

What I cannot prove #

Here's where I need to be honest. I cannot point to a study that says: "We made content more LLM-friendly and measured a statistically significant improvement in Google rankings."

That experiment, to my knowledge, has not been conducted. So here is what I have deduced:

A shared mechanism: the same embedding infrastructure serves both traditional and AI-powered search. Improvements at the passage level should logically improve retrieval in both contexts.

Practitioner evidence: Mike King reports that passage-level optimization produces measurable improvements in cosine similarity scores, which correlate with better performance in AI Overviews. But correlation across AI systems is not the same as proven improvement in traditional rankings.

Convergent best practices: every recommendation that comes out of LLM optimization (clear passages, resolved references, explicit entity relationships, self-contained statements, high propositional density) is independently good SEO advice. But "independently good advice" is a weaker claim than "LLM optimization causes better rankings."

The near-duplicate inference: when I advise clients to differentiate template pages with unique, entity-rich content per page variant, I'm making a recommendation that serves both LLM extractability and Google's de-duplication systems. But the specific claim that "near-duplicate listing pages are penalized through passage-level evaluation" is my inference, not a documented finding.

The practical reality #

Despite the honest limits above, the practical case is strong. The recommendations that emerge from LLM optimization are never in conflict with SEO best practices. There is no trade-off. You never find yourself thinking: "This change would help LLMs but hurt my rankings." The changes are directionally identical:

- Self-contained passages help passage ranking, featured snippets, AI Overviews, and third-party LLM citation simultaneously.

- Explicit entity relationships improve Knowledge Graph alignment, semantic search accuracy, and graph-based RAG retrieval.

- High propositional density makes content more retrievable in dense retrieval systems regardless of whether those systems power traditional or AI-powered search.

- Resolved references (replacing "it," "this," "the above" with named entities) improve comprehension for every system that processes your content at the passage level.

The risk profile is asymmetric. The downside of optimizing for LLMs is approximately zero (because the changes are independently good). The upside is visibility across an expanding set of AI-powered discovery surfaces. Which is exactly why I'm using it!

Google's August 2025 spam update reinforced this convergence from the penalty side. The update specifically targeted template-based content at scale: location pages, category pages, and other programmatic content that relied on identical text across variants. The recommendation to make each page variant genuinely unique aligns with both spam avoidance and LLM extractability.

So my point is ... #

"LLM optimization helps SEO" is a reasonable inference from shared mechanisms, not a proven causal relationship. The underlying retrieval infrastructure is the same. The passage-level evaluation criteria overlap almost completely. The practical recommendations are identical.

I'd frame it this way: you're not optimizing for two different systems. You're optimizing for one evolving system that evaluates content at the passage level, measures semantic relevance through embeddings, and rewards density, clarity, and completeness. Whether that system renders a blue link, a featured snippet, an AI Overview, or a ChatGPT citation is just a surface difference. The content qualities that earn selection are the same.

The honest caveat: I can't prove the causal chain from LLM-optimized content to improved traditional rankings with the evidence I have come across. What I can say is that every LLM optimization I recommend is independently justified by traditional SEO research, and the shared mechanism makes the combined benefit highly plausible.

In my book, that's enough to act on.

Sources #

Besides my own intuition, I also looked at these sources. You know, to verify whether I'm stupid or not:

- Petrovic, D. (2025). "How big are Google's grounding chunks?" DEJAN AI. Analysis of 7,060 queries showing a ~2,000-word grounding budget per query.

- Kopp, O. (2026). "Ultimate guide for LLM readability optimization and better chunk relevance." Framework for LLM Readability and Chunk Relevance in RAG processes.

- King, M. (2026). "Moving from a Google-shaped Web to an Agent-shaped Web: A Refutation of Misinformation about Chunking." iPullRank, January 15, 2026. Empirical passage-level similarity analysis.

- King, M. (2025). "How AI Mode Works and How SEO Can Prepare for the Future of Search." iPullRank, August 2025. Vector embedding infrastructure across Google systems.

- Chen, S., et al. (2024). "Dense X Retrieval: What Retrieval Granularity Should We Use?" EMNLP 2024. Propositions as retrieval units outperform passages.

- Gutiérrez, B., et al. (2024). "HippoRAG: Neurobiologically Inspired Long-Term Memory for LLMs." NeurIPS 2024. Graph-based retrieval through entity relationships.

- Aggarwal, P., et al. (2024). "GEO: Generative Engine Optimization." KDD 2024, Barcelona. IIT Delhi / Princeton. Statistics Addition and Quotation Addition improving visibility by up to 41% on Position-Adjusted Word Count.

- Google SearchLiaison (2021). Passage Ranking launch announcement. February 11, 2021.

- Splitt, M. (2020). Discussion on Passage Ranking with Cindy Krum, Bartosz Goralewicz, and Tomek Rudzki. Published via Search Engine Journal, November 19, 2020.

- Google (2025). August 2025 Spam Update targeting scaled content abuse and template-based location pages.

- Ahrefs (2025). "76% of AI Overview Citations Pull From Top 10 Pages." Study of 1.9M citations from 1M AI Overviews, July 2025. Median #1 citation ranks position 2; ~10% of citations from outside page one.

- Ahrefs (2025). "Only 12% of AI Cited URLs Rank in Google's Top 10 for the Original Prompt." 15,000 prompts across ChatGPT, Gemini, Copilot, and Perplexity. September 2025.

Published: February 24, 2026 ~ 9 min.

Eikhart - Mad Scientist