The AI Version of Your Brand Was Built Without You.

What your shadow reputation in AI search is, what to do about it, and why it turns the SEO game upside down.

Let's start with a very recent case involving a well-known company in the SEO biz.

The case of Seer Interactive #

Seer Interactive is a Philadelphia-based digital agency founded in 2002. Their team retention rate is 79.2%. Their client retention rate is 92%. Their longest-standing client has been with them for 14 years. Eight former employees have gone on to hire Seer from their new organizations, which is a quality signal in itself. The agency ranks #16 on Newsweek's 2025 Global Most Loved Workplaces list and #23 on Ad Age's Best Places to Work 2026.

However ... in early 2026, they discovered that one in three branded AI queries surfaced a negative characterization of the company, one that traced back to a single complaint, posted in 2018, replicated across five review directories by the same person. The model didn't recognize the duplication. It read five sources making the same claim as five independent corroborations.

A single isolated complaint from eight years ago, copied across low-authority directories, was outweighing 24 years of verifiable client relationships in the model's representation of the brand.

Wil Reynolds, Seer's founder, described the mechanism on the Seer website: "When an LLM sees the same claim on multiple sites, it doesn't flag it as potentially duplicated. It reads it as corroboration."

This is the shadow reputation in action:

Shadow reputation is what language models believe about your brand, assembled from the full open web before you ever had the chance to intervene. It is not the brand you built. It is the brand others wrote, and the model learned.

~ Ramon Eijkemans

Most current AI search optimization advice treats brand visibility as a content problem. Publish more. Structure it better. Use the right headings. That advice is not wrong, but it is half the story, or maybe less.

Your shadow reputation doesn't live in the content you publish. It lives in what others have written about you. Your own assets cannot change it alone, not because they are worthless, but because the mechanism that builds parametric memory requires external, independent sources. How you write your own content determines how well it performs when retrieved. Whether it shifts the underlying representation depends entirely on whether others pick it up.

This article explains how the shadow reputation forms, why it resists correction, how to measure it, and what to do about it.

Why your own content isn't enough #

Public relations and SEO share a foundational assumption: produce the right signals, and they will be received as intended. A press hit today influences a journalist's perception today. A new page can affect your rankings within weeks. The signal you produce and the signal that is received are the same signal.

Language models break this symmetry. Their foundational knowledge (what researchers call parametric memory) is encoded at training time from a corpus that is overwhelmingly external. Not your site. The open web: everything others have ever written about you, your category, your competitors.

Your own assets enter the picture not as primary sources but as a verification layer. When a model encounters your content during real-time retrieval, it checks it against what it already believes from external sources. If your content is consistent with the external record, it reinforces. If it conflicts, the model either retains the external version or drops the claim.

There is a partial exception worth noting: well-structured, directly relevant own content can sometimes influence what the model retrieves and surfaces in a specific response, even when it conflicts with the parametric baseline. This is precisely why utility writing matters: it raises the chance that your content clears the quality threshold at which retrieval systems begin to prefer it [1]. But this effect is query-specific and temporary. It does not revise the underlying parametric layer.

The foundational asymmetry remains: in AI search, the open web outranks you on your own brand.

Shadow reputation turns SEO upside down #

SEO usually starts and ends on your own website. You optimize your pages, earn links that point back to your domain, build authority that flows toward your own content. Even link building - the most external activity in traditional SEO - is ultimately in service of your own site. The destination is always yours.

Shadow reputation runs in the opposite direction. It starts externally, in everything others have written about you across the open web, and ends inside a language model's parametric memory. Your own site is not the destination. It is not even the primary input. The external corpus builds the representation; your own content, at best, influences how that representation is retrieved and surfaced in a specific response.

This reversal has a practical implication that most SEO advice misses: optimizing your own content is necessary but not sufficient. The primary lever is not what you publish. It is what others publish about you.

How the shadow reputation forms #

The primary authors of your shadow reputation are everyone else who has ever written about you. A reviewer on G2. A journalist who covered your funding round three years ago. A Reddit thread where someone complained about your support response time. A competitor's comparison page that positions you as the budget option. An analyst report that categorized your product incorrectly.

All of this is training data, weighted by source authority and mention frequency, assembled into a parametric representation before you begin any optimization effort. As Andrea Volpini of WordLift put it in conversation with Jason Barnard [2]: "In 2017, schema markup might get you a rich snippet. Nice to have. In 2026, your entity representation determines whether AI systems recommend you at all."

Three things follow from this:

  1. Volume without coherence doesn't help. A brand mentioned 100,000 times across inconsistent, low-quality sources can end up with a worse parametric representation than a brand mentioned 10,000 times across coherent, high-authority ones. The model encodes not just the mentions but the semantic context around them. Contradictory positioning across sources produces an incoherent internal representation.

  2. Sparse external data leads to interpolation, not silence. When the external corpus about a brand is thin or contradictory, the model doesn't admit ignorance. It fills the gap by drawing on patterns from similar brands, inferring attributes and positioning from what it knows about comparable entities. Myriam Jessier has documented this as brand drift: when an entity lacks sufficient structured representation, the model fills in the gaps probabilistically, assembling a plausible but potentially inaccurate picture from related patterns in the training data. The result is a confident description of your brand that the model didn't find. It inferred.

  3. The shadow reputation isn't static. It shifts as the external corpus shifts, slowly, through model retraining cycles, not through your next LinkedIn post. This makes it a long-cycle problem requiring a long-cycle strategy.

Why correction is harder than it looks #

Even when a model uses real-time retrieval (RAG) to supplement its parametric knowledge, shadow reputations resist correction.

When retrieved content contradicts what the model already believes from training, the model faces a conflict between two knowledge sources.

Research on this conflict (Xie, 2024) shows a nuanced picture: when external evidence is the only evidence, models are often highly receptive to it, even when it conflicts with their parametric baseline. But most brands are not in that situation. The model already has a prior, assembled from thousands of external sources, and when that prior is present alongside your own content, the dynamic shifts. Xie (2024) documents a strong confirmation bias: when a model encounters both supporting and conflicting evidence simultaneously, it tends to cling to the version that is consistent with what it already believes.

This is the real reason correction is hard. Not that your own content has no weight, but that the model treats mixed evidence asymmetrically. A coherent, well-structured correction that arrives with no competing signal can work. The same correction arriving into a context where the model already has a different prior faces the confirmation bias problem. Your own website is a single source. The external corpus that built the prior is thousands of independent sources. Under mixed evidence conditions, the model resolves toward the prior.

The consequence is direct: you cannot correct your shadow reputation primarily from the inside. The parametric layer is built externally and requires external correction. Well-structured own content can win RAG interactions decisively, as the Seer case shows. But winning a RAG interaction is not the same as shifting the parametric layer. For that, the content needs to be picked up by external sources. Publication is the starting point, not the intervention itself.

Jason Barnard describes the intervention logic: the model needs a "Digital Brand Echo": a cumulative presence across external sources that consistently associates your brand with the right attributes. Without that echo, the model either interpolates incorrectly or represents you weakly.

Measuring your shadow reputation: the canary query framework #

Because the shadow reputation lives in parametric memory, it can be probed directly, without waiting for citation data or traffic signals. The approach draws on a well-established research finding: language models contain relational knowledge that is directly recoverable via natural language prompts [3].

The canary query framework applies this to brand entities specifically. It is not a one-time audit. It is a repeatable diagnostic that establishes a baseline and tracks shifts over time.

Why am I talking about "canaries" here? #

The name comes from the canary in the coal mine: a small bird that miners carried underground as an early warning system for toxic gases. If the canary stopped singing, miners knew danger was present before they could detect it themselves, before they could smell it, feel it, or measure it any other way.

The metaphor fits for a specific reason. Your shadow reputation is already forming, already being read by buyers, already shaping decisions, before you have any conventional signal that something is wrong. No ranking drop. No traffic loss. No customer complaint. The model has simply been describing you incorrectly to people who never told you they asked.

A canary query is a carefully chosen prompt designed to reveal what a language model actually believes about your brand, before that belief surfaces in a high-stakes response to a real buyer. It does not produce a useful answer for the person asking. It produces a reading of the model's internal state. The canary does not fix the air. It tells you whether the air is safe.

Map the purchase context #

Identify the queries a buyer asks during active evaluation. Not awareness queries ("what is [category]") but decision queries:

  • "Best [category] for [use case]"

  • "Compare [your brand] vs [competitor]"

  • "Is [your brand] good for [specific context]"

  • "[Your brand] reviews"

  • "Alternatives to [your brand]"

These are the queries where shadow reputation directly determines commercial outcomes.
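If you want this step to be repeatable rather than ad hoc, it helps to generate the query set from templates. Below is a minimal Python sketch; the brand, competitor, category, and use-case values are placeholders you would swap for your own.

```python
# Minimal sketch: expand decision-stage query templates into a reusable
# canary set. Brand, competitor, category, and use-case values below are
# placeholders, not recommendations.

DECISION_TEMPLATES = [
    "Best {category} for {use_case}",
    "Compare {brand} vs {competitor}",
    "Is {brand} good for {use_case}",
    "{brand} reviews",
    "Alternatives to {brand}",
]

def build_canary_queries(brand, competitors, categories, use_cases):
    """Expand the templates into a flat, de-duplicated list of prompts."""
    queries = set()
    for template in DECISION_TEMPLATES:
        for competitor in competitors:
            for category in categories:
                for use_case in use_cases:
                    # str.format ignores keyword arguments a template does not
                    # reference, so every template can share the same call.
                    queries.add(template.format(
                        brand=brand, competitor=competitor,
                        category=category, use_case=use_case))
    return sorted(queries)

if __name__ == "__main__":
    for q in build_canary_queries(
        brand="ExampleCo",                              # placeholder
        competitors=["RivalSoft"],                      # placeholder
        categories=["marketing analytics platform"],    # placeholder
        use_cases=["mid-market B2B teams"],             # placeholder
    ):
        print(q)
```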

Run canary queries across multiple models #

Test the same prompts in ChatGPT, Claude, Perplexity, and Gemini. Do not test in a single model. Different models encode brand representations differently based on their training corpora, and significant divergence between models is itself a diagnostic signal (weak or incoherent parametric representation tends to produce inconsistent results across models).

Practical note on reliability: Because prompt formulation affects what models surface, test multiple phrasings of the same question and look for patterns across runs rather than drawing conclusions from a single output [4].
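To make the multi-model, multi-phrasing, multi-run discipline concrete, here is a rough sketch of the run loop. The `ask_model` function is a stand-in, not a real client; wire it up to whichever assistants you test, or replace the loop with a manual copy-paste workflow.

```python
# Sketch of the cross-model, multi-phrasing run. ask_model() is a stand-in
# for whatever client code (or manual workflow) you use to query each
# assistant; nothing here calls a real API.

from itertools import product

MODELS = ["ChatGPT", "Claude", "Perplexity", "Gemini"]

def ask_model(model: str, prompt: str) -> str:
    """Placeholder. Replace with a real call to the assistant in question."""
    return f"[{model} response to: {prompt}]"

def run_canaries(queries: list[str],
                 phrasings: dict[str, list[str]],
                 runs: int = 3) -> list[dict]:
    """Run every phrasing of every query against every model several times.

    Multiple runs and phrasings matter because single outputs are noisy;
    you are looking for patterns, not one-off answers.
    """
    results = []
    for query, model in product(queries, MODELS):
        for phrasing in phrasings.get(query, [query]):
            for run in range(runs):
                results.append({
                    "query": query,
                    "phrasing": phrasing,
                    "model": model,
                    "run": run,
                    "response": ask_model(model, phrasing),
                })
    return results
```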

Extract entity associations #

For each response, capture not just whether you appear, but:

| What to capture | Why it matters |
| --- | --- |
| What category are you placed in? | Category placement determines which comparison sets you appear in |
| What attributes are used to describe you? | This is your effective positioning in the model |
| Which competitors appear alongside you? | Reveals the competitive frame the model has internalized |
| What objections or limitations are mentioned? | The model's "concerns" about your brand |
| What buyer type or use case are you associated with? | May differ significantly from your intended ICP |

This is the difference between brand-level visibility (do you appear?) and entity-level representation (what do you mean?).
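One practical way to keep the capture consistent across models and over time is to log every response against the same fields as the table above. A sketch follows; the field names and the example values are illustrative, not a standard.

```python
# A consistent record per response, mirroring the capture table above.
# Field names and example values are illustrative placeholders.

from dataclasses import dataclass, field, asdict

@dataclass
class EntityReading:
    model: str                     # which assistant produced the response
    query: str                     # the canary query that was asked
    appears: bool                  # brand-level visibility: mentioned at all?
    category: str = ""             # category the model placed the brand in
    attributes: list = field(default_factory=list)    # effective positioning
    competitors: list = field(default_factory=list)   # internalized competitive frame
    objections: list = field(default_factory=list)    # limitations the model mentions
    buyer_context: str = ""        # buyer type / use case associated with the brand

# Hypothetical example of one captured reading:
reading = EntityReading(
    model="ChatGPT",
    query="Best marketing analytics platform for mid-market B2B teams",
    appears=True,
    category="marketing analytics",
    attributes=["budget option", "easy setup"],
    competitors=["RivalSoft"],
    objections=["limited enterprise features"],
    buyer_context="small marketing teams",
)
print(asdict(reading))
```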

Run bidirectional entity probes #

This step uses a diagnostic method developed by Dan Petrovic of DEJAN: the brand-to-entity (B→E) and entity-to-brand (E→B) probe structure [5].

B→E prompts: what the model retrieves when given your name:

List ten things you associate with [brand name].

Describe [brand name] to someone who has never heard of it.

What is [brand name] known for?

E→B prompts: whether your brand surfaces when the relevant concept is queried:

List ten brands you associate with [your key attribute or use case].

Which companies are the best choice for [problem you solve]?

Name the leading providers of [your category] for [your target segment].

A brand with a strong shadow reputation performs well in both directions. Most brands are strong in one and absent in the other. The more commercially significant failure is E→B: the model "knows" your brand exists but does not retrieve it when buyers are actively looking for what you offer.
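For completeness, here are the two probe directions as simple prompt builders, using the templates from this section. The function names are just for illustration.

```python
# The two probe directions as prompt builders, using the templates above.
# Function names are illustrative.

def brand_to_entity_probes(brand: str) -> list[str]:
    """B→E: what the model retrieves when given the brand name."""
    return [
        f"List ten things you associate with {brand}.",
        f"Describe {brand} to someone who has never heard of it.",
        f"What is {brand} known for?",
    ]

def entity_to_brand_probes(attribute: str, problem: str,
                           category: str, segment: str) -> list[str]:
    """E→B: whether the brand surfaces when the relevant concept is queried."""
    return [
        f"List ten brands you associate with {attribute}.",
        f"Which companies are the best choice for {problem}?",
        f"Name the leading providers of {category} for {segment}.",
    ]
```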

Scoring your shadow reputation #

After running canary queries, score what you find across four dimensions. This is not a precise metric but a structured way to identify where to focus.

| Dimension | Strong | Weak |
| --- | --- | --- |
| Category placement | Consistently placed in the correct category | Placed in the wrong category, or inconsistently across models |
| Attribute accuracy | Attributes match your actual positioning | Attributes are wrong, outdated, or from a competitor's frame |
| E→B retrieval | Brand surfaces for category queries without prompting | Brand absent from category queries unless named directly |
| Cross-model consistency | Similar representation across ChatGPT, Claude, Perplexity, Gemini | Significantly different or contradictory across models |

A brand scoring strong on all four has a healthy shadow reputation. A brand scoring weak on E→B retrieval and cross-model consistency is a candidate for active intervention. A brand scoring weak on attribute accuracy may have a specific external narrative to correct.
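If you want the scoring to live somewhere other than a spreadsheet, a rough readout like the one below works: one strong/weak call per dimension, mapped to the intervention priorities discussed in the next section. The mapping is a reading of the framework, not a formal rubric.

```python
# Rough scoring readout: one strong/weak call per dimension, mapped to the
# intervention priorities discussed in the next section. Not a formal rubric.

from dataclasses import dataclass

@dataclass
class ShadowReputationScore:
    category_placement: bool       # consistently placed in the correct category
    attribute_accuracy: bool       # attributes match actual positioning
    e_to_b_retrieval: bool         # surfaces for category queries unprompted
    cross_model_consistency: bool  # similar representation across models

    def priorities(self) -> list[str]:
        gaps = []
        if not self.category_placement or not self.attribute_accuracy:
            gaps.append("earned media / third-party content that corrects the external narrative")
        if not self.e_to_b_retrieval:
            gaps.append("external content connecting the brand to target use cases and problems")
        if not self.cross_model_consistency:
            gaps.append("broader association density: the representation is thin or ambiguous")
        return gaps or ["healthy shadow reputation: keep tracking"]

print(ShadowReputationScore(True, False, False, True).priorities())
```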

What to do about it: intervention priorities #

The shadow reputation is a long-cycle problem. The interventions that matter work through training data, not through your next content calendar. Here is how to prioritize.

Now: Establish the baseline #

Run the canary query framework before doing anything else. You cannot measure improvement without a baseline, and you cannot prioritize interventions without knowing what is actually wrong. A brand that is misplaced in its category needs a different response than a brand that is correctly categorized but absent from E→B queries.

Ongoing: Build external association density #

The parametric representation is built from the external corpus. Correcting it requires changing the external corpus. That means earned media, third-party coverage, analyst mentions, community references: content where others describe your brand in the right categories and with the right attributes, across sources the model treats as independent.

The Seer case is instructive. Publishing their 79.2% retention rate on their own site worked fast: Perplexity cited it the same day, and the "high turnover" claim dropped from AI responses. But it didn't hold. When Perplexity changed how it sourced content, the citations dried up and the misconception returned. Seer's own read is that a single blog post is whack-a-mole. So they did the right thing: built a permanent, updatable page as a single source of truth, and got recent reviews onto the external sites (Clutch, AgencySpotter) that the model was already treating as independent sources. That second step especially is the correct one.

The logic behind that second step is worth spelling out. The model surfaced the negative review precisely because it was on Clutch and AgencySpotter, the same sites it already treated as authoritative sources for agency evaluation. Those sites are therefore not the problem. They are the correction mechanism. New reviews on Clutch don't just add positive data points; they update the model's primary source for this brand in this category. The negative claim won not because it was true, but because it was on the right platforms. Seeding the correction onto those same platforms is the right move, and it is what Seer is now doing.

The broader principle for any brand: if the positive facts only exist on your own domain, they are a single-source claim. The same facts on a review platform, in a trade article, or in an award citation become corroboration. Utility writing applies here too: structured as extractable facts, not buried in prose.

Olaf Kopp's Brand Context Optimization framework [6] describes the broader principle as co-occurrence optimization: your brand name needs to consistently appear alongside the right attributes across sources the model treats as independent. This is not PR as traffic driver. It is PR as training data investment.

Prioritize by gap: if the category placement is wrong, focus on earned media and third-party content that correctly categorizes you. If the E→B retrieval is weak, focus on content that connects your brand to the specific use cases and problems you want to own.

Also ongoing: Make your own content retrieval-ready #

While own content cannot change the parametric layer, it can influence specific RAG interactions, particularly when it is well-structured and directly addresses decision queries. The utility-writing framework describes the specific techniques: self-contained statements, explicit entity relationships, content that survives extraction from context. Just don't forget that humans read your content too ;)

The connection to shadow reputation: well-structured own content raises the chance that your version of your brand's story gets retrieved and surfaced in specific queries, even when the parametric layer is not yet aligned. It does not fix the underlying problem, but it reduces the gap between the shadow reputation and what the model surfaces in real-time responses.

Long-cycle: Track and iterate #

Rerun the canary query framework regularly, for example monthly. The parametric layer updates through model retraining cycles, not continuously. Changes to the external corpus take months to influence parametric representations. Monthly or quarterly tracking should give you enough resolution to detect meaningful movement without the noise of weekly variation.

Track three things: category placement consistency, E→B retrieval rate for your target queries, and cross-model consistency. These are the leading indicators that the external corpus is shifting in the right direction.
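A lightweight way to make that tracking stick is a simple monthly log of the three indicators. A sketch, with each metric deliberately simplified to "share of runs that came back correct":

```python
# Minimal monthly log for the three leading indicators. Each metric is
# simplified to a 0-1 fraction of runs that came back correct.

import datetime
import json

def log_snapshot(category_consistency: float,
                 e_to_b_rate: float,
                 cross_model_consistency: float,
                 path: str = "shadow_reputation_log.jsonl") -> None:
    """Append one month's readings to a JSONL file."""
    entry = {
        "date": datetime.date.today().isoformat(),
        "category_consistency": category_consistency,   # correct category / total runs
        "e_to_b_rate": e_to_b_rate,                      # brand surfaced / total E→B probes
        "cross_model_consistency": cross_model_consistency,
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")

def trend(path: str = "shadow_reputation_log.jsonl") -> list[dict]:
    """Return logged snapshots in order, to eyeball the direction of travel."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f]
```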

The measurement trap to avoid #

Most brands that have begun tracking AI visibility are measuring citation frequency: how often their name appears in AI-generated responses.

This is the wrong metric.

Citation frequency tells you whether you are present (kinda). It does not tell you whether your presence is accurate, whether you are placed in the right category, whether you surface for the queries that matter commercially, or whether what the model says about you would make a buyer more or less likely to engage.

The right question is not "how often are we cited?" but "when buyers ask the questions they ask during active evaluation, does the model describe us accurately, position us favorably, and place us in the right competitive context?"

That question requires qualitative analysis of model outputs. It requires the canary query framework. And it requires distinguishing between the B→E direction (does the model know what we do?) and the E→B direction (does the model retrieve us when our strengths are the query?).

A summary #

For your convenience

  • Every brand has a shadow reputation inside language models: an internal representation assembled primarily from the external open web, from sources you do not control.

  • Shadow reputation turns SEO upside down: traditional SEO ends on your own site. Shadow reputation starts externally, in the open web, and ends inside a model's parametric memory. Your own site is not the destination. It is not even the primary input.

  • Your own content is not powerless. Well-structured, utility-written content can win RAG interactions decisively. But winning a RAG interaction and shifting the parametric layer are different things. The parametric layer changes when external sources pick up your content and corroborate it independently.

  • When external data is sparse or incoherent, models interpolate from similar brands, producing confident but inaccurate descriptions (brand drift).

  • The canary query framework is the minimum viable diagnostic. Like the canary in the coal mine, it detects a problem before you have any conventional signal: no ranking drop, no traffic loss, no customer complaint. Establish a baseline before any other intervention.

  • Interventions should be prioritized by gap: wrong category placement, weak E→B retrieval, and cross-model inconsistency each point to different interventions.

  • The primary lever is the external corpus: earned media and third-party content that consistently associates your brand with the right categories and attributes. This is a training data investment, not a campaign.

References #

  1. Research on knowledge conflicts in LLMs (Xu et al., 2024; Xie et al., 2024) shows that when retrieved content is coherent, direct, and highly relevant, models can and do follow it over parametric priors. The condition is quality. Utility-writing techniques (Eikhart, 2026) are designed to produce content that meets this standard. The effect is query-specific and does not persist into parametric memory.
  2. Volpini, A. (2026). Enterprise Schema Architecture. WordLift / Kalicube. Barnard, J. (2026). The Three Graphs Framework. Kalicube. kalicube.com.
  3. Petroni et al. (2019) established that pre-trained language models contain relational knowledge recoverable via natural language prompts without fine-tuning. Jiang et al. (2020) showed that prompt formulation significantly affects recall rates, which is why the canary query framework recommends testing multiple phrasings. The frequency-representation relationship documented by Kandpal et al. (2023) and Mallen et al. (2023) is based on factual QA tasks; its direct applicability to brand association representation is an extrapolation, not an established finding.
  4. Xu et al. (2024) document "intra-memory conflict": the same entity can be represented inconsistently across differently phrased queries. Significant divergence between models and across prompt formulations is therefore itself a diagnostic signal: it indicates a weaker or more ambiguous parametric representation.
  5. Petrovic, D. (2025). Beyond Rank Tracking. SEO Week 2025 / DEJAN.
  6. Kopp, O. (2026). Guide to Brand Context Optimization for GEO. kopp-online-marketing.com.
Published: March 25, 2026 ~ 15 min.
