Generative AI is reshaping how information is delivered, and with it comes a need to understand the data-driven side of Answer Engine Optimization. What experiments are researchers and marketers conducting to decode LLM behavior? What do the numbers say about SEO factors influencing AI results? And how do biases in AI models affect whether your brand gets a shout-out? In this article, we delve into emerging research and experiments (2024–2025) that shed light on these questions.
Do Traditional SEO Factors Influence AI Mentions?
One of the burning questions early on was: If I rank #1 on Google for my keyword, does that mean an LLM will mention me when asked about that topic? Several experiments have tackled this:
Seer Interactive’s Large-Scale Study (2024): Seer ran an experiment with 10,000 real user questions in the finance and SaaS sectors. They used GPT-4 to get answers for each question, extracted which brands were mentioned, and then looked at those brands’ SEO metrics. Key findings included:
- Google Rankings Matter: There was a strong positive correlation (~0.65) between a brand ranking on page 1 of Google and that brand being mentioned by the LLM (src). In other words, if you’re a top organic result for a query, the AI is much more likely to include you in its answer. This makes sense – GPT-4 with browsing or Bing Chat tend to see the same top results a human searcher would, and thus use that info.
- Bing Rankings Also Matter: A correlation (~0.5–0.6) was found with Bing ranking (src). So optimizing for Bing is indeed relevant to GEO, though Google’s overlap was slightly higher. Considering ChatGPT’s live search uses Bing and Bing Chat itself obviously uses Bing, this correlation is notable.
- Backlinks/Domain Authority Not a Direct Factor: They expected high Domain Authority sites to be mentioned more, but the impact of backlinks was weak or neutral (src). This suggests the AI isn’t directly considering a site’s backlink profile; it’s more about content relevance and presence. (However, backlinks indirectly help you rank, which as above, helps you get mentioned.)
- Content Type Variety: Surprisingly, having diverse content (images, videos, etc.) on your page didn’t significantly increase mentions (src). They thought maybe an AI would prefer content-rich pages, but it appears that the quality and relevance of the text content are what matter. That said, as AIs become more multimodal, this could change – but for now, don’t rely on an infographic alone to get you cited; ensure the text around it is solid.
They also found that when they filtered out “noise” sites (like forums or thin content sites) from their analysis, the correlations of rankings to mentions became even stronger. This reinforces a basic point: High-quality, solution-oriented sites get more love from LLMs than noisy UGC sites. For example, an AI would rather mention a SaaS company’s blog or a review site than a random forum thread, even if that thread ranks on Google.
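If you want to sanity-check these correlations in your own niche, the statistics are simple once the data is collected. Below is a minimal sketch, assuming you have already recorded, for each question and brand, whether the brand was mentioned by the LLM and whether it ranked on page 1; the file and column names are hypothetical.

```python
# Minimal sketch of a Seer-style correlation check on your own data.
# Assumes a CSV with one row per (question, brand) pair and 0/1 columns;
# the file and column names here are hypothetical.
import pandas as pd
from scipy.stats import pearsonr

df = pd.read_csv("brand_mentions.csv")  # columns: question, brand, llm_mentioned, google_page1, bing_page1

for factor in ["google_page1", "bing_page1"]:
    # With two binary variables, Pearson's r is the phi coefficient.
    r, p = pearsonr(df[factor], df["llm_mentioned"])
    print(f"{factor}: r={r:.2f}, p={p:.3f}, n={len(df)}")
```

The hard part is gathering the mention and ranking data, not the math; even a few hundred question/brand pairs can show whether the page-1 pattern holds in your vertical.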
SEO.com’s Findings on Overlap: Another data point comes from SEO.com, where they noted:
- When using ChatGPT’s Search mode, results were 73% similar to Bing’s results.
- Google’s AI Overviews had about 61% overlap with the top Google organic results (per a SERanking study they cited).
This quantifies the intuition that AIs are largely drawing from the same pool of info as search engines. The 27–39% difference might be where an AI adds in other context from its training data or picks something slightly lower-ranked that complements the answer.
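Measuring this overlap for your own queries is easy once you have both lists in hand (the AI answer’s cited URLs and the top organic URLs). A toy sketch, with placeholder URLs:

```python
# Rough sketch: what share of an AI answer's cited URLs also appear in the
# top organic results for the same query? Both lists are assumed to have been
# collected manually or with your rank-tracking tool of choice.
def overlap_pct(ai_citations: list[str], organic_top: list[str]) -> float:
    if not ai_citations:
        return 0.0
    organic = set(organic_top)
    return 100 * sum(url in organic for url in ai_citations) / len(ai_citations)

print(overlap_pct(
    ["example.com/guide", "wikipedia.org/wiki/Topic", "niche-blog.com/post"],
    ["example.com/guide", "wikipedia.org/wiki/Topic", "bigsite.com/article"],
))  # ~66.7 for this toy data
```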
For a marketer, these findings mean:
- If you’re already doing well in SEO, you’re ahead of the game for GEO – your focus should be on maintaining that and ensuring the AI can easily digest your content (as discussed in earlier articles).
- If you’re not on page 1 for a key topic, the uphill battle to get mentioned by an AI is even steeper. You might try alternative GEO tactics (like being present on other sites that are on page 1, as we saw in case studies).
Experiments on Prompt Phrasing and AI Biases
Understanding biases in AI answers is crucial – biases can favor popular brands, stem from skew in the model’s training data, and so on. Researchers and practitioners are running experiments to uncover these biases:
- Brand Preference Bias: Some informal tests by marketers involve asking an AI open-ended product questions versus more guided ones. For example, asking “What’s the best soda?” might get an answer listing Coca-Cola and Pepsi (major brands bias), whereas asking “List some craft soda brands” yields different names. This suggests that broad queries bias toward the most famous entities (likely due to training data frequency). Recognizing this, a strategy emerges: if you’re a smaller brand, you want to encourage users (or content) to frame queries in ways that include your unique attributes. In content, that means seeding those attributes: “X is a new soda brand popular in Texas” – so that someone asking “best soda in Texas” triggers X.
- Prompt Wording Experiments: SEO folks have tried adding phrases like “including lesser-known options” to a prompt. For instance, “What are the best project management tools, including lesser-known options?” Sometimes this forces the AI to mention more than just Asana/Trello/Jira. The takeaway is that AI outputs can be steered by prompt phrasing, which is out of our control when a user is querying, but it tells us the AI can include smaller brands if prompted. So the gap often isn’t that the AI doesn’t know you; it may simply be defaulting to the most common answer unless asked otherwise. Over time, if users frequently ask for alternatives, the AI’s default answers might broaden (especially if feedback is given like “you always mention the same few, what about X?”). A rough sketch of this kind of prompt test appears after this list.
- Political/Ideological Bias Studies: While not brand-focused, these are insightful. For example, a 2023 study found ChatGPT had a measurable left-leaning bias on political questions. Another found it gave lower ratings to resumes that included disability-related info (suggesting bias in how it judged “professionalism”). The researchers then tried mitigating instructions (basically telling the AI not to be biased), and outcomes improved somewhat. These studies highlight that AI can carry hidden biases from its training data. Translate that to brands: an AI might favor “established” brands over new ones, or English-language sources over non-English ones, simply because of training data composition. The mitigation approach is analogous to telling the AI to “consider newer companies” in a recommendation scenario; users generally won’t do that, but AI companies might internally implement such “debiasing” instructions in the future.
- Hallucination and Misinformation Checks: Experiments also examine when AIs make up info about brands. For example, asking “What are some disadvantages of [Brand]?” might lead the AI to fabricate something if it doesn’t have real info. Some companies tested this by asking about their own brand when they knew no negative news existed – yet the AI sometimes invented a “con” just to give a balanced answer. This reveals the risk of hallucination, which is an AI’s tendency to generate plausible-sounding but incorrect information. OpenAI and others are working on reducing this, but from a brand standpoint, it means you should monitor what AIs say about you (as covered in Tools and earlier tips). If false info comes up, you might need to create content to correct it (because the AI got that idea from somewhere, or from a vacuum it decided to fill).
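If you want to run the prompt-phrasing test described above yourself, here is a hedged sketch, assuming the OpenAI Python client; the model name and brand watchlist are illustrative, not recommendations.

```python
# Sketch of a prompt-phrasing A/B test: ask the same question with and without
# "including lesser-known options" and see which brands surface in each answer.
# Assumes the OpenAI Python client and OPENAI_API_KEY in the environment;
# the model name and brand list below are illustrative placeholders.
from openai import OpenAI

client = OpenAI()
BRANDS_TO_WATCH = ["Asana", "Trello", "Jira", "YourBrand"]  # hypothetical watchlist

def ask(question: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

base = "What are the best project management tools?"
variant = base.rstrip("?") + ", including lesser-known options?"
for prompt in (base, variant):
    answer = ask(prompt)
    mentioned = [b for b in BRANDS_TO_WATCH if b.lower() in answer.lower()]
    print(prompt, "->", mentioned)
```

Rerun on a schedule and logged to a file, the same loop doubles as a simple brand-monitoring check for the hallucination concern above: you can review the saved answers for invented “cons” or outdated facts.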
The Bias of Omission and How to Address It
“Bias” isn’t only about political or social issues – in GEO, a big bias to watch for is bias of omission. An AI might consistently omit a certain class of sources. For example:
- It might over-rely on US-based sources and omit international ones (language and regional bias).
- It might favor Wikipedia/Big sites for facts and omit company websites (source-type bias).
- It might list the same big brands in recommendations because those appear in training data most often (popularity bias).
From experiments and data:
- The Seer study’s next phases are looking into factors like PR and partnerships – essentially, does having media buzz or official tie-ins with AI companies affect mentions? They suspect it might; for instance, if OpenAI forms a partnership with, say, a financial data provider, ChatGPT might start citing info from that provider more.
- There’s anecdotal evidence that citation policies affect inclusion. Google’s SGE, for example, has been observed to avoid citing pure spam sites or thin affiliates. So if your site is borderline in quality, you may be algorithmically omitted for being low trust. On the flip side, Google’s AI sometimes omits citing the actual brand’s site even if it’s the subject (it might cite a third-party describing the brand). This could be to avoid self-promotional content.
Addressing bias of omission:
- Diverse Content Presence: Ensure your brand/content is present in multiple forms (text, structured data, etc.) and multiple places. If the AI doesn’t “see” you in one channel, it might in another.
- Feedback to AI Providers: Some brands have started reaching out to AI companies when a glaring omission or error occurs (especially if it’s harmful). OpenAI has a feedback system and contact for factual errors. It might seem far-fetched, but if an AI is saying something incorrect or consistently ignoring a clearly relevant answer (like your brand when you’re actually a market leader), flagging that to the AI company can sometimes lead to adjustments.
- User Education and Prompting: While you can’t change all user behavior, you can use your content to subtly encourage users to ask more specific questions. For example, if you often get missed in “best X” queries, publish content like “X for niche Y”, and users who care about that niche will start asking in that way. It’s indirect, but if it catches on, the AI answers will adapt.
Experiments in Improving AI Inclusion
On the flip side of biases, people are experimenting with ways to explicitly improve an AI’s representation of certain info:
- llms.txt Experiments: We mentioned earlier the experiment where providing an LLM with a detailed llms.txt file improved answer quality for that specific content. While that experiment focused on correctness, it also implies that if you give the AI a focused repository of your brand info, it’s more likely to include it (it can’t include what it doesn’t know). It’s as if you could partially fine-tune ChatGPT on your content without actually fine-tuning – just by letting it crawl a curated set. Several SEO professionals are trying this out on their sites to see if it yields better AI mentions over time. (A minimal sketch of such a file appears after this list.)
- Content Freshness Tests: Some have tested how quickly new content gets picked up by AI. For example, after Google I/O 2024, people published articles to see if Google’s SGE would reference them. In general, fresh content does get picked up by connected AI (like Bing/Google) fairly quickly, often within days if it ranks or is relevant. But it won’t influence models without live search until they’re retrained. Experiments on ChatGPT (pre-browsing) show it still giving 2021-era answers for new developments. The practical insight: for time-sensitive topics, focus on the platforms that do use live data (Bing, Google’s AI, etc.).
- Comparative Outcome Testing: Another experiment a few SEOs did was to compare AI answers before and after a content or SEO change. For instance, they noted that their site wasn’t being cited, then they added an FAQ or improved the title to match the question, and then checked a few weeks later. In some cases, they saw their site then appear as a citation in SGE. While hard to fully control, it indicates that standard SEO optimizations (like matching content to query intent better) do reflect in AI answers. The correlation is still there: if you move from ranking #5 to #1, your likelihood of mention goes up.
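For reference, here is an illustrative sketch of publishing a minimal llms.txt, loosely following the community llms.txt proposal (llmstxt.org); the brand, URLs, and section names are placeholders, not a required schema.

```python
# Illustrative sketch: write and publish a minimal llms.txt file.
# Content loosely follows the community llms.txt proposal; everything below
# (brand, URLs, section names) is a placeholder, not a required schema.
LLMS_TXT = """\
# ExampleBrand

> ExampleBrand makes project management software for small construction firms.

## Product docs

- [Feature overview](https://example.com/features): What the product does and who it's for
- [Pricing](https://example.com/pricing): Current plans and limits

## Company facts

- [About](https://example.com/about): Founding year, team size, headquarters
"""

with open("llms.txt", "w", encoding="utf-8") as f:
    f.write(LLMS_TXT)
# Serve the result at https://example.com/llms.txt so crawlers and agents can fetch it.
```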
Summarizing What the Data Tells Us
SEO fundamentals are validated: High search rankings and authoritative content significantly increase AI visibility. So, SEO isn’t obsolete – it’s a prerequisite.
But SEO isn’t the whole story: We see instances where something other than the #1 result is chosen by the AI because it adds value (maybe #3 had a better summarized list). So focusing on content quality and how well it directly answers the question can sometimes beat pure ranking.
Bias is real: AI can omit or favor certain info due to data biases. Brands need to be aware of how this might impact them (are you in a less-covered category? a smaller region? new on the scene?). Proactively counter biases by feeding the ecosystem with information about you.
Experimentation helps: The field is new, and even large-scale studies are just starting. There’s an opportunity for every brand team to do a bit of their own R&D: try things, measure, and share the results, or at least learn from them.
Adaptability: The algorithms behind AI answers are likely to evolve. For instance, if enough people notice a particular bias, the next model update might adjust for it. The data we gather is a snapshot of a moving target. That means continuous monitoring (as discussed in Tools) is key. What’s true today (say, multimodal content not affecting mentions much) could change if, say, Google starts prioritizing pages with embedded videos in its AI answers.
In the next part of our series, we will step back and look at the bigger picture: frameworks and philosophies for approaching GEO. We’ll synthesize these findings and others into a strategic way of thinking about optimization in the AI era, similar to how the SEO community developed frameworks over years.
For now, the data and experiments so far should give you confidence in one thing: GEO isn’t magic, it’s measurable. We can analyze it, understand factors, and adapt – just like we do with traditional SEO. As the saying goes in analytics, “What gets measured gets managed.” We’re finally starting to measure GEO, which means we can manage it.