How AI decides which sources to use for its answers
Oct 3, 2025
Generative AI is quickly becoming the new search engine. Customers no longer need to click through multiple websites to find what they want - they can simply ask ChatGPT, Perplexity, or Gemini and get an instant answer. For marketers, this shift raises a critical question: if AI is now deciding which brands to surface, how does it choose the sources behind its answers?
Understanding this is the foundation of Generative Engine Optimisation (GEO) - the emerging practice of making sure your brand shows up when people ask AIs about your category.
General vs. topic-specific sources
When we talk about sources, it helps to separate them into two categories:
General domains: sites that cover a wide range of topics, such as Wikipedia, Reddit, or LinkedIn.
Topic-specific domains: niche websites that focus on one subject area - industry publications, expert blogs, review sites, or association pages.
Both types matter, but they play very different roles in shaping what AI says.
How AI models select sources
AI-generated answers are shaped by two layers:
Training data – the massive corpus of information a model learns from before it’s released. This includes both general and niche content.
Retrieval or browsing – most models (ChatGPT, Gemini, Perplexity, Claude) can pull live web data. But browsing isn’t always switched on, and even when it is, the model may decide not to use it unless the query demands fresh or specific information.
What matters is how models combine these layers. Sometimes they answer entirely from training data; other times they fetch live sources. In both cases, they give more weight to domains considered trustworthy, authoritative, and relevant.
Different models strike different balances. Some lean a little more on general domains for context and common knowledge. Others, like Gemini, are stricter and draw almost entirely on topic-specific sites. Unlike SEO, this isn’t about backlinks or keyword tricks - it’s about being present in the datasets and domains AIs actually consider credible when they generate or ground answers.
The data: general vs. topic-specific sources
An analysis conducted with GetMentioned, based on nearly 1,000,000 prompts across all major models, shows how the three leading models compare:
ChatGPT: ~8% general vs. ~92% topic-specific
Perplexity: ~8% general vs. ~92% topic-specific
Gemini: ~1% general vs. ~99% topic-specific

The takeaway is clear: all three models lean strongly toward niche, authoritative sources. Gemini is the strictest, pulling almost entirely from topic-specific sites, while ChatGPT and Perplexity allow slightly more space for general platforms.
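To make the split concrete, here is a minimal sketch of how a general vs. topic-specific share could be computed from a list of cited URLs. It is not the methodology behind the figures above: the `GENERAL_DOMAINS` set (built from the examples in this article), the helper names, and the sample URLs are all assumptions made for illustration.

```python
from urllib.parse import urlparse

# Hypothetical list of "general" domains, taken from the examples in this article.
GENERAL_DOMAINS = {"wikipedia.org", "reddit.com", "linkedin.com"}

def base_domain(url: str) -> str:
    """Return the last two labels of the hostname (naive: ignores suffixes like co.uk)."""
    host = urlparse(url).hostname or ""
    return ".".join(host.lower().split(".")[-2:])

def source_split(urls):
    """Return (general_share, topic_specific_share) as fractions of all cited URLs."""
    if not urls:
        return 0.0, 0.0
    general = sum(1 for u in urls if base_domain(u) in GENERAL_DOMAINS)
    share = general / len(urls)
    return share, 1 - share

# Made-up citations, for illustration only.
cited = [
    "https://en.wikipedia.org/wiki/Espresso",
    "https://www.example-coffee-review.com/best-grinders",
    "https://example-barista-blog.net/guide",
    "https://example-trade-association.org/members",
]
general, topic = source_split(cited)
print(f"general: {general:.0%}, topic-specific: {topic:.0%}")
```

In this sketch anything not on the general list counts as topic-specific, which is a simplification; a real audit would need a curated list of niche domains for your category.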
Why this matters for brands
If your brand is only visible on general domains - say through a Wikipedia mention, LinkedIn content, or Reddit discussions - your chances of appearing in AI answers are limited. These platforms help provide context and credibility, but they represent only a small share of what the models use.
Where AI really “looks” is in topic-specific sources: industry media, specialist blogs, product review sites, and association websites. This is where credibility is built, and it’s what most influences whether your brand shows up in a response.
What marketers should do
Audit your presence: Check whether your brand is being mentioned in both general and niche sources.
Prioritize niche visibility: Secure mentions in trusted industry publications, expert blogs, and category-specific review sites.
Leverage general sources strategically: Don’t ignore platforms like Wikipedia or LinkedIn - they still provide context, validation, and exposure that AI occasionally taps into.
Track model differences: Since ChatGPT, Perplexity, and Gemini weigh sources differently, monitor how each represents your brand and adapt your strategy accordingly.
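Tracking model differences can start very simply. The sketch below assumes you have already collected answers from each model for a set of category prompts (how you collect them is up to you); it only counts how often a brand name appears. The function name, the brand names, and the sample answers are hypothetical.

```python
import re

def mention_rate(answers_by_model, brand):
    """Share of answers per model that mention the brand (case-insensitive, whole word)."""
    pattern = re.compile(rf"\b{re.escape(brand)}\b", re.IGNORECASE)
    rates = {}
    for model, answers in answers_by_model.items():
        hits = sum(1 for text in answers if pattern.search(text))
        rates[model] = hits / len(answers) if answers else 0.0
    return rates

# Made-up answers, for illustration only.
answers = {
    "ChatGPT": ["Popular options include Acme and Globex.", "Globex is a common pick."],
    "Perplexity": ["Reviewers often recommend Acme."],
    "Gemini": ["Initech and Globex lead this category."],
}
rates = mention_rate(answers, "Acme")
for model, rate in rates.items():
    print(f"{model}: {rate:.0%}")
```

Simple string matching like this misses misspellings and product nicknames, so treat the numbers as a rough signal rather than a precise measurement.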
Conclusion
AI clearly prefers topic-specific sources - they are the cornerstone of GEO and the key to being consistently included in AI answers. But general domains shouldn’t be written off. They add important credibility, provide a layer of context, and can still influence how a model frames your brand.
The winning strategy is balance: dominate in niche, authoritative sites while maintaining a presence in general domains. That way, whether a customer asks ChatGPT, Perplexity, or Gemini about your category, your brand stands the best chance of being mentioned.