Just this week we’ve seen Google’s new AI Mode rolled out to some users in the EU, offering a direct LLM response to queries in place of a conventional search engine results page (SERP). It’s yet another step towards an AI-first internet in which businesses must focus on ranking within LLMs alongside older search engines.
The issue (for now) is that it is hard to measure exactly how effectively content performs across these platforms, especially as the internal workings of most LLMs remain tightly guarded secrets. To help you squeeze more insights from less data, I’ve pulled together some of Kooba’s best advice on LLM measurement below:
The challenges of AI measurement
Invisible brand interactions
To begin with, we should understand that we are working with incomplete sets of data. In a previous blog on this topic we noted that “absence of evidence is not evidence of absence”. In other words, users will interact with your brand on AI platforms without you knowing, and you’ll have to infer the nature of these interactions from other sources. Because we lack native AI analytics tools, we have to wait for AI traffic to arrive on our website, and then look at the source of this traffic.
This means that a successful AI content strategy may not show immediate results, especially if it improves impressions without increasing the volume of clicks. After all, many users will find useful and valuable content from your brand without ever clicking through to your site. As such, we can only measure the information available to us, even if we know it is incomplete.
The unknown quality of query responses
Likewise, we still have no gauge on the quality of our LLM mentions. Consumers often use these tools to compare different vendors, so ranking highly within ChatGPT could be driven by negative comparisons (which are probably worse than no mentions at all). Whilst some AI “tone measurement” services exist, they seem to rely on asking LLMs to disclose their own rankings, a method that cannot consistently provide accurate answers.
The only way to gain a real understanding of how your brand is discussed within LLMs is to run anonymised queries that mirror those of your target audience. This needs to be conducted at scale to reduce the statistical noise caused by model temperature (the randomness that makes answers vary from one run to the next).
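As a rough illustration, the aggregation step might look like the Python sketch below. The responses and the `mention_rate` helper are hypothetical; in practice, the responses would come from many repeated calls to an LLM API using your audience’s real queries.

```python
def mention_rate(responses, brand):
    """Fraction of responses that mention a brand at least once (case-insensitive)."""
    if not responses:
        return 0.0
    hits = sum(1 for text in responses if brand.lower() in text.lower())
    return hits / len(responses)

# Hypothetical responses gathered by running the same query many times;
# repeated runs smooth out the randomness introduced by model temperature.
responses = [
    "For web design in Dublin, agencies like Kooba and Acme Digital stand out.",
    "Popular options include Acme Digital and BrightPixel.",
    "Kooba is often recommended for B2B web design.",
    "You could consider BrightPixel or Kooba for this project.",
]

for brand in ["Kooba", "Acme Digital", "BrightPixel"]:
    print(f"{brand}: mentioned in {mention_rate(responses, brand):.0%} of responses")
```

With enough runs, the mention rate per brand becomes a reasonably stable signal, even though any single answer is noisy.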
The diversity of LLM analytics
Unlike conventional SEO, which is made easier by the dominance of one search engine (Google), AI optimisation requires looking at your brand’s performance within many different competing models. Some of these are challenging to measure, whereas others are relatively generous in the information they provide us.
Using ChatGPT as a proxy
Traffic from ChatGPT, the most popular LLM on the market, can be measured fairly accurately in Google Analytics. As mentioned above, this excludes users who saw your brand on ChatGPT and never clicked through to your site, but it remains a useful proxy for understanding the wider nature of your inbound LLM traffic.
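If you export raw referral data, a small script can bucket hostnames into AI sources. The hostname mapping below is an assumption on our part (referrer domains change as vendors rebrand), so treat it as a sketch rather than a definitive list:

```python
# Hypothetical mapping of referrer hostnames to AI platforms; these
# domains are assumptions and may change over time.
AI_REFERRERS = {
    "chatgpt.com": "ChatGPT",
    "chat.openai.com": "ChatGPT",
    "gemini.google.com": "Gemini",
    "perplexity.ai": "Perplexity",
    "copilot.microsoft.com": "Copilot",
}

def classify_referrer(hostname: str) -> str:
    """Return the AI platform for a referrer hostname, or 'Other'."""
    host = hostname.lower().removeprefix("www.")
    return AI_REFERRERS.get(host, "Other")

print(classify_referrer("chatgpt.com"))
print(classify_referrer("www.google.com"))
```

The same classification can usually be done inside Google Analytics itself with a custom channel group; the script simply makes the logic explicit.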
In our own generative engine optimisation (GEO) work at Kooba, we’ve found that ChatGPT traffic tends to arrive on high-intent pages (like our Services page or our Case Studies), spend a long time on the site, and convert at higher than average rates. In other words, ChatGPT users are extremely valuable from a commercial perspective.
We can also use ChatGPT traffic as a measuring stick for our broader AI content strategy. We can expect that an increase in ChatGPT arrivals correlates with increased traffic from Gemini and other tools, and platforms like “ChatGPT-vs-Google” provide us with benchmarks to track our progress against.
Cracking the Google code
Google, on the other hand, is far more secretive about traffic from Gemini, AI Overviews, or its newly-launched “AI Mode”. As these tools keep users within Google and away from publisher websites, Google has been accused of breaching the internet’s social contract (the company claims that an increase in traffic quality will offset decreases in quantity).

The only way to estimate rankings within Google’s AI tools is through impressions in Google Search Console. Unfortunately, Google doesn’t allow us to disentangle impressions from conventional searches (which are not that important) from impressions in AI Overviews or AI Mode (which are potentially very valuable). We would hope that this changes soon, especially if Google wants to keep content publishers on its platform.
Conclusion: trusting quality
It’s easy to get worked up about the changing nature of consumer behaviour on the internet, but most of these issues are still theoretical for now. In September 2025, less than 1% of inbound traffic came via AI tools (compared to 41% from Google). It might make sense to worry about the growth of these channels, but there is certainly no reason to panic yet.
The best content strategy advice is quite boring. Produce informative and useful content that readers want to engage with, and publish it in an accessible and well-labelled manner. Every search platform has an incentive to connect users with valuable information and content, so in the long run high-quality work will prevail, even if it fails to match every short-term tactical shift.
If you’d like to discuss the ways you can (and can’t) measure your AI visibility, just get in touch with our team today. We’d love to talk!