This article was sponsored by Alli AI. The opinions expressed in this article are those of the sponsor.
Everyone assumes that Googlebot is the dominant crawler on their website. This assumption is now false.
We analyzed 24,411,048 proxy requests to more than 78,000 pages across 69 customer websites on Alli AI’s bot enablement platform over a 55-day period (January to March 2026). OpenAI’s ChatGPT-User crawler made 3.6 times more requests than Googlebot across our entire data sample. And that’s not even counting GPTBot, OpenAI’s separate training crawler.
A note on methodology: Crawler identification used a user agent string match, verified against published IP address ranges. Request metrics are measured at the proxy/CDN layer. The dataset covers 69 websites across various industries and sizes, primarily built on WordPress. The full methodology is detailed at the end.
Finding 1: AI crawlers now outpace Google 3.6x, and ChatGPT leads the pack

When we ranked each identified bot by request volume, the results were unambiguous:
| Rank | Crawler | Requests | Category |
| 1 | ChatGPT-User (OpenAI) | 133,361 | AI search |
| 2 | Googlebot | 37,426 | Traditional search |
| 3 | Amazonbot | 35,728 | AI / E-commerce |
| 4 | Bingbot | 18,280 | Traditional search |
| 5 | ClaudeBot (Anthropic) | 13,918 | AI search |
| 6 | MetaBot | 10,756 | Social |
| 7 | GPTBot (OpenAI) | 8,864 | AI training |
| 8 | Applebot | 6,794 | AI search |
| 9 | Bytespider (ByteDance) | 6,644 | AI training |
| 10 | PerplexityBot | 5,731 | AI search |
ChatGPT-User made more requests than Googlebot, Amazonbot and Bingbot combined.

Grouped by purpose, AI-related crawlers (ChatGPT-User, GPTBot, ClaudeBot, Amazonbot, Applebot, Bytespider, PerplexityBot, CCBot) generated 213,477 requests, against 59,353 for traditional search bots (Googlebot, Bingbot, YandexBot). AI crawlers now make 3.6 times more requests than traditional search bots on our network.
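The ratios quoted in this article can be re-derived from the published counts. The short Python sketch below does only that arithmetic; the variable names are ours, and the two group totals are taken as reported because they include bots (CCBot, YandexBot) that fall outside the top 10.

```python
# Request counts copied from the ranking table above.
requests = {
    "ChatGPT-User": 133_361,
    "Googlebot": 37_426,
    "Amazonbot": 35_728,
    "Bingbot": 18_280,
    "GPTBot": 8_864,
}

# Group totals as reported in the text (they include bots outside the top 10).
ai_total = 213_477
search_total = 59_353

chatgpt_vs_google = requests["ChatGPT-User"] / requests["Googlebot"]
ai_vs_search = ai_total / search_total

# Both OpenAI crawlers together, compared with Googlebot.
openai_total = requests["ChatGPT-User"] + requests["GPTBot"]
openai_vs_google = openai_total / requests["Googlebot"]

# The three crawlers ranked directly below ChatGPT-User.
next_three = requests["Googlebot"] + requests["Amazonbot"] + requests["Bingbot"]

print(f"ChatGPT-User vs Googlebot: {chatgpt_vs_google:.1f}x")
print(f"AI crawlers vs traditional search: {ai_vs_search:.1f}x")
print(f"OpenAI total: {openai_total:,} ({openai_vs_google:.1f}x Googlebot)")
print(f"Googlebot + Amazonbot + Bingbot: {next_three:,}")
```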
Finding 2: OpenAI uses 2 crawlers (and most sites don’t know the difference)

OpenAI operates two separate crawlers with very different goals.
ChatGPT-User is the retrieval crawler. It fetches pages in real time when users ask ChatGPT questions that require up-to-date web information. This determines whether your content appears in ChatGPT responses.
GPTBot is the training crawler. It collects data to improve OpenAI’s models. Many sites block GPTBot via robots.txt but not ChatGPT-User, or vice versa, without understanding the distinct consequences of each.
In total, OpenAI’s crawlers made 142,225 requests: 3.8x the volume of Googlebot.
The robots.txt directives are distinct:
User-agent: GPTBot # Training crawler — feeds OpenAI's models
User-agent: ChatGPT-User # Retrieval crawler — fetches pages for ChatGPT answers
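To see how the two tokens behave independently, a robots.txt policy can be tested before it is deployed. This is a minimal sketch using Python's standard `urllib.robotparser`; the policy text, the `allowed_agents` helper, and the example URL are hypothetical, and the result reflects how Python's parser reads the file, which may differ in edge cases from how each crawler interprets it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block the training crawler, allow the retrieval crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /
"""


def allowed_agents(robots_txt, agents, url):
    """Map each user-agent token to whether this robots.txt lets it fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    ROBOTS_TXT,
    ["GPTBot", "ChatGPT-User"],
    "https://example.com/pricing",  # placeholder URL
)
print(result)
```

With this sample policy, GPTBot is refused and ChatGPT-User is allowed, which is the "block training, keep retrieval" configuration described above.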
Finding 3: AI crawlers are faster and more reliable, but their volume adds up

AI crawlers are significantly more efficient per request:
| Crawler | Average response time | 200 success rate |
| PerplexityBot | 8ms | 100% |
| ChatGPT-User | 11ms | 99.99% |
| GPTBot | 12ms | 99.9% |
| ClaudeBot | 21ms | 99.9% |
| Bingbot | 42ms | 98.4% |
| Googlebot | 84ms | 96.3% |
There are two probable reasons. First, AI retrieval crawlers fetch specific pages in response to user queries rather than exhaustively discovering the site architecture. They know what they want, they take it, and they go. Second, while all crawlers in our infrastructure receive pre-rendered responses, Googlebot’s broader crawl model means it requests a wider range of URLs, including stale paths from sitemaps and its own legacy index, which adds redirect-chain latency and error handling that retrieval bots avoid entirely.
But there’s a catch: even though each individual request is light, the volume means that the overall server load is substantial. ChatGPT-User at 11ms × 133,361 requests is still a real infrastructure cost, just distributed differently from Googlebot’s fewer, heavier requests.
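The multiplication above is easy to reproduce. The sketch below computes cumulative proxy response time for the two crawlers being compared, using the averages and counts from the tables. Treat it as a rough stand-in only: response time at the proxy layer is not a measurement of origin CPU, memory, or bandwidth, and the function name is ours.

```python
# (average response time in ms, request count) taken from the tables above.
crawlers = {
    "ChatGPT-User": (11, 133_361),
    "Googlebot": (84, 37_426),
}


def cumulative_seconds(avg_ms, count):
    """Total response time in seconds, assuming every request takes the average."""
    return avg_ms * count / 1000


totals = {name: cumulative_seconds(ms, n) for name, (ms, n) in crawlers.items()}

for name, secs in totals.items():
    print(f"{name}: {secs:,.0f} s ({secs / 60:.1f} min) over the 55-day window")
```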
Finding 4: Googlebot sees a different (worse) version of your site

Googlebot’s 96.3% success rate compared to the near-perfect rates of AI crawlers reveals an important structural difference.
Googlebot received 624 blocked responses (403) and 480 not found errors (404), or about 3% of its requests. Meanwhile, ChatGPT-User achieved 99.99% success. PerplexityBot achieved a perfect 100%.
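As a quick check on the 3% figure, the share of 403 and 404 responses follows directly from the counts given. The sketch below is plain arithmetic on those published numbers; the gap between this share and the 96.3% success rate would be other non-200 responses, which the data here does not break down.

```python
googlebot_requests = 37_426
blocked_403 = 624
not_found_404 = 480

# Share of Googlebot requests that ended in a 403 or 404.
error_share = (blocked_403 + not_found_404) / googlebot_requests
print(f"403 + 404 share: {error_share:.2%}")
```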

Why this gap? The most likely explanation is index age and crawling behavior, not poor site configuration.
Googlebot maintains a massive index, built over years of continuous crawling. It regularly re-requests URLs it already knows, including pages that have since been deleted (404) or restructured (403). This is normal behavior for a search engine running an index of this scale, but it means that a meaningful share of Googlebot’s requests go to URLs that no longer exist.
AI crawlers do not carry this baggage. ChatGPT-User retrieves specific pages in response to user queries in real time, targeting content that is currently relevant and linked. This is a structural advantage that produces near-perfect success rates.
Industry Reports Confirm AI Crawling Increased 15x in 2025
These results align with broader industry trends. Cloudflare’s 2025 analysis reported that ChatGPT-User requests increased 2,825% YoY, with “user action” AI crawling growing more than 15x over the course of 2025. Akamai identified OpenAI as the largest AI bot operator, accounting for 42.4% of all AI bot requests. Vercel’s analysis of nextjs.org confirmed that none of the major AI crawlers currently render JavaScript.
Our data shows that the crossover, with AI crawler traffic overtaking traditional search crawler traffic, can already happen at the site level for properties that actively allow AI crawler access.
Your New SEO Strategy: How to Audit, Clean, and Optimize for AI Crawlers
1. Audit your robots.txt for AI crawlers today
Most robots.txt files were written for a Googlebot-centric world. At a minimum, have explicit directives for ChatGPT-User, GPTBot, ClaudeBot, Amazonbot, PerplexityBot, Applebot, Bytespider, CCBot, and Google-Extended.
Our recommendation: Most businesses benefit from allowing both retrieval crawlers (ChatGPT-User, PerplexityBot, ClaudeBot) and training crawlers (GPTBot, CCBot, Bytespider). Training data is what teaches these models about your brand, products, and expertise. Blocking training bots today means AI models will learn less about you tomorrow, lowering your chances of being cited in AI-generated answers in the long run.
The exception: If you have content that you specifically need to protect from model training (proprietary research, gated content), take the granular approach. Use Disallow rules for those paths rather than blanket blocks.
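One way to start the audit is to list which of the crawlers named above have no explicit group in an existing robots.txt. The sketch below is a simple text scan, not a full robots.txt parser; the `missing_explicit_groups` helper and the sample file (including its `/wp-admin/` and `/members/` paths) are hypothetical.

```python
AI_AGENTS = [
    "ChatGPT-User", "GPTBot", "ClaudeBot", "Amazonbot", "PerplexityBot",
    "Applebot", "Bytespider", "CCBot", "Google-Extended",
]


def missing_explicit_groups(robots_txt, agents=AI_AGENTS):
    """Return the agents that have no User-agent line of their own."""
    declared = set()
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        field, _, value = line.partition(":")
        if field.strip().lower() == "user-agent":
            declared.add(value.strip().lower())
    return [agent for agent in agents if agent.lower() not in declared]


# Hypothetical file: a wildcard group plus one granular rule for GPTBot.
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/

User-agent: GPTBot
Disallow: /members/   # protect gated content only, not the whole site
"""

missing = missing_explicit_groups(SAMPLE)
print(missing)
```

Agents in the returned list fall back to the wildcard group, so their access is decided by rules that were probably never written with them in mind.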
2. Clean Out Obsolete URLs in Google Search Console
Our data shows that Googlebot has a 3% error rate, mostly 403s and 404s, while AI crawlers achieve near-perfect success rates. This discrepancy likely reflects Googlebot’s recrawling of legacy URLs that no longer exist. But these failed requests still consume crawl budget.
Check your GSC crawl stats for recurring 404s and 403s. Configure appropriate redirects for restructured URLs and submit updated sitemaps.
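Where raw server logs are available alongside GSC, the same recurring errors can be cross-checked directly. The following is a minimal sketch that assumes combined-log-format lines; the regular expression, the `recurring_errors` helper, the sample lines, and the placeholder IP addresses are all illustrative and would need adapting to a real log format.

```python
import re
from collections import Counter

# Assumed combined log format: "METHOD path proto" status bytes "referer" "ua"
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" (?P<status>\d{3}) .*"(?P<ua>[^"]*)"$'
)


def recurring_errors(lines, bot="Googlebot", statuses=("403", "404"), min_hits=2):
    """List (path, status, hits) for error URLs the bot requested repeatedly."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and bot in match["ua"] and match["status"] in statuses:
            hits[(match["path"], match["status"])] += 1
    return [(p, s, n) for (p, s), n in hits.most_common() if n >= min_hits]


GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
SAMPLE_LOG = [
    f'192.0.2.1 - - [14/Jan/2026:10:00:00 +0000] "GET /old-page HTTP/1.1" 404 0 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.1 - - [15/Jan/2026:10:00:00 +0000] "GET /old-page HTTP/1.1" 404 0 "-" "{GOOGLEBOT_UA}"',
    f'192.0.2.1 - - [15/Jan/2026:10:05:00 +0000] "GET /private HTTP/1.1" 403 0 "-" "{GOOGLEBOT_UA}"',
    '192.0.2.2 - - [15/Jan/2026:10:06:00 +0000] "GET /pricing HTTP/1.1" 200 5120 "-" "ChatGPT-User/1.0"',
]

recurring = recurring_errors(SAMPLE_LOG)
print(recurring)
```

Paths that show up here repeatedly are the candidates for redirects or for removal from sitemaps.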
3. Treat AI crawler accessibility as a separate SEO channel
Visibility in ChatGPT responses, Perplexity results, and Claude responses is emerging as a separate channel. If your content is not accessible to these crawlers, especially if you use JavaScript-heavy frameworks, you are invisible in AI search.
If you want to see what this looks like in practice, we published a live dashboard showing how AI crawler traffic is distributed on a real site: which platforms visit, how often, and their share of total traffic.
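Because the crawlers in question reportedly do not render JavaScript, a useful first test is whether key content is present in the raw HTML at all. The sketch below checks a phrase against the text of an HTML string while ignoring anything inside script and style tags. The `visible_without_js` helper and both sample pages are hypothetical; in practice the HTML would come from fetching the page without executing scripts.

```python
from html.parser import HTMLParser


class VisibleText(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def visible_without_js(raw_html, phrase):
    """True if phrase appears in the page text before any script runs."""
    parser = VisibleText()
    parser.feed(raw_html)
    parser.close()
    return phrase.lower() in " ".join(parser.chunks).lower()


# Hypothetical pages: one server-rendered, one rendered client-side.
SERVER_RENDERED = "<html><body><h1>Pricing plans</h1></body></html>"
CLIENT_RENDERED = (
    '<html><body><div id="root"></div>'
    '<script>render("Pricing plans")</script></body></html>'
)

print(visible_without_js(SERVER_RENDERED, "Pricing plans"))
print(visible_without_js(CLIENT_RENDERED, "Pricing plans"))
```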
4. Plan for volume, not just the weight of individual requests
AI crawlers send light, fast requests, but they send a lot of them too. ChatGPT-User alone accounted for more than 133,000 requests in 55 days. The overall server load from AI crawlers now likely exceeds the load from Googlebot. Make sure your hosting and CDN can handle it. The low per-request response times in our data reflect the fact that Alli AI serves pre-rendered static HTML from the CDN edge, which is exactly the type of architecture that absorbs this volume without taxing your origin server.
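For capacity planning, it can help to convert the 55-day total into an average rate. The sketch below does that for ChatGPT-User across the whole 69-site sample. These are averages only: the data here says nothing about bursts or how traffic is split between sites, so real peaks on any one site could look very different.

```python
chatgpt_user_requests = 133_361
days = 55

# Average request rate across all 69 sites in the sample.
per_day = chatgpt_user_requests / days
per_minute = per_day / (24 * 60)

print(f"Average: {per_day:,.0f} requests/day, {per_minute:.2f} requests/minute")
```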
Methodology
This analysis is based on 24,411,048 HTTP proxy requests processed through Alli AI’s bot enablement platform between January 14 and March 9, 2026, covering 69 customer websites.
Crawler identification used a user agent string match, verified against published IP address ranges. For OpenAI crawlers specifically, each request was compared to CIDR ranges published by OpenAI. This confirmed that 100% of GPTBot requests and 99.76% of ChatGPT-User requests came from OpenAI infrastructure. The remaining 0.24% (requests from spoofed user agents) were excluded.
Limitations: The dataset is skewed toward Alli AI customers who have opted in to crawler enablement. Crawlers that do not identify themselves through the user agent are not captured. Response time measurements are made at the proxy layer, not at the origin server.
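The two-step check described above (match the user agent, then verify the source IP against published ranges) can be sketched as follows. This is an illustration, not the pipeline used for this analysis: the `classify` helper is hypothetical, and the CIDR blocks are placeholder documentation ranges (RFC 5737), not OpenAI's actual published ranges.

```python
import ipaddress

# Placeholder ranges from the RFC 5737 documentation blocks, NOT real OpenAI ranges.
PUBLISHED_RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "ChatGPT-User": ["198.51.100.0/24"],
}


def classify(user_agent, ip, ranges=PUBLISHED_RANGES):
    """Return (bot_name, verified); (None, False) if no known token matches."""
    addr = ipaddress.ip_address(ip)
    for bot, cidrs in ranges.items():
        if bot.lower() in user_agent.lower():
            verified = any(addr in ipaddress.ip_network(cidr) for cidr in cidrs)
            return bot, verified
    return None, False


# Genuine-looking request: token matches and the IP is inside the range.
print(classify("Mozilla/5.0; compatible; GPTBot/1.1", "192.0.2.10"))
# Spoofed request: token matches but the IP is outside the range.
print(classify("Mozilla/5.0; compatible; ChatGPT-User/1.0", "203.0.113.5"))
```

Requests that match a token but fail the range check correspond to the spoofed traffic excluded from the dataset.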
About Alli AI
Alli AI provides server-side rendering infrastructure for AI bots and search engines. This analysis was conducted using data from our proxy infrastructure to help the SEO community better understand the evolving crawler landscape.
Want to see this data in action? See the breakdown by visiting our AI Visibility Dashboard.
Image credits
Featured image: Image from Alli AI. Used with permission.
In-Post Images: Images from Alli AI. Used with permission.





