ChatGPT now crawls 3.6x more than Googlebot


This article was sponsored by Alli AI. The opinions expressed in this article are those of the sponsor.

Everyone assumes that Googlebot is the dominant crawler on their website. This assumption is now false.

We analyzed 24,411,048 proxy requests across more than 78,000 pages on 69 customer websites running on Alli AI’s bot enablement platform over a 55-day period (January to March 2026). OpenAI’s ChatGPT-User crawler made 3.6 times more requests than Googlebot across the entire sample. And that is not even counting GPTBot, OpenAI’s separate training crawler.

A note on methodology: Crawlers were identified by user agent string match and verified against published IP address ranges. Request metrics are measured at the proxy/CDN layer. The dataset covers 69 websites of various industries and sizes, most of them built on WordPress. The full methodology is detailed at the end.

Finding 1: AI crawlers now exceed Google 3.6x and ChatGPT is leading the pack

Image created by Alli AI, April 2026.

When we ranked each identified bot by request volume, the results were unambiguous:

Rank | Crawler | Requests | Category
1 | ChatGPT-User (OpenAI) | 133,361 | AI search
2 | Googlebot | 37,426 | Traditional search
3 | Amazonbot | 35,728 | AI / E-commerce
4 | Bingbot | 18,280 | Traditional search
5 | ClaudeBot (Anthropic) | 13,918 | AI search
6 | MetaBot | 10,756 | Social
7 | GPTBot (OpenAI) | 8,864 | AI training
8 | Applebot | 6,794 | AI search
9 | Bytespider (ByteDance) | 6,644 | AI training
10 | PerplexityBot | 5,731 | AI search

ChatGPT-User made more requests than Googlebot, Amazonbot and Bingbot combined.

Image created by Alli AI, April 2026.

Grouped by purpose, AI-related crawlers (ChatGPT-User, GPTBot, ClaudeBot, Amazonbot, Applebot, Bytespider, PerplexityBot, CCBot) made 213,477 requests, against 59,353 for traditional search bots (Googlebot, Bingbot, YandexBot). AI crawlers now make 3.6 times more requests than traditional search bots on our network.
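The ratios quoted above can be reproduced from the published counts. The sketch below is a minimal Python check. The variable names are ours, and the grouped totals are taken as stated in the text rather than recomputed, since the top-10 table does not list the CCBot and YandexBot counts.

```python
# Request counts copied from the ranking table above.
top_requests = {
    "ChatGPT-User": 133_361,
    "Googlebot": 37_426,
    "Amazonbot": 35_728,
    "Bingbot": 18_280,
}

# Grouped totals as stated in the text (not recomputed here).
ai_total = 213_477
search_total = 59_353

chatgpt_vs_googlebot = top_requests["ChatGPT-User"] / top_requests["Googlebot"]
ai_vs_search = ai_total / search_total
next_three_combined = sum(
    top_requests[name] for name in ("Googlebot", "Amazonbot", "Bingbot")
)

print(f"ChatGPT-User vs Googlebot: {chatgpt_vs_googlebot:.1f}x")
print(f"AI crawlers vs traditional search bots: {ai_vs_search:.1f}x")
print(f"Googlebot + Amazonbot + Bingbot: {next_three_combined:,}")
```

Both ratios round to 3.6x, and the next three crawlers together account for 91,434 requests, still below ChatGPT-User's 133,361.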

Finding 2: OpenAI uses 2 crawlers (and most sites don’t know the difference)

Image created by Alli AI, April 2026.

OpenAI operates two separate crawlers with very different goals.

ChatGPT-User is the retrieval crawler. It fetches pages in real time when users ask ChatGPT questions that require up-to-date web information. This determines whether your content appears in ChatGPT responses.

GPTBot is the training crawler. It collects data to improve OpenAI’s models. Many sites block GPTBot via robots.txt but not ChatGPT-User, or vice versa, without understanding the distinct consequences of each.

In total, OpenAI’s crawlers made 142,225 requests: 3.8x the volume of Googlebot.

The robots.txt directives are distinct:

User-agent: GPTBot      # Training crawler — feeds OpenAI's models
User-agent: ChatGPT-User # Retrieval crawler — fetches pages for ChatGPT answers
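To see that the two tokens really are evaluated independently, here is a minimal sketch using Python's standard-library urllib.robotparser. The robots.txt content, the example.com URL, and the helper name is_allowed are illustrative assumptions, not a recommended policy, and Python's parser is only an approximation of how each crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the training crawler, allow the retrieval crawler.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if this robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/pricing"
gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", url)
chatgpt_user_allowed = is_allowed(ROBOTS_TXT, "ChatGPT-User", url)

print(f"GPTBot allowed: {gptbot_allowed}")
print(f"ChatGPT-User allowed: {chatgpt_user_allowed}")
```

With this file, GPTBot is refused while ChatGPT-User is still allowed, so a rule written for one crawler says nothing about the other.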

Finding 3: AI crawlers are faster and more reliable, but their volume adds up

Image created by Alli AI, April 2026.

AI crawlers are significantly more efficient per request:

Crawler | Average response time | 200 success rate
PerplexityBot | 8ms | 100%
ChatGPT-User | 11ms | 99.99%
GPTBot | 12ms | 99.9%
ClaudeBot | 21ms | 99.9%
Bingbot | 42ms | 98.4%
Googlebot | 84ms | 96.3%

There are two probable reasons. First, AI crawlers retrieve specific pages in response to user queries rather than comprehensively discovering the site architecture. They know what they want, take it, and go. Second, while all crawlers on our infrastructure receive pre-rendered responses, Googlebot’s broader crawl model means it requests a wider range of URLs, including stale paths from sitemaps and its own legacy index. That adds redirect chain latency and error handling that retrieval bots avoid completely.

But there’s a catch: even though each individual request is light, the volume means that the overall server load is substantial. ChatGPT-User at 11ms × 133,361 requests is still a real infrastructure cost, just distributed differently from Googlebot’s fewer, heavier requests.
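The multiplication above can be worked through directly. The sketch below is a rough estimate under one loud assumption: that average response time multiplied by request count is a usable stand-in for cumulative proxy-layer time. It ignores bandwidth, concurrency, and any work at the origin server. The function and variable names are ours.

```python
# (requests, average response time in ms), copied from the two tables above.
crawlers = {
    "PerplexityBot": (5_731, 8),
    "ChatGPT-User": (133_361, 11),
    "GPTBot": (8_864, 12),
    "ClaudeBot": (13_918, 21),
    "Bingbot": (18_280, 42),
    "Googlebot": (37_426, 84),
}


def cumulative_seconds(requests: int, avg_ms: float) -> float:
    """Rough cumulative response time: request count times average latency."""
    return requests * avg_ms / 1000


totals = {
    name: cumulative_seconds(requests, avg_ms)
    for name, (requests, avg_ms) in crawlers.items()
}

for name, seconds in totals.items():
    print(f"{name}: {seconds:,.0f} s over the 55-day window")
```

On these figures, ChatGPT-User comes to roughly 1,467 seconds of cumulative proxy response time over the 55 days. Running the same calculation on your own logs gives a first-order view of where crawler time is spent.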

Finding 4: Googlebot sees a different (worse) version of your site

Image created by Alli AI, April 2026.

Googlebot’s 96.3% success rate compared to the near-perfect rates of AI crawlers reveals an important structural difference.

Googlebot received 624 forbidden responses (403) and 480 not-found errors (404), which together make up about 3% of its requests. Meanwhile, ChatGPT-User achieved 99.99% success, and PerplexityBot achieved a perfect 100%.

Image created by Alli AI, April 2026.

Why this gap? The most likely explanation is index age and crawling behavior, not poor site configuration.

Googlebot maintains a massive index, built over years of continuous crawling. It regularly re-requests URLs it already knows, including pages that have since been deleted (404) or restructured (403). This is normal behavior for a search engine maintaining an index of this scale, but it means that a significant percentage of Googlebot’s requests go to URLs that no longer exist.

AI crawlers do not carry this baggage. ChatGPT-User retrieves specific pages in real time in response to user queries, targeting currently relevant, linked content. This is a structural advantage that produces near-perfect success rates.

Industry Reports Confirm AI Crawling Increased 15x in 2025

These results align with broader industry trends. Cloudflare’s 2025 analysis reported that ChatGPT-User requests increased 2,825% year over year, with “user action” AI crawling growing more than 15x over the course of 2025. Akamai identified OpenAI as the largest AI bot operator, accounting for 42.4% of all AI bot requests. Vercel’s analysis of nextjs.org confirmed that none of the major AI crawlers currently render JavaScript.

Our data shows that the crossover, where AI crawler traffic overtakes traditional search crawler traffic, may already be happening at the site level for properties that actively allow AI crawler access.

Your New SEO Strategy: How to Audit, Clean, and Optimize for AI Crawlers

1. Audit your robots.txt for AI crawlers today

Most robots.txt files were written for a Googlebot-centric world. At a minimum, have explicit directives for ChatGPT-User, GPTBot, ClaudeBot, Amazonbot, PerplexityBot, Applebot, Bytespider, CCBot, and Google-Extended.
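One way to start the audit is to list which of these tokens have no explicit User-agent group in your file. The sketch below is a minimal, assumption-laden check: the sample robots.txt and the function name missing_explicit_rules are hypothetical, and the parsing only looks at User-agent lines, not at what the rules say.

```python
# Tokens taken from the list above.
AI_CRAWLER_TOKENS = [
    "ChatGPT-User", "GPTBot", "ClaudeBot", "Amazonbot", "PerplexityBot",
    "Applebot", "Bytespider", "CCBot", "Google-Extended",
]

# Hypothetical robots.txt written with only Googlebot and GPTBot in mind.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/

User-agent: GPTBot   # training crawler
Disallow: /

User-agent: Googlebot
Allow: /
"""


def missing_explicit_rules(robots_txt: str, tokens=AI_CRAWLER_TOKENS) -> list[str]:
    """Return the tokens that have no User-agent line of their own."""
    declared = set()
    for raw_line in robots_txt.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop comments
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "user-agent":
            declared.add(value.strip().lower())
    return [token for token in tokens if token.lower() not in declared]


missing = missing_explicit_rules(SAMPLE_ROBOTS_TXT)
print("No explicit group for:", ", ".join(missing))
```

A crawler without its own group falls back to the `User-agent: *` rules, which may or may not be what you intend for it.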

Our recommendation: Most businesses benefit from allowing both retrieval crawlers (ChatGPT-User, PerplexityBot, ClaudeBot) and training crawlers (GPTBot, CCBot, Bytespider). Training data is what teaches these models about your brand, products, and expertise. Blocking training bots today means AI models will learn less about you tomorrow, which lowers your chances of being cited in AI-generated answers in the long run.

The exception: If you have content that you specifically need to protect from model training (proprietary research, gated content), take the granular approach. Use Disallow rules for those paths rather than blanket blocks.
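As an illustration of the granular approach, the sketch below keeps training crawlers out of two directories while leaving the rest of the site open, and checks the result with Python's urllib.robotparser. The paths /research/ and /members/ and the example.com domain are hypothetical, and each real crawler may interpret robots.txt slightly differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical paths and domain, chosen only for illustration.
GRANULAR_ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: CCBot
User-agent: Bytespider
Disallow: /research/
Disallow: /members/
"""

parser = RobotFileParser()
parser.parse(GRANULAR_ROBOTS_TXT.splitlines())

# Check a protected path and an open path for two training crawlers
# and one retrieval crawler.
checks = {
    (agent, path): parser.can_fetch(agent, f"https://example.com{path}")
    for agent in ("GPTBot", "CCBot", "ChatGPT-User")
    for path in ("/research/report", "/blog/post")
}

for (agent, path), allowed in checks.items():
    print(f"{agent:13} {path:17} allowed={allowed}")
```

Here the training crawlers are refused only under the protected paths, and ChatGPT-User, which has no group in this file, is allowed everywhere.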

2. Clean Out Obsolete URLs in Google Search Console

Our data shows that Googlebot sees an error rate of about 3%, mostly 403s and 404s, while AI crawlers achieve near-perfect success rates. This discrepancy likely reflects Googlebot’s recrawl of legacy URLs that no longer exist. But these failed requests still consume crawl budget.

Check your GSC crawl stats for recurring 404s and 403s. Configure appropriate redirects for restructured URLs and submit updated sitemaps.
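If you also have server access logs, the same check can be run there. The sketch below is a minimal example that assumes logs in the common "combined" format. The regular expression, the sample lines, and the function name failing_paths are ours, and matching on the user agent string alone does not prove a request really came from Googlebot.

```python
import re
from collections import Counter

# Matches the request, status, and user agent fields of a combined-format line.
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BINGBOT_UA = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"


def sample_line(path: str, status: int, agent: str) -> str:
    """Build a made-up combined-format log line for the demo."""
    return (f'192.0.2.10 - - [01/Feb/2026:10:00:00 +0000] '
            f'"GET {path} HTTP/1.1" {status} 512 "-" "{agent}"')


def failing_paths(log_lines, agent_token="Googlebot", statuses=(403, 404)) -> Counter:
    """Count (path, status) pairs for failed requests from one crawler."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if not match or agent_token.lower() not in match["agent"].lower():
            continue
        status = int(match["status"])
        if status in statuses:
            counts[(match["path"], status)] += 1
    return counts


sample_log = [
    sample_line("/old-page", 404, GOOGLEBOT_UA),
    sample_line("/old-page", 404, GOOGLEBOT_UA),
    sample_line("/members/report", 403, GOOGLEBOT_UA),
    sample_line("/pricing", 200, GOOGLEBOT_UA),
    sample_line("/old-page", 404, BINGBOT_UA),
]

recurring = failing_paths(sample_log)
for (path, status), hits in recurring.most_common():
    print(f"{status} {path}: {hits} hits")
```

Paths that keep appearing in this output are the candidates for redirects or removal from your sitemap.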

3. Treat AI crawler accessibility as a separate SEO channel

Appearing in ChatGPT answers, Perplexity results, and Claude responses is emerging as a separate visibility channel. If your content is not accessible to these crawlers, especially if you use JavaScript-heavy frameworks, you are invisible in AI search.

We published a live dashboard showing how AI crawler traffic is distributed on a real site: which platforms visit, how often, and their share of total traffic. It shows what this looks like in practice.

4. Plan for volume, not just the weight of individual requests

AI crawlers send light, fast requests, but they send a lot of them too. ChatGPT-User alone accounted for more than 133,000 requests in 55 days. The overall server load from AI crawlers now likely exceeds the load from Googlebot. Make sure your hosting and CDN can handle it. The low response times per request in our data reflect the fact that Alli AI serves pre-rendered static HTML from the CDN edge, which is exactly the type of architecture that absorbs this volume without taxing your origin server.

Methodology

This analysis is based on 24,411,048 HTTP proxy requests processed through Alli AI’s bot enablement platform between January 14 and March 9, 2026, covering 69 customer websites.

Crawlers were identified by user agent string match and verified against published IP address ranges. For OpenAI crawlers specifically, each request was checked against the CIDR ranges published by OpenAI. This confirmed that 100% of GPTBot requests and 99.76% of ChatGPT-User requests came from OpenAI infrastructure. The remaining 0.24% (requests from spoofed user agents) were excluded.
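The verification step described above can be sketched with Python's standard-library ipaddress module. This is not Alli AI's actual pipeline. The CIDR ranges below are placeholder documentation ranges, not OpenAI's published ranges, and the function name is_verified is hypothetical.

```python
import ipaddress

# Placeholder ranges from documentation-only address blocks.
# These are NOT OpenAI's real ranges; load the published list in practice.
PUBLISHED_RANGES = ["192.0.2.0/24", "198.51.100.0/25"]


def is_verified(ip: str, cidr_ranges=PUBLISHED_RANGES) -> bool:
    """Return True if ip falls inside any of the published CIDR ranges."""
    address = ipaddress.ip_address(ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


# Requests whose user agent claims to be ChatGPT-User (made-up source IPs).
claimed_requests = ["192.0.2.44", "198.51.100.17", "198.51.100.200", "203.0.113.9"]

verified = [ip for ip in claimed_requests if is_verified(ip)]
spoofed = [ip for ip in claimed_requests if not is_verified(ip)]

print("Verified:", verified)
print("Excluded as spoofed:", spoofed)
```

A request counts as verified only when both checks agree: the user agent names the crawler and the source address sits inside a published range.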

Limitations: The dataset is skewed toward Alli AI customers who have opted in to crawler enablement. Crawlers that do not identify themselves through the user agent are not captured. Response time measurements are taken at the proxy layer, not at the origin server.

About Alli AI

Alli AI provides server-side rendering infrastructure for AI bots and search engines. This analysis was conducted using data from our proxy infrastructure to help the SEO community better understand the evolving crawler landscape.

Want to see this data in action? See the breakdown by visiting our AI Visibility Dashboard.


Image credits

Featured image: Image from Alli AI. Used with permission.

In-Post Images: Images from Alli AI. Used with permission.


