AI Bot Rules: explicit AI-crawler sections in robots.txt
Why add separate sections for GPTBot/ClaudeBot/PerplexityBot beyond the wildcard rule, an example, common mistakes, and how we score it.
What it is
AI Bot Rules are separate sections in robots.txt for specific AI crawlers, on
top of the wildcard User-agent: *. Each AI platform crawls with its own
User-Agent, and an explicit section pins down how you treat it. This refines
the general robots.txt guide; here the focus is the AI bots themselves.
Why it matters for AI agents
By default AI bots follow the * rule. Explicit sections give three things:
- Guaranteed access: works even if * is restricted.
- Targeted rules: /blog/ open, /api/private/ closed for a specific bot.
- A statement of intent: explicit trust to the platforms, a ticket into GEO/AEO results.
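The first two points can be checked locally with Python's standard-library parser. This is a minimal sketch, not the behavior of any particular crawler: the robots.txt body and the helper name `can_fetch` are made up for illustration, and `urllib.robotparser` applies rules in file order (first match wins), so real bots may resolve overlapping rules differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: everyone is blocked by the wildcard,
# but GPTBot gets its own section with targeted rules.
ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: GPTBot
Allow: /blog/
Disallow: /api/private/
"""


def can_fetch(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# GPTBot has an explicit section, so the restricted wildcard does not apply to it.
print(can_fetch(ROBOTS_TXT, "GPTBot", "/blog/post"))
print(can_fetch(ROBOTS_TXT, "GPTBot", "/api/private/keys"))
# ClaudeBot has no section here and falls back to the wildcard.
print(can_fetch(ROBOTS_TXT, "ClaudeBot", "/blog/post"))
```

In this sketch GPTBot reaches /blog/ while /api/private/ stays closed, and a bot without its own section inherits the wildcard restriction.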
Current AI User-Agents:
| User-Agent | Platform |
|---|---|
| GPTBot, ChatGPT-User, OAI-SearchBot | OpenAI |
| ClaudeBot, anthropic-ai | Anthropic |
| PerplexityBot, Perplexity-User | Perplexity |
| Google-Extended | Google AI / Gemini |
| Applebot-Extended | Apple |
| YandexAdditional | Yandex |
Minimal working example
User-agent: *
Allow: /
User-agent: GPTBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Google-Extended
Allow: /
Right vs wrong
| Right | Wrong |
|---|---|
| 3+ explicit AI sections | Only User-agent: * |
| Exact bot names (GPTBot) | Misspelled names (GTPBot): the section won’t match |
| Deliberate Allow/Disallow | A careless Disallow: / in an AI section |
Common mistakes
- Typos in names: the bot doesn’t match the section, so * applies.
- Only *: no explicit trust signal (and a lower score, see below).
- Conflicts between the wildcard and AI blocks: a bot with its own section follows only that section, so Disallow rules under * are not inherited.
- Confusing it with Content Signals: that’s about usage, this is about access.
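Two of these mistakes are easy to reproduce with Python's standard-library parser. The sketch below is illustrative only: both robots.txt bodies, the misspelling GTPBot, and the helper name `allowed` are assumptions, and `urllib.robotparser` is a stand-in for how a compliant crawler picks its section.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt, user_agent, path):
    """Return True if user_agent may fetch path under robots_txt."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Mistake 1: the section name is misspelled, so GPTBot never matches it.
MISSPELLED = """\
User-agent: *
Disallow: /

User-agent: GTPBot
Allow: /
"""

# Mistake 2: expecting the GPTBot section to inherit the wildcard Disallow.
NOT_INHERITED = """\
User-agent: *
Disallow: /private/

User-agent: GPTBot
Allow: /blog/
"""

# GPTBot falls back to the wildcard and is blocked despite the intended Allow.
print(allowed(MISSPELLED, "GPTBot", "/blog/post"))
# GPTBot follows only its own section, so /private/ is open to it.
print(allowed(NOT_INHERITED, "GPTBot", "/private/report"))
```

If /private/ should stay closed for a bot that has its own section, the Disallow line has to be repeated inside that section.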
How to verify
This check depends on robots.txt and is scored on a gradient:
- pass: 3+ AI-specific sections found;
- warning: 1–2 sections;
- fail: only * or nothing.
curl -s https://example.com/robots.txt | grep -iE 'gptbot|claudebot|perplexitybot|google-extended'
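The same gradient can be approximated offline. This is a rough sketch, not the scorer itself: it assumes that a "section" means a distinct known AI name on a User-agent line, takes the names from the table above, and uses the hypothetical helpers `ai_agents_declared` and `grade`.

```python
# AI crawler names from the table above, lowercased for comparison.
AI_AGENTS = {
    "gptbot", "chatgpt-user", "oai-searchbot",
    "claudebot", "anthropic-ai",
    "perplexitybot", "perplexity-user",
    "google-extended", "applebot-extended", "yandexadditional",
}


def ai_agents_declared(robots_txt):
    """Collect known AI names that appear on User-agent lines."""
    found = set()
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "user-agent":
            token = value.strip().lower()
            if token in AI_AGENTS:
                found.add(token)
    return found


def grade(robots_txt):
    """Map the number of AI-specific names to pass / warning / fail."""
    count = len(ai_agents_declared(robots_txt))
    if count >= 3:
        return "pass"
    if count >= 1:
        return "warning"
    return "fail"


MINIMAL = """\
User-agent: *
Allow: /
User-agent: GPTBot
Allow: /
User-agent: ClaudeBot
Allow: /
User-agent: PerplexityBot
Allow: /
User-agent: Google-Extended
Allow: /
"""

print(grade(MINIMAL))
```

Unlike the grep above, this only looks at User-agent lines, so a bot name mentioned in a comment does not count toward the grade.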