Search yes, training no

How to Allow ChatGPT Search but Block AI Training

Robots.txt guidance for allowing OAI-SearchBot while disallowing GPTBot, with caveats about ChatGPT-User, WAF rules and private content.

When to use this

Use this when you want discovery in ChatGPT Search but do not want to allow broad training-oriented crawling.

Use separate OpenAI user agents

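OpenAI documents separate user agents: OAI-SearchBot for search and GPTBot for training. Because robots.txt rules are grouped per user agent, you can allow the search crawler while disallowing the training crawler. ChatGPT-User, the user-triggered agent, gets its own group so its policy stays independent of both.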

User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /

Sitemap: https://example.com/sitemap.xml
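
One way to sanity-check these rules before deploying them is to parse them locally. The sketch below uses Python's standard-library urllib.robotparser, whose matching logic may not be identical to what OpenAI's crawlers apply, so treat the result as a rough check only. The /pricing path and the check_policy helper are hypothetical names chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# The rules from the block above, copied verbatim.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Allow: /

Sitemap: https://example.com/sitemap.xml
"""


def check_policy(robots_txt: str, url: str) -> dict[str, bool]:
    """Return whether each OpenAI user agent may fetch `url` under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    agents = ["OAI-SearchBot", "GPTBot", "ChatGPT-User"]
    return {agent: parser.can_fetch(agent, url) for agent in agents}


if __name__ == "__main__":
    # Hypothetical page URL; substitute a real public page on your site.
    policy = check_policy(ROBOTS_TXT, "https://example.com/pricing")
    for agent, allowed in policy.items():
        print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

If the policy is expressed correctly, OAI-SearchBot and ChatGPT-User should come back allowed and GPTBot blocked.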

Important caveats

robots.txt is voluntary: it only works when a crawler chooses to honor it. User-triggered agents such as ChatGPT-User may behave differently from automatic crawlers. Private content must be protected with authentication rather than crawler rules.

Checklist

Allow OAI-SearchBot on public pages you want eligible for ChatGPT Search.

Disallow GPTBot if your policy is to opt out of training-oriented crawling.

Keep ChatGPT-User policy separate from automatic crawling policy.

Make sure WAF/CDN settings do not block the bots you intentionally allow.

Request recrawl after publishing robots.txt and sitemap changes.
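
The WAF/CDN item in the checklist can be spot-checked with a small probe. This is a minimal sketch under stated assumptions: the probe and classify_status helpers are hypothetical, the status codes treated as "possibly blocked" are a guess at common WAF responses, and sending a bot name as the User-Agent from your own machine only approximates real crawler traffic, since many WAFs also verify source IP ranges.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Map an HTTP status code to a rough verdict for a bot you intend to allow."""
    if 200 <= status < 400:
        return "reachable"
    # Assumed set of codes that WAF/CDN rules commonly return when blocking.
    if status in (401, 403, 406, 429, 503):
        return "possibly blocked"
    return "check manually"


def probe(url: str, user_agent: str, timeout: float = 10.0) -> int:
    """Fetch `url` with the given User-Agent header and return the HTTP status."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx; the status code is still available.
        return err.code
```

For example, classify_status(probe("https://example.com/", "OAI-SearchBot")) run against your own domain gives a first hint. Network failures such as DNS errors or timeouts are not caught here and will raise urllib.error.URLError. A "reachable" verdict does not prove the real crawler gets through, so confirm against your WAF/CDN logs.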

FAQ

What should I check first when allowing ChatGPT Search but blocking AI training?

Allow OAI-SearchBot on public pages you want eligible for ChatGPT Search.

Does this guarantee ranking or inclusion in AI answers?

No. The scan checks public technical signals that can make a page easier to crawl, parse and cite, but no tool can guarantee ranking, indexing or citation in ChatGPT, Claude, Perplexity or Google.

Should I fix robots.txt, llms.txt or page rendering first?

Fix public reachability, indexability and readable initial HTML first. robots.txt should express crawler policy, and llms.txt is optional supporting documentation rather than a replacement for normal search fundamentals.
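
As a rough illustration of "indexability and readable initial HTML", the sketch below inspects raw HTML as delivered by the server, before any JavaScript runs. It is an assumption-laden check, not a definition of what any crawler does: the InitialHtmlCheck class and check_initial_html function are hypothetical names, it looks only at a robots meta tag (not the X-Robots-Tag response header), and it counts visible text characters as a crude proxy for readable content.

```python
from html.parser import HTMLParser


class InitialHtmlCheck(HTMLParser):
    """Collect a robots noindex directive and visible text length from raw HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False
        self.text_chars = 0
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "meta" and (attr.get("name") or "").lower() == "robots":
            if "noindex" in (attr.get("content") or "").lower():
                self.noindex = True

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Count only text outside script/style, ignoring surrounding whitespace.
        if not self._skip_depth:
            self.text_chars += len(data.strip())


def check_initial_html(html: str) -> dict:
    """Return the noindex flag and visible character count for an HTML string."""
    checker = InitialHtmlCheck()
    checker.feed(html)
    checker.close()
    return {"noindex": checker.noindex, "text_chars": checker.text_chars}
```

A page that reports noindex as True, or a text_chars value near zero, is worth fixing before fine-tuning robots.txt or adding llms.txt.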

Next step

Use the free scan to confirm whether your current rules match this policy.
