Free tool · No signup · Bot-by-bot robots.txt check

AI Crawler Checker

Test exactly which AI crawlers can reach your public page. WebsiteReady parses robots.txt and page-level indexing directives, then shows a bot-by-bot access matrix for search discovery, training, user-triggered fetchers, and model-usage controls.

Start with the facts. Paste a URL to see which named AI bots are allowed, blocked, or unknown, then use the default Maximize AI discovery scoring as a starting policy view.

Default policy: Maximize AI discovery. Only submit public URLs you own, operate, or are authorized to test.

What this checks

Bot-by-bot AI crawler access with robots.txt evidence.

Named AI bot access

Evaluate the key official OpenAI, Perplexity, Anthropic, Google, and Apple AI bot tokens one by one, then report the actual number of bots checked in the generated report.

robots.txt rule evidence

Show the matched user-agent group, the longest matching Allow/Disallow rule, any wildcard fallback, and unknown states instead of a vague AI-readiness score.
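
As a rough illustration of what that evidence looks like, here is a minimal Python sketch of longest-match evaluation. It is not WebsiteReady's implementation: the names Verdict, parse_groups, and evaluate are hypothetical, and the sketch assumes plain prefix matching only (no * or $ wildcards inside paths).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Verdict:
    group: str     # user-agent group consulted ("*" is the wildcard fallback)
    rule: str      # winning rule line, or "" when no rule matched
    allowed: bool


def parse_groups(robots_txt: str) -> dict[str, list[tuple[str, str]]]:
    """Map each lowercase user-agent token to its (directive, path) rules."""
    groups: dict[str, list[tuple[str, str]]] = {}
    current: list[str] = []
    in_rules = False
    for raw in robots_txt.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        key, value = (part.strip() for part in line.split(":", 1))
        key = key.lower()
        if key == "user-agent":
            if in_rules:  # a user-agent line after rules starts a new group
                current, in_rules = [], False
            current.append(value.lower())
            groups.setdefault(value.lower(), [])
        elif key in ("allow", "disallow"):
            in_rules = True
            for agent in current:
                groups[agent].append((key, value))
    return groups


def evaluate(robots_txt: str, bot: str, path: str) -> Verdict:
    """Pick the bot's own group if present, else "*", then the longest matching rule."""
    groups = parse_groups(robots_txt)
    group = bot.lower() if bot.lower() in groups else "*"
    best: Optional[tuple[str, str]] = None
    for directive, pattern in groups.get(group, []):
        if not pattern or not path.startswith(pattern):
            continue  # an empty pattern matches nothing
        longer = best is None or len(pattern) > len(best[1])
        tie_allow = best is not None and len(pattern) == len(best[1]) and directive == "allow"
        if longer or tie_allow:
            best = (directive, pattern)
    if best is None:
        return Verdict(group, "", True)  # no matching rule means allowed
    return Verdict(group, f"{best[0].capitalize()}: {best[1]}", best[0] == "allow")
```

With a file that disallows everything for * but gives OAI-SearchBot its own group with Allow: / and Disallow: /private/, evaluate reports the named group for OAI-SearchBot and the wildcard fallback for any bot without a group of its own.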

Page-level indexing blockers

Flag meta robots and X-Robots-Tag blocks that make crawler-specific allow rules irrelevant.
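
A small Python sketch of this kind of check, using only the standard library, might look like the following. The names MetaRobotsParser and page_level_blockers are hypothetical, and the sketch makes simplifying assumptions: it only looks at the generic robots meta tag and at unscoped X-Robots-Tag values, and it treats noindex and none as the blocking directives.

```python
from html.parser import HTMLParser

BLOCKING = {"noindex", "none"}  # assumed set of directives that block indexing


class MetaRobotsParser(HTMLParser):
    """Collect directives from <meta name="robots" content="..."> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.directives: set[str] = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def page_level_blockers(html: str, headers: dict[str, str]) -> list[str]:
    """Return readable notes for blocking directives in the HTML and headers."""
    found: list[str] = []
    parser = MetaRobotsParser()
    parser.feed(html)
    meta_hits = parser.directives & BLOCKING
    if meta_hits:
        found.append("meta robots: " + ", ".join(sorted(meta_hits)))
    for name, value in headers.items():
        if name.lower() != "x-robots-tag":
            continue
        # Bot-scoped values such as "googlebot: noindex" are ignored in this sketch.
        header_hits = {token.strip().lower() for token in value.split(",")} & BLOCKING
        if header_hits:
            found.append("X-Robots-Tag: " + ", ".join(sorted(header_hits)))
    return found
```

A non-empty result means the page asks not to be indexed regardless of what robots.txt allows for a given crawler.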

Default discovery policy

Score the result against Maximize AI search discovery, the best default for public indie tools and SEO-focused product sites.

Search crawlers vs training crawlers

AI crawlers are not all the same. Search discovery bots such as OAI-SearchBot, PerplexityBot, and Claude-SearchBot are different from training-related bots like GPTBot or ClaudeBot, user-triggered fetchers, and model-usage controls like Google-Extended.
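
To make the distinction concrete, the grouping can be sketched as a small lookup table in Python. This is only an illustration: BOT_CATEGORIES and access_matrix are hypothetical names, the table contains only the tokens named above, and user-triggered fetchers are left out because none are named here.

```python
from collections import defaultdict

# Only the tokens named in the text; not a complete list of AI bots.
BOT_CATEGORIES = {
    "OAI-SearchBot": "search discovery",
    "PerplexityBot": "search discovery",
    "Claude-SearchBot": "search discovery",
    "GPTBot": "training",
    "ClaudeBot": "training",
    "Google-Extended": "model-usage control",
}


def access_matrix(results: dict[str, str]) -> dict[str, dict[str, str]]:
    """Group per-bot states ("allowed", "blocked", "unknown") by category."""
    matrix: dict[str, dict[str, str]] = defaultdict(dict)
    for bot, category in BOT_CATEGORIES.items():
        # Bots with no evidence stay "unknown" rather than being guessed.
        matrix[category][bot] = results.get(bot, "unknown")
    return dict(matrix)
```

Keeping the categories separate lets a report show, for example, that search discovery bots are allowed while training bots are blocked.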

How to fix robots.txt issues

The report keeps the first step simple: show which bots are allowed, blocked, or unknown, then recommend the Maximize AI search discovery robots.txt snippet when important search crawlers are blocked. It still explains training/model-usage controls as context, but the public tool does not ask visitors to choose a crawler strategy first.

Supporting GEO context

llms.txt and schema explain the page after crawler access is clear.

WebsiteReady checks whether /llms.txt exists, whether it is empty or contains only placeholder copy, whether it has a Markdown H1 and sections, and whether the page has JSON-LD. Concise visible-copy signals are shown as advisory context, not hard failures, because semantic heuristics can be wrong. A missing llms.txt is treated as a suggested improvement, not a guaranteed launch blocker.
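
The llms.txt part of these checks could be approximated with a few lines of Python. This is a sketch under stated assumptions, not the tool's actual logic: review_llms_txt is a hypothetical name, the placeholder markers are illustrative guesses, and "sections" is assumed to mean Markdown H2 headings.

```python
import re
from typing import Optional

# Illustrative markers only; a real check would need a more careful heuristic.
PLACEHOLDER_MARKERS = ("lorem ipsum", "coming soon")


def review_llms_txt(body: Optional[str]) -> list[str]:
    """Return advisory notes for an /llms.txt body (None means the file is missing)."""
    if body is None:
        return ["missing: consider adding /llms.txt (improvement, not a blocker)"]
    text = body.strip()
    if not text:
        return ["empty file"]
    notes: list[str] = []
    if any(marker in text.lower() for marker in PLACEHOLDER_MARKERS):
        notes.append("looks like placeholder copy")
    if not re.search(r"^# \S", text, flags=re.MULTILINE):
        notes.append("no Markdown H1")
    if not re.search(r"^## \S", text, flags=re.MULTILINE):
        notes.append("no Markdown sections (H2)")
    return notes
```

An empty list means none of these advisory notes apply; it says nothing about how any AI system will actually use the file.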

FAQ

What is an AI crawler checker?

It checks whether AI-related crawlers and fetchers can access your public pages, and how model-usage controls apply to them, based on robots.txt, meta robots, and X-Robots-Tag. It also reports llms.txt, structured data, and visible summary copy as supporting context.

Does this guarantee AI search visibility?

No. WebsiteReady checks technical readiness signals and obvious crawler-policy conflicts. AI search visibility, citations, and recommendations depend on external systems and cannot be guaranteed.

Should I block AI crawlers?

For public personal tools and SEO-focused product sites, the best default is usually to maximize AI search discovery. Blocking AI crawlers is a separate policy choice and can reduce discoverability in AI answer/search experiences.

Is llms.txt required?

No. llms.txt is a proposed standard and should be treated as an optional AI-readiness signal, not a required ranking factor or launch blocker by itself.

Can I allow ChatGPT search but block model training?

Yes. Some platforms provide separate user agents or policy tokens for search, training, model usage, and user-triggered fetches. WebsiteReady surfaces that distinction so your robots.txt policy is intentional.
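
As a quick way to sanity-check such a policy yourself, Python's standard urllib.robotparser can evaluate a robots.txt body against specific tokens. The robots.txt content and the example.com URL below are illustrative assumptions, not WebsiteReady's recommended snippet; the two tokens are OpenAI's search and training bots mentioned earlier on this page.

```python
from urllib.robotparser import RobotFileParser

# Illustrative policy: allow the search bot, block the training bot.
ROBOTS_TXT = """\
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for bot in ("OAI-SearchBot", "GPTBot"):
    # can_fetch returns True when the token may fetch the given URL.
    print(bot, parser.can_fetch(bot, "https://example.com/pricing"))
# prints:
# OAI-SearchBot True
# GPTBot False
```

The standard-library parser is a convenient local check, but each crawler applies its own parsing rules, so treat the result as a sanity check rather than a guarantee.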