# robots-txt-generator
A robots.txt file at your site root tells crawlers which paths they may or may not access: not just Google and Bing, but also AI training and AI search bots (GPTBot, ClaudeBot, PerplexityBot, Google-Extended). In 2026, robots.txt is critical for AI search visibility: blocking AI bots means you can't be cited by ChatGPT, Claude, or Perplexity. The ZTools Robots.txt Generator builds a valid file with per-bot allow/deny rules, a sitemap directive, an AI-bot allowlist, and a bad-bot denylist (AhrefsBot, SemrushBot, and other scrapers that bring no value).
## Use cases
### New site launch
Day-one foundation. Without a robots.txt, crawlers default to crawling everything, which wastes crawl budget and may index admin or test pages.
### AI search optimization
Explicitly allow GPTBot, ClaudeBot, PerplexityBot, and Google-Extended for AI citation eligibility. Default-allow behavior is unreliable across bots; an explicit allowlist removes the ambiguity.
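RFC 9309 lets several `User-agent` lines share one rule group, so an explicit AI allowlist can stay compact. A sketch of what the allowlist section might look like (bot names from the list above; the generator's exact output may differ):

```txt
# AI search / assistant crawlers: explicitly allowed
User-agent: GPTBot
User-agent: ClaudeBot
User-agent: PerplexityBot
User-agent: Google-Extended
Allow: /
```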
### Block bad crawlers
AhrefsBot, SemrushBot, MJ12bot: these crawlers scrape your content to feed competitor SEO tools without sending you any traffic. Block them to save bandwidth.
### Staging / private pages
Disallow /admin, /search, and /preview to keep them out of search indexes. Note: robots.txt is a polite request; security-sensitive paths should be auth-gated, not just disallowed.
## How it works
- Pick a base mode: allow-all (default) or deny-all (private staging). Allow-all is the right starting point for public sites.
- Set the AI-bot policy: allow each AI bot explicitly (GPTBot, ChatGPT-User, ClaudeBot, anthropic-ai, PerplexityBot, Google-Extended, CCBot, Applebot).
- Add disallow paths: /admin, /search, /api, /preview, or any other paths you don't want indexed.
- Block bad bots: AhrefsBot, SemrushBot, MJ12bot, DotBot, SeekportBot, Bytespider, and other scrapers with no value to your traffic.
- Add the sitemap directive: `Sitemap: https://example.com/sitemap.xml` makes your sitemap discoverable without a GSC submission.
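Putting the steps together, a generated file for a public, AI-friendly site might look like this (illustrative output only; your paths, bot list, and domain will differ):

```txt
# AI search and assistant bots: explicitly allowed
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: ClaudeBot
User-agent: PerplexityBot
User-agent: Google-Extended
Allow: /

# SEO-tool scrapers: blocked
User-agent: AhrefsBot
User-agent: SemrushBot
User-agent: MJ12bot
Disallow: /

# Everyone else: crawl everything except private paths
User-agent: *
Disallow: /admin
Disallow: /search
Disallow: /preview

Sitemap: https://example.com/sitemap.xml
```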
## Examples
Input: Public marketing site, AI-friendly
Output: Allow-all + explicit AI-bot allowlist + Sitemap directive + bad-bot denylist.
Input: Staging / preview environment
Output: `User-agent: *` followed by `Disallow: /`, keeping the site out of all indexes.
Input: E-commerce, deny faceted-filter URLs
Output: Allow-all + Disallow: /*?filter= + Sitemap.
## Frequently asked questions
### Should I block AI bots?
Decision tree: (1) If your business depends on AI search visibility (citations in ChatGPT, Perplexity, Claude), allow them. (2) If you actively don't want your content used for AI training (e.g. a publisher with paid content), block the training crawlers GPTBot and CCBot; PerplexityBot is a search crawler rather than a training one, while Google-Extended is itself a training/grounding control for Gemini, so block it too if training opt-out is the goal.
### GPTBot vs ChatGPT-User vs OAI-SearchBot?
GPTBot crawls for training data. ChatGPT-User fetches a URL when a ChatGPT user requests it. OAI-SearchBot is the new dedicated search crawler. Allow all three for full ChatGPT integration.
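Since robots.txt groups can share rules, allowing all three OpenAI crawlers is one block (a sketch, not the tool's literal output):

```txt
User-agent: GPTBot
User-agent: ChatGPT-User
User-agent: OAI-SearchBot
Allow: /
```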
### Is robots.txt a security mechanism?
No. It's a polite request, not enforcement, and bad actors ignore it. Use auth, IP allowlists, or 401/403 responses for security-sensitive paths.
### How big can robots.txt be?
Google parses only the first 500 KiB; rules beyond that limit are ignored. Keep the file concise.
### Can I use wildcards?
Yes. Google and Bing support `*` (any sequence of characters) and `$` (end of URL). Example: `Disallow: /*.pdf$` blocks all PDF URLs.
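As a rough illustration of how these two wildcards behave, here is a minimal Python sketch that translates a robots.txt pattern into a regular expression. `rule_to_regex` is a hypothetical helper written for this page, not part of any robots library, and real crawlers additionally apply longest-match precedence between `Allow` and `Disallow` rules:

```python
import re

def rule_to_regex(pattern: str) -> re.Pattern:
    """Translate a robots.txt path pattern into an anchored regex.

    '*' matches any run of characters; a trailing '$' anchors the
    end of the URL. Everything else is matched literally.
    """
    anchored = pattern.endswith("$")
    core = pattern[:-1] if anchored else pattern
    body = "".join(".*" if ch == "*" else re.escape(ch) for ch in core)
    return re.compile("^" + body + ("$" if anchored else ""))

pdf_rule = rule_to_regex("/*.pdf$")
print(bool(pdf_rule.match("/docs/guide.pdf")))    # True: ends in .pdf
print(bool(pdf_rule.match("/docs/guide.pdf?x"))) # False: '$' anchors the end
print(bool(pdf_rule.match("/docs/guide.html")))  # False: wrong extension
```

The same helper shows why `Disallow: /*?filter=` catches faceted-filter URLs such as `/products?filter=red` regardless of the path before the query string.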
### Where does robots.txt live?
Always at the site root: `https://example.com/robots.txt`. A subdomain's robots.txt covers only that subdomain, and a robots.txt at any other path is ignored.
## Tips
- Always include `Sitemap: https://example.com/sitemap.xml` so your sitemap is discoverable without a manual GSC submission.
- Allow AI bots explicitly to remove ambiguity; default-allow behavior is unreliable across crawlers.
- Block AhrefsBot, SemrushBot, MJ12bot, and DotBot: scrapers with no traffic value.
- Test changes with Google's robots.txt Tester (legacy GSC) or live URL inspection before deploying.
- Don't use robots.txt to hide secrets; the disallowed paths are publicly listed in the file itself.
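Beyond Google's testers, you can sanity-check a generated file locally with Python's standard-library parser. Note that `urllib.robotparser` follows the original exclusion spec and does not understand `*`/`$` wildcards, so test literal paths only. The `ROBOTS` string below is a hypothetical generated file; substitute your own output:

```python
from urllib import robotparser

# Hypothetical generator output: one bad bot blocked,
# private paths disallowed for everyone else.
ROBOTS = """\
User-agent: AhrefsBot
Disallow: /

User-agent: *
Disallow: /admin
Disallow: /preview

Sitemap: https://example.com/sitemap.xml
"""

rp = robotparser.RobotFileParser()
rp.parse(ROBOTS.splitlines())

print(rp.can_fetch("Googlebot", "https://example.com/admin"))     # False
print(rp.can_fetch("Googlebot", "https://example.com/products"))  # True
print(rp.can_fetch("AhrefsBot", "https://example.com/"))          # False
```

Running this before deployment catches malformed groups and typo'd paths without waiting for a recrawl.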
## Try it now
The full robots-txt-generator runs in your browser at https://ztools.zaions.com/robots-txt-generator. No signup, no upload, and no data leaves your device.
Last updated: 2026-05-06 · Author: Ahsan Mahmood