Build a valid robots.txt file with user agents, disallow/allow rules, sitemap URL, and crawl delay settings.
robots.txt is a plain text file placed at the root of your website (e.g., example.com/robots.txt) that tells web crawlers which pages or sections they may and may not crawl. It follows the Robots Exclusion Protocol (standardized as RFC 9309), which all major search engines support.
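robots.txt is not a security mechanism: compliant crawlers follow it voluntarily, and nothing enforces it. It is still essential for managing crawl budget, keeping crawlers away from duplicate or low-value content, keeping compliant crawlers out of staging environments, and opting out of AI training crawlers. Note that Disallow stops crawling, not indexing: a blocked URL can still appear in search results if other pages link to it, and anything genuinely private needs authentication.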
A misconfigured robots.txt can either block search engines from your most important pages or invite crawlers into areas you'd rather keep out of search. This generator walks you through the decisions (which user agents, which paths to allow or disallow, where the sitemap lives) and produces a properly formatted file. It's easier and less error-prone than editing the file by hand.
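To make those decisions concrete, here is a minimal sketch of how a set of choices could be rendered as robots.txt text. It is not this generator's actual implementation; `RuleGroup` and `build_robots_txt` are hypothetical names invented for illustration, and the paths and sitemap URL are placeholders.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RuleGroup:
    """One User-agent group (hypothetical structure for this sketch)."""
    user_agent: str
    allow: list[str] = field(default_factory=list)
    disallow: list[str] = field(default_factory=list)
    crawl_delay: Optional[int] = None


def build_robots_txt(groups: list[RuleGroup],
                     sitemaps: Optional[list[str]] = None) -> str:
    """Render rule groups and sitemap URLs as robots.txt text."""
    blocks = []
    for group in groups:
        lines = [f"User-agent: {group.user_agent}"]
        # Allow lines are written first so that parsers using
        # first-match and parsers using longest-match agree.
        lines += [f"Allow: {path}" for path in group.allow]
        lines += [f"Disallow: {path}" for path in group.disallow]
        if group.crawl_delay is not None:
            lines.append(f"Crawl-delay: {group.crawl_delay}")
        blocks.append("\n".join(lines))
    if sitemaps:
        blocks.append("\n".join(f"Sitemap: {url}" for url in sitemaps))
    # Groups are separated by a blank line; the file ends with a newline.
    return "\n\n".join(blocks) + "\n"


example = build_robots_txt(
    [RuleGroup("*", allow=["/admin/public/"], disallow=["/admin/"])],
    sitemaps=["https://example.com/sitemap.xml"],
)
print(example)
```

Keeping each group as structured data and rendering it in one place is what makes the output consistent; hand-edited files tend to go wrong in exactly the spots this function fixes (group separation, directive spelling, trailing newline).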
See your robots.txt file update in real time as you configure rules. There is no need to click generate: the output reflects your settings instantly.
Create separate rule groups for different crawlers. Set specific rules for Googlebot, Bingbot, GPTBot, or any custom user agent.
Fine-grained control over which paths each crawler can access. Combine Disallow and Allow rules to create precise crawl policies.
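One way to sanity-check a combined Allow/Disallow policy is to parse it with Python's standard-library `urllib.robotparser` and query a few paths. This is a sketch under assumptions: `ExampleBot`, `is_allowed`, and the paths are placeholders. Python's parser applies the first rule that matches, whereas RFC 9309 and Google use the most specific (longest) match, so the Allow line is placed first here to get the same answer either way.

```python
from urllib.robotparser import RobotFileParser

# Placeholder rules: open one public subfolder inside a blocked directory.
RULES = """\
User-agent: *
Allow: /admin/public/
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under RULES."""
    return parser.can_fetch(agent, f"https://example.com{path}")


for path in ("/admin/public/page.html", "/admin/settings", "/blog/"):
    print(path, is_allowed(path))
```

Here `/admin/public/page.html` and `/blog/` come back allowed, while `/admin/settings` is blocked.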
Add your sitemap URL directly in the robots.txt file. This helps search engines discover your sitemap without relying on Search Console submission alone.
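Set crawl delay values to throttle aggressive crawlers. Useful for servers with limited resources or during high-traffic periods. Crawl-delay is not part of the core standard: some crawlers such as Bingbot honor it, while Googlebot ignores it.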
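As a rough illustration of how the Sitemap and Crawl-delay lines end up being read, the sketch below parses a sample file with `urllib.robotparser` and reads both values back. The file contents, the `/search/` path, and the `ExampleBot` name are assumptions for the example; `crawl_delay()` needs Python 3.6+ and `site_maps()` needs Python 3.8+.

```python
from urllib.robotparser import RobotFileParser

# Placeholder file: one throttled group, one open group, one sitemap.
ROBOTS_TXT = """\
User-agent: Bingbot
Crawl-delay: 10
Disallow: /search/

User-agent: *
Disallow:

Sitemap: https://example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Delay declared for the Bingbot group, in seconds.
bing_delay = parser.crawl_delay("Bingbot")
# Agents that fall through to the "*" group have no delay set here.
other_delay = parser.crawl_delay("ExampleBot")
# Sitemap lines are global; they do not belong to any group.
sitemaps = parser.site_maps()

print(bing_delay, other_delay, sitemaps)
```

Note that the delay applies only to the group that declares it, while the Sitemap line is independent of any User-agent group.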
Download the generated file as robots.txt ready to upload to your server root, or copy the content to your clipboard.
Control how search engines crawl your site. Block admin pages, staging areas, and low-value content while ensuring important pages are accessible.
Quickly generate robots.txt for new projects. Configure crawl rules during development and update them before launch.
Manage crawler access to protect server resources. Block aggressive bots, keep compliant crawlers out of directories that should not be crawled, and declare sitemaps.
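Block AI training crawlers such as GPTBot and CCBot, and opt out through the Google-Extended token (a control token that Google honors, not a separate crawler), so that compliant crawlers do not use your content for model training.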
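A policy like this could be sketched as one blocking group per token plus an open default group. The `block_agents` helper and the `/article` URL below are hypothetical, and the check only shows what a compliant parser would conclude; it cannot stop a crawler that ignores robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Tokens named in the text above; extend the list as needed.
AI_TOKENS = ["GPTBot", "CCBot", "Google-Extended"]


def block_agents(tokens):
    """Build robots.txt text that blocks `tokens` and allows everyone else."""
    groups = [f"User-agent: {token}\nDisallow: /" for token in tokens]
    # An empty Disallow value means "nothing is disallowed".
    groups.append("User-agent: *\nDisallow:")
    return "\n\n".join(groups) + "\n"


robots_txt = block_agents(AI_TOKENS)

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

gptbot_ok = parser.can_fetch("GPTBot", "https://example.com/article")
googlebot_ok = parser.can_fetch("Googlebot", "https://example.com/article")
print(gptbot_ok, googlebot_ok)
```

In this sketch GPTBot is refused while Googlebot falls through to the open default group, so regular search crawling is unaffected.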
| Directive | Purpose | Example |
|---|---|---|
| User-agent | Specifies which crawler the rules apply to | User-agent: Googlebot |
| Disallow | Blocks access to a path | Disallow: /admin/ |
| Allow | Overrides Disallow for specific paths | Allow: /admin/public/ |
| Sitemap | Declares your XML sitemap location | Sitemap: https://example.com/sitemap.xml |
| Crawl-delay | Seconds between requests (non-standard; honored by Bing, ignored by Google) | Crawl-delay: 10 |
| * (wildcard) | Matches any user agent | User-agent: * |