Robots.txt Generator

Generate a robots.txt file to control how search engines and web crawlers access your website.


What is robots.txt?

The robots.txt file is a text file placed in the root directory of a website that tells web crawlers (like search engine bots) which pages they can and cannot access. It's part of the Robots Exclusion Protocol (REP).

Basic Syntax

  • User-agent: Specifies which crawler the rule applies to (use * for all crawlers)
  • Disallow: Specifies paths that should not be crawled
  • Allow: Specifies paths that can be crawled (useful to override Disallow rules)
  • Sitemap: Points to the location of your XML sitemap
  • Crawl-delay: Asks crawlers to wait the given number of seconds between requests (not part of the original standard; some crawlers, including Googlebot, ignore it)
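
A minimal file combining all five directives (the paths and sitemap URL are placeholders):

User-agent: *
Disallow: /admin/
Allow: /admin/public/
Crawl-delay: 5

Sitemap: https://example.com/sitemap.xml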

Common User-Agents

  • Googlebot - Google's web crawler
  • Bingbot - Bing's web crawler
  • Slurp - Yahoo's web crawler
  • DuckDuckBot - DuckDuckGo's web crawler
  • Baiduspider - Baidu's web crawler
  • Yandex - Yandex's web crawler
  • facebookexternalhit - Facebook's crawler
  • Twitterbot - Twitter's crawler
  • * - All crawlers (wildcard)
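
These tokens go in the User-agent line. For example, to ask Bingbot alone to slow down while leaving all other crawlers unrestricted (the delay value here is arbitrary):

User-agent: Bingbot
Crawl-delay: 10

User-agent: *
Disallow: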

Examples

Allow Everything

Let all crawlers access everything:

User-agent: *
Disallow:

Block Everything

Prevent all crawlers from accessing any page:

User-agent: *
Disallow: /

Block Specific Directories

Allow crawlers but block admin and private areas:

User-agent: *
Disallow: /admin/
Disallow: /private/
Disallow: /temp/

Different Rules for Different Bots

Allow Google but block others:

User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /

With Sitemap

Include sitemap location:

User-agent: *
Disallow: /admin/

Sitemap: https://example.com/sitemap.xml
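
Blocking URL Patterns

Most major crawlers (Googlebot and Bingbot among them) also honor * and $ wildcards in paths, though this is an extension to the original REP and not every crawler supports it. For example, to block all PDF files:

User-agent: *
Disallow: /*.pdf$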

Important Notes

  • The robots.txt file must be placed in the root directory of your website (e.g., https://example.com/robots.txt)
  • Robots.txt is a recommendation, not a security measure - malicious bots may ignore it
  • Use robots.txt to manage crawler access, not to hide sensitive information - the file is publicly readable, so it can even reveal the paths you list
  • Changes to robots.txt may take time to be recognized by search engines
  • Blocking pages in robots.txt prevents crawling but doesn't guarantee removal from search results
  • Use noindex meta tags or X-Robots-Tag HTTP headers for more control over indexing
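
For illustration, the noindex directive can be delivered in the page markup or as a response header (the header form also covers non-HTML resources such as PDFs). In the HTML <head>:

<meta name="robots" content="noindex">

Or as an HTTP response header:

X-Robots-Tag: noindex

Note that a crawler can only see these directives if robots.txt allows it to fetch the page in the first place.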

Testing Your Robots.txt

Google Search Console includes a robots.txt report that shows which robots.txt files Google has found for your site, when each was last crawled, and any parsing errors or warnings (it replaced the older robots.txt Tester tool).
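
You can also check rules locally. As a minimal sketch, Python's standard-library urllib.robotparser parses robots.txt rules and answers allow/deny queries (the rules and paths below are placeholders):

from urllib.robotparser import RobotFileParser

# Parse rules directly, without a network request
rp = RobotFileParser()
rp.parse("""User-agent: *
Disallow: /admin/
Crawl-delay: 5""".splitlines())

print(rp.can_fetch("Googlebot", "/admin/settings"))  # False: /admin/ is disallowed
print(rp.can_fetch("Googlebot", "/blog/post"))       # True: no rule matches
print(rp.crawl_delay("*"))                           # 5

To test your live file instead, call rp.set_url("https://example.com/robots.txt") followed by rp.read() before querying.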


