📚 Learn About Robots.txt
What is robots.txt?
A robots.txt file tells web crawlers which pages or files they can or can't request from your site. It must sit at the root of the host (e.g., yoursite.com/robots.txt); crawlers only look for it there, and each subdomain needs its own file.
Basic Structure
# Allow all bots to crawl everything
User-agent: *
Allow: /
# Block specific directories
Disallow: /admin/
Disallow: /private/
# Sitemap location
Sitemap: https://yoursite.com/sitemap.xml
Remember: robots.txt is publicly accessible and purely advisory, not a security measure. Anyone can read it, and malicious crawlers can simply ignore it, so use proper authentication for sensitive content.
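Well-behaved crawlers check these rules before requesting a URL. Here is a minimal sketch of that consumer side, using Python's built-in urllib.robotparser (it handles plain prefix rules like those above, but not Google-style wildcards):

from urllib.robotparser import RobotFileParser

rp = RobotFileParser()
# Parse an inline copy of the rules above; in practice you would call
# rp.set_url("https://yoursite.com/robots.txt") followed by rp.read().
rp.parse("""\
User-agent: *
Disallow: /admin/
Disallow: /private/
""".splitlines())

print(rp.can_fetch("*", "https://yoursite.com/admin/settings"))  # False
print(rp.can_fetch("*", "https://yoursite.com/blog/hello"))      # True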
Prevent Duplicate Content
# Block parameter-based duplicates (these rules live inside a User-agent group, e.g. User-agent: *)
Disallow: /*?sort=
Disallow: /*?filter=
Disallow: /*?page=
Disallow: /*?utm_
Disallow: /*?ref=
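In the matching rules Google and RFC 9309 use, * matches any run of characters and $ anchors the end of the URL; everything else is a literal prefix. One caveat with the patterns above: /*?utm_ only matches when a utm_ parameter comes immediately after the ?, not after an &. A rough Python sketch of the matching logic, for illustration only:

import re

def rule_matches(rule: str, path: str) -> bool:
    """Approximate Google/RFC 9309 matching: '*' is a wildcard,
    a trailing '$' anchors the end, otherwise the rule is a prefix."""
    anchored = rule.endswith("$")
    pattern = re.escape(rule.rstrip("$")).replace(r"\*", ".*")
    return re.match(pattern + ("$" if anchored else ""), path) is not None

print(rule_matches("/*?sort=", "/products?sort=price"))       # True
print(rule_matches("/*?utm_", "/page?utm_source=mail"))       # True
print(rule_matches("/*?utm_", "/page?id=1&utm_source=mail"))  # False: utm_ follows '&'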
Conserve Crawl Budget
# Block low-value pages
Disallow: /search/
Disallow: /cart/
Disallow: /checkout/
Disallow: /account/
Disallow: /print/
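A Disallow rule without wildcards is a simple prefix match, so /search/ also blocks everything beneath it, such as /search/red+shoes. A quick way to sanity-check what a rule set would exclude, run here against a few made-up paths:

BLOCKED_PREFIXES = ["/search/", "/cart/", "/checkout/", "/account/", "/print/"]

def is_blocked(path: str) -> bool:
    # Plain Disallow rules match any path that starts with the rule text.
    return any(path.startswith(prefix) for prefix in BLOCKED_PREFIXES)

for path in ["/search/red+shoes", "/cart/items", "/blog/robots-txt-guide"]:
    print(path, "->", "blocked" if is_blocked(path) else "crawlable")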
Multiple Sitemaps
Sitemap: https://yoursite.com/sitemap-posts.xml
Sitemap: https://yoursite.com/sitemap-pages.xml
Sitemap: https://yoursite.com/sitemap-images.xml
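If the list keeps growing, the sitemaps.org protocol also allows a single sitemap index file that points at the others, so robots.txt needs only one Sitemap line (the file names here are illustrative):

<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://yoursite.com/sitemap-posts.xml</loc></sitemap>
  <sitemap><loc>https://yoursite.com/sitemap-pages.xml</loc></sitemap>
  <sitemap><loc>https://yoursite.com/sitemap-images.xml</loc></sitemap>
</sitemapindex>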
Bot-Specific Rules
# Allow major search engines
User-agent: Googlebot
Allow: /
User-agent: Bingbot
Crawl-delay: 2
Allow: /
# Block resource-heavy SEO crawlers (both are known to honor robots.txt)
User-agent: AhrefsBot
Disallow: /
User-agent: MJ12bot
Disallow: /
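A crawler obeys exactly one of these groups: the one whose User-agent value most specifically matches its own name, falling back to * when nothing matches. A simplified Python sketch of that selection (real matching is done on product tokens, but the longest-match idea is the same):

def select_group(crawler_name: str, groups: dict[str, list[str]]) -> list[str]:
    """Pick the group whose user-agent token is the longest case-insensitive
    prefix of the crawler's name; fall back to the '*' group."""
    name = crawler_name.lower()
    candidates = [ua for ua in groups if ua != "*" and name.startswith(ua.lower())]
    best = max(candidates, key=len, default="*")
    return groups.get(best, [])

groups = {
    "Googlebot": ["Allow: /"],
    "Bingbot": ["Crawl-delay: 2", "Allow: /"],
    "AhrefsBot": ["Disallow: /"],
    "*": [],
}
print(select_group("Googlebot-Image", groups))  # follows the Googlebot group
print(select_group("DuckDuckBot", groups))      # no match, so the '*' group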
Smart API Protection
# Expose only the schema endpoint used for rich snippets; block the rest of the API
Allow: /api/schema/
Disallow: /api/
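The order of the two lines doesn't decide the winner. Under Google's rules and RFC 9309, the most specific (longest) matching rule wins, and Allow wins a tie, which is why /api/schema/ stays crawlable while the rest of /api/ is blocked. A sketch of that tie-break, reusing the rule_matches helper from earlier:

def is_allowed(path: str, rules: list[tuple[str, str]]) -> bool:
    """rules are (directive, pattern) pairs. The longest matching pattern
    wins; 'allow' beats 'disallow' at equal length. No match means allowed."""
    hits = [(len(pat), d == "allow") for d, pat in rules if rule_matches(pat, path)]
    return max(hits)[1] if hits else True

rules = [("allow", "/api/schema/"), ("disallow", "/api/")]
print(is_allowed("/api/schema/product", rules))  # True: the longer Allow wins
print(is_allowed("/api/users/42", rules))        # False: only Disallow matches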
Performance Optimization
# Crawl-delay is the number of seconds a bot should wait between requests.
# Bing and Yandex honor it; Googlebot ignores it entirely.
User-agent: Bingbot
Crawl-delay: 1
# Heavily throttle everything else: one request every 30 seconds,
# i.e. at most 2,880 requests per day
User-agent: *
Crawl-delay: 30