
Free Robots.txt Tester

Check if your robots.txt file exists, is properly configured, and doesn't accidentally block important pages from search engines.


What is Robots.txt?

Robots.txt is a text file placed in your website's root directory that tells search engine crawlers which pages or sections of your site they can and cannot access. It's the first file crawlers check when visiting your site.
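
If you want to check by hand, the file is simply fetched from the domain root. Below is a minimal sketch in Python using only the standard library; https://example.com is a placeholder for the site you want to test.

# Check whether a site serves a robots.txt file at its root.
# "example.com" is a placeholder; substitute the domain you want to test.
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

url = "https://example.com/robots.txt"
try:
    with urlopen(url, timeout=10) as response:
        print(f"Found robots.txt (HTTP {response.status})")
        print(response.read().decode("utf-8", errors="replace"))
except HTTPError as err:
    print(f"Request returned HTTP {err.code} - robots.txt may be missing")
except URLError as err:
    print(f"Could not reach the site: {err.reason}")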

Why Robots.txt Matters

  • Crawl Budget: Prevent crawlers from wasting time on unimportant pages
  • Privacy: Keep crawlers out of admin pages and other areas you don't want crawled (blocked URLs can still be indexed if they are linked elsewhere)
  • Duplicate Content: Stop crawlers from spending time on duplicate or low-value pages
  • Sitemap Declaration: Tell crawlers where to find your XML sitemap
  • Crawl Rate: Ask crawlers that honor Crawl-delay to wait between requests

Common Robots.txt Directives

User-agent: Specifies which crawler the rules apply to

Disallow: Paths that crawlers should not access

Allow: Paths that crawlers can access (a more specific Allow overrides a broader Disallow)

Sitemap: Location of your XML sitemap

Crawl-delay: Seconds to wait between requests (honored by Bing and Yandex; ignored by Googlebot)
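
To see how these directives are interpreted for a particular crawler, they can be queried with Python's built-in urllib.robotparser. A minimal sketch, where https://example.com is a placeholder for the site you want to inspect:

# Load a live robots.txt and query its directives for a specific crawler.
import urllib.robotparser

rp = urllib.robotparser.RobotFileParser()
rp.set_url("https://example.com/robots.txt")
rp.read()

print(rp.can_fetch("Googlebot", "https://example.com/admin/"))  # False if /admin/ is disallowed for Googlebot
print(rp.site_maps())  # List of declared Sitemap URLs, or None (Python 3.8+)

Keep in mind that urllib.robotparser resolves conflicting rules in file order, while Google applies the most specific matching rule, so results can differ in edge cases.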

Example Robots.txt

User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /public/

User-agent: Googlebot
Crawl-delay: 10
# Note: Googlebot ignores Crawl-delay

Sitemap: https://example.com/sitemap.xml
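
Before deploying rules like these, it helps to verify how they resolve. The sketch below feeds the core rules from the example (minus the Googlebot group) into Python's urllib.robotparser from an in-memory string:

# Parse the example rules and test which URLs they allow.
import urllib.robotparser

rules = """\
User-agent: *
Disallow: /admin/
Disallow: /private/
Allow: /public/

Sitemap: https://example.com/sitemap.xml
"""

rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("*", "https://example.com/admin/settings"))  # False: /admin/ is disallowed
print(rp.can_fetch("*", "https://example.com/public/page"))     # True: /public/ is explicitly allowed
print(rp.can_fetch("*", "https://example.com/blog/post"))       # True: no rule matches, so access is allowed

Note that a crawler follows only the most specific group matching its user agent, so in the full example above Googlebot would obey its own group rather than the rules under User-agent: *.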

Common Mistakes to Avoid

  • Accidentally blocking your entire site with "Disallow: /"
  • Blocking important pages such as your homepage or product pages
  • Forgetting to include the sitemap URL
  • Using robots.txt to hide sensitive data (it is not a security control - use proper authentication)
  • Not testing changes before deploying (a minimal automated check is sketched below)
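
Some of these mistakes are easy to catch automatically. Below is a minimal sketch of a pre-deployment check, assuming the rules are available as a string; check_robots_txt is an illustrative helper, not a complete validator.

# Minimal sanity check for a robots.txt string before deploying it.
def check_robots_txt(text: str) -> list:
    warnings = []
    # Strip comments and whitespace, then normalize for comparison.
    lines = [line.split("#", 1)[0].strip() for line in text.splitlines()]
    normalized = [line.lower().replace(" ", "") for line in lines]
    if "disallow:/" in normalized:
        warnings.append('Contains "Disallow: /" - this blocks the entire site for the affected crawlers')
    if not any(line.startswith("sitemap:") for line in normalized):
        warnings.append("No Sitemap directive found")
    return warnings

for warning in check_robots_txt("User-agent: *\nDisallow: /"):
    print(warning)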