CLAUDEBOT
LOW RISK · 🔍 SEARCH & AI CRAWLER
Anthropic's web crawler for training Claude AI models
📡 CLAUDEBOT USER-AGENT STRING
Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +claudebot@anthropic.com)
This is the User-Agent header sent by ClaudeBot in HTTP requests. Use this to identify ClaudeBot in your server access logs.
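A minimal sketch of scanning an access log for this User-Agent. The log path and combined-log format are assumptions; adapt the matching to your server's log layout.

```python
# Count requests whose User-Agent contains "ClaudeBot" in an access log.
# Assumes one request per line (e.g. Apache/Nginx combined format);
# adjust for your own server's log layout.

UA_MARKER = "ClaudeBot"

def count_claudebot_hits(log_path: str) -> int:
    hits = 0
    with open(log_path, encoding="utf-8", errors="replace") as f:
        for line in f:
            if UA_MARKER in line:
                hits += 1
    return hits
```

A substring match is deliberately loose here: it catches any version of the crawler, but it will also match spoofed User-Agents, so pair it with network-level checks.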
📋 ABOUT CLAUDEBOT
ClaudeBot is Anthropic's official web crawler, used to gather publicly available content from the internet for training Claude, Anthropic's family of AI assistants. First observed in late 2023, ClaudeBot systematically indexes web pages to build the training datasets that power Claude's conversational and analytical abilities.
ClaudeBot operates in compliance with robots.txt directives, and Anthropic provides clear opt-out mechanisms for website owners who prefer that their content not be used for AI training. The crawler identifies itself transparently in the User-Agent string and crawls at moderate rates to avoid impacting site performance. It does not execute JavaScript, focusing purely on HTML content extraction.
NORAD.io monitors ClaudeBot activity globally, tracking crawl volumes, geographic patterns, and compliance with site policies. Through the NORAD radar network, site operators gain real-time insight into when and how frequently ClaudeBot accesses their content, enabling informed decisions about AI training data access controls.
🎯 HOW TO DETECT CLAUDEBOT
- ▸ Look for 'ClaudeBot' in the User-Agent header
- ▸ Verify source IP addresses against Anthropic's published IP ranges, since User-Agent strings alone can be spoofed
- ▸ Does not execute JavaScript or load external resources
- ▸ Respects Crawl-delay directives in robots.txt
- ▸ Typically crawls from AWS IP ranges
🌐 CLAUDEBOT KNOWN IP RANGES
160.79.104.0/23
13.56.0.0/16
Use these CIDR ranges to verify ClaudeBot identity at the network level. Always combine with User-Agent verification for accurate detection.
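The network-level check can be sketched with Python's standard ipaddress module. The CIDR ranges below are the ones listed on this page; confirm current ranges with Anthropic before relying on them, and note that the second range is a broad AWS block.

```python
# Two-step check: User-Agent substring match plus source-IP membership
# in a published CIDR range. Ranges taken from this page; verify current
# values before deploying.
import ipaddress

CLAUDEBOT_RANGES = [
    ipaddress.ip_network("160.79.104.0/23"),
    ipaddress.ip_network("13.56.0.0/16"),  # broad AWS range; weak signal on its own
]

def looks_like_claudebot(user_agent: str, remote_ip: str) -> bool:
    if "ClaudeBot" not in user_agent:
        return False
    addr = ipaddress.ip_address(remote_ip)
    return any(addr in net for net in CLAUDEBOT_RANGES)
```

Requiring both signals means a spoofed User-Agent from an unrelated IP, or legitimate AWS traffic without the User-Agent, is rejected.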
🔄 CRAWL BEHAVIOR
Moderate crawl rate with polite intervals between requests. Respects robots.txt and crawl-delay directives. Primarily fetches HTML content without JavaScript execution.
Collects publicly available web content to train Anthropic's Claude family of AI models. Used for pre-training data collection to improve Claude's knowledge and capabilities.
🤖 ROBOTS.TXT CONFIGURATION
User-agent: ClaudeBot
Disallow: /private/
Disallow: /api/

# To block completely:
# User-agent: ClaudeBot
# Disallow: /
ClaudeBot respects robots.txt directives. Add this to your robots.txt file at the root of your domain.
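You can sanity-check rules like these with Python's standard urllib.robotparser before publishing them. This sketch feeds the directives in directly rather than fetching robots.txt over HTTP; the example paths are illustrative.

```python
# Verify which paths the robots.txt rules above allow ClaudeBot to fetch.
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: ClaudeBot
Disallow: /private/
Disallow: /api/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.can_fetch("ClaudeBot", "/private/data.html"))  # False
print(parser.can_fetch("ClaudeBot", "/blog/post.html"))     # True
```

This only confirms what a compliant crawler would do; it does not enforce anything, so blocking non-compliant bots still requires server-side controls.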
⚠️ RELATED THREATS
Prompt Injection: Attempts to override bot instructions via malicious content embedded in web pages
Data Exfiltration: Bots attempting to extract sensitive data from websites, including PII and credentials
Credential Stuffing: Automated login attempts using leaked credentials from data breaches
Aggressive Content Scraping: Bots scraping content beyond robots.txt limits and terms of service
PROTECT YOUR WEBSITE
Deploy SiteTrust to monitor and control AI bot access to your site with the Agent Passport Standard.
INSTALL SITETRUST →