AI BOT & CRAWLER DATABASE

40 bot types tracked across the NORAD.io global radar network. Complete reference with User-Agent strings, IP ranges, detection guides, and robots.txt configurations.

40 BOTS TRACKED
22 CRAWLERS
5 HIGH RISK
27 ORGANIZATIONS

WHAT ARE AI BOTS?

AI bots are automated programs that access websites to collect data for artificial intelligence systems. They include search engine crawlers like Googlebot and Bingbot that index content for search results, AI training crawlers like GPTBot and ClaudeBot that collect data for training large language models (LLMs), and AI assistant browsers like ChatGPT-User and Perplexity-User that fetch pages in real time during AI conversations.

The rise of generative AI has dramatically increased bot traffic across the web. In 2025-2026, AI-related crawlers account for a growing share of website traffic and can exceed human visits on some content-heavy sites. Understanding which AI bots access your site, and controlling that access through robots.txt, User-Agent detection, and IP-level policies, is essential for modern web operations.
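As a rough illustration of the robots.txt control mentioned above, the following sketch uses Python's standard-library `urllib.robotparser` to check what a policy permits. The policy text and the `example.com` URL are assumptions made up for this example; only the bot names come from this page. Keep in mind that robots.txt is advisory, so it only affects bots that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical policy: block two AI training crawlers, allow everything else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Illustrative URL only; any path on the site is covered by "Disallow: /".
    for bot in ("GPTBot", "ClaudeBot", "Googlebot"):
        print(bot, is_allowed(ROBOTS_TXT, bot, "https://example.com/articles/1"))
```

Testing a policy this way before deploying it can catch rules that accidentally block search crawlers along with AI training crawlers.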

NORAD.io monitors all major AI bots globally, providing real-time visibility into crawl activity, behavioral patterns, and compliance with site access policies. Each bot profile below includes the complete User-Agent string, known IP ranges, detection tips, and robots.txt configuration examples.
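One way to use known IP ranges is to check whether a request that claims a bot's User-Agent actually comes from an address that bot is expected to use. The sketch below relies on Python's standard `ipaddress` module. The ranges shown are placeholders taken from the reserved documentation address blocks, not real crawler ranges, and `KNOWN_RANGES` and `ip_matches_bot` are hypothetical names for this example.

```python
import ipaddress

# Placeholder ranges from reserved documentation blocks, NOT real crawler
# ranges. Substitute the ranges listed in each bot's profile.
KNOWN_RANGES = {
    "GPTBot": ["192.0.2.0/24"],
    "ClaudeBot": ["198.51.100.0/24", "2001:db8::/32"],
}


def ip_matches_bot(ip: str, bot: str) -> bool:
    """Return True if ip falls inside any range listed for the named bot."""
    address = ipaddress.ip_address(ip)
    for cidr in KNOWN_RANGES.get(bot, []):
        network = ipaddress.ip_network(cidr)
        # Only compare addresses and networks of the same IP version.
        if address.version == network.version and address in network:
            return True
    return False


if __name__ == "__main__":
    print(ip_matches_bot("192.0.2.17", "GPTBot"))
    print(ip_matches_bot("203.0.113.5", "GPTBot"))
```

A request that presents a bot's User-Agent but fails this check is a candidate for closer inspection, since User-Agent strings are easy to spoof.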

📋 ALL TRACKED BOTS

BOT | ORGANIZATION | RISK
GPTBot | OpenAI | LOW
ClaudeBot | Anthropic | LOW
ChatGPT-User | OpenAI | LOW
Googlebot | Google | LOW
Google-Extended | Google | LOW
Bingbot | Microsoft | LOW
PerplexityBot | Perplexity AI | LOW
Bytespider | ByteDance | MEDIUM
CCBot | Common Crawl | LOW
Amazonbot | Amazon | LOW
FacebookBot | Meta | LOW
AhrefsBot | Ahrefs | LOW
SemrushBot | Semrush | LOW
Applebot | Apple | LOW
YandexBot | Yandex | LOW
Headless Chrome | Unknown | HIGH
Playwright | Microsoft | HIGH
Scrapy | Open Source | MEDIUM
Python Requests | Unknown | MEDIUM
DuckDuckBot | DuckDuckGo | LOW
Puppeteer | Google | HIGH
cURL | Open Source | MEDIUM
Selenium | Open Source | HIGH
Twitterbot | X (Twitter) | LOW
LinkedInBot | LinkedIn | LOW
Baiduspider | Baidu | LOW
Sogou Spider | Sogou | LOW
MJ12bot | Majestic | LOW
DotBot | Moz | LOW
Anthropic-AI | Anthropic | LOW
OAI-SearchBot | OpenAI | LOW
Perplexity-User | Perplexity AI | LOW
Claude-Web | Anthropic | LOW
Meta-ExternalAgent | Meta | MEDIUM
Cohere-AI | Cohere | LOW
AI2Bot | Allen Institute for AI | LOW
YouBot | You.com | LOW
PetalBot | Huawei | LOW
DataForSeoBot | DataForSEO | LOW
PhantomJS | Open Source | HIGH

🔍 SEARCH & AI CRAWLERS

Bots from search engines and AI companies that systematically crawl web content for indexing and AI model training.

GPTBot [LOW]
OpenAI

OpenAI's web crawler used for training GPT models and improving AI capabilities

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatibl...

ClaudeBot [LOW]
Anthropic

Anthropic's web crawler for training Claude AI models

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatibl...

Googlebot [LOW]
Google

Google's primary web crawler for search indexing and the most important bot for SEO

Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.c...

Google-Extended [LOW]
Google

Google's AI training control for Gemini, separate from Googlebot search indexing. Google documents Google-Extended as a robots.txt token rather than a crawler with its own User-Agent.

Mozilla/5.0 (compatible; Google-Extended; +https://developer...

Bingbot [LOW]
Microsoft

Microsoft Bing's web crawler for search indexing and AI features

Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/b...

Bytespider [MEDIUM]
ByteDance

ByteDance's aggressive web crawler, one of the most active bots on the internet

Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, ...

CCBot [LOW]
Common Crawl

Common Crawl's open web archival bot, which builds one of the largest open datasets of web content

CCBot/2.0 (https://commoncrawl.org/faq/)

Amazonbot [LOW]
Amazon

Amazon's web crawler for Alexa answers and Amazon search features

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/...

FacebookBot [LOW]
Meta

Meta's crawler that fetches pages for link previews on Facebook and Instagram

facebookexternalhit/1.1 (+http://www.facebook.com/externalhi...

Applebot [LOW]
Apple

Apple's web crawler for Siri, Spotlight search, and Apple Intelligence features

Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/...

YandexBot [LOW]
Yandex

Crawler for Yandex, Russia's largest search engine

Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/b...

DuckDuckBot [LOW]
DuckDuckGo

DuckDuckGo's web crawler for its privacy-focused search engine

DuckDuckBot/1.1; (+http://duckduckgo.com/duckduckbot.html)

Twitterbot [LOW]
X (Twitter)

Twitter/X's crawler for generating link preview cards in tweets

Twitterbot/1.0

LinkedInBot [LOW]
LinkedIn

LinkedIn's crawler for generating link previews in posts and messages

LinkedInBot/1.0 (compatible; Mozilla/5.0; Apache-HttpClient ...

Baiduspider [LOW]
Baidu

Web crawler for Baidu, China's largest search engine

Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu....

Sogou Spider [LOW]
Sogou

Crawler for Sogou, China's third-largest search engine (Tencent-owned)

Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmast...

OAI-SearchBot [LOW]
OpenAI

OpenAI's search grounding crawler, which fetches pages for ChatGPT Search and SearchGPT results

Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatibl...

Meta-ExternalAgent [MEDIUM]
Meta

Meta's AI training crawler, which collects web data for Meta's Llama and other AI models

Mozilla/5.0 (compatible; Meta-ExternalAgent/1.0; +https://de...

Cohere-AI [LOW]
Cohere

Cohere's web crawler for training enterprise AI and retrieval-augmented generation models

Mozilla/5.0 (compatible; cohere-ai; +https://cohere.com/craw...

AI2Bot [LOW]
Allen Institute for AI

Allen Institute for AI's crawler for academic AI research and open models

Mozilla/5.0 (compatible; AI2Bot/1.0; +https://allenai.org/cr...

YouBot [LOW]
You.com

You.com's web crawler for its AI-powered search engine and assistant

Mozilla/5.0 (compatible; YouBot/1.0; +https://about.you.com/...

PetalBot [LOW]
Huawei

Huawei's web crawler for Petal Search, Huawei's search engine for HarmonyOS devices

Mozilla/5.0 (compatible; PetalBot;+https://webmaster.petalse...

🤖 AI ASSISTANTS

Bots that fetch pages in real time during AI assistant conversations (ChatGPT browsing, Perplexity search, Claude web access).

📊 SEO & DATA SCRAPERS

Crawlers from SEO tools and data companies that analyze site structure, backlinks, and content for marketing intelligence.

AUTOMATION AGENTS

Browser automation tools and headless browsers used for testing, scraping, and automated web interactions.
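Some of these tools announce themselves in the User-Agent header and some do not. In the quick reference on this page, Headless Chrome and Puppeteer carry a `HeadlessChrome` token and PhantomJS, Scrapy, Python Requests, and cURL carry their own names, while the Playwright and Selenium strings are ordinary Chrome strings. The sketch below flags only the self-identifying cases; `AUTOMATION_TOKENS` and `detect_automation` are hypothetical names for this example, and this is not a description of how NORAD.io detects automation.

```python
from typing import Optional

# Substrings taken from the User-Agent strings listed on this page.
AUTOMATION_TOKENS = {
    "HeadlessChrome": "Headless Chrome / Puppeteer",
    "PhantomJS": "PhantomJS",
    "Scrapy": "Scrapy",
    "python-requests": "Python Requests",
    "curl/": "cURL",
}


def detect_automation(user_agent: str) -> Optional[str]:
    """Return a tool label if the User-Agent self-identifies, else None."""
    lowered = user_agent.lower()
    for token, label in AUTOMATION_TOKENS.items():
        if token.lower() in lowered:
            return label
    return None


if __name__ == "__main__":
    print(detect_automation("curl/8.5.0"))
    print(detect_automation("python-requests/2.31.0"))
```

Because a plain Chrome string returns `None` here, User-Agent matching alone cannot identify Playwright or Selenium sessions; other signals such as IP reputation or request behavior would be needed for those.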

📡 ALL AI BOT USER-AGENT STRINGS

Quick reference list of all tracked User-Agent strings. Use these to identify bots in your server access logs and configure detection rules.

GPTBot: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.2; +https://openai.com/gptbot)
ClaudeBot: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ClaudeBot/1.0; +https://claudebot.ai)
ChatGPT-User: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; ChatGPT-User/1.0; +https://openai.com/bot)
Googlebot: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Google-Extended: Mozilla/5.0 (compatible; Google-Extended; +https://developers.google.com/search/docs/crawling-indexing/google-common-crawlers)
Bingbot: Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)
PerplexityBot: Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)
Bytespider: Mozilla/5.0 (Linux; Android 5.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.66 Mobile Safari/537.36 (compatible; Bytespider; spider-feedback@bytedance.com)
CCBot: CCBot/2.0 (https://commoncrawl.org/faq/)
Amazonbot: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36 (compatible; Amazonbot/0.1; +https://developer.amazon.com/support/amazonbot)
FacebookBot: facebookexternalhit/1.1 (+http://www.facebook.com/externalhit_uatext.php)
AhrefsBot: Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)
SemrushBot: Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)
Applebot: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_5) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/13.1.1 Safari/605.1.15 (Applebot/0.1; +http://www.apple.com/go/applebot)
YandexBot: Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)
Headless Chrome: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36
Playwright: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
Scrapy: Scrapy/2.11 (+https://scrapy.org)
Python Requests: python-requests/2.31.0
DuckDuckBot: DuckDuckBot/1.1; (+http://duckduckgo.com/duckduckbot.html)
Puppeteer: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36
cURL: curl/8.5.0
Selenium: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36
Twitterbot: Twitterbot/1.0
LinkedInBot: LinkedInBot/1.0 (compatible; Mozilla/5.0; Apache-HttpClient +http://www.linkedin.com)
Baiduspider: Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
Sogou Spider: Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
MJ12bot: Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)
DotBot: Mozilla/5.0 (compatible; DotBot/1.2; +https://opensiteexplorer.org/dotbot; help@moz.com)
Anthropic-AI: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Anthropic-AI; +https://anthropic.com)
OAI-SearchBot: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; OAI-SearchBot/1.0; +https://openai.com/searchbot)
Perplexity-User: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Perplexity-User/1.0; +https://perplexity.ai)
Claude-Web: Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Claude-Web/1.0; +https://anthropic.com/claude-web)
Meta-ExternalAgent: Mozilla/5.0 (compatible; Meta-ExternalAgent/1.0; +https://developers.facebook.com/docs/sharing/webmasters/crawler)
Cohere-AI: Mozilla/5.0 (compatible; cohere-ai; +https://cohere.com/crawler)
AI2Bot: Mozilla/5.0 (compatible; AI2Bot/1.0; +https://allenai.org/crawler)
YouBot: Mozilla/5.0 (compatible; YouBot/1.0; +https://about.you.com/youbot/)
PetalBot: Mozilla/5.0 (compatible; PetalBot;+https://webmaster.petalsearch.com/site/petalbot)
DataForSeoBot: Mozilla/5.0 (compatible; DataForSeoBot/1.0; +https://dataforseo.com/dataforseo-bot)
PhantomJS: Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/538.1 (KHTML, like Gecko) PhantomJS/2.1.1 Safari/538.1
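To show how these strings might be used against server access logs, here is a minimal sketch that counts requests per bot. It assumes the common "combined" log format, where the User-Agent is the last quoted field on each line. The sample log lines, `BOT_TOKENS`, and `count_bots` are made up for illustration, and the token list is only a subset of the bots above.

```python
import re
from collections import Counter

# A subset of the bot names above, matched as case-insensitive substrings.
BOT_TOKENS = ["GPTBot", "ClaudeBot", "ChatGPT-User", "OAI-SearchBot",
              "PerplexityBot", "Bytespider", "CCBot", "Googlebot", "bingbot"]

# In the combined log format the User-Agent is the last quoted field.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')

# Hypothetical log lines using documentation IP addresses.
SAMPLE_LINES = [
    '192.0.2.10 - - [01/Jan/2026:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '192.0.2.11 - - [01/Jan/2026:00:00:02 +0000] "GET /a HTTP/1.1" 200 128 "-" '
    '"CCBot/2.0 (https://commoncrawl.org/faq/)"',
    '192.0.2.10 - - [01/Jan/2026:00:00:03 +0000] "GET /b HTTP/1.1" 200 256 "-" '
    '"Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GPTBot/1.2; +https://openai.com/gptbot)"',
    '198.51.100.7 - - [01/Jan/2026:00:00:04 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (Windows NT 10.0; Win64; x64) Firefox/121.0"',
]


def count_bots(log_lines):
    """Count requests per bot token across access-log lines."""
    counts = Counter()
    for line in log_lines:
        match = UA_FIELD.search(line)
        if not match:
            continue  # line has no trailing quoted field
        user_agent = match.group(1).lower()
        for token in BOT_TOKENS:
            if token.lower() in user_agent:
                counts[token] += 1
                break  # count each request at most once
    return counts


if __name__ == "__main__":
    for bot, hits in count_bots(SAMPLE_LINES).most_common():
        print(bot, hits)
```

Substring matching identifies only what a client claims to be. Since any client can send any User-Agent, counts like these are best treated as a first pass and cross-checked against IP ranges where that matters.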

PROTECT YOUR WEBSITE

Deploy SiteTrust to monitor and control AI bot access to your site with the Agent Passport Standard.
