PERPLEXITYBOT
LOW RISK · 🤖 AI ASSISTANT BROWSER
Perplexity AI's web indexing crawler for its AI-powered search engine
📡 PERPLEXITYBOT USER-AGENT STRING
Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)
This is the User-Agent header sent by PerplexityBot in HTTP requests. Use this to identify PerplexityBot in your server access logs.
📋 ABOUT PERPLEXITYBOT
PerplexityBot is the web crawler operated by Perplexity AI, one of the fastest-growing AI-powered search engines. Unlike traditional search engines that return a list of links, Perplexity provides direct, cited answers by synthesizing information from multiple web sources. PerplexityBot indexes content to build the retrieval corpus that powers these AI-generated answers.
PerplexityBot respects robots.txt directives and identifies itself clearly in the User-Agent string. The crawler targets content-rich pages that are likely to help answer user queries. It does not render JavaScript; it extracts text content from the HTML as served. Perplexity has faced public criticism over its crawling practices, including reports that some of its traffic did not honor robots.txt, so site operators should verify the crawler's behavior in their own logs.
NORAD.io tracks PerplexityBot activity to provide transparency into how this rapidly growing AI search engine accesses web content. With Perplexity's growing market share in AI-powered search, understanding and controlling PerplexityBot access is increasingly important for content publishers.
🎯 HOW TO DETECT PERPLEXITYBOT
- User-Agent contains 'PerplexityBot'
- Crawl patterns focus on content-rich pages rather than navigation pages
- Does not execute JavaScript or render pages
- Follows sitemaps when available
- Request frequency is moderate, less aggressive than major search engine crawlers
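The first signal above can be checked against access logs with a short script. The following is a minimal sketch, assuming the server writes the common "combined" log format; the names `LOG_LINE`, `is_perplexitybot`, `perplexitybot_paths`, and the `SAMPLE` lines are illustrative, not part of any NORAD.io or Perplexity tooling. A User-Agent header can be spoofed, so a match indicates a claimed identity only.

```python
import re
from collections import Counter

# Assumption: logs use the combined log format, where the request line,
# status, size, referrer, and User-Agent appear in that order.
LOG_LINE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<ua>[^"]*)"'
)

def is_perplexitybot(user_agent: str) -> bool:
    """Return True if the User-Agent string contains 'PerplexityBot' (case-insensitive)."""
    return "perplexitybot" in user_agent.lower()

def perplexitybot_paths(lines):
    """Count requested paths for log lines whose User-Agent claims to be PerplexityBot."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and is_perplexitybot(match.group("ua")):
            counts[match.group("path")] += 1
    return counts

# Hypothetical sample lines using documentation-reserved IP addresses.
SAMPLE = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET /blog/post HTTP/1.1" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://perplexity.ai/perplexitybot)"',
    '198.51.100.4 - - [01/Jan/2025:00:00:02 +0000] "GET / HTTP/1.1" 200 1024 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64)"',
]

if __name__ == "__main__":
    for path, hits in perplexitybot_paths(SAMPLE).most_common():
        print(path, hits)
```

Grouping hits by path also makes it easier to judge whether the traffic concentrates on content-rich pages, as described in the list above.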
🔄 CRAWL BEHAVIOR
PerplexityBot crawls web pages to index them for Perplexity's AI-powered search engine, which provides cited, conversational answers to user queries. It crawls at a moderate frequency, does not execute JavaScript, and follows sitemaps and link structures.
Content is indexed for retrieval, not used for model training.
🤖 ROBOTS.TXT CONFIGURATION
User-agent: PerplexityBot
Allow: /

# To block:
# User-agent: PerplexityBot
# Disallow: /
PerplexityBot respects robots.txt directives. Add this to your robots.txt file at the root of your domain.
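Before deploying a rule, it can help to confirm how a standards-following parser reads it. The sketch below uses Python's standard-library `urllib.robotparser`; the helper `can_fetch`, the rule constants, and the `example.com` URL are illustrative assumptions. It shows how the rules are interpreted, not how PerplexityBot itself behaves.

```python
from urllib.robotparser import RobotFileParser

# The two configurations discussed above, as they would appear in robots.txt.
ALLOW_RULES = """\
User-agent: PerplexityBot
Allow: /
"""

BLOCK_RULES = """\
User-agent: PerplexityBot
Disallow: /
"""

def can_fetch(robots_txt: str, agent: str, url: str) -> bool:
    """Parse robots.txt text and report whether `agent` may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)

if __name__ == "__main__":
    url = "https://example.com/articles/1"  # hypothetical URL
    print("allow rules:", can_fetch(ALLOW_RULES, "PerplexityBot", url))
    print("block rules:", can_fetch(BLOCK_RULES, "PerplexityBot", url))
```

With the blocking rules, only the PerplexityBot group is affected; other user agents remain allowed unless they have their own group or a `*` group applies.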
⚠️ RELATED THREATS
Prompt Injection: Attempts to override bot instructions via malicious content embedded in web pages
Data Exfiltration: Bots attempting to extract sensitive data from websites, including PII and credentials
Credential Stuffing: Automated login attempts using leaked credentials from data breaches
Aggressive Content Scraping: Bots aggressively scraping content beyond robots.txt limits and ToS