BAIDUSPIDER

LOW RISK
🔍 SEARCH & AI CRAWLER

Web crawler for Baidu, China's largest search engine

ORGANIZATION
Baidu
FIRST SEEN
2004-01
RESPECTS ROBOTS.TXT
✓ YES
DOCUMENTATION
www.baidu.com

📡 BAIDUSPIDER USER-AGENT STRING

Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)

This is the User-Agent header sent by Baiduspider in HTTP requests. Use this to identify Baiduspider in your server access logs.
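
As a minimal sketch of how that log check might look, the Python snippet below matches the Baiduspider token in a User-Agent string or a raw log line. The function names and the regular expression are illustrative assumptions, not part of any Baidu or NORAD tooling, and a matching User-Agent alone does not prove the request came from Baidu, because the header is trivially spoofed.

```python
import re

# Assumed pattern: "Baiduspider", an optional variant suffix such as "-image",
# and an optional version such as "/2.0".
BAIDUSPIDER_RE = re.compile(
    r"Baiduspider(?:-[a-z]+)?(?:/\d+(?:\.\d+)*)?", re.IGNORECASE
)


def is_baiduspider_user_agent(user_agent: str) -> bool:
    """Return True if the User-Agent header claims to be Baiduspider."""
    return BAIDUSPIDER_RE.search(user_agent) is not None


def count_baiduspider_hits(log_lines) -> int:
    """Count log lines that contain a Baiduspider User-Agent token."""
    return sum(1 for line in log_lines if is_baiduspider_user_agent(line))
```

Calling `count_baiduspider_hits(open("access.log"))` on a typical combined-format access log (the file name is hypothetical) would give a rough count of requests that claim to be Baiduspider.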

📋 ABOUT BAIDUSPIDER

Baiduspider is the web crawler for Baidu, China's largest search engine with approximately 60% domestic market share. Baiduspider indexes web content for Baidu Search, Baidu News, Baidu Image, and increasingly for Baidu's AI products including Ernie Bot, Baidu's large language model.

Baiduspider crawls from Chinese IP ranges and primarily targets Chinese-language content, though it does crawl international websites. For websites targeting the Chinese market, Baiduspider indexing is essential for visibility. The crawler respects robots.txt directives and can be verified through reverse DNS resolution to baidu.com or baidu.jp domains.

NORAD.io monitors Baiduspider activity globally, tracking crawl volumes and geographic patterns. For sites not targeting Chinese audiences, unexpected Baiduspider activity may be noteworthy. NORAD helps site operators make informed decisions about allowing or restricting Baiduspider access based on their audience and content strategy.

🎯 HOW TO DETECT BAIDUSPIDER

  • User-Agent contains 'Baiduspider/2.0'
  • Crawls from Chinese IP ranges (180.76.x.x, 119.63.x.x, 106.12.x.x)
  • Verify via reverse DNS — should resolve to *.baidu.com or *.baidu.jp
  • Multiple variants: Baiduspider-image, Baiduspider-video, Baiduspider-news
  • May crawl non-Chinese sites less frequently
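
The reverse DNS check in the list above could be sketched roughly as follows. This is an illustrative Python example, not an official verification routine: it adds a forward-confirmation step (resolving the returned hostname back to the same IP), which is a common hardening practice assumed here rather than something the text above prescribes. The function name and the injectable resolver parameters are hypothetical, and the forward lookup shown handles IPv4 only.

```python
import socket

# Hostname suffixes taken from the detection notes above.
ALLOWED_SUFFIXES = (".baidu.com", ".baidu.jp")


def _reverse_lookup(ip: str) -> str:
    """PTR lookup: return the primary hostname for an IP address."""
    return socket.gethostbyaddr(ip)[0]


def _forward_lookup(hostname: str) -> list:
    """A-record lookup: return the IPv4 addresses for a hostname."""
    return socket.gethostbyname_ex(hostname)[2]


def verify_baiduspider_ip(ip, reverse_lookup=_reverse_lookup,
                          forward_lookup=_forward_lookup) -> bool:
    """Return True if the IP reverse-resolves to a Baidu hostname
    and that hostname resolves back to the same IP."""
    try:
        hostname = reverse_lookup(ip).rstrip(".").lower()
    except OSError:  # covers socket.herror and socket.gaierror
        return False
    if not hostname.endswith(ALLOWED_SUFFIXES):
        return False
    try:
        return ip in forward_lookup(hostname)
    except OSError:
        return False
```

The resolver parameters exist only so the function can be exercised without network access; in normal use, `verify_baiduspider_ip(client_ip)` performs live DNS queries, so results are worth caching.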

🌐 BAIDUSPIDER KNOWN IP RANGES

180.76.0.0/16
119.63.192.0/21
106.12.0.0/15
182.61.0.0/16

Use these CIDR ranges to verify Baiduspider identity at the network level. Always combine with User-Agent verification for accurate detection.
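
A network-level check against these four ranges might look like the sketch below, which uses Python's standard `ipaddress` module. The constant and function names are illustrative assumptions, and the list simply mirrors the ranges given on this page; it may not be complete or current.

```python
import ipaddress

# CIDR ranges as listed on this page; treat as a snapshot, not an
# authoritative or exhaustive list.
BAIDUSPIDER_NETWORKS = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "180.76.0.0/16",
        "119.63.192.0/21",
        "106.12.0.0/15",
        "182.61.0.0/16",
    )
]


def in_baiduspider_range(ip: str) -> bool:
    """Return True if the address falls inside one of the listed ranges."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # not a valid IP address string
    return any(addr in network for network in BAIDUSPIDER_NETWORKS)
```

Because IP ranges can host other services, a match here is best treated as one signal alongside the User-Agent and reverse DNS checks rather than as proof on its own.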

🔄 CRAWL BEHAVIOR

Systematic crawling with moderate to high request rates. Respects robots.txt. Can be aggressive on large sites. Primarily targets Chinese-language content but crawls globally. Does not render JavaScript.

PURPOSE

Indexes web content for Baidu Search, the dominant search engine in China with approximately 60% market share. Also powers Baidu's AI assistant Ernie Bot.

🤖 ROBOTS.TXT CONFIGURATION

User-agent: Baiduspider
Allow: /

# To block:
# User-agent: Baiduspider
# Disallow: /

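Baiduspider respects robots.txt directives. Add the rules above to the robots.txt file at the root of your domain. To block the crawler instead, replace the Allow group with the commented-out Disallow group.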
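
To sanity-check how a rule set would be interpreted before deploying it, one option is Python's standard `urllib.robotparser`. The sketch below is illustrative: the helper name and the example.com URL are assumptions, and the standard-library parser only approximates how Baiduspider itself reads robots.txt.

```python
from urllib.robotparser import RobotFileParser

ALLOW_RULES = """User-agent: Baiduspider
Allow: /
"""

BLOCK_RULES = """User-agent: Baiduspider
Disallow: /
"""


def baiduspider_allowed(robots_txt: str, url: str) -> bool:
    """Parse robots.txt text and report whether Baiduspider may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch("Baiduspider", url)


if __name__ == "__main__":
    page = "https://example.com/page"  # hypothetical URL
    print(baiduspider_allowed(ALLOW_RULES, page))
    print(baiduspider_allowed(BLOCK_RULES, page))
```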

