Q: What are Meta-ExternalAgent's IP ranges?

Known Meta-ExternalAgent IP ranges include 69.63.176.0/20 and 66.220.144.0/20.

META-EXTERNALAGENT

Q: What is the Meta-ExternalAgent User-Agent string?

The Meta-ExternalAgent User-Agent string is: Mozilla/5.0 (compatible; Meta-ExternalAgent/1.0; +https://developers.facebook.com/docs/sharing/webmasters/crawler)

Q: How do I block Meta-ExternalAgent in robots.txt?

To control Meta-ExternalAgent access, add the following to your robots.txt file:

# Block AI training but keep Facebook link previews:
User-agent: Meta-ExternalAgent
Disallow: /

User-agent: facebookexternalhit
Allow: /

Q: Does Meta-ExternalAgent respect robots.txt?

Yes, Meta-ExternalAgent respects robots.txt directives.

MEDIUM RISK · 🔍 SEARCH & AI CRAWLER

Meta's AI training crawler, which collects web data for Meta's LLaMA and other AI models

ORGANIZATION

📡 META-EXTERNALAGENT USER-AGENT STRING

Mozilla/5.0 (compatible; Meta-ExternalAgent/1.0; +https://developers.facebook.com/docs/sharing/webmasters/crawler)

This is the User-Agent header sent by Meta-ExternalAgent in HTTP requests. Use this to identify Meta-ExternalAgent in your server access logs.

📋 ABOUT META-EXTERNALAGENT

Meta-ExternalAgent is Meta's dedicated AI training crawler, used to collect web content for training the LLaMA family of large language models and other Meta AI products. This crawler is distinct from FacebookBot (facebookexternalhit), which fetches pages for link previews on Facebook and Instagram.

Meta introduced Meta-ExternalAgent as a separate User-Agent token to give website operators independent control over AI training data collection versus social media link previews. This mirrors the approach taken by Google (Google-Extended vs Googlebot) and OpenAI (GPTBot vs ChatGPT-User). Blocking Meta-ExternalAgent prevents your content from being used in LLaMA training without affecting Facebook or Instagram link preview functionality.

NORAD.io classifies Meta-ExternalAgent as medium risk due to the scale of Meta's AI training operations and the volume of data collected. NORAD tracks Meta-ExternalAgent separately from FacebookBot so that site operators can implement precise access policies: allowing social media functionality while controlling AI training data access.

🎯 HOW TO DETECT META-EXTERNALAGENT

▸User-Agent contains 'Meta-ExternalAgent'
▸Distinct from 'facebookexternalhit' (link preview bot)
▸Crawls systematically rather than on-demand
▸Shares Meta's IP infrastructure with FacebookBot
▸Blocking Meta-ExternalAgent does not affect Facebook/Instagram link previews
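
The User-Agent checks listed above can be sketched in a few lines of Python. This is a minimal illustration, not NORAD's detection logic: the function name `classify_meta_user_agent` and its return labels are hypothetical, and the case-insensitive match is an assumption made so that differently cased variants of the token are still caught.

```python
def classify_meta_user_agent(user_agent: str) -> str:
    """Label a User-Agent header as Meta's AI crawler, link-preview bot, or other.

    The labels are illustrative names chosen for this sketch.
    """
    ua = user_agent.lower()  # assumption: match the token case-insensitively
    if "meta-externalagent" in ua:
        return "ai-training"
    if "facebookexternalhit" in ua:
        return "link-preview"
    return "other"


if __name__ == "__main__":
    sample = (
        "Mozilla/5.0 (compatible; Meta-ExternalAgent/1.0; "
        "+https://developers.facebook.com/docs/sharing/webmasters/crawler)"
    )
    print(classify_meta_user_agent(sample))
```

Because a User-Agent header is trivially spoofed, a match here is only a first signal and is best paired with a network-level check.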

🌐 META-EXTERNALAGENT KNOWN IP RANGES

69.63.176.0/20
66.220.144.0/20

Use these CIDR ranges to verify Meta-ExternalAgent identity at the network level. Always combine with User-Agent verification for accurate detection.
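
As a rough sketch of combining both signals, the following uses Python's standard `ipaddress` module to test a client address against the two CIDR ranges listed above and then requires the User-Agent token as well. The helper names are hypothetical, and the range list is only what this page states; it is an assumption that it is complete and current.

```python
import ipaddress

# The two ranges listed on this page; treat the list as possibly incomplete.
LISTED_RANGES = [
    ipaddress.ip_network("69.63.176.0/20"),
    ipaddress.ip_network("66.220.144.0/20"),
]


def ip_in_listed_ranges(ip: str) -> bool:
    """Return True if the address parses and falls inside a listed range."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return False  # malformed address: treat as unverified
    return any(addr in net for net in LISTED_RANGES)


def looks_like_meta_externalagent(ip: str, user_agent: str) -> bool:
    """Require both the User-Agent token and a listed source address."""
    return "meta-externalagent" in user_agent.lower() and ip_in_listed_ranges(ip)
```

Since the crawler shares Meta's IP infrastructure with FacebookBot, the IP check alone cannot tell the two apart, which is why the sketch requires both conditions.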

🔄 CRAWL BEHAVIOR

Systematic crawling for AI training data. Moderate to high request rates. Separate from FacebookBot (link previews). Respects robots.txt with its own User-Agent token. Does not execute JavaScript.

PURPOSE

Collects web content for training Meta's AI models, including LLaMA, Llama 2, Llama 3, and Meta AI assistant products. This is Meta's AI training crawler, distinct from FacebookBot, which generates link previews.

🤖 ROBOTS.TXT CONFIGURATION

# Block AI training but keep Facebook link previews:
User-agent: Meta-ExternalAgent
Disallow: /

User-agent: facebookexternalhit
Allow: /

Meta-ExternalAgent respects robots.txt directives. Add this to your robots.txt file at the root of your domain.
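
Before deploying, it can help to confirm that the rules parse the way you expect. The sketch below feeds the configuration above to Python's standard `urllib.robotparser` and checks each token. It shows only how a standards-following parser reads the file, not how Meta's crawler behaves, and `https://example.com/article` is a placeholder URL.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
# Block AI training but keep Facebook link previews:
User-agent: Meta-ExternalAgent
Disallow: /

User-agent: facebookexternalhit
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.com/article"  # placeholder URL for the check
meta_allowed = parser.can_fetch("Meta-ExternalAgent", url)
preview_allowed = parser.can_fetch("facebookexternalhit", url)

print("Meta-ExternalAgent allowed:", meta_allowed)
print("facebookexternalhit allowed:", preview_allowed)
```

With these rules the parser reports the AI training token as disallowed and the link-preview token as allowed.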

⚠️ RELATED THREATS

Prompt Injection

Attempts to override bot instructions via malicious content embedded in web pages

Data Exfiltration

Bots attempting to extract sensitive data from websites including PII and credentials

Credential Stuffing

Automated login attempts using leaked credentials from data breaches

Aggressive Content Scraping

Bots aggressively scraping content beyond robots.txt limits and ToS

🔗 RELATED BOTS

FacebookBot · LOW

Meta · Meta's crawler that fetches pages for link previews on Facebook and Instagram

📂 MORE 🔍 SEARCH & AI CRAWLERS

GPTBot (OpenAI), ClaudeBot (Anthropic), Googlebot (Google), Google-Extended (Google), Bingbot (Microsoft), Bytespider (ByteDance)
