PUPPETEER

HIGH RISK
⚡ AUTOMATION AGENT

Google's Node.js browser automation library — widely used for scraping and testing

ORGANIZATION
Google
FIRST SEEN
2017-08
RESPECTS ROBOTS.TXT
✗ NO
DOCUMENTATION
pptr.dev

📡 PUPPETEER USER-AGENT STRING

Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36

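This is the default User-Agent header Puppeteer sends when it runs headless Chrome; the version number (120.0.0.0 here) follows the bundled browser build. Match on the HeadlessChrome token to identify unmodified Puppeteer traffic in your server access logs. Scripts can override the header with page.setUserAgent(), so the absence of the token does not rule Puppeteer out.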
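As a minimal sketch of that log check, the Python below counts requests per client IP whose log line carries the HeadlessChrome token. It assumes a combined-style access log where the client IP is the first whitespace-separated field; the function name `headless_hits` and the sample lines are hypothetical, and real log formats may need a different parser.

```python
import re
from collections import Counter

# Matches the "HeadlessChrome/<version>" token from the default User-Agent.
HEADLESS_TOKEN = re.compile(r"HeadlessChrome/(\d+(?:\.\d+)*)")


def headless_hits(log_lines):
    """Count log lines per client IP that contain the HeadlessChrome token.

    Assumption: the client IP is the first whitespace-separated field,
    as in the common and combined log formats.
    """
    hits = Counter()
    for line in log_lines:
        if HEADLESS_TOKEN.search(line):
            fields = line.split(maxsplit=1)
            if fields:
                hits[fields[0]] += 1
    return hits


# Hypothetical sample lines for illustration only (documentation IP ranges).
SAMPLE_LOG = [
    '203.0.113.7 - - [10/Jan/2024:12:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'HeadlessChrome/120.0.0.0 Safari/537.36"',
    '198.51.100.4 - - [10/Jan/2024:12:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) '
    'Chrome/120.0.0.0 Safari/537.36"',
]

if __name__ == "__main__":
    for ip, count in headless_hits(SAMPLE_LOG).items():
        print(ip, count)
```

A check like this only catches scripts that leave the default User-Agent in place, so it is better treated as a first filter than as a verdict.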

📋 ABOUT PUPPETEER

Puppeteer is Google's official Node.js library for controlling Chrome/Chromium browsers via the Chrome DevTools Protocol. Released in 2017, Puppeteer quickly became the standard tool for browser automation, offering a high-level API for navigation, form interaction, screenshot capture, and PDF generation.

In the bot detection ecosystem, Puppeteer occupies a unique position. It's the foundation for many legitimate testing and monitoring tools, but it's also the most popular framework for building sophisticated web scrapers that can handle JavaScript-rendered content. The puppeteer-extra ecosystem, particularly puppeteer-extra-plugin-stealth, provides tools to evade common bot detection mechanisms.

NORAD.io classifies Puppeteer-based automation as high risk due to its prevalence in scraping operations and the sophistication of available evasion tools. NORAD's detection approach goes beyond simple fingerprinting to analyze behavioral patterns, network characteristics, and environmental signals that stealth plugins cannot fully mask.

🎯 HOW TO DETECT PUPPETEER

  • Default User-Agent contains 'HeadlessChrome' (trivially overridden by the script)
  • navigator.webdriver is true by default (patchable with stealth plugins)
  • puppeteer-extra-plugin-stealth is widely used to hide these signals, so their absence does not rule out automation
  • Chrome DevTools Protocol connections from non-standard ports
  • Missing browser features: WebGL renderer strings, plugin lists
  • Consistent default viewport size (800x600 in older versions)
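One way to combine the signals listed above is a simple additive score, sketched here in Python. It assumes a client-side script has already reported the values (navigator.webdriver, navigator.plugins.length, the WebGL renderer string, the viewport size) to the server. The `ClientSignals` type, the `automation_score` function, and the weights are all hypothetical choices for illustration, not NORAD's method or any library's API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class ClientSignals:
    """Values a hypothetical client-side probe reports back to the server."""
    user_agent: str
    webdriver: bool            # navigator.webdriver
    plugin_count: int          # navigator.plugins.length
    webgl_renderer: str        # empty string if no renderer string was available
    viewport: Tuple[int, int]  # (width, height)


def automation_score(s: ClientSignals) -> int:
    """Add up illustrative weights; higher means more Puppeteer-like."""
    score = 0
    if "HeadlessChrome" in s.user_agent:
        score += 3  # strong signal, but trivially overridden
    if s.webdriver:
        score += 3  # strong signal, but patchable by stealth plugins
    if s.plugin_count == 0:
        score += 1  # weak: missing plugin list
    if not s.webgl_renderer:
        score += 1  # weak: missing WebGL renderer string
    if s.viewport == (800, 600):
        score += 1  # weak: default viewport size
    return score


if __name__ == "__main__":
    default_puppeteer = ClientSignals(
        user_agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 "
                   "(KHTML, like Gecko) HeadlessChrome/120.0.0.0 Safari/537.36",
        webdriver=True,
        plugin_count=0,
        webgl_renderer="",
        viewport=(800, 600),
    )
    print(automation_score(default_puppeteer))
```

Because stealth plugins can patch most of these values, a low score is weak evidence of a human visitor; the weak signals mainly help when several agree.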

🔄 CRAWL BEHAVIOR

Puppeteer drives a full browser over the Chrome DevTools Protocol, so pages are rendered with JavaScript executed. It runs headless by default and sends the HeadlessChrome UA string, but it can also run headed. Beyond that, crawl behavior is determined entirely by the controlling script.

PURPOSE

Browser automation for testing, scraping, screenshot generation, PDF creation, and automated web interactions. Used extensively in both legitimate testing pipelines and scraping operations.

🤖 ROBOTS.TXT CONFIGURATION

# Puppeteer does not check robots.txt.
# Default UA includes 'HeadlessChrome' but this is easily changed.
# Use browser fingerprinting for detection.

⚠ Puppeteer does not read robots.txt on its own, so directives there have no effect unless the script author implements compliance. Supplement robots.txt with IP-level blocking or bot detection middleware.
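A minimal sketch of what such middleware could look like, written as a plain WSGI wrapper in Python using only the standard calling convention. The names `block_headless` and `demo_app` are hypothetical, and the check covers only the default HeadlessChrome User-Agent, so it will not stop scripts that change the header.

```python
def block_headless(app):
    """Wrap a WSGI app and reject requests whose UA contains 'HeadlessChrome'."""

    def middleware(environ, start_response):
        user_agent = environ.get("HTTP_USER_AGENT", "")
        if "HeadlessChrome" in user_agent:
            body = b"Automated access is not permitted.\n"
            start_response(
                "403 Forbidden",
                [
                    ("Content-Type", "text/plain; charset=utf-8"),
                    ("Content-Length", str(len(body))),
                ],
            )
            return [body]
        # Anything else is passed through to the wrapped application.
        return app(environ, start_response)

    return middleware


def demo_app(environ, start_response):
    """Hypothetical stand-in for the site's real WSGI application."""
    body = b"hello\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


application = block_headless(demo_app)
```

Wrapping at the WSGI layer keeps the check independent of any particular framework; in practice it would sit alongside IP-level rules and fingerprint-based detection rather than replace them.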
