Undetected Browser Mode

Overview

Crawl4AI offers two powerful anti-bot features to help you access websites with bot detection:

  1. Stealth Mode - Uses playwright-stealth to modify browser fingerprints and behaviors
  2. Undetected Browser Mode - An advanced browser adapter with deep-level patches for bypassing sophisticated bot detection

This guide covers both features and helps you choose the right approach for your needs.

Anti-Bot Features Comparison

Feature              | Regular Browser | Stealth Mode | Undetected Browser
---------------------|-----------------|--------------|-------------------
WebDriver Detection  | ❌              | ✅           | ✅
Navigator Properties | ❌              | ✅           | ✅
Plugin Emulation     | ❌              | ✅           | ✅
CDP Detection        | ❌              | Partial      | ✅
Deep Browser Patches | ❌              | ❌           | ✅
Performance Impact   | None            | Minimal      | Moderate
Setup Complexity     | None            | None         | Minimal

When to Use Each Approach

Use Regular Browser + Stealth Mode When:

  • Sites have basic bot detection (checking navigator.webdriver, plugins, etc.)
  • You need good performance with basic protection
  • Sites check for common automation indicators

Use Undetected Browser When:

  • Sites employ sophisticated bot detection services (Cloudflare, DataDome, etc.)
  • Stealth mode alone isn't sufficient
  • You're willing to trade some performance for better evasion

Best Practice: Progressive Enhancement

  1. Start with: Regular browser + Stealth mode
  2. If blocked: Switch to Undetected browser
  3. If still blocked: Combine Undetected browser + Stealth mode (a runnable version of this escalation appears under Advanced Usage Tips below)

Stealth Mode

Stealth mode is the simpler anti-bot solution that works with both regular and undetected browsers:

from crawl4ai import AsyncWebCrawler, BrowserConfig

# Enable stealth mode with regular browser
browser_config = BrowserConfig(
    enable_stealth=True,  # Simple flag to enable
    headless=False       # Better for avoiding detection
)

async with AsyncWebCrawler(config=browser_config) as crawler:
    result = await crawler.arun("https://example.com")

What Stealth Mode Does:

  • Removes navigator.webdriver flag
  • Modifies browser fingerprints
  • Emulates realistic plugin behavior
  • Adjusts navigator properties
  • Fixes common automation leaks
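
To confirm stealth mode is hiding the flag, you can inject a small script via CrawlerRunConfig's js_code option and inspect the returned HTML. A minimal sketch, assuming example.com as a neutral test page (the injected marker text is illustrative):

import asyncio
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig

async def check_webdriver_flag():
    browser_config = BrowserConfig(enable_stealth=True, headless=False)

    # Write the navigator.webdriver value into the page body so it
    # appears in result.html (the marker text is arbitrary)
    js = "document.body.innerHTML += '<p>webdriver=' + navigator.webdriver + '</p>';"

    async with AsyncWebCrawler(config=browser_config) as crawler:
        result = await crawler.arun(
            url="https://example.com",
            config=CrawlerRunConfig(js_code=js),
        )
        if result.success:
            # With stealth enabled, this should not report true
            print("flag hidden:", "webdriver=true" not in result.html)

asyncio.run(check_webdriver_flag())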

Undetected Browser Mode

For sites with sophisticated bot detection that stealth mode can't bypass, use the undetected browser adapter:

Key Features

  • Drop-in Replacement: Uses the same API as regular browser mode
  • Enhanced Stealth: Built-in patches to evade common detection methods
  • Browser Adapter Pattern: Seamlessly switch between regular and undetected modes
  • Automatic Installation: crawl4ai-setup installs all necessary browser dependencies

Quick Start

import asyncio
from crawl4ai import (
    AsyncWebCrawler, 
    BrowserConfig, 
    CrawlerRunConfig,
    UndetectedAdapter
)
from crawl4ai.async_crawler_strategy import AsyncPlaywrightCrawlerStrategy

async def main():
    # Create the undetected adapter
    undetected_adapter = UndetectedAdapter()

    # Create browser config
    browser_config = BrowserConfig(
        headless=False,  # Headless mode is easier to detect
        verbose=True,
    )

    # Create the crawler strategy with undetected adapter
    crawler_strategy = AsyncPlaywrightCrawlerStrategy(
        browser_config=browser_config,
        browser_adapter=undetected_adapter
    )

    # Create the crawler with our custom strategy
    async with AsyncWebCrawler(
        crawler_strategy=crawler_strategy,
        config=browser_config
    ) as crawler:
        # Your crawling code here
        result = await crawler.arun(
            url="https://example.com",
            config=CrawlerRunConfig()
        )
        print(result.markdown[:500])

asyncio.run(main())

Combining Both Features

For maximum evasion, combine stealth mode with undetected browser:

from crawl4ai import AsyncWebCrawler, BrowserConfig, UndetectedAdapter
from crawl4ai.async_crawler_strategy import AsyncPlaywrightCrawlerStrategy

# Create browser config with stealth enabled
browser_config = BrowserConfig(
    enable_stealth=True,  # Enable stealth mode
    headless=False
)

# Create undetected adapter
adapter = UndetectedAdapter()

# Create strategy with both features
strategy = AsyncPlaywrightCrawlerStrategy(
    browser_config=browser_config,
    browser_adapter=adapter
)

async with AsyncWebCrawler(
    crawler_strategy=strategy,
    config=browser_config
) as crawler:
    result = await crawler.arun("https://protected-site.com")

Examples

Example 1: Basic Stealth Mode

import asyncio
from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig

async def test_stealth_mode():
    # Simple stealth mode configuration
    browser_config = BrowserConfig(
        enable_stealth=True,
        headless=False
    )

    async with AsyncWebCrawler(config=browser_config) as crawler:
        result = await crawler.arun(
            url="https://bot.sannysoft.com",
            config=CrawlerRunConfig(screenshot=True)
        )

        if result.success:
            print("✓ Successfully accessed bot detection test site")
            # Save screenshot to verify detection results
            if result.screenshot:
                import base64
                with open("stealth_test.png", "wb") as f:
                    f.write(base64.b64decode(result.screenshot))
                print("✓ Screenshot saved - check for green (passed) tests")

asyncio.run(test_stealth_mode())

Example 2: Undetected Browser Mode

import asyncio
from crawl4ai import (
    AsyncWebCrawler,
    BrowserConfig,
    CrawlerRunConfig,
    CrawlResult,
    DefaultMarkdownGenerator,
    PruningContentFilter,
    UndetectedAdapter
)
from crawl4ai.async_crawler_strategy import AsyncPlaywrightCrawlerStrategy


async def main():
    # Create browser config
    browser_config = BrowserConfig(
        headless=False,
        verbose=True,
    )

    # Create the undetected adapter
    undetected_adapter = UndetectedAdapter()

    # Create the crawler strategy with the undetected adapter
    crawler_strategy = AsyncPlaywrightCrawlerStrategy(
        browser_config=browser_config,
        browser_adapter=undetected_adapter
    )

    # Create the crawler with our custom strategy
    async with AsyncWebCrawler(
        crawler_strategy=crawler_strategy,
        config=browser_config
    ) as crawler:
        # Configure the crawl
        crawler_config = CrawlerRunConfig(
            markdown_generator=DefaultMarkdownGenerator(
                content_filter=PruningContentFilter()
            ),
            capture_console_messages=True,  # Test adapter console capture
        )

        # Test on a site that typically detects bots
        print("Testing undetected adapter...")
        result: CrawlResult = await crawler.arun(
            url="https://www.helloworld.org", 
            config=crawler_config
        )

        print(f"Status: {result.status_code}")
        print(f"Success: {result.success}")
        print(f"Console messages captured: {len(result.console_messages or [])}")
        print(f"Markdown content (first 500 chars):\n{result.markdown.raw_markdown[:500]}")


if __name__ == "__main__":
    asyncio.run(main())

Browser Adapter Pattern

The undetected browser support is implemented using an adapter pattern, allowing seamless switching between different browser implementations:

# Regular browser adapter (default)
from crawl4ai import PlaywrightAdapter
regular_adapter = PlaywrightAdapter()

# Undetected browser adapter
from crawl4ai import UndetectedAdapter
undetected_adapter = UndetectedAdapter()

The adapter handles:

  • JavaScript execution
  • Console message capture
  • Error handling
  • Browser-specific optimizations
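
Because both adapters expose the same interface, switching is a one-line decision. A minimal helper sketch (the make_strategy function and its undetected flag are illustrative, not part of the library):

from crawl4ai import BrowserConfig, PlaywrightAdapter, UndetectedAdapter
from crawl4ai.async_crawler_strategy import AsyncPlaywrightCrawlerStrategy

def make_strategy(browser_config: BrowserConfig, undetected: bool) -> AsyncPlaywrightCrawlerStrategy:
    # Same strategy either way; only the adapter changes
    adapter = UndetectedAdapter() if undetected else PlaywrightAdapter()
    return AsyncPlaywrightCrawlerStrategy(
        browser_config=browser_config,
        browser_adapter=adapter,
    )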

Best Practices

  1. Avoid Headless Mode: Detection is easier in headless mode

    browser_config = BrowserConfig(headless=False)
    

  2. Use Reasonable Delays: Don't rush through pages

    crawler_config = CrawlerRunConfig(
        wait_time=3.0,  # Wait 3 seconds after page load
        delay_before_return_html=2.0  # Additional delay
    )
    

  3. Rotate User Agents: Vary the user agent between runs rather than reusing one value; a single custom agent can be set as shown here, and a rotation sketch follows this list

    browser_config = BrowserConfig(
        headers={"User-Agent": "your-user-agent"}
    )
    

  4. Handle Failures Gracefully: Some sites may still detect and block

    if not result.success:
        print(f"Crawl failed: {result.error_message}")
    
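For best practice 3, a minimal rotation sketch reusing the headers approach shown above. The crawl_with_random_ua name and the user-agent pool are illustrative, not part of the library:

import random
from crawl4ai import AsyncWebCrawler, BrowserConfig

# Illustrative user-agent pool; substitute real, current strings
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ...",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) ...",
]

async def crawl_with_random_ua(url: str):
    browser_config = BrowserConfig(
        headers={"User-Agent": random.choice(USER_AGENTS)},  # fresh agent per run
        enable_stealth=True,
        headless=False,
    )
    async with AsyncWebCrawler(config=browser_config) as crawler:
        return await crawler.arun(url)

# usage: asyncio.run(crawl_with_random_ua("https://example.com"))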

Advanced Usage Tips

Progressive Detection Handling

import asyncio
from crawl4ai import AsyncWebCrawler, BrowserConfig, UndetectedAdapter
from crawl4ai.async_crawler_strategy import AsyncPlaywrightCrawlerStrategy

async def crawl_with_progressive_evasion(url):
    # Step 1: Try regular browser with stealth
    browser_config = BrowserConfig(
        enable_stealth=True,
        headless=False
    )

    async with AsyncWebCrawler(config=browser_config) as crawler:
        result = await crawler.arun(url)
        if result.success and "Access Denied" not in result.html:
            return result

    # Step 2: If blocked, try undetected browser
    print("Regular + stealth blocked, trying undetected browser...")

    adapter = UndetectedAdapter()
    strategy = AsyncPlaywrightCrawlerStrategy(
        browser_config=browser_config,
        browser_adapter=adapter
    )

    async with AsyncWebCrawler(
        crawler_strategy=strategy,
        config=browser_config
    ) as crawler:
        result = await crawler.arun(url)
        return result
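
A quick way to try the helper (the URL is illustrative):

result = asyncio.run(crawl_with_progressive_evasion("https://protected-site.com"))
print("Success:", result.success)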

Installation

The undetected browser dependencies are automatically installed when you run:

crawl4ai-setup

This command installs all necessary browser dependencies for both regular and undetected modes.

Limitations

  • Performance: Moderately slower than regular mode due to additional patches (see the comparison table above)
  • Headless Detection: Some sites can still detect headless mode
  • Resource Usage: May use more resources than regular mode
  • Not 100% Guaranteed: Advanced anti-bot services are constantly evolving

Troubleshooting

Browser Not Found

Run the setup command:

crawl4ai-setup

Detection Still Occurring

Try combining with other features:

crawler_config = CrawlerRunConfig(
    simulate_user=True,  # Add user simulation
    magic=True,  # Enable magic mode
    wait_time=5.0,  # Longer waits
)

Performance Issues

If experiencing slow performance:

# Use selective undetected mode only for protected sites
from urllib.parse import urlparse
from crawl4ai import PlaywrightAdapter, UndetectedAdapter

PROTECTED_DOMAINS = {"protected-site.com"}  # domains known to need evasion (illustrative)

if urlparse(url).netloc in PROTECTED_DOMAINS:
    adapter = UndetectedAdapter()
else:
    adapter = PlaywrightAdapter()  # Default adapter

Future Plans

Note: In future versions of Crawl4AI, we may enable stealth mode and undetected browser by default to provide better out-of-the-box success rates. For now, users should explicitly enable these features when needed.

Conclusion

Crawl4AI provides flexible anti-bot solutions:

  1. Start Simple: Use regular browser + stealth mode for most sites
  2. Escalate if Needed: Switch to undetected browser for sophisticated protection
  3. Combine for Maximum Effect: Use both features together when facing the toughest challenges

Remember:

  • Always respect robots.txt and website terms of service
  • Use appropriate delays to avoid overwhelming servers
  • Consider the performance trade-offs of each approach
  • Test progressively to find the minimum necessary evasion level
