Transform any website into structured data with just a few clicks! The Crawl4AI Assistant Chrome Extension provides three powerful tools for web scraping and data extraction.
Click2Crawl
Visual data extraction - click elements to build schemas instantly!
Script Builder (Alpha)
Record browser actions to create automation scripts
Markdown Extraction (New!)
Convert any webpage content to clean markdown with Visual Text Mode
Quick Start
🎯 Click2Crawl
Click elements to build extraction schemas - No LLM needed!
Select Container
Click on any repeating element like product cards or articles. Use up/down navigation to fine-tune selection!
Click Fields to Extract
Click on data fields inside the container - choose text, links, images, or attributes
Test & Extract Data Instantly!
🎉 Click "Test Schema" to see extracted JSON immediately - no LLM or coding required!
🔴 Script Builder
Record actions, generate automation
Hit Record
Start capturing your browser interactions
Interact Naturally
Click, type, scroll - everything is captured
Export Script
Get JavaScript for Crawl4AI's js_code parameter
📝 Markdown Extraction
Convert webpage content to clean markdown "as you see"
Ctrl/Cmd + Click
Hold Ctrl/Cmd and click multiple elements you want to extract
Enable Visual Text Mode
Extract content "as you see" - clean text without complex HTML structures
Export Clean Markdown
Get beautifully formatted markdown ready for documentation or LLMs
See the Generated Code & Extracted Data
#!/usr/bin/env python3
"""
🎉 NO LLM NEEDED! Direct extraction with CSS selectors
Generated by Crawl4AI Chrome Extension - Click2Crawl
"""
import asyncio
import json

from crawl4ai import AsyncWebCrawler, BrowserConfig, CrawlerRunConfig
from crawl4ai.extraction_strategy import JsonCssExtractionStrategy

# The EXACT schema from Click2Crawl - no guessing!
EXTRACTION_SCHEMA = {
    "name": "Product Catalog",
    "baseSelector": "div.product-card",  # The container you selected
    "fields": [
        {
            "name": "title",
            "selector": "h3.product-title",
            "type": "text"
        },
        {
            "name": "price",
            "selector": "span.price",
            "type": "text"
        },
        {
            "name": "image",
            "selector": "img.product-img",
            "type": "attribute",
            "attribute": "src"
        },
        {
            "name": "link",
            "selector": "a.product-link",
            "type": "attribute",
            "attribute": "href"
        }
    ]
}


async def extract_data(url: str):
    # Direct extraction - no LLM API calls!
    extraction_strategy = JsonCssExtractionStrategy(schema=EXTRACTION_SCHEMA)

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            url=url,
            config=CrawlerRunConfig(extraction_strategy=extraction_strategy)
        )

        if result.success:
            data = json.loads(result.extracted_content)
            print(f"✅ Extracted {len(data)} items instantly!")

            # Save to file
            with open('products.json', 'w') as f:
                json.dump(data, f, indent=2)

            return data


# Run extraction on any similar page!
data = asyncio.run(extract_data("https://example.com/products"))
# 🎯 Result: Clean JSON data, no LLM costs, instant results!
// 🎉 Instantly extracted from the page - no coding required!
[
  {
    "title": "Wireless Bluetooth Headphones",
    "price": "$79.99",
    "image": "https://example.com/images/headphones-bt-01.jpg",
    "link": "/products/wireless-bluetooth-headphones"
  },
  {
    "title": "Smart Watch Pro 2024",
    "price": "$299.00",
    "image": "https://example.com/images/smartwatch-pro.jpg",
    "link": "/products/smart-watch-pro-2024"
  },
  {
    "title": "4K Webcam for Streaming",
    "price": "$149.99",
    "image": "https://example.com/images/webcam-4k.jpg",
    "link": "/products/4k-webcam-streaming"
  },
  {
    "title": "Mechanical Gaming Keyboard RGB",
    "price": "$129.99",
    "image": "https://example.com/images/keyboard-gaming.jpg",
    "link": "/products/mechanical-gaming-keyboard"
  },
  {
    "title": "USB-C Hub 7-in-1",
    "price": "$45.99",
    "image": "https://example.com/images/usbc-hub.jpg",
    "link": "/products/usb-c-hub-7in1"
  }
]
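The extracted records above keep prices as display strings and links as site-relative paths. A minimal post-processing sketch, assuming records shaped like that sample output and using only the Python standard library, might look like the following. The helper name `normalize_products` and the choice of `Decimal` for prices are illustrative assumptions, not part of the extension's generated code.

```python
from decimal import Decimal
from urllib.parse import urljoin


def normalize_products(records, base_url):
    """Return copies of the records with absolute links and numeric prices.

    Hypothetical helper: assumes each record has "link" and "price" keys
    and that prices look like "$79.99".
    """
    cleaned = []
    for record in records:
        item = dict(record)  # copy so the input records stay untouched
        # Resolve site-relative links such as "/products/..." against the page URL.
        item["link"] = urljoin(base_url, record["link"])
        # Strip the currency symbol and thousands separators before parsing.
        item["price"] = Decimal(record["price"].lstrip("$").replace(",", ""))
        cleaned.append(item)
    return cleaned


sample = [
    {
        "title": "Wireless Bluetooth Headphones",
        "price": "$79.99",
        "image": "https://example.com/images/headphones-bt-01.jpg",
        "link": "/products/wireless-bluetooth-headphones",
    }
]
products = normalize_products(sample, "https://example.com/products")
print(products[0]["link"], products[0]["price"])
```

Keeping this step separate from extraction leaves the Click2Crawl schema unchanged, so the same schema can be reused on similar pages while the clean-up rules vary per site.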
import asyncio

from crawl4ai import AsyncWebCrawler, CrawlerRunConfig

# JavaScript generated from your recorded actions
js_script = """
// Search for products
document.querySelector('button.search-toggle').click();
await new Promise(r => setTimeout(r, 500));

// Type search query
const searchInput = document.querySelector('input#search');
searchInput.value = 'wireless headphones';
searchInput.dispatchEvent(new Event('input', {bubbles: true}));

// Submit search
searchInput.dispatchEvent(new KeyboardEvent('keydown', {
    key: 'Enter', keyCode: 13, bubbles: true
}));

// Wait for results
await new Promise(r => setTimeout(r, 2000));

// Click first product
document.querySelector('.product-item:first-child').click();

// Wait for product page
await new Promise(r => setTimeout(r, 1000));

// Add to cart
document.querySelector('button.add-to-cart').click();
"""


async def automate_shopping():
    config = CrawlerRunConfig(
        js_code=js_script,
        wait_for="css:.cart-confirmation",
        screenshot=True
    )

    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(
            url="https://shop.example.com",
            config=config
        )
        print(f"✓ Automation complete: {result.url}")
        return result


asyncio.run(automate_shopping())
# Extracted from Hacker News with Visual Text Mode 👁️
1. **Show HN: I built a tool to find and reach out to YouTubers** (hellosimply.io)
84 points by erickim 2 hours ago | hide | 31 comments
2. **The 24 Hour Restaurant** (logicmag.io)
124 points by helsinkiandrew 5 hours ago | hide | 52 comments
3. **Building a Better Bloom Filter in Rust** (carlmastrangelo.com)
89 points by carlmastrangelo 3 hours ago | hide | 27 comments
---
### Article: The 24 Hour Restaurant
In New York City, the 24-hour restaurant is becoming extinct. What we lose when we can no longer eat whenever we want.
When I first moved to New York, I loved that I could get a full meal at 3 AM. Not just pizza or fast food, but a proper sit-down dinner with table service and a menu that ran for pages. The city that never sleeps had restaurants that matched its rhythm.
Today, finding a 24-hour restaurant in Manhattan requires genuine effort. The pandemic accelerated a decline that was already underway, but the roots go deeper: rising rents, changing labor laws, and shifting cultural patterns have all contributed to the death of round-the-clock dining.
---
### Product Review: Framework Laptop 16
**Specifications:**
- Display: 16" 2560×1600 165Hz
- Processor: AMD Ryzen 7 7840HS
- Memory: 32GB DDR5-5600
- Storage: 2TB NVMe Gen4
- Price: Starting at $1,399
**Pros:**
- Fully modular and repairable
- Excellent Linux support
- Great keyboard and trackpad
- Expansion card system
**Cons:**
- Battery life could be better
- Slightly heavier than competitors
- Fan noise under load
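In the sample above, each selected element appears as its own block separated by a `---` line. Assuming that separator convention holds for your export (it is inferred from the sample, not documented behavior), a small sketch for splitting the markdown into per-selection sections before passing them to documentation tooling or an LLM could look like this. The function name `split_sections` is hypothetical.

```python
def split_sections(markdown_text):
    """Split markdown into sections at lines that contain only '---'.

    Assumption: selections are separated by '---' lines, as in the sample.
    """
    sections = []
    current = []
    for line in markdown_text.splitlines():
        if line.strip() == "---":
            # Separator reached: close the section collected so far.
            sections.append("\n".join(current).strip())
            current = []
        else:
            current.append(line)
    sections.append("\n".join(current).strip())
    # Drop empty sections caused by leading, trailing, or doubled separators.
    return [section for section in sections if section]


sample_markdown = """1. **The 24 Hour Restaurant** (logicmag.io)
---
### Article: The 24 Hour Restaurant
In New York City, the 24-hour restaurant is becoming extinct.
---
### Product Review: Framework Laptop 16
- Fully modular and repairable
"""
sections = split_sections(sample_markdown)
for index, section in enumerate(sections, start=1):
    print(f"Section {index}: {section.splitlines()[0]}")
```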
Crawl4AI Cloud
Your browser cluster without the cluster.
See it extract your own data. Right now.
More Features Coming Soon
We're continuously expanding the Crawl4AI Assistant with powerful new features:
Direct Data Download
Skip the code generation entirely! Download extracted data directly from Click2Crawl as JSON or CSV files.
📊 One-click download • No Python needed • Multiple export formats
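For readers who want CSV today, the JSON that the generated Click2Crawl script already produces can be converted with the Python standard library. The sketch below is only an illustration of that conversion, not the planned download feature; the helper name `records_to_csv` is hypothetical, and it assumes flat records that all share the keys of the first record.

```python
import csv
import io


def records_to_csv(records):
    """Serialize a list of flat dicts to CSV text.

    Assumption: every record has the same keys as the first one,
    which supplies the header row.
    """
    if not records:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


rows = [
    {"title": "USB-C Hub 7-in-1", "price": "$45.99"},
    {"title": "Smart Watch Pro 2024", "price": "$299.00"},
]
csv_text = records_to_csv(rows)
print(csv_text)
```

To write a file instead of a string, the same `csv.DictWriter` calls work on a handle opened with `open("products.csv", "w", newline="")`.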
Smart Field Detection
AI-powered field detection for Click2Crawl that automatically suggests the most likely data fields on any page.
🤖 Auto-detect fields • Smart naming • Pattern recognition
🚀 Stay tuned for updates! Follow our GitHub for the latest releases.