You can now crawl an entire website with a single API call using Browser Rendering's new /crawl endpoint, available in open beta. Submit a starting URL, and pages are automatically discovered, rendered in a headless browser, and returned in multiple formats, including HTML, Markdown, and structured JSON. This is great for training models, building RAG pipelines, and researching or monitoring content across a site.
Crawl jobs run asynchronously. You submit a URL, receive a job ID, and check back for results as pages are processed.
# Initiate a crawl
curl -X POST 'https://api.cloudflare.com/client/v4/accounts/{account_id}/browser-rendering/crawl' \
  -H 'Authorization: Bearer
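The submit-then-poll flow described above could be driven by a small polling loop. The sketch below is illustrative only: the `status` field, its values (`completed`, `errored`, `cancelled`), and the `fetch_status` callable are assumptions, not the documented response schema. The HTTP call is injected so the loop can be tried without network access.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal states; check the API reference for the real values.
TERMINAL_STATES = {"completed", "errored", "cancelled"}


def wait_for_crawl(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a crawl job until it reaches a terminal state or the timeout expires.

    `fetch_status` is a caller-supplied function that takes the job ID and
    returns the parsed JSON status of the job (assumed to carry a "status" key).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"crawl job {job_id} did not finish in {timeout_s}s")
        sleep(interval_s)
```

In practice, `fetch_status` would wrap an authenticated GET request for the job ID returned by the initial POST. A fixed polling interval keeps the sketch short; a backoff schedule may suit long crawls better.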
Key features:
- Multiple output formats - Return crawled content as HTML, Markdown, and structured JSON (powered by Workers AI)
- Crawl scope controls - Configure crawl depth, page limits, and wildcard patterns to include or exclude specific URL paths
- Automatic page discovery - Discovers URLs from sitemaps, page links, or both
- Incremental crawling - Use modifiedSince and maxAge to skip pages that haven't changed or were recently fetched, saving time and cost on repeated crawls
- Static mode - Set render: false to fetch static HTML without spinning up a browser, for faster crawling of static sites
- Well-behaved bot - Honors robots.txt directives, including crawl-delay
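To make the scope and robots.txt behavior concrete, here is a local illustration of how include/exclude wildcard patterns and robots.txt rules can narrow a set of candidate URLs. It is a sketch of the general idea using Python's standard library, not the service's implementation: the `USER_AGENT` value, the function names, and the shell-style matching semantics of `fnmatch` are all assumptions, and the endpoint's own pattern rules may differ.

```python
from fnmatch import fnmatchcase
from typing import Iterable, List, Sequence
from urllib.parse import urlsplit
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-crawler"  # hypothetical user agent name


def in_scope(url: str, include: Sequence[str], exclude: Sequence[str]) -> bool:
    """Return True if the URL path passes the exclude and include patterns."""
    path = urlsplit(url).path or "/"
    if any(fnmatchcase(path, pattern) for pattern in exclude):
        return False
    # An empty include list means "everything not excluded".
    return not include or any(fnmatchcase(path, pattern) for pattern in include)


def load_robots(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt content that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(
    urls: Iterable[str],
    robots: RobotFileParser,
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
) -> List[str]:
    """Keep URLs that are in scope and permitted by robots.txt."""
    return [
        url
        for url in urls
        if in_scope(url, include, exclude) and robots.can_fetch(USER_AGENT, url)
    ]
```

A crawler built this way would also read `robots.crawl_delay(USER_AGENT)` and wait that many seconds between requests to the same host when a delay is declared.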