Elyse Betters Picaro / ZDNET
ZDNET's key takeaways
Cloudflare claims Perplexity ignores websites' wishes in its content hunt.
Other AI companies, such as OpenAI, don't swipe content, Cloudflare says.
Cloudflare now offers services to block aggressive AI crawlers.
Perplexity is denying Cloudflare's claims.
Cloudflare, a leading content delivery network (CDN) company, has accused the AI startup Perplexity of evading websites' "no crawl" directives by stealthily deploying web crawlers to scrape content from sites that have explicitly blocked its official bots.
If that sounds familiar, it's because you've heard these accusations before. Last year, WIRED and Forbes both accused Perplexity of doing the same thing to their sites.
How Perplexity allegedly bypasses 'no crawl' directives
According to Cloudflare, when Perplexity's web crawler encounters a robots.txt file, which sites use to block their content from being crawled, Perplexity pretends to be an ordinary Chrome web browser on a Mac. This enables it to bypass the bot barriers.
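To see why a change of user agent matters, here is a minimal Python sketch of how a well-behaved crawler might consult robots.txt before fetching a page. It is an illustration only, not Cloudflare's or Perplexity's code: the robots.txt contents, the may_crawl helper, and the Chrome user-agent string are all assumptions made for the example. Because robots.txt rules are matched against the name a crawler announces, rules that target PerplexityBot do not match a request that presents itself as a generic browser, and nothing in robots.txt itself enforces compliance.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that blocks only Perplexity's two declared crawlers.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /
"""

# Illustrative user-agent strings. The Chrome one is a generic example,
# not the exact string Cloudflare reported.
DECLARED_UA = "PerplexityBot"
GENERIC_CHROME_UA = (
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
    "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36"
)


def may_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"
    # The declared crawler name matches a Disallow rule.
    print("declared crawler allowed:", may_crawl(ROBOTS_TXT, DECLARED_UA, url))
    # A browser-like user agent matches no rule in this file.
    print("generic browser allowed:", may_crawl(ROBOTS_TXT, GENERIC_CHROME_UA, url))
```

A site that disallows every user agent with a wildcard rule would deny both names in this check, but the check only runs if the crawler chooses to perform it.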
Cloudflare started investigating when it received complaints from customers who had "both disallowed Perplexity crawling activity in their robots.txt files and also created WAF [Web Application Firewall] rules to specifically block both of Perplexity's declared crawlers: PerplexityBot and Perplexity-User." The customers said their content still ended up in Perplexity, even after they had blocked it.
The CDN then set up new test domains whose robots.txt files explicitly prohibited all automated access, and added specific WAF rules that blocked Perplexity's acknowledged crawlers. Cloudflare found that Perplexity would use multiple IP addresses not listed in its official IP range and rotate through these IPs to sneak into the sites' content and records.
"In addition to rotating IPs, we observed requests coming from different Autonomous System Numbers (ASNs) to evade website blocks," Cloudflare said. "This activity was observed across tens of thousands of domains and millions of requests per day."
The result? Cloudflare said it observed "Perplexity not only accessed such content but was able to provide detailed answers about it when queried by users."
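The idea of traffic arriving from "IP addresses not listed in its official IP range" can be sketched in a few lines of Python. This is a simplified illustration, not Cloudflare's detection method, which the company says also draws on other signals. The ranges below are reserved documentation addresses standing in for a crawler's published list, and the function names are hypothetical.

```python
from ipaddress import ip_address, ip_network

# Hypothetical "official" ranges, using reserved documentation addresses.
# A real check would load the list the crawler operator publishes.
DECLARED_RANGES = [
    ip_network("192.0.2.0/24"),
    ip_network("198.51.100.0/24"),
]


def in_declared_ranges(source_ip: str, ranges=DECLARED_RANGES) -> bool:
    """Return True if source_ip falls inside any declared range."""
    addr = ip_address(source_ip)
    return any(addr in net for net in ranges)


def undeclared_sources(request_ips):
    """Return the unique IPs outside the declared ranges, in numeric order."""
    outside = {ip for ip in request_ips if not in_declared_ranges(ip)}
    return sorted(outside, key=ip_address)


if __name__ == "__main__":
    # Made-up request log for the example.
    seen = ["192.0.2.10", "203.0.113.7", "203.0.113.7", "198.51.100.3", "203.0.113.2"]
    print(undeclared_sources(seen))
```

An address outside the list proves nothing on its own, since ordinary visitors also come from arbitrary addresses. It becomes meaningful only when combined with evidence that the requests belong to the same crawler.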
Cloudflare's plan to stop Perplexity
Moving forward, Cloudflare has claimed its bot management system can spot and block Perplexity's hidden User Agent. Any bot management customer who has an existing block rule in place is already protected.
If you don't want to block such traffic on the grounds that it might be from real users, you can set up rules to challenge requests. This allows real humans to proceed. Customers with existing challenge rules are already protected.
Finally, Cloudflare has added signature matches for the stealth crawler to its managed rule, which blocks AI crawling activity. This rule is available to all Cloudflare customers, including free users.
Cloudflare noted that OpenAI does obey the robots.txt restrictions and doesn't try to break into websites. That said, Ziff Davis, ZDNET's parent company, filed an April 2025 lawsuit against OpenAI, alleging it infringed copyrights in training and operating its AI systems.
Cloudflare has recently started offering its customers the option to automatically block all AI crawlers. To complement the move to block AI crawlers, Cloudflare has also launched its Pay Per Crawl program, enabling publishers to set rates for AI companies that want to scrape their content.
This follows numerous deals in which media businesses are permitting AI companies to legally use their content to train their large language models (LLMs). Examples include The New York Times with Amazon, The Washington Post with OpenAI, and Perplexity with Gannett Publishing.
In the meantime, Perplexity appears to continue to break the rules in its hunt for content. ZDNET has asked Perplexity about Cloudflare's claims, but the company has not responded.
Perplexity strikes back
Since then, Perplexity has publicly and loudly announced that Cloudflare has it all wrong. In a blog post, Perplexity claims:
"This controversy reveals that Cloudflare's systems are fundamentally inadequate for distinguishing between legitimate AI assistants and actual threats. If you can't tell a helpful digital assistant from a malicious scraper, then you probably shouldn't be making decisions about what constitutes legitimate web traffic."
Those are fighting words! Further, Perplexity states, "Technical errors in Cloudflare's analysis aren't just embarrassing -- they're disqualifying. When you misattribute millions of requests, publish completely inaccurate technical diagrams, and demonstrate a fundamental misunderstanding of how modern AI assistants work, you've forfeited any claim to expertise in this space."
This fight is on. Stay tuned for what's next in this battle between an internet giant and an AI powerhouse.