Some people are defending Perplexity after Cloudflare ‘named and shamed’ it

When Cloudflare accused AI search engine Perplexity of stealthily scraping websites on Monday, while ignoring a site’s specific methods to block it, this wasn’t a clear-cut case of an AI web crawler gone wild.

Many people came to Perplexity’s defense. They argued that Perplexity accessing sites in defiance of the website owner’s wishes, while controversial, is acceptable. And this is a controversy that will certainly grow as AI agents flood the internet: Should an agent accessing a website on behalf of its user be treated like a bot? Or like a human making the same request?

Cloudflare is known for providing bot-blocking and other web security services to millions of websites. Essentially, Cloudflare’s test case involved setting up a new website on a new domain that had never been crawled by any bot, adding a robots.txt file that specifically blocked Perplexity’s known AI crawling bots, and then asking Perplexity about the website’s content. Perplexity answered the question.

Cloudflare researchers found the AI search engine used “a generic browser intended to impersonate Google Chrome on macOS” when its web crawler itself was blocked. Cloudflare CEO Matthew Prince posted the research on X, writing, “Some supposedly ‘reputable’ AI companies act more like North Korean hackers. Time to name, shame, and hard block them.”
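To illustrate why the choice of user agent matters here, the following is a minimal sketch, not Cloudflare’s actual test harness. It uses Python’s standard `urllib.robotparser` to evaluate a robots.txt file like the one described above. The crawler names `PerplexityBot` and `Perplexity-User`, the `example.com` URL, and the helper `is_allowed` are assumptions made for illustration. The sketch shows that robots.txt rules are matched against whatever name the client declares, so a request that presents itself as a generic browser is not covered by a rule aimed at a named bot.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for the test site: block two named crawlers,
# allow everyone else. The bot names are assumptions for illustration.
ROBOTS_TXT = """\
User-agent: PerplexityBot
Disallow: /

User-agent: Perplexity-User
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/some-page"  # placeholder URL
    browser_ua = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)"
    # A client that declares itself as the blocked crawler is disallowed.
    print("PerplexityBot:", is_allowed("PerplexityBot", page))
    # A client that declares a generic browser string falls under "*".
    print("Generic browser:", is_allowed(browser_ua, page))
```

Note that robots.txt is advisory: nothing in the file itself stops a client from fetching a page, which is why enforcement falls to services that inspect the traffic.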

But many people disagreed with Prince’s assessment that this was actual bad behavior. Those defending Perplexity on sites like X and Hacker News pointed out that what Cloudflare seemed to document was the AI accessing a specific public website when its user asked about that specific website.

“If I as a human request a website, then I should be shown the content,” one person on Hacker News wrote, adding, “why would the LLM accessing the website on my behalf be in a different legal category as my Firefox web browser?”

A Perplexity spokesperson previously denied to TechCrunch that the bots were the company’s and called Cloudflare’s blog post a sales pitch for Cloudflare. Then on Tuesday, Perplexity published a blog post in its own defense (and generally attacking Cloudflare), claiming the behavior came from a third-party service it uses occasionally.

But the crux of Perplexity’s post made an appeal similar to that of its online defenders.

“The difference between automated crawling and user-driven fetching isn’t just technical — it’s about who gets to access information on the open web,” the post said. “This controversy reveals that Cloudflare’s systems are fundamentally inadequate for distinguishing between legitimate AI assistants and actual threats.”
