Cloud services giant Fastly has released a report claiming AI crawlers are putting a heavy load on the open web, slurping up sites at a rate that accounts for 80 percent of all AI bot traffic, with the remaining 20 percent used by AI fetchers. Bots and fetchers can hit websites hard, demanding data from a single site in thousands of requests per minute.
I can only see one thing causing this to stop: the AI bubble popping
According to the report [PDF], Facebook owner Meta's AI division accounts for more than half of those crawlers, while OpenAI accounts for the overwhelming majority of on-demand fetch requests.
Cloudflare creates AI crawler tollbooth to pay publishers READ MORE
"AI bots are reshaping how the internet is accessed and experienced, introducing new complexities for digital platforms," Fastly senior security researcher Arun Kumar opined in a statement on the report's release. "Whether scraping for training data or delivering real-time responses, these bots create new challenges for visibility, control, and cost. You can't secure what you can't see, and without clear verification standards, AI-driven automation risks are becoming a blind spot for digital teams."
The company's report is based on analysis of Fastly's Next-Gen Web Application Firewall (NGWAF) and Bot Management services, which the company says "protect over 130,000 applications and APIs and inspect more than 6.5 trillion requests per month" – giving it plenty of data to play with. The data reveals a growing problem: an increasing website load comes not from human visitors, but from automated crawlers and fetchers working on behalf of chatbot firms.
The report warned, "Some AI bots, if not carefully engineered, can inadvertently impose an unsustainable load on webservers," Fastly's report warned, "leading to performance degradation, service disruption, and increased operational costs." Kumar separately noted to The Register, "Clearly this growth isn't sustainable, creating operational challenges while also undermining the business model of content creators. We as an industry need to do more to establish responsible norms and standards for crawling that allows AI companies to get the data they need while respecting websites content guidelines."
That growing traffic comes from just a select few companies. Meta accounted for more than half of all AI crawler traffic on its own, at 52 percent, followed by Google and OpenAI at 23 percent and 20 percent respectively. This trio then has its hands on a combined 95 percent of all AI crawler traffic. Anthropic, by contrast, accounted for just 3.76 percent of crawler traffic. The Common Crawl Project, which slurps websites to include in a free public dataset designed to prevent duplication of effort and traffic multiplication at the heart of the crawler problem, was a surprisingly-low 0.21 percent.
The story flips when it comes to AI fetchers, which unlike crawlers are fired off on-demand when a user requests that a model incorporates information newer than its training cut-off date. Here, OpenAI was by far the dominant traffic source, Fastly found, accounting for almost 98 percent of all requests. That's an indication, perhaps, of just how much of a lead OpenAI's early entry into the consumer-facing AI chatbot market with ChatGPT gave the company, or possibly just a sign that the company's bot infrastructure may be in need of optimization.
While AI fetchers make up a minority of Ai bot requests – only about 20%, says Kumar – they can be responsible for huge bursts of traffic, with one fetcher generating over 39,000 requests per minute during the testing period. "We expect fetcher traffic to grow as AI tools become more widely adopted and as more agentic tools come into use that mediate the experience between people and websites," Kumar told The Register.
... continue reading