
AI web crawlers are destroying websites in their never-ending content hunger


Opinion: With AI's rise, AI web crawlers are strip-mining the web in their perpetual hunt for ever more content to feed into their Large Language Model (LLM) mills. How much traffic do they account for? According to Cloudflare, a major content delivery network (CDN) provider, 30% of global web traffic now comes from bots. Leading the way and growing fast? AI bots.

Cloud services company Fastly agrees. It reports that 80% of all AI bot traffic comes from AI data fetcher bots. So, you ask, "What's the problem? Haven't web crawlers been around since the arrival of the World Wide Web Wanderer in 1993?" Well, yes, they have. Anyone who runs a website, though, knows there's a huge, honking difference between the old-style crawlers and today's AI crawlers. The new ones are site killers.

Fastly warns that they're causing "performance degradation, service disruption, and increased operational costs." Why? Because they're hammering websites with traffic spikes that can reach up to ten or even twenty times normal levels within minutes.
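To make the "ten or even twenty times normal levels" figure concrete, here is a minimal sketch of how a site owner might flag such spikes in per-minute request counts. The function name `find_spikes`, the five-minute trailing window, the median baseline, and the sample counts are all illustrative assumptions, not anything Fastly describes.

```python
from statistics import median


def find_spikes(requests_per_minute, window=5, factor=10.0):
    """Return indexes of minutes whose request count is at least
    `factor` times the median of the preceding `window` minutes.

    The window size and the median baseline are assumptions chosen
    for illustration; a real monitor would tune both.
    """
    spikes = []
    for i in range(window, len(requests_per_minute)):
        baseline = median(requests_per_minute[i - window:i])
        if baseline > 0 and requests_per_minute[i] >= factor * baseline:
            spikes.append(i)
    return spikes


# Made-up sample: steady traffic near 100 requests/minute, then a surge.
counts = [100, 110, 95, 105, 100, 98, 1500, 2100, 120]
spike_minutes = find_spikes(counts)
```

Using the median rather than the mean keeps one surge minute from inflating the baseline for the minute that follows it.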

Moreover, AI crawlers are much more aggressive than standard crawlers. As the web hosting company InMotion Hosting notes, they tend to disregard crawl delays and bandwidth-saving guidelines, extract full page text, and sometimes attempt to follow dynamic links or scripts.
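For readers who want to see what "disregarding crawl delays" looks like in practice, the following is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content, the bot name `ExampleAIBot`, the helper names, and the request timestamps are all hypothetical; the idea is simply to compare the gaps between a crawler's requests against the `Crawl-delay` the site asked for.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking one bot to wait 10 seconds between requests.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Crawl-delay: 10
Disallow: /private/

User-agent: *
Disallow:
"""


def load_rules(text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


def crawl_delay_violations(parser, user_agent, timestamps):
    """Count consecutive request pairs spaced closer than the Crawl-delay.

    `timestamps` are request times in seconds, e.g. taken from an access log.
    Returns 0 when no Crawl-delay applies to this user agent.
    """
    delay = parser.crawl_delay(user_agent)
    if delay is None:
        return 0
    ordered = sorted(timestamps)
    return sum(1 for a, b in zip(ordered, ordered[1:]) if b - a < delay)


rules = load_rules(ROBOTS_TXT)
hits = [0.0, 0.5, 1.0, 12.0, 30.0]  # made-up request times for one bot
violations = crawl_delay_violations(rules, "ExampleAIBot", hits)
blocked = not rules.can_fetch("ExampleAIBot", "/private/page")
```

Keep in mind that robots.txt is advisory: a check like this can only report that a crawler ignored the rules, not stop it from doing so.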

The result? If you're using a shared server for your website, as many small businesses do, even if your site isn't being shaken down for content, other sites on the same hardware with the same Internet pipe may be getting hit. This means your site's performance drops through the floor even if an AI crawler isn't raiding your website.

Smaller sites, like my own Practical Tech, get slammed to the point where they're simply knocked out of service. Thanks to Cloudflare Distributed Denial of Service (DDoS) protection, my microsite can shrug off DDoS attacks. AI bot attacks – and let's face it, they are attacks – not so much.

Even large websites are feeling the crush. To handle the load, they must increase their processor, memory, and network resources. If they don't? Well, according to most web hosting companies, if a website takes longer than three seconds to load, more than half of visitors will abandon the site. Bounce rates jump up for every second beyond that threshold.

So when AI searchbots, with Meta (52% of AI searchbot traffic), Google (23%), and OpenAI (20%) leading the way, clobber websites with as much as 30 terabits in a single surge, they're damaging even the largest companies' site performance.

Now, if that were traffic that I could monetize, it would be one thing. It's not. It used to be that when the search indexing crawler Googlebot came calling, I could always hope that some story on my site would land on the magical first page of someone's search results. They'd visit me, they'd read the story, and two or three times out of a hundred visits, they'd click on an ad, and I'd get a few pennies of income. Or, if I had a business site, I might sell a widget or get someone to do business with me.

AI searchbots? Not so much. AI crawlers don't direct users back to the original sources. They kick our sites around, return nothing, and we're left trying to decide how we're to make a living in the AI-driven web world.
