AI bots strain Wikimedia as bandwidth surges 50%
Published on: 2025-05-17 22:06:06
On Tuesday, the Wikimedia Foundation announced that relentless AI scraping is putting strain on Wikipedia's servers. Automated bots seeking training data for large language models (LLMs) have been vacuuming up terabytes of content, driving a 50 percent increase in the foundation's bandwidth for multimedia downloads since January 2024. It's a scenario familiar across the free and open source software (FOSS) community, as we've previously detailed.
The Foundation hosts not only Wikipedia but also platforms like Wikimedia Commons, which offers 144 million media files under open licenses. For decades, this content has powered everything from search results to school projects. But since early 2024, AI companies have dramatically increased automated scraping through direct crawling, APIs, and bulk downloads to feed their models. This growth in non-human traffic has imposed steep technical and financial costs, often without the attribution that helps sustain Wikimedia's volunteer ecosystem.
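The "APIs" mentioned above include public endpoints such as the MediaWiki Action API. As a rough illustration of the kind of automated request involved, here is a minimal Python sketch that fetches a page extract while identifying itself, as Wikimedia's User-Agent policy asks automated clients to do; the bot name and contact address are placeholders, not anything from the article.

```python
# Minimal sketch: a single well-behaved request against the public
# MediaWiki Action API. At scale, requests like this (and bulk media
# downloads) are what the article says is straining Wikimedia's servers.
import requests

API_URL = "https://en.wikipedia.org/w/api.php"

headers = {
    # Hypothetical identifier; Wikimedia's User-Agent policy asks bots
    # to name themselves and include contact information.
    "User-Agent": "ExampleResearchBot/0.1 (contact@example.org)"
}

params = {
    "action": "query",
    "format": "json",
    "titles": "Wikimedia Foundation",
    "prop": "extracts",
    "explaintext": 1,  # plain text instead of HTML
    "exintro": 1,      # lead section only, not the full page
}

response = requests.get(API_URL, headers=headers, params=params, timeout=30)
response.raise_for_status()

# The API keys results by page ID; print the title and a snippet.
for page in response.json()["query"]["pages"].values():
    print(page["title"])
    print(page["extract"][:300])
```

For genuinely bulk needs, Wikimedia also publishes periodic database dumps, which avoid hammering the live API at all.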