Wikimedia Foundation bemoans AI bot bandwidth burden
Briefly

Wikimedia Foundation representatives report a significant increase in traffic from web-scraping bots. Since January 2024, bandwidth used to serve multimedia files has grown by 50 percent, a rise driven largely by automated programs scraping Wikimedia Commons images for AI purposes. Although bots account for only about 35 percent of page views, they generate at least 65 percent of the traffic for the most expensive data center content, straining resources. Human traffic spikes can usually be managed, but the unpredictable behavior of bots is driving up operational risks and costs for Wikimedia communities.
Our infrastructure is built to sustain sudden traffic spikes from humans during high-interest events, but the amount of traffic generated by scraper bots is unprecedented and presents growing risks and costs.
This increase is not coming from human readers, but largely from automated programs that scrape the Wikimedia Commons image catalog of openly licensed images to feed images to AI models.
At least 65 percent of the traffic for the most expensive content served by Wikimedia Foundation datacenters is generated by bots, even though these software agents represent only about 35 percent of page views.
The heedlessness of ill-behaved bots has been a common complaint over the past year or so among those operating computing infrastructure for open source projects.
Read at The Register