AI Scrapers Strain Wikipedia: 50% Surge in Multimedia Bandwidth Usage

Wed 2nd Apr, 2025

Wikipedia, one of the most visited websites globally, is experiencing significant strain on its infrastructure from AI scrapers harvesting its content for model training. According to the Wikimedia Foundation, this automated traffic has driven a 50% increase in the bandwidth consumed by multimedia downloads.

The organization is well prepared for spikes in access around notable events, such as the recent death of former U.S. President Jimmy Carter, but scraper traffic presents a different problem: where human interest in specific content surges and then subsides, AI scrapers access the site continuously, adding a constant background load. The Foundation has stressed the need to prioritize human traffic over automated requests to keep performance acceptable.

Data from the Wikimedia Foundation shows a marked increase in multimedia bandwidth usage beginning in spring 2024, with several peaks over the year. The most significant spike followed the announcement of Jimmy Carter's death, when a high volume of users streamed the lengthy video of his 1980 presidential debate with Ronald Reagan. Wikipedia is normally equipped to absorb such influxes, but because scraper traffic had already consumed much of the spare capacity, some users experienced noticeably slow load times.

The challenges posed by AI scrapers stem from their access patterns, which differ sharply from human browsing. Human readers cluster around popular pages, which can be served from caches close to them; scrapers instead sweep through vast swathes of rarely viewed content in bulk, so their requests fall through the caches to the central data centers. This shift in traffic dynamics has eaten into the reserve capacity meant to absorb sudden spikes in human interest.
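
To make the cache dynamics concrete, the following toy simulation (in Python) replays two synthetic request streams through a small LRU cache: one concentrated on popular pages, roughly how human readers behave, and one spread uniformly across a large catalog, roughly how a bulk scraper behaves. The catalog size, cache size, and traffic distributions are illustrative assumptions, not Wikimedia's actual figures or caching architecture.

    import random
    from collections import OrderedDict

    def hit_rate(requests, cache_size):
        """Replay a request stream through an LRU cache; return the hit rate."""
        cache = OrderedDict()
        hits = 0
        for page in requests:
            if page in cache:
                hits += 1
                cache.move_to_end(page)        # refresh recency on a hit
            else:
                cache[page] = True
                if len(cache) > cache_size:
                    cache.popitem(last=False)  # evict the least recently used page
        return hits / len(requests)

    random.seed(42)
    CATALOG = 1_000_000   # hypothetical number of distinct pages/files
    CACHE = 10_000        # hypothetical cache capacity (1% of the catalog)
    N = 200_000           # requests per simulated stream

    # Human-like stream: interest concentrates on a few popular pages
    # (approximated with a heavy-tailed Pareto distribution).
    humans = [min(int(random.paretovariate(1.2)), CATALOG - 1) for _ in range(N)]

    # Scraper-like stream: pages fetched uniformly across the whole catalog,
    # so nearly every request is a first-time read that misses the cache.
    scrapers = [random.randrange(CATALOG) for _ in range(N)]

    print(f"human-like hit rate:   {hit_rate(humans, CACHE):.1%}")
    print(f"scraper-like hit rate: {hit_rate(scrapers, CACHE):.1%}")

Under these assumptions the human-like stream is served almost entirely from cache, while the scraper-like stream misses roughly 99% of the time, and every miss becomes work for the origin servers.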

According to the Foundation, roughly two-thirds of its most resource-intensive traffic now comes from automated requests rather than human readers. The resulting disruptions have forced it to block many scraper requests so that genuine users can reach Wikipedia and its related resources without hindrance. The Foundation describes the volume of scraper traffic as unprecedented and warns that it brings escalating risks and costs while delivering none of the usual benefits of heavy traffic, such as visibility or reader engagement.
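
The Foundation has not described its blocking mechanisms in detail, so the sketch below is only a generic illustration of how a site might reject obvious scraper traffic: one signal is a self-declared crawler user agent, the other a request rate no human reader would sustain. The marker strings, window length, and threshold are hypothetical values chosen for the example.

    import time
    from collections import defaultdict, deque

    # Hypothetical substrings of self-identifying crawler user agents; real
    # blocklists are larger and also rely on IP ranges and behavioral signals.
    BOT_MARKERS = ("GPTBot", "CCBot", "Bytespider", "python-requests")

    WINDOW_SECONDS = 10.0   # length of the sliding window
    MAX_REQUESTS = 50       # per client IP per window (illustrative threshold)

    _history = defaultdict(deque)

    def should_block(client_ip, user_agent, now=None):
        """Return True if a request should be rejected outright."""
        now = time.monotonic() if now is None else now

        # Signal 1: the client openly identifies itself as an automated crawler.
        ua = user_agent.lower()
        if any(marker.lower() in ua for marker in BOT_MARKERS):
            return True

        # Signal 2: a sliding-window rate limit per client IP.
        window = _history[client_ip]
        window.append(now)
        while window and now - window[0] > WINDOW_SECONDS:
            window.popleft()
        return len(window) > MAX_REQUESTS

In a real deployment such checks would typically run at the edge, in front of the caches, so rejected requests never reach the origin; the trade-off is the risk of also blocking legitimate automation that respects rate limits.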

The resource drain caused by scraper traffic is not unique to Wikipedia. Earlier this year, Linux Weekly News reported that automated access was creating a denial-of-service effect, slowing its website for all users. The Wikimedia Foundation has not named the entities behind the scrapers hitting its systems, but the pattern is clear: AI companies are harvesting publicly available data across the internet, and Wikipedia, as a large corpus of curated text and media, is a prime source of training material.

