Wikimedia has seen a 50 percent increase in bandwidth used for downloading multimedia content since January 2024, driven by AI crawlers scraping its content to train generative AI models. It has to find a way to address the problem, because the extra load could slow down actual readers' access to its pages and assets.
That's what these AI crawler builders should be doing: download the Wikipedia database dump, run their own copy locally, and pull an update once a day, or however often the dumps are refreshed. Wouldn't surprise me if some poor intern had to implement a bot, was fired or moved on, and it's just running with nobody maintaining it, all while the C-suite is shovelling money into their pockets.
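For what it's worth, here's roughly what that script could look like: a minimal sketch in Python (using the requests library) that pulls the standard "latest" English Wikipedia article dump from dumps.wikimedia.org and only re-downloads it when the server's copy changes. The local file names and the once-a-day scheduling are assumptions for illustration; Wikimedia also publishes dated dumps and mirrors if you need something sturdier.

```python
import os
import requests

# Illustrative sketch: fetch the latest English Wikipedia article dump once,
# instead of crawling the live site. The dump URL is the standard "latest"
# path on dumps.wikimedia.org; adjust for other wikis or dump types.
DUMP_URL = "https://dumps.wikimedia.org/enwiki/latest/enwiki-latest-pages-articles.xml.bz2"
LOCAL_PATH = "enwiki-latest-pages-articles.xml.bz2"   # assumed local file name
STAMP_PATH = LOCAL_PATH + ".last-modified"

def refresh_dump():
    """Download the dump only if the server copy is newer than the local one."""
    head = requests.head(DUMP_URL, allow_redirects=True, timeout=30)
    head.raise_for_status()
    remote_stamp = head.headers.get("Last-Modified", "")

    # Skip the multi-gigabyte download if nothing has changed since last run.
    if os.path.exists(STAMP_PATH) and os.path.exists(LOCAL_PATH):
        with open(STAMP_PATH) as f:
            if f.read().strip() == remote_stamp:
                print("Local dump is up to date; nothing to do.")
                return

    # Stream the file to disk in chunks so it is never held in memory at once.
    with requests.get(DUMP_URL, stream=True, timeout=30) as resp:
        resp.raise_for_status()
        with open(LOCAL_PATH, "wb") as out:
            for chunk in resp.iter_content(chunk_size=1 << 20):
                out.write(chunk)

    with open(STAMP_PATH, "w") as f:
        f.write(remote_stamp)
    print("Dump refreshed.")

if __name__ == "__main__":
    refresh_dump()  # e.g. run from cron once a day
```

Point being, one cron job hitting a single bz2 file once a day would cost Wikimedia a tiny fraction of what a crawler hammering every article page does.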
Why would an AI company hire someone when they can just prompt an AI to write a script to download Wikipedia, and run it without even checking?
A man of our times!