AI crawlers cause Wikimedia Commons bandwidth demands to surge 50% (Wikimedia is the umbrella organization of Wikipedia and a dozen or so other crowdsourced knowledge projects).

Tea@programming.dev · edited 2 days ago

chrash0@lemmy.world · 2 days ago

i doubt the recent uptick in traffic is from “stealing data” for training; more likely it comes from agents scraping these sites for context, e.g. Edge Copilot, Google’s AI search, SearchGPT, etc.

poisoning the data will likely not help in this situation, since there’s a human on the other side who will just run the same search again after getting unsatisfactory results. much like how retries and timeouts can cause huge outages for web-scale companies, poisoning search results will likely increase this type of traffic, raising both bandwidth usage and the chances of a DoS.

TheBlackLounge@lemm.ee · 2 days ago

So? Break context scrapers till they give up, on your site or completely.

chrash0@lemmy.world · 2 days ago

easily said

How crawlers impact the operations of the Wikimedia projects