Reddit sues Anthropic for scraping its users' content without consent
Briefly

Reddit has filed a lawsuit against Anthropic, alleging that the company scraped its content for AI training without authorization and in violation of the Robots Exclusion Protocol (REP). The complaint asserts that Anthropic trained its models on Reddit user data without consent, in breach of Reddit's user privacy agreement. The case adds to a growing number of lawsuits from content creators against AI firms over scraping. Notably, Reddit is itself a tech platform, unlike the traditional publishers that have previously sued AI companies for copyright infringement.
Research indicates that other AI companies engage in the same practice. In March, Columbia's Tow Center found that multiple chatbots, including Perplexity, could still retrieve articles from publishers that had blocked their crawlers using REP, raising serious concerns about AI companies' adherence to web standards.
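For readers unfamiliar with REP, the following is a minimal sketch of how a compliant crawler might check a site's robots.txt rules before fetching a page. It uses Python's standard `urllib.robotparser` module. The bot name `ExampleAIBot`, the rules, and the `example.com` URL are hypothetical and are not taken from the complaint or from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: blocks one named crawler, allows everyone else.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL, used only for illustration.
URL = "https://example.com/articles/1"

blocked = is_allowed(ROBOTS_TXT, "ExampleAIBot", URL)
allowed = is_allowed(ROBOTS_TXT, "OtherBot", URL)

print("ExampleAIBot allowed:", blocked)
print("OtherBot allowed:", allowed)
```

The check runs entirely on the crawler's side. REP is a voluntary convention, so a crawler that skips this step can still request the page unless the site blocks it by other means.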
The complaint states that 'Anthropic is in fact intentionally trained on the personal data of Reddit users without ever requesting their consent,' which Reddit says violates its user privacy agreement.
In July 2024, when Reddit publicly criticized Anthropic for misusing its content, 'Anthropic's bots continued to hit Reddit's servers over 100,000 times' even though the company insisted that it had stopped its bots from crawling the site.
This lawsuit is the latest in the ongoing clash between sites that create and host content and the AI companies that scrape that content to use as training data.
Read at ZDNET