The Internet's Most Powerful Archiving Tool Is in Peril

US media organizations, including USA Today and The New York Times, are blocking the Wayback Machine from archiving their content, even as their own reporters rely on it for research. The Internet Archive's Wayback Machine preserves web pages for public access, but many major news sites now limit what it can capture or display. USA Today Co. says its block is part of a broader effort against scraping bots, while The Guardian has imposed selective restrictions on how its archived content can be accessed.

"'They're able to pull together their story research because the Wayback Machine exists. At the same time, they're blocking access,' Graham says."

"According to analysis by the artificial-intelligence-detection startup Originality AI, 23 major news sites are currently blocking ia_archiverbot, the web crawler commonly used by the Internet Archive for the Wayback project."

"USA Today Co. spokesperson Lark-Marie Anton emphasized that 'this effort is not about specifically blocking the Internet Archive' but instead part of the company's broader efforts to block all scraping bots."

"The Guardian does not block the crawler, but it excludes its content from the Internet Archive API and filters out articles from the Wayback Machine interface, which makes it harder for regular people to access archived versions of its articles."

#wayback-machine #us-media #archiving #information-access #scraping-bots

Read at WIRED
