Reddit blocks non-profit Wayback Machine from archiving the site

The Internet Archive’s Wayback Machine is one of the most valuable free services available on the web, ensuring that important sources of information are protected from the vicissitudes of fate and tech companies.

Until recently, the archive was able to capture the entirety of Reddit, but that is no longer the case following new restrictions implemented by the for-profit community discussion platform …

The Internet Archive

The archive has been in operation since 1996, and describes its origins in its own words:

“We began in 1996 by archiving the Internet itself, a medium that was just beginning to grow in use. Like newspapers, the content published on the web was ephemeral – but unlike newspapers, no one was saving it. Today we have 28+ years of web history accessible through the Wayback Machine and we work with 1,200+ library and other partners through our Archive-It program to identify important web pages.”

To date, it has archived 835 billion web pages, alongside books, audio recordings, photos, videos, and apps. It is used by millions of people a day, from researchers and historians to the general public.

Reddit blocks Wayback Machine

Engadget reports that Reddit is almost completely blocking the Wayback Machine from crawling content on the platform.

The company has begun to place new restrictions on what the archive site will be able to access in a move that will significantly limit the Wayback Machine’s ability to preserve information from Reddit. With the change, the Wayback Machine, a project run by the nonprofit Internet Archive, will only be able to crawl Reddit’s homepage. It will no longer be able to access comments, subreddit pages, post details, profiles and other data.

This is despite Reddit saying last year that it would not block good-faith actors, and specifically naming the Internet Archive among them.
