When Online Content Disappears
www.pewresearch.org
A quarter of all webpages that existed at one point between 2013 and 2023 are no longer accessible.

cross-posted from: https://lemy.lol/post/25166889

TehPers

On the flip side, nobody can be expected to keep their website up for 4000 years. Hosting costs money and time, and at some point, the thing you’re hosting will fall out of relevance enough to no longer be worth the cost.

This is why archiving is important. Hopefully most of the content that was lost was archived at some point. Getting a good chunk of that content onto long-term storage would do future generations a favor (even if it’s just a bunch of tape storage locked away in a warehouse or something).

This is true. Right now the OG internet is sort of kept alive by oral history, but we have the technology to save these websites in perpetuity as historical artifacts. That might be a good coding project: a robust archiving system that you point at a URL, which then scrapes everything under that domain and keeps a static collection of its contents. The issue, though, is that this doesn’t truly “capture” many web pages. A lot of the backend data that might have been served dynamically from a database isn’t retrievable, so the experience of using the page itself is potentially non-archivable.
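
For anyone curious what the core of such a project might look like, here is a rough sketch in Python using only the standard library. The names `crawl_domain`, `fetch_html`, and `LinkParser` are made up for illustration, and the `max_pages` limit is an arbitrary assumption. It only follows `<a href>` links on the same host and stores the raw HTML in memory, so it skips images, scripts, stylesheets, robots.txt handling, and rate limiting, and (per the caveat above) it cannot recover anything the server generates from a database on demand.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects href values from <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def fetch_html(url):
    """Default fetcher: downloads a page and returns its body as text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl_domain(start_url, fetch=fetch_html, max_pages=100):
    """Breadth-first crawl restricted to start_url's host.

    Returns a dict mapping each visited URL to its HTML snapshot.
    `fetch` is injectable so the crawl logic can run without a network.
    """
    start_url = urldefrag(start_url).url
    host = urlparse(start_url).netloc
    queue = deque([start_url])
    seen = {start_url}
    snapshots = {}
    while queue and len(snapshots) < max_pages:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that fail to load
        snapshots[url] = html
        parser = LinkParser()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" parts.
            target = urldefrag(urljoin(url, href)).url
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return snapshots
```

Calling `crawl_domain("https://example.org/")` would return the snapshots as a dictionary; writing each entry to disk (or to a proper archive format) is left out here, and that is where most of the real work in a robust archiver would likely go.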
