It would be helpful if there were an instance that migrated all of this to Lemmy so that we could access it from any other instance, instead of having to download it for local browsing.

This is data from Pushshift, from before Reddit nuked it in March. You can find this torrent (called “Reddit comments/submissions 2005-06 to 2022-12”) and others, including 2023-01 and 2023-02, on https://academictorrents.com, uploaded by user Watchful1.

What’s the context and background here? It would be nice to know what’s in some of these 4GB compressed files before downloading them.

@jacaw@lemmy.ml

Agreed, what’s in these? Raw text? Image metadata?

@theUnlikely@sopuli.xyz

Nothin’ but JSON compressed with zstd. You can also grab individual subreddits at https://the-eye.eu/redarcs/

@InternetPirate@lemmy.fmhy.ml
creator

It has JSON files with every written post on Reddit.
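
For anyone who wants to peek inside one of these before committing to a full download, here is a minimal Python sketch. It assumes the dumps are newline-delimited JSON (one object per line) and that each record has a `subreddit` field; neither is stated in this thread, so check against a real file. The decompression helper relies on the third-party `zstandard` package, and the large `max_window_size` is an assumption about how the archives were compressed. The function names are made up for illustration.

```python
import io
import json
from collections import Counter


def open_zst_lines(path):
    """Yield decoded text lines from a zstd-compressed file.

    Needs the third-party `zstandard` package (pip install zstandard).
    The large window size is an assumption about these archives.
    """
    import zstandard  # imported lazily so the rest works without it

    with open(path, "rb") as fh:
        dctx = zstandard.ZstdDecompressor(max_window_size=2**31)
        with dctx.stream_reader(fh) as reader:
            yield from io.TextIOWrapper(reader, encoding="utf-8")


def iter_records(lines):
    """Parse an iterable of text lines as one JSON object per line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        yield json.loads(line)


def count_by_subreddit(records):
    """Count records per subreddit (the field name is an assumption)."""
    return Counter(rec.get("subreddit", "<unknown>") for rec in records)


if __name__ == "__main__":
    # Tiny in-memory stand-in for a decompressed dump.
    sample = [
        '{"subreddit": "piracy", "body": "a"}\n',
        '{"subreddit": "linux", "body": "b"}\n',
    ]
    print(count_by_subreddit(iter_records(sample)))
```

With a real file it would be used roughly as `count_by_subreddit(iter_records(open_zst_lines("comments.zst")))`, where `comments.zst` is a placeholder name. Everything streams line by line, so the whole archive never has to fit in memory.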
