Hi, I’m looking for some recommendations, mostly pointers on where to look and what to research, as I have no idea what is actually good and what is just well advertised.
Intro: I have finally entered the world of (almost) gigabit internet, which opens up options for what I can host.
I currently have:
I will probably also be upgrading my gaming PC in the next few months, so my current rig will likely end up behind the TV as a server and for couch gaming.
Info/recommendations I would like:
Interesting approach with the ASN; I haven’t started using that feature yet. If I understand correctly, you add a QR ASN to each document you need to keep a physical copy of? And that sticker also has the ASN in human-readable form? So you would then add many documents at once to the feeder, and Paperless will read the QR and also split documents whenever a new code appears?
What about documents you don’t want to keep physically? Is there a way to get Paperless to split them automatically as well if you add many to the feeder?
Yes! They look like this:
Paperless supports two different splitting methods:
so all you need to do is have a “Patch T” page between each document and it’ll split them automatically.
Docs: https://docs.paperless-ngx.com/advanced_usage/#document-splitting
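For anyone trying to picture how the two barcode types interact in one scan batch, here is a rough sketch of the splitting logic in Python. This is not Paperless's actual code, just an illustration under a few assumptions: separator pages carry the barcode text `PATCHT` and are dropped, while ASN barcodes look like `ASN00012`, start a new document, and stay in it. The names `split_pages`, `Document`, `SEPARATOR` and `ASN_PREFIX` are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

SEPARATOR = "PATCHT"  # assumed barcode text on a "Patch T" separator page
ASN_PREFIX = "ASN"    # assumed prefix of an ASN barcode, e.g. "ASN00012"


@dataclass
class Document:
    pages: list[int] = field(default_factory=list)
    asn: Optional[int] = None


def split_pages(barcodes: list[Optional[str]]) -> list[Document]:
    """Split one scan batch into documents.

    barcodes[i] is the decoded barcode text found on page i, or None.
    """
    docs: list[Document] = []
    current = Document()
    for page, code in enumerate(barcodes):
        if code == SEPARATOR:
            # Separator page: close the current document and drop this page.
            if current.pages:
                docs.append(current)
            current = Document()
            continue
        if code and code.startswith(ASN_PREFIX) and code[len(ASN_PREFIX):].isdigit():
            # ASN page: close the current document, start a new one,
            # and keep this page as its first page.
            if current.pages:
                docs.append(current)
            current = Document(asn=int(code[len(ASN_PREFIX):]))
        current.pages.append(page)
    if current.pages:
        docs.append(current)
    return docs


batch = [None, None, "PATCHT", "ASN00012", None, "ASN00013"]
for doc in split_pages(batch):
    print(doc.asn, doc.pages)
```

In this sketch a batch can mix both kinds: documents you keep on paper get an ASN sticker on their first page, and everything else just needs a separator page in front of it.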
I’m also using paperless-ai to automatically tag and set a title for scanned documents. Very useful. I’d love to run my own AI locally using ollama, but I don’t have good enough hardware, so for now I’m using Google’s Gemini 2.0 Flash. I trust Google’s privacy policy far more than OpenAI’s, Google Gemini is very cheap, and if you use the paid version they don’t retain any of your data nor use it for training.
Thanks, this sounds really useful. Patch T sounds like some manual sorting work, but I guess with the option to reuse those separator pages it is still better than manual splitting or, worse, single scanning.
I haven’t looked into paperless-ai yet, but I hope my machine would be beefy enough for this task — worst case I guess it might take a little longer to process all docs.
Now I just need to decide on a good archiving method. A long time ago I read an article about the pros and cons of the different storage methods professional archivists use. Some prefer stacking documents horizontally in boxes, while others prefer standing them upright in vertical boxes. Pretty interesting nerdy topic 😀
You need a GPU with a decent amount of VRAM to get LLMs working well locally. I don’t have a new enough GPU to be useful: my server just has the Intel iGPU, and my desktop PC only has a GTX 1080, which is from before Nvidia added Tensor cores for AI.
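As a very rough rule of thumb for what "a decent amount of VRAM" means, the weights alone take about (number of parameters) × (bits per weight) / 8 bytes. A quick sketch, where the 7B model size is just an example and the estimate ignores context cache and runtime overhead, so real usage will be higher:

```python
def weight_memory_gib(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Hypothetical 7B-parameter model at common precisions.
for bits in (16, 8, 4):
    print(f"7B model at {bits}-bit: ~{weight_memory_gib(7, bits):.1f} GiB")
```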
Thanks, I’ll look into it. For completionists: This is the article about how to properly archive paper: https://peelarchivesblog.com/2024/09/10/how-do-archivists-package-things-the-battle-of-the-boxes/