Always enjoyed scrolling though these posts, figured I’d give it a go here:
What are your must-have selfhosted services?
Some of mine:
A place to share alternatives to popular online services that can be self-hosted without giving up privacy or locking you into a service you don’t control.
Rules:
Be civil: we’re here to support and learn from one another. Insults won’t be tolerated. Flame wars are frowned upon.
No spam posting.
Posts have to be centered around self-hosting. There are other communities for discussing hardware or home computing. If it’s not obvious why your post topic revolves around selfhosting, please include details to make it clear.
Don’t duplicate the full text of your blog or github here. Just post the link for folks to click.
Submission headline should match the article title (don’t cherry-pick information from the title to fit your agenda).
No trolling.
Resources:
Any issues on the community? Report it using the report flag.
Questions? DM the mods!
One of my favorites is Whoogle, a simple Google search proxy. It accepts search requests and forwards them to Google anonymously, then strips out the AMP links and tracking. There’s even an option for it to use Tor so your IP address changes frequently.
Whoogle vs SearXng in your experience?
I used both, I ended up settling on searxng because Whoogle seemed to be unable to retain my settings. Might be something with my cookie configuration, but searxng has no problem remembering my preferences. If that is not a problem for you then they are comparable; Whoogle is pretty simple to get going and works well, searxng is slightly more complicated to set up (but not that much with docker) but has a ton more features.
Currently running
My personal setup:
I have been playing with some other tools, but these are the most important for me.
So the scanner saves the file in SMB-share(s), then Paperless(-xng) will automatically process it?
Maybe Paperless, with an LLM API integration to chat with the documents, using the power of referring to and verifying against Paperless’ concrete results, would be somehow useful.
Edit: Oh, this is already being discussed on their GitHub. Of course it is!
You are right with the first part. It only takes three clicks to scan a doc and have it available.
As for me, I’m not interest in sending my documents to open AI. But it would definitely offer some nice functions.
You wouldn’t have to. There are plenty of well-performing open-source models that work with an API similar to the Open AI standard, with which you can simply substitute OpenAI models by using a different URL and API-key.
You can run these models in the cloud, either selfhosted or “as a service”.
Or you can run them locally on high-end consumer-grade hardware, some even on smartphones, and the models are only getting smaller and more performant with very frequent advancements regarding training, tuning and prompting. Some of these open-source models are already claiming to be outperforming GPT-4 in some regards, so this solution seems viable too.
Hell, you can even build and automate your own specialized agents in collaborating “crews” using frameworks, and so much more…
Though, I’m unsure if the LLM functionality should be integrated into Paperless, or rather implemented by calling the Paperless API from the LLM agent. I see how both ways could fit some specific uses.
Some features like a “tl,dr” bot would probably not even need high end hardware, because it does not matter if it takes ten minutes for a summary.
Features like a chat bot do not belong into paperless IMO.
True, that’s a good take. Tl;dr for the masses! Do you think an internal or external tl;dr bot would be embraced by the Paperless community?
It could either process the (entire or selected) collection, adding the new tl;dr entries to the files “behind the scenes”, just based on some general settings/prompt to optimize for the desired output – or it could do the work on-demand on a per-document basis, either based on the general settings or custom settings, though this could be a flow-breaking bottleneck in situations where the hardware isn’t powerful enough to keep up with you. However, that only seems like a temporary problem to me, since hardware, LLMs etc. will keep advancing and getting more powerful/efficient/cheap/noice.
Right – but, opposingly to that, Paperless definitely do belong into some chatbots!
I think more “intelligence” in parsing the documents would be well-received. Just as OCR is fundamental to paperless, AI features could be the next step forward. Automatically extract the relevant positions of e.g. a bill, understand the document (and select the correct date, not my birthday) and apply correct tags to new documents.
Definitely!
Yes, I think that’s the way to go. If the paperless-ngx team doesn’t believe in following that path, someone else will probably fork the project and do it, or build something with similar capabilities “from scratch”. Then, it’ll be interesting to see what’s coming forth of open-source models with capabilites similar to GPT-4Vision… . . . . 🤯
Things I rely on are Nextcloud, Jellyfin, Wireguard, and Matrix-Synapse.
Syncthing - No introduction needed. Couldn’t live without it.
Healthchecks.io (you can self host this) - Dead man’s switch monitoring for all my automation. Most of my automated scripts hit up a Healthchecks endpoint when they run, and if they fail to hit the endpoint on a regular schedule I get notified. Mandatory for my anxiety.
I have a network drive that I put all my documents on. Would using syncthing have a better workflow than that?
It depends on what your workflow/usecase for putting documents on the drive currently is. Syncthing is usually intended to be put on two separate devices, and then a folder on each device gets synchronized - meaning you have a folder of your documents on each device. Is there any reason not to just mount the network drive’s folder and drag the documents in that way?
Yeah, that’s how I do it now. I just mount the network drive on each PC and they can all access the same files. I’m just wondering if there’s a usecase that syncthing has that my workflow doesn’t that I just can’t think of because I haven’t used it.
Yeah I wouldn’t bother. It intends for you to have a duplicate copy on every device, which is probably not what you want. Syncthing is really good for things like synchronizing notes, calendars, password databases, music, etc to your devices. Things that you want to access in both places, but that are usually disconnected from each other from time to time.
XMPP server and a basic WebDAV server.
My own Forgejo is nice to have.
According to my continued survival on the planet, none.
Not all of us are so lucky. I was hospitalized for 3 weeks until I was able to get my PiHole back up. I was nearly a goner.
Pfsense, Bitwarden, NAS running Debian, Kubernetes cluster. I have plans to expand And add more services when I get some of my newer hardware online.
PiHole
Sandstorm
I hardly ever see people talking about Pocketbase in threads like these, but as a dev I love it
What do you use it for and why do you like it over other databases
deleted by creator
You don’t hear a lot of talk because it’s SQLite with a thin layer added on top (an SDK and some Oauth modules). You can achieve the same in 5 minutes with SQLite and a few NPM modules.
Pocketbase is amazing.
For everyone else:
Home assistant is high on my todo list right after i set up my new proxmox host
Vaultwarden AdGuardHome + Sync Jellyfin + FinAmp + Supersonic Linkding + Linkding Injector LLDAP Calibre-web + Kobo
piped and libreddit, also i’d like to host my own simplex server
libreddit is dead though…
works still fine, especially self-hosted
Oh cool. I thought the API changes broke it
My whole infrastructure is designed so that my homeserver is expendable.
Therefore my most important tool is Syncthing. It is decentral, which is awesome for uptime and reducing dependance on a single point of failure. My server is configured as the “introducer” node for convenience.
I try to find file-based applications, such as KeePassXC or Obsidian, whenever I can so that I can sync as much as possible with Syncthing.
Therefore there is (luckily) not much left to host and all of it is less critical:
So the worst thing that can happen when my server fails is: I need to import my OPML to a cloud provider and I loose syncing for some less important stuff and my homepage is not accessible.
Since I just rebuilt my server, I can confirm that I managed a whole week without it just fine. Thank you very much, Syncthing!