Just an explorer in the threadiverse.

  • 2 Posts
  • 63 Comments
Joined 1Y ago
Cake day: Jun 04, 2023


Like helping to find a bug, discussing how to set up an application for a certain use case, or anything like that? Answering questions on Stack Overflow is an example, but is that the best way?

Generally the best way to help out is to do a thing that’s needed and that you can figure out how to do. Your list includes a bunch of good options, and I’ve been thanked for doing all those things at one point or another. Some common growth paths include:

  1. Using the software
  2. Encountering bugs, problems, or small opportunities for improvement.
  3. Discussing those informally in forums and helping people find workarounds.
  4. Identifying some of those issues as common things other people experience as well, then filing bugs for them with clear explanations and links to related forum discussions.
  5. Reading source code to better understand bugs.
  6. Discussing potential fixes in developer bug threads (or in GitHub or whatever).
  7. Submitting small fixes for simple bugs as pull requests.

Another path might be:

  1. Using the software and reading forums/docs for help.
  2. Answering basic questions on forums, looking to old threads and relevant docs.
  3. Learning about common questions.
  4. Writing blogs or forum posts about common questions.
  5. Submitting improvements to official docs to clarify common areas of confusion.

There are other paths as well. The main thing is to use a thing so you learn about it, and then use that knowledge to make it a little easier for the next person. Good luck!


I had a look through the comments on this HN thread the other day and came away more intrigued by https://github.com/openobserve/openobserve than hyperdx. Hyperdx is built on top of ClickHouse, whereas OpenObserve has its own storage engine based on parquet files that can be accessed from local disk, S3, or a few other protocols.

I haven’t tried either option yet… I’m currently using netdata for metrics and don’t do anything special for logs or tracing, but at tiny self-hosting scale I often find software with its own storage engine (often sqlite) to be extra hassle-free. I’m curious to kick the tires on openobserve for that reason.


This is a very strong explanation of what’s going on. And as a follow-up, I believe that ZeroTier presents a single Ethernet broadcast domain, and so WoL tricks are more likely to work naturally there than with Wireguard. I haven’t used ZeroTier, and I do use Wireguard via Tailscale/Headscale. I’ve never missed the Ethernet features of ZeroTier and they CAN result in a very chatty WAN if you’re not careful. But I think ZT would make this straightforward.

Though as other people note… the simplest/least-disruptive change is probably to expose some scripty thing on the rpi that can be triggered over a routed protocol and then have the rpi emit the Ethernet broadcast packets from the physical network.
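For anyone wondering what the "scripty thing" on the rpi might look like, here is a minimal sketch in Python. The function names, the broadcast address, and the port are my own choices for illustration (port 9 is a common convention for WoL, not a requirement). It only covers building and broadcasting the magic packet; how you trigger it over the routed network (ssh, a tiny web handler, whatever) is left open.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 bytes of 0xFF, then the MAC repeated 16 times."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected a 6-byte MAC address, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    return b"\xff" * 6 + mac_bytes * 16


def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP on the local physical network.

    The broadcast address and port are assumptions; adjust them for your LAN.
    """
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling `send_wol("aa:bb:cc:dd:ee:ff")` on the rpi (with the target machine's real MAC) emits the broadcast from inside the LAN, which is the part the overlay network can't do for you.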


I don’t think titles directly transfer between companies, and yet the industry allows it. It’s a very useful tool for advancement.

This may be true in some corners of the industry, but at the more competitive end (both in terms of competitive pay, and a competitive pool of candidates)… I believe it’s common to relevel on hire. I’ve seen folks go from director to senior and from senior to junior at my org. The candidates being offered those seemingly big “demotions” often seem to be somewhere between unfazed and enthusiastic about the change, presumably because the compensation package we offer at the lower level beats what they were getting with an inflated title and because they know their inflated title is nonsense and they’re frustrated with the other aspects of organizational dysfunction that accompany title inflation at their current company.

What you say is real, and sometimes a promotion in one org can help bridge you into an org that would have been hard to get hired into as a junior, or harder to get promoted in. It’s not without risk though. All things being equal, I’d much rather spend my time working on a strong team and learning a lot and being challenged than to be in a weaker org that’s handing out inflated titles. Getting gud isn’t a guarantee of advancement, but it’s at least as reliable over the long haul as title inflation.


To a first approximation, Tailscale/Headscale don’t route any traffic.

Ah, well damn. Is there a way to achieve this while using Tailscale as well, or is that even recommended?

Is there a way to achieve what? Force tailscale to route all traffic through the DERP servers? I don’t know, and I don’t know why you’d want to. When my laptop is at home on the same network as my file-server, I certainly don’t want tailscale sending file-server traffic out to my Headscale server on the Internet just to download it back to my laptop on the same network it came from. I want NAT traversal to allow my laptop and file-server to negotiate the most efficient network path that works for them… whether that’s within my home lab when I’m there, across the internet when I’m traveling, or routing through the DERP server when no other option works.

OpenVPN or vanilla Wireguard are commonly set up with simple hub-and-spoke routing topologies that send all VPN traffic through “the VPN server”, but this is generally a slower path than a direct connection. It might be imperceptibly slower over the Internet, but it will be MUCH slower than the local network unless you do some split-dns shenanigans to special-case the local-network scenario. With Tailscale, it all more or less works the same wherever you are, which is a big benefit. The exception is if you have a true multigigabit network at home and the encryption overhead slows you down… Wireguard is pretty fast though, and not a problematic throughput limiter for the vast majority of cases.


Have a read through https://tailscale.com/blog/how-nat-traversal-works/

You, and many commenters, are pretty confused about how Tailscale/Headscale work.

  1. To a first approximation, Tailscale/Headscale don’t route any traffic. They perform NAT traversal and data flows directly between nodes on the tailnet, without traversing Headscale/Tailscale directly.
  2. If NAT traversal fails badly enough, it’s POSSIBLE that bulk traffic can flow through the headscale/tailscale DERP nodes… but that’s an unusual scenario.
  3. You probably can’t run Headscale from your home network and have it perform the NAT traversal functions correctly. Of course, I can’t know that for sure because I don’t know anything about your ISP… but home ISPs preventing Headscale from doing its NAT traversal job are the norm… one would be pleasantly surprised to find that a home network can do that properly.
  4. Are you really expecting 10gb/s speeds over your encrypted links? I don’t want to say it’s impossible, people do it… but you’d generally only expect to see this on fairly burly servers that are properly configured. Tailscale just in April bragged about hitting 10gb speeds with recent optimizations: https://tailscale.com/blog/more-throughput/ and on home hardware with novice configs I’d generally expect to see something more like single gigabit.

I don’t know what’s up in your case, but I would not jump to the conclusion that it’s impossible to use tailscale with any other VPN in any circumstance.

Rather, tailscale and Mullvad will now work easily and out of the box. For other VPNs, you may need to understand the topology and routing of virtual devices and have the technical ability and system permissions to make deep networking changes.

So I’d expect one can probably find a way for most things to coexist on a Linux server. On a non-rooted Android phone? I’m less confident.


So I have a question, what can I do to prevent that from happening? Apart from hosting everything on my own hardware of course, for now I prefer to use VPS for different reasons.

Others have mentioned that client-caching can act as a read-only stopgap while you restore Vaultwarden.

But otherwise the solution is backup/restore. If you run Vaultwarden in a docker or podman container using volumes to hold state… then as long as you can restart Vaultwarden without losing data, you also know exactly what data needs to be backed up and what needs to be done to restore it. Set up a nightly cron job somewhere (your laptop is fine enough if you don’t have somewhere better) to shut down Vaultwarden, rsync its volume dirs, and start it up again. If your VPS explodes, copy these directories to a new VPS at the same DNS name and restart Vaultwarden using the same podman or docker-compose setup.
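To make the stop/rsync/start idea concrete, here is a rough sketch of what the nightly job could run. Everything specific in it is an assumption for illustration: the container name, the paths, and the choice to run it on the same host as the container (if you run it from a laptop you would wrap the stop/start in ssh and give rsync a remote source). The one design point worth copying is that the container gets started again even if the rsync fails.

```python
import subprocess
from datetime import date


def backup_commands(container, volume_dir, dest_root, today, runtime="podman"):
    """Return the stop / rsync / start commands for one dated backup run."""
    src = volume_dir.rstrip("/") + "/"  # trailing slash: copy the directory contents
    dest = f"{dest_root.rstrip('/')}/{today.isoformat()}/"
    return [
        [runtime, "stop", container],
        # dest_root must already exist; rsync creates only the final dated directory.
        ["rsync", "-a", src, dest],
        [runtime, "start", container],
    ]


def run_backup(container, volume_dir, dest_root, today=None,
               runtime="podman", run=subprocess.run):
    """Stop the container, copy its volume directory, and always start it again."""
    stop, sync, start = backup_commands(
        container, volume_dir, dest_root, today or date.today(), runtime
    )
    run(stop, check=True)
    try:
        run(sync, check=True)
    finally:
        # Bring the service back even when the copy failed.
        run(start, check=True)
```

A cron entry would then call something like `run_backup("vaultwarden", "/srv/vaultwarden/data", "/backups/vaultwarden")`, with those names swapped for whatever your setup actually uses. Pass `runtime="docker"` if that's what you run.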

All that said, keepass+filesync is a great solution as well. The reason I moved to Vaultwarden was so I could share passwords with others in a controlled way. For single-user, I prefer how keepass folders work and keepass generally has better organization features… I’d still be using it if it were only for myself.


You connect to Headscale using the tailscale clients, and configuration is exactly the same irrespective of which control server you use… with the exception of having to configure the custom server url with Headscale (which requires navigating some hoops and poor docs for mobile/windows clients).

But to my knowledge there are no client-side configs related to NAT traversal (which is kind of the goal… to work seamlessly everywhere). The configs themselves on the headscale server aren’t so bad either, but the networking concepts involved are extremely advanced, so debugging if anything goes sideways or validating that your server-side NAT traversal setup is working as expected can be a deep dive. With Tailscale, you know any problems are client-side and can focus your attention accordingly… which simplifies initial debugging quite a lot.


… only if you are in the US and get an API key from NCMEC. They are very protective of who gets the keys and require a zoom call as well.

Do you have a source for these statements, because they directly contradict the Cloudflare product announcement at https://blog.cloudflare.com/the-csam-scanning-tool/ which states:

Beginning today, every Cloudflare customer can login to their dashboard and enable access to the CSAM Scanning Tool.

… and shows a screenshot of a config screen with no field for an API key. Some CSAM scanners do have fairly limited access, but Cloudflare’s appears to be broadly available.


Yeah, misread the pricing page. Fixed the post, thanks for the correction.


I use Headscale, but Tailscale is a great service and what I generally recommend to strangers who want to approximate my setup. The tradeoffs are pretty straightforward:

  • Tailscale is going to have better uptime than any single-machine Headscale setup, though not better uptime than the single-machine services I use it to access… so not a big deal to me either way.
  • Tailscale doesn’t require you to wrestle with certs or the networking setup required to do NAT traversal. And they do it well, you don’t have to wonder whether you’ve screwed something up that’s degrading NAT traversal only in certain conditions. It just works. That said, I’ve been through the wringer already on these topics so Headscale is not painful for me.
  • Headscale is self-hosted, for better and worse.
  • In the default config (and in any reasonable user-friendly, non professional config), Tailscale can inject a node into your network. They don’t and won’t. They can’t sniff your traffic without adding a node to your tailnet. But they do have the technical capability to join a node to your tailnet without your consent… their policy to not do that protects you… but their technology doesn’t. This isn’t some surveillance power grab though, it’s a risk that’s essential to the service they provide… which is determining what nodes can join your tailnet. IMO, the tailscale security architecture is strong. I’d have no qualms about trusting them with my network.
  • Beyond 3 users, Tailscale costs money… about $6 US in that geography. It’s a pretty reasonable cost for the service, and proportional in the grand scheme of what most self-hosters spend on their setups annually. IMO, it’s good value and I wouldn’t feel bad paying it.

Tailscale is great, and there’s no compelling reason that should prevent most self-hosters that want it from using it. I use Headscale because I can and I’m comfortable doing so… But they’re both awesome options.


My money is also on IO. Outside of CPU and RAM, it’s the most likely resource to get saturated (especially if using rotational magnetic disks rather than an SSD, magnetic disks are going to be the performance limiter by a lot for many workloads), and also the one that OP said nothing about, suggesting it’s a blind spot for them.

In addition to the excellent command-line approaches suggested above, I recommend installing netdata on the box as it will show you a very comprehensive set of performance metrics without having to learn to collect each one on the CLI. A downside is that it will use RAM proportional to the data retention period, which if you’re swapping hard will be an issue. But even a few hours of data can be very useful and with 16gb of ram I feel like any swapping is likely to be a gross misconfiguration rather than true memory demand… and once that’s sorted dedicating a gig or two to observability will be a good investment.


Tailscale is out, unfortunately. Because the server also runs Plex and I need to use it with Chromecast on remote access…

I rather suspect you already understand this, but for anyone following along… Tailscale can be combined with other networking techniques as well. So one could:

  • Access Plex from a Chromecast on your home network using your physical IP, and on your tailnet using the overlay IP.
  • Or one could have some services exposed publicly and others exposed on the tailnet. So Immich could be on the tailnet while Plex is exposed differently.

It’s not an all or nothing proposition, but of course the more networking components you have the more complicated everything gets. If one can simplify, it’s often well worth doing so.

Good luck, however you approach it.


So for something like Jellyfin that you are sharing to multiple people you would suggest a VPS running a reverse proxy instead of using DDNS and port forwarding to expose your home IP?

I run my Jellyfin on Tailscale and don’t expose it directly to the internet. This limits remote access to my own devices, or the devices of those I’m willing to help install and configure tailscale on. I don’t really trust Jellyfin on the public internet though. It’s both a bit buggy, which doesn’t bode well for security posture… and also a misconfiguration that exposes your content could generate a lot of copyright liability even if it’s all legitimately licensed since you’re not allowed to redistribute it.

But if you do want it publicly accessible there isn’t a huge difference between a VPS proxying and a dynamic DNS setup. I have a VPS and like it, but there’s nothing I do with it that couldn’t be done with Cloudflare tunnel or dyndns.

What VPS would you recommend? I would prefer to self host, but if that is too large of a security concern I think there is a real argument for a VPS.

I use linode, or what used to be linode before it was acquired by Akamai. Vultr and Digitalocean are probably what I’d look to if I got dissatisfied. There are a lot of good options available. I don’t see a VPS proxy as a security improvement over Cloudflare tunnel or dyndns though. Tailscale is the security improvement that matters to me, by removing public internet access to a service entirely, while letting me continue to use it from my devices.


Do I need to set up NGINX on a VPS (or similar cloud based server) to send the queries to my home box?

A proxy on a VPS is one way to do this, but not the only way and not necessarily the best one… depending on your goals.

  • You can also use port-forwarding and dyndns to just expose the port off your home-ip. If your ISP is sucky, this may not work though.
  • You can also use Cloudflare’s free tunneling product, which is basically a hosted proxy that acts like a super port-forward that bypasses sucky ISP restrictions.
  • If you want to access Immich yourself from your own devices but don’t need to make it available to (many) others on devices you don’t control, I like and use tailscale the best. The advantage of tailscale is that Immich remains on a private network, not directly scannable from the internet. If there’s a preauth exploit published and you don’t pay attention to update promptly, scanners WILL exploit your Immich instance with internet-exposed techniques… whereas tailscale allows you to access services that internet scanners cannot connect to, which is a nice safety net.

Do I need to purchase a domain (randomblahblah.xyz) to use as the main access route from outside my house?

Not for tailscale, and I don’t think for Cloudflare tunnel. Yes for a VPS proxy.

I’ve run a VPS for a long while and use multiple techniques for different services.

  • Some services I run directly on the VPS because it’s simple and I want them to be truly publicly accessible.
  • Other services I run on a bigger server at home and proxy through the VPS because although I want them to be publicly accessible, they require more resources than my VPS has available. When I get around to installing Immich, there’s a decent chance it will go into this category.
  • Still other services, I run wherever and attach them to my tailnet. These I access myself on my own devices (or maybe invite a handful of trusted people into my tailnet), but aren’t visible to the public internet. If I decide not to use immich’s shared gallery features (and so don’t need it publicly accessible) or decide I don’t trust it security-wise… it will go here instead of the proxy-by-vps category.

I use k8s at work and have built a k8s cluster in my homelab… but I did not like it. I tore it down, am currently using podman, and don’t think I would go back to k8s (though I would definitely use docker as an alternative to podman and would probably even recommend it over podman for beginners even though I’ve settled on podman for myself).

  1. K8s itself is quite resource-consuming, especially on ram. My homelab is built on old/junk hardware from retired workstations. I don’t want the kubelet itself sucking up half my ram. Things like k3s help with this considerably, but that’s not quite precisely k8s either. If I’m going to start trimming off the parts of k8s I don’t need, I end up going all the way to single-node podman/docker… not the halfway point that is k3s.
  2. If you don’t use hostNetworking, the k8s model where traffic routes only within the cluster except for egress is all pure overhead. It’s totally necessary when you have a thousand engineers slinging services around your cluster, but there’s no benefit to this level of rigor in service management in a homelab. Here again, the networking in podman/docker is more straightforward and maps better to the stuff I want to do in my homelab.
  3. Podman accepts a subset of k8s resource-yaml as a docker-compose-like config interface. This lets me use my familiarity with k8s configs in my podman setup.

Overall, the simplicity and lightweight resource consumption of podman/docker are what I value at home. The extra layers of abstraction and constraints k8s employs are valuable at work, where we have a lot of machines and a lot of people that must coordinate effectively… but I don’t have those problems at home and the overhead (compute overhead, conceptual overhead, and config-overhead) of k8s’ solutions to them is annoying there.


Nutbutter sort of covered it.

  • Tailscale creates a virtual network.
  • That network can be (and is by default) private in that no one can join that you don’t allow, and in that respect it’s similar to your home network. You can join your laptop, desktop, and phone to your tailnet… but probably you cannot join your Chromecast or smart-television (they don’t publish tailscale clients for these devices).
  • If you configure Jellyfin to listen on your tailnet and not on the Internet… then you can access Jellyfin from anywhere using a device that is connected to your tailnet, but attackers on the Internet cannot access Jellyfin without first accessing your tailnet, which is hard to do.

The security/convenience tradeoff of tailscale is pretty good if you want to access a service from anywhere, but only from your own devices and only from supported operating systems (Linux, windows, OSX, android… not sure about iOS). It is another networking layer, which can be mind-bending… but as much as such a layer can be easy to use… tailscale is as easy as any of them.

However, Tailscale’s backend is not open-source. They may not log all the data passed through, but they certainly can look at it.

This sentence is nonsense though.

  • Tailscale is end to end encrypted, tailscale cannot quietly see your traffic.
  • Tailscale COULD, by default, surreptitiously join a node to your tailnet. If you’re super paranoid, they provide a way to disable this but it makes tailscale much less convenient to use: https://tailscale.com/kb/1226/tailnet-lock/
  • Tailscale is phenomenally transparent about security and has WAY higher standards than self-hosters: https://tailscale.com/security/.
  • Tailscale clients are open source, and they employ the author of Headscale, an open source implementation of the Tailscale control protocols.

There is very little to fear from Tailscale as a provider, and they support the headscale project if you want to go that route (which I do… but not because I am concerned about Tailscale’s integrity or security posture).


This is a great approach, but I find myself not trusting Jellyfin’s preauth security posture. I’m just too concerned about a remote unauthenticated exploit that 2fa does nothing to prevent.

As a result, I’m much happier having Jellyfin access gated behind tailscale or something similar, at which point brute force attacks against Jellyfin directly become impossible in normal operation and I don’t sweat 2fa much anymore. This is also 100% client compatible as tailscale is transparent to the client, and also protects against brute force vs Jellyfin as direct network communication with Jellyfin isn’t possible. And of course, Tailscale has a very tightly controlled preauth attack surface… essentially none if you use the free/commercial tailscale, and even when self-hosting headscale I’m much more inclined to trust their code as being security-conscious than Jellyfin’s.


Fair enough, sounds like you have a well considered use case for Kuma specifically. Good luck, I don’t have much to offer on your OP question.


I’m mostly in the pro-written word camp myself, but I have sought out video tutorials in cases where written docs seem to assume something I don’t know. When I’m learning something new, a written doc might have a 3-word throwaway clause like “… add a user and then…”. But I’ve never added a user and don’t know how. If it’s niche open-source software with a small dev team, this may not be covered in the docs either. I’ll go fishing for videos and just seeing that they go to a web-ui or config-file or whatever sets me on the path to figure out the rest myself.

That is to say, video content that shows someone doing a thing successfully often includes unspoken visual information that the author doesn’t necessarily value or even realize is being communicated. But the need to do the thing successfully on-screen involves documenting many small/easy factoids that can easily trip someone inexperienced up for hours.

I’m as annoyed as anyone when I want reference material and find only videos, and I generally prefer written tutorials as well. But sometimes a video tutorial is the thing that gets me oriented enough to understand the written work I wasn’t ready to process previously.

Edit: The ubiquity of video material probably has little to do with its usefulness though, and everything to do with how easy it is to monetize on YouTube.


This isn’t exactly an answer to your question, but an alternative monitoring architecture that elides this problem entirely is to run netdata on each server you run.

  • It appears to collect WAY more useful data than uptime Kuma, and requires basically no config. It also collects data on docker containers running on the server so you automatically get per-service metrics as well.
  • Health probes for several protocols including ping and http can be custom-defined in config-files if you want that.
  • There’s no cross server config or discovery required, it just collects data from the system it’s running on (though health probes can hit remote systems if you wish).
  • If any individual or collection of services is down, I see it immediately in their metrics.
  • If the server itself is down, it’s obvious and I don’t need a monitoring system to show a red streak for me to know. I’ve never wasted more than a minute differentiating between a broken service and a broken server.

This approach needs no external monitoring hosts. It’s not as elegant as a remote monitoring host that shows everything from a third-party perspective, but that also has the benefit of not false-positiving because the monitoring host went down or lost its network path to the monitored host… Netdata can always see what’s happening because it’s right there when it happens.


I use postgres for my install and had a similar thing happen to me. I tried moving an org credential to a folder, which moved the folder to the org, and kicked all other credentials to “no folder”.

Thanks for confirming with your DB. That saves me sweating whether I should rebuild on PG at least, and also makes me feel better that it’s a folder bug and not generalized database corruption.

Having finished the heavy organizing, my rate of big org transfers has slowed and I haven’t reproduced again yet. Hopefully this will be uncommon enough to be a non-issue. Thanks again for the info.


Thanks for the suggestion, but sync seems to be working ok… at least on the read side. I was able to verify the pre-existing good state and the bad state afterward from multiple clients. If sync played into it, it must have been on a write somehow.


Vaultwarden Users: Folder Unassignment Bug?
Hey Vaultwarden users... I was turned on to Vaultwarden by this community and have a new installation up and running. I've recently imported a pretty substantial keepass DB and have been manually validating the import and tidying up my folder organization as I go, including selectively moving some credentials to an organization with the future intention of adding family members to that org to access shared accounts.

By and large it's all going swimmingly with one concerning exception. Every now and again, a bunch of credentials forget their folder and get moved into "no folder".

  • I don't have a reliable reproduction yet, but it seems vaguely correlated with bulk moves. In the web-ui, I'll check a bunch of entries to move from my vault to the org, and OTHER entries I didn't touch get moved to "no folder" in my vault as a side-effect.
  • Once I had a folder disappear like this as well.
  • I think I understand the basics around how collections, folders, and nesting of those containers work. I'm fairly confident that I'm not getting tripped up by just failing to understand the implications of the operation I'm doing.
  • I'm using sqlite for my db backend. I'm perfectly comfortable running a Postgres instance, I just thought the no-maintenance and no-dependencies approach of sqlite felt like a good match for this tiny but critical dataset. Could it be that the sqlite backend is underbaked and I'm hitting some persistence bug?
  • Fwiw I've also seen issues where I get an encryption key error saving an entry or I see tons of missing entries. In each case logging out and logging in works around the issue. I had assumed this was browser/web buglets, but now I wonder if it's more signs of storage layer problems.

Have others seen similar issues? What db backend are you using?

A very common DDoS attack uses UDP services to amplify your request to a bigger response, but then spoof your src ip to the target.

Having followed many reports of denial of service activity on Lemmy, I don’t think this is the common mode. Attacks I’ve heard of involve:

  • Using regular lemmy APIs backed by heavy database queries. I haven’t heard discussion of query rates, but Lemmy instances are typically single-machine deployments on modest 4-core to 32-core hardware. Dozens to thousands of queries per second to the heaviest API endpoints are sufficient to saturate them. There’s no need for distributed attack networks to be involved.
  • Uploading garbage images to fill storage.

Essentially the low-hanging fruit is low enough that distributed attacks, amplification, and attacks on bandwidth or the networking stack itself are just unnecessary. A WAF is still a good idea if OP’s instance is indeed getting attacked, but I’d be surprised if WAFs have built-in rules for lemmy yet. I somewhat suspect one would have to do the DB query analysis to identify slow queries and then write custom WAF rules to rate limit the corresponding API calls. But it’s worth noting that OP has provided no evidence of an attack. It’s at least equally likely that they dos’ed themselves by running too many services on a crappy VPS and running out of ram. The place to start is probably basic capacity analysis.
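For a sense of what "rate limit the corresponding API calls" means mechanically, here is a small token-bucket sketch. It is not a WAF rule and not anything from lemmy itself, just the general technique a per-endpoint limit usually boils down to. The class name and the numbers are made up for illustration, and the clock is injectable only so the behaviour is easy to check.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilling at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def allow(self):
        """Return True and spend a token if the request may proceed, else False."""
        now = self.clock()
        # Refill in proportion to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice you would keep one bucket per (client, heavy endpoint) pair and reject or delay the request whenever `allow()` returns False. Which endpoints count as heavy is exactly what the slow-query analysis would have to tell you.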

Some recent sources:


Docker is a powerful tool to increase confidence in your backups.

  • In a VM, the way you figure out which files to backup is to read the docs. If they’re wrong or you misread them, the only way you’ll find out is by doing a full restore test… which is often painful and complex in home setups.
  • In docker, the filesystem outside volumes is destroyed between every container restart. If your volume setup is insufficient, you’ll repeatedly lose state during your initial installation process between container restarts. You’ll continually test your state management throughout the lifetime of the service during restarts. This leaves a much smaller window for backup mistakes.

The tradeoff with docker is that the networking is complex (well, everything is complex… but the networking is where it often hurts). But if you’re able to deal with that one-time pain, it’s superior almost all the time for home setups. I think the only things I run outside docker are ssh and netdata. SSH because it’s stateless and works perfectly out of the box, and netdata because it wants permissions to everything… and is functionally stateless for me because I don’t care if I drop my observability data.


The Foundry VTT community frequently uses video conferencing for tabletop roleplaying games and initially Jitsi was the recommended self-hosted video option, but the community has since moved on and now recommends https://livekit.io/. I didn’t set up either and don’t have deep insights into what drove the shift, but it’s an interesting data point around a community that tried both shifting focus away from Jitsi.


I can’t remember if it’s enabled by default or not, but it’s easy enough to enable pprof and get a helpful performance profile from /debug/pprof. See https://caddy.community/t/hangs-on-reload/12010/18 for an example.

I’ve found that even being unfamiliar with the codebase, it’s often pretty easy to identify what part of the call stack is being slow and file a very useful performance bug report in GitHub. Check out the profile and see if it leads to any obvious conclusions about why domains are so much slower. There may be some function that’s trivial to cache the results of that brings things back to the expected performance.
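If it helps, grabbing a CPU profile can be as simple as the sketch below. It assumes the pprof handlers are reachable on the admin endpoint at http://localhost:2019, which is my assumption about your setup, so adjust the base URL if your admin address differs. The `/debug/pprof/profile?seconds=N` path is the standard one served by Go's pprof handlers. The function names and the output filename are just illustrative.

```python
import urllib.request
from urllib.parse import urlencode


def profile_url(base="http://localhost:2019", seconds=30):
    """Build the URL for a CPU profile captured over `seconds` seconds."""
    return f"{base.rstrip('/')}/debug/pprof/profile?{urlencode({'seconds': seconds})}"


def fetch_profile(out_path="cpu.pprof", base="http://localhost:2019", seconds=30):
    """Download a CPU profile to `out_path`; the request blocks while the server samples."""
    url = profile_url(base, seconds)
    # Allow some slack beyond the sampling window before timing out.
    with urllib.request.urlopen(url, timeout=seconds + 10) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
    return out_path
```

Run `fetch_profile()` while the slow operation is happening, then open the result with `go tool pprof cpu.pprof` to see where the time goes.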


In the vast majority of cases, one can support variation in admin preferences by exposing a configuration parameter. Your downvote example is perfect because Beehaw doesn’t run a customized lemmy codebase. There is a checkbox exposed to lemmy admins that enables/disables downvotes.

Running a custom codebase is generally the highest-hassle method of achieving some custom-config goal. The absence of communities around this approach isn’t an accident: the people who develop customizations generally try to work with the upstream unless the devs give them good reason not to.
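To make the “config parameter instead of a fork” idea concrete, here’s a toy sketch. It is not lemmy’s actual code; `SiteSettings`, `enable_downvotes`, and `apply_vote` are hypothetical names I picked for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteSettings:
    # Hypothetical admin-facing setting; every instance runs the same code
    # and only this value differs.
    enable_downvotes: bool = True


def apply_vote(score, vote, settings):
    """Return the new score after a vote of +1 or -1."""
    if vote not in (-1, 1):
        raise ValueError("vote must be +1 or -1")
    if vote == -1 and not settings.enable_downvotes:
        # Downvotes are switched off on this instance, so ignore the vote.
        return score
    return score + vote
```

One codebase, two behaviors: `apply_vote(10, -1, SiteSettings())` drops the score, while the same call with `SiteSettings(enable_downvotes=False)` leaves it alone. Nobody has to maintain a fork for that.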


Are there instances that run modified versions of the base Lemmy software? For example, that use their own sorting algorithms, or provide users ways to block instances or specific users, etc?

If one had developed code to do these things, why would one not upstream it so it’s released in core lemmy and all instances can benefit from that capability?


ZFS zRAID is pretty good for this I think. You hook up the drives from one “pool” to a new machine, and ZFS can detect them and see that they constitute a pool and import them.

I second this approach, but if one isn’t down with ZFS, LVM can bodge a raid onto any filesystem at the block layer. I don’t remember when I got over hardware raid envy and decided that I preferred software raid for my home lab, but it was a long while ago and I’ve never regretted it. Being able to plug some drives into any old USB, sata, or whatever port on any Linux box is super valuable when things start going sideways and you don’t have budget for spare hardware or rapid-response support contracts.


I enabled it and out of the box none of my containers could resolve DNS, even though aardvark was running.

I experienced this on Ubuntu as well, and addressed it by opening up a firewall rule on the network interface for my podman network allowing the ip-range of the podman network to issue DNS requests to the gateway-ip (which is where aardvark-dns sets up shop).

Also had to add a firewall rule to open whatever ports I exposed from all src-ips to the podman network range before exposing hostPorts would work.
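For the curious, here’s roughly what those rules look like, written as a small Python helper that only prints candidate commands (it doesn’t run anything). Big assumptions: the firewall is UFW, the gateway is the first usable address in the podman subnet, exposed ports are TCP, and the port rule needs to be a `route` rule. The interface name and subnet in the usage note are placeholders too, so check yours with `podman network inspect`.

```python
import ipaddress


def podman_ufw_rules(interface, subnet, host_ports=()):
    """Return candidate ufw commands for a podman bridge network."""
    net = ipaddress.ip_network(subnet)
    # Assumption: aardvark-dns listens on the gateway, taken here to be
    # the first usable address of the subnet.
    gateway = next(iter(net.hosts()))
    rules = [
        # Let containers send DNS queries to the gateway over UDP and TCP.
        f"ufw allow in on {interface} proto {proto} from {net} to {gateway} port 53"
        for proto in ("udp", "tcp")
    ]
    for port in host_ports:
        # Assumption: published ports are TCP and are forwarded to the
        # container network, hence a route rule.
        rules.append(f"ufw route allow proto tcp from any to {net} port {port}")
    return rules
```

Calling `podman_ufw_rules("podman1", "10.89.0.0/24", host_ports=[8080])` gives three commands to review before running any of them with sudo.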

Again, not critiquing the very capable macvlan setup, just sharing tips I’ve picked up on making netavark work.


This is a pretty awesome how-to. I knew nothing about containerizing GPU workloads before this, and it seems quite a lot less scary/involved than I feared.

FWIW, I think some of your DNS and general networking woes may be due to the macvlan setup rather than using netavark. Netavark seems like the golden path going forward for a batteries-included experience. Not that I have anything against macvlan, in many ways macvlan feels simplest and nicest for homelab setups and I’ve used it with LXC and other container runtimes in the past. But for the most docker-like “it just works” experience, I feel like netavark is getting the upstream love.


Is it necessary to add an instance to the allowed list? If federation is enabled, isn’t an instance ’allowed’ by default?

Your understanding is correct, and making your “allowed” list non-empty is a big deal because it implicitly defederates you from every instance that isn’t in that list. I rather suspect that OP doesn’t understand what the allowed list does and is trying to find ways to promote small instances while lacking knowledge about how things work.

Defederating from all major instances would almost certainly relegate small instances doing so to irrelevance.
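A simplified model of the semantics described above, in case it helps. This is my own sketch and not lemmy’s implementation; `federates_with` and the `*.example` instance names are made up.

```python
def federates_with(instance, allowed=(), blocked=()):
    """Return True if we would federate with `instance`.

    Simplified model: a non-empty allowed list acts as a strict allowlist,
    otherwise everything federates except what is explicitly blocked.
    """
    if allowed:
        # Anything not listed is implicitly defederated.
        return instance in allowed
    return instance not in blocked
```

With empty lists, `federates_with("big.example")` is true. Add a single entry to the allowed list, e.g. `allowed={"small.example"}`, and `big.example` and every other unlisted instance stops federating.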


Fwiw, federation is known to be relatively unreliable at the moment. https://github.com/LemmyNet/lemmy/issues/3101 is marked as closed, but it seems pretty clear to me that many of these behaviors persist in 0.18.3. There may well have been improvements, but we haven’t yet achieved full resolution of these things.



is it worth starting out with podman or is this just some job requirement and docker is perfectly fine for us hobbyists

I’m doing this in my homelab, but I am a pro and so time spent learning arcane details of container ecosystems is not precisely wasted time for me. But I’m not doing it directly for some particular professional requirement, it’s more curiosity.

Based on my experience, I don’t think I could honestly recommend podman right now for a beginner. The people that tend to be most interested in podman tend to think:

  • The best days of docker are behind it. The company hasn’t achieved financial success and is going to make it worse over time to pressure companies into paying them. We’ve seen the start of this with docker-desktop but I’m predicting it will continue and escalate.
  • Docker was the first really successful container system and is very monolithic and full of questionable technical decisions. Improving it will be hard because of its success, and also because its monolithic nature means that many changes will bottleneck at docker the company, who as noted is not incentivized to make its open source stuff “too good” such that companies use it without paying.

Podman is more modular, is supported by more successful and stable companies that can have revenue strategies that don’t require them to monetize podman specifically to death, and the individual pieces are small enough to be built and supported by individuals and non-commercial teams if necessary. So I’m sort of betting that over time podman will gain more traction and am willing to invest in learning my way through some bumps in the road as that happens. For beginners, I think you’ll know it’s time to consider a switch when projects start to ship podman configs instead of docker-compose configs. Then you’ll know that those devs think that supporting podman deployments will give them fewer headaches than supporting docker deployments and we’re reaching the inflection point where podman is starting to “win” and legit be easier/better. Right now I’m pretty clearly swimming upstream and I’m ok with that.

But relating back to OP’s question, although my usage of podman is a bit bleeding edge… it still illustrates the kind of problems every self-hoster hits and how it’s necessary to break those problems down into smaller parts to solve them yourself. It’s just not realistic to expect every self-hosting scenario to be fully tutorialized. Tutorials help us understand how the pieces fit together, but when things go wrong we have to understand the pieces and troubleshoot them directly. We can’t expect a tutorial to cover a fractally complex subject in an easy/brief overview while also diving into infinitely many edge cases in depth.


I acknowledge it’s all surely basic but I’m not sure where to find a comprehensive source of learning instead of googling bits and pieces.

I think a challenge you are likely to run into is that self-hosting many services really ISN’T basic, and there simply aren’t comprehensive sources… and really can’t be. It’s too varied and complex. Every network environment is different, and every network environment is so complex that it takes a networking expert to understand. No tutorial can cover all the possibilities, or even help you figure out what scenario you’re in.

As an example, I’m currently migrating from docker-compose to podman-kube-play for my container management. I’m a professional engineer who works with containers every day, and I’ve spent the better part of a week trying to get my first non-trivial container to run.

  • I’ve had to read tutorials to see how to get started.
  • I’ve had to read podman docs to see what k8s config options are supported.
  • I’ve had to read bug reports and examples from people using podman to see how specific features get strung together for complex use-cases like mine.
  • Even after getting many things right, DNS resolution didn’t work in my container. I spent many hours researching and found nothing. I finally had to start installing debugging tools like dig and nmap in my container to find that I couldn’t speak to the DNS server at all. I eventually found firewall logs showing that UFW was blocking the traffic from the container to the DNS server. UFW has nothing to do with Podman. Arch and Fedora users would not have been affected by this issue. Ubuntu users like me still wouldn’t have been affected if they were using host-networking or rootless podman. My specific environment and use-case was affected.
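The “can I speak to the DNS server at all?” check from that last bullet doesn’t strictly need dig installed. Here’s a minimal sketch using only the Python standard library, assuming an IPv4 server and plain UDP. `build_dns_query` and `dns_server_reachable` are names I made up, and this is not what I actually ran at the time.

```python
import socket
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Encode a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Each label is a length byte followed by the label's ASCII bytes.
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in hostname.rstrip(".").split(".")
    )
    # Zero-length root label, then QTYPE=A (1) and QCLASS=IN (1).
    return header + labels + b"\x00" + struct.pack("!HH", 1, 1)


def dns_server_reachable(server, hostname="example.com", port=53, timeout=2.0):
    """Return True if `server` answers a UDP DNS query before the timeout."""
    query = build_dns_query(hostname)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (server, port))
            reply, _ = sock.recvfrom(512)
        except OSError:
            # Timeouts and unreachable-network errors both land here.
            return False
    # A genuine reply echoes the query id in its first two bytes.
    return reply[:2] == query[:2]
```

Run `dns_server_reachable("<your gateway ip>")` from inside the container. A `False` when the server is definitely up is a hint to go look at firewall logs rather than at the DNS config.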

There is simply no single resource on the internet addressing my personal scenario. To get to the bottom of it, I had to know enough about podman, k8s, DNS, networking, firewalls, UFW specifically, where interesting data on my system tends to get logged, and enough about “normal” logs to sift through the garbage and find the logs that led me to a solution.

So I recommend switching your perspective. Stop looking for a one-stop-shop that doesn’t exist. Instead, try to learn when the thing you’re trying to do is really 5 different things lashed together with duct tape. Then start deep-diving on each piece until you know enough about that thing to relate it to your specific environment and move on to the next thing. This is time consuming, especially as you’re getting started… but it’s fractally deep and remains time consuming forever as you continue to learn new things and aspire to do more complicated stuff. This breaking down of complex topics into a series of simpler (but not necessarily simple) topics is the hallmark of every successful engineer I’ve ever met.


I read the design notes and found them pretty interesting even though I haven’t really been pining for a faster linker: https://github.com/rui314/mold/blob/main/docs/design.md


I went through that phase too, but people haven’t widely adopted the idioms around immutable infrastructure for no reason. My LXC setup was more work to maintain and left me with much MUCH more upgrade uncertainty than my idiomatic/immutable container setup does. I have a deep understanding of both systems and both approaches and I would never go back to using LXC like VMs.


This post gives an overview of several self-hostable management systems that enable one to configure multiple clients and tunnels via wireguard. It gives a nice comparison between them, and I learned a bit about how they compare and overlap.