(Justin)

Tech nerd from Sweden

  • 1 Post
  • 106 Comments
Joined 1Y ago
cake
Cake day: Jun 10, 2023

help-circle
rss

Should work well for that!

If you use cloudflare for dns only and turn cloudflare proxying off, none of your data or traffic goes to cloudflare’s servers. They just act as your dns server, telling your devices what IP to go to.


it might be better to skip the cloud server and use cloudflare for dynamic dns. The standardized way to restrict access to websites is with client certificates or a basic authentication (user/pass) proxy. That would help avoid issues with internet traffic passing through the VPN accidentally.


Nixos’ weakness is definitely it’s documentation. There’s often configuration snippets you can copy and paste, though. If you go with NixOS, make sure to come back with questions, the community is very helpful.


Unraid is bad at NAS and bad at docker. Go with a separate Nas and application server.


Is there a way for me to be “notified” if shell access of any form is gained by someone?

Falco is a very powerful tool for this.


If you’re not using something like synology, it isn’t really an issue to run applications and nas on the same machine. I would generally recommend separating them so you have more options in the future if you want to run muliple servers for HA or expansion, but it should be fine either way. It is worth noting that quad core N100 computers are like $150 on aliexpress if you want a cheap application server(s).


Generally it’s simpler if you have your NAS separate from your application server. Synology runs NAS really well, but a separate application server for docker/etc is a lot easier to use and easier to upgrade than running on Synology. Your application server can even have a GPU for media transcoding or AI processing. Trying to do everything on one box makes things more complicated and fragile.

I would recommend something like Debian or NixOS for the application server, and you should be able to manage it over SSH. You can then mount your NAS as an NFS share, and then run all your applications in Docker or NixOS, using the NAS to store all your state.


That’s fair. I’m just thinking I could never use something like this because I would be invading the privacy of others using my Jellyfin. I would live to see an anonymous view counter on every movie though tbh.


Seems pretty creepy to be collecting logs about what people watch. Why do people use this?


You need IP cameras and then you need a NVR server for recording, detection, and display. There are some good open source NVR programs out there with docker support. I’ve been wanting to try Viseron. There’s also ZoneMinder and Shinobi that seem to be good.

Unfortunately most consumer cameras are cloud only. This seems to be a list of cameras you can look into: https://wiki.zoneminder.com/Hardware_Compatibility_List

Your best bet is probably a chinese brand for cameras. Dahlua seems popular. There are also a bunch of PoE cameras on Aliexpress for $15-25, but I can’t attest to if they’re any good. Hikvision cameras seem to have been popular too, but they have been recently sanctioned by EU/US for human rights violations.


STH measured 23w on theirs, but it can vary based on which one you buy. Tons of compute power with those 4 E cores.

https://www.servethehome.com/fanless-intel-n100-firewall-and-virtualization-appliance-review/4/


$150 fanless N100 pc with 4x2.5gbps from aliexpress and install OPNsense on it.


I guess that makes sense, but I wonder if it would be hard to get clean data out of the per-token confidence values. The LLM could be hallucinating, or it could just be generating bad grammar. It seems like it’s hard enough already to get LLMs to distinguish between “killing processes” and murder, but maybe there could be some novel training and inference techniques that come up.


I thought confidence levels were for image recognition? How do confidence levels work for transformer LLMs?


Being able to find and read software documentation and knowing how to use the tools that automate software deployment are why SRE/devops/cloud guys get paid the big bucks.

I definitely recommend synapse over dendrite or conduit btw. dendrite and conduit have a bunch of missing features, and my first attempt at dendrite server shat the bed with its NATS store and died. I definitely recommend Synapse for all matrix servers going forward.

The .well-known entries I found were the hardest to test, since synapse doesn’t provide a web server for them, and Element throws a fit if you don’t have CORS set up exactly in the way it wants you to.

I mostly have my matrix server working now, with bridges even. However, Element randomly logs itself out on a daily basis which is really frustrating :/



Apparently there’s something called fcast, but I’ve never tried it.

https://fcast.org/


they have CEC adapters and remotes for PCs that you can use.


Yeah I’ve been wanting to start using it. I have a colleague who uses it for platform engineering and it’s supposed to be amazing. I was going to use it for creating offsite backup buckets on OVH but I ended up setting up a Hetzner storage box manually instead because that was cheaper. Since everything I have is self hosted, really the only external infrastructure I have is Cloudflare, but all the records there are handled by external-dns, so I haven’t really seen a need to GitOps it.

One thing I do want to look at was the custom CRD feature they were talking about at Kubecon this year, it sounded like they might have finally fixed the platform engineering abstraction problem that people have been trying to use helm (badly) for. Many companies have actually been resorting to operators for this problem, which is super overkill. I did try to use cdk8s for abstraction last year, and I was even planning to create and support a production-ready Lemmy deployment option using cdk8s, but cdk8s honestly was quite clunky on the developer side and committed the sin of reimplementing an API without even properly documenting the new API.

I’m probably just going to create a Lemmy Helm chart at some point using Cloud-Native Postgres operator and Gateway API when I have time. But Helm has glaring issues, both as a developer and as a user.


Kubernetes does it a lot better. No more messing with caddy config files, or docker sockets, you get the real deal, production stuff.

Containers automatically take themselves off the built-in loadbalancer and/or restart when they fail a health check.

A new high-availability postgres cluster with automatic backups is just a Cluster, a firewall rule is just a NetworkPolicy, a new subdomain is just an HTTPRoute, a new proxy container is just a Gateway, a new auto-renewed Let’s Encrypt certificate is just a Certificate, and DNS is set up automatically with the domain name from the HTTPRoute without me touching anything. Everything is high-availability and self-healing, I’ve never had anything go down or crash.

The other thing is ArgoCD, which automatically syncs your cluster with git. If I edit any of my config files in git, it is instantly updated on the cluster itself.

Here is my configuration for my 200+ containers, even my Lemmy instance is running here: https://codeberg.org/jlh/h5b/src/branch/main/argo/custom_applications

Docker and the Docker ecosystem copies a lot of features from Kubernetes, because they’re essentially the same thing, but Kubernetes does it in a production-ready, maintainable way. Kubernetes is an automation tool that lets 1 engineer do the work of 10.


I mean that with k3s you can get a kubernetes cluster running with 0 effort on a single machine. It is easier to maintain, because it handles restarting containers, updating containers, managing ports, provisioning storage, creating databases, etc for you. I’ve found the logs and events system to be super useful for troubleshooting compared to Dockerd, but maybe it can be tricky if it does something you don’t expect it to.

Obviously you need to learn how to use that automation to take advantage of it, and stuff like networking and persistent volumes can be confusing if you don’t have a good guide on it. The fact that there are different drivers for networking, storage, database management, etc can also take a bit of time. That said, networking and storage can be confusing on Docker too if you don’t have a good guide, and Docker-compose also has a learning curve, so I honestly don’t think Kubernetes is that much more effort. The main thing is that most guides are written for Docker, but the Kubernetes documentation is really good too.

If you just want to just run containers for jellyfin and home-assistant, Docker compose will be good enough. But if you want databases, reverse proxy, certificates, dns, self-healing, etc, for running bigger stuff like nextcloud and lemmy, then I would spend the extra 50% effort and do it on Kubernetes, it’ll save you time and headaches in the long run.

Asking an LLM like Lllama or ChatGPT might be a good way to learn the basics with Kubernetes, but things move fast once you start getting into the newest operators like CNPG and Gateway API.


Fair enough. I think it’s bad to invent new words for “stopped container”, though. And there should be a way to re-start them.

Yeah, the container creation GUI is a mess. The $3/month thing is a new thing they started for new customers this year. https://unraid.net/pricing

Not a big deal for grandfathered users, but I think its important to consider as a new customer, as you won’t even get security updates without paying the subscription fee. Even for vulnerabilities like the CVE-2024-21626 Leaky Vessels vulnerability.

The raid is nice, but it can be kinda clunky adding/removing drives sometimes and I’ve managed to accidentally destroy an array when I was playing with it. I think you can get identical features using LVM, but obviously it’s nice how Unraid does it all for you in a GUI.


Yeah, definitely go with a single machine for containers if you haven’t seen a need for disaggregation. Even a cheap Aliexpress N100 box is super capable.

Regarding the jump to Kubernetes, I will point out that Kubernetes is a tool for container orchestration and automation, not necessarily a container cluster. I have found many benefits from using Kubernetes on a single node, so I wouldn’t consider container clustering a prerequisite for Kubernetes.


Makes sense. I would probably recommend more infrastructure-as-code workflows over snapshots, like ArgoCD or docker-compose, as git commits are simpler than VM snapshots. But both ways work.


I’m well aware.

I was #8 on this list: https://web.archive.org/web/20240221094039/unraid.net/about

The way that Unraid manages Docker containers is really dumb, and it gets in your way SO MUCH. Orphans are not a normal Docker idea, it is something invented by Unraid. It actively makes managing containers harder, as there is no documented way to restore orphans if I recall correctly. Creating new containers is confusing and uses non-standard terminology, when docker-compose files have been standard for half a decade now. Unraid is a really bad container orchestrator with bad abstractions and no ability to do Infrastructure as Code. The only good thing is the GUI for monitoring containers.

The monitoring GUI is nice, and I guess if you’re doing everything with the CLI and just using the GUI for monitoring it makes sense. But CLI is not a supported workflow with Unraid, and what are you paying $3/month for if you’re just going to use the CLI? I personally wouldn’t recommend the overhead, setup, and upgrade headaches over just doing the CLI with Debian. There are just as nice free dashboards available for Kubernetes.

For what it’s worth, this is my homelab: https://codeberg.org/jlh/h5b

I run nearly 300 containers in a 4 node cluster, with a separate router and iot server. Every single piece is implemented in code, because that’s easier to maintain and document. I used Proxmox for VMs/LXC for a while, and I used FreeNAS for ZFS+NFS for a while, but now I use purely NixOS and Kubernetes. I have never seen Unraid as a valuable thing that I would like to add to my homelab in the past 8 years.


Pretty much the tradeoff that you said. Harder to maintain an all in one box since things conflict with each other. That said, it’s also harder to maintain 10 devices instead of 2. Usually, you want to segregate your services based on maintenance schedule. Something that you reboot once a year like your router probably shouldn’t be on the same device as something that you might reboot every day, like home assistant, if you value your sanity.

Also, virtualization is pretty much dead-end now and will just make your life harder.

In terms of the easiest software available for self hosting, I would use a dedicated router and a dedicated nas, as those are fairly standalone and can be purchased as appliances. Then I would use a single machine with Debian or NixOS, and use it as a Kubernetes or Docker host. (Kubernetes is super easy with k3s and easier to maintain than Docker, but there’s a higher barrier to entry as you’d have to write your services with Pod files instead of docker-compose files)

I wouldn’t recommend something that tries to do everything, like Unraid, TrueNAS, or Proxmox, as they honestly obfuscate things and make things harder to maintain. Though they can be nice for DIY NASes.

If you’re interested in high availability and clustering for a DIY NAS, you could even look into ceph/rook, which is what I’m using for my NAS, but it’s like 20x the effort of just having a standard NFS appliance.


Coops are still about the money. They’re about saving money by sharing resources with fellow workers/consumers, and maintaining democratic control over the company. You’re not going to get rich from a coop (without embezzlement), but you and your coowners will be cutting out the middle man. Obviously, it only makes sense for industries that you’re heavily invested in.


Self hosting can save a lot of money compared to Google or aws. Also, self hosting doesn’t make you vulnerable to DDOS, you can be DDOSed even without a home server.

You don’t need VLANs to keep your network secure, but you should make sure than any self hosted service isn’t unnecessarily opens up tot he internet, and make sure that all your services are up to date.

What services are you planning to run? I could help suggest a threat model and security policy.


No shit. These machines are as advanced as a nuclear power plant, they’re gonna have a bit more proprietary software and security protocols than you’d think. Not as simple as just pressing “start” with these machines.

I wonder if asianometry will do a video about the software on ASML machines sometime.


Definitely! If your VPN keeps logs, is in a surveillance-friendly jurisdiction, etc, then details of your internet traffic can be revealed by your VPN. I recommend Mullvad, paid with cash, for the most security. It can also help to pick VPN servers outside of the most egregious jurisdictions, like picking EU servers over US or HK servers.


DoH is meant to hide your internet activity from your ISP/cell-provider since DNS is otherwise unencrypted. If you trust your VPN, then you can trust unencrypted DNS.


The first step in security is to answer who you’re defending against. Someone stealing your phone? A cop with a STINGRAY device? All the security decisions you make are based on your initial threat model.

Generally, home internet, wifi, and cellular data are considered safe against passers-by (assuming your wifi password is strong). However, they are also assumed to be eavesdropped on by your ISP and government. Details of your internet traffic can then also be revealed by your ISP to other people during legal action, such as if you’re being investigated for piracy.

There are ways to further protect your internet traffic from being snooped on, even from your ISP and government, by using things like HTTPS, DNS over HTTPS, and of course, VPNs.


It would accelerate the ongoing brain drain in Hong Kong at least, and encourage the stragglers to finally leave for more democratic countries. Banning Google in Hong Kong would be a shitshow for the CCP, but Google doesn’t have any sort of spine or ethics.


The people? Democracy really isn’t that hard.


It’s an algorithm for determining how fast to upload packets. This article just talks about how to enable it.

Here’s the Wikipedia section about it: https://en.wikipedia.org/wiki/TCP_congestion_control#TCP_BBR

The gist is that instead of only throttling upload rate based on packet loss, BBR constantly measures roundtrip delay (ping) to determine how much bandwidth is available.


Stop basing your organization on Discord and hosting all your development work there. Don’t subject yourself to the whims of venture capital and enshittification.

Diversify your online presence, and find a local company that will host a matrix server for you.


Yeah, there’s definitely a learning curve since it’s so different from docker, but there are some good tutorials and everything just makes sense. All the error messages are googlable and everything fits together so well.


Kubernetes has user accounts that you could use to restart containers in an unprivileged way. Create a role and role binding that gives the “delete Pod” permission to a service account. Kind makes it very easy to run Kubernetes without any setup. You’d just need to convert your docker compose files to Deployments, Services, and PersistentVolumes.

If converting to a kubernetes setup is too big of a leap, you could maybe try to write a C program that uses setuid to gain docker privileges in a restricted way.

Probably easiest to just have a cronjob that restarts the container regularly, though.


Your internet/wifi seems really overloaded, average ping rtt should be under 100ms, not 712ms. Your wifi signal might be bad, a computer may be downloading/uploading a lot of data, or there is an issue with your internet line.

Double check your wifi signal and computer traffic, maybe try using a direct wired ethernet connection and disconnecting all other computers. Otherwise, contact your ISP with these ping results and speed results from speedtest.net.


Check for PSI stalling in htop (add PSI meters for cpu, ram, and io in the config menu), to rule out your system being overloaded. Check internet connectivity with ping 1.1.1.1, and see why registry is timing out with curl -v https://registry-1.docker.io/v2/

You can also test your dns servers if you think that they are an issue with

dig registry-1.docker.io @1.1.1.1
dig registry-1.docker.io @194.168.4.100

If the dig command outputs differ from each other, then it is likely that your ISP’s DNS servers are faulty and you should switch nameservers to 1.1.1.1 and 1.0.0.1 like the other commenter said.


Bit of a weird observation: “Seeing a new computing paradigm coming out of Data Science / Observability”
I wanted to share an observation I've seen on the way the latest computer systems work. I swear this isn't an AI hype train post 😅 I'm seeing more and more computer systems these days use usage data or internal metrics to be able to automatically adapt how they run, and I get the feeling that this is a sort of new computing paradigm that has been enabled by the increased modularity of modern computer systems. First off, I would classify us being in a sort of "second-generation" of computing. The first computers in the 80s and 90s were fairly basic, user programs were often written in C/Assembly, and often ran directly in ring 0 of CPUs. Leading up to the year 2000, there were a lot of advancements and technology adoption in creating more modular computers. Stuff like microkernels, MMUs, higher-level languages with memory management runtimes, and the rise of modular programming in languages like Java and Python. This allowed computer systems to become much more advanced, as the new abstractions available allowed computer programs to reuse code and be a lot more ambitious. We are well into this era now, with VMs and Docker containers taking over computer infrastructure, and modern programming depending on software packages, like you see with NPM and Cargo. So we're still in this "modularity" era of computing, where you can reuse code and even have microservices sharing data with each other, but often the amount of data individual computer systems have access to is relatively limited. More recently, I think we're seeing the beginning of "data-driven" computing, which uses observability and control loops to run better and self-manage. I see a lot of recent examples of this: - Service orchestrators like Linux-systemd and Kubernetes that monitor the status and performance of services they own, and use that data for self-healing and to optimize how and where those services run. - Centralized data collection systems for microservices, which often include automated alerts and control loops. You see a lot of new systems like this, including Splunk, OpenTelemetry, and Pyroscope, as well as internal data collection systems in all of the big cloud vendors. These systems are all trying to centralize as much data as possible about how services run, not just including logs and metrics, but also more low-level data like execution-traces and CPU/RAM profiling data. - Hardware metrics in a lot of modern hardware. Before 2010, you were lucky if your hardware reported clock speeds and temperature for hardware components. Nowadays, it seems like hardware components are overflowing with data. Every CPU core now not only reports temperature, but also power usage. You see similar things on GPUs too, and tools like nvitop are critical for modern GPGPU operations. Nowadays, even individual RAM DIMMs report temperature data. The most impressive thing is that now CPUs even use their own internal metrics, like temperature, silicon quality, and power usage, in order to run more efficiently, like you see with AMD's CPPC system. - Of source, I said this wasn't an AI hype post, but I think the use of neural networks to enhance user interfaces is definitely a part of this. The way that social media uses neural networks to change what is shown to the user, the upcoming "AI search" in Windows, and the way that all this usage data is fed back into neural networks makes me think that even user-facing computer systems will start to adapt to changing conditions using data science. I have been kind of thinking about this "trend" for a while, but [this announcement that ACPI is now adding hardware health telemetry](https://www.phoronix.com/news/AMD-New-SoCs-With-ACPI-PHAT) inspired me to finally write up a bit of a description of this idea. What do people think? Have other people seen the trend for self-adapting systems like this? Is this an oversimplification on computer engineering?
fedilink