Primary account is now @Dark_Arc@social.packetloss.gg.
Not sure what you’re using to generate that list; the formatting is a bit difficult to read.
I don’t have a cluster since it’s effectively single user + @Auto_Post_Bot@social.packetloss.gg (in theory a few other people have access, but they’re not active) on a single machine; it’s just more or less the out-of-the-box Docker stuff on a bare metal machine in my basement plus a DigitalOcean droplet.
The droplet is what I’m using to get a static IP and avoid dynamic DNS nonsense, plus it provides some level of protection against a naive DDoS attack on random fediverse servers (since, in the worst case, I can get on my phone and sever the ZeroTier connection that’s used to connect the droplet to my basement server).
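Roughly, the droplet side is just nginx proxying everything over the ZeroTier interface to the basement box. This isn’t my literal config – the ZeroTier address, port, and cert paths below are placeholders – but it’s the shape of it:
server {
    listen 443 ssl;
    server_name social.packetloss.gg;

    # placeholder cert paths
    ssl_certificate     /etc/letsencrypt/live/social.packetloss.gg/fullchain.pem;
    ssl_certificate_key /etc/letsencrypt/live/social.packetloss.gg/privkey.pem;

    location / {
        # 10.241.127.1 stands in for the basement server's ZeroTier address
        proxy_pass http://10.241.127.1:8080;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}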
I’m pretty confident whatever is going on is payload related at this point.
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
50622 70 20 0 330264 240200 201512 S 0.0 0.7 0:25.21 postgres
50636 70 20 0 327804 239520 201296 S 0.0 0.7 0:26.55 postgres
50627 70 20 0 327204 239152 201592 S 0.0 0.7 0:24.75 postgres
50454 70 20 0 328932 238720 200872 S 0.0 0.7 0:26.61 postgres
50639 70 20 0 313528 217800 193792 S 0.0 0.7 0:03.13 postgres
50641 70 20 0 313284 217336 194204 S 0.0 0.7 0:03.15 postgres
50626 70 20 0 313592 216604 193636 S 0.0 0.7 0:05.07 postgres
50632 70 20 0 313236 216460 193968 S 0.0 0.7 0:04.52 postgres
50638 70 20 0 310368 216084 193856 S 0.0 0.7 0:04.20 postgres
50614 70 20 0 310520 216072 193840 S 0.0 0.7 0:02.88 postgres
50642 70 20 0 312200 215920 194068 S 0.0 0.7 0:04.46 postgres
50640 70 20 0 312584 215724 193676 S 0.0 0.7 0:03.32 postgres
50635 70 20 0 309744 215404 193764 S 0.0 0.7 0:02.72 postgres
50630 70 20 0 312168 215224 193488 S 0.0 0.7 0:02.67 postgres
50621 70 20 0 309560 215096 193772 S 0.0 0.7 0:02.97 postgres
50646 70 20 0 309492 215008 193560 S 0.0 0.7 0:04.66 postgres
50625 70 20 0 309760 215004 193368 S 0.0 0.7 0:03.08 postgres
50637 70 20 0 309296 214992 193848 S 0.0 0.7 0:02.87 postgres
50616 70 20 0 310596 214984 192700 S 0.0 0.7 0:04.17 postgres
50643 70 20 0 310392 214940 194008 S 0.0 0.7 0:04.14 postgres
50624 70 20 0 310128 214880 192928 S 0.0 0.7 0:04.15 postgres
50631 70 20 0 310220 214596 192576 S 0.0 0.7 0:02.71 postgres
50613 70 20 0 309364 213880 192520 S 0.0 0.7 0:04.06 postgres
50628 70 20 0 309852 213236 191504 S 0.0 0.7 0:03.04 postgres
50634 70 20 0 187772 163388 149428 S 0.0 0.5 0:02.87 postgres
50644 70 20 0 189684 162892 148508 S 0.0 0.5 0:04.11 postgres
50633 70 20 0 186096 162544 149324 S 0.0 0.5 0:03.20 postgres
50629 70 20 0 185644 162112 149296 S 0.0 0.5 0:04.62 postgres
50618 70 20 0 186264 160576 147928 S 0.0 0.5 0:04.10 postgres
50582 70 20 0 185708 160236 147592 S 0.0 0.5 0:04.10 postgres
3108 70 20 0 172072 144092 142256 S 0.0 0.4 0:04.46 postgres
3109 70 20 0 172024 142404 140632 S 0.0 0.4 0:02.24 postgres
2408 70 20 0 171856 23660 22020 S 0.0 0.1 0:00.76 postgres
3113 70 20 0 173536 9472 7436 S 0.0 0.0 0:00.15 postgres
3112 70 20 0 171936 8732 7020 S 0.0 0.0 0:01.54 postgres
3114 70 20 0 173472 5624 3684 S 0.0 0.0 0:00.00 postgres
I’ve got quite a bit of experience with postgres; I don’t see any indication it’s the problem.
So, I think this is a (helpful) general comment but wrong in this/my specific case.
The server is so small it’s not really going to register on a 10-minute frequency for outgoing content – I’m not that much of a lemmy addict! haha.
You can see in a comment here my most recent comment to lemmy.world did sync: https://lemmy.world/comment/8728858
I’m not having any issues with outgoing content, or with beehaw, the KDE instance, and several others. It’s just lemmy.world that’s acting up (which is unfortunate because it’s my favorite – I mod/run several communities and donate to them – haha).
Yeah, I mean things should be fine in general; like I said this has been working for quite a long time now without issue.
The machine that’s actually doing the work here is quite powerful and is used to run several game servers in addition to Lemmy … Lemmy really isn’t much more than a footnote in resource usage:
CPU:
Info: 8-core model: Intel Core i7-10700 bits: 64 type: MT MCP cache: L2: 2 MiB
Speed (MHz): avg: 4653 min/max: 800/4800 cores: 1: 4698 2: 4685 3: 4786 4: 4704 5: 4694
6: 4700 7: 4800 8: 4801 9: 4802 10: 3408 11: 4756 12: 4713 13: 4706 14: 4707 15: 4798 16: 4703
Drives:
Local Storage: total: 931.51 GiB used: 380.39 GiB (40.8%)
ID-1: /dev/nvme0n1 vendor: Western Digital model: WDS100T2B0C-00PXH0 size: 931.51 GiB
Partition:
ID-1: / size: 914.18 GiB used: 380.02 GiB (41.6%) fs: xfs dev: /dev/dm-0
ID-2: /boot size: 1014 MiB used: 370 MiB (36.5%) fs: xfs dev: /dev/nvme0n1p2
ID-3: /boot/efi size: 598.8 MiB used: 5.8 MiB (1.0%) fs: vfat dev: /dev/nvme0n1p1
Swap:
ID-1: swap-1 type: partition size: 15.71 GiB used: 1.2 MiB (0.0%) dev: /dev/dm-1
Sensors:
System Temperatures: cpu: 28.0 C pch: 26.0 C mobo: N/A
Fan Speeds (rpm): N/A
Info:
Processes: 358 Uptime: 16h 39m Memory: total: 32 GiB note: est. available: 30.77 GiB
used: 8.54 GiB (27.8%) Init: systemd target: multi-user (3) Shell: fish inxi: 3.3.30
So, I think you’re the most on the right track of the responses…
It seems to be exclusively incoming content from lemmy.world. If you look here, my most recent comment is on lemmy.world:
https://social.packetloss.gg/comment/1415801 https://lemmy.world/comment/8710941
The instance just isn’t getting any new posts, comments, or votes back from lemmy.world.
Every time I shut down the Lemmy server I see this:
2024-03-23T17:34:33.774333Z WARN lemmy_server: Received ctrl-c, shutting down gracefully...
2024-03-23T17:34:33.774912Z WARN lemmy_federate: Waiting for 1618 workers (30.00s max)
That number never seems to move; there are always 1618 workers. I’m not sure if that means anything regarding pending processing or what have you.
I am seeing in my publicly facing nginx logs:
135.181.143.221 - - [22/Mar/2024:12:23:28 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:12:23:40 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:12:23:54 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:12:24:12 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:12:24:38 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:12:25:21 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:12:26:35 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:12:28:53 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:12:33:19 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:12:42:01 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:12:59:15 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:13:33:33 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
135.181.143.221 - - [22/Mar/2024:21:31:55 +0000] "POST /inbox HTTP/1.1" 499 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
There’s then an internal nginx server that sees:
10.241.127.2 - - [22/Mar/2024:12:23:18 +0000] "POST /inbox HTTP/1.1" 200 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:12:24:19 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:12:24:31 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:12:24:45 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:12:25:03 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:12:25:29 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:12:26:11 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:12:27:25 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:12:29:43 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:12:34:09 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:12:42:51 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:13:00:06 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:13:34:24 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:14:42:49 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:16:59:32 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [22/Mar/2024:21:32:45 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
10.241.127.2 - - [23/Mar/2024:06:39:03 +0000] "POST /inbox HTTP/1.1" 408 0 "-" "Lemmy/0.19.3; +https://lemmy.world"
So, things did start timing out. I’m not sure what to do about that though.
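For reference, nginx hands back a 408 when it gives up waiting for the request headers/body to finish arriving, which is controlled by client_header_timeout and client_body_timeout (both default to 60s). A sketch of where those knobs live on the internal nginx – the port, upstream address, and values here are made up, and I haven’t actually changed anything yet:
server {
    listen 8080;

    # both default to 60s; the values below are just illustrative
    client_header_timeout 120s;
    client_body_timeout   120s;

    location /inbox {
        proxy_read_timeout 120s;            # this one governs waiting on the Lemmy backend instead
        proxy_pass http://127.0.0.1:8536;   # placeholder for the Lemmy container
    }
}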
This server is not resource starved:
load average: 0.04, 0.09, 0.10
total used free shared buff/cache available
Mem: 31507 7651 1092 164 22764 23239
Swap: 16087 1 16086
It’s just this lemmy.world data that’s suddenly out of whack after months of normal operation (both on Lemmy 0.18 and 0.19).
It feels like a bad payload that lemmy.world keeps sending and that the server just can’t move past for some reason.
I had logging on the Lemmy container itself piped to /dev/null because it’s just such a noisy log. I turned it back on… I’ll see if I can find more information the next time something comes in from lemmy.world.
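For reference, the compose-level way to do the /dev/null thing is the logging driver; turning logs back on looks something like this (service name, image tag, and values are just examples, not my exact compose file):
services:
  lemmy:
    image: dessalines/lemmy:0.19.3   # placeholder tag
    environment:
      - RUST_LOG=warn                # Lemmy reads RUST_LOG for log verbosity
    logging:
      driver: "local"                # was "none" while the logs were being discarded
      options:
        max-size: "50m"
        max-file: "3"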
It’s been up on 0.19.x for a few months now. It’s also a full-on bare metal server with a ton of resources; it’s not at all starved.
It’s almost like someone posted something somewhere that “jammed” Lemmy and it just won’t get past it, but I’m not sure how to figure out what that would be or how to unjam things.
I don’t know anything about Netbird, but I’ll link you to my ZeroTier pitch the last time I noticed someone talking about Tailscale: https://lemmy.world/comment/1058287.
You could check out a very similar product, ZeroTier (the open source Community Edition), assuming your use case is non-commercial.
… if you’re willing to use an older release, you could potentially do whatever you want, as the software uses a BSL license with a change-date fallback to Apache 2.0.
I’m going to plus one what others are saying here. As someone who wears both dev and sysadmin hats, Postgres is just a way better database. For new projects, Postgres should be the go-to unless you’ve got a niche use case.
Even with Diesel (or some other query builder) in use, it’s not uncommon for apps to use special features specific to a database. It can also be a major pain just to test that your app actually works on these other databases, and then to support weird slowdowns, migrations, etc.
I’ll give the example that MySQL and MariaDB don’t support transactions for SQL DDL. Basically what that means is that if you want to write a database migration to update the layout of one of your tables, there’s no way in MySQL to have the database automatically revert your changes if something fails partway through… You’re just screwed unless you have a backup. Meanwhile, with Postgres, you just say rollback in your transaction, and things are back to normal.
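As a rough sketch of what that difference looks like (made-up table and column names):
BEGIN;
ALTER TABLE comment ADD COLUMN edit_count integer NOT NULL DEFAULT 0;
UPDATE comment SET edit_count = 1 WHERE updated IS NOT NULL;
-- something later in the migration fails, or you change your mind:
ROLLBACK;
-- Postgres: the ALTER TABLE is undone along with everything else.
-- MySQL/MariaDB: the ALTER TABLE was already implicitly committed, so the rollback can't undo it.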
There are also other, less serious quirks: if you insert a NULL into an integer field that’s specifically marked NOT NULL, Postgres will error, while MySQL/MariaDB will just silently convert that to a 0.
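Again, just a sketch with a made-up table:
CREATE TABLE vote (post_id integer NOT NULL, score integer NOT NULL);
INSERT INTO vote (post_id, score) VALUES (1, NULL);
-- Postgres: ERROR: null value in column "score" violates not-null constraint
-- MySQL/MariaDB (without strict SQL mode): accepts the row and stores score = 0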
Like in a way, MySQL is the “JavaScript” of databases, and that’s not what you want at all.
I’ve tried to switch in the past, but tripped over the differences in Podman vs Docker networking. IIRC Docker is better for creating an isolated network.
I have noticed that Docker doesn’t do the best job at graceful shutdowns (say, for automatic installation of updates). I suspect Podman with systemd integration could do much, much better.
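I haven’t actually tried it yet, but my understanding is the systemd integration looks something like a Quadlet unit, e.g. ~/.config/containers/systemd/lemmy.container, where systemd owns the container’s lifecycle and can give it a real stop window on reboot (image, port, and timeout below are placeholders):
[Unit]
Description=Lemmy (Podman container managed by systemd)

[Container]
Image=docker.io/dessalines/lemmy:0.19.3
PublishPort=8536:8536

[Service]
TimeoutStopSec=90

[Install]
WantedBy=default.target
Then a systemctl --user daemon-reload followed by systemctl --user start lemmy should pick it up.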
If you want to DIY it, consider TrueNAS Scale with mirrored drive pairs.