Edit: Results tabulated, thanks for all y’alls input!

Results fitting within the listed categories

Just do it live

  • Backup while it is expected to be idle @MangoPenguin@lemmy.blahaj.zone @khorak@lemmy.dbzer0.com @dandroid@sh.itjust.works

  • @Darkassassin07@lemmy.ca suggested adding a real long-ass-backup-script to run monthly to limit overall downtime

Shut down all database containers

  • Shutdown all containers -> backup @PotatoPotato@lemmy.world

  • Leveraging NixOS impermanence, reboot once a day and backup @thejevans@lemmy.ml

Long-ass backup script

  • Long-ass backup script leveraging a backup method in series @STROHminator@lemmy.world @lemmyvore@feddit.nl

Mythical database live snapshot command

(it seems pg_dumpall for Postgres and mysqldump for mysql (though some images with mysql don’t have that command for meeeeee))

  • Dump Postgres via pg_dumpall on a schedule, backup normally on another schedule @RegalPotoo@lemmy.world

  • Dump mysql via mysqldump and pipe to restic directly @youRFate@feddit.de

  • Dump Postgres via pg_dumpall -> backup -> delete dump @2xsaiko@discuss.tchncs.de @SteveDinn@lemmy.ca
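
For anyone wanting to try the dump -> backup -> delete flow from the last bullet, here is a minimal sketch rather than anyone's actual script. The function name `dump_backup_delete` and its arguments are made up for illustration, restic stands in for whichever backup tool you use, and the repository settings (`RESTIC_REPOSITORY`, `RESTIC_PASSWORD`) are assumed to be exported already.

```shell
# Sketch: dump -> backup -> delete. All names are hypothetical.
# Assumes RESTIC_REPOSITORY and RESTIC_PASSWORD are already exported.
dump_backup_delete() {
    container=$1    # database container name (assumption: e.g. "paperless-db-1")
    db_user=$2      # a Postgres superuser inside that container
    dump_dir=$3     # scratch directory on the host

    mkdir -p "$dump_dir" || return 1
    dump_file="$dump_dir/$container.sql"

    # 1. Dump. No -t on docker exec: a TTY can rewrite line endings in the output.
    if ! docker exec "$container" pg_dumpall -c -U "$db_user" > "$dump_file"; then
        rm -f "$dump_file"
        return 1
    fi

    # 2. Back up the dump; remember the result so the dump is removed either way.
    restic backup "$dump_file"
    status=$?

    # 3. Delete the dump and report whether the backup worked.
    rm -f "$dump_file"
    return "$status"
}
```

From cron this could be called as `dump_backup_delete paperless-db-1 paperless /var/tmp/db-dumps`. Keeping the dump file name stable between runs lets the backup tool deduplicate against the previous night's dump.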

Docker image that includes Mythical database live snapshot command (Postgres only)

  • Make your own docker image (https://gitlab.com/trubeck/postgres-backup) and set to run on a schedule, includes restic so it backs itself up @Undaunted@discuss.tchncs.de (thanks for uploading your scripts!!)

  • Add docker image prodrigestivill/postgres-backup-local and set to run on a schedule, backup those dumps on another schedule @brewery@lemmy.world @Lem453@lemmy.ca (also recommended additionally backing up the running database and trying that first during a restore)

New categories

Snapshot it, seems to act like a power outage to the database

  • LVM snapshot -> backup that @butitsnotme@lemmy.world

  • ZFS snapshot -> backup that @ikidd@lemmy.world (real world recovery experience shows that databases act like they’re recovering from a power outage and it works)

  • (I assume btrfs snapshot will also work)
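
A rough sketch of the ZFS variant of "snapshot it, then back up the snapshot". The dataset name, mountpoint, and snapshot name are hypothetical, restic is only one possible backup tool, and it assumes all of the database's files live on the one dataset so the snapshot is a single point in time.

```shell
# Sketch: snapshot -> back up the snapshot -> drop the snapshot.
# Hypothetical names; assumes the database's files all sit on one dataset.
snapshot_backup() {
    dataset=$1       # e.g. "tank/docker" (made-up dataset name)
    mountpoint=$2    # where that dataset is mounted, e.g. "/tank/docker"
    snap="restic-backup"   # fixed name keeps the backed-up path stable between runs

    # 1. Point-in-time snapshot; the running database is not paused.
    zfs snapshot "$dataset@$snap" || return 1

    # 2. Back up the frozen copy that ZFS exposes under .zfs/snapshot/.
    restic backup "$mountpoint/.zfs/snapshot/$snap"
    status=$?

    # 3. Remove the snapshot whether or not the backup succeeded.
    zfs destroy "$dataset@$snap"
    return "$status"
}
```

On restore the database sees files that look like the machine lost power at snapshot time, so it relies on the engine's own crash recovery. A test restore is the only real proof that this works for a given engine.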

One liner self-contained command for crontab

  • One-liner crontab that prunes to maintain 7 backups, dump Postgres via pg_dumpall, zips, then rclone them @DeltaTangoLima@reddrefuge.com

Turns out Borgmatic has database hooks

  • Borgmatic with its explicit support for databases via hooks (autorestic has hooks but it looks like you have to make database controls yourself) @PastelKeystone@lemmy.world

I’ve searched long and hard for this and I haven’t really seen a consensus that made sense. The SEO is really slowing me down on this one; searches like “restic backup database” get me garbage.

I’ve got databases in docker containers in LXC containers, but that shouldn’t matter (I think).

me-me about containers in containers

a me-me using the mental gymnastics me-me template; the template is split into two sections with the upper being a simple 3-step gymnastic routine while the bottom has the one being mocked flipping on gymnastic bars, using gymnastic rings, a balance beam, before finally jetpacking over a burning car. The top says "docker compose up -d" in line with the 3 simple steps of the routine, while the bottom, while becoming increasingly more cluttered, says "pass uid/gid to LXC", "add storage devices to LXC", "proxy network", "install docker on every container", and finally "docker compose up -d".


I’ve seen:

  • Just backup the databases like everything else, they’re “transactional” so it’s cool
  • Some extra docker image to load in with everything else that shuts down the databases in docker so they can be backed up
  • Shut down all database containers while the backup happens
  • A long ass backup script that shuts down containers, backs them up, and then moves to the next in the script
  • Some mythical mentions of “database should have a command to do a live snapshot, git gud”

None seem turnkey except for the first, but since so many other options exist I have a feeling the first option isn’t something you can rest easy with.

I’d like to minimize backup down times obviously, like what if the backup for whatever reason takes a long time? I’d denial of service myself trying to backup my service.

I’d also like to avoid a “long ass backup script” cause autorestic/borgmatic seem so nice to use. I could, but I’d be sad.

So, what do y’all do to backup docker databases with backup programs like Borg/Restic?

@Decronym@lemmy.decronym.xyz
bot account

Acronyms, initialisms, abbreviations, contractions, and other phrases which expand to something larger, that I’ve seen in this thread:

Fewer Letters   More Letters
LVM             (Linux) Logical Volume Manager for filesystem mapping
LXC             Linux Containers
NAS             Network-Attached Storage
NFS             Network File System, a Unix-based file-sharing protocol known for performance and efficiency
ZFS             Solaris/Linux filesystem focusing on data integrity

5 acronyms in this thread; the most compressed thread commented on today has 6 acronyms.

[Thread #784 for this sub, first seen 5th Jun 2024, 09:15]

@DeltaTangoLima@reddrefuge.com

I just have a one-liner in crontab that keeps the last 7 nightly database dumps. The destination is on one of my NASes, which rclones everything to my secondary NAS and an S3 bucket.

ls -tp /storage/proxmox-data/paperless/backups/*.sql.gz | grep -v '/$' | tail -n +7 | xargs -I {} rm -- {}; docker exec -t paperless-db-1 pg_dumpall -c -U paperless | gzip > /storage/proxmox-data/paperless/backups/paperless_$( date +\%Y\%m\%d )T$( date +\%H\%M\%S ).sql.gz
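
The pruning half of that one-liner can be pulled out into a small function if you want to read or test it on its own. This is only a sketch: `prune_dumps` is a made-up name, it assumes dump file names contain no newlines, and unlike the one-liner (which trims to 6 files before adding the new dump) it is meant to run after the dump and trim to exactly the number you pass.

```shell
# Sketch: keep only the newest $2 dumps (*.sql.gz) in directory $1.
# Hypothetical helper; assumes file names contain no newlines.
prune_dumps() {
    dir=$1
    keep=$2
    # ls -t sorts newest first; tail -n +N starts printing at line N,
    # so everything past the first $keep entries is removed.
    ls -1t "$dir"/*.sql.gz 2>/dev/null |
        tail -n +"$((keep + 1))" |
        while IFS= read -r old; do
            rm -- "$old"
        done
}
```

Something like `prune_dumps /storage/proxmox-data/paperless/backups 7` right after the dump would then leave the seven most recent files.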

@glizzyguzzler@lemmy.blahaj.zone
creator

Holy shot thanks for droppin this spell, that’s awesome

I mostly use Postgres, so I created a small docker image for myself which has the Postgres client, restic and cron. It also gets a small bash script which executes pg_dump and then restic to back up the dump. pg_dump can be used while the database is in use, so no issues there. Restic stores the backup in a volume which points to an NFS share on my NAS. This script is called periodically by cron.

I use this image to start a backup-service alongside every database. So it’s part of the docker-compose.yml

@glizzyguzzler@lemmy.blahaj.zone
creator

Would you mind pastebin-ing your docker image creator file? I have no experience cooking up my own docker image.

Sure! I’ll try to do it today but I can’t promise to get to it

I quickly threw together a repository. But please keep in mind that I made some changes to it to be able to publish it, and it is a combination of 3 different custom solutions that I made for myself. I have not tested it, so use at your own risk :D But if something is broken, just tell me and I’ll try to fix it.

@glizzyguzzler@lemmy.blahaj.zone
creator

Thanks for taking the time to upload the whole thing!! This is pretty cool because it moves the backup work straight into the container with the db

adr1an

I use the rsnapshot docker image from Linuxserver. The tool uses rsync incrementally and does rotation/pruning for you (e.g. keep 10 days, 5 weeks, 8 months, 100 years). I just pointed it at the PostgreSQL data volume. This runs without interruption of service. To restore, I need to convert from WAL files into a dump… So, load an empty PostgreSQL container on any snapshot and run the dump command.
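
That restore step (start a throwaway Postgres on a copied data directory, then dump it) might look roughly like the sketch below. Everything here is an assumption for illustration: the function name, the scratch container name, and that the image keeps its data in /var/lib/postgresql/data. The image's major version has to match the one that wrote the files, and it should be pointed at a scratch copy of the snapshot because Postgres writes to the directory while it recovers.

```shell
# Sketch: turn a file-level copy of a Postgres data directory into a SQL dump.
# Hypothetical names; assumes the image stores data in /var/lib/postgresql/data.
dump_from_copy() {
    data_copy=$1    # scratch COPY of the snapshot (it gets written to during recovery)
    image=$2        # must match the original major version, e.g. "postgres:16"
    db_user=$3      # superuser that exists in the copied cluster
    out_file=$4
    name="pg-restore-scratch"

    docker run -d --name "$name" \
        -v "$data_copy:/var/lib/postgresql/data" "$image" >/dev/null || return 1

    # Wait (up to about 60 s) until recovery is done and connections are accepted.
    tries=0
    until docker exec "$name" pg_isready -U "$db_user" >/dev/null 2>&1; do
        tries=$((tries + 1))
        if [ "$tries" -ge 60 ]; then
            docker rm -f "$name" >/dev/null
            return 1
        fi
        sleep 1
    done

    # Dump everything, then throw the scratch container away.
    docker exec "$name" pg_dumpall -U "$db_user" > "$out_file"
    status=$?
    docker rm -f "$name" >/dev/null
    return "$status"
}
```

The resulting dump can then be loaded into a fresh database container the normal way, which also doubles as a test that the file-level backup was actually recoverable.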

thejevans

My plan to handle this is to switch my VMs to NixOS, set up NixOS with impermanence using a btrfs or zfs volume that gets backed up and wiped at every startup with another that holds persistent data that also gets backed up, and just reboot once per day.

I’m currently learning how to do impermanence in all the different ways, so this is a long goal, but Nix config + backups should handle everything.

@glizzyguzzler@lemmy.blahaj.zone
creator

That’s wild and cool - don’t have that architecture now but… next time

STROHminator

I have also been wanting to try borg for at least offsite backups. Currently been using a “long ass backup script” with how little time I currently have.

lemmyvore

I’ve replaced my “long ass script” I was using for rsync with a much shorter one that uses borg. 10/10 would recommend.

Not sure how much time it will save because in both cases the stuff that took the most time was figuring out each tool’s voodoo for including/excluding directories from backup.

@glizzyguzzler@lemmy.blahaj.zone
creator

I’m coming from rsync too, hoping for the same good stuff

@SteveDinn@lemmy.ca

+1 for long-ass backup script. First dump the databases with the appropriate command. Currently, I have only MariaDB and Postgres instances. Then, I use Borg to backup the database dumps and the docker volumes.

Database SQL dumps compress very well. I haven’t had any problems yet

@glizzyguzzler@lemmy.blahaj.zone
creator

It’s gon b long ass backup script I think!

@ikidd@lemmy.world

Snapshot with zfs, backup snapshot.

@glizzyguzzler@lemmy.blahaj.zone
creator

That’s ok for a database that’s running?

Do you use a ZFS backup manager?

@ikidd@lemmy.world

While there’s probably a better way of doing it via the docker zfs driver, I just make a datastore per stack under the hypervisor, mount the datastore into the docker LXC, and make everything bind mount within that mountpoint, then snapshot and backup via Sanoid to a couple of remote ZFS pools, one local and one on zfs.rent.

I’ve had to restore our mailserver (mysql) and nextcloud (postgres) and they both act as if the power went out, recovering via their own journaling systems. I’ve not found any inconsistencies on recovery, even when I’ve done a test restore on a snapshot that was taken during known heavy activity. I trust both databases for their recovery methods, others maybe not so much. But test that for yourself.

@glizzyguzzler@lemmy.blahaj.zone
creator

That is straightforward, and if you recovered nextcloud like that it does say something about the robustness!

Better question is: why are you running static storage servers in Docker?

@glizzyguzzler@lemmy.blahaj.zone
creator

;.; I don’t know what this means

Don’t run storage services in Docker. It’s stupid and unnecessary. Just run it on the host.

@glizzyguzzler@lemmy.blahaj.zone
creator

Ah gotchya, well docker compose plus the image is pretty necessary for me to easily manage big ass/complicated database-based storage services like paperless or Immich - so I’m locked in!

And I’d still have to specially handle the database for backup even if it wasn’t in a container…

Why, exactly?

@peregus@lemmy.world

Because you can’t just copy the files of a running DB (if I got what you mean).

Don’t worry, it’s fine, there’s nothing inherently wrong with running stateful workload in a container.

You should really back that up with arguments as I don’t think a lot of people would agree with you.

lemmyvore

You’d have to run several versions of several db engines side by side, which is not even easily doable in most distros. Not to mention some apps need special niche versions: Immich needs a version of Postgres with a vector extension (pgvecto.rs) installed. Also they don’t tell you how they provision them, and usually I don’t care because that’s the whole point of using a docker stack so I don’t have to.

Last but not least there’s no reason to not run databases in docker.

I just shut down the containers before backing up and it has worked totally fine

I guess the trouble is that you don’t want to read the volumes where the db files are because they’re not guaranteed to be consistent at a given point in time right?

Does the given engine support a backup method/utility that can be used to copy files to some volume on a set schedule?

@glizzyguzzler@lemmy.blahaj.zone
creator

As far as I know (unless smarter people know), you need a “long ass backup script” to make your own fun on a set schedule. Autorestic and borgmatic are smooth but don’t seem to have the granularity to deal with it. (Unless smarter people know how to make them do, which I may be fishing for lol)

@Darkassassin07@lemmy.ca

I set up borg around 4 months ago using option 1. I’ve messed around with it a bit, restoring a few backups, and haven’t run into any issues with corrupt/broken databases.

I just used the example script provided by borg, but modified it to include my docker data, and write info to a log file instead of the console.

Daily at midnight, a new backup of around 427 GB of data is taken. At the moment that takes 2-15 min to complete, depending on how much data has changed since yesterday, though the initial backup was closer to 45 min. Then old backups are trimmed: backups <24 hr old are kept, along with 7 dailies, 3 weeklies, and 6 monthlies. Anything outside that scope gets deleted.

With the compression and de-duplication borg does, the 15 backups I have so far (5.75 TB of data) currently take up 255.74 GB of space. 10/10 would recommend on that aspect alone.

/edit, one note: I’m not backing up Docker volumes directly, though you could just fine. Anything I want backed up lives in a regular folder that’s then bind mounted to a docker container (including things like paperless-ngx’s databases).

@glizzyguzzler@lemmy.blahaj.zone
creator

Love the detail, thanks!!

I have one more thought for you:

If downtime is your concern, you could always use a mixed approach. Run a daily backup system like I described, somewhat haphazard with everything still running. Then once a month at 4am or whatever, perform a more comprehensive backup, looping through each docker project and shutting them down before running the backup and bringing it all online again.
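
The monthly pass could be sketched as a loop like the one below. This is a guess at the shape, not the commenter's script: it assumes one directory per compose project under a common parent, uses restic as a stand-in for the backup tool, and `cold_backup_all` is a made-up name.

```shell
# Sketch of a monthly "stop, back up, restart" pass over every compose project.
# Hypothetical layout: one directory per project under $1, holding its compose file and data.
cold_backup_all() {
    stacks_dir=$1
    failed=0
    for project in "$stacks_dir"/*/; do
        [ -d "$project" ] || continue
        project=${project%/}

        # Stop only this project, so everything else stays online meanwhile.
        if (cd "$project" && docker compose stop); then
            # Files are at rest now; any file-level backup tool works here.
            restic backup "$project" || failed=1
        else
            failed=1
        fi

        # Bring the project back up even if the stop or the backup failed.
        (cd "$project" && docker compose start) || failed=1
    done
    return "$failed"
}
```

Going project by project keeps each service's downtime to roughly the length of its own backup instead of the whole run.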

@glizzyguzzler@lemmy.blahaj.zone
creator

Not a bad idea for a hybrid thing, especially since people seem to say that a backup of a running database, taken with no special shutdown/export effort, is readable most of the time. And the dedupe stats are really impressive

pg_dumpall on a schedule, then restic to back up the dumps. I’m running Zalando Postgres in kubernetes so scheduled tasks and intercontainer networking are a bit simpler, but you should be able to run a sidecar container in your compose file

@glizzyguzzler@lemmy.blahaj.zone
creator

So you’re saying you dump on a sched to a <place> and then just let your restic backup pick it up asynchronously?

2xsaiko

My backup service runs pg_dumpall, then borg create, then deletes the dump.

@RegalPotoo@lemmy.world

Pretty much - I try and time it so the dumps happen ~an hour before restic runs, but it’s not super critical

Dandroid

I guess I’m a dummy, because I never even thought about this. Maybe I got lucky, but when I did restore from a backup, I didn’t have any issues. My containerized services came right back up like nothing was wrong. Though that may have been right before I successfully hosted my own (now defunct) Lemmy instance. I can’t remember, but I think I only had sqlite databases in my services at the time.

@glizzyguzzler@lemmy.blahaj.zone
creator

Good to know if I need to just throw the running database into borg/restic there’s a chance it’ll come out ok! Def not a dummy, I only found out databases may not like being backed up while running through someone mentioning it offhandedly

@brewery@lemmy.world

I just started using some docker containers I found on Docker Hub designed for DB backups (e.g. prodrigestivill/postgres-backup-local) to automatically dump from the databases into a set folder, which is included in the restic backup. I know you could come up with scripts but this way, I could easily copy the compose code to other containers with different databases (and different passwords etc).

@glizzyguzzler@lemmy.blahaj.zone
creator

That is nicely expandable with my docker_compose files, thanks for the find!
