Basically title. Is it common to use some kind of RAID for backing up other RAIDs or do people just go with single drives?
So many people didn’t read the post and are going off about how RAID isn’t a backup.
There are a few things to consider. How much data is it? How is it connected? How reliable do you want it to be? Where is it going to be? How are you backing it up? How will you monitor the disk(s) and backup process for failures?
Is it at some place that will be a pain to deal with if a hard drive dies, like a friend’s house or something? Then I’d go with RAID, so a dead drive isn’t an immediate reason to go fix it or go without backups.
Is it a small enough amount of data that you could have a complete third copy if you didn’t put the disks in RAID? Then I’d probably make multiple copies and not use RAID.
Are you dealing with something like Veeam doing backup chains, with an initial full copy and then incrementals with the changes, so you can go back to different days? Go with RAID, because having to reconfigure can be a hassle, and a full plus incrementals spread across JBODs could cost you all the backups if the disk with the full backup is lost.
Either is a valid choice; it depends on your particular needs.
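To make the backup-chain point above concrete, here is a minimal sketch of why losing the disk that holds the full backup takes the incrementals with it. The `Backup` class, the disk labels and the backup names are all hypothetical and only for illustration; this is not how Veeam or any particular tool models its chains.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Backup:
    name: str
    disk: str               # hypothetical label of the standalone disk holding this file
    parent: Optional[str]   # None for a full backup, else the backup it builds on


def restorable(backups, failed_disk):
    """Return the restore points that are still usable after one disk is lost."""
    by_name = {b.name: b for b in backups}
    usable = []
    for backup in backups:
        node = backup
        intact = True
        # Walk back to the full backup; every link in the chain must survive.
        while node is not None:
            if node.disk == failed_disk:
                intact = False
                break
            node = by_name[node.parent] if node.parent else None
        if intact:
            usable.append(backup.name)
    return usable


chain = [
    Backup("full-sun", "disk1", None),
    Backup("inc-mon", "disk2", "full-sun"),
    Backup("inc-tue", "disk2", "inc-mon"),
]

print(restorable(chain, "disk1"))  # [] (the full is gone, so nothing restores)
print(restorable(chain, "disk2"))  # ['full-sun']
```

With the chain on a redundant array instead, a single drive failure would not remove any link, which is the argument for RAID in this particular case.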
As others said, depends on your use case. There are lots of good discussions here about mirroring vs single disks, different vendors, etc. Some backup systems may want you to have a large filesystem available that would not be otherwise attainable without a RAID 5/6.
Enterprise backups tend to follow the recommendation called 3-2-1: three copies of your data, on two different types of media, with one copy off-site.
On my home system, I have 3-2-0 for most data and 4-3-0 for my most important virtual machines. My home system doesn’t have an off-site, but I do have two external hard drives connected to my NAS.
Story time
I had one of my two backup drives fail a few months ago. Literally nothing of value was lost; I just went down to the electronics shop and bought a bigger drive from the same vendor (preserving the one-drive-per-vendor approach). I reformatted the disk, recreated the backup job, then ran the first transfer. Pretty much not a big deal, since all the data was still in 2 other places: the source itself and the NAS primary array.
The most important thing to determine when you plan a backup is how valuable the data is to you. That’s how much you might be willing to spend on keeping that data safe.
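The "3-2-0" style shorthand used above can be computed from a list of copies. This is only a sketch: the function names are made up, the example layout is hypothetical rather than the commenter's actual setup, and it assumes the original counts as one of the copies.

```python
def backup_score(copies):
    """copies: list of (medium, is_offsite) tuples, one per copy, original included."""
    total = len(copies)
    media = len({medium for medium, _ in copies})
    offsite = sum(1 for _, is_offsite in copies if is_offsite)
    return f"{total}-{media}-{offsite}"


def meets_3_2_1(copies):
    """True if there are at least 3 copies, 2 media types and 1 off-site copy."""
    total, media, offsite = (int(part) for part in backup_score(copies).split("-"))
    return total >= 3 and media >= 2 and offsite >= 1


# Hypothetical home layout: everything is on-site.
home = [("ssd", False), ("hdd", False), ("hdd", False)]
print(backup_score(home))   # 3-2-0
print(meets_3_2_1(home))    # False

# Adding one off-site copy (for example a cloud bucket) satisfies the rule.
print(meets_3_2_1(home + [("cloud", True)]))  # True
```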
In an ideal world, any storage should be RAID or some form thereof. For the storage where backups are kept, definitely yes: RAID should be a very high priority.
I haven’t needed RAID for years, because my storage needs were small enough to fit on currently available drives.
Which is why my file server has a single 4TB data drive, with an external attached for mirroring on a schedule, plus a NAS also mirrored on a schedule, and Crashplan.
The NAS was recently added, and it’s RAID 5, only because it was free and I had the drives sitting around collecting dust. Hopefully I can switch it to RAID 6 once deduplication is finished.
Technically only Crashplan is a real backup in my setup. The rest is just local redundancy.
I’d prefer to not use RAID if I can avoid it.
RAID isn’t only for when a drive fails; it can also be used against slow corruption of files. If you love your data, use RAID.
That is just a specific type of drive failure and only certain software RAID solutions are able to even detect corruption through the use of checksums. Typical “dumb” RAID will happily pass on corrupted data returned by the drives.
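A toy illustration of that difference, assuming a two-way mirror held in memory. The function name is hypothetical and the "dumb" branch is a simplification (a real controller may read from either member); checksumming filesystems such as ZFS and Btrfs do this kind of verification per block.

```python
import hashlib
from typing import Optional, Sequence


def sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def read_from_mirror(copies: Sequence[bytes], expected: Optional[str] = None) -> bytes:
    """Return the data a mirror would hand back for one block."""
    if expected is None:
        # No checksum stored: the array cannot tell a good copy from a bad one,
        # so it returns whatever the first member delivers.
        return copies[0]
    for copy in copies:
        if sha256(copy) == expected:
            return copy  # first copy that matches the stored checksum
    raise IOError("every copy fails its checksum")


original = b"family photos"
rotted = b"family phptos"          # one silently flipped character
members = [rotted, original]       # member 0 has bit rot, member 1 is fine

print(read_from_mirror(members))                      # b'family phptos'
print(read_from_mirror(members, sha256(original)))    # b'family photos'
```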
RAID only serves to prevent downtime due to drive failure. If your system has very high uptime requirements and a drive just dropping out must not affect the availability of your system, that’s where you use RAID.
If you want to preserve data however, there are much greater hazards than drive failure: Ransomware, user error, machine failure (PSU blows up), facility failure (basement flooded) are all similarly likely. RAID protects against exactly none of those.
Proper backups do provide at least decent mitigation against most of these hazards in addition to failure of any one drive.
If you love your data, you make backups of it.
With a handful of modern drives (<~10) and a restore time of 1 week, you can expect storage uptime of >99.68%. If you don’t need more than that, you don’t need RAID. I’d also argue that if you do indeed need more than that, you probably also need higher uptime in other components than the drives through redundant computers at which point the benefit of RAID in any one of those redundant computers diminishes.
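For anyone who wants to play with the uptime estimate above, here is a back-of-the-envelope version. It is a sketch under stated assumptions, not necessarily the calculation the comment used: the 1.5% annualized failure rate is an assumed figure, failures are treated as independent and non-overlapping, and any single drive failure is assumed to take the storage down for the whole restore window.

```python
def expected_uptime(drives: int, afr: float, restore_days: float) -> float:
    """Rough fraction of the year the storage is available without RAID.

    drives       number of drives, any of which failing means a restore
    afr          assumed annualized failure rate per drive (0.015 = 1.5%)
    restore_days downtime caused by each failure
    """
    # Expected failures per year, each costing restore_days out of 365.
    downtime_fraction = drives * afr * restore_days / 365
    return 1 - downtime_fraction


print(f"{expected_uptime(10, 0.015, 7):.2%}")  # 99.71%
```

Plugging in a higher failure rate or a longer restore window shows how quickly the figure drops, which is a decent way to decide whether RAID is worth it for your own uptime needs.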
Could always use UNRAID for the backup if you’re trying to be storage efficient, but it’s really no better than RAID5
Yep, all RAID has the same kinds of issues, largely sensitivity to X number of drive failures. Which is part of why we see RAID 6 (double parity), mirroring, RAID 10, etc., all as mechanisms to compensate for disk failure within the RAID.
In the SMB space, RAID 10 seems to be the favorite approach today for NAS/virtualization hosts (ESX, etc.), with backup going to a cloud provider such as iland or Barracuda.
Obligatory “TrueNAS is free” comment
Unraid’s “killer feature” is the ability to mix and match disparate drive sizes, only requiring the parity drive to be at least as large as your largest data disk, a la MergeFS/SnapRAID. Also, ZFS chugs RAM like there’s no tomorrow, so it’s not really an option for underpowered devices like some NASes. But yeah, TrueNAS is nice.
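A quick sketch of what that means for usable space, compared with a conventional RAID 5 that sizes every member down to the smallest disk. The function names and the example drive sizes are made up, and the parity-array calculation assumes a single parity drive.

```python
def parity_array_usable(sizes_tb):
    """Unraid/SnapRAID-style single parity: the largest disk is parity, the rest hold data."""
    if len(sizes_tb) < 2:
        raise ValueError("need at least one data disk plus one parity disk")
    return sum(sizes_tb) - max(sizes_tb)


def raid5_usable(sizes_tb):
    """Conventional RAID 5: every member counts as the size of the smallest disk."""
    if len(sizes_tb) < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (len(sizes_tb) - 1) * min(sizes_tb)


drives = [8, 4, 4, 2]  # hypothetical mixed bag of drives, sizes in TB

print(parity_array_usable(drives))  # 10
print(raid5_usable(drives))         # 6
```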
That is a very budget-friendly choice by Unraid to accept varying drive sizes. As a backup destination, especially a cold backup, the RAM requirements of ZFS should be less impactful. I got lots of use from my TrueNAS box with 16GB, and my dedicated cold backup build is just 8GB on 5x1TB WD Blue (gasp!) HDDs. I always wanted to try other NAS platforms, but I’m away from all my tech for a few years.
Lol.
If you have a spare box doing little, and a bunch of drives, it (or Unraid) is a reasonable solution. Proxmox can also build RAID from random drive sizes: I’m running one with 3 drives using ZFS RAID 0, and it has a terabyte of storage.
Yep, it’s gonna suck when one of those drives fails.
Well, as long as you’re aware of the risk and prepared for it, it’s not so bad to run in a volatile way like that. I ran my TN box for almost a decade on the same USB boot drive before I finally caved and picked up three Intel enterprise SSDs for the job, with one as a cold spare. Nothing in the box was critical or would be missed for more than a few beers of crying.
I would recommend avoiding RAID for backups. It’s preferable to have two separate backup disks in two distinct systems rather than relying on mirrored backup disks. If there’s a human error on the backup machine, you risk losing both backups simultaneously. Additionally, unforeseen events like a system failure due to a lightning strike could compromise your data. Ideally, you should have two backups stored in two different locations.
I use a RAID for the data, but the backups go to simple single disks. My reasoning is that I already have a RAID and redundancy, and I don’t have an unlimited budget. It’d already take 2 disks failing to wreck the RAID, and then the backup would also have to fail. That’s probably a fire or ransomware or a deliberate effort. Adding one more disk of redundancy would probably not change much, but it’d cost money and add complexity.
Also this way I don’t need to care about buying disks of a certain size and go through painful migration processes more than necessary. I can re-use the drives with mismatched sizes and swap them in to the backup pool.
RAID. No question. Or two individual drives each alternating full backups, which is what I do.
I just plugged in a new drive to replace an SSD that locked and wouldn’t write new backups. It failed a format attempt. I immediately ordered a replacement. Remember the rule: one is none.
And for fucksake, have an offsite backup.
2 single drives means 2 full copies, one of which you can keep at a friend’s place. 2 mirrored drives means that if you accidentally overwrite a backup, you have lost both drives to the error, unless you have snapshotting or incremental backups.
Lots of good backup advice on this podcast https://2.5admins.com/
Snapraid to a single drive works well if you are fine with daily snapshots of up to 6 drives.
I would go with RAID on the backup system too. You don’t want all your backups disappearing because one drive fails.
Depends on them not choosing the wrong RAID type :)
That is why I say “RAID0 is not RAID.”
Where? Not in what I replied to.
I have 1 off-site backup and two 10 TB external drives that are duplicate backups.
Generally speaking, fault protection schemes need only account for one fault at a time, unless you’re a really large business, or some other entity with extra-stringent data protection requirements.
RAID protects against drive failure faults. Backups protect against drive failure faults as well, but also things like accidental deletions or overwrites of data.
In order for RAID on backups to make sense when you already have RAID on your main storage, you’d have to consider drive failures and other data loss to be likely to occur simultaneously. I.e., RAID on your backups only protects you from a drive failure occurring WHILE you’re trying to restore a backup. Or maybe more generally, WHILE that backup is in use, say, if you have a legal requirement that you must keep a history of all your data for X years or something (I would argue data like this shouldn’t be classified as backups, though).
RAID is a choice if you’re (generally) trying to maximize storage capacity against cost of drive capacity. It was born out of a lack of drives of sufficient capacity.
Mirroring is useful for protection against hardware failures - it’s not a backup.
Follow the 3:3 rule: 3 backups, in 3 different “locations”. Locations in quotes because 2 different cloud storage providers count as 2 different locations.
Whether your “local” backup (in your location, at a friend’s house, etc) uses RAID depends on your requirements, cost sensitivity, etc.
I have a couple RAID setups only because I always have spare drives around, and it’s relatively cheap to build a box to run something like UnRAID or TrueNAS which can take advantage of mixed drive sizes.
My current setup is an old file server with a large drive that is currently replicating to an external drive, a small NAS, and Crashplan.
Not an ideal setup since 2 backups are local (though my NAS is easy to grab and run with, weighs about 10lbs).
Next phase is to move to Storj.io and switch to a proper backup tool like Borg.
I have a tiny archive of my own consisting of one 1 TB and one 2 TB USB HDD from different vendors. Whenever I want to save something, I put it on both. Btrfs snapshots make that really easy.
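For the "put it on both" habit above, a small helper like the following could copy a file to every archive drive and verify each copy. It is only a sketch: it uses plain file copies and SHA-256 rather than Btrfs snapshots, and the mount points in the usage comment are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path


def file_digest(path) -> str:
    """SHA-256 of a file, read in 1 MiB chunks so large files don't fill RAM."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def save_to_all(src, dest_dirs):
    """Copy src into every directory in dest_dirs and verify each copy by checksum."""
    src = Path(src)
    wanted = file_digest(src)
    written = []
    for directory in dest_dirs:
        target = Path(directory) / src.name
        shutil.copy2(src, target)  # copy2 also preserves timestamps
        if file_digest(target) != wanted:
            raise IOError(f"verification failed for {target}")
        written.append(target)
    return written


# Usage with hypothetical mount points for the two USB drives:
# save_to_all("scan.pdf", ["/mnt/archive-a", "/mnt/archive-b"])
```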
[Thread #622 for this sub, first seen 22nd Mar 2024, 23:15]