About a year ago I switched to ZFS for Proxmox so that I wouldn’t be running a technology preview.

Btrfs gave me no issues for years, and I even replaced a dying disk without trouble. I use raid 1 for my Proxmox machines. Anyway, I moved to ZFS and it has been a less than ideal experience. The separate kernel modules mean that I can’t downgrade the kernel, plus the performance on my hardware is abysmal. I get only about 50-100 MB/s vs the several hundred I would get with btrfs.

Any reason I shouldn’t go back to btrfs? There seems to be a community fear of btrfs eating data or having unexplainable errors. That is sad to hear as btrfs has had lots of time to mature in the last 8 years. I would never have considered it 5-6 years ago but now it seems like a solid choice.

Anyone else pondering or using btrfs? It seems like a solid choice.

@SRo@lemmy.dbzer0.com

One time I had a power outage and one of the btrfs HDs (not in a raid) couldn’t be read anymore after reboot. Even with help from the (official) btrfs mailing list, it was impossible to repair the file system. After a lot of low-level tinkering I was able to retrieve the files, but the file system itself was absolutely broken; no repair process was possible. I have since switched to ZFS, where the emergency options are much more capable.

Possibly linux
creator

Was that less than 2 years ago? Were you using kernel 5.15 or newer?

@SRo@lemmy.dbzer0.com

Yes, that was May/June ’23 and I was on a 6.x kernel.

@zarenki@lemmy.ml

I’ve been using single-disk btrfs for my rootfs on every system for almost a decade. Great for snapshots while still being an in-tree driver. I also like being able to use subvolumes to treat / and /home (maybe others) similar to separate filesystems without actually being different partitions.

I had used it for my NAS array too, with btrfs raid1 (on top of luks), but migrated that over to ZFS a couple years ago because I wanted to get more usable storage space for the same money. btrfs raid5 is widely reported to be flawed and seemed to be in purgatory of never being fixed, so I moved to raidz1 instead.

One thing I miss is heterogeneous arrays: with btrfs I can gradually upgrade my storage one disk at a time (without rewriting the filesystem) and it uses all of my space. For example, two 12TB drives, two 8TB drives, and one 4TB drive add up to 44TB, and raid1 cuts that in half to 22TB of effective space. ZFS doesn’t do that. Before I could migrate to ZFS I had to commit to buying a bunch of new drives (5x12TB, not counting the backup array) so that every drive is the same size and I felt confident it would be enough space to last me a long time, since growing it after the fact is a burden.
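
The capacity arithmetic above can be sketched in a few lines of Python. This assumes the commonly cited rule of thumb for two-copy btrfs raid1: usable space is the smaller of half the total and the total minus the largest disk, because every chunk needs its second copy on a different device. The function name is made up for illustration, and real numbers will come out a little lower once metadata and chunk allocation are accounted for.

```python
def btrfs_raid1_usable_tb(disk_sizes_tb):
    """Rough usable capacity of a two-copy btrfs raid1 on mixed-size disks.

    Assumption (rule of thumb, not an exact allocator model): each chunk is
    stored on two different devices, so usable space is limited both by half
    the raw total and by how much the other disks can mirror of the largest.
    """
    total = sum(disk_sizes_tb)
    largest = max(disk_sizes_tb)
    return min(total / 2, total - largest)


# The mixed array from the comment: 2x12TB, 2x8TB, 1x4TB.
mixed = [12, 12, 8, 8, 4]
print(btrfs_raid1_usable_tb(mixed))  # 22.0
```

The second term only matters for lopsided arrays: a 12TB disk paired with a single 4TB disk can only mirror 4TB, no matter what the raw total says.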

Possibly linux
creator

Btrfs Raid 10 reportedly is stable

@stuner@lemmy.world

With version 2.3 (currently in RC), ZFS will at least support RAIDZ expansion. That should already help a lot for a NAS use case.

exu

Did you set the correct block size for your disk? Especially modern SSDs like to pretend they have 512B sectors for some compatibility reason, while the hardware can only do 4k sectors. Make sure to set ashift=12.
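
For anyone wondering where the 12 comes from: ashift is the base-2 logarithm of the sector size the pool assumes, so 4096-byte sectors give ashift=12 and 512-byte sectors give ashift=9. A small sketch (the helper name is just for illustration):

```python
def ashift_for_sector_size(sector_bytes):
    """Return the ashift value (log2 of the sector size) for a sector size.

    Raises ValueError if the size is not a positive power of two.
    """
    if sector_bytes <= 0 or sector_bytes & (sector_bytes - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_bytes.bit_length() - 1


for size in (512, 4096, 8192):
    print(size, "->", ashift_for_sector_size(size))
```

As far as I know, ashift is fixed per vdev when it is created (for example via `zpool create -o ashift=12 ...`), so it is worth checking before loading data onto the pool.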

Proxmox also uses a very small volblocksize by default. This mostly applies to RAIDz, but try using a higher value like 64k. (Default on Proxmox is 8k or 16k on newer versions)

https://discourse.practicalzfs.com/t/psa-raidz2-proxmox-efficiency-performance/1694
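
To see why a small volblocksize hurts on RAIDZ, here is a rough sketch of the space accounting. It assumes the allocation rule commonly described for RAIDZ: each block gets parity sectors for every stripe row it spans, and the total is padded up to a multiple of (parity + 1) sectors. The 6-disk RAIDZ2 with ashift=12 is a hypothetical example, not anyone's setup from this thread.

```python
import math


def raidz_allocated_sectors(block_bytes, ndisks, nparity, ashift=12):
    """Sectors RAIDZ allocates for one block (data + parity + padding).

    Assumed model: parity is added per stripe row of (ndisks - nparity)
    data sectors, and the result is padded to a multiple of (nparity + 1).
    """
    sector = 1 << ashift
    data = math.ceil(block_bytes / sector)
    parity = math.ceil(data / (ndisks - nparity)) * nparity
    multiple = nparity + 1
    return math.ceil((data + parity) / multiple) * multiple


def raidz_efficiency(block_bytes, ndisks, nparity, ashift=12):
    """Fraction of the allocated sectors that hold actual data."""
    data = math.ceil(block_bytes / (1 << ashift))
    return data / raidz_allocated_sectors(block_bytes, ndisks, nparity, ashift)


# Hypothetical 6-disk RAIDZ2 with 4K sectors.
for kib in (8, 16, 64):
    eff = raidz_efficiency(kib * 1024, ndisks=6, nparity=2)
    print(f"volblocksize {kib}K: {eff:.0%} of allocated space is data")
```

Under this model an 8K volume block on that layout is stored as 2 data sectors plus 2 parity sectors plus 2 padding sectors, so only a third of the allocated space holds data, while larger blocks approach the nominal 4-of-6 ratio.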

I’m thinking of bumping mine up to 128k since I do mostly photography and videography, but I’ve heard that 1M can increase write speeds but decrease read speeds?

I’ll have a RAIDZ1 and a RAIDZ2 pool for hot storage and warm storage.

Nicht BurningTurtle

Haven’t had any btrfs problems yet; in fact, CoW saved me a few times on my desktop.

@Heavybell@lemmy.world

Can you elaborate for the curious among us?

Nicht BurningTurtle

btrfs + timeshift saved me multiple times, when updates broke random stuff.

@Heavybell@lemmy.world

I have research to do, I see.

Avid Amoeba

You shouldn’t have abysmal performance with ZFS. Something must be up.

Possibly linux
creator

What’s up is ZFS. It is solid but the architecture is very dated at this point.

There are about a hundred different settings I could try to change but at some point it is easier to go btrfs where it works out of the box.

Since most people with decently simple setups don’t have the described problem, something’s likely up with your setup.

Yes, it’s old and yes, it’s complicated, but it doesn’t have to be to get decent performance.

Avid Amoeba

I used to run a mirror for a while with WD USB disks. Didn’t notice any performance problems. I used Ubuntu LTS, which has a built-in ZFS module, not DKMS, although I doubt there are performance problems stemming from DKMS.

Possibly linux
creator

I have been trying to get ZFS working well for months. Also I am not the only one having issues as I have seen lots of other posts about similar problems.

I don’t doubt that you have problems with your setup. Given the large number of (simple) ZFS setups that are working flawlessly, there are bound to be a large number of issues to be found on the Internet. People who are discontent voice their opinion more often and more loudly than people who are satisfied.

@jj4211@lemmy.world

You’ve been downvoted, but I’ve seen a fair share of ZFS implementations confirm your assessment.

E.g. “Don’t use ZFS if you care about performance, especially on SSD” is a fairly common refrain in response to anyone asking about how to get the best performance out of their solution.

You have angered the zfs gods!

Possibly linux
creator

I have gotten a ton of people to help me. Sometimes it is easier to piss people off to gather info and usage tips.

Avid Amoeba

What seems dated in its architecture? Last time I looked at it, it struck me as pretty modern compared to what’s in use today.

Possibly linux
creator

It doesn’t share well. Anytime anything IO heavy happens the system completely locks up.

That doesn’t happen on other systems

Avid Amoeba

That doesn’t speak much of the architecture. Also, it’s really odd. I’m not denying that what you’re seeing is happening, just that it seems odd based on the setups I run with ZFS. My main server is in fact a shared machine that I use as a workstation and for games as well as a server. Everything works in parallel. I used to have a mirror, then a 4-disk RAIDz and now an 8-disk RAIDz2. I have multiple applications constantly using the pool. I don’t notice any performance slowdowns on the desktop, or in-game when IO goes high. The only time I notice anything is when something like multiple Plex transcoders hit the CPU hard. Sequential performance is around 1.3GB/s, which is limited by the data bus speeds (USB DAS boxes). Random performance is very good, although I don’t have any numbers off the top of my head. I’m using mostly shucked WD Elements disks and a couple of IronWolfs. No enterprise-grade disks on this system.

I’m also not saying that you have to keep fucking around with it instead of going Btrfs. I’m simply adding another anecdote to the picture. If I had a serious problem like that and couldn’t figure it out, I’d be on LVMRAID+Ext4, which is what I used prior to ZFS.

Possibly linux
creator

Yeah maybe my machines are cursed

Avid Amoeba

That is totally possible. I spent a month changing boards and CPUs to fix a curse on my main, unrelated to storage. In case you’re curious.

I doubt that. Some options:

  • bad memory
  • failing drives
  • silent CPU faults
  • poor power delivery

The list is endless. Maybe BTRFS is more tolerant of the problems you’re facing, but that doesn’t mean the problems are specific to ZFS. I recommend doing a bit of testing to see if everything looks fine on the HW side of things (memtest, smart tests, etc).

Possibly linux
creator

I set the Arc cache to 4GB and it is working better now

@vividspecter@lemm.ee

No reason not to. Old reputations die hard, but it’s been many many years since I’ve had an issue.

I also like that btrfs is a lot more flexible than ZFS, which is pretty strict about the size and number of disks, whereas you can upgrade a btrfs array ad hoc.

I’ll add to avoid RAID5/6 as that is still not considered safe, but you mentioned RAID1 which has no issues.

I’ve been vaguely planning on using btrfs in raid5 for my next storage upgrade. Is it really so bad?

@vividspecter@lemm.ee

Check status here. It looks like it may be a little better than the past, but I’m not sure I’d trust it.

An alternative approach I use is mergerfs + snapraid + snapraid-btrfs. This isn’t the best idea for a system drive, but if it’s something like a NAS it works well and snapraid-btrfs doesn’t have the write hole issues that normal snapraid does since it operates on r/o snapshots instead of raw data.

@sntx@lemm.ee

It’s affected by the write-hole phenomenon. In BTRFS’s case, that can mean that perfectly good old data might get corrupted without any notice.

Don’t use btrfs if you need RAID 5 or 6.

The RAID56 feature provides striping and parity over several devices, same as the traditional RAID5/6. There are some implementation and design deficiencies that make it unreliable for some corner cases and the feature should not be used in production, only for evaluation or testing. The power failure safety for metadata with RAID56 is not 100%.

https://btrfs.readthedocs.io/en/latest/btrfs-man5.html#raid56-status-and-recommended-practices

@lurklurk@lemmy.world

Or run the raid 5 or 6 separately, with hardware raid or mdadm

Even for simple mirroring there’s an argument to be made for running it separately from btrfs using mdadm. You do lose the benefit of btrfs being able to automatically pick the valid copy on localised corruption, but the admin tools are easier to use and more proven in a case of full disk failure, and if you run an encrypted block device you need to encrypt half as much stuff.

@Eideen@lemmy.world

I have no problem running it with raid 5/6. The important thing is to have a UPS.

@dogma11@lemmy.world

I’ve been running a btrfs storage array with data on raid5 and metadata on (I believe) raid1 for the last 5 or so years and have yet to have a problem because of it. I did unfortunately learn not to fully trust the Windows btrfs driver, but was fortunately able to recover by restoring from backups and redownloading.

I wouldn’t hesitate to set it up again for myself or anybody else, and adding a UPS would be icing on the cake. (I added UPS to my setup this last summer)

@Anonymouse@lemmy.world

I’ve got raid 6 at the base level and LVM for partitioning and ext4 filesystem for a k8s setup. Based on this, btrfs doesn’t provide me with any advantages that I don’t already have at a lower level.

Additionally, for my system, btrfs uses more bits per file or something, such that I was running out of disk space compared to ext4. Yeah, I can go buy more disks, but I like to think that I’m running at peak efficiency, using all the bits, with no waste.

btrfs doesn’t provide me with any advantages that I don’t already have at a lower level.

Well yeah, because it’s supposed to replace those lower levels.

Also, BTRFS does provide advantages over ext4, such as snapshots, which I think are fantastic since I can recover if things go sideways. I don’t know what your use-case is, so I don’t know if the features BTRFS provides would be valuable to you.

@Anonymouse@lemmy.world

Generally, if a lower level can do a thing, I prefer to have the lower level do it. It’s not really a reason, just a rule of thumb. I like to think that the lower level is more efficient to do the thing.

I use LVM snapshots to do my backups. I don’t have any other reason for it.

That all being said, I’m using btrfs on one system and if I really like it, I may migrate to it. It does seem a whole lot simpler to have one thing to learn than all the layers.

@jj4211@lemmy.world

Actually, the lower level may likely be less efficient, due to being oblivious about the nature of the data.

For example, a traditional RAID1 mirror on creation immediately starts a rebuild across all the potential data capacity of the storage, without a single byte of actual data written. So you spend an entire drive wipe making “don’t care” bytes redundant.

Similarly, for snapshotting, it can only track dirty blocks. So when you replace uninitialized data that means nothing with actual data, the snapshot layer is compelled to back up that uninitialized data, because it has no idea whether the blocks replaced were uninitialized junk or real stuff.

There are some mechanisms in theory and in practice to convey a bit of context to the block layer, but broadly speaking, by virtue of being a mostly oblivious block level, you have to resort to the most naive and often inefficient approaches.

That said, block capacity is cheap, and doing things at the block level can be done in a ‘dumb’ way, which may be easier for an implementation to get right, versus a more clever approach with a bigger surface for mistakes.

@Anonymouse@lemmy.world

Those are some good points. I guess I was thinking about the hardware. At least where I do RAID, it’s on the controller, so that offloads much of the parity checking and such to the controller and not the CPU. It’s all probably negligible for the apps that I run, but my hardware is quite old, so maybe trying to squeeze all the performance I can is a worthwhile activity.

Yup, I used to use LVM, but the two big NAS filesystems have a ton of nice features and they expect to control the disk management. I looked into BTRFS and ZFS, and since BTRFS is native to Linux (some of my SW doesn’t support BSD) and I don’t need anything other than RAID mirror, that’s what I picked.

I used LVM at work for simple RAID 0 systems where long term uptime was crucial and hardware swaps wouldn’t likely happen (these were treated like IOT devices), and snapshots weren’t important. It works well. But if you want extra features (file-level snapshots, compression, volume quotas, etc), BTRFS and ZFS make that way easier.

@Anonymouse@lemmy.world

I am interested in compression. I may give it a try when I swap out my desktop system. I did try btrfs in its early, post-alpha stage, but found that the support was not ready yet. I think I had a VM system that complained. It is older now and more mature, and maybe it’s worth another look.

Domi

btrfs has been the default file system for Fedora Workstation since Fedora 33 so not much reason to not use it.

Suzune

The question is how do you get a bad performance with ZFS?

I just tried to read a large file and it gave me uncached 280 MB/s from two mirrored HDDs.

The fourth run (obviously cached) gave me over 3.8 GB/s.

Possibly linux
creator

I have never heard of anyone getting those speeds without dedicated high end hardware

Also the write will always be your bottleneck.

@Moonrise2473@feddit.it

I have similar speeds on a truenas that I installed on a simple i3 8100

Possibly linux
creator

How much ram and what is the drive size?

I suspect this could also be an issue with SSDs. I have seen a lot of posts around describing similar performance on SSDs.

@Moonrise2473@feddit.it

64 gb of ecc ram (48gb cache used by zfs) with 2tb drives (3 of them)

Possibly linux
creator

Yeah it sounds like I don’t have enough ram.

ZFS really likes RAM, so if you’re running anything less than 16GB, that could be your issue.

Possibly linux
creator

From the Proxmox documentation:

As a general rule of thumb, allocate at least 2 GiB Base + 1 GiB/TiB-Storage. For example, if you have a pool with 8 TiB of available storage space then you should use 10 GiB of memory for the ARC.
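
The quoted rule of thumb is simple enough to write down as a one-liner. The function name and defaults here are my own; only the 2 GiB base and 1 GiB per TiB come from the quote.

```python
def recommended_arc_gib(pool_tib, base_gib=2, gib_per_tib=1):
    """ARC size in GiB per the quoted rule: 2 GiB base + 1 GiB per TiB."""
    return base_gib + gib_per_tib * pool_tib


print(recommended_arc_gib(8))  # 10
```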

I changed the ARC size on all my machines to 4GB and I am getting much better performance. I thought I had changed it before, but I didn’t regenerate the initramfs, so it didn’t apply. I am still having issues with VM transfers locking up the cluster, but that might be fixable by tweaking some settings.
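
For reference, the ARC limit is given in bytes via the `zfs_arc_max` module parameter. A minimal sketch that builds the options line for a given size; the helper name is hypothetical, and the file location is my understanding of the usual setup (`/etc/modprobe.d/zfs.conf`), so double-check it against the Proxmox docs for your version.

```python
def zfs_arc_max_option(gib):
    """Build the modprobe options line capping the ZFS ARC at `gib` GiB."""
    size_bytes = gib * 1024**3  # zfs_arc_max is specified in bytes
    return f"options zfs zfs_arc_max={size_bytes}"


# The 4 GiB cap mentioned above.
print(zfs_arc_max_option(4))  # options zfs zfs_arc_max=4294967296
```

As noted above, when the root filesystem is on ZFS the initramfs has to be regenerated (on Debian-based systems, `update-initramfs -u`) and the machine rebooted before the new limit applies at boot.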

16GB might be overkill or underkill depending on what you are doing.

@stuner@lemmy.world

I’m seeing very similar speeds on my two-HDD RAID1. The computer has an AMD 8500G CPU but the load from ZFS is minimal. Reading / writing a 50GB /dev/urandom file (larger than the cache) gives me:

  • 169 MB/s write
  • 254 MB/s read
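
If you want to run a comparable test, here is a rough sketch of a sequential write/read timing in Python. It is not the method used for the numbers above, just one way to get a ballpark figure; the function name and sizes are arbitrary. Note that the read figure only means something if the file is larger than the cache (ARC or page cache), otherwise you are measuring RAM.

```python
import os
import tempfile
import time


def sequential_throughput_mb_s(path, total_bytes, chunk_bytes=1024 * 1024):
    """Write then read `total_bytes` at `path`; return (write, read) in MB/s."""
    chunk = os.urandom(chunk_bytes)  # random data, so compression can't help

    start = time.perf_counter()
    written = 0
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
        f.flush()
        os.fsync(f.fileno())  # make sure the data actually reached the disk
    write_seconds = time.perf_counter() - start

    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(chunk_bytes):
            pass
    read_seconds = time.perf_counter() - start

    megabytes = total_bytes / 1e6
    return megabytes / write_seconds, megabytes / read_seconds


if __name__ == "__main__":
    # Tiny demo size in a temp dir; use a size larger than your cache
    # on the pool you actually want to measure.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "bench.bin")
        write_mb_s, read_mb_s = sequential_throughput_mb_s(target, 16 * 1024 * 1024)
        print(f"write: {write_mb_s:.0f} MB/s, read: {read_mb_s:.0f} MB/s")
```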

What’s your setup?

Possibly linux
creator

Maybe I am CPU bottlenecked. I have a mix of i5-8500 and i7-6700k

The drives are a mix but I get almost the same performance across machines

@stuner@lemmy.world

It’s possible, but you should be able to see it quite easily. In my case, the CPU utilization was very low, so the same test should also not be CPU-bottlenecked on your system.

Possibly linux
creator

Is your machine part of a cluster by chance? If so, when you do a VM transfer what performance do you see?

@stuner@lemmy.world

Unfortunately, I can’t help you with that. The machine is not running any VMs.

Suzune

This is an old PC (Intel i7 3770K) with 2 HDDs (16 TB) attached to onboard SATA3 controller, 16 GB RAM and 1 SSD (120 GB). Nothing special. And it’s quite busy because it’s my home server with a VM and containers.

@Bookmeat@lemmy.world

A bit off topic; am I the only one that pronounces it “butterface”?

Not anymore.

You son of a bitch, I’m in.

@uhmbah@lemmy.ca

Ah feck. Not any more.

@combatfrog@sopuli.xyz

Similarly, I read bcachefs as BCA Chefs 😅

downhomechunk

I call it butter fuss. Yours is better.

@adept@programming.dev

Related, and I cannot help but read “bcachefs” as “bitch café”

Isn’t it meant to be like “better FS”? So you’re not too far off.

i call it “butter FS”

It was meant to be Better FS, but it corrupted it to btrfs without noticing.

@sem@lemmy.blahaj.zone

Btrfs came default with my new Synology, where I have it in Synology’s raid config (similar to raid 1 I think) and I haven’t had any problems.

I don’t recommend the btrfs drivers for Windows 10. I had a drive using this and it would often become unreachable under load, but this is more a Windows problem than a problem with btrfs.

poVoq

I have been using btrfs in raid1 for a few years now with no major issues.

It’s a bit annoying that a system with a degraded raid doesn’t boot up without manual intervention though.

Also, I’m not sure why, but I recently broke a system installation on btrfs by taking out the drive and accessing it (and writing to it) from another PC via a USB adapter. But I guess that is not a common scenario.

The whole point of RAID redundancy is uptime. The fact that btrfs doesn’t boot with a degraded disk is utterly ridiculous and speaks volumes of the developers.

@tychosmoose@lemm.ee

Using it here. Love the flexibility and features.

@Lem453@lemmy.ca

Btrfs only has issues with raid 5. Works well for raid 1 and 0. No reason to change if it works for you

Possibly linux
creator

It is stable with raid 0,1 and 10.

Raid 5 and 6 are dangerous

I think it has more issues than just with raid 5 & 6!

TFO Winder

I used it in a development environment. I didn’t need the snapshot feature, and it didn’t have a straightforward swap setup, which led to performance issues because of frequent writes to swap.

Not a big issue but annoyed me a bit.
