Hi y’all,
I am exploring TrueNAS and configuring some ZFS datasets. Since ZFS provides some parameters to tune a dataset to the type of data it holds, I thought it would be good to take advantage of them. So I’m here with the simple task of choosing the appropriate “record size”.
Initially I thought this was simple: the dataset is meant to store videos, movies and TV shows for a Jellyfin Docker container, so the files are generally large and a record size of 1M sounds like a good idea (as suggested in Jim Salter’s cheatsheet).
Out of curiosity, I ran Wendell’s magic command from Level1Techs to get a sense of the file size distribution:
find . -type f -print0 | xargs -0 ls -l | awk '{ n=int(log($5)/log(2)); if (n<10) { n=10; } size[n]++ } END { for (i in size) printf("%d %d\n", 2^i, size[i]) }' | sort -n | awk 'function human(x) { x[1]/=1024; if (x[1]>=1024) { x[2]++; human(x) } } { a[1]=$1; a[2]=0; human(a); printf("%3d%s: %6d\n", a[1],substr("kMGTEPYZ",a[2]+1,1),$2) }'
That’s when I discovered it was not so simple. The directory is obviously filled with videos, but also with lots of tiny files: subtitles, NFOs, and small illustration images, all valuable for Jellyfin’s media organization.
That’s where I’m at. The way I see it, there are several options:
So what do you think? And how have you personally set it up? I would love to get some feedback, especially if you are also using ZFS and have a video library with a dedicated dataset. Thanks!
Edit: Alright, so I found the following post by Jim Salter which goes into more detail about record size. It cleared up my misconception that recordsize is the same thing as the block size, and it also points out that it can easily be changed at any time. It’s just the size of the chunks of data to be read. So I’ll stick with a 1M recordsize and leave it at that despite having many smaller files, because the important thing will be to stream the larger files efficiently. Thank you all!
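For anyone landing here later, this is roughly what checking and changing the property looks like. It is only a sketch: "tank/media" is a placeholder dataset name, not my actual pool layout, so substitute your own.

```shell
# "tank/media" is a hypothetical dataset name; replace it with yours.

# Show the current recordsize of the dataset.
zfs get recordsize tank/media

# Set it to 1M, then confirm the change.
zfs set recordsize=1M tank/media
zfs get recordsize tank/media
```

Note that a new recordsize only applies to data written after the change. Existing files keep the block size they were written with until they are rewritten.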
Let me clarify:
Recordsize is basically the block size that checksums are calculated over. Whenever you write data, it is written in blocks of up to recordsize (smaller if the file itself is smaller), and the checksum is then calculated per block.
A smaller recordsize only helps with random-ish accesses inside a file.
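To make the "up to recordsize, smaller if the file is smaller" point concrete, here is a rough sketch. The helper name blocks_for is made up for illustration, and it assumes no compression, ignores sector (ashift) rounding and metadata, and takes both sizes in bytes.

```shell
#!/bin/sh
# Hypothetical helper: estimate how a file of SIZE bytes is split into
# blocks for a given recordsize (both in bytes). Simplified on purpose:
# no compression, no sector rounding, no metadata.
blocks_for() {
    size=$1
    recordsize=$2
    if [ "$size" -le "$recordsize" ]; then
        # A file no larger than recordsize is stored as a single block
        # sized to the file, not padded out to the full recordsize.
        echo "1 block of up to $size bytes"
    else
        # Larger files are split into recordsize-sized blocks
        # (ceiling division gives the block count).
        count=$(( (size + recordsize - 1) / recordsize ))
        echo "$count blocks of $recordsize bytes"
    fi
}

blocks_for 20480 1048576        # a 20 KiB subtitle file, recordsize=1M
blocks_for 4294967296 1048576   # a 4 GiB video file, recordsize=1M
```

With these simplifications, the small subtitle file still only takes one small block, while the video is read and checksummed in 1M chunks, which is why the tiny files are not a reason to lower recordsize.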