Score:4

Why does ZFS RAIDZ2 only use 2GB of data when I create a 1GB File

fr flag

I have created a ZFS RAIDZ2 (RAID 6 equivalent) pool, which, from what I understand, stores parity worth two disks.

root@zfs-demo:/data# zpool status
  pool: data
 state: ONLINE
config:

        NAME        STATE     READ WRITE CKSUM
        data        ONLINE       0     0     0
          raidz2-0  ONLINE       0     0     0
            sdb     ONLINE       0     0     0
            sdc     ONLINE       0     0     0
            sdd     ONLINE       0     0     0
            sde     ONLINE       0     0     0

errors: No known data errors

I have a 1GB file

root@zfs-demo:/data# ls -alh
total 1023M
drwxr-xr-x  2 root root    3 Dec 17 18:22 .
drwxr-xr-x 19 root root 4.0K Dec 17 18:10 ..
-rw-r--r--  1 root root 1.0G Dec 17 18:22 1GB.bin

I thought the two disks of parity would mean storing the file itself plus two lots of parity, i.e. 3 GB of storage in total for a 1 GB file, but only 2 GB is allocated.

root@zfs-demo:/data# zpool list
NAME   SIZE  ALLOC   FREE  CKPOINT  EXPANDSZ   FRAG    CAP  DEDUP    HEALTH  ALTROOT
data  39.5G  2.01G  37.5G        -         -     0%     5%  1.00x    ONLINE  -
ewwhite avatar
ng flag
For four disks, you should probably be using RAIDZ1 or ZFS mirrors. RAIDZ2 doesn't offer much benefit for that small number of disks.
fr flag
@ewwhite Thank you, this is just a lab environment for me to learn more about ZFS. I will be blowing it all away once I have answered all my questions, this being one of them.
Sunzi avatar
cn flag
Simple reasoning, no technical knowledge needed: you have four disks of 10 TB each, giving 20 TB usable and 20 TB lost to parity in RAID-Z2. So you need a 1:1 proportion of data to parity, or you could never fill the disks. If your proposed 1 GB file used 2 GB of parity space, the parity space would be full after 10 TB written, but you have 20 TB usable.
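Sunzi's proportional argument can be checked with a few lines of arithmetic. This is a sketch of the reasoning only; the 10 TB disk size is the hypothetical figure from the comment, not the pool in the question:

```python
# RAIDZ2 on 4 disks: per stripe, 2 disks' worth of data + 2 of parity.
disks = 4
parity = 2
data_disks = disks - parity              # 2

disk_size_tb = 10                        # hypothetical size from the comment
usable_tb = data_disks * disk_size_tb    # 20 TB usable
parity_tb = parity * disk_size_tb        # 20 TB lost to parity

# Parity overhead per unit of data: 1.0 means 1 GB of data
# costs 1 GB of parity, so a 1 GB file allocates 2 GB in total.
overhead = parity / data_disks

print(usable_tb, parity_tb, overhead)    # 20 20 1.0
```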
Score:8
ru flag

With two out of four disks used for redundancy, you can simply double the user data: two disks store the original data, and the same amount of space holds redundancy data on the other two disks. Parity is actually distributed across all disks by striping, but that doesn't change the space taken up.

With this number of disks you could use RAID 1/mirroring with the same space efficiency but better throughput (and less resilience, as Romeo Ninov has commented). RAID-Z2/RAID 6 becomes more efficient with more disks: with a total of ten disks, eight can effectively be used for data while still only two hold redundancy.

Romeo Ninov avatar
in flag
If by mirror you mean RAID 10 (we're talking about four disks), you will be in trouble if two disks fail and they belong to the same submirror. RAID 6, by contrast, survives any two missing disks without a problem.
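Romeo's point can be made concrete with a toy combinatorial check, assuming the four disks form two mirror pairs (the disk names here are made up for illustration):

```python
from itertools import combinations

# Four disks arranged as two mirror pairs (RAID 10): (a1, a2) and (b1, b2).
disks = ['a1', 'a2', 'b1', 'b2']
mirrors = [{'a1', 'a2'}, {'b1', 'b2'}]

two_disk_failures = list(combinations(disks, 2))

# RAID 10 loses data only when an entire mirror pair fails at once.
fatal = sum(1 for failed in two_disk_failures
            if any(pair <= set(failed) for pair in mirrors))

print(f"{fatal} of {len(two_disk_failures)} two-disk failures are fatal to RAID 10")
# RAIDZ2/RAID 6 on the same four disks survives all of them.
```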
Zac67 avatar
ru flag
@RomeoNinov Yes, Z2/RAID 6 is more resilient than RAID 1/10, but the latter are almost always faster, including when rebuilding.
mx flag
While efficiency for parity RAID improves with more disks, effective resilience is inversely proportional to the code rate (IOW, as the ratio of data disks to total disks approaches unity, the probability of losing enough disks at once to lose the whole array also approaches unity). This is part of why RAID-Z3 exists, and also why the norm for very large ZFS pools is to use multiple smaller arrays instead of one very big array. Most sensible admins would not consider using ten disks for a single RAID 6 array instead of just doing two five-disk RAID 5 arrays (or even two five-disk RAID 6 arrays).
Romeo Ninov avatar
in flag
@AustinHemmelgarn, taking into consideration the MTBF and size of contemporary disks, the time to recover the array after a single disk failure starts to exceed the MTBF (very simplified), so RAID 5 becomes highly inadvisable. And you are right, this is the reason RAID-Z3 exists.
Zac67 avatar
ru flag
@AustinHemmelgarn Similar to RAID 10 vs RAID 6 from Romeo's comment, RAID 50 is not as resilient as RAID 6 with the same number of disks.
Romeo Ninov avatar
in flag
And to add to Zac67's comment: most of the time RAIDZ2 + a spare helps increase resilience :)
Score:4
in flag

The situation is (explained simply, to get the idea across) this:

Suppose ZFS uses 512 MB blocks. Then disk 1 stores the first 512 MB of the file, disk 2 stores the next 512 MB, parity 1 stores a 512 MB parity block (so that, for example, the file can be restored from disk 1 and parity 1 alone), and parity 2 stores another 512 MB parity block, so the file can also be restored from disk 1 and parity 2.

Here is what you need to have up and running to recover the entire file:

  • d1+d2
  • d1+p1
  • d1+p2
  • d2+p1
  • d2+p2
  • p1+p2

If, for example, you have 5 disks (RAIDZ2) and the blocks work out to 333 MB, you will have one such block on disks 1, 2 and 3 plus parity 1 and 2: 1665 MB in total.
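The layout described above can be sketched as a small allocation model. It follows the answer's simplification (file split evenly across data disks, one same-sized parity block per parity disk), ignoring ZFS record sizes and padding:

```python
def raidz2_allocation(file_mb: float, total_disks: int, parity: int = 2):
    """Simplified model: the file is split evenly across the data disks,
    and each parity disk stores one block of the same size."""
    data_disks = total_disks - parity
    block_mb = file_mb / data_disks        # one block per data disk
    total_mb = block_mb * total_disks      # data blocks + parity blocks
    return block_mb, total_mb

# 1 GB file on the 4-disk pool: two 512 MB data blocks + two 512 MB
# parity blocks, so 2 GB allocated in total.
print(raidz2_allocation(1024, 4))          # (512.0, 2048.0)

# 1 GB file on a 5-disk RAIDZ2: three ~341 MB data blocks + two parity
# blocks (the answer's 333 MB figure rounds with 1 GB ~= 1000 MB).
print(raidz2_allocation(1024, 5))
```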

fr flag
Thank you, but I'm not sure I understand your answer in relation to my question of why a 1 GB file uses only 2 GB of storage space. If disk 1 stores the 1 GB file, disk 2 stores 1 GB of parity and disk 3 stores another 1 GB of parity, shouldn't I see a total disk usage of 3 GB?
Romeo Ninov avatar
in flag
@PrestonDocks, please read my answer. D1 stores half of the file and D2 stores the other half. P1 stores parity the size of half the file, and the same goes for P2.
Romeo Ninov avatar
in flag
@PrestonDocks, yes, this is how it is stored. And no: parity is calculated so that the array can restore the file from the original information plus the parities on P1 and P2. See the pairs in my answer that are sufficient to reconstruct the file.