
RAIDZ2 on 5 disks?

Posted: 24 Mar 2017 00:03
by geos
I have six 3TB disks and I would like to build a RAIDZ2 ZFS array. Many guides on the internet say that the minimum recommended number of disks for RAIDZ2 is 6. OK, but will RAIDZ2 operation be seriously affected if the array is created from 5 disks instead of 6? I would like to keep one disk as a spare in case a replacement is needed. The usable capacity of 3x3TB is fine for me: two more disks' worth of parity brings the array to 5 disks, and one disk is kept as a spare. But if there is something really wrong with this setup, I would consider using all 6 drives for the array. From your experience, what are the downsides of RAIDZ2 on 5 disks instead of 6 (apart from total capacity, of course)?

Thank you,
geos

Re: RAIDZ2 on 5 disks?

Posted: 24 Mar 2017 08:53
by noclaf
It's generally recommended to have # of disks = 2^x + # of parity disks. E.g. for Z2 that's 4 (although having Z2 on 4 disks is silly :) ), 6, 10, etc.
A different total number of disks will work, but the performance might be suboptimal. I assume this is related to more overhead in parity calculation.
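
Reading the rule of thumb above as "a power-of-two number of data disks plus the parity disks" (which matches the 4, 6, 10 examples), the suggested array widths can be listed with a short sketch. The function name `rule_of_thumb_widths` is made up for illustration, and the code only restates the guideline; it says nothing about measured performance.

```python
def rule_of_thumb_widths(parity, max_exponent=4):
    """Total disk counts of the form 2**x + parity, for x = 1..max_exponent."""
    return [2 ** x + parity for x in range(1, max_exponent + 1)]


# Parity disk count per RAID-Z level.
for level, parity in (("RAIDZ1", 1), ("RAIDZ2", 2), ("RAIDZ3", 3)):
    print(level, rule_of_thumb_widths(parity))
```

A 5-disk RAIDZ2 has 3 data disks, so it does not appear in the RAIDZ2 list, which is why the guides steer people toward 6.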
----
EDIT :
https://forums.anandtech.com/threads/zf ... t-35760300
This has to do with the recordsize of 128KiB that gets divided over the number of disks. Example for a 3-disk RAID-Z writing 128KiB to the pool:
disk1: 64KiB data (part1)
disk2: 64KiB data (part2)
disk3: 64KiB parity

Each disk now gets 64KiB, which is an exact multiple of 4KiB. This means it is efficient and fast. Now compare this with a non-optimal configuration of 4 disks in RAID-Z:
disk1: 42.67KiB data (part1)
disk2: 42.67KiB data (part2)
disk3: 42.67KiB data (part3)
disk4: 42.67KiB parity

Now this is ugly! It will either be padded down to 42.5KiB or padded up to 43.0KiB, which can vary per disk. Both of these are non-optimal for 4KiB-sector hard drives, because neither 42.5KiB nor 43KiB is a whole multiple of 4KiB. It needs to be a multiple of 4KiB to be optimal.
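
The arithmetic in the quoted explanation can be reproduced with a minimal sketch. It models only the simple "split one record evenly over the data disks" picture described above, not how ZFS actually lays out blocks, and the helper names are invented for this example.

```python
RECORD_KIB = 128   # recordsize used in the quoted explanation
SECTOR_KIB = 4     # 4KiB physical sectors, as in the quoted explanation


def per_disk_chunk_kib(total_disks, parity_disks, record_kib=RECORD_KIB):
    """Size of the piece each data disk receives under an even split."""
    data_disks = total_disks - parity_disks
    return record_kib / data_disks


def is_sector_aligned(chunk_kib, sector_kib=SECTOR_KIB):
    """True when the chunk is a whole multiple of the sector size."""
    return chunk_kib % sector_kib == 0


# (total disks, parity disks): the two RAID-Z examples, then 5- and 6-disk RAIDZ2.
for total, parity in ((3, 1), (4, 1), (5, 2), (6, 2)):
    chunk = per_disk_chunk_kib(total, parity)
    print(f"{total} disks, {parity} parity: {chunk:.2f} KiB per disk, "
          f"aligned={is_sector_aligned(chunk)}")
```

Under this model a 5-disk RAIDZ2 has the same 3 data disks as the 4-disk RAID-Z example, so it gets the same 42.67KiB chunks.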

Re: RAIDZ2 on 5 disks?

Posted: 24 Mar 2017 09:56
by ms49434
Unfortunately, much of the information on the internet about ZFS is not 100% accurate.

Have a read of the following article and decide for yourself: https://www.delphix.com/blog/delphix-en ... love-raidz
Matthew Ahrens co-founded the ZFS project back in 2001: http://www.open-zfs.org/wiki/User:Mahrens

Re: RAIDZ2 on 5 disks?

Posted: 24 Mar 2017 16:30
by geos
Is my understanding correct that it would be OK to set a 96KiB record size for RAIDZ2 on 5 disks (32KiB chunks would go to the "data" and "parity" disks) to avoid "ugly chunks" being written here and there? My disks are 3TB with a physical block size of 512 bytes (not 512e).
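
The numbers in this question can be checked with a small sketch. It assumes the same simplified even-split model from the quoted explanation, and `check_recordsize` is a hypothetical helper, not a ZFS tool. The power-of-two field is included because ZFS only accepts power-of-two values for the `recordsize` property.

```python
def check_recordsize(record_bytes, data_disks, sector_bytes):
    """Report how a record would split over the data disks (even-split model)."""
    chunk, remainder = divmod(record_bytes, data_disks)
    return {
        "chunk_bytes": chunk,
        "splits_evenly": remainder == 0,
        "sector_aligned": remainder == 0 and chunk % sector_bytes == 0,
        # A positive integer is a power of two when it has a single bit set.
        "power_of_two": record_bytes & (record_bytes - 1) == 0,
    }


# 5-disk RAIDZ2 = 3 data disks; 512-byte physical sectors as stated in the post.
print(check_recordsize(96 * 1024, data_disks=3, sector_bytes=512))
```

So the split itself is clean (32KiB per data disk, a whole number of 512-byte sectors), but 96KiB is not a power of two, so it would not be accepted as a `recordsize` value.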

Re: RAIDZ2 on 5 disks?

Posted: 25 Mar 2017 11:10
by noclaf
According to the article posted by ms49434 (read it, it's interesting), even these "ugly" chunks shouldn't be an issue.
I personally can't give you a yes or no, because I run RAID 5 on a hardware RAID card, not RAIDZ. And I was not able to find real-life throughput tests based on the number of disks and the parity level. Sorry. :)