This is the old XigmaNAS forum in read-only mode;
it will be taken offline by the end of March 2021!
We ask users and admins to rewrite/carry over important posts from here into the fresh new main forum!
It is not possible for us to export from here and import into the main forum!
ZFS Degraded after replacing Faulty Disk
-
renjithc
- Starter

- Posts: 33
- Joined: 30 Jun 2012 07:35
- Status: Offline
ZFS Degraded after replacing Faulty Disk
I'm a relatively basic user of NAS4Free (9.0.0.1 - Sandstorm revision 148) with ZFS, but I'm quite concerned about the issue I'm facing here:
1) I have a 3x3TB RAIDZ ZFS pool
2) One of the drives went faulty
3) I shut down, put in a new drive on the same port as I've done before, and ran the replace command
4) Resilvering started as always
5) Found out after 28 hrs that some 12GB of data was corrupt and had permanent errors (some 8-10 files)
6) Resilvering completed and I deleted the so-called faulty files, but the disk still shows as "replacing"
7) It's still in a degraded state, and I just started a scrub
At this stage I have serious concerns: I don't have a full backup anywhere and the data on it is valuable. Please advise what needs to be done before I do anything stupid and start panicking.
Attached Screenshot of the ZFS Info Screen
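For readers landing here later, the replace-and-resilver sequence described above corresponds roughly to the following commands; the pool name `tank` and the device names are assumptions for illustration, not taken from the screenshot:

```shell
# Check pool health and see which vdev is faulted
# (pool name 'tank' and device names are examples)
zpool status -v tank

# After swapping in the new disk on the same port, start the rebuild
zpool replace tank ada1

# Re-check progress: the vdev shows as "replacing" while the resilver
# runs, and the old device entry normally disappears once it completes
zpool status tank
```

If a vdev stays in the "replacing" state after the resilver reports complete, ZFS was typically unable to fully reconstruct it (for example because of read errors on the other disks), which appears to match what happened here.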
ProdnNAS: Norco 4224 Chassis, Supermicro X8SiL-F Mobo, Xeon X3430, 2xM1015, 1xSil3112, 32GB (4x8GB) ECC RDIMM, 8x3TB WD Red/Purple Mix RAIDZ2 (16TB Usable), (5x3TB+3x4TB) WD Purple RAIDZ2(16TB Usable), 3x3TB WD Green RAIDZ, 700W Antec PSU, 6xV80E Nidec Screamers@6000rpm, NAS4Free 9.2.0.1 - Shigawire (r972)
-
renjithc
- Starter

- Posts: 33
- Joined: 30 Jun 2012 07:35
- Status: Offline
Re: ZFS Degraded after replacing Faulty Disk
Argh... I shouldn't have touched it. I just tried the clear command; I read in the wiki that it just clears the errors and tries to bring the disks online. I didn't realize it would start resilvering again to fix the issues!
Hope it's the right direction...
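For reference, the command in question is just this (pool name `tank` is an example):

```shell
# 'zpool clear' resets the error counters and re-onlines devices; on a
# redundant pool ZFS may then resilver to bring the devices back in sync.
# It does not repair the underlying disks. (Pool name 'tank' is an example.)
zpool clear tank
```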
ProdnNAS: Norco 4224 Chasis, Supermicro X8SiL-F Mobo, Xeon X3430, 2xM1015, 1xSil3112, 32GB (4x8GB) ECC RDIMM, 8x3TB WD Red/Purple Mix RAIDZ2 (16TB Usable), (5x3TB+3x4TB) WD Purple RAIDZ2(16TB Usable), 3x3TB WD Green RAIDZ, 700W Antec PSU, 6xV80E Nidec Screamers@6000rpm, NAS4Free 9.2.0.1 - Shigawire (r972)
-
substr
- experienced User

- Posts: 113
- Joined: 04 Aug 2013 20:21
- Status: Offline
Re: ZFS Degraded after replacing Faulty Disk
You appear to have a double drive failure (one drive being replaced, and then read failures on ada3). If you are able, immediately back up the valuable data before issuing any more ZFS commands.
If you are lucky: even though some data is lost, metadata is usually stored twice, so there is a good chance of getting the most critical data off before something Really Bad happens.
Had you ever run any scrubs on the pool?
-
substr
- experienced User

- Posts: 113
- Joined: 04 Aug 2013 20:21
- Status: Offline
Re: ZFS Degraded after replacing Faulty Disk
If you do run into problems with any particular directory or file, give up on it and move on to the others. Do not keep trying to save files that won't copy until you have gotten all the rest off.
-
substr
- experienced User

- Posts: 113
- Joined: 04 Aug 2013 20:21
- Status: Offline
Re: ZFS Degraded after replacing Faulty Disk
Correction: You have TRIPLE drive failure. (ada5 also shows read errors).
Are your drives overheating? Is your power supply failing? A bad cable shared by all the drives?
Get your most important data off immediately.
-
renjithc
- Starter

- Posts: 33
- Joined: 30 Jun 2012 07:35
- Status: Offline
Re: ZFS Degraded after replacing Faulty Disk
Thanks substr... I too noticed errors on the other disks once you mentioned it.
But it has now completed resilvering successfully without any errors (once the corrupted files were deleted). I was assuming things were going OK, and then all of a sudden I see the logs filling up with the lines below:
smartd[5880]: Device: /dev/ada3, 66 Currently unreadable (pending) sectors
smartd[5880]: Device: /dev/ada5, 5 Currently unreadable (pending) sectors
The funny thing: I did have temperature issues, but never with these disks (ada3/ada5). They were adequately cooled and never hit above 33C. (My external chassis with ada0-2 hit a temp of 40C when its fans failed.)
I'm running a scrub now. Should I assume the disks are faulty even if the scrub comes back successful and repaired?
I just spent $500 on a bunch of 4TB HDDs recently and was getting a Norco.
Current snap of the ZFS pool attached.
All this craziness started after I began using rsync.
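A small sketch of pulling those counts out of the smartd log lines programmatically; the log text below is simply the sample quoted above, not read from a live system, and `smartctl` remains the better tool against the actual disks:

```shell
#!/bin/sh
# Extract device path and pending-sector count from smartd log lines.
# Sample text pasted from the post above, not live data.
log='smartd[5880]: Device: /dev/ada3, 66 Currently unreadable (pending) sectors
smartd[5880]: Device: /dev/ada5, 5 Currently unreadable (pending) sectors'

printf '%s\n' "$log" | awk '/Currently unreadable/ {
    gsub(",", "", $3)   # strip the trailing comma from the device path
    print $3, $4        # e.g. "/dev/ada3 66"
}'

# To query the counters directly from a disk instead (device is an example):
#   smartctl -A /dev/ada3 | grep -i pending
```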
ProdnNAS: Norco 4224 Chassis, Supermicro X8SiL-F Mobo, Xeon X3430, 2xM1015, 1xSil3112, 32GB (4x8GB) ECC RDIMM, 8x3TB WD Red/Purple Mix RAIDZ2 (16TB Usable), (5x3TB+3x4TB) WD Purple RAIDZ2(16TB Usable), 3x3TB WD Green RAIDZ, 700W Antec PSU, 6xV80E Nidec Screamers@6000rpm, NAS4Free 9.2.0.1 - Shigawire (r972)
-
substr
- experienced User

- Posts: 113
- Joined: 04 Aug 2013 20:21
- Status: Offline
Re: ZFS Degraded after replacing Faulty Disk
I would stop the scrub until after ada3 is replaced and you have a backup. You can run one after that.
You might have a system problem that is causing the drives to act up. However, unless you can identify what that is, you must assume that ada3 is also failing and needs to be replaced immediately. It would be best if you can do this by connecting a new drive (perhaps ada6? or ada0,1, or 2?) and leaving ada2 connected until the replacement is complete. If you can't, then do it the normal way.
But make that backup immediately!
Once the backup is made and ada3 is replaced, you need to consider whether ada5 also needs replacement. And if the new ada4 or the new ada3 starts showing errors, your hardware is junk and you need to stop using it immediately.
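The "leave the old disk connected" approach described above is the two-argument form of replace; pool and device names below are examples:

```shell
# Replace ada3 with a new disk (here ada6) while ada3 stays attached.
# ZFS can then read from the old disk as well as from parity during the
# resilver, which matters when other disks also have bad sectors.
# Pool and device names are examples.
zpool replace tank ada3 ada6
```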
-
substr
- experienced User

- Posts: 113
- Joined: 04 Aug 2013 20:21
- Status: Offline
Re: ZFS Degraded after replacing Faulty Disk
Yes, even if a scrub shows clean, I would still consider ada3 to be failed. It gave you 70 read errors and shows 66 pending sectors; that is too risky for RAIDZ1. A clean scrub just means you no longer have any data stored on the bad sectors (because you deleted it), but that could change quickly, or more sectors could be close to failing.
-
renjithc
- Starter

- Posts: 33
- Joined: 30 Jun 2012 07:35
- Status: Offline
Re: ZFS Degraded after replacing Faulty Disk
Stopped my scrub. I don't see any errors anywhere or the pool going offline, but I'm definitely getting these now:
smartd[28040]: Device: /dev/ada5, 5 Offline uncorrectable sectors
smartd[28040]: Device: /dev/ada5, 5 Currently unreadable (pending) sectors
And yet I don't see any errors on the pool, read/write problems, or performance issues.
My hardware is (at least so far was) relatively good: it's an ML110 G5 with a Xeon CPU, cooled adequately, with an extra 4-port PCIe SATA card. ada3/ada5 are on mobo SATA ports and ada4 is on a Sil PCIe SATA board.
I'm not sure what's going on anymore. I don't get it... these two drives were always cooled properly; they can't both fail together.
ProdnNAS: Norco 4224 Chassis, Supermicro X8SiL-F Mobo, Xeon X3430, 2xM1015, 1xSil3112, 32GB (4x8GB) ECC RDIMM, 8x3TB WD Red/Purple Mix RAIDZ2 (16TB Usable), (5x3TB+3x4TB) WD Purple RAIDZ2(16TB Usable), 3x3TB WD Green RAIDZ, 700W Antec PSU, 6xV80E Nidec Screamers@6000rpm, NAS4Free 9.2.0.1 - Shigawire (r972)
- b0ssman
- Forum Moderator

- Posts: 2438
- Joined: 14 Feb 2013 08:34
- Location: Munich, Germany
- Status: Offline
Re: ZFS Degraded after replacing Faulty Disk
Hard drives from the same batch have been known to fail at the same time.
Nas4Free 11.1.0.4.4517. Supermicro X10SLL-F, 16gb ECC, i3 4130, IBM M1015 with IT firmware. 4x 3tb WD Red, 4x 2TB Samsung F4, both GEOM AES 256 encrypted.
-
substr
- experienced User

- Posts: 113
- Joined: 04 Aug 2013 20:21
- Status: Offline
Re: ZFS Degraded after replacing Faulty Disk
They can fail together, but sometimes there is a reason. Did the corrupted files happen to be written recently, or have they been there a very long time? Were you running scrubs regularly? I don't think the temperature you mentioned was high enough to cause this.
Is ada3 replaced, or did it stop showing errors?