ZFS Array Drive Melted
Posted: 10 Aug 2015 21:48
In looking over my Nas4Free system this morning, I noticed that one of the disks in my 6TB ZFS array was marked as "removed" even though it was plugged in and appeared to be functioning fine. I've had some power glitches around here in the past few weeks and I thought maybe that had thrown my drive into an error state so I figured I'd see if it came back on a reboot.
When I rebooted, my root filesystem (Nas4Free is running from a USB key) would not mount...
I was pretty confused, so the first thing I did was unplug all drives except for the USB key and try again. It still failed, and at that point I figured the install on the USB key had somehow become corrupted (which proved true: after copying the img file back to the key, it boots fine).
However, when I plugged the drives back in and booted up the computer, one of the SATA power cables immediately started spewing smoke and melted itself, along with the connector on the hard drive, before I was able to pull the plug.
No clue what happened there, and although I suspect the disk itself is fine, the connector on the drive is completely ruined.
In any case, I pulled that drive, re-imaged the USB key, and booted up the system. At least my initial assumption was correct: the "missing" disk was found at startup, so the only disk absent from the system was the one that almost caught fire. So I took a look at the ZFS pool, which looks like this:
zfsdata                     FAULTED  corrupted data
  raidz1-0                  FAULTED  corrupted data
    12637692214261834096    FAULTED  corrupted data
    11306278499670812609    FAULTED  corrupted data
    9359699380247702151     FAULTED  corrupted data
    4520295435616108019     FAULTED  corrupted data
    3374229570764583106     FAULTED  corrupted data
I suspect that because the array already thought it was missing a disk, losing a second disk to the faulty power connection counts as a second drive failure, and I'm completely out of luck?
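To spell out the rule of thumb behind that worry, here is a minimal sketch in Python. The function name raidz_can_survive is made up for illustration, and it only models the general raidz rule (a vdev stays readable while the number of unavailable disks does not exceed its parity level). It says nothing about whether the data on my disks is actually damaged.

```python
def raidz_can_survive(parity: int, unavailable_disks: int) -> bool:
    """Rule-of-thumb check for a single raidz vdev (hypothetical helper).

    parity: 1 for raidz1, 2 for raidz2, 3 for raidz3.
    unavailable_disks: member disks the pool cannot currently read.
    """
    if parity not in (1, 2, 3):
        raise ValueError("parity must be 1, 2, or 3")
    if unavailable_disks < 0:
        raise ValueError("unavailable_disks cannot be negative")
    # The vdev can reconstruct data only while the number of
    # unavailable disks is no greater than the parity level.
    return unavailable_disks <= parity


# My raidz1 with only the melted drive pulled:
print(raidz_can_survive(parity=1, unavailable_disks=1))  # True
# My raidz1 if the pool still counts the earlier "removed" disk as gone too:
print(raidz_can_survive(parity=1, unavailable_disks=2))  # False
```

By this rule, whether the pool can come back depends on whether it now treats the previously "removed" disk as a healthy member again.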
I do have a pretty recent backup but obviously I'd like to salvage the array if I can.
1) There's a bit of data on this array since the last backup that I'd be sad to lose.
2) I have over 4TB of data on this thing, and just copying that much data from a backup to a new array takes a ton of time.
Is there any chance?