Long-time user, first-time poster.
I woke up this morning to some dreaded "clicking" from a hard drive in my homebrew NAS. Now this machine is kept away from me and is not actually used a whole lot so this could have been going on for some time.
The setup is:
nas4free 9.2.0.1 - Shigawire (revision 972) - embedded (on USB).
I'm running a RaidZ1 (if that's what you call it nowadays?) with four disks. I was hoping to move the data off at some point and go to a clean RaidZ2 with 5 (or maybe 7) disks later for more redundancy.
I'll be brief. I ran an extended offline S.M.A.R.T. self-test on each disk using smartctl. I'm not an expert at using this by any means.
This is the output of smartctl -l selftest for each drive:
$ smartctl -l selftest /dev/ada0
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p4 amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 4836 -
$ smartctl -l selftest /dev/ada1
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p4 amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 90% 5046 1142887794
$ smartctl -l selftest /dev/ada2
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p4 amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed: read failure 70% 5043 727949262
$ smartctl -l selftest /dev/ada3
smartctl 6.2 2013-07-26 r3841 [FreeBSD 9.2-RELEASE-p4 amd64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF READ SMART DATA SECTION ===
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 4832 -
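To compare the four results at a glance, something like the sketch below might help. It is a hypothetical shell helper (selftest_summary is my own name, not part of smartmontools) and it assumes the log format shown above, with the most recent test listed as entry # 1.

```shell
# Hypothetical helper, not part of smartmontools.
# Reads `smartctl -l selftest` output on stdin and prints a one-line
# summary of the first log entry ("# 1 ..."), assumed to be the newest.
# $1 is a label for the drive, e.g. ada1.
selftest_summary() {
    awk -v dev="$1" '
        /^# *[0-9]+ / {
            if ($0 ~ /read failure/) {
                # The last field of the row is LBA_of_first_error.
                printf "%s: READ FAILURE, first error at LBA %s\n", dev, $NF
            } else if ($0 ~ /Completed without error/) {
                printf "%s: ok\n", dev
            } else {
                printf "%s: other: %s\n", dev, $0
            }
            exit    # stop after the first log entry
        }
    '
}
```

It could be fed from each drive in turn, for example: for d in ada0 ada1 ada2 ada3; do smartctl -l selftest /dev/$d | selftest_summary $d; done. On the output above that would flag ada1 and ada2 and report ada0 and ada3 as ok.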
I can still get data off my ZFS pool, but sometimes it is painful to access (typical of a failing drive that can't spin up properly). The problem is I'm not sure which disk is causing the issue.
As of right now I just want to protect the data if possible, so if there are some commands that will help me better diagnose this, please let me know.
I suspect /dev/ada1 is the culprit from an earlier aborted test, but cannot confirm.
The zpool also reports itself as healthy/online.
I need to know of any commands that may help me determine which drive it is and protect the integrity of the data in the zpool.
Are there any specific zfs commands for this? I read an article on repairing bad blocks using smartmontools, but I was very wary of doing what it suggested if zfs isn't aware of it.
If anyone has any ideas, please let me know. Thanks.


