Still Degraded after replacing drive

JoyMonkey
NewUser
NewUser
Posts: 12
Joined: 05 Feb 2013 13:17
Status: Offline

Post by JoyMonkey »

My Nas4Free box has been running 9.1.0.1 Sandstorm happily for some time now with five 2TB drives in a raidz1 array. A few days ago one drive died, and a few of my other drives don't look healthy, so I decided to start replacing them one by one. It's been a while since I dealt with the box, so I did a lot of Googling and reading through the wiki and these forums, then took these steps...

I shut down the box and physically removed the 'dead' drive from my case, replacing it with a similar drive. The replacement drive was previously used in a FreeNAS box, but I deleted all partitions using GParted.

In the Disks/Management tab, I clicked 'Import Disks' to get the new drive to show up, then I deleted the old drive from the listing.

I logged in via SSH and issued the command

Code:

zpool replace terraid 3305624593698899328 /dev/ada1
This began resilvering, but once the resilvering completed my ZFS pool still shows as degraded...

Code:

  pool: terraid
 state: DEGRADED
status: One or more devices has experienced an error resulting in data
	corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
	entire pool from backup.
   see: http://illumos.org/msg/ZFS-8000-8A
  scan: resilvered 1.64T in 14h2m with 4 errors on Thu Feb 27 02:02:55 2014
config:

	NAME                                            STATE     READ WRITE CKSUM
	terraid                                         DEGRADED     0     0     4
	  raidz1-0                                      DEGRADED     0     0     8
	    ada0                                        ONLINE       0     0     0
	    replacing-1                                 DEGRADED     0     0     0
	      3305624593698899328                       UNAVAIL      0     0     0  was /dev/gptid/26971116-5d0b-11e2-ac5a-b8975a2730dc
	      ada1                                      ONLINE       0     0     0
	    gptid/26fa2600-5d0b-11e2-ac5a-b8975a2730dc  ONLINE       0     0     0
	    gptid/27b1c2b5-5d0b-11e2-ac5a-b8975a2730dc  ONLINE       0     0     0
	    gptid/282b425b-5d0b-11e2-ac5a-b8975a2730dc  ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

        /mnt/terraid/1gbfile1.dat
        /mnt/terraid/1gbfile2.dat
        /mnt/terraid/12gbfile.dat
        /mnt/terraid/8gbfile.dat
I thought a scrub might help, but it still showed as Degraded after scrubbing. I restarted the box and when Nas4Free came up again, it was resilvering all over again, resulting in the same state as above.

Any ideas where I went wrong and how to remedy? Thanks!

Post by JoyMonkey »

Well don't I feel dumb? :?

I solved this by detaching the old unavailable drive. After issuing this command...

Code:

zpool detach terraid 3305624593698899328
The pool now shows as ONLINE instead of DEGRADED and the new drive has once again begun resilvering...

Code:

  pool: terraid
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Thu Feb 27 08:05:38 2014
        25.9G scanned out of 8.22T at 209M/s, 11h24m to go
        5.18G resilvered, 0.31% done
config:

	NAME                                            STATE     READ WRITE CKSUM
	terraid                                         ONLINE       0     0     4
	  raidz1-0                                      ONLINE       0     0     8
	    ada0                                        ONLINE       0     0     0
	    ada1                                        ONLINE       0     0     0  (resilvering)
	    gptid/26fa2600-5d0b-11e2-ac5a-b8975a2730dc  ONLINE       0     0     0
	    gptid/27b1c2b5-5d0b-11e2-ac5a-b8975a2730dc  ONLINE       0     0     0
	    gptid/282b425b-5d0b-11e2-ac5a-b8975a2730dc  ONLINE       0     0     0
So should I have issued the detach command BEFORE issuing the replace command? Man, I'm an idiot. :oops:
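
A minimal sketch of how the leftover member could be picked out of `zpool status` output before detaching it. The helper name `stale_replacing_members` is hypothetical and not part of ZFS; it only reads text on stdin and changes nothing on the pool. It assumes the indented layout shown in the status output in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not a ZFS command): reads `zpool status` text on
# stdin and prints the name (usually a numeric ID) of every UNAVAIL
# device that sits under a "replacing-N" vdev.
stale_replacing_members() {
    awk '
        # Indentation depth = number of leading whitespace characters.
        { match($0, /^[ \t]*/); depth = RLENGTH }
        # Entering a replacing vdev: remember how deep it is indented.
        $1 ~ /^replacing-[0-9]+$/ { in_repl = 1; repl_depth = depth; next }
        # A line indented no deeper than the vdev ends its child list.
        in_repl && depth <= repl_depth { in_repl = 0 }
        # Children that are UNAVAIL are the leftovers of the old disk.
        in_repl && $2 == "UNAVAIL" { print $1 }
    '
}
```

Used as `zpool status terraid | stale_replacing_members`, this should print the ID that was passed to `zpool detach` above. As noted later in the thread, it is probably safer to let the resilver finish before detaching anything.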

substr
experienced User
experienced User
Posts: 113
Joined: 04 Aug 2013 20:21
Status: Offline

Post by substr »

Had you ever run a scrub on the pool before the failure?

It appears you have some block errors that are uncorrectable. The files it listed are affected, so somewhere in those files will be a read error. The files might be salvageable depending on whether they can handle a gap. Otherwise, "restore from backup."
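
To act on that list (restore or salvage each file), it can help to pull the affected paths out of `zpool status -v` as plain lines. A rough sketch, assuming the "errors: Permanent errors ..." layout shown earlier in the thread; `permanent_error_files` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: reads `zpool status -v` text on stdin and prints
# one affected path per line, taken from the "Permanent errors" list.
permanent_error_files() {
    awk '
        # The list starts right after this header line.
        /^errors: Permanent errors have been detected/ { listing = 1; next }
        # The status section of another pool ends the list.
        listing && /^[ \t]*pool:/ { listing = 0 }
        # Print each non-empty entry without its leading indentation;
        # the whole line is kept because paths may contain spaces.
        listing && NF > 0 { sub(/^[ \t]+/, ""); print }
    '
}
```

For example, `zpool status -v terraid | permanent_error_files` gives one path per line. If a listed file matters and there is no backup, one option sometimes used to copy the readable parts is `dd if=/mnt/terraid/8gbfile.dat of=/path/on/another/disk/8gbfile.dat bs=128k conv=noerror,sync`, which continues past read errors and pads unreadable blocks with zeros. The destination path and block size there are placeholders, and whether the result is usable depends on the file format.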

I don't think you can detach before replace with a raidz. It is best not to detach after the replace until the resilver is complete, unless you have a situation that seems to require it (dead drive killing performance, etc.).

Post by JoyMonkey »

This just gets weirder.
It finished resilvering the replacement drive (for the third time), so I thought I'd reboot. When it came back up, it started another resilver on the new drive all over again.

Code:

  pool: terraid
 state: ONLINE
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Fri Feb 28 05:43:31 2014
        1.13T scanned out of 8.23T at 202M/s, 10h13m to go
        231G resilvered, 13.70% done
config:

	NAME                                            STATE     READ WRITE CKSUM
	terraid                                         ONLINE       0     0     0
	  raidz1-0                                      ONLINE       0     0     0
	    ada0                                        ONLINE       0     0     0
	    ada1                                        ONLINE       0     0     0  (resilvering)
	    gptid/26fa2600-5d0b-11e2-ac5a-b8975a2730dc  ONLINE       0     0     0
	    gptid/27b1c2b5-5d0b-11e2-ac5a-b8975a2730dc  ONLINE       0     0     0
	    gptid/282b425b-5d0b-11e2-ac5a-b8975a2730dc  ONLINE       0     0     0
Any ideas why it keeps doing this?

Post by substr »

Possibly export the pool after it finishes the resilver. That might make sure the new drive is considered fully a part of the pool.
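
A sketch of that suggestion, assuming the pool is named terraid as above. The helper `resilver_in_progress` is hypothetical; it only inspects the `scan:` line of `zpool status` text so that the export is not attempted in the middle of a resilver.

```shell
#!/bin/sh
# Hypothetical helper: exits 0 when the `zpool status` text on stdin
# reports a resilver that is still running, non-zero otherwise.
resilver_in_progress() {
    grep -q 'scan: resilver in progress'
}

# Intended use (left as comments so nothing runs by accident):
#   if zpool status terraid | resilver_in_progress; then
#       echo "still resilvering, try again later"
#   else
#       zpool export terraid && zpool import terraid
#   fi
```

Whether an export and import actually stops the repeated resilver is not confirmed anywhere in this thread; it is only the idea proposed above.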

pakpenyo
NewUser
NewUser
Posts: 6
Joined: 01 Nov 2013 12:00
Status: Offline

Post by pakpenyo »

Yes, this happened to me too.

ZFS Detected

Code:

Name           Type    Pool  Devices
vol1_raidz1_0  raidz1  vol1  /dev/ada0, /dev/ada1, /dev/ada2
vol1_raidz1_1  raidz1  vol1  /dev/replacing-0, /dev/ada4, /dev/ada5
ZFS Current

Code:

Name           Type    Pool  Devices
vol1_raidz1_0  raidz1  vol1  /dev/ada0, /dev/ada1, /dev/ada2
vol2_raidz1_0  raidz1  vol1  /dev/ada3, /dev/ada4, /dev/ada5
After a reboot it is still resilvering, and /dev/ada3/old still exists.

Code:

  pool: vol1
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
	continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Apr 16 18:05:07 2014
        7.56T scanned out of 9.80T at 241M/s, 2h42m to go
        1.26T resilvered, 77.18% done
config:

	NAME                        STATE     READ WRITE CKSUM
	vol1                        DEGRADED     0     0     0
	  raidz1-0                  ONLINE       0     0     0
	    ada0.nop                ONLINE       0     0     2  (resilvering)
	    ada1.nop                ONLINE       0     0     0
	    ada2.nop                ONLINE       0     0     0
	  raidz1-1                  DEGRADED     0     0     0
	    replacing-0             DEGRADED     0     0     0
	      12609113799974789003  UNAVAIL      0     0     0  was /dev/ada3/old
	      ada3                  ONLINE       0     0     0  (resilvering)
	    ada4                    ONLINE       0     0    24  (resilvering)
	    ada5                    ONLINE       0     0    30  (resilvering)

errors: Permanent errors have been detected in the following files:

        vol1/progress@auto-20131213-140000:/VIDEO/Local/Blnd/01 DEN HAAG - nicely.avi
There is also a permanent error in a snapshot, but the .avi video itself plays fine.
I don't know how to fix it. Maybe with the detach command, but I'm not sure.
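
The entry in the error list has the form dataset@snapshot:/path, which suggests the damaged blocks belong to the copy held by the snapshot rather than to the live file; that would fit the .avi playing fine. A small sketch for splitting such an entry into its two parts, using a made-up helper name `split_snapshot_error`:

```shell
#!/bin/sh
# Hypothetical helper: splits a "dataset@snapshot:/path" entry from the
# permanent-errors list into the snapshot name and the path inside it.
# Returns non-zero for entries that do not refer to a snapshot.
split_snapshot_error() {
    entry=$1
    case $entry in
        *@*:/*) ;;           # looks like dataset@snapshot:/path
        *) return 1 ;;       # plain file path or something else
    esac
    printf '%s\n' "${entry%%:/*}"   # dataset@snapshot
    printf '%s\n' "/${entry#*:/}"   # path inside the snapshot
}
```

A commonly suggested remedy, offered here as an assumption rather than something confirmed in this thread, is to destroy the affected snapshot once it is no longer needed (`zfs destroy vol1/progress@auto-20131213-140000`) and then run `zpool scrub vol1`. The entry may only disappear from the error list after a later scrub completes.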

Olvikolvi
NewUser
NewUser
Posts: 1
Joined: 17 Jan 2015 19:35
Status: Offline

Post by Olvikolvi »

Same problem here. After the replace and the first resilver, the old disk is UNAVAIL and the new disk is ONLINE, but it still says replacing-1, and it started resilvering again. Resilver, scrub, or clear does not help. Another disk is giving read errors and needs to be replaced too, but I don't know whether the already-replaced disk is part of the raidz or not.

Code:

	NAME                        STATE     READ WRITE CKSUM
	myfiles                     DEGRADED     0     0   141
	  raidz2-0                  DEGRADED     0     0   282
	    gptid/disk-xxxxxxxxa    ONLINE      16     0     0  (resilvering)
	    replacing-1             DEGRADED     0     0     0
	      12345678901234456     UNAVAIL      0     0     0  was /dev/gptid/disk-xxxxxxxxb
	      gptid/disk-xxxxxxxxc  ONLINE       0     0     0  (resilvering)
	    gptid/disk-xxxxxxxxd    ONLINE       0     0     0  (resilvering)
....
