This is the old XigmaNAS forum in read-only mode;
it will be taken offline by the end of March 2021.




[SOLVED] ZFS resilvering problem

prosiakus
Starter
Posts: 29
Joined: 25 Aug 2017 18:01
Status: Offline

Post by prosiakus »

I have a problem with one drive: after I replaced it, the replacement kept freezing the NAS. The pool is 8x3TB HDD.
So I swapped in another drive without waiting for the resilver to finish on the earlier drive that was freezing the NAS.
Then I ran "zpool replace p1 ada6"; the resilver started and then ended after 30 GB in 20 minutes.
I can't do anything now. I tried the replace again, and also "zpool detach p1 18446744071900754353" (the device was unavailable, but I set it offline and that works).
What should I do to replace this ada6 and bring the pool back to the online state? I ran all the commands from the WebGUI (10.3.0.3).
Also, this drive was previously used in another ZFS pool.
pool.png

Code: Select all

                               capacity     operations    bandwidth
pool                        alloc   free   read  write   read  write
--------------------------  -----  -----  -----  -----  -----  -----
P1                          17.1T  4.66T     60      8  6.54M  57.0K
  raidz2                    17.1T  4.66T     60      8  6.54M  57.0K
    ada0                        -      -     16      1  1.14M  14.7K
    ada1                        -      -     16      1  1.14M  14.7K
    ada7                        -      -     47      1  1.12M  14.7K
    ada4                        -      -     50      1  1.12M  14.7K
    ada5                        -      -     47      1  1.12M  14.8K
    ada2                        -      -     50      1  1.12M  14.8K
    replacing                   -      -      2     54  32.5K  1.11M
      ada6                      -      -      2     41  33.0K  1.10M
      18446744071900754353      -      -      0      0      0      0
    ada3                        -      -     50      1  1.12M  14.7K
--------------------------  -----  -----  -----  -----  -----  -----
Last edited by prosiakus on 30 Aug 2017 09:04, edited 1 time in total.
Core(TM) i7-4790 CPU @ 3.60GHz / Gigabyte Z97P-D3 4 x 4GB / Pool 6x4TB raidz2-0

Core(TM) i7-3770 CPU @ 3.40GHz / ASUSTeK COMPUTER INC. P8Z77-V LX 4 x 8GB / Pool 8 x 3TB raidz2-0

Core(TM)2 Duo CPU E8400 @ 3.00GHz / Intel Corporation DQ35MP 4 x 2GB/ Pool 4 x 8TB raidz1-0

sleid
PowerUser
Posts: 774
Joined: 23 Jun 2012 07:36
Location: FRANCE LIMOUSIN CORREZE
Status: Offline

Post by sleid »

The initial command should have been

zpool replace P1 18446744071900754353 /dev/ada6

This operation cannot be performed from the replacement menu.
12.1.0.4 - Ingva (revision 7852)
FreeBSD 12.1-RELEASE-p12 #0 r368465M: Tue Dec 8 23:25:11 CET 2020
X64-embedded on Intel(R) Atom(TM) CPU C2750 @ 2.40GHz Boot UEFI
ASRock C2750D4I 2 X 8GB DDR3 ECC
Pool of 2 vdev Raidz1: 3 WDC WD40EFRX + 3 WDC WD40EFRX

Post by prosiakus »

I tried it that way too :/ Now I have started "scrub a pool".
scrub.png

Post by sleid »

Replace rebuilds the pool automatically; there is no need to run a scrub.
You should have waited for the resilver to finish before launching the scrub, which repairs ada3. At the moment your raidz2 has no redundancy left if another disk dies.
Do not touch anything and cross your fingers.

Post by prosiakus »

The problem was that the resilver got stuck after 30 GB and did nothing more. As you can see in the first attached screenshot, the status shows 31 GB resilvered. The underlying problem is that I had to change the drive a second time, because the first resilver never finished after the first swap: the NAS was freezing. So I changed to another new drive, and after that the resilver got stuck after 31 GB. So I started a scrub.

b0ssman
Forum Moderator
Posts: 2438
Joined: 14 Feb 2013 08:34
Location: Munich, Germany
Status: Offline

Post by b0ssman »

It could be that more drives are failing. Please post the SMART values of all drives.
Nas4Free 11.1.0.4.4517. Supermicro X10SLL-F, 16gb ECC, i3 4130, IBM M1015 with IT firmware. 4x 3tb WD Red, 4x 2TB Samsung F4, both GEOM AES 256 encrypted.

Post by prosiakus »

SMART looks OK, but nobody has ever run a scrub on this machine. Sometimes the log showed messages about bad sectors. At the moment the NAS still freezes from time to time (sometimes 8 h, sometimes 15 min after a restart). I don't have much experience with NAS systems. Is there a chance that the NAS will run normally once the repair finishes?

Post by b0ssman »

Please post the SMART values anyway.

Post by prosiakus »

There are some errors because the old motherboard had a problem with its SATA controller. Since the hardware was replaced, the SMART error counts no longer increase.
For example, ada4 is a new drive that was connected to the old motherboard, and its Command_Timeout counter rose to 19324, but that is not a problem now. On the new hardware it is not changing. All error counters stopped rising after the new motherboard went in.
smart.php
- saved as .php because I can't attach a .txt file, and I don't think it is a good idea to paste the SMART output of all 8 HDDs into a post.

Post by b0ssman »

Device /dev/ada3
187 Reported_Uncorrect 0x0032 095 095 000 Old_age Always - 5
Any value above 0 means the drive will probably fail soon; see:
https://www.backblaze.com/blog/hard-drive-smart-stats/
Also, the drive has logged internal errors multiple times:
Error 5 occurred at disk power-on lifetime: 52441 hours (2185 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 e0 ff ff ff 4f 00 00:02:38.704 READ FPDMA QUEUED
2f 00 01 10 00 00 00 00 00:02:38.609 READ LOG EXT
60 00 e0 ff ff ff 4f 00 00:02:35.964 READ FPDMA QUEUED
2f 00 01 10 00 00 00 00 00:02:35.861 READ LOG EXT
60 00 e0 ff ff ff 4f 00 00:02:33.216 READ FPDMA QUEUED

Error 4 occurred at disk power-on lifetime: 52441 hours (2185 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 e0 ff ff ff 4f 00 00:02:35.964 READ FPDMA QUEUED
2f 00 01 10 00 00 00 00 00:02:35.861 READ LOG EXT
60 00 e0 ff ff ff 4f 00 00:02:33.216 READ FPDMA QUEUED
2f 00 01 10 00 00 00 00 00:02:33.080 READ LOG EXT
60 00 e0 ff ff ff 4f 00 00:02:30.435 READ FPDMA QUEUED

Error 3 occurred at disk power-on lifetime: 52441 hours (2185 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 e0 ff ff ff 4f 00 00:02:33.216 READ FPDMA QUEUED
2f 00 01 10 00 00 00 00 00:02:33.080 READ LOG EXT
60 00 e0 ff ff ff 4f 00 00:02:30.435 READ FPDMA QUEUED
2f 00 01 10 00 00 00 00 00:02:30.318 READ LOG EXT
60 00 e0 ff ff ff 4f 00 00:02:27.620 READ FPDMA QUEUED

Error 2 occurred at disk power-on lifetime: 52441 hours (2185 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 e0 ff ff ff 4f 00 00:02:30.435 READ FPDMA QUEUED
2f 00 01 10 00 00 00 00 00:02:30.318 READ LOG EXT
60 00 e0 ff ff ff 4f 00 00:02:27.620 READ FPDMA QUEUED
60 00 e0 20 02 00 40 00 00:02:27.618 READ FPDMA QUEUED
60 00 e0 20 00 00 40 00 00:02:27.588 READ FPDMA QUEUED

Error 1 occurred at disk power-on lifetime: 52441 hours (2185 days + 1 hours)
When the command that caused the error occurred, the device was active or idle.

After command completion occurred, registers were:
ER ST SC SN CL CH DH
-- -- -- -- -- -- --
40 51 00 ff ff ff 0f Error: UNC at LBA = 0x0fffffff = 268435455

Commands leading to the command that caused the error were:
CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
-- -- -- -- -- -- -- -- ---------------- --------------------
60 00 e0 ff ff ff 4f 00 00:02:27.620 READ FPDMA QUEUED
60 00 e0 20 02 00 40 00 00:02:27.618 READ FPDMA QUEUED
60 00 e0 20 00 00 40 00 00:02:27.588 READ FPDMA QUEUED
61 00 10 ff ff ff 4f 00 00:02:27.094 WRITE FPDMA QUEUED
61 00 10 ff ff ff 4f 00 00:02:26.873 WRITE FPDMA QUEUED



Device /dev/ada4
188 Command_Timeout 0x0032 100 001 000 Old_age Always - 19324

If any of those command timeouts occur during resilvering, that could cause the problems you are seeing.
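
For anyone who wants to repeat this check on their own SMART dump, here is a minimal Python sketch. It assumes the text looks like the attribute rows quoted above (ten whitespace-separated columns, raw value last). The names flag_attributes and WATCHED are made up for this example and are not part of smartctl; the watch list simply contains the three attributes mentioned in this thread.

```python
# Sketch only: flag_attributes and WATCHED are illustrative names, not a smartctl API.
WATCHED = {187, 188, 199}  # Reported_Uncorrect, Command_Timeout, UDMA_CRC_Error_Count


def flag_attributes(smart_text):
    """Return (id, name, raw_value) for watched attributes with a non-zero raw value."""
    flagged = []
    for line in smart_text.splitlines():
        fields = line.split()
        # Attribute rows have 10 columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
        if len(fields) < 10 or not fields[0].isdigit():
            continue
        attr_id = int(fields[0])
        raw = fields[9]
        if attr_id in WATCHED and raw.isdigit() and int(raw) > 0:
            flagged.append((attr_id, fields[1], int(raw)))
    return flagged


# The two attribute rows quoted in this post.
sample = """\
187 Reported_Uncorrect 0x0032 095 095 000 Old_age Always - 5
188 Command_Timeout 0x0032 100 001 000 Old_age Always - 19324
"""

for attr_id, name, raw in flag_attributes(sample):
    print(f"{attr_id} {name}: raw value {raw}")
```

A non-zero raw value is only a hint to look closer; as discussed in this thread, a counter that rose on old hardware and has stopped since is a different situation from one that keeps climbing.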

Post by sleid »

ada3, ada4, ada5: 199 UDMA_CRC_Error_Count

Multiple SATA connection errors, whether from the cables, connectors, or controller(s), can cause timeouts and other errors.
So this is a priority issue.

Post by b0ssman »

He already said that the CRC errors were old.

Post by sleid »

Yes, but it is always worth tracking how this counter evolves.
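
One simple way to follow a counter like this is to note the raw values at two points in time and compare them. The Python sketch below shows the idea. The function name growing_counters is hypothetical, and the numbers are invented for illustration; they are not taken from the SMART output in this thread.

```python
def growing_counters(before, after):
    """Return {device: increase} for counters that rose between two readings."""
    return {
        dev: after[dev] - before[dev]
        for dev in before
        if dev in after and after[dev] > before[dev]
    }


# Made-up raw values of attribute 199 per device, for illustration only.
earlier = {"ada3": 12, "ada4": 7, "ada5": 3}
later = {"ada3": 12, "ada4": 9, "ada5": 3}

print(growing_counters(earlier, later))
```

An empty result would support the claim that the errors are old; any device that shows up still has a cable, connector, or controller problem worth chasing.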

Post by prosiakus »

OK, all is fine: the scrub reached 100% and there has been no freeze since yesterday (1 Day 5 Hours 48 Minutes 7 Seconds). That is a mega record for the last 3 weeks!!! :) Yippee!!!

But there is a problem with the old ada6 in the pool information:
afterscrub.jpg

Code: Select all

                               capacity     operations    bandwidth
pool                        alloc   free   read  write   read  write
--------------------------  -----  -----  -----  -----  -----  -----
P1                          17.2T  4.50T    892     28   101M   961K
  raidz2                    17.2T  4.50T    892     28   101M   961K
    ada0                        -      -    232      6  17.8M   189K
    ada1                        -      -    235      6  17.8M   189K
    ada7                        -      -    491     10  17.7M   187K
    ada4                        -      -    542     10  17.6M   187K
    ada5                        -      -    509     10  17.6M   187K
    ada2                        -      -    545     10  17.6M   187K
    replacing                   -      -    835     23  17.4M   322K
      ada6                      -      -    507     12  17.6M   320K
      18446744071900754353      -      -      0      0      0      0
    ada3                        -      -    531     10  17.6M   187K
--------------------------  -----  -----  -----  -----  -----  -----
How can I bring the state back to online?

This is the replace window:
replace.jpg
And this is the remove window:
remove.jpg
I ran "zpool remove p1 18446744071900754353" in WebGUI > Tools > Command and nothing happened :(
I would like to set this pool back to normal "online" :)

May I use "Overwrite already configured disks (only affects filesystem value)" in Disks > ZFS > Configuration > Synchronize in this situation? (I don't really know what it will do.)
I have tried replace and remove and nothing works. Ghost disk :) It's there, but I can't remove or replace it.
The good news for me is that there are no more freezes after the scrub. I just need to know how to remove this "18446744071900754353" crap.
Maybe remove ada6 from the manager, then run "gpart destroy -F ada6", add it back in the manager, and then the resilver will run?

Or another idea:
Maybe export P1, remove the ZFS label from ada6, import P1, and then replacing ada6 will work and the old ada6 will disappear? :)

Post by sleid »

zpool clear P1

zpool clear P1 /dev/ada6

zpool clear P1 18446744071900754353

If that does not succeed, the cleanest solution is to redo the replace, which certainly went badly the first time.


zpool replace P1 18446744071900754353 /dev/ada6

Post by prosiakus »

I have just done what you wrote, both in the console and via the WebGUI, and nothing happened ;(

Post by sleid »

zpool detach P1 18446744071900754353
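
For readers with the same leftover entry, here is a minimal Python sketch of how the stale member can be picked out of a listing like the ones posted earlier. It assumes the indented layout shown in this thread, where the old device appears as an all-numeric name under a "replacing" group. The function name stale_replacing_members is made up for this example, and the script only prints the detach command from this post; it does not run anything against the pool.

```python
def stale_replacing_members(listing):
    """Return all-numeric device names that sit directly under a 'replacing' group."""
    stale = []
    replacing_indent = None
    for line in listing.splitlines():
        # Skip blank lines and the dashed separator rows.
        if not line.strip() or line.lstrip().startswith("-"):
            continue
        indent = len(line) - len(line.lstrip())
        name = line.split()[0]
        if name.startswith("replacing"):
            replacing_indent = indent
            continue
        if replacing_indent is not None:
            if indent > replacing_indent:
                if name.isdigit():
                    stale.append(name)
            else:
                replacing_indent = None  # back at sibling level, group is over
    return stale


# Excerpt of the listing posted in this thread.
sample_listing = """\
P1                          17.2T  4.50T    892     28   101M   961K
  raidz2                    17.2T  4.50T    892     28   101M   961K
    ada2                        -      -    545     10  17.6M   187K
    replacing                   -      -    835     23  17.4M   322K
      ada6                      -      -    507     12  17.6M   320K
      18446744071900754353      -      -      0      0      0      0
    ada3                        -      -    531     10  17.6M   187K
"""

for guid in stale_replacing_members(sample_listing):
    # Only prints the command; review it before running it yourself.
    print(f"zpool detach P1 {guid}")
```

Check the printed command against your own pool status before running it, and only once the new disk has finished resilvering.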

Post by prosiakus »

Great, it's DONE. Thanks, all.
I had already tried it yesterday, but your earlier post with this sequence:

zpool clear P1

zpool clear P1 /dev/ada6

zpool clear P1 18446744071900754353

zpool replace P1 18446744071900754353 /dev/ada6

paved the way to finish with this:
zpool detach P1 18446744071900754353
with success!!!

I really did try it yesterday, but not in this way :)
Last edited by prosiakus on 30 Aug 2017 09:03, edited 1 time in total.

raulfg3
Site Admin
Posts: 4865
Joined: 22 Jun 2012 22:13
Location: Madrid (ESPAÑA)
Status: Offline

Post by raulfg3 »

If the problem is solved, please edit the first post and mark it as [SOLVED].

Thanks.
12.1.0.4 - Ingva (revision 7743) on SUPERMICRO X8SIL-F 8GB of ECC RAM, 11x3TB disk in 1 vdev = Vpool = 32TB Raw size , so 29TB usable size (I Have other NAS as Backup)
