How to repair primary GPT table in reused ZFS disks

Posted: 30 Oct 2015 17:17
by slaycock
Hi All

I've noticed that after reusing a set of discs from a previous incarnation of NAS4Free in a 10-disk RAID-Z2 array, I am getting error messages regarding the primary GPT table.

After some googling, I understand that I should try gpart recover <device>.

However, whenever I try this, the devices that are part of the RAID-Z2 array are not found:

gpart recover da0
gpart recover \dev\da0
gpart recover da0.nop
gpart recover \dev\da0.nop

All return the same error message: 'Invalid argument'.

Suggestions on how to proceed, including restarting and wiping the disks (it's a backup-of-a-backup array), are most welcome.

Code:

ZFS pool list:
--------------
NAME          SIZE  ALLOC   FREE  EXPANDSZ   FRAG    CAP  DEDUP  HEALTH  ALTROOT
LazarusPool  9.06T  2.28T  6.79T         -     6%    25%  1.00x  ONLINE  -

ZFS pool status:
----------------
  pool: LazarusPool
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
	still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
	the pool may no longer be accessible by software that does not support
	the features. See zpool-features(7) for details.
  scan: scrub repaired 0 in 4h1m with 0 errors on Sat Oct 17 08:54:55 2015
config:

	NAME          STATE     READ WRITE CKSUM
	LazarusPool   ONLINE       0     0     0
	  raidz2-0    ONLINE       0     0     0
	    ada1.nop  ONLINE       0     0     0
	    ada2.nop  ONLINE       0     0     0
	    da0.nop   ONLINE       0     0     0
	    da1.nop   ONLINE       0     0     0
	    da2.nop   ONLINE       0     0     0
	    da3.nop   ONLINE       0     0     0
	    da4.nop   ONLINE       0     0     0
	    da5.nop   ONLINE       0     0     0
	    da6.nop   ONLINE       0     0     0
	    da7.nop   ONLINE       0     0     0

errors: No known data errors

SCSI disk:
----------
<ATA Hitachi HDS72101 A3MA>        at scbus0 target 0 lun 0 (pass0,da0)
<ATA SAMSUNG HD103SJ 0001>         at scbus0 target 1 lun 0 (pass1,da1)
<ATA SAMSUNG HD103SJ 0001>         at scbus0 target 2 lun 0 (pass2,da2)
<ATA SAMSUNG HD103UJ 1113>         at scbus0 target 3 lun 0 (pass3,da3)
<ATA SAMSUNG HD103UJ 1118>         at scbus0 target 4 lun 0 (pass4,da4)
<ATA SAMSUNG HD103UJ 1113>         at scbus0 target 5 lun 0 (pass5,da5)
<ATA Hitachi HDS72101 A3MA>        at scbus0 target 6 lun 0 (pass6,da6)
<ATA Hitachi HDS72101 A3MA>        at scbus0 target 8 lun 0 (pass7,da7)
<CF Card Ver2.35>                  at scbus1 target 0 lun 0 (ada0,pass8)
<Hitachi HDS721010CLA332 JP4OA3MA>  at scbus5 target 0 lun 0 (ada1,pass9)
<TSSTcorp DVD-ROM TS-H353B LE10>   at scbus6 target 0 lun 0 (pass10,cd0)
<SAMSUNG HD103SJ 1AJ10001>         at scbus7 target 0 lun 0 (ada2,pass11)
<AHCI SGPIO Enclosure 1.00 0001>   at scbus8 target 0 lun 0 (pass12,ses0)


GEOM: da1: the primary GPT table is corrupt or invalid.
GEOM: da1: using the secondary instead -- recovery strongly advised.
GEOM: da2: the primary GPT table is corrupt or invalid.
GEOM: da2: using the secondary instead -- recovery strongly advised.
GEOM: da3: the primary GPT table is corrupt or invalid.
GEOM: da3: using the secondary instead -- recovery strongly advised.
GEOM: da4: the primary GPT table is corrupt or invalid.
GEOM: da4: using the secondary instead -- recovery strongly advised.
GEOM: da5: the primary GPT table is corrupt or invalid.
GEOM: da5: using the secondary instead -- recovery strongly advised.
GEOM: da6: the primary GPT table is corrupt or invalid.
GEOM: da6: using the secondary instead -- recovery strongly advised.
GEOM: da7: the primary GPT table is corrupt or invalid.
GEOM: da7: using the secondary instead -- recovery strongly advised.
GEOM: ada1: the primary GPT table is corrupt or invalid.
GEOM: ada1: using the secondary instead -- recovery strongly advised.
GEOM: ada2: the primary GPT table is corrupt or invalid.
GEOM: ada2: using the secondary instead -- recovery strongly advised.
 
GEOM_NOP: Device ada1.nop created.
GEOM_NOP: Device ada2.nop created.
GEOM_NOP: Device da0.nop created.
GEOM_NOP: Device da1.nop created.
GEOM_NOP: Device da2.nop created.
GEOM_NOP: Device da3.nop created.
GEOM_NOP: Device da4.nop created.
GEOM_NOP: Device da5.nop created.
GEOM_NOP: Device da6.nop created.
GEOM_NOP: Device da7.nop created.

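To see at a glance which disks are affected, the GEOM messages above can be reduced to a list of device names. This is only a minimal sketch: corrupt_gpt_disks is a hypothetical helper name, and it assumes the kernel messages have exactly the wording shown in the log above.

```shell
# Hypothetical helper: read kernel messages on stdin and print the name of
# every disk whose primary GPT table is reported as corrupt, one per line.
# Assumes the message wording matches the GEOM lines quoted in this post.
corrupt_gpt_disks() {
    sed -n 's/^GEOM: \([a-z0-9]*\): the primary GPT table is corrupt or invalid\.$/\1/p' | sort -u
}
```

It would typically be fed from the boot log, for example: dmesg | corrupt_gpt_disks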

Re: How to repair primary GPT table in reused ZFS disks

Posted: 31 Oct 2015 00:46
by Parkcomm
I just ran

Code:

gpart show
and my devices did not show up. They are whole-disk ZFS drives, so I have not added any partitions.

When I run

Code:

gpart recover /dev/ada0
I get the same error as you.

I fixed this exact error once before - two things to note:

- You have a backup of the table on redundant devices, which are (hopefully) backed up to another location. You could just ignore the error (I did for well over a year).

- You can use

Code:

zpool labelclear ada0
You have to offline the device, run the command, bring the device back online, let it resilver, and then repeat for the next disk.
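The "repeat" step should only start once the previous disk has finished resilvering. A minimal sketch of such a check follows. Both function names are hypothetical, the 60-second polling interval is arbitrary, and the check assumes that zpool status prints "resilver in progress" on its scan line while a resilver is running.

```shell
# Returns 0 (true) if the "zpool status" text on stdin shows a running resilver.
# Assumes the scan line contains the phrase "resilver in progress".
resilver_in_progress() {
    grep -q 'resilver in progress'
}

# Hypothetical helper: block until the resilver on pool $1 has finished,
# polling once a minute.
wait_for_resilver() {
    while zpool status "$1" | resilver_in_progress; do
        sleep 60
    done
}
```

Calling wait_for_resilver with the pool name between disks keeps the cycle from taking a second disk offline while the first is still rebuilding.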

Re: How to repair primary GPT table in reused ZFS disks

Posted: 02 Nov 2015 10:53
by b0ssman
Since you are using the entire drives, the messages concerning the partition table can be ignored. You are not using partitions.

Re: How to repair primary GPT table in reused ZFS disks

Posted: 22 Nov 2015 20:16
by slaycock
zpool labelclear ada0 didn't work while the rest of the zpool was present.

Offlining a disk allows access to the disk, but gpart recover did not repair the broken primary GPT.

In the end I offlined each disk in turn, dd'd the first and last GB to zero, then did a zpool replace to initiate resilvering of the disk. A bit slow, as there were ten 1TB disks, but the problem is now fixed. None of the disks are reporting any problems with the primary GPT.
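For anyone repeating this, the per-disk cycle can be sketched as a dry run that only prints the commands rather than executing them. This is a sketch under assumptions, not the exact commands used: plan_wipe is a hypothetical helper, the disk size in bytes has to be supplied by the caller (on FreeBSD, diskinfo reports it), "GB" is taken as 1024 blocks of 1 MiB, and bs=1m is the FreeBSD dd spelling. The second dd has no count, so it writes from the computed offset to the end of the disk and is expected to stop with an end-of-device error.

```shell
# Hypothetical dry-run helper: prints the commands for one disk, runs nothing.
# Usage: plan_wipe POOL VDEV RAWDEV SIZE_BYTES
#   POOL        pool name
#   VDEV        device name as listed in "zpool status" (e.g. da0.nop)
#   RAWDEV      underlying disk under /dev (e.g. da0)
#   SIZE_BYTES  size of the disk in bytes
plan_wipe() {
    pool=$1
    vdev=$2
    rawdev=$3
    size_bytes=$4

    # Whole MiB blocks on the disk, and the block where the last GiB starts.
    total_mib=$((size_bytes / 1048576))
    tail_start=$((total_mib - 1024))

    echo "zpool offline $pool $vdev"
    # Zero the first GiB (covers the primary GPT at the start of the disk).
    echo "dd if=/dev/zero of=/dev/$rawdev bs=1m count=1024"
    # Zero from the start of the last GiB to the end of the disk
    # (covers the secondary GPT stored at the end of the disk).
    echo "dd if=/dev/zero of=/dev/$rawdev bs=1m seek=$tail_start"
    # Replace the device with itself to start the resilver.
    echo "zpool replace $pool $vdev"
    echo "zpool status $pool"
}
```

Review the printed commands carefully before running any of them, since the dd lines destroy data on the named disk. Wait for each resilver to finish before moving on to the next disk.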