
N4F issue after power outage

Posted: 07 Sep 2015 06:35
by newmember
So after the power outage the N4F did not come back on.
I reinstalled N4F to the USB.
I then tried a ZFS import, but it reported nothing to import.

I went to the command prompt to find the following.
What are the numbers that are replacing the disk dev name?
Any suggestions?


nas4free: ~# zpool import
pool: tank1-pool
id: 11839228348510070876
state: UNAVAIL
status: One or more devices are missing from the system.
action: The pool cannot be imported. Attach the missing
devices and try again.
see: http://illumos.org/msg/ZFS-8000-3C
config:

tank1-pool UNAVAIL insufficient replicas
raidz2-0 UNAVAIL insufficient replicas
ada0 ONLINE
987516735268493811 UNAVAIL cannot open
7492536897629704888 UNAVAIL cannot open
ada2 ONLINE
ada3 ONLINE
16469022700334976080 OFFLINE

Re: N4F issue after power outage

Posted: 07 Sep 2015 06:40
by b0ssman
3 out of your 6 drives are not available.

Since you have a raidz2, two devices could fail and the pool would still work.
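
To make the arithmetic concrete: raidz2 keeps two disks' worth of parity, so the vdev survives at most two missing members. Here is a rough sketch that counts the unusable leaf devices in `zpool import` output. `count_missing` is a made-up helper name, and the parsing assumes the plain-text layout shown in the first post.

```shell
# Count leaf devices that are not usable in `zpool import` or `zpool status` output.
# Usage (hypothetical helper): zpool import | count_missing tank1-pool
count_missing() {
    pool="$1"
    awk -v pool="$pool" '
        $1 ~ /:$/ { next }                 # skip "state:", "status:" and similar header lines
        $1 == pool { next }                # skip the pool summary line
        $1 ~ /^(raidz|mirror)/ { next }    # skip the vdev group line
        $2 ~ /^(UNAVAIL|OFFLINE|FAULTED|REMOVED)$/ { n++ }
        END { print n + 0 }
    '
}
```

Fed the listing from the first post, this should print 3, which is one more than raidz2 can absorb.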

Re: N4F issue after power outage

Posted: 07 Sep 2015 09:28
by crowi
Either your drives or your controller have died. Can you post the dmesg output?
You should use a UPS on your server to prevent such things.

Re: N4F issue after power outage

Posted: 07 Sep 2015 21:52
by newmember
dmesg


ada0 at ahcich2 bus 0 scbus2 target 0 lun 0
ada0: <Hitachi HDS5C3030ALA630 MEAOA580> ATA8-ACS SATA 3.x device
ada0: Serial Number MJ1321YNG1AAGA
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad8
ada1 at ahcich3 bus 0 scbus3 target 0 lun 0
ada1: <ST3000DM001-1CH166 CC24> ATA8-ACS SATA 3.x device
ada1: Serial Number Z1F281L5
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada1: quirks=0x1<4K>
ada1: Previously was known as ad10
ada2 at ahcich5 bus 0 scbus5 target 0 lun 0
ada2: <Hitachi HDS5C3030ALA630 MEAOA580> ATA8-ACS SATA 3.x device
ada2: Serial Number MJ1321YNG12PSA
ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada2: Previously was known as ad14
ada3 at ahcich6 bus 0 scbus6 target 0 lun 0
ada3: <Hitachi HDS5C3030ALA630 MEAOA580> ATA8-ACS SATA 3.x device
ada3: Serial Number MJ1321YNG156LA
ada3: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada3: Previously was known as ad16
ada4 at ahcich7 bus 0 scbus7 target 0 lun 0
ada4: <Hitachi HDS5C3030ALA630 MEAOA580> ATA8-ACS SATA 3.x device
ada4: Serial Number MJ1321YNG13XYA
ada4: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada4: Command Queueing enabled
ada4: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada4: Previously was known as ad18
ugen0.5: <vendor 0x13fe> at usbus0
umass0: <vendor 0x13fe Patriot Memory, class 0/0, rev 3.00/1.00, addr 4> on usbus0
umass0: SCSI over Bulk-Only; quirks = 0xc100
umass0:9:0:-1: Attached to scbus9
da0 at umass-sim0 bus 0 scbus9 target 0 lun 0
da0: < Patriot Memory PMAP> Removable Direct Access SPC-2 SCSI device
da0: Serial Number 070131B6758D4160
da0: 400.000MB/s transfers
da0: 7385MB (15124992 512 byte sectors: 255H 63S/T 941C)
da0: quirks=0x2<NO_6_BYTE>

Re: N4F issue after power outage

Posted: 08 Sep 2015 05:13
by Parkcomm
The numbers that replace the device name are ZFS guids. Just run

zdb -C -e tank1-pool 
If that doesn't work, try it without the -e option. You can also use "zdb -l /dev/ada0" to query the device itself, as each device carries a copy of the config and has the GUIDs for all disks.

(Obviously they represent /dev/ada1 and /dev/ada4)

The zdb command will display the config, including the GUID-to-device mapping. Here is a segment from mine:

children[2]:
       children[1]:
                    type: 'disk'
                    id: 1
                    guid: 16283620402190596653
                    path: '/dev/ada2'
                    phys_path: '/dev/ada2'
                    whole_disk: 1
                    DTL: 124
                    create_txg: 4
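
If the config is long, a small filter can pull out just the GUID-to-path pairs. This is only a sketch: `guid_map` is a hypothetical helper name, and it assumes the `guid:` / `path:` field layout shown in the segment above.

```shell
# Print "guid path" pairs from zdb config text read on stdin.
# Usage (hypothetical helper): zdb -C tank1-pool | guid_map
guid_map() {
    awk '
        $1 == "guid:" { guid = $2 }        # remember the most recent guid seen
        $1 == "path:" { print guid, $2 }   # phys_path: is a different first field, so it is ignored
    ' | tr -d "'"                          # strip the quotes zdb puts around paths
}
```

Group vdevs such as the raidz2 entry carry a guid but no path, so only the disks are printed.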
Am I correct in thinking you have a five-disk raidz2 pool? Yet you have three working disks and three failed!

I would try the following in this order. Run zpool status after each step to see if it worked.

1/ Reboot (probably won't help)
2/ zpool export tank1-pool
3/ zpool import tank1-pool
4/ zpool import -f tank1-pool
5/ zpool import -fF tank1-pool (if this works you lose the most recent transactions)
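
The import escalation in the list above can be sketched as a small function. Note that in zpool syntax the options follow the `import` subcommand. `try_import` is a made-up name, and this is an illustration rather than something to run blindly, since the last step can discard recent transactions.

```shell
# Try progressively more forceful imports, stopping at the first that succeeds.
# Usage (hypothetical helper): try_import tank1-pool
try_import() {
    pool="$1"
    for flags in "" "-f" "-fF"; do
        # $flags is intentionally unquoted so the empty case adds no argument.
        if zpool import $flags "$pool"; then
            echo "imported with flags: ${flags:-none}"
            return 0
        fi
    done
    echo "import failed at every step" >&2
    return 1
}
```

Run zpool status afterwards either way, so you know which state the pool ended up in.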

If this process does not work, and assuming you have a five-disk raidz2 with two faulty disks, attach one of the UNAVAIL devices. The pool will resilver; then attach the other. (The reason I'm suggesting doing this last is that if you get a disk failure during the rebuild, the data is toast.)

If at this point the pool hasn't come back, I'd probably think about starting from scratch.

Re: N4F issue after power outage

Posted: 08 Sep 2015 17:45
by newmember
Update:
I looked in the BIOS and saw that one disk was missing.
I found that the disk was not seated correctly even though the light was on. Geez.

I imported the disks.
I synchronized the ZFS configuration.
Now I can see this (see below).
I will work to move the data off.

Is there something I can check with the disks with the errors?




pool: tank1-pool
state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using 'zpool online'.
see: http://illumos.org/msg/ZFS-8000-2Q
scan: resilvered 15.5K in 0h0m with 0 errors on Thu Aug 22 21:27:59 2013
config:

NAME STATE READ WRITE CKSUM
tank1-pool DEGRADED 0 0 0
raidz2-0 DEGRADED 0 0 0
ada0 ONLINE 0 0 0
987516735268493811 UNAVAIL 0 0 0 was /dev/ada1
ada2 ONLINE 0 0 0
ada3 ONLINE 0 0 0
ada4 ONLINE 0 0 0
16469022700334976080 OFFLINE 0 0 0 was /dev/ada5

Re: N4F issue after power outage

Posted: 08 Sep 2015 17:50
by newmember
FYI:

Not sure I can run this command while the pool is running?

nas4free: ~# zdb -C -e tank1-pool
zdb: can't open 'tank1-pool': File exists

Re: N4F issue after power outage

Posted: 08 Sep 2015 17:53
by b0ssman
Post all the SMART values from Diagnostics|Information.
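
From the command line, `smartctl -A /dev/adaN` prints the same attribute table. As a rough sketch, the filter below picks out a few attributes that are commonly checked first; `smart_flags` is a hypothetical helper name, the attribute names vary by vendor, and it assumes the usual ten-column table with the raw value in the last column. A nonzero value is a hint to look closer, not proof of a failing disk.

```shell
# Print selected SMART attributes whose raw value is above zero.
# Usage (hypothetical helper): smartctl -A /dev/ada1 | smart_flags
smart_flags() {
    awk '
        $2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|Offline_Uncorrectable|UDMA_CRC_Error_Count)$/ && $10 + 0 > 0 {
            print $2, $10
        }
    '
}
```

UDMA_CRC_Error_Count in particular tends to point at cabling or connectors rather than the platters.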

Re: N4F issue after power outage

Posted: 08 Sep 2015 22:50
by Parkcomm
newmember wrote: nas4free: ~# zdb -C -e tank1-pool
zdb: can't open 'tank1-pool': File exists
If the pool is not exported, drop the -e option:

zdb -C tank1-pool

However, you don't need it now, since zpool status is showing the mapping.

Re: N4F issue after power outage

Posted: 08 Sep 2015 22:59
by Parkcomm
newmember wrote: status: One or more devices could not be opened. Sufficient replicas exist for
the pool to continue functioning in a degraded state.
Well this is good news - your pool is functioning, so now you only have to sort out your disk problems rather than rebuild the pool.

What happens when you run

zpool clear  tank1-pool
and

zpool online tank1-pool /dev/ada5 

Re: N4F issue after power outage

Posted: 09 Sep 2015 05:29
by newmember
Yikes, I'd better get this data off right away.
I see it resilvering, but I also see CRC errors scrolling on the monitor.
I guess they started after adding ada5.



nas4free: ~# zpool clear tank1-pool
nas4free: ~#
nas4free: ~#
nas4free: ~#
nas4free: ~# zpool online tank1-pool /dev/ada5
nas4free: ~#






pool: tank1-pool
state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
scan: resilver in progress since Wed Sep 9 03:17:54 2015
4.85G scanned out of 4.68T at 84.2M/s, 16h10m to go
12.2M resilvered, 0.10% done
config:

NAME STATE READ WRITE CKSUM
tank1-pool DEGRADED 0 0 0
raidz2-0 DEGRADED 0 0 0
ada0 ONLINE 0 0 0
987516735268493811 UNAVAIL 0 0 0 was /dev/ada1
ada2 ONLINE 0 0 0
ada3 ONLINE 0 0 0
ada4 ONLINE 0 0 15
ada5 ONLINE 70 3.17M 208 (resilvering)
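
For anyone wanting to watch just the error counters while copying data off, here is a small sketch that lists devices with a nonzero READ, WRITE or CKSUM column. `dev_errors` is a made-up helper name, and it assumes the column layout of the status output above.

```shell
# List devices whose READ, WRITE or CKSUM counter is not zero.
# Usage (hypothetical helper): zpool status tank1-pool | dev_errors
dev_errors() {
    awk '
        $2 ~ /^(ONLINE|DEGRADED|UNAVAIL|OFFLINE|FAULTED|REMOVED)$/ &&
        ($3 != "0" || $4 != "0" || $5 != "0") {
            print $1, $3, $4, $5           # counters are compared as text, so "3.17M" is handled too
        }
    '
}
```

On the status output above it would print the ada4 and ada5 lines only.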

Re: N4F issue after power outage

Posted: 09 Sep 2015 06:14
by Parkcomm
Maybe take ada5 back offline and halt the resilvering (zpool scrub -s <pool>).

Copy the data off, then put the disk back online. I don't like the look of the CRC errors on ada4 and the 3M write errors on ada5; that disk isn't doing you any favours.

You should also see if you can borrow a controller!

Re: N4F issue after power outage

Posted: 14 Sep 2015 05:11
by newmember
Just an update.
Did lots of reading and suspected maybe the SATA cable.
I removed the 'hot swap' drive slots and connected the drives directly to the motherboard.

My CRC errors seem to have gone away. dmesg below
I then ran the CLI to add back disk ada5. See below.

In the end I will assume it was those not so good hot swap trays.






$ dmesg | grep ada
ada0 at ahcich2 bus 0 scbus2 target 0 lun 0
ada0: <Hitachi HDS5C3030ALA630 MEAOA580> ATA8-ACS SATA 3.x device
ada0: Serial Number MJ1321YNG1AAGA
ada0: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada0: Command Queueing enabled
ada0: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada0: Previously was known as ad8
ada1 at ahcich3 bus 0 scbus3 target 0 lun 0
ada1: <ST3000DM001-1CH166 CC24> ATA8-ACS SATA 3.x device
ada1: Serial Number Z1F281L5
ada1: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
ada1: Command Queueing enabled
ada1: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada1: quirks=0x1<4K>
ada1: Previously was known as ad10
ada2 at ahcich4 bus 0 scbus4 target 0 lun 0
ada2: <Hitachi HDS5C3030ALA630 MEAOA580> ATA8-ACS SATA 3.x device
ada2: Serial Number MJ1321YNG13YKA
ada2: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada2: Command Queueing enabled
ada2: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada2: Previously was known as ad12
ada3 at ahcich5 bus 0 scbus5 target 0 lun 0
ada3: <Hitachi HDS5C3030ALA630 MEAOA580> ATA8-ACS SATA 3.x device
ada3: Serial Number MJ1321YNG12PSA
ada3: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada3: Command Queueing enabled
ada3: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada3: Previously was known as ad14
ada4 at ahcich6 bus 0 scbus6 target 0 lun 0
ada4: <Hitachi HDS5C3030ALA630 MEAOA580> ATA8-ACS SATA 3.x device
ada4: Serial Number MJ1321YNG156LA
ada4: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada4: Command Queueing enabled
ada4: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada4: Previously was known as ad16
ada5 at ahcich7 bus 0 scbus7 target 0 lun 0
ada5: <Hitachi HDS5C3030ALA630 MEAOA580> ATA8-ACS SATA 3.x device
ada5: Serial Number MJ1321YNG13XYA
ada5: 600.000MB/s transfers (SATA 3.x, UDMA6, PIO 8192bytes)
ada5: Command Queueing enabled
ada5: 2861588MB (5860533168 512 byte sectors: 16H 63S/T 16383C)
ada5: Previously was known as ad18
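
A side note on reading these listings: the adaN names are assigned at boot and can shift when a disk is missing or recabled. Serial MJ1321YNG12PSA, for example, shows up as ada2 in the earlier dmesg and as ada3 here. A quick sketch for pulling a name-to-serial table out of dmesg, so the two listings can be compared; `disk_serials` is a hypothetical helper name and it assumes the "Serial Number" line format shown above.

```shell
# Print "device serial" pairs from dmesg text read on stdin.
# Usage (hypothetical helper): dmesg | disk_serials
disk_serials() {
    awk '/Serial Number/ {
        sub(/:$/, "", $1)                  # "ada0:" becomes "ada0"
        print $1, $4
    }'
}
```

Matching disks by serial number avoids pulling the wrong drive when the names have moved.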





nas4free: ~# zpool status
pool: tank1-pool
state: DEGRADED
status: One or more devices could not be used because the label is missing or
invalid. Sufficient replicas exist for the pool to continue
functioning in a degraded state.
action: Replace the device using 'zpool replace'.
see: http://illumos.org/msg/ZFS-8000-4J
scan: resilvered 866M in 0h0m with 0 errors on Mon Sep 14 03:00:39 2015
config:

NAME STATE READ WRITE CKSUM
tank1-pool DEGRADED 0 0 0
raidz2-0 DEGRADED 0 0 0
ada0 ONLINE 0 0 0
987516735268493811 UNAVAIL 0 0 0 was /dev/ada1
ada2 ONLINE 0 0 0
ada3 ONLINE 0 0 0
ada4 ONLINE 0 0 0
ada5 ONLINE 0 0 0

errors: No known data errors
nas4free: ~#

Re: N4F issue after power outage

Posted: 14 Sep 2015 05:38
by Parkcomm
That's good news.

Hot-swap trays: could be a dodgy cable to the trays, the seating of the cable connectors, or the seating of the disk in the tray. All worth a look and easily fixed.

Now that it's a bit more stable, try zpool export / zpool import and see if that unavailable disk comes online (the benefit is that no resilver is required).

Re: N4F issue after power outage

Posted: 14 Sep 2015 08:38
by b0ssman
You still haven't posted the SMART values.
