This is the old XigmaNAS forum in read-only mode;
it will be taken offline by the end of March 2021.




System fails when disk is lost

Post by tuaris »

Why does the entire system fail when I lose one disk in the ZFS RAID?

Code:

Oct 27 01:16:41 <user.crit> storage kernel: arcsas: Completion Q Entry=0x300c0, Slot No.=0xc0, Status_Buff.Err_Info=0x00000000,01000000, INT status=0x1
Oct 27 01:16:41 <user.crit> storage kernel: Device 0x5 Task file error, Status Reg=0x51, Error Reg=0x40.
Oct 27 01:16:41 <user.crit> storage kernel: AbortReq reset command 0xffffff8141eae9c0: Reset pPort(0x1) pCCB->EntryIndex(0x5) Slot(0xc8)
Oct 27 01:16:41 <user.crit> storage kernel: arcsas_cmd_done: target=0x5, lun=0x0, SCSI Command=0x28,0x0,0x6a,0xc5,0x70,0x9a,0x0,0x0,0x7,0x0,cmd_status=0x208, scsi_status=0x0, ccb_status=0x6
Oct 27 01:16:41 <user.crit> storage kernel: AbortReq reset command 0xffffff8141e781e0: Reset pPort(0x1) pCCB->EntryIndex(0x5) Slot(0xca)
Oct 27 01:16:41 <user.crit> storage kernel: arcsas_cmd_done: target=0x5, lun=0x0, SCSI Command=0x2a,0x0,0x32,0x0,0xd9,0xc1,0x0,0x0,0x3,0x0,cmd_status=0x208, scsi_status=0x0, ccb_status=0x6
Oct 27 01:16:41 <user.crit> storage kernel: arcsas: Target=0x 5, lun=0, GONE!!!
Oct 27 01:16:41 <daemon.info> storage istgt[2177]: ABORT_TASK
Oct 27 01:16:42 <user.crit> storage kernel: da5 at arcsas0 bus 0 scbus0 target 5 lun 0
Oct 27 01:16:42 <user.crit> storage kernel: da5: <WDC WD1003FBYX-01Y7B 01.0> s/n WD-WCAW30740700 detached
Oct 27 01:16:42 <user.crit> storage kernel: (da5:arcsas0:0:5:0): READ(10). CDB: 28 00 53 df ff d1 00 00 10 00 
Oct 27 01:16:42 <user.crit> storage kernel: (da5:arcsas0:0:5:0): CAM status: SCSI Status Error
Oct 27 01:16:42 <user.crit> storage kernel: (da5:arcsas0:0:5:0): SCSI status: Check Condition
Oct 27 01:16:42 <user.crit> storage kernel: (da5:arcsas0:0:5:0): SCSI sense: RECOVERED ERROR asc:0,0 (No additional sense information)
Oct 27 01:16:42 <user.crit> storage kernel: (da5:arcsas0:0:5:0): Info: 0x53dfffdd
Oct 27 01:17:01 <user.debug> storage kernel: sonewconn: pcb 0xfffffe015fa447a8: Listen queue overflow: 2 already in queue awaiting acceptance (1 occurrences)
Oct 27 01:16:43 <daemon.info> storage last message repeated 3 times


Post by Parkcomm »

Not enough info to tell:

Is the NAS4Free OS hosted on the pool with the failing disk?

That could mean: one disk fails -> /dev/ device numbers change -> ZFS sees the wrong vdevs -> pool down -> OS down?

If so, boot from a NAS4Free USB stick and report the output of zpool status, zpool list, etc.

Is this the same problem you had before?
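
When comparing `zpool status` reports, it can help to pull out just the device states. The following is only a rough sketch, assuming the plain-text layout shown elsewhere in this thread; `parse_zpool_status` and `unhealthy` are made-up helper names, not part of ZFS or NAS4Free, and the sample is the `internal` pool as posted in this thread (trimmed).

```python
def parse_zpool_status(text):
    """Return {pool_name: [(device_name, state), ...]} from `zpool status` text.

    Assumes the usual layout: a 'pool:' line, then a 'config:' section whose
    rows are NAME STATE READ WRITE CKSUM, ended by an 'errors:' line.
    """
    pools = {}
    current = None
    in_config = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("pool:"):
            current = stripped.split(":", 1)[1].strip()
            pools[current] = []
            in_config = False
        elif stripped.startswith("config:"):
            in_config = True
        elif stripped.startswith("errors:"):
            in_config = False
        elif in_config and current and stripped and not stripped.startswith("NAME"):
            fields = stripped.split()
            if len(fields) >= 2:
                pools[current].append((fields[0], fields[1]))
    return pools


def unhealthy(pools):
    """List (pool, device, state) for every row whose state is not ONLINE."""
    return [
        (pool, dev, state)
        for pool, devs in pools.items()
        for dev, state in devs
        if state != "ONLINE"
    ]


# Sample: the 'internal' pool as reported in this thread (status text trimmed).
SAMPLE = """\
  pool: internal
 state: ONLINE
config:

    NAME        STATE     READ WRITE CKSUM
    internal    ONLINE       0     0     0
      mirror-0  ONLINE       0     0     0
        da4     ONLINE       0     0     0
        da5     ONLINE       0     0     0

errors: No known data errors
"""

if __name__ == "__main__":
    print(unhealthy(parse_zpool_status(SAMPLE)))
```

An empty list means every listed device was ONLINE at the time the report was taken; it says nothing about what happened before a resilver.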
NAS4Free Embedded 10.2.0.2 - Prester (revision 2003), HP N40L Microserver (AMD Turion) with modified BIOS, ZFS Mirror 4 x WD Red + L2ARC 128M Apple SSD, 10G ECC Ram, Intel 1G CT NIC + inbuilt broadcom


Post by tuaris »

The NAS4Free system is hosted as an embedded image on a bootable flash drive.
There are 6 drives attached to an HBA: 4 external (da0-da3) and 2 internal (da4 and da5).

There are two ZFS pools:

Code:

  pool: external1
 state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
	still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
	pool will no longer be accessible on software that does not support feature
	flags.
  scan: resilvered 879G in 4h7m with 0 errors on Tue Aug 19 18:04:40 2014
config:

	NAME        STATE     READ WRITE CKSUM
	external1   ONLINE       0     0     0
	  raidz2-0  ONLINE       0     0     0
	    da2     ONLINE       0     0     0
	    da0     ONLINE       0     0     0
	    da1     ONLINE       0     0     0
	    da3     ONLINE       0     0     0

errors: No known data errors

  pool: internal
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
	still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
	the pool may no longer be accessible by software that does not support
	the features. See zpool-features(7) for details.
  scan: resilvered 5.98M in 0h0m with 0 errors on Tue Oct 27 01:34:50 2015
config:

	NAME        STATE     READ WRITE CKSUM
	internal    ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    da4     ONLINE       0     0     0
	    da5     ONLINE       0     0     0

errors: No known data errors

It appears that da5 has failed. Shouldn't the system continue to function?
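
For reference, the expectation behind that question can be written down as a small sketch. It only models redundancy at the ZFS vdev level (a mirror stays available while at least one member is left; raidzN tolerates N lost members) and says nothing about how the controller or its driver reacts when a disk disappears. The function names are hypothetical, chosen for illustration.

```python
def tolerated_failures(vdev_name, members):
    """Member disks a single vdev can lose while ZFS keeps it available.

    vdev_name is a name as printed by `zpool status`, e.g. 'mirror-0' or
    'raidz2-0'; anything else is treated as a non-redundant device.
    """
    kind = vdev_name.split("-")[0]
    if kind == "mirror":
        return max(members - 1, 0)
    if kind.startswith("raidz"):
        # 'raidz' without a digit means single parity (raidz1).
        return int(kind[len("raidz"):] or 1)
    return 0


def survives(vdev_name, members, failed):
    """True if the vdev should stay available after `failed` members are lost."""
    return failed <= tolerated_failures(vdev_name, members)


if __name__ == "__main__":
    # The two pools from this thread: a 2-way mirror and a 4-disk raidz2.
    print("mirror-0:", tolerated_failures("mirror-0", 2), survives("mirror-0", 2, 1))
    print("raidz2-0:", tolerated_failures("raidz2-0", 4), survives("raidz2-0", 4, 1))
```

By this model, losing da5 alone should leave the `internal` mirror degraded but available, so a full system failure points at something below ZFS.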


Post by b0ssman »

What controller is that drive on?
Nas4Free 11.1.0.4.4517. Supermicro X10SLL-F, 16gb ECC, i3 4130, IBM M1015 with IT firmware. 4x 3tb WD Red, 4x 2TB Samsung F4, both GEOM AES 256 encrypted.


Post by tuaris »

ARECA ARC-1320-4i4X


Post by b0ssman »

A lot depends on how the card and the driver handle failed drives.

If, for example, the drive just dies and the controller/driver does not correctly support hotplugging, then the system can crash.