
ZFS, drive disconnects/re-attach while live

BloodyIron
NewUser
Posts: 3
Joined: 04 Feb 2013 06:15
Status: Offline

ZFS, drive disconnects/re-attach while live

#1

Post by BloodyIron » 04 Feb 2013 06:21

I'm trying to figure out how to recover from disaster scenarios while the system stays live, as in without rebooting. Rebooting seems to resolve all my issues, but I want to run VMs on my storage array, so rebooting really isn't acceptable.

One of my scenarios is when a drive just loses connection, say something's wrong with the port, or it gets accidentally pulled out. I've tried several ways to re-add a drive to a Z1 pool while live, and so far no luck. I'm not sure what the proper procedure is when the drive isn't new but was already part of the array. I keep finding instructions for replacing a _bad_ drive with a new one, not a disconnected one.

So, what is the proper method I should be following?

Also, the same question for a Z1 pool plus a hot spare.

rostreich
Status: Offline

Re: ZFS, drive disconnects/re-attach while live

#2

Post by rostreich » 04 Feb 2013 09:46

BloodyIron wrote: I'm not sure what the proper procedure is when the drive isn't new but was already part of the array.
ZFS writes metadata on every disk, so it knows the disk was part of the array. If you have real hot-swap support in your hardware, you can pull a disk while the system is running and wipe it with DBAN or something similar, so the metadata is destroyed and ZFS is tricked into believing you swapped the 'faulty' disk for a new one. Put the disk back in, bring the device online, scrub and resilver. Done. :)
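A minimal sketch of that sequence at the command line, assuming a pool named tank and the returned disk showing up as ada2 (both names are hypothetical):

Code: Select all

# bring the re-inserted disk back online in the pool
$ zpool online tank ada2
# start a scrub so any stale data gets resilvered, then check progress
$ zpool scrub tank
$ zpool status tank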

The same goes for a hot spare disk!

Create your arrays fresh, try them out with some test data the first time, and you'll be prepared for the real crash scenario. ;)

fsbruva
Advanced User
Posts: 383
Joined: 21 Sep 2012 14:50
Status: Offline

Re: ZFS, drive disconnects/re-attach while live

#3

Post by fsbruva » 04 Feb 2013 13:10

Have you tried a zpool replace operation? To do this, you would first find the GUID of the disk you're replacing (run zdb at the command prompt), then use that GUID as the identifier of the OLD disk (rather than /dev/adaX), and then specify the new disk as the replacement.
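A hedged sketch of what that could look like, assuming a pool named tank and the re-attached disk appearing as ada2 (the pool name, GUID, and device name are made-up examples):

Code: Select all

# dump the cached pool config, which lists each vdev's guid
$ zdb
# replace the old vdev (identified by its guid, a made-up example here)
# with the re-attached device
$ zpool replace tank 9571840219891074198 ada2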

BloodyIron
NewUser
Posts: 3
Joined: 04 Feb 2013 06:15
Status: Offline

Re: ZFS, drive disconnects/re-attach while live

#4

Post by BloodyIron » 18 Feb 2013 03:54

So I have my 4 drives like this.

3 in ZFS Z1
1 as hot spare

When I remove one of the drives from the Z1 pool, the spare doesn't automatically start resilvering. I read somewhere that it should automatically replace the removed drive. Did I miss a specific setting to make this happen? The device showed as a spare for the zpool, so I thought that would be sufficient. Also, how can I deal with this situation 100% through the GUI? Sure, I can work through the CLI by reading manuals, but that kind of defeats the purpose of the GUI.

randomparity
NewUser
Posts: 4
Joined: 07 Feb 2013 18:55
Status: Offline

Re: ZFS, drive disconnects/re-attach while live

#5

Post by randomparity » 18 Feb 2013 06:52

In my admittedly limited experience, I've found that managing ZFS through the GUI is problematic, especially when adding/removing disks. Examples:

1) Tried to remove multiple datasets (clicking 'X' in Disks|ZFS|Datasets|Dataset) all at once before clicking "Apply". The actual ZFS operation failed, since some of the datasets were mounted on other datasets in the same operation. The GUI didn't report any errors and the datasets were no longer displayed, but they were still present on the system. I needed to reboot to clear the discrepancy.
2) Similar problems happen when creating a dataset: the GUI thinks it worked and lets me create CIFS file shares, which don't actually work because the underlying "zfs create ..." operation failed.
3) Adding new disks causes confusion in the GUI. I created a pool with disks /dev/da1 to /dev/da8. (The zpool history command shows: zpool create -m /mnt/zp0 zp0 raidz1 /dev/da1.nop /dev/da2.nop /dev/da3.nop /dev/da4.nop raidz1 /dev/da5.nop /dev/da6.nop /dev/da7.nop /dev/da8.nop). After adding 4 new drives, the "zpool status" command shows /dev/da0 to /dev/da7 in the pool. That's fine, the pool still works, but the Disks|Format GUI page shows da0, da9, da10, da11 when I want to format the new drives for ZFS. As a result, I can't create a new virtual device and add it to my pool through the GUI.

Bottom line: if the GUI works, great, but make sure you know how to do things at the command line for when things don't work right (see the sketch below).

Dave
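For point 3 above, a minimal CLI sketch of adding the new disks as another raidz vdev, assuming the four new drives show up as da8 through da11 (device names are hypothetical; verify them against zpool status and your device list first):

Code: Select all

# add a third raidz1 vdev to the existing pool zp0
# (da8-da11 are made-up example names -- check yours before running)
$ zpool add zp0 raidz1 da8 da9 da10 da11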

BloodyIron
NewUser
Posts: 3
Joined: 04 Feb 2013 06:15
Status: Offline

Re: ZFS, drive disconnects/re-attach while live

#6

Post by BloodyIron » 18 Feb 2013 07:45

Which version have you been working with?

Thanks for the insight :D

Have you tried FreeNAS at all, btw? I switched to NAS4Free so I could use a higher ZFS version, but that doesn't seem relevant any more.
randomparity wrote: In my admittedly limited experience, I've found that managing ZFS through the GUI is problematic, especially when adding/removing disks. […]

ChriZathens
Forum Moderator
Posts: 833
Joined: 23 Jun 2012 09:14
Location: Athens, Greece
Contact:
Status: Offline

Re: ZFS, drive disconnects/re-attach while live

#7

Post by ChriZathens » 18 Feb 2013 09:35

BloodyIron wrote: When I remove one of the drives from the Z1 pool, the spare doesn't automatically start resilvering. […] Did I miss a specific setting to make this happen?
AFAIK, in order for the disk to start resilvering immediately, the autoreplace attribute of your pool must be set to on.
Go to Advanced|Execute command and enter the following command: zpool get all <poolname>, where <poolname> is the name of your pool. This is example output from my pool, named Media:

Code: Select all

$ zpool get all Media
NAME   PROPERTY       VALUE       SOURCE
Media  size           5.44T       -
Media  capacity       86%         -
Media  altroot        -           default
Media  health         ONLINE      -
Media  guid           1776827176248041000  default
Media  version        28          default
Media  bootfs         -           default
Media  delegation     on          default
Media  autoreplace    off         default
Media  cachefile      -           default
Media  failmode       wait        default
Media  listsnapshots  off         default
Media  autoexpand     off         default
Media  dedupditto     0           default
Media  dedupratio     1.00x       -
Media  free           753G        -
Media  allocated      4.70T       -
Media  readonly       off         -
Media  comment        -           default
Media  expandsize     0           -


As you can see, the autoreplace value is set to off.
To turn it on, again in Advanced|Execute command, run: zpool set autoreplace=on <poolname>
In my example:

Code: Select all

$ zpool set autoreplace=on Media
And then run the get all command again to see that the value has changed:

Code: Select all

$ zpool get all Media
NAME   PROPERTY       VALUE       SOURCE
Media  size           5.44T       -
Media  capacity       86%         -
Media  altroot        -           default
Media  health         ONLINE      -
Media  guid           1776827176248041000  default
Media  version        28          default
Media  bootfs         -           default
Media  delegation     on          default
Media  autoreplace    on          local
Media  cachefile      -           default
Media  failmode       wait        default
Media  listsnapshots  off         default
Media  autoexpand     off         default
Media  dedupditto     0           default
Media  dedupratio     1.00x       -
Media  free           753G        -
Media  allocated      4.70T       -
Media  readonly       off         -
Media  comment        -           default
Media  expandsize     0           -

Now the operation should be automatic... Give it a try on your machine to check the behavior. But you have to make sure that the replacement disk has no ZFS metadata on it, otherwise the replace will fail.
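If you'd rather not rely on autoreplace, you can also activate the spare by hand; a minimal sketch, assuming a pool named Media, a failed disk ada1, and the hot spare ada3 (all device names are hypothetical):

Code: Select all

# manually rebuild onto the hot spare (ada1/ada3 are made-up example names)
$ zpool replace Media ada1 ada3
# watch the resilver progress
$ zpool status Media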
My Nas
  1. Case: Fractal Design Define R2
  2. M/B: Supermicro x9scl-f
  3. CPU: Intel Celeron G1620
  4. RAM: 16GB DDR3 ECC (2 x Kingston KVR1333D3E9S/8G)
  5. PSU: Chieftec 850w 80+ modular
  6. Storage: 8x2TB HDDs in a RaidZ2 array ~ 10.1 TB usable disk space
  7. O/S: XigmaNAS 11.2.0.4.6625 -amd64 embedded
  8. Extra H/W: Dell Perc H310 SAS controller, crossflashed to LSI 9211-8i IT mode, 8GB Innodisk D150SV SATADOM for O/S

Backup Nas: HP N40L (4x1TB HP branded Seagate disks in RaidZ configuration - 8GB ECC RAM)
