ZFS L2ARC bug?
Posted: 28 Jul 2016 14:48
Sorry, this is a long one. First let me get this out of the way: my machine specs.
Version 10.3.0.3 - Pilingitam (revision 2870)
Compiled Wed Jul 13 20:48:29 EDT 2016
Platform OS FreeBSD 10.3-RELEASE-p5 #0 r302530M: Sun Jul 10 22:07:33 CEST 2016
Platform x64-embedded on Intel(R) Xeon(R) CPU E3-1246 v3 @ 3.50GHz
System Supermicro X10SAT
System bios American Megatrends Inc. version: 3.0 05/26/2015
System RAM 32 GB Unbuffered ECC DDR3 (fully tested with Memtest86, no errors)
Pool VDEVs Six "WD30EFRX" WD Red 3TB hard drives in RAIDZ2
SLOG VDEV* Two PLEXTOR PX-AG128M6e SSDs (partition 1 of each SSD, roughly 2.5 GB each, mirrored)
L2ARC VDEV* Same two SSDs as above (partition 2 of each SSD, roughly 30 GB each, striped)
Power backup APC SMT1500 Smart UPS (provides roughly 75 minutes runtime on an idle system and 25 under load)
Swap device NONE (this was most likely my mistake)
* SLOG and L2ARC were added upon upgrade to 10.2.x last year, as the PLEXTOR SSDs were not recognized in 9.x.
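For reference, a layout like the one above can be set up roughly as follows. This is a minimal sketch, not my exact commands: the device names (ada1/ada2), GPT labels, and pool name "tank" are all placeholders.

```shell
# Hypothetical devices ada1/ada2; adjust names, sizes, and labels to taste.
# Partition each SSD: a small SLOG slice first, a larger L2ARC slice second.
gpart create -s gpt ada1
gpart add -t freebsd-zfs -l slog0 -s 2500M ada1
gpart add -t freebsd-zfs -l cache0 -s 30G ada1
# (repeat on ada2 with labels slog1/cache1)

# Attach both to an existing pool named "tank":
zpool add tank log mirror gpt/slog0 gpt/slog1   # SLOG is mirrored
zpool add tank cache gpt/cache0 gpt/cache1      # L2ARC cache devices are always striped
```

Note that ZFS never mirrors cache devices; listing two after "cache" stripes them, which is why the L2ARC above shows as striped.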
---
This system had been set up on 9.2.x approximately two years ago without any major issues and upgraded last year when 10.2.x was released.
I lost my pool last week when the system crashed: an out-of-memory condition killed almost every process on the box, including sshd and getty, so I could log in neither remotely nor via the console. FreeBSD has no "magic SysRq" like Linux, so I wasn't able to force a sync of the disks before a hard reset. The OOM condition was most likely my fault: (a) I was running without swap, believing 32 GB of memory was enough and wouldn't require it, and (b) when the system ran out of memory, I was running a "diff" between two very large files of about 16 GB each.
Once the system was restarted, it got stuck in a reboot loop. I edited config.xml to remove the references to the pool, and the system then booted properly. Any attempt to import the pool other than with "-o readonly=on" caused a crash and reboot. (Thanks to "hdantman" for the post providing the read-only option; it saved me a lot of time fiddling with other approaches): viewtopic.php?t=8173
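For anyone who lands here with the same panic loop, the read-only import looks like this (pool name "tank" is a placeholder for yours):

```shell
# A read-only import keeps ZFS from replaying the log or freeing segments,
# which is what triggered the "freeing free segment" panic on a normal import.
zpool import -o readonly=on tank

# If the pool was last touched by a different system/hostid,
# -f may also be needed:
# zpool import -f -o readonly=on tank
```

You can then mount and copy data off, but you cannot write to the pool, scrub it, or destroy snapshots while it is imported this way.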
Kernel panic image: I searched for "Solaris panic ZFS freeing free segment" and found mention of Solaris bug #7191375 in a blog; the comments indicate the bug is present in FreeBSD 9.2 as well. This sounded exactly like what I was encountering:
http://www.asksendai.com/zfs-with-l2arc-bug-panic-loop/
Oracle briefly mentions this bug, on what I can access publicly:
http://docs.oracle.com/cd/E26502_01/htm ... html#gmkgz
Here's a more in-depth discussion; Cindy Swearingen appears to be (or was at the time) an Oracle engineer:
http://zfs-discuss.opensolaris.narkive. ... ic-7191375
This is scary enough to make me never want to think of using L2ARC again:
https://java.net/projects/solaris-zfs/l ... message/19
I was able to salvage most of the data. I imported the pool read-only and used rsync to copy the "current" data to another machine (EXT4 rather than ZFS). Past snapshots would have taken up too much room on a non-snapshotting file system, so those were lost.
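The salvage copy was essentially a one-liner. This is a sketch rather than my exact command; the mountpoint and destination host/path are placeholders:

```shell
# -a preserves permissions, timestamps, and ownership; -H keeps hard links;
# --numeric-ids avoids remapping uids/gids on the foreign (EXT4) host.
# The trailing slash on the source copies its contents, not the directory itself.
rsync -aH --numeric-ids /mnt/tank/ backuphost:/srv/salvage/tank/
```

Only the live filesystem comes across this way; snapshots would each have to be copied out separately (e.g. from their .zfs/snapshot directories), which is why I gave up on them for lack of space.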
I have since rebuilt the pool and am in the process of rsyncing the data back. The new pool uses the same six drives in RAIDZ2. I've given the system more SLOG space (about 16 GB mirrored, which I know is overkill by at least a factor of four). Because the bug affects pools with an L2ARC, I skipped the L2ARC this time; the second partition of each SSD is now used for swap (24 GB per SSD, though the system has yet to touch it).
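The new SSD layout is, roughly (again, device names and labels are placeholders):

```shell
# Per SSD: partition 1 = SLOG slice, partition 2 = swap (no L2ARC this time).
gpart add -t freebsd-zfs  -l slog0 -s 16G ada1
gpart add -t freebsd-swap -l swap0 -s 24G ada1

# Enable the swap partition immediately:
swapon /dev/gpt/swap0

# And persist it across reboots via /etc/fstab:
# /dev/gpt/swap0  none  swap  sw  0  0
```

Even if the swap never gets used, having it there means an OOM situation should page out rather than start killing sshd and getty the way it did before.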
So, if this bug still exists in FreeBSD 10.3, it exists in NAS4Free. Even if it's fixed in FreeBSD 10.3, any pool created with an L2ARC on a version of FreeBSD/NAS4Free where the bug hadn't yet been fixed *MIGHT* still have the potential for corruption.
Edit: added screenshot of boot loop/crash