Missing ZFS ACL causing panic
Posted: 21 May 2015 10:37
I imported an old pool into a new NAS4Free system. Everything worked fine, but it kept nagging me to upgrade the zpool.
After doing so, the box started crashing seemingly at random.
Investigating the logs showed lots of "empty ZFS ACL" messages.
Logs:
kernel: acl_from_aces: empty ZFS ACL; returning EINVAL.
kernel: panic: vm_fault: fault on nofault entry, addr: ffffff81a769c000
kernel: cpuid = 6
kernel: KDB: stack backtrace:
kernel: #0 0xffffffff80a32e76 at kdb_backtrace+0x66
kernel: #1 0xffffffff809f85ae at panic+0x1ce
kernel: #2 0xffffffff80ca1820 at vm_fault_hold+0x24b0
kernel: #3 0xffffffff80ca1b63 at vm_fault+0x73
kernel: #4 0xffffffff80e1ae6f at trap_pfault+0x41f
kernel: #5 0xffffffff80e1b263 at trap+0x363
kernel: #6 0xffffffff80e04453 at calltrap+0x8
kernel: #7 0xffffffff81da9b8d at zfs_zaccess_aces_check+0x6d
kernel: #8 0xffffffff81daa126 at zfs_zaccess+0xc6
kernel: #9 0xffffffff81dcaa8b at zfs_freebsd_setattr+0xfbb
kernel: #10 0xffffffff80f5ac42 at VOP_SETATTR_APV+0x72
kernel: #11 0xffffffff80aa1ff1 at setfmode+0x101
kernel: #12 0xffffffff80aa2115 at kern_fchmodat+0x115
kernel: #13 0xffffffff80e1a0aa at amd64_syscall+0x5ea
kernel: #14 0xffffffff80e04737 at Xfast_syscall+0xf7
I can pretty much trigger the panic at will by picking an affected file or directory and trying to access, move or delete it. Running chmod on it also triggers the panic, so I'm in a catch-22: I can't create or modify the ACL, and I can't delete the offending files/directories. Mounting the pool on the old system doesn't work either, because of the pool version issue after the upgrade.
Scrubbing works without problems though.
Any ideas on how to fix/workaround this issue?