[ IMPORTANT ] Last Friday morning, ZFS freeze

justin
Starter
Posts: 19
Joined: 22 Jul 2013 15:10

Post by justin »

Last Friday our ZFS volumes froze.
This happened on a SAN/NAS used for backups over NFS.
Thursday, 10:00pm: it was still working (ZFS snapshot copy from another SAN).
Friday, 1:00am: a backup tried to start over NFS and never started.
Friday, 4:00am: the system report mail was never sent. smartctl was frozen too.

I noticed this at 11:00am. I tried to access data on the pool but it was impossible. The system itself was still running.
ssh access: ok
zfs list: ok
zpool status -v: ok

NO errors reported in the logs.
The IPMI console doesn't show any error.
No kernel panic seen.

WHAT HAPPENED ??? Mysterious!!

I'm afraid it will happen again on the backup SAN/NAS, and even MORE afraid of it on the iSCSI SAN !! :(

Is it NFS that crashed?
Is there a way to increase logging?

After a few tries, the web interface stopped responding.

I tried the reboot command... it froze too. Waited 5 minutes... nothing. No error message anywhere.

I had to hard reset... (Yeah!!! so happy).

After that, all was good. I even saw the pending backup start and complete.

The SAN/NAS has been in production for 1.5 months. I have to understand what happened.

Thanks a lot,

Best regards,
Justin
- NAS4Free 9.1.0.1 x64-full 804 | x64-full on Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 98271MiB RAM | X x YTB WD ZFS mirror striping compressed, Z x YTB WD ZFS raidz2 | 2 SSD ZIL, 1 SSD LOG
- NAS4Free 9.1.0.1 x64-full 804 | x64-full on Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 98271MiB RAM | Z x YTB WD ZFS raidz2

Lee Sharp
Advanced User
Posts: 251
Joined: 13 May 2013 21:12

Re: [ IMPORTANT ] Last Friday morning, ZFS freeze

Post by Lee Sharp »

Use dmesg and see if you see disconnect and timeout errors on a SATA card. I had a card slowly going out, and after a while the zpool would lock up. Replaced the card, and all was good again.
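A minimal sketch of that dmesg check, for anyone who would rather filter a saved copy of the output than read it by eye. The keyword list and the function name `suspicious_lines` are assumptions made for illustration, not something a particular SATA driver is known to print; adjust the pattern to the messages your controller actually logs.

```python
import re
import sys

# Assumed keywords; extend or narrow them to match your controller's messages.
SUSPECT = re.compile(r"timeout|timed out|disconnect|detached|retries exhausted",
                     re.IGNORECASE)


def suspicious_lines(dmesg_text):
    """Return (line_number, line) pairs for lines that match a suspect keyword."""
    hits = []
    for number, line in enumerate(dmesg_text.splitlines(), start=1):
        if SUSPECT.search(line):
            hits.append((number, line.strip()))
    return hits


# Only runs when a saved dmesg file is given on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for number, line in suspicious_lines(handle.read()):
            print(f"{number}: {line}")
```

Save the output first (for example with `dmesg > dmesg.txt`) and pass that file to the script. It can run on any machine with Python 3, so nothing has to be installed on the NAS itself.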

btechnet
NewUser
Posts: 3
Joined: 17 Aug 2013 06:05

Re: [ IMPORTANT ] Last Friday morning, ZFS freeze

Post by btechnet »

You may need to plug a monitor into the server and view the console output when this happens. If there are hard drive timeout errors, CAM will report a status change and print information about what changed. For example:

NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
CAM status: Command timeout
Error 5, Retries exhausted
NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
CAM status: Command timeout
Error 5, Retries exhausted

This will cause ZFS to stop but it will most likely show that the pool is degraded.
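If the console or dmesg output is long, a rough tally of the CAM status lines can show which disk is producing them. The sketch below assumes each line may start with a device prefix such as `(ada1:ahcich1:0:0:0):`; the sample above has no prefix, so such lines are counted under "unknown". The function name `count_cam_status` is made up for this example.

```python
import re
from collections import Counter

# Optional "(device:bus:...):" prefix (an assumption), then the CAM status text.
CAM_STATUS = re.compile(
    r"^(?:\((?P<dev>[^:)]+)[^)]*\):\s*)?CAM status:\s*(?P<status>.+?)\s*$"
)


def count_cam_status(console_text):
    """Count CAM status messages, keyed by (device, status)."""
    counts = Counter()
    for line in console_text.splitlines():
        match = CAM_STATUS.match(line.strip())
        if match:
            device = match.group("dev") or "unknown"
            counts[(device, match.group("status"))] += 1
    return counts
```

Calling `count_cam_status` on saved console text returns a `Counter`; a single device with a high count of "Command timeout" is the one to look at first.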

On the other hand, if your RAM is going bad, that can cause the kernel to deadlock. But usually you get a timeout trap when that happens.
Use memtest to check for bad RAM.

If not, then it may be your boot drive (if the install is not embedded).

Lee Sharp
Advanced User
Posts: 251
Joined: 13 May 2013 21:12

Re: [ IMPORTANT ] Last Friday morning, ZFS freeze

Post by Lee Sharp »

btechnet wrote:
NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
CAM status: Command timeout
Error 5, Retries exhausted
NOP. ACB: 00 00 00 00 00 00 00 00 00 00 00 00
CAM status: Command timeout
Error 5, Retries exhausted

Thanks for that. I saved my dmesg somewhere, but could not find it at the time. :)

And those timeouts start as degraded zpools and corrupt files, but eventually they can hang the entire pool and more.

justin
Starter
Posts: 19
Joined: 22 Jul 2013 15:10

Re: [ IMPORTANT ] Last Friday morning, ZFS freeze

Post by justin »

Thanks for the answers, btechnet and Lee Sharp.

Dmesg: no errors.
Console: I logged in via IPMI and tried to look for errors. Is there a key command to see more console messages? Alt+F4 through Alt+F12 does nothing.

I don't see errors now.
But on Friday, dmesg and the console (Alt+F1) didn't show anything either.

If it happens again, I'll run a memtest.

If I redirect syslog to another server, will console messages be redirected too? Is there a way to do that?

Best regards,
Justin
- NAS4Free 9.1.0.1 x64-full 804 | x64-full on Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 98271MiB RAM | X x YTB WD ZFS mirror striping compressed, Z x YTB WD ZFS raidz2 | 2 SSD ZIL, 1 SSD LOG
- NAS4Free 9.1.0.1 x64-full 804 | x64-full on Intel(R) Xeon(R) CPU E5620 @ 2.40GHz | 98271MiB RAM | Z x YTB WD ZFS raidz2

Lee Sharp
Advanced User
Posts: 251
Joined: 13 May 2013 21:12

Re: [ IMPORTANT ] Last Friday morning, ZFS freeze

Post by Lee Sharp »

When it happens, run dmesg from the console. Compare it to a dmesg run after boot. If the broken dmesg is longer, those later lines are the key.
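The comparison can be sketched in a few lines, assuming both dmesg outputs were saved to text files and that the kernel message buffer has not wrapped in between (if it has, the two outputs no longer share a common start and this simple approach will not line them up). The function name `new_lines` is made up for this example.

```python
def new_lines(boot_text, later_text):
    """Return the lines of later_text that follow the part shared with boot_text."""
    boot = boot_text.splitlines()
    later = later_text.splitlines()
    common = 0
    # Count how many leading lines are identical in both outputs.
    for boot_line, later_line in zip(boot, later):
        if boot_line != later_line:
            break
        common += 1
    # Everything after the shared prefix was logged after the boot-time snapshot.
    return later[common:]
```

Read the two saved files into strings and print what `new_lines` returns. An empty list means the later dmesg added nothing after boot, which would match what justin saw on Friday.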
