I have an existing ZFS array, RAIDZ1, about 80% full.
Gigabyte GA-D510UD board
4GB RAM
Drives:
Hitachi HDS721010CLA332 1TB (s/n ends JSPH)
Hitachi HDS721010CLA332 1TB (s/n ends MJSC)
Samsung HD103UJ 1TB
Seagate ST2000DM001 2TB (1TB unused)
Under NAS4Free, when reading files or doing a scrub, the following inevitably happens:
- drives become unresponsive (e.g. "ls" of a directory on the pool blocks)
- drive activity stops completely
- the Hitachi drive with s/n ending MJSC has its activity light frozen on, and iostat -x 5 shows a queue length > 0 for it (all other drives show zero)
- I get the following logged:
kernel: (ada2:ata3:0:0:0): WRITE_DMA. ACB: ca 00 7a 02 40 40 00 00 00 00 02 00
kernel: (ada2:ata3:0:0:0): CAM status: Command timeout
kernel: (ada2:ata3:0:0:0): Retrying command
- about 2 minutes later everything springs back to life - pending commands complete
Invariably the same thing happens again every few minutes during drive activity.
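To put numbers on "every few minutes", a minimal Python sketch along these lines could tally the timeouts per device from the system log. It assumes the log lines have exactly the format quoted above; the function name count_timeouts and the sample list are made up for illustration, and in practice the lines would come from wherever the kernel messages are logged.

```python
import re
from collections import Counter

# Matches lines such as:
#   kernel: (ada2:ata3:0:0:0): CAM status: Command timeout
# Assumption: the device and bus names are the first two colon-separated fields.
TIMEOUT_RE = re.compile(r"\((\w+):(\w+):\d+:\d+:\d+\): CAM status: Command timeout")


def count_timeouts(lines):
    """Return a Counter mapping (device, bus) to the number of timeouts seen."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# The three lines from the excerpt above, used here as sample input.
sample = [
    "kernel: (ada2:ata3:0:0:0): WRITE_DMA. ACB: ca 00 7a 02 40 40 00 00 00 00 02 00",
    "kernel: (ada2:ata3:0:0:0): CAM status: Command timeout",
    "kernel: (ada2:ata3:0:0:0): Retrying command",
]

for (device, bus), n in count_timeouts(sample).most_common():
    print(f"{device} on {bus}: {n} timeout(s)")
```

Running the same tally after moving the drive to another port would show whether the timeouts move with it to the new device/bus pair.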
I got the same under FreeNAS 8, which is one of the main reasons I came back to NAS4Free.
Under Ubuntu with the zfsonlinux bits installed, it works perfectly. I.e., for the exact same usage pattern, under Ubuntu I CANNOT provoke these symptoms, but under NAS4Free and FreeNAS 8 I cannot escape them!
Here's what's weird.
I have swapped out the SATA controller, same problem.
Also swapped out the SATA cables, same problem.
Also swapped out the power supply, same problem.
Even weirder, I have dd'd the freezing disk to a brand new identical drive, and the new drive does the same thing.
Even more weird, if I move the drive to a different SATA port, the problem FOLLOWS THE DRIVE,
even though it's a different, brand new drive.
Completely perplexed.
Any suggestions massively appreciated.

