
zfs amd64 4gb raidz1 50% slower than bare drive, then locks.

harryc
Starter
Posts: 25
Joined: 08 Nov 2012 22:12
Status: Offline

zfs amd64 4gb raidz1 50% slower than bare drive, then locks.

Post by harryc »

This is a fresh install onto a USB memory stick, and a simple file move crashes ZFS on the system, even though scrubs (though veeerrrry slooow, ~12-18 MB/s) report no errors and smartd reports no errors. After exporting the pool, 'disk info' on the ZFS drives reports individual disk speeds in the expected range: 68 MB/s to 147 MB/s for the spindles and 175 MB/s for the SSD.

I'm moving some video files from one ZFS dataset to another on the same raidz1 zpool, with zero other activity on the system. The files vary in size but are generally 10 GB each. The commands are all given locally on the console, no networking involved: just a simple Unix 'mv'. The target dataset has no dedup and no compression; the source dataset has checksums and compression. Basically, the ZFS side of the system just stops within a couple of minutes: zero write traffic, and all commands referencing ZFS simply hang. 'ls -l <dataset>' never returns. The processors are 99.7% idle, swap space is 0% used, and memory looks like: Mem: 17M Active, 22M Inact, 1881M Wired, 12M Buf, 1531M Free

The original pool had four Hitachi 7200 rpm 'Deskstar' 1.5 TB drives in a raidz1, plus two SSDs offering a bit of cache and ZIL. The pool was created on a FreeBSD install and then imported into NAS4Free, with the same version of ZFS on both. I eliminated the cache and the ZIL from the pool: same issue. I tried the Evil Tuning Guide and the ZFS kernel tune extension, with no meaningful change.

Even just doing reads with dd from a video file to /dev/null, although they complete, the performance is about 80% of the worst-case speed of a single spindle drive, and less than half of a single drive's best-case speed.
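For reference, the read test was along these lines (the file name here is just a placeholder):

nas4free:~# dd if=/mnt/pool1/wip_archive/some_video_file of=/dev/null bs=1m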

Plenty of details at http://www.quietfountain.com/fs1pool1.txt.

The only hunch I have is that write completions are somehow never making it back to ZFS?
The system is a Dell Dimension 9150 (corrected; not a 6150 as I first wrote). All the dmesg output, statuses, and whatnot I could think of are in the link above.

Recap: I really expected this to be easy. I used a completely fresh install with zero changes of any sort: zpool import, zfs mount -a, mv of a 10 GB file from dataset1 to dataset2, and then ZFS hangs after a bit. Any further request to ZFS hangs too (e.g. 'ls -l <dataset2>'). After a while, the only thing that still works is switching among virtual ttys, while a character typed even at the login prompt does not echo. And there it will sit until physically rebooted.
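Spelled out, the whole reproduction is just this (the file name is a placeholder; dataset names as above):

nas4free:~# zpool import pool1
nas4free:~# zfs mount -a
nas4free:~# mv /mnt/pool1/wip_archive/some_video_file /mnt/pool1/videos/

(the mv hangs within a couple of minutes, and after that even 'ls -l /mnt/pool1/videos' never returns)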

Help please!
Last edited by harryc on 24 Jan 2013 00:48, edited 1 time in total.

harryc
Starter
Posts: 25
Joined: 08 Nov 2012 22:12
Status: Offline

Re: zfs amd64 4gb raidz1 50% slower than bare drive?

Post by harryc »

Further detail:

nas4free:~# zpool get all pool1
NAME   PROPERTY       VALUE                SOURCE
pool1  size           5.44T                -
pool1  capacity       55%                  -
pool1  altroot        -                    default
pool1  health         ONLINE               -
pool1  guid           1701438519865110975  default
pool1  version        28                   default
pool1  bootfs         -                    default
pool1  delegation     on                   default
pool1  autoreplace    off                  default
pool1  cachefile      -                    default
pool1  failmode       wait                 default
pool1  listsnapshots  off                  default
pool1  autoexpand     off                  default
pool1  dedupditto     0                    default
pool1  dedupratio     1.46x                -
pool1  free           2.44T                -
pool1  allocated      3.00T                -
pool1  readonly       off                  -
pool1  comment        -                    default
pool1  expandsize     0                    -

Destination dataset (where the 10 GB video files are copied to):

nas4free:~# zfs get all pool1/videos
NAME          PROPERTY              VALUE                 SOURCE
pool1/videos  type                  filesystem            -
pool1/videos  creation              Sat Jan 19 5:12 2013  -
pool1/videos  used                  516G                  -
pool1/videos  available             1.75T                 -
pool1/videos  referenced            516G                  -
pool1/videos  compressratio         1.00x                 -
pool1/videos  mounted               no                    -
pool1/videos  quota                 none                  local
pool1/videos  reservation           none                  local
pool1/videos  recordsize            128K                  default
pool1/videos  mountpoint            /mnt/pool1/videos     inherited from pool1
pool1/videos  sharenfs              off                   default
pool1/videos  checksum              on                    default
pool1/videos  compression           off                   local
pool1/videos  atime                 off                   local
pool1/videos  devices               on                    default
pool1/videos  exec                  on                    default
pool1/videos  setuid                on                    default
pool1/videos  readonly              off                   local
pool1/videos  jailed                off                   default
pool1/videos  snapdir               hidden                local
pool1/videos  aclmode               discard               default
pool1/videos  aclinherit            restricted            default
pool1/videos  canmount              on                    local
pool1/videos  xattr                 on                    default
pool1/videos  copies                1                     default
pool1/videos  version               5                     -
pool1/videos  utf8only              off                   -
pool1/videos  normalization         none                  -
pool1/videos  casesensitivity       sensitive             -
pool1/videos  vscan                 off                   default
pool1/videos  nbmand                off                   default
pool1/videos  sharesmb              off                   default
pool1/videos  refquota              none                  default
pool1/videos  refreservation        none                  default
pool1/videos  primarycache          all                   default
pool1/videos  secondarycache        all                   default
pool1/videos  usedbysnapshots       0                     -
pool1/videos  usedbydataset         516G                  -
pool1/videos  usedbychildren        0                     -
pool1/videos  usedbyrefreservation  0                     -
pool1/videos  logbias               latency               default
pool1/videos  dedup                 off                   local
pool1/videos  mlslabel              -
pool1/videos  sync                  standard              local
pool1/videos  refcompressratio      1.00x                 -
pool1/videos  written               516G                  -
nas4free:~#

Source dataset:

nas4free:~# zfs get all pool1/wip_archive
NAME               PROPERTY              VALUE                   SOURCE
pool1/wip_archive  type                  filesystem              -
pool1/wip_archive  creation              Thu Oct 27 15:48 2011   -
pool1/wip_archive  used                  2.53T                   -
pool1/wip_archive  available             1.75T                   -
pool1/wip_archive  referenced            2.53T                   -
pool1/wip_archive  compressratio         1.02x                   -
pool1/wip_archive  mounted               no                      -
pool1/wip_archive  quota                 none                    local
pool1/wip_archive  reservation           none                    local
pool1/wip_archive  recordsize            128K                    default
pool1/wip_archive  mountpoint            /mnt/pool1/wip_archive  inherited from pool1
pool1/wip_archive  sharenfs              off                     default
pool1/wip_archive  checksum              on                      default
pool1/wip_archive  compression           on                      local
pool1/wip_archive  atime                 on                      local
pool1/wip_archive  devices               on                      default
pool1/wip_archive  exec                  on                      default
pool1/wip_archive  setuid                on                      default
pool1/wip_archive  readonly              off                     local
pool1/wip_archive  jailed                off                     default
pool1/wip_archive  snapdir               hidden                  local
pool1/wip_archive  aclmode               discard                 default
pool1/wip_archive  aclinherit            restricted              default
pool1/wip_archive  canmount              on                      local
pool1/wip_archive  xattr                 on                      default
pool1/wip_archive  copies                1                       default
pool1/wip_archive  version               5                       -
pool1/wip_archive  utf8only              off                     -
pool1/wip_archive  normalization         none                    -
pool1/wip_archive  casesensitivity       sensitive               -
pool1/wip_archive  vscan                 off                     default
pool1/wip_archive  nbmand                off                     default
pool1/wip_archive  sharesmb              off                     default
pool1/wip_archive  refquota              none                    default
pool1/wip_archive  refreservation        none                    default
pool1/wip_archive  primarycache          all                     default
pool1/wip_archive  secondarycache        all                     default
pool1/wip_archive  usedbysnapshots       0                       -
pool1/wip_archive  usedbydataset         2.53T                   -
pool1/wip_archive  usedbychildren        0                       -
pool1/wip_archive  usedbyrefreservation  0                       -
pool1/wip_archive  logbias               latency                 default
pool1/wip_archive  dedup                 verify                  local
pool1/wip_archive  mlslabel              -
pool1/wip_archive  sync                  standard                local
pool1/wip_archive  refcompressratio      1.02x                   -
pool1/wip_archive  written               2.53T                   -

bkk_mike
NewUser
Posts: 1
Joined: 22 Jun 2013 22:20
Status: Offline

Re: zfs amd64 4gb raidz1 50% slower than bare drive, then locks.

Post by bkk_mike »

I figure you've probably got an answer elsewhere by now, since you raised this in January, but looking at your setup, it simply looks like you're memory-starved.

First, something weird for a start: your available memory is a lot lower than it should be. (It made me double-check your log to confirm you were running the 64-bit version.)

real memory  = 4294967296 (4096 MB)
avail memory = 3591544832 (3425 MB)

Losing about 670 MB like that (4096 - 3425 = 671 MB) is the kind of thing you'd expect from 32-bit Windows.

Then, your kmem_size seems low even for the 3.34 GB of available memory:

vm.kmem_size: 1809956864 (about 1.68 GB)

And the kmem size has a knock-on effect on the size of your ARC.
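If you want to experiment before adding RAM, the relevant knobs live in /boot/loader.conf. A minimal sketch, with illustrative starting values for a 4 GB box rather than tested recommendations:

vm.kmem_size="3G"
vm.kmem_size_max="3G"
vfs.zfs.arc_min="512M"
vfs.zfs.arc_max="2G"

The idea is to lift kernel memory above the ~1.68 GB your log shows while still capping the ARC, so the rest of the system keeps some breathing room.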


The next issue, I think, is that your ZIL is smaller than the file you're trying to write. I actually think you'd be better off removing the log device altogether if your ZIL is going to be smaller than the files you're moving around.
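On a version-28 pool, removing the log device is a single command; 'zpool status pool1' shows the real device name, ada4p1 below is only a placeholder:

nas4free:~# zpool remove pool1 ada4p1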

Because of the amount of memory, prefetch is disabled by default (i.e. ZFS doesn't cache sequential reads, which would include things like large 10 GB files). So you're making matters worse by having the L2ARC, which swallows some of your ARC space for its own bookkeeping, while not actually using it. You could turn prefetch on, and since your L2ARC is larger than the file you're transferring, that is possibly useful for the next reason.

You're reading and writing to the same physical disks, so ZFS has to read enough to fill your ARC (which is not a lot of reads), write it back to the same disks (moving the heads), write it to the ZIL at the same time, and then move the heads again to read the next chunk.
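You can watch this happening while the mv runs, using the iostat built into ZFS:

nas4free:~# zpool iostat -v pool1 5

Simultaneous read and write bandwidth on the same vdevs, plus traffic to the log device, is exactly the head-shuffling described above.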


You could try turning off the prefetch-disable flag so that, in theory at least, the whole file gets copied to the SSD cache and you're not shuffling between reading and writing on the same hard disks. Even then, I think you'd have to play around with the kmem_size and the arc_min and arc_max values to find a combination that is stable.
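The flag in question is vfs.zfs.prefetch_disable. You can check the current value at runtime and override the low-memory default in /boot/loader.conf:

nas4free:~# sysctl vfs.zfs.prefetch_disable
vfs.zfs.prefetch_disable: 1

and then in /boot/loader.conf:

vfs.zfs.prefetch_disable="0"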

The better solution is probably to add some memory.

