
ZFS max performance, PURE SSD?

Posted: 24 Dec 2014 02:22
by maverickhunterx
Hello, I currently have a 24 bay super micro chassis with 48GB of RAM.

My current set up is as follows

zpool1 -
5 x 7200RPM SAS drives in RAIDZ1
1 x 128GB SSD for L2ARC
1 x 128GB SSD for ZIL, mirrored

zpool2 -
5 x 7200RPM SAS drives in RAIDZ1
1 x 128GB SSD for L2ARC
1 x 128GB SSD for ZIL, mirrored

Both pools are shared via iSCSI over four 10Gb EtherChannel connections.

I get that I probably just should have mirrored the pool entirely.

Here is what I want to accomplish,

I want to host about 60 VMs with various workloads as best I can on this chassis. I have the budget to populate it entirely with 512GB SSDs, but I'm wondering whether that is the best way to maximize storage capacity and performance. My current setup is not exactly fast: when I shift some VMs to internal storage on the hosts, they perform much better. Network throughput to the system seems relatively low most of the time because this is TRULY not in production yet... but it will be in a week or so.

I'd like help determining the best setup if I were to go PURE SSD. For example, would I still need a ZIL and L2ARC with all-SSD disks? Could I get away with buying 18 x 3TB 15k SAS disks and front-ending them with a mirrored ZIL and an L2ARC for each pool? Or should I mirror the vdevs and form one single pool fronted by 4 ZILs and a mirrored L2ARC? These combinations are killing me. I wish I had the budget for something like a Compellent or IBM Storwize V7000 and could call it a day, but right now that is not an option.

All that being said, I am currently using iSCSI with VMware and EtherChannel, and throughput does not seem to be the issue: I have run iperf and pushed 10Gbps no problem. I think my bottleneck is "spindles". Is there anything I should tweak inside of nas4free with pure SSDs? I've also thought about switching from iSCSI to NFS.

Re: ZFS max performance, PURE SSD?

Posted: 25 Dec 2014 00:05
by Onichan
Well, if the whole pool were SSDs, then in theory there wouldn't be much point in an SSD ZIL or L2ARC. Not sure if that has been tested, though.

First off let me say I am not an expert so a second opinion would be good.

Anyway, if I understand it, you have two pools; each pool has 5 HDDs in RAIDZ1 plus an SSD for L2ARC and one for ZIL. The first potential problem might be your IOPS. For random IO, ZFS gets roughly the IOPS of a single disk per vdev, so each of your pools has a theoretical maximum of about 120 IOPS (7200 RPM / 60 = 120 rotations per second), which is poor, especially when you are running VMs, which are IOPS-hungry. Normally you would want many mirrored vdevs in a single pool if you need a lot of IOPS.
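The arithmetic above can be sketched quickly. This is just back-of-the-envelope math using the rule of thumb from this thread (one disk's worth of random IOPS per vdev), not a benchmark:

```python
# Rough IOPS estimate for the layouts discussed in this thread.
# Rule of thumb: a 7200 RPM disk does ~120 random IOPS
# (7200 rotations/min / 60 s = 120 rotations/s).
rpm = 7200
iops_per_vdev = rpm // 60          # ~120 IOPS, one vdev's worth

# A RAIDZ1 vdev performs roughly like a single disk for random IO,
# so two pools of one RAIDZ1 vdev each get ~120 IOPS apiece.
pools = 2
raidz_total = pools * iops_per_vdev

# The same 10 data disks as 5 striped mirror vdevs in one pool
# would give roughly one disk's IOPS per mirror vdev.
mirror_vdevs = 5
mirror_total = mirror_vdevs * iops_per_vdev

print(raidz_total)    # 240
print(mirror_total)   # 600
```

Same spindles, roughly 2.5x the random IOPS, which is why mirrors are the usual answer for VM workloads.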

Now, yes, the ZIL should be absorbing a good chunk of your write IO (technically, async writes buffer in RAM to a certain extent, but VM writes over iSCSI should be sync, so that doesn't help much here). But there is such a massive range of SSDs that I have no idea how good that SSD really is. I'll assume it is at least halfway decent and giving you decent write IOPS, but it could be a bottleneck.

Next, the L2ARC is giving you some faster read cache, but any read the VMs make that isn't in cache has to go to disk, so that could be a problem.

So I'm not really sure where your bottleneck is; you have a couple of things that could be a problem. I would first look up the specs of those SSDs to see what kind of IOPS they deliver, and how well they hold up under sustained load, as many consumer drives don't handle heavy usage well. Then check the cache hit statistics. If you are going to disk frequently, then that is definitely a problem, because your pools are slow.
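On the FreeBSD base that nas4free runs on, the ARC counters are exposed via `sysctl kstat.zfs.misc.arcstats`. A minimal sketch of the hit-ratio check, with made-up placeholder numbers (substitute your own counter values):

```python
# Sketch: compute the ARC hit ratio from the arcstats counters.
# On nas4free/FreeBSD: sysctl kstat.zfs.misc.arcstats.hits and
# kstat.zfs.misc.arcstats.misses. Values below are placeholders.
hits = 9_500_000      # example value for arcstats.hits
misses = 500_000      # example value for arcstats.misses

hit_ratio = hits / (hits + misses)
print(f"ARC hit ratio: {hit_ratio:.1%}")   # ARC hit ratio: 95.0%
```

As a rough guide, a ratio well below ~90% on a VM workload means a lot of reads are falling through to those slow RAIDZ1 vdevs.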

There isn't anything wrong with having slow disks with a fast cache in front. In fact, that's the big thing nowadays: flash is expensive, and you only need a small amount of it to serve the majority of hits from cache. The problem is finding the right balance.

Re: ZFS max performance, PURE SSD?

Posted: 25 Dec 2014 02:19
by Lee Sharp
Welcome to the rabbit hole! I spent about 80 hours here last year. Thank God it was billable!

First, you have a lot of stuff working against you, chiefly the way VMware handles writes (I'm assuming VMware, as Xen is not as bad about this). It does not acknowledge a write until it is committed to permanent media, not a RAM cache. Yes, you can cheat, but the official way to cut some of the wait without eventually breaking ZFS is setting the cache flush disable flag. Background links:
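For reference, on the FreeBSD base that nas4free uses, the flag referred to above is (to my understanding) this loader tunable; treat the exact name as something to verify against your release:

```shell
# Assumption: the "cache flush disable" flag is this FreeBSD loader
# tunable. Add to /boot/loader.conf and reboot.
# WARNING: this tells ZFS to stop issuing cache-flush commands to
# devices. Only safe with power-protected (supercap/battery-backed)
# ZIL devices; on ordinary disks it risks data loss on power failure.
vfs.zfs.cache_flush_disable="1"
```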

http://christopher-technicalmusings.blo ... h-zil.html

http://forums.freebsd.org/showthread.php?t=30856

Next, you do not have your drives optimized for IOPS. RAIDZ1 is solid, but slow. The fastest way to arrange your disks is to stripe a bunch of mirror vdevs: build several 2-drive mirrored vdevs, then stripe them all in a single pool. Yes, if all-flash, this will be FASTER than a flash ZIL device.
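The striped-mirror layout above looks like this in a `zpool create` command. The pool and device names here are hypothetical (da0..da9 on FreeBSD-style naming); substitute your own:

```shell
# Sketch: one pool striped across five 2-disk mirror vdevs.
# Each "mirror daX daY" clause is one vdev; listing several of them
# in a single create command stripes the pool across all of them.
zpool create tank \
  mirror da0 da1 \
  mirror da2 da3 \
  mirror da4 da5 \
  mirror da6 da7 \
  mirror da8 da9

# Check the resulting layout:
zpool status tank
```

You lose half the raw capacity versus RAIDZ1, but random IOPS scale with the number of vdevs, which is what VMs need.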

But this still may not be what you need. VMware (especially Horizon View) can be a pig on storage. Infinio makes a very nice RAM-cache VM, and PernixData makes one that uses local SSD or RAM for a bit more money. Even with the cost of Infinio, you are still cheaper than a Tintri box! :)