
Low throughput via my HBA

Posted: 12 Dec 2014 17:54
by sunshine
I've got a performance problem and I'm looking for ideas.

System:
- Dell PE 2950 III, 32GB ECC, PCIe v1.x
- Dell MD1000 disk expansion (15-disk SAS/SATA expander) with two controller modules, running in split mode (one controller takes 7 disks, the other takes 8; both controllers are directly connected to my HBA)
- 15 x 1TB Hitachi enterprise SATA disks, 7.2k RPM, in the MD1000
- Dell SAS 6Gbps HBA (same as the LSI 9200-8e, SAS2008 chip), installed in a PCIe 1.x x8 slot, with two connections to the MD1000
- Dell PERC 6/i RAID controller with two Intel 320 series SSDs, used for LOG and CACHE

The problem:
Low throughput from the MD1000 to the system, and not just via ZFS: this is a raw IO issue that limits ZFS performance, not a ZFS performance issue.

I've done a series of tests from the NAS4Free console, and long story short, I cannot seem to get more than about 465MB/sec from the array during sequential reads, regardless of how many disks are being read.

My test approach is as follows:
Read the devices directly using dd, dumping the output to /dev/null. No ZFS involvement; this is raw data, raw throughput.

As root:

Code: Select all

# monitor the first half (7 disks), averaged over 120-second intervals
iostat da0 da1 da2 da3 da4 da5 da6 120 &
sleep 1
# monitor the second half (8 disks), offset by a second so the output doesn't interleave
iostat da7 da8 da9 da10 da11 da12 da13 da14 120 &
# start a sequential read of every disk at once, alternating between the two halves
for X in 0 7 1 8 2 9 3 10 4 11 5 12 6 13 14; do dd if=/dev/da$X of=/dev/null bs=1M & done
To gradually reduce the load, I simply fg and kill off two dd's at a time while observing the results.
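
In practice that's just something like this (hypothetical job numbers, one dd per side of the MD1000):

Code: Select all

jobs            # list the backgrounded dd's and their job numbers
kill %16 %17    # kill two at a time (job numbers here are examples)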

Results:
With the test running, I see per-device IO stats updated every two minutes. When reading all 15 HDDs, I see an average of 31MB/sec per device. Since the MD1000 is split into 7 and 8 drives, the 7-drive half showed slightly higher per-drive read throughput than the 8-drive half.

Then I killed off the dd's two at a time, leaving one less drive taxing each half of the MD1000 and each HBA interconnect. As I killed off dd's, the per-drive read rate increased. To approach the maximum sequential read rate of 80MB/sec per drive, I had to cut back to just three drives per side of the MD1000 (i.e. 3 drives per SAS connection to the HBA), which yielded about 73MB/sec per drive. I CAN get 80+MB/sec per drive, but only by dropping to just two drives per MD1000 connection.

Regardless of the number of drives, the combined throughput remains about the same.
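The aggregate math works out to roughly the same ceiling either way:

Code: Select all

echo "$(( 15 * 31 )) MB/sec with all 15 drives reading"      # ~465MB/sec aggregate
echo "$(( 6 * 73 )) MB/sec with 3 drives per side reading"   # ~438MB/sec aggregate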

What next?! Where's the bottleneck?
I expected performance to be better, and to be able to max out each HDD's read/write capability. Each disk is capable of 80MB/sec sequential reads individually, and the interconnect should easily have headroom for that: there are two SAS 3Gbps x4 connections, one to the 7-disk side and one to the 8-disk side.
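
Back-of-the-envelope, assuming the usual 8b/10b encoding overhead on 3Gbps SAS links:

Code: Select all

echo "usable per x4 port: $(( 4 * 300 )) MB/sec"   # 3Gbps per lane is ~300MB/sec after 8b/10b
echo "worst-case demand: $(( 8 * 80 )) MB/sec"     # 8 drives at 80MB/sec each fits easily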

Because the max throughput stays the same regardless of the number of drives being read, I believe the issue is one of the following:
- the MD1000 is incapable of sustaining higher read throughput per controller
- the SAS 6Gbps HBA (LSI 9200-8e, SAS2008) is incapable of sustaining a higher transfer rate per connection, possibly due to PCIe 1.x (quick check below), or maybe it's just not a good HBA
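
For the PCIe angle, the one quick sanity check I know of on FreeBSD (mps0 is my guess at the HBA's driver instance; pciconf -l will show the real name):

Code: Select all

# the PCI-Express capability line ends with something like "link x8(x8)",
# i.e. negotiated link width versus maximum link width
pciconf -lc mps0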

Does anyone have any suggestions for how I might go about troubleshooting this, or tuning things?

Re: Low throughput via my HBA

Posted: 14 Dec 2014 01:24
by substr
What happens if you try a smaller or larger block size? (512k or 2M?)

Re: Low throughput via my HBA

Posted: 14 Dec 2014 01:51
by sunshine
substr wrote: What happens if you try a smaller or larger block size? (512k or 2M?)
Marginal difference in performance. I found 1M to be the sweet spot.
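
Nothing fancy, just a sweep along these lines (da0 as an example target; the total bytes read varies with bs, but it's good enough for a comparison):

Code: Select all

for BS in 512k 1M 2M; do
  echo "bs=$BS"
  # FreeBSD dd reports the transfer rate on stderr when it finishes
  dd if=/dev/da0 of=/dev/null bs=$BS count=4096 2>&1 | tail -1
done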

Also - I tested the same configuration but swapped the older Dell PE 2950 III for a newer desktop-class system with a PCIe v3 bus, installing the HBA in an x8 slot. Performance increased to an average of 45MB/sec per disk with all 15 read simultaneously. Better, but still a far cry from 80MB/sec each.

Re: Low throughput via my HBA

Posted: 23 Dec 2014 09:14
by 00Roush
My initial thought is the HBA is not working properly, or is just not capable of higher speeds. Normally what I would try is a similar test with a Windows OS if possible. That shows what the best case might be when everything is definitely working properly; then compare that to the FreeBSD speeds.

I recall seeing folks mention flashing RAID/HBA cards based on the LSI 9200 to IT-mode firmware, but I'm not familiar with it.

00Roush