I'm experiencing very high CPU usage on my new box, and it's essentially killing all performance.
The box is an Intel G31PR motherboard (latest BIOS) with a P4 651 (3.4 GHz HT, Cedar Mill) and 2 GB of RAM.
The controller is a plain ICH7 with no AHCI, so NF is limiting me to 150 MB/s transfers.
The NIC is the onboard RTL8111 gigabit, on a gigabit network.
NF is 9.2.0.1 - Shigawire (revision 972) x64; kernel tuning is on, ZFS tuning has been set for 2 GB of RAM, and HPET is the system timer.
There's a single ZFS pool set up as RAIDZ1 with 3x 1.5 TB HDDs (ST31500341AS, 7200 rpm, 512 bytes per sector, SATA II; each drive is capable of more than 100 MB/s sustained, and all 3 drives have the same firmware version).
Samba is set up as SMB2/local, with AIO, large transfers, and SO_RCVBUF and SO_SNDBUF at 524288.
The problem is that when I start copying data to the NF, performance is very poor and erratic: over SMB it is choppy, with speeds around 50~60 MB/s; FTP fares somewhat better but then drops.
CPU usage is at 95+% (on both "HT" cores). With ZFS compression the transfer is even more erratic, but the CPU load doesn't change.
I then did some tests with dd and file copies.
I can't test compression ON with dd, because FreeBSD's /dev/random performance is horrible (~50 MB/s on my machine at 50% CPU load), so take these numbers with a grain of salt.
Compression OFF, using /dev/zero:
nas4free: /mnt # dd if=/dev/zero of=/mnt/tank/Anime/file1 bs=8m count=500
500+0 records in
500+0 records out
4194304000 bytes transferred in 34.874673 secs (120267909 bytes/sec)
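To compare the dd result with the MB/s figures quoted elsewhere in this post, the byte rate can be converted by hand. This is only a minimal sketch, assuming a POSIX shell with awk; `bytes_to_mib` is a made-up helper name, not something dd or NAS4Free provides.

```shell
# Convert a bytes-per-second figure (as printed by dd) to MiB/s.
# bytes_to_mib is a hypothetical helper used only for illustration.
bytes_to_mib() {
    awk -v b="$1" 'BEGIN { printf "%.1f\n", b / 1048576 }'
}

# Total size written by the dd run above: bs=8m count=500
echo $((8 * 1024 * 1024 * 500))   # matches the 4194304000 bytes dd reports

# dd's reported rate, in MiB/s
bytes_to_mib 120267909
```

That works out to roughly 114.7 MiB/s of local sequential writes, about twice the 50~60 MB/s seen over SMB.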
zpool iostat concurs:
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        1.87G  4.06T      1  1.20K    613   125M
  raidz1    1.87G  4.06T      1  1.20K    613   125M
    ada1        -      -      0    575    306  62.2M
    ada2        -      -      0    576    306  62.3M
    ada3        -      -      0    585      0  63.5M
----------  -----  -----  -----  -----  -----  -----
               capacity     operations    bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        2.83G  4.06T      1  1.07K    715   107M
  raidz1    2.83G  4.06T      1  1.07K    715   107M
    ada1        -      -      0    509    204  53.4M
    ada2        -      -      0    508    102  53.4M
    ada3        -      -      0    488    408  51.2M
----------  -----  -----  -----  -----  -----  -----
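The per-disk numbers add up to more than the pool figure, which is what you would expect if one disk's worth of every 3-disk RAIDZ1 stripe is parity. A rough sketch of that arithmetic, assuming full-width stripes and a POSIX shell with awk; `raidz1_data_rate` is a hypothetical helper name.

```shell
# Sum per-disk write bandwidth (MB/s) and estimate the data share on RAIDZ1.
# raidz1_data_rate is a hypothetical helper; its arguments are per-disk MB/s.
raidz1_data_rate() {
    awk 'BEGIN {
        n = ARGC - 1
        for (i = 1; i < ARGC; i++) total += ARGV[i]
        # Assumption: roughly one disk in n carries parity for each stripe.
        printf "raw=%.1f data=%.1f\n", total, total * (n - 1) / n
    }' "$@"
}

raidz1_data_rate 62.2 62.3 63.5   # first sample
raidz1_data_rate 53.4 53.4 51.2   # second sample
```

The estimated data rates come out at about 125 and 105 MB/s, close to the 125M and 107M that zpool iostat shows for the pool, so the disks are writing roughly 1.5x what dd sees.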
last pid: 23660; load averages: 2.97, 1.71, 0.94 up 0+02:04:09 01:52:06
39 processes: 2 running, 37 sleeping
CPU: 1.6% user, 0.0% nice, 35.9% system, 3.9% interrupt, 58.6% idle
Mem: 43M Active, 43M Inact, 676M Wired, 22M Buf, 1184M Free
ARC: 511M Total, 29M MFU, 273M MRU, 207M Anon, 2079K Header, 947K Other
Swap: 2047M Total, 2047M Free
In gstat, drives 0 and 1 (ada1 and ada2 in the output below) stay between 85 and ~97% busy, in red all the time, but ada3 sits between 67 and 87%, mostly at 67, and is in purple all the time. Check out the difference in ms/w! This drive is newer than the others (a refurbished replacement for a failed one from that batch).
dT: 1.010s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      2      0      0    0.0      2      8    3.9    0.4| ada0
    9    603      1      0    0.1    600  66862   12.2   97.8| ada1
    0      2      0      0    0.0      2      8    5.2    0.5| ada0s1
    0      0      0      0    0.0      0      0    0.0    0.0| ada0s2
    0      0      0      0    0.0      0      0    0.0    0.0| ada0s3
   10    601      1      0    0.2    598  66609   12.3   97.8| ada2
   10    599      0      0    0.0    597  67993    8.7   76.7| ada3
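One way to read this gstat sample is to check whether ada3 is simply being handed smaller writes than its siblings, which could explain the lower %busy on its own. A small sketch, assuming a POSIX shell with awk; `avg_write_kb` is a hypothetical helper that divides the write kBps column by the w/s column.

```shell
# Average size of one write, in kB, from gstat's w/s and write kBps columns.
# avg_write_kb is a hypothetical helper: avg_write_kb <w/s> <kBps>
avg_write_kb() {
    awk -v w="$1" -v k="$2" 'BEGIN { printf "%.1f\n", k / w }'
}

avg_write_kb 600 66862   # ada1
avg_write_kb 598 66609   # ada2
avg_write_kb 597 67993   # ada3
```

All three drives come out at roughly 111 to 114 kB per write at about 600 writes per second, so in this sample the workload looks the same on each disk and the difference is in how quickly ada3 completes each write (8.7 ms/w versus about 12 ms/w).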
FTP transfer of a single large file to the NAS:
Erratic speed, CPU usage through the roof.
last pid: 29435; load averages: 2.96, 2.03, 1.42 up 0+02:29:26 02:17:23
40 processes: 2 running, 37 sleeping, 1 zombie
CPU: 2.4% user, 0.0% nice, 50.2% system, 40.6% interrupt, 6.9% idle
Mem: 39M Active, 44M Inact, 702M Wired, 22M Buf, 1162M Free
ARC: 512M Total, 23M MFU, 375M MRU, 111M Anon, 2257K Header, 1157K Other
Swap: 2047M Total, 2047M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
29249 root 1 89 0 62696K 15308K RUN 0 0:21 58.98% proftpd
input (re0) output
packets errs idrops bytes packets errs bytes colls
2 0 0 132 2 0 396 0
3844 0 0 5733736 3837 0 254784 0
64352 0 0 95976326 64388 0 4249290 0
64717 0 0 96484954 64721 0 4275466 0
68855 0 0 102643134 68871 0 4546431 0
66512 0 0 99156296 66522 0 4392896 0
55658 0 0 82970540 55660 0 3673920 0
3 0 0 198 3 0 946 0
10518 0 0 15678388 10517 0 695130 0
68130 0 0 101573496 68138 0 4499076 0
65915 0 0 98263544 65916 0 4352149 0
72256 0 0 107711472 72268 0 4770752 0
65132 0 0 97106376 65138 0 4299666 0
63118 0 0 94092764 63126 0 4167494 0
73729 0 0 109907290 73734 0 4866820 0
59704 0 0 88999366 59710 0 3941528 0
62970 0 0 93880596 62967 0 4156842 0
71995 0 0 107322750 72007 0 4754291 0
64430 0 0 96046220 64437 0 4253876 0
71760 0 0 106971944 71769 0 4736992 0
62288 0 0 92855816 62294 0 4111796 0
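The netstat counters can also be turned into average packet size and link rate, which may help with the interrupt question. A sketch under assumptions: the column order is taken from the header above (in-packets, errs, idrops, in-bytes, out-packets, errs, out-bytes, colls), and `netstat_summary` is a hypothetical helper that reads such lines on stdin.

```shell
# Summarise per-second netstat lines: bytes per incoming packet, incoming
# Mbit/s, and outgoing packets per incoming packet.
# netstat_summary is a hypothetical helper; header lines are skipped.
netstat_summary() {
    awk '$1 ~ /^[0-9]+$/ && $1 > 0 {
        printf "%.0f B/pkt, %.0f Mbit/s in, %.2f out/in pkts\n", $4 / $1, $4 * 8 / 1000000, $5 / $1
    }'
}

# One line from the FTP sample above
echo "64352 0 0 95976326 64388 0 4249290 0" | netstat_summary
```

For that line the result is about 1491 bytes per packet at about 768 Mbit/s, with roughly one outgoing packet for every incoming one. That looks like standard 1500-byte frames rather than jumbo frames, so the NIC is handling on the order of 64k packets per second in each direction, which could plausibly account for part of the interrupt load.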
Again, top shows the CPU getting killed:
last pid: 30673; load averages: 2.30, 1.71, 1.42 up 0+02:33:48 02:21:45
38 processes: 3 running, 35 sleeping
CPU: 8.6% user, 0.0% nice, 52.4% system, 32.9% interrupt, 6.1% idle
Mem: 29M Active, 43M Inact, 701M Wired, 22M Buf, 1172M Free
ARC: 512M Total, 1K MFU, 356M MRU, 152M Anon, 2678K Header, 1164K Other
Swap: 2047M Total, 2047M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
8049 root 1 86 0 82788K 19356K RUN 1 4:48 50.00% smbd
input (re0) output
packets errs idrops bytes packets errs bytes colls
42042 0 0 63439194 42942 0 2912424 0
75146 0 0 113546380 76677 0 5200314 0
37334 0 0 56247780 38100 0 2586637 0
78589 0 0 118657410 80151 0 5435886 0
40110 0 0 60447380 40938 0 2777780 0
56646 0 0 85489892 57753 0 3920154 0
33252 0 0 50276861 33968 0 2305088 0
44161 0 0 66607224 45101 0 3059984 0
58872 0 0 88896837 60069 0 4074412 0
25376 0 0 38302440 25882 0 1756371 0
79420 0 0 119958272 81018 0 5496478 0
76142 0 0 115039612 77675 0 5267694 0
38285 0 0 57741066 39072 0 2649512 0
61560 0 0 92889456 62788 0 4257752 0
67621 0 0 102146176 69031 0 4683060 0
44574 0 0 67325644 45525 0 3087026 0
46240 0 0 69675560 47163 0 3200163 0
78792 0 0 119026976 80380 0 5450686 0
50105 0 0 75644986 51128 0 3467700 0
44378 0 0 66887580 45299 0 3072494 0
79251 0 0 119646654 80803 0 5480774 0
Read tests:
Reading a file does not show the same erratic behaviour: it starts at ~80% network utilization, then after a little more than half the file is transferred it drops to around 57% and stays there, but the transfer graph is flatter.
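For reference, those network percentages translate into approximate MB/s as follows. This is a sketch that assumes the percentage is relative to a 1 Gbit/s link (125 MB/s before protocol overhead); `pct_to_mbs` is a hypothetical helper name.

```shell
# Convert a link-utilization percentage on gigabit Ethernet to MB/s.
# pct_to_mbs is a hypothetical helper; assumes 1 Gbit/s = 125 MB/s raw.
pct_to_mbs() {
    awk -v p="$1" 'BEGIN { printf "%.0f\n", p * 125 / 100 }'
}

pct_to_mbs 80   # start of the read
pct_to_mbs 57   # after the drop
```

So the read starts at roughly 100 MB/s and settles at roughly 71 MB/s, in the same range as the byte counts in the netstat samples that follow.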
CPU usage is still through the roof:
CPU: 11.2% user, 0.0% nice, 37.6% system, 35.3% interrupt, 15.9% idle
Mem: 34M Active, 47M Inact, 678M Wired, 22M Buf, 1187M Free
ARC: 512M Total, 12M MFU, 497M MRU, 272K Anon, 1756K Header, 1047K Other
Swap: 2047M Total, 2047M Free
PID USERNAME THR PRI NICE SIZE RES STATE C TIME WCPU COMMAND
8049 root 1 90 0 86824K 23368K RUN 0 3:08 57.96% smbd
input (re0) output
packets errs idrops bytes packets errs bytes colls
27966 0 0 2010492 64831 0 96488580 0
29482 0 0 2121780 69468 0 103262075 0
29263 0 0 2103582 67870 0 100970058 0
27382 0 0 1970076 64515 0 96077110 0
30463 0 0 2188398 70379 0 104727138 0
30152 0 0 2166000 69315 0 103037552 0
29556 0 0 2123342 67530 0 100248002 0
30601 0 0 2195770 70095 0 104448506 0
21676 0 0 1552296 48629 0 72038937 0
13060 0 0 937080 28714 0 42614220 0
27136 0 0 1950096 62721 0 93410386 0
18945 0 0 1358946 43831 0 64964822 0
19921 0 0 1427106 43875 0 64969882 0
19808 0 0 1420062 44484 0 66147154 0
18414 0 0 1322028 41731 0 62220042 0
20527 0 0 1459614 46978 0 69578295 0
19590 0 0 1418500 44373 0 65774374 0
packets errs idrops bytes packets errs bytes colls
33109 0 0 2185194 28422 0 114730426 0
32606 0 0 2151996 27276 0 113629798 0
33594 0 0 2217618 29961 0 115448840 0
29690 0 0 1959540 20988 0 103360830 0
29492 0 0 1946472 26328 0 104963565 0
32327 0 0 2133582 29589 0 112333250 0
33516 0 0 2212056 30994 0 114397604 0
30729 0 0 2028114 27870 0 112994556 0
34125 0 0 2252250 33388 0 118170854 0
31999 0 0 2112348 23621 0 111804194 0
30866 0 0 2037156 26120 0 108785584 0
29870 0 0 1971420 26584 0 108504330 0
22052 0 0 1455894 19753 0 77006738 0
17987 0 0 1187142 13505 0 61662670 0
16747 0 0 1105302 12847 0 61329272 0
18718 0 0 1235802 13626 0 62231770 0
18385 0 0 1213410 14831 0 63294738 0
16677 0 0 1100682 13451 0 58835133 0
18783 0 0 1239678 14697 0 63030224 0
18361 0 0 1211962 14614 0 61782360 0
18665 0 0 1231890 14774 0 61792728 0
Now, even if this is an "old" P4, it's still one of the latest models out (in fact it's the latest P4 core), and they had excellent integer performance. It shouldn't be doing this!
Why is FTP pumping a solid 10 MB/s more than SMB?
Why are read speeds tanking after some time? (The ARC is at 512 MB and the drop happens way past that.)
Why are interrupts eating so much CPU?
Any ideas?

