This is the old XigmaNAS forum in read only mode,
it will be taken offline by the end of March 2021!
I'd like to ask users and admins to rewrite/take over important posts from here into the new main forum!
It's not possible for us to export from here and import into the main forum!
RTL8169 NIC Drops Connection Under Load
-
bjones371
- NewUser

- Posts: 9
- Joined: 17 Nov 2014 11:49
- Status: Offline
RTL8169 NIC Drops Connection Under Load
Hi,
I've set up N4F using SMB to share a ZFS Dataset with compression enabled to a Windows Server so I can take backups. After some tweaking with ZFSKernTune and a few other recommended best practices I've managed to get the throughput quite high, however the network card dies after a period of time under load, and requires a reboot of N4F to come back.
I have a 10/100 onboard card which is connected and used purely for management and is on my normal LAN, and a 10/100/1000 RTL8169 PCI card which is the card bound to the CIFS service and the one being used for the data transfer. The PCI card is connected to a separate switch off the main LAN, as is another PCI card in the Windows Server, so the data transfer is going across its own switch with no other data. When the problem occurs, the PCI card in the N4F box stops responding to ping and I can no longer connect to the CIFS share. If I connect to the WebUI on the LAN card then I can reboot N4F and the share becomes accessible again, but it drops the connection again under load.
Are there any log files I can obtain that will help diagnose this issue? Hardware is a Core 2 Quad Q6600 and 4GB (2 x 2GB) RAM. The RAM usage does not go above 60% during file transfers according to Status > System. I'm running the embedded version of N4F.
Thanks,
B
- b0ssman
- Forum Moderator

- Posts: 2438
- Joined: 14 Feb 2013 08:34
- Location: Munich, Germany
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
get an intel card. dont waste your time with realtek crap.
Nas4Free 11.1.0.4.4517. Supermicro X10SLL-F, 16gb ECC, i3 4130, IBM M1015 with IT firmware. 4x 3tb WD Red, 4x 2TB Samsung F4, both GEOM AES 256 encrypted.
-
bjones371
- NewUser

- Posts: 9
- Joined: 17 Nov 2014 11:49
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
b0ssman wrote:get an intel card. dont waste your time with realtek crap.
That was my first thought and I'm hunting around to see if I've got one I can try instead, but all I've found so far is another Realtek based one.
Just thought I'd post here to see if there was anything else I could look in to in the meantime.
-
bjones371
- NewUser

- Posts: 9
- Joined: 17 Nov 2014 11:49
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
Definitely looking like a NIC problem, tried running a backup through the 10/100 onboard NIC instead and reached 30GB without issue, normally it'd bomb out within the first 5GB. Will try the other RTL card I have (8168-based rather than 8169, though I don't hold much hope since it's the re(4) driver for both of those cards unless I try compiling a new one) out of curiosity, and look to move to an Intel-based card in the future.
- ChriZathens
- Forum Moderator

- Posts: 758
- Joined: 23 Jun 2012 09:14
- Location: Athens, Greece
- Contact:
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
Yeap, Intel cards work much better in *nix systems....
Having said that, my NAS uses a Realtek card and I have transferred as much as 3TB of data at once without issues.
In fact I have an Intel PCI-E card lying at my desk, but since I have no issues with the Realtek card, I am too lazy to plug the Intel..
But perhaps I am just one of a few lucky ones..
My Nas
Backup Nas: U-NAS NSC-400, Gigabyte MB10-DS4 (4x4TB Seagate Exos disks in RaidZ configuration - 32GB RAM)
- Case: Fractal Design Define R2
- M/B: Supermicro x9scl-f
- CPU: Intel Celeron G1620
- RAM: 16GB DDR3 ECC (2 x Kingston KVR1333D3E9S/8G)
- PSU: Chieftec 850w 80+ modular
- Storage: 8x2TB HDDs in a RaidZ2 array ~ 10.1 TB usable disk space
- O/S: XigmaNAS 11.2.0.4.6625 -amd64 embedded
- Extra H/W: Dell Perc H310 SAS controller, crosflashed to LSI 9211-8i IT mode, 8GB Innodisk D150SV SATADOM for O/S
-
bjones371
- NewUser

- Posts: 9
- Joined: 17 Nov 2014 11:49
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
Found a 1Gbps Intel NIC and the whole thing is so much more responsive, even things like browsing to the share via SMB are instant now rather than suffering from a few seconds worth of delays... Didn't think it would have made that much of a difference!
Running a backup to it now, hopefully will get some sustained high throughput, it's started at around 50-70MB/s, with Task Manager averaging at around 350Mb/s. Strange how speeds were fine in iPerf but no good when trying to use Samba!
Anyway, all sorted - thanks folks.
EDIT: Spoke too soon! The throughput was much higher while it worked, but it's now bombed again with exactly the same symptoms...
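A note on the units in the figures above: MB/s is megabytes per second and Mb/s is megabits per second, so the two differ by a factor of 8. A minimal Python sketch of the conversion follows; the function names are only illustrative and are not from any tool mentioned in the thread.

```python
def megabytes_to_megabits(mb_per_s):
    """Convert megabytes per second to megabits per second (1 byte = 8 bits)."""
    return mb_per_s * 8


def megabits_to_megabytes(mbit_per_s):
    """Convert megabits per second to megabytes per second."""
    return mbit_per_s / 8


if __name__ == "__main__":
    # Figures quoted in the post: 50-70 MB/s at the start, ~350 Mb/s average.
    print(megabytes_to_megabits(50), megabytes_to_megabits(70))
    print(megabits_to_megabytes(350))
```

By this arithmetic 50-70 MB/s corresponds to 400-560 Mb/s, and a 350 Mb/s average is about 43.75 MB/s, so the two readings are roughly in the same range if the average includes slower periods.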
-
bjones371
- NewUser

- Posts: 9
- Joined: 17 Nov 2014 11:49
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
Getting a lot of em0: watchdog timeout -- resetting errors logged when the issue occurs if that helps.
- ChriZathens
- Forum Moderator

- Posts: 758
- Joined: 23 Jun 2012 09:14
- Location: Athens, Greece
- Contact:
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
-
bjones371
- NewUser

- Posts: 9
- Joined: 17 Nov 2014 11:49
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
Thanks, I'd spotted that earlier but couldn't get the settings to "stick" after a reboot, but realised it's because I'm running embedded so needed to remount /cf in rw to edit the /cf/boot/loader.conf. Unfortunately it seems to have made the problem worse if anything, can only manage 500MB or so now before the NIC shuts off. There's a few more errors output in dmesg following that change too:
em0: link state changed to UP
em0: Watchdog timeout -- resetting
em0: link state changed to DOWN
ahcich3: Timeout on slot 21 port 0
ahcich3: is 00000001 cs 00000000 ss 00000000 rs 00300000 tfd 50 serr 00000000 cmd 00049517
(ada0:ahcich3:0:0:0): WRITE_DMA. ACB: ca 00 20 16 02 46 00 00 00 00 00 00
(ada0:ahcich3:0:0:0): CAM status: Command timeout
(ada0:ahcich3:0:0:0): Retrying command
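For anyone wanting to see which devices are timing out and how often, here is a rough Python sketch that counts timeout messages per device in saved dmesg output. The regular expressions and the function name are assumptions based only on the line formats shown above, so they may need adjusting for other drivers.

```python
import re
from collections import Counter

# Matches a leading device name such as "em0:" or "ahcich3:", or a CAM
# peripheral prefix such as "(ada0:ahcich3:0:0:0):" (first name is used).
DEVICE_RE = re.compile(r"^\(?([a-z]+\d+)[:)]")
TIMEOUT_RE = re.compile(r"timeout", re.IGNORECASE)


def count_timeouts(lines):
    """Return a Counter of timeout messages keyed by device name."""
    counts = Counter()
    for line in lines:
        match = DEVICE_RE.match(line.strip())
        if match and TIMEOUT_RE.search(line):
            counts[match.group(1)] += 1
    return counts


# The dmesg lines from the post above, used as sample input.
SAMPLE = """\
em0: link state changed to UP
em0: Watchdog timeout -- resetting
em0: link state changed to DOWN
ahcich3: Timeout on slot 21 port 0
ahcich3: is 00000001 cs 00000000 ss 00000000 rs 00300000 tfd 50 serr 00000000 cmd 00049517
(ada0:ahcich3:0:0:0): WRITE_DMA. ACB: ca 00 20 16 02 46 00 00 00 00 00 00
(ada0:ahcich3:0:0:0): CAM status: Command timeout
(ada0:ahcich3:0:0:0): Retrying command
""".splitlines()

if __name__ == "__main__":
    for device, n in sorted(count_timeouts(SAMPLE).items()):
        print(device, n)
```

On the sample this reports one timeout each for em0, ahcich3 and ada0, i.e. the NIC and the SATA channel are both timing out in the same window.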
- b0ssman
- Forum Moderator

- Posts: 2438
- Joined: 14 Feb 2013 08:34
- Location: Munich, Germany
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
btw the mainboard should have a pcie slot that you could use for a pcie network card.
- ChriZathens
- Forum Moderator

- Posts: 758
- Joined: 23 Jun 2012 09:14
- Location: Athens, Greece
- Contact:
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
bjones371 wrote:Thanks, I'd spotted that earlier but couldn't get the settings to "stick" after a reboot, but realised it's because I'm running embedded so needed to remount /cf in rw to edit the /cf/boot/loader.conf. Unfortunately it seems to have made the problem worse if anything, can only manage 500MB or so now before the NIC shuts off. There's a few more errors output in dmesg following that change too:
em0: link state changed to UP
em0: Watchdog timeout -- resetting
em0: link state changed to DOWN
ahcich3: Timeout on slot 21 port 0
ahcich3: is 00000001 cs 00000000 ss 00000000 rs 00300000 tfd 50 serr 00000000 cmd 00049517
(ada0:ahcich3:0:0:0): WRITE_DMA. ACB: ca 00 20 16 02 46 00 00 00 00 00 00
(ada0:ahcich3:0:0:0): CAM status: Command timeout
(ada0:ahcich3:0:0:0): Retrying command
ada0 being the disk drive suggests there's either a failing disk (SMART checks out OK), or the SATA interface can't keep up with the incoming data on the NIC maybe? Starting to wonder if it's just that the hardware can't cope with ZFS
Just for the record, you can add the settings you like in System|Advanced|loader.conf - no need to mount cf and such.
As for your ahci error, check the SMART status of the specific disk and see if UDMA_CRC_Error_Count is anything but 0 - a non-zero value usually means you have a bad cable.
-
bjones371
- NewUser

- Posts: 9
- Joined: 17 Nov 2014 11:49
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
Thanks for the pointer on the loader.conf - I'd used the Advanced | Sysctl.conf already so not sure how I missed loader! I rebuilt the pen drive from fresh yesterday and recreated the ZFS Pool and Dataset from scratch as I'd done a lot of tinkering with various things in trying to get the Realtek card working. The throughput on my next attempt was considerably higher than it had been, but it still bombed after around 30GB worth of data. I've got a different module for the Intel NIC (v7.4.2) that I compiled in FreeBSD 9.2-RELEASE and tried out yesterday before I rebuilt it and that didn't help either, not tried it since rebuilding though. Disabling MSI in the loader.conf still has the effect of making the network connection fail more quickly though, so at least it's consistent.
The motherboard does have a PCIe slot, but the only PCIe NIC I have is another Realtek one, so I've not bothered entertaining the idea of putting it in.
Thanks for all the help so far by the way!
UDMA CRC Errors come back clean according to smart, here's the info it's spitting out:
=== START OF INFORMATION SECTION ===
Model Family: Seagate Barracuda 7200.14 (AF)
Device Model: ST2000DM001-1ER164
Serial Number: Z4Z0T6SM
LU WWN Device Id: 5 000c50 079579072
Firmware Version: CC25
User Capacity: 2,000,398,934,016 bytes [2.00 TB]
Sector Sizes: 512 bytes logical, 4096 bytes physical
Rotation Rate: 7200 rpm
Device is: In smartctl database [for details use: -P show]
ATA Version is: ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is: SATA 3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Tue Nov 18 09:20:09 2014 UTC
==> WARNING: A firmware update for this drive may be available,
see the following Seagate web pages:
http://knowledge.seagate.com/articles/en_US/FAQ/207931en
http://knowledge.seagate.com/articles/en_US/FAQ/223651en
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x82) Offline data collection activity
was completed without error.
Auto Offline Data Collection: Enabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 80) seconds.
Offline data collection
capabilities: (0x7b) SMART execute Offline immediate.
Auto Offline data collection on/off support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 1) minutes.
Extended self-test routine
recommended polling time: ( 213) minutes.
Conveyance self-test routine
recommended polling time: ( 2) minutes.
SCT capabilities: (0x1085) SCT Status supported.
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 114 100 006 Pre-fail Always - 61222872
3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 24
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 100 253 030 Pre-fail Always - 199662
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 43
10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 100 100 020 Old_age Always - 24
183 Runtime_Bad_Block 0x0032 100 100 000 Old_age Always - 0
184 End-to-End_Error 0x0032 100 100 099 Old_age Always - 0
187 Reported_Uncorrect 0x0032 100 100 000 Old_age Always - 0
188 Command_Timeout 0x0032 100 100 000 Old_age Always - 0 0 0
189 High_Fly_Writes 0x003a 100 100 000 Old_age Always - 0
190 Airflow_Temperature_Cel 0x0022 069 067 045 Old_age Always - 31 (Min/Max 31/32)
191 G-Sense_Error_Rate 0x0032 100 100 000 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 000 Old_age Always - 16
193 Load_Cycle_Count 0x0032 100 100 000 Old_age Always - 122
194 Temperature_Celsius 0x0022 031 040 000 Old_age Always - 31 (0 18 0 0 0)
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 40h+08m+43.886s
241 Total_LBAs_Written 0x0000 100 253 000 Old_age Offline - 360122796
242 Total_LBAs_Read 0x0000 100 253 000 Old_age Offline - 8614696021
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Short offline Completed without error 00% 4 -
SMART Selective self-test log data structure revision number 1
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
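The attribute table above can also be checked with a few lines of Python rather than by eye. This is only a sketch: the helper names and the list of "watched" attributes are assumptions made for illustration, and it relies on the whitespace-separated column layout shown in the output above.

```python
# Attributes commonly checked for cabling or media trouble (illustrative choice).
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector", "UDMA_CRC_Error_Count")


def raw_value(smart_output, attribute):
    """Return the first integer of RAW_VALUE for `attribute`, or None."""
    for line in smart_output.splitlines():
        fields = line.split()
        # Rows look like: ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW...
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] == attribute:
            try:
                return int(fields[9])
            except ValueError:
                return None  # e.g. a duration such as "40h+08m+43.886s"
    return None


def nonzero_attributes(smart_output, names=WATCHED):
    """Return {name: raw} for watched attributes whose raw value is non-zero."""
    found = {}
    for name in names:
        value = raw_value(smart_output, name)
        if value:  # skips None (missing or unparseable) and 0
            found[name] = value
    return found


# A few rows copied from the output above.
SAMPLE = """\
5 Reallocated_Sector_Ct 0x0033 100 100 010 Pre-fail Always - 0
9 Power_On_Hours 0x0032 100 100 000 Old_age Always - 43
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
240 Head_Flying_Hours 0x0000 100 253 000 Old_age Offline - 40h+08m+43.886s
"""

if __name__ == "__main__":
    print(nonzero_attributes(SAMPLE))
```

For the rows shown, the result is an empty dict, since all three watched attributes have a raw value of 0. In practice the text would come from something like `smartctl -A` run against the disk.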
- ChriZathens
- Forum Moderator

- Posts: 758
- Joined: 23 Jun 2012 09:14
- Location: Athens, Greece
- Contact:
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
OK, so cables seem OK....
Please tell me something else... do you have any scripts that check the status of your hdds frequently?
- b0ssman
- Forum Moderator

- Posts: 2438
- Joined: 14 Feb 2013 08:34
- Location: Munich, Germany
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
What Chipset does the motherboard have?
Sent from my iPhone using Tapatalk
-
bjones371
- NewUser

- Posts: 9
- Joined: 17 Nov 2014 11:49
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
No scripts or anything like that running, it's a clean NAS4Free Embedded build I have running now - all I've done is set up a basic ZFS structure and shared it using Samba, beyond that it's pretty much as you'd find it installed fresh off the disk.
Motherboard is using an nVidia Geforce 7050 / 610i chipset, so not the most modern of hardware I admit. This website has a good breakdown of the motherboard specs. http://www.ascendtech.us/ecs-mcp73vt-pm ... 3vtpm.aspx
- b0ssman
- Forum Moderator

- Posts: 2438
- Joined: 14 Feb 2013 08:34
- Location: Munich, Germany
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
the nvidia chipset will be your problem.
it is not well supported under freebsd and causes problems with the sata controller and other stuff.
-
bjones371
- NewUser

- Posts: 9
- Joined: 17 Nov 2014 11:49
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
Cool - I'll start looking for an alternative that's not BSD based 
-
sarwanov
- NewUser

- Posts: 5
- Joined: 29 Jan 2015 16:08
- Status: Offline
Re: RTL8169 NIC Drops Connection Under Load
There is no need to waste any more of your time, you just need to get an Intel card and that's all.
Graduated from Soran University with First Class Degree with Honours in Computer Science.