Bug 860035
| Summary: | Reduced bandwidth gigabit NIC, rx packet loss, buffer overruns, terrible iperf and ethtool stats | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 5 | Reporter: | that_IT_guy <mjklein> |
| Component: | kernel | Assignee: | Nikolay Aleksandrov <naleksan> |
| Status: | CLOSED NOTABUG | QA Contact: | Red Hat Kernel QE team <kernel-qe> |
| Severity: | medium | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 5.8 | CC: | agospoda, nhorman, peterm |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | i686 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-09-28 18:42:02 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
that_IT_guy
2012-09-24 17:00:41 UTC
> Why do you think that this is a problem with ethtool, and not the kernel? Do you only see these problems with RHEL-5, or do they occur with RHEL-6 / Fedora as well?

Honestly, I wasn't sure which component to categorize it under (first bug report, so forgive my noobishness). I chose the ethtool category simply because it is network-related. I haven't had the chance to test on Fedora, since Fedora is not allowed on my corporate network, and RHEL 6 has not been put into production in my environment yet. The one RHEL 6 system that I do have is on a different hardware platform (T3400, BCM5754) and does not seem to exhibit any of these issues with bandwidth performance, packet loss, or buffer overruns.

Stumbled upon something interesting today on one of the two problem boxes. I had scaled my rx ring size up to 500 for a day just to check results, and saw framing errors out the wazoo. Reducing the rx ring back to 200 stopped the framing errors. Yay.

What I found most interesting is that at this stage this system (tg3, BCM5761, 64-bit RHEL 5.3) is not exhibiting any packet loss, framing errors, or buffer overruns on the receiving end, yet the NIC is still doing something strange. If I bring the interface down and back up (either with /etc/init.d/network restart or with ifdown/ifup eth0) and then test with iperf, I average around 830 Mbit/sec until roughly 10 GB has been transferred in total. At the point that I hit the 10 GB mark, my throughput halves.

My ring test:

```
Ring parameters for eth0:
Pre-set maximums:
RX:        511
RX Mini:   0
RX Jumbo:  0
TX:        511
Current hardware settings:
RX:        500
RX Mini:   0
RX Jumbo:  0
TX:        511
```

Back to 200:

```
Ring parameters for eth0:
Pre-set maximums:
RX:        511
RX Mini:   0
RX Jumbo:  0
TX:        511
Current hardware settings:
RX:        200
RX Mini:   0
RX Jumbo:  0
TX:        511
```
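For anyone retracing these steps, the ring adjustments above map to ethtool's standard -g/-G options. A minimal sketch, assuming the interface is eth0 and root privileges; the grep pattern is just one way to pull error counters out of the driver statistics:

```
# Show pre-set maximums and current ring sizes (the output quoted above).
ethtool -g eth0

# The two settings tried above; values past the pre-set maximum
# (511 on this NIC) are rejected by the driver.
ethtool -G eth0 rx 500    # produced heavy framing errors here
ethtool -G eth0 rx 200    # framing errors stopped

# Driver statistics; filter for error-related counters.
ethtool -S eth0 | grep -Ei 'err|drop|fifo'
```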
iperf results immediately after eth0 down/up (this was run 5 times):

```
Client connecting to server, TCP port 5001
TCP window size:  256 KByte (WARNING: requested 2.00 MByte)
------------------------------------------------------------
[  3] local 0.0.0.0 port 34168 connected with 0.0.0.0 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 2.0 sec    202 MBytes    847 Mbits/sec
[  3]  2.0- 4.0 sec    202 MBytes    846 Mbits/sec
[  3]  4.0- 6.0 sec    202 MBytes    848 Mbits/sec
[  3]  6.0- 8.0 sec    202 MBytes    849 Mbits/sec
[  3]  8.0-10.0 sec    202 MBytes    848 Mbits/sec
[  3] 10.0-12.0 sec    203 MBytes    851 Mbits/sec
[  3] 12.0-14.0 sec    201 MBytes    843 Mbits/sec
[  3] 14.0-16.0 sec    197 MBytes    827 Mbits/sec
[  3] 16.0-18.0 sec    200 MBytes    837 Mbits/sec
[  3] 18.0-20.0 sec    199 MBytes    835 Mbits/sec
[  3]  0.0-20.0 sec   1.96 GBytes    843 Mbits/sec
```

ifconfig stats of eth0, minus identifying info:

```
UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
RX packets:3955787 errors:0 dropped:0 overruns:0 frame:0
TX packets:7666484 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:281073766 (268.0 MiB)  TX bytes:11631517756 (10.8 GiB)
Interrupt:177
```

iperf results after the client has transferred 10 GB across the previous iperf tests:

```
Client connecting to server, TCP port 5001
TCP window size:  256 KByte (WARNING: requested 2.00 MByte)
------------------------------------------------------------
[  3] local 0.0.0.0 port 37186 connected with 0.0.0.0 port 5001
[ ID] Interval       Transfer     Bandwidth
[  3]  0.0- 2.0 sec   87.8 MBytes    368 Mbits/sec
[  3]  2.0- 4.0 sec   94.0 MBytes    394 Mbits/sec
[  3]  4.0- 6.0 sec   99.9 MBytes    419 Mbits/sec
[  3]  6.0- 8.0 sec    101 MBytes    425 Mbits/sec
[  3]  8.0-10.0 sec    101 MBytes    422 Mbits/sec
[  3] 10.0-12.0 sec    101 MBytes    423 Mbits/sec
[  3] 12.0-14.0 sec   93.5 MBytes    392 Mbits/sec
[  3] 14.0-16.0 sec   99.2 MBytes    416 Mbits/sec
[  3] 16.0-18.0 sec    102 MBytes    428 Mbits/sec
[  3] 18.0-20.0 sec    101 MBytes    425 Mbits/sec
[  3]  0.0-20.0 sec    981 MBytes    411 Mbits/sec
```

The above output is something I was seeing as an initial symptom on both of these systems, and it apparently slipped my mind; until today I hadn't realized that the bandwidth was halving after the client had received a particular amount of data.

NOT a RHEL bug. Found to be a faulty PCI bus. Report can be closed.
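For completeness: the output above is consistent with an iperf 2 client invocation along the lines of the sketch below (the server address 192.0.2.10 is a placeholder). And since the root cause turned out to be a faulty PCI bus rather than the kernel or driver, on similar symptoms it is worth checking the kernel log and device status for bus errors before tuning the NIC; the PCI address is likewise a placeholder for the tg3 device (locate it with `lspci | grep -i broadcom`).

```
# 20-second run, reported in 2-second intervals, 2 MB requested window,
# matching the quoted output. 192.0.2.10 is a placeholder server address.
iperf -c 192.0.2.10 -w 2M -t 20 -i 2

# Look for bus-level trouble that would explain throughput loss not
# attributable to the NIC itself.
dmesg | grep -iE 'pci|parity|abort'

# Dump the device's status registers; the address is a placeholder.
lspci -vv -s 02:00.0
```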