Bug 620973

Summary: TCP performance abysmal on RHEL6 with myri10ge
Product: Red Hat Enterprise Linux 6
Reporter: Rich Ercolani <rercola>
Component: kernel
Assignee: Stanislaw Gruszka <sgruszka>
Status: CLOSED INSUFFICIENT_DATA
QA Contact: Red Hat Kernel QE team <kernel-qe>
Severity: medium
Priority: low
Version: 6.1
CC: arozansk
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2011-05-24 14:59:18 UTC

Description Rich Ercolani 2010-08-03 21:44:15 UTC
Description of problem:
On RHEL5, I tested the performance of the Myricom 10GbE NICs using netperf with TCP_SENDFILE and an MTU of 9000, and got around 9400 Mbit/s +/- 100 Mbit/s.

I then tested on the RHEL 6 beta, since it will be a number of months before the cards in question are deployed and RHEL 6 will likely be what we actually run on them. With identical settings, I got 1.77 Mbit/s.

Version-Release number of selected component (if applicable):
2.6.32-44.2.el6.x86_64
I also tried a stock 2.6.35 kernel, built with make rpm, which achieved similar results.

How reproducible:
100%

Steps to Reproduce:
1. Use myri10ge card on a RHEL 6 machine
2. Run netperf with any TCP-based test
  
Actual results:
Performance on the order of 1.77 Mbit/s (occasionally, I will get a result around 300 Mbit/s back).

Expected results:
Performance on the order of 9400 Mbit/s.

Additional info:
UDP_STREAM tests on RHEL 5.4 using MTU 9000 yielded 8700 Mbit/s +/- 100 Mbit/s. The same tests on RHEL 6 yield 7800-9300 Mbit/s.

Using rsync, TCP-based NFS mounts, or any other TCP-based traffic over the 10GbE link results in connection hangs (though almost never an explicit drop). This does not happen with TCP or UDP on the machines' 1 Gbit interfaces, nor with UDP-based protocols over the 10GbE links.

Nothing else is being done on the switch while these tests are running. The configuration has not changed since the RHEL 5.4 tests were run.
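
The comparison described above can be scripted. What follows is a minimal sketch, not the reporter's actual procedure: the host address, the 10-second test length, and the helper names `netperf_command` and `slowdown` are assumptions for illustration. It only builds command lines (using netperf's `-H`, `-t`, `-l`, and `-F` options) and puts a number on the gap between the expected and actual TCP figures.

```python
import shlex


def netperf_command(host, test="TCP_STREAM", seconds=10, fill_file=None):
    """Build (but do not run) a netperf command line for one test."""
    cmd = ["netperf", "-H", host, "-t", test, "-l", str(seconds)]
    if fill_file is not None:
        # TCP_SENDFILE takes its payload from a file passed with -F.
        cmd += ["-F", fill_file]
    return cmd


def slowdown(expected_mbit, actual_mbit):
    """Ratio of expected to actual throughput."""
    return expected_mbit / actual_mbit


if __name__ == "__main__":
    # Placeholder address; substitute the netserver host on the 10GbE link.
    host = "192.168.1.2"
    for test in ("TCP_STREAM", "UDP_STREAM"):
        print(shlex.join(netperf_command(host, test)))
    # Figures from the report: about 9400 Mbit/s expected, 1.77 Mbit/s seen.
    print(f"TCP slowdown: about {slowdown(9400, 1.77):.0f}x")
```

Running the printed commands on both a RHEL 5 and a RHEL 6 host with the same MTU would give directly comparable numbers for the TCP and UDP cases.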

Comment 2 RHEL Program Management 2010-08-03 22:07:38 UTC
This issue has been proposed when we are only considering blocker
issues in the current Red Hat Enterprise Linux release.

** If you would still like this issue considered for the current
release, ask your support representative to file as a blocker on
your behalf. Otherwise ask that it be considered for the next
Red Hat Enterprise Linux release. **

Comment 3 Stanislaw Gruszka 2010-08-06 13:43:38 UTC
Does "ethtool -K ethX lro off" improve anything?
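
Before and after trying that, it can help to confirm what the LRO setting actually is. A minimal sketch, assuming `ethtool -k ethX` prints a `large-receive-offload: on|off` line; the sample text below is illustrative rather than captured from the affected machine, and the function names are hypothetical.

```python
import subprocess


def lro_enabled(ethtool_k_output):
    """Return True/False from the large-receive-offload line, None if absent."""
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.partition(":")
        if sep and name.strip() == "large-receive-offload":
            # The value may carry a suffix such as "[fixed]".
            return value.strip().startswith("on")
    return None


def disable_lro(iface):
    """Run the command suggested in this comment (needs root and ethtool)."""
    subprocess.run(["ethtool", "-K", iface, "lro", "off"], check=True)


# Illustrative output only; real `ethtool -k` output lists more features.
SAMPLE_FEATURES = """\
Features for eth2:
rx-checksumming: on
large-receive-offload: on
"""

if __name__ == "__main__":
    print("LRO enabled:", lro_enabled(SAMPLE_FEATURES))
```

On a real system the string would come from `subprocess.run(["ethtool", "-k", iface], capture_output=True, text=True).stdout`, checked once before and once after calling `disable_lro`.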

Comment 4 RHEL Program Management 2010-08-18 21:19:36 UTC
Thank you for your bug report. This issue was evaluated for inclusion
in the current release of Red Hat Enterprise Linux. Unfortunately, we
are unable to address this request in the current release. Because we
are in the final stage of Red Hat Enterprise Linux 6 development, only
significant, release-blocking issues involving serious regressions and
data corruption can be considered.

If you believe this issue meets the release blocking criteria as
defined and communicated to you by your Red Hat Support representative,
please ask your representative to file this issue as a blocker for the
current release. Otherwise, ask that it be evaluated for inclusion in
the next minor release of Red Hat Enterprise Linux.

Comment 5 Stanislaw Gruszka 2011-04-29 11:17:49 UTC
I have these netperf results when using myri10ge on RHEL6.

[root@dhcp-31-100 ~]# netperf -H 192.168.1.2
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.2 (192.168.1.2) port 0 AF_INET
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

 87380  65536  65536    10.01    9874.51  

Could you elaborate on the problem (e.g. provide your configuration options)?
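
To compare runs like the one above against the figures in the description, the throughput can be pulled out of the netperf text. A minimal sketch, assuming the default human-readable output format shown in this comment, where the result row is the only line made up entirely of numbers and throughput is its last column; `parse_throughput` is a hypothetical helper name.

```python
def parse_throughput(netperf_output):
    """Return throughput (10^6 bits/s) from the last all-numeric row, or None."""
    result = None
    for line in netperf_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # blank lines and single-token lines are not result rows
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue  # header lines contain words, so they are skipped
        result = values[-1]
    return result


# Output copied from the netperf run in this comment.
SAMPLE = """\
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to 192.168.1.2 (192.168.1.2) port 0 AF_INET
Recv   Send    Send
Socket Socket  Message  Elapsed
Size   Size    Size     Time     Throughput
bytes  bytes   bytes    secs.    10^6bits/sec

 87380  65536  65536    10.01    9874.51
"""

if __name__ == "__main__":
    print(parse_throughput(SAMPLE))
```

Collecting this value over several runs would also show whether the occasional 300 Mbit/s result mentioned in the description recurs.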

Comment 7 RHEL Program Management 2011-04-30 06:01:06 UTC
Since RHEL 6.1 External Beta has begun, and this bug remains
unresolved, it has been rejected as it is not proposed as
exception or blocker.

Red Hat invites you to ask your support representative to
propose this request, if appropriate and relevant, in the
next release of Red Hat Enterprise Linux.