Bug 582364 - [RHEL5.5] 82599 Performance regression
Summary: [RHEL5.5] 82599 Performance regression
Keywords:
Status: CLOSED WORKSFORME
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: 4Suite
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Assignee: Andy Gospodarek
QA Contact: Red Hat Kernel QE team
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2010-04-14 18:23 UTC by Mark Wagner
Modified: 2014-06-29 23:02 UTC
5 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-06-15 18:14:24 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)
some data (8.57 KB, text/plain)
2010-04-14 18:23 UTC, Mark Wagner
no flags Details
ixgbe-disable-multiple-tx-queues.patch (1.98 KB, patch)
2010-04-14 19:38 UTC, Andy Gospodarek
no flags Details | Diff

Description Mark Wagner 2010-04-14 18:23:03 UTC
Created attachment 406590 [details]
some data

Description of problem:

The Niantic (82599) card has a performance regression when going from RHEL5.4 to RHEL5.5.

On RHEL5.4 I see a throughput of approximately 9.3 Gbit/s using a single-stream netperf TCP_STREAM test with 16K message sizes. On RHEL5.5 on the same hardware, I see highly variable line speeds. I'm also observing that in RHEL5.5 the Niantic runs much slower on receive (transmit is fine). The main symptom is a very "jerky" flow when running a single-stream TCP_STREAM netperf test to the Niantic.

The thinking is that something is filling up the queues and causing this behavior. If I use the modprobe option to create multiple VFs, the issue goes away. This may be related to RSC and the flow director being bypassed in that mode.
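The VF workaround described above can be captured in a modprobe config fragment. This is a sketch only: `max_vfs` is the ixgbe SR-IOV module parameter, but the VF count shown here is an assumption (the report does not give one).

```
# /etc/modprobe.conf fragment (RHEL5-style) -- sketch only.
# max_vfs is the ixgbe SR-IOV module parameter; 2 is an assumed count.
options ixgbe max_vfs=2
```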

Also, reducing the coalesce time improves the flow and throughput, although it does not entirely eliminate the issue.
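The coalescing workaround can be sketched with ethtool. The interface name and the 30us value below are placeholders, not values taken from the report; the script reports rather than fails when the device or tool is unavailable.

```shell
#!/bin/sh
# Sketch: lower the RX interrupt coalescing time on the 82599 port.
# IFACE and the 30us value are assumptions, not values from this report.
IFACE=${IFACE:-eth2}
if ethtool -C "$IFACE" rx-usecs 30 2>/dev/null; then
    STATUS="adjusted"
else
    STATUS="skipped"    # device absent, no root, or ethtool missing
fi
echo "coalescing $STATUS on $IFACE"
```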



Version-Release number of selected component (if applicable):

RHEL5.5 + ixgbe driver

How reproducible:
Always

Steps to Reproduce:
1. Run netperf with large message sizes (./netperf -l 30 -H 172.17.20.22 -D)
Actual results:

Very "jerky" flow and a low aggregate throughput. 

Expected results:
Line speed (approximately 9.4 Gbit/s)

Additional info:

Comment 1 Andy Gospodarek 2010-04-14 19:38:26 UTC
Created attachment 406612 [details]
ixgbe-disable-multiple-tx-queues.patch


Mark, one thing I've considered is that my backport may enable multiple TX queues in the driver on RHEL5.5, and that could be causing issues. Can you try this patch and let me know how it looks?
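A quick way to check whether the driver actually registered multiple TX queues is to count the per-queue MSI-X vectors in /proc/interrupts. This is a sketch: "eth2" is a placeholder interface name, and the "<iface>-tx" vector naming matches ixgbe of this era but may vary by driver version.

```shell
#!/bin/sh
# Sketch: count per-queue TX MSI-X vectors registered by the driver.
# "eth2" is a placeholder; vector naming may differ by driver version.
IFACE=${IFACE:-eth2}
NQ=$(grep -c "${IFACE}-tx" /proc/interrupts 2>/dev/null)
NQ=${NQ:-0}
echo "$IFACE: $NQ TX queue vectors"
```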

Comment 2 Andy Gospodarek 2010-05-04 14:25:07 UTC
Tested that patch on Mark's system and found no benefit.

Comment 3 Emil Tantilov 2010-05-19 19:32:23 UTC
I have not been able to reproduce this issue. This is a test run from RHEL5.5 ia32e with the included ixgbe 2.0.44-k2 (2.6.18-194 kernel)

#netperf -H u1505 -D
TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to u1505 (190.0.15.5) port 0 AF_INET : demo
Interim result: 9201.60 10^6bits/s over 1.00 seconds
Interim result: 9251.12 10^6bits/s over 1.00 seconds
Interim result: 9287.27 10^6bits/s over 1.00 seconds
Interim result: 9270.80 10^6bits/s over 1.00 seconds
Interim result: 9255.75 10^6bits/s over 1.00 seconds
Interim result: 9281.18 10^6bits/s over 1.00 seconds
Interim result: 9280.45 10^6bits/s over 1.00 seconds
Interim result: 9284.71 10^6bits/s over 1.00 seconds
Interim result: 9311.09 10^6bits/s over 1.00 seconds
Recv   Send    Send                          
Socket Socket  Message  Elapsed              
Size   Size    Size     Time     Throughput  
bytes  bytes   bytes    secs.    10^6bits/sec  

Could you post the actual numbers you are seeing from the netperf run? 

Also the following information may be useful:
- dmesg after loading the driver
- cat /proc/interrupts
- lspci -vvv
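The three diagnostics above can be gathered into one file for attachment. A minimal sketch; the output path is arbitrary, and lspci -vvv shows full capability info only when run as root.

```shell
#!/bin/sh
# Sketch: collect the requested diagnostics into a single file.
# The output path is arbitrary; run as root for complete lspci output.
OUT=${OUT:-/tmp/ixgbe-diag.txt}
{
    echo "=== dmesg ===";      dmesg 2>/dev/null
    echo "=== interrupts ==="; cat /proc/interrupts 2>/dev/null
    echo "=== lspci ===";      lspci -vvv 2>/dev/null
} > "$OUT"
echo "wrote $OUT"
```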

Comment 4 Mark Wagner 2010-05-19 20:32:16 UTC
The data is already included in an attachment to this bug (see near the top of the page).

The other data you requested will take a while as the systems are in the midst of some RHEL6 work.

Comment 5 Andy Gospodarek 2011-06-15 18:14:24 UTC
Mark, does it even make sense to keep this open at this point?  It is on 5.5 and we are getting ready to ship 5.7.

I'm going to close this as WORKSFORME, but we can reopen if needed.

