Bug 105718

Summary: (NET E1000) driver gives poor performance with jumbo frames
Product: Red Hat Enterprise Linux 2.1
Reporter: Brian Feeny <signal>
Component: kernel
Assignee: John W. Linville <linville>
Status: CLOSED WORKSFORME
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 2.1
CC: davem, jbaron, k.georgiou, riel, scott.feldman
Hardware: i686
OS: Linux
Last Closed: 2005-03-15 14:52:43 UTC
Attachments: test results from NFS benchmark

Description Brian Feeny 2003-09-26 20:10:26 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.22; Mac_PowerPC)

Description of problem:
I opened a ticket on this today via the Red Hat Enterprise Entitlement WebHelp
support area.  I am not sure if it's a good or bad thing to put it here as well.

We have a set of RH 8.0 machines which will be running mail server software.
These machines are front-ended by a load-balancing layer and share a common
file store, which is a RH ES 2.1 server, via NFS.  Optimum NFS performance is
important to us in this deployment.

Here is the hardware being used:

NFS Client machine
  Intel SE7501BR2 motherboard (hyperthreading enabled)
  Dual Xeon 2.6Ghz
  2GB memory
  Redhat 8.0 with all updates
  Red Hat Linux release 8.0 (Psyche)
  Linux mx-admin.shreve.net 2.4.18-24.8.0smp #1 SMP Fri Jan 31 06:03:47 EST 
2003 i686 i686    i386 GNU/Linux


NFS Server machines
  Intel SE7501BR2 motherboard (hyperthreading enabled)
  Dual Xeon 2.6Ghz
  2GB Memory
  2 3Ware 7500 IDE RAID Controllers
  Redhat ES 2.1
  Red Hat Enterprise Linux ES release 2.1 (Panama)
  Linux mx-nfs.shreve.net 2.4.9-e.24smp #1 SMP Tue May 27 16:07:39 EDT 2003 
i686 unknown


These machines are only running the software needed to do the NFS benchmark.  
These are in a test environment and not in production.

There are a number of performance tunings that we are looking at doing on the
client and server side.  Once we started seeing problems, we decided to go with
just basic settings that reveal the problem.

The tests:

Benchmarks were performed using iozone (www.iozone.org).  The test results are
attached to this ticket as a tar file.

The original test was done with an MTU of 1500 and a blocksize of 8192.  The
results were not that impressive.  Because of the fragmentation and reassembly
that must take place for an 8k block size on a pipe with an MTU of 1500, the
move was made to use jumbo frames.

A second test was done, with the MTU set to 9000 and a blocksize of 8192.  This
test showed worse performance than the original test, despite the fact that
with jumbo frames there should be little to no fragmentation and reassembly,
and the IP header and per-packet processing overhead should be significantly
reduced.
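
As a rough back-of-the-envelope illustration (the header overhead figure is
approximate and assumes NFS over UDP):

  8192 bytes of data + ~100 bytes of RPC/NFS/UDP/IP headers  ~= 8.3 KB per request
  MTU 1500: ceil(8300 / 1480) = 6 IP fragments per 8k read or write
  MTU 9000: the whole datagram fits in a single frame, so no fragmentation at all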

Both tests were done with the e1000 driver for RH8.0 (latest kernel), and RH ES 
2.1 (latest kernel).
No tuning was done to the module for these tests.

The above tests were then repeated using the e1000_4412k1 driver included with
RH ES 2.1.  These tests were faster than the tests with the stock e1000 driver
(the e1000 driver being a 5.x version of the driver and the 4412k1 being a 4.x
version).  But the MTU 1500 tests still came out on top of the tests that were
using MTU 9000.

The Intel SE7501BR2's onboard e1000 NIC is jumbo-frame clean.  Large file
copies with checksum comparisons were done to confirm a clean data path.
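
For reference, one way to do such a check (a sketch only; the file name and
mount point are examples):

  dd if=/dev/urandom of=/tmp/bigfile bs=1M count=512    # create a large test file
  cp /tmp/bigfile /home/cust/bigfile                    # copy it across the NFS mount
  md5sum /tmp/bigfile /home/cust/bigfile                # the two checksums should match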

I have no explanation as to why I would see worse performance with MTU 9000
(jumbo frames), when ideally that is where I need to be for this project, in
order to reduce CPU overhead and packet fragmentation and reassembly.

A lot of the reading I have done has led me to consider the driver parameters
suspect.  As I said, I left them at their default values.  I am especially
concerned with RxIntDelay.  Using a larger packet size, such as 9000 in this
case, should probably be paired with a different interrupt delay (please
correct me if I am wrong).  It looks like in the e1000 driver this is set to
some sort of dynamic learning mode based on the InterruptThrottleRate
parameter:

InterruptThrottleRate int array (min = 1, max = 32), description "Interrupt 
Throttling Rate"
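
For reference, the available parameters for the driver can be listed with
modinfo (a sketch, assuming the module is named e1000):

  /sbin/modinfo -p e1000 | grep -i -e RxIntDelay -e InterruptThrottleRate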

In the 4412_k1 version of this driver, it looks like it just relies on
RxIntDelay.  I am wondering if the rapid interrupts are causing the large
frames (9000 MTU) to be interrupted and causing resends.  Another possibility
is that it is interrupting too much and causing too much CPU overhead.

I don't know if interrupts are at all to blame.  I only offer it up as a 
suggestion.

Version-Release number of selected component (if applicable):
kernel-smp-2.4.9-e.24

How reproducible:
Always

Steps to Reproduce:
1. Configure a partition on an RH ES 2.1 NFS server
2. Configure an RH 8.0 client to mount it with an 8192 block size
3. Use MTU 9000 and Intel e1000 NICs, connected straight through
4. Run the iozone benchmark suite; the results will be BETTER using MTU 1500
   (see the sketch below)
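
A rough sketch of the commands involved (the interface name, mount point, and
iozone invocation are examples only, not the exact commands used for the
attached results):

  ifconfig eth1 mtu 9000                           # on both client and server
  mount -t nfs -o hard,intr,rsize=8192,wsize=8192 mx-nfs.mx:/home/cust /home/cust
  iozone -a -i 0 -i 1 -f /home/cust/iozone.tmp     # write and read tests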
    

Actual Results:  Better performance when using MTU 1500 vs. 9000

Expected Results:  MTU 9000 should definitely give better results when using an
8192 block size on a pipe that is jumbo-frame "clean"

Additional info:

Comment 1 Brian Feeny 2003-09-26 20:12:03 UTC
Here is some info about how the machines are configured, and some notes about 
the tests:
The NFS Clients are mounting the filesystem as follows:

/etc/fstab:
mx-nfs.mx:/home/cust                    /home/cust              nfs     hard,intr,rsize=8192,wsize=8192 0 0
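
To confirm the rsize/wsize actually in effect on the client, something like the
following can be used (a sketch; the mount point is as above):

  mount | grep /home/cust    # the options should include rsize=8192,wsize=8192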


The NFS Server has the filesystem setup as follows:

/etc/fstab:
/dev/sdb1               /home/cust              ext3    defaults,data=journal,noatime        1 14

/etc/exports:
/home/cust                      192.168.10.0/255.255.255.0(rw,no_root_squash)


The server shows the following for nfsstat after many tests:
[root@mx-nfs network-scripts]# /usr/sbin/nfsstat
Warning: /proc/net/rpc/nfs: No such file or directory
Server rpc stats:
calls      badcalls   badauth    badclnt    xdrcall
87948134   12         12         0          0
  
Server nfs v3:
null       getattr    setattr    lookup     access     readlink
0       0% 13796   0% 630     0% 10077   0% 33061   0% 0       0%
read       write      create     mkdir      symlink    mknod
45134282  2% 35832835 40% 1260    0% 0       0% 0       0% 0       0%
remove     rmdir      rename     link       readdir    readdirplus
1260    0% 0       0% 0       0% 0       0% 9       0% 0       0%
fsstat     fsinfo     pathconf   commit
8869    0% 8869    0% 0       0% 6903186  7%
  
The client shows the following for nfsstat after many tests:
Client rpc stats:
calls      retrans    authrefrsh
166801290   1981       0

Client nfs v3:
null       getattr    setattr    lookup     access     readlink
0       0% 13823   0% 630     0% 10085   0% 33100   0% 0       0%
read       write      create     mkdir      symlink    mknod
45134281  2% 35831152 40% 1260    0% 0       0% 0       0% 0       0%
remove     rmdir      rename     link       readdir    readdirplus
1260    0% 0       0% 0       0% 0       0% 11      0% 0       0%
fsstat     fsinfo     pathconf   commit
8873    0% 8873    0% 0       0% 6903186  7%

The server network adapter is as follows after many tests:
[root@mx-nfs network-scripts]# netstat -i
Kernel Interface table
Iface   MTU Met   RX-OK RX-ERR RX-DRP RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flg
eth1   1500   0  70949182      0      0      0  80169808      0      0      0 BMRU

The client network adapter is as follows after many tests:
[root@mx-admin NEW]# netstat -i

Kernel Interface table
Iface     MTU Met   RX-OK RX-ERR RX-DRP RX-OVR   TX-OK TX-ERR TX-DRP TX-OVR Flg
eth1      1500   0 177924705     12      0      0 159498604      0      0      0 BMRU


Notes:

1. The NFS Server is set to run with RPCNFSDCOUNT=15 in /etc/rc.d/init.d/nfs
2. The client and server are connected via back to back Cat5e patch cable 
during the tests
3. The client and server are both set to the same MTU during the respective 
tests.  This is verified by ifconfig -a and a successful "/usr/sbin/tracepath 
server/2049" from the client (see the additional check below).
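
An additional way to verify that the jumbo-frame path is clean end to end is a
maximum-size, non-fragmentable ping (a sketch; 8972 = 9000 minus 28 bytes of IP
and ICMP headers, and -M do requires a reasonably recent iputils ping):

  ping -M do -s 8972 -c 4 mx-nfs.mx    # succeeds at MTU 9000, fails at MTU 1500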






Comment 2 Brian Feeny 2003-09-26 20:12:32 UTC
Here is an update:

I ran the above test with the e1000 driver, MTU 9000 and 8192 blocksize, but 
this time I changed some driver parameters.
This is how I set the parameters:

client:
options e1000 Speed=1000 Duplex=2 RxDescriptors=256 TxAbsIntDelay=0

server:
options e1000 Speed=1000 Duplex=2 RxDescriptors=256 TxAbsIntDelay=0 
InterruptThrottleRate=0

My reasoning was that I wanted to make them match on both sides.  The 4.x and
5.x drivers use different RxDescriptors and TxAbsIntDelay defaults; this way
they are the same.  I also disabled the dynamic interrupt throttling of the
5.x driver on the server side.
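
For completeness, options in /etc/modules.conf only take effect when the driver
is loaded, so after editing them the module has to be reloaded (a sketch,
assuming the NIC is eth1):

  ifdown eth1
  /sbin/rmmod e1000
  /sbin/modprobe e1000
  ifup eth1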

The results are a substantial improvement.  I think this is a step in the right 
direction but I am ignorant as to exactly how
to proceed.

If I can attach files to this bug, I will attach a tar showing all the test
results, including this last one, which I call "custom1" because I used custom
/etc/modules.conf parameters.

My next tests will disable hyperthreading and SMP altogether, to see if it has
something to do with smp_affinity.

Brian


Comment 3 Brian Feeny 2003-09-26 20:19:25 UTC
Created attachment 94768 [details]
test results from NFS benchmark

These are the test results of the iozone nfs tests.  Included is a readme file
that serves as an index.

Comment 4 John W. Linville 2005-02-21 15:52:13 UTC
Brian,

This one just found its way to me...  Are you still using 2.1?  Have
you picked up the latest updates?  (The e1000 driver has received a
number of updates since this was reported...)

Would you mind re-running your tests with the latest 2.1 update (U7)
and reporting the results?

Comment 5 John W. Linville 2005-03-09 13:33:31 UTC
Any response to the questions from comment 4?

Comment 6 John W. Linville 2005-03-15 14:52:43 UTC
Closed due to lack of response.  Please attempt to recreate with the
latest available AS2.1 update and reopen if the problem persists.