Bug 55178 - e1000 driver included in distribution performs poorly
Status: CLOSED RAWHIDE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.3
Hardware: ia64 Linux
Priority: high
Severity: high
Assigned To: Arjan van de Ven
QA Contact: Brock Organ
Reported: 2001-10-26 13:33 EDT by Matthew Tolentino
Modified: 2005-10-31 17:00 EST
Last Closed: 2001-12-06 12:15:23 EST


Description Matthew Tolentino 2001-10-26 13:33:47 EDT
Description of Problem:

The e1000 driver (version 3.1.23) for the Intel Pro/1000 Gigabit Ethernet 
Adapter exhibits extremely low performance.  Pings to other systems often 
take as long as 1 second.  Ftp and telnet sessions to other systems across 
the adapter are extremely slow.  

Version-Release number of selected component (if applicable):
driver version 3.1.23

How Reproducible:

Always

Steps to Reproduce:
1. load e1000 driver for Pro/1000 Gb NIC
2. ping <other_system_ip> OR
3. telnet/ftp to <other_system_ip>

Actual Results:
System appears extremely sluggish.  Almost unusable.

Expected Results:
pings should succeed.  Also, telnet/ftp sessions should be reasonably 
fast - at least as fast as across a 10/100 NIC.

Additional Information:
	
I downloaded the same driver version (v3.1.23) from the intel.com/support 
website, built it standalone, and found that it functions as expected.  This 
points to a problem specific to the driver included in the RH distribution 
tree.
Comment 1 Arjan van de Ven 2001-10-26 13:37:44 EDT
This is interesting, as we ship the same driver except for the Intel Proprietary
and Non-Free parts. Which kernel is this exactly?
Comment 2 Matthew Tolentino 2001-10-26 13:41:02 EDT
This is the default 2.4.9-0.12smp kernel.
Comment 3 Scott Feldman 2001-10-26 22:09:28 EDT
The RH merge of the Intel Proprietary driver (version 3.1.23) into the kernel 
breaks the driver on IA-64.  Several critical hardware structures have been 
redefined by RH's merge script as 64-bit values rather than the intended 
32-bit values: an unsigned long on IA-64 is 64 bits long!  (Note that the 
kernel driver is not broken on IA-32.)

The patch below corrects the driver on IA-64.  This patch should be 
incorporated into the final build.

--- drivers/addon/e1000/e1000_fxhw.h	Tue Sep 25 09:19:30 2001
+++ drivers/addon/e1000.fixed/e1000_fxhw.h	Fri Oct 26 18:14:28 2001
@@ -195,20 +195,20 @@ typedef struct _E1000_TRANSMIT_DESCRIPTO
 	E1000_64_BIT_PHYSICAL_ADDRESS BufferAddress;
 
 	union {
-		unsigned long DwordData;
+		u32 DwordData;
 		struct _TXD_FLAGS {
-			unsigned short Length;
-			unsigned char Cso;
-			unsigned char Cmd;
+			u16 Length;
+			u8 Cso;
+			u8 Cmd;
 		} Flags;
 	} Lower;
 
 	union {
-		unsigned long DwordData;
+		u32 DwordData;
 		struct _TXD_FIELDS {
-			unsigned char TransmitStatus;
-			unsigned char Css;
-			unsigned short Special;
+			u8 TransmitStatus;
+			u8 Css;
+			u16 Special;
 		} Fields;
 	} Upper;
 
@@ -216,31 +216,31 @@ typedef struct _E1000_TRANSMIT_DESCRIPTO
 
 typedef struct _E1000_TCPIP_CONTEXT_TRANSMIT_DESCRIPTOR {
 	union {
-		unsigned long IpXsumConfig;
+		u32 IpXsumConfig;
 		struct _IP_XSUM_FIELDS {
-			unsigned char Ipcss;
-			unsigned char Ipcso;
-			unsigned short Ipcse;
+			u8 Ipcss;
+			u8 Ipcso;
+			u16 Ipcse;
 		} IpFields;
 	} LowerXsumSetup;
 
 	union {
-		unsigned long TcpXsumConfig;
+		u32 TcpXsumConfig;
 		struct _TCP_XSUM_FIELDS {
-			unsigned char Tucss;
-			unsigned char Tucso;
-			unsigned short Tucse;
+			u8 Tucss;
+			u8 Tucso;
+			u16 Tucse;
 		} TcpFields;
 	} UpperXsumSetup;
 
-	unsigned long CmdAndLength;
+	u32 CmdAndLength;
 
 	union {
-		unsigned long DwordData;
+		u32 DwordData;
 		struct _TCP_SEG_FIELDS {
-			unsigned char Status;
-			unsigned char HdrLen;
-			unsigned short Mss;
+			u8 Status;
+			u8 HdrLen;
+			u16 Mss;
 		} Fields;
 	} TcpSegSetup;
 
@@ -251,20 +251,20 @@ typedef struct _E1000_TCPIP_DATA_TRANSMI
 	E1000_64_BIT_PHYSICAL_ADDRESS BufferAddress;
 
 	union {
-		unsigned long DwordData;
+		u32 DwordData;
 		struct _TXD_OD_FLAGS {
-			unsigned short Length;
-			unsigned char TypLenExt;
-			unsigned char Cmd;
+			u16 Length;
+			u8 TypLenExt;
+			u8 Cmd;
 		} Flags;
 	} Lower;
 
 	union {
-		unsigned long DwordData;
+		u32 DwordData;
 		struct _TXD_OD_FIELDS {
-			unsigned char TransmitStatus;
-			unsigned char Popts;
-			unsigned short Special;
+			u8 TransmitStatus;
+			u8 Popts;
+			u16 Special;
 		} Fields;
 	} Upper;
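
A minimal sketch of why the type change in the patch matters, assuming a simplified copy of the transmit descriptor. The struct and function names here are hypothetical, a plain uint64_t stands in for E1000_64_BIT_PHYSICAL_ADDRESS, and the <stdint.h> types stand in for the kernel's u8/u16/u32. The hardware expects a 16-byte descriptor; with unsigned long in the unions, an LP64 target such as IA-64 Linux lays out 24 bytes and moves the Upper word from offset 12 to offset 16.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Layout with fixed-width types, as in the patched header (names are illustrative). */
struct txd_fixed {
    uint64_t buffer_address;            /* assumed stand-in for the 64-bit address field */
    union {
        uint32_t dword;
        struct { uint16_t length; uint8_t cso; uint8_t cmd; } flags;
    } lower;
    union {
        uint32_t dword;
        struct { uint8_t status; uint8_t css; uint16_t special; } fields;
    } upper;
};

/* Same descriptor with "unsigned long", as produced by the merge script. */
struct txd_broken {
    uint64_t buffer_address;
    union {
        unsigned long dword;            /* 8 bytes on LP64, 4 bytes on ILP32 */
        struct { uint16_t length; uint8_t cso; uint8_t cmd; } flags;
    } lower;
    union {
        unsigned long dword;
        struct { uint8_t status; uint8_t css; uint16_t special; } fields;
    } upper;
};

/* Small accessors so the layouts can be compared at run time. */
size_t txd_fixed_size(void)          { return sizeof(struct txd_fixed); }
size_t txd_broken_size(void)         { return sizeof(struct txd_broken); }
size_t txd_fixed_upper_offset(void)  { return offsetof(struct txd_fixed, upper); }
size_t txd_broken_upper_offset(void) { return offsetof(struct txd_broken, upper); }
```

On a 32-bit target both layouts come out identical, which matches the observation that the in-kernel driver was not broken on IA-32.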
 
Comment 4 Arjan van de Ven 2001-11-02 16:59:06 EST
Fixed in 2.4.9-12 and later, released as erratum for 7.1
Comment 5 Matt Domsch 2001-11-05 16:39:40 EST
In 2.4.9-13, this part of the patch didn't make it in:
-	unsigned long CmdAndLength;
+	u32 CmdAndLength;

It's part of a hardware descriptor, and only a u32 is written into it.  For 
correctness on IA-64, this needs to be included.
Comment 6 Scott Feldman 2001-11-07 15:57:55 EST
CmdAndLength is used for Tx Checksum offloading and consequently ZEROCOPY.  
This needs to be typed u32, per the patch given earlier.
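
The point made in comments 5 and 6 can be sketched as follows, assuming a simplified context descriptor with hypothetical struct and function names and <stdint.h> types in place of the kernel's u32. Even when every union is already 32 bits wide, leaving the single CmdAndLength field as unsigned long pushes TcpSegSetup from offset 12 to offset 16 on an LP64 target and grows the descriptor beyond 16 bytes. A compile-time assertion on the fixed layout is one way to catch this kind of regression early.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fully fixed layout: four consecutive 32-bit words (names are illustrative). */
struct ctx_fixed {
    uint32_t lower_xsum_setup;
    uint32_t upper_xsum_setup;
    uint32_t cmd_and_length;
    uint32_t tcp_seg_setup;
};

/* Partially fixed layout, as described for 2.4.9-13: one field left as unsigned long. */
struct ctx_partial {
    uint32_t lower_xsum_setup;
    uint32_t upper_xsum_setup;
    unsigned long cmd_and_length;       /* 8 bytes on LP64 */
    uint32_t tcp_seg_setup;
};

/* Guard: the fixed descriptor must stay exactly 16 bytes on every target. */
_Static_assert(sizeof(struct ctx_fixed) == 16, "context descriptor must be 16 bytes");

size_t ctx_fixed_seg_offset(void)   { return offsetof(struct ctx_fixed, tcp_seg_setup); }
size_t ctx_partial_seg_offset(void) { return offsetof(struct ctx_partial, tcp_seg_setup); }
```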
Comment 7 Arjan van de Ven 2001-11-07 16:47:20 EST
Fixed in my current tree; e1000 is still a tad slow, see the nttcp results below
(on a 4-way Dell IA64 machine with 32 GB RAM)
(5th column is Mbit/s of TCP payload)

l 83886080    0.76    0.72    884.8381    929.8965   20480  27003.12   28378.2
1 83886080    0.76    0.75    883.9884    894.7849   21147  27855.79   28196.0
l 83886080    0.76    0.73    883.7637    924.8907   20480  26970.33   28225.4
1 83886080    0.76    0.73    883.0183    919.2995   21211  27909.43   29056.2

eepro100 is now also fixed in my tree for this machine:

l 83886080    7.14    0.33     94.0244   2033.1214   20480   2869.40   62045.9
1 83886080    7.15    1.64     93.8813    409.2004   57921   8102.80   35317.7
l 83886080    7.14    0.32     94.0540   2095.1027   20480   2870.30   63937.5
1 83886080    7.15    1.50     93.8907    447.3924   57911   8102.22   38607.3
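
As a rough cross-check of the tables above, the throughput column can be recomputed from the byte count and the elapsed time. This is a sketch with a hypothetical helper name, and it assumes the second and third columns are bytes transferred and real seconds. Because the printed seconds are rounded to two decimals, the result only approximates the printed Mbit/s value.

```c
#include <assert.h>
#include <stdint.h>

/* Payload throughput in Mbit/s: bytes * 8 bits, divided by seconds, scaled by 10^6. */
double throughput_mbit(uint64_t bytes, double seconds)
{
    return (double)bytes * 8.0 / seconds / 1e6;
}
```

For example, throughput_mbit(83886080, 0.76) gives about 883 Mbit/s for the e1000 rows, and throughput_mbit(83886080, 7.14) gives about 94 Mbit/s for the eepro100 rows, in line with a 100 Mbit adapter running near wire speed.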
Comment 8 Arjan van de Ven 2001-11-07 17:34:00 EST
update: the poor e1000 performance seems to be caused by the other side of the
test (a bcm5700) that doesn't go fast enough to max out the e1000
Comment 9 Clay Cooper 2001-11-13 09:35:57 EST
w/ qa1108 (2.4.9-13.3smp), running nttcp between two e1000 nics in two
bordeauxs, I'm seeing 400-500Mb/s sustained, with infrequent peaks of 800Mb/s.

Each bordeaux has 4 CPUs.  Client machine has 64GB RAM, and server has 32GB RAM.

Cpu utilization on both machines is 75% idle.
Comment 10 Arjan van de Ven 2001-11-13 09:37:43 EST
This is bad performance. Is there a switch involved in the setup?
Comment 11 Clay Cooper 2001-11-14 10:47:36 EST
Initially yes.  I just tried with nothing between but a crossover cable and got
the same speed as going through a gigabit switch.
Comment 12 Scott Feldman 2001-11-19 13:49:23 EST
Run more nttcp sessions (~10) to get an aggregate performance.  A single nttcp 
session over gigabit is like trying to flood a culvert with a garden hose.
Comment 13 Clay Cooper 2001-11-19 14:58:54 EST
Ok, aggregate performance with 2 e1000's and multiple nttcp sessions is roughly
1000-1200Mb/s.  It seems to exceed the gigabit bandwidth, but the total is
fairly consistent as sessions are added:

1 session  -- 500Mb/s
2 sessions -- 500Mb/s each
3 sessions -- 400Mb/s each
4 sessions -- 300Mb/s each
5 sessions -- 225Mb/s each
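
The aggregate figures follow from multiplying the session count by the per-session rate. A small sketch of that arithmetic, with hypothetical type and function names and the numbers copied from the list above:

```c
#include <assert.h>
#include <stddef.h>

/* One measurement: number of parallel nttcp sessions and the rate each one reached. */
struct run {
    int sessions;
    int per_session_mbps;
};

/* Figures reported in the comment above. */
static const struct run runs[] = {
    {1, 500}, {2, 500}, {3, 400}, {4, 300}, {5, 225},
};
#define NUM_RUNS (sizeof runs / sizeof runs[0])

/* Aggregate throughput of one run in Mb/s. */
int aggregate_mbps(struct run r)
{
    return r.sessions * r.per_session_mbps;
}

/* Highest aggregate across all recorded runs. */
int peak_aggregate_mbps(void)
{
    int peak = 0;
    for (size_t i = 0; i < NUM_RUNS; i++) {
        int total = aggregate_mbps(runs[i]);
        if (total > peak)
            peak = total;
    }
    return peak;
}
```

The totals come to 500, 1000, 1200, 1200 and 1125 Mb/s, which is the plateau of roughly 1000-1200Mb/s described above once two or more sessions are running.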
Comment 14 Michael K. Johnson 2001-12-06 12:06:39 EST
Sounds fixed to me.

Anyone disagree?
Comment 15 Clay Cooper 2001-12-06 12:15:18 EST
I agree
Comment 16 John A. Hull 2001-12-06 12:17:19 EST
Closing.
