Bug 115877

Summary: e1000 keeps locking up
Product: [Fedora] Fedora
Reporter: Thomas J. Baker <tjb>
Component: kernel
Assignee: Arjan van de Ven <arjanv>
Status: CLOSED RAWHIDE
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: rawhide
CC: gczarcinski, paul.0000.black, sait.a.umar, scott.feldman, tao
Target Milestone: ---   
Target Release: ---   
Hardware: i686   
OS: Linux   
Doc Type: Bug Fix
Last Closed: 2004-02-25 14:50:15 UTC
Attachments:
Description: tar ball of six patches to apply in order against 2.6.3
Flags: none

Description Thomas J. Baker 2004-02-16 20:05:24 UTC
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.1)
Gecko/20031114 Galeon/1.3.11a

Description of problem:
The e1000 driver appears to lock up several times a day for me on a
Dell Precision 650N with the Intel e1000 gigabit ethernet. The dmesg
output looks like this:

NETDEV WATCHDOG: eth0: transmit timed out
e1000: eth0 NIC Link is Up 100 Mbps Full Duplex

All the FC2 2.6 kernels have exhibited this problem. Some seem worse
than others, but I don't know precisely what is triggering the problem
(network load, CPU load, etc.). The latest 2.6.2-81 kernel seems a
little better than the previous 2.6.1-65, but it still has the problem.

Version-Release number of selected component (if applicable):
kernel-smp-2.6.2-1.81

How reproducible:
Didn't try


Additional info:

Comment 1 Paul Black 2004-02-20 11:16:46 UTC
I'm seeing the same. It seems to be worst when copying lots of data
to/from an NFS server.

The other thing that I notice is that initialising eth0 takes a long
time compared with machines that have different cards in.

Machine is a Dell Optiplex SX270.

Ethernet (builtin) from lspci:
0000:01:0c.0 Ethernet controller: Intel Corp. 82540EM Gigabit Ethernet
Controller (rev 02)

Kernel is 2.6.3-1.91smp.


Comment 2 Sammy 2004-02-21 15:47:39 UTC
I am just now experiencing this with the latest 2.6.3 kernel. I had not experienced
it earlier. Essentially the network is useless for transferring large files. I am
getting hundreds of
 
Feb 21 08:28:18 compsci kernel: e1000: eth0 NIC Link is Up 100 Mbps Full Duplex 
Feb 21 08:28:33 compsci kernel: NETDEV WATCHDOG: eth0: transmit timed out 
 
in /var/log/messages. 
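
For anyone who wants a number rather than "hundreds", here is a minimal sketch that counts the watchdog timeouts per interface in a log. The regular expression is inferred only from the two sample lines quoted above, and the function name `count_watchdog_timeouts` is just an illustrative choice, not part of any tool.

```python
import re
from collections import Counter

# Pattern inferred from the log lines quoted in this comment (an assumption:
# other kernels may word the message differently).
WATCHDOG_RE = re.compile(r"NETDEV WATCHDOG: (\S+): transmit timed out")


def count_watchdog_timeouts(lines):
    """Return a Counter mapping interface name -> number of transmit timeouts."""
    counts = Counter()
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

It takes any iterable of lines, so an open file handle works: `count_watchdog_timeouts(open("/var/log/messages", errors="replace"))`.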

Comment 3 Sammy 2004-02-21 15:49:48 UTC
Another note: this seems to happen when transmitting out of the machine only. 

Comment 4 Sammy 2004-02-22 16:19:41 UTC
I took the driver back to the 27-ko version from 2.6.2-rc1 and am still having the
same problem. I am sure I did not have this problem at some point with
2.6.2, so I am beginning to think that the problem is also associated with
another part of the kernel that was upgraded.

Comment 5 Sammy 2004-02-22 16:24:01 UTC
Well, there is another bug, 115566, that reports the same problem.
Judging from that and the comments here, the problem seems to be
happening with SMP kernels. I am on a Dell Precision 350n, single
CPU but with hyperthreading turned on, so I use an SMP kernel.

Comment 6 Gene Czarcinski 2004-02-22 20:16:34 UTC
The drivers and what is happening are somewhat different in 115566.

Comment 7 Scott Feldman 2004-02-24 03:00:29 UTC
There are some patches against 5.2.30.1-k1 that have been posted to 
netdev to fix some issues with e1000.  I'll attach the patches for 
y'all to try.  These patches would go into 2.6.4.

Comment 9 Sammy 2004-02-24 15:37:31 UTC
Could you please attach them as a tar file or text? Some
characters are lost when saved from HTML.
Thanks

Comment 10 Scott Feldman 2004-02-24 15:46:48 UTC
Created attachment 97995 [details]
tar ball of six patches to apply in order against 2.6.3
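
For those applying these by hand, here is a rough sketch of applying a directory of patches in order. It assumes the tar ball has already been unpacked, that the files end in `.patch`, and that sorting by file name gives the intended order; none of that is confirmed by the attachment itself, and `patches_in_order` and `apply_patches` are hypothetical helper names. It defaults to GNU patch's `--dry-run` so nothing is modified until the whole series is known to apply.

```python
import subprocess
from pathlib import Path


def patches_in_order(patch_dir):
    """Return the patch files sorted by name.

    Assumption: the file names encode the intended order (e.g. a numeric
    prefix) and use a .patch suffix; adjust the glob if they do not.
    """
    return sorted(Path(patch_dir).glob("*.patch"))


def apply_patches(patch_dir, kernel_tree, dry_run=True):
    """Run `patch -p1` for each patch from the top of the kernel tree."""
    for patch in patches_in_order(patch_dir):
        cmd = ["patch", "-p1", "-i", str(patch.resolve())]
        if dry_run:
            cmd.insert(1, "--dry-run")  # check only; do not touch the tree
        # check=True stops at the first patch that fails to apply.
        subprocess.run(cmd, cwd=kernel_tree, check=True)
```

Note that a dry run tests each patch against the unmodified tree, so a later patch that depends on an earlier one may report failures that would not occur in a real run. Rerun with `dry_run=False` to actually change the tree.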

Comment 11 Sammy 2004-02-24 16:38:35 UTC
Great, that FIXED it. By the way, whatever the problem was, it was
only happening with the SMP version of the kernel. Now the SMP
kernel works fine.
 
I am using the latest arjanv kernel upgraded to bk6 and very few 
bk-patches from the -mm3. 

Comment 12 Scott Feldman 2004-02-24 16:55:41 UTC
Ok, good.

So I don't know what the exit criteria are to close these Bugzillas,
but I do know the attached patches have been submitted to the 
upstream netdev-2.6 BK tree.  If accepted into netdev-2.6, these 
patches will propagate to the downstream kernels, and ultimately to 
the RH kernel.

Comment 13 Sammy 2004-02-25 14:31:18 UTC
I see that the patches made it into 2.6.3-bk7. 

Comment 14 Dave Jones 2004-02-25 14:50:15 UTC
fixed in rawhide