Bug 63306

Summary: (NET EEPRO100) eepro100 module wait_for_cmd_done timeout! on Toshiba laptop
Product: [Retired] Red Hat Raw Hide
Reporter: Peter Bieringer <pb>
Component: kernel
Assignee: Jeff Garzik <jgarzik>
Status: CLOSED WONTFIX
QA Contact: Brian Brock <bbrock>
Severity: medium
Priority: medium
Version: 1.0
CC: davej, giulioo, jsullivan, krutaw, peterm
Hardware: i686
OS: Linux
Doc Type: Bug Fix
Last Closed: 2004-10-30 04:07:38 UTC

Description Peter Bieringer 2002-04-12 08:12:22 UTC
From Bugzilla Helper:
User-Agent: Mozilla/4.78 [en] (Windows NT 5.0; U)

Description of problem:
After a short period of network use, the following appears in the log:

Apr 11 10:23:06 t1mobil kernel: eepro100.c: $Revision: 1.36 $ 2000/11/17 Modified by Andrey V. Savochkin <saw.com.sg> and others
Apr 11 13:48:36 t1mobil kernel: eepro100: wait_for_cmd_done timeout!
Apr 11 13:54:02 t1mobil kernel: eepro100: wait_for_cmd_done timeout!
Apr 11 14:00:09 t1mobil kernel: eepro100: wait_for_cmd_done timeout!
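
To see how often the timeout recurs, the log excerpt above can be scanned for the timeout message. The following is only a minimal sketch: the function names are hypothetical, the year is passed in because syslog timestamps omit it (2002 is taken from the report date), and `%b` assumes an English/C locale.

```python
import re
from datetime import datetime

# Matches a syslog line carrying the eepro100 timeout message.
TIMEOUT_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d\d:\d\d:\d\d)\s+\S+\s+kernel:\s+"
    r"eepro100: wait_for_cmd_done timeout!"
)


def timeout_times(lines, year=2002):
    """Return a datetime for every wait_for_cmd_done timeout line."""
    times = []
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            # Collapse runs of spaces (syslog pads single-digit days).
            stamp = " ".join(match.group("stamp").split())
            times.append(
                datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
            )
    return times


def gaps_in_seconds(times):
    """Return the number of seconds between consecutive timeouts."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]


# The three timeout lines quoted in this report.
SAMPLE = [
    "Apr 11 13:48:36 t1mobil kernel: eepro100: wait_for_cmd_done timeout!",
    "Apr 11 13:54:02 t1mobil kernel: eepro100: wait_for_cmd_done timeout!",
    "Apr 11 14:00:09 t1mobil kernel: eepro100: wait_for_cmd_done timeout!",
]

if __name__ == "__main__":
    print(gaps_in_seconds(timeout_times(SAMPLE)))  # prints [326, 367]
```

For the lines quoted here, the timeouts are roughly five to six minutes apart.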

Workaround: ifdown eth0; ifup eth0
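
The interface restart above could be triggered automatically whenever the timeout message shows up in the log. This is a hedged sketch, not a fix: the helper names are hypothetical, it assumes `ifdown` and `ifup` are on the PATH and that the script runs as root, and how log lines are fed to it (for example, by following the system log) is left open.

```python
import subprocess

TIMEOUT_MARKER = "eepro100: wait_for_cmd_done timeout!"


def restart_commands(interface="eth0"):
    """Return the two commands of the workaround, in order."""
    return [["ifdown", interface], ["ifup", interface]]


def handle_log_line(line, interface="eth0", run=subprocess.run):
    """Restart the interface if the line reports the eepro100 timeout.

    Returns True when a restart was attempted, False otherwise.
    The `run` parameter exists so the commands can be replaced in tests.
    """
    if TIMEOUT_MARKER not in line:
        return False
    for command in restart_commands(interface):
        run(command, check=False)
    return True
```

Passing a stand-in for `run` allows a dry run without touching the network, e.g. `handle_log_line(line, run=lambda cmd, check=False: print(cmd))`.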

Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. Install kernel 2.4.18-0.16 on a Toshiba Satellite Pro 4600
2. Use the internal NIC


Actual Results:  The network hangs after a short time



Expected Results:  Rawhide 2.4.17-0.18 has no such problems

Additional info:

lspci -vv:

02:08.0 Ethernet controller: Intel Corp. 82820 (ICH2) Chipset Ethernet Controller (rev 03)
        Subsystem: Intel Corp.: Unknown device 3013
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B-
        Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
        Latency: 64 (2000ns min, 14000ns max), cache line size 08
        Interrupt: pin A routed to IRQ 11
        Region 0: Memory at f7dff000 (32-bit, non-prefetchable) [size=4K]
        Region 1: I/O ports at df40 [size=64]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
                Status: D0 PME-Enable- DSel=0 DScale=2 PME-

Comment 1 Alistair Riddoch 2002-05-03 13:27:30 UTC
I have observed the same problems with the following hardware:

Supermicro dual Pentium III server motherboard with dual onboard
NIC.

/sbin/lspci -n

00:00.0 Class 0600: 1166:0009 (rev 05)
00:00.1 Class 0600: 1166:0009 (rev 05)
00:03.0 Class 0300: 1002:4752 (rev 27)
00:04.0 Class 0200: 8086:1229 (rev 08)
00:06.0 Class 0200: 8086:1229 (rev 08)
00:0f.0 Class 0601: 1166:0200 (rev 4f)
00:0f.1 Class 0101: 1166:0211
00:0f.2 Class 0c03: 1166:0220 (rev 04)
01:01.0 Class 0100: 1119:01d6
01:03.0 Class 0100: 9005:008f (rev 02)
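
For comparing affected machines, the Intel Ethernet controllers can be picked out of the `lspci -n` listing above by PCI class 0200 (Ethernet controller) and vendor ID 8086 (Intel). A minimal sketch with a hypothetical function name; treating the `Class ` prefix as optional is an assumption made so the pattern also accepts listings that omit it.

```python
import re

# slot, optional "Class " prefix, class code, vendor:device
LSPCI_N_RE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]) (?:Class )?"
    r"(?P<cls>[0-9a-f]{4}): (?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})"
)


def intel_ethernet_devices(lines):
    """Return (slot, device_id) for each Intel Ethernet controller."""
    found = []
    for line in lines:
        match = LSPCI_N_RE.match(line.strip())
        if match and match.group("cls") == "0200" and match.group("vendor") == "8086":
            found.append((match.group("slot"), match.group("device")))
    return found


# A few lines from the listing in this comment.
SAMPLE = [
    "00:00.0 Class 0600: 1166:0009 (rev 05)",
    "00:03.0 Class 0300: 1002:4752 (rev 27)",
    "00:04.0 Class 0200: 8086:1229 (rev 08)",
    "00:06.0 Class 0200: 8086:1229 (rev 08)",
    "00:0f.1 Class 0101: 1166:0211",
]

if __name__ == "__main__":
    print(intel_ethernet_devices(SAMPLE))
    # prints [('00:04.0', '1229'), ('00:06.0', '1229')]
```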


/sbin/lspci -v 

...

00:04.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
        Subsystem: Intel Corporation EtherExpress PRO/100+ Server Adapter (PILA8470B)
        Flags: bus master, medium devsel, latency 64, IRQ 22
        Memory at feafc000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at d000 [size=64]
        Memory at fe800000 (32-bit, non-prefetchable) [size=1M]
        Capabilities: <available only to root>

00:06.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 08)
        Subsystem: Intel Corporation EtherExpress PRO/100+ Server Adapter (PILA8470B)
        Flags: bus master, medium devsel, latency 64, IRQ 31
        Memory at feafd000 (32-bit, non-prefetchable) [size=4K]
        I/O ports at d400 [size=64]
        Memory at fe900000 (32-bit, non-prefetchable) [size=1M]
        Capabilities: <available only to root>

...

The symptoms are extremely poor performance or complete loss of connectivity,
accompanied by "eepro100: wait_for_cmd_done timeout!" errors in the logs.
The error has been reproduced with both the eepro100 and e100 drivers,
in kernels 2.4.9-31enterprise (7.2 updates) and 2.4.18-0.13smp (skipjack
beta 2). I have also tested a version of the driver published by Donald Becker,
with his revision number 1.20, but that driver causes a kernel panic under load.

The system in question is not yet in production use, so I am willing to test
possible fixes.

Comment 2 Alistair Riddoch 2002-05-03 14:26:25 UTC
I can confirm that the bug also exists in kernel 2.4.7-10 enterprise.

Comment 3 Need Real Name 2002-07-03 14:47:06 UTC
I have also experienced this issue.  I am using a Gateway E-1400 with an Intel 
i82557 chipset.  I am booting on a 2.4.7-10BOOT kernel, and am booting via 
PXE.  The problem is intermittent, but when it does occur, the machine has no 
network connectivity.

Comment 4 Jason M. Sullivan 2002-08-09 01:40:49 UTC
I'm seeing this too, on an IBM NetVista with the Intel 810e chipset.  It's
preventing me from netloading the system and forcing me to burn a CD or shuffle
drives around (both major pains).

Comment 5 Jason M. Sullivan 2002-08-09 22:59:44 UTC
I managed to load the system from some burned CDs, and I'm still getting the
error.  Since I can now get a debug kernel on the machine (eventually), what
should I try?  I'd love to know if this is a hardware error or not.


Comment 6 giulioo 2002-10-25 16:38:50 UTC
Have you disabled sleep mode in the card firmware?
For example, with ftp://ftp.scyld.com/pub/diag/eepro100-diag.c

I solved this problem on another kernel by disabling sleep mode. This is a
common issue.

Comment 7 Jeff Garzik 2002-10-25 16:44:40 UTC
Note that Red Hat also ships eepro100-diag in the kernel-utils RPM.


Comment 8 Jeff Garzik 2002-10-25 17:02:59 UTC
*** Bug 62799 has been marked as a duplicate of this bug. ***

Comment 9 Jason M. Sullivan 2002-10-25 18:13:55 UTC
So the question is, shouldn't network install kernels have this disabled by default?

Comment 10 Jeff Garzik 2002-10-25 18:20:52 UTC
It is not obvious from the comments that this is definitely the problem.

Does disabling sleep mode via eepro100-diag fix the problem, for all bug reporters?


Comment 11 Jeff Garzik 2003-01-19 00:00:43 UTC
No response after multiple months.  I believe this problem to be fixed in the
current Red Hat kernel.  Please try with 8.1 beta or the latest errata kernel
for 7.1/7.2/8.0 and file a new bug if the behavior persists.


Comment 12 Peter Bieringer 2003-01-22 12:51:32 UTC
Sorry, but the latest RHL kernel, 2.4.18-18.7.x, still shows this problem after some minutes (around 10, say). The NIC is connected to a 10 Mbit single-speed hub, and mii-tool correctly reports "no autonegotiation, 10baseT-HD, link ok".

Comment 13 Jeff Garzik 2004-03-03 05:57:16 UTC
Does this problem occur with the e100 driver?

We are deprecating eepro100 in favor of the latest e100.