Bug 56058 - Intel Ethernet Express/100 Hardware is b0rked for 10 Mbit
Status: CLOSED CURRENTRELEASE
Product: Red Hat Linux
Classification: Retired
Component: kernel
Version: 7.2
Hardware: i386 Linux
Priority: medium
Severity: medium
Assigned To: Jeff Garzik
QA Contact: Brock Organ
Duplicates: 57219
Reported: 2001-11-12 02:15 EST by bpk
Modified: 2013-07-02 22:05 EDT
Doc Type: Bug Fix
Last Closed: 2003-06-07 16:55:47 EDT


Description bpk 2001-11-12 02:15:22 EST
From Bugzilla Helper:
User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Mac_PowerPC)

Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:
Always

Steps to Reproduce:
1. boot up
2. begin to copy a large file over the network using any protocol
3. network connection is lost; /var/log/messages shows "eepro100: wait_for_cmd_done timeout"

Additional info:
Comment 1 Arjan van de Ven 2001-11-12 04:37:30 EST
Is this with the 2.4.9-13 kernel?
Comment 2 bpk 2001-11-12 04:56:32 EST
I experience this problem with the kernel from the 
enigma-i386-disc[12].iso CD and with 2.4.9-13 from the update rpm.
Comment 3 Arjan van de Ven 2001-11-12 04:58:55 EST
Could you please attach the "lspci" and "lspci -n" output and give a brief description
of the network (e.g. 10 or 100 Mbit, half or full duplex, hub or switch)?

Also it might be worth trying the e100 driver; it seems to work in some cases
where eepro100 doesn't (but doesn't work for other cases where eepro100 works)
Comment 4 bpk 2001-11-12 05:31:31 EST
The e100 driver was not available; there was no e100 module in 
/lib/modules/2.4.?/kernel/drivers/net/ -- this was a fresh install of 7.2 on a 
new hard drive. Experienced same problem after upgrading to updated 
kernel rpm (and still no e100 driver in 
/lib/modules/2.4.9-13/kernel/drivers/net/). I did try just "eepro" in 
/etc/modules.conf, which didn't work at all.

Same computer/NIC had no problems with 7.1 and e100. In fact, I've 
reverted to a fresh install of 7.1 in order to get some work done.

lspci says:

00:00.0 Host bridge: Intel Corporation 82815 815 Chipset Host Bridge and Memory Controller Hub (rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82815 CGC [Chipset Graphics Controller]  (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82820 820 (Camino 2) Chipset PCI (rev 02)
00:1f.0 ISA bridge: Intel Corporation 82820 820 (Camino 2) Chipset ISA Bridge (ICH2) (rev 02)
00:1f.1 IDE interface: Intel Corporation 82820 820 (Camino 2) Chipset IDE U100 (rev 02)
00:1f.2 USB Controller: Intel Corporation 82820 820 (Camino 2) Chipset USB (Hub A) (rev 02)
00:1f.4 USB Controller: Intel Corporation 82820 820 (Camino 2) Chipset USB (Hub B) (rev 02)
00:1f.5 Multimedia audio controller: Intel Corporation: Unknown device 2445 (rev 02)
02:08.0 Ethernet controller: Intel Corporation 82820 820 (Camino 2) Chipset Ethernet (rev 01)

lspci -n says:

00:00.0 Class 0600: 8086:1130 (rev 02)
00:02.0 Class 0300: 8086:1132 (rev 02)
00:1e.0 Class 0604: 8086:244e (rev 02)
00:1f.0 Class 0601: 8086:2440 (rev 02)
00:1f.1 Class 0101: 8086:244b (rev 02)
00:1f.2 Class 0c03: 8086:2442 (rev 02)
00:1f.4 Class 0c03: 8086:2444 (rev 02)
00:1f.5 Class 0401: 8086:2445 (rev 02)
02:08.0 Class 0200: 8086:2449 (rev 01)

The network is 10Mbps, not sure about duplex (how to test?), plugged into 
a router with integrated hub (Netopia 7200). Network connection was lost 
during xfer of 1.45Gb file from local network (scp or ftp) and during xfer of 
kernel update rpm from updates.redhat.com (had to xfer update to another 
computer and burn to CD to xfer to 7.2 machine).
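
Editorial sketch (not part of the original comment): to pick the NIC out of an "lspci -n" listing like the one above, the lines can be filtered on PCI class 0200 (Ethernet controller). The function name and the trimmed sample are illustrative assumptions; the line layout is the one quoted in this report.

```python
import re

# Matches lines of the form "02:08.0 Class 0200: 8086:2449 (rev 01)",
# i.e. the `lspci -n` layout quoted in this report.
LSPCI_N_LINE = re.compile(
    r"^(?P<slot>[0-9a-f]{2}:[0-9a-f]{2}\.[0-7])\s+Class\s+(?P<cls>[0-9a-f]{4}):\s+"
    r"(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})"
    r"(?:\s+\(rev\s+(?P<rev>[0-9a-f]{2})\))?",
    re.IGNORECASE,
)

ETHERNET_CLASS = "0200"  # PCI class code for Ethernet controllers


def find_ethernet_controllers(lspci_n_output):
    """Return (slot, vendor, device, rev) tuples for every Ethernet controller."""
    found = []
    for line in lspci_n_output.splitlines():
        match = LSPCI_N_LINE.match(line.strip())
        if match and match.group("cls") == ETHERNET_CLASS:
            found.append((match.group("slot"), match.group("vendor"),
                          match.group("device"), match.group("rev")))
    return found


# Three lines copied from the listing above (trimmed for brevity).
SAMPLE = """\
00:00.0 Class 0600: 8086:1130 (rev 02)
00:1f.5 Class 0401: 8086:2445 (rev 02)
02:08.0 Class 0200: 8086:2449 (rev 01)
"""

if __name__ == "__main__":
    for slot, vendor, device, rev in find_ethernet_controllers(SAMPLE):
        print(f"{slot}: vendor {vendor}, device {device}, rev {rev}")
```

The vendor:device pair (here 8086:2449) is what identifies the exact chip, which is why both listings were requested.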
Comment 5 Arjan van de Ven 2001-11-12 05:41:09 EST
e100 is in drivers/addon/e100 not in drivers/net
(it's not GPL compatible and also I'm moving all non-standard drivers to
drivers/addon)
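
Editorial sketch (not part of the original comment): since the module may live outside drivers/net, one way to locate it is to search the whole drivers tree rather than a single directory. The function name is hypothetical, and the ".o"/".ko" suffix list is an assumption covering 2.4-era and later module file names.

```python
import os

# File suffixes assumed for kernel modules: ".o" on 2.4 kernels, ".ko" on later ones.
MODULE_SUFFIXES = (".o", ".ko")


def find_module_files(modules_root, name):
    """Return sorted paths under modules_root named <name>.o or <name>.ko."""
    wanted = {name + suffix for suffix in MODULE_SUFFIXES}
    hits = []
    for dirpath, _dirnames, filenames in os.walk(modules_root):
        for filename in filenames:
            if filename in wanted:
                hits.append(os.path.join(dirpath, filename))
    return sorted(hits)


if __name__ == "__main__":
    # Example invocation; the kernel version directory is the one named in this thread.
    for path in find_module_files("/lib/modules/2.4.9-13/kernel/drivers", "e100"):
        print(path)
```

An empty result means no file with that exact name exists under the given root; it does not by itself say whether the driver was built into the kernel.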
Comment 6 Ed Marshall 2001-12-11 12:44:03 EST
I was seeing the same problems with 2.4.7-10smp on an HP NetServer LPr using 
the eepro100 module; basically, under anything over 1Mbps of network load, the 
driver would start blasting "eepro100: wait_for_cmd_done timeout" to console at 
a rate of about 1 per second. This would halt any network activity 
until "ifdown eth0;ifup eth0" was performed (or until connections timed 
out/reset on their own). Occasionally, the system will completely freeze up.
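
Editorial sketch (not part of the original comment): the manual "ifdown eth0;ifup eth0" workaround could be automated by counting the timeout message in recent log lines and bouncing the interface once a threshold is reached. The function names and the threshold of 5 are arbitrary assumptions, and this only masks the symptom; it does not address the driver or hardware problem.

```python
import subprocess

# The message quoted in this report; matched as a substring of each log line.
TIMEOUT_MARKER = "eepro100: wait_for_cmd_done timeout"


def count_timeouts(log_lines):
    """Count log lines that contain the eepro100 timeout message."""
    return sum(1 for line in log_lines if TIMEOUT_MARKER in line)


def needs_bounce(log_lines, threshold=5):
    """Return True once the timeout message appears at least `threshold` times."""
    return count_timeouts(log_lines) >= threshold


def bounce_interface(iface="eth0"):
    """Run the workaround from the comment above: ifdown, then ifup (needs root)."""
    subprocess.run(["ifdown", iface], check=True)
    subprocess.run(["ifup", iface], check=True)
```

A caller would feed it the tail of /var/log/messages and call bounce_interface() only when needs_bounce() returns True.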

dmesg reports:

    Uhhuh. NMI received. Dazed and confused, but trying to continue
    You probably have a hardware problem with your RAM chips.

I've run memtest86 just to be sure, and all appears to be good with the memory 
in this machine.

Switching to e100 stopped the wait_for_cmd_done messages, but under moderate 
network load the NMI message still appears and the same problems occur 
(actually, the frequency of lockups seems to increase).

Once I've built a megaraid module that works (see bugzilla #55448 and #47221), 
I'll give the 2.4.9-13smp kernel a spin.

lspci output:

00:00.0 Host bridge: Intel Corporation 440BX/ZX - 82443BX/ZX Host bridge (AGP disabled) (rev 03)
00:04.0 ISA bridge: Intel Corporation 82371AB PIIX4 ISA (rev 02)
00:04.1 IDE interface: Intel Corporation 82371AB PIIX4 IDE (rev 01)
00:04.2 USB Controller: Intel Corporation 82371AB PIIX4 USB (rev 01)
00:04.3 Bridge: Intel Corporation 82371AB PIIX4 ACPI (rev 02)
00:07.0 PCI bridge: Digital Equipment Corporation DECchip 21152 (rev 03)
00:0d.0 VGA compatible controller: Cirrus Logic GD 5446 (rev 45)
01:02.0 PCI bridge: Intel Corporation 80960RP [i960 RP Microprocessor/Bridge] (rev 03)
01:02.1 I2O: Intel Corporation 80960RP [i960RP Microprocessor] (rev 03)
01:03.0 Ethernet controller: Intel Corporation 82557 [Ethernet Pro 100] (rev 05)
01:04.0 SCSI storage controller: Symbios Logic Inc. (formerly NCR) 53c895 (rev 01)

lspci -n:

00:00.0 Class 0600: 8086:7192 (rev 03)
00:04.0 Class 0601: 8086:7110 (rev 02)
00:04.1 Class 0101: 8086:7111 (rev 01)
00:04.2 Class 0c03: 8086:7112 (rev 01)
00:04.3 Class 0680: 8086:7113 (rev 02)
00:07.0 Class 0604: 1011:0024 (rev 03)
00:0d.0 Class 0300: 1013:00b8 (rev 45)
01:02.0 Class 0604: 8086:0960 (rev 03)
01:02.1 Class 0e00: 8086:1960 (rev 03)
01:03.0 Class 0200: 8086:1229 (rev 05)
01:04.0 Class 0100: 1000:000c (rev 01)

I'd be glad to provide any additional information I can on this.
Comment 7 Need Real Name 2001-12-11 17:19:22 EST
I've had the same issue with kernel 2.4.4 & eepro100. I've built and installed 
the latest e100 module from Intel and have not seen the problem since.
Comment 8 Michael Young 2001-12-12 07:05:44 EST
I had this problem when I was on a 10 Mb half duplex hub for a week. The network
hung under high load, and sometimes reset after 5-10 minutes, though it didn't
seem to recover when X was running. It hasn't been a problem before or after on
my usual 100 Mb full duplex connection (probably to a switch). This was with
2.4.9-13 using the eepro100 driver, lspci is
00:00.0 Host bridge: Intel Corporation 82815 815 Chipset Host Bridge and Memory Controller Hub (rev 02)
00:02.0 VGA compatible controller: Intel Corporation 82815 CGC [Chipset Graphics Controller]  (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801BAM PCI (rev 02)
00:1f.0 ISA bridge: Intel Corporation 82801BA ISA Bridge (ICH2) (rev 02)
00:1f.1 IDE interface: Intel Corporation 82801BA IDE U100 (rev 02)
00:1f.2 USB Controller: Intel Corporation 82801BA(M) USB (Hub A) (rev 02)
00:1f.3 SMBus: Intel Corporation 82801BA(M) SMBus (rev 02)
00:1f.4 USB Controller: Intel Corporation 82801BA(M) USB (Hub B) (rev 02)
00:1f.5 Multimedia audio controller: Intel Corporation 82801BA(M) AC'97 Audio (rev 02)
01:08.0 Ethernet controller: Intel Corporation 82801BA(M) Ethernet (rev 01)
lspci -n is
00:00.0 Class 0600: 8086:1130 (rev 02)
00:02.0 Class 0300: 8086:1132 (rev 02)
00:1e.0 Class 0604: 8086:244e (rev 02)
00:1f.0 Class 0601: 8086:2440 (rev 02)
00:1f.1 Class 0101: 8086:244b (rev 02)
00:1f.2 Class 0c03: 8086:2442 (rev 02)
00:1f.3 Class 0c05: 8086:2443 (rev 02)
00:1f.4 Class 0c03: 8086:2444 (rev 02)
00:1f.5 Class 0401: 8086:2445 (rev 02)
01:08.0 Class 0200: 8086:2449 (rev 01)
Comment 9 Arjan van de Ven 2001-12-12 07:25:42 EST
Someone from Sun Microsystems provided a fix for the most common problems; a
kernel with this fix can be grabbed from:
http://people.redhat.com/arjanv/testkernels/

The 10 Mbit half-duplex problem is a hardware bug in some revisions of the eepro100
card, and this kernel also contains an attempt at a workaround; since I don't have
one of the broken revisions I cannot test it...
Comment 10 Michael Young 2001-12-14 05:35:41 EST
I have tried your test kernel (2.4.9-17.6.i686), and it looks like there are
still problems. The system has frozen twice under high loads on a 10 Mb hub,
which seems to require a power cycle to clear, though possibly this occurs less
often than the original bug.
Comment 11 Michael Young 2002-01-30 10:06:32 EST
2.4.9-21 behaves in the same way as 2.4.9-13, i.e. fine at 100 Mbit,
wait_for_cmd_done timeout errors and long pauses on 10 Mbit.
Comment 12 Arjan van de Ven 2002-02-11 12:14:08 EST
*** Bug 57219 has been marked as a duplicate of this bug. ***
Comment 13 Alan Cox 2003-06-07 16:55:47 EDT
2.4.18 and later errata should have resolved this. If not, please re-open.

Thanks
