Bug 72472
Summary: | kernel tries to load 8139cp instead of 8139too | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 2.1 | Reporter: | Jos van den Oever <oever> |
Component: | kernel | Assignee: | Jeff Garzik <jgarzik> |
Status: | CLOSED WONTFIX | QA Contact: | Brian Brock <bbrock> |
Severity: | medium | Docs Contact: | |
Priority: | medium | ||
Version: | 2.1 | CC: | maspotts, peterm, trevor |
Target Milestone: | --- | ||
Target Release: | --- | ||
Hardware: | i686 | ||
OS: | Linux | ||
Whiteboard: | |||
Fixed In Version: | Doc Type: | Bug Fix | |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2006-10-01 05:17:03 UTC | Type: | --- |
Description
Jos van den Oever
2002-08-24 09:32:04 UTC
This is because they're both listed in the hotplug map...

This has just started happening to me after installing the new errata kernel 2.4.18-17-7.x on my Sony VAIO PCG-XG19. A workaround is to run 'modprobe 8139too' after booting. But for some reason, adding 'alias eth0 8139too' to /etc/modules.conf doesn't work. Is there a correct way to specify a different driver in modules.conf?

Jeff... your backyard ;)

I need to extend the kernel's PCI map to include PCI revision id, and patch modutils to support it.

Any idea when that modification will make it into a new errata kernel for Red Hat 7.1?

I have a similar problem: I have RH7.2 and a "Surecom EP-428X" CardBus card. There is a very simple workaround: edit /lib/modules/2.4.18-18.7.x/modules.pcimap and delete the line that begins with "8139cp". BTW it worked without problems with older kernels (e.g. 2.4.9-31). This is an annoyance. I recommend keeping the line deleted until a proper solution can be found. Regards, shurdeek

This bug exists (reintroduced?) in FC5 2.6.17-1.2187_FC5smp! I just upgraded a box from FC3 to FC4 to FC5 and hit this exact bug. It took me forever to figure out, as the symptoms were very strange. This bug did NOT exist in FC3, at least on this system, which ran FC3 for over 1.5 years. It may have existed in FC4 for the brief time (hours) I was in transition through FC4, but I can't be certain.

The solution of removing the 8139cp line in pcimap works perfectly. I tested with over a dozen boots with and without the line and can confirm that the entire bug hinges on the presence of that line. Note that if the line is NOT deleted, sometimes the bug does NOT show up: I'd say 1 out of 5 boots will work even with the line intact. Take out the line, and 100% of boots will work. I do believe there's some race condition in the order of module loading.

My NIC config: ne2k-pci as eth0; 8139too as eth1.

Here's a dmesg snippet from a "good" boot (pcimap line deleted):

    ne2k-pci.c:v1.03 9/22/2003 D. Becker/P. Gortmaker
    http://www.scyld.com/network/ne2k-pci.html
    eth0: RealTek RTL-8029 found at 0xa800, IRQ 137, 52:54:05:F6:1B:D0.
    gameport: EMU10K1 is pci0000:00:09.1/gameport0, io 0xb400, speed 1169kHz
    piix4_smbus 0000:00:04.3: Found 0000:00:04.3 device
    8139too Fast Ethernet driver 0.9.27
    eth1: RealTek RTL8139 at 0xd08fc000, 00:40:f4:15:2d:1c, IRQ 145
    eth1: Identified 8139 chip type 'RTL-8139C'
    8139cp: 10/100 PCI Ethernet driver v1.2 (Mar 22, 2004)

Note, ne2k loads first and grabs eth0, as it should.

Snippet from a buggy boot (pcimap line intact):

    8139too Fast Ethernet driver 0.9.27
    eth0: RealTek RTL8139 at 0xd0836000, 00:40:f4:15:2d:1c, IRQ 145
    eth0: Identified 8139 chip type 'RTL-8139C'
    gameport: EMU10K1 is pci0000:00:09.1/gameport0, io 0xb400, speed 1193kHz
    8139cp: 10/100 PCI Ethernet driver v1.2 (Mar 22, 2004)
    Linux video capture interface: v1.00
    piix4_smbus 0000:00:04.3: Found 0000:00:04.3 device
    SCSI subsystem initialized
    ne2k-pci.c:v1.03 9/22/2003 D. Becker/P. Gortmaker
    http://www.scyld.com/network/ne2k-pci.html
    eth1: RealTek RTL-8029 found at 0xa800, IRQ 137, 52:54:05:F6:1B:D0.

Note, 8139 captures eth0 in defiance of 'alias eth0 ne2k-pci' in modprobe.conf.

The symptom I see on the buggy boots is that neither network interface will function. eth0 won't get its WAN DHCP. The LAN eth1 shows its static IP but cannot send or receive any traffic except for rx broadcasts. ifdown'ing and ifup'ing till the cows come home has no effect. I can fix the problem on a buggy boot by ifdown'ing and rmmod'ing both NICs and ifup'ing the ne2k eth0 first.

Should this bug be switched to FC5 to indicate it's not a stale RHEL2 bug?

In case it wasn't clear, unlike the original poster, my system is a standard desktop with PCI cards, not a laptop with PCMCIA cards.

This is not a kernel problem. The kernel is not the entity that chooses to load one driver or another.

Huh? Instead of closing the bug, why not then reassign it to the correct component?
If not the kernel, then what? This is a serious bug that leaves a machine remotely inaccessible, with total network failure. It must be a "bug" because unchanged/untouched hardware that ran perfectly in FC1 and FC3 all of a sudden fails in FC5. Deleting a line in modules.pcimap every time I yum update the kernel hardly seems ideal. Perhaps my bug is actually a new bug that just happens to share the solution to the original bug?

I looked in rc.sysinit to see how FC5 loads these modules vs FC3. It appears to be quite different and hints that the problem could be in udev (/sbin/start_udev). Any hints on how to proceed are appreciated.

Perhaps related to (but not identical to) bug 178165.

Perhaps related: I just checked the problem system against another FC5 system and was surprised to notice that modules.pcimap has the 8139xx entries in a different order. And both these systems are running the exact same kernel! I suppose that pcimap file is generated by something else. I'm wondering if the order of 8139too vs 8139cp may affect whether this bug gets hit or not.

As per bug 178165, adding HWADDR fields to the ifcfg-ethX files does indeed appear to solve the problem. I just did so and tested it: the modules are loaded in the correct order (ne2k then 8139too) and eth0/eth1 are assigned correctly.

I think the "bug" here is the change in human interface from using modprobe.conf to specify ethX aliases to using HWADDR. modprobe.conf is not only largely ignored (by udev), but it still seems to be used by ifup/ifdown/etc. and thus really messes things up by creating conflicts.

So what *does* control module loading order? In FC5, it appears to be udev (/sbin/start_udev, to be precise).
Obviously there are two "correct" solutions: 1) fix the pcimap code to [optionally] account for the Revision field, as Jeff suggested three years ago (I can't find any indication that this has happened yet), and 2) merge the 8139too and 8139cp modules, as there logically should NOT be two different but NOT equal drivers for the exact same [logically speaking] piece of hardware. I believe situations such as aic_7xxx are different because the old and new drivers provided essentially the same functionality for the same hardware: the new driver added PCI IDs and dropped other PCI IDs, but for the ones that were dropped the pcimap did not indicate the new driver to be authoritative.

How do you copy an entire bug in Bugzilla, so this can be reopened under FC5, component udev?
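
For anyone trying to see the conflict for themselves, here is a rough Python sketch of the two ideas discussed in this thread: finding modules that claim the same vendor/device pair in a pcimap-style file, and breaking the tie with the PCI revision as Jeff proposed. The sample map text is made up for illustration (it is not copied from any reporter's system), the function names are hypothetical, and the 0x20 revision cut-off reflects the check the 8139too and 8139cp drivers themselves apply to 10ec:8139 devices; treat it as an assumption for your kernel version.

```python
# Illustrative sketch only: SAMPLE_PCIMAP is invented sample data in the
# modules.pcimap column layout, and all function names are hypothetical.

SAMPLE_PCIMAP = """\
# pci module         vendor     device     subvendor  subdevice  class      class_mask driver_data
8139cp               0x000010ec 0x00008139 0xffffffff 0xffffffff 0x00000000 0x00000000 0x0
8139too              0x000010ec 0x00008139 0xffffffff 0xffffffff 0x00000000 0x00000000 0x0
ne2k-pci             0x000010ec 0x00008029 0xffffffff 0xffffffff 0x00000000 0x00000000 0x0
"""


def parse_pcimap(text):
    """Return (module, vendor, device) tuples, skipping comments and blank lines."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        entries.append((fields[0], int(fields[1], 16), int(fields[2], 16)))
    return entries


def claimants(entries, vendor, device):
    """Modules whose entry matches the vendor/device pair, in file order."""
    return [m for m, v, d in entries if v == vendor and d == device]


def pick_by_revision(candidates, revision):
    """Break the 8139cp/8139too tie using the PCI revision id.

    Assumption: revision >= 0x20 means an 8139C+ chip (8139cp);
    anything lower belongs to 8139too.
    """
    if "8139cp" in candidates and "8139too" in candidates:
        return "8139cp" if revision >= 0x20 else "8139too"
    return candidates[0] if candidates else None


if __name__ == "__main__":
    entries = parse_pcimap(SAMPLE_PCIMAP)
    both = claimants(entries, 0x10EC, 0x8139)
    print("modules claiming 10ec:8139:", both)
    # 0x10 is a hypothetical revision below the 0x20 cut-off.
    print("choice for revision 0x10:", pick_by_revision(both, 0x10))
```

To try it against a real system, read the text from /lib/modules/<kernel version>/modules.pcimap instead of the sample string. Without revision information in the map, whichever matching module gets loaded first wins, which would be consistent with the boot-to-boot differences reported above.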