Red Hat Bugzilla – Bug 190571
REGRESSION: e100 hang on HP Integrity
Last modified: 2014-06-18 04:29:03 EDT
Description of problem:
The latest kernels have an issue where the e100 causes a hang on bootup. I am
not clear exactly when this issue was introduced since we have been seeing
serial console hangs over the past few weeks.
I see the hang in 2 different ways:
If the system is up and running an older good kernel and I do a warm reboot it
hangs as it brings up the network device.
If I cold boot the system (power on or virtual reset button via the MP) it hangs
at the udev step in the boot:
It does not look like this was caused by the latest e100 driver update that went
in to 2.6.9-34.15 since I did have kernels working after that. I am in the
process of building from source so I can pull out specific patches to determine
Version-Release number of selected component (if applicable):
kernel-2.6.9-35 and possibly earlier.
How reproducible:
100% on the rx2600 system
Steps to Reproduce:
1. Install the updated kernel
2. Reset the system via the MP
3. System hangs at the udev step in the bootup
I did a sysrq-c to get a stack trace. This is when the system hung after a
reset in the udev step. I don't see anything e100 specific here but I have
verified that if I remove the e100.ko module the system boots cleanly.
Adding e100 maintainers to CC:...
John, can you speculate on why e100 might have problems w/ ia64?
Can the driver be loaded after the system is booted up, without the driver being
loaded at boot? This might show an error during the load of the driver, which
might point to something.
Without seeing an error from loading it's hard to even guess at this. Does this
happen with a kernel.org kernel? I know that might be hard for you to test since
the RH kernel is so different.
What PRO/100 NIC is it, lspci -vv? Don't think it would matter but you never
know. Did older kernels work fine on this exact system?
answers to your questions above...
When I try to load the driver after the system is up it just locks up the
system. No errors, just hung.
I have not yet tried a kernel.org kernel.
It has been working until today's kernel build. Nothing specific to the e100
driver has changed so it must be a side effect of another change.
Here is the lspci -vv info for the card:
00:03.0 Ethernet controller: Intel Corporation 82557/8/9 [Ethernet Pro 100] (rev 0d)
	Subsystem: Hewlett-Packard Company: Unknown device 1274
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B-
	Status: Cap+ 66Mhz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR-
	Latency: 128 (2000ns min, 14000ns max)
	Interrupt: pin A routed to IRQ 53
	Region 0: Memory at 0000000080020000 (32-bit, non-prefetchable) [size=4K]
	Region 1: I/O ports at 0d00 [size=64]
	Region 2: Memory at 0000000080000000 (32-bit, non-prefetchable) [size=128K]
	Capabilities: [dc] Power Management version 2
		Flags: PMEClk- DSI+ D1+ D2+ AuxCurrent=0mA
		Status: D0 PME-Enable- DSel=0 DScale=2 PME-
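For anyone comparing this output against a working system, the trailing + and - on each lspci flag mean set and clear. A minimal sketch of turning one such line into a lookup table follows; the helper name parse_flags is made up for illustration and is not part of lspci or pciutils.

```python
from typing import Dict


def parse_flags(line: str) -> Dict[str, bool]:
    """Map each 'Name+' / 'Name-' token on an lspci flag line to True / False.

    Tokens without a trailing + or - (for example 'DEVSEL=medium') are skipped.
    """
    _, _, rest = line.partition(":")  # drop the 'Control:' / 'Status:' label
    flags: Dict[str, bool] = {}
    for token in rest.split():
        if token.endswith("+"):
            flags[token[:-1]] = True
        elif token.endswith("-"):
            flags[token[:-1]] = False
    return flags


# The Control line from the lspci output above, joined onto one line.
control = ("Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- "
           "ParErr+ Stepping- SERR+ FastB2B-")
flags = parse_flags(control)
print(flags["BusMaster"], flags["MemWINV"])  # True False
```

Running the same helper over the Control line from a known-good kernel would make any differing bit easy to spot with a plain dictionary comparison.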
We have found the cause of this. It appears to be the netpoll-bonding patch.
We don't understand why this hangs here (and not on other configurations) but
backing it out does fix the problem.
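For future regressions of this kind, the approach used in this report (rebuilding with individual patches pulled out) can be organized as a bisection over the ordered patch list. The following is only a sketch under stated assumptions: exactly one patch introduces the hang, the patches apply in a fixed order, and boots_cleanly is a hypothetical callback that stands in for the manual build-and-boot test on the rx2600. The patch names other than netpoll-bonding are placeholders.

```python
from typing import Callable, Sequence


def first_bad_patch(patches: Sequence[str],
                    boots_cleanly: Callable[[Sequence[str]], bool]) -> str:
    """Return the first patch whose inclusion makes the kernel hang.

    Assumes a kernel with no patches applied boots, a kernel with all
    patches applied hangs, and a single patch is responsible.
    """
    if not patches:
        raise ValueError("patch list is empty")
    lo, hi = 0, len(patches)  # first lo patches: known good; first hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots_cleanly(patches[:mid]):
            lo = mid          # hang is introduced later in the series
        else:
            hi = mid          # hang is already present in this prefix
    return patches[hi - 1]


# Simulated test: in practice this would be a real build and reboot.
patches = ["patch-a", "patch-b", "netpoll-bonding", "patch-c"]
culprit = first_bad_patch(patches, lambda applied: "netpoll-bonding" not in applied)
print(culprit)  # netpoll-bonding
```

With N patches this needs about log2(N) build-and-boot cycles instead of N, which matters when each cycle involves a reset through the MP.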