Bug 241338 - ide0=noprobe kills the kernel
Summary: ide0=noprobe kills the kernel
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: kernel
Version: 5.0
Hardware: All
OS: Linux
Target Milestone: ---
Assignee: Michal Schmidt
QA Contact: Martin Jenner
Depends On:
Blocks: 251292 RHEL5u2_relnotes 425461
Reported: 2007-05-25 09:31 UTC by Gerd Hoffmann
Modified: 2008-05-21 14:43 UTC (History)

Fixed In Version: RHBA-2008-0314
Doc Type: Bug Fix
Doc Text:
Clone Of:
Last Closed: 2008-05-21 14:43:19 UTC

Attachments
boot log (3.49 KB, text/plain)
2007-05-25 09:31 UTC, Gerd Hoffmann
band aid fix (594 bytes, patch)
2007-07-13 15:03 UTC, Gerd Hoffmann
different approach to fix it ... (617 bytes, patch)
2007-08-22 14:30 UTC, Gerd Hoffmann
another fix, more like upstream (2.17 KB, patch)
2007-08-28 10:27 UTC, Michal Schmidt

System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2008:0314 normal SHIPPED_LIVE Updated kernel packages for Red Hat Enterprise Linux 5.2 2008-05-20 18:43:34 UTC

Description Gerd Hoffmann 2007-05-25 09:31:54 UTC
Description of problem:
ide0=noprobe kills the kernel

Version-Release number of selected component (if applicable):

How reproducible:
boot the kernel with ide0=noprobe on the command line and see
it crash quite early.  Needs earlyprintk to actually see something ;)

Actual results:
kernel gets a GPF and panics.

Expected results:
kernel boots up with ide0 disabled.

Additional info:
This happened within a Xen HVM machine, while trying to get the
ide driver out of the way, so we can use paravirtual drivers
after booting.

Comment 1 Gerd Hoffmann 2007-05-25 09:31:54 UTC
Created attachment 155438 [details]
boot log

Comment 2 Gerd Hoffmann 2007-07-03 14:16:56 UTC
other machine, different versions (some upcoming rhel-5.1 xen bits).

Guest boots up fine, up to the point where it would mount the root filesystem. 
Close to the place where the other kernel crashed it complains about interrupts
being enabled though:

[ ... ]
SMP: Allowing 1 CPUs, 0 hotplug CPUs
Built 1 zonelists.  Total pages: 127971
Kernel command line: ro root=/dev/VolGroup00/LogVol00 console=tty1
console=ttyS0,115200 ide0=noprobe ide1=noprobe
ide_setup: ide0=noprobe
ide_setup: ide1=noprobe
Initializing CPU#0
PID hash table entries: 2048 (order: 11, 16384 bytes)
start_kernel(): bug: interrupts were enabled early
Console: colour VGA+ 80x25
[ ... ]

So this looks like a matter of luck to me.  If an interrupt comes in early it
takes down the machine; if not, it boots up fine ...

Comment 4 Gerd Hoffmann 2007-07-13 15:03:12 UTC
Created attachment 159196 [details]
band aid fix

pci_find_device() enables irqs as a side effect, probably due to the device
list locking using a rwsem.  So avoid calling it.  Patch cripples pre-PCI ide

call chain:


So any ide=foo on the kernel command line triggers this.

Comment 5 Alan Cox 2007-07-23 17:19:31 UTC

This is a revert for older systems.

Fix pci_find_device not to enable IRQs by mistake

Comment 6 RHEL Product and Program Management 2007-07-31 13:45:47 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 7 Gerd Hoffmann 2007-08-22 14:30:07 UTC
Created attachment 162064 [details]
different approach to fix it ...

Don't bother taking the rwsem (and enabling irqs as a side effect) to walk the
list in pci_find_device() if the list is empty anyway.
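To illustrate the idea behind this approach, here is a minimal user-space sketch. It is not the attached patch and not kernel code: toy_irqs_enabled, toy_down_read(), toy_pci_devices and toy_find_device() are hypothetical stand-ins that only model the behaviour described in this bug (taking the lock turns interrupts on).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these names are kernel APIs. */
static bool toy_irqs_enabled = false;          /* models the CPU interrupt flag */

struct toy_dev {
    unsigned vendor, device;
    struct toy_dev *next;
};

/* Empty during early command-line parsing, before any PCI scan has run. */
static struct toy_dev *toy_pci_devices = NULL;

/* Models the side effect reported here: acquiring the lock enables irqs. */
static void toy_down_read(void) { toy_irqs_enabled = true; }
static void toy_up_read(void)   { }

/* Lookup that skips the lock entirely when there is nothing to walk. */
static struct toy_dev *toy_find_device(unsigned vendor, unsigned device)
{
    struct toy_dev *d;

    if (toy_pci_devices == NULL)   /* empty list: return before locking */
        return NULL;

    toy_down_read();
    for (d = toy_pci_devices; d != NULL; d = d->next)
        if (d->vendor == vendor && d->device == device)
            break;
    toy_up_read();
    return d;
}

/* Early-boot style check: an empty list must leave interrupts untouched. */
void toy_check_early_lookup(void)
{
    toy_pci_devices = NULL;
    toy_irqs_enabled = false;
    assert(toy_find_device(0x8086, 0x7010) == NULL);
    assert(!toy_irqs_enabled);
}
```

Calling toy_check_early_lookup() from a main() exercises the early-boot case. Once a device is on the list the lock is taken as before, so the sketch only avoids the side effect while the list is still empty, which matches the "bit of a hack" nature of this approach.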

Comment 8 Alan Cox 2007-08-22 15:12:28 UTC
Bit of a hack but solves the problem and with minimal risk - ok by me

Comment 9 Gerd Hoffmann 2007-08-23 07:52:26 UTC
dduile asked for details on why irqs get enabled, to include in the patch comment.

I think it is the spin_unlock_irq() call in __down_read() in
lib/rwsem-spinlock.c.  It probably happens on x86_64 only, because i386 doesn't
use the generic, spinlock-based rw semaphores.  That is likely also the reason
it went unnoticed so far: on modern, 64-bit capable hardware you'll rarely need
to specify ide=something on the kernel command line ...
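The mechanism suspected above can be shown with a small user-space model. This is a sketch, not the contents of lib/rwsem-spinlock.c: model_irq_flag and the model_* helpers are made-up names that imitate the semantics of the spin_lock_irq()/spin_unlock_irq() pair (unlock always enables interrupts) next to the irqsave/irqrestore pair (unlock restores whatever state was there before).

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of the CPU interrupt-enable flag; not a kernel interface. */
static bool model_irq_flag = false;

/* _irq style: lock disables interrupts, unlock enables them unconditionally. */
static void model_lock_irq(void)   { model_irq_flag = false; }
static void model_unlock_irq(void) { model_irq_flag = true; }

/* _irqsave style: lock remembers the old state, unlock puts it back. */
static bool model_lock_irqsave(void)
{
    bool saved = model_irq_flag;
    model_irq_flag = false;
    return saved;
}
static void model_unlock_irqrestore(bool saved) { model_irq_flag = saved; }

/* Critical section guarded the unconditional way; returns the flag afterwards. */
bool model_section_unconditional(void)
{
    model_lock_irq();
    /* ... update the semaphore's bookkeeping ... */
    model_unlock_irq();
    return model_irq_flag;
}

/* Same critical section, but preserving the caller's interrupt state. */
bool model_section_preserving(void)
{
    bool saved = model_lock_irqsave();
    /* ... update the semaphore's bookkeeping ... */
    model_unlock_irqrestore(saved);
    return model_irq_flag;
}

/* Enter both variants with interrupts off, as during early boot. */
void model_check_early_boot(void)
{
    model_irq_flag = false;
    assert(model_section_preserving() == false);   /* state preserved */
    model_irq_flag = false;
    assert(model_section_unconditional() == true); /* irqs now on */
}
```

With interrupts off on entry, only the unconditional variant leaves them on, which is the situation start_kernel() complains about in comment 2.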

Comment 10 Michal Schmidt 2007-08-28 10:27:02 UTC
Created attachment 175821 [details]
another fix, more like upstream

Backport of the upstream fix, introduces no_pci_devices() function.

Upstream fixed it in git commit ed4aaadb1a7913f509f05d3e67840541a180713f ('fix
jvc cdrom drive lockup'). It introduced a new exported function,
no_pci_devices().
I made a scratch build with this patch included:

Gerd, would you test if it fixes your problem?
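For readers without access to the attachment, here is a rough user-space sketch of how a predicate like no_pci_devices() can be used by an early caller. It does not reproduce the backported patch or the upstream implementation: sk_no_pci_devices(), sk_find_first(), sk_early_ide_probe() and the lock counter are all hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; the real kernel data structures are different. */
struct sk_dev { struct sk_dev *next; };

static struct sk_dev *sk_bus_devices = NULL;  /* nothing registered yet */
static int sk_lock_taken = 0;                 /* counts lock acquisitions */

/* Stand-in for the predicate: true while no device has been registered. */
static bool sk_no_pci_devices(void)
{
    return sk_bus_devices == NULL;
}

/* Stand-in for a device lookup that has to take the device-list lock. */
static struct sk_dev *sk_find_first(void)
{
    sk_lock_taken++;
    return sk_bus_devices;
}

/* Hypothetical early caller: ask the predicate first, look up only if needed. */
bool sk_early_ide_probe(void)
{
    if (sk_no_pci_devices())
        return false;              /* too early, or no PCI bus at all */
    return sk_find_first() != NULL;
}

/* Before any device exists, the probe must not reach the locked lookup. */
void sk_check_early_probe(void)
{
    sk_bus_devices = NULL;
    sk_lock_taken = 0;
    assert(sk_early_ide_probe() == false);
    assert(sk_lock_taken == 0);
}
```

Compared with the check inside pci_find_device() from comment 7, this puts the decision in the caller, so the lookup function itself stays unchanged.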

Comment 11 Gerd Hoffmann 2007-08-28 11:03:31 UTC
Works fine for me.

Comment 13 Don Zickus 2007-12-14 18:41:44 UTC
in 2.6.18-60.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 15 Don Domingo 2008-02-07 05:13:16 UTC
added to RHEL5.2 release notes under "Kernel-Related Updates":

 The kernel parameter ide0=noprobe no longer causes a kernel panic. This was
fixed through the introduction of a new function, no_pci_devices().


please advise if any further revisions are required. thanks!

Comment 16 Michal Schmidt 2008-02-07 10:26:45 UTC
no_pci_devices() is an implementation detail. This should be enough:

The kernel parameter ide0=noprobe no longer causes a kernel panic.

Comment 17 Don Domingo 2008-02-07 23:02:59 UTC
thanks Michal, revising as requested. 

Comment 18 Mike Gahagan 2008-03-18 20:58:55 UTC
Confirmed the bugfix is in the -85.el5 kernel. I wasn't able to reproduce the
problem with the -53 kernel on any of the xen guests I tried.

Comment 19 Don Domingo 2008-04-02 02:10:03 UTC
the RHEL5.2 release notes will be dropped to translation on April 15, 2008, at
which point no further additions or revisions will be entertained.

a mockup of the RHEL5.2 release notes can be viewed at the following link:

please use the aforementioned link to verify if your bugzilla is already in the
release notes (if it needs to be). each item in the release notes contains a
link to its original bug; as such, you can search through the release notes by
bug number.


Comment 21 errata-xmlrpc 2008-05-21 14:43:19 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

