Bug 138892 - smp kernel crashes on Dell Prec 650n w/ USB enabled in bios
Product: Red Hat Enterprise Linux 3
Classification: Red Hat
Component: kernel
Hardware: i686 Linux
Priority: medium, Severity: high
Assigned To: Pete Zaitcev
Reported: 2004-11-11 15:37 EST by Peter Ruprecht
Modified: 2007-11-30 17:07 EST (History)
Doc Type: Bug Fix
Last Closed: 2004-11-12 18:36:32 EST

Attachments
dmidecode for Dell Precision 650n BIOS A04 (17.06 KB, text/plain)
2004-11-12 19:09 EST, Peter Ruprecht

Description Peter Ruprecht 2004-11-11 15:37:06 EST
From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4.3)

Description of problem:
Dual Xeon Dell Precision 650n (3.06 GHz; CPU family 15) freezes or
restarts during kernel startup when booting the SMP kernel if "USB
Emulation" and "USB Controller" are enabled in the BIOS.  The same
behavior exists in kernels back at least to
kernel-smp-2.4.21-15.0.2.EL.  The system boots normally with
single-processor RHEL kernels.  When USB is disabled in the system
BIOS, both single- and multi-processor kernels boot normally ... but
in this case we can't use any USB devices which is not really
satisfactory.  Enabling or disabling hyperthreading in the BIOS
doesn't seem to make any difference.

The same behavior exists on the same machine with a stock Fedora Core
2 kernel (not sure which version exactly; a colleague tested this)
which leads me to believe that this may be a RHEL analogue of bugid

Have tried Dell BIOS version A03 and A04.  Dell tech support is not
interested in working on this problem.  I get the impression they
think it's a kernel problem, so that's why I'm submitting this bug
report.  (Dell has replaced the motherboard, both processors, the scsi
cable, and the voltage regulator, so I don't think it's a HW problem.)

The very odd part is that the system *will* boot successfully with SMP
and USB enabled some small fraction of the time, perhaps once in ten.
 I have tried all kinds of tests regarding warm vs cold starts,
whether I enter the BIOS before starting Linux, etc., but can't find
any consistent way to force a successful or failed boot.  Probably
there is some magic combination but I couldn't find it.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. enable USB in system BIOS
2. boot SMP kernel

Actual Results:  System freezes after "booting processor 1/6 eip
2000", or resets following "mptbase: Initiating ioc0 bringup".

Expected Results:  System boots up all the way.

Additional info:

I am not sure what log info might be useful; if there's anything that
the kernel gurus would like to look at, please let me know.
Comment 1 Ernie Petrides 2004-11-11 20:19:35 EST
PeterR, could you please determine if the U4 beta kernel resolves
this problem?  The kernel version currently in the RHN beta channel
is 2.4.21-23.EL, but the -24.EL kernel should appear there within a
week.  PeteZ implemented USB BIOS-to-kernel hand-off in U4, although
there was a recent fix in an error path in the latest U4 respin.

PeteZ, would you expect this to be fixed by 0653.zaitcev.early-usb-handoff.patch?
Comment 2 Pete Zaitcev 2004-11-11 21:02:12 EST
This is likely to be a bug in BIOS emulation of PS/2 ports.
If so, proper handoff may help. It is shipping in 2.4.21-20, I'm
pretty sure, but it has to be enabled with the "usb-handoff" kernel
parameter in grub.conf.

Failing that, BIOS tweaks are the only answer.
Peter wrote that "Emulation" is enabled separately from "Controller",
in which case the workaround to try is to disable emulation while
leaving the controller enabled. Unfortunately, this is likely to
leave GRUB without input, so a UP kernel would need to be kept
around at all times to fix the system if anything goes wrong.

Let's see if usb-handoff helps.
Comment 3 Peter Ruprecht 2004-11-12 15:45:39 EST
I won't be able to get access to this system for a few days, probably,
but will give your suggestions a try then.  Thanks!!!!
 -Peter Ruprecht
Comment 4 Peter Ruprecht 2004-11-12 17:18:09 EST
I have added usb-handoff as a kernel option in grub.conf for
2.4.21-20.EL (smp) and re-enabled USB Emulation and Controller in the
BIOS, and the machine now seems to boot normally.  I only had time to
try booting it twice, but previously it would fail almost every time,
so two successive successful boots seem very good.  Thanks for your
quick responses and working solution!
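
For reference, the grub.conf change described above can be sketched as a small Python helper. This is only an illustration, assuming a GRUB legacy grub.conf in which each boot entry has a line starting with "kernel". The function name and the sample stanza (title, device, paths, root= value) are hypothetical placeholders; only the "usb-handoff" parameter and the 2.4.21-20.EL smp kernel version come from this report.

```python
def add_kernel_param(grub_conf_text, param="usb-handoff"):
    """Append `param` to every GRUB legacy 'kernel' line that lacks it."""
    out = []
    for line in grub_conf_text.splitlines():
        words = line.split()
        # Boot entries name the kernel image on a line whose first word is
        # "kernel"; commented-out lines ("#kernel ...") are left untouched.
        if words and words[0] == "kernel" and param not in words[1:]:
            line = line.rstrip() + " " + param
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical stanza: title, device, paths and root= are placeholders.
SAMPLE_GRUB_CONF = """\
title Red Hat Enterprise Linux (2.4.21-20.ELsmp)
\troot (hd0,0)
\tkernel /vmlinuz-2.4.21-20.ELsmp ro root=LABEL=/
\tinitrd /initrd-2.4.21-20.ELsmp.img
"""

if __name__ == "__main__":
    print(add_kernel_param(SAMPLE_GRUB_CONF), end="")
```

Running the helper twice leaves the text unchanged, since it skips kernel lines that already carry the parameter.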
Comment 5 Ernie Petrides 2004-11-12 18:36:32 EST
Closing as NOTABUG according to my understanding that this
is actually a BIOS bug (and has now been worked around).
Comment 6 Pete Zaitcev 2004-11-12 18:52:23 EST
Peter, if you can, please attach the output of "dmidecode"
(it's a part of kernel-utils). This should allow us to enable
this automatically in the future, without extra options.
Do not drop it into the comments box - it's going to be rather long.
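
As a rough illustration of how dmidecode output could drive such an automatic decision, the sketch below parses "Key: Value" fields and checks them against a quirk table. Everything here is an assumption made for illustration: the table, the function names, the product string and the sample text are hypothetical, the real strings would have to come from the attached dmidecode output, and the kernel's actual mechanism is not described in this report. Only the BIOS versions A03 and A04 are taken from the report.

```python
# Hypothetical quirk table: (product-name substring, affected BIOS versions).
HANDOFF_QUIRKS = [("Precision 650n", {"A03", "A04"})]


def parse_dmi(text):
    """Return {(section, key): value} for 'Key: Value' lines in dmidecode-style text."""
    fields = {}
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Handle "):
            section = None  # a new DMI record begins
        elif line.endswith("Information") and ":" not in line:
            section = line  # e.g. "BIOS Information", "System Information"
        elif section and ": " in line:
            key, value = line.split(": ", 1)
            fields.setdefault((section, key), value.strip())
    return fields


def needs_usb_handoff(text):
    """True if the product name and BIOS version match an entry in the quirk table."""
    fields = parse_dmi(text)
    product = fields.get(("System Information", "Product Name"), "")
    bios = fields.get(("BIOS Information", "Version"), "")
    return any(name in product and bios in versions
               for name, versions in HANDOFF_QUIRKS)


# Hypothetical sample; not copied from the real attachment.
SAMPLE_DMI = """\
Handle 0x0000, DMI type 0, 20 bytes
BIOS Information
\tVendor: Example Vendor
\tVersion: A04
Handle 0x0100, DMI type 1, 25 bytes
System Information
\tManufacturer: Example Vendor
\tProduct Name: Precision 650n
\tVersion: Not Specified
"""

if __name__ == "__main__":
    print(needs_usb_handoff(SAMPLE_DMI))
```

Fields are keyed by section because "Version" appears in more than one DMI record, so the BIOS version is not confused with the system version.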
Comment 7 Peter Ruprecht 2004-11-12 19:09:42 EST
Created attachment 106625
dmidecode for Dell Precision 650n BIOS A04
