Description of problem:
If I add nmi_watchdog=1 to the kernel command line when booting, it hangs at:
ACPI: Found ECDT
This is *almost* 100% guaranteed. There seems to be a rain dance you can do to
get it to boot. It goes something like:
* Remove the power cord
* Boot the laptop
* Wait for grub
* Re-insert power cord while grub is still running
However once the laptop is up, if you remove the power cord, reinserting it will
cause the laptop to hang immediately and automatically reboot after a few
seconds. Removing nmi_watchdog=1 causes the problem to go away. The xen kernel
boots fine with or without nmi_watchdog=1.
Version-Release number of selected component (if applicable):
same hang on X60 with Core2Duo (type 1706-GMG)
did not bother trying the rain dance
Forgot to add in Comment #1:
This is with the x86_64 version of RHEL5
Does this still occur with the latest RHEL5.1 beta kernels? If so, can the
problem be reproduced with a recent Fedora kernel? (i.e., is there a fix
upstream we need to hunt down?)
On 1706-GMG with Fedora rawhide x86_64 and kernel 2.6.23-0.164.rc5.fc8 I still
cannot boot if I use nmi_watchdog=1. Machine boots fine without this.
I have seen the same problem on my T43 laptop (i386, RHEL5-Client), but I have
seen it when booting kernel 2.6.18-8.1.8.el5PAE with parameters including both
"nmi_watchdog=1" and "crashkernel=128M@16M". If I substitute "nmi_watchdog=2",
or boot without the "crashkernel" parameter, there is no problem. I can also
confirm that there is no such problem when running kernel 2.6.18-8.el5.
I have tried on the latest released 5.0.z kernel, 2.6.18-8.1.14.el5, and the
problem is still there.
I have observed the same hang even in 2.6.18-8.el5, but only when a USB disk
was attached and "nmi_watchdog=1" was set before booting.
Does this problem occur on the X60 if "nmi_watchdog=2" is used instead?
FWIW: X60 with Core2Duo (type 1706-GMG) and Fedora 8, kernel x86_64 126.96.36.199-49.fc8
nmi_watchdog=1 still does not boot
nmi_watchdog=2 does boot (although
/usr/share/doc/kernel-doc-2.6.23/Documentation/nmi_watchdog.txt tells me this
should not work)
Using nmi_watchdog=2 is fine as long as NMI in /proc/interrupts increments
periodically. The docs on NMI are very vague. nmi_watchdog is very hardware
specific. Some hardware only works with nmi_watchdog=1 and other hardware only
works with nmi_watchdog=2. Predicting which to use on which hardware really is
more of a guessing game.
Can you see if NMI increments in /proc/interrupts if nmi_watchdog=2 is used?
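One way to run that check, as a rough sketch rather than an official tool: sample the NMI line of /proc/interrupts twice and compare the per-CPU counters. The function names and the 10 second interval are my own choices for illustration, and the parser assumes the line starts with "NMI:" followed by one integer per CPU.

```python
import time


def parse_nmi_counts(interrupts_text):
    """Return the per-CPU NMI counters found in /proc/interrupts content."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                # Stop at the first non-numeric field; some kernels append
                # a text description after the counters.
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    return []  # no NMI line found


def nmi_is_ticking(before, after):
    """True if at least one CPU's NMI counter increased between samples."""
    return any(b > a for a, b in zip(before, after))


def read_interrupts(path="/proc/interrupts"):
    with open(path) as f:
        return f.read()


def main(interval=10):
    # The interval is an arbitrary choice, not a documented requirement.
    first = parse_nmi_counts(read_interrupts())
    time.sleep(interval)
    second = parse_nmi_counts(read_interrupts())
    print("before:", first)
    print("after: ", second)
    if nmi_is_ticking(first, second):
        print("NMI count is incrementing")
    else:
        print("NMI count did not change")
```

Calling main() on the affected machine after booting with nmi_watchdog=2 prints both samples, so the raw counters can be pasted into the bug along with the verdict.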
That being said, the hang at "ACPI: Found ECDT" may be a separate issue. Some
of these ThinkPads had ACPI problems relating to ACPI battery state object
breakage. I see in Comment #1 that removing the power cord sometimes makes
a difference. It seems related, so I figured I would mention it.
Upstream is trying to deprecate nmi_watchdog=1, as the preferred method is to
use the local APIC (nmi_watchdog=2) as opposed to the IO-APIC (nmi_watchdog=1).
It's no surprise nmi_watchdog=1 doesn't work upstream on a Core2Duo.
Also, on RHEL-5.0 the NMI watchdog won't work on Core2Duo; you will need RHEL-5.1 due to bz
221671. But I haven't heard of any reports of ACPI issues when using different
Do you still have RHEL 5(.1) on the affected box, and can you test? Mine is F8
x86_64, so my testing is only of limited value.
Note that there's nothing stopping you from installing and booting a RHEL5 kernel on an F8 system for
the purposes of this test...
I'm currently running kernel 2.6.18-53.1.4.el5 (5.1), i386, on the laptop I
originally reported this on. Following previous discussion, I have booted with
the following kernel command line:
ro root=/dev/vg_local/root crashkernel=64M@16M audit=1 nmi_watchdog=2
This worked fine for me the one time I tried it.
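For anyone comparing results across machines, here is a small sketch that pulls the relevant options out of a kernel command line, so reports can state exactly which nmi_watchdog mode was in effect. The helper names are hypothetical; the mode labels for 1 and 2 follow the IO-APIC / local APIC description earlier in this thread, and treating 0 as "disabled" is my assumption.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Options without '=' (such as 'ro') map to None.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def describe_nmi_watchdog(params):
    """Return a one-line summary of the nmi_watchdog setting."""
    if "nmi_watchdog" not in params:
        return "nmi_watchdog not set on the command line"
    value = params["nmi_watchdog"]
    # 1 and 2 as described in this thread; 0 = disabled is an assumption.
    modes = {"0": "disabled", "1": "IO-APIC", "2": "local APIC"}
    return "nmi_watchdog=%s (%s)" % (value, modes.get(value, "unrecognised value"))


def main(path="/proc/cmdline"):
    # /proc/cmdline holds the command line the running kernel was booted with.
    with open(path) as f:
        params = parse_cmdline(f.read())
    print(describe_nmi_watchdog(params))
    print("crashkernel:", params.get("crashkernel"))
```

Running main() on the booted system reads /proc/cmdline, so the output reflects what the kernel was actually started with rather than what is written in grub.conf.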