Description of problem:
If I add nmi_watchdog=1 to the kernel command line when booting, it hangs forever at:

ACPI: Found ECDT

This is *almost* 100% guaranteed. There seems to be a rain dance you can do to get it to boot. It goes something like:

* Remove the power cord
* Boot the laptop
* Wait for grub
* Re-insert power cord while grub is still running

However once the laptop is up, if you remove the power cord, reinserting it will cause the laptop to hang immediately and automatically reboot after a few seconds.

Removing nmi_watchdog=1 causes the problem to go away. The xen kernel boots fine with or without nmi_watchdog=1.

Version-Release number of selected component (if applicable):
kernel-2.6.18-8.1.1.el5
Same hang on an X60 with Core2Duo (type 1706-GMG). I did not bother trying the rain dance.
Forgot to add in Comment #1: This is with the x86_64 version of RHEL5
Does this still occur with the latest RHEL5.1 beta kernels? If so, can the problem be reproduced with a recent Fedora kernel? (i.e., is there a fix upstream we need to hunt down?)
On 1706-GMG with Fedora rawhide x86_64 and kernel 2.6.23-0.164.rc5.fc8 I still cannot boot if I use nmi_watchdog=1. Machine boots fine without this.
I have seen the same problem on my T43 laptop (i386, RHEL5-Client), but I saw it when booting kernel 2.6.18-8.1.8.el5PAE with a command line that includes both "nmi_watchdog=1" and "crashkernel=128M@16M". If I substitute "nmi_watchdog=2", or drop the "crashkernel" parameter, there is no problem either. I can confirm that there is no such problem when running kernel 2.6.18-8.el5.
I have tried on the latest released 5.0.z kernel, 2.6.18-8.1.14.el5, and the problem is still there.
I have observed the same hang even in 2.6.18-8.el5, but only when a USB disk was attached before booting and "nmi_watchdog=1" was on the command line.
Does this problem occur on the X60 if "nmi_watchdog=2" is used instead?
FWIW: X60 with Core2Duo (type 1706-GMG) and Fedora 8, kernel x86_64 2.6.23.1-49.fc8:
nmi_watchdog=1 still does not boot
nmi_watchdog=2 does boot (although /usr/share/doc/kernel-doc-2.6.23/Documentation/nmi_watchdog.txt tells me this should not work)
Using nmi_watchdog=2 is fine as long as NMI in /proc/interrupts increments periodically. The docs on NMI are very vague. nmi_watchdog is very hardware specific. Some hardware only works with nmi_watchdog=1 and other hardware only works with nmi_watchdog=2. Predicting which to use on which hardware really is more of a guessing game. Can you see if NMI increments in /proc/interrupts if nmi_watchdog=2 is used? That being said, the hang at "ACPI: Found ECDT" may be a separate issue. Some of these ThinkPads had ACPI problems relating to ACPI battery state object breakage. I see in Comment #1 that removing the power cord sometimes makes a difference. It seems related so I figured I would mention it.
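For anyone who wants to run the check above without eyeballing the file twice, here is a minimal sketch. It assumes the NMI counters appear on a line starting with "NMI:" in /proc/interrupts, one column per CPU; the function names and the 10-second interval are my own choices for illustration, not anything from the kernel docs.

```python
import time


def parse_nmi_counts(interrupts_text):
    """Return the per-CPU NMI counts found in the text of /proc/interrupts."""
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            counts = []
            for field in fields[1:]:
                # Some kernels append a text description after the
                # counters; stop at the first non-numeric field.
                if not field.isdigit():
                    break
                counts.append(int(field))
            return counts
    return []  # no NMI line found


def nmi_is_incrementing(before, after):
    """True if the NMI count grew on at least one CPU between two samples."""
    return any(b > a for a, b in zip(before, after))


def read_nmi_counts(path="/proc/interrupts"):
    with open(path) as f:
        return parse_nmi_counts(f.read())


if __name__ == "__main__":
    first = read_nmi_counts()
    time.sleep(10)  # arbitrary sampling interval (assumption)
    second = read_nmi_counts()
    print("NMI counts:", first, "->", second)
    if nmi_is_incrementing(first, second):
        print("NMI is incrementing")
    else:
        print("NMI is NOT incrementing")
```

If the counts do not move over the interval, the watchdog is probably not firing with the current nmi_watchdog setting.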
Upstream is trying to deprecate nmi_watchdog=1, as the preferred method is to use the local APIC (nmi_watchdog=2) as opposed to the IO-APIC (nmi_watchdog=1). It's no surprise nmi_watchdog=1 doesn't work upstream on a Core2Duo. Also, on RHEL-5.0 the NMI watchdog won't work on Core2Duo; you will need RHEL-5.1 due to bz 221671. But I haven't heard any reports of ACPI issues when using different NMI settings.
Matthew, do you still have RHEL 5(.1) on the affected box and can you test? Mine is F8 x86_64, so my testing is only of limited value
Note that there's nothing stopping you from installing and booting a RHEL5 kernel on an F8 system for the purposes of this test...
I'm currently running kernel 2.6.18-53.1.4.el5 (5.1), i386, on the laptop I originally reported this on. Following previous discussion, I have booted with the following kernel command line:

ro root=/dev/vg_local/root crashkernel=64M@16M audit=1 nmi_watchdog=2

This worked fine for me the 1 time I've tried it.
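When comparing results across machines it helps to confirm which nmi_watchdog value the running kernel was actually booted with. A small sketch that reads it from a command-line string such as the contents of /proc/cmdline follows. The helper name is hypothetical, the mode descriptions are taken from the discussion in this bug, and treating the last occurrence as the effective one is an assumption.

```python
# Mode descriptions as given earlier in this bug (not an exhaustive list).
NMI_MODES = {
    "1": "IO-APIC",
    "2": "local APIC",
}


def nmi_watchdog_setting(cmdline):
    """Return the nmi_watchdog value from a kernel command line, or None."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "nmi_watchdog" and sep:
            # Assumption: if the option is repeated, the last one wins.
            value = val
    return value


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        setting = nmi_watchdog_setting(f.read())
    if setting is None:
        print("nmi_watchdog not set on the command line")
    else:
        print("nmi_watchdog=%s (%s)" % (setting, NMI_MODES.get(setting, "unknown mode")))
```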