Description of problem:

On a 4-CPU system, CPUs 1-3 spin on the endflag variable, which lives on CPU0's stack, waiting for CPU0 to set it to 1. When CPU0 decides that the NMI is stuck, it sets endflag to 1, logs the event, and returns. If CPUs 1-3 have not had a chance to run by then, and the next function called on CPU0 zeroes the stack location that used to hold endflag, they never observe the 1 and keep spinning. This can result in a hang. The race is more likely to be exposed in a virtualized environment, although it can happen on physical hardware too.

Excerpt from linux-2.6.17 (arch/x86_64/kernel/nmi.c):

    static __init void nmi_cpu_busy(void *data)
    {
        volatile int *endflag = data;
        local_irq_enable();
        /* Intentionally don't use cpu_relax here. This is
           to make sure that the performance counter really ticks,
           even if there is a simulator or similar that catches the
           pause instruction. On a real HT machine this is fine because
           all other CPUs are busy with "useless" delay loops and don't
           care if they get somewhat less cycles. */
        while (*endflag == 0)
            barrier();
    }
    #endif

    int __init check_nmi_watchdog (void)
    {
        volatile int endflag = 0;    <------------------
        int *counts;
        int cpu;

        counts = kmalloc(NR_CPUS * sizeof(int), GFP_KERNEL);
        if (!counts)
            return -1;

        printk(KERN_INFO "testing NMI watchdog ... ");

    #ifdef CONFIG_SMP
        if (nmi_watchdog == NMI_LOCAL_APIC)
            smp_call_function(nmi_cpu_busy, (void *)&endflag, 0, 0);
    #endif

        for (cpu = 0; cpu < NR_CPUS; cpu++)
            counts[cpu] = cpu_pda(cpu)->__nmi_count;
        local_irq_enable();
        mdelay((10*1000)/nmi_hz); // wait 10 ticks

        for_each_online_cpu(cpu) {
            if (cpu_pda(cpu)->__nmi_count - counts[cpu] <= 5) {
                endflag = 1;    <---------------------------
                printk("CPU#%d: NMI appears to be stuck (%d->%d)!\n",
                       cpu,
                       counts[cpu],
                       cpu_pda(cpu)->__nmi_count);
                nmi_active = 0;
                lapic_nmi_owner &= ~LAPIC_NMI_WATCHDOG;
                nmi_perfctr_msr = 0;
                kfree(counts);
                return -1;    <-----------------------
            }
        }

The race has been verified to be present in RHEL5's 2.6.18-8.el5, in kernel-2.6.18-8.1.15.el5.src.rpm, and in kernel-2.6.18-51.el5.jwltest.43, which reflects ToT for RHEL5 kernels. I am including my observations in a virtualized environment.

How reproducible/Steps to Reproduce:

It takes about 15 minutes of repeated boot-halt cycles on 5 copies of a RHEL5.0 VM.

Actual results:

OS hangs.

Expected results:

No hang.

Additional info:

Solutions: The following one-line change has been verified to fix the problem.

     int __init check_nmi_watchdog(void)
     {
    -    volatile int endflag = 0;
    +    static volatile int endflag = 0;
         int *counts;

Also, the race has been fixed in the 2.6.20 kernel. It would be a good idea to apply the same patch.
The analysis is correct. I suggest using upstream commit 92715e282be7c7488f892703c8d39b08976a833b instead. P.
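For readers following the analysis, the lifetime problem and the effect of the one-line `static` change can be sketched in userspace. This is only an illustration under stated assumptions: it is neither the upstream commit nor kernel code. Threads stand in for CPUs 1-3, C11 atomics replace `volatile` plus `barrier()`, the spin loop yields (unlike the kernel loop, which deliberately does not relax), and the names `waiter`, `signal_and_return`, and `run_demo` are hypothetical. The point it shows is that a flag with static storage duration stays valid after the function that sets it has returned, so late waiters still see the 1.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NWAITERS 3              /* stands in for CPUs 1-3 */

/* Static storage duration: the flag outlives any one stack frame,
 * so a waiter that runs late still reads a valid location. */
static atomic_int endflag;
static atomic_int released;     /* number of waiters that saw the flag */

/* Hypothetical analogue of nmi_cpu_busy(): spin until the flag is set. */
static void *waiter(void *unused)
{
    (void)unused;
    while (atomic_load(&endflag) == 0)
        sched_yield();          /* userspace courtesy, not in the kernel loop */
    atomic_fetch_add(&released, 1);
    return NULL;
}

/* Hypothetical analogue of the early-return path in check_nmi_watchdog():
 * set the flag and return without waiting for the spinners. */
static void signal_and_return(void)
{
    atomic_store(&endflag, 1);
}

/* Start the waiters, signal them, and report how many were released. */
int run_demo(void)
{
    pthread_t tid[NWAITERS];

    atomic_store(&endflag, 0);
    atomic_store(&released, 0);

    for (int i = 0; i < NWAITERS; i++) {
        int rc = pthread_create(&tid[i], NULL, waiter, NULL);
        assert(rc == 0);
        (void)rc;
    }

    signal_and_return();        /* the setter's frame is gone after this */

    for (int i = 0; i < NWAITERS; i++)
        pthread_join(tid[i], NULL);

    return atomic_load(&released);
}
```

Calling `run_demo()` from a `main()` and building with `-pthread` should release all three waiters. If `endflag` were instead a local variable of `signal_and_return()` and its address were handed to the waiters, they would be reading a dead stack slot once that function returned, which is the situation described in this report.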
Created attachment 273831
RHEL5 fix for this issue

Initial backport.
This request was evaluated by Red Hat Product Management for inclusion in a Red Hat Enterprise Linux maintenance release. Product Management has requested further review of this request by Red Hat Engineering, for potential inclusion in a Red Hat Enterprise Linux Update release for currently deployed products. This request is not yet committed for inclusion in an Update release.
The fix is in 2.6.18-62.el5.
You can download this test kernel from http://people.redhat.com/dzickus/el5
(In reply to comment #5)
> This request was evaluated by Red Hat Product Management for inclusion in a Red
> Hat Enterprise Linux maintenance release. Product Management has requested
> further review of this request by Red Hat Engineering, for potential
> inclusion in a Red Hat Enterprise Linux Update release for currently deployed
> products. This request is not yet committed for inclusion in an Update
> release.

Has the fix been included in an update release?
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2008-0314.html
Hi,

I am using SUSE 10.2 and am having some issues with one of my device drivers. When the system boots, everything seems to be fine, but dmesg shows the same message somewhere in the log: NMI seems to be stuck. Nothing goes wrong at that point, but as soon as I load my device driver and start reading from it, the kernel hangs with this error:

Unable to handle kernel NULL pointer dereference at 0000000000000000
RIP: <ffffffff802ea868>{_spin_lock_irqsave+3}

I changed nmi.c as you described, but the NMI seems to be stuck message does not go away. I checked my kernel version and it is 2.6.16, so I am not sure whether this is a kernel issue or a driver issue. I would really appreciate some guidance on what to do.

Thanks,
Arun Mittal