357391 – RHEL5: Race in smp-linux kernel (nmi.c)


Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	kernel
Sub Component:
Version:	5.0
Hardware:	x86_64
OS:	Linux
Priority:	low
Severity:	low
Target Milestone:	---
Target Release:	---
Assignee:	Prarit Bhargava
QA Contact:	Martin Jenner
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2007-10-29 22:08 UTC by Bhavesh Mehta
Modified:	2009-09-23 15:43 UTC
CC List:	6 users
Fixed In Version:	RHBA-2008-0314
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2008-05-21 14:59:33 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
RHEL5 fix for this issue (2.97 KB, patch) 2007-11-30 14:40 UTC, Prarit Bhargava

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2008:0314	0	normal	SHIPPED_LIVE	Updated kernel packages for Red Hat Enterprise Linux 5.2	2008-05-20 18:43:34 UTC

Description Bhavesh Mehta 2007-10-29 22:08:09 UTC

Description of problem:
On a 4-CPU system, say, CPUs 1-3 spin on the endflag variable (which is
located on the CPU0 stack!), waiting until CPU0 sets it to 1. When CPU0 decides
that the NMI is stuck, it sets endflag to 1, logs the event, and returns. What
if CPUs 1-3 don't get a chance to run before the next function called on CPU0
zeroes out the stack location that used to hold endflag? They would then keep
spinning, which can result in a hang. This race is more likely to be exposed in
a virtualized environment, although it can happen on a physical setup too.
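
The interleaving can be modelled with a small single-threaded C sketch. It is only an illustration under stated assumptions: `stack_slot`, `cpu0_check`, `cpu0_next_call`, and `spinner_exits` are hypothetical names, not kernel code, and a plain variable stands in for the CPU0 stack location that holds endflag.

```c
#include <stdbool.h>

/* Stands in for the CPU0 stack location that holds endflag (assumption). */
static int stack_slot;

/* CPU0 decides the NMI is stuck: it sets the flag and returns. */
static void cpu0_check(void)
{
    stack_slot = 1;
}

/* The next function called on CPU0 reuses the same stack location. */
static void cpu0_next_call(void)
{
    stack_slot = 0;
}

/* One look at the flag by a spinning CPU: true means it leaves the loop. */
static bool spinner_sees_flag(void)
{
    return stack_slot != 0;
}

/*
 * Replays one ordering of events and reports whether the spinner ever
 * leaves its loop. If the spinner only gets to run after the slot has
 * been reused, it never observes the value 1.
 */
static bool spinner_exits(bool spinner_runs_before_reuse)
{
    bool exited = false;

    stack_slot = 0;
    cpu0_check();
    if (spinner_runs_before_reuse)
        exited = spinner_sees_flag();
    cpu0_next_call();
    if (!exited)
        exited = spinner_sees_flag();
    return exited;
}
```

In this model `spinner_exits(true)` reports that the spinner leaves the loop, while `spinner_exits(false)` reports that it does not, which corresponds to the hang.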

Excerpt from linux-2.6.17 (arch/x86_64/kernel/nmi.c):

static __init void nmi_cpu_busy(void *data)
{
	volatile int *endflag = data;
	local_irq_enable();
	/* Intentionally don't use cpu_relax here. This is
	   to make sure that the performance counter really ticks,
	   even if there is a simulator or similar that catches the
	   pause instruction. On a real HT machine this is fine because
	   all other CPUs are busy with "useless" delay loops and don't
	   care if they get somewhat less cycles. */
	while (*endflag == 0)
		barrier();
}
#endif

int __init check_nmi_watchdog (void)
{
	volatile int endflag = 0; <------------------
	int *counts;
	int cpu;

	counts = kmalloc(NR_CPUS * sizeof(int), GFP_KERNEL);
	if (!counts)
		return -1;

	printk(KERN_INFO "testing NMI watchdog ... ");

#ifdef CONFIG_SMP
	if (nmi_watchdog == NMI_LOCAL_APIC)
		smp_call_function(nmi_cpu_busy, (void *)&endflag, 0, 0);
#endif

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		counts[cpu] = cpu_pda(cpu)->__nmi_count;
	local_irq_enable();
	mdelay((10*1000)/nmi_hz); // wait 10 ticks

	for_each_online_cpu(cpu) {
		if (cpu_pda(cpu)->__nmi_count - counts[cpu] <= 5) {
			endflag = 1; <---------------------------
			printk("CPU#%d: NMI appears to be stuck (%d->%d)!\n",
			       cpu,
			       counts[cpu],
			       cpu_pda(cpu)->__nmi_count);
			nmi_active = 0;
			lapic_nmi_owner &= ~LAPIC_NMI_WATCHDOG;
			nmi_perfctr_msr = 0;
			kfree(counts);
			return -1; <-----------------------
		}
	}

This race has been verified to be present in RHEL5's 2.6.18-8.el5,
kernel-2.6.18-8.1.15.el5.src.rpm, and kernel-2.6.18-51.el5.jwltest.43, which
reflects ToT for RHEL5 kernels.

I am including my observations in a virtualized environment.

How reproducible/Steps to Reproduce:
The hang shows up after about 15 minutes of repeated boot-halt cycles on 5
copies of a RHEL5.0 VM.
  
Actual results: OS hangs.

Expected results: No hang.

Additional info:


Solutions:

The following one-line change has been verified to fix the problem.

int __init check_nmi_watchdog(void)
{
- volatile int endflag = 0;
+ static volatile int endflag = 0;
int *counts;

Also, the race has been fixed in the 2.6.20 kernel. It would be a good idea to
apply the same patch.
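
Why static storage helps can be sketched in userspace, with pthreads standing in for CPUs. This is an analogy under assumptions, not the RHEL patch or the upstream change: `busy_worker`, `start_and_release`, and `wait_released` are hypothetical names, and C11 atomics replace the kernel's volatile/barrier() loop. The point is that the caller returns without waiting for the spinners, yet the flag they poll stays valid because it does not live in the caller's stack frame.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Static storage: still valid after start_and_release() has returned. */
static atomic_int endflag;
/* Number of workers that have left their spin loop (used for checking). */
static atomic_int released;

/* Rough counterpart of nmi_cpu_busy(): spin until the flag is set. */
static void *busy_worker(void *arg)
{
    atomic_int *flag = arg;

    while (atomic_load(flag) == 0)
        ;   /* busy-wait on purpose, as in the kernel loop */
    atomic_fetch_add(&released, 1);
    return NULL;
}

/*
 * Rough counterpart of check_nmi_watchdog(): start the spinners, set the
 * flag, and return immediately without waiting for them.
 * Returns the number of workers actually started.
 */
static int start_and_release(int nworkers)
{
    int started = 0;

    atomic_store(&endflag, 0);
    atomic_store(&released, 0);
    for (int i = 0; i < nworkers; i++) {
        pthread_t tid;

        if (pthread_create(&tid, NULL, busy_worker, &endflag) != 0)
            break;
        pthread_detach(tid);
        started++;
    }
    atomic_store(&endflag, 1);  /* release every spinner */
    return started;             /* the caller's frame is gone after this */
}

/* Block until at least `expected` workers have left their loop. */
static int wait_released(int expected)
{
    while (atomic_load(&released) < expected)
        sched_yield();
    return atomic_load(&released);
}
```

A caller would invoke `start_and_release(3)` and later `wait_released(3)`; every worker gets out of its loop no matter how late it is scheduled, because nothing else can reuse the flag's storage. Build with `-pthread`.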

Comment 1 Prarit Bhargava 2007-11-30 13:25:42 UTC

The analysis is correct.  I suggest using upstream commit
92715e282be7c7488f892703c8d39b08976a833b instead.

P.

Comment 2 Prarit Bhargava 2007-11-30 14:40:07 UTC

Created attachment 273831
RHEL5 fix for this issue

Initial backport.

Comment 5 RHEL Program Management 2007-12-07 21:45:29 UTC

This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux maintenance release.  Product Management has requested
further review of this request by Red Hat Engineering, for potential
inclusion in a Red Hat Enterprise Linux Update release for currently deployed
products.  This request is not yet committed for inclusion in an Update
release.

Comment 7 Don Zickus 2007-12-21 20:18:16 UTC

in 2.6.18-62.el5
You can download this test kernel from http://people.redhat.com/dzickus/el5

Comment 9 Bhavesh Mehta 2008-02-07 22:26:46 UTC

(In reply to comment #5)
> This request was evaluated by Red Hat Product Management for inclusion in a Red
> Hat Enterprise Linux maintenance release.  Product Management has requested
> further review of this request by Red Hat Engineering, for potential
> inclusion in a Red Hat Enterprise Linux Update release for currently deployed
> products.  This request is not yet committed for inclusion in an Update
> release.

Has the fix been included in an update release?

Comment 11 errata-xmlrpc 2008-05-21 14:59:33 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2008-0314.html

Comment 12 arun mittal 2009-09-23 15:43:42 UTC

Hi

I am using SUSE 10.2 and am having some issues with one of my device drivers. When my system boots up, everything seems to be fine, but when I run dmesg, somewhere in the log it shows the same message, "NMI seems to be stuck". Nothing goes wrong at that point, but as soon as I plug in my device driver and start reading from it, the kernel hangs with this error:

Unable to handle kernel NULL pointer dereference at 0000000000000000 RIP:
<ffffffff802ea868>{_spin_lock_irqsave+3}

I changed the file nmi.c as you mentioned, but the "NMI seems to be stuck" message doesn't go away. I checked my kernel version and it is 2.6.16, so I am not sure whether this is a kernel issue or a driver issue.

I would really appreciate it if you could give me some guidance on what to do.

Thanks
Arun Mittal
