Bug 489521 - Disable all cpus' watchdog on error in check_nmi_watchdog()
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: kernel
Hardware: x86_64 Linux
Priority: low, Severity: medium
Assigned To: Aristeu Rozanski
QA Contact: Red Hat Kernel QE team
Reported: 2009-03-10 11:27 EDT by Prarit Bhargava
Modified: 2010-06-24 17:37 EDT
Doc Type: Bug Fix
Last Closed: 2010-06-24 17:37:02 EDT
Attachments: None

Description Prarit Bhargava 2009-03-10 11:27:25 EDT
Description of problem:

During code inspection, dzickus and I noticed that the 4.8 kernel does not disable the watchdog on all CPUs when nmi_watchdog == NMI_LOCAL_APIC. On error, check_nmi_watchdog() clears wd_enabled only for the CPU whose NMI count appears stuck:

int __init check_nmi_watchdog (void)
{
        int counts[NR_CPUS];
        int cpu;

        if (!atomic_read(&nmi_watchdog_active))
                return 0;

        printk(KERN_INFO "testing NMI watchdog ... ");

        for (cpu = 0; cpu < NR_CPUS; cpu++)
                counts[cpu] = cpu_pda[cpu].__nmi_count;
        mdelay((10*1000)/nmi_hz); // wait 10 ticks

        for (cpu = 0; cpu < NR_CPUS; cpu++) {
                if (!cpu_online(cpu))
                        continue;
                if (!per_cpu(wd_enabled, cpu))
                        continue;

                if (cpu_pda[cpu].__nmi_count - counts[cpu] <= 5) {
                        printk("CPU#%d: NMI appears to be stuck (%d)!\n",
                               ...);
                        if (atomic_dec_and_test(&nmi_watchdog_active))
                                nmi_active = 0;
                        per_cpu(wd_enabled, cpu) = 0; <<< only disables _this_ cpu's watchdog, not all of them.
                        goto error;
                }
        }
        if (!atomic_read(&nmi_watchdog_active)) {
                atomic_set(&nmi_watchdog_active, -1);
                nmi_active = -1;
                goto error;
        }
        ...

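
The behaviour the summary asks for can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the RHEL-4 patch: plain int arrays stand in for the kernel's per-CPU variables and atomics, NR_CPUS is fixed at a small value, and the names cpu_is_online, disable_all_watchdogs() and check_watchdog() are hypothetical.

```c
#include <stdio.h>

#define NR_CPUS 4  /* hypothetical small value for the simulation */

/* Stand-ins for kernel state; plain ints replace per-CPU vars and atomics. */
static int wd_enabled[NR_CPUS];
static int cpu_is_online[NR_CPUS];
static int nmi_watchdog_active;

/* Hypothetical helper: turn the watchdog off on every CPU, not just one. */
static void disable_all_watchdogs(void)
{
    int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++)
        wd_enabled[cpu] = 0;
    nmi_watchdog_active = 0;  /* assumption: no watchdog remains active */
}

/*
 * Compare NMI counts sampled before and after the delay.
 * Returns 0 if every online, enabled CPU advanced by more than 5 ticks,
 * or -1 after disabling the watchdog on all CPUs.
 */
static int check_watchdog(const int *before, const int *after)
{
    int cpu;

    for (cpu = 0; cpu < NR_CPUS; cpu++) {
        if (!cpu_is_online[cpu])
            continue;
        if (!wd_enabled[cpu])
            continue;

        if (after[cpu] - before[cpu] <= 5) {
            printf("CPU#%d: NMI appears to be stuck (%d)!\n",
                   cpu, after[cpu]);
            disable_all_watchdogs();
            return -1;
        }
    }
    return 0;
}
```

With four online CPUs and one of them stuck, check_watchdog() returns -1 and leaves wd_enabled cleared for every CPU, which is the difference from the excerpt above, where only the failing CPU's flag is cleared.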
Comment 1 Don Zickus 2010-06-24 17:37:02 EDT
This is only seen in the error path, and with RHEL-4 nearing the end of its life, I don't think it is worth fixing.
