Bug 1671126 - NMI watchdog ineffective due to mismerge
Summary: NMI watchdog ineffective due to mismerge
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 7
Classification: Red Hat
Component: kernel-rt
Version: 7.7
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: rc
Target Release: ---
Assignee: Scott Wood
QA Contact: Tiefu
URL:
Whiteboard:
Depends On:
Blocks: 1655694
Reported: 2019-01-30 19:51 UTC by Scott Wood
Modified: 2019-08-06 12:36 UTC (History)

Fixed In Version: 08.rt56.9kernel-rt-3.10.0-1066.el7
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-08-06 12:36:27 UTC
Target Upstream Version:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHSA-2019:2043 | 0 | None | None | None | 2019-08-06 12:36:58 UTC

Description Scott Wood 2019-01-30 19:51:13 UTC
There is an extra "return" added by merge commit 7be8efa0a0bacd464 in watchdog_overflow_callback() that prevents an NMI watchdog-detected lockup from ever being reported.
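
To illustrate the failure mode, here is a simplified userspace sketch, not the kernel source. The struct and both function names are hypothetical, and the exact position of the extra return inside watchdog_overflow_callback() is an assumption, since the report does not quote the code. It only shows why a stray early return leaves the reporting path unreachable while the callback itself still runs on every NMI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-CPU state the watchdog inspects. */
struct cpu_state {
    bool hard_lockup;   /* true if the CPU appears to be stuck */
    int reports;        /* number of lockup reports issued so far */
};

/* Mismerged shape (assumed): the callback runs, but a stray return
 * exits before the lockup check, so nothing is ever reported. */
static void overflow_callback_mismerged(struct cpu_state *cpu)
{
    assert(cpu != NULL);
    return;             /* stands in for the extra return from the merge */
    if (cpu->hard_lockup)
        cpu->reports++; /* unreachable */
}

/* Intended shape: the lockup check is reached and a report is issued. */
static void overflow_callback_fixed(struct cpu_state *cpu)
{
    assert(cpu != NULL);
    if (cpu->hard_lockup)
        cpu->reports++;
}
```

Calling both functions on a state with hard_lockup set leaves reports at 0 for the mismerged variant and raises it to 1 for the fixed one.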

Comment 9 Tiefu 2019-05-14 10:56:18 UTC
[Tiefu Li 14 May 2019]
Following the principle of "does anything break" testing, I can't see any difference in behaviour between the affected version and the fixed version.
Please refer to the steps below:
1. Ensure that the testing environment contains the bug:
[root@hp-dl380eg8-01 ~]# uname -r
3.10.0-976.rt56.930.el7.x86_64
2. Ensure that the feature is on:
[root@hp-dl380eg8-01 ~]# cat /proc/sys/kernel/nmi_watchdog
1    (a value of 1 means the NMI watchdog is on)
3. Observe the interrupts:
[root@hp-dl380eg8-01 ~]# grep NMI /proc/interrupts
 NMI:          9          6          6          6          6          7          7          7          9          7          7          7        174          6          6          6          6          6          6          7          8          6          6          6   Non-maskable interrupts

Secondly, I installed the fixed kernel on my server and tested again.
Below is my step-by-step procedure:
1. Installation:
[root@hp-dl380eg8-01 ~]# wget http://download.eng.pek2.redhat.com/brewroot/packages/kernel-rt/3.10.0/1010.rt56.968.el7/x86_64/kernel-rt-3.10.0-1010.rt56.968.el7.x86_64.rpm
[root@hp-dl380eg8-01 ~]# rpm -ihv kernel-rt-3.10.0-1010.rt56.968.el7.x86_64.rpm
[root@hp-dl380eg8-01 ~]# grubby --default-kernel
/boot/vmlinuz-3.10.0-1010.rt56.968.el7.x86_64
[root@hp-dl380eg8-01 ~]# rhts-reboot
Connection to hp-dl380eg8-01.rhts.eng.pek2.redhat.com closed by remote host.

2. After rebooting the server:
[root@hp-dl380eg8-01 ~]# uname -r
3.10.0-1010.rt56.968.el7.x86_64
[root@hp-dl380eg8-01 ~]# cat /proc/sys/kernel/nmi_watchdog
1
[root@hp-dl380eg8-01 ~]# grep NMI /proc/interrupts
 NMI:         10          7          7          7          7          8          8          8         10          8          8          8        209          7          7          7          7          7          7          8          9          7          7          7   Non-maskable interrupts
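
The counter lines above can be totalled with a small helper, sketched here in C. The function name sum_nmi_counts is hypothetical, and the parsing assumes the layout shown in this comment: an "NMI:" label, one decimal counter per CPU, then the description text. Non-zero counters show only that NMIs are being delivered. They say nothing about whether a detected lockup would be reported, which may explain why this check looks the same on both kernels.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters on one "NMI:" line of /proc/interrupts.
 * Stores the number of counters found in *ncpus.
 * Returns the total, or -1 if the line does not start with "NMI:". */
static long sum_nmi_counts(const char *line, int *ncpus)
{
    assert(line != NULL && ncpus != NULL);
    *ncpus = 0;

    /* Skip leading whitespace before the label. */
    while (isspace((unsigned char)*line))
        line++;
    if (strncmp(line, "NMI:", 4) != 0)
        return -1;
    line += 4;

    long total = 0;
    for (;;) {
        char *end;
        unsigned long value = strtoul(line, &end, 10);
        if (end == line)
            break;      /* no more numbers: the rest is description text */
        total += (long)value;
        (*ncpus)++;
        line = end;
    }
    return total;
}
```

Applied to the two lines in this comment, the helper finds 24 counters each time, totalling 327 on 3.10.0-976.rt56.930 and 385 on 3.10.0-1010.rt56.968.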

Comment 11 errata-xmlrpc 2019-08-06 12:36:27 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2019:2043

