1598309 – Dazed and confused: NMI received for unknown reason (21|31) on CPU x

Keywords:
Status:	CLOSED EOL
Alias:	None
Product:	Fedora
Classification:	Fedora
Component:	kernel
Sub Component:
Version:	34
Hardware:	x86_64
OS:	Linux
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Assignee:	Kernel Maintainer List
QA Contact:	Fedora Extras Quality Assurance
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-07-05 05:34 UTC by Raman Gupta
Modified:	2022-12-25 05:00 UTC
CC List:	34 users
Fixed In Version:
Clone Of:
Environment:
Last Closed:	2022-06-08 06:24:26 UTC
Type:	Bug
Embargoed:
Dependent Products:

Attachments
Output of `journalctl --dmesg` (7.98 MB, text/x-vhdl) 2018-07-05 05:34 UTC, Raman Gupta
Output of `lspci` (7.16 KB, text/plain) 2018-07-05 05:36 UTC, Raman Gupta
Output of `lspci -t` (1.56 KB, text/plain) 2018-07-05 05:36 UTC, Raman Gupta
Output of `cat /proc/cpuinfo` (43.86 KB, text/plain) 2018-07-05 05:38 UTC, Raman Gupta

Description Raman Gupta 2018-07-05 05:34:12 UTC

Created attachment 1456659 [details]
Output of `journalctl --dmesg`

Description of problem:

Occasionally (once every 1-4 weeks or so) I receive a message like this from the kernel:

Jul 04 01:50:33 edison kernel: Uhhuh. NMI received for unknown reason 21 on CPU 23.
Jul 04 01:50:33 edison kernel: Do you have a strange power saving mode enabled?
Jul 04 01:50:33 edison kernel: Dazed and confused, but trying to continue

Here are the historical entries from journalctl that are still available -- these logs span multiple kernel versions:

Mar 14 00:25:44 edison kernel: Uhhuh. NMI received for unknown reason 21 on CPU 5.
Mar 14 00:25:47 edison kernel: Uhhuh. NMI received for unknown reason 31 on CPU 25.
Mar 22 21:39:04 edison kernel: Uhhuh. NMI received for unknown reason 31 on CPU 25.
Mar 22 22:19:46 edison kernel: Uhhuh. NMI received for unknown reason 31 on CPU 17.
Apr 06 05:20:06 edison kernel: Uhhuh. NMI received for unknown reason 31 on CPU 23.
May 08 19:52:24 edison kernel: Uhhuh. NMI received for unknown reason 21 on CPU 20.
Jun 14 06:16:50 edison kernel: Uhhuh. NMI received for unknown reason 31 on CPU 23.
Jul 04 01:50:33 edison kernel: Uhhuh. NMI received for unknown reason 21 on CPU 23.
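
For anyone comparing logs across machines, here is a minimal sketch (not part of the original report) that tallies reason codes and CPU numbers from lines like the ones above. It assumes the message format quoted in this report; the helper name `tally_unknown_nmis` is made up for illustration.

```python
import re
from collections import Counter

# Matches the kernel message format quoted above; the reason code is hexadecimal.
NMI_LINE = re.compile(
    r"NMI received for unknown reason ([0-9a-fA-F]{2}) on CPU (\d+)"
)


def tally_unknown_nmis(lines):
    """Count unknown-NMI messages per reason code and per CPU (hypothetical helper)."""
    reasons = Counter()
    cpus = Counter()
    for line in lines:
        match = NMI_LINE.search(line)
        if match:
            reasons[match.group(1).lower()] += 1
            cpus[int(match.group(2))] += 1
    return reasons, cpus


if __name__ == "__main__":
    import sys

    # Example: journalctl --dmesg | python3 tally_nmi.py
    reasons, cpus = tally_unknown_nmis(sys.stdin)
    print("by reason:", dict(reasons))
    print("by CPU:", dict(cpus))
```

Lines that do not match (such as the "Dazed and confused" follow-up line) are ignored, so the full journal output can be piped in unchanged.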


Version-Release number of selected component (if applicable):

Currently 4.17.3-100.fc27.x86_64 but have seen this error over several kernel updates (ever since I built this AMD Threadripper-based hardware).

# uname -a
Linux edison 4.17.3-100.fc27.x86_64 #1 SMP Tue Jun 26 14:19:03 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux


How reproducible:

Happens every 1-4 weeks, with no noticeable trigger.


Steps to Reproduce:
1. Wait


Additional info:

Comment 1 Raman Gupta 2018-07-05 05:36:24 UTC

Created attachment 1456661 [details]
Output of `lspci`

Comment 2 Raman Gupta 2018-07-05 05:36:51 UTC

Created attachment 1456662 [details]
Output of `lspci -t`

Comment 3 Raman Gupta 2018-07-05 05:38:06 UTC

Created attachment 1456663 [details]
Output of `cat /proc/cpuinfo`

Comment 4 Justin M. Forbes 2018-07-23 15:33:22 UTC

*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 27 kernel bugs.

Fedora 27 has now been rebased to 4.17.7-100.fc27.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 28, and are still experiencing this issue, please change the version to Fedora 28.

If you experience different issues, please open a new bug report for those.

Comment 5 aaronsloman 2018-08-02 20:12:13 UTC

I still have this bug in
4.17.9-100.fc27.x86_64 #1 SMP Mon Jul 23 22:35:38 UTC 2018

I've had it intermittently for several months.
The annoying thing is that this is broadcast on all open xterm windows. Can that be suppressed?

This seems to happen at random times, but most recently after coming out of hibernate. Here's the dmesg output after resume from hibernate a few minutes ago:

[16590.754113] acpi LNXPOWER:00: Turning OFF
[16590.754389] PM: Basic memory bitmaps freed
[16590.754394] OOM killer enabled.
[16590.754398] Restarting tasks ... done.
[16590.764983] thermal thermal_zone5: failed to read out thermal zone (-61)
[16590.765973] PM: hibernation exit
[16591.083373] Bluetooth: hci0: Bootloader revision 0.0 build 2 week 52 2014
[16591.090389] Bluetooth: hci0: Device revision is 5
[16591.090396] Bluetooth: hci0: Secure boot is enabled
[16591.090398] Bluetooth: hci0: OTP lock is enabled
[16591.090401] Bluetooth: hci0: API lock is enabled
[16591.090403] Bluetooth: hci0: Debug lock is disabled
[16591.090405] Bluetooth: hci0: Minimum firmware build 1 week 10 2014
[16591.092900] Bluetooth: hci0: Found device firmware: intel/ibt-11-5.sfi
[16591.168975] IPv6: ADDRCONF(NETDEV_UP): enp1s0f2: link is not ready
[16591.180576] IPv6: ADDRCONF(NETDEV_UP): wlp2s0: link is not ready
[16591.423111] IPv6: ADDRCONF(NETDEV_UP): wlp2s0: link is not ready
[16591.678061] IPv6: ADDRCONF(NETDEV_UP): wlp2s0: link is not ready
[16591.740744] IPv6: ADDRCONF(NETDEV_UP): wlp2s0: link is not ready
[16593.212386] Bluetooth: hci0: Waiting for firmware download to complete
[16593.212405] Bluetooth: hci0: Firmware loaded in 2080246 usecs
[16593.212630] Bluetooth: hci0: Waiting for device to boot
[16593.224743] Bluetooth: hci0: Device booted in 11894 usecs
[16593.225445] Bluetooth: hci0: Found Intel DDC parameters: intel/ibt-11-5.ddc
[16593.229723] Bluetooth: hci0: Applying Intel DDC parameters completed
[16595.307159] IPv6: ADDRCONF(NETDEV_UP): wlp2s0: link is not ready
[16596.339347] wlp2s0: authenticate with c4:e9:84:b3:39:e9
[16596.348671] wlp2s0: send auth to c4:e9:84:b3:39:e9 (try 1/3)
[16596.353982] wlp2s0: authenticated
[16596.354778] wlp2s0: associate with c4:e9:84:b3:39:e9 (try 1/3)
[16596.364980] wlp2s0: RX AssocResp from c4:e9:84:b3:39:e9 (capab=0x411 status=0 aid=1)
[16596.368347] wlp2s0: associated
[16596.382397] IPv6: ADDRCONF(NETDEV_CHANGE): wlp2s0: link becomes ready
[16596.404457] wlp2s0: Limiting TX power to 17 (20 - 3) dBm as advertised by c4:e9:84:b3:39:e9
[16879.050072] Uhhuh. NMI received for unknown reason 3c on CPU 0.
[16879.050077] Do you have a strange power saving mode enabled?
[16879.050079] Dazed and confused, but trying to continue

Comment 6 Raman Gupta 2018-08-08 05:16:18 UTC

I can confirm this issue is still occurring on 4.17.9-100.fc27.x86_64.

Comment 7 aaronsloman 2018-08-08 19:26:33 UTC

Also 4.17.11-100.fc27.x86_64 #1 SMP Mon Jul 30 15:22:33 UTC 2018
on a Clevo W515LU mini notebook and on a Viglen desktop PC

dmesg on Clevo after hibernate:

 [22443.646347] wlp2s0: associated
 [22443.659516] IPv6: ADDRCONF(NETDEV_CHANGE): wlp2s0: link becomes ready
 [22443.728125] wlp2s0: Limiting TX power to 17 (20 - 3) dBm as advertised by c4:e9:84:b3:39:e9
 [22754.408184] Uhhuh. NMI received for unknown reason 3c on CPU 0.
 [22754.408191] Do you have a strange power saving mode enabled?
 [22754.408193] Dazed and confused, but trying to continue

dmesg on PC after hibernate:

 [32343.502024] Restarting tasks ... done.
 [32343.504460] PM: hibernation exit
 [32344.480374] e1000e: eno1 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
 [33002.841393] Uhhuh. NMI received for unknown reason 21 on CPU 0.
 [33002.841394] Do you have a strange power saving mode enabled?
 [33002.841394] Dazed and confused, but trying to continue

Comment 8 Raman Gupta 2018-08-31 19:28:30 UTC

Still occurring on 4.17.12-100.fc27.x86_64.

Comment 9 Raman Gupta 2018-08-31 19:29:31 UTC

Even though this happens rarely, it is extremely annoying because the message gets sent to every open terminal window.

Comment 10 aaronsloman 2018-09-01 00:15:26 UTC

I have the same problem, running Fedora 27 on a desktop PC and a (rebadged) Clevo laptop, although the frequency has gone down.

Current kernel

On PC:     4.17.14-102.fc27.x86_64 #1 SMP Wed Aug 15 12:26:40 UTC 2018

On laptop: 4.17.17-100.fc27.x86_64 #1 SMP Mon Aug 20 15:53:11 UTC 2018

Frequency of messages is higher on the laptop. E.g., output of grep:

On PC:
grep Dazed messages*
 messages-20180812:Aug  7 09:55:05 vig kernel: Dazed and confused, but trying to continue
 messages-20180826:Aug 20 08:51:10 vig kernel: Dazed and confused, but trying to continue

On laptop:
grep Dazed messages*
 messages-20180812:Aug  7 14:53:03 stone kernel: Dazed and confused, but trying to continue
 messages-20180812:Aug  9 09:48:29 stone kernel: Dazed and confused, but trying to continue
 messages-20180819:Aug 15 11:50:08 stone kernel: Dazed and confused, but trying to continue
 messages-20180826:Aug 20 18:47:28 stone kernel: Dazed and confused, but trying to continue
 messages-20180826:Aug 25 18:55:25 stone kernel: Dazed and confused, but trying to continue
 messages-20180826:Aug 26 00:00:52 stone kernel: Dazed and confused, but trying to continue
 messages-20180826:Aug 26 01:34:55 stone kernel: Dazed and confused, but trying to continue

The greater frequency of messages on the laptop may be due to more frequent use of suspend and hibernate.

Comment 11 aaronsloman 2018-09-01 00:36:06 UTC

It looks as if the "Dazed and confused" messages are part of some debugging code.
It should be possible for users to turn them off.

In the examples in my previous comment, "stone" is the name of the laptop and "vig" the PC.

Comment 12 Łukasz Faber 2018-09-01 13:31:40 UTC

I have similar messages on Ryzen 1700X CPU:

[24858.583553] Uhhuh. NMI received for unknown reason 0c on CPU 4.
[24858.583553] Do you have a strange power saving mode enabled?
[24858.583554] Dazed and confused, but trying to continue

Comment 13 aaronsloman 2018-09-01 14:58:54 UTC

It looks as if these reports come from very old code (unless something has been copied into a new context).

The oldest reports I've been able to find (thanks, Google!) come from Red Hat 6.2 1972 (reported on 11:18:2002): running
  REDHAT LINUX ADVANCED SERVER RELEASE 2.1AS/i686 (PENSACOLA):
Reported here:
 https://www.linuxquestions.org/questions/linux-hardware-18/memory-dazed-and-confused-35801/

Since then (17 years) there seem to have been very many fault reports about "Dazed and confused" messages. Is this a record?

It's possible that what looks like one three line error message is generated by different pieces of code during exit after the initial error. To help with debugging, the messages probably need more information attached. But if the error does not interfere with continued use of the computer the messages should simply be recorded internally, not broadcast on all terminals.
???

Comment 14 Laura Abbott 2018-10-01 21:26:08 UTC

We apologize for the inconvenience.  There is a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 27 kernel bugs.
 
Fedora 27 has now been rebased to 4.18.10-100.fc27.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you have moved on to Fedora 28 or Fedora 29, and are still experiencing this issue, please change the version to Fedora 28 or 29.
 
If you experience different issues, please open a new bug report for those.

Comment 15 Łukasz Faber 2018-10-13 15:43:25 UTC

I have encountered it again on 4.18.12-200.fc28.x86_64

[113732.728058] Uhhuh. NMI received for unknown reason 0c on CPU 15.
[113732.728059] Do you have a strange power saving mode enabled?
[113732.728060] Dazed and confused, but trying to continue

Comment 16 Ben Cotton 2018-11-27 14:01:58 UTC

This message is a reminder that Fedora 27 is nearing its end of life.
On 2018-Nov-30 Fedora will stop maintaining and issuing updates for
Fedora 27. It is Fedora's policy to close all bug reports from releases
that are no longer maintained. At that time this bug will be closed as
EOL if it remains open with a Fedora 'version' of '27'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 27 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 17 Ben Cotton 2018-11-30 23:14:55 UTC

Fedora 27 changed to end-of-life (EOL) status on 2018-11-30. Fedora 27 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 18 Raman Gupta 2020-02-25 06:15:15 UTC

Still an issue on Fedora 31, kernel 5.4.19-200.fc31.x86_64.

Comment 19 Zachary Smith 2020-04-04 22:19:38 UTC

Also still an issue on 5.5.10-200.fc31.x86_64

Comment 20 Kim Bisgaard 2020-04-25 08:53:07 UTC

Just saw this on f32 kernel 5.6.5-300.fc32.x86_64

tail /proc/cpuinfo:
processor       : 7
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 17
model name      : AMD Ryzen 5 2400G with Radeon Vega Graphics
stepping        : 0
microcode       : 0x8101016
cpu MHz         : 1423.938
cache size      : 512 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 7186.35
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate eff_freq_ro [13] [14]

Comment 21 Andrej 2020-05-12 18:28:11 UTC

The same on freshly installed F32 with 5.6.10-300.fc32.x86_64. I was compiling some stuff on all cores when this happened (make -j).

May 12 20:09:31 ryzen kernel: Dazed and confused, but trying to continue
May 12 20:09:31 ryzen kernel: Do you have a strange power saving mode enabled?
May 12 20:09:31 ryzen kernel: Uhhuh. NMI received for unknown reason 3c on CPU 12.

processor       : 0
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 8
model name      : AMD Ryzen 7 2700X Eight-Core Processor
stepping        : 2
microcode       : 0x800820d
cpu MHz         : 1884.122
cache size      : 512 KB
physical id     : 0
siblings        : 16
core id         : 0
cpu cores       : 8
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca                                                        
bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 7386.55
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]

Comment 22 Henrique Martins 2020-06-24 20:36:36 UTC

Uhhuh. NMI received for unknown reason 3c on CPU 4

processor       : 7
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 24
model name      : AMD Ryzen 5 3400G with Radeon Vega Graphics
stepping        : 1
microcode       : 0x8108102
cpu MHz         : 1414.466
cache size      : 512 KB
physical id     : 0
siblings        : 8
core id         : 3
cpu cores       : 4
apicid          : 7
initial apicid  : 7
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 7399.92
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate eff_freq_ro [13] [14]

Comment 23 Raman Gupta 2020-06-25 04:11:37 UTC

Still happening on 5.6.15-200.fc31.x86_64. This is so annoying. I'm fine with just suppressing this message so it doesn't blast out to every terminal. Anyone know how?

Comment 24 Raman Gupta 2020-06-25 04:12:02 UTC

May 31 17:54:21 edison kernel: Uhhuh. NMI received for unknown reason 20 on CPU 27.
Jun 07 14:17:40 edison kernel: Uhhuh. NMI received for unknown reason 21 on CPU 19.
Jun 09 04:34:21 edison kernel: Uhhuh. NMI received for unknown reason 20 on CPU 24.
Jun 16 02:23:40 edison kernel: Uhhuh. NMI received for unknown reason 30 on CPU 18.
Jun 23 04:37:19 edison kernel: Uhhuh. NMI received for unknown reason 30 on CPU 22.
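
A note on the reason codes seen in this thread (20, 21, 30, 31, 0c, 2c, 3c): as far as I understand the x86 NMI handler, the value printed is the byte read from the NMI status port, and the kernel only recognizes two causes in it, SERR (bit 7) and IOCHK (bit 6). The sketch below is an illustration under that assumption, not kernel code; the mask values mirror my reading of the kernel headers and the function name is hypothetical.

```python
# Assumed bit masks, based on my reading of the x86 kernel headers.
NMI_REASON_SERR = 0x80   # PCI system error reported
NMI_REASON_IOCHK = 0x40  # I/O channel check reported


def classify_nmi_reason(reason_hex):
    """Return the recognized causes in a reason byte, or ["unknown"] (hypothetical helper)."""
    reason = int(reason_hex, 16)
    causes = []
    if reason & NMI_REASON_SERR:
        causes.append("SERR")
    if reason & NMI_REASON_IOCHK:
        causes.append("IOCHK")
    return causes or ["unknown"]


if __name__ == "__main__":
    # Reason codes reported in this bug so far.
    for code in ["20", "21", "30", "31", "0c", "2c", "3c"]:
        print(code, format(int(code, 16), "08b"), classify_nmi_reason(code))
```

None of the codes reported here has bit 7 or bit 6 set, which is consistent with the kernel falling through to the "unknown reason" message; the differing low bits by themselves do not identify a source.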

Comment 25 Barrydocks 2020-07-21 08:54:43 UTC

Hi there,

I have just had this bug on my Ubuntu (18.04) system, installed on a brand new box:
Uhhuh. NMI received for unknown reason 31 on CPU 2.
Do you have a strange power saving mode enabled?
Dazed and confused, but trying to continue

Comment 26 Raman Gupta 2020-07-21 13:41:11 UTC

Yes, it is still happening on 5.7.8-100.fc31.x86_64, and still cannot be suppressed from spamming every console as far as I can tell. These kernel messages are configured as "emergency" (level 0) and `/proc/sys/kernel/printk` cannot be used to disable them.
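
To illustrate why `/proc/sys/kernel/printk` does not help here, a minimal sketch that parses the four values in that file and checks whether a message of a given level would pass the console threshold. The field names follow the kernel sysctl documentation; the helper names are hypothetical, and the comparison rule (a message is shown when its level is numerically lower than console_loglevel) is my understanding rather than something verified against this kernel version.

```python
from typing import NamedTuple


class PrintkLevels(NamedTuple):
    console_loglevel: int
    default_message_loglevel: int
    minimum_console_loglevel: int
    default_console_loglevel: int


def parse_printk(text):
    """Parse the whitespace-separated contents of /proc/sys/kernel/printk."""
    return PrintkLevels(*(int(field) for field in text.split()))


def reaches_console(message_level, levels):
    """Assumed rule: shown if the level is numerically below console_loglevel."""
    return message_level < levels.console_loglevel


def read_printk(path="/proc/sys/kernel/printk"):
    """Read and parse the live setting (Linux only)."""
    with open(path) as handle:
        return parse_printk(handle.read())


if __name__ == "__main__":
    levels = read_printk()
    print(levels)
    print("level 0 (emergency) reaches console:", reaches_console(0, levels))
```

Even with console_loglevel set to 1, a level 0 message still satisfies `0 < 1`, which matches the observation that these messages cannot be filtered this way.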

Comment 27 James Hilliard 2020-09-08 18:09:04 UTC

Seeing this on Ubuntu 20.04.1 (5.4.0-42-generic #46-Ubuntu SMP Fri Jul 10 00:24:02 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux) as well.

[1565723.677106] Uhhuh. NMI received for unknown reason 31 on CPU 25.
[1565723.677107] Do you have a strange power saving mode enabled?
[1565723.677107] Dazed and confused, but trying to continue

processor	: 25
vendor_id	: AuthenticAMD
cpu family	: 23
model		: 1
model name	: AMD Ryzen Threadripper 1950X 16-Core Processor
stepping	: 1
microcode	: 0x8001137
cpu MHz		: 1989.409
cache size	: 512 KB
physical id	: 0
siblings	: 32
core id		: 9
cpu cores	: 16
apicid		: 19
initial apicid	: 19
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid amd_dcm aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
bugs		: sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips	: 6786.33
TLB size	: 2560 4K pages
clflush size	: 64
cache_alignment	: 64
address sizes	: 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate eff_freq_ro [13] [14]

Comment 28 Nigel Reed 2020-09-21 15:06:07 UTC

I'm going to add to this even though I'm using Ubuntu, like the above user. It seems this is only affecting AMD Ryzen processors.

processor       : 15
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 1
model name      : AMD Ryzen 7 1800X Eight-Core Processor
stepping        : 1
microcode       : 0x8001138
cpu MHz         : 2337.092
cache size      : 512 KB
physical id     : 0
siblings        : 16
core id         : 7
cpu cores       : 8
apicid          : 15
initial apicid  : 15
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 7185.27
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate eff_freq_ro [13] [14]

Comment 29 Emmett Culley 2020-10-04 15:30:18 UTC

Message from syslogd@ws1 at Oct  3 20:37:53 ...
 kernel:Uhhuh. NMI received for unknown reason 20 on CPU 5.

processor       : 5
vendor_id       : AuthenticAMD
cpu family      : 23
model           : 8
model name      : AMD Ryzen 5 2600 Six-Core Processor
stepping        : 2
microcode       : 0x800820d
cpu MHz         : 3423.378
cache size      : 512 KB
physical id     : 0
siblings        : 12
core id         : 2
cpu cores       : 6
apicid          : 5
initial apicid  : 5
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl nonstop_tsc cpuid extd_apicid aperfmperf pni pclmulqdq monitor ssse3 fma cx16 sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb bpext perfctr_llc mwaitx cpb hw_pstate sme ssbd sev ibpb vmmcall fsgsbase bmi1 avx2 smep bmi2 rdseed adx smap clflushopt sha_ni xsaveopt xsavec xgetbv1 xsaves clzero irperf xsaveerptr arat npt lbrv svm_lock nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold avic v_vmsave_vmload vgif overflow_recov succor smca
bugs            : sysret_ss_attrs null_seg spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 6799.03
TLB size        : 2560 4K pages
clflush size    : 64
cache_alignment : 64
address sizes   : 43 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro [13] [14]


I just started seeing this after the last system update:

# uname -a
Linux ws1.webengineer.com 5.8.10-200.fc32.x86_64 #1 SMP Thu Sep 17 16:48:25 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Comment 30 Ben Cotton 2020-11-03 15:01:34 UTC

This message is a reminder that Fedora 31 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 31 on 2020-11-24.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '31'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 31 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 31 Raman Gupta 2020-11-03 15:13:01 UTC

FYI adding `nmi_watchdog=0` to my GRUB_CMDLINE_LINUX in /etc/sysconfig/grub seems to have worked around the issue for me. Therefore I can't confirm whether this is still an issue on Fedora 32+.
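
To confirm that the parameter actually took effect after a reboot, one option is to look at the running kernel's command line. A minimal sketch, assuming `/proc/cmdline` is readable; the helper name `nmi_watchdog_setting` is made up for illustration.

```python
def nmi_watchdog_setting(cmdline):
    """Return the value of the last nmi_watchdog= parameter, or None if absent."""
    value = None
    for token in cmdline.split():
        key, sep, val = token.partition("=")
        if key == "nmi_watchdog" and sep:
            value = val  # a later occurrence replaces an earlier one in this sketch
    return value


def read_cmdline(path="/proc/cmdline"):
    """Read the command line the running kernel was booted with (Linux only)."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    setting = nmi_watchdog_setting(read_cmdline())
    if setting is None:
        print("nmi_watchdog not set on the kernel command line")
    else:
        print("nmi_watchdog=" + setting)
```

If the parameter does not show up, the GRUB configuration was probably not regenerated after editing the file.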

Comment 32 Henrique Martins 2020-11-03 16:00:45 UTC

Still happened on my F32 system.

I updated to F33 last week and haven't seen it since, but this is not a high-frequency problem.

Comment 33 Henrique Martins 2020-11-05 17:03:33 UTC

And it just happened on my F33 system, so the version should probably be bumped to F33.

Comment 34 Habig, Alec 2021-01-02 15:10:36 UTC

Just happened now, when I had some time to go bug hunting (it has happened before, at the same random interval of a month or two mentioned earlier). The message hints at power modes, but I haven't tweaked anything the F33 install set up.

Message from syslogd@enthalpy at Jan  2 08:55:59 ...
 kernel:Uhhuh. NMI received for unknown reason 2c on CPU 3.

Message from syslogd@enthalpy at Jan  2 08:55:59 ...
 kernel:Do you have a strange power saving mode enabled?

Message from syslogd@enthalpy at Jan  2 08:55:59 ...
 kernel:Dazed and confused, but trying to continue

F33, CPU is:
model name      : AMD Ryzen 7 PRO 3700U w/ Radeon Vega Mobile Gfx
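
Because the reports in this thread differ in reason code (21, 31, 2c) and CPU number, it may help to tabulate them when comparing systems. The following is a sketch only: the function name `nmi_reasons` is hypothetical, and it assumes the message wording shown in the logs above.

```shell
# Hypothetical helper: read kernel log lines on stdin and print
# "<reason> <cpu>" for every "NMI received for unknown reason" message.
# Lines that do not match (e.g. "Dazed and confused...") are dropped.
nmi_reasons() {
    sed -n 's/.*NMI received for unknown reason \([0-9a-fA-F][0-9a-fA-F]*\) on CPU \([0-9][0-9]*\).*/\1 \2/p'
}

# Example usage: count occurrences per (reason, CPU) pair in the journal.
#   journalctl --dmesg --no-pager | nmi_reasons | sort | uniq -c
```

The reason value is printed by the kernel in hexadecimal, so the pattern accepts hex digits for the reason and decimal digits for the CPU number.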

Comment 35 Fedora Program Management 2021-04-29 15:54:25 UTC

This message is a reminder that Fedora 32 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 32 on 2021-05-25.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '32'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 32 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 36 Ben Cotton 2021-05-25 14:58:38 UTC

Fedora 32 changed to end-of-life (EOL) status on 2021-05-25. Fedora 32 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 37 Henrique Martins 2021-05-29 16:36:16 UTC

Can this please be re-opened and bumped to F34?
Just happened on my F34 system today.

Comment 38 Adam Pribyl 2021-05-29 17:53:50 UTC

You should be able to do that by editing the bug and changing the status.

Comment 39 Henrique Martins 2021-05-29 18:13:32 UTC

(In reply to Adam Pribyl from comment #38)
> You should be able to do that by editing the bug and changing the status.

I tried, but it seems I can only add comments, and the version field is not editable.

Comment 41 Ben Cotton 2022-05-12 16:52:41 UTC

This message is a reminder that Fedora Linux 34 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '34'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 34 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 42 Ben Cotton 2022-06-08 06:24:26 UTC

Fedora Linux 34 entered end-of-life (EOL) status on 2022-06-07.

Fedora Linux 34 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 43 Henrique Martins 2022-12-24 15:06:00 UTC

Has this been reopened on/obsoleted by some other bugzilla ticket? 

Still happening on my Ryzen based system with kernel-6.0.12-300.fc37.x86_64.

Should I just add nmi_watchdog=0 to the kernel and bury my head in the sand?

Comment 44 Raman Gupta 2022-12-25 05:00:44 UTC

(In reply to Henrique Martins from comment #43)
> Has this been reopened on/obsoleted by some other bugzilla ticket? 
> 
> Still happening on my Ryzen based system with kernel-6.0.12-300.fc37.x86_64.
> 
> Should I just add nmi_watchdog=0 to the kernel and bury my head in the sand?

Yes, disable it and be happy. Or go back to an Intel system as I have and re-enable it.
