Bug 1373881 - Overheating CPU generates Hardware Error messages
Status: CLOSED NOTABUG
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 27
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
Reported: 2016-09-07 11:14 UTC by Stephen Finucane
Modified: 2021-09-09 11:55 UTC

Last Closed: 2018-02-21 01:05:01 UTC
Type: Bug

Description Stephen Finucane 2016-09-07 11:14:02 UTC
Description of problem:

Running an intensive process that loads the CPU generates warning messages on my company-issued Lenovo T460s (specs below). I think this is due to an unhandled 'mce' event, per output of 'dmesg':

    $ dmesg | tail -30
    ...
    [ 9550.913754] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
    [ 9550.913755] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
    [ 9550.913775] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 9550.913777] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 9550.913779] mce: [Hardware Error]: Machine check events logged
    [ 9550.913780] mce: [Hardware Error]: Machine check events logged
    [ 9550.913781] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 9550.913782] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
    [ 9550.914747] CPU2: Core temperature/speed normal
    [ 9550.914750] CPU2: Package temperature/speed normal
    [ 9550.914752] CPU0: Core temperature/speed normal
    [ 9550.914753] CPU3: Package temperature/speed normal
    [ 9550.914753] CPU1: Package temperature/speed normal
    [ 9550.914756] CPU0: Package temperature/speed normal
    ...

Once the processor has cooled down I stop seeing the errors.

Version-Release number of selected component (if applicable):

    Linux redbox 4.6.6-300.fc24.x86_64 #1 SMP Wed Aug 10 21:07:35 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

How reproducible:

Always

Steps to Reproduce:
1. Run an intensive process, like 'tox' on the openstack nova project

    $ tox -e py27

2. Wait for error messages to appear

Actual results:

Error messages appear in the gnome notification area. These are not reportable as they are system errors.

Expected results:

N/A

Additional info:

CPU: Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
Mem: 19G (19965188 kB)
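
To see which CPUs throttle and how often, the dmesg lines quoted in this description can be tallied. A minimal sketch, assuming the kernel keeps the message wording shown in this report; the function name `count_throttle_events` is made up for illustration:

```shell
#!/usr/bin/env bash
# Tally "temperature above threshold" messages per CPU and level
# (Core or Package) from dmesg-style text on stdin.
# Assumption: the message wording matches the excerpts in this report.
count_throttle_events() {
    awk '
        match($0, /CPU[0-9]+: (Core|Package) temperature above threshold/) {
            # Split "CPU0: Core temperature ..." into CPU name and level.
            split(substr($0, RSTART, RLENGTH), f, /[: ]+/)
            n[f[1] " " f[2]]++
        }
        END { for (k in n) print k, n[k] }
    ' | LC_ALL=C sort
}
```

Usage would be something like `dmesg | count_throttle_events`. For the excerpt in this description, each CPU/level pair should show up once.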

Comment 1 Stephen Finucane 2016-09-07 11:15:41 UTC
I should add that the warning messages show up in the GNOME notification area; I found the dmesg output while trying to figure out what the issue was. 'mcelog' is no help:

    $ sudo mcelog --client
    $ sudo mcelog 
    mcelog: Family 6 Model 4e CPU: only decoding architectural errors

Comment 2 Jeremy Eder 2016-09-08 15:08:04 UTC
This happens all the time on a T450 as well, regardless of kernel version and with the latest BIOS/firmware. It cuts battery life to around 60-90 minutes.

Comment 3 dpw818 2016-09-17 17:56:17 UTC
Same issue on a Lenovo X1 Carbon with the specs below:

Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 78
Model name:            Intel(R) Core(TM) i7-6600U CPU @ 2.60GHz
Stepping:              3
CPU MHz:               2100.000
CPU max MHz:           3400.0000
CPU min MHz:           400.0000
BogoMIPS:              5615.89
Virtualization:        VT-x
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              4096K
NUMA node0 CPU(s):     0-3
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb intel_pt tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx rdseed adx smap clflushopt xsaveopt xsavec xgetbv1 dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp


Probably more info than needed, but:

[41849.676814] thinkpad_acpi: EC reports that Thermal Table has changed
[43683.991324] thinkpad_acpi: EC reports that Thermal Table has changed
[62613.701635] perf: interrupt took too long (4086 > 4073), lowering kernel.perf_event_max_sample_rate to 48000
[78848.151700] thinkpad_acpi: EC reports that Thermal Table has changed
[80415.114750] thinkpad_acpi: EC reports that Thermal Table has changed
[95938.642762] ------------[ cut here ]------------
[95938.642802] WARNING: CPU: 1 PID: 723 at drivers/net/wireless/intel/iwlwifi/mvm/tx.c:1377 iwl_mvm_rx_tx_cmd+0x7fd/0xa10 [iwlmvm]
[95938.642807] Modules linked in: rfcomm ccm fuse ip6t_REJECT nf_reject_ipv6 xt_conntrack ip6t_rpfilter ip_set nfnetlink ebtable_nat ebtable_broute bridge ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_raw ip6table_mangle ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_raw iptable_mangle iptable_security ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep vfat fat snd_soc_skl snd_soc_skl_ipc snd_soc_sst_ipc snd_soc_sst_dsp snd_hda_ext_core snd_soc_sst_match intel_rapl arc4 snd_soc_core x86_pkg_temp_thermal intel_powerclamp coretemp snd_hda_codec_hdmi snd_hda_codec_conexant snd_hda_codec_generic snd_compress kvm iTCO_wdt iTCO_vendor_support snd_pcm_dmaengine ac97_bus mei_wdt snd_hda_intel acer_wmi snd_hda_codec btusb
[95938.642897]  sparse_keymap btrtl btbcm iwlmvm btintel bluetooth mac80211 irqbypass crct10dif_pclmul uvcvideo crc32_pclmul snd_hda_core ghash_clmulni_intel snd_hwdep intel_cstate intel_rapl_perf snd_seq videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_core iwlwifi snd_seq_device videodev snd_pcm media cfg80211 joydev rtsx_pci_ms memstick i2c_i801 thinkpad_acpi snd_timer mei_me snd mei shpchp soundcore rfkill wmi tpm_crb intel_pch_thermal tpm_tis tpm nfsd auth_rpcgss nfs_acl lockd grace vboxpci(OE) vboxnetadp(OE) sunrpc vboxnetflt(OE) vboxdrv(OE) 8021q garp stp llc mrp i915 rtsx_pci_sdmmc mmc_core i2c_algo_bit drm_kms_helper e1000e crc32c_intel drm ptp serio_raw nvme pps_core nvme_core rtsx_pci video fjes
[95938.643011] CPU: 1 PID: 723 Comm: irq/129-iwlwifi Tainted: G        W  OE   4.7.3-200.fc24.x86_64 #1
[95938.643016] Hardware name: LENOVO 20FBCTO1WW/20FBCTO1WW, BIOS N1FET41W (1.15 ) 06/23/2016
[95938.643021]  0000000000000286 00000000a857e9b6 ffff88040d79bbb8 ffffffffb63d961f
[95938.643030]  0000000000000000 0000000000000000 ffff88040d79bbf8 ffffffffb609faab
[95938.643038]  000005610d79bc08 0000000000000000 0000000000000000 0000000000000600
[95938.643046] Call Trace:
[95938.643062]  [<ffffffffb63d961f>] dump_stack+0x63/0x84
[95938.643072]  [<ffffffffb609faab>] __warn+0xcb/0xf0
[95938.643081]  [<ffffffffb609fbdd>] warn_slowpath_null+0x1d/0x20
[95938.643108]  [<ffffffffc0a4543d>] iwl_mvm_rx_tx_cmd+0x7fd/0xa10 [iwlmvm]
[95938.643131]  [<ffffffffc0a3b4bc>] iwl_mvm_rx_common+0x18c/0x2a0 [iwlmvm]
[95938.643150]  [<ffffffffc0a3b62b>] iwl_mvm_rx+0x5b/0x70 [iwlmvm]
[95938.643167]  [<ffffffffc088d6eb>] iwl_pcie_rx_handle+0x30b/0x860 [iwlwifi]
[95938.643185]  [<ffffffffc088f22d>] iwl_pcie_irq_handler+0x6ad/0xae0 [iwlwifi]
[95938.643193]  [<ffffffffb67e7102>] ? __schedule+0x2f2/0x780
[95938.643202]  [<ffffffffb60fc7c0>] ? irq_forced_thread_fn+0x70/0x70
[95938.643210]  [<ffffffffb60fc7e0>] irq_thread_fn+0x20/0x50
[95938.643218]  [<ffffffffb60fca1d>] irq_thread+0x12d/0x1b0
[95938.643226]  [<ffffffffb60fc840>] ? wake_threads_waitq+0x30/0x30
[95938.643234]  [<ffffffffb60fc8f0>] ? irq_thread_dtor+0xb0/0xb0
[95938.643242]  [<ffffffffb60bf4d8>] kthread+0xd8/0xf0
[95938.643251]  [<ffffffffb67eba7f>] ret_from_fork+0x1f/0x40
[95938.643259]  [<ffffffffb60bf400>] ? kthread_worker_fn+0x180/0x180
[95938.643265] ---[ end trace 9d3276affb4f208e ]---
[143127.395226] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[143127.395227] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[143127.395228] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[143127.395228] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[143127.395231] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[143127.395234] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[143127.395237] mce: [Hardware Error]: Machine check events logged
[143127.395237] mce: [Hardware Error]: Machine check events logged

Comment 4 Yogendra Jog 2016-09-27 07:35:03 UTC
I have been getting these errors for a long time, even while running something as simple as the BlueJeans application or Google Docs.

Initially I thought it was a cooling issue. Lenovo replaced the heat sink, but the issue continues; only the number of occurrences has gone down.

----------------------------------------------------------------------------------------------------

Sep 25 11:35:09 mcelog: CPUID Vendor Intel Family 6 Model 78
Sep 25 11:35:09 mcelog: mcelog: Family 6 Model 4e CPU: only decoding architectural errors
Sep 25 11:35:09 mcelog: Hardware event. This is not a software error.
Sep 25 11:35:09 mcelog: MCE 1
Sep 25 11:35:09 mcelog: CPU 0 THERMAL EVENT TSC 13859093a8
Sep 25 11:35:09 mcelog: TIME 1474783509 Sun Sep 25 11:35:09 2016
Sep 25 11:35:09 mcelog: Processor 0 below trip temperature. Throttling disabled
Sep 25 11:35:09 mcelog: STATUS 881a2802 MCGSTATUS 0
Sep 25 11:35:09 mcelog: MCGCAP c08 APICID 0 SOCKETID 0
Sep 25 11:35:09 mcelog: CPUID Vendor Intel Family 6 Model 78
Sep 25 11:35:09 mcelog: mcelog: Family 6 Model 4e CPU: only decoding architectural errors
Sep 25 11:35:09 mcelog: Hardware event. This is not a software error.
Sep 25 11:35:09 mcelog: MCE 2
Sep 25 11:35:09 mcelog: CPU 2 THERMAL EVENT TSC 1387fe7336
Sep 25 11:35:09 mcelog: TIME 1474783509 Sun Sep 25 11:35:09 2016
Sep 25 11:35:09 mcelog: Processor 2 heated above trip temperature. Throttling enabled.
Sep 25 11:35:09 mcelog: Please check your system cooling. Performance will be impacted
Sep 25 11:35:09 mcelog: STATUS 88192803 MCGSTATUS 0
Sep 25 11:35:09 mcelog: MCGCAP c08 APICID 1 SOCKETID 0
Sep 25 11:35:09 mcelog: CPUID Vendor Intel Family 6 Model 78

----------------------------------------------------------------------------------------------------

[   14.538652] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[   14.538653] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[   14.538656] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[   14.538674] mce: [Hardware Error]: Machine check events logged
[   14.538680] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[   14.538680] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[   14.539633] CPU0: Core temperature/speed normal
[   14.539634] CPU0: Package temperature/speed normal
[   14.539635] mce: [Hardware Error]: Machine check events logged
[   14.539661] CPU1: Package temperature/speed normal
[   14.539661] CPU3: Package temperature/speed normal
[   14.554147] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[   14.831765] EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
[   14.842704] EXT4-fs (dm-7): mounted filesystem with ordered data mode. Opts: (null)

----------------------------------------------------------------------------------------------------

Comment 5 OE1FEU 2016-11-04 22:52:34 UTC
cat /etc/profile.d/thermal-mce.sh

shows:

#!/bin/bash
echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

This is on Fedora 23 and I have no idea when this was introduced as a "fix" for this bug. The fact is that despite this "fix" the error still occurs, albeit less frequently and only under higher load.

But a fix it is not, quite the opposite. I'd really like a solution that doesn't simply disable a performance-enhancing hardware capability. So far, this problem has never turned up on LKML (please correct me, should I be wrong here), so I'd really like to ask the fine people of Fedora to initiate a process that actually finds a solution, not a workaround.
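
For reference, the flag that script writes can also be read back to check whether turbo is currently disabled. A rough sketch, assuming the intel_pstate driver is in use; the sysfs path is the one from the script above, while `turbo_state` and its optional path argument are hypothetical helpers added only for illustration:

```shell
#!/usr/bin/env bash
# Report the turbo state from the intel_pstate no_turbo flag.
# The optional first argument overrides the sysfs path; it exists only so
# the function can be tried against an ordinary file (an assumption of this
# sketch, not part of any Fedora script).
turbo_state() {
    local flag_file="${1:-/sys/devices/system/cpu/intel_pstate/no_turbo}"
    local value
    if ! value="$(cat "$flag_file" 2>/dev/null)"; then
        echo "unknown"
        return 1
    fi
    case "$value" in
        0) echo "turbo enabled" ;;
        1) echo "turbo disabled" ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

Writing the flag requires root, so a plain `echo 1 >` from a non-root login shell would most likely fail with a permission error.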

Comment 6 Nadav Goldin 2016-12-14 10:20:33 UTC
Same here - fc24 with Lenovo t460s.

Comment 7 Dan Yasny 2016-12-21 01:47:25 UTC
Same here on two types of X1 Carbon, with Fedora 22, 23, 24 and, as of today, 25. I reinstalled one of the X1s with CentOS 7.3 and the error exists there as well. In one of the older BZs for the same issue it was observed that the error started showing up after Fedora moved away from kernel 3.9.

Comment 8 Jeremy Harris 2017-01-06 13:11:18 UTC
Also on a t460p with f24

Comment 9 Jeremy Eder 2017-01-06 13:14:31 UTC
Do you all see the horrendous battery life as well? For me, the CPUs run at full speed all the time. I can manually slow them down, and then battery life is as expected (I actually wrote a little script to do this for when I go on battery...).

Comment 10 Jeremy Harris 2017-01-06 13:17:23 UTC
No, that aspect is OK. It only overheats when a CPU-intensive job is running.

Comment 11 Sean 2017-02-05 15:49:13 UTC
I have the same problem on an X1C4 with both Fedora 24 and 25.
mcelog gives me:

mcelog: Family 6 Model 4e CPU: only decoding architectural errors
mcelog: warning: 16 bytes ignored in each record
mcelog: consider an update

Comment 12 Otto J. Makela 2017-02-28 13:29:56 UTC
Interestingly, I am also seeing something very similar on a HP EliteBook 840 running Red Hat Enterprise Linux Workstation release 7.3 (Maipo). The fact that the CPUs ostensibly overheat and then again cool down within the same second doesn't really sound super-plausible to me, but what do I know...

/var/log/messages:

Feb 28 13:50:07 avosetti kernel: CPU3: Core temperature above threshold, cpu clock throttled (total events = 1087)
Feb 28 13:50:07 avosetti kernel: CPU2: Core temperature above threshold, cpu clock throttled (total events = 1087)
Feb 28 13:50:07 avosetti kernel: CPU1: Package temperature above threshold, cpu clock throttled (total events = 1872)
Feb 28 13:50:07 avosetti kernel: CPU0: Package temperature above threshold, cpu clock throttled (total events = 1872)
Feb 28 13:50:07 avosetti kernel: CPU2: Package temperature above threshold, cpu clock throttled (total events = 1872)
Feb 28 13:50:07 avosetti kernel: mce: [Hardware Error]: Machine check events logged
Feb 28 13:50:07 avosetti kernel: CPU3: Package temperature above threshold, cpu clock throttled (total events = 1872)
Feb 28 13:50:07 avosetti kernel: mce: [Hardware Error]: Machine check events logged
Feb 28 13:50:07 avosetti kernel: CPU3: Core temperature/speed normal
Feb 28 13:50:07 avosetti kernel: CPU2: Core temperature/speed normal
Feb 28 13:50:07 avosetti kernel: CPU1: Package temperature/speed normal
Feb 28 13:50:07 avosetti kernel: CPU0: Package temperature/speed normal
Feb 28 13:50:07 avosetti kernel: CPU2: Package temperature/speed normal
Feb 28 13:50:07 avosetti kernel: CPU3: Package temperature/speed normal
Feb 28 13:50:08 avosetti sh: abrt-dump-oops: Found oopses: 1
Feb 28 13:50:08 avosetti sh: abrt-dump-oops: Creating problem directories
Feb 28 13:50:08 avosetti sh: abrt-dump-oops: Not going to make dump directories world readable because PrivateReports is on
Feb 28 13:50:09 avosetti abrt-dump-oops: Reported 1 kernel oopses to Abrt

For slightly better time stamps:

% dmesg -e
[Feb28 13:50] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1087)
[  +0,000002] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1087)
[  +0,000001] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1872)
[  +0,000001] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1872)
[  +0,000001] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1872)
[  +0,000002] mce: [Hardware Error]: Machine check events logged
[  +0,000005] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1872)
[  +0,000002] mce: [Hardware Error]: Machine check events logged
[  +0,000978] CPU3: Core temperature/speed normal
[  +0,000001] CPU2: Core temperature/speed normal
[  +0,000000] CPU1: Package temperature/speed normal
[  +0,000001] CPU0: Package temperature/speed normal
[  +0,000001] CPU2: Package temperature/speed normal
[  +0,000004] CPU3: Package temperature/speed normal

% mcelog
Hardware event. This is not a software error.
MCE 0
CPU 3 THERMAL EVENT TSC 25c769db5d390 
TIME 1487593124 Mon Feb 20 14:18:44 2017
Processor 3 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88010803 MCGSTATUS 0
MCGCAP 1000c07 APICID 3 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 1
CPU 2 THERMAL EVENT TSC 25c769db6234d 
TIME 1487593124 Mon Feb 20 14:18:44 2017
Processor 2 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88010803 MCGSTATUS 0
MCGCAP 1000c07 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 2
CPU 3 THERMAL EVENT TSC 25c769ddd59d2 
TIME 1487593124 Mon Feb 20 14:18:44 2017
Processor 3 below trip temperature. Throttling disabled
STATUS 88020802 MCGSTATUS 0
MCGCAP 1000c07 APICID 3 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 3
CPU 2 THERMAL EVENT TSC 25c769ddd90dd 
TIME 1487593124 Mon Feb 20 14:18:44 2017
Processor 2 below trip temperature. Throttling disabled
STATUS 88020802 MCGSTATUS 0
MCGCAP 1000c07 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 4
CPU 1 THERMAL EVENT TSC 4b9047804e441 
TIME 1487849371 Thu Feb 23 13:29:31 2017
Processor 1 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88010803 MCGSTATUS 0
MCGCAP 1000c07 APICID 1 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 5
CPU 0 THERMAL EVENT TSC 4b9047805565a 
TIME 1487849371 Thu Feb 23 13:29:31 2017
Processor 0 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88010803 MCGSTATUS 0
MCGCAP 1000c07 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 6
CPU 1 THERMAL EVENT TSC 4b904782d73cb 
TIME 1487849371 Thu Feb 23 13:29:31 2017
Processor 1 below trip temperature. Throttling disabled
STATUS 88020802 MCGSTATUS 0
MCGCAP 1000c07 APICID 1 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 7
CPU 0 THERMAL EVENT TSC 4b904782da5d2 
TIME 1487849371 Thu Feb 23 13:29:31 2017
Processor 0 below trip temperature. Throttling disabled
STATUS 88020802 MCGSTATUS 0
MCGCAP 1000c07 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 8
CPU 0 THERMAL EVENT TSC 4b9bec23240d1 
TIME 1487849680 Thu Feb 23 13:34:40 2017
Processor 0 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88010a83 MCGSTATUS 0
MCGCAP 1000c07 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 9
CPU 1 THERMAL EVENT TSC 4b9bec23318c5 
TIME 1487849680 Thu Feb 23 13:34:40 2017
Processor 1 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88010a83 MCGSTATUS 0
MCGCAP 1000c07 APICID 1 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 10
CPU 0 THERMAL EVENT TSC 4b9bec281fc8c 
TIME 1487849680 Thu Feb 23 13:34:40 2017
Processor 0 below trip temperature. Throttling disabled
STATUS 88020a82 MCGSTATUS 0
MCGCAP 1000c07 APICID 0 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 11
CPU 1 THERMAL EVENT TSC 4b9bec282340e 
TIME 1487849680 Thu Feb 23 13:34:40 2017
Processor 1 below trip temperature. Throttling disabled
STATUS 88020a82 MCGSTATUS 0
MCGCAP 1000c07 APICID 1 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 12
CPU 2 THERMAL EVENT TSC 4ba76050d785f 
TIME 1487849983 Thu Feb 23 13:39:43 2017
Processor 2 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88010a83 MCGSTATUS 0
MCGCAP 1000c07 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 13
CPU 3 THERMAL EVENT TSC 4ba76050d9420 
TIME 1487849983 Thu Feb 23 13:39:43 2017
Processor 3 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88010a83 MCGSTATUS 0
MCGCAP 1000c07 APICID 3 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 14
CPU 3 THERMAL EVENT TSC 4ba7605ab417c 
TIME 1487849983 Thu Feb 23 13:39:43 2017
Processor 3 below trip temperature. Throttling disabled
STATUS 88020a82 MCGSTATUS 0
MCGCAP 1000c07 APICID 3 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 15
CPU 2 THERMAL EVENT TSC 4ba7605ab9fe3 
TIME 1487849983 Thu Feb 23 13:39:43 2017
Processor 2 below trip temperature. Throttling disabled
STATUS 88020a82 MCGSTATUS 0
MCGCAP 1000c07 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 16
CPU 2 THERMAL EVENT TSC 8b7225a721587 
TIME 1488282607 Tue Feb 28 13:50:07 2017
Processor 2 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88010a83 MCGSTATUS 0
MCGCAP 1000c07 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 17
CPU 3 THERMAL EVENT TSC 8b7225a7269e9 
TIME 1488282607 Tue Feb 28 13:50:07 2017
Processor 3 heated above trip temperature. Throttling enabled.
Please check your system cooling. Performance will be impacted
STATUS 88010a83 MCGSTATUS 0
MCGCAP 1000c07 APICID 3 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 18
CPU 2 THERMAL EVENT TSC 8b7225a994d17 
TIME 1488282607 Tue Feb 28 13:50:07 2017
Processor 2 below trip temperature. Throttling disabled
STATUS 88020a82 MCGSTATUS 0
MCGCAP 1000c07 APICID 2 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61
Hardware event. This is not a software error.
MCE 19
CPU 3 THERMAL EVENT TSC 8b7225a99845d 
TIME 1488282607 Tue Feb 28 13:50:07 2017
Processor 3 below trip temperature. Throttling disabled
STATUS 88020a82 MCGSTATUS 0
MCGCAP 1000c07 APICID 3 SOCKETID 0 
CPUID Vendor Intel Family 6 Model 61

% uname -a
Linux avosetti.x.csc.fi 3.10.0-514.6.1.el7.x86_64 #1 SMP Sat Dec 10 11:15:38 EST 2016 x86_64 x86_64 x86_64 GNU/Linux

% cat /etc/redhat-release 
Red Hat Enterprise Linux Workstation release 7.3 (Maipo)
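
The STATUS words in the mcelog records above can be unpacked by hand. A sketch, assuming the value is a raw copy of the core-level IA32_THERM_STATUS MSR (bit 0 thermal status, bits 22:16 digital readout relative to the TCC activation temperature, bit 31 reading valid); that assumption is consistent with mcelog printing "Throttling enabled" exactly when the low bit is set. `decode_therm_status` is an illustrative name:

```shell
#!/usr/bin/env bash
# Decode a thermal-event STATUS value given in hex (with or without 0x).
# ASSUMPTION: the value is a raw copy of IA32_THERM_STATUS.
decode_therm_status() {
    local status=$(( 16#${1#0x} ))
    local throttling=$(( status & 1 ))             # bit 0: thermal status
    local valid=$(( (status >> 31) & 1 ))          # bit 31: reading valid
    local below_tcc=$(( (status >> 16) & 0x7f ))   # bits 22:16: digital readout
    echo "throttling=$throttling valid=$valid degrees_below_tcc=$below_tcc"
}
```

Under that reading, `decode_therm_status 88010803` reports throttling active at 1 degree below the activation temperature, and `88020802` reports it inactive at 2 degrees below, which would fit a core hovering right at its limit.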

Comment 13 OE1FEU 2017-03-29 06:52:13 UTC
Happy 4th birthday, dear MCE-bug!

Introduced in 2013, this bug is still there on Fedora 25. Today I received a kernel upgrade to 4.10.5-200.fc25.x86_64, but the bug persists.

There seems to have been a slight change in the systemd-journald package, which now spams ALL consoles, no matter which user, no matter whether you log in on a framebuffer console or Konsole as an X11 application.

I changed that behaviour by doing:

edit /etc/systemd/journald.conf
ForwardToWall=no
ForwardToConsole=no

systemctl restart systemd-journald

Still, isn't there a way to make this bug go away? Fedora has tried to mitigate the situation by disabling the Intel Turbo feature, yet the MCE notification is still there:

In

/etc/profile.d/thermal-mce.sh

echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo

Is there any way we can make someone at Red Hat and/or Intel aware of this problem and find a solution? AFAICS this specific bug has not yet found its way to LKML. I'd rather not post there, so maybe a more experienced person with some good standing on LKML can address this problem.

I'd really like to use all capabilities of my X1 3rd gen to their fullest extent, including the Intel Turbo.

Comment 14 OE1FEU 2017-03-29 09:10:44 UTC
Addendum to previous posting:

Despite the change to the journald configuration, the system still logs all MCE notifications to Konsole.

By now the whole problem has developed into a nightmare.

Comment 15 Pierguido Lambri 2017-04-05 10:49:16 UTC
Got the same on a shiny new T460s.
I also get some other memory errors; not sure whether these are related.

[158748.687661] CPU2: Core temperature above threshold, cpu clock throttled (total events = 124)
[158748.687662] CPU0: Core temperature above threshold, cpu clock throttled (total events = 124)
[158748.687665] CPU0: Package temperature above threshold, cpu clock throttled (total events = 124)
[158748.687684] CPU2: Package temperature above threshold, cpu clock throttled (total events = 124)
[158748.687687] CPU3: Package temperature above threshold, cpu clock throttled (total events = 124)
[158748.687688] CPU1: Package temperature above threshold, cpu clock throttled (total events = 124)
[158748.687689] mce_notify_irq: 1 callbacks suppressed
[158748.687690] mce: [Hardware Error]: Machine check events logged
[158748.687708] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 128: 000000008819280b
[158748.687710] mce: [Hardware Error]: TSC 195756741aa83 
[158748.687713] mce: [Hardware Error]: PROCESSOR 0:406e3 TIME 1491387946 SOCKET 0 APIC 0 microcode 9e
[158748.687715] mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 128: 000000008819280b
[158748.687716] mce: [Hardware Error]: TSC 1957567428bb8 
[158748.687719] mce: [Hardware Error]: PROCESSOR 0:406e3 TIME 1491387946 SOCKET 0 APIC 1 microcode 9e
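
The hex number after `PROCESSOR 0:` in these lines looks like the CPUID signature (as far as I know the kernel's MCE record prints vendor and CPUID there), which is where mcelog's "Family 6 Model 4e" comes from. A sketch of the standard family/model/stepping decoding; `decode_cpuid_signature` is an illustrative name:

```shell
#!/usr/bin/env bash
# Decode a CPUID leaf 1 EAX signature given in hex (with or without 0x).
decode_cpuid_signature() {
    local sig=$(( 16#${1#0x} ))
    local stepping=$(( sig & 0xf ))
    local model=$(( (sig >> 4) & 0xf ))
    local family=$(( (sig >> 8) & 0xf ))
    local ext_model=$(( (sig >> 16) & 0xf ))
    local ext_family=$(( (sig >> 20) & 0xff ))
    # The extended model applies to base families 6 and 15;
    # the extended family is added only for base family 15.
    if (( family == 6 || family == 15 )); then
        model=$(( (ext_model << 4) | model ))
    fi
    if (( family == 15 )); then
        family=$(( family + ext_family ))
    fi
    printf 'family=%d model=0x%x (%d) stepping=%d\n' \
        "$family" "$model" "$model" "$stepping"
}
```

`decode_cpuid_signature 406e3` gives family 6, model 0x4e (78), stepping 3, which matches the family, model, and stepping lscpu reports in comment 3.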

Comment 16 John 2017-04-05 12:58:02 UTC
I'm seeing the same issue on a Dell Inspiron 7378 with an i7-7500U. I re-pasted the heatsink thinking that might help, but it didn't.

This issue does need to go to LKML.

Comment 17 Angelo Lisco 2017-04-10 16:59:23 UTC
Same problem here :(

System Information
Manufacturer: LENOVO
Product Name: 20BWS3D500
Version: ThinkPad T450s

Any news about a specific thread on LKML?

Comment 18 Justin M. Forbes 2017-04-11 14:47:43 UTC
*********** MASS BUG UPDATE **************

We apologize for the inconvenience.  There are a large number of bugs to go through and several of them have gone stale.  Due to this, we are doing a mass bug update across all of the Fedora 24 kernel bugs.

Fedora 24 has now been rebased to 4.10.9-100.fc24.  Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 26, and are still experiencing this issue, please change the version to Fedora 26.

If you experience different issues, please open a new bug report for those.

Comment 19 Pierguido Lambri 2017-04-13 11:13:39 UTC
Just tested the latest kernel (4.10.9-200.fc25.x86_64; I'm using the testing repos) and I still see the messages:

[ 2664.911050] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 2664.911051] CPU2: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 2664.911054] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 2664.911055] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 2664.911061] mce: [Hardware Error]: Machine check events logged
[ 2664.911079] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 2664.911079] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 2664.912116] CPU0: Core temperature/speed normal
[ 2664.912117] CPU2: Core temperature/speed normal
[ 2664.912118] CPU1: Package temperature/speed normal
[ 2664.912118] CPU3: Package temperature/speed normal
[ 2664.912119] CPU0: Package temperature/speed normal
[ 2664.912120] mce: [Hardware Error]: Machine check events logged
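
The gap between each "above threshold" line and the matching "normal" line is tiny in every log posted so far. A sketch that measures it from dmesg timestamps, assuming the default `[seconds.microseconds]` prefix (not the `dmesg -e` relative format); `throttle_durations` is a made-up name:

```shell
#!/usr/bin/env bash
# Print, per CPU and level, the time in microseconds between an
# "above threshold" message and the next "temperature/speed normal" one.
# Reads dmesg-style text with absolute "[ 1234.567890]" timestamps on stdin.
throttle_durations() {
    awk '
        match($0, /\[ *[0-9]+\.[0-9]+\]/) {
            # Timestamp without the surrounding brackets, as a number.
            ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
            if (!match($0, /CPU[0-9]+: (Core|Package) temperature/)) next
            split(substr($0, RSTART, RLENGTH), f, /[: ]+/)
            key = f[1] " " f[2]
            if ($0 ~ /above threshold/) {
                start[key] = ts
            } else if ($0 ~ /temperature\/speed normal/ && (key in start)) {
                printf "%s %.0f\n", key, (ts - start[key]) * 1000000
                delete start[key]
            }
        }
    '
}
```

On the log above, CPU0's core-level event lasts about 1 ms (2664.911050 to 2664.912116).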

Comment 20 Pierguido Lambri 2017-04-13 11:15:18 UTC
PS: I'm changing version (F25) and bumping up the prio/sev

Comment 21 Jeremy Harris 2017-04-23 11:33:30 UTC
With kernel-4.10.8-200.fc25.x86_64 I'm no longer seeing the problem.

Comment 22 Jeremy Harris 2017-05-02 23:25:50 UTC
4.10.12-200.fc25.x86_64 and it's back.

Comment 23 Lance Bragstad 2017-05-06 15:04:19 UTC
I was able to recreate by running a very similar workflow to Stephen's within a container. I'm running Fedora 25 on a 5th gen X1 Carbon with kernel version 4.10.13-200.fc25.x86_64.

Comment 24 Marek Salwerowicz 2017-05-17 08:48:17 UTC
I confirm the bug on Thinkpad T470, running Fedora 25:

Linux marek-t470 4.10.15-200.fc25.x86_64 #1 SMP Mon May 8 18:46:06 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Comment 25 Pierguido Lambri 2017-05-18 12:09:11 UTC
Switched to F26 and haven't had this problem since.

Comment 26 Lewis Eason 2017-05-24 10:29:51 UTC
I'm also seeing this on a Thinkpad T470, running Fedora 25:

Linux insh 4.10.15-200.fc25.x86_64 #1 SMP Mon May 8 18:46:06 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

About to upgrade to 4.10.16, but don't imagine it'll help based on the shortlog for this kernel.

Comment 27 William Fleming 2017-05-25 10:16:49 UTC
Also seeing this on a ThinkPad T470 running Fedora 25:

4.10.16-200.fc25.x86_64 #1 SMP Mon May 15 15:19:52 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

Comment 28 William Fleming 2017-06-05 08:50:00 UTC
Issue still exists on latest kernel for f25 4.11.3-200.fc25.x86_64

Comment 29 John 2017-06-06 15:16:31 UTC
The machine check errors have disappeared for me. I'm on 4.11.3-200.fc25.x86_64.

I'm still seeing thermal events, but that's to be expected.

John

Comment 30 Jeremy Harris 2017-06-30 15:40:17 UTC
Gone away for me, with 4.11.6-201.fc25.x86_64

Comment 31 Fedora End Of Life 2017-11-16 18:49:06 UTC
This message is a reminder that Fedora 25 is nearing its end of life.
Approximately 4 (four) weeks from now Fedora will stop maintaining
and issuing updates for Fedora 25. It is Fedora's policy to close all
bug reports from releases that are no longer maintained. At that time
this bug will be closed as EOL if it remains open with a Fedora  'version'
of '25'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version'
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora 25 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora, you are encouraged to change the 'version' to a later Fedora
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's
lifetime, sometimes those efforts are overtaken by events. Often a
more recent Fedora release includes newer upstream software that fixes
bugs or makes them obsolete.

Comment 32 George Sapkin 2017-11-19 16:24:16 UTC
Still an issue with Fedora 26 and 27 and latest kernels.

Comment 33 George Sapkin 2017-11-22 09:09:55 UTC
Same issue with kernel 4.14.0-1 from Fedora 28.

Comment 34 Nuno Passos 2017-11-23 15:34:07 UTC
It also happens in the
Hewlett-Packard HP EliteBook 8470p / 179B, BIOS 68ICF Ver. F.46 01/17/2014

[5567.811064] mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 128: 0000000088020002
[5567.811066] mce: [Hardware Error]: TSC f3374ab78e5
[5567.811068] mce: [Hardware Error]: PROCESSOR 0: 306a9 TIME 1511441212 SOCKET 0 APIC 3 microcode 1c
[5567.811070] mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 128: 0000000088020002
[5567.811071] mce: [Hardware Error]: TSC f3374ab834d
[5567.811073] mce: [Hardware Error]: PROCESSOR 0: 306a9 TIME 1511441212 SOCKET 0 APIC 2 microcode 1c
[6129.118759] CPU2: Core temperature above threshold, cpu clock throttled (total events = 77)
[6129.118761] CPU0: Package temperature above threshold, cpu clock throttled (total events = 77)
[6129.118762] CPU1: Package temperature above threshold, cpu clock throttled (total events = 77)
[6129.118763] CPU3: Core temperature above threshold, cpu clock throttled (total events = 77)
[6129.118767] CPU3: Package temperature above threshold, cpu clock throttled (total events = 77)
[6129.118768] mce_notify_irq: 1 suppressed callbacks
[6129.118769] mce: [Hardware Error]: Machine check events logged
[6129.118770] CPU2: Package temperature above threshold, cpu clock throttled (total events = 77)
[6129.118771] mce: [Hardware Error]: Machine check events logged
[6129.118782] mce: [Hardware Error]: CPU 3: Machine Check: 0 Bank 128: 0000000088010283
[6129.118784] mce: [Hardware Error]: TSC 10baa371071f
[6129.118788] mce: [Hardware Error]: PROCESSOR 0: 306a9 TIME 1511441774 SOCKET 0 APIC 3 microcode 1c
[6129.118791] mce: [Hardware Error]: CPU 2: Machine Check: 0 Bank 128: 0000000088010283
[6129.118793] mce: [Hardware Error]: TSC 10baa3713edb
[6129.118797] mce: [Hardware Error]: PROCESSOR 0: 306a9 TIME 1511441774 SOCKET 0 APIC 2 microcode 1c

4.10.0-40-generic

Distributor ID:	Ubuntu
Description:	Ubuntu 16.04.3 LTS
Release:	16.04
Codename:	xenial
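
For reading the "Bank 128" status words in the log above: bank 128 is not a real machine-check bank but, as far as I can tell, the pseudo-bank the kernel uses for thermal events. The sketch below assumes the logged status word is the raw IA32_THERM_STATUS MSR value and that the bit layout matches the Intel SDM as I recall it. Both are assumptions to verify for your CPU and kernel, and `decode_therm_status` is a hypothetical helper, not an existing tool.

```python
# Assumption: for MCE "Bank 128" the logged status word is the raw
# IA32_THERM_STATUS value. Bit names below follow the Intel SDM layout as
# recalled here; check the manual for your CPU before relying on them.
THERM_STATUS_FLAGS = {
    0: "thermal_status",
    1: "thermal_status_log",
    2: "prochot_event",
    3: "prochot_log",
    4: "critical_temp_status",
    5: "critical_temp_log",
    6: "threshold1_status",
    7: "threshold1_log",
    8: "threshold2_status",
    9: "threshold2_log",
    10: "power_limit_status",
    11: "power_limit_log",
}


def decode_therm_status(value: int) -> dict:
    """Split a THERM_STATUS-style value into named fields (hypothetical helper)."""
    flags = [name for bit, name in THERM_STATUS_FLAGS.items() if (value >> bit) & 1]
    return {
        "flags": flags,
        # Bits 22:16: digital readout, i.e. degrees below the throttle point.
        "degrees_below_tjmax": (value >> 16) & 0x7F,
        # Bits 30:27: sensor resolution in degrees Celsius.
        "resolution_c": (value >> 27) & 0xF,
        # Bit 31: the readout field is valid.
        "reading_valid": bool((value >> 31) & 1),
    }


if __name__ == "__main__":
    # The two status words quoted in the log above.
    for raw in ("0000000088020002", "0000000088010283"):
        print(raw, decode_therm_status(int(raw, 16)))
```

Under those assumptions, both values describe a core within a degree or two of its throttle point, which would make them thermal notifications rather than reports of a hardware fault.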

Comment 35 Jeremy Harris 2018-01-08 12:23:45 UTC
After a bit of a hiatus in these, I've had a couple on 4.14.11-300.fc27.x86_64

Comment 36 Laura Abbott 2018-02-20 20:02:38 UTC
We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. As kernel maintainers, we try to keep up with bugzilla but due to the rate at which the upstream kernel project moves, bugs may be fixed without any indication to us. Due to this, we are doing a mass bug update across all of the Fedora 27 kernel bugs.
 
Fedora 27 has now been rebased to 4.15.3-300.fc27. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.
 
If you experience different issues, please open a new bug report for those.

Comment 37 Christian Kujau 2018-02-20 23:46:53 UTC
I'm getting only the temperature warnings, a lot of them! - but no hardware events are logged (according to mcelog) on my ThinkPad T470 (Intel i7-7600U). Today's kern.log has this many messages on a current Fedora 27 system:

$ dmesg -t | grep temperature | cut -d\  -f2-8 | sort | uniq -c | sort -n
    120 Core temperature above threshold, cpu clock throttled
    132 Core temperature/speed normal
    260 Package temperature above threshold, cpu clock throttled
    287 Package temperature/speed normal

But, as there are no hardware events logged on this particular machine, I'm not sure if this is even the same bug.
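
The same tally can be sketched in Python for logs where the `cut` field positions do not line up, for example when timestamps are present. This is only an illustration: `count_thermal_messages` is a hypothetical helper, and the regular expression covers just the two message shapes quoted in this thread.

```python
import re
from collections import Counter

# Optional "[...]" timestamp, then "CPUn: ", then one of the two thermal
# message shapes seen in this thread. Only the message text is captured.
LINE_RE = re.compile(
    r"^(?:\[[^\]]*\]\s*)?CPU\d+: "
    r"((?:Core|Package) temperature"
    r"(?: above threshold, cpu clock throttled|/speed normal))"
)


def count_thermal_messages(log_text: str) -> Counter:
    """Count thermal messages by type, ignoring timestamp and CPU number."""
    counts = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # A few sample lines in the format the kernel prints.
    sample = """\
[15106.139924] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[15106.139947] CPU5: Package temperature above threshold, cpu clock throttled (total events = 11)
[15106.144987] CPU0: Core temperature/speed normal
[15106.144989] CPU6: Package temperature/speed normal
"""
    for message, count in sorted(count_thermal_messages(sample).items()):
        print(f"{count:7d} {message}")
```

To run it against a live system, pass the text of `dmesg` output to `count_thermal_messages` in place of the sample.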

Comment 38 Laura Abbott 2018-02-21 01:05:01 UTC
There really isn't anything to be done, this is working as expected. When the CPU temperature gets too hot, the correct behavior is to throttle the clock. It's annoying this gets logged but it's no longer generating an MCE log. I'm just going to close this bug.

Comment 39 Otto J. Makela 2018-02-21 07:40:57 UTC
As I earlier said, the fact that the CPUs ostensibly overheat and then again cool down within the same second doesn't really sound super-plausible to me, but what do I know...
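
To put a number on how quickly the "above threshold" and "normal" messages follow each other, the dmesg timestamps can be paired per CPU. The sketch below assumes the plain `[seconds.microseconds]` timestamp format. `throttle_durations` is a hypothetical helper, and the result is the gap between two logged messages, not a measured temperature transition.

```python
import re

# "[ 9550.913754] CPU0: Core temperature above threshold, ..." or
# "[ 9550.914752] CPU0: Core temperature/speed normal"
EVENT_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+CPU(\d+): (Core|Package) temperature"
    r"(?P<kind> above threshold|/speed normal)"
)


def throttle_durations(log_text):
    """Return (cpu, level, seconds) for each throttled -> normal message pair."""
    started = {}    # (cpu, level) -> timestamp of the "above threshold" line
    durations = []
    for line in log_text.splitlines():
        match = EVENT_RE.match(line.strip())
        if not match:
            continue
        timestamp = float(match.group(1))
        key = (int(match.group(2)), match.group(3))
        if match.group("kind") == " above threshold":
            started[key] = timestamp
        elif key in started:
            durations.append((key[0], key[1], timestamp - started.pop(key)))
    return durations


if __name__ == "__main__":
    # Sample lines in the format the kernel prints.
    sample = """\
[15106.139924] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[15106.144987] CPU0: Core temperature/speed normal
"""
    for cpu, level, seconds in throttle_durations(sample):
        print(f"CPU{cpu} {level}: {seconds * 1000:.3f} ms between messages")
```

In the logs pasted in this bug the gap is a few milliseconds at most.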

Comment 40 Christian Kujau 2018-03-06 19:50:42 UTC
Can the loglevel of these messages be adjusted though? I don't understand why these messages are logged with a priority of critical, when (if I parse Laura's reply correctly) it should be "debug" at most:


arch/x86/kernel/cpu/mcheck/therm_throt.c:187: pr_crit("CPU%d: %s temperature above threshold, cpu clock throttled (total events = %lu)\n",
arch/x86/kernel/cpu/mcheck/therm_throt.c:195: pr_info("CPU%d: %s temperature/speed normal\n", this_cpu,



Some stats over 90 minutes of usage (edited for readability)
===================================================
# atop -r -P CPU -b 10:12 -e 11:30 | grep -v ^SEP
TIME     TPS # SYS   USER  NICE  IDLE  WT  IRQ SIRQ S G  FRQ FPCT 
10:55:36 100 4 9939  21586  22 204930 139 2136  820 0 0 2551   65
11:05:36 100 4 11653 30937   0 193369 140 2304 1212 0 0 8969  229
11:15:36 100 4 10987 36769   0 187918 468 2349 1086 0 0 2823   72
11:25:36 100 4 8158  19757   1 209369  83 1803  685 0 0 2750   70

# journalctl -l -t kernel | egrep -c "$(date +%b\ %d)" 
363

# journalctl -l -t kernel | egrep -c "$(date +%b\ %d).*CPU" 
86

# journalctl -l -p crit -t kernel | egrep -c "$(date +%b\ %d).*CPU[0-9]:" 
42

Comment 41 OE1FEU 2018-06-17 14:25:48 UTC
https://github.com/erpalma/lenovo-throttling-fix

goes into more detail and certainly shows that indeed we have a bug and that CPU temperature wildly going up and down is not what's actually going on.

I, too, would like to make use of the Intel turbo feature until the CPU actually reaches 100°C.

Please re-open this as a bug.

Comment 42 Peter Hostačný 2018-08-01 13:36:54 UTC
I have the same problem on T480 - the temperature messages are the first ones when booting Fedora 28.

Please reopen.

Comment 43 Dan.Kolbas 2018-08-06 16:44:13 UTC
Same issue.  I just ran some updates on a restart and now my computer is bricked.  

Fedora 28 on T480

I get temperature warnings as well.

Comment 44 Pavel Druyan 2018-08-15 11:59:40 UTC
(In reply to Dan.Kolbas from comment #43)
> Same issue.  I just ran some updates on a restart and now my computer is
> bricked.  
> 
> Fedora 28 on T480
> 
> I get temperature warnings as well.

Same here!!! F28 on T480 (Lenovo ThinkPad T480 (i7-8550U, MX150, FHD))

Comment 45 flira 2018-09-03 06:07:35 UTC
Same here! Dell Inspiron 15 7560

Comment 46 bitchecker 2018-11-20 22:01:15 UTC
I have an HP Pavilion 5335KV running Fedora 29; the same problem is reported in dmesg:

```
[15106.139924] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[15106.139945] CPU4: Core temperature above threshold, cpu clock throttled (total events = 1)
[15106.139947] CPU5: Package temperature above threshold, cpu clock throttled (total events = 11)
[15106.139948] CPU2: Package temperature above threshold, cpu clock throttled (total events = 11)
[15106.139949] CPU6: Package temperature above threshold, cpu clock throttled (total events = 11)
[15106.139950] CPU1: Package temperature above threshold, cpu clock throttled (total events = 11)
[15106.139952] CPU4: Package temperature above threshold, cpu clock throttled (total events = 11)
[15106.139953] CPU7: Package temperature above threshold, cpu clock throttled (total events = 11)
[15106.139954] CPU3: Package temperature above threshold, cpu clock throttled (total events = 11)
[15106.139961] CPU0: Package temperature above threshold, cpu clock throttled (total events = 11)
[15106.144987] CPU0: Core temperature/speed normal
[15106.144988] CPU4: Core temperature/speed normal
[15106.144989] CPU6: Package temperature/speed normal
[15106.144990] CPU2: Package temperature/speed normal
[15106.144991] CPU5: Package temperature/speed normal
[15106.144991] CPU1: Package temperature/speed normal
[15106.144992] CPU4: Package temperature/speed normal
[15106.144993] CPU7: Package temperature/speed normal
[15106.144994] CPU3: Package temperature/speed normal
[15106.144995] CPU0: Package temperature/speed normal
```

Comment 47 Alessandro Silva 2019-02-28 12:55:20 UTC
I'm facing the same issue on a ThinkPad P50 running RHEL 7:

[211952.288488] CPU7: Package temperature/speed normal
[211952.288488] CPU6: Core temperature/speed normal
[211952.288489] CPU1: Package temperature/speed normal
[211952.288490] CPU2: Core temperature/speed normal
[211952.288491] CPU5: Package temperature/speed normal
[211952.288491] CPU3: Package temperature/speed normal
[211952.288492] CPU6: Package temperature/speed normal
[211952.288494] CPU2: Package temperature/speed normal
[211952.288522] CPU0: Package temperature/speed normal
[211952.288523] CPU4: Package temperature/speed normal
[212365.270480] CPU4: Core temperature above threshold, cpu clock throttled (total events = 1860)
[212365.270481] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1860)
[212365.270483] CPU0: Package temperature above threshold, cpu clock throttled (total events = 2704)
[212365.270486] CPU4: Package temperature above threshold, cpu clock throttled (total events = 2704)
[212365.270521] CPU1: Package temperature above threshold, cpu clock throttled (total events = 2704)
[212365.270522] CPU6: Package temperature above threshold, cpu clock throttled (total events = 2704)
[212365.270523] CPU2: Package temperature above threshold, cpu clock throttled (total events = 2704)
[212365.270524] CPU7: Package temperature above threshold, cpu clock throttled (total events = 2704)
[212365.270525] CPU5: Package temperature above threshold, cpu clock throttled (total events = 2704)
[212365.270526] CPU3: Package temperature above threshold, cpu clock throttled (total events = 2704)
[212365.271474] CPU4: Core temperature/speed normal
[212365.271475] CPU1: Package temperature/speed normal
[212365.271475] CPU0: Core temperature/speed normal
[212365.271476] CPU5: Package temperature/speed normal
[212365.271477] CPU0: Package temperature/speed normal
[212365.271480] CPU3: Package temperature/speed normal
[212365.271480] CPU7: Package temperature/speed normal
[212365.271484] CPU4: Package temperature/speed normal
[212365.271505] CPU6: Package temperature/speed normal
[212365.271506] CPU2: Package temperature/speed normal
[212775.336445] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1909)
[212775.336446] CPU4: Core temperature above threshold, cpu clock throttled (total events = 1909)
[212775.336448] CPU4: Package temperature above threshold, cpu clock throttled (total events = 2770)
[212775.336450] CPU0: Package temperature above threshold, cpu clock throttled (total events = 2770)
[212775.336485] CPU1: Package temperature above threshold, cpu clock throttled (total events = 2770)
[212775.336486] CPU5: Package temperature above threshold, cpu clock throttled (total events = 2770)
[212775.336487] CPU7: Package temperature above threshold, cpu clock throttled (total events = 2770)
[212775.336488] CPU6: Package temperature above threshold, cpu clock throttled (total events = 2770)
[212775.336489] CPU2: Package temperature above threshold, cpu clock throttled (total events = 2770)
[212775.336490] CPU3: Package temperature above threshold, cpu clock throttled (total events = 2770)
[212775.337437] CPU2: Package temperature/speed normal

Comment 48 mgrobelkiewicz+redhat 2021-04-18 14:17:43 UTC
redhat 8.3 
computer acer aspire v3 771g 


[  +0.015407] virbr0: port 1(virbr0-nic) entered blocking state
[  +0.000005] virbr0: port 1(virbr0-nic) entered disabled state
[  +0.000214] device virbr0-nic entered promiscuous mode
[  +4.619444] virbr0: port 1(virbr0-nic) entered blocking state
[  +0.000006] virbr0: port 1(virbr0-nic) entered listening state
[  +0.487898] virbr0: port 1(virbr0-nic) entered disabled state
[Apr18 14:30] nouveau 0000:01:00.0: therm: temperature (90 C) hit the 'fanboost' threshold
[ +46.394386] Bluetooth: RFCOMM TTY layer initialized
[  +0.000039] Bluetooth: RFCOMM socket layer initialized
[  +0.000153] Bluetooth: RFCOMM ver 1.11
[  +4.347952] rfkill: input handler disabled
[Apr18 14:32] nouveau 0000:01:00.0: therm: temperature (87 C) went below the 'fanboost' threshold
[Apr18 14:33] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
[  +0.000001] CPU0: Core temperature above threshold, cpu clock throttled (total events = 1)
[  +0.000003] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[  +0.000001] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[  +0.000005] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[  +0.000003] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[  +0.002007] CPU1: Core temperature/speed normal
[  +0.000001] CPU0: Core temperature/speed normal
[  +0.000002] CPU2: Package temperature/speed normal
[  +0.000001] CPU3: Package temperature/speed normal
[  +0.000001] CPU0: Package temperature/speed normal
[  +0.000001] CPU1: Package temperature/speed normal

