Description of problem:
I've just got my brand new ThinkPad T470s with an i7-7600U. The kernel reports that my CPU is too hot and is being throttled, even though I'm not running anything CPU-hungry. When booting Fedora I'm seeing the following warnings:

> CPU2: Core temperature above threshold, cpu clock throttled (total events = 2080)
> CPU0: Core temperature above threshold, cpu clock throttled (total events = 2080)
> CPU0: Package temperature above threshold, cpu clock throttled (total events = 2206)
> CPU2: Package temperature above threshold, cpu clock throttled (total events = 2206)
> CPU1: Package temperature above threshold, cpu clock throttled (total events = 2206)
> CPU3: Package temperature above threshold, cpu clock throttled (total events = 2206)

`sensors` shows:

> coretemp-isa-0000
> Adapter: ISA adapter
> Package id 0: +48.0°C (high = +100.0°C, crit = +100.0°C)
> Core 0: +46.0°C (high = +100.0°C, crit = +100.0°C)
> Core 1: +48.0°C (high = +100.0°C, crit = +100.0°C)

Version-Release number of selected component (if applicable):
Kernel 4.17.3-200.fc28.x86_64

How reproducible:
No idea! For me that's the case after every restart.

Actual results:
CPU throttled.

Expected results:
No throttling under 100°C.

Additional info:
I'm noticing sluggish behaviour in GNOME; resizing windows leaves black boxes on my external screen, even with an Intel HD 620.
Just checked lscpu, and mine is throttled to 900 MHz.
Is it that the CPU temperature can change so quickly that I cannot observe the critical temperature even a minute later? I guess I have another issue than the CPU actually constantly getting too hot.
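Since `sensors` only shows the temperature at the moment it is run, the kernel's own throttle counters may be a more useful thing to watch. The following is a minimal sketch, assuming the `thermal_throttle` directories that the x86 thermal throttling code exposes under `/sys/devices/system/cpu/`; the function name and its optional argument are made up for illustration.

```shell
#!/bin/sh
# Print the per-CPU thermal throttle counters kept by the kernel.
# The optional argument only exists so that the function can be pointed at a
# copy of the sysfs tree (an assumption of this sketch, not a kernel feature).
print_throttle_counts() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    for dir in "$cpu_root"/cpu[0-9]*/thermal_throttle; do
        [ -d "$dir" ] || continue   # glob did not match, or no counters here
        cpu=$(basename "$(dirname "$dir")")
        core=$(cat "$dir/core_throttle_count" 2>/dev/null || echo "n/a")
        pkg=$(cat "$dir/package_throttle_count" 2>/dev/null || echo "n/a")
        echo "$cpu core=$core package=$pkg"
    done
}

print_throttle_counts
```

Running it twice a few seconds apart shows whether the counters are still increasing, even when the temperature reported by `sensors` looks harmless.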
I have been having this problem as well. It happens constantly, even without any processes running other than the basic display manager. At first I noticed that my scaling governor was set to 'powersave', which made the throttling severe and really disrupted my daily work. I have since set the governor to 'performance', which has made the throttling much more manageable (at least I can actually use my laptop now), but hasn't stopped the issue.

Hardware:
Lenovo ThinkPad T480
Intel Core i7-8650U CPU @ 1.90GHz
32GB RAM
VGA Adaptor: Intel Corporation UHD Graphics 620
3D controller: NVIDIA Corporation GP108M [GeForce MX150]
Kernel: 4.17.3-200.fc28.x86_64
Fedora 28

I am unfortunately running the nouveau driver, which is really troublesome and might be a huge contributor to this problem (and possibly the OP's). It has caused quite a few crashes, ranging from insignificant (I didn't even notice the driver crashed) to completely locking up my workstation. This even happened just now while typing this comment:

[15625.782074] thinkpad_acpi: EC reports that Thermal Table has changed
[15627.920642] ------------[ cut here ]------------
[15627.920643] nouveau 0000:01:00.0: timeout
[15627.920697] WARNING: CPU: 3 PID: 10772 at drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c:86 nvkm_pmu_reset+0x14c/0x160 [nouveau]
[15627.920698] Modules linked in: fuse ccm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun devlink nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack libcrc32c iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables bnep vfat fat arc4 intel_rapl iTCO_wdt iTCO_vendor_support x86_pkg_temp_thermal
intel_powerclamp iwlmvm coretemp mei_wdt kvm_intel mac80211 snd_hda_codec_hdmi kvm snd_soc_skl snd_soc_skl_ipc snd_hda_ext_core snd_soc_sst_dsp snd_soc_sst_ipc snd_hda_codec_realtek [15627.920717] snd_soc_acpi irqbypass snd_soc_core snd_hda_codec_generic intel_cstate iwlwifi snd_compress intel_uncore snd_pcm_dmaengine intel_rapl_perf ac97_bus snd_hda_intel uvcvideo snd_hda_codec btusb cfg80211 btrtl videobuf2_vmalloc btbcm videobuf2_memops btintel videobuf2_v4l2 snd_hda_core thunderbolt videobuf2_common bluetooth snd_hwdep joydev snd_seq wmi_bmof intel_wmi_thunderbolt snd_seq_device videodev snd_pcm idma64 thinkpad_acpi ucsi_acpi snd_timer intel_lpss_pci i2c_i801 media mei_me typec_ucsi mei ecdh_generic intel_lpss intel_pch_thermal snd shpchp typec soundcore int3403_thermal processor_thermal_device int3400_thermal int340x_thermal_zone rfkill acpi_thermal_rel acpi_pad intel_soc_dts_iosf auth_rpcgss sunrpc dm_crypt i915 nouveau mxm_wmi ttm i2c_algo_bit drm_kms_helper crct10dif_pclmul [15627.920761] crc32_pclmul crc32c_intel nvme uas e1000e drm usb_storage nvme_core ghash_clmulni_intel serio_raw wmi video [15627.920782] CPU: 3 PID: 10772 Comm: lspci Tainted: G W 4.17.3-200.fc28.x86_64 #1 [15627.920783] Hardware name: LENOVO 20L5CTO1WW/20L5CTO1WW, BIOS N24ET37W (1.12 ) 03/14/2018 [15627.920800] RIP: 0010:nvkm_pmu_reset+0x14c/0x160 [nouveau] [15627.920801] RSP: 0018:ffffa03f4dae7af8 EFLAGS: 00010282 [15627.920802] RAX: 0000000000000000 RBX: ffff9066ece5edc0 RCX: 0000000000000006 [15627.920802] RDX: 0000000000000007 RSI: 0000000000000092 RDI: ffff9067114d6930 [15627.920803] RBP: ffff9066e6e5b400 R08: 0000000000000030 R09: 0000000000000677 [15627.920803] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9066e7e88c00 [15627.920804] R13: ffff9066e80a1000 R14: 00000e3628c3dbc0 R15: ffff9066eceaa0a0 [15627.920805] FS: 00007f2409e57740(0000) GS:ffff9067114c0000(0000) knlGS:0000000000000000 [15627.920805] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [15627.920806] CR2: 
000006abbc76a000 CR3: 00000007bb882001 CR4: 00000000003606e0 [15627.920806] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [15627.920807] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [15627.920807] Call Trace: [15627.920825] nvkm_pmu_init+0x16/0x40 [nouveau] [15627.920834] nvkm_subdev_init+0xb2/0x200 [nouveau] [15627.920851] nvkm_device_init+0x123/0x280 [nouveau] [15627.920867] nvkm_udevice_init+0x41/0x60 [nouveau] [15627.920877] nvkm_object_init+0x3e/0x100 [nouveau] [15627.920886] nvkm_object_init+0x71/0x100 [nouveau] [15627.920896] nvkm_object_init+0x71/0x100 [nouveau] [15627.920899] ? pci_restore_standard_config+0x40/0x40 [15627.920914] nouveau_do_resume+0x28/0x150 [nouveau] [15627.920930] nouveau_pmops_runtime_resume+0x88/0x150 [nouveau] [15627.920932] pci_pm_runtime_resume+0x78/0xb0 [15627.920934] __rpm_callback+0xca/0x210 [15627.920935] ? pci_restore_standard_config+0x40/0x40 [15627.920936] rpm_callback+0x1f/0x70 [15627.920937] ? pci_restore_standard_config+0x40/0x40 [15627.920938] rpm_resume+0x560/0x780 [15627.920939] pm_runtime_barrier+0x96/0xa0 [15627.920940] pci_config_pm_runtime_get+0x36/0x50 [15627.920942] pci_read_config+0x95/0x290 [15627.920944] ? _cond_resched+0x15/0x30 [15627.920946] ? 
__kmalloc+0x19a/0x230 [15627.920948] kernfs_fop_read+0xac/0x180 [15627.920950] __vfs_read+0x36/0x170 [15627.920951] vfs_read+0x8a/0x140 [15627.920952] ksys_pread64+0x61/0xa0 [15627.920954] do_syscall_64+0x5b/0x160 [15627.920956] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [15627.920958] RIP: 0033:0x7f2409557577 [15627.920958] RSP: 002b:00007ffe92c3b0d8 EFLAGS: 00000246 ORIG_RAX: 0000000000000011 [15627.920959] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f2409557577 [15627.920959] RDX: 0000000000000040 RSI: 000055dd3bce42c0 RDI: 0000000000000003 [15627.920960] RBP: 0000000000000040 R08: 00007f2409a4d6d2 R09: 00007ffe92c3a590 [15627.920960] R10: 0000000000000000 R11: 0000000000000246 R12: 000055dd3bcebfe0 [15627.920961] R13: 000055dd3bce42c0 R14: 0000000000000000 R15: 0000000000000000 [15627.920961] Code: 41 5c 41 5d 41 5e c3 48 8b 7d 10 48 8b 5f 50 48 85 db 74 1e e8 76 03 17 ee 48 89 da 48 c7 c7 1f f5 58 c0 48 89 c6 e8 8e 3d c5 ed <0f> 0b e9 42 ff ff ff 48 8b 5f 10 eb dc 48 8b 5f 10 eb a5 90 0f [15627.920978] ---[ end trace ccc3e7c7618ee3b1 ]--- [15633.393937] thinkpad_acpi: EC reports that Thermal Table has changed [15780.648342] CPU4: Package temperature above threshold, cpu clock throttled (total events = 2123) [15780.648342] CPU0: Package temperature above threshold, cpu clock throttled (total events = 2123) [15780.648381] CPU6: Package temperature above threshold, cpu clock throttled (total events = 2123) [15780.648382] CPU5: Package temperature above threshold, cpu clock throttled (total events = 2123) [15780.648383] CPU1: Package temperature above threshold, cpu clock throttled (total events = 2123) [15780.648383] CPU2: Package temperature above threshold, cpu clock throttled (total events = 2123) [15780.648384] CPU3: Package temperature above threshold, cpu clock throttled (total events = 2123) [15780.648385] CPU7: Package temperature above threshold, cpu clock throttled (total events = 2123) [15780.671327] CPU7: Package temperature/speed normal 
[15780.671328] CPU3: Package temperature/speed normal
[15780.671391] CPU0: Package temperature/speed normal
[15780.671392] CPU5: Package temperature/speed normal
[15780.671393] CPU4: Package temperature/speed normal
[15780.671393] CPU1: Package temperature/speed normal
[15780.671394] CPU6: Package temperature/speed normal
[15780.671395] CPU2: Package temperature/speed normal

I unfortunately don't have a lot of time to spend on researching this problem, but if I can provide any new information that might help solve this bug, PLEASE let me know.

Another VERY weird symptom is the output of 'sensors': it shows the NVIDIA card at 511 degrees Celsius?!

[root@PIQLT501 bin] # sensors
coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +51.0°C (high = +100.0°C, crit = +100.0°C)
Core 0: +49.0°C (high = +100.0°C, crit = +100.0°C)
Core 1: +50.0°C (high = +100.0°C, crit = +100.0°C)
Core 2: +51.0°C (high = +100.0°C, crit = +100.0°C)
Core 3: +49.0°C (high = +100.0°C, crit = +100.0°C)

pch_skylake-virtual-0
Adapter: Virtual device
temp1: +46.5°C

acpitz-virtual-0
Adapter: Virtual device
temp1: +50.0°C (crit = +128.0°C)

iwlwifi-virtual-0
Adapter: Virtual device
temp1: +32.0°C

thinkpad-isa-0000
Adapter: ISA adapter
fan1: 0 RPM

nouveau-pci-0100
Adapter: PCI adapter
temp1: +511.0°C (high = +95.0°C, hyst = +3.0°C)
                (crit = +105.0°C, hyst = +5.0°C)
                (emerg = +135.0°C, hyst = +5.0°C)
...

While typing this I've just noticed that 4.17.4-200.fc28 has been released. I'm going to install it and report back if anything changes.

Regards,
Brian Mendenhall
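For readers who want to check which scaling governor their own machine uses, as mentioned in the comment above, here is a minimal sketch. It assumes the standard cpufreq sysfs layout; the function name and its optional argument are illustrative only.

```shell
#!/bin/sh
# Show the cpufreq scaling governor of every CPU.
# The optional argument lets the function read from a copy of the sysfs tree.
show_governors() {
    cpu_root="${1:-/sys/devices/system/cpu}"
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$f" ] || continue     # no cpufreq support, or glob did not match
        cpu=$(basename "$(dirname "$(dirname "$f")")")
        echo "$cpu: $(cat "$f")"
    done
}

show_governors

# To switch all CPUs to 'performance' until the next reboot (needs root):
#   echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
```

Note that changing the governor only changes how frequencies are requested; as reported above, it does not remove the thermal throttling itself.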
We apologize for the inconvenience. There is a large number of bugs to go through and several of them have gone stale. Due to this, we are doing a mass bug update across all of the Fedora 28 kernel bugs.

Fedora 28 has now been rebased to 4.18.10-300.fc28. Please test this kernel update (or newer) and let us know if your issue has been resolved or if it is still present with the newer kernel.

If you have moved on to Fedora 29, and are still experiencing this issue, please change the version to Fedora 29.

If you experience different issues, please open a new bug report for those.
Currently I'm on kernel 4.18.12-200.fc28, and my journal from 3 days ago contains this:

Okt 12 13:39:16 schoko kernel: CPU1: Package temperature above threshold, cpu clock throttled (total events = 70862)
…

I just restarted and the message showed again on boot.
This has been an issue for the whole 1.5 years I have had my T470s. The throttling begins immediately during boot, right after the laptop is switched on. It seems that the kernel uses wrong temperature thresholds compared to Windows, as suggested here [1] and in various other places. There even exists a tool which works around this [2], but ...

@Hans, would you be so kind as to look into this? Because this completely destroys your flicker-free boot effort.

[1] https://forums.lenovo.com/t5/Linux-Discussion/X1C6-T480s-low-cTDP-and-trip-temperature-in-Linux/td-p/4028489
[2] https://github.com/erpalma/lenovo-throttling-fix
My T460s has had this since I got it, basically for 2.5 years now. I don't expect it to get any better anytime soon.

I just powered the machine up; the disk is encrypted and I didn't type the passphrase right away. I see this on my screen:

[ 1.094212] CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 1.094212] CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 1.094214] CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 1.094256] CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 1.094257] CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 1.094316] CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)

ONE second after powering up?
Current kernel is 4.19.15-300.fc29.x86_64, running Fedora 29 (but I have had this issue since.. 24/25?).
(In reply to Martijn ten Heuvel from comment #7)
> My t460s has had this since I have it, basically for 2,5 years now. I don't
> expect it to get any better anytime soon.
> [...]
> ONE second after powering up?
> Current kernel is 4.19.15-300.fc29.x86_64, running fed29 (but had this issue
> since.. 24/25?).

I am seeing similar behaviour on my T460s, but the throttling messages are immediately followed by messages saying the opposite. See the logs below:

Jan 23 11:40:23 fireball.redhat.local kernel: CPU0: Core temperature above threshold, cpu clock throttled (total events = 93)
Jan 23 11:40:23 fireball.redhat.local kernel: CPU2: Core temperature above threshold, cpu clock throttled (total events = 93)
Jan 23 11:40:23 fireball.redhat.local kernel: CPU1: Package temperature above threshold, cpu clock throttled (total events = 93)
Jan 23 11:40:23 fireball.redhat.local kernel: CPU3: Package temperature above threshold, cpu clock throttled (total events = 93)
Jan 23 11:40:23 fireball.redhat.local kernel: CPU2: Package temperature above threshold, cpu clock throttled (total events = 93)
Jan 23 11:40:23 fireball.redhat.local kernel: CPU0: Package temperature above threshold, cpu clock throttled (total events = 93) <==== !!
Jan 23 11:40:23 fireball.redhat.local kernel: CPU0: Core temperature/speed normal <==== !!
Jan 23 11:40:23 fireball.redhat.local kernel: CPU2: Core temperature/speed normal
Jan 23 11:40:23 fireball.redhat.local kernel: CPU1: Package temperature/speed normal
Jan 23 11:40:23 fireball.redhat.local kernel: CPU3: Package temperature/speed normal
Jan 23 11:40:23 fireball.redhat.local kernel: CPU2: Package temperature/speed normal
Jan 23 11:40:23 fireball.redhat.local kernel: CPU0: Package temperature/speed normal

The kernel is also the latest: 4.19.14-300.fc29.x86_64

Regards,
I'll get back to the question I posed before:

> Is it that the CPU temperature can change so quickly that I cannot observe the critical temperature even a minute later?
> I guess I have another issue than the CPU actually constantly getting too hot.

How quickly can the temperature rise in a CPU? I have no idea. As Martijn stated:

> ONE second after powering up?

I guess everybody with the issue has an i7?

Before buying the T470s I read in reviews that its design is flawed and that it is not able to cool the i7 efficiently. So apparently even on Windows it is running constantly throttled. I tried to get one with an i5 instead, but ended up with the i7.

So is it plausible that the CPU *actually* gets too hot?
(In reply to Andy from comment #9)
> How quickly can the temperature rise in a CPU? I have no idea. As Martijn
> statet
> > ONE second after powering up?
>
> I guess everybody with the issue has an i7?
> [...]
> So is it plausible that the CPU *actually* gets too hot?

The machine (T460s) I have is company provided and has an i7-6600U. It ran fine all day, and then just did the following:

2019-01-23T15:45:49,105186+01:00 CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
2019-01-23T15:45:49,105187+01:00 CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
2019-01-23T15:45:49,105189+01:00 CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
2019-01-23T15:45:49,105192+01:00 CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
2019-01-23T15:45:49,105193+01:00 CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
2019-01-23T15:45:49,105195+01:00 CPU3: Package temperature above threshold, cpu clock throttled (total events = 1) <<
2019-01-23T15:45:49,106179+01:00 CPU3: Core temperature/speed normal <<
2019-01-23T15:45:49,106179+01:00 CPU0: Package temperature/speed normal
2019-01-23T15:45:49,106180+01:00 CPU1: Core temperature/speed normal
2019-01-23T15:45:49,106181+01:00 CPU2: Package temperature/speed normal
2019-01-23T15:45:49,106181+01:00 CPU1: Package temperature/speed normal
2019-01-23T15:45:49,106182+01:00 CPU3: Package temperature/speed normal <<

So the CPU is throttled, and about a millisecond later it is reported as normal again. That is weird, right?
Hi, I noticed the same problem on my Lenovo P51 running Fedora 29 and kernel 4.20.5-200.fc29.x86_64.

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              8
On-line CPU(s) list: 0-7
Thread(s) per core:  2
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               158
Model name:          Intel(R) Core(TM) i7-7700HQ CPU @ 2.80GHz
Stepping:            9
CPU MHz:             800.056
CPU max MHz:         3800.0000
CPU min MHz:         800.0000
BogoMIPS:            5616.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            6144K
NUMA node0 CPU(s):   0-7
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb invpcid_single pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid ept_ad fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid mpx rdseed adx smap clflushopt intel_pt xsaveopt xsavec xgetbv1 xsaves dtherm ida arat pln pts hwp hwp_notify hwp_act_window hwp_epp flush_l1d

This is the output extracted from dmesg, and as stated above the throttling messages are immediately followed by a temperature/speed normal message.
[42581.573364] CPU0: Core temperature above threshold, cpu clock throttled (total events = 427)
[42581.573365] CPU4: Core temperature above threshold, cpu clock throttled (total events = 427)
[42581.573367] CPU3: Package temperature above threshold, cpu clock throttled (total events = 427)
[42581.573367] CPU7: Package temperature above threshold, cpu clock throttled (total events = 427)
[42581.573369] CPU1: Package temperature above threshold, cpu clock throttled (total events = 427)
[42581.573370] CPU5: Package temperature above threshold, cpu clock throttled (total events = 427)
[42581.573371] CPU2: Package temperature above threshold, cpu clock throttled (total events = 427)
[42581.573371] CPU6: Package temperature above threshold, cpu clock throttled (total events = 427)
[42581.573374] CPU4: Package temperature above threshold, cpu clock throttled (total events = 427)
[42581.573380] CPU0: Package temperature above threshold, cpu clock throttled (total events = 427)
[42581.581361] CPU0: Core temperature/speed normal
[42581.581362] CPU4: Core temperature/speed normal
[42581.581363] CPU5: Package temperature/speed normal
[42581.581364] CPU1: Package temperature/speed normal
[42581.581364] CPU7: Package temperature/speed normal
[42581.581365] CPU6: Package temperature/speed normal
[42581.581366] CPU2: Package temperature/speed normal
[42581.581367] CPU3: Package temperature/speed normal
[42581.581367] CPU4: Package temperature/speed normal
[42581.581368] CPU0: Package temperature/speed normal
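To put a number on "immediately", the dmesg timestamps can be subtracted. The sketch below is only an illustration (the function name is made up): it reads dmesg-style lines on stdin and prints the time in milliseconds between the first "above threshold" message and the first "temperature/speed normal" message after it. The two sample lines are the first throttle and the first normal line of the excerpt above.

```shell
#!/bin/sh
# Print the time in milliseconds between the first throttle message and the
# first "normal" message that follows it. Expects "[seconds.micros]" stamps.
throttle_duration_ms() {
    awk '
        {
            # Extract the number from the leading "[ ... ]" timestamp field;
            # lines without a timestamp are skipped entirely.
            if (!match($0, /\[ *[0-9]+\.[0-9]+\]/)) next
            ts = substr($0, RSTART + 1, RLENGTH - 2) + 0
        }
        /above threshold/ && !started { start = ts; started = 1 }
        /temperature\/speed normal/ && started {
            printf "%.3f\n", (ts - start) * 1000
            exit
        }
    '
}

throttle_duration_ms <<'EOF'
[42581.573364] CPU0: Core temperature above threshold, cpu clock throttled (total events = 427)
[42581.581361] CPU0: Core temperature/speed normal
EOF
# prints 7.997
```

On a live system the same function can be fed with `dmesg | throttle_duration_ms`. In this excerpt the throttled state lasted about 8 ms.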
(In reply to Vít Ondruch from comment #6) > @Hans, would you be so kind and could you please look into this? Because > this completely destroys your flicker-free boot effort. I'm sorry there is nothing I / Fedora can do here. This is a BIOS bug and the only advice I can give you is to complain to Lenovo, hopefully if enough people complain they will do something about this.
I just wonder how it is that Windows seems to work without throttling while Linux has issues. Does Windows use some workaround similar to reference [2] above?
(In reply to Vít Ondruch from comment #13) > I just wonder how it happens that Windows seems to work without throttling > while Linux has some issues. Does Windows use some workaround similar to [2] > reference above? I wish we knew how this does work under Windows. If we knew we might be able to come up with a fix on the Linux side instead of waiting for a firmware fix, but that too requires cooperation from Lenovo.
Intel are the people who need to provide additional details here, this is due to the DPTF thermal framework that many ultrabook-style machines now use.
(In reply to Matthew Garrett from comment #15)
> Intel are the people who need to provide additional details here, this is
> due to the DPTF thermal framework that many ultrabook-style machines now use.

Interesting. Should I read this as meaning that something like this could help?

https://github.com/intel/dptf

But since it is a daemon, I am not sure it can run soon enough, because the throttling starts right after systemd is executed, and probably sooner than any unit is loaded ...
This message is a reminder that Fedora 28 is nearing its end of life. On 2019-May-28 Fedora will stop maintaining and issuing updates for Fedora 28. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as EOL if it remains open with a Fedora 'version' of '28'.

Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not able to fix it before Fedora 28 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
(In reply to Matthew Garrett from comment #15)
> Intel are the people who need to provide additional details here, this is
> due to the DPTF thermal framework that many ultrabook-style machines now use.

Upgraded to F30, the issue is still the same.

[me@t460s ~]$ cat /etc/fedora-release
Fedora release 30 (Thirty)
[mtenheuv@t460s ~]$ dmesg | grep -i cpu | grep thro
[ 1.578399] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 1.578400] mce: CPU3: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 1.578402] mce: CPU3: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 1.578434] mce: CPU0: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 1.578434] mce: CPU2: Package temperature above threshold, cpu clock throttled (total events = 1)
[ 1.578518] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
[me@t460s ~]$

Weird that this is unresolvable.
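The "total events" counter in these messages is the interesting number when comparing reports. As a rough sketch (the function name is invented for illustration), the highest value seen in a log can be extracted like this:

```shell
#!/bin/sh
# Read kernel log lines on stdin and print the highest "total events" value
# found in any throttling message; prints 0 if there is no such message.
max_throttle_events() {
    sed -n 's/.*cpu clock throttled (total events = \([0-9][0-9]*\)).*/\1/p' |
        sort -n | tail -n 1 | grep . || echo 0
}

# Sample input taken from the output quoted above.
max_throttle_events <<'EOF'
[ 1.578399] mce: CPU1: Core temperature above threshold, cpu clock throttled (total events = 1)
[ 1.578518] mce: CPU1: Package temperature above threshold, cpu clock throttled (total events = 1)
EOF
# prints 1
```

It works on `dmesg` output as well as on journal lines, for example `dmesg | max_throttle_events`.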
@Peter you were involved in bug 1480844 previously, do you think you could help direct this bug towards the right people at Lenovo?
There are a few pieces that are relevant here and we are working on making them available in Fedora. More specifically:

 * thermald (available in Fedora), https://github.com/intel/thermal_daemon
 * dptfxtract (will be available in rpmfusion soon), https://github.com/intel/dptfxtract

Unfortunately, I don't think that dptfxtract works well enough on the T470s. Also, even with a proper configuration I have seen the CPU being throttled consistently to a power usage of 15W (after an initial peak), which is the reported TDP. Windows on the same machine was able to draw a lot more power.

AFAICT the "above threshold" warnings are expected. The CPU package will draw a lot of power for a short period of time and the thermal capacity of the heatsink is small in modern laptops. So it will heat up really quickly and the CPU then takes protective action by throttling.

By the way, there is "throttled", which is a hack that adjusts certain CPU registers (https://github.com/erpalma/throttled). But I cannot say whether this is entirely safe to do.
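One way to see the power limits that are currently in effect is the kernel's powercap interface. This is a minimal sketch, assuming the intel_rapl driver is loaded and that the CPU package zone is `intel-rapl:0`; the function names are made up, and on some systems the files may only be readable by root.

```shell
#!/bin/sh
# Convert a powercap value in microwatts to watts with one decimal place.
uw_to_watts() {
    awk -v uw="$1" 'BEGIN { printf "%.1f\n", uw / 1000000 }'
}

# Print the name and power limit of each constraint of a RAPL zone.
# The default zone path is an assumption; pass another directory if needed.
show_rapl_limits() {
    zone="${1:-/sys/class/powercap/intel-rapl:0}"
    for name_file in "$zone"/constraint_*_name; do
        [ -r "$name_file" ] || continue
        limit_file="${name_file%_name}_power_limit_uw"
        [ -r "$limit_file" ] || continue
        echo "$(cat "$name_file"): $(uw_to_watts "$(cat "$limit_file")") W"
    done
}

show_rapl_limits
```

A long-term limit of 15.0 W would match the sustained 15W behaviour described above, but the values depend on the firmware and on whatever tool last wrote them.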
@Christian Do you have by a chance some contacts in Lenovo? I think fixing issues like this would improve the user experience.
Lately there has been some movement on Lenovo's side! According to the forum thread, they have been working on some updates, and have started rolling out fixes for the 7th gen X1 Carbon. Other laptops are supposed to follow soon:

* T480/T480s
* X1 Carbon 6th Gen
* X1 Yoga 3rd Gen
* P52/P52s/P53
* T470, T490, L380
* X1 extreme
* Thinkpad 25 Anniversary edition

[X1C6/T480s] low cTDP and trip temperature in Linux
https://forums.lenovo.com/t5/Other-Linux-Discussions/X1C6-T480s-low-cTDP-and-trip-temperature-in-Linux/td-p/4028489/page/11

They published a PDF with explanations of the fixes, which I'll attach here as well.

Thanks everybody!
Created attachment 1619042 [details] Linux Thermal Throttling explained by Lenovo In the forum thread on the throttling issue Lenovo posted this document.
I am curious about this as it appears it may also affect the T490s. It looks like firmware has been released for various devices; however, here's what I see:

[thoraxe:~/Downloads] master* 1 ± sudo fwupdmgr get-approved-firmware
[sudo] password for thoraxe:
There is no approved firmware.
[thoraxe:~/Downloads] master* ± sudo fwupdmgr get-updates
No updatable devices
[thoraxe:~/Downloads] master* 2 ± sudo fwupdmgr get-devices
20NYS7K90F
│
├─Thunderbolt Controller:
│     Device ID:       85cd0f2da1ec523f67160e52b2d10ab70b83161a
│     Summary:         Unmatched performance for high-speed I/O
│     Current version: 20.00
│     Vendor:          Lenovo (TBT:0x0109)
│     GUIDs:           e56d9729-1948-50ec-9a51-bd7448f55816 ← TBT-01091806
│                      6d1b64e3-ebb3-5481-9d55-f334e31f3332 ← TBT-01091806-0000:04:00.0
│     Device Flags:    • Internal device
│                      • Updatable
│                      • Requires AC power
│                      • Device stages updates
│
├─System Firmware:
│     Device ID:       123fd4143619569d8ddb6ea47d1d3911eb5ef07a
│     Current version: N2JET81W (1.59 )
│     Vendor:          LENOVO
│     Update Error:    Firmware can not be updated in legacy mode, switch to UEFI mode
│     GUID:            230c8b18-8d9b-53ec-838b-6cfc0383493a ← main-system-firmware
│     Device Flags:    • Internal device
│                      • Requires AC power
│                      • Needs a reboot after installation
│
└─WDC PC SN730 SDBQNTY-256G-1001:
      Device ID:       f2759da7fe8e0388c5f3601cb072f837b1070b03
      Summary:         NVM Express Solid State Drive
      Current version: 11110101
      Vendor:          Sandisk Corp (NVME:0x15B7)
      Serial Number:   19385E801939
      GUIDs:           a39943dd-3afb-54f8-b110-c5a21f071200 ← NVME\VEN_15B7&DEV_5006&REV_00
                       fccbb6ea-e20e-58ad-bf8a-7fb7d43ff4c2 ← NVME\VEN_15B7&DEV_5006
                       1836f81c-3a3b-52b2-bb89-e5dc480ca9ec ← WDC PC SN730 SDBQNTY-256G-1001
      Device Flags:    • Internal device
                       • Updatable
                       • Requires AC power
                       • Needs a reboot after installation
                       • Device is usable for the duration of the update

It seems that there are some BIOS updates available for this device, but I'm not sure whether or not I should also see a *firmware* update via fwupdmgr?
(In reply to Erik M Jacobs from comment #26)
> [thoraxe:~/Downloads] master* 2 ± sudo fwupdmgr get-devices
> [...]
> ├─System Firmware:
> │     Device ID:       123fd4143619569d8ddb6ea47d1d3911eb5ef07a
> │     Current version: N2JET81W (1.59 )
> │     Vendor:          LENOVO
> │     Update Error:    Firmware can not be updated in legacy mode, switch to UEFI mode
> │     GUID:            230c8b18-8d9b-53ec-838b-6cfc0383493a ← main-system-firmware
> │     Device Flags:    • Internal device
> │                      • Requires AC power
> │                      • Needs a reboot after installation

> It seems that there are some BIOS updates available for this device, but I'm
> not sure whether or not I should also see a *firmware* update via fwupdmgr ?

There is a firmware update; however, it can be installed only if the system is installed in UEFI mode. Since your system is installed in legacy BIOS mode, it cannot be applied -- fwupd says so, see the above snippet of its output.
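A quick way to tell which mode a running system was booted in is to look for `/sys/firmware/efi`, which the kernel only creates on UEFI boots. A minimal sketch (the function name and its optional argument are illustrative):

```shell
#!/bin/sh
# Report whether the running system was booted via UEFI or legacy BIOS.
# The optional argument lets the check be run against another path.
boot_mode() {
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo "UEFI"
    else
        echo "legacy BIOS"
    fi
}

boot_mode
```

If this prints "legacy BIOS", fwupd will refuse the system firmware update with the error shown above.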
Yeah I realized that shortly after I posted. Unfortunately "converting" to UEFI mode is a little more than I want to tackle due to the partitioning scheme I've got right now. So it'll wait for F32 to come out...
I'm afraid that the only system that has received a fix is the X1 Carbon Gen 7, and the release notes for the T490 BIOS updates don't mention any throttling fixes either. So even if you switch to UEFI mode and update your BIOS, you probably won't get a fix for the throttling issue.

Red Hat still certified the T470s and the T490s for RHEL 7, despite the throttling issue and the massive lack of performance that comes with it.

- https://access.redhat.com/ecosystem/hardware/2951231
- https://access.redhat.com/ecosystem/hardware/4000481
(In reply to Andy from comment #24)
> [X1C6/T480s] low cTDP and trip temperature in Linux
> https://forums.lenovo.com/t5/Other-Linux-Discussions/X1C6-T480s-low-cTDP-and-trip-temperature-in-Linux/td-p/4028489/page/11

It seems Lenovo restructured their web site. This should be the updated link to the same thread:

https://forums.lenovo.com/topic/view/27/4028489?page=6
And this seems to be the document listing affected and fixed laptops:

https://docs.google.com/document/d/1MsqSYt0f_vU72pGGTWueeEgTBzd-HIz4ODHLmNOv16o/edit
Temporary workaround:

#!/usr/bin/sh
set -e
echo 63BE270F-1C11-48FD-A6F7-3AF253FF3E2D > /sys/devices/platform/INT3400:00/uuids/current_uuid
echo enabled > /sys/class/thermal/thermal_zone1/mode
Vitalie where did you get the 63B.... string from? Does this work on all systems?
[root@t490s-festive-local ~]# cat /sys/devices/platform/INT3400\:00/uuids/current_uuid INVALID [root@t490s-festive-local ~]# cat /sys/devices/platform/INT3400\:00/uuids/available_uuids 63BE270F-1C11-48FD-A6F7-3AF253FF3E2D 9E04115A-AE87-4D1C-9500-0F3E340BFE75
> Vitalie where did you get the 63B.... string from?

From the Linux kernel sources: https://github.com/torvalds/linux/blob/master/drivers/thermal/intel/int340x_thermal/int3400_thermal.c#L35

63BE270F-1C11-48FD-A6F7-3AF253FF3E2D is the GUID of the THERMAL_ADAPTIVE_PERFORMANCE scheme. The Lenovo Intelligent Thermal Service on Windows sets this GUID on boot. On modern Linux kernels it works fine too. No more throttling for me.

> Does this work on all systems?

Tested on T480, T580.

I created a small systemd unit for myself: https://github.com/xvitaly/throttling-fix
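Because not every machine offers this GUID (or has an INT3400 device at all), a slightly more defensive variant of the workaround may be useful. The following is only a sketch under stated assumptions: the GUID and the sysfs paths are the ones quoted in this thread, while the function name and its optional directory argument are made up for illustration.

```shell
#!/bin/sh
# GUID quoted in this thread for the adaptive performance policy.
ADAPTIVE_UUID=63BE270F-1C11-48FD-A6F7-3AF253FF3E2D

# select_adaptive_policy [UUIDS_DIR]
# Writes ADAPTIVE_UUID to current_uuid, but only if the firmware lists it in
# available_uuids. Returns non-zero (and changes nothing) otherwise.
select_adaptive_policy() {
    uuids_dir="${1:-/sys/devices/platform/INT3400:00/uuids}"
    if [ ! -r "$uuids_dir/available_uuids" ]; then
        echo "no INT3400 device found at $uuids_dir" >&2
        return 1
    fi
    # -x: match whole lines only, -i: ignore case, -q: no output
    if ! grep -qix "$ADAPTIVE_UUID" "$uuids_dir/available_uuids"; then
        echo "adaptive performance policy is not offered by this firmware" >&2
        return 1
    fi
    echo "$ADAPTIVE_UUID" > "$uuids_dir/current_uuid"
}

# The function is deliberately not called here: writing to sysfs needs root.
```

After sourcing the file as root, `select_adaptive_policy` performs the first step of the original workaround; the thermal zone still has to be enabled as in the original script. The zone number (`thermal_zone1` there) may differ between machines, and `cat /sys/class/thermal/thermal_zone*/type` can help to find the right one.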
I've enabled your systemd unit. I have a T490. How would I validate that it's "fixed"?
> I've enabled your systemd unit. I have a T490. How would I validate that it's "fixed"?

cat /sys/devices/platform/INT3400:00/uuids/available_uuids

It should return the correct GUID instead of the INVALID value.
Oops, the correct command is:

cat /sys/devices/platform/INT3400:00/uuids/current_uuid
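The same check can be wrapped in a small function that gives a yes/no answer, for example for use in a script. This is a sketch only; the function name and its optional argument are hypothetical, and it checks nothing beyond the file content described above.

```shell
#!/bin/sh
# Succeeds if a thermal policy has been selected, i.e. current_uuid exists
# and contains something other than INVALID. Returns 2 if it cannot be read.
policy_selected() {
    f="${1:-/sys/devices/platform/INT3400:00/uuids/current_uuid}"
    [ -r "$f" ] || return 2
    value=$(cat "$f")
    echo "current_uuid: $value"
    [ -n "$value" ] && [ "$value" != "INVALID" ]
}

policy_selected || echo "no thermal policy selected (or no INT3400 device)"
```

A selected GUID only shows that the write succeeded; whether the throttling is actually gone still has to be judged from the kernel log and the CPU frequencies under load.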
Matthew has been working on reverse engineering this, he posted a blog post here https://mjg59.dreamwidth.org/54923.html
I realised that it has been a while since I last saw the error "package temperature above threshold" on my T470s. So I queried my journal, and the last occurrence seems to have been on February 22 for me. I've been gaming for quite a while, and even with my Windows 10 in Boxes I have already seen my CPU up at 3600 MHz, unthrottled. I did not apply any workarounds like thermald, and I have been running Fedora 31 with TLP, now Fedora 32. My BIOS version is 1.35 from August 2019. Is it possible that some kernel update fixed the issue for my platform?
(In reply to Andy from comment #40)

This is an interesting discovery. I have checked my journal and the last 'cpu clock throttled' message is from March 23rd, when I upgraded to kernel-5.5.10-200.fc31.x86_64. Not sure if the issue was fixed or just the message disabled ;) But it seems that my CPU can reach its maximum frequency.

(In reply to Peter Robinson from comment #39)
> Matthew has been working on reverse engineering this, he posted a blog post
> here https://mjg59.dreamwidth.org/54923.html

This was originally reported against the T470s, and I for one don't have the INT3400 device available:

~~~
$ sudo ls /sys/devices/platform/ | grep INT3400
~~~
(In reply to Vít Ondruch from comment #41)
> (In reply to Andy from comment #40)
> This is interesting discovery. I have checked my journal and the last 'cpu
> clock throttled` message is from March 23rd, when I upgraded to
> kernel-5.5.10-200.fc31.x86_64. Not sure if the issue was fixed or just the
> message disabled ;) But it seems that my CPU can reach the CPU max.

This is mostly due to our (Benjamin's) and Intel's work to not emit critical messages for expected thermal events. See kernel commits 9c3bafaa1fd88e4dd2dba3735a1f1abb0f2c7bb7 [1] and f6656208f04e5b3804054008eba4bf7170f4c841 [2].

[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/kernel/cpu/mce/therm_throt.c?id=9c3bafaa1fd88e4dd2dba3735a1f1abb0f2c7bb7
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/arch/x86/kernel/cpu/mce/therm_throt.c?id=f6656208f04e5b3804054008eba4bf7170f4c841
See https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org/message/VHQCIGJBKRDLTRRDKJBKCLE7BATFCBME/ for a good explanation of some of the issues involved. Thermald is unlikely to be of high relevance on the Lenovo laptops in the future. The reverse engineering work might help on certain models.