Created attachment 2019129 [details]
kernel log after resume with stuck low frequencies

1. Please describe the problem:

My ThinkPad P16v Gen 1 laptop sometimes resumes with the CPU frequency locked to the 400 MHz - 544 MHz range. Even when I put load on the CPU, the frequency doesn't go above 544 MHz (in the normal case I can easily see 4.5 GHz), and it stays at 400 MHz when idle. The system is then slow and unresponsive, so it's easy to spot. I have to reboot to "fix" the issue.

This bug has already been seen by two different people on two different laptops (both ThinkPad P16v), so it's not a hardware issue with my exact device. It's probably common to (at least) all ThinkPad P16v laptops.

When the system is stuck at low frequencies, I see this output from cpupower:

    $ sudo cpupower frequency-info
    analyzing CPU 14:
      driver: amd-pstate-epp
      CPUs which run at the same hardware frequency: 14
      CPUs which need to have their frequency coordinated by software: 14
      maximum transition latency: Cannot determine or is not supported.
      hardware limits: 400 MHz - 5.76 GHz
      available cpufreq governors: performance powersave
      current policy: frequency should be within 400 MHz and 5.76 GHz.
                      The governor "powersave" may decide which speed to use
                      within this range.
      current CPU frequency: Unable to call hardware
      current CPU frequency: 544 MHz (asserted by call to kernel)
      boost state support:
        Supported: yes
        Active: yes
        AMD PSTATE Highest Performance: 220. Maximum Frequency: 5.76 GHz.
        AMD PSTATE Nominal Performance: 145. Nominal Frequency: 3.80 GHz.
        AMD PSTATE Lowest Non-linear Performance: 42. Lowest Non-linear Frequency: 1.10 GHz.
        AMD PSTATE Lowest Performance: 16. Lowest Frequency: 400 MHz.

2. What is the Version-Release number of the kernel:

kernel-6.7.6-200.fc39.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue *first* appear? Old kernels are available for download at https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

This is a new laptop, I haven't used older kernels with it.

4. Can you reproduce this issue? If so, please provide the steps to reproduce the issue below:

Unfortunately it's random. Suspend the laptop and resume it. In most cases it works as expected, but sometimes this bug occurs and only low CPU frequencies are available.

5. Does this problem occur with the latest Rawhide kernel? To install the Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by ``sudo dnf update --enablerepo=rawhide kernel``:

I can test if needed, but it might take a long time before I'm able to say whether it's affected or not.

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No.

7. Please attach the kernel logs. You can get the complete kernel log for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the issue occurred on a previous boot, use the journalctl ``-b`` flag.

Kernel log is attached. The first resume happened at 07:06:31 and another one at 10:27:51. In both cases, the CPU frequency was locked to 400-500 MHz.
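For anyone who wants to check for the stuck state without cpupower, the same information can be read from the cpufreq sysfs files. The following is only a rough sketch, not something used in this report: the function names `max_cur_freq_khz` and `looks_stuck` are made up for illustration, and the 600000 kHz cut-off is an assumption chosen to sit just above the 544 MHz ceiling described above. It only makes sense to run it while the CPU is under load, since an idle healthy system can also sit at low frequencies.

```shell
# Print the highest scaling_cur_freq (in kHz) across all CPUs.
# $1 is an optional sysfs root, so the function can be pointed at a test tree.
max_cur_freq_khz() {
    root="${1:-/sys/devices/system/cpu}"
    max=0
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_cur_freq; do
        # Skip CPUs without a readable cpufreq entry (or an unmatched glob).
        [ -r "$f" ] || continue
        khz=$(cat "$f")
        if [ "$khz" -gt "$max" ]; then
            max=$khz
        fi
    done
    echo "$max"
}

# Succeed (exit 0) if no CPU currently runs above the cut-off.
# $2 is the cut-off in kHz; the 600000 default is an assumption, not a kernel value.
looks_stuck() {
    limit_khz="${2:-600000}"
    highest=$(max_cur_freq_khz "$1")
    [ "$highest" -gt 0 ] && [ "$highest" -lt "$limit_khz" ]
}
```

With a CPU load running in another terminal, ``looks_stuck && echo "frequency appears stuck"`` would print the message only when every core reports a frequency below the cut-off.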
Created attachment 2019130 [details] lscpu.txt
Created attachment 2019131 [details] lspci.txt
@mpearson Hey Mark, this is something that you might be interested in, perhaps. Thanks.
Ack - we're already looking at this one. Internal ticket is LO-2468.

The FW team are having trouble reproducing it with the images we certified the platform with, so they are pointing at the kernel - which doesn't yet make sense to me. Normally these issues are FW related. We're doing ongoing debug to try and narrow down the issue.

Thanks for the report and details - it's useful to have some other logs to review.

Mark
Thanks, Mark. If you want me to provide any further debugging logs, just tell me how. It just happened to me again this morning (that's the second time in 6 days of using this laptop), so it seems there's a decent chance of triggering it every few days.
As a note, I reproduced this on my system. Easy repro is to suspend, unplug from power, and resume.

Using this I narrowed it down to breaking between 6.4 and 6.5-rc1. Did a bisect and it looks like this commit is causing the issue:
https://github.com/torvalds/linux/commit/b5539eb5ee70257520e40bb636a295217c329a50

I'm working with AMD on determining best next steps - but for now this is looking like a kernel regression issue.

Mark
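Since the reproducer depends on the AC state changing while the machine is suspended, it can help to confirm what the kernel reports before suspend and after resume. This is a minimal sketch under a few assumptions: `ac_state` is a hypothetical helper name, and it assumes the adapter shows up under /sys/class/power_supply with type "Mains", which may not hold on systems where the charger is exposed as a USB supply.

```shell
# Print "online", "offline" or "unknown" for the first mains adapter found.
# $1 is an optional power_supply root, so the function can be pointed at a test tree.
ac_state() {
    root="${1:-/sys/class/power_supply}"
    for d in "$root"/*; do
        [ -r "$d/type" ] || continue
        if [ "$(cat "$d/type")" = "Mains" ]; then
            case "$(cat "$d/online" 2>/dev/null)" in
                1) echo online ;;
                0) echo offline ;;
                *) echo unknown ;;
            esac
            return 0
        fi
    done
    # No mains-type supply was found under the given root.
    echo unknown
}
```

Running ``ac_state`` before suspending (expected "online") and again after resuming (expected "offline") shows whether the unplug actually registered, before looking at the CPU frequencies.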
I've fully updated my ThinkPad P16v [1] and tested with kernel-6.10.1-200.fc40, and the issue is now different. I can no longer use the "suspend, unplug from power, and resume" reproducer, because any time I connect or disconnect AC power during suspend, the laptop immediately resumes. This is quite annoying (it breaks the "close laptop, unplug, put it into your bag" workflow), and it also precludes testing any fix for this bug. The issue might or might not still be there, but with the current behavior, I can't tell.

[1] System Firmware 0.1.52
(In reply to Kamil Páral from comment #7)
> I can no longer use "suspend, unplug from power,
> and resume" reproducer, because any time I connect or disconnect AC power
> during suspend, the laptop immediately resumes.

I've reported this problem separately as bug 2301921.
Now that bug 2301921 has been resolved, I was able to re-test this. Unfortunately this is still an issue, with exactly the same symptoms and exactly the same reproducer (see comment 6). Tested on Fedora 41 with kernel-6.11.0-63.fc41.