Bug 2177962

Summary: Power-profiles bug still (somewhat) present in kernel 6.2.9
Product: [Fedora] Fedora Reporter: Jonathan Heitz <jheitz223>
Component: kernel Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED EOL QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 39 CC: acaringi, adscvr, airlied, alciregi, bskeggs, capybara_overdose, davemillsap, dzyndzla, hdegoede, hpa, jarodwilson, jglisse, josef, kernel-maint, lgoncalv, linville, masami256, mchehab, mpearson, paulds, ptalbert, steved
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-11-27 21:08:50 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jonathan Heitz 2023-03-14 04:13:26 UTC
This started with an issue that was present in the 6.1.7 kernel in which the power profile would constantly switch to "Power Saver" whenever it was set to anything else. That issue had been "fixed" in the 6.1.9 kernel, but I found after that, I could not set the profile to "Performance" or "Power Saver", and it would switch to "Balanced" (sometimes instantly, sometimes after a few seconds, sometimes after a minute or so) whenever I tried to set it to "Performance" or "Power Saver". Ever since 6.1.9, whenever a new kernel releases, I try it, and when it ultimately still does this, I immediately downgrade back to 6.1.6, before there was any issue to begin with. 

The latest Fedora kernel at the time of writing this is 6.2.9, which I have indeed tried. The issue is still present. I can confirm that this does still occur on a fresh installation of Fedora with the latest kernel, so it is nothing to do with my installation.

As for the hardware, this computer is a ThinkPad T490 (model 20N20031US) with an i7-8565U. This computer has the latest firmware updates.

Comment 1 Mark Pearson 2023-04-17 15:26:19 UTC
Seems platform profiles aren't loading with the latest kernel. I'll need to dig into why but I don't think they are supported on this platform
I checked with an older kernel (6.0.11-300) and the platform profiles aren't there either. 

Are you able to confirm on your working kernel if you see /sys/firmware/acpi/platform_profile please?
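
A minimal sketch of that check, assuming the standard ACPI platform-profile sysfs files. The function name `show_platform_profile` and its optional directory argument are illustrative only and not part of any tool mentioned here.

```shell
#!/bin/sh
# Print the active platform profile and the choices the driver registered.
# The optional argument (an assumption, added for testing) overrides the
# sysfs directory; by default the live /sys/firmware/acpi is read.
show_platform_profile() {
    dir="${1:-/sys/firmware/acpi}"
    if [ -r "$dir/platform_profile" ]; then
        echo "current: $(cat "$dir/platform_profile")"
        # platform_profile_choices lists every profile that can be selected
        if [ -r "$dir/platform_profile_choices" ]; then
            echo "choices: $(cat "$dir/platform_profile_choices")"
        fi
    else
        echo "platform_profile not present under $dir"
        return 1
    fi
}
```

Calling `show_platform_profile` with no argument reads the running system; a "not present" result would suggest the kernel did not register any platform profile handler.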

Also it might be interesting to do:
grep . /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/*_uw

and then do FN+H; FN+M; FN+L, which are the hotkeys to switch between power profiles (they bypass the OS entirely), and see if the values change. On my system they didn't (but it seems I'm also running a trial BIOS that I need to update from...so it would be great to get your results)
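
The before/after comparison can be sketched roughly as below. The helper names `snapshot_uw` and `compare_snapshots` are made up for illustration, and the directory argument exists only so the function can be pointed at another powercap zone or a test directory.

```shell
#!/bin/sh
# Dump every *_uw power-limit file in a powercap zone as "name:value" lines.
# The default path is the one from the grep command above.
snapshot_uw() {
    zone="${1:-/sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0}"
    for f in "$zone"/*_uw; do
        if [ -r "$f" ]; then
            printf '%s:%s\n' "$(basename "$f")" "$(cat "$f")"
        fi
    done
}

# Compare two snapshots; prints "unchanged" or both snapshots for inspection.
compare_snapshots() {
    if [ "$1" = "$2" ]; then
        echo "unchanged"
    else
        printf 'before:\n%s\nafter:\n%s\n' "$1" "$2"
    fi
}

# Intended use on a real system:
#   before=$(snapshot_uw)
#   (press FN+H, FN+M or FN+L)
#   after=$(snapshot_uw)
#   compare_snapshots "$before" "$after"
```

If the output is "unchanged" after each hotkey, the firmware did not adjust the RAPL limits in response to the key press.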

Please also confirm which BIOS and EC versions you have
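
One possible way to collect those versions without rebooting into the BIOS setup, assuming the kernel exposes the DMI fields `bios_version`, `bios_release` and `ec_firmware_release` under /sys/class/dmi/id (older kernels may lack the last two). The function name and its directory argument are hypothetical.

```shell
#!/bin/sh
# Print BIOS and embedded-controller (EC) version fields from the kernel's
# DMI sysfs interface. The directory argument is an assumption added for
# testing; by default the live /sys/class/dmi/id is read.
show_fw_versions() {
    dmi="${1:-/sys/class/dmi/id}"
    for field in bios_version bios_release ec_firmware_release; do
        if [ -r "$dmi/$field" ]; then
            printf '%s: %s\n' "$field" "$(cat "$dmi/$field")"
        else
            printf '%s: (not available)\n' "$field"
        fi
    done
}
```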

Thanks
Mark

Comment 2 Jonathan Heitz 2023-04-17 17:35:39 UTC
(In reply to Mark Pearson from comment #1)
> Seems platform profiles aren't loading with the latest kernel. I'll need to
> dig into why but I don't think they are supported on this platform
> I checked with an older kernel (6.0.11-300) and the platform profiles aren't
> there either. 
> 
> Are you able to confirm on your working kernel if you see
> /sys/firmware/acpi/platform_profile please?
> 
> Also it might be interesting to do:
> grep . /sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/*_uw
> 
> and then do FN+H; FN+M; FN+L which are the hotkeys to switch between power
> profiles (bypass the OS entirely) and see if the values change. On my system
> they didn't (but it seems I'm also running a trial BIOS that I need to
> update from...so be great to get your results)
> 
> Please also confirm which BIOS and EC versions you have
> 
> Thanks
> Mark

Hello Mark,

Thanks for your reply. On kernel 6.1.6 (the working one) I do have /sys/firmware/acpi/platform_profile, and the contents accurately represent the current power profile. Grepping the files you mentioned returns the following:

/sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_max_power_uw:15000000
/sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_0_power_limit_uw:10000000
/sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_1_max_power_uw:0
/sys/devices/virtual/powercap/intel-rapl-mmio/intel-rapl-mmio:0/constraint_1_power_limit_uw:29000000

The hotkeys did not do anything for me either. I am on BIOS 1.79 (N2IETA1P) and EC 1.26 (N2IHT42W). 

Please let me know if you need any other information.

-Jon

Comment 4 Mark Pearson 2023-04-17 18:06:26 UTC
Thanks Jon - I'll do some more checking on mine and figure out what changed.

Comment 5 Mark Pearson 2023-04-18 17:58:24 UTC
This is confusing :(

The culprit commit is https://github.com/torvalds/linux/commit/bce6243f767f7da88aa4674d5d678f9f156eaba9

But as noted in there:
> PSC platform profile mode is only supported on Linux for AMD platforms.
> Some older Intel platforms (e.g T490) are advertising it's capability
> as Windows uses it - but on Linux we should only be using MMC profile
> for Intel systems.

I'm testing on my T490 and it is not advertising MMC mode - it is *only* advertising PSC; and I'm assuming that is what you are seeing too.

However - I originally made that change because of this:
https://gitlab.freedesktop.org/hadess/power-profiles-daemon/-/issues/86

My internal ticket on that issue notes that the BIOS is advertising MMC and PSC modes. It definitely isn't now.

I need to talk to the FW team and understand what is going on I'm afraid. Right now I'm confused. It seems they've disabled MMC but I need to check why....

Mark

Comment 6 Jonathan Heitz 2023-04-18 20:20:55 UTC
Does that mean, even though my T490 says it is using a given profile on the "working" kernel, it actually isn't?

Also, I am unfortunately not familiar with PSC or MMC. Are they just different ways of setting processor states?

Jon

Comment 7 Mark Pearson 2023-04-19 15:50:04 UTC
Hi Jon,

PSC and MMC are different thermal control modes internal to our firmware. My previous understanding was that MMC was used on Intel and PSC was used on AMD.

However, after some discussion, it looks like my previous understanding on this was wrong :( 
It was a little bit of a case of 'lost in translation' when discussing with the FW team in China but ultimately my fault - I should have double checked my understanding of what was being stated.

The fix I committed above needs to be reverted. The FW team have (since I proposed that patch) updated the BIOS to remove advertising MMC, so the kernel should allow PSC on Intel platforms.

I'll get a patch submitted upstream ASAP and get it fixed.

Mark

Comment 8 Jonathan Heitz 2023-04-19 16:11:37 UTC
Great, thank you!

Jon

Comment 9 Hans de Goede 2023-04-19 16:33:44 UTC
(In reply to Mark Pearson from comment #7)
> The fix I committed above needs to be reverted. The FW team have (since I
> proposed that patch) updated the BIOS to remove advertising MMC, so the
> kernel should allow PSC on Intel platforms.

Maybe not revert entirely, but allow both, preferring one or the other depending on whether it is an Intel- or AMD-based laptop?

To be clear what I'm worried about here is machines with an older BIOS which does still advertise both regressing ... ?

Comment 10 Mark Pearson 2023-04-19 18:00:39 UTC
Yeah - I was thinking about that
 - On the L13 G2 it advertises PSC mode but it won't work under Linux (Intel has done some special Windows drivers to handle it). Right now I believe there are no profile controls available that we can use.
 - T490 advertises PSC mode and it works (all handled by the FW). Right now it isn't being made available but should be

I don't believe I've seen other cases reported.

My suggestion is:
 - FW should advertise which mode it supports correctly. T490 is now doing this (but earlier FW versions weren't). If FW is wrong it should be fixed (if possible)
 - L13 G2 doesn't support platform profiles under Linux. I'm thinking a quirk for the L13 G2 to avoid using profiles is needed. I'm not going to be able to get them to fix the firmware. We should have caught this during enablement, but that's a separate discussion with our QA team that I need to have.

The downside is I don't have an L13 G2...so have to rely on a colleague for testing so it's going to take me a little while to get this ready

Let me know any thoughts or concerns
Mark

Comment 11 Hans de Goede 2023-04-19 18:50:59 UTC
(In reply to Mark Pearson from comment #10)
> Yeah - I was thinking about that
>  - On the L13 G2 it advertises PSC mode but it won't work under Linux (Intel
> has done some special Windows drivers to handle it). Right now I believe
> there are no profile controls available that we can use.

Ok, so we need to DMI quirk this and not offer any platform_profile support on this model.

>  - T490 advertises PSC mode and it works (all handled by the FW). Right now
> it isn't being made available but should be

But originally (with older BIOS-es) it did not work and MMC mode had to be used, right ?  At least that is what:
https://github.com/torvalds/linux/commit/bce6243f767f7da88aa4674d5d678f9f156eaba9

claims. So maybe add a "prefer MMC" DMI quirk mechanism and set that for the T490 Intel models, and then if both are advertised use MMC to keep things working with the old BIOS for which bce6243f767f7 was written?

At least if this is not too much effort / not too messy code wise. We do have fwupdate support for these models, but still many users just never update their BIOS so it would be good to keep old versions working if we can.

Comment 12 Mark Pearson 2023-04-20 01:46:25 UTC
> But originally (with older BIOS-es) it did not work and MMC mode had to be used, right ?  At least that is what:
> https://github.com/torvalds/linux/commit/bce6243f767f7da88aa4674d5d678f9f156eaba9

I suspect I'm going to have to downgrade FW to check what is/was going on - it's not making a lot of sense right now to me either. I'm stumped as to how it used to not work - it should have done. The only conclusion I have so far is my commit was garbage - but I know I tested it so I'm really hoping that's not the case. I'll do some more digging and see what I can find out as I'm missing something in my understanding.

As MMC is offered first in the code it will work as 'MMC preferred' by default (if both are advertised) I think without needing a quirk - but I have this horrible sinking feeling the issue is tied up somewhere in there.

The conversation with the FW team has been a little confusing at times (largely language based); but they have confirmed clearly the T490 is using PSC (and in my testing yesterday PSC definitely works).

Mark

Comment 13 Mark Pearson 2023-05-02 18:28:20 UTC
Some updates on this issue. I was travelling last week so haven't made much direct progress - but I was in Japan and got to sit down with the FW team which was quite helpful and I have a bit more insight.

They confirmed that PSC mode is used on Intel platforms. In fact Windows is using that for their profile slider control - so my code in the driver saying not to run it on Intel platforms is definitely wrong and I need to correct it.
MMC mode was intended for Linux platforms and maps on top of the FN+H/M/L key presses. It wasn't really made clear why we have two modes, and I do have some concerns that it may be related to how the thermal tables are used in Windows and not Linux, so I'm treading cautiously for now. I will be doing some more investigation here but it's going to take me some time and testing. PSC mode should possibly take preference over MMC mode, but I'm somewhat hesitant to make that change without clarity and results.

In the short term I'm working on a couple of patches:
 1) Revert the change that blocks PSC mode on the Intel platforms. MMC will remain as preferred for now.
 2) A module parameter that can be used to force MMC or PSC mode regardless of what is advertised. That way if we hit some other particular case we have a workaround whilst I figure it out with the FW team.
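
The name of that module parameter is not given in this thread, so the sketch below only shows how one might discover it once a kernel carrying the patch is installed. It assumes `modinfo -p` prints one "name:description (type)" line per parameter; `filter_profile_params` is a made-up helper name.

```shell
#!/bin/sh
# Filter "name:description (type)" lines, as printed by `modinfo -p`, down to
# parameters whose name or description mentions "profile".
# Reads from stdin; returns success even when nothing matches.
filter_profile_params() {
    grep -i 'profile' || true
}

# Intended use on a real system:
#   modinfo -p thinkpad_acpi | filter_profile_params
```

Whatever parameter turns up could then be set at module load time with a line of the form `options thinkpad_acpi <parameter>=<value>` in a file under /etc/modprobe.d/, where both placeholders must come from the module's own documentation.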

Once I've had a chance to discuss some options more with Intel, AMD and the thermal team I'm hoping I can get something a bit better - but it's definitely going to take me longer to get those pieces all tied together.

Sorry about all the confusion - I wish I could state confidently what the final answer is but there's a lot of history and churn in this area (and variation between platforms).

I've re-opened internal ticket LO-1710 on this issue

Mark

Comment 14 Jonathan Heitz 2023-07-15 16:08:51 UTC
Any more updates on this?

Jon

Comment 15 Mark Pearson 2023-07-17 13:52:12 UTC
The patch for fixing the original regression was accepted upstream - it fixes things for the T490
https://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86.git/commit/drivers/platform/x86/thinkpad_acpi.c?h=for-next&id=0c0cd3e25a5b64b541dd83ba6e032475a9d77432

I checked and the fix is in the fedora-6.3 kernel.

Note, this does raise a problem on the L13 G2 Intel, but this is a FW problem that gets exposed by the patch and needs fixing in FW. I have a ticket open with the FW team (LO-2513) and nagged them last week for an update because we don't have the fix yet.
If you have another platform impacted please let me know

I think we can close this (from the T490 point of view) - let me know if any objections.

Mark

Comment 16 Jonathan Heitz 2023-07-17 21:39:00 UTC
Unfortunately, I am still experiencing the issue on kernel-6.3.12-100.fc37. Is there anything I should check or any information I could provide?

Jon

Comment 17 Mark Pearson 2023-07-18 15:28:08 UTC
I updated my T490 (with Fedora38 rather than 37; but don't think that should matter).

I'm seeing the same as you - it's not working for performance mode; only for low-power and balanced.
What is weird is I still have the kernel and module from when I fixed this originally and I double checked it is all working there correctly (including checking the power limits).

Right now I have no idea what is going on - those patches fixed it back on the 6.3.0-rc1 kernel but don't seem to be working now they've been integrated. I'll have to debug and find out what is going on - something stupid must have happened. Sorry :(

One small note - I'm on PTO from Friday for two weeks (and travelling so can't take a stack of laptops with me). I'll do my best to look at this before then but this week is a bit crazy. I've tagged myself as needinfo so I'll get the bug nags for it as a reminder.

Mark

Comment 18 Mark Pearson 2023-07-18 21:08:31 UTC
I'm confused...

I cloned the kernel ark repository (https://gitlab.com/cki-project/kernel-ark) and did a clean build using the Fedora config from the fedora-6.3 branch. This gives a 6.3.13 kernel - and it works correctly.

I've checked the source code for 6.3.12 and the patches are there. I've kicked off a build of that exact tree to see if I can reproduce the issue from that....it should work.

Comment 19 Mark Pearson 2023-07-19 01:31:28 UTC
This got even weirder...

I built the 6.3.12 kernel from the same kernel source as the 6.3.12-200-fc38 image. Platform profiles were working.
Pretty confused at this point I went back to my 6.3.12-200-fc38 install; and platform profiles are now working correctly there too. I can't reproduce the problem anymore.
I've tried dropping back to an old kernel and then going back up - everything keeps working.

Not sure what to do at this point. Can you try power cycling (full power off, not just reboot) to see if that makes a difference? (grasping at straws)
If it still doesn't work are you happy to try a kernel rpm that I've built?

Mark

Comment 20 Mark Pearson 2023-07-19 01:45:20 UTC
Think I've figured out what is going on - this is the lapmode sensor triggering.

You can't have performance mode enabled in lapmode, though for some reason it is dropping to low-power if it's in performance mode (balance mode stays at balance).

You can check the status of the lapmode with /sys/devices/platform/thinkpad_acpi/dytc_lapmode

Can you confirm if that matches with your observations? If lapmode isn't triggered then the profiles are working correctly.
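
To line up lapmode changes with profile changes, something along these lines could be left running in a terminal. It is only a sketch: `watch_lapmode` and the `LAPMODE_FILE` / `PROFILE_FILE` overrides are hypothetical names, and polling once per second can miss very short transitions.

```shell
#!/bin/sh
# Poll the lap-mode flag and the active platform profile, printing a
# timestamped line whenever either value changes.
# LAPMODE_FILE / PROFILE_FILE are hypothetical overrides for testing.
watch_lapmode() {
    count="${1:-0}"        # number of polls; 0 means run until interrupted
    interval="${2:-1}"     # seconds between polls
    lap_file="${LAPMODE_FILE:-/sys/devices/platform/thinkpad_acpi/dytc_lapmode}"
    prof_file="${PROFILE_FILE:-/sys/firmware/acpi/platform_profile}"
    last=""
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        state="lapmode=$(cat "$lap_file" 2>/dev/null) profile=$(cat "$prof_file" 2>/dev/null)"
        if [ "$state" != "$last" ]; then
            echo "$(date +%T) $state"
            last="$state"
        fi
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Running `watch_lapmode` with no arguments polls until Ctrl+C; a profile change logged while lapmode stays at 0 would point away from the lap sensor.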

I think there is a FW issue that it goes to low-power mode instead of balanced (getting that fixed will be interesting). I'll follow up with the FW team on that once you confirm my observations are correct.

Mark

Comment 21 Jonathan Heitz 2023-07-19 12:56:37 UTC
I am not at the machine right now, but I think it is worth noting that when I tested it with the new kernel, the laptop stood completely still the whole time, from before I even turned it on. Unless lap mode is being triggered when it is not supposed to, I do not believe it was in lap mode. I will confirm when I get home from work today.

Jon

Comment 22 Jonathan Heitz 2023-07-25 01:36:29 UTC
Ok, I have just checked and confirmed that the initial problem still occurs, and that the laptop is not entering lap mode when it is not supposed to. In other words, Performance and Power Saver both switch to balanced, and the aforementioned file continues to display 0, indicating no lap mode. This is on both 6.4.4 and 6.3.12. Let me know if there is any other information I could provide.

Comment 23 capybara_overdose 2023-08-17 13:18:23 UTC
This is still VERY much present on the L13  Gen2 and has been for like *checks notes* THREE MONTHS. No 'lap mode' whatever here either, just constant throttling

Comment 24 Jonathan Heitz 2023-08-17 13:46:45 UTC
Well, I will say it is somewhat comforting to know now that I am not the only one experiencing this. Welcome to the club.

Comment 25 capybara_overdose 2023-09-12 15:56:23 UTC
(In reply to Jonathan Heitz from comment #24)
> Well, I will say it is somewhat comforting to know now that I am not the
> only one experiencing this. Welcome to the club.

Well, I can't offer much more comfort.

I've been bumping this issue on the Fedora forums for MONTHS, and they've opted to victim-blame and ban me rather than concede this is a failure in their product (I've tested it in Ubuntu, it's not present there).

The Lenovo side is no better. Months and months of excuses, and now they're pulling the same "It's your fault for finding the problem!" line; they've thrown a tantrum and now refuse to even reply.

Pretty pathetic indictment of both organisational cultures and I'd say this will be the end of Lenovo machines both in my business space and personal/family use, after a solid decade of using them basically exclusively. They clearly just have no intent to even try and fix this, despite the severity.

Comment 26 Jonathan Heitz 2023-09-12 16:23:21 UTC
I completely understand, however I think if we are respectful and patient, a resolution will be more likely than it would otherwise. Despite the minimal progress in the last half a year, I am just trying to have faith in Mark and the others at Lenovo. It's all I can do besides just hound them, which I will not do since at the end of the day they are, in some way shape or form, trying to help me and I am appreciative of that. As someone who works in support I know how it feels when people overlook that essential fact, that you are there to help them, and instead make you out to be the enemy. Linux is a free, open source product and so nobody is "entitled" to support. Again, I understand where you are coming from and I am frustrated as well that little progress has been made, but I am trying my best to remain patient and faithful. 

However, I would kindly appreciate an update from Mark or others on the status of this issue. Thank you!

Comment 27 Mark Pearson 2023-09-12 19:42:13 UTC
Hi Jonathan,

Thanks for the nudge - this did fall between the cracks over the summer. I've pinged the FW for an update (for my reference - internal ticket is LO-1710).

Mark

Comment 28 capybara_overdose 2023-09-13 01:46:08 UTC
(In reply to Jonathan Heitz from comment #26)
> I completely understand, however I think if we are respectful and patient, a
> resolution will be more likely than it would otherwise. Despite the minimal
> progress in the last half a year, I am just trying to have faith in Mark and
> the others at Lenovo. It's all I can do besides just hound them, which I
> will not do since at the end of the day they are, in some way shape or form,
> trying to help me and I am appreciative of that. As someone who works in
> support I know how it feels when people overlook that essential fact, that
> you are there to help them, and instead make you out to be the enemy. Linux
> is a free, open source product and so nobody is "entitled" to support.
> Again, I understand where you are coming from and I am frustrated as well
> that little progress has been made, but I am trying my best to remain
> patient and faithful. 
> 
> However, I would kindly appreciate an update from Mark or others on the
> status of this issue. Thank you!

I won't derail this too much, but I absolutely couldn't disagree more. 

As someone who also runs a business (albeit small) with a product support side, if I had staff behave this way, and just...abandon a support case after a customer complaint, for a known problem causing major performance issues, for MONTHS - they would be sacked on the spot.

If it IS a firmware issue, then the usual "it gets a pass because lInuX iS FrEe" excuse is irrelevant. The firmware is 100% Lenovo's responsibility. If it's NOT a firmware issue but a Fedora issue (which I suspect it is), then why are Lenovo marketing their machines as 'Linux Compatible' only to bail on the support when it turns out they're not? Why put a 'FrEe' product on them at all, if you know it's not going to work properly?

These organisations rake in millions (or billions...) of $ a year. There is no reason resources should not be allocated to rectify this, without the user being expected to simp for some megacorporation.

Comment 29 capybara_overdose 2023-10-08 04:53:55 UTC
This is STILL. NOT. FIXED.

Comment 30 Mark Pearson 2023-10-11 15:25:45 UTC
Hi Jonathan,

China are back from Autumn festival holidays and I think they've reproduced the issue with lapmode not triggering; but not the profile issues. It's strange as I confirmed again that I'm not seeing that problem on my system.

I did also update to BIOS 1.80 and EC 1.26 and tested there. These fix the problem with the profile dropping to low-power when lapmode is triggered. It would be worth checking on your system if that makes any difference.

Mark

Comment 31 Jonathan Heitz 2023-10-12 00:59:36 UTC
(In reply to Mark Pearson from comment #30)
> Hi Jonathan,
> 
> China are back from Autumn festival holidays and I think they've reproduced
> the issue with lapmode not triggering; but not the profile issues. It's
> strange as I confirmed again that I'm not seeing that problem on my system.
> 
> I did also update to BIOS 1.80 and EC 1.26 and tested there. These fix the
> problem with the profile dropping to low-power when lapmode is triggered. It
> would be worth checking on your system if that makes any difference.
> 
> Mark

Hello Mark,

Thanks for the update. I have applied the new firmware, and unfortunately the profiles still switch back to 'balanced' when they are not 'balanced', regardless of whether or not the computer is in lapmode. In other words, I cannot use the 'power saver' mode or the 'performance' mode under any circumstances. I will point out that in lapmode, the change is more sudden. That is, it switches back to 'balanced' instantly, whereas when the computer is not in lapmode, it switches after a (varying) brief period. 

I hope this helps. It's strange though that nobody is able to reproduce this issue... that certainly makes things more difficult. If any of you at Lenovo would like me to test out future patches to the firmware or kernel, I would happily do so. I would honestly even consider mailing out my device for testing purposes, as I don't use this machine much these days. 

Let me know if there is anything else I can do, or if I can provide any other info.

Thanks,

Jon

Comment 32 Hans de Goede 2023-10-12 07:42:26 UTC
Jonathan, I assume this is already the case, but just in case it is not: I assume you are running the latest version of power-profiles-daemon:

[hans@shalem ~]$ rpm -q power-profiles-daemon
power-profiles-daemon-0.13-1.fc38.x86_64

?

Comment 33 Mark Pearson 2023-10-12 19:31:27 UTC
Good point. 
Jonathan, if I build a Fedora kernel with some extra debug in it would you be OK to run that? I'd just add some prints so we can figure out if user space is configuring things, or if it's the FW changing the setting.
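
As a rough user-space cross-check in the meantime, the daemon's view can be compared with the kernel's. This sketch assumes `powerprofilesctl get` is available and that only the power-saver/low-power names differ between the two on this ThinkPad setup; `ppd_to_kernel`, `check_profile_agreement` and `PROFILE_FILE` are illustrative names.

```shell
#!/bin/sh
# Translate a power-profiles-daemon profile name to the name used in the
# kernel's platform_profile file. Only power-saver/low-power differs here
# (an assumption; other hardware may expose different names).
ppd_to_kernel() {
    case "$1" in
        power-saver) echo "low-power" ;;
        *)           echo "$1" ;;
    esac
}

# Compare the daemon's active profile with the kernel's.
# PROFILE_FILE is a hypothetical override used for testing.
check_profile_agreement() {
    prof_file="${PROFILE_FILE:-/sys/firmware/acpi/platform_profile}"
    daemon=$(powerprofilesctl get) || return 2
    kernel=$(cat "$prof_file") || return 2
    if [ "$(ppd_to_kernel "$daemon")" = "$kernel" ]; then
        echo "agree: daemon=$daemon kernel=$kernel"
    else
        echo "MISMATCH: daemon=$daemon kernel=$kernel"
        return 1
    fi
}
```

This only shows both views at one instant; it cannot by itself tell whether user space or the firmware initiated a change, which is what the debug prints are for.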

Mark

Comment 34 Jonathan Heitz 2023-10-12 21:57:15 UTC
(In reply to Hans de Goede from comment #32)
> Jonathan, I assume this is already the case, but just in case it is not: I
> assume you are running the latest version of power-profiles-daemon:
> 
> [hans@shalem ~]$ rpm -q power-profiles-daemon
> power-profiles-daemon-0.13-1.fc38.x86_64
> 
> ?

Hans,

I am actually not running the latest version of power-profiles-daemon as I am still on Fedora 37 which has power-profiles-daemon-0.12-2.fc37.x86_64 at the moment. Having no intention to upgrade at this time, I tried a live USB of Fedora 38 which comes with power-profiles-daemon-0.13-1.fc38.x86_64, but the issue persists.

Thanks,

Jon

Comment 35 Jonathan Heitz 2023-10-12 21:59:09 UTC
(In reply to Mark Pearson from comment #33)
> Good point. 
> Jonathan, if I build a Fedora kernel with some extra debug in it would you
> be OK to run that? I'd just add some prints so we can figure out if user
> space is configuring things, or if it's the FW changing the setting.
> 
> Mark

Yes, I would be happy to try that kernel.

Comment 36 Mark Pearson 2023-10-16 15:49:24 UTC
Hopefully this works:
kernel rpm: https://drive.google.com/file/d/12hQEHysEpAUZEE_L6zazAhGuk3HylAe7/view?usp=drive_link
kernel-headers rpm: https://drive.google.com/file/d/1-SB8cm3cIRWE-dOGN4TDncN-FRIqRwUU/view?usp=drive_link

It should add extra debug in your kernel log (look for anything with "---" in front of it). I was testing on a P1G6, just because I happened to be building a kernel on there for something else...so let me know if you have any issues on your system. Build is from the Fedora kernel-ark tree.
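
A small sketch for pulling those lines out of the kernel log, assuming the default `dmesg` format with a bracketed timestamp in front of each message. The helper name `filter_debug_lines` is made up.

```shell
#!/bin/sh
# Keep only kernel-log lines whose message text starts with "---",
# skipping the leading "[ timestamp ]" field that dmesg prints.
# Reads the log from stdin.
filter_debug_lines() {
    grep -E '^\[ *[0-9]+\.[0-9]+\] ---'
}

# Intended use on a real system:
#   sudo dmesg | filter_debug_lines
```

Debug lines that do not begin with "---" would need their own pattern.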

Mark

Comment 37 Jonathan Heitz 2023-10-16 23:34:59 UTC
I just sent requests for access to both files. Also, should I be on Fedora 38 before installing this kernel? I'm running 37 right now as I prefer to stay one version behind for stability. (I don't mind upgrading to 38 though as 39 comes out tomorrow so 37 probably has a month of support left)

Thanks,

Jon

Comment 38 Mark Pearson 2023-10-17 07:20:08 UTC
Oops sorry - I didn't mean to make those limited access files. Should be able to download now.

These should work on F37. I was testing on F38 myself, but it shouldn't be required.
Mark

Comment 39 Jonathan Heitz 2023-10-18 00:49:36 UTC
Ok, running the debug kernel it looks like I get the below messages in my log when the profile switches to 'balanced'

[  422.118034] ----THM_CSM event
[  422.119224] ---dytc_profile_refresh 0x1f001

Comment 40 Mark Pearson 2023-10-19 12:32:01 UTC
Thanks - looks like an event from the FW (I'm assuming there wasn't anything else happening at the time that was generated - system being moved etc)

Would you mind running one more debug image - I'm wondering if you're getting bogus lapmode sensor changes. I added some more instrumentation to display the lapmode sensor state on these events:
https://drive.google.com/file/d/1JUt0pciiNi2GtfQPMBAMz4XmH7HdRyf_/view?usp=sharing
https://drive.google.com/file/d/1ZSQoOK5TXI6Iglt1aeF8rQtT_2t2-DOG/view?usp=sharing

Mark

Comment 41 Jonathan Heitz 2023-10-20 00:38:34 UTC
I believe you may be right about there being bogus lapmode sensor changes. On the second debug kernel I get the following:

[ 1232.214454] lapsensor 1 0
[ 1232.214459] ----THM_CSM event
[ 1232.214845] ---dytc_profile_refresh 0x1f001

I can assure you that the laptop was completely still when this happened (I didn't even touch it, nor did I touch the surface it was on)
I know you said this isn't happening on other T490 models, do you think maybe my lapmode sensor is just broken, physically? 

The only thing that confuses me is that when I check /sys/devices/platform/thinkpad_acpi/dytc_lapmode, it returns 0. Therefore it's not entering lapmode per se, it is simply just changing the power profile. However, when I do actually move the laptop, it does enter lapmode. So I'm thinking maybe there is no issue with the lapmode sensor. Maybe I just don't get what the above message in the kernel log means. Let me know your thoughts.

Thanks,

Jon

Comment 42 capybara_overdose 2023-10-20 06:25:53 UTC
(In reply to Mark Pearson from comment #40)
> Thanks - looks like an event from the FW (I'm assuming there wasn't anything
> else happening at the time that was generated - system being moved etc)
> 
> Would you mind running one more debug image - I'm wondering if you're
> getting bogus lapmode sensor changes. I added some more instrumentation to
> display the lapmode sensor state on these events:
> https://drive.google.com/file/d/1JUt0pciiNi2GtfQPMBAMz4XmH7HdRyf_/view?usp=sharing
> https://drive.google.com/file/d/1ZSQoOK5TXI6Iglt1aeF8rQtT_2t2-DOG/view?usp=sharing
> 
> Mark

I can't get ANY of these files to even install. It just says "loading app details" and never does anything more.

Where are these messages about lap sensors coming from, what even is going on here?

Comment 43 Jonathan Heitz 2023-10-20 13:12:29 UTC
(In reply to capybara_overdose from comment #42)
> (In reply to Mark Pearson from comment #40)
> > Thanks - looks like an event from the FW (I'm assuming there wasn't anything
> > else happening at the time that was generated - system being moved etc)
> > 
> > Would you mind running one more debug image - I'm wondering if you're
> > getting bogus lapmode sensor changes. I added some more instrumentation to
> > display the lapmode sensor state on these events:
> > https://drive.google.com/file/d/1JUt0pciiNi2GtfQPMBAMz4XmH7HdRyf_/view?usp=sharing
> > https://drive.google.com/file/d/1ZSQoOK5TXI6Iglt1aeF8rQtT_2t2-DOG/view?usp=sharing
> > 
> > Mark
> 
> I can't get ANY of these files to even install. It just says "loading app
> details" and never does anything more.
> 
> Where are these messages about lap sensors coming from, what even is going
> on here?

...Are you sure you are installing them correctly? They are just kernel RPMs, you install them through DNF. The messages I am referring to are from my kernel log, which I view through dmesg.
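As a rough illustration of that workflow: `dnf install` accepts paths to local `.rpm` files (the file names of the debug kernel are not given in this thread, so none are shown here), and after booting the new kernel the log can be filtered for the driver's messages. The sketch below assumes the relevant lines contain the string `thinkpad_acpi`; reading `dmesg` may require root on some systems, in which case the helper simply returns nothing.

```python
import subprocess


def filter_kernel_log(log_text, needle="thinkpad_acpi"):
    """Return only the log lines that mention the given driver name."""
    return [line for line in log_text.splitlines() if needle in line]


def read_kernel_log():
    """Return the output of dmesg, or None if it cannot be run or read."""
    try:
        result = subprocess.run(
            ["dmesg"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout


if __name__ == "__main__":
    log = read_kernel_log()
    if log is None:
        print("could not read the kernel log (try running as root)")
    else:
        for line in filter_kernel_log(log):
            print(line)
```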

Comment 44 capybara_overdose 2023-10-20 13:28:01 UTC
(In reply to Jonathan Heitz from comment #43)
> (In reply to capybara_overdose from comment #42)
> > (In reply to Mark Pearson from comment #40)
> > > Thanks - looks like an event from the FW (I'm assuming there wasn't anything
> > > else happening at the time that was generated - system being moved etc)
> > > 
> > > Would you mind running one more debug image - I'm wondering if you're
> > > getting bogus lapmode sensor changes. I added some more instrumentation to
> > > display the lapmode sensor state on these events:
> > > https://drive.google.com/file/d/1JUt0pciiNi2GtfQPMBAMz4XmH7HdRyf_/
> > > view?usp=sharing
> > > https://drive.google.com/file/d/1ZSQoOK5TXI6Iglt1aeF8rQtT_2t2-DOG/
> > > view?usp=sharing
> > > 
> > > Mark
> > 
> > I can't get ANY of these files to even install. It just says "loading app
> > details" and never does anything more.
> > 
> > Where are these messages about lap sensors coming from, what even is going
> > on here?
> 
> ...Are you sure you are installing them correctly? They are just kernel
> RPMs, you install them through DNF. The messages I am referring to are from
> my kernel log, which I view through dmesg.

No idea at all if I'm installing them correctly - after all, literally no instructions were given on how to do so. I just did what I normally do and double-click the downloaded file. 

dnf is a terminal thing, yes? If so, how exactly people are expected to just magically know what words to enter is beyond me, or even where to find this out. I just tried googling "install kernel with DNF" and it's a whole galaxy of incomprehensibility.

Comment 45 Jonathan Heitz 2023-10-20 13:42:53 UTC
With all due respect, if you are writing any of the things that you just wrote, you probably should not be installing a debug kernel.

Comment 46 Mark Pearson 2023-10-20 13:56:16 UTC
Thanks Jon,

Actually those logs suggest it's not the lap sensor. I should have made the debug message clearer but the '1' is confirming the system has a lapsensor, and the '0' is the state - so it's confirming the lapsensor is not triggering which means it's something else.
Afraid I need to follow up with the FW team now to try and understand what is generating the event. Thanks for collecting the logs and narrowing down the issue.

Mark

Comment 47 capybara_overdose 2023-10-20 14:44:28 UTC
(In reply to Jonathan Heitz from comment #45)
> With all due respect, if you are writing any of the things that you just
> wrote, you probably should not be installing a debug kernel.

With all "due respect", how does that sort of judgmental gate-keeping help solve the issue, or help anything? You clearly know what's supposed to be done, and you're choosing to deliberately withhold that information and look down your nose at someone rather than share it.

Comment 48 capybara_overdose 2023-10-20 14:45:55 UTC
(In reply to Mark Pearson from comment #46)
> Thanks Jon,
> 
> Actually those logs suggest it's not the lap sensor. I should have made the
> debug message clearer but the '1' is confirming the system has a lapsensor,
> and the '0' is the state - so it's confirming the lapsensor is not
> triggering which means it's something else.
> Afraid I need to follow on with the FW team now to try and understand what
> is generating the event. Thanks for collecting the logs and narrowing down
> the issue.
> 
> Mark

So can it be assumed it's the same issue affecting the L13's as well? Because firmware was being blamed there and yet profiles stay set where they should be just fine on Ubuntu.

Comment 49 Jonathan Heitz 2023-10-20 15:23:58 UTC
(In reply to capybara_overdose from comment #47)
> (In reply to Jonathan Heitz from comment #45)
> > With all due respect, if you are writing any of the things that you just
> > wrote, you probably should not be installing a debug kernel.
> 
> With all "due respect", how does that sort of judegmental gate-keeping help
> solve the issue, or help anything?  You clearly know what's supposed to be
> done, and you're instead choosing to deliberately withhold that information,
> and instead look down your nose at someone rather than share it.

I am afraid you misunderstood my comment. I was not gatekeeping, I was giving genuine advice. Playing with kernels is not something that should be done if you do not know what you are doing. You could break your system.

Comment 50 capybara_overdose 2023-10-20 15:46:37 UTC
(In reply to Jonathan Heitz from comment #49)
> (In reply to capybara_overdose from comment #47)
> > (In reply to Jonathan Heitz from comment #45)
> > > With all due respect, if you are writing any of the things that you just
> > > wrote, you probably should not be installing a debug kernel.
> > 
> > With all "due respect", how does that sort of judegmental gate-keeping help
> > solve the issue, or help anything?  You clearly know what's supposed to be
> > done, and you're instead choosing to deliberately withhold that information,
> > and instead look down your nose at someone rather than share it.
> 
> I am afraid you misunderstood my comment. I was not gatekeeping, I was
> giving genuine advice. Playing with kernels is not something that should be
> done if you do not know what you are doing. You could break your system.

so we're back at the original question:

If this needs to be done correctly, but nobody's willing to give any instruction on what the correct steps are, then how can anyone be expected to know what to do and contribute to a solution?

Unwitting or not - that's gatekeeping.

Comment 51 Jonathan Heitz 2023-10-20 15:49:34 UTC
There is currently no solution. That is why we are here: it is a bug, and it is being fixed by the developers.

Comment 52 Aoife Moloney 2023-11-23 01:26:53 UTC
This message is a reminder that Fedora Linux 37 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 37 on 2023-12-05.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '37'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 37 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 53 Mark Pearson 2023-12-20 14:26:45 UTC
Finally some updates, on both platforms.

T490 - FW team are releasing an update to advertise the mode correctly. They've tested and state it fixes the issue (running on a Fedora 6.6.6 kernel). I've asked for the trial BIOS myself to double confirm, and asked for confirmation on when the fix will be in a formal release BIOS.

L13 G2 - very messy, but after a few failed attempts the platform profiles are now working correctly. My team tested the trial BIOS last night and confirmed it is working. I've asked for confirmation on when the fix will be in a formal release BIOS.

As a heads up, formal release usually takes a few weeks to go thru the test and approval process. I don't think it will be impacted by the Christmas/New year holidays as much of this is being done in China - but some review does come from Japan, so that may add a bit of delay.

I will try and remember to update once I have the details - but if not keep an eye out for these showing up on LVFS.

Mark

Comment 54 capybara_overdose 2023-12-27 00:57:47 UTC
Too little, way too late.

I'm not seeing any updates available on my machine - it still just wants to "Update" to an older firmware version, which then fails on installation anyway.

I'm not putting any more effort into it. It's been replaced with a Dell and set aside for when consumer affairs investigates (along with the trackpad and battery issues). 

I'm still staggered by the sloppy execution in this and frankly not convinced it's just firmware (why did it work fine in Ubuntu?). But I'm done wasting any more time on it - Lenovo and Fedora are blacklisted now for me.

Comment 55 dzyndzla 2024-02-03 13:15:14 UTC
(In reply to Mark Pearson from comment #53)
> Finally some updates, on both platforms.
> 
> T490 - FW team are releasing an update to advertise the mode correctly.
> They've tested and state it fixes the issue (running on a Fedora 6.6.6
> kernel). I've asked for the trial BIOS myself to double confirm, and asked
> for confirmation on when the fix will be in a formal release BIOS.
> 
> L13 G2 - very messy, but after a few failed attempts the platform profiles
> are now working correctly. My team tested the trial BIOS last night and
> confirmed it is working. I've asked for confirmation on when the fix will be
> in a formal release BIOS.
> 
> As a heads up, formal release usually takes a few weeks to go thru the test
> and approval process. I don't think it will be impacted by the Christmas/New
> year holidays as much of this is being done in China - but some review does
> come from Japan so may add a bit of delay
> 
> I will try and remember to update once I have the details - but if not keep
> an eye out for these showing up on LVFS.
> 
> Mark

Hi Mark; could you please confirm whether the latest BIOS update for T490 -> 20N2XXXXX UEFI: LENOVO v: N2IETA3W (1.81 ) was supposed to include the fix you mentioned above? If so, then with the default kernel settings the power profile was still reverting to balanced. However, after passing 'thinkpad_acpi.profile_force=2' (PSC) as a kernel parameter, it seems to keep the performance power profile and the performance platform_profile. Is this the correct setup and intended behavior?
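To double-check that the parameter actually reached the running kernel, one option is to look for it in `/proc/cmdline`. This is only a sketch with made-up helper names; it shows what was passed at boot and says nothing about how the driver acted on it.

```python
from pathlib import Path


def cmdline_param(cmdline, name):
    """Return the value of a name=value token on a kernel command line, or None."""
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == name and sep:
            return value
    return None


def booted_value(name="thinkpad_acpi.profile_force", path="/proc/cmdline"):
    """Look the parameter up in the command line of the running kernel."""
    try:
        return cmdline_param(Path(path).read_text(), name)
    except OSError:
        return None


if __name__ == "__main__":
    value = booted_value()
    if value is None:
        print("thinkpad_acpi.profile_force was not passed at boot")
    else:
        print(f"thinkpad_acpi.profile_force={value}")
```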

Comment 56 Mark Pearson 2024-02-05 14:24:36 UTC
It's not I'm afraid - should be in the next one. I've asked for an ETA
Mark

Comment 57 dzyndzla 2024-03-13 19:06:37 UTC
(In reply to Mark Pearson from comment #56)
> It's not I'm afraid - should be in the next one. I've asked for an ETA
> Mark

Hi there Mark, is there any update on that?

Comment 58 Mark Pearson 2024-03-20 17:08:28 UTC
Yes! FW 1.82 with fix has been released to LVFS.
I gave it a quick sniff test on my system and it looks good. Let me know if you still see issues.
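A quick way to see whether a machine is already on that firmware is to parse the BIOS version string. The sketch below assumes `/sys/class/dmi/id/bios_version` reports a string in the same shape as the one quoted in comment 55, for example `N2IETA3W (1.81 )`, and that 1.82 is the threshold only for the T490 discussed here. The function names are illustrative.

```python
import re
from pathlib import Path

# Version named in the comment above; applies to the T490 only.
FIXED_IN = (1, 82)


def bios_version(text):
    """Extract (major, minor) from a string like 'N2IETA3W (1.81 )', or None."""
    match = re.search(r"\((\d+)\.(\d+)\s*\)", text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def has_fix(text, minimum=FIXED_IN):
    """True if the parsed version is at least the version carrying the fix."""
    version = bios_version(text)
    return version is not None and version >= minimum


if __name__ == "__main__":
    try:
        raw = Path("/sys/class/dmi/id/bios_version").read_text().strip()
    except OSError:
        raw = ""
    print(f"BIOS string: {raw!r}, includes fix: {has_fix(raw)}")
```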
Mark

Comment 59 dmillsap 2024-03-20 22:59:09 UTC
(In reply to Mark Pearson from comment #58)
> Yes! FW 1.82 with fix has been released to LVFS.
> I gave it a quick sniff test on my system and it looks good. Let me know if
> you still see issues.
> Mark

Hi Mark. Appreciate all you're doing on this. I know this is a Red Hat forum, but I am running kernel 5.15.0-101 on Linux Mint 21.3. I just pushed the 1.82 firmware to my T490 20N3 today and I still have the issue where the platform profile reverts to balanced. Would you be able to recommend a kernel I should update to for this fix to be enabled? I have options for kernels 6.2 and 6.5. Really appreciate it!

Dave

Comment 60 Mark Pearson 2024-03-21 18:40:22 UTC
Hmm - I don't think this should matter, though there were a few kernel fixes.

Could you just confirm how you see the problem - I did some testing on mine, leaving it in performance mode and confirming it stayed there.
And, just for clarification - it will drop (should be to balanced mode - but see note below) if the laptop thinks it's in lap mode (cat /sys/devices/platform/thinkpad_acpi/dytc_lapmode)

I was on Fedora 39 with a 6.5.6 kernel. Maybe to rule out a kernel issue you could boot from a live USB image?

Update: I was just giving it a more thorough testing, before posting, and am finding that if I'm in performance mode and trigger the lap sensor then it drops to low-power instead of balanced. I'm flagging this to the FW team. It wasn't switching when just idle.
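To tell a lap-mode drop apart from an unexplained one, the profile and the lapmode flag can be sampled together over time. The following is a minimal sketch, not a diagnostic tool: it assumes the generic `/sys/firmware/acpi/platform_profile` node is present, polls once per second, and labels a change as explained only when it starts from `performance` while lapmode reads `1`. All function names are hypothetical.

```python
import time
from pathlib import Path

LAPMODE = Path("/sys/devices/platform/thinkpad_acpi/dytc_lapmode")
# Generic ACPI platform-profile node (assumed to be present).
PROFILE = Path("/sys/firmware/acpi/platform_profile")


def read_or_none(path):
    """Return the stripped contents of a sysfs file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def explain_change(old, new, lapmode):
    """Classify a profile transition using the lapmode flag read right after it."""
    if old == new:
        return "no change"
    if old == "performance" and lapmode == "1":
        return "drop explained by lap mode"
    return "change not explained by lap mode"


def watch(samples=60, interval=1.0, read=read_or_none, sleep=time.sleep):
    """Poll the profile; return (old, new, lapmode, verdict) for every change seen."""
    events = []
    last = read(PROFILE)
    for _ in range(samples):
        sleep(interval)
        current = read(PROFILE)
        if current != last:
            lap = read(LAPMODE)
            events.append((last, current, lap, explain_change(last, current, lap)))
            last = current
    return events
```

Calling `watch()` right after selecting performance mode and leaving the machine untouched on a desk should return an empty list on a fixed system. Note that a profile change made by hand during the run is also reported as not explained by lap mode.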

Mark

Comment 61 dzyndzla 2024-03-29 09:01:06 UTC
Hi there,
finally! I can confirm it works! T490 20N2, EndeavourOS @ kernel 6.8.2.

Setting power-profiles-daemon to performance keeps performance, and even lap mode activation doesn't much affect CPU wattage. Occasionally lap mode turns on while the laptop is sitting steadily on a desk, but I have noticed that maybe twice. I have the scaling governor set to powersave by default, but if I switch the governor to performance and activate lap mode (by shaking the laptop), the governor sticks with performance, platform_profile sticks with performance, and power-profiles keeps performance (but shown as degraded, which is intentional I guess). Proof: https://i.imgur.com/vjwtrjp.png
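The same state can be collected without a screenshot. This sketch reads the cpufreq scaling governor of every CPU and the platform profile from their usual sysfs locations, both assumed to be present on this system. It does not query power-profiles-daemon itself, so the degraded flag is not covered.

```python
from collections import Counter
from pathlib import Path

# Generic ACPI platform-profile node (assumed to be present).
PROFILE = Path("/sys/firmware/acpi/platform_profile")


def summarize(governors):
    """Count how many CPUs use each governor, e.g. {'powersave': 8}."""
    return dict(Counter(governors))


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return the scaling governor of each CPU that exposes one."""
    values = []
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for node in sorted(Path(cpu_root).glob(pattern)):
        try:
            values.append(node.read_text().strip())
        except OSError:
            continue
    return values


def read_profile(path=PROFILE):
    """Return the current platform profile, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    print("governors:", summarize(read_governors()))
    print("platform_profile:", read_profile())
```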

Comment 62 Jonathan Heitz 2024-03-29 12:38:03 UTC
Hello all,

I can confirm as well, during my testing* I did not experience any issues. Thank you Mark and all who have contributed to fixing :)

Should I close this bug?

-Jon

* (Do note that my testing was very brief as I rarely use this machine anymore.)

Comment 63 Aoife Moloney 2024-11-08 10:49:34 UTC
This message is a reminder that Fedora Linux 39 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 39 on 2024-11-26.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '39'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version' 
to a later Fedora Linux version. Note that the version field may be hidden.
Click the "Show advanced fields" button if you do not see it.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora Linux 39 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.

Comment 64 Aoife Moloney 2024-11-27 21:08:50 UTC
Fedora Linux 39 entered end-of-life (EOL) status on 2024-11-26.

Fedora Linux 39 is no longer maintained, which means that it
will not receive any further security or bug fix updates. As a result we
are closing this bug.

If you can reproduce this bug against a currently maintained version of Fedora Linux
please feel free to reopen this bug against that version. Note that the version
field may be hidden. Click the "Show advanced fields" button if you do not see
the version field.

If you are unable to reopen this bug, please file a new report against an
active release.

Thank you for reporting this bug and we are sorry it could not be fixed.