Bug 1920960 - resume from suspend to RAM does not work on Thinkpad X1 Nano
Summary: resume from suspend to RAM does not work on Thinkpad X1 Nano
Keywords:
Status: CLOSED EOL
Alias: None
Product: Fedora
Classification: Fedora
Component: kernel
Version: 33
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: Kernel Maintainer List
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-01-27 09:55 UTC by Milos Jakubicek
Modified: 2022-11-14 15:03 UTC

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-11-30 18:50:52 UTC
Type: Bug
Embargoed:


Attachments
dmesg output from boot after suspend with pm_trace and secure boot enabled (158.76 KB, text/plain)
2021-01-27 09:55 UTC, Milos Jakubicek
dmesg output from boot after suspend with pm_trace and secure boot disabled (148.94 KB, text/plain)
2021-01-27 09:56 UTC, Milos Jakubicek
dmesg output from boot after suspend with pm_trace and secure boot disabled and microcode_ctl downgraded (90.17 KB, text/plain)
2021-01-27 09:57 UTC, Milos Jakubicek

Description Milos Jakubicek 2021-01-27 09:55:24 UTC
Created attachment 1751204 [details]
dmesg output from boot after suspend with pm_trace and secure boot enabled

1. Please describe the problem:

Resume from STR doesn't work on a brand new Thinkpad X1 Nano

2. What is the Version-Release number of the kernel:

kernel-5.10.9-201.fc33.x86_64

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

N/A. I tried fc33 builds of kernel 3.6 (did not even boot) and 3.7 (hangs shortly after boot); none of 3.8, 3.9 and 3.10 makes STR work.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Yes

I've tried to debug this following https://www.kernel.org/doc/html/latest/power/basic-pm-debugging.html

- writing "freezer" to /sys/power/pm_test works, but "platform" does not.
- after echo 1 > /sys/power/pm_trace, a suspend attempt and a reboot, grepping dmesg for "hash matches" yields:

"memory memory48: hash matches"

There are also a couple of firmware errors earlier in the log, but those seem to be an independent issue (downgrading microcode_ctl to 2.1-40 makes them go away but STR still does not work).
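
For anyone repeating the pm_trace step, the grep for "hash matches" can be wrapped in a small helper. This is only a sketch: the function name pm_trace_matches is made up for illustration, and it assumes dmesg-style lines, optionally prefixed with a bracketed timestamp.

```shell
# Hypothetical helper (not part of any tool): read a kernel log on stdin and
# print the device named on each pm_trace "hash matches" line.
pm_trace_matches() {
    grep 'hash matches' |
        sed -e 's/^\[[^]]*] *//' -e 's/: hash matches.*$//'
}
```

After a failed resume and a reboot, `dmesg | pm_trace_matches` would reduce the line quoted above to `memory memory48`.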


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:


It does, tested 5.11.0-0.rc4.129.fc34

6. Are you running any modules that are not shipped directly with Fedora's kernel?:

No

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

Attached files are dmesg outputs from boots after a suspend attempt (with 1 written to pm_trace)

dmesg - with secure boot enabled
dmesg2 - with secure boot disabled
dmesg3 - with secure boot disabled and microcode_ctl downgraded to 2.1-40

Comment 1 Milos Jakubicek 2021-01-27 09:56:21 UTC
Created attachment 1751206 [details]
dmesg output from boot after suspend with pm_trace and secure boot disabled

Comment 2 Milos Jakubicek 2021-01-27 09:57:07 UTC
Created attachment 1751207 [details]
dmesg output from boot after suspend with pm_trace and secure boot disabled and microcode_ctl downgraded

Comment 3 Hans de Goede 2021-01-27 20:10:09 UTC
I wonder what suspend mode is being used. Modern systems support two modes: S3 suspend (aka deep mode) and s2idle. Windows always uses s2idle nowadays, so s2idle typically works better, as that is what vendors actually test. But for some reason some Linux users still try to switch to S3 mode because older (much older) Linux versions did not support s2idle.

You can see which mode you are using by doing:

cat /sys/power/mem_sleep

On my system this outputs the following:

[hans@x1 ~]$ cat /sys/power/mem_sleep
[s2idle] deep

This means that both modes are supported and s2idle is being used. If yours
is set to "deep" (aka S3) mode, and s2idle mode is listed too, you can
change this by doing:

sudo sh -c 'echo s2idle > /sys/power/mem_sleep'

And then do: "cat /sys/power/mem_sleep" again to check the setting changed.

(if you are using s2idle you could try switching to deep mode too)


You may also have a BIOS setting for this. Enter your BIOS settings; in the "Config" pane there might very well be a "Sleep State" setting or some such. This can be set to either "Windows" (s2idle) or "Linux" (S3); the Linux setting really is only necessary when running older kernels. If you have such an option, it might be worthwhile to try both settings, as this might change some things under the hood which are not accessible through /sys/power/mem_sleep.
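
The mem_sleep check described above can also be scripted. The following is a minimal sketch; the function names active_sleep_mode and supports_sleep_mode are hypothetical, and the optional file argument exists only so the parsing can be tried on a saved copy instead of the real sysfs file.

```shell
# Hypothetical helpers for reading /sys/power/mem_sleep.

# Print the mode shown in square brackets, i.e. the one currently in use.
# $1: file to read (defaults to the real sysfs file).
active_sleep_mode() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p' "${1:-/sys/power/mem_sleep}"
}

# Succeed if mode $1 (e.g. s2idle or deep) is listed at all.
# $2: file to read (defaults to the real sysfs file).
supports_sleep_mode() {
    sed 's/[][]//g' "${2:-/sys/power/mem_sleep}" | tr ' ' '\n' | grep -qx -- "$1"
}
```

With the `[s2idle] deep` output shown above, `active_sleep_mode` would print `s2idle`, and `supports_sleep_mode deep` would succeed.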

Comment 4 Milos Jakubicek 2021-01-28 00:06:38 UTC
A few more observations:

- it is actually STI (suspend to idle) not STR that does not work
- STR is not even announced by the device:

>sudo cat /sys/power/mem_sleep 
[s2idle]
>sudo dmesg | grep ACPI | grep supports
[    0.704898] ACPI: (supports S0 S4 S5)

- writing both "freezer" and "devices" to /sys/power/pm_test works (the device suspends and resumes after 5 seconds)
- but echo freeze > /sys/power/state (with pm_test set to "none") fails
- leaving the device untouched for a couple of minutes after attempting to resume actually makes it reboot; after the reboot, the FnLock, F1 and F4 LEDs flash wildly for a couple of seconds and then the device boots.

I can see why STR/S3 does not work (apparently this is still an issue for the Thinkpad X1 family), but it's weird that STI/S0 does not work either.
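
The check against the ACPI "supports" line can be sketched as a small filter. The name acpi_supports is hypothetical, and the sketch assumes the kernel log contains a line of the form shown above, with the sleep states listed in parentheses after the word "supports".

```shell
# Hypothetical helper: read a kernel log on stdin and succeed if the ACPI
# "(supports ...)" line lists the sleep state given as $1 (e.g. S3).
acpi_supports() {
    grep '(supports S' |
        sed 's/.*(supports \(.*\)).*/\1/' |
        tr ' ' '\n' |
        grep -qx -- "$1"
}
```

On this machine, `dmesg | acpi_supports S3` would fail, since the line above lists only S0, S4 and S5.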

Comment 5 Milos Jakubicek 2021-01-28 00:15:40 UTC
Thanks Hans, I only noticed your comment after writing mine. Reading some docs left me with the impression that s2idle is just a kind of fallback when deep is not available; good to know it's not quite like that.

I searched the BIOS but haven't found anything even remotely related to suspend.

Comment 6 Milos Jakubicek 2021-01-31 00:01:15 UTC
Okay, so I've managed to track down the cause: it's the LTE modem.

I basically followed this guide: https://01.org/blogs/rzhang/2015/best-practice-debug-linux-suspend/hibernate-issues
While tweaking /proc/acpi/wakeup, I disabled everything that had been enabled, and suddenly suspend started working like a charm.
Going one by one through the "enabled" entries in /proc/acpi/wakeup, I found out that disabling wakeup for the PCIe root port RP01 is what makes suspend work, so that I ended up with this:

RP01      S4    *disabled  pci:0000:00:1c.0
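
The list of candidates for such a one-by-one search can be extracted with awk. This is a sketch under assumptions: the function name enabled_wakeup_devices is hypothetical, and it assumes the column layout of the line above (device name first, status third).

```shell
# Hypothetical helper: print the name of every device whose wakeup status
# is "*enabled".
# $1: file to read (defaults to /proc/acpi/wakeup).
enabled_wakeup_devices() {
    awk '$3 == "*enabled" { print $1 }' "${1:-/proc/acpi/wakeup}"
}
```

Each printed name can then be tried in turn. Writing a device name to /proc/acpi/wakeup as root (for example `echo RP01 > /proc/acpi/wakeup`) toggles that device between enabled and disabled.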

Looking into the output of lspci -PP -nn -k for 00:1c.0, I got:

00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:a0b8] (rev 20)
        Kernel driver in use: pcieport
00:1c.0/08:00.0 Wireless controller [0d40]: Intel Corporation XMM7360 LTE Advanced Modem [8086:7360] (rev 01)
        Subsystem: Device [1cf8:8521]


The XMM7360 is actually a Fibocom L850-GL for which there is currently no PCIe driver in the upstream kernel.
I found this driver https://github.com/xmm7360/xmm7360-pci.git and got it working in the sense that I now have this:

00:1c.0/08:00.0 Wireless controller [0d40]: Intel Corporation XMM7360 LTE Advanced Modem [8086:7360] (rev 01)
        Subsystem: Device [1cf8:8521]
        Kernel driver in use: xmm7360

I haven't actually tested using the modem yet, but regardless of whether the driver module is loaded or not, suspend
only works if the ACPI wakeup for RP01 is disabled.

I'm not sure whether this is actually a kernel bug, but it seems so: device-suspend issues should generally be solvable by rmmod-ing the module, but
here that's not enough. Or is it perhaps an issue with the pcieport driver used for RP01?

Side note for anybody facing similar issues: I feared I had bricked the device, because while I was playing with the ACPI wakeups, the laptop suddenly went dead after a few hours of running.
Fortunately, it was enough to push the motherboard emergency reset button (there is even a hole for it in the chassis).

Comment 7 Ant 2021-04-26 15:45:27 UTC
Hello.

Milos Jakubicek, I have the same issue with my X1 Nano.
If I don't activate my LTE, everything works fine.
If I run this script - https://github.com/xmm7360/xmm7360-pci/blob/master/scripts/lte.sh - it can't wake up from sleep.

What are we going to do about this? Any thoughts?

Comment 8 Milos Jakubicek 2021-04-27 07:09:40 UTC
I haven't investigated this any further since I got suspend working by disabling the ACPI wakeup for the modem.
I will try to get a USB serial console working to see exactly what's going on, if that is possible, but not anytime soon.

Comment 9 Ben Cotton 2021-11-04 13:57:46 UTC
This message is a reminder that Fedora 33 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora 33 on 2021-11-30.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
Fedora 'version' of '33'.

Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, simply change the 'version' 
to a later Fedora version.

Thank you for reporting this issue and we are sorry that we were not 
able to fix it before Fedora 33 is end of life. If you would still like 
to see this bug fixed and are able to reproduce it against a later version 
of Fedora, you are encouraged to change the 'version' to a later Fedora 
version before this bug is closed as described in the policy above.

Although we aim to fix as many bugs as possible during every release's 
lifetime, sometimes those efforts are overtaken by events. Often a 
more recent Fedora release includes newer upstream software that fixes 
bugs or makes them obsolete.

Comment 12 Ben Cotton 2021-11-30 18:50:52 UTC
Fedora 33 changed to end-of-life (EOL) status on 2021-11-30. Fedora 33 is
no longer maintained, which means that it will not receive any further
security or bug fix updates. As a result we are closing this bug.

If you can reproduce this bug against a currently maintained version of
Fedora please feel free to reopen this bug against that version. If you
are unable to reopen this bug, please file a new report against the
current release. If you experience problems, please add a comment to this
bug.

Thank you for reporting this bug and we are sorry it could not be fixed.

Comment 13 Ian Newton 2022-11-14 15:03:59 UTC
Though this is a bit late to the party, the above analysis by Milos helped me to fix suspend/resume on a Lenovo Yoga 920 with an updated wireless card (Intel AX210). Wakeup would often result in a blinking CapsLk LED followed by a reboot. So the following, based on Milos's guide, may help others who have similar hardware. I'm mostly using systemctl hybrid-sleep and systemctl hibernate for suspend/resume, so I am able to use this hook, saved as /usr/lib/systemd/system-sleep/suspend.sh:
```
#!/bin/sh
#
# This script should prevent suspend errors.
# Put it in /usr/lib/systemd/system-sleep/suspend.sh
# The PCI 00:14.0 device is the USB xHCI controller.
# Writing a device name to /proc/acpi/wakeup toggles its wakeup state.
#

if [ "${1}" = "pre" ]; then
#  Before suspend: disable wakeup for devices that are currently enabled.
   grep 'XHC.*enable' /proc/acpi/wakeup && echo XHC > /proc/acpi/wakeup
   grep 'RP05.*enable' /proc/acpi/wakeup && echo RP05 > /proc/acpi/wakeup
elif [ "${1}" = "post" ]; then
#  After resume: re-enable wakeup for devices that are currently disabled.
   grep 'XHC.*disable' /proc/acpi/wakeup && echo XHC > /proc/acpi/wakeup
   grep 'RP05.*disable' /proc/acpi/wakeup && echo RP05 > /proc/acpi/wakeup
fi
```

I used cat /proc/acpi/wakeup and lspci -PP -nn -k to identify the devices: in my case XHC, which is the USB3 subsystem, and RP05, which is the Intel wireless card.
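
Because writing a name to /proc/acpi/wakeup toggles the state rather than setting it, a hook like the one above has to check the current state first. A possible way to factor that check out is sketched below. The function names wakeup_status and set_wakeup are hypothetical, and the sketch assumes the usual column layout of /proc/acpi/wakeup (device name first, status third).

```shell
# Hypothetical helpers for an idempotent "set" on top of the toggle-only
# /proc/acpi/wakeup interface.

# Print "enabled" or "disabled" for device $1; print nothing if not listed.
# $2: file to read (defaults to /proc/acpi/wakeup).
wakeup_status() {
    awk -v dev="$1" '$1 == dev { sub(/^\*/, "", $3); print $3; exit }' \
        "${2:-/proc/acpi/wakeup}"
}

# Bring device $1 to state $2 ("enabled" or "disabled").
# The name is written (which toggles the state) only when the current
# state differs from the wanted one. Needs root for the real file.
set_wakeup() {
    current=$(wakeup_status "$1" "${3:-/proc/acpi/wakeup}")
    [ -n "$current" ] || return 1          # device not listed
    [ "$current" = "$2" ] && return 0      # already in the wanted state
    echo "$1" > "${3:-/proc/acpi/wakeup}"
}
```

In a sleep hook this would be called as `set_wakeup XHC disabled` in the "pre" branch and `set_wakeup XHC enabled` in the "post" branch.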

