Bug 2056131

Summary: Blank Display After Second Suspend/Resume Cycle with docked Lenovo P1 Gen 3: nouveau 0000:01:00.0: PM: failed to resume async: error -110
Product: [Fedora] Fedora
Reporter: Devan Goodwin <dgoodwin>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: unspecified
Priority: unspecified
Version: 35
CC: acaringi, adscvr, airlied, alciregi, aravindh, bskeggs, hdegoede, jarodwilson, jcall, jeremy, jglisse, jonathan, josef, kernel-maint, lgoncalv, linville, masami256, mchehab, ptalbert, steved, zulinx86
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2022-03-09 12:08:31 UTC
Type: Bug
Attachments:
journalctl over 20 minutes testing with three fails resumes (flags: none)

Description Devan Goodwin 2022-02-18 22:48:07 UTC
Created attachment 1861981 [details]
journalctl over 20 minutes testing with three fails resumes

1. Please describe the problem:

With a docked Lenovo P1 Gen 3: after a fresh boot I can suspend and resume successfully once, but the next resume leaves all enabled displays blank.

This line in journalctl (attached) seems to correspond with each failed resume:

Feb 18 14:51:40 berlinetta kernel: nouveau 0000:01:00.0: PM: failed to resume async: error -110
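
For anyone triaging: the kernel logs errno values negated, so the -110 above can be looked up in the errno table. A minimal Python sketch of that lookup follows. `describe_kernel_errno` is a hypothetical helper name, not part of any tool, and 110 corresponds to ETIMEDOUT only under Linux errno numbering.

```python
import errno
import os


def describe_kernel_errno(code: int) -> str:
    """Return 'NAME: message' for a kernel-style (negated) errno value."""
    number = abs(code)  # kernel log lines print the errno negated
    name = errno.errorcode.get(number, "UNKNOWN")
    return f"{name}: {os.strerror(number)}"


if __name__ == "__main__":
    # Assumption: run on a Linux host, where 110 is ETIMEDOUT.
    # Other platforms number their errno values differently.
    print(describe_kernel_errno(-110))
```

A timeout here would suggest the driver's resume callback gave up waiting on the device. That is an interpretation of the error code, not something the log confirms on its own.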

It does not appear to matter whether I use a single display (external monitor) or joined displays (laptop + external monitor); the problem occurs either way.

The laptop is a Lenovo P1 Gen 3 with both Intel and NVIDIA graphics; I am just using nouveau, the default.

❯ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation CometLake-H GT2 [UHD Graphics] (rev 05)
01:00.0 VGA compatible controller: NVIDIA Corporation TU117GLM [Quadro T2000 Mobile / Max-Q] (rev a1)

The docking station is a Lenovo Thunderbolt 3 Dock, model no. DBB9003L1. It is a little older than the laptop and, if it's relevant, does not provide enough power to it: I have to plug in its power cord as well, otherwise I get periodic dock disconnects (probably unrelated).


2. What is the Version-Release number of the kernel:

5.16.8


3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?  Old kernels are available for download at
   https://koji.fedoraproject.org/koji/packageinfo?packageID=8 :

I believe the problem was introduced with 5.15.17: https://koji.fedoraproject.org/koji/buildinfo?buildID=1909364

Tested good: 
5.15.11
5.15.15
5.15.16

Tested bad: 
5.15.17
5.16.7
5.16.8
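
The tested versions above bracket the regression. As a rough sketch (assuming plain dotted version strings; `parse_version` and `regression_window` are hypothetical helper names), the window can be computed like this:

```python
from typing import Iterable, Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn '5.15.17' into (5, 15, 17) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def regression_window(good: Iterable[str], bad: Iterable[str]) -> Tuple[str, str]:
    """Return (newest known-good, oldest known-bad) kernel version."""
    last_good = max(good, key=parse_version)
    first_bad = min(bad, key=parse_version)
    if parse_version(last_good) >= parse_version(first_bad):
        # A good result newer than a bad one means no clean window exists.
        raise ValueError("good and bad results overlap")
    return last_good, first_bad


# Versions exactly as listed in this report.
GOOD = ["5.15.11", "5.15.15", "5.15.16"]
BAD = ["5.15.17", "5.16.7", "5.16.8"]

if __name__ == "__main__":
    print(regression_window(GOOD, BAD))
```

With the versions listed here, the window is 5.15.16 (good) to 5.15.17 (bad), so the change most likely landed in the 5.15.17 stable update.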


4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

- Fresh boot the system while docked. (The system remains docked throughout.)
- Log in to GNOME and suspend the laptop from the GNOME menus.
- Wait for suspend to complete, then press the power button on the laptop to resume. This works fine.
- Give it a few seconds, then suspend again from the GNOME menus.
- Wait for suspend to complete, then press the power button on the laptop again.

This time no display comes on, and the following line appears in the journal, along with other details you can see in the attached journalctl output:

Feb 18 14:51:40 berlinetta kernel: nouveau 0000:01:00.0: PM: failed to resume async: error -110


5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:

The Rawhide kernel was far too broken to tell: on boot my keyboard and Bluetooth mouse were not working, and it suspended and then would not wake up at all.


6. Are you running any modules that are not shipped directly with Fedora's kernel?:

I don't think so, but I am attaching an lsmod output file.


7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

I reproduced this three times over 20 minutes trying to isolate the steps, all on kernel 5.16.8. These three occurrences can be seen in the attached log if you grep for the "failed to resume async" line. You will also see the successful first resumes in the logs after each reboot.
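
To pull those occurrences out of a saved log without reading it by hand, something like the following could work. It is only a sketch: `find_resume_failures` is a hypothetical helper, the pattern is derived from the single log line quoted in this report, and other kernels may word the message differently.

```python
import re

# Pattern modelled on the one log line quoted in this report (an assumption,
# not a documented kernel log format).
RESUME_FAILURE = re.compile(
    r"kernel: (?P<driver>\S+) (?P<device>[0-9a-f:.]+): "
    r"PM: failed to resume async: error (?P<code>-?\d+)"
)


def find_resume_failures(lines):
    """Return (driver, device, errno) for each failed-resume line."""
    hits = []
    for line in lines:
        match = RESUME_FAILURE.search(line)
        if match:
            hits.append(
                (match.group("driver"), match.group("device"), int(match.group("code")))
            )
    return hits


# The only sample is the line from this report; the attached log has more.
SAMPLE = [
    "Feb 18 14:51:40 berlinetta kernel: nouveau 0000:01:00.0: "
    "PM: failed to resume async: error -110",
]

if __name__ == "__main__":
    for driver, device, code in find_resume_failures(SAMPLE):
        print(driver, device, code)
```

To run it against the attachment, pass the lines of the saved journal file in place of `SAMPLE`.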

Comment 2 Devan Goodwin 2022-02-18 22:56:28 UTC
I should note that undocked, I can suspend and resume the laptop many times without issue, even on kernel 5.16.8; my display always comes on. Whatever this is seems related to the dock.

Comment 3 Aravindh Puthiyaparambil 2022-02-21 20:18:43 UTC
I have observed this issue without a dock, when my Lenovo P1 laptop is connected to an external display using USB-C/Thunderbolt.

Comment 4 Devan Goodwin 2022-03-09 12:08:31 UTC
I think the problem may have been resolved. I had been sticking with 5.15.16 for weeks, but after updating today I tried again and cannot reproduce the issue after booting 5.16.12-200.fc35.x86_64.

Going to close for now.