Bug 1735786

Summary: Latest kernels have regressions (Suspend & Reboot)
Product: [Fedora] Fedora
Reporter: naaa <bareye4583>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: CLOSED CURRENTRELEASE
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 30
CC: airlied, bskeggs, hdegoede, ichavero, itamar, jarodwilson, jeremy, jforbes, jglisse, john.j5live, jonathan, josef, kernel-maint, linville, mchehab, mjg59, steved
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-08-07 18:12:23 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Bug Depends On: 1737046
Bug Blocks:
Description Flags
dmesg from previous boot (-b) none

Description naaa 2019-08-01 12:34:58 UTC
1. Please describe the problem:

The system has problems suspending and rebooting.

2. What is the Version-Release number of the kernel:

5.1.20-300.fc30.x86_64 and newer

3. Did it work previously in Fedora? If so, what kernel version did the issue
   *first* appear?

5.1.20-300.fc30.x86_64. 5.1.19 is good; the regressions started with 5.1.20 and persist through at least 5.2.5.

4. Can you reproduce this issue? If so, please provide the steps to reproduce
   the issue below:

Suspend: Suspend once, resume, start a 3D game and play for a few minutes, exit the game, open a page in a web browser, then suspend again. The second suspend fails every time. (There may be other ways to trigger it, but this one is 100% reproducible.)

Reboot: Try rebooting after the suspend fails. The system stays on the rebooting text and never reboots. (There may be other ways to trigger it, but this one is 100% reproducible.)

5. Does this problem occur with the latest Rawhide kernel? To install the
   Rawhide kernel, run ``sudo dnf install fedora-repos-rawhide`` followed by
   ``sudo dnf update --enablerepo=rawhide kernel``:


6. Are you running any modules that are not shipped directly with Fedora's kernel?:

N/A. This Fedora install has few customizations; it uses Negativo17's Nvidia driver.

7. Please attach the kernel logs. You can get the complete kernel log
   for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
   issue occurred on a previous boot, use the journalctl ``-b`` flag.

N/A. Not running this kernel at the moment; can update later if needed.

Comment 1 naaa 2019-08-01 13:16:47 UTC
Created attachment 1597260 [details]
dmesg from previous boot (-b)

Kernel 5.2.5 dmesg after trying to suspend (it failed to suspend and went to a black screen; other times it shows the GUI again).

Comment 2 naaa 2019-08-01 13:18:00 UTC
Simplified repro for suspend: just enter the game and then exit; there is no need to play for a few minutes.

Comment 3 Justin M. Forbes 2019-08-01 16:54:19 UTC
Can you reproduce this without the nvidia driver installed?  Looks like the nvidia driver updated around the same time. We cannot support the nvidia binary driver as it is closed source.  If you cannot reproduce without the nvidia driver running, please close this bug and contact nvidia.

Comment 4 naaa 2019-08-01 19:33:36 UTC
...and it's going to be more complicated than that. I have to find a 3D game that works with Nouveau and still produces the problem. So far it looks like it has to be a DX11 game running through Steam Proton, so in theory Steam Proton could be the issue, causing something that makes the OS refuse to suspend.

I will leave this open for now. It can be closed if someone decides to close it, if responsibility is taken up elsewhere, if it gets fixed, or something similar.

Thanks for the reply; it might help narrow this down further. I will spend more time testing.

Comment 5 naaa 2019-08-02 15:33:04 UTC
A 3D game is not needed (further reduction of the repro steps). Also, https://bugzilla.redhat.com/show_bug.cgi?id=1737046 happens after the second resume when Nouveau is in use.

$ sudo dnf remove '*nvidia*'
$ lspci -nnk | grep -iA2 vga
Kernel driver in use: nouveau

So it is apparent that the closed source Nvidia driver is not at fault.
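
The driver check above can also be scripted. The following is a minimal sketch and not part of the original report: `gpu_driver` is a hypothetical helper name, and it assumes the usual `lspci -nnk` layout, where each device line is followed by indented detail lines.

```shell
#!/bin/sh
# gpu_driver (hypothetical helper): read `lspci -nnk` output on stdin and
# print the kernel driver bound to the first VGA device, if any.
gpu_driver() {
    awk '
        # A device line naming a VGA controller opens the block we care about.
        /VGA compatible controller/ { in_vga = 1; next }
        # Any other unindented line starts a different device.
        in_vga && /^[^[:space:]]/ { in_vga = 0 }
        # Inside the VGA block, report the bound driver and stop.
        in_vga && /Kernel driver in use:/ {
            sub(/.*Kernel driver in use:[[:space:]]*/, "")
            print
            exit
        }
    '
}

# Only query real hardware when lspci is available.
if command -v lspci >/dev/null 2>&1; then
    lspci -nnk | gpu_driver
fi
```

On a machine in the state described above, this would be expected to print `nouveau`; an empty result means no VGA device with a bound driver was found.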

Comment 6 naaa 2019-08-06 20:30:22 UTC
Looks to be solved with kernel 5.2.6-200. I will close this in a day or two if the problem stays gone, which I fully expect.

Comment 7 naaa 2019-08-07 18:11:34 UTC
Fixed with 5.2.6-200.

Thank you to all who got this thing fixed :-)
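
For anyone checking whether a machine is running a kernel in the range reported in this bug, here is a rough sketch. It assumes that every version from 5.1.20 up to, but not including, 5.2.6 is affected; the report only confirms 5.1.20 and 5.2.5 as bad and 5.1.19 and 5.2.6-200 as good. The helper names `version_lt` and `in_affected_range` are hypothetical, and `sort -V` is assumed to be available (it is in GNU coreutils).

```shell
#!/bin/sh
# version_lt A B (hypothetical helper): succeed when A sorts strictly before B.
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# in_affected_range VERSION (hypothetical helper): succeed when VERSION lies
# in [5.1.20, 5.2.6). The bounds are an assumption drawn from this report.
in_affected_range() {
    v=${1%%-*}    # drop the release suffix, e.g. -300.fc30.x86_64
    ! version_lt "$v" "5.1.20" && version_lt "$v" "5.2.6"
}

if in_affected_range "$(uname -r)"; then
    echo "Running kernel is inside the range reported in this bug."
else
    echo "Running kernel is outside the range reported in this bug."
fi
```

Only the upstream version is compared, so Fedora release suffixes such as `-200` are ignored.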