Bug 2189510

Summary: Resume after suspend fails on kernel >= 6.2
Product: [Fedora] Fedora
Reporter: Mark Knoop <mark>
Component: kernel
Assignee: Kernel Maintainer List <kernel-maint>
Status: NEW
QA Contact: Fedora Extras Quality Assurance <extras-qa>
Severity: high
Docs Contact:
Priority: unspecified
Version: 38
CC: acaringi, adscvr, airlied, alciregi, bskeggs, hdegoede, hpa, jarodwilson, jglisse, josef, kernel-maint, lgoncalv, linville, mark, masami256, mchehab, ptalbert, publicperson, steved
Target Milestone: ---
Keywords: Regression
Target Release: ---
Hardware: x86_64
OS: Linux
Attachments:
  Description: Call trace after successful resume
  Flags: none

Description Mark Knoop 2023-04-25 13:29:23 UTC
My Dell XPS 13 (9350) laptop has been running Fedora since 2015 and has never had any issues with suspend/resume.

Beginning with kernel 6.2.8-200 on Fedora 37, resume after suspend failed frequently. The first suspend-resume cycle after a reboot would often work, but subsequent attempts would fail. I tried to debug this without success, then did a clean install of Fedora 38.

The same problem occurred with kernels 6.2.9-300 and 6.2.11-300. One successful resume with this kernel generated a call trace, which is attached. However, this trace does not always appear.

I have installed older kernels from koji. The problem starts to appear with the first 6.2 kernel:

- 6.0.0-0.rc2.20220824gitc40e8341e3b3.23 working suspend-resume
- 6.1.0-0.rc8.20221209git0d1409e4ff08.62 working suspend-resume
- 6.2.0-0.rc2.18 resume fails after suspend
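
The working/failing boundary in the list above can be checked mechanically when testing further koji builds. The following is a minimal Python sketch, assuming the version strings begin with `major.minor`; the helper name `kernel_at_least` is hypothetical and not part of any Fedora tool.

```python
import re

def kernel_at_least(version, minimum=(6, 2)):
    """Return True if a kernel version string is at or above `minimum`.

    Only the leading major.minor pair is compared; the rest of the
    string (patch level, rc tag, git snapshot, build number) is ignored.
    """
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {version!r}")
    return (int(match.group(1)), int(match.group(2))) >= minimum

# Kernel builds tested in this report.
tested = [
    "6.0.0-0.rc2.20220824gitc40e8341e3b3.23",
    "6.1.0-0.rc8.20221209git0d1409e4ff08.62",
    "6.2.0-0.rc2.18",
]
for version in tested:
    status = "affected range" if kernel_at_least(version) else "known good"
    print(f"{version}: {status}")
```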


Reproducible: Sometimes

Steps to Reproduce:
1. Install a kernel version >= 6.2
2. Reboot, then suspend
3. Resume is successful ~50% of the time
4. Suspend again
5. Resume is almost never successful after two suspend cycles
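
The steps above could be automated roughly as follows. This is only a sketch, assuming util-linux `rtcwake` is installed and the script runs as root; `rtcwake -m mem -s N` suspends the machine and sets an RTC alarm to wake it N seconds later. The function names and the log file name are hypothetical. Because a failed resume hangs the machine, the log is synced to disk before each suspend so that the last entry shows which cycle failed.

```python
import os
import subprocess
import time

def rtcwake_command(seconds=30, mode="mem"):
    """Build the rtcwake command line for one suspend/wake cycle."""
    return ["rtcwake", "-m", mode, "-s", str(seconds)]

def run_cycles(cycles=5, seconds=30, log_path="suspend-cycles.log",
               runner=subprocess.run):
    """Suspend and wake `cycles` times; return the number of completed cycles.

    `runner` is injectable so the loop can be exercised without
    actually suspending the machine.
    """
    completed = 0
    with open(log_path, "a") as log:
        for cycle in range(1, cycles + 1):
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} cycle {cycle}: suspending\n")
            # Force the entry to disk; a hung resume would otherwise lose it.
            log.flush()
            os.fsync(log.fileno())
            runner(rtcwake_command(seconds), check=True)
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            log.write(f"{stamp} cycle {cycle}: resumed\n")
            completed += 1
    return completed

# Example (as root): run_cycles(cycles=3, seconds=60)
```

After a hard reset, a final "suspending" entry without a matching "resumed" entry identifies the cycle that failed.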

Comment 1 Mark Knoop 2023-04-25 13:30:22 UTC
Created attachment 1959804 [details]
Call trace after successful resume

Comment 2 lukastymo 2023-07-20 15:06:59 UTC
Dell XPS 15 has precisely the same issue. The only difference is that I see the error almost every time I try to resume.

It happens with both s2idle and deep suspend. I can reproduce it almost 90% of the time, and I never get two successful resumes within a day.

I'm looking for a workaround. So far, none of the following has helped:
- disable Bluetooth in BIOS
- switch to Nouveau
- switch to proprietary Nvidia drivers
- change s2idle -> deep

Steps to Reproduce and my observations:
- Suspend
- Wait a longer period (~1h)
- Press the keyboard on the laptop (nothing)
- Press the power button (nothing)
- Press many keys a couple of times (nothing, black screen)
- After 30 seconds of pressing keys, the CPU goes to 100%, the fans become very loud, as if the laptop were stuck in some kind of intensive computation loop, and the whole laptop heats up very quickly. All monitors stay off. I restart immediately to avoid overheating.
- It feels like the longer I wait on suspend, the less likely I will be able to resume. 

Details of my setup that may make the issue easier to reproduce:
- 2 external monitors, 64GB RAM.

With previous Fedora releases and kernels, s2idle worked fine and deep never worked. Now deep works sometimes, but s2idle fails in about 90% of cases.
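
When comparing s2idle and deep, it helps to confirm which variant is actually selected. The kernel lists the available modes in `/sys/power/mem_sleep` and marks the active one with square brackets (for example `s2idle [deep]`). The following is a minimal Python sketch for reading that file; the function names are hypothetical.

```python
def parse_mem_sleep(text):
    """Parse the contents of /sys/power/mem_sleep.

    Returns (active, available): the bracketed mode, or None if no
    mode is bracketed, and the list of all modes in file order.
    """
    active = None
    available = []
    for mode in text.split():
        if mode.startswith("[") and mode.endswith("]"):
            mode = mode[1:-1]
            active = mode
        available.append(mode)
    return active, available

def read_mem_sleep(path="/sys/power/mem_sleep"):
    """Read and parse the suspend mode file on a running system."""
    with open(path) as handle:
        return parse_mem_sleep(handle.read())

# Example: active, available = read_mem_sleep()
```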