Bug 1829096
| Summary: | [regression in 5.6] hang after resuming from suspend | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Dominik 'Rathann' Mierzejewski <dominik> |
| Component: | kernel | Assignee: | Kernel Maintainer List <kernel-maint> |
| Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 32 | CC: | airlied, bskeggs, hdegoede, ichavero, itamar, jarodwilson, jeremy, jglisse, john.j5live, jonasd, jonathan, josef, kernel-maint, linville, luzmaximilian, masami256, mchehab, mjg59, steved, williambader |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-05-05 10:07:10 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |
Description
Dominik 'Rathann' Mierzejewski
2020-04-28 21:08:12 UTC
5.6.7-300.fc32 also suffers from this. I managed to get an Oops with a partial backtrace after adding no_console_suspend to the kernel command line:

[ 86.898573] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 86.900150] OOM killer disabled.
[ 86.900165] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[ 86.901579] wlp1s0: deauthenticating from aa:bb:cc:dd:ee:ff by local choice (Reason: 3=DEAUTH_LEAVING)
[ 86.907405] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[ 86.910092] sd 0:0:0:0: [sda] Stopping disk
[ 86.939714] Removing pn544
[ 87.216904] PM: suspend devices took 0.315 seconds
[ 87.230238] ACPI: EC: interrupt blocked
[ 87.244083] ACPI: Preparing to enter system sleep state S3
[ 87.245040] ACPI: EC: event blocked
[ 87.245046] ACPI: EC: EC stopped
[ 87.245049] PM: Saving platform NVS memory
[ 87.245306] Disabling non-boot CPUs ...
[ 87.246142] IRQ 16: no longer affine to CPU1
[ 87.247209] smpboot: CPU 1 is now offline
[ 87.252696] IRQ 45: no longer affine to CPU2
[ 87.253848] smpboot: CPU 2 is now offline
[ 87.256900] IRQ 23: no longer affine to CPU3
[ 87.256902] IRQ 43: no longer affine to CPU3
[ 87.256905] IRQ 49: no longer affine to CPU3
[ 87.257919] smpboot: CPU 3 is now offline
[ 87.261044] ACPI: Low-level resume complete
[ 87.261115] ACPI: EC: EC started
[ 87.261118] PM: Restoring platform NVS memory
[ 87.265116] Enabling non-boot CPUs ...
[ 87.265175] x86: Booting SMP configuration:
[ 87.265179] smpboot: Booting Node 0 Processor 1 APIC 0x2
[ 87.268461] CPU1 is up
[ 87.268508] smpboot: Booting Node 0 Processor 2 APIC 0x1
[ 87.269612] CPU2 is up
[ 87.269649] smpboot: Booting Node 0 Processor 3 APIC 0x3
[ 87.270604] CPU3 is up
[ 87.272498] ACPI: Waking up from system sleep state S3
[ 87.274019] ACPI: button: The lid device is not compliant to SW_LID.
[ 87.274453] ACPI: EC: interrupt unblocked
[ 87.298871] ACPI: EC: event unblocked
[ 87.309328] sd 0:0:0:0: [sda] Starting disk
[ 87.313944] sony_laptop: Invalid acpi_object: expected 0x1 got 0x3
[ 87.314115] sony_laptop: Invalid acpi_object: expected 0x1 got 0x3
[ 87.315003] sony_laptop: Invalid acpi_object: expected 0x1 got 0x3
[ 87.315012] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 87.315014] #PF: supervisor read access in kernel mode
[ 87.315015] #PF: error_code(0x0000) - not-present page
[ 87.315016] PGD 0 P4D 0
[ 87.315019] Oops: 0000 [#1] SMP PTI
[ 87.315021] CPU: 0 PID: 1796 Comm: systemd-sleep Not tainted 5.6.7-300.fc32.x86_64 #1
[ 87.315023] Hardware name: Sony Corporation SVP1322C5E/VAIO, BIOS R2091V7 03/24/2014
[ 87.315028] RIP: 0010:sony_nc_resume+0x1de/0x200 [sony_laptop]
[ 87.315030] Code: ff ff ff e9 40 ff ff ff 4c 89 e2 be 00 01 00 00 bf 22 01 00 00 e8 f2 df ff ff 85 c0 75 23 0f b6 44 24 0c 48 8b 15 12 97 00 00 <8b> 3a 39 c7 0f 84 14 ff ff ff 0f b7 ff e8 30 e0 ff ff e9 07 ff ff
[ 87.315032] RSP: 0018:ffffa98140953d10 EFLAGS: 00010282
[ 87.315034] RAX: 00000000fffffffb RBX: 000000000000000f RCX: 000000000000937d
[ 87.315036] RDX: 0000000000000000 RSI: c5f11fee0037bf32 RDI: 0000000000030080
[ 87.315037] RBP: ffff8c69963ab260 R08: 0000000000000461 R09: 0000000000000029
[ 87.315039] R10: ffff8c696c4a59a0 R11: 0000000000000000 R12: ffffa98140953d1c
[ 87.315040] R13: 0000000000000000 R14: ffffffffba3df2e1 R15: 0000000000000010
[ 87.315042] FS: 00007f45e0ab2b80(0000) GS:ffff8c6997a00000(0000) knlGS:0000000000000000
[ 87.315044] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 87.315045] CR2: 0000000000000000 CR3: 00000001fcc7a001 CR4: 00000000001606f0
[ 87.315047] Call Trace:
[ 87.315052] ? _cond_resched+0x16/0x40
[ 87.315054] ? sony_nc_thermal_mode_show+0x60/0x60 [sony_laptop]
[ 87.315057] dpm_run_callback+0x4f/0x140
[ 87.315059] device_resume+0x136/0x200
[ 87.315062] dpm_resume+0xce/0x2e0
[ 87.315064] dpm_resume_end+0xd/0x20
[ 87.628326] ata1.00: configured for UDMA/133 SControl 300)ci_hcdType 0x1000000000000000f ff ff 85 c0 75 23 0f b6 44 24 0c 48 8b 15 12 97 00 00 <8b> 3a 39 c7 0f 84 14 ff ff ff 0f b7 ff e8 30 e0 ff ffimer mei_me snd mei soundc

Ok, so this seems to be a bug in the sony-laptop driver. Let's try blacklisting that as a first step towards debugging this. Can you try adding "modprobe.blacklist=sony-laptop" to your kernel command line?

p.s. After adding the kernel command line option and rebooting, please run "lsmod | grep sony". This should not show sony-laptop; if it does, then the blacklisting did not work for some reason.

Plain modprobe.blacklist doesn't work, but rd.driver.blacklist and "blacklist sony-laptop" in /etc/modprobe.d/sony-laptop.conf does work. Also, resume works fine without the sony-laptop module loaded. Thanks.
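The "lsmod | grep sony" check requested above can also be scripted. The following is a minimal sketch, not something posted in this bug: the helper name module_loaded is hypothetical, and it assumes the usual /proc/modules layout where the first field of each line is the module name. It normalizes hyphens to underscores, because loaded modules are listed as sony_laptop rather than sony-laptop.

```shell
#!/bin/sh
# module_loaded NAME [MODULES_FILE]   (hypothetical helper, for illustration)
# Returns 0 if NAME is listed in MODULES_FILE (default: /proc/modules).
# Loaded modules are listed with underscores, so "sony-laptop" is
# normalized to "sony_laptop" before matching.
module_loaded() {
    name=$(printf '%s' "$1" | tr '-' '_')
    file=${2:-/proc/modules}
    # The first field of each line in /proc/modules is the module name.
    awk -v m="$name" '$1 == m { found = 1 } END { exit !found }' "$file"
}

if module_loaded sony-laptop; then
    echo "sony_laptop is still loaded; the blacklist did not take effect"
else
    echo "sony_laptop is not loaded"
fi
```

Running this after a reboot gives the same answer as the manual lsmod check, without having to remember the underscore spelling.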
FWIW, I don't see any commits between 5.5.17 and 5.6.7 touching the sony_laptop module, but there are some new errors in dmesg compared to 5.5.17:

[ 18.419670] sony_laptop: Invalid acpi_object: expected 0x1 got 0x3
[ 18.422698] sony_laptop: Invalid acpi_object: expected 0x1 got 0x3
[ 18.424856] sony_laptop: couldn't set up keyboard backlight function (-22)
[ 18.428007] sony_laptop: Invalid acpi_object: expected 0x1 got 0x3
[ 18.430306] sony_laptop: Invalid acpi_object: expected 0x1 got 0x3
[ 18.433781] sony_laptop: Invalid acpi_object: expected 0x1 got 0x3
[ 18.435843] sony_laptop: No USB Charge capability found
[ 18.438865] sony_laptop: Invalid acpi_object: expected 0x1 got 0x3
[ 18.441430] sony_laptop: couldn't set up lid resume function (-5)
[ 18.443902] sony_laptop: Invalid acpi_object: expected 0x1 got 0x3
[ 18.446393] sony_laptop: couldn't to read the thermal profiles
[ 18.448270] sony_laptop: couldn't set up thermal profile function (-22)
[ 18.452028] sony_laptop: SNC setup done.

Instead of blacklisting the module altogether, I added a workaround to unload the module for suspend/resume only:
$ cat /usr/lib/systemd/system-sleep/sony-laptop
#!/bin/sh
# systemd-sleep passes "pre" before suspending and "post" after resuming.
# "=" is the portable string comparison; "==" is not guaranteed under /bin/sh.
if [ "${1}" = "pre" ]; then
    logger --journald <<__end
MESSAGE=removing sony-laptop module before suspending, see https://bugzilla.redhat.com/show_bug.cgi?id=1829096
__end
    modprobe -r sony-laptop
elif [ "${1}" = "post" ]; then
    logger --journald <<__end
MESSAGE=reloading sony-laptop module after resuming, see https://bugzilla.redhat.com/show_bug.cgi?id=1829096
__end
    modprobe sony-laptop
fi
Seems to work reliably.
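For anyone adapting the hook, here is a hedged sketch of the same idea with the pre/post decision pulled into a small function so it can be exercised without actually suspending. The function name hook_action and the logger tag sony-laptop-hook are made up for illustration. It relies on systemd-sleep passing the phase ("pre" or "post") as the first argument and the sleep operation (for example "suspend" or "hibernate") as the second; like the original hook, it acts on every sleep operation.

```shell
#!/bin/sh
# Sketch of a system-sleep hook that separates the decision from the action.
# systemd-sleep calls each hook with two arguments:
#   $1 = "pre" or "post"
#   $2 = the sleep operation, e.g. "suspend" or "hibernate"

# hook_action PHASE   (hypothetical helper)
# Prints "unload" before sleeping, "reload" after waking,
# and nothing for any other value.
hook_action() {
    case "$1" in
        pre)  echo unload ;;
        post) echo reload ;;
    esac
}

action=$(hook_action "${1:-}")
case "$action" in
    unload)
        logger -t sony-laptop-hook "unloading sony-laptop before ${2:-sleep}"
        modprobe -r sony-laptop
        ;;
    reload)
        logger -t sony-laptop-hook "reloading sony-laptop after ${2:-sleep}"
        modprobe sony-laptop
        ;;
esac
```

As with the original, the file has to be executable and live in /usr/lib/systemd/system-sleep/ to be picked up. When run with no arguments it does nothing.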
Hmm, ok so this regression is likely caused by some changes inside the ACPI subsystem. I'm afraid that the best way to track this down is to do a kernel bisect between 5.5.0 and 5.6.0 then. I know this is a bit time consuming, but it really is the easiest way to find the commit which causes this issue.

> the best way to track this down is to do a kernel bisect between 5.5.0 and 5.6.0 then.

I bisected it at https://bugzilla.redhat.com/show_bug.cgi?id=1830150#c24

*** This bug has been marked as a duplicate of bug 1830150 ***
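For readers unfamiliar with bisecting, the sketch below shows the mechanics of `git bisect run` on a throwaway repository. It is a toy illustration, not the procedure used in bug 1830150: the repository, the commit messages, the good-base tag and the "state" file are all invented, and the automated test (grep on a file) stands in for "build the kernel, boot it, suspend, and see whether resume hangs".

```shell
#!/bin/sh
# Toy demonstration of `git bisect run` on a throwaway repository.
repo=$(mktemp -d)
git init -q "$repo" 2>/dev/null
git -C "$repo" config user.name "Bisect Demo"
git -C "$repo" config user.email "demo@example.invalid"

# Create 8 commits; commit 6 introduces the (invented) regression.
i=1
while [ "$i" -le 8 ]; do
    if [ "$i" -ge 6 ]; then echo bad > "$repo/state"; else echo good > "$repo/state"; fi
    echo "$i" > "$repo/counter"
    git -C "$repo" add state counter
    git -C "$repo" commit -q -m "commit $i"
    if [ "$i" -eq 1 ]; then git -C "$repo" tag good-base; fi
    i=$((i + 1))
done

# Mark HEAD as bad and the first commit as good, then let git drive the search.
# The test command exits 0 for a good revision and non-zero for a bad one.
git -C "$repo" bisect start HEAD good-base >/dev/null
git -C "$repo" bisect run grep -qx good state >/dev/null

# refs/bisect/bad points at the first bad commit once the search has finished.
first_bad=$(git -C "$repo" log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $first_bad"
git -C "$repo" bisect reset >/dev/null 2>&1
```

For a real kernel bisect the endpoints would be the upstream tags, for example `git bisect start v5.6 v5.5`, and each step is usually marked by hand with `git bisect good` or `git bisect bad` after testing suspend/resume on the freshly built kernel.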