Bug 2211024
| Summary: | systemd mistakes LidSwitchDocked event for LidSwitchExternalPower | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Laszlo Ersek <lersek> |
| Component: | systemd | Assignee: | systemd maint <systemd-maint> |
| Status: | NEW --- | QA Contact: | Frantisek Sumsal <fsumsal> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 9.2 | CC: | dtardon, systemd-maint-list |
| Target Milestone: | rc | | |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Laszlo Ersek
2023-05-30 10:07:04 UTC
(In reply to Laszlo Ersek from comment #0)
> With the latest RHEL-9.1 kernel (5.14.0-162.18.1.el9_1.x86_64), the symptom
> is effectively invisible.

What does this mean? That it doesn't happen at all, or that it does happen, but only rarely?

> With the latest RHEL-9.2 kernel (5.14.0-284.11.1.el9_2.x86_64), the symptom
> always reproduces (again, you need to *cold-boot* the laptop).
>
> *** Steps to Reproduce:
> 1. Power down the laptop.
> 2. Make sure it's docked.
> 3. Close the lid.
> 4. Power on the laptop using the power button on the dock.
> 5. Enter the LUKS password (if any).
> 6. Let the boot progress to the GDM login screen (graphical.target).

Could you provide a log with systemd.log-level=debug?

> *** Actual results:
> - Laptop immediately suspends.
> - When the laptop is resumed, the GDM login screen is broken. No user list
>   to pick a user from, and the various widgets at the top of the screen are
>   broken -- they don't work when clicked, and there is some visual screen
>   corruption too.

I doubt the latter has anything to do with logind.

(In reply to David Tardon from comment #1)
> (In reply to Laszlo Ersek from comment #0)
> > With the latest RHEL-9.1 kernel (5.14.0-162.18.1.el9_1.x86_64), the symptom
> > is effectively invisible.
>
> What does this mean? That it doesn't happen at all, or that it does happen,
> but only rarely?

It happens *extremely* rarely. I didn't mean to clutter the original report with details that I deemed irrelevant, but here's another bit: I actually "bisected" the kernel build range between 5.14.0-162.18.1.el9_1.x86_64 and 5.14.0-284.11.1.el9_2.x86_64, using the development kernel RPMs from Brew. -284 had always reproduced the issue, and -162.18.1. had never done so.
So I was actually nearing completion of the bisection, which seemed to indicate that the problem had been introduced somewhere between -205 and -208 -- but then I cold-booted the laptop with the original -162.18.1 too, for some reason, and boom, the failure popped up with that one as well, totally unexpectedly. That invalidated the entire bisection of course (I couldn't call the starting point -162.18.1.el9_1.x86_64 "good" any longer). It remains a fact that I've seen the failure when cold-booting with -162.18.1.el9_1.x86_64 only once, out of dozens or even hundreds of boots.

> > With the latest RHEL-9.2 kernel (5.14.0-284.11.1.el9_2.x86_64), the symptom
> > always reproduces (again, you need to *cold-boot* the laptop).
> >
> > *** Steps to Reproduce:
> > 1. Power down the laptop.
> > 2. Make sure it's docked.
> > 3. Close the lid.
> > 4. Power on the laptop using the power button on the dock.
> > 5. Enter the LUKS password (if any).
> > 6. Let the boot progress to the GDM login screen (graphical.target).
>
> Could you provide a log with systemd.log-level=debug?

Let me ask back first:

- will this not render the system unbootable itself? (Sorry if this question sounds silly, but I vaguely recall booting an earlier RHEL major release like this, for a different investigation, and there were so many log messages and such a slowdown that I couldn't actually boot the system!)

- Where will the log be captured? Is it available with "journalctl" or in some other way? The laptop doesn't have a serial port, so I can't log directly to a different machine.

> > *** Actual results:
> > - Laptop immediately suspends.
> > - When the laptop is resumed, the GDM login screen is broken. No user list
> >   to pick a user from, and the various widgets at the top of the screen are
> >   broken -- they don't work when clicked, and there is some visual screen
> >   corruption too.
>
> I doubt the latter has anything to do with logind.

I'm unsure, but I can imagine it is related.
It seems that, exactly when the GDM login screen is about to appear, something notices that the lid is closed (note: level triggered, not edge triggered -- the lid has not been touched at all!), and apparently synthesizes a "lid closed" *event*.

The only difference between the two "vectors" is that in the first case, the GDM login screen comes up as a part of a normal cold boot, while in the second case, the GDM login screen appears after logging out of a window manager session. As long as the invalid event is emitted in close connection with the GDM login screen appearing, both symptoms could originate from the same stem.

(In that sense, the root cause may not even be that systemd mistakes LidSwitchDocked for LidSwitchExternalPower -- the primary issue may be that *any* LidSwitch event is emitted when the GDM screen appears, without me touching the lid at all!)

Thanks.
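
To make the level-triggered versus edge-triggered distinction above concrete, here is a minimal toy model in Python. It is only a sketch for discussion and is not systemd's code: `MachineState`, `select_lid_setting`, `edge_triggered_closes`, `level_triggered_closes` and the `DEFAULT_ACTIONS` table are hypothetical names made up for this illustration. The precedence (docked first, then external power, then plain lid switch) and the default values are assumptions based on my reading of the logind.conf documentation, and the idea that the dock might not yet be detected when the lid state is evaluated is a guess, not something the report confirms.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

# Assumed defaults for this sketch only; nothing is read from a real
# logind.conf. ExternalPower is assumed to follow HandleLidSwitch.
DEFAULT_ACTIONS = {
    "HandleLidSwitch": "suspend",
    "HandleLidSwitchExternalPower": "suspend",
    "HandleLidSwitchDocked": "ignore",
}


@dataclass
class MachineState:
    docked: bool
    external_power: bool


def select_lid_setting(state: MachineState) -> str:
    """Pick the setting that governs a lid close; docked takes precedence."""
    if state.docked:
        return "HandleLidSwitchDocked"
    if state.external_power:
        return "HandleLidSwitchExternalPower"
    return "HandleLidSwitch"


def edge_triggered_closes(samples: Sequence[bool]) -> List[int]:
    """Indices where the lid changes from open (False) to closed (True)."""
    events: List[int] = []
    previous: Optional[bool] = None
    for index, closed in enumerate(samples):
        # Only an observed open -> closed transition counts as an event.
        if previous is False and closed:
            events.append(index)
        previous = closed
    return events


def level_triggered_closes(samples: Sequence[bool]) -> List[int]:
    """Indices where the lid is simply observed closed, transition or not."""
    return [index for index, closed in enumerate(samples) if closed]


if __name__ == "__main__":
    # Lid closed for the whole boot: never touched, so no transition occurs.
    lid_samples = [True, True, True]
    print("edge events: ", edge_triggered_closes(lid_samples))
    print("level events:", level_triggered_closes(lid_samples))

    # Guessed failure mode: dock not yet detected when the lid is evaluated.
    for state in (MachineState(docked=True, external_power=True),
                  MachineState(docked=False, external_power=True)):
        setting = select_lid_setting(state)
        print(state, "->", setting, "->", DEFAULT_ACTIONS[setting])
```

In this model a lid that stays closed for the whole boot produces no edge-triggered event at all, while a level-triggered check fires on every evaluation. If such a check runs while the dock is not (yet) seen, the model picks the external-power setting and the resulting action is "suspend" instead of "ignore", which would match both the bug title and the observation that an event appears without the lid being touched.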