Bug 2211024
| Summary: | ThinkPad Thunderbolt 4 Workstation Dock is not recognized as a dock by logind | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Laszlo Ersek <lersek> |
| Component: | systemd | Assignee: | systemd maint <systemd-maint> |
| Status: | CLOSED MIGRATED | QA Contact: | Frantisek Sumsal <fsumsal> |
| Severity: | high | Docs Contact: | |
| Priority: | unspecified | ||
| Version: | 9.2 | CC: | dtardon, systemd-maint-list |
| Target Milestone: | rc | Keywords: | FutureFeature, MigratedToJIRA, Triaged |
| Target Release: | --- | Flags: | pm-rhel: mirror+ |
| Hardware: | x86_64 | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | If docs needed, set a value | |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2023-09-21 15:14:59 UTC | Type: | Story |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Description
Laszlo Ersek
2023-05-30 10:07:04 UTC
(In reply to Laszlo Ersek from comment #0)
> With the latest RHEL-9.1 kernel (5.14.0-162.18.1.el9_1.x86_64), the symptom
> is effectively invisible.

What does this mean? That it doesn't happen at all, or that it does happen, but only rarely?

> With the latest RHEL-9.2 kernel (5.14.0-284.11.1.el9_2.x86_64), the symptom
> always reproduces (again, you need to *cold-boot* the laptop).
>
> *** Steps to Reproduce:
> 1. Power down the laptop.
> 2. Make sure it's docked.
> 3. Close the lid.
> 4. Power on the laptop using the power button on the dock.
> 5. Enter the LUKS password (if any).
> 6. Let the boot progress to the GDM login screen (graphical.target).

Could you provide a log with systemd.log-level=debug?

>
> *** Actual results:
> - Laptop immediately suspends.
> - When the laptop is resumed, the GDM login screen is broken. No user list
>   to pick a user from, and the various widgets at the top of the screen are
>   broken -- they don't work when clicked, and there is some visual screen
>   corruption too.

I doubt the latter has anything to do with logind.

(In reply to David Tardon from comment #1)
> (In reply to Laszlo Ersek from comment #0)
> > With the latest RHEL-9.1 kernel (5.14.0-162.18.1.el9_1.x86_64), the symptom
> > is effectively invisible.
>
> What does this mean? That it doesn't happen at all, or that it does happen,
> but only rarely?

It happens *extremely* rarely. I didn't mean to clutter the original report with details that I deemed irrelevant, but here's another bit:

I actually "bisected" the kernel build range between 5.14.0-162.18.1.el9_1.x86_64 and 5.14.0-284.11.1.el9_2.x86_64, using the development kernel RPMs from Brew. -284 had always reproduced the issue, and -162.18.1. had never done so.
So I was actually nearing completion of the bisection, which seemed to indicate that the problem had been introduced somewhere between -205 and -208 -- but then I cold-booted the laptop with the original -162.18.1 too, for some reason, and boom, the failure popped up with that one as well, totally unexpectedly. That invalidated the entire bisection of course (I couldn't call the starting point -162.18.1.el9_1.x86_64 "good" any longer).

It remains a fact that I've seen the failure when cold-booting with -162.18.1.el9_1.x86_64 only once, out of dozens or even hundreds of boots.

> > With the latest RHEL-9.2 kernel (5.14.0-284.11.1.el9_2.x86_64), the symptom
> > always reproduces (again, you need to *cold-boot* the laptop).
> >
> > *** Steps to Reproduce:
> > 1. Power down the laptop.
> > 2. Make sure it's docked.
> > 3. Close the lid.
> > 4. Power on the laptop using the power button on the dock.
> > 5. Enter the LUKS password (if any).
> > 6. Let the boot progress to the GDM login screen (graphical.target).
>
> Could you provide a log with systemd.log-level=debug?

Let me ask back first:

- will this not render the system unbootable itself? (Sorry if this question sounds silly, but I vaguely recall booting an earlier RHEL major release like this, for a different investigation, and there were so many log messages and such a slowdown that I couldn't actually boot the system!)

- Where will the log be captured? Is it available with "journalctl" or in some other way? The laptop doesn't have a serial port, so I can't log directly to a different machine.

> > *** Actual results:
> > - Laptop immediately suspends.
> > - When the laptop is resumed, the GDM login screen is broken. No user list
> >   to pick a user from, and the various widgets at the top of the screen are
> >   broken -- they don't work when clicked, and there is some visual screen
> >   corruption too.
>
> I doubt the latter has anything to do with logind.

I'm unsure, but I can imagine it is related.
It seems that, exactly when the GDM login screen is about to appear, something notices that the lid is closed (note: level triggered, not edge triggered -- the lid has not been touched at all!), and apparently synthesizes a "lid closed" *event*. The only difference between the two "vectors" is that in the first case, the GDM login screen comes up as a part of a normal cold boot, while in the second case, the GDM login screen appears after logging out of a window manager session. As long as the invalid event is emitted in close connection with the GDM login screen appearing, both symptoms could originate from the same stem.

(In that sense, the root cause may not even be that systemd mistakes LidSwitchDocked for LidSwitchExternalPower -- the primary issue may be that *any* LidSwitch event is emitted when the GDM screen appears, without me touching the lid at all!)

Thanks.

(In reply to Laszlo Ersek from comment #2)
> (In reply to David Tardon from comment #1)
> > Could you provide a log with systemd.log-level=debug?
>
> Let me ask back first:
>
> - will this not render the system unbootable itself? (Sorry if this question
>   sounds silly, but I vaguely recall booting an earlier RHEL major release
>   like this, for a different investigation, and there were so many log
>   messages and such a slowdown that I couldn't actually boot the system!)

I don't think it will. I've never encountered--or heard of--such an issue before...

> - Where will the log be captured? Is it available with "journalctl" or in
>   some other way?

It'll be in the journal. This option changes just the log level of PID1, not the log target.

So the root of the issue is that the dock is not recognized as a dock (and it seems there's no obvious way to detect one: https://github.com/systemd/systemd/issues/14416).
As there's an external monitor attached, logind still thinks the machine is docked (hence HandleLidSwitchDocked= is being considered):

Sep 15 16:51:15 lacos-laptop-9.usersys.redhat.com systemd-logind[1669]: External (1) displays connected.
Sep 15 16:51:15 lacos-laptop-9.usersys.redhat.com systemd-logind[1669]: Handling of handle-lid-switch (level) is disabled, taking no action.

But it looks like the ext. monitor disappears after gdm is started, because the 'External (1) displays connected' message is no longer printed. (Does gdm call `udevadm trigger`? There's nothing interesting in the log around that point.) Consequently, logind considers HandleLidSwitchExternalPower= and suspends the system:

Sep 15 16:51:15 lacos-laptop-9.usersys.redhat.com systemd-logind[1669]: Sleep mode "freeze" is supported by the kernel.
Sep 15 16:51:15 lacos-laptop-9.usersys.redhat.com systemd-logind[1669]: Suspending...

There's also something you wrote in comment 0 to which I didn't pay enough attention at the time:

"Even with RHEL-9.1 components (kernel + userland), I experienced a similar symptom whenever I *logged out* of my window manager session *back* to the GDM login screen. In that case, the laptop would suspend similarly."

A new gdm is started after logout, which means the situation is similar to a fresh boot: gdm starts and does something that makes the ext. monitor "disappear" for a while, which results in logind suspending the system. I suppose that if you started the laptop with the lid open and only closed it after logging in, it would continue running (until logout)...

Issue migration from Bugzilla to Jira is in process at this time. This will be the last message in Jira copied from the Bugzilla bug.

This BZ has been automatically migrated to the issues.redhat.com Red Hat Issue Tracker. All future work related to this report will be managed there.

Due to differences in account names between systems, some fields were not replicated.
Be sure to add yourself to Jira issue's "Watchers" field to continue receiving updates and add others to the "Need Info From" field to continue requesting information.

To find the migrated issue, look in the "Links" section for a direct link to the new issue location. The issue key will have an icon of 2 footprints next to it, and begin with "RHEL-" followed by an integer. You can also find this issue by visiting https://issues.redhat.com/issues/?jql= and searching the "Bugzilla Bug" field for this BZ's number, e.g. a search like:

"Bugzilla Bug" = 1234567

In the event you have trouble locating or viewing this issue, you can file an issue by sending mail to rh-issues. You can also visit https://access.redhat.com/articles/7032570 for general account information.
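
For readers following the analysis earlier in this thread, the lid-switch policy selection it describes can be modelled roughly as below. This is only a simplified sketch, not logind's actual code: the class `LidPolicy`, the function `pick_lid_action`, and its parameters are hypothetical names made up for illustration, and the default values are an assumption based on the documented logind.conf defaults (HandleLidSwitch=suspend, HandleLidSwitchDocked=ignore, HandleLidSwitchExternalPower= unset). It shows why a momentarily "missing" external display is enough to turn an ignored lid switch into a suspend when the dock itself is not detected.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LidPolicy:
    # Hypothetical container for the three logind.conf settings discussed
    # in this report. Defaults are assumed from the documented defaults.
    handle_lid_switch: str = "suspend"
    handle_lid_switch_external_power: Optional[str] = None  # unset: fall back
    handle_lid_switch_docked: str = "ignore"


def pick_lid_action(policy: LidPolicy, dock_detected: bool,
                    external_displays: int, on_external_power: bool) -> str:
    """Return the action a closed lid would trigger in this simplified model."""
    # A detected dock or at least one external display counts as "docked".
    if dock_detected or external_displays > 0:
        return policy.handle_lid_switch_docked
    # Otherwise the external-power setting applies, if it is configured.
    if on_external_power and policy.handle_lid_switch_external_power is not None:
        return policy.handle_lid_switch_external_power
    # Fall back to the general lid-switch setting.
    return policy.handle_lid_switch


policy = LidPolicy()
# Dock not recognized, but the external monitor is visible: lid is ignored.
print(pick_lid_action(policy, dock_detected=False, external_displays=1,
                      on_external_power=True))
# Same machine while the monitor is briefly not counted: suspend.
print(pick_lid_action(policy, dock_detected=False, external_displays=0,
                      on_external_power=True))
```

In this model, a dock that was recognized as such (`dock_detected=True`) would keep the "docked" policy in effect even while the display count drops to zero, which matches the conclusion above that the missing dock detection is the underlying problem.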