Bug 2192784

Summary: GDM (X in general) requires processes to be part of user's sessions (set up by pam_systemd) to be functional
Product: Red Hat Enterprise Linux 8
Reporter: Renaud Métrich <rmetrich>
Component: systemd
Assignee: Michal Sekletar <msekleta>
Status: NEW
QA Contact: Frantisek Sumsal <fsumsal>
Severity: high
Priority: high
Version: 8.7
CC: bwelterl, hdegoede, msekleta, rstrode, systemd-maint-list
Target Milestone: rc
Keywords: Reopened
Hardware: All
OS: Linux
Last Closed: 2023-05-03 15:19:42 UTC
Type: Bug

Description Renaud Métrich 2023-05-03 06:21:46 UTC
Description of problem:

In theory, "pam_systemd" is optional and may be safely commented out of the PAM configuration, e.g. in /etc/pam.d/system-auth:
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
#-session    optional                                     pam_systemd.so
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
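
To check whether a given PAM file still enables pam_systemd, a minimal sketch could look like the following. The function name is hypothetical, and the sketch assumes the usual PAM line layout (type, control, module path); it does not follow "include" or "substack" directives, so on a real system each included file would have to be checked as well.

```python
def pam_systemd_enabled(pam_text: str) -> bool:
    """Return True if an uncommented session line loads pam_systemd.so.

    Hypothetical helper: it only looks at the text it is given and does
    not follow include/substack directives.
    """
    for raw in pam_text.splitlines():
        # Everything after '#' is a comment in PAM configuration files.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        # A leading '-' on the type means "stay silent if the module is missing".
        if fields[0].lstrip("-") != "session":
            continue
        # The control field may contain spaces ([success=1 ...]), so scan
        # every remaining field for the module name.
        if any(f.endswith("pam_systemd.so") for f in fields[1:]):
            return True
    return False
```

With the commented-out line shown above as input, this returns False; with the leading "#" removed, it returns True.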

Unfortunately, pam_systemd does not appear to be optional in practice: gdm.service doesn't start if no user session is created.
The following error is seen in GDM's log (/var/lib/gdm/.local/share/xorg/Xorg.0.log):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
# grep -w EE /var/lib/gdm/.local/share/xorg/Xorg.0.log
	(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
[   732.585] (EE) systemd-logind: failed to get session: PID 2606 does not belong to any known session
[   732.588] (EE) 
[   732.588] (EE) xf86OpenConsole: Cannot open virtual console 1 (Permission denied)
[   732.588] (EE) 
[   732.588] (EE) 
[   732.588] (EE) Please also check the log file at "/var/lib/gdm/.local/share/xorg/Xorg.0.log" for additional information.
[   732.588] (EE) 
[   732.588] (EE) Server terminated with error (1). Closing log file.
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
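
To see which session, if any, a process such as PID 2606 is placed in, one option is to look at its cgroup membership. The sketch below is an illustration only: the function name is hypothetical, and it assumes sessions appear as "session-<id>.scope" units in /proc/<pid>/cgroup, which is how systemd names session scopes.

```python
import re
from typing import Optional

# Matches the session scope component of a cgroup path,
# e.g. ".../session-c1.scope" gives "c1".
_SESSION_SCOPE = re.compile(r"/session-([^/]+)\.scope(?:/|$)")


def session_from_cgroup(cgroup_text: str) -> Optional[str]:
    """Return the session id found in /proc/<pid>/cgroup content, or None."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = _SESSION_SCOPE.search(parts[2])
        if match:
            return match.group(1)
    return None
```

It could be called as session_from_cgroup(open("/proc/2606/cgroup").read()). A result of None would be consistent with the "does not belong to any known session" error above, though this only inspects the cgroup path and does not query systemd-logind itself.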

Additionally, when pam_systemd is not commented out and user sessions are open, restarting/stopping systemd-logind kills all active user sessions.
The following error is then seen in the session's log (e.g. /var/lib/gdm/.local/share/xorg/Xorg.0.log):
-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
[  1003.616] (EE) 
Fatal server error:
[  1003.616] (EE) systemd-logind disappeared (stopped/restarted?)
[  1003.616] (EE) 
[  1003.616] (EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
[  1003.616] (EE) Please also check the log file at "/var/lib/gdm/.local/share/xorg/Xorg.0.log" for additional information.
[  1003.616] (EE) 
[  1003.653] (EE) systemd-logind: ReleaseControl failed: You are not in control of this session
[  1003.653] (EE) Server terminated with error (1). Closing log file.

-------- 8< ---------------- 8< ---------------- 8< ---------------- 8< --------
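
When comparing several Xorg logs, it can help to pull out only the timestamped (EE) lines that carry a message, similar to the "grep -w EE" call used earlier but without the legend line and the empty (EE) lines. The following is a rough sketch; the function name is hypothetical and the pattern assumes the "[ seconds] (EE) message" layout seen in the excerpts above.

```python
import re
from typing import List, Tuple

# "[  1003.616] (EE) message" gives the timestamp and the message text.
_EE_LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+\(EE\)\s*(.*)$")


def xorg_errors(log_text: str) -> List[Tuple[float, str]]:
    """Return (timestamp, message) pairs for non-empty (EE) log lines."""
    errors = []
    for line in log_text.splitlines():
        match = _EE_LINE.match(line)
        if not match:
            continue  # legend line, continuation text, or another level
        message = match.group(2).strip()
        if message:  # skip the bare "(EE)" spacer lines
            errors.append((float(match.group(1)), message))
    return errors
```

Note that untimestamped continuation lines such as "Fatal server error:" are dropped by this pattern, so the original log should still be consulted for the full context.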

This is a critical issue, and not only because of BZ #2158167: systemd-logind may also fail for other reasons (e.g. it times out and gets killed by systemd), and IMHO X sessions should survive this.

Version-Release number of selected component (if applicable):

gdm-40.0-24.el8.x86_64
systemd-239-68.el8_7.4.x86_64

How reproducible:

Always, see above.

Comment 1 Ray Strode [halfline] 2023-05-03 15:19:42 UTC
No, pam_systemd / logind registration isn't optional for our shipped desktop environment. That's definitely not going to change.

See also https://bugzilla.redhat.com/show_bug.cgi?id=1643928

Comment 2 Renaud Métrich 2023-05-04 05:58:59 UTC
Hi Ray,

Even assuming pam_systemd is not optional for the graphical user interface, I still wouldn't expect all graphical sessions to die when systemd-logind dies or restarts.
Can we harden this?

Renaud.

Comment 4 Ray Strode [halfline] 2023-05-08 14:57:37 UTC
Well, the error message in comment 0 is:

systemd-logind: ReleaseControl failed: You are not in control of this session

This suggests that when logind is restarted, it forgets who the session controller is. So that would need to be fixed first. It's possible mutter and Xorg will need follow-up fixes as well, not sure. GDM shouldn't need any changes at all.

To be honest, I'm not sure trying to make the system resilient to important system services getting killed or otherwise becoming dysfunctional is that worthwhile an endeavor. There are a million ways an admin can kill or perturb things and make the system break.

Having said that, I do believe logind heavily serializes its state, so it could be acting counter to its design here.
 
We'll see what the systemd crew says.