| Summary: | user's login session sometimes fails to start because no permission on DRI | | |
|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Ian Collier <imc> |
| Component: | gdm | Assignee: | Ray Strode [halfline] <rstrode> |
| Status: | CLOSED DUPLICATE | QA Contact: | Fedora Extras Quality Assurance <extras-qa> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 24 | CC: | normand, rstrode |
| Target Milestone: | --- | | |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2016-10-13 11:17:07 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Attachments: | | | |

Created attachment 1209623 [details]
Xorg.0.log of root attempting to start X on the console (and failing)

Digging further... in the logs we have a record of pid 1004 crashing: it was systemd-logind. The question would be: why didn't the pid disappear from the DRI clients when the process crashed?

Further mystery: we have recently installed Fedora 24 on 85 machines, and 16 of them have recorded a systemd-logind crash at exactly 18:50:00 on several different dates. We don't have anything that happens at 18:50. The closest is a cron job that runs at 18:35; it runs a command that calls systemd-inhibit, then maybe issues a shutdown for 19:00 and sleeps for 1800 seconds. None of these machines did in fact shut down at 19:00.

Anyway, if systemd-logind is crashing then maybe this is in fact a systemd bug. However, it would be nice if gdm could restart properly when systemd-logind crashes.

Right, the systemd crash is Bug 1371596 and that's just been fixed, so hopefully once all our machines have been rebooted to restart their systemd this will no longer be an issue.

*** This bug has been marked as a duplicate of bug 1371596 ***

Created attachment 1209622 [details]
Xorg.0.log of the user while unsuccessfully logging in

Every so often, in an unpredictable fashion, gdm (or some system process) gets into a state where users can't log in: when the correct password is entered, the system tries to start the session and fails, then returns to the login screen.

The user's Xorg.0.log file says things like:

    vesa: Ignoring device with a bound kernel driver
    (EE) modeset(0): drmSetMaster failed: Permission denied
    (EE) AddScreen/ScreenInit failed for driver 0

(bearing in mind the driver for this system should be intel(4) not modesetting(4))

whereas if one logs on to the console as root and tries to start X, it says things like:

    (EE) intel(0): [drm] failed to set drm interface version: Permission denied [13].
    (EE) intel(0): Failed to claim DRM device.

The problem seems to be related to this:

    # cat /sys/kernel/debug/dri/0/clients
    command         pid   dev  master  a  uid  magic
    <unknown>       1004  0    y       y  0    0
    Xwayland        1520  0    n       y  42   1
    Xwayland        1520  0    n       y  42   2
    Xwayland        1520  0    n       y  42   3

There is no process 1004 running on the system. However, if one kills process 1520 then gdm restarts and the ghost of process 1004 disappears:

    # cat /sys/kernel/debug/dri/0/clients
    command         pid   dev  master  a  uid  magic
    systemd-logind  4955  0    n       y  0    0
    Xwayland        6780  0    n       y  42   1
    Xwayland        6780  0    n       y  42   2

At that point, users are again able to log in successfully.
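
The check described above (a pid listed in /sys/kernel/debug/dri/0/clients that no longer exists on the system) could be automated roughly as follows. This is only a sketch and not part of the original report: the function names `parse_clients`, `pid_alive` and `stale_clients` are made up for illustration, and it assumes the clients file has the column layout quoted above, with no spaces inside the command name. Reading the real file normally requires root and a mounted debugfs.

```python
import os

# Output quoted in this report, used here as sample input.
SAMPLE_FROM_REPORT = """\
command pid dev master a uid magic
<unknown> 1004 0 y y 0 0
Xwayland 1520 0 n y 42 1
Xwayland 1520 0 n y 42 2
Xwayland 1520 0 n y 42 3
"""


def parse_clients(text):
    """Turn the whitespace-separated clients table into one dict per row."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != len(header):
            continue  # row does not match the assumed layout; skip it
        rows.append(dict(zip(header, fields)))
    return rows


def pid_alive(pid):
    """Return True if /proc/<pid> exists, i.e. the process is still around."""
    return os.path.exists("/proc/%d" % pid)


def stale_clients(rows, alive=pid_alive):
    """Return the rows whose pid no longer corresponds to a live process."""
    return [row for row in rows if not alive(int(row["pid"]))]
```

On an affected machine one might call `stale_clients(parse_clients(open("/sys/kernel/debug/dri/0/clients").read()))` as root; a stale row with `master` set to `y` would match the ghost of pid 1004 seen here.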