| Summary: | /etc/X11/prefdm throws away display manager's stdout/stderr | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Product: | [Fedora] Fedora | Reporter: | Bob Gustafson <bobgus> | ||||||
| Component: | initscripts | Assignee: | Bill Nottingham <notting> | ||||||
| Status: | CLOSED DEFERRED | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
| Severity: | low | Docs Contact: | |||||||
| Priority: | unspecified | ||||||||
| Version: | 16 | CC: | andres.hans, iarlyy, johannbg, jonathan, lnykryn, metherid, mschmidt, notting, plautrba, rvokal, systemd-maint | ||||||
| Target Milestone: | --- | ||||||||
| Target Release: | --- | ||||||||
| Hardware: | x86_64 | ||||||||
| OS: | Linux | ||||||||
| Whiteboard: | |||||||||
| Fixed In Version: | Doc Type: | Bug Fix | |||||||
| Doc Text: | Story Points: | --- | |||||||
| Clone Of: | |||||||||
| : | 805507 805517 (view as bug list) | Environment: | |||||||
| Last Closed: | 2012-03-22 18:30:46 UTC | Type: | --- | ||||||
| Regression: | --- | Mount Type: | --- | ||||||
| Documentation: | --- | CRM: | |||||||
| Verified Versions: | Category: | --- | |||||||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
| Cloudforms Team: | --- | Target Upstream Version: | |||||||
| Attachments: |
Description
Bob Gustafson
2012-03-20 18:20:02 UTC
Created attachment 571501 [details]
This is the output of the ssh session on the initiating system.
This initiating system is also a F16 system, but on a slower dual processor. It does not have the hang symptoms on reboot.
When I rebooted again (using /sbin/shutdown -r now), the system hung again at the Fedora splash.

Hmm, after ssh and startx on alternative system, console on affected system DID have the login dialog box. (but before the ssh and startx, it was hung at splash screen).

Another experiment - I went back to my alternative initiating system and hit the control C. The 'startx' session quit. When I went back to the problem system, I see the Fedora splash screen. The open gnome session I had previously was gone. I went back to the alternative system and typed 'startx' again. Coming back to the problem system, it is logged on, but with no application windows - just the same as if I had logged out on the problem system.

So, another experiment - will log out of the problem system (just log out, not reboot). Results coming soon to a screen near you.

The result of logging out - the hung fedora splash screen. I am attaching the output from the terminal of the alternative initiating system. Perhaps there are some clues.

Created attachment 571503 [details]
Terminal output including control C and restart and log out.
This terminal output includes the previous terminal output, but also the control C and restart and log out from the affected system side.
hoho6 is the affected system (with ssh server)
hoho0 is the slower initiating system with the ssh client.
This time, after giving 'startx' on initiating system ssh terminal session, the affected system console screen showed the user already logged in - no login dialog window.

Also, in Comment #4, paragraph 2, I hadn't actually logged out, but had returned to the blank Gnome console screen (logged in as user), after the startx was given on the alternative initiating system ssh terminal session. If my screen times out (as opposed to clicking on the Log Out..), the Password dialog does appear (not the Login dialog).

(In reply to comment #3)
> Hmm, after ssh and startx on alternative system, console on affected system DID
> have the login dialog box. (but before the ssh and startx, it was hung at
> splash screen).

This could have been the 'Password' dialog box, not the Login dialog window. It may have timed out before I looked at the console screen after the startx.

Following up on the hypothesis of a race condition, how can I 'turn off' a couple of cores in my quad-core processor? The turn off would have to survive a reboot. I could take the processor from hoho0 and put it into hoho6 - they are both 775 sockets - but that would be a bit too invasive. New heat sink grease and all that.
Some commands from an old Lennart Poettering email in systemd-devel

[root@hoho6 user1]# cp /lib/systemd/system/prefdm.service /etc/systemd/system
[root@hoho6 user1]# cd /etc/systemd/system
[root@hoho6 system]# ls
basic.target.wants                           graphical.target.wants
bluetooth.target.wants                       multi-user.target.wants
dbus-org.freedesktop.Avahi.service           prefdm.service
dbus-org.freedesktop.NetworkManager.service  printer.target.wants
default.target                               sockets.target.wants
default.target.wants                         sysinit.target.wants
getty.target.wants
[root@hoho6 system]# vim prefdm.service
-- Add StandardOutput=syslog to [Service] section
[root@hoho6 system]# systemctl status prefdm.service
prefdm.service - Display Manager
	  Loaded: loaded (/lib/systemd/system/prefdm.service; static)
	  Active: failed since Tue, 20 Mar 2012 14:27:04 -0500; 4h 39min ago
	 Process: 1169 ExecStart=/etc/X11/prefdm -nodaemon (code=exited, status=2)
	  CGroup: name=systemd:/system/prefdm.service
[root@hoho6 system]# systemctl daemon-reload
[root@hoho6 system]# systemctl start prefdm.service
[root@hoho6 system]# systemctl status prefdm.service
prefdm.service - Display Manager
	  Loaded: loaded (/lib/systemd/system/prefdm.service; static)
	  Active: failed since Tue, 20 Mar 2012 19:07:08 -0500; 16s ago
	 Process: 3116 ExecStart=/etc/X11/prefdm -nodaemon (code=exited, status=2)
	  CGroup: name=systemd:/system/prefdm.service
[root@hoho6 system]#
[root@hoho6 log]# tail -30 messages
Mar 20 19:04:30 hoho6 dbus-daemon[1032]: dbus[1032]: [system] Activating service name='net.reactivated.Fprint' (using servicehelper)
Mar 20 19:04:30 hoho6 dbus-daemon[1032]: Launching FprintObject
Mar 20 19:04:30 hoho6 dbus[1032]: [system] Successfully activated service 'net.reactivated.Fprint'
Mar 20 19:04:30 hoho6 dbus-daemon[1032]: dbus[1032]: [system] Successfully activated service 'net.reactivated.Fprint'
Mar 20 19:04:30 hoho6 dbus-daemon[1032]: ** Message: D-Bus service launched with name: net.reactivated.Fprint
Mar 20 19:04:30 hoho6 dbus-daemon[1032]: ** Message: entering main loop
Mar 20 19:05:00 hoho6 dbus-daemon[1032]: ** Message: No devices in use, exit
Mar 20 19:06:54 hoho6 systemd[1]: Reloading.
Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2
Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart.
Mar 20 19:07:07 hoho6 systemd[1]: Unit prefdm.service entered failed state.
Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2
Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart.
Mar 20 19:07:07 hoho6 systemd[1]: Unit prefdm.service entered failed state.
Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2
Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart.
Mar 20 19:07:07 hoho6 systemd[1]: Unit prefdm.service entered failed state.
Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2
Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart.
Mar 20 19:07:07 hoho6 systemd[1]: Unit prefdm.service entered failed state.
Mar 20 19:07:08 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2
Mar 20 19:07:08 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart.
Mar 20 19:07:08 hoho6 systemd[1]: Unit prefdm.service entered failed state.
Mar 20 19:07:08 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2
Mar 20 19:07:08 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart.
Mar 20 19:07:08 hoho6 systemd[1]: Unit prefdm.service entered failed state.
Mar 20 19:07:08 hoho6 systemd[1]: prefdm.service start request repeated too quickly, refusing to start.
Mar 20 19:07:14 hoho6 systemd[1]: plymouth-quit-wait.service operation timed out. Terminating.
Mar 20 19:07:14 hoho6 systemd[1]: Unit plymouth-quit-wait.service entered failed state.
Mar 20 19:07:14 hoho6 systemd[1]: Startup finished in 3s 153ms 655us (kernel) + 2s 745ms 58us (initrd) + 4h 40min 17s 271ms 325us (userspace) = 4h 40min 23s 170ms 38us.
[root@hoho6 log]#

Anything interesting?

I am running rvm. Perhaps this has a bearing on the problem? Or perhaps this problem has a bearing on the rvm problem.
https://github.com/wayneeseguin/rvm/issues/835

The issue certainly looks similar. Does it help if you move /etc/profile.d/rvm.sh away?

Yes, that is it.
cd /etc/profile.d
mv rvm.sh rvm.sh.old
Reboot - and almost everything is fine.

Good to know. To summarize the problem:
1) /etc/X11/prefdm executes /usr/sbin/gdm
2) /usr/sbin/gdm is a shell script running in POSIX mode (it has #!/bin/sh)
3) /usr/sbin/gdm sources /etc/profile
4) /etc/profile sources /etc/profile.d/rvm.sh
5) rvm.sh depends on non-POSIX bash functionality, so a syntax error is hit. The whole thing quits with error code 2.

I see at least 2 issues that made this problem needlessly difficult to debug:
- /etc/X11/prefdm executes the display manager with ">/dev/null 2>&1", thus hiding the error messages.
- /etc/profile sources the files /etc/profile.d/*.sh with:
  . "$i" >/dev/null 2>&1
  again hiding the error messages.

prefdm is owned by initscripts. Reassigning.

I cloned this for the packages setup and gdm as bug 805507 and bug 805517.

This is more or less intentional; it's executing, in order, fallbacks from a list of display managers. If we don't throw the error messages out, everyone will get error messages about display managers that don't exist on the system in question, in cases where them not existing isn't an error.

But:
- it does the redirection even when running $preferred.
- for the fallbacks it could do:
  [ -x /usr/sbin/gdm ] && exec /usr/sbin/gdm "$@"
  and similar.

It could, although that means it's tied into whatever directories the DM happens to live in, even if it moves.

Obviously, this should go away with systemd-ification of display managers in F18.
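
The two-part failure summarized earlier in this thread (a syntax error in a sourced file aborts a #!/bin/sh script, and an outer ">/dev/null 2>&1" hides the message) can be reproduced in isolation. The sketch below is an illustration only. fake_dm.sh and fake_profile.sh are hypothetical stand-ins for /usr/sbin/gdm and /etc/profile.d/rvm.sh, and the stray "fi" is a deliberately malformed line standing in for whatever bash-only construct the real rvm.sh used, which this thread does not spell out. It assumes that sh behaves as a POSIX shell (for example dash, or bash invoked as sh).

```shell
#!/bin/sh
# Illustration only: hypothetical stand-ins, not the real Fedora scripts.
workdir=$(mktemp -d)

# Stand-in for /etc/profile.d/rvm.sh: a single line the shell cannot parse.
printf '%s\n' 'fi' > "$workdir/fake_profile.sh"

# Stand-in for /usr/sbin/gdm: a #!/bin/sh script that sources the profile file
# given as its first argument before doing any real work.
printf '%s\n' '#!/bin/sh' '. "$1"' 'echo "display manager started"' \
    > "$workdir/fake_dm.sh"

# prefdm-style launch: stdout and stderr are both discarded.
hidden_status=0
hidden_output=$(sh "$workdir/fake_dm.sh" "$workdir/fake_profile.sh" >/dev/null 2>&1) \
    || hidden_status=$?

# Same launch with stderr kept, so the shell's error message is captured.
visible_status=0
visible_output=$(sh "$workdir/fake_dm.sh" "$workdir/fake_profile.sh" 2>&1) \
    || visible_status=$?

echo "hidden run:  status=$hidden_status output='$hidden_output'"
echo "visible run: status=$visible_status output='$visible_output'"

rm -rf "$workdir"
```

Both runs fail in the same way, but only the second one leaves a message to read, which is the debugging difficulty this bug is about.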
(In reply to comment #19)
> It could, although that means it's tied into whatever directories the DM
> happens to live in, even if it moves.

Good point. You might like this then:
  type -P gdm >/dev/null && exec gdm "$@"

> Obviously, this should go away with systemd-ification of display managers in
> F18.

Yes, hopefully. You can decide whether to use the "type -P" guards, or just WONTFIX this for F<18.

I think I'm going to WONTFIX this - trying to catch an error in profile.d shell scripts this way isn't a case to go out of the way to rearchitect and fix.

What is the content of /etc/sysconfig/desktop?

I hit this issue too, but fixed it. I always install Fedora from the minimal package set at install time, and every time I forget to configure /etc/sysconfig/desktop. I think the minimal install skips firstlogin and other setup, so /etc/sysconfig/desktop never gets created, which could lead to this "bug".

[root@lxvz9mv901 ~]# uname -a
Linux lxvz9mv901 3.3.7-1.fc17.i686.PAE #1 SMP Mon May 21 22:42:05 UTC 2012 i686 i686 i386 GNU/Linux
[root@lxvz9mv901 ~]# rpm -q openbox lxdm
openbox-3.5.0-5.fc17.i686
lxdm-0.4.1-1.fc17.i686
[root@lxvz9mv901 ~]#

It was solved as soon as I created /etc/sysconfig/desktop. Here is mine:

[root@lxvz9mv901 ~]# cat /etc/sysconfig/desktop
PREFERRED=/usr/bin/openbox-session
DISPLAYMANAGER=/usr/sbin/lxdm
[root@lxvz9mv901 ~]#
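
The guarded-fallback idea discussed above and the DISPLAYMANAGER setting from /etc/sysconfig/desktop can be combined in a small sketch. This is not the actual prefdm script. pick_dm is a hypothetical helper that only prints the display manager it would run instead of exec'ing it, and the file and candidate names in the demo are made up. It uses command -v, the POSIX counterpart of the bash-specific "type -P" suggested in this thread.

```shell
#!/bin/sh
# pick_dm CONFIG CANDIDATE...
# Hypothetical helper, not part of initscripts. Prints the path of the display
# manager that would be started; returns 1 if nothing usable is found.
# Note: it sources CONFIG in the current shell, as a launcher script would.
pick_dm() {
    config=$1
    shift
    DISPLAYMANAGER=
    if [ -r "$config" ]; then
        . "$config"
    fi
    # Prefer the explicitly configured display manager when it is executable.
    if [ -n "$DISPLAYMANAGER" ] && [ -x "$DISPLAYMANAGER" ]; then
        printf '%s\n' "$DISPLAYMANAGER"
        return 0
    fi
    # Otherwise take the first candidate found on PATH. Missing candidates are
    # skipped silently, so no blanket ">/dev/null 2>&1" is needed.
    for candidate in "$@"; do
        if path=$(command -v "$candidate" 2>/dev/null); then
            printf '%s\n' "$path"
            return 0
        fi
    done
    return 1
}

# Demo with made-up files in a temporary directory.
demo_dir=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'exit 0' > "$demo_dir/fake-dm"
chmod +x "$demo_dir/fake-dm"
printf 'DISPLAYMANAGER="%s"\n' "$demo_dir/fake-dm" > "$demo_dir/desktop"

configured_dm=$(pick_dm "$demo_dir/desktop" sh)
fallback_dm=$(pick_dm "$demo_dir/missing" no-such-dm-xyz sh)

echo "configured: $configured_dm"
echo "fallback:   $fallback_dm"

rm -rf "$demo_dir"
```

In the demo, sh stands in for a fallback display manager simply because it is certain to be on PATH. A real launcher would exec the chosen path with "$@" and leave its stderr alone, so that a failure inside the display manager itself stays visible.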