Hide Forgot
Description of problem: When I boot, it stops showing a full Fedora splash screen, but does not go further to the login. I had recently installed F16 and was apparently still living on the initial boot. I needed to reboot and found myself in this situation (again - see bug# 803574) Version-Release number of selected component (if applicable): This is on a fully updated F16 x86_64 running on a quad core processor. How reproducible: This is my first reboot of the newly installed F16 system. Steps to Reproduce: 1. Start with running F16 system 2. reboot 3. Actual results: Hang at full Fedora splash, no login user choices Expected results: Should see login choices. Additional info: Fortunately, the system comes up with networking active and a live ssh server. From another system, I can ssh into the machine successfully. I picked a user name (not root) which was available on both systems. When I ssh into the hung system and then issue the command: startx And then look at the console screen of the hung system, voila, I am logged in as that user and I have everything available. Even though I did a full replacement install of F16, including rewriting the partition map, When I log in through ssh from another machine and give the command 'startx' (my workaround from the previous bug# 803574), the Gnome console screen comes up with 'Applications Places' at the top left of the screen and there are little tabs at the screen bottom for each of the windows open on my screen. When I click on my name on the top right, I see 'Log Out..' as the bottom entry, but the Alt key does not change it to 'Shutdown..' as in the Gnome screen I had when the system originally booted up. Perhaps 'startx' does different things. [user1@hoho6 ~]$ systemctl --failed UNIT LOAD ACTIVE SUB JOB DESCRIPTION prefdm.service loaded failed failed Display Manager LOAD = Reflects whether the unit definition was properly loaded. ACTIVE = The high-level unit activation state, i.e. generalization of SUB. SUB = The low-level unit activation state, values depend on unit type. JOB = Pending job for the unit. 1 units listed. Pass --all to see inactive units, too. [user1@hoho6 ~]$ systemctl --list-jobs systemctl: unrecognized option '--list-jobs' [user1@hoho6 ~]$ systemctl list-jobs JOB UNIT TYPE STATE 1 graphical.target start waiting 2 multi-user.target start waiting 118 plymouth...t-wait.service start running 121 systemd-...nlevel.service start waiting 128 systemd-...ead-done.timer start waiting 5 jobs listed. [user1@hoho6 ~]$ [root@hoho6 log]# grep prefdm messages | tail Mar 20 12:33:35 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 20 12:33:35 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 20 12:33:35 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 20 12:33:35 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 20 12:33:35 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 20 12:33:35 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 20 12:33:35 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 20 12:33:35 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 20 12:33:35 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 20 12:33:35 hoho6 systemd[1]: prefdm.service start request repeated too quickly, refusing to start. [root@hoho6 log]# Looks like a race condition..
Created attachment 571501 [details] This is the output of the ssh session on the initiating system. This initiating system is also a F16 system, but on a slower dual processor. It does not have the hang symptoms on reboot.
When I rebooted again (using /sbin/shutdown -r now), the system hung again at the Fedora splash.
Hmm, after ssh and startx on alternative system, console on affected system DID have the login dialog box. (but before the ssh and startx, it was hung at splash screen).
Another experiment - I went back to my alternative initiating system and hit the control C. The 'startx' session quit. When I went back to the problem system, I see the Fedora splash screen. The open gnome session I had previously was gone. I went back to the alternative system and typed 'startx' again. Coming back to the problem system, it is logged on, but with no application windows - just the same as if I had logged out on the problem system. So, another experiment - will log out of the problem system (just log out, not reboot). Results coming soon to a screen near you.
The result of logging out - the hung fedora splash screen. I am attaching the output from the terminal of the alternative initiating system. Perhaps there are some clues.
Created attachment 571503 [details] Terminal output including control C and restart and log out. This terminal output includes the previous terminal output, but also the control C and restart and log out from the affected system side. hoho6 is the affected system (with ssh server) hoho0 is the slower initiating system with the ssh client.
This time, after giving 'startx' on initiating system ssh terminal session, the affected system console screen showed the user already logged in - no login dialog window. Also, in Comment #4, paragraph 2, I hadn't actually logged out, but had returned to the blank Gnome console screen (logged in as user), after the startx was given on the alternative initiating system ssh terminal session.
If my screen times out (as opposed to clicking on the Log Out..), the Password dialog does appear (not the Login dialog).
(In reply to comment #3) > Hmm, after ssh and startx on alternative system, console on affected system DID > have the login dialog box. (but before the ssh and startx, it was hung at > splash screen). This could have been the 'Password' dialog box, not the Login dialog window. It may have timed out before I looked at the console screen after the startx.
Following up on the hypothesis of a race condition, how can I 'turn off' a couple of cores in my quad-core processor? The turn off would have to survive a reboot. I could take the processor from hoho0 and put it into hoho6 - they are both 775 sockets - but that would be a bit too invasive. New heat sink grease and all that.
Some commands from an old Leonnart Poettering email in systemd-devel [root@hoho6 user1]# cp /lib/systemd/system/prefdm.service /etc/systemd/system [root@hoho6 user1]# cd /etc/systemd/system [root@hoho6 system]# ls basic.target.wants graphical.target.wants bluetooth.target.wants multi-user.target.wants dbus-org.freedesktop.Avahi.service prefdm.service dbus-org.freedesktop.NetworkManager.service printer.target.wants default.target sockets.target.wants default.target.wants sysinit.target.wants getty.target.wants [root@hoho6 system]# vim prefdm.service -- Add StandardOutput=syslog to [Service] section [root@hoho6 system]# systemctl status prefdm.service prefdm.service - Display Manager Loaded: loaded (/lib/systemd/system/prefdm.service; static) Active: failed since Tue, 20 Mar 2012 14:27:04 -0500; 4h 39min ago Process: 1169 ExecStart=/etc/X11/prefdm -nodaemon (code=exited, status=2) CGroup: name=systemd:/system/prefdm.service [root@hoho6 system]# systemctl daemon-reload [root@hoho6 system]# systemctl start prefdm.service [root@hoho6 system]# systemctl status prefdm.service prefdm.service - Display Manager Loaded: loaded (/lib/systemd/system/prefdm.service; static) Active: failed since Tue, 20 Mar 2012 19:07:08 -0500; 16s ago Process: 3116 ExecStart=/etc/X11/prefdm -nodaemon (code=exited, status=2) CGroup: name=systemd:/system/prefdm.service [root@hoho6 system]# [root@hoho6 log]# tail -30 messages Mar 20 19:04:30 hoho6 dbus-daemon[1032]: dbus[1032]: [system] Activating service name='net.reactivated.Fprint' (using servicehelper) Mar 20 19:04:30 hoho6 dbus-daemon[1032]: Launching FprintObject Mar 20 19:04:30 hoho6 dbus[1032]: [system] Successfully activated service 'net.reactivated.Fprint' Mar 20 19:04:30 hoho6 dbus-daemon[1032]: dbus[1032]: [system] Successfully activated service 'net.reactivated.Fprint' Mar 20 19:04:30 hoho6 dbus-daemon[1032]: ** Message: D-Bus service launched with name: net.reactivated.Fprint Mar 20 19:04:30 hoho6 dbus-daemon[1032]: ** Message: entering main loop Mar 20 19:05:00 hoho6 dbus-daemon[1032]: ** Message: No devices in use, exit Mar 20 19:06:54 hoho6 systemd[1]: Reloading. Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 20 19:07:07 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 20 19:07:07 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 20 19:07:07 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 20 19:07:07 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 20 19:07:07 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 20 19:07:08 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 20 19:07:08 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 20 19:07:08 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 20 19:07:08 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 20 19:07:08 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 20 19:07:08 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 20 19:07:08 hoho6 systemd[1]: prefdm.service start request repeated too quickly, refusing to start. Mar 20 19:07:14 hoho6 systemd[1]: plymouth-quit-wait.service operation timed out. Terminating. Mar 20 19:07:14 hoho6 systemd[1]: Unit plymouth-quit-wait.service entered failed state. Mar 20 19:07:14 hoho6 systemd[1]: Startup finished in 3s 153ms 655us (kernel) + 2s 745ms 58us (initrd) + 4h 40min 17s 271ms 325us (userspace) = 4h 40min 23s 170ms 38us. [root@hoho6 log]# Anything interesting?
I am running rvm Perhaps this has a bearing on the problem? Or perhaps this problem has a bearing on the rvm problem. https://github.com/wayneeseguin/rvm/issues/835
The issue certainly looks similar. Does it help if you move /etc/profile.d/rvm.sh away?
Yes, that is it. cd /etc/profile.d mv rvm.sh rvm.sh.old Reboot - and almost everything is fine.
Good to know. To summarize the problem: 1) /etc/X11/prefdm executes /usr/sbin/gdm 2) /usr/sbin/gdm is a shell script running in POSIX mode (it has #!/bin/sh) 3) /usr/sbin/gdm sources /etc/profile 4) /etc/profile sources /etc/profile.d/rvm.sh 5) rvm.sh depends on non-POSIX bash functionality, so a syntax error is hit. The whole thing quits with error code 2. I see at least 2 issues that made this problem needlessly difficult to debug: - /etc/X11/prefdm executes the display manager with ">/dev/null 2>&1", thus hiding the error messages. - /etc/profile sources the files /etc/profile.d/*.sh with: . "$i" >/dev/null 2>&1 again hiding the error messages. prefdm is owned by initscripts. Reassigning.
I cloned this for the packages setup and gdm as bug 805507 and bug 805517.
This is more or less intentional; it's executing, in order, fallbacks from a list of display managers. If we don't throw the error messages out, everyone will get error messages about display managers that don't exist on the system in question, in cases where them not existing isn't an error.
But: - it does the redirection even when running $preferred. - for the fallbacks it could do: [ -x /usr/sbin/gdm ] && exec /usr/sbin/gdm "$@" and similar.
It could, although that means it's tied into whatever directories the DM happens to live in, even if it moves. Obviously, this should go away with systemd-ification of display managers in F18.
(In reply to comment #19) > It could, although that means it's tied into whatever directories the DM > happens to live in, even if it moves. Good point. You might like this then: type -P gdm >/dev/null && exec gdm "$@" > Obviously, this should go away with systemd-ification of display managers in > F18. Yes, hopefully. You can decide whether to use the "type -P" guards, or just WONTFIX this for F<18.
I think I'm going to WONTFIX this - trying to catch an error in profile.d shell scripts this way isn't a case to go out of the way to rearchitect and fix.
what's the conntent of /etc/sysconfig/desktop ? i got this issue also, but i fixed, cuz i allways install Fedora from minimal packages options on install stage and every time i forgot to config /etc/sysconfig/desktop, i think this skips firstlogin and other setup which could be the issue of not having /etc/sysconfig/desktop created which could lead to this "bug") [root@lxvz9mv901 ~]# uname -a Linux lxvz9mv901 3.3.7-1.fc17.i686.PAE #1 SMP Mon May 21 22:42:05 UTC 2012 i686 i686 i386 GNU/Linux [root@lxvz9mv901 ~]# rpm -q openbox lxdm openbox-3.5.0-5.fc17.i686 lxdm-0.4.1-1.fc17.i686 [root@lxvz9mv901 ~]# and i got solved as soon as created /etc/sysconfig/desktop here you have mine: [root@lxvz9mv901 ~]# cat /etc/sysconfig/desktop PREFERRED=/usr/bin/openbox-session DISPLAYMANAGER=/usr/sbin/lxdm [root@lxvz9mv901 ~]#