Bug 803574
Summary: | After update, F15 boots to Fedora splash and hangs - workaround though | ||||||||
---|---|---|---|---|---|---|---|---|---|
Product: | [Fedora] Fedora | Reporter: | Bob Gustafson <bobgus> | ||||||
Component: | gdm | Assignee: | Ray Strode [halfline] <rstrode> | ||||||
Status: | CLOSED INSUFFICIENT_DATA | QA Contact: | Fedora Extras Quality Assurance <extras-qa> | ||||||
Severity: | unspecified | Docs Contact: | |||||||
Priority: | unspecified | ||||||||
Version: | 15 | CC: | johannbg, metherid, mschmidt, notting, plautrba, rstrode, systemd-maint | ||||||
Target Milestone: | --- | ||||||||
Target Release: | --- | ||||||||
Hardware: | Unspecified | ||||||||
OS: | Unspecified | ||||||||
Whiteboard: | |||||||||
Fixed In Version: | Doc Type: | Bug Fix | |||||||
Doc Text: | Story Points: | --- | |||||||
Clone Of: | Environment: | ||||||||
Last Closed: | 2012-03-20 16:07:06 UTC | Type: | --- | ||||||
Regression: | --- | Mount Type: | --- | ||||||
Documentation: | --- | CRM: | |||||||
Verified Versions: | Category: | --- | |||||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||||
Cloudforms Team: | --- | Target Upstream Version: | |||||||
Embargoed: | |||||||||
Attachments: |
|
Description
Bob Gustafson
2012-03-15 05:45:45 UTC
(In reply to comment #0) > Fortunately, the system comes up with networking active and a live ssh server. That should make debugging easier. What do these commands say?: systemctl --failed systemctl list-jobs [user1@hoho6 ~]$ systemctl --failed UNIT LOAD ACTIVE SUB JOB DESCRIPTION httpd.service loaded failed failed LSB: start and stop Apache HT prefdm.service loaded failed failed Display Manager LOAD = Reflects whether the unit definition was properly loaded. ACTIVE = The high-level unit activation state, i.e. generalization of SUB. SUB = The low-level unit activation state, values depend on unit type. JOB = Pending job for the unit. 2 units listed. Pass --all to see inactive units, too. [user1@hoho6 ~]$ [user1@hoho6 ~]$ systemctl list-jobs 95 getty.target start waiting 96 getty start waiting 98 getty start waiting 100 getty start waiting 102 getty start waiting 107 getty start waiting 111 plymouth...t-wait.service start running [user1@hoho6 ~]$ I can also give you the output from the ssh session on the other system. Created attachment 570329 [details]
This is the output of the ssh session on the initiating system.
This is the output of the ssh session on the initiating system.
It is still open, so more output might be added as time goes on. If there is an interesting debug command that I can give from the initiating system, let me know.
(In reply to comment #2) > prefdm.service loaded failed failed Display Manager It would be interesting to know why exactly prefdm failed. What does "systemctl status prefdm.service" say? And are there errors about it logged in /var/log/messages? What do you use as your graphical login manager? gdm, kdm, xdm, ...? > [user1@hoho6 ~]$ systemctl list-jobs > 95 getty.target start waiting > 96 getty start waiting > 98 getty start waiting > 100 getty start waiting > 102 getty start waiting > 107 getty start waiting > 111 plymouth...t-wait.service start running It is the responsibility of prefdm.service to tell plymouth to quit. Apparently it failed to do even that. Running "plymouth quit" will likely allow the boot finish. [root@hoho6 user1]# systemctl status prefdm.service prefdm.service - Display Manager Loaded: loaded (/lib/systemd/system/prefdm.service) Active: failed since Wed, 14 Mar 2012 13:36:52 -0500; 1 day and 12h ago Process: 1573 ExecStart=/etc/X11/prefdm -nodaemon (code=exited, status=2) CGroup: name=systemd:/system/prefdm.service [root@hoho6 user1]# This might be interesting to someone: [root@hoho6 user1]# ps ax | grep X 6947 pts/3 S+ 0:00 xinit /etc/X11/xinit/xinitrc -- /usr/bin/X :0 -auth /home/user1/.serverauth.6931 6948 tty3 Ss+ 3:14 /usr/bin/X :0 -auth /home/user1/.serverauth.6931 6952 ? Ss 0:00 /usr/bin/ck-xinit-session /usr/bin/ssh-agent /etc/X11/xinit/Xclients 7075 ? Ss 0:00 /usr/bin/ssh-agent /etc/X11/xinit/Xclients 18900 pts/4 S+ 0:00 grep --color=auto X [root@hoho6 user1]# Graphical login manager is whatever is the default for: [root@hoho6 user1]# uname -a Linux hoho6.chidig.com 2.6.42.9-1.fc15.x86_64 #1 SMP Thu Mar 1 23:30:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux [root@hoho6 user1]# What command should I use to give you the information? Note that I never used the login manager to get into this system. I logged in using ssh on a terminal on another system. When I gave the command startx on this terminal command line, the main monitor screen came alive with that user's stuff. Essentially, I stepped over the problem, so what you see in some of the diagnostic probes might not be that informative. I'm up for any experiments. Should I shut down and come up again with the same procedure, but instead of saying startx, use plymouth quit ? Created attachment 570519 [details]
grep prefdm /var/log/messages
In addition, this file contains prefdm messages from before I was able to log in through ssh. Check the dates. Multiple reboot attempts..
(In reply to comment #5) > Graphical login manager is whatever is the default for: > > [root@hoho6 user1]# uname -a > Linux hoho6.chidig.com 2.6.42.9-1.fc15.x86_64 #1 SMP Thu Mar 1 23:30:23 UTC > 2012 x86_64 x86_64 x86_64 GNU/Linux A kernel version is not related to the login manager. What login manager you have depends on which Fedora live spin you installed, or what packages you selected if you installed using the classical installer. > What command should I use to give you the information? You use Gnome, so chances are you have gdm. Does /usr/sbin/gdm exist? Does /etc/sysconfig/desktop exist, and if yes, what is in it? (In reply to comment #6) > I'm up for any experiments. Should I shut down and come up again with the same > procedure, but instead of saying startx, use plymouth quit ? Yes, you can try that and see what happens. You can also try running /etc/X11/prefdm as root and see what you get. (In reply to comment #7) > Created attachment 570519 [details] > grep prefdm /var/log/messages > > In addition, this file contains prefdm messages from before I was able to log > in through ssh. Check the dates. Multiple reboot attempts.. Are there any interesting lines before each occurrence of "prefdm.service: main process exited"? Note that they may be interesting without containing the "prefdm" string. If you indeed have gdm, you'll find more logs in /var/log/gdm/. Before shutting down and rebooting: [root@hoho6 log]# ls -lZ /usr/sbin/gdm -rwxr-xr-x. root root system_u:object_r:xdm_exec_t:s0 /usr/sbin/gdm [root@hoho6 log]# [root@hoho6 log]# ls -lZ /etc/sysconfig/desktop ls: cannot access /etc/sysconfig/desktop: No such file or directory [root@hoho6 log]# [root@hoho6 log]# cd gdm [root@hoho6 gdm]# ls -lt total 1244 -rw-r--r--. 1 root root 536309 Mar 14 01:49 :0.log -rw-r--r--. 1 root root 185 Mar 14 01:49 :0-slave.log -rw-r--r--. 1 gdm gdm 2281 Jan 12 16:13 :0-greeter.log -rw-r--r--. 1 root root 461293 Jan 12 16:12 :0.log.1 -rw-r--r--. 1 gdm gdm 3229 Nov 14 14:16 :0-greeter.log.1 -rw-r--r--. 1 root root 340 Nov 14 14:16 :0-slave.log.1 -rw-r--r--. 1 root root 102888 Nov 14 14:16 :0.log.2 -rw-r--r--. 1 root root 185 Nov 14 14:16 :0-slave.log.2 -rw-r--r--. 1 gdm gdm 2281 Nov 6 15:39 :0-greeter.log.2 -rw-r--r--. 1 root root 185 Nov 6 14:45 :0-slave.log.3 -rw-r--r--. 1 root root 36208 Nov 6 14:45 :0.log.3 -rw-r--r--. 1 gdm gdm 2281 Nov 6 14:31 :0-greeter.log.3 -rw-r--r--. 1 root root 37875 Nov 6 14:30 :0.log.4 -rw-r--r--. 1 root root 185 Nov 6 14:30 :0-slave.log.4 -rw-r--r--. 1 gdm gdm 2281 Nov 6 14:09 :0-greeter.log.4 -rw-r--r--. 1 root root 27696 May 31 2011 :1.log -rw-r--r--. 1 root root 342 May 31 2011 :1-slave.log -rw-r--r--. 1 gdm gdm 2938 May 31 2011 :1-greeter.log [root@hoho6 gdm]# [root@hoho6 gdm]# tail :0.log (II) Power Button: Close (II) Unloading evdev (II) Power Button: Close (II) Unloading evdev (II) AT Translated Set 2 keyboard: Close (II) Unloading evdev (II) ImExPS/2 Generic Explorer Mouse: Close (II) Unloading evdev (II) NOUVEAU(0): NVLeaveVT is called. (II) NOUVEAU(0): Closed GPU channel 2 [root@hoho6 gdm]# Here is one 'vim -R messages' window starting with the last but one 'prefdm failed' into the next batch of prefdm messages. Mar 14 02:53:13 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 14 02:53:13 hoho6 avahi-daemon[900]: Joining mDNS multicast group on interface virbr0.IPv4 with address 192.168.122.1. Mar 14 02:53:13 hoho6 kernel: [ 20.355996] ADDRCONF(NETDEV_UP): virbr0: link is not ready Mar 14 02:53:13 hoho6 avahi-daemon[900]: New relevant interface virbr0.IPv4 for mDNS. Mar 14 02:53:13 hoho6 avahi-daemon[900]: Registering new address record for 192.168.122.1 on virbr0.IPv4. Mar 14 02:53:13 hoho6 dnsmasq[1550]: started, version 2.52 cachesize 150 Mar 14 02:53:13 hoho6 dnsmasq[1550]: compile time options: IPv6 GNU-getopt DBus no-I18N DHCP TFTP Mar 14 02:53:13 hoho6 dnsmasq-dhcp[1550]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h Mar 14 02:53:13 hoho6 dnsmasq[1550]: reading /etc/resolv.conf Mar 14 02:53:13 hoho6 dnsmasq[1550]: using nameserver 192.168.49.41#53 Mar 14 02:53:13 hoho6 dnsmasq[1550]: read /etc/hosts - 2 addresses Mar 14 02:53:13 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 14 02:53:13 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 14 02:53:13 hoho6 systemd[1]: Unit prefdm.service entered failed state. I will do the reboot now and do your other experiments OK, I rebooted the problematic F15 system. Courtesy of the new Gnome, there is no 'restart' command available, even with the Alt key depressed, so I had to do '/sbin/shutdown -r now' in a terminal.. The system came up with the full Fedora logo on the console screen. Going to my alternative (F16) system, I logged in using ssh as before: [root@hoho0 ~]# ssh 192.168.49.51 root.49.51's password: Last login: Wed Mar 14 17:24:34 2012 from hoho6.chidig.com [root@hoho6 ~]# systemctl status prefdm.service prefdm.service - Display Manager Loaded: loaded (/lib/systemd/system/prefdm.service) Active: failed since Fri, 16 Mar 2012 12:44:41 -0500; 4min 32s ago Process: 1499 ExecStart=/etc/X11/prefdm -nodaemon (code=exited, status=2) CGroup: name=systemd:/system/prefdm.service [root@hoho6 ~]# [root@hoho6 ~]# systemctl --failed UNIT LOAD ACTIVE SUB JOB DESCRIPTION httpd.service loaded failed failed LSB: start and stop Apache HT prefdm.service loaded failed failed Display Manager LOAD = Reflects whether the unit definition was properly loaded. ACTIVE = The high-level unit activation state, i.e. generalization of SUB. SUB = The low-level unit activation state, values depend on unit type. JOB = Pending job for the unit. 2 units listed. Pass --all to see inactive units, too. [root@hoho6 ~]# [root@hoho6 ~]# plymouth quit [root@hoho6 ~]# What I saw on the console screen after doing 'plymouth quit' was a blank screen with a blinking underline cursor at the top left. I typed 'Alt-F2' and was rewarded with a login prompt. I put in my user name and password and now have a command line active on the console screen. No graphical interface though. No Gnome3 (is that bad..), No gdm. From the ssh terminal, I typed: [root@hoho6 ~]# /etc/X11/prefdm [root@hoho6 ~]# It seemed to run and then immediately exit. There is no change on the console screen. When I log out of my user on the console screen, an immediate login request line comes back. What next to do? Looking at the bottom lines of: [root@hoho6 log]# grep prefdm messages Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 16 12:44:41 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 16 12:44:41 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 16 12:44:41 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service start request repeated too quickly, refusing to start. [root@hoho6 log]# date Fri Mar 16 13:10:45 CDT 2012 [root@hoho6 log]# It appears that the most recent '/etc/X11/prefdm' doesn't appear in the log/messages. I will try again to see if that is correct [root@hoho6 log]# date Fri Mar 16 13:13:13 CDT 2012 [root@hoho6 log]# /etc/X11/prefdm [root@hoho6 log]# [root@hoho6 log]# grep prefdm messages | tail Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 16 12:44:41 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 16 12:44:41 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service: main process exited, code=exited, status=2 Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service holdoff time over, scheduling restart. Mar 16 12:44:41 hoho6 systemd[1]: Unit prefdm.service entered failed state. Mar 16 12:44:41 hoho6 systemd[1]: prefdm.service start request repeated too quickly, refusing to start. [root@hoho6 log]# Yes, the terminal '/etc/X11/prefdm' doesn't add another prefdm line to messages Duplicating the commands shown in Comment #10 on a satisfactorily running F16 system to see what the output 'should be' (more or less) [root@hoho0 user1]# systemctl status prefdm.service prefdm.service - Display Manager Loaded: loaded (/lib/systemd/system/prefdm.service; static) Active: active (running) since Fri, 16 Mar 2012 17:03:06 -0500; 7h ago Main PID: 966 (gdm-binary) CGroup: name=systemd:/system/prefdm.service ├ 966 /usr/sbin/gdm-binary -nodaemon ├ 978 /usr/libexec/gdm-simple-slave --display-id /org/gn... ├ 982 /usr/bin/Xorg :0 -background none -verbose -auth /... └ 1201 /usr/bin/gnome-keyring-daemon --daemonize --login [root@hoho0 user1]# [root@hoho0 user1]# systemctl --failed UNIT LOAD ACTIVE SUB JOB DESCRIPTION httpd.service loaded failed failed The Apache HTTP Server (prefork MPM udev-settle.service loaded failed failed udev Wait for Complete Device Initi LOAD = Reflects whether the unit definition was properly loaded. ACTIVE = The high-level unit activation state, i.e. generalization of SUB. SUB = The low-level unit activation state, values depend on unit type. JOB = Pending job for the unit. 2 units listed. Pass --all to see inactive units, too. [root@hoho0 user1]# I wonder what is the effect of the failed udev-settle.service. I may just skip the F15 system with a problem and backup everything and go to F16 from F15. (There is an agony with RAID1 on /boot, but there is always a RAID problem at the beginning of each new release - I can deal with it, but a broken systemd is another thing). (In reply to comment #10) > From the ssh terminal, I typed: > [root@hoho6 ~]# /etc/X11/prefdm > [root@hoho6 ~]# > > It seemed to run and then immediately exit. Try running /usr/sbin/gdm directly. Hopefully it will be more verbose. systemd did everything it could to spawn the service. That it fails is not systemd's fault. Reassigning to gdm. I don't have that F15 anymore, went to F16. As I had mentioned before, installing F on RAID has always been a problem, and one that I was trying to postpone by hanging on to F15 for awhile longer. I was able to overcome the RAID1 installation problems. The key (I think) is to create a new partition table, making sure it is gpt instead of mdos. Anaconda will give up easily with gpt and go to mdos if there is anything like a mdos partition table on the disks. This will doom further effort. I should write this up in a separate bug report, not here. (In reply to comment #14) > I don't have that F15 anymore, went to F16. No way to debug this further then. Closing. Not so fast. The problem now exists on my newly installed F16 system. I was able to do stuff for a day, then rebooted. Now have same symptoms as originally reported with the F15 system. [user1@hoho6 ~]$ systemctl status prefdm.service prefdm.service - Display Manager Loaded: loaded (/lib/systemd/system/prefdm.service; static) Active: failed since Tue, 20 Mar 2012 12:33:35 -0500; 9min ago Process: 1230 ExecStart=/etc/X11/prefdm -nodaemon (code=exited, status=2) CGroup: name=systemd:/system/prefdm.service [user1@hoho6 ~]$ [user1@hoho6 ~]$ systemctl --failed UNIT LOAD ACTIVE SUB JOB DESCRIPTION prefdm.service loaded failed failed Display Manager LOAD = Reflects whether the unit definition was properly loaded. ACTIVE = The high-level unit activation state, i.e. generalization of SUB. SUB = The low-level unit activation state, values depend on unit type. JOB = Pending job for the unit. 1 units listed. Pass --all to see inactive units, too. [user1@hoho6 ~]$ [user1@hoho6 ~]$ systemctl list-jobs JOB UNIT TYPE STATE 1 graphical.target start waiting 2 multi-user.target start waiting 118 plymouth...t-wait.service start running 121 systemd-...nlevel.service start waiting 128 systemd-...ead-done.timer start waiting 5 jobs listed. [user1@hoho6 ~]$ I will create a new bug for F16 and reference this one. |