Description of problem:
After applying the current rawhide updates on my test system an attempt to boot quickly produces a series of messages:
systemd-fstab-generator: Failed to create unit file: File exists
and it goes downhill from there. Nearly everything reports "Dependency failed for ..." and after a delay of close to two minutes I get:
Welcome to emergency mode. Use "systemctl default" or ^D to enter
Press enter for maintenance(or type Control-D to continue):
"Control-D" brings more of the same, so it is of no use. "Maintenance" reveals that only / and /usr are mounted at all. Typing 'mount -a' there mounts
the other file systems, and starting the network lets the boot continue, after a fashion and by itself, until it brings up a graphical login screen. An attempt to log in there results only in a "System is going down" alert and nothing more.
The same message shows up when logging in remotely, while the system is NOT going down by any stretch of the imagination.
While there, 'systemctl --all --failed' reports only sendmail.service and yum-updatesd.service, while 'systemctl --all | grep dead' shows 82 various services (output attached). All in all, not a very usable outcome.
Version-Release number of selected component (if applicable):
On every attempt to boot
The above is with the 3.4.0-0.rc4.git2.1.fc18.x86_64 kernel, which I can still boot. Trying 3.5.0-... kernels causes some "difficulties" (see bug 840235).
Created attachment 598265
systemd related messages from dmesg (3.4.0-0.rc4.git2.1.fc18.x86_64 kernel)
Created attachment 598266
results of 'systemctl --all | grep dead'
As an added attraction, "poweroff" in the state described in this report does cut off access to the system, but the system is never actually powered off.
Bug 840235 is fixed by kernel 3.5.0-1.fc18.x86_64. This does not help with systemd, now at systemd-187-1.fc18, which is as broken as it was before: it fails to mount most of the disks and consequently fails on nearly everything.
Hm, we should add the file name to this error message.
Please attach a more detailed dmesg, as described in:
Created attachment 601830
dmesg for 3.6.0-0.rc0.git2.1.fc18.x86_64 with systemd debugging info
> Please attach a more detailed dmesg ...
Here we go! This part runs from the start to the moment when the 'Welcome to emergency mode. ...' prompt shows up and that mode is entered.
If at this moment one tries 'systemctl list-jobs', no jobs are listed and only lines like these:
[ 252.676321] systemd: Accepted connection on private bus.
[ 252.702988] systemd: Got D-Bus request: org.freedesktop.systemd1.Manager.ListJobs() on /org/freedesktop/systemd1
[ 252.731174] systemd: Got D-Bus request: org.freedesktop.DBus.Local.Disconnected() on /org/freedesktop/DBus/Local
show up on a console.
Created attachment 601831
remainder of a dmesg output after a "manual intervention"
Typing 'mount -a; exit' in the "emergency shell" allows the boot to proceed, although on the resulting system non-root logins fail, presumably because "System is going down", while really it is not. In particular, this means that, say, gdm-220.127.116.11-2.fc18 starts, tries to run a gnome-shell session for the 'gdm' user, and immediately exits because the latter fails due to "System is going down".
Before the series of rawhide updates described here, this system came up without major issues.
Created attachment 601832
fstab from an affected test system
/bin, /lib, etc. were converted to symlinks into /usr/ quite a while ago, and before the failure to boot happens / and /usr do get mounted; only the other file systems are left in a funk.
Just updated to systemd-187-3.fc18 and kernel 3.6.0-0.rc0.git6.1.fc18.x86_64. Not a surprise, but nothing has changed in the situation described in this report.
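For anyone who wants to check the same thing on their own system, here is a minimal sketch, not taken from this report, that lists fstab mount points missing from the currently mounted set. The function names are made up for illustration. It assumes the usual whitespace-separated layout of /etc/fstab and /proc/mounts, skips swap and "noauto" entries, and ignores the octal escapes /proc/mounts uses for spaces in paths.

```python
def fstab_mount_points(fstab_text):
    """Return mount points that fstab asks to be mounted at boot."""
    points = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed entry, ignore it in this sketch
        mount_point, fstype, options = fields[1], fields[2], fields[3]
        if fstype == "swap" or not mount_point.startswith("/"):
            continue  # swap has no mount point
        if "noauto" in options.split(","):
            continue  # not expected to be mounted at boot
        points.append(mount_point)
    return points


def mounted_points(mounts_text):
    """Return the set of mount points listed in /proc/mounts-style text."""
    mounted = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            mounted.add(fields[1])
    return mounted


def missing_mounts(fstab_text, mounts_text):
    """Return fstab mount points that are not currently mounted, in fstab order."""
    mounted = mounted_points(mounts_text)
    return [p for p in fstab_mount_points(fstab_text) if p not in mounted]
```

To use it, read /etc/fstab and /proc/mounts into strings and pass both to missing_mounts(); in the emergency shell described here the result would be expected to contain everything except / and /usr.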
A few observations I made so far:
- This is what eventually leads to the switch to the emergency mode:
[ 109.888141] systemd: Job dev-disk-by\x2dlabel-opt1.device/start timed out.
- We seem to have a problem with escaping the name of the affected device unit:
[ 19.860537] systemd: Installed new job dev-disk-by\x2dlabel-opt1.device/start as 57
[ 49.497121] systemd: dev-disk-by\x2dlabel-\x5cx2fopt1.device changed dead -> plugged
See that the names came out differently in the two cases. That's why systemd
failed to detect that the device was already available.
We should be able to reproduce this by referring to disks using labels.
- You have some old udev rules present that still refer to hal, which is obsolete
since F16, I believe:
[ 198.061563] systemd-udevd: failed to execute '/usr/lib/udev/socket:@/org/freedesktop/hal/udev_event' 'socket:@/org/freedesktop/hal/udev_event': No such file or directory
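To illustrate how the two different unit names noted above can come out for one device, here is a simplified sketch of systemd-style path escaping. It is an approximation written for illustration, not systemd's actual code, and the function names are hypothetical. It assumes the file system label is "/opt1", so that udev names the symlink /dev/disk/by-label/\x2fopt1 with the slash already encoded; escaping that path once more turns the backslash itself into \x5c.

```python
import string

# Characters copied through unchanged. This set is an assumption modelled on
# systemd's unit-name rules; "-" and "\" are deliberately not in it.
SAFE = set(string.ascii_letters + string.digits + ":_.")


def escape_path(path):
    """Escape a file system path roughly the way systemd builds unit names."""
    trimmed = path.strip("/")
    if not trimmed:
        return "-"  # the root directory maps to a single dash
    out = []
    for byte in trimmed.encode("utf-8"):
        ch = chr(byte)
        if ch == "/":
            out.append("-")  # path separators become dashes
        elif ch in SAFE:
            out.append(ch)
        else:
            out.append("\\x%02x" % byte)  # everything else becomes \xNN
    return "".join(out)


def device_unit(path):
    """Return the .device unit name for a device node path."""
    return escape_path(path) + ".device"


# Path with the slash of the label dropped:
print(device_unit("/dev/disk/by-label/opt1"))
# -> dev-disk-by\x2dlabel-opt1.device

# Path as udev would name the symlink for the assumed label "/opt1":
print(device_unit(r"/dev/disk/by-label/\x2fopt1"))
# -> dev-disk-by\x2dlabel-\x5cx2fopt1.device
```

Under that assumption, the first name matches the job that timed out and the second matches the unit that changed to "plugged", so the job would wait for a unit name that never appears.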
(In reply to comment #10)
> A few observations I made so far:
> - This is what eventually leads to the switch to the emergency mode:
Recently on email@example.com somebody was complaining that he ended up in "emergency mode" after attempting a fresh rawhide installation. I am afraid there were no details and no description of any attempts to find an explanation. I did not save a copy of that message and now I cannot find it. Sigh!
> - You have some old udev rules present that still refer to hal, which is
Yes, but I was not adding or removing udev rules myself while going through various updates. One of the goals of this test system is to see how it holds up over time.
> [ 198.061563] systemd-udevd: failed to execute
> 'socket:@/org/freedesktop/hal/udev_event': No such file or directory
Is this really significant? "198.061563" is well past the initial failure to mount disk partitions. It is also quickly followed by "systemd-udevd.service changed start -> running". In any case, what was supposed to clean up such leftovers?
BTW - I do not know if this is related, but one component of this mess is that
sendmail.service never starts. It tries to, for quite a while, but eventually it always fails with "Active: failed (Result: timeout) ...".
systemd-188-3.fc18 has the same problem as systemd-187-3.fc18, i.e. it fails to mount most of the disk-based file systems.
In case somebody asks: I made sure that /var/run and /var/lock are symbolic links. After I reached a shell prompt, 'systemctl --failed' gives me
systemd-...es-setup.service loaded failed failed Recreate Volatile Files and Directories
yum-updatesd.service loaded failed failed YUM Package Update Service
sendmail.service is turned off at this moment.
(In reply to comment #10)
> - We seem to have a problem with escaping the name of the affected device
> [ 19.860537] systemd: Installed new job
> dev-disk-by\x2dlabel-opt1.device/start as 57
AFAICS systemd-194-1.fc18 does not have that problem anymore, and it is possible to boot with it without manual intervention at this point.