Bug 1300017 - dracut created initramfs fails to boot (zfs)
Product: Fedora
Classification: Fedora
Component: dracut
x86_64 Linux
Assigned To: dracut-maint-list
Fedora Extras Quality Assurance
Reported: 2016-01-19 13:42 EST by Kostya Berger
Modified: 2016-08-22 05:15 EDT
Last Closed: 2016-08-22 05:15:28 EDT
Description Kostya Berger 2016-01-19 13:42:41 EST
Description of problem:
Fedora 21 upgraded to 23 (root on ZFS, whole disk). Every new kernel whose initramfs was created on Fedora 23 fails to boot.

First, it drops into the dracut shell without mounting root, and without ANY warning or emergency logs, which dracut usually produces when it fails to mount root. It also drops immediately, without waiting for the 5-minute timeout, as it would when failing to find or mount root, for example. The root option on the kernel command line is root=zfs:mypool/ROOT/linux ...
Evidently there is no problem importing or mounting the pool and root, since this is easily done from the shell via # zpool import $mypool.
Then, upon exiting the shell after importing the pool, the system, instead of continuing the normal boot process, reports "failed to start default.target: the target is destructive" with no further explanation. It doesn't hang, though; it just stops at this point with some messages from audit.
Ctrl+Alt+Del then reboots the system in the normal way.

Version-Release number of selected component (if applicable):
dracut version: 043-63.git20151211.fc23
systemd version: 222-12

How reproducible:
Obviously one will have to copy my configuration: a ZFS pool on a whole unpartitioned disk with Fedora 23 on it. Then install SPL + ZFS etc. and create the initramfs using the dracut CLI, et voilà:

dracut --force -a zfs /boot/initramfs-$version $kern_version

(With whole-disk ZFS, no partition/filesystem modules are needed except those already in the kernel.)

Steps to Reproduce:
1. As above

Actual results:
System cannot be booted with F23-created initramfs, only with those remaining from the old F21 installation.

Expected results:
Normal boot as it used to boot all these years...

Additional info:
ZFS and SPL are installed from Git (ZoL). For some reason, the RPMs installed from RPM-Fusion leave the system unable to compile the zfs kmod for the given kernel. But that's a different story...
Comment 1 Kostya Berger 2016-01-19 14:23:52 EST
Forgot to mention, this may be important: 
The previous version of dracut had a problem caused by /usr/lib/dracut/modules.d/98dracut-systemd/rootfs-generator.sh. It created the file /usr/lib/dracut/initqueue/finished/devesists-$(_name).sh in the initramfs, pointing to the root device (as dracut evidently believed) in /dev/disk/by-uuid/***, while in reality my whole-disk zpool is linked to a device name in /dev/disk/by-id.

With that version of the dracut scripts, I applied a dirty fix: I manually removed from rootfs-generator.sh (where the functions are defined) the commands that create $hookdir/initqueue/finished/devesists-$(_name).sh. That file caused the system to hang on boot: for some reason /dev/disk/by-uuid failed to appear within the 5-minute timeout set for the dracut shell to do its job. A separate problem, I know...

Now, with the present version of dracut, this devexists***sh file is NOT created in the $hookdir/initqueue/finished directory. There is only zfs.sh in there... Is this possibly the cause of the problem? Does the system refuse to boot because that file is somehow considered essential?
Comment 2 Kostya Berger 2016-01-19 14:25:46 EST
/usr/lib/dracut/initqueue/finished/devexists-$(_name).sh is the correct name of the file, sorry.
So now dracut drops into the shell without even having started the mount hooks.
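For readers unfamiliar with the mechanism discussed in the two comments above: scripts placed under initqueue/finished act as "are we done waiting?" checks, and a devexists-style script simply tests for a device node. The following is a simplified model of that idea, not dracut's actual code. The function name all_finished, the temporary directory, and the by-uuid path are made up for illustration.

```shell
#!/bin/sh
# Simplified model of a "finished" check. NOT dracut's actual code.
# Every *.sh file in the given directory must succeed for the wait to end.
all_finished() {
    dir=$1
    for hook in "$dir"/*.sh; do
        # Unmatched glob (no hook files at all): nothing to wait for.
        [ -e "$hook" ] || continue
        # Source each hook in a subshell so it cannot alter this shell.
        ( . "$hook" ) || return 1
    done
    return 0
}

# Demo: a throwaway directory stands in for $hookdir/initqueue/finished.
demo_dir=$(mktemp -d)
# Hypothetical hook that waits for a device node which never appears.
echo '[ -e /dev/disk/by-uuid/does-not-exist ]' > "$demo_dir/devexists-demo.sh"

if all_finished "$demo_dir"; then
    echo "finished: continue booting"
else
    echo "not finished: keep waiting"
fi
rm -rf "$demo_dir"
```

In this model, a hook that points at the wrong device path keeps the wait going until a timeout, which matches the hang described for the older dracut version. An empty directory, by contrast, never blocks.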
Comment 3 Harald Hoyer 2016-01-26 06:05:58 EST
What is your kernel command line?

What is the output of:
# dracut --print-cmdline
Comment 4 Kostya Berger 2016-01-26 08:10:42 EST
Actually it's "root=zfs:mypool/ROOT/linux ro systemd.unit=graphical.target"

This used to work.
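As an illustration of how a root=zfs:... argument like the one above can be picked apart, here is a minimal sketch. The helper names get_root_arg and split_zfs_root are hypothetical and are not part of dracut or ZFS on Linux; the real hooks may parse the command line differently.

```shell
#!/bin/sh
# Hypothetical helpers, for illustration only.

# Print the value of the last root= argument found in the given command line.
# Uses plain whitespace splitting; quoted arguments are not handled.
get_root_arg() {
    root_value=""
    for arg in $1; do
        case "$arg" in
            root=*) root_value="${arg#root=}" ;;
        esac
    done
    printf '%s\n' "$root_value"
}

# For a zfs:<pool>/<path> value, print "<pool> <dataset>"; fail otherwise.
split_zfs_root() {
    case "$1" in
        zfs:?*) dataset="${1#zfs:}" ;;
        *) return 1 ;;
    esac
    pool="${dataset%%/*}"
    printf '%s %s\n' "$pool" "$dataset"
}

# Command line as quoted in this report.
cmdline="root=zfs:mypool/ROOT/linux ro systemd.unit=graphical.target"
root_arg=$(get_root_arg "$cmdline")
split_zfs_root "$root_arg"
```

On a running system the same functions could be fed from the live command line, for example get_root_arg "$(cat /proc/cmdline)".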
Comment 5 Harald Hoyer 2016-01-26 09:11:05 EST
dracut does not support zfs natively, so there must be some dracut zfs plugin module. Please talk to the maintainers of this.

The output of:
# rpm -qf /usr/lib/dracut/modules.d/* | sort -u

might help in finding the corresponding rpm.
Comment 6 Kostya Berger 2016-01-26 09:33:54 EST
Yes, there is a dracut zfs mount hook provided with ZFS on Linux, and I'm using it. I already asked this question on the ZoL mailing list; they're eager to have some answers from the dracut people...

Besides, systemd also comes into play here. And how can I possibly debug systemd issues with dracut? All this talk of "targets" that are "destructive" is systemd language...

Anyway, I can still try git bisect on the dracut code. Then we'll at least see whether the problem is indeed with dracut, or maybe with systemd, which wouldn't surprise me at all.
Comment 7 Harald Hoyer 2016-01-26 09:46:15 EST
Is the zfs kernel module in the initramfs?
# lsinitrd | fgrep zfs

If not, maybe you just have to recreate the initramfs after you build your custom kernel with:
# dracut -f --kver <kernel-version>
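The check suggested above can be made a little stricter than a plain fgrep, which also matches scripts and directories with "zfs" in their names. Below is a rough sketch that looks only for the module file itself. The function name has_zfs_module is hypothetical, the sample listing line is invented (KVER is a placeholder, not a real kernel version), and the pattern assumes the module is named zfs.ko, optionally with a compression suffix.

```shell
#!/bin/sh
# Read an initramfs file listing on stdin (for example the output of lsinitrd)
# and succeed only if a zfs kernel module file appears in it.
has_zfs_module() {
    grep -Eq '/zfs\.ko(\.[a-z0-9]+)?$'
}

# Invented sample line in the style of a long file listing.
sample_listing='-rw-r--r-- 1 root root 100 Jan 19 13:42 usr/lib/modules/KVER/extra/zfs/zfs.ko'

if printf '%s\n' "$sample_listing" | has_zfs_module; then
    echo "zfs.ko found"
else
    echo "zfs.ko missing: rebuild with dracut -f --kver <kernel-version>"
fi
```

Against a real image this would be used as lsinitrd | has_zfs_module.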
Comment 8 Kostya Berger 2016-01-26 10:04:31 EST
Please reread my first post here. Boot hangs until the dracut 5-minute timeout runs out, then drops into the shell. At that point zfs.ko is loaded and all zfs utils are fully functional.
With earlier versions (Fedora 21), when I had problems of this sort, I would just manually import the pool and log out of the emergency shell. Normal boot would then resume and the system would start all right. Those problems were related to my zpool being associated with a device string in /dev/disk/by-partuuid, which is not standard with ZFS on Linux. Additional command line options had to be added to mount-zfs.sh, the dracut mount hook supplied with the ZoL package.

This time the startup scripts don't have that problem. When dropped into the emergency shell, I can import the target zpool manually without any difficulties or special custom command line options whatsoever. Just "zpool import $mypool" does it!

But when I log out of the emergency shell to let the system resume the boot process, it JUST refuses to use that kind of root and says "the target is destructive". This is how I read it.
Comment 9 Kostya Berger 2016-04-25 04:48:41 EDT
OK, this bug is not due to Fedora/Red Hat provided code. It's actually due to the ZFS on Linux code, which adds zfs-capable scripts and systemd services.

These scripts take over once a "zfs:*" root is detected; they replace any Fedora dracut code as needed, or add what is missing. I have pinpointed exactly what they fail to do.

Once the pool is imported and mounted (the ZoL part of the job), invoking `systemctl start initrd-switch-root.service` resumes normal boot.

So I guess this can be closed here as unrelated to the Red Hat/Fedora provided dracut code.
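The manual workaround from this report, written out as a small script for anyone landing here with the same symptom. This is only a sketch under assumptions: the pool name mypool comes from the root= argument quoted earlier, the step and resume_boot helpers are hypothetical, and the root dataset is assumed to end up mounted where the initramfs expects it (the report does not spell out that step). It is a dry run by default and only prints the commands.

```shell
#!/bin/sh
# Sketch of the manual workaround from this report. Dry run unless RUN=1.
POOL=${POOL:-mypool}   # pool name taken from the report's root= argument
RUN=${RUN:-0}

# Print a command, and execute it only when RUN=1.
step() {
    echo "+ $*"
    if [ "$RUN" = "1" ]; then
        "$@"
    fi
}

resume_boot() {
    # Import the pool by hand, as done from the dracut emergency shell.
    step zpool import "$POOL" || return 1
    # Then trigger the switch to the real root, which resumed normal boot.
    step systemctl start initrd-switch-root.service
}

resume_boot
```

Running it with RUN=1 inside the emergency shell would execute the two commands for real; whether that is sufficient on a given system depends on the ZoL hooks having mounted the root dataset.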
