Bug 1300017 - dracut created initramfs fails to boot (zfs)
Summary: dracut created initramfs fails to boot (zfs)
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Fedora
Classification: Fedora
Component: dracut
Version: 23
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Assignee: dracut-maint-list
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2016-01-19 18:42 UTC by Kostya Berger
Modified: 2016-08-22 09:15 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-08-22 09:15:28 UTC
Type: Bug
Embargoed:



Description Kostya Berger 2016-01-19 18:42:41 UTC
Description of problem:
Fedora 21 upgraded to 23 (root on ZFS, whole disk). Every new kernel whose initramfs was created on 23 fails to boot; the initramfs seems to be the important part.

First it drops into the dracut shell without mounting root, and without ANY warning or emergency logs, which dracut usually produces when it fails to mount root. It also drops instantly, without waiting for the 5-minute timeout, as it would when failing to find or mount root, for example. The root option on the kernel command line is root=zfs:mypool/ROOT/linux ...
Evidently there is no problem importing or mounting the pool and root, since it is easily done from the shell via #zpool import $mypool.
Then, upon exiting the shell after importing the pool, the system, instead of continuing the normal boot process, reports "failed to start default.target: the target is destructive" with no further explanation. It doesn't hang, though; it just stops at this point with some messages from audit.
Ctrl+Alt+Del then reboots the system in the normal way.


Version-Release number of selected component (if applicable):
dracut version: 043-63.git20151211.fc23
systemd version: 222-12

How reproducible:
Obviously one will have to copy my configuration: a ZFS pool on a whole unpartitioned disk with Fedora 23 on it. Then install SPL + ZFS etc., create the initramfs using the dracut CLI, et voila.

dracut --force -a zfs /boot/initramfs-$version $kern_version

(with whole-disk ZFS, no partition/filesystem modules are needed except those already in the kernel)

Steps to Reproduce:
1. As above.

Actual results:
System cannot be booted with F23-created initramfs, only with those remaining from the old F21 installation.

Expected results:
Normal boot as it used to boot all these years...

Additional info:
ZFS and SPL are installed from Git (ZoL). For some reason, with the RPMs installed from RPM Fusion the system is unable to compile the zfs kmod for the given kernel. But that's a different story...

Comment 1 Kostya Berger 2016-01-19 19:23:52 UTC
Forgot to mention, this may be important: 
The previous version of dracut had a problem caused by /usr/lib/dracut/modules.d/98dracut-systemd/rootfs-generator.sh. It caused the creation of the file /usr/lib/dracut/initqueue/finished/devesists-$(_name).sh in the initramfs, pointing to the root device (as dracut evidently believed) in /dev/disk/by-uuid/***, while in reality my whole-disk zpool is linked to a device name in /dev/disk/by-id.

With that version of the dracut scripts I applied a dirty fix: from rootfs-generator.sh, where the functions are defined, I manually removed the commands creating the file $hookdir/initqueue/finished/devesists-$(_name).sh. That file caused the system to hang on boot: for some reason /dev/disk/by-uuid failed to appear within the 5 min. timeout set for the dracut shell to do its job. A separate problem, I know...

Now, with the present version of dracut, this file devexists***sh is NOT created in the $hookdir/initqueue/finished dir. There is only zfs.sh in there... Is this, possibly, the cause of the problem? Does the system refuse to boot because that file is somehow considered essential?

Comment 2 Kostya Berger 2016-01-19 19:25:46 UTC
/usr/lib/dracut/initqueue/finished/devexists-$(_name).sh is the correct name of the file, sorry.
So now dracut drops into shell without even having started mount hooks.

Comment 3 Harald Hoyer 2016-01-26 11:05:58 UTC
What is your kernel command line?

What is the output of:
# dracut --print-cmdline

Comment 4 Kostya Berger 2016-01-26 13:10:42 UTC
Actually it's "root=zfs:mypool/ROOT/linux ro systemd.unit=graphical.target"

This used to work.
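
For reference, a minimal sketch of how a mount hook could pick the dataset out of a command line of that form. The function name `zfs_root_from_cmdline` is hypothetical and this is not the actual ZoL dracut hook code; it only illustrates the `root=zfs:<dataset>` syntax quoted above.

```shell
#!/bin/sh
# Hypothetical helper (not part of dracut or ZoL): print the dataset named
# by a root=zfs:<dataset> argument in the given kernel command line.
# Returns 1 if no such argument is present.
zfs_root_from_cmdline() {
    # Intentionally unquoted: split the command line on whitespace.
    for arg in $1; do
        case "$arg" in
            root=zfs:*)
                # Strip the "root=zfs:" prefix, leaving pool/dataset.
                printf '%s\n' "${arg#root=zfs:}"
                return 0
                ;;
        esac
    done
    return 1
}

zfs_root_from_cmdline "root=zfs:mypool/ROOT/linux ro systemd.unit=graphical.target"
# prints: mypool/ROOT/linux
```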

Comment 5 Harald Hoyer 2016-01-26 14:11:05 UTC
dracut does not support zfs natively, so there must be some dracut zfs plugin module. Please talk to the maintainers of this.

The output of:
# rpm -qf /usr/lib/dracut/modules.d/* | sort -u

might help in finding the corresponding rpm.

Comment 6 Kostya Berger 2016-01-26 14:33:54 UTC
Yes, there is a dracut zfs mount hook provided with ZFS on Linux. I'm using it. I have already asked this question on the ZoL mailing list; they're eager to have some answers from the dracut people...

Besides, here systemd also comes into play. And how can I possibly debug systemd issues with dracut? All this talking of "targets" that are "destructive" -- systemd language...

Anyway, I can still try git bisect with the dracut code. We'll see at least whether the problem is indeed with dracut, or maybe with systemd, which wouldn't surprise me at all.

Comment 7 Harald Hoyer 2016-01-26 14:46:15 UTC
Is the zfs kernel module in the initramfs?
# lsinitrd | fgrep zfs

If not, maybe you just have to recreate the initramfs after you build your custom kernel with:
# dracut -f --kver <kernel-version>

Comment 8 Kostya Berger 2016-01-26 15:04:31 UTC
Please, reread my first post here. Boot hangs until dracut 5 min timeout runs out, then drops into shell. At that point zfs.ko is loaded and all zfs utils are fully functional. 
With earlier versions (Fedora 21), when having problems of this sort, I would just manually import the pool and log out of the emergency shell. Then the normal boot would resume and the system would start all right. Those problems were related to my zpool being associated with a device string in /dev/disk/by-partuuid, which is not standard with ZFS on Linux. Additional command line options had to be added to mount-zfs.sh, the dracut mount hook supplied with the ZoL package.

This time the startup scripts don't have that problem. After dropping into the emergency shell, the target zpool can be imported manually without any difficulties or special custom command line options whatsoever. Just "zpool import $mypool" does it!

But when I log out of emergency shell to let the system resume booting process, it JUST refuses to use that kind of root and says "the target is destructive". This is how I read it.

Comment 9 Kostya Berger 2016-04-25 08:48:41 UTC
OK, this bug is not due to Fedora|RedHat provided code. Actually, it's due to the ZFS on Linux code, which adds zfs-capable scripts and systemd services.

These scripts take over once a "zfs:*" root is detected; they replace any Fedora dracut code as needed or add what is missing. I have pinpointed what exactly they fail to do.

Once the pool is imported and mounted (the ZoL part of the job), invoking `systemctl start initrd-switch-root.service` resumes normal boot.

So I guess it can be closed here as unrelated to RedHat/Fedora provided dracut code.
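
The workaround from this comment, written out as a rough sketch for the dracut emergency shell. Only the final `systemctl` call comes from the report. The pool and dataset names are taken from the command line quoted earlier, while the `-N` import, the `/sysroot` mount step with `-o zfsutil`, and the `run`/`recover_boot` helper names are assumptions and may not match what the ZoL hook does (a dataset with a legacy mountpoint would be mounted without `zfsutil`). By default the script only prints the commands; set `RUN=1` to execute them.

```shell
#!/bin/sh
# Sketch only: prints the recovery commands unless RUN=1 is set.
# POOL/DATASET defaults and the mount step are assumptions, not ZoL's code.
POOL="${POOL:-mypool}"
DATASET="${DATASET:-$POOL/ROOT/linux}"

# Execute the command when RUN=1, otherwise just print it.
run() {
    if [ "${RUN:-0}" = 1 ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

recover_boot() {
    # Import the pool without mounting any datasets.
    run zpool import -N "$POOL" &&
    # Mount the root dataset where dracut expects the real root.
    run mount -t zfs -o zfsutil "$DATASET" /sysroot &&
    # The step the report found missing: hand over to the real root.
    run systemctl start initrd-switch-root.service
}

recover_boot
```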

