The latest dracut (037-10.git20140402.fc20) fails to create an initramfs that will boot my workstation. This machine has a md raid 1 mirror and then lvm disks created on top of it. When it first created the initramfs, it booted up to the rescue prompt and said it couldn't find the root disk. I then rebooted to an older kernel and recreated the initramfs, and now it just hangs so I haven't been able to determine yet why it won't find the disks. When I downgraded to 034-64.git20131205.fc20, the initramfs works. I have a laptop that is running the same version of dracut and it works fine, so my suspicion is that it isn't coping correctly with the lvm on md raid setup.
Created attachment 883258 [details] working initramfs from older dracut package
Created attachment 883259 [details] broken initramfs from newer dracut
I can confirm this bug.
037-10.git20140402.fc20 is broken
034-64.git20131205.fc20 works
My setup has LVM2 volumes on top of an MD RAID1
Created attachment 883260 [details] working initramfs made with old dracut
Created attachment 883261 [details] broken initramfs made with latest dracut
Booting with rd.debug on the kernel cmdline shows that it's continually cycling in the udev code. Looks like it's waiting for the root device to show up, but it never does. If I boot with rd.break=pre-udev it'll drop to a prompt, but if I specify rd.break=pre-mount then it doesn't. So that seems to indicate that maybe it's getting stuck somewhere in between. Not sure why it doesn't eventually time out and drop to a prompt however.
What is your kernel command line? What is the output of: # dracut --print-cmdline
Grub2 config:

insmod gzio
insmod part_gpt
insmod diskfilter
insmod mdraid1x
insmod ext2

set root='mduuid/11def14bb573c44fa973af51eb4cda16'

linux /vmlinuz-3.13.8-200.fc20.x86_64 root=/dev/mapper/green-root ro rd.md=1 rd.lvm=1 rd.dm=1 SYSFONT=latarcyrheb-sun16 rd.luks=0 KEYTABLE=it LANG=en_US.UTF-8 ast.modeset=0 nomodeset quiet
initrd /initramfs-3.13.8-200.fc20.x86_64.img

#######

$ dracut --print-cmdline

rd.lvm.lv=green/root
rd.md.uuid=8878e57f:85c65839:2f7a4a70:73506c6d
rd.md.uuid=e96c0142:6f693588:41d718d8:12413bbb
rd.md.uuid=11def14b:b573c44f:a973af51:eb4cda16
resume=UUID=253d3c97-3ec1-4de1-b4e5-3dfde0b202e4
root=/dev/mapper/green-root rootflags=rw,noatime,data=ordered rootfstype=ext4

#######

$ lsblk (GPT partitions)

NAME             MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
sda                8:0    0   1.8T  0 disk
├─sda1             8:1    0     2M  0 part
├─sda2             8:2    0   510M  0 part
│ └─md0            9:0    0 509.7M  0 raid1 /boot
├─sda3             8:3    0   1.8T  0 part
│ └─md1            9:1    0   1.8T  0 raid1
│   ├─green-root 253:0    0    20G  0 lvm   /
│   ├─green-var  253:2    0    30G  0 lvm   /var
│   └─green-home 253:3    0   1.2T  0 lvm   /home
└─sda4             8:4    0     4G  0 part
  └─md2            9:2    0     4G  0 raid1 [SWAP]
sdb                8:16   0   1.8T  0 disk
├─sdb1             8:17   0     2M  0 part
├─sdb2             8:18   0   510M  0 part
│ └─md0            9:0    0 509.7M  0 raid1 /boot
├─sdb3             8:19   0   1.8T  0 part
│ └─md1            9:1    0   1.8T  0 raid1
│   ├─green-root 253:0    0    20G  0 lvm   /
│   ├─green-var  253:2    0    30G  0 lvm   /var
│   └─green-home 253:3    0   1.2T  0 lvm   /home
└─sdb4             8:20   0     4G  0 part
  └─md2            9:2    0     4G  0 raid1 [SWAP]
$ sudo dracut --print-cmdline
rd.lvm.lv=vg_tlielax/lv_root
rd.lvm.lv=vg_tlielax/lv_swap
rd.md.uuid=cf8ad0d9:43a180b8:b37b8347:2ee7636f
rd.md.uuid=1812a66c:86a2e53e:5a01c74d:91f8cb8c
resume=/dev/mapper/vg_tlielax-lv_swap root=/dev/mapper/vg_tlielax-lv_root rootflags=rw,relatime,seclabel,attr2,inode64,noquota rootfstype=xfs
(In reply to Daniele Viganò from comment #8)
> Grub2 config:
>
> insmod gzio
> insmod part_gpt
> insmod diskfilter
> insmod mdraid1x
> insmod ext2
>
> set root='mduuid/11def14bb573c44fa973af51eb4cda16'
>
> linux /vmlinuz-3.13.8-200.fc20.x86_64 root=/dev/mapper/green-root ro rd.md=1
> rd.lvm=1 rd.dm=1 SYSFONT=latarcyrheb-sun16 rd.luks=0 KEYTABLE=it
> LANG=en_US.UTF-8 ast.modeset=0 nomodeset quiet
> initrd /initramfs-3.13.8-200.fc20.x86_64.img
>
> #######
>
> $ dracut --print-cmdline
>
> rd.lvm.lv=green/root
> rd.md.uuid=8878e57f:85c65839:2f7a4a70:73506c6d
> rd.md.uuid=e96c0142:6f693588:41d718d8:12413bbb
> rd.md.uuid=11def14b:b573c44f:a973af51:eb4cda16
> resume=UUID=253d3c97-3ec1-4de1-b4e5-3dfde0b202e4
> root=/dev/mapper/green-root rootflags=rw,noatime,data=ordered rootfstype=ext4
>
> #######
>

Your grub cmdline is missing:
rd.lvm.lv=green/root
rd.md.uuid=8878e57f:85c65839:2f7a4a70:73506c6d
rd.md.uuid=e96c0142:6f693588:41d718d8:12413bbb
rd.md.uuid=11def14b:b573c44f:a973af51:eb4cda16
(In reply to Jeff Layton from comment #9)
> $ sudo dracut --print-cmdline
> rd.lvm.lv=vg_tlielax/lv_root
> rd.lvm.lv=vg_tlielax/lv_swap
> rd.md.uuid=cf8ad0d9:43a180b8:b37b8347:2ee7636f
> rd.md.uuid=1812a66c:86a2e53e:5a01c74d:91f8cb8c
> resume=/dev/mapper/vg_tlielax-lv_swap root=/dev/mapper/vg_tlielax-lv_root
> rootflags=rw,relatime,seclabel,attr2,inode64,noquota rootfstype=xfs

Yours is missing maybe:
rd.lvm.lv=vg_tlielax/lv_root
rd.lvm.lv=vg_tlielax/lv_swap
rd.md.uuid=cf8ad0d9:43a180b8:b37b8347:2ee7636f
rd.md.uuid=1812a66c:86a2e53e:5a01c74d:91f8cb8c
Yes, it is missing those:

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.13.8-200.fc20.x86_64 root=/dev/mapper/vg_tlielax-lv_root ro LANG=en_US.UTF-8 crashkernel=128M

I updated another machine that has an LVM2 on MD raid setup and it worked fine. Here's /proc/cmdline from that one:

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.13.8-200.fc20.x86_64 root=/dev/mapper/vg_sikun-lv_root ro rd.md.uuid=23e91dc6:e4d72df4:05636717:1d6ce72e rd.lvm.lv=vg_sikun/lv_swap rd.lvm.lv=vg_sikun/lv_root vconsole.font=latarcyrheb-sun16 rd.md.uuid=75447f7f:c9444494:2004dd8b:fb2cd818 console=tty console=ttyS0,115200 LANG=en_US.UTF-8

...so I'm guessing this is due to the fact that I upgraded the former machine from an earlier Fedora release and the upgrades never fixed /etc/default/grub?

Was it intentional to make dracut no longer try to detect and start up md raid arrays, or is this a regression?
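The comparison made in the last few comments (what `dracut --print-cmdline` reports versus what is actually on the running kernel command line) can be scripted. The following is only a minimal sketch: the function name `missing_rd_params` is hypothetical and not part of dracut, and it looks only at the `rd.md.uuid=` and `rd.lvm.lv=` tokens discussed here.

```shell
# missing_rd_params REQUIRED CURRENT
# Hypothetical helper (not part of dracut): prints every rd.md.uuid= or
# rd.lvm.lv= token found in REQUIRED that does not appear in CURRENT.
missing_rd_params() {
    local required="$1" current tok
    # Collapse all whitespace (including newlines) to single spaces and pad
    # with spaces so whole tokens can be matched.
    current=" $(printf '%s' "$2" | tr -s '[:space:]' ' ') "
    for tok in $required; do
        case $tok in
            rd.md.uuid=*|rd.lvm.lv=*)
                case $current in
                    *" $tok "*) ;;               # already on the cmdline
                    *) printf '%s\n' "$tok" ;;   # missing: report it
                esac
                ;;
        esac
    done
}
```

On an affected machine it would presumably be called as `missing_rd_params "$(dracut --print-cmdline)" "$(cat /proc/cmdline)"` (as root); an empty result would suggest the bootloader entry already carries the array and volume identifiers.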
(In reply to Harald Hoyer from comment #10)
>
> Your grub cmdline is missing:
> rd.lvm.lv=green/root
> rd.md.uuid=8878e57f:85c65839:2f7a4a70:73506c6d
> rd.md.uuid=e96c0142:6f693588:41d718d8:12413bbb
> rd.md.uuid=11def14b:b573c44f:a973af51:eb4cda16

Thanks Harald, now it works. I tried adding those lines manually before, but I missed the first one. The lines were missing because I upgraded from Fedora 19 and moved my partitions to MD and LVM after the installation/upgrade.
(In reply to Jeff Layton from comment #12) ... > Was it intentional to make dracut no longer try to detect and start up md > raid arrays, or is this a regression? Contents of the mdadm.conf, i.e. # lsinitrd -f etc/mdadm.conf is OK, so this is most likely a bug.
Thanks, I would think so too. dracut.cmdline(7) says:

MD RAID
    rd.md=0
        disable MD RAID detection

    [...]

    rd.md.uuid=<md raid uuid>
        only activate the raid sets with the given UUID. This parameter can be specified multiple times.

...so that sort of implies that rd.md=1 by default and if you don't specify a rd.md.uuid then it should activate any that it can find.
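To make that reading of the man page concrete, here is a small sketch of the decision it implies. This models only the interpretation given in the comment above, not what dracut actually does internally; `md_should_assemble` is a hypothetical name.

```shell
# md_should_assemble UUID CMDLINE
# Hypothetical model of the man-page reading above (not dracut code):
#   rd.md=0 on the cmdline       -> never assemble
#   one or more rd.md.uuid=...   -> assemble only the listed UUIDs
#   neither present              -> assemble whatever is found
md_should_assemble() {
    local uuid="$1" tok disabled=no listed=no matched=no
    for tok in $2; do
        case $tok in
            rd.md=0) disabled=yes ;;
            rd.md.uuid=*)
                listed=yes
                if [ "${tok#rd.md.uuid=}" = "$uuid" ]; then
                    matched=yes
                fi
                ;;
        esac
    done
    [ "$disabled" = no ] && { [ "$listed" = no ] || [ "$matched" = yes ]; }
}
```

Under this model a command line with no `rd.md*` options at all would still assemble every array, which is the behaviour the reporters expected and did not get from 037-10.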
This appears to be essentially a duplicate of bug 1081841 and we might want to consolidate these two bug reports.
(In reply to Jeff Layton from comment #15)
> Thanks, I would think so too. dracut.cmdline(7) says:
>
> MD RAID
> rd.md=0
> disable MD RAID detection
>
> [...]
>
> rd.md.uuid=<md raid uuid>
> only activate the raid sets with the given UUID. This parameter
> can
> be specified multiple times.
>
> ...so that sort of implies that rd.md=1 by default and if you don't specify
> a rd.md.uuid then it should activate any that it can find.

In addition there is

$ man 5 dracut.conf
...
mdadmconf="{yes|no}"
    Include local /etc/mdadm.conf (default=yes)

So it is redundant to append its content to the kernel command line.

I am actually more concerned about forced upgrade equal to that on the Rawhide.
$ man 8 dracut
...
--hostonly-cmdline:
    Store kernel command line arguments needed in the initramfs
--no-hostonly-cmdline:
    Do not store kernel command line arguments needed in the initramfs

Pay attention, "--no-hostonly-cmdline" is the default! [1]

So it has to be done like this:
# dracut --hostonly-cmdline [--kver 3.13.9-200.fc20.x86_64] [--force]
which produces *mdraid.conf within the initramfs image:
# lsinitrd --kver 3.13.9-200.fc20.x86_64 --file etc/cmdline.d/90mdraid.conf
rd.md.uuid=...
etc.
so there is no need to populate the kernel command line with the UUIDs.

Summa summarum, in contrast to the previous behaviour, this must now be done manually, as shown. It's not a bug, it's a feature. As for the reason for this change, the author knows best.

[1] http://git.kernel.org/cgit/boot/dracut/dracut.git/commit/modules.d/90mdraid/module-setup.sh?id=ab9457e
[2] http://git.kernel.org/cgit/boot/dracut/dracut.git/commit/NEWS?id=2bdf760
This should also work, and is probably the optimal solution for the current situation:

# dracut --print-cmdline
rd.md.uuid=... rd.md.uuid=... etc.

$ cat /etc/dracut.conf.d/kernel_cmdline.conf
#
# Specify default kernel command line parameters
#
kernel_cmdline="rd.md.uuid=... rd.md.uuid=... etc."

# dracut [--kver 3.13.9-200.fc20.x86_64] [--force]
...
# lsinitrd [--kver 3.13.9-200.fc20.x86_64] --file etc/cmdline.d/01-default.conf
rd.md.uuid=... rd.md.uuid=... etc.

$ man 5 dracut.conf
...
kernel_cmdline="parameters"
    Specify default kernel command line parameters

Nevertheless, it is not good practice to change functionality in the midst of a stable release!
I added the output of "dracut --print-cmdline" (specifically the rd.md.uuid=... parts) to grub2.cfg, which works. As an alternative I also added a simple 'rd.auto=1' which (may not be the preferred solution but) works as well.
(In reply to Rolf Fokkens from comment #20)
> I added the output of "dracut --print-cmdline" (specifically the
> rd.md.uuid=... parts) to grub2.cfg, which works.
>
> As an alternative I also added a simple 'rd.auto=1' which (may not be the
> preferred solution but) works as well.

Yeah, the "rd.auto=1" directive is the shortest possible permanent solution for the current situation, i.e.

$ cat /etc/dracut.conf.d/kernel_cmdline.conf
kernel_cmdline="rd.auto=1"

# dracut --kver 3.13.9-200.fc20.x86_64 --force
# lsinitrd --kver 3.13.9-200.fc20.x86_64 --file etc/cmdline.d/01-default.conf
rd.auto=1

However, there is no need to append this directive (rd.auto=1) to the bootloader configs, i.e. 'grub.cfg' & 'extlinux.conf' & whatnot.config.

/etc/dracut.conf.d/kernel_cmdline.conf
kernel_cmdline="rd.auto=1"

is sufficient, and bootloader agnostic.
For folks to comprehend what we write here:

$ man 7 dracut.cmdline
rd.auto rd.auto=1
    enable autoassembly of special devices like cryptoLUKS, dmraid, mdraid or lvm.
    Default is off as of dracut version >= 024.

$ man 5 dracut.conf
kernel_cmdline="parameters"
    Specify default kernel command line parameters

FILES
    /etc/dracut.conf
        Old configuration file. You better use your own file in /etc/dracut.conf.d/.

    /etc/dracut.conf.d/
        Any /etc/dracut.conf.d/*.conf file can overwrite the values in etc/dracut.conf.
        The configuration files are read in alphanumerical order.
kernel 3.13.9-200 does not boot on intel BIOS RAID https://bugzilla.redhat.com/show_bug.cgi?id=1085952
Unable to boot system with root on rmd raid1 https://bugzilla.redhat.com/show_bug.cgi?id=1085922
Fails to boot with software RAID https://bugzilla.redhat.com/show_bug.cgi?id=1085773
The new mandatory options from 'dracut --print-cmdline' are not practicable and break existing systems. Here, the mandatory options are 303 characters long, and with a more complex setup it is very likely that they will not fit into the kernel buffer:

./uapi/asm-generic/setup.h:4:#define COMMAND_LINE_SIZE 512
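A quick way to check the length concern on a given machine is to measure the would-be command line against the limit. This is a sketch under stated assumptions: `cmdline_fits` is a hypothetical helper, the default of 512 is simply the asm-generic value quoted above (an architecture may define a different COMMAND_LINE_SIZE), and it assumes the buffer has to hold a terminating NUL, so the string must be strictly shorter than the limit.

```shell
# cmdline_fits CMDLINE [LIMIT]
# Hypothetical helper: succeeds if CMDLINE is short enough for a kernel
# command line buffer of LIMIT bytes (default 512, the value quoted above).
# Assumes an ASCII command line, so character count equals byte count.
cmdline_fits() {
    local cmdline="$1" limit="${2:-512}"
    # Strictly less than LIMIT: one byte is assumed reserved for the NUL.
    [ "${#cmdline}" -lt "$limit" ]
}
```

For example, `cmdline_fits "$(cat /proc/cmdline) $(dracut --print-cmdline)" || echo "too long"` would flag a setup where appending the generated options overruns the assumed limit.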
(In reply to Enrico Scholz from comment #26) > The new mandatory 'dracut --cmdline' are not practicable and break existing > systems. Here, the mandatory options are 303 characters and when having a > more complex setup, it is very likely that they do not fit into the kernel > buffer: > > ./uapi/asm-generic/setup.h:4:#define COMMAND_LINE_SIZE 512 +1
(In reply to poma from comment #22) > For folks to comprehend what we write here: > > $ man 7 dracut.cmdline > rd.auto rd.auto=1 > enable autoassembly of special devices like cryptoLUKS, dmraid, mdraid or > lvm. > Default is off as of dracut version >= 024. > > $ man 5 dracut.conf > kernel_cmdline="parameters" > Specify default kernel command line parameters > > FILES > /etc/dracut.conf > Old configuration file. You better use your own file in > /etc/dracut.conf.d/. > > /etc/dracut.conf.d/ > Any /etc/dracut.conf.d/*.conf file can overwrite the values in > etc/dracut.conf. > The configuration files are read in alphanumerical order. This change should be reverted in F20 at the very least, it breaks *every* installation that uses mdraid, and it is not clear how to work it around right away.
(In reply to Simo Sorce from comment #28) > (In reply to poma from comment #22) ... > This change should be reverted in F20 at the very least, it breaks *every* > installation that uses mdraid, and it is not clear how to work it around > right away. Simo, I understand it very well and I've already asked the author what is the reason for all of this, https://bugzilla.redhat.com/show_bug.cgi?id=1085952#c9 So far, the only response is, "We are the Borg. You will be assimilated. Resistance is futile." Auntie Kathryn, whereeeee areeeee youuuuu!
I got the system to start by adding rd.auto=1 to the grub command line, but now, after the system has booted, it can't automatically find the LVM volume I have on a second set of raid devices. If I run lvdisplay I see it, but I had to do an lvchange -ay /dev/lvmdevicename... and then systemd automatically took over and mounted it.

1. Why is the LVM volume not seen?
2. Is it normal for systemd to take over and mount volumes automatically like that? I wanted to perform a fsck before mounting ...
It seems that "rd.auto=1" only works if you have only MDs. It certainly works in my setup!

So try this method: specify the space-separated LVM LVs & MD UUIDs listed by
# dracut --print-cmdline
in a dracut configuration file, e.g. /etc/dracut.conf.d/kernel_cmdline.conf:
kernel_cmdline="rd.lvm.lv=... rd.md.uuid=..."

Afterwards
# dracut --force --kver 3.13.9-200.fc20.x86_64
# reboot
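The method above can be sketched as a small helper that turns the `dracut --print-cmdline` output into the `kernel_cmdline=` line. The function name `kernel_cmdline_conf` is hypothetical; only the `kernel_cmdline` setting and the file path come from this thread, and the helper deliberately keeps just the `rd.lvm.lv=` and `rd.md.uuid=` tokens.

```shell
# kernel_cmdline_conf PARAMS
# Hypothetical helper: prints a dracut.conf(5) kernel_cmdline= line that
# contains only the rd.lvm.lv= and rd.md.uuid= tokens found in PARAMS.
kernel_cmdline_conf() {
    local tok keep=""
    for tok in $1; do
        case $tok in
            rd.lvm.lv=*|rd.md.uuid=*)
                # Append, separating tokens with a single space.
                keep="${keep:+$keep }$tok"
                ;;
        esac
    done
    printf 'kernel_cmdline="%s"\n' "$keep"
}
```

It might be used as `kernel_cmdline_conf "$(dracut --print-cmdline)" > /etc/dracut.conf.d/kernel_cmdline.conf`, followed by the `dracut --force --kver ...` step shown above. Review the generated file before rebuilding the initramfs.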
I do not think this is a viable fix. If I add a new disk to my system, create a new LVM volume and add it to /etc/fstab, I expect the system to just boot fine. Why should I go and change obscure dracut configurations? This is a regression and should be fixed in dracut. Right now what happens is that systemd can't find/mount the device and drops you to an emergency repair shell, which is not cool.
To append all that to the bootloader config or dracut config(my option) is essentially the same and is what the author advises. There is nothing obscure regarding the "kernel_cmdline=" per se. Don't generalize because of one regression.
The dracut update in question should never have been submitted, it is entirely inappropriate under the update policy. Harald, please fix your mess ASAP. https://fedorahosted.org/fesco/ticket/1286
*** Bug 1085773 has been marked as a duplicate of this bug. ***
*** Bug 1085952 has been marked as a duplicate of this bug. ***
poma has identified what looks like it may be the questionable commit here: https://git.kernel.org/cgit/boot/dracut/dracut.git/commit/?id=ab9457efd7 this seems to be a very significant behaviour change.
*** Bug 1081841 has been marked as a duplicate of this bug. ***
Note the documentation at https://www.kernel.org/pub/linux/utils/boot/dracut/dracut.html#_boot_parameters states: "An initramfs generated without the "hostonly" mode, does not contain any system configuration files (except for some special exceptions), so the configuration has to be done on the kernel command line." which strongly implies that you should *not* need to do kernel command line configuration when booting with a hostonly initramfs.
*** Bug 1085922 has been marked as a duplicate of this bug. ***
OK, building a new initramfs with --force --hostonly-cmdline does generate one that finally boots my system... Why is --hostonly-cmdline not the default?
So sorry to spam the bug a bit, but Simo at least has confirmed that the change identified by poma is apparently the source of his problem. So let's look at the change a bit.

AFAICS, what it does is make one particular subset of dracut's functionality configurable at runtime. Prior to the commit, dracut would unconditionally stick some bits of information regarding the host's block device configuration into /etc/cmdline.d in the initramfs, it looks like. That's what the blocks in the modules.d/ files affected by the patch do. It would do this whether it was building a 'generic' (supposed to be non-system-specific) or 'hostonly' (supposed to be system-specific) initramfs.

What the commit does is conditionalize those actions, by wrapping them in:

    if [[ $hostonly_cmdline == "yes" ]]; then
        ...(do stuff)...
    fi

so those things are only done if hostonly_cmdline is set - i.e. if the parameter --hostonly-cmdline is passed to dracut at execution time.

I can kinda see the logic here: dracut should only include configuration information specific to the host system when building a 'hostonly' initramfs. OK, fine. But then two questions occur:

1. Why do we need a separate parameter just to configure *this* behaviour, separate from the question of whether dracut as a whole is in hostonly or generic mode? We already have --hostonly for that. Why would you ever pass --hostonly but not --hostonly-cmdline, or --hostonly-cmdline but not --hostonly?

2. Assuming there's some kind of reason to make these behaviours separately configurable, why doesn't the hostonly_cmdline behaviour at least match the hostonly behaviour *unless it's explicitly specified*? poma asserts, and Simo's testing seems to confirm, that in fact hostonly_cmdline simply defaults to 'no' in all cases. It seems to me that it ought to default to being 'yes' if hostonly is 'yes', or 'no' if hostonly is 'no', and only differ if the user explicitly specifies that somehow.
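The default proposed in point 2 can be written down in a few lines. This is only an illustration of the suggestion, not dracut's actual code: the function `resolve_hostonly_cmdline` is hypothetical, and only the variable names `hostonly` and `hostonly_cmdline` echo the discussion above.

```shell
# resolve_hostonly_cmdline HOSTONLY [EXPLICIT]
# Hypothetical sketch of the proposed default:
#   - if the user gave an explicit value ("yes" or "no"), use it;
#   - otherwise follow HOSTONLY.
resolve_hostonly_cmdline() {
    local hostonly="$1" explicit="${2:-}"
    if [ -n "$explicit" ]; then
        printf '%s\n' "$explicit"
    elif [ "$hostonly" = "yes" ]; then
        printf 'yes\n'
    else
        printf 'no\n'
    fi
}
```

With this rule, a plain hostonly build would store the host's command line bits in the initramfs as before, while `--no-hostonly-cmdline` would remain available as an explicit opt-out.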
(and, of course, this behaviour change is *still* entirely inappropriate as an update to a stable release.)
http://koji.fedoraproject.org/koji/taskinfo?taskID=6751545 is a scratch F20 build with the offending upstream changes reverted. Can folks offended by this bug please test it? You'll need to install a 'new' kernel or manually re-generate the initramfs after installing the update, then see if that one boots. thanks!
System with mdraid (BIOS RAID 0) boots fine.

Dracut packages installed are:
dracut-037-10.1.git20140402.fc20.x86_64.rpm
dracut-config-rescue-037-10.1.git20140402.fc20.x86_64.rpm
dracut-network-037-10.1.git20140402.fc20.x86_64.rpm

Initramfs was recreated using dracut --force

One thing I noticed is that the image generated by dracut 037-10.1.git20140402 is significantly larger than the one produced by dracut 034-64.git20131205:

-rw------- 1 root root 18076529 Apr 18 09:10 /boot/initramfs-3.13.9-200.fc20.x86_64.img
-rw------- 1 root root 12347261 Apr 18 09:09 /boot/initramfs-3.13.9-200.fc20.x86_64.img.orig
How does it compare to one built with 037-10.git20140402? I wouldn't expect a significant difference (maybe a few bytes).
As you say, a difference of only a few bytes. After downgrading to 037-10.git20140402:

With dracut --force:
-rw------- 1 root root 18076902 Apr 18 09:58 initramfs-3.13.9-200.fc20.x86_64.img

With dracut --force --hostonly-cmdline:
-rw------- 1 root root 18078056 Apr 18 09:59 initramfs-3.13.9-200.fc20.x86_64.img
I tested http://koji.fedoraproject.org/koji/taskinfo?taskID=6751545 on two machines that have luks on top of raid 1 (except /boot only has raid 1) and both got past the point where I was getting stuck. One was a rawhide system, the other f20 using a 3.15 kernel. I did see some other problems (kdm segfaulting and a kernel hang), but those don't seem to be related to this issue.
As an additional data point, I just upgraded my mdraid system to kernel 3.13.10-200.fc20, using dracut 037-10.1.git20140402, without trouble.
I find it interesting that in all of this discussion nothing is being said about device-level issues. I have three servers all running RAID 1 on /boot and on an LVM2 partition, and only one of them is consistently having issues with dracut 037-10. The other two boot just fine when installing new kernels with dracut 037-10. The principal difference is that the problem system is using an i915 Intel SATA controller. Dracut 034-64 works just fine on that problem system. One point that MAY be of interest is that all of these systems were upgraded in-place from Fedora-19 to Fedora-20 using yum --distro-sync. Rebuilding system configurations when doing an upgrade is a pain in the ass!
Bill: the difference is likely simply that some of the systems have the 'required' bits on the cmdline, and some don't. Check your grub config or /proc/cmdline on each affected system, and compare it to the output of 'dracut --print-cmdline'. At this point the nature of the bug is fairly well understood. If we don't hear anything from Harald tomorrow I'm going to send out my scratch build as an update.
*** Bug 1085992 has been marked as a duplicate of this bug. ***
I suspect for most cases the difference was when they were installed. I think newer versions of anaconda write more device information to the command line used by grub2. So people that have been upgrading systems for several releases are probably seeing this more than people who did fresh installs.
dracut-037-11.git20140402.fc20 has been submitted as an update for Fedora 20. https://admin.fedoraproject.org/updates/dracut-037-11.git20140402.fc20
Package dracut-037-11.git20140402.fc20: * should fix your issue, * was pushed to the Fedora 20 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing dracut-037-11.git20140402.fc20' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2014-5509/dracut-037-11.git20140402.fc20 then log in and leave karma (feedback).
dracut-037-11.git20140402.fc20 has been pushed to the Fedora 20 stable repository. If problems still persist, please make note of it in this bug report.
*** Bug 1085784 has been marked as a duplicate of this bug. ***