Red Hat Bugzilla – Bug 990684
software raid1 devices without UUID in fstab result in failed boot after upgrade to dracut-029
Last modified: 2013-08-05 10:26:37 EDT
Description of problem:
Systems whose software raid1 partitions are listed in /etc/fstab by device name (/dev/mdX) rather than by UUID fail to boot after upgrading dracut and the kernel (among other packages). kernel-3.9.10 and later do not boot; kernel-3.9.6-200.fc18 still works fine.
Version-Release number of selected component (if applicable):
Try to reboot into a new kernel: 3.9.10 and 3.9.11 did not work. I suspect this is because these kernels were installed after dracut had been upgraded (Sun 07 Jul 2013 08:50:28 PM EEST). The dracut changelog shows that major modifications have occurred, including the upgrade from 024 to 029.
Steps to Reproduce:
1. On a system set up like this, install a new kernel using the new dracut, reboot, and observe a failed boot
Boot fails with log messages such as (typed by hand, sorry):
dracut-initqueue: Warning: Could not boot
dracut-initqueue: warning /dev/md2 does not exist
The kernel log shows, among others:
md124: unknown partition table
md124: detected capacity change 0 -> 42....
(the same for all md devices, md124, md125, md126, md127 - see the latter bug report, comment 5, below for similar issues)
/etc/fstab is as follows:
/dev/md2 / ext3 defaults 1 1
/dev/md3 /home ext3 defaults 1 2
/dev/md1 /tmp ext3 defaults 1 2
/dev/md0 /boot ext3 defaults 1 2
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
LABEL=SWAP-sdb3 swap swap defaults 0 0
LABEL=SWAP-sda3 swap swap defaults 0 0
I found two somewhat similar bugs, https://bugzilla.redhat.com/show_bug.cgi?id=981165 and https://bugzilla.redhat.com/show_bug.cgi?id=895805.
I successfully worked around the problem by replacing /dev/mdX in fstab with UUIDs, as suggested in the latter bug report, comment 5; then I removed the latest kernel and reinstalled it, forcing the initrd images to be rebuilt. After that, booting worked without a hitch. This was my first serious attempt at working around the problem; next in line would have been downgrading dracut and/or examining the generated initrd images.
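For reference, the fstab part of that workaround could be sketched roughly as below. This is only an illustration, not the exact steps taken here: `lookup_uuid` and `fstab_md_to_uuid` are hypothetical helper names, and the sketch assumes `blkid -s UUID -o value <device>` prints the filesystem UUID of an assembled md device. It prints the rewritten fstab to stdout and leaves the original file alone.

```shell
#!/bin/sh
# Hypothetical helper: print the filesystem UUID of a block device.
# Assumes blkid is installed and the md device is currently assembled.
lookup_uuid() {
    blkid -s UUID -o value "$1"
}

# Read fstab text on stdin and write it to stdout, replacing a
# /dev/mdN source field with UUID=<uuid>. Lines whose UUID cannot be
# looked up are passed through unchanged. Runs of whitespace between
# the first and second field are collapsed to a single space.
fstab_md_to_uuid() {
    while read -r dev rest; do
        case "$dev" in
            /dev/md[0-9]*)
                uuid=$(lookup_uuid "$dev")
                if [ -n "$uuid" ]; then
                    dev="UUID=$uuid"
                fi
                ;;
        esac
        if [ -n "$rest" ]; then
            printf '%s %s\n' "$dev" "$rest"
        else
            printf '%s\n' "$dev"
        fi
    done
}
```

Something like `fstab_md_to_uuid < /etc/fstab > /tmp/fstab.new` would produce a candidate file to review by hand before copying it over /etc/fstab. The initrd still has to be rebuilt afterwards; reinstalling the kernel did that here, and `dracut -f <image> <kernel version>` is presumably an alternative.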
I didn't and probably can't dig into this deeper, but I thought I would file this bug report just in case someone looks a bit deeper into dracut failure modes.
Well, that is not a dracut bug. It's a user error relying on kernel enumeration.
Never ever do this! :-)
I have to disagree with this diagnosis. For 10 years, this has been the way to do it. Until now, the various ways of generating the initrd have included mechanisms that support this, and those have now ceased to work.
IMHO, breaking a previously working configuration should not be a NOTABUG.
Activation of raid happens as soon as the underlying device is recognized.
The kernel naming of those devices can be influenced by /etc/mdadm.conf.
# lsinitrd /boot/initramfs-$(uname -r).img etc/mdadm.conf
shows you the contents of the mdadm.conf in the initramfs.
If the mdadm.conf is correct you might want to file a bug against component mdadm.
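A quick way to script the check above might look like the following sketch. It assumes that `lsinitrd <image>` without a file argument prints one `ls -l` style line per file, ending in the path relative to the initramfs root; `listing_has_mdadm_conf` is a hypothetical helper name.

```shell
#!/bin/sh
# Read an `lsinitrd <image>` listing on stdin and succeed (exit 0)
# only if some line ends in the path etc/mdadm.conf.
listing_has_mdadm_conf() {
    grep -Eq '(^|[[:space:]/])etc/mdadm\.conf$'
}
```

Usage would be along the lines of `lsinitrd /boot/initramfs-$(uname -r).img | listing_has_mdadm_conf || echo "no mdadm.conf in initramfs"`.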
etc/mdadm.conf does not exist in the initrd. This was the problem mentioned (but I didn't see it addressed) in the bug reports I referred to. I suspect the problem would have been avoided if /etc/mdadm.conf had been copied into the initrd, which it obviously isn't in this case.
If dracut is supposed to copy mdadm.conf to initrd under some conditions, this is probably the correct package to file a bug against.
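As an untested sketch of what supplying an mdadm.conf could look like: the helper below writes one from the currently assembled arrays when none exists. It assumes `mdadm --detail --scan` prints `ARRAY` lines suitable for mdadm.conf; `scan_arrays` and `write_mdadm_conf_if_missing` are hypothetical names. Whether this is enough to restore the /dev/mdX names at boot was not verified in this report.

```shell
#!/bin/sh
# Hypothetical helper: print ARRAY lines for the assembled arrays.
scan_arrays() {
    mdadm --detail --scan
}

# Write the scan output to the given path unless a file already
# exists there. Returns non-zero if the scan fails or prints nothing.
write_mdadm_conf_if_missing() {
    conf=$1
    if [ -e "$conf" ]; then
        echo "kept existing $conf"
        return 0
    fi
    arrays=$(scan_arrays) || return 1
    [ -n "$arrays" ] || return 1
    printf '%s\n' "$arrays" > "$conf" || return 1
    echo "wrote $conf"
}
```

After `write_mdadm_conf_if_missing /etc/mdadm.conf` (as root), the initrd would still need to be regenerated. dracut documents a `--mdadmconf` option and an `mdadmconf="yes"` setting in dracut.conf for including the local /etc/mdadm.conf, so that is probably the place to start.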