Description of problem:
My newly installed kernel can't boot. The initramfs image created by the RPM install script (and also images I created by hand using mkinitrd and dracut) doesn't include the /etc/mdadm.conf file. Since my root filesystem is on an md raid1 volume, the kernel can't find it.

I manually added /etc/mdadm.conf to the initramfs image. The kernel was then able to create the proper volume, but still didn't mount it. I'm not sure why; perhaps a delay needs to be added. We may need to come back to this later...

The /etc/dracut.conf file is unmodified; the lines saying
# install local /etc/mdadm.conf
#mdadmconf="no"
are still commented out. Therefore, by default, dracut should have put the file in the image.

Note: This seems to be similar to bug 921887.

Version-Release number of selected component (if applicable):
dracut-029-1.fc18.2.i686. Before today's update, the installed package was dracut-018-60.git20120927.fc16.noarch, which worked okay.

How reproducible:
Always.
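One way to confirm the symptom described above is to look for etc/mdadm.conf in the image's file listing. The following is a minimal sketch, not part of the original report: initramfs_has_mdadm_conf is a hypothetical helper name, and it assumes the listing (for example from dracut's lsinitrd tool) ends each line with the file path.

```shell
#!/bin/sh
# Hypothetical helper: reads an initramfs file listing on stdin and
# reports whether etc/mdadm.conf appears in it.
# Assumption: each listing line ends with the path of one file.
initramfs_has_mdadm_conf() {
    if grep -Eq '(^|[[:space:]])(\./)?etc/mdadm\.conf$'; then
        echo "present"
    else
        echo "missing"
    fi
}
```

Assuming the image is stored under the usual Fedora name, it could be used as:
lsinitrd /boot/initramfs-$(uname -r).img | initramfs_has_mdadm_conf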
Same problem here with same version of dracut. All systems with default dracut config and root on an md device don't boot after latest kernel update. :-(
I updated the kernel on nine computers with software RAID. Seven computers boot, but the RAID device names have changed, for example:
/dev/md0 -> /dev/md127
/dev/md1 -> /dev/md126
...
The other two computers don't boot and show the messages:
dracut-initqueue[PID]: Warning: Could not boot.
dracut-initqueue[PID]: Warning /dev/md2 does not exist
...
I think it is the same problem whether the computer boots or not.
(In reply to Enrique V. Bonet Esteban from comment #2) > I think that is the same problem, boot or not boot the computer. You can check how the two failing computers and the seven working ones refer to their filesystems. If grub's kernel command line and /etc/fstab use LABEL=... or UUID=... definitions, then they can boot with /dev/md127 too. If the grub or fstab config names /dev/md0, then they don't boot. But I think this is a bug either way, whether they boot or not. Curiously, I can't see the same problem with dracut-029-2.fc19.x86_64 on Fedora 19. Is this already fixed? I don't see anything in the F19 package changelog.
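The check suggested in this comment can be sketched roughly as follows. Both function names are hypothetical, and the sketch only recognises plain /dev/mdN names; it says nothing about whether a given UUID or LABEL actually resolves.

```shell
#!/bin/sh
# Hypothetical helper: print the value of every root= argument found in a
# kernel command line read from stdin.
root_spec_from_cmdline() {
    tr ' ' '\n' | sed -n 's/^root=//p'
}

# Hypothetical helper: read fstab-formatted text on stdin and print the
# device and mount point of every entry that names a numbered md device.
fstab_md_number_entries() {
    awk '$1 ~ /^\/dev\/md[0-9]+$/ { print $1 " " $2 }'
}
```

On a running system the inputs would typically be /proc/cmdline and /etc/fstab:
root_spec_from_cmdline < /proc/cmdline
fstab_md_number_entries < /etc/fstab
Any /dev/mdN value printed by either command is an entry that depends on the array keeping its number.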
Hi Jan, The two computers that don't boot have /dev/md0 in the grub.cfg file (old configuration). I changed this value to a UUID and the computers now boot, but their RAID devices are renamed as well: /dev/md0 -> /dev/md127... I think, like you, that this is a bug whether they boot or not. Thanks, Enrique
Created attachment 810611 [details] System log for boot-up showing temporary failure to mount root filesystem
Here is a system log extract from a boot-up attempt, after I manually added etc/mdadm.conf to the initramfs image. At about the 2.0 second point, the system failed to mount the root filesystem and dropped into an emergency shell. As you can see, the md5 device was only partially assembled at that time (the sda5 mirror was bound but not the sdb5 mirror).
After looking through the system status for a while, I exited the shell. At that point (about 50.7 seconds) the md5 assembly finished. That's where the root filesystem is; the system found it and the rest of the bootup proceeded normally.
Questions:
1. Why didn't the system wait for md5 to be fully assembled before trying to mount the root filesystem?
2. Why did the assembly completion wait until after I exited the emergency shell? Why didn't it go on while I was doing other things?
(In reply to Enrique V. Bonet Esteban from comment #4) > Hi Jan, > > The two computers that don't boot have /dev/md0 in grub.cfg file (old > configuration). I change this value for UUID and the computers booting, > but they change too the raid devices /dev/md0 -> /dev/md127... > > I think like you, this is a bug, boot or don't boot. > > Thanks, > > Enrique Hmm, you must not specify md0 in grub.cfg. Always use UUID or LABEL. And even specify: rd.md.uuid=<MDUUID> on the kernel command line.
(In reply to Alan Stern from comment #5) > Created attachment 810611 [details] > System log for boot-up showing temporary failure to mount root filesystem > > Here is a system log extract from a boot-up attempt, after I manually added > etc/mdadm.conf to the initramfs image. At about the 2.0 second point, the > system failed to mount the root filesystem and dropped into an emergency > shell. As you can see, the md5 device was only partially assembled at that > time (the sda5 mirror was bound but not the sdb5 mirror). > > After looking through the system status for a while, I exited the shell. At > that point (about 50.7 seconds) the md5 assembly finished. That's where the > root filesystem is; the system found it and the rest of the bootup proceded > normally. > > Questions: > > 1. Why didn't the system wait for md5 to be fully assembled before trying > to mount the root filesystem? > > 2. Why did the assembly completion wait until after I exited the > emergency shell? Why didn't it go on while I was doing other things? Seems like it had to "resync", because of previous usage without all parts. Please always add the MD UUID to the kernel command line. rd.md.uuid=<MD_UUID> To find the MD_UUID, run: # mdadm --detail --export <yourmddevice> |grep -F MD_UUID
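As a small illustration of the advice above, the MD_UUID line printed by mdadm --detail --export can be turned directly into the suggested kernel argument. This is only a sketch: md_uuid_karg is a hypothetical helper name, and it assumes the export output contains a single line of the form MD_UUID=<value>.

```shell
#!/bin/sh
# Hypothetical helper: read "mdadm --detail --export" output on stdin and
# print the matching rd.md.uuid= kernel argument.
# Assumption: the output contains a line of the form MD_UUID=<value>.
md_uuid_karg() {
    sed -n 's/^MD_UUID=\(.*\)$/rd.md.uuid=\1/p'
}
```

For example (run as root, with the device name replaced by the array in question):
mdadm --detail --export <yourmddevice> | md_uuid_karg
The printed string is what would be appended to the kernel command line. Whether that alone keeps the expected /dev/mdX numbers is a separate question that the following comments discuss.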
Whatever the real problem turns out to be, dracut ignoring mdadm.conf is still a problem. Harald, can you fix this? (In reply to Harald Hoyer from comment #7) > (In reply to Alan Stern from comment #5) > Please always add the MD UUID to the kernel command line. > > rd.md.uuid=<MD_UUID> I think you should report this to anaconda so that it always adds this parameter after installation. Will this fix the original problem and restore the proper /dev/mdX device number mapping?
(In reply to Harald Hoyer from comment #6) > (In reply to Enrique V. Bonet Esteban from comment #4) > > Hi Jan, > > > > The two computers that don't boot have /dev/md0 in grub.cfg file (old > > configuration). I change this value for UUID and the computers booting, > > but they change too the raid devices /dev/md0 -> /dev/md127... > > > > I think like you, this is a bug, boot or don't boot. > > > > Thanks, > > > > Enrique > > Hmm, you must not specify md0 in grub.cfg. Always use UUID or LABEL. > > And even specify: rd.md.uuid=<MDUUID> on the kernel command line.
I tried your solution on a computer with the problem. It had this RAID device mapping:
md127 -> swap
md126 -> /home
md125 -> /tmp
md124 -> /
I added this option to the kernel command line:
rd.md.uuid=a81e44e9:22c39a51:24501111:aaf04870
where a81e44e9:22c39a51:24501111:aaf04870 is the output of the command:
mdadm --detail --export /dev/md124 |grep -F MD_UUID
After rebooting the system, the new RAID device mapping is:
md2 -> /tmp
md3 -> swap
md0 -> /home
md127 -> /
The root directory is assigned to an incorrect RAID device. I attached the grub2.cfg file.
Created attachment 812002 [details] grub2.cfg configuration file
*** Bug 1018272 has been marked as a duplicate of this bug. ***
(In reply to Harald Hoyer from comment #6) > Hmm, you must not specify md0 in grub.cfg. Always use UUID or LABEL. This is nonsense, and breaks backward compatibility. You can't just introduce a rule like that in the middle of a stable release. This was *working* before the updates, and is now broken. Even if it *wasn't* unacceptable to force people into using mount-by-UUID in the general case, it would be utterly insane to do this in a stable update. The RAID code has the 'preferred minor' facility for a reason. It should be honoured.
(In reply to Harald Hoyer from comment #7) > (In reply to Alan Stern from comment #5) > > Questions: > > > > 1. Why didn't the system wait for md5 to be fully assembled before trying > > to mount the root filesystem? > > > > 2. Why did the assembly completion wait until after I exited the > > emergency shell? Why didn't it go on while I was doing other things? > > Seems like it had to "resync", because of previous usage without all parts. If that's true, it would mean there's another bug in dracut: When an MD drive containing the root filesystem needs a resync, the system should wait for the resync to finish before trying to mount the root. I'm not currently able to reproduce the behavior shown in the attachment. However, the fact that it has happened twice is disturbing; this system needs to be able to boot without an operator present at the console. > Please always add the MD UUID to the kernel command line. > > rd.md.uuid=<MD_UUID> It already was there. Harald, quit trying to dodge the issue. The basic fact is very simple: Dracut has a bug -- it doesn't copy /etc/mdadm.conf into the initramfs image when it should. Just fix the bug; I'm sure it will be easier to do that than to go around telling lots of people to put rd.md.uuid=<MD_UUID> in their kernel command lines. I agree with David Woodhouse's comment about backward compatibility. What would Linus say?
Created attachment 819310 [details] Patch to set the correct defaults for mdadmconf and lvmconf Look guys, this doesn't require a huge intellectual investment. The attached patch fixes the problem for me. Please consider including it in a bug-fix release of dracut.
Same problem here with md0 now showing up as md12X, breaking boot. (In reply to Harald Hoyer from comment #6) > Hmm, you must not specify md0 in grub.cfg. Always use UUID or LABEL. I thought it was considered dangerous to use UUID/LABEL with MD RAID-1. If the RAID devices don't get assembled properly on boot (due to a misconfiguration or bug), then a search by filesystem UUID/LABEL could find a RAID-1 member partition and mount it directly, bypassing RAID. This would lead to the mirrored array becoming out-of-sync/corrupted. Specifying md0, as I understand, ensures that scenario can never occur, so that's what I've always done and what I would like to keep doing.
Comment #14 appears to suggest that mdadmconf previously defaulted to "yes", but after this update now defaults to "no". Is that correct? If so, then would creating a file called /etc/dracut.conf.d/my-md.conf with the line: mdadmconf="yes" and then installing an updated kernel package (to generate a fresh initramfs) be enough to solve the md0->md12X renaming issue? (I'm hesitant to experiment as I reboot my machines remotely, and already got burned once...) Finally: Do F19 and F20 also break the MD device naming? i.e. Will I need mdadmconf="yes" in all future releases if I want a fixed "md0" name?
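Before rebooting a remote machine, it may help to see which mdadmconf value, if any, is set explicitly in the configuration files. The sketch below uses a hypothetical helper, dracut_last_setting, and only approximates how dracut reads its configuration: it assumes the files are processed in the order given and that a later assignment overrides an earlier one. Empty output means no explicit setting was found, so the package's built-in default applies.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the last uncommented assignment
# of a dracut option found in the given files, read in the order given.
# Usage: dracut_last_setting OPTION FILE...
dracut_last_setting() {
    opt=$1
    shift
    # Commented lines such as '#mdadmconf="no"' do not match the pattern.
    cat "$@" 2>/dev/null | sed -n "s/^[[:space:]]*${opt}=//p" | tail -n 1
}
```

With the file layout mentioned in this thread it could be called as:
dracut_last_setting mdadmconf /etc/dracut.conf /etc/dracut.conf.d/*.conf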
dracut-029-1.fc18.3 has been submitted as an update for Fedora 18. https://admin.fedoraproject.org/updates/dracut-029-1.fc18.3
Jordan: It's not just that mdadmconf previously defaulted to "yes". The current documentation (man dracut.conf) still states that it defaults to "yes". I did indeed create such a file as you suggested, and it fixed the immediate problem. However, now I'm facing a different (although related) problem; maybe someone can suggest a solution. This is probably the result of recent changes to the kernel, not dracut's fault at all. Still, the easiest way to work around it seems to lie in the startup script. As described in comment #5, my system tries to mount the md5 device, which contains the root filesystem, before it has been assembled. Of course the mount fails, and the script drops into an emergency console shell. Simply typing "exit" is enough to get things going again, but this means that unattended boots will get stuck and fail. Can anyone suggest a simple way (like a boot command-line argument) to make the startup script pause for a few seconds before trying to mount the root device? Or if the mount fails, retry it after a few seconds delay before dropping into an emergency shell?
Package dracut-029-1.fc18.3: * should fix your issue, * was pushed to the Fedora 18 testing repository, * should be available at your local mirror within two days. Update it with: # su -c 'yum update --enablerepo=updates-testing dracut-029-1.fc18.3' as soon as you are able to. Please go to the following url: https://admin.fedoraproject.org/updates/FEDORA-2013-23312/dracut-029-1.fc18.3 then log in and leave karma (feedback).
I updated dracut, added mdadmconf="yes" to the file /etc/dracut.conf, ran dracut -f and rebooted the system. The root directory is now assigned to the correct RAID device, /dev/md1.
(In reply to Enrique V. Bonet Esteban from comment #20) > I update dracut, add mdadmconf="yes" in the file /etc/dracut.conf and I run > dracut -f command and reboot the system. > > The root directory is assigned a the correct RAID device /dev/md1 Please test the update _without_ adding anything anywhere. The update should make the "mdadmconf=yes" config file change obsolete.
Hi Harald, I removed the added line (mdadmconf="yes"), ran dracut -f again and rebooted the system, and it works fine. The update solves the problem. Thanks, Enrique
I think this would fix my bug 1024015 as well, except for the fact that lvmconf defaults to no instead of yes so lvm.conf is not included in generated initramfs. Please fix that as well.
Will an update be released for Fedora 19 as well, since it also ships dracut-029? Or did this issue only ever affect the F18 package? I'd like to know because I'm wondering whether I'll be able to safely upgrade from F18 to F19/F20 without the config setting, or if I'll be unable to boot again after the upgrade if I don't add it first.
(In reply to Jordan Russell from comment #24) > Will an update be released for Fedora 19 as well, since it also ships > dracut-029? > Or did this issue only ever affect the F18 package? > > I'd like to know because I'm wondering whether I'll be able to safely > upgrade from F18 to F19/F20 without the config setting, or if I'll be unable > to boot again after the upgrade if I don't add it first. Only affects F18. In F19, the hostonly mode is the default.
This message is a reminder that Fedora 18 is nearing its end of life. Approximately 4 (four) weeks from now Fedora will stop maintaining and issuing updates for Fedora 18. It is Fedora's policy to close all bug reports from releases that are no longer maintained. At that time this bug will be closed as WONTFIX if it remains open with a Fedora 'version' of '18'. Package Maintainer: If you wish for this bug to remain open because you plan to fix it in a currently maintained version, simply change the 'version' to a later Fedora version prior to Fedora 18's end of life. Thank you for reporting this issue and we are sorry that we may not be able to fix it before Fedora 18 is end of life. If you would still like to see this bug fixed and are able to reproduce it against a later version of Fedora, you are encouraged to change the 'version' to a later Fedora version prior to Fedora 18's end of life. Although we aim to fix as many bugs as possible during every release's lifetime, sometimes those efforts are overtaken by events. Often a more recent Fedora release includes newer upstream software that fixes bugs or makes them obsolete.
It's a PITA that dracut-029-1.fc18.3 has not made it to the updates repository, as it fixes the unbootable kernel with root on mdraid. lvm is still missing.
(In reply to Nerijus Baliūnas from comment #27) > It's a PITA that dracut-029-1.fc18.3 has not made to the updates repository, > as it fixes unbootable kernel with root on mdraid. lvm is still missing.
# echo 'mdadmconf="yes"' > /etc/dracut.conf.d/my-md.conf
# echo 'lvmconf="yes"' > /etc/dracut.conf.d/my-lvm.conf
# dracut -f
Should fix your issue on F18. Or update to F19 or F20.
Fedora 18 changed to end-of-life (EOL) status on 2014-01-14. Fedora 18 is no longer maintained, which means that it will not receive any further security or bug fix updates. As a result we are closing this bug. If you can reproduce this bug against a currently maintained version of Fedora please feel free to reopen this bug against that version. If you are unable to reopen this bug, please file a new report against the current release. If you experience problems, please add a comment to this bug. Thank you for reporting this bug and we are sorry it could not be fixed.
The same problem existed on F19/F20 on 2 out of 8 machines. They are not "host-only", and I really don't get why somebody would stop copying /etc/mdadm.conf into the initrd.