Created attachment 502917 [details]
dmesg of kernel failure

Description of problem:
dracut can't find root lv because the underlying raid 10 pv is started

Version-Release number of selected component (if applicable):
kernel-3.0-0.rc1.git0.2.fc16.x86_64

How reproducible:
every time

Steps to Reproduce:
1. boot system
2.
3.

Actual results:
root lv not found and dropped into shell

Expected results:
normal boot into G3 desktop

Additional info:
kernel-2.6.39-1.fc16.x86_64 works as expected, and I don't see any mdadm or dracut updates that might have had an impact. I am attaching dmesg output from the shell during the failure. I note the following msgs:

[ 8.325272] dracut: Scanning devices sda2 sdb6 sdc2 sdd2 sde1 sde2 for LVM logical volumes VolGroup00/rawhide <----NOTE that /dev/md127 was not included!!
[ 9.027064] dracut: Could not determine kernel version used.
...
[ 9.590848] dracut: Volume group "VolGroup00" not found <***This is where root is located.
...
[ 9.591076] dracut: Skipping volume group VolGroup00
[ 9.604755] dracut: Autoassembling MD Raid
[ 9.973544] md: md127 stopped.
[ 9.976869] dracut: mdadm: Cannot start array: No such device
[ 10.559726] dracut: Autoassembling MD Raid
[ 11.000792] md: md127 stopped.

The above is repeated many times.

mdadm.conf in /etc is correct, as is the kernel cmd line in grub.
# mdadm.conf written out by anaconda
MAILADDR root
AUTO +imsm +1.x -all
ARRAY /dev/md127 level=raid10 num-devices=4 UUID=b9438b55:1d815c8b:bfe78010:bc810f04

title Fedora (3.0-0.rc1.git0.2.fc16.x86_64)
	root (hd0,0)
	kernel /vmlinuz-3.0-0.rc1.git0.2.fc16.x86_64 ro root=/dev/mapper/VolGroup00-rawhide rd_LVM_LV=VolGroup00/rawhide rd_MD_UUID=b9438b55:1d815c8b:bfe78010:bc810f04 rd_NO_LUKS rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rdshell noresume rd_NO_PLYMOUTH
	initrd /initramfs-3.0-0.rc1.git0.2.fc16.x86_64.img
title Fedora (2.6.39-1.fc16.x86_64)
	root (hd0,0)
	kernel /vmlinuz-2.6.39-1.fc16.x86_64 ro root=/dev/mapper/VolGroup00-rawhide rd_LVM_LV=VolGroup00/rawhide rd_MD_UUID=b9438b55:1d815c8b:bfe78010:bc810f04 rd_NO_LUKS rd_NO_DM LANG=en_US.UTF-8 SYSFONT=latarcyrheb-sun16 KEYTABLE=us rdshell noresume rd_NO_PLYMOUTH
	initrd /initramfs-2.6.39-1.fc16.x86_64.img
Should have made severity high.
I am seeing a similar problem. In my case I am using luks on top of software raid 1 and none of my arrays are being assembled.
> volumes VolGroup00/rawhide <----NOTE that /dev/md127 was not included!!
> [ 9.027064] dracut: Could not determine kernel version used.
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
lvm bug.

	if (sscanf(_uts.release, "%d.%d.%d", &_kernel_major, &_kernel_minor, &_kernel_release) != 3) {
		log_error("Could not determine kernel version used.");
		return 0;
	}
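To illustrate why the quoted check rejects this kernel: for a release string such as "3.0-0.rc1.git0.2.fc16.x86_64", sscanf with "%d.%d.%d" converts only two fields, because the literal '.' after the minor number does not match the '-'. The following is a minimal sketch of a more tolerant check, assuming it is acceptable to treat a missing third component as 0. The function name parse_kernel_version is hypothetical and this is not the actual libdevmapper fix.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper (not the libdevmapper code): parse a kernel release
 * string of the form "major.minor[.release][suffix]".
 * Returns 1 on success, 0 if fewer than two numeric fields were found.
 */
int parse_kernel_version(const char *release_str,
                         int *major, int *minor, int *release)
{
    int n;

    assert(release_str && major && minor && release);

    /* Pre-set the outputs so an unconverted third field stays 0. */
    *major = *minor = *release = 0;

    n = sscanf(release_str, "%d.%d.%d", major, minor, release);

    /*
     * "3.0-0.rc1..." gives n == 2: "3" and "0" convert, then the literal
     * '.' in the format fails against '-'. Assumption: two fields are
     * enough, and the release component defaults to 0.
     */
    return n >= 2;
}
```

With this sketch, "2.6.39-1.fc16.x86_64" still parses as 2, 6, 39, while "3.0-0.rc1.git0.2.fc16.x86_64" parses as 3, 0, 0 instead of being rejected.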
(In reply to comment #0)
> Created attachment 502917 [details]
> dmesg of kernel failure
>
> Description of problem:
> dracut can't find root lv because the underlying raid 10 pv is started
                                                                 ^^^^^^^
not started.
This is in libdevmapper, so all users of libdevmapper are probably affected (lvm, dmraid, cryptsetup, mpath, kpartx, ...).

(It seems there were some temporary variants during the 3.0 transition, because another distro reports 3.0.0 for me, which works. Anyway, the kernel check must be fixed.)
*** Bug 710713 has been marked as a duplicate of this bug. ***
I added a workaround until the detection of the 3.0 kernel is settled in lvm2 upstream.

At least for me the system now boots properly (there are other issues with other packages, but boot is not failing now).

The build is here (until it reaches the rawhide repo):
http://koji.fedoraproject.org/koji/buildinfo?buildID=246322

Please let me know if there is still a problem.
I'm using a Rawhide VM with the default LVM partitioning using the entire disk (in particular, nothing advanced such as RAID) and see this, so it should be affecting almost everybody.
I tried lvm 2.02.84-2.fc16 stuff and device mapper 1.02.63-2 stuff and my software raid 1 array is not being properly assembled.
And note that I uninstalled the latest kernel and then reinstalled it, so that dracut would rebuild the initramfs.
Created attachment 503032 [details]
dmesg output

If I wait long enough I get dropped into a shell. I was able to manually start the /boot array and save the dmesg output there.
Hm. I am afraid that raid assembly is another bug (dracut or mdadm?).

(And yes, I forgot to say that it needs a rebuild of the initramfs.)

Can anyone verify that it works now without MD RAID? (A default install should be such a configuration.)
Is this the correct rebuild syntax? I have tried:

cd /boot
dracut -v -f -o mdraid initramfs-3.0-0.rc1.git0.2.fc16.i686.img

I then cannot find the luks stuff again. Will attach serial console output.
Created attachment 503054 [details]
Serial Console Output

I don't have any soft/physical raid on this vm.
I had to remove the whole kernel and reinstall it to make it work, but now I am able to boot with both lvm and some crypto volumes.

When you drop to the shell, are the kernel modules properly loaded in dracut? Does dracut see the underlying device? (Try blkid from the dracut shell; it should see a "crypto_LUKS" device.)

Which kernel version are you using? (It should be at least kernel-3.0-0.rc1.git0.2.fc16.)
yum erase kernel-3.0-0.rc1.git0.2.fc16
yum install kernel-3.0-0.rc1.git0.2.fc16

reboot has got past luks,
only waiting for lots of "audit" stuff to finish.
(In reply to comment #16)
> yum erase kernel-3.0-0.rc1.git0.2.fc16
> yum install kernel-3.0-0.rc1.git0.2.fc16
>
> reboot has got past luks,
> only waiting for lots of "audit" stuff to finish.

audit still going on, would that be normal? Anyone else getting lots of audit?
There are apparently more problems. MD raid fails to assemble, which is definitely a separate problem (mdadm -As doesn't work for some reason from the init ramdisk; I'll check that later, maybe it is a regression in the kernel md code).

For audit: isn't that just selinux? Try to boot with selinux=0 (and do a full relabel later).

But lvm/cryptsetup should work with the update above.
With selinux=0 the boot completes, but I also have telinit 3 on the kernel line.
When I log in as a user, the password comes up in the clear, unhidden. Unsure what does that.
(In reply to comment #19)
> I login as user,
> password comes up in the clear, unhidden.

See bug #650890 or bug #655538 (or similar; it is probably a plymouth issue).

Anyway, thanks for the confirmation that at least the devmapper problem is fixed.
Created attachment 503114 [details]
patch for mdadm

The same problem is in mdadm: kernel version 3.0-rc1 is mishandled, and mdadm is not able to perform the requested operation (mdadm -As --auto=yes --run).

With the attached patch I can boot from lvm over md raid1 again.
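As an illustration of the kind of parsing that tolerates a two-component release such as "3.0-0.rc1...", here is a minimal sketch that encodes the version as a single comparable integer using strtoul. The function name encode_kernel_version and the major*1000000 + minor*1000 + release encoding are assumptions made for this example; this is not the attached patch, and it is not claimed to match mdadm's source.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Hypothetical helper: encode "major.minor[.release][suffix]" as
 * major*1000000 + minor*1000 + release. Returns -1 if the string does
 * not start with at least "major.minor".
 */
int encode_kernel_version(const char *release)
{
    const char *start;
    char *cp;
    unsigned long major, minor, rel = 0;

    assert(release != NULL);

    major = strtoul(release, &cp, 10);
    if (cp == release || *cp != '.')
        return -1;              /* no major number, or no '.' after it */

    start = cp + 1;
    minor = strtoul(start, &cp, 10);
    if (cp == start)
        return -1;              /* no minor number */

    /* The third component is optional: "3.0-0.rc1" has none, so rel stays 0. */
    if (*cp == '.')
        rel = strtoul(cp + 1, NULL, 10);

    return (int)(major * 1000000UL + minor * 1000UL + rel);
}
```

Under these assumptions a 3.0-rc1 release encodes as 3000000 and 2.6.39 as 2006039, so "is this kernel at least X" comparisons keep working across the 2.6 to 3.0 transition.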
I have no rights for mdadm package, reassigning. See attached patch.
I tested the mdadm fix and I do get past the raid assembly now. I am now hitting the selinux policy load loop, but that problem is likely not related to this.
(In reply to comment #23)
> I tested the mdadm fix and I do get past the raid assembly now. I am now
> hitting the selinux policy load loop, but that problem is likely not related to
> this.

Bugged the selinux loop: https://bugzilla.redhat.com/show_bug.cgi?id=711015
(In reply to comment #23)
> I tested the mdadm fix and I do get past the raid assembly now.

Do you have an RPM available to share?
I just did a local build for i686. Note that even with the systemd fix for the selinux loop (there is a scratch build for that), my system still wasn't booting with the kernel appearing to crap out. So we will likely need to wait for an rc2 build to be able to use a 3.0 kernel. Hopefully by then there will also be a new mdadm.
With the 20110608 updates including the 3.0-0.rc2.git0.1.fc16 kernel, the only workaround I need to boot is "selinux=0".
(In reply to comment #27)
> With the 20110608 updates including the 3.0-0.rc2.git0.1.fc16 kernel, the only
> workaround I need to boot is "selinux=0".

Is your root on a raid device?
(In reply to comment #28)
> (In reply to comment #27)
> > With the 20110608 updates including the 3.0-0.rc2.git0.1.fc16 kernel, the only
> > workaround I need to boot is "selinux=0".
>
> Is your root on a raid device?

No - see comment 8.
If your request for commit access to mdadm doesn't get approved soon, consider talking to a proven packager to either get added or to commit your fixes.
Any idea when mdadm will be fixed? TIA
Doug, could you please grant me access rights to mdadm or fix this BZ?

mdadm is quite seriously broken in rawhide without the kernel 3.0 patch, and it needs a rebuild.
Patch applied.
I see the new mdadm in koji, however:

$ wget http://kojipkgs.fedoraproject.org/packages/mdadm/3.2.1/5.fc16/x86_64/mdadm-3.2.1-5.fc16.x86_64.rpm
--2011-06-14 18:14:26-- http://kojipkgs.fedoraproject.org/packages/mdadm/3.2.1/5.fc16/x86_64/mdadm-3.2.1-5.fc16.x86_64.rpm
Resolving kojipkgs.fedoraproject.org... 209.132.181.10
Connecting to kojipkgs.fedoraproject.org|209.132.181.10|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2011-06-14 18:14:27 ERROR 403: Forbidden.
Should be temporary, nothing specific to that package: https://fedorahosted.org/fedora-infrastructure/ticket/2823
I was able to boot the -rc3 kernel using mdadm-3.2.1-5.fc16.i686.
kernel-2.6.40.6-0.fc15 has been submitted as an update for Fedora 15. https://admin.fedoraproject.org/updates/kernel-2.6.40.6-0.fc15
kernel-2.6.40.6-0.fc15 has been pushed to the Fedora 15 stable repository. If problems still persist, please make note of it in this bug report.