Red Hat Bugzilla – Bug 480667
nash unable to find dm devs by uuid or label causing boot to fail
Last modified: 2013-01-09 22:26:16 EST
I've got a new install on an NVIDIA SATA controller where / lives in an LVM volume group. At boot time, the boot messages show that we're not able to find the volume group, and then the root mount fails with: mount: error mounting /dev/root on /sysroot as ext3: No such file or directory
If I boot with boot_delay=5 it works fine. Booting with scsi_scan=sync does not help; the boot still fails. Both i386 and x86_64 installs on this system exhibit the problem.
More information: with encrypted LVM, no delay I've tried is long enough to make this work. However, I can clearly see that the "scsi" disk is detected, and I get prompted for the passphrase. The passphrase is accepted, then the LVM volume group and logical volumes are created, but maybe not fast enough, as the mount still fails. I now think we're missing a delay, a settle, or something similar when making LVM volumes active. I'm going to repeat this test without LVM involved.
Creating a / filesystem that was not on LVM indeed works like a charm. Even more evidence that something is wrong in LVM land.
And more data: an encrypted FS without LVM also fails to mount, so the issue lies in whatever encrypted FS and LVM have in common.
It doesn't look like a race so much as nash simply not seeing uevents for the dm devices. As a result we never add them to our block dev cache and so are never able to match the UUID. The devices do exist, though, and the dm-* devs are even in sysfs.
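For anyone wanting to confirm that the dm nodes really are present in sysfs on an affected box, here is a minimal shell sketch. It is not what nash does; it only walks a sysfs block directory and prints each dm-* node with its mapped name. The function name list_dm_devs is made up for illustration, and the dm/name attribute is an assumption about the running kernel (where it is missing, "(unknown)" is printed instead).

```shell
#!/bin/sh
# list_dm_devs [SYSFS_BLOCK_DIR]
# Hypothetical helper: print "<kernel-name> <dm-name>" for every
# device-mapper node found under the given directory (default /sys/block).
list_dm_devs() {
    sys_block="${1:-/sys/block}"
    for dev in "$sys_block"/dm-*; do
        # The glob stays literal when nothing matches, so skip non-directories.
        [ -d "$dev" ] || continue
        if [ -r "$dev/dm/name" ]; then
            name=$(cat "$dev/dm/name")
        else
            # Assumption: kernels without the dm/name attribute land here.
            name="(unknown)"
        fi
        printf '%s %s\n' "$(basename "$dev")" "$name"
    done
}
```

Running list_dm_devs with no argument from a rescue shell should show whether the kernel created the mappings, independent of what nash has cached.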
Also, using root=/dev/VolGroup00/RootVol (or whatever) seems to work fine on the box I've got failing.
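The workaround amounts to swapping the root= argument on the kernel command line. A small sketch of that substitution, assuming a space-separated command line; replace_root_arg is a hypothetical helper, and it only transforms a string (it does not edit grub.conf or any boot loader entry):

```shell
#!/bin/sh
# replace_root_arg "<cmdline>" "<new-root>"
# Hypothetical helper: print the command line with its root= argument
# replaced by root=<new-root>; all other arguments are kept in order.
replace_root_arg() {
    new_root="$2"
    out=""
    # Intentional word splitting on whitespace; assumes no glob characters
    # in the command line.
    for arg in $1; do
        case "$arg" in
            root=*) arg="root=$new_root" ;;
        esac
        out="${out:+$out }$arg"
    done
    printf '%s\n' "$out"
}
```

For example, replace_root_arg "$(cat /proc/cmdline)" /dev/VolGroup00/RootVol prints the current command line with the device path in place of the UUID, ready to copy into the boot entry.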
Reassigning to mkinitrd for now
We may not be able to fix this one by Alpha, in which case we'll have to announce the workarounds along with Alpha.
I can confirm that specifying the root volume group, as opposed to the uuid, works.
Also confirm that changing uuid to root volume group works - but note this is for F10 test build kernel-184.108.40.206-9.rc2.fc10.i686 (so this should probably block any push of this kernel toward F10 too?)
Rebuilding an initrd for kernel-220.127.116.11-159.fc10.i686 doesn't suffer from this on the same machine, btw.
*** Bug 480761 has been marked as a duplicate of this bug. ***
(In reply to comment #7)
> Also confirm that changing uuid to root volume group works - but note this is
> for F10 test build kernel-18.104.22.168-9.rc2.fc10.i686 (so this should probably
> block any push of this kernel toward F10 too?)
> Rebuilding an initrd for kernel-22.214.171.124-159.fc10.i686 doesn't suffer from this
> on the same machine, btw.
Please confirm the mkinitrd version being used.
(In reply to comment #9)
> Please confirm the mkinitrd version being used.
Bug #480761 has some other details (fstab, etc.)
I had F10-LXDE working with encrypted / and /home.
After a preupgrade to F11 it didn't boot.
Rescue mode fails with errors when loading /mnt/sysimage, so I can't fix anything.
I also tried boot.iso (23-Jan-2009 07:53, 120M): booted from it for install/upgrade, it won't boot.
intel_iommu=on or off makes no difference.
It gets stuck at "Virtual Kernel Memory Layout"; I'll get some sort of screen image as soon as I find the old camera.
It's an old P3 800 MHz with 512 MB SDRAM (its max) and 2x 250 GB IDE HDs.
PS: everything is ext4 except /boot.
Created attachment 329936 [details]
JPG Screenshot at kernel freeze
This is fixed in mkinitrd-6.0.76, which will be in the next rawhide push.
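To check whether an installed mkinitrd is at least that version, something like the following may help. It is only a sketch: version_ge is a hypothetical helper that compares purely numeric dotted versions and knows nothing about the RPM release field or epochs.

```shell
#!/bin/sh
# version_ge A B
# Hypothetical helper: succeed (exit 0) when dotted numeric version A >= B.
# Missing components are treated as 0, so "6.0" compares as "6.0.0".
version_ge() {
    a="$1"
    b="$2"
    while [ -n "$a" ] || [ -n "$b" ]; do
        a_head="${a%%.*}"
        b_head="${b%%.*}"
        [ "${a_head:-0}" -gt "${b_head:-0}" ] && return 0
        [ "${a_head:-0}" -lt "${b_head:-0}" ] && return 1
        # Drop the component just compared; stop when none are left.
        case "$a" in *.*) a="${a#*.}" ;; *) a="" ;; esac
        case "$b" in *.*) b="${b#*.}" ;; *) b="" ;; esac
    done
    return 0
}
```

A possible use: version_ge "$(rpm -q --qf '%{VERSION}' mkinitrd)" 6.0.76 && echo "mkinitrd has the fix". Remember to rebuild the initrd after updating, since an existing image keeps the old init.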
*** Bug 484508 has been marked as a duplicate of this bug. ***
*** Bug 484474 has been marked as a duplicate of this bug. ***
*** Bug 481037 has been marked as a duplicate of this bug. ***
*** Bug 485274 has been marked as a duplicate of this bug. ***
I have two systems experiencing this problem: a T60 laptop and a Lenovo desktop system. Both are running up-to-date rawhide.
I did a clean reinstall of F10 on the desktop box (which worked fine), then enabled the rawhide repo and upgraded to rawhide via yum. On reboot the system hung trying to mount the rootfs.
I tried booting with both UUID and VG as the root, with and without boot_delay=5 (or 10 even) with no success.
If my T60 (intel chipset) didn't exhibit the same problem, I'd say it was NV SATA, but that's not the case.
I've got a serial console log of the boot that I'll attach
Created attachment 333103 [details]
serial console log of boot failure on Lenovo desktop
This is a console log of the boot failure on the Lenovo desktop box. At the point where the output stops, the kernel is still alive (i.e. sysrq works).
(In reply to comment #20)
> Created an attachment (id=333103) [details]
> serial console log of boot failure on Lenovo desktop
> This is a console log of the boot failure on the Lenovo desktop box. At the
> point where the output stops, the kernel is still alive (i.e. sysrq works).
What does "rpm -q mkinitrd" say? Do you have version 6.0.76 or later?
If you do, please try booting with "root=/dev/VolGroup00/LogVol00" as the kernel cmdline argument (replacing the existing root= argument). If that still does not boot, you are seeing another problem; in that case please create a new bug.
When creating a new bug please also do (as root):
zcat /boot/initrd-<failing-kern-version>.img | cpio -i init
This will extract the init script from the initrd; please then attach that init script to the new bug.
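If you would rather not unpack into the current directory, the same extraction can be wrapped so the init script lands in a scratch directory. This is a sketch under a few assumptions: extract_init is a hypothetical helper, the initrd is a gzip-compressed cpio archive, and the cpio in use accepts --quiet (GNU cpio does). gzip -dc is used here as the equivalent of zcat.

```shell
#!/bin/sh
# extract_init <initrd-image> <output-dir>
# Hypothetical helper: unpack only the "init" member of a gzip-compressed
# cpio initrd into <output-dir>, leaving the current directory untouched.
extract_init() {
    initrd="$1"
    outdir="$2"
    mkdir -p "$outdir" || return 1
    # Decompress in the caller's directory; only cpio runs inside outdir.
    gzip -dc "$initrd" | ( cd "$outdir" && cpio -i --quiet init )
}
```

For example, extract_init /boot/initrd-<failing-kern-version>.img /tmp/initrd-debug leaves the script at /tmp/initrd-debug/init, where the root-device lines can be inspected before attaching it.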
$ rpm -q mkinitrd
Actually, the boot line on the desktop box already refers to the volgroup rather than a label or a UUID, so it's probably another bug. I'll close this one (again) and open a new one.
mkinitrd-6.0.71-4.fc10 has been pushed to the Fedora 10 stable repository. If problems still persist, please make note of it in this bug report.