Bug 480667 - nash unable to find dm devs by uuid or label causing boot to fail
Status: CLOSED NEXTRELEASE
Product: Fedora
Classification: Fedora
Component: mkinitrd
Version: rawhide
Hardware: All
OS: Linux
Priority: low
Severity: medium
Assigned To: Peter Jones
QA Contact: Fedora Extras Quality Assurance
Keywords: Reopened
Duplicates: 480761 481037 484474 484508 485274
Depends On:
Blocks: F11Alpha/F11AlphaBlocker
 
Reported: 2009-01-19 13:48 EST by Jesse Keating
Modified: 2013-01-09 22:26 EST (History)
Fixed In Version: 6.0.71-4.fc10
Doc Type: Bug Fix
Last Closed: 2009-02-25 11:07:26 EST


Attachments
JPG Screenshot at kernel freeze (597.57 KB, image/jpeg)
2009-01-25 08:59 EST, Frank Murphy
serial console log of boot failure on Lenovo desktop (18.59 KB, text/plain)
2009-02-24 18:58 EST, Clark Williams

Description Jesse Keating 2009-01-19 13:48:08 EST
I've got a new install on an nvidia SATA controller where / lives in an LVM volume group.  At boot time, the boot messages show that we're not able to find the volume group, and then the mount fails: mount: error mounting /dev/root on /sysroot as ext3: No such file or directory

If I boot with boot_delay=5 it'll work fine.  Booting with scsi_scan=sync also fails.  Both i386 and x86_64 installs on this system exhibit the problem.
Comment 1 Jesse Keating 2009-01-19 14:29:21 EST
More information.  When doing encrypted LVM, I haven't yet found a delay long enough to make this work.  However, I can clearly see that the "scsi" disk is detected, and I get prompted for the passphrase.  The passphrase is accepted, then the LVM volume group and logical volumes are created, but maybe not fast enough, as the mount still fails.  I now think we're missing some delay or settle step when making LVM volumes active.  I'm going to repeat this test without LVM involved.
Comment 2 Jesse Keating 2009-01-19 14:48:53 EST
Creating a / filesystem that was not on LVM indeed works like a charm.  Even more evidence that something is wrong in LVM land.
Comment 3 Jesse Keating 2009-01-19 16:53:16 EST
More data: an encrypted FS without LVM also fails to mount, so the problem lies in whatever encrypted FS and LVM have in common.
Comment 4 Jeremy Katz 2009-01-19 17:47:22 EST
It doesn't look like we're racing so much as just not seeing uevents in nash for the dm devices; we therefore never add them to our block device cache and so are never able to match the UUID.  The devices do exist, though, and the dm-* devs are even in sysfs.
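
The observation that the dm-* devices are present in sysfs can be checked from any shell that has sysfs mounted. The following is only a rough sketch: list_dm_devs is a hypothetical helper (it is not part of nash or mkinitrd), and it assumes the standard sysfs layout in which each block device directory carries a "dev" attribute holding its major:minor number.

```shell
# Hypothetical helper, not part of nash or mkinitrd.
# Prints "<name> <major:minor>" for every dm-* block device found under
# the given sysfs block directory (defaults to /sys/block).
list_dm_devs() {
    sys_block=${1:-/sys/block}
    for d in "$sys_block"/dm-*; do
        # Skip the unexpanded glob (no dm devices) or unreadable entries.
        [ -r "$d/dev" ] || continue
        printf '%s %s\n' "${d##*/}" "$(cat "$d/dev")"
    done
}

list_dm_devs
```

If this prints entries while the by-UUID lookup still fails, that would be consistent with the devices existing but never reaching the block device cache.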

Also, using root=/dev/VolGroup00/RootVol (or whatever) seems to work fine on the box I've got failing.  
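
The workaround amounts to swapping the root= argument on the kernel command line for the logical volume's device path. A minimal sketch of that substitution, assuming a space-separated command line; rewrite_root_arg is a hypothetical helper (not part of mkinitrd or grub), and the UUID in the example is a placeholder:

```shell
# Hypothetical helper, not part of mkinitrd or grub.
# Replaces any root=... argument in a kernel command line with
# root=<device path>; all other arguments are passed through unchanged.
rewrite_root_arg() {
    cmdline=$1
    new_root=$2
    out=
    for arg in $cmdline; do
        case $arg in
            root=*) arg="root=$new_root" ;;
        esac
        out="${out:+$out }$arg"
    done
    printf '%s\n' "$out"
}

# Placeholder UUID; the volume group and volume names are whatever the
# install actually uses.
rewrite_root_arg "ro root=UUID=placeholder-uuid rhgb quiet" /dev/VolGroup00/RootVol
```

The same edit can of course be made by hand at the boot loader prompt; the sketch only shows which argument changes.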

Reassigning to mkinitrd for now
Comment 5 Jesse Keating 2009-01-21 14:56:01 EST
We may not be able to fix this one by Alpha, in which case we'll have to announce the workarounds with Alpha.
Comment 6 Luke Macken 2009-01-21 16:36:47 EST
I can confirm that specifying the root volume group, as opposed to the uuid, works.
Comment 7 Kevin R. Page 2009-01-21 17:49:11 EST
Also confirm that changing uuid to root volume group works - but note this is for F10 test build kernel-2.6.28.1-9.rc2.fc10.i686 (so this should probably block any push of this kernel toward F10 too?)

Rebuilding an initrd for kernel-2.6.27.9-159.fc10.i686 doesn't suffer from this on the same machine, btw.
Comment 8 Kevin R. Page 2009-01-21 17:50:15 EST
*** Bug 480761 has been marked as a duplicate of this bug. ***
Comment 9 Jesse Keating 2009-01-21 18:35:14 EST
(In reply to comment #7)
> Also confirm that changing uuid to root volume group works - but note this is
> for F10 test build kernel-2.6.28.1-9.rc2.fc10.i686 (so this should probably
> block any push of this kernel toward F10 too?)
> 
> Rebuilding an initrd for kernel-2.6.27.9-159.fc10.i686 doesn't suffer from this
> on the same machine, btw.

Please confirm the mkinitrd version being used.
Comment 10 Kevin R. Page 2009-01-22 12:57:56 EST
(In reply to comment #9)
> Please confirm the mkinitrd version being used.

mkinitrd-6.0.71-3.fc10.i386

Bug #480761 has some other details (fstab, etc.)
Comment 11 Frank Murphy 2009-01-25 04:26:38 EST
Had F10-LXDE with encrypted / and /home working.

Preupgrade to F11 didn't boot.
Rescue mode fails loading /mnt/sysimage (errors),
so I can't fix anything.

Downloaded
 boot.iso   23-Jan-2009 07:53  120M
Booted from it as install/upgrade; it won't boot.
intel_iommu=on/off makes no difference.

Stuck at "Virtual Kernel Memory Layout";
will get some sort of screen image as
soon as I find the old camera.

It's an old P3 800 MHz, 512 MB SDRAM (its max),
2x 250 GB IDE HDs
Comment 12 Frank Murphy 2009-01-25 04:27:24 EST
PS: ext4 except /boot
Comment 13 Frank Murphy 2009-01-25 08:59:13 EST
Created attachment 329936
JPG Screenshot at kernel freeze
Comment 14 Hans de Goede 2009-02-04 15:50:48 EST
This is fixed in mkinitrd-6.0.76, which will be in the next rawhide push.
Comment 15 Hans de Goede 2009-02-08 08:35:53 EST
*** Bug 484508 has been marked as a duplicate of this bug. ***
Comment 16 Eric Sandeen 2009-02-08 21:58:59 EST
*** Bug 484474 has been marked as a duplicate of this bug. ***
Comment 17 Hans de Goede 2009-02-12 06:16:22 EST
*** Bug 481037 has been marked as a duplicate of this bug. ***
Comment 18 Hans de Goede 2009-02-12 13:25:57 EST
*** Bug 485274 has been marked as a duplicate of this bug. ***
Comment 19 Clark Williams 2009-02-24 18:56:08 EST
I have two systems experiencing this problem: a T60 laptop and a Lenovo desktop system. Both are running up-to-date rawhide.

I did a clean reinstall of F10 on the desktop box (which worked fine), then enabled the rawhide repo and upgraded to rawhide via yum. On reboot the system hung trying to mount the rootfs.

I tried booting with both UUID and VG as the root, with and without boot_delay=5 (or 10 even) with no success. 

If my T60 (intel chipset) didn't exhibit the same problem, I'd say it was NV SATA, but that's not the case. 

I've got a serial console log of the boot that I'll attach
Comment 20 Clark Williams 2009-02-24 18:58:19 EST
Created attachment 333103
serial console log of boot failure on Lenovo desktop

This is a console log of the boot failure on the Lenovo desktop box. At the point where the output stops, the kernel is still alive (i.e. sysrq works).
Comment 21 Hans de Goede 2009-02-25 02:54:18 EST
(In reply to comment #20)
> Created an attachment (id=333103)
> serial console log of boot failure on Lenovo desktop
> 
> This is a console log of the boot failure on the Lenovo desktop box. At the
> point where the output stops, the kernel is still alive (i.e. sysrq works).

What does "rpm -q mkinitrd" say? Do you have version 6.0.76 or later?
If you do, please try booting with "root=/dev/VolGroup00/LogVol00" as the kernel cmdline argument (replacing the existing root= argument). If that still does not boot, you are seeing another problem; in that case please create a new bug.
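
The "6.0.76 or later" check can be done by eye, but a small sketch of a dotted-version comparison may help when scripting it. version_ge is a hypothetical helper (it is not an rpm or mkinitrd command) and assumes purely numeric, dot-separated version fields:

```shell
# Hypothetical helper, not an rpm or mkinitrd command.
# Succeeds (exit 0) if dotted version $1 is >= dotted version $2.
# Assumes numeric fields only; missing fields are treated as 0.
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        fa=${a%%.*}; fb=${b%%.*}
        # Drop the leading field, or empty the string if it was the last one.
        [ "$a" = "$fa" ] && a= || a=${a#*.}
        [ "$b" = "$fb" ] && b= || b=${b#*.}
        fa=${fa:-0}; fb=${fb:-0}
        [ "$fa" -gt "$fb" ] && return 0
        [ "$fa" -lt "$fb" ] && return 1
    done
    return 0
}

# 6.0.76 is the version named in comment 14 as carrying the fix.
if version_ge 6.0.78 6.0.76; then
    echo "mkinitrd is new enough"
else
    echo "mkinitrd predates the fix"
fi
```

On a real system the first argument could come from rpm -q --qf '%{VERSION}' mkinitrd instead of being typed in.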

When creating a new bug, please also do (as root):
zcat /boot/initrd-<failing-kern-version>.img | cpio -i init

This will extract the init script from the initrd into the current directory; please then attach that init script to the new bug.

Thanks!
Comment 22 Clark Williams 2009-02-25 11:07:26 EST
$ rpm -q mkinitrd
mkinitrd-6.0.78-1.fc11.x86_64

Actually, the boot line on the desktop box already refers to the volgroup rather than a label or a UUID, so it's probably another bug. I'll close this one (again) and open a new one.
Comment 23 Fedora Update System 2009-03-09 19:10:31 EDT
mkinitrd-6.0.71-4.fc10 has been pushed to the Fedora 10 stable repository.  If problems still persist, please make note of it in this bug report.