Bug 237415

Summary:

Kernels above 2.6.20-1.3069 fail to boot LVM

Product:

[Fedora] Fedora

Reporter:

Andrew Baumhauer <abaumhau+fc-bugzilla>

Component:

mkinitrd

Assignee:

Peter Jones <pjones>

Status:

CLOSED NEXTRELEASE

QA Contact:

David Lawrence <dkl>

Severity:

high

Docs Contact:

Priority:

high

Version:

rawhide

CC:

amk, amlau, bobgus, bugzilla, piskozub, redhat-bugzilla, ron, russell, zing

Target Milestone:

---

Target Release:

---

Hardware:

x86_64

OS:

Linux

Whiteboard:

Fixed In Version:

Doc Type:

Bug Fix

Last Closed:

2007-10-29 18:29:33 UTC

Attachments:

Description	Flags
Working initrd image	none
Broken initrd (causes kernel panic)	none
lvmdump from FC7 install DVD running in 'rescue' mode	none

Description Andrew Baumhauer 2007-04-22 12:53:20 UTC

Description of problem:
All kernels newer than 2.6.20-1.3069 fail to boot when the root filesystem is
on an LVM volume.  Here is the output from 2.6.20-1.3088:

Loading mbcache.ko module
Loading jbd.ko module
Loading ext3.ko module
Loading dm-mod.ko module
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialized:
Loading dm-mirror.ko module
Loading dm-zero.ko module
Loading dm-snapshot.ko module
Trying to resume device (LABEL=swap)
Creating root device
Mounting root filesystem
mount: could not find filesystem '/dev/root'
Setting up other filesystems
Setting up new root fs
setuproot: moving /dev/ failed: no such file or directory
no fstab.sys, mounting internal defaults
setuproot: error mounting /proc: no such file or directory
setuproot: error mounting /sys no such file or directory
Switching to new root and running init
Unmounting old /dev
Unmounting old /proc
Unmounting old /sys
switchroot: mount failed: no such file or directory
Booting has failed;
Kernel panic - not syncing: Attempted to kill init!


Version-Release number of selected component (if applicable):
kernels > 2.6.20-1.3069 with mkinitrd-6.0.9-1

How reproducible:
Install any kernel newer than 2.6.20-1.3069 and reboot; the kernel panics.

Steps to Reproduce:
1.
2.
3.
  
Actual results:
kernel panic

Expected results:
A running F7 Test 3 (f7t3) system.


Additional info:

In kernel 2.6.20-1.3069, everything works fine. The system boots and
discovers the drives and the LVM partitions, then calls /etc/rc.sysinit
and boot continues after I decrypt the volumes.

In all kernels since then, the system panics before starting
/etc/rc.sysinit. From watching the device discovery, it appears that LVM
is never initialized, so the later stages of init do not see the
drives. I am fairly sure my two encrypted volumes are not the cause, but I
cannot rule out the non-standard naming of my volume group (I have
always renamed volume groups from VolumeGroup00 to something else).


/boot/grub/menu.lst:
title Fedora (2.6.20-1.3084.fc7)
root (hd0,0)
kernel /vmlinuz-2.6.20-1.3084.fc7 ro root=LABEL=/ vga=792 rhgb quiet
initrd /initrd-2.6.20-1.3084.fc7.img

/etc/crypttab:
swap /dev/mapper/vg_host-lv_swap /dev/urandom swap,cipher=twofish-cbc-essiv:sha256
var /dev/mapper/vg_host-lv_var /keyring/var.key
home /dev/mapper/vg_host-lv_home /keyring/home.key

/etc/fstab:
LABEL=/ / ext3 defaults 1 1
LABEL=/boot /boot ext3 defaults 1 2
tmpfs /dev/shm tmpfs defaults 0 0
devpts /dev/pts devpts gid=5,mode=620 0 0
sysfs /sys sysfs defaults 0 0
proc /proc proc defaults 0 0
# Encrypted filesystems
LABEL=/home /home ext3 defaults 1 2
LABEL=/var /var ext3 defaults 1 2
LABEL=swap swap swap defaults 0 0

/dev/mapper: (on kernel 3069)
total 0
drwxr-xr-x 2 root root 280 2007-04-18 16:02 .
drwxr-xr-x 16 root root 5160 2007-04-19 12:19 ..
crw------- 1 root root 10, 63 2007-04-18 12:01 control
brw-rw---- 1 root disk 253, 6 2007-04-18 16:02 home
brw-rw---- 1 root disk 253, 11 2007-04-18 16:02 swap
brw-rw---- 1 root disk 253, 5 2007-04-18 16:02 var
brw-rw---- 1 root disk 253, 1 2007-04-18 12:01 vg_host-lv_home
brw-rw---- 1 root disk 253, 0 2007-04-18 16:02 vg_host-lv_root
brw-rw---- 1 root disk 253, 3 2007-04-18 12:01 vg_host-lv_swap
brw-rw---- 1 root disk 253, 2 2007-04-18 12:01 vg_host-lv_var

Comment 1 Andrew Baumhauer 2007-04-22 12:58:09 UTC

Created attachment 153251
Working initrd image

Comment 2 Andrew Baumhauer 2007-04-22 13:00:33 UTC

Created attachment 153252
Broken initrd (causes kernel panic)

Comment 3 Andrew Baumhauer 2007-04-23 02:34:56 UTC

After analyzing the two nash init scripts in the initrd files attached, I
noticed that the working one had lvm commands, and the other did not (as I
suspected).  After studying and debugging the mkinitrd script, I isolated the
problem to the following function call:

lvshow() {
    lvm.static lvs --ignorelockingfailure --noheadings -o vg_name \
        $1 2>/dev/null | egrep -v '^ *(WARNING:|Volume Groups with)'
}

The problem stems from my /etc/fstab referring to all of the LVM partitions by
label: LABEL=/, LABEL=/var, and LABEL=/home.  The lvshow function (lvm.static lvs)
cannot map a LABEL= spec such as LABEL=/ to a volume group, so it returns
nothing and mkinitrd leaves the LVM commands out of the nash script.

The work-around was to change LABEL=/ to /dev/mapper/VolumeGroup00-root and this
was enough to output the correct nash script to start LVM at boot.
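
To see which fstab entries are affected before applying the work-around, a
small check along these lines may help. It is only a sketch:
fstab_label_entries is a made-up helper name, not something mkinitrd provides,
and it simply lists entries whose device field uses LABEL= or UUID= rather
than a device path.

```shell
#!/bin/sh
# fstab_label_entries [FSTAB]
# Hypothetical helper (not part of mkinitrd): print "mountpoint spec" for
# every fstab entry whose device field starts with LABEL= or UUID=.
# Comment lines and lines with fewer than two fields are skipped.
fstab_label_entries() {
    awk '$1 !~ /^#/ && NF >= 2 && $1 ~ /^(LABEL|UUID)=/ { print $2, $1 }' \
        "${1:-/etc/fstab}"
}
```

Calling fstab_label_entries with no argument reads /etc/fstab. Any entry it
prints for a filesystem that lives on LVM is a candidate for the
/dev/mapper/... replacement described above.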

The permanent fix is to determine if Fedora is going to use LABEL=/ in fstab
(and use it in /etc/crypttab), and if so, fix the mkinitrd function to locate
LVM volume groups by label.
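
One possible shape for such a fix, as a sketch only: resolve the fstab spec to
a device path first, then pass the result to lvshow. resolve_spec is a
hypothetical name, and the sketch assumes findfs is available where mkinitrd
runs; the RESOLVER variable exists only so the lookup command can be swapped
out.

```shell
#!/bin/sh
# resolve_spec SPEC
# Hypothetical helper: turn an fstab device field into a device path.
# Assumption: findfs can resolve LABEL= and UUID= specs on this system.
# Anything that is not a LABEL=/UUID= spec is printed unchanged.
resolve_spec() {
    case "$1" in
        LABEL=*|UUID=*)
            "${RESOLVER:-findfs}" "$1"
            ;;
        *)
            printf '%s\n' "$1"
            ;;
    esac
}
```

With this in place, a call such as lvshow "$(resolve_spec "LABEL=/")" would
hand lvm.static lvs a device path, which is the form that worked in the
work-around above.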

Comment 4 Bob Gustafson 2007-06-02 03:46:13 UTC

See also
    bug 241949

    bug 242043

Whatever the cause, it has not been fixed yet.

Comment 5 Bob Gustafson 2007-06-02 04:19:30 UTC

It appears that the structure and content of /etc/mdadm.conf have changed between
my (running) FC6 and my attempted installation of FC7 (two different hardware
systems).

The FC6 /etc/mdadm.conf shows 3 raid devices (boot,swap,/dev/rootvg/root)

# mdadm.conf written out by anaconda
DEVICE partitions
MAILADDR root
ARRAY /dev/md2 super-minor=2
ARRAY /dev/md0 super-minor=0
ARRAY /dev/md1 super-minor=1


The F7 mdadm.conf (in the initrd file) shows ONLY TWO raid devices (boot,swap)

# mdadm.conf written out by anaconda
DEVICE partitions
MAILADDR root
ARRAY /dev/md0 level=raid1 num-devices=2 uuid=47bba70b:b76ffd5f:816f55b8:cf2ee184
ARRAY /dev/md1 level=raid1 num-devices=2 uuid=36c22074:b238a704:85d99d8b:e9dafa99
[root@hoho0 etc]# 

(Note that the long lines wrapped)

Since the third RAID device is not listed in the mdadm.conf file, this could
be the reason why the LVM device was not found (it is on the md2 device).

Comment 6 Bob Gustafson 2007-06-02 04:43:05 UTC

I did not see any LABEL= entries in the /etc/fstab files.

This is from the FC6 system
/home/user1/Desktop/initrd/etc

[root@hoho0 etc]# cat /etc/fstab
/dev/rootvg/root        /                       ext3    defaults        1 1
/dev/md0                /boot                   ext3    defaults        1 2
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
tmpfs                   /dev/shm                tmpfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
sysfs                   /sys                    sysfs   defaults        0 0
/dev/md1                swap                    swap    defaults        0 0
[root@hoho0 etc]# 

This is from the F7 /mnt/sysimage/fstab

[root@hoho0 Desktop]# cat fstab
/dev/rootvg/root        /                       ext3    defaults        1 1
/dev/md0                /boot                   ext3    defaults        1 2
devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
tmpfs                   /dev/shm                tmpfs   defaults        0 0
proc                    /proc                   proc    defaults        0 0
sysfs                   /sys                    sysfs   defaults        0 0
/dev/md1                swap                    swap    defaults        0 0
[root@hoho0 Desktop]#

Comment 7 Bradley 2007-06-02 05:12:21 UTC

My mdadm.conf was missing the last RAID array too (only md0 was listed, not
md1). I had already rebuilt the initrd before noticing, so I don't know whether
the initrd was also a problem.

Adding the mdadm -Es output to /etc/mdadm.conf from rescue mode allowed me to boot.
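
A rough sketch of how to find what is missing before editing the file:
missing_arrays is a hypothetical helper that compares a saved copy of the
mdadm -Es output against mdadm.conf and prints only the ARRAY lines that
mdadm.conf lacks. Matching on the device name in the second field is an
assumption; matching on UUID may be more robust if device names differ.

```shell
#!/bin/sh
# missing_arrays CONF SCAN
# Hypothetical helper: CONF is an mdadm.conf, SCAN is a file holding saved
# "mdadm -Es" output. Prints each ARRAY line from SCAN whose device name
# (second field) does not appear on any ARRAY line in CONF.
missing_arrays() {
    awk 'FILENAME == ARGV[1] { if ($1 == "ARRAY") seen[$2] = 1; next }
         $1 == "ARRAY" && !($2 in seen)' "$1" "$2"
}
```

For example, after saving the scan with mdadm -Es > /tmp/scan, running
missing_arrays /etc/mdadm.conf /tmp/scan shows the lines to review and append.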

Comment 8 Bob Gustafson 2007-06-02 05:19:00 UTC

Created attachment 155963
lvmdump from FC7 install DVD running in 'rescue' mode

This is obviously not from the disk installed FC7 (that crashed on boot), but
rather from the F7 DVD running in 'rescue' mode. Might be useful.

Comment 9 Bob Gustafson 2007-06-02 12:28:43 UTC

I must have missed your 01:12 EST message last night. Saw it in email this morning.

Yes - Welcome Fedora..

I am writing this bug comment from F7

As well as doing your mdadm -Es trick (UUID in caps doesn't matter), I had to
rebuild the initrd.

Thanks much

Comment 10 Keith G. Robertson-Turner 2007-06-04 11:03:16 UTC

Ref: bug 241949 #c14

There definitely seems to be a bug in mkinitrd: it fails to correctly grok
LVM configurations, and consequently does not include the necessary
commands in the nash script.

My resolution was basically to completely reconstruct my initrd manually using
cpio, gzip and vim ... reminding me of my Linux From Scratch experiences from
years ago :)

It worked, but I fear the next kernel update like the Devil ;)
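
For anyone wanting to check a freshly generated initrd before rebooting into
it, here is a minimal sketch. init_has_lvm is a hypothetical helper, and it
assumes the nash init script invokes LVM on lines that begin with the word
lvm; it only reports whether any such line is present.

```shell
#!/bin/sh
# init_has_lvm INIT_SCRIPT
# Hypothetical helper: succeed (exit 0) if the given nash init script has at
# least one line whose first word is "lvm", fail otherwise.
# Assumption: mkinitrd writes LVM activation as lines starting with "lvm".
init_has_lvm() {
    grep -Eq '^[[:space:]]*lvm[[:space:]]' "$1"
}
```

The init script can be obtained by unpacking the image in an empty directory,
for example with zcat on the initrd piped into cpio -id (assuming a
gzip-compressed cpio image), and then running init_has_lvm init.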

Comment 11 Vaclav "sHINOBI" Misek 2007-06-05 10:05:06 UTC

Hmmm, mdadm -Es helped me too. mdadm.conf had the same UUIDs for both disks
(I'm using RAID 1).

Comment 12 Jeremy Katz 2007-10-29 18:29:33 UTC

Just tested a fresh install of today's Fedora 8 candidate with LVM on top of
RAID (with multiple arrays) and it booted fine.  There have been some fixes in
this area, so closing as NEXTRELEASE.

Comment 13 Christopher Brown 2007-12-13 16:42:43 UTC

*** Bug 300701 has been marked as a duplicate of this bug. ***