Description of problem:
If you boot F8T3 Live on a system that has an LVM PV within a RAID1 array, F8T3 Live detects the LVM volume group without detecting the RAID1 array that contains it.

Version-Release number of selected component (if applicable):
Booting F8T3 Live CD on a system that has LVM inside RAID1.

How reproducible:
Always.

Steps to Reproduce:
1. Install Fedora 7 onto a system with two SATA drives:
   1a. Partition each drive manually with one RAID autodetect for /boot and one for LVM (PV)
   1b. Create two RAID1 drives (one for /boot and one for the LVM)
   1c. Create a VG on the LVM RAID
   1d. Add four LVs to the VG (one each for /, /var, /home, swap)
2. Install with default package set, plus Virtualisation
3. Complete first boot
4. Log in, start a Terminal, "su -" to root
5. Verify configuration with "fdisk -l /dev/sda", "fdisk -l /dev/sdb", "cat /proc/mdstat", "pvs", and "lvs" (see Additional Info for expected output)
6. Reboot, and boot F8T3 Live CD
7. Log in, start a Terminal, "su -" to root
8. Check configuration with "cat /proc/mdstat", "pvs", and "vgs"

Actual results:
[root@localhost ~]# cat /proc/mdstat
Personalities :
unused devices: <none>
[root@localhost ~]# pvs
  PV         VG     Fmt  Attr PSize   PFree
  /dev/sda2  vg7101 lvm2 a-   186.22G 150.22G
[root@localhost ~]# vgs
  VG     #PV #LV #SN Attr   VSize   VFree
  vg7101   1   4   0 wz--n- 186.22G 150.22G

Expected results (generated from F7 Live CD):
[root@localhost ~]# cat /proc/mdstat
Personalities :
unused devices: <none>
[root@localhost ~]# pvs
[root@localhost ~]# vgs
  No volume groups found

Additional info (after booting into successfully-installed Fedora 7):
[root@neuromancer ~]# fdisk -l /dev/sda

Disk /dev/sda: 400.0 GB, 400087375360 bytes
255 heads, 63 sectors/track, 48641 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000a855c

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   fd  Linux raid autodetect
/dev/sda2              14       24324   195278107+  fd  Linux raid autodetect

[root@neuromancer ~]# fdisk -l /dev/sdb

Disk /dev/sdb: 400.0 GB, 400088457216 bytes
255 heads, 63 sectors/track, 48641 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00027767

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104391   fd  Linux raid autodetect
/dev/sdb2              14       24324   195278107+  fd  Linux raid autodetect

[root@neuromancer ~]# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
      195278016 blocks [2/2] [UU]

unused devices: <none>
[root@neuromancer ~]# pvs
  PV         VG     Fmt  Attr PSize   PFree
  /dev/md1   vg7101 lvm2 a-   186.22G 150.22G
[root@neuromancer ~]# lvs
  LV     VG     Attr   LSize  Origin Snap%  Move Log Copy%
  F7home vg7101 -wi-ao  8.00G
  F7root vg7101 -wi-ao  8.00G
  F7swap vg7101 -wi-ao  4.00G
  F7var  vg7101 -wi-ao 16.00G
Was there a volume group on the underlying partitions before the raid1 was created?
No. The F7 install that suffered at the hands of F8T3 was the first thing put on both drives, immediately after they were purchased brand new.
This is a bug in the LVM tools. Here is what happens at boot: the normal sequence in rc.sysinit is to check for an /etc/mdadm.conf and, if one exists, start the mdraid arrays. LVM is started afterwards. On the live image we don't have an mdadm.conf, so we never start the mdraid array. But the lvm tools still activate the volume group, even though the block devices also carry mdraid metadata.

There are a few ways of fixing this:

1) Always start mdraid arrays in rc.sysinit, regardless of the existence of an mdadm.conf. This feels somewhat risky, as it's a substantial change from what we've done in the past.
2) Create an empty mdadm.conf on the live image. That makes the live image essentially the first case, but (slightly) more constrained. Given the download numbers for the live images, I don't know that this is really any better.
3) Fix the lvm tools to look for mdraid metadata and not activate a VG off of a base block device that has this metadata.
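The boot ordering described in this comment can be sketched as a dry-run shell function. This is not the actual rc.sysinit code: the function name `plan_storage_init` and its output strings are hypothetical, and it only prints the steps it would take rather than calling mdadm or lvm.

```shell
#!/bin/bash
# Hypothetical dry-run sketch of the ordering described above.
# It prints the steps instead of executing them; it is not rc.sysinit.
plan_storage_init() {
    local conf="${1:-/etc/mdadm.conf}"
    if [ -f "$conf" ]; then
        # Installed system: arrays are assembled before LVM scans devices.
        echo "assemble md arrays (config: $conf)"
    else
        # Live image: no config file, so the md step is skipped entirely.
        echo "skip md assembly (no $conf)"
    fi
    # LVM activation is reached in both cases, which is how a VG can end
    # up activated on a bare RAID1 component such as /dev/sda2.
    echo "activate LVM volume groups"
}

plan_storage_init /etc/mdadm.conf
```

On a live image without /etc/mdadm.conf the first line of output would be the "skip" message, which corresponds to option 3 being the only safeguard left: the lvm tools themselves have to refuse the component device.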
While this is an annoying bug, I don't think I would consider it a blocker. This isn't a typical partition case we would see. Moving to Target as we would take a fix if it showed up, but I don't think we would hold up the release for this.
The lvm tools are supposed to already check for md devices and ignore them, but there are some caveats:

Is there an lvm.conf - and if so does it have 'md_component_detection = 0' in it. (The entry should be missing, or set to 1 - never 0).

What version of md metadata is being used? Only some versions are detected correctly - this only got fixed upstream this week.

To get lvm2 diagnostics, you need to run 'vgscan -vvvv' at the point where things have gone wrong and attach it to this bugzilla.

Another cause might be if nash was handling this internally instead of using the lvm2 tools.
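One way to answer the first question is to read the setting straight out of the config file. The helper below is a minimal sketch: the name `md_component_detection_setting` is hypothetical, it only looks at uncommented `md_component_detection = 0|1` lines, and it assumes the file lives at /etc/lvm/lvm.conf unless another path is given.

```shell
#!/bin/bash
# Hypothetical helper: report the md_component_detection value found in
# an lvm.conf-style file. Prints "unset" when there is no uncommented
# entry (which, per the comment above, is an acceptable state).
md_component_detection_setting() {
    local conf="${1:-/etc/lvm/lvm.conf}"
    local value
    # Commented-out lines start with '#' after optional whitespace and
    # therefore do not match; the last matching entry wins.
    value=$(sed -n 's/^[[:space:]]*md_component_detection[[:space:]]*=[[:space:]]*\([01]\).*/\1/p' "$conf" 2>/dev/null | tail -n 1)
    echo "${value:-unset}"
}

md_component_detection_setting /etc/lvm/lvm.conf
```

A result of 0 would be the misconfiguration warned about here; 1 or unset would point the investigation elsewhere, for example at the metadata version or at the 'vgscan -vvvv' output.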
(In reply to comment #5)
> Is there an lvm.conf - and if so does it have 'md_component_detection = 0' in it.
> (The entry should be missing, or set to 1 - never 0).

It's the default lvm.conf as shipped in the package. So md_component_detection is set to 1.

> What version of md metadata is being used? Only some versions are detected
> correctly - this only got fixed upstream this week.

Msquared -- can you take a look at your system to get this information? But it's probably worth getting the targeted fix for detecting the other versions into F8.

> Another cause might be if nash was handling this internally instead of using the
> lvm2 tools.

Nope, everything just execs lvm.
What's puzzling is why booting the same system with F8T3 behaves differently from F7 - I can't *think* of any changes to this part of the lvm2 code since F7. Please boot with F7 CD, attach 'vgscan -vvvv' output, plus 'lvs -v', 'lvm version', 'dmsetup info -c'; then repeat with F8T3 CD.
Jeremy - How do I obtain that information?

Alasdair - Can do, but how do I tell F8T3 CD not to use swap? There are swap partitions on the LVM, and I don't want to risk F8T3 causing any corruption.
Swap won't be enabled if you add 'noswap' to the kernel command line (hit tab in isolinux, add the argument).

And mdadm --detail /dev/mdX should give you the information on the version of the array.
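To pull just the metadata version out of that report, something like the following could be used. It is a sketch only: the function name `md_metadata_version` is hypothetical, and it assumes the report contains a "Version : <value>" line, reading the report from standard input so that it can be tried without an assembled array.

```shell
#!/bin/bash
# Hypothetical helper: extract the metadata version from
# `mdadm --detail` output supplied on standard input.
md_metadata_version() {
    # Print the value of the first "Version : <value>" line, then stop.
    awk -F' : ' '/^[[:space:]]*Version : / { print $2; exit }'
}

# Usage (needs root and an assembled array):
#   mdadm --detail /dev/md1 | md_metadata_version
```

Nothing is printed if the input has no such line, for example when the array was never assembled and mdadm only reports an error.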
mdadm creates v0.90 metadata by default, and this format is correctly detected by the lvm tools.

There is some problem with the lvm cache (/etc/lvm/cache/.cache). If this file is recreated, everything works. Restoring the old content breaks it again (not selinux related this time) (i.e. vgscan ; vgchange -a y -> will *not* activate volumes on md device).
But once again - nothing changed in this area between those versions, did it?
So is this pointing back to changes in the mkinitrd package - what does that look like on the two live CDs? Arguably the F-7 live CD behaviour described is "wrong" - it should have activated the raid and then the lvs, same as booting from the installation did?
And the F-8 live CD behaviour is partly fixed, as it now does the LVM2 actions, but is still missing the md ones?
(In reply to comment #12)
> So is this pointing back to changes in the mkinitrd package - what does that
> look like on the two live CDs?

There's absolutely nothing to do with the initrd here.

> Arguably the F-7 live CD behaviour described is "wrong" - it should have
> activated the raid and then the lvs, same as booting from the installation did?

Please actually read what I wrote. One potential approach would be to change rc.sysinit so that we always activate raidsets even if there's not an mdadm.conf, but that scares me. A lot. And even if it's done, we should *still* fix the lvm tools so that they don't activate volume groups that are on the individual, unassembled RAID components.

And note that this happened with the Fedora 7 live CDs, also, it just wasn't as apparent as nothing was done by default with anything from any of the LVs. Now that we enable swap, though, things are different.
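Created attachment 237491
Diagnostic information from Fedora 7 Live

The results of this script after booting the Fedora 7 Live CD:

#!/bin/bash
# Collect kernel, LVM, device-mapper and md diagnostics into a
# directory named after the running kernel version.
VER=$(uname -r)
mkdir -p "$VER"
cd "$VER" || exit 1
dmesg >dmesg.txt 2>&1
cp /var/log/messages messages.txt
# LVM state as seen by the lvm2 tools
vgscan -vvvv >vgscan.txt 2>&1
lvs -v >lvs.txt 2>&1
lvm version >lvmversion.txt 2>&1
# Device-mapper mappings currently set up
dmsetup info -c >dmsetup.txt 2>&1
# State of the two RAID1 arrays
mdadm --detail /dev/md0 >md0.txt 2>&1
mdadm --detail /dev/md1 >md1.txt 2>&1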
Created attachment 237501
Diagnostic information from Fedora 8 test 3 Live

The results of the same script (posted with attachment 237491) after booting the Fedora 8 test 3 Live CD.
Created attachment 237511
Diagnostic information from installed Fedora 8 test 3

The results of the same script (posted with attachment 237491) after booting my Fedora 8 test 3 installation.
(In reply to comment #14)
> And note that this happened with the Fedora 7 live CDs, also, it just wasn't as
> apparent as nothing was done by default with anything from any of the LVs. Now
> that we enable swap, though, things are different.

I'm not sure this did happen with the Fedora 7 live CD. Have a look at dmsetup.txt (dmsetup info -c) in each of my first two attachments. The Fedora 7 live CD did not map the LVs, but the Fedora 8 test 3 live CD did.

Curiously, lvs.txt (lvs -v) seems to indicate that neither live CD detected the VGs. However, while this is inconsistent with the results of the dmsetup, it is consistent with one of the problems I experienced that first alerted me to the problem in the first place. When I first booted Fedora 8 test 3 live, the first LVM command or two would work fine, but then after that any LVM-related commands either said there were no VGs, or that volume group vg7101 (my volume group) was not found (I can no longer recall which happened at what times during my first round of triage).
Thanks - but that output shows the lvm2 tools operating correctly in both cases, and correctly ignoring the md component devices. So we need to work out what lvm commands got run before that point and their context - and how the .cache file is being manipulated.
Please remove the /etc/lvm/cache/.cache file from the Live CD image. If you mount the ext2 image through loopback, you can see invalid devices there!
What?! LVM2 applies filters (like md_component_detection) *before* adding devices to the .cache file. You cannot take a .cache file from one system and use it on another. If you change underlying devices, you must run vgscan afterwards to refresh it.
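A minimal sketch of the refresh described here, for a system that has inherited a .cache file from somewhere else. The function name `refresh_lvm_cache` and the `DRY_RUN` switch are hypothetical; by default the function only prints the commands, and it assumes the cache path mentioned in this bug.

```shell
#!/bin/bash
# Hypothetical sketch: discard a stale LVM device cache and rescan.
# With DRY_RUN unset or set to 1 it only prints the commands.
refresh_lvm_cache() {
    local cache="${1:-/etc/lvm/cache/.cache}"
    local run="echo"
    [ "${DRY_RUN:-1}" = "0" ] && run=""
    # Remove the cache that was generated against other devices, then
    # rescan so the filters are applied to the devices actually present.
    $run rm -f "$cache"
    $run vgscan
}

refresh_lvm_cache /etc/lvm/cache/.cache
```

Running it as root with DRY_RUN=0 would execute the two commands instead of printing them. Whether that is enough on the live image depends on the md arrays being handled first, which is the open question in this bug.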
Also, there are files in /etc/lvm/backup with the default VolGroup00. This can be dangerous if someone runs vgcfgrestore ... Please remove these files from the live CD distribution.
Fixed in git.
Point me at instructions for downloading (or building) a new live cd image, and I'll test it, if you like.