Bug 346471 - F8T3 Live CD detects LVM groups inside RAID1, without using RAID1
Keywords:
Status: CLOSED RAWHIDE
Alias: None
Product: Fedora
Classification: Fedora
Component: LiveCD
Version: 8
Hardware: i386
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Jeremy Katz
QA Contact: Fedora Extras Quality Assurance
URL:
Whiteboard:
Depends On:
Blocks: F8Target
Reported: 2007-10-23 08:20 UTC by Msquared
Modified: 2007-11-30 22:12 UTC

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2007-10-25 17:27:06 UTC
Type: ---
Embargoed:


Attachments
Diagnostic information from Fedora 7 Live (22.25 KB, application/x-compressed-tar)
2007-10-25 15:27 UTC, Msquared
Diagnostic information from Fedora 8 test 3 Live (19.13 KB, application/x-compressed-tar)
2007-10-25 15:28 UTC, Msquared
Diagnostic information from installed Fedora 8 test 3 (21.98 KB, application/x-compressed-tar)
2007-10-25 15:29 UTC, Msquared

Description Msquared 2007-10-23 08:20:41 UTC
Description of problem:

If you boot F8T3 Live on a system that has an LVM PV within a RAID1 array, F8T3
Live detects the LVM volume group without detecting the RAID1 array that
contains it.


Version-Release number of selected component (if applicable):

Booting F8T3 Live CD on a system that has LVM inside RAID1.


How reproducible:

Always.


Steps to Reproduce:
1. Install Fedora 7 onto a system with two SATA drives:
1a. Partition each drive manually with two RAID autodetect partitions: one for
/boot and one for the LVM PV
1b. Create two RAID1 arrays (one for /boot and one for the LVM PV)
1c. Create a VG on the LVM RAID
1d. Add four LVs to the VG (one each for /, /var, /home, swap)
2. Install with default package set, plus Virtualisation
3. Complete first boot
4. Log in, start a Terminal, "su -" to root
5. Verify configuration with "fdisk -l /dev/sda", "fdisk -l /dev/sdb", "cat
/proc/mdstat", "pvs", and "lvs" (see Additional Info for expected output)
6. Reboot, and boot F8T3 Live CD
7. Log in, start a Terminal, "su -" to root
8. Check configuration with "cat /proc/mdstat", "pvs", and "vgs"


Actual results:

[root@localhost ~]# cat /proc/mdstat
Personalities :
unused devices: <none>
[root@localhost ~]# pvs
  PV         VG     Fmt  Attr PSize   PFree
  /dev/sda2  vg7101 lvm2 a-   186.22G 150.22G
[root@localhost ~]# vgs
  VG     #PV #LV #SN Attr   VSize   VFree
  vg7101   1   4   0 wz--n- 186.22G 150.22G



Expected results (generated from F7 Live CD):

[root@localhost ~]# cat /proc/mdstat
Personalities :
unused devices: <none>
[root@localhost ~]# pvs
[root@localhost ~]# vgs
  No volume groups found



Additional info (after booting into successfully-installed Fedora 7):

[root@neuromancer ~]# fdisk -l /dev/sda

Disk /dev/sda: 400.0 GB, 400087375360 bytes
255 heads, 63 sectors/track, 48641 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x000a855c

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1          13      104391   fd  Linux raid autodetect
/dev/sda2              14       24324   195278107+  fd  Linux raid autodetect
[root@neuromancer ~]# fdisk -l /dev/sdb

Disk /dev/sdb: 400.0 GB, 400088457216 bytes
255 heads, 63 sectors/track, 48641 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0x00027767

   Device Boot      Start         End      Blocks   Id  System
/dev/sdb1   *           1          13      104391   fd  Linux raid autodetect
/dev/sdb2              14       24324   195278107+  fd  Linux raid autodetect
[root@neuromancer ~]# cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md0 : active raid1 sda1[0] sdb1[1]
      104320 blocks [2/2] [UU]

md1 : active raid1 sda2[0] sdb2[1]
      195278016 blocks [2/2] [UU]

unused devices: <none>
[root@neuromancer ~]# pvs
  PV         VG     Fmt  Attr PSize   PFree
  /dev/md1   vg7101 lvm2 a-   186.22G 150.22G
[root@neuromancer ~]# lvs
  LV     VG     Attr   LSize  Origin Snap%  Move Log Copy%
  F7home vg7101 -wi-ao  8.00G
  F7root vg7101 -wi-ao  8.00G
  F7swap vg7101 -wi-ao  4.00G
  F7var  vg7101 -wi-ao 16.00G
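
The symptom reported above (a PV on /dev/sda2 instead of /dev/md1) can be
checked mechanically.  What follows is a minimal sketch, not part of any
Fedora tool: the function name pvs_on_raid_members and the approach of saving
command output to files first are assumptions made for illustration.  It
prints every PV that sits directly on a partition of type fd.

```shell
#!/bin/bash
# Hypothetical helper (not part of LVM or mdadm): print every PV from
# `pvs` output that sits directly on a partition which `fdisk -l`
# reports as "Linux raid autodetect".
# $1 = file holding `pvs` output, $2 = file holding `fdisk -l` output.
pvs_on_raid_members() {
    awk '
        # First file: remember each PV device path (header row is skipped
        # because its first field does not start with /dev/).
        NR == FNR { if ($1 ~ /^\/dev\//) pv[$1] = 1; next }
        # Second file: report raid-autodetect partitions that are also PVs.
        /Linux raid autodetect/ && ($1 in pv) { print $1 }
    ' "$1" "$2"
}
```

Possible use: "pvs >pvs.txt; fdisk -l /dev/sda /dev/sdb >fdisk.txt;
pvs_on_raid_members pvs.txt fdisk.txt".  Any device printed is a RAID member
that LVM is using directly; on the installed Fedora 7 system, where the PV is
/dev/md1, nothing should be printed.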

Comment 1 Chuck Ebbert 2007-10-23 18:03:12 UTC
Was there a volume group on the underlying partitions before the raid1 was created?

Comment 2 Msquared 2007-10-23 18:31:34 UTC
No.  The Fedora 7 installation that suffered at the hands of F8T3 was the first
thing installed on both drives, which were brand new.

Comment 3 Jeremy Katz 2007-10-23 18:44:15 UTC
This is a bug in the LVM tools.

What's happening is this: the normal boot sequence in rc.sysinit checks for an
/etc/mdadm.conf and, if one exists, starts the mdraid arrays.  LVM is started
afterwards.  On the live image we don't have an mdadm.conf, so we never start
the mdraid array.  But the lvm tools still activate the volume group, even
though the block devices also carry mdraid metadata.
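
The ordering described above can be sketched as follows.  This is not the real
rc.sysinit: the function name storage_startup_plan is hypothetical, and the
exact commands rc.sysinit runs are assumed here to be "mdadm --assemble --scan"
and "vgchange -a y".  The sketch only prints the commands, so it is safe to run
on any system.

```shell
#!/bin/bash
# Sketch only: NOT the real rc.sysinit. It prints the commands the
# described sequence would run instead of executing them.
# $1 = path to the mdadm configuration file (default assumed below).
storage_startup_plan() {
    local conf=${1:-/etc/mdadm.conf}
    # Step 1: md arrays are assembled only when a config file exists.
    if [ -f "$conf" ]; then
        echo "mdadm --assemble --scan"
    fi
    # Step 2: LVM activation happens regardless of step 1.
    echo "vgchange -a y"
}
```

With no mdadm.conf present, only the vgchange line is printed, which matches
the live image case: LVM scans the raw partitions while the array is still
unassembled.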

There are a few ways of fixing this:
1) Always start mdraid arrays in rc.sysinit, regardless of whether an
mdadm.conf exists.  This feels somewhat risky, as it's a substantial change
from what we've done in the past.
2) Create an empty mdadm.conf on the live image.  This makes the live image
essentially the first case, but (slightly) more constrained.  Given the
download numbers for the live images, I don't know that this is really any
better.
3) Fix the lvm tools to look for mdraid metadata and not activate a VG off of
the base block device that carries this metadata.
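
The idea behind option 3 can be approximated outside the lvm tools.  The
sketch below is an illustration under stated assumptions, not the fix that was
applied: the function names and the MDADM override variable are hypothetical,
and it relies on "mdadm --examine" exiting non-zero when it finds no md
superblock on the device.

```shell
#!/bin/bash
# Sketch: refuse to treat a block device as a PV candidate when it
# carries md metadata. MDADM may be set to an alternative command
# (hypothetical override, useful for testing without real devices).
is_md_component() {
    "${MDADM:-mdadm}" --examine "$1" >/dev/null 2>&1
}

# Print only those devices that do NOT look like md components.
# Caveat: if mdadm is missing, every device is printed, so callers
# should check that mdadm is available first.
safe_pv_candidates() {
    local dev
    for dev in "$@"; do
        is_md_component "$dev" || echo "$dev"
    done
}
```

Possible use, as root: "safe_pv_candidates /dev/sda2 /dev/sdb2".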

Comment 4 Jesse Keating 2007-10-25 13:00:19 UTC
While this is an annoying bug, I don't think I would consider it a blocker.
This isn't a typical partitioning setup.  Moving to Target, as we would take a
fix if it showed up, but I don't think we would hold up the release for this.

Comment 5 Alasdair Kergon 2007-10-25 13:21:16 UTC
The lvm tools are supposed to already check for md devices and ignore them, but
there are some caveats:

Is there an lvm.conf - and if so, does it have 'md_component_detection = 0' in
it?  (The entry should be missing, or set to 1 - never 0.)

What version of md metadata is being used?  Only some versions are detected
correctly - this only got fixed upstream this week.

To get lvm2 diagnostics, you need to run 'vgscan -vvvv' at the point where
things have gone wrong and attach it to this bugzilla.

Another cause might be if nash was handling this internally instead of using the
lvm2 tools.
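
The lvm.conf check described above could be scripted roughly like this.  It is
a minimal sketch: the function name is hypothetical, the default path
/etc/lvm/lvm.conf is assumed, and a missing entry is reported as 1, in line
with the statement that the entry should be missing or set to 1.

```shell
#!/bin/bash
# Hypothetical helper: print the md_component_detection value from an
# lvm.conf-style file. Commented-out entries are ignored; the last
# active entry wins; a missing entry (or missing file) is reported as 1.
md_component_detection_value() {
    local conf=${1:-/etc/lvm/lvm.conf} value
    value=$(sed -n 's/^[[:space:]]*md_component_detection[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' \
        "$conf" 2>/dev/null | tail -n 1)
    echo "${value:-1}"
}
```

A result of 0 would mean md component detection has been switched off, which
is the one setting this comment warns against.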

Comment 6 Jeremy Katz 2007-10-25 13:30:02 UTC
(In reply to comment #5)
> Is there an lvm.conf - and if so does it have 'md_component_detection = 0' in it.
> (The entry should be missing, or set to 1 - never 0).

It's the default lvm.conf as shipped in the package.  So md_component_detection
is set to 1.
 
> What version of md metadata is being used?  Only some versions are detected
> correctly - this only got fixed upstream this week.

Msquared -- can you take a look at your system to get this information?  But
it's probably worth getting the targeted fix for detecting the other versions
into F8.

> Another cause might be if nash was handling this internally instead of using the
> lvm2 tools.

Nope, everything just execs lvm. 

Comment 7 Alasdair Kergon 2007-10-25 13:43:47 UTC
What's puzzling is why booting the same system with F8T3 behaves differently
from F7 - I can't *think* of any changes to this part of the lvm2 code since F7.

Please boot with F7 CD, attach 'vgscan -vvvv' output, plus 'lvs -v', 'lvm
version', 'dmsetup info -c'; then repeat with F8T3 CD.


Comment 8 Msquared 2007-10-25 13:53:36 UTC
Jeremy - How do I obtain that information?

Alasdair - Can do, but how do I tell F8T3 CD not to use swap?  There are swap
partitions on the LVM, and I don't want to risk F8T3 causing any corruption.

Comment 9 Jeremy Katz 2007-10-25 14:28:55 UTC
Swap won't be enabled if you add 'noswap' to the kernel command line (hit Tab
in isolinux and add the argument).

And mdadm --detail /dev/mdX should give you the metadata version of the array.
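
To pull just the metadata version out of that output, something like the
following could be used.  This is a sketch: the function name is hypothetical,
and it assumes the "mdadm --detail" output contains a line of the form
"Version : <value>".

```shell
#!/bin/bash
# Hypothetical helper: read `mdadm --detail` output on stdin and print
# the value of its "Version" field (the md metadata version).
md_metadata_version() {
    awk -F' : ' '$1 ~ /^[ \t]*Version$/ { print $2; exit }'
}
```

Possible use, as root: "mdadm --detail /dev/md1 | md_metadata_version".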

Comment 10 Milan Broz 2007-10-25 14:41:08 UTC
mdadm creates v0.90 metadata by default, and this format is correctly detected
by the lvm tools.

There is a problem with the lvm cache (/etc/lvm/cache/.cache).

If this file is recreated, everything works.  Restoring the old content breaks
it again (not SELinux-related this time).

(i.e. vgscan ; vgchange -a y  -> will *not* activate volumes on the md device)

Comment 11 Alasdair Kergon 2007-10-25 14:55:19 UTC
But once again - nothing changed in this area between those versions, did it?

Comment 12 Alasdair Kergon 2007-10-25 15:10:05 UTC
So is this pointing back to changes in the mkinitrd package - what does that
look like on the  two live CDs?

Arguably the F-7 live CD behaviour described is "wrong" - it should have
activated the raid and then the lvs, same as booting from the installation did?

Comment 13 Alasdair Kergon 2007-10-25 15:12:30 UTC
and the F-8 live CD behaviour is partly fixed, as it now does the LVM2 actions,
but is still missing the md ones?

Comment 14 Jeremy Katz 2007-10-25 15:22:40 UTC
(In reply to comment #12)
> So is this pointing back to changes in the mkinitrd package - what does that
> look like on the  two live CDs?

There's absolutely nothing to do with the initrd here.  
 
> Arguably the F-7 live CD behaviour described is "wrong" - it should have
> activated the raid and then the lvs, same as booting from the installation did?

Please actually read what I wrote.  One potential approach would be to change
rc.sysinit so that we always activate raidsets even if there's not an
mdadm.conf, but that scares me.  A lot.  And even if it's done, we should
*still* fix the lvm tools so that they don't activate volume groups that are on
the individual, unassembled RAID components.

And note that this also happened with the Fedora 7 live CDs; it just wasn't as
apparent, as nothing was done by default with any of the LVs.  Now that we
enable swap, though, things are different.

Comment 15 Msquared 2007-10-25 15:27:03 UTC
Created attachment 237491 [details]
Diagnostic information from Fedora 7 Live

The results of this script after booting the Fedora 7 Live CD:

#!/bin/bash
# Collect kernel, LVM and md diagnostics into a directory named after
# the running kernel version.
VER=$(uname -r)
mkdir -p "$VER"
cd "$VER" || exit 1
# Kernel and system logs
dmesg >dmesg.txt 2>&1
cp /var/log/messages messages.txt
# LVM and device-mapper state
vgscan -vvvv >vgscan.txt 2>&1
lvs -v >lvs.txt 2>&1
lvm version >lvmversion.txt 2>&1
dmsetup info -c >dmsetup.txt 2>&1
# md array details
mdadm --detail /dev/md0 >md0.txt 2>&1
mdadm --detail /dev/md1 >md1.txt 2>&1

Comment 16 Msquared 2007-10-25 15:28:20 UTC
Created attachment 237501 [details]
Diagnostic information from Fedora 8 test 3 Live

The results of this script after booting the Fedora 8 test 3 Live CD:

#!/bin/bash
# Collect kernel, LVM and md diagnostics into a directory named after
# the running kernel version.
VER=$(uname -r)
mkdir -p "$VER"
cd "$VER" || exit 1
# Kernel and system logs
dmesg >dmesg.txt 2>&1
cp /var/log/messages messages.txt
# LVM and device-mapper state
vgscan -vvvv >vgscan.txt 2>&1
lvs -v >lvs.txt 2>&1
lvm version >lvmversion.txt 2>&1
dmsetup info -c >dmsetup.txt 2>&1
# md array details
mdadm --detail /dev/md0 >md0.txt 2>&1
mdadm --detail /dev/md1 >md1.txt 2>&1

Comment 17 Msquared 2007-10-25 15:29:44 UTC
Created attachment 237511 [details]
Diagnostic information from installed Fedora 8 test 3

The results of this script after booting my Fedora 8 test 3 installation:

#!/bin/bash
# Collect kernel, LVM and md diagnostics into a directory named after
# the running kernel version.
VER=$(uname -r)
mkdir -p "$VER"
cd "$VER" || exit 1
# Kernel and system logs
dmesg >dmesg.txt 2>&1
cp /var/log/messages messages.txt
# LVM and device-mapper state
vgscan -vvvv >vgscan.txt 2>&1
lvs -v >lvs.txt 2>&1
lvm version >lvmversion.txt 2>&1
dmsetup info -c >dmsetup.txt 2>&1
# md array details
mdadm --detail /dev/md0 >md0.txt 2>&1
mdadm --detail /dev/md1 >md1.txt 2>&1

Comment 18 Msquared 2007-10-25 15:50:11 UTC
(In reply to comment #14)
> And note that this happened with the Fedora 7 live CDs, also, it just wasn't as
> apparent as nothing was done by default with anything from any of the LVs.  Now
> that we enable swap, though, things are different.

I'm not sure this did happen with the Fedora 7 live CD.  Have a look at
dmsetup.txt (dmsetup info -c) in each of my first two attachments.  The Fedora 7
live CD did not map the LVs, but the Fedora 8 test 3 live CD did.

Curiously, lvs.txt (lvs -v) seems to indicate that neither live CD detected the
VGs.  While this is inconsistent with the dmsetup results, it is consistent
with one of the symptoms that first alerted me to the problem.

When I first booted Fedora 8 test 3 live, the first LVM command or two would
work fine, but then after that any LVM-related commands either said there were
no VGs, or that volume group vg7101 (my volume group) was not found (I can no
longer recall which happened at what times during my first round of triage).

Comment 19 Alasdair Kergon 2007-10-25 15:51:20 UTC
Thanks - but that output shows the lvm2 tools operating correctly in both cases,
and correctly ignoring the md component devices.

So we need to work out what lvm commands got run before that point and their
context - and how the .cache file is being manipulated.

Comment 20 Milan Broz 2007-10-25 15:56:15 UTC
Please remove the /etc/lvm/cache/.cache file from the Live CD image.
If you mount the ext2 image through loopback, you can see invalid devices there!
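
One way to inspect such a cache file is sketched below.  The function names are
hypothetical, and the sketch assumes the cache lists each device as a
double-quoted /dev/... path; it does not parse the file format any further.

```shell
#!/bin/bash
# Hypothetical helper: list the device paths recorded in an LVM
# persistent filter cache file (assumes double-quoted /dev/... paths).
cached_devices() {
    grep -o '"/dev/[^"]*"' "${1:-/etc/lvm/cache/.cache}" | tr -d '"'
}

# Print cached entries that are not block devices on the running
# system, i.e. entries left over from the machine that built the image.
stale_cached_devices() {
    local dev
    cached_devices "$1" | while read -r dev; do
        [ -b "$dev" ] || echo "$dev"
    done
}
```

Possible use against a loopback-mounted image, with the mount point as an
assumed example path: "stale_cached_devices /mnt/image/etc/lvm/cache/.cache".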


Comment 21 Alasdair Kergon 2007-10-25 16:11:42 UTC
What?!  LVM2 applies filters (like md_component_detection) *before* adding
devices to the .cache file.  You cannot take a .cache file from one system and
use it on another.  If you change underlying devices, you must run vgscan
afterwards to refresh it.

Comment 22 Milan Broz 2007-10-25 16:21:48 UTC
Also, there are files in /etc/lvm/backup for the default VolGroup00; this can
be dangerous if someone runs vgcfgrestore ...

Please remove these files from the live CD distribution.


Comment 23 Jeremy Katz 2007-10-25 17:27:06 UTC
Fixed in git.

Comment 24 Msquared 2007-10-26 00:41:45 UTC
Point me at instructions for downloading (or building) a new live cd image, and
I'll test it, if you like.

