Red Hat Bugzilla – Bug 222787
lvm picks up md component devices automatically
Last modified: 2007-11-30 17:11:53 EST
Description of problem:
I added some new drives in raid 5 with LVM on top and then got bit by this bug:
In order to get some semblance of performance (3X extremely slow is still
extremely slow) I disabled the drive on the slow port. I then had a system hang
while copying stuff from one drive to another (a different partition, but the
drives are the same ones used in the soft raid array). Upon reboot (FC5
system) I received several errors and a kernel panic (I can't scroll back to see
what really happened). The only meaningful thing I can gather is that it can't
access root, but I don't know why.
I reboot with the FC6 live cd and see the problem:
raid5: cannot start dirty degraded array for md1
However, the device mapper still seems to find my lvs.
I know that I need to force the assemble once the system is up to get it working
again. However, I can't, because the device mapper has taken sda2 (the first of
the 4 partitions in the raid set) as the pv for my vg in lieu of the
non-existent md1. This partition is of type fd (definitely not the 8e it would
be if it were LVM). I can't free up the partition, because doing a pvremove
would probably really mess up the data on that partition, and it's already dirty
and degraded. I surmise that what's going on is that the lvm metadata happens to
fit in one block (chunk size=64k), so the partition looks like an lvm pv if you
ignore the partition type.
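That surmise lines up with the on-disk layouts: an md 0.90 superblock lives near
the end of each component device, while the LVM2 label ("LABELONE") sits in one
of the first four 512-byte sectors, so a scan of the start of the first
component can see what looks like a valid pv. A sketch with a scratch file
standing in for the partition (paths and sizes are illustrative, not from the
report):

```shell
# Why the start of an md component can look like a pv: the LVM2 label
# ("LABELONE") lives near sector 1, far from the md superblock at the end.
img=$(mktemp)                                   # stand-in for /dev/sda2
dd if=/dev/zero of="$img" bs=512 count=16 2>/dev/null
printf 'LABELONE' | dd of="$img" bs=512 seek=1 conv=notrunc 2>/dev/null
# A naive label scan of the first sectors "finds" a pv label (prints 1):
dd if="$img" bs=512 skip=1 count=1 2>/dev/null | tr -d '\0' | grep -c LABELONE
rm -f "$img"
```

A scan that only reads the first sectors, and ignores the partition type, has no
way to tell this component apart from a real pv.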
Internet searching does not yield a good way to disable this functionality.
Interactive startup is not interactive until after the LVM autodetect. I later
worked around the problem by starting up without the first drive present,
plugging it in after the system was up, and then getting it to register with
echo "scsi add-single-device 3 0 0 0" >/proc/scsi/scsi
(my primary controller doesn't support hot-plug, so I had to swap drives around).
Later research finds I might've avoided some of the mess with:
but I don't think this mode of operation is desirable. If I had hit the same
problem on the FC5 install on that machine, I could have really messed things up
before noticing.
Version-Release number of selected component (if applicable):
All versions are what comes on the live cd:
Haven't attempted reproduction yet (still trying to rebuild my array at 2MB/sec)
Steps to Reproduce:
1. Create a raid 5 (maybe also 0 or 4) array with a fairly large chunk size (I
used 64k) and put lvm on top of it. Partitions should be of raid type.
2. Find some way to disable md auto assemble (degrade it and then try mounting
it while dirty is what I did) as long as the first disk is intact.
The
/sbin/lvm.static vgchange -a y --ignorelockingfailure
that rc.sysinit runs at boot pretends nothing is wrong with using the first
partition of your array as the pv instead of the raid device. The closest thing
to an error is a message about the sizes not seeming to match up.
It should notice that the partition is not of type lvm and skip it, or at least
release the resources when it discovers the size doesn't match. I understand
that with a raw drive (a bad idea, but whatever) or a loopback/md/etc. device
there is no partition table, so some deeper checking might be required, but if
the partition type is incorrect or the size is wrong, it should require an
override (perhaps on the kernel command line).
Calling it high severity because it *could* lead to loss of data.
As far as I recall, running lvm on top of md is not recommended (although I
have no idea whether this is mentioned anywhere in the documentation). You
should always blacklist md-participating devices in your lvm.conf to avoid lvm
picking them up. I don't think a partition type check is sufficient or even a
good idea, since someone issuing pvcreate /dev/foo3, where foo3 previously
held, say, ext3 and was of type Linux, usually does not bother to fdisk the
drive and flip partition types, yet he probably does not want the next boot to
fail due to an incomplete volume group (since foo3 would be excluded, being of
a non-lvm partition type).
What -could- work is adding md metadata detection and refusing to use devices
that have an md signature on them (maybe unless forced). Thoughts, comments?
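A minimal lvm.conf sketch of the blacklist approach described above; the device
patterns are illustrative and must be adapted to the actual component
partitions (e.g. the sda2 from this report):

```
devices {
    # Accept the assembled arrays, reject the raw md component partitions,
    # accept everything else.  First matching pattern wins.
    filter = [ "a|^/dev/md|", "r|^/dev/sda2$|", "a|.*|" ]
}
```

Remember that an initrd carries its own copy of lvm.conf, so the filter has to
be present there too to take effect at boot.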
There is already md component detection, see lvm.conf
# By default, LVM2 will ignore devices used as components of
# software RAID (md) devices by looking for md superblocks.
# 1 enables; 0 disables.
md_component_detection = 1
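Roughly what that detection looks for (a sketch, not the actual lvm2 code): an
md 0.90 superblock sits at a 64KiB-aligned offset within the last 128KiB of the
device and begins with the magic 0xa92b4efc (bytes fc 4e 2b a9 on a
little-endian machine). With a scratch file standing in for a component device:

```shell
# Sketch of an md 0.90 superblock check (illustrative only).
img=$(mktemp)
truncate -s 1M "$img"                       # stand-in for a component device
size=$(stat -c %s "$img")
sb_off=$(( (size & ~65535) - 65536 ))       # 0.90 superblock offset
# Plant the magic 0xa92b4efc (octal escapes: fc 4e 2b a9):
printf '\374\116\053\251' | dd of="$img" bs=1 seek="$sb_off" conv=notrunc 2>/dev/null
magic=$(dd if="$img" bs=1 skip="$sb_off" count=4 2>/dev/null | od -An -tx1 | tr -d ' ')
[ "$magic" = "fc4e2ba9" ] && echo "md component detected"
rm -f "$img"
```

A device that passes this check is skipped by the pv scan even if its first
sectors happen to hold a valid-looking LVM label.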
Confirming that the md component detection works as it should.
[root@dhcp-lab-168 ~]# mdadm --assemble /dev/md0
mdadm: /dev/md0 has been started with 1 drive (out of 2).
[root@dhcp-lab-168 ~]# pvscan
PV /dev/md0 VG vg0 lvm2 [5.99 GB / 5.95 GB free]
Total: 1 [5.99 GB] / in use: 1 [5.99 GB] / in no VG: 0 [0 ]
[root@dhcp-lab-168 ~]# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
[root@dhcp-lab-168 ~]# pvscan
No matching physical volumes found
[root@dhcp-lab-168 ~]# vgchange -a y
No volume groups found
Let me try to put root partition on an lv atop md to see if initrd adds some
confusion to the equation.
I do not recall any specific warnings against this, although that doesn't mean
there weren't any; I might not have read them, or might not remember, since I
originally set up the LVM years ago (though with the plan of an easy move to
raid later) and only set up the raid drives in January. I do crazy things
anyway, like run raid on top of raid. That made sense at the time too: two
8.4GB 5400rpm drives striped together and then mirrored against a 7200rpm
drive.
My lvm.conf contains md_component_detection = 1, as does the one in my initrd
(which brings me to another issue: the mkinitrd in fc5 doesn't seem to figure
out that it should use raid456 for raid5, but I just pulled in the fc6 version).
I haven't had the chance to check what the fc6 live cd has; perhaps that's
where the fault lies.
That sounds possible. I have checked the codepaths in lvm2 and they look all
right, and the test above worked as expected, so I am closing this as
WORKSFORME. If you run into the problem again, please check whether
md_component_detection in the relevant configuration file is turned on, and if
so, reopen this report.