Red Hat Bugzilla – Bug 222787
lvm picks up md component devices automatically
Last modified: 2007-11-30 17:11:53 EST
Description of problem:
I added some new drives in raid 5 with LVM on top and then got bit by this bug:
In order to get some semblance of performance (3X extremely slow is still
extremely slow) I disabled the drive on the slow port. I then had a system hang
while copying stuff from one drive to another (a different partition, but the
drives are the same ones used in the soft raid array). Upon reboot (FC5
system) I received several errors and a kernel panic (I can't scroll back to see
what really happened). The only meaningful thing I can gather is that it can't
access root, but I don't know why.
I reboot with the FC6 live cd and see the problem:
raid5: cannot start dirty degraded array for md1
However, the device mapper still seems to find my lvs.
I know that I need to force the assemble once the system is up to get it working
again. However, I can't, because the device mapper has taken sda2 (the first of
the 4 partitions in the raid set) as the pv for my vg in lieu of the
non-existent md1. This partition is of type fd (definitely not the 8e it would
be if it were LVM). I can't free up the partition, because doing a pvremove
would probably really mess up the data on that partition, and it's already dirty
and degraded. I surmise that what's going on is that the lvm metadata happens to
fit in one block (chunk size=64k), so the partition looks like an lvm pv if you
ignore the partition type.
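That surmise lines up with the on-disk layouts: an md 0.90 superblock lives near
the end of each component device, while the LVM2 label ("LABELONE") sits in one
of the first four 512-byte sectors, so a scan of the start of the first
component can see what looks like a valid pv. A sketch with a scratch file
standing in for the partition (paths and sizes are illustrative, not from the
report):

```shell
# Why the start of an md component can look like a pv: the LVM2 label
# ("LABELONE") lives near sector 1, far from the md superblock at the end.
img=$(mktemp)                                   # stand-in for /dev/sda2
dd if=/dev/zero of="$img" bs=512 count=16 2>/dev/null
printf 'LABELONE' | dd of="$img" bs=512 seek=1 conv=notrunc 2>/dev/null
# A naive label scan of the first sectors "finds" a pv label (prints 1):
dd if="$img" bs=512 skip=1 count=1 2>/dev/null | tr -d '\0' | grep -c LABELONE
rm -f "$img"
```

A scan that only reads the first sectors, and ignores the partition type, has no
way to tell this component apart from a real pv.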
Internet searching does not yield a good way to disable this functionality.
Interactive startup is not interactive until after the LVM autodetect. I later
worked around the problem by starting up without the first drive present,
plugging it in after the system was up, and then getting it to register with
echo "scsi add-single-device 3 0 0 0" >/proc/scsi/scsi
(my primary controller doesn't support hot-plug, so I had to swap drives around).
Later research finds I might've avoided some of the mess with:
but I don't think this mode of operation is desirable. If I had hit the same
problem on the FC5 install on that machine, I could have really messed things up
before noticing.
Version-Release number of selected component (if applicable):
All versions are what comes on the live cd:
Haven't attempted reproduction yet (still trying to rebuild my array at 2MB/sec)
Steps to Reproduce:
1. Create a raid 5 (maybe also 0 or 4) array with a fairly large chunk size (I
used 64k) and put lvm on top of it. Partitions should be of raid type.
2. Find some way to disable md auto assemble (degrade it and then try mounting
it while dirty is what I did) as long as the first disk is intact.
The
/sbin/lvm.static vgchange -a y --ignorelockingfailure
that rc.sysinit runs at boot pretends nothing is wrong with using the first
partition of your array as the pv instead of the raid device. The closest thing
to an error is a message about the sizes not seeming to match up.
It should notice that the partition is not of type lvm and skip it, or at least
release the resources when it discovers the size doesn't match. I understand
that with a raw drive (a bad idea, but whatever) or a loopback/md/etc. device
there is no partition table, so some deeper checking might be required, but if
the partition type is incorrect or the size is wrong, it should require an
override (perhaps on the kernel command line).
Calling it high severity because it *could* lead to loss of data.
As far as I recall, running lvm on top of md is not recommended (although I
have no idea whether this is mentioned anywhere in the documentation). You
should always blacklist md-participating devices in your lvm.conf to avoid lvm
picking them up. I don't think a partition type check is sufficient or even a
good idea, since someone issuing pvcreate /dev/foo3, where foo3 previously
held, say, ext3 and was of type Linux, usually does not bother to fdisk the
drive and flip partition types, yet he probably does not want the next boot to
fail due to an incomplete volume group (since foo3 would be excluded, being of
a non-lvm partition type).
What -could- work is adding md metadata detection and refusing to use devices
that have an md signature on them (maybe unless forced). Thoughts, comments?
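A minimal lvm.conf sketch of the blacklist approach described above; the device
patterns are illustrative and must be adapted to the actual component
partitions (e.g. the sda2 from this report):

```
devices {
    # Accept the assembled arrays, reject the raw md component partitions,
    # accept everything else.  First matching pattern wins.
    filter = [ "a|^/dev/md|", "r|^/dev/sda2$|", "a|.*|" ]
}
```

Remember that an initrd carries its own copy of lvm.conf, so the filter has to
be present there too to take effect at boot.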
There is already md component detection, see lvm.conf
# By default, LVM2 will ignore devices used as components of
# software RAID (md) devices by looking for md superblocks.
# 1 enables; 0 disables.
md_component_detection = 1
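Roughly what that detection looks for (a sketch, not the actual lvm2 code): an
md 0.90 superblock sits at a 64KiB-aligned offset within the last 128KiB of the
device and begins with the magic 0xa92b4efc (bytes fc 4e 2b a9 on a
little-endian machine). With a scratch file standing in for a component device:

```shell
# Sketch of an md 0.90 superblock check (illustrative only).
img=$(mktemp)
truncate -s 1M "$img"                       # stand-in for a component device
size=$(stat -c %s "$img")
sb_off=$(( (size & ~65535) - 65536 ))       # 0.90 superblock offset
# Plant the magic 0xa92b4efc (octal escapes: fc 4e 2b a9):
printf '\374\116\053\251' | dd of="$img" bs=1 seek="$sb_off" conv=notrunc 2>/dev/null
magic=$(dd if="$img" bs=1 skip="$sb_off" count=4 2>/dev/null | od -An -tx1 | tr -d ' ')
[ "$magic" = "fc4e2ba9" ] && echo "md component detected"
rm -f "$img"
```

A device that passes this check is skipped by the pv scan even if its first
sectors happen to hold a valid-looking LVM label.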
Confirming that the md component detection works as it should.
[root@dhcp-lab-168 ~]# mdadm --assemble /dev/md0
mdadm: /dev/md0 has been started with 1 drive (out of 2).
[root@dhcp-lab-168 ~]# pvscan
PV /dev/md0 VG vg0 lvm2 [5.99 GB / 5.95 GB free]
Total: 1 [5.99 GB] / in use: 1 [5.99 GB] / in no VG: 0 [0 ]
[root@dhcp-lab-168 ~]# mdadm --stop /dev/md0
mdadm: stopped /dev/md0
[root@dhcp-lab-168 ~]# pvscan
No matching physical volumes found
[root@dhcp-lab-168 ~]# vgchange -a y
No volume groups found
Let me try to put root partition on an lv atop md to see if initrd adds some
confusion to the equation.
I do not recall any specific warnings against this, although that doesn't mean
there weren't any; I might not have read them, or might not remember, since I
originally set up the LVM years ago (though with the plan of an easy move to
raid later) and only set up the raid drives in January. I do crazy things
anyway, like run raid on top of raid. That made sense at the time too: two
8.4GB 5400rpm drives striped together and then mirrored against a 7200rpm
drive.
My lvm.conf contains md_component_detection = 1, as does the one in my initrd
(which brings me to another issue: the mkinitrd in fc5 doesn't seem to figure
out that it should use raid456 for raid5, but I just pulled in the fc6 version).
I haven't had the chance to check what the fc6 live cd has; perhaps that's
where the fault lies.
That sounds possible. I have checked the codepaths in lvm2 and they look all
right, and the test above worked as expected, so I am closing this as
WORKSFORME. If you run into the problem again, please check whether
md_component_detection in the relevant configuration file is turned on, and if
so, reopen this report.