Red Hat Bugzilla – Bug 598135
LVM detection of MD volumes is broken
Last modified: 2010-07-01 15:56:35 EDT
LVM should detect MD volumes and not treat them as LVM physical volumes if "md_component_detection = 1" is turned on (which is the default). However, this detection is bypassed when a given volume is stored in the cache in /etc/lvm/cache/.cache. Thus, LVM can access underlying MD devices as its physical volumes.
This can even lead to data corruption. For example, suppose the admin creates a PV and VG on MD-RAID0. If MD-RAID0 is then deactivated for some reason (for example, an unplugged cable on one of the disks), LVM treats the first RAID0 leg as its physical volume and allows access to it. But the sector numbers on the RAID0 leg don't match the sector numbers on the combined RAID0 device (where the VG was originally created), so the situation can result in data corruption.
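To illustrate the sector mismatch, here is a rough sketch of how a sector on a two-disk RAID0 array maps onto its members. The helper name raid0_map is made up for this example. It assumes equal-sized members and ignores the MD superblock data offset, so it shows only the striping arithmetic, not the exact on-disk layout.

```shell
#!/bin/sh
# Hypothetical helper (not part of mdadm or LVM): map a sector on a RAID0
# array to "<member index> <sector on that member>".
# Assumptions: equal-sized members, no superblock data offset.
raid0_map() {
    sector=$1          # sector number on the combined array
    chunk_sectors=$2   # chunk size in 512-byte sectors (256 KiB -> 512)
    ndisks=$3          # number of array members
    chunk=$((sector / chunk_sectors))
    disk=$((chunk % ndisks))
    offset=$(( (chunk / ndisks) * chunk_sectors + sector % chunk_sectors ))
    echo "$disk $offset"
}

# With "-c 256 -n 2" as in the script below:
raid0_map 0 512 2      # prints "0 0"
raid0_map 512 512 2    # prints "1 0"
raid0_map 1024 512 2   # prints "0 512"
```

Under these assumptions, array sector 1024 lives at sector 512 of the first member, so anything addressing sector 1024 directly on that member touches data that belongs to a different part of the array.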
A script that triggers this behavior:
# Cleanup from previous script runs
mdadm -S /dev/md1
# The next line adds /dev/sdc1 to lvm cache. This PV is ignored and overwritten
# later.
# Alternatively, you can just add /dev/sdc1 to the cache manually.
pvcreate -ff /dev/sdc1
# Create the striped volume.
mdadm -C /dev/md1 -l 0 -c 256 -n 2 /dev/sdc1 /dev/sdd1
# Make a PV and VG on this volume
pvcreate -ff /dev/md1
vgcreate vg /dev/md1
# Make a LV and deactivate it
lvcreate -L 4M -n lv vg
lvchange -an vg/lv
# Stop the MD volume
mdadm -S /dev/md1
# Now, these three commands show that LVM treats /dev/sdc1 as a physical volume
# and reads VG metadata from it (it shouldn't because it has MD superblock).
# The sector offsets in /dev/sdc1 don't match the offsets in /dev/md1, thus
# if someone tried to manipulate VG in this state, it could cause data
# corruption.
# Here, LVM thinks that /dev/sdc1 is the underlying physical volume, not
# /dev/md1, thus this command overwrites random sectors of the original volume
lvcreate -L 4M -n lv2 vg
1) mdadm should detect the PV and either refuse to create the array or wipe the PV signature;
2) lvm should detect the md signature more frequently - it can't rely on the cached result of an earlier check. Would it eliminate races if we found a way to remove /dev/sdc1 from the cache when adding /dev/md1 to it?
(If all else fails we'd need to add an 'is_md_device' flag to the PV header.)
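As a rough way to spot a stale entry by hand, something like the following could check whether a device is still listed in the cache. The function name cache_lists_device is made up, and the cache layout (quoted device paths in a valid_devices list) is an assumption about the file format, not something taken from this report.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the device path appears as a quoted entry
# in the given cache file. Assumes entries look like "/dev/sdc1".
cache_lists_device() {
    cache_file=$1
    dev=$2
    grep -qF "\"$dev\"" "$cache_file"
}

# Usage on a live system (needs root; paths taken from the report above).
# mdadm --examine is expected to fail when no MD superblock is present.
# if cache_lists_device /etc/lvm/cache/.cache /dev/sdc1 &&
#    mdadm --examine /dev/sdc1 >/dev/null 2>&1; then
#     echo "/dev/sdc1 is cached but carries an MD superblock"
# fi
```

If the check reports a stale entry, deleting the cache file and rerunning vgscan should force LVM to rebuild it, though that is only a manual workaround and does not close the races mentioned above.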
Doesn't address all the races, but try http://sourceware.org/cgi-bin/cvsweb.cgi/LVM2/lib/filters/filter-persistent.c.diff?cvsroot=lvm2&r1=1.41&r2=1.42
The patch works ... but I don't know what it does :)