From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4) Gecko/20030922

Description of problem:
When deploying LVM, a mistyped pvcreate command will silently write the LVM block data over the superblock of an ext2/3 filesystem. This destroys the primary superblock and makes the filesystem label invalid. It happens silently even if the ext2/3 filesystem is mounted.

In this example, a disk is partitioned as:

Disk /dev/hda: 20.0 GB, 20020396032 bytes
255 heads, 63 sectors/track, 2434 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

   Device Boot    Start       End    Blocks   Id  System
/dev/hda1   *         1        13    104391   83  Linux
/dev/hda2            14       370   2867602+  83  Linux
/dev/hda3           371       421    409657+  83  Linux
/dev/hda4           422      2434  16169422+   5  Extended
/dev/hda5           422       454    265041   82  Linux swap
/dev/hda6           455       487    265041   83  Linux
/dev/hda7           488       503    128488+  83  Linux
/dev/hda8           504       519    128488+  83  Linux
/dev/hda9           520       536    136521   8e  Linux LVM
/dev/hda10          537       553    136521   8e  Linux LVM
/dev/hda11          554       570    136521   8e  Linux LVM
/dev/hda12          571       587    136521   8e  Linux LVM

/etc/fstab contains:

LABEL=/        /            ext3         defaults                  1 1
LABEL=/boot    /boot        ext3         defaults                  1 2
none           /dev/pts     devpts       gid=5,mode=620            0 0
LABEL=/home    /home        ext3         defaults                  1 2
none           /proc        proc         defaults                  0 0
none           /dev/shm     tmpfs        defaults                  0 0
LABEL=/tmp     /tmp         ext3         defaults                  1 2
LABEL=/usr     /usr         ext3         defaults                  1 2
LABEL=/var     /var         ext3         defaults                  1 2
/dev/hda5      swap         swap         defaults                  0 0
/dev/cdrom     /mnt/cdrom   udf,iso9660  noauto,owner,kudzu,ro     0 0
/dev/fd0       /mnt/floppy  auto         noauto,owner,kudzu        0 0

If the command 'pvcreate /dev/hda9 /dev/hda1 /dev/hda11 /dev/hda12' is run (with a typo of hda1 instead of hda10), pvcreate will properly flag the hda9, hda11, and hda12 devices as LVM PV disks. However, hda1 (which was labelled /boot and was mounted when pvcreate was run) now has a corrupted superblock, and 'pvdisplay /dev/hda1' shows it as a valid PV device.

An attempt to 'mount -a' will now show that the label for /boot (i.e., /dev/hda1) is not found. Trying 'pvmove /dev/hda1' reports that the device is not a member of any VG, but does not remove the data from the first block of the partition. Running 'pvdata -PP /dev/hda1' shows that the UUID and system_id are defined.

Manually unmounting the partition, forcing an e2fsck, and letting e2fsck copy the superblock from a backup superblock corrects the label issue (alternatively, running e2fslabel and then an e2fsck to fix the inode tree will work), but the drive still contains the UUID and system_id, and they are still recognized by pvdata and pvscan.

Reading data directly from the device with 'dd if=/dev/hda1 of=/tmp/firstbootkilos bs=1K count=10' (so I could see the first block of the device as well as the first and second superblocks) showed that the PV data was stored at the very beginning of the partition and that the /boot label was missing. After running e2fslabel and/or fsck to correct the issue, the first block still contained the PV data, but at least the e2fs label was present in the second block. The only way I found to fix this quickly was to perform a 'dd if=/dev/zero of=/dev/hda1 bs=1K count=1' and then fsck the device (either e2fslabelling it myself or allowing fsck to copy the data over from 8193).

pvcreate should check the partition type (if the device is a partition and not a whole disk) to see if it is 0x8e, and inform the user when the target device is not flagged correctly. Also (alternatively?), pvcreate at the least needs to check whether the device/partition is currently mounted.
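
The dd inspection described above can be repeated programmatically. The following is only a sketch: it assumes the standard ext2/ext3 on-disk layout (primary superblock at byte offset 1024, magic 0xEF53 at superblock offset 56, 16-byte volume label at superblock offset 120), and the function names are hypothetical helpers, not part of any LVM or e2fsprogs tool.

```python
import struct

# Standard ext2/ext3 layout constants (assumed, not taken from the report).
EXT2_SUPERBLOCK_OFFSET = 1024  # primary superblock starts 1 KiB into the partition
EXT2_MAGIC_OFFSET = 56         # s_magic: little-endian 16-bit value
EXT2_LABEL_OFFSET = 120        # s_volume_name: 16 bytes, NUL-padded
EXT2_MAGIC = 0xEF53


def read_ext2_label(dump: bytes):
    """Return the volume label from the primary superblock in a raw dump.

    Returns None when the dump is too short or the ext2 magic is missing,
    which is what a clobbered primary superblock would look like.
    """
    sb = dump[EXT2_SUPERBLOCK_OFFSET:EXT2_SUPERBLOCK_OFFSET + 1024]
    if len(sb) < EXT2_LABEL_OFFSET + 16:
        return None
    (magic,) = struct.unpack_from("<H", sb, EXT2_MAGIC_OFFSET)
    if magic != EXT2_MAGIC:
        return None
    raw = sb[EXT2_LABEL_OFFSET:EXT2_LABEL_OFFSET + 16]
    return raw.split(b"\0", 1)[0].decode("ascii", errors="replace")


def first_block_is_zeroed(dump: bytes) -> bool:
    """True if the first 1 KiB of the dump contains only zero bytes."""
    return len(dump) >= 1024 and not any(dump[:1024])
```

Run against the file produced by the dd command above (for example, open('/tmp/firstbootkilos', 'rb').read()), read_ext2_label() would be expected to return None while the superblock is damaged and the label once e2fslabel or e2fsck has restored it. first_block_is_zeroed() indicates whether the leftover PV data in the first block has been cleared.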

Accidentally running pvcreate on an ext2/ext3 device (especially a mounted one), then realizing it and adding the proper volumes to finish the original task, possibly leaves the machine in a state where, at the next mount or boot, the devices will not be recognized by label (possibly causing a failure to boot), and data could be overwritten or lost (as an active filesystem gets modified). I haven't been able to test extensively how this relates to data loss on the mounted filesystem, since the first block of the partition is reserved for system block use anyway, but without inspecting the order of operations for filesystem calls, it seems as likely to cause data loss as any other write to a physical device that bypasses the filesystem API.

Version-Release number of selected component (if applicable):
lvm-1.0.3-15

How reproducible:
Always

Steps to Reproduce:
1. run pvcreate on a partition already formatted and in use as ext2/ext3
2. inspect the e2fs label on the partition or try to use the label
3. check the device/partition status with fsck

Actual Results:
Runs silently, overwriting the e2fs label and writing data directly to the underlying physical device of a mounted partition.

Expected Results:
A warning of some type relating to the device/partition flag or the mount status of the existing device.

Additional info:
This is a fresh install of RHEL 3 using
kernel-2.4.21-4.EL
lvm-1.0.3-15
redhat-release-3AS-1
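
As an illustration of the partition-flag warning requested under Expected Results, here is a rough sketch of reading partition type bytes from an MBR sector. It assumes the conventional MBR layout (four 16-byte entries starting at byte 446, type byte at entry offset 4, 0x55 0xAA signature at bytes 510-511) and covers primary partitions only. Logical partitions such as hda9 through hda12 live in the extended-partition chain, which this sketch does not walk. The function names are hypothetical and this is not how pvcreate itself is implemented.

```python
LINUX_LVM = 0x8E  # partition type that fdisk reports as "Linux LVM"


def primary_partition_types(mbr: bytes) -> list:
    """Return the type bytes of the four primary partition entries."""
    if len(mbr) < 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("not a valid MBR sector")
    # Each entry is 16 bytes; the partition type is the fifth byte.
    return [mbr[446 + 16 * i + 4] for i in range(4)]


def is_flagged_lvm(mbr: bytes, index: int) -> bool:
    """True if primary partition `index` (1 to 4) has type 0x8e."""
    if not 1 <= index <= 4:
        raise ValueError("only primary partitions 1-4 are handled")
    return primary_partition_types(mbr)[index - 1] == LINUX_LVM
```

With the first 512 bytes of /dev/hda from the layout above, is_flagged_lvm(mbr, 1) would be False because hda1 has type 83, so a tool applying this check could warn before touching it.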
*** Bug 120909 has been marked as a duplicate of this bug. ***
*** Bug 128380 has been marked as a duplicate of this bug. ***
Er, do you feel like providing some kind of explanation here, to the people who've taken time to report this? Unless there's more sanity checking in pvcreate, this will result in more support calls. That's already evidenced by the number of times this issue has been reported.
Even if we added a check to pvcreate, people would still be able to use any number of other tools to overwrite filesystem (and other) metadata. Checking for the mounted case wouldn't prevent people from overwriting unmounted filesystems anyway.
However, mkfs refuses to format a mounted filesystem. There are other ways to do that, but that doesn't mean a simple check in mkfs to prevent loss of data shouldn't be performed. I don't see why pvcreate should be different. A mounted check and a partition type check, as the original poster suggests, are good ideas. (Responding because a GLS customer just hit this one again, and it reminded me.)
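
For illustration only, a mounted check of the kind discussed here could look roughly like the sketch below. It assumes the mount table is available as /proc/mounts-style text and compares device paths literally. A real check would also need to handle symlinked or aliased device names and escaped characters in mount points. The function names are hypothetical, and this is not the code used by mkfs or pvcreate.

```python
def mounted_devices(mounts_text: str) -> dict:
    """Map /dev/... device paths to mount points from /proc/mounts-style text."""
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        # Skip blank lines and pseudo filesystems such as "none /proc proc ...".
        if len(fields) >= 2 and fields[0].startswith("/dev/"):
            result[fields[0]] = fields[1]
    return result


def find_mounted_targets(targets, mounts_text: str) -> list:
    """Return (device, mount_point) pairs for every target that is mounted."""
    mounted = mounted_devices(mounts_text)
    return [(dev, mounted[dev]) for dev in targets if dev in mounted]
```

A wrapper script could read /proc/mounts, pass the device arguments intended for pvcreate through find_mounted_targets(), and refuse to continue if the returned list is not empty. In the scenario from the original report, /dev/hda1 mounted at /boot would be flagged.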
(The mounted check has now made it into RHEL4 pvcreate.)