From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.6) Gecko/20050324 Firefox/1.0.2 Red Hat/1.0.2-1.4.1.centos4

Description of problem:
I have a system with two attached RAIDs at /dev/sda and /dev/sdb and a plain IDE disk at /dev/hda. I had RH7.3 installed on /dev/hda1 and had made sda and sdb into two separate volume groups with:

  pvcreate /dev/sda
  vgcreate vg1 /dev/sda
  pvcreate /dev/sdb
  vgcreate vg2 /dev/sdb

with several logical volumes on each. Today I upgraded to RHEL4 and made sure, when Disk Druid ran, to reformat only the / partition on /dev/hda1. After the upgrade, it could not find any volume groups:

  # vgscan --verbose
    Wiping cache of LVM-capable devices
    Wiping internal cache
  Reading all physical volumes.  This may take a while...
  Finding all volume groups
  No volume groups found

Running 'fdisk /dev/sda' and 'fdisk /dev/sdb' showed that they now had empty partition tables written on them. The only thing I can see that could have done this is Disk Druid, which must be programmed to write a partition table to any disk it sees that doesn't have one, without asking the user. Luckily I had backed up /etc from the old RH7.3 partition and was able to use vgcfgrestore to recover, but this could have made me lose terabytes of data.

Version-Release number of selected component (if applicable):

How reproducible:
Always

Steps to Reproduce:
1. Have LVM PVs on whole disks (e.g. /dev/sda)
2. Install RHEL4
3. Those whole disks are now corrupted, because empty partition tables are written to them (by Disk Druid most likely)

Actual Results: I lost my LVM volume groups

Expected Results: Disk Druid should have left those disks alone

Additional info:
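For reference, a rough sketch of the layout and recovery path described above. The setup commands come straight from the report; the recovery steps are a hedged sketch assuming the LVM2 tools shipped with RHEL4 and an LVM2-style metadata backup (the backup path and UUID placeholder are assumptions, and the actual recovery here was done from LVM1 config files saved from the old /etc):

  # original whole-disk setup (no partition table on sda/sdb)
  pvcreate /dev/sda
  vgcreate vg1 /dev/sda
  pvcreate /dev/sdb
  vgcreate vg2 /dev/sdb

  # one possible recovery after the PV label is clobbered:
  # re-create the PV with its old UUID from the metadata backup,
  # then restore the volume group metadata and re-activate it
  pvcreate --uuid <old-PV-UUID> --restorefile /etc/lvm/backup/vg1 /dev/sda
  vgcfgrestore -f /etc/lvm/backup/vg1 vg1
  vgchange -ay vg1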
Similar issue with installation on a system with multiple paths to the same disks, hence showing up as sda, sdb, sdc and sdd. All partition tables were blank. I only changed sda (partitioning it and setting up the LVM the way I wanted it). The installation appeared fine until I rebooted, and sda had an empty partition table. It looks like this is what happens:
* choose the partition scheme
* partitions for sda get set up as requested
* partitions for sdb, c and d get reinitialised as shown, but not as requested
Since they are all paths to the same disk, goodbye partition table. I was fortunately paying attention to the partition allocation and was able to fdisk the partition table back to what it should have been and recover the installation, but if you don't change a partition then the installer shouldn't force any partition scheme on it.
We do not support upgrades from RHL to RHEL. See the last sentence in the second paragraph of the following FAQ: http://www.redhat.com/software/rhel/faq/#17
I was doing a fresh install, not an upgrade. I have never done anything but fresh installs. You're missing a very CRITICAL point here: the RHEL install program is trashing real data on disks it should not be touching! It will happen no matter what might already be installed on the machine. It has nothing to do with fresh install vs. upgrade. This IS A BUG! An LVM PV on a whole disk (i.e. sda instead of sda1) gets trashed by anaconda.
To be more specific, what I suspect is happening is that Disk Druid is writing empty partition tables onto any disk it finds on the computer that does not already have a partition table. This is trashing the LVM PV. Disk Druid should not write to a disk in any way unless the user explicitly requests an operation in the Disk Druid GUI for that disk.
This is really an unsupported setup -- the PV for lvm shouldn't be the whole disk, but rather a partition of type 8e that spans it. We may add a test for this case, but at best it'll just exclude the device from all use -- probably not completely optimal, but this is really a case where the LVM was set up incorrectly.
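For comparison, a hedged sketch of the setup being called supported here: one partition of type 8e (Linux LVM) spanning the disk, with the PV on that partition instead of on the bare device. The sfdisk one-liner is just one way to create it; interactive fdisk works equally well:

  # single partition covering all of /dev/sda, type 8e (Linux LVM)
  echo ',,8e' | sfdisk /dev/sda

  # PV and VG go on the partition, not on the raw disk
  pvcreate /dev/sda1
  vgcreate vg1 /dev/sda1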
I disagree with your opinion that LVM was set up incorrectly. If making whole-disk PVs were incorrect, then pvcreate should not allow it. Hell, the LVM HOWTO even has examples using whole-disk PVs. And what do you mean by "exclude the device from all use"? There is a fundamental issue here that really has nothing to do with LVM specifically. Disk Druid should not be modifying any disk it finds in any way without express consent from the user. Other users in the world may be using whole disks (i.e. sans any partition table) for other applications (e.g. a raw data dump from some extremely fast data acquisition device).
Perhaps the summary should be "Disk Druid overwrites whole-device partition tables which have not had changes requested". In my case, the setup is supported - I partitioned the first logical view of the physical disk (128M for /boot (ext3 on a native Linux partition), the rest for LVM), set up LVM on the relevant partition, and left the other three (SAN multipathed) copies untouched. My guess is it partitioned the first copy as requested, then partitioned the remaining 3 as it saw them - blank, thus wiping the original setup 3 times. As it had cached the partition table for sda after performing the operation, the install was only too happy to install onto the disk as it thought it to be, but rebooted to a blank partition table - no /boot, and no LVM.

My workaround was to boot in rescue mode, partition /dev/sda exactly as it was, rewrite the partition table and reboot. I got lucky - it worked. But what other unrequested messing around is done by Disk Druid in an enterprise architecture installation? It seems the simplest solution is to identify all partition tables modified as the GUI goes - record them in a list or whatever - and have Disk Druid update only those at the stage those operations are done.
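For what it's worth, a hedged sketch of that rescue-mode workaround, assuming the 128M /boot plus LVM layout described above. The sizes and boundaries have to reproduce the original table exactly, which is why having noted the original allocation mattered (the recovery above was done interactively with fdisk; sfdisk is shown here only for brevity):

  # boot the installer with 'linux rescue', then rewrite the table:
  sfdisk -uM /dev/sda << EOF
  ,128,83,*
  ,,8e
  EOF

  # once the table matches the original, the LVM metadata is visible again
  pvscan
  vgchange -ay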
The fact that the low-level tools make it possible to configure your system in a particular way is in no way related to whether that configuration is supported for Red Hat Enterprise Linux. That being said, I can't reproduce the behavior you're claiming happens. If I make an LVM PV on /dev/sdb and then boot into the RHEL 4 installer, it gives me a window saying that /dev/sdb doesn't have a valid partition table. That window offers the choice of initializing the disk or ignoring it. If you tell it to initialize the disk, it writes a partition table. Is that what you're doing?
More like, you tell it to initialise sda and sdb, and don't tell it to touch sdc and sdd. Then it goes ahead and writes all partition tables as it had displayed, _even those you do not request to be changed_. So sda and sdb are changed as requested, and if there was data in the partition table on sdc and sdd which it didn't recognise, well bad luck, it's just been reinitialised. I don't believe it has much to do with LVM at all, just that LVM metadata is a victim of the reinitialisation which has not been requested.
We don't recall seeing any question about initializing the disk or not. It is within the realm of possibility we somehow "spaced" past that. One thing is that we were doing a kickstart install, but with no directives about disks except for 'zerombr yes'. The kickstart config had no 'clearpart' and no 'part'. This has always made the kickstart go interactive for the Disk Druid part and then continue on automatically with the rest.
From the documentation:

  zerombr (optional)
    If zerombr is specified, and yes is its sole argument, any invalid
    partition tables found on disks are initialized. This will destroy all
    of the contents of disks with invalid partition tables. This command
    should be in the following format:

      zerombr yes

    No other format is effective.

So you've put in the kickstart config that you want it to clear partition tables that don't make sense, and at the same time you're expecting it'll leave data untouched on a disk which isn't partitioned. These two goals are totally incompatible.
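To make the interaction concrete, a minimal sketch of the disk-related kickstart fragment described in the previous comment. With no 'clearpart' or 'part' directives the install drops into Disk Druid interactively, but 'zerombr yes' still applies to every disk whose partition table the installer cannot parse - and a whole-disk PV has no partition table at all (the hda drive name below is only an example):

  # the only disk directive in the kickstart in question
  zerombr yes

  # even restricting explicit partitioning to one drive, e.g.
  #   clearpart --all --drives=hda
  #   part / --fstype ext3 --size=1 --grow --ondisk=hda
  # would not stop zerombr from initializing other disks it considers invalid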
That seems to be it, yes. It didn't dawn on me that, in dealing with an MBR, anything other than the boot disk would be involved. I take it that zerombr is really unnecessary anyway if one will be doing a 'bootloader --location=mbr' - or does that also affect every disk and not just the boot disk?