Red Hat Bugzilla – Bug 476582
Anaconda incapable of clearing data from drives with duplicate VG names
Last modified: 2009-08-03 04:11:35 EDT
Created attachment 327030 [details]
dump from anaconda crash during installation
Description of problem:
If you pull drives from systems installed with RHEL and put them all into one system, Anaconda cannot handle the duplicate VolGroup names it finds and crashes with a traceback when it attempts to install.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Install multiple disks with identical VG names in a single system
2. Begin installation of RHEL5, using default partition options during a graphical install
3. Anaconda traceback when it begins modification of disks
Actual results:
Traceback and reboot

Expected results:
Blanked disks, then a functional, installed system.
Attaching the traceback file from the aborted install
For the record, I know you can dd over the first part of the disks to get around this, but it's not the kind of error we should be encountering during install.
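The dd workaround mentioned above amounts to zeroing the start of each disk so the PV label and partition table are gone before the installer scans it. A minimal sketch (the helper name is made up, and the demo deliberately runs against a scratch image file rather than a real block device):

```python
import os

def wipe_disk_start(path, length=1024 * 1024):
    """Zero the first `length` bytes of a disk (or image file),
    destroying the partition table and the LVM PV label so the
    installer sees a blank drive.  Equivalent in spirit to:
    dd if=/dev/zero of=<path> bs=1M count=1
    """
    with open(path, "r+b") as disk:
        disk.write(b"\x00" * length)
        disk.flush()
        os.fsync(disk.fileno())

# Demonstration against a scratch image file, NOT a real disk:
demo = "/tmp/fake-disk.img"
with open(demo, "wb") as img:
    img.write(b"LABELONE" + b"\xff" * 4096)  # stand-in for a PV label
wipe_disk_start(demo, 4096)
```

On real hardware the path would be the whole device (e.g. /dev/sdb, not a partition), and the first megabyte is enough to cover the MBR plus the LVM label/metadata area.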
The thing we hit here is that we are trying to remove a PV of an existing VG. In lvm.py:vgremove we remove only one of the VGs with the given name (chosen by lvm) with vgremove, but then we try to pvremove all PVs of all VGs with that name. Hence the traceback.
Another problem is duplicate LV names. Before removing all VGs of a given name, we remove only one LV of that name (chosen by lvm) in doMetaDeletes, so some VGs to be removed still contain LVs and lvm asks for user confirmation (this is a new lvm feature and can be suppressed with --force).
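The fix described above can be sketched as: key the deletion on VG UUIDs rather than names, remove each matching VG with --force (to cover leftover LVs), and pvremove only the PVs that belong to that particular VG. This is an illustrative simulation, not anaconda's actual lvm.py code; note also that the RHEL 5 lvm tools address VGs by name, so a real fix would additionally have to disambiguate duplicates (e.g. via vgrename):

```python
def delete_vgs_by_name(vgs, target_name):
    """Remove every VG whose name matches, keyed by UUID so that
    duplicate names are handled, and pvremove only the PVs of each
    removed VG.  `vgs` maps vg_uuid -> {"name", "pvs", "lvs"}.
    Returns the list of commands that would be issued (simulation
    only; nothing is executed)."""
    commands = []
    for vg_uuid, vg in list(vgs.items()):
        if vg["name"] != target_name:
            continue
        # --force suppresses lvm's confirmation prompt when the VG
        # still contains logical volumes (see the comment above)
        commands.append(["vgremove", "--force", vg_uuid])
        for pv in vg["pvs"]:
            commands.append(["pvremove", pv])
        del vgs[vg_uuid]
    return commands

# two drives pulled from separately installed systems, both carrying
# the default VG name:
vgs = {
    "uuid-1": {"name": "VolGroup00", "pvs": ["/dev/sda2"], "lvs": ["LogVol00"]},
    "uuid-2": {"name": "VolGroup00", "pvs": ["/dev/sdb2"], "lvs": ["LogVol00"]},
}
cmds = delete_vgs_by_name(vgs, "VolGroup00")
```

The buggy flow removed only one of the two VGs but then tried to pvremove all four PV/VG combinations by name, hitting a PV that still belonged to the surviving VG.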
This fix won't be easy; I see a high potential for breaking other lvm-related code.
I'd like to reiterate Radek's concerns about the potential for breaking other things attempting to fix this bug.
In Fedora we have mitigated this problem somewhat by using the hostname in volume group names and the mountpoint in logical volume names, and also by being more thorough in wiping volume metadata. Would it be possible for you to try F11 when it is released and see whether that fixes this problem for you? If so, I'd like to handle this one as NEXTRELEASE instead of attempting something in an update release that could really impact us in other ways.
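The naming mitigation mentioned above can be sketched as follows; the exact rules anaconda applies in Fedora may differ, and these helper names are purely illustrative:

```python
import re

def vg_name(hostname):
    """Derive a VG name from the machine's hostname, so two machines
    installed separately get distinct default VG names."""
    host = hostname.split(".")[0]  # drop the domain part
    # keep only characters lvm allows in VG/LV names
    return "vg_" + re.sub(r"[^A-Za-z0-9]", "", host).lower()

def lv_name(mountpoint):
    """Derive an LV name from the mountpoint it will carry."""
    if mountpoint == "/":
        return "lv_root"
    return "lv_" + mountpoint.strip("/").replace("/", "_")

pairs = [(vg_name("host1.example.com"), lv_name("/")),
         (vg_name("host2.example.com"), lv_name("/home"))]
```

With a scheme like this, drives pulled from differently named machines no longer collide on "VolGroup00"/"LogVol00", although two machines sharing a hostname could still collide.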
There is a pretty good chance the issue will be fixed with
https://bugzilla.redhat.com/show_bug.cgi?id=462615 in 5.4.
Moving to RHEL 5.5 so it doesn't get mistakenly closed by the bot. But it needs a retest given the fix for 462615.
QA results with 5.3 GA:
1) Install Xen, PV guest with default partitioning into a file (disk1.img)
2) Copy the image to create a second one (cp disk1.img disk2.img)
3) Both disk images contain identical volume groups and logical volumes (names are default)
4) Using virt-install start another installation using both disk images
5) Select default partitioning in anaconda GUI
6) Installation completes.
7) Using virt-install install again, but only using disk1.img
8) system installs without problems
Steps 1-6 describe installation on 2 identical disks with the same volume group and logical volume names.
Steps 4-8 describe installation on 2 disks and the installation over 1 of them only.
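Step 2 of the QA scenario works as a reproducer because a byte-for-byte copy of the image duplicates the on-disk LVM metadata exactly, so both virtual disks present identical VG and LV names (and even identical UUIDs). A minimal sketch, with a scratch file standing in for disk1.img:

```python
import shutil

# a scratch file stands in for disk1.img; a real guest image would
# carry a full LVM PV label and metadata area near its start
with open("/tmp/disk1.img", "wb") as img:
    img.write(b"LABELONE" + b"VolGroup00" + b"\x00" * 1024)

shutil.copyfile("/tmp/disk1.img", "/tmp/disk2.img")  # step 2 above

with open("/tmp/disk1.img", "rb") as a, open("/tmp/disk2.img", "rb") as b:
    identical = a.read() == b.read()
```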
Can you provide exact steps to reproduce? Anaconda doesn't throw an exception for me.
I did this with two separate systems with real disks. Perhaps that's why you didn't see the error?
1. Install RHEL5.3 x86_64 on two similar systems, each with one hard drive.
2. Choose to "remove all partitions and create default layouts" on both systems during the install.
3. When both installs complete, move both hard drives into the same machine and try another install of RHEL5.3
4. When you reach the end of the installer, you will receive the error.
(I just verified this again with RHEL5.3)
Let me see what happens with RHEL5.4 snap 5 (160 kernel). It'll take a bit to install the new pxe components and install the OS on the systems. I'll report back as soon as the installs complete.
W00t! The error doesn't occur in RHEL5.4 Snap 5. I was able to perform the exact same steps as above and end up with a usable system with two disks LVM'd together.
Great, if this result is reproducible, we should go ahead and close this one out. Thanks for retesting.
Closing this one as per comment #9