468431 – anaconda fails to clear out previous lvm data

Summary: anaconda fails to clear out previous lvm data

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	anaconda
Sub Component:
Version:	5.3
Hardware:	All
OS:	Linux
Priority:	medium
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Radek Vykydal
QA Contact:	Alexander Todorov
Docs Contact:
URL:
Whiteboard:
Duplicates (1):	473247
Depends On:
Blocks:

Reported:	2008-10-24 18:08 UTC by Bill Peck
Modified:	2009-01-20 21:37 UTC
CC List:	10 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2009-01-20 21:37:44 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Attachments
kickstart that doesn't work (11.82 KB, text/plain) 2008-10-27 15:16 UTC, Bill Peck
Bug text that anaconda captures (459.96 KB, text/plain) 2008-10-29 14:32 UTC, Jeff Burke

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2009:0164	0	normal	SHIPPED_LIVE	anaconda bug fix and enhancement update	2009-01-20 16:05:24 UTC

Description Bill Peck 2008-10-24 18:08:54 UTC

Description of problem:

If you re-install a system via kickstart with the same partitioning layout and the same LVM names, anaconda will fail even though you told it to wipe the partitions.

According to pjones, the problem is that we create exactly the same partition layout, so the old LVM data is still there.


Version-Release number of selected component (if applicable):
anaconda installer init version 11.1.2.145 starting

How reproducible:
unknown

Steps to Reproduce:
1. re-install the same system over and over with the same kickstart (with lvm)

Actual results:
    Wiping cache of LVM-capable devices
  Couldn't find device with uuid 'RSIUPp-f3MA-20br-1uAs-4JuJ-2lBi-RE10WR'.
    There are 1 physical volumes missing.
    visited LogVol00
    visited LogVol01
  A volume group called 'VolGroup00' already exists.

Comment 1 Milan Broz 2008-10-24 18:48:25 UTC

Anaconda should wipe the newly created devices (partitions) then.

For most types (including LVM), zeroing the first megabyte with dd works quite well :-)

For LVM, you can also run pvremove -ff <new partition>; the code just has to ensure that all former parts of the reappeared VG are wiped this way.

Also note that if you skip pvremove and use pvcreate -ff <new device>, it correctly creates the physical volume, but if another former PV of the reappeared VG is still present, vgcreate will fail with exactly the message above (because LVM still sees the old VG with the same name, just with some PVs missing).
So running pvremove (or force-wiping with dd) after the new repartitioning is probably the better idea.
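
As an illustration of the dd approach, here is a minimal Python sketch that zeroes the first megabyte of a device node. It is not anaconda code: the function name wipe_leading_bytes and its parameters are made up for this example, and it assumes a Python 3 interpreter and write access to the target path.

```python
import os


def wipe_leading_bytes(path, length=1024 * 1024):
    """Overwrite the first `length` bytes of `path` with zeros.

    Hypothetical helper: on a block device this does roughly what
    `dd if=/dev/zero of=<device> bs=1M count=1` does.
    """
    # "r+b" opens for in-place writing without truncating the target.
    with open(path, "r+b") as dev:
        dev.write(b"\0" * length)
        dev.flush()
        # Make sure the zeros reach the disk before LVM rescans it.
        os.fsync(dev.fileno())
```

Calling wipe_leading_bytes("/dev/hdb2") on a freshly created partition (the device name is only an example) would zero whatever stale label or metadata sits in that first megabyte, so it has to run after repartitioning and before pvcreate.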

Comment 2 Radek Vykydal 2008-10-27 13:40:47 UTC

Bill, can you attach the ks file please?

Comment 3 Bill Peck 2008-10-27 15:16:58 UTC

Created attachment 321618
kickstart that doesn't work

Comment 4 Jeff Burke 2008-10-29 14:32:58 UTC

Created attachment 321810
Bug text that anaconda captures

I tried to do an install with the "latest RHEL5.3 nightly" (post beta). During the install, I asked for all Linux partitions to be removed. Anaconda threw up the BUG screen and captured the attached text.

Comment 5 Radek Vykydal 2008-10-30 14:16:59 UTC

I tried a slightly modified ks from comment #3 on a disk with existing
default partitioning, and hit the

A volume group called 'VolGroup00' already exists.

error.

Though I'm not sure that it has the same cause as
the case in this bug, it may be close. What I observed is:

With clearpart --initlabel (i.e. the disk label should be reset),
the partitions aren't wiped properly because doClearPartAction is
fed only one request (whole disk?), which is then not added
to the delete requests. OTOH, without --initlabel, doClearPartAction
is fed the existing requests, which are used to create the proper
delete requests that ensure proper deleting/wiping (including
pvremove -ff) of the existing LVM stuff.

Comment 6 Radek Vykydal 2008-10-30 16:12:33 UTC

Bill, Jeff, can you try to reproduce with updates:
http://rvykydal.fedorapeople.org/updates.wipelvm.149.img ?
It is a port of the fix for bug #257161 from RHEL4.

Comment 7 Radek Vykydal 2008-11-14 15:25:13 UTC

We are hitting two things here:

1) When installing with kickstart, we don't wipe the old metadata
(more in comment #5). I think this is what was hit in the description.
A patch was pushed that should fix this.

2) Wiping the old metadata fails.
This was hit in comment #4; it was not a kickstart install,
so we tried to wipe in doMetaDeletes.
(It could also have been hit in
https://bugzilla.redhat.com/show_bug.cgi?id=469700#c22,
just for reference.)

The patch for the bug from the description was pushed
with commit b5a48bfc44a8084b4c751e192c70eb1837e44e19,
included in 11.1.2.156 (Snapshot #3), so I am moving this to MODIFIED.

Comment 13 Radek Vykydal 2008-11-20 13:13:38 UTC

From tail of anaconda.log:
21:26:00 INFO    : removing obsolete VG VolGroup00
21:26:00 INFO    : vgremove VolGroup00
... here the function vgremove failed, most probably on
vgremove -v VolGroup00; the exception was silently caught,
so, as a consequence, vgcreate failed in the next step:
21:26:00 ERROR   : createLogicalVolumes failed with vgcreate failed for VolGroup00
... with the output of failed vgcreate command in lvmout:

    Wiping cache of LVM-capable devices
  Couldn't find device with uuid 'Nccnpt-t5A4-C4JO-3f0E-mQtA-d0nU-r1Uiz9'.
    There are 1 physical volumes missing.
  A volume group called 'VolGroup00' already exists.

I need to see the output of the failed vgremove, preferably by reproducing
one more time with an updates file that adds patches which:
- remove the mentioned exception catching
- append the outputs of lvm commands to the lvmout file instead
  of overwriting it with the output of the last command.
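
The two debugging changes listed above can be sketched as follows. This is a hypothetical Python 3 helper, not the patch in the updates file: run_logged, LvmCommandError and the log path argument are names assumed for illustration. It appends each command's output to the log instead of overwriting it, and lets a failure propagate instead of swallowing it.

```python
import subprocess


class LvmCommandError(Exception):
    """Raised when a logged command exits with a non-zero status."""


def run_logged(argv, log_path):
    """Run argv, append its combined output to log_path, raise on failure."""
    proc = subprocess.run(argv, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT)
    # Mode "ab" appends; mode "wb" would keep only the last command's output.
    with open(log_path, "ab") as log:
        log.write(("$ " + " ".join(argv) + "\n").encode())
        log.write(proc.stdout)
    if proc.returncode != 0:
        # No silent catch: the caller sees which command failed.
        raise LvmCommandError(
            "%s failed with exit code %d" % (argv[0], proc.returncode))
    return proc.stdout
```

With a helper like this, a failing vgremove would stop the sequence at the point of failure, and its output would still be in the log when the later vgcreate output is written.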

Comment 14 Bill Peck 2008-11-20 13:52:52 UTC

I've scheduled a job with your updates.img.  I'll report back the results soon.

http://rhts.redhat.com/cgi-bin/rhts/jobs.cgi?id=36874

Comment 15 Radek Vykydal 2008-11-20 14:11:50 UTC

The updates.img file mentioned in comment #13 is at
http://rvykydal.fedorapeople.org/updates.wipelvmdbg.img

Comment 17 Radek Vykydal 2008-11-20 17:55:38 UTC

We're getting closer; the patch in the updates file from comment #13 gives the
expected logs:

anaconda.log tail:
16:33:44 INFO    : removing obsolete VG VolGroup00
16:33:44 INFO    : vgremove -v VolGroup00
16:33:45 ERROR   : createLogicalVolumes failed with vgremove failed

lvmout tail:
    Using volume group(s) on command line
    Finding volume group "VolGroup00"
    Wiping cache of LVM-capable devices
  Couldn't find device with uuid '5iOMY6-Moqq-6bm8-zVPg-2Qhd-4HKH-QmSQgV'.
    There are 1 physical volumes missing.
  Volume group "VolGroup00" not found, is inconsistent or has PVs missing.
  Consider vgreduce --removemissing if metadata is inconsistent.
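
The last line of that output suggests one possible way around the failure: drop the missing PVs from the VG before removing it. The Python sketch below only plans the command sequence from captured LVM output. It is an assumption based on that hint, not a description of what the actual fix does, and plan_vg_removal and missing_pv_count are hypothetical names.

```python
import re

# Matches the line LVM prints when a VG references PVs it cannot find.
MISSING_PV = re.compile(r"There are (\d+) physical volumes missing")


def missing_pv_count(lvm_output):
    """Return the number of missing PVs reported in lvm_output, or 0."""
    match = MISSING_PV.search(lvm_output)
    return int(match.group(1)) if match else 0


def plan_vg_removal(vg_name, lvm_output):
    """Return the commands (as argv lists) to remove vg_name."""
    commands = []
    if missing_pv_count(lvm_output) > 0:
        # An inconsistent VG has to be repaired before vgremove accepts it.
        commands.append(["vgreduce", "--removemissing", vg_name])
    commands.append(["vgremove", "-v", vg_name])
    return commands
```

For the lvmout tail above, plan_vg_removal("VolGroup00", output) yields vgreduce --removemissing VolGroup00 followed by vgremove -v VolGroup00.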

Comment 18 Radek Vykydal 2008-11-20 18:32:44 UTC

Updates file
http://rvykydal.fedorapeople.org/updates.wipelvmfix.img
fixes the issue on the rhts reproducer from comment #12.
Thanks to bpeck for testing my updates.img files.

Comment 20 Radek Vykydal 2008-11-21 13:22:33 UTC

I guess testing the updates.img file from #17 has destroyed
the rhts reproducer.
I finally found a reproducer using kvm:

1) Install with default partitioning on 2 clean disks (hda and hdb).
I used this ks partitioning:

zerombr
clearpart --all --initlabel
autopart

2) Replace hda with a clean disk
(the order of the disks matters).

3) Try to install on hda and hdb with the following ks partitioning
(it is the same as in the original reproducer in comment #3, only
hdc is replaced with hdb):

zerombr
clearpart --all --drives=hda,hdb --initlabel
part prepboot --fstype "PPC PReP Boot" --size=4 --ondisk=hda
part raid.27 --size=100 --ondisk=hda --asprimary
part raid.29 --size=100 --ondisk=hdb --asprimary
part raid.23 --size=1 --grow --ondisk=hdb
part raid.22 --size=1 --grow --ondisk=hda
raid /boot --fstype ext3 --level=RAID1 --device=md0 raid.27 raid.29
raid pv.17 --fstype "physical volume (LVM)" --level=RAID0 --device=md1 raid.22 raid.23
volgroup VolGroup00 --pesize=32768 pv.17
logvol swap --fstype swap --name=swap0 --vgname=VolGroup00 --size=4096
logvol / --fstype ext3 --name=LogVol00 --vgname=VolGroup00 --size=2000 --grow

Comment 22 Radek Vykydal 2008-11-26 09:26:01 UTC

Should be fixed in anaconda-11.1.2.158-1.

Comment 24 Denise Dumas 2008-12-11 15:17:06 UTC

*** Bug 473247 has been marked as a duplicate of this bug. ***

Comment 28 errata-xmlrpc 2009-01-20 21:37:44 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2009-0164.html
