Bug 727578

Summary: Missing PVs lead to corrupted metadata, and "vgreduce --removemissing --force" is unable to correct the metadata
Product: Red Hat Enterprise Linux 4 Reporter: Milan Broz <mbroz>
Component: lvm2-cluster Assignee: Milan Broz <mbroz>
Status: CLOSED ERRATA QA Contact: Cluster QE <mspqa-list>
Severity: urgent Docs Contact:
Priority: urgent    
Version: 4.9 CC: agk, ccaulfie, cmarthal, dwysocha, heinzm, jbrassow, mbroz, mjuricek, mkhusid, prajnoha, prockai, pvrabec, thornber
Target Milestone: rc   
Target Release: ---   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: lvm2-cluster-2.02.42-11.el4 Doc Type: Bug Fix
Doc Text:
Previously, if several physical volumes were missing from a volume group, the written metadata could contain the wrong name for the missing physical volumes, and this was later detected as incorrect metadata for the whole volume group. This fix enforces the use of the physical volume UUID to reference physical volumes, which resolves the problem.
Story Points: ---
Clone Of: 702308 Environment:
Last Closed: 2011-08-18 08:48:49 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
Embargoed:
Bug Depends On: 702308    
Bug Blocks:    

Description Milan Broz 2011-08-02 14:04:58 UTC
+++ This bug was initially created as a clone of Bug #702308 +++
Clone for lvm2-cluster

Description of problem:
In some customer setups, when a PV that is part of an LV goes missing, "vgreduce --removemissing --force" does not leave the VG in a consistent state. As a result, the VG cannot be removed.


Version-Release number of selected component (if applicable):
lvm2-2.02.42-9.el4

How reproducible:
I had a hard time reproducing it, but I'll attach all the info from the customer's system, including an lvmdump, and verbose output of the command.

Steps to Reproduce:
1. create a vg from multiple pvs
2. create at least one lv on the vg
3. remove at least one of the pvs in the lv
4. try using vgreduce, vgreduce --removemissing, and vgreduce --removemissing --force.
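
The steps above can be written out as a command sequence. The sketch below only builds and prints the commands; it does not run them. The device paths, the VG and LV names, the 64M size, and the use of striping (-i) to make the LV span every PV are assumptions for illustration, not details from the customer's system. The final vgremove stands for the removal attempt mentioned in the description. Since the problem was hard to reproduce, this sequence is not guaranteed to trigger it.

```python
def repro_commands(devices, vg="vg_test", lv="lv_test"):
    """Return (setup, after_removal) command lists for the steps above.

    All names and sizes are hypothetical placeholders.
    """
    setup = [
        ["pvcreate", *devices],
        ["vgcreate", vg, *devices],
        # Stripe across all PVs so the LV has extents on each of them.
        ["lvcreate", "-i", str(len(devices)), "-L", "64M", "-n", lv, vg],
    ]
    # Step 3 (removing a PV) happens outside LVM, e.g. by detaching a disk.
    after_removal = [
        ["vgreduce", vg, devices[-1]],
        ["vgreduce", "--removemissing", vg],
        ["vgreduce", "--removemissing", "--force", vg],
        ["vgremove", vg],
    ]
    return setup, after_removal


if __name__ == "__main__":
    setup, after_removal = repro_commands(["/dev/sdb", "/dev/sdc"])
    for cmd in setup:
        print(" ".join(cmd))
    print("# detach the last device here, then:")
    for cmd in after_removal:
        print(" ".join(cmd))
```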
  
Actual results:
vgreduce cannot bring the VG into a consistent state, so the VG cannot be removed.


--- Additional comment from mbroz on 2011-05-05 07:59:52 EDT ---

That problem was fixed long ago in newer packages, but unfortunately not in
RHEL4.
From the a7cac2463c15c915636e511887f022b8cb63a97e commit log:
    Use PV UUID in hash for device name when exporting metadata.

    Currently code uses pv_dev_name() for hash when getting internal
    "pvX" name.

    This produce corrupted metadata if PVs are missing, pv->dev
    is NULL and all these missing devices returns one name
    (using "unknown device" for all missing devices as hash key).

I see quite a serious problem here: when a simple VG with several PVs
loses several of those PVs, the code apparently generates wrong metadata, and
this metadata is not parsable, so it can lead to loss of the whole VG.
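
The collision described in the commit log can be illustrated with a small model. This is a simplified sketch, not the lvm2 source: the PV class, the build_pv_names helper, and the assumption that a later insert under the same key overwrites the earlier one are all hypothetical stand-ins for the real hash table code. It shows why keying on the device name maps every missing PV to one internal "pvX" name, while keying on the UUID keeps them distinct.

```python
class PV:
    """Hypothetical stand-in for a physical volume record."""

    def __init__(self, uuid, dev=None):
        self.uuid = uuid
        self.dev = dev  # None models a missing device (pv->dev is NULL)


def pv_dev_name(pv):
    # Every missing device reports the same placeholder name.
    return pv.dev if pv.dev is not None else "unknown device"


def build_pv_names(pvs, key):
    """Map key(pv) -> internal "pvX" name; a repeated key is overwritten."""
    names = {}
    for index, pv in enumerate(pvs):
        names[key(pv)] = f"pv{index}"
    return names


pvs = [PV("uuid-a", "/dev/sda"), PV("uuid-b"), PV("uuid-c")]

# Old behaviour: both missing PVs share the key "unknown device".
by_name = build_pv_names(pvs, pv_dev_name)
old_refs = [by_name[pv_dev_name(pv)] for pv in pvs]

# Fixed behaviour: the UUID is unique even when the device is missing.
by_uuid = build_pv_names(pvs, lambda pv: pv.uuid)
new_refs = [by_uuid[pv.uuid] for pv in pvs]

print(old_refs)  # two different PVs resolve to the same internal name
print(new_refs)  # each PV resolves to its own internal name
```

In this model the name-keyed table ends up with fewer entries than there are PVs, so anything that refers to a missing PV by its internal name points at the wrong one.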

Comment 1 Milan Broz 2011-08-03 17:10:06 UTC
Fixed in lvm2-cluster-2.02.42-11.el4

Comment 3 Milan Broz 2011-08-05 11:10:05 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, if several physical volumes were missing from a volume group, the written metadata could
contain the wrong name for the missing physical volumes, and this was later detected as incorrect metadata
for the whole volume group.
This fix enforces the use of the physical volume UUID to reference physical volumes, which resolves the problem.

Comment 5 errata-xmlrpc 2011-08-18 08:48:49 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1183.html