Bug 727578 - Missing PVs lead to corrupted metadata, and "vgreduce --removemissing --force" is unable to correct the metadata
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 4
Classification: Red Hat
Component: lvm2-cluster
Version: 4.9
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: rc
Target Release: ---
Assignee: Milan Broz
QA Contact: Cluster QE
URL:
Whiteboard:
Depends On: 702308
Blocks:
Reported: 2011-08-02 14:04 UTC by Milan Broz
Modified: 2013-03-01 04:10 UTC (History)

Fixed In Version: lvm2-cluster-2.02.42-11.el4
Doc Type: Bug Fix
Doc Text:
Previously, if several physical volumes were missing from a volume group, the written metadata could contain the wrong name for the missing physical volumes, and this was later detected as incorrect metadata for the whole volume group. This fix enforces the use of the physical volume UUID to reference physical volumes, which resolves the problem.
Clone Of: 702308
Environment:
Last Closed: 2011-08-18 08:48:49 UTC
Target Upstream Version:
Embargoed:



Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2011:1183 | 0 | normal | SHIPPED_LIVE | lvm2-cluster bug fix update | 2011-08-18 08:48:42 UTC

Description Milan Broz 2011-08-02 14:04:58 UTC
+++ This bug was initially created as a clone of Bug #702308 +++
Clone for lvm2-cluster

Description of problem:
In some customer setups, when a PV that is part of an LV goes missing, "vgreduce --removemissing --force" does not produce a consistent VG. As a result, the customer is unable to remove the VG.


Version-Release number of selected component (if applicable):
lvm2-2.02.42-9.el4

How reproducible:
I had a hard time reproducing it, but I'll attach all the info from the customer's system, including an lvmdump, and verbose output of the command.

Steps to Reproduce:
1. create a vg from multiple pvs
2. create at least one lv on the vg
3. remove at least one of the pvs in the lv
4. try using vgreduce, vgreduce --removemissing, and vgreduce --removemissing --force.
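
The steps above can be written out as a dry-run command plan. This is only a sketch: the VG, LV, and device names are placeholders, the explicit pvcreate calls and the "-l 100%FREE" allocation are assumptions about how the VG and LV were built, and nothing is executed. How the PV goes missing in step 3 depends on the setup, so it is left as a comment.

```python
# Dry-run sketch of the reproduction steps: builds command lines, runs nothing.
# All names (vg_test, lv_test, /dev/sd*) are placeholders, not from the report.

def reproduction_plan(vg, lv, pvs, lost_pv):
    """Return the command lines for steps 1, 2 and 4 as lists of arguments."""
    if lost_pv not in pvs:
        raise ValueError("lost_pv must be one of the PVs in the VG")

    # Step 1: initialise the PVs and create a VG from them.
    plan = [["pvcreate", pv] for pv in pvs]
    plan.append(["vgcreate", vg] + list(pvs))

    # Step 2: create an LV; using all free extents makes it span every PV.
    plan.append(["lvcreate", "-n", lv, "-l", "100%FREE", vg])

    # Step 3: make lost_pv disappear. This happens outside LVM (for example
    # by removing the disk), so no command is generated for it here.

    # Step 4: the three vgreduce variants named in the report.
    plan.append(["vgreduce", vg, lost_pv])
    plan.append(["vgreduce", "--removemissing", vg])
    plan.append(["vgreduce", "--removemissing", "--force", vg])
    return plan


plan = reproduction_plan(
    "vg_test", "lv_test", ["/dev/sdb1", "/dev/sdc1", "/dev/sdd1"], "/dev/sdd1"
)
for command in plan:
    print(" ".join(command))
```

Note that the report says the problem was hard to reproduce, so following this plan on a test system may not trigger the corrupted metadata.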
  
Actual results:
Unable to use vgreduce to make a consistent vg to remove.


--- Additional comment from mbroz on 2011-05-05 07:59:52 EDT ---

That problem was fixed long ago in newer packages, but unfortunately not in
RHEL4. From the a7cac2463c15c915636e511887f022b8cb63a97e commit log:
    Use PV UUID in hash for device name when exporting metadata.

    Currently code uses pv_dev_name() for hash when getting internal
    "pvX" name.

    This produce corrupted metadata if PVs are missing, pv->dev
    is NULL and all these missing devices returns one name
    (using "unknown device" for all missing devices as hash key).

I see quite a serious problem here: when a simple VG with several PVs
loses several of those PVs, the code apparently generates wrong metadata. That
metadata is not parsable, so this can lead to loss of the whole VG.
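
For illustration, the collision described in the commit log can be modelled in a few lines of Python. This is a simplified sketch, not LVM2 code: the PV class and the functions display_name, build_names, and export_refs are hypothetical, and the UUIDs and device path are made up. It only shows why keying the internal "pvX" names by device name breaks once more than one device is missing, and why keying by PV UUID does not.

```python
# Simplified model of the naming collision; hypothetical names, not LVM2 code.
from dataclasses import dataclass
from typing import Optional


@dataclass
class PV:
    uuid: str
    dev_name: Optional[str]  # None models a missing device (pv->dev is NULL)


def display_name(pv):
    """Every missing device yields the same string, as in the commit log."""
    return pv.dev_name if pv.dev_name is not None else "unknown device"


def build_names(pvs, key):
    """Map a hash key to an internal "pvX" name for each PV."""
    names = {}
    for index, pv in enumerate(pvs):
        # A repeated key silently overwrites the earlier entry.
        names[key(pv)] = f"pv{index}"
    return names


def export_refs(pvs, key):
    """Return the "pvX" name that exported metadata would use for each PV."""
    names = build_names(pvs, key)
    return [names[key(pv)] for pv in pvs]


pvs = [
    PV("uuid-A", "/dev/sda1"),
    PV("uuid-B", None),  # missing
    PV("uuid-C", None),  # missing
]

by_name = export_refs(pvs, display_name)
by_uuid = export_refs(pvs, lambda pv: pv.uuid)

print(by_name)  # ['pv0', 'pv2', 'pv2']: both missing PVs share one name
print(by_uuid)  # ['pv0', 'pv1', 'pv2']: each PV keeps a distinct name
```

With the device name as key, the two missing PVs end up referenced under the same "pvX" name, which is the kind of inconsistency that makes the written metadata unparsable. With the UUID as key, each PV keeps its own name whether or not its device is present.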

Comment 1 Milan Broz 2011-08-03 17:10:06 UTC
Fixed in lvm2-cluster-2.02.42-11.el4

Comment 3 Milan Broz 2011-08-05 11:10:05 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
Previously, if several physical volumes were missing from a volume group, the written metadata could
contain the wrong name for the missing physical volumes, and this was later detected as incorrect metadata
for the whole volume group.
This fix enforces the use of the physical volume UUID to reference physical volumes, which resolves the problem.

Comment 5 errata-xmlrpc 2011-08-18 08:48:49 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1183.html

