Bug 653643 - vgextend --restoremissing "passes" when just one of the multiple devices attempted works
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2
Version: 5.6
Hardware: x86_64
OS: Linux
Priority: medium
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Petr Rockai
QA Contact: Corey Marthaler
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2010-11-15 20:56 UTC by Corey Marthaler
Modified: 2016-03-18 20:00 UTC (History)

Fixed In Version: lvm2-2.02.84-1
Doc Type: Bug Fix
Doc Text:
Cause: vgextend --restoremissing would have reported success even in case of partial failure of an operation.
Consequence: users of vgextend --restoremissing may be confused by this behaviour.
Fix: change the behaviour to report partial failures.
Result: a partial failure is reported.
Clone Of:
Environment:
Last Closed: 2011-07-21 10:51:40 UTC
Target Upstream Version:
Embargoed:



Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 1319319 0 unspecified NEW vgextend --restoremissing is misbehaving in certain case 2023-08-10 15:40:32 UTC
Red Hat Product Errata RHBA-2011:1071 0 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2011-07-21 10:50:01 UTC

Internal Links: 1319319

Description Corey Marthaler 2010-11-15 20:56:43 UTC
Description of problem:
The following command passed when I think it should have failed. 

taft-01: vgextend --restoremissing helter_skelter /dev/sdc1 /dev/sdf1
  WARNING: Inconsistent metadata found for VG helter_skelter - updating to use version 493
  Missing device /dev/sdc1 reappeared, updating metadata for VG helter_skelter to version 493.
  Device still marked missing because of allocated data on it, remove volumes and consider vgreduce --removemissing.
  Removing PV /dev/sdf1 (CYqDST-Q8GN-8aRy-TZ09-aKoR-hmmf-ScWICU) that no longer belongs to VG helter_skelter
  WARNING: PV /dev/sdf1 not found in VG helter_skelter

From the log:
Nov 15 13:47:42 taft-01 qarshd[18718]: Running cmdline: vgextend --restoremissing helter_skelter /dev/sdc1 /dev/sdf1
Nov 15 13:47:44 taft-01 xinetd[6233]: EXIT: qarsh status=0 pid=18718 duration=2(sec)


Version-Release number of selected component (if applicable):
2.6.18-227.el5

lvm2-2.02.74-3.el5    BUILT: Thu Nov 11 02:56:33 CST 2010
lvm2-cluster-2.02.74-3.el5    BUILT: Tue Nov  9 08:01:59 CST 2010
device-mapper-1.02.55-2.el5    BUILT: Tue Nov  9 06:41:00 CST 2010
device-mapper-event-1.02.55-2.el5    BUILT: Tue Nov  9 06:41:00 CST 2010
cmirror-1.1.39-10.el5    BUILT: Wed Sep  8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009


How reproducible:
Often

Comment 1 Corey Marthaler 2010-11-15 22:55:17 UTC
When the command is run on each failed device individually, the second run fails, apparently because the first run had already dealt with and removed the second device. Am I using this command incorrectly?

taft-01: vgextend --restoremissing helter_skelter /dev/sdc1
  WARNING: Inconsistent metadata found for VG helter_skelter - updating to use version 95
  Missing device /dev/sdc1 reappeared, updating metadata for VG helter_skelter to version 95.
  Device still marked missing because of allocated data on it, remove volumes and consider vgreduce --removemissing.
  Removing PV /dev/sdf1 (thO4WT-KElb-9fkg-CpLE-gxnw-Ktlv-nwYBXc) that no longer belongs to VG helter_skelter

taft-01: vgextend --restoremissing helter_skelter /dev/sdf1
  WARNING: PV /dev/sdf1 not found in VG helter_skelter
  No PV has been restored.
vgextend --restoremissing didn't work on taft-01
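
The per-device runs above suggest a workaround for test harnesses on affected builds: restore each PV in its own invocation and fail if any single one fails. The following is only a minimal sketch under that assumption. `restore_each` and the `VGEXTEND` override variable are hypothetical names introduced here for illustration; they are not part of lvm2.

```shell
# Hypothetical helper: run vgextend --restoremissing once per PV and
# return non-zero if any single invocation fails.
# VGEXTEND is an assumed override hook so the logic can be tried without LVM.
restore_each() {
    vg=$1
    shift
    rc=0
    for pv in "$@"; do
        if ! "${VGEXTEND:-vgextend}" --restoremissing "$vg" "$pv"; then
            echo "restore failed for $pv" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example call (needs root and a real VG):
#   restore_each helter_skelter /dev/sdc1 /dev/sdf1
```

This relies on each single-device invocation returning a non-zero status when nothing is restored, which is what the harness message after the second command above appears to indicate.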

Comment 2 Corey Marthaler 2010-11-19 00:12:44 UTC
Without a solution to this issue, the 'additional stripe containing one of the devices being failed' (to further stress mirror device failure) will have to be turned off for 5.6.

Comment 3 Petr Rockai 2010-11-21 12:38:05 UTC
This is indeed all a bit confusing. A breakdown:

- /dev/sdf1 was removed by vgreduce --removemissing (run automatically by dmeventd) while it was actually missing. When you run vgextend --restoremissing /dev/sdc1, the generic metadata reading code notices that sdf1 carries an old copy of the metadata and that the new copy says sdf1 is no longer in the VG, so sdf1 is kicked out. Any command that writes metadata would do this. It is in no way specific to vgextend --restoremissing, which just happened to be the first command to run after the device came back.

- /dev/sdc1 was not removed because it actually had some data on it when it went away, and that data was not part of a mirror that could be repaired. The "Missing device ... reappeared" and "Device still marked missing..." messages are misleading in this case: they come from an automated attempt by the metadata code (again) that is run with every command. It would actually be good to suppress this when running vgextend --restoremissing. In fact, after these messages, vgextend --restoremissing runs and fixes up /dev/sdc1.

Overall, after the command your VG should be in proper working order. It is just the messages that are confusing. I'll look into fixing that. I believe it would be OK to skip the reappearance test completely if we have handles_missing_pvs set... I'll send a patch to do just that, it should fix this case and probably some other confusing messages.

Comment 5 Petr Rockai 2010-11-30 11:55:50 UTC
The proposed patch (see my last comment) has been checked in upstream.

Comment 6 Milan Broz 2011-03-01 15:58:24 UTC
Fixed in lvm2-2.02.84-1.el5

Comment 9 Corey Marthaler 2011-05-10 16:42:42 UTC
This appears to be fixed now. That said, there's a caveat here: this cmd is no longer required during mirror device failure testing, so it's no longer run as part of our regular regression testing.

[root@taft-01 mnt]# pvscan
  WARNING: Volume Group helter_skelter is not consistent
  PV /dev/sdb1   VG helter_skelter   lvm2 [135.66 GB / 135.18 GB free]
  PV /dev/sde1   VG helter_skelter   lvm2 [135.66 GB / 135.18 GB free]
  PV /dev/sdf1   VG helter_skelter   lvm2 [135.66 GB / 135.18 GB free]
  PV /dev/sdg1   VG helter_skelter   lvm2 [135.66 GB / 135.66 GB free]
  PV /dev/sdh1   VG helter_skelter   lvm2 [135.66 GB / 135.66 GB free]
  PV /dev/sda2   VG VolGroup00       lvm2 [68.12 GB / 0    free]
  Total: 6 [746.45 GB] / in use: 6 [746.45 GB] / in no VG: 0 [0   ]

[root@taft-01 mnt]# vgextend --restoremissing helter_skelter  /dev/sdd1 /dev/sdc1
  WARNING: Inconsistent metadata found for VG helter_skelter - updating to use version 12
  Removing PV /dev/sdc1 (Z7T72D-7K4t-b6W1-aO5j-RHgJ-bV6b-0jOcwg) that no longer belongs to VG helter_skelter
  Removing PV /dev/sdd1 (JN45DR-H1YZ-edJ0-ujtc-BYQK-FPsm-FL2K2N) that no longer belongs to VG helter_skelter
  WARNING: PV /dev/sdd1 not found in VG helter_skelter
  WARNING: PV /dev/sdc1 not found in VG helter_skelter
  No PV has been restored.


2.6.18-256.el5

lvm2-2.02.84-2.el5    BUILT: Wed Mar 23 07:18:08 CDT 2011
lvm2-cluster-2.02.84-2.el5    BUILT: Wed Mar 23 07:19:43 CDT 2011
device-mapper-1.02.63-2.el5    BUILT: Fri Mar  4 10:23:17 CST 2011
device-mapper-event-1.02.63-2.el5    BUILT: Fri Mar  4 10:23:17 CST 2011
cmirror-1.1.39-10.el5    BUILT: Wed Sep  8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5    BUILT: Tue Dec 22 13:39:47 CST 2009
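
Since this command is no longer covered by regular regression runs, a harness that still wants to check it could list the PVs that were not restored by scanning the captured output for the "not found in VG" warnings shown in this comment. This is only a sketch: `unrestored_pvs` is a hypothetical helper name, and it assumes the warning text keeps the exact form seen above.

```shell
# Hypothetical helper: read vgextend output on stdin and print every PV
# named in a "WARNING: PV <dev> not found in VG <vg>" line, one per line.
unrestored_pvs() {
    sed -n 's/^ *WARNING: PV \([^ ]*\) not found in VG .*$/\1/p'
}

# Example call (needs root and a real VG):
#   vgextend --restoremissing helter_skelter /dev/sdd1 /dev/sdc1 2>&1 | unrestored_pvs
```

Matching on message text is fragile across lvm2 versions, so the command's exit status should remain the primary check and this listing only a diagnostic aid.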

Comment 11 Florian Nadge 2011-05-26 14:56:51 UTC
    Technical note added. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    New Contents:
This field is the basis of the errata or release note for this bug. It can also be used for change logs.

The Technical Note template, known as CCFR, is as follows:

Cause
    What actions or circumstances cause this bug to present.
Consequence
    What happens when the bug presents.
Fix
    What was done to fix the bug.
Result
    What now happens when the actions or circumstances above occur.
    Note: this is not the same as the bug doesn’t present anymore.

Comment 12 Petr Rockai 2011-05-30 08:18:50 UTC
    Technical note updated. If any revisions are required, please edit the "Technical Notes" field
    accordingly. All revisions will be proofread by the Engineering Content Services team.
    
    Diffed Contents:
@@ -3,11 +3,10 @@
 The Technical Note template, known as CCFR, is as follows:
 
 Cause
-    What actions or circumstances cause this bug to present.
+    vgextend --restoremissing would have reported success even in case of partial failure of an operation
 Consequence
-    What happens when the bug presents.
+    users of vgextend --restoremissing may be confused by this behaviour
 Fix
-    What was done to fix the bug.
+    change the behaviour to report partial failures
 Result
-    What now happens when the actions or circumstances above occur.
+    a partial failure is reported
-    Note: this is not the same as the bug doesn’t present anymore.

Comment 13 errata-xmlrpc 2011-07-21 10:51:40 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1071.html

Comment 14 errata-xmlrpc 2011-07-21 12:28:59 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1071.html

