Description of problem:
The following command passed when I think it should have failed.

taft-01: vgextend --restoremissing helter_skelter /dev/sdc1 /dev/sdf1
  WARNING: Inconsistent metadata found for VG helter_skelter - updating to use version 493
  Missing device /dev/sdc1 reappeared, updating metadata for VG helter_skelter to version 493.
  Device still marked missing because of allocated data on it, remove volumes and consider vgreduce --removemissing.
  Removing PV /dev/sdf1 (CYqDST-Q8GN-8aRy-TZ09-aKoR-hmmf-ScWICU) that no longer belongs to VG helter_skelter
  WARNING: PV /dev/sdf1 not found in VG helter_skelter

From the log:
Nov 15 13:47:42 taft-01 qarshd[18718]: Running cmdline: vgextend --restoremissing helter_skelter /dev/sdc1 /dev/sdf1
Nov 15 13:47:44 taft-01 xinetd[6233]: EXIT: qarsh status=0 pid=18718 duration=2(sec)

Version-Release number of selected component (if applicable):
2.6.18-227.el5

lvm2-2.02.74-3.el5                  BUILT: Thu Nov 11 02:56:33 CST 2010
lvm2-cluster-2.02.74-3.el5          BUILT: Tue Nov 9 08:01:59 CST 2010
device-mapper-1.02.55-2.el5         BUILT: Tue Nov 9 06:41:00 CST 2010
device-mapper-event-1.02.55-2.el5   BUILT: Tue Nov 9 06:41:00 CST 2010
cmirror-1.1.39-10.el5               BUILT: Wed Sep 8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5           BUILT: Tue Dec 22 13:39:47 CST 2009

How reproducible:
Often
When run on the individual failed devices, the second cmd fails because it appears the first cmd already dealt with and removed the second device. Am I using this command wrong?

taft-01: vgextend --restoremissing helter_skelter /dev/sdc1
  WARNING: Inconsistent metadata found for VG helter_skelter - updating to use version 95
  Missing device /dev/sdc1 reappeared, updating metadata for VG helter_skelter to version 95.
  Device still marked missing because of allocated data on it, remove volumes and consider vgreduce --removemissing.
  Removing PV /dev/sdf1 (thO4WT-KElb-9fkg-CpLE-gxnw-Ktlv-nwYBXc) that no longer belongs to VG helter_skelter

taft-01: vgextend --restoremissing helter_skelter /dev/sdf1
  WARNING: PV /dev/sdf1 not found in VG helter_skelter
  No PV has been restored.
vgextend --restoremissing didn't work on taft-01
Without a solution to this issue, the 'additional stripe containing one of the devices being failed' (to further stress mirror device failure) will have to be turned off for 5.6.
This is indeed all a bit confusing. A breakdown:

- /dev/sdf1 has been removed by vgreduce --removemissing (run automatically by dmeventd) while it was actually missing. When you run vgextend --restoremissing /dev/sdc1, the generic metadata reading code notices that sdf1 carries an old copy of the metadata and that the new copy says sdf1 is no longer in the VG, so it is kicked out. Any command that writes metadata would do this; it is in no way specific to vgextend --restoremissing, which just happened to be the first command to run after the device came back.

- /dev/sdc1: this device was not removed, because it actually had some data on it when it went away, and that data was not part of a mirror that could be repaired. The "Missing device ... reappeared" and "Device still marked missing..." messages are misleading in this case: they come from an automated attempt by the metadata code (again) that is run with every command. It would actually be good to suppress this when running vgextend --restoremissing. In fact, after these messages, vgextend --restoremissing runs and fixes up /dev/sdc1.

Overall, after the command your VG should be in proper working order. It is just the messages that are confusing. I'll look into fixing that. I believe it would be OK to skip the reappearance test completely if we have handles_missing_pvs set... I'll send a patch to do just that; it should fix this case and probably some other confusing messages.
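To make the /dev/sdf1 case above easier to follow, here is a toy model in Python of the reconciliation step described in this comment. It is only an illustration, not LVM code: the PVCopy class, the reconcile function, the member sets and the stale version number 490 are all made up for the example. Only the device names and version 493 come from the report.

```python
from dataclasses import dataclass


@dataclass
class PVCopy:
    """Metadata copy stored on one physical volume (hypothetical toy model)."""
    name: str
    seqno: int          # metadata version written on this PV
    members: frozenset  # PV names the VG contained at that version


def reconcile(copies):
    """Pick the newest metadata copy and split PVs into kept and dropped."""
    newest = max(copies, key=lambda c: c.seqno)
    kept = sorted(c.name for c in copies if c.name in newest.members)
    dropped = sorted(c.name for c in copies if c.name not in newest.members)
    return newest.seqno, kept, dropped


# /dev/sdb1 stayed online and holds the newest metadata, which no longer
# lists /dev/sdf1. The two returning devices hold an older copy
# (490 is an invented number for illustration).
copies = [
    PVCopy("/dev/sdb1", 493, frozenset({"/dev/sdb1", "/dev/sdc1"})),
    PVCopy("/dev/sdc1", 490, frozenset({"/dev/sdb1", "/dev/sdc1", "/dev/sdf1"})),
    PVCopy("/dev/sdf1", 490, frozenset({"/dev/sdb1", "/dev/sdc1", "/dev/sdf1"})),
]
seqno, kept, dropped = reconcile(copies)
print(seqno, kept, dropped)
```

In this model /dev/sdf1 ends up in the dropped list regardless of which command triggered the read, which matches the point that the removal is not specific to vgextend --restoremissing.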
The proposed patch (see my last comment) has been checked in upstream.
Fixed in lvm2-2.02.84-1.el5
This appears to be fixed now. That said, there's a caveat here because this cmd is no longer required during mirror device failure testing, so it's no longer run as a part of our regular regression testing.

[root@taft-01 mnt]# pvscan
  WARNING: Volume Group helter_skelter is not consistent
  PV /dev/sdb1   VG helter_skelter   lvm2 [135.66 GB / 135.18 GB free]
  PV /dev/sde1   VG helter_skelter   lvm2 [135.66 GB / 135.18 GB free]
  PV /dev/sdf1   VG helter_skelter   lvm2 [135.66 GB / 135.18 GB free]
  PV /dev/sdg1   VG helter_skelter   lvm2 [135.66 GB / 135.66 GB free]
  PV /dev/sdh1   VG helter_skelter   lvm2 [135.66 GB / 135.66 GB free]
  PV /dev/sda2   VG VolGroup00       lvm2 [68.12 GB / 0 free]
  Total: 6 [746.45 GB] / in use: 6 [746.45 GB] / in no VG: 0 [0 ]

[root@taft-01 mnt]# vgextend --restoremissing helter_skelter /dev/sdd1 /dev/sdc1
  WARNING: Inconsistent metadata found for VG helter_skelter - updating to use version 12
  Removing PV /dev/sdc1 (Z7T72D-7K4t-b6W1-aO5j-RHgJ-bV6b-0jOcwg) that no longer belongs to VG helter_skelter
  Removing PV /dev/sdd1 (JN45DR-H1YZ-edJ0-ujtc-BYQK-FPsm-FL2K2N) that no longer belongs to VG helter_skelter
  WARNING: PV /dev/sdd1 not found in VG helter_skelter
  WARNING: PV /dev/sdc1 not found in VG helter_skelter
  No PV has been restored.

2.6.18-256.el5

lvm2-2.02.84-2.el5                  BUILT: Wed Mar 23 07:18:08 CDT 2011
lvm2-cluster-2.02.84-2.el5          BUILT: Wed Mar 23 07:19:43 CDT 2011
device-mapper-1.02.63-2.el5         BUILT: Fri Mar 4 10:23:17 CST 2011
device-mapper-event-1.02.63-2.el5   BUILT: Fri Mar 4 10:23:17 CST 2011
cmirror-1.1.39-10.el5               BUILT: Wed Sep 8 16:32:05 CDT 2010
kmod-cmirror-0.1.22-3.el5           BUILT: Tue Dec 22 13:39:47 CST 2009
Technical note added. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

New Contents:
This field is the basis of the errata or release note for this bug. It can also be used for change logs.

The Technical Note template, known as CCFR, is as follows:

Cause
    What actions or circumstances cause this bug to present.
Consequence
    What happens when the bug presents.
Fix
    What was done to fix the bug.
Result
    What now happens when the actions or circumstances above occur.
Note: this is not the same as the bug doesn’t present anymore.
Technical note updated. If any revisions are required, please edit the "Technical Notes" field accordingly. All revisions will be proofread by the Engineering Content Services team.

Diffed Contents:
@@ -3,11 +3,10 @@
 The Technical Note template, known as CCFR, is as follows:
 Cause
-    What actions or circumstances cause this bug to present.
+    vgextend --restoremissing would have reported success even in case of partial failure of an operation
 Consequence
-    What happens when the bug presents.
+    users of vgextend --restoremissing may be confused by this behaviour
 Fix
-    What was done to fix the bug.
+    change the behaviour to report partial failures
 Result
-    What now happens when the actions or circumstances above occur.
+    a partial failure is reported
-Note: this is not the same as the bug doesn’t present anymore.
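As a rough illustration of the Fix/Result entries above, the following Python sketch models a command that restores several PVs and reports failure if any one of them could not be restored. This is an assumption-laden toy, not the lvm2 patch: the restore_missing function, its arguments and the 0/1 status values are hypothetical, and treating a PV that was never marked missing as a failure is likewise an assumption of the sketch.

```python
def restore_missing(vg_members, missing, requested):
    """Toy model: try to restore each requested PV.

    Returns (status, restored, failed). Status is 0 only when every
    requested PV was restored; 1 otherwise (values are illustrative).
    """
    missing = set(missing)  # work on a copy, leave the caller's set alone
    restored, failed = [], []
    for pv in requested:
        if pv in vg_members and pv in missing:
            missing.discard(pv)
            restored.append(pv)
        else:
            # Not in the VG at all, or not marked missing (an assumption).
            failed.append(pv)
    status = 0 if not failed else 1
    return status, restored, failed


# Scenario from the original report: sdc1 is still in the VG and marked
# missing, sdf1 has already been removed from the VG.
status, restored, failed = restore_missing(
    vg_members={"/dev/sdb1", "/dev/sdc1"},
    missing={"/dev/sdc1"},
    requested=["/dev/sdc1", "/dev/sdf1"],
)
print(status, restored, failed)
```

Under the old behaviour described in the Cause entry, the equivalent of status would have been 0 here even though /dev/sdf1 was not restored.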
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2011-1071.html