| Summary: | Volume group inconsistency issues are causing mirror conversion to fail after leg failure | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Corey Marthaler <cmarthal> |
| Component: | lvm2 | Assignee: | Petr Rockai <prockai> |
| Status: | CLOSED NOTABUG | QA Contact: | Cluster QE <mspqa-list> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | | |
| Version: | 6.3 | CC: | agk, dwysocha, heinzm, jbrassow, mbroz, msnitzer, nperic, prajnoha, prockai, thornber, zkabelac |
| Target Milestone: | rc | Keywords: | Regression |
| Target Release: | --- | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2012-05-08 20:27:00 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
So, I'm about to release 2.02.94 proper. Then, assuming the problem is still present, we'll try to reproduce a simpler version of it and debug it.

I can't explain those test results, I'm afraid. I've tried various things with 2.02.94 here and have failed to reproduce any unexpected behaviour in these parts of the code. (This is non-cmirror, isn't it? And without lvmetad running?) Please check whether or not this is still present in 2.02.94-final.

I'll attempt this again once the 2.02.94 build is out, but I'm still able to reproduce this with the latest pre-release builds. This is w/o cmirror and w/o lvmetad.

```
2.6.32-236.el6.x86_64
lvm2-2.02.95-0.67.el6                      BUILT: Mon Mar  5 08:19:07 CST 2012
lvm2-libs-2.02.95-0.67.el6                 BUILT: Mon Mar  5 08:19:07 CST 2012
lvm2-cluster-2.02.95-0.67.el6              BUILT: Mon Mar  5 08:19:07 CST 2012
udev-147-2.40.el6                          BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.74-0.67.el6             BUILT: Mon Mar  5 08:19:07 CST 2012
device-mapper-libs-1.02.74-0.67.el6        BUILT: Mon Mar  5 08:19:07 CST 2012
device-mapper-event-1.02.74-0.67.el6       BUILT: Mon Mar  5 08:19:07 CST 2012
device-mapper-event-libs-1.02.74-0.67.el6  BUILT: Mon Mar  5 08:19:07 CST 2012
cmirror-2.02.95-0.67.el6                   BUILT: Mon Mar  5 08:19:07 CST 2012
```

```
taft-01 lvm[4466]: Volume group revolution_9 metadata is inconsistent
taft-01 lvm[4466]: Monitoring mirror device revolution_9-mirror_1 for events.
taft-01 xinetd[1649]: EXIT: qarsh status=0 pid=5186 duration=1(sec)
taft-01 lvm[4466]: Volume group revolution_9 metadata is inconsistent
taft-01 lvm[4466]: Monitoring mirror device revolution_9-mirror_1 for events.
taft-01 lvm[4466]: Mirror status: 1 of 3 images failed.
taft-01 lvm[4466]: Trying to up-convert to 3 images, 1 logs.
taft-01 lvm[4466]: Insufficient suitable allocatable extents for logical volume : 512 more required
taft-01 lvm[4466]: Unable to allocate extents for mirror(s).
taft-01 lvm[4466]: Trying to up-convert to 2 images, 1 logs.
taft-01 lvm[4466]: Volume group revolution_9 metadata is inconsistent
taft-01 lvm[4466]: Monitoring mirror device revolution_9-mirror_1 for events.
taft-01 lvm[4466]: WARNING: Failed to replace 1 of 3 images in volume mirror_1
taft-01 lvm[4466]: Repair of mirrored device revolution_9-mirror_1 finished successfully.
taft-01 lvm[4466]: revolution_9-mirror_1 is now in-sync.
taft-01 lvm[4466]: revolution_9-mirror_1 is now in-sync.
```

```
[root@taft-01 tmp]# pvscan
  Couldn't find device with uuid oyeCfu-wDRe-kIhe-YJLx-bolq-e1oQ-IVJCMU.
  PV /dev/sdh1        VG revolution_9   lvm2 [135.66 GiB / 133.66 GiB free]
  PV unknown device   VG revolution_9   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sdf1        VG revolution_9   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sde1        VG revolution_9   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sdd1        VG revolution_9   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sdc1        VG revolution_9   lvm2 [135.66 GiB / 133.66 GiB free]
  PV /dev/sda2        VG vg_taft01      lvm2 [67.75 GiB / 0    free]
  Total: 7 [881.73 GiB] / in use: 7 [881.73 GiB] / in no VG: 0 [0   ]

[root@taft-01 tmp]# lvs -a -o +devices
  Couldn't find device with uuid oyeCfu-wDRe-kIhe-YJLx-bolq-e1oQ-IVJCMU.
  LV                  VG           Attr     LSize Log           Copy%  Devices
  mirror_1            revolution_9 mwi-aom- 2.00g mirror_1_mlog 100.00 mirror_1_mimage_0(0),mirror_1_mimage_2(0)
  [mirror_1_mimage_0] revolution_9 iwi-aom- 2.00g                      /dev/sdc1(0)
  [mirror_1_mimage_2] revolution_9 iwi-aom- 2.00g                      /dev/sdh1(0)
  [mirror_1_mlog]     revolution_9 lwi-aom- 4.00m                      /dev/sdf1(0)

[root@taft-01 tmp]# lvconvert -m 2 revolution_9/mirror_1
  Couldn't find device with uuid oyeCfu-wDRe-kIhe-YJLx-bolq-e1oQ-IVJCMU.
  Cannot change VG revolution_9 while PVs are missing. Consider vgreduce --removemissing.

[root@taft-01 tmp]# vgreduce --removemissing revolution_9
  Couldn't find device with uuid oyeCfu-wDRe-kIhe-YJLx-bolq-e1oQ-IVJCMU.
  Wrote out consistent volume group revolution_9

[root@taft-01 tmp]# lvconvert -m 2 revolution_9/mirror_1
  Couldn't find device with uuid oyeCfu-wDRe-kIhe-YJLx-bolq-e1oQ-IVJCMU.
  WARNING: Inconsistent metadata found for VG revolution_9 - updating to use version 28
  Removing PV unknown device (oyeCfu-wDRe-kIhe-YJLx-bolq-e1oQ-IVJCMU) that no longer belongs to VG revolution_9
  unknown device: Couldn't get size.
  Volume group "revolution_9" not found
```

I cannot reproduce this. Can you try again with the latest kernel/RPMs? It would also be helpful to know what LVM commands would produce the error (i.e., how do I reproduce it without the test suite?).

Both Nenad and I are still able to reproduce this with the latest rpms/kernel.

```
2.6.32-251.el6.x86_64
lvm2-2.02.95-1.el6                      BUILT: Tue Mar  6 10:00:33 CST 2012
lvm2-libs-2.02.95-1.el6                 BUILT: Tue Mar  6 10:00:33 CST 2012
lvm2-cluster-2.02.95-1.el6              BUILT: Tue Mar  6 10:00:33 CST 2012
udev-147-2.40.el6                       BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.74-1.el6             BUILT: Tue Mar  6 10:00:33 CST 2012
device-mapper-libs-1.02.74-1.el6        BUILT: Tue Mar  6 10:00:33 CST 2012
device-mapper-event-1.02.74-1.el6       BUILT: Tue Mar  6 10:00:33 CST 2012
device-mapper-event-libs-1.02.74-1.el6  BUILT: Tue Mar  6 10:00:33 CST 2012
cmirror-2.02.95-1.el6                   BUILT: Tue Mar  6 10:00:33 CST 2012
```

This doesn't happen every time, so your best bet at reproducing it is to run the test suite:

```
./revolution_9 -l /home/msp/cmarthal/work/sts/sts-root -r /usr/tests/sts-rhel6.3 -o taft-01 -i 10 -F -e kill_non_primary_leg
```

Otherwise, the "manual" way to repro this would be to create a 3-way mirror, fail a single device, watch allocation work once the mirror is written to, remove the mirror, and repeat until you see the VG get into a state where allocation will no longer work (a rough scripted sketch of this loop follows the log below).

```
Mar 15 15:26:47 taft-01 lvm[1250]: Volume group revolution_9 metadata is inconsistent
Mar 15 15:26:47 taft-01 lvm[1250]: Monitoring mirror device revolution_9-mirror_1 for events.
Mar 15 15:26:47 taft-01 xinetd[1851]: EXIT: qarsh status=0 pid=4241 duration=1(sec)
Mar 15 15:26:48 taft-01 lvm[1250]: Volume group revolution_9 metadata is inconsistent
Mar 15 15:26:48 taft-01 lvm[1250]: Monitoring mirror device revolution_9-mirror_1 for events.
Mar 15 15:26:48 taft-01 lvm[1250]: Mirror status: 1 of 3 images failed.
Mar 15 15:26:48 taft-01 lvm[1250]: Trying to up-convert to 3 images, 1 logs.
Mar 15 15:26:48 taft-01 lvm[1250]: Insufficient suitable allocatable extents for logical volume : 512 more required
Mar 15 15:26:48 taft-01 lvm[1250]: Unable to allocate extents for mirror(s).
Mar 15 15:26:48 taft-01 lvm[1250]: Trying to up-convert to 2 images, 1 logs.
Mar 15 15:26:49 taft-01 lvm[1250]: Volume group revolution_9 metadata is inconsistent
Mar 15 15:26:49 taft-01 lvm[1250]: Monitoring mirror device revolution_9-mirror_1 for events.
Mar 15 15:26:49 taft-01 lvm[1250]: WARNING: Failed to replace 1 of 3 images in volume mirror_1
Mar 15 15:26:49 taft-01 lvm[1250]: Repair of mirrored device revolution_9-mirror_1 finished successfully.
Mar 15 15:26:49 taft-01 lvm[1250]: revolution_9-mirror_1 is now in-sync.
Mar 15 15:26:49 taft-01 lvm[1250]: revolution_9-mirror_1 is now in-sync.
```
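A rough scripted version of that manual loop. This is a hedged sketch, not the actual STS test: it assumes the VG is named revolution_9, that the leg to fail lands on /dev/sdh1, and that offlining the SCSI device via sysfs is an acceptable stand-in for however the harness kills devices:

```bash
#!/bin/bash
# Sketch of the manual reproduction loop (assumptions noted above).
for i in $(seq 1 20); do
    lvcreate -m 2 -n mirror_1 -L 2G revolution_9       # 3-way mirror

    echo offline > /sys/block/sdh/device/state         # fail one leg

    # Write to the mirror so dmeventd notices the failure and attempts
    # repair; this is the point where allocation is supposed to work.
    dd if=/dev/zero of=/dev/revolution_9/mirror_1 bs=4M count=10 oflag=direct
    sleep 15                                           # give the repair time to finish

    vgs -o vg_name,vg_free_count revolution_9          # watch free extents shrink

    lvremove -f revolution_9/mirror_1

    echo running > /sys/block/sdh/device/state         # bring the device back
    pvscan                                             # rescan before the next pass
done
# Repeat until the repair reports "Insufficient suitable allocatable
# extents", i.e. the VG can no longer allocate.
```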
Ok, I have spent a while looking into this, and I can report these findings:

1) I cannot for the life of me reproduce the problem locally by just repeatedly having a mirror repaired in a VG.

2) It turns out that when I run revolution_9 on taft, the PVs that are disabled by the test stay in a "missing" state (at least *sometimes*, but it seems to me that this happens reliably, every time), even after they are re-enabled.

3) The above failure happens when we actually run out of PVs to use, and hence is not a "mirror repair" bug per se. In this regard, doing a `vgextend --restoremissing` on the re-enabled device (immediately after enabling it, e.g.) will fix the problem.

As far as I can tell, this is not a bug at all (at least not as far as LVM is concerned), unless we want PVs to be automatically re-added to VGs when they have dropped out temporarily (but long enough to be kicked out of mirrors). I think this was never a requirement; rather the contrary: we want replacement of PVs to be manually confirmed (hence, `vgextend --restoremissing`).

So there are two options regarding this bug:

1) Put a `vgextend --restoremissing VG PV` just after "Enabling device PV on localhost" happens in STS, to restore the PV for use (a sketch of this follows below).

2) Change LVM to do this automatically.

For various reasons, I don't like option 2, which comes with non-trivial regression risks *and* I think it's a bad policy (it could complicate post-failure analysis, and possibly even damage data, in real-world failure scenarios). Opinions?

I also prefer the status quo here: let's go down the route of adjusting the test to deal with this situation when it arises.

I will attempt to change the test(s) to deal with this issue.
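A minimal sketch of option 1 for the test harness, assuming the device was offlined via sysfs (the re-enable mechanism is an assumption about STS; `vgextend --restoremissing` itself is the stock LVM command quoted above):

```bash
# Right after the test's "Enabling device sdh on taft-01" step:
echo running > /sys/block/sdh/device/state       # re-enable the device (assumed mechanism)
pvscan                                           # let LVM rediscover the PV
vgextend --restoremissing revolution_9 /dev/sdh1 # hand the PV back to the VG
```

This keeps the manual-confirmation policy intact for real-world failures while letting the test reuse the PV on the next iteration.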
Description of problem:
Nenad Peric came across this issue while running device failure regression tests.

```
Scenario kill_non_primary_leg: Kill non primary leg

********* Mirror info for this scenario *********
* mirrors:          mirror_1
* leg devices:      /dev/sde1 /dev/sdh1 /dev/sdg1
* log devices:      /dev/sde1
* failpv(s):        /dev/sdh1
* failnode(s):      taft-01
* leg fault policy: allocate
* log fault policy: allocate
*************************************************

Mirror Structure(s):
  LV                  Attr     LSize Copy%  Devices
  mirror_1            mwi-aom- 2.00g 100.00 mirror_1_mimage_0(0),mirror_1_mimage_1(0),mirror_1_mimage_2(0)
  [mirror_1_mimage_0] iwi-aom- 2.00g        /dev/sde1(0)
  [mirror_1_mimage_1] iwi-aom- 2.00g        /dev/sdh1(0)
  [mirror_1_mimage_2] iwi-aom- 2.00g        /dev/sdg1(0)
  [mirror_1_mlog]     lwi-aom- 4.00m        /dev/sde1(512)

PVS IN VG: /dev/sdc1 /dev/sdd1 /dev/sde1 /dev/sdf1 /dev/sdg1 /dev/sdh1
PV=/dev/sdh1  mirror_1_mimage_1: 4.1
PV=/dev/sdh1  mirror_1_mimage_1: 4.1

Writing verification files (checkit) to mirror(s) on...
  ---- taft-01 ----
Sleeping 10 seconds to get some outstanding EXT I/O locks before the failure
Verifying files (checkit) on mirror(s) on...
  ---- taft-01 ----

Disabling device sdh on taft-01

Attempting I/O to cause mirror down conversion(s) on taft-01
10+0 records in
10+0 records out
41943040 bytes (42 MB) copied, 0.274829 s, 153 MB/s

Verifying current sanity of lvm after the failure

Mirror Structure(s):
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
  LV                  Attr     LSize Copy%  Devices
  mirror_1            mwi-aom- 2.00g 100.00 mirror_1_mimage_0(0),mirror_1_mimage_2(0)
  [mirror_1_mimage_0] iwi-aom- 2.00g        /dev/sde1(0)
  [mirror_1_mimage_2] iwi-aom- 2.00g        /dev/sdg1(0)
  [mirror_1_mlog]     lwi-aom- 4.00m        /dev/sde1(512)

Verify that each of the mirror repairs finished successfully
Verifying FAILED device /dev/sdh1 is *NOT* in the volume(s)
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
olog: 1
Verifying LOG device(s) /dev/sde1 *ARE* in the mirror(s)
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
Verifying LEG device /dev/sde1 *IS* in the volume(s)
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
Verifying LEG device /dev/sdg1 *IS* in the volume(s)
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.

verify the dm devices associated with /dev/sdh1 have been removed as expected
Checking REMOVAL of mirror_1_mimage_1 on: taft-01

verify the newly allocated dm devices were added as a result of the failures
Checking EXISTENCE of mirror_1_mimage_3 on: taft-01
mirror_1_mimage_3 on taft-01 should now exist
```

Here's where lvm attempts the up-conversion:

```
Mar  1 10:01:47 taft-01 lvm[3060]: Volume group revolution_9 metadata is inconsistent
Mar  1 10:01:47 taft-01 lvm[3060]: Monitoring mirror device revolution_9-mirror_1 for events.
Mar  1 10:01:47 taft-01 lvm[3060]: Volume group revolution_9 metadata is inconsistent
Mar  1 10:01:47 taft-01 lvm[3060]: Monitoring mirror device revolution_9-mirror_1 for events.
Mar  1 10:01:47 taft-01 lvm[3060]: Mirror status: 1 of 3 images failed.
Mar  1 10:01:47 taft-01 lvm[3060]: Trying to up-convert to 3 images, 1 logs.
Mar  1 10:01:48 taft-01 lvm[3060]: Insufficient suitable allocatable extents for logical volume : 512 more required
Mar  1 10:01:48 taft-01 lvm[3060]: Unable to allocate extents for mirror(s).
Mar  1 10:01:48 taft-01 lvm[3060]: Trying to up-convert to 2 images, 1 logs.
Mar  1 10:01:48 taft-01 qarshd[3516]: Running cmdline: dd if=/dev/zero of=/mnt/mirror_1/ddfile count=10 bs=4M
Mar  1 10:01:48 taft-01 lvm[3060]: Volume group revolution_9 metadata is inconsistent
Mar  1 10:01:48 taft-01 lvm[3060]: Monitoring mirror device revolution_9-mirror_1 for events.
Mar  1 10:01:48 taft-01 lvm[3060]: WARNING: Failed to replace 1 of 3 images in volume mirror_1
Mar  1 10:01:48 taft-01 lvm[3060]: Repair of mirrored device revolution_9-mirror_1 finished successfully.
Mar  1 10:01:48 taft-01 lvm[3060]: revolution_9-mirror_1 is now in-sync.
Mar  1 10:01:48 taft-01 lvm[3060]: revolution_9-mirror_1 is now in-sync.
```

I attempted the same thing by hand and it failed for me as well:

```
[root@taft-01 ~]# lvs -a -o +devices
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
  LV                  VG           Attr     LSize Log           Copy%  Devices
  mirror_1            revolution_9 mwi-aom- 2.00g mirror_1_mlog 100.00 mirror_1_mimage_0(0),mirror_1_mimage_2(0)
  [mirror_1_mimage_0] revolution_9 iwi-aom- 2.00g                      /dev/sde1(0)
  [mirror_1_mimage_2] revolution_9 iwi-aom- 2.00g                      /dev/sdg1(0)
  [mirror_1_mlog]     revolution_9 lwi-aom- 4.00m                      /dev/sde1(512)

[root@taft-01 ~]# pvscan
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
  PV unknown device  VG revolution_9   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sdg1       VG revolution_9   lvm2 [135.66 GiB / 133.66 GiB free]
  PV /dev/sdf1       VG revolution_9   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sde1       VG revolution_9   lvm2 [135.66 GiB / 133.66 GiB free]
  PV /dev/sdd1       VG revolution_9   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sdc1       VG revolution_9   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sda2       VG vg_taft01      lvm2 [67.75 GiB / 0    free]
  Total: 7 [881.73 GiB] / in use: 7 [881.73 GiB] / in no VG: 0 [0   ]

[root@taft-01 ~]# lvconvert -m 2 revolution_9/mirror_1
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
  Cannot change VG revolution_9 while PVs are missing. Consider vgreduce --removemissing.

[root@taft-01 ~]# vgreduce --removemissing revolution_9
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
  Wrote out consistent volume group revolution_9

[root@taft-01 ~]# lvconvert -m 2 revolution_9/mirror_1
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
  WARNING: Inconsistent metadata found for VG revolution_9 - updating to use version 19
  Removing PV unknown device (76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo) that no longer belongs to VG revolution_9
  unknown device: Couldn't get size.
  Volume group "revolution_9" not found
```

WHERE DID THE PREVIOUS DEVICES IN THE VG GO!?! These missing devices had been failed in a previous iteration.

```
[root@taft-01 ~]# pvscan
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
  WARNING: Volume Group revolution_9 is not consistent
  PV /dev/sdg1   VG revolution_9   lvm2 [135.66 GiB / 133.66 GiB free]
  PV /dev/sde1   VG revolution_9   lvm2 [135.66 GiB / 133.66 GiB free]
  PV /dev/sda2   VG vg_taft01      lvm2 [67.75 GiB / 0    free]
  Total: 3 [339.08 GiB] / in use: 3 [339.08 GiB] / in no VG: 0 [0   ]

[root@taft-01 ~]# lvs -a -o +devices
  Couldn't find device with uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo.
  WARNING: Inconsistent metadata found for VG revolution_9 - updating to use version 20
  Removing PV unknown device (76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo) that no longer belongs to VG revolution_9
  unknown device: Couldn't get size.
  Recovery of volume group "revolution_9" failed.
  Skipping volume group revolution_9
  LV   VG   Attr   LSize   Pool Origin Data%  Move Log Copy%  Convert Devices
```

Version-Release number of selected component (if applicable):

```
2.6.32-236.el6.x86_64
lvm2-2.02.94-0.61.el6                      BUILT: Thu Mar  1 07:03:29 CST 2012
lvm2-libs-2.02.94-0.61.el6                 BUILT: Thu Mar  1 07:03:29 CST 2012
lvm2-cluster-2.02.94-0.61.el6              BUILT: Thu Mar  1 07:03:29 CST 2012
udev-147-2.40.el6                          BUILT: Fri Sep 23 07:51:13 CDT 2011
device-mapper-1.02.73-0.61.el6             BUILT: Thu Mar  1 07:03:29 CST 2012
device-mapper-libs-1.02.73-0.61.el6        BUILT: Thu Mar  1 07:03:29 CST 2012
device-mapper-event-1.02.73-0.61.el6       BUILT: Thu Mar  1 07:03:29 CST 2012
device-mapper-event-libs-1.02.73-0.61.el6  BUILT: Thu Mar  1 07:03:29 CST 2012
cmirror-2.02.94-0.61.el6                   BUILT: Thu Mar  1 07:03:29 CST 2012
```

How reproducible:
Most of the time
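For anyone who lands in the `Volume group "revolution_9" not found` state shown above, the usual way out is to restore the metadata from LVM's automatic archives. This is a hedged sketch only, not something verified against this exact failure; the archive filename and replacement device below are placeholders, not values from this report:

```bash
# List the archived metadata for the VG (archives live in /etc/lvm/archive by default):
vgcfgrestore --list revolution_9

# Recreate the lost PV with its old UUID so the archived metadata matches
# (/dev/sdXN and the archive name are placeholders):
pvcreate --uuid 76Japu-bkd7-ec1J-3q8W-Wa4g-P4mJ-B5JcTo \
         --restorefile /etc/lvm/archive/revolution_9_XXXXX.vg /dev/sdXN

# Restore the VG metadata itself:
vgcfgrestore -f /etc/lvm/archive/revolution_9_XXXXX.vg revolution_9
```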