Bug 829221
Summary: RFE: automatically restore PV from MISSING state after it becomes reachable again if it has no active MDA (ignoremetadata is true)
Product: Red Hat Enterprise Linux 6
Reporter: Leonid Natapov <lnatapov>
Component: lvm2
Assignee: Petr Rockai <prockai>
Status: CLOSED ERRATA
QA Contact: Cluster QE <mspqa-list>
Severity: urgent
Docs Contact:
Priority: urgent
Version: 6.2
CC: abaron, agk, bsarathy, cmarthal, coughlan, cpelland, dwysocha, ewarszaw, hateya, heinzm, jbrassow, msnitzer, prajnoha, prockai, thornber, wnefal+redhatbugzilla, yeylon, ykaul, zkabelac
Target Milestone: rc
Keywords: FutureFeature, Reopened, ZStream
Target Release: ---
Hardware: Unspecified
OS: Linux
Whiteboard:
Fixed In Version: lvm2-2.02.98-4.el6
Doc Type: Enhancement
Doc Text:
Feature: Automatically restore PV from MISSING state after it becomes reachable again if it has no active metadata areas.
Reason: In cases of transient inaccessibility of a PV (like with iSCSI or other unreliable transport), LVM would require manual action to restore the PV for use even if there was no room for conflict, because there is no active MDA (metadata area) on the PV.
Result (if any): Manual action is no longer required if the transiently inaccessible PV had no active metadata areas.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-02-21 08:10:29 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 883034, 886216
Attachments:

Description
Leonid Natapov
2012-06-06 09:05:48 UTC
(In reply to comment #0)
> attached logs are:
> /var/log/messages and /etc/lvm/archive

Please attach the logs mentioned above, plus the output of "pvs -vvvv" and the output of the "lsblk" command. Also, try to grab the "sosreport", which might give us more insight into the state of the system. Thanks.

This request was not resolved in time for the current release. Red Hat invites you to ask your support representative to propose this request, if still desired, for consideration in the next release of Red Hat Enterprise Linux.

This request was erroneously removed from consideration in Red Hat Enterprise Linux 6.4, which is currently under development. This request will be evaluated for inclusion in Red Hat Enterprise Linux 6.4.

We need the logs to move forward with this problem. Please, if possible, attach the logs mentioned in comment #3.

Created attachment 598851 [details]
logs
Created attachment 598852 [details]
logs
Created attachment 632877 [details]
logs
I just ran into this problem again, when attempting to remove a VG that was created on a single LUN, then extended to a second LUN.
The scenario used was this:
1. Create a VG on a single LUN.
2. Extend the VG to a second LUN.
3. Close the iSCSI session, then reopen it about a second later.
4. Both PVs are visible; however, the VG has the partial attribute.
Logs attached as per previous comments.
This was reproduced on:
Red Hat Enterprise Virtualization Hypervisor release 6.3 (20120710.0.el6_3)
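
One way to spot the state described in step 4 is to look at the PV attribute string: the third character of pv_attr is "m" for a PV that is flagged MISSING. The following is only a minimal sketch, not something taken from the reporter's setup. The function name missing_pvs is hypothetical, and it assumes input in the two-column form produced by "pvs --noheadings -o pv_name,pv_attr" with PV names that contain no spaces.

```shell
# Hypothetical helper: read "pv_name pv_attr" pairs on stdin and print the
# names of PVs whose third attribute character is "m" (flagged MISSING).
missing_pvs() {
    awk 'substr($2, 3, 1) == "m" { print $1 }'
}

# Intended use on a live system (needs root and lvm2, not executed here):
#   pvs --noheadings -o pv_name,pv_attr | missing_pvs
```

An empty result would suggest that no PV currently carries the MISSING flag.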
Just so we're on the same page here:

in step 1, you created the 013d86f5-5c0e-4f2c-b60a-4d0117ade7df VG on top of /dev/mapper/1qe-storage_sanity11340729

in step 2, you grew the VG by adding /dev/mapper/1qe-storage_sanity_ext1340730

correct?

In step 4, the /dev/mapper/1qe-storage_sanity_ext1340730 PV is in the missing state

/dev/mapper/1qe-storage_sanity_ext1340730 013d86f5-5c0e-4f2c-b60a-4d0117ade7df lvm2 a-m 29.62g 29.62g 30.00g aNTqKL-HLBz-JYAI-KbDP-cqGo-PxAB-Y244bu

Can you run

# multipath -ll

both before and after step 3. I assume that like you say, both of these will show a working multipath device.

Looking at the messages, I can see

Oct 23 08:51:38 cyan-vdsh multipathd: sdb: remove path (uevent)
Oct 23 08:51:38 cyan-vdsh multipathd: 1qe-storage_sanity_ext1340730 Last path deleted, disabling queueing
Oct 23 08:51:38 cyan-vdsh multipathd: 1qe-storage_sanity_ext1340730: devmap removed
Oct 23 08:51:38 cyan-vdsh multipathd: 1qe-storage_sanity_ext1340730: stop event checker thread (139916255160064)
Oct 23 08:51:38 cyan-vdsh multipathd: 1qe-storage_sanity_ext1340730: removed map after removing all paths
...
Oct 23 08:52:13 cyan-vdsh multipathd: sdc: add path (uevent)
Oct 23 08:52:13 cyan-vdsh multipathd: 1qe-storage_sanity_ext1340730: load table [0 62914560 multipath 0 0 1 1 round-robin 0 1 1 8:32 1]
Oct 23 08:52:13 cyan-vdsh multipathd: 1qe-storage_sanity_ext1340730: event checker started
Oct 23 08:52:13 cyan-vdsh multipathd: sdc path added to devmap 1qe-storage_sanity_ext1340730

So, you lost your last path, and the multipath device wasn't open, so it got deleted. Less than a minute later, the path re-appears, and the multipath device gets recreated. The reason that 1qe-storage_sanity11340729 isn't also marked as missing is that the device is in-use, so it doesn't get freed on the last delete.

Oct 23 08:51:38 cyan-vdsh multipathd: sda: remove path (uevent)
Oct 23 08:51:38 cyan-vdsh multipathd: 1qe-storage_sanity11340729 Last path deleted, disabling queueing
Oct 23 08:51:38 cyan-vdsh multipathd: 1qe-storage_sanity11340729: map in use
Oct 23 08:51:38 cyan-vdsh multipathd: 1qe-storage_sanity11340729: can't flush
Oct 23 08:51:38 cyan-vdsh multipathd: flush_on_last_del in progress
Oct 23 08:51:38 cyan-vdsh multipathd: 1qe-storage_sanity11340729: load table [0 62914560 multipath 0 0 0 0]
Oct 23 08:51:38 cyan-vdsh multipathd: sda: path removed from map 1qe-storage_sanity11340729

It looks like LVM is having a problem with clearing out the MISSING flag on PVs that disappear and then reappear. I don't think this is multipath specific. Petr, I'm kicking this over to you since it looks similar to Bug 537913. If the multipath -ll output shows that the multipath device is really not usable after step 3, or if there's some other multipath issue I missed, you can kick it back.

Basically, as I believe the report is about a PV actually disappearing and coming back, this is not a bug but actually a feature. If VG metadata changes while a PV is missing, that PV needs to be re-added by issuing "vgextend --restoremissing VG PV". This is to avoid accidental (and automatic) corruption in cases where the PV was modified (e.g. by being available to another machine) while missing. If RHEV can reasonably assume that the PV that went offline and came back is in a state where it can be automatically made available to the VG again, it should issue vgextend --restoremissing on such PVs.

Reopening and marking as rfe as LVM can determine this automatically in this case.

So what are you asking for?

A new configuration setting in LVM to make it *assume* that any MISSING device was not written to while it was missing? RHEV would have to take the responsibility for ensuring that condition was always true and supporting users.

We would state clearly that lvm-team will not support situations caused by incorrect use of the setting and it should only be used in a fully-controlled environment where the necessary "no split write" condition can be guaranteed through other control mechanisms.

(In reply to comment #14)
> So what are you asking for?
>
> A new configuration setting in LVM to make it *assume* that any MISSING
> device was not written to while it was missing? RHEV would have to take the
> responsibility for ensuring that condition was always true and supporting
> users.
>
> We would state clearly that lvm-team will not support situations caused by
> incorrect use of the setting and it should only be used in a
> fully-controlled environment where the necessary "no split write" condition
> can be guaranteed through other control mechanisms.

Since the missing pv doesn't have any active mda (which users could alter) what changes are you protecting against?
The changes I can think of include:
1. the pv being added to another VG, which actually goes back to a different rfe I've opened a long time ago which was that LVM 'mark' each pv as belonging to a certain vg even if it doesn't have an active vg mda.
2. running 'pvcreate' (also covered by VG uuid appearing in pv md)
3. just writing data directly on the pv (this could happen to any pv regardless of whether it is missing or not so imo it's not interesting)

Also, iiuc the pv would not require manual addition if there are no vg changes while it is gone, despite the fact that all the changes above could happen to it in the same way, so I'm not quite sure I understand what it is you're protecting against.

Alasdair,

what is being asked for is to detect "safe" situations automatically. It's not entirely easy, but it should be possible, at least for some of the safe situations. This basically means that for every PV that is MISSING but we have found, verify that it has no active MDA and if this is the case, clear the missing flag.
I'll have to think a bit more whether we are opening any holes, but it seems to be safe.

Ayal,

if there are no VG changes, there is no room for conflict; if the locally-missing PV had a metadata update, it marks the locally-available PVs as MISSING in that update. In other words, the MISSING situation is basically symmetric, whatever one side sees is MISSING on the other. Either "side" of the split updating the VG while the other is gone will prompt recovery.

(In reply to comment #16)
> Alasdair,
>
> what is being asked for is to detect "safe" situations automatically. It's
> not entirely easy, but it should be possible, at least for some of the safe
> situations. This basically means that for every PV that is MISSING but we
> have found, verify that it has no active MDA and if this is the case, clear
> the missing flag. I'll have to think a bit more whether we are opening any
> holes, but it seems to be safe.
>
> Ayal,
>
> if there are no VG changes, there is no room for conflict; if the
> locally-missing PV had a metadata update, it marks the locally-available PVs
> as MISSING in that update. In other words, the MISSING situation is
> basically symmetric, whatever one side sees is MISSING on the other. Either
> "side" of the split updating the VG while the other is gone will prompt
> recovery.

got it. as mentioned, I'm talking about "safe" situations where the missing pv did not previously have any active mda nor does it have it after reappearing.

This issue needs to be resolved in RHEL6.4 and in RHEL6.3.z. RHEV 3.1 would like to take this fix in post GA, in RHEV 3.1.z. This bug needs devel and qe acks.

A fix will land upstream shortly (assuming tests pass), as 09d77d0..60668f8. Therefore, devel_ack+. I'll POST this bug as soon as the fix is actually upstream.

Tests passed, including the one for this specific feature. The relevant upstream commit is 60668f823e830ce39e452234996910c51728aa76.

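
The rule discussed in this thread (a PV that is flagged MISSING but has been found again can have the flag cleared if it has no active MDA, otherwise "vgextend --restoremissing" is still needed) can be illustrated from outside LVM. This is only a sketch of the decision, not the code from the upstream commit, which lives inside lvm2 itself. The function name classify_missing_pvs and the labels "auto-restore" and "manual" are hypothetical, and the input is assumed to be three columns in the form produced by "pvs --noheadings -o pv_name,pv_attr,pv_mda_used_count", with PV names that contain no spaces.

```shell
# Hypothetical helper: read "pv_name pv_attr pv_mda_used_count" triples on
# stdin and, for each PV flagged MISSING, report how it would be handled
# under the rule discussed in this thread.
classify_missing_pvs() {
    awk '
        substr($2, 3, 1) != "m" { next }            # not flagged MISSING: skip
        $3 == 0 { print $1, "auto-restore"; next }  # no active MDA: flag can be cleared
        { print $1, "manual" }                      # active MDA: vgextend --restoremissing
    '
}

# Intended use on a live system (needs root and lvm2, not executed here):
#   pvs --noheadings -o pv_name,pv_attr,pv_mda_used_count | classify_missing_pvs
```

With a fixed lvm2 no such script should be needed, since the flag is cleared by LVM on the next metadata write; the sketch only makes the condition explicit.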
Regarding QA, this is how I test the feature in upstream suite (disable a device with no MDA, do a write operation that is legal on partial VGs -- in this case, lvremove, triggering a metadata write with MISSING flag for the disabled device, then re-enabling the device and checking that a write operation succeeds, wiping the MISSING flag):

. lib/test

aux prepare_vg 3
pvchange --metadataignore y $dev1

lvcreate -m 1 -l 1 -n mirror $vg
lvchange -a n $vg/mirror
lvcreate -l 1 -n lv1 $vg "$dev1"

# try to just change metadata; we expect the new version (with MISSING_PV set
# on the reappeared volume) to be written out to the previously missing PV

aux disable_dev "$dev1"
lvremove $vg/mirror
not vgck $vg 2>&1 | tee log
grep "missing 1 physical volume" log
not lvcreate -m 1 -l 1 -n mirror $vg # write operations fail

aux enable_dev "$dev1"
lvcreate -m 1 -l 1 -n mirror $vg # no MDA => automatically restored
vgck $vg

Fix verified in the latest rpms.

2.6.32-343.el6.x86_64

lvm2-2.02.98-4.el6 BUILT: Wed Dec 5 08:35:04 CST 2012
lvm2-libs-2.02.98-4.el6 BUILT: Wed Dec 5 08:35:04 CST 2012
lvm2-cluster-2.02.98-4.el6 BUILT: Wed Dec 5 08:35:04 CST 2012
udev-147-2.43.el6 BUILT: Thu Oct 11 05:59:38 CDT 2012
device-mapper-1.02.77-4.el6 BUILT: Wed Dec 5 08:35:04 CST 2012
device-mapper-libs-1.02.77-4.el6 BUILT: Wed Dec 5 08:35:04 CST 2012
device-mapper-event-1.02.77-4.el6 BUILT: Wed Dec 5 08:35:04 CST 2012
device-mapper-event-libs-1.02.77-4.el6 BUILT: Wed Dec 5 08:35:04 CST 2012
cmirror-2.02.98-4.el6 BUILT: Wed Dec 5 08:35:04 CST 2012

The mirror failure test cases no longer need to 'vgreduce --removemissing' and recreate the failed PVs when there is no MDA area on the failed device.

================================================================================
Iteration 10.1 started at Tue Dec 11 13:07:59 CST 2012
================================================================================
Scenario kill_no_mda_primary_2_legs: Kill primary leg containing *no* MDA of synced 2 leg mirror(s)

********* Mirror hash info for this scenario *********
* names: no_mda_primary_2legs_1
* sync: 1
* striped: 0
* leg devices: /dev/sdd1 /dev/sdf1
* log devices: /dev/sdg1
* no MDA devices: /dev/sdd1
* failpv(s): /dev/sdd1
* failnode(s): taft-01
* leg fault policy: remove
* log fault policy: remove
******************************************************

================================================================================
Iteration 10.1 started at Tue Dec 11 13:07:59 CST 2012
================================================================================
Scenario kill_no_mda_primary_2_legs: Kill primary leg containing *no* MDA of synced 2 leg mirror(s)

********* Mirror hash info for this scenario *********
* names: no_mda_primary_2legs_1
* sync: 1
* striped: 0
* leg devices: /dev/sdd1 /dev/sdf1
* log devices: /dev/sdg1
* no MDA devices: /dev/sdd1
* failpv(s): /dev/sdd1
* failnode(s): taft-01
* leg fault policy: allocate
* log fault policy: allocate
******************************************************

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2013-0501.html