Bug 608879

Summary: clvmd causing problems in/after fault scenarios
Product: Red Hat Enterprise Linux 6
Component: lvm2
Version: 6.0
Reporter: Jonathan Earl Brassow <jbrassow>
Assignee: Milan Broz <mbroz>
QA Contact: Corey Marthaler <cmarthal>
CC: agk, dwysocha, heinzm, jbrassow, joe.thornber, mbroz, prajnoha, prockai, pvrabec
Status: CLOSED CURRENTRELEASE
Severity: medium
Priority: low
Target Milestone: rc
Hardware: All
OS: Linux
Fixed In Version: lvm2-2.02.69-2.el6
Doc Type: Bug Fix
Last Closed: 2010-11-10 21:08:16 UTC

Description Jonathan Earl Brassow 2010-06-28 19:51:10 UTC
I'm not sure whether this is due to how clvmd caches metadata or perhaps some funny udev problem...  I do know that it is causing a lot of problems with the automated testing.

The scenario is:
1) create cluster mirror
2) kill a device that a leg is on
3) write to device to force dmeventd to do repair
4) bring device back
5) attempt to convert back to mirror [fails]
*) repeating #5 does not help
**) restarting clvmd (on all machines) makes #5 work

I'm not sure if dmeventd is required in these steps.  Perhaps you could replace #3 with a simple call to 'lvconvert --repair'.
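
For instance, a variant of the scenario without dmeventd might look like this (an untested sketch; device names match the transcript below, and 'off.sh'/'on.sh' would be run on all machines):

  lvcreate -m1 -L 500M -n lv vg
  off.sh sdc                    # kill the device under one mirror leg
  lvconvert --repair vg/lv      # repair by hand instead of dd + dmeventd
  on.sh sdc                     # bring the device back
  lvconvert -m1 vg/lv           # attempt to convert back to a mirror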

Here are the exact steps to reproduce (every time):
[root@bp-01 ~]# lvcreate -m1 -L 500M -n lv vg
  Logical volume "lv" created
[root@bp-01 ~]# devices
  LV            Copy%  Devices
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]        /dev/sdb1(0)
  [lv_mimage_1]        /dev/sdc1(0)
  [lv_mlog]            /dev/sdi1(0)
[root@bp-01 ~]# off.sh sdc <-- done on all machines
Turning off sdc
[root@bp-01 ~]# # 'dd if=/dev/zero of=/dev/vg/lv bs=4M count=1' on bp-02
[root@bp-01 ~]# devices
  Couldn't find device with uuid zwxlQu-3hG3-RhIu-izSu-z8gB-Onfy-hfwH77.
  LV      Copy%  Devices
  lv             /dev/sdb1(0)
[root@bp-01 ~]# on.sh sdc <-- done on all machines
Turning on sdc
[root@bp-01 ~]# pvscan
  WARNING: Volume Group vg is not consistent
  PV /dev/sdb1   VG vg        lvm2 [233.75 GiB / 233.27 GiB free]
  PV /dev/sdc1   VG vg        lvm2 [233.75 GiB / 233.75 GiB free]
  PV /dev/sdd1   VG vg        lvm2 [233.75 GiB / 233.75 GiB free]
  PV /dev/sde1   VG vg        lvm2 [233.75 GiB / 233.75 GiB free]
  PV /dev/sdf1   VG vg        lvm2 [233.75 GiB / 233.75 GiB free]
  PV /dev/sdg1   VG vg        lvm2 [233.75 GiB / 233.75 GiB free]
  PV /dev/sdh1   VG vg        lvm2 [233.75 GiB / 233.75 GiB free]
  PV /dev/sdi1   VG vg        lvm2 [233.75 GiB / 233.75 GiB free]
  PV /dev/sda2   VG vg_bp01   lvm2 [148.52 GiB / 0    free]
  Total: 9 [1.97 TiB] / in use: 9 [1.97 TiB] / in no VG: 0 [0   ]
[root@bp-01 ~]# lvconvert -m1 vg/lv
  WARNING: Inconsistent metadata found for VG vg - updating to use version 19
  Missing device /dev/sdc1 reappeared, updating metadata for VG vg to version 19.
  /dev/vg/lv_mlog: not found: device not cleared
  Aborting. Failed to wipe mirror log.
  Failed to initialise mirror log.
[root@bp-01 ~]# lvconvert -m1 vg/lv
  /dev/vg/lv_mlog: not found: device not cleared
  Aborting. Failed to wipe mirror log.
  Failed to initialise mirror log.
[root@bp-01 ~]# killall clvmd
[root@bp-01 ~]# clvmd
[root@bp-01 ~]# lvconvert -m1 vg/lv
  vg/lv: Converted: 1.6%
  vg/lv: Converted: 95.2%
  vg/lv: Converted: 100.0%
[root@bp-01 ~]# devices
  LV            Copy%  Devices
  lv            100.00 lv_mimage_0(0),lv_mimage_1(0)
  [lv_mimage_0]        /dev/sdb1(0)
  [lv_mimage_1]        /dev/sdc1(0)
  [lv_mlog]            /dev/sdi1(0)
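
A note on the helpers used above: 'devices' appears to be shorthand for an lvs report showing LV, Copy%, and Devices, and 'off.sh'/'on.sh' are local scripts that disable/re-enable a disk. Their contents aren't included here, but the same effect can be had with something like this hypothetical equivalent using the SCSI sysfs state attribute (run on every node):

  echo offline > /sys/block/sdc/device/state   # take the device away
  echo running > /sys/block/sdc/device/state   # bring it back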

Comment 2 RHEL Program Management 2010-06-28 20:02:57 UTC
This request was evaluated by Red Hat Product Management for inclusion in a Red
Hat Enterprise Linux major release.  Product Management has requested further
review of this request by Red Hat Engineering, for potential inclusion in a Red
Hat Enterprise Linux Major release.  This request is not yet committed for
inclusion.

Comment 3 Milan Broz 2010-06-29 10:23:50 UTC
I think this is a dup of bug #595523 (still unresolved).

Which version of the lvm rpm is that?

Comment 4 Milan Broz 2010-06-29 14:58:37 UTC
OK, I can reproduce it (with lvm2-2.02.68-1.el6).

It works properly with a local mirror, so the problem is somewhere in updating the metadata cache on the remote nodes after the PV reappears.
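
If a stale cache on the remote nodes is the issue, a manual refresh may work around it until a fix lands; clvmd has a flag for that (assuming this build supports it):

  clvmd -R    # ask all running clvmds in the cluster to reload their device cache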

Comment 5 Milan Broz 2010-07-01 21:05:02 UTC
Fixed upstream: an unbalanced memlock (incremented without a matching decrement) left clvmd believing it was still in a critical section, so it skipped rescans.

Comment 7 Corey Marthaler 2010-08-13 21:05:55 UTC
VG reconfiguration test cases (same as in bug 595523) now pass. Marking this verified.

Comment 8 releng-rhel@redhat.com 2010-11-10 21:08:16 UTC
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.