Bug 612291 - dm devices associated with split off mirror images are not removed
Status: CLOSED CURRENTRELEASE
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.0
Hardware: All Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assigned To: Jonathan Earl Brassow
QA Contact: Corey Marthaler
Keywords: TestBlocker
Depends On:
Blocks:
Reported: 2010-07-07 15:23 EDT by Corey Marthaler
Modified: 2010-11-10 16:08 EST (History)
Fixed In Version: lvm2-2.02.72-8.el6
Doc Type: Bug Fix
Last Closed: 2010-11-10 16:08:22 EST
Description Corey Marthaler 2010-07-07 15:23:19 EDT
Description of problem:

[root@taft-01 ~]# lvs -a -o +devices
  LV                VG        Attr   LSize   Log         Copy%  Devices
  mirror            taft      mwi-a- 100.00m mirror_mlog 100.00 mirror_mimage_0(0),mirror_mimage_1(0),mirror_mimage_2(0)
  [mirror_mimage_0] taft      iwi-ao 100.00m                    /dev/sdb1(0)
  [mirror_mimage_1] taft      iwi-ao 100.00m                    /dev/sdc1(0)
  [mirror_mimage_2] taft      iwi-ao 100.00m                    /dev/sdd1(0)
  [mirror_mlog]     taft      lwi-ao   4.00m                    /dev/sdh1(0)

[root@taft-01 ~]# lvconvert --splitmirrors 1 --name new taft/mirror
  Logical volume mirror converted.

# The proper LVs show up after the split

[root@taft-01 ~]# lvs -a -o +devices
  LV                VG        Attr   LSize   Log         Copy%  Devices
  mirror            taft      mwi-a- 100.00m mirror_mlog 100.00 mirror_mimage_1(0),mirror_mimage_2(0)
  [mirror_mimage_1] taft      iwi-ao 100.00m                    /dev/sdc1(0)
  [mirror_mimage_2] taft      iwi-ao 100.00m                    /dev/sdd1(0)
  [mirror_mlog]     taft      lwi-ao   4.00m                    /dev/sdh1(0)
  new               taft      -wi-a- 100.00m                    /dev/sdb1(0)

# However, the dm device associated with mimage_0 still exists, and no dm device has been created for the newly split-off linear LV.

[root@taft-01 ~]# dmsetup ls
taft-mirror_mimage_2    (253, 8)
taft-mirror_mimage_1    (253, 5)
taft-mirror     (253, 6)
taft-mirror_mimage_0    (253, 4)
taft-mirror_mlog        (253, 3)

# When the LV/VG is finally removed, the old image still remains, which causes new mirror creation to fail.

Version-Release number of selected component (if applicable):
lvm2-2.02.69-2.el6    BUILT: Fri Jul  2 07:26:01 CDT 2010
lvm2-libs-2.02.69-2.el6    BUILT: Fri Jul  2 07:26:01 CDT 2010
lvm2-cluster-2.02.69-2.el6    BUILT: Fri Jul  2 07:26:01 CDT 2010
udev-147-2.18.el6    BUILT: Fri Jun 11 07:47:21 CDT 2010
device-mapper-1.02.51-2.el6    BUILT: Fri Jul  2 07:26:01 CDT 2010
device-mapper-libs-1.02.51-2.el6    BUILT: Fri Jul  2 07:26:01 CDT 2010
device-mapper-event-1.02.51-2.el6    BUILT: Fri Jul  2 07:26:01 CDT 2010
device-mapper-event-libs-1.02.51-2.el6    BUILT: Fri Jul  2 07:26:01 CDT 2010
cmirror-2.02.69-2.el6    BUILT: Fri Jul  2 07:26:01 CDT 2010

How reproducible:
Every time
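
One way to check for this condition is to compare the names reported by `dmsetup ls` with the names LVM is expected to have active. The following Python is a minimal sketch and not part of lvm2: the helper names (`dm_name`, `parse_dmsetup_ls`, `compare`) are hypothetical, and it assumes that every LV passed in is active and that device-mapper names follow the usual `VG-LV` pattern with any hyphen inside a name doubled.

```python
def dm_name(vg, lv):
    """Build the device-mapper name for an LV: 'VG-LV', inner hyphens doubled."""
    return "{}-{}".format(vg.replace("-", "--"), lv.replace("-", "--"))


def parse_dmsetup_ls(text):
    """Return the set of device names (first column) from `dmsetup ls` output."""
    names = set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line == "No devices found":
            continue
        names.add(line.split()[0])
    return names


def compare(dmsetup_output, vg, active_lvs):
    """Return (stale, missing) dm names for one VG.

    stale:   dm devices in the VG's namespace that no listed LV accounts for
    missing: listed LVs that have no dm device
    LV names may carry the [brackets] that `lvs -a` puts around hidden LVs.
    """
    prefix = vg.replace("-", "--") + "-"
    present = {
        n for n in parse_dmsetup_ls(dmsetup_output)
        # A '-' right after the prefix is an escaped hyphen, i.e. a different VG.
        if n.startswith(prefix) and not n[len(prefix):].startswith("-")
    }
    expected = {dm_name(vg, lv.strip("[]")) for lv in active_lvs}
    return present - expected, expected - present
```

Given the post-split `dmsetup ls` listing and the LV names from the post-split `lvs` output in this description, `compare` would report `taft-mirror_mimage_0` as stale and `taft-new` as missing.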
Comment 1 Corey Marthaler 2010-07-07 15:37:06 EDT
Deactivating and then reactivating the volume group appears to fix this issue.

[root@taft-01 ~]# dmsetup ls
taft-mirror_mimage_2    (253, 8)
taft-mirror_mimage_1    (253, 5)
taft-mirror     (253, 6)
taft-mirror_mimage_0    (253, 4)
taft-mirror_mlog        (253, 3)

[root@taft-01 ~]# vgchange -an taft
  0 logical volume(s) in volume group "taft" now active

[root@taft-01 ~]# vgchange -ay taft
  2 logical volume(s) in volume group "taft" now active

[root@taft-01 ~]# dmsetup ls
taft-mirror_mimage_2    (253, 5)
taft-mirror_mimage_1    (253, 4)
taft-mirror     (253, 6)
taft-mirror_mlog        (253, 3)
taft-new        (253, 7)
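
The workaround above can be scripted. The sketch below is only an illustration under stated assumptions, not a recommended procedure: `reactivate_commands` and `reactivate_vg` are hypothetical helpers, they use only the two `vgchange` invocations shown in this comment, and they deactivate every LV in the volume group, so this is only suitable when nothing in the VG is in use.

```python
import subprocess


def reactivate_commands(vg):
    """Return the two vgchange invocations from the workaround, in order."""
    return [["vgchange", "-an", vg], ["vgchange", "-ay", vg]]


def reactivate_vg(vg, run=subprocess.check_call):
    """Deactivate, then reactivate a VG.

    With the default runner, a failing command raises CalledProcessError,
    so the VG is not reactivated if deactivation failed.
    `run` is injectable so the sequence can be dry-run, e.g. run=print.
    """
    for cmd in reactivate_commands(vg):
        run(cmd)
```

Calling `reactivate_vg("taft", run=print)` prints the two command lists without executing anything.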
Comment 2 Peter Rajnoha 2010-07-09 08:08:51 EDT
Yes, this looks like a missing resume for the new device under its newly assigned name; without the resume, the change never takes effect.

As with bug #612248, the problem is in the _split_mirror_image function. It seems this function really needs inspection and a few fixes :)
Comment 3 Jonathan Earl Brassow 2010-07-12 10:31:08 EDT
Preliminary patch sent to lvm-devel
Comment 4 Jonathan Earl Brassow 2010-07-13 17:55:23 EDT
Fix pending in version 2.02.71.
Comment 6 Corey Marthaler 2010-07-29 18:22:55 EDT
In the latest build, the dm devices are removed only on the node that performs the split operation. All other nodes continue to have the zombie dm devices.

[root@taft-02 ~]# lvconvert --splitmirrors 1 --name new taft/mirror
  Logical volume mirror converted.

[root@taft-02 ~]# dmsetup ls
taft-mirror_mimage_1    (253, 5)
taft-mirror     (253, 7)
taft-mirror_mimage_0    (253, 4)
taft-mirror_mlog        (253, 3)
taft-new        (253, 6)



[root@taft-01 ~]# dmsetup ls
taft-mirror_mimage_2    (253, 6)
taft-mirror_mimage_1    (253, 5)
taft-mirror     (253, 7)
taft-mirror_mimage_0    (253, 4)
taft-mirror_mlog        (253, 3)

[root@taft-03 ~]# dmsetup ls
taft-mirror_mimage_2    (253, 6)
taft-mirror_mimage_1    (253, 5)
taft-mirror     (253, 7)
taft-mirror_mimage_0    (253, 4)
taft-mirror_mlog        (253, 3)

[root@taft-04 ~]# dmsetup ls
taft-mirror_mimage_2    (253, 6)
taft-mirror_mimage_1    (253, 5)
taft-mirror     (253, 7)
taft-mirror_mimage_0    (253, 4)
taft-mirror_mlog        (253, 3)



2.6.32-52.el6.x86_64

lvm2-2.02.72-3.el6    BUILT: Wed Jul 28 15:39:43 CDT 2010
lvm2-libs-2.02.72-3.el6    BUILT: Wed Jul 28 15:39:43 CDT 2010
lvm2-cluster-2.02.72-3.el6    BUILT: Wed Jul 28 15:39:43 CDT 2010
udev-147-2.21.el6    BUILT: Mon Jul 12 04:55:00 CDT 2010
device-mapper-1.02.53-3.el6    BUILT: Wed Jul 28 15:39:43 CDT 2010
device-mapper-libs-1.02.53-3.el6    BUILT: Wed Jul 28 15:39:43 CDT 2010
device-mapper-event-1.02.53-3.el6    BUILT: Wed Jul 28 15:39:43 CDT 2010
device-mapper-event-libs-1.02.53-3.el6    BUILT: Wed Jul 28 15:39:43 CDT 2010
cmirror-2.02.72-3.el6    BUILT: Wed Jul 28 15:39:43 CDT 2010
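
The per-node listings in this comment can be compared mechanically. The Python below is a minimal sketch with hypothetical helper names (`device_names`, `diff_against`); it assumes the `dmsetup ls` output of each node has already been collected as text, and how that is gathered from the cluster is left out.

```python
def device_names(dmsetup_output):
    """Extract device names (first column) from `dmsetup ls` output."""
    names = set()
    for line in dmsetup_output.splitlines():
        line = line.strip()
        if line and line != "No devices found":
            names.add(line.split()[0])
    return names


def diff_against(reference_node, listings):
    """Compare each node's dm device names with the reference node's.

    listings maps node name -> `dmsetup ls` output from that node.
    Returns {node: (extra, missing)} for every node that differs, where
    extra/missing are relative to the reference node's device set.
    """
    ref = device_names(listings[reference_node])
    report = {}
    for node, output in listings.items():
        names = device_names(output)
        if names != ref:
            report[node] = (names - ref, ref - names)
    return report
```

With taft-02 (the node that ran the split) as the reference, each of the other nodes above would be reported with `taft-mirror_mimage_2` as extra and `taft-new` as missing.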
Comment 8 Corey Marthaler 2010-08-12 12:27:11 EDT
This still exists in the latest rpms.

2.6.32-59.1.el6.x86_64

lvm2-2.02.72-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
lvm2-libs-2.02.72-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
lvm2-cluster-2.02.72-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
udev-147-2.22.el6    BUILT: Fri Jul 23 07:21:33 CDT 2010
device-mapper-1.02.53-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
device-mapper-libs-1.02.53-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
device-mapper-event-1.02.53-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
device-mapper-event-libs-1.02.53-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
cmirror-2.02.72-7.el6    BUILT: Wed Aug 11 17:12:24 CDT 2010
Comment 9 Jonathan Earl Brassow 2010-08-12 18:31:39 EDT
patch posted for review to lvm-devel
Comment 10 Jonathan Earl Brassow 2010-08-12 18:33:52 EDT
The problem is that the suspend/resume operations, which cause the node issuing the 'lvconvert' to reload its dm devices locally, have no effect remotely.  This is because the newly split LV has no entry in the lv_hash, which is required for the suspend/resume operation.  It can only get into the hash via an activate_lv operation.  So, the patch replaces the suspend/resume with an activate_lv for the newly split LV.
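
The behaviour described above can be illustrated with a toy model. This is not clvmd code: `ToyNode` and its methods are hypothetical, and the lv_hash is modelled as a plain Python set of LV UUIDs purely to show why suspend/resume does nothing for an LV the node has never activated.

```python
class ToyNode:
    """Toy stand-in for one remote node's LV bookkeeping (not clvmd code)."""

    def __init__(self, known_uuids=()):
        self.lv_hash = set(known_uuids)  # UUIDs of LVs this node has activated
        self.loaded = []                 # UUIDs whose dm devices were (re)loaded

    def suspend_resume(self, uuid):
        """Reload an LV only if its UUID is already in the hash; else a no-op."""
        if uuid not in self.lv_hash:
            return False
        self.loaded.append(uuid)
        return True

    def activate(self, uuid):
        """Activation registers the UUID in the hash and loads the device."""
        self.lv_hash.add(uuid)
        self.loaded.append(uuid)
        return True


remote = ToyNode(known_uuids={"uuid-of-mirror"})
# Before the fix: the split-off LV is unknown remotely, so nothing happens.
assert remote.suspend_resume("uuid-of-new") is False
# After the fix: activation registers the LV, and later reloads take effect.
assert remote.activate("uuid-of-new") is True
assert remote.suspend_resume("uuid-of-new") is True
```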
Comment 11 Jonathan Earl Brassow 2010-08-16 14:06:16 EDT
POST:

Fix for bug 612291: dm devices of split off mirror images are not removed

DM devices were not handled properly on cluster nodes other than the one
where the splitmirrors command was issued.  This happened because
suspend_lv/resume_lv were used in a place where activate_lv should have
been used.

When the suspend/resume are issued on (effectively) new LVs, their
'resource' (UUID) is not located in the lv_hash.  Thus, both operations
turn into no-ops.  You can see this from the output of clvmd from one
of the remote nodes:
<snip>
CLVMD[3c7ed710]: Aug 12 17:01:44 do_suspend_lv, lock not already held
<snip>
CLVMD[3c7ed710]: Aug 12 17:02:03 do_resume_lv, lock not already held

'activate_lv' enjoins the other nodes in the cluster to process the lock
and activate the new LV.  clvmd output from remote node as follows:
CLVMD[776e5710]: Aug 12 16:53:21 do_lock_lv: resource 'zMseY7CBuO3Ty09vXlplPAHzD0Y0CovjrTdv0R1VcwggMwPdYhutHErRcwm5Nd2S', cmd = 0x19 LCK_LV_ACTIVATE (READ|LV|NONBLOCK), flags = 0x84 (DMEVENTD_MONITOR ), memlock = 1
CLVMD[776e5710]: Aug 12 16:53:21 sync_lock: 'zMseY7CBuO3Ty09vXlplPAHzD0Y0CovjrTdv0R1VcwggMwPdYhutHErRcwm5Nd2S' mode:1 flags=1
CLVMD[776e5710]: Aug 12 16:53:21 sync_lock: returning lkid 27b0001
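
To spot the no-op suspend/resume calls in a remote node's clvmd debug output, the log can be filtered for the "lock not already held" message. The following Python is a rough sketch: `skipped_operations` is a hypothetical helper, and the line format it matches is assumed from the two lines quoted above, so other clvmd versions may log differently.

```python
import re

# Matches e.g. "CLVMD[3c7ed710]: Aug 12 17:01:44 do_suspend_lv, lock not already held"
_NOOP = re.compile(
    r"CLVMD\[[0-9a-f]+\]: (\w+ +\d+ [\d:]+) "
    r"(do_(?:suspend|resume)_lv), lock not already held"
)


def skipped_operations(log_text):
    """Return (timestamp, operation) for each suspend/resume that was a no-op."""
    hits = []
    for line in log_text.splitlines():
        match = _NOOP.search(line)
        if match:
            hits.append((match.group(1), match.group(2)))
    return hits
```

An empty result after a `lvconvert --splitmirrors` run would suggest that the remote node did not skip any suspend or resume, at least as far as this message is concerned.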
Comment 12 Corey Marthaler 2010-08-18 17:23:08 EDT
Fix verified in the latest rpms.

2.6.32-59.1.el6.x86_64

lvm2-2.02.72-8.el6    BUILT: Wed Aug 18 10:41:52 CDT 2010
lvm2-libs-2.02.72-8.el6    BUILT: Wed Aug 18 10:41:52 CDT 2010
lvm2-cluster-2.02.72-8.el6    BUILT: Wed Aug 18 10:41:52 CDT 2010
udev-147-2.22.el6    BUILT: Fri Jul 23 07:21:33 CDT 2010
device-mapper-1.02.53-8.el6    BUILT: Wed Aug 18 10:41:52 CDT 2010
device-mapper-libs-1.02.53-8.el6    BUILT: Wed Aug 18 10:41:52 CDT 2010
device-mapper-event-1.02.53-8.el6    BUILT: Wed Aug 18 10:41:52 CDT 2010
device-mapper-event-libs-1.02.53-8.el6    BUILT: Wed Aug 18 10:41:52 CDT 2010
cmirror-2.02.72-8.el6    BUILT: Wed Aug 18 10:41:52 CDT 2010
Comment 13 releng-rhel@redhat.com 2010-11-10 16:08:22 EST
Red Hat Enterprise Linux 6.0 is now available and should resolve
the problem described in this bug report. This report is therefore being closed
with a resolution of CURRENTRELEASE. You may reopen this bug report if the
solution does not work for you.
