Bug 1136925

Summary: LVM hangs when removing snapshot in clustered volume group
Product: Red Hat Enterprise Linux 6
Reporter: Patrik Hagara <phagara>
Component: lvm2
Assignee: Zdenek Kabelac <zkabelac>
lvm2 sub component: Clustering / clvmd (RHEL6)
QA Contact: cluster-qe <cluster-qe>
Status: CLOSED ERRATA
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: agk, cmarthal, heinzm, jbrassow, msnitzer, nperic, prajnoha, prockai, zkabelac
Version: 6.5
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: lvm2-2.02.117-1.el6
Doc Type: Bug Fix
Doc Text:
The lvm2 tool incorrectly used LV locks for snapshot volumes and, as a result, could try to access the wrong devices for locking. The tool has been fixed to always operate on the lock of the snapshot's origin volume.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-07-22 07:35:26 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Description Flags
lvremove debug output
more defailed logs none

Description Patrik Hagara 2014-09-03 15:30:29 UTC
Created attachment 934136 [details]
lvremove debug output

Description of problem:
After a certain sequence of LVM commands (listed in "Steps to Reproduce" below), lvremove on an exclusively activated snapshot LV in a clustered VG fails with the following error message:

# lvremove testvg/temp
Do you really want to remove active clustered logical volume temp? [y/n]: y
  Error locking on node virt-136: Unable to deactivate open testvg-template-real (253:4)
  Failed to resume template.

However, executing the same command again seems to fix the issue.
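
Since the second attempt appears to succeed, the observation above can be sketched as a small retry wrapper. This is only an illustration of the symptom, not a fix: `retry_once` is a hypothetical helper name, and retrying merely masks the underlying locking problem.

```shell
#!/bin/sh
# Hypothetical helper (not part of lvm2): run a command and, if it fails,
# run the very same command exactly one more time.
# The exit status is that of the last attempt.
retry_once() {
    # First attempt: return immediately on success.
    "$@" && return 0
    echo "first attempt failed, retrying once: $*" >&2
    # Second and final attempt.
    "$@"
}
```

With this helper, the failing step would be invoked as `retry_once lvremove -f /dev/testvg/temp`.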

Version-Release number of selected component (if applicable):
# uname -r
# rpm -qv lvm2 lvm2-libs lvm2-cluster udev device-mapper device-mapper-libs device-mapper-event device-mapper-event-libs

How reproducible:

Steps to Reproduce:
# all commands executed on a single cluster node
# 1) setup clustered vg
pvcreate /dev/sda -ffy
vgcreate -cy testvg /dev/sda
# 2) create template
lvcreate -aey -n disk -L 64m testvg
lvrename /dev/testvg/disk /dev/testvg/template
lvcreate -aey -n disk -L64m testvg
# 3) basic workflow
lvrename /dev/testvg/disk /dev/testvg/temp
lvcreate -s /dev/testvg/template -n disk -aey -L64m
lvremove -f /dev/testvg/temp
lvrename /dev/testvg/disk /dev/testvg/temp
lvcreate -aey -n disk -L 64m testvg
# 4) try to remove temp snapshot
lvremove -f /dev/testvg/temp
# Issuing `lvs` now would hang lvm (uninterruptible sleep,
# kernel task hang timer expiring, can't even reboot gracefully).
# Other machines automatically activate the temp LV (non-exclusively).
# However, a second call to lvremove succeeds. (Why?)
lvremove -f /dev/testvg/temp
# And lvm seems to work as usual now

Actual results:
error and hang

Expected results:
lvremove should work on the first try

Additional info:
Running a quorate 3-node cman/corosync cluster with fencing and clvmd.
Also tested on RHEL-6.6 and 7.0 with the same results.

Comment 2 Peter Rajnoha 2014-09-04 10:46:38 UTC
Created attachment 934368 [details]
more defailed logs

Logs from the first and second lvremove, with dmsetup logs showing that there is a suspended LV after the first lvremove.
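
For anyone wanting to check for a leftover suspended device without reading the full logs, a minimal sketch follows. It assumes the usual `dmsetup info` layout, where each device is reported with a `Name:` line followed by a `State:` line; `list_suspended` is a hypothetical helper name, and the layout is an assumption rather than something taken from the attachment.

```shell
#!/bin/sh
# Hypothetical helper: read `dmsetup info` output on stdin and print the
# name of every device whose State line reports SUSPENDED.
# Assumes each device block has "Name: <name>" before "State: <state>".
list_suspended() {
    awk '
        $1 == "Name:"  { name = $2 }                      # remember current device
        $1 == "State:" && $2 == "SUSPENDED" { print name } # report it if suspended
    '
}
```

Typical use would be `dmsetup info | list_suspended`, run right after the first failed lvremove.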

Comment 3 Zdenek Kabelac 2014-09-04 14:56:41 UTC
Yep, easily reproducible on a single-node cluster as well - working on a patch...

Comment 9 Corey Marthaler 2015-05-06 21:36:29 UTC
Marking verified in the latest rpms.

lvm2-2.02.118-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
lvm2-libs-2.02.118-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
lvm2-cluster-2.02.118-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
udev-147-2.61.el6    BUILT: Mon Mar  2 05:08:11 CST 2015
device-mapper-1.02.95-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
device-mapper-libs-1.02.95-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
device-mapper-event-1.02.95-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
device-mapper-event-libs-1.02.95-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr  4 08:43:06 CDT 2014
cmirror-2.02.118-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015

[root@host-122 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    104   2015-05-06 16:13:45  host-122
   2   M    108   2015-05-06 16:13:45  host-123
   3   M    112   2015-05-06 16:13:45  host-124

[root@host-122 ~]#  pvcreate /dev/sda1 -ffy
  Physical volume "/dev/sda1" successfully created
[root@host-122 ~]# vgcreate -cy testvg /dev/sda1
  Clustered volume group "testvg" successfully created
[root@host-122 ~]# lvcreate -aey -n disk -L 64m testvg
  Logical volume "disk" created.
[root@host-122 ~]# lvrename /dev/testvg/disk /dev/testvg/template
  Renamed "disk" to "template" in volume group "testvg"
[root@host-122 ~]# lvcreate -aey -n disk -L64m testvg
  Logical volume "disk" created.
[root@host-122 ~]# lvrename /dev/testvg/disk /dev/testvg/temp
  Renamed "disk" to "temp" in volume group "testvg"
[root@host-122 ~]# lvcreate -s /dev/testvg/template -n disk -aey -L64m
  Logical volume "disk" created.
[root@host-122 ~]# lvremove -f /dev/testvg/temp
  Logical volume "temp" successfully removed
[root@host-122 ~]# lvrename /dev/testvg/disk /dev/testvg/temp
  Renamed "disk" to "temp" in volume group "testvg"
[root@host-122 ~]# lvcreate -aey -n disk -L 64m testvg
  Logical volume "disk" created.
[root@host-122 ~]# lvremove -f /dev/testvg/temp
  Logical volume "temp" successfully removed
[root@host-122 ~]# lvs -a -o +devices
  LV       VG         Attr       LSize    Devices
  disk     testvg     -wi-a-----  64.00m  /dev/sda1(16)
  template testvg     -wi-a-----  64.00m  /dev/sda1(0)
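
The final check above (temp no longer listed, disk and template still present) could be automated roughly as follows. This is a sketch under the assumption that the LV name is the first column of the `lvs` output, as in the listing above; `lv_listed` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: read `lvs` output on stdin and succeed (exit 0)
# only if an LV with the given name appears in the first column.
lv_listed() {
    awk -v want="$1" '
        $1 == want { found = 1 }   # first column holds the LV name
        END { exit !found }        # exit 0 if seen, 1 otherwise
    '
}
```

For example, `lvs -a -o +devices | lv_listed temp` should fail after a successful removal, while the same pipeline with `template` should succeed.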

Comment 10 errata-xmlrpc 2015-07-22 07:35:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.