Bug 1136925
Summary: LVM hangs when removing snapshot in clustered volume group
Product: Red Hat Enterprise Linux 6
Component: lvm2
lvm2 sub component: Clustering / clvmd (RHEL6)
Version: 6.5
Status: CLOSED ERRATA
Severity: unspecified
Priority: unspecified
Reporter: Patrik Hagara <phagara>
Assignee: Zdenek Kabelac <zkabelac>
QA Contact: cluster-qe <cluster-qe>
CC: agk, cmarthal, heinzm, jbrassow, msnitzer, nperic, prajnoha, prockai, zkabelac
Target Milestone: rc
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Fixed In Version: lvm2-2.02.117-1.el6
Doc Type: Bug Fix
Doc Text: The lvm2 tool incorrectly used LV locks for snapshot volumes and, as a result, could try to access the wrong devices for locking. The tool has been fixed to always use the lock of the snapshot's origin volume.
Last Closed: 2015-07-22 07:35:26 UTC
Type: Bug
Attachments:
Created attachment 934368
more detailed logs
Logs from the first and second lvremove, with dmsetup logs showing that an LV is left suspended after the first lvremove.
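A suspended device like the one mentioned in these logs can be spotted by filtering `dmsetup info` output. The following is a minimal sketch, not taken from the attachment: the helper name `list_suspended` is hypothetical, and it assumes the default multi-line `dmsetup info` layout with `Name:` and `State:` fields.

```shell
#!/bin/sh
# Hypothetical helper: print the names of device-mapper devices whose
# State is SUSPENDED. Reads `dmsetup info` style text on stdin.
# Assumes each device block has a "Name:" line followed by a "State:" line.
list_suspended() {
    awk '
        $1 == "Name:"  { name = $2 }
        $1 == "State:" && $2 == "SUSPENDED" { print name }
    '
}

# Typical use on the affected node (needs root):
#   dmsetup info | list_suspended
```

Empty output means no device is currently suspended; any name printed after the first lvremove points at the device that blocks later LVM commands.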
Yep, easily reproducible on a single-node cluster as well - working on a patch...

Addressed upstream with a few commits:
https://www.redhat.com/archives/lvm-devel/2014-September/msg00019.html
https://www.redhat.com/archives/lvm-devel/2014-September/msg00024.html

Marking verified in the latest rpms.

2.6.32-554.el6.x86_64
lvm2-2.02.118-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
lvm2-libs-2.02.118-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
lvm2-cluster-2.02.118-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
udev-147-2.61.el6    BUILT: Mon Mar 2 05:08:11 CST 2015
device-mapper-1.02.95-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
device-mapper-libs-1.02.95-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
device-mapper-event-1.02.95-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
device-mapper-event-libs-1.02.95-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr 4 08:43:06 CDT 2014
cmirror-2.02.118-2.el6    BUILT: Wed Apr 15 06:34:08 CDT 2015

[root@host-122 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M    104   2015-05-06 16:13:45  host-122
   2   M    108   2015-05-06 16:13:45  host-123
   3   M    112   2015-05-06 16:13:45  host-124

[root@host-122 ~]# pvcreate /dev/sda1 -ffy
  Physical volume "/dev/sda1" successfully created
[root@host-122 ~]# vgcreate -cy testvg /dev/sda1
  Clustered volume group "testvg" successfully created
[root@host-122 ~]# lvcreate -aey -n disk -L 64m testvg
  Logical volume "disk" created.
[root@host-122 ~]# lvrename /dev/testvg/disk /dev/testvg/template
  Renamed "disk" to "template" in volume group "testvg"
[root@host-122 ~]# lvcreate -aey -n disk -L64m testvg
  Logical volume "disk" created.
[root@host-122 ~]# lvrename /dev/testvg/disk /dev/testvg/temp
  Renamed "disk" to "temp" in volume group "testvg"
[root@host-122 ~]# lvcreate -s /dev/testvg/template -n disk -aey -L64m
  Logical volume "disk" created.
[root@host-122 ~]# lvremove -f /dev/testvg/temp
  Logical volume "temp" successfully removed
[root@host-122 ~]# lvrename /dev/testvg/disk /dev/testvg/temp
  Renamed "disk" to "temp" in volume group "testvg"
[root@host-122 ~]# lvcreate -aey -n disk -L 64m testvg
  Logical volume "disk" created.
[root@host-122 ~]# lvremove -f /dev/testvg/temp
  Logical volume "temp" successfully removed
[root@host-122 ~]# lvs -a -o +devices
  LV       VG     Attr       LSize  Devices
  disk     testvg -wi-a----- 64.00m /dev/sda1(16)
  template testvg -wi-a----- 64.00m /dev/sda1(0)

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1411.html
Created attachment 934136
lvremove debug output

Description of problem:
After a certain sequence of LVM commands (listed in "Steps to Reproduce" below), lvremove on an exclusively activated snapshot LV in a clustered VG fails with the following error message:

# lvremove testvg/temp
Do you really want to remove active clustered logical volume temp? [y/n]: y
  Error locking on node virt-136: Unable to deactivate open testvg-template-real (253:4)
  Failed to resume template.

However, executing the same command again seems to fix the issue.

Version-Release number of selected component (if applicable):
# uname -r
2.6.32-431.23.3.el6.x86_64
# rpm -qv lvm2 lvm2-libs lvm2-cluster udev device-mapper device-mapper-libs device-mapper-event device-mapper-event-libs
lvm2-2.02.100-8.el6.x86_64
lvm2-libs-2.02.100-8.el6.x86_64
lvm2-cluster-2.02.100-8.el6.x86_64
udev-147-2.51.el6.x86_64
device-mapper-1.02.79-8.el6.x86_64
device-mapper-libs-1.02.79-8.el6.x86_64
device-mapper-event-1.02.79-8.el6.x86_64
device-mapper-event-libs-1.02.79-8.el6.x86_64

How reproducible:
always

Steps to Reproduce:
# all commands executed on a single cluster node

# 1) setup clustered vg
pvcreate /dev/sda -ffy
vgcreate -cy testvg /dev/sda

# 2) create template
lvcreate -aey -n disk -L 64m testvg
lvrename /dev/testvg/disk /dev/testvg/template
lvcreate -aey -n disk -L64m testvg

# 3) basic workflow
lvrename /dev/testvg/disk /dev/testvg/temp
lvcreate -s /dev/testvg/template -n disk -aey -L64m
lvremove -f /dev/testvg/temp
lvrename /dev/testvg/disk /dev/testvg/temp
lvcreate -aey -n disk -L 64m testvg

# 4) try to remove temp snapshot
lvremove -f /dev/testvg/temp

# Issuing `lvs` now would hang lvm (uninterruptible sleep,
# kernel task hang timer expiring, can't even reboot gracefully).
# Other machines automatically activate the temp LV (non-exclusively).

# However, a second call to lvremove succeeds. (Why?)
lvremove -f /dev/testvg/temp

# And lvm seems to work as usual now
lvs

Actual results:
error and hang

Expected results:
lvremove should work on the first try

Additional info:
Running a quorate 3-node cman/corosync cluster with fencing and clvmd.
Also tested on RHEL-6.6 and 7.0 with the same results.
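
For repeated testing, the steps above can be wrapped in a small script. This is only a sketch under stated assumptions: the `run` and `reproduce` helpers and the `DRY_RUN` variable are hypothetical names introduced here, and the device and VG names are passed in rather than hard-coded. By default it only prints the commands, because `pvcreate -ffy` destroys any data on the target device.

```shell
#!/bin/sh
# Hypothetical wrapper around the reproducer from this report.
# run() prints each command; it executes them only when DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then
        "$@"
    else
        printf '%s\n' "$*"
    fi
}

reproduce() {
    dev=$1   # block device to use, /dev/sda in the report
    vg=$2    # clustered VG name, testvg in the report

    # 1) setup clustered vg
    run pvcreate "$dev" -ffy
    run vgcreate -cy "$vg" "$dev"
    # 2) create template
    run lvcreate -aey -n disk -L 64m "$vg"
    run lvrename "/dev/$vg/disk" "/dev/$vg/template"
    run lvcreate -aey -n disk -L 64m "$vg"
    # 3) basic workflow
    run lvrename "/dev/$vg/disk" "/dev/$vg/temp"
    run lvcreate -s "/dev/$vg/template" -n disk -aey -L 64m
    run lvremove -f "/dev/$vg/temp"
    run lvrename "/dev/$vg/disk" "/dev/$vg/temp"
    run lvcreate -aey -n disk -L 64m "$vg"
    # 4) remove the temp snapshot; this is the call reported to fail
    run lvremove -f "/dev/$vg/temp"
}

# Print the command sequence without touching any device:
#   reproduce /dev/sda testvg
# Execute it on a disposable cluster node (as root):
#   DRY_RUN=0 reproduce /dev/sda testvg
```

Keeping dry-run as the default means that sourcing or running the script by accident cannot wipe a disk; the destructive path has to be requested explicitly.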