Description of problem:

# Exclusively activate a clustered VG and create a snapshot on one node, then
# attempt to remove the snapshot from another node.

[root@taft-03 ~]# cman_tool nodes
Node  Sts   Inc   Joined               Name
   1   M      8   2011-09-14 13:43:48  taft-01
   2   M     12   2011-09-14 13:43:48  taft-02
   3   M      8   2011-09-14 13:43:48  taft-03
   4   M     16   2011-09-14 13:43:48  taft-04

[root@taft-03 ~]# lvcreate -n origin -L 100M taft
  Logical volume "origin" created
[root@taft-03 ~]# vgchange -an
  0 logical volume(s) in volume group "taft" now active
[root@taft-03 ~]# lvs
  LV     VG   Attr   LSize   Origin Snap%
  origin taft -wi--- 100.00m
[root@taft-03 ~]# vgchange -aye taft
  1 logical volume(s) in volume group "taft" now active
[root@taft-03 ~]# lvs -a -o +devices
  LV     VG   Attr   LSize   Origin Snap%  Devices
  origin taft -wi-a- 100.00m               /dev/sdb1(0)
[root@taft-03 ~]# lvcreate -s taft/origin -n snap1 -L 12M
  Logical volume "snap1" created
[root@taft-03 ~]# lvcreate -s taft/origin -n snap2 -L 12M
  Logical volume "snap2" created
[root@taft-03 ~]# lvs -a -o +devices
  LV     VG   Attr   LSize   Origin Snap%  Devices
  origin taft owi-a- 100.00m               /dev/sdb1(0)
  snap1  taft swi-a-  12.00m origin   0.00 /dev/sdb1(25)
  snap2  taft swi-a-  12.00m origin   0.00 /dev/sdb1(28)
[root@taft-03 ~]# pvscan
  PV /dev/sdb1   VG taft   lvm2 [135.66 GiB / 135.54 GiB free]
  PV /dev/sdc1   VG taft   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sdd1   VG taft   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sde1   VG taft   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sdf1   VG taft   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sdg1   VG taft   lvm2 [135.66 GiB / 135.66 GiB free]
  PV /dev/sdh1   VG taft   lvm2 [135.66 GiB / 135.66 GiB free]

# Do the delete attempt of the snapshot from another node in the cluster where
# the VG isn't active.

[root@taft-04 ~]# lvs -a -o +devices
  LV     VG   Attr   LSize   Origin Snap%  Devices
  origin taft owi--- 100.00m               /dev/sdb1(0)
  snap1  taft swi---  12.00m origin        /dev/sdb1(25)
  snap2  taft swi---  12.00m origin        /dev/sdb1(28)
[root@taft-04 ~]# lvremove taft/snap2
Do you really want to remove active clustered logical volume snap2? [y/n]: y
  cluster request failed: Resource temporarily unavailable
  Failed to refresh origin without snapshot.
  Device '/dev/sdf1' has been left open.
  Device '/dev/sdb1' has been left open.
  Device '/dev/sdc1' has been left open.
  Device '/dev/sdg1' has been left open.
  Device '/dev/sdh1' has been left open.
  Device '/dev/sdb1' has been left open.
  Device '/dev/sdd1' has been left open.
  Device '/dev/sdg1' has been left open.
  Device '/dev/sde1' has been left open.
  Device '/dev/sdf1' has been left open.
  Device '/dev/sdg1' has been left open.
  Device '/dev/sde1' has been left open.
  Device '/dev/sdh1' has been left open.
  Device '/dev/sdd1' has been left open.
  Device '/dev/sdf1' has been left open.
  Device '/dev/sdh1' has been left open.
  Device '/dev/sdc1' has been left open.
  Device '/dev/sdb1' has been left open.
  Device '/dev/sdc1' has been left open.
  Device '/dev/sde1' has been left open.
  Device '/dev/sdg1' has been left open.
  Device '/dev/sdd1' has been left open.
  Device '/dev/sde1' has been left open.
  Device '/dev/sdf1' has been left open.
  Device '/dev/sdh1' has been left open.
  Device '/dev/sde1' has been left open.
  Device '/dev/sdd1' has been left open.
  Device '/dev/sdg1' has been left open.
  Device '/dev/sdc1' has been left open.
  Device '/dev/sdb1' has been left open.
  Device '/dev/sdh1' has been left open.
  Device '/dev/sdf1' has been left open.
  Device '/dev/sdb1' has been left open.
  Device '/dev/sdc1' has been left open.
  Device '/dev/sdd1' has been left open.
[root@taft-04 ~]# lvs -a -o +devices
  LV     VG   Attr   LSize   Origin Snap%  Devices
  origin taft owi--- 100.00m               /dev/sdb1(0)
  snap1  taft swi---  12.00m origin        /dev/sdb1(25)
  snap2  taft swi---  12.00m origin        /dev/sdb1(28)

Version-Release number of selected component (if applicable):
2.6.32-195.el6.x86_64
lvm2-2.02.87-2.1.el6                      BUILT: Wed Sep 14 09:44:16 CDT 2011
lvm2-libs-2.02.87-2.1.el6                 BUILT: Wed Sep 14 09:44:16 CDT 2011
lvm2-cluster-2.02.87-2.1.el6              BUILT: Wed Sep 14 09:44:16 CDT 2011
udev-147-2.38.el6                         BUILT: Fri Sep 9 16:25:50 CDT 2011
device-mapper-1.02.66-2.1.el6             BUILT: Wed Sep 14 09:44:16 CDT 2011
device-mapper-libs-1.02.66-2.1.el6        BUILT: Wed Sep 14 09:44:16 CDT 2011
device-mapper-event-1.02.66-2.1.el6       BUILT: Wed Sep 14 09:44:16 CDT 2011
device-mapper-event-libs-1.02.66-2.1.el6  BUILT: Wed Sep 14 09:44:16 CDT 2011
cmirror-2.02.87-2.1.el6                   BUILT: Wed Sep 14 09:44:16 CDT 2011

How reproducible:
Every time
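For anyone re-running this later (e.g. on RHEL7, per the comment further down), the setup assumed here is clustered locking via clvmd. A quick sanity check on each node might look like the following sketch (mine, not part of the original report):

  # locking_type 3 in lvm.conf selects the built-in clustered locking (clvmd).
  grep -E '^[[:space:]]*locking_type' /etc/lvm/lvm.conf
  # Confirm the locking daemon and cluster membership are up.
  service clvmd status
  cman_tool status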
One addition: after a restart, clvmd reads its initial lock state from the "lvs" command. For a virtual-origin snapshot it loads the wrong lock ID, because the virtual origin is internally represented by a different LV. This must be fixed as well for this to work reliably.
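For completeness, a rough way to see the virtual-origin case (my own sketch, not from the report; it assumes the usual sparse-snapshot syntax and that the hidden origin shows up as a bracketed internal LV in 'lvs -a'):

  # A sparse snapshot gets a hidden "virtual origin" LV created behind it.
  lvcreate -s --virtualsize 1G -L 100M -n vsnap taft

  # 'lvs -a' lists the internal origin as a bracketed LV; the lock clvmd needs
  # for vsnap belongs to that internal LV rather than to vsnap itself.
  lvs -a -o lv_name,lv_attr,origin taft

  # After a clvmd restart the daemon rebuilds its lock state from 'lvs' output,
  # which is where the wrong lock ID gets picked up.
  service clvmd restart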
We should attempt to clean up the way lvremove handles these situations. (We probably can't support deleting a snapshot from a node other than the one where it's active, though - the supported path is sketched below.)
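To make that supported path explicit, a minimal sketch (mine, not from the report), assuming the VG is still exclusively active on taft-03 as in the transcript above:

  # Run the removal on the node holding the exclusive activation (taft-03);
  # -f skips the interactive confirmation.
  ssh taft-03 'lvremove -f taft/snap2'

  # Or move the exclusive activation to the node that wants to do the removal
  # (taft-04 here) first, then remove locally:
  ssh taft-03 'vgchange -an taft'
  vgchange -aye taft
  lvremove -f taft/snap2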
(We should perhaps consider this alongside the similar thin pool/volume activation questions.)

1. Maybe it's time to change the way locking is handled for implicitly activated volumes: always hold locks for every active volume and teach the code how to acquire the extra locks?

2. See whether this lvremove sequence can be made to work correctly in a cluster - or, if not, detect and ban it (a rough detection sketch follows below).
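As a crude illustration of the "detect" half of item 2 (entirely hypothetical - the node list, ssh access, and the policy are my assumptions), a pre-check could refuse the removal unless the LV is inactive everywhere else; the fifth lv_attr character is 'a' on a node where the LV is active:

  #!/bin/bash
  # Hypothetical pre-check: refuse to remove taft/snap2 from this node unless
  # it is not active on any other cluster member.
  LV=taft/snap2
  for node in taft-01 taft-02 taft-03 taft-04; do
      attr=$(ssh "$node" lvs --noheadings -o lv_attr "$LV" 2>/dev/null | tr -d ' ')
      # The 5th attribute character is 'a' when the LV is active on that node.
      if [ "${attr:4:1}" = "a" ] && [ "$node" != "$(hostname -s)" ]; then
          echo "$LV is active on $node - remove it there instead" >&2
          exit 1
      fi
  done
  lvremove -f "$LV"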
This low-priority, unlikely-to-be-hit bug should continue to be tracked, but it will not be considered for a fix in RHEL6. Moving to RHEL7, where an attempt to reproduce it should likely be made.