Description of problem:

# LV inactive in the cluster
[root@link-02 tmp]# lvscan
  inactive          '/dev/vg1/lv1' [10.00 GB] inherit
[root@link-08 lvm]# lvscan
  inactive          '/dev/vg1/lv1' [10.00 GB] inherit

# LV gets activated on link-08 only
[root@link-08 lvm]# vgchange -aye
  0 logical volume(s) in volume group "vg2" now active
  1 logical volume(s) in volume group "vg1" now active
[root@link-02 tmp]# lvscan
  inactive          '/dev/vg1/lv1' [10.00 GB] inherit
[root@link-08 lvm]# lvscan
  ACTIVE            '/dev/vg1/lv1' [10.00 GB] inherit

# But link-02 is able to remove it from link-08
[root@link-02 tmp]# lvremove /dev/vg1/lv1
  Logical volume "lv1" successfully removed

Version-Release number of selected component (if applicable):
[root@link-02 tmp]# rpm -qa lvm2-cluster
lvm2-cluster-2.02.06-7.0.RHEL4

How reproducible:
Every time
This one's been there for a long time - lvremove has only ever checked for activity on the local node. Two options:

1) Pass the 'active' status around the cluster, so the existing "is active?" query will give a cluster-wide answer.
2) Do locking tricks so the local node has exclusive control before the LV can be removed.

I reckon it's time we attempted (1) now. In other words, remove the direct calls into lib/activate for lv_info-related information and always get it indirectly via the locking layer.
I suspect that as far as clvmd is concerned, (1) and (2) are very similar. Testing for a lock in the cluster is most easily done by attempting to get an EX lock on the resource (especially since the new DLM doesn't yet have the query interface that RHEL4 has). Provided the VG lock is held throughout the operation, there will be no races.
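The "probe by try-lock" idea above can be illustrated outside the cluster stack. This is only an analogy, not the clvmd code: flock(1) on a scratch file stands in for a DLM resource, a background process stands in for another node holding the lock, and the non-blocking `-n` attempt plays the role of the EX-lock probe.

```shell
#!/bin/sh
# Analogy only: flock(1) on a temp file stands in for a DLM resource.
LOCKFILE=$(mktemp)

# Another "node" holds the exclusive lock for a couple of seconds.
flock -x "$LOCKFILE" sleep 2 &
sleep 1   # give the holder time to acquire the lock

# Probe: -n means non-blocking; failure implies someone else holds EX.
if flock -xn "$LOCKFILE" true; then
    RESULT="free"
else
    RESULT="busy"   # an EX lock is held elsewhere in the "cluster"
fi
echo "resource is $RESULT"
wait
rm -f "$LOCKFILE"
```

Run while the background holder is alive, the probe fails immediately rather than blocking, which is exactly the property that makes try-EX-lock a usable "is it active anywhere?" test.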
How do the other LVM commands that alter an LV currently work?

# Won't let me rename it if the lock is elsewhere
[root@link-02 ~]# lvrename /dev/linear_4_3390/lv /dev/linear_4_3390/rename
  cluster request failed: Resource temporarily unavailable

# Won't let me reduce it either
[root@link-02 ~]# lvreduce -L 1G /dev/linear_4_3390/lv
  Reducing logical volume lv to 1.00 GB
  cluster request failed: Resource temporarily unavailable
  Failed to suspend lv
Effectively, because they do (2) above: LVM attempts to suspend the LV on all the nodes before doing the operation. In this case it can't, because the LV is locked exclusively on another node. I suspect Alasdair would regard it as a gross hack to suspend the LV before removing it ;-)
lvremove now attempts to get an exclusive lock on the resource before removing the LV.

Checking in WHATS_NEW;
/cvs/lvm2/LVM2/WHATS_NEW,v  <--  WHATS_NEW
new revision: 1.477; previous revision: 1.476
done
Checking in tools/lvremove.c;
/cvs/lvm2/LVM2/tools/lvremove.c,v  <--  lvremove.c
new revision: 1.49; previous revision: 1.48
done
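The shape of that fix can be sketched in miniature. This is a hedged stand-in, not the real code: the real lvremove asks clvmd/the DLM for an exclusive lock on the LV resource, whereas here a flock(1)-guarded rm on temp files plays both parts.

```shell
#!/bin/sh
# Sketch of the fix's logic with stand-ins: flock(1) replaces the cluster
# lock, rm(1) replaces the actual LV removal. Not the real lvremove code.
LV_LOCK=$(mktemp)   # stand-in for the LV's cluster lock resource
LV_DEV=$(mktemp)    # stand-in for the LV being removed

remove_lv() {
    # -n: fail at once if another "node" holds the EX lock (LV active
    # there), instead of silently removing an in-use volume.
    if flock -xn "$LV_LOCK" rm -f "$LV_DEV"; then
        echo "Logical volume removed"
    else
        echo "LV active on another node; refusing to remove" >&2
        return 1
    fi
}
remove_lv
rm -f "$LV_LOCK"
```

With no other holder the lock is granted and the removal proceeds; if another node held the lock, the guard would fail the same way the lvrename/lvreduce transcripts above do.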
Fix verified in lvm2-2.02.13-1.
An advisory has been issued which should help the problem described in this bug report. This report is therefore being closed with a resolution of ERRATA. For more information on the solution and/or where to find the updated files, please follow the link below. You may reopen this bug report if the solution does not work for you. http://rhn.redhat.com/errata/RHBA-2007-0046.html