Bug 517900 - bypass of exclusive vg/lv lock
Status: CLOSED ERRATA
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: lvm2-cluster
Version: 5.3
Hardware: All Linux
Priority: high
Severity: medium
Target Milestone: rc
Assigned To: Christine Caulfield
QA Contact: Cluster QE
Reported: 2009-08-17 14:45 EDT by Brem BELGUEBLI
Modified: 2016-04-26 10:10 EDT
Doc Type: Bug Fix
Last Closed: 2010-03-30 05:02:12 EDT

Description Brem BELGUEBLI 2009-08-17 14:45:15 EDT
Description of problem: vgchange -a y bypasses an exclusive VG/LV lock held by another node, if there is one.


Version-Release number of selected component (if applicable): 5.3


How reproducible: always


Steps to Reproduce:
1. Exclusively activate a clustered VG on node 1 (vgchange -a ey VGXX).
2. On node 2, vgchange -a ey VGXX exits with an error complaining about a lock (expected).
3. On node 2, vgchange -a y VGXX succeeds even though node 1 holds the exclusive lock.
  
Actual results:
Step 2 is OK: it fails.
Step 3 is permissive: it succeeds.

Expected results:
Step 3 should fail.

Additional info:
Comment 1 Brem BELGUEBLI 2009-09-26 14:20:22 EDT
Hello,

Is there any update on this case? Will it be looked into?

Regards
Comment 2 Alasdair Kergon 2009-09-26 18:07:45 EDT
But does (3) actually activate the LV on node 2 or not?  (Run 'dmsetup info -c' on each machine.)

What 'vgchange -ay VGXX' means is 'activate all the LVs in VGXX on every node in the cluster taking account of any restrictions defined in local lvm configuration files on each node'.  (Run 'lvm dumpconfig' on each machine.)
Comment 3 Brem BELGUEBLI 2009-09-26 19:04:56 EDT
Yes it does.

I understand your point of view, but to me it looks more like permissive behaviour than normal, desired behaviour (my own opinion).

All the cluster nodes are set up with cluster locking (locking_type=3).

From my understanding (and experience with other platforms), vgchange -a y should not be allowed unless the node holding the lock has released it (first vgchange -a en VGXXX on node1, then vgchange -a y on node2 [or node1] to make it active on all nodes), or unless the holding node is dead.

Wouldn't that be safer?

I always picture the unaware admin who logs on to a cluster node where the VG is not active (it is held exclusively by another node), believes he is doing the right thing by activating the VG (vgchange -a y), and then mounts the FS...
Comment 4 Brem BELGUEBLI 2009-09-30 08:18:19 EDT
Hello,

Can we expect something to be done about this, or do you consider it pointless?

I know you are making a lot of improvements to LVM, and I'm not expecting a fix or enhancement right away. I just need to know whether it is being considered and planned, and roughly when one should expect it.
 
Regards.

Brem
Comment 5 Christine Caulfield 2009-09-30 12:27:35 EDT
I have managed to reproduce this on a two node cluster.

One node:

[root@fanny ~]#  vgchange -aey guest

On another:

[root@anna ~]# vgchange -ay guest
  Error locking on node anna: Volume is busy on another node
  Error locking on node anna: Volume is busy on another node
  Error locking on node anna: Volume is busy on another node
  1 logical volume(s) in volume group "guest" now active

So ONE volume gets activated and the others don't. If you run 'vgchange -ay guest' again, all of them are activated, because the previous command changed all the locks to CR (concurrent read) mode.

I have a full set of logs and am investigating.

Chrissie
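
For readers unfamiliar with the lock modes mentioned above, here is a minimal sketch of the standard DLM lock-mode compatibility matrix. It assumes that exclusive activation takes an EX lock and that plain activation takes a CR lock (the CR part comes from the comment above, the EX part is an assumption). The `COMPATIBLE` table and `can_grant` helper are illustrative names, not part of LVM or clvmd.

```python
# Standard DLM lock modes, weakest to strongest:
# NL (null), CR (concurrent read), CW (concurrent write),
# PR (protected read), PW (protected write), EX (exclusive).

# For each mode already held, the modes another holder may still be granted.
COMPATIBLE = {
    "NL": {"NL", "CR", "CW", "PR", "PW", "EX"},
    "CR": {"NL", "CR", "CW", "PR", "PW"},
    "CW": {"NL", "CR", "CW"},
    "PR": {"NL", "CR", "PR"},
    "PW": {"NL", "CR"},
    "EX": {"NL"},
}


def can_grant(requested, held_modes):
    """Return True if `requested` is compatible with every mode in `held_modes`."""
    return all(requested in COMPATIBLE[held] for held in held_modes)


# Node 1 holds EX (assumed mode for 'vgchange -a ey'); node 2 asks for CR.
print(can_grant("CR", ["EX"]))  # False
# Two nodes activating in shared mode can coexist.
print(can_grant("CR", ["CR"]))  # True
```

Under these assumptions, a CR request from a second node should be refused while the first node holds EX, which matches the expected result in the original report.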
Comment 6 Alasdair Kergon 2009-10-01 09:55:28 EDT
Thanks for reporting this.  It's definitely a bug and a possible solution is being tested.
Comment 7 Christine Caulfield 2009-10-01 11:04:32 EDT
date: 2009/10/01 14:14:17;  author: ccaulfield;  state: Exp;  lines: +1 -0
Stop clvmd from automatically doing lock conversions. Now, if a lock
is granted at one mode and an attempt is made to convert it without the
LCK_CONVERT flag set, it will return errno=EBUSY.

It might break some things in other areas, but I doubt it.

Checking in WHATS_NEW;
/cvs/lvm2/LVM2/WHATS_NEW,v  <--  WHATS_NEW
new revision: 1.1286; previous revision: 1.1285
done
Checking in daemons/clvmd/lvm-functions.c;
/cvs/lvm2/LVM2/daemons/clvmd/lvm-functions.c,v  <--  lvm-functions.c
new revision: 1.69; previous revision: 1.68
done
Checking in lib/locking/locking.h;
/cvs/lvm2/LVM2/lib/locking/locking.h,v  <--  locking.h
new revision: 1.52; previous revision: 1.51
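
The behaviour change described in the commit message can be sketched as a small model. This is not the clvmd code: the `LockTable` class, its method names, and the `convert` argument (standing in for the LCK_CONVERT flag) are hypothetical, and treating a repeated request at the same mode as a no-op is an assumption rather than something stated in the commit.

```python
import errno


class LockTable:
    """Toy model of the lock modes one node holds per resource (hypothetical)."""

    def __init__(self):
        self._held = {}  # resource name -> mode currently granted

    def hold_lock(self, resource, mode, convert=False):
        """Return 0 on success, or an errno value on failure."""
        current = self._held.get(resource)
        if current is None or current == mode:
            # Nothing held yet, or the same mode requested again (assumed no-op).
            self._held[resource] = mode
            return 0
        if not convert:
            # Held at a different mode and no explicit conversion requested:
            # refuse instead of converting silently.
            return errno.EBUSY
        # Explicit conversion requested, so the mode may change.
        self._held[resource] = mode
        return 0

    def mode_of(self, resource):
        """Return the mode currently held on `resource`, or None."""
        return self._held.get(resource)


table = LockTable()
table.hold_lock("guest/lv0", "EX")
# An implicit EX -> CR conversion is now rejected with EBUSY.
print(table.hold_lock("guest/lv0", "CR") == errno.EBUSY)
# The lock stays at its original mode.
print(table.mode_of("guest/lv0"))
```

In this model, the second `vgchange -ay` run from comment 5 would no longer succeed by quietly downgrading the locks, because each conversion has to be requested explicitly.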
Comment 8 Milan Broz 2009-11-12 06:31:13 EST
Fix in lvm2-cluster-2_02_54-1_el5.
Comment 11 Corey Marthaler 2010-02-01 17:27:38 EST
Fix verified in lvm2-2.02.56-6.el5/lvm2-cluster-2.02.56-6.el5.


Node 1:
[root@taft-01 ~]# vgchange -a ey taft
  1 logical volume(s) in volume group "taft" now active
[root@taft-01 ~]# lvs
  LV       VG         Attr   LSize   Origin Snap%  Move Log         Copy%  Convert
  LogVol00 VolGroup00 -wi-ao  58.38G                                              
  LogVol01 VolGroup00 -wi-ao   9.75G                                              
  mirror   taft       mwi-a- 500.00M                    mirror_mlog 100.00        


Node 2:
[root@taft-02 ~]# lvs
  LV       VG         Attr   LSize   Origin Snap%  Move Log         Copy%  Convert
  LogVol00 VolGroup00 -wi-ao  58.38G                                              
  LogVol01 VolGroup00 -wi-ao   9.75G                                              
  mirror   taft       mwi--- 500.00M                    mirror_mlog               
[root@taft-02 ~]# vgchange -a ey taft
  Error locking on node taft-02: Volume is busy on another node
  0 logical volume(s) in volume group "taft" now active
[root@taft-02 ~]# vgchange -a ey taft
  Error locking on node taft-02: Volume is busy on another node
  0 logical volume(s) in volume group "taft" now active
[root@taft-02 ~]# vgchange -a y taft
  Error locking on node taft-02: Volume is busy on another node
  Error locking on node taft-03: Volume is busy on another node
  Error locking on node taft-04: Volume is busy on another node
  Error locking on node taft-01: Device or resource busy
  0 logical volume(s) in volume group "taft" now active
Comment 13 errata-xmlrpc 2010-03-30 05:02:12 EDT
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0299.html
