Bug 517900 - bypass of exclusive vg/lv lock

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 5
Classification:	Red Hat
Component:	lvm2-cluster
Sub Component:
Version:	5.3
Hardware:	All
OS:	Linux
Priority:	high
Severity:	medium
Target Milestone:	rc
Target Release:	---
Assignee:	Christine Caulfield
QA Contact:	Cluster QE
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2009-08-17 18:45 UTC by Brem BELGUEBLI
Modified:	2016-04-26 14:10 UTC
CC List:	12 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed:	2010-03-30 09:02:12 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2010:0299	0	normal	SHIPPED_LIVE	lvm2-cluster bug fix and enhancement update	2010-03-29 14:26:30 UTC

Description Brem BELGUEBLI 2009-08-17 18:45:15 UTC

Description of problem: vgchange -a y bypasses an exclusive VG/LV lock held by another node, if there is one.


Version-Release number of selected component (if applicable):5.3


How reproducible: always


Steps to Reproduce:
1. Exclusively activate a clustered VG on node 1 (vgchange -a ey VGXX).
2. On node 2, vgchange -a ey VGXX exits with an error complaining about a lock (as expected).
3. On node 2, vgchange -a y VGXX succeeds even though node 1 holds the exclusive lock.
  
Actual results:
Step 2 is OK: it fails.
Step 3 is permissive: it succeeds despite the exclusive lock held by node 1.

Expected results:
3. should fail

Additional info:

Comment 1 Brem BELGUEBLI 2009-09-26 18:20:22 UTC

Hello,

Is there any update on this case? Will it be looked into?

Regards

Comment 2 Alasdair Kergon 2009-09-26 22:07:45 UTC

But does (3) actually activate the LV on node 2 or not?  (Run 'dmsetup info -c' on each machine.)

What 'vgchange -ay VGXX' means is 'activate all the LVs in VGXX on every node in the cluster taking account of any restrictions defined in local lvm configuration files on each node'.  (Run 'lvm dumpconfig' on each machine.)

Comment 3 Brem BELGUEBLI 2009-09-26 23:04:56 UTC

Yes it does.

I understand your point of view, but it looks more like a permissive behaviour than a normal and desired one (my own opinion).

All the cluster nodes are setup with cluster locking (locking_type=3).

From my understanding (and experience from other platforms), vgchange -a y should not be allowed unless the node holding the lock has released it (first vgchange -a en VGXX on node 1, then vgchange -a y VGXX on node 2 [or node 1] to make it active on all nodes), or unless the holding node is dead.

Wouldn't that be safer?

I always imagine the unaware admin who logs on to a cluster node where the VG is not active (because it is held exclusively by another node) and, thinking he is doing the right thing, activates the VG (vgchange -a y) and then mounts the FS...

Comment 4 Brem BELGUEBLI 2009-09-30 12:18:19 UTC

Hello,

Can we expect something to be done about this, or do you consider it pointless?

I know you are making a lot of improvements to LVM, and I'm not expecting a fix or enhancement right away. I just need to know whether it is being considered and planned and, if so, roughly when one should expect it to be done.
 
Regards.

Brem

Comment 5 Christine Caulfield 2009-09-30 16:27:35 UTC

I have managed to reproduce this on a two node cluster.

On one node:

[root@fanny ~]#  vgchange -aey guest

On another:

[root@anna ~]# vgchange -ay guest
  Error locking on node anna: Volume is busy on another node
  Error locking on node anna: Volume is busy on another node
  Error locking on node anna: Volume is busy on another node
  1 logical volume(s) in volume group "guest" now active

So ONE volume gets activated and the others don't. If you run 'vgchange -ay guest' again, all of them will be activated, because the locks have all been changed to CR by the previous command.

I have a full set of logs and am investigating.

Chrissie

Comment 6 Alasdair Kergon 2009-10-01 13:55:28 UTC

Thanks for reporting this.  It's definitely a bug and a possible solution is being tested.

Comment 7 Christine Caulfield 2009-10-01 15:04:32 UTC

date: 2009/10/01 14:14:17;  author: ccaulfield;  state: Exp;  lines: +1 -0
Stop clvmd from automatically doing lock conversions. Now, if a lock
is granted at one mode and an attempt is made to convert it without the
LCK_CONVERT flag set, it will return errno=EBUSY.

It might break some things in other areas, but I doubt it.

Checking in WHATS_NEW;
/cvs/lvm2/LVM2/WHATS_NEW,v  <--  WHATS_NEW
new revision: 1.1286; previous revision: 1.1285
done
Checking in daemons/clvmd/lvm-functions.c;
/cvs/lvm2/LVM2/daemons/clvmd/lvm-functions.c,v  <--  lvm-functions.c
new revision: 1.69; previous revision: 1.68
done
Checking in lib/locking/locking.h;
/cvs/lvm2/LVM2/lib/locking/locking.h,v  <--  locking.h
new revision: 1.52; previous revision: 1.51
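
As a rough illustration of the behaviour change described in the commit message above, here is a toy Python model. It is not clvmd code: the NodeLocks class, the resource name, and the numeric value given to LCK_CONVERT are all hypothetical, and treating a request for the mode already held as a no-op is an assumption. Only the LCK_CONVERT name, the EX and CR modes, and the EBUSY result come from this report.

```python
import errno

# Placeholder bit value for illustration only; the real definition
# lives in lib/locking/locking.h and may differ.
LCK_CONVERT = 0x1


class NodeLocks:
    """Toy model of the locks held by one node's daemon (hypothetical)."""

    def __init__(self, auto_convert):
        # auto_convert=True models the behaviour before the fix,
        # auto_convert=False models the behaviour after it.
        self.auto_convert = auto_convert
        self.held = {}  # resource name -> mode string, e.g. "EX" or "CR"

    def lock(self, resource, mode, flags=0):
        current = self.held.get(resource)
        # Not held yet, or already held at the requested mode: just grant.
        if current is None or current == mode:
            self.held[resource] = mode
            return mode
        # Held at a different mode: this is a conversion request.
        if self.auto_convert or (flags & LCK_CONVERT):
            self.held[resource] = mode
            return mode
        # After the fix, an implicit conversion is refused with EBUSY.
        raise OSError(errno.EBUSY,
                      "held at %s, conversion to %s refused" % (current, mode))
```

With auto_convert=True, a node holding "guest/lv0" at EX that is asked for CR silently drops to CR, which matches the observation in comment 5 that a second 'vgchange -ay' then succeeds everywhere. With auto_convert=False the same request raises OSError with errno EBUSY unless LCK_CONVERT is passed.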

Comment 8 Milan Broz 2009-11-12 11:31:13 UTC

Fix in lvm2-cluster-2_02_54-1_el5.

Comment 11 Corey Marthaler 2010-02-01 22:27:38 UTC

Fix verified in lvm2-2.02.56-6.el5/lvm2-cluster-2.02.56-6.el5.


Node 1:
[root@taft-01 ~]# vgchange -a ey taft
  1 logical volume(s) in volume group "taft" now active
[root@taft-01 ~]# lvs
  LV       VG         Attr   LSize   Origin Snap%  Move Log         Copy%  Convert
  LogVol00 VolGroup00 -wi-ao  58.38G                                              
  LogVol01 VolGroup00 -wi-ao   9.75G                                              
  mirror   taft       mwi-a- 500.00M                    mirror_mlog 100.00        


Node 2:
[root@taft-02 ~]# lvs
  LV       VG         Attr   LSize   Origin Snap%  Move Log         Copy%  Convert
  LogVol00 VolGroup00 -wi-ao  58.38G                                              
  LogVol01 VolGroup00 -wi-ao   9.75G                                              
  mirror   taft       mwi--- 500.00M                    mirror_mlog               
[root@taft-02 ~]# vgchange -a ey taft
  Error locking on node taft-02: Volume is busy on another node
  0 logical volume(s) in volume group "taft" now active
[root@taft-02 ~]# vgchange -a ey taft
  Error locking on node taft-02: Volume is busy on another node
  0 logical volume(s) in volume group "taft" now active
[root@taft-02 ~]# vgchange -a y taft
  Error locking on node taft-02: Volume is busy on another node
  Error locking on node taft-03: Volume is busy on another node
  Error locking on node taft-04: Volume is busy on another node
  Error locking on node taft-01: Device or resource busy
  0 logical volume(s) in volume group "taft" now active
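
In the lvs listings above, the two nodes differ in the Attr column for the "mirror" LV: mwi-a- on taft-01 versus mwi--- on taft-02. A minimal sketch for checking this from captured output follows. It assumes the whitespace-separated layout shown above, with LV, VG, and Attr as the first three columns and the fifth Attr character being the activation state ('a' for active). active_lvs is a hypothetical helper, not part of LVM.

```python
def active_lvs(lvs_output, vg):
    """Return the names of LVs in `vg` whose Attr state character is 'a'."""
    active = []
    for line in lvs_output.splitlines():
        fields = line.split()
        # Skip blank lines, the header row, and lines too short
        # to hold the LV, VG, and Attr columns.
        if len(fields) < 3 or fields[0] == "LV":
            continue
        name, group, attr = fields[0], fields[1], fields[2]
        # The fifth attribute character is the state; 'a' means active.
        if group == vg and len(attr) >= 5 and attr[4] == "a":
            active.append(name)
    return active
```

Fed the taft-01 listing, active_lvs(text, "taft") should report the "mirror" LV; fed the taft-02 listing, it should report nothing, which is what the verification above relies on.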

Comment 13 errata-xmlrpc 2010-03-30 09:02:12 UTC

An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0299.html
