Bug 677817

Summary: vgchange returns success when exclusive activation fails
Product: Red Hat Enterprise Linux 6
Reporter: Nate Straz <nstraz>
Component: lvm2
Assignee: LVM and device-mapper development team <lvm-team>
Status: CLOSED CURRENTRELEASE
QA Contact: Corey Marthaler <cmarthal>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 6.1
CC: agk, coughlan, dwysocha, heinzm, jbrassow, mbroz, prajnoha, prockai, syeghiay, thornber, zkabelac
Target Milestone: rc
Keywords: Regression
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: lvm2-2.02.83-3.el6
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Clones: 1191724
Environment:
Last Closed: 2011-05-03 14:56:23 UTC
Type: ---
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1191724

Description Nate Straz 2011-02-15 22:55:51 UTC
Description of problem:

When an LV is already activated exclusively on one node and another node tries to activate it exclusively, the LV does not become active on the second node, but the vgchange command still returns success.

Version-Release number of selected component (if applicable):
2.6.32-114.0.1.el6.x86_64

lvm2-2.02.83-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
lvm2-libs-2.02.83-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
lvm2-cluster-2.02.83-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
udev-147-2.33.el6    BUILT: Wed Feb  9 09:56:24 CST 2011
device-mapper-1.02.62-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
device-mapper-libs-1.02.62-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
device-mapper-event-1.02.62-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
device-mapper-event-libs-1.02.62-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011
cmirror-2.02.83-2.el6    BUILT: Tue Feb  8 10:10:57 CST 2011


How reproducible:
Every time

Steps to Reproduce:
1. On node A: vgchange -aye $VG
2. On node B: vgchange -aye $VG   <- this should return non-zero (see the sketch below)
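
A minimal reproduction sketch, assuming passwordless ssh from a control host to the two cluster nodes (the node names dash-01/dash-02 and the VG name are taken from the output below; adjust as needed):

  #!/bin/bash
  # Reproduce the silently ignored exclusive activation.
  VG=${VG:-linear_9_5581}
  ssh dash-01 "vgchange -aye $VG"            # node A takes the exclusive activation
  if ssh dash-02 "vgchange -aye $VG"; then   # node B's attempt should fail
      echo "BUG: vgchange exited 0 even though nothing was activated"
  else
      echo "OK: vgchange returned non-zero as expected"
  fi

With the bug present, the first branch is taken: dash-02 prints "0 logical volume(s) ... now active" yet still exits 0, as shown in the actual results below.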

  
Actual results:
[root@dash-01 audit]# lvs
  LV             VG            Attr   LSize   Origin Snap%  Move Log Copy%  Conv
  linear_9_55810 linear_9_5581 -wima- 685.68g
  lv_home        vg_dash01     -wi-ao  31.87g
  lv_root        vg_dash01     -wi-ao  35.29g
  lv_swap        vg_dash01     -wi-ao   6.86g

[root@dash-02 audit]# lvs
  LV             VG            Attr   LSize   Origin Snap%  Move Log Copy%  Conv
  linear_9_55810 linear_9_5581 -wim-- 685.68g
  lv_home        vg_dash02     -wi-ao  31.87g
  lv_root        vg_dash02     -wi-ao  35.29g
  lv_swap        vg_dash02     -wi-ao   6.86g
[root@dash-02 audit]# vgchange -aye linear_9_5581; echo $?
  0 logical volume(s) in volume group "linear_9_5581" now active
0

Expected results:
vgchange should return non-zero

Additional info:

Comment 3 Alasdair Kergon 2011-02-16 01:24:45 UTC
That old chestnut!  If it's already active, the command has nothing to do, so should it therefore fail?  Or is it enough to say that you wanted it active, it is active, so return success?

I'm not sure we're ever going to resolve this to everyone's satisfaction.

Comment 4 Alasdair Kergon 2011-02-16 01:30:25 UTC
(BTW Remember that vgchange -a is a clustered command which acts symmetrically on all nodes unless 'l' is used.  vgchange -aey means activate it exclusively on any one node, subject to any tag and lvm.conf constraints.  We don't support '-aely' yet.)
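
For illustration, a short sketch of the activation variants being described, with $VG standing in for a clustered volume group:

  vgchange -aey $VG    # exclusive activation; lands on any one node,
                       #   subject to tag and lvm.conf constraints
  vgchange -aly $VG    # non-exclusive activation on the local node only
  vgchange -an $VG     # deactivate on all nodes
  # Combining 'e' and 'l' (local exclusive activation) is not supported
  # in this release, as noted above.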

Comment 5 Alasdair Kergon 2011-02-16 01:32:56 UTC
The "0 LVs active" message only queries local LVs.  We probably now have the infrastructure available to include remotely active LVs in those totals.

Comment 7 Nate Straz 2011-02-16 13:47:38 UTC
Alasdair, this is a regression.  We ran this test throughout the RHEL6.0 process.

Here is the test output from the RHEL6.0-20100818.0 tree, which contained lvm2-2.02.72-8.el6.x86_64.

EXCLUSIVE VOLUME GROUP LOCKING
deactivating volume group
grabing the exclusive lock on dash-01
attempting to also grab an exclusive lock on dash-02
  Error locking on node dash-02: Volume is busy on another node
attempting to grab a non exclusive lock on dash-02
  Error locking on node dash-02: Volume is busy on another node
  Error locking on node dash-03: Volume is busy on another node
  Error locking on node dash-01: Device or resource busy
attempting to also grab an exclusive lock on dash-03
  Error locking on node dash-03: Volume is busy on another node
attempting to grab a non exclusive lock on dash-03
  Error locking on node dash-03: Volume is busy on another node
  Error locking on node dash-02: Volume is busy on another node
  Error locking on node dash-01: Device or resource busy
releasing the exclusive lock on dash-01

Comment 8 Tom Coughlan 2011-03-21 23:06:33 UTC
Does anyone know why this behavior appears to have changed between 6.0 and 6.1?

Comment 12 Nate Straz 2011-04-28 20:01:41 UTC
Looking through my test logs shows that this behavior was fixed at some point during the release.  A test run against lvm2-2.02.83-3.el6.x86_64 passed this part of our tests.

Comment 13 Milan Broz 2011-05-03 09:22:42 UTC
Nate, do I read comment #12 correctly that it is in fact fixed in current 6.1?