Bug 1357162

Summary: RFE: if vg holding the lock is removed, remind user that 'lvmlockctl --gl-enable' is required to be run
Product: Red Hat Enterprise Linux 7 Reporter: Corey Marthaler <cmarthal>
Component: lvm2 Assignee: LVM and device-mapper development team <lvm-team>
lvm2 sub component: LVM lock daemon / lvmlockd QA Contact: cluster-qe <cluster-qe>
Status: CLOSED ERRATA Docs Contact:
Severity: low    
Priority: unspecified CC: agk, heinzm, jbrassow, prajnoha, teigland, zkabelac
Version: 7.3 Keywords: FutureFeature
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: lvm2-2.02.161-3.el7 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-11-04 04:15:58 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Corey Marthaler 2016-07-15 23:21:54 UTC
Description of problem:

I've run into this many times while testing. Basically, I remove the VG that holds the global lock and leave one VG remaining. I then assume that stopping and starting the lock on the remaining VG will start the global lock there, since that is how it was started on the VG that originally held it.

I end up in a manual --lock-start, --lock-stop, --lock-start loop, since all I see is "Global lock failed: check that global lockspace is started". That always eventually reminds me to read the manual and, to be fair, this very issue is documented on line 381:


       An opposite problem can occur if the VG holding the global lock is removed.  In this case, no global lock will exist following the vgremove, and subsequent LVM commands
       will fail to acquire it.  In this case, the global lock needs to be manually enabled in one of the remaining sanlock VGs with the command:

       lvmlockctl --gl-enable <vgname>


It would be helpful for me, and for any user who hasn't read the entire manual, to have a reminder in the "check that global lockspace is started" message about how exactly to do that. The preferred setup is a small sanlock VG dedicated to holding the global lock, so a user may never become familiar with the lvmlockctl command or with how to manually enable a global lock.
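
Until the message itself carries the reminder, the same effect can be approximated outside LVM. This is only a minimal sketch: the function name gl_hint and the wording of the hint are made up for illustration and are not part of lvm2. Only the matched error text and the lvmlockctl command come from this report.

```shell
# Hypothetical helper, not part of lvm2.
# Reads command output on stdin, passes it through unchanged, and appends
# a reminder if the generic global-lock failure message was seen.
gl_hint() {
    found=0
    # The "|| [ -n "$line" ]" part keeps a final line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "$line"
        case $line in
            *"Global lock failed: check that global lockspace is started"*)
                found=1 ;;
        esac
    done
    if [ "$found" -eq 1 ]; then
        printf '%s\n' "  Hint: if the VG holding the global lock was removed, run: lvmlockctl --gl-enable <vgname>"
    fi
}
```

It would be used as a filter, for example `vgs 2>&1 | gl_hint`, which merges stderr into stdout so that the error text reaches the function.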



Version-Release number of selected component (if applicable):
lvm2-2.02.160-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
lvm2-libs-2.02.160-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
lvm2-cluster-2.02.160-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
device-mapper-1.02.130-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
device-mapper-libs-1.02.130-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
device-mapper-event-1.02.130-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
device-mapper-event-libs-1.02.130-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
device-mapper-persistent-data-0.6.2-0.1.rc8.el7    BUILT: Wed May  4 02:56:34 CDT 2016
cmirror-2.02.160-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016
sanlock-3.3.0-1.el7    BUILT: Wed Feb 24 09:52:30 CST 2016
sanlock-lib-3.3.0-1.el7    BUILT: Wed Feb 24 09:52:30 CST 2016
lvm2-lockd-2.02.160-1.el7    BUILT: Wed Jul  6 11:16:47 CDT 2016

Comment 2 Corey Marthaler 2016-07-15 23:27:47 UTC
Another reason this is confusing for me is that in the start, stop, start loop I get myself into, the lock start passes even though the output does mention the global lockspace not being started, which makes it surprising when I can't actually do anything to the VG.


[root@mckinley-01 ~]# vgchange --lockstart helter_skelter
  Skipping global lock: lockspace not found or started
  Starting locking.  Waiting until locks are ready...
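
Because the lock start exits successfully here, a test script has to look at the output text rather than the exit status. A rough sketch of such a check follows; the function name global_lock_ok is hypothetical, and the matched strings are simply the global lock messages quoted in this report, not an exhaustive list.

```shell
# Hypothetical check, not part of lvm2.
# $1 is the captured output of an LVM command. Returns 0 only when none
# of the global-lock messages quoted in this report appear in it.
global_lock_ok() {
    case $1 in
        *"Skipping global lock"*|*"Missing global lock"*|*"Global lock failed"*)
            return 1 ;;
        *)
            return 0 ;;
    esac
}
```

For example: `out=$(vgchange --lockstart helter_skelter 2>&1); global_lock_ok "$out" || echo "global lock is not available"`.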

Comment 3 David Teigland 2016-07-25 19:51:27 UTC
Pushed a fix here:
https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=d0e15b86b53bd4960a7c15a7771548ab4aface8b

Example with VGs "bb" and "gg", where the global lock is in "gg":

# vgremove gg
  VG gg held the sanlock global lock, enable global lock in another VG.
  Volume group "gg" successfully removed

# vgs
  Skipping global lock: VG with global lock was removed
  VG           #PV #LV #SN Attr   VSize   VFree  
  bb             1   8   0 wz--ns 931.01g 919.76g

# vgchange --lock-stop
  Skipping global lock: VG with global lock was removed

# vgchange --lock-start
  Skipping global lock: VG with global lock was removed
  VG bb starting sanlock lockspace
  Starting locking.  Waiting for sanlock may take 20 sec to 3 min...
  Missing global lock: global lock was lost by removing a previous VG.
  To enable the global lock in another VG, see lvmlockctl --gl-enable.

# lvmlockctl --gl-enable bb

# vgs
  VG           #PV #LV #SN Attr   VSize   VFree  
  bb             1   8   0 wz--ns 931.01g 919.76g


(If lvmlockd is restarted before the global lock issue is resolved, the helpful warnings go away and the old error messages return, because the record of the VG removal is kept in lvmlockd's memory.)
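
To pick a VG for the --gl-enable step, the remaining shared VGs can be read off the vgs attributes. The sketch below assumes that an "s" in the sixth Attr character marks a shared VG, as in the "wz--ns" output above. The function name shared_vgs is hypothetical. A shared VG is not necessarily a sanlock VG, so the candidate still has to be checked by hand.

```shell
# Hypothetical helper, not part of lvm2.
# Reads "vg_name vg_attr" pairs on stdin, one VG per line, and prints the
# names of VGs whose sixth attribute character is "s" (assumed: shared).
shared_vgs() {
    awk 'length($2) >= 6 && substr($2, 6, 1) == "s" { print $1 }'
}
```

Typical use would be `vgs --noheadings -o vg_name,vg_attr | shared_vgs`, followed by `lvmlockctl --gl-enable <vgname>` on one of the printed names.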

Comment 5 Corey Marthaler 2016-08-03 21:46:26 UTC
Verified that the message now reminds users to run 'lvmlockctl --gl-enable' on one of the nodes attempting a lock start.

3.10.0-480.el7.x86_64

lvm2-2.02.161-3.el7    BUILT: Thu Jul 28 09:31:24 CDT 2016
lvm2-libs-2.02.161-3.el7    BUILT: Thu Jul 28 09:31:24 CDT 2016
lvm2-cluster-2.02.161-3.el7    BUILT: Thu Jul 28 09:31:24 CDT 2016
device-mapper-1.02.131-3.el7    BUILT: Thu Jul 28 09:31:24 CDT 2016
device-mapper-libs-1.02.131-3.el7    BUILT: Thu Jul 28 09:31:24 CDT 2016
device-mapper-event-1.02.131-3.el7    BUILT: Thu Jul 28 09:31:24 CDT 2016
device-mapper-event-libs-1.02.131-3.el7    BUILT: Thu Jul 28 09:31:24 CDT 2016
device-mapper-persistent-data-0.6.3-1.el7    BUILT: Fri Jul 22 05:29:13 CDT 2016
cmirror-2.02.161-3.el7    BUILT: Thu Jul 28 09:31:24 CDT 2016
sanlock-3.3.0-1.el7    BUILT: Wed Feb 24 09:52:30 CST 2016
sanlock-lib-3.3.0-1.el7    BUILT: Wed Feb 24 09:52:30 CST 2016
lvm2-lockd-2.02.161-3.el7    BUILT: Thu Jul 28 09:31:24 CDT 2016




[root@harding-02 ~]# sanlock gets -h 1
s lvm_snapper_thinp:1727:/dev/mapper/snapper_thinp-lvmlock:0 
h 290 gen 1 timestamp 445152 LIVE
h 1727 gen 1 timestamp 445146 LIVE
s lvm_global:1727:/dev/mapper/global-lvmlock:0 
h 290 gen 1 timestamp 445165 LIVE
h 1727 gen 1 timestamp 445158 LIVE

[root@harding-02 ~]# vgchange -an global
  0 logical volume(s) in volume group "global" now active
[root@harding-02 ~]# vgremove global
  Lockspace for "global" not stopped on other hosts

# other node
[root@harding-03 ~]# vgchange --lock-stop global

[root@harding-02 ~]# vgremove global
  VG global held the sanlock global lock, enable global lock in another VG.
  Volume group "global" successfully removed

[root@harding-02 ~]# sanlock gets -h 1
s lvm_snapper_thinp:1727:/dev/mapper/snapper_thinp-lvmlock:0 
h 290 gen 1 timestamp 445234 LIVE
h 1727 gen 1 timestamp 445228 LIVE

[root@harding-02 ~]# lvs
  Skipping global lock: VG with global lock was removed
  LV           VG              Attr       LSize   Pool Origin Data%  Meta%
  POOL         snapper_thinp   twi-aot---   4.00g             0.01   1.86
  origin       snapper_thinp   Vwi-a-t---   1.00g POOL        0.01
  other1       snapper_thinp   Vwi-a-t---   1.00g POOL        0.01
  other2       snapper_thinp   Vwi-a-t---   1.00g POOL        0.01
  other3       snapper_thinp   Vwi-a-t---   1.00g POOL        0.01
  other4       snapper_thinp   Vwi-a-t---   1.00g POOL        0.01
  other5       snapper_thinp   Vwi-a-t---   1.00g POOL        0.01
  pool_convert snapper_thinp   Vwi-a-t---   1.00g POOL origin 0.01

[root@harding-02 ~]# lvcreate --activate ey -V 1G -T snapper_thinp/POOL -n other6
  WARNING: Sum of all thin volume sizes (8.00 GiB) exceeds the size of thin pool snapper_thinp/POOL (4.00 GiB)!
  For thin pool auto extension activation/thin_pool_autoextend_threshold should be below 100.
  Logical volume "other6" created.

[root@harding-02 ~]# vgchange -an snapper_thinp
  0 logical volume(s) in volume group "snapper_thinp" now active

[root@harding-02 ~]# vgchange --lock-stop snapper_thinp
[root@harding-03 ~]# vgchange --lock-stop snapper_thinp

[root@harding-02 ~]# vgs
  Skipping global lock: VG with global lock was removed
  Reading VG snapper_thinp without a lock.
  VG              #PV #LV #SN Attr   VSize   VFree
  rhel_harding-02   3   3   0 wz--n- 278.47g    0 
  snapper_thinp     5   9   0 wz--ns   1.22t 1.21t

[root@harding-03 ~]# vgchange --lock-start snapper_thinp
  Skipping global lock: lockspace not found or started
  VG snapper_thinp starting sanlock lockspace
  Starting locking.  Waiting for sanlock may take 20 sec to 3 min...


[root@harding-02 ~]# vgchange --lock-start snapper_thinp
  Skipping global lock: VG with global lock was removed
  VG snapper_thinp starting sanlock lockspace
  Starting locking.  Waiting for sanlock may take 20 sec to 3 min...
  Missing global lock: global lock was lost by removing a previous VG.
  To enable the global lock in another VG, see lvmlockctl --gl-enable.


[root@harding-02 ~]# lvmlockctl --gl-enable snapper_thinp
[root@harding-02 ~]# vgchange --lock-start snapper_thinp
  Starting locking.  Waiting for sanlock may take 20 sec to 3 min...
[root@harding-03 ~]# vgchange --lock-start snapper_thinp
  Starting locking.  Waiting for sanlock may take 20 sec to 3 min...

[root@harding-03 ~]# sanlock gets -h 1  
s lvm_snapper_thinp:290:/dev/mapper/snapper_thinp-lvmlock:0 
h 290 gen 2 timestamp 445582 LIVE
h 1727 gen 2 timestamp 445604 LIVE

[root@harding-02 ~]# sanlock gets -h 1  
s lvm_snapper_thinp:1727:/dev/mapper/snapper_thinp-lvmlock:0 
h 290 gen 2 timestamp 445787 LIVE
h 1727 gen 2 timestamp 445788 LIVE

[root@harding-02 ~]# lvmlockctl --gl-disable snapper_thinp

[root@harding-02 ~]# sanlock gets -h 1  
s lvm_snapper_thinp:1727:/dev/mapper/snapper_thinp-lvmlock:0 
h 290 gen 2 timestamp 445807 LIVE
h 1727 gen 2 timestamp 445809 LIVE
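
For scripted checks of the lockspace listings above, the names can be pulled out of the "s" lines. This is a sketch only: the function name sanlock_lockspaces is hypothetical, and it assumes the "s name:host_id:path:offset" layout seen in this transcript. Note that the listing looks the same before and after --gl-disable, so it shows which lockspaces are joined, not which VG currently has the global lock enabled.

```shell
# Hypothetical helper, not part of sanlock or lvm2.
# Reads "sanlock gets" style output on stdin and prints the lockspace
# name from each line whose first field is "s".
sanlock_lockspaces() {
    awk '$1 == "s" { split($2, f, ":"); print f[1] }'
}
```

For example, `sanlock gets -h 1 | sanlock_lockspaces | grep -qx lvm_global` would succeed in the first listing above and fail after "vgremove global".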

Comment 7 errata-xmlrpc 2016-11-04 04:15:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-1445.html