Bug 1261639

Summary: pvcreate|remove doesn't work w/ lvmlockd running
Product: Red Hat Enterprise Linux 7 Reporter: Corey Marthaler <cmarthal>
Component: lvm2 Assignee: David Teigland <teigland>
lvm2 sub component: LVM lock daemon / lvmlockd QA Contact: cluster-qe <cluster-qe>
Status: CLOSED NOTABUG Docs Contact:
Severity: medium    
Priority: medium CC: agk, heinzm, jbrassow, prajnoha, teigland, zkabelac
Version: 7.2 Keywords: Triaged
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2016-01-19 16:12:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Corey Marthaler 2015-09-09 20:06:43 UTC
Description of problem:
It appears a vgcreate is the only way to create a PV with sanlock and lvmlockd running

[root@harding-03 ~]# systemctl start sanlock
[root@harding-02 ~]# systemctl start sanlock

[root@harding-03 ~]# lvmlockd
[root@harding-02 ~]# lvmlockd

[root@harding-03 ~]# systemctl status sanlock
● sanlock.service - Shared Storage Lease Manager
   Loaded: loaded (/usr/lib/systemd/system/sanlock.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2015-09-09 19:46:50 CDT; 20s ago
  Process: 3344 ExecStart=/lib/systemd/systemd-sanlock start (code=exited, status=0/SUCCESS)
 Main PID: 3363 (sanlock)
   CGroup: /system.slice/sanlock.service
           ├─3363 sanlock daemon -U sanlock -G sanlock
           └─3364 sanlock daemon -U sanlock -G sanlock

Sep 09 19:46:50 harding-03.lab.msp.redhat.com systemd[1]: Starting Shared Storage Lease Manager...
Sep 09 19:46:50 harding-03.lab.msp.redhat.com systemd-sanlock[3344]: Starting sanlock: [  OK  ]
Sep 09 19:46:50 harding-03.lab.msp.redhat.com systemd[1]: Started Shared Storage Lease Manager.

[root@harding-02 ~]# systemctl status sanlock
● sanlock.service - Shared Storage Lease Manager
   Loaded: loaded (/usr/lib/systemd/system/sanlock.service; disabled; vendor preset: disabled)
   Active: active (running) since Wed 2015-09-09 19:47:04 CDT; 17s ago
  Process: 4028 ExecStart=/lib/systemd/systemd-sanlock start (code=exited, status=0/SUCCESS)
 Main PID: 4047 (sanlock)
   CGroup: /system.slice/sanlock.service
           ├─4047 sanlock daemon -U sanlock -G sanlock
           └─4048 sanlock daemon -U sanlock -G sanlock


[root@harding-03 ~]# pvscan
  Skipping global lock: lockspace not found or started
  PV /dev/sda2   VG rhel_harding-03   lvm2 [92.67 GiB / 60.00 MiB free]
  PV /dev/sdb1   VG rhel_harding-03   lvm2 [93.16 GiB / 0    free]
  PV /dev/sdc1   VG rhel_harding-03   lvm2 [93.16 GiB / 0    free]
  Total: 3 [278.98 GiB] / in use: 3 [278.98 GiB] / in no VG: 0 [0   ]

[root@harding-03 ~]# pvcreate  /dev/mapper/mpatha1
  Global lock failed: check that global lockspace is started
[root@harding-03 ~]# pvcreate  /dev/mapper/mpathb1
  Global lock failed: check that global lockspace is started
[root@harding-03 ~]# pvcreate  /dev/mapper/mpathc1
  Global lock failed: check that global lockspace is started

[root@harding-03 ~]# pvs
  Skipping global lock: lockspace not found or started
  PV         VG              Fmt  Attr PSize  PFree 
  /dev/sda2  rhel_harding-03 lvm2 a--  92.67g 60.00m
  /dev/sdb1  rhel_harding-03 lvm2 a--  93.16g     0 
  /dev/sdc1  rhel_harding-03 lvm2 a--  93.16g     0 


[root@harding-03 ~]#  vgcreate --shared VG1 /dev/mapper/mpatha1
  WARNING: shared lock type "sanlock" and lvmlockd are Technology Preview.
  For more information on Technology Preview features, visit:
  https://access.redhat.com/support/offerings/techpreview/
  Enabling sanlock global lock
  Physical volume "/dev/mapper/mpatha1" successfully created
  Physical volume "/dev/mapper/mpatha1" successfully created
  Logical volume "lvmlock" created.
  Physical volume "/dev/mapper/mpatha1" successfully created
  Volume group "VG1" successfully created
  VG VG1 starting sanlock lockspace
  Starting locking.  Waiting until locks are ready...


Version-Release number of selected component (if applicable):
3.10.0-306.el7.x86_64

lvm2-2.02.130-1.el7    BUILT: Wed Sep  9 02:44:18 CDT 2015
lvm2-libs-2.02.130-1.el7    BUILT: Wed Sep  9 02:44:18 CDT 2015
lvm2-cluster-2.02.130-1.el7    BUILT: Wed Sep  9 02:44:18 CDT 2015
device-mapper-1.02.107-1.el7    BUILT: Wed Sep  9 02:44:18 CDT 2015
device-mapper-libs-1.02.107-1.el7    BUILT: Wed Sep  9 02:44:18 CDT 2015
device-mapper-event-1.02.107-1.el7    BUILT: Wed Sep  9 02:44:18 CDT 2015
device-mapper-event-libs-1.02.107-1.el7    BUILT: Wed Sep  9 02:44:18 CDT 2015
device-mapper-persistent-data-0.5.5-1.el7    BUILT: Thu Aug 13 09:58:10 CDT 2015
cmirror-2.02.130-1.el7    BUILT: Wed Sep  9 02:44:18 CDT 2015
sanlock-3.2.4-1.el7    BUILT: Fri Jun 19 12:48:49 CDT 2015
sanlock-lib-3.2.4-1.el7    BUILT: Fri Jun 19 12:48:49 CDT 2015
lvm2-lockd-2.02.130-1.el7    BUILT: Wed Sep  9 02:44:18 CDT 2015


How reproducible:
Every time

Comment 1 David Teigland 2015-09-09 20:12:29 UTC
Good point, that should be mentioned in the man page under "creating the first sanlock VG".  (After the first sanlock VG exists, you can do pvcreate.)

Comment 2 Corey Marthaler 2015-09-09 20:20:27 UTC
Looks like pvremove is the same way.


[root@harding-03 ~]# vgremove VG2
  Volume group "VG2" successfully removed

[root@harding-03 ~]# pvscan
  Skipping global lock: lockspace not found or started
  PV /dev/sda2             VG rhel_harding-03   lvm2 [92.67 GiB / 60.00 MiB free]
  PV /dev/sdb1             VG rhel_harding-03   lvm2 [93.16 GiB / 0    free]
  PV /dev/sdc1             VG rhel_harding-03   lvm2 [93.16 GiB / 0    free]
  PV /dev/mapper/mpathh1                        lvm2 [250.00 GiB]
  Total: 4 [528.98 GiB] / in use: 3 [278.98 GiB] / in no VG: 1 [250.00 GiB]

[root@harding-03 ~]# pvremove /dev/mapper/mpathh1
  Global lock failed: check that global lockspace is started

[root@harding-03 ~]# pvscan
  Skipping global lock: lockspace not found or started
  PV /dev/sda2             VG rhel_harding-03   lvm2 [92.67 GiB / 60.00 MiB free]
  PV /dev/sdb1             VG rhel_harding-03   lvm2 [93.16 GiB / 0    free]
  PV /dev/sdc1             VG rhel_harding-03   lvm2 [93.16 GiB / 0    free]
  PV /dev/mapper/mpathh1                        lvm2 [250.00 GiB]
  Total: 4 [528.98 GiB] / in use: 3 [278.98 GiB] / in no VG: 1 [250.00 GiB]

[root@harding-03 ~]# pvremove -ff --config global/use_lvmlockd=0 /dev/mapper/mpathh1
  Labels on physical volume "/dev/mapper/mpathh1" successfully wiped

Comment 3 David Teigland 2015-09-09 20:24:27 UTC
Here's the updated man page section:

   creating the first sanlock VG
       Creating the first sanlock VG is not protected by locking and  requires
       special  attention.  This is because sanlock locks exist within the VG,
       so they are not available until the VG exists.  The  first  sanlock  VG
       will contain the "global lock".

       · The  first  vgcreate  command  needs to be given the path to a device
         that has not yet been initialized with pvcreate.  The pvcreate
         initialization will be done by vgcreate.  This is because the
         pvcreate command requires the global lock, which will not be
         available until after the first sanlock VG is created.

       · While  running  vgcreate  for  the  first sanlock VG, ensure that the
         device being used is not used by another LVM command.  Allocation  of
         shared devices is usually protected by the global lock, but this
         cannot be done for the first sanlock VG which will hold the global lock.

       · While running vgcreate for the first sanlock VG, ensure that  the  VG
         name being used is not used by another LVM command.  Uniqueness of VG
         names is usually ensured by the global lock.

       · Because the first sanlock VG will contain the global  lock,  this  VG
         needs to be accessible to all hosts that will use sanlock shared VGs.
         All hosts will need to use the global lock from the first sanlock VG.

         See below for more information  about  managing  the  sanlock  global
         lock.
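
The ordering described in this section can be sketched as a small dry-run helper. This is only an illustration under stated assumptions: the function name first_sanlock_vg_plan is hypothetical, the VG, device, and host names are taken from the transcripts in this report, and the helper prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: print (do not run) the commands that bring up the first
# sanlock VG. The function name and argument order are hypothetical.
# Usage: first_sanlock_vg_plan VG DEVICE HOST...
first_sanlock_vg_plan() {
    vg=$1
    dev=$2
    shift 2
    # vgcreate initializes the PV itself. Do not run pvcreate first:
    # pvcreate needs the global lock, which does not exist yet.
    echo "vgcreate --shared $vg $dev"
    # Every host that will use sanlock shared VGs starts this lockspace.
    for host in "$@"; do
        echo "$host: vgchange --lock-start $vg"
    done
}

first_sanlock_vg_plan global /dev/mapper/mpatha1 harding-02 harding-03
```

Once the lockspace of this first VG is started, ordinary pvcreate calls should be able to take the global lock, as noted in comment 1.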

Comment 4 David Teigland 2015-09-09 20:27:54 UTC
There are a lot of commands, like pvremove, that will not work without a global lock, i.e. you need one sanlock VG to exist to hold the global lock for lvm to work normally.

Comment 5 Corey Marthaler 2015-09-15 15:14:08 UTC
How then does one do the "final" PV cleanup? If you remove the final "global lock" VG, how do the last PV(s) get removed? Is the supported method to shut down sanlock/lvmetad and then delete them? The startup and shutdown process doesn't exactly mention when actual creation and deletion are done.

       The shut down sequence is the reverse:

       · deactivate LVs in shared VGs
       · vgchange --lock-stop
       · stop lock manager
       · stop lvmlockd
       · stop lvmetad

[root@harding-03 ~]# vgremove global
  Volume group "global" successfully removed

[root@harding-03 ~]# pvscan
  Skipping global lock: lockspace not found or started
  PV /dev/mapper/mpathg1                        lvm2 [250.00 GiB]

[root@harding-03 ~]# pvremove /dev/mapper/mpathg1
  Global lock failed: check that global lockspace is started

[root@harding-03 ~]# killall lvmetad
[root@harding-03 ~]# systemctl stop sanlock
[root@harding-03 ~]# pvremove /dev/mapper/mpathg1
[deadlocked (or hung for > 5 min)]

Comment 7 David Teigland 2015-09-15 15:24:06 UTC
That's a case I'd not thought of.  Currently there's no way to pvremove PVs without using the global lock.  It shouldn't be difficult to add a force option to pvremove to handle that case.  It may also make sense to allow the inverse of 'vgcreate runs pvcreate devices', i.e. 'vgremove --pvremove runs pvremove on devices'.

Also, the last pvremove should fail and report an error like the one before it. That must be a different problem.

Comment 8 David Teigland 2015-09-15 15:44:16 UTC
I'm not seeing pvremove get stuck, could you run gdb on it to see where it's waiting?

VG cc on /dev/sdc holds the sanlock global lock.

[root@null-01 ~]# vgremove cc
Do you really want to remove volume group "cc" containing 2 logical volumes? [y/n]: y
  Logical volume "lvol0" successfully removed
  Logical volume "lvol1" successfully removed
  Volume group "cc" successfully removed

[root@null-01 ~]# pvremove /dev/sdc
  Global lock failed: check that global lockspace is started

[root@null-01 ~]# killall lvmlockd

[root@null-01 ~]# pvremove /dev/sdc
  WARNING: lvmlockd process is not running.
  Global lock failed: check that lvmlockd is running.
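
The two "Global lock failed" messages above point at different causes. A minimal sketch of telling them apart is shown here; the function name global_lock_hint is hypothetical, and the matched strings are copied from the command output in this report, so other lvm2 versions may word them differently.

```shell
#!/bin/sh
# Sketch only: map the "Global lock failed" messages seen in this report
# to a likely cause. The function name is hypothetical.
# Usage: global_lock_hint "MESSAGE TEXT"
global_lock_hint() {
    case "$1" in
        *"check that lvmlockd is running"*)
            echo "lvmlockd is not running" ;;
        *"check that global lockspace is started"*)
            echo "global lockspace not started: no sanlock VG exists yet, or its lockspace was stopped" ;;
        *)
            echo "unrecognized message" ;;
    esac
}

global_lock_hint "Global lock failed: check that lvmlockd is running."
```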

Comment 9 Corey Marthaler 2015-09-15 16:20:43 UTC
You're correct, that must have been a different problem as I can no longer reproduce it. Removing blocker flag. If I can hit that issue again, I'll file a different BZ for it.


setting up first "global lock" dummy vg for lvmlockd...
vgcreate --shared global /dev/mapper/mpatha1
harding-02: vgchange --lock-start global
harding-03: vgchange --lock-start global
  Skipping global lock: lockspace not found or started

creating lvm devices...
harding-02: pvcreate /dev/mapper/mpathb1 /dev/mapper/mpathd1
harding-02: vgcreate --shared coreyA /dev/mapper/mpathb1 /dev/mapper/mpathd1
harding-02: vgchange --lock-start coreyA
harding-03: vgchange --lock-start coreyA

harding-02: pvcreate /dev/mapper/mpathc1 /dev/mapper/mpathe1
harding-02: vgcreate --shared coreyB /dev/mapper/mpathc1 /dev/mapper/mpathe1
harding-02: vgchange --lock-start coreyB
harding-03: vgchange --lock-start coreyB

[root@harding-02 ~]# pvscan
  PV /dev/mapper/mpatha1   VG global            lvm2 [249.99 GiB / 249.74 GiB free]
  PV /dev/mapper/mpathd1   VG coreyB            lvm2 [249.99 GiB / 249.74 GiB free]
  PV /dev/mapper/mpathe1   VG coreyB            lvm2 [249.99 GiB / 249.99 GiB free]
  PV /dev/mapper/mpathb1   VG coreyA            lvm2 [249.99 GiB / 249.74 GiB free]
  PV /dev/mapper/mpathc1   VG coreyA            lvm2 [249.99 GiB / 249.99 GiB free]

[root@harding-03 ~]# pvscan
  PV /dev/mapper/mpathg1   VG global            lvm2 [249.99 GiB / 249.74 GiB free]
  PV /dev/mapper/mpatha1   VG coreyB            lvm2 [249.99 GiB / 249.74 GiB free]
  PV /dev/mapper/mpathb1   VG coreyB            lvm2 [249.99 GiB / 249.99 GiB free]
  PV /dev/mapper/mpathd1   VG coreyA            lvm2 [249.99 GiB / 249.74 GiB free]
  PV /dev/mapper/mpathc1   VG coreyA            lvm2 [249.99 GiB / 249.99 GiB free]

[root@harding-02 ~]# vgchange --lock-stop coreyA
[root@harding-02 ~]# vgchange --lock-stop coreyB
[root@harding-03 ~]# vgremove coreyB coreyA
  Volume group "coreyB" successfully removed
  Volume group "coreyA" successfully removed

[root@harding-03 ~]# pvremove /dev/mapper/mpath[abcd]1
  Labels on physical volume "/dev/mapper/mpatha1" successfully wiped
  Labels on physical volume "/dev/mapper/mpathb1" successfully wiped
  Labels on physical volume "/dev/mapper/mpathc1" successfully wiped
  Labels on physical volume "/dev/mapper/mpathd1" successfully wiped

[root@harding-03 ~]# pvscan
  PV /dev/mapper/mpathg1   VG global            lvm2 [249.99 GiB / 249.74 GiB free]

[root@harding-03 ~]# vgchange --lock-stop global
[root@harding-02 ~]# vgchange --lock-stop global
[root@harding-02 ~]# vgremove global
  Global lock failed: check that global lockspace is started

[root@harding-02 ~]# vgchange --lock-start global
  Skipping global lock: lockspace not found or started
  VG global starting sanlock lockspace
  Starting locking.  Waiting until locks are ready...

[root@harding-02 ~]# vgremove global
  Volume group "global" successfully removed
[root@harding-02 ~]# pvscan
  Skipping global lock: lockspace not found or started
  PV /dev/mapper/mpatha1                        lvm2 [250.00 GiB]

[root@harding-02 ~]# pvremove /dev/mapper/mpatha1
  Global lock failed: check that global lockspace is started

[root@harding-02 ~]# pvremove -ff --config global/use_lvmlockd=0 /dev/mapper/mpatha1
  Labels on physical volume "/dev/mapper/mpatha1" successfully wiped
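
The final cleanup order that worked in the transcript above can be summarized as a dry-run sketch. This reflects only what was observed in this report, not a documented supported procedure: the function name sanlock_teardown_plan is hypothetical, the helper prints commands instead of running them, and it assumes all other shared VGs and their PVs were already removed while the global lock was still available.

```shell
#!/bin/sh
# Sketch only: print (do not run) the last teardown steps observed in
# this report. The function name is hypothetical.
# Usage: sanlock_teardown_plan GLOBAL_VG GLOBAL_PV
sanlock_teardown_plan() {
    vg=$1
    pv=$2
    # vgremove needs the global lock, so the lockspace of the VG that
    # holds it must be started on the host doing the removal.
    echo "vgchange --lock-start $vg"
    echo "vgremove $vg"
    # With the global lock gone, plain pvremove fails. The workaround
    # used here disables lvmlockd for this one command and forces the wipe.
    echo "pvremove -ff --config global/use_lvmlockd=0 $pv"
}

sanlock_teardown_plan global /dev/mapper/mpatha1
```

Because -ff skips the usual safety checks, the device path should be double-checked before running the last command for real.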