Bug 449344

Summary: clvmd keeps locks for non-existent LVs, clvmd -R doesn't remove them
Product: Red Hat Enterprise Linux 5
Component: lvm2-cluster
Version: 5.3
Hardware: All
OS: Linux
Status: CLOSED WONTFIX
Severity: high
Priority: high
Reporter: Milan Broz <mbroz>
Assignee: David Teigland <teigland>
QA Contact: Cluster QE <mspqa-list>
CC: agk, agouny, casmith, ccaulfie, cmarthal, dwysocha, heinzm, iannis, jbrassow, jdelong, lmiksik, prajnoha, prockai, pvrabec, slevine, void
Target Milestone: ---
Target Release: ---
Doc Type: Bug Fix
Last Closed: 2017-04-04 20:43:19 UTC
Bug Blocks: 928849, 1049888

Description Milan Broz 2008-06-02 10:16:40 UTC
1) If a user (by mistake) creates a clustered VG on local PVs and then an LV in it,
the command fails, but the LV is still created on the local node despite the error
message (lvcreate should revert the change if any node fails).

2) The sync_lock for the LV is still held on the local node, even though the
LV is not activated there. Also, clvmd -R doesn't remove such orphan locks.
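
The clustered flag that triggers this can be checked up front (a minimal sketch, assuming the vg_local name used in the reproducer below; in vgs output the sixth vg_attr character is 'c' when the VG is marked clustered):

# check whether the VG carries the clustered flag before creating LVs
vgs --noheadings -o vg_name,vg_attr vg_local
# clearing the flag, as attempted later in the reproducer:
# vgchange -c n vg_local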

See this reproducer (/dev/sdc is a local disk):

+ vgcreate vg_local /dev/sdc1 /dev/sdc2 /dev/sdc3 /dev/sdc4
  Clustered volume group "vg_local" successfully created

# this must fail, because /dev/sdc is not accessible on the other nodes
+ lvcreate -L 100M -n lv vg_local
  Error locking on node bar-03.englab.brq.redhat.com: Volume group for uuid not
found: PxZd20M66SE0fnFlR4aSnhEjwrzDLyoDMnJROOY3flW2yNobmVnY16WFnXp4qofo
  Error locking on node bar-02.englab.brq.redhat.com: Volume group for uuid not
found: PxZd20M66SE0fnFlR4aSnhEjwrzDLyoDMnJROOY3flW2yNobmVnY16WFnXp4qofo
  Aborting. Failed to activate new LV to wipe the start of it.

# 1) here the LV is incorrectly left in device-mapper, but not in the metadata!

# try to change the clustered bit back to non-clustered
+ vgchange -c n vg_local
  Volume group "vg_local" successfully changed
+ clvmd -R

# create LV again
+ lvcreate -L 100M -n lv vg_local
  Logical volume "lv" created


+ vgchange -a n vg_local
  0 logical volume(s) in volume group "vg_local" now active
+ vgchange -a y vg_local
  1 logical volume(s) in volume group "vg_local" now active

# ok, we have 1 local volume active


# use pvmove
+ pvmove -i 1 /dev/sdc1
  /dev/sdc1: Moved: 12.0%
  /dev/sdc1: Moved: 28.0%
  /dev/sdc1: Moved: 44.0%
  /dev/sdc1: Moved: 60.0%
  /dev/sdc1: Moved: 76.0%
  /dev/sdc1: Moved: 92.0%
  /dev/sdc1: Moved: 100.0%

# so the LV is moved, but cannot be activated without a local clvmd restart!

+ vgchange -a n vg_local
  0 logical volume(s) in volume group "vg_local" now active
+ vgchange -a y vg_local
  0 logical volume(s) in volume group "vg_local" now active

# it is really not there, but the lock is still held in clvmd
+ dmsetup table vg_local-lv
device-mapper: table ioctl failed: No such device or address


kill clvmd:
CLVMD[b7f918e0]: Jun  2 12:12:06 SIGTERM received
CLVMD[b7f918e0]: Jun  2 12:12:06 sync_unlock:
's7EsVZZ18UBGg3liwSoJ3Ccn8S5aQdsQb8Zk83mpswxZORAXAOq9cxAWf0jtusZB' lkid:10045
CLVMD[b7f918e0]: Jun  2 12:12:06 sync_unlock:
'PxZd20M66SE0fnFlR4aSnhEjwrzDLyoDroKJ6Lb6RGj60bi18fFtBvf8md0mcxEg' lkid:103ef
CLVMD[b7f918e0]: Jun  2 12:12:06 sync_unlock:
'PxZd20M66SE0fnFlR4aSnhEjwrzDLyoDMnJROOY3flW2yNobmVnY16WFnXp4qofo' lkid:10393
CLVMD[b7f918e0]: Jun  2 12:12:06 sync_unlock:
'PxZd20M66SE0fnFlR4aSnhEjwrzDLyoDeJAID1ih0b3x4MrokK6gxM1i9Cvq6CTT' lkid:20368

# lvs -o +uuid vg_local
  LV   VG       Attr   LSize   Origin Snap%  Move Log Copy%  Convert LV UUID
  lv   vg_local -wi--- 100.00M                                      
eJAID1-ih0b-3x4M-rokK-6gxM-1i9C-vq6CTT
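
The orphan lock can be tied back to the LV: the names in the sync_unlock log above appear to be the VG UUID and LV UUID concatenated with the hyphens stripped (a minimal sketch, reusing the vg_local name from above):

# print <VG UUID><LV UUID> with hyphens removed for comparison with the
# lock names logged by clvmd on shutdown
lvs --noheadings -o vg_uuid,lv_uuid vg_local | tr -d ' -'
# e.g. ...wrzDLyoDeJAID1ih0b3x4MrokK6gxM1i9Cvq6CTT matches the lock with lkid:20368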

after clvmd restart:
# vgchange -a y vg_local
  1 logical volume(s) in volume group "vg_local" now active


clvmd -R should remove all locks for which there is no active device/LV!
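
Until that happens, the mismatch can at least be detected by cross-checking LVM's view of the VG against device-mapper (a minimal sketch, assuming the vg_local/lv names from the reproducer; the only recovery that worked here was restarting clvmd):

# for each LV in the VG, check whether a device-mapper device exists;
# an LV that will not activate and has no DM device matches the
# stale-lock symptom shown above
# (note: DM names escape '-' in VG/LV names as '--'; vg_local and lv
#  contain no hyphens, so plain concatenation is fine here)
for LV in $(lvs --noheadings -o lv_name vg_local); do
  if dmsetup info "vg_local-$LV" >/dev/null 2>&1; then
    echo "vg_local/$LV: DM device present"
  else
    echo "vg_local/$LV: no DM device (possible orphan clvmd lock)"
  fi
done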

Comment 3 Kiersten (Kerri) Anderson 2008-09-30 20:09:00 UTC
Did this go out in a release somewhere? It is referenced in:
http://rhn.redhat.com/errata/RHBA-2008-0806.html

Comment 5 Milan Broz 2009-03-05 13:14:26 UTC
Moving this bug to the RHEL 5 product for future tracking of the problem.

Comment 17 RHEL Program Management 2014-01-22 16:35:30 UTC
This request was evaluated by Red Hat Product Management for inclusion
in a Red Hat Enterprise Linux release.  Product Management has
requested further review of this request by Red Hat Engineering, for
potential inclusion in a Red Hat Enterprise Linux release for currently
deployed products.  This request is not yet committed for inclusion in
a release.

Comment 19 Chris Williams 2017-04-04 20:43:19 UTC
Red Hat Enterprise Linux 5 shipped its last minor release, 5.11, on September 14th, 2014. On March 31st, 2017 RHEL 5 exits Production Phase 3 and enters the Extended Life Phase. For RHEL releases in the Extended Life Phase, Red Hat will provide limited ongoing technical support. No bug fixes, security fixes, hardware enablement or root-cause analysis will be available during this phase, and support will be provided on existing installations only. If the customer purchases the Extended Life-cycle Support (ELS), certain critical-impact security fixes and selected urgent-priority bug fixes for the last minor release will be provided. The specific support and services provided during each phase are described in detail at http://redhat.com/rhel/lifecycle

This BZ does not appear to meet ELS criteria, so it is being closed WONTFIX. If this BZ is critical for your environment and you have an Extended Life-cycle Support Add-on entitlement, please open a case in the Red Hat Customer Portal, https://access.redhat.com, provide a thorough business justification, and ask that the BZ be re-opened for consideration of an errata. Please note that only certain critical-impact security fixes and selected urgent-priority bug fixes for the last minor release can be considered.