Bug 878948 - Concurrent activations of same LV race against each other with "Device or resource busy"
Summary: Concurrent activations of same LV race against each other with "Device or resource busy"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: lvm2
Version: 6.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 6.5
Assignee: Alasdair Kergon
QA Contact: Cluster QE
URL:
Whiteboard: storage
Depends On:
Blocks: 871829 1002699 1112137 1127117
 
Reported: 2012-11-21 15:54 UTC by Dafna Ron
Modified: 2015-02-03 15:43 UTC (History)
22 users

Fixed In Version: lvm2-2.02.107-1.el6
Doc Type: Bug Fix
Doc Text:
Concurrent activation and deactivation of logical volumes is now prohibited. Locking is performed so that operations are now processed sequentially. This does not apply to thin or cache logical volumes, nor in clusters. 'lvchange -ay $lv' and 'lvchange -an $lv' should no longer cause trouble if issued concurrently: the new lock should make sure they activate/deactivate $lv one-after-the-other, instead of overlapping.
Clone Of:
: 1112137 (view as bug list)
Environment:
Last Closed: 2014-10-14 08:23:43 UTC


Attachments
logs (879.67 KB, application/x-gzip)
2012-11-21 15:54 UTC, Dafna Ron


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2014:1387 normal SHIPPED_LIVE lvm2 bug fix and enhancement update 2014-10-14 01:39:47 UTC

Description Dafna Ron 2012-11-21 15:54:20 UTC
Created attachment 649282 [details]
logs

Description of problem:

Live storage migration failed for one of my VMs with a CannotActivateLogicalVolumes error.
Looking at the log, there is a lock for an LV extend on the VM.

Version-Release number of selected component (if applicable):

si24.4
vdsm-4.9.6-44.0.el6_3.x86_64

How reproducible:

race

Steps to Reproduce:
1. create thin provision vms and have them write
2. live migrate the vms disks
3.
  
Actual results:

If an LV extend runs right when we try to run prepareVolume, the live migration fails due to the lock on the image.

Expected results:

We should retry prepareImage if another command has taken the lock.

Additional info: full logs (the errors are on the SPM, but I attached the HSM logs as well).

this is the ERROR: 

Thread-17704::ERROR::2012-11-21 17:02:02,091::task::853::TaskManager.Task::(_setError) Task=`c509c33f-6e1e-4605-8602-fb882b43d166`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 2880, in prepareImage
    imgVolumes = img.prepare(sdUUID, imgUUID, volUUID)
  File "/usr/share/vdsm/storage/image.py", line 374, in prepare
    volUUIDs=[vol.volUUID for vol in chain])
  File "/usr/share/vdsm/storage/blockSD.py", line 967, in activateVolumes
    lvm.activateLVs(self.sdUUID, volUUIDs)
  File "/usr/share/vdsm/storage/lvm.py", line 1046, in activateLVs
    _setLVAvailability(vgName, toActivate, "y")
  File "/usr/share/vdsm/storage/lvm.py", line 738, in _setLVAvailability
    raise error(str(e))
CannotActivateLogicalVolumes: Cannot activate Logical Volumes: ('General Storage Exception: ("5 [] [\'  device-mapper: create ioctl on 8c0ef67f--03c1--4fbf--b099--3e3668405cfc-39f89a6a--7fbb--43c0--a5ea--19b271f51829 failed: Device or resource busy\']\\n8c0ef67f-03c1-4fbf-b099-3e3668405cfc/[\'39f89a6a-7fbb-43c0-a5ea-19b271f51829\', \'80512e28-2bef-4414-b34d-52bdef187365\', \'d02638cb-abf3-4ed1-b55e-fd4b9b1a49d8\']",)',)
Thread-17704::DEBUG::2012-11-21 17:02:02,114::task::872::TaskManager.Task::(_run) Task=`c509c33f-6e1e-4605-8602-fb882b43d166`::Task._run: c509c33f-6e1e-4605-8602-fb882b43d166 ('8c0ef67f-03c1-4fbf-b099-3e3668405cfc', 'edf0ee04-0cc2-4e13-877d-1e89541aea55', '5054805b-354c-462d-9953-caa4bd4a6454', 'd02638cb-abf3-4ed1-b55e-fd4b9b1a49d8') {} failed - stopping task



this is the LV extend taking the lock a few seconds before: 

a9c3fda2-0858-48eb-858e-8d0b854387fe::INFO::2012-11-21 17:01:54,953::blockVolume::282::Storage.Volume::(extend) Request to extend LV d02638cb-abf3-4ed1-b55e-fd4b9b1a49d8 of image 5054805b-354c-462d-9953-caa4bd4a6454 in VG 8c0ef67f-03c1-4fbf-b099-3e3668405cfc with size = 2097152
a9c3fda2-0858-48eb-858e-8d0b854387fe::DEBUG::2012-11-21 17:01:54,955::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/lvm lvextend --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \\"a%1Dafna-si24_4_new-011353489|1Dafna-si24_4_new-021353489|1Dafna-si24_4_new-031353489%\\", \\"r%.*%\\" ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " --autobackup n --size 1024m 8c0ef67f-03c1-4fbf-b099-3e3668405cfc/d02638cb-abf3-4ed1-b55e-fd4b9b1a49d8' (cwd None)

Comment 1 Dafna Ron 2012-11-21 16:25:20 UTC
Removing 'race' from the bug and moving to urgent, since this will happen every time there are multiple live storage migrations.

Locking also failed removeDisk, because of the cloneImageStructure command's lock on the volume.

We can see it in the same log already attached:

Thread-17922::ERROR::2012-11-21 17:05:04,056::task::853::TaskManager.Task::(_setError) Task=`c07ee9e7-fff6-4d00-96d1-b88f36cd36de`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 861, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/logUtils.py", line 38, in wrapper
    res = f(*args, **kwargs)
  File "/usr/share/vdsm/storage/hsm.py", line 1349, in deleteImage
    dom.deleteImage(sdUUID, imgUUID, volsByImg)
  File "/usr/share/vdsm/storage/blockSD.py", line 945, in deleteImage
    deleteVolumes(sdUUID, toDel)
  File "/usr/share/vdsm/storage/blockSD.py", line 177, in deleteVolumes
    lvm.removeLVs(sdUUID, vols)
  File "/usr/share/vdsm/storage/lvm.py", line 1010, in removeLVs
    raise se.CannotRemoveLogicalVolume(vgName, str(lvNames))
CannotRemoveLogicalVolume: Cannot remove Logical Volume: ('d40978c8-3fab-483b-b786-2f1e1c5cf130', "('34ff2273-e1cd-41b9-9c30-61defdc85948', '98d1cf94-5e59-4f85-8696-698b0269e347')")
44bfeb5b-eae3-4e2d-9c13-a2b9442cf865::DEBUG::2012-11-21 17:05:04,056::resourceManager::553::ResourceManager::(releaseResource) Released resource '8c0ef67f-03c1-4fbf-b099-3e3668405cfc_lvmActivationNS.17879e10-6ea9-4c32-bd7a-9b1bd71ff3ee' (0 active users)


deb69b60-c03d-4d82-80e4-e3c531165710::DEBUG::2012-11-21 17:01:28,401::resourceManager::486::ResourceManager::(registerResource) Trying to register resource '8c0ef67f-03c1-4fbf-b099-3e3668405cfc_imageNS.270835d7-b3bb-4e1c-a34d-f09d0538affd' for lock type 'exclusive'


Thread-17662::DEBUG::2012-11-21 17:01:26,370::BindingXMLRPC::171::vds::(wrapper) [10.35.97.65]
Thread-17662::DEBUG::2012-11-21 17:01:26,371::task::588::TaskManager.Task::(_updateState) Task=`deb69b60-c03d-4d82-80e4-e3c531165710`::moving from state init -> state preparing
Thread-17662::INFO::2012-11-21 17:01:26,372::logUtils::37::dispatcher::(wrapper) Run and protect: cloneImageStructure(spUUID='edf0ee04-0cc2-4e13-877d-1e89541aea55', sdUUID='d40978c8-3fab-483b-b786-2f1e1c5cf130', imgUUID='270835d7-b3bb-4e1c-a34d-f09d0538affd', dstSdUUID='8c0ef67f-03c1-4fbf-b099-3e3668405cfc')

Comment 2 Yeela Kaplan 2013-02-18 18:20:13 UTC
Fede,
The following bug: https://bugzilla.redhat.com/show_bug.cgi?id=893955
looks like a duplicate of this bug.
What do you think?

Thanks,
Yeela

Comment 3 Federico Simoncelli 2013-02-20 09:27:07 UTC
(In reply to comment #2)
> Fede,
> The following bug: https://bugzilla.redhat.com/show_bug.cgi?id=893955
> looks like a duplicate of this bug.
> What do you think?
> 
> Thanks,
> Yeela

No, bug 893955 is more a duplicate of bug 910013.

Comment 4 Ayal Baron 2013-02-28 20:55:13 UTC
Fede, any update on this?

Comment 5 Federico Simoncelli 2013-03-01 10:21:22 UTC
This is not related to lvextend. The issue is that two vmDiskReplicateStart calls arrived at the same time (for two VMs):

Thread-17705::DEBUG::2012-11-21 17:02:00,954::BindingXMLRPC::894::vds::(wrapper) client [10.35.97.65]::call vmDiskReplicateStart with ('10a3ee37-a4f2-4e3e-92cb-4acbf72b9955', {'device': 'disk', 'domainID': 'd40978c8-3fab-483b-b786-2f1e1c5cf130', 'volumeID': '98d1cf94-5e59-4f85-8696-698b0269e347', 'poolID': 'edf0ee04-0cc2-4e13-877d-1e89541aea55', 'imageID': '270835d7-b3bb-4e1c-a34d-f09d0538affd'}, {'device': 'disk', 'domainID': '8c0ef67f-03c1-4fbf-b099-3e3668405cfc', 'volumeID': '98d1cf94-5e59-4f85-8696-698b0269e347', 'poolID': 'edf0ee04-0cc2-4e13-877d-1e89541aea55', 'imageID': '270835d7-b3bb-4e1c-a34d-f09d0538affd'}) {}
Thread-17704::DEBUG::2012-11-21 17:02:00,956::BindingXMLRPC::894::vds::(wrapper) client [10.35.97.65]::call vmDiskReplicateStart with ('6fde0adb-95ca-4a40-95b1-1d76e1ee4da4', {'device': 'disk', 'domainID': 'd40978c8-3fab-483b-b786-2f1e1c5cf130', 'volumeID': 'd02638cb-abf3-4ed1-b55e-fd4b9b1a49d8', 'poolID': 'edf0ee04-0cc2-4e13-877d-1e89541aea55', 'imageID': '5054805b-354c-462d-9953-caa4bd4a6454'}, {'device': 'disk', 'domainID': '8c0ef67f-03c1-4fbf-b099-3e3668405cfc', 'volumeID': 'd02638cb-abf3-4ed1-b55e-fd4b9b1a49d8', 'poolID': 'edf0ee04-0cc2-4e13-877d-1e89541aea55', 'imageID': '5054805b-354c-462d-9953-caa4bd4a6454'}) {}

The two VMs were presumably based on the same template, since prepareImage tried to activate the same LV twice (at the same time, from two different threads):

Thread-17704::DEBUG::2012-11-21 17:02:01,172::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/lvm lvchange --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \\"a%1Dafna-si24_4_new-011353489|1Dafna-si24_4_new-021353489|1Dafna-si24_4_new-031353489%\\", \\"r%.*%\\" ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " --autobackup n --available y 8c0ef67f-03c1-4fbf-b099-3e3668405cfc/39f89a6a-7fbb-43c0-a5ea-19b271f51829 8c0ef67f-03c1-4fbf-b099-3e3668405cfc/80512e28-2bef-4414-b34d-52bdef187365 8c0ef67f-03c1-4fbf-b099-3e3668405cfc/d02638cb-abf3-4ed1-b55e-fd4b9b1a49d8' (cwd None)

Thread-17705::DEBUG::2012-11-21 17:02:01,174::__init__::1164::Storage.Misc.excCmd::(_log) '/usr/bin/sudo -n /sbin/lvm lvchange --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \\"a%1Dafna-si24_4_new-011353489|1Dafna-si24_4_new-021353489|1Dafna-si24_4_new-031353489%\\", \\"r%.*%\\" ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " --autobackup n --available y 8c0ef67f-03c1-4fbf-b099-3e3668405cfc/39f89a6a-7fbb-43c0-a5ea-19b271f51829 8c0ef67f-03c1-4fbf-b099-3e3668405cfc/34ff2273-e1cd-41b9-9c30-61defdc85948 8c0ef67f-03c1-4fbf-b099-3e3668405cfc/98d1cf94-5e59-4f85-8696-698b0269e347' (cwd None)

Common lv: 8c0ef67f-03c1-4fbf-b099-3e3668405cfc/39f89a6a-7fbb-43c0-a5ea-19b271f51829

That resulted in an lvm race:

CannotActivateLogicalVolumes: Cannot activate Logical Volumes: ('General Storage Exception: ("5 [] [\'  device-mapper: create ioctl on 8c0ef67f--03c1--4fbf--b099--3e3668405cfc-39f89a6a--7fbb--43c0--a5ea--19b271f51829 failed: Device or resource busy\']\\n8c0ef67f-03c1-4fbf-b099-3e3668405cfc/[\'39f89a6a-7fbb-43c0-a5ea-19b271f51829\', \'80512e28-2bef-4414-b34d-52bdef187365\', \'d02638cb-abf3-4ed1-b55e-fd4b9b1a49d8\']",)',)

(8c0ef67f--03c1--4fbf--b099--3e3668405cfc-39f89a6a--7fbb--43c0--a5ea--19b271f51829 == 8c0ef67f-03c1-4fbf-b099-3e3668405cfc/39f89a6a-7fbb-43c0-a5ea-19b271f51829)

I have been able to reproduce this in a testing environment:

# lvchange -ay /dev/vgtest1/lvtest1 & lvchange -ay /dev/vgtest1/lvtest1
[1] 6798
  device-mapper: create ioctl on vgtest1-lvtest1 failed: Device or resource busy
[1]+  Exit 5                  lvchange -ay /dev/vgtest1/lvtest1

I'm not sure it's our duty to protect against this case. Maybe lvm should have a mechanism to prevent it.

Anyway, a fix on our side could be making _setLVAvailability similar to setrwLV: even after an lvchange error, it would check whether what we requested (in this case the activation) was actually successful.
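A minimal sketch of that idea (hypothetical function names, not the actual vdsm code in storage/lvm.py; the attribute check assumes the standard `lvs` `lv_attr` layout, where character 5 is the activation state):

```python
import subprocess

def attr_shows_active(lv_attr):
    """In the lvs 'lv_attr' string, character 5 is the state field;
    'a' there means the LV is active."""
    return len(lv_attr) >= 5 and lv_attr[4] == "a"

def set_lv_availability(vg_name, lv_name, available="y"):
    """Activate or deactivate an LV, tolerating a concurrent racer:
    if lvchange fails, re-check whether the LV nevertheless ended up
    in the requested state before raising (sketch only)."""
    lv_path = "%s/%s" % (vg_name, lv_name)
    rc = subprocess.call(["lvm", "lvchange", "--autobackup", "n",
                          "--available", available, lv_path])
    if rc != 0:
        out = subprocess.check_output(
            ["lvm", "lvs", "--noheadings", "-o", "lv_attr", lv_path])
        active = attr_shows_active(out.decode().strip())
        if active != (available == "y"):
            raise RuntimeError("cannot change availability of " + lv_path)
```

As comment 10 notes below, the fallback `lvs` call could itself fail for the same concurrency reason, so this only narrows the window.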

Comment 6 Ayal Baron 2013-03-03 06:45:28 UTC
(In reply to comment #5)
> This is not related to lvextend. The issue is that two vmDiskReplicateStart
> arrived at the same time (for two VMs):

Please change the title to reflect the real problem

<SNIP>

> 
> I'm not sure if it's our duty to protect against this case. Maybe lvm should
> have a mechanism to prevent this.

Discuss with LVM team and move the bug to lvm if they agree.  If not, then we need to protect against it.

> 
> Anyway a fix on our side could be making _setLVAvailability similar to
> setrwLV, which even after an lvchange error it tries to check if what we
> requested (in this case the activation) was successful.

Comment 7 Federico Simoncelli 2013-03-04 10:01:53 UTC
As discussed with Peter, the LVM team will update us on this.

Comment 8 Zdenek Kabelac 2013-03-05 09:16:01 UTC
lvm is designed for single root command-line usage.

There is no protection against running multiple lvm commands that are playing with the same device - e.g. running 2 busy loops which activate & deactivate the same LV - what should the outcome of that be?

Currently lvm2 protects only metadata updates - so you cannot e.g. create LVs in parallel - but there is no protection against e.g. a parallel 'lvchange -an' & 'lvs', which will 'randomly' fail on access to the currently deactivated device - this is seen as the fault of the user (root).

Fixing this on the lvm side would essentially mean a major rewrite of the locking to use a much finer locking mechanism per LV, possibly even per extent...

From a practical perspective - lvm2 supports lvm-shell, where you can stream lvm commands via a pipe and have them processed by a single instance of lvm.

Thus you may stream all lvm commands to a single lvm process - this ensures proper ordering (no two lvm commands run in parallel).

Unsure how this fits the 'vdsm' design - but it may well run faster and consume considerably fewer resources than individual execution of multiple lvm commands in parallel.

It could be that the design needs a few lvm-shell instances running in parallel - e.g. one instance for create/delete and another for 'lvs'-like commands - but in that case the tool must ensure that no two operations working with the same LV run in parallel.
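From the caller's side, the single-instance idea above could be sketched like this (illustrative class, not vdsm code; real use would also need to parse the shell's prompts and output, and the `program` parameter exists only so the sketch is testable against a stand-in binary):

```python
import subprocess

class LvmShell(object):
    """Funnel commands through one long-lived `lvm` shell process,
    so no two lvm commands ever run in parallel (sketch; output and
    prompt parsing are omitted)."""
    def __init__(self, program="lvm"):
        self._proc = subprocess.Popen(
            [program], stdin=subprocess.PIPE,
            stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

    def run(self, command):
        # All callers share one stdin pipe, which serializes commands.
        self._proc.stdin.write((command + "\n").encode())
        self._proc.stdin.flush()

    def close(self):
        self._proc.stdin.close()
        return self._proc.wait()
```

A design choice here is whether one such instance serves everything, or (as the comment suggests) a couple of instances split create/delete from reporting commands, with the caller still responsible for never touching the same LV from both.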

Comment 9 Ayal Baron 2013-03-05 22:12:28 UTC
(In reply to comment #8)
> lvm is designed for single root command line usage.
> 
> There is no protection against running multiple lvm commands that are playing with
> the same device - i.e. running 2 busy loops which activate & deactivate same
> LV - what should be the outcome of this?
> 
> Currently lvm2 protects only metadata updates - so you cannot i.e. create
> LVs in parallel - but there is no protection against i.e. parallel 'lvchange
> -an' & 'lvs'  - which will 'randomly' fail on access of currently
> deactivated device - this is being seen as the fault of user (root).
> 
> To fix this on lvm side would essentially mean major rewrite of locking to
> use a much finer locking mechanism per LV, possibly even per extent...

I'm not sure I understand why this is not simply a lock on an LV?

> 
> From practical perspective -  lvm2 supports  lvm-shell - where you could
> stream lvm commands via pipe to process them in single  instance of lvm.
> 
> Thus you may stream all lvm commands to a single instance of lvm process -
> this would ensure proper order (no lvm command would run in parallel).
> 
> Unsure how this fits 'vdsm' design - but it may appear this will run faster
> and consume fairly less resources then individual execution of multiple lvm
> commands in parallel.

We actually tried this in the past but encountered too many problems with it as it's not really designed to be automated.

> 
> It could be - design may need few lvm shell instances running in parallel -
> i.e. 1 instance for create/delete -  another instance for 'lvs-like'
> commands - but in this case tool must ensure, there is no operation working
> with same LV running in parallel.

Comment 10 Ayal Baron 2013-03-05 22:14:03 UTC
(In reply to comment #5)
<SNIP>
> 
> Anyway a fix on our side could be making _setLVAvailability similar to
> setrwLV, which even after an lvchange error it tries to check if what we
> requested (in this case the activation) was successful.

This sounds simple enough; I only fear that it may need to rely on 'lvs', which could go on failing for the same reason.

Comment 11 Zdenek Kabelac 2013-03-06 09:39:15 UTC
(In reply to comment #9)
> (In reply to comment #8)
> > lvm is designed for single root command line usage.
> > 
> > There is no protection against running multiple lvm commands that are playing with
> > the same device - i.e. running 2 busy loops which activate & deactivate same
> > LV - what should be the outcome of this?
> > 
> > Currently lvm2 protects only metadata updates - so you cannot i.e. create
> > LVs in parallel - but there is no protection against i.e. parallel 'lvchange
> > -an' & 'lvs'  - which will 'randomly' fail on access of currently
> > deactivated device - this is being seen as the fault of user (root).
> > 
> > To fix this on lvm side would essentially mean major rewrite of locking to
> > use a much finer locking mechanism per LV, possibly even per extent...
> 
> I'm not sure I understand why this is not simply a lock on an LV?

It would get really complex if you are working with LV trees,
thus decreasing the scalability of LVM - e.g. it's not clear whether the lock would have to be held until udev has finished - effectively multiplying the complexity of parallelizing LV activation.

Anyway, what is probably 'wanted' here is actually a dm-table read-write lock.
You do not need any locking if you are e.g. running lots of 'lvs' commands,
but you want to serialize the commands which are modifying tables.

However, adding such a thing to the current LVM is a rather long-term RFE.

The design of lvm currently does not support threads - it expects that the 'user' is not doing parallel things with the same VG, i.e. activating/deactivating/querying the same LV from multiple commands - the only protected thing is the consistency of the lvm metadata.

The basic assumption is that the application using lvm commands does not run commands on the same VG/LV in parallel, and ATM there is no easy fix for this.

So is it possible to work within these constraints in vdsm?


> > From practical perspective -  lvm2 supports  lvm-shell - where you could
> > stream lvm commands via pipe to process them in single  instance of lvm.
> > 
> > Thus you may stream all lvm commands to a single instance of lvm process -
> > this would ensure proper order (no lvm command would run in parallel).
> > 
> > Unsure how this fits 'vdsm' design - but it may appear this will run faster
> > and consume fairly less resources then individual execution of multiple lvm
> > commands in parallel.
> 
> We actually tried this in the past but encountered too many problems with it
> as it's not really designed to be automated.

Any LVM-related bugs filed from those 'too many'?

> > It could be - design may need few lvm shell instances running in parallel -
> > i.e. 1 instance for create/delete -  another instance for 'lvs-like'
> > commands - but in this case tool must ensure, there is no operation working
> > with same LV running in parallel.

Comment 12 Ayal Baron 2013-03-14 06:40:37 UTC
> 
> Design of lvm is currently not supporting threads - and expects that 'user'
> is not doing parallel things with the same VG - i.e.
> activating/deactivating/querying same LV from multiple commands - the only
> protected thing is consistency of lvm metadata.
> 
> Basic assumption is - the application using lvm commands is not running
> commands with the same VG/LV in parallel and ATM there is no easy fix for
> this.
> 
> So is it possible to work with these constraints within vdsm?

It is possible, but it would require us to lock where lvm doesn't.
Why can't you just take an flock on a file whose name is the name of the LV or something?

> 
> 
> > > From practical perspective -  lvm2 supports  lvm-shell - where you could
> > > stream lvm commands via pipe to process them in single  instance of lvm.
> > > 
> > > Thus you may stream all lvm commands to a single instance of lvm process -
> > > this would ensure proper order (no lvm command would run in parallel).
> > > 
> > > Unsure how this fits 'vdsm' design - but it may appear this will run faster
> > > and consume fairly less resources then individual execution of multiple lvm
> > > commands in parallel.
> > 
> > We actually tried this in the past but encountered too many problems with it
> > as it's not really designed to be automated.
> 
> Any LVM related bug from those 'too many' ?

Possibly, but this was a couple of years ago, so we'd need to revisit it (it doesn't seem worth it; we'd just move to using libLVM).

Note that with libLVM I don't think lvm can afford to state that the user is expected not to run multiple API calls with the same LV.

Comment 13 Ayal Baron 2013-03-14 09:57:52 UTC
Yeela,

Please check to see the implications of implementing the locking in vdsm.

Comment 14 Zdenek Kabelac 2013-03-14 10:52:31 UTC
(In reply to comment #12)
> > 
> > Design of lvm is currently not supporting threads - and expects that 'user'
> > is not doing parallel things with the same VG - i.e.
> > activating/deactivating/querying same LV from multiple commands - the only
> > protected thing is consistency of lvm metadata.
> > 
> > Basic assumption is - the application using lvm commands is not running
> > commands with the same VG/LV in parallel and ATM there is no easy fix for
> > this.
> > 
> > So is it possible to work with these constraints within vdsm?
> 
> It is possible but would require us to lock where lvm doesn't.
> Why can't you just take an flock on a file whose name is the name of the LV
> or something?

Locking order: we would need to know all the needed LVs up front (while the current command logic doesn't resolve tree dependencies - that's part of the activation code, which may be in the cluster), plus the necessity to roll back an operation (if you are in the middle of some mirror transition and cannot continue because of locking) - the internal lvm2 design here is really not at a level that allows proper support for this type of locking for now...

It's written on the assumption that operations are run non-parallel by a system administrator - and if the admin runs activation/deactivation and an lvs command at the same time, some errors will be logged.

I guess the 'easiest' solution on the lvm2 side would be locking around dm table modification - an ro-lock and an rw-lock - effectively serializing the commands that change table content across the whole system.
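The ro-lock/rw-lock split described here is essentially a readers-writer lock: reporting commands (lvs) would take it shared, table-modifying commands (lvchange -ay/-an) exclusive. A toy model of the intended semantics (not lvm2 code):

```python
import threading

class TableLock(object):
    """Toy readers-writer lock: readers share, a writer excludes
    both readers and other writers (writer may starve; fairness is
    out of scope for the sketch)."""
    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()
```

In lvm2's case the lock would have to be system-wide (across processes), which is part of why this is called out as a long-term RFE rather than a quick fix.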

> > > > From practical perspective -  lvm2 supports  lvm-shell - where you could
> > > > stream lvm commands via pipe to process them in single  instance of lvm.
> > > > 
> > > > Thus you may stream all lvm commands to a single instance of lvm process -
> > > > this would ensure proper order (no lvm command would run in parallel).
> > > > 
> > > > Unsure how this fits 'vdsm' design - but it may appear this will run faster
> > > > and consume fairly less resources then individual execution of multiple lvm
> > > > commands in parallel.
> > > 
> > > We actually tried this in the past but encountered too many problems with it
> > > as it's not really designed to be automated.
> > 
> > Any LVM related bug from those 'too many' ?
> 
> Possibly, but this was a couple of years ago so we'd need to revisit it
> (doesn't seem worth it, we'd just move to using libLVM)
> 
> Note that with libLVM I don't think lvm can afford stating that user is
> expected not to run multiple API calls with same LV

The lvm API is surely not multi-threaded so far, and the locking required here is even against programs running in parallel - so serialization currently needs to be solved on the user side.

Comment 15 Alasdair Kergon 2013-03-14 22:31:16 UTC
This bugzilla is heading off in too many directions at once!

Let's attempt to set out as clearly as possible the actual problems here that need to be addressed then split them out as separate bugzillas.

Comment 16 Alasdair Kergon 2013-03-14 22:38:12 UTC
So starting with comment #5, if I interpret it correctly, the claim is:

If you run:
  lvchange -ay $LV
followed by 
  lvchange -ay $LV
you get no errors.

But if the two instances overlap, you sometimes see error messages.

- In both cases $LV is always activated at the end?  (True or false?  Are there cases where both lvchanges fail and the LV is not activated?)


Clearly, that's a problem that needs to be fixed on the LVM side: the timing of two instances of a request to activate an LV should not affect the outcome or return code from the command.

Comment 17 Alasdair Kergon 2013-03-14 22:39:57 UTC
- Does clustered lvm also suffer from this problem?
- Does the use of lvmetad make any difference?
- Do any lvm.conf settings affect it?

Comment 18 Alasdair Kergon 2013-03-14 22:45:26 UTC
Comment #8 mentions problems arising from lvs running alongside lvchange.  Is this a problem RHEV has seen and needs fixing, or is it just something else noticed while looking at the problem but not something currently causing problems for RHEV?


You should be able to run 'lvs' at any time and not get problems *caused by* other LVM commands that happen to be running in parallel.  We already fixed this for RHEV for on-disk metadata, but is there also a problem to address in respect of in-kernel metadata?   (Separate bugzilla please.)

Comment 19 Alasdair Kergon 2013-03-14 22:53:05 UTC
Also this combination was mentioned when this bug was brought to my attention today.

lvchange -an $LV

running in parallel with

lvchange -ay $LV.

Now that particular combination is illogical and the result should be considered 'undefined'. It is the caller's responsibility to avoid or handle that situation, not LVM's.

LVM is welcome to perform one or both or neither or give error messages - I simply don't care what happens.  (It is of course possible that a fix for comment #16 would also make this behaviour defined.)

Comment 20 Alasdair Kergon 2013-03-14 23:04:49 UTC
Regarding the public liblvm API, it uses the same locking as LVM commands so of course it can cope with different instances attempting operations in parallel in just the same way that the command line copes (or fails as demonstrated in this bugzilla).

It is not multi-threaded, so a multi-threaded application is responsible itself for serialising operations.

Comment 21 Ayal Baron 2013-03-17 23:42:55 UTC
(In reply to comment #16)
> So starting with comment #5, if I interpret it correctly the claim is:
> 
> If you run:
>   lvchange -ay $LV
> followed by 
>   lvchange -ay $LV
> you get no errors.
> 
> But if the two instances overlap, you sometimes see error messages.
> 
> - In both cases $LV is always activated at the end?  (True or false?  Are
> there cases where both lvchanges fail and the LV is not activated?)

As far as we can see this is correct, i.e. the first request to be handled succeeds and the second one fails with 'resource busy'.

> 
> 
> Clearly, that's a problem that needs to be fixed on the LVM side: the timing
> of two instances of a request to activate an LV should not affect the
> outcome or return code from the command.

That was our thinking as well and is all this bug is really about so I'm moving it to LVM to take care of this specific issue.

(In reply to comment #17)
> - Does clustered lvm also suffer from this problem?
> - Does the use of lvmetad make any difference?
> - Do any lvm.conf settings affect it?

No idea to any of the above, but since reproduction is as simple as:
lvchange -ay /dev/vgtest1/lvtest1 & lvchange -ay /dev/vgtest1/lvtest1
I believe you can test this easily.

(In reply to comment #18)
> Comment #8 mentions problems arising from lvs running alongside lvchange. 
> Is this a problem RHEV has seen and needs fixing, or is it just something
> else noticed while looking at the problem but not something currently
> causing problems for RHEV?

That is a theoretical issue mentioned by Zdenek. I am not familiar with any RHEV bug due to this, but it is quite likely a race waiting to happen (low priority afaic).

> 
> 
> You should be able to run 'lvs' at any time and not get problems *caused by*
> other LVM commands that happen to be running in parallel.  We already fixed
> this for RHEV for on-disk metadata, but is there also a problem to address
> in respect of in-kernel metadata?   (Separate bugzilla please.)

(In reply to comment #19)
> Also this combination was mentioned when this bug was brought to my
> attention today.
> 
> lvchange -an $LV
> 
> running in parallel with
> 
> lvchange -ay $LV.
> 
> Now that particular combination is illogical and the result should be
> considered 'undefined'. It is the caller's responsibility to avoid or handle
> that situation, not LVM's.
> 
> LVM is welcome to perform one or both or neither or give error messages - I
> simply don't care what happens.  (It is of course possible that a fix for
> comment #16 would also make this behaviour defined.)

Agreed (btw, I do not see it mentioned anywhere earlier in the bug).

Comment 25 Alasdair Kergon 2013-10-18 21:34:51 UTC
So the current plan is for the file-based locking types to use actual LV locks to serialise activation operations.

Comment 31 Alasdair Kergon 2014-05-17 00:10:48 UTC
I have derived the design constraints on any solution and worked out one possible design. The next stage is to turn it into a prototype and see whether or not it works.

Comment 33 Alasdair Kergon 2014-05-20 13:08:05 UTC
The prototype seems to be working for a simple case. Now working through it and extending it to more cases.

- A new lock space called 'A' (for Activation).
- Locks are similar to LV locks, but are held for the duration of a discrete activation/deactivation activity, and only taken if no VG WRITE lock is held.

Comment 34 Alasdair Kergon 2014-06-21 00:12:00 UTC
For this release, cluster locking will not be supported, and neither will thin or cache LVs. RHEV uses none of those. (I have code for cluster support, but it is not properly tested, and thin/cache volumes have some code paths that may need special attention.)

https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=c1c2e838e88c56ef38d590007ca3b588ca06f1fd

https://git.fedorahosted.org/cgit/lvm2.git/commit/?id=78533f72d30f6e840f66e0aae89126ef139c1f2c

Comment 38 Corey Marthaler 2014-07-24 16:13:11 UTC
Ran activation intensive regression tests and saw no issues. Marking verified (SanityOnly) with the latest rpms.

2.6.32-492.el6.x86_64
lvm2-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
lvm2-libs-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
lvm2-cluster-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
udev-147-2.56.el6    BUILT: Fri Jul 11 09:53:07 CDT 2014
device-mapper-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-libs-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-event-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-event-libs-1.02.86-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014
device-mapper-persistent-data-0.3.2-1.el6    BUILT: Fri Apr  4 08:43:06 CDT 2014
cmirror-2.02.107-2.el6    BUILT: Fri Jul 11 08:47:33 CDT 2014

Comment 39 errata-xmlrpc 2014-10-14 08:23:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2014-1387.html

