Bug 997666

Summary: [RFE] 100 createVolume calls on a block sd with 500 LVs takes too much time
Product: Red Hat Enterprise Virtualization Manager
Reporter: Marina Kalinin <mkalinin>
Component: RFEs
Assignee: Allon Mureinik <amureini>
Status: CLOSED ERRATA
QA Contact: Yuri Obshansky <yobshans>
Severity: high
Docs Contact:
Priority: high
Version: 3.2.0
CC: acanan, amureini, bazulay, bgraveno, cww, iheim, lbopf, lnatapov, lpeer, lyarwood, mkalinin, nsoffer, rbalakri, scohen, tnisan, yeylon, ylavi, yobshans
Target Milestone: ovirt-3.6.0-rc
Keywords: FutureFeature, Improvement
Target Release: 3.6.0
Flags: sherold: Triaged+
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-09 20:31:59 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Storage
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1018916
Attachments (description / flags):
- patched adjusted to current downstream (none)
- the working patch (none)
- patch without calling activate lv for cloning from template (none)

Description Marina Kalinin 2013-08-15 23:45:59 UTC
Description of problem:
VDSM performs teardown on the template volume after creating a VM from that template (see volume.clone). A race condition occurs when multiple VMs are created at the same time: the template LV carries the 'o' (open) attribute because it is in use, so when VDSM tries to deactivate it, the call fails with the error:
"device-mapper: remove ioctl on  failed: Device or resource busy\n  Unable to deactivate f48e93bb--e3bb--4291--bd2f--8d622ef8019d-a351bc36--d741--4cf7--aa89--735c82b888e5 (253:508)\n'; <rc> = 5"

(you cannot run `lvchange -an` on an open/used lv)

Why do we need to deactivate the template LV at all?
A template is, by nature, in use by other VMs, and while that is the case its LV is expected to stay active.

Suggested fix:
Inside volume.clone, check whether we are creating a VM from a template; if so, call teardown with justme=True:
    self.teardown(self.sdUUID, self.volUUID, justme=True)
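
For illustration, a minimal sketch of what that change could look like (hypothetical: the method and attribute names follow this report, and the real vdsm volume.clone differs in detail):

    # Hypothetical sketch of the call site described above; not the
    # actual vdsm code.
    def clone(self, dstPath, volFormat):
        # ... metadata reads and 'qemu-img create' happen here ...

        # Suggested change: pass justme=True so the teardown is limited
        # to this volume instead of walking the whole chain; see the
        # follow-up comments questioning this flow.
        self.teardown(self.sdUUID, self.volUUID, justme=True)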


Version-Release number of selected component (if applicable):
3.2.0

How reproducible:
Reproducible when deploying multiple VMs (~50) from the same template at the same time.

Steps to Reproduce:
1. Create a template.
2. Issue 50(or more) simultaneous request of creating vms from this template.
3. Watch vdsm.log for this error: 
"device-mapper: remove ioctl on  failed: Device or resource busy\n  Unable to deactivate".
Actual results:
This significantly increases the time required to create those VMs.
Only after all the VMs have been created can the 'o' attribute be cleared from the LV, allowing it to be deactivated and all the createVmFromTemplate flows to terminate successfully.

Expected results:
Do not deactivate the template LV after cloning a VM from it.

Additional info:
This affects one of our large customer deployments. I will attach the ticket number.
Here is a partial flow from the vdsm.log:


85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,553::volume::263::Storage.Volume::(clone) Volume.clone: /rhev/data-center/89ced3c8-7615-11e2-87d3-0014227472ce/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/images/a460e3a9-4fd7-4fc4-bdeb-839fc85130df/a351bc36-d741-4cf7-aa89-735c82b888e5 to /rhev/data-center/89ced3c8-7615-11e2-87d3-0014227472ce/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/images/a460e3a9-4fd7-4fc4-bdeb-839fc85130df/25fb3d9d-dcd1-417c-b5b2-72b27e9a35c4
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,579::misc::83::Storage.Misc.excCmd::(<lambda>) '/bin/dd iflag=direct skip=304 bs=512 if=/dev/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/metadata count=1' (cwd None)
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,731::misc::83::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> = '1+0 records in\n1+0 records out\n512 bytes (512 B) copied, 0.000354389 s, 1.4 MB/s\n'; <rc> = 0
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,741::misc::344::Storage.Misc::(validateDDBytes) err: ['1+0 records in', '1+0 records out', '512 bytes (512 B) copied, 0.000354389 s, 1.4 MB/s'], size: 512
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,765::misc::83::Storage.Misc.excCmd::(<lambda>) '/bin/dd iflag=direct skip=304 bs=512 if=/dev/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/metadata count=1' (cwd None)
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,934::misc::83::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> = '1+0 records in\n1+0 records out\n512 bytes (512 B) copied, 0.000243984 s, 2.1 MB/s\n'; <rc> = 0
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,945::misc::344::Storage.Misc::(validateDDBytes) err: ['1+0 records in', '1+0 records out', '512 bytes (512 B) copied, 0.000243984 s, 2.1 MB/s'], size: 512
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,993::misc::83::Storage.Misc.excCmd::(<lambda>) '/usr/bin/qemu-img create -f qcow2 -F qcow2 -b ../a460e3a9-4fd7-4fc4-bdeb-839fc85130df/a351bc36-d741-4cf7-aa89-735c82b888e5 /rhev/data-center/89ced3c8-7615-11e2-87d3-0014227472ce/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/images/a460e3a9-4fd7-4fc4-bdeb-839fc85130df/25fb3d9d-dcd1-417c-b5b2-72b27e9a35c4' (cwd /rhev/data-center/89ced3c8-7615-11e2-87d3-0014227472ce/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/images/a460e3a9-4fd7-4fc4-bdeb-839fc85130df)
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:45,089::misc::83::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> = ''; <rc> = 0
[mku: let's see here why we are doing this ]
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::INFO::2013-08-15 14:04:45,099::blockVolume::374::Storage.Volume::(teardown) Tearing down volume f48e93bb-e3bb-4291-bd2f-8d622ef8019d/a351bc36-d741-4cf7-aa89-735c82b888e5 justme False
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:45,107::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'f48e93bb-e3bb-4291-bd2f-8d622ef8019d_lvmActivationNS.a351bc36-d741-4cf7-aa89-735c82b888e5'
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:45,122::resourceManager::553::ResourceManager::(releaseResource) Released resource 'f48e93bb-e3bb-4291-bd2f-8d622ef8019d_lvmActivationNS.a351bc36-d741-4cf7-aa89-735c82b888e5' (0 active users)
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:45,137::resourceManager::558::ResourceManager::(releaseResource) Resource 'f48e93bb-e3bb-4291-bd2f-8d622ef8019d_lvmActivationNS.a351bc36-d741-4cf7-aa89-735c82b888e5' is free, finding out if anyone is waiting for it.

85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:45,230::misc::83::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /sbin/lvm lvchange --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \'a|/dev/mapper/1Dell_Internal_Dual_SD_0123456789AB|/dev/mapper/3600601605ab02c0028c71db4c076e211|/dev/mapper/3600601605ab02c005a0df4e048a0e211|/dev/mapper/3600601605ab02c005c0df4e048a0e211|/dev/mapper/3600601605ab02c005e0df4e048a0e211|/dev/mapper/3600601605ab02c00600df4e048a0e211|/dev/mapper/3600601605ab02c00620df4e048a0e211|/dev/mapper/3600601605ab02c00640df4e048a0e211|/dev/mapper/36848f690eb9aa00018d3d7c418bfabf6|\', \'r|.*|\' ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " --autobackup n --available n f48e93bb-e3bb-4291-bd2f-8d622ef8019d/a351bc36-d741-4cf7-aa89-735c82b888e5' (cwd None)

85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:51,110::misc::83::Storage.Misc.excCmd::(<lambda>) FAILED: <err> = '  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  Unable to deactivate f48e93bb--e3bb--4291--bd2f--8d622ef8019d-a351bc36--d741--4cf7--aa89--735c82b888e5 (253:508)\n'; <rc> = 5
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:52,960::lvm::500::OperationMutex::(_invalidatelvs) Operation 'lvm reload operation' is holding the operation mutex, waiting...

Comment 4 Marina Kalinin 2013-08-21 04:55:40 UTC
Created attachment 788715 [details]
patched adjusted to current downstream

Comment 6 Marina Kalinin 2013-08-21 05:28:05 UTC
Created attachment 788718 [details]
the working patch

Comment 7 Marina Kalinin 2013-08-21 05:31:14 UTC
Implemented the change in the attached working patch.
It works: the original errors from the bug description are gone.
But it takes longer to process the request (2.5 times longer: 25 seconds compared to 7-9 seconds before). So performing the "release" command per volume is probably not that efficient.

I will wait for your advice.

Comment 9 Marina Kalinin 2013-08-21 22:02:32 UTC
From all the testing I am doing, I do not see where we actually call teardown for the newly created volume.
volume.clone creates a new volume from itself (i.e. the template LV) and then calls teardown on itself again, which never deactivates the newly created LV.
Do we really need this method? Or is it just not used properly?
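
To make the observation concrete, a hypothetical sketch of the flow as described (identifiers assumed, not the actual vdsm code):

    # Hypothetical illustration of the mismatch described above.
    template = BlockVolume(repoPath, sdUUID, imgUUID, templateVolUUID)

    # clone() creates the new LV backed by the template...
    template.clone(newVolPath, volFormat)

    # ...but the teardown that follows targets the template's own UUID;
    # the newly created LV (newVolUUID) is never the one deactivated.
    template.teardown(sdUUID, template.volUUID)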

Comment 10 Marina Kalinin 2013-08-21 22:09:00 UTC
As for the part that takes more time now, I suspect it is the activate call we added to blockVolume.py and invoke from volume.prepare:

    elif self.__class__.__name__ == "BlockVolume":
        self.activate()

Comment 11 Marina Kalinin 2013-08-21 22:25:03 UTC
blockVolume.activate performs lvchange -ay.

Before the patch, volume.prepare would set the 'r' attribute on the LV;
I do not see that we activate it as well.
With the patch, we create a new race condition/redundancy by calling
lvchange -ay on the template LV every time we clone from this template.

But this LV should be active all the time anyway.
So we should probably check whether it is a shared volume and skip the
activation part.
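
A hedged sketch of that check (the isShared() helper and class-name dispatch are assumptions based on the snippets in this bug, not the final patch):

    # Hypothetical sketch; note that reading the volume type from
    # metadata has its own cost (see comment 13).
    def prepare(self, rw=True):
        # ... existing prepare logic (permissions, chain checks) ...
        if self.__class__.__name__ == "BlockVolume" and not self.isShared():
            # Activate regular volumes only; a shared template LV is
            # expected to stay active as long as VMs derive from it.
            self.activate()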

Comment 12 Marina Kalinin 2013-08-21 22:29:38 UTC
Created attachment 789010 [details]
patch without calling activate lv for cloning from template

Comment 13 Eduardo Warszawski 2013-08-22 04:16:55 UTC
(In reply to Marina from comment #11)
> blockVolume.activate performs lvchange -ay.
> 
> Before the patch, volume.prepare would set the 'r' attribute on the LV;
> I do not see that we activate it as well.
> With the patch, we create a new race condition/redundancy by calling
> lvchange -ay on the template LV every time we clone from this template.
> 
> But this LV should be active all the time anyway.
> So we should probably check whether it is a shared volume and skip the
> activation part.

Calling lvm.activateLVs (the only path we should take) on an already active LV has no cost, since the lvm command will not be performed.
Checking whether it is a shared volume by reading the metadata would slow it down further.

We probably should avoid the clone path altogether.

Starting to read info.
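
For illustration, a minimal sketch of that idempotence (hypothetical helpers; vdsm answers the "is it active?" question from its in-memory lvm cache, so the check itself is free, whereas this self-contained version shells out to lvs):

    import subprocess

    def _is_active(vg_name, lv_name):
        # The fifth character of lv_attr is 'a' when the LV is active.
        out = subprocess.check_output(
            ["/sbin/lvm", "lvs", "--noheadings", "-o", "lv_attr",
             "%s/%s" % (vg_name, lv_name)])
        return out.strip()[4:5] == b"a"

    def activate_lvs(vg_name, lv_names):
        # Skip LVs that are already active; if nothing is inactive,
        # no lvchange command is run at all.
        inactive = [lv for lv in lv_names if not _is_active(vg_name, lv)]
        if not inactive:
            return
        subprocess.check_call(
            ["/sbin/lvm", "lvchange", "--autobackup", "n",
             "--available", "y"]
            + ["%s/%s" % (vg_name, lv) for lv in inactive])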

Comment 14 Ayal Baron 2013-08-26 13:34:09 UTC
The issue here is not deactivation of volume, it's contention on lvm calls which take too long.
createVolume can be rewritten to avoid a lot of the calls hence significantly reducing the total run time
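
As one example of the kind of call reduction meant here, a hedged sketch: batch the per-LV lvchange invocations into a single lvm call (hypothetical helper, not the actual rewrite):

    import subprocess

    def deactivate_lvs(vg_name, lv_names):
        # One lvm process for N LVs instead of N processes. Every lvm
        # invocation contends for the global lvm lock (locking_type=1
        # in the log above), so fewer invocations means less contention
        # between concurrent createVolume flows.
        subprocess.check_call(
            ["/sbin/lvm", "lvchange", "--autobackup", "n",
             "--available", "n"]
            + ["%s/%s" % (vg_name, lv) for lv in lv_names])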

Comment 15 Marina Kalinin 2013-09-09 21:55:29 UTC
(In reply to Ayal Baron from comment #14)
> The issue here is not deactivation of volume, it's contention on lvm calls
> which take too long.
> createVolume can be rewritten to avoid a lot of the calls hence
> significantly reducing the total run time

Ayal, this bz comes to address the error:
FAILED: <err> = '  device-mapper: remove ioctl on  failed: Device or resource busy\n

which occurs when VDSM tries to perform teardown on the template volume after creating a vm from that template (see volume.clone). Thus, it sounds to me that the original bug title should be set back: "vdsm should not deactivate template lv after cloning a vm from this template". Please correct me if my understanding is wrong.

Comment 16 Ayal Baron 2013-09-09 22:31:04 UTC
(In reply to Marina from comment #15)
> (In reply to Ayal Baron from comment #14)
> > The issue here is not deactivation of volume, it's contention on lvm calls
> > which take too long.
> > createVolume can be rewritten to avoid a lot of the calls hence
> > significantly reducing the total run time
> 
> Ayal, this bz comes to address the error:
> FAILED: <err> = '  device-mapper: remove ioctl on  failed: Device or
> resource busy\n
> 
> which occurs when VDSM tries to perform teardown on the template volume
> after creating a vm from that template (see volume.clone). Thus, it sounds
> to me that the original bug title should be set back: "vdsm should not
> deactivate template lv after cloning a vm from this template". Please
> correct me if my understanding is wrong.

deactivation of the template lv is not the part which is consuming the time, hence the title was wrong.
What you want is for concurrent provisioning of many thin volumes derived from the same template to be fast.
How we solve this issue is an implementation detail.
Initial patch here: http://gerrit.ovirt.org/#/c/18793/

Comment 17 Marina Kalinin 2013-09-11 14:47:47 UTC
(In reply to Ayal Baron from comment #16)
> (In reply to Marina from comment #15)
> > (In reply to Ayal Baron from comment #14)
> > > The issue here is not deactivation of volume, it's contention on lvm calls
> > > which take too long.
> > > createVolume can be rewritten to avoid a lot of the calls hence
> > > significantly reducing the total run time
> > 
> > Ayal, this bz comes to address the error:
> > FAILED: <err> = '  device-mapper: remove ioctl on  failed: Device or
> > resource busy\n
> > 
> > which occurs when VDSM tries to perform teardown on the template volume
> > after creating a vm from that template (see volume.clone). Thus, it sounds
> > to me that the original bug title should be set back: "vdsm should not
> > deactivate template lv after cloning a vm from this template". Please
> > correct me if my understanding is wrong.
> 
> deactivation of the template lv is not the part which is consuming the time,
> hence the title was wrong.
> What you want is for concurrent provisioning of many thin volumes derived
> from the same template to be fast.
I want this very much, but not in this bug.
In this bug I am asking why we deactivate the template's LV, which causes the device-mapper error.

> How we solve this issue is an implementation detail.
> Initial patch here: http://gerrit.ovirt.org/#/c/18793/

Comment 18 Ayal Baron 2013-09-11 14:56:01 UTC
http://gerrit.ovirt.org/#/c/18793/ will address the activation and deactivation problems by just avoiding both (not needed at all in this flow, which should also speed things up a bit).
That is being done for 3.3.
We're keeping this bug to track performance of the entire operation (hence the title).
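
A hedged sketch of what avoiding both could look like in the clone flow (hypothetical; the actual change is the gerrit patch linked above). The qemu-img invocation mirrors the one in the log:

    import os
    import subprocess

    def clone(self, dstPath, volFormat):
        # The template LV stays active for the whole life of its derived
        # VMs, so this flow neither activates it beforehand nor tears it
        # down afterwards; it only creates the new qcow2 on top of it.
        backing = os.path.join("..", self.imgUUID, self.volUUID)
        subprocess.check_call(
            ["/usr/bin/qemu-img", "create", "-f", "qcow2", "-F", "qcow2",
             "-b", backing, dstPath],
            cwd=os.path.dirname(dstPath))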

Comment 19 Nir Soffer 2013-12-15 20:46:33 UTC
The errors you got (device-mapper: remove ioctl on  failed: Device or resource busy) may be related to a concurrency issue with lvm, and should be solved by http://gerrit.ovirt.org/#/c/20890.

Please test again with 3.3.
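
The log above already shows vdsm's OperationMutex at work ("lvm reload operation" holding the mutex); a minimal sketch of that serialization pattern, with threading.Lock standing in for the real OperationMutex:

    import subprocess
    import threading

    _lvm_lock = threading.Lock()  # stand-in for vdsm's OperationMutex

    def run_lvm(args):
        # Serialize lvm invocations so concurrent flows do not race on
        # device-mapper state, one source of the "remove ioctl ...
        # Device or resource busy" failures quoted in this bug.
        with _lvm_lock:
            return subprocess.call(["/sbin/lvm"] + args)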

Comment 27 Yuri Obshansky 2014-06-23 11:01:45 UTC
a

Comment 33 Yuri Obshansky 2015-06-08 11:02:43 UTC
no comments

Comment 35 Yuri Obshansky 2015-06-29 07:44:31 UTC
Bug verified on RHEV-M 3.5.1.1-0.1

Setup: 
1 DC, 1 Cluster, 1 Host and 1 iSCSI SD (1 TB)
+ 10 VM Templates (2 CPU, 2G RAM, Disk 10G)

Test scenario:
50 concurrent threads (with a 1 min ramp-up) create VMs from the template (using Clone/Thin provisioning). A JMeter script measured the time from the Create VM REST API call until the VM is created and has status Down.
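
For reference, a rough Python equivalent of what the JMeter script measures (the endpoint, payload shape, and credentials below are assumptions; the RHEV-M 3.5 REST API differs in detail):

    import time
    import xml.etree.ElementTree as ET

    import requests

    API = "https://rhevm.example.com/api"   # hypothetical endpoint
    AUTH = ("admin@internal", "password")   # hypothetical credentials

    def time_vm_creation(name, template):
        # Measure from the Create VM call until the VM reports Down.
        body = ("<vm><name>%s</name>"
                "<cluster><name>Default</name></cluster>"
                "<template><name>%s</name></template></vm>"
                % (name, template))
        start = time.time()
        resp = requests.post(API + "/vms", data=body, auth=AUTH,
                             verify=False,
                             headers={"Content-Type": "application/xml"})
        resp.raise_for_status()
        href = "%s/vms/%s" % (API, ET.fromstring(resp.content).get("id"))
        while True:
            vm = ET.fromstring(
                requests.get(href, auth=AUTH, verify=False).content)
            if vm.findtext("status/state") == "down":
                return time.time() - start
            time.sleep(5)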

This bug has two parts, functional and performance:
1. Functional part verified: no errors occurred during scenario execution.
2. Performance part verified by running two scenarios: VM creation with Clone and with Thin disk provisioning.

Create VM operation response times below:

2.1. Clone
Transaction                       Average   90% Line   Min      Max
Create VM from template (msec)    3624490   5058803    988476   5123454
Create VM from template (min)     60        84         16       85

2.2. Thin
Transaction                       Average   90% Line   Min      Max
Create VM from template (msec)    33679     89997      119      119196
Create VM from template (min)     1         1          0        2

Comment 39 errata-xmlrpc 2016-03-09 20:31:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0376.html