Bug 997666 - [RFE] 100 createVolume calls on a block sd with 500 LVs takes too much time
Summary: [RFE] 100 createVolume calls on a block sd with 500 LVs takes too much time
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: RFEs
Version: 3.2.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Assignee: Allon Mureinik
QA Contact: Yuri Obshansky
URL:
Whiteboard:
Depends On:
Blocks: 1018916
 
Reported: 2013-08-15 23:45 UTC by Marina Kalinin
Modified: 2018-12-03 19:40 UTC (History)
18 users

Fixed In Version:
Doc Type: Enhancement
Doc Text:
Clone Of:
Environment:
Last Closed: 2016-03-09 20:31:59 UTC
oVirt Team: Storage
Target Upstream Version:
sherold: Triaged+


Attachments
patch adjusted to current downstream (6.89 KB, patch)
2013-08-21 04:55 UTC, Marina Kalinin
no flags
the working patch (6.90 KB, patch)
2013-08-21 05:28 UTC, Marina Kalinin
no flags
patch without calling activate on the lv when cloning from a template (6.86 KB, patch)
2013-08-21 22:29 UTC, Marina Kalinin
no flags


Links
System ID Private Priority Status Summary Last Updated
Red Hat Bugzilla 979193 0 urgent CLOSED Storage operations are slow, long waits on OperationMutex 2021-02-22 00:41:40 UTC
Red Hat Product Errata RHEA-2016:0376 0 normal SHIPPED_LIVE Red Hat Enterprise Virtualization Manager 3.6.0 2016-03-10 01:20:52 UTC
oVirt gerrit 863 0 None None None Never
oVirt gerrit 18793 0 None None None Never

Internal Links: 979193

Description Marina Kalinin 2013-08-15 23:45:59 UTC
Description of problem:
VDSM performs a teardown on the template volume after creating a VM from that template (see volume.clone). When multiple VMs are created at the same time, a race condition occurs: the template LV carries the 'o' (open) attribute because it is in use, so when VDSM tries to deactivate it, the operation fails with the error:
"device-mapper: remove ioctl on  failed: Device or resource busy\n  Unable to deactivate f48e93bb--e3bb--4291--bd2f--8d622ef8019d-a351bc36--d741--4cf7--aa89--735c82b888e5 (253:508)\n'; <rc> = 5"

(you cannot run `lvchange -an` on an open/used lv)

Why do we need to deactivate the template LV at all?
By its nature a template is meant to be in use by other VMs, and while that is happening its LV is expected to stay active.

Suggested fix:
Inside volume.clone, check whether we are creating a VM from a template; if so, call teardown with justme=True:
 self.teardown(self.sdUUID, self.volUUID, justme=True)
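
For illustration, roughly what I have in mind - a hypothetical sketch, not the actual vdsm code (isShared() is an assumed helper; the teardown() call just mirrors the suggestion above):

    # Hypothetical sketch only - isShared() is an assumed helper;
    # the teardown() call mirrors the suggestion above.
    def clone(self, dstPath, volFormat):
        # ... create the new volume backed by this one, as today ...
        if self.isShared():
            # Cloning from a template: pass justme=True as suggested above.
            self.teardown(self.sdUUID, self.volUUID, justme=True)
        else:
            self.teardown(self.sdUUID, self.volUUID, justme=False)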


Version-Release number of selected component (if applicable):
3.2.0

How reproducible:
Reproducible when deploying multiple VMs (~50) from the same template at the same time.

Steps to Reproduce:
1. Create a template.
2. Issue 50 (or more) simultaneous requests to create VMs from this template.
3. Watch vdsm.log for this error: 
"device-mapper: remove ioctl on  failed: Device or resource busy\n  Unable to deactivate".
Actual results:
This significantly slows down VM creation.
Only after all the VMs have been created does the 'o' attribute clear from the LV, allowing it to be deactivated and all the createVmFromTemplate flows to finish successfully.

Expected results:
Do not deactivate the template LV after cloning a VM from it.

Additional info:
This affects one of our big customer deployments. I will attach the ticket number.
Here is a partial flow from the vdsm.log:


85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,553::volume::263::Storage.Volume::(clone) Volume.clone: /rhev/data-center/89ced3c8-7615-11e2-87d3-0014227472ce/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/images/a460e3a9-4fd7-4fc4-bdeb-839fc85130df/a351bc36-d741-4cf7-aa89-735c82b888e5 to /rhev/data-center/89ced3c8-7615-11e2-87d3-0014227472ce/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/images/a460e3a9-4fd7-4fc4-bdeb-839fc85130df/25fb3d9d-dcd1-417c-b5b2-72b27e9a35c4
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,579::misc::83::Storage.Misc.excCmd::(<lambda>) '/bin/dd iflag=direct skip=304 bs=512 if=/dev/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/metadata count=1' (cwd None)
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,731::misc::83::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> = '1+0 records in\n1+0 records out\n512 bytes (512 B) copied, 0.000354389 s, 1.4 MB/s\n'; <rc> = 0
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,741::misc::344::Storage.Misc::(validateDDBytes) err: ['1+0 records in', '1+0 records out', '512 bytes (512 B) copied, 0.000354389 s, 1.4 MB/s'], size: 512
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,765::misc::83::Storage.Misc.excCmd::(<lambda>) '/bin/dd iflag=direct skip=304 bs=512 if=/dev/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/metadata count=1' (cwd None)
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,934::misc::83::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> = '1+0 records in\n1+0 records out\n512 bytes (512 B) copied, 0.000243984 s, 2.1 MB/s\n'; <rc> = 0
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,945::misc::344::Storage.Misc::(validateDDBytes) err: ['1+0 records in', '1+0 records out', '512 bytes (512 B) copied, 0.000243984 s, 2.1 MB/s'], size: 512
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:44,993::misc::83::Storage.Misc.excCmd::(<lambda>) '/usr/bin/qemu-img create -f qcow2 -F qcow2 -b ../a460e3a9-4fd7-4fc4-bdeb-839fc85130df/a351bc36-d741-4cf7-aa89-735c82b888e5 /rhev/data-center/89ced3c8-7615-11e2-87d3-0014227472ce/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/images/a460e3a9-4fd7-4fc4-bdeb-839fc85130df/25fb3d9d-dcd1-417c-b5b2-72b27e9a35c4' (cwd /rhev/data-center/89ced3c8-7615-11e2-87d3-0014227472ce/f48e93bb-e3bb-4291-bd2f-8d622ef8019d/images/a460e3a9-4fd7-4fc4-bdeb-839fc85130df)
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:45,089::misc::83::Storage.Misc.excCmd::(<lambda>) SUCCESS: <err> = ''; <rc> = 0
[mku: let's see here why we are doing this ]
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::INFO::2013-08-15 14:04:45,099::blockVolume::374::Storage.Volume::(teardown) Tearing down volume f48e93bb-e3bb-4291-bd2f-8d622ef8019d/a351bc36-d741-4cf7-aa89-735c82b888e5 justme False
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:45,107::resourceManager::538::ResourceManager::(releaseResource) Trying to release resource 'f48e93bb-e3bb-4291-bd2f-8d622ef8019d_lvmActivationNS.a351bc36-d741-4cf7-aa89-735c82b888e5'
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:45,122::resourceManager::553::ResourceManager::(releaseResource) Released resource 'f48e93bb-e3bb-4291-bd2f-8d622ef8019d_lvmActivationNS.a351bc36-d741-4cf7-aa89-735c82b888e5' (0 active users)
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:45,137::resourceManager::558::ResourceManager::(releaseResource) Resource 'f48e93bb-e3bb-4291-bd2f-8d622ef8019d_lvmActivationNS.a351bc36-d741-4cf7-aa89-735c82b888e5' is free, finding out if anyone is waiting for it.

85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:45,230::misc::83::Storage.Misc.excCmd::(<lambda>) '/usr/bin/sudo -n /sbin/lvm lvchange --config " devices { preferred_names = [\\"^/dev/mapper/\\"] ignore_suspended_devices=1 write_cache_state=0 disable_after_error_count=3 filter = [ \'a|/dev/mapper/1Dell_Internal_Dual_SD_0123456789AB|/dev/mapper/3600601605ab02c0028c71db4c076e211|/dev/mapper/3600601605ab02c005a0df4e048a0e211|/dev/mapper/3600601605ab02c005c0df4e048a0e211|/dev/mapper/3600601605ab02c005e0df4e048a0e211|/dev/mapper/3600601605ab02c00600df4e048a0e211|/dev/mapper/3600601605ab02c00620df4e048a0e211|/dev/mapper/3600601605ab02c00640df4e048a0e211|/dev/mapper/36848f690eb9aa00018d3d7c418bfabf6|\', \'r|.*|\' ] }  global {  locking_type=1  prioritise_write_locks=1  wait_for_locks=1 }  backup {  retain_min = 50  retain_days = 0 } " --autobackup n --available n f48e93bb-e3bb-4291-bd2f-8d622ef8019d/a351bc36-d741-4cf7-aa89-735c82b888e5' (cwd None)

85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:51,110::misc::83::Storage.Misc.excCmd::(<lambda>) FAILED: <err> = '  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  device-mapper: remove ioctl on  failed: Device or resource busy\n  Unable to deactivate f48e93bb--e3bb--4291--bd2f--8d622ef8019d-a351bc36--d741--4cf7--aa89--735c82b888e5 (253:508)\n'; <rc> = 5
85e6363c-1a1c-4e72-b1e7-0afd0daa67b9::DEBUG::2013-08-15 14:04:52,960::lvm::500::OperationMutex::(_invalidatelvs) Operation 'lvm reload operation' is holding the operation mutex, waiting...

Comment 4 Marina Kalinin 2013-08-21 04:55:40 UTC
Created attachment 788715 [details]
patch adjusted to current downstream

Comment 6 Marina Kalinin 2013-08-21 05:28:05 UTC
Created attachment 788718 [details]
the working patch

Comment 7 Marina Kalinin 2013-08-21 05:31:14 UTC
Implemented the change in the attached working patch.
It works - the original errors from the bug description no longer appear.
But it takes longer to process the request (2.5 times longer - 25 seconds compared to 7-9 seconds before). So performing the "release" command per volume is probably not that efficient.

I will wait for your advice.

Comment 9 Marina Kalinin 2013-08-21 22:02:32 UTC
From all the testing I am doing, I do not see where we actually call teardown for the newly created volume.
volume.clone creates a new volume from itself (i.e. the template LV), and then calls teardown on itself again, which never deactivates the newly created LV.
Do we really need this method? Or is it just not being used properly?

Comment 10 Marina Kalinin 2013-08-21 22:09:00 UTC
As for the part that now takes more time, I suspect it is the activate command we added to blockVolume.py and call from volume.prepare:

    elif self.__class__.__name__ == "BlockVolume":
        self.activate()

Comment 11 Marina Kalinin 2013-08-21 22:25:03 UTC
blockVolume.activate would perform lvchange -ay.

Before applying the patch, volume.prepare would set the 'r' attribute on the LV.
I don't see whether we activate it as well.
With the patch, we create a new race condition/redundancy by calling lvchange -ay on the template LV every time we clone from this template.

But this LV should be active all the time anyway.
So we should probably check whether it is a shared volume and skip the activation part, roughly along the lines of the sketch below.
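
A rough sketch of what that check could look like in the prepare path - isShared() and activate() here are assumptions based on the discussion above, not the actual blockVolume.py code:

    # Hypothetical sketch of the prepare() path discussed above.
    # A shared (template) LV is expected to stay active anyway,
    # so only non-shared block volumes get activated here.
    def prepare(self, rw=True):
        if self.__class__.__name__ == "BlockVolume" and not self.isShared():
            self.activate()   # runs 'lvchange -ay' on the LV
        # ... continue setting the 'r' attribute / permissions as before ...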

Comment 12 Marina Kalinin 2013-08-21 22:29:38 UTC
Created attachment 789010 [details]
patch without calling activate on the lv when cloning from a template

Comment 13 Eduardo Warszawski 2013-08-22 04:16:55 UTC
(In reply to Marina from comment #11)
> blockVolume.activate would perform lvchange -ay.
> 
> Before applying the patch, volume.prepare would set 'r' attribute to the lv.
> I don't see if we activate it as well.
> With the patch, we create a new race condition/redundancy - by calling
> lvchange -ay for the template lv all the time we clone from this template.
> 
> But this lv should be active all the time anyway.
> So we should check if it is a shared volume and skip the activation part,
> probably.

Calling lvm.activateLVs (the only path we should take) on an already active LV has no cost, since the lvm command will not be performed.
Checking whether it is a shared volume by reading the metadata would only slow it down further.

We probably should avoid the clone path altogether.

Starting to read info.
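
To illustrate why re-activating an already active LV is essentially free, a rough sketch of the filtering idea (a standalone illustration, not vdsm's real lvm module; in vdsm the active state would come from its lvm cache, and here a direct lvs query stands in for that cache):

    # Illustrative sketch only - not vdsm's real lvm module.
    import subprocess

    def activate_lvs(vg_name, lv_names):
        # Ask LVM which of the requested LVs are already active
        # (the 5th character of lv_attr is 'a' for an active LV).
        out = subprocess.check_output(
            ["lvs", "--noheadings", "-o", "lv_name,lv_attr", vg_name])
        already_active = set()
        for line in out.decode().split("\n"):
            fields = line.split()
            if len(fields) == 2 and fields[1][4] == "a":
                already_active.add(fields[0])
        inactive = [lv for lv in lv_names if lv not in already_active]
        if not inactive:
            return  # nothing to do: no lvchange process is spawned at all
        # Activate only the LVs that actually need it.
        subprocess.check_call(
            ["lvchange", "--autobackup", "n", "--available", "y"]
            + ["%s/%s" % (vg_name, lv) for lv in inactive])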

Comment 14 Ayal Baron 2013-08-26 13:34:09 UTC
The issue here is not the deactivation of the volume; it is contention on LVM calls, which take too long.
createVolume can be rewritten to avoid a lot of these calls, significantly reducing the total run time.

Comment 15 Marina Kalinin 2013-09-09 21:55:29 UTC
(In reply to Ayal Baron from comment #14)
> The issue here is not deactivation of volume, it's contention on lvm calls
> which take too long.
> createVolume can be rewritten to avoid a lot of the calls hence
> significantly reducing the total run time

Ayal, this BZ was opened to address the error:
FAILED: <err> = '  device-mapper: remove ioctl on  failed: Device or resource busy\n

which occurs when VDSM tries to perform a teardown on the template volume after creating a VM from that template (see volume.clone). Thus, it sounds to me that the original bug title should be restored: "vdsm should not deactivate template lv after cloning a vm from this template". Please correct me if my understanding is wrong.

Comment 16 Ayal Baron 2013-09-09 22:31:04 UTC
(In reply to Marina from comment #15)
> (In reply to Ayal Baron from comment #14)
> > The issue here is not deactivation of volume, it's contention on lvm calls
> > which take too long.
> > createVolume can be rewritten to avoid a lot of the calls hence
> > significantly reducing the total run time
> 
> Ayal, this bz comes to address the error:
> FAILED: <err> = '  device-mapper: remove ioctl on  failed: Device or
> resource busy\n
> 
> which occurs when VDSM tries to perform teardown on the template volume
> after creating a vm from that template (see volume.clone). Thus, it sounds
> to me that the original bug title should be set back: "vdsm should not
> deactivate template lv after cloning a vm from this template". Please
> correct me if my understanding is wrong.

deactivation of the template lv is not the part that is consuming the time, hence the title was wrong.
What you want is for concurrent provisioning of many thin volumes derived from the same template to be fast.
How we solve this issue is an implementation detail.
Initial patch here: http://gerrit.ovirt.org/#/c/18793/

Comment 17 Marina Kalinin 2013-09-11 14:47:47 UTC
(In reply to Ayal Baron from comment #16)
> (In reply to Marina from comment #15)
> > (In reply to Ayal Baron from comment #14)
> > > The issue here is not deactivation of volume, it's contention on lvm calls
> > > which take too long.
> > > createVolume can be rewritten to avoid a lot of the calls hence
> > > significantly reducing the total run time
> > 
> > Ayal, this bz comes to address the error:
> > FAILED: <err> = '  device-mapper: remove ioctl on  failed: Device or
> > resource busy\n
> > 
> > which occurs when VDSM tries to perform teardown on the template volume
> > after creating a vm from that template (see volume.clone). Thus, it sounds
> > to me that the original bug title should be set back: "vdsm should not
> > deactivate template lv after cloning a vm from this template". Please
> > correct me if my understanding is wrong.
> 
> deactivation of the template lv is not the part which is consuming the time,
> hence the title was wrong.
> What you want is for concurrent provisioning of many thin volumes derived
> from the same template to be fast.
I want this very much, but not in this bug.
In this bug I am asking why we deactivate the template's LV, which causes the device-mapper error.

> How we solve this issue is an implementation detail.
> Initial patch here: http://gerrit.ovirt.org/#/c/18793/

Comment 18 Ayal Baron 2013-09-11 14:56:01 UTC
http://gerrit.ovirt.org/#/c/18793/ will address the activation and deactivation problems by just avoiding both (not needed at all in this flow, which should also speed things up a bit).
That is being done for 3.3.
We're keeping this bug to track performance of the entire operation (hence the title).

Comment 19 Nir Soffer 2013-12-15 20:46:33 UTC
The errors you got (device-mapper: remove ioctl on  failed: Device or resource busy) may be related to a concurrency issue with lvm, and should be solved by http://gerrit.ovirt.org/#/c/20890.

Please test again with 3.3.

Comment 27 Yuri Obshansky 2014-06-23 11:01:45 UTC
a

Comment 33 Yuri Obshansky 2015-06-08 11:02:43 UTC
no comments

Comment 35 Yuri Obshansky 2015-06-29 07:44:31 UTC
Bug verified on RHEV-M 3.5.1.1-0.1

Setup: 
1 DC, 1 Cluster, 1 Host and 1 iSCSI SD (1 TB)
+ 10 VM Templates (2 CPU, 2G RAM, Disk 10G)

Test scenario:
50 concurrent threads (with a 1 min ramp-up) create VMs from the template (using Clone/Thin provisioning). The JMeter script measured the time from the Create VM REST API call until the VM is created and has status Down.
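
For reference, the timed operation boils down to something like the following sketch (the /api/vms endpoint, XML payload and status/state element are assumptions about the RHEV 3.5 REST API, not taken from the actual JMeter script):

    # Rough sketch of what each thread measures; not the real JMeter script.
    import time
    import xml.etree.ElementTree as ET
    import requests

    def create_vm_and_wait(api_url, auth, name, template, cluster):
        body = ("<vm><name>%s</name>"
                "<template><name>%s</name></template>"
                "<cluster><name>%s</name></cluster></vm>") % (name, template, cluster)
        start = time.time()
        resp = requests.post(api_url + "/vms", data=body, auth=auth, verify=False,
                             headers={"Content-Type": "application/xml"})
        resp.raise_for_status()
        vm_id = ET.fromstring(resp.content).get("id")
        # Poll until the new VM reaches status 'down', i.e. creation finished.
        while True:
            vm = ET.fromstring(requests.get("%s/vms/%s" % (api_url, vm_id),
                                            auth=auth, verify=False).content)
            if vm.findtext("status/state") == "down":
                return time.time() - start   # elapsed seconds
            time.sleep(5)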

This bug has 2 parts: functional and performance.
1. Functional part verified; the errors did not occur during scenario execution.
2. Performance part verified by running 2 scenarios: VM creation with Clone and with Thin disk provisioning.

Create VM operation response times are below:

2.1. Clone
Transaction                        Average    90% Line   Min       Max
Create VM from template (msec)     3624490    5058803    988476    5123454
Create VM from template (min)      60         84         16        85

2.2. Thin
Transaction                        Average    90% Line   Min       Max
Create VM from template (msec)     33679      89997      119       119196
Create VM from template (min)      1          1          0         2

Comment 39 errata-xmlrpc 2016-03-09 20:31:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-0376.html

