Bug 1100527
| Summary: | ResourceManager queue is not being cleared properly, causing other tasks to time out trying to get the locked resource | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Virtualization Manager | Reporter: | Marina Kalinin <mkalinin> |
| Component: | vdsm | Assignee: | Federico Simoncelli <fsimonce> |
| Status: | CLOSED WORKSFORME | QA Contact: | Aharon Canan <acanan> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | 3.3.0 | CC: | amureini, bazulay, dfediuck, fsimonce, gklein, iheim, lpeer, mkalinin, scohen, tdosek, yeylon |
| Target Milestone: | --- | ||
| Target Release: | 3.4.1 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | storage | ||
| Fixed In Version: | | Doc Type: | Bug Fix |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2014-06-08 22:10:32 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | Storage | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
The system seems under heavy load and operations are taking longer than expected:
- 38 block storage domains attached
- 15 VMs running on the SPM (at least 7 with host cpu usage over 50%)
- vdsm under heavy load (2724% cpu usage?)
- load average: 5.91, 5.54, 5.56
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
qemu 42778 13.4 1.2 18078544 12831924 ? Sl 03:11 114:20 /usr/libexec/qemu-kvm
qemu 61104 23.4 0.6 18077664 6696008 ? Sl 03:14 198:47 /usr/libexec/qemu-kvm
qemu 149188 24.8 0.9 18040884 9870000 ? Sl 02:58 215:10 /usr/libexec/qemu-kvm
qemu 161767 26.1 0.8 18037604 8630748 ? Sl 03:02 224:36 /usr/libexec/qemu-kvm
qemu 135355 64.0 1.5 18046408 16905068 ? Sl 02:55 556:10 /usr/libexec/qemu-kvm
qemu 129328 70.0 1.5 18076460 16881296 ? Sl 02:52 609:44 /usr/libexec/qemu-kvm
qemu 82353 86.0 0.1 18076496 2077036 ? Sl 03:20 724:54 /usr/libexec/qemu-kvm
qemu 30986 129 1.5 18039844 16795672 ? Sl 03:08 1110:15 /usr/libexec/qemu-kvm
qemu 17820 132 1.5 18038676 16786008 ? Sl 03:05 1136:22 /usr/libexec/qemu-kvm
qemu 54152 156 1.5 18048112 16812268 ? Sl 03:13 1327:17 /usr/libexec/qemu-kvm
qemu 119067 216 0.2 9304372 3094524 ? Sl 02:49 1891:31 /usr/libexec/qemu-kvm
vdsm 98759 2724 0.0 9797276 143440 ? S<l 02:37 24128:44 /usr/bin/python /usr/share/vdsm/vdsm
That said, it seems that updating the VMs' OVF is taking too long (3 minutes and 41 seconds):
Thread-8651::INFO::2014-05-22 14:48:39,717::logUtils::44::dispatcher::(wrapper) Run and protect: updateVM(spUUID='...
...
Thread-8651::ERROR::2014-05-22 14:52:20,187::SecureXMLRPCServer::168::root::(handle_error) client ('10.113.136.147', 52763)
Traceback (most recent call last):
File "/usr/lib64/python2.6/SocketServer.py", line 570, in process_request_thread
File "/usr/lib64/python2.6/SocketServer.py", line 332, in finish_request
File "/usr/lib64/python2.6/SocketServer.py", line 627, in __init__
File "/usr/lib64/python2.6/BaseHTTPServer.py", line 329, in handle
File "/usr/lib64/python2.6/BaseHTTPServer.py", line 323, in handle_one_request
File "/usr/lib64/python2.6/SimpleXMLRPCServer.py", line 490, in do_POST
File "/usr/lib64/python2.6/BaseHTTPServer.py", line 384, in send_response
File "/usr/lib64/python2.6/BaseHTTPServer.py", line 390, in send_header
File "/usr/lib64/python2.6/socket.py", line 324, in write
File "/usr/lib64/python2.6/socket.py", line 303, in flush
File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 222, in write
File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 205, in _write_bio
SSLError: (32, 'Broken pipe')
Engine side:
2014-05-22 16:48:18,497 INFO [org.ovirt.engine.core.bll.OvfDataUpdater] (DefaultQuartzScheduler_Worker-7) Attempting to update VMs/Templates Ovf.
2014-05-22 16:48:18,498 INFO [org.ovirt.engine.core.bll.OvfDataUpdater] (DefaultQuartzScheduler_Worker-7) Attempting to update VM OVFs in Data Center VCP
...
2014-05-22 16:51:28,124 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) IrsBroker::Failed::UpdateVMVDS due to: TimeoutException:
I'd like to point out that the entire size of the XML-RPC updateVM request is ~450 KB (75 VMs)!
On the vdsm side the majority of the time is spent trying to acquire and release the VM OVF locks (useless?):
Thread-8651::DEBUG::2014-05-22 14:48:39,765::resourceManager::197::ResourceManager.Request::(__init__) ResName=`Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`8fa1e150-cd58-450f-93df-fabac7852e22`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '1330' at 'updateVM'
...
Thread-8651::DEBUG::2014-05-22 14:50:58,902::task::974::TaskManager.Task::(_decref) Task=`249892bc-d334-4d48-9c22-b73c6c3f296f`::ref 1 aborting False
Thread-8651::DEBUG::2014-05-22 14:51:14,212::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'...
...
Thread-8651::DEBUG::2014-05-22 14:52:19,346::resourceManager::648::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.a843d0b8-38cf-4056-b08d-9e1d31dd7e07_80767ba5-7c87-45dd-9a32-7330ed889af5', Clearing records.
For a total of 3 minutes and 24 seconds (just to acquire/release 76 locks).
If we hadn't spent this time, the operation would have lasted only ~17 seconds.
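As a quick sanity check of the figures above (a sketch; the 221/204/76 numbers are taken from the log analysis in this comment):

```python
# Back-of-the-envelope check of the updateVM timings quoted above.
total_s = 3 * 60 + 41      # updateVM took 3m41s end to end (221 s)
lock_s = 3 * 60 + 24       # 3m24s spent acquiring/releasing locks (204 s)
locks = 76                 # number of locks taken during the call

per_lock_s = lock_s / float(locks)   # average cost of one acquire/release
useful_s = total_s - lock_s          # time left for the actual work

print("%.2f s per lock, %d s of useful work" % (per_lock_s, useful_s))
```

So each acquire/release pair cost on the order of 2.7 seconds on this loaded host, dwarfing the useful work.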
The updateVM failure triggers an spmStop command:
2014-05-22 16:51:28,124 ERROR [org.ovirt.engine.core.vdsbroker.irsbroker.IrsBrokerCommand] (DefaultQuartzScheduler_Worker-7) IrsBroker::Failed::UpdateVMVDS due to: TimeoutException:
2014-05-22 16:51:30,611 INFO [org.ovirt.engine.core.vdsbroker.vdsbroker.SpmStopVDSCommand] (DefaultQuartzScheduler_Worker-7) START, SpmStopVDSCommand(HostName = vcpfscrx900h02.vas.vodacom.corp, HostId = febf7f89-55f7-46e6-b65e-d4ee23f96f72, storagePoolId = 7f7534f7-e5d4-44b3-b34a-4d0743734e7d), log id: 5e0b63f
which on the vdsm side takes a really long time (40 minutes and 7 seconds) and times out as well:
Thread-8687::INFO::2014-05-22 14:51:33,565::logUtils::44::dispatcher::(wrapper) Run and protect: spmStop(spUUID='7f7534f7-e5d4-44b3-b34a-4d0743734e7d', options=None)
...
Thread-8687::ERROR::2014-05-22 15:31:40,828::SecureXMLRPCServer::168::root::(handle_error) client ('10.113.136.147', 34969)
Traceback (most recent call last):
File "/usr/lib64/python2.6/SocketServer.py", line 570, in process_request_thread
File "/usr/lib64/python2.6/SocketServer.py", line 332, in finish_request
File "/usr/lib64/python2.6/SocketServer.py", line 627, in __init__
File "/usr/lib64/python2.6/BaseHTTPServer.py", line 329, in handle
File "/usr/lib64/python2.6/BaseHTTPServer.py", line 323, in handle_one_request
File "/usr/lib64/python2.6/SimpleXMLRPCServer.py", line 490, in do_POST
File "/usr/lib64/python2.6/BaseHTTPServer.py", line 384, in send_response
File "/usr/lib64/python2.6/BaseHTTPServer.py", line 390, in send_header
File "/usr/lib64/python2.6/socket.py", line 324, in write
File "/usr/lib64/python2.6/socket.py", line 303, in flush
File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 222, in write
File "/usr/lib64/python2.6/site-packages/M2Crypto/SSL/Connection.py", line 205, in _write_bio
SSLError: bad write retry
The majority of the time (39 minutes and 22 seconds) is spent in:
Thread-8687::DEBUG::2014-05-22 14:51:54,540::sp::381::Storage.StoragePool::(cleanupMasterMount) unmounting /rhev/data-center/mnt/blockSD/80767ba5-7c87-45dd-9a32-7330ed889af5/master
Thread-8687::DEBUG::2014-05-22 15:31:16,887::mount::226::Storage.Misc.excCmd::(_runcmd) '/usr/bin/sudo -n /bin/umount /rhev/data-center/mnt/blockSD/80767ba5-7c87-45dd-9a32-7330ed889af5/master' (cwd None)
If we hadn't spent this time in cleanupMasterMount, then spmStop would have taken only 45 seconds.
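One way to keep a stuck umount from blocking spmStop this long would be to run the external command with a watchdog timeout. This is only an illustrative sketch using a modern subprocess API, not how vdsm's Storage.Misc.excCmd actually works:

```python
import subprocess

def run_with_timeout(cmd, timeout_s):
    """Run an external command; raise subprocess.TimeoutExpired if it
    hangs longer than timeout_s instead of blocking the caller forever."""
    return subprocess.run(cmd, timeout=timeout_s, check=True)

# e.g. run_with_timeout(["umount", masterdir], 60) would have surfaced the
# stuck master umount after one minute instead of ~39.
```

On timeout the child is killed and the caller can report the failure promptly instead of holding the storage pool lock.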
There's no logging in this area but looking at the code I believe the entire time is spent in the first lines of blockSD.BlockStorageDomain.doUnmountMaster():
1211     def doUnmountMaster(cls, masterdir):
...
1217         cls.__handleStuckUmount(masterdir)
1218         try:
1219             masterMount = mount.getMountFromTarget(masterdir)
1220         except OSError as ex:
1221             if ex.errno == errno.ENOENT:
1222                 return
1224             raise
1225         if masterMount.isMounted():
1226             # Try umount, take 1
1227             try:
1228                 masterMount.umount()
probably in __handleStuckUmount or in getMountFromTarget/isMounted.
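Since there is no logging in this area, a throwaway timing wrapper (a hypothetical debugging aid, not vdsm code) around the suspect calls would pinpoint the slow step:

```python
import functools
import logging
import time

log = logging.getLogger("Storage.timing")

def timed(fn):
    """Wrap fn so its wall-clock duration is logged on every call."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.time()
        try:
            return fn(*args, **kwargs)
        finally:
            log.debug("%s took %.1f s", fn.__name__, time.time() - start)
    return wrapper

# e.g. wrap getMountFromTarget and isMounted before reproducing,
# then grep vdsm.log for the "took" lines.
```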
Bottom line:
- we can try to make updateVM faster (probably getting rid of the locks if they're useless)
- we can try to investigate what is taking so long in doUnmountMaster
Anyway, in general this seems related to the high CPU usage on the host.
Probably Nir's work on bug 1095907 (also related to the mount module) should improve the situation here.
Meanwhile I suggest lowering the number of VMs currently running on the SPM, and possibly tweaking the OvfItemsCountPerUpdate config (maybe setting it to 10-20) to reduce the number of OVFs sent per update.
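For reference, the config tweak would be applied on the engine host with engine-config (a sketch; check the exact key name and restart requirements for your engine version):

```shell
# Reduce the number of OVFs sent per updateVM call (suggested range: 10-20).
engine-config -s OvfItemsCountPerUpdate=10
# The engine service typically needs a restart to pick up config changes.
service ovirt-engine restart
```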
(In reply to Federico Simoncelli from comment #4)
> Bottom line:
>
> - we can try to make updateVM faster (probably getting rid of the locks if
> they're useless)

Looking forward, in 3.4.0, on 3.4.0 DCs, we don't use it at all, but stream OVFs to a special disk per domain. So this should be mitigated considerably (see bug 1032686).

I'd suggest the following course of action:

1. Tweak the current system's config, as Federico suggested:

> Meanwhile I suggest to lower the amount of VMs that are currently running on
> the SPM and eventually tweak the OvfItemsCountPerUpdate config (maybe set it
> to 10-20) to lower the amount of ovf sent.

2. Roll out 3.3.3, which includes quite a bit of work to reduce CPU consumption:

> Anyway in general this seems related to the high cpu usage on the host.
> Probably Nir's work on bug 1095907 (also related to the mount module) should
> improve the situation here.

If this isn't enough, we should re-assess, but generally, I don't want to put too much work into investigating a piece of code we're deprecating anyway.

Marina - we're basically waiting for your input on whether the above does the trick for the customer or not.

Allon, thank you. This is exactly the action plan we have decided on. Trying to implement Fede's suggestions with the customer now.

Doron / Allon,

Assuming the customer has 80 cores, what value would you recommend to set for SpmVCpuConsumption? The customer set it to 9 and didn't notice any significant effect. What would be a rule of thumb for this parameter? 50%? Or is it insignificant now, and we had better see how the customer environment behaves after the upgrade to 3.3.3?

Thanks,
Marina.

(In reply to Marina from comment #15)
> Doron / Allon,
>
> Assuming the customer has 80 cores, what value would you recommend to set
> for SpmVCpuConsumption?
> The customer set it to 9 and didn't notice any significant effect.
> What would be a rule of thumb for this parameter? 50%?
> Or is it insignificant now, and we had better see how the customer
> environment behaves after the upgrade to 3.3.3?
>
> Thanks,
> Marina.

Marina,

SpmVCpuConsumption is not directly related to this issue, so I suggest we discuss it over email. To answer your questions: SpmVCpuConsumption is one factor among many others considered while calculating loads for load balancing and new VM placement. Additionally, we should consider VcpuConsumptionPercentage, which is used together with SpmVCpuConsumption. Each flow uses SpmVCpuConsumption differently, but generally speaking the basic behavior did not change.

For new VM placement, SpmVCpuConsumption is used for optimization only, as follows:

// vcpu = VcpuConsumptionPercentage
// spmCpu = SpmVCpuConsumption
host-score = (hostCpu / vcpu) + (pendingVcpus + vm.getNumOfCpus() + spmCpu) / hostCores;

For load balancing, SpmVCpuConsumption is used for load calculations as follows:

host-load = SpmVCpuConsumption * VcpuConsumptionPercentage / vds.getCpuCores() + vds.getUsageCpuPercent()

So... as you can see, this changes from one host to another and from one load pattern to another. If you're looking to improve things using load balancing and wish to have as few VMs as possible on the SPM host, you can try increasing it and then give the system time to balance based on the new settings (please verify the value is being refreshed and a restart is not needed). A positive indication will be when the logs indicate that load balancing is migrating VMs from the SPM host to the other hosts.

Marina,

After implementing Doron's suggestions and upgrading to 3.3.3, was the issue resolved? Are there any more action items on dev's side?

They did not upgrade to 3.3.3 so far (the maintenance window is scheduled for the end of this week).

However, implementing all the config changes and repairing the RCs from the sanlock side did make the environment stable.

I suggest we keep this bug open until they confirm the env.
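The two scheduler formulas quoted in the comment above can be written out as plain functions (a sketch; the example numbers below are hypothetical, not the customer's actual values):

```python
def host_score(host_cpu, pending_vcpus, vm_num_cpus, host_cores,
               vcpu_pct, spm_cpu):
    """VM-placement score from the formula above (lower is better)."""
    return (host_cpu / float(vcpu_pct)
            + (pending_vcpus + vm_num_cpus + spm_cpu) / float(host_cores))

def host_load(spm_vcpu, vcpu_pct, cpu_cores, usage_cpu_pct):
    """Load-balancing load for the SPM host, from the formula above."""
    return spm_vcpu * vcpu_pct / float(cpu_cores) + usage_cpu_pct

# Hypothetical 80-core SPM host at 50% CPU: with SpmVCpuConsumption=1 and
# VcpuConsumptionPercentage=10, the SPM term adds only 10/80 = 0.125 to the
# load, which illustrates why small values barely move the needle.
print(host_load(spm_vcpu=1, vcpu_pct=10, cpu_cores=80, usage_cpu_pct=50))
```

Even SpmVCpuConsumption=9 on 80 cores only shifts the load term by ~1.1, consistent with the "no significant effect" observation.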
stood stable after the upgrade, and close it as NOTABUG afterwards.

Does that make sense to you, Allon?

(In reply to Tomas Dosek from comment #18)
> They did not upgrade to 3.3.3 so far (maintenance window is scheduled for
> the end of this week).
>
> However implementing all the config changes and repairing the RCs from
> sanlock side did make the environment stable.
>
> I suggest we keep this bug opened until they confirm the env. stood stable
> after the upgrade and close it as NOTABUG afterwards.
>
> Does that make sense to you Allon?

Yup, sounds great to me. Please report the status after the upgrade. Thanks!

Sure!

(In reply to Allon Mureinik from comment #19)
> (In reply to Tomas Dosek from comment #18)
> > They did not upgrade to 3.3.3 so far (maintenance window is scheduled for
> > the end of this week).
> >
> > However implementing all the config changes and repairing the RCs from
> > sanlock side did make the environment stable.
> >
> > I suggest we keep this bug opened until they confirm the env. stood stable
> > after the upgrade and close it as NOTABUG afterwards.
> >
> > Does that make sense to you Allon?
> Yup, sounds great to me.
> Please report the status after the upgrade.

Discussed with Marina offline. The customer env. seems to be stable, and it will take some time to orchestrate an upgrade. I'm closing this BZ as WORKSFORME (since, well... it does :-)). Feel free to reopen it if the issue resurfaces.
The queue of tasks waiting for the storage resource is not cleaned properly (I see it mainly after the resource was exclusively locked, as requested by _shutDownUpgrade in sp.py). As a result, even after the task has released the resource, the queue does not get cleared; other tasks enter the queue, time out, and never run. An additional question is what causes this queue to empty itself at some point - 2 hours after the initial lock was granted.

Version-Release number of selected component (if applicable):
rhevm-3.3.1-0.48
vdsm-4.13.2-0.11

How reproducible:
On customer site, every 40 minutes or so. The customer has about 40 FC storage domains. Once the ISO domain was put in maintenance, the problem occurrence decreased, but it still keeps happening.

================================================================
Here are the logs from vdsm on the SPM. We are looking into the resource: 7f7534f7-e5d4-44b3-b34a-4d0743734e7d

[mku: start here, requesting lock for upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d (why upgrade, btw?)]
vdsm.log.2.xz:Thread-8687::DEBUG::2014-05-22 14:51:35,616::resourceManager::197::ResourceManager.Request::(__init__) ResName=`Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`a6455513-d6ef-445c-b118-2008fea52646`::Request was made in '/usr/share/vdsm/storage/sp.py' line '345' at '_shutDownUpgrade'

[mku: requesting exclusive lock]
vdsm.log.2.xz:Thread-8687::DEBUG::2014-05-22 14:51:35,721::resourceManager::541::ResourceManager::(registerResource) Trying to register resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d' for lock type 'exclusive'
vdsm.log.2.xz:Thread-8687::DEBUG::2014-05-22 14:51:36,001::resourceManager::600::ResourceManager::(registerResource) Resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is free. Now locking as 'exclusive' (1 active user)

[mku: now exclusive lock is granted]
vdsm.log.2.xz:Thread-8687::DEBUG::2014-05-22 14:51:36,097::resourceManager::237::ResourceManager.Request::(grant) ResName=`Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`a6455513-d6ef-445c-b118-2008fea52646`::Granted request

[mku: here we are trying to release the resource]
vdsm.log.2.xz:Thread-8687::DEBUG::2014-05-22 14:51:36,366::resourceManager::615::ResourceManager::(releaseResource) Trying to release resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d'
vdsm.log.2.xz:Thread-8687::DEBUG::2014-05-22 14:51:36,497::resourceManager::634::ResourceManager::(releaseResource) Released resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d' (0 active users)
vdsm.log.2.xz:Thread-8687::DEBUG::2014-05-22 14:51:36,524::resourceManager::640::ResourceManager::(releaseResource) Resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is free, finding out if anyone is waiting for it.

[mku: here apparently we have released the resource]
vdsm.log.2.xz:Thread-8687::DEBUG::2014-05-22 14:51:36,536::resourceManager::648::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d', Clearing records.
[mku: here is the next thread trying to register to the resource]
vdsm.log.2.xz:Thread-8722::DEBUG::2014-05-22 14:54:36,584::resourceManager::197::ResourceManager.Request::(__init__) ResName=`Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`d3359f6c-f07a-4c68-ac82-4fe1c5171236`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '1216' at 'deactivateStorageDomain'
vdsm.log.2.xz:Thread-812::DEBUG::2014-05-22 14:54:36,609::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
vdsm.log.2.xz:Thread-8722::DEBUG::2014-05-22 14:54:36,757::resourceManager::541::ResourceManager::(registerResource) Trying to register resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' for lock type 'exclusive'
vdsm.log.2.xz:Thread-812::DEBUG::2014-05-22 14:54:36,830::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}

[mku: here we are entering the queue for the first time and fail. now only 1 reported in the queue]
vdsm.log.2.xz:Thread-8722::DEBUG::2014-05-22 14:54:36,971::resourceManager::576::ResourceManager::(registerResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is currently locked, Entering queue (1 in queue)
vdsm.log.2.xz:Thread-8722::DEBUG::2014-05-22 14:56:37,161::resourceManager::209::ResourceManager.Request::(cancel) ResName=`Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`d3359f6c-f07a-4c68-ac82-4fe1c5171236`::Canceled request
vdsm.log.2.xz:Thread-8722::DEBUG::2014-05-22 14:56:37,559::resourceManager::818::ResourceManager.Owner::(acquire) 78619516-83df-42d8-8f2e-7d3505f200e6: request for 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' timed out after '120.000000' seconds

[mku: continue requesting the lock for another task, and enter the queue again - now there are 2]
vdsm.log.2.xz:Thread-8744::DEBUG::2014-05-22 14:56:43,561::resourceManager::197::ResourceManager.Request::(__init__) ResName=`Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`433ee983-dce1-485c-956f-be28daecdf53`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '2562' at 'getStoragePoolInfo'
vdsm.log.2.xz:Thread-8744::DEBUG::2014-05-22 14:56:43,577::resourceManager::541::ResourceManager::(registerResource) Trying to register resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' for lock type 'shared'
vdsm.log.2.xz:Thread-8744::DEBUG::2014-05-22 14:56:43,618::resourceManager::576::ResourceManager::(registerResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is currently locked, Entering queue (2 in queue)

[mku: it continues even further:]
vdsm.log.1.xz:Thread-9115::DEBUG::2014-05-22 15:30:04,722::resourceManager::576::ResourceManager::(registerResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is currently locked, Entering queue (16 in queue)

[mku: now, 2 hours later - the Owner released all the requests and now all the tasks are popped out of the queue]
Thread-8687::DEBUG::2014-05-22 15:31:33,831::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d': < ResourceRef 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d', isValid: 'True' obj: 'None'>}
Thread-8687::DEBUG::2014-05-22 15:31:34,125::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-8687::DEBUG::2014-05-22 15:31:34,397::resourceManager::615::ResourceManager::(releaseResource) Trying to release resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d'
Thread-8687::DEBUG::2014-05-22 15:31:34,746::resourceManager::634::ResourceManager::(releaseResource) Released resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' (0 active users)
Thread-8687::DEBUG::2014-05-22 15:31:34,953::resourceManager::640::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is free, finding out if anyone is waiting for it.
Thread-413::DEBUG::2014-05-22 15:31:34,994::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-8687::DEBUG::2014-05-22 15:31:35,063::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 16 requests in queue. Handling top request.
Thread-413::DEBUG::2014-05-22 15:31:35,108::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-8687::DEBUG::2014-05-22 15:31:35,203::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - exclusive: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:35,353::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 15 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:35,516::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:35,668::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 14 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:35,811::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:35,971::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 13 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:36,148::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:36,361::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 12 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:36,506::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:36,671::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 11 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:36,862::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:37,095::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 10 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:37,257::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-9132::DEBUG::2014-05-22 15:31:37,378::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-8687::DEBUG::2014-05-22 15:31:37,483::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 9 requests in queue. Handling top request.
Thread-9132::DEBUG::2014-05-22 15:31:37,541::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-8687::DEBUG::2014-05-22 15:31:37,637::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:37,801::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 8 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:37,840::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:37,866::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 7 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:37,876::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:37,961::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 6 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:38,001::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:38,093::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 5 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:38,250::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:38,452::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 4 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:38,614::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:38,880::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 3 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:39,055::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-8687::DEBUG::2014-05-22 15:31:39,228::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 2 requests in queue. Handling top request.
Thread-8687::DEBUG::2014-05-22 15:31:39,395::resourceManager::662::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: canceled' was canceled, Ignoring it.
Thread-812::DEBUG::2014-05-22 15:31:39,458::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-8687::DEBUG::2014-05-22 15:31:39,627::resourceManager::653::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' has 1 requests in queue. Handling top request.
Thread-812::DEBUG::2014-05-22 15:31:39,693::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-8687::DEBUG::2014-05-22 15:31:39,862::resourceManager::237::ResourceManager.Request::(grant) ResName=`Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`f77a73c1-ceb4-4057-a1e6-df6795b3ea6a`::Granted request
Thread-8687::DEBUG::2014-05-22 15:31:40,110::resourceManager::685::ResourceManager::(releaseResource) Request 'Request for Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d - shared: granted' was granted
Thread-8687::DEBUG::2014-05-22 15:31:40,281::resourceManager::693::ResourceManager::(releaseResource) This is a shared lock. Granting all shared requests

[mku: now it starts again - a different originator of the lock request, so the queue is cleared properly]
Thread-9147::DEBUG::2014-05-22 15:32:43,284::resourceManager::197::ResourceManager.Request::(__init__) ResName=`Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`5360b79e-345e-4a8a-9ef2-a0980f58054d`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '587' at 'spmStart'
Thread-9147::DEBUG::2014-05-22 15:32:43,512::resourceManager::541::ResourceManager::(registerResource) Trying to register resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' for lock type 'exclusive'
Thread-9147::DEBUG::2014-05-22 15:32:43,763::resourceManager::600::ResourceManager::(registerResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is free. Now locking as 'exclusive' (1 active user)
Thread-9147::DEBUG::2014-05-22 15:32:43,918::resourceManager::237::ResourceManager.Request::(grant) ResName=`Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`5360b79e-345e-4a8a-9ef2-a0980f58054d`::Granted request
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:33:43,329::resourceManager::197::ResourceManager.Request::(__init__) ResName=`Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`d8a384d2-02eb-4333-b470-0026a2a7584c`::Request was made in '/usr/share/vdsm/storage/sp.py' line '432' at '_upgradePool'
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:33:43,472::resourceManager::541::ResourceManager::(registerResource) Trying to register resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d' for lock type 'exclusive'
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:33:43,643::resourceManager::600::ResourceManager::(registerResource) Resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is free. Now locking as 'exclusive' (1 active user)
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:33:43,845::resourceManager::237::ResourceManager.Request::(grant) ResName=`Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`d8a384d2-02eb-4333-b470-0026a2a7584c`::Granted request
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:34:19,426::resourceManager::615::ResourceManager::(releaseResource) Trying to release resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d'
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:34:27,394::resourceManager::634::ResourceManager::(releaseResource) Released resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d' (0 active users)
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:34:27,553::resourceManager::640::ResourceManager::(releaseResource) Resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is free, finding out if anyone is waiting for it.
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:34:27,762::resourceManager::648::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.upgrade_7f7534f7-e5d4-44b3-b34a-4d0743734e7d', Clearing records.
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:35:52,979::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d': < ResourceRef 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d', isValid: 'True' obj: 'None'>}
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:35:53,197::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-9283::DEBUG::2014-05-22 15:35:53,549::resourceManager::939::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:35:53,546::resourceManager::615::ResourceManager::(releaseResource) Trying to release resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d'
Thread-9283::DEBUG::2014-05-22 15:35:53,668::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:35:53,723::resourceManager::634::ResourceManager::(releaseResource) Released resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' (0 active users)
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:35:54,058::resourceManager::640::ResourceManager::(releaseResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is free, finding out if anyone is waiting for it.
83660ad7-b4fa-46f9-b77e-2763ffed8f6f::DEBUG::2014-05-22 15:35:54,327::resourceManager::648::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d', Clearing records.
Thread-9287::DEBUG::2014-05-22 15:36:00,937::resourceManager::976::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-9289::DEBUG::2014-05-22 15:36:03,314::resourceManager::197::ResourceManager.Request::(__init__) ResName=`Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`31ae0a8d-7e1d-4f34-9433-d5fd772eb642`::Request was made in '/usr/share/vdsm/storage/hsm.py' line '2562' at 'getStoragePoolInfo'
Thread-9289::DEBUG::2014-05-22 15:36:03,476::resourceManager::541::ResourceManager::(registerResource) Trying to register resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' for lock type 'shared'
Thread-9289::DEBUG::2014-05-22 15:36:03,787::resourceManager::600::ResourceManager::(registerResource) Resource 'Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d' is free. Now locking as 'shared' (1 active user)
Thread-9289::DEBUG::2014-05-22 15:36:03,982::resourceManager::237::ResourceManager.Request::(grant) ResName=`Storage.7f7534f7-e5d4-44b3-b34a-4d0743734e7d`ReqID=`31ae0a8d-7e1d-4f34-9433-d5fd772eb642`::Granted request
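The queue behavior traced above - timed-out requests staying queued and only being skipped when the holder finally releases the resource - can be modeled with a toy resource class (an illustration of the suspected failure mode, not vdsm's actual ResourceManager):

```python
from collections import deque

class ToyResource(object):
    """Exclusive holder plus a FIFO of waiters. Canceled (timed-out)
    requests are NOT removed from the queue when they cancel; they are
    only skipped when the resource is eventually released, mirroring the
    'was canceled, Ignoring it' cascade in the log above."""

    def __init__(self):
        self.locked = False
        self.queue = deque()

    def acquire(self, req):
        if not self.locked and not self.queue:
            self.locked = True
            req["granted"] = True
        else:
            self.queue.append(req)  # waiter; may later set req["canceled"]

    def release(self):
        self.locked = False
        while self.queue:
            req = self.queue.popleft()
            if req.get("canceled"):
                continue            # drained only now, not at cancel time
            self.locked = True
            req["granted"] = True
            return
```

While the holder never releases, every new request queues behind the stale canceled ones and times out in turn; only the eventual releaseAll drains the backlog, matching the "16 requests in queue" countdown above.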