Bug 719255

Summary: [vdsm] when prepareVolume fails the image lock is not released.
Product: Red Hat Enterprise Linux 6 Reporter: David Naori <dnaori>
Component: vdsmAssignee: Igor Lvovsky <ilvovsky>
Status: CLOSED ERRATA QA Contact: David Naori <dnaori>
Severity: high Docs Contact:
Priority: unspecified    
Version: 6.1CC: abaron, bazulay, danken, dnaori, hateya, iheim, lpeer, mgoldboi, ykaul
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: vdsm-4.9-81.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-06 07:30:26 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
vdsm log none

Description David Naori 2011-07-06 09:16:21 UTC
Created attachment 511460 [details]
vdsm log

Description:

When prepareVolume fails - for example with "VolumeDoesNotExist", the image lock is not released.

when trying to run this vm again it acquires the vmConatinerLock and waiting forever for the image resource that was never released.

result:
deadlock in vmContainer.

Version-Release number of selected component (if applicable):
vdsm-4.9-80.el6.x86_64

How reproducible:

Steps to Reproduce:
1.
2.
3.
  
Actual results:
creation of vms stuck forever in waitForLaunch.

Expected results:
image lock should be released if prepareVolume fails. 

Additional info:
log attached

Comment 1 Igor Lvovsky 2011-07-07 08:00:05 UTC
http://gerrit.usersys.redhat.com/#change,665

Comment 2 Dan Kenigsberg 2011-07-17 11:48:02 UTC
Bug report changed to ON_QA status by me since Errata System refused to do so.
A QE request has been submitted for advisory RHEA-2011:11186-01
http://errata.devel.redhat.com/errata/show/11186

Comment 3 David Naori 2011-07-18 07:43:47 UTC
Thread-2386::DEBUG::2011-07-18 10:27:39,193::resourceManager::508::ResourceManager::(registerResource) Resource 'Storage.10e67a7f-a434-478e-9e83-a724566523ce' is free. Now locking as 'shared' (1 active user)
Thread-2386::DEBUG::2011-07-18 10:27:39,194::resourceManager::191::ResourceManager.Request::(grant) ResName=`Storage.10e67a7f-a434-478e-9e83-a724566523ce`ReqID=`12aee50c-5a15-4b22-ba66-f9f5f091d4e6`::Granted request


VolumeDoesNotExist: Volume does not exist: ('a2040e9f-7b22-443e-a25f-41a39d3cd257',)
Thread-2386::DEBUG::2011-07-18 10:27:39,804::task::492::TaskManager.Task::(_debug) Task 915fdc5d-c480-4143-b723-127ef4533fb4: Task._run: 915fdc5d-c480-4143-b723-127ef4533fb4 ('10e67a7f-a434-478e-9e83-a724566523ce', '6a38970f-5596-4437-ba
8c-1093e1b48afc', '87e8b198-1cbb-43b3-a137-894d9287daa1', 'a2040e9f-7b22-443e-a25f-41a39d3cd257') {} failed - stopping task
Thread-2388::INFO::2011-07-18 10:27:39,806::dispatcher::94::Storage.Dispatcher.Protect::(run) Run and protect: repoStats, args: ()
Thread-2386::DEBUG::2011-07-18 10:27:39,806::task::492::TaskManager.Task::(_debug) Task 915fdc5d-c480-4143-b723-127ef4533fb4: stopping in state preparing (force False)
Thread-2388::DEBUG::2011-07-18 10:27:39,807::task::492::TaskManager.Task::(_debug) Task b7ae8125-5b9b-4318-a1e5-b7bdac7f1eb7: moving from state init -> state preparing
Thread-2386::DEBUG::2011-07-18 10:27:39,807::task::492::TaskManager.Task::(_debug) Task 915fdc5d-c480-4143-b723-127ef4533fb4: ref 1 aborting True
Thread-2388::DEBUG::2011-07-18 10:27:39,808::task::492::TaskManager.Task::(_debug) Task b7ae8125-5b9b-4318-a1e5-b7bdac7f1eb7: finished: {u'ea48f4b2-7cee-4669-94af-33131b0eaab9': {'delay': '0.00293016433716', 'lastCheck': 1310974059.34768
61, 'valid': True, 'code': 0}, u'7fea1def-0290-4279-9f52-341ff3c4a508': {'delay': '0.00362586975098', 'lastCheck': 1310974053.2883849, 'valid': True, 'code': 0}, u'10e67a7f-a434-478e-9e83-a724566523ce': {'delay': '0.00401496887207', 'las
tCheck': 1310974050.259624, 'valid': True, 'code': 0}}
Thread-2386::INFO::2011-07-18 10:27:39,808::task::1177::TaskManager.Task::(prepare) aborting: Task is aborted: 'Volume does not exist' - code 201
Thread-2388::DEBUG::2011-07-18 10:27:39,809::task::492::TaskManager.Task::(_debug) Task b7ae8125-5b9b-4318-a1e5-b7bdac7f1eb7: moving from state preparing -> state finished
Thread-2386::DEBUG::2011-07-18 10:27:39,809::task::492::TaskManager.Task::(_debug) Task 915fdc5d-c480-4143-b723-127ef4533fb4: Prepare: aborted: Volume does not exist
Thread-2388::DEBUG::2011-07-18 10:27:39,809::resourceManager::786::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {}
Thread-2386::DEBUG::2011-07-18 10:27:39,810::task::492::TaskManager.Task::(_debug) Task 915fdc5d-c480-4143-b723-127ef4533fb4: ref 0 aborting True
Thread-2388::DEBUG::2011-07-18 10:27:39,810::resourceManager::821::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-2386::DEBUG::2011-07-18 10:27:39,811::task::492::TaskManager.Task::(_debug) Task 915fdc5d-c480-4143-b723-127ef4533fb4: Task._doAbort: force False
Thread-2388::DEBUG::2011-07-18 10:27:39,811::task::492::TaskManager.Task::(_debug) Task b7ae8125-5b9b-4318-a1e5-b7bdac7f1eb7: ref 0 aborting False
Thread-2386::DEBUG::2011-07-18 10:27:39,812::resourceManager::821::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-2388::INFO::2011-07-18 10:27:39,812::dispatcher::100::Storage.Dispatcher.Protect::(run) Run and protect: repoStats, Return response: {'status': {'message': 'OK', 'code': 0}, u'ea48f4b2-7cee-4669-94af-33131b0eaab9': {'delay': '0.00
293016433716', 'lastCheck': 1310974059.3476861, 'valid': True, 'code': 0}, u'7fea1def-0290-4279-9f52-341ff3c4a508': {'delay': '0.00362586975098', 'lastCheck': 1310974053.2883849, 'valid': True, 'code': 0}, u'10e67a7f-a434-478e-9e83-a7245
66523ce': {'delay': '0.00401496887207', 'lastCheck': 1310974050.259624, 'valid': True, 'code': 0}}
Thread-2386::DEBUG::2011-07-18 10:27:39,812::task::492::TaskManager.Task::(_debug) Task 915fdc5d-c480-4143-b723-127ef4533fb4: moving from state preparing -> state aborting
Thread-2386::DEBUG::2011-07-18 10:27:39,813::task::492::TaskManager.Task::(_debug) Task 915fdc5d-c480-4143-b723-127ef4533fb4: _aborting: recover policy none
Thread-2386::DEBUG::2011-07-18 10:27:39,814::task::492::TaskManager.Task::(_debug) Task 915fdc5d-c480-4143-b723-127ef4533fb4: moving from state aborting -> state failed
Thread-2386::DEBUG::2011-07-18 10:27:39,817::resourceManager::786::ResourceManager.Owner::(releaseAll) Owner.releaseAll requests {} resources {'Storage.10e67a7f-a434-478e-9e83-a724566523ce': < ResourceRef 'Storage.10e67a7f-a434-478e-9e83-a724566523ce', isValid: 'True' obj: 'None'>}
Thread-2386::DEBUG::2011-07-18 10:27:39,817::resourceManager::821::ResourceManager.Owner::(cancelAll) Owner.cancelAll requests {}
Thread-2386::DEBUG::2011-07-18 10:27:39,824::resourceManager::517::ResourceManager::(releaseResource) Trying to release resource 'Storage.10e67a7f-a434-478e-9e83-a724566523ce'
Thread-2386::DEBUG::2011-07-18 10:27:39,824::resourceManager::532::ResourceManager::(releaseResource) Released resource 'Storage.10e67a7f-a434-478e-9e83-a724566523ce' (0 active users)
Thread-2386::DEBUG::2011-07-18 10:27:39,825::resourceManager::537::ResourceManager::(releaseResource) Resource 'Storage.10e67a7f-a434-478e-9e83-a724566523ce' is free, finding out if anyone is waiting for it.
Thread-2386::DEBUG::2011-07-18 10:27:39,825::resourceManager::544::ResourceManager::(releaseResource) No one is waiting for resource 'Storage.10e67a7f-a434-478e-9e83-a724566523ce', Clearing records.

Resource released Verified.
vdsm-4.9-82.el6

Comment 4 errata-xmlrpc 2011-12-06 07:30:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2011-1782.html