Bug 726952

Summary: [Vdsm] race: trying to create snapshot after first creation failed because of vdsm restart will fail with error "There is no leaf in the image"
Product: Red Hat Enterprise Linux 6 Reporter: Dafna Ron <dron>
Component: vdsm Assignee: Saggi Mizrahi <smizrahi>
Status: CLOSED ERRATA QA Contact: Dafna Ron <dron>
Severity: high Docs Contact:
Priority: medium    
Version: 6.2 CC: abaron, bazulay, iheim, tdosek, ykaul
Target Milestone: rc Keywords: Regression
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard: storage
Fixed In Version: vdsm-4.9-92 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-06 07:36:08 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
logs none

Description Dafna Ron 2011-07-31 12:02:18 UTC
Created attachment 516006 [details]
logs

Description of problem:

After restarting vdsm during snapshot creation, which caused the task to fail, I tried to create a new snapshot on the same VM.
The operation failed with the error:

'Could not acquire resource. Probably resource factory threw an exception.: ()' - code 100


Version-Release number of selected component (if applicable):

ic135


How reproducible:

25%

Steps to Reproduce:
1. Create a VM, then create and delete a snapshot to make sure the VM is fine.
2. Create a snapshot and restart vdsm while it is being created (I restarted vdsm twice).
3. Try to create a second snapshot.
  
Actual results:

The second snapshot fails with the error quoted in the description.

Expected results:

Snapshot creation should not fail.

Additional info: full log attached. Marked as a regression because this did not occur in the past tests that I ran with vdsm22.

f6a96ca6-3b65-400c-998c-d33a637d20d6::ERROR::2011-07-31 14:27:20,734::image::330::Storage.Image::(getChain) There is no leaf in the image c4fa2ceb-7b9e-41c1-9954-c2214b846979
f6a96ca6-3b65-400c-998c-d33a637d20d6::WARNING::2011-07-31 14:27:20,735::resourceManager::500::ResourceManager::(registerResource) Resource factory failed to create resource '859ce915-2b37-4a9b-b53d-2429acb36b2d_imageNS.c4fa2ceb-7b9e-41c1-9954-c2214b846979'. Canceling request.
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/resourceManager.py", line 498, in registerResource
    obj = namespaceObj.factory.createResource(name, lockType)
  File "/usr/share/vdsm/storage/resourceFactories.py", line 163, in createResource
    volResourcesList = self.__getResourceCandidatesList(resourceName, lockType)
  File "/usr/share/vdsm/storage/resourceFactories.py", line 111, in __getResourceCandidatesList
    chain = image.Image(repoPath).getChain(sdUUID=self.sdUUID, imgUUID=resourceName)
  File "/usr/share/vdsm/storage/image.py", line 331, in getChain
    raise se.ImageIsNotLegalChain(imgUUID)
ImageIsNotLegalChain: Image is not a legal chain: ('c4fa2ceb-7b9e-41c1-9954-c2214b846979',)
f6a96ca6-3b65-400c-998c-d33a637d20d6::DEBUG::2011-07-31 14:27:20,737::resourceManager::165::ResourceManager.Request::(cancel) ResName=`859ce915-2b37-4a9b-b53d-2429acb36b2d_imageNS.c4fa2ceb-7b9e-41c1-9954-c2214b846979`ReqID=`c6fed331-12f3-49a2-b501-86afdd48d5fb`::Canceled request
f6a96ca6-3b65-400c-998c-d33a637d20d6::WARNING::2011-07-31 14:27:20,737::resourceManager::159::ResourceManager.Request::(cancel) ResName=`859ce915-2b37-4a9b-b53d-2429acb36b2d_imageNS.c4fa2ceb-7b9e-41c1-9954-c2214b846979`ReqID=`c6fed331-12f3-49a2-b501-86afdd48d5fb`::Tried to cancel a processed request
f6a96ca6-3b65-400c-998c-d33a637d20d6::ERROR::2011-07-31 14:27:20,738::task::865::TaskManager.Task::(_setError) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 300, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/spm.py", line 115, in run
    return self.func(*args, **kwargs)
  File "/usr/share/vdsm/storage/spm.py", line 993, in createVolume
    with rmanager.acquireResource(imageResourcesNamespace, imgUUID, rm.LockType.exclusive):
  File "/usr/share/vdsm/storage/resourceManager.py", line 445, in acquireResource
    raise se.ResourceAcqusitionFailed()
ResourceAcqusitionFailed: Could not acquire resource. Probably resource factory threw an exception.: ()
f6a96ca6-3b65-400c-998c-d33a637d20d6::DEBUG::2011-07-31 14:27:20,739::task::492::TaskManager.Task::(_debug) Task f6a96ca6-3b65-400c-998c-d33a637d20d6: Task._run: f6a96ca6-3b65-400c-998c-d33a637d20d6 () {} failed - stopping task
f6a96ca6-3b65-400c-998c-d33a637d20d6::DEBUG::2011-07-31 14:27:20,739::task::492::TaskManager.Task::(_debug) Task f6a96ca6-3b65-400c-998c-d33a637d20d6: stopping in state running (force False)
f6a96ca6-3b65-400c-998c-d33a637d20d6::DEBUG::2011-07-31 14:27:20,740::task::492::TaskManager.Task::(_debug) Task f6a96ca6-3b65-400c-998c-d33a637d20d6: ref 1 aborting True
f6a96ca6-3b65-400c-998c-d33a637d20d6::DEBUG::2011-07-31 14:27:20,740::task::915::TaskManager.Task::(_runJobs) aborting: Task is aborted: 'Could not acquire resource. Probably resource factory threw an exception.: ()' - code 100
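
To make the failure path in the log above easier to follow, here is a rough sketch of how a chain with no LEAF volume could turn into the generic "Could not acquire resource" error. This is not the vdsm code: the function names, the dict layout for volumes, and the exception classes are all made up for illustration, and only the order of events (no leaf found, chain rejected, resource acquisition fails) is taken from the traceback.

```python
BLANK_UUID = "00000000-0000-0000-0000-000000000000"


class ImageIsNotLegalChain(Exception):
    """Hypothetical stand-in for the storage error seen in the log."""


class ResourceAcquisitionFailed(Exception):
    """Hypothetical stand-in for the error returned to the caller."""


def get_chain(volumes, img_uuid):
    """Return volume UUIDs from base to leaf, or raise if there is no leaf.

    `volumes` is an assumed structure: a list of dicts with the keys
    'uuid', 'parent' and 'voltype'.
    """
    leaves = [v for v in volumes if v["voltype"] == "LEAF"]
    if not leaves:
        # Corresponds to "There is no leaf in the image ..." in the log.
        raise ImageIsNotLegalChain(img_uuid)

    by_uuid = {v["uuid"]: v for v in volumes}
    chain, seen = [], set()
    current = leaves[0]
    # Walk parent links from the leaf back to the base volume.
    while current is not None and current["uuid"] not in seen:
        seen.add(current["uuid"])
        chain.insert(0, current["uuid"])
        current = by_uuid.get(current["parent"])
    return chain


def acquire_image_resource(volumes, img_uuid):
    """Build the chain; any chain error surfaces as an acquisition failure."""
    try:
        return get_chain(volumes, img_uuid)
    except ImageIsNotLegalChain as exc:
        raise ResourceAcquisitionFailed(img_uuid) from exc


healthy = [
    {"uuid": "base", "parent": BLANK_UUID, "voltype": "INTERNAL"},
    {"uuid": "top", "parent": "base", "voltype": "LEAF"},
]
# State assumed to be left behind by the interrupted snapshot: no LEAF at all.
broken = [
    {"uuid": "base", "parent": BLANK_UUID, "voltype": "INTERNAL"},
    {"uuid": "top", "parent": "base", "voltype": "INTERNAL"},
]

print(acquire_image_resource(healthy, "img"))
try:
    acquire_image_resource(broken, "img")
except ResourceAcquisitionFailed as exc:
    print("acquire failed:", repr(exc.__cause__))
```

In this sketch the caller only sees the acquisition failure, which matches how the report shows code 100 to the user while the "no leaf" message appears only in the vdsm log.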

This is the metadata output for the image:

DOMAIN=859ce915-2b37-4a9b-b53d-2429acb36b2d
VOLTYPE=INTERNAL
CTIME=1312109436
FORMAT=COW
IMAGE=02dcc951-af80-4c4b-bbb1-9c4a29ca2671
DISKTYPE=1
PUUID=00000000-0000-0000-0000-000000000000
LEGALITY=LEGAL
MTIME=1312109545
POOL_UUID=
SIZE=2097152
TYPE=SPARSE
DESCRIPTION=_ActiveImage_VM_Sun Jul 31 13:53:41 IDT 2011

DOMAIN=859ce915-2b37-4a9b-b53d-2429acb36b2d
VOLTYPE=LEAF
CTIME=1312112432
FORMAT=COW
IMAGE=f4268ca8-05bb-4661-85e8-f3bf8e76f08d
DISKTYPE=1
PUUID=bdc1c2e5-7d36-44cb-82ef-ba76f61de464
LEGALITY=LEGAL
MTIME=1312112432
POOL_UUID=
DESCRIPTION=_ActiveImage_XP-Snap2_Sun Jul 31 14:43:36 IDT 2011
TYPE=SPARSE
SIZE=31457280
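
For anyone triaging similar dumps, a small sketch of how the KEY=VALUE metadata above could be split into per-volume records and checked for a LEAF entry. It assumes a new record starts whenever a key repeats (here, the second DOMAIN= line); that rule and the helper names are assumptions for this example, not part of vdsm. The sample text is an abbreviated copy of the metadata pasted above.

```python
def parse_metadata_records(text):
    """Split concatenated KEY=VALUE metadata into one dict per volume.

    Assumption: a key seen twice means a new volume record has started.
    """
    records = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)  # values may be empty
        if key in current:
            records.append(current)
            current = {}
        current[key] = value
    if current:
        records.append(current)
    return records


def leaf_images(records):
    """Return the IMAGE value of every record marked VOLTYPE=LEAF."""
    return [r.get("IMAGE") for r in records if r.get("VOLTYPE") == "LEAF"]


SAMPLE = """\
DOMAIN=859ce915-2b37-4a9b-b53d-2429acb36b2d
VOLTYPE=INTERNAL
IMAGE=02dcc951-af80-4c4b-bbb1-9c4a29ca2671
PUUID=00000000-0000-0000-0000-000000000000
LEGALITY=LEGAL
POOL_UUID=
DESCRIPTION=_ActiveImage_VM_Sun Jul 31 13:53:41 IDT 2011
DOMAIN=859ce915-2b37-4a9b-b53d-2429acb36b2d
VOLTYPE=LEAF
IMAGE=f4268ca8-05bb-4661-85e8-f3bf8e76f08d
PUUID=bdc1c2e5-7d36-44cb-82ef-ba76f61de464
LEGALITY=LEGAL
POOL_UUID=
DESCRIPTION=_ActiveImage_XP-Snap2_Sun Jul 31 14:43:36 IDT 2011
"""

records = parse_metadata_records(SAMPLE)
for record in records:
    print(record["IMAGE"], record["VOLTYPE"], record["PUUID"])
print("records marked LEAF:", leaf_images(records))
```

An empty result from `leaf_images` for the volumes of a single image would correspond to the "There is no leaf in the image" condition reported in the log.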

Comment 3 Saggi Mizrahi 2011-08-07 11:42:23 UTC
http://gerrit.usersys.redhat.com/794

Comment 8 Tomas Dosek 2011-08-25 09:34:48 UTC
Verified - vdsm-4.9-95 - scenario from https://tcms.engineering.redhat.com/case/49772/?from_plan=2130 no longer fails and snapshot is properly created.

Comment 9 errata-xmlrpc 2011-12-06 07:36:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHEA-2011-1782.html