Bug 689444

Summary: VDSM: when snapshot chain is broken, exporting VM fails with unknow error when it should fail with error MergeSnapshotsError
Product: Red Hat Enterprise Linux 5 Reporter: Dafna Ron <dron>
Component: vdsm22Assignee: Dan Kenigsberg <dkenigsb>
Status: CLOSED DEFERRED QA Contact: yeylon <yeylon>
Severity: low Docs Contact:
Priority: low    
Version: 5.6CC: abaron, bazulay, danken, iheim, lpeer, srevivo, ykaul
Target Milestone: rc   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-06-20 09:00:17 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Attachments:
Description Flags
vdsm and rhevm logs none

Description Dafna Ron 2011-03-21 14:41:19 UTC
Created attachment 486621 [details]
vdsm and rhevm logs

Description of problem:

when snapshot chain is broken. export VM fails on GetAllTaskStatus phase of MergeSnapshotSingleDiskCommand, we get an unknow error in vds -> since this is sent as unknow, the backend shows the same which means that the user will not be able to know why export VM failed. 
We need to add error to this task (moveVM) in vds maybe MoveImageError or MergeSnapshotsError once this task fails. 


Version-Release number of selected component (if applicable):

ic105
vdsm22-cli-4.5-63.23.el5_6
vdsm22-4.5-63.23.el5_6
vdsm22-debuginfo-4.5-63.21.el5_6
kvm-qemu-img-83-224.el5

How reproducible:
100%

Steps to Reproduce:
1. break snapshot chain
2. try to export VM
3.
  
Actual results:

task fail:

Thread-39671::DEBUG::2011-03-21 11:13:52,797::taskManager::81::irs::(TaskManager:getTaskStatus) Return. Response: {'code': 754, 'message': "jobs number 0: moveImage: <bound method Secure.run of <storage.spm.Secure instance at 0x2aca733f2440>> (args: ('dd533a66-99a2-4a7b-8e56-5c0de68cfe45', '898ab28c-519a-4675-b1ec-5d6e9fad2940', '3d0b1c93-54d1-426e-bf80-8bcd09f27be1', '09c00681-fa8a-426b-a383-0120080323df', '34408258-7e07-4bc2-997f-c09445c576db', 1, False, False) kwargs: {}) failed: Unknown Error", 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'c1024283-4a9b-4f1e-b547-8c2151cae4b4'}


Expected results:

we should have a  MergeSnapshotsError sent to backend

Additional info:log

Thread-39671::DEBUG::2011-03-21 11:13:52,797::taskManager::81::irs::(TaskManager:getTaskStatus) Return. Response: {'code': 754, 'message': "jobs number 0: moveImage: <bound meth
od Secure.run of <storage.spm.Secure instance at 0x2aca733f2440>> (args: ('dd533a66-99a2-4a7b-8e56-5c0de68cfe45', '898ab28c-519a-4675-b1ec-5d6e9fad2940', '3d0b1c93-54d1-426e-bf8
0-8bcd09f27be1', '09c00681-fa8a-426b-a383-0120080323df', '34408258-7e07-4bc2-997f-c09445c576db', 1, False, False) kwargs: {}) failed: Unknown Error", 'taskState': 'finished', 't
askResult': 'cleanSuccess', 'taskID': 'c1024283-4a9b-4f1e-b547-8c2151cae4b4'}
Thread-39671::DEBUG::2011-03-21 11:13:52,797::taskManager::93::irs::(TaskManager:getAllTasksStatuses) Return: {'c1024283-4a9b-4f1e-b547-8c2151cae4b4': {'code': 754, 'message': "
jobs number 0: moveImage: <bound method Secure.run of <storage.spm.Secure instance at 0x2aca733f2440>> (args: ('dd533a66-99a2-4a7b-8e56-5c0de68cfe45', '898ab28c-519a-4675-b1ec-5
d6e9fad2940', '3d0b1c93-54d1-426e-bf80-8bcd09f27be1', '09c00681-fa8a-426b-a383-0120080323df', '34408258-7e07-4bc2-997f-c09445c576db', 1, False, False) kwargs: {}) failed: Unknow
n Error", 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'c1024283-4a9b-4f1e-b547-8c2151cae4b4'}}
Thread-39671::DEBUG::2011-03-21 11:13:52,798::task::577::irs::Task f2a740b0-e8dc-4c54-ac18-172903d0d4ff: finished: {'allTasksStatus': {'c1024283-4a9b-4f1e-b547-8c2151cae4b4': {'
code': 754, 'message': "jobs number 0: moveImage: <bound method Secure.run of <storage.spm.Secure instance at 0x2aca733f2440>> (args: ('dd533a66-99a2-4a7b-8e56-5c0de68cfe45', '8
98ab28c-519a-4675-b1ec-5d6e9fad2940', '3d0b1c93-54d1-426e-bf80-8bcd09f27be1', '09c00681-fa8a-426b-a383-0120080323df', '34408258-7e07-4bc2-997f-c09445c576db', 1, False, False) kw
args: {}) failed: Unknown Error", 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'c1024283-4a9b-4f1e-b547-8c2151cae4b4'}}}
Thread-39671::DEBUG::2011-03-21 11:13:52,798::task::577::irs::Task f2a740b0-e8dc-4c54-ac18-172903d0d4ff: moving from state preparing -> state finished
Thread-39671::DEBUG::2011-03-21 11:13:52,798::resource::670::irs::Owner.releaseAll requests [] resources []
Thread-39671::DEBUG::2011-03-21 11:13:52,798::task::577::irs::Task f2a740b0-e8dc-4c54-ac18-172903d0d4ff: ref 0 aborting False
Thread-39671::INFO::2011-03-21 11:13:52,799::dispatcher::101::irs::Run and protect: getAllTasksStatuses, Return response: {'status': {'message': 'OK', 'code': 0}, 'allTasksStatu
s': {'c1024283-4a9b-4f1e-b547-8c2151cae4b4': {'code': 754, 'message': "jobs number 0: moveImage: <bound method Secure.run of <storage.spm.Secure instance at 0x2aca733f2440>> (ar
gs: ('dd533a66-99a2-4a7b-8e56-5c0de68cfe45', '898ab28c-519a-4675-b1ec-5d6e9fad2940', '3d0b1c93-54d1-426e-bf80-8bcd09f27be1', '09c00681-fa8a-426b-a383-0120080323df', '34408258-7e
07-4bc2-997f-c09445c576db', 1, False, False) kwargs: {}) failed: Unknown Error", 'taskState': 'finished', 'taskResult': 'cleanSuccess', 'taskID': 'c1024283-4a9b-4f1e-b547-8c2151
cae4b4'}}}


backend: 


2011-03-20 16:31:47,102 ERROR [com.redhat.rhevm.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-34) Failed in HSMGetAllTasksStatusesVDS method
2011-03-20 16:31:47,102 ERROR [com.redhat.rhevm.vdsbroker.vdsbroker.BrokerCommandBase] (QuartzScheduler_Worker-34) Error code LogicalVolumeExtendError and error message VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = jobs number 0: mergeSnapshots: <bound method Secure.run of <storage.spm.Secure instance at 0x2ae383959368>> (args: ('898ab28c-519a-4675-b1ec-5d6e9fad2940', 'dd533a66-99a2-4a7b-8e56-5c0de68cfe45', '34408258-7e07-4bc2-997f-c09445c576db', '09c00681-fa8a-426b-a383-0120080323df', '8873b46e-bd6e-44e5-bc60-0164dbe02868', 'c0e21f94-4b5d-4315-b23e-7c9ce5968536', False) kwargs: {}) failed: Unknown Error

Comment 1 Dafna Ron 2011-03-28 16:39:46 UTC
bug opened and closed for backend: 689446
once error is added in vdsm this should be checked in backend as well.

Comment 2 Dan Kenigsberg 2011-03-30 22:45:50 UTC
Do we really care for this in 2.2.Z? What's the behavior in 2.3?

Comment 3 Dan Kenigsberg 2011-04-10 07:35:47 UTC
Please test the behavior on 2.3.

Comment 7 Dafna Ron 2011-05-09 08:43:30 UTC
I was able to export VM with broken chain with and without collapsing snapshots in RHEL6. 

vdsm-cli-4.9-64.el6.x86_64
vdsm-debug-plugin-4.9-64.el6.x86_64
vdsm-debuginfo-4.9-64.el6.x86_64
vdsm-4.9-64.el6.x86_64
libvirt-client-0.8.7-18.el6.x86_64
libvirt-python-0.8.7-18.el6.x86_64
libvirt-0.8.7-18.el6.x86_64

Comment 8 Dan Kenigsberg 2011-05-09 11:42:31 UTC
Dafna, please elaborate how you've broken the chain, and wether the exported VM was usable. It is very surprising that an export succeed when an element of the chain is missing.

Comment 9 Dafna Ron 2011-05-09 12:57:18 UTC
while deleting one of the middle snapshots I restarted vdsm. 
operation should fail, the snapshot is still in the rhevm, but if you try to delete it again you will get an error with the below code:

09462d90-35f2-4c7a-b447-1101347f65c3::DEBUG::2011-05-09 15:52:07,447::task::905::TaskManager.Task::(_runJobs) aborting: Task is aborted: 'Image is not a legal chain' - code 262


Complete error:

09462d90-35f2-4c7a-b447-1101347f65c3::ERROR::2011-05-09 15:52:07,444::task::855::TaskManager.Task::(_setError) Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 863, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 300, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/spm.py", line 115, in run
    return self.func(*args, **kwargs)
  File "/usr/share/vdsm/storage/spm.py", line 911, in mergeSnapshots
    image.Image(repoPath).merge(sdUUID, vmUUID, imgUUID, ancestor, successor, postZero)
  File "/usr/share/vdsm/storage/image.py", line 1061, in merge
    chain = self.getSubChain(sdUUID, imgUUID, ancestor, successor)
  File "/usr/share/vdsm/storage/image.py", line 793, in getSubChain
    raise se.ImageIsNotLegalChain("%s:%s..%s" % (imgUUID, startUUID, endUUID))
ImageIsNotLegalChain: Image is not a legal chain: ('1bf5e87d-3e52-4a15-add2-e83c25a94df9:8faa7c7d-ef68-46a0-9120-02f28ebf9f75..103e6906-8afa-4fc6-950e-64bbe201ff27',)
09462d90-35f2-4c7a-b447-1101347f65c3::DEBUG::2011-05-09 15:52:07,445::task::492::TaskManager.Task::(_debug) Task 09462d90-35f2-4c7a-b447-1101347f65c3: Task._run: 09462d90-35f2-4
c7a-b447-1101347f65c3 () {} failed - stopping task
09462d90-35f2-4c7a-b447-1101347f65c3::DEBUG::2011-05-09 15:52:07,446::task::492::TaskManager.Task::(_debug) Task 09462d90-35f2-4c7a-b447-1101347f65c3: stopping in state running 
(force False)
09462d90-35f2-4c7a-b447-1101347f65c3::DEBUG::2011-05-09 15:52:07,447::task::492::TaskManager.Task::(_debug) Task 09462d90-35f2-4c7a-b447-1101347f65c3: ref 1 aborting True
09462d90-35f2-4c7a-b447-1101347f65c3::DEBUG::2011-05-09 15:52:07,447::task::905::TaskManager.Task::(_runJobs) aborting: Task is aborted: 'Image is not a legal chain' - code 262