Bug 1431613 - CopyImageError when cloning vm from snapshot on iscsi
Summary: CopyImageError when cloning vm from snapshot on iscsi
Keywords:
Status: CLOSED DUPLICATE of bug 1435967
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Storage
Version: 4.1.1.4
Hardware: x86_64
OS: Unspecified
unspecified
high
Target Milestone: ovirt-4.1.2
: ---
Assignee: Fred Rolland
QA Contact: Lilach Zitnitski
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2017-03-13 12:29 UTC by Lilach Zitnitski
Modified: 2017-03-28 08:50 UTC (History)
3 users (show)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-03-28 08:50:17 UTC
oVirt Team: Storage
Embargoed:
rule-engine: ovirt-4.1+
rule-engine: blocker+


Attachments (Terms of Use)
CREATE request (52.58 KB, application/rtf)
2017-03-13 12:30 UTC, Lilach Zitnitski
no flags Details
logs (784.60 KB, application/zip)
2017-03-13 12:30 UTC, Lilach Zitnitski
no flags Details
logs 2 (279.92 KB, application/zip)
2017-03-15 05:58 UTC, Lilach Zitnitski
no flags Details

Description Lilach Zitnitski 2017-03-13 12:29:15 UTC
Description of problem:
Cloning vm from snapshot on iscsi disks fails with CopyImageError and low level image copy errors. 
When running the same tests with snapshot on nfs disks there are no failures, and when trying to clone vm from snapshot manually there are no failures as well. 
The problem occurs when running this test in our automation tests with Create vm request in the REST API (the request body will be attached in a file) 

Version-Release number of selected component (if applicable):
rhevm-4.1.1.4-0.1.el7.noarch
vdsm-4.19.7-1.el7ev.x86_64

How reproducible:
100%

Steps to Reproduce:
1. create vm with disks from iscsi storage domain
2. create snapshot with all vm's disks
3. run from the API the create_vm request

Actual results:
clone vm is failing while trying to create it

Expected results:
the cloned vm should be created

Additional info:

engine.log

2017-03-13 11:15:05,336+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (DefaultQuartzScheduler2) [5b35d98b] EVENT_ID: VDS_BROKER_COMMAND_FAILURE(10,802), Correlation ID: null, Call Stack: null, Custom Event ID: -1, Message: VDSM host_mixed_3 command HSMGetAllTasksStatusesVDS failed: low level Image copy failed
2017-03-13 11:15:05,340+02 ERROR [org.ovirt.engine.core.bll.tasks.SPMAsyncTask] (DefaultQuartzScheduler2) [5b35d98b] BaseAsyncTask::logEndTaskFailure: Task '1ab3d889-66dd-486a-9ffb-d74d5afb1370' (Parent Command 'AddVmFromSnapshot', Parameters Type 'org.ovirt.engine.core.common.asynctasks.AsyncTaskParameters') ended with failure:
-- Result: 'cleanSuccess'
-- Message: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = low level Image copy failed, code = 261',
-- Exception: 'VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = low level Image copy failed, code = 261'

vdsm.log

2017-03-13 11:13:47,189+0200 ERROR (tasks/9) [storage.Image] conversion failure for volume 76a3ba38-6893-4423-9806-1de99ecea882 (image:881)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 876, in copyCollapsed
    self._wait_for_qemuimg_operation(operation)
  File "/usr/share/vdsm/storage/image.py", line 137, in _wait_for_qemuimg_operation
    operation.wait_for_completion()
  File "/usr/lib/python2.7/site-packages/vdsm/qemuimg.py", line 329, in wait_for_completion
    self.poll(timeout)
  File "/usr/lib/python2.7/site-packages/vdsm/qemuimg.py", line 324, in poll
    self.error)
QImgError: cmd=['/usr/bin/taskset', '--cpu-list', '0-0', '/usr/bin/nice', '-n', '19', '/usr/bin/ionice', '-c', '3', '/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'raw', '/rhev/data-center/4f3d521b-4b6c-403a-ac0f-976ec9d952ad/6f458208-173e-4906-85ca-7d14883c8b78/images/adb30f73-49b9-413e-926d-f61aa36b9d5d/76a3ba38-6893-4423-9806-1de99ecea882', '-O', 'qcow2', '-o', 'compat=1.1', '/rhev/data-center/mnt/blockSD/6f458208-173e-4906-85ca-7d14883c8b78/images/af35370b-edc4-4b8a-9d63-28710fc604aa/1f4b766f-d54a-435c-983c-3afec4587836'], ecode=1, stdout=, stderr=qemu-img: error while writing sector 2093056: No space left on device
, message=None
2017-03-13 11:13:47,190+0200 ERROR (tasks/9) [storage.Image] Unexpected error (image:894)
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/image.py", line 882, in copyCollapsed
    raise se.CopyImageError(str(e))
CopyImageError: low level Image copy failed: ("cmd=['/usr/bin/taskset', '--cpu-list', '0-0', '/usr/bin/nice', '-n', '19', '/usr/bin/ionice', '-c', '3', '/usr/bin/qemu-img', 'convert', '-p', '-t', 'none', '-T', 'none', '-f', 'raw', '/rhev/data-center/4f3d521b-4b6c-403a-ac0f-976ec9d952ad/6f458208-173e-4906-85ca-7d14883c8b78/images/adb30f73-49b9-413e-926d-f61aa36b9d5d/76a3ba38-6893-4423-9806-1de99ecea882', '-O', 'qcow2', '-o', 'compat=1.1', '/rhev/data-center/mnt/blockSD/6f458208-173e-4906-85ca-7d14883c8b78/images/af35370b-edc4-4b8a-9d63-28710fc604aa/1f4b766f-d54a-435c-983c-3afec4587836'], ecode=1, stdout=, stderr=qemu-img: error while writing sector 2093056: No space left on device\n, message=None",)

Comment 1 Lilach Zitnitski 2017-03-13 12:30:03 UTC
Created attachment 1262429 [details]
CREATE request

Comment 2 Lilach Zitnitski 2017-03-13 12:30:32 UTC
Created attachment 1262430 [details]
logs

engine and vdsm

Comment 3 Tal Nisan 2017-03-13 14:05:27 UTC
Just to be clear, regular clone from snapshot on iSCSI works as it should but only the automation is failing?

Comment 4 Lilach Zitnitski 2017-03-13 14:08:45 UTC
(In reply to Tal Nisan from comment #3)
> Just to be clear, regular clone from snapshot on iSCSI works as it should
> but only the automation is failing?

Yes, that's why I added the REST request body because it can be reproduced only with that.

Comment 5 Yaniv Kaul 2017-03-14 09:08:10 UTC
(In reply to Lilach Zitnitski from comment #4)
> (In reply to Tal Nisan from comment #3)
> > Just to be clear, regular clone from snapshot on iSCSI works as it should
> > but only the automation is failing?
> 
> Yes, that's why I added the REST request body because it can be reproduced
> only with that.

"No space left on device" seems to hint of the issue. Please check.

Comment 6 Lilach Zitnitski 2017-03-15 05:54:50 UTC
I don't think that's the problem because when I'm trying to clone vm from the same snapshot and from the same vm and disks, manually, it's working. 
Also, when running the automated test I can see both srcVol and dstVol are 1G and the destination SD has 100G free.

Comment 7 Lilach Zitnitski 2017-03-15 05:58:45 UTC
Created attachment 1263180 [details]
logs 2

vdsm and engine logs with successful CopyImage

Comment 8 Raz Tamir 2017-03-15 07:18:21 UTC
I'm Adding the Regression keyword as this issue already happened to us in the past and fixed - https://bugzilla.redhat.com/show_bug.cgi?id=1201268

Comment 9 Red Hat Bugzilla Rules Engine 2017-03-15 07:18:28 UTC
This bug report has Keywords: Regression or TestBlocker.
Since no regressions or test blockers are allowed between releases, it is also being identified as a blocker for this release. Please resolve ASAP.

Comment 10 Fred Rolland 2017-03-28 08:50:17 UTC

*** This bug has been marked as a duplicate of bug 1435967 ***


Note You need to log in before you can comment on or make changes to this bug.