Bug 1209034 - [vdsm] Template creation on XtremeIO with pre-allocated disks on block storage fails with "CopyImageError: low level Image copy failed"
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.1
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-3.6.0-rc
Target Release: 3.6.0
Assignee: Fred Rolland
QA Contact: Elad
URL:
Whiteboard:
Keywords: ZStream
Duplicates: 1218165
Depends On: 1203543 1215744
Blocks: 1221192
Reported: 2015-04-05 12:30 UTC by Elad
Modified: 2016-03-09 19:36 UTC (History)
Clone Of:
Clones: 1221192
Last Closed: 2016-03-09 19:36:14 UTC


Attachments
logs from engine and vdsm and db dump (8.64 MB, application/x-gzip)
2015-04-05 12:30 UTC, Elad
/rhev/data-center tree and lvs outputs (4.05 KB, application/x-gzip)
2015-04-05 12:31 UTC, Elad
new logs (from host, engine, db dump, lvs and /rhev/data-center/ tree) (1.35 MB, application/x-gzip)
2015-04-05 12:51 UTC, Elad


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2016:0362 | normal | SHIPPED_LIVE | vdsm 3.6.0 bug fix and enhancement update | 2016-03-09 23:49:32 UTC
oVirt gerrit | 40671 | master | MERGED | spec: updated qemu-* requirements on EL | Never
oVirt gerrit | 40877 | ovirt-3.5 | MERGED | spec: updated qemu-* requirements on EL | Never

Description Elad 2015-04-05 12:30:03 UTC
Created attachment 1011119 [details]
logs from engine and vdsm and db dump

Description of problem:
Template creation on a block domain (iSCSI and FC) fails with the following error in vdsm:

166d5870-ef4a-43f2-8576-08c4c3eb6015::ERROR::2015-03-31 16:21:01,107::task::866::Storage.TaskManager.Task::(_setError) Task=`166d5870-ef4a-43f2-8576-08c4c3eb6015`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 334, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1508, in copyImage
    postZero, force)
  File "/usr/share/vdsm/storage/image.py", line 784, in copyCollapsed
    (srcImgUUID, dstImgUUID, str(e)))
CopyImageError: low level Image copy failed: (u"src image=f083f8a2-8aa6-4d33-885f-6648252de5d0, dst image=b2cf2203-ee0e-47c7-9725-2a7c366eaf9d: msg=local variable 'e' referenced before assignment",)
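
The "local variable 'e' referenced before assignment" text in the message above is a Python error that replaced the real copy failure. The following is a minimal sketch of one way such masking can happen. It is not taken from vdsm's image.py; copy_collapsed_sketch, run_copy and the local CopyImageError class are hypothetical stand-ins used only for illustration.

```python
class CopyImageError(Exception):
    """Stand-in for the storage error type; not the vdsm class."""


def copy_collapsed_sketch(run_copy):
    """Hypothetical copy wrapper whose error handling hides the real failure."""
    try:
        try:
            run_copy()
        except ValueError as e:
            raise CopyImageError(str(e))
        except OSError:
            # Bug: 'e' is only bound by the ValueError handler above, so this
            # lookup raises UnboundLocalError instead of reporting the OSError.
            raise CopyImageError(str(e))
    except CopyImageError:
        raise
    except Exception as e:
        # The generic handler wraps the UnboundLocalError, so the caller sees
        # the Python error text rather than the underlying copy failure.
        raise CopyImageError("msg=%s" % e)


def failing_copy():
    # Assumed failure for the example; the message is made up.
    raise OSError("copy tool exited with an error")


try:
    copy_collapsed_sketch(failing_copy)
except CopyImageError as err:
    masked_message = str(err)
    print(masked_message)
```

The exact wording of the Python error differs between interpreter versions, but in every case the original failure text is lost, which is why the logs in this first attachment were not useful for diagnosis.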


Version-Release number of selected component (if applicable):
rhev 3.5.1 vt14.1
vdsm-4.16.12.1-3.el7ev.x86_64
libvirt-daemon-1.2.8-16.el7_1.2.x86_64
qemu-kvm-rhev-2.1.2-23.el7_1.1.x86_64
rhevm-3.5.1-0.2.el6ev.noarch

How reproducible:
Not 100%; seems to reproduce with images larger than 5G

Steps to Reproduce:
1. Create a VM with a 10G disk that resides on a block domain
2. Create a template out of the VM


Actual results:
Template creation fails with the mentioned exception

Expected results:
Template creation should succeed 

Additional info: logs from engine and vdsm and db dump

Comment 1 Elad 2015-04-05 12:31:02 UTC
Created attachment 1011120 [details]
/rhev/data-center tree and lvs outputs

Comment 2 Elad 2015-04-05 12:51:14 UTC
Created attachment 1011123 [details]
new logs (from host, engine, db dump, lvs and /rhev/data-center/ tree)

Reproduced with the fix introduced in https://bugzilla.redhat.com/show_bug.cgi?id=1207705. 

Please ignore the logs uploaded so far, attaching the new logs with the relevant exception.

 34b781dc-97d3-49bc-91e6-31eb9b9249bc::ERROR::2015-04-05 15:41:21,083::task::866::Storage.TaskManager.Task::(_setError) Task=`34b781dc-97d3-49bc-91e6-31eb9b9249bc`::Unexpected error
Traceback (most recent call last):
  File "/usr/share/vdsm/storage/task.py", line 873, in _run
    return fn(*args, **kargs)
  File "/usr/share/vdsm/storage/task.py", line 334, in run
    return self.cmd(*self.argslist, **self.argsdict)
  File "/usr/share/vdsm/storage/securable.py", line 77, in wrapper
    return method(self, *args, **kwargs)
  File "/usr/share/vdsm/storage/sp.py", line 1508, in copyImage
    postZero, force)
  File "/usr/share/vdsm/storage/image.py", line 767, in copyCollapsed
    raise se.CopyImageError(str(e))
CopyImageError: low level Image copy failed: ("ecode=1, stdout=[], stderr=['qemu-img: error writing zeroes at sector 0: Invalid argument'], message=None",)
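
For triaging other logs against this bug, the qemu-img stderr can be pulled out of the CopyImageError line. This is a small sketch that assumes the line has the same shape as the one above; qemu_img_stderr and is_write_zeroes_failure are hypothetical helper names, not part of vdsm.

```python
import re

# Matches the final traceback line and captures the tuple contents.
_COPY_ERR = re.compile(
    r"CopyImageError: low level Image copy failed: \((?P<detail>.*)\)\s*$")
# Captures the stderr list as it is printed inside the detail string.
_STDERR = re.compile(r"stderr=\[(?P<stderr>.*?)\], message=")


def qemu_img_stderr(line):
    """Return the stderr text from a CopyImageError line, or None."""
    match = _COPY_ERR.search(line)
    if match is None:
        return None
    stderr = _STDERR.search(match.group("detail"))
    if stderr is None:
        return None
    return stderr.group("stderr").strip("'\"")


def is_write_zeroes_failure(line):
    """True if the line carries the 'error writing zeroes' signature."""
    text = qemu_img_stderr(line)
    return text is not None and "error writing zeroes" in text


sample = ("CopyImageError: low level Image copy failed: (\"ecode=1, stdout=[], "
          "stderr=['qemu-img: error writing zeroes at sector 0: "
          "Invalid argument'], message=None\",)")
print(qemu_img_stderr(sample))
print(is_write_zeroes_failure(sample))
```

A line such as the earlier "local variable 'e' referenced before assignment" message has no stderr field, so qemu_img_stderr returns None for it and it is not counted as a match.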

Comment 3 Elad 2015-04-07 10:25:25 UTC
Some additional information:

- Checked the scenario with another storage server; it doesn't reproduce there. (So it reproduces only with the XtremIO storage server.)
- This seems to affect preallocated disks only.
- The disk doesn't have to contain data.
- The bug reproduces only with RHEL7.1 (I checked with RHEL7.0 - qemu-kvm-rhev-1.5.3-60.el7_0.12.x86_64, and the bug doesn't reproduce there)
- The template doesn't have to be created on the same domain for the bug to reproduce; it depends on the storage server (as mentioned above, it reproduces on a specific server)

Comment 8 Elad 2015-04-16 10:03:58 UTC
The build provided in comment 4 is for RHEL7.0, the bug occurs only with RHEL7.1.

Comment 9 Carlos Mestre González 2015-04-24 12:54:13 UTC
We've seen this same error happen sporadically in our automation tests with XtremIO, but in this case it occurs after creating a VM from a snapshot of a VM with multiple disks.

-- Result: cleanSuccess
-- Message: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = low level Image copy failed, code = 261,
-- Exception: VDSGenericException: VDSErrorException: Failed to HSMGetAllTasksStatusesVDS, error = low level Image copy failed, code = 261

Reproduces in RHEL 7.1:
qemu-img-rhev-2.1.2-23.el7_1.1.x86_64
VDS(libvirt-1.2.8-16.el7_1.3.x86_64)
VDS(vdsm-4.16.13.1-1.el7ev.x86_64)
vt14.2

Tal, do you think this is related to the same issue? Should I provide logs here or open a new bug?

Comment 10 Raz Tamir 2015-04-25 20:52:38 UTC
Hi Carlos,
There is an open bug on this issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1201268

Comment 11 Tal Nisan 2015-04-26 16:14:19 UTC
Hi Carlos,
Seems like the same issue on the same platform

Comment 12 Allon Mureinik 2015-04-27 07:54:36 UTC
Sandro, qemu-kvm-rhev-2.1.2-23.el7_1.2 will probably fix this issue once it's released (see bug 1203543).

Assuming virt qe can verify it and we get a respin soon, we'll need a qemu-kvm-ev build of these sources so we can consume it in oVirt. Do you need an additional BZ to track this request, or can we use this one?

Comment 13 Sandro Bonazzola 2015-04-27 13:45:19 UTC
This bug is targeted to 3.5.3, and 3.5.1 GA is scheduled for tomorrow, as is upstream 3.5.2. I think it's too late for 3.5.2, but we can release qemu-kvm-ev async. Maybe better to track on a separate bug.

Comment 14 Allon Mureinik 2015-04-27 16:34:00 UTC
(In reply to Sandro Bonazzola from comment #13)
> This bug is targeted to 3.5.3, and 3.5.1 GA is scheduled for tomorrow, as is
> upstream 3.5.2. I think it's too late for 3.5.2, but we can release
> qemu-kvm-ev async.
+1!

> Maybe better to track on a separate bug.
Agreed - opened bug 1215744 for tracking this. I'm not sure about the component though, feel free to move it around.

Thanks!

Comment 15 Allon Mureinik 2015-05-04 11:02:49 UTC
*** Bug 1218165 has been marked as a duplicate of this bug. ***

Comment 16 Allon Mureinik 2015-05-10 07:02:53 UTC
Fred, if you're already touching VDSM's spec file for bug 1199014, might as well just require qemu-kvm-[rh]ev-2.1.2-23.el7_1.2.x86_64.rpm and be done with this BZ too.
Thanks!
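
To check whether a host already has a build that meets the requirement named above, the installed package string can be compared against 2.1.2-23.el7_1.2. The sketch below is a simplified approximation of RPM version ordering (it ignores epochs and the tilde/caret rules) and should not be treated as a replacement for rpm's own comparison; split_nvra, vercmp and meets_minimum are hypothetical helper names.

```python
import re


def split_nvra(package):
    """Split 'name-version-release.arch' into (name, version, release)."""
    nvr = package.rsplit(".", 1)[0]          # drop the trailing architecture
    name, version, release = nvr.rsplit("-", 2)
    return name, version, release


def _segments(text):
    # Runs of digits or runs of letters; separators are dropped.
    return re.findall(r"\d+|[A-Za-z]+", text)


def vercmp(a, b):
    """Return -1, 0 or 1, roughly following RPM segment comparison."""
    seg_a, seg_b = _segments(a), _segments(b)
    for x, y in zip(seg_a, seg_b):
        x_num, y_num = x.isdigit(), y.isdigit()
        if x_num and y_num:
            if int(x) != int(y):
                return 1 if int(x) > int(y) else -1
        elif x_num != y_num:
            # A numeric segment sorts as newer than an alphabetic one.
            return 1 if x_num else -1
        elif x != y:
            return 1 if x > y else -1
    if len(seg_a) == len(seg_b):
        return 0
    # All shared segments are equal: the string with extra segments is newer.
    return 1 if len(seg_a) > len(seg_b) else -1


def meets_minimum(version, release, min_version, min_release):
    """True if version-release is at least min_version-min_release."""
    by_version = vercmp(version, min_version)
    if by_version != 0:
        return by_version > 0
    return vercmp(release, min_release) >= 0


# Build reported in the description of this bug.
_, installed_version, installed_release = split_nvra(
    "qemu-kvm-rhev-2.1.2-23.el7_1.1.x86_64")
print(meets_minimum(installed_version, installed_release, "2.1.2", "23.el7_1.2"))
```

Note that this only answers "is the build at least the required one". The RHEL7.0 build 1.5.3-60.el7_0.12 also fails the check even though the bug was not reproduced with it.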

Comment 17 Yaniv Lavi 2015-05-13 13:10:53 UTC
Since it's the same fix as bug 1199014, moving to 3.5.3 to track both.

Comment 19 Elad 2015-06-17 13:22:05 UTC
Template creation using a preallocated 10G disk on a storage domain that resides on an XtremIO LUN completes successfully.

Used:
ovirt-engine-3.6.0-0.0.master.20150519172219.git9a2e2b3.el6.noarch
vdsm-4.17.0-912.git25a063d.el7.noarch

Comment 21 errata-xmlrpc 2016-03-09 19:36:14 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0362.html

