Bug 1115126 - [vdsm] live snapshot creation fails on block storage with unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: oVirt
Classification: Retired
Component: vdsm
Version: 3.5
Hardware: x86_64
OS: Unspecified
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: 3.5.0
Assignee: Adam Litke
QA Contact: Ori Gofen
URL:
Whiteboard: storage
Depends On:
Blocks: 1119691 1190742
Reported: 2014-07-01 15:35 UTC by Elad
Modified: 2016-05-26 01:48 UTC (History)

Fixed In Version: ovirt-3.5.0-beta2
Doc Type: Bug Fix
Doc Text:
Clone Of:
: 1190742 (view as bug list)
Environment:
Last Closed: 2014-10-17 12:33:57 UTC
oVirt Team: Storage
Embargoed:


Attachments
logs from host and engine (2.55 MB, application/x-gzip)
2014-07-01 15:35 UTC, Elad


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 29567 0 master MERGED virt: Restore BC for block type live snapshots Never
oVirt gerrit 30228 0 ovirt-3.5 MERGED virt: Restore BC for block type live snapshots Never

Description Elad 2014-07-01 15:35:46 UTC
Created attachment 913796
logs from host and engine

Description of problem:
Tried to create a snapshot of a running VM whose disk resides on an iSCSI domain. The operation failed with a libvirtError in vdsm.

Version-Release number of selected component (if applicable):
vdsm-4.16.0-3.git601f786.el6.x86_64
libvirt-0.10.2-29.el6_5.9.x86_64
qemu-kvm-0.12.1.2-2.415.el6_5.10.x86_64

How reproducible:
Always on block storage

Steps to Reproduce:
On a shared DC
1. Create a VM with a disk residing on a block (iSCSI/FC) storage domain
2. Start the VM
3. Create a snapshot of the VM

Actual results:
Snapshot creation fails with the following error:

Thread-82::ERROR::2014-07-01 18:31:47,033::vm::319::vm.Vm::(_sampleCpuTune) vmId=`55749aee-709f-46cd-9148-0446bb9d9c1a`::Failed to retrieve QoS metadata
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 316, in _sampleCpuTune
    libvirt.VIR_DOMAIN_METADATA_ELEMENT, METADATA_VM_TUNE_URI, 0)
  File "/usr/share/vdsm/virt/vm.py", line 604, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 904, in metadata
    if ret is None: raise libvirtError ('virDomainGetMetadata() failed', dom=self)
libvirtError: argument unsupported: QEMU driver does not support <metadata> element


Expected results:
Live snapshot should succeed

Additional info: logs from host and engine

Comment 1 Adam Litke 2014-07-02 19:13:48 UTC
The initially reported error message is not related to the snapshot failure.  After looking at the logs I see the following:

Thread-62::DEBUG::2014-07-01 18:03:07,089::vm::4181::vm.Vm::(snapshot) vmId=`2d6ceb1e-9a1a-47a4-9d1c-607fa46bb122`::<domainsnapshot>
	<disks>
		<disk name="vda" snapshot="external" type="block">
			<source dev="/rhev/data-center/mnt/blockSD/bfeff0f2-cf26-4c41-aa69-bef6eed98bea/images/f7627c6a-6a42-4e11-b2a1-c12216ef159e/de2f1db6-323a-420e-bc61-8d14ed2a7bbe" type="block"/>
		</disk>
	</disks>
	<memory file="/rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_elad_2/0a17cdcf-5c03-4657-a783-29525f76eb87/images/3e781470-84c0-4cef-a82d-a45ed7d33b83/b4333f28-5963-45b8-9f1b-f226a5b29222" snapshot="external"/>
</domainsnapshot>

Thread-62::DEBUG::2014-07-01 18:03:07,099::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 67 edom: 35 level: 2 message: unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name
Thread-62::DEBUG::2014-07-01 18:03:07,099::vm::4202::vm.Vm::(snapshot) vmId=`2d6ceb1e-9a1a-47a4-9d1c-607fa46bb122`::Snapshot failed using the quiesce flag, trying again without it (unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name)
Thread-62::DEBUG::2014-07-01 18:03:07,106::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 67 edom: 35 level: 2 message: unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name
Thread-62::ERROR::2014-07-01 18:03:07,106::vm::4206::vm.Vm::(snapshot) vmId=`2d6ceb1e-9a1a-47a4-9d1c-607fa46bb122`::Unable to take snapshot
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 4204, in snapshot
    self._dom.snapshotCreateXML(snapxml, snapFlags)
  File "/usr/share/vdsm/virt/vm.py", line 604, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1636, in snapshotCreateXML
    if ret is None:raise libvirtError('virDomainSnapshotCreateXML() failed', dom=self)
libvirtError: unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name

Comment 2 Adam Litke 2014-07-02 19:20:23 UTC
I recently changed vdsm to specify the type='block' attribute when creating snapshots so libvirt would not treat a block device like a file (which could have bad implications for a future live merge of the snapshot).  It looks like older versions of libvirt are rejecting this syntax.  It seems as if vdsm was relying on a broken semantic to support live snapshots on block devices where it should not have typically worked.

Comment 3 Adam Litke 2014-07-02 19:25:30 UTC
Hi Eric,

Do you know what the behavior of and support for live external snapshots on block devices is for libvirt-0.10.2?  It seems that in the past we have been relying on libvirt assuming the device is a file all of the time.  Recently I changed the snapshot XML to declare type='block' and it's breaking on this old version.

I am wondering if we have a libvirt bug here or if we are going to need to rely on fooling libvirt on old hosts to avoid a feature regression in oVirt running on old hosts.

Comment 4 Peter Krempa 2014-07-03 06:08:22 UTC
Well, old libvirt versions such as the one in question here don't support type="block" and simply ignore it. As a result, libvirtd thinks the snapshot target file was not specified, because it expects the XML only in this format:

<disk name="vda" snapshot="external" path="/asdfg"/>

That is the old format. You can specify type="block" only for a libvirt that actually understands it.

Comment 5 Peter Krempa 2014-07-03 06:41:57 UTC
(In reply to Adam Litke from comment #3)
> Hi Eric,
> 
> Do you know what the behavior of and support for live external snapshots on
> block devices is for libvirt-0.10.2?  It seems that in the past we have been
> relying on libvirt assuming the device is a file all of the time.  Recently
> I changed the snapshot XML to declare type='block' and it's breaking on this
> old version.

Well, libvirt 0.10.2 is ancient in terms of features and certainly does not support the type="block" declaration.

> 
> I am wondering if we have a libvirt bug here or if we are going to need to
> rely on fooling libvirt on old hosts to avoid a feature regression in oVirt
> running on old hosts.

Well it's not a libvirt bug, it's more like libvirt is missing the feature. On older hosts you should use the old format, the new one will not be parsed correctly. You probably can use the presence of the <backingStore> element in a running VM as a witness for support of this feature.
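The <backingStore> check suggested above could look roughly like the sketch below. It is only an illustration, not the vdsm fix: the function name is made up, and it assumes the caller has already fetched the live domain XML as a string (for example with dom.XMLDesc(0) from the libvirt Python bindings).

```python
import xml.etree.ElementTree as ET


def domain_reports_backing_store(domain_xml):
    """Return True if any <disk> in the live domain XML has a <backingStore> child.

    Hypothetical helper: the presence of the element is used as a hint that
    this libvirt also understands type="block" in snapshot XML.
    """
    root = ET.fromstring(domain_xml)
    return any(disk.find("backingStore") is not None
               for disk in root.iter("disk"))


# Hand-written sample documents, not real libvirt output.
OLD_STYLE = """
<domain type="kvm">
  <devices>
    <disk type="block" device="disk">
      <source dev="/dev/vg/lv"/>
      <target dev="vda" bus="virtio"/>
    </disk>
  </devices>
</domain>
"""

NEW_STYLE = """
<domain type="kvm">
  <devices>
    <disk type="block" device="disk">
      <source dev="/dev/vg/lv"/>
      <backingStore/>
      <target dev="vda" bus="virtio"/>
    </disk>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(domain_reports_backing_store(OLD_STYLE))
    print(domain_reports_backing_store(NEW_STYLE))
```

An empty <backingStore/> element still counts, since only the element's presence is tested.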

Comment 6 Eric Blake 2014-07-03 11:57:38 UTC
(In reply to Adam Litke from comment #3)
> Hi Eric,
> 
> Do you know what the behavior of and support for live external snapshots on
> block devices is for libvirt-0.10.2?  It seems that in the past we have been
> relying on libvirt assuming the device is a file all of the time.  Recently
> I changed the snapshot XML to declare type='block' and it's breaking on this
> old version.
> 
> I am wondering if we have a libvirt bug here or if we are going to need to
> rely on fooling libvirt on old hosts to avoid a feature regression in oVirt
> running on old hosts.

Upstream 0.10.2 only supports files, and doesn't know the type='...' attribute at all.  Support for type='file' vs. type='block' wasn't added until 1.2.2.
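For illustration, a version gate based on the 1.2.2 cutoff mentioned above might be sketched as follows. The helper names are hypothetical. The encoded value is what the libvirt bindings report (for example conn.getLibVersion()), computed as major * 1,000,000 + minor * 1,000 + release. Note the cutoff is for upstream libvirt; a distribution build with backported features could make a pure version check misleading.

```python
# Upstream version that introduced type='file' vs. type='block' in
# snapshot XML, per the comment above.
BLOCK_TYPE_MIN_VERSION = (1, 2, 2)


def decode_libvirt_version(encoded):
    """Split libvirt's integer version into a (major, minor, release) tuple."""
    major, rest = divmod(encoded, 1000000)
    minor, release = divmod(rest, 1000)
    return (major, minor, release)


def knows_snapshot_disk_type(encoded):
    """Hypothetical check: True if the upstream version is 1.2.2 or newer."""
    return decode_libvirt_version(encoded) >= BLOCK_TYPE_MIN_VERSION


if __name__ == "__main__":
    # 10002 is the encoding of 0.10.2, the version in this report.
    print(decode_libvirt_version(10002), knows_snapshot_disk_type(10002))
    print(decode_libvirt_version(1002002), knows_snapshot_disk_type(1002002))
```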

Comment 7 Eric Blake 2014-07-03 12:01:26 UTC
I have not tried this, but it may be possible to use the union of old and new formats with both libvirt versions, as in:

<disk name="vda" snapshot="external" type="block">
  <source file="/path/to/block" dev="/path/to/block"/>
</disk>

so that the old parser sees disk/source/file as expected, while the new parser relies on disk[@type=block]/source/dev
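A minimal sketch of generating that union element in Python, using only the standard library. As stated above, the union format itself is untested, and the function name and example path are placeholders, not vdsm code.

```python
import xml.etree.ElementTree as ET


def build_union_snapshot_disk(name, path):
    """Build a <disk> snapshot element carrying both the old and new forms.

    Hypothetical helper: source/@file is meant for an old parser and
    disk[@type=block]/source/@dev for a new one.
    """
    disk = ET.Element("disk", name=name, snapshot="external", type="block")
    ET.SubElement(disk, "source", file=path, dev=path)
    return disk


if __name__ == "__main__":
    element = build_union_snapshot_disk("vda", "/path/to/block")
    print(ET.tostring(element).decode("ascii"))
```

Attribute order in the serialized output may vary between Python versions, which does not matter to an XML parser.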

Comment 8 Eric Blake 2014-07-03 12:02:44 UTC
(In reply to Eric Blake from comment #7)
> I have not tried this, but it may be possible to use the union of old and
> new formats with both libvirt versions, as in:
> 
> <disk name="vda" snapshot="external" type="block">
>   <source file="/path/to/block" dev="/path/to/block"/>
> </disk>
> 
> so that the old parser sees disk/source/file as expected, while the new
> parser relies on disk[@type=block]/source/dev

The idea here is that you supply the union on input, then the resulting output tells you which fields got ignored as unrecognized, and therefore which version of libvirt you are dealing with.  Neither old nor new libvirt will ever output the union.
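The detection step could be sketched like this. It assumes the snapshot XML that libvirt reports back (for example via snapshot.getXMLDesc(0)) keeps only the form it understood, as described above; the function name, the return labels, and the sample documents are all made up for illustration and are not captured libvirt output.

```python
import xml.etree.ElementTree as ET


def detect_snapshot_dialect(snapshot_xml, disk_name):
    """Classify which half of the union input survived in libvirt's output.

    Returns "new" if the disk kept type="block" with source/@dev,
    "old" if only source/@file is present, otherwise "unknown".
    """
    root = ET.fromstring(snapshot_xml)
    for disk in root.iter("disk"):
        if disk.get("name") != disk_name:
            continue
        source = disk.find("source")
        if source is None:
            return "unknown"
        if disk.get("type") == "block" and source.get("dev"):
            return "new"
        if source.get("file"):
            return "old"
        return "unknown"
    return "unknown"


# Hand-written samples of what each parser is assumed to echo back.
OLD_OUTPUT = """
<domainsnapshot>
  <disks>
    <disk name="vda" snapshot="external">
      <source file="/path/to/block"/>
    </disk>
  </disks>
</domainsnapshot>
"""

NEW_OUTPUT = """
<domainsnapshot>
  <disks>
    <disk name="vda" snapshot="external" type="block">
      <source dev="/path/to/block"/>
    </disk>
  </disks>
</domainsnapshot>
"""

if __name__ == "__main__":
    print(detect_snapshot_dialect(OLD_OUTPUT, "vda"))
    print(detect_snapshot_dialect(NEW_OUTPUT, "vda"))
```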

Comment 9 Adam Litke 2014-07-03 19:05:23 UTC
Posted a proposed fix to gerrit: http://gerrit.ovirt.org/#/c/29567/

Comment 10 Ori Gofen 2014-07-07 12:15:15 UTC
As noted by Adam Litke, the libvirtError "argument unsupported: QEMU driver does not support" is a completely different bug, unrelated to the snapshot error or to block domains. You can follow it in BZ #1116826.

Comment 11 Allon Mureinik 2014-07-17 06:58:00 UTC
Returning BZ to POST - the patch should be backported to the ovirt-3.5 branch

Comment 12 Ori Gofen 2014-07-29 10:43:53 UTC
verified on beta.2

Comment 13 Sandro Bonazzola 2014-10-17 12:33:57 UTC
oVirt 3.5 has been released and should include the fix for this issue.

