Created attachment 913796
logs from host and engine

Description of problem:
Tried to create a snapshot of a running VM whose disk resides on an iSCSI domain. The operation failed with a libvirtError in vdsm.

Version-Release number of selected component (if applicable):
vdsm-4.16.0-3.git601f786.el6.x86_64
libvirt-0.10.2-29.el6_5.9.x86_64
qemu-kvm-0.12.1.2-2.415.el6_5.10.x86_64

How reproducible:
Always on block storage

Steps to Reproduce:
On a shared DC
1. Create a VM with a disk that resides on a block (iSCSI/FC) storage domain
2. Start the VM
3. Create a snapshot of the VM

Actual results:
Snapshot creation fails with the following error:

Thread-82::ERROR::2014-07-01 18:31:47,033::vm::319::vm.Vm::(_sampleCpuTune) vmId=`55749aee-709f-46cd-9148-0446bb9d9c1a`::Failed to retrieve QoS metadata
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 316, in _sampleCpuTune
    libvirt.VIR_DOMAIN_METADATA_ELEMENT, METADATA_VM_TUNE_URI, 0)
  File "/usr/share/vdsm/virt/vm.py", line 604, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 904, in metadata
    if ret is None: raise libvirtError ('virDomainGetMetadata() failed', dom=self)
libvirtError: argument unsupported: QEMU driver does not support <metadata> element

Expected results:
Live snapshot should succeed

Additional info:
logs from host and engine
The initially reported error message is not related to the snapshot failure. After looking at the logs I see the following:

Thread-62::DEBUG::2014-07-01 18:03:07,089::vm::4181::vm.Vm::(snapshot) vmId=`2d6ceb1e-9a1a-47a4-9d1c-607fa46bb122`::<domainsnapshot>
    <disks>
        <disk name="vda" snapshot="external" type="block">
            <source dev="/rhev/data-center/mnt/blockSD/bfeff0f2-cf26-4c41-aa69-bef6eed98bea/images/f7627c6a-6a42-4e11-b2a1-c12216ef159e/de2f1db6-323a-420e-bc61-8d14ed2a7bbe" type="block"/>
        </disk>
    </disks>
    <memory file="/rhev/data-center/mnt/lion.qa.lab.tlv.redhat.com:_export_elad_2/0a17cdcf-5c03-4657-a783-29525f76eb87/images/3e781470-84c0-4cef-a82d-a45ed7d33b83/b4333f28-5963-45b8-9f1b-f226a5b29222" snapshot="external"/>
</domainsnapshot>

Thread-62::DEBUG::2014-07-01 18:03:07,099::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 67 edom: 35 level: 2 message: unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name
Thread-62::DEBUG::2014-07-01 18:03:07,099::vm::4202::vm.Vm::(snapshot) vmId=`2d6ceb1e-9a1a-47a4-9d1c-607fa46bb122`::Snapshot failed using the quiesce flag, trying again without it (unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name)
Thread-62::DEBUG::2014-07-01 18:03:07,106::libvirtconnection::143::root::(wrapper) Unknown libvirterror: ecode: 67 edom: 35 level: 2 message: unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name
Thread-62::ERROR::2014-07-01 18:03:07,106::vm::4206::vm.Vm::(snapshot) vmId=`2d6ceb1e-9a1a-47a4-9d1c-607fa46bb122`::Unable to take snapshot
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/vm.py", line 4204, in snapshot
    self._dom.snapshotCreateXML(snapxml, snapFlags)
  File "/usr/share/vdsm/virt/vm.py", line 604, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/libvirt.py", line 1636, in snapshotCreateXML
    if ret is None:raise libvirtError('virDomainSnapshotCreateXML() failed', dom=self)
libvirtError: unsupported configuration: source for disk 'vda' is not a regular file; refusing to generate external snapshot name
I recently changed vdsm to specify the type='block' attribute when creating snapshots so libvirt would not treat a block device like a file (which could have bad implications for a future live merge of the snapshot). It looks like older versions of libvirt are rejecting this syntax. It seems as if vdsm was relying on a broken semantic to support live snapshots on block devices where it should not have typically worked.
Hi Eric, Do you know what the behavior of and support for live external snapshots on block devices is for libvirt-0.10.2? It seems that in the past we have been relying on libvirt assuming the device is a file all of the time. Recently I changed the snapshot XML to declare type='block' and it's breaking on this old version. I am wondering if we have a libvirt bug here or if we are going to need to rely on fooling libvirt on old hosts to avoid a feature regression in oVirt running on old hosts.
Well, old libvirt versions such as the one in question here don't support type="block" and simply ignore it. As a result, libvirtd thinks that the snapshot target file was not specified, because it expects the XML only in the old format:

<disk name="vda" snapshot="external" path="/asdfg"/>

You can specify type="block" only for a libvirt that actually understands it.
(In reply to Adam Litke from comment #3)
> Hi Eric,
>
> Do you know what the behavior of and support for live external snapshots on
> block devices is for libvirt-0.10.2? It seems that in the past we have been
> relying on libvirt assuming the device is a file all of the time. Recently
> I changed the snapshot XML to declare type='block' and it's breaking on this
> old version.

Well, libvirt 0.10.2 is ancient in terms of features and certainly does not support the type="block" declaration.

>
> I am wondering if we have a libvirt bug here or if we are going to need to
> rely on fooling libvirt on old hosts to avoid a feature regression in oVirt
> running on old hosts.

Well, it's not a libvirt bug; it's more that libvirt is missing the feature. On older hosts you should use the old format; the new one will not be parsed correctly. You can probably use the presence of the <backingStore> element in a running VM as a witness for support of this feature.
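For illustration, the <backingStore> check suggested above could be sketched roughly as follows. This is a minimal sketch, not vdsm code: the helper name has_backing_store is hypothetical, and it assumes the caller has already fetched the running domain's XML as a string (for example via XMLDesc(0) on the domain object).

```python
import xml.etree.ElementTree as ET


def has_backing_store(domain_xml):
    """Return True if any <disk> in the domain XML has a <backingStore> child.

    Hypothetical helper: the presence of <backingStore> is treated as a
    hint that this libvirt understands the newer snapshot disk syntax.
    """
    root = ET.fromstring(domain_xml)
    return any(disk.find("backingStore") is not None
               for disk in root.iter("disk"))
```

A caller could then pick the old or the new snapshot XML format based on the result; whether <backingStore> shows up for every disk type on every supporting build is an assumption worth verifying on real hosts.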
(In reply to Adam Litke from comment #3)
> Hi Eric,
>
> Do you know what the behavior of and support for live external snapshots on
> block devices is for libvirt-0.10.2? It seems that in the past we have been
> relying on libvirt assuming the device is a file all of the time. Recently
> I changed the snapshot XML to declare type='block' and it's breaking on this
> old version.
>
> I am wondering if we have a libvirt bug here or if we are going to need to
> rely on fooling libvirt on old hosts to avoid a feature regression in oVirt
> running on old hosts.

Upstream 0.10.2 only supports files, and doesn't know the type='...' attribute at all. Support for type='file' vs. type='block' wasn't added until 1.2.2.
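As a rough illustration of the 1.2.2 cutoff, a version comparison might look like the sketch below. It relies on libvirt encoding its version number as major * 1000000 + minor * 1000 + release (the value returned by getLibVersion() on a connection); the function names are hypothetical. Since the cutoff refers to upstream releases, a downstream build with backported features could behave differently, so this is only a coarse guide.

```python
# Upstream release in which type='file' vs. type='block' was added,
# per the comment above.
DISK_TYPE_MIN_VERSION = (1, 2, 2)


def decode_libvirt_version(number):
    """Split libvirt's integer version into a (major, minor, release) tuple."""
    major, rest = divmod(number, 1000000)
    minor, release = divmod(rest, 1000)
    return (major, minor, release)


def understands_disk_type(number):
    """Hypothetical check: True if the upstream version is at least 1.2.2."""
    return decode_libvirt_version(number) >= DISK_TYPE_MIN_VERSION
```

For the host in this report, libvirt 0.10.2 is encoded as 10002, which falls below the cutoff.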
I have not tried this, but it may be possible to use the union of old and new formats with both libvirt versions, as in:

<disk name="vda" snapshot="external" type="block">
  <source file="/path/to/block" dev="/path/to/block"/>
</disk>

so that the old parser sees disk/source/file as expected, while the new parser relies on disk[@type=block]/source/dev
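A minimal sketch of generating that union XML is shown here, with the same caveat that it is untested against real libvirt. The function names union_disk_element and union_snapshot_xml are made up for illustration and are not part of vdsm or libvirt.

```python
import xml.etree.ElementTree as ET


def union_disk_element(name, path):
    """Build a <disk> carrying both the old (file=) and new (dev=) source."""
    disk = ET.Element("disk", {"name": name,
                               "snapshot": "external",
                               "type": "block"})
    # Same path in both attributes, so either parser finds what it expects.
    ET.SubElement(disk, "source", {"file": path, "dev": path})
    return disk


def union_snapshot_xml(disks):
    """Return <domainsnapshot> XML for an iterable of (name, path) pairs."""
    snap = ET.Element("domainsnapshot")
    disks_el = ET.SubElement(snap, "disks")
    for name, path in disks:
        disks_el.append(union_disk_element(name, path))
    return ET.tostring(snap, encoding="unicode")
```

Calling union_snapshot_xml([("vda", "/path/to/block")]) yields a document that could be passed to snapshotCreateXML in place of the single-format XML.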
(In reply to Eric Blake from comment #7)
> I have not tried this, but it may be possible to use the union of old and
> new formats with both libvirt versions, as in:
>
> <disk name="vda" snapshot="external" type="block">
>   <source file="/path/to/block" dev="/path/to/block"/>
> </disk>
>
> so that the old parser sees disk/source/file as expected, while the new
> parser relies on disk[@type=block]/source/dev

The idea here is that you supply the union on input, then the resulting output tells you which fields got ignored as unrecognized, and therefore which version of libvirt you are dealing with. Neither old nor new libvirt will ever output the union.
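One way to read the output back might be sketched as follows. It assumes, per the comment above, that an old libvirt echoes only source/file and a new one only source/dev; the function name detect_parser and its return labels are hypothetical, and the input is the snapshot XML as libvirt reports it after creation.

```python
import xml.etree.ElementTree as ET


def detect_parser(snapshot_xml, disk_name):
    """Guess which parser handled the union input for the named disk.

    Returns "old" if only source/@file survived, "new" if only
    source/@dev survived, and "unknown" otherwise.
    """
    root = ET.fromstring(snapshot_xml)
    for disk in root.iter("disk"):
        if disk.get("name") != disk_name:
            continue
        source = disk.find("source")
        if source is None:
            return "unknown"
        has_file = source.get("file") is not None
        has_dev = source.get("dev") is not None
        if has_dev and not has_file:
            return "new"
        if has_file and not has_dev:
            return "old"
        return "unknown"
    return "unknown"
```

Returning "unknown" for anything unexpected keeps the caller from acting on a guess when the output matches neither shape.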
Posted a proposed fix to gerrit: http://gerrit.ovirt.org/#/c/29567/
As noted by Adam Litke, the libvirtError "argument unsupported: QEMU driver does not support <metadata> element" is an entirely different bug, unrelated to the snapshot error or to block domains. You can follow it in BZ #1116826.
Returning BZ to POST - the patch should be backported to the ovirt-3.5 branch
verified on beta.2
oVirt 3.5 has been released and should include the fix for this issue.