+++ This bug was initially created as a clone of Bug #1206722 +++
+++ This bug was initially created as a clone of Bug #1206355 +++

Description of problem:
After completing an active layer merge, vdsm's representation of the new volume chain might not be properly synchronized. When this happens, engine reports a failure to delete the snapshot even though libvirt successfully merged it.

Version-Release number of selected component (if applicable):
libvirt-1.2.8-16.el7_1.2.x86_64 from http://download.devel.redhat.com/brewroot/packages/libvirt/1.2.8/16.el7_1.2/x86_64/ (a candidate build for the RHEL-7.1 zStream)
vdsm-4.16.12-28.gitf03bb74.el7.x86_64 (3.5 branch tip)

How reproducible:
Intermittent but easy to reproduce

Steps to Reproduce:
1. Create a VM with 3 block disks (1 thin and 2 preallocated)
2. Start the VM
3. Create a snapshot
4. Delete the snapshot

Actual results:
The snapshot may fail to delete on at least one disk.

Expected results:
Snapshot removal is successful.

Additional info:
The vdsm log reveals errors such as the following:

Thread-50::ERROR::2015-03-26 10:53:09,314::sampling::488::vm.Vm::(collect) vmId=`b3fbc637-9cc9-4b15-ba94-5a2ee1607785`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x3133870>
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/sampling.py", line 484, in collect
    statsFunction()
  File "/usr/share/vdsm/virt/sampling.py", line 359, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/virt/vm.py", line 292, in _highWrite
    self._vm.extendDrivesIfNeeded()
  File "/usr/share/vdsm/virt/vm.py", line 2537, in extendDrivesIfNeeded
    extend = [x for x in self._getExtendCandidates()
  File "/usr/share/vdsm/virt/vm.py", line 2489, in _getExtendCandidates
    capacity, alloc, physical = self._dom.blockInfo(drive.path, 0)
  File "/usr/share/vdsm/virt/vm.py", line 689, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 646, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: invalid argument: invalid path /rhev/data-center/dee92f86-2fef-4edf-ac51-135d12646bde/01f563fa-1aef-41ab-92d3-0d6561d4a731/images/31cfd267-e627-4cfe-8f33-f66b0db627c3/1863e178-c40a-42eb-803a-3d1f1ac6658a not assigned to domain

After talking to Eric Blake on IRC, this looks like another libvirt problem where the domain XML is not updated before the virDomainBlockJobAbort API (used to pivot an active layer merge) returns. As a result, our volume chain sync code reads a stale disk path, which causes _highWrite to fail in this manner.
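For illustration, the race can be worked around on the caller's side by re-reading the live domain XML after the pivot until the expected top volume path appears. The sketch below is not vdsm's actual sync code; the helper names are hypothetical, and `dom` is assumed to be a libvirt.virDomain handle whose `XMLDesc()` returns the current domain XML.

```python
import time
import xml.etree.ElementTree as ET


def disk_paths(domain_xml):
    """Return the set of disk source paths present in a libvirt domain XML."""
    root = ET.fromstring(domain_xml)
    paths = set()
    for source in root.findall("./devices/disk/source"):
        # Block-backed disks carry the path in 'dev'; file-backed in 'file'.
        path = source.get("dev") or source.get("file")
        if path:
            paths.add(path)
    return paths


def wait_for_pivot(dom, new_top_path, timeout=10.0, interval=0.5):
    """Poll the live domain XML until new_top_path shows up, or time out.

    Called after virDomainBlockJobAbort with the PIVOT flag returns, to
    bridge the window in which the XML still reports the pre-merge path.
    """
    deadline = time.time() + timeout
    while time.time() < deadline:
        if new_top_path in disk_paths(dom.XMLDesc(0)):
            return True
        time.sleep(interval)
    return False
```

Polling is only a mitigation for the window described above; the real fix belongs in libvirt, which should update the domain definition before the pivot call returns.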
*** Bug 1207290 has been marked as a duplicate of this bug. ***
Verified on 3.5.1 Z-STREAM vt14.3 with:

rhevm-3.5.1-0.4.el6ev.noarch
vdsm-4.16.13.1-1.el7ev.x86_64
libvirt-daemon-driver-nwfilter-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-nodedev-1.2.8-16.el7_1.2.x86_64
libvirt-lock-sanlock-1.2.8-16.el7_1.2.x86_64
libvirt-client-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-network-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-qemu-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-interface-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-storage-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-config-nwfilter-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-secret-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-kvm-1.2.8-16.el7_1.2.x86_64
libvirt-python-1.2.8-7.el7_1.1.x86_64

I ran the scenario with the above versions and the live merge was successful. Moving to verified.
Verified with the following libvirt builds as well:
https://brewweb.devel.redhat.com/buildinfo?buildID=428544

libvirt-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-client-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-config-network-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-config-nwfilter-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-interface-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-lxc-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-network-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-nodedev-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-nwfilter-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-qemu-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-secret-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-storage-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-kvm-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-lxc-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-debuginfo-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-devel-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-docs-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-lock-sanlock-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-login-shell-1.2.8-16.el7_1.3.x86_64.rpm

Steps to Reproduce:
1. Create a VM with 3 block disks (1 thin and 2 preallocated)
2. Start the VM
3. Create a snapshot
4. Delete the snapshot

Works fine.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-0904.html