Bug 1207808 - Live Merge: Active layer merge is not properly synchronized with vdsm
Summary: Live Merge: Active layer merge is not properly synchronized with vdsm
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: vdsm
Version: 3.5.1
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.5.1
Assignee: Adam Litke
QA Contact: Kevin Alon Goldblatt
URL:
Whiteboard: storage
Duplicates: 1207290
Depends On: 1206355 1206365 1206722
Blocks: 1155583 1193058
Reported: 2015-03-31 18:40 UTC by Adam Litke
Modified: 2016-02-10 18:19 UTC (History)

Fixed In Version: vt14.3
Doc Type: Bug Fix
Doc Text:
Clone Of: 1206722
Environment:
Last Closed: 2015-04-28 18:52:42 UTC
oVirt Team: Storage
Target Upstream Version:
Embargoed:
ylavi: Triaged+


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2015:0904 | 0 | normal | SHIPPED_LIVE | vdsm 3.5.1 - bug fix and enhancement update | 2015-04-28 22:50:53 UTC
oVirt gerrit | 39303 | 0 | master | MERGED | Live Merge: work around racy libvirt pivot | Never
oVirt gerrit | 39419 | 0 | ovirt-3.5 | MERGED | Live Merge: work around racy libvirt pivot | Never

Description Adam Litke 2015-03-31 18:40:33 UTC
+++ This bug was initially created as a clone of Bug #1206722 +++

+++ This bug was initially created as a clone of Bug #1206355 +++

Description of problem:

After completing an active layer merge, vdsm's representation of the new volume chain might not be properly synchronized.  When this happens, engine will report a failure to delete the snapshot even though libvirt successfully merged it.


Version-Release number of selected component (if applicable):
libvirt-1.2.8-16.el7_1.2.x86_64 from http://download.devel.redhat.com/brewroot/packages/libvirt/1.2.8/16.el7_1.2/x86_64/ which is a candidate build for RHEL-7.1 zStream

vdsm-4.16.12-28.gitf03bb74.el7.x86_64 : 3.5 branch tip

How reproducible: Intermittent but easy to reproduce


Steps to Reproduce:
1. Create a VM with 3 BLOCK disks (1 thin and 2 preallocated)
2. Start the VM
3. Create a snapshot
4. Delete the snapshot

Actual results:
Snapshot may fail to delete on at least one disk


Expected results:
Snapshot removal is successful


Additional info:
The vdsm log reveals errors such as the following:

Thread-50::ERROR::2015-03-26 10:53:09,314::sampling::488::vm.Vm::(collect) vmId=`b3fbc637-9cc9-4b15-ba94-5a2ee1607785`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x3133870>
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/sampling.py", line 484, in collect
    statsFunction()
  File "/usr/share/vdsm/virt/sampling.py", line 359, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/virt/vm.py", line 292, in _highWrite
    self._vm.extendDrivesIfNeeded()
  File "/usr/share/vdsm/virt/vm.py", line 2537, in extendDrivesIfNeeded
    extend = [x for x in self._getExtendCandidates()
  File "/usr/share/vdsm/virt/vm.py", line 2489, in _getExtendCandidates
    capacity, alloc, physical = self._dom.blockInfo(drive.path, 0)
  File "/usr/share/vdsm/virt/vm.py", line 689, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 646, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: invalid argument: invalid path /rhev/data-center/dee92f86-2fef-4edf-ac51-135d12646bde/01f563fa-1aef-41ab-92d3-0d6561d4a731/images/31cfd267-e627-4cfe-8f33-f66b0db627c3/1863e178-c40a-42eb-803a-3d1f1ac6658a not assigned to domain

After talking to Eric Blake on IRC, this looks like another libvirt problem: the domain XML is not updated before the virDomainBlockJobAbort API (used for pivoting an active layer merge) returns.  As a result, our volume chain sync code reads a stale value for the disk path, which causes _highWrite to fail in this manner.
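
To illustrate the race described above, here is a minimal Python 3 sketch of one way a caller could wait for the domain XML to reflect the pivot before syncing the volume chain. It is not the patch that was merged for this bug. The helper names (disk_source_path, wait_for_pivot), the timeout and interval defaults, and the idea of polling are all assumptions made for illustration; get_xml is a hypothetical callable that returns the current domain XML as a string.

```python
import time
import xml.etree.ElementTree as ET


def disk_source_path(domain_xml, target_dev):
    """Return the source path of the disk with the given target device.

    Returns None if the disk is not found or has no <source> element.
    """
    root = ET.fromstring(domain_xml)
    for disk in root.findall("./devices/disk"):
        target = disk.find("target")
        if target is None or target.get("dev") != target_dev:
            continue
        source = disk.find("source")
        if source is None:
            return None
        # Block-backed disks carry the path in 'dev', file-backed in 'file'.
        return source.get("dev") or source.get("file")
    return None


def wait_for_pivot(get_xml, target_dev, old_path,
                   timeout=10.0, interval=0.5, sleep=time.sleep):
    """Poll the domain XML until the disk no longer points at old_path.

    Hypothetical helper: get_xml is any callable returning the domain XML.
    Raises RuntimeError if the path has not changed when the timeout expires.
    """
    deadline = time.monotonic() + timeout
    while True:
        path = disk_source_path(get_xml(), target_dev)
        if path is not None and path != old_path:
            return path
        if time.monotonic() >= deadline:
            raise RuntimeError(
                "domain XML still reports %r for %s" % (old_path, target_dev))
        sleep(interval)
```

With the libvirt Python bindings, get_xml could be something like `lambda: dom.XMLDesc(0)`, where dom is the virDomain object for the VM. Only after wait_for_pivot returns the new path would the volume chain sync run, so that later blockInfo calls use a path libvirt actually knows about.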

Comment 1 Adam Litke 2015-03-31 20:18:18 UTC
*** Bug 1207290 has been marked as a duplicate of this bug. ***

Comment 2 Kevin Alon Goldblatt 2015-04-15 15:17:25 UTC
3.5.1 Z-STREAM vt14.3
rhevm-3.5.1-0.4.el6ev.noarch
vdsm-4.16.13.1-1.el7ev.x86_64

libvirt-daemon-driver-nwfilter-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-nodedev-1.2.8-16.el7_1.2.x86_64
libvirt-lock-sanlock-1.2.8-16.el7_1.2.x86_64
libvirt-client-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-network-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-qemu-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-interface-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-storage-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-config-nwfilter-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-driver-secret-1.2.8-16.el7_1.2.x86_64
libvirt-daemon-kvm-1.2.8-16.el7_1.2.x86_64
libvirt-python-1.2.8-7.el7_1.1.x86_64

I ran the scenario with the above versions and the live merge was successful.

Moving to verified

Comment 3 Kevin Alon Goldblatt 2015-04-26 13:43:06 UTC
Verified with the following libvirt packages as well:

https://brewweb.devel.redhat.com/buildinfo?buildID=428544

libvirt-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-client-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-config-network-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-config-nwfilter-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-interface-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-lxc-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-network-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-nodedev-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-nwfilter-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-qemu-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-secret-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-driver-storage-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-kvm-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-daemon-lxc-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-debuginfo-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-devel-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-docs-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-lock-sanlock-1.2.8-16.el7_1.3.x86_64.rpm
libvirt-login-shell-1.2.8-16.el7_1.3.x86_64.rpm

Steps to Reproduce:
1. Create a VM with 3 BLOCK disks (1 thin and 2 preallocated)
2. Start the VM
3. Create a snapshot
4. Delete the snapshot - Works fine

Comment 5 errata-xmlrpc 2015-04-28 18:52:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0904.html

