1206365 – Live Merge: Active layer merge is not properly synchronized with vdsm

RHEL Engineering is moving the tracking of its product development work on RHEL 6 through RHEL 9 to Red Hat Jira (issues.redhat.com). If you're a Red Hat customer, please continue to file support cases via the Red Hat customer portal. If you're not, please head to the "RHEL project" in Red Hat Jira and file new tickets here. Individual Bugzilla bugs in the statuses "NEW", "ASSIGNED", and "POST" are being migrated throughout September 2023. Bugs of Red Hat partners with an assigned Engineering Partner Manager (EPM) are migrated in late September as per pre-agreed dates. Bugs against components "kernel", "kernel-rt", and "kpatch" are only migrated if still in "NEW" or "ASSIGNED". If you cannot log in to RH Jira, please consult article #7032570. That failing, please send an e-mail to the RH Jira admins at rh-issues@redhat.com to troubleshoot your issue as a user management inquiry. The email creates a ServiceNow ticket with Red Hat. Individual Bugzilla bugs that are migrated will be moved to status "CLOSED", resolution "MIGRATED", and set with "MigratedToJIRA" in "Keywords". The link to the successor Jira issue will be found under "Links", have a little "two-footprint" icon next to it, and direct you to the "RHEL project" in Red Hat Jira (issue links are of type "https://issues.redhat.com/browse/RHEL-XXXX", where "X" is a digit). This same link will be available in a blue banner at the top of the page informing you that that bug has been migrated.

Bug 1206365 - Live Merge: Active layer merge is not properly synchronized with vdsm

Summary: Live Merge: Active layer merge is not properly synchronized with vdsm

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux 7
Classification:	Red Hat
Component:	libvirt
Sub Component:
Version:	7.1
Hardware:	Unspecified
OS:	Unspecified
Priority:	urgent
Severity:	urgent
Target Milestone:	rc
Target Release:	7.1
Assignee:	Peter Krempa
QA Contact:	Virtualization Bugs
Docs Contact:
URL:
Whiteboard:	storage
Depends On:
Blocks:	1155583 1206355 1206722 1207808 1208021
TreeView+	depends on / blocked

Reported:	2015-03-26 21:26 UTC by Adam Litke
Modified:	2015-11-19 06:25 UTC (History)
CC List:	18 users (show)
Fixed In Version:	libvirt-1.2.14-1.el7
Doc Type:	Bug Fix
Doc Text:	Cause: A recent bug fix introduced a regression when a block job is finished via the synchronous block job API. The code that is updating the backing chain of the disk after the block job finishes was extracted into a separate thread and thus allowed to race with other threads. Consequence: If one thread is executing the synchronous abort/pivot operation while a second thread is already waiting to access the domain definition to read the backing chain, the update operation will be queued after the query thread and thus the query thread will not get the updated backing chain data. Fix: When a synchronous Abort/Pivot operation is requested, the backing chain update is done in the same thread that is waiting for completion of the job. This assures that the backing chain info is updated in time. Result: Different threads are able to get correct backing chain data after a synchronous blockJobAbort is issued.
Clone Of:	1206355
Clones:	1208021 (view as bug list)
Environment:
Last Closed:	2015-11-19 06:25:02 UTC
Target Upstream Version:
Embargoed:

Attachments	(Terms of Use)

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Bugzilla	1202719	0	high	CLOSED	Libvirtd was restarted when do active blockcommit while there is a blockpull job running	2021-02-22 00:41:40 UTC
Red Hat Product Errata	RHBA-2015:2202	0	normal	SHIPPED_LIVE	libvirt bug fix and enhancement update	2015-11-19 08:17:58 UTC

Internal Links: 1202719

Description Adam Litke 2015-03-26 21:26:31 UTC

+++ This bug was initially created as a clone of Bug #1206355 +++

Description of problem:

After completing an active layer merge, vdsm's representation of the new volume chain might not be properly synchronized.  When this happens, engine will report a failure to delete the snapshot even though libvirt successfully merged it.


Version-Release number of selected component (if applicable):
libvirt-1.2.8-16.el7_1.2.x86_64 from http://download.devel.redhat.com/brewroot/packages/libvirt/1.2.8/16.el7_1.2/x86_64/ which is a candidate build for RHEL-7.1 zStream

vdsm-4.16.12-28.gitf03bb74.el7.x86_64 : 3.5 branch tip

How reproducible: Intermittent but easy to reproduce


Steps to Reproduce:
1. Create a VM with 3 BLOCK disks (1 thin and 2 preallocated)
2. Start the VM
3. Create a snapshot
4. Delete the snapshot

Actual results:
Snapshot may fail to delete on at least one disk


Expected results:
Snapshot removal is successful


Additional info:
The vdsm log reveals errors such as the following:

Thread-50::ERROR::2015-03-26 10:53:09,314::sampling::488::vm.Vm::(collect) vmId=`b3fbc637-9cc9-4b15-ba94-5a2ee1607785`::Stats function failed: <AdvancedStatsFunction _highWrite at 0x3133870>
Traceback (most recent call last):
  File "/usr/share/vdsm/virt/sampling.py", line 484, in collect
    statsFunction()
  File "/usr/share/vdsm/virt/sampling.py", line 359, in __call__
    retValue = self._function(*args, **kwargs)
  File "/usr/share/vdsm/virt/vm.py", line 292, in _highWrite
    self._vm.extendDrivesIfNeeded()
  File "/usr/share/vdsm/virt/vm.py", line 2537, in extendDrivesIfNeeded
    extend = [x for x in self._getExtendCandidates()
  File "/usr/share/vdsm/virt/vm.py", line 2489, in _getExtendCandidates
    capacity, alloc, physical = self._dom.blockInfo(drive.path, 0)
  File "/usr/share/vdsm/virt/vm.py", line 689, in f
    ret = attr(*args, **kwargs)
  File "/usr/lib/python2.7/site-packages/vdsm/libvirtconnection.py", line 111, in wrapper
    ret = f(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/libvirt.py", line 646, in blockInfo
    if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
libvirtError: invalid argument: invalid path /rhev/data-center/dee92f86-2fef-4edf-ac51-135d12646bde/01f563fa-1aef-41ab-92d3-0d6561d4a731/images/31cfd267-e627-4cfe-8f33-f66b0db627c3/1863e178-c40a-42eb-803a-3d1f1ac6658a not assigned to domain

After talking to Eric Blake on IRC, it looks like another problem with libvirt where the domain XML is not updated before the virDomainBlockJobAbort API (used for pivoting an active layer merge) returns.  Therefore, our volume chain sync code gets an old value for the disk path which causes _highWrite to fail in this manner.

Comment 1 Adam Litke 2015-03-26 21:31:33 UTC

This is likely fallout from Bug 1202719 and the working theory is that there is a new race window...

A virDomainBlockCommit involving the active layer is completed (pivoted) by using a call to virDomainBlockJobAbort.  The Abort returns and the caller immediately retrieves the domain XML, but that call returns the XML before libvirt could update it to represent the new backing chain (post pivot).

Comment 2 Allon Mureinik 2015-03-27 05:01:23 UTC

This fix is required to support Live Merge in RHEVM.
I don't have the permissions to raise flags on this BZ, but I'd like to propose it for RHEL 7.1.z.

Comment 3 Shanzhi Yu 2015-03-27 06:47:15 UTC

well, I can reproduce it with rhevm3.5 env, although not 100 percent.
with packages

rhevm-3.5.0-0.31.el6ev.noarch

libvirt-1.2.8-16.el7_1.2.x86_64
vdsm-4.16.12.1-3.el7ev.x86_64
qemu-kvm-rhev-2.1.2-23.el7_1.1.x86_64

Comment 5 Peter Krempa 2015-03-30 09:31:04 UTC

Patches fixing the regression proposed upstream. Please note that upstream released the broken code only in the release candidate for the upcomming release so there should not be any upstream release with the flaw once this is fixed.

https://www.redhat.com/archives/libvir-list/2015-March/msg01524.html

Comment 6 Peter Krempa 2015-03-31 07:35:11 UTC

Fixed upstream:

commit 630ee5ac6cf4e3be3f3e986897a289865dd2604b
Author: Peter Krempa <pkrempa>
Date:   Mon Mar 30 11:26:20 2015 +0200

    qemu: blockjob: Synchronously update backing chain in XML on ABORT/PIVOT
    
    When the synchronous pivot option is selected, libvirt would not update
    the backing chain until the job was exitted. Some applications then
    received invalid data as their job serialized first.
    
    This patch removes polling to wait for the ABORT/PIVOT job completion
    and replaces it with a condition. If a synchronous operation is
    requested the update of the XML is executed in the job of the caller of
    the synchronous request. Otherwise the monitor event callback uses a
    separate worker to update the backing chain with a new job.
    
    This is a regression since 1a92c719101e5bfa6fe2b78006ad04c7f075ea28
    
    When the ABORT job is finished synchronously you get the following call
    stack:
     #0  qemuBlockJobEventProcess
     #1  qemuDomainBlockJobImpl
     #2  qemuDomainBlockJobAbort
     #3  virDomainBlockJobAbort
    
    While previously or while using the _ASYNC flag you'd get:
     #0  qemuBlockJobEventProcess
     #1  processBlockJobEvent
     #2  qemuProcessEventHandler
     #3  virThreadPoolWorker

commit 0c4474df4e10d27e27dbcda80b1f9cc14f4bdd8a
Author: Peter Krempa <pkrempa>
Date:   Mon Mar 30 11:26:19 2015 +0200

    qemu: Extract internals of processBlockJobEvent into a helper
    
    Later on I'll be adding a condition that will allow to synchronise a
    SYNC block job abort. The approach will require this code to be called
    from two different places so it has to be extracted into a helper.

commit 6b6c4ab8a6d2096bd5f50d2ae9b0a929fbaaf076
Author: Peter Krempa <pkrempa>
Date:   Mon Mar 30 11:26:18 2015 +0200

    qemu: processBlockJob: Don't unlock @vm twice
    
    Commit 1a92c719 moved code to handle block job events to a different
    function that is executed in a separate thread. The caller of
    processBlockJob handles locking and unlocking of @vm, so the we should
    not do it in the function itself.

v1.2.14-rc2

Comment 14 Yang Yang 2015-07-03 06:24:36 UTC

I can register host with latest libvirt and vdsm to rhevm, so this one is not blocked by bug 1232665.

I can reproduce it with libvirt-1.2.13-1.el7.x86_64 and vdsm-4.16.21-1.el7ev.x86_64

Verified it with libvirt-1.2.17-1.el7 and vdsm-4.16.21-1.el7ev.x86_64

Steps
1. Prepare a running guest with 4 disks, (2 thin and 2 preallocated)
2. Create 4 snapshots: s4-->s3-->s2-->s1-->base
3. delete 3 snapshots: s4, s3, s1
4. create 2 snapshots: ss2-->ss1-->s2-->base
5. delete 1 snapshots: s2
6. create 2 snapshots: ss4-->ss3-->ss2-->ss1-->base
7. delete 4 snapshots: ss4, ss2, ss3, ss1

All ops work well so move it to verified.

Comment 16 errata-xmlrpc 2015-11-19 06:25:02 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-2202.html

Note You need to log in before you can comment on or make changes to this bug.