Bug 1202719

Summary: Libvirtd was restarted when doing an active blockcommit while a blockpull job was running
Product: Red Hat Enterprise Linux 7
Reporter: Jan Kurik <jkurik>
Component: libvirt
Assignee: Peter Krempa <pkrempa>
Status: CLOSED ERRATA
QA Contact: Virtualization Bugs <virt-bugs>
Severity: high
Priority: high
Version: 7.1
CC: dyuan, eblake, jdenemar, jkurik, mzhan, pkrempa, pm-eus, rbalakri, shyu
Target Milestone: rc
Keywords: ZStream
Hardware: Unspecified
OS: Unspecified
Fixed In Version: libvirt-1.2.8-16.el7_1.2
Doc Type: Bug Fix
Doc Text:
Cause: Libvirt's code that updates internal structures after a block job finishes on a running VM's disk did not acquire a job on the VM, which is required before changing the VM's internal structures.
Consequence: Libvirt's internal structures were changed without the proper interlocking mechanism, which led to problems or crashes if the update happened while a different operation was attempted on the same disk.
Fix: The code that updates the disk metadata after block job completion now runs in a separate thread with proper locking (see the sketch below).
Result: Internal structures are no longer updated while other operations may be running.
Clone Of: 1199036
Last Closed: 2015-03-26 17:54:31 UTC
Bug Depends On: 1199036
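The fix described in the Doc Text boils down to deferring the metadata update to a helper thread that first acquires the per-VM job lock. The following is a minimal, hypothetical Python sketch of that pattern (libvirt itself is written in C; VM, job_lock, and update_disk_metadata are illustrative names, not libvirt APIs):

import threading

class VM:
    """Illustrative stand-in for libvirt's per-domain object."""
    def __init__(self):
        self.job_lock = threading.Lock()   # the per-VM "job" in libvirt terms
        self.disk_chain = ["base", "mid", "top"]

    def update_disk_metadata(self):
        # Mutating shared state is only safe while the job lock is held.
        self.disk_chain.pop()

def on_block_job_completed(vm):
    # Pre-fix behavior (buggy): mutate vm.disk_chain right here, in the
    # event-handler thread, without taking vm.job_lock -- racing with any
    # API call that also touches the same disk.
    # Post-fix behavior: hand the update off to a helper thread that
    # waits for the job lock before touching the VM.
    threading.Thread(target=_finish_block_job, args=(vm,)).start()

def _finish_block_job(vm):
    with vm.job_lock:             # block until no other job owns the VM
        vm.update_disk_metadata()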

Description Jan Kurik 2015-03-17 10:03:44 UTC
This bug has been copied from bug #1199036 and has been proposed
to be backported to 7.1 z-stream (EUS).

Comment 7 Shanzhi Yu 2015-03-19 09:48:26 UTC
Verified this bug with libvirt-1.2.8-16.el7_1.2.x86_64.

1. Start a blockpull job in the background, then try another blockcommit/blockpull/blockcopy job.

# virsh blockpull vm vda --bandwidth 1 --verbose --wait 
Block Pull: [ 19 %]


# virsh blockpull vm vda --bandwidth 1 --verbose --wait
error: Operation not supported: disk 'vda' already in active block job

# virsh blockcommit vm vda --pivot --keep-relative
error: Operation not supported: disk 'vda' already in active block job

# virsh blockcopy vm vda /var/lib/libvirt/images/vm-clone --verbose --wait 
error: Operation not supported: disk 'vda' already in active block job

# virsh snapshot-create-as vm ss2 --disk-only --no-metadata 
error: internal error: unable to execute QEMU command 'transaction': Device 'drive-virtio-disk0' is busy: block device is in use by block job: stream

[NB: this error comes from qemu; the message could be improved to match the libvirt-side errors above, but that is a minor issue.]

2. Start a blockcopy job in the background, then try another blockcommit/blockpull/blockcopy job.

# virsh blockcopy vm vda /var/lib/libvirt/images/vm.clone --bandwidth 1 --verbose --wait 
Block Copy: [ 38 %]


# virsh blockpull vm vda --bandwidth 1 --verbose --wait 
error: block copy still active: disk 'vda' already in active block job

# virsh blockcommit vm vda --pivot 
error: block copy still active: disk 'vda' already in active block job

# virsh blockcopy vm vda /var/lib/libvirt/images/vm-clone --verbose --wait 
error: block copy still active: disk 'vda' already in active block job

# virsh snapshot-create-as vm ss2 --disk-only --no-metadata 
error: block copy still active: domain has active block job

3. Start a blockcommit job in the background, then try another blockcommit/blockpull/blockcopy job.

3.1 inactive commit 
# virsh blockcommit vm vda --top vda[2] --verbose --wait --bandwidth 1 
Block Commit: [ 28 %]

# virsh blockcommit vm  vda --active 
error: Operation not supported: disk 'vda' already in active block job

# virsh blockpull vm vda --verbose --wait 
error: Operation not supported: disk 'vda' already in active block job


# virsh blockcopy vm vda /var/lib/libvirt/images/vm-clone --verbose --wait 
error: Operation not supported: disk 'vda' already in active block job

# virsh snapshot-create-as vm ss3 --disk-only 
error: internal error: unable to execute QEMU command 'transaction': Device 'drive-virtio-disk0' is busy: block device is in use by block job: commit

# virsh blockcommit vm vda --top vda[5] --verbose --wait --bandwidth 1
error: Operation not supported: disk 'vda' already in active block job

[This error is not quite accurate, since there is no vda[5] in the live domain XML; the message could be improved, but it is a minor issue.]


3.2 active commit 

# virsh blockcommit vm vda --active   --verbose --wait --bandwidth 1 
Block Commit: [100 %]
Now in synchronized phase



#  virsh blockcommit vm  vda --active
error: block copy still active: disk 'vda' already in active block job

# virsh blockcommit vm vda --top vda[2] --verbose --wait --bandwidth 1
error: block copy still active: disk 'vda' already in active block job

# virsh blockpull vm vda --verbose --wait 
error: block copy still active: disk 'vda' already in active block job

# virsh blockcopy vm vda /var/lib/libvirt/images/vm-clone --verbose --wait 
error: block copy still active: disk 'vda' already in active block job

# virsh snapshot-create-as vm ss3 --disk-only 
error: block copy still active: domain has active block job

# virsh blockcommit vm vda --top vda[5] --verbose --wait --bandwidth 1
error: block copy still active: disk 'vda' already in active block job

[Again, the error message could be improved, but it is a minor issue.]


Based on the above tests, libvirtd does not restart, and any new block job is rejected while another one is running on the same disk. So, this bug is verified. (A scripted version of this check is sketched below.)
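The manual checks above can also be scripted with libvirt-python. Here is a minimal, hypothetical sketch (the domain name "vm" and disk "vda" are taken from the transcript above; exact error messages may vary) that starts a pull and confirms a second block job on the same disk fails cleanly instead of crashing libvirtd:

import libvirt

conn = libvirt.open("qemu:///system")
dom = conn.lookupByName("vm")      # domain name from the transcript above

# Start a slow pull so the job is still running when we try to collide.
dom.blockPull("vda", 1, 0)         # bandwidth=1 MiB/s, flags=0

# A second job on the same disk must fail cleanly, not restart libvirtd.
try:
    dom.blockCommit("vda", None, None, 1,
                    libvirt.VIR_DOMAIN_BLOCK_COMMIT_ACTIVE)
    print("BUG: second block job was accepted")
except libvirt.libvirtError as e:
    print("rejected as expected:", e.get_error_message())

# If this call still succeeds, libvirtd survived the collision.
print("libvirtd alive, domain state:", dom.state())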

Comment 9 errata-xmlrpc 2015-03-26 17:54:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0742.html

Comment 10 Eric Blake 2015-03-26 20:17:22 UTC
Ouch - we caused a regression. From IRC:

[12:22]	alitke	eblake_, Is virDomainBlockJobAbort(..., VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT) a synchronous operation?
[12:23]	alitke	eblake_, ie. when that returns, can I expect that an immediate call to dom.getXML(0) would always contain an up to date volume chain
[12:26]	alitke	eblake_, I do not use VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC
[14:07]	eblake_	alitke: it can be synchronous (won't return until the pivot is confirmed) or async (return right away, even though the operation may still be flushing)
[14:07]	eblake_	if you don't pass VIR_DOMAIN_BLOCK_JOB_ABORT_ASYNC, then it should be synchronous
[14:08]	alitke	ok.. I am seeing a situation in vdsm where (when multiple merges are happening simultaneously) the pivot call returns and I grab the domain XML only to find that the old volume is still listed as part of the chain.
[14:08]	eblake_	so yes, it sounds like the XML should be up-to-date after the pivot returns
[14:08]	eblake_	but we did have a race situation where we were updating domains without holding a job lock properly
[14:08]	eblake_	which libvirt build are you testing?
[14:09]	alitke	sleeping for a few seconds in my code seems to make it go away
[14:09]	eblake_	I also wonder if the code we added to prevent the races is also causing the updates to be late
[14:09]	alitke	libvirt-1.2.8-16.el7_1.2.x86_64
[14:09]	eblake_	that is, the old code would update the XML as soon as the event arrived without grabbing the job
[14:09]	eblake_	the new code spawns a helper thread that waits to grab a job
[14:10]	eblake_	but since the code doing the pivot also holds a job,
[14:10]	alitke	http://download.devel.redhat.com/brewroot/packages/libvirt/1.2.8/16.el7_1.2/x86_64/
[14:10]	supybot	Title: Index of /brewroot/packages/libvirt/1.2.8/16.el7_1.2/x86_64 (at download.devel.redhat.com)
[14:11]	eblake_	I'm wondering if we are waiting for the event, but not relinquishing the job, and then another thread races to get the XML dump before the deferred domain changes from the event actually take place
[14:11]	eblake_	if so, that's a new bug, and we need a BZ
[14:11]	eblake_	it would be fallout from the fix for bug 1202719
[14:11]	supybot	Bug https://bugzilla.redhat.com/show_bug.cgi?id=1202719 high, high, rc, pkrempa, CLOSED ERRATA, Libvirtd was restarted when do active blockcommit while there is a blockpull job running
[14:12]	jdenemar_afk	eblake_: I wonder why the event is not processed within the thread doing the pivot in the synchronous case...
[14:14]	eblake_	the code in qemu_driver.c:qemuDomainBlockJobImpl() at the waitjob: label doesn't relinquish the job
[14:14]	alitke	eblake_, It's definitely intermittent. I'd love to create a libvirt-python reproducer. Would it need to have a relative backing chain to trigger, do you think?
[14:14]	eblake_	before Peter's patches, the event would update the XML while we were still in the wait loop in the main thread, so things were just fine
[14:15]	eblake_	unless you never used the wait loop (by passing the ASYNC flag), and then you could crash libvirtd by modifying the domain outside a job
[14:15]	eblake_	after Peter's patches, we never modify the domain outside a job, but now that we have two jobs, and no handshaking, it does indeed look like it is possible for the block-job-pivot to return and another command sneak in before the event handling can finally update the domain
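Until that follow-up race is fixed, a client can defend itself by polling the live XML after the pivot instead of sleeping for a fixed interval, as alitke did. A hypothetical libvirt-python sketch (pivot_and_wait is an illustrative helper; the substring check on the XML is a deliberate simplification):

import time
import libvirt

def pivot_and_wait(dom, disk, old_path, timeout=10.0):
    # Pivot a finished block copy / active commit, then poll the live XML
    # until the old volume drops out of the backing chain. This works
    # around the race where the pivot call returns before the event
    # handler's helper thread has updated the domain definition.
    dom.blockJobAbort(disk, libvirt.VIR_DOMAIN_BLOCK_JOB_ABORT_PIVOT)
    deadline = time.monotonic() + timeout
    while old_path in dom.XMLDesc(0):   # naive match; a sketch, not parsing
        if time.monotonic() > deadline:
            raise TimeoutError("volume chain not updated after pivot")
        time.sleep(0.2)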