Bug 873792 - libvirt: cancel migration is sent but migration continues
Summary: libvirt: cancel migration is sent but migration continues
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.3
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: ---
Assignee: Michal Privoznik
QA Contact: Virtualization Bugs
URL:
Whiteboard: infra
Depends On: 728904
Blocks: 867347 875770
Reported: 2012-11-06 17:39 UTC by Dafna Ron
Modified: 2013-02-21 07:26 UTC
CC List: 13 users

Fixed In Version: libvirt-0.10.2-8.el6
Doc Type: Bug Fix
Doc Text:
Libvirt allows users to cancel an ongoing migration. Previously, if an attempt to cancel the migration was made in the migration preparation phase, qemu missed the request and the migration was not canceled. With this update, the virDomainAbortJob() function sets a flag when a cancel request is made and this flag is checked before the main phase of the migration starts. As a result, a migration can now be properly canceled even in the preparation phase.
Clone Of:
Environment:
Last Closed: 2013-02-21 07:26:00 UTC
Target Upstream Version:


Attachments
logs (1.37 MB, application/x-gzip)
2012-11-06 17:39 UTC, Dafna Ron


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2013:0276 normal SHIPPED_LIVE Moderate: libvirt security, bug fix, and enhancement update 2013-02-20 21:18:26 UTC

Description Dafna Ron 2012-11-06 17:39:29 UTC
Created attachment 639477
logs

Description of problem:

I cancelled the migration of a VM right after it started, and the migration continued.
Michal looked at the logs and determined that vdsm issued 'migrate_cancel' before libvirt got to 'migrate'.

Version-Release number of selected component (if applicable):

vdsm-4.9.6-41.0.el6_3.x86_64
libvirt-0.9.10-21.el6_3.5.x86_64
qemu-kvm-rhev-0.12.1.2-2.295.el6_3.5.x86_64

How reproducible:

100%

Steps to Reproduce:
1. Migrate a VM from one host to a second host.
2. Cancel the migration right at the beginning of the migration.

Actual results:

migration continues

Expected results:

migration should be cancelled. 

Additional info: logs

Comment 1 Michal Privoznik 2012-11-06 17:47:04 UTC
This is a race. The first thread (the one that starts the migration) sets the job and prepares to issue monitor commands to qemu. Meanwhile, a second thread jumps in and, since the job is already set, cancels the job, issues 'migrate_cancel', and quits. Right after that, the first thread finishes its preparation and starts the migration (executes 'migrate' on the monitor).
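
The interleaving described in this comment can be illustrated with a small toy model. This is only a sketch in Python, not libvirt code: the class MigrationJob, the function run(), and the monitor_log list are hypothetical names invented for illustration, and the events merely force the "cancel arrives during preparation" ordering so the outcome is repeatable. With check_abort_flag=False the model behaves like the bug (the cancel is sent, then 'migrate' is issued anyway). With check_abort_flag=True it follows the approach summarized in the Doc Text above: the abort request sets a flag, and the flag is checked before the main phase starts. Whether the real fix still sends 'migrate_cancel' in that case is not stated in this report; the model keeps it only for comparison.

```python
import threading


class MigrationJob:
    """Toy model of a migration job (hypothetical, not libvirt code)."""

    def __init__(self, check_abort_flag):
        self.check_abort_flag = check_abort_flag
        self.lock = threading.Lock()
        self.abort_requested = False
        self.monitor_log = []  # commands "sent" to an imaginary qemu monitor
        self.job_set = threading.Event()
        self.cancel_done = threading.Event()

    def migrate(self):
        # Thread 1: set the job, then spend time in the preparation phase.
        self.job_set.set()
        # Force the racy ordering: the cancel lands during preparation.
        self.cancel_done.wait()
        with self.lock:
            # The fix in this model: honour an abort requested earlier.
            if self.check_abort_flag and self.abort_requested:
                return "canceled"
            self.monitor_log.append("migrate")
        return "migrating"

    def abort(self):
        # Thread 2: the job is already set, so the cancel is accepted.
        self.job_set.wait()
        with self.lock:
            self.abort_requested = True
            self.monitor_log.append("migrate_cancel")
        self.cancel_done.set()


def run(check_abort_flag):
    """Run one migrate/abort pair and return (result, monitor_log)."""
    job = MigrationJob(check_abort_flag)
    result = []
    t1 = threading.Thread(target=lambda: result.append(job.migrate()))
    t2 = threading.Thread(target=job.abort)
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return result[0], job.monitor_log


if __name__ == "__main__":
    print(run(False))  # cancel is sent too early, migration still starts
    print(run(True))   # flag is checked before the main phase
```

In the model, the key design point is that the abort request is recorded in state that outlives the cancel call, so the migrating thread can notice it even though the monitor command itself arrived too early to have any effect.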

Comment 2 Jiri Denemark 2012-11-06 18:21:53 UTC
This is also tracked upstream in bug 728904.

Comment 3 Michal Privoznik 2012-11-07 11:04:23 UTC
Patch proposed upstream:

https://www.redhat.com/archives/libvir-list/2012-November/msg00356.html

Comment 4 weizhang 2012-11-08 06:13:41 UTC
I can reproduce with
# virsh migrate --live vr-rhel6u3-x86_64-kvm qemu+ssh://10.66.84.16/system --verbose & usleep 700000; virsh domjobabort vr-rhel6u3-x86_64-kvm; echo "$?"

domjobabort returned 0, but the migration continued.

version
libvirt-0.10.2-7.el6.x86_64
qemu-kvm-0.12.1.2-2.323.el6.x86_64
kernel-2.6.32-329.el6.x86_64

Comment 8 weizhang 2012-11-14 12:19:24 UTC
Verification passed on

libvirt-0.10.2-8.el6.x86_64
qemu-kvm-0.12.1.2-2.323.el6.x86_64
kernel-2.6.32-329.el6.x86_64


# virsh migrate --live vr-rhel6u3-x86_64-kvm qemu+ssh://10.66.84.16/system --verbose & usleep 500000; virsh domjobabort vr-rhel6u3-x86_64-kvm

error: operation aborted: migration out: canceled by client

Comment 9 errata-xmlrpc 2013-02-21 07:26:00 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHSA-2013-0276.html

