Bug 682953

Summary: terminate migration with ctrl-c; migrating the same guest again fails
Product: Red Hat Enterprise Linux 6 Reporter: Takuma Umeya <tumeya>
Component: libvirt Assignee: Jiri Denemark <jdenemar>
Status: CLOSED ERRATA QA Contact: Virtualization Bugs <virt-bugs>
Severity: high Docs Contact:
Priority: high    
Version: 6.0 CC: dallan, dyuan, eblake, jens.osterkamp, jyang, ltroan, myamazak, mzhan, weizhan, xen-maint, yoyzhang
Target Milestone: beta   
Target Release: 6.2   
Hardware: All   
OS: Linux   
Whiteboard:
Fixed In Version: libvirt-0.9.3-1.el6 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2011-12-06 10:55:24 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 698812, 658636, 693512, 696653, 697582    

Description Takuma Umeya 2011-03-08 05:51:15 UTC
Description of problem:
This issue occurred with 6.1a. After a migration attempt is terminated with Ctrl-C, migrating the same guest again fails.

Version-Release number of selected component (if applicable):
(This issue occurred with 6.1a) 
 libvirt-0.8.7-6.el6

How reproducible:
Always. 

Steps to Reproduce:
1. Run migrate command: 
    virsh migrate --live --verbose guest_name qemu+ssh://192.168.x.x/system 
2. Press ctrl-C to terminate the attempt. 
3. Run the command again: 
    virsh migrate --live --verbose guest_name qemu+ssh://192.168.x.x/system
  
Actual results:
The migration fails with the following error. 
        error: Timed out during operation: cannot acquire state change lock

Expected results:
The migration should succeed. 

Additional info:
It looks as though, on receiving Ctrl-C, the migration did not go through the
proper exit routine (probably qemuDomainObjEndJob). That left jobActive set to
QEMU_JOB_MIGRATION_IN rather than QEMU_JOB_NONE, which causes the next
migration attempt to fail at the following point:

int qemuDomainObjBeginJobWithDriver(struct qemud_driver *driver,
                                    virDomainObjPtr obj)
{
...
    while (priv->jobActive) {
        if (virCondWaitUntil(&priv->jobCond, &obj->lock, then) < 0) {
            virDomainObjUnref(obj);
            if (errno == ETIMEDOUT)
                qemuReportError(VIR_ERR_OPERATION_TIMEOUT,
                                "%s", _("cannot acquire state change lock"));

Comment 4 Dave Allan 2011-06-10 02:19:52 UTC
*** Bug 677884 has been marked as a duplicate of this bug. ***

Comment 5 Dave Allan 2011-06-15 17:07:18 UTC
*** Bug 697813 has been marked as a duplicate of this bug. ***

Comment 6 Jiri Denemark 2011-07-14 15:27:57 UTC
One part of this was fixed upstream as v0.9.2-193-g2c2effa:

commit 2c2effa1d7000aa97140c734a6be6e8d40df1022
Author: Daniel P. Berrange <berrange>
Date:   Thu Jun 23 11:03:57 2011 +0100

    Automatically kill target QEMU if migration aborts abnormally
    
    Migration is a multi-step process
    
      1. Begin(src)
      2. Prepare(dst)
      3. Perform(src)
      4. Finish(dst)
      5. Confirm(src)
    
    At step 2, a QEMU process is launched in the destination to
    accept the incoming migration. Occasionally the process
    that is controlling the migration workflow aborts, and fails
    to call step 4, Finish. This leaves a QEMU process running
    on the target (albeit with paused CPUs). Unfortunately because
    step 2 activates a job on the QEMU process, it is unkillable by
    normal means.
    
    By registering the VM for autokill against the src virConnectPtr
    in step 2, we can ensure that the guest is forcefully killed off
    if the connection is closed without step 4 being invoked
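
The autokill idea in the commit message above can be sketched as a toy model. This is not libvirt's implementation: the demo_ names, the fixed-size table, and the integer conn_id standing in for the source virConnectPtr are all assumptions made for illustration. Prepare marks the guest for autokill, Finish clears the mark, and the connection-close hook kills whatever is still marked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_GUESTS 8

/* Hypothetical record for a destination QEMU process awaiting migration. */
struct demo_guest {
    bool in_use;    /* slot is occupied */
    bool running;   /* process is alive (CPUs paused until Finish) */
    bool autokill;  /* kill the process if the owning connection closes */
    int conn_id;    /* stand-in for the source virConnectPtr */
};

struct demo_registry {
    struct demo_guest guests[DEMO_MAX_GUESTS];
};

/* Step 2, Prepare(dst): start the guest and register it for autokill.
 * Returns the slot index, or -1 if the table is full. */
int demo_prepare(struct demo_registry *reg, int conn_id)
{
    assert(reg != NULL);
    for (int i = 0; i < DEMO_MAX_GUESTS; i++) {
        struct demo_guest *g = &reg->guests[i];
        if (!g->in_use) {
            g->in_use = true;
            g->running = true;
            g->autokill = true;
            g->conn_id = conn_id;
            return i;
        }
    }
    return -1;
}

/* Step 4, Finish(dst): migration completed, so drop the autokill mark. */
int demo_finish(struct demo_registry *reg, int slot)
{
    assert(reg != NULL);
    if (slot < 0 || slot >= DEMO_MAX_GUESTS || !reg->guests[slot].in_use)
        return -1;
    reg->guests[slot].autokill = false;
    return 0;
}

/* Connection-close hook: kill every guest of this connection that is still
 * marked for autokill. Returns the number of guests killed. */
int demo_connection_closed(struct demo_registry *reg, int conn_id)
{
    int killed = 0;

    assert(reg != NULL);
    for (int i = 0; i < DEMO_MAX_GUESTS; i++) {
        struct demo_guest *g = &reg->guests[i];
        if (g->in_use && g->autokill && g->conn_id == conn_id) {
            g->running = false;  /* forcefully kill the stale process */
            g->in_use = false;   /* free the slot */
            killed++;
        }
    }
    return killed;
}
```

With two guests prepared on the same connection and only one of them finished, closing the connection kills just the unfinished one. That corresponds to the Ctrl-C case, where the controlling client goes away before Finish is called.
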

Another part of this is to allow virDomainAbortJob to be run in any phase of migration and on either source or target instead of just within the Perform phase. This will be fixed by the migration fixes series made for bug 690175.

Comment 8 weizhang 2011-07-18 08:30:42 UTC
Verification passed on:
kernel-2.6.32-166.el6.x86_64
qemu-kvm-0.12.1.2-2.169.el6.x86_64
libvirt-0.9.3-5.el6.x86_64

The steps followed are those given in the Description.

Comment 9 errata-xmlrpc 2011-12-06 10:55:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html