Bug 682953 - terminate migration with ctrl-c; migrating the same guest again fails
Summary: terminate migration with ctrl-c; migrating the same guest again fails
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 6
Classification: Red Hat
Component: libvirt
Version: 6.0
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: beta
Target Release: 6.2
Assignee: Jiri Denemark
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 677884 697813
Depends On:
Blocks: GSS_6_2_PROPOSED 693512 697582 698812 696653
Reported: 2011-03-08 05:51 UTC by Takuma Umeya
Modified: 2018-11-14 19:18 UTC (History)
CC List: 11 users

Fixed In Version: libvirt-0.9.3-1.el6
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2011-12-06 10:55:24 UTC
Target Upstream Version:



Links
System: Red Hat Product Errata
ID: RHBA-2011:1513
Priority: normal
Status: SHIPPED_LIVE
Summary: libvirt bug fix and enhancement update
Last Updated: 2011-12-06 01:23:30 UTC

Description Takuma Umeya 2011-03-08 05:51:15 UTC
Description of problem:
This issue occurred with 6.1a. When a migration attempt is terminated with ctrl-c, every subsequent attempt to migrate the same guest fails.

Version-Release number of selected component (if applicable):
(This issue occurred with 6.1a) 
 libvirt-0.8.7-6.el6

How reproducible:
Always. 

Steps to Reproduce:
1. Run migrate command: 
    virsh migrate --live --verbose guest_name qemu+ssh://192.168.x.x/system 
2. Press ctrl-C to terminate the attempt. 
3. Run the command again: 
    virsh migrate --live --verbose guest_name qemu+ssh://192.168.x.x/system
  
Actual results:
The migration fails with the following error. 
        error: Timed out during operation: cannot acquire state change lock

Expected results:
The migration should succeed. 

Additional info:
It appears that when the migration was interrupted with Ctrl-C, the proper exit
routine (probably qemuDomainObjEndJob) was not run. That left jobActive set to
QEMU_JOB_MIGRATION_IN rather than QEMU_JOB_NONE, so the next migration attempt
fails at the following point:

int qemuDomainObjBeginJobWithDriver(struct qemud_driver *driver,
                                    virDomainObjPtr obj)
{
...
    while (priv->jobActive) {
        if (virCondWaitUntil(&priv->jobCond, &obj->lock, then) < 0) {
            virDomainObjUnref(obj);
            if (errno == ETIMEDOUT)
                qemuReportError(VIR_ERR_OPERATION_TIMEOUT,
                                "%s", _("cannot acquire state change lock"));

Comment 4 Dave Allan 2011-06-10 02:19:52 UTC
*** Bug 677884 has been marked as a duplicate of this bug. ***

Comment 5 Dave Allan 2011-06-15 17:07:18 UTC
*** Bug 697813 has been marked as a duplicate of this bug. ***

Comment 6 Jiri Denemark 2011-07-14 15:27:57 UTC
One part of this was fixed upstream as v0.9.2-193-g2c2effa:

commit 2c2effa1d7000aa97140c734a6be6e8d40df1022
Author: Daniel P. Berrange <berrange@redhat.com>
Date:   Thu Jun 23 11:03:57 2011 +0100

    Automatically kill target QEMU if migration aborts abnormally
    
    Migration is a multi-step process
    
      1. Begin(src)
      2. Prepare(dst)
      3. Perform(src)
      4. Finish(dst)
      5. Confirm(src)
    
    At step 2, a QEMU process is launched in the destination to
    accept the incoming migration. Occasionally the process
    that is controlling the migration workflow aborts, and fails
    to call step 4, Finish. This leaves a QEMU process running
    on the target (albeit with paused CPUs). Unfortunately because
    step 2 activates a job on the QEMU process, it is unkillable by
    normal means.
    
    By registering the VM for autokill against the src virConnectPtr
    in step 2, we can ensure that the guest is forcefully killed off
    if the connection is closed without step 4 being invoked
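
The autokill idea in the commit message above can be illustrated with a small model. This is a sketch under stated assumptions, not the libvirt implementation: struct toy_dest_guest and the toy_* functions are hypothetical, and the model assumes that Finish (step 4) is what removes the autokill registration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the guest on the destination host. */
struct toy_dest_guest {
    bool running;   /* destination QEMU process exists */
    bool autokill;  /* registered against the controlling connection */
};

/* Step 2, Prepare(dst): start the process and register it for autokill. */
void toy_prepare(struct toy_dest_guest *g)
{
    g->running = true;
    g->autokill = true;
}

/* Step 4, Finish(dst): migration completed, so the guest must now
 * survive the connection being closed (assumption of this sketch). */
void toy_finish(struct toy_dest_guest *g)
{
    assert(g->running);  /* Finish only makes sense after Prepare */
    g->autokill = false;
}

/* Runs whenever the controlling connection goes away, for any reason. */
void toy_connection_closed(struct toy_dest_guest *g)
{
    if (g->autokill) {
        g->running = false;  /* kill the half-migrated guest */
        g->autokill = false;
    }
}
```

In this model, Prepare followed by a dropped connection leaves no guest on the destination, while Prepare, Finish and then a closed connection leaves the guest running.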

The other part is to allow virDomainAbortJob to be run in any phase of migration, on either the source or the target, instead of only within the Perform phase. This will be fixed by the migration fixes series made for bug 690175.

Comment 8 weizhang 2011-07-18 08:30:42 UTC
Verified; passes with:
kernel-2.6.32-166.el6.x86_64
qemu-kvm-0.12.1.2-2.169.el6.x86_64
libvirt-0.9.3-5.el6.x86_64

The steps followed are those given in the Description.

Comment 9 errata-xmlrpc 2011-12-06 10:55:24 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html

