| Summary: | terminate migration with ctrl-c; migrating the same guest again fails | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 6 | Reporter: | Takuma Umeya <tumeya> |
| Component: | libvirt | Assignee: | Jiri Denemark <jdenemar> |
| Status: | CLOSED ERRATA | QA Contact: | Virtualization Bugs <virt-bugs> |
| Severity: | high | Docs Contact: | |
| Priority: | high | ||
| Version: | 6.0 | CC: | dallan, dyuan, eblake, jens.osterkamp, jyang, ltroan, myamazak, mzhan, weizhan, xen-maint, yoyzhang |
| Target Milestone: | beta | ||
| Target Release: | 6.2 | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | libvirt-0.9.3-1.el6 | Doc Type: | Bug Fix |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2011-12-06 10:55:24 UTC | Type: | --- |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Bug Depends On: | |||
| Bug Blocks: | 698812, 658636, 693512, 696653, 697582 | ||
*** Bug 677884 has been marked as a duplicate of this bug. ***
*** Bug 697813 has been marked as a duplicate of this bug. ***
One part of this was fixed upstream as v0.9.2-193-g2c2effa:
commit 2c2effa1d7000aa97140c734a6be6e8d40df1022
Author: Daniel P. Berrange <berrange>
Date: Thu Jun 23 11:03:57 2011 +0100
Automatically kill target QEMU if migration aborts abnormally
Migration is a multi-step process
1. Begin(src)
2. Prepare(dst)
3. Perform(src)
4. Finish(dst)
5. Confirm(src)
At step 2, a QEMU process is launched in the destination to
accept the incoming migration. Occasionally the process
that is controlling the migration workflow aborts, and fails
to call step 4, Finish. This leaves a QEMU process running
on the target (albeit with paused CPUs). Unfortunately because
step 2 activates a job on the QEMU process, it is unkillable by
normal means.
By registering the VM for autokill against the src virConnectPtr
in step 2, we can ensure that the guest is forcefully killed off
if the connection is closed without step 4 being invoked.
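The autokill idea in the commit message above can be sketched as a small model. This is not libvirt code: `struct connection`, `struct target_vm`, `prepare_phase`, `finish_phase` and `connection_close` are hypothetical names, and the assumption that Finish clears the registration is inferred only from the wording "closed without step 4 being invoked".

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a target-side QEMU process. */
struct target_vm {
    bool running;   /* true while the destination QEMU process exists */
    bool autokill;  /* true while registered for kill-on-close */
};

/* Hypothetical model of the source connection; tracks one VM only. */
struct connection {
    struct target_vm *autokill_vm;
};

/* Step 2, Prepare(dst): start the target VM and register it for autokill. */
void prepare_phase(struct connection *conn, struct target_vm *vm)
{
    vm->running = true;
    vm->autokill = true;
    conn->autokill_vm = vm;
}

/* Step 4, Finish(dst): migration completed, so drop the registration
 * (assumed behaviour in this sketch). */
void finish_phase(struct connection *conn, struct target_vm *vm)
{
    vm->autokill = false;
    if (conn->autokill_vm == vm)
        conn->autokill_vm = NULL;
}

/* Connection teardown: kill any VM that is still registered. */
void connection_close(struct connection *conn)
{
    struct target_vm *vm = conn->autokill_vm;

    if (vm != NULL && vm->autokill) {
        vm->running = false;
        vm->autokill = false;
    }
    conn->autokill_vm = NULL;
}
```

In this model, calling `prepare_phase` and then `connection_close` without `finish_phase` leaves `running` false, which corresponds to the aborted-migration case. Calling `finish_phase` first leaves the VM running after the connection closes.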
Another part of this is to allow virDomainAbortJob to be run in any phase of migration and on either source or target instead of just within the Perform phase. This will be fixed by the migration fixes series made for bug 690175.
Verification passed on:
kernel-2.6.32-166.el6.x86_64
qemu-kvm-0.12.1.2-2.169.el6.x86_64
libvirt-0.9.3-5.el6.x86_64
Steps are as shown in the Description.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

http://rhn.redhat.com/errata/RHBA-2011-1513.html
Description of problem:
This issue occurred with 6.1a. Migrating a guest again fails when the previous attempt was terminated with ctrl-c.

Version-Release number of selected component (if applicable):
(This issue occurred with 6.1a)
libvirt-0.8.7-6.el6

How reproducible:
Always.

Steps to Reproduce:
1. Run migrate command: virsh migrate --live --verbose guest_name qemu+ssh://192.168.x.x/system
2. Press ctrl-C to terminate the attempt.
3. Run the command again: virsh migrate --live --verbose guest_name qemu+ssh://192.168.x.x/system

Actual results:
The migration fails with the following error.
error: Timed out during operation: cannot acquire state change lock

Expected results:
The migration should succeed.

Additional info:
From the look of it, when it received Ctrl-C it didn't go through the proper exit routine (probably qemuDomainObjEndJob), which left jobActive set to QEMU_JOB_MIGRATION_IN rather than QEMU_JOB_NONE. That caused the next migration to fail at the following point:

    int qemuDomainObjBeginJobWithDriver(struct qemud_driver *driver, virDomainObjPtr obj)
    {
        ...
        while (priv->jobActive) {
            if (virCondWaitUntil(&priv->jobCond, &obj->lock, then) < 0) {
                virDomainObjUnref(obj);
                if (errno == ETIMEDOUT)
                    qemuReportError(VIR_ERR_OPERATION_TIMEOUT,
                                    "%s", _("cannot acquire state change lock"));
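The failure mode described in the Additional info can be reproduced in a standalone sketch that uses plain POSIX threads instead of libvirt's wrappers. `struct job_state`, `begin_job` and `end_job` are hypothetical stand-ins for the job bookkeeping, not libvirt functions. The point is only that a job flag which is never cleared makes every later timed wait end in ETIMEDOUT.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical stand-in for the per-domain job state. */
struct job_state {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int job_active;     /* non-zero while some job owns the domain */
};

/* Try to start a job; wait up to timeout_ms for a running job to end.
 * Returns 0 on success or ETIMEDOUT if the previous job never ended. */
int begin_job(struct job_state *s, int timeout_ms)
{
    struct timespec deadline;
    int rc = 0;

    /* Build an absolute deadline, as pthread_cond_timedwait expects. */
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (long)(timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&s->lock);
    while (s->job_active) {
        rc = pthread_cond_timedwait(&s->cond, &s->lock, &deadline);
        if (rc != 0)
            break;      /* ETIMEDOUT: "cannot acquire state change lock" */
    }
    if (rc == 0)
        s->job_active = 1;
    pthread_mutex_unlock(&s->lock);
    return rc;
}

/* End the current job and wake one waiter. Skipping this call is the
 * situation the report describes after Ctrl-C. */
void end_job(struct job_state *s)
{
    pthread_mutex_lock(&s->lock);
    s->job_active = 0;
    pthread_cond_signal(&s->cond);
    pthread_mutex_unlock(&s->lock);
}
```

With a state initialised from `PTHREAD_MUTEX_INITIALIZER` and `PTHREAD_COND_INITIALIZER`, a first `begin_job` succeeds. A second call without an intervening `end_job` returns ETIMEDOUT after the timeout, and it succeeds again once `end_job` has run.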