Bug 519204 - After migration, paused VM is running on destination
Summary: After migration, paused VM is running on destination
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: libvirt
Version: 5.5
Hardware: All
OS: Linux
Priority: low
Severity: medium
Target Milestone: rc
Target Release: ---
Assignee: Paolo Bonzini
QA Contact: Virtualization Bugs
URL:
Whiteboard:
Duplicates: 526206
Depends On:
Blocks: 503367
Reported: 2009-08-25 16:03 UTC by Paolo Bonzini
Modified: 2013-01-09 21:53 UTC

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of: 503367
Environment:
Last Closed: 2010-03-30 08:11:12 UTC


Attachments
backport of upstream d1ec4d7a, run cont on successful migration finish (1/5) (1.53 KB, patch)
2009-11-26 14:10 UTC, Paolo Bonzini
backport of upstream cbcf5ba7, fix QEMU domain status after restore (2/5) (1.01 KB, patch)
2009-11-26 14:11 UTC, Paolo Bonzini
patch posted upstream, fix migration of paused vms upon failure (3/5) (1.92 KB, patch)
2009-11-26 14:11 UTC, Paolo Bonzini
patch posted upstream, add virsh migrate --suspend (4/5) (6.04 KB, patch)
2009-11-26 14:11 UTC, Paolo Bonzini
patch posted upstream, retrieve paused/running state at beginning of migration (5/5) (2.43 KB, patch)
2009-11-26 14:12 UTC, Paolo Bonzini
patch posted upstream, retrieve paused/running state at beginning of migration (5/5) (2.79 KB, patch)
2009-12-09 13:33 UTC, Paolo Bonzini
377188: patch posted upstream, retrieve paused/running state at beginning of migration (5/5) (2.44 KB, patch)
2009-12-10 09:23 UTC, Paolo Bonzini


Links
System | ID | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2010:0205 | normal | SHIPPED_LIVE | libvirt bug fix and enhancement update | 2010-03-29 12:27:37 UTC

Description Paolo Bonzini 2009-08-25 16:03:10 UTC
+++ This bug was initially created as a clone of Bug #503367 +++

Description of problem:

After migration of a paused VM completes successfully, the VM on the destination is running instead of remaining paused.

--- Additional comment from danken@redhat.com on 2009-06-08 16:21:58 EDT ---

I hope the summary is clearer now, and that it's clearer that this is a qemu-kvm issue: qemu-kvm forgets its "paused" status after migrating.

--- Additional comment from ykaul@redhat.com on 2009-06-23 10:47:50 EDT ---

If it's paused because of lack of storage, isn't that dangerous?

--- Additional comment from danken@redhat.com on 2009-06-23 15:55:18 EDT ---

(In reply to comment #2)
> If it's paused because of lack of storage, isn't that dangerous?  

I don't think so. At worst, the retried write() would fail (and the VM would be stopped), just like the first one.

--- Additional comment from pbonzini@redhat.com on 2009-08-04 07:22:27 EDT ---

The bug is a dup of 510459.  This one has the flags, that one has the patches.  Which one should be closed?

--- Additional comment from dlaor@redhat.com on 2009-08-04 08:00:20 EDT ---

*** Bug 510459 has been marked as a duplicate of this bug. ***

--- Additional comment from pbonzini@redhat.com on 2009-08-13 08:51:02 EDT ---

Created an attachment (id=357312)
updated patch

--- Additional comment from danken@redhat.com on 2009-08-13 10:13:09 EDT ---

What happens if the VM is paused after migration has started? Why can't this bit of information (CPU paused/running) be transferred within the migration protocol? In my opinion, this is an (important) part of the machine state.

--- Additional comment from pbonzini@redhat.com on 2009-08-13 11:21:50 EDT ---

QEMU's state does not include whether the CPU is running.  libvirt has to fake it.

I'll soon open a clone of this bug, because there are some migration-related libvirt parts (the patch I have attached so far is enough for save/restore, but not for migration). Still, I'd rather avoid changes to the libvirt protocol, and these would be required to transmit the running/paused state at the end of migration. The reason is that there is currently no way for libvirt's Perform Migration step (running on the source) to pass information to the Finish Migration step (running on the destination after Perform has finished).

--- Additional comment from danken@redhat.com on 2009-08-13 11:33:04 EDT ---

(In reply to comment #8)
> QEMU's state does not include whether the CPU is running.  libvirt has to fake
> it.

I know, and I think this is the real bug here.

--- Additional comment from pbonzini@redhat.com on 2009-08-13 11:36:29 EDT ---

Actually I agree, but I think the simpler implementation picking up the state at the beginning would be good enough.  After all it would only affect live migration, and I'm not sure why you'd migrate live if you plan to suspend the VM.

What if you opened the libvirt clone for this, so we can discuss there whether it's okay to change the protocol?
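
A minimal sketch of the "pick up the state at the beginning" idea, under stated assumptions: the names `State`, `Guest`, `MIGRATE_PAUSED`, `begin_migration`, `finish_migration` and `migrate` are hypothetical and are neither libvirt APIs nor taken from the attached patches. The source state is sampled once, encoded in a flag that both steps receive, and the destination resumes only when the flag is clear. The `pause_during_transfer` parameter illustrates the limitation raised above: a guest that pauses after sampling is still resumed on the destination.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    RUNNING = "running"
    PAUSED = "paused"


# Hypothetical flag bit for this sketch; not taken from the patches.
MIGRATE_PAUSED = 1 << 0


@dataclass
class Guest:
    state: State


def begin_migration(source: Guest, flags: int = 0) -> int:
    """Sample the source state once, before any data is transferred."""
    if source.state is State.PAUSED:
        flags |= MIGRATE_PAUSED
    return flags


def finish_migration(flags: int) -> Guest:
    """Destination side: resume only if the flags do not ask for a paused guest."""
    dest = Guest(State.PAUSED)      # the incoming guest starts with CPUs stopped
    if not flags & MIGRATE_PAUSED:
        dest.state = State.RUNNING  # stands in for the monitor "cont" command
    return dest


def migrate(source: Guest, pause_during_transfer: bool = False) -> Guest:
    """Run both steps; optionally pause the source after the state was sampled."""
    flags = begin_migration(source)
    if pause_during_transfer:
        # The guest stops mid-migration, so the sampled flags are now stale.
        source.state = State.PAUSED
    return finish_migration(flags)
```

Because the flag is computed before the transfer, no information has to flow from Perform to Finish, which is the property the simpler approach relies on.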

--- Additional comment from berrange@redhat.com on 2009-08-13 11:49:14 EDT ---

There is no problem in libvirt at this time, since it does not allow the user to pause a VM after migration starts. libvirt merely requires QEMU to actually honour the '-S' flag that was given on the command line.

--- Additional comment from pbonzini@redhat.com on 2009-08-13 12:11:09 EDT ---

There is definitely a problem in libvirt: "cont" is executed unconditionally during the Finish step on the remote VM.  While the qemu patch here is enough for save/restore, libvirt needs more care as well.

I haven't tested whether it's possible to suspend a VM concurrently with live migration, though it was on my testing todo list after Dan's comment.
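
The unconditional "cont" described above can be illustrated with a toy model. This is only a sketch: `FakeMonitor`, `finish_unconditional` and `finish_conditional` are hypothetical names, not libvirt code, and the monitor merely records the commands it is sent. The first function mirrors the reported behaviour; the second assumes the caller somehow knows whether the guest should stay paused.

```python
class FakeMonitor:
    """Toy stand-in for a QEMU monitor connection; it only records commands."""

    def __init__(self):
        self.cpus_running = False  # models a guest started with -S (CPUs stopped)
        self.sent = []

    def command(self, cmd: str) -> None:
        self.sent.append(cmd)
        if cmd == "cont":
            self.cpus_running = True
        elif cmd == "stop":
            self.cpus_running = False


def finish_unconditional(mon: FakeMonitor, migration_ok: bool) -> bool:
    """Reported behaviour: always resume after a successful migration."""
    if migration_ok:
        mon.command("cont")
    return mon.cpus_running


def finish_conditional(mon: FakeMonitor, migration_ok: bool, want_paused: bool) -> bool:
    """Assumed fix: send "cont" only if the guest should not stay paused."""
    if migration_ok and not want_paused:
        mon.command("cont")
    return mon.cpus_running
```

With `want_paused=True` the conditional variant sends nothing, so a guest that arrived paused stays paused.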

--- Additional comment from danken@redhat.com on 2009-08-17 07:02:21 EDT ---

(In reply to comment #12)

> I haven't experimented whether it's possible to suspend a VM concurrently to
> live migration.

With -drive ...,werror=stop, qemu can suspend its guest whenever it is doing I/O.

Comment 1 Paolo Bonzini 2009-08-25 16:04:03 UTC
The above trail from 503367 suggests a more ambitious plan, but in any case for RHEL5.5 we have to backport upstream d1ec4d7a.  This bug tracks that backport.

Comment 2 Paolo Bonzini 2009-08-25 17:28:30 UTC
Also needing backport: cbcf5ba7

Comment 3 Paolo Bonzini 2009-11-26 14:10:54 UTC
Created attachment 374007
backport of upstream d1ec4d7a, run cont on successful migration finish (1/5)

Comment 4 Paolo Bonzini 2009-11-26 14:11:19 UTC
Created attachment 374008
backport of upstream cbcf5ba7, fix QEMU domain status after restore (2/5)

Comment 5 Paolo Bonzini 2009-11-26 14:11:39 UTC
Created attachment 374009
patch posted upstream, fix migration of paused vms upon failure (3/5)

Comment 6 Paolo Bonzini 2009-11-26 14:11:53 UTC
Created attachment 374010
patch posted upstream, add virsh migrate --suspend (4/5)

Comment 7 Paolo Bonzini 2009-11-26 14:12:03 UTC
Created attachment 374011
patch posted upstream, retrieve paused/running state at beginning of migration (5/5)

Comment 8 Paolo Bonzini 2009-12-09 13:33:52 UTC
Created attachment 377188
patch posted upstream, retrieve paused/running state at beginning of migration (5/5)

Patch with the fixes suggested upstream.

Comment 9 Paolo Bonzini 2009-12-10 09:23:46 UTC
Created attachment 377410
377188: patch posted upstream, retrieve paused/running state at beginning of migration (5/5)

Comment 10 Daniel Veillard 2009-12-10 13:47:39 UTC
libvirt-0.6.3-24.el5 has been built in dist-5E-qu-candidate with the fixes

Daniel

Comment 12 Juan Quintela 2009-12-17 11:52:50 UTC
*** Bug 526206 has been marked as a duplicate of this bug. ***

Comment 18 errata-xmlrpc 2010-03-30 08:11:12 UTC
An advisory has been issued which should help the problem
described in this bug report. This report is therefore being
closed with a resolution of ERRATA. For more information
on the solution and/or where to find the updated files,
please follow the link below. You may reopen this bug report
if the solution does not work for you.

http://rhn.redhat.com/errata/RHBA-2010-0205.html

