Bug 1374793

Summary: RFE: support moving a running domain to an updated QEMU binary
Product: [Community] Virtualization Tools
Component: libvirt
Reporter: David Jaša <djasa>
Assignee: Libvirt Maintainers <libvirt-maint>
Status: NEW ---
QA Contact: Fangge Jin <fjin>
Severity: medium
Docs Contact:
Priority: unspecified
Version: unspecified
CC: berrange, dyuan, jdenemar, libvirt-maint, mprivozn, xuzhang, zpeng
Target Milestone: ---
Keywords: FutureFeature
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed:
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:

Description David Jaša 2016-09-09 16:10:51 UTC
Description of problem:
Please allow migration within the same libvirt session, i.e. with the same host as both source and destination. There are two use cases:
  * after critical qemu security fixes, enable a rapid move of running VMs to the updated qemu
  * migration testing: easy to set up and more likely to hit race conditions
IMO the first one alone is good enough to justify implementing the RFE, and the RHEV Virt team is in favour of the idea once libvirt supports it.

Migration to the same host works fine when done with plain qemu, so the support should be trivial. There is room for optimization, of course, mainly for memory: a way to migrate without copying VM memory would give further speed gains and make it possible to migrate monster VMs on hosts with free memory < VM memory. It still makes sense to have this ability without those optimizations.


Version-Release number of selected component (if applicable):
libvirt-daemon-2.0.0-6.el7.x86_64

How reproducible:
always

Steps to Reproduce:
1. Try to migrate within the same libvirt session (source and destination URI are identical), e.g.:
virsh -c LIBVIRT_URI migrate VM_NAME LIBVIRT_URI

Actual results:
migration is refused

Expected results:
migration succeeds

Additional info:

Comment 1 Daniel Berrangé 2016-09-09 16:15:26 UTC
(In reply to David Jaša from comment #0)
> Description of problem:
> Please allow migration within the same session. There are two use cases:
>   * after critical qemu security fixes, enable rapid move of VMs to updated
> qemu
>   * migration testing: easy to setup and more likely to hit race conditions
> IMO first one is good enough to implement the RFE and RHEV Virt team is in
> favour of the idea once libvirt supports it.
> 
> Migration works just ok when done on plain qemu so the support should be
> trivial. 

Nope, it is not at all trivial. When managing QEMU processes, libvirt sets up many resources on the host which are tied to the VM, typically based on its name and/or a combination of name + uuid. You can't have two copies of the same VM running on the same host without getting clashes in these resources. Making libvirt support multiple different naming schemes for the same VM would have a ripple effect across the codebase and result in bugs which only ever appear when doing local-host migrations.
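
To make the clash concrete, here is a minimal sketch in Python. It is not libvirt code: the function names and the path templates under `/run/example`, `/var/log/example` and `/var/lib/example` are hypothetical and only stand in for "host resources named after the VM". It shows that when every resource name is derived from name and uuid alone, the source and target instances of a local-host migration compute identical names for everything.

```python
def host_resources(name, uuid):
    """Build host resource names from the VM name and uuid only.

    The templates are illustrative placeholders, not libvirt's real layout.
    """
    return {
        "pidfile": f"/run/example/{name}.pid",
        "logfile": f"/var/log/example/{name}.log",
        "monitor": f"/var/lib/example/{name}-{uuid}/monitor.sock",
    }


def clashes(a, b):
    """Return the resource kinds whose names are identical in both sets."""
    return sorted(kind for kind in a if a[kind] == b[kind])


# In a local-host migration both instances are the same VM on the same host,
# so both sets of names are computed from the same name and uuid.
source = host_resources("vm1", "1234")
target = host_resources("vm1", "1234")

print(clashes(source, target))
```

Under these assumptions every resource kind clashes, which is the situation described above for two copies of one VM on a single host.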

Comment 2 David Jaša 2016-09-12 10:58:28 UTC
Domain ID seems like a good distinguishing mark in file names for domains with the same name/uuid. Good handling of qemu logging looks more difficult to me, but that's not exactly good in current libvirt either (e.g. too verbose logging preventing domain startup, lack of timestamps, etc.).
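
A rough sketch of that suggestion, again in Python with hypothetical function names and path templates rather than libvirt's real ones: if the domain ID is part of every resource name, and the source and target instances are assumed to receive different IDs, the names no longer collide. It deliberately leaves the logging question open.

```python
def host_resources_with_id(domain_id, name):
    """Build host resource names that include the per-instance domain ID.

    The templates are illustrative placeholders, not libvirt's real layout.
    """
    prefix = f"{domain_id}-{name}"
    return {
        "pidfile": f"/run/example/{prefix}.pid",
        "monitor": f"/var/lib/example/{prefix}/monitor.sock",
    }


def clashes(a, b):
    """Return the resource kinds whose names are identical in both sets."""
    return sorted(kind for kind in a if a[kind] == b[kind])


# Assumption: the target instance is given a new domain ID (8) that differs
# from the source's (7), even though name and uuid are unchanged.
source = host_resources_with_id(7, "vm1")
target = host_resources_with_id(8, "vm1")

print(clashes(source, target))
```

This only covers resources whose names can carry the ID; anything that must stay keyed on name or uuid alone would still clash.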

Comment 3 Jiri Denemark 2016-11-22 11:55:04 UTC
As Daniel said, implementing generic localhost migrations would be nontrivial and I don't think it's worth doing. Setting up two hosts for migration testing is not any harder than setting up a single host, esp. when both can be virtual.

However, moving a running domain from an older QEMU binary to a new one (e.g., to address a security issue) is a valid scenario that is worth thinking over. Libvirtd could do it within a single API, and it might even be possible to implement some optimizations in cooperation with QEMU.

That said, I'm changing the bug summary to match the first use case.

Comment 4 Daniel Berrangé 2016-11-22 12:12:04 UTC
(In reply to Jiri Denemark from comment #3)
> As Daniel said, implementing generic localhost migrations would be
> nontrivial and don't think it's worth doing. Setting up two hosts for
> migration testing is not any harder than setting up a single host esp. when
> both can be virtual.
> 
> However, moving a running domain from an older QEMU binary to a new one
> (e.g., to address a security issue) is a valid scenario that is worth
> thinking over. Libvirtd could do it within a single API and it could even be
> theoretically possible to implement some optimizations in cooperation with
> QEMU.
> 
> That said, I'm changing the bug summary to match the first use case.

I don't think that "upgrading" a running QEMU in-place is any easier than supporting localhost migration: all the same problems apply wrt resources on disk clashing.

I also question whether there is genuine value in an in-place upgrade of QEMU from a security POV. There's a reasonable number of security bugs in pieces outside QEMU that effectively necessitate migrating the guest to a separate host. So given that you need that procedure for dealing with security errata in general, I'm sceptical that it is beneficial to special-case QEMU security bugs. It is better to have a consistent process applied in all cases, as that reduces the testing matrix and thus the scope for failure.