Bug 303111 - Ensure success of a failover or migration of Virtual Machine prior to the attempt
Summary: Ensure success of a failover or migration of Virtual Machine prior to the attempt
Keywords:
Status: CLOSED DUPLICATE of bug 412911
Alias: None
Product: Red Hat Enterprise Linux 5
Classification: Red Hat
Component: rgmanager
Version: 5.2
Hardware: All
OS: Linux
Priority: high
Severity: high
Target Milestone: ---
Assignee: Lon Hohberger
QA Contact:
URL:
Whiteboard:
Duplicates: 456556
Depends On:
Blocks:
 
Reported: 2007-09-24 14:11 UTC by Scott Crenshaw
Modified: 2016-04-26 15:29 UTC
9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2009-04-02 16:48:45 UTC
Target Upstream Version:
Embargoed:



Description Rob Kenna 2007-09-24 14:11:31 UTC
Prior to migrating or starting a guest VM on another node in the cluster, ensure
that the target node has adequate memory and other resources required to run the
guest. If it does not, either choose a different node or do not attempt the operation.

This will hopefully rely on libvirt and the VM infrastructure rather than being
implemented solely in rgmanager and the cluster infrastructure.

Comment 1 Lon Hohberger 2007-11-06 15:00:26 UTC
So, at a minimum...

(a) Look at the memory footprint of the VM and make sure enough memory is available
on the remote node.
(b) Check the availability of that VM on the other node (e.g. xen domain config, disk
images, etc.).
(c) Check that both the local migration server and the remote migration server
are running.
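
A minimal sketch of checks (a) through (c), assuming the python-libvirt bindings
and purely hypothetical host and domain names; rgmanager itself is not written in
Python, so this only illustrates the idea, not the actual implementation:

    import libvirt  # python-libvirt bindings (assumed available)

    def can_migrate(vm_name, remote_uri):
        """Rough pre-flight checks before attempting a migration."""
        try:
            # (c) verify the remote libvirt/migration daemon is reachable
            remote = libvirt.open(remote_uri)
        except libvirt.libvirtError:
            return False  # remote migration server not running/reachable

        try:
            # (b) verify the VM's configuration is known on the remote node
            #     (lookupByName raises libvirtError if it is not defined there)
            remote.lookupByName(vm_name)
        except libvirt.libvirtError:
            remote.close()
            return False

        # (a) verify the remote node has enough free memory for the guest
        local = libvirt.open(None)                  # local hypervisor connection
        dom = local.lookupByName(vm_name)
        needed_kib = dom.maxMemory()                # guest memory footprint, KiB
        free_kib = remote.getFreeMemory() // 1024   # getFreeMemory() returns bytes

        local.close()
        remote.close()
        return free_kib >= needed_kib

    # Example: check before moving "webvm" to a hypothetical target node
    if can_migrate("webvm", "xen+ssh://node2.example.com/"):
        print("pre-flight checks passed")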


Comment 3 Kiersten (Kerri) Anderson 2008-07-14 18:50:51 UTC
Pushing to 5.4; it would be easiest if libvirt just provided this capability.

Comment 4 Lon Hohberger 2008-09-08 17:21:02 UTC
*** Bug 456556 has been marked as a duplicate of this bug. ***

Comment 5 Perry Myers 2009-04-02 14:21:26 UTC
(In reply to comment #3)
> Pushing to 5.4; it would be easiest if libvirt just provided this capability.

It's unclear that libvirt should provide this functionality, since it is dependent on SLA requirements for the VM.  For example, let's say a physical host has 4 processors, there are 3 VMs presently running, each taking up 1 vCPU, and the guest you're attempting to migrate requires 2 vCPUs.  Is there enough room to move this guest?  Three possibilities:

1. The SLA indicates that there must be a physical CPU for every vCPU, in which case no, you can't start the new VM.
2. The SLA indicates that vCPU overcommit can't exceed 2 times the physical CPUs, in which case you can start the new VM.
3. The SLA allows arbitrary overcommit but uses other metrics like system load to determine whether a new guest can start (this needs more info from other areas of the system).

libvirt can't really track this type of business logic.  It belongs further up in the stack.  For example, in oVirt we'll implement this SLA logic as part of the oVirt Server, not as part of libvirt.  If this functionality needs to be implemented for the cluster, it would really need to be done as part of the core cluster stack.
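
To illustrate why this is policy rather than something libvirt can decide, here is a
rough sketch of the three SLA examples above as a check that would live higher in the
stack; the policy names and the load check are purely hypothetical:

    def cpu_slot_available(policy, phys_cpus, vcpus_in_use, vcpus_needed):
        """Decide whether a guest needing `vcpus_needed` vCPUs may start,
        given an SLA policy -- business logic, not hypervisor logic."""
        if policy == "no-overcommit":        # 1. one physical CPU per vCPU
            return vcpus_in_use + vcpus_needed <= phys_cpus
        if policy == "overcommit-2x":        # 2. at most 2x vCPU overcommit
            return vcpus_in_use + vcpus_needed <= 2 * phys_cpus
        if policy == "load-based":           # 3. arbitrary overcommit, gated on load
            return system_load_acceptable()  # needs data from elsewhere in the stack
        raise ValueError("unknown SLA policy: %s" % policy)

    def system_load_acceptable():
        # Placeholder: a real implementation would consult load averages,
        # monitoring data, etc.
        return True

    # The example above: 4 physical CPUs, 3 vCPUs in use, guest needs 2 more
    print(cpu_slot_available("no-overcommit", 4, 3, 2))  # False -- cannot start
    print(cpu_slot_available("overcommit-2x", 4, 3, 2))  # True  -- can start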

Comment 6 Perry Myers 2009-04-02 14:33:46 UTC
OK, after talking to Lon about this a little: the question isn't about SLA or performance of the VM, it's more about 'Can this VM even possibly start on the other node at all?', which would depend on things like:
1. Is the architecture the same between the two migrating hosts (Intel vs. AMD)?
2. Is the VM image accessible on shared storage from both hosts?
3. Are the right network interfaces/bridges present?

My understanding is that libvirt handles all of this as part of the migration process.  The logic would be that rgmanager should try to invoke virsh migrate; if virsh migrate fails, it could be because of one of the above reasons.

So the changes for this BZ are to make rgmanager use libvirt migration instead of xm, and to interpret the results of the migrate call to determine success or failure.
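
A minimal sketch of that change, assuming the resource agent shells out to
`virsh migrate --live` and treats a nonzero exit status as failure; the VM and
host names are hypothetical:

    import subprocess

    def migrate(vm_name, dest_uri):
        """Invoke `virsh migrate --live` and report success or failure
        based on the command's exit status and stderr output."""
        result = subprocess.run(
            ["virsh", "migrate", "--live", vm_name, dest_uri],
            capture_output=True, text=True,
        )
        if result.returncode != 0:
            # Failure may mean an architecture mismatch, an image not reachable
            # on shared storage, missing bridges, etc. -- surface the reason.
            print("migration of %s failed: %s" % (vm_name, result.stderr.strip()))
            return False
        return True

    # Example: migrate "webvm" to a hypothetical destination host
    migrate("webvm", "xen+ssh://node2.example.com/")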

Comment 7 Lon Hohberger 2009-04-02 16:48:45 UTC
Since this is a function of porting to virsh instead of xm, I am closing it as a duplicate of the relevant bugzilla.

*** This bug has been marked as a duplicate of bug 412911 ***

