Bug 849157

Summary:	Load balancer doesn't take host memory size and utilization into account, causing unnecessary memory crunch
Product:	Red Hat Enterprise Virtualization Manager	Reporter:	David Jaša <djasa>
Component:	ovirt-engine	Assignee:	Nobody's working on this, feel free to take it <nobody>
Status:	CLOSED DUPLICATE	QA Contact:
Severity:	high	Docs Contact:
Priority:	high
Version:	3.0.2	CC:	dyasny, iheim, lpeer, Rhev-m-bugs, yeylon, ykaul
Target Milestone:	---
Target Release:	---
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:		Story Points:	---
Clone Of:		Environment:
Last Closed:	2012-08-18 17:44:41 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:

Description David Jaša 2012-08-17 13:26:17 UTC

Description of problem:
Load balancer doesn't take host memory size and utilization into account

Version-Release number of selected component (if applicable):
spotted in 3.0.2

How reproducible:
didn't try to reproduce

Steps to Reproduce:
1. have a cluster with the default "even distribution" policy
2. have two hosts in the cluster with very different memory sizes:
ram_size(host1) << ram_size(host2)
3. start VMs so that:
* ram_size(host1) < total_ram_taken_by_VMs < ram_size(host2)
* all VMs are idle once started, save for the one in the next step
note: VMs should have the maximum possible number of CPU cores while still
being able to migrate
4. in one VM on host2, start a long-running task that utilizes all available CPUs

Actual results:
1. RHEV-M detects high CPU load on host2
2. RHEV-M migrates idle VMs from host2 to host1
3. step 2 is repeated until host1's memory is fully utilized, yet RHEV-M still tries to migrate VMs there
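
The behavior above can be sketched with a toy model. This is not the ovirt-engine code: the VM and Host classes, the balance_cpu_only function, the 0.8 CPU threshold and all sizes are assumptions made up for illustration. It only shows how a balancer that looks at CPU load alone keeps pushing idle VMs onto the smaller host.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    ram_mb: int
    cpu_load: float  # share of the host's CPU this VM consumes, 0.0..1.0


@dataclass
class Host:
    name: str
    ram_mb: int
    vms: list = field(default_factory=list)

    def cpu_load(self) -> float:
        return sum(vm.cpu_load for vm in self.vms)

    def ram_used_mb(self) -> int:
        return sum(vm.ram_mb for vm in self.vms)


def balance_cpu_only(src, dst, cpu_threshold=0.8):
    """Move the idlest VM off src while src is above the CPU threshold.

    Memory on dst is never checked, which is the flaw being illustrated.
    """
    moved = []
    while src.cpu_load() > cpu_threshold and len(src.vms) > 1:
        vm = min(src.vms, key=lambda v: v.cpu_load)
        src.vms.remove(vm)
        dst.vms.append(vm)
        moved.append(vm.name)
    return moved


def build_scenario():
    # Assumed sizes: host1 has 8 GiB, host2 has 32 GiB, six 4 GiB VMs on host2.
    host1 = Host("host1", ram_mb=8192)
    host2 = Host("host2", ram_mb=32768)
    host2.vms.append(VM("busy", ram_mb=4096, cpu_load=0.95))
    for i in range(5):
        host2.vms.append(VM(f"idle{i}", ram_mb=4096, cpu_load=0.01))
    return host1, host2


host1, host2 = build_scenario()
moved = balance_cpu_only(src=host2, dst=host1)
print(f"migrated {len(moved)} VMs; host1 uses {host1.ram_used_mb()} of {host1.ram_mb} MiB")
```

In this model all five idle VMs end up on host1, which then carries 20480 MiB of VM memory on 8192 MiB of RAM, while host2 is still above the CPU threshold because the busy VM never moved.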

Expected results:
While the "actual results" are clearly wrong (they put host1 into a totally unnecessary memory crunch), getting this right seems much more difficult. IMO at least these things should be taken into account:
* all three main resources (CPU, RAM, network bandwidth)
* resource availability and utilization across the cluster
* the level of competition between VMs for a resource

In my particular use case, these two approaches to the situation may also be valid:
* ignore the CPU load of the host completely because no other VMs want it
* migrate the CPU-hungry VM to host1 (the one with less memory) and migrate some
idle VMs from host1 to host2 until a safe RAM utilization level on
host1 is reached, then stop
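
Two of the ideas above could look roughly like this. It is a minimal sketch under assumptions, not a proposal for the actual scheduler: cpu_is_contended, pick_destination, the 10 % "wants CPU" cutoff and the 90 % RAM cap are all hypothetical names and values. The first function skips balancing when only one VM wants CPU; the second refuses destinations that would exceed a safe RAM utilization level.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    ram_mb: int
    cpu_load: float  # share of the host's CPU this VM consumes, 0.0..1.0


@dataclass
class Host:
    name: str
    ram_mb: int
    vms: list = field(default_factory=list)

    def cpu_load(self) -> float:
        return sum(vm.cpu_load for vm in self.vms)

    def ram_used_mb(self) -> int:
        return sum(vm.ram_mb for vm in self.vms)


def cpu_is_contended(host, wants_cpu=0.10):
    """True only if at least two VMs on the host are actively using CPU."""
    return sum(1 for vm in host.vms if vm.cpu_load >= wants_cpu) >= 2


def pick_destination(vm, candidates, max_ram_utilization=0.9):
    """Return the least CPU-loaded host that can take vm within the RAM cap.

    Returns None when no candidate has enough free memory, so the caller
    can stop migrating instead of retrying against a full host.
    """
    fitting = [
        h for h in candidates
        if h.ram_used_mb() + vm.ram_mb <= h.ram_mb * max_ram_utilization
    ]
    return min(fitting, key=lambda h: h.cpu_load(), default=None)
```

With an 8192 MiB host1 and 4096 MiB VMs, pick_destination accepts host1 for the first VM and returns None for the second, because 8192 MiB would exceed the 90 % cap.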

Additional info:
This could probably be reproduced on a homogeneous 2-host cluster as well, if the total memory guaranteed to VMs exceeds the RAM available on a single host.
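
The precondition for that case is simple arithmetic. A small sketch, with a hypothetical helper name and made-up sizes:

```python
def hosts_that_cannot_hold_all(host_ram_mb, vm_guaranteed_mb):
    """Return names of hosts whose RAM is below the total guaranteed VM memory.

    host_ram_mb maps host name to RAM in MiB; vm_guaranteed_mb is a list of
    per-VM guaranteed memory in MiB. Both are illustrative inputs.
    """
    total = sum(vm_guaranteed_mb)
    return sorted(name for name, ram in host_ram_mb.items() if ram < total)


# Two identical 16 GiB hosts, five VMs with 4 GiB guaranteed each (20 GiB total).
at_risk = hosts_that_cannot_hold_all(
    {"host1": 16384, "host2": 16384},
    [4096] * 5,
)
print(at_risk)
```

Here both hosts are listed, so moving every VM to either one would overcommit it.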

Comment 1 Itamar Heim 2012-08-18 17:44:41 UTC


*** This bug has been marked as a duplicate of bug 516963 ***