Bug 1008146

Summary: When multiple VMs are run at once, the filter does not correctly check free resources.
Product: Red Hat Enterprise Virtualization Manager Reporter: Ondra Machacek <omachace>
Component: ovirt-engine Assignee: Martin Sivák <msivak>
Status: CLOSED CURRENTRELEASE QA Contact: Lukas Svaty <lsvaty>
Severity: high Docs Contact:
Priority: high    
Version: 3.3.0 CC: acathrow, dfediuck, gchaplik, iheim, lpeer, lsvaty, mavital, movciari, omachace, Rhev-m-bugs, yeylon
Target Milestone: ---   
Target Release: 3.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: sla
Fixed In Version: is26 Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: SLA RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1044030    
Attachments:
Description    Flags
engine.log     none
engine.log     none
engine log     none

Description Ondra Machacek 2013-09-15 08:32:03 UTC
Created attachment 797822 [details]
engine.log

Description of problem:
When multiple VMs are run at the same time, the free resources of the hosts are not
checked correctly. For example, I have two hosts, each with ~7GB of free memory,
and two VMs, each with 7GB of guaranteed memory. When I run both VMs at the same
time, both start on the same host, which has only 7GB of free memory.

Version-Release number of selected component (if applicable):
is13

How reproducible:
always

Steps to Reproduce:
1. Have two hosts both with ~7GB RAM.
2. Create cluster policy only with RAM filter.
3. Create two vms, both with ~7GB guaranteed memory.
4. Select both vms in webadmin and run them.

Actual results:
Both vms run on one host.

Expected results:
Both vms run on different hosts.

Additional info:

Comment 1 Doron Fediuck 2013-09-16 09:00:15 UTC
Hi Ondra,
I'm missing the log parts from the scheduling process. Can you please add them?

Also, what is your cluster policy optimization (servers? desktops?)?
There are several options that allow overcommitment, such as KSM, ballooning, etc.,
so we really need you to specify all these settings to evaluate the memory calculations.

Comment 2 Ondra Machacek 2013-09-16 12:17:26 UTC
Created attachment 798263 [details]
engine.log

Hi Doron,

Attaching a log which reflects all the steps to reproduce:
creating a new cluster policy with the RAM filter, selecting it as the cluster policy
for the cluster with those two hosts, creating two 7GB VMs, and running them in
parallel.

I use Memory optimization - None - Disable memory page sharing.
I use neither ballooning nor KSM.

Comment 3 Ondra Machacek 2013-09-16 12:20:50 UTC
Also note that when I run those VMs one by one, the first starts on the first
host and the second on the second host. When I then try to migrate one
of those VMs, it fails with the error:

Error while executing action:
rhel_7gb:

    Cannot migrate VM. There are no available running Hosts with sufficient memory in VM's Cluster .

Comment 4 Martin Sivák 2013-09-23 08:49:07 UTC
I think I know the reason now:

The engine checks the hosts for sufficient memory in SlaValidator.hasMemoryToRunVM. The fields that we use there are:

curVds.getMemCommited()
curVds.getPendingVmemSize()
curVds.getGuestOverhead()
curVds.getReservedMem()

All of them seem to be updated only when a VDSM report comes in.
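A minimal Java sketch of what such a check might look like follows. It is not the engine's code: the report names the four getters but does not show the body of SlaValidator.hasMemoryToRunVM, so the class HostMemory, its field names, the physicalMb input and the exact inequality are all assumptions made for illustration.

```java
// Illustrative sketch only. HostMemory and its fields are hypothetical
// stand-ins for the VDS getters named above, not the engine's classes.
final class HostMemory {
    final int physicalMb;  // total RAM of the host (assumed input)
    int committedMb;       // cf. getMemCommited(): memory of VMs already running
    int pendingMb;         // cf. getPendingVmemSize(): memory of VMs being started
    final int overheadMb;  // cf. getGuestOverhead()
    final int reservedMb;  // cf. getReservedMem()

    HostMemory(int physicalMb, int committedMb, int pendingMb,
               int overheadMb, int reservedMb) {
        this.physicalMb = physicalMb;
        this.committedMb = committedMb;
        this.pendingMb = pendingMb;
        this.overheadMb = overheadMb;
        this.reservedMb = reservedMb;
    }

    // Assumed form of the check: everything already accounted for on the
    // host, plus the new VM's guaranteed memory, must fit into physical RAM.
    boolean hasMemoryToRun(int vmGuaranteedMb) {
        long needed = (long) committedMb + pendingMb + overheadMb
                + reservedMb + vmGuaranteedMb;
        return needed <= physicalMb;
    }

    public static void main(String[] args) {
        HostMemory host = new HostMemory(8192, 0, 0, 65, 256);
        System.out.println(host.hasMemoryToRun(7168)); // fits on an empty host
        host.committedMb = 7168;                       // a report for the first VM arrives
        System.out.println(host.hasMemoryToRun(7168)); // no longer fits
    }
}
```

The point of the sketch is that the result depends entirely on how fresh committedMb and pendingMb are. If neither has been updated since the first VM was scheduled, the second call returns the same answer as the first.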

The field that contains the memory reserved for VMs is getMemCommited, and the amount is recomputed in VdsUpdateRunTimeInfo.refreshCommitedMemory, which is called only by refreshVmStats.

Since we poll VDSM only every couple of seconds, the engine does not yet see the first VM's memory allocation when it is trying to start a second VM.

The fact that the second VM is properly scheduled to the second host when it is started with a small delay after the first VM seems to confirm this behaviour.
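The suspected race can be reproduced in a small, self-contained Java simulation. Everything in it is hypothetical (the class SchedulingRaceSketch, the Host type, the first-fit selection and the 7168 MB / 7000 MB figures). The reservePending switch shows one possible way to close the window: the scheduler records the memory it has just promised instead of waiting for the next VDSM report. This is not necessarily how the bug was actually fixed, since the report does not describe the patch.

```java
// Illustrative simulation only. All names and numbers are hypothetical.
final class SchedulingRaceSketch {

    static final class Host {
        final String name;
        final int freeMb;  // free memory according to the last VDSM report
        int pendingMb;     // memory promised to VMs scheduled since that report

        Host(String name, int freeMb) {
            this.name = name;
            this.freeMb = freeMb;
        }
    }

    // First-fit selection: returns the first host with enough free memory,
    // or null if none qualifies. When reservePending is true, the chosen
    // host's pendingMb is raised immediately, so the next decision sees it.
    // The method is synchronized so that check and reservation are atomic.
    static synchronized Host schedule(Host[] hosts, int vmMb, boolean reservePending) {
        for (Host h : hosts) {
            if (h.freeMb - h.pendingMb >= vmMb) {
                if (reservePending) {
                    h.pendingMb += vmMb;
                }
                return h;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        int vmMb = 7000;

        // Stale view: nothing is recorded between the two decisions.
        Host[] stale = { new Host("host1", 7168), new Host("host2", 7168) };
        String first = schedule(stale, vmMb, false).name;
        String second = schedule(stale, vmMb, false).name;
        System.out.println(first + ", " + second); // host1, host1

        // Tracked view: the first decision reserves memory on host1.
        Host[] tracked = { new Host("host1", 7168), new Host("host2", 7168) };
        first = schedule(tracked, vmMb, true).name;
        second = schedule(tracked, vmMb, true).name;
        System.out.println(first + ", " + second); // host1, host2
    }
}
```

In the stale case both decisions see the same snapshot and pick host1, which matches the actual results reported above. In the tracked case the second VM is pushed to host2, which matches the expected results.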

Comment 5 Lukas Svaty 2013-12-05 11:49:57 UTC
Both VMs failed to run on the 1st host and then started on the 2nd host. Attaching engine.log.

Comment 6 Lukas Svaty 2013-12-05 11:50:26 UTC
Created attachment 833111 [details]
engine log

Comment 7 Martin Sivák 2013-12-09 14:00:40 UTC
Please enable DEBUG mode when attaching logs, and describe the actual verification setup (especially the cluster policy assigned and the memory sizes), as the fix is working for me locally.

Comment 8 Martin Sivák 2013-12-12 09:16:05 UTC
We just tested it with movciari and it seems to be working correctly. The environment probably had the Memory filter disabled.

Comment 9 Lukas Svaty 2013-12-17 08:03:04 UTC
Retested; it seems to have been some kind of environment issue, which is fixed now. Moving to VERIFIED.

Comment 10 Itamar Heim 2014-01-21 22:32:03 UTC
Closing - RHEV 3.3 Released

Comment 12 Itamar Heim 2014-01-21 22:34:36 UTC
Closing - RHEV 3.3 Released