Bug 1289468 - When starting two vms simultaneously, "Max free Memory for scheduling new VMs" is not updated fast enough
When starting two vms simultaneously, "Max free Memory for scheduling new VM...
Status: CLOSED CURRENTRELEASE
Product: ovirt-engine
Classification: oVirt
Component: Backend.Core (Show other bugs)
3.6.1.1
All Linux
medium Severity medium (vote)
: ovirt-3.6.3
: 3.6.3.2
Assigned To: Martin Sivák
Artyom
: BuildBlocker, Triaged
Depends On:
Blocks:
  Show dependency treegraph
 
Reported: 2015-12-08 03:50 EST by Artyom
Modified: 2016-03-11 02:24 EST (History)
7 users (show)

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Engine moved to use asynchronous events for receiving VM state changes sometime during the 3.6 cycle. (3.5 is not affected) Consequence: Resource reservation for VM was cleared once the UP state was received. But the actual host resource usage was not updated until the next polling cycle (usually every 15 seconds). That created a window in which the VM resource usage was essentially not visible to scheduling and other VMs were able to start at the host although there was not enough space. Fix: The real resource allocation is updated together with the pending resources when the UP state arrives. Result: Free resources are computed properly and the scheduling is no longer overloading hosts.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-03-11 02:24:38 EST
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: SLA
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
rule-engine: ovirt‑3.6.z+
rule-engine: blocker+
mgoldboi: planning_ack+
dfediuck: devel_ack+
mavital: testing_ack+


Attachments (Terms of Use)
engine log (10.12 MB, text/plain)
2015-12-08 03:50 EST, Artyom
no flags Details


External Trackers
Tracker ID Priority Status Summary Last Updated
oVirt gerrit 53063 master MERGED Fix the race between committed memory and pending memory updates 2016-02-15 15:28 EST
oVirt gerrit 53520 master MERGED scheduling: Use the right memory value for pending memory checks 2016-02-15 15:00 EST
oVirt gerrit 53542 ovirt-engine-3.6 MERGED scheduling: Initialize pending resource manager after all beans are injected 2016-02-16 01:46 EST
oVirt gerrit 53543 ovirt-engine-3.6 MERGED scheduling: Use the right memory value for pending memory checks 2016-02-16 01:48 EST
oVirt gerrit 53544 ovirt-engine-3.6 MERGED Fix the race between committed memory and pending memory updates 2016-02-16 01:50 EST
oVirt gerrit 53549 ovirt-engine-3.6.3 MERGED scheduling: Initialize pending resource manager after all beans are injected 2016-02-16 02:00 EST
oVirt gerrit 53550 ovirt-engine-3.6.3 MERGED scheduling: Use the right memory value for pending memory checks 2016-02-16 02:01 EST
oVirt gerrit 53551 ovirt-engine-3.6.3 MERGED Fix the race between committed memory and pending memory updates 2016-02-16 02:02 EST

  None (edit)
Description Artyom 2015-12-08 03:50:45 EST
Created attachment 1103483 [details]
engine log

Description of problem:
When start two vms simultaneously, parameter "Max free Memory for scheduling new VMs" not updated fast enough, that prevent from memory filter to block start one of vms.

Version-Release number of selected component (if applicable):
rhevm-3.6.1.1-0.1.el6.noarch

How reproducible:
Always

Steps to Reproduce:
1. Set cluster over commitment to 0
2. Add one host with 8 GB of memory(Max free Memory for scheduling new VMs: 8GB)
3. Create two vms each with memory and guaranteed memory equal to 7GB
4. Start vms simultaneously

Actual results:
Two vms succeed to start on host

Expected results:
One of vms must failed to start on host because it not have enough "Max free Memory for scheduling new VMs"

Additional info:
Looks like update of host "Max free Memory for scheduling new VMs" take around 1 minute, and when scheduler check host memory to run second vm, it still have old value.
I tried the same scenario with 3 vms:
2 vms succeed to start and one failed with 
2015-12-08 10:22:04,432 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (org.ovirt.thread.pool-7-thread-2) [22ee0787] Candidate host 'alma06.qa.lab.tlv.redhat.com' ('f6735fa9-4ee5-47ce-b750-a87863736cc2') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'Memory' (correlation id: 22ee0787)
2015-12-08 10:22:04,433 DEBUG [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (org.ovirt.thread.pool-7-thread-2) [22ee0787] Scheduling ended, correlation Id: 22ee0787
2015-12-08 10:22:04,433 ERROR [org.ovirt.engine.core.bll.RunVmCommand] (org.ovirt.thread.pool-7-thread-2) [22ee0787] Can't find VDS to run the VM 'c2854589-ccdc-4ccb-9b78-73dc0f0f78b8' on, so this VM will not be run.
2015-12-08 10:22:04,433 DEBUG [org.ovirt.engine.core.bll.scheduling.pending.PendingResourceManager] (ForkJoinPool-1-worker-15) [] Clearing pending resources for VM cfc7561e-6683-468e-8298-03c1a7e79fa1
2015-12-08 10:22:04,475 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (org.ovirt.thread.pool-7-thread-2) [22ee0787] Correlation ID: 22ee0787, Job ID: 0be4993a-f8a5-4270-aeb2-7eb8ab75144c, Call Stack: null, Custom Event ID: -1, Message: Failed to run VM test_memory_2 (User: admin@internal)
but it also problematic we need to specify why vm failed to start, without need to check engine log
Comment 1 Doron Fediuck 2016-01-24 04:10:04 EST
Artyom,
did you use the UI or the REST API?
Comment 2 Artyom 2016-01-25 04:31:27 EST
I used webadmin, but I also encountered this problem under my automation test, so it is happen also via REST API.
Comment 3 Martin Sivák 2016-02-02 04:59:19 EST
There are two issues here:

1) We do not tell the user nicely even when we prevent the vm from starting

Nothing we can do to fix this. The VMs weren't scheduled yet and pending is 0 when canDoAction steps run in parallel so we let both RunVm commands to go to the execute phase and one fails there (which generates audit log message, but not the nice dialog).

2) Two VMs actually starting. This is a bug, but not related to the pending mechanism (it seems that mem commited update is too slow here):


CanDoActions in parallel:

2015-12-08 10:17:06,390 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-49) [3fae3cd4] hasMemoryToRunVM: host 'alma06.qa.lab.tlv.redhat.com' physical vmem size is : 15772 MB
2015-12-08 10:17:06,390 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-49) [3fae3cd4] Host Mem Conmmitted: '2178', pending vmem size is : 0, Host Guest Overhead 65, Host Reserved Mem: 321, VM Min Allocated Mem 12288
2015-12-08 10:17:06,390 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-49) [3fae3cd4] 14852.0 <= ???  15772.0

2015-12-08 10:17:06,390 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-48) [7a84f1e6] hasMemoryToRunVM: host 'alma06.qa.lab.tlv.redhat.com' physical vmem size is : 15772 MB
2015-12-08 10:17:06,392 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-48) [7a84f1e6] Host Mem Conmmitted: '2178', pending vmem size is : 0, Host Guest Overhead 65, Host Reserved Mem: 321, VM Min Allocated Mem 12288
2015-12-08 10:17:06,392 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-48) [7a84f1e6] 14852.0 <= ???  15772.0

2015-12-08 10:17:06,394 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-47) [18c96faf] hasMemoryToRunVM: host 'alma06.qa.lab.tlv.redhat.com' physical vmem size is : 15772 MB
2015-12-08 10:17:06,395 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-47) [18c96faf] Host Mem Conmmitted: '2178', pending vmem size is : 0, Host Guest Overhead 65, Host Reserved Mem: 321, VM Min Allocated Mem 12288
2015-12-08 10:17:06,395 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-47) [18c96faf] 14852.0 <= ???  15772.0

2015-12-08 10:17:06,471 INFO  [org.ovirt.engine.core.bll.MultipleActionsRunner] (org.ovirt.thread.pool-7-thread-47) [18c96faf] Finish handling CanDoAction for command number 1/3 (Command type 'RunVm')
2015-12-08 10:17:06,471 INFO  [org.ovirt.engine.core.bll.MultipleActionsRunner] (org.ovirt.thread.pool-7-thread-49) [3fae3cd4] Finish handling CanDoAction for command number 3/3 (Command type 'RunVm')
2015-12-08 10:17:06,476 INFO  [org.ovirt.engine.core.bll.MultipleActionsRunner] (org.ovirt.thread.pool-7-thread-48) [7a84f1e6] Finish handling CanDoAction for command number 2/3 (Command type 'RunVm')

Then the first VM starts:

2015-12-08 10:17:06,760 DEBUG [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (org.ovirt.thread.pool-7-thread-50) [7a84f1e6] Scheduling started, correlation Id: 7a84f1e6
2015-12-08 10:17:06,770 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [7a84f1e6] hasMemoryToRunVM: host 'alma06.qa.lab.tlv.redhat.com' physical vmem size is : 15772 MB
2015-12-08 10:17:06,770 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [7a84f1e6] Host Mem Conmmitted: '2178', pending vmem size is : 0, Host Guest Overhead 65, Host Reserved Mem: 321, VM Min Allocated Mem 12288
2015-12-08 10:17:06,770 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [7a84f1e6] 14852.0 <= ???  15772.0

2015-12-08 10:17:06,792 DEBUG [org.ovirt.engine.core.bll.scheduling.pending.PendingResourceManager] (org.ovirt.thread.pool-7-thread-50) [7a84f1e6] Adding pending resource org.ovirt.engine.core.bll.scheduling.pending.PendingCpuCores@f326500e (host: f6735fa9-4ee5-47ce-b750-a87863736cc2, vm: a54cb46b-c9c8-47a5-a74c-7e2938eedde9)
2015-12-08 10:17:06,792 DEBUG [org.ovirt.engine.core.bll.scheduling.pending.PendingResourceManager] (org.ovirt.thread.pool-7-thread-50) [7a84f1e6] Adding pending resource org.ovirt.engine.core.bll.scheduling.pending.PendingMemory@f326500e (host: f6735fa9-4ee5-47ce-b750-a87863736cc2, vm: a54cb46b-c9c8-47a5-a74c-7e2938eedde9)
2015-12-08 10:17:06,792 DEBUG [org.ovirt.engine.core.bll.scheduling.pending.PendingResourceManager] (org.ovirt.thread.pool-7-thread-50) [7a84f1e6] Adding pending resource org.ovirt.engine.core.bll.scheduling.pending.PendingVM@f326500e (host: f6735fa9-4ee5-47ce-b750-a87863736cc2, vm: a54cb46b-c9c8-47a5-a74c-7e2938eedde9)
2015-12-08 10:17:06,793 DEBUG [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (org.ovirt.thread.pool-7-thread-50) [7a84f1e6] Scheduling ended, correlation Id: 7a84f1e6

Then the second VM tries to start and fails the memory check properly:

2015-12-08 10:17:08,377 DEBUG [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (org.ovirt.thread.pool-7-thread-50) [18c96faf] Scheduling started, correlation Id: 18c96faf
2015-12-08 10:17:08,387 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [18c96faf] hasMemoryToRunVM: host 'alma06.qa.lab.tlv.redhat.com' physical vmem size is : 15772 MB
2015-12-08 10:17:08,387 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [18c96faf] Host Mem Conmmitted: '2178', pending vmem size is : 12288, Host Guest Overhead 65, Host Reserved Mem: 321, VM Min Allocated Mem 12288
2015-12-08 10:17:08,387 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [18c96faf] 27140.0 <= ???  15772.0
2015-12-08 10:17:08,387 DEBUG [org.ovirt.engine.core.bll.scheduling.VdsFreeMemoryChecker] (org.ovirt.thread.pool-7-thread-50) [18c96faf] not enough memory on host. throttling...
2015-12-08 10:17:08,387 DEBUG [org.ovirt.engine.core.bll.RunVmCommandBase] (org.ovirt.thread.pool-7-thread-50) [18c96faf] Try to wait for te engine update the host memory and cpu stats

But it passes it after the delay period..

2015-12-08 10:17:18,393 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [18c96faf] hasMemoryToRunVM: host 'alma06.qa.lab.tlv.redhat.com' physical vmem size is : 15772 MB
2015-12-08 10:17:18,394 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [18c96faf] Host Mem Conmmitted: '2178', pending vmem size is : 0, Host Guest Overhead 65, Host Reserved Mem: 321, VM Min Allocated Mem 12288
2015-12-08 10:17:18,394 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [18c96faf] 14852.0 <= ???  15772.0

The BUG is here, pending is cleared properly (VM monitoring got the VM in the VDSM report), but mem commited still does not contain the memory that was assigned to the first VM.
Pending clear is reported on a later line, but the difference is only about 30ms so the transactions might have started at about the same time.

2015-12-08 10:17:18,425 DEBUG [org.ovirt.engine.core.bll.scheduling.pending.PendingResourceManager] (ForkJoinPool-1-worker-2) [] Clearing pending resources for VM a54cb46b-c9c8-47a5-a74c-7e2938eedde9


The third VM ends up doing the same:

2015-12-08 10:17:19,916 DEBUG [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] Scheduling started, correlation Id: 3fae3cd4
2015-12-08 10:17:19,930 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] hasMemoryToRunVM: host 'alma06.qa.lab.tlv.redhat.com' physical vmem size is : 15772 MB
2015-12-08 10:17:19,930 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] Host Mem Conmmitted: '2178', pending vmem size is : 12288, Host Guest Overhead 65, Host Reserved Mem: 321, VM Min Allocated Mem 12288
2015-12-08 10:17:19,931 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] 27140.0 <= ???  15772.0
2015-12-08 10:17:19,931 DEBUG [org.ovirt.engine.core.bll.scheduling.VdsFreeMemoryChecker] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] not enough memory on host. throttling...
2015-12-08 10:17:19,931 DEBUG [org.ovirt.engine.core.bll.RunVmCommandBase] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] Try to wait for te engine update the host memory and cpu stats

After throttling..

2015-12-08 10:17:29,936 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] hasMemoryToRunVM: host 'alma06.qa.lab.tlv.redhat.com' physical vmem size is : 15772 MB
2015-12-08 10:17:29,936 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] Host Mem Conmmitted: '14531', pending vmem size is : 0, Host Guest Overhead 65, Host Reserved Mem: 321, VM Min Allocated Mem 12288
2015-12-08 10:17:29,937 DEBUG [org.ovirt.engine.core.bll.scheduling.SlaValidator] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] 27205.0 <= ???  15772.0
2015-12-08 10:17:29,937 DEBUG [org.ovirt.engine.core.bll.scheduling.policyunits.MemoryPolicyUnit] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] Host 'alma06.qa.lab.tlv.redhat.com' has 920 MB available. Insufficient memory to run the VM
2015-12-08 10:17:29,938 INFO  [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] Candidate host 'alma06.qa.lab.tlv.redhat.com' ('f6735fa9-4ee5-47ce-b750-a87863736cc2') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'Memory' (correlation id: 3fae3cd4)
2015-12-08 10:17:29,939 DEBUG [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] Scheduling ended, correlation Id: 3fae3cd4
2015-12-08 10:17:29,939 ERROR [org.ovirt.engine.core.bll.RunVmCommand] (org.ovirt.thread.pool-7-thread-50) [3fae3cd4] Can't find VDS to run the VM 'c2854589-ccdc-4ccb-9b78-73dc0f0f78b8' on, so this VM will not be run.

Notice that pending was cleared again at about the same time, but this time the committed memory contains the memory of the first VM (the second is still not counted there though).

2015-12-08 10:17:29,939 DEBUG [org.ovirt.engine.core.bll.scheduling.pending.PendingResourceManager] (ForkJoinPool-1-worker-11) [] Clearing pending resources for VM cfc7561e-6683-468e-8298-03c1a7e79fa1


So we really have a bug here, the committed memory is not updated yet when we clear the pending memory of the VM.
Comment 4 Martin Sivák 2016-02-04 05:51:06 EST
So we analyzed the issue with Arik and the conclusion is that this is caused by the missing interaction between host and vm monitoring structures courtesy of the new event mechanism.

The pending values are cleared once the VM monitor get the asynchronous message announcing the state change to Powering Up (via RunVmCommnand#OnPowerringUp, SchedulingManager#clearVm and VdsManager#updatePendingData methods). But the committed memory of host is not recomputed here.

The committed memory is recomputed in HostMonitoring#refreshCommitedMemory which gets called when the polling returns new host data. But that can take up to 15 seconds.

So we have at max a 15 second window between when we clear the pending values and when we update the committed memory information.

The solution would be to call refreshCommitedMemory from the updatePendingData, but there is one catch. refreshCommitedMemory depends on getLastVmsList that still does not contain the VM in question. And we would have to insert the VM there too, before we ask for recomputing the committed memory.
Comment 5 Artyom 2016-02-21 04:56:01 EST
Verified on rhevm-3.6.3.2-0.1.el6.noarch
1) Create two vms with 7Gb of the memory and guaranteed memory
2) Start two vms simultaneously on the host with 13Gb of the "Max free Memory for scheduling new VMs"

Just one vm succeed to start, other failed because memory filter

Note You need to log in before you can comment on or make changes to this bug.