Bug 1717007 - "Max free Memory for scheduling new VMs" is not released after VM is migrated from the host.
Summary: "Max free Memory for scheduling new VMs" is not released after VM is migrated from the host.
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.3.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ovirt-4.3.4
Target Release: ---
Assignee: Andrej Krejcir
QA Contact: Polina
URL:
Whiteboard:
Depends On: 1651406
Blocks:
 
Reported: 2019-06-04 14:17 UTC by Polina
Modified: 2019-06-20 11:48 UTC
CC: 7 users

Fixed In Version: 4.3.4-7
Clone Of:
Environment:
Last Closed: 2019-06-20 11:48:04 UTC
oVirt Team: SLA
Embargoed:
sbonazzo: ovirt-4.3?
mavital: blocker?
mavital: planning_ack?
pm-rhel: devel_ack+
mavital: testing_ack+


Attachments
engine and screenshots (3.14 MB, application/gzip)
2019-06-04 14:17 UTC, Polina
no flags
vdsm logs (79 bytes, text/plain)
2019-06-04 15:01 UTC, Polina
no flags


Links
System: oVirt gerrit 100530 | Status: MERGED | Summary: scheduler: Fix Guid != comparison | Last Updated: 2021-02-11 08:55:55 UTC

Description Polina 2019-06-04 14:17:24 UTC
Created attachment 1577147 [details]
engine and screenshots

Description of problem: "Max free Memory for scheduling new VMs" is not released after a VM is migrated away from the host. The initial value on the host is 30454 MB. When the HE VM is migrated to this host, "Max free Memory for scheduling new VMs" decreases to 13838 MB, but when the VM is later migrated off this host the value stays at 13838 MB. In this way, after several migrations (balancing), the host shows "Max free Memory for scheduling new VMs" = 0 while no VM is running on it, and as a result no VM can be started there.

vdsm-client reports the memory correctly, and "Physical Memory" in the engine is also shown correctly. The problem is only with "Max free Memory for scheduling new VMs", which can drop to 0 this way and prevent VMs from being scheduled on the host while no VM actually resides there. The value is not released by a host reboot; it is released only if I remove the host from the engine and re-install it.


Version-Release number of selected component (if applicable): ovirt-engine-4.3.4.2-0.1.el7.noarch


How reproducible: Reproduced during a tier3 automation run, and afterwards manually after several migrations/balancings of the HE VM (a regular VM works as well; just configure it with 16384 MB so the memory decrease is faster and easier to see).


Steps to Reproduce:
There is no well-defined scenario - it happened after many migrations/balancing runs of the HE VM.

Actual results: "Max free Memory for scheduling new VMs" is 0 MB while actually no VMs are running on the host.

Expected results: the memory must be freed when the VM is migrated off the host.

Additional info: from discussion with Andrej:

I have looked at the environment, and it looks like the cause is the same as for the issue we discussed yesterday:
balancing a VM does not release the reserved memory when the VM is scheduled to the same host.

Querying the DB on the engine machine we see:

engine=# select vds_name, pending_vcpus_count, pending_vmem_size from vds;
   vds_name   | pending_vcpus_count | pending_vmem_size 
--------------+---------------------+-------------------
 host_mixed_2 |                   0 |                 0
 host_mixed_1 |                   5 |             17786
 host_mixed_3 |                   0 |                 0
(3 rows)

The reserved resources on host_mixed_1 are not released.

The patch that should fix it in 4.3 was merged today: https://gerrit.ovirt.org/#/c/100530/
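
The patch title ("scheduler: Fix Guid != comparison") points at a Java reference-equality pitfall: comparing two Guid objects with != checks whether they are the same instance, not whether they hold the same value. A minimal standalone sketch of the failure mode, using a hypothetical Guid stand-in rather than the actual engine code:

import java.util.Objects;
import java.util.UUID;

// Simplified stand-in for org.ovirt.engine.core.compat.Guid: a wrapper with
// value-based equals(); two distinct instances can wrap the same UUID.
class Guid {
    private final UUID uuid;

    Guid(UUID uuid) {
        this.uuid = uuid;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Guid && uuid.equals(((Guid) o).uuid);
    }

    @Override
    public int hashCode() {
        return uuid.hashCode();
    }
}

public class GuidComparisonDemo {
    public static void main(String[] args) {
        UUID raw = UUID.randomUUID();
        Guid sourceHost = new Guid(raw); // host the VM currently runs on
        Guid targetHost = new Guid(raw); // same host, picked again by balancing

        // Buggy pattern: '!=' compares references, so two distinct Guid
        // instances for the same host look "different" and the pending
        // memory reserved for the migration is never released.
        System.out.println("'!=' sees different hosts: " + (sourceHost != targetHost)); // true

        // Fixed pattern: value equality via Objects.equals().
        System.out.println("equals() sees the same host: " + Objects.equals(sourceHost, targetHost)); // true
    }
}

Under reference comparison, a balancing move back to the same host would be treated as a real migration and the pending memory would stay reserved, which matches the nonzero pending_vmem_size seen in the query above.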

As a workaround, the pending resources are released when the ovirt-engine service is restarted.

Comment 1 Polina 2019-06-04 15:01:43 UTC
Created attachment 1577160 [details]
vdsm logs

Comment 2 Sandro Bonazzola 2019-06-05 07:14:19 UTC
Marking as regression since, according to Meital, it wasn't reproduced in 4.3.3.

Comment 3 Polina 2019-06-13 08:58:25 UTC
Verified on ovirt-engine-4.3.4.3-0.1.el7.noarch (build http://bob-dr.lab.eng.brq.redhat.com/builds/4.3/rhv-4.3.4-7) by running the whole tier3 automation suite and by repeatedly running the balancing tests.

Comment 4 Sandro Bonazzola 2019-06-20 11:48:04 UTC
This bugzilla is included in the oVirt 4.3.4 release, published on June 11th 2019.

Since the problem described in this bug report should be
resolved in oVirt 4.3.4 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

