Description of problem:

A host cannot be selected as a migration destination for a High Performance Virtual Machine because the NUMA filter expects the host to have enough normal free memory for the VM on each host NUMA node. But the host already has static hugepages reserved to accommodate the VM. This is very problematic if the host has a large amount of memory reserved for hugepages.

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.8.2-0.5.el7.noarch

How reproducible:
Always

Steps to Reproduce:

* The key to reproducing this is for the host to have less normal (non-hugepage) memory on each NUMA node than the VM uses (in hugepages) on each vNUMA node. The easiest way to reproduce it is to reserve almost all host memory for hugepages, so there is little free normal memory on each node but plenty of free hugepages. One way to do it:

1. The Virtual Machine has 4GB of RAM, divided into 2 vNUMA nodes:

<vm_numa_nodes>
  <vm_numa_node href="/ovirt-engine/api/vms/8f54d8f6-3a16-49cd-980d-73ec684796c5/numanodes/42c70765-a82c-4488-97f1-bc1146c5b213" id="42c70765-a82c-4488-97f1-bc1146c5b213">
    <cpu>
      <cores>
        <core>
          <index>0</index>
        </core>
      </cores>
    </cpu>
    <index>0</index>
    <memory>2048</memory>
    <numa_node_pins>
      <numa_node_pin>
        <index>0</index>
      </numa_node_pin>
    </numa_node_pins>
    <vm href="/ovirt-engine/api/vms/8f54d8f6-3a16-49cd-980d-73ec684796c5" id="8f54d8f6-3a16-49cd-980d-73ec684796c5"/>
  </vm_numa_node>
  <vm_numa_node href="/ovirt-engine/api/vms/8f54d8f6-3a16-49cd-980d-73ec684796c5/numanodes/1fae16f3-0ad7-4b70-b65a-227633ee81ed" id="1fae16f3-0ad7-4b70-b65a-227633ee81ed">
    <cpu>
      <cores>
        <core>
          <index>1</index>
        </core>
      </cores>
    </cpu>
    <index>1</index>
    <memory>2048</memory>
    <numa_node_pins>
      <numa_node_pin>
        <index>1</index>
      </numa_node_pin>
    </numa_node_pins>
    <vm href="/ovirt-engine/api/vms/8f54d8f6-3a16-49cd-980d-73ec684796c5" id="8f54d8f6-3a16-49cd-980d-73ec684796c5"/>
  </vm_numa_node>
</vm_numa_nodes>

2.
The VM is using 1G hugepages, so each of its vNUMA nodes uses 2x 1G hugepages:

<custom_properties>
  <custom_property>
    <name>hugepages</name>
    <value>1048576</value>
  </custom_property>
</custom_properties>

3. The destination host has 10GB of total memory (5GB per NUMA node), with 6GB reserved for hugepages:

engine=# select numa_node_index,mem_total,cpu_count,mem_free,usage_mem_percent from numa_node_cpus_view where vds_id = '966a05c2-493c-447d-85fd-cedafc4680ed';
 numa_node_index | mem_total | cpu_count | mem_free | usage_mem_percent
-----------------+-----------+-----------+----------+-------------------
               0 |      5119 |         2 |     2409 |                53
               1 |      5120 |         2 |      629 |                88

# grep HugePages_ /proc/meminfo
HugePages_Total:       6
HugePages_Free:        6

# cat /sys/devices/system/node/node0/hugepages/hugepages-1048576kB/nr_hugepages
2
# cat /sys/devices/system/node/node1/hugepages/hugepages-1048576kB/nr_hugepages
4

The host is totally idle, not running anything; the low mem_free is due to the hugepage reservation. And there are 2+ free hugepages on each host NUMA node, enough to run the VM.

4. But the engine denies the migration. Looking at the code, I think the low mem_free on node1 (629MB) is causing it in NumaPinningHelper.java. This is not right, as the host has free hugepages to accommodate the VM.

2020-03-11 10:54:46,207+10 DEBUG [org.ovirt.engine.core.bll.scheduling.policyunits.NumaPolicyUnit] (default task-5) [802ba36c-a677-4c61-822b-46c94ceec426] Host 'host2.kvm' cannot accommodate memory of VM's pinned virtual NUMA nodes within host's physical NUMA nodes
2020-03-11 10:54:46,211+10 INFO [org.ovirt.engine.core.bll.scheduling.SchedulingManager] (default task-5) [802ba36c-a677-4c61-822b-46c94ceec426] Candidate host 'host2.kvm' ('966a05c2-493c-447d-85fd-cedafc4680ed') was filtered out by 'VAR__FILTERTYPE__INTERNAL' filter 'NUMA' (correlation id: null)

5. Disabling the NUMA filter makes the scheduler stop filtering the host out.
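To illustrate the filter behavior described above, here is a minimal Java sketch (not the actual NumaPinningHelper.java code; class and method names are hypothetical) of a per-node check that only looks at normal free memory, next to one that accounts for free hugepages when the VM is hugepage-backed:

```java
// Hypothetical sketch of the NUMA filter's per-node memory check.
// All names and signatures are illustrative, not oVirt engine API.
public class NumaFitSketch {

    // Flawed check: compares the vNUMA node's memory only against the
    // host node's normal free memory, ignoring reserved hugepages.
    public static boolean fitsNormalMemOnly(long nodeMemFreeMb, long vNodeMemMb) {
        return nodeMemFreeMb >= vNodeMemMb;
    }

    // Corrected check: a hugepage-backed vNUMA node must fit into the host
    // node's free hugepages of the VM's page size; only a non-hugepage VM
    // is checked against normal free memory.
    public static boolean fitsWithHugePages(long nodeMemFreeMb, long freeHugePages,
                                            long hugePageSizeMb, long vNodeMemMb,
                                            boolean vmUsesHugePages) {
        if (vmUsesHugePages) {
            return freeHugePages * hugePageSizeMb >= vNodeMemMb;
        }
        return nodeMemFreeMb >= vNodeMemMb;
    }

    public static void main(String[] args) {
        // Host node1 from the report: 629 MB normal free memory, but
        // 4 free 1G hugepages; the vNUMA node needs 2048 MB of 1G pages.
        System.out.println(fitsNormalMemOnly(629, 2048));            // host filtered out
        System.out.println(fitsWithHugePages(629, 4, 1024, 2048, true)); // host accepted
    }
}
```

With the flawed check, node1's 629 MB of normal free memory fails against the 2048 MB vNUMA node even though 4 GB of free 1G hugepages sit unused on that node, which matches the scheduler log above.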
Actual results:
The scheduler filters out the destination host due to insufficient normal memory, but the VM uses reserved hugepages, which are free.

Expected results:
Migration is allowed.
*** Bug 1720558 has been marked as a duplicate of this bug. ***
Lucia, is there anything else that this bug is pending on?
Nothing that I am aware of.
Verified on vdsm-4.40.29-1.el8ev.x86_64, ovirt-engine-4.4.3.2-0.19.el8ev.noarch

According to the attached:
https://polarion.engineering.redhat.com/polarion/redirect/project/RHEVM3/workitem?id=RHEVM-27430
https://polarion.engineering.redhat.com/polarion/redirect/project/RHEVM3/workitem?id=RHEVM-27431
Hi Lucia, please review this doc text for the errata and release notes: With this enhancement, when scheduling a Virtual Machine with pinned NUMA nodes, memory requirements are calculated correctly by taking into account the available memory as well as allocated hugepages.
I'd maybe reword it a bit to: ...by taking into account hugepages allocated on NUMA nodes.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Low: Red Hat Virtualization security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5179