Description of problem:
After configuring a High Performance VM with vNUMA and hugepages, the VM can't be started. The error message is "The host foo did not satisfy internal filter NUMA because cannot accommodate memory of VM's pinned virtual NUMA nodes within host's physical NUMA nodes".

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.3.7-1.el7.noarch
vdsm-4.30.17-1.el7.x86_64

How reproducible:
Configure the hypervisor host and the VM to use vNUMA and hugepages, then start the VM.

Steps to Reproduce:
1. Add "hugepagesz=1G hugepages=512" to the kernel cmdline of the hypervisor host (768G RAM, 2 physical 8-core CPUs, HT enabled) and reboot.
2. Create a VM optimized for High Performance with "Memory Size: 524288 MB" and 2 virtual sockets (topology: 2:14:1).
3. Pin the VM's vCPUs evenly to the 2 CPUs of the hypervisor host.
4. Configure the VM with 2 vNUMA nodes pinned to the 2 NUMA nodes of the hypervisor host.
5. Set the VM custom property "hugepages=1048576".
6. Start the VM.

Actual results:
The VM can't be started.

Error message in the UI:
"The host foo did not satisfy internal filter NUMA because cannot accommodate memory of VM's pinned virtual NUMA nodes within host's physical NUMA nodes"

Error messages in engine.log:
"2019-06-14 09:58:25,076+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-12013) [] EVENT_ID: USER_FAILED_RUN_VM(54), Failed to run VM bar due to a failed validation: [Cannot run VM. There is no host that satisfies current scheduling constraints. See below for details:, The host foo did not satisfy internal filter NUMA because cannot accommodate memory of VM's pinned virtual NUMA nodes within host's physical NUMA nodes..] (User: someone@internal-authz).
2019-06-14 09:58:25,076+02 WARN [org.ovirt.engine.core.bll.RunVmCommand] (default task-12013) [] Validation of action 'RunVm' failed for user someone3@internal-authz. Reasons: VAR__ACTION__RUN,VAR__TYPE__VM,SCHEDULING_ALL_HOSTS_FILTERED_OUT,VAR__FILTERTYPE__INTERNAL,$hostName foo,$filterName NUMA,VAR__DETAIL__NOT_MEMORY_PINNED_NUMA,SCHEDULING_HOST_FILTERED_REASON_WITH_DETAIL"

Expected results:
The VM starts.

Additional info:
"bar" is the only VM on host "foo".
Host "foo" is the only host in the cluster.
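For context on the numbers above, here is a minimal, hypothetical sketch in Python of a per-NUMA-node check that would produce this rejection. It is NOT the actual ovirt-engine NUMA filter code; it only assumes, as one plausible reading of the failure, that the filter compares each pinned vNUMA node's memory against the host node's free memory with the 1G hugepage reservation already subtracted. All constants are taken from the report.

# Hypothetical arithmetic only -- not the real ovirt-engine NUMA filter.
# Assumption: the filter tests each pinned vNUMA node against the host
# NUMA node's free memory, from which the 1G hugepage reservation has
# already been subtracted.

HOST_NODE_TOTAL_MB = 384 * 1024       # 768G host RAM split over 2 NUMA nodes
HOST_NODE_HUGEPAGES_MB = 256 * 1024   # 512 x 1G hugepages split over 2 nodes
HOST_NODE_FREE_MB = HOST_NODE_TOTAL_MB - HOST_NODE_HUGEPAGES_MB  # 128G left

VM_MEMORY_MB = 524288                 # "Memory Size: 524288 MB" from above
VM_NODE_MB = VM_MEMORY_MB // 2        # 256G per pinned vNUMA node


def vnuma_node_fits(vnode_mb: int, host_free_mb: int) -> bool:
    """Schematic per-node accommodation test."""
    return vnode_mb <= host_free_mb


if __name__ == "__main__":
    # 256G > 128G, so each host node "fails" and the host is filtered out,
    # even though the hugepages reserved on each node are meant to back the
    # vNUMA node's memory.
    print(vnuma_node_fits(VM_NODE_MB, HOST_NODE_FREE_MB))  # False

Under this assumption a 192G VM (96G per vNUMA node) would pass the check, which matches the behaviour described in the next comment.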
I have to add that this problem only occurs when VM RAM > (host RAM / 2). When the VM's RAM fits into what is left on the host after allocating hugepages, the VM can start. As suggested by akrejcir, I created a scheduling policy for the cluster without the NUMA filter and could start the VM described above with no problems. With a standard scheduling policy that includes the NUMA filter, I can start the VM with, e.g., 192G RAM (while the host still has 512G of hugepages allocated). Comparing the VM XML definitions shows that <numa> and <hugepages> are correctly configured in both cases, so it appears that only the NUMA filter in the scheduler makes some wrong assumption about available memory before starting the VM.
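For anyone reproducing this, the per-node hugepage reservation that the working (filter-less) run relies on can be verified directly on the host. The sysfs paths below are the standard Linux kernel interface for per-node hugepage counters; the small Python helper around them is just an illustrative convenience and not part of oVirt/VDSM.

# Read the per-NUMA-node 1G hugepage counters from sysfs on the host.
# The sysfs paths are the standard kernel interface; this helper is only
# an illustration, not oVirt/VDSM code.
from pathlib import Path

def node_hugepages(node: int, size_kb: int = 1048576) -> dict:
    base = Path(f"/sys/devices/system/node/node{node}/hugepages/"
                f"hugepages-{size_kb}kB")
    return {
        "total": int((base / "nr_hugepages").read_text()),
        "free": int((base / "free_hugepages").read_text()),
    }

if __name__ == "__main__":
    # With "hugepagesz=1G hugepages=512" the kernel spreads the reservation
    # across nodes, so each of the 2 nodes should show roughly 256 total
    # 1G pages -- enough to back one 256G pinned vNUMA node.
    for n in (0, 1):
        print(n, node_hugepages(n))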
Here's a good SLA bug to start with, since Andrej already nailed down the cause.
I have since seen that this behaviour is not limited to hugepages; it also occurs when using vNUMA alone. I know the nature of this bug has changed considerably since I first reported it, but the bug in the NUMA filter is still there. I'll change the bug title again.
*** Bug 1745247 has been marked as a duplicate of this bug. ***
*** This bug has been marked as a duplicate of bug 1812316 ***