Bug 1720558 - can't use vNUMA when VM RAM is bigger than half of host RAM
Keywords:
Status: CLOSED DUPLICATE of bug 1812316
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: General
Version: 4.3.3.7
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.4.3
Assignee: Lucia Jelinkova
QA Contact: meital avital
URL:
Whiteboard:
Duplicates: 1745247
Depends On:
Blocks:
 
Reported: 2019-06-14 08:27 UTC by matthias.leopold
Modified: 2023-09-07 20:08 UTC
8 users

Fixed In Version:
Clone Of:
Environment:
Last Closed: 2020-04-29 09:15:54 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.4+


Attachments:

Description matthias.leopold 2019-06-14 08:27:00 UTC
Description of problem:
After configuring a High Performance VM with vNUMA and hugepages, the VM cannot be started; the start fails with the error message
"The host foo did not satisfy internal filter NUMA because cannot accommodate memory of VM's pinned virtual NUMA nodes within host's physical NUMA nodes"

Version-Release number of selected component (if applicable):
ovirt-engine-4.3.3.7-1.el7.noarch
vdsm-4.30.17-1.el7.x86_64

How reproducible:
Configure the hypervisor host and VM for vNUMA and hugepages, then start the VM.

Steps to Reproduce:
1. add "hugepagesz=1G hugepages=512" to the kernel cmdline of the hypervisor host (768G RAM, 2 physical 8-core CPUs, HT enabled) and reboot
2. create a VM optimized for High Performance with "Memory Size: 524288 MB" and 2 virtual sockets (topology: 2:14:1)
3. configure the VM with its vCPUs pinned evenly to the 2 CPUs of the hypervisor host
4. configure the VM with 2 vNUMA nodes that are pinned to the 2 NUMA nodes of the hypervisor host
5. configure the VM with the custom property "hugepages=1048576"
6. start the VM
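The host-side hugepage reservation and its verification can be sketched as follows (a sketch based on the values in this report; the sysfs paths are standard Linux, and the engine custom property is set in the Administration Portal, not on the host):

```shell
# Reserve 512 x 1G hugepages at boot by appending to the kernel
# command line (e.g. via /etc/default/grub), then rebuild the grub
# config and reboot:
#   hugepagesz=1G hugepages=512

# After reboot, the 1G pool is visible under sysfs (note that the
# default-size pool reported in /proc/meminfo stays at 0 unless
# default_hugepagesz=1G is also set):
cat /sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
# and its per-NUMA-node split:
cat /sys/devices/system/node/node*/hugepages/hugepages-1048576kB/nr_hugepages

# The VM custom property is entered in the engine UI with the page
# size in KiB (1 GiB = 1048576 KiB):
#   hugepages=1048576
```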

Actual results:
The VM fails to start.
Error message in the UI:
"The host foo did not satisfy internal filter NUMA because cannot accommodate memory of VM's pinned virtual NUMA nodes within host's physical NUMA nodes"
Error message in engine.log:
"2019-06-14 09:58:25,076+02 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-12013) [] EVENT_ID: USER_FAILED_RUN_VM(54), Failed to run VM bar due to a failed validation: [Cannot run VM. There is no host that satisfies current scheduling constraints. See below for details:, The host foo did not satisfy internal filter NUMA because cannot accommodate memory of VM's pinned virtual NUMA nodes within host's physical NUMA nodes..] (User: someone@internal-authz).
2019-06-14 09:58:25,076+02 WARN  [org.ovirt.engine.core.bll.RunVmCommand] (default task-12013) [] Validation of action 'RunVm' failed for user someone3@internal-authz. Reasons: VAR__ACTION__RUN,VAR__TYPE__VM,SCHEDULING_ALL_HOSTS_FILTERED_OUT,VAR__FILTERTYPE__INTERNAL,$hostName foo,$filterName NUMA,VAR__DETAIL__NOT_MEMORY_PINNED_NUMA,SCHEDULING_HOST_FILTERED_REASON_WITH_DETAIL"

Expected results:
VM starts

Additional info:
"bar" is the only VM on host "foo".
Host "foo" is the only host in the cluster.

Comment 1 matthias.leopold 2019-06-14 15:42:34 UTC
I have to add that this problem only occurs when VM RAM > (host RAM / 2).
When the VM RAM fits into what is left on the host after allocating hugepages, the VM can start.
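The arithmetic behind this observation can be sketched with the numbers from this report (a minimal illustration of what the filter *appears* to check; all names here are hypothetical and this is not the actual ovirt-engine filter code):

```python
# Hypothetical sketch of the per-NUMA-node memory check, using the
# numbers from this report: 768G host, 512 x 1G hugepages reserved,
# 2 host NUMA nodes. Names and logic are illustrative assumptions.

HOST_RAM_GIB = 768
HUGEPAGES_GIB = 512
HOST_NUMA_NODES = 2

def fits(vm_ram_gib, vm_numa_nodes=2, hugepage_backed=True):
    """Return True if each vNUMA node fits its pinned host NUMA node."""
    per_vnode = vm_ram_gib / vm_numa_nodes
    if hugepage_backed:
        # A hugepage-backed VM should be checked against the
        # per-node hugepage pool ...
        per_host_node = HUGEPAGES_GIB / HOST_NUMA_NODES
    else:
        # ... not against the remaining "normal" memory per node,
        # which is what the filter seems to do in this report.
        per_host_node = (HOST_RAM_GIB - HUGEPAGES_GIB) / HOST_NUMA_NODES
    return per_vnode <= per_host_node

# 512G VM, 2 vNUMA nodes: fits the hugepage pool (256 <= 256) ...
print(fits(512, hugepage_backed=True))    # True
# ... but not the leftover normal memory (256 > 128), which would
# explain the observed rejection:
print(fits(512, hugepage_backed=False))   # False
# A 192G VM (96 <= 128) fits either way, matching this comment:
print(fits(192, hugepage_backed=False))   # True
```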
As suggested by akrejcir, I created a scheduling policy for the cluster without the NUMA filter and could start the VM described above with no problems.
When using a standard scheduling policy with the NUMA filter, I can start the VM with, e.g., 192G RAM (the host still has 512G of hugepages allocated).
When comparing the VM XML definitions, I can see that <numa> and <hugepages> are correctly configured in both cases. Only the NUMA filter in the scheduler apparently makes some wrong assumptions before starting the VM.
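For reference, the <numa> and <hugepages> elements compared above correspond to libvirt domain XML fragments roughly like the following (a sketch using this report's numbers, 512 GiB split over 2 vNUMA cells and 28 vCPUs; the cell ids, cpusets, and pinning mode are illustrative, not taken from the actual VM definition):

```xml
<!-- Hugepage backing: 1 GiB pages (1048576 KiB) -->
<memoryBacking>
  <hugepages>
    <page size="1048576" unit="KiB"/>
  </hugepages>
</memoryBacking>

<!-- 2 vNUMA cells of 256 GiB (268435456 KiB) each -->
<cpu>
  <numa>
    <cell id="0" cpus="0-13" memory="268435456" unit="KiB"/>
    <cell id="1" cpus="14-27" memory="268435456" unit="KiB"/>
  </numa>
</cpu>

<!-- Pinning each vNUMA cell to one host NUMA node -->
<numatune>
  <memnode cellid="0" mode="strict" nodeset="0"/>
  <memnode cellid="1" mode="strict" nodeset="1"/>
</numatune>
```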

Comment 2 Ryan Barry 2019-06-15 02:32:55 UTC
Here's a good SLA bug to start with, since Andrej already nailed down the cause

Comment 4 matthias.leopold 2019-07-03 13:49:49 UTC
I found that this behaviour is not linked to hugepages: it also occurs when using vNUMA only. I know the nature of this bug has changed considerably since I first reported it, but the bug in the NUMA filter is still there. I'll change the bug title again.

Comment 5 Ryan Barry 2019-08-25 03:28:27 UTC
*** Bug 1745247 has been marked as a duplicate of this bug. ***

Comment 9 Lucia Jelinkova 2020-04-29 09:15:54 UTC

*** This bug has been marked as a duplicate of bug 1812316 ***

