Bug 1559694 - RFE: warn user if VM does not fit in a single numa node of the host
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.1.10
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ovirt-4.3.0
Target Release: 4.3.0
Assignee: Andrej Krejcir
QA Contact: Polina
URL:
Whiteboard: verified_upstream
Depends On:
Blocks: 902971
 
Reported: 2018-03-23 03:29 UTC by Germano Veit Michel
Modified: 2019-05-08 12:37 UTC (History)

Fixed In Version: ovirt-engine-4.3.0_alpha
Doc Type: Enhancement
Doc Text:
If a VM does not use virtual NUMA nodes, it is better if its whole memory can fit into a single NUMA node on the host. Otherwise, there may be some performance overhead.

There are two additions in this RFE:

1. A new warning message is shown in the audit log if a VM is run on a host where its memory cannot fit into a single host NUMA node.

2. A new policy unit is added to the scheduler: 'Fit VM to single host NUMA node'. When starting a VM, this policy prefers hosts where the VM can fit into a single NUMA node. This unit is not active by default, because it can cause undesired edge cases.

For example, the policy unit would cause the following behavior when starting multiple VMs. In the following setup:
- 9 hosts with 16 GB per NUMA node
- 1 host with 4 GB per NUMA node

When multiple VMs with 6 GB of memory are scheduled, the scheduling unit would prevent them from starting on the host with 4 GB per NUMA node, no matter how overloaded the other hosts are. It would use the last host only when all the others do not have enough free memory to run the VM.
Clone Of:
Environment:
Last Closed: 2019-05-08 12:37:22 UTC
oVirt Team: Virt
pagranat: testing_plan_complete+



Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHEA-2019:1085 None None None 2019-05-08 12:37:42 UTC
oVirt gerrit 93532 master MERGED core: Warn if VM cannot fit to a single NUMA node. 2018-08-30 11:29:35 UTC
oVirt gerrit 93549 master MERGED core: Remove NumaUtils class 2018-08-28 11:30:19 UTC
oVirt gerrit 93550 master MERGED scheduler: Prefer hosts where the VM fits to a NUMA node 2018-09-03 11:07:27 UTC

Description Germano Veit Michel 2018-03-23 03:29:43 UTC
Description of problem:

We get several performance cases that end up being diagnosed with incorrect VM configuration. 

The most common is a VM with N vCPUs and M memory running on a host with fewer than N pCPUs and less than M memory per NUMA node. This hurts performance, sometimes so badly that other VMs and the host itself are affected, because the host struggles to move pages across nodes.

Another idea would be for vdsm to monitor numastat and forward it to the engine once a certain threshold is crossed.

The user should be notified in such cases to apply the correct configuration or, maybe even better, be advised to use the new 'High Performance' VM type.

Comment 2 Andrej Krejcir 2018-08-02 13:48:20 UTC
Adding a warning to the audit log when the VM starts should be simple.

We can also add a policy to the scheduler so it prefers hosts, where the VM can fit to a single NUMA node.

Comment 3 Martin Tessun 2018-08-22 14:55:14 UTC
(In reply to Andrej Krejcir from comment #2)
> Adding a warning to the audit log when the VM starts should be simple.
> 
> We can also add a policy to the scheduler so it prefers hosts, where the VM
> can fit to a single NUMA node.

I would go for both.
Still there will be cases where a VM will span multiple NUMA nodes, so the warning should be something like "... Consider using vNUMA and NUMA pinning for this VM".

Once we have HP Live migration in place the NUMA pinning should no longer prevent the migration.

Comment 4 Polina 2018-10-03 14:41:10 UTC
BZ verified on ovirt-release-master-4.3.0-0.1.master.20180906000056.git3e0522a.el7.noarch

===test1
host

numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0 2
node 0 size: 16349 MB
node 0 free: 14036 MB
node 1 cpus: 4 6
node 1 size: 8192 MB
node 1 free: 6172 MB
node 2 cpus: 1 3
node 2 size: 16384 MB
node 2 free: 13932 MB
node 3 cpus: 5 7
node 3 size: 8175 MB
node 3 free: 6251 MB
node distances:

Configure a VM without NUMA nodes and with 18432 MB of memory. This is more than any single NUMA node, but less than the free memory on the host.
A warning appears in Events: "VM golden_env_mixed_virtio_0 does not fit to a single NUMA node on host host_mixed_1. This may negatively impact its performance. Consider using vNUMA and NUMA pinning for this VM."
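
The check behind this warning can be sketched in Python as follows. This is only an illustration, not the ovirt-engine code: the helper names parse_node_sizes and fits_single_node are hypothetical, and comparing the VM memory against each node's total size (rather than its free memory) is an assumption, since this report does not say which value the engine uses. The input is the numactl -H output from test1.

```python
import re

# numactl -H output from test1 (the truncated "node distances:" part is omitted)
NUMACTL_OUTPUT = """\
available: 4 nodes (0-3)
node 0 cpus: 0 2
node 0 size: 16349 MB
node 0 free: 14036 MB
node 1 cpus: 4 6
node 1 size: 8192 MB
node 1 free: 6172 MB
node 2 cpus: 1 3
node 2 size: 16384 MB
node 2 free: 13932 MB
node 3 cpus: 5 7
node 3 size: 8175 MB
node 3 free: 6251 MB
"""


def parse_node_sizes(numactl_text):
    """Return {node_id: size_mb} parsed from `numactl -H` style output."""
    sizes = {}
    pattern = r"^node (\d+) size: (\d+) MB$"
    for match in re.finditer(pattern, numactl_text, re.MULTILINE):
        sizes[int(match.group(1))] = int(match.group(2))
    return sizes


def fits_single_node(vm_mem_mb, node_sizes):
    """True if the VM memory fits into at least one host NUMA node.

    Assumption: the comparison uses the total node size, not free memory.
    """
    return any(vm_mem_mb <= size for size in node_sizes.values())


node_sizes = parse_node_sizes(NUMACTL_OUTPUT)
vm_mem_mb = 18432  # VM memory used in test1
if not fits_single_node(vm_mem_mb, node_sizes):
    print("VM does not fit to a single NUMA node on this host")
```

With the test1 values, the largest node is 16384 MB, so the 18432 MB VM does not fit and the message is printed.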


===test2 

Set the new policy unit "Fit VM to single host NUMA node" (Administration -> Configure -> Scheduling Policies. Copy the 'none' policy. In the dialog, enable the weight module 'Fit VM to single host NUMA node').

two hosts: 
host1 - 
numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0 2
node 0 size: 16349 MB
node 0 free: 13684 MB
node 1 cpus: 4 6
node 1 size: 8192 MB
node 1 free: 5878 MB
node 2 cpus: 1 3
node 2 size: 16384 MB
node 2 free: 13841 MB
node 3 cpus: 5 7
node 3 size: 8175 MB
node 3 free: 5722 MB

host2 - no numa node

Run a VM without NUMA nodes with 18 GB of memory. It runs on the host without NUMA, even though it could fit on the host with NUMA.
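
The outcome of test2 is consistent with a weight function along these lines. This is a minimal Python sketch, assuming that lower weights are preferred and that a host without NUMA nodes counts as fitting. numa_fit_weight and rank_hosts are hypothetical names, not the engine's API, and the real policy unit's weighting may differ.

```python
def numa_fit_weight(vm_mem_mb, node_sizes_mb):
    """Return 0 if the VM fits in one NUMA node, else 1.

    Assumptions: lower weight means a more preferred host, and a host
    with no NUMA nodes (empty list) is treated as fitting.
    """
    if not node_sizes_mb:
        return 0
    return 0 if any(vm_mem_mb <= size for size in node_sizes_mb) else 1


def rank_hosts(vm_mem_mb, hosts):
    """Sort host names by weight; sorted() is stable, so ties keep input order."""
    return sorted(hosts, key=lambda name: numa_fit_weight(vm_mem_mb, hosts[name]))


# Hosts from test2: host1 has four NUMA nodes, host2 has none.
test2_hosts = {
    "host1": [16349, 8192, 16384, 8175],
    "host2": [],
}
print(rank_hosts(18 * 1024, test2_hosts))  # host2 is ranked first

# Edge case from the Doc Text: a 6 GB VM, 9 hosts with 16 GB per node,
# 1 host with 4 GB per node. Two nodes per host is an arbitrary choice here.
doc_hosts = {f"big{i}": [16384, 16384] for i in range(1, 10)}
doc_hosts["small"] = [4096, 4096]
print(rank_hosts(6 * 1024, doc_hosts)[-1])  # the 4 GB host is ranked last
```

A weight like this looks only at NUMA fit and ignores host load, which matches the edge case given in the Doc Text as the reason the unit is not active by default.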

Comment 5 RHV Bugzilla Automation and Verification Bot 2018-12-10 15:12:50 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{'rhevm-4.3-ga': '?'}', ]

For more info please contact: rhv-devops@redhat.com

Comment 6 RHV Bugzilla Automation and Verification Bot 2019-01-15 23:35:20 UTC
WARN: Bug status (ON_QA) wasn't changed but the following should be fixed:

[Found non-acked flags: '{'rhevm-4.3-ga': '?'}', ]

For more info please contact: rhv-devops@redhat.com

Comment 7 Polina 2019-01-22 15:45:52 UTC
verification - https://bugzilla.redhat.com/show_bug.cgi?id=1559694#c4

Comment 9 errata-xmlrpc 2019-05-08 12:37:22 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:1085

