Bug 2056950 - VM with pinning policy=resize-and-pin should not start on NUMA-less hosts
Summary: VM with pinning policy=resize-and-pin should not start on NUMA-less hosts
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.5.0
Hardware: x86_64
OS: Linux
unspecified
high
Target Milestone: ovirt-4.5.2
: ---
Assignee: Liran Rotenberg
QA Contact: Polina
URL:
Whiteboard:
Duplicates: 2080995
Depends On:
Blocks:
Reported: 2022-02-22 12:30 UTC by Polina
Modified: 2022-08-30 08:47 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-08-30 08:47:42 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.5?


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | oVirt ovirt-engine pull 76 | 0 | None | Merged | Validate NUMA support and sockets equal to NUMA nodes for resize and pin | 2022-07-24 06:49:40 UTC
Red Hat Issue Tracker | RHV-44747 | 0 | None | None | None | 2022-02-22 12:48:48 UTC

Description Polina 2022-02-22 12:30:16 UTC
Description of problem:
If the hosts have no NUMA support, a VM with cpu_pinning_policy=resize_and_pin_numa can still be created and started once. However, an attempt to restart it on the same host, or to migrate it, fails on a NUMA check that should not be required.

Version-Release number of selected component (if applicable):
ovirt-engine-4.5.0-587.g28a2798.194.el8ev.noarch

How reproducible: 100%

Steps to Reproduce:
1. Create VM with cpu_pinning_policy=resize_and_pin_numa
POST https://{{host}}/ovirt-engine/api/vms/
<vm>
  <name>resize_and_pin_numa_add</name>
  <cluster>
    <name>golden_env_mixed_1</name>
  </cluster>
  <memory>4294967296</memory>
  <memory_policy>
    <ballooning>true</ballooning>
    <guaranteed>4294967296</guaranteed>
    <max>17179869184</max>
  </memory_policy>
  <template>
    <name>latest-rhel-guest-image-8.6-infra</name>
  </template>
  <cpu_pinning_policy>resize_and_pin_numa</cpu_pinning_policy>
  <placement_policy>
    <affinity>migratable</affinity>
    <hosts>
      <host href="/ovirt-engine/api/hosts/f7d7c0ea-88ac-4b81-9cc7-5d73f4684a65" id="f7d7c0ea-88ac-4b81-9cc7-5d73f4684a65"/>
    </hosts>
  </placement_policy>
</vm>

2. Run this VM. It starts OK.

3. Try to restart or migrate the VM.
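
For anyone scripting step 1, the request body can be generated instead of written by hand. The following is only a sketch: the helper name build_vm_request and its parameters are made up for illustration, and it only builds the XML shown above with the Python standard library. It does not send the request.

```python
import xml.etree.ElementTree as ET


def build_vm_request(name, cluster, template, host_id,
                     memory=4 * 1024**3, max_memory=16 * 1024**3,
                     policy="resize_and_pin_numa"):
    """Build the XML body from step 1 (hypothetical helper, not an oVirt API)."""
    vm = ET.Element("vm")
    ET.SubElement(vm, "name").text = name
    ET.SubElement(ET.SubElement(vm, "cluster"), "name").text = cluster
    ET.SubElement(vm, "memory").text = str(memory)

    memory_policy = ET.SubElement(vm, "memory_policy")
    ET.SubElement(memory_policy, "ballooning").text = "true"
    ET.SubElement(memory_policy, "guaranteed").text = str(memory)
    ET.SubElement(memory_policy, "max").text = str(max_memory)

    ET.SubElement(ET.SubElement(vm, "template"), "name").text = template
    ET.SubElement(vm, "cpu_pinning_policy").text = policy

    # Allow the VM to run on the given host while staying migratable.
    placement = ET.SubElement(vm, "placement_policy")
    ET.SubElement(placement, "affinity").text = "migratable"
    hosts = ET.SubElement(placement, "hosts")
    ET.SubElement(hosts, "host",
                  {"href": f"/ovirt-engine/api/hosts/{host_id}", "id": host_id})

    return ET.tostring(vm, encoding="unicode")


body = build_vm_request(
    name="resize_and_pin_numa_add",
    cluster="golden_env_mixed_1",
    template="latest-rhel-guest-image-8.6-infra",
    host_id="f7d7c0ea-88ac-4b81-9cc7-5d73f4684a65",
)
print(body)
```

The resulting string is what would be sent in the POST to /ovirt-engine/api/vms/ with an XML content type.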


Actual results:
Fails with the error:
2022-02-21 23:51:30,886+02 ERROR [org.ovirt.engine.api.restapi.resource.AbstractBackendResource] (default task-44) [] Operation Failed: [Cannot migrate VM. There is no host that satisfies current scheduling constraints. See below for details:, The host host_mixed_2 did not satisfy internal filter NUMA because does not support NUMA., The host host_mixed_2 did not satisfy internal filter NUMA because does not support NUMA.]
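
When checking many such failures, it can help to pull the host, filter name, and reason out of the message. This is a rough sketch that assumes the wording seen in the log line above ("The host X did not satisfy internal filter Y because Z."). The function name parse_filter_failures is made up, and other engine versions may phrase the message differently.

```python
import re

# Pattern based only on the message text quoted in this report.
FILTER_FAILURE = re.compile(
    r"The host (\S+) did not satisfy internal filter (\S+) because ([^.]+)\."
)

MESSAGE = (
    "Operation Failed: [Cannot migrate VM. There is no host that satisfies "
    "current scheduling constraints. See below for details:, "
    "The host host_mixed_2 did not satisfy internal filter NUMA because "
    "does not support NUMA., "
    "The host host_mixed_2 did not satisfy internal filter NUMA because "
    "does not support NUMA.]"
)


def parse_filter_failures(message):
    """Return unique (host, filter, reason) tuples in order of appearance."""
    matches = FILTER_FAILURE.findall(message)
    # dict.fromkeys drops repeated entries while keeping the original order.
    return list(dict.fromkeys(matches))


for host, filter_name, reason in parse_filter_failures(MESSAGE):
    print(f"{host}: rejected by {filter_name} ({reason})")
```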


Expected results:
The migration or restart must not enter this check when none of the hosts in the setup supports NUMA.

Additional info:
For QE: the problem can be reproduced on GE-4 (hosted-engine-04.lab.eng.tlv2.redhat.com)

Comment 1 Arik 2022-02-22 14:09:27 UTC
I'm not sure it's really a bug - if the user chose resize-and-pin and there's no host with NUMA that the VM can migrate to then it sounds ok
Polina, why do you consider this a bug?

Comment 2 Arik 2022-02-22 14:11:29 UTC
(In reply to Polina from comment #0)
> Expected results:
> The migration or restart must not enter this check when we work on the setup
> with all the hosts not supporting NUMA

Ah, including the host that the VM started on?

Comment 3 Liran Rotenberg 2022-02-22 14:27:11 UTC
(In reply to Arik from comment #2)
> (In reply to Polina from comment #0)
> > Expected results:
> > The migration or restart must not enter this check when we work on the setup
> > with all the hosts not supporting NUMA
> 
> Ah including the host that the VM started on?

Yes, none of the hosts has NUMA support within this cluster.

Comment 4 Arik 2022-02-22 14:28:20 UTC
(In reply to Liran Rotenberg from comment #3)
> (In reply to Arik from comment #2)
> > (In reply to Polina from comment #0)
> > > Expected results:
> > > The migration or restart must not enter this check when we work on the setup
> > > with all the hosts not supporting NUMA
> > 
> > Ah including the host that the VM started on?
> 
> Yes, none of the hosts has NUMA support within this cluster.

So the bug is about the fact that we started the VM?

Comment 5 Liran Rotenberg 2022-02-22 14:30:07 UTC
(In reply to Arik from comment #4)
> (In reply to Liran Rotenberg from comment #3)
> > (In reply to Arik from comment #2)
> > > (In reply to Polina from comment #0)
> > > > Expected results:
> > > > The migration or restart must not enter this check when we work on the setup
> > > > with all the hosts not supporting NUMA
> > > 
> > > Ah including the host that the VM started on?
> > 
> > Yes, none of the hosts has NUMA support within this cluster.
> 
> So the bug is about the fact we started the VM?

No, we can start a 'Resize and Pin' VM on hosts without NUMA support. It just won't apply the vNUMA changes (or, better said, it only changes the CPU topology and CPU pinning).
The VM did start, but somehow it didn't pass the NUMA scheduling policy in the other flows.

Comment 6 Arik 2022-02-22 16:04:12 UTC
To sum up an offline discussion -
Since we changed the timing of the pinning to be generated when scheduling the VM, the VM is more likely to get scheduled to a host without NUMA (because beforehand, users were not supposed to pin such VMs to hosts without NUMA).
We should either prioritize hosts with NUMA for VMs with the resize-and-pin policy or, even better, require NUMA when scheduling such VMs.
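
To illustrate the second option (requiring NUMA), here is a small sketch of a host filter. It is not the ovirt-engine code, which is Java. The Host and Vm classes, the numa_supported field, and filter_hosts are all hypothetical names chosen only to show the rule: a resize-and-pin VM gets no NUMA-less hosts as candidates, while other VMs are unaffected.

```python
from dataclasses import dataclass
from typing import List

RESIZE_AND_PIN = "resize_and_pin_numa"


@dataclass
class Host:
    name: str
    numa_supported: bool  # hypothetical flag, stands in for the host's NUMA capability


@dataclass
class Vm:
    name: str
    cpu_pinning_policy: str


def filter_hosts(vm: Vm, hosts: List[Host]) -> List[Host]:
    """Keep only hosts that the VM may be scheduled on under the NUMA rule."""
    if vm.cpu_pinning_policy != RESIZE_AND_PIN:
        # Other pinning policies are not restricted by this rule.
        return list(hosts)
    # Resize-and-pin VMs require a host with NUMA support.
    return [host for host in hosts if host.numa_supported]


hosts = [Host("host_mixed_1", False), Host("host_mixed_2", False)]
vm = Vm("resize_and_pin_numa_add", RESIZE_AND_PIN)
candidates = filter_hosts(vm, hosts)
print(candidates or "no host satisfies the NUMA requirement")
```

With a rule like this applied on every scheduling flow, including the first run, the VM in this report would be rejected consistently instead of starting once and then failing on restart or migration.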

Comment 7 Polina 2022-02-23 08:20:41 UTC
(In reply to Arik from comment #1)
> I'm not sure it's really a bug - if the user chose resize-and-pin and
> there's no host with NUMA that the VM can migrate to then it sounds ok
> Polina, why do you consider this a bug?


Hi, I missed the discussion. If we allow setting the resize-and-pin policy in a setup where none of the hosts supports NUMA, then the VM must be able to migrate there with no problem, because the VM is set up with no NUMA. Also, the second start on the same host must be allowed. That is why it is a bug.
If we decide not to allow the resize-and-pin policy for VMs in a non-NUMA environment, we should change the behavior.

Comment 8 Liran Rotenberg 2022-05-03 08:17:19 UTC
*** Bug 2080995 has been marked as a duplicate of this bug. ***

Comment 9 Polina 2022-07-26 09:45:29 UTC
Verified on ovirt-engine-4.5.2-0.3.el8ev.noarch.

Now such a VM cannot start on hosts that do not support NUMA.

Comment 10 Sandro Bonazzola 2022-08-30 08:47:42 UTC
This bugzilla is included in oVirt 4.5.2 release, published on August 10th 2022.
Since the problem described in this bug report should be resolved in oVirt 4.5.2 release, it has been closed with a resolution of CURRENT RELEASE.
If the solution does not work for you, please open a new bug report.

