Bug 2079351 - VDSErrorException while migrating a dedicated VM which is part of positive enforcing affinity rule
Summary: VDSErrorException while migrating a dedicated VM which is part of positive enforcing affinity rule
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.5.0.4
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ovirt-4.5.2
Assignee: Liran Rotenberg
QA Contact: Polina
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-04-27 12:43 UTC by Polina
Modified: 2022-08-30 08:47 UTC (History)
3 users

Fixed In Version: ovirt-engine-4.5.2
Clone Of:
Environment:
Last Closed: 2022-08-30 08:47:42 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.5?


Attachments
engine log dump xmls (887.80 KB, application/gzip)
2022-04-27 12:43 UTC, Polina
no flags


Links
System ID Private Priority Status Summary Last Updated
Github oVirt ovirt-engine pull 380 0 None open Switch CPUPinningPolicy to handle group of VMs 2022-05-18 12:14:00 UTC
Github oVirt ovirt-engine pull 475 0 None open Release CPUs reminders 2022-06-19 17:12:18 UTC
Red Hat Issue Tracker RHV-45877 0 None None None 2022-04-27 12:49:40 UTC

Description Polina 2022-04-27 12:43:57 UTC
Created attachment 1875358 [details]
engine log dump xmls

Description of problem:
When migrating a VM that is part of a hard (enforcing) positive affinity rule, VDSM fails with the error VDSErrorException: Failed to MigrateBrokerVDS, error = Invalid parameter: {'reason': 'length of cpusets (0) must match number of CPUs (12)'}, code = 91

Version-Release number of selected component (if applicable):
ovirt-engine-4.5.0.4-0.1.el8ev.noarch

How reproducible:
100%

Steps to Reproduce:
configuration:
host1 (ibm-p8-rhevm-04.lab2.eng.bos.redhat.com)
CPU(s):               128
On-line CPU(s) list:  0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120
Off-line CPU(s) list: 1-7,9-15,17-23,25-31,33-39,41-47,49-55,57-63,65-71,73-79,81-87,89-95,97-103,105-111,113-119,121-127
Thread(s) per core:   1
Core(s) per socket:   4
Socket(s):            4
NUMA node(s):         4

host2  - the same

topology for VM1 
	    <topology>
                <cores>3</cores>
                <sockets>4</sockets>
                <threads>1</threads>
            </topology>
            
topology for VM2
            <topology>
                <cores>1</cores>
                <sockets>2</sockets>
                <threads>1</threads>
            </topology>
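Note: with the dedicated CPU pinning policy each vCPU gets an exclusive host CPU, so VM1 needs 4 sockets x 3 cores x 1 thread = 12 pCPUs and VM2 needs 2 x 1 x 1 = 2; together the affinity group needs 14 of a host's 16 online pCPUs (see also the sketch under Additional info below).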

1. Create a VM affinity rule on the cluster (positive, enforcing) that includes VM1 and VM2.
2. Run VM1 and VM2 on host1.
3. Select VM1 and migrate it with the option 'Migrate all VMs in positive enforcing affinity with selected VMs.'

Actual results:
2022-04-27 12:57:04,530+03 ERROR [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-68) [4cfb7805] Command 'org.ovirt.engine.core.bll.MigrateVmToServerCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to MigrateBrokerVDS, error = Invalid parameter: {'reason': 'length of cpusets (0) must match number of CPUs (12)'}, code = 91 (Failed with error InvalidParameter and code 91)

Expected results:


Additional info:
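For reference, the '12' in the error is VM1's vCPU count (4 sockets x 3 cores x 1 thread = 12), while the cpusets list sent with the migration request was empty. A minimal sketch of the kind of parameter check that produces this error (illustrative only, not VDSM's actual code; the function name is assumed):

    # Illustrative only -- mimics the check implied by the message
    # "length of cpusets (0) must match number of CPUs (12)".
    def validate_migration_cpusets(cpusets, num_vcpus):
        if len(cpusets) != num_vcpus:
            raise ValueError(
                "Invalid parameter: length of cpusets (%d) must match "
                "number of CPUs (%d)" % (len(cpusets), num_vcpus))

    # VM1: 4 sockets x 3 cores x 1 thread = 12 vCPUs, but an empty cpusets
    # list was sent, reproducing the reported failure.
    validate_migration_cpusets(cpusets=[], num_vcpus=4 * 3 * 1)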

Comment 1 Polina 2022-05-02 13:13:23 UTC
A correction for https://bugzilla.redhat.com/show_bug.cgi?id=2079351#c0 - both VMs are configured with the dedicated CPU pinning policy.

Comment 2 Liran Rotenberg 2022-05-02 13:50:13 UTC
I tested locally on master; it works for me.

Polina, can you please try again on another environment?

If it works, I would suspect that something went wrong when generating or allocating the new physical CPUs to use.

Comment 5 Polina 2022-06-19 16:07:43 UTC
The problem still happens in ovirt-engine-4.5.1.2-0.11.el8ev.noarch.
Re-assigning after discussion with Liran.

Comment 6 Liran Rotenberg 2022-06-19 17:12:18 UTC
We managed to solve one problem (the engine thought the migration was possible and sent the migration command with wrong parameters), which is what produced the VDSM error in the initial report.
Now the engine does not succeed in scheduling the VM group as desired.
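For context, a minimal sketch of why the dedicated pCPUs have to be allocated for the whole positive enforcing affinity group at once rather than per VM, which is the idea behind the linked 'Switch CPUPinningPolicy to handle group of VMs' change (illustrative only; the names below are assumptions, not the actual ovirt-engine code):

    # Illustrative only -- not the actual ovirt-engine scheduler code.
    def allocate_group_cpusets(free_pcpus, vcpus_per_vm):
        # Reserve disjoint dedicated pCPU sets for a group of VMs that must
        # run on the same host (positive enforcing affinity). Returns one
        # cpuset per VM, or None if the host cannot fit the entire group.
        free = list(free_pcpus)
        allocations = []
        for vcpus in vcpus_per_vm:
            if len(free) < vcpus:
                return None  # the host cannot hold the whole group
            allocations.append(free[:vcpus])
            free = free[vcpus:]
        return allocations

    # Host from the reproduction: 16 online pCPUs (0, 8, ..., 120).
    online = list(range(0, 128, 8))
    # VM1 needs 12 dedicated pCPUs and VM2 needs 2; together they fit
    # (14 <= 16), but only if they are placed as one group on the same host.
    print(allocate_group_cpusets(online, [12, 2]))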

Comment 7 Polina 2022-07-21 08:38:04 UTC
Verified on ovirt-engine-4.5.1.3-0.36.el8ev.noarch according to the description.

Comment 8 Sandro Bonazzola 2022-08-30 08:47:42 UTC
This bug is included in the oVirt 4.5.2 release, published on August 10th 2022.
Since the problem described in this bug report should be resolved in the oVirt 4.5.2 release, it has been closed with a resolution of CURRENT RELEASE.
If the solution does not work for you, please open a new bug report.

