Bug 2079351

Summary: VDSErrorException while migrating a dedicated VM which is part of positive enforcing affinity rule
Product: [oVirt] ovirt-engine
Reporter: Polina <pagranat>
Component: BLL.Virt
Assignee: Liran Rotenberg <lrotenbe>
Status: CLOSED CURRENTRELEASE
QA Contact: Polina <pagranat>
Severity: high
Docs Contact:
Priority: unspecified
Version: 4.5.0.4
CC: ahadas, bugs, dfodor
Target Milestone: ovirt-4.5.2
Flags: pm-rhel: ovirt-4.5?
Target Release: ---
Hardware: x86_64
OS: Linux
Whiteboard:
Fixed In Version: ovirt-engine-4.5.2
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2022-08-30 08:47:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: Virt
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Attachments:
engine log dump xmls (Description Flags: none)

Description Polina 2022-04-27 12:43:57 UTC
Created attachment 1875358 [details]
engine log dump xmls

Description of problem:
Migrating a VM that is part of a hard (enforcing) positive affinity rule fails in VDSM with: VDSErrorException: Failed to MigrateBrokerVDS, error = Invalid parameter: {'reason': 'length of cpusets (0) must match number of CPUs (12)'}, code = 91

Version-Release number of selected component (if applicable):
ovirt-engine-4.5.0.4-0.1.el8ev.noarch

How reproducible:
100%

Steps to Reproduce:
configuration:
host1 (ibm-p8-rhevm-04.lab2.eng.bos.redhat.com)
CPU(s):               128
On-line CPU(s) list:  0,8,16,24,32,40,48,56,64,72,80,88,96,104,112,120
Off-line CPU(s) list: 1-7,9-15,17-23,25-31,33-39,41-47,49-55,57-63,65-71,73-79,81-87,89-95,97-103,105-111,113-119,121-127
Thread(s) per core:   1
Core(s) per socket:   4
Socket(s):            4
NUMA node(s):         4

host2  - the same

topology for VM1 
	    <topology>
                <cores>3</cores>
                <sockets>4</sockets>
                <threads>1</threads>
            </topology>
            
topology for VM2
            <topology>
                <cores>1</cores>
                <sockets>2</sockets>
                <threads>1</threads>
            </topology>
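The topologies above imply vCPU counts of 12 for VM1 (4 sockets x 3 cores x 1 thread) and 2 for VM2, which is where the "number of CPUs (12)" in the error comes from. A minimal sketch (the `vcpu_count` helper is illustrative, not oVirt/VDSM code) of that arithmetic:

```python
import xml.etree.ElementTree as ET

def vcpu_count(topology_xml: str) -> int:
    """Total vCPUs implied by a libvirt <topology> element:
    sockets * cores * threads."""
    topo = ET.fromstring(topology_xml)
    return (int(topo.findtext("sockets"))
            * int(topo.findtext("cores"))
            * int(topo.findtext("threads")))

vm1 = "<topology><cores>3</cores><sockets>4</sockets><threads>1</threads></topology>"
vm2 = "<topology><cores>1</cores><sockets>2</sockets><threads>1</threads></topology>"

print(vcpu_count(vm1))  # 12 -- matches "number of CPUs (12)" in the error
print(vcpu_count(vm2))  # 2
```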

1. Create a VM affinity rule on the cluster (positive, enforcing) that includes VM1 and VM2.
2. Run VM1 and VM2 on host1.
3. Select VM1 and migrate it with the option 'Migrate all VMs in positive enforcing affinity with selected VMs'.

Actual results:
2022-04-27 12:57:04,530+03 ERROR [org.ovirt.engine.core.bll.MigrateVmToServerCommand] (default task-68) [4cfb7805] Command 'org.ovirt.engine.core.bll.MigrateVmToServerCommand' failed: EngineException: org.ovirt.engine.core.vdsbroker.vdsbroker.VDSErrorException: VDSGenericException: VDSErrorException: Failed to MigrateBrokerVDS, error = Invalid parameter: {'reason': 'length of cpusets (0) must match number of CPUs (12)'}, code = 91 (Failed with error InvalidParameter and code 91)
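The "length of cpusets (0)" in the error indicates the engine sent the migration request with an empty cpuset list for a dedicated-CPU VM that needs one cpuset per vCPU. A minimal sketch of that parameter check (`validate_cpusets` is a hypothetical helper mimicking the VDSM-side validation, not the actual VDSM function):

```python
def validate_cpusets(cpusets, nvcpus):
    """A dedicated-CPU migration must supply exactly one cpuset per vCPU.
    Hypothetical re-creation of the check behind the reported error."""
    if len(cpusets) != nvcpus:
        raise ValueError(
            "Invalid parameter: {'reason': 'length of cpusets (%d) must "
            "match number of CPUs (%d)'}" % (len(cpusets), nvcpus))

# The failing migration effectively passed an empty list for a 12-vCPU VM:
try:
    validate_cpusets([], 12)
except ValueError as e:
    print(e)
```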

Expected results:
The migration succeeds for VM1 and for all VMs in the positive enforcing affinity group.

Additional info:

Comment 1 Polina 2022-05-02 13:13:23 UTC
A correction to https://bugzilla.redhat.com/show_bug.cgi?id=2079351#c0: both VMs are configured with the dedicated CPU pinning policy.

Comment 2 Liran Rotenberg 2022-05-02 13:50:13 UTC
I tested locally on master, works for me.

Polina, can you please try again, perhaps on another environment?

If it works there, I would suspect something went wrong when generating or allocating the new physical CPUs to use.

Comment 5 Polina 2022-06-19 16:07:43 UTC
The problem still happens in ovirt-engine-4.5.1.2-0.11.el8ev.noarch.
Re-assigning after discussion with Liran.

Comment 6 Liran Rotenberg 2022-06-19 17:12:18 UTC
We managed to solve one problem (the engine believed the migration was possible and sent the migration command with wrong parameters), which is what surfaced the VDSM error in the initial report.
Now the engine fails to schedule the VM group as desired.

Comment 7 Polina 2022-07-21 08:38:04 UTC
Verified on ovirt-engine-4.5.1.3-0.36.el8ev.noarch according to the description.

Comment 8 Sandro Bonazzola 2022-08-30 08:47:42 UTC
This bugzilla is included in the oVirt 4.5.2 release, published on August 10th 2022.
Since the problem described in this bug report should be resolved in oVirt 4.5.2 release, it has been closed with a resolution of CURRENT RELEASE.
If the solution does not work for you, please open a new bug report.