Bug 1929260

Summary: Fails validation of action 'UpdateVm' when changing VM (set with auto pinning policy) type to High Performance
Product: [oVirt] ovirt-engine Reporter: Polina <pagranat>
Component: BLL.Virt Assignee: Liran Rotenberg <lrotenbe>
Status: CLOSED CURRENTRELEASE QA Contact: Polina <pagranat>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.4.4.5 CC: ahadas, bugs, lrotenbe
Target Milestone: ovirt-4.5.0 Flags: pm-rhel: ovirt-4.5?
Target Release: 4.5.0   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: ovirt-engine-4.5.0 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-04-28 09:26:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
engine & vdsm logs none

Description Polina 2021-02-16 15:07:29 UTC
Created attachment 1757277 [details]
engine & vdsm logs

Description of problem:
If a VM is configured with the adjust or existing auto pinning policy on a host that does not support NUMA, reconfiguring the VM type to High Performance fails.

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.5.5-0.13.el8ev.noarch

How reproducible: 100%

Steps to Reproduce:
1. Create a Desktop/Server VM based on the latest-rhel-guest-image-8.3-infra template.
2. In the Edit VM window, on the Host tab, set Auto Pinning Policy to adjust or existing. Run the VM on a host that does not support NUMA.
3. Change Type to High Performance. The Pending VM changes window appears, as expected.
4. Restart the VM.

Actual results:

The VM is not restarted as HP, and there is a warning in the log:
Validation of action 'UpdateVm' failed for user SYSTEM. Reasons: VAR__ACTION__UPDATE,VAR__TYPE__VM,HOST_NUMA_NOT_SUPPORTED,$hostName host_mixed_1
2021-02-16 15:11:00,892+02 INFO  [org.ovirt.engine.core.bll.UpdateVmCommand] (EE-ManagedThreadFactory-engine-Thread-65245) [22d57e81] Lock freed to object 'EngineLock:{exclusiveLocks='[golden_env_mixed_virtio_0=VM_NAME]', sharedLocks='[d57112eb-8187-452a-a6b3-63ccfad991d5=VM]'}'

Expected results:
VM is restarted as HP

Additional info: the timestamp in the attached engine log is 2021-02-16 15:11:00
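
For anyone triaging similar failures, here is a minimal sketch of pulling the failure reasons out of a warning line like the one under "Actual results". The regular expression and the `parse_validation_failure` helper are illustrative assumptions based only on the single line quoted in this report, not an engine API; other log lines may be shaped differently.

```python
import re

# The warning line quoted in this report.
LOG_LINE = (
    "Validation of action 'UpdateVm' failed for user SYSTEM. Reasons: "
    "VAR__ACTION__UPDATE,VAR__TYPE__VM,HOST_NUMA_NOT_SUPPORTED,$hostName host_mixed_1"
)

# Assumed shape of the message, inferred from the one sample above.
PATTERN = re.compile(
    r"Validation of action '(?P<action>\w+)' failed for user (?P<user>\S+)\. "
    r"Reasons: (?P<reasons>.+)$"
)


def parse_validation_failure(line):
    """Split a validation warning into action, user, reason codes and $variables."""
    match = PATTERN.search(line)
    if match is None:
        return None
    reasons = []
    variables = {}
    for token in match.group("reasons").split(","):
        token = token.strip()
        if token.startswith("$"):
            # Tokens such as "$hostName host_mixed_1" carry a name and a value.
            name, _, value = token[1:].partition(" ")
            variables[name] = value
        else:
            reasons.append(token)
    return {
        "action": match.group("action"),
        "user": match.group("user"),
        "reasons": reasons,
        "variables": variables,
    }


result = parse_validation_failure(LOG_LINE)
print(result["reasons"], result["variables"])
```

Here the reason code of interest is HOST_NUMA_NOT_SUPPORTED, with the affected host carried in the `$hostName` variable.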

Comment 1 Liran Rotenberg 2021-02-16 15:25:36 UTC
The NEXT_RUN OVF is:
vm_configuration        | <?xml version="1.0" encoding="UTF-8"?><ovf:Envelope xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1/" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData" xmlns:vssd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_VirtualSystemSettingData" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ovf:version="4.4.0.0"><References></References><NetworkSection><Info>List of networks</Info><Network ovf:name="ovirtmgmt"></Network></NetworkSection><Section xsi:type="ovf:DiskSection_Type"><Info>List of Virtual Disks</Info></Section><Content ovf:id="out" xsi:type="ovf:VirtualSystem_Type"><Name>golden_env_mixed_virtio_0</Name><Description></Description><Comment></Comment><CreationDate>2021/01/31 09:56:54</CreationDate><ExportDate>2021/02/16 13:04:08</ExportDate><DeleteProtected>false</DeleteProtected><SsoMethod>guest_agent</SsoMethod><IsSmartcardEnabled>false</IsSmartcardEnabled><NumOfIoThreads>1</NumOfIoThreads><TimeZone>Etc/GMT</TimeZone><default_boot_sequence>0</default_boot_sequence><Generation>0</Generation><ClusterCompatibilityVersion>4.5</ClusterCompatibilityVersion><VmType>2</VmType><ResumeBehavior>AUTO_RESUME</ResumeBehavior><MinAllocatedMem>1024</MinAllocatedMem><IsStateless>false</IsStateless><IsRunAndPause>false</IsRunAndPause><AutoStartup>false</AutoStartup><Priority>1</Priority><CreatedByUserId>2841d290-639c-11eb-9b5c-001a4a161064</CreatedByUserId><MigrationSupport>1</MigrationSupport><DedicatedVmForVds>c77adebd-b4de-48a3-9c87-72681c635a42</DedicatedVmForVds><IsBootMenuEnabled>false</IsBootMenuEnabled><IsSpiceFileTransferEnabled>true</IsSpiceFileTransferEnabled><IsSpiceCopyPasteEnabled>true</IsSpiceCopyPasteEnabled><AllowConsoleReconnect>false</AllowConsoleReconnect><ConsoleDisconnectAction>LOCK_SCREEN</ConsoleDisconnectAction><CustomEmulatedMachine></CustomEmulatedMachine><BiosType>1</BiosType><CustomCpuName></CustomCpuName><PredefinedProperties></PredefinedProperties><UserDefinedPrope
rties></UserDefinedProperties><MaxMemorySizeMb>4096</MaxMemorySizeMb><MultiQueuesEnabled>true</MultiQueuesEnabled><UseHostCpu>true</UseHostCpu><ClusterName></ClusterName><TemplateId>00000000-0000-0000-0000-000000000000</TemplateId><TemplateName>Blank</TemplateName><IsInitilized>true</IsInitilized><Origin>0</Origin><quota_id>4dd35adf-1429-45e7-be04-ac504f9b20f7</quota_id><DefaultDisplayType>4</DefaultDisplayType><TrustedService>false</TrustedService><OriginalTemplateId>d9c72488-8204-451c-8494-431077a8d8e4</OriginalTemplateId><OriginalTemplateName>latest-rhel-guest-image-8.3-infra</OriginalTemplateName><CpuPinning>0#1,17_1#1,17_2#2,18_3#2,18_4#3,19_5#3,19_6#4,20_7#4,20_8#5,21_9#5,21_10#6,22_11#6,22_12#7,23_13#7,23_14#8,24_15#8,24_16#9,25_17#9,25_18#10,26_19#10,26_20#11,27_21#11,27_22#12,28_23#12,28_24#13,29_25#13,29_26#14,30_27#14,30_28#15,31_29#15,31</CpuPinning><UseLatestVersion>false</UseLatestVersion><Section xsi:type="ovf:NumaNodeSection_Type"><NumaNode><Index>0</Index><cpuIdList>0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29</cpuIdList><vdsNumaNodeList>0</vdsNumaNodeList><MemTotal>1024</MemTotal><NumaTuneMode>strict</NumaTuneMode></NumaNode></Section><Section ovf:id="d57112eb-8187-452a-a6b3-63ccfad991d5" ovf:required="false" xsi:type="ovf:OperatingSystemSection_Type"><Info>Guest Operating System</Info><Description>other</Description></Section><Section xsi:type="ovf:VirtualHardwareSection_Type"><Info>30 CPU, 1024 Memory</Info><System><vssd:VirtualSystemType>ENGINE 4.4.0.0</vssd:VirtualSystemType></System><Item><rasd:Caption>30 virtual cpu</rasd:Caption><rasd:Description>Number of virtual 
CPU</rasd:Description><rasd:InstanceId>1</rasd:InstanceId><rasd:ResourceType>3</rasd:ResourceType><rasd:num_of_sockets>1</rasd:num_of_sockets><rasd:cpu_per_socket>15</rasd:cpu_per_socket><rasd:threads_per_cpu>2</rasd:threads_per_cpu><rasd:max_num_of_vcpus>240</rasd:max_num_of_vcpus><rasd:VirtualQuantity>30</rasd:VirtualQuantity></Item><Item><rasd:Caption>1024 MB of memory</rasd:Caption><rasd:Description>Memory Size</rasd:Description><rasd:InstanceId>2</rasd:InstanceId><rasd:ResourceType>4</rasd:ResourceType><rasd:AllocationUnits>MegaBytes</rasd:AllocationUnits><rasd:VirtualQuantity>1024</rasd:VirtualQuantity></Item><Item><rasd:Caption>Ethernet adapter on ovirtmgmt</rasd:Caption><rasd:InstanceId>483328eb-a4f9-4886-9cb0-bcd9565b091f</rasd:InstanceId><rasd:ResourceType>10</rasd:ResourceType><rasd:OtherResourceType>ovirtmgmt</rasd:OtherResourceType><rasd:ResourceSubType>3</rasd:ResourceSubType><rasd:Connection>ovirtmgmt</rasd:Connection><rasd:Linked>true</rasd:Linked><rasd:Name>nic1</rasd:Name><rasd:ElementName>nic1</rasd:ElementName><rasd:MACAddress>00:1a:4a:16:10:33</rasd:MACAddress><rasd:speed>10000</rasd:speed><Type>interface</Type><Device>bridge</Device><rasd:Address>{type=pci, slot=0x00, bus=0x01, domain=0x0000, function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-483328eb-a4f9-4886-9cb0-bcd9565b091f</Alias></Item><Item><rasd:Caption>USB Controller</rasd:Caption><rasd:InstanceId>3</rasd:InstanceId><rasd:ResourceType>23</rasd:ResourceType><rasd:UsbPolicy>DISABLED</rasd:UsbPolicy></Item><Item><rasd:Caption>Graphical Controller</rasd:Caption><rasd:InstanceId>94d4179d-a661-4503-83b5-0d1b2e5cdc8a</rasd:InstanceId><rasd:ResourceType>20</rasd:ResourceType><rasd:VirtualQuantity>1</rasd:VirtualQuantity><rasd:SinglePciQxl>false</rasd:SinglePciQxl><Type>video</Type><Device>qxl</Device><rasd:Address>{type=pci, slot=0x01, bus=0x00, domain=0x0000, 
function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-94d4179d-a661-4503-83b5-0d1b2e5cdc8a</Alias><SpecParams><vgamem>16384</vgamem><heads>1</heads><ram>65536</ram><vram>8192</vram></SpecParams></Item><Item><rasd:Caption>CDROM</rasd:Caption><rasd:InstanceId>46daa147-82c2-4e89-a91b-02570d6f96dd</rasd:InstanceId><rasd:ResourceType>15</rasd:ResourceType><Type>disk</Type><Device>cdrom</Device><rasd:Address>{type=drive, bus=0, controller=0, target=0, unit=2}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>true</IsReadOnly><Alias>ua-46daa147-82c2-4e89-a91b-02570d6f96dd</Alias><SpecParams><path></path></SpecParams></Item><Item><rasd:ResourceType>0</rasd:ResourceType><rasd:InstanceId>94c89ea7-dfbc-4b5b-a3cf-76f36fa3cfa7</rasd:InstanceId><Type>controller</Type><Device>virtio-serial</Device><rasd:Address>{type=pci, slot=0x00, bus=0x03, domain=0x0000, function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-94c89ea7-dfbc-4b5b-a3cf-76f36fa3cfa7</Alias></Item><Item><rasd:ResourceType>0</rasd:ResourceType><rasd:InstanceId>e6815c0f-1b9f-40ee-ae86-e861b6bf4916</rasd:InstanceId><Type>rng</Type><Device>virtio</Device><rasd:Address>{type=pci, slot=0x00, bus=0x07, domain=0x0000, function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-e6815c0f-1b9f-40ee-ae86-e861b6bf4916</Alias><SpecParams><source>urandom</source></SpecParams></Item><Item><rasd:ResourceType>0</rasd:ResourceType><rasd:InstanceId>5ba4db74-3b7f-4d75-aa5e-65705cebed09</rasd:InstanceId><Type>controller</Type><Device>virtio-scsi</Device><rasd:Address>{type=pci, slot=0x00, bus=0x02, domain=0x0000, 
function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-5ba4db74-3b7f-4d75-aa5e-65705cebed09</Alias></Item><Item><rasd:ResourceType>0</rasd:ResourceType><rasd:InstanceId>cadea670-4f7c-486d-b8f7-a802493b0669</rasd:InstanceId><Type>console</Type><Device>console</Device><rasd:Address></rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias></Alias></Item><Item><rasd:ResourceType>0</rasd:ResourceType><rasd:InstanceId>af43ed8d-dd7d-4c9d-853a-cca0a641c3ae</rasd:InstanceId><Type>controller</Type><Device>usb</Device><rasd:Address>{type=pci, slot=0x00, bus=0x04, domain=0x0000, function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-af43ed8d-dd7d-4c9d-853a-cca0a641c3ae</Alias><SpecParams><index>0</index><model>qemu-xhci</model></SpecParams></Item></Section></Content></ovf:Envelope>
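
The CpuPinning value in the OVF above appears to consist of `vcpu#pcpu-list` entries joined by underscores. A minimal sketch for reading it, assuming only the comma-separated form seen in this OVF (ranges or exclusions are not handled); `parse_cpu_pinning` is a hypothetical helper, not engine code:

```python
def parse_cpu_pinning(pinning):
    """Map each vCPU index to the list of host CPUs it is pinned to.

    Assumes entries look like "0#1,17" and are joined by "_", as in the
    CpuPinning element of the OVF in this report.
    """
    mapping = {}
    for entry in pinning.split("_"):
        vcpu, _, pcpus = entry.partition("#")
        mapping[int(vcpu)] = [int(p) for p in pcpus.split(",")]
    return mapping


# First four entries of the CpuPinning string from the OVF.
SAMPLE = "0#1,17_1#1,17_2#2,18_3#2,18"
pinning = parse_cpu_pinning(SAMPLE)
print(pinning)
```

In this sample each pair of vCPUs shares the same two host CPUs.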


Although we expect users of the auto pinning feature to have the relevant hardware, we should protect the user and prevent this case. As it stands, the VM fails to update according to the next run configuration.

In the OVF we can see:
<Section xsi:type="ovf:NumaNodeSection_Type"><NumaNode><Index>0</Index><cpuIdList>0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29</cpuIdList><vdsNumaNodeList>0</vdsNumaNodeList><MemTotal>1024</MemTotal><NumaTuneMode>strict</NumaTuneMode></NumaNode></Section>
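
To check whether a NEXT_RUN OVF carries a NUMA node section like the one above, something along these lines could be used. This is a sketch with Python's standard `xml.etree.ElementTree`; the `Content` wrapper element and the `numa_nodes` helper are assumptions made to keep the fragment self-contained and are not part of the engine.

```python
import xml.etree.ElementTree as ET

# Fully qualified name of the xsi:type attribute.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"

# NUMA section copied from the OVF, wrapped in a minimal element that
# declares the xsi namespace so the fragment parses on its own.
OVF_FRAGMENT = (
    '<Content xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<Section xsi:type="ovf:NumaNodeSection_Type"><NumaNode><Index>0</Index>'
    "<cpuIdList>0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,"
    "23,24,25,26,27,28,29</cpuIdList><vdsNumaNodeList>0</vdsNumaNodeList>"
    "<MemTotal>1024</MemTotal><NumaTuneMode>strict</NumaTuneMode></NumaNode>"
    "</Section></Content>"
)


def numa_nodes(xml_text):
    """Return the virtual NUMA nodes found in NumaNodeSection_Type sections."""
    root = ET.fromstring(xml_text)
    nodes = []
    for section in root.iter("Section"):
        if section.get(XSI_TYPE) != "ovf:NumaNodeSection_Type":
            continue
        for node in section.findall("NumaNode"):
            nodes.append({
                "index": int(node.findtext("Index")),
                "cpus": [int(c) for c in node.findtext("cpuIdList").split(",")],
                "host_nodes": [int(n) for n in node.findtext("vdsNumaNodeList").split(",")],
                "mem_total": int(node.findtext("MemTotal")),
                "tune_mode": node.findtext("NumaTuneMode"),
            })
    return nodes


nodes = numa_nodes(OVF_FRAGMENT)
print(len(nodes), nodes[0]["host_nodes"], nodes[0]["tune_mode"])
```

A non-empty result means the next-run configuration includes a virtual NUMA node, which matches the HOST_NUMA_NOT_SUPPORTED reason in the validation warning when the host has no NUMA support.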

This flow is relevant to running VMs that are pinned to a host without NUMA support and then have the auto-pinning policy activated.

Comment 2 Polina 2022-04-18 11:39:45 UTC
Verification on ovirt-engine-4.5.0.2-0.7.el8ev.noarch.
Today the scenario described in https://bugzilla.redhat.com/show_bug.cgi?id=1929260#c0 behaves like this:

In a setup with hosts not supporting NUMA, the attempt to reconfigure to HP and then restart produces the error:

   "Cannot edit VM. The VM NUMA configuration cannot be applied when using the CPU Pinning policy 'Resize and Pin NUMA'."

In a setup supporting NUMA the error is:

   "A NUMA node has an invalid CPU index: 45. Indexes must be between 0 and 3."

So this flow is actually impossible now.

Please let me know if this is how we want to close it.
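
For reference, the second error message reads like a bounds check of NUMA CPU indexes against the VM's vCPU count. A minimal sketch of such a check follows; it assumes the upper bound is the vCPU count minus one, which the message suggests but this report does not confirm. `invalid_numa_cpu_indexes` is a hypothetical name and not the engine's validator.

```python
def invalid_numa_cpu_indexes(numa_cpu_lists, vcpu_count):
    """Return the sorted CPU indexes that fall outside 0..vcpu_count-1.

    numa_cpu_lists holds one list of CPU indexes per virtual NUMA node.
    """
    return sorted(
        {cpu for cpus in numa_cpu_lists for cpu in cpus if not 0 <= cpu < vcpu_count}
    )


# Hypothetical VM with 4 vCPUs whose stored NUMA node still lists index 45.
bad = invalid_numa_cpu_indexes([[0, 1], [2, 45]], vcpu_count=4)
print(bad)
```

A stale NUMA node list that no longer matches the current CPU topology would be caught by a check of this kind.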

Comment 3 Liran Rotenberg 2022-04-24 07:39:29 UTC
(In reply to Polina from comment #2)
> verification  on ovirt-engine-4.5.0.2-0.7.el8ev.noarch
> Today the scenario described in
> https://bugzilla.redhat.com/show_bug.cgi?id=1929260#c0 behaves like this :
> 
> In the setup with hosts not supporting NUMA , the attempt to reconfigure to
> HP and then restart brings error:
> 
>    "Cannot edit VM. The VM NUMA configuration cannot be applied when using
> the CPU Pinning policy 'Resize and Pin NUMA'."
> 
> In the setup supporting NUMA the error is:
> 
>    "A NUMA node has an invalid CPU index: 45. Indexes must be between 0 and
> 3."
> 
> So, actually such flow is impossible now .
> 
> please let me know if this is how we want to close it .

Based on comment #1:
"Although we are expecting users using the auto pinning feature to have the relevant hardware, we should be safe to user and prevent this case. This will fail to update the VM according to the next run configuration." - we do not wish to run on non-NUMA hosts.

However, we may have caused a new problem in environments with NUMA support: it comes down to the static NUMA configuration versus the runtime (dynamic) CPU configuration, as we saw in many cases for `Resize and Pin NUMA`.
I think that since we now prevent the initial case, we can mark this as verified.
But the case where NUMA is supported still needs consideration (a new bug, or at least a discussion).

Comment 4 Liran Rotenberg 2022-04-24 08:03:27 UTC
(In reply to Liran Rotenberg from comment #3)
> (In reply to Polina from comment #2)
> > verification  on ovirt-engine-4.5.0.2-0.7.el8ev.noarch
> > Today the scenario described in
> > https://bugzilla.redhat.com/show_bug.cgi?id=1929260#c0 behaves like this :
> > 
> > In the setup with hosts not supporting NUMA , the attempt to reconfigure to
> > HP and then restart brings error:
> > 
> >    "Cannot edit VM. The VM NUMA configuration cannot be applied when using
> > the CPU Pinning policy 'Resize and Pin NUMA'."
> > 
> > In the setup supporting NUMA the error is:
> > 
> >    "A NUMA node has an invalid CPU index: 45. Indexes must be between 0 and
> > 3."
> > 
> > So, actually such flow is impossible now .
> > 
> > please let me know if this is how we want to close it .
> 
> Based on comment #1:
> "Although we are expecting users using the auto pinning feature to have the
> relevant hardware, we should be safe to user and prevent this case. This
> will fail to update the VM according to the next run configuration." - we do
> not wish to run on non-NUMA hosts.
> 
> But, we may caused a new problem with an environment with NUMA support - it
> goes down to the NUMA static configuration versus the CPU runtime (dynamic)
> configuration, as we saw in many cases for `Resize and Pin NUMA`.
> I think that since we prevent now the initial case - we can mark this as
> verified.
> But for the case where NUMA is supported, we need to consider it (a new bug,
> or at least a discussion).

OK, we do have BZ 2074525, which has the same root cause. You may consider opening a bug for the specific flow you have in this bug.

Comment 5 Polina 2022-04-24 09:00:57 UTC
I added this flow as a comment at https://bugzilla.redhat.com/show_bug.cgi?id=2074525#c2
and am closing this bug based on the discussion above.

Comment 6 Sandro Bonazzola 2022-04-28 09:26:34 UTC
This bugzilla is included in oVirt 4.5.0 release, published on April 20th 2022.

Since the problem described in this bug report should be resolved in oVirt 4.5.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.