Bug 1929260 - Fails validation of action 'UpdateVm' when changing VM (set with auto pinning policy) type to High Performance
Summary: Fails validation of action 'UpdateVm' when changing VM (set with auto pinning...
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.4.4.5
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.5.0
Target Release: 4.5.0
Assignee: Liran Rotenberg
QA Contact: Polina
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-02-16 15:07 UTC by Polina
Modified: 2022-04-28 09:26 UTC (History)

Fixed In Version: ovirt-engine-4.5.0
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-04-28 09:26:34 UTC
oVirt Team: Virt
Embargoed:
pm-rhel: ovirt-4.5?


Attachments
engine & vdsm logs (1.51 MB, application/gzip)
2021-02-16 15:07 UTC, Polina


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 118064 0 master MERGED backend: next-run numa update optional 2022-01-17 10:24:16 UTC

Description Polina 2021-02-16 15:07:29 UTC
Created attachment 1757277
engine & vdsm logs

Description of problem:
If a VM is set with the adjust or existing auto pinning policy on a host that does not support NUMA, reconfiguring the VM type to High Performance fails.

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.5.5-0.13.el8ev.noarch

How reproducible: 100%

Steps to Reproduce:
1. Create a Desktop/Server VM based on the latest-rhel-guest-image-8.3-infra template.
2. In the Edit VM window, on the Host tab, set Auto Pinning Policy to adjust or existing. Run the VM on a host that does not support NUMA.
3. Change Type to High Performance. The Pending VM changes window appears, as expected.
4. Restart the VM.

Actual results:

The VM is not restarted as HP, and there is a warning in the log:
Validation of action 'UpdateVm' failed for user SYSTEM. Reasons: VAR__ACTION__UPDATE,VAR__TYPE__VM,HOST_NUMA_NOT_SUPPORTED,$hostName host_mixed_1
2021-02-16 15:11:00,892+02 INFO  [org.ovirt.engine.core.bll.UpdateVmCommand] (EE-ManagedThreadFactory-engine-Thread-65245) [22d57e81] Lock freed to object 'EngineLock:{exclusiveLocks='[golden_env_mixed_virtio_0=VM_NAME]', sharedLocks='[d57112eb-8187-452a-a6b3-63ccfad991d5=VM]'}'
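
For anyone searching the attached engine log for similar failures, the warning above can be split into its parts. This is only a minimal sketch: the helper name `parse_validation_failure` is hypothetical, and the message layout (comma-separated reason codes, with `$name value` tokens carrying message variables) is inferred from this single log line, not taken from the engine source.

```python
import re

# Layout assumed from the one warning quoted in this report.
PATTERN = re.compile(
    r"Validation of action '(?P<action>[^']+)' failed for user (?P<user>\S+)\. "
    r"Reasons: (?P<reasons>.*)"
)


def parse_validation_failure(line):
    """Return the action, user, reason codes and $-variables, or None."""
    match = PATTERN.search(line)
    if match is None:
        return None
    reasons = []
    variables = {}
    for token in match.group("reasons").split(","):
        token = token.strip()
        if token.startswith("$"):
            # Tokens such as "$hostName host_mixed_1" carry a variable value.
            name, _, value = token[1:].partition(" ")
            variables[name] = value
        elif token:
            reasons.append(token)
    return {
        "action": match.group("action"),
        "user": match.group("user"),
        "reasons": reasons,
        "variables": variables,
    }


line = (
    "Validation of action 'UpdateVm' failed for user SYSTEM. Reasons: "
    "VAR__ACTION__UPDATE,VAR__TYPE__VM,HOST_NUMA_NOT_SUPPORTED,"
    "$hostName host_mixed_1"
)
result = parse_validation_failure(line)
print(result["reasons"])
print(result["variables"])
```

Filtering on `HOST_NUMA_NOT_SUPPORTED` in the parsed reasons is one way to pick out this specific failure among other `UpdateVm` validation warnings.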

Expected results:
VM is restarted as HP

Additional info: the timestamp in the attached engine log is 2021-02-16 15:11:00

Comment 1 Liran Rotenberg 2021-02-16 15:25:36 UTC
The NEXT_RUN OVF is:
vm_configuration        | <?xml version="1.0" encoding="UTF-8"?><ovf:Envelope xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1/" xmlns:rasd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_ResourceAllocationSettingData" xmlns:vssd="http://schemas.dmtf.org/wbem/wscim/1/cim-schema/2/CIM_VirtualSystemSettingData" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" ovf:version="4.4.0.0"><References></References><NetworkSection><Info>List of networks</Info><Network ovf:name="ovirtmgmt"></Network></NetworkSection><Section xsi:type="ovf:DiskSection_Type"><Info>List of Virtual Disks</Info></Section><Content ovf:id="out" xsi:type="ovf:VirtualSystem_Type"><Name>golden_env_mixed_virtio_0</Name><Description></Description><Comment></Comment><CreationDate>2021/01/31 09:56:54</CreationDate><ExportDate>2021/02/16 13:04:08</ExportDate><DeleteProtected>false</DeleteProtected><SsoMethod>guest_agent</SsoMethod><IsSmartcardEnabled>false</IsSmartcardEnabled><NumOfIoThreads>1</NumOfIoThreads><TimeZone>Etc/GMT</TimeZone><default_boot_sequence>0</default_boot_sequence><Generation>0</Generation><ClusterCompatibilityVersion>4.5</ClusterCompatibilityVersion><VmType>2</VmType><ResumeBehavior>AUTO_RESUME</ResumeBehavior><MinAllocatedMem>1024</MinAllocatedMem><IsStateless>false</IsStateless><IsRunAndPause>false</IsRunAndPause><AutoStartup>false</AutoStartup><Priority>1</Priority><CreatedByUserId>2841d290-639c-11eb-9b5c-001a4a161064</CreatedByUserId><MigrationSupport>1</MigrationSupport><DedicatedVmForVds>c77adebd-b4de-48a3-9c87-72681c635a42</DedicatedVmForVds><IsBootMenuEnabled>false</IsBootMenuEnabled><IsSpiceFileTransferEnabled>true</IsSpiceFileTransferEnabled><IsSpiceCopyPasteEnabled>true</IsSpiceCopyPasteEnabled><AllowConsoleReconnect>false</AllowConsoleReconnect><ConsoleDisconnectAction>LOCK_SCREEN</ConsoleDisconnectAction><CustomEmulatedMachine></CustomEmulatedMachine><BiosType>1</BiosType><CustomCpuName></CustomCpuName><PredefinedProperties></PredefinedProperties><UserDefinedPrope
rties></UserDefinedProperties><MaxMemorySizeMb>4096</MaxMemorySizeMb><MultiQueuesEnabled>true</MultiQueuesEnabled><UseHostCpu>true</UseHostCpu><ClusterName></ClusterName><TemplateId>00000000-0000-0000-0000-000000000000</TemplateId><TemplateName>Blank</TemplateName><IsInitilized>true</IsInitilized><Origin>0</Origin><quota_id>4dd35adf-1429-45e7-be04-ac504f9b20f7</quota_id><DefaultDisplayType>4</DefaultDisplayType><TrustedService>false</TrustedService><OriginalTemplateId>d9c72488-8204-451c-8494-431077a8d8e4</OriginalTemplateId><OriginalTemplateName>latest-rhel-guest-image-8.3-infra</OriginalTemplateName><CpuPinning>0#1,17_1#1,17_2#2,18_3#2,18_4#3,19_5#3,19_6#4,20_7#4,20_8#5,21_9#5,21_10#6,22_11#6,22_12#7,23_13#7,23_14#8,24_15#8,24_16#9,25_17#9,25_18#10,26_19#10,26_20#11,27_21#11,27_22#12,28_23#12,28_24#13,29_25#13,29_26#14,30_27#14,30_28#15,31_29#15,31</CpuPinning><UseLatestVersion>false</UseLatestVersion><Section xsi:type="ovf:NumaNodeSection_Type"><NumaNode><Index>0</Index><cpuIdList>0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29</cpuIdList><vdsNumaNodeList>0</vdsNumaNodeList><MemTotal>1024</MemTotal><NumaTuneMode>strict</NumaTuneMode></NumaNode></Section><Section ovf:id="d57112eb-8187-452a-a6b3-63ccfad991d5" ovf:required="false" xsi:type="ovf:OperatingSystemSection_Type"><Info>Guest Operating System</Info><Description>other</Description></Section><Section xsi:type="ovf:VirtualHardwareSection_Type"><Info>30 CPU, 1024 Memory</Info><System><vssd:VirtualSystemType>ENGINE 4.4.0.0</vssd:VirtualSystemType></System><Item><rasd:Caption>30 virtual cpu</rasd:Caption><rasd:Description>Number of virtual 
CPU</rasd:Description><rasd:InstanceId>1</rasd:InstanceId><rasd:ResourceType>3</rasd:ResourceType><rasd:num_of_sockets>1</rasd:num_of_sockets><rasd:cpu_per_socket>15</rasd:cpu_per_socket><rasd:threads_per_cpu>2</rasd:threads_per_cpu><rasd:max_num_of_vcpus>240</rasd:max_num_of_vcpus><rasd:VirtualQuantity>30</rasd:VirtualQuantity></Item><Item><rasd:Caption>1024 MB of memory</rasd:Caption><rasd:Description>Memory Size</rasd:Description><rasd:InstanceId>2</rasd:InstanceId><rasd:ResourceType>4</rasd:ResourceType><rasd:AllocationUnits>MegaBytes</rasd:AllocationUnits><rasd:VirtualQuantity>1024</rasd:VirtualQuantity></Item><Item><rasd:Caption>Ethernet adapter on ovirtmgmt</rasd:Caption><rasd:InstanceId>483328eb-a4f9-4886-9cb0-bcd9565b091f</rasd:InstanceId><rasd:ResourceType>10</rasd:ResourceType><rasd:OtherResourceType>ovirtmgmt</rasd:OtherResourceType><rasd:ResourceSubType>3</rasd:ResourceSubType><rasd:Connection>ovirtmgmt</rasd:Connection><rasd:Linked>true</rasd:Linked><rasd:Name>nic1</rasd:Name><rasd:ElementName>nic1</rasd:ElementName><rasd:MACAddress>00:1a:4a:16:10:33</rasd:MACAddress><rasd:speed>10000</rasd:speed><Type>interface</Type><Device>bridge</Device><rasd:Address>{type=pci, slot=0x00, bus=0x01, domain=0x0000, function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-483328eb-a4f9-4886-9cb0-bcd9565b091f</Alias></Item><Item><rasd:Caption>USB Controller</rasd:Caption><rasd:InstanceId>3</rasd:InstanceId><rasd:ResourceType>23</rasd:ResourceType><rasd:UsbPolicy>DISABLED</rasd:UsbPolicy></Item><Item><rasd:Caption>Graphical Controller</rasd:Caption><rasd:InstanceId>94d4179d-a661-4503-83b5-0d1b2e5cdc8a</rasd:InstanceId><rasd:ResourceType>20</rasd:ResourceType><rasd:VirtualQuantity>1</rasd:VirtualQuantity><rasd:SinglePciQxl>false</rasd:SinglePciQxl><Type>video</Type><Device>qxl</Device><rasd:Address>{type=pci, slot=0x01, bus=0x00, domain=0x0000, 
function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-94d4179d-a661-4503-83b5-0d1b2e5cdc8a</Alias><SpecParams><vgamem>16384</vgamem><heads>1</heads><ram>65536</ram><vram>8192</vram></SpecParams></Item><Item><rasd:Caption>CDROM</rasd:Caption><rasd:InstanceId>46daa147-82c2-4e89-a91b-02570d6f96dd</rasd:InstanceId><rasd:ResourceType>15</rasd:ResourceType><Type>disk</Type><Device>cdrom</Device><rasd:Address>{type=drive, bus=0, controller=0, target=0, unit=2}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>true</IsReadOnly><Alias>ua-46daa147-82c2-4e89-a91b-02570d6f96dd</Alias><SpecParams><path></path></SpecParams></Item><Item><rasd:ResourceType>0</rasd:ResourceType><rasd:InstanceId>94c89ea7-dfbc-4b5b-a3cf-76f36fa3cfa7</rasd:InstanceId><Type>controller</Type><Device>virtio-serial</Device><rasd:Address>{type=pci, slot=0x00, bus=0x03, domain=0x0000, function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-94c89ea7-dfbc-4b5b-a3cf-76f36fa3cfa7</Alias></Item><Item><rasd:ResourceType>0</rasd:ResourceType><rasd:InstanceId>e6815c0f-1b9f-40ee-ae86-e861b6bf4916</rasd:InstanceId><Type>rng</Type><Device>virtio</Device><rasd:Address>{type=pci, slot=0x00, bus=0x07, domain=0x0000, function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-e6815c0f-1b9f-40ee-ae86-e861b6bf4916</Alias><SpecParams><source>urandom</source></SpecParams></Item><Item><rasd:ResourceType>0</rasd:ResourceType><rasd:InstanceId>5ba4db74-3b7f-4d75-aa5e-65705cebed09</rasd:InstanceId><Type>controller</Type><Device>virtio-scsi</Device><rasd:Address>{type=pci, slot=0x00, bus=0x02, domain=0x0000, 
function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-5ba4db74-3b7f-4d75-aa5e-65705cebed09</Alias></Item><Item><rasd:ResourceType>0</rasd:ResourceType><rasd:InstanceId>cadea670-4f7c-486d-b8f7-a802493b0669</rasd:InstanceId><Type>console</Type><Device>console</Device><rasd:Address></rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias></Alias></Item><Item><rasd:ResourceType>0</rasd:ResourceType><rasd:InstanceId>af43ed8d-dd7d-4c9d-853a-cca0a641c3ae</rasd:InstanceId><Type>controller</Type><Device>usb</Device><rasd:Address>{type=pci, slot=0x00, bus=0x04, domain=0x0000, function=0x0}</rasd:Address><BootOrder>0</BootOrder><IsPlugged>true</IsPlugged><IsReadOnly>false</IsReadOnly><Alias>ua-af43ed8d-dd7d-4c9d-853a-cca0a641c3ae</Alias><SpecParams><index>0</index><model>qemu-xhci</model></SpecParams></Item></Section></Content></ovf:Envelope>


Although we expect users of the auto pinning feature to have the relevant hardware, we should be safe for the user and prevent this case. As it stands, the VM fails to update according to the next-run configuration.

In the OVF we can see:
<Section xsi:type="ovf:NumaNodeSection_Type"><NumaNode><Index>0</Index><cpuIdList>0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29</cpuIdList><vdsNumaNodeList>0</vdsNumaNodeList><MemTotal>1024</MemTotal><NumaTuneMode>strict</NumaTuneMode></NumaNode></Section>

This flow is relevant to running VMs that are pinned to a host with no NUMA support and have the auto pinning policy activated.
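
To check whether a NEXT_RUN OVF carries a NUMA section like the one above, the XML can be inspected with the Python standard library. This is a sketch under stated assumptions: the `OVF` string is a trimmed copy of the configuration in this comment (namespace declarations, a `CpuPinning` value shortened to its first three entries, and the NUMA section), the function names are hypothetical, and the `vcpu#pcpu,pcpu` entries joined by `_` are an interpretation of the `CpuPinning` value rather than a documented format.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the OVF from this comment; CpuPinning is shortened.
OVF = (
    '<ovf:Envelope xmlns:ovf="http://schemas.dmtf.org/ovf/envelope/1/" '
    'xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">'
    '<Content ovf:id="out" xsi:type="ovf:VirtualSystem_Type">'
    "<CpuPinning>0#1,17_1#1,17_2#2,18</CpuPinning>"
    '<Section xsi:type="ovf:NumaNodeSection_Type"><NumaNode><Index>0</Index>'
    "<cpuIdList>0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,"
    "23,24,25,26,27,28,29</cpuIdList><vdsNumaNodeList>0</vdsNumaNodeList>"
    "<MemTotal>1024</MemTotal><NumaTuneMode>strict</NumaTuneMode>"
    "</NumaNode></Section></Content></ovf:Envelope>"
)

# ElementTree expands the xsi: prefix on attributes to the namespace URI.
XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def numa_nodes(ovf_text):
    """Return one dict per NumaNode found in a NumaNodeSection_Type section."""
    root = ET.fromstring(ovf_text)
    nodes = []
    for section in root.iter("Section"):
        if section.get(XSI_TYPE) != "ovf:NumaNodeSection_Type":
            continue
        for node in section.findall("NumaNode"):
            nodes.append({
                "index": int(node.findtext("Index")),
                "cpus": [int(c) for c in node.findtext("cpuIdList").split(",")],
                "host_nodes": [
                    int(n) for n in node.findtext("vdsNumaNodeList").split(",")
                ],
                "mem_total": int(node.findtext("MemTotal")),
                "tune_mode": node.findtext("NumaTuneMode"),
            })
    return nodes


def cpu_pinning(ovf_text):
    """Parse CpuPinning into {vcpu: [pcpu, ...]} (format is an assumption)."""
    text = ET.fromstring(ovf_text).findtext(".//CpuPinning") or ""
    pinning = {}
    for entry in filter(None, text.split("_")):
        vcpu, _, pcpus = entry.partition("#")
        pinning[int(vcpu)] = [int(p) for p in pcpus.split(",")]
    return pinning


nodes = numa_nodes(OVF)
pinning = cpu_pinning(OVF)
print(nodes)
print(pinning)
```

A non-empty result from `numa_nodes` for a VM whose host reports no NUMA support would correspond to the situation described in this comment.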

Comment 2 Polina 2022-04-18 11:39:45 UTC
Verification on ovirt-engine-4.5.0.2-0.7.el8ev.noarch.
Today the scenario described in https://bugzilla.redhat.com/show_bug.cgi?id=1929260#c0 behaves like this:

In a setup with hosts that do not support NUMA, the attempt to reconfigure to HP and then restart produces the error:

   "Cannot edit VM. The VM NUMA configuration cannot be applied when using the CPU Pinning policy 'Resize and Pin NUMA'."

In a setup that supports NUMA, the error is:

   "A NUMA node has an invalid CPU index: 45. Indexes must be between 0 and 3."

So this flow is actually impossible now.

Please let me know if this is how we want to close it.
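
The second error reads like a range check of the NUMA nodes' CPU indexes against the VM's current vCPU count. The sketch below illustrates that kind of check only; it is not the engine's validation code. The function names are hypothetical, the rule that valid indexes run from 0 to vCPU count minus 1 is an assumption read off the message, and the sample values (4 vCPUs, index 45) are chosen to reproduce the quoted text.

```python
def invalid_numa_cpu_indexes(numa_cpu_lists, vcpu_count):
    """Return every CPU index outside 0..vcpu_count-1 (assumed valid range)."""
    upper = vcpu_count - 1
    return [
        cpu
        for cpus in numa_cpu_lists
        for cpu in cpus
        if not 0 <= cpu <= upper
    ]


def numa_index_error(numa_cpu_lists, vcpu_count):
    """Build a message shaped like the one quoted above, or None if valid."""
    bad = invalid_numa_cpu_indexes(numa_cpu_lists, vcpu_count)
    if not bad:
        return None
    return (
        f"A NUMA node has an invalid CPU index: {bad[0]}. "
        f"Indexes must be between 0 and {vcpu_count - 1}."
    )


# Hypothetical input: one NUMA node still listing an index from a larger
# CPU topology while the VM currently has 4 vCPUs.
message = numa_index_error([[0, 1, 2, 45]], vcpu_count=4)
print(message)
```

A stale NUMA node list left over from a resized topology would trip a check of this kind once the vCPU count changes.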

Comment 3 Liran Rotenberg 2022-04-24 07:39:29 UTC
(In reply to Polina from comment #2)

Based on comment #1:
"Although we are expecting users using the auto pinning feature to have the relevant hardware, we should be safe to user and prevent this case. This will fail to update the VM according to the next run configuration." - we do not wish to run on non-NUMA hosts.

But we may have caused a new problem in environments with NUMA support: it comes down to the static NUMA configuration versus the runtime (dynamic) CPU configuration, as we saw in many cases for `Resize and Pin NUMA`.
I think that since we now prevent the initial case, we can mark this as verified.
But the case where NUMA is supported still needs to be considered (in a new bug, or at least in a discussion).

Comment 4 Liran Rotenberg 2022-04-24 08:03:27 UTC
(In reply to Liran Rotenberg from comment #3)

OK, we do have BZ 2074525, which has the same root cause. You may consider opening a bug for the specific flow you have in this bug.

Comment 5 Polina 2022-04-24 09:00:57 UTC
I added this flow as a comment at https://bugzilla.redhat.com/show_bug.cgi?id=2074525#c2 and am closing this bug on the basis of the discussion above.

Comment 6 Sandro Bonazzola 2022-04-28 09:26:34 UTC
This bugzilla is included in oVirt 4.5.0 release, published on April 20th 2022.

Since the problem described in this bug report should be resolved in oVirt 4.5.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

