Bug 1539717 - High Performance VM could not be started in PPC env due to several reasons: can't add USB input device, Pass-Through Host CPU and CPU cache L3 are not supported
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.2.1
Hardware: ppc64le
OS: Linux
Priority: medium
Severity: high
Target Milestone: ovirt-4.2.2
Target Release: 4.2.2.5
Assignee: Sharon Gratch
QA Contact: Polina
URL:
Whiteboard:
Depends On:
Blocks: 1444027
 
Reported: 2018-01-29 13:56 UTC by Polina
Modified: 2020-08-03 13:10 UTC (History)

Fixed In Version: ovirt-engine-4.2.2.5
Doc Type: Bug Fix
Doc Text:
Cause: High Performance VMs on a ppc-based CPU architecture failed to run. Fix: The fix addresses two issues: 1. High Performance VMs run without USB controllers (USB is disabled entirely), so the mouse device, which requires the USB bus, is now disabled as well. 2. 'Pass-Through Host CPU' is now disabled automatically, since it is not supported on PPC (although it is recommended for High Performance VMs). Result: High Performance VMs on a ppc-based CPU architecture now run without any manual configuration changes.
Clone Of:
Environment:
Last Closed: 2018-04-27 07:22:22 UTC
oVirt Team: Virt
Embargoed:
rule-engine: ovirt-4.2+


Attachments (Terms of Use)
engine & vdsm logs from automation test (209.96 KB, application/x-gzip)
2018-01-29 13:56 UTC, Polina
look in engine.log line 8051 (1.92 MB, text/plain)
2018-03-13 12:01 UTC, Polina


Links
System ID Private Priority Status Summary Last Updated
IBM Linux Technology Center 165819 0 None None None 2019-02-26 09:51:31 UTC
oVirt gerrit 87562 0 master MERGED engine: Fix HP VMS issues for ppc based cpu arch 2020-09-05 17:08:26 UTC
oVirt gerrit 88155 0 ovirt-engine-4.2 MERGED engine: Fix HP VMS issues for ppc based cpu arch 2020-09-05 17:08:26 UTC
oVirt gerrit 89055 0 master MERGED core: remove cache layer 3 addition to engine xml 2020-09-05 17:08:27 UTC
oVirt gerrit 89143 0 ovirt-engine-4.2 MERGED core: remove cache layer 3 addition to engine xml 2020-09-05 17:08:26 UTC

Description Polina 2018-01-29 13:56:17 UTC
Created attachment 1387819 [details]
engine & vdsm logs from automation test

Description of problem: can't start High-Performance VM in PPC environment 


Version-Release number of selected component (if applicable):
ovirt-engine-setup-plugin-ovirt-engine-4.2.1.3


How reproducible: 


Steps to Reproduce:
1. Edit VM / Optimized For / High Performance
2. Run the VM

Actual results:
Can't add USB input device. USB bus is disabled.

2018-01-28 00:49:45,785+02 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-1) [] add VM 'a8ce5960-4ab3-4b71-b115-7ff01aab1da1'(golden_env_mixed_virtio_1_0) to rerun treatment

2018-01-28 00:49:45,790+02 ERROR [org.ovirt.engine.core.vdsbroker.monitoring.VmsMonitoring] (ForkJoinPool-1-worker-1) [] Rerun VM 'a8ce5960-4ab3-4b71-b115-7ff01aab1da1'. Called from VDS 'host_mixed_1'

2018-01-28 00:49:45,837+02 WARN  [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedThreadFactory-engine-Thread-989) [] EVENT_ID: USER_INITIATED_RUN_VM_FAILED(151), Failed to run VM golden_env_mixed_virtio_1_0 on Host host_mixed_1.
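
For readers triaging similar failures, a minimal sketch of pulling the audit event out of engine.log lines like the ones above. The line layout is inferred only from the excerpts quoted in this report, and `parse_audit_event` and `AuditEvent` are hypothetical names for illustration, not part of oVirt.

```python
import re
from typing import NamedTuple, Optional

# Layout inferred from the log lines quoted in this report (an assumption);
# other engine versions or logging configurations may differ.
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}[+-]\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"\[(?P<logger>[^\]]+)\]\s+"
    r"\((?P<thread>[^)]+)\)\s+"
    r"\[(?P<corr>[^\]]*)\]\s+"
    r"(?P<msg>.*)$"
)
EVENT_RE = re.compile(r"EVENT_ID: (?P<name>\w+)\((?P<code>\d+)\), (?P<text>.*)")


class AuditEvent(NamedTuple):
    timestamp: str
    level: str
    name: str
    code: int
    text: str


def parse_audit_event(line: str) -> Optional[AuditEvent]:
    """Return the audit event carried by one log line, or None if there is none."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ev = EVENT_RE.match(m.group("msg"))
    if ev is None:
        return None  # a regular log line without an EVENT_ID
    return AuditEvent(m["ts"], m["level"], ev["name"], int(ev["code"]), ev["text"])
```

Filtering a whole engine.log for events whose name is USER_INITIATED_RUN_VM_FAILED or VM_DOWN_ERROR would then surface the failure text without scanning the file by hand.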



Expected results: VM started 


Additional info: bug found in automation environment jenkins-vm-15.qa.lab.tlv.redhat.com. logs attached

Comment 1 Michal Skrivanek 2018-02-01 13:55:52 UTC
LibvirtVmXmlBuilder::writeInput() needs a fix
Perhaps just run without a mouse altogether

Comment 2 Sharon Gratch 2018-02-01 14:18:06 UTC
(In reply to Michal Skrivanek from comment #1)
> LibvirtVmXmlBuilder::writeInput() needs a fix
> Perhaps just run without a mouse altogether

Right, the same way as done for disabling Tablet devices in case of HP VMs:
https://github.com/oVirt/ovirt-engine/blob/47d43aa114a3c89c751466dea044bbcc2bb2d5b9/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/builder/vminfo/LibvirtVmXmlBuilder.java#L2260
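
The idea discussed here can be sketched roughly as follows. This is an illustration in Python, not the engine's Java code: `build_input_element` is a hypothetical helper, and the bus mapping (USB mouse on ppc, PS/2 mouse elsewhere) is an assumption made for the example.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def build_input_element(arch: str, usb_enabled: bool) -> Optional[ET.Element]:
    """Return an <input> mouse element, or None when no usable bus exists."""
    # Assumption for this sketch: ppc guests attach the mouse to the USB bus,
    # while other architectures use PS/2.
    bus = "usb" if arch.startswith("ppc") else "ps2"
    if bus == "usb" and not usb_enabled:
        # High Performance VMs run with USB disabled, so a USB mouse
        # cannot be attached; run without a mouse altogether.
        return None
    return ET.Element("input", {"type": "mouse", "bus": bus})


def add_input(devices: ET.Element, arch: str, usb_enabled: bool) -> None:
    """Append the mouse to <devices> only when one can be built."""
    element = build_input_element(arch, usb_enabled)
    if element is not None:
        devices.append(element)
```

With this shape, a High Performance VM on ppc64le simply gets no input element instead of one that libvirt rejects with "Can't add USB input device. USB bus is disabled."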

Comment 3 Polina 2018-03-13 11:58:31 UTC
Tested on PPC build 4.2.2-4 (jenkins-vm-15.lab.eng.tlv2.redhat.com). The packages on the host:

vdsm-hook-vhostmd-4.20.20-1.el7ev.noarch
vdsm-hook-openstacknet-4.20.20-1.el7ev.noarch
vdsm-jsonrpc-4.20.20-1.el7ev.noarch
vdsm-hook-ethtool-options-4.20.20-1.el7ev.noarch
vdsm-client-4.20.20-1.el7ev.noarch
vdsm-yajsonrpc-4.20.20-1.el7ev.noarch
vdsm-common-4.20.20-1.el7ev.noarch
vdsm-http-4.20.20-1.el7ev.noarch
vdsm-hook-vfio-mdev-4.20.20-1.el7ev.noarch
vdsm-api-4.20.20-1.el7ev.noarch
vdsm-python-4.20.20-1.el7ev.noarch
vdsm-4.20.20-1.el7ev.ppc64le
vdsm-hook-fcoe-4.20.20-1.el7ev.noarch
vdsm-network-4.20.20-1.el7ev.ppc64le
vdsm-hook-vmfex-dev-4.20.20-1.el7ev.noarch

Steps: 1. Create VM. 2. Configure High Performance. 3. Run.
Actual result, from engine.log: "VM test_hp2 is down with error. Exit message: unsupported configuration: CPU cache specification is not supported for 'ppc64' architecture."
Expected: VM starts without error.

engine log attached

Comment 4 Polina 2018-03-13 12:01:00 UTC
Created attachment 1407505 [details]
look in engine.log line 8051

Comment 5 Sharon Gratch 2018-03-13 12:19:53 UTC
(In reply to Polina from comment #3)
> Tested on PPC BUILD 4.2.2-4 (jenkins-vm-15.lab.eng.tlv2.redhat.com). the
> packages on host :
> Steps: 1.Create VM, 2.Configure High Performance. 3.Run
> Actual Result:from engine.log: "VM test_hp2 is down with error. Exit
> message: unsupported configuration: CPU cache specification is not supported
> for 'ppc64' architecture."
> Expected: VM started w/o error.
It seems that enabling CPU cache layer 3 is not supported by libvirt on that ppc host.
Can you please check which libvirt version is installed on the host and also attach the vdsm and libvirt logs?

Thanks

Comment 6 Sharon Gratch 2018-03-13 16:20:44 UTC
According to [1], it seems that the cache element in the domain XML, which describes the virtual CPU cache, is not supported on non-x86 architectures, and therefore this line is not supported for ppc:
<cpu>
    ....
    <cache level="3" mode="emulate"/>
</cpu>

This means we should disable CPU cache layer 3 for HP VMs on the ppc arch.

[1] https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/qemu/qemu_domain.c#l3188

Francesco, can you please approve this or am I missing something? thanks.
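
A minimal sketch of the proposed behaviour: emit the cache element only on x86_64 and leave it out everywhere else. This is illustrative Python, not the engine's implementation; `build_cpu_element` and its parameters are hypothetical names.

```python
import xml.etree.ElementTree as ET


def build_cpu_element(arch: str, want_l3_cache: bool) -> ET.Element:
    """Build a <cpu> element, adding <cache> only where it is accepted."""
    cpu = ET.Element("cpu")
    # Per the discussion in this bug, the cache element is treated as
    # x86_64-only; on ppc and s390 it is omitted.
    if want_l3_cache and arch == "x86_64":
        ET.SubElement(cpu, "cache", {"level": "3", "mode": "emulate"})
    return cpu


if __name__ == "__main__":
    for arch in ("x86_64", "ppc64le"):
        print(arch, ET.tostring(build_cpu_element(arch, True), encoding="unicode"))
```

Keying the check on the architecture rather than on the VM type keeps the High Performance profile itself unchanged and drops only the element the target cannot accept.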

Comment 7 Yaniv Kaul 2018-03-14 07:32:40 UTC
(In reply to Sharon Gratch from comment #6)
> It seems according to [1] that the cache element in domain xml which
> describes the virtual CPU caching is not supported for non x86 architecture,
> and therefore this line is not supported for ppc:
> <cpu>
>     ....
>     <cache level="3" mode="emulate"/>
> </cpu>
> 
> Meaning that we should disable CPU cache layer 3 for HP vms in case of ppc
> arch. 
> 
> [1]
> https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/qemu/qemu_domain.c#l3188
> 
> Francesco, can you please approve this or am I missing something? thanks.

Also, I thought that in RHEL 7.4 and beyond this does not need to be set explicitly. Martin?

Comment 8 Francesco Romani 2018-03-14 07:49:07 UTC
(In reply to Sharon Gratch from comment #6)
> It seems according to [1] that the cache element in domain xml which
> describes the virtual CPU caching is not supported for non x86 architecture,
> and therefore this line is not supported for ppc:
> <cpu>
>     ....
>     <cache level="3" mode="emulate"/>
> </cpu>
> 
> Meaning that we should disable CPU cache layer 3 for HP vms in case of ppc
> arch. 
> 
> [1]
> https://libvirt.org/git/?p=libvirt.git;a=blob;f=src/qemu/qemu_domain.c#l3188
> 
> Francesco, can you please approve this or am I missing something? thanks.

Yes Sharon, you seem to be right.
The documentation is a bit vague, but it seems that we should use the <cache> element only on x86_64, so neither on PPC nor on s390.

Per https://libvirt.org/formatdomain.html#elementsCPU this is not that bad: the hypervisor driver is supposed to pick a good default if we don't set explicit values.

It is not uncommon that new libvirt features appear on x86_64 + qemu, and after a while they appear also on other arches/hypervisor drivers.

Feel free to file a libvirt documentation bug if you wish.

Comment 9 Martin Polednik 2018-03-14 09:35:10 UTC
I believe Yaniv is right - the CPU cache was meant to be set pre-7.4 and migrated to default later. It'd be worth verifying that it is the case though.

Comment 10 Yaniv Kaul 2018-03-15 14:09:20 UTC
Is this on track to 4.2.2? If not, please defer to 4.2.3.

Comment 12 Sharon Gratch 2018-03-18 15:14:40 UTC
(In reply to Yaniv Kaul from comment #10)
> Is this on track to 4.2.2? If not, please defer to 4.2.3.

Yes, it is on track for 4.2.2.

Comment 14 Polina 2018-04-24 07:42:19 UTC
The problem is verified on rhv-release-4.2.3-2-001.noarch.

Comment 15 Sandro Bonazzola 2018-04-27 07:22:22 UTC
This bugzilla is included in oVirt 4.2.2 release, published on March 28th 2018.

Since the problem described in this bug report should be
resolved in oVirt 4.2.2 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

Comment 16 Roy Golan 2020-08-03 12:33:50 UTC
This popped up again in the RHV IPI use case, which creates a VM through the API from the Blank template and sets its type to high_performance.

oVirt 4.4.2, cluster version 4.3; the VM console type is spice+vnc.

Setting the console to vnc works around this.


2020-08-03 10:37:40,796+03 INFO  [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer] (ForkJoinPool-1-worker-31) [] VM 'a50993e0-7161-4969-ae01-2ad8fdbef02f'(catapult-znhpr-master-2) moved from 'WaitForLaunch' --> 'Down'
2020-08-03 10:37:40,810+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (ForkJoinPool-1-worker-31) [] EVENT_ID: VM_DOWN_ERROR(119), VM catapult-znhpr-master-2 is down with error. Exit message: unsupported configuration: Can't add USB input device. USB bus is disabled.


Reopen or a new bug?

Comment 17 Michal Skrivanek 2020-08-03 12:59:48 UTC
ppc64le doesn't have SPICE. You need to set the console type to VNC only.
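
As a rough illustration of that rule, a client creating VMs could filter the requested console types before sending the request. This is a sketch with hypothetical names (`effective_graphics`); it does not call any oVirt API.

```python
from typing import List


def effective_graphics(arch: str, requested: List[str]) -> List[str]:
    """Drop SPICE from the requested console types on ppc architectures."""
    if arch.startswith("ppc"):
        # ppc64le has no SPICE, so only the remaining types (VNC) are kept.
        return [g for g in requested if g.lower() != "spice"]
    return list(requested)


if __name__ == "__main__":
    print(effective_graphics("ppc64le", ["spice", "vnc"]))
    print(effective_graphics("x86_64", ["spice", "vnc"]))
```

A caller should still handle the case where the filtered list comes back empty, for example by falling back to VNC explicitly.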

Comment 18 Liran Rotenberg 2020-08-03 13:10:10 UTC
(In reply to Roy Golan from comment #16)
> Popped up again in RHV IPI use case, where it creates a VM using the API,
> from blank template, set type to high_performance.
> 
> ovirt 4.4.2, cluster version 4.3, vm console type is spice+vnc.
> 
> Setting the console to vnc works around this.
> 
> 
> 2020-08-03 10:37:40,796+03 INFO 
> [org.ovirt.engine.core.vdsbroker.monitoring.VmAnalyzer]
> (ForkJoinPool-1-worker-31) [] VM
> 'a50993e0-7161-4969-ae01-2ad8fdbef02f'(catapult-znhpr-master-2) moved
> from 'WaitForLaunch' --> 'Down'
> 2020-08-03 10:37:40,810+03 ERROR
> [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector]
> (ForkJoinPool-1-worker-31) [] EVENT_ID: VM_DOWN_ERROR(119), VM
> catapult-znhpr-master-2
>  is down with error. Exit message: unsupported configuration: Can't add USB
> input device. USB bus is disabled.
> 
> 
> Reopen or a new bug?

This bug is specific to PPC.
Since you are using spice+vnc, I guess you are not on PPC. Right?

