Bug 1844270 - [vGPU] nodisplay option for mdev broken since mdev scheduling unit
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.4.0
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: ovirt-4.4.1
Target Release: 4.4.1
Assignee: Lucia Jelinkova
QA Contact: Nisim Simsolo
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-06-04 23:15 UTC by Germano Veit Michel
Modified: 2023-12-15 18:05 UTC

Fixed In Version: ovirt-engine-4.4.1.5
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-08-04 13:22:49 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 5133831 | 0 | None | None | None | 2020-06-05 03:37:22 UTC
Red Hat Product Errata | RHSA-2020:3247 | 0 | None | None | None | 2020-08-04 13:23:38 UTC
oVirt gerrit | 109650 | 0 | master | MERGED | engine: MDevicePolicyUnit considers nodisplay | 2021-02-18 09:07:54 UTC

Description Germano Veit Michel 2020-06-04 23:15:37 UTC
Description of problem:

The new scheduling unit to ensure mdev devices are available on the host during scheduling broke the 'nodisplay' option for vGPU mdevs.
~~~
    core: created scheduling unit for mdev devices
    
    Created scheduling unit that considers if mDev devices are present
    and available on the host.
    
    Change-Id: I89c06c34e1ae5724be83a44017b762e2c4ccc068
~~~

This no longer works, because 'nodisplay' in the mdev list is an option, not an actual device:

https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/builder/vminfo/LibvirtVmXmlBuilder.java#L1262

As a result, the VM cannot run, because no host has a 'nodisplay' device:

2020-06-05 09:03:02,480+10 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-4) [244510ee-70cb-4d97-b7d4-1c6efaf30072] EVENT_ID: USER_FAILED_RUN_VM(54), Failed to run VM GPU2 due to a failed validation: [Cannot run VM. There is no host that satisfies current scheduling constraints. See below for details:, The host host.example.com did not satisfy internal filter MDevice because some of the required mDev devices are missing (nodisplay).] (User: admin@internal-authz).
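
A minimal sketch of the behavior the description implies, not the actual ovirt-engine code: the host filter should set aside the 'nodisplay' token and only require the real mdev types. The class and method names (MdevSpec, parse, missingOnHost) are hypothetical, and the sketch assumes 'nodisplay' is the only non-device token and may appear anywhere in the comma-separated value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Illustrative only; names are hypothetical and not taken from ovirt-engine. */
final class MdevSpec {
    // Assumption: "nodisplay" is the only option token in the mdev_type value.
    static final String NODISPLAY = "nodisplay";

    final boolean display;          // false when "nodisplay" was requested
    final List<String> deviceTypes; // real mdev types, e.g. "nvidia-xxx"

    private MdevSpec(boolean display, List<String> deviceTypes) {
        this.display = display;
        this.deviceTypes = deviceTypes;
    }

    /** Splits a comma-separated mdev_type value into the display flag and the real device types. */
    static MdevSpec parse(String mdevType) {
        boolean display = true;
        List<String> types = new ArrayList<>();
        if (mdevType != null) {
            for (String token : mdevType.split(",")) {
                String t = token.trim();
                if (t.isEmpty()) {
                    continue;
                }
                if (NODISPLAY.equals(t)) {
                    display = false; // an option, not a device
                } else {
                    types.add(t);
                }
            }
        }
        return new MdevSpec(display, types);
    }

    /** Returns the requested device types the host lacks; the option token is never reported. */
    static List<String> missingOnHost(String mdevType, Set<String> hostTypes) {
        List<String> missing = new ArrayList<>(parse(mdevType).deviceTypes);
        missing.removeAll(hostTypes);
        return missing;
    }
}
```

With this split, a host offering "nvidia-xxx" satisfies a VM requesting "nodisplay,nvidia-xxx", whereas a filter that treats every token as a device reports 'nodisplay' as missing, as in the log line above.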

Version-Release number of selected component (if applicable):
rhvm-4.4.0-0.34.master.el8ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. On a host with vGPU, run a VM with the mdev_type custom property set to "nodisplay,nvidia-xxx"

Comment 1 Michal Skrivanek 2020-06-05 09:32:02 UTC
Eh, it was supposed to be a setting that keeps the original mdev behavior, so it's unfortunate that it got broken for a different reason. I'd say it's High because of that.

Comment 2 Sandro Bonazzola 2020-07-01 09:21:53 UTC
This bug is in POST state, targeted to 4.4.1 with pending patches not merged yet.
At this time we are handling only blockers for 4.4.1.
Please either mark this bug as blocker or move it out to >= 4.4.2

Comment 5 Nisim Simsolo 2020-07-13 16:35:34 UTC
Verified:
ovirt-engine-4.4.1.8-0.7.el8ev
vdsm-4.40.22-1.el8ev.x86_64
libvirt-daemon-6.0.0-25.module+el8.2.1+7154+47ffd890.x86_64
qemu-kvm-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
Nvidia GRID 11.0 GA drivers

Verification scenario:
1. Run VM with custom property mdev_type nvidia-xx and Nvidia drivers installed inside the VM.
   Observe VM qemu process and verify display=on
   Verify the VM console is using the VM secondary display and that the console is showing the VM screen.
2. Power off VM, edit custom property mdev_type to: nodisplay,nvidia-xx
   Run VM, observe qemu process and verify display=off
   Verify VM console is now using VM emulated graphics (screen is blank).
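
The "observe qemu process" checks above could be scripted roughly as follows. This is a sketch under assumptions: the class name QemuDisplayCheck is hypothetical, the command line is assumed to use the classic key=value form of "-device vfio-pci,...", and the caller is assumed to have already read the command line of the qemu process (for example from /proc).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper; not part of ovirt-engine or vdsm. */
final class QemuDisplayCheck {
    // Matches one "-device vfio-pci,..." argument and captures its display= value.
    // [^\s]*? keeps the match inside a single whitespace-delimited argument.
    private static final Pattern VFIO_DISPLAY =
            Pattern.compile("-device\\s+vfio-pci[^\\s]*?display=(on|off|auto)");

    /** Returns the display setting of the first vfio-pci device that declares one. */
    static Optional<String> vfioDisplay(String cmdline) {
        Matcher m = VFIO_DISPLAY.matcher(cmdline);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

For step 1 the helper would be expected to return "on", and for step 2 "off"; an empty result means no vfio-pci device on the command line carries a display setting.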

Comment 7 errata-xmlrpc 2020-08-04 13:22:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: RHV Manager (ovirt-engine) 4.4 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:3247

