Description of problem:
The new scheduling unit that ensures mdev devices are available on the host during scheduling broke the 'nodisplay' option for vGPU mdevs.

~~~
core: created scheduling unit for mdev devices

Created scheduling unit that considers if mDev devices are present and available on the host.

Change-Id: I89c06c34e1ae5724be83a44017b762e2c4ccc068
~~~

The 'nodisplay' option no longer works, because the scheduling unit treats 'nodisplay' as an mdev device, when it is actually an option and not a device:
https://github.com/oVirt/ovirt-engine/blob/master/backend/manager/modules/vdsbroker/src/main/java/org/ovirt/engine/core/vdsbroker/builder/vminfo/LibvirtVmXmlBuilder.java#L1262

As a result, the VM cannot run, because no host has a 'nodisplay' device:

2020-06-05 09:03:02,480+10 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (default task-4) [244510ee-70cb-4d97-b7d4-1c6efaf30072] EVENT_ID: USER_FAILED_RUN_VM(54), Failed to run VM GPU2 due to a failed validation: [Cannot run VM. There is no host that satisfies current scheduling constraints. See below for details:, The host host.example.com did not satisfy internal filter MDevice because some of the required mDev devices are missing (nodisplay).] (User: admin@internal-authz).

Version-Release number of selected component (if applicable):
rhvm-4.4.0-0.34.master.el8ev.noarch

How reproducible:
Always

Steps to Reproduce:
1. vGPU and VM with "nodisplay,nvidia-xxx" mdev custom property
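To illustrate the distinction the report draws, here is a minimal Python sketch of how a scheduling check could separate the 'nodisplay' option from the real mdev types before comparing against what a host offers. This is not the ovirt-engine code (which is Java); the names `MdevRequest`, `parse_mdev_type`, and `missing_mdev_types` are hypothetical, and accepting 'nodisplay' at any position in the list is a simplifying assumption.

```python
from dataclasses import dataclass

# Option keyword taken from the report; it is a flag, not a device type.
NODISPLAY = "nodisplay"


@dataclass(frozen=True)
class MdevRequest:
    """Hypothetical parsed form of the mdev custom property."""
    display: bool
    device_types: tuple


def parse_mdev_type(value):
    """Split the comma-separated property into a display flag and device types."""
    tokens = [t.strip() for t in value.split(",") if t.strip()]
    display = NODISPLAY not in tokens
    device_types = tuple(t for t in tokens if t != NODISPLAY)
    return MdevRequest(display=display, device_types=device_types)


def missing_mdev_types(value, host_available_types):
    """Return the requested device types the host lacks, ignoring 'nodisplay'."""
    request = parse_mdev_type(value)
    available = set(host_available_types)
    return [t for t in request.device_types if t not in available]
```

With this separation, a request of "nodisplay,nvidia-xxx" against a host offering only nvidia-xxx reports no missing devices, whereas a check that treats every token as a device would report 'nodisplay' as missing, as in the error above.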
Eh, it was supposed to be a setting that keeps the original mdev behavior, so it's unfortunate that it got broken for a different reason. I'd say it's High because of that.
This bug is in POST state, targeted to 4.4.1, with pending patches not merged yet. At this time we are handling only blockers for 4.4.1. Please either mark this bug as a blocker or move it out to >= 4.4.2.
Verified:
ovirt-engine-4.4.1.8-0.7.el8ev
vdsm-4.40.22-1.el8ev.x86_64
libvirt-daemon-6.0.0-25.module+el8.2.1+7154+47ffd890.x86_64
qemu-kvm-4.2.0-29.module+el8.2.1+7297+a825794d.x86_64
Nvidia GRID 11.0 GA drivers

Verification scenario:
1. Run the VM with custom property mdev_type nvidia-xx and Nvidia drivers installed inside the VM.
   Observe the VM qemu process and verify display=on.
   Verify the VM console is using the VM secondary display and the console is showing the VM screen.
2. Power off the VM, edit custom property mdev_type to: nodisplay,nvidia-xx
   Run the VM, observe the qemu process and verify display=off.
   Verify the VM console is now using the VM emulated graphics (screen is blank).
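The "observe qemu process" check in the scenario above could be scripted roughly as follows. This is only a sketch: it assumes the qemu command line uses the comma-separated `-device vfio-pci,...,display=on|off` form and that the command line has already been obtained as a single space-separated string (for example from `ps`). The function name `vfio_display_settings` is hypothetical.

```python
import re


def vfio_display_settings(qemu_cmdline):
    """Return the display= values of all vfio-pci -device arguments."""
    settings = []
    # Each -device argument is a driver name followed by key=value properties.
    for match in re.finditer(r"-device\s+(\S+)", qemu_cmdline):
        props = match.group(1).split(",")
        if props[0] != "vfio-pci":
            continue
        for prop in props[1:]:
            key, _, value = prop.partition("=")
            if key == "display":
                settings.append(value)
    return settings
```

For step 1 the expected result would be `["on"]` and for step 2 `["off"]`, assuming a single vGPU is attached to the VM.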
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: RHV Manager (ovirt-engine) 4.4 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:3247