Bug 1620599 - [RFE] Assign more than one mdev device to a VM from RHV web UI
Summary: [RFE] Assign more than one mdev device to a VM from RHV web UI
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: ovirt-engine
Classification: oVirt
Component: BLL.Virt
Version: 4.3.0
Hardware: x86_64
OS: All
Priority: high
Severity: high
Target Milestone: ovirt-4.2.7
Target Release: ---
Assignee: Ryan Barry
QA Contact: Nisim Simsolo
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-08-23 09:49 UTC by Martin Tessun
Modified: 2018-11-02 14:38 UTC
CC: 7 users

Fixed In Version: ovirt-engine-4.2.7.2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1609139
Environment:
Last Closed: 2018-11-02 14:38:21 UTC
oVirt Team: Virt
Embargoed:
rule-engine: ovirt-4.2+
mtessun: planning_ack+
rule-engine: devel_ack+
rule-engine: testing_ack+


Attachments
Example domain XML (6.74 KB, application/xml)
2018-09-05 10:50 UTC, Milan Zamazal


Links
System ID Private Priority Status Summary Last Updated
oVirt gerrit 94217 0 master MERGED backend: allow multiple values for mdev_type 2021-02-08 14:45:45 UTC
oVirt gerrit 94218 0 master ABANDONED [DRAFT] backend: extend VM properties to show repeatable keys in the UI 2021-02-08 14:45:45 UTC
oVirt gerrit 94575 0 ovirt-engine-4.2 MERGED backend: allow multiple values for mdev_type 2021-02-08 14:45:45 UTC

Description Martin Tessun 2018-08-23 09:49:40 UTC
Currently we can assign only one mdev_type to a VM from the oVirt web UI.

To support multiple Nvidia vGPU devices assigned to one VM, we should be able to assign multiple mdev_type values to a VM from the web UI.

Please note that the NVIDIA drivers are not supported by NVIDIA for oVirt.
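To make the request concrete, here is a minimal sketch (Python, illustrative only -- not the actual engine or Vdsm code; the function name is made up) of the comma-separated parsing this RFE implies for the mdev_type custom property:

  def parse_mdev_types(custom_property):
      """Split an mdev_type custom property value such as
      'nvidia-22,nvidia-22' into one entry per requested device."""
      return [t.strip() for t in custom_property.split(",") if t.strip()]

  print(parse_mdev_types("nvidia-22,nvidia-22,nvidia-22,nvidia-22"))
  # ['nvidia-22', 'nvidia-22', 'nvidia-22', 'nvidia-22']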

Comment 1 Milan Zamazal 2018-08-27 12:33:08 UTC
Please note that the current Vdsm code doesn't support using multiple Nvidia mdev types on a single host, with the explanation that this is not supported by Nvidia. Multiple instances of the same type can be run, though.

That should be clarified, and the Vdsm code adjusted if needed. Other than that, the current Vdsm code should work with multiple mdev devices, but it will of course need some testing.

Comment 2 Milan Zamazal 2018-08-27 13:34:26 UTC
OK, the limitation applies only to a single device; it should work fine with multiple devices.

Comment 4 Ryan Barry 2018-09-05 01:56:37 UTC
Milan, are you sure vdsm supports this? I've tried a couple of different permutations of the XML, and the VDSM testcases fail in all instances.

I need another patch on the engine side in any case, since the engine also doesn't expect duplicate custom properties, but I'd like to either confirm how VDSM expects the XML to look or find out whether I should submit a patch to VDSM to support this.

Comment 5 Milan Zamazal 2018-09-05 10:27:29 UTC
Ryan, when I try to start a VM with two vGPU devices, it passes Vdsm preparation, but the VM fails to start on QEMU level:

  qemu-kvm: -device vfio-pci,id=hostdev1,sysfsdev=/sys/bus/mdev/devices/f68f492a-be98-4581-8453-21a2772d87ad,bus=pci.0,addr=0x7: vfio error: f68f492a-be98-4581-8453-21a2772d87ad: error getting device from group 1: Operation not permitted
  Verify all devices in group 1 are bound to vfio-<bus> or pci-stub and not already in use

So Vdsm looks ready, but it doesn't work. I don't know whether I'm doing something wrong in my testing or whether there is something wrong in Vdsm or elsewhere -- it would require further investigation.
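For reference, here is a minimal diagnostic sketch (Python, standalone -- not Vdsm code) for this kind of vfio failure: it lists every device in the named IOMMU group and the driver each one is bound to, since "Operation not permitted" typically means some device in the group is bound to something other than vfio-pci/pci-stub, or is already in use:

  import os

  def group_bindings(group):
      """Map each device in an IOMMU group to its bound driver."""
      base = "/sys/kernel/iommu_groups/%s/devices" % group
      bindings = {}
      for dev in os.listdir(base):
          driver = os.path.join(base, dev, "driver")
          # The 'driver' symlink is absent when nothing is bound.
          bindings[dev] = (os.path.basename(os.readlink(driver))
                           if os.path.islink(driver) else None)
      return bindings

  print(group_bindings("1"))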

Comment 6 Ryan Barry 2018-09-05 10:37:51 UTC
Thanks. Can you share an example of the VM XML? I haven't found one which passes the prep tests. The elementtree search appears to be looking for a single ovirt-vm:custom/mdevType node.
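To illustrate the parsing question (a sketch only, with made-up metadata -- not the actual Vdsm code or the attached XML): ElementTree's find() returns just the first matching node, so a search written for a single ovirt-vm:custom/mdevType element would silently ignore a repeated one, while findall() sees them all:

  import xml.etree.ElementTree as ET

  NS = {"ovirt-vm": "http://ovirt.org/vm/1.0"}
  METADATA = """
  <metadata>
    <ovirt-vm:vm xmlns:ovirt-vm="http://ovirt.org/vm/1.0">
      <ovirt-vm:custom>
        <ovirt-vm:mdevType>nvidia-22</ovirt-vm:mdevType>
        <ovirt-vm:mdevType>nvidia-22</ovirt-vm:mdevType>
      </ovirt-vm:custom>
    </ovirt-vm:vm>
  </metadata>
  """
  root = ET.fromstring(METADATA)
  first = root.find(".//ovirt-vm:custom/ovirt-vm:mdevType", NS)
  every = root.findall(".//ovirt-vm:custom/ovirt-vm:mdevType", NS)
  print(first.text, len(every))  # nvidia-22 2

(Per comment 14, the fix that was eventually merged takes comma-separated values in a single property rather than repeated elements.)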

Comment 7 Milan Zamazal 2018-09-05 10:50:38 UTC
Created attachment 1481074 [details]
Example domain XML

Comment 8 Milan Zamazal 2018-09-05 10:51:53 UTC
Attached. It uses the same mdev type for both devices, but the result is the same if I use a different mdev type for the added device.
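The attachment itself is not inlined in this report; for readers without access to it, the following is a hedged reconstruction (UUIDs are invented for illustration) of what two vfio-pci mdev hostdev entries look like in a libvirt domain XML <devices> section, matching the sysfsdev paths in the qemu error from comment 5, with an ElementTree count check:

  import xml.etree.ElementTree as ET

  DEVICES = """
  <devices>
    <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
      <source><address uuid='11111111-1111-1111-1111-111111111111'/></source>
    </hostdev>
    <hostdev mode='subsystem' type='mdev' model='vfio-pci'>
      <source><address uuid='22222222-2222-2222-2222-222222222222'/></source>
    </hostdev>
  </devices>
  """
  root = ET.fromstring(DEVICES)
  print(len(root.findall("./hostdev[@type='mdev']")))  # 2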

Comment 9 Guo, Zhiyi 2018-09-06 05:21:30 UTC
(In reply to Milan Zamazal from comment #8)
> Attached. It uses the same mdev type for both the devices, but the result is
> the same if I use a different mdev type for the added device.

In qemu-kvm-rhev testing, multiple vGPUs inside one guest are only supported with the M60-8Q vGPU type.

Comment 14 Nisim Simsolo 2018-10-11 10:32:36 UTC
Verification builds: 
kernel-3.10.0-954.el7.x86_64
ovirt-engine-4.2.7.2-0.1.el7ev
vdsm-4.20.42-1.el7ev.x86_64
qemu-kvm-rhev-2.12.0-18.el7.x86_64
libvirt-client-4.5.0-10.el7.x86_64
Host: NVIDIA-vGPU-rhel-7.6-410.62.x86_64
VMs: GRID6.3-GA-392.05-Windows-Guest-Drivers
     NVIDIA-Linux-x86_64-390.96-grid
Hardware: 2 x Tesla M60 on the same host.

Verification scenario: 
1. Browse Webadmin -> Compute -> VMs -> edit a RHEL 7 VM -> Custom Properties -> select mdev_type and assign 4 mdev devices to the VM, for example:
nvidia-22,nvidia-22,nvidia-22,nvidia-22
2. Run the VM.
3. Observe nvidia-smi on the host and verify that 4 GPUs are attached to the VM with the same PID.
4. Open the VM and verify that there are 4 M60 PCI devices.
5. Install the Nvidia drivers on the VM and verify driver functionality.
6. Repeat steps 1-5 using different Windows and Linux VM OS types.
7. Power off the VMs and verify the mdev devices are removed from /sys/class/mdev_bus/0000\:8X\:00.0/ (see the sketch below).
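The following is a small helper sketch for the step 7 check (an illustration, not part of the actual test tooling; the PCI address in the usage comment is a made-up example): it lists any mdev instances, which appear as UUID-named directories, still present under a parent GPU in /sys/class/mdev_bus/:

  import os
  import re

  UUID_RE = re.compile(r"^[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}$")

  def active_mdevs(parent_pci):
      """Return the mdev instance UUIDs still present under a parent
      device such as a Tesla M60 GPU; expect [] after VM power-off."""
      base = "/sys/class/mdev_bus/%s" % parent_pci
      return [d for d in os.listdir(base) if UUID_RE.match(d)]

  # Example (hypothetical PCI address):
  # print(active_mdevs("0000:86:00.0"))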

Polarion test case added to external trackers.

Comment 15 Sandro Bonazzola 2018-11-02 14:38:21 UTC
This bug is included in the oVirt 4.2.7 release, published on November 2nd 2018.

Since the problem described in this bug report should be resolved in the oVirt 4.2.7 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.

