
Bug 1980331

Summary: manage vGPU dialog: maximum value of available mdev instances to attach is incorrect.
Product: [oVirt] ovirt-engine Reporter: Nisim Simsolo <nsimsolo>
Component: ovirt-engine-ui-extensions Assignee: Lucia Jelinkova <ljelinko>
Status: CLOSED CURRENTRELEASE QA Contact: Nisim Simsolo <nsimsolo>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.4.7 CC: ahadas, bugs, dfodor, nsimsolo, sgratch
Target Milestone: ovirt-4.5.0 Flags: pm-rhel: ovirt-4.5?
Target Release: 4.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-engine-4.5.0 ovirt-engine-ui-extensions-1.3.1-1 Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2022-04-28 09:26:34 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Nisim Simsolo 2021-07-08 11:52:30 UTC
Description of problem:
Currently, when using the Manage vGPU dialog to add multiple mdev devices to a VM, the maximum number of available mdev devices is calculated from the number of mdev_bus entries used by the vGPU card, instead of multiplying the number of vGPU mdev_bus entries by the mdev_type max_instance value.

For example, on a host with 2x Tesla M60 there are 4 mdev_bus entries in use:
[root@lion01 ~]# ls -l /sys/class/mdev_bus/
total 0
lrwxrwxrwx. 1 root root 0 Jul  7 01:11 0000:84:00.0 -> ../../devices/pci0000:80/0000:80:02.0/0000:82:00.0/0000:83:08.0/0000:84:00.0
lrwxrwxrwx. 1 root root 0 Jul  7 01:11 0000:85:00.0 -> ../../devices/pci0000:80/0000:80:02.0/0000:82:00.0/0000:83:10.0/0000:85:00.0
lrwxrwxrwx. 1 root root 0 Jul  7 01:11 0000:8b:00.0 -> ../../devices/pci0000:80/0000:80:03.0/0000:86:00.0/0000:87:10.0/0000:89:00.0/0000:8a:08.0/0000:8b:00.0
lrwxrwxrwx. 1 root root 0 Jul  7 01:11 0000:8c:00.0 -> ../../devices/pci0000:80/0000:80:03.0/0000:86:00.0/0000:87:10.0/0000:89:00.0/0000:8a:10.0/0000:8c:00.0

And if we want to use the nvidia-18 mdev_type, the max_instance per bus is 4:
mdev_type: nvidia-18 --- description: num_heads=4, frl_config=60, framebuffer=2048M, max_resolution=5120x2880, max_instance=4 --- name: GRID M60-2Q

So the maximum number of available mdev instances in that case should be 16 (4 mdev_bus x max_instance=4) instead of 4.
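
For illustration, a minimal Python sketch of the calculation the dialog should perform on a single host, assuming the NVIDIA-style sysfs layout shown above where each mdev_bus entry exposes the type description (including max_instance) under mdev_supported_types. The path and the description parsing are assumptions for this example only, not the actual oVirt/vdsm code path:

#!/usr/bin/env python3
# Sketch only: theoretical upper limit of mdev instances for one mdev type on
# one host, computed as (number of buses supporting the type) x max_instance.
import os
import re

MDEV_BUS_ROOT = "/sys/class/mdev_bus"  # assumption: NVIDIA-style sysfs layout

def max_instances_for_type(mdev_type):
    total = 0
    for bus in os.listdir(MDEV_BUS_ROOT):
        desc = os.path.join(MDEV_BUS_ROOT, bus,
                            "mdev_supported_types", mdev_type, "description")
        if not os.path.isfile(desc):
            continue  # this bus does not expose the requested mdev type
        with open(desc) as f:
            m = re.search(r"max_instance=(\d+)", f.read())
        if m:
            total += int(m.group(1))
    return total

# With 2x Tesla M60 (4 mdev_bus entries) and nvidia-18 (max_instance=4),
# this should return 16 rather than the 4 the dialog currently offers.
print(max_instances_for_type("nvidia-18"))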


Also, when more than one host with vGPU is managed by the same engine, the upper limit should be that of the host with the most available vGPU instances (for example, if the 1st host has 1 vGPU card and the 2nd host has 4 vGPU cards, the upper limit should be taken from the 2nd host).
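
A rough sketch of the multi-host case (hypothetical per-host numbers, not the engine's actual data structures):

# Sketch only: the dialog's upper bound should be the largest per-host
# capacity across all vGPU hosts managed by the same engine.
per_host_capacity = {
    "host1": 4,    # e.g. 1 vGPU card
    "host2": 16,   # e.g. 2x Tesla M60, nvidia-18: 4 buses x max_instance=4
}
dialog_upper_limit = max(per_host_capacity.values())  # 16, taken from host2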

This bug is related to Bug 1860646 - [RFE] Manage vGPU dialog, add option for assigning more than one vGPU instance to VM.

A workaround for this issue is to use the Edit VM dialog -> Custom Properties -> mdev_type drop-down.


Version-Release number of selected component (if applicable):
ovirt-engine-4.4.7.6-0.11.el8ev
vdsm-4.40.70.6-1.el8ev.x86_64
libvirt-daemon-7.0.0-14.1.module+el8.4.0+11095+d46acebf.x86_64
qemu-kvm-5.2.0-16.module+el8.4.0+11536+725e25d9.2.x86_64
NVIDIA-vGPU-rhel-8.4-460.73.02.x86_64

How reproducible:
100%

Steps to Reproduce:
Assuming the host has 2x Tesla M60 (which uses 4 mdev_bus entries).
1. Browse to WebAdmin -> VM -> Host Devices tab and open the Manage vGPU dialog.
2. Try to add more than 1 nvidia-22 instance (max_instance=1).

Actual results:
It is possible to add only 1 nvidia-22 instance to the VM.

Expected results:
The maximum number of available instances should be 4: (max_instance=1) x (mdev_bus=4).

Additional info:
Manage vGPU dialog screenshot is attached.

Comment 2 Arik 2021-08-18 08:42:03 UTC
not yet shipped

Comment 3 Nisim Simsolo 2022-04-27 13:52:42 UTC
Verified:
ovirt-engine-4.5.0.4-0.1.el8ev
qemu-kvm-6.2.0-11.module+el8.6.0+14707+5aa4b42d.x86_64
vdsm-4.50.0.13-1.el8ev.x86_64
libvirt-daemon-8.0.0-5.module+el8.6.0+14480+c0a3aa0f.x86_64
Nvidia 14.0 GA drivers

Comment 4 Sandro Bonazzola 2022-04-28 09:26:34 UTC
This bugzilla is included in the oVirt 4.5.0 release, published on April 20th 2022.

Since the problem described in this bug report should be resolved in the oVirt 4.5.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.