This bug was initially created as a copy of Bug #1846343

I am copying this bug because:

Description of problem:
After adding an Nvidia vGPU instance, either via WebAdmin -> VM -> host devices -> manage vGPU button or via edit VM -> custom properties -> mdev_type, the VM fails to run with the following errors in vdsm.log:

2020-06-11 15:04:14,007+0300 ERROR (vm/6099c96f) [virt.vm] (vmId='6099c96f-d79d-47ae-b39f-9489bc552cf0') The vm start process failed (vm:871)
Traceback (most recent call last):
.
.
libvirt.libvirtError: internal error: Process exited prior to exec: libvirt: error : failed to access '/sys/bus/mdev/devices/e1f27070-b062-4ea3-a689-89e37a56f677/iommu_group': No such file or directory

2020-06-11 15:04:18,533+0300 ERROR (jsonrpc/1) [root] Couldn't parse NVDIMM device data (hostdev:755)
Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/vdsm/common/hostdev.py", line 753, in list_nvdimms
    data = json.loads(output)
  File "/usr/lib64/python3.6/json/__init__.py", line 354, in loads
    return _default_decoder.decode(s)
  File "/usr/lib64/python3.6/json/decoder.py", line 339, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/usr/lib64/python3.6/json/decoder.py", line 357, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

--------------------------
The Nvidia vGPU drivers are installed and the Nvidia service is running.
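
A note on the second traceback: "Expecting value: line 1 column 1 (char 0)" is what json.loads() reports when its input does not start with a JSON value at all, which is consistent with empty or non-JSON command output. Whether the output was actually empty here is an assumption; the log does not show it. A minimal sketch of a tolerant parser follows. parse_device_json is a hypothetical helper for illustration and is not the vdsm implementation.

```python
import json


def parse_device_json(output):
    """Parse command output as JSON, treating empty or malformed output
    as "no devices".

    Hypothetical helper for illustration only; not taken from vdsm.
    """
    if not output or not output.strip():
        # Nothing to parse: report no devices instead of raising.
        return []
    try:
        return json.loads(output)
    except json.JSONDecodeError:
        # Output was present but not valid JSON.
        return []


# Reproduce the message from the traceback with an empty string.
try:
    json.loads("")
except json.JSONDecodeError as exc:
    empty_input_error = str(exc)
```

With such a guard, a host without usable NVDIMM data would yield an empty device list rather than an ERROR traceback in vdsm.log. This only concerns the NVDIMM message and does not address the vGPU start failure itself.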
vGPU instances are also visible on the host, for example:

# /home/nsimsolo/vgpu_instances1.sh
mdev_type: nvidia-11 --- description: num_heads=2, frl_config=45, framebuffer=512M, max_resolution=2560x1600, max_instance=16 --- name: GRID M60-0B
mdev_type: nvidia-12 --- description: num_heads=2, frl_config=60, framebuffer=512M, max_resolution=2560x1600, max_instance=16 --- name: GRID M60-0Q
mdev_type: nvidia-13 --- description: num_heads=1, frl_config=60, framebuffer=1024M, max_resolution=1280x1024, max_instance=8 --- name: GRID M60-1A
mdev_type: nvidia-14 --- description: num_heads=4, frl_config=45, framebuffer=1024M, max_resolution=5120x2880, max_instance=8 --- name: GRID M60-1B

----------------
This issue is not related to the emulated machine type (it occurred on both pc-i440fx and Q35).

Version-Release number of selected component (if applicable):
ovirt-engine-4.4.1.2-0.10.el8ev
vdsm-4.40.19-1.el8ev.x86_64
libvirt-daemon-6.0.0-22.module+el8.2.1+6815+1c792dc8.x86_64
qemu-kvm-4.2.0-22.module+el8.2.1+6758+cb8d64c2.x86_64
Nvidia host drivers (Tesla M60): NVIDIA-vGPU-rhel-8.2-450.36.01.x86_64

How reproducible:
100%

Steps to Reproduce:
1. In WebAdmin, click the VM name -> host devices tab -> manage vGPU, select an Nvidia instance and click the "save" button.
2. Run the VM.

Actual results:
The VM fails to run.

Expected results:
The VM should run with the vGPU device attached.

Additional info:
vdsm.log and engine.log attached
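
For anyone triaging this on a host, the two conditions in the report (supported mdev types are listed, but the created device has no iommu_group entry) can be checked with a short script. This is a rough sketch, not the vgpu_instances1.sh script used above, whose contents are not shown here. The default paths assume the standard kernel mdev sysfs layout (/sys/class/mdev_bus/<parent>/mdev_supported_types and /sys/bus/mdev/devices/<uuid>); the function names are hypothetical.

```python
import os


def _read(directory, filename):
    """Return the stripped contents of a sysfs attribute, or None if unreadable."""
    try:
        with open(os.path.join(directory, filename)) as f:
            return f.read().strip()
    except OSError:
        return None


def list_mdev_types(mdev_bus_root="/sys/class/mdev_bus"):
    """Return (parent, mdev_type, name, available_instances) tuples.

    The default root is an assumption based on the kernel mdev sysfs layout.
    """
    results = []
    if not os.path.isdir(mdev_bus_root):
        return results
    for parent in sorted(os.listdir(mdev_bus_root)):
        types_dir = os.path.join(mdev_bus_root, parent, "mdev_supported_types")
        if not os.path.isdir(types_dir):
            continue
        for mdev_type in sorted(os.listdir(types_dir)):
            type_dir = os.path.join(types_dir, mdev_type)
            results.append((
                parent,
                mdev_type,
                _read(type_dir, "name"),
                _read(type_dir, "available_instances"),
            ))
    return results


def mdev_has_iommu_group(uuid, devices_root="/sys/bus/mdev/devices"):
    """Check for the path libvirt failed to access in the error above."""
    return os.path.exists(os.path.join(devices_root, uuid, "iommu_group"))
```

On the affected host, calling mdev_has_iommu_group("e1f27070-b062-4ea3-a689-89e37a56f677") right after the failed start would show whether the mdev device from the libvirt error exists at all, while list_mdev_types() should return the nvidia-11 to nvidia-14 entries shown above.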
We need to document the workaround until bug 1846343 is fixed: either https://bugzilla.redhat.com/show_bug.cgi?id=1846343#c18 or https://bugzilla.redhat.com/show_bug.cgi?id=1846343#c24. Milan, your call.