Bug 2081241

Summary: VFIO_MAP_DMA failed: Cannot allocate memory -12 (VM with GPU passthrough, Q35 machine and 16 vcpus)
Product: Red Hat Enterprise Virtualization Manager Reporter: Chetan Nagarkar <cnagarka>
Component: ovirt-engine Assignee: Milan Zamazal <mzamazal>
Status: CLOSED ERRATA QA Contact: Nisim Simsolo <nsimsolo>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.4.10 CC: ahadas, apinnick, bugs, cnagarka, ctomasko, mavital, michal.skrivanek, mzamazal, nsimsolo, pkubica, yanghliu
Target Milestone: ovirt-4.5.1 Keywords: ZStream
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-engine-4.5.1 Doc Type: Bug Fix
Doc Text:
Previously, VMs with one or more VFIO devices, Q35 chipset, and maximum number of vCPUs >= 256 might fail to start because of a memory allocation error reported by the QEMU guest agent. This error is fixed.
Story Points: ---
Clone Of: 2048429 Environment:
Last Closed: 2022-07-14 12:54:31 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2048429, 2050175    
Bug Blocks:    

Comment 3 Milan Zamazal 2022-05-18 15:14:28 UTC
Besides the unnecessarily high number of max vCPUs, which is already fixed in 4.5.0, a fix for the VFIO problem is posted for review.

Comment 8 Arik 2022-05-19 15:47:16 UTC
Milan, please add some documentation and move to MODIFIED

Comment 11 Nisim Simsolo 2022-06-14 08:13:28 UTC
Verified:
ovirt-engine-4.5.1.1-0.14.el8ev
vdsm-4.50.1.2-1.el8ev.x86_64
qemu-kvm-6.2.0-11.module+el8.6.0+15489+bc23efef.1.x86_64
libvirt-daemon-8.0.0-5.2.module+el8.6.0+15256+3a0914fe.x86_64

Verification scenario:
1. Create VM with: Q35 UEFI, 16GB memory size and 16 CPUs (s:c:th = 1:8:2)
(if required, use hook from bug attachments)
2. Add VFIO and passthrough NICs to VM host devices.
3. Run VM.
Verify VM is running with host devices, for example:
09:00.0 VGA compatible controller: NVIDIA Corporation GK104GL [Quadro K4200] (rev a1)
06:00.0 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1) 
07:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
08:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
Observe vdsm.log, libvirt.log, engine.log and verify there are no errors.
Observe dmesg and verify there's no message like vfio_pin_pages_remote: RLIMIT_MEMLOCK (398274330624) exceeded (see https://bugzilla.redhat.com/show_bug.cgi?id=2048429#c13).
4. Repeat step 3, this time change CPUs to 32 (1:16:2) and run VM.
5. Repeat step 3, this time change CPUs to 64 (1:32:2) and run VM.
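
The log checks in the scenario above could be scripted roughly as follows. This is only a sketch, not part of the actual verification: the helper names vcpu_count and find_vfio_errors are hypothetical, and the two patterns are taken from the messages quoted in this bug (the summary and the dmesg line in step 3). It assumes the log or dmesg output has already been read into a string.

```python
import re

# Patterns copied from the messages quoted in this bug report.
# The helper names below are hypothetical, chosen for illustration only.
MEMLOCK_RE = re.compile(
    r"vfio_pin_pages_remote: RLIMIT_MEMLOCK \((\d+)\) exceeded"
)
MAP_DMA_RE = re.compile(r"VFIO_MAP_DMA failed: Cannot allocate memory")


def vcpu_count(sockets, cores, threads):
    """Return the vCPU count for a sockets:cores:threads topology."""
    return sockets * cores * threads


def find_vfio_errors(log_text):
    """Return the lines of log_text that match either VFIO error pattern."""
    hits = []
    for line in log_text.splitlines():
        if MEMLOCK_RE.search(line) or MAP_DMA_RE.search(line):
            hits.append(line.strip())
    return hits


# Topologies used in steps 1, 4 and 5 of the scenario.
for topology in [(1, 8, 2), (1, 16, 2), (1, 32, 2)]:
    print(topology, "->", vcpu_count(*topology), "vCPUs")
```

An empty list from find_vfio_errors for the text of dmesg and the listed logs would correspond to the "no errors" expectation in step 3, as far as these two messages are concerned; it does not check for other errors.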

Comment 16 errata-xmlrpc 2022-07-14 12:54:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: RHV Manager (ovirt-engine) [ovirt-4.5.1] security, bug fix and update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5555

Comment 17 meital avital 2022-08-07 10:22:44 UTC
Due to QE capacity, we are not going to cover this issue in our automation.