Bug 2081241 - VFIO_MAP_DMA failed: Cannot allocate memory -12 (VM with GPU passthrough, Q35 machine and 16 vcpus)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Virtualization Manager
Classification: Red Hat
Component: ovirt-engine
Version: 4.4.10
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ovirt-4.5.1
Target Release: ---
Assignee: Milan Zamazal
QA Contact: Nisim Simsolo
URL:
Whiteboard:
Depends On: 2048429 2050175
Blocks:
Reported: 2022-05-03 07:54 UTC by Chetan Nagarkar
Modified: 2022-08-07 10:22 UTC (History)

Fixed In Version: ovirt-engine-4.5.1
Doc Type: Bug Fix
Doc Text:
Previously, VMs with one or more VFIO devices, Q35 chipset, and maximum number of vCPUs >= 256 might fail to start because of a memory allocation error reported by the QEMU guest agent. This error is fixed.
Clone Of: 2048429
Environment:
Last Closed: 2022-07-14 12:54:31 UTC
oVirt Team: Virt
Target Upstream Version:
Embargoed:

Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | oVirt ovirt-engine pull 382 | 0 | None | open | core: Add memtune hard_limit for q35 VMs with many CPUs | 2022-05-18 15:14:27 UTC
Red Hat Issue Tracker | RHV-45913 | 0 | None | None | None | 2022-05-03 07:55:55 UTC
Red Hat Product Errata | RHSA-2022:5555 | 0 | None | None | None | 2022-07-14 12:55:15 UTC

Comment 3 Milan Zamazal 2022-05-18 15:14:28 UTC
Besides the unnecessarily high number of max vCPUs, which is already fixed in 4.5.0, a fix for the VFIO problem is posted for review.
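
For illustration only, here is a minimal sketch of what a memtune hard_limit entry looks like in libvirt domain XML, which is the setting named in the linked pull request title. The helper name memtune_element and the example value (16 GiB of guest memory plus 1 GiB of headroom) are assumptions made for this sketch; the report does not state how ovirt-engine actually computes the limit.

```python
import xml.etree.ElementTree as ET

KIB_PER_GIB = 1024 * 1024


def memtune_element(hard_limit_kib: int) -> ET.Element:
    """Build a libvirt-style <memtune> element with a hard_limit in KiB.

    Hypothetical helper: it only shows the shape of the XML, not the
    value that the actual fix computes.
    """
    memtune = ET.Element("memtune")
    hard_limit = ET.SubElement(memtune, "hard_limit", unit="KiB")
    hard_limit.text = str(hard_limit_kib)
    return memtune


if __name__ == "__main__":
    # Assumed example value: 16 GiB guest memory plus 1 GiB headroom.
    limit_kib = (16 + 1) * KIB_PER_GIB
    print(ET.tostring(memtune_element(limit_kib), encoding="unicode"))
```

The element would sit directly under the domain element of the VM definition; the right limit depends on guest memory and on the devices attached.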

Comment 8 Arik 2022-05-19 15:47:16 UTC
Milan, please add some documentation and move to MODIFIED

Comment 11 Nisim Simsolo 2022-06-14 08:13:28 UTC
Verified:
ovirt-engine-4.5.1.1-0.14.el8ev
vdsm-4.50.1.2-1.el8ev.x86_64
qemu-kvm-6.2.0-11.module+el8.6.0+15489+bc23efef.1.x86_64
libvirt-daemon-8.0.0-5.2.module+el8.6.0+15256+3a0914fe.x86_64

Verification scenario:
1. Create a VM with Q35 UEFI, 16 GB memory and 16 CPUs (s:c:th = 1:8:2)
(if required, use the hook from the bug attachments).
2. Add VFIO and passthrough NICs to the VM host devices.
3. Run VM.
Verify VM is running with host devices, for example:
09:00.0 VGA compatible controller: NVIDIA Corporation GK104GL [Quadro K4200] (rev a1)
06:00.0 Audio device: NVIDIA Corporation GK104 HDMI Audio Controller (rev a1) 
07:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
08:00.0 Ethernet controller: Intel Corporation 82576 Gigabit Network Connection (rev 01)
Check vdsm.log, libvirt.log, and engine.log and verify there are no errors.
Check dmesg and verify there is no message like vfio_pin_pages_remote: RLIMIT_MEMLOCK (398274330624) exceeded (see https://bugzilla.redhat.com/show_bug.cgi?id=2048429#c13).
4. Repeat step 3, this time with the CPUs changed to 32 (1:16:2).
5. Repeat step 3, this time with the CPUs changed to 64 (1:32:2).
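
The checks in the scenario above can be sketched in a few lines of Python. This is only a rough aid, not part of the verification that was run: the function names are hypothetical, and the pattern assumes the kernel message has exactly the form quoted in step 3.

```python
import re
from typing import List

# Matches the kernel message quoted in step 3; the number is the limit in bytes.
MEMLOCK_RE = re.compile(
    r"vfio_pin_pages_remote: RLIMIT_MEMLOCK \((\d+)\) exceeded"
)


def vcpu_count(sockets: int, cores: int, threads: int) -> int:
    """Total vCPUs for a sockets:cores:threads topology."""
    return sockets * cores * threads


def memlock_failures(dmesg_text: str) -> List[int]:
    """Return the RLIMIT_MEMLOCK values (bytes) found in dmesg output."""
    return [int(m.group(1)) for m in MEMLOCK_RE.finditer(dmesg_text)]


if __name__ == "__main__":
    # Topologies from steps 1, 4 and 5.
    for topology in [(1, 8, 2), (1, 16, 2), (1, 32, 2)]:
        print(topology, "->", vcpu_count(*topology), "vCPUs")

    # Sample line taken from step 3; a passing run should yield an empty list.
    sample = "vfio_pin_pages_remote: RLIMIT_MEMLOCK (398274330624) exceeded"
    print(memlock_failures(sample))
```

On a host, the text passed to memlock_failures would be the captured output of dmesg; an empty result corresponds to the expected outcome of step 3.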

Comment 16 errata-xmlrpc 2022-07-14 12:54:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: RHV Manager (ovirt-engine) [ovirt-4.5.1] security, bug fix and update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5555

Comment 17 meital avital 2022-08-07 10:22:44 UTC
Due to QE capacity, we are not going to cover this issue in our automation.