Bug 2023313

Summary: VM with a PCI host device and max vCPUs >= 256 fails to start
Product: [oVirt] ovirt-engine Reporter: Milan Zamazal <mzamazal>
Component: BLL.Virt Assignee: Milan Zamazal <mzamazal>
Status: CLOSED CURRENTRELEASE QA Contact: Nisim Simsolo <nsimsolo>
Severity: high Docs Contact:
Priority: unspecified    
Version: 4.4.9.5 CC: ahadas, bugs, gilboad, michal.skrivanek, mperina, nsimsolo
Target Milestone: ovirt-4.5.0 Flags: ahadas: ovirt-4.5+
Target Release: 4.5.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovirt-engine-4.5.0 Doc Type: Bug Fix
Doc Text:
Previously, certain CPU topologies would cause virtual machines with PCI host devices to fail. The current release fixes this issue.
Story Points: ---
Clone Of:
: 2025872 (view as bug list) Environment:
Last Closed: 2022-04-20 06:33:59 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: Virt RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2025872    

Description Milan Zamazal 2021-11-15 12:28:33 UTC
Description of problem:

When a VM contains a PCI host device and its maximum number of vCPUs is at least 256, the VM fails to start.

Version-Release number of selected component (if applicable):

4.4.9

How reproducible:

100%

Steps to Reproduce:
1. Add a PCI host device (for example a sound card) to a VM with many vCPUs (e.g. a Q35 VM with 1 socket, 16 cores, 1 thread CPU topology).
2. Try to start the VM.

Actual results:

The VM fails to start with the following error:

  qemu-kvm: We need to set caching-mode=on for intel-iommu to enable device assignment with IOMMU protection.

Expected results:

The VM starts.

Additional info:

This is basically the same as BZ 2013752, which handled only VMs with vGPU; the same problem also applies to VMs with any vfio-pci device.

Comment 3 Arik 2021-11-23 09:19:52 UTC
oops, targeted to 4.4.10

Comment 4 Milan Zamazal 2021-11-29 15:18:51 UTC
*** Bug 2026893 has been marked as a duplicate of this bug. ***

Comment 5 Nisim Simsolo 2022-04-04 09:43:55 UTC
Verified:
ovirt-engine-4.5.0-0.237.el8ev
vdsm-4.50.0.10-1.el8ev.x86_64
qemu-kvm-6.2.0-10.module+el8.6.0+14540+5dcf03db.x86_64
libvirt-daemon-8.0.0-5.module+el8.6.0+14480+c0a3aa0f.x86_64

Verification scenario:
1. Create VM with mdev and VFIO (USB passthrough in this case) with 16 total virtual CPUs (1 socket and 16 cores per socket)
2. Run VM and verify VM is running.
3. Observe the VM XML and verify that caching_mode="on" is set:
        <iommu model="intel">
            <driver caching_mode="on" eim="on" intremap="on" />
        </iommu>
4. Repeat steps 1-3, this time using 24 CPUs and 32 CPUs.
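
The check in step 3 could be scripted roughly as follows. This is only a minimal sketch, not part of the fix: it assumes the domain XML has already been obtained as a string (for example from "virsh dumpxml"), and the function name iommu_caching_mode_enabled and the SAMPLE document are hypothetical, made up for illustration.

```python
import xml.etree.ElementTree as ET


def iommu_caching_mode_enabled(domain_xml: str) -> bool:
    """Return True if an intel <iommu> element has <driver caching_mode="on">.

    Hypothetical helper for illustration; it inspects only the XML text
    it is given and does not talk to libvirt.
    """
    root = ET.fromstring(domain_xml)
    for iommu in root.iter("iommu"):
        if iommu.get("model") != "intel":
            continue
        driver = iommu.find("driver")
        if driver is not None and driver.get("caching_mode") == "on":
            return True
    return False


# Assumed, heavily trimmed domain XML modelled on the snippet in step 3.
SAMPLE = """
<domain type="kvm">
  <devices>
    <iommu model="intel">
      <driver caching_mode="on" eim="on" intremap="on" />
    </iommu>
  </devices>
</domain>
"""

if __name__ == "__main__":
    print(iommu_caching_mode_enabled(SAMPLE))
```

The helper returns False both when there is no intel IOMMU element at all and when the element is present without caching_mode="on", so a False result on an affected VM still needs a manual look at the XML.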

Comment 6 Sandro Bonazzola 2022-04-20 06:33:59 UTC
This bugzilla is included in oVirt 4.5.0 release, published on April 20th 2022.

Since the problem described in this bug report should be resolved in oVirt 4.5.0 release, it has been closed with a resolution of CURRENT RELEASE.

If the solution does not work for you, please open a new bug report.