Bug 1912093 - Failed to hotplug 2 PFs into a vm which has an iommu device
Summary: Failed to hotplug 2 PFs into a vm which has an iommu device
Keywords:
Status: CLOSED DUPLICATE of bug 1619734
Alias: None
Product: Red Hat Enterprise Linux Advanced Virtualization
Classification: Red Hat
Component: qemu-kvm
Version: 8.4
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
Target Release: 8.4
Assignee: Amnon Ilan
QA Contact: Yanghang Liu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2021-01-03 11:38 UTC by Yanghang Liu
Modified: 2021-01-26 02:08 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-01-26 00:51:23 UTC
Type: ---
Target Upstream Version:
Embargoed:



Description Yanghang Liu 2021-01-03 11:38:58 UTC
Description of problem:
Failed to hotplug multiple PFs into a vm which has an iommu device

Version-Release number of selected component (if applicable):
host:
qemu-kvm-5.2.0-2.module+el8.4.0+9186+ec44380f.x86_64
4.18.0-268.el8.x86_64 or 4.18.0-267.el8.dt2.x86_64
libvirt-6.10.0-1.module+el8.4.0+8898+a84e86e1.x86_64
guest:
4.18.0-268.el8.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Use the following domain xml to define and start a vm with an iommu device

<domain type='kvm'>
  <name>RHEL84</name>
  <uuid>06cd66e6-4b38-12eb-b94f-a0369fc7bbea</uuid>
  <memory unit='KiB'>8388608</memory>
  <currentMemory unit='KiB'>8388608</currentMemory>
  <vcpu placement='static'>6</vcpu>
  <os>
    <type arch='x86_64' machine='pc-q35-rhel8.3.0'>hvm</type>
    <boot dev='hd'/>
  </os>
  <features>
    <acpi/>
    <apic/>
    <ioapic driver='qemu'/>
  </features>
  <cpu mode='host-model' check='partial'/>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <emulator>/usr/libexec/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='threads'/>
      <source file='/home/images/RHEL84.qcow2'/>
      <target dev='vda' bus='virtio'/>
      <address type='pci' domain='0x0000' bus='0x02' slot='0x00' function='0x0'/>
    </disk>
    <controller type='pci' index='0' model='pcie-root'/>
    <controller type='pci' index='1' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='1' port='0x10'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0' multifunction='on'/>
    </controller>
    <controller type='pci' index='2' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='2' port='0x11'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x1'/>
    </controller>
    <controller type='pci' index='3' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='3' port='0x12'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x2'/>
    </controller>
    <controller type='pci' index='4' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='4' port='0x13'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x3'/>
    </controller>
    <controller type='pci' index='5' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='5' port='0x14'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x4'/>
    </controller>
    <controller type='pci' index='6' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='6' port='0x15'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x5'/>
    </controller>
    <controller type='pci' index='7' model='pcie-root-port'>
      <model name='pcie-root-port'/>
      <target chassis='7' port='0x16'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x6'/>
    </controller>
    <controller type='sata' index='0'>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x1f' function='0x2'/>
    </controller>
    <interface type='bridge'>
      <mac address='88:66:da:5f:dd:01'/>
      <source bridge='switch'/>
      <model type='virtio'/>
      <address type='pci' domain='0x0000' bus='0x01' slot='0x00' function='0x0'/>
    </interface>
    <graphics type='vnc' port='5900' autoport='no' listen='0.0.0.0'>
      <listen type='address' address='0.0.0.0'/>
    </graphics>
    <iommu model='intel'>
      <driver intremap='on' caching_mode='on' iotlb='on'/>
    </iommu>
  </devices>
</domain>


2. Prepare the xml for the two PFs:

PF1 xml:
# cat xxv710_0000\:82\:00.0.xml 
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x82' slot='0x00' function='0x0'/>
      </source>
      <address bus="0x5" domain="0x0000" function="0x0" slot="0x00" type="pci" />
    </hostdev>
PF2 xml:
# cat xxv710_0000\:82\:00.1.xml 
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x82' slot='0x00' function='0x1'/>
      </source>
      <address bus="0x6" domain="0x0000" function="0x0" slot="0x00" type="pci" />
    </hostdev>


3. Hotplug the two XXV710 PFs into this vm

3.1 Hotplug the 1st XXV710 PF

# virsh attach-device RHEL84 xxv710_0000\:82\:00.0.xml 
Device attached successfully

related qmp:
 > {"execute":"device_add","arguments":{"driver":"vfio-pci","host":"0000:82:00.0","id":"hostdev0","bus":"pci.5","addr":"0x0"},"id":"libvirt-373"}
 <  {"return": {}, "id": "libvirt-373"}

3.2 Hotplug the 2nd XXV710 PF

# virsh attach-device bug_test xxv710_0000\:82\:00.1.xml 
error: Failed to attach device from xxv710_0000:82:00.1.xml
error: internal error: unable to execute QEMU command 'device_add': vfio 0000:82:00.1: failed to setup container for group 69: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x556147e8bde0, 0x100000, 0x7ff00000, 0x7fc033f00000) = -12 (Cannot allocate memory)

related qmp:
> {"execute":"device_add","arguments":{"driver":"vfio-pci","host":"0000:82:00.1","id":"hostdev1","bus":"pci.6","addr":"0x0"},"id":"libvirt-375"}
<  {"id": "libvirt-375", "error": {"class": "GenericError", "desc": "vfio 0000:82:00.1: failed to setup container for group 69: memory listener initialization failed: Region pc.ram: vfio_dma_map(0x556147e8bde0, 0x100000, 0x7ff00000, 0x7fc033f00000) = -12 (Cannot allocate memory)"}}



Actual results:
The second PF cannot be hot-plugged into the vm.

Expected results:
The two PFs can be hot-plugged into the vm.

Additional info:
(1) The XXV710 NIC info:
# virsh nodedev-dumpxml pci_0000_82_00_0 
<device>
  <name>pci_0000_82_00_0</name>
  <path>/sys/devices/pci0000:80/0000:80:02.0/0000:82:00.0</path>
  <parent>pci_0000_80_02_0</parent>
  <driver>
    <name>i40e</name>
  </driver>
  <capability type='pci'>
    <class>0x020000</class>
    <domain>0</domain>
    <bus>130</bus>
    <slot>0</slot>
    <function>0</function>
    <product id='0x158b'>Ethernet Controller XXV710 for 25GbE SFP28</product>
    <vendor id='0x8086'>Intel Corporation</vendor>
    <capability type='virt_functions' maxCount='64'/>
    <iommuGroup number='68'>
      <address domain='0x0000' bus='0x82' slot='0x00' function='0x0'/>
    </iommuGroup>
    <numa node='1'/>
    <pci-express>
      <link validity='cap' port='0' speed='8' width='8'/>
      <link validity='sta' speed='8' width='8'/>
    </pci-express>
  </capability>
</device>

(2) The related device xml and qemu command line for the iommu device:

domain xml:
    <ioapic driver='qemu'/>

    <iommu model='intel'>
      <driver intremap='on' caching_mode='on' iotlb='on'/>
    </iommu>

qemu cmd line:
 -machine pc-q35-rhel8.3.0,kernel_irqchip=split 
 -device intel-iommu,intremap=on,caching-mode=on,device-iotlb=on 

(3) When there is no IOMMU device in the vm, these two XXV710 PFs can be successfully hot-plugged into the vm.

 > {"execute":"device_add","arguments":{"driver":"vfio-pci","host":"0000:82:00.0","id":"hostdev0","bus":"pci.5","addr":"0x0"},"id":"libvirt-373"}
 <  {"return": {}, "id": "libvirt-373"}
 > {"execute":"device_add","arguments":{"driver":"vfio-pci","host":"0000:82:00.1","id":"hostdev1","bus":"pci.6","addr":"0x0"},"id":"libvirt-375"}
 <  {"return": {}, "id": "libvirt-375"}

Comment 1 John Ferlan 2021-01-11 19:27:05 UTC
Assigned to Amnon for initial triage per the BZ process, given the age of the bug and that it was created or assigned to virt-maint without triage.

Comment 3 Laine Stump 2021-01-26 00:51:23 UTC
This is an example of the well-known (to the right people :-)) Bug 1619734. Basically, when you have 2 vfio devices to assign to a guest, you either need to double the locked memory limit or place them on slots of a pcie-to-pci-bridge (i.e. a conventional PCI controller). The other BZ explains why.
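
For reference, a minimal sketch of the two workarounds in libvirt domain XML (not taken from this bug; the hard_limit value and the controller index / bus / slot numbers are illustrative assumptions for the 8 GiB guest shown above):

Workaround A - raise the locked memory limit. When <hard_limit> is set in <memtune>, libvirt also uses that value as the locked-memory limit for the QEMU process:

    <memtune>
      <!-- assumed value: 18 GiB, i.e. roughly 2 x (8 GiB guest RAM + 1 GiB) -->
      <hard_limit unit='KiB'>18874368</hard_limit>
    </memtune>

Workaround B - plug both PFs into a conventional PCI bus provided by a pcie-to-pci-bridge controller instead of two separate pcie-root-ports (libvirt will attach the bridge to a root port automatically):

    <controller type='pci' index='8' model='pcie-to-pci-bridge'/>

    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x82' slot='0x00' function='0x0'/>
      </source>
      <!-- bus 0x08 is the pcie-to-pci-bridge above; slot numbers are illustrative -->
      <address type='pci' domain='0x0000' bus='0x08' slot='0x01' function='0x0'/>
    </hostdev>
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <driver name='vfio'/>
      <source>
        <address domain='0x0000' bus='0x82' slot='0x00' function='0x1'/>
      </source>
      <address type='pci' domain='0x0000' bus='0x08' slot='0x02' function='0x0'/>
    </hostdev>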

If this functionality is required for something, the other BZ should be re-opened (it was recently marked CLOSED/DEFERRED).

*** This bug has been marked as a duplicate of bug 1619734 ***

