Bug 2156876 - [virtual network][rhel7.9_guest] qemu-kvm: vhost vring error in virtqueue 1: Invalid argument (22)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux 9
Classification: Red Hat
Component: qemu-kvm
Version: 9.2
Hardware: Unspecified
OS: Linux
Priority: high
Severity: high
Target Milestone: rc
: ---
Assignee: Laurent Vivier
QA Contact: Lei Yang
URL:
Whiteboard:
Depends On:
Blocks: 2179031
 
Reported: 2022-12-29 06:04 UTC by Lei Yang
Modified: 2023-05-09 07:56 UTC (History)
14 users (show)

Fixed In Version: qemu-kvm-7.2.0-14.el9_2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
: 2179031 (view as bug list)
Environment:
Last Closed: 2023-05-09 07:23:43 UTC
Type: ---
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Gitlab redhat/centos-stream/src qemu-kvm merge_requests 157 0 None opened Draft: intel-iommu: fail DEVIOTLB_UNMAP without dt mode 2023-03-09 09:27:21 UTC
Red Hat Issue Tracker RHELPLAN-143361 0 None None None 2022-12-29 06:08:02 UTC
Red Hat Product Errata RHSA-2023:2162 0 None None None 2023-05-09 07:24:35 UTC

Description Lei Yang 2022-12-29 06:04:09 UTC
Description of problem:
After adding intel_iommu=on to the guest kernel command line and rebooting the guest, QEMU prints: qemu-kvm: vhost vring error in virtqueue 1: Invalid argument (22). The guest reboots successfully, but it cannot ping the host.

Version-Release number of selected component (if applicable):
kernel-5.14.0-226.el9.x86_64
qemu-kvm-7.2.0-2.el9.x86_64
edk2-ovmf-20221207gitfff6d81270b5-1.el9.noarch

How reproducible:
100%

Steps to Reproduce:
1. Boot a guest with an intel-iommu device:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-blockdev node-name=file_ovmf_code,driver=file,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
-blockdev node-name=file_ovmf_vars,driver=file,filename=/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel79-64-virtio-scsi_avocado-vt-vm1_qcow2_filesystem_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
-machine q35,kernel-irqchip=split,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0  \
-nodefaults \
-device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on \
-device VGA,bus=pcie.0,addr=0x2 \
-m 62464 \
-object '{"qom-type": "memory-backend-ram", "size": 65498251264, "id": "mem-machine_mem"}'  \
-smp 28,maxcpus=28,cores=14,threads=1,dies=1,sockets=2  \
-cpu 'Icelake-Server',ds=on,ss=on,dtes64=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,rdpid=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,mpx=off,intel-pt=off,kvm_pv_unhalt=on \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' \
-blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel79-64-virtio-scsi_avocado-vt-vm1.qcow2", "cache": {"direct": true, "no-flush": false}}' \
-blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
-device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device virtio-net-pci,mac=9a:2e:9f:99:4e:8c,disable-legacy=on,disable-modern=off,iommu_platform=on,ats=on,id=idtPPGyV,netdev=idtAVJJ5,bus=pcie-root-port-3,addr=0x0  \
-netdev tap,id=idtAVJJ5,vhost=on,vhostforce=on \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5 \
-monitor stdio

2. Add intel_iommu=on to the guest kernel command line, then reboot the guest.
# grubby --update-kernel=`grubby --default-kernel` --args="intel_iommu=on"
# reboot

3. Check the QEMU output:
qemu-kvm: vhost vring error in virtqueue 1: Invalid argument (22)

Actual results:
QEMU prints the error above, and the guest cannot ping the host.

Expected results:
No error is printed, and the guest can ping the host.

Additional info:
1. Only the RHEL 7.9 guest reproduces this problem; RHEL 8.8 and RHEL 9.2 guests pass the test.
2. A different bug, https://bugzilla.redhat.com/show_bug.cgi?id=2039856#c4, is hit when testing on qemu-kvm-7.0.0-3.el9.x86_64. From the QE perspective the current error is triggered for the first time, so I added the keyword "Regression". Please correct me if I'm wrong.

Comment 1 Laurent Vivier 2023-01-02 10:09:09 UTC
Lei,

Which version of QEMU works well with RHEL 7.9 guest kernel?
Do you have any error in the host kernel logs?
Do you have any error in the guest kernel logs?

It looks like a guest kernel bug rather than a QEMU one.

The error message has been added by ae50ae0b91bb ("vhost: setup error eventfd and dump errors") in QEMU 7.1, so I think the problem is the same as for BZ 2039856 except QEMU is able to better manage the problem.

Comment 2 Lei Yang 2023-01-03 02:35:12 UTC
(In reply to Laurent Vivier from comment #1)
> Lei,
> 
Hi Laurent

> Which version of QEMU works well with RHEL 7.9 guest kernel?
qemu-kvm-7.1.0-7.el9.x86_64 and qemu-kvm-7.2.0-2.el9.x86_64 both hit the current bug.

qemu-kvm-6.2.0-2.el9.x86_64 and qemu-kvm-7.0.0-3.el9.x86_64 both hit Bug 2039856.


> Do you have any error in the host kernel logs?
> Do you have any error in the guest kernel logs?

There are no error messages in the host kernel logs and guest kernel logs.

> 
> It looks like a guest kernel bug rather than a QEMU one.
> 
> The error message has been added by ae50ae0b91bb ("vhost: setup error
> eventfd and dump errors") in QEMU 7.1, so I think the problem is the same as
> for BZ 2039856 except QEMU is able to better manage the problem.

Thanks for your update. So can QE close Bug 2039856 as "CURRENTRELEASE" based on this?

Thanks
Lei

Comment 3 Yanghang Liu 2023-01-05 08:12:37 UTC
Hi Laurent,

I have opened a similar bug before, but the domain I used is the AMD SEV RHEL92:
 
Bug 2153376 - [SEV][virtio-net-pci] vhost vring error in virtqueue 1: Invalid argument (22)

Comment 4 Laurent Vivier 2023-01-24 13:12:43 UTC
Reduced command line to reproduce the problem:

QEMU=/usr/libexec/qemu-kvm
IMAGE=/var/lib/libvirt/images/rhel7.9.qcow2

$QEMU -m 2G -M q35,kernel-irqchip=split -enable-kvm -nodefaults -nographic -smp 2 \
      -cpu host \
      -blockdev node-name=disk1,driver=qcow2,file.driver=file,file.filename=$IMAGE \
      -device virtio-blk,drive=disk1 \
      -netdev tap,id=netdev0,vhost=on,vhostforce=on \
      -device virtio-net,netdev=netdev0,disable-legacy=on,disable-modern=off,iommu_platform=on,ats=on \
      -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on \
      -serial mon:stdio

Then in guest:

# grubby --update-kernel=`grubby --default-kernel` --args="intel_iommu=on"
# reboot
and then use eth0 (for instance "dhclient eth0").

On VM exit, I also get the following error:

KVM: injection failed, MSI lost (Operation not permitted)
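
Before retrying the reproducer it can help to confirm that the guest really booted with the new argument. The following is a minimal sketch, not part of the original report: `cmdline_has_arg()` is a hypothetical helper that looks for a whole, space-separated token in a string shaped like the contents of /proc/cmdline.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `arg` appears as a whole
 * space-separated token in a /proc/cmdline-style string.
 */
bool cmdline_has_arg(const char *cmdline, const char *arg)
{
    size_t arg_len;
    const char *p;

    assert(cmdline != NULL && arg != NULL);

    arg_len = strlen(arg);
    if (arg_len == 0) {
        return false;
    }

    p = cmdline;
    while ((p = strstr(p, arg)) != NULL) {
        /* The match must start at the beginning or right after a space. */
        bool starts_token = (p == cmdline) || (p[-1] == ' ');
        /* The match must end at a space, a newline or the end of string. */
        char end = p[arg_len];
        bool ends_token = (end == '\0') || (end == ' ') || (end == '\n');

        if (starts_token && ends_token) {
            return true;
        }
        p += 1; /* keep searching after a partial match */
    }
    return false;
}
```

In the guest, the string would typically come from reading /proc/cmdline (for instance with fgets()) and passing the buffer to the helper together with "intel_iommu=on".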

Comment 5 Laurent Vivier 2023-01-24 13:31:00 UTC
(In reply to Yanghang Liu from comment #3)
> Hi Laurent,
> 
> I have opened a similar bug before, but the domain I used is the AMD SEV
> RHEL92:
>  
> Bug 2153376 - [SEV][virtio-net-pci] vhost vring error in virtqueue 1:
> Invalid argument (22)

I think the problem is related to vhost and iommu, your bug could be closed as DUPLICATE.
You can keep it open if you want to check the fix when it will be available.

Comment 6 Laurent Vivier 2023-01-25 19:47:00 UTC
With some debug in host kernel I can see:

  Unexpected header len for TX: 0 expected 0

It comes from get_tx_bufs():

        /* Sanity check */
        *len = init_iov_iter(vq, &msg->msg_iter, nvq->vhost_hlen, *out);
        if (*len == 0) {
                vq_err(vq, "Unexpected header len for TX: %zd expected %zd\n",
                        *len, nvq->vhost_hlen);
                return -EFAULT;
        }

So I guess that using the IOMMU corrupts the content of the IOV.
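
For readers following along, the sanity check quoted above can be modelled in userspace. This is a simplified sketch and not the kernel code: `payload_len_after_header()` and `check_tx_buf()` are hypothetical names, and the header skip is assumed to behave like advancing an iterator past `hdr_size` bytes. It shows how buffers that add up to zero bytes trigger the "0 expected 0" failure path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/uio.h>

/* Total number of bytes described by an iovec array. */
size_t iov_total_len(const struct iovec *iov, int count)
{
    size_t len = 0;

    for (int i = 0; i < count; i++) {
        len += iov[i].iov_len;
    }
    return len;
}

/*
 * Simplified stand-in for the header skip: drop hdr_size bytes of
 * header and return how many payload bytes remain (never negative).
 */
size_t payload_len_after_header(const struct iovec *iov, int count,
                                size_t hdr_size)
{
    size_t len;

    assert(count >= 0);
    len = iov_total_len(iov, count);
    return len > hdr_size ? len - hdr_size : 0;
}

/* Mirrors the sanity check: 0 on success, -EFAULT when nothing is left. */
int check_tx_buf(const struct iovec *iov, int count, size_t hdr_size)
{
    if (payload_len_after_header(iov, count, hdr_size) == 0) {
        return -EFAULT;
    }
    return 0;
}
```

With a header length of 0, the check can only fail when the buffers themselves are empty, which is consistent with the guess that the IOV content is wrong by the time vhost reads it.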

Comment 7 Laurent Vivier 2023-01-30 07:50:13 UTC
(In reply to Lei Yang from comment #2)
> (In reply to Laurent Vivier from comment #1)
...
> > 
> > It looks like a guest kernel bug rather than a QEMU one.
> > 
> > The error message has been added by ae50ae0b91bb ("vhost: setup error
> > eventfd and dump errors") in QEMU 7.1, so I think the problem is the same as
> > for BZ 2039856 except QEMU is able to better manage the problem.
> 
> Thanks for your update, So can QE close Bug 2039856 to "CURRENTRELEASE"
> based on it ?

Yes

Comment 8 Laurent Vivier 2023-01-30 08:11:29 UTC
Lei,

could you check whether a RHEL 9.2 guest can actually ping the outside world?
I can't reproduce the bug with the RHEL 9.2 kernel, but I'm not able to use networking.

Thanks

Comment 9 Lei Yang 2023-01-30 09:35:09 UTC
(In reply to Laurent Vivier from comment #8)
> Lei,
> 
Hi Laurent

> could you check with RHEL 9.2 guest you can actually ping outside world?

Yes, it can ping outside. To confirm this, I tested it on the latest versions:
kernel-5.14.0-247.el9.x86_64
qemu-kvm-7.2.0-5.el9.x86_64
edk2-ovmf-20221207gitfff6d81270b5-2.el9.noarch

qemu CLI:
/usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1'  \
-sandbox on  \
-blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' \
-blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \
-blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel920-64-virtio-scsi_qcow2_filesystem_VARS.fd", "auto-read-only": true, "discard": "unmap"}' \
-blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' \
-machine q35,kernel-irqchip=split,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \
-device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}'  \
-nodefaults \
-device '{"intremap": "on", "device-iotlb": true, "caching-mode": true, "driver": "intel-iommu"}' \
-device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' \
-m 62464 \
-object '{"qom-type": "memory-backend-ram", "size": 65498251264, "id": "mem-machine_mem"}'  \
-smp 28,maxcpus=28,cores=14,threads=1,dies=1,sockets=2  \
-cpu 'Icelake-Server',ds=on,ss=on,dtes64=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,avx512ifma=on,sha-ni=on,rdpid=on,fsrm=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,mpx=off,intel-pt=off,kvm_pv_unhalt=on \
-device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \
-device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' \
-device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' \
-device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \
-device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' \
-blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel920-64-virtio-scsi.qcow2", "cache": {"direct": true, "no-flush": false}}' \
-blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
-device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
-device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \
-device '{"driver": "virtio-net-pci", "mac": "9a:ee:05:08:95:12", "disable-legacy": "on", "disable-modern": false, "iommu_platform": true, "ats": true, "id": "idxtsSJX", "netdev": "idlKiQsz", "bus": "pcie-root-port-3", "addr": "0x0"}'  \
-netdev tap,id=idlKiQsz,vhost=on,vhostforce=on  \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-chardev socket,id=char_vtpm_avocado-vt-vm1_tpm0,path=/root/avocado/data/avocado-vt/swtpm/avocado-vt-vm1_tpm0_swtpm.sock \
-tpmdev emulator,chardev=char_vtpm_avocado-vt-vm1_tpm0,id=emulator_vtpm_avocado-vt-vm1_tpm0 \
-device '{"id": "tpm-crb_vtpm_avocado-vt-vm1_tpm0", "tpmdev": "emulator_vtpm_avocado-vt-vm1_tpm0", "driver": "tpm-crb"}' \
-enable-kvm \
-device '{"id": "pcie_extra_root_port_0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x3", "chassis": 5}' \
-monitor stdio

Test inside guest:

[root@vm-212-133 ~]# uname -r
5.14.0-247.el9.x86_64
[root@vm-212-133 ~]# cat /proc/cmdline 
BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.0-247.el9.x86_64 root=/dev/mapper/rhel-root ro console=tty0 crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M resume=/dev/mapper/rhel-swap rd.lvm.lv=rhel/root rd.lvm.lv=rhel/swap net.ifnames=0 console=ttyS0,115200 intel_iommu=on
# ping fileshare.englab.nay.redhat.com -c 3
PING fileshare.englab.nay.redhat.com (10.73.60.73) 56(84) bytes of data.
64 bytes from 10.73.60.73 (10.73.60.73): icmp_seq=1 ttl=60 time=0.352 ms
64 bytes from 10.73.60.73 (10.73.60.73): icmp_seq=2 ttl=60 time=0.373 ms
64 bytes from 10.73.60.73 (10.73.60.73): icmp_seq=3 ttl=60 time=0.617 ms

--- fileshare.englab.nay.redhat.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2052ms
rtt min/avg/max/mdev = 0.352/0.447/0.617/0.120 ms

> I can't reproduce the bug with RHEL 9.2 kernel but I'm not able to use
> networking.
Maybe this is a problem on the host. Can it ping outside when you remove the iommu option from the kernel command line?
> 
> Thanks

Thanks
Lei

Comment 10 Lei Yang 2023-01-30 11:13:41 UTC
The RHEL 7.9 guest hits the same issue with:

kernel-5.14.0-247.el9.x86_64
qemu-kvm-7.2.0-5.el9.x86_64
edk2-ovmf-20221207gitfff6d81270b5-2.el9.noarch

Comment 11 Laurent Vivier 2023-02-01 20:08:14 UTC
The last good QEMU version with the RHEL 7.9 kernel is v5.2.

Bisected to

commit b68ba1ca57677acf870d5ab10579e6105c1f5338
Author: Eugenio Pérez <eperezma>
Date:   Mon Nov 16 17:55:04 2020 +0100

    memory: Add IOMMU_NOTIFIER_DEVIOTLB_UNMAP IOMMUTLBNotificationType
    
    This allows us to differentiate between regular IOMMU map/unmap events
    and DEVIOTLB unmap. Doing so, notifiers that only need device IOTLB
    invalidations will not receive regular IOMMU unmappings.
    
    Adapt intel and vhost to use it.
    
    Signed-off-by: Eugenio Pérez <eperezma>
    Reviewed-by: Peter Xu <peterx>
    Reviewed-by: Juan Quintela <quintela>
    Acked-by: Jason Wang <jasowang>
    Message-Id: <20201116165506.31315-4-eperezma>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>

 hw/i386/intel_iommu.c | 2 +-
 hw/virtio/vhost.c     | 2 +-
 include/exec/memory.h | 7 ++++++-

It seems there were some fixes for other IOMMUs:

commit 958ec334bca3fa9862289e4cfe31bf1019e55816
Author: Peter Xu <peterx>
Date:   Thu Feb 4 14:12:28 2021 -0500

    vhost: Unbreak SMMU and virtio-iommu on dev-iotlb support
    
    Previous work on dev-iotlb message broke vhost on either SMMU or virtio-iommu
    since dev-iotlb (or PCIe ATS) is not yet supported for those archs.
    
    An initial idea is that we can let IOMMU to export this information to vhost so
    that vhost would know whether the vIOMMU would support dev-iotlb, then vhost
    can conditionally register to dev-iotlb or the old iotlb way.  We can work
    based on some previous patch to introduce PCIIOMMUOps as Yi Liu proposed [1].
    
    However it's not as easy as I thought since vhost_iommu_region_add() does not
    have a PCIDevice context at all since it's completely a backend.  It seems
    non-trivial to pass over a PCI device to the backend during init.  E.g. when
    the IOMMU notifier registered hdev->vdev is still NULL.
    
    To make the fix smaller and easier, this patch goes the other way to leverage
    the flag_changed() hook of vIOMMUs so that SMMU and virtio-iommu can trap the
    dev-iotlb registration and fail it.  Then vhost could try the fallback solution
    as using UNMAP invalidation for it's translations.
    
    [1] https://lore.kernel.org/qemu-devel/1599735398-6829-4-git-send-email-yi.l.liu@intel.com/
    
    Reported-by: Eric Auger <eric.auger>
    Fixes: b68ba1ca57677acf870d5ab10579e6105c1f5338
    Reviewed-by: Eric Auger <eric.auger>
    Tested-by: Eric Auger <eric.auger>
    Signed-off-by: Peter Xu <peterx>
    Message-Id: <20210204191228.187550-1-peterx>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>

commit 1a8e22bd20c2586df0bc0fdce8d5a3b42fffb1ac
Author: Eric Auger <eric.auger>
Date:   Tue Feb 9 22:32:33 2021 +0100

    spapr_iommu: Fix vhost integration regression
    
    Previous work on dev-iotlb message broke spapr_iommu/vhost integration
    as it did for SMMU and virtio-iommu. The spapr_iommu currently
    only sends IOMMU_NOTIFIER_UNMAP notifications. Since commit
    958ec334bca3 ("vhost: Unbreak SMMU and virtio-iommu on dev-iotlb support"),
    VHOST first tries to register IOMMU_NOTIFIER_DEVIOTLB_UNMAP notifier
    and if it fails, falls back to legacy IOMMU_NOTIFIER_UNMAP. So
    spapr_iommu must fail on the IOMMU_NOTIFIER_DEVIOTLB_UNMAP
    registration.
    
    Reported-by: Peter Xu <peterx>
    Fixes: b68ba1ca5767 ("memory: Add IOMMU_NOTIFIER_DEVIOTLB_UNMAP IOMMUTLBNotificationType")
    Signed-off-by: Eric Auger <eric.auger>
    Message-Id: <20210209213233.40985-3-eric.auger>
    Acked-by: David Gibson <david.id.au>
    Acked-by: Jason Wang <jasowang>
    Reviewed-by: Michael S. Tsirkin <mst>
    Reviewed-by: Greg Kurz <groug>
    Signed-off-by: Alex Williamson <alex.williamson>

But these do not handle intel-iommu with an old guest kernel.

If I disable dev-iotlb with intel-iommu, I can confirm it works:

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 98a5c304a7d7..923974c2f9b1 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3180,6 +3180,11 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
     VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
     IntelIOMMUState *s = vtd_as->iommu_state;
 
+    if (new & IOMMU_NOTIFIER_DEVIOTLB_UNMAP) {
+        error_setg(errp, "intel-iommu does not support dev-iotlb yet");
+        return -EINVAL;
+    }
+
     /* TODO: add support for VFIO and vhost users */
     if (s->snoop_control) {
         error_setg_errno(errp, ENOTSUP,

Eugenio, Peter, Eric,

what is the kernel fix to backport to RHEL 7.9 to support IOMMU_NOTIFIER_DEVIOTLB_UNMAP with intel-iommu?

Comment 12 Laurent Vivier 2023-02-02 17:22:00 UTC
The easiest way to fix this BZ seems to be to allow disabling dev-iotlb support for old kernels.

I propose to add a property to intel_iommu, dev-iotlb, "on" by default, to conditionally disable IOMMU_NOTIFIER_DEVIOTLB_UNMAP, as is done for spapr_iommu, smmuv3 and virtio-iommu:

qemu-kvm ... -device intel-iommu,...,dev-iotlb=false

---------------------------------------------------------------------------------------------
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 98a5c304a7d7..fa3544d7ef5f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3180,6 +3180,12 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
     VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
     IntelIOMMUState *s = vtd_as->iommu_state;
 
+    if ((new & IOMMU_NOTIFIER_DEVIOTLB_UNMAP) &&
+        !s->dev_iotlb) {
+        error_setg(errp, "dev-iotlb has been disabled with intel-iommu");
+        return -EINVAL;
+    }
+
     /* TODO: add support for VFIO and vhost users */
     if (s->snoop_control) {
         error_setg_errno(errp, ENOTSUP,
@@ -3276,6 +3282,7 @@ static Property vtd_properties[] = {
     DEFINE_PROP_BOOL("x-pasid-mode", IntelIOMMUState, pasid, false),
     DEFINE_PROP_BOOL("dma-drain", IntelIOMMUState, dma_drain, true),
     DEFINE_PROP_BOOL("dma-translation", IntelIOMMUState, dma_translation, true),
+    DEFINE_PROP_BOOL("dev-iotlb", IntelIOMMUState, dev_iotlb, true),
     DEFINE_PROP_END_OF_LIST(),
 };
 
diff --git a/include/hw/i386/intel_iommu.h b/include/hw/i386/intel_iommu.h
index 46d973e62975..dd2347699b22 100644
--- a/include/hw/i386/intel_iommu.h
+++ b/include/hw/i386/intel_iommu.h
@@ -271,6 +271,7 @@ struct IntelIOMMUState {
      * per-IOMMU IOTLB cache, and context entry cache in VTDAddressSpace.
      */
     QemuMutex iommu_lock;
+    bool dev_iotlb;                 /* is Dev IOTLB supported? */
 };
 
 /* Find the VTD Address space associated with the given bus pointer,

Comment 13 Eric Auger 2023-02-02 18:02:37 UTC
The rationale of "vhost: Unbreak SMMU and virtio-iommu on dev-iotlb support" was that neither SMMUv3 nor virtio-iommu supports IOMMU_NOTIFIER_DEVIOTLB_UNMAP, so those vIOMMUs could not propagate invalidations to vhost. So if the registration fails, we fall back to standard UNMAP notifications. Intel IOMMU did support IOMMU_NOTIFIER_DEVIOTLB_UNMAP.
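
The fallback described here can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not QEMU code: the flag names, `struct viommu_model`, `viommu_register_notifier()` and `vhost_pick_notifier()` are all hypothetical stand-ins for the real notifier machinery.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the IOMMU notifier flags. */
enum notifier_flag {
    NOTIFIER_UNMAP          = 1 << 0,
    NOTIFIER_DEVIOTLB_UNMAP = 1 << 1,
};

/* Hypothetical vIOMMU model: can it emit device-IOTLB invalidations? */
struct viommu_model {
    bool supports_dev_iotlb;
};

/* The vIOMMU refuses notifier types it cannot serve. */
int viommu_register_notifier(const struct viommu_model *iommu, int flags)
{
    assert(iommu != NULL);

    if ((flags & NOTIFIER_DEVIOTLB_UNMAP) && !iommu->supports_dev_iotlb) {
        return -EINVAL;
    }
    return 0;
}

/*
 * vhost-side policy: prefer device-IOTLB notifications and fall back to
 * plain UNMAP when the vIOMMU refuses them. Returns the flag that was
 * registered, or a negative errno value.
 */
int vhost_pick_notifier(const struct viommu_model *iommu)
{
    int ret;

    if (viommu_register_notifier(iommu, NOTIFIER_DEVIOTLB_UNMAP) == 0) {
        return NOTIFIER_DEVIOTLB_UNMAP;
    }

    ret = viommu_register_notifier(iommu, NOTIFIER_UNMAP);
    return ret == 0 ? NOTIFIER_UNMAP : ret;
}
```

In this model a vIOMMU that accepts the device-IOTLB registration never reaches the fallback, so vhost depends entirely on the guest sending device-IOTLB invalidations.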

Comment 14 Eric Auger 2023-02-02 18:24:43 UTC
Do you suggest there is a guest kernel intel-iommu driver bug in such rhel7.9_guest that prevents the intel-iommu from working in dev-iotlb mode?

I don't see any obvious qemu virtio-iommu fix of dev-iotlb emulation later but Peter/Jason may confirm.

Comment 15 Peter Xu 2023-02-02 21:04:37 UTC
Hi, Laurent, Eric,

(In reply to Laurent Vivier from comment #12)
> I propose to a add a property to intel_iommu, dev-iotlb, "on" by default, to
> disable conditionally IOMMU_NOTIFIER_DEVIOTLB_UNMAP as it's done for
> spapr_iommu, smmuv3 and virtio-iommu:

I think we have it.  It's a common property for x86 iommus:

static Property x86_iommu_properties[] = {
    DEFINE_PROP_BOOL("device-iotlb", X86IOMMUState, dt_supported, false),

However, intel-iommu still lacks the code to fail the notifier registration if dt is not supported, so I think something like what Laurent proposed will still be needed.  Actually, Jason also proposed it recently [1], but it has not yet landed in QEMU upstream.

I think we should have that patch asap.

(In reply to Eric Auger from comment #14)
> Do you suggest there is a guest kernel intel-iommu driver bug in such
> rhel7.9_guest that prevents the intel-iommu from working in dev-iotlb mode?
> 
> I don't see any obvious qemu virtio-iommu fix of dev-iotlb emulation later
> but Peter/Jason may confirm.

I suspect there can indeed be dev-iotlb related bugs in the VT-d kernel driver in rhel7.9.  However, since both rhel8/9 work for us, I'm wondering whether it would still be worthwhile to dig into the rhel7.9 tree.

Why does the bug have "high" priority?  Is some customer hitting this?  Is it blocking our tests?

[1] https://lore.kernel.org/all/20221129081037.12099-3-jasowang@redhat.com/

Comment 16 Laurent Vivier 2023-02-03 09:03:28 UTC
(In reply to Peter Xu from comment #15)
...
> Why the bug has "high" priority?  Is some customer hitting this?  Is it
> blocking our tests?

I think it's a test blocker, Lei?

> [1] https://lore.kernel.org/all/20221129081037.12099-3-jasowang@redhat.com/

Thank you.
It works fine: I applied the patch to QEMU and set "-device intel-iommu,...,device-iotlb=off" on the command line, and vhost networking works.

Author: Jason Wang <jasowang>
Date:   Tue Nov 29 16:10:36 2022 +0800

    intel-iommu: fail DEVIOTLB_UNMAP without dt mode
    
    Without dt mode, device IOTLB notifier won't work since guest won't
    send device IOTLB invalidation descriptor in this case. Let's fail
    early instead of misbehaving silently.
    
    Signed-off-by: Jason Wang <jasowang>
    Signed-off-by: Laurent Vivier <lvivier>

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index 98a5c304a7d7..a07e879c704f 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3179,6 +3179,7 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
 {
     VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
     IntelIOMMUState *s = vtd_as->iommu_state;
+    X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
 
     /* TODO: add support for VFIO and vhost users */
     if (s->snoop_control) {
@@ -3186,6 +3187,13 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
                          "Snoop Control with vhost or VFIO is not supported");
         return -ENOTSUP;
     }
+    if (!x86_iommu->dt_supported && (new & IOMMU_NOTIFIER_DEVIOTLB_UNMAP)) {
+        error_setg_errno(errp, ENOTSUP,
+                         "device %02x.%02x.%x requires device IOTLB mode",
+                         pci_bus_num(vtd_as->bus), PCI_SLOT(vtd_as->devfn),
+                         PCI_FUNC(vtd_as->devfn));
+        return -ENOTSUP;
+    }
 
     /* Update per-address-space notifier flags */
     vtd_as->notifier_flags = new;

Comment 18 Lei Yang 2023-02-06 01:59:46 UTC
(In reply to Laurent Vivier from comment #16)
> (In reply to Peter Xu from comment #15)
> ...
> > Why the bug has "high" priority?  Is some customer hitting this?  Is it
> > blocking our tests?
> 
> I think it's a test blocker, Lei?

Hello Laurent, Peter

Since this is a regression bug, it is set to "high" priority. Please correct me if I'm wrong.

Thanks
Lei

Comment 19 Yanghang Liu 2023-02-06 02:30:59 UTC
(In reply to Laurent Vivier from comment #5)
> (In reply to Yanghang Liu from comment #3)
> > Hi Laurent,
> > 
> > I have opened a similar bug before, but the domain I used is the AMD SEV
> > 
> > RHEL92:
> >  
> > Bug 2153376 - [SEV][virtio-net-pci] vhost vring error in virtqueue 1: Invalid argument (22)
> 
> I think the problem is related to vhost and iommu, your bug could be closed as DUPLICATE.
> You can keep it open if you want to check the fix when it will be available.


Hi Laurent and Peter,

According to QE's workflow, if the root cause turns out to be the same as mine, I think Lei's bug should be closed because my bug was opened earlier.

Could any developer help take my bug?

Bug 2153376 - [SEV][virtio-net-pci] vhost vring error in virtqueue 1: Invalid argument (22)

Comment 21 Laurent Vivier 2023-02-06 09:52:44 UTC
(In reply to Yanghang Liu from comment #19)
> (In reply to Laurent Vivier from comment #5)
> > (In reply to Yanghang Liu from comment #3)
> > > Hi Laurent,
> > > 
> > > I have opened a similar bug before, but the domain I used is the AMD SEV
> > > 
> > > RHEL92:
> > >  
> > > Bug 2153376 - [SEV][virtio-net-pci] vhost vring error in virtqueue 1: Invalid argument (22)
> > 
> > I think the problem is related to vhost and iommu, your bug could be closed as DUPLICATE.
> > You can keep it open if you want to check the fix when it will be available.
> 
> 
> Hi Laurent and Peter,
> 
> According to QE's workflow, if this root cause is finally located as the
> same as mine , I think Lei's bug should be closed because my bug opens
> earlier.
> 
> Does there any developer can help take my bug ?
> 
> Bug 2153376 - [SEV][virtio-net-pci] vhost vring error in virtqueue 1:
> Invalid argument (22)

From a developer point of view, I generally prefer to keep the bug that is easier to reproduce.

But regarding Bug 2153376 and the fix I found for BZ 2156876, I no longer think it's the same problem.

To confirm, could you check your BZ with the build from comment #17?

Comment 22 Yanghang Liu 2023-02-06 15:28:17 UTC
(In reply to Laurent Vivier from comment #21)

> 
> But regarding Bug 2153376 and the fix I found for BZ 2156876, I don't think
> now it's the same problem.
> 
> To confirm, could you check your BZ with the build from comment #17?

> https://brewweb.engineering.redhat.com/brew/taskinfo?taskID=50479018

> repo:

> http://brew-task-repos.usersys.redhat.com/repos/scratch/lvivier/qemu-kvm/7.2.0/6.el9.BZ2156876/qemu-kvm-7.2.0-6.el9.BZ2156876-scratch.repo

Hi Laurent,

Thanks for the info.

My test result shows that the above build does not fix Bug 2153376 - [SEV][virtio-net-pci] vhost vring error in virtqueue 1: Invalid argument (22)


Test env:
qemu-kvm-7.2.0-6.el9.BZ2156876.x86_64
5.14.0-253.el9.x86_64
edk2-ovmf-20220826gitba0e0e4c6a-2.el9.noarch


The check point:
[1] The domain can not be started 
[2] The qemu-kvm throws the following error:
2023-02-06T15:20:35.152243Z qemu-kvm: vhost vring error in virtqueue 1: Invalid argument (22)
2023-02-06T15:20:35.152284Z qemu-kvm: vhost vring error in virtqueue 0: Invalid argument (22)

Comment 23 Laurent Vivier 2023-02-14 10:40:22 UTC
I will backport the fix from comment #16 as soon as it is merged into QEMU master.

Comment 24 Laurent Vivier 2023-02-21 07:57:00 UTC
(In reply to Laurent Vivier from comment #16)
> (In reply to Peter Xu from comment #15)
> ...
> > Why the bug has "high" priority?  Is some customer hitting this?  Is it
> > blocking our tests?
> 
> I think it's a test blocker, Lei?
> 
> > [1] https://lore.kernel.org/all/20221129081037.12099-3-jasowang@redhat.com/
> 
> Thank you.
> It works fine: I apply the patch to QEMU and set "-device
> intel-iommu,...,device-iotlb=off" on the command line and vhost networking
> is working fine.
> 
> Author: Jason Wang <jasowang>
> Date:   Tue Nov 29 16:10:36 2022 +0800
> 
>     intel-iommu: fail DEVIOTLB_UNMAP without dt mode
>     
>     Without dt mode, device IOTLB notifier won't work since guest won't
>     send device IOTLB invalidation descriptor in this case. Let's fail
>     early instead of misbehaving silently.
>     
>     Signed-off-by: Jason Wang <jasowang>
>     Signed-off-by: Laurent Vivier <lvivier>
> 
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index 98a5c304a7d7..a07e879c704f 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3179,6 +3179,7 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
>  {
>      VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
>      IntelIOMMUState *s = vtd_as->iommu_state;
> +    X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
>  
>      /* TODO: add support for VFIO and vhost users */
>      if (s->snoop_control) {
> @@ -3186,6 +3187,13 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
>                           "Snoop Control with vhost or VFIO is not supported");
>          return -ENOTSUP;
>      }
> +    if (!x86_iommu->dt_supported && (new & IOMMU_NOTIFIER_DEVIOTLB_UNMAP)) {
> +        error_setg_errno(errp, ENOTSUP,
> +                         "device %02x.%02x.%x requires device IOTLB mode",
> +                         pci_bus_num(vtd_as->bus), PCI_SLOT(vtd_as->devfn),
> +                         PCI_FUNC(vtd_as->devfn));
> +        return -ENOTSUP;
> +    }
>  
>      /* Update per-address-space notifier flags */
>      vtd_as->notifier_flags = new;

Jason,

do you plan to send a new version of this patch upstream?

As it fixes this BZ it should be merged upstream.

Comment 25 jason wang 2023-02-22 04:21:47 UTC
(In reply to Laurent Vivier from comment #24)
> Jason,
> 
> do you plan to send a new version of this patch upstream?
> 
> As it fixes this BZ it should be merged upstream.

Yes, will do.

Thanks

Comment 26 Laurent Vivier 2023-03-08 16:40:24 UTC
Patch merged upstream:

commit 09adb0e021207b60a0c51a68939b4539d98d3ef3
Author: Jason Wang <jasowang>
Date:   Thu Feb 23 14:59:21 2023 +0800

    intel-iommu: fail DEVIOTLB_UNMAP without dt mode
    
    Without dt mode, device IOTLB notifier won't work since guest won't
    send device IOTLB invalidation descriptor in this case. Let's fail
    early instead of misbehaving silently.
    
    Reviewed-by: Laurent Vivier <lvivier>
    Tested-by: Laurent Vivier <lvivier>
    Tested-by: Viktor Prutyanov <viktor>
    Buglink: https://bugzilla.redhat.com/2156876
    Signed-off-by: Jason Wang <jasowang>
    Message-Id: <20230223065924.42503-3-jasowang>
    Reviewed-by: Peter Xu <peterx>
    Reviewed-by: Michael S. Tsirkin <mst>
    Signed-off-by: Michael S. Tsirkin <mst>

Comment 30 Laurent Vivier 2023-03-09 09:22:59 UTC
Draft Merge Request:

https://gitlab.com/redhat/centos-stream/src/qemu-kvm/-/merge_requests/157

Summary of change:

diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index a08ee85edf2a..d2983f40d3a9 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3179,6 +3179,7 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
 {
     VTDAddressSpace *vtd_as = container_of(iommu, VTDAddressSpace, iommu);
     IntelIOMMUState *s = vtd_as->iommu_state;
+    X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(s);
 
     /* TODO: add support for VFIO and vhost users */
     if (s->snoop_control) {
@@ -3186,6 +3187,13 @@ static int vtd_iommu_notify_flag_changed(IOMMUMemoryRegion *iommu,
                          "Snoop Control with vhost or VFIO is not supported");
         return -ENOTSUP;
     }
+    if (!x86_iommu->dt_supported && (new & IOMMU_NOTIFIER_DEVIOTLB_UNMAP)) {
+        error_setg_errno(errp, ENOTSUP,
+                         "device %02x.%02x.%x requires device IOTLB mode",
+                         pci_bus_num(vtd_as->bus), PCI_SLOT(vtd_as->devfn),
+                         PCI_FUNC(vtd_as->devfn));
+        return -ENOTSUP;
+    }
 
     /* Update per-address-space notifier flags */
     vtd_as->notifier_flags = new;

Comment 34 Laurent Vivier 2023-03-09 13:48:22 UTC
Lei,

could you set ITM?

Thanks

Comment 35 Lei Yang 2023-03-09 13:57:29 UTC
(In reply to Laurent Vivier from comment #34)
> Lei,
> 
> could you set ITM?
> 
> Thanks

Done

Comment 42 Lei Yang 2023-03-20 09:30:36 UTC
==>Reproduced this bug on qemu-kvm-7.2.0-12.el9_2.x86_64

Test Version:
kernel-5.14.0-284.2.1.el9_2.x86_64
qemu-kvm-7.2.0-12.el9_2.x86_64

Test Steps:
1. Boot a RHEL 7.9 guest:
/usr/libexec/qemu-kvm \
-name 'vm1'  \
-sandbox on  \
-blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' \
-blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \
-blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel79-64-virtio-scsi_qcow2_filesystem_VARS.fd", "auto-read-only": true, "discard": "unmap"}' \
-blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' \
-machine q35,kernel-irqchip=split,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \
-device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}'  \
-nodefaults \
-device '{"intremap": "on", "device-iotlb": true, "caching-mode": true, "driver": "intel-iommu"}' \
-device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' \
-m 29696 \
-object '{"size": 31138512896, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}'  \
-smp 32,maxcpus=32,cores=16,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
-device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \
-device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' \
-device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' \
-device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \
-device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' \
-blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/root/avocado/data/avocado-vt/vl_avocado-vt-vm1_image1.qcow2", "cache": {"direct": true, "no-flush": false}}' \
-blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
-device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
-device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \
-device '{"driver": "virtio-net-pci", "mac": "9a:81:1f:e7:1c:fb", "disable-legacy": "on", "disable-modern": false, "iommu_platform": true, "ats": true, "id": "idh93vU4", "netdev": "id8n8aKD", "bus": "pcie-root-port-3", "addr": "0x0"}'  \
-netdev tap,id=id8n8aKD,vhost=on,vhostforce=on \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-device '{"id": "pcie_extra_root_port_0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x3", "chassis": 5}' \
-monitor stdio

2. Add intel_iommu=on to the guest kernel command line, then reboot the guest.
# grubby --update-kernel=`grubby --default-kernel` --args="intel_iommu=on"
# reboot

3. Check the QEMU output:
qemu-kvm: vhost vring error in virtqueue 1: Invalid argument (22)

==>This bug is reproduced on qemu-kvm-7.2.0-12.el9_2.x86_64

==>Verified this bug on qemu-kvm-7.2.0-14.el9_2.x86_64

Test Version:
qemu-kvm-7.2.0-14.el9_2.x86_64
kernel-5.14.0-284.2.1.el9_2.x86_64

Test Steps:
1. Boot a RHEL 7.9 guest with device-iotlb=false:
/usr/libexec/qemu-kvm \
-name 'vm1'  \
-sandbox on  \
-blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/OVMF/OVMF_CODE.secboot.fd", "auto-read-only": true, "discard": "unmap"}' \
-blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \
-blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel79-64-virtio-scsi_qcow2_filesystem_VARS.fd", "auto-read-only": true, "discard": "unmap"}' \
-blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' \
-machine q35,kernel-irqchip=split,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \
-device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}'  \
-nodefaults \
-device '{"intremap": "on", "device-iotlb": false, "caching-mode": true, "driver": "intel-iommu"}' \
-device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' \
-m 29696 \
-object '{"size": 31138512896, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}'  \
-smp 32,maxcpus=32,cores=16,threads=1,dies=1,sockets=2  \
-cpu 'Cascadelake-Server-noTSX',+kvm_pv_unhalt \
-device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \
-device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' \
-device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' \
-device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \
-device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' \
-blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/root/avocado/data/avocado-vt/vl_avocado-vt-vm1_image1.qcow2", "cache": {"direct": true, "no-flush": false}}' \
-blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \
-device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \
-device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \
-device '{"driver": "virtio-net-pci", "mac": "9a:81:1f:e7:1c:fb", "disable-legacy": "on", "disable-modern": false, "iommu_platform": true, "ats": true, "id": "idh93vU4", "netdev": "id8n8aKD", "bus": "pcie-root-port-3", "addr": "0x0"}'  \
-netdev tap,id=id8n8aKD,vhost=on,vhostforce=on \
-vnc :0  \
-rtc base=utc,clock=host,driftfix=slew  \
-boot menu=off,order=cdn,once=c,strict=off \
-enable-kvm \
-device '{"id": "pcie_extra_root_port_0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x3", "chassis": 5}' \
-monitor stdio

2. Add intel_iommu=on to the guest kernel command line, then reboot the guest.
# grubby --update-kernel=`grubby --default-kernel` --args="intel_iommu=on"
# reboot

3. QEMU prints no error output, and the guest can be pinged from the host.
# ping 10.73.211.113  -c 10
PING 10.73.211.113 (10.73.211.113) 56(84) bytes of data.
64 bytes from 10.73.211.113: icmp_seq=1 ttl=64 time=0.651 ms
64 bytes from 10.73.211.113: icmp_seq=2 ttl=64 time=0.551 ms
64 bytes from 10.73.211.113: icmp_seq=3 ttl=64 time=0.786 ms
64 bytes from 10.73.211.113: icmp_seq=4 ttl=64 time=0.709 ms
64 bytes from 10.73.211.113: icmp_seq=5 ttl=64 time=0.739 ms
64 bytes from 10.73.211.113: icmp_seq=6 ttl=64 time=0.719 ms
64 bytes from 10.73.211.113: icmp_seq=7 ttl=64 time=0.567 ms
64 bytes from 10.73.211.113: icmp_seq=8 ttl=64 time=0.853 ms
64 bytes from 10.73.211.113: icmp_seq=9 ttl=64 time=0.675 ms
64 bytes from 10.73.211.113: icmp_seq=10 ttl=64 time=0.667 ms

--- 10.73.211.113 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9202ms
rtt min/avg/max/mdev = 0.551/0.691/0.853/0.087 ms

Based on the above test results, this bug is fixed in qemu-kvm-7.2.0-14.el9_2.x86_64.
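
For repeated runs, the step 3 check can be scripted. This is only a minimal sketch: it assumes the QEMU output was saved to a log file, and the helper name count_vring_errors and the log path are hypothetical, not part of the test above.

```shell
#!/bin/sh
# Count "vhost vring error" lines in a saved QEMU log.
# Assumption: QEMU's output was redirected to the file passed as $1.
count_vring_errors() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # that number is 0, so "|| true" keeps the helper usable under "set -e".
    grep -c 'vhost vring error in virtqueue' "$1" || true
}

# Example (hypothetical path):
#   count_vring_errors /tmp/qemu-vm1.log
```

A count of 0 would match the result seen with qemu-kvm-7.2.0-14.el9_2, and a non-zero count would match the reproduction on qemu-kvm-7.2.0-12.el9_2.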

Comment 45 Lei Yang 2023-03-21 07:27:23 UTC
Based on comment 42, moving to "VERIFIED".

Comment 47 errata-xmlrpc 2023-05-09 07:23:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2162

