Bug 1477099
| Summary: | virtio-iommu (including ACPI, VHOST/VFIO integration, migration support) | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 9 | Reporter: | Eric Auger <eric.auger> |
| Component: | qemu-kvm | Assignee: | Eric Auger <eric.auger> |
| qemu-kvm sub component: | Devices | QA Contact: | Yihuang Yu <yihyu> |
| Status: | CLOSED ERRATA | Severity: | medium |
| Priority: | medium | CC: | abologna, alex.williamson, chayang, coli, hpopal, jen, jinzhao, jsuchane, juzhang, knoel, lijin, mlangsdo, mrezanin, mst, peterx, qzhang, virt-maint, wei, yanghliu, yihyu, zhenyzha, zhguo |
| Version: | 9.0 | Keywords: | FutureFeature, Reopened, Triaged |
| Target Milestone: | beta | Target Release: | 9.1 |
| Hardware: | aarch64 | OS: | Unspecified |
| Fixed In Version: | qemu-kvm-7.0.0-3.el9 | Doc Type: | If docs needed, set a value |
| Last Closed: | 2022-11-15 09:53:23 UTC | Type: | Feature Request |
| : | 1653327 (view as bug list) | | |
| Bug Depends On: | 1972795, 2064757 | | |
| Bug Blocks: | 1543699, 1653327, 1683831, 1727536, 1802982, 1811148, 1924294 | | |
Description
Eric Auger
2017-08-01 08:34:43 UTC
Current upstream status is: [RFC v4 00/16] VIRTIO-IOMMU device, aligned with V0.4 specification.

Removing the rhel-8.0.0 flag again, to see if it will hold this time. This patch series has not been merged upstream, and is unlikely to be merged in time for 8.0. Previously, the bot incorrectly set the rhel-8.0? flag.

Moving to rhel-8.1. The kernel driver is not yet upstreamed and the virtio spec is still under review (though it is close to being approved/voted, I think). So the QEMU device has those dependencies to be resolved first.

This will miss 8.1, as neither the virtio spec has been voted on nor the driver upstreamed. Also, what about moving this bug to RHEL AV?

(In reply to Eric Auger from comment #8)
> This will miss 8.1 as neither the virtio spec is voted or the driver is
> upstreamed. Also what about moving this bug to RHEL AV?

It's OK to target this one for 8.2. Also, I agree this should be moved to AV.

Moved to RHEL AV, like other new aarch64 features.

QEMU has recently been split into sub-components and, as a one-time operation to avoid breakage of tools, we are setting the QEMU sub-component of this BZ to "General". Please review and change the sub-component if necessary the next time you review this BZ. Thanks.

The code is now upstream (qemu 5.0), with the restriction that it only works with the arm virt machine and with the guest booting in DT mode. Non-DT support is under development at the kernel level by Jean-Philippe Brucker from Linaro:

[1] [PATCH 0/3] virtio-iommu on non-devicetree platforms (https://www.spinics.net/lists/linux-virtualization/msg41391.html)

The outcome is still uncertain (i.e. can we integrate without ACPI, relying only on binding info in the PCIe config space?). If we want to be able to protect VFIO devices, we now need to respin:

[PATCH RFC v5 0/5] virtio-iommu: VFIO integration (https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg05383.html)

Bharat Bhushan, now working at Marvell, was the original contributor of this series.
Given the non-DT integration issues, we can no longer target 8.3. The upstream code only supports DT integration. For non-DT, the plan is to introduce a new ACPI table dedicated to virtio-iommu. This is being worked on by Jean-Philippe Brucker (Linaro), but it is a long process ...

*** Bug 1836885 has been marked as a duplicate of this bug. ***

*** Bug 1736263 has been marked as a duplicate of this bug. ***

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

Hi Andrea,

I think this is still material for RHEL. ACPI integration, which was missing to complete the job, should be voted on soon, maybe in February.

Thanks

Eric

(In reply to Eric Auger from comment #22)
> Hi Andrea,
>
> I think this is still material for RHEL. ACPI integration, which was missing
> to complete the job, should be voted soon, maybe in Feb.

Good to know, thanks! With this in mind, I think the bug should be reopened.

Reopening as per comment 22 and comment 23.

From the upstream point of view, the ACPI integration is still not merged. However, it seems close to being so:

[PATCH v3 0/6] Add support for ACPI VIOT
https://lore.kernel.org/linux-iommu/20210602154444.1077006-7-jean-philippe@linaro.org/T/

Downstream, we will need to backport the driver and the ACPI integration, and enable CONFIG_VIRTIO_IOMMU. Then the QEMU integration needs to be upstreamed, but that should go faster.

After evaluating this issue, there are no plans to address it further or fix it in an upcoming release. Therefore, it is being closed. If plans change such that this issue will be fixed in an upcoming release, then the bug can be reopened.

We're waiting for this series to be merged on QEMU upstream. This is taking some time, but it's expected to happen. We're going to target this work for 9.0 or 9.1.
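As a side note, a quick way to confirm whether a given kernel build has the driver enabled is to grep its config file. The sketch below runs against a temporary sample config so it is self-contained; on a real guest you would point it at `/boot/config-$(uname -r)` (or `/proc/config.gz`) instead.

```shell
# Sketch: check a kernel config for the virtio-iommu driver.
# The sample file here is hypothetical; on a real guest, set
# cfg=/boot/config-"$(uname -r)" instead.
cfg=$(mktemp)
printf 'CONFIG_VIRTIO_IOMMU=y\nCONFIG_VIRTIO_PCI=y\n' > "$cfg"

# =y (built-in) or =m (module) both mean the driver is available.
if grep -q '^CONFIG_VIRTIO_IOMMU=[ym]' "$cfg"; then
    echo "virtio-iommu driver enabled"
else
    echo "virtio-iommu driver NOT enabled"
fi
rm -f "$cfg"
```

With the sample file above, the check reports the driver as enabled; a kernel missing the option (as reported in comment 49) would take the other branch.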
Hi Eric, any updates on the current status of the virtio-iommu feature in QEMU? It missed 6.2.0, and the libvirt part is going to miss 8.0.0 too unless I can get it merged this week, which at this point doesn't seem very likely. Can we still hope it makes it into RHEL 9.0 through backports? Thanks!

The kernel dependencies have been reviewed/acked and should be merged in the 5.17 merge window. Only afterwards can the QEMU patches (not yet submitted publicly) use them. So indeed the only way now to get the feature at the QEMU and libvirt level is through backports. I will ping Jean-Philippe for the QEMU patches and do the kernel/QEMU backports if it is still relevant with regard to the schedule :-(

The last QEMU dependencies (boot bypass) were pulled into qemu 7.0. Moving the BZ to POST.

Hi Luiz, what do you expect as info? Everything is downstream now in qemu.

(In reply to Eric Auger from comment #52)
> Hi Luiz, what do you expect as info? everything is downstream now in qemu.

It's about comment 49: Yihuang is reporting that we don't have CONFIG_VIRTIO_IOMMU=y.

(In reply to Luiz Capitulino from comment #53)
> (In reply to Eric Auger from comment #52)
> > Hi Luiz, what do you expect as info? everything is downstream now in qemu.
>
> It's about comment 49, Yihuang is reporting that we don't have
> CONFIG_VIRTIO_IOMMU=y.

Argh, OK.

Eric, do you plan to send an additional patch enabling CONFIG_VIRTIO_IOMMU? I believe we might need to update DTM/ITM.

Yes, that's what I am currently busy doing ...

QE bot (pre-verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.

default-bus-bypass-iommu is meant to bypass the smmu on the root bus, so that's normal. If you check with smmuv3, you get the same behavior.

For the second issue, i.e. plugging the virtio-iommu-pci on a root port: logically it should be feasible and should not prevent the guest from booting, but I need to investigate further what the expected protection is then. I can reproduce on my end.
This is definitely what I would have expected as a use case, and I don't think it is what libvirt does (I don't know whether libvirt allows plugging the virtio-iommu-pci on a given root port, though). Pinging Andrea on this. I don't think this should block this BZ and the feature, especially if libvirt does not allow that kind of topology. Maybe file another BZ to track this down?

(In reply to Eric Auger from comment #60)
> default-bus-bypass-iommu is meant to bypass the smmu on the root bus so
> that's normal. If you check with smmuv3, you get the same behavior.
>
> For the second issue, ie. plugging the virtio-iommu-pci on a root port,
> logically it should be feasible and should not prevent the guest from
> booting but I need to further investigate what is the expected protection
> then. I can reproduce on my end. This is definitively what I would have
> expected as a use case and I don't think this is what libvirt does (I don't
> know if libvirt allows to plug the virtio-iommu-pci on a given root port
> though). Pinging Andrea on this. I don't think this should block this BZ and
> the feature, especially if libvirt does not allow that kind of topology.
> Maybe enter another BZ to track this down?

Thanks Eric, I am clear now. After this bug goes to ON_QA, I will verify it, and for the second issue, I will also file a new bug to track it.

(In reply to Eric Auger from comment #60)
> For the second issue, ie. plugging the virtio-iommu-pci on a root port,
> logically it should be feasible and should not prevent the guest from
> booting but I need to further investigate what is the expected protection
> then. I can reproduce on my end. This is definitively what I would have
> expected as a use case and I don't think this is what libvirt does (I don't
> know if libvirt allows to plug the virtio-iommu-pci on a given root port
> though). Pinging Andrea on this.
> I don't think this should block this BZ and the feature, especially if
> libvirt does not allow that kind of topology. Maybe enter another BZ to
> track this down?

I can confirm that libvirt will always place the virtio-iommu-pci device directly on pcie.0 and reject attempts to move it to a different bus.

As for whether that's actually correct... I'm not entirely sure. I based that decision on the following exchange:

> >> - Here is the sample qemu cmd line I am using
> >>
> >>   -device virtio-iommu-pci,addr=0xa,disable-legacy=on
> >
> > Is the exact PCI address important, or did you just pick an arbitrary
> > slot on pcie.0? Are there any limitations that you're aware of in
> > that regard?
>
> no it isn't. It is arbitrary here. You can put it anywhere on pcie.0
> normally.

That's a snippet from an off-list thread between me and Eric dating back to last September. Maybe I read too much into it, and it would actually be fine if the device were not on pcie.0? If that turns out to be the case, we can easily lift the restriction on the libvirt side. Eric, you said you were going to ask Jean-Philippe Brucker for more information on this topic, right?
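For reference, the libvirt side of this discussion boils down to a domain XML fragment along the following lines. This is a sketch based on libvirt's domain XML format for the virtio IOMMU model; the PCI address shown is an arbitrary slot on pcie.0 (matching the placement restriction described above), and exact attribute support depends on the libvirt version in use.

```xml
<devices>
  <!-- virtio-iommu device; libvirt keeps it on the root bus (pcie.0) -->
  <iommu model='virtio'>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/>
  </iommu>
</devices>
```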
Please update the bug once you hear back :)

Verified with qemu-kvm-7.0.0-3.el9.aarch64

Guest kernel:

QEMU command line:

```shell
MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -sandbox on \
    -blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \
    -blockdev node-name=file_aavmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel910-aarch64-virtio.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \
    -machine virt,gic-version=host,memory-backend=mem-machine_mem,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
    -nodefaults \
    -device virtio-iommu-pci,bus=pcie.0,addr=0x2 \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0,iommu_platform=on \
    -m 8192 \
    -object memory-backend-ram,size=8192M,id=mem-machine_mem \
    -smp 4,maxcpus=4,cores=2,threads=1,sockets=2 \
    -cpu 'host' \
    -serial unix:'/tmp/serial-serial0',server=on,wait=off \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel910-aarch64-virtio.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,write-cache=on,bus=pcie-root-port-3,addr=0x0,iommu_platform=on \
    -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
    -device virtio-net-pci,mac=9a:67:ed:03:aa:3c,rombar=0,id=idtzGRNX,netdev=idzIjEeK,bus=pcie-root-port-4,addr=0x0,iommu_platform=on \
    -netdev tap,id=idzIjEeK,vhost=on \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -enable-kvm
```

Check dmesg inside the guest:

```shell
# dmesg | grep -i iommu
[    1.044297] iommu: Default domain type: Translated
[    1.045534] iommu: DMA domain TLB invalidation policy: lazy mode
[    1.209360] virtio_iommu virtio0: input address: 64 bits
[    1.210709] virtio_iommu virtio0: page mask: 0xfffffffffffff000
[    1.226013] xhci_hcd 0000:04:00.0: Adding to iommu group 0
[    1.227516] iommu: Failed to allocate default IOMMU domain of type 11 for group (null) - Falling back to IOMMU_DOMAIN_DMA
[    2.196732] pcieport 0000:00:01.0: Adding to iommu group 1
[    2.198186] iommu: Failed to allocate default IOMMU domain of type 11 for group (null) - Falling back to IOMMU_DOMAIN_DMA
[    2.214842] pcieport 0000:00:01.1: Adding to iommu group 1
[    2.228951] pcieport 0000:00:01.2: Adding to iommu group 1
[    2.239385] pcieport 0000:00:01.3: Adding to iommu group 1
[    2.252172] pcieport 0000:00:01.4: Adding to iommu group 1
[    2.264326] pcieport 0000:01:00.0: Adding to iommu group 1
[    2.267173] virtio-pci 0000:03:00.0: Adding to iommu group 1
[    2.270554] virtio-pci 0000:05:00.0: Adding to iommu group 1
[    2.273847] virtio-pci 0000:06:00.0: Adding to iommu group 1
```

Check VIOT ACPI table:

```shell
# dmesg | grep -i VIOT
[    0.000000] ACPI: VIOT 0x000000023C04E498 000058 (v00 BOCHS BXPC 00000001 BXPC 00000001)
```

I am not too concerned about those devices. However, it would be nice to compare with x86 and intel-iommu: do we see the same kind of failures in the guest?

After trying it, intel-iommu doesn't print those failure messages.
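The "Adding to iommu group N" messages above can also be cross-checked from sysfs, where the kernel exposes group membership under /sys/kernel/iommu_groups. The sketch below walks a temporary fake directory tree mimicking that layout so it is self-contained; on a real guest, set root=/sys/kernel/iommu_groups instead.

```shell
# Sketch: list device-to-IOMMU-group assignments the way sysfs exposes them.
# The fake tree below stands in for /sys/kernel/iommu_groups on a real guest.
root=$(mktemp -d)
mkdir -p "$root/1/devices/0000:00:01.0" "$root/1/devices/0000:03:00.0"

for dev in "$root"/*/devices/*; do
    [ -e "$dev" ] || continue
    group=${dev%/devices/*}            # strip the trailing "/devices/<bdf>"
    printf '%s -> iommu group %s\n' "${dev##*/}" "${group##*/}"
done
rm -rf "$root"
```

With the two fake entries above this prints both devices as members of group 1, matching the shape of the dmesg output from the verification run.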
CPU: Intel(R) Xeon(R) CPU E3-1260L v5 @ 2.90GHz

```shell
MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -sandbox on \
    -blockdev node-name=file_ovmf_code,driver=file,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
    -blockdev node-name=file_ovmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel910-64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
    -blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
    -machine q35,kernel-irqchip=split,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
    -nodefaults \
    -device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 7168 \
    -object memory-backend-ram,size=7168M,id=mem-machine_mem \
    -smp 4,maxcpus=4,cores=2,threads=1,dies=1,sockets=2 \
    -cpu 'Skylake-Client-IBRS',ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,clflushopt=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,pdpe1gb=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rsba=on,skip-l1dfl-vmentry=on,pschange-mc-no=on,hle=off,rtm=off,kvm_pv_unhalt=on \
    -device pvpanic,ioport=0x505,id=idsuheBo \
    -chardev socket,server=on,wait=off,path=/tmp/serial-serial0,id=chardev_serial0 \
    -device isa-serial,id=serial0,chardev=chardev_serial0 \
    -chardev socket,id=seabioslog_id_20220524-111702-V2HVzima,path=/tmp/seabios0,server=on,wait=off \
    -device isa-debugcon,chardev=seabioslog_id_20220524-111702-V2HVzima,iobase=0x402 \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel910-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-net-pci,mac=9a:32:f8:d1:2b:62,id=idgxpMuw,netdev=idHMW2n6,bus=pcie-root-port-3,addr=0x0 \
    -netdev tap,id=idHMW2n6,vhost=on \
    -vnc :0 \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -monitor stdio
```

```shell
# dmesg | grep iommu
[    0.000000] Command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.0-96.el9.x86_64 root=/dev/mapper/rhel_vm--179--240-root ro console=tty0 crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M resume=/dev/mapper/rhel_vm--179--240-swap rd.lvm.lv=rhel_vm-179-240/root rd.lvm.lv=rhel_vm-179-240/swap net.ifnames=0 console=ttyS0,115200 intel_iommu=on iommu=pt
[    0.023110] Kernel command line: BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.0-96.el9.x86_64 root=/dev/mapper/rhel_vm--179--240-root ro console=tty0 crashkernel=1G-4G:192M,4G-64G:256M,64G-:512M resume=/dev/mapper/rhel_vm--179--240-swap rd.lvm.lv=rhel_vm-179-240/root rd.lvm.lv=rhel_vm-179-240/swap net.ifnames=0 console=ttyS0,115200 intel_iommu=on iommu=pt
[    0.023208] Unknown kernel command line parameters "BOOT_IMAGE=(hd0,gpt2)/vmlinuz-5.14.0-96.el9.x86_64 intel_iommu=on", will be passed to user space.
[    0.358009] iommu: Default domain type: Passthrough (set via kernel command line)
[    0.504582] pci 0000:00:00.0: Adding to iommu group 0
[    0.505088] pci 0000:00:01.0: Adding to iommu group 1
[    0.505581] pci 0000:00:01.1: Adding to iommu group 2
[    0.506118] pci 0000:00:01.2: Adding to iommu group 3
[    0.506609] pci 0000:00:01.3: Adding to iommu group 4
[    0.507117] pci 0000:00:02.0: Adding to iommu group 5
[    0.507604] pci 0000:00:1f.0: Adding to iommu group 6
[    0.508087] pci 0000:00:1f.2: Adding to iommu group 6
[    0.508568] pci 0000:00:1f.3: Adding to iommu group 6
[    0.509064] pci 0000:01:00.0: Adding to iommu group 7
[    0.509557] pci 0000:03:00.0: Adding to iommu group 8
[    0.510052] pci 0000:04:00.0: Adding to iommu group 9
[    0.510543] pci 0000:05:00.0: Adding to iommu group 10
[    1.250254] intel_iommu=on
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:7967