Bug 2126095
Summary: | [rhel9.2][intel_iommu]Booting guest with "-device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on" causes kernel call trace | ||
---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | Lei Yang <leiyang> |
Component: | qemu-kvm | Assignee: | Peter Xu <peterx> |
qemu-kvm sub component: | Devices | QA Contact: | jinl |
Status: | CLOSED ERRATA | Docs Contact: | |
Severity: | high | ||
Priority: | high | CC: | chayang, coli, jinl, jinzhao, juzhang, lvivier, menli, mrezanin, nanliu, peterx, qinwang, qizhu, virt-maint, xiaolong.wang, yanghliu, zhguo |
Version: | 9.2 | Keywords: | Regression, Triaged |
Target Milestone: | rc | ||
Target Release: | 9.2 | ||
Hardware: | Unspecified | ||
OS: | All | ||
Whiteboard: | |||
Fixed In Version: | qemu-kvm-7.1.0-4.el9 | Doc Type: | If docs needed, set a value |
Doc Text: | Story Points: | --- | |
Clone Of: | Environment: | ||
Last Closed: | 2023-05-09 07:20:41 UTC | Type: | --- |
Regression: | --- | Mount Type: | --- |
Documentation: | --- | CRM: | |
Verified Versions: | Category: | --- | |
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
Cloudforms Team: | --- | Target Upstream Version: | |
Embargoed: | |||
Bug Depends On: | 2135806 | ||
Bug Blocks: |
Description
Lei Yang
2022-09-12 12:05:21 UTC
It looks more like an IOMMU bug or a PCI bug than a virtio-net bug. Peter, could you have a look to confirm that?

This bug is potentially a duplicate of BZ2127410, as it also uses intel-iommu.

Possible dup of bz2126623 too. Can verify using "eim=off" attached to intel-iommu.

Moving to qemu-kvm/Devices as it doesn't seem specific to virtio-networking.

(In reply to Peter Xu from comment #4)
> Possible dup of bz2126623 too. Can verify using "eim=off" attached to
> intel-iommu.

Yep, I tried adding this parameter to the intel-iommu command line. There is no issue any more.

-device intel-iommu,intremap=on,device-iotlb=on,caching-mode=on,eim=off \

*** Bug 2126623 has been marked as a duplicate of this bug. ***

Copied from Peter's comment in https://bugzilla.redhat.com/show_bug.cgi?id=2126623#c6:
https://lore.kernel.org/qemu-devel/20220921161227.57259-1-peterx@redhat.com/T/#u

*** Bug 2127410 has been marked as a duplicate of this bug. ***

Hit a similar issue with win11 (both 21h2 and 22h2).

packages:
kernel-5.14.0-167.el9.x86_64
qemu-kvm-7.1.0-1.el9.x86_64
edk2-ovmf-20220526git16779ede2d36-4.el9.noarch
seabios-bin-1.16.0-4.el9.noarch

Steps:
1. Boot a win11 guest with the following command.
MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm \
    -name 'win11' \
    -machine q35,kernel-irqchip=split \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
    -nodefaults \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 14336 \
    -smp 20,maxcpus=20,cores=10,threads=1,dies=1,sockets=2 \
    -cpu 'IvyBridge-IBRS',ss=on,vmx=on,pdcm=on,pcid=on,hypervisor=on,arat=on,tsc-adjust=on,umip=on,md-clear=on,stibp=on,arch-capabilities=on,ssbd=on,xsaveopt=on,pdpe1gb=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,skip-l1dfl-vmentry=on,pschange-mc-no=on,hv_stimer,hv_synic,hv_vpindex,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_frequencies,hv_runtime,hv_tlbflush,hv_reenlightenment,hv_stimer_direct,hv_ipi,kvm_pv_unhalt=on \
    -device pvpanic,ioport=0x505,id=idGusuYv \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0,disable-legacy=on,disable-modern=false,iommu_platform=on,ats=on \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/win11-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
    -device virtio-net-pci,mac=9a:f0:66:a0:03:50,id=idL1tYyA,netdev=ide1YiHr,bus=pcie-root-port-4,addr=0x0,disable-legacy=on,disable-modern=false,iommu_platform=on,ats=on \
    -netdev tap,id=ide1YiHr,vhost=on \
    -blockdev node-name=file_cd1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/iso/windows/winutils.iso,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_cd1,driver=raw,read-only=on,cache.direct=on,cache.no-flush=off,file=file_cd1 \
    -device scsi-cd,id=cd1,drive=drive_cd1,write-cache=on \
    -vnc :10 \
    -rtc base=localtime,clock=host,driftfix=slew \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=6 \
    -enable-kvm \
    -qmp tcp:0:1231,server,nowait \
    -monitor stdio \
    -tpmdev emulator,id=tpm-tpm0,chardev=chrtpm \
    -chardev socket,id=chrtpm,path=/tmp/guest-swtpm.sock \
    -device tpm-crb,tpmdev=tpm-tpm0,id=tpm0 \
    -drive file=/usr/share/edk2/ovmf/OVMF_CODE.secboot.fd,if=pflash,format=raw,unit=0,readonly=on \
    -drive file=/home/OVMF_VARS.fd,if=pflash,format=raw,unit=1 \
    -device intel-iommu,intremap=on,device-iotlb=on,eim=on

Actual results:
The guest hangs on the boot page when the IOMMU is enabled.

Expected results:
The guest can boot normally.

Additional info:
1. Not reproducible on qemu-kvm-7.0.0-13.el9.x86_64, so it should be a qemu regression.
2. After removing '-device intel-iommu,intremap=on,device-iotlb=on,eim=on', the guest can start normally.

host info:
model name : Intel(R) Xeon(R) CPU E7-4830 v2 @ 2.20GHz

Thanks
Menghuan

*** Bug 2135692 has been marked as a duplicate of this bug. ***

Hit the same issue on:
Red Hat Enterprise Linux release 9.2 Beta (Plow)
5.14.0-162.el9.x86_64
qemu-kvm-7.1.0-1.el9.x86_64
seabios-bin-1.16.0-4.el9.noarch
edk2-ovmf-20220526git16779ede2d36-3.el9.noarch
libvirt-8.5.0-6.el9.x86_64
python3-libvirt-8.5.0-2.el9.x86_64
virtio-win-prewhql-0.1-227.iso

How reproducible:
100%

Steps to Reproduce:
1. Boot VM with iommu enabled

/usr/libexec/qemu-kvm \
    -name 'avocado-vt-vm1' \
    -sandbox on \
    -machine q35,kernel-irqchip=split,memory-backend=mem-machine_mem \
    -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
    -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
    -nodefaults \
    -device intel-iommu,intremap=on,device-iotlb=on \
    -device VGA,bus=pcie.0,addr=0x2 \
    -m 8G \
    -object memory-backend-ram,size=8G,id=mem-machine_mem \
    -smp 10,maxcpus=10,cores=5,threads=1,dies=1,sockets=2 \
    -cpu 'Cascadelake-Server',ss=on,vmx=on,pdcm=on,hypervisor=on,tsc-adjust=on,umip=on,pku=on,md-clear=on,stibp=on,arch-capabilities=on,xsaves=on,ibpb=on,ibrs=on,amd-stibp=on,amd-ssbd=on,rdctl-no=on,ibrs-all=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,tsx-ctrl=on,hle=off,rtm=off,kvm_pv_unhalt=on \
    -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
    -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
    -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
    -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
    -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0,disable-legacy=on,disable-modern=off,iommu_platform=on,ats=on \
    -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel910-64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
    -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
    -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
    -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
    -vnc :5 \
    -monitor stdio \
    -rtc base=utc,clock=host,driftfix=slew \
    -boot menu=off,order=cdn,once=c,strict=off \
    -enable-kvm \
    -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=5

The guest boots successfully if eim=off is added to the intel-iommu device:

-device intel-iommu,intremap=on,device-iotlb=on,eim=off \

This bug will be resolved by the qemu 7.2 rebase, bug 2135806. I've changed the DTM/ITM and added the bug dependency.

Peter (or Mirek) - unless there's a desire for testing to have the patch for the existing tests using the 7.1 rebase, I think the 9.2 downstream MR can be closed.

(In reply to John Ferlan from comment #14)
> This bug will be resolved by the qemu 7.2 rebase bug 2135806, I've changed
> the DTM/ITM and added the bug dependency.
>
> Peter (or Mirek) - unless there's a desire for testing to have the patch for
> the existing tests using the 7.1 rebase, then I think the 9.2 downstream MR
> can be closed.

What is the target date for the rebase? As long as QE is good with the delayed fix, it'll be perfectly fine with me to close the MR. Thanks.

QE bot (pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 tests pass.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:2162
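
For anyone still on an affected build (the reports above are against qemu-kvm-7.1.0-1.el9), the eim=off workaround confirmed in the comments can be applied to an existing intel-iommu device string with a small shell helper. This is only a sketch: the function name iommu_with_eim_off is hypothetical and not part of QEMU or any Red Hat tooling, and it assumes the device options are a plain comma-separated string as in the command lines in this report.

```shell
#!/bin/sh
# Hypothetical helper (not part of QEMU): take an intel-iommu -device
# option string and force eim=off, replacing any existing eim= setting.
iommu_with_eim_off() {
    # First expression drops any existing ",eim=..." option;
    # second expression appends ",eim=off" at the end of the string.
    printf '%s\n' "$1" | sed -e 's/,eim=[^,]*//g' -e 's/$/,eim=off/'
}

# Example using the device string from the bug summary.
iommu_with_eim_off "intel-iommu,intremap=on,device-iotlb=on,caching-mode=on"
```

The result can be passed as the argument to -device in place of the original string. Builds at or after the Fixed In Version listed above should not need this workaround.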