Description of problem:
After "default-bus-bypass-iommu" was introduced, launching a guest with iommu=smmuv3 and iommu_platform enabled on the virtio devices causes the guest to hang at "UEFI firmware starting." after system_reset is sent via QMP/HMP.

Version-Release number of selected component (if applicable):
qemu: qemu-kvm-6.2.0-4.el9.aarch64
edk2: edk2-aarch64-20210527gite1999b264f1f-7.el9.noarch
host kernel: 5.14.0-42.el9.aarch64
guest kernel: 5.14.0-39.el9.aarch64

How reproducible:
always

Steps to Reproduce:
1. Launch a guest with "default-bus-bypass-iommu=off,iommu=smmuv3" and "iommu_platform=on"
MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox on \
-blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \
-blockdev node-name=file_aavmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel900-aarch64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \
-machine virt,gic-version=host,default-bus-bypass-iommu=off,iommu=smmuv3,memory-backend=mem-machine_mem,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
-nodefaults \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0,iommu_platform=on \
-m 14336 \
-object memory-backend-ram,size=14336M,id=mem-machine_mem \
-smp 112,maxcpus=112,cores=56,threads=1,sockets=2 \
-cpu 'host' \
-serial unix:'/tmp/serial-serial0',server=on,wait=off \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0,iommu_platform=on \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-aarch64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-vnc :0 \
-rtc base=utc,clock=host,driftfix=slew \
-enable-kvm \
-monitor stdio

2. After the guest reaches the login prompt, type system_reset in HMP/QMP.
(qemu) system_reset

3. Check the guest status on the serial console
# nc -U /tmp/serial-serial0
Red Hat Enterprise Linux 9.0 Beta (Plow)
Kernel 5.14.0-39.el9.aarch64 on an aarch64

Activate the web console with: systemctl enable --now cockpit.socket

localhost login: UEFI firmware starting.

Actual results:
The guest hangs at "UEFI firmware starting."

Expected results:
The guest boots successfully.

Additional info:
1. If the "reboot" command is used inside the guest, the guest reboots without problems.

2. With "QEMU_EFI-silent-pflash.raw" replaced by "QEMU_EFI-pflash.raw", the firmware debug output shows that after system_reset the guest hangs at the following point and never reaches "VirtioGpuDriverBindingStart: produced GOP while binding VirtIo=3BE79E9A0":
InstallQemuFwCfgTables: installed 9 tables
InstallQemuFwCfgTables: freeing "etc/acpi/rsdp"
InstallQemuFwCfgTables: freeing "etc/acpi/tables"
Virtio10:DebugDumpPciCapList: Norm 0x0001 000/001 v0x0 @0x07C+0x008
Virtio10:DebugDumpPciCapList: Norm 0x0009 000/005 v0x0 @0x0C8+0x014
Virtio10:DebugDumpPciCapList: Norm 0x0009 001/005 v0x0 @0x0B4+0x014
Virtio10:DebugDumpPciCapList: Norm 0x0009 002/005 v0x0 @0x0A4+0x010
Virtio10:DebugDumpPciCapList: Norm 0x0009 003/005 v0x0 @0x094+0x010
Virtio10:DebugDumpPciCapList: Norm 0x0009 004/005 v0x0 @0x084+0x010
Virtio10:DebugDumpPciCapList: Norm 0x0010 000/001 v0x0 @0x040+0x03C
Virtio10:DebugDumpPciCapList: Norm 0x0011 000/001 v0x0 @0x0DC+0x024
InstallProtocolInterface: FA920010-6785-4941-B6EC-498C579F160A 3BE7A4A20
InstallProtocolInterface: D6099B94-CD97-4CC5-8714-7F6312701A8A 3BE7A2718
InstallProtocolInterface: 09576E91-6D3F-11D2-8E39-00A0C969723B 3BE7A2018

3. Removing "default-bus-bypass-iommu=off" also triggers the problem, but the problem is gone if "default-bus-bypass-iommu=on" is set.
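For anyone automating step 3, a minimal sketch of the serial-console check follows. It is only a heuristic and not part of any test suite mentioned in this report: the function name `hung_after_reset` is hypothetical, and it assumes the captured serial text is available as a string. It treats the guest as hung when nothing but whitespace or NUL bytes follows the last "UEFI firmware starting." banner.

```python
FIRMWARE_BANNER = "UEFI firmware starting."


def hung_after_reset(serial_text: str) -> bool:
    """Heuristic: True if no output follows the last firmware banner.

    `hung_after_reset` is a hypothetical helper name. Whitespace and NUL
    bytes after the banner are ignored, since they carry no boot progress.
    """
    idx = serial_text.rfind(FIRMWARE_BANNER)
    if idx == -1:
        # No firmware banner at all: no reset was captured in this text.
        return False
    tail = serial_text[idx + len(FIRMWARE_BANNER):]
    return tail.strip(" \t\r\n\x00") == ""
```

A slow but healthy boot looks the same as a hang until more output arrives, so a check like this should only be applied after a timeout has expired.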
Hello Eric,

default-bus-bypass-iommu is a new parameter introduced in qemu-6.2. I am not sure whether default-bus-bypass-iommu=off + iommu_platform=on is a negative use case for which qemu should report a warning. However, its default value is off, so when it is omitted the guest should work as it did before.
I can reproduce this upstream (without default-bus-bypass-iommu=off, which is the default value as you mentioned).

When issuing 'system_reset' in a qmp-shell, I get the following qemu errors and the guest does not reboot:

invalid STE
smmuv3-iommu-memory-region-0-0 translation failed for iova=0x13a9d2000(SMMU_EVT_C_BAD_STE)
Invalid read at addr 0x13A9D2000, size 2, region '(null)', reason: rejected
invalid STE
smmuv3-iommu-memory-region-0-0 translation failed for iova=0x13a9d2000(SMMU_EVT_C_BAD_STE)
Invalid write at addr 0x13A9D2000, size 2, region '(null)', reason: rejected
invalid STE
../..

This is definitely not expected.
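To spot this failure signature in a captured qemu stderr log, something along the following lines could be used. This is a sketch only: the helper name `bad_ste_iovas` is hypothetical, and the pattern is derived solely from the message format quoted in this comment, so it may need adjusting for other qemu versions.

```python
import re
from typing import List

# Pattern taken from the error text quoted above, e.g.
# "... translation failed for iova=0x13a9d2000(SMMU_EVT_C_BAD_STE)"
BAD_STE_RE = re.compile(
    r"translation failed for iova=(0x[0-9a-fA-F]+)\s*\(SMMU_EVT_C_BAD_STE\)"
)


def bad_ste_iovas(qemu_stderr: str) -> List[int]:
    """Return the IOVA of every SMMU_EVT_C_BAD_STE failure found, in order.

    `bad_ste_iovas` is a hypothetical helper name used for illustration.
    """
    return [int(m.group(1), 16) for m in BAD_STE_RE.finditer(qemu_stderr)]
```

An empty list after a system_reset would suggest that this particular signature did not occur; it does not by itself prove that the guest rebooted.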
Sent '[PATCH] hw/arm/smmuv3: Fix device reset' upstream. With that patch I do not seem to hit the error.
(In reply to Yihuang Yu from comment #3)
> Hello Eric,
>
> default-bus-bypass-iommu is a new parameter since qemu-6.2, I am not sure if
> default-bus-bypass-iommu=off + iommu_platform=on is a negative use case and
> qemu should report a warning of it? However, its default value is off, when
> it is omitted, the guest should work well like before.

To me this is not related to the new default-bus-bypass-iommu option. As you said, it is off by default. To me it is a system_reset issue: I think the device reset was missing some key register resets.
QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass.
Verify this bug with qemu-kvm-6.2.0-8.el9

Env:
Host kernel: 5.14.0-56.el9.aarch64
Guest kernel: 5.14.0-58.el9.aarch64
qemu version: qemu-kvm-6.2.0-8.el9.aarch64
edk2 version: edk2-aarch64-20220126gitbb1bba3d77-1.el9.noarch

python3 ConfigTest.py --testcase=system_reset --guestname=RHEL.9.0.0 --customsparams='machine_type_extra_params += ,default-bus-bypass-iommu=on,iommu=smmuv3\nvirtio_dev_iommu_platform = on\nvirtio_dev_ats = on\nvirtio_dev_aer = on'

(1/2) Host_RHEL.m9.u0.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.0.0.aarch64.io-github-autotest-qemu.unattended_install.cdrom.extra_cdrom_ks.default_install.aio_threads.arm64-pci: STARTED
(1/2) Host_RHEL.m9.u0.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.0.0.aarch64.io-github-autotest-qemu.unattended_install.cdrom.extra_cdrom_ks.default_install.aio_threads.arm64-pci: PASS (1119.20 s)
(2/2) Host_RHEL.m9.u0.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.0.0.aarch64.io-github-autotest-qemu.system_reset.arm64-pci: STARTED
(2/2) Host_RHEL.m9.u0.qcow2.virtio_scsi.up.virtio_net.Guest.RHEL.9.0.0.aarch64.io-github-autotest-qemu.system_reset.arm64-pci: PASS (217.90 s)

MALLOC_PERTURB_=1 /usr/libexec/qemu-kvm \
-name 'avocado-vt-vm1' \
-sandbox on \
-blockdev node-name=file_aavmf_code,driver=file,filename=/usr/share/edk2/aarch64/QEMU_EFI-silent-pflash.raw,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_aavmf_code,driver=raw,read-only=on,file=file_aavmf_code \
-blockdev node-name=file_aavmf_vars,driver=file,filename=/home/kvm_autotest_root/images/avocado-vt-vm1_rhel900-aarch64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_aavmf_vars,driver=raw,read-only=off,file=file_aavmf_vars \
-machine virt,gic-version=host,memory-backend=mem-machine_mem,pflash0=drive_aavmf_code,pflash1=drive_aavmf_vars \
-device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
-device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
-nodefaults \
-device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
-device virtio-gpu-pci,bus=pcie-root-port-1,addr=0x0 \
-m 14336 \
-object memory-backend-ram,size=14336M,id=mem-machine_mem \
-smp 112,maxcpus=112,cores=56,threads=1,sockets=2 \
-cpu 'host' \
-chardev socket,id=qmp_id_qmpmonitor1,path=/tmp/avocado_16iygt1b/monitor-qmpmonitor1-20220217-013424-Kr3SRzza,server=on,wait=off \
-mon chardev=qmp_id_qmpmonitor1,mode=control \
-chardev socket,id=qmp_id_catch_monitor,path=/tmp/avocado_16iygt1b/monitor-catch_monitor-20220217-013424-Kr3SRzza,server=on,wait=off \
-mon chardev=qmp_id_catch_monitor,mode=control \
-serial unix:'/tmp/avocado_16iygt1b/serial-serial0-20220217-013424-Kr3SRzza',server=on,wait=off \
-device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
-device qemu-xhci,id=usb1,bus=pcie-root-port-2,addr=0x0 \
-device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-3,addr=0x0 \
-blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-aarch64-virtio-scsi.qcow2,cache.direct=on,cache.no-flush=off \
-blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
-device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
-device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
-device virtio-net-pci,mac=9a:a4:af:3d:f3:90,rombar=0,id=idC0pAfV,netdev=idvcviNx,bus=pcie-root-port-4,addr=0x0 \
-netdev tap,id=idvcviNx,vhost=on,vhostfd=21,fd=6 \
-vnc :0 \
-rtc base=utc,clock=host,driftfix=slew \
-chardev socket,id=char_vtpm_tpm0,path=/var/tmp/vm1_tpm.sock \
-tpmdev emulator,chardev=char_vtpm_tpm0,id=emulator_vtpm_tpm0 \
-device tpm-tis-device,id=tpm-tis-device_vtpm_tpm0,tpmdev=emulator_vtpm_tpm0 \
-enable-kvm \

01:35:52 DEBUG| Updated HWADDR (9a:a4:af:3d:f3:90)<->(10.19.158.196) IP pair into address cache
01:35:53 DEBUG| Found/Verified IP 10.19.158.196 for VM avocado-vt-vm1 NIC 0
01:36:15 DEBUG| Attempting to log into 'avocado-vt-vm1' (timeout 360s)
01:36:15 DEBUG| Found/Verified IP 10.19.158.196 for VM avocado-vt-vm1 NIC 0
01:36:17 INFO | Context: Reboot guest 'avocado-vt-vm1'. --> rebooting 'avocado-vt-vm1'
01:36:17 DEBUG| (monitor avocado-vt-vm1.qmpmonitor1) Sending command 'system_reset'
01:36:17 DEBUG| Send command: {'execute': 'system_reset', 'id': '7fUrQHn9'}
01:36:18 INFO | Context: Reboot guest 'avocado-vt-vm1'. --> rebooting 'avocado-vt-vm1' --> waiting for guest to go down
01:36:18 INFO | Context: Reboot guest 'avocado-vt-vm1'. --> rebooting 'avocado-vt-vm1' --> logging in after reboot
01:36:18 DEBUG| Attempting to log into 'avocado-vt-vm1' (timeout 359s)
01:36:18 DEBUG| Found/Verified IP 10.19.158.196 for VM avocado-vt-vm1 NIC 0
01:37:14 DEBUG| Destroying VM avocado-vt-vm1 (PID 88497)
01:37:14 DEBUG| Shutting down VM avocado-vt-vm1 (shell)

serial output:
2022-02-17 01:36:14: dhcp158-196 login:
2022-02-17 01:36:17: UEFI firmware starting.
2022-02-17 01:36:18: ^@^@
2022-02-17 01:36:19: SyncPcrAllocationsAndPcrMask!
2022-02-17 01:36:19: Tpm2GetCapabilityPcrs - 00000004
...... ......
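For reference, the 'system_reset' command seen in the log above can be reproduced by hand over a QMP socket. The sketch below only builds the messages, under these assumptions: the helper name `qmp_reset_messages` is hypothetical and is not how the test framework itself is implemented; QMP expects `qmp_capabilities` to be negotiated before any other command is accepted, so that message comes first.

```python
import json
from typing import List


def qmp_reset_messages(request_id: str) -> List[bytes]:
    """Build the QMP messages needed to request a system reset.

    `qmp_reset_messages` is a hypothetical helper name. The first message
    leaves capabilities-negotiation mode; the second mirrors the command
    shown in the test log, including its "id" field.
    """
    handshake = {"execute": "qmp_capabilities"}
    reset = {"execute": "system_reset", "id": request_id}
    return [json.dumps(msg).encode("utf-8") + b"\r\n" for msg in (handshake, reset)]
```

After the server greeting has been read from the monitor socket, these byte strings would be written to it in order, and each reply checked for an "error" member before continuing.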
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (new packages: qemu-kvm), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:2307