Bug 2211060
Summary: | SEV-es guest randomly stuck at boot to hard drive screen from powerdown and boot again | ||||||
---|---|---|---|---|---|---|---|
Product: | Red Hat Enterprise Linux 9 | Reporter: | zixchen | ||||
Component: | edk2 | Assignee: | Gerd Hoffmann <kraxel> | ||||
Status: | CLOSED ERRATA | QA Contact: | zixchen | ||||
Severity: | medium | Docs Contact: | |||||
Priority: | medium | ||||||
Version: | 9.3 | CC: | berrange, coli, jinzhao, kraxel, lersek, mbrown, mrezanin, pbonzini, vgoyal, virt-maint, xuwei, zhguo | ||||
Target Milestone: | rc | Keywords: | CustomerScenariosInitiative, Triaged | ||||
Target Release: | --- | ||||||
Hardware: | x86_64 | ||||||
OS: | Linux | ||||||
Whiteboard: | |||||||
Fixed In Version: | edk2-20230524-3.el9 | Doc Type: | If docs needed, set a value | ||||
Doc Text: | Story Points: | --- | |||||
Clone Of: | Environment: | ||||||
Last Closed: | 2023-11-07 08:24:30 UTC | Type: | Bug | ||||
Regression: | --- | Mount Type: | --- | ||||
Documentation: | --- | CRM: | |||||
Verified Versions: | Category: | --- | |||||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |||||
Cloudforms Team: | --- | Target Upstream Version: | |||||
Embargoed: | |||||||
Attachments: |
|
2023-05-25 04:10:03: ASSERT /builddir/build/BUILD/edk2-f80f052277c8/OvmfPkg/VirtioNetDxe/SnpSharedHelpers.c(177): ((BOOLEAN)(0==1)) switch (Status) { case EFI_OUT_OF_RESOURCES: goto UnmapTxBuf; case EFI_ALREADY_STARTED: // // This should never happen: it implies // // - an identity-mapping VIRTIO_DEVICE_PROTOCOL.MapSharedBuffer() // implementation -- which is fine, // // - and an SNP client that queues multiple instances of the exact same // buffer address with SNP.Transmit() -- which is undefined behavior, // based on the TxBuf language in UEFI-2.7, // EFI_SIMPLE_NETWORK.GetStatus(). // ASSERT (FALSE); <= THIS is triggered. Status = EFI_INVALID_PARAMETER; goto UnmapTxBuf; default: ASSERT_EFI_ERROR (Status); break; } https://kojihub.stream.centos.org/koji/taskinfo?taskID=2275930 Test build with some debug logging added. Can you try this and attach a log of a failed boot? (In reply to Gerd Hoffmann from comment #4) > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2275930 > Test build with some debug logging added. > Can you try this and attach a log of a failed boot? Thanks, Gerd, I will have a try ASAP. 2023-06-01 10:12:05: IoMmuMap: Operation=Read Host=0x7F7BA61D Bytes=0x33 2023-06-01 10:12:05: InternalAllocateBuffer: range-size: 1000, start-address=0x7E06C000, pages=0x1, bits=0x1, bitmap: BFF2 => BFF3 2023-06-01 10:12:05: VirtioNetTransmit/77: Buffer 7D1F5041, Size 78 2023-06-01 10:12:05: VirtioNetMapTxBuf/142: << Buffer 7D1F5041 2023-06-01 10:12:05: IoMmuMap: Operation=Read Host=0x7D1F5041 Bytes=0x4E 2023-06-01 10:12:05: InternalAllocateBuffer: range-size: 1000, start-address=0x7E06C000, pages=0x1, bits=0x1, bitmap: BFF2 => BFF3 2023-06-01 10:12:05: IoMmuMap: Mapping=0x7D1F6D98 Device(PlainText)=0x7E06C000 Crypted=0x7D1F5041 Pages=0x1, ReservedMemBitmap=0x1 2023-06-01 10:12:05: VirtioNetMapTxBuf/157: >> Address 7E06C000 2023-06-01 10:12:05: IoMmuMap: Mapping=0x7D1FC398 Device(PlainText)=0x7E06C000 Crypted=0x7F7BA61D Pages=0x1, ReservedMemBitmap=0x1 That looks fishy. iommu driver hands out the same bounce buffer twice. Probably blocks 1+3 are one function which gets interrupted by block 2. Guess we need some locking: --- a/OvmfPkg/IoMmuDxe/IoMmuBuffer.c +++ b/OvmfPkg/IoMmuDxe/IoMmuBuffer.c @@ -367,7 +367,9 @@ IoMmuAllocateBounceBuffer ( { EFI_STATUS Status; UINT32 ReservedMemBitmap; + EFI_TPL OldTpl; + OldTpl = gBS->RaiseTPL (TPL_CALLBACK); ReservedMemBitmap = 0; Status = InternalAllocateBuffer ( Type, @@ -378,6 +380,7 @@ IoMmuAllocateBounceBuffer ( ); MapInfo->ReservedMemBitmap = ReservedMemBitmap; mReservedMemBitmap |= ReservedMemBitmap; + gBS->RestoreTPL (OldTpl); ASSERT (Status == EFI_SUCCESS); New scratch build: https://kojihub.stream.centos.org/koji/taskinfo?taskID=2385804 Any change in behavior? (In reply to Gerd Hoffmann from comment #8) > 2023-06-01 10:12:05: IoMmuMap: Operation=Read Host=0x7F7BA61D Bytes=0x33 > 2023-06-01 10:12:05: InternalAllocateBuffer: range-size: 1000, > start-address=0x7E06C000, pages=0x1, bits=0x1, bitmap: BFF2 => BFF3 > > 2023-06-01 10:12:05: VirtioNetTransmit/77: Buffer 7D1F5041, Size 78 > 2023-06-01 10:12:05: VirtioNetMapTxBuf/142: << Buffer 7D1F5041 > 2023-06-01 10:12:05: IoMmuMap: Operation=Read Host=0x7D1F5041 Bytes=0x4E > 2023-06-01 10:12:05: InternalAllocateBuffer: range-size: 1000, > start-address=0x7E06C000, pages=0x1, bits=0x1, bitmap: BFF2 => BFF3 > 2023-06-01 10:12:05: IoMmuMap: Mapping=0x7D1F6D98 > Device(PlainText)=0x7E06C000 Crypted=0x7D1F5041 Pages=0x1, > ReservedMemBitmap=0x1 > 2023-06-01 10:12:05: VirtioNetMapTxBuf/157: >> Address 7E06C000 > > 2023-06-01 10:12:05: IoMmuMap: Mapping=0x7D1FC398 > Device(PlainText)=0x7E06C000 Crypted=0x7F7BA61D Pages=0x1, > ReservedMemBitmap=0x1 > > That looks fishy. iommu driver hands out the same bounce buffer twice. > Probably blocks 1+3 are one function which gets interrupted by block 2. > > Guess we need some locking: > > --- a/OvmfPkg/IoMmuDxe/IoMmuBuffer.c > +++ b/OvmfPkg/IoMmuDxe/IoMmuBuffer.c > @@ -367,7 +367,9 @@ IoMmuAllocateBounceBuffer ( > { > EFI_STATUS Status; > UINT32 ReservedMemBitmap; > + EFI_TPL OldTpl; > > + OldTpl = gBS->RaiseTPL (TPL_CALLBACK); > ReservedMemBitmap = 0; > Status = InternalAllocateBuffer ( > Type, > @@ -378,6 +380,7 @@ IoMmuAllocateBounceBuffer ( > ); > MapInfo->ReservedMemBitmap = ReservedMemBitmap; > mReservedMemBitmap |= ReservedMemBitmap; > + gBS->RestoreTPL (OldTpl); > > ASSERT (Status == EFI_SUCCESS); > > New scratch build: > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2385804 > > Any change in behavior? edk2-ovmf-20230301gitf80f052277c8-4.el9.bz2211060.20230622.1203.noarch has other issues, it couldn't boot sev-es guest. Check the console, it stucks after tiancore, please check the attachment for the firmware log. Version: edk2-ovmf-20230301gitf80f052277c8-4.el9.bz2211060.20230622.1203.noarch qemu-kvm-8.0.0-5.el9.x86_64 kernel-5.14.0-331.el9.x86_64 Steps: /usr/libexec/qemu-kvm \ -S \ -name 'avocado-vt-vm1' \ -sandbox on \ -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/edk2/ovmf/OVMF_CODE.fd", "auto-read-only": true, "discard": "unmap"}' \ -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \ -blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel9-64-virtio-scsi-ovmf_qcow2_filesystem_VARS.raw", "auto-read-only": true, "discard": "unmap"}' \ -blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' \ -machine q35,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars,confidential-guest-support=lsec0,memory-backend=mem-machine_mem \ -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \ -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' \ -object sev-guest,id=lsec0,policy=7,cbitpos=51,reduced-phys-bits=1 \ -nodefaults \ -device '{"driver": "VGA", "bus": "pcie.0", "addr": "0x2"}' \ -m 4096 \ -object '{"size": 4294967296, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}' \ -smp 48,maxcpus=48,cores=24,threads=1,dies=1,sockets=2 \ -cpu 'EPYC-Milan',x2apic=on,tsc-deadline=on,hypervisor=on,tsc-adjust=on,vaes=on,vpclmulqdq=on,spec-ctrl=on,stibp=on,arch-capabilities=on,ssbd=on,cmp-legacy=on,virt-ssbd=on,lbrv=on,tsc-scale=on,vmcb-clean=on,pause-filter=on,pfthreshold=on,v-vmsave-vmload=on,vgif=on,rdctl-no=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,kvm_pv_unhalt=on \ -chardev socket,server=on,wait=off,id=qmp_id_qmpmonitor1,path=/var/tmp/avocado_as4vdz88/monitor-qmpmonitor1-20230628-030129-xipuDste \ -mon chardev=qmp_id_qmpmonitor1,mode=control \ -chardev socket,server=on,wait=off,id=qmp_id_catch_monitor,path=/var/tmp/avocado_as4vdz88/monitor-catch_monitor-20230628-030129-xipuDste \ -mon chardev=qmp_id_catch_monitor,mode=control \ -device '{"ioport": 1285, "driver": "pvpanic", "id": "idr8y3qB"}' \ -chardev socket,server=on,wait=off,id=chardev_serial0,path=/var/tmp/avocado_as4vdz88/serial-serial0-20230628-030129-xipuDste \ -device '{"id": "serial0", "driver": "isa-serial", "chardev": "chardev_serial0"}' \ -chardev socket,id=seabioslog_id_20230628-030129-xipuDste,path=/var/tmp/avocado_as4vdz88/seabios-20230628-030129-xipuDste,server=on,wait=off \ -device isa-debugcon,chardev=seabioslog_id_20230628-030129-xipuDste,iobase=0x402 \ -device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \ -device '{"driver": "qemu-xhci", "id": "usb1", "bus": "pcie-root-port-1", "addr": "0x0"}' \ -device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb1.0", "port": "1"}' \ -device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \ -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-2", "addr": "0x0"}' \ -blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel9-64-virtio-scsi-ovmf.qcow2", "cache": {"direct": true, "no-flush": false}}' \ -blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1"}' \ -device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \ -device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \ -device '{"driver": "virtio-net-pci", "mac": "9a:8f:81:2a:7b:a8", "id": "idkuetK7", "netdev": "idHAEXFR", "bus": "pcie-root-port-3", "addr": "0x0"}' \ -netdev tap,id=idHAEXFR,vhost=on \ -monitor stdio \ -vnc :0 \ -rtc base=utc,clock=host,driftfix=slew \ -boot menu=off,order=cdn,once=c,strict=off \ -enable-kvm \ 2023-06-28 03:50:27: FATAL ERROR - RaiseTpl with OldTpl(0x10) > NewTpl(0x8) Ok, picked wrong TPL level apparently, lets try s/TPL_CALLBACK/TPL_NOTIFY/. New scratch build (also updated to 2023-05 rebase): https://kojihub.stream.centos.org/koji/taskinfo?taskID=2426028 (In reply to Gerd Hoffmann from comment #8) > 2023-06-01 10:12:05: IoMmuMap: Operation=Read Host=0x7F7BA61D Bytes=0x33 > 2023-06-01 10:12:05: InternalAllocateBuffer: range-size: 1000, > start-address=0x7E06C000, pages=0x1, bits=0x1, bitmap: BFF2 => BFF3 > > 2023-06-01 10:12:05: VirtioNetTransmit/77: Buffer 7D1F5041, Size 78 > 2023-06-01 10:12:05: VirtioNetMapTxBuf/142: << Buffer 7D1F5041 > 2023-06-01 10:12:05: IoMmuMap: Operation=Read Host=0x7D1F5041 Bytes=0x4E > 2023-06-01 10:12:05: InternalAllocateBuffer: range-size: 1000, > start-address=0x7E06C000, pages=0x1, bits=0x1, bitmap: BFF2 => BFF3 > 2023-06-01 10:12:05: IoMmuMap: Mapping=0x7D1F6D98 > Device(PlainText)=0x7E06C000 Crypted=0x7D1F5041 Pages=0x1, > ReservedMemBitmap=0x1 > 2023-06-01 10:12:05: VirtioNetMapTxBuf/157: >> Address 7E06C000 > > 2023-06-01 10:12:05: IoMmuMap: Mapping=0x7D1FC398 > Device(PlainText)=0x7E06C000 Crypted=0x7F7BA61D Pages=0x1, > ReservedMemBitmap=0x1 > > That looks fishy. iommu driver hands out the same bounce buffer twice. > Probably blocks 1+3 are one function which gets interrupted by block 2. This is an extremely good observation / hunch. I've been thinking a bit about TPL usage in protocol implementations: In UEFI 2.10, "Table 7.3: TPL Restrictions", pick any line -- that line tells you the max TPL a protocol member function may be invoked at, which is also the max TPL that the protocol member function may raise the TPL to, internally. (If the TPL is already that high, then RaiseTPL will have no effect.) However, there's a third meaning to that table, which is kind of occurring to me only now. If the max TPL for a particular protocol (or protocol interface) is TPL_CALLBACK or TPL_NOTIFY (i.e., it's not restricted to TPL_APPLICATION), then that protocol interface *MUST* raise the TPL internally to that particular level as well! So it's effectively a requirement too, on implementations. Here's why: assume the protocol member is already executing at a particular TPL that's strictly below its permitted maximum TPL. (This means that, - for a max=TPL_CALLBACK interface, an agent is calling the interface directly at TPL_APPLICATION, - or for a max=TPL_NOTIFY interface, an agent is calling the interface directly at TPL_APPLICATION, - or for a max=TPL_NOTIFY interface, the interface is being called from an event notification function (callback) that is executing at TPL_CALLBACK. These are the three cases where the current TPL is strictly below the called interface's permitted max TPL.) Assume further that a timer interrupt occurs. The timer interrupt handler will consult the registered events, and may enqueue event notification functions at *higher* TPLs than the current TPL. This means the currently executing protocol interface will be *interrupted*. Assume yet further that this asynchronously invoked notification function was enqueued at a TPL that precisely equals the permitted max TPL of the interrupted protocol member function. That means the notification function is *permitted* to call the protocol member function. End to end, this means that *EVERY* protocol member function must EITHER be fully reentrant, OR must bracket itself with a RaiseTPL/RestoreTPL to its per-spec maximum permitted TPL! (So that, in the latter case, it couldn't be interrupted to begin with.) So, it's either that my reading of the UEFI spec has not been thorough enough thus far, *or* that the UEFI spec is not *clear enough* on this issue -- but either way, the TPL restrictions table effectively means that whenever we implement a protocol, we *MUST* kick the TPL up to the permitted maximum in *each* member function of the protocol. Otherwise there are going to be obscure reentrancy issues. My own treatment of this requirement has not been consistent over the years. In the VirtioNetDxe driver, I did protect every member function -- the table says for SNP: "must be <= TPL_CALLBACK", and I did raise the TPL to CALLBACK in every protocol member function. However, I didn't do the same (for example) in VirtioFsDxe, even though table 7.3 specifies <=TPL_CALLBACK for SimpleFS. Now, if a particular protocol is not listed by name, then the default (last row) takes effect: <=TPL_NOTIFY. With that, we can argue that, whenever we see a protocol member func implementation that lacks a RaiseTPL/RestoreTPL bracket, that's *in general* almost certainly a reentrancy bug! What's a further complication here BTW is that EDKII_IOMMU_PROTOCOL is not a standard protocol, so table 7.3 has no chance of covering it specifically. I guess we can still apply the default restriction: <=TPL_NOTIFY. Therefore, all member functions of an EDKII_IOMMU_PROTOCOL implementation should raise the TPL to NOTIFY upon entry. (In reply to Gerd Hoffmann from comment #11) > 2023-06-28 03:50:27: FATAL ERROR - RaiseTpl with OldTpl(0x10) > NewTpl(0x8) > > Ok, picked wrong TPL level apparently, lets try s/TPL_CALLBACK/TPL_NOTIFY/. Indeed. The following must have happened: (1) Your original attempt with gBS->RaiseTPL (TPL_CALLBACK) must certainly have solved the original issue. The original issue was that EDKII_IOMMU_PROTOCOL was being used at TPL_APPLICATION level, but then a notification function was queued (most likely due to a timer interrupt / timer event) at TPL_CALLBACK, and this notification function called into the VirtioNetDxe driver's SNP implementation (which is valid, per se, because SNP is usable up to TPL_CALLBACK). Then VirtioNetDxe indirectly called into EDKII_IOMMU_PROTOCOL as well (via VirtioMapAllBytesInSharedBuffer() / VirtioLib -> VirtioDeviceProtocol -> PciIoProtocol -> PciRootBridgeIoProtocol -> EdkiiIommuProtocol), so the iommu member function got re-entered. Your original attempt with gBS->RaiseTPL (TPL_CALLBACK) solved this, because the initial IOMMU protocol activation made itself uninterruptible -- so the timer event handling would be *delayed*. OK. (2) However, EDKII_IOMMU_PROTOCOL is almost definitely used via other avenues too (most likely some PCI device driver registers an event with a notification function to enqueue at TPL_NOTIFY, so we get PciIo -> PciRootBridgeIo -> EdkiiIommu at TPL_NOTIFY). Notably, Table 7.3 does not list PCI-related protocols by name, so they get the default <= TPL_NOTIFY restriction. And so when your gBS->RaiseTPL (TPL_CALLBACK) in the IOMMU driver is reached, the TPL is already NOTIFY, and so the behavior is undefined. IOW, the idea to use TPL_NOTIFY seems good, and not just as a random idea ("shotgun debugging"), but fully it's justified by Table 7.3 -- Table 7.3 does not treat PCI-related protocols specifically, so they can be called at TPLs up to and including NOTIFY -- which in turn means the protocol *must* bracket itself with RaiseTPL (TPL_NOTIFY). So, this is a long-winded way to say that we should probably audit *all* drivers in OvmfPkg (minimally all the virtio drivers), and add the missing locking :/ Here's a factor that mitigates the symptom somewhat, though. The virtio drivers bind individual devices. It is arguably unlikely that an application or driver uses a particular virtio device in *two ways* at the same time; namely both directly *and* via callbacks. Therefore reentrancy issues, at least at the individual protocol instance granularity, are unlikely (there could still be issues with static r/w data in the driver, of course). The edk2 IOMMU protocol is more vulnerable / exposed, because it is a *service* protocol -- the whole system has just one instance of it, and it sees a bunch of (indirect) use from lots of places. > With that, we can argue that, whenever we see a protocol member func > implementation that lacks a RaiseTPL/RestoreTPL bracket, that's *in general* > almost certainly a reentrancy bug! Depends on what the function is actually doing ... EDKII_IOMMU_PROTOCOL.Map() does first search for a bounce buffer, and then reserves it (by setting the bit in mReservedMemBitmap). That is not reentrant so it needs locking (which the patch does by raising the TPL), to avoid the function being interrupted between "search" and "reserve". EDKII_IOMMU_PROTOCOL.Unmap() will just clear the bit in mReservedMemBitmap to release the buffer, which is not critical, doing that without raising the TPL should be fine. Disclaimer: That is the bounce buffer allocation aspect of the protocol only, didn't review the other bits in detail (yet). Also note that just wrapping all callback code might not be possible because TPL_NOTITY level code is not allowed to block (see table 7-2 in uefi spec). (In reply to Laszlo Ersek from comment #14) > The edk2 IOMMU protocol is more vulnerable / exposed, because it is a > *service* protocol -- the whole system has just one instance of it, and it > sees a bunch of (indirect) use from lots of places. Yes. We see block devices and network devices calling the iommu protocol in parallel, the latter probably calling from a (polling) timer function. (In reply to Gerd Hoffmann from comment #15) > Also note that just wrapping all callback code might not be possible because > TPL_NOTITY level code is not allowed to block (see table 7-2 in uefi spec). I've not verified it, but my gut is telling me that, in case table 7.3 permits TPL_NOTIFY for a particular protocol or protocol member, then any implementation for that protocol (or protocol member) will "inherently" *not* block internally. In other words, I expect that it would be exceptional that FoobarProtocol were allowed at TPL_NOTIFY, but you'd be "tempted" to write such a Foobar implementation that blocked. I feel that this is in fact how Table 7.3 was put together in the first place! This raises an extra aspect: table 7.3 limits BlockIo and SimpleNetwork to TPL_CALLBACK, which makes sense. But the table doesn't discuss EFI_EXT_SCSI_PASS_THRU_PROTOCOL at all, as far as I can tell. Therefore ExtScsiPassThru *appears* to be permitted at TPL_NOTIFY. I think that must be wrong -- a bug in the spec. Here's why: EFI_EXT_SCSI_PASS_THRU_PROTOCOL.PassThru() *MAY* support nonblocking operation, but it is not required to. The spec says that if the protocol impl does not support non-blocking operation, then PassThru() is allowed to ignore the Event parameter altogether, and wait (block) until the request completes. That behavior however conflicts with TPL_NOTIFY. Therefore I think this must be a bug in Table 3 (i.e., omitting a restriction for ExtScsiPassThru to TPL_CALLBACK). VirtioBlkDxe and VirtioNetDxe are / would be therefore fine with TPL_CALLBACK, whereas VirtioScsiDxe is a gray area -- it handles virtio requests in a blocking manner, and it is *entitled* to that, per EFI_EXT_SCSI_PASS_THRU_PROTOCOL spec. The "gray area" seems to apply to VirtioGpuDxe and VirtioRngDxe; their respective protocols are apparently only limited to TPL_NOTIFY by the default clause in the table, however they have no way of not blocking, even at the API level (so their internal blocking virtio behavior is also valid). > (In reply to Laszlo Ersek from comment #14) > > The edk2 IOMMU protocol is more vulnerable / exposed, because it is a > > *service* protocol -- the whole system has just one instance of it, and it > > sees a bunch of (indirect) use from lots of places. > > Yes. We see block devices and network devices calling the iommu protocol > in parallel, the latter probably calling from a (polling) timer function. Ah, that's enlightening! It also raises an interesting, extra aspect. Once we modify VirtioBlkDxe to bump the TPL to TPL_CALLBACK, that will prevent a clash with VirtioNetDxe -- and this will *mask* the issue (= the lack of TPL_NOTIFY-level locking) in the IOMMU protocol implementation. Anyway, let me take a look at the virtio drivers I've written, wrt. TPL bumping. I'm somewhat concerned about the "gray area" ones; I do think EFI_EXT_SCSI_PASS_THRU_PROTOCOL should be limited to TPL_CALLBACK as well, per spec. After investigating the virtio-blk driver, my opinion is that adding TPL raising is not worth the churn. We have no practical problem (bug) to solve in the virtio device drivers, and every code modification requires review and regression-testing. And pedantically conforming would require the followig changes at least: 1. raise the TPL to CALLBACK in VirtioBlkReadBlocks, VirtioBlkWriteBlocks, VirtioBlkFlushBlocks. Not too bad thus far, but: 2. raise the TPL to NOTIFY in VirtioBlkGetDriverName (!) -- the driver exposes EFI_COMPONENT_NAME_PROTOCOL and EFI_COMPONENT_NAME2_PROTOCOL too, after all, and they are only covered by the default clause in Table 7.3 3. raise the TPL to NOTIFY in VirtioBlkDriverBindingSupported, VirtioBlkDriverBindingStart, VirtioBlkDriverBindingStop (!) -- the driver exposes EFI_DRIVER_BINDING_PROTOCOL, after all; also not covered specifically in Table 7.3 Now consider what it would require to trigger the *lack* of #1. It requires an application to open BlockIo provided by VirtioBlkDxe in EXCLUSIVE mode (if the app opens BlockIo without exclusive, the behavior is almost guaranteed to be undefined at once, because higher-level abstractions will have BlockIo open already, such as DiskIo and/or SimpleFilesystem, and such concurrent accesses are invalid inside TPL_APPLICATION context already). After that, the app would have to call ReadBlocks / WriteBlocks from normal context, and to call the same functions on the *same device* from callback (interrupt) context too. This is exorbitantly unlikely. Triggering #2 and/or #3 -- I don't even know how to make a scenario up for re-entering those. Per UEFI driver model, there's zero *reason* to call these functions from callbacks or (otherwise) an interrupt context. Re #3, you'd use ConnectController() / DisconnectController() in the first place, and again there seems to be no reason to call those from callbacks. So I'd rather suggest fixing only the service driver that has the actual bug -- the IOMMU driver. (In reply to Gerd Hoffmann from comment #11) > 2023-06-28 03:50:27: FATAL ERROR - RaiseTpl with OldTpl(0x10) > NewTpl(0x8) > > Ok, picked wrong TPL level apparently, lets try s/TPL_CALLBACK/TPL_NOTIFY/. > > New scratch build (also updated to 2023-05 rebase): > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2426028 Ping, how things are going with this build? (In reply to Gerd Hoffmann from comment #18) > (In reply to Gerd Hoffmann from comment #11) > > 2023-06-28 03:50:27: FATAL ERROR - RaiseTpl with OldTpl(0x10) > NewTpl(0x8) > > > > Ok, picked wrong TPL level apparently, lets try s/TPL_CALLBACK/TPL_NOTIFY/. > > > > New scratch build (also updated to 2023-05 rebase): > > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2426028 > > Ping, how things are going with this build? So far so good, I've run 20 tests now, all passes. Since the reproduce ratio is lower than 20%, I'm going to run more tests by today, if no failed cases, this build would be ok. Test with edk2-ovmf-20230524-1.el9.bz2211060.20230629.1509.noarch.rpm, repeat 40 tests, couldn't reproduce the issue now. Version: edk2-ovmf-20230524-1.el9.bz2211060.20230629.1509.noarch kernel-5.14.0-334.el9.x86_64 qemu-kvm-8.0.0-6.el9.x86_64 Steps: 1. Run sev-es guests with multi disks tests. Result: Run 40 cases of multi disks, no failed test. posted upstream: https://edk2.groups.io/g/devel/message/107049 (In reply to Gerd Hoffmann from comment #15) > > With that, we can argue that, whenever we see a protocol member func > > implementation that lacks a RaiseTPL/RestoreTPL bracket, that's *in general* > > almost certainly a reentrancy bug! > > Depends on what the function is actually doing ... > > EDKII_IOMMU_PROTOCOL.Map() does first search for a bounce buffer, and then > reserves it (by setting the bit in mReservedMemBitmap). That is not > reentrant > so it needs locking (which the patch does by raising the TPL), to avoid the > function being interrupted between "search" and "reserve". > > EDKII_IOMMU_PROTOCOL.Unmap() will just clear the bit in mReservedMemBitmap > to release the buffer, which is not critical, doing that without raising > the TPL should be fine. The bit clearing operation in IoMmuFreeBounceBuffer() is not an atomic operation, so it seems as though even clearing the bit should require locking against anything else that might be modifying mReservedMemBitmap. Michael Better approach being discussed upstream: https://edk2.groups.io/g/devel/message/107100 Test build: https://kojihub.stream.centos.org/koji/taskinfo?taskID=2730173 Can you test this build please? (In reply to Gerd Hoffmann from comment #27) > Better approach being discussed upstream: > https://edk2.groups.io/g/devel/message/107100 > Test build: > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2730173 > > Can you test this build please? I run 12 tests, vm can boot but it takes quite a long time. sometimes it will freeze on the "boot to hard drive " screen for 3 mins then continue to boot. And it looks the boot time grows when I repeat more tests. upload a firmware log edk2-ovmf-20230524-2.el9.bz2211060.20230822.1033.log. (In reply to zixchen from comment #28) > (In reply to Gerd Hoffmann from comment #27) > > Better approach being discussed upstream: > > https://edk2.groups.io/g/devel/message/107100 > > Test build: > > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2730173 > > > > Can you test this build please? > > I run 12 tests, vm can boot but it takes quite a long time. sometimes it > will freeze on the "boot to hard drive " screen for 3 mins then continue to > boot. And it looks the boot time grows when I repeat more tests. The builds log plenty of debug messages which takes time, so a somewhat slow boot process is expected. 3 mins seems extremely long though. How much time need the comment 11 test builds? (In reply to Gerd Hoffmann from comment #30) > (In reply to zixchen from comment #28) > > (In reply to Gerd Hoffmann from comment #27) > > > Better approach being discussed upstream: > > > https://edk2.groups.io/g/devel/message/107100 > > > Test build: > > > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2730173 > > > > > > Can you test this build please? > > > > I run 12 tests, vm can boot but it takes quite a long time. sometimes it > > will freeze on the "boot to hard drive " screen for 3 mins then continue to > > boot. And it looks the boot time grows when I repeat more tests. > > The builds log plenty of debug messages which takes time, so a somewhat > slow boot process is expected. 3 mins seems extremely long though. > > How much time need the comment 11 test builds? The comment 11 build doesn't stuck at the boot hard drive screen, the whole booting time is around 20 to 30 sec. New scratch build, based on edk2-20230524-3.el9: https://kojihub.stream.centos.org/koji/taskinfo?taskID=2749240 Can you double-check with both edk2-20230524-3.el9 and the new scratch build in a otherwise 100% identical environment? It is somewhat puzzling where the boot time jump comes from. Attempts to reproduce this with a slightly modified driver IoMmuDxe driver (activate it even without SEV) failed. If the boot time difference persist I might have to borrow a SEV machine next week to try hunting it down. (In reply to Gerd Hoffmann from comment #37) > New scratch build, based on edk2-20230524-3.el9: > https://kojihub.stream.centos.org/koji/taskinfo?taskID=2749240 > > Can you double-check with both edk2-20230524-3.el9 and the new scratch build > in a otherwise 100% identical environment? > > It is somewhat puzzling where the boot time jump comes from. > Attempts to reproduce this with a slightly modified driver IoMmuDxe driver > (activate it even without SEV) failed. > If the boot time difference persist I might have to borrow a SEV machine > next week to try hunting it down. Tested edk2-ovmf-20230524-3.el9.bz2211060.20230824.1324.noarch.rpm, repeat test 20 times, not found booting time increase issue. Also test edk2-20230524-3.el9 in the same host, same result with edk2-ovmf-20230524-3.el9.bz2211060.20230824.1324.noarch Are we going to get edk2-ovmf-20230524-3.el9.bz2211060.20230824.1324 in 9.3? QE bot(pre verify): Set 'Verified:Tested,SanityOnly' as gating/tier1 test pass. Tested edk2 and qemu gating test loops, no new bug was found. Versions: kernel-5.14.0-356.el9.x86_64 qemu-kvm-8.0.0-13.el9 edk2-ovmf-20230524-3.el9.noarch 1. Edk2_test_with_edk2-ovmf-20230524-3.el9 Job link: http://fileshare.hosts.qa.psi.pek2.redhat.com/pub/logs/edk2_test_with_edk2-ovmf-20230524-3.el9/results.html 2. Qemu_gating_test_rhel9_with_edk2_test_with_edk2-ovmf-20230524-3.el9 Job link: http://fileshare.hosts.qa.psi.pek2.redhat.com/pub/logs/qemu_gating_test_rhel9_with_edk2_test_with_edk2-ovmf-20230524-3.el9/results.html Hit a system poweroff issue, tracked by Bug 2234782 - system failed to poweroff Tested sev-es guest with vtpm device, system boots up without reseting. Versions: kernel-5.14.0-360.el9.x86_64 qemu-kvm-8.0.0-13.el9 edk2-ovmf-20230524-3.el9.noarch Steps: 1. /usr/bin/swtpm socket --ctrl type=unixio,path=/run/libvirt/qemu/swtpm/6-rhel9_sev-swtpm.sock,mode=0600 --tpmstate dir=/var/lib/libvirt/swtpm/e7f5187a-6e76-4f14-8085-69c95ad5e01f/tpm2,mode=0600 --log file=/var/log/swtpm/libvirt/qemu/rhel9_sev-swtpm.log --terminate --tpm2 2. Boot VM with OVMF.amdsev.fd and vtpm device. /usr/libexec/qemu-kvm -name guest=rhel9_sev,debug-threads=on -S -object {"qom-type":"secret","id":"masterKey0","format":"raw","file":"/var/lib/libvirt/qemu/domain-6-rhel9_sev/master-key.aes"} -blockdev {"driver":"file","filename":"/usr/share/edk2/ovmf/OVMF.amdsev.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"} -machine pc-q35-rhel9.2.0,usb=off,dump-guest-core=off,memory-backend=pc.ram,confidential-guest-support=lsec0,pflash0=libvirt-pflash0-format,hpet=off,acpi=on -accel kvm -cpu EPYC-Milan,x2apic=on,tsc-deadline=on,hypervisor=on,tsc-adjust=on,avx512f=on,avx512dq=on,avx512ifma=on,avx512cd=on,avx512bw=on,avx512vl=on,avx512vbmi=on,avx512vbmi2=on,gfni=on,vaes=on,vpclmulqdq=on,avx512vnni=on,avx512bitalg=on,avx512-vpopcntdq=on,la57=on,spec-ctrl=on,stibp=on,flush-l1d=on,arch-capabilities=on,ssbd=on,avx512-bf16=on,cmp-legacy=on,stibp-always-on=on,virt-ssbd=on,amd-psfd=on,lbrv=on,tsc-scale=on,vmcb-clean=on,pause-filter=on,pfthreshold=on,v-vmsave-vmload=on,vgif=on,no-nested-data-bp=on,lfence-always-serializing=on,null-sel-clr-base=on,auto-ibrs=on,rdctl-no=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on -m size=4194304k -object {"qom-type":"memory-backend-ram","id":"pc.ram","size":4294967296} -overcommit mem-lock=off -smp 4,sockets=1,dies=1,cores=4,threads=1 -uuid e7f5187a-6e76-4f14-8085-69c95ad5e01f -no-user-config -nodefaults -chardev socket,id=charmonitor,fd=26,server=on,wait=off -mon chardev=charmonitor,id=monitor,mode=control -rtc base=utc,driftfix=slew -global kvm-pit.lost_tick_policy=delay -no-shutdown -global ICH9-LPC.disable_s3=1 -global ICH9-LPC.disable_s4=1 -boot strict=on -device {"driver":"pcie-root-port","port":16,"chassis":1,"id":"pci.1","bus":"pcie.0","multifunction":true,"addr":"0x2"} -device {"driver":"pcie-root-port","port":17,"chassis":2,"id":"pci.2","bus":"pcie.0","addr":"0x2.0x1"} -device {"driver":"pcie-root-port","port":18,"chassis":3,"id":"pci.3","bus":"pcie.0","addr":"0x2.0x2"} -device {"driver":"pcie-root-port","port":19,"chassis":4,"id":"pci.4","bus":"pcie.0","addr":"0x2.0x3"} -device {"driver":"pcie-root-port","port":20,"chassis":5,"id":"pci.5","bus":"pcie.0","addr":"0x2.0x4"} -device {"driver":"pcie-root-port","port":21,"chassis":6,"id":"pci.6","bus":"pcie.0","addr":"0x2.0x5"} -device {"driver":"pcie-root-port","port":22,"chassis":7,"id":"pci.7","bus":"pcie.0","addr":"0x2.0x6"} -device {"driver":"pcie-root-port","port":23,"chassis":8,"id":"pci.8","bus":"pcie.0","addr":"0x2.0x7"} -device {"driver":"pcie-root-port","port":24,"chassis":9,"id":"pci.9","bus":"pcie.0","multifunction":true,"addr":"0x3"} -device {"driver":"pcie-root-port","port":25,"chassis":10,"id":"pci.10","bus":"pcie.0","addr":"0x3.0x1"} -device {"driver":"pcie-root-port","port":26,"chassis":11,"id":"pci.11","bus":"pcie.0","addr":"0x3.0x2"} -device {"driver":"pcie-root-port","port":27,"chassis":12,"id":"pci.12","bus":"pcie.0","addr":"0x3.0x3"} -device {"driver":"pcie-root-port","port":28,"chassis":13,"id":"pci.13","bus":"pcie.0","addr":"0x3.0x4"} -device {"driver":"pcie-root-port","port":29,"chassis":14,"id":"pci.14","bus":"pcie.0","addr":"0x3.0x5"} -device {"driver":"qemu-xhci","p2":15,"p3":15,"id":"usb","bus":"pci.3","addr":"0x0"} -device {"driver":"virtio-serial-pci","id":"virtio-serial0","bus":"pci.2","addr":"0x0"} -blockdev {"driver":"file","filename":"/home/rhel9_sev.qcow2","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"} -blockdev {"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-1-storage"} -device {"driver":"virtio-blk-pci","bus":"pci.4","addr":"0x0","drive":"libvirt-1-format","id":"virtio-disk0","bootindex":1,"write-cache":"on"} -netdev {"type":"tap","fd":"44","vhost":true,"vhostfd":"45","id":"hostnet0"} -device {"driver":"virtio-net-pci","netdev":"hostnet0","id":"net0","mac":"52:56:00:00:00:07","bus":"pci.1","addr":"0x0"} -add-fd set=0,fd=22,opaque=serial0-source -chardev file,id=charserial0,path=/dev/fdset/0,append=on -device {"driver":"isa-serial","chardev":"charserial0","id":"serial0","index":0} -chardev pty,id=charserial1 -device {"driver":"isa-serial","chardev":"charserial1","id":"serial1","index":1} -chardev socket,id=charchannel0,fd=23,server=on,wait=off -device {"driver":"virtserialport","bus":"virtio-serial0.0","nr":1,"chardev":"charchannel0","id":"channel0","name":"org.qemu.guest_agent.0"} -chardev socket,id=chrtpm,path=/run/libvirt/qemu/swtpm/6-rhel9_sev-swtpm.sock -tpmdev emulator,id=tpm-tpm0,chardev=chrtpm -device {"driver":"tpm-crb","tpmdev":"tpm-tpm0","id":"tpm0"} -device {"driver":"usb-tablet","id":"input0","bus":"usb.0","port":"1"} -audiodev {"id":"audio1","driver":"none"} -vnc 0.0.0.0:1,audiodev=audio1 -device {"driver":"virtio-vga","id":"video0","max_outputs":1,"bus":"pcie.0","addr":"0x1"} -global ICH9-LPC.noreboot=off -watchdog-action reset -device {"driver":"virtio-balloon-pci","id":"balloon0","bus":"pci.5","addr":"0x0"} -object {"qom-type":"rng-random","id":"objrng0","filename":"/dev/random"} -device {"driver":"virtio-rng-pci","rng":"objrng0","id":"rng0","bus":"pci.6","addr":"0x0"} -object {"qom-type":"sev-guest","id":"lsec0","cbitpos":51,"reduced-phys-bits":1,"policy":7} -sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny -msg timestamp=on Result: Guest boots up. Sanity sev test log: https://beaker.engineering.redhat.com/jobs/8225293 > Tested edk2-ovmf-20230524-3.el9.bz2211060.20230824.1324.noarch.rpm, repeat > test 20 times, not found booting time increase issue. Good, thanks. Can you give it a few more test runs to be sure the fix works? 20 times isn't that much for a reproduce rate <20%. (not urgent, it's for upstream and 9.4). > Also test edk2-20230524-3.el9 in the same host, same result with > edk2-ovmf-20230524-3.el9.bz2211060.20230824.1324.noarch > > Are we going to get edk2-ovmf-20230524-3.el9.bz2211060.20230824.1324 in 9.3? No, we stick to the patch we have in edk2-20230524-3.el9 for 9.3, but upstream (and by rebase 9.4 too) will most likely switch over. (In reply to Gerd Hoffmann from comment #44) > > Tested edk2-ovmf-20230524-3.el9.bz2211060.20230824.1324.noarch.rpm, repeat > > test 20 times, not found booting time increase issue. > > Good, thanks. > > Can you give it a few more test runs to be sure the fix works? > 20 times isn't that much for a reproduce rate <20%. > (not urgent, it's for upstream and 9.4). Test with latest compose and edk2-ovmf-20230524-3.el9.bz2211060.20230824.1324.noarch, run 100 tests, no issue found, guest booting time doesn't not increase either. Version: edk2-ovmf-20230524-3.el9.bz2211060.20230824.1324.noarch qemu-kvm-8.0.0-13.el9.x86_64 kernel-5.14.0-361.el9.x86_64 Steps: 1. Boot sev-es guest with data disks, do mirror or snapshot tests and disk checks inside guest. 2. Repeat test 100 times Result: SEV-ES guests boot and the booting time doesn't increase. Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: edk2 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2023:6330 |
Created attachment 1967857 [details] OVMF log Description of problem: SEV-es guest randomly stuck on boot to hard drive screen from powerdown and boot again Version-Release number of selected component (if applicable): kernel-5.14.0-316.el9.x86_64 edk2-ovmf-20230301gitf80f052277c8-5.el9.noarch qemu-kvm-8.0.0-3.el9.x86_64 How reproducible: <20% Steps to Reproduce: 1.Boot a sev-es guest with multi disks /usr/libexec/qemu-kvm \ -S \ -name 'avocado-vt-vm1' \ -sandbox on \ -object sev-guest,id=lsec0,policy=7,cbitpos=51,reduced-phys-bits=1 \ -blockdev '{"node-name": "file_ovmf_code", "driver": "file", "filename": "/usr/share/edk2/ovmf/OVMF_CODE.fd", "auto-read-only": true, "discard": "unmap"}' \ -blockdev '{"node-name": "drive_ovmf_code", "driver": "raw", "read-only": true, "file": "file_ovmf_code"}' \ -blockdev '{"node-name": "file_ovmf_vars", "driver": "file", "filename": "/root/avocado/data/avocado-vt/avocado-vt-vm1_rhel930-64-virtio-scsi-ovmf_qcow2_filesystem_VARS.fd", "auto-read-only": true, "discard": "unmap"}' \ -blockdev '{"node-name": "drive_ovmf_vars", "driver": "raw", "read-only": false, "file": "file_ovmf_vars"}' \ -machine q35,memory-backend=mem-machine_mem,confidential-guest-support=lsec0,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \ -device '{"id": "pcie-root-port-0", "driver": "pcie-root-port", "multifunction": true, "bus": "pcie.0", "addr": "0x1", "chassis": 1}' \ -device '{"id": "pcie-pci-bridge-0", "driver": "pcie-pci-bridge", "addr": "0x0", "bus": "pcie-root-port-0"}' \ -nodefaults \ -device '{"driver": "VGA", "bus": "pcie-pci-bridge-0", "addr": "0x1"}' \ -device '{"driver": "i6300esb", "bus": "pcie-pci-bridge-0", "addr": "0x2"}' \ -watchdog-action reset \ -m 4096 \ -object '{"size": 4294967296, "id": "mem-machine_mem", "qom-type": "memory-backend-ram"}' \ -smp 48,maxcpus=48,cores=24,threads=1,dies=1,sockets=2 \ -cpu 'host',+kvm_pv_unhalt \ -device '{"driver": "intel-hda", "bus": "pcie-pci-bridge-0", "addr": "0x3"}' \ -device '{"driver": "hda-duplex"}' \ -chardev socket,path=/var/tmp/avocado_bawr24p4/monitor-qmpmonitor1-20230525-040741-cCKZcimN,id=qmp_id_qmpmonitor1,server=on,wait=off \ -mon chardev=qmp_id_qmpmonitor1,mode=control \ -chardev socket,path=/var/tmp/avocado_bawr24p4/monitor-catch_monitor-20230525-040741-cCKZcimN,id=qmp_id_catch_monitor,server=on,wait=off \ -mon chardev=qmp_id_catch_monitor,mode=control \ -device '{"ioport": 1285, "driver": "pvpanic", "id": "idxwhk7t"}' \ -chardev socket,path=/var/tmp/avocado_bawr24p4/serial-serial0-20230525-040741-cCKZcimN,id=chardev_serial0,server=on,wait=off \ -device '{"id": "serial0", "driver": "isa-serial", "chardev": "chardev_serial0"}' \ -object '{"qom-type": "rng-random", "filename": "/dev/urandom", "id": "passthrough-Jti99k41"}' \ -device '{"id": "pcie-root-port-1", "port": 1, "driver": "pcie-root-port", "addr": "0x1.0x1", "bus": "pcie.0", "chassis": 2}' \ -device '{"driver": "virtio-rng-pci", "id": "virtio-rng-iNEnLoli", "rng": "passthrough-Jti99k41", "bus": "pcie-root-port-1", "addr": "0x0"}' \ -chardev socket,id=seabioslog_id_20230525-040741-cCKZcimN,path=/var/tmp/avocado_bawr24p4/seabios-20230525-040741-cCKZcimN,server=on,wait=off \ -device isa-debugcon,chardev=seabioslog_id_20230525-040741-cCKZcimN,iobase=0x402 \ -device '{"driver": "ich9-usb-ehci1", "id": "usb1", "addr": "0x1d.0x7", "multifunction": true, "bus": "pcie.0"}' \ -device '{"driver": "ich9-usb-uhci1", "id": "usb1.0", "multifunction": true, "masterbus": "usb1.0", "addr": "0x1d.0x0", "firstport": 0, "bus": "pcie.0"}' \ -device '{"driver": "ich9-usb-uhci2", "id": "usb1.1", "multifunction": true, "masterbus": "usb1.0", "addr": "0x1d.0x2", "firstport": 2, "bus": "pcie.0"}' \ -device '{"driver": "ich9-usb-uhci3", "id": "usb1.2", "multifunction": true, "masterbus": "usb1.0", "addr": "0x1d.0x4", "firstport": 4, "bus": "pcie.0"}' \ -device '{"id": "pcie-root-port-2", "port": 2, "driver": "pcie-root-port", "addr": "0x1.0x2", "bus": "pcie.0", "chassis": 3}' \ -device '{"driver": "qemu-xhci", "id": "usb2", "bus": "pcie-root-port-2", "addr": "0x0"}' \ -device '{"driver": "usb-tablet", "id": "usb-tablet1", "bus": "usb2.0", "port": "1"}' \ -device '{"id": "pcie-root-port-3", "port": 3, "driver": "pcie-root-port", "addr": "0x1.0x3", "bus": "pcie.0", "chassis": 4}' \ -device '{"id": "virtio_scsi_pci0", "driver": "virtio-scsi-pci", "bus": "pcie-root-port-3", "addr": "0x0"}' \ -blockdev '{"node-name": "file_image1", "driver": "file", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/home/kvm_autotest_root/images/rhel930-64-virtio-scsi-ovmf.qcow2", "cache": {"direct": true, "no-flush": false}}' \ -blockdev '{"node-name": "drive_image1", "driver": "qcow2", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "file_image1", "discard": "unmap"}' \ -device '{"driver": "scsi-hd", "id": "image1", "drive": "drive_image1", "write-cache": "on"}' \ -blockdev '{"node-name": "host_device_scsi_debug", "driver": "host_device", "auto-read-only": true, "discard": "unmap", "aio": "threads", "filename": "/dev/sdb", "cache": {"direct": true, "no-flush": false}}' \ -blockdev '{"node-name": "drive_scsi_debug", "driver": "raw", "read-only": false, "cache": {"direct": true, "no-flush": false}, "file": "host_device_scsi_debug", "discard": "unmap"}' \ -device '{"driver": "scsi-block", "id": "scsi_debug", "drive": "drive_scsi_debug"}' \ -device '{"id": "pcie-root-port-4", "port": 4, "driver": "pcie-root-port", "addr": "0x1.0x4", "bus": "pcie.0", "chassis": 5}' \ -device '{"driver": "virtio-net-pci", "mac": "9a:90:6a:05:03:ea", "id": "idoBoi5W", "netdev": "idoIkyQ3", "bus": "pcie-root-port-4", "addr": "0x0"}' \ -netdev tap,id=idoIkyQ3,vhost=on,vhostfd=16,fd=13 \ -vnc :0 \ -rtc base=utc,clock=host,driftfix=slew \ -boot menu=off,order=cdn,once=c,strict=off \ -no-hpet \ -chardev socket,id=char_vtpm_avocado-vt-vm1_tpm0,path=/root/avocado/data/avocado-vt/swtpm/avocado-vt-vm1_tpm0_swtpm.sock \ -tpmdev emulator,chardev=char_vtpm_avocado-vt-vm1_tpm0,id=emulator_vtpm_avocado-vt-vm1_tpm0 \ -device '{"id": "tpm-crb_vtpm_avocado-vt-vm1_tpm0", "tpmdev": "emulator_vtpm_avocado-vt-vm1_tpm0", "driver": "tpm-crb"}' \ -enable-kvm \ 2. Do some test/operation on disk, powerdown guest and boot again e.g. Context: Create partition on '/dev/sdb' in guest Sending command: parted -s /dev/sdb mklabel msdos && sleep 1s && parted -s /dev/sdb mkpart primary xfs 2048s 100% Sending command: echo $? Sending command: ls /sys/dev/block -l | grep "/pci" Sending command: echo $? Sending command: lsblk -o KNAME,SIZE | grep "sda " Sending command: echo $? Sending command: udevadm info -q all -n sda Sending command: echo $? Sending command: lsblk -o KNAME,SIZE | grep "sda1 " Sending command: echo $? Sending command: udevadm info -q all -n sda1 Sending command: echo $? Sending command: lsblk -o KNAME,SIZE | grep "sdb " Sending command: echo $? Sending command: udevadm info -q all -n sdb Sending command: echo $? Sending command: lsblk -o KNAME,SIZE | grep "sdb1 " Sending command: echo $? Sending command: udevadm info -q all -n sdb1 Sending command: echo $? Sending command: lsblk -o KNAME,SIZE | grep "sda2 " Sending command: echo $? Sending command: udevadm info -q all -n sda2 Sending command: echo $? Sending command: lsblk -o KNAME,SIZE | grep "sda3 " Sending command: echo $? Sending command: udevadm info -q all -n sda3 Sending command: echo $? 3. Guest has a chance to stuck at boot to hard drive screen Actual results: Guest boots fail Expected results: Guest always can boot Additional info: Add edk2 log and guest console log, no error info in host kernel log.