Bug 2087047
| Field | Value |
|---|---|
| Summary | Disk detach is unsuccessful while the guest is still booting |
| Product | Red Hat Enterprise Linux 9 |
| Component | qemu-kvm |
| qemu-kvm sub component | Devices |
| Version | 9.0 |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Keywords | Triaged, ZStream |
| Target Milestone | rc |
| Hardware | Unspecified |
| OS | Unspecified |
| Reporter | Balazs Gibizer <bgibizer> |
| Assignee | Igor Mammedov <imammedo> |
| QA Contact | Yiqian Wei <yiwei> |
| CC | afazekas, ailan, apevec, asyedham, coli, hhan, imammedo, jinzhao, jparker, jsuvorov, jusual, juzhang, kchamart, kwolf, nilal, smooney, virt-maint, xuzhang, yalzhang, yiwei, ymankad |
| Fixed In Version | qemu-kvm-8.0.0-2.el9 |
| Clones | 2186397, 2203745 (view as bug list) |
| Bug Blocks | 2012096, 2186397, 2203745 |
| Type | Bug |
| Last Closed | 2023-11-07 08:26:38 UTC |
Description
Balazs Gibizer
2022-05-17 08:14:53 UTC
This to me looks like something that needs work in QEMU, since libvirt is retrying the detach as requested. The linked issue confirms my suspicion, so I am moving this to QEMU for further triage.

Reproduced on Red Hat Enterprise Linux release 9.0 (Plow):
* kernel 5.14.0-70.13.1.el9_0.x86_64
* qemu-kvm-6.2.0-11.el9_0.2.x86_64
* seabios-bin-1.15.0-1.el9.noarch
* edk2-ovmf-20220126gitbb1bba3d77-3.el9.noarch

Test steps:

1. Create the image file if needed:

```
qemu-img create -f qcow2 /home/kvm_autotest_root/images/stg1.qcow2 1G
```

2. Boot the VM:

```
/usr/libexec/qemu-kvm \
  -name 'avocado-vt-vm1' \
  -sandbox on \
  -machine q35,memory-backend=mem-machine_mem \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
  -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
  -nodefaults \
  -device VGA,bus=pcie.0,addr=0x2 \
  -m 8G \
  -object memory-backend-ram,size=8G,id=mem-machine_mem \
  -smp 2 \
  -cpu host,vmx,+kvm_pv_unhalt \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
  -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/rhel900-64-virtio.qcow2,cache.direct=on,cache.no-flush=off \
  -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
  -device virtio-blk-pci,id=image1,drive=drive_image1,bootindex=0,write-cache=on,bus=pcie-root-port-2,addr=0x0 \
  -blockdev node-name=file_stg1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/kvm_autotest_root/images/stg1.qcow2,cache.direct=on,cache.no-flush=off \
  -blockdev node-name=drive_stg1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_stg1 \
  -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
  -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x1.0x5,bus=pcie.0,chassis=6 \
  -device virtio-net-pci,mac=9a:e1:e5:87:89:d2,id=idhDtYbt,netdev=id15e8Je,bus=pcie-root-port-5,addr=0x0 \
  -netdev tap,id=id15e8Je,vhost=on \
  -vnc :5 \
  -monitor stdio \
  -qmp tcp:0:5955,server,nowait \
  -rtc base=localtime,clock=host,driftfix=slew \
  -boot menu=off,order=cdn,once=c,strict=off \
  -enable-kvm \
  -device pcie-root-port,id=pcie_extra_root_port_0,multifunction=on,bus=pcie.0,addr=0x3,chassis=7 \
  -chardev socket,id=charserial1,path=/var/tmp/run-serial.log,server=on,wait=off \
  -device isa-serial,chardev=charserial1,id=serial1
```

3. Sleep 3 seconds.

4. Execute QMP commands to hot-plug and then immediately unplug the disk (a scripted version of steps 3 and 4 is sketched after these steps):

```
{"execute": "device_add", "arguments": {"driver": "virtio-blk-pci", "id": "stg1", "drive": "drive_stg1", "write-cache": "on", "bus": "pcie-root-port-3"}}
{"execute": "device_del", "arguments": {"id": "stg1"}}
```

Neither QMP command returns an error.

5. Wait for the guest to finish booting, log in, and check the disks with lsblk. The new disk is present in the guest, although it is expected to be absent.

6. Execute the QMP unplug command again:

```
{"execute": "device_del", "arguments": {"id": "stg1"}}
```

This time it returns an error:

```
{"error": {"class": "GenericError", "desc": "Device stg1 is already in the process of unplug"}}
```
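For reference, steps 3 and 4 can be driven from the host as a single script against the QMP socket opened with `-qmp tcp:0:5955,server,nowait` above. This is a minimal sketch, not part of the original report; it assumes the nmap-ncat `nc` binary is available on the host and that the guest is still booting when the commands arrive:

```
# Sketch only: send the hot-plug/unplug race over the QMP TCP socket.
# The 3-second delay mirrors step 3; adjust host/port to your setup.
sleep 3
{
  printf '%s\n' '{"execute": "qmp_capabilities"}'
  sleep 1
  printf '%s\n' '{"execute": "device_add", "arguments": {"driver": "virtio-blk-pci", "id": "stg1", "drive": "drive_stg1", "write-cache": "on", "bus": "pcie-root-port-3"}}'
  sleep 1
  printf '%s\n' '{"execute": "device_del", "arguments": {"id": "stg1"}}'
  sleep 1
} | nc 127.0.0.1 5955
```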
Can reproduce this bug with virtio-net-pci and virtio-blk-pci devices on the latest rhel9.1.0 host with the test steps of Comment 2.

Host version:
* qemu-kvm-7.0.0-4.el9.x86_64
* kernel-5.14.0-96.el9.x86_64
* seabios-1.16.0-2.el9.x86_64

Guest: rhel9.1.0

Test result:

Hot-plug/unplug a virtio-net-pci device in QMP:

```
{ "execute": "netdev_add", "arguments": { "type": "tap", "id": "hostnet0" } }
{ "execute": "device_add", "arguments": { "driver": "virtio-net-pci", "id": "net1", "bus": "pcie-root-port-5", "mac": "52:54:00:12:34:56", "netdev": "hostnet0" } }
{ "execute": "device_del", "arguments": { "id": "net1" } }
{"return": {}}
{"return": {}}
{"return": {}}
{ "execute": "device_del", "arguments": { "id": "net1" } }
{"error": {"class": "GenericError", "desc": "Device net1 is already in the process of unplug"}}
```

Hot-plug/unplug a virtio-blk-pci device in QMP:

```
{"execute": "device_add", "arguments": {"driver": "virtio-blk-pci", "id": "stg1", "drive": "drive_stg1", "write-cache": "on", "bus": "pcie-root-port-4"}}
{"execute": "device_del", "arguments": {"id": "stg1"}}
{"return": {}}
{"return": {}}
{"execute": "device_del", "arguments": {"id": "stg1"}}
{"error": {"class": "GenericError", "desc": "Device stg1 is already in the process of unplug"}}
```

Boot the guest with:

```
/usr/libexec/qemu-kvm \
  -name 'avocado-vt-vm1' \
  -sandbox on \
  -machine q35,memory-backend=mem-machine_mem \
  -device pcie-root-port,id=pcie-root-port-0,multifunction=on,bus=pcie.0,addr=0x1,chassis=1 \
  -device pcie-pci-bridge,id=pcie-pci-bridge-0,addr=0x0,bus=pcie-root-port-0 \
  -nodefaults \
  -device VGA,bus=pcie.0,addr=0x2 \
  -m 16G \
  -object memory-backend-ram,size=16G,id=mem-machine_mem \
  -smp 6,maxcpus=6,cores=2,threads=1,dies=1,sockets=3 \
  -cpu Icelake-Server-noTSX,enforce \
  -device pcie-root-port,id=pcie-root-port-1,port=0x1,addr=0x1.0x1,bus=pcie.0,chassis=2 \
  -device qemu-xhci,id=usb1,bus=pcie-root-port-1,addr=0x0 \
  -device usb-tablet,id=usb-tablet1,bus=usb1.0,port=1 \
  -device pcie-root-port,id=pcie-root-port-2,port=0x2,addr=0x1.0x2,bus=pcie.0,chassis=3 \
  -device virtio-scsi-pci,id=virtio_scsi_pci0,bus=pcie-root-port-2,addr=0x0 \
  -blockdev node-name=file_image1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/rhel9.1-seabios.qcow2,cache.direct=on,cache.no-flush=off \
  -blockdev node-name=drive_image1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_image1 \
  -device scsi-hd,id=image1,drive=drive_image1,write-cache=on \
  -device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
  -device virtio-net-pci,mac=9a:5d:b0:f5:04:0f,id=idlokhzs,netdev=id4YbMcO,bus=pcie-root-port-3,addr=0x0 \
  -netdev tap,id=id4YbMcO,vhost=on \
  -vnc :0 \
  -rtc base=utc,clock=host,driftfix=slew \
  -boot menu=off,order=cdn,once=c,strict=off \
  -enable-kvm \
  -monitor stdio \
  -S \
  -qmp tcp:0:4444,server=on,wait=off \
  -device pcie-root-port,id=pcie-root-port-4,port=0x4,addr=0x1.0x4,bus=pcie.0,chassis=5 \
  -blockdev node-name=file_stg1,driver=file,auto-read-only=on,discard=unmap,aio=threads,filename=/home/test.qcow2,cache.direct=on,cache.no-flush=off \
  -blockdev node-name=drive_stg1,driver=qcow2,read-only=off,cache.direct=on,cache.no-flush=off,file=file_stg1 \
  -device pcie-root-port,id=pcie-root-port-5,port=0x5,addr=0x1.0x5,bus=pcie.0,chassis=6
```
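The second device_del failing with "already in the process of unplug" means the first request is still pending and the guest never acted on it. As an illustrative check (not part of the original report), one can watch the QMP stream for the DEVICE_DELETED event that signals a completed unplug. The sketch below assumes the `-qmp tcp:0:4444,server=on,wait=off` socket from the command above and the nmap-ncat `nc` binary on the host:

```
# Sketch only: issue a single device_del and wait for the DEVICE_DELETED
# event. In the failing case the event never arrives, because the guest
# ignored the unplug request it received while still booting.
{
  printf '%s\n' '{"execute": "qmp_capabilities"}'
  sleep 1
  printf '%s\n' '{"execute": "device_del", "arguments": {"id": "net1"}}'
  sleep 30   # give the guest time to react
} | nc 127.0.0.1 4444 | grep --line-buffered DEVICE_DELETED
```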
IMHO the issue is that the guest simply ignores release requests for devices it never learned exist: the device was added and its removal requested before the guest had initialized all devices and hotplug support. The hotplug mechanism we use everywhere merely emulates hotplug hardware meant for physical machines, where nobody is expected to plug in and remove a device in the first milliseconds of boot.

If we really want to solve this kind of issue once and for all, we should probably invent a new hotplug device for virtual machines, a "cloud-plug" device, say. However, some mitigation might be possible in some cases:
- The guest OS should acknowledge release requests for devices it never initialized (a guest kernel modification).
- The guest kernel (triggered by the init system?) should do another PCI rescan to catch devices left undetected in the blind spot, i.e. the window between the initial PCI scan and hotplug initialization (a sysfs-based sketch appears at the end of this report).

The feature expected from the cloud-plug device: as long as the guest OS has not booted yet, it simply allows devices to be removed, and the virtualization layer knows this is safe. The guest OS is then expected to claim a device from the cloud-plug in order to prevent its removal, so proper handshaking is needed. The challenge is what to do with guests that do not support the new "cloud-plug"; we should probably wait 3-5+ years before we dare to make it the default expectation.

*** Bug 2080893 has been marked as a duplicate of this bug. ***

We have been discussing this regression upstream at the virtual OpenStack project team gathering (vPTG), and I just wanted to pass on the feedback that this is still a pain point for us, both upstream and in our downstream product; hopefully it is something that can be addressed with higher priority. Feel free to reach out to me as the User Advocate for the OpenStack compute team, or to our PM Erwan Gallen <egallen>, if you need additional information. This is still impacting our downstream product and affecting our upstream CI stability.

Fix posted upstream: https://www.mail-archive.com/qemu-devel@nongnu.org/msg952944.html

It's too late to merge it into this release, but it should make it into the next one. In a nutshell, this is a regression introduced in QEMU:
* v5.0
  * 'pc' machine with ACPI hotplug
  * 'q35' native PCIe hotplug
* v6.1
  * 'q35' with ACPI hotplug (default)

Fixed in:
* 6.2: 'q35' native PCIe hotplug
* TBD (8.1?): 'q35' and 'pc' ACPI hotplug (once it's merged upstream we can backport it)

We still need to look into the SHPC case, which seems to be broken as well.

QE bot (pre-verify): set 'Verified:Tested,SanityOnly' as gating/tier1 tests pass.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: qemu-kvm security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:6368
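Regarding the guest-side PCI rescan mitigation mentioned above: on a Linux guest this can be triggered manually through sysfs. A minimal sketch, not part of the original report, assuming root access inside the guest:

```
# Sketch only: ask the guest kernel to rescan all PCI buses so that a
# device hot-plugged during the boot blind spot is discovered.
echo 1 > /sys/bus/pci/rescan
# The hot-plugged disk (stg1 in the reproducer) should now show up:
lsblk
```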