Bug 2007129
| Summary: | pcie hotplug emulation has various problems due to insufficient state tracking | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux 8 | Reporter: | Gerd Hoffmann <kraxel> |
| Component: | qemu-kvm | Assignee: | Amnon Ilan <ailan> |
| qemu-kvm sub component: | PCI | QA Contact: | jingzhao <jinzhao> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | ||
| Priority: | urgent | CC: | ailan, alifshit, coli, gilboad, jinzhao, juzhang, kchamart, leiyang, markus.falb, mcasquer, mihai, mrezanin, mst, nanliu, qinwang, smooney, virt-maint, xiagao, yanghliu, yfu, yiwei, ymankad, yuhuang |
| Version: | 8.2 | Keywords: | DevelBlocker, Regression, Triaged |
| Target Milestone: | rc | Flags: | jinzhao: needinfo- jinzhao: needinfo- pm-rhel: mirror+ |
| Target Release: | 8.5 | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | qemu-kvm-6.2.0-1.module+el8.6.0+13725+61ae1949 | Doc Type: | If docs needed, set a value |
| Doc Text: | Story Points: | --- | |
| Clone Of: | Environment: | ||
| Last Closed: | 2022-05-10 13:21:40 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Assigned to Amnon for initial triage per bz process and age of bug created or assigned to virt-maint without triage.

Moving from RHEL-AV to RHEL8 since, starting with 8.6, there is no plan for a separate package. If we eventually need to create a clone to generate AV 8.5.z (and earlier) patches, we can do so once a fix is available.

For testing, the following flag should be set:
-global PIIX4_PM.acpi-pci-hotplug-with-bridge-support=off

Sorry, my bad. I made a mistake in the flag, which is what causes the warning. The actual flag should be:
-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off

Tests are good, but please repeat them several times in a cycle, and with a couple more guests. Thanks!

> tests are good but please repeat several times in a cycle, and with a couple
> more guests.
Are there autotest testcases for pcie hotplug?
If so, can you just run them all?
The issue landed in CentOS Stream 8 (bug #2024605) and is now hitting oVirt community users.

*** Bug 2024662 has been marked as a duplicate of this bug. ***

Tested hotplug/unplug of the vhost-user-fs-pci device: the guest works well and the virtiofs function works well.
qemu-kvm-6.2.0-1.module+el8.6.0+13725+61ae1949.x86_64
kernel-4.18.0-358.el8.x86_64
Guest: rhel9.0.0, win2022 - q35+seabios
Steps:
1. Boot guest with '-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off'.
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x3 \
-device pcie-root-port,port=0x11,chassis=2,id=pcie-root-port-3,bus=pcie.0,addr=0x3.0x1 \
-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off \
-chardev socket,id=char_virtiofs_fs1,path=/tmp/avocado_d0ojq02p/avocado-vt-vm1-fs1-virtiofsd.sock \
-device pcie-root-port,id=pcie-root-port-3,port=0x3,addr=0x1.0x3,bus=pcie.0,chassis=4 \
-device vhost-user-fs-pci,id=vufs_virtiofs_fs1,chardev=char_virtiofs_fs1,tag=myfs1,queue-size=1024,bus=pcie-root-port-3,addr=0x0 \
2. Hot-unplug the vhost-user-fs-pci device.
{"execute": "device_del","arguments":{"id":"vufs_virtiofs_fs1"}}
{"return": {}}
{"timestamp": {"seconds": 1641975399, "microseconds": 81035}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/vufs_virtiofs_fs1/virtio-backend"}}
{"timestamp": {"seconds": 1641975399, "microseconds": 81221}, "event": "DEVICE_DELETED", "data": {"device": "vufs_virtiofs_fs1", "path": "/machine/peripheral/vufs_virtiofs_fs1"}}
3. Hotplug it again.
{"execute": "device_add", "arguments": {"id": "vufs_virtiofs_fs1", "driver": "vhost-user-fs-pci", "bus": "pcie-root-port-3", "tag": "myfs1", "queue-size": "1024", "chardev": "char_virtiofs_fs1"}}
{"return": {}}
Results:
Pass.
After hot-unplugging the vhost-user-fs-pci device, the device is no longer present in the guest; after hotplugging it again, it comes back.
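The QMP exchange in steps 2 and 3 can be scripted. The following is a minimal sketch, not the test harness actually used here: the helper names `qmp_command` and `unplug_completed` are hypothetical, and it only builds and parses QMP JSON lines without opening a monitor socket. It relies on what the transcript above shows: the first DEVICE_DELETED event (for the virtio-backend child) carries only a "path", while the final one carries the "device" id.

```python
import json


def qmp_command(name, arguments=None):
    """Serialize one QMP command as a JSON line (hypothetical helper)."""
    msg = {"execute": name}
    if arguments:
        msg["arguments"] = arguments
    return json.dumps(msg)


def unplug_completed(lines, device_id):
    """Return True once a DEVICE_DELETED event names device_id.

    Events for child objects (such as .../virtio-backend) have no
    "device" key and are ignored, as are plain {"return": {}} replies.
    """
    for line in lines:
        msg = json.loads(line)
        if msg.get("event") != "DEVICE_DELETED":
            continue
        if msg.get("data", {}).get("device") == device_id:
            return True
    return False


# Commands matching steps 2 and 3 of the report.
unplug = qmp_command("device_del", {"id": "vufs_virtiofs_fs1"})
replug = qmp_command("device_add", {
    "id": "vufs_virtiofs_fs1",
    "driver": "vhost-user-fs-pci",
    "bus": "pcie-root-port-3",
    "tag": "myfs1",
    "queue-size": "1024",
    "chardev": "char_virtiofs_fs1",
})
```

A caller would send `unplug`, feed the monitor output to `unplug_completed` until it returns True, and only then send `replug`.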
Tested hotplug/unplug of the virtio-serial-pci device: the guest works well and the device function works well.
Test Env:
Host:
4.18.0-358.el8.x86_64
qemu-kvm-6.2.0-2.module+el8.6.0+13738+17338784.x86_64
edk2-ovmf-20210527gite1999b264f1f-3.el8.noarch
Guest: RHEL8.6.0/Win2022/Win2019
Steps:
1. Boot guests with '-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off'.
- rhel8.6.0/win2019 guests:
-machine q35 \
-smp 8,maxcpus=8,cores=4,threads=1,dies=1,sockets=2 \
-cpu 'EPYC-Rome',+kvm_pv_unhalt \
-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off \
-device pcie-root-port,id=root5,port=0x5,addr=0x1.0x5,bus=pcie.0,chassis=6 \
-device virtio-serial-pci,id=virtio-serial1,max_ports=31,bus=root5,addr=0x0 \
-chardev socket,id=channel1,host=127.0.0.1,port=2222,server=on,wait=off \
-device virtserialport,bus=virtio-serial1.0,chardev=channel1,name=port1,id=port1 \
- Win2022 guest:
-blockdev node-name=file_ovmf_code,driver=file,filename=/usr/share/OVMF/OVMF_CODE.secboot.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_code,driver=raw,read-only=on,file=file_ovmf_code \
-blockdev node-name=file_ovmf_vars,driver=file,filename=/home/win2022-64-virtio-scsi.qcow2_VARS.fd,auto-read-only=on,discard=unmap \
-blockdev node-name=drive_ovmf_vars,driver=raw,read-only=off,file=file_ovmf_vars \
-machine q35,memory-backend=mem-machine_mem,pflash0=drive_ovmf_code,pflash1=drive_ovmf_vars \
-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off \
-object memory-backend-ram,size=30720M,id=mem-machine_mem \
-smp 16,maxcpus=16,cores=8,threads=1,dies=1,sockets=2 \
-cpu 'EPYC-Rome',hv_stimer,hv_synic,hv_vpindex,hv_relaxed,hv_spinlocks=0x1fff,hv_vapic,hv_time,hv_frequencies,hv_runtime,hv_tlbflush,hv_reenlightenment,hv_stimer_direct,hv_ipi,+kvm_pv_unhalt \
-device pcie-root-port,id=root5,port=0x5,addr=0x1.0x5,bus=pcie.0,chassis=6 \
-device virtio-serial-pci,id=virtio-serial1,max_ports=31,bus=root5,addr=0x0 \
-chardev socket,id=channel1,host=127.0.0.1,port=2222,server=on,wait=off \
-device virtserialport,bus=virtio-serial1.0,chardev=channel1,name=port1,id=port1 \
2. Hot-unplug/hot-plug the virtio-serial-pci device many times.
{ 'execute': 'device_del', 'arguments': {'id': 'port1' }}
{"timestamp": {"seconds": 1642045423, "microseconds": 642601}, "event": "DEVICE_DELETED", "data": {"device": "port1", "path": "/machine/peripheral/port1"}}
{"return": {}}
{"execute":"device_del","arguments":{"id":"virtio-serial1"}}
{"timestamp": {"seconds": 1642044532, "microseconds": 183677}, "event": "DEVICE_DELETED", "data": {"path": "/machine/peripheral/virtio-serial1/virtio-backend"}}
{"timestamp": {"seconds": 1642044532, "microseconds": 183821}, "event": "DEVICE_DELETED", "data": {"device": "virtio-serial1", "path": "/machine/peripheral/virtio-serial1"}}
{"return": {}}
{"execute":"device_add", "arguments":{"driver":"virtio-serial-pci", "id":"virtio-serial1","max_ports":"31","bus":"root5","addr":"0x0"}}
{"return": {}}
{"execute":"device_add","arguments":{"driver":"virtserialport","bus":"virtio-serial1.0","chardev":"channel1","name":"port1","id":"port1","nr":"1"}}
{"return": {}}
3. Transfer data. PASS.
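Repeating step 2 "many times" is easier with a generated command sequence. The sketch below is only an illustration under stated assumptions: `hotplug_cycle_commands` is a hypothetical helper, the device arguments are copied from the QMP transcript above, and the second tuple element is the device id whose DEVICE_DELETED event a driver script should wait for before sending the next command. It preserves the ordering used in the transcript: port before controller on unplug, controller before port on plug.

```python
def hotplug_cycle_commands(cycles):
    """Yield (command, wait_for) pairs for `cycles` unplug/replug rounds.

    wait_for is the id to await in a DEVICE_DELETED event, or None
    when the plain {"return": {}} reply is sufficient.
    """
    controller = {
        "driver": "virtio-serial-pci",
        "id": "virtio-serial1",
        "max_ports": "31",
        "bus": "root5",
        "addr": "0x0",
    }
    port = {
        "driver": "virtserialport",
        "bus": "virtio-serial1.0",
        "chardev": "channel1",
        "name": "port1",
        "id": "port1",
        "nr": "1",
    }
    for _ in range(cycles):
        # Unplug the port first, then the controller it sits on.
        for dev in (port, controller):
            yield ({"execute": "device_del",
                    "arguments": {"id": dev["id"]}}, dev["id"])
        # Plug in reverse order: controller first, then the port.
        for dev in (controller, port):
            yield ({"execute": "device_add",
                    "arguments": dict(dev)}, None)


commands = list(hotplug_cycle_commands(3))
```

Each round yields four commands, so `commands` holds twelve entries for three rounds.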
Added 'Verified:Tested,SanityOnly', as the gating test with qemu-kvm-6.2.0-1.module+el8.6.0+13725+61ae1949 PASSed. It uses the default configuration in qemu-kvm-6.2, without "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off".

No new relevant issues found on block device testing.

virtiofs works without "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off". Tested hotplug/unplug on a rhel860 guest.
pkg:
qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a.x86_64
kernel-4.18.0-360.el8

Virtual network test passes without "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off". Tested hotplug/unplug of the rtl8139, e1000e and virtio-net NICs.
Test Version:
qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a.x86_64
kernel-4.18.0-360.el8.mr1880_220122_0148.x86_64

Hotplug/unplug test PASS with the virtio-serial-pci device without "-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off".
Test Env:
4.18.0-362.el8.x86_64
qemu-kvm-6.2.0-5.module+el8.6.0+14025+ca131e0a.x86_64
vfio-vf/vfio-pf/nvme-vfio test result: PASS

Virtio balloon test passed on rhel8.6 + Win2022 guests without the -global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off option.
qemu-kvm-6.2.0-2.module+el8.6.0+13738+17338784
kernel-4.18.0-358.el8.x86_64

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: virt:rhel and virt-devel:rhel security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2022:1759
So, I think the underlying problem here is that qemu simply doesn't track whether pci devices are enabled or not. But that state is essential to emulate pcie hotplug correctly.

How we handle pcie hotplug (resulting in the race seen by David Gibson):
(1) pci device is plugged in (and visible to the guest).
(2) guest is signaled.

How we should handle pcie hotplug:
(1) pci device is plugged in disabled state (address spaces disabled, config space access blocked).
(2) guest is signaled.
(3) pci device is enabled in response to the guest turning on slot power.

pcie hot-unplug should greatly benefit from enabled/disabled state too. When a device is in disabled state we should be able to simply unplug it right away, without a round-trip to the guest. For enabled devices we press the virtual attention button to nicely ask the guest, maybe only after checking that the power LED doesn't blink (blinking indicates the guest is still busy processing our previous request).

enabled/disabled state tracking should also allow removing the multifunction hotplug hacks. Well, at least in the pcie code paths; acpi hotplug probably continues to need them.

Also: unplugging the device unconditionally when the guest turns off slot power looks wrong. I think qemu should do that only if there is a pending unplug request, and otherwise simply disable the device. The guest can then virtually power-cycle individual devices using slot power.
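The proposed handling can be modelled as a small state machine. The sketch below is a toy model of the proposal, not qemu code: the `Slot` class, its method names and the notification strings are all hypothetical, and it deliberately leaves out the power LED blink check and multifunction devices.

```python
from enum import Enum


class SlotState(Enum):
    EMPTY = "empty"
    DISABLED = "disabled"  # plugged; address spaces off, config access blocked
    ENABLED = "enabled"    # guest has turned slot power on


class Slot:
    """Toy model of one pcie hotplug slot with enabled/disabled tracking."""

    def __init__(self):
        self.state = SlotState.EMPTY
        self.unplug_pending = False
        self.guest_notifications = []  # hypothetical event labels

    def host_plug(self):
        if self.state is not SlotState.EMPTY:
            raise RuntimeError("slot already occupied")
        # (1) plug in disabled state, (2) signal the guest.
        self.state = SlotState.DISABLED
        self.guest_notifications.append("presence-detect-changed")

    def guest_set_power(self, on):
        if self.state is SlotState.EMPTY:
            return
        if on:
            # (3) enable only in response to slot power on.
            self.state = SlotState.ENABLED
        elif self.unplug_pending:
            # Power off with a pending request: really unplug.
            self.state = SlotState.EMPTY
            self.unplug_pending = False
        else:
            # Power off without a request: just disable the device.
            self.state = SlotState.DISABLED

    def host_unplug_request(self):
        if self.state is SlotState.DISABLED:
            # Disabled devices can go away without a guest round-trip.
            self.state = SlotState.EMPTY
        elif self.state is SlotState.ENABLED:
            self.unplug_pending = True
            self.guest_notifications.append("attention-button-pressed")
```

In this model a guest that turns slot power off and on again power-cycles the device without losing it, which is the behaviour the last paragraph above argues for.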