Description of problem:

This was found while testing RHEL 8.4 based SEV-enabled instances for OSP 16.2 under the following RFE:

[RFE][Test Only] AMD SEV-encrypted instances
https://bugzilla.redhat.com/show_bug.cgi?id=1833442

After successfully attaching a disk to a RHEL 8.4 based SEV-enabled instance, the request to detach the disk never completes, with the following trace eventually logged:

[ 7.773877] pcieport 0000:00:02.5: Slot(0-5): Attention button pressed
[ 7.774743] pcieport 0000:00:02.5: Slot(0-5) Powering on due to button press
[ 7.775714] pcieport 0000:00:02.5: Slot(0-5): Card present
[ 7.776403] pcieport 0000:00:02.5: Slot(0-5): Link Up
[ 7.903183] pci 0000:06:00.0: [1af4:1042] type 00 class 0x010000
[ 7.904095] pci 0000:06:00.0: reg 0x14: [mem 0x00000000-0x00000fff]
[ 7.905024] pci 0000:06:00.0: reg 0x20: [mem 0x00000000-0x00003fff 64bit pref]
[ 7.906977] pcieport 0000:00:02.5: bridge window [io 0x1000-0x0fff] to [bus 06] add_size 1000
[ 7.908069] pcieport 0000:00:02.5: BAR 13: no space for [io size 0x1000]
[ 7.908917] pcieport 0000:00:02.5: BAR 13: failed to assign [io size 0x1000]
[ 7.909832] pcieport 0000:00:02.5: BAR 13: no space for [io size 0x1000]
[ 7.910667] pcieport 0000:00:02.5: BAR 13: failed to assign [io size 0x1000]
[ 7.911586] pci 0000:06:00.0: BAR 4: assigned [mem 0x800600000-0x800603fff 64bit pref]
[ 7.912616] pci 0000:06:00.0: BAR 1: assigned [mem 0x80400000-0x80400fff]
[ 7.913472] pcieport 0000:00:02.5: PCI bridge to [bus 06]
[ 7.915762] pcieport 0000:00:02.5: bridge window [mem 0x80400000-0x805fffff]
[ 7.917525] pcieport 0000:00:02.5: bridge window [mem 0x800600000-0x8007fffff 64bit pref]
[ 7.920252] virtio-pci 0000:06:00.0: enabling device (0000 -> 0002)
[ 7.924487] virtio_blk virtio4: [vdb] 2097152 512-byte logical blocks (1.07 GB/1.00 GiB)
[ 7.926616] vdb: detected capacity change from 0 to 1073741824
[ .. ]
[ 246.751028] INFO: task irq/29-pciehp:173 blocked for more than 120 seconds.
[ 246.752801] Not tainted 4.18.0-305.el8.x86_64 #1
[ 246.753902] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.755457] irq/29-pciehp D 0 173 2 0x80004000
[ 246.756616] Call Trace:
[ 246.757328] __schedule+0x2c4/0x700
[ 246.758185] schedule+0x38/0xa0
[ 246.758966] io_schedule+0x12/0x40
[ 246.759801] do_read_cache_page+0x513/0x770
[ 246.760761] ? blkdev_writepages+0x10/0x10
[ 246.761692] ? file_fdatawait_range+0x20/0x20
[ 246.762659] read_part_sector+0x38/0xda
[ 246.763554] read_lba+0x10f/0x220
[ 246.764367] efi_partition+0x1e4/0x6de
[ 246.765245] ? snprintf+0x49/0x60
[ 246.766046] ? is_gpt_valid.part.5+0x430/0x430
[ 246.766991] blk_add_partitions+0x164/0x3f0
[ 246.767915] ? blk_drop_partitions+0x91/0xc0
[ 246.768863] bdev_disk_changed+0x65/0xd0
[ 246.769748] __blkdev_get+0x3c4/0x510
[ 246.770595] blkdev_get+0xaf/0x180
[ 246.771394] __device_add_disk+0x3de/0x4b0
[ 246.772302] virtblk_probe+0x4ba/0x8a0 [virtio_blk]
[ 246.773313] virtio_dev_probe+0x158/0x1f0
[ 246.774208] really_probe+0x255/0x4a0
[ 246.775046] ? __driver_attach_async_helper+0x90/0x90
[ 246.776091] driver_probe_device+0x49/0xc0
[ 246.776965] bus_for_each_drv+0x79/0xc0
[ 246.777813] __device_attach+0xdc/0x160
[ 246.778669] bus_probe_device+0x9d/0xb0
[ 246.779523] device_add+0x418/0x780
[ 246.780321] register_virtio_device+0x9e/0xe0
[ 246.781254] virtio_pci_probe+0xb3/0x140
[ 246.782124] local_pci_probe+0x41/0x90
[ 246.782937] pci_device_probe+0x105/0x1c0
[ 246.783807] really_probe+0x255/0x4a0
[ 246.784623] ? __driver_attach_async_helper+0x90/0x90
[ 246.785647] driver_probe_device+0x49/0xc0
[ 246.786526] bus_for_each_drv+0x79/0xc0
[ 246.787364] __device_attach+0xdc/0x160
[ 246.788205] pci_bus_add_device+0x4a/0x90
[ 246.789063] pci_bus_add_devices+0x2c/0x70
[ 246.789916] pciehp_configure_device+0x91/0x130
[ 246.790855] pciehp_handle_presence_or_link_change+0x334/0x460
[ 246.791985] pciehp_ist+0x1a2/0x1b0
[ 246.792768] ? irq_finalize_oneshot.part.47+0xf0/0xf0
[ 246.793768] irq_thread_fn+0x1f/0x50
[ 246.794550] irq_thread+0xe7/0x170
[ 246.795299] ? irq_forced_thread_fn+0x70/0x70
[ 246.796190] ? irq_thread_check_affinity+0xe0/0xe0
[ 246.797147] kthread+0x116/0x130
[ 246.797841] ? kthread_flush_work_fn+0x10/0x10
[ 246.798735] ret_from_fork+0x22/0x40
[ 246.799523] INFO: task sfdisk:1129 blocked for more than 120 seconds.
[ 246.800717] Not tainted 4.18.0-305.el8.x86_64 #1
[ 246.801733] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 246.803155] sfdisk D 0 1129 1107 0x00004080
[ 246.804225] Call Trace:
[ 246.804827] __schedule+0x2c4/0x700
[ 246.805590] ? submit_bio+0x3c/0x160
[ 246.806373] schedule+0x38/0xa0
[ 246.807089] schedule_preempt_disabled+0xa/0x10
[ 246.807990] __mutex_lock.isra.6+0x2d0/0x4a0
[ 246.808876] ? wake_up_q+0x80/0x80
[ 246.809636] ? fdatawait_one_bdev+0x20/0x20
[ 246.810508] iterate_bdevs+0x98/0x142
[ 246.811304] ksys_sync+0x6e/0xb0
[ 246.812041] __ia32_sys_sync+0xa/0x10
[ 246.812820] do_syscall_64+0x5b/0x1a0
[ 246.813613] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 246.814652] RIP: 0033:0x7fa9c04924fb
[ 246.815431] Code: Unable to access opcode bytes at RIP 0x7fa9c04924d1.
[ 246.816655] RSP: 002b:00007fff47661478 EFLAGS: 00000246 ORIG_RAX: 00000000000000a2
[ 246.818047] RAX: ffffffffffffffda RBX: 000055d79fc512f0 RCX: 00007fa9c04924fb
[ 246.824526] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 000055d79fc512f0
[ 246.825714] RBP: 0000000000000000 R08: 000055d79fc51012 R09: 0000000000000006
[ 246.826941] R10: 000000000000000a R11: 0000000000000246 R12: 00007fa9c075e6e0
[ 246.828169] R13: 000055d79fc58c80 R14: 0000000000000001 R15: 00007fff47661590

This has also been reproduced with PCIe-based NICs in the same environment. The full QEMU log, including the launch command line, is provided below.
Version-Release number of selected component (if applicable):

qemu-kvm-block-rbd-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
qemu-kvm-block-curl-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
qemu-kvm-common-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
qemu-kvm-block-ssh-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
qemu-kvm-ui-opengl-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
qemu-kvm-block-gluster-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
qemu-kvm-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
qemu-img-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
ipxe-roms-qemu-20181214-8.git133f4c47.el8.noarch
qemu-kvm-core-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
qemu-kvm-block-iscsi-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
qemu-kvm-ui-spice-5.2.0-16.module+el8.4.0+10806+b7d97207.x86_64
libvirt-daemon-driver-qemu-7.0.0-14.module+el8.4.0+10886+79296686.x86_64

How reproducible:
Always.

Steps to Reproduce:
1. Hot plug a PCIe device into a RHEL 8.4 based SEV-enabled instance.
2. Attempt to hot unplug said device.

Actual results:
The request to hot unplug fails and the guest OS eventually logs the above trace.

Expected results:
The request to hot unplug succeeds.

Additional info:

2021-06-02 18:58:48.515+0000: starting up libvirt version: 7.0.0, package: 14.module+el8.4.0+10886+79296686 (Red Hat, Inc. <http://bugzilla.redhat.com/bugzilla>, 2021-05-06-06:29:31, ), qemu version: 5.2.0qemu-kvm-5.2.0-16.module+el8.4.0+10806+b7d97207, kernel: 4.18.0-305.el8.x86_64, hostname: computeamdsev-0.localdomain
LC_ALL=C \
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin \
HOME=/var/lib/libvirt/qemu/domain-182-instance-0000010e \
XDG_DATA_HOME=/var/lib/libvirt/qemu/domain-182-instance-0000010e/.local/share \
XDG_CACHE_HOME=/var/lib/libvirt/qemu/domain-182-instance-0000010e/.cache \
XDG_CONFIG_HOME=/var/lib/libvirt/qemu/domain-182-instance-0000010e/.config \
QEMU_AUDIO_DRV=none \
/usr/libexec/qemu-kvm \
-name guest=instance-0000010e,debug-threads=on \
-S \
-object secret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-182-instance-0000010e/master-key.aes \
-blockdev '{"driver":"file","filename":"/usr/share/OVMF/OVMF_CODE.secboot.fd","node-name":"libvirt-pflash0-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash0-format","read-only":true,"driver":"raw","file":"libvirt-pflash0-storage"}' \
-blockdev '{"driver":"file","filename":"/var/lib/libvirt/qemu/nvram/instance-0000010e_VARS.fd","node-name":"libvirt-pflash1-storage","auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-pflash1-format","read-only":false,"driver":"raw","file":"libvirt-pflash1-storage"}' \
-machine pc-q35-rhel8.4.0,accel=kvm,usb=off,dump-guest-core=off,memory-encryption=sev0,pflash0=libvirt-pflash0-format,pflash1=libvirt-pflash1-format,memory-backend=pc.ram \
-cpu EPYC-Rome,x2apic=on,tsc-deadline=on,hypervisor=on,tsc-adjust=on,spec-ctrl=on,stibp=on,arch-capabilities=on,ssbd=on,xsaves=on,cmp-legacy=on,ibrs=on,amd-ssbd=on,virt-ssbd=on,rdctl-no=on,skip-l1dfl-vmentry=on,mds-no=on,pschange-mc-no=on,svm=off,npt=off,nrip-save=off \
-m 2048 \
-object memory-backend-ram,id=pc.ram,size=2147483648 \
-overcommit mem-lock=on \
-smp 2,sockets=2,dies=1,cores=1,threads=1 \
-uuid db27e653-2c69-453e-84f1-6d6189fc61ae \
-smbios 'type=1,manufacturer=Red Hat,product=OpenStack Compute,version=20.6.1-2.20210510134812.10df176.el8ost.2,serial=db27e653-2c69-453e-84f1-6d6189fc61ae,uuid=db27e653-2c69-453e-84f1-6d6189fc61ae,family=Virtual Machine' \
-no-user-config \
-nodefaults \
-chardev socket,id=charmonitor,fd=34,server,nowait \
-mon chardev=charmonitor,id=monitor,mode=control \
-rtc base=utc,driftfix=slew \
-global kvm-pit.lost_tick_policy=delay \
-no-hpet \
-no-shutdown \
-boot strict=on \
-device pcie-root-port,port=0x10,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x2 \
-device pcie-root-port,port=0x11,chassis=2,id=pci.2,bus=pcie.0,addr=0x2.0x1 \
-device pcie-root-port,port=0x12,chassis=3,id=pci.3,bus=pcie.0,addr=0x2.0x2 \
-device pcie-root-port,port=0x13,chassis=4,id=pci.4,bus=pcie.0,addr=0x2.0x3 \
-device pcie-root-port,port=0x14,chassis=5,id=pci.5,bus=pcie.0,addr=0x2.0x4 \
-device pcie-root-port,port=0x15,chassis=6,id=pci.6,bus=pcie.0,addr=0x2.0x5 \
-device pcie-root-port,port=0x16,chassis=7,id=pci.7,bus=pcie.0,addr=0x2.0x6 \
-device pcie-root-port,port=0x17,chassis=8,id=pci.8,bus=pcie.0,addr=0x2.0x7 \
-device pcie-root-port,port=0x18,chassis=9,id=pci.9,bus=pcie.0,multifunction=on,addr=0x3 \
-device pcie-root-port,port=0x19,chassis=10,id=pci.10,bus=pcie.0,addr=0x3.0x1 \
-device pcie-root-port,port=0x1a,chassis=11,id=pci.11,bus=pcie.0,addr=0x3.0x2 \
-device pcie-root-port,port=0x1b,chassis=12,id=pci.12,bus=pcie.0,addr=0x3.0x3 \
-device pcie-root-port,port=0x1c,chassis=13,id=pci.13,bus=pcie.0,addr=0x3.0x4 \
-device pcie-root-port,port=0x1d,chassis=14,id=pci.14,bus=pcie.0,addr=0x3.0x5 \
-device pcie-root-port,port=0x1e,chassis=15,id=pci.15,bus=pcie.0,addr=0x3.0x6 \
-device pcie-root-port,port=0x1f,chassis=16,id=pci.16,bus=pcie.0,addr=0x3.0x7 \
-device pcie-root-port,port=0x20,chassis=17,id=pci.17,bus=pcie.0,multifunction=on,addr=0x4 \
-device pcie-root-port,port=0x21,chassis=18,id=pci.18,bus=pcie.0,addr=0x4.0x1 \
-device pcie-root-port,port=0x22,chassis=19,id=pci.19,bus=pcie.0,addr=0x4.0x2 \
-device pcie-root-port,port=0x23,chassis=20,id=pci.20,bus=pcie.0,addr=0x4.0x3 \
-device pcie-root-port,port=0x24,chassis=21,id=pci.21,bus=pcie.0,addr=0x4.0x4 \
-device pcie-root-port,port=0x25,chassis=22,id=pci.22,bus=pcie.0,addr=0x4.0x5 \
-device pcie-root-port,port=0x26,chassis=23,id=pci.23,bus=pcie.0,addr=0x4.0x6 \
-device pcie-root-port,port=0x27,chassis=24,id=pci.24,bus=pcie.0,addr=0x4.0x7 \
-device qemu-xhci,id=usb,bus=pci.2,addr=0x0 \
-blockdev '{"driver":"file","filename":"/var/lib/nova/instances/_base/8b0d11633025a596af2b8dcaf94a639e4f71a0cf","node-name":"libvirt-2-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-2-format","read-only":true,"cache":{"direct":true,"no-flush":false},"driver":"raw","file":"libvirt-2-storage"}' \
-blockdev '{"driver":"file","filename":"/var/lib/nova/instances/db27e653-2c69-453e-84f1-6d6189fc61ae/disk","node-name":"libvirt-1-storage","cache":{"direct":true,"no-flush":false},"auto-read-only":true,"discard":"unmap"}' \
-blockdev '{"node-name":"libvirt-1-format","read-only":false,"cache":{"direct":true,"no-flush":false},"driver":"qcow2","file":"libvirt-1-storage","backing":"libvirt-2-format"}' \
-device virtio-blk-pci,iommu_platform=on,bus=pci.3,addr=0x0,drive=libvirt-1-format,id=virtio-disk0,bootindex=1,write-cache=on \
-netdev tap,fd=197,id=hostnet0,vhost=on,vhostfd=198 \
-device virtio-net-pci,rx_queue_size=512,host_mtu=1450,netdev=hostnet0,id=net0,mac=fa:16:3e:e8:31:85,bus=pci.1,addr=0x0,iommu_platform=on \
-add-fd set=3,fd=200 \
-chardev pty,id=charserial0,logfile=/dev/fdset/3,logappend=on \
-device isa-serial,chardev=charserial0,id=serial0 \
-device usb-tablet,id=input0,bus=usb.0,port=1 \
-vnc 172.16.2.147:1 \
-device cirrus-vga,id=video0,bus=pcie.0,addr=0x1 \
-device virtio-balloon-pci,id=balloon0,bus=pci.4,addr=0x0,iommu_platform=on \
-object rng-random,id=objrng0,filename=/dev/urandom \
-device virtio-rng-pci,rng=objrng0,id=rng0,iommu_platform=on,bus=pci.5,addr=0x0 \
-object sev-guest,id=sev0,cbitpos=47,reduced-phys-bits=1,policy=0x33 \
-sandbox on,obsolete=deny,elevateprivileges=deny,spawn=deny,resourcecontrol=deny \
-msg timestamp=on
char device redirected to /dev/pts/4 (label charserial0)
2021-06-02T18:58:49.178263Z qemu-kvm: -device cirrus-vga,id=video0,bus=pcie.0,addr=0x1: warning: 'cirrus-vga' is deprecated, please use a different VGA card instead
2021-06-02T18:59:02.271928Z qemu-kvm: Guest says index 352 is available
Looking at that backtrace, it feels more like the hot-add never actually completed. Can you confirm the XML fragment you passed to libvirt to do the hot-add? In particular, I'm curious whether it has the iommu_platform stuff.
(In reply to Dr. David Alan Gilbert from comment #1)
> Looking at that back trace, it feels more like the hot-add never actually
> completed.
> Can you confirm the XML fragment you passed to libvirt to do the hotadd?
> In particular I'm curious whether it has the iommu_platform stuff.

Many thanks, the attach XML was:

<disk type="block" device="disk">
  <driver name="qemu" type="raw" cache="none" io="native"/>
  <source dev="/dev/sdc"/>
  <target bus="virtio" dev="vdb"/>
  <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
</disk>

Later the detach device XML was:

<disk type="block" device="disk">
  <driver name="qemu" type="raw" cache="none" io="native"/>
  <source dev="/dev/sdc"/>
  <target bus="virtio" dev="vdb"/>
  <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
  <address type="pci" domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
</disk>
(In reply to Lee Yarwood from comment #2)
> Many thanks, the attach XML was:
> 
> <disk type="block" device="disk">
>   <driver name="qemu" type="raw" cache="none" io="native"/>
>   <source dev="/dev/sdc"/>
>   <target bus="virtio" dev="vdb"/>
>   <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
> </disk>

I think that's missing the iommu part, i.e. a:

<driver iommu='on'/>

which you need for all SEV devices.
(I also thought there was a preference for virtio-scsi on SEV, but you seem to be fine on your boot disk.)

Dave

> Later the detach device XML was:
> 
> <disk type="block" device="disk">
>   <driver name="qemu" type="raw" cache="none" io="native"/>
>   <source dev="/dev/sdc"/>
>   <target bus="virtio" dev="vdb"/>
>   <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
>   <address type="pci" domain="0x0000" bus="0x06" slot="0x00" function="0x0"/>
> </disk>
(In reply to Dr. David Alan Gilbert from comment #3)
> I think that's missing the iommu part, i.e a
> <driver iommu='on'/>
> 
> which you need for all SEV devices.
> (I also thought there was a preference for virtio-scsi on sev, but you seem
> to be fine on your boot disk).

Yeah, it appears there's missing logic within openstack-nova when hot-plugging devices into a SEV instance, as opposed to launching with them already attached. Moving this back over to openstack-nova, thanks for the help with this!
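For reference, applying the suggested fix to the attach XML from comment #2 would give something like the following sketch. This is only an illustration of where the iommu attribute would sit on the disk's driver element; it has not been tested here, and Nova generates this XML itself, so the real fix belongs in openstack-nova:

```xml
<disk type="block" device="disk">
  <!-- iommu="on" tells libvirt to enable the virtio iommu_platform
       flag for this device, which SEV guests require for all virtio
       devices (as already done for the boot disk at launch) -->
  <driver name="qemu" type="raw" cache="none" io="native" iommu="on"/>
  <source dev="/dev/sdc"/>
  <target bus="virtio" dev="vdb"/>
  <serial>b11ce83a-723a-49a2-a5cc-025cb8985b0d</serial>
</disk>
```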
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform (RHOSP) 16.2 enhancement advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2021:3483