pci(e) device removal requires a cooperating guest operating system. Typically the guest needs to learn that the device was plugged:
- either hotplug is initialized at plug time
- or an explicit PCI rescan happens after the plug (for example from an init script)

The guest must also have hotplug activated (for example via ACPI) in order to remove a device.

I wonder whether qemu/libvirt/nova repeats the remove request over time; is it expected to be repeated?

To create a clean situation and know whether the VM was at least able to boot and initialize the hotplug subsystem, we should do instance validation. ATM all known images initialize hotplug before they allow an ssh connection. (We might do more than an auth test if needed.)

2022-08-02 17:54:01.713 322369 DEBUG tempest [-] validation.run_validation = True log_opt_values /usr/lib/python3.6/site-packages/oslo_config/cfg.py:2615

The test log did not include an ssh attempt.

tempest.api.volume.test_volumes_snapshots.VolumesSnapshotTestJSON.test_snapshot_create_offline_delete_online[compute,id-5210a1de-85a0-11e6-bb21-641c676a5d61] (from full)
tempest.api.volume.test_volumes_snapshots.VolumesSnapshotTestJSON.test_snapshot_create_delete_with_volume_in_use

How reproducible: Depends on the weather; in some test jobs it almost always happens.

Additional info: cirros-0.5.2-x86_64-disk.img

Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/tempest/common/waiters.py", line 317, in wait_for_volume_resource_status
    raise lib_exc.TimeoutException(message)
tempest.lib.exceptions.TimeoutException: Request timed out
Details: volume 2153277e-0a49-44b6-a498-cf9f2414ac7a failed to reach available status (current in-use) within the required time (300 s).
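To illustrate the instance validation idea above, here is a minimal sketch that waits until the guest accepts TCP connections on its ssh port before the test goes on to detach anything. This is not tempest code: wait_for_tcp_port and its parameters are hypothetical names, and an open port is only a weaker stand-in for the ssh auth test mentioned above, on the assumption that the image brings up sshd after hotplug is initialized.

```python
import socket
import time


def wait_for_tcp_port(host, port, timeout=60.0, interval=1.0,
                      connect_timeout=5.0):
    """Return True once host:port accepts a TCP connection, False on timeout.

    Hypothetical helper: a reachable ssh port is used here as a rough
    signal that the guest finished booting.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect is enough; close the socket right away.
            with socket.create_connection((host, port),
                                          timeout=connect_timeout):
                return True
        except OSError:
            # Refused, unreachable or timed out: guest not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A test would call something like wait_for_tcp_port(server_ip, 22) after the server goes ACTIVE and only then attach and detach the volume; server_ip is a placeholder here.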
Note that this issue seems to me to have the same root cause as, or one closely related to, https://bugzilla.redhat.com/show_bug.cgi?id=2012096, which branched into several subcomponents such as qemu, libvirt, nova, ... Until there is consensus on whether these subcomponents will address it, I suggest treating any "Tempest-waiters" as a workaround and tracking them accordingly (https://bugzilla.redhat.com/show_bug.cgi?id=2012096#c39).
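For readers unfamiliar with the term, a waiter of this kind is essentially a poll-with-timeout loop. The sketch below is a generic illustration only, not the code in tempest/common/waiters.py; wait_for_status, get_status and the local TimeoutException class are hypothetical names, and the clock and sleep parameters exist just to make the loop easy to exercise.

```python
import time


class TimeoutException(Exception):
    """Raised when the resource does not reach the wanted status in time."""


def wait_for_status(get_status, wanted, timeout=300.0, interval=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll get_status() until it returns `wanted` or `timeout` seconds pass.

    get_status is a hypothetical zero-argument callable that returns the
    current status string of the resource (for example a volume).
    """
    start = clock()
    while True:
        current = get_status()
        if current == wanted:
            return current
        if clock() - start >= timeout:
            raise TimeoutException(
                "failed to reach %s status (current %s) within the "
                "required time (%s s)" % (wanted, current, timeout))
        sleep(interval)
```

Such a loop only hides the symptom: if the guest never processes the removal request, the volume stays in-use and the waiter still times out, which is why it counts as a workaround rather than a fix.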
The fix (https://review.opendev.org/c/openstack/tempest/+/852030/) is part of the 'Fixed In Version' package (openstack-tempest-31.1.0-0.20220719160757.56d259d.el9ost), which has been available since RHOS-17.0-RHEL-9-20220811.n.0. Therefore I'm marking this as VERIFIED.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 17.0 (Wallaby)), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2022:6543