Tempest tests for cold migration and resize fail:

> tempest.lib.exceptions.TimeoutException: Request timed out
> Details: (ServerDiskConfigTestJSON:test_resize_server_from_auto_to_manual)
> Server b592c193-88cd-4958-bf15-44b90b6531ed failed to reach VERIFY_RESIZE status and task state "None" within the required time (300 s).
> Current status: ACTIVE. Current task state: None.

Reproducible in OSP CI phase2, though possibly not always and not in all setups.

Nova compute log shows:

> 2021-04-13 18:12:34.754 8 DEBUG oslo_concurrency.processutils [req-f7a2d6f8-51a7-4f4c-9f47-9640018c0b52 314612978bc24e9eb344156a3fc7f9b8 b0fe0b6a054f468e81b851d79e358729 - default default] 'ssh -o BatchMode=yes 172.17.1.115 mkdir -p /var/lib/nova/instances/b592c193-88cd-4958-bf15-44b90b6531ed' failed. Not Retrying. execute /usr/lib/python3.6/site-packages/oslo_concurrency/processutils.py:457
> 2021-04-13 18:12:34.792 8 INFO nova.compute.manager [req-f7a2d6f8-51a7-4f4c-9f47-9640018c0b52 314612978bc24e9eb344156a3fc7f9b8 b0fe0b6a054f468e81b851d79e358729 - default default] [instance: b592c193-88cd-4958-bf15-44b90b6531ed] Setting instance back to active after: Instance rollback performed due to: Resize error: not able to execute ssh command: Unexpected error while running command.
> Command: ssh -o BatchMode=yes 172.17.1.115 mkdir -p /var/lib/nova/instances/b592c193-88cd-4958-bf15-44b90b6531ed
> Exit code: 255
> Stdout: ''
> Stderr: 'Host key verification failed.\r\n'

Could this be the same issue as https://bugs.launchpad.net/tripleo/+bug/1923403?
Versions from undercloud-0:

> ansible.noarch 2.9.19-1.el8ae @rhosp-ansible-2.9
> openstack-tempest.noarch 1:24.0.0-1.20201113224606.c73e6b1.el8ost @rhelosp-16.1
> openstack-tripleo-common.noarch 11.4.1-1.20210407183434.75bd92a.el8ost @rhelosp-16.1
> openstack-tripleo-common-containers.noarch 11.4.1-1.20210407183434.75bd92a.el8ost @rhelosp-16.1
> openstack-tripleo-heat-templates.noarch 11.3.2-1.20210408163446.29a02c1.el8ost @rhelosp-16.1
> openstack-tripleo-image-elements.noarch 10.6.2-1.20201113215051.7dc0fa1.el8ost @rhelosp-16.1
> openstack-tripleo-puppet-elements.noarch 11.2.2-1.20201114042506.f061f90.el8ost @rhelosp-16.1
> openstack-tripleo-validations.noarch 11.3.2-1.20210408103437.4db92ba.el8ost @rhelosp-16.1
> tripleo-ansible.noarch 0.5.1-1.20210323173503.902c3c8.el8ost @rhelosp-16.1

Versions from compute-1:

> ansible.noarch 2.9.19-1.el8ae @rhos-16.1-rhel-8-ansible
> puppet-nova.noarch 15.6.1-1.20201114010908.51a6857.el8ost @rhos-16.1
> ### podman images compute:
> undercloud-0.ctlplane.redhat.local:8787/rh-osbs/rhosp16-openstack-nova-compute 16.1_20210409.1 452734cc0544 8 hours ago 1.94 GB

Versions from compute-1 container nova-compute:

> 2021-04-13T01:37:06Z SUBDEBUG Installed: openstack-nova-common-1:20.4.1-1.20210406183726.1ee93b9.el8ost.noarch
> 2021-04-13T11:08:36Z SUBDEBUG Installed: openstack-nova-compute-1:20.4.1-1.20210406183726.1ee93b9.el8ost.noarch
> 2021-04-09T13:49:12Z SUBDEBUG Installed: puppet-tripleo-11.5.0-1.20210406223722.f716ef5.el8ost.noarch
> 2021-04-09T13:49:12Z SUBDEBUG Installed: openstack-tripleo-common-container-base-11.4.1-1.20210407183434.75bd92a.el8ost.noarch
This was introduced by https://bugzilla.redhat.com/show_bug.cgi?id=1911891: with ANSIBLE_INJECT_FACT_VARS=False set, tripleo_ssh_known_hosts no longer gets the ansible_ssh_host_key_rsa_public information, which is consistent with the "Host key verification failed" error in the nova-compute log.
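To illustrate the mechanism: when Ansible fact injection is disabled, gathered facts are only available under the ansible_facts dictionary (without the ansible_ prefix), and the top-level ansible_ssh_host_key_rsa_public variable is no longer defined. The Python sketch below models that lookup on a plain dictionary standing in for one host's hostvars. It is not the code from the tripleo-ansible review; the function names, the sample key, and the host names are hypothetical and only for illustration.

```python
from typing import List, Optional


def host_key_from_hostvars(hostvars_for_host: dict,
                           key_type: str = "rsa") -> Optional[str]:
    """Return the public host key for one host, or None if it is absent.

    Looks under 'ansible_facts' first (always populated after fact
    gathering), then falls back to the injected top-level variable,
    which only exists when fact injection is enabled.
    """
    fact_name = f"ssh_host_key_{key_type}_public"
    facts = hostvars_for_host.get("ansible_facts", {})
    if fact_name in facts:
        return facts[fact_name]
    # Injected form, e.g. ansible_ssh_host_key_rsa_public
    return hostvars_for_host.get(f"ansible_{fact_name}")


def known_hosts_line(names: List[str], rsa_key: str) -> str:
    """Format a known_hosts entry for an RSA key (hypothetical helper)."""
    return f"{','.join(names)} ssh-rsa {rsa_key}"


if __name__ == "__main__":
    # Hostvars as they might look with fact injection disabled (sample data).
    hostvars = {"ansible_facts": {"ssh_host_key_rsa_public": "AAAAB3NzaEXAMPLE"}}
    key = host_key_from_hostvars(hostvars)
    if key is not None:
        print(known_hosts_line(["172.17.1.115", "compute-1"], key))
```

A lookup that reads only the injected variable returns nothing for such a host, so no known_hosts entry gets written and ssh with BatchMode=yes fails host key verification.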
Waiting for https://review.opendev.org/c/openstack/tripleo-ansible/+/786159 to land on master; it should fix this once it is merged and backported.
The Phase 2 jobs referenced in comment 1 that found this BZ are passing now.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: Red Hat OpenStack Platform 16.1.6 (tripleo-ansible) security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2119