Bug 2008440

Summary: ovn migration create-resources is not stable
Product: Red Hat OpenStack Reporter: Eduardo Olivares <eolivare>
Component: python-networking-ovn Assignee: OSP Team <rhos-maint>
Status: CLOSED ERRATA QA Contact: Eran Kuris <ekuris>
Severity: medium Docs Contact:
Priority: medium    
Version: 16.2 (Train) CC: apevec, ekuris, lhh, majopela, rhos-maint, rsafrono, scohen
Target Milestone: z2 Keywords: Triaged
Target Release: 16.2 (Train on RHEL 8.4)   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: python-networking-ovn-7.4.2-2.20220113214852.a2eba10.el8ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 2008425 Environment:
Last Closed: 2022-03-23 22:11:40 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 2008425    
Bug Blocks:    

Description Eduardo Olivares 2021-09-28 09:00:11 UTC
+++ This bug was initially created as a clone of Bug #2008425 +++

Description of problem:
As part of the ovs2ovn migration process, some workloads are created before the migration begins so that their health can be verified after the migration.
The script that creates those workloads checks that they respond to ping and accept SSH connections. Sometimes the SSH check fails because the created VMs did not get the public keys during the cloud-init phase [1]:
+ ssh -i /home/stack/ovn_migration/pre_migration_resources/ovn_migration_ssh_key -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null cirros@10.0.0.243 date
Warning: Permanently added '10.0.0.243' (ECDSA) to the list of known hosts.
Permission denied, please try again.
Permission denied, please try again.
cirros@10.0.0.243: Permission denied (publickey,password).
+ rc=255
+ echo 'Done with the resource creation : exiting with 255'
Done with the resource creation : exiting with 255
+ exit 255


The failure appears to be caused by a problem in cirros 0.4.0 that was resolved in a later cirros release [2].
Switching the script to cirros 0.5.2 could therefore stabilize it.
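
Independently of the image version, the SSH check in the script could be made more tolerant of slow key injection. The following is only a sketch and not the fix that was shipped: the `retry` function and its arguments are hypothetical and do not exist in create-resources.sh.j2. It reruns a command a fixed number of times and returns the exit code of the last failed attempt.

```shell
# Hypothetical helper, NOT part of create-resources.sh.j2.
# Usage: retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, otherwise the exit code of the
# last failed attempt.
retry() {
    retry_attempts=$1
    retry_delay=$2
    shift 2
    retry_i=1
    while :; do
        # Run the wrapped command; stop at the first success.
        "$@" && return 0
        retry_rc=$?
        echo "attempt ${retry_i}/${retry_attempts} failed (rc=${retry_rc})" >&2
        # Give up once the attempt budget is used.
        [ "$retry_i" -ge "$retry_attempts" ] && return "$retry_rc"
        retry_i=$((retry_i + 1))
        sleep "$retry_delay"
    done
}

# Example call (the attempt count and delay are assumed values):
# retry 5 10 ssh -i /home/stack/ovn_migration/pre_migration_resources/ovn_migration_ssh_key \
#     -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null cirros@10.0.0.243 date
```

A retry only helps if the key is installed late. If cloud-init never installs it, every attempt fails with the same "Permission denied" error, and moving to a newer cirros image remains the actual remedy.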

This script can be found in two different formats:
https://opendev.org/openstack/neutron/src/branch/master/tools/ovn_migration/tripleo_environment/playbooks/roles/resources/create/templates/create-resources.sh.j2
https://opendev.org/openstack/neutron/src/branch/master/tools/ovn_migration/infrared/tripleo-ovn-migration/roles/create-resources/templates/create-resources.sh.j2



[1] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-16.1_director-rhel-virthost-3cont_2comp_3net-ipv4-vxlan-ml2ovs-to-ovn-migration_composable/26/undercloud-0/home/stack/ovn_migration/pre_migration_resources/create-migration-resources.sh.log.gz
[2] https://github.com/cirros-dev/cirros/pull/11/commits/e40bcd2964aa496a9d03e1aaf95cf7a86938f129



Version-Release number of selected component (if applicable):


How reproducible:
Less than 10%


Steps to Reproduce:
1. run an ovs2ovn migration job

Comment 5 Roman Safronov 2022-03-14 10:11:21 UTC
Verified that the problem does not happen on RHOS-16.2-RHEL-8-20220311.n.1 with python3-networking-ovn-migration-tool-7.4.2-2.20220113214853.el8ost.noarch.rpm

Comment 10 errata-xmlrpc 2022-03-23 22:11:40 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.2), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:1001