Bug 1847463
| Summary: | [OVN migration] Some overcloud nodes are missing in the ansible inventory file used for migration | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Roman Safronov <rsafrono> |
| Component: | python-networking-ovn | Assignee: | Jakub Libosvar <jlibosva> |
| Status: | CLOSED ERRATA | QA Contact: | Roman Safronov <rsafrono> |
| Severity: | urgent | Docs Contact: | |
| Priority: | high | | |
| Version: | 16.1 (Train) | CC: | amcleod, apevec, jamsmith, jlibosva, jpretori, lhh, majopela, owalsh, sclewis, scohen, spower, tfreger |
| Target Milestone: | z1 | Keywords: | AutomationBlocker, Regression, Triaged |
| Target Release: | 16.1 (Train on RHEL 8.2) | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | python-networking-ovn-7.2.1-0.20200611133438.15f2281.el8ost | Doc Type: | Bug Fix |
| Story Points: | --- | Clone Of: | |
| Environment: | | | |
| Last Closed: | 2020-08-27 15:20:02 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

Doc Text:

> This update fixes a bug that caused the `generate-inventory` step to fail during in-place migration from ML2/OVS to ML2/OVN.
>
> Note that in the Red Hat OpenStack Platform 16.1.0 (GA release), migration from ML2/OVS to ML2/OVN was not supported. As of Red Hat OpenStack Platform 16.1.1, in-place migration is supported for non-NFV deployments, with various exceptions, limitations, and requirements as described in "Migrating from ML2/OVS to ML2/OVN." [1]
>
> [1] https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.1/html-single/networking_with_open_virtual_network/index#migrating-ml2ovs-to-ovn
Description
Roman Safronov
2020-06-16 12:31:36 UTC
Script that executes the get_role_hosts function from ovn_migration.sh. The script should return the list of controller nodes; expected output on a setup with 3 controllers: controller-0 controller-1 controller-2
```bash
#!/bin/bash

get_role_hosts() {
    inventory_file=$1
    role_name=$2
    roles=$(jq -r ".${role_name}.children[]" "$inventory_file")
    for role in $roles; do
        # During the Rocky cycle the format changed to have .value.hosts
        hosts=$(jq -r --arg role "$role" 'to_entries[] | select(.key == $role) | .value.hosts[]' "$inventory_file")
        if [[ "x$hosts" == "x" ]]; then
            # But we keep backwards compatibility with nested children (Queens)
            hosts=$(jq -r --arg role "$role" 'to_entries[] | select(.key == $role) | .value.children[]' "$inventory_file")
            for host in $hosts; do
                HOSTS="$HOSTS $(jq -r --arg host "$host" 'to_entries[] | select(.key == $host) | .value.hosts[0]' "$inventory_file")"
            done
        else
            HOSTS="${hosts} ${HOSTS}"
        fi
    done
    echo $HOSTS
}

source ~/stackrc
/usr/bin/tripleo-ansible-inventory --list > /tmp/ansible-inventory.txt
get_role_hosts /tmp/ansible-inventory.txt neutron_api
```
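For illustration, the lookup the script performs can be modeled in Python against a minimal inventory. The dict below is a toy stand-in for the real tripleo-ansible-inventory output, not captured data:

```python
def get_role_hosts(inventory: dict, group: str) -> list:
    """Resolve a group's children to host names, mirroring the two
    inventory formats the shell function handles."""
    hosts = []
    for role in inventory.get(group, {}).get("children", []):
        entry = inventory.get(role, {})
        if entry.get("hosts"):
            # Rocky-and-later format: the child group lists hosts directly.
            hosts.extend(entry["hosts"])
        else:
            # Queens format: one more level of nested children.
            for child in entry.get("children", []):
                hosts.append(inventory[child]["hosts"][0])
    return hosts

# Toy OSP 16 style inventory: neutron_api -> Controller -> hosts.
inventory_16 = {
    "neutron_api": {"children": ["Controller"]},
    "Controller": {"hosts": ["controller-0", "controller-1", "controller-2"]},
}
print(get_role_hosts(inventory_16, "neutron_api"))
# ['controller-0', 'controller-1', 'controller-2']
```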
On OSP 16:

```console
(undercloud) [stack@undercloud-0 ~]$ jq -r '.neutron_api.children[]' /tmp/ansible-inventory.txt
Controller
```

On OSP 16.1:

```console
(overcloud) [stack@undercloud-0 ~]$ jq -r '.neutron_api.children[]' /tmp/ansible-inventory.txt
overcloud_neutron_api
```
Related snippet from tripleo-ansible-inventory on OSP 16:

```json
"neutron_api": {
    "children": [
        "Controller"
    ],
    "vars": {
        "ansible_ssh_user": "heat-admin"
    }
},
```

Related snippet from tripleo-ansible-inventory on OSP 16.1:

```json
"neutron_api": {
    "children": [
        "overcloud_neutron_api"
    ]
},
"overcloud_neutron_api": {
    "children": [
        "overcloud_Controller"
    ]
},
```
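The extra nesting is what loses nodes: on 16.1 the first child group (overcloud_neutron_api) has children rather than hosts, so the script falls into its Queens fallback, which takes only `.value.hosts[0]` of each nested group. A toy Python model of that path (illustrative data, not real inventory output):

```python
# Toy OSP 16.1 style inventory: one extra level of nesting.
inventory_161 = {
    "neutron_api": {"children": ["overcloud_neutron_api"]},
    "overcloud_neutron_api": {"children": ["overcloud_Controller"]},
    "overcloud_Controller": {
        "hosts": ["controller-0", "controller-1", "controller-2"]
    },
}

# Step 1: .neutron_api.children[] yields the intermediate group.
role = inventory_161["neutron_api"]["children"][0]

# Step 2: that group has no "hosts", so the Queens fallback fires
# and takes only hosts[0] of each nested child group.
nested = inventory_161[role]["children"]
hosts = [inventory_161[group]["hosts"][0] for group in nested]
print(hosts)  # ['controller-0'] -- two controllers are silently dropped
```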
Created attachment 1697608 [details]
tripleo ansible inventory osp16.1
Created attachment 1697609 [details]
tripleo ansible inventory osp16
hosts_for_migration file generated on an OSP 16.1 environment with 3 controllers and 2 computes (missing 2 controllers and 1 compute):

```ini
[ovn-dbs]
controller-0 ansible_host=192.168.24.29 ovn_central=true ansible_ssh_user=heat-admin ansible_become=true

[ovn-controllers]
compute-0 ansible_host=192.168.24.33 ansible_ssh_user=heat-admin ansible_become=true ovn_controller=true
controller-0 ansible_host=192.168.24.29 ansible_ssh_user=heat-admin ansible_become=true ovn_controller=true

[overcloud-controllers:children]
ovn-dbs

[overcloud:children]
ovn-controllers
ovn-dbs

[overcloud:vars]
remote_user=heat-admin
public_network_name=nova
image_name=cirros
working_dir=/home/stack/ovn_migration
server_user_name=cirros
validate_migration=True
overcloud_ovn_deploy_script=/home/stack/overcloud-deploy-ovn.sh
overcloudrc=/home/stack/overcloudrc
ovn_migration_backups=/var/lib/ovn-migration-backup

[overcloud-controllers:vars]
remote_user=heat-admin
public_network_name=nova
image_name=cirros
working_dir=/home/stack/ovn_migration
server_user_name=cirros
validate_migration=True
overcloud_ovn_deploy_script=/home/stack/overcloud-deploy-ovn.sh
overcloudrc=/home/stack/overcloudrc
ovn_migration_backups=/var/lib/ovn-migration-backup
```

Note: for OSP 16.1 we can run the get_role_hosts function as follows (L143 of tools/ovn_migration/tripleo_environment/ovn_migration.sh):

```console
get_role_hosts /tmp/ansible-inventory.txt overcloud_neutron_api
```

In this case the output is `controller-0 controller-1 controller-2`, as expected.

Also, L158 should be changed for OSP 16.1; it should look like:

```console
get_role_hosts ansible-inventory_osp16.1_ovs overcloud_neutron_ovs_agent
```

In this case the output is correct (the nodes where we want to launch ovn-controller after the migration): `controller-0 controller-1 controller-2 compute-0 compute-1`

Created attachment 1698297 [details]
tripleo ansible inventory osp16.1 ml2ovs
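As the note above observes, starting the lookup from the stack-prefixed group makes the existing traversal see all hosts, because from there the first level of children already carries them. A sketch with a toy inventory in the same shape as the snippets above (illustrative data):

```python
# Toy OSP 16.1 style inventory.
inventory_161 = {
    "neutron_api": {"children": ["overcloud_neutron_api"]},
    "overcloud_neutron_api": {"children": ["overcloud_Controller"]},
    "overcloud_Controller": {
        "hosts": ["controller-0", "controller-1", "controller-2"]
    },
}

# Starting from overcloud_neutron_api, its child group lists hosts
# directly, so the non-fallback branch runs and nothing is dropped.
hosts = []
for role in inventory_161["overcloud_neutron_api"]["children"]:
    hosts.extend(inventory_161[role].get("hosts", []))
print(hosts)  # ['controller-0', 'controller-1', 'controller-2']
```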
Possible solution: replace L93 in ./tools/ovn_migration/tripleo_environment/ovn_migration.sh from

```bash
roles=`jq -r \.$role_name\.children\[\] $inventory_file`
```

to

```bash
roles=`jq -r \.overcloud_$role_name\.children\[\] $inventory_file || jq -r \.$role_name\.children\[\] $inventory_file`
```

In this case the function returns the proper lists of nodes from the OSP 16 and OSP 16.1 ansible-inventory files, for both OVN and OVS.

(In reply to Roman Safronov from comment #12)

> Possible solution is to replace L93 in
> ./tools/ovn_migration/tripleo_environment/ovn_migration.sh

Cannot hard-code the 'overcloud' stack name.

> get_role_hosts /tmp/ansible-inventory.txt neutron_api

I'm a bit confused by this usage. Are you trying to get the host list for a service (e.g. neutron_api), or the host list for a role (e.g. Controller)?

(In reply to Ollie Walsh from comment #13)

> I'm a bit confused by this usage. Are you trying to get the host list for a
> service (e.g neutron_api), or the host list for a role (e.g Controller)?

We need to get the host list for a service (e.g. neutron_api). Comments from the code:

```bash
# We want to run ovn_dbs where neutron_api is running
OVN_DBS=$(get_role_hosts /tmp/ansible-inventory.txt neutron_api)
# We want to run ovn-controller where OVS agent was running before the migration
OVN_CONTROLLERS=$(get_role_hosts /tmp/ansible-inventory.txt neutron_ovs_agent)
```

Can use ansible-inventory with --graph to expand the groups.
E.g.:

```console
$ tripleo-ansible-inventory --stack overcloud --static-yaml-inventory static_inventory.yaml
$ ansible-inventory -i static_inventory.yaml --graph neutron_api
@neutron_api:
|--@overcloud_neutron_api:
|  |--@overcloud_Controller:
|  |  |--overcloud-controller-0
$ ansible-inventory -i static_inventory.yaml --graph neutron_api | sed -ne 's/^[ \t|]\+--\([a-z0-9\-]\+\)$/\1/p'
overcloud-controller-0
```

Note: ansible-inventory fails with a non-zero exit code if the group does not exist in the inventory.

Moving to z2; this was not approved for z1, which is in Blockers Only. If it meets blocker criteria for 16.1.1, please follow the blocker process.

Verified on puddle RHOS-16.1-RHEL-8-20200813.n.0 with python3-networking-ovn-migration-tool-7.2.1-0.20200611133439.15f2281.el8ost.noarch. Verified that the ansible inventory file for migration (hosts_for_migration) contains all relevant overcloud nodes.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (openstack-neutron bug fix advisory), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3568
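The sed expression's leaf extraction can be mirrored in Python. The graph text below is a hand-written sample in the `ansible-inventory --graph` shape, not captured output:

```python
import re

# Sample `ansible-inventory --graph neutron_api` output (illustrative).
graph = """\
@neutron_api:
|--@overcloud_neutron_api:
|  |--@overcloud_Controller:
|  |  |--overcloud-controller-0
"""

# Equivalent of: sed -ne 's/^[ \t|]\+--\([a-z0-9\-]\+\)$/\1/p'
# Group lines start with '@', so restricting the capture to
# [a-z0-9-] keeps only the host leaves.
leaf = re.compile(r"^[ \t|]+--([a-z0-9\-]+)$", re.MULTILINE)
print(leaf.findall(graph))  # ['overcloud-controller-0']
```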