Description of problem:
I have a Jenkins job that fails on the overcloud upgrade when upgrading OSP13 to OSP14 with composable roles. We have a side issue with IR (it causes a failure on the UC upgrade) that has a WA, but even with the WA applied the next step fails. The failure seems to be related to 'merge-new-params-nic-config-script.py'.

Job output can be found here:
https://rhos-qe-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/DFG/view/network/view/neutron/job/DFG-network-neutron-upgrade-13-14_director-rhel-virthost-3cont_2comp_2net-ipv4-vxlan-composable/25/console

Version-Release number of selected component (if applicable):
OSP13

How reproducible:
100%

Steps to Reproduce:
1. Run the job using the IR master branch (the WA for the UC issue).
2. See the failure at overcloud-upgrade and debug the results.

Actual results:
Job failing

Expected results:
Job passing

Additional info:
Assigning to HardProv to look at.
Roee - can we see the templates and deployment command being used? It's not clear how to access them from the failed test link. It looks like there's a mismatch between the use of deprecated_nic_config_name in roles_data.yaml and the role_name. The error is here:

    # If deprecated_nic_config_names is set for role the deprecated name must
    # be used when loading the reference file.
    with open(OPTS.roles_data) as roles_data_file:
        roles_data = yaml.safe_load(roles_data_file)
    nic_config_name = next((x.get('deprecated_nic_config_name',
                                  OPTS.role_name.lower() + '.yaml')
                            for x in roles_data if x['name'] == OPTS.role_name))
Thanks Roee. The templates at that link use only compute.yaml and controller.yaml as NIC config files:

    # Specify the relative/absolute path to the config files you want to use for override the default.
    OS::TripleO::ComputeSriov::Net::SoftwareConfig: nic-configs/compute.yaml
    OS::TripleO::Controller::Net::SoftwareConfig: nic-configs/controller.yaml

However, the test fails because the script is run against the files in /home/stack/composable_roles, e.g.:

    /home/stack/composable_roles/network/nic-configs//swift-storage.yaml
    /home/stack/composable_roles/network/nic-configs//database_internal.yaml

This is the script failure, run against a NIC config yaml file that isn't needed for this test:

    /home/stack/composable_roles/roles/nodes.yaml | awk -F '::' '{ print $3 }' );
    python /usr/share/openstack-tripleo-heat-templates/tools/merge-new-params-nic-config-script.py --tht-dir /usr/share/openstack-tripleo-heat-templates --role-name $NIC_ROLE_NAME --roles-data /home/stack/composable_roles/roles/roles_data.yaml --discard-comments yes --template /home/stack/composable_roles/network/nic-configs//swift-storage.yaml
The code [1] that fails in the script is:

    nic_config_name = next((x.get('deprecated_nic_config_name',
                                  OPTS.role_name.lower() + '.yaml')
                            for x in roles_data if x['name'] == OPTS.role_name))

The StopIteration exception [2] indicates that it iterated through all the roles without finding a match. So my guess is that whatever the CI job assigns to NIC_ROLE_NAME is not a role name in roles_data.

Notice that the grep command uses a double forward slash, ``nic-configs//swift-storage.yaml``, i.e.:

    NIC_ROLE_NAME=$( grep /home/stack/composable_roles/network/nic-configs//swift-storage.yaml /home/stack/composable_roles/roles/nodes.yaml

Are there two forward slashes in the string you are searching for in /home/stack/composable_roles/roles/nodes.yaml? My guess is that there are not. Maybe you can use the dirname and basename commands in the CI automation, or simply remove the additional slash that is inserted.

[1] https://github.com/openstack/tripleo-heat-templates/blame/master/tools/merge-new-params-nic-config-script.py#L213-L215
[2] https://docs.python.org/2/library/exceptions.html#exceptions.StopIteration
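To illustrate the double-slash guess, here is a minimal Python sketch of the grep/awk lookup. The nodes.yaml line is a hypothetical stand-in (the real file contents are not shown in this bug), and role_for_template is an illustrative helper, not part of the CI automation or of tripleo-heat-templates. It assumes nodes.yaml references the template with a single slash.

```python
import posixpath

# Hypothetical stand-in for a line in nodes.yaml; the real contents are
# not shown in this bug. Assumes the path is written with a single slash.
NODES_YAML_LINE = (
    "OS::TripleO::SwiftStorage::Net::SoftwareConfig: "
    "/home/stack/composable_roles/network/nic-configs/swift-storage.yaml"
)


def role_for_template(template_path, lines):
    """Roughly mimic: grep <path> nodes.yaml | awk -F '::' '{ print $3 }'."""
    for line in lines:
        if template_path in line:
            fields = line.split("::")
            if len(fields) >= 3:
                return fields[2]
    # No matching line: the shell pipeline would yield an empty string too.
    return ""


raw = "/home/stack/composable_roles/network/nic-configs//swift-storage.yaml"
# normpath collapses the doubled slash inside the path.
normalized = posixpath.normpath(raw)

print(repr(role_for_template(raw, [NODES_YAML_LINE])))
print(repr(role_for_template(normalized, [NODES_YAML_LINE])))
```

Under that assumption the path with the doubled slash matches nothing, so the role name comes back empty, while the normalized path yields the role. In the shell automation, removing the extra slash before the grep would have the same effect.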
Has this been resolved via infrared?
Thanks Yurii. So it seems that this particular issue isn't a bug since the role wasn't set, but it would be useful if the script generated a clear warning message instead of the "StopIteration" exception. Do you agree?
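As a rough sketch of what a clearer message could look like (this is not the actual change made to the script; find_nic_config_name and the sample roles list are hypothetical), the lookup quoted earlier in this thread can pass a default to next() and exit with an explicit error when no role matches:

```python
import sys


def find_nic_config_name(roles_data, role_name):
    """Return the NIC config file name for role_name, or exit with a message."""
    # Giving next() a default avoids the bare StopIteration on no match.
    role = next((r for r in roles_data if r.get("name") == role_name), None)
    if role is None:
        known = ", ".join(r.get("name", "?") for r in roles_data)
        sys.exit("Role '%s' not found in roles data. Known roles: %s"
                 % (role_name, known))
    # Same fallback as the original lookup: deprecated name if set,
    # otherwise the lower-cased role name plus '.yaml'.
    return role.get("deprecated_nic_config_name", role_name.lower() + ".yaml")


# Hypothetical roles data, shaped like the list loaded from roles_data.yaml.
roles_data = [
    {"name": "Controller"},
    {"name": "Compute", "deprecated_nic_config_name": "compute.yaml"},
]

print(find_nic_config_name(roles_data, "Controller"))
```

With an empty or unknown role name, the helper exits with a message that names the missing role and lists the known ones instead of raising StopIteration.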
I've created a bug [1] to modify merge-new-params-nic-config to exit gracefully so that the missing role is apparent. I'm closing this bug as the fixes are in IR. If it's useful to keep this open to track the IR fixes, please reopen and assign to another DFG.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1656878