Description of problem:
Customer is performing a minor upgrade to OSP version 16.2. During the converge step it fails with error [1].

[1]
2022-11-25 04:30:31.152665 | 043f72de-d3f9-3533-4d43-0000000000f3 | FATAL | Write group_vars file | undercloud | error={"censored": "the output has been hidden due to the fact that 'no_log: true' was specified for this result"}

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 16.2.3 (Train)

How reproducible:
Customer has upgraded all overcloud nodes and is now in the last stage, overcloud converge. The above error happens during this step.

The error seems to happen while executing this code:
https://opendev.org/openstack/tripleo-heat-templates/src/branch/stable/train/deployment/octavia/octavia-deployment-config.j2.yaml#L309-L315

This issue seems similar to BZ: https://bugzilla.redhat.com/show_bug.cgi?id=2136393
However, the installed version of openstack-tripleo-heat-templates is later than the fixed version, 'openstack-tripleo-heat-templates-11.3.2-1.20221013153258.29a02c1.el8ost':

$ cat installed-rpms | grep openstack-tripleo-heat-templates
openstack-tripleo-heat-templates-11.6.1-2.20220409014870.el8ost.noarch    Sun Nov 20 22:46:13 2022

Actual results:
Minor update fails.

Expected results:
Minor update completes.
I'd say the permissions on the directory are already incorrect, so mistral isn't able to access it during the execution. What are the permissions there?

$ sudo ls -la /var/lib/mistral/overcloud
$ sudo ls -la /var/lib/mistral/overcloud/octavia-ansible
$ sudo ls -la /var/lib/mistral/overcloud/octavia-ansible/group_vars
$ sudo ls -la /var/lib/mistral/overcloud/octavia-ansible/local_dir

Try:
1. Move the existing overcloud directory to a backup directory, for example:
   sudo mv /var/lib/mistral/overcloud{,-backup}
2. Re-run the update converge script.

Can you confirm whether that resolves the issue?
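For convenience, the permission checks and the backup step above could be wrapped in two small shell functions, roughly as sketched below. This is only an illustration: the function names `show_perms` and `backup_dir` are made up for this example, `stat -c` assumes GNU coreutils (as shipped on RHEL 8), and the functions need to run as root to read under /var/lib/mistral. The converge re-run itself is not included, since the exact script name is site-specific.

```shell
# Hypothetical helpers, not part of any OSP tooling.

# Print mode, owner:group and path for the plan directory and the
# Octavia sub-directories that were asked about in the comment above.
show_perms() {
    local base="$1" dir
    for dir in "$base" \
               "$base/octavia-ansible" \
               "$base/octavia-ansible/group_vars" \
               "$base/octavia-ansible/local_dir"; do
        if [ -e "$dir" ]; then
            stat -c '%A %U:%G %n' "$dir"
        else
            echo "missing: $dir"
        fi
    done
}

# Move the plan directory to "<dir>-backup", refusing to clobber an
# earlier backup so that repeated runs do not lose data.
backup_dir() {
    local src="$1"
    local dst="${src}-backup"
    if [ -e "$dst" ]; then
        echo "refusing to overwrite existing $dst" >&2
        return 1
    fi
    mv "$src" "$dst"
}
```

After sourcing the functions in a root shell, `show_perms /var/lib/mistral/overcloud` prints the current state and `backup_dir /var/lib/mistral/overcloud` moves the directory aside; the converge step can then be re-run as usual.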
This is definitely a duplicate of bz 2136489. I'll close this as a duplicate once we get feedback on comment 2.

Bz 2136393 is for OSP16.1. Please make sure you check the bug for the correct version.

The build with the fix is already available. If the steps suggested by Brendan do not work, request a hotfix. Alternatively, you can try:
1. Downgrade mistral-executor in the undercloud to 16.2.3-10
2. Move /var/lib/mistral/overcloud
3. Run converge again
*** This bug has been marked as a duplicate of bug 2136489 ***