Created attachment 1288456 [details]
Notes about error, log snippets

Description of problem:
Failure to complete convergence (Finalize) in upgrade of overcloud. OSP9 to OSP10 upgrade.

Version-Release number of selected component (if applicable):
heat version 1.5.0

How reproducible:
I have only done this once; it's a very long procedure.

Steps to Reproduce:
1. Do a complete JetStream 6.0.1 install (no lock bits, so essentially head of OSP9).
2. Do the complete upgrade via the RH doc.

Actual results:
AllNodesDeploySteps UPDATE-FAILED in finalize upgrade

Expected results:

Additional info:

Converge error
==============

command:
----------
openstack overcloud deploy --stack "r8" -t 180 \
  --templates ~/pilot/templates/overcloud \
  -e /home/osp_admin/pilot/templates/overcloud/environments/network-solation.yaml \
  -e /home/osp_admin/pilot/templates/network-environment.yaml \
  -e /home/osp_admin/pilot/templates/static-ip-environment.yaml \
  -e /home/osp_admin/pilot/templates/static-vip-environment.yaml \
  -e /home/osp_admin/pilot/templates/node-placement.yaml \
  -e /home/osp_admin/pilot/templates/overcloud/environments/storage-environment.yaml \
  -e /home/osp_admin/pilot/templates/dell-environment.yaml \
  -e /home/osp_admin/pilot/templates/overcloud/environments/puppet-pacemaker.yaml \
  -e /home/osp_admin/pilot/templates/overcloud/environments/major-upgrade-pacemaker-converge.yaml \
  --control-flavor control --compute-flavor compute \
  --ceph-storage-flavor ceph-storage --swift-storage-flavor swift-storage \
  --block-storage-flavor block-storage \
  --neutron-public-interface bond1 --neutron-network-type vlan \
  --neutron-disable-tunneling \
  --control-scale 3 --compute-scale 4 --ceph-storage-scale 3 \
  --ntp-server 0.centos.pool.ntp.org \
  --neutron-network-vlan-ranges physint:201:220,physext \
  --neutron-bridge-mappings physint:br-tenant,physext:br-ex
Created attachment 1288457 [details] journalctl output
Hi,

so the problem seems to come from a custom script. Could you post the script that is triggered in the ControllerExtraConfigPost step? On the controller associated with the journalctl output, the relevant files are:

 - /var/lib/heat-config/deployed/5d37fd5e-4705-4e50-a023-a7753953c3c4.json
 - /var/lib/heat-config/deployed/ed924d1d-e7ce-40cc-a3d2-ca655b063fdd.json
 - /var/lib/heat-config/deployed/273b32c2-1f07-4703-84e9-248a933a0407.json

There is no output and the error code is 5.

You can directly attach the content of the /var/lib/heat-config directory and the ControllerExtraConfigPost template.

Thanks,
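For anyone triaging a similar failure, the following is a minimal sketch of how the deployed configs on a controller could be summarized before attaching them. It is not part of heat-config. The function name `summarize_deployments` is hypothetical, and the sketch assumes that each deployed config is a JSON object with `name` and `group` keys and that a sibling `<id>.notify.json` file, when present, carries `deploy_status_code` and `deploy_stderr`. Those key names and the file layout are assumptions and should be checked against the files actually found on the node.

```python
import glob
import json
import os


def summarize_deployments(directory="/var/lib/heat-config/deployed"):
    """Return one summary dict per deployed config found in `directory`.

    Assumed layout (verify on the node): <id>.json holds the config and
    an optional <id>.notify.json holds the result of running it.
    """
    summaries = []
    for path in sorted(glob.glob(os.path.join(directory, "*.json"))):
        if path.endswith(".notify.json"):
            continue  # result files are read next to their config below
        with open(path) as handle:
            config = json.load(handle)

        result = {}
        notify_path = path[: -len(".json")] + ".notify.json"
        if os.path.exists(notify_path):
            with open(notify_path) as handle:
                result = json.load(handle)

        summaries.append({
            "file": os.path.basename(path),
            # Assumed keys; .get() yields None when a key is absent.
            "name": config.get("name"),
            "group": config.get("group"),
            "status_code": result.get("deploy_status_code"),
            "stderr": result.get("deploy_stderr"),
        })
    return summaries


if __name__ == "__main__":
    for entry in summarize_deployments():
        print(entry)
```

Run as root on the controller, this prints one line per deployed config, which may help pick out the entry whose status code matches the one reported in the journal.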
Looks like a dup of https://bugzilla.redhat.com/show_bug.cgi?id=1404810#c36
Sofer, we can provide you a pointer to the current wrappers that we use for upgrade and update. They are on GitHub and we can give you access. Send me your GitHub ID and we will take care of it. We are in the process of open-sourcing them but are not there yet.
I believe Gonéri was correct. I made the post-deploy.yaml change from that bz and got a successful finalize stage. I'm not sure what the final disposition will be. I will update with more later, but we are no longer blocked on *this* particular issue.
Hi Wayne, so we can close this one as this CephRDO customization was already fixed. Don't hesitate to re-open it if I missed something.