Updated to the 12/10 drop and attempted a reinstall of the overcloud. The deployment failed. I realized that although the stack was deleted and "nova list" showed no servers, ironic still had instance UUIDs associated with each piece of hardware. We cleared the instance UUIDs by running another introspection/scan.

However, the deployment attempts then failed right away with another error: missing cloud image. Nothing had been changed in the undercloud's OpenStack; flavors and images were the same. After a lot of digging, I verified that this was due to the update. One of the modules, overcloud_deploy.py, used to have a block of hardcoded parameters that has now been removed, such as:

'controllerImage': 'overcloud-full',
'NovaImage': 'overcloud-full',
'BlockStorageImage': 'overcloud-full',
'SwiftStorageImage': 'overcloud-full',
'CephStorageImage': 'overcloud-full',
'OvercloudControlFlavor': 'baremetal',
'OvercloudComputeFlavor': 'baremetal',
'OvercloudBlockStorageFlavor': 'baremetal',
'OvercloudSwiftStorageFlavor': 'baremetal',
'OvercloudCephStorageFlavor': 'baremetal'

I added some of the params back in and got beyond the immediate halts caused by missing required params. I did not add them ALL back in (read on please).

The deployment would run for 2-3 hours and never seemed to reach the phases where you can see step1, step2, etc. Per ironic, the systems are rebooted, and they do get a new provisioning IP (nova list from the undercloud). After 2-3 hours the IPs of the servers change again, which gives me the impression that the stack deployment is retrying. But if I ssh to the new IP, even after 2 hours, I still see the old OS/deployment.

I also restored just that module in its entirety and tried a deployment, but got a similar result.
The defaults for the image and flavor name parameters were moved from the client (overcloud_deploy.py) to tripleo-heat-templates directly (in overcloud-without-mergepy.yaml) in the 7.2 release of director. What I suspect is happening is that they yum-updated the undercloud and got the new client code, but are still using old tripleo-heat-templates. Do they have a local copy of the templates they might be using? If so, they need to make a new copy of /usr/share/openstack-tripleo-heat-templates and incorporate any changes they had made to the templates into that new copy, then try a new deployment with that new copy.
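One rough way to check whether a local template copy predates that change is to look for the parameter names in its overcloud-without-mergepy.yaml. The sketch below is only an illustration: the helper names (missing_params, check_template_dir) are hypothetical, the parameter list is copied from the report above on the assumption that the template uses the same names, and it only tests whether each name appears as a YAML key. It does not confirm that a default is set.

```python
import re
from pathlib import Path

# Names copied from the bug report; assumed (not verified) to match the
# parameter names used in the updated template.
EXPECTED_PARAMS = [
    "controllerImage",
    "NovaImage",
    "BlockStorageImage",
    "SwiftStorageImage",
    "CephStorageImage",
    "OvercloudControlFlavor",
    "OvercloudComputeFlavor",
    "OvercloudBlockStorageFlavor",
    "OvercloudSwiftStorageFlavor",
    "OvercloudCephStorageFlavor",
]


def missing_params(template_text, expected=EXPECTED_PARAMS):
    """Return the expected names that never appear as a YAML key in the text."""
    missing = []
    for name in expected:
        # Match the name as a mapping key at any indentation, e.g. "  NovaImage:".
        pattern = r"^\s*" + re.escape(name) + r"\s*:"
        if not re.search(pattern, template_text, flags=re.MULTILINE):
            missing.append(name)
    return missing


def check_template_dir(template_dir, filename="overcloud-without-mergepy.yaml"):
    """Read the named template from a directory and report missing keys."""
    path = Path(template_dir) / filename
    return missing_params(path.read_text())
```

Running check_template_dir against both the local copy and /usr/share/openstack-tripleo-heat-templates and comparing the two results would show whether the local copy lacks parameters that the packaged templates define.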
Jon, once you check the templates to make sure they are updated, please let us know in this bug what you find. Thanks.
This bug did not make the OSP 8.0 release. It is being deferred to OSP 10.
Closing since it should be resolved per comment #6. Please re-open if still experiencing some issues.
clearing needinfo