Description of problem:
I tried to deploy from the GUI with 3 controllers and 1 compute (all other settings were left as default). Deployment failed because the controllers couldn't ping pool.ntp.org (probably a missing DNS setting in the ctlplane) and the GUI showed:

Deployment of plan plan failed
Ansible failed, check log at /var/lib/mistral/plan/ansible.log.

The GUI didn't show a "delete" button or any other option, so I deleted the stack from the CLI. However, even with the stack deleted the GUI still showed the same failure and there is no "Deploy" button to try the deployment again.

Version-Release number of selected component (if applicable):
openstack-tripleo-ui-9.3.1-0.20180921180341.df30b55.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy from the GUI
2. Delete the stack with "openstack stack delete <<plan-name>>"
3. Return to the GUI

Actual results:
Stack is not shown as deleted, the system doesn't allow you to redeploy.
Workaround:
1) openstack object delete <<plan-name>>-messages deployment_status.yaml
2) F5 in the GUI
3) Click on "recover deployment status"
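Step 1 of the workaround can be scripted if it has to be repeated across plans. The following is only a minimal Python sketch, assuming the `openstack` client is on the PATH with undercloud credentials loaded. The helper names (`status_container`, `reset_status_command`, `reset_status`) are hypothetical; only the container suffix and the object name come from this report.

```python
import subprocess
from typing import List


def status_container(plan_name: str) -> str:
    """Name of the container holding the plan's deployment status."""
    return f"{plan_name}-messages"


def reset_status_command(plan_name: str) -> List[str]:
    """Build the argv for step 1 of the workaround (hypothetical helper)."""
    return [
        "openstack", "object", "delete",
        status_container(plan_name), "deployment_status.yaml",
    ]


def reset_status(plan_name: str) -> None:
    """Run step 1; raises CalledProcessError if the CLI call fails."""
    subprocess.run(reset_status_command(plan_name), check=True)
```

Steps 2 and 3 (refreshing the GUI and clicking "recover deployment status") still have to be done by hand.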
Since the introduction of config-download, the deployment status is tracked in the deployment_status.yaml object in the <planName>-messages Mistral container. Running 'openstack stack delete' should now be considered a low-level operation: it deletes the stack, but the deployment status does not get updated. When using the CLI, the user should instead use 'openstack overcloud plan delete' or 'openstack overcloud delete'. Unfortunately, both commands currently also delete the deployment plan. To fix the situation, I'd be inclined to update 'openstack overcloud delete' to trigger the tripleo.deployment.v1.undeploy_plan Mistral workflow.

In case of deployment failure, the GUI provides 'Delete Deployment' on the detailed deployment view page. This action triggers the tripleo.deployment.v1.undeploy_plan workflow.

Note that this is not a GUI-only problem. The CLI uses 'openstack overcloud status' to retrieve the deployment status, so after the Heat stack has been deleted directly, this command will also return an incorrect status.

IMHO, to fix this bug we need to update 'openstack overcloud delete' to use the undeploy_plan workflow and update the documentation to include the correct commands, if that has not been done already.
1. Update the 'openstack overcloud delete' command to use the undeploy_plan workflow instead of 'stack delete'. [1]
2. Where the documentation mentions using 'openstack stack delete' to delete the deployment, replace it with 'openstack overcloud delete' (to completely delete the deployment and the plan) or `openstack workflow execution create tripleo.deployment.v1.undeploy_plan '{"container":"<planName>"}'` (to delete the deployment only).

For the future (not part of this bug), we need to change 'openstack overcloud delete' to just undeploy the plan but not delete it. Deleting the plan needs to be a separate operation done by 'openstack overcloud plan delete'.

[1] https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_delete.py
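The workflow input in item 2 is a JSON object, and hand-typed quoting is easy to get wrong. As an illustration only, here is a Python sketch that builds the same 'openstack workflow execution create' invocation with the input serialized by `json.dumps`. The helper names (`undeploy_command`, `undeploy`) are hypothetical, and the sketch assumes the `openstack` client with the Mistral plugin is available on the undercloud. It is not how python-tripleoclient implements the fix.

```python
import json
import subprocess
from typing import List

# Workflow name as given in this bug report.
UNDEPLOY_WORKFLOW = "tripleo.deployment.v1.undeploy_plan"


def undeploy_command(plan_name: str) -> List[str]:
    """Build the argv that starts the undeploy workflow for a plan."""
    # json.dumps guarantees balanced quotes in the workflow input.
    workflow_input = json.dumps({"container": plan_name})
    return [
        "openstack", "workflow", "execution", "create",
        UNDEPLOY_WORKFLOW, workflow_input,
    ]


def undeploy(plan_name: str) -> None:
    """Start the workflow; raises CalledProcessError if the CLI call fails."""
    subprocess.run(undeploy_command(plan_name), check=True)
```

Passing the command as an argument list avoids shell quoting of the JSON string altogether.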
New bug description:
We don't have a CLI command to properly undeploy the overcloud. There is a workflow for it, and the GUI uses the right workflow, but the CLI doesn't support it yet. We propose to update the "openstack overcloud delete" command to call the proper workflow. Currently, this command deletes the plan in addition to deleting the stack, which is not what we want. It should only call the undeploy workflow, the same as the UI does. The bug is a blocker because the current CLI tools don't properly undeploy the overcloud. For example, if you delete the stack from the CLI and switch to the GUI, you still see that the cloud is deployed.
Related BZ which fixes GUI deployment actions: https://bugzilla.redhat.com/show_bug.cgi?id=1637461
*** Bug 1660110 has been marked as a duplicate of this bug. ***
The command still deletes the plan, instead of just undeploying the cloud. We need the plan to be re-deployable.

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud delete yac
Are you sure you want to delete this overcloud [y/N]? y
Undeploying stack yac...
Waiting for messages on queue 'tripleo' with no timeout.
Deleting plan yac...
None
Success.
(In reply to Udi from comment #33)
> The command still deletes the plan, instead of just undeploying the cloud.
> We need the plan to be re-deployable.

As described in Comment 3, I think that change is outside the scope of this bug. The way the CLI currently works is that it recreates the deployment plan every time, which, as you point out, is not compatible with the GUI workflow. I'd suggest creating a new BZ to track that separately. This BZ (blocker) is about correctly tracking the deployment status, which has been fixed by the patches for this BZ.
Verified. The CLI calls the right workflow, but still calls the plan delete in addition to the overcloud delete. A separate bug will be opened. Verified in: python-tripleoclient-10.6.1-0.20181010222413.8c8f259.el7ost.noarch
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2019:0045
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1664602