Bug 1644861 - Openstack overcloud delete should call the undeploy workflow
Summary: Openstack overcloud delete should call the undeploy workflow
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: 14.0 (Rocky)
Assignee: Adriano Petrich
QA Contact: Udi Kalifon
URL:
Whiteboard:
Duplicates: 1660110
Depends On:
Blocks:
 
Reported: 2018-10-31 18:10 UTC by Udi Kalifon
Modified: 2019-01-14 15:41 UTC (History)

Fixed In Version: python-tripleoclient-10.6.1-0.20181010222413.8c8f259.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-11 11:54:26 UTC
Target Upstream Version:
Embargoed:



Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Launchpad               1802102         0        None      None    None     2018-11-07 13:17:10 UTC
OpenStack gerrit        616199          0        None      None    None     2018-11-13 09:44:20 UTC
OpenStack gerrit        619583          0        None      None    None     2018-12-17 17:44:17 UTC
OpenStack gerrit        623264          0        None      None    None     2018-12-06 17:14:32 UTC
OpenStack gerrit        625675          0        None      None    None     2018-12-17 17:44:17 UTC
Red Hat Product Errata  RHEA-2019:0045  0        None      None    None     2019-01-11 11:54:40 UTC

Description Udi Kalifon 2018-10-31 18:10:30 UTC
Description of problem:
I tried to deploy from the GUI with 3 controllers and 1 compute (all other settings were left as default). Deployment failed because the controllers couldn't ping pool.ntp.org (probably a missing DNS setting in the ctlplane) and the GUI showed:

Deployment of plan plan failed
Ansible failed, check log at /var/lib/mistral/plan/ansible.log.

The GUI didn't show a "delete" button or any other option - so I deleted the stack from the CLI. However, even with the stack deleted the GUI still showed the same failure and there is no "Deploy" button to try the deployment again.


Version-Release number of selected component (if applicable):
openstack-tripleo-ui-9.3.1-0.20180921180341.df30b55.el7ost.noarch


How reproducible:
100%


Steps to Reproduce:
1. Deploy from the GUI
2. Delete the stack with "openstack stack delete <plan-name>"
3. Return to the GUI


Actual results:
The stack is not shown as deleted, and the system doesn't allow you to redeploy.

Comment 1 Udi Kalifon 2018-11-02 10:48:21 UTC
Workaround:
1) openstack object delete <plan-name>-messages deployment_status.yaml
2) F5 in the GUI
3) Click on "recover deployment status"
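
Step 1 of the workaround can be wrapped in a small shell helper. This is a minimal sketch, assuming the status object lives in a container named after the plan with a "-messages" suffix, as in the command above. The function name reset_deployment_status is hypothetical, and steps 2 and 3 still have to be done in the GUI.

```shell
# Hypothetical helper: removes the stale deployment status object for a plan.
# Assumes the container is named "<plan-name>-messages", as in step 1 above.
reset_deployment_status() {
    plan="$1"
    if [ -z "$plan" ]; then
        echo "usage: reset_deployment_status <plan-name>" >&2
        return 1
    fi
    # Delete only the status object; the plan itself is left untouched.
    openstack object delete "${plan}-messages" deployment_status.yaml
}
```

For a plan named "overcloud" this would be called as: reset_deployment_status overcloud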

Comment 2 Jiri Tomasek 2018-11-05 16:17:59 UTC
Since the introduction of config-download, the deployment status is tracked in the deployment_status.yaml object in the <planName>-messages Mistral container. Running 'openstack stack delete' should now be considered a low-level operation: it deletes the stack but does not update the deployment status.

Instead, when using the CLI, the user should use 'openstack overcloud plan delete' or 'openstack overcloud delete'. Unfortunately, both commands currently also delete the deployment plan. To fix the situation, I'd be inclined to update 'openstack overcloud delete' to trigger the tripleo.deployment.v1.undeploy_plan Mistral workflow.

In case of deployment failure, the GUI provides 'Delete Deployment' on the detailed deployment view page. This action triggers the tripleo.deployment.v1.undeploy_plan workflow.

Note that this is not a GUI-only problem. The CLI uses 'openstack overcloud status' to retrieve the deployment status, so after the Heat stack has been deleted directly, this command also returns an incorrect status.
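
The mismatch described here can be checked from the CLI. This is a minimal sketch, assuming the Heat stack has the same name as the plan and the status object lives in the <planName>-messages container. The function name check_status_mismatch and its return codes are hypothetical.

```shell
# Hypothetical check: reports when the Heat stack is gone but a recorded
# deployment status still exists for the plan.
# Assumes the stack name equals the plan name.
check_status_mismatch() {
    plan="$1"
    # If the stack still exists, there is nothing stale to report.
    if openstack stack show "$plan" >/dev/null 2>&1; then
        echo "stack '$plan' exists"
        return 0
    fi
    # Stack is gone: a leftover status object means the status is stale.
    if openstack object show "${plan}-messages" deployment_status.yaml >/dev/null 2>&1; then
        echo "stack '$plan' is gone but deployment_status.yaml remains (stale status)"
        return 2
    fi
    echo "stack '$plan' is gone and no deployment status is recorded"
}
```

A return code of 2 corresponds to the situation in this bug: the stack was removed with 'openstack stack delete' and the recorded status was never updated.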

IMHO, to fix this bug we need to update 'openstack overcloud delete' to use the undeploy_plan workflow and update the documentation to include the correct commands, if that has not been done already.

Comment 3 Jiri Tomasek 2018-11-06 14:04:50 UTC
1. Update the 'openstack overcloud delete' command to use the undeploy_plan workflow instead of 'stack delete'. [1]

2. In cases where the documentation mentions using 'openstack stack delete' to delete the deployment, replace it with 'openstack overcloud delete' (to completely delete the deployment and plan) or `openstack workflow execution create tripleo.deployment.v1.undeploy_plan '{"container":"<planName>"}'` to delete the deployment only.
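
The "delete the deployment only" variant from item 2 can be wrapped as a shell function. This is a minimal sketch: the function name undeploy_plan is hypothetical, and it assumes the plan name contains no characters that need JSON escaping.

```shell
# Hypothetical wrapper around the undeploy workflow named in item 2.
undeploy_plan() {
    plan="$1"
    if [ -z "$plan" ]; then
        echo "usage: undeploy_plan <planName>" >&2
        return 1
    fi
    # Assumes the plan name needs no JSON escaping (no quotes or backslashes).
    openstack workflow execution create tripleo.deployment.v1.undeploy_plan \
        "{\"container\": \"${plan}\"}"
}
```

Called as "undeploy_plan overcloud", this starts the workflow execution for that plan and leaves the plan itself in place.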

For the future (not part of this bug), we need to change 'openstack overcloud delete' to just undeploy the plan but not delete it. Deleting the plan needs to be a separate operation done by 'openstack overcloud plan delete'.


[1] https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/v1/overcloud_delete.py

Comment 4 Udi Kalifon 2018-11-06 15:19:49 UTC
New bug description:
We don't have a CLI command to properly undeploy the overcloud. There is a workflow for it, and the GUI uses the right workflow, but the CLI doesn't support it yet.

We propose to update the "openstack overcloud delete" command to call the proper workflow. Currently, this command deletes the plan in addition to deleting the stack, which is not what we want. It should only call the undeploy workflow, the same as the UI does.

The bug is a blocker because the current CLI tools don't properly undeploy the overcloud. For example, if you delete the stack from the CLI and switch to the GUI - you still see that the cloud is deployed.

Comment 5 Jiri Tomasek 2018-11-07 16:26:47 UTC
Related BZ which fixes GUI deployment actions:
https://bugzilla.redhat.com/show_bug.cgi?id=1637461

Comment 19 Beth White 2018-12-17 14:51:24 UTC
*** Bug 1660110 has been marked as a duplicate of this bug. ***

Comment 33 Udi Kalifon 2019-01-09 08:47:54 UTC
The command still deletes the plan, instead of just undeploying the cloud. We need the plan to be re-deployable.

(undercloud) [stack@undercloud-0 ~]$ openstack overcloud delete yac 
Are you sure you want to delete this overcloud [y/N]? y
Undeploying stack yac...
Waiting for messages on queue 'tripleo' with no timeout.
Deleting plan yac...
None
Success.

Comment 35 Jiri Tomasek 2019-01-09 09:06:57 UTC
(In reply to Udi from comment #33)
> The command still deletes the plan, instead of just undeploying the cloud.
> We need the plan to be re-deployable.

As described in Comment 3, I think that change is outside the scope of this bug. The way the CLI currently works is that it recreates the deployment plan every time, which, as you point out, is not compatible with the GUI workflow. I'd suggest creating a new BZ to track that separately.

This BZ (blocker) is about correctly tracking the deployment status, which has been fixed by the patches for this BZ.

Comment 36 Udi Kalifon 2019-01-09 09:15:04 UTC
Verified. The CLI calls the right workflow, but still calls the plan delete in addition to the overcloud delete. A separate bug will be opened. Verified in: python-tripleoclient-10.6.1-0.20181010222413.8c8f259.el7ost.noarch

Comment 39 errata-xmlrpc 2019-01-11 11:54:26 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045

Comment 40 Udi Kalifon 2019-01-14 15:41:21 UTC
See also: https://bugzilla.redhat.com/show_bug.cgi?id=1664602
