Bug 1635698 - "openstack overcloud delete overcloud" fails with "Error occurred during stack delete None"
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Linux
Priority: medium
Severity: medium
Target Milestone: z1
Target Release: 14.0 (Rocky)
Assignee: RHOS Maint
QA Contact: Sasha Smolyak
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-10-03 13:59 UTC by Alexander Chuzhoy
Modified: 2019-03-18 13:03 UTC (History)

Fixed In Version: python-tripleoclient-10.6.1-0.20181010222413.8c8f259.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-03-18 13:03:10 UTC
Target Upstream Version:
Embargoed:


Attachments
debug output (6.10 KB, application/octet-stream)
2018-10-05 02:57 UTC, Jill Rouleau


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1804660 0 None None None 2018-11-22 14:35:50 UTC
OpenStack gerrit 625713 0 None None None 2019-01-28 14:53:23 UTC
Red Hat Product Errata RHBA-2019:0446 0 None None None 2019-03-18 13:03:17 UTC

Description Alexander Chuzhoy 2018-10-03 13:59:00 UTC
"openstack overcloud delete overcloud" fails with "Error occurred during stack delete None"

Environment:
python-heat-agent-1.7.1-0.20180907213355.476aae2.el7ost.noarch
python-heat-agent-apply-config-1.7.1-0.20180907213355.476aae2.el7ost.noarch
openstack-heat-api-12.0.0-0.20180604085325.7d878a8.el7ost.noarch
ansible-pacemaker-1.0.4-0.20180827141254.0e4d7c0.el7ost.noarch
python2-mistral-lib-1.0.0-0.20180821152751.d1ccfd0.el7ost.noarch
python2-heatclient-1.16.1-0.20180810081134.b5f3d34.el7ost.noarch
python-heat-agent-json-file-1.7.1-0.20180907213355.476aae2.el7ost.noarch
python-heat-agent-hiera-1.7.1-0.20180907213355.476aae2.el7ost.noarch
openstack-tripleo-heat-templates-9.0.0-0.20180919080941.0rc1.0rc1.el7ost.noarch
instack-undercloud-9.3.1-0.20180918171407.b0205ab.el7ost.noarch
openstack-heat-common-12.0.0-0.20180604085325.7d878a8.el7ost.noarch
python-tripleoclient-heat-installer-10.5.1-0.20180906012842.el7ost.noarch
ansible-role-redhat-subscription-1.0.1-1.el7ost.noarch
puppet-heat-13.3.1-0.20180831195745.28088f9.el7ost.noarch
ansible-role-tripleo-modify-image-1.0.1-0.20180915144057.cb535e9.el7ost.noarch
python-heat-agent-docker-cmd-1.7.1-0.20180907213355.476aae2.el7ost.noarch
python-heat-agent-ansible-1.7.1-0.20180907213355.476aae2.el7ost.noarch
python2-mistralclient-3.7.0-0.20180810140142.f0ee48f.el7ost.noarch
openstack-heat-engine-12.0.0-0.20180604085325.7d878a8.el7ost.noarch
ansible-2.5.7-1.el7ae.noarch
ansible-role-container-registry-1.0.1-0.20180907005806.b33f893.el7ost.noarch
puppet-mistral-13.3.1-0.20180831192741.bb0e35e.el7ost.noarch
python-heat-agent-puppet-1.7.1-0.20180907213355.476aae2.el7ost.noarch
openstack-heat-agents-1.7.1-0.20180907213355.476aae2.el7ost.noarch
openstack-heat-monolith-12.0.0-0.20180604085325.7d878a8.el7ost.noarch
ansible-tripleo-ipsec-9.0.1-0.20180827143021.d2b9234.el7ost.noarch
heat-cfntools-1.3.0-2.el7ost.noarch



Steps to reproduce:
Try to delete overcloud:
(undercloud) [stack@undercloud ~]$ openstack overcloud delete overcloud
Are you sure you want to delete this overcloud [y/N]? y
Deleting stack overcloud...
Waiting for messages on queue 'tripleo' with no timeout.
Error occurred during stack delete None
(undercloud) [stack@undercloud ~]$ 


Expected result:
Successful deletion of overcloud.

Comment 2 Alexander Chuzhoy 2018-10-03 17:15:07 UTC
Note that despite the error, the deletion does start:

(undercloud) [stack@undercloud ~]$ openstack stack list
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+
| ID                                   | Stack Name | Project                          | Stack Status    | Creation Time        | Updated Time |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+
| 4ae9d0a5-0f84-4754-951e-50049c1bbb1a | overcloud  | abadfa7e2fcc4f5489c4e8ac2d9b0a0d | CREATE_COMPLETE | 2018-10-03T15:25:55Z | None         |
+--------------------------------------+------------+----------------------------------+-----------------+----------------------+--------------+


(undercloud) [stack@undercloud ~]$ openstack overcloud status --plan overcloud

+-----------+---------------------+---------------------+-------------------+
| Plan Name |       Created       |       Updated       | Deployment Status |
+-----------+---------------------+---------------------+-------------------+
| overcloud | 2018-10-03 15:51:58 | 2018-10-03 15:51:58 |   DEPLOY_SUCCESS  |
+-----------+---------------------+---------------------+-------------------+



(undercloud) [stack@undercloud ~]$ openstack overcloud delete overcloud
Are you sure you want to delete this overcloud [y/N]? y
Deleting stack overcloud...
Waiting for messages on queue 'tripleo' with no timeout.
Error occurred during stack delete None



(undercloud) [stack@undercloud ~]$ openstack stack list
+--------------------------------------+------------+----------------------------------+--------------------+----------------------+----------------------+
| ID                                   | Stack Name | Project                          | Stack Status       | Creation Time        | Updated Time         |
+--------------------------------------+------------+----------------------------------+--------------------+----------------------+----------------------+
| 4ae9d0a5-0f84-4754-951e-50049c1bbb1a | overcloud  | abadfa7e2fcc4f5489c4e8ac2d9b0a0d | DELETE_IN_PROGRESS | 2018-10-03T15:25:55Z | 2018-10-03T17:13:51Z |
+--------------------------------------+------------+----------------------------------+--------------------+----------------------+----------------------+

Comment 3 Jill Rouleau 2018-10-03 22:18:05 UTC
I haven't been able to reproduce this, and the only obvious error I see in the mistral logs is https://bugzilla.redhat.com/show_bug.cgi?id=1628319. Can you try restarting all of the mistral containers on the undercloud before running the overcloud delete, and see whether you can still reproduce it?

Comment 4 Jill Rouleau 2018-10-03 22:19:30 UTC
Sorry, I should have said "the only obvious error I see IS in the mistral logs..."

Comment 5 Jill Rouleau 2018-10-04 18:18:07 UTC
I did actually see this today, and I happened to have two stacks deployed. After the first stack delete gave "Error occurred during stack delete None", I restarted the undercloud's mistral containers. Immediately after that, deleting the second stack still gave the same message, so the error is likely not related to the state of the mistral containers.

Comment 6 Jill Rouleau 2018-10-05 02:57:18 UTC
Created attachment 1490744
debug output

Comment 7 Jill Rouleau 2018-10-05 02:58:36 UTC
In stack_management.py, there is a call to base.wait_for_messages, which checks the status of the mistral execution of tripleo.stack.v1._heat_stacks_list. At the time of the call the status is RUNNING, not SUCCESS; in the tests I've done it changes to SUCCESS within roughly 5-10 seconds. Because the status is not yet SUCCESS, stack_management.delete_stack raises InvalidConfiguration[0] back to overcloud_delete._stack_delete, which raises the CommandError being sent to stdout. payload['message'] is None because there is no message yet: the workflow is still RUNNING.

Attaching a log with the output of some debugging. Unfortunately I wasn't able to pin down a fix, and I'll be out for the next few weeks. Hopefully this data helps.

[0]  https://github.com/openstack/python-tripleoclient/blob/master/tripleoclient/workflows/stack_management.py#L44
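
To make the failure mode above concrete, here is a minimal standalone sketch. It is not the tripleoclient code: delete_stack_strict, WorkflowError, and the payload dictionaries are hypothetical stand-ins. The only assumptions are that the client receives a dict with "status" and "message" keys and treats any status other than SUCCESS as a failure.

```python
class WorkflowError(Exception):
    """Hypothetical stand-in for the exception raised by the client."""


def delete_stack_strict(messages):
    """Treat the first message as final; fail unless its status is SUCCESS."""
    for payload in messages:
        if payload.get("status") != "SUCCESS":
            # A RUNNING workflow has no message yet, so None is formatted here.
            raise WorkflowError(
                "Error occurred during stack delete {}".format(
                    payload.get("message")))
        return payload


def run_strict_example():
    # Assumed sequence: an intermediate RUNNING message arrives first.
    messages = [
        {"status": "RUNNING", "message": None},
        {"status": "SUCCESS", "message": "Stack deleted"},
    ]
    try:
        delete_stack_strict(messages)
    except WorkflowError as exc:
        return str(exc)
    return "ok"
```

Calling run_strict_example() returns the same text as the reported error, which matches the observation that the delete itself proceeds while the client gives up on the first non-SUCCESS message.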

Comment 8 James Slagle 2018-10-24 17:47:04 UTC
This is an issue with how tripleoclient handles the messages coming back from the workflow. It could be a race condition, or a status other than 'SUCCESS' may need to be handled.
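
One way to read the second possibility is sketched below: skip intermediate statuses and decide only on a terminal one. This is an illustration, not the fix that was merged; wait_for_terminal_status, delete_stack_tolerant, and the TERMINAL_STATUSES set are hypothetical names, and the set of terminal statuses is an assumption.

```python
# Assumed terminal statuses; the real workflow may use a different set.
TERMINAL_STATUSES = frozenset({"SUCCESS", "FAILED", "ERROR"})


def wait_for_terminal_status(messages):
    """Return the first payload with a terminal status, skipping the rest."""
    for payload in messages:
        if payload.get("status") in TERMINAL_STATUSES:
            return payload
    raise RuntimeError("message stream ended before a terminal status")


def delete_stack_tolerant(messages):
    """Fail only when the workflow finishes with a non-SUCCESS status."""
    payload = wait_for_terminal_status(messages)
    if payload["status"] != "SUCCESS":
        raise RuntimeError(
            "Error occurred during stack delete {}".format(
                payload.get("message")))
    return payload
```

With this shape, a RUNNING message followed by SUCCESS is treated as a successful delete, and an error is reported only when a terminal failure carries an actual message.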

Comment 10 Artem Hrechanychenko 2018-11-20 14:41:18 UTC
Is there any news on this issue?

Comment 15 Mikey Ariel 2019-02-20 12:44:23 UTC
If this bug requires doc text for errata release, please set the 'Doc Type' and provide draft text according to the template in the 'Doc Text' field. The documentation team will review, edit, and approve the text.

If this bug does not require doc text, please set the 'requires_doc_text' flag to -.

Comment 17 errata-xmlrpc 2019-03-18 13:03:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0446

