Bug 1378157

Summary: Unable to delete overcloud node
Product: Red Hat OpenStack
Reporter: Marius Cornea <mcornea>
Component: python-tripleoclient
Assignee: Ryan Brady <rbrady>
Status: CLOSED ERRATA
QA Contact: Omri Hochman <ohochman>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: akrivoka, ccamacho, dbecker, dtrainor, hbrock, jcoufal, jpichon, jrist, jschluet, jslagle, mburns, morazi, opavlenk, rhel-osp-director-maint
Target Milestone: rc
Keywords: Triaged
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: python-tripleoclient-5.3.0-5.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-12-14 16:03:19 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Marius Cornea 2016-09-21 16:09:42 UTC
Description of problem:
Deleting an overcloud node fails with:
Two objects are equal when all of the attributes are equal, if you want to identify whether two objects are same one with same id, please use is_same_obj() function.
<urlopen error [Errno 2] No such file or directory: '/usr/share/openstack-tripleo-heat-templates/overcloud-without-mergepy.yaml'>


Version-Release number of selected component (if applicable):
python-tripleoclient-5.0.0-0.20160907170033.b0d7ce7.el7ost.noarch
openstack-tripleo-heat-templates-5.0.0-0.20160907212643.90c852e.1.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Deploy overcloud:

source ~/stackrc
export THT=/usr/share/openstack-tripleo-heat-templates
openstack overcloud deploy --templates \
-e $THT/environments/network-isolation.yaml \
-e $THT/environments/network-management.yaml \
-e ~/templates/network-environment.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/disk-layout.yaml \
-e ~/templates/wipe-disk-env.yaml \
-e ~/templates/enable-tls.yaml \
-e ~/templates/inject-trust-anchor.yaml \
-e ~/templates/tls-endpoints-public-ip.yaml \
-e ~/templates/password-env.yaml \
--control-scale 3 \
--control-flavor controller-d75f3dec-c770-5f88-9d4c-3fea1bf9c484 \
--compute-scale 1 \
--compute-flavor compute-b634c10a-570f-59ba-bdbf-0c313d745a10 \
--ceph-storage-scale 1 \
--ceph-storage-flavor ceph-cf1f074b-dadb-5eb8-9eb0-55828273fab7 \
--ntp-server clock.redhat.com 

2. Scale out with additional compute node:

source ~/stackrc
export THT=/usr/share/openstack-tripleo-heat-templates
openstack overcloud deploy --templates \
-e $THT/environments/network-isolation.yaml \
-e $THT/environments/network-management.yaml \
-e ~/templates/network-environment.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/disk-layout.yaml \
-e ~/templates/wipe-disk-env.yaml \
-e ~/templates/enable-tls.yaml \
-e ~/templates/inject-trust-anchor.yaml \
-e ~/templates/tls-endpoints-public-ip.yaml \
-e ~/templates/password-env.yaml \
--control-scale 3 \
--control-flavor controller-d75f3dec-c770-5f88-9d4c-3fea1bf9c484 \
--compute-scale 2 \
--compute-flavor compute-b634c10a-570f-59ba-bdbf-0c313d745a10 \
--ceph-storage-scale 1 \
--ceph-storage-flavor ceph-cf1f074b-dadb-5eb8-9eb0-55828273fab7 \
--ntp-server clock.redhat.com 


3. Delete one compute node:

source ~/stackrc
export THT=/usr/share/openstack-tripleo-heat-templates
openstack overcloud node delete --stack overcloud --templates \
-e $THT/environments/network-isolation.yaml \
-e $THT/environments/network-management.yaml \
-e ~/templates/network-environment.yaml \
-e $THT/environments/storage-environment.yaml \
-e ~/templates/disk-layout.yaml \
-e ~/templates/wipe-disk-env.yaml \
-e ~/templates/enable-tls.yaml \
-e ~/templates/inject-trust-anchor.yaml \
-e ~/templates/tls-endpoints-public-ip.yaml \
-e ~/templates/password-env.yaml \
6fc44adf-9a46-41ad-af33-c623011e1457


Actual results:
WARNING: openstackclient.common.utils is deprecated and will be removed after Jun 2017. Please use osc_lib.utils
deleting nodes [u'6fc44adf-9a46-41ad-af33-c623011e1457'] from stack overcloud
Two objects are equal when all of the attributes are equal, if you want to identify whether two objects are same one with same id, please use is_same_obj() function.
<urlopen error [Errno 2] No such file or directory: '/usr/share/openstack-tripleo-heat-templates/overcloud-without-mergepy.yaml'>
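
The shape of the last error line can be reproduced outside the client. The following is a minimal sketch, assuming the template is fetched through a urllib-style opener with a file:// URL, which is what the "<urlopen error ...>" wording suggests; the helper name try_open_template is hypothetical and is not part of python-tripleoclient.

```python
import urllib.error
import urllib.request
from pathlib import Path


def try_open_template(path):
    """Open a local file through a file:// URL (hypothetical helper).

    Returns None when the file can be read, otherwise the error text.
    """
    url = Path(path).as_uri()  # requires an absolute path
    try:
        with urllib.request.urlopen(url) as handle:
            handle.read()
        return None
    except urllib.error.URLError as exc:
        # For a missing local file the URLError wraps the underlying OSError,
        # so the text has the "<urlopen error [Errno 2] ...>" form seen above.
        return str(exc)


# Path taken from the error message in this report.
print(try_open_template(
    "/usr/share/openstack-tripleo-heat-templates/overcloud-without-mergepy.yaml"
))
```

On a system where that file is absent, the printed text has the same form as the failure above, which points at a stale hard-coded template path rather than at the node being deleted.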


Expected results:
The node gets deleted.

Additional info:

Comment 2 James Slagle 2016-09-22 16:30:37 UTC
Carlos, can you take a look at this one?

Comment 3 James Slagle 2016-09-22 21:04:49 UTC
The error occurs because the scale-down code does not use the templates from the plan.

In general, the scale_down function in scale.py from tripleo_common needs to be updated to use the templates from the plan, and the logic needs to move into a workflow so that the scale-down functionality is also available from the API/UI.
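
To illustrate the idea only: the sketch below looks a template up in the plan's own files instead of opening a fixed path under /usr/share. Everything in it is an assumption made for illustration. The dict EXAMPLE_PLAN stands in for the plan's stored files, the file contents are placeholders, and template_from_plan is a hypothetical helper, not the tripleo_common implementation.

```python
# Hypothetical stand-in for a deployment plan: a mapping of file name to
# file contents. The contents below are placeholders, not real templates.
EXAMPLE_PLAN = {
    "overcloud.yaml": "heat_template_version: placeholder\n",
    "environments/network-isolation.yaml": "resource_registry: {}\n",
}


class TemplateNotInPlan(KeyError):
    """Raised when the requested template is not part of the plan."""


def template_from_plan(plan_files, name):
    """Return the named template from the plan's files.

    The lookup never touches the local filesystem, so a template that was
    renamed or removed on disk cannot cause a "No such file" error here.
    """
    if name not in plan_files:
        raise TemplateNotInPlan(name)
    return plan_files[name]


print(template_from_plan(EXAMPLE_PLAN, "overcloud.yaml"))
```

With this approach a missing template surfaces as an explicit "not in plan" error tied to the plan that was actually deployed.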

Comment 4 Jason E. Rist 2016-10-20 23:02:56 UTC
This is merged upstream!
https://review.openstack.org/#/c/383731/

Comment 5 Jason E. Rist 2016-10-20 23:03:37 UTC
Moving back from POST to ON_DEV because a backport is needed (I think).

Comment 6 Jason E. Rist 2016-11-02 04:22:41 UTC
This is merged upstream, and we're waiting on an ACK from backport patch:
https://review.openstack.org/#/c/390590/

Comment 7 Ola Pavlenko 2016-11-06 10:30:06 UTC
(In reply to Jason E. Rist from comment #6)
> This is merged upstream, and we're waiting on an ACK from backport patch:
> https://review.openstack.org/#/c/390590/

Merged!
Changing to MODIFIED.

Comment 12 Ana Krivokapic 2016-11-23 14:02:15 UTC
This is still failing, so I am moving it back to NEW.

Comment 14 Ana Krivokapic 2016-11-29 13:01:30 UTC
I retested this with the following steps:

1. Deploy a simple overcloud with 1 controller and 1 compute node
2. Scale up to 1 controller and 2 compute nodes
3. Delete the original compute node - succeeds
4. Try to delete the newly added compute node - fails with timeout

Details:

[stack@instack ~]$ openstack overcloud deploy --templates --control-scale 1 --compute-scale 1
Removing the current plan files
<snip>
Overcloud Deployed
[stack@instack ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks            |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| 762a6dcd-6d77-4c31-9f94-a157f5f74855 | overcloud-compute-0    | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |
| 5a6d6c7d-fb78-4c6f-b4c0-bc0777bbbb70 | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
[stack@instack ~]$ openstack overcloud deploy --templates --control-scale 1 --compute-scale 2
Removing the current plan files
<snip>
[stack@instack ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks            |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| 762a6dcd-6d77-4c31-9f94-a157f5f74855 | overcloud-compute-0    | ACTIVE | -          | Running     | ctlplane=192.0.2.12 |
| 0c62ec51-e559-4674-aa6a-641cab341ffb | overcloud-compute-1    | ACTIVE | -          | Running     | ctlplane=192.0.2.17 |
| 5a6d6c7d-fb78-4c6f-b4c0-bc0777bbbb70 | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
[stack@instack ~]$ openstack overcloud node delete --templates --stack overcloud  762a6dcd-6d77-4c31-9f94-a157f5f74855
deleting nodes [u'762a6dcd-6d77-4c31-9f94-a157f5f74855'] from stack overcloud
Started Mistral Workflow. Execution ID: ec4baa68-4c8d-431e-b13d-0c2bde04c9b1
[stack@instack ~]$ nova list
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                   | Status | Task State | Power State | Networks            |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
| 0c62ec51-e559-4674-aa6a-641cab341ffb | overcloud-compute-1    | ACTIVE | -          | Running     | ctlplane=192.0.2.17 |
| 5a6d6c7d-fb78-4c6f-b4c0-bc0777bbbb70 | overcloud-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
+--------------------------------------+------------------------+--------+------------+-------------+---------------------+
[stack@instack ~]$ openstack overcloud node delete --templates --stack overcloud  0c62ec51-e559-4674-aa6a-641cab341ffb
deleting nodes [u'0c62ec51-e559-4674-aa6a-641cab341ffb'] from stack overcloud
Started Mistral Workflow. Execution ID: 0a9d1f55-a457-4d5a-98b4-56323df7ad6a
{u'execution': {u'id': u'0a9d1f55-a457-4d5a-98b4-56323df7ad6a',
                u'input': {u'container': u'overcloud',
                           u'nodes': [u'0c62ec51-e559-4674-aa6a-641cab341ffb'],
                           u'queue_name': u'2cb6790c-6369-4e9b-bf41-cd40eb07a463',
                           u'timeout': 240},
                u'name': u'tripleo.scale.v1.delete_node',
                u'params': {},
                u'spec': {u'description': u'deletes given overcloud nodes and updates the stack',
                          u'input': [u'container',
                                     u'nodes',
                                     {u'timeout': 240},
                                     {u'queue_name': u'tripleo'}],
                          u'name': u'delete_node',
                          u'tasks': {u'delete_node': {u'action': u'tripleo.scale.delete_node nodes=<% $.nodes %> timeout=<% $.timeout %> container=<% $.container %>',
                                                      u'name': u'delete_node',
                                                      u'on-error': u'set_delete_node_failed',
                                                      u'on-success': u'send_message',
                                                      u'type': u'direct',
                                                      u'version': u'2.0'},
                                     u'send_message': {u'action': u'zaqar.queue_post',
                                                       u'input': {u'messages': {u'body': {u'payload': {u'execution': u'<% execution() %>',
                                                                                                       u'message': u"<% $.get('message', '') %>",
                                                                                                       u'status': u"<% $.get('status', 'SUCCESS') %>"},
                                                                                          u'type': u'tripleo.scale.v1.delete_node'}},
                                                                  u'queue_name': u'<% $.queue_name %>'},
                                                       u'name': u'send_message',
                                                       u'retry': u'count=5 delay=1',
                                                       u'type': u'direct',
                                                       u'version': u'2.0'},
                                     u'set_delete_node_failed': {u'name': u'set_delete_node_failed',
                                                                 u'on-success': u'send_message',
                                                                 u'publish': {u'message': u'<% task(delete_node).result %>',
                                                                              u'status': u'FAILED'},
                                                                 u'type': u'direct',
                                                                 u'version': u'2.0'}},
                          u'version': u'2.0'}},
 u'message': u"Failed to run action [action_ex_id=92c75275-8091-4234-a764-95af3b1d1ddc, action_cls='<class 'mistral.actions.action_factory.ScaleDownAction'>', attributes='{}', params='{u'nodes': [u'0c62ec51-e559-4674-aa6a-641cab341ffb'], u'container': u'overcloud', u'timeout': 240}']\n ERROR: u'u\\'0\\'\\n",
 u'status': u'FAILED'}
[stack@instack ~]$
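
The payload printed above is a Python 2 dict repr, so the useful fields are easy to miss. A minimal sketch for pulling them out follows; it assumes the text is pasted in as a string, uses an abbreviated copy of the payload with the long message elided, and the helper name summarize is hypothetical.

```python
import ast

# Abbreviated copy of the payload above. The u'' prefixes are still valid
# Python 3 literals. The 'message' value is elided here, not reproduced.
PAYLOAD_TEXT = """
{u'execution': {u'id': u'0a9d1f55-a457-4d5a-98b4-56323df7ad6a',
                u'input': {u'container': u'overcloud',
                           u'nodes': [u'0c62ec51-e559-4674-aa6a-641cab341ffb'],
                           u'timeout': 240},
                u'name': u'tripleo.scale.v1.delete_node'},
 u'message': u'Failed to run action (details elided)',
 u'status': u'FAILED'}
"""


def summarize(payload_text):
    """Extract workflow name, execution ID, nodes, status and message."""
    # literal_eval only evaluates literals, so pasted text is not executed.
    payload = ast.literal_eval(payload_text.strip())
    execution = payload.get("execution", {})
    return {
        "workflow": execution.get("name"),
        "execution_id": execution.get("id"),
        "nodes": execution.get("input", {}).get("nodes", []),
        "status": payload.get("status"),
        "message": payload.get("message", ""),
    }


for key, value in summarize(PAYLOAD_TEXT).items():
    print("{}: {}".format(key, value))
```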

Comment 15 Ana Krivokapic 2016-11-29 14:06:04 UTC
Mistral executor log: https://paste.fedoraproject.org/492949/28204148/
Heat engine log: https://paste.fedoraproject.org/492954/14804283/

Comment 16 Julie Pichon 2016-11-29 15:36:57 UTC
I was able to verify this, using the following packages:
openstack-tripleo-common-5.4.0-2
python-tripleoclient-5.4.0-1

The commands I used are slightly different because I was working with a non-default plan, but they worked: I could delete both the original compute node and the scaled-out ones.

Details:
--------
$ nova list
+--------------------------------------+----------------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                             | Status | Task State | Power State | Networks            |
+--------------------------------------+----------------------------------+--------+------------+-------------+---------------------+
| bfa87c11-d1ee-4e9e-bf39-cc93717747ec | overcloud-ssl-ipv6-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |
| 1b8700f8-fb42-4307-b467-b75e04ae2631 | overcloud-ssl-ipv6-compute-0     | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
| baa55e94-2bc9-4e94-b571-ab5f4f368f25 | overcloud-ssl-ipv6-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |
+--------------------------------------+----------------------------------+--------+------------+-------------+---------------------+

$ openstack action execution run tripleo.parameters.update '{"container":"overcloud-ssl-ipv6", "parameters":{"ComputeCount":3}}'
$ openstack overcloud plan deploy overcloud-ssl-ipv6

$ nova list
+--------------------------------------+----------------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                             | Status | Task State | Power State | Networks            |
+--------------------------------------+----------------------------------+--------+------------+-------------+---------------------+
| bfa87c11-d1ee-4e9e-bf39-cc93717747ec | overcloud-ssl-ipv6-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |
| 1b8700f8-fb42-4307-b467-b75e04ae2631 | overcloud-ssl-ipv6-compute-0     | ACTIVE | -          | Running     | ctlplane=192.0.2.11 |
| aeed24fd-14e3-4c8d-9f58-19d8a041e81e | overcloud-ssl-ipv6-compute-1     | ACTIVE | -          | Running     | ctlplane=192.0.2.10 |
| 6a013bad-55c6-4381-bc17-054836abf747 | overcloud-ssl-ipv6-compute-2     | ACTIVE | -          | Running     | ctlplane=192.0.2.15 |
| baa55e94-2bc9-4e94-b571-ab5f4f368f25 | overcloud-ssl-ipv6-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |
+--------------------------------------+----------------------------------+--------+------------+-------------+---------------------+

$ openstack overcloud node delete --stack overcloud-ssl-ipv6 6a013bad-55c6-4381-bc17-054836abf747
$ openstack overcloud node delete --stack overcloud-ssl-ipv6 1b8700f8-fb42-4307-b467-b75e04ae2631

$ nova list
+--------------------------------------+----------------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                             | Status | Task State | Power State | Networks            |
+--------------------------------------+----------------------------------+--------+------------+-------------+---------------------+
| bfa87c11-d1ee-4e9e-bf39-cc93717747ec | overcloud-ssl-ipv6-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |
| aeed24fd-14e3-4c8d-9f58-19d8a041e81e | overcloud-ssl-ipv6-compute-1     | ACTIVE | -          | Running     | ctlplane=192.0.2.10 |
| baa55e94-2bc9-4e94-b571-ab5f4f368f25 | overcloud-ssl-ipv6-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |
+--------------------------------------+----------------------------------+--------+------------+-------------+---------------------+

$ openstack overcloud node delete --stack overcloud-ssl-ipv6 aeed24fd-14e3-4c8d-9f58-19d8a041e81e
$ openstack overcloud node delete --stack overcloud-ssl-ipv6 bfa87c11-d1ee-4e9e-bf39-cc93717747ec

$ nova list
+--------------------------------------+---------------------------------+--------+------------+-------------+--------------------+
| ID                                   | Name                            | Status | Task State | Power State | Networks           |
+--------------------------------------+---------------------------------+--------+------------+-------------+--------------------+
| baa55e94-2bc9-4e94-b571-ab5f4f368f25 | overcloud-ssl-ipv6-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.9 |
+--------------------------------------+---------------------------------+--------+------------+-------------+--------------------+
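
For anyone repeating this verification, comparing two "nova list" outputs by eye is error-prone. The sketch below is one possible way to diff them; it assumes the ASCII table layout shown above (ID in the first column, Name in the second), and the function names are hypothetical.

```python
def parse_nova_list(table_text):
    """Return {node_id: name} from the ASCII table printed by `nova list`."""
    nodes = {}
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip +----+ border lines and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) < 2 or cells[0] == "ID":
            continue  # skip the header row
        nodes[cells[0]] = cells[1]
    return nodes


def removed_nodes(before_text, after_text):
    """Return {node_id: name} for nodes listed before but not after."""
    before = parse_nova_list(before_text)
    after = parse_nova_list(after_text)
    return {nid: name for nid, name in before.items() if nid not in after}


# Rows copied from the last two listings in this comment.
BEFORE = """
| ID                                   | Name                             | Status | Task State | Power State | Networks            |
| bfa87c11-d1ee-4e9e-bf39-cc93717747ec | overcloud-ssl-ipv6-cephstorage-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.16 |
| aeed24fd-14e3-4c8d-9f58-19d8a041e81e | overcloud-ssl-ipv6-compute-1     | ACTIVE | -          | Running     | ctlplane=192.0.2.10 |
| baa55e94-2bc9-4e94-b571-ab5f4f368f25 | overcloud-ssl-ipv6-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.9  |
"""
AFTER = """
| ID                                   | Name                            | Status | Task State | Power State | Networks           |
| baa55e94-2bc9-4e94-b571-ab5f4f368f25 | overcloud-ssl-ipv6-controller-0 | ACTIVE | -          | Running     | ctlplane=192.0.2.9 |
"""

for node_id, name in sorted(removed_nodes(BEFORE, AFTER).items()):
    print(node_id, name)
```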

Comment 18 errata-xmlrpc 2016-12-14 16:03:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html