Bug 1384246 - Following a scale-down/scale-up operation, all the overcloud got redeployed
Summary: Following a scale-down/scale-up operation, all the overcloud got redeployed
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-tripleoclient
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: urgent
Target Milestone: rc
Target Release: 10.0 (Newton)
Assignee: Brad P. Crochet
QA Contact: Omri Hochman
URL:
Whiteboard:
Depends On:
Blocks: 1384288 1389195 1389196 1397852
 
Reported: 2016-10-12 22:47 UTC by David Hill
Modified: 2017-01-30 04:32 UTC
CC: 20 users

Fixed In Version: python-tripleoclient-5.2.0-5.el7ost
Doc Type: Bug Fix
Doc Text:
Node delete functions used Heat's 'parameters' instead of 'parameter_defaults'. This caused Heat to redeploy some resources, unintentionally redeploying nodes. This fix switches the node delete functions to use only 'parameter_defaults', so Heat resources are correctly left in place and not redeployed. (A sketch of the distinction follows the field list below.)
Clone Of:
Clones: 1397852
Environment:
Last Closed: 2016-12-14 16:17:31 UTC
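
A minimal sketch of the Heat environment distinction named in the Doc Text above. 'parameters' and 'parameter_defaults' are real Heat environment sections; the parameter name and value are illustrative (a removal policy as used by the TripleO heat templates), not the exact payload the client sends:

$ cat > scale-down-env.yaml <<'EOF'
# Per the Doc Text: values sent in 'parameters' caused Heat to redeploy
# some resources on later updates; values sent in 'parameter_defaults'
# leave existing resources in place.
parameter_defaults:
  ComputeRemovalPolicies: [{'resource_list': ['2']}]
EOF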




Links:
Red Hat Product Errata RHEA-2016:2948 (normal, SHIPPED_LIVE): Red Hat OpenStack Platform 10 enhancement update. Last Updated: 2016-12-14 19:55:27 UTC
OpenStack gerrit 390590. Last Updated: 2016-11-14 16:13:44 UTC
OpenStack gerrit 396804. Last Updated: 2016-11-22 17:00:24 UTC

Description David Hill 2016-10-12 22:47:09 UTC
Description of problem:
Following a scale-down/scale-up operation, the entire overcloud was redeployed: every controller and compute node was deleted and redeployed.

Version-Release number of selected component (if applicable):


How reproducible:
Don't know

Steps to Reproduce:
1. Remove a compute node from the cloud using our documented procedure
2. Scale back up to the original size (see the sketch below)
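
A rough sketch of that cycle with the upstream client (the node UUID and the original count are placeholders; --compute-scale was the scale flag of the OSP 7 era client and is an assumption here, not a confirmed part of the documented procedure):

$ export THT=/usr/share/openstack-tripleo-heat-templates/
$ openstack overcloud node delete --stack overcloud --templates $THT <nova-server-uuid>
$ openstack overcloud deploy --templates $THT --compute-scale <original-compute-count>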

Actual results:
The entire overcloud was deleted and redeployed, host by host.

Expected results:
Only the removed node should be affected; the rest of the overcloud should never be redeployed.

Additional info:

Comment 8 James Slagle 2016-10-27 10:16:21 UTC
Brad, can you take a look at this one?

Comment 9 Brad P. Crochet 2016-10-31 14:19:30 UTC
This bug should either be for docs, or moved to OSP 10. Backports for OSP 7 at this point are not warranted.

Comment 10 James Slagle 2016-11-01 15:05:18 UTC
(In reply to Brad P. Crochet from comment #9)
> This bug should either be for docs, or moved to OSP 10. Backports for OSP 7
> at this point are not warranted.

This is for OSP 10, the rhos-10.0 flag has been requested. We should investigate what it would take to implement the last item in comment 5.

Jarda, can you also provide PM input here as to whether this should remain a blocker or not, or if we could just get by with documentation for 10?

Comment 11 Jaromir Coufal 2016-11-01 15:34:07 UTC
The python-tripleoclient fix is high priority but not a blocker for the 10 release. Still, I would like to ask whether we can cover it. Brad, is that possible? I suggest keeping this bug for that effort.

Improving the documentation for OSP 10 is critical (a blocker). I would suggest cloning this bug against documentation and escalating to Derek.

Comment 12 Brad P. Crochet 2016-11-01 15:39:48 UTC
I think this patch, https://review.openstack.org/#/c/390590/, needs to land first. It may do enough to keep the command from exiting immediately. At the very least, I think it needs to land for the scale-down operation to work at all. I will check with rbrady about the status of that patch.

Comment 13 Brad P. Crochet 2016-11-03 12:48:59 UTC
https://review.openstack.org/#/c/390590/ has landed in stable/newton upstream. I am currently testing to see if this resolves the issue fully, and whether further work will be necessary.

Comment 15 Marius Cornea 2016-11-22 16:21:28 UTC
Moving this back to ASSIGNED, since openstack overcloud node delete exits and leaves the stack in UPDATE_IN_PROGRESS status:

python-tripleoclient-5.4.0-1.el7ost.noarch
openstack-tripleo-common-5.4.0-2.el7ost.noarch

[stack@undercloud-0 ~]$ nova list
+--------------------------------------+---------------------------+--------+------------+-------------+-----------------------+
| ID                                   | Name                      | Status | Task State | Power State | Networks              |
+--------------------------------------+---------------------------+--------+------------+-------------+-----------------------+
| 299a0f55-2dde-4f0d-a26a-07b7b5dbb483 | overcloud-cephstorage-0   | ACTIVE | -          | Running     | ctlplane=192.168.0.12 |
| 0d020586-c035-492b-99de-1817e571a817 | overcloud-cephstorage-1   | ACTIVE | -          | Running     | ctlplane=192.168.0.25 |
| 1be07410-5cfd-4d22-875b-114aa8dd6c48 | overcloud-compute-0       | ACTIVE | -          | Running     | ctlplane=192.168.0.11 |
| 831b2fa2-0563-4938-a222-c09f07c896c7 | overcloud-compute-2       | ACTIVE | -          | Running     | ctlplane=192.168.0.27 |
| db1736da-7fcf-4f49-a6e6-624bc47e2b28 | overcloud-controller-0    | ACTIVE | -          | Running     | ctlplane=192.168.0.13 |
| 60b55e3a-a6de-45ee-90e7-6b7fde4a15eb | overcloud-controller-1    | ACTIVE | -          | Running     | ctlplane=192.168.0.29 |
| 7d292db1-fa73-4069-81e5-d2d96641f0b9 | overcloud-controller-2    | ACTIVE | -          | Running     | ctlplane=192.168.0.26 |
| cb48120f-7835-4893-beb2-30519d0eff31 | overcloud-networker-0     | ACTIVE | -          | Running     | ctlplane=192.168.0.21 |
| 01043cf0-6a9d-4130-8cfc-d61daea748aa | overcloud-networker-1     | ACTIVE | -          | Running     | ctlplane=192.168.0.17 |
| 69470000-978a-4b4d-8639-b4b11377789b | overcloud-objectstorage-0 | ACTIVE | -          | Running     | ctlplane=192.168.0.19 |
| 0235c588-da0a-4d0b-9e75-2dbb8403b457 | overcloud-serviceapi-0    | ACTIVE | -          | Running     | ctlplane=192.168.0.20 |
| c4d98af2-e939-44e0-8437-bbe9a147f4eb | overcloud-serviceapi-1    | ACTIVE | -          | Running     | ctlplane=192.168.0.23 |
+--------------------------------------+---------------------------+--------+------------+-------------+-----------------------+

[stack@undercloud-0 ~]$ export THT=/usr/share/openstack-tripleo-heat-templates/
[stack@undercloud-0 ~]$ openstack overcloud node delete --stack overcloud --templates $THT 831b2fa2-0563-4938-a222-c09f07c896c7
deleting nodes [u'831b2fa2-0563-4938-a222-c09f07c896c7'] from stack overcloud
Started Mistral Workflow. Execution ID: e5464a83-a30e-4228-9d6a-ff017225b1f8
[stack@undercloud-0 ~]$ openstack stack list
+--------------------------------------+------------+--------------------+----------------------+----------------------+
| ID                                   | Stack Name | Stack Status       | Creation Time        | Updated Time         |
+--------------------------------------+------------+--------------------+----------------------+----------------------+
| 9e02e555-1a8b-4b41-984e-a778d2a0291e | overcloud  | UPDATE_IN_PROGRESS | 2016-11-22T11:59:51Z | 2016-11-22T16:18:32Z |
+--------------------------------------+------------+--------------------+----------------------+----------------------+
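
For anyone triaging a stack stuck like this, the Mistral execution the client started can be inspected with the mistralclient OSC plugin:

$ openstack workflow execution show e5464a83-a30e-4228-9d6a-ff017225b1f8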

Comment 16 Brad P. Crochet 2016-11-22 17:00:25 UTC
I believe this patch (https://review.openstack.org/#/c/396804/, linked above) also addresses this problem.

Comment 19 Marius Cornea 2016-11-23 13:17:44 UTC
I'm moving this to VERIFIED, as I was unable to reproduce the nodes being redeployed when scaling down and then scaling the overcloud back out.

I filed BZ#1397852 to track the issue of python-tripleoclient exiting immediately when the openstack overcloud node delete command is run.

Comment 21 errata-xmlrpc 2016-12-14 16:17:31 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2016-2948.html

