Bug 1476233

Summary: [RFE] Increment for overcloud nodes
Product: Red Hat OpenStack
Reporter: Dmitry Shevrin <dshevrin>
Component: openstack-tripleo
Assignee: James Slagle <jslagle>
Status: CLOSED WONTFIX
QA Contact: Arik Chernetsky <achernet>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: 10.0 (Newton)
CC: aschultz, jslagle, mburns, ramishra, rbrady, rhel-osp-director-maint, sbaker, sclewis, shardy, srevivo
Target Milestone: ---
Keywords: FutureFeature, Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-01-11 15:00:54 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Dmitry Shevrin 2017-07-28 11:27:04 UTC
Description of problem:
The overcloud node index always increments; it is never decremented or reused when scaling down or deleting a node.

Version-Release number of selected component (if applicable):
10.0

How reproducible:
In a deployed overcloud, try to reuse freed node indexes, such as overcloud-controller-NN.

Steps to Reproduce:
1. Deploy an overcloud
2. Remove a node from the configuration
3. Try to reuse the index number that belonged to this host

Actual results:
TripleO skips this number

Expected results:
TripleO re-uses this number

Additional info:

Comment 1 James Slagle 2017-08-08 18:33:20 UTC
Fundamentally, this is the nature of how Heat ResourceGroups scale up and down. It would have to be addressed in Heat if this were to be fixed.

Comment 2 Rabi Mishra 2017-08-10 08:40:31 UTC
AFAIK, TripleO marks the resources to be removed as blacklisted in the RG (when doing an overcloud node delete).

Heat has something called 'resource-mark-unhealthy', which marks the resource (controller or compute index) as CHECK_FAILED so that it is replaced in the next update.

Probably this can be leveraged by TripleO rather than blacklisting resources, though I don't know if there are any other implications for TripleO when a new node uses an old index.
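
The blacklisting behaviour described above can be illustrated with a small model. This is a simplified sketch, not Heat's actual code: it assumes that group members are named by the lowest non-negative indexes that are not blacklisted, and `member_indexes` is a hypothetical helper introduced only for this illustration.

```python
from itertools import count, islice


def member_indexes(size, blacklist):
    """Return the first `size` indexes that are not blacklisted.

    Hypothetical helper modelling a ResourceGroup-like allocator;
    it is not part of Heat or TripleO.
    """
    candidates = (i for i in count() if i not in blacklist)
    return list(islice(candidates, size))


# Three controllers deployed: indexes 0, 1 and 2.
before = member_indexes(3, blacklist=set())

# Deleting node 1 blacklists its index and reduces the count to 2.
after_delete = member_indexes(2, blacklist={1})

# Scaling back up to 3 skips the blacklisted index and allocates 3.
after_scale_up = member_indexes(3, blacklist={1})

print(before, after_delete, after_scale_up)
```

Under this assumption a blacklisted index is never handed out again, which matches the reported behaviour: the freed number is skipped and the next higher one is used.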

Comment 4 Rabi Mishra 2017-08-11 06:39:06 UTC
I forgot to mention that there is one drawback of using mark-unhealthy with RG though:

If you have 5 nodes, and you mark 'node-2' as unhealthy and then reduce the count (size) to 4, expecting that 'node-2' would be removed, it would instead replace 'node-2' with a new node and chop off 'node-5' from the top, which may be unacceptable.
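
The drawback above can be sketched with a toy model. It is only an illustration under stated assumptions, not Heat's implementation: members are named 'node-1' to 'node-N' as in the example, nothing is blacklisted, an unhealthy member is replaced in place, and a smaller size drops the highest-numbered names. `update_group` is a hypothetical function.

```python
def update_group(members, unhealthy, new_size):
    """Model one group update with no blacklist (hypothetical, not Heat code).

    `members` maps a node name to a generation counter; a replaced node
    keeps its name but gets a higher generation.
    """
    kept_names = [f"node-{i}" for i in range(1, new_size + 1)]
    # Names beyond the new size are removed from the top of the group.
    removed = sorted(set(members) - set(kept_names))
    # Unhealthy members that are still in range are rebuilt in place.
    replaced = [name for name in kept_names if name in unhealthy]
    updated = {}
    for name in kept_names:
        generation = members.get(name, 0)
        updated[name] = generation + 1 if name in unhealthy else generation
    return updated, replaced, removed


members = {f"node-{i}": 0 for i in range(1, 6)}
updated, replaced, removed = update_group(
    members, unhealthy={"node-2"}, new_size=4
)
print(replaced, removed)
```

In this model 'node-2' is rebuilt under the same name and 'node-5' is the node that disappears, which is the outcome the operator did not ask for.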

Comment 6 Steven Hardy 2017-08-11 09:15:25 UTC
This is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1426563 - a procedure was tested to enable use of the heat mark-unhealthy feature to do an in-place node replacement that reuses the same IPs etc, but the decision was made to not document that process for general use.

Maybe we need to revisit that discussion, but if the decision is the same, this might be considered a duplicate of that earlier bug, because the observed behavior is expected.

Comment 14 Alex Schultz 2019-01-11 15:00:54 UTC
Given the discussion, this isn't something we can implement. Closing as WONTFIX.