Bug 1476233 - [RFE] Increment for overcloud nodes
Status: NEW
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo
Version: 10.0 (Newton)
Hardware: Unspecified Unspecified
Priority: unspecified, Severity: unspecified
Assigned To: James Slagle
QA Contact: Arik Chernetsky
Keywords: FutureFeature, Triaged
Reported: 2017-07-28 07:27 EDT by Dmitry Shevrin
Modified: 2018-05-24 21:57 EDT

Type: Bug


Description Dmitry Shevrin 2017-07-28 07:27:04 EDT
Description of problem:
The overcloud node index always increments; it is never decremented or reused when scaling down or deleting a node.

Version-Release number of selected component (if applicable):
10.0

How reproducible:
In a deployed overcloud, try to reuse freed node indexes, such as overcloud-controller-NN

Steps to Reproduce:
1. Deploy an overcloud
2. Remove a node from the configuration
3. Try to reuse the index number that belonged to this node

Actual results:
TripleO skips this number

Expected results:
TripleO re-uses this number

Additional info:
Comment 1 James Slagle 2017-08-08 14:33:20 EDT
Fundamentally, this is the nature of how Heat ResourceGroups scale up and down. It would have to be addressed in Heat if this were to be fixed.
Comment 2 Rabi Mishra 2017-08-10 04:40:31 EDT
AFAIK, TripleO marks the resources to be removed as blacklisted in the ResourceGroup (RG) when doing an overcloud node delete.

Heat has something called 'resource-mark-unhealthy', which marks the resource (controller or compute index) as CHECK_FAILED so that it is replaced in the next update.

Probably this can be leveraged by TripleO rather than blacklisting resources, though I don't know if there are any other implications for TripleO when a new node uses an old index.
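
To make the difference between the two approaches concrete, here is a minimal Python sketch. It is a simplified model and not Heat's code: it assumes a ResourceGroup fills its member list with the lowest integer indexes that are not blacklisted, and the function name `group_member_indexes` is hypothetical.

```python
from itertools import count, islice


def group_member_indexes(size, blacklist=frozenset()):
    """Return the first `size` indexes that are not blacklisted.

    Assumption: this mirrors, in simplified form, how a ResourceGroup
    chooses member names; it is an illustration only.
    """
    candidates = (i for i in count() if i not in blacklist)
    return list(islice(candidates, size))


# Deploy three controllers: indexes 0, 1 and 2.
initial = group_member_indexes(3)

# Blacklisting approach: delete node 1, so index 1 is blacklisted
# and the count drops to 2.
after_delete = group_member_indexes(2, blacklist={1})

# Scale back up to 3: the new node gets index 3, not the freed index 1.
after_scale_up = group_member_indexes(3, blacklist={1})

# Mark-unhealthy approach: nothing is blacklisted, so index 1 stays in
# the group and only the node behind it is rebuilt on the next update.
after_replace = group_member_indexes(3)

print(initial, after_delete, after_scale_up, after_replace)
```

In this model a blacklisted index is never handed out again, which matches the skipped number reported in this bug, while a member marked unhealthy keeps its index.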
Comment 4 Rabi Mishra 2017-08-11 02:39:06 EDT
I forgot to mention that there is one drawback of using mark-unhealthy with RG though:

If you have 5 nodes, mark 'node-2' as unhealthy, and then reduce the count (size) to 4, expecting 'node-2' to be removed, it would instead replace 'node-2' with a new node and chop off 'node-5' from the top, which may be unacceptable.
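
The drawback described above can be illustrated with a small Python sketch. It is a simplified model under stated assumptions, not Heat's implementation: the group keeps the lowest `new_count` indexes, rebuilds any kept member that is marked unhealthy, and removes the rest. The function name `update_group` is hypothetical, and the sketch only covers scaling down.

```python
def update_group(indexes, new_count, unhealthy=frozenset()):
    """Model one group update that shrinks the group to `new_count`.

    Returns (kept, replaced, removed) as lists of indexes.
    Assumption: members are trimmed from the highest index down, and
    unhealthy members that are kept are replaced in place.
    """
    ordered = sorted(indexes)
    kept = ordered[:new_count]       # lowest indexes survive
    removed = ordered[new_count:]    # highest indexes are chopped off
    replaced = [i for i in kept if i in unhealthy]
    return kept, replaced, removed


# Five nodes, node-2 marked unhealthy, count reduced from 5 to 4.
kept, replaced, removed = update_group([1, 2, 3, 4, 5], new_count=4, unhealthy={2})

print("kept:    ", [f"node-{i}" for i in kept])
print("replaced:", [f"node-{i}" for i in replaced])
print("removed: ", [f"node-{i}" for i in removed])
```

Under these assumptions 'node-2' is rebuilt rather than removed, and the healthy 'node-5' is the one that disappears.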
Comment 6 Steven Hardy 2017-08-11 05:15:25 EDT
This is similar to https://bugzilla.redhat.com/show_bug.cgi?id=1426563 - a procedure was tested to enable use of the heat mark-unhealthy feature to do an in-place node replacement that reuses the same IPs etc, but the decision was made to not document that process for general use.

Maybe we need to revisit that discussion, but if the decision is the same, this might be considered a duplicate of that earlier bug because the observed behavior is basically expected.
