Bug 1396255

Summary: [RFE] Specify specific node for decommissioning in Heat
Product: Red Hat OpenStack Reporter: jomurphy
Component: rhosp-director Assignee: Jeff Brown <jefbrown>
Status: CLOSED CURRENTRELEASE QA Contact: Yogev Rabl <yrabl>
Severity: unspecified Docs Contact:
Priority: unspecified    
Version: 10.0 (Newton) CC: acanan, dbecker, gcharot, gfidente, jefbrown, johfulto, jomurphy, mburns, morazi, sbaker, scohen, seb, shan, shardy, srevivo, yrabl
Target Milestone: --- Keywords: FutureFeature
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of: 1298768 Environment:
Last Closed: 2021-01-15 12:44:31 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1298768    
Bug Blocks: 1387431, 1396252, 1414467, 1425155    

Comment 1 Jeff Brown 2016-11-17 19:47:23 UTC
Today it is not possible to decommission specific nodes with OSPd.  We need that capability to correctly scale down storage when required.  The capability is needed beyond storage, but this bug specifically addresses the Ceph DFG needs.

Comment 2 Giulio Fidente 2016-11-29 10:10:39 UTC
The end goal for the Ceph DFG would be to delete a storage node without disruption.

There are two scenarios we'd need to cover:

a) the storage node went down and can't be recovered
b) the storage node is purposely deleted

It seems to me that for both of these scenarios we could delete the node from the stack using a command like the one we document for the compute nodes [ref1]; is this correct?

Before deleting the node from the stack, though, the user needs to execute some manual steps to clean up (scenario A) or quiesce (scenario B) the pre-existing storage node, similarly to what is documented for the compute nodes [ref2].

To fully automate the process we'll need to be able to:

1) trigger a command execution on DELETE, before the resource is actually deleted
2) for scenario A specifically (where the node goes down without notice), execute commands on a node different from the one targeted for deletion

ref1. http://tripleo.org/post_deployment/delete_nodes.html
ref2. http://tripleo.org/post_deployment/quiesce_compute.html#quiesce-compute
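As an illustration of the quiesce step in scenario B, the standard manual procedure for draining and removing a single Ceph OSD from the node might look like the following sketch (the OSD id 4 and the systemd-managed ceph-osd service are assumptions for the example, not details taken from this bug):

```shell
# Hypothetical example: drain and remove OSD 4 before deleting its host node.
ceph osd out 4                  # stop placing new data on the OSD; Ceph rebalances
# wait until "ceph -s" reports all placement groups active+clean
systemctl stop ceph-osd@4       # stop the OSD daemon on the storage node
ceph osd crush remove osd.4     # remove the OSD from the CRUSH map
ceph auth del osd.4             # delete its authentication key
ceph osd rm 4                   # remove the OSD from the cluster map
```

Only once the node's OSDs have been drained and removed like this would it be safe to delete the node itself from the stack.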

Comment 8 Giulio Fidente 2021-01-15 12:44:31 UTC
On scale-down, the node to remove can be passed to the "overcloud node delete" command:

$ openstack overcloud node delete $nova_node_id
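For example, the Nova server ID of the node can be looked up first and then passed to the delete command (the node name "overcloud-cephstorage-1" and the stack name "overcloud" below are illustrative assumptions, not taken from this bug):

```shell
# Hypothetical node name; find its Nova server ID from the server list.
nova_node_id=$(openstack server list -f value -c ID -c Name \
               | awk '/overcloud-cephstorage-1/ {print $1}')

# Remove that specific node from the overcloud stack.
openstack overcloud node delete --stack overcloud "$nova_node_id"
```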