Bug 1102885

Summary: Storage node undeploy process should work with down nodes
Product: [Other] RHQ Project
Reporter: John Sanda <jsanda>
Component: Core Server, Plugins, Storage Node
Assignee: Nobody <nobody>
Status: NEW
Severity: high
Priority: unspecified
Version: 4.10, 4.11
CC: hrupp
Hardware: Unspecified
OS: Unspecified
Doc Type: Bug Fix
Type: Bug
Bug Blocks: 1102887

Description John Sanda 2014-05-29 18:40:08 UTC
Description of problem:
The undeploy process does several things:

* Decommissions the node from the cluster such that
  * Its data is redistributed to other nodes 
  * Cluster nodes will no longer communicate with said node
* The node is shut down
* The rhq-storage and rhq-data directories are purged from disk
* The storage node resource is removed from inventory
* The storage node entity, i.e., row in rhq_storage_node table, is deleted

If the node is down, the decommission operation fails and the undeploy process cannot complete. Yet the node being down may be the very reason for wanting to undeploy it in the first place.
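
To illustrate why a down node blocks everything, here is a rough Java sketch of a sequence that stops at the first failed step. UndeploySequence, Step, and run are hypothetical names that paraphrase the list above; they are not RHQ classes, and the real undeploy code may be structured differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical model of the undeploy steps; names are illustrative, not RHQ identifiers.
class UndeploySequence {

    // One entry per step in the list above, in execution order.
    enum Step { DECOMMISSION, SHUTDOWN, PURGE_DIRECTORIES, UNINVENTORY_RESOURCE, DELETE_ENTITY }

    // Runs the steps in order and returns the ones that completed.
    // A step with no action, or whose action reports false, halts the sequence.
    static List<Step> run(Map<Step, BooleanSupplier> actions) {
        List<Step> completed = new ArrayList<>();
        for (Step step : Step.values()) {
            BooleanSupplier action = actions.get(step);
            if (action == null || !action.getAsBoolean()) {
                break; // later steps never run
            }
            completed.add(step);
        }
        return completed;
    }
}
```

With a DECOMMISSION action that reports failure, run returns an empty list: the node is never purged, uninventoried, or deleted from rhq_storage_node.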

I have also seen users do the following when they want to get rid of a storage node:

1. Stop the storage node
2. Remove the storage node resource from inventory
3. (Optionally) Delete the rhq-storage directory

Once again, we cannot decommission the node, so the undeploy process cannot complete. The decommission operation has to be run on the node being removed. If we cannot do that, then we should fall back to calling StorageServiceMBean.removeNode(String hostIdString) from a running node; it can be called on any node in the cluster. If the removeNode() operation fails, then we should fall back to StorageServiceMBean.forceRemoveCompletion() as a last resort.
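
A minimal sketch of the proposed fallback order, in Java. NodeRemovalFallback and ClusterOps are hypothetical names, not existing RHQ classes; the sketch assumes thin wrappers around the JMX calls that report failure by throwing a RuntimeException. Only removeNode(String hostIdString) and forceRemoveCompletion() are taken from StorageServiceMBean as named above.

```java
// Hypothetical sketch; NodeRemovalFallback and ClusterOps are not RHQ classes.
class NodeRemovalFallback {

    /** Assumed wrappers around the JMX calls; a failure surfaces as a RuntimeException. */
    interface ClusterOps {
        /** Decommission, which has to run on the node being removed. */
        void decommission(String targetAddress);

        /** StorageServiceMBean.removeNode(String hostIdString), invoked on a running node. */
        void removeNode(String hostId);

        /** StorageServiceMBean.forceRemoveCompletion(), the last resort. */
        void forceRemoveCompletion();
    }

    /** Tries each removal path in order and returns the name of the one that succeeded. */
    static String removeFromCluster(ClusterOps ops, String targetAddress, String hostId) {
        try {
            ops.decommission(targetAddress);
            return "decommission";
        } catch (RuntimeException e) {
            // Target node is down or unreachable; try from a running node instead.
        }
        try {
            ops.removeNode(hostId);
            return "removeNode";
        } catch (RuntimeException e) {
            // removeNode failed; fall through to the last resort.
        }
        // If this also throws, the caller sees the failure and the undeploy stops.
        ops.forceRemoveCompletion();
        return "forceRemoveCompletion";
    }
}
```

Returning the name of the path that succeeded is only a convenience for logging; the remaining undeploy steps would proceed the same way whichever path removed the node.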

