Bug 1002174 - uninventory storage node for stopped agent destabilizes system and leads to Administration-> Topology -> Storage Nodes page dysfunction
Status: CLOSED CURRENTRELEASE
Product: RHQ Project
Classification: Other
Component: Core UI
Version: 4.9
Hardware: All
OS: Linux
Priority: unspecified
Severity: urgent
Target Milestone: ---
Target Release: RHQ 4.9
Assigned To: Jirka Kremser
QA Contact: Mike Foley
Depends On:
Blocks: 951619
Reported: 2013-08-28 10:48 EDT by Armine Hovsepyan
Modified: 2015-09-02 20:01 EDT

See Also:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2013-09-24 15:08:50 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


Attachments
cass-server-uninventory.png (172.97 KB, image/png)
2013-09-02 12:15 EDT, Armine Hovsepyan
platform-uninventory.png (111.15 KB, image/png)
2013-09-02 12:15 EDT, Armine Hovsepyan
storage-node-uninventory.png (173.25 KB, image/png)
2013-09-02 12:16 EDT, Armine Hovsepyan

Description Armine Hovsepyan 2013-08-28 10:48:35 EDT
clone of bug #976741
 Armine 2013-06-21 06:48:38 EDT

Description of problem:
rhq storage node data model - record is not being removed from db and ui after uninventory

Version-Release number of selected component (if applicable):
rhq 4.8 build e59e69d

How reproducible:
always

Steps to Reproduce:
1. install rhq 4.8 with storage  (ip1)
2. install agent and storage on another vm  (ip2)
3. stop agent on ip2 and uninventory platform from inventory list

Actual results:
After step 2, an exception is visible at the top of the page -> http://d.pr/i/4z6L
After step 3, the removed storage node is still visible in Administration -> Topology -> Storage Nodes, without a resource id; clicking on it leads to an exception
The removed storage node's details are still visible in the rhq_storage_nodes table, without a resource id --> http://d.pr/i/gNr0

Expected results:
After step 2, storage node data is visible under Administration -> Topology -> Storage Nodes without exceptions -- http://d.pr/i/6hgR
After step 3, the removed storage node is no longer listed in Administration -> Topology -> Storage Nodes
The removed storage node's details are removed from the rhq_storage_nodes table

Additional info:

Armine 2013-06-21 09:16:21 EDT
Blocks: 951619
Comment 1 John Sanda 2013-06-21 09:41:15 EDT

This is not fully implemented yet. Removing a storage node from inventory is going to have to do several things including,

* Remove the resource hierarchy from the database
* Remove the storage node entity from the database
* Remove the node from the Cassandra cluster

The last one is the tricky part. There are JMX operations we want to invoke to let other nodes in the cluster know that we are permanently removing the node. Then, depending on the cluster size, we may have to change the replication factor for the cluster and perform maintenance to make sure data is where it belongs in the cluster.
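
To make the ordering concrete, here is a minimal Java sketch of how the removal steps listed above could be sequenced. It is not RHQ code: the class name, the step strings, the ordering (cluster first, database bookkeeping last), and especially the replication-factor rule (cap at 3, never above the node count) are assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch only: none of these names are RHQ or Cassandra APIs.
 * It orders the removal steps and decides whether the replication factor
 * needs to change once the node is gone.
 */
final class StorageNodeRemovalPlan {

    /** ASSUMED rule for illustration: replication factor is capped at 3 and never exceeds the node count. */
    static int replicationFactorFor(int clusterSize) {
        if (clusterSize < 1) {
            throw new IllegalArgumentException("cluster must keep at least one node");
        }
        return Math.min(clusterSize, 3);
    }

    /** Builds the ordered list of steps for removing one node from a cluster of the given size. */
    static List<String> plan(int currentClusterSize) {
        if (currentClusterSize < 2) {
            throw new IllegalArgumentException("cannot remove the last storage node");
        }
        int oldRf = replicationFactorFor(currentClusterSize);
        int newRf = replicationFactorFor(currentClusterSize - 1);

        List<String> steps = new ArrayList<>();
        // Assumed ordering: the node leaves the cluster before its records are deleted.
        steps.add("announce permanent removal to the other nodes (JMX)");
        if (newRf != oldRf) {
            steps.add("change replication factor " + oldRf + " -> " + newRf);
            steps.add("run cluster maintenance so data is where it belongs");
        }
        steps.add("remove the storage node entity from the database");
        steps.add("remove the resource hierarchy from the database");
        return steps;
    }

    public static void main(String[] args) {
        // Removing one node from a two-node cluster forces a replication factor change.
        plan(2).forEach(System.out::println);
    }
}
```

Under these assumed rules, shrinking a two-node cluster adds the replication factor and maintenance steps, while shrinking a five-node cluster does not.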

Comment 2 Charles Crouch 2013-07-01 15:58:59 EDT

Per 7/1 BZ triage: target jon32

This needs to be implemented as part of the JON3.2 Beta/RHQ4.9

Priority: unspecified → high
Target Release: --- → JON 3.2.0
Comment 3 John Sanda 2013-08-22 21:37:22 EDT

Support for undeployment is available with master builds. https://docs.jboss.org/author/display/RHQ/Deploying+Storage+Nodes provides the details of what is involved in the process. Moving to ON_QA.

Status: NEW → ON_QA
Assignee: rhq-maint@redhat.com → jsanda@redhat.com
Target Milestone: --- → GA
Comment 1 Armine Hovsepyan 2013-08-28 10:48:57 EDT

Stopping agent on IP2 and uninventorying platform from server gui leads to Administration-> Topology -> Storage Nodes page dysfunction as well as destabilization of system.
Comment 3 Larry O'Leary 2013-08-29 09:48:43 EDT
As reported in 1002236 this issue also occurs without removing the platform. In the case of 1002236 it may still be related to the storage node not being in inventory. The agent startup and discovery was delayed.
Comment 4 Jirka Kremser 2013-09-02 07:04:33 EDT
branch:  master
link:    http://git.fedorahosted.org/cgit/rhq/rhq.git/commit/?id=56e34b687
time:    2013-09-02 13:00:07 +0200
commit:  56e34b687112afad9b6e972dab37ed774736adb3
author:  Jirka Kremser - jkremser@redhat.com
message: [BZ 1002174] - uninventory storage node for stopped agent destabilizes
         system and leads to Administration-> Topology -> Storage Nodes
         page dysfunction - Adding yet another confirmation box when
         uninventorying the platform or storage node.

In the notification we inform the user that they should run the undeploy operation on the storage node first.
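
The shape of such a check can be sketched in Java as follows. This is not the code from the commit above: the class, the enum, the `storageNodeStillDeployed` flag, and the message text are all hypothetical, and the sketch assumes the UI can tell whether a storage node is still deployed on the resource being uninventoried.

```java
import java.util.Optional;

/**
 * Hypothetical sketch only, not the committed RHQ change: decides whether an
 * extra confirmation should be shown before a resource is uninventoried.
 */
final class UninventoryGuard {

    enum ResourceKind { PLATFORM, STORAGE_NODE, OTHER }

    /** Returns a warning to confirm, or empty if the uninventory can proceed without one. */
    static Optional<String> warningFor(ResourceKind kind, boolean storageNodeStillDeployed) {
        boolean affectsStorageNode =
            kind == ResourceKind.PLATFORM || kind == ResourceKind.STORAGE_NODE;
        if (!affectsStorageNode || !storageNodeStillDeployed) {
            return Optional.empty();
        }
        // Assumed wording; the real notification text is not quoted in this report.
        return Optional.of(
            "A storage node is still deployed here. Run the undeploy operation "
                + "on the storage node first. Uninventory anyway?");
    }

    public static void main(String[] args) {
        warningFor(ResourceKind.PLATFORM, true).ifPresent(System.out::println);
    }
}
```

Returning an `Optional` keeps the decision separate from how the confirmation box is rendered, so the same check could back both the platform and the storage node uninventory paths.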
Comment 5 Armine Hovsepyan 2013-09-02 12:15:01 EDT
verified, thank you.

screen-shots attached
Comment 6 Armine Hovsepyan 2013-09-02 12:15:37 EDT
Created attachment 792925 [details]
cass-server-uninventory.png
Comment 7 Armine Hovsepyan 2013-09-02 12:15:52 EDT
Created attachment 792926 [details]
platform-uninventory.png
Comment 8 Armine Hovsepyan 2013-09-02 12:16:58 EDT
Created attachment 792927 [details]
storage-node-uninventory.png
Comment 9 Heiko W. Rupp 2013-09-24 15:08:50 EDT
Bulk closing of RHQ 4.9 verified items
