Bug 1084056

Summary: Storage node has internal server metrics "Anti Entropy Sessions" marked as unavailable
Product: [JBoss] JBoss Operations Network Reporter: bkramer <bkramer>
Component: Storage Node Assignee: Stefan Negrea <snegrea>
Status: CLOSED CURRENTRELEASE QA Contact: Mike Foley <mfoley>
Severity: medium Docs Contact:
Priority: unspecified    
Version: JON 3.2 CC: jshaughn, loleary, snegrea
Target Milestone: ER04   
Target Release: JON 3.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Bug Fix
Doc Text:
It was discovered that the "Anti Entropy Sessions" Internal Server Metric statistic was being monitored when it should logically have been excluded. The Anti Entropy Sessions metric was discovered only after a Repair operation was initiated, and was marked as DOWN when the JON server was restarted until another Repair operation was run. The fix adds a missing policy to the "Anti Entropy Sessions" type (a single bean), which allows users to configure the behaviour of a missing resource after a Cassandra or Storage Node server restart. With this policy, users can change the default (DOWN) conversion of MISSING to something better suited to their use case. The bean is particularly useful for long-running repair jobs, because it provides important telemetry on the progress of the repair job. Even if the bean disappears after a C* restart, it becomes visible to JON again as soon as a repair job is invoked. Users can now collect Anti Entropy metrics correctly between server restarts.
Story Points: ---
Clone Of: Environment:
Last Closed: 2014-12-11 14:04:53 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On:    
Bug Blocks: 1084059    

Description bkramer 2014-04-03 14:04:08 UTC
Description of problem:
When JBoss ON is first installed and started, the *Anti Entropy Sessions* resource is not discovered and is not listed under the Storage Node's Internal Server Metrics subfolder. However, executing the *Repair* operation generates *Anti Entropy Sessions*, which is discovered after the JBoss ON Agent's full discovery. The status of this metric is then UP.

If we stop the storage node and start it again, the previously discovered *Anti Entropy Sessions* resource no longer exists, so it is marked as DOWN in the JBoss ON UI until the next "Repair" operation...

Version-Release number of selected component (if applicable):
JBoss ON 3.2

How reproducible:
Always

Steps to Reproduce:
1. Install JBoss ON 3.2;
2. In the JBoss ON UI, navigate to Inventory -> Servers -> RHQ Storage Node -> Internal Server Metrics and confirm that *Anti Entropy Sessions* is not listed;
3. For this storage node, execute the "Repair" operation;
4. Execute "discovery -f" on the JBoss ON Agent command line;
5. Execute "avail --force" on the JBoss ON Agent command line;
6. Navigate as before to Internal Server Metrics for this storage node and confirm that *Anti Entropy Sessions* is discovered and UP;
7. Stop the storage node: ./rhqctl stop --storage;
8. Start the storage node: ./rhqctl start --storage;
9. Navigate to Internal Server Metrics for this storage node and confirm that the *Anti Entropy Sessions* resource is DOWN.

Actual results:
The Anti Entropy Sessions resource is discovered and monitored.

Expected results:
We should not monitor the Anti Entropy Sessions resource, as it will go down after every storage node restart.

Additional info:

Comment 1 Stefan Negrea 2014-09-22 17:20:29 UTC
*** Bug 1084055 has been marked as a duplicate of this bug. ***

Comment 2 Jay Shaughnessy 2014-09-22 20:13:43 UTC
Release/jon3.2.x commit 93a3cfa3650dfe52b9f022973244404cff96d325
Author: Stefan Negrea <snegrea@redhat.com>
Date:   Mon Sep 22 12:21:47 2014 -0500

    (cherry picked from commit 2dbe6c57daaae87b5ce6890ba0da22c5976c6a7d)
    Signed-off-by: Jay Shaughnessy <jshaughn@redhat.com>

Comment 3 Stefan Negrea 2014-09-22 20:35:18 UTC
Added a missing policy to the "Anti Entropy Sessions" type (a single bean) to allow users to configure the behaviour of a missing resource after a Cassandra or Storage Node server restart. With the missing policy, users now have the option to change the default (DOWN) conversion of MISSING to something better suited to their use case.

There is a lot of value in having this bean in inventory. For long-running repair jobs, it provides important telemetry on the progress of the repair job. Even if the bean disappears after a C* restart, it will be visible again to JON as soon as a repair job is invoked.
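
For illustration only, the behaviour described in this comment can be modelled roughly as follows. This is a toy sketch, not the plugin code: the class and function names are hypothetical, and the IGNORE and UNINVENTORY options (and the states they map to) are assumptions chosen to show what "something more suited" might look like. Only the DOWN default comes from this report. The model covers a resource that is already in inventory.

```python
from enum import Enum


class MissingPolicy(Enum):
    """How a resource that can no longer be found is reported.

    DOWN is the default named in this report; IGNORE and UNINVENTORY
    are assumed alternatives used only for illustration.
    """
    DOWN = "DOWN"
    IGNORE = "IGNORE"
    UNINVENTORY = "UNINVENTORY"


class StorageNodeModel:
    """Toy model: the Anti Entropy Sessions bean exists only after a repair."""

    def __init__(self):
        self.bean_registered = False

    def repair(self):
        # Running a repair makes the bean available.
        self.bean_registered = True

    def restart(self):
        # A restart drops the bean until the next repair.
        self.bean_registered = False


def reported_state(node, policy=MissingPolicy.DOWN):
    """Return the state shown for the bean's resource (hypothetical labels)."""
    if node.bean_registered:
        return "UP"
    # The bean is MISSING; the policy decides what that is converted to.
    return {
        MissingPolicy.DOWN: "DOWN",
        MissingPolicy.IGNORE: "IGNORED",
        MissingPolicy.UNINVENTORY: "UNINVENTORIED",
    }[policy]


node = StorageNodeModel()
node.repair()
print(reported_state(node))                        # UP
node.restart()
print(reported_state(node))                        # DOWN (default policy)
print(reported_state(node, MissingPolicy.IGNORE))  # IGNORED
node.repair()
print(reported_state(node))                        # UP
```

With the default policy the sketch reproduces the reported symptom (DOWN after a restart until the next repair); choosing another policy changes only how the MISSING state is presented, not the lifecycle of the bean itself.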

Comment 4 Simeon Pinder 2014-10-01 21:33:35 UTC
Moving to ON_QA as available for test with build:
https://brewweb.devel.redhat.com/buildinfo?buildID=388959