Bug 1084056 - Storage node has internal server metrics "Anti Entropy Sessions" marked as unavailable
Status: CLOSED CURRENTRELEASE
Alias: None
Product: JBoss Operations Network
Classification: JBoss
Component: Storage Node
Version: JON 3.2
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ER04
Target Release: JON 3.3.0
Assignee: Stefan Negrea
QA Contact: Mike Foley
URL:
Whiteboard:
Keywords:
Duplicates: 1084055
Depends On:
Blocks: 1084059
Reported: 2014-04-03 14:04 UTC by bkramer
Modified: 2018-12-04 17:51 UTC (History)

It was discovered that the "Anti Entropy Sessions" Internal Server Metric statistic was being monitored when it should logically have been excluded. The Anti Entropy Sessions metric was discovered only after a Repair operation was initiated, and after a storage node restart it was marked as DOWN until another Repair operation was run. The fix adds a missing policy to the "Anti Entropy Sessions" type (a single bean), which allows users to configure the behaviour of a missing resource after a Cassandra or Storage Node server restart. With this policy, users can change the default conversion of MISSING (to DOWN) to something better suited to their use case. The bean is particularly useful for long-running repair jobs, because it provides important telemetry on the progress of the repair job. Even if the bean disappears after a C* restart, it becomes visible to JON again as soon as a repair job is invoked. Users can now collect Anti Entropy metrics correctly across server restarts.
Clone Of:
Last Closed: 2014-12-11 14:04:53 UTC




External Trackers
Tracker ID Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 743393 None None None Never
Red Hat Bugzilla 1139765 None None None Never
Red Hat Bugzilla 1546066 None None None Never

Internal Trackers: 1139765 1546066

Description bkramer 2014-04-03 14:04:08 UTC
Description of problem:
When JBoss ON is first installed and started, the *Anti Entropy Sessions* resource is not discovered and is not listed under the Storage Node's Internal Server Metrics subfolder. However, executing the *Repair* operation creates the *Anti Entropy Sessions* resource, which is then discovered by the JBoss ON Agent's next full discovery. The status of this resource will be UP.

If we stop the storage node and start it again, the previously discovered *Anti Entropy Sessions* resource no longer exists, so it is marked as DOWN in the JBoss ON UI until the next *Repair* operation.

Version-Release number of selected component (if applicable):
JBoss ON 3.2

How reproducible:
Always

Steps to Reproduce:
1. Install JBoss ON 3.2;
2. In the JBoss ON UI, navigate to Inventory -> Servers -> RHQ Storage Node -> Internal Server Metrics and confirm that *Anti Entropy Sessions* is not listed;
3. For this storage node, execute the "Repair" operation;
4. Execute "discovery -f" on the JBoss ON Agent command line;
5. Execute "avail --force" on the JBoss ON Agent command line;
6. Navigate as before to Internal Server Metrics for this storage node and confirm that *Anti Entropy Sessions* is discovered and UP;
7. Stop the storage node: ./rhqctl stop --storage;
8. Start the storage node: ./rhqctl start --storage;
9. Navigate to Internal Server Metrics for this storage node and confirm that the *Anti Entropy Sessions* resource is DOWN.
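
The sequence above can be modelled with a small simulation. This is an illustrative sketch only: the class and method names are hypothetical, and it does not call any JBoss ON or Cassandra API. It assumes, as described in this report, that the bean exists only after a Repair operation, that it vanishes on a storage node restart, and that the resource stays in inventory once discovered.

```python
class StorageNodeModel:
    """Toy model of the storage node described in this report (hypothetical names)."""

    def __init__(self):
        self.bean_present = False  # the bean does not exist on a fresh start
        self.in_inventory = False  # whether the agent has discovered the resource

    def repair(self):
        # Running Repair creates the Anti Entropy Sessions bean.
        self.bean_present = True

    def discover(self):
        # A full discovery ("discovery -f") inventories the bean if it exists now.
        if self.bean_present:
            self.in_inventory = True

    def restart(self):
        # Restarting the storage node drops the bean; inventory is unchanged.
        self.bean_present = False

    def availability(self):
        # What the UI shows after an availability scan ("avail --force").
        if not self.in_inventory:
            return "NOT LISTED"
        return "UP" if self.bean_present else "DOWN"


node = StorageNodeModel()
states = [node.availability()]      # step 2: resource is not listed
node.repair()
node.discover()
states.append(node.availability())  # step 6: resource is UP
node.restart()
states.append(node.availability())  # step 9: resource is DOWN
print(states)
```

The model makes the cause visible: the inventory entry outlives the bean it refers to, so the resource is reported DOWN after every restart until Repair is run again.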

Actual results:
Anti Entropy Sessions resource is discovered and monitored.

Expected results:
We should not monitor Anti Entropy Sessions resource as it will go down after every storage node restart.

Additional info:

Comment 1 Stefan Negrea 2014-09-22 17:20:29 UTC
*** Bug 1084055 has been marked as a duplicate of this bug. ***

Comment 2 Jay Shaughnessy 2014-09-22 20:13:43 UTC
Release/jon3.2.x commit 93a3cfa3650dfe52b9f022973244404cff96d325
Author: Stefan Negrea <snegrea@redhat.com>
Date:   Mon Sep 22 12:21:47 2014 -0500

    (cherry picked from commit 2dbe6c57daaae87b5ce6890ba0da22c5976c6a7d)
    Signed-off-by: Jay Shaughnessy <jshaughn@redhat.com>

Comment 3 Stefan Negrea 2014-09-22 20:35:18 UTC
Added a missing policy to the "Anti Entropy Sessions" type (a single bean) to allow users to configure the behaviour of a missing resource after a Cassandra or Storage Node server restart. With the missing policy, users now have the option to change the default conversion of MISSING (to DOWN) to something better suited to their use case.

There is a lot of value in having this bean in inventory. For long-running repair jobs, it provides important telemetry on the progress of the repair job. Even if the bean disappears after a C* restart, it will be visible to JON again as soon as a repair job is invoked.
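
As a rough illustration of what a missing policy decides, the sketch below maps a reported availability to an action. It is not the RHQ implementation. The helper name is hypothetical, and the option names other than DOWN (IGNORE, UNINVENTORY) are assumptions; this report only confirms that the default converts MISSING to DOWN.

```python
from enum import Enum


class MissingPolicy(Enum):
    # Option names are assumptions; the report only confirms DOWN as the default.
    DOWN = "DOWN"
    IGNORE = "IGNORE"
    UNINVENTORY = "UNINVENTORY"


def handle_availability(reported, policy=MissingPolicy.DOWN):
    """Return the action taken for a reported availability (hypothetical helper)."""
    if reported != "MISSING":
        # Anything other than MISSING is passed through unchanged.
        return f"report {reported}"
    if policy is MissingPolicy.DOWN:
        # Default behaviour described in this report: MISSING is shown as DOWN.
        return "report DOWN"
    if policy is MissingPolicy.IGNORE:
        return "ignore resource"
    return "uninventory resource"


for policy in MissingPolicy:
    print(policy.name, "->", handle_availability("MISSING", policy))
```

With a non-default policy, a bean that vanishes after a restart would no longer show up as a DOWN resource, while the resource type itself remains available for discovery the next time a repair job runs.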

Comment 4 Simeon Pinder 2014-10-01 21:33:35 UTC
Moving to ON_QA as available for test with build:
https://brewweb.devel.redhat.com/buildinfo?buildID=388959

