Bug 1312207

Summary: RFE: Add self-heal monitoring nagios plugin
Product: [Red Hat Storage] Red Hat Gluster Storage Reporter: Sahina Bose <sabose>
Component: nagios-server-addons Assignee: Sahina Bose <sabose>
Status: CLOSED ERRATA QA Contact: Sweta Anandpara <sanandpa>
Severity: medium Docs Contact:
Priority: medium    
Version: rhgs-3.1 CC: bugs, divya, rcyriac, rhinduja, sabose, smohan
Target Milestone: --- Keywords: FutureFeature, ZStream
Target Release: RHGS 3.1.3   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: nagios-server-addons-0.2.4-1 Doc Type: Enhancement
Doc Text:
A Nagios plugin has been added to monitor if a replicate volume has entries that are not in sync with other bricks of the replica set. Now, administrators can ensure that they do not perform maintenance actions when there are pending heals, and can also monitor the heal progress by viewing the trending information on entries to be healed.
Story Points: ---
Clone Of: 1267586 Environment:
Last Closed: 2016-06-23 05:27:39 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1267586    
Bug Blocks: 1258386, 1311386, 1320438    

Description Sahina Bose 2016-02-26 05:58:28 UTC
+++ This bug was initially created as a clone of Bug #1267586 +++

Description of problem:

Administrators need a way to be notified when self-heal is in progress, or when there are unsynced entries present in a replicated volume. 

Administrators need to be alerted if:
1. self-heal has been ongoing for a configurable period of time
2. the number of unsynced entries is increasing or constant over a period of time
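
The two alert conditions above could be sketched as follows. This is only an illustration, not the plugin delivered by the patches: the function names, the `window` parameter, and the mapping of conditions to WARNING and CRITICAL are assumptions. It relies on the "Number of entries: N" lines that `gluster volume heal <VOLNAME> info` prints per brick, and on the standard Nagios exit codes.

```python
import re
from typing import List

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Per-brick line printed by `gluster volume heal <VOLNAME> info`.
ENTRY_RE = re.compile(r"^Number of entries:\s*(\d+)", re.MULTILINE)


def count_unsynced(heal_info_output: str) -> int:
    """Sum the per-brick entry counts found in heal info text."""
    return sum(int(n) for n in ENTRY_RE.findall(heal_info_output))


def evaluate(samples: List[int], window: int) -> int:
    """Map unsynced-entry counts (oldest first) to a Nagios status.

    `window` is a hypothetical, configurable number of consecutive
    samples a heal may stay pending before an alert is raised.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    if not samples:
        return UNKNOWN
    recent = samples[-window:]
    # Nothing pending now, or the heal began inside the window: no alert yet.
    if len(recent) < window or min(recent) == 0:
        return OK
    # Pending for the whole window and not shrinking (assumed CRITICAL).
    if recent[-1] >= recent[0]:
        return CRITICAL
    # Pending for the whole window but shrinking (assumed WARNING).
    return WARNING
```

A real check would collect one sample per Nagios polling interval and pass the resulting status to `sys.exit()`. Comparing only the first and last sample of the window is a deliberate simplification of "increasing or constant".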

Version-Release number of selected component (if applicable):


How reproducible:
NA


Additional info:

--- Additional comment from Sahina Bose on 2015-09-30 09:57:30 EDT ---

http://review.gluster.org/12260, http://review.gluster.org/12261, http://review.gluster.org/12262 - patches posted

Comment 3 Mike McCune 2016-03-28 23:32:32 UTC
This bug was accidentally moved from POST to MODIFIED via an error in automation. Please see mmccune with any questions.

Comment 4 Sweta Anandpara 2016-04-25 11:06:13 UTC
Tested and verified this on the build nagios-server-addons 0.2.4-1 and gluster-server 3.7.9-2

The sanity check on the new service 'volume Heal info' and the corresponding 'Volume Split-brain status' is complete. New BZs are raised for issues faced while executing the test cases in and around this area. 

Moving this RFE to fixed in 3.1.3

Comment 6 Sahina Bose 2016-06-08 07:38:58 UTC
acked

Comment 8 errata-xmlrpc 2016-06-23 05:27:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2016:1242