Bug 1320438

Summary: [Doc RFE] Update Nagios Monitoring chapter with information about self-heal monitoring nagios plugin
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Anjana Suparna Sriram <asriram>
Component: doc-Administration_Guide
Assignee: Laura Bailey <lbailey>
doc-Administration_Guide sub component: Default
QA Contact: Sweta Anandpara <sanandpa>
Status: CLOSED CURRENTRELEASE
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: asriram, mhideo, nlevinki, rcyriac, rhinduja, rhs-bugs, rwheeler, sabose, sanandpa, storage-doc
Version: rhgs-3.1
Keywords: Documentation, FutureFeature, ZStream
Target Milestone: ---
Target Release: RHGS 3.1.3
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: Enhancement
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-06-29 14:20:30 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1312207
Bug Blocks: 1311845

Description Anjana Suparna Sriram 2016-03-23 08:43:30 UTC
Document URL: https://access.redhat.com/documentation/en-US/Red_Hat_Storage/3.1/html-single/Administration_Guide/index.html#chap-Monitoring_Red_Hat_Storage


Additional information: With this plug-in, users would be notified when self-heal is in progress.

Comment 7 Sahina Bose 2016-04-20 12:00:35 UTC
The existing configuration "Volume Self Heal" has been split into 2 services:
- Volume Heal info
  UNKNOWN - if glusterd is not running or another transaction is in progress (same as existing)
  WARNING - if command execution fails (same as existing)
  WARNING - if there are entries present that need to be healed. The entry count displayed is across all bricks of the volume, and may contain duplicates.
  The plugin also plots a trending graph that displays the unsynced entry count over time.

- Volume Split-brain status
  UNKNOWN - if glusterd is not running or another transaction is in progress (same as existing)
  WARNING - if command execution fails (same as existing)
  CRITICAL - if entries are found in split-brain state.

The OK and NOT APPLICABLE states are the same as before.
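
As a rough illustration of the state mapping described above, here is a minimal Python sketch. It is not the plugin's actual code: the function names, parameters, and message strings are hypothetical, and the OK message text is an assumption. Only the numeric exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the standard Nagios plugin convention.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def heal_info_status(glusterd_available, command_ok, unsynced_entries):
    """Hypothetical mapping for the 'Volume Heal info' service.

    glusterd_available: False if glusterd is not running or another
                        transaction is in progress.
    command_ok:         False if the heal info command failed.
    unsynced_entries:   entry count summed across all bricks
                        (may contain duplicates).
    """
    if not glusterd_available:
        return UNKNOWN, "UNKNOWN: glusterd not running or another transaction in progress"
    if not command_ok:
        return WARNING, "WARNING: command execution failed"
    if unsynced_entries > 0:
        return WARNING, "WARNING: %d unsynced entries present" % unsynced_entries
    # Assumed OK text; the comment above only says OK is unchanged.
    return OK, "OK: no unsynced entries"


def split_brain_status(glusterd_available, command_ok, split_brain_entries):
    """Hypothetical mapping for the 'Volume Split-brain status' service."""
    if not glusterd_available:
        return UNKNOWN, "UNKNOWN: glusterd not running or another transaction in progress"
    if not command_ok:
        return WARNING, "WARNING: command execution failed"
    if split_brain_entries > 0:
        return CRITICAL, "CRITICAL: %d entries in split-brain state" % split_brain_entries
    # Assumed OK text, as above.
    return OK, "OK: no split-brain entries"
```

The two services differ only in the last check: pending heal entries raise a WARNING, while split-brain entries raise a CRITICAL.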

I'm a little confused as to how the document already captures the 2 plugins - as this is a new change :)

Regarding upgrades - the Nagios server can be updated on its own, though typically the packages are shipped as part of RHGS/RHGS-C errata.

Comment 17 Sweta Anandpara 2016-06-06 13:17:28 UTC
The Installation guide looks fine. 

I had a few comments on the Admin guide:

Table 18.1
-----------
1.  Volume Self-Heal Info
(available in Red Hat Gluster Storage version 3.1.3 and later)
	NOTAPPLICABLE	Volume is not of replicate type.	Displayed when there are no replicate volumes and therefore no need to self-heal.

That is incorrect. No NOT APPLICABLE state is shown for this service. If a (distribute) replicate volume is created, this service is shown. If a volume is created for which healing is not valid, the service itself is not shown in the UI. So, please remove the NOT APPLICABLE state.

2.  Volume Split-Brain Status
(available in Red Hat Gluster Storage version 3.1.0 and later) 

Sahina's reply (comment 11) mentions that Split-brain status is available only from 3.1.1 to 3.1.3 (and NOT in 3.1.0).

Could you please align the table cells correctly for Volume Split-Brain Status? I am unable to make out the content as it stands.

Comment 18 Sweta Anandpara 2016-06-06 13:19:06 UTC
Moving this bug back to address the above issues.

Comment 20 Sweta Anandpara 2016-06-10 06:51:26 UTC
Thanks Laura. Read through the above link, and the changes look good. 

Moving this BZ to verified in 3.1.3.