Bug 1089636 - [Nagios] Cluster status information says "None of the volumes are in critical state" even when a volume is utilized beyond critical level.
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: gluster-nagios-addons
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Nishanth Thomas
QA Contact: RHS-C QE
URL:
Whiteboard:
Depends On:
Blocks: 1087818
 
Reported: 2014-04-21 09:55 UTC by Shruti Sampat
Modified: 2023-09-14 02:06 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
In the Nagios UI, the incorrect status information "Cluster Status OK : None of the Volumes are in Critical State" is displayed even when a volume is utilized beyond the critical level.
Clone Of:
Environment:
Last Closed: 2018-01-30 11:12:16 UTC
Embargoed:



Description Shruti Sampat 2014-04-21 09:55:01 UTC
Description of problem:
-----------------------

Consider a cluster with a single volume, being monitored by Nagios. When the volume is utilized beyond the critical level, the volume utilization service reports the volume as critical. However, the "host" that represents the cluster in the Nagios UI reports "Cluster Status OK : None of the Volumes are in Critical State" as part of its status information.

Version-Release number of selected component (if applicable):
gluster-nagios-addons-0.1.0-25.git25f0bba.el6.x86_64

How reproducible:
Saw it once.

Steps to Reproduce:
1. Create a cluster of RHS nodes, create a volume, start it, and fill the volume with data so that utilization crosses 90% of capacity.
2. Configure the Nagios server to run on one of the RHS nodes, and configure this cluster to be monitored by Nagios.
3. Check the volume utilization service for the volume created; it should show critical status.
4. Check the state information for the "host" that represents the cluster in the Nagios UI.

Actual results:
The status information for the host says "Cluster Status OK : None of the Volumes are in Critical State".

Expected results:
The status information should indicate that one of the volumes in the cluster is utilized above the critical level.

Additional info:

Comment 1 Shruti Sampat 2014-04-21 10:52:16 UTC
The performance data for the cluster also shows the number of volumes in critical state as zero - 

noOfVolumesInCriticalState=0

Comment 2 Nishanth Thomas 2014-05-08 09:23:23 UTC
As per the current design, the cluster status is determined from the 'Volume Status' services of all the volumes under that cluster. The 'Cluster Utilization' service reflects whether any volume's utilization goes beyond the critical level. When cascading is implemented (in the future), the cluster utilization service's state will be propagated to the cluster status.

This is in line with the implementation of Volume Status as well; similarly, we do not consider Brick Utilization while determining the Volume status.

So, in my opinion, this is not a bug; it works as per the design.
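
For illustration, the aggregation described here can be sketched roughly as the Python snippet below. This is a hypothetical example, not the actual gluster-nagios-addons code: the cluster_status helper, its data structures, and the WARNING/CRITICAL messages are invented, while the OK message comes from this report and the exit codes follow the standard Nagios plugin convention (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN).

# Hypothetical sketch; not the actual gluster-nagios-addons implementation.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def cluster_status(volume_services):
    """Derive the cluster 'host' status from per-volume service states.

    volume_services maps a volume name to a dict of its service states,
    e.g. {"vol1": {"Volume Status": OK, "Volume Utilization": CRITICAL}}.
    Only the 'Volume Status' entries are considered, which is why a volume
    that is CRITICAL on utilization alone still yields "Cluster Status OK".
    """
    statuses = [svc.get("Volume Status", UNKNOWN)
                for svc in volume_services.values()]
    critical = sum(1 for s in statuses if s == CRITICAL)
    if statuses and critical == len(statuses):
        return CRITICAL, "Cluster Status CRITICAL : All Volumes are in Critical State"
    if critical > 0:
        return WARNING, "Cluster Status WARNING : Some Volumes are in Critical State"
    return OK, "Cluster Status OK : None of the Volumes are in Critical State"

if __name__ == "__main__":
    # Reproduces the reported situation: utilization is CRITICAL, but
    # 'Volume Status' is OK, so the cluster host still reports OK.
    state, message = cluster_status(
        {"vol1": {"Volume Status": OK, "Volume Utilization": CRITICAL}})
    print(state, message)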

Comment 3 Dusmant 2014-05-16 11:26:40 UTC
As discussed in the triage meeting: we will take it up in the 3.1 release.

Comment 4 Shalaka 2014-06-26 14:43:37 UTC
Please review and sign off on the edited doc text.

Comment 5 Nishanth Thomas 2014-06-27 05:12:56 UTC
Doc text is fine.

Comment 6 Kanagaraj 2014-07-15 04:02:57 UTC
What would be the expected behavior?

- Cluster status will be an aggregation of the status of all volumes

or 

- Cluster status will be an aggregation of both the status and the utilization of all volumes
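
To make the two alternatives concrete, here is a hypothetical sketch of the second option (aggregating over both status and utilization); the function and service names are assumptions, not the existing code:

# Hypothetical sketch of the second option: the cluster is CRITICAL if any
# per-volume service (status or utilization) is CRITICAL.
OK, WARNING, CRITICAL = 0, 1, 2

def cluster_status_with_utilization(volume_services):
    """volume_services: volume name -> {service name: Nagios state}."""
    worst = [max(states.values()) for states in volume_services.values()]
    return CRITICAL if worst and max(worst) == CRITICAL else OK

# Under this policy, the scenario from this bug report would report CRITICAL (2).
print(cluster_status_with_utilization(
    {"vol1": {"Volume Status": OK, "Volume Utilization": CRITICAL}}))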

Comment 11 Sahina Bose 2018-01-30 11:12:16 UTC
Thank you for your report. However, this bug is being closed as it is logged against gluster-nagios monitoring, for which no further development is being undertaken.

Comment 12 Red Hat Bugzilla 2023-09-14 02:06:38 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

