Bug 1089636 - [Nagios] Cluster status information says "None of the volumes are in critical state" even when a volume is utilized beyond critical level.
Keywords:
Status: CLOSED CANTFIX
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: gluster-nagios-addons
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Assignee: Nishanth Thomas
QA Contact: RHS-C QE
URL:
Whiteboard:
Depends On:
Blocks: 1087818
 
Reported: 2014-04-21 09:55 UTC by Shruti Sampat
Modified: 2023-09-14 02:06 UTC
CC List: 4 users

Fixed In Version:
Doc Type: Known Issue
Doc Text:
In the Nagios UI, the incorrect status information "Cluster Status OK : None of the Volumes are in Critical State" is displayed even when a volume is utilized beyond the critical level.
Clone Of:
Environment:
Last Closed: 2018-01-30 11:12:16 UTC
Embargoed:



Description Shruti Sampat 2014-04-21 09:55:01 UTC
Description of problem:
-----------------------

Consider a cluster with a single volume, being monitored by Nagios. When the volume is utilized beyond the critical level, the volume utilization service reports the volume as critical. However, the "host" that represents the cluster in the Nagios UI reports "Cluster Status OK : None of the Volumes are in Critical State" as part of its status information.

Version-Release number of selected component (if applicable):
gluster-nagios-addons-0.1.0-25.git25f0bba.el6.x86_64

How reproducible:
Saw it once.

Steps to Reproduce:
1. Create a cluster of RHS nodes, create a volume, start it, and fill the volume with data so that utilization crosses 90% of capacity.
2. Configure the Nagios server to run on one of the RHS nodes, and configure this cluster to be monitored by Nagios.
3. Check the volume utilization service for the volume created; it should show critical status.
4. Check the state information for the "host" that represents the cluster in the Nagios UI.

Actual results:
The status information for the host says "Cluster Status OK : None of the Volumes are in Critical State".

Expected results:
The status information should indicate that one of the volumes in the cluster is utilized above the critical level.

Additional info:

Comment 1 Shruti Sampat 2014-04-21 10:52:16 UTC
The performance data for the cluster also shows the number of volumes in critical state as zero - 

noOfVolumesInCriticalState=0

Comment 2 Nishanth Thomas 2014-05-08 09:23:23 UTC
As per the current design, the cluster status is determined from the 'Volume Status' services of all the volumes under that cluster. The 'Cluster Utilization' service reflects whether any volume's utilization goes beyond the critical level. When cascading is implemented (in the future), the cluster utilization service's state will be propagated to the cluster status.

This is in line with the implementation of Volume Status as well; similarly, we do not consider Brick Utilization while determining the Volume status.

So, in my opinion, this is not a bug; it works as per the design.
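
For illustration, the aggregation described here can be sketched roughly as the Python snippet below. This is a hypothetical example, not the actual gluster-nagios-addons code: the cluster_status helper, its data structures, and the WARNING/CRITICAL messages are invented, while the OK message comes from this report and the exit codes follow the standard Nagios plugin convention (0=OK, 1=WARNING, 2=CRITICAL, 3=UNKNOWN).

# Hypothetical sketch; not the actual gluster-nagios-addons implementation.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

def cluster_status(volume_services):
    """Derive the cluster 'host' status from per-volume service states.

    volume_services maps a volume name to a dict of its service states,
    e.g. {"vol1": {"Volume Status": OK, "Volume Utilization": CRITICAL}}.
    Only the 'Volume Status' entries are considered, which is why a volume
    that is CRITICAL on utilization alone still yields "Cluster Status OK".
    """
    statuses = [svc.get("Volume Status", UNKNOWN)
                for svc in volume_services.values()]
    critical = sum(1 for s in statuses if s == CRITICAL)
    if statuses and critical == len(statuses):
        return CRITICAL, "Cluster Status CRITICAL : All Volumes are in Critical State"
    if critical > 0:
        return WARNING, "Cluster Status WARNING : Some Volumes are in Critical State"
    return OK, "Cluster Status OK : None of the Volumes are in Critical State"

if __name__ == "__main__":
    # Reproduces the reported situation: utilization is CRITICAL, but
    # 'Volume Status' is OK, so the cluster host still reports OK.
    state, message = cluster_status(
        {"vol1": {"Volume Status": OK, "Volume Utilization": CRITICAL}})
    print(state, message)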

Comment 3 Dusmant 2014-05-16 11:26:40 UTC
As discussed in the triage meeting: we will take it up in the 3.1 release.

Comment 4 Shalaka 2014-06-26 14:43:37 UTC
Please review and sign off on the edited doc text.

Comment 5 Nishanth Thomas 2014-06-27 05:12:56 UTC
Doc text is fine.

Comment 6 Kanagaraj 2014-07-15 04:02:57 UTC
What would be the expected behavior?

- Cluster status will be an aggregation of the status of all volumes

or 

- Cluster status will be an aggregation of both the status and the utilization of all volumes
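
To make the two alternatives concrete, here is a hypothetical sketch of the second option (aggregating over both status and utilization); the function and service names are assumptions, not the existing code:

# Hypothetical sketch of the second option: the cluster is CRITICAL if any
# per-volume service (status or utilization) is CRITICAL.
OK, WARNING, CRITICAL = 0, 1, 2

def cluster_status_with_utilization(volume_services):
    """volume_services: volume name -> {service name: Nagios state}."""
    worst = [max(states.values()) for states in volume_services.values()]
    return CRITICAL if worst and max(worst) == CRITICAL else OK

# Under this policy, the scenario from this bug report would report CRITICAL (2).
print(cluster_status_with_utilization(
    {"vol1": {"Volume Status": OK, "Volume Utilization": CRITICAL}}))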

Comment 11 Sahina Bose 2018-01-30 11:12:16 UTC
Thank you for your report. However, this bug is being closed as it is logged against gluster-nagios monitoring, for which no further development is being undertaken.

Comment 12 Red Hat Bugzilla 2023-09-14 02:06:38 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

