1166602 – [New] - Status information needs to be improved when glusterd goes down in all the nodes in the cluster.

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Gluster Storage
Classification:	Red Hat Storage
Component:	nagios-server-addons
Sub Component:
Version:	rhgs-3.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	high
Severity:	medium
Target Milestone:	---
Target Release:	RHGS 3.1.0
Assignee:	Timothy Asir
QA Contact:	RamaKasturi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1202842

Reported:	2014-11-21 10:43 UTC by RamaKasturi
Modified:	2023-09-14 02:51 UTC
CC List:	6 users
Fixed In Version:	nagios-server-addons-0.2.0-1.el6rhs.noarch
Doc Type:	Bug Fix
Doc Text:	Previously, when glusterd was down on all the nodes in the cluster, the status information for volume status, self-heal, and geo-replication was incorrectly displayed as "temporary error" instead of "no hosts found in cluster" or "hosts are not up". As a consequence, users were led to believe that there were issues with volume status, self-heal, or geo-replication that needed to be fixed. With this fix, when glusterd is down on all the nodes of the cluster, the Volume Geo Replication, Volume Status, and Volume Utilization services are displayed as "UNKNOWN" with the status information "UNKNOWN: NO hosts(with state UP) found in the cluster". The brick status is displayed as "UNKNOWN" with the status information "UNKNOWN: Status could not be determined as glusterd is not running".
Clone Of:
Environment:
Last Closed:	2015-07-29 05:26:59 UTC
Embargoed:
Dependent Products:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHEA-2015:1494	0	normal	SHIPPED_LIVE	Red Hat Gluster Storage Console 3.1 Enhancement and bug fixes	2015-07-29 09:24:02 UTC

Description RamaKasturi 2014-11-21 10:43:00 UTC

Description of problem:
When glusterd goes down on all the nodes in the cluster, volume status, self-heal, and geo-rep display the status 'UNKNOWN' with the status information 'UNKNOWN: temporary error'.

The status information needs to be improved: this is not a temporary error, since glusterd is down on all the nodes.

Version-Release number of selected component (if applicable):
nagios-server-addons-0.1.9-1.el6rhs.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install Nagios on an RHS node.
2. Run discovery.py and start monitoring the nodes.
3. Stop glusterd on all the nodes by running the command "service glusterd stop".

Actual results:
Volume Status, Volume Self-Heal, and Volume Geo-Replication report the status 'UNKNOWN' with the status information "UNKNOWN: temporary error".

Expected results:
Status information for these services needs to be improved.

Additional info:

Comment 2 Shruti Sampat 2014-11-27 07:13:24 UTC

Similar behavior is seen for Volume Quota services too.

Comment 4 Sahina Bose 2015-02-09 07:04:45 UTC

Enhance the message to suggest to the user that the issue may be with glusterd. Change "temporary error" to "Glusterd cannot be queried".

Comment 5 Timothy Asir 2015-04-28 11:48:38 UTC

Patch sent to upstream for review: http://review.gluster.org/10421

Comment 7 RamaKasturi 2015-05-28 12:52:55 UTC

Please fill in the Fixed In Version (FIV) field for this bug.

Comment 8 RamaKasturi 2015-05-29 08:48:17 UTC

Verified and works with build nagios-server-addons-0.2.0-1.el6rhs.noarch.


When glusterd is down on all the nodes of the cluster, the Volume Geo Replication, Volume Status, and Volume Utilization services are shown as "UNKNOWN" with the status information "UNKNOWN: NO hosts(with state UP) found in the cluster".

Brick status is shown as "UNKNOWN" with the status information "UNKNOWN: Status could not be determined as glusterd is not running".
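
For illustration only, the verified behavior could be sketched roughly as below. This is not the code from the upstream patch: the function names, the host-state dictionary, and the query callbacks are hypothetical, and the sketch assumes that a node whose glusterd is down is not counted as a host in state UP. Only the two status messages and the Nagios UNKNOWN exit code (3) are taken as given.

```python
# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Status messages quoted from the verification comment above.
NO_HOSTS_UP_MSG = "UNKNOWN: NO hosts(with state UP) found in the cluster"
GLUSTERD_DOWN_MSG = (
    "UNKNOWN: Status could not be determined as glusterd is not running"
)


def volume_service_status(host_states, query_volume):
    """Return (exit_code, message) for a cluster-level volume service.

    host_states:  hypothetical dict mapping host name -> "UP" or "DOWN".
    query_volume: hypothetical callable that takes a host name and
                  returns (exit_code, message) from that host.
    """
    up_hosts = sorted(h for h, state in host_states.items() if state == "UP")
    if not up_hosts:
        # No host can answer the query, so say that explicitly
        # rather than reporting a generic "temporary error".
        return UNKNOWN, NO_HOSTS_UP_MSG
    return query_volume(up_hosts[0])


def brick_status(glusterd_running, query_brick):
    """Return (exit_code, message) for a brick service on one node.

    glusterd_running: bool, assumed to come from a separate process check.
    query_brick:      hypothetical callable returning (exit_code, message).
    """
    if not glusterd_running:
        return UNKNOWN, GLUSTERD_DOWN_MSG
    return query_brick()
```

With every host marked "DOWN", `volume_service_status` returns the UNKNOWN code together with the "NO hosts(with state UP)" message, and `brick_status(False, ...)` returns the glusterd message without attempting a query.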

Comment 9 Divya 2015-07-26 05:32:19 UTC

Tim,

Kindly review and sign off on the edited doc text.

Comment 11 errata-xmlrpc 2015-07-29 05:26:59 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-1494.html

Comment 12 Red Hat Bugzilla 2023-09-14 02:51:16 UTC

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days