Bug 1166602 - [New] - Status information needs to be improved when glusterd goes down in all the nodes in the cluster.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: nagios-server-addons
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: medium
Target Milestone: ---
Target Release: RHGS 3.1.0
Assignee: Timothy Asir
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On:
Blocks: 1202842
 
Reported: 2014-11-21 10:43 UTC by RamaKasturi
Modified: 2023-09-14 02:51 UTC
CC List: 6 users

Fixed In Version: nagios-server-addons-0.2.0-1.el6rhs.noarch
Doc Type: Bug Fix
Doc Text:
Previously, when glusterd was down on all the nodes in the cluster, the status information for the volume status, self-heal, and geo-replication services was displayed as "temporary error" instead of "no hosts found in cluster" or "hosts are not up". As a consequence, users were misled into thinking there were issues with volume status, self-heal, or geo-replication that needed to be fixed. With this fix, when glusterd is down on all the nodes of the cluster, the Volume Geo-Replication, Volume Status, and Volume Utilization statuses are displayed as "UNKNOWN" with the status information "UNKNOWN: NO hosts(with state UP) found in the cluster". The brick status is displayed as "UNKNOWN" with the status information "UNKNOWN: Status could not be determined as glusterd is not running".
Clone Of:
Environment:
Last Closed: 2015-07-29 05:26:59 UTC
Embargoed:




Links
Red Hat Product Errata RHEA-2015:1494 (status: SHIPPED_LIVE, priority: normal): Red Hat Gluster Storage Console 3.1 Enhancement and bug fixes. Last updated: 2015-07-29 09:24:02 UTC

Description RamaKasturi 2014-11-21 10:43:00 UTC
Description of problem:
When glusterd goes down on all the nodes in the cluster, volume status, self-heal, and geo-rep display the status 'UNKNOWN' with the status information 'UNKNOWN: temporary error'.

The status information needs to be improved, as this is not a temporary error: glusterd is down on all the nodes.

Version-Release number of selected component (if applicable):
nagios-server-addons-0.1.9-1.el6rhs.noarch

How reproducible:
Always

Steps to Reproduce:
1. Install Nagios on a RHS node.
2. Run discovery.py and start monitoring the nodes.
3. Stop glusterd on all the nodes by running the command "service glusterd stop" (see the sketch below).
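
For step 3, a hypothetical helper that stops glusterd on every node over SSH; the node names and root SSH access are assumptions for this sketch, not part of nagios-server-addons:

    #!/usr/bin/env python
    # Illustrative helper, not part of nagios-server-addons: stops glusterd
    # on every node over SSH to reproduce the reported state.
    import subprocess

    NODES = ["rhs-node1", "rhs-node2", "rhs-node3"]  # hypothetical hostnames

    for node in NODES:
        # Runs the command from step 3 above on each node.
        subprocess.check_call(["ssh", "root@" + node,
                               "service", "glusterd", "stop"])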

Actual results:
Volume Status, Volume Self-heal, and Volume Geo-Replication give the status 'UNKNOWN' with the status information "UNKNOWN: temporary error".

Expected results:
Status information for these services needs to be improved.

Additional info:

Comment 2 Shruti Sampat 2014-11-27 07:13:24 UTC
Similar behavior is seen for Volume Quota services too.

Comment 4 Sahina Bose 2015-02-09 07:04:45 UTC
Enhance the message to suggest to the user that the issue may be with glusterd. Change 'temporary error' to 'Glusterd cannot be queried'.
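
A minimal sketch of the wording change suggested here; the names are illustrative only, and per comment 8 the shipped messages ended up with different text:

    def status_information(glusterd_reachable):
        # Hypothetical sketch of the suggestion above, not the actual patch.
        if glusterd_reachable:
            return "OK"
        # Replaces the old "UNKNOWN: temporary error" text so the user
        # knows the issue may be with glusterd itself.
        return "UNKNOWN: Glusterd cannot be queried"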

Comment 5 Timothy Asir 2015-04-28 11:48:38 UTC
Patch sent to upstream for review: http://review.gluster.org/10421

Comment 7 RamaKasturi 2015-05-28 12:52:55 UTC
Please set the Fixed In Version (FIV) for this bug.

Comment 8 RamaKasturi 2015-05-29 08:48:17 UTC
Verified and works with build nagios-server-addons-0.2.0-1.el6rhs.noarch.


When glusterd is down on all the nodes of the cluster, the Volume Geo-Replication, Volume Status, and Volume Utilization statuses are shown as "UNKNOWN" with the status information "UNKNOWN: NO hosts(with state UP) found in the cluster".

Brick status is shown as "UNKNOWN" with the status information "UNKNOWN: Status could not be determined as glusterd is not running".
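
For reference, the verified behavior maps onto standard Nagios plugin output roughly as in the following minimal Python sketch; the function and constant names are assumptions for illustration, not the actual nagios-server-addons code:

    # Minimal sketch of the verified behavior; names and structure are
    # hypothetical, not the actual nagios-server-addons implementation.

    STATE_OK = 0       # standard Nagios plugin exit codes
    STATE_UNKNOWN = 3

    def gluster_service_status(service, up_hosts, glusterd_running):
        """Return (exit_code, status_information) for a monitored service."""
        volume_services = ("Volume Geo-Replication", "Volume Status",
                           "Volume Utilization")
        if service in volume_services and not up_hosts:
            return (STATE_UNKNOWN,
                    "UNKNOWN: NO hosts(with state UP) found in the cluster")
        if service == "Brick Status" and not glusterd_running:
            return (STATE_UNKNOWN,
                    "UNKNOWN: Status could not be determined "
                    "as glusterd is not running")
        return (STATE_OK, "OK")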

Comment 9 Divya 2015-07-26 05:32:19 UTC
Tim,

Kindly review and sign-off the edited doc text.

Comment 11 errata-xmlrpc 2015-07-29 05:26:59 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-1494.html

Comment 12 Red Hat Bugzilla 2023-09-14 02:51:16 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days

