Bug 1109723 - [Nagios] Cluster auto-config service is warning with "(null)" status information when glusterd is down on some nodes in the cluster
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: gluster-nagios-addons
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: RHGS 3.0.3
Assigned To: Ramesh N
QA Contact: Shruti Sampat
Keywords: ZStream
Depends On:
Blocks: 1087818
Reported: 2014-06-16 04:55 EDT by Shruti Sampat
Modified: 2015-05-13 13:42 EDT

See Also:
Fixed In Version: nagios-server-addons-0.1.7-1.el6rhs
Doc Type: Bug Fix
Doc Text:
Previously, the Auto-config service did not work if the glusterd service was offline on any node in the Red Hat Storage trusted storage pool. With this fix, the Auto-config service works even if the glusterd service is down on some nodes in the trusted storage pool, provided that glusterd is running on the node used as the sync host for the auto-config service.
Last Closed: 2015-01-15 08:48:14 EST
Type: Bug


External Trackers
Tracker | ID | Priority | Status | Summary | Last Updated
Red Hat Knowledge Base (Solution) | 1218023 | None | None | None | Never
Red Hat Product Errata | RHBA-2015:0039 | normal | SHIPPED_LIVE | Red Hat Storage Console 3.0 enhancement and bug fix update #3 | 2015-01-15 13:46:40 EST

Description Shruti Sampat 2014-06-16 04:55:07 EDT
Description of problem:
-----------------------

When glusterd was stopped on a couple of nodes in the cluster, the cluster auto-config service was seen to be in warning status with "(null)" as the status information.

Version-Release number of selected component (if applicable):
gluster-nagios-addons-0.1.2-1.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Set up a cluster of RHS nodes (I had 7 nodes in the cluster).
2. Monitor the cluster using Nagios.
3. Stop glusterd on a couple of nodes. Observe the cluster auto-config service.

Actual results:
The status of the service is warning and the status information reads "(null)".

Expected results:
glusterd being down on the nodes should not affect the auto-config service.
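
Editorial illustration, not taken from the gluster-nagios-addons source: a Nagios plugin reports its state through its exit code (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) and its status information through the first line of standard output. One plausible way to end up with a warning state and no readable status text is a check that exits non-zero without printing anything. The sketch below assumes that scenario and shows a defensive helper that never emits an empty status line. The names `format_status` and `finish` are hypothetical.

```python
import sys

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def format_status(state, message):
    """Build a status line that is never empty (hypothetical helper)."""
    text = (message or "").strip()
    if not text:
        # Assumption: a placeholder is more useful to the operator than blank output.
        text = "no status information returned by the check"
    return "%s: %s" % (STATE_NAMES.get(state, "UNKNOWN"), text)


def finish(state, message):
    """Print the status line and exit with the matching Nagios state code."""
    print(format_status(state, message))
    sys.exit(state)
```

With a helper like this, a check that fails partway through still reports why it is in a warning state instead of leaving the status information blank.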

Additional info:
Comment 1 Shruti Sampat 2014-06-16 04:58:45 EDT
Similar behavior was seen when some nodes in the cluster were powered off. See BZ #1109025.
Comment 2 Ramesh N 2014-06-16 06:23:15 EDT
Fixed in Patch : http://review.gluster.org/#/c/8074/
Comment 3 Ramesh N 2014-06-16 09:49:11 EDT
Downstream patch https://code.engineering.redhat.com/gerrit/#/c/27038/
Comment 4 Shalaka 2014-06-26 01:32:53 EDT
Please review and sign off on the edited doc text.
Comment 5 Ramesh N 2014-06-26 01:56:42 EDT
Doc text looks good to me.
Comment 10 Shruti Sampat 2014-11-17 04:33:04 EST
Verified as fixed in nagios-server-addons-0.1.8-1.el6rhs.noarch

When glusterd is down on some of the nodes in the cluster, the cluster auto-config service remains OK and runs successfully to sync the cluster configurations.

If glusterd is down on the node that is used to sync the cluster configurations via the discovery script, running the auto-config service puts it in the CRITICAL state with the following status information:

Failed to execute NRPE command 'discover_volume_list' in host <hostname>

This is expected as the discovery script fails to run the required commands owing to glusterd being down.
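
Editorial illustration of the verified behavior, written as a minimal sketch rather than the actual plugin code: only the sync host is queried, so glusterd being down on other nodes does not change the result, while a failure on the sync host yields CRITICAL with the message quoted above. The function `auto_config`, the `run_nrpe` callback, the `NrpeError` exception, and the OK status text are all assumptions made for the example. Only the command name `discover_volume_list` and the CRITICAL message come from this report.

```python
OK, CRITICAL = 0, 2  # Nagios state codes used in this sketch


class NrpeError(Exception):
    """Raised by the (hypothetical) NRPE runner when a command fails."""


def auto_config(sync_host, run_nrpe):
    """Return (state, status_text) for one auto-config run.

    run_nrpe(host, command) is a caller-supplied function that runs an NRPE
    command on a host and raises NrpeError on failure.
    """
    command = "discover_volume_list"
    try:
        # Only the sync host is contacted; other nodes are not queried here.
        run_nrpe(sync_host, command)
    except NrpeError:
        return CRITICAL, "Failed to execute NRPE command '%s' in host %s" % (
            command, sync_host)
    # Placeholder OK text; the real service's wording is not given in this report.
    return OK, "Cluster configurations synced successfully"
```

Passing the NRPE runner in as a parameter keeps the sketch independent of any real NRPE client and makes both outcomes easy to exercise with a stub.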
Comment 12 Pavithra 2014-12-24 04:03:32 EST
Hi Ramesh,

Can you please review the edited doc text for technical accuracy and sign off?
Comment 13 Ramesh N 2014-12-24 06:18:50 EST
Doc text looks good to me.
Comment 15 errata-xmlrpc 2015-01-15 08:48:14 EST
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0039.html
