Bug 1236290 - [New] - Cluster-Quorum status does not change when one of the node in the cluster is powered off
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: nagios-server-addons
Version: rhgs-3.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.1.1
Assignee: Sahina Bose
QA Contact: RamaKasturi
URL:
Whiteboard:
Depends On:
Blocks: 1216951 1251815
Reported: 2015-06-27 11:28 UTC by RamaKasturi
Modified: 2016-09-09 05:16 UTC (History)

Fixed In Version: gluster-nagios-addons-0.2.5-1
Doc Type: Bug Fix
Doc Text:
Previously, the nodes were updating the older service even after the Cluster Quorum service was renamed. Due to this, the Cluster Quorum service status in Nagios was not reflected. With this fix, the plugins on the nodes are updated so that the notifications are pushed to the new service and the Cluster Quorum status is reflected correctly.
Clone Of:
Environment:
Last Closed: 2015-10-05 09:21:57 UTC
Embargoed:
knarra: needinfo-


Links
System: Red Hat Product Errata
ID: RHBA-2015:1848
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Gluster Storage Console 3.1 update 1 bug fixes
Last Updated: 2015-10-05 13:19:50 UTC

Description RamaKasturi 2015-06-27 11:28:27 UTC
Description of problem:
Cluster-Quorum status remains "OK" with status information "Server quorum turned on for vol4,vol1" when one of the nodes in the cluster is powered off.

Version-Release number of selected component (if applicable):
nagios-server-addons-0.2.1-2.el6rhs.noarch

How reproducible:
Always

Steps to Reproduce:
1. Add two nodes to the cluster.
2. Set quorum on any one of the volumes by running the command "gluster volume set <vol-name> cluster.server-quorum-type server".
3. Run cluster auto-config.
4. Power off one of the nodes.

Actual results:
The status remains "OK" with status information "Server quorum turned on for <vol_names>".

Expected results:
Cluster-Quorum status should change to CRITICAL with status information "QUORUM: Cluster server-side quorum lost."

Additional info:

Comment 2 RamaKasturi 2015-06-27 11:29:32 UTC
Seeing the following in nagios.log.

[1435403539] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;cluster1;Cluster - Quorum;2;QUORUM: Cluster server-side quorum lost.
[1435403539] Warning:  Passive check result was received for service 'Cluster - Quorum' on host 'cluster1', but the service could not be found!
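
For anyone triaging similar logs, here is a minimal sketch that pulls the host, service description, state, and plugin output out of a PROCESS_SERVICE_CHECK_RESULT line like the one above. The function and type names are hypothetical and not part of any Nagios or gluster-nagios package; the numeric return codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the standard Nagios plugin convention.

```python
import re
from typing import NamedTuple, Optional

# Standard Nagios plugin return codes.
STATE_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


class PassiveResult(NamedTuple):
    timestamp: int
    host: str
    service: str
    state: str
    output: str


# [epoch] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;host;service;code;output
_PATTERN = re.compile(
    r"^\[(\d+)\] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;"
    r"([^;]*);([^;]*);(\d+);(.*)$"
)


def parse_passive_result(line: str) -> Optional[PassiveResult]:
    """Return the parsed fields, or None if the line is not a passive result."""
    match = _PATTERN.match(line.strip())
    if match is None:
        return None
    timestamp, host, service, code, output = match.groups()
    state = STATE_NAMES.get(int(code), "UNKNOWN")
    return PassiveResult(int(timestamp), host, service, state, output)


line = (
    "[1435403539] EXTERNAL COMMAND: PROCESS_SERVICE_CHECK_RESULT;"
    "cluster1;Cluster - Quorum;2;QUORUM: Cluster server-side quorum lost."
)
result = parse_passive_result(line)
print(result)
```

The parsed result shows that the node did report CRITICAL for 'Cluster - Quorum'; the warning on the following log line is what indicates the server dropped it.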

Comment 4 Sahina Bose 2015-06-30 11:13:25 UTC
The issue is due to a change in the service name in Nagios: NSCA was still sending the alert to the older service. The service name was changed to ensure that the command definition was modified on update, as a new freshness check was introduced.

http://review.gluster.org/#/c/11465 - posted to fix this
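
To illustrate the failure mode, here is a small simulation (not the actual Nagios or plugin code) of how a passive result is matched against configured services by its exact host name and service description. The old name 'Cluster - Quorum' comes from the nagios.log excerpt; the new name used here, "Cluster - Quorum Status", is an assumption for illustration, and the function name is hypothetical.

```python
from typing import Dict, Tuple

ServiceKey = Tuple[str, str]  # (host name, service description)


def submit_passive_result(
    services: Dict[ServiceKey, str], host: str, service: str, state: str
) -> bool:
    """Record a passive result; return False if no such service is configured."""
    key = (host, service)
    if key not in services:
        # Mirrors the "service could not be found" warning: the result is dropped.
        return False
    services[key] = state
    return True


OLD_NAME = "Cluster - Quorum"         # name still used by the node-side plugin
NEW_NAME = "Cluster - Quorum Status"  # ASSUMED post-rename name, illustration only

# After the rename, only the new service exists on the server.
services = {("cluster1", NEW_NAME): "OK"}

# Before the fix: the node pushes to the old name, so the status never changes.
accepted_old = submit_passive_result(services, "cluster1", OLD_NAME, "CRITICAL")
status_after_old = services[("cluster1", NEW_NAME)]

# After the fix: the node pushes to the new name and the status is updated.
accepted_new = submit_passive_result(services, "cluster1", NEW_NAME, "CRITICAL")
status_after_new = services[("cluster1", NEW_NAME)]

print(accepted_old, status_after_old)
print(accepted_new, status_after_new)
```

Because the lookup is an exact string match, a rename on the server side has to be accompanied by the same rename in whatever the nodes send, which is what the posted change addresses.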

Comment 8 monti lawrence 2015-07-23 14:34:54 UTC
Doc text is edited. Please sign off to be included in Known Issues.

Comment 9 Sahina Bose 2015-07-24 12:00:39 UTC
minor edit.

Comment 13 RamaKasturi 2015-08-28 11:28:39 UTC
Hi Sahina,

After enabling server quorum on a volume, I powered off one node in the cluster, and now my quorum status goes to UNKNOWN with status information "Server quorum not turned on for any volume". Can you please check this?

Thanks
kasturi

Comment 14 RamaKasturi 2015-09-01 05:40:58 UTC
Verified on RHS+Nagios deployment and works fine with build gluster-nagios-addons-0.2.5-1.el7rhgs.x86_64.

When one of the nodes in the cluster is powered off, Cluster - Quorum Status is marked as CRITICAL with status information "QUORUM: Cluster server-side quorum lost".

Comment 15 Bhavana 2015-09-21 08:24:26 UTC
Hi Sahina,

The doc text is updated. Please review it and share your technical review comments. If it looks ok, then sign-off on the same.

Comment 17 errata-xmlrpc 2015-10-05 09:21:57 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-1848.html

