Bug 1109744 - [Nagios] notifications are not sent when quorum is lost for multiple volumes, one after the other
Status: CLOSED ERRATA
Product: Red Hat Gluster Storage
Classification: Red Hat
Component: gluster-nagios-addons
Version: 3.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: high
Target Milestone: ---
Target Release: RHGS 3.1.0
Assigned To: Sahina Bose
QA Contact: RamaKasturi
Depends On:
Blocks: 1087818 1202842
Reported: 2014-06-16 06:00 EDT by Shruti Sampat
Modified: 2015-07-29 01:26 EDT

See Also:
Fixed In Version: gluster-nagios-addons-0.2.0-1
Doc Type: Bug Fix
Doc Text:
Previously, the notification message misleadingly reported that quorum was lost for only one volume even when multiple volumes had lost quorum. With this fix, the notification message is corrected to inform the user that quorum is lost on the entire cluster.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2015-07-29 01:26:21 EDT
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---


External Trackers
Tracker: Red Hat Product Errata
ID: RHEA-2015:1494
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat Gluster Storage Console 3.1 Enhancement and bug fixes
Last Updated: 2015-07-29 05:24:02 EDT

Description Shruti Sampat 2014-06-16 06:00:11 EDT
Description of problem:
-------------------------

Consider a cluster with two volumes, say vol1 and vol2, both with server-side quorum enabled.

When quorum is lost for vol1, the status of the cluster-quorum service changes to critical, and the status information reads - 

QUORUM: Server quorum lost for volume vol1. Stopping local bricks.

A notification is sent via e-mail and SNMP traps.

If quorum is later lost for vol2 (before it is regained for vol1), the status of the service remains critical. The status information then reads - 

QUORUM: Server quorum lost for volume vol2. Stopping local bricks.

Since the status of the quorum service does not change, no notifications are sent.
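
The behaviour described above can be illustrated with a minimal sketch. This is a toy model, not Nagios or gluster-nagios-addons code: the Service class and its submit_result method are hypothetical names, and the sketch assumes a notification fires only when the service state changes (it ignores soft/hard states and re-notification intervals).

```python
from dataclasses import dataclass, field
from typing import List

OK, CRITICAL = "OK", "CRITICAL"


@dataclass
class Service:
    """Hypothetical monitored service that notifies only on state changes."""
    name: str
    state: str = OK
    status_info: str = ""
    sent: List[str] = field(default_factory=list)

    def submit_result(self, state: str, status_info: str) -> bool:
        """Record a check result; return True if a notification was sent."""
        changed = state != self.state
        self.state = state
        self.status_info = status_info  # status text is always updated
        if changed:
            # Assumption: only a state transition triggers a notification.
            self.sent.append(f"{state}: {status_info}")
        return changed


quorum = Service("Cluster - Quorum")
first = quorum.submit_result(
    CRITICAL, "QUORUM: Server quorum lost for volume vol1. Stopping local bricks.")
second = quorum.submit_result(
    CRITICAL, "QUORUM: Server quorum lost for volume vol2. Stopping local bricks.")
print(first, second, len(quorum.sent))
```

In this model the second result updates the status information to mention vol2, but because the state was already CRITICAL, no second notification is recorded.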

Version-Release number of selected component (if applicable):
gluster-nagios-addons-0.1.2-1.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Create two distributed-replicate volumes, vol1 and vol2.
2. Cause quorum to be lost for vol1, observe that the status of the service changes to critical and that notifications are sent.
3. Cause quorum to be lost for vol2.

Actual results:
The status of the service remains critical, so no status change occurs and notifications are not sent.

Expected results:
Notifications should be sent whenever quorum is lost/regained for any volume in the cluster.

Additional info:
Comment 1 Shalaka 2014-06-18 01:58:27 EDT
Please add doc text for the known issue
Comment 2 Shalaka 2014-06-24 12:55:01 EDT
Please review and signoff edited doc text.
Comment 3 Sahina Bose 2014-09-04 01:16:24 EDT
Looks good to me
Comment 4 Sahina Bose 2015-02-09 02:28:50 EST
As per the redesign, the notification is sent only once, as Quorum is a cluster-level service.
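
A minimal sketch of the cluster-level behaviour described in this comment, assuming notifications are emitted only on transitions of a single cluster-wide quorum state. The function name notifications and the boolean event encoding are hypothetical and are not taken from gluster-nagios-addons; the PROBLEM/RECOVERY types and the message strings are the ones reported later in this bug.

```python
from typing import Iterable, List, Tuple

LOST = "QUORUM: Cluster server-side quorum lost."
REGAINED = "QUORUM: Cluster server-side quorum regained."


def notifications(events: Iterable[bool]) -> List[Tuple[str, str]]:
    """Map cluster quorum observations (True = quorum held) to notifications.

    Hypothetical helper: a notification is emitted only when the observed
    cluster-wide state differs from the previously observed state.
    """
    out: List[Tuple[str, str]] = []
    has_quorum = True  # assumption: the cluster starts with quorum
    for observed in events:
        if observed != has_quorum:
            out.append(("RECOVERY", REGAINED) if observed else ("PROBLEM", LOST))
            has_quorum = observed
    return out


# Quorum held, lost, still lost (e.g. a second volume affected), regained.
alerts = notifications([True, False, False, True])
for kind, message in alerts:
    print(kind, message)
```

Under these assumptions, repeated observations of lost quorum produce a single PROBLEM notification, followed by a single RECOVERY notification once quorum is regained.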
Comment 5 RamaKasturi 2015-06-01 07:33:51 EDT
Verified and works fine with build gluster-nagios-addons-0.2.0-1.

As per comment 4, the notification is sent only once, as Quorum is a cluster-level service. The notification is shown below.

** PROBLEM Service Alert: cluster1/Cluster - Quorum is CRITICAL **

***** Nagios *****

Notification Type: PROBLEM

Service: Cluster - Quorum
Host: cluster1
Address: cluster1
State: CRITICAL

Date/Time: Mon Jun 1 16:46:20 IST 2015

Additional Info:

QUORUM: Cluster server-side quorum lost.

When quorum is regained, the following email notification is sent:

** RECOVERY Service Alert: cluster1/Cluster - Quorum is OK **

***** Nagios *****

Notification Type: RECOVERY

Service: Cluster - Quorum
Host: cluster1
Address: cluster1
State: OK

Date/Time: Mon Jun 1 16:56:00 IST 2015

Additional Info:

QUORUM: Cluster server-side quorum regained.
Comment 6 Divya 2015-07-26 01:27:27 EDT
Sahina,

Could you review the edited doc text and sign off?
Comment 7 Sahina Bose 2015-07-27 01:03:49 EDT
Acked
Comment 9 errata-xmlrpc 2015-07-29 01:26:21 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHEA-2015-1494.html
