Bug 1128154 - [Nagios] Nagios should treat state change reported by passive checks as HARD
Summary: [Nagios] Nagios should treat state change reported by passive checks as HARD
Keywords:
Status: CLOSED CURRENTRELEASE
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: gluster-nagios-addons
Version: rhgs-3.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: RHGS 3.0.0
Assignee: Sahina Bose
QA Contact: Shruti Sampat
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2014-08-08 12:49 UTC by Shruti Sampat
Modified: 2015-05-13 16:55 UTC
CC List: 7 users

Fixed In Version: nagios-server-addons-0.1.6-1.el6rhs
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2015-01-08 09:46:04 UTC
Embargoed:


Description Shruti Sampat 2014-08-08 12:49:26 UTC
Description of problem:
------------------------

When a change in the state of a service is reported by a passive check, the Nagios server should be configured to send out notifications at the first occurrence of the state change. In other words, the state should be treated as a HARD state.

For example, the quorum status service for a cluster relies only on passive checks. Whenever a passive check reporting a change in the state of a service is received, this state should be treated as a HARD state, and notifications should be sent out immediately.

Currently, a hard state is reached only after 3 checks.

Version-Release number of selected component (if applicable):
gluster-nagios-addons-0.1.10-2.el6rhs.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Enable server quorum on a volume being monitored by the Nagios server.
2. Cause quorum to be lost on the volume.
3. Observe the quorum status service: it treats the state as a SOFT state and waits for 3 checks before sending out notifications.

Actual results:
The Nagios server waits for 3 checks before declaring the state a HARD state and sending out notifications.

Expected results:
Whenever a state change is reported by a passive check, immediately indicate the state to be a HARD state and thus send notifications.

Additional info:

Comment 1 Sahina Bose 2014-08-08 13:13:07 UTC
There's an is_volatile directive in the service configuration that we can use to make sure that passive service checks are treated as HARD states the first time.

Comment 2 Sahina Bose 2014-08-12 09:20:40 UTC
On further reading, the is_volatile directive sends out a notification on every non-OK check result. This is not what we need.
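
The difference can be illustrated with a simplified model of when notifications go out. This is a rough sketch, not Nagios source code: the function names `should_notify` and `count_notifications` are hypothetical, and it assumes every result is already a HARD state, notifications are enabled, and no re-notification interval is in play.

```python
def should_notify(prev_state, new_state, is_volatile):
    """Return True if this HARD check result would trigger a notification
    (simplified model; the names here are illustrative only)."""
    if new_state != prev_state:
        # Ordinary hard state change: problem or recovery notification.
        return True
    # No state change: only a volatile service re-notifies, and only while non-OK.
    return is_volatile and new_state != "OK"


def count_notifications(results, is_volatile):
    """Count notifications for a sequence of results, starting from OK."""
    prev_state = "OK"
    count = 0
    for state in results:
        if should_notify(prev_state, state, is_volatile):
            count += 1
        prev_state = state
    return count


# Quorum stays lost for three consecutive passive results, then recovers.
results = ["CRITICAL", "CRITICAL", "CRITICAL", "OK"]
volatile_count = count_notifications(results, is_volatile=True)
normal_count = count_notifications(results, is_volatile=False)
print(volatile_count, normal_count)
```

In this model the volatile service notifies on every non-OK result plus the recovery, while the non-volatile service notifies only on the two state changes.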

We need to set max_check_attempts = 1 for passive checks so that state changes are treated as HARD state changes.
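
The effect of the attempt limit described above (the `max_check_attempts` directive in Nagios service definitions) can be sketched with a simplified model of SOFT and HARD classification. This is a minimal illustration, not Nagios source code: the function name `classify_results` is hypothetical, the value 3 is taken from the behaviour reported in this bug, and OK results are simply labelled HARD here, whereas real Nagios also distinguishes soft recoveries.

```python
def classify_results(results, max_check_attempts):
    """Label each check result as SOFT or HARD (simplified model).

    results: sequence of state strings such as "OK" or "CRITICAL".
    max_check_attempts: number of consecutive non-OK results needed
    before the problem is treated as HARD.
    """
    labels = []
    attempt = 0  # consecutive non-OK results seen so far
    for state in results:
        if state == "OK":
            attempt = 0
            labels.append((state, "HARD"))
            continue
        attempt = min(attempt + 1, max_check_attempts)
        state_type = "HARD" if attempt >= max_check_attempts else "SOFT"
        labels.append((state, state_type))
    return labels


# Quorum is lost after an initial OK result.
quorum_lost = ["OK", "CRITICAL", "CRITICAL", "CRITICAL"]
default_labels = classify_results(quorum_lost, max_check_attempts=3)
fixed_labels = classify_results(quorum_lost, max_check_attempts=1)
print(default_labels)
print(fixed_labels)
```

With a limit of 3, the first two CRITICAL results are SOFT and only the third is HARD. With a limit of 1, the first CRITICAL result is already HARD, which is the behaviour requested in this bug.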

Comment 3 Sahina Bose 2014-08-13 06:07:31 UTC
Posted patch http://review.gluster.org/8467, which sets this configuration in the passive service template.

Comment 4 Shruti Sampat 2014-08-21 12:27:32 UTC
Verified as fixed in nagios-server-addons-0.1.6-1.el6rhs.noarch.

Every passive check for the quorum service that reports a state change now results in a HARD state. Notifications are sent as expected.

Comment 5 Stanislav Graf 2015-01-08 09:46:04 UTC
This bug was fixed and verified as part of the RHS 3.0.0 release.

-> CLOSED - CURRENTRELEASE

