Description of problem:
When a passive check reports a change in the state of a service, the Nagios server should send out notifications on the first occurrence of that change. In other words, the new state should be treated as a HARD state immediately.
For example, the quorum status service for a cluster relies only on passive checks. Whenever a passive check reports a state change for this service, the new state should be treated as a HARD state and notifications should be sent out immediately.
Currently, a HARD state is reached only after 3 checks.
Version-Release number of selected component (if applicable):
Steps to Reproduce:
1. Enable server quorum on a volume being monitored by the Nagios server.
2. Cause quorum to be lost on the volume.
3. Observe the quorum status service: it treats the new state as a SOFT state and waits for 3 checks before sending out notifications.
The Nagios server waits for 3 checks before declaring the state a HARD state and sending out notifications.
Whenever a passive check reports a state change, the state should immediately be treated as a HARD state and notifications should be sent.
There's an is_volatile directive in the service configuration that we could use to make sure that passive service check results are treated as HARD states the first time they are received.
On further reading, the is_volatile directive sends out a notification on every non-OK check result, which is not what we need.
We need to set max_check_attempts to 1 for passive checks, so that state changes are treated as HARD state changes.
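To illustrate the reasoning above, here is a minimal Python sketch of how the Nagios `max_check_attempts` and `is_volatile` service directives affect when a state becomes HARD and when a notification goes out. This is a simplified model written for this report, not Nagios Core's actual logic: the `simulate` function and its return shape are hypothetical, and re-notification intervals, flapping, and downtime are ignored.

```python
from typing import List, Tuple

OK = "OK"


def simulate(results: List[str], max_check_attempts: int,
             is_volatile: bool = False) -> List[Tuple[str, str, bool]]:
    """Return (state, state_type, notified) for each check result.

    Simplified model (an assumption, not Nagios Core code):
    - a non-OK state becomes HARD once it has been seen
      max_check_attempts times in a row;
    - a notification is sent when the HARD state changes;
    - a volatile service also notifies on every HARD non-OK result.
    """
    timeline = []
    last_hard_state = OK
    attempt = 0  # consecutive non-OK results, capped at max_check_attempts

    for state in results:
        if state == OK:
            # Recovering before the problem went HARD is a soft recovery.
            soft_recovery = attempt > 0 and last_hard_state == OK
            state_type = "SOFT" if soft_recovery else "HARD"
            # Recovery notification only if a HARD problem was reported.
            notified = last_hard_state != OK
            last_hard_state = OK
            attempt = 0
        else:
            attempt = min(attempt + 1, max_check_attempts)
            if attempt >= max_check_attempts:
                state_type = "HARD"
                notified = is_volatile or state != last_hard_state
                last_hard_state = state
            else:
                state_type = "SOFT"
                notified = False
        timeline.append((state, state_type, notified))
    return timeline


if __name__ == "__main__":
    passive_results = ["CRITICAL", "CRITICAL", "CRITICAL", "OK"]
    for label, kwargs in [
        ("max_check_attempts=3", {"max_check_attempts": 3}),
        ("max_check_attempts=1", {"max_check_attempts": 1}),
        ("max_check_attempts=1, is_volatile",
         {"max_check_attempts": 1, "is_volatile": True}),
    ]:
        print(label)
        for entry in simulate(passive_results, **kwargs):
            print("  ", entry)
```

In this model, three attempts delay the first notification until the third CRITICAL result, a single attempt notifies on the first CRITICAL result and again on recovery, and the volatile variant notifies on every CRITICAL result, which is the behaviour rejected above.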
Posted patch http://review.gluster.org/8467, which sets this option in the passive service template.
Verified as fixed in nagios-server-addons-0.1.6-1.el6rhs.noarch.
Every passive check for the quorum service that reports a state change now results in a HARD state. Notifications are sent as expected.
This bug was fixed and verified as part of RHS 3.0.0 release.
-> CLOSED - CURRENTRELEASE