Description of problem:
When glusterd is down on one of the nodes, the volume status should change accordingly. For example, create a distribute volume on two nodes, say node1 and node2. Now stop glusterd on one of the nodes, say node2. The distribute volume status should be shown as critical, since one of the bricks of the volume resides on node2. But in the Nagios UI, the volume status is still marked as "OK" when glusterd is down.

Version-Release number of selected component (if applicable):
nagios-server-addons-0.2.1-3.el6rhs.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create a cluster with two nodes and monitor them using Nagios.
2. Bring down glusterd on one of the nodes.
3.

Actual results:
Volume status always shows "OK" with the status information "all bricks are up".

Expected results:
Volume status for the distribute volume should be marked critical, with status information stating that one of the bricks is down.

Additional info:
I think this is expected behaviour. Even if glusterd is down on one of the nodes, the bricks are still online and accessible. Until the bricks are marked down, the volume is not marked CRITICAL. Is this a regression? I don't see any change to this plugin's behaviour. Removing devel_ack until confirmed.
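To illustrate the behaviour described above, here is a minimal sketch of how a volume check might derive its state from brick state alone. This is not the actual nagios-server-addons plugin code: the function name `distribute_volume_status` and both parameters are hypothetical, and the CRITICAL message text is an assumption. Only the return codes (0 for OK, 2 for CRITICAL) follow the standard Nagios plugin convention.

```python
# Hypothetical sketch; NOT the actual nagios-server-addons plugin code.
OK, CRITICAL = 0, 2  # standard Nagios plugin return codes


def distribute_volume_status(bricks_online, glusterd_up):
    """Return (code, message) for a distribute volume.

    bricks_online: dict of brick name -> True if the brick process is online.
    glusterd_up:   dict of node name -> True if glusterd is running there.
                   Accepted only to show that it is never consulted.
    """
    # Collect the bricks that are reported as offline.
    down = sorted(name for name, online in bricks_online.items() if not online)
    if down:
        # Message wording is an assumption made for this sketch.
        return CRITICAL, "bricks down: " + ", ".join(down)
    return OK, "all bricks are up"


# glusterd is stopped on node2, but both brick processes are still running.
status = distribute_volume_status(
    bricks_online={"node1:/bricks/b1": True, "node2:/bricks/b2": True},
    glusterd_up={"node1": True, "node2": False},
)
print(status)
```

Under these assumptions the volume stays OK with "all bricks are up" while glusterd is down on node2, which matches the reported result.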
Brick status is marked as UNKNOWN in the Nagios UI when glusterd on that node goes down. IMO, the volume status should also change.
Doc text is edited. Please sign off to be included in Known Issues.
Doc text looks good.
Nagios monitors brick status and glusterd status separately and sends notifications if these services are down. In this particular case, the volume status cannot be determined correctly, so even a change in volume status could be misinterpreted. Closing this. Please re-open if you can suggest the status the volume should move to.