Description of problem:
For disperse and distributed disperse volumes, the volume should be marked CRITICAL when the number of bricks that go offline is greater than the redundancy count. Currently the volume status goes to the WARNING state instead.

Version-Release number of selected component (if applicable):
nagios-server-addons-0.2.1-2.el6rhs.noarch

How reproducible:
Always

Steps to Reproduce:
1. Create disperse and distributed disperse volumes.
2. Run the configure-gluster-nagios command to monitor them.
3. Bring down more bricks than the redundancy count.

Actual results:
Volume status for disperse and distributed disperse volumes is marked as WARNING.

Expected results:
Volume status should be marked as CRITICAL when the number of bricks that go offline is greater than the redundancy count.

Additional info:
Created attachment 1043081: volume status goes to warning for disperse and distributed disperse volumes.
New volume types are not considered in nagios plugins - this is a functionality break.
Upstream fix patch link: http://review.gluster.org/#/q/topic:Bug-1235651
Doc text is edited. Please sign off to be included in Known Issues.
Looks Good.
Fix patches:
https://code.engineering.redhat.com/gerrit/#/c/55294
https://code.engineering.redhat.com/gerrit/#/c/55296
https://code.engineering.redhat.com/gerrit/#/c/55298
Verified and works fine with builds gluster-nagios-addons-0.2.5-1.el7rhgs.x86_64 and nagios-server-addons-0.2.2-1.el6rhs.noarch.

In a disperse config of 1 x (4 + 2), when one brick goes down, the volume status in nagios goes to WARNING with the status information "WARNING : Volume : DISPERSE type Brick(s) - <brickpath> is are down, but disperse pair(s) are up".

In a disperse config of 1 x (4 + 2), when the number of bricks that go down is greater than the redundancy count, the volume status goes to CRITICAL with the status information "CRITICAL : Volume: DISPERSE type Bricks - <brick path> are down, along with one or more disperse pair(s)".

In a distributed disperse config of 2 x (4 + 2), when one brick goes down, the volume status in nagios goes to WARNING with the status information "WARNING :Volume: DISTRIBUTED_DISPERSE type Brick(s) - <brick path> is are down, but disperse pair(s) are up".

In a distributed disperse config of 2 x (4 + 2), when more bricks than the redundancy count go down in each of the distribute sets, the volume status goes to CRITICAL with the status information "CRITICAL:Volume:DISTRIBUTED_DISPERSE type Bricks - <brickpath> are down, along with one or more disperse pair(s)".

In a distributed disperse config of 2 x (4 + 2), when all the bricks in one of the distribute sets go down, the volume status is marked as CRITICAL with the status information "CRITICAL:Volume:DISTRIBUTED_DISPERSE type Bricks - <brickpath> are down, along with one or more disperse pair(s)".
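For reference, the rule exercised by the cases above can be sketched roughly as follows. This is only an illustration, not the code from the fix patches: the function name, the constants, and the input shape (one list of brick online flags per disperse set) are all assumptions made for this example.

```python
# Hypothetical sketch of the status rule verified above. Nothing here is
# taken from the gluster-nagios plugins; all names are illustrative.
OK, WARNING, CRITICAL = 0, 1, 2  # standard Nagios plugin return codes


def disperse_volume_state(subvolumes, redundancy):
    """Return the volume state for a (distributed) disperse volume.

    subvolumes: one list per disperse set; each entry is True if that
                brick is online and False if it is down.
    redundancy: redundancy count of each disperse set (2 for a 4 + 2 set).
    """
    worst = OK
    for bricks in subvolumes:
        down = sum(1 for online in bricks if not online)
        if down > redundancy:
            # More bricks down than this set can tolerate.
            return CRITICAL
        if down > 0:
            # Some bricks are down, but the set is still within redundancy.
            worst = WARNING
    return worst
```

With a 2 x (4 + 2) layout, one brick down in one set yields WARNING under this sketch, while three or more bricks down in any single set yields CRITICAL, which matches the behaviour observed during verification.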
Hi Darshan, The doc text is updated. Please review it and share your technical review comments. If it looks ok, then sign-off on the same.
(In reply to Bhavana from comment #13)
> Hi Darshan,
>
> The doc text is updated. Please review it and share your technical review
> comments. If it looks ok, then sign-off on the same.

Small suggestion: previously, the volume status service did report the status of disperse and distributed disperse volumes, but the reported status was incorrect.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://rhn.redhat.com/errata/RHBA-2015-1848.html