Bug 1284874
| Field | Value |
|---|---|
| Summary | cluster quorum status wrongly shows ok even when one of the nodes is powered down |
| Product | [Red Hat Storage] Red Hat Gluster Storage |
| Component | nagios-server-addons |
| Reporter | Triveni Rao <trao> |
| Assignee | Sahina Bose <sabose> |
| QA Contact | Triveni Rao <trao> |
| Status | CLOSED ERRATA |
| Severity | high |
| Priority | high |
| Version | rhgs-3.1 |
| CC | asrivast, divya, knarra, sabose, sankarshan, sashinde |
| Keywords | ZStream |
| Target Release | RHGS 3.1.2 |
| Hardware | x86_64 |
| OS | Linux |
| Fixed In Version | nagios-server-addons-0.2.3-1 |
| Doc Type | Bug Fix |
| Doc Text | Previously, Quorum service incorrectly displayed OK status even when more than 50% of the nodes were down. This was because the freshness check overwrote the quorum service. With the fix, freshness check overrides stale status only when status is not Critical. Now, Quorum service displays correct status. |
| Last Closed | 2016-03-01 06:12:46 UTC |
| Type | Bug |
| Bug Blocks | 1260783 |
Description
Triveni Rao
2015-11-24 11:28:34 UTC
The issue is due to the active check overriding the Nagios output. The active check should override only when the service status is not critical. Currently, the existing service status check is not returning results, which causes the wrong output.

Fixed in patch: http://review.gluster.org/12735

This bug is verified with the provided fixed version, nagios-server-addons-0.2.3-1.

Steps followed:
1. Install RHSC + Nagios on a new build of 3.1.2.
2. Add RHGS nodes (RHEL 6.7 or RHEL 7.2).
3. Power down one of the nodes and check in the UI.
4. Cluster quorum status shows the quorum lost message properly, with no flapping.
5. Power up the node and check the UI; services came back to normal states.

Attached are the two screenshots taken after power down and power up.

Version:
gluster-nagios-common-0.2.3-1.el6rhs.noarch
nagios-server-addons-0.2.3-1.el6rhs.noarch

Created attachment 1105835: power down
Created attachment 1105836: power up
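
The override rule described in this report (the active freshness check may replace the reported status only when the current service status is not critical) can be sketched roughly as below. This is not the code from the patch: the function name `resolve_quorum_status` and its parameters are hypothetical, chosen only for illustration. The numeric codes follow the standard Nagios plugin return-code convention.

```python
# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def resolve_quorum_status(current_status, freshness_status):
    """Pick the status to report when the freshness (active) check fires.

    Hypothetical helper, not the actual nagios-server-addons code.
    current_status:   last status recorded for the quorum service
    freshness_status: status the freshness check wants to write
    """
    # A critical status (for example, quorum lost) must not be
    # overwritten by the freshness check.
    if current_status == CRITICAL:
        return current_status
    # Any other stale status may be replaced by the freshness result.
    return freshness_status
```

With this rule, a quorum-lost (critical) status stays visible after a node is powered down, instead of being replaced by whatever the freshness check reports.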
Sahina, could you review and sign off on the edited doc text?

Looks good to me.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2016-0310.html