Bug 1136205
| Summary: | [Nagios] Volume status is seen to be in warning status with status information "null" when glusterd is stopped on one RHS node. | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Shruti Sampat <ssampat> |
| Component: | gluster-nagios-addons | Assignee: | Nishanth Thomas <nthomas> |
| Status: | CLOSED ERRATA | QA Contact: | Shruti Sampat <ssampat> |
| Severity: | urgent | Docs Contact: | |
| Priority: | urgent | ||
| Version: | rhgs-3.0 | CC: | bkunal, dpati, fharshav, kmayilsa, nthomas, psriniva, rhsc-qe-bugs, rnachimu, sharne, vumrao |
| Target Milestone: | --- | Keywords: | ZStream |
| Target Release: | RHGS 3.0.3 | ||
| Hardware: | All | ||
| OS: | Linux | ||
| Whiteboard: | |||
| Fixed In Version: | nagios-server-addons-0.1.9-1.el6rhs | Doc Type: | Bug Fix |
| Doc Text: | Previously, the Nagios plug-in sent the volume status request to the Red Hat Storage node without converting the Nagios host name to the respective IP Address. When the glusterd service was stopped on one of the nodes in a Red Hat Storage Trusted Storage Pool, the volume status displayed a warning and the status information was empty. With this fix, the error scenarios are handled properly and the system ensures that the glusterd service starts before it sends such a request to a Red Hat Storage node. | Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2015-01-15 13:49:17 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | 1109843 | ||
| Bug Blocks: | 1087818 | ||
|
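The behaviour described in the Doc Text can be illustrated with a minimal Python sketch. It is not the code shipped in nagios-server-addons: the function name and the `glusterd_is_running` and `query_volume` callables are hypothetical stand-ins for the real remote checks, and trying the next node when glusterd is down on one node is an assumption about how the fix behaves. The only points taken from the Doc Text are that the host name is converted to an IP address, that glusterd is checked before the request is sent, and that the status information is never left empty.

```python
import socket

# Standard Nagios plug-in exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_volume_status(nagios_hosts, glusterd_is_running, query_volume,
                        resolve=socket.gethostbyname):
    """Return (exit_code, status_text); status_text is never empty.

    glusterd_is_running(ip) and query_volume(ip) are hypothetical callables
    supplied by the caller. resolve defaults to a real DNS lookup.
    """
    for name in nagios_hosts:
        try:
            address = resolve(name)  # Nagios host name -> IP address
        except OSError:
            continue  # unresolvable host: try the next node
        if not glusterd_is_running(address):
            continue  # assumption: skip nodes where glusterd is stopped
        try:
            code, text = query_volume(address)
        except Exception as err:  # broad on purpose: always report something
            return UNKNOWN, "UNKNOWN: volume status query failed: %s" % err
        if not text:
            return UNKNOWN, "UNKNOWN: empty reply from %s" % address
        return code, text
    return UNKNOWN, "UNKNOWN: no node with a running glusterd was reachable"
```

Passing the resolver and the two checks in as arguments keeps the sketch free of network access, so each error path can be exercised in isolation.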
Description
Shruti Sampat
2014-09-02 07:26:34 UTC
Please review and sign off on the edited doc text.

Q1. Why are Host and Address shown as Eskan, when Eskan is nothing but a cluster name?
ANS: In Nagios, the cluster is represented as a dummy with the cluster name as its name.

Q2. For the NULL issue, is this the bug, meaning "Additional Info: NULL"? Am I correct here?
ANS: SELinux in Enforcing mode can cause this issue. Moving SELinux to Permissive mode should solve the problem.

Q3. How can the customer stop these messages from filling up their inboxes? Is there a workaround?
ANS: Messages/notifications can be disabled using the Nagios UI, but it is worth checking the SELinux status before attempting this.

Please read the first answer in Comment #5 as:
ANS: In Nagios, the cluster is represented as a dummy host with the cluster name as its name. This is done by the auto-discovery script.

Thanks, Kanagaraj, for your quick response; it will help a lot. I will get back to you if anything else is needed from the customer's end.

In Comment #5, Nagios needs to be restarted ("service nagios restart") after moving SELinux to permissive mode. Vikhyat, please ask the customer to restart it if that has not already been done.

Moving back to assigned state as there are some scenarios that are not covered in the bug.

Verified as fixed in nagios-server-addons-0.1.9-1.el6rhs.

Tested with RHS+Nagios in a 4-node RHS cluster in the following scenarios:

1. glusterd was stopped on one of the nodes, on which one of the bricks of a volume resided. Volume status was OK with status information "OK: Volume : DISTRIBUTE type - All bricks are Up".
2. On a cluster with server quorum enabled, glusterd was brought down, causing quorum to be lost. The issue was not observed in this case either. The status of the volume with server quorum enabled was critical, with status information "CRITICAL: Volume : REPLICATE type - All bricks are down".
3. The nrpe service was stopped on one node. Volume status shows appropriate status information in this case too.

Marking as verified.

Nishanth, can you please review the edited doc text for technical accuracy and sign off?
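The status strings quoted in the verification scenarios above suggest a simple mapping from brick states to a Nagios state. The following Python sketch illustrates that mapping; it is not the plug-in's actual logic. The "All bricks are Up" and "All bricks are down" messages are copied from the test results, while the function name, the partial-failure (WARNING) branch, and the empty-input (UNKNOWN) branch are assumptions added for completeness.

```python
# Standard Nagios plug-in exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def volume_status(volume_type, brick_states):
    """Map brick states (True means the brick is up) to (code, text)."""
    prefix = "Volume : %s type - " % volume_type
    if not brick_states:
        # Assumption: report an explicit message rather than "null".
        return UNKNOWN, "UNKNOWN: " + prefix + "no brick information"
    if all(brick_states):
        return OK, "OK: " + prefix + "All bricks are Up"
    if not any(brick_states):
        return CRITICAL, "CRITICAL: " + prefix + "All bricks are down"
    # Assumption: wording and severity of the partial-failure case.
    down = brick_states.count(False)
    detail = "%d of %d bricks are down" % (down, len(brick_states))
    return WARNING, "WARNING: " + prefix + detail
```

For example, `volume_status("DISTRIBUTE", [True, True])` reproduces the message seen in scenario 1, and `volume_status("REPLICATE", [False, False])` reproduces the one seen in scenario 2.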
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2015-0039.html

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days