Bug 1519201
Summary: WA doesn't reflect that all gluster nodes are down
Product: [Red Hat Storage] Red Hat Gluster Storage
Reporter: Martin Kudlej <mkudlej>
Component: web-admin-tendrl-monitoring-integration
Assignee: Anmol Sachan <asachan>
Status: CLOSED ERRATA
QA Contact: Filip Balák <fbalak>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: rhgs-3.3
CC: asachan, asriram, fbalak, mkudlej, nthomas, rhs-bugs, sanandpa, sankarshan, srmukher, ssaha
Target Milestone: ---
Target Release: RHGS 3.4.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: tendrl-node-agent-1.6.3-7.el7rhgs tendrl-monitoring-integration-1.6.3-5.el7rhgs tendrl-gluster-integration-1.6.3-5.el7rhgs
Doc Type: Known Issue
Doc Text: When the entire gluster cluster goes down because the hosts go down simultaneously, the WA dashboard only displays information about the cluster and the nodes being unhealthy. It does not provide detailed information about the health of the bricks and volumes.
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-09-04 07:00:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On: 1516845
Bug Blocks: 1503134
Attachments:
Description
Martin Kudlej
2017-11-30 11:24:44 UTC
Created attachment 1360866 [details]
gl1 is up
Created attachment 1360867 [details]
some charts don't reflect status of nodes
I haven't requested a screenshot because I didn't expect to find a new bug. I ran "shutdown -h now" on all Gluster nodes.

I am not able to reproduce this issue with the latest builds. After the reboot I could see that host status information is up to date in the tendrl UI and the grafana dashboard. Having said that, there are issues around the updates of volumes, bricks, etc. on the grafana dashboard (which are not discussed as part of this bug) when all the nodes are shut down. This is because all the agents (which are responsible for these updates) running on the nodes are down. This needs to be tackled differently.

I don't think this is something which can be taken in for this release. Also, this scenario is very rare in a production environment. Even if it happens, the host down status is correctly indicated on the dashboard, and that's a good enough indication for the administrator to take action.

Having discussed it with QE (Sweta), it has been agreed to document this bug as a known_issue for this release.

Updated, please check.

*** Bug 1583724 has been marked as a duplicate of this bug. ***

*** Bug 1583727 has been marked as a duplicate of this bug. ***

I tested the scenario several times. The current status:
* All nodes in the Hosts page are Down as expected.
* At least one node (the last one shut down) remains Up in Grafana.
* Volume disappears from UI (BZ 1588436).
* Not all bricks are Down in UI and in Grafana.

--> ASSIGNED

Tested with:
tendrl-ansible-1.6.3-4.el7rhgs.noarch
tendrl-api-1.6.3-3.el7rhgs.noarch
tendrl-api-httpd-1.6.3-3.el7rhgs.noarch
tendrl-commons-1.6.3-6.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-4.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-4.el7rhgs.noarch
tendrl-node-agent-1.6.3-6.el7rhgs.noarch
tendrl-notifier-1.6.3-3.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-3.el7rhgs.noarch

@anmol, please take a look at https://bugzilla.redhat.com/show_bug.cgi?id=1588436#c8

PRs are under review:
https://github.com/Tendrl/commons/pull/989
https://github.com/Tendrl/gluster-integration/pull/660

Looks ok. All status panels in Grafana and UI reflect the status of hosts and bricks correctly, and alerts are raised.

--> VERIFIED

Tested with:
tendrl-ansible-1.6.3-5.el7rhgs.noarch
tendrl-api-1.6.3-3.el7rhgs.noarch
tendrl-api-httpd-1.6.3-3.el7rhgs.noarch
tendrl-commons-1.6.3-7.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-5.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-5.el7rhgs.noarch
tendrl-node-agent-1.6.3-7.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-4.el7rhgs.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2018:2616
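
The stale brick status discussed in the comments above (bricks staying Up because the node agents that report them have gone down) can be illustrated with a small sketch. This is not the code from the linked Tendrl PRs; it only shows one possible approach, in which the monitoring side stops trusting a brick's last reported status once the reporting node's agent has been silent for too long. `BrickReport`, `effective_brick_status`, `AGENT_TIMEOUT_SECONDS`, and the host name `gl2` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical threshold: how long a node agent may stay silent before
# everything it reported is treated as stale. Not an actual Tendrl setting.
AGENT_TIMEOUT_SECONDS = 60.0


@dataclass
class BrickReport:
    node: str    # host whose agent reported this brick
    status: str  # last status that agent reported, e.g. "Up"


def effective_brick_status(
    brick: BrickReport,
    last_heartbeat: Dict[str, float],
    now: float,
    timeout: float = AGENT_TIMEOUT_SECONDS,
) -> str:
    """Return the status to display for a brick.

    If the reporting node's agent has never been seen, or was last seen
    more than `timeout` seconds ago, the stored status cannot be trusted,
    so the brick is shown as "Down" instead of its last reported value.
    """
    seen: Optional[float] = last_heartbeat.get(brick.node)
    if seen is None or now - seen > timeout:
        return "Down"
    return brick.status


if __name__ == "__main__":
    # gl1 was heard from 30 s ago, gl2 (hypothetical host) 130 s ago.
    heartbeats = {"gl1": 1000.0, "gl2": 900.0}
    bricks = [BrickReport("gl1", "Up"), BrickReport("gl2", "Up")]
    for b in bricks:
        print(b.node, effective_brick_status(b, heartbeats, now=1030.0))
```

With a check like this, shutting down every node at once would still mark all bricks Down after the timeout, because the decision is made on the server side and does not depend on any agent being alive to report the change.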