Bug 1580385
| Summary: | Node is DOWN alert not cleared properly | | |
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat Gluster Storage | Reporter: | Filip Balák <fbalak> |
| Component: | web-admin-tendrl-notifier | Assignee: | gowtham <gshanmug> |
| Status: | CLOSED ERRATA | QA Contact: | Filip Balák <fbalak> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | rhgs-3.4 | CC: | asachan, nthomas, rhs-bugs, sankarshan |
| Target Milestone: | --- | | |
| Target Release: | RHGS 3.4.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | tendrl-commons-1.6.3-8.el7rhgs tendrl-node-agent-1.6.3-8.el7rhgs | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-09-04 07:06:51 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1503137 | | |
| Attachments: | | | |

Looks OK. --> VERIFIED

Tested with:
tendrl-ansible-1.6.3-4.el7rhgs.noarch
tendrl-api-1.6.3-3.el7rhgs.noarch
tendrl-api-httpd-1.6.3-3.el7rhgs.noarch
tendrl-commons-1.6.3-6.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-4.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-4.el7rhgs.noarch
tendrl-node-agent-1.6.3-6.el7rhgs.noarch
tendrl-notifier-1.6.3-3.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-3.el7rhgs.noarch

I saw a similar issue in the latest build. I believe it also occurs in all older builds: when I found the root cause, I realized it is not specific to the latest build. On my low-configuration machine I cannot reproduce the issue consistently; it happens very rarely. On high-configuration machines I can reproduce it every time.

I have fixed this issue, and the PR is under review: https://github.com/Tendrl/commons/pull/996

So, as per discussion with Martin, I am moving this issue to the ASSIGNED state.

I saw this problem in the latest build as well:
tendrl-ansible-1.6.3-5.el7rhgs.noarch
tendrl-commons-1.6.3-7.el7rhgs.noarch
tendrl-api-1.6.3-3.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-5.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-node-agent-1.6.3-7.el7rhgs.noarch
tendrl-ui-1.6.3-4.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-5.el7rhgs.noarch
tendrl-api-httpd-1.6.3-3.el7rhgs.noarch

Created attachment 1452872 [details]
Node down alert not cleared when node is up

This looks OK, but a similar BZ was filed while testing node start/stop scenarios: BZ 1600910. --> VERIFIED

Tested with:
tendrl-ansible-1.6.3-5.el7rhgs.noarch
tendrl-api-1.6.3-4.el7rhgs.noarch
tendrl-api-httpd-1.6.3-4.el7rhgs.noarch
tendrl-commons-1.6.3-8.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-6.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-6.el7rhgs.noarch
tendrl-node-agent-1.6.3-8.el7rhgs.noarch
tendrl-notifier-1.6.3-4.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-6.el7rhgs.noarch

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2616

Created attachment 1439582 [details]
Hosts page with alerts

Description of problem:
When one of the Gluster nodes is shut down and then started again after a while, one alert remains:
`Node <node-id> is DOWN`
All other alerts are cleared correctly.

Version-Release number of selected component (if applicable):
tendrl-ansible-1.6.3-4.el7rhgs.noarch
tendrl-api-1.6.3-3.el7rhgs.noarch
tendrl-api-httpd-1.6.3-3.el7rhgs.noarch
tendrl-commons-1.6.3-5.el7rhgs.noarch
tendrl-grafana-plugins-1.6.3-3.el7rhgs.noarch
tendrl-grafana-selinux-1.5.4-2.el7rhgs.noarch
tendrl-monitoring-integration-1.6.3-3.el7rhgs.noarch
tendrl-node-agent-1.6.3-5.el7rhgs.noarch
tendrl-notifier-1.6.3-3.el7rhgs.noarch
tendrl-selinux-1.5.4-2.el7rhgs.noarch
tendrl-ui-1.6.3-2.el7rhgs.noarch

How reproducible:
100%

Steps to Reproduce:
1. Install WA.
2. Import a cluster with a volume.
3. Shut down one node.
4. Wait for Tendrl to raise alerts.
5. Start the node.
6. Check the alerts in the UI.

Actual results:
One alert remains:
`Node <node-id> is DOWN`

Expected results:
There should be no alerts if the node started correctly.

Additional info:
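
For illustration only, the expected behavior can be modeled with a minimal Python sketch. This is not Tendrl's code: `AlertStore`, `raise_alert`, `clear_alert`, and `simulate_restart` are hypothetical names, and keying alerts on an (alert type, node id) pair is an assumption made for the example. It only shows the outcome the report expects, namely that once the node is back up, the clearing event removes the matching `Node <node-id> is DOWN` alert along with every other alert.

```python
from dataclasses import dataclass, field


@dataclass
class AlertStore:
    """Hypothetical in-memory store of active alerts (not Tendrl's implementation)."""

    # Assumption: an alert is identified by (alert_type, node_id).
    active: dict = field(default_factory=dict)

    def raise_alert(self, alert_type, node_id, message):
        """Record an active alert; a repeated raise overwrites the same key."""
        self.active[(alert_type, node_id)] = message

    def clear_alert(self, alert_type, node_id):
        """Remove the matching alert; return True only if one was active."""
        return self.active.pop((alert_type, node_id), None) is not None

    def messages(self):
        """Return the messages of all alerts that are still active."""
        return sorted(self.active.values())


def simulate_restart(node_id="node-1"):
    """Model steps 3 to 6 of the report: node goes down, comes back, alerts are checked."""
    store = AlertStore()
    # Node is shut down: a DOWN alert and one unrelated alert are raised.
    store.raise_alert("node_status", node_id, f"Node {node_id} is DOWN")
    store.raise_alert("other", node_id, f"some other alert for {node_id}")
    # Node is started again: each raised alert gets a matching clearing event.
    store.clear_alert("other", node_id)
    store.clear_alert("node_status", node_id)
    return store.messages()


if __name__ == "__main__":
    print(simulate_restart())
```

In this model, the symptom in the report corresponds to the `node_status` clearing event never reaching the store, or arriving with a key that does not match the raised alert, so the DOWN message would stay in `messages()`.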