Bug 1904302
| Summary: | [GSS] ceph_daemon label includes references to a replaced OSD that cause a prometheus ruleset to fail | ||
|---|---|---|---|
| Product: | [Red Hat Storage] Red Hat OpenShift Container Storage | Reporter: | Jay Samson <jpankaja> |
| Component: | rook | Assignee: | Anmol Sachan <asachan> |
| Status: | CLOSED ERRATA | QA Contact: | Martin Bukatovic <mbukatov> |
| Severity: | medium | Docs Contact: | |
| Priority: | high | ||
| Version: | 4.5 | CC: | asachan, assingh, bkunal, bniver, dwalveka, ebenahar, hnallurv, madam, muagarwa, nberry, nthomas, ocs-bugs |
| Target Milestone: | --- | ||
| Target Release: | OCS 4.7.0 | ||
| Hardware: | All | ||
| OS: | All | ||
| Whiteboard: | |||
| Fixed In Version: | Doc Type: | Bug Fix | |
| Doc Text: |
.Errors in `must gather` due to failed rule evaluation
Previously, the recording rule `cluster:ceph_disk_latency:join_ceph_node_disk_irate1m` was not evaluated because Prometheus does not allow *many-to-many* matches. As a result, `must gather` and the deployment reported errors caused by the failed rule evaluation. With this release, the query for the recording rule has been updated to eliminate *many-to-many* match scenarios, so Prometheus rule evaluation should no longer fail and the deployment should no longer show these errors.
|
Story Points: | --- |
| Clone Of: | Environment: | ||
| Last Closed: | 2021-05-19 09:16:33 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | Category: | --- | |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
| Bug Depends On: | |||
| Bug Blocks: | 1938134 | ||
|
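The many-to-many failure described in the doc text can be illustrated with a small Python simulation. This is only a sketch of Prometheus-style label matching, not the actual recording rule or the fix shipped in OCS 4.7.0. The label names (`instance`, `device`), the sample values (`node-a`, `sdb`, `osd.0`, `osd.3`), and the helper functions are all assumptions made for illustration. It assumes that a replaced OSD leaves a stale series behind, so two series on the "one" side of the join share the same match key.

```python
from collections import defaultdict


def match_key(labels, on):
    """Build the join key from the labels named in `on` (hypothetical helper)."""
    return tuple(labels.get(name, "") for name in on)


def group_left_join(many_side, one_side, on):
    """Simulate `many * on(...) group_left one`.

    Each match key must be unique on the "one" side; otherwise the join
    is many-to-many and is rejected, as Prometheus would reject it.
    """
    index = defaultdict(list)
    for labels in one_side:
        index[match_key(labels, on)].append(labels)
    for key, series in index.items():
        if len(series) > 1:
            raise ValueError(
                f"many-to-many matching not allowed for {dict(zip(on, key))}"
            )
    joined = []
    for labels in many_side:
        partner = index.get(match_key(labels, on))
        if partner:
            # Copy the extra labels (here ceph_daemon) onto the "many" side.
            joined.append({**partner[0], **labels})
    return joined


def dedupe(one_side, on):
    """One possible workaround (an assumption, not the shipped fix):
    keep a single series per match key, here simply the last one seen."""
    latest = {}
    for labels in one_side:
        latest[match_key(labels, on)] = labels
    return list(latest.values())


# Hypothetical data: osd.0 was replaced by osd.3 on the same device,
# but the stale osd.0 series is still present.
disk_stats = [{"instance": "node-a", "device": "sdb"}]
occupation = [
    {"instance": "node-a", "device": "sdb", "ceph_daemon": "osd.0"},
    {"instance": "node-a", "device": "sdb", "ceph_daemon": "osd.3"},
]
on = ("instance", "device")

try:
    group_left_join(disk_stats, occupation, on)
    failed = False
except ValueError:
    failed = True

fixed = group_left_join(disk_stats, dedupe(occupation, on), on)
```

With the stale series present, the join is rejected. Once each match key is unique on the "one" side, the same join succeeds and attaches a single `ceph_daemon` label to the disk series.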
Comment 2
Mudit Agarwal
2020-12-04 04:30:05 UTC
Moving out to 4.8

fixing up acks

Hi Disha,
Doc text looks good to me, please go ahead with this.
Thanks
Mudit

Verification via regression testing only: In our CI results, I see no issues that seem to be caused by this change, and I also don't see any PrometheusRuleFailures alerts there.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041