Bug 1904302

Summary: [GSS] ceph_daemon label includes references to a replaced OSD that cause a prometheus ruleset to fail
Product: [Red Hat Storage] Red Hat OpenShift Container Storage
Reporter: Jay Samson <jpankaja>
Component: rook
Assignee: Anmol Sachan <asachan>
Status: CLOSED ERRATA
QA Contact: Martin Bukatovic <mbukatov>
Severity: medium
Docs Contact:
Priority: high
Version: 4.5
CC: asachan, assingh, bkunal, bniver, dwalveka, ebenahar, hnallurv, madam, muagarwa, nberry, nthomas, ocs-bugs
Target Milestone: ---
Target Release: OCS 4.7.0
Hardware: All
OS: All
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
.Errors in `must gather` due to failed rule evaluation
Previously, the recording rule `cluster:ceph_disk_latency:join_ceph_node_disk_irate1m` was not evaluated because Prometheus does not allow *many-to-many* matches. This failed rule evaluation caused errors in `must gather` and in the deployment. With this release, the query for the recording rule has been updated to eliminate *many-to-many* match scenarios, so the Prometheus rule evaluation should no longer fail and the deployment should no longer show these errors.
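
For illustration only, here is a minimal Python sketch of why a stale `ceph_daemon` label can lead to a *many-to-many* match. It does not reproduce the actual rule or the fix. It assumes a join on the labels `instance` and `device`, and the series, label values, and the helper `duplicate_match_groups` are all hypothetical. In Prometheus vector matching, the "one" side of a join must have at most one series per match group; the sketch finds the groups where that does not hold.

```python
from collections import defaultdict


def duplicate_match_groups(series, on):
    """Return match groups that contain more than one series.

    `series` is a list of label dicts and `on` is the tuple of label
    names used for matching. A missing label is treated as an empty
    string, which is how Prometheus treats absent labels.
    """
    groups = defaultdict(list)
    for labels in series:
        key = tuple(labels.get(name, "") for name in on)
        groups[key].append(labels)
    return {key: members for key, members in groups.items() if len(members) > 1}


# Hypothetical series: osd.1 was replaced by osd.3 on the same disk,
# but the series for osd.1 is still being reported.
occupation = [
    {"ceph_daemon": "osd.1", "instance": "node-a", "device": "sdb"},  # stale
    {"ceph_daemon": "osd.3", "instance": "node-a", "device": "sdb"},  # replacement
    {"ceph_daemon": "osd.2", "instance": "node-b", "device": "sdb"},
]

dupes = duplicate_match_groups(occupation, on=("instance", "device"))
for key, members in dupes.items():
    # Two series share one match group, so this side is not unique.
    print(key, [m["ceph_daemon"] for m in members])
```

In this sketch, dropping the stale series leaves every match group with a single member, which is the condition a one-to-many join needs on its "one" side.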
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-05-19 09:16:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1938134    

Comment 2 Mudit Agarwal 2020-12-04 04:30:05 UTC
Not a 4.6 blocker.

Comment 7 Nishanth Thomas 2021-02-03 07:56:14 UTC
Moving out to 4.8

Comment 21 Michael Adam 2021-03-15 08:22:34 UTC
fixing up acks

Comment 27 Mudit Agarwal 2021-05-10 07:59:33 UTC
Hi Disha,

Doc text looks good to me, please go ahead with this.

Thanks
Mudit

Comment 29 Martin Bukatovic 2021-05-13 10:42:59 UTC
Verification via regression testing only: In our CI results, I see no issues that seem to be caused
by this change, and I also don't see any PrometheusRuleFailures alerts there.

Comment 31 errata-xmlrpc 2021-05-19 09:16:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041