Bug 1904302

Summary:	[GSS] ceph_daemon label includes references to a replaced OSD that cause a prometheus ruleset to fail
Product:	[Red Hat Storage] Red Hat OpenShift Container Storage	Reporter:	Jay Samson <jpankaja>
Component:	rook	Assignee:	Anmol Sachan <asachan>
Status:	CLOSED ERRATA	QA Contact:	Martin Bukatovic <mbukatov>
Severity:	medium	Docs Contact:
Priority:	high
Version:	4.5	CC:	asachan, assingh, bkunal, bniver, dwalveka, ebenahar, hnallurv, madam, muagarwa, nberry, nthomas, ocs-bugs
Target Milestone:	---
Target Release:	OCS 4.7.0
Hardware:	All
OS:	All
Whiteboard:
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:	Errors in `must gather` due to failed rule evaluation. Previously, the recording rule `cluster:ceph_disk_latency:join_ceph_node_disk_irate1m` was not evaluated because Prometheus does not allow many-to-many matches. As a result, the failed rule evaluation caused errors in the `must gather` output and in the deployment. With this release, the query for the recording rule has been updated to eliminate the many-to-many match scenarios, so Prometheus rule evaluations should no longer fail and no such errors should appear in the deployment.	Story Points:	---
Clone Of:		Environment:
Last Closed:	2021-05-19 09:16:33 UTC	Type:	Bug
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks:	1938134
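
Illustration (not part of the original report): the Doc Text above attributes the failure to a many-to-many match. The Python sketch below models that matching rule only; it is not the recording rule itself and not the fix that shipped. The label names and values (instance, device, ceph_daemon, osd.1, osd.3), the sample series, and both helper functions are hypothetical. The assumption is that a replaced OSD leaves a stale series behind, so that the side of the join that must be unique per match key carries two series for the same disk.

```python
from collections import defaultdict


class ManyToManyError(ValueError):
    """Raised when the 'one' side of a join is not unique per match key."""


def join_many_to_one(many, one, on):
    """Pair each series in `many` with the single series in `one` that
    shares its `on` labels. Fails if `one` has duplicates for any key."""
    index = defaultdict(list)
    for labels in one:
        index[tuple(labels[k] for k in on)].append(labels)

    for key, group in index.items():
        if len(group) > 1:
            raise ManyToManyError(
                f"duplicate series on the 'one' side for {dict(zip(on, key))}"
            )

    joined = []
    for labels in many:
        key = tuple(labels[k] for k in on)
        if key in index:
            # Copy the extra labels from the unique 'one' side series.
            joined.append({**index[key][0], **labels})
    return joined


def keep_one_per_key(series, on):
    """Hypothetical clean-up: keep only the last series seen per match key."""
    latest = {}
    for labels in series:
        latest[tuple(labels[k] for k in on)] = labels
    return list(latest.values())


ON = ("instance", "device")

# Hypothetical per-disk latency series.
latency = [{"instance": "node-a", "device": "sdb", "latency": 0.004}]

# Hypothetical occupation series: osd.1 was replaced by osd.3 on the same
# disk, but the stale osd.1 series is still present.
occupation = [
    {"instance": "node-a", "device": "sdb", "ceph_daemon": "osd.1"},
    {"instance": "node-a", "device": "sdb", "ceph_daemon": "osd.3"},
]

try:
    join_many_to_one(latency, occupation, ON)
except ManyToManyError as err:
    print("join rejected:", err)

# Once each match key is unique on the 'one' side, the join succeeds.
print(join_many_to_one(latency, keep_one_per_key(occupation, ON), ON))
```

In this model the join is rejected as long as the stale series is present and succeeds once each (instance, device) pair maps to exactly one series. Keeping the last series seen is an arbitrary choice made for the sketch.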

Comment 2 Mudit Agarwal 2020-12-04 04:30:05 UTC

Not a 4.6 blocker.

Comment 7 Nishanth Thomas 2021-02-03 07:56:14 UTC

Moving out to 4.8

Comment 21 Michael Adam 2021-03-15 08:22:34 UTC

fixing up acks

Comment 27 Mudit Agarwal 2021-05-10 07:59:33 UTC

Hi Disha,

Doc text looks good to me, please go ahead with this.

Thanks
Mudit

Comment 29 Martin Bukatovic 2021-05-13 10:42:59 UTC

Verification via regression testing only: in our CI results, I see no issues that seem to be caused
by this change, and I also don't see any PrometheusRuleFailures alerts there.

Comment 31 errata-xmlrpc 2021-05-19 09:16:33 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2041