2145146 – CDI operator is not creating PrometheusRule resource with alerts if CDI resource is incorrect

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Container Native Virtualization (CNV)
Classification:	Red Hat
Component:	Storage
Sub Component:
Version:	4.12.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	medium
Target Milestone:	---
Target Release:	4.12.1
Assignee:	Arnon Gilboa
QA Contact:	Yan Du
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2022-11-23 10:55 UTC by João Vilaça
Modified:	2023-09-05 16:30 UTC
CC List:	6 users
Fixed In Version:	v4.12.1-38
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2023-09-05 16:29:19 UTC
Target Upstream Version:
Embargoed:
Dependent Products:

Links
System	ID	Priority	Status	Summary	Last Updated
Github	kubevirt containerized-data-importer pull 2546	None	Merged	Ensure Prometheus resources exist for CDINotReady	2023-02-01 12:26:15 UTC
Github	kubevirt containerized-data-importer pull 2562	None	Merged	[release-v1.55] Ensure Prometheus resources exist for CDINotReady	2023-02-01 12:26:19 UTC
Github	kubevirt containerized-data-importer pull 2577	None	Merged	Fix operator_test CDI CR Gets	2023-02-08 14:45:14 UTC
Github	kubevirt containerized-data-importer pull 2581	None	Merged	[release-v1.55] Fix cdi_cr_ready test CDI CR name	2023-02-09 08:55:12 UTC
Red Hat Issue Tracker	CNV-22813	None	None	None	2022-11-23 11:05:45 UTC
Red Hat Product Errata	RHSA-2023:4982	None	None	None	2023-09-05 16:30:24 UTC

Description João Vilaça 2022-11-23 10:55:36 UTC

Description of problem:

When we create a CDI resource, the operator should expose the `kubevirt_cdi_cr_ready` metric and create the `PrometheusRule` resource containing the CDI alerts. Currently, if we create the CDI resource with an incorrect infra node selector, the operator exposes the metric but does not create the `PrometheusRule`, so the alerts (notably `CDINotReady`) never fire.

See https://github.com/kubevirt/containerized-data-importer/blob/a19238ebbdadb8cc02ce91d3ed01c98935ff5475/tests/monitoring_test.go#L65 for the related test
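
For illustration only, the missing resource might have roughly the following shape. This is a minimal sketch, not the rule the operator actually generates: the resource name, namespace, alert name and metric name come from this report, while the group name, the `== 0` expression, the `for` duration and the severity label are assumptions.

```python
import json


def build_cdi_rule(namespace="cdi", name="prometheus-cdi-rules"):
    """Return a PrometheusRule-shaped dict carrying a CDINotReady alert."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "PrometheusRule",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "groups": [
                {
                    "name": "cdi.rules",  # assumed group name
                    "rules": [
                        {
                            "alert": "CDINotReady",
                            # Assumed expression: fire while the CR is not ready.
                            "expr": "kubevirt_cdi_cr_ready == 0",
                            "for": "5m",  # assumed duration
                            "labels": {"severity": "warning"},  # assumed label
                        }
                    ],
                }
            ]
        },
    }


rule = build_cdi_rule()
# Collect every alert name defined across all rule groups.
alert_names = [r["alert"] for g in rule["spec"]["groups"] for r in g["rules"]]
print(json.dumps(rule, indent=2))
```

The point of the sketch is that the alert depends only on the metric, so the rule is useful precisely when the CDI deployment is unhealthy.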

Version-Release number of selected component (if applicable):


How reproducible: 100%


Steps to Reproduce:
1. Delete CDI if it exists
2. Create a new CDI with an incorrect .Spec.Infra.NodeSelector (e.g. "wrong": "wrong")
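
The steps above can be sketched as a small manifest generator. This is a hedged example rather than the reporter's actual manifest: the CR name `cdi` and the helper `build_cdi_manifest` are assumptions made for illustration, and only the node selector value is taken from step 2.

```python
import json


def build_cdi_manifest(node_selector, name="cdi"):
    """Return a CDI custom resource whose infra pods use the given node selector."""
    return {
        "apiVersion": "cdi.kubevirt.io/v1beta1",
        "kind": "CDI",
        "metadata": {"name": name},  # assumed CR name
        "spec": {"infra": {"nodeSelector": dict(node_selector)}},
    }


# A selector that no node is expected to match, as in step 2.
manifest = build_cdi_manifest({"wrong": "wrong"})
manifest_json = json.dumps(manifest, indent=2)
print(manifest_json)
```

Since kubectl accepts JSON manifests, the printed output can be piped into `kubectl apply -f -` once the existing CDI has been deleted.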

Actual results:

> kubectl get PrometheusRule -n cdi prometheus-cdi-rules
Error from server (NotFound): prometheusrules.monitoring.coreos.com "prometheus-cdi-rules" not found

Expected results:

> kubectl get PrometheusRule -n cdi prometheus-cdi-rules
NAME                   AGE
prometheus-cdi-rules   3m40s


Additional info:

Comment 1 Alex Kalenyuk 2022-11-23 11:17:51 UTC

The operator is probably crashing because of this config error and thus cannot deploy the resources.
If that is the case, the CDI CR status should reflect that CDI is in a "failing" state.
If not, we could take a look at the operator logs to understand what is happening.
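
One way to check the CR status suggested here is to summarize its conditions from the JSON returned by `kubectl get cdi -o json`. The sketch below assumes the CR reports `Available`, `Progressing` and `Degraded` conditions; the sample object is hypothetical and is not output captured from this bug.

```python
def summarize_conditions(cdi):
    """Map each status condition type to its status string."""
    conditions = cdi.get("status", {}).get("conditions", [])
    return {c["type"]: c["status"] for c in conditions}


def looks_unhealthy(cdi):
    """Treat the CR as unhealthy if it is not Available or is Degraded."""
    summary = summarize_conditions(cdi)
    return summary.get("Available") != "True" or summary.get("Degraded") == "True"


# Hypothetical CR status for illustration only.
sample = {
    "status": {
        "conditions": [
            {"type": "Available", "status": "False"},
            {"type": "Progressing", "status": "True"},
            {"type": "Degraded", "status": "False"},
        ]
    }
}
print(summarize_conditions(sample))
print(looks_unhealthy(sample))
```

A CR with no status at all is also treated as unhealthy, which covers the case where the operator never got far enough to report conditions.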

Comment 3 Adam Litke 2023-02-08 14:46:33 UTC

Arnon, looks like this failed QA.  Please take a look.

Comment 4 Arnon Gilboa 2023-02-08 14:51:51 UTC

Sure Adam, I'm on it. It's a bug in the tier-1 test that makes it fail downstream (D/S).

Comment 5 Yan Du 2023-02-14 11:10:06 UTC

Verified on CNV v4.12.1-40

Comment 11 errata-xmlrpc 2023-09-05 16:29:19 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.12.6 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:4982
