Bug 2145146 - CDI operator is not creating PrometheusRule resource with alerts if CDI resource is incorrect
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Container Native Virtualization (CNV)
Classification: Red Hat
Component: Storage
Version: 4.12.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.12.1
Assignee: Arnon Gilboa
QA Contact: Yan Du
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2022-11-23 10:55 UTC by João Vilaça
Modified: 2023-09-05 16:30 UTC (History)

Fixed In Version: v4.12.1-38
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-09-05 16:29:19 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | kubevirt containerized-data-importer pull 2546 | 0 | None | Merged | Ensure Prometheus resources exist for CDINotReady | 2023-02-01 12:26:15 UTC
Github | kubevirt containerized-data-importer pull 2562 | 0 | None | Merged | [release-v1.55] Ensure Prometheus resources exist for CDINotReady | 2023-02-01 12:26:19 UTC
Github | kubevirt containerized-data-importer pull 2577 | 0 | None | Merged | Fix operator_test CDI CR Gets | 2023-02-08 14:45:14 UTC
Github | kubevirt containerized-data-importer pull 2581 | 0 | None | Merged | [release-v1.55] Fix cdi_cr_ready test CDI CR name | 2023-02-09 08:55:12 UTC
Red Hat Issue Tracker | CNV-22813 | 0 | None | None | None | 2022-11-23 11:05:45 UTC
Red Hat Product Errata | RHSA-2023:4982 | 0 | None | None | None | 2023-09-05 16:30:24 UTC

Description João Vilaça 2022-11-23 10:55:36 UTC
Description of problem:

When we create a CDI resource, the operator should expose the `kubevirt_cdi_cr_ready` metric and create the `PrometheusRule` resource with the CDI alerts. Currently, if we create the CDI resource with an incorrect infra node selector, the operator exposes the metric but does not create the `PrometheusRule`, so the alerts (notably `CDINotReady`) never fire.

See https://github.com/kubevirt/containerized-data-importer/blob/a19238ebbdadb8cc02ce91d3ed01c98935ff5475/tests/monitoring_test.go#L65 for the related test

Version-Release number of selected component (if applicable):


How reproducible: 100%


Steps to Reproduce:
1. Delete CDI if it exists
2. Create a new CDI with a wrong .Spec.Infra.NodeSelector (e.g. "wrong": "wrong")
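
The CDI resource for step 2 could look roughly like the manifest built below. This is a minimal sketch, not taken from the report: the API version `cdi.kubevirt.io/v1beta1`, the CR name `cdi`, and the helper `build_cdi_manifest` are assumptions for illustration. Only the `"wrong": "wrong"` selector comes from the step itself.

```python
import json


def build_cdi_manifest(name="cdi", node_selector=None):
    """Return a CDI custom resource as a plain dict.

    Assumptions (not stated in this report): the API version
    cdi.kubevirt.io/v1beta1 and the CR name "cdi".
    """
    if node_selector is None:
        # Selector from the reproduction step; presumably no node has this label.
        node_selector = {"wrong": "wrong"}
    return {
        "apiVersion": "cdi.kubevirt.io/v1beta1",
        "kind": "CDI",
        "metadata": {"name": name},
        "spec": {"infra": {"nodeSelector": node_selector}},
    }


if __name__ == "__main__":
    # Print the manifest as JSON so it can be inspected or saved to a file.
    print(json.dumps(build_cdi_manifest(), indent=2))
```

Since kubectl also accepts JSON manifests, the printed output can be saved to a file and passed to `kubectl apply -f`.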

Actual results:

> kubectl get PrometheusRule -n cdi prometheus-cdi-rules
Error from server (NotFound): prometheusrules.monitoring.coreos.com "prometheus-cdi-rules" not found

Expected results:

> kubectl get PrometheusRule -n cdi prometheus-cdi-rules
NAME                   AGE
prometheus-cdi-rules   3m40s
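
Beyond checking that the resource exists, one could also confirm that it defines the `CDINotReady` alert. The sketch below assumes the rule has been fetched as JSON (for example by adding `-o json` to the command above) and walks its `spec.groups[].rules[]` entries. `has_alert` is a hypothetical helper, and the sample document, including its group name and expression, is illustrative only and not the operator's actual rule.

```python
import json


def has_alert(prometheus_rule_json, alert_name):
    """Return True if the PrometheusRule JSON defines the given alert."""
    rule = json.loads(prometheus_rule_json)
    for group in rule.get("spec", {}).get("groups", []):
        for entry in group.get("rules", []):
            # Alerting rules carry an "alert" key; recording rules use "record".
            if entry.get("alert") == alert_name:
                return True
    return False


# Illustrative sample only; the group name and expression are assumptions.
SAMPLE = json.dumps({
    "kind": "PrometheusRule",
    "metadata": {"name": "prometheus-cdi-rules", "namespace": "cdi"},
    "spec": {"groups": [{
        "name": "cdi.rules",
        "rules": [{"alert": "CDINotReady", "expr": "kubevirt_cdi_cr_ready == 0"}],
    }]},
})

if __name__ == "__main__":
    print(has_alert(SAMPLE, "CDINotReady"))
```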


Additional info:

Comment 1 Alex Kalenyuk 2022-11-23 11:17:51 UTC
The operator is probably crashing because of this config error and therefore cannot deploy the resources.
If that is the case, the CDI CR status should reflect that CDI is in a "failing" state.
If not, we could look at the operator logs to understand what is happening.

Comment 3 Adam Litke 2023-02-08 14:46:33 UTC
Arnon, looks like this failed QA.  Please take a look.

Comment 4 Arnon Gilboa 2023-02-08 14:51:51 UTC
Sure Adam, I'm on it. It's a bug in a tier-1 test that makes it fail downstream (D/S).

Comment 5 Yan Du 2023-02-14 11:10:06 UTC
Verified on CNV v4.12.1-40

Comment 11 errata-xmlrpc 2023-09-05 16:29:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Virtualization 4.12.6 Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2023:4982

