Bug 2241872 - [Tracker][27770] ocs-metrics-exporter is in CrashLoopBackOff when running in Provider/Client mode
Summary: [Tracker][27770] ocs-metrics-exporter is in CrashLoopBackOff when running in Provider/Client mode
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.14
Hardware: Unspecified
OS: Unspecified
Importance: urgent / medium
Target Milestone: ---
Target Release: ODF 4.15.0
Assignee: Ritesh Chikatwar
QA Contact: Daniel Osypenko
URL:
Whiteboard: isf-provider
Depends On:
Blocks:
 
Reported: 2023-10-03 06:34 UTC by umanga
Modified: 2024-03-19 15:27 UTC
CC List: 8 users

Fixed In Version: 4.14.0-160
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-03-19 15:27:02 UTC
Embargoed:


Attachments: None


Links
Github red-hat-storage ocs-operator pull 2230 (open): Bug 2241872: Fix for metrics exporter pod going in crashbacklloop (last updated 2023-10-25 04:19:07 UTC)
Red Hat Product Errata RHSA-2024:1383 (last updated 2024-03-19 15:27:04 UTC)

Description umanga 2023-10-03 06:34:30 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

When ODF is running in Provider/Client mode, the ocs-metrics-exporter pod is in CrashLoopBackOff.

Error:
```
panic: inconsistent label cardinality: expected 4 label values but got 2 in []string{"storageconsumer-fd1fa8b6-3218-491d-b558-0210d6a9bdff", "Ready"}
goroutine 270 [running]:
github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...)
/remote-source/app/vendor/github.com/prometheus/client_golang/prometheus/value.go:128
github.com/red-hat-storage/ocs-operator/v4/metrics/internal/collectors.(*StorageConsumerCollector).collectStorageConsumersMetadata(0xc000a7b4a0, {0xc0008b8218, 0x1, 0x0?}, 0x0?)
/remote-source/app/metrics/internal/collectors/storageconsumer.go:64 +0x131
github.com/red-hat-storage/ocs-operator/v4/metrics/internal/collectors.(*StorageConsumerCollector).Collect(0xc000a7b4a0, 0xc000b09f60?)
/remote-source/app/metrics/internal/collectors/storageconsumer.go:45 +0xab
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1()
/remote-source/app/vendor/github.com/prometheus/client_golang/prometheus/registry.go:455 +0x10d
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
/remote-source/app/vendor/github.com/prometheus/client_golang/prometheus/registry.go:466 +0x59d
```
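
For reference, a minimal Go sketch of how this kind of panic arises in prometheus/client_golang: MustNewConstMetric panics whenever the number of label values passed does not match the number of variable labels declared on the Desc. The metric name and label names below are assumptions chosen for illustration only, not the ones used by the actual storageconsumer collector.

```
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// Hypothetical descriptor declaring four variable labels; the metric and
// label names are assumptions for illustration, not the exporter's real ones.
var consumerDesc = prometheus.NewDesc(
	"ocs_storage_consumer_metadata",
	"Example metadata metric for a storage consumer (illustration only)",
	[]string{"name", "state", "uid", "heartbeat"},
	nil,
)

func main() {
	// Buggy pattern: only two label values for a four-label Desc.
	// NewConstMetric returns the same error that MustNewConstMetric panics with.
	_, err := prometheus.NewConstMetric(consumerDesc, prometheus.GaugeValue, 1,
		"storageconsumer-fd1fa8b6-3218-491d-b558-0210d6a9bdff", "Ready")
	fmt.Println(err) // inconsistent label cardinality: expected 4 label values but got 2 ...

	// Fixed pattern: one value per declared variable label, in declaration order.
	m := prometheus.MustNewConstMetric(consumerDesc, prometheus.GaugeValue, 1,
		"storageconsumer-fd1fa8b6-3218-491d-b558-0210d6a9bdff", "Ready",
		"consumer-uid", "2023-10-03T06:34:00Z")
	fmt.Println(m.Desc())
}
```

Inside a real Collect method the same rule applies: the label values sent with each const metric have to line up one-to-one, and in order, with the variable labels the Desc was created with.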

Version of all relevant components (if applicable):
ODF 4.14

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
No

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install ODF in Provider/Client mode.
2. Check the ocs-metrics-exporter pod for crashes.


Actual results:
The ocs-metrics-exporter pod panics and goes into CrashLoopBackOff.

Expected results:
The ocs-metrics-exporter pod runs without crashing.

Additional info:

Comment 2 Ritesh Chikatwar 2023-10-23 07:51:08 UTC
No quality engineering is required for this fix as it only applies to Provider and Client modes and will not affect other deployments.

Comment 3 Ritesh Chikatwar 2023-10-23 07:53:32 UTC
This fix is urgent and should be included in version 4.14, as the pod is currently in CrashLoopBackOff. Implementing this fix will not have any impact on other deployments.

Comment 4 Daniel Osypenko 2023-11-12 13:46:53 UTC
$ oc get pod ocs-metrics-exporter-56cd8d568d-lh2xh -n openshift-storage
NAME                                    READY   STATUS    RESTARTS   AGE
ocs-metrics-exporter-56cd8d568d-lh2xh   1/1     Running   0          6d2h

------ versions ------
Provider
operator.v4.14.0-160.hci              OpenShift Data Foundation     4.14.0-160.hci apiVersion: operators.coreos.com/v2

Client
name: ocs-client-operator.v4.14.0-162.hci

Comment 8 errata-xmlrpc 2024-03-19 15:27:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383

