Bug 2241872 - [Tracker][27770] ocs-metrics-exporter is in CrashLoopBackOff when running in Provider/Client mode
Summary: [Tracker][27770] ocs-metrics-exporter is in CrashLoopBackOff when running in Provider/Client mode
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: ocs-operator
Version: 4.14
Hardware: Unspecified
OS: Unspecified
Importance: urgent / medium
Target Milestone: ---
Target Release: ODF 4.15.0
Assignee: Ritesh Chikatwar
QA Contact: Daniel Osypenko
URL:
Whiteboard: isf-provider
Depends On:
Blocks:
 
Reported: 2023-10-03 06:34 UTC by umanga
Modified: 2024-03-19 15:27 UTC
CC List: 8 users

Fixed In Version: 4.14.0-160
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-03-19 15:27:02 UTC
Embargoed:


Attachments: None


Links
Github red-hat-storage ocs-operator pull 2230 (open): Bug 2241872: Fix for metrics exporter pod going in crashbacklloop (last updated 2023-10-25 04:19:07 UTC)
Red Hat Product Errata RHSA-2024:1383 (last updated 2024-03-19 15:27:04 UTC)

Description umanga 2023-10-03 06:34:30 UTC
Description of problem (please be as detailed as possible and provide log
snippets):

When ODF is running in Provider/Client mode, the ocs-metrics-exporter pod is in CrashLoopBackOff.

Error:
```
panic: inconsistent label cardinality: expected 4 label values but got 2 in []string{"storageconsumer-fd1fa8b6-3218-491d-b558-0210d6a9bdff", "Ready"}
goroutine 270 [running]:
github.com/prometheus/client_golang/prometheus.MustNewConstMetric(...)
/remote-source/app/vendor/github.com/prometheus/client_golang/prometheus/value.go:128
github.com/red-hat-storage/ocs-operator/v4/metrics/internal/collectors.(*StorageConsumerCollector).collectStorageConsumersMetadata(0xc000a7b4a0, {0xc0008b8218, 0x1, 0x0?}, 0x0?)
/remote-source/app/metrics/internal/collectors/storageconsumer.go:64 +0x131
github.com/red-hat-storage/ocs-operator/v4/metrics/internal/collectors.(*StorageConsumerCollector).Collect(0xc000a7b4a0, 0xc000b09f60?)
/remote-source/app/metrics/internal/collectors/storageconsumer.go:45 +0xab
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1()
/remote-source/app/vendor/github.com/prometheus/client_golang/prometheus/registry.go:455 +0x10d
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
/remote-source/app/vendor/github.com/prometheus/client_golang/prometheus/registry.go:466 +0x59d
```
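
For reference, a minimal Go sketch of how this kind of panic arises in prometheus/client_golang: MustNewConstMetric panics whenever the number of label values passed does not match the number of variable labels declared on the Desc. The metric name and label names below are assumptions chosen for illustration only, not the ones used by the actual storageconsumer collector.

```
package main

import (
	"fmt"

	"github.com/prometheus/client_golang/prometheus"
)

// Hypothetical descriptor declaring four variable labels; the metric and
// label names are assumptions for illustration, not the exporter's real ones.
var consumerDesc = prometheus.NewDesc(
	"ocs_storage_consumer_metadata",
	"Example metadata metric for a storage consumer (illustration only)",
	[]string{"name", "state", "uid", "heartbeat"},
	nil,
)

func main() {
	// Buggy pattern: only two label values for a four-label Desc.
	// NewConstMetric returns the same error that MustNewConstMetric panics with.
	_, err := prometheus.NewConstMetric(consumerDesc, prometheus.GaugeValue, 1,
		"storageconsumer-fd1fa8b6-3218-491d-b558-0210d6a9bdff", "Ready")
	fmt.Println(err) // inconsistent label cardinality: expected 4 label values but got 2 ...

	// Fixed pattern: one value per declared variable label, in declaration order.
	m := prometheus.MustNewConstMetric(consumerDesc, prometheus.GaugeValue, 1,
		"storageconsumer-fd1fa8b6-3218-491d-b558-0210d6a9bdff", "Ready",
		"consumer-uid", "2023-10-03T06:34:00Z")
	fmt.Println(m.Desc())
}
```

Inside a real Collect method the same rule applies: the label values sent with each const metric have to line up one-to-one, and in order, with the variable labels the Desc was created with.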

Version of all relevant components (if applicable):
ODF 4.14

Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
No

Is there any workaround available to the best of your knowledge?
No

Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
Yes

Can this issue be reproduced from the UI?
Yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Install ODF in Provider/Client mode.
2. Check the ocs-metrics-exporter pod for crashes.


Actual results:
The ocs-metrics-exporter pod panics and goes into CrashLoopBackOff.

Expected results:
The ocs-metrics-exporter pod runs without crashing.

Additional info:

Comment 2 Ritesh Chikatwar 2023-10-23 07:51:08 UTC
No quality engineering is required for this fix as it only applies to Provider and Client modes and will not affect other deployments.

Comment 3 Ritesh Chikatwar 2023-10-23 07:53:32 UTC
This fix is urgent and should be included in version 4.14, as the pod is currently in CrashLoopBackOff. Implementing this fix will not have any impact on other deployments.

Comment 4 Daniel Osypenko 2023-11-12 13:46:53 UTC
$ oc get pod ocs-metrics-exporter-56cd8d568d-lh2xh -n openshift-storage
NAME                                    READY   STATUS    RESTARTS   AGE
ocs-metrics-exporter-56cd8d568d-lh2xh   1/1     Running   0          6d2h

------ versions ------
Provider
operator.v4.14.0-160.hci              OpenShift Data Foundation     4.14.0-160.hci apiVersion: operators.coreos.com/v2

Client
name: ocs-client-operator.v4.14.0-162.hci

Comment 8 errata-xmlrpc 2024-03-19 15:27:02 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383

