Bug 2255328 - Rook-ceph-exporter pods not available in ODF 4.15
Summary: Rook-ceph-exporter pods not available in ODF 4.15
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.15
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ODF 4.15.0
Assignee: Santosh Pillai
QA Contact: Nagendra Reddy
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2023-12-20 06:44 UTC by Santosh Pillai
Modified: 2024-07-18 04:25 UTC

Fixed In Version: 4.15.0-102
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2024-03-19 15:26:16 UTC
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | red-hat-storage rook pull 551 | 0 | None | open | Bug 2255328: monitoring: enable exporter for downstream 4.14 | 2024-01-02 09:14:32 UTC
Red Hat Product Errata | RHSA-2024:1383 | 0 | None | None | None | 2024-03-19 15:26:19 UTC

Description Santosh Pillai 2023-12-20 06:44:43 UTC
Description of problem (please be as detailed as possible and provide log
snippets):


Version of all relevant components (if applicable):
ODF: 4.15


Does this issue impact your ability to continue to work with the product
(please explain in detail what is the user impact)?
Without the exporter, some metrics won't be available.


Is there any workaround available to the best of your knowledge?


Rate from 1 - 5 the complexity of the scenario you performed that caused this
bug (1 - very simple, 5 - very complex)?
1

Is this issue reproducible?
yes

Can this issue be reproduced from the UI?
yes

If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1. Deploy ODF 4.15
2. Observe the pods available in openshift-storage namespace



Actual results: No rook-ceph-exporter pods are running


Expected results: Rook-ceph-exporter pods should be running


Additional info:

A possible reason is the minimum Ceph version required to run the rook-ceph-exporter.
In 4.15 it is set to 18.0.0:
https://github.com/red-hat-storage/rook/blob/bcf59695f0ced4903ce818c2750eec09664907a2/pkg/operator/ceph/cluster/nodedaemon/exporter.go#L54-L63

Whereas in 4.14 it is set to 17.2.6:
https://github.com/red-hat-storage/rook/blob/8dad108c8ac13737c7cf2564113a320014c6fa5a/pkg/operator/ceph/cluster/nodedaemon/exporter.go#L55C2-L64
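To illustrate why a higher minimum would suppress the exporter, here is a minimal sketch of a version gate. It is not Rook's actual code (which is Go, at the links above); the function name `exporter_supported` and the constant names are hypothetical, and the running version of 17.2.6 is taken from the operator log in comment 2.

```python
from typing import Tuple

Version = Tuple[int, int, int]

# Minimum versions as quoted in this report (assumed to be the only gate).
MIN_EXPORTER_VERSION_4_15: Version = (18, 0, 0)
MIN_EXPORTER_VERSION_4_14: Version = (17, 2, 6)


def exporter_supported(running: Version, minimum: Version) -> bool:
    """Return True when the running Ceph version meets the minimum.

    Tuples compare element by element, so (17, 2, 6) < (18, 0, 0).
    """
    return running >= minimum


# Ceph version reported by the operator log in comment 2.
running_ceph: Version = (17, 2, 6)

print(exporter_supported(running_ceph, MIN_EXPORTER_VERSION_4_15))  # False
print(exporter_supported(running_ceph, MIN_EXPORTER_VERSION_4_14))  # True
```

Under these assumptions, a 17.2.6 cluster passes the 4.14 threshold but fails the 4.15 one, which would be consistent with no exporter pods being created.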

Comment 2 Santosh Pillai 2023-12-20 06:49:11 UTC
Maybe we just need to update the Ceph version for 4.15. Currently it runs on 17.2.6 (quincy):
```
❯ oc logs rook-ceph-operator-57cffcddb7-smrw8
2023-12-20 04:17:12.149433 I | rookcmd: starting Rook v4.15.0-0.7c6a751c94d25fb9b372b65bc66d2ff7bb6fe6ef with arguments '/usr/local/bin/rook ceph operator'
2023-12-20 04:17:12.149492 I | rookcmd: flag values: --enable-machine-disruption-budget=false, --help=false, --kubeconfig=, --log-level=INFO
2023-12-20 04:17:12.149496 I | cephcmd: starting Rook-Ceph operator
2023-12-20 04:17:12.287276 I | cephcmd: base ceph version inside the rook operator image is "ceph version 17.2.6-167.el9cp (5ef1496ea3e9daaa9788809a172bd5a1c3192cf7) quincy (stable)"
```
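For anyone scripting this check, the base Ceph version can be pulled out of the operator log line shown above. This is a rough sketch only: the helper name `parse_ceph_version` and the regular expression are assumptions, not part of Rook, and the log format may differ between releases.

```python
import re
from typing import Optional, Tuple

# Log line copied from the operator output above.
LOG_LINE = (
    "2023-12-20 04:17:12.287276 I | cephcmd: base ceph version inside the rook "
    'operator image is "ceph version 17.2.6-167.el9cp '
    '(5ef1496ea3e9daaa9788809a172bd5a1c3192cf7) quincy (stable)"'
)

# Matches "ceph version X.Y.Z"; the earlier "ceph version inside" text is
# skipped because it is not followed by digits.
_VERSION_RE = re.compile(r"ceph version (\d+)\.(\d+)\.(\d+)")


def parse_ceph_version(line: str) -> Optional[Tuple[int, int, int]]:
    """Extract (major, minor, patch) from a log line, or None if absent."""
    match = _VERSION_RE.search(line)
    if match is None:
        return None
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


print(parse_ceph_version(LOG_LINE))  # (17, 2, 6)
```

The resulting tuple can be compared directly against a minimum such as `(18, 0, 0)`.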

Comment 7 Santosh Pillai 2023-12-21 11:13:05 UTC
(In reply to Santosh Pillai from comment #2)

Discussed with the team. We need to update Rook 4.15 to use Ceph 17.2.6 as the minimum version required for running rook-ceph-exporter.

Comment 14 Mudit Agarwal 2024-01-02 09:16:31 UTC
Nagendra,
Comment #9 applies only to https://issues.redhat.com/browse/RHSTOR-4798
It will NOT block you from testing the other stories in https://issues.redhat.com/browse/RHSTOR-4832
Santosh will provide a fix for it today.

As far as comment #10 is concerned, the file creator pod is just there to help you. It is not required for testing the feature.
To test the feature, you need to create a lot of files in order to stress the system and generate the alert. It is not compulsory to use this pod.

Comment 16 Vijay Avuthu 2024-01-03 06:37:58 UTC
$ oc get csv
NAME                                         DISPLAY                       VERSION             REPLACES   PHASE
mcg-operator.v4.15.0-102.stable              NooBaa Operator               4.15.0-102.stable              Succeeded
ocs-operator.v4.15.0-102.stable              OpenShift Container Storage   4.15.0-102.stable              Succeeded
odf-csi-addons-operator.v4.15.0-102.stable   CSI Addons                    4.15.0-102.stable              Succeeded
odf-operator.v4.15.0-102.stable              OpenShift Data Foundation     4.15.0-102.stable              Succeeded
$ oc get pod --selector=app=rook-ceph-exporter
NAME                                            READY   STATUS    RESTARTS   AGE
rook-ceph-exporter-compute-0-749b97455c-dsvjd   1/1     Running   0          57m
rook-ceph-exporter-compute-1-5749b79cb4-vp7pw   1/1     Running   0          57m
rook-ceph-exporter-compute-2-dc9cf9679-9kwpv    1/1     Running   0          57m
$
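The verification above can be automated with a small check over the `oc get pod` output. The following is a sketch under assumptions: the function name `exporter_pods_healthy` is hypothetical, the input is the plain-text table as printed above, and the "No resources found" message is used only as an example of output without a header row.

```python
SAMPLE_OUTPUT = """\
NAME                                            READY   STATUS    RESTARTS   AGE
rook-ceph-exporter-compute-0-749b97455c-dsvjd   1/1     Running   0          57m
rook-ceph-exporter-compute-1-5749b79cb4-vp7pw   1/1     Running   0          57m
rook-ceph-exporter-compute-2-dc9cf9679-9kwpv    1/1     Running   0          57m
"""


def exporter_pods_healthy(oc_output: str) -> bool:
    """Return True only if at least one pod is listed and all are Running."""
    lines = [ln for ln in oc_output.splitlines() if ln.strip()]
    if not lines:
        return False
    header = lines[0].split()
    if "STATUS" not in header:
        # No table header, e.g. a "No resources found ..." message.
        return False
    # STATUS precedes RESTARTS, so spaces inside RESTARTS do not shift it.
    status_idx = header.index("STATUS")
    rows = [ln.split() for ln in lines[1:]]
    return bool(rows) and all(
        len(row) > status_idx and row[status_idx] == "Running" for row in rows
    )


print(exporter_pods_healthy(SAMPLE_OUTPUT))  # True
print(exporter_pods_healthy("No resources found in openshift-storage namespace."))  # False
```

With the output from build 4.15.0-102 the check passes; on a build affected by this bug, where no exporter pods exist, it would return False.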

Comment 18 Santosh Pillai 2024-01-03 10:05:29 UTC
(In reply to Vijay Avuthu from comment #16)

Thanks Vijay for checking this.

Comment 20 errata-xmlrpc 2024-03-19 15:26:16 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: Red Hat OpenShift Data Foundation 4.15.0 security, enhancement, & bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2024:1383

Comment 21 Red Hat Bugzilla 2024-07-18 04:25:17 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days