Bug 2237861 - exporter: disable ceph-exporter for 4.13
Summary: exporter: disable ceph-exporter for 4.13
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenShift Data Foundation
Classification: Red Hat Storage
Component: rook
Version: 4.13
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ODF 4.13.4
Assignee: avan
QA Contact: Vijay Avuthu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2023-09-07 11:23 UTC by avan
Modified: 2023-10-26 17:48 UTC
CC List: 5 users

Fixed In Version: 4.13.4-1
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2023-10-26 17:47:55 UTC
Embargoed:




Links
Github red-hat-storage/rook pull 519 (Draft): Disable exporter 4.13, last updated 2023-09-07 11:41:28 UTC
Red Hat Product Errata RHBA-2023:6146, last updated 2023-10-26 17:48:45 UTC

Description avan 2023-09-07 11:23:32 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
The exporter daemons crash during the upgrade from 4.13 to 4.14 because 4.13 lacks the required Ceph upstream fix, which was only recently delivered to 4.14. Disabling the exporter in 4.13 resolves this.
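For context, a minimal sketch of toggling the exporter by hand through the CephCluster CR (assuming this Rook build honors upstream Rook's spec.monitoring.metricsDisabled field and the default ODF cluster name ocs-storagecluster-cephcluster; the actual 4.13 fix ships inside the rook operator via the PR linked above, not via a CR edit):

  # Hypothetical manual toggle, shown for illustration only; the shipped fix
  # disables the exporter in the operator itself rather than through the CR.
  $ oc -n openshift-storage patch cephcluster ocs-storagecluster-cephcluster \
      --type merge -p '{"spec":{"monitoring":{"metricsDisabled":true}}}'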



Version of all relevant components (if applicable):


Does this issue impact your ability to continue to work with the product
(please explain in detail what the user impact is)?


Is there any workaround available to the best of your knowledge?


On a scale of 1 - 5, how complex is the scenario you performed that caused
this bug (1 - very simple, 5 - very complex)?


Is this issue reproducible?


Can this issue be reproduced from the UI?


If this is a regression, please provide more details to justify this:


Steps to Reproduce:
1.
2.
3.


Actual results:


Expected results:


Additional info:

Comment 4 avan 2023-09-11 05:51:13 UTC
(In reply to Travis Nielsen from comment #2)
> The exporter was enabled in 4.13 for the rbd-mirroring metrics in
> https://bugzilla.redhat.com/show_bug.cgi?id=2192875.
> 
> If we disable the exporter in 4.13, those metrics will no longer be
> collected. Why are those not needed anymore? 
> If this is the only fix to the upgrade issue, we need to understand all the
> impact of this change.

RDR metrics are targeted for 4.14, AFAIK, so we don't need to expose them in 4.13; hence we are disabling the exporter.
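With the exporter disabled, cluster metrics in 4.13 would still come from the mgr prometheus module. A quick sanity check (a sketch; assumes the default rook-ceph-mgr service on port 9283 and that the tools pod ships curl):

  # Scrape the first few lines from the mgr prometheus module to confirm
  # metrics are still served without the exporter.
  $ oc -n openshift-storage exec deploy/rook-ceph-tools -- \
      curl -s http://rook-ceph-mgr.openshift-storage.svc:9283/metrics | head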

Comment 13 Vijay Avuthu 2023-10-10 12:38:00 UTC
Verified with the builds below.

1. Fresh deployment with version 4.13.4

> csv

NAME                                    DISPLAY                       VERSION        REPLACES                                PHASE
mcg-operator.v4.13.4-rhodf              NooBaa Operator               4.13.4-rhodf   mcg-operator.v4.13.3-rhodf              Succeeded
ocs-operator.v4.13.4-rhodf              OpenShift Container Storage   4.13.4-rhodf   ocs-operator.v4.13.3-rhodf              Succeeded
odf-csi-addons-operator.v4.13.4-rhodf   CSI Addons                    4.13.4-rhodf   odf-csi-addons-operator.v4.13.3-rhodf   Succeeded
odf-operator.v4.13.4-rhodf              OpenShift Data Foundation     4.13.4-rhodf   odf-operator.v4.13.3-rhodf              Succeeded

> pods
pod/compute-0-debug                                                   1/1     Running     0          73s     10.1.113.2     compute-0   <none>           <none>
pod/compute-1-debug                                                   1/1     Running     0          74s     10.1.112.244   compute-1   <none>           <none>
pod/compute-2-debug                                                   1/1     Running     0          73s     10.1.113.1     compute-2   <none>           <none>
pod/csi-addons-controller-manager-688cc884bb-npv5m                    2/2     Running     0          9m35s   10.131.0.16    compute-1   <none>           <none>
pod/csi-cephfsplugin-bmrth                                            2/2     Running     0          6m46s   10.1.113.2     compute-0   <none>           <none>
pod/csi-cephfsplugin-bx8dx                                            2/2     Running     0          6m46s   10.1.113.1     compute-2   <none>           <none>
pod/csi-cephfsplugin-j2ffr                                            2/2     Running     0          6m46s   10.1.112.244   compute-1   <none>           <none>
pod/csi-cephfsplugin-provisioner-6497549b7d-5qrhl                     5/5     Running     0          6m46s   10.129.2.19    compute-2   <none>           <none>
pod/csi-cephfsplugin-provisioner-6497549b7d-jkjf6                     5/5     Running     0          6m46s   10.128.2.23    compute-0   <none>           <none>
pod/csi-rbdplugin-2frll                                               3/3     Running     0          6m46s   10.1.112.244   compute-1   <none>           <none>
pod/csi-rbdplugin-h2qwb                                               3/3     Running     0          6m46s   10.1.113.1     compute-2   <none>           <none>
pod/csi-rbdplugin-provisioner-8bcbb667f-hfvgv                         6/6     Running     0          6m46s   10.129.2.18    compute-2   <none>           <none>
pod/csi-rbdplugin-provisioner-8bcbb667f-pk8n9                         6/6     Running     0          6m46s   10.131.0.19    compute-1   <none>           <none>
pod/csi-rbdplugin-r57rq                                               3/3     Running     0          6m46s   10.1.113.2     compute-0   <none>           <none>
pod/must-gather-gb7qk-helper                                          1/1     Running     0          74s     10.129.2.31    compute-2   <none>           <none>
pod/noobaa-core-0                                                     1/1     Running     0          3m23s   10.131.0.27    compute-1   <none>           <none>
pod/noobaa-db-pg-0                                                    1/1     Running     0          3m24s   10.128.2.35    compute-0   <none>           <none>
pod/noobaa-endpoint-586ccf6f76-clljw                                  1/1     Running     0          2m33s   10.131.0.30    compute-1   <none>           <none>
pod/noobaa-operator-55d8db996f-g75bl                                  1/1     Running     0          9m22s   10.131.0.17    compute-1   <none>           <none>
pod/ocs-metrics-exporter-76cfc5dbcc-x5jlt                             1/1     Running     0          9m25s   10.128.2.22    compute-0   <none>           <none>
pod/ocs-operator-8968678dd-dvxbw                                      1/1     Running     0          9m26s   10.128.2.21    compute-0   <none>           <none>
pod/odf-console-b68b6665-4kbkv                                        1/1     Running     0          9m51s   10.129.2.15    compute-2   <none>           <none>
pod/odf-operator-controller-manager-7bdc8b5845-kxqtv                  2/2     Running     0          9m51s   10.128.2.18    compute-0   <none>           <none>
pod/rook-ceph-crashcollector-compute-0-5c874c676b-5c5vs               1/1     Running     0          4m24s   10.128.2.27    compute-0   <none>           <none>
pod/rook-ceph-crashcollector-compute-1-856786dcc5-jprg5               1/1     Running     0          4m49s   10.131.0.23    compute-1   <none>           <none>
pod/rook-ceph-crashcollector-compute-2-cbd55f756-55zhc                1/1     Running     0          4m48s   10.129.2.22    compute-2   <none>           <none>
pod/rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-6b87c888hvxl7   2/2     Running     0          3m40s   10.128.2.33    compute-0   <none>           <none>
pod/rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-6fd7597788l9w   2/2     Running     0          3m38s   10.129.2.25    compute-2   <none>           <none>
pod/rook-ceph-mgr-a-597cc7b664-cb5mq                                  2/2     Running     0          4m49s   10.131.0.22    compute-1   <none>           <none>
pod/rook-ceph-mon-a-564fdf78bc-8s2hc                                  2/2     Running     0          5m58s   10.129.2.21    compute-2   <none>           <none>
pod/rook-ceph-mon-b-5868dfbdc-lpgsj                                   2/2     Running     0          5m24s   10.128.2.26    compute-0   <none>           <none>
pod/rook-ceph-mon-c-5d8d957c95-c29b8                                  2/2     Running     0          5m5s    10.131.0.21    compute-1   <none>           <none>
pod/rook-ceph-operator-844579548f-jdlkw                               1/1     Running     0          6m51s   10.129.2.17    compute-2   <none>           <none>
pod/rook-ceph-osd-0-6fcb9b77fc-p485b                                  2/2     Running     0          4m1s    10.131.0.25    compute-1   <none>           <none>
pod/rook-ceph-osd-1-596dff5d78-md7t6                                  2/2     Running     0          3m57s   10.128.2.29    compute-0   <none>           <none>
pod/rook-ceph-osd-2-679d56dc64-jkrbv                                  2/2     Running     0          3m56s   10.129.2.24    compute-2   <none>           <none>
pod/rook-ceph-osd-prepare-8665b50972b04512c9c395e41ce5e174-qfhjr      0/1     Completed   0          4m27s   10.128.2.28    compute-0   <none>           <none>
pod/rook-ceph-osd-prepare-d4e430e34c62be49db03f4c8e16bbbe3-lw6wz      0/1     Completed   0          4m27s   10.131.0.24    compute-1   <none>           <none>
pod/rook-ceph-osd-prepare-e9f0014782b5e190d23c6f28f9c2bb20-7557j      0/1     Completed   0          4m27s   10.129.2.23    compute-2   <none>           <none>
pod/rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-68df6cflqtn4   2/2     Running     0          3m25s   10.131.0.26    compute-1   <none>           <none>
pod/rook-ceph-tools-84cd6ffb6-hnsgj                                   1/1     Running     0          3m43s   10.128.2.32    compute-0   <none>           <none>

> As expected, there are no "rook-ceph-exporter*" pods

> must gather: https://url.corp.redhat.com/cc5fcc2

> job link: https://url.corp.redhat.com/5461caf
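Besides grepping the full pod list, a tighter check is a label selector (a sketch; assumes the exporter pods carry upstream Rook's app=rook-ceph-exporter label):

  # An empty result confirms no exporter pods exist on the fresh 4.13.4 deployment.
  $ oc -n openshift-storage get pods -l app=rook-ceph-exporter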

2. Upgrade from 4.13.3-6 to 4.13.4-2

Before the upgrade, the rook-ceph-exporter pods exist:

$ oc get csv
NAME                                    DISPLAY                       VERSION        REPLACES                                PHASE
mcg-operator.v4.13.3-rhodf              NooBaa Operator               4.13.3-rhodf   mcg-operator.v4.13.2-rhodf              Succeeded
ocs-operator.v4.13.3-rhodf              OpenShift Container Storage   4.13.3-rhodf   ocs-operator.v4.13.2-rhodf              Succeeded
odf-csi-addons-operator.v4.13.3-rhodf   CSI Addons                    4.13.3-rhodf   odf-csi-addons-operator.v4.13.2-rhodf   Succeeded
odf-operator.v4.13.3-rhodf              OpenShift Data Foundation     4.13.3-rhodf   odf-operator.v4.13.2-rhodf              Succeeded
$ oc get csv odf-operator.v4.13.3-rhodf -o yaml | grep -i full_version
    full_version: 4.13.3-6
$ oc get pods | grep -i ceph-exporter
rook-ceph-exporter-compute-0-7dc7797956-knqnz                     1/1     Running     0          165m
rook-ceph-exporter-compute-1-9896f587c-7m4hh                      1/1     Running     0          165m
rook-ceph-exporter-compute-2-f9774b458-vcmfx                      1/1     Running     0          165m

> After the upgrade, the rook-ceph-exporter pods are gone:

$ oc get csv
NAME                                    DISPLAY                       VERSION        REPLACES                                PHASE
mcg-operator.v4.13.4-rhodf              NooBaa Operator               4.13.4-rhodf   mcg-operator.v4.13.3-rhodf              Succeeded
ocs-operator.v4.13.4-rhodf              OpenShift Container Storage   4.13.4-rhodf   ocs-operator.v4.13.3-rhodf              Succeeded
odf-csi-addons-operator.v4.13.4-rhodf   CSI Addons                    4.13.4-rhodf   odf-csi-addons-operator.v4.13.3-rhodf   Succeeded
odf-operator.v4.13.4-rhodf              OpenShift Data Foundation     4.13.4-rhodf   odf-operator.v4.13.3-rhodf              Succeeded
$ oc get csv odf-operator.v4.13.4-rhodf -o yaml | grep -i full_version
    full_version: 4.13.4-2
$ oc get pods | grep -i ceph-exporter
$ 

job link: https://url.corp.redhat.com/203e604
logs: https://url.corp.redhat.com/c960bf7
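Since the exporter ran as per-node deployments (rook-ceph-exporter-compute-0/1/2 above), it is worth also confirming that the Deployments and any exporter Service were cleaned up, not just the pods (standard oc listings):

  # Both listings should come back empty after the upgrade removes the exporter.
  $ oc -n openshift-storage get deploy | grep -i ceph-exporter
  $ oc -n openshift-storage get svc | grep -i ceph-exporter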

Comment 19 errata-xmlrpc 2023-10-26 17:47:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenShift Data Foundation 4.13.4 security and bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2023:6146

