Bug 2126037

Summary: ODF 4.12 deployment: ocs-metrics-exporter pod stuck in CrashLoopBackOff state
Product: [Red Hat Storage] Red Hat OpenShift Data Foundation
Reporter: Oded <oviner>
Component: ocs-operator
Assignee: arun kumar mohan <amohan>
Status: CLOSED CURRENTRELEASE
QA Contact: Martin Bukatovic <mbukatov>
Severity: urgent
Docs Contact:
Priority: unspecified
Version: 4.12
CC: aaaggarw, akrai, amohan, asagare, kramdoss, muagarwa, nthomas, ocs-bugs, odf-bz-bot, pbalogh, sostapov, tdesala, uchapaga
Target Milestone: ---
Keywords: Regression
Target Release: ODF 4.12.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version: 4.12.0-70
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2023-02-08 14:06:28 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description Oded 2022-09-12 08:18:36 UTC
Description of problem (please be as detailed as possible and provide log
snippets):
Added PodSecurity admission labels to the "openshift-storage" and "openshift-local-storage" namespaces [workaround for BZ https://bugzilla.redhat.com/show_bug.cgi?id=2124593].
The ocs-metrics-exporter pod is stuck in CrashLoopBackOff state.

Version of all relevant components (if applicable):
ODF Version: 4.12.0-44
OCP Version: 4.12.0-0.nightly-2022-09-08-114806
Provider: VMware

Steps to Reproduce:
1. Install the LSO 4.12 operator:
$ oc get csv -n  openshift-local-storage
NAME                                         DISPLAY         VERSION               REPLACES                                     PHASE
local-storage-operator.4.12.0-202209010624   Local Storage   4.12.0-202209010624   local-storage-operator.4.11.0-202208291725   Succeeded

2. Apply the PodSecurity admission labels to "openshift-storage":
 oc label namespace openshift-storage security.openshift.io/scc.podSecurityLabelSync=false pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline --overwrite
 
3. Apply the PodSecurity admission labels to "openshift-local-storage":
 oc label namespace openshift-local-storage security.openshift.io/scc.podSecurityLabelSync=false pod-security.kubernetes.io/enforce=privileged pod-security.kubernetes.io/warn=baseline pod-security.kubernetes.io/audit=baseline --overwrite
 
4. Install ODF via the UI:
$ oc get csv -A
NAMESPACE                              NAME                                         DISPLAY                       VERSION               REPLACES                                     PHASE
openshift-local-storage                local-storage-operator.4.12.0-202209010624   Local Storage                 4.12.0-202209010624   local-storage-operator.4.11.0-202208291725   Succeeded
openshift-operator-lifecycle-manager   packageserver                                Package Server                0.19.0                                                             Succeeded
openshift-storage                      mcg-operator.v4.12.0                         NooBaa Operator               4.12.0                                                             Succeeded
openshift-storage                      ocs-operator.v4.12.0                         OpenShift Container Storage   4.12.0                                                             Succeeded
openshift-storage                      odf-csi-addons-operator.v4.12.0              CSI Addons                    4.12.0                                                             Succeeded
openshift-storage                      odf-operator.v4.12.0                         OpenShift Data Foundation     4.12.0                                                             Succeeded

$  oc describe csv odf-operator.v4.12.0 -n openshift-storage | grep full
Labels:       full_version=4.12.0-44

5. Add disks to the worker nodes [VMware].

6. Install the Storage System via the UI; the storagecluster gets stuck in the Progressing state.

7. The storagecluster remains in the Progressing state for more than 20 minutes:
$ oc get storageclusters.ocs.openshift.io -n openshift-storage  
NAME                 AGE   PHASE         EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   22m   Progressing              2022-09-11T12:30:54Z   4.12.0

Status:
  Conditions:
    Last Heartbeat Time:   2022-09-11T12:54:26Z
    Last Transition Time:  2022-09-11T12:30:55Z
    Message:               Error while reconciling: some StorageClasses were skipped while waiting for pre-requisites to be met: [ocs-storagecluster-cephfs,ocs-storagecluster-ceph-rbd]
    Reason:                ReconcileFailed
    Status:                False
    Type:                  ReconcileComplete

$ oc get storageclusters.ocs.openshift.io -n openshift-storage  
NAME                 AGE   PHASE   EXTERNAL   CREATED AT             VERSION
ocs-storagecluster   39m   Ready              2022-09-11T12:30:54Z   4.12.0


8. Check the status of the ocs-metrics-exporter pod:
$ oc get pods -n openshift-storage | grep ocs-metrics-exporter
ocs-metrics-exporter-8874fffd-2f6ft                               0/1     CrashLoopBackOff   5 (2m19s ago)   146m

[oviner@fedora auth]$ oc get pods ocs-metrics-exporter-8874fffd-2f6ft -n openshift-storage 
NAME                                  READY   STATUS             RESTARTS        AGE
ocs-metrics-exporter-8874fffd-2f6ft   0/1     CrashLoopBackOff   5 (2m36s ago)   147m

[oviner@fedora auth]$ oc logs ocs-metrics-exporter-8874fffd-2f6ft -n openshift-storage 
I0911 13:01:17.936183       1 main.go:29] using options: &{Apiserver: KubeconfigPath: Host:0.0.0.0 Port:8080 ExporterHost:0.0.0.0 ExporterPort:8081 Help:false AllowedNamespaces:[openshift-storage] flags:0xc000220a00 StopCh:<nil> Kubeconfig:<nil>}
W0911 13:01:17.936366       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0911 13:01:17.941131       1 main.go:70] Running metrics server on 0.0.0.0:8080
I0911 13:01:17.941154       1 main.go:71] Running telemetry server on 0.0.0.0:8081
I0911 13:01:17.953225       1 rbd-mirror.go:213] skipping rbd mirror status update for pool openshift-storage/ocs-storagecluster-cephblockpool because mirroring is disabled
I0911 13:01:17.955836       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-a4aedfd0
I0911 13:01:17.955860       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-6a3c119b
I0911 13:01:17.955865       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-163aafec
E0911 13:01:46.997911       1 ceph-block-pool.go:137] Invalid image health for pool ocs-storagecluster-cephblockpool. Must be OK, UNKNOWN, WARNING or ERROR
panic: interface conversion: interface {} is *v1.CephCluster, not *v1.CephObjectStore

goroutine 195 [running]:
github.com/rook/rook/pkg/client/listers/ceph.rook.io/v1.cephObjectStoreNamespaceLister.List.func1({0x19d9280, 0xc0001cd900})
	/remote-source/app/vendor/github.com/rook/rook/pkg/client/listers/ceph.rook.io/v1/cephobjectstore.go:84 +0xc5
k8s.io/client-go/tools/cache.ListAllByNamespace({0x1d33e90, 0xc00000ce88}, {0x7ffd580e3e7a, 0x11}, {0x1d19730, 0xc00052b680}, 0xc0004dcd60)
	/remote-source/app/vendor/k8s.io/client-go/tools/cache/listers.go:96 +0x39c
github.com/rook/rook/pkg/client/listers/ceph.rook.io/v1.cephObjectStoreNamespaceLister.List({{0x1d33e90, 0xc00000ce88}, {0x7ffd580e3e7a, 0x18}}, {0x1d19730, 0xc00052b680})
	/remote-source/app/vendor/github.com/rook/rook/pkg/client/listers/ceph.rook.io/v1/cephobjectstore.go:83 +0x6f
github.com/red-hat-storage/ocs-operator/metrics/internal/collectors.getAllObjectStores({0x1ce2bb8, 0xc00051b1a0}, {0xc0000c39a0, 0x1, 0xc00078d718})
	/remote-source/app/metrics/internal/collectors/ceph-object-store.go:87 +0x1c2
github.com/red-hat-storage/ocs-operator/metrics/internal/collectors.(*ClusterAdvanceFeatureCollector).Collect(0xc00022bec0, 0xc00078d760)
	/remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:87 +0x11e
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1()
	/remote-source/app/vendor/github.com/prometheus/client_golang/prometheus/registry.go:446 +0x102
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
	/remote-source/app/vendor/github.com/prometheus/client_golang/prometheus/registry.go:538 +0xb4d
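
For context on the trace above: the generated CephObjectStore lister type-asserts every object it pulls out of the shared informer cache, and the panic message shows it pulled out a *v1.CephCluster instead. Below is a minimal, self-contained Go sketch of that failure mode only; the types and function are illustrative stand-ins, not the actual rook/ocs-operator code.

```go
// Sketch of the interface-conversion panic: a lister that asserts every
// cached object to *CephObjectStore panics as soon as the cache also
// holds a *CephCluster.
package main

import "fmt"

// Illustrative stand-ins for the rook-ceph API types.
type CephCluster struct{ Name string }
type CephObjectStore struct{ Name string }

// listObjectStores mimics a generated lister: it walks the cache and asserts
// each item's type without checking, which is exactly what panics.
func listObjectStores(cache []interface{}) []*CephObjectStore {
	var stores []*CephObjectStore
	for _, obj := range cache {
		// panics here with: interface conversion: interface {} is
		// *main.CephCluster, not *main.CephObjectStore
		stores = append(stores, obj.(*CephObjectStore))
	}
	return stores
}

func main() {
	// If both resource kinds end up behind the same informer cache, listing
	// object stores eventually hits a CephCluster and the exporter crashes.
	cache := []interface{}{
		&CephObjectStore{Name: "ocs-storagecluster-cephobjectstore"},
		&CephCluster{Name: "ocs-storagecluster-cephcluster"},
	}
	fmt.Println(listObjectStores(cache))
}
```
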
	
$ oc describe pods -n openshift-storage  ocs-metrics-exporter-8874fffd-2f6ft
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  kube-api-access-kccjp:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
    ConfigMapName:           openshift-service-ca.crt
    ConfigMapOptional:       <nil>
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
                             node.ocs.openshift.io/storage=true:NoSchedule
Events:
  Type     Reason          Age                     From               Message
  ----     ------          ----                    ----               -------
  Normal   Scheduled       148m                    default-scheduler  Successfully assigned openshift-storage/ocs-metrics-exporter-8874fffd-2f6ft to compute-1 by control-plane-1
  Normal   AddedInterface  148m                    multus             Add eth0 [10.131.0.33/23] from ovn-kubernetes
  Normal   Pulling         148m                    kubelet            Pulling image "quay.io/rhceph-dev/odf4-ocs-metrics-exporter-rhel8@sha256:72c02ff0dbf796fe821ab0358c294af19daa2023347b7f50d9a856d32a2e84b1"
  Normal   Pulled          148m                    kubelet            Successfully pulled image "quay.io/rhceph-dev/odf4-ocs-metrics-exporter-rhel8@sha256:72c02ff0dbf796fe821ab0358c294af19daa2023347b7f50d9a856d32a2e84b1" in 23.9199064s
  Normal   Created         6m35s (x5 over 148m)    kubelet            Created container ocs-metrics-exporter
  Normal   Started         6m35s (x5 over 148m)    kubelet            Started container ocs-metrics-exporter
  Warning  BackOff         4m57s (x13 over 9m16s)  kubelet            Back-off restarting failed container
  Normal   Pulled          4m46s (x5 over 10m)     kubelet            Container image "quay.io/rhceph-dev/odf4-ocs-metrics-exporter-rhel8@sha256:72c02ff0dbf796fe821ab0358c294af19daa2023347b7f50d9a856d32a2e84b1" already present on machine


Actual results:
The ocs-metrics-exporter pod repeatedly crashes and ends up in CrashLoopBackOff, panicking with an interface-conversion error while collecting metrics.

Expected results:
The ocs-metrics-exporter pod runs without crashing.

Additional info:

Comment 2 arun kumar mohan 2022-09-12 11:02:22 UTC
Fix added with PR: https://github.com/red-hat-storage/ocs-operator/pull/1805

Comment 6 arun kumar mohan 2022-09-20 09:11:35 UTC
Submitted fix with PR: https://github.com/red-hat-storage/ocs-operator/pull/1826

Comment 7 avdhoot 2022-09-26 10:15:26 UTC
Observed the same issue without LSO as well.

Comment 9 Aaruni Aggarwal 2022-09-28 12:52:48 UTC
On the IBM Power platform (ppc64le), the ocs-metrics-exporter-* pod is in Error state with the latest stable ODF 4.12 build.

[root@rdr-aaaggarw-lon06-bastion-0 scripts]# oc get csv -A
NAMESPACE                              NAME                                         DISPLAY                       VERSION               REPLACES   PHASE
openshift-local-storage                local-storage-operator.4.12.0-202209161347   Local Storage                 4.12.0-202209161347              Succeeded
openshift-operator-lifecycle-manager   packageserver                                Package Server                0.19.0                           Succeeded
openshift-storage                      mcg-operator.v4.12.0                         NooBaa Operator               4.12.0                           Succeeded
openshift-storage                      ocs-operator.v4.12.0                         OpenShift Container Storage   4.12.0                           Installing
openshift-storage                      odf-csi-addons-operator.v4.12.0              CSI Addons                    4.12.0                           Succeeded
openshift-storage                      odf-operator.v4.12.0                         OpenShift Data Foundation     4.12.0                           Succeeded
 
[root@rdr-aaaggarw-lon06-bastion-0 scripts]# oc get csv odf-operator.v4.12.0 -n openshift-storage -o yaml |grep full_version
    full_version: 4.12.0-65

Pods: 

[root@rdr-aaaggarw-lon06-bastion-0 scripts]# oc get pods -n openshift-storage
NAME                                                              READY   STATUS      RESTARTS       AGE
csi-addons-controller-manager-7b87dc8945-f6zg4                    2/2     Running     0              13m
csi-cephfsplugin-25mpc                                            2/2     Running     0              12m
csi-cephfsplugin-47ws4                                            2/2     Running     0              12m
csi-cephfsplugin-bmbcr                                            2/2     Running     0              12m
csi-cephfsplugin-provisioner-7fcdd97ddb-khw5w                     5/5     Running     0              12m
csi-cephfsplugin-provisioner-7fcdd97ddb-qcqzs                     5/5     Running     0              12m
csi-rbdplugin-4wx2p                                               3/3     Running     0              12m
csi-rbdplugin-mcr6b                                               3/3     Running     0              12m
csi-rbdplugin-provisioner-75f6dcfd48-mdzkv                        6/6     Running     0              12m
csi-rbdplugin-provisioner-75f6dcfd48-qr9d2                        6/6     Running     0              12m
csi-rbdplugin-s7q9m                                               3/3     Running     0              12m
noobaa-core-0                                                     1/1     Running     0              9m20s
noobaa-db-pg-0                                                    1/1     Running     0              9m20s
noobaa-endpoint-5f444f44dd-h6f6h                                  1/1     Running     0              7m47s
noobaa-operator-6f4d8d4b78-svzj7                                  1/1     Running     0              13m
ocs-metrics-exporter-766f6b65d6-g8jsv                             0/1     Error       6 (3m1s ago)   13m
ocs-operator-6df99899bb-b5gqm                                     1/1     Running     0              13m
odf-console-84878864c5-x7dbz                                      1/1     Running     0              14m
odf-operator-controller-manager-855d7ffcbb-fnn95                  2/2     Running     0              14m
rook-ceph-crashcollector-lon06-worker-0.rdr-aaaggarw.ibm.ch9gmb   1/1     Running     0              10m
rook-ceph-crashcollector-lon06-worker-1.rdr-aaaggarw.ibm.cs8zhf   1/1     Running     0              10m
rook-ceph-crashcollector-lon06-worker-2.rdr-aaaggarw.ibm.c5vshk   1/1     Running     0              10m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-a-6c7ff57dlmwwk   2/2     Running     0              10m
rook-ceph-mds-ocs-storagecluster-cephfilesystem-b-fc8f88f8l66rs   2/2     Running     0              10m
rook-ceph-mgr-a-66cb6849d-577kd                                   2/2     Running     0              11m
rook-ceph-mon-a-5fbcfcdc7c-6ds48                                  2/2     Running     0              12m
rook-ceph-mon-b-59986fcbf9-2pvj9                                  2/2     Running     0              11m
rook-ceph-mon-c-7858d4884f-n8xnt                                  2/2     Running     0              11m
rook-ceph-operator-57bc4f8d9c-59l5r                               1/1     Running     0              13m
rook-ceph-osd-0-55dcbb7896-g6hgt                                  2/2     Running     0              10m
rook-ceph-osd-1-7bffdc6fc9-26dr6                                  2/2     Running     0              10m
rook-ceph-osd-2-675945d86f-6p7mt                                  2/2     Running     0              10m
rook-ceph-osd-prepare-b4dc1a8fc4c3d3f125294f31d31b26ce-6fsk4      0/1     Completed   0              10m
rook-ceph-osd-prepare-c327fb94b8a00969bde17a058a76b71a-df6z9      0/1     Completed   0              10m
rook-ceph-osd-prepare-de7658dc2d8813fbbe2b3cc1d8915463-vdp62      0/1     Completed   0              10m
rook-ceph-rgw-ocs-storagecluster-cephobjectstore-a-7c99c8bf9zwj   2/2     Running     0              9m48s
rook-ceph-tools-5996bdc9-8xzvd                                    1/1     Running     0              9m20s

Events section of ocs-metrics-exporter-* pod: 

Events:
  Type     Reason          Age                     From               Message
  ----     ------          ----                    ----               -------
  Normal   Scheduled       14m                     default-scheduler  Successfully assigned openshift-storage/ocs-metrics-exporter-766f6b65d6-g8jsv to lon06-worker-2.rdr-aaaggarw.ibm.com by lon06-master-1.rdr-aaaggarw.ibm.com
  Normal   AddedInterface  14m                     multus             Add eth0 [10.128.2.18/23] from openshift-sdn
  Normal   Pulling         14m                     kubelet            Pulling image "quay.io/rhceph-dev/odf4-ocs-metrics-exporter-rhel8@sha256:ab190d5de6c1e1f504f3ad6bf4da7daf45439fcda3ae64f3831c57eb132fd419"
  Normal   Pulled          13m                     kubelet            Successfully pulled image "quay.io/rhceph-dev/odf4-ocs-metrics-exporter-rhel8@sha256:ab190d5de6c1e1f504f3ad6bf4da7daf45439fcda3ae64f3831c57eb132fd419" in 23.705691487s
  Normal   Pulled          5m32s (x4 over 9m17s)   kubelet            Container image "quay.io/rhceph-dev/odf4-ocs-metrics-exporter-rhel8@sha256:ab190d5de6c1e1f504f3ad6bf4da7daf45439fcda3ae64f3831c57eb132fd419" already present on machine
  Normal   Created         5m31s (x5 over 13m)     kubelet            Created container ocs-metrics-exporter
  Normal   Started         5m31s (x5 over 13m)     kubelet            Started container ocs-metrics-exporter
  Warning  BackOff         4m10s (x14 over 8m21s)  kubelet            Back-off restarting failed container



Logs of ocs-metrics-exporter-* pod: 

[root@rdr-aaaggarw-lon06-bastion-0 scripts]# oc logs -f pod/ocs-metrics-exporter-766f6b65d6-g8jsv -n openshift-storage
I0928 12:47:04.428236       1 main.go:29] using options: &{Apiserver: KubeconfigPath: Host:0.0.0.0 Port:8080 ExporterHost:0.0.0.0 ExporterPort:8081 Help:false AllowedNamespaces:[openshift-storage] flags:0xc00071c000 StopCh:<nil> Kubeconfig:<nil>}
W0928 12:47:04.428533       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0928 12:47:04.431138       1 main.go:70] Running metrics server on 0.0.0.0:8080
I0928 12:47:04.431159       1 main.go:71] Running telemetry server on 0.0.0.0:8081
W0928 12:47:04.444823       1 reflector.go:324] /remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:163: failed to list *v1.StorageClass: forbidden: User "system:serviceaccount:openshift-storage:ocs-metrics-exporter" cannot get path "/storageclasses"
E0928 12:47:04.444904       1 reflector.go:138] /remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:163: Failed to watch *v1.StorageClass: failed to list *v1.StorageClass: forbidden: User "system:serviceaccount:openshift-storage:ocs-metrics-exporter" cannot get path "/storageclasses"
I0928 12:47:04.445317       1 rbd-mirror.go:194] skipping rbd mirror status update for pool openshift-storage/ocs-storagecluster-cephblockpool because mirroring is disabled
I0928 12:47:04.452122       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-361593ed
I0928 12:47:04.452138       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-b4d54cc8
I0928 12:47:04.452149       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-2d84a845
I0928 12:47:04.452158       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-b5b47f2
I0928 12:47:04.452166       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-acc40f3e
I0928 12:47:04.452174       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-b423ba9c
I0928 12:47:04.452183       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-92a12a4a
I0928 12:47:04.452191       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-dca91665
I0928 12:47:04.452198       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-3c0717a0
I0928 12:47:04.452204       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-d2a8a521
I0928 12:47:04.452211       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-c71ea078
I0928 12:47:04.452217       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-65bcd9d4
I0928 12:47:04.452224       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-aad7dd43
I0928 12:47:04.452230       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-98ea1e7f
I0928 12:47:04.452237       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-db7e7f7
I0928 12:47:04.452244       1 pv.go:55] Skipping non Ceph CSI RBD volume pvc-e5b83885-946d-4fca-91ee-492cb55879ec
I0928 12:47:04.452250       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-4b1170a7
I0928 12:47:04.452257       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-d16bdf2d
I0928 12:47:04.452263       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-a9cef73f
I0928 12:47:04.452270       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-abc21c52
I0928 12:47:04.452277       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-e699a89
I0928 12:47:04.452283       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-d4d18a5b
I0928 12:47:04.452290       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-f09b1ed6
I0928 12:47:04.452297       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-179b3886
I0928 12:47:04.452303       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-29885e11
I0928 12:47:04.452309       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-14e03203
I0928 12:47:04.452315       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-cd4f9bc9
I0928 12:47:04.452321       1 pv.go:55] Skipping non Ceph CSI RBD volume local-pv-611c74bb
W0928 12:47:05.633347       1 reflector.go:324] /remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:163: failed to list *v1.StorageClass: forbidden: User "system:serviceaccount:openshift-storage:ocs-metrics-exporter" cannot get path "/storageclasses"
E0928 12:47:05.633417       1 reflector.go:138] /remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:163: Failed to watch *v1.StorageClass: failed to list *v1.StorageClass: forbidden: User "system:serviceaccount:openshift-storage:ocs-metrics-exporter" cannot get path "/storageclasses"
W0928 12:47:08.523159       1 reflector.go:324] /remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:163: failed to list *v1.StorageClass: forbidden: User "system:serviceaccount:openshift-storage:ocs-metrics-exporter" cannot get path "/storageclasses"
E0928 12:47:08.523192       1 reflector.go:138] /remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:163: Failed to watch *v1.StorageClass: failed to list *v1.StorageClass: forbidden: User "system:serviceaccount:openshift-storage:ocs-metrics-exporter" cannot get path "/storageclasses"
W0928 12:47:12.964015       1 reflector.go:324] /remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:163: failed to list *v1.StorageClass: forbidden: User "system:serviceaccount:openshift-storage:ocs-metrics-exporter" cannot get path "/storageclasses"
E0928 12:47:12.964051       1 reflector.go:138] /remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:163: Failed to watch *v1.StorageClass: failed to list *v1.StorageClass: forbidden: User "system:serviceaccount:openshift-storage:ocs-metrics-exporter" cannot get path "/storageclasses"
E0928 12:47:20.982705       1 ceph-block-pool.go:137] Invalid image health for pool ocs-storagecluster-cephblockpool. Must be OK, UNKNOWN, WARNING or ERROR
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1218eb0]

goroutine 243 [running]:
github.com/red-hat-storage/ocs-operator/metrics/internal/collectors.(*CephObjectStoreAdvancedFeatureProvider).AdvancedFeature(0xc0005ca250?, {0xc00060c720, 0x1, 0x1})
	/remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:57 +0xf0
github.com/red-hat-storage/ocs-operator/metrics/internal/collectors.(*ClusterAdvanceFeatureCollector).Collect(0xc000a1a880, 0xc0000b0f50?)
	/remote-source/app/metrics/internal/collectors/cluster-advance-feature-use.go:183 +0xb0
github.com/prometheus/client_golang/prometheus.(*Registry).Gather.func1()
	/remote-source/app/vendor/github.com/prometheus/client_golang/prometheus/registry.go:453 +0xe8
created by github.com/prometheus/client_golang/prometheus.(*Registry).Gather
	/remote-source/app/vendor/github.com/prometheus/client_golang/prometheus/registry.go:464 +0x508

Comment 13 umanga 2022-09-29 13:18:50 UTC
https://github.com/red-hat-storage/ocs-operator/pull/1830 will prevent the nil pointer exception and stop the crash loop.
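
Not the actual diff from that PR, but a minimal sketch of the kind of nil-guard that stops this class of panic. It assumes the dereferenced field is an optional spec section on the object store; the type and field names below (Security, KMSEnabled) are assumptions for illustration only.

```go
// Hedged sketch of a nil-guard before dereferencing an optional field,
// the pattern that avoids the SIGSEGV seen in the trace above.
package main

import "fmt"

type SecuritySpec struct {
	KMSEnabled bool
}

type CephObjectStore struct {
	Name     string
	Security *SecuritySpec // optional section; may be nil (assumed shape)
}

// advancedFeatureInUse reports whether any object store uses an "advanced"
// feature, skipping entries whose optional spec section is nil instead of
// dereferencing it unconditionally.
func advancedFeatureInUse(stores []*CephObjectStore) bool {
	for _, s := range stores {
		if s == nil || s.Security == nil {
			continue // guard: avoids the nil pointer dereference seen above
		}
		if s.Security.KMSEnabled {
			return true
		}
	}
	return false
}

func main() {
	stores := []*CephObjectStore{
		{Name: "ocs-storagecluster-cephobjectstore"}, // Security is nil
	}
	fmt.Println(advancedFeatureInUse(stores)) // prints: false (no crash)
}
```
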

Comment 14 arun kumar mohan 2022-10-03 16:12:44 UTC
Tested with the latest ocs-operator image, which has Umanga's changes merged.
No ocs-metrics-exporter crash or errors related to the advanced-feature metric were observed.

Following are the logs...
```
1 ceph-block-pool.go:137] Invalid image health for pool ocs-storagecluster-cephblockpool. Must be OK, UNKNOWN, WARNING or ERROR
1 ceph-block-pool.go:137] Invalid image health for pool ocs-storagecluster-cephblockpool. Must be OK, UNKNOWN, WARNING or ERROR
1 rbd-mirror.go:292] RBD mirror store resync started at 2022-10-03 16:10:40.821475786 +0000 UTC m=+5671.204528376
1 rbd-mirror.go:317] RBD mirror store resync ended at 2022-10-03 16:10:40.821553818 +0000 UTC m=+5671.204606641
1 rbd-mirror.go:292] RBD mirror store resync started at 2022-10-03 16:11:10.821706396 +0000 UTC m=+5701.204758272
1 rbd-mirror.go:317] RBD mirror store resync ended at 2022-10-03 16:11:10.821759647 +0000 UTC m=+5701.204811454
```