I see this on a cloud platform (GCP) as well. I also noticed suspicious events in the openshift-storage namespace, some of which could be related (I'm listing them here so that we can locate this bug via bugzilla search):

```
$ oc get events -n openshift-storage | grep Warning | grep -i noobaa
146m   Warning   ProvisioningFailed             persistentvolumeclaim/db-noobaa-db-pg-0      failed to provision volume with StorageClass "ocs-storagecluster-ceph-rbd": rpc error: code = Internal desc = pool not found: pool (ocs-storagecluster-cephblockpool) not found in Ceph cluster
146m   Warning   FailedScheduling               pod/noobaa-db-pg-0                           0/6 nodes are available: 6 pod has unbound immediate PersistentVolumeClaims.
146m   Warning   FailedScheduling               pod/noobaa-db-pg-0                           0/6 nodes are available: 6 pod has unbound immediate PersistentVolumeClaims.
46s    Warning   BackingStorePhaseRejected      backingstore/noobaa-default-backing-store    Backing store mode: ALL_NODES_OFFLINE
28s    Warning   RejectedBackingStore           bucketclass/noobaa-default-bucket-class      NooBaa BackingStore "noobaa-default-backing-store" is in rejected phase
145m   Warning   Unhealthy                      pod/noobaa-endpoint-58656b7c8d-rz7sg         Readiness probe failed: dial tcp 10.129.2.33:6001: connect: connection refused
144m   Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint      failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
144m   Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint      invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
141m   Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint      failed to get cpu utilization: did not receive metrics for any ready pods
142m   Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint      invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: did not receive metrics for any ready pods
```

I also see that after a while, the backing stores are switching between Ready and Rejected state:

```
$ noobaa -n openshift-storage backingstore list
NAME                           TYPE                   TARGET-BUCKET           PHASE      AGE
bz1874367backingstore          google-cloud-storage   noobaabz1874367bucket   Rejected   1h2m27s
noobaa-default-backing-store   google-cloud-storage   noobaabucketrxnty       Rejected   2h29m54s
$ noobaa -n openshift-storage backingstore list
NAME                           TYPE                   TARGET-BUCKET           PHASE      AGE
bz1874367backingstore          google-cloud-storage   noobaabz1874367bucket   Ready      1h2m44s
noobaa-default-backing-store   google-cloud-storage   noobaabucketrxnty       Ready      2h30m11s
```
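If anyone wants to confirm the flapping without repeatedly running the noobaa CLI, a minimal sketch is to watch the BackingStore resources directly. This assumes the default openshift-storage namespace and ordinary oc access; the 30-second interval is arbitrary:

```
# Watch BackingStore phase changes as they happen (Ctrl-C to stop).
oc get backingstore -n openshift-storage -w \
  -o custom-columns=NAME:.metadata.name,PHASE:.status.phase

# Or log the phases with a UTC timestamp every 30 seconds, so the
# Ready/Rejected flips can be correlated with the Warning events above.
while true; do
  echo "$(date -u +%FT%TZ) $(oc get backingstore -n openshift-storage \
    -o custom-columns=NAME:.metadata.name,PHASE:.status.phase --no-headers | tr '\n' ' ')"
  sleep 30
done
```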
To avoid confusion: the "failed to get cpu utilization" and "invalid metrics" events quoted in comment 4 are not related to this bug; they are the known BZ 1885524.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: Red Hat OpenShift Container Storage 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2041