Bug 1885524

Summary:

[Tracker] Unable to get metrics for resource cpu events reported after OCS installation (OCP bug 2029144 )

Product:

[Red Hat Storage] Red Hat OpenShift Data Foundation

Reporter:

Martin Bukatovic <mbukatov>

Component:

Multi-Cloud Object Gateway

Assignee:

Naveen Paul <napaul>

Status:

CLOSED NOTABUG

QA Contact:

Filip Balák <fbalak>

Severity:

low

Docs Contact:

Priority:

unspecified

Version:

4.6

CC:

aindenba, bpratt, csharpe, dzaken, ebenahar, hnallurv, jolmomar, kjosy, muagarwa, napaul, nbecker, nberry, odf-bz-bot, tunguyen

Target Milestone:

---

Keywords:

Tracking

Target Release:

---

Hardware:

Unspecified

OS:

Unspecified

Whiteboard:

Fixed In Version:

4.14.0-28

Doc Type:

No Doc Update

Doc Text:

Story Points:

---

Clone Of:

Environment:

Last Closed:

2023-09-05 07:52:59 UTC

Type:

Bug

Regression:

---

Mount Type:

---

Documentation:

---

CRM:

Verified Versions:

Category:

---

oVirt Team:

---

RHEL 7.3 requirements from Atomic Host:

Cloudforms Team:

---

Target Upstream Version:

Embargoed:

Bug Depends On:

2029144

Bug Blocks:

Attachments:

Description	Flags
screenshot #1: storage dashboard with warning events	none
screenshot #2: Events page in OCP Console	none
must-gather logs	none
oc describe apiservice v1beta1.metrics.k8s.io > v1beta1.metrics.k8s.io.describe	none

Description Martin Bukatovic 2020-10-06 09:37:53 UTC

Description of problem
======================

After OCS installation, there are multiple Events of Warning type from
horizontalpodautoscaler/noobaa-endpoint complaining that openshift is
"unable to get metrics for resource cpu". The stream of such events stops about
15 minutes after OCS installation.

A similar problem was originally reported as BZ 1873162 for OCS 4.5, but that
turned out to be much more severe problem with metrics availablity and access
rights of HPA noobaa-endpoint, which seems to be fixed in OCS 4.6. This BZ was
created to cover what was left to be fixed.

Version-Release number of selected component
============================================

OCP 4.6.0-0.nightly-2020-10-03-051134
OCS 4.6.0-583.ci

How reproducible
================

2/2

Steps to Reproduce
==================

1. Install OCP/OCS cluster
2. Login to OCP Console and open Overview Cluster dashboard
   (Home -> Overview -> Cluster)
3. See "Recent events" list

Or you can also go to Events page or list events via command line client:
`oc get events -n openshift-storage`.

Actual results
==============

After OCS installation, I see warnings related to HPA noobaa-endpoint such as:

```
15m         Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint                                           unable to get metrics for resource cpu: no metrics returned from resource metrics API
15m         Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint                                           invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics re
turned from resource metrics API
12m         Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint                                           did not receive metrics for any ready pods
12m         Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint                                           invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: did not receive metrics for any ready pods
```

See also attached screenshots from OCP Console, where these warnings are
clearly visible, raising attention to this "red herring" problem.

Expected results
================

Admin should not wait another 15 minutes after OCS Storage Cluster installation
for these events to stop.

There should be no such Warning events from
horizontalpodautoscaler/noobaa-endpoint right after OCS installation.

Additional info
===============

After about 15 minutes after OCS installation, the horizontalpodautoscaler
noobaa-endpoint seems to work fine (I don't claim it works as expected, rather
that it's not in an error state):

```

$ ./oc describe horizontalpodautoscaler/noobaa-endpoint  -n openshift-storage 
Name:                                                  noobaa-endpoint
Namespace:                                             openshift-storage
Labels:                                                app=noobaa
Annotations:                                           <none>
CreationTimestamp:                                     Mon, 05 Oct 2020 19:58:22 +0200
Reference:                                             Deployment/noobaa-endpoint
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  0% (2m) / 80%
Min replicas:                                          1
Max replicas:                                          2
Deployment pods:                                       1 current / 1 desired
Conditions:
  Type            Status  Reason               Message
  ----            ------  ------               -------
  AbleToScale     True    ScaleDownStabilized  recent recommendations were higher than current one, applying the highest recent recommendation
  ScalingActive   True    ValidMetricFound     the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  False   DesiredWithinRange   the desired count is within the acceptable range
Events:
  Type     Reason                        Age                 From                       Message
  ----     ------                        ----                ----                       -------
  Warning  FailedGetResourceMetric       19m (x2 over 20m)   horizontal-pod-autoscaler  unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedComputeMetricsReplicas  19m (x2 over 20m)   horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedComputeMetricsReplicas  17m (x10 over 19m)  horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: did not receive metrics for any ready pods
  Warning  FailedGetResourceMetric       17m (x11 over 19m)  horizontal-pod-autoscaler  did not receive metrics for any ready pods
```

Comment 2 Martin Bukatovic 2020-10-06 09:50:01 UTC

Created attachment 1719314 [details]
screenshot #1: storage dashboard with warning events

Comment 3 Martin Bukatovic 2020-10-06 09:54:24 UTC

Created attachment 1719316 [details]
screenshot #2: Events page in OCP Console

Comment 9 Tiffany Nguyen 2021-02-09 18:39:42 UTC

I saw the same warning message after OCS installation on 4.7
Build version: 
 * ocs-operator.v4.7.0-256.ci
 * 4.7.0-0.nightly-2021-02-09-024347 

-------------
$ oc get hpa -n openshift-storage
NAME              REFERENCE                    TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
noobaa-endpoint   Deployment/noobaa-endpoint   0%/80%    1         2         1          11m
(myenv) tunguyen-mac:ocs-ci tunguyen$ oc describe hpa noobaa-endpoint -n openshift-storage
Name:                                                  noobaa-endpoint
Namespace:                                             openshift-storage
Labels:                                                app=noobaa
Annotations:                                           <none>
CreationTimestamp:                                     Tue, 09 Feb 2021 10:14:14 -0800
Reference:                                             Deployment/noobaa-endpoint
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  0% (4m) / 80%
Min replicas:                                          1
Max replicas:                                          2
Deployment pods:                                       1 current / 1 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  recommended size matches current size
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooFewReplicas    the desired replica count is less than the minimum replica count
Events:
  Type     Reason                        Age                   From                       Message
  ----     ------                        ----                  ----                       -------
  Warning  FailedGetResourceMetric       10m (x3 over 11m)     horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedComputeMetricsReplicas  10m (x3 over 11m)     horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
  Warning  FailedComputeMetricsReplicas  8m37s (x9 over 10m)   horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: did not receive metrics for any ready pods
  Warning  FailedGetResourceMetric       8m22s (x10 over 10m)  horizontal-pod-autoscaler  failed to get cpu utilization: did not receive metrics for any ready pods

Comment 10 Tiffany Nguyen 2021-02-09 18:53:02 UTC

Created attachment 1756015 [details]
must-gather logs

Comment 13 Alexander Indenbaum 2021-05-28 10:40:16 UTC

Thank you for the verbose details!

Resource usage metrics, such as container CPU and memory usage, are available in Kubernetes through the Metrics API. Note: The API requires the metrics server to be deployed in the cluster. Otherwise, it will be not available.
See https://kubernetes.io/docs/tasks/debug-application-cluster/resource-metrics-pipeline/


In order to determine the active metrics server on the provided OC cluster:
> ➜  ~ kubectl describe apiservice v1beta1.metrics.k8s.io
> Name:         v1beta1.metrics.k8s.io
> Namespace:
> Labels:       app.kubernetes.io/component=metrics-adapter
>               app.kubernetes.io/name=prometheus-adapter
>               app.kubernetes.io/part-of=openshift-monitoring
>               app.kubernetes.io/version=0.8.4
> Annotations:  service.alpha.openshift.io/inject-cabundle: true
> API Version:  apiregistration.k8s.io/v1
> Kind:         APIService
> Metadata:
>   Creation Timestamp:  2021-05-24T07:03:22Z
>   ....
> Spec:
>   Ca Bundle: ...
>   Group:                   metrics.k8s.io
>   Group Priority Minimum:  100
>   Service:
>     Name:            prometheus-adapter
>     Namespace:       openshift-monitoring
>     Port:            443
>   Version:           v1beta1
>   Version Priority:  100
> Status:
>   Conditions:
>     Last Transition Time:  2021-05-27T09:59:42Z
>     Message:               all checks passed
>     Reason:                Passed
>     Status:                True
>     Type:                  Available
> Events:                    <none>

In order to test the availability of pod metrics use:
> ➜ ~ kubectl top pod
> NAME                                                              CPU(cores)   MEMORY(bytes)
> csi-cephfsplugin-7bkkr                                            0m           73Mi
> csi-cephfsplugin-dzhkt                                            0m           73Mi
> csi-cephfsplugin-kx7p8                                            0m           72Mi
> ...

According to the describe apiservice output above, the metrics service is provided by the prometheus-adapter in the openshift-monitoring namespace. Based on the info available so far, the issue is with the HPA noobaa-endpoint inability to obtain CPU utilization metrics. 
* After OCS installation
* Stops about 15 minutes after OCS installation.

======================================================================================================================
| In order to troubleshoot this issue better, the following steps are recommended <em>before</em> OCS installation:  |
|                                                                                                                    |
| * Provide more info with "kubectl describe apiservice v1beta1.metrics.k8s.io"                                      |
| * Check the status of the metrics server with "kubectl get -n openshift-monitoring pods | grep prometheus-adapter" |
| * Finally ensure the metrics service is available by "kubectl top pod"                                             |
|                                                                                                                    |
| Once the metrics server availability is ensured, please try to reproduce the HPA noobaa-endpoint issue             |
| by running the OCS installation and examining events.                                                              |
======================================================================================================================

Thank you!

Comment 14 Martin Bukatovic 2021-05-31 17:08:20 UTC

(In reply to aindenba from comment #13)
> In order to troubleshoot this issue better, the following steps are
> recommended <em>before</em> OCS installation:

So you are basically saying that it's possible that something is wrong with a cluster prior to OCS installation, which could cause this bug? This is bit unlikely, but let's see.

We can ask OCP team to help us audit related noobaa code if necessary.

Comment 15 Alexander Indenbaum 2021-06-02 12:30:34 UTC

Hello Martin,

Just wanted to mention that according to the available logs the issue is between HPA (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/) which is a standard K8S resource and the Metrics Server, served by prometheus-adapter in openshift-monitoring namespace, which is installed as a part of OCP. The error events do not really originate in the noobaa code per se.

I am not sure what is the root cause at this stage. I would like to get a better idea of the installation procedure. Could you describe how do you guys roll out OCS? One possible explanation is that during installation, OCS is being installed just about 15 seconds before the prometheus-adapter in the openshift-monitoring namespace becomes ready. After 15 seconds prometheus-adapter is up and the issue is gone. So the platform is good, there might be a timing issue during OCP/OCS bootstrap. That explanation would match the existing evidence, from another hand, it could be totally off 😃.

Nimrod suggested adding the debug commands steps suggested above (i.e. "kubectl describe apiservice v1beta1.metrics.k8s.io", "kubectl top pod") as a part of the cluster installation automation scripts, just before OCS is being installed. This way the debug info would be included in the installation logs once the issue happens.

Hope it helps. Thank you!

Comment 16 csharpe 2021-06-03 00:06:12 UTC

Hi Alexander,

(In reply to Alexander Indenbaum from comment #15)
> Hello Martin,
> 
> Just wanted to mention that according to the available logs the issue is
> between HPA
> (https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/)
> which is a standard K8S resource and the Metrics Server, served by
> prometheus-adapter in openshift-monitoring namespace, which is installed as
> a part of OCP. The error events do not really originate in the noobaa code
> per se.
> 
> I am not sure what is the root cause at this stage. I would like to get a
> better idea of the installation procedure. Could you describe how do you
> guys roll out OCS? One possible explanation is that during installation, OCS
> is being installed just about 15 seconds before the prometheus-adapter in
> the openshift-monitoring namespace becomes ready. After 15 seconds
> prometheus-adapter is up and the issue is gone. So the platform is good,
> there might be a timing issue during OCP/OCS bootstrap. That explanation
> would match the existing evidence, from another hand, it could be totally
> off 😃.

The installation steps for OCS vary according to the platform being installed on, however, they can be found here:

https://access.redhat.com/documentation/en-us/red_hat_openshift_container_storage/4.7/


> 
> Nimrod suggested adding the debug commands steps suggested above (i.e.
> "kubectl describe apiservice v1beta1.metrics.k8s.io", "kubectl top pod") as
> a part of the cluster installation automation scripts, just before OCS is
> being installed. This way the debug info would be included in the
> installation logs once the issue happens.
> 
> Hope it helps. Thank you!

Comment 17 Nimrod Becker 2021-06-03 05:42:05 UTC

Can the information requested by Alex be added after deployment of OCP was successful and before OCS was started to be deployed?

Comment 18 Martin Bukatovic 2021-06-03 09:26:20 UTC

Created attachment 1788814 [details]
oc describe apiservice v1beta1.metrics.k8s.io > v1beta1.metrics.k8s.io.describe

Comment 19 Martin Bukatovic 2021-06-03 22:18:46 UTC

(In reply to Alexander Indenbaum from comment #13)
> In order to troubleshoot this issue better, the following steps are
> recommended <em>before</em> OCS installation:

I have installed OCP 4.8 (4.8.0-0.nightly-2021-06-03-055145) on vSphere UPI
platform, and provided the information you requested before installing OCS,
see details below.

> * Provide more info with "kubectl describe apiservice v1beta1.metrics.k8s.io"

```
$ oc describe apiservice v1beta1.metrics.k8s.io > v1beta1.metrics.k8s.io.describe
```

See attachment 1788814 [details] from comment 18.

> * Check the status of the metrics server with "kubectl get -n
>   openshift-monitoring pods | grep prometheus-adapter"

```
$ oc get -n openshift-monitoring pods | grep prometheus-adapter
prometheus-adapter-5d9cbfdc5d-hlsm7            1/1     Running   0          102m
prometheus-adapter-5d9cbfdc5d-wzfsv            1/1     Running   0          104m
```

> * Finally ensure the metrics service is available by "kubectl top pod"

Are you interested in a particular namespace? The default namespace is
obviously empty in my case:

```
$ oc adm top pod
W0603 19:25:03.656858   14005 top_pod.go:140] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
No resources found in default namespace.
```

If I try other existing ocp namespace, I see values as expected:

```
$ oc adm top pod -n openshift-etcd
W0603 19:25:20.345433   14010 top_pod.go:140] Using json format to get metrics. Next release will switch to protocol-buffers, switch early by passing --use-protocol-buffers flag
NAME                                 CPU(cores)   MEMORY(bytes)
etcd-control-plane-0                 102m         1038Mi
etcd-control-plane-1                 94m          1027Mi
etcd-control-plane-2                 81m          1028Mi
etcd-quorum-guard-77b8f5d85b-8wsrb   3m           1Mi
etcd-quorum-guard-77b8f5d85b-nsfjr   3m           1Mi
etcd-quorum-guard-77b8f5d85b-vzzxh   5m           1Mi
```

Besides that, I also fetched a must gather data:

```
b26aa7119bad9525274a37c55beffd4851aa48ae  mbukatov-0603b-local.must-gather.2021-06-03T19:26+02:00.tar.gz
```

See http://file.emea.redhat.com/~mbukatov/bz-1885524/

> Once the metrics server availability is ensured, please try to reproduce the
> HPA noobaa-endpoint issue by running the OCS installation and examining
> events.

Then I installed the following operators from operator hub:

- LSO 4.8.0-202106021817
- OCS 4.8.0-407.ci

And created OCS StorageCluster via OCP Console. And I can confirm that the
event in question is still there:

```
$ oc get events -n openshift-storage | grep FailedGetResourceMetric
5m3s        Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint                                    failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
2m48s       Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint                                    failed to get cpu utilization: did not receive metrics for any ready pods
$ oc get events -n openshift-storage | grep FailedComputeMetricsReplicas
8m53s       Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint                                    invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
6m53s       Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint                                    invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: did not receive metrics for any ready pods
```

Which means that the problem is reproducible with OCP/OCS 4.8. There seem to be additional issues with the cluster to debug though.

Comment 20 Martin Bukatovic 2021-06-03 22:24:25 UTC

> I am not sure what is the root cause at this stage. I would like to get a better idea of the installation procedure. Could you describe how do you guys roll out OCS?

The steps are:

- deploy OCP cluster
- install LSO and OCS operators from operator hub (via OCP Console)
- via OCP console, locate OCS operator and start Create Storage Cluster procedure

> One possible explanation is that during installation, OCS is being installed just about 15 seconds before the prometheus-adapter in the openshift-monitoring namespace becomes ready.

Because of my current testing scope, I install OCS manually on automatically deployed OCP cluster, which means that OCS is installed on a cluster which is running for few minutes and fully operational.

Comment 29 Martin Bukatovic 2021-10-05 18:55:00 UTC

After installation, I still see the events complaining about getting cpu utilization:

```
$ oc get events -n openshift-storage | grep FailedGetResourceMetric
30m         Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint                                   failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
28m         Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint                                   failed to get cpu utilization: did not receive metrics for any ready pods
$ oc get events -n openshift-storage | grep FailedComputeMetricsReplicas
39m         Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint                                   invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
36m         Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint                                   invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: did not receive metrics for any ready pods
```

Tested with:

OCP 4.9.0-0.nightly-2021-10-05-004711
LSO 4.9.0-202109210853
ODF 4.9.0-164.ci

And I can see it both on vSphere LSO onpremise cluster, and in AWS UPI cloud deployment.

Comment 30 Alexander Indenbaum 2021-10-13 07:22:19 UTC

Hello, 

The issue is not reproducible with docker-desktop Kubernetes and the Metrics server https://github.com/kubernetes-sigs/metrics-server, however, the issue is still present in the OCP environment.

To troubleshoot this issue better, tried a custom NooBaa build where the NooBaa operator first fetches the Endpoint pod metrics, verifying the availability of the endpoint deployment pods metrics, before creating an HPA instance, see https://github.com/noobaa/noobaa-operator/pull/750. In the OCP environment, using the build based on the PR #750 codebase above there are still  FailedGetResourceMetric/FailedComputeMetricsReplicas warnings emitted by the HPA.

Comment 34 Danny 2021-11-04 07:47:04 UTC

Comment 35 Danny 2021-11-04 07:53:28 UTC

@ebenahar I've found an OCP bug (https://bugzilla.redhat.com/show_bug.cgi?id=1993985) which seems similar to this BZ.
it is closed as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=2011815 which AFAIU is fixed in OCP 4.9.0.  
can we verify this issue is not reproduced on OCP 4.9?

Comment 36 Martin Bukatovic 2021-11-25 15:53:15 UTC

Retrying with vSphere LSO onpremise cluster with:

- OCP 4.9.0-0.nightly-2021-11-24-090558
- LSO 4.9.0-202111151318
- OCS 4.9.0-249.ci

And unfortunately, I still se the same behaviour as before, events related to metrics cpu issues are still present right after installation:

```
$ oc get events -n openshift-storage | grep FailedGetResourceMetric
12m         Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint                                    failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
9m40s       Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint                                    failed to get cpu utilization: did not receive metrics for any ready pods
[ocsqe@fedora data]$ oc get events -n openshift-storage | grep FailedComputeMetricsReplicas
12m         Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint                                    invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
10m         Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint                                    invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: did not receive metrics for any ready pods
```

Comment 46 dnastaci 2022-05-10 20:18:39 UTC

For whatever it is worth, also seeing this on OCP 4.10.11 with ODF 4.10 over LSO.


oc get events -n openshift-storage | grep -v Normal
LAST SEEN   TYPE      REASON                         OBJECT                                                                    MESSAGE
11m         Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint                                   failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
11m         Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint                                   invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
8m23s       Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-endpoint                                   failed to get cpu utilization: did not receive metrics for any ready pods
8m38s       Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/noobaa-endpoint                                   invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: did not receive metrics for any ready pods
17m         Warning   ReconcileFailed                storagesystem/ocs-storagecluster-storagesystem                            Operation cannot be fulfilled on storageclusters.ocs.openshift.io "ocs-storagecluster": the object has been modified; please apply your changes to the latest version and try again

Comment 69 Filip Balák 2023-08-22 12:12:51 UTC

The problem is present in current version:

$ oc get events -n openshift-storage | grep FailedGetResourceMetric
48m         Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-hpav2                                                failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
46m         Warning   FailedGetResourceMetric        horizontalpodautoscaler/noobaa-hpav2                                                failed to get cpu utilization: did not receive metrics for targeted pods (pods might be unready)

Tested with:
ocs 4.14.0-113
ocp 4.14.0-0.nightly-2023-08-11-055332