Bug 1636453 - unable to get metrics for resource cpu: no metrics returned from heapster
Summary: unable to get metrics for resource cpu: no metrics returned from heapster
Status: CLOSED DUPLICATE of bug 1593634
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Monitoring
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 3.9.z
Assignee: Ruben Vargas Palma
QA Contact: Junqi Zhao
Depends On:
Reported: 2018-10-05 12:42 UTC by Paul Yates
Modified: 2019-02-20 17:24 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2019-02-20 17:24:18 UTC
Target Upstream Version:


System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Bugzilla | 1593634 | 0 | high | CLOSED | OpenShift Heapster is logging a lot of "no pod" or "no container" found messages | 2021-02-22 00:41:40 UTC

Internal Links: 1593634

Description Paul Yates 2018-10-05 12:42:10 UTC
Description of problem:

A number of customers are concerned about a HorizontalPodAutoscaler (HPA) issue.

When customers use HPA, they are flooded with Events (for example, one customer saw ~9000 events in the last 3 days) that report:

'unable to get metrics for resource cpu: no metrics returned from heapster'

From the customer's point of view, HPA appears NOT to be functioning correctly, even though there is nothing wrong with their HPA setup.

The Events they are seeing are for pods that are newly created, terminating, or dead. Once a pod is fully functional, no further events are generated for it. However, because HPA scales pods up and down frequently, many Events and log messages are created, which customers find increasingly annoying.

Also, the events shown in the UI do not indicate which pod the error message refers to, which makes it harder for customers to understand what is happening.
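
To quantify the noise, the events could be tallied with something like the minimal Python sketch below. It assumes the events have been exported (for example with `oc get events -o json`) and loaded as dicts carrying the standard Kubernetes Event fields `message`, `count`, and `involvedObject`. The function name and the sample data are hypothetical and do not come from any customer environment.

```python
from collections import Counter

HEAPSTER_MSG = "unable to get metrics for resource cpu: no metrics returned from heapster"


def count_heapster_events(events):
    """Sum occurrences of the Heapster warning, grouped by the object the event is attached to."""
    totals = Counter()
    for ev in events:
        if HEAPSTER_MSG not in ev.get("message", ""):
            continue  # unrelated event
        obj = ev.get("involvedObject", {})
        key = (obj.get("kind", "?"), obj.get("name", "?"))
        # 'count' is how many times this event was observed; treat a missing value as 1.
        totals[key] += ev.get("count") or 1
    return totals


# Hypothetical sample data, for illustration only.
sample_events = [
    {"message": HEAPSTER_MSG, "count": 9,
     "involvedObject": {"kind": "HorizontalPodAutoscaler", "name": "frontend"}},
    {"message": "failed to get cpu utilization: " + HEAPSTER_MSG, "count": 3,
     "involvedObject": {"kind": "HorizontalPodAutoscaler", "name": "frontend"}},
    {"message": "Created pod: frontend-1-abcde", "count": 1,
     "involvedObject": {"kind": "ReplicationController", "name": "frontend-1"}},
]

if __name__ == "__main__":
    for (kind, name), n in count_heapster_events(sample_events).items():
        print(kind, name, n)
```

In this sample the matching events are attached to the HPA object rather than to a pod, which is consistent with the UI not showing which pod is affected.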

Version-Release number of selected component (if applicable):

oc v3.9.41
kubernetes v1.9.1+a0ce1bc657
features: Basic-Auth GSSAPI Kerberos SPNEGO

openshift v3.9.41
kubernetes v1.9.1+a0ce1bc657

How reproducible:

Steps to Reproduce:

1. Configure HPA: https://docs.openshift.com/container-platform/3.9/dev_guide/pod_autoscaling.html
2. Scale Pods using HPA
3. View events in the project and view Heapster logs.

Actual results:
There are many events and log entries for new, terminating, and failed pods:
unable to get metrics for resource cpu: no metrics returned from heapster

Expected results:
In an HPA setup, we should return metrics and create events for healthy pods, and should not repeatedly create events for pods that are newly created or terminating.
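
The expected behaviour could look roughly like the minimal Python sketch below, which selects only the pods that should reasonably be reporting metrics. It assumes pod objects shaped like the items returned by `oc get pods -o json`. It is only an illustration of the requested filtering, not the HPA controller's or Heapster's actual logic. The function name and the sample pods are hypothetical.

```python
def pods_expected_to_report_metrics(pods):
    """Return names of pods that are Running, Ready, and not being deleted."""
    eligible = []
    for pod in pods:
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        if meta.get("deletionTimestamp"):
            continue  # terminating pod: missing metrics are expected
        if status.get("phase") != "Running":
            continue  # Pending (newly created), Failed, or Succeeded pod
        ready = any(
            c.get("type") == "Ready" and c.get("status") == "True"
            for c in status.get("conditions", [])
        )
        if ready:
            eligible.append(meta.get("name"))
    return eligible


# Hypothetical sample pods, for illustration only.
sample_pods = [
    {"metadata": {"name": "web-1"},
     "status": {"phase": "Running",
                "conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "web-2"},
     "status": {"phase": "Pending", "conditions": []}},
    {"metadata": {"name": "web-3", "deletionTimestamp": "2018-10-05T12:00:00Z"},
     "status": {"phase": "Running",
                "conditions": [{"type": "Ready", "status": "True"}]}},
    {"metadata": {"name": "web-4"},
     "status": {"phase": "Running",
                "conditions": [{"type": "Ready", "status": "False"}]}},
]

if __name__ == "__main__":
    print(pods_expected_to_report_metrics(sample_pods))
```

Under this assumption, a "no metrics" event would only be worth raising for pods in the returned list.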

Additional info:
