Bug 1419481 - HPA unable to get metrics for resource cpu: failed to unmarshal heapster response [NEEDINFO]
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Pod
Version: 3.5.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Assigned To: Solly Ross
QA Contact: DeShuai Ma
Reported: 2017-02-06 05:10 EST by DeShuai Ma
Modified: 2017-07-24 10 EDT
CC List: 7 users

Doc Type: No Doc Update
Last Closed: 2017-04-12 15:11:45 EDT
Type: Bug
sross: needinfo? (mwringe)


External Trackers
Tracker: Red Hat Product Errata
ID: RHBA-2017:0884
Priority: normal
Status: SHIPPED_LIVE
Summary: Red Hat OpenShift Container Platform 3.5 RPM Release Advisory
Last Updated: 2017-04-12 18:50:07 EDT

Description DeShuai Ma 2017-02-06 05:10:26 EST
Description of problem:
After creating an HPA for a scalable resource, the HPA can't get metrics correctly and always shows the error "unable to get metrics for resource cpu: failed to unmarshal heapster response: json: cannot unmarshal array into Go value of type v1alpha1.PodMetricsList"

Version-Release number of selected component (if applicable):
openshift v3.5.0.16+a26133a
kubernetes v1.5.2+43a9be4
etcd 3.1.0

registry.ops.openshift.com/openshift3/metrics-hawkular-metrics:3.5.0   imageid: fc0e50112581 
registry.ops.openshift.com/openshift3/metrics-cassandra:3.5.0          imageid: aa7e5b2b7210
registry.ops.openshift.com/openshift3/metrics-heapster:3.5.0           imageid: b2cb3298b3db

How reproducible:
Always

Steps to Reproduce:
1. Create a scalable resource
$ oc run resource-consumer --image=gcr.io/google_containers/resource_consumer:beta --expose --port 8080 --requests='cpu=100m,memory=256Mi' -n dma1

2. Create an HPA for the scalable resource
$ oc autoscale dc resource-consumer --min=1 --max=5 -n dma1

3. Check the hpa status
[root@ip-172-18-11-11 ~]# oc describe hpa/resource-consumer -n dma1
Name:                resource-consumer
Namespace:            dma1
Labels:                <none>
Annotations:            <none>
CreationTimestamp:        Mon, 06 Feb 2017 04:32:18 -0500
Reference:            DeploymentConfig/resource-consumer
Target CPU utilization:        80%
Current CPU utilization:    <unset>
Min replicas:            1
Max replicas:            5
Events:
  FirstSeen    LastSeen    Count    From                SubObjectPath    Type        Reason            Message
  ---------    --------    -----    ----                -------------    --------    ------            -------
  12m        9m        8    {horizontal-pod-autoscaler }            Normal        MetricsNotAvailableYet    unable to get metrics for resource cpu: failed to unmarshal heapster response: json: cannot unmarshal array into Go value of type v1alpha1.PodMetricsList
  9m        18s        19    {horizontal-pod-autoscaler }            Warning        FailedGetMetrics    unable to get metrics for resource cpu: failed to unmarshal heapster response: json: cannot unmarshal array into Go value of type v1alpha1.PodMetricsList

Actual results:

Expected results:

Additional info:
Comment 1 Solly Ross 2017-02-06 11:30:12 EST
There was a change in the type of PodMetricsList at some point to be more in line with the rest of the *List types in Kubernetes (before, it was just an array).  It looks like our version of Heapster is too old to work with our version of Kubernetes.

We'll need a newer version of Heapster.
Comment 5 Troy Dawson 2017-02-07 15:58:50 EST
It looks like this was my fault, but it's cleared up now.
The original OCP 3.5 repos that we had were set up incorrectly and pulled in a 1.1.0 version of heapster.  They were fixed fairly quickly, but we didn't update the heapster image.

I have rebuilt the 3.5 metrics-heapster image, and verified that it has heapster-1.2.0 in it.

openshift3/metrics-heapster:3.5.0-2

That image is in the usual OCP 3.5 testing areas.
Comment 6 DeShuai Ma 2017-02-09 03:56:42 EST
Verified on the latest 3.5 heapster image (image tag: 7799362af752). The image has been updated and the HPA can get metrics now.

[root@dhcp-128-7 dma]# oc run resource-consumer --image=gcr.io/google_containers/resource_consumer:beta --expose --port 8080 --requests='cpu=100m,memory=256Mi'
service "resource-consumer" created
deploymentconfig "resource-consumer" created
[root@dhcp-128-7 dma]# oc autoscale dc resource-consumer --min=1 --max=5
deploymentconfig "resource-consumer" autoscaled
[root@dhcp-128-7 dma]# oc get hpa
NAME                REFERENCE                            TARGET    CURRENT   MINPODS   MAXPODS   AGE
resource-consumer   DeploymentConfig/resource-consumer   80%       0%        1         5         <invalid>
[root@dhcp-128-7 dma]# oc describe hpa resource-consumer
Name:				resource-consumer
Namespace:			dma1
Labels:				<none>
Annotations:			<none>
CreationTimestamp:		Thu, 09 Feb 2017 16:52:19 +0800
Reference:			DeploymentConfig/resource-consumer
Target CPU utilization:		80%
Current CPU utilization:	0%
Min replicas:			1
Max replicas:			5
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason			Message
  ---------	--------	-----	----				-------------	--------	------			-------
  <invalid>	<invalid>	3	{horizontal-pod-autoscaler }			Normal		MetricsNotAvailableYet	unable to get metrics for resource cpu: no metrics returned from heapster
  <invalid>	<invalid>	2	{horizontal-pod-autoscaler }			Normal		DesiredReplicasComputed	Computed the desired num of replicas: 0 (avgCPUutil: 0, current replicas: 1)
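
The DesiredReplicasComputed event above reports 0 desired replicas at 0% average CPU, yet the HPA has Min replicas set to 1. A rough sketch of the arithmetic, assuming the commonly documented ratio rule ceil(currentReplicas * currentUtilization / targetUtilization) followed by clamping to the min/max bounds: `desiredReplicas` is a hypothetical helper, not the controller's actual code, and it omits the tolerance check the real controller applies.

```go
package main

import (
	"fmt"
	"math"
)

// desiredReplicas is a hypothetical helper illustrating the ratio rule.
// It returns the raw computed value and the value after clamping to
// [minReplicas, maxReplicas]. The real controller's tolerance band is omitted.
func desiredReplicas(current int, currentUtil, targetUtil float64, minReplicas, maxReplicas int) (raw, clamped int) {
	raw = int(math.Ceil(float64(current) * currentUtil / targetUtil))
	clamped = raw
	if clamped < minReplicas {
		clamped = minReplicas
	}
	if clamped > maxReplicas {
		clamped = maxReplicas
	}
	return raw, clamped
}

func main() {
	// Values from the output above: 1 replica, 0% current, 80% target, min 1, max 5.
	raw, clamped := desiredReplicas(1, 0, 80, 1, 5)
	fmt.Printf("computed=%d effective=%d\n", raw, clamped)
}
```

Under these assumptions the raw result is 0 and the clamped result is 1, which is consistent with the event text and the configured minimum.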
Comment 8 errata-xmlrpc 2017-04-12 15:11:45 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0884
