Description of problem:
After creating an HPA for a scalable resource, the HPA cannot get metrics and always shows the error "unable to get metrics for resource cpu: failed to unmarshal heapster response: json: cannot unmarshal array into Go value of type v1alpha1.PodMetricsList"

Version-Release number of selected component (if applicable):
openshift v3.5.0.16+a26133a
kubernetes v1.5.2+43a9be4
etcd 3.1.0

registry.ops.openshift.com/openshift3/metrics-hawkular-metrics:3.5.0  imageid: fc0e50112581
registry.ops.openshift.com/openshift3/metrics-cassandra:3.5.0         imageid: aa7e5b2b7210
registry.ops.openshift.com/openshift3/metrics-heapster:3.5.0          imageid: b2cb3298b3db

How reproducible:
Always

Steps to Reproduce:
1. Create a scalable resource
$ oc run resource-consumer --image=gcr.io/google_containers/resource_consumer:beta --expose --port 8080 --requests='cpu=100m,memory=256Mi' -n dma1

2. Create an HPA for the scalable resource
$ oc autoscale dc resource-consumer --min=1 --max=5 -n dma1

3. Check the HPA status
[root@ip-172-18-11-11 ~]# oc describe hpa/resource-consumer -n dma1
Name:                       resource-consumer
Namespace:                  dma1
Labels:                     <none>
Annotations:                <none>
CreationTimestamp:          Mon, 06 Feb 2017 04:32:18 -0500
Reference:                  DeploymentConfig/resource-consumer
Target CPU utilization:     80%
Current CPU utilization:    <unset>
Min replicas:               1
Max replicas:               5
Events:
  FirstSeen  LastSeen  Count  From                          SubObjectPath  Type     Reason                  Message
  ---------  --------  -----  ----                          -------------  -------- ------                  -------
  12m        9m        8      {horizontal-pod-autoscaler }                 Normal   MetricsNotAvailableYet  unable to get metrics for resource cpu: failed to unmarshal heapster response: json: cannot unmarshal array into Go value of type v1alpha1.PodMetricsList
  9m         18s       19     {horizontal-pod-autoscaler }                 Warning  FailedGetMetrics        unable to get metrics for resource cpu: failed to unmarshal heapster response: json: cannot unmarshal array into Go value of type v1alpha1.PodMetricsList

Actual results:

Expected results:

Additional info:
There was a change in the type of PodMetricsList at some point to be more in line with the rest of the *List types in Kubernetes (before, it was just an array). It looks like our version of Heapster is too old to work with our version of Kubernetes. We'll need a newer version of Heapster.
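To illustrate the type change described above, here is a minimal Go sketch. It does not use the real Heapster or Kubernetes types: podMetrics, podMetricsList, decodeList, and both JSON payloads are hypothetical stand-ins, assuming only that the newer list type is an object with an "items" field, like other Kubernetes *List types. Decoding a bare array into such a struct fails with an error of the same shape as the one in the HPA events.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podMetrics is a simplified, hypothetical stand-in for one pod's metrics.
type podMetrics struct {
	Name string `json:"name"`
}

// podMetricsList mimics a Kubernetes-style *List type: a JSON object
// that wraps the entries in an "items" field (an assumption for this sketch).
type podMetricsList struct {
	Items []podMetrics `json:"items"`
}

// decodeList tries to decode a raw JSON response into the list type.
func decodeList(raw string) (podMetricsList, error) {
	var list podMetricsList
	err := json.Unmarshal([]byte(raw), &list)
	return list, err
}

func main() {
	// Made-up payloads: a bare array (old style) and a list object (new style).
	oldStyle := `[{"name":"resource-consumer-1"}]`
	newStyle := `{"items":[{"name":"resource-consumer-1"}]}`

	// A client expecting the list object cannot decode the bare array.
	_, err := decodeList(oldStyle)
	fmt.Println("bare array:", err)

	// The list object decodes cleanly.
	list, err := decodeList(newStyle)
	fmt.Println("list object:", len(list.Items), "item(s), err:", err)
}
```

If this sketch matches the real situation, the mismatch is purely one of response shape between the two versions, which is why updating Heapster (rather than changing the HPA configuration) is the fix.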
It looks like this was my fault, but it's cleared up now. The original OCP 3.5 repos that we had were set up incorrectly and pulled in a 1.1.0 version of heapster. They were fixed fairly quickly, but we didn't update the heapster image. I have rebuilt the 3.5 metrics-heapster image and verified that it has heapster-1.2.0 in it:

openshift3/metrics-heapster:3.5.0-2

That image is in the usual OCP 3.5 testing areas.
Verified with the latest 3.5 heapster image (image tag: 7799362af752). The image has been updated and the HPA can get metrics now.

[root@dhcp-128-7 dma]# oc run resource-consumer --image=gcr.io/google_containers/resource_consumer:beta --expose --port 8080 --requests='cpu=100m,memory=256Mi'
service "resource-consumer" created
deploymentconfig "resource-consumer" created

[root@dhcp-128-7 dma]# oc autoscale dc resource-consumer --min=1 --max=5
deploymentconfig "resource-consumer" autoscaled

[root@dhcp-128-7 dma]# oc get hpa
NAME                REFERENCE                            TARGET    CURRENT   MINPODS   MAXPODS   AGE
resource-consumer   DeploymentConfig/resource-consumer   80%       0%        1         5         <invalid>

[root@dhcp-128-7 dma]# oc describe hpa resource-consumer
Name:                       resource-consumer
Namespace:                  dma1
Labels:                     <none>
Annotations:                <none>
CreationTimestamp:          Thu, 09 Feb 2017 16:52:19 +0800
Reference:                  DeploymentConfig/resource-consumer
Target CPU utilization:     80%
Current CPU utilization:    0%
Min replicas:               1
Max replicas:               5
Events:
  FirstSeen  LastSeen   Count  From                          SubObjectPath  Type    Reason                   Message
  ---------  --------   -----  ----                          -------------  ------  ------                   -------
  <invalid>  <invalid>  3      {horizontal-pod-autoscaler }                 Normal  MetricsNotAvailableYet   unable to get metrics for resource cpu: no metrics returned from heapster
  <invalid>  <invalid>  2      {horizontal-pod-autoscaler }                 Normal  DesiredReplicasComputed  Computed the desired num of replicas: 0 (avgCPUutil: 0, current replicas: 1)
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0884