Bug 1533790 - HPA v2 still gets metrics from https:heapster even though use-rest-clients is enabled
Summary: HPA v2 still gets metrics from https:heapster even though use-rest-clients is enabled
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Severity: medium
Priority: medium
Target Milestone: ---
Target Release: 3.11.0
Assignee: Seth Jennings
QA Contact: Weinan Liu
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-01-12 08:54 UTC by DeShuai Ma
Modified: 2018-10-11 07:19 UTC
6 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-10-11 07:19:06 UTC
Target Upstream Version:
Embargoed:


Attachments


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:2652 0 None None None 2018-10-11 07:19:41 UTC

Description DeShuai Ma 2018-01-12 08:54:46 UTC
Description of problem:
When the HPA is enabled with horizontal-pod-autoscaler-use-rest-clients=true, fetching metrics always fails with: "failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services https:heapster:)"

Version-Release number of selected component (if applicable):
openshift v3.9.0-0.16.0
kubernetes v1.9.0-beta1
etcd 3.2.8

How reproducible:
Always

Steps to Reproduce:
 
1. Configure HPA v2 in /etc/origin/master/master-config.yaml as below, then restart the master API and controller services:

kubernetesMasterConfig:
  apiServerArguments:
    runtime-config:
    - autoscaling/v2beta1=true
  controllerArguments:
    horizontal-pod-autoscaler-use-rest-clients:
    - 'true'
    horizontal-pod-autoscaler-sync-period:
    - 10s

# systemctl restart atomic-openshift-master-api.service
# systemctl restart atomic-openshift-master-controllers.service

2. Deploy metrics-server and verify that it is working in OCP:
[root@ip-172-18-11-233 deploy]# oc get svc -n kube-system
NAME             TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)   AGE
metrics-server   ClusterIP   172.30.14.24   <none>        443/TCP   2h
[root@ip-172-18-11-233 deploy]# oc get po -n kube-system
NAME                            READY     STATUS    RESTARTS   AGE
metrics-server-8946c475-7xfrv   1/1       Running   0          1h
[root@ip-172-18-11-233 deploy]# oc get --raw /apis/metrics.k8s.io/v1beta1
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"metrics.k8s.io/v1beta1","resources":[{"name":"nodes","singularName":"","namespaced":false,"kind":"NodeMetrics","verbs":["get","list"]},{"name":"pods","singularName":"","namespaced":true,"kind":"PodMetrics","verbs":["get","list"]}]}
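The oc commands above need a running cluster, but the discovery response itself can be sanity-checked locally. A small sketch that greps the JSON pasted above (copied verbatim) for the two resources the metrics API is expected to serve; plain grep is used in case jq is not installed:

```shell
# The discovery JSON returned by `oc get --raw /apis/metrics.k8s.io/v1beta1`,
# copied verbatim from the output above.
resp='{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"metrics.k8s.io/v1beta1","resources":[{"name":"nodes","singularName":"","namespaced":false,"kind":"NodeMetrics","verbs":["get","list"]},{"name":"pods","singularName":"","namespaced":true,"kind":"PodMetrics","verbs":["get","list"]}]}'

# Extract the resource names; a healthy metrics-server serves both
# NodeMetrics ("nodes") and PodMetrics ("pods").
names=$(echo "$resp" | grep -o '"name":"[a-z]*"')
echo "$names"
```

If either resource is missing from the list, the aggregated metrics API is not fully registered and the HPA will not be able to read pod metrics.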


3. Create an RC and an HPA v2beta1 resource, then check the HPA status:
[root@ip-172-18-11-233 deploy]# oc adm new-project dma1
Created project dma1
[root@ip-172-18-11-233 deploy]# oc create -f https://raw.githubusercontent.com/mdshuai/testfile-openshift/master/k8s/autoscaling/hpa/rc.yaml -n dma1
replicationcontroller "hello-openshift" created
[root@ip-172-18-11-233 deploy]# oc create -f https://raw.githubusercontent.com/mdshuai/testfile-openshift/master/k8s/autoscaling/hpa/resource.yaml -n dma1
horizontalpodautoscaler "resource-hpa" created
[root@ip-172-18-11-233 deploy]# 
[root@ip-172-18-11-233 deploy]# oc get hpa.v2beta1.autoscaling -n dma1
NAME           REFERENCE                               TARGETS           MINPODS   MAXPODS   REPLICAS   AGE
resource-hpa   ReplicationController/hello-openshift   <unknown> / 80%   2         10        1          16s
[root@ip-172-18-11-233 deploy]# oc describe hpa.v2beta1.autoscaling resource-hpa -n dma1
Name:                                                  resource-hpa
Namespace:                                             dma1
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Fri, 12 Jan 2018 03:10:46 -0500
Reference:                                             ReplicationController/hello-openshift
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  <unknown> / 80%
Min replicas:                                          2
Max replicas:                                          10
ReplicationController pods:                            2 current / 2 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetResourceMetric  the HPA was unable to compute the replica count: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services https:heapster:)
Events:
  Type     Reason                        Age               From                       Message
  ----     ------                        ----              ----                       -------
  Normal   SuccessfulRescale             25s               horizontal-pod-autoscaler  New size: 2; reason: Current number of replicas below Spec.MinReplicas
  Warning  FailedGetResourceMetric       5s (x2 over 15s)  horizontal-pod-autoscaler  unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services https:heapster:)
  Warning  FailedComputeMetricsReplicas  5s (x2 over 15s)  horizontal-pod-autoscaler  failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services https:heapster:)

Actual results:
3. The horizontal-pod-autoscaler fails with: "failed to get cpu utilization: unable to get metrics for resource cpu: failed to get pod resource metrics: the server could not find the requested resource (get services https:heapster:)"

Expected results:
3. The HPA should fetch metrics successfully.

Additional info:
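The referenced resource.yaml is not reproduced in this report, so its exact contents are an assumption; based on the describe output above (CPU target 80%, min 2 / max 10 replicas, scale target ReplicationController/hello-openshift), a minimal v2beta1 HPA of that shape would look like:

```yaml
# Hypothetical reconstruction of resource.yaml, inferred from the
# `oc describe hpa` output; not the actual file from the repository.
apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: resource-hpa
spec:
  scaleTargetRef:
    apiVersion: v1
    kind: ReplicationController
    name: hello-openshift
  minReplicas: 2
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      targetAverageUtilization: 80
```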

Comment 1 Solly Ross 2018-01-16 15:49:02 UTC
Because we use custom HPA setup logic, that switch won't work yet, unfortunately.  I had a PR in to fix it temporarily (https://github.com/openshift/origin/pull/18035), but it looks like we're going to wait and just remove our custom logic entirely once we're installing metrics-server by default.

Comment 2 DeShuai Ma 2018-01-17 01:03:01 UTC
I need this fix in order to test the HPA v2beta1 feature. Since there is a PR to fix the issue, I am marking the bug status MODIFIED rather than NOTABUG.

Comment 4 weiwei jiang 2018-01-24 06:19:29 UTC
Since the PR is not merged yet, moving back to MODIFIED.

Comment 7 Seth Jennings 2018-04-27 16:41:45 UTC
The move to metrics-server is deferred to 3.11.

Comment 10 Seth Jennings 2018-07-30 14:09:48 UTC
Origin PR:
https://github.com/openshift/origin/pull/19115

Comment 12 Weinan Liu 2018-08-28 11:09:03 UTC
Issue verified to be fixed.

[root@ip-172-18-11-67 ~]# oc version
oc v3.11.0-0.24.0
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-11-67.ec2.internal:8443
openshift v3.11.0-0.24.0
kubernetes v1.11.0+d4cacc0
[root@ip-172-18-11-67 ~]# cat /etc/redhat-release 
Red Hat Enterprise Linux Server release 7.5 (Maipo)
[root@ip-172-18-11-67 ~]# 


[root@ip-172-18-11-67 ~]# oc get --raw /apis/metrics.k8s.io/v1beta1
{"kind":"APIResourceList","apiVersion":"v1","groupVersion":"metrics.k8s.io/v1beta1","resources":[{"name":"nodes","singularName":"","namespaced":false,"kind":"NodeMetrics","verbs":["get","list"]},{"name":"pods","singularName":"","namespaced":true,"kind":"PodMetrics","verbs":["get","list"]}]}

oc create -f https://raw.githubusercontent.com/mdshuai/testfile-openshift/master/k8s/autoscaling/hpa-v2beta1/rc.yaml -n dma1
oc create -f https://raw.githubusercontent.com/mdshuai/testfile-openshift/master/k8s/autoscaling/hpa-v2beta1/resource-metrics-cpu.yaml -n dma1

[root@ip-172-18-11-67 ~]# oc get hpa.v2beta1.autoscaling -n dma1
NAME           REFERENCE                               TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
resource-cpu   ReplicationController/hello-openshift   0%/80%    2         10        2          39m
[root@ip-172-18-11-67 ~]# oc describe hpa.v2beta1.autoscaling resource-hpa -n dma1
Error from server (NotFound): horizontalpodautoscalers.autoscaling "resource-hpa" not found
[root@ip-172-18-11-67 ~]# oc describe hpa.v2beta1.autoscaling resource-cpu -n dma1
Name:                                                  resource-cpu
Namespace:                                             dma1
Labels:                                                <none>
Annotations:                                           <none>
CreationTimestamp:                                     Tue, 28 Aug 2018 06:28:34 -0400
Reference:                                             ReplicationController/hello-openshift
Metrics:                                               ( current / target )
  resource cpu on pods  (as a percentage of request):  0% (0) / 80%
Min replicas:                                          2
Max replicas:                                          10
ReplicationController pods:                            2 current / 2 desired
Conditions:
  Type            Status  Reason            Message
  ----            ------  ------            -------
  AbleToScale     True    ReadyForNewScale  the last scale time was sufficiently old as to warrant a new scale
  ScalingActive   True    ValidMetricFound  the HPA was able to successfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited  True    TooFewReplicas    the desired replica count is increasing faster than the maximum scale rate
Events:
  Type    Reason             Age   From                       Message
  ----    ------             ----  ----                       -------
  Normal  SuccessfulRescale  39m   horizontal-pod-autoscaler  New size: 2; reason: Current number of replicas below Spec.MinReplicas
[root@ip-172-18-11-67 ~]#
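For reference, the HPA controller computes the desired replica count roughly as ceil(currentReplicas × currentUtilization / targetUtilization), clamped to the min/max bounds, which is why the 0%/80% reading above leaves the HPA pinned at the minimum of 2. A minimal sketch of that arithmetic (the function name and structure are illustrative, not the controller's actual code):

```shell
# Sketch of the HPA v2 replica calculation:
#   desired = ceil(current * utilization / target), clamped to [min, max].
# Integer ceiling is done via (a + b - 1) / b.
hpa_desired() {
  local current=$1 util=$2 target=$3 min=$4 max=$5
  local desired=$(( (current * util + target - 1) / target ))
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi
  echo "$desired"
}

hpa_desired 2 160 80 2 10   # utilization double the target: scale 2 -> 4
hpa_desired 2 0 80 2 10     # 0% / 80% as in the output above: stays at min, 2
```

With the verified cluster reporting 0% utilization against an 80% target, the calculation never exceeds minReplicas, matching the "2 current / 2 desired" state shown above.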

Comment 14 errata-xmlrpc 2018-10-11 07:19:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652

