Bug 1551496

Summary: HPA can't get metrics for a deploymentconfig on 3.7 env
Product: OpenShift Container Platform
Reporter: weiwei jiang <wjiang>
Component: Node
Assignee: Jordan Liggitt <jliggitt>
Status: CLOSED ERRATA
QA Contact: weiwei jiang <wjiang>
Severity: high
Docs Contact:
Priority: high
Version: 3.7.1
CC: aos-bugs, cstark, dma, gtedorst, jliggitt, jmalde, jokerman, mmccomas, nnosenzo, rbost, sdehn, smunilla, snalawad, sross, tibrahim, wjiang
Target Milestone: ---
Keywords: Regression
Target Release: 3.7.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-05-18 03:54:45 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description weiwei jiang 2018-03-05 09:58:35 UTC
Description of problem:
When trying to autoscale a deploymentconfig, the HPA keeps emitting the following warning event:
1m         31m         61        hello-hpa            HorizontalPodAutoscaler                                 Warning   FailedGetScale          horizontal-pod-autoscaler     no kind "Scale" is registered for version "extensions/v1beta1"
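A minimal command sequence that triggers the event above (resource name and HPA parameters taken from the reproduction steps below):

# oc autoscale dc/hello-hpa --min=2 --max=10 --cpu-percent=50
# oc describe hpa hello-hpa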



Version-Release number of selected component (if applicable):
# openshift version
openshift v3.7.36
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8


How reproducible:
always

Steps to Reproduce:
1.
    Given I have a project
    When I run the :create client command with:
      | f | https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/infrastructure/hpa/dc-hello-hpa.yaml |
    Then the step should succeed
    Given 2 pods become ready with labels:
      | run=hello-hpa |
    When I run the :expose client command with:
      | resource      | rc          |
      | resource name | hello-hpa-1 |
      | port          | 8080        |
    Given I wait for the "hello-hpa-1" service to become ready
    When I run the :autoscale client command with:
      | name        | dc/hello-hpa |
      | min         | 2            |
      | max         | 10           |
      | cpu-percent | 50           |
    Then the step should succeed
    Given I wait up to 300 seconds for the steps to pass:
    """
    Then expression should be true> hpa('hello-hpa').min_replicas(cached: false, user: user) == 2
    And expression should be true> hpa.max_replicas == 10
    And expression should be true> hpa.current_cpu_utilization_percentage == 0
    And expression should be true> hpa.target_cpu_utilization_percentage == 50
    And expression should be true> hpa.current_replicas == 2
    """
    When I run the :create client command with:
      | f | https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/infrastructure/hpa/hello-pod.yaml |
    Then the step should succeed
    Given the pod named "hello-pod" status becomes :running within 60 seconds
    When I run the :exec background client command with:
      | pod              | hello-pod                                         |
      | oc_opts_end      |                                                   |
      | exec_command     | sh                                                |
      | exec_command_arg | -c                                                |
      | exec_command_arg | while true;do curl http://<%= service.url %>;done |
    Then the step should succeed
    Given I wait up to 600 seconds for the steps to pass:
    """
    Then expression should be true> hpa('hello-hpa').current_replicas(cached: false, user: user) > 2
    And expression should be true> hpa.current_cpu_utilization_percentage > hpa.target_cpu_utilization_percentage
    """
    Given I ensure "hello-pod" pod is deleted
    Given I wait up to 600 seconds for the steps to pass:
    """
    Then expression should be true> hpa('hello-hpa').current_cpu_utilization_percentage(cached: false, user: user) == 0
    And expression should be true> hpa.current_replicas == 2
    """
2.
3.
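For reference, a rough plain-oc equivalent of step 1 above (commands assumed; they mirror the automated steps, and <service-url> stands in for the exposed service's URL):

# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/infrastructure/hpa/dc-hello-hpa.yaml
# oc expose rc hello-hpa-1 --port=8080
# oc autoscale dc/hello-hpa --min=2 --max=10 --cpu-percent=50
# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/infrastructure/hpa/hello-pod.yaml
# oc exec hello-pod -- sh -c 'while true; do curl http://<service-url>; done'
# oc get hpa hello-hpa -w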

Actual results:
# oc describe hpa
Name:							hello-hpa
Namespace:						2np66
Labels:							<none>
Annotations:						<none>
CreationTimestamp:					Mon, 05 Mar 2018 04:14:22 -0500
Reference:						DeploymentConfig/hello-hpa
Metrics:						( current / target )
  resource cpu on pods  (as a percentage of request):	<unknown> / 50%
Min replicas:						2
Max replicas:						10
Conditions:
  Type		Status	Reason		Message
  ----		------	------		-------
  AbleToScale	False	FailedGetScale	the HPA controller was unable to get the target's current scale: no kind "Scale" is registered for version "extensions/v1beta1"
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  34m		4m		61	horizontal-pod-autoscaler			Warning		FailedGetScale	no kind "Scale" is registered for version "extensions/v1beta1"

Expected results:
The HPA can read the DeploymentConfig's current scale and CPU utilization, and scales it between 2 and 10 replicas as the load changes.

Additional info:
Autoscaling a replicationcontroller works as expected.
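As a hypothetical comparison (commands assumed, names taken from the steps above), targeting the replication controller directly instead of the deploymentconfig does not produce the FailedGetScale event:

# oc autoscale rc/hello-hpa-1 --min=2 --max=10 --cpu-percent=50
# oc describe hpa hello-hpa-1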

Comment 1 DeShuai Ma 2018-03-06 08:26:52 UTC
This will block the 3.7 errata release; setting priority/severity to high/high.

Comment 2 Solly Ross 2018-03-06 15:59:11 UTC
Can you please post the output of `oc get hpa -o yaml`?

Comment 3 weiwei jiang 2018-03-13 08:10:07 UTC
# oc get hpa hello-hpa -n f0i86 -o yaml 
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"False","lastTransitionTime":"2018-03-13T08:01:22Z","reason":"FailedGetScale","message":"the
      HPA controller was unable to get the target''s current scale: no kind \"Scale\"
      is registered for version \"extensions/v1beta1\""}]'
  creationTimestamp: 2018-03-13T08:00:52Z
  name: hello-hpa
  namespace: f0i86
  resourceVersion: "6170"
  selfLink: /apis/autoscaling/v1/namespaces/f0i86/horizontalpodautoscalers/hello-hpa
  uid: a6433671-2694-11e8-8eef-0ed7d6f1df92
spec:
  maxReplicas: 10
  minReplicas: 2
  scaleTargetRef:
    apiVersion: v1
    kind: DeploymentConfig
    name: hello-hpa
  targetCPUUtilizationPercentage: 50
status:
  currentReplicas: 0
  desiredReplicas: 0
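The scaleTargetRef above points at the DeploymentConfig, so a quick sanity check is to query its scale subresource directly (path assumed for the OpenShift 3.x legacy API, and assuming `oc get --raw` is available in this client version):

# oc get --raw /oapi/v1/namespaces/f0i86/deploymentconfigs/hello-hpa/scale

If the API server returns a Scale object here, the failure is likely on the controller's decoding side rather than in the API itself, which would be consistent with the "no kind \"Scale\" is registered" message.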

# oc version 
oc v3.7.38
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-13-10.ec2.internal:8443
openshift v3.7.38
kubernetes v1.7.6+a08f5eeb62

Comment 6 Nicolas Nosenzo 2018-04-19 07:39:28 UTC
Hit the same issue on v3.7.42, although it was not reproducible with v3.7.23.

#### FAILED ####
Test done with version:
oc v3.7.42
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO


Apr 17 15:20:21 master.node atomic-openshift-master-controllers[2177]: I0417 15:20:21.163461    2177 horizontal.go:598] Successfully updated status for test-demo-recharge
Apr 17 15:20:21 master.node atomic-openshift-master-controllers[2177]: E0417 15:20:21.163544    2177 horizontal.go:206] failed to query scale subresource for DeploymentConfig/prod-test/test-demo-recharge: no kind "Scale" is registered for version "extensions/v1beta1"
Apr 17 15:20:51 master.node atomic-openshift-master-controllers[2177]: E0417 15:20:51.190854    2177 horizontal.go:206] failed to query scale subresource for DeploymentConfig/prod-test/test-demo-recharge: no kind "Scale" is registered for version "extensions/v1beta1"
Apr 17 15:21:21 master.node atomic-openshift-master-controllers[2177]: E0417 15:21:21.219391    2177 horizontal.go:206] failed to query scale subresource for DeploymentConfig/prod-test/test-demo-recharge: no kind "Scale" is registered for version "extensions/v1beta1"
Apr 17 15:21:51 master.node atomic-openshift-master-controllers[2177]: E0417 15:21:51.242943    2177 horizontal.go:206] failed to query scale subresource for DeploymentConfig/prod-test/test-demo-recharge: no kind "Scale" is registered for version "extensions/v1beta1"
Apr 17 15:22:21 master.node atomic-openshift-master-controllers[2177]: E0417 15:22:21.254977    2177 horizontal.go:206] failed to query scale subresource for DeploymentConfig/prod-test/test-demo-recharge: no kind "Scale" is registered for version "extensions/v1beta1"
Apr 17 15:22:51 master.node atomic-openshift-master-controllers[2177]: I0417 15:22:51.255282    2177 horizontal.go:352] Horizontal Pod Autoscaler has been deleted prod-test/test-demo-recharge
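These controller-side errors were taken from the master controllers service; a hedged way to collect them (unit name taken from the log prefix above):

# journalctl -u atomic-openshift-master-controllers | grep 'horizontal.go'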



#### SUCCEED ####

Test done with version:
openshift v3.7.23
kubernetes v1.7.6+a08f5eeb62

$ oc describe hpa cakephp-mysql-example
Name:							cakephp-mysql-example
Namespace:						hello
Labels:							<none>
Annotations:						<none>
CreationTimestamp:					Wed, 18 Apr 2018 10:11:00 -0400
Reference:						DeploymentConfig/cakephp-mysql-example
Metrics:						( current / target )
  resource cpu on pods  (as a percentage of request):	1% (1m) / 50%
Min replicas:						1
Max replicas:						4
Conditions:
  Type			Status	Reason			Message
  ----			------	------			-------
  AbleToScale		True	ReadyForNewScale	the last scale time was sufficiently old as to warrant a new scale
  ScalingActive		True	ValidMetricFound	the HPA was able to succesfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited	False	DesiredWithinRange	the desired replica count is within the acceptible range
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason				Message
  ---------	--------	-----	----				-------------	--------	------				-------
   14m		14m		1	horizontal-pod-autoscaler			Normal		SuccessfulRescale		New size: 3; reason: cpu resource utilization (percentage of request) above target
  8m		8m		1	horizontal-pod-autoscaler			Normal		SuccessfulRescale		New size: 1; reason: All metrics below target

Comment 26 weiwei jiang 2018-05-09 05:48:06 UTC
Checked with 
# oc version 
oc v3.7.46
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://xxx.xxx.xxx.xxx:8443
openshift v3.7.46
kubernetes v1.7.6+a08f5eeb62

The issue can no longer be reproduced.

Comment 33 errata-xmlrpc 2018-05-18 03:54:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1576