Bug 1551496 - HPA can't get metrics for deploymentconfig on 3.7 env
Summary: HPA can't get metrics for deploymentconfig on 3.7 env
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 3.7.1
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 3.7.z
Assignee: Jordan Liggitt
QA Contact: weiwei jiang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2018-03-05 09:58 UTC by weiwei jiang
Modified: 2018-05-25 07:14 UTC
CC List: 16 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-05-18 03:54:45 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Knowledge Base (Solution) 3418461 0 None None None 2018-04-19 13:10:00 UTC
Red Hat Product Errata RHBA-2018:1576 0 None None None 2018-05-18 03:55:23 UTC

Description weiwei jiang 2018-03-05 09:58:35 UTC
Description of problem:
When trying to autoscale a deploymentconfig, the following warning event is emitted:
1m         31m         61        hello-hpa            HorizontalPodAutoscaler                                 Warning   FailedGetScale          horizontal-pod-autoscaler     no kind "Scale" is registered for version "extensions/v1beta1"
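
A quick way to surface the failing condition outside the test harness (a sketch; the HPA name "hello-hpa" comes from the reproduction steps below):

# oc get events -w | grep FailedGetScale
# oc describe hpa hello-hpa | grep -A 2 AbleToScale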



Version-Release number of selected component (if applicable):
# openshift version
openshift v3.7.36
kubernetes v1.7.6+a08f5eeb62
etcd 3.2.8


How reproducible:
always

Steps to Reproduce:
1.
    Given I have a project
    When I run the :create client command with:
      | f | https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/infrastructure/hpa/dc-hello-hpa.yaml |
    Then the step should succeed
    Given 2 pods become ready with labels:
      | run=hello-hpa |
    When I run the :expose client command with:
      | resource      | rc          |
      | resource name | hello-hpa-1 |
      | port          | 8080        |
    Given I wait for the "hello-hpa-1" service to become ready
    When I run the :autoscale client command with:
      | name        | dc/hello-hpa |
      | min         | 2            |
      | max         | 10           |
      | cpu-percent | 50           |
    Then the step should succeed
    Given I wait up to 300 seconds for the steps to pass:
    """
    Then expression should be true> hpa('hello-hpa').min_replicas(cached: false, user: user) == 2
    And expression should be true> hpa.max_replicas == 10
    And expression should be true> hpa.current_cpu_utilization_percentage == 0
    And expression should be true> hpa.target_cpu_utilization_percentage == 50
    And expression should be true> hpa.current_replicas == 2
    """
    When I run the :create client command with:
      | f | https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/infrastructure/hpa/hello-pod.yaml |
    Then the step should succeed
    Given the pod named "hello-pod" status becomes :running within 60 seconds
    When I run the :exec background client command with:
      | pod              | hello-pod                                         |
      | oc_opts_end      |                                                   |
      | exec_command     | sh                                                |
      | exec_command_arg | -c                                                |
      | exec_command_arg | while true;do curl http://<%= service.url %>;done |
    Then the step should succeed
    Given I wait up to 600 seconds for the steps to pass:
    """
    Then expression should be true> hpa('hello-hpa').current_replicas(cached: false, user: user) > 2
    And expression should be true> hpa.current_cpu_utilization_percentage > hpa.target_cpu_utilization_percentage
    """
    Given I ensure "hello-pod" pod is deleted
    Given I wait up to 600 seconds for the steps to pass:
    """
    Then expression should be true> hpa('hello-hpa').current_cpu_utilization_percentage(cached: false, user: user) == 0
    And expression should be true> hpa.current_replicas == 2
    """
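
Outside the test harness, step 1 boils down to roughly these client commands (a sketch; the names come from the fixture above, and the in-namespace service URL hello-hpa-1:8080 is an assumption):

# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/infrastructure/hpa/dc-hello-hpa.yaml
# oc expose rc hello-hpa-1 --port=8080
# oc autoscale dc/hello-hpa --min=2 --max=10 --cpu-percent=50
# oc create -f https://raw.githubusercontent.com/openshift-qe/v3-testfiles/master/infrastructure/hpa/hello-pod.yaml
# oc exec hello-pod -- sh -c 'while true; do curl http://hello-hpa-1:8080; done'    # drive CPU load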

Actual results:
# oc describe hpa
Name:							hello-hpa
Namespace:						2np66
Labels:							<none>
Annotations:						<none>
CreationTimestamp:					Mon, 05 Mar 2018 04:14:22 -0500
Reference:						DeploymentConfig/hello-hpa
Metrics:						( current / target )
  resource cpu on pods  (as a percentage of request):	<unknown> / 50%
Min replicas:						2
Max replicas:						10
Conditions:
  Type		Status	Reason		Message
  ----		------	------		-------
  AbleToScale	False	FailedGetScale	the HPA controller was unable to get the target's current scale: no kind "Scale" is registered for version "extensions/v1beta1"
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason		Message
  ---------	--------	-----	----				-------------	--------	------		-------
  34m		4m		61	horizontal-pod-autoscaler			Warning		FailedGetScale	no kind "Scale" is registered for version "extensions/v1beta1"

Expected results:
The HPA can read the DeploymentConfig's scale subresource, report current CPU utilization, and scale the DC between the configured minimum (2) and maximum (10) replicas.

Additional info:
Autoscaling a replicationcontroller works fine; only DeploymentConfig targets hit this error.
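
For contrast, a control experiment against the replicationcontroller (a sketch; "hello-hpa-1" is assumed to be the RC created by the first deployment of the DC above):

# oc autoscale rc/hello-hpa-1 --min=2 --max=10 --cpu-percent=50
# oc describe hpa hello-hpa-1    # AbleToScale stays True here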

Comment 1 DeShuai Ma 2018-03-06 08:26:52 UTC
This will block the 3.7 errata release; setting priority/severity to high/high.

Comment 2 Solly Ross 2018-03-06 15:59:11 UTC
can you please post the output of `oc get hpa -o yaml`?

Comment 3 weiwei jiang 2018-03-13 08:10:07 UTC
# oc get hpa hello-hpa -n f0i86 -o yaml 
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  annotations:
    autoscaling.alpha.kubernetes.io/conditions: '[{"type":"AbleToScale","status":"False","lastTransitionTime":"2018-03-13T08:01:22Z","reason":"FailedGetScale","message":"the
      HPA controller was unable to get the target''s current scale: no kind \"Scale\"
      is registered for version \"extensions/v1beta1\""}]'
  creationTimestamp: 2018-03-13T08:00:52Z
  name: hello-hpa
  namespace: f0i86
  resourceVersion: "6170"
  selfLink: /apis/autoscaling/v1/namespaces/f0i86/horizontalpodautoscalers/hello-hpa
  uid: a6433671-2694-11e8-8eef-0ed7d6f1df92
spec:
  maxReplicas: 10
  minReplicas: 2
  scaleTargetRef:
    apiVersion: v1
    kind: DeploymentConfig
    name: hello-hpa
  targetCPUUtilizationPercentage: 50
status:
  currentReplicas: 0
  desiredReplicas: 0

# oc version 
oc v3.7.38
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://ip-172-18-13-10.ec2.internal:8443
openshift v3.7.38
kubernetes v1.7.6+a08f5eeb62
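
The scaleTargetRef above is what the HPA controller resolves to a scale subresource. One way to query that subresource directly and inspect the Scale object the server hands back (a sketch; assumes the legacy oapi path for deploymentconfigs in 3.7 and a client that supports `oc get --raw`):

# oc get --raw /oapi/v1/namespaces/f0i86/deploymentconfigs/hello-hpa/scale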

Comment 6 Nicolas Nosenzo 2018-04-19 07:39:28 UTC
Hit the same issue on v3.7.42, although it was not reproducible with v3.7.23

#### FAILED ####
Test done with version:
oc v3.7.42
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO


Apr 17 15:20:21 master.node atomic-openshift-master-controllers[2177]: I0417 15:20:21.163461    2177 horizontal.go:598] Successfully updated status for test-demo-recharge
Apr 17 15:20:21 master.node atomic-openshift-master-controllers[2177]: E0417 15:20:21.163544    2177 horizontal.go:206] failed to query scale subresource for DeploymentConfig/prod-test/test-demo-recharge: no kind "Scale" is registered for version "extensions/v1beta1"
Apr 17 15:20:51 master.node atomic-openshift-master-controllers[2177]: E0417 15:20:51.190854    2177 horizontal.go:206] failed to query scale subresource for DeploymentConfig/prod-test/test-demo-recharge: no kind "Scale" is registered for version "extensions/v1beta1"
Apr 17 15:21:21 master.node atomic-openshift-master-controllers[2177]: E0417 15:21:21.219391    2177 horizontal.go:206] failed to query scale subresource for DeploymentConfig/prod-test/test-demo-recharge: no kind "Scale" is registered for version "extensions/v1beta1"
Apr 17 15:21:51 master.node atomic-openshift-master-controllers[2177]: E0417 15:21:51.242943    2177 horizontal.go:206] failed to query scale subresource for DeploymentConfig/prod-test/test-demo-recharge: no kind "Scale" is registered for version "extensions/v1beta1"
Apr 17 15:22:21 master.node atomic-openshift-master-controllers[2177]: E0417 15:22:21.254977    2177 horizontal.go:206] failed to query scale subresource for DeploymentConfig/prod-test/test-demo-recharge: no kind "Scale" is registered for version "extensions/v1beta1"
Apr 17 15:22:51 master.node atomic-openshift-master-controllers[2177]: I0417 15:22:51.255282    2177 horizontal.go:352] Horizontal Pod Autoscaler has been deleted prod-test/test-demo-recharge
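
The lines above are from the controllers journal on a master; to collect them (using the unit name shown in the log prefix):

# journalctl -u atomic-openshift-master-controllers | grep 'failed to query scale subresource'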



#### SUCCEED ####

Test done with version:
openshift v3.7.23
kubernetes v1.7.6+a08f5eeb62

$ oc describe hpa cakephp-mysql-example
Name:							cakephp-mysql-example
Namespace:						hello
Labels:							<none>
Annotations:						<none>
CreationTimestamp:					Wed, 18 Apr 2018 10:11:00 -0400
Reference:						DeploymentConfig/cakephp-mysql-example
Metrics:						( current / target )
  resource cpu on pods  (as a percentage of request):	1% (1m) / 50%
Min replicas:						1
Max replicas:						4
Conditions:
  Type			Status	Reason			Message
  ----			------	------			-------
  AbleToScale		True	ReadyForNewScale	the last scale time was sufficiently old as to warrant a new scale
  ScalingActive		True	ValidMetricFound	the HPA was able to succesfully calculate a replica count from cpu resource utilization (percentage of request)
  ScalingLimited	False	DesiredWithinRange	the desired replica count is within the acceptible range
Events:
  FirstSeen	LastSeen	Count	From				SubObjectPath	Type		Reason				Message
  ---------	--------	-----	----				-------------	--------	------				-------
   14m		14m		1	horizontal-pod-autoscaler			Normal		SuccessfulRescale		New size: 3; reason: cpu resource utilization (percentage of request) above target
  8m		8m		1	horizontal-pod-autoscaler			Normal		SuccessfulRescale		New size: 1; reason: All metrics below target

Comment 26 weiwei jiang 2018-05-09 05:48:06 UTC
Checked with 
# oc version 
oc v3.7.46
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://xxx.xxx.xxx.xxx:8443
openshift v3.7.46
kubernetes v1.7.6+a08f5eeb62

The issue can no longer be reproduced.

Comment 33 errata-xmlrpc 2018-05-18 03:54:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1576

