Bug 1539876

Summary: [backport] oc status reporting error with hpa attempting to scale an object that doesn't exist when the object, in fact, exists and the HPA is working as intended
Product: OKD Reporter: Miranda Shutt <Miranda_Shutt>
Component: oc Assignee: Juan Vallejo <jvallejo>
Status: CLOSED CURRENTRELEASE QA Contact: Xingxing Xia <xxia>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 3.x CC: aos-bugs, david_hocky, mfojtik, mmccomas, xxia
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-03-20 21:07:41 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Miranda Shutt 2018-01-29 19:07:20 UTC
Description of problem:

'oc status' reporting error with hpa attempting to scale an object that doesn't exist when the object, in fact, exists and the HPA is working as intended

Version-Release number of selected component (if applicable):

oc v3.7.0+7ed6862
kubernetes v1.7.6+a08f5eeb62
features: Basic-Auth GSSAPI Kerberos SPNEGO

How reproducible:

Always

Steps to Reproduce:
1. Create a new project
2. Import a "Deployment" using 'oc create -f'
3. Import an HPA using 'oc create -f' which specifies the following within the scaleTargetRef section:

Kind: Deployment
Name: ${DEPLOYMENT_NAME_ABOVE}

4. Give system several minutes to pull metrics and determine the status of the HPA
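
The steps above can be sketched as a small shell function. This is only a sketch under assumptions: the project name (hpa-repro), the file names (deployment.yaml, hpa.yaml) and the wait time are hypothetical placeholders, not values from this report.

```shell
# Sketch of the reproduction steps. All default values are hypothetical.
reproduce_hpa_status_warning() {
  project="${1:-hpa-repro}"                # hypothetical project name
  deployment_file="${2:-deployment.yaml}"  # file containing the Deployment
  hpa_file="${3:-hpa.yaml}"                # file containing the HPA
  wait_seconds="${4:-180}"                 # arbitrary wait for metrics

  # Step 1: create a new project (this also switches to it).
  oc new-project "$project" || return 1
  # Step 2: import the Deployment.
  oc create -f "$deployment_file" || return 1
  # Step 3: import the HPA whose scaleTargetRef points at the Deployment.
  oc create -f "$hpa_file" || return 1
  # Step 4: give the system time to pull metrics, then check status.
  sleep "$wait_seconds"
  oc status
}
```

It could be called as 'reproduce_hpa_status_warning myproject deployment.yaml hpa.yaml 300'; on an affected client the final 'oc status' output would be expected to contain the line shown under "Actual results".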

Actual results:

oc status output includes:

* hpa/${HPANAME} is attempting to scale Deployment/${DEPLOYMENT_NAME}, which doesn't exist 

However, 'oc describe hpa/${HPANAME}' indicates that the HPA is working and obtaining metrics for the pods described by the Deployment ${DEPLOYMENT_NAME}

Expected results:

No errors indicated by 'oc status'

Additional info:

See https://github.com/openshift/origin/issues/18334

Comment 1 Miranda Shutt 2018-01-29 20:12:56 UTC
Can replicate easily in 'oc cluster up' on a mac as well:

apiVersion: v1
items:
- apiVersion: autoscaling/v1
  kind: HorizontalPodAutoscaler
  metadata:
    name: caasdemo
  spec:
    maxReplicas: 20
    minReplicas: 3
    scaleTargetRef:
      apiVersion: extensions/v1beta1
      kind: Deployment
      name: caasdemo
    targetCPUUtilizationPercentage: 1
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

and

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  labels:
    app: caasdemo
  name: caasdemo
spec:
  replicas: 3
  strategy:
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: caasdemo
    spec:
      containers:
        - name: caasdemo
          image: ${DOCKER_IMAGE_TAG}
          imagePullPolicy: Always
          ports:
            - name: http
              containerPort: 8080
              protocol: TCP
          readinessProbe:
            httpGet:
              path: /health
              port: http
          livenessProbe:
            initialDelaySeconds: 2
            periodSeconds: 5
            httpGet:
              path: /
              port: http
          resources:
            requests:
              cpu: "0.50"
      terminationGracePeriodSeconds: 10

Comment 2 Juan Vallejo 2018-01-30 05:31:33 UTC
This was patched recently against master: https://github.com/openshift/origin/pull/18150

Backporting to 3.8 here: https://github.com/openshift/origin/pull/18344
Backporting to 3.7 once 3.8 PR merges.

Comment 4 Miranda Shutt 2018-01-30 14:44:01 UTC
Indeed, that's the problem!  I can also see why searching GitHub didn't yield any results, since the relevant change is in the underlying code, where the deployments weren't added to the HPA ScaleRef graph.

Regards,
--Miranda

Comment 5 Xingxing Xia 2018-01-31 09:16:14 UTC
The latest available 3.7 version, v3.7.27, does not include the backport. Waiting for a new 3.7 puddle that includes it.
BTW, I checked with the 3.9 oc (the version since https://bugzilla.redhat.com/show_bug.cgi?id=1534956#c2) against a 3.7 server, and it works without the reported error.

Comment 6 Juan Vallejo 2018-01-31 15:37:40 UTC
Backporting to 3.7 here: https://github.com/openshift/origin/pull/18371

Comment 8 Xingxing Xia 2018-02-02 06:31:32 UTC
Michal, yeah, I am not waiting for a new 3.8 puddle either. In comment 5, I am waiting for a new 3.7 puddle before accepting, because comment 3 says to backport to 3.7.
From this perspective, this bug has not been moved to VERIFIED yet. This bug is not for tracking 3.9, which is already tracked and verified in the bug linked in comment 5.

Comment 10 Xingxing Xia 2018-02-06 11:59:26 UTC
New 3.7 version v3.7.28 is built, in which the fix works well:
$ v3.7.27/oc status
...
  * hpa/hello-openshift is attempting to scale Deployment/hello-openshift, which doesn't exist
...

$ v3.7.28/oc status # Output has no above error, so moving bug to VERIFIED
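
The comparison above could be scripted roughly as follows. This is a sketch, not the procedure QA actually used: the function name is hypothetical, and the pattern only matches the wording of the message quoted in this bug. Note that the same message is legitimate when the scale target really is missing, so a match is only meaningful when the Deployment is known to exist.

```shell
# Returns 0 if `<oc binary> status` prints the "which doesn't exist"
# HPA warning, and 1 otherwise. The function name is hypothetical.
status_has_hpa_warning() {
  oc_bin="$1"   # path to the oc client to test, e.g. v3.7.27/oc
  "$oc_bin" status 2>&1 |
    grep -q "is attempting to scale .*, which doesn't exist"
}
```

For example, 'status_has_hpa_warning v3.7.27/oc && echo "warning present"' would print the message with the older client, while the same call with v3.7.28/oc would be expected to print nothing.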