Bug 1869864
| Summary: | Unable to view deployment config after enabling auto scaling on it | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Vijay Bhadriraju <vbhadrir> |
| Component: | oc | Assignee: | Mike Dame <mdame> |
| Status: | CLOSED DUPLICATE | QA Contact: | zhou ying <yinzhou> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.5 | CC: | alegrand, anpicker, aos-bugs, dslavens, erooth, jokerman, kakkoyun, lcosic, maszulik, mfojtik, mloibl, pkrupa, surbania |
| Target Milestone: | --- | Keywords: | UpcomingSprint |
| Target Release: | 4.6.0 | | |
| Hardware: | s390x | | |
| OS: | Linux | | |
| Whiteboard: | multi-arch | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-09-09 12:48:43 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description
Vijay Bhadriraju
2020-08-18 20:00:54 UTC
It seems like the issue is in the `oc` client or in the autoscaler. Reassigning to the `oc` component for further investigation. Mike, can you look into it?

> /go/src/github.com/openshift/oc/pkg/helpers/describe/deployments.go:346 +0xa42

This is a weird error, because the master and release-4.5 branches of oc don't have anything at that line that could panic: https://github.com/openshift/oc/blob/release-4.5/pkg/helpers/describe/deployments.go#L346 (it's literally just a closing bracket). I also attempted to reproduce this on a 4.5 cluster with the latest CI 4.5 oc and didn't have any issue (repro steps below). Are you using a version of oc built from an older codebase? I see an older commit did have an Fprint() statement there: https://github.com/openshift/oc/blob/f0e69f31d2754f69833d91d6208a0dace672fd19/pkg/helpers/describe/deployments.go#L346

(Repro steps for this):

```
$ ./oc version
Client Version: 4.5.0-0.ci-2020-09-08-123649
Server Version: 4.5.8
Kubernetes Version: v1.18.3+6c42de8

$ cat dc.yaml
kind: "DeploymentConfig"
apiVersion: "v1"
metadata:
  name: "frontend"
spec:
  template:
    metadata:
      labels:
        name: "frontend"
    spec:
      containers:
        - name: "helloworld"
          image: "openshift/origin-ruby-sample"
          ports:
            - containerPort: 8080
              protocol: "TCP"
  replicas: 5
  triggers:
    - type: "ConfigChange"
    - type: "ImageChange"
      imageChangeParams:
        automatic: true
        containerNames:
          - "helloworld"
        from:
          kind: "ImageStreamTag"
          name: "origin-ruby-sample:latest"
  strategy:
    type: "Rolling"
  paused: false
  revisionHistoryLimit: 2
  minReadySeconds: 0

$ ./oc create -f dc.yaml
deploymentconfig.apps.openshift.io/frontend created

$ oc autoscale dc/frontend --min 1 --max 10 --cpu-percent=80
horizontalpodautoscaler.autoscaling/frontend autoscaled

$ ./oc describe dc frontend
Name:               frontend
Namespace:          autoscale
Created:            22 seconds ago
Labels:             <none>
Annotations:        <none>
Latest Version:     Not deployed
Selector:           name=frontend
Replicas:           5
Autoscaling:        between 1 and 10 replicas targeting 80% CPU over all the pods
Triggers:           Config, Image(origin-ruby-sample@latest, auto=true)
Strategy:           Rolling
Template:
Pod Template:
  Labels:           name=frontend
  Containers:
   helloworld:
    Image:          openshift/origin-ruby-sample
    Port:           8080/TCP
    Host Port:      0/TCP
    Environment:    <none>
    Mounts:         <none>
  Volumes:          <none>

Latest Deployment:  <none>

Events:             <none>
```

Mike, the only suspect I can think of is hpa.Spec.MinReplicas in https://github.com/openshift/oc/blob/master/pkg/helpers/describe/deployments.go#L341, which is a pointer that we access without checking whether it is set. I'd fix that for starters, then eventually look into other possible nil elements in that area. (A hedged sketch of such a guard appears at the end of this report.)

Vijay, can you provide us with a full yaml of both the HPA and the deployment?

This bug appears to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1785513 (fixed in https://github.com/openshift/oc/pull/227, 4.4)

*** This bug has been marked as a duplicate of bug 1785513 ***

The OCP cluster where this problem occurred stopped responding, and I was unable to access it via the oc cli or the web console. I had to tear the cluster down and rebuild it, so I do not have the yaml of the HPA or the Deployment. I do not know whether enabling auto scaling in some way cratered my cluster, so I have not tried to re-enable HPA on my rebuilt cluster. Deploying a new OCP cluster using UPI is very tedious, so I have not dared to re-enable HPA for fear of losing the cluster again.

This error would be originating from the oc client, so I don't think it would be dependent on the cluster. Could you provide an `oc version`?
```
root:scripts# oc version
Client Version: 4.3.1
Server Version: 4.5.5
Kubernetes Version: v1.18.3+08c38ef
```

The oc client I had been using is version 4.3.1. I re-downloaded the oc client from the download link that the 4.5.5 cluster points to, and that client reports a different version, as seen below:

```
root:home# oc version
Client Version: openshift-clients-4.5.0-202006231303.p0-4-gb66f2d3a6
Server Version: 4.5.5
Kubernetes Version: v1.18.3+08c38ef
```

After switching to this version of the oc client, I am able to run `oc describe dc/<dcname>` on a deployment config with autoscaling enabled and it returns a valid dc. I think the problem is with using a back-level oc cli tool.

```
root:home# oc describe dc/megaweb
Name:            megaweb
Namespace:       megabank
Created:         3 weeks ago
Labels:          app=megaweb
                 app.kubernetes.io/component=megaweb
                 app.kubernetes.io/instance=megaweb
Annotations:     openshift.io/generated-by=OpenShiftNewApp
Latest Version:  3
Selector:        deploymentconfig=megaweb
Replicas:        0
Autoscaling:     between 1 and 4 replicas targeting 70% CPU over all the pods
Triggers:        Config, Image(megaweb@latest, auto=true)
Strategy:        Rolling
Template:
Pod Template:
  Labels:        deploymentconfig=megaweb
  Annotations:   openshift.io/generated-by: OpenShiftNewApp
  Containers:
   megaweb:
    Image:       docker.io/vbhadrir/mbweb-z@sha256:7e619f30122596f631ef803a41709533436dd672042e6b663741f6f97838a675
    Ports:       9080/TCP, 9443/TCP
    Host Ports:  0/TCP, 0/TCP
    Limits:
      cpu:       1
      memory:    4000Mi
    Requests:
      cpu:       100m
      memory:    1000Mi
    Environment:
      ACCIDHISTORY_SVC_HOST:   $(MBSVC7_SERVICE_HOST):9080
      ACCOUNTS_SVC_HOST:       $(MBSVC8_SERVICE_HOST):9080
      CUSTOMER_SVC_HOST:       $(MBSVC2_SERVICE_HOST):9080
      DEPOSIT_SVC_HOST:        $(MBSVC3_SERVICE_HOST):9080
      HISTORY_SVC_HOST:        $(MBSVC6_SERVICE_HOST):9080
      LOGIN_SVC_HOST:          $(MBSVC1_SERVICE_HOST):9080
      LOGOUT_SVC_HOST:         $(MBSVC9_SERVICE_HOST):9080
      MegaBankComponentClass:  com.ibm.cpo.mb.MegaBankJDBCWS
      TRANSFER_SVC_HOST:       $(MBSVC5_SERVICE_HOST):9080
      WITHDRAW_SVC_HOST:       $(MBSVC4_SERVICE_HOST):9080
    Mounts:      <none>
  Volumes:       <none>

Deployment #3 (latest):
  Name:          megaweb-3
  Created:       2 hours ago
  Status:        Complete
  Replicas:      0 current / 0 desired
  Selector:      deployment=megaweb-3,deploymentconfig=megaweb
  Labels:        app.kubernetes.io/component=megaweb,app.kubernetes.io/instance=megaweb,app=megaweb,openshift.io/deployment-config.name=megaweb
  Pods Status:   0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Deployment #2:
  Created:       2 hours ago
  Status:        Complete
  Replicas:      0 current / 0 desired
Deployment #1:
  Created:       3 weeks ago
  Status:        Complete
  Replicas:      0 current / 0 desired

Events:
  Type    Reason                       Age   From                         Message
  ----    ------                       ----  ----                         -------
  Normal  DeploymentCreated            110m  deploymentconfig-controller  Created new replication controller "megaweb-2" for version 2
  Normal  DeploymentCreated            109m  deploymentconfig-controller  Created new replication controller "megaweb-3" for version 3
  Normal  ReplicationControllerScaled  108m  deploymentconfig-controller  Scaled replication controller "megaweb-3" from 4 to 0
```
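For context on the nil-pointer suspicion discussed in the comments: in the autoscaling/v1 API, `HorizontalPodAutoscalerSpec.MinReplicas` and `TargetCPUUtilizationPercentage` are `*int32`, so a describer that dereferences them unconditionally panics whenever either field is unset. The following is a minimal sketch of the kind of guard suggested above; it is not the actual code in `pkg/helpers/describe/deployments.go`, and `describeAutoscaling` is a hypothetical stand-in for the line that `oc describe dc` prints.

```go
package main

import (
	"fmt"
	"io"
	"os"

	autoscalingv1 "k8s.io/api/autoscaling/v1"
)

// describeAutoscaling is a hypothetical stand-in for the "Autoscaling:" line
// printed by `oc describe dc`. MinReplicas and TargetCPUUtilizationPercentage
// are *int32 in the autoscaling/v1 API, so both are guarded before being
// dereferenced instead of being printed unconditionally (the suspected cause
// of the panic reported in this bug).
func describeAutoscaling(w io.Writer, hpa autoscalingv1.HorizontalPodAutoscaler) {
	min := int32(1) // fall back to the documented default of 1 when MinReplicas is unset
	if hpa.Spec.MinReplicas != nil {
		min = *hpa.Spec.MinReplicas
	}

	target := "<unset>"
	if hpa.Spec.TargetCPUUtilizationPercentage != nil {
		target = fmt.Sprintf("%d%% CPU", *hpa.Spec.TargetCPUUtilizationPercentage)
	}

	fmt.Fprintf(w, "Autoscaling:\tbetween %d and %d replicas targeting %s over all the pods\n",
		min, hpa.Spec.MaxReplicas, target)
}

func main() {
	// An HPA whose MinReplicas and TargetCPUUtilizationPercentage are left nil:
	// without the guards above, dereferencing either field would panic.
	hpa := autoscalingv1.HorizontalPodAutoscaler{
		Spec: autoscalingv1.HorizontalPodAutoscalerSpec{MaxReplicas: 10},
	}
	describeAutoscaling(os.Stdout, hpa)
}
```

This would also be consistent with the version details above: the fix for the duplicate bug landed in 4.4 (openshift/oc#227), so a 4.3.1 client could still hit the unguarded dereference while the 4.5 client does not.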