Bug 1779640
| Summary: | Cluster-autoscaler stuck on update, doesn't report status | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Vadim Rutkovsky <vrutkovs> | |
| Component: | Cloud Compute | Assignee: | Alberto <agarcial> | |
| Cloud Compute sub component: | Other Providers | QA Contact: | Jianwei Hou <jhou> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | unspecified | |||
| Priority: | unspecified | CC: | brad.ison, vlaad, wking | |
| Version: | 4.4 | |||
| Target Milestone: | --- | |||
| Target Release: | 4.4.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | No Doc Update | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1779741 1779743 1779745 (view as bug list) | Environment: | ||
| Last Closed: | 2020-05-15 15:45:24 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1779741 | |||
Description
Vadim Rutkovsky
2019-12-04 12:27:06 UTC
The underlying issue here is that etcd was under load and taking multiple seconds to sync its log, which was causing leader elections, and I think some API writes to fail. In addition, the cluster-autoscaler-operator was not reporting failures to apply updates to its ClusterOperator resource, and worse, was not retrying when it failed to apply an "Available" status. So the CVO was unaware of its success. The linked PR fixes that, and I'll make sure it's backported to previous releases.

> The underlying issue here is that etcd was under load and taking multiple seconds to sync its log, which was causing leader elections, and I think some API writes to fail.

General tracker for this portion is bug 1775878.

Verified in 4.4.0-0.nightly-2019-12-19-223334.
oc get co cluster-autoscaler -o yaml
apiVersion: config.openshift.io/v1
kind: ClusterOperator
metadata:
  creationTimestamp: "2019-12-20T03:11:49Z"
  generation: 1
  name: cluster-autoscaler
  resourceVersion: "9771"
  selfLink: /apis/config.openshift.io/v1/clusteroperators/cluster-autoscaler
  uid: 99dba483-4ca7-4f50-af40-6ceeddfd0143
spec: {}
status:
  conditions:
  - lastTransitionTime: "2019-12-20T03:11:49Z"
    message: at version 4.4.0-0.nightly-2019-12-19-223334
    status: "True"
    type: Available
  - lastTransitionTime: "2019-12-20T03:11:49Z"
    status: "False"
    type: Progressing
  - lastTransitionTime: "2019-12-20T03:11:49Z"
    status: "False"
    type: Degraded
  - lastTransitionTime: "2019-12-20T03:11:49Z"
    status: "True"
    type: Upgradeable
  extension: null
  relatedObjects:
  - group: machine.openshift.io
    name: ""
    namespace: openshift-machine-api
    resource: machineautoscalers
  - group: machine.openshift.io
    name: ""
    namespace: openshift-machine-api
    resource: clusterautoscalers
  - group: ""
    name: openshift-machine-api
    resource: namespaces
  versions:
  - name: operator
    version: 4.4.0-0.nightly-2019-12-19-223334
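
For readers who want a concrete picture of the retry gap described in this bug, here is a minimal Python sketch of the general pattern: keep retrying a failed status write with backoff, and surface the error if every attempt fails. It is only an illustration and not the code from the linked PR. The names `apply_status_with_retry`, `apply`, `attempts`, `base_delay`, and `sleep` are all hypothetical, and the retry count and delays are arbitrary assumptions.

```python
import time
from typing import Callable


def apply_status_with_retry(
    apply: Callable[[], None],
    attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call apply() until it succeeds and return the number of attempts used.

    `apply` is a hypothetical stand-in for whatever writes the "Available"
    condition to the ClusterOperator resource. If every attempt fails, the
    last error is re-raised inside a RuntimeError so the caller can report
    it instead of silently dropping it.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            apply()
            return attempt
        except Exception as err:  # a real operator would match API errors only
            last_error = err
            if attempt < attempts:
                # Exponential backoff: base_delay, 2x, 4x, ...
                sleep(base_delay * 2 ** (attempt - 1))
    raise RuntimeError(
        f"status not applied after {attempts} attempts"
    ) from last_error
```

With a write that fails transiently (for example while etcd is electing a leader), the call returns once a later attempt succeeds. With a write that never succeeds, it raises, which gives the caller a failure to report.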