Bug 2064837
| Summary: | cluster-cloud-controller-manager not able to start and into crashloop-backoff during cluster upgrade from OCP 4.8.x to OCP 4.9.21 halted due to Operator " | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Nirupma Kashyap <nkashyap> |
| Component: | Cloud Compute | Assignee: | dmoiseev |
| Cloud Compute sub component: | Cloud Controller Manager | QA Contact: | Milind Yadav <miyadav> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | medium | | |
| Priority: | medium | CC: | dmoiseev, wking |
| Version: | 4.9 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.9.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-05-18 13:20:29 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 2037680 | | |
| Bug Blocks: | | | |
Description
Nirupma Kashyap
2022-03-16 17:37:12 UTC
Hi team,

Can we have an update on when this fix will be backported to 4.9?

Regards,
Nirupma

This is in the queue for QE to test; they should get to it soon.

Upgraded cluster from 4.8.39 to 4.9.0-0.nightly-2022-05-11-100812
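(For reference, an upgrade to a specific nightly payload is typically triggered with something along these lines — a sketch only; the registry path is the usual location for CI nightlies and is an assumption, not taken from this run:

oc adm upgrade --to-image=registry.ci.openshift.org/ocp/release:4.9.0-0.nightly-2022-05-11-100812 --allow-explicit-upgrade --force

--allow-explicit-upgrade is needed because a nightly payload is not in the configured update graph, and --force because nightly payloads are unsigned.)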
...
05-12 12:40:35.091 clusteroperators:
05-12 12:40:35.091 NAME VERSION AVAILABLE PROGRESSING DEGRADED SINCE MESSAGE
05-12 12:40:35.091 authentication 4.9.0-0.nightly-2022-05-11-100812 True False False 39m
05-12 12:40:35.091 baremetal 4.9.0-0.nightly-2022-05-11-100812 True False False 126m
05-12 12:40:35.091 cloud-controller-manager 4.9.0-0.nightly-2022-05-11-100812 True False False 60m
05-12 12:40:35.091 cloud-credential 4.9.0-0.nightly-2022-05-11-100812 True False False 135m
05-12 12:40:35.091 cluster-autoscaler 4.9.0-0.nightly-2022-05-11-100812 True False False 126m
05-12 12:40:35.091 config-operator 4.9.0-0.nightly-2022-05-11-100812 True False False 127m
05-12 12:40:35.091 console 4.9.0-0.nightly-2022-05-11-100812 True False False 38m
05-12 12:40:35.091 csi-snapshot-controller 4.9.0-0.nightly-2022-05-11-100812 True False False 126m
05-12 12:40:35.091 dns 4.9.0-0.nightly-2022-05-11-100812 True False False 126m
05-12 12:40:35.091 etcd 4.9.0-0.nightly-2022-05-11-100812 True False False 126m
05-12 12:40:35.091 image-registry 4.9.0-0.nightly-2022-05-11-100812 True False False 120m
05-12 12:40:35.091 ingress 4.9.0-0.nightly-2022-05-11-100812 True False False 119m
05-12 12:40:35.091 insights 4.9.0-0.nightly-2022-05-11-100812 True False False 120m
05-12 12:40:35.091 kube-apiserver 4.9.0-0.nightly-2022-05-11-100812 True False False 124m
05-12 12:40:35.091 kube-controller-manager 4.9.0-0.nightly-2022-05-11-100812 True False False 124m
...
No backoff error:
oc get pod/cluster-cloud-controller-manager-operator-65b77dc777-nkqcs -n openshift-cloud-controller-manager-operator -o yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: "2022-05-12T07:27:26Z"
  generateName: cluster-cloud-controller-manager-operator-65b77dc777-
  labels:
    k8s-app: cloud-manager-operator
    pod-template-hash: 65b77dc777
  name: cluster-cloud-controller-manager-operator-65b77dc777-nkqcs
  namespace: openshift-cloud-controller-manager-operator
  ownerReferences:
  - apiVersion: apps/v1
    blockOwnerDeletion: true
    controller: true
    kind: ReplicaSet
    name: cluster-cloud-controller-manager-operator-65b77dc777
    uid: 0ff1a37f-f2b1-49c1-9632-2fbb5f88f23b
  resourceVersion: "105115"
  uid: 624ce15c-2d1a-4d4b-b7b6-915f73c7dd89
spec:
  containers:
  - command:
    - /bin/bash
    - -c
    - |
      #!/bin/bash
      set -o allexport
      if [[ -f /etc/kubernetes/apiserver-url.env ]]; then
        source /etc/kubernetes/apiserver-url.env
      else
        URL_ONLY_KUBECONFIG=/etc/kubernetes/kubeconfig
      fi
      exec /cluster-controller-manager-operator \
        --leader-elect=true \
        --leader-elect-lease-duration=137s \
        --leader-elect-renew-deadline=107s \
        --leader-elect-retry-period=26s \
        --leader-elect-resource-namespace=openshift-cloud-controller-manager-operator \
        "--images-json=/etc/cloud-controller-manager-config/images.json" \
        --metrics-bind-address=:9258 \
        --health-addr=127.0.0.1:9259
    env:
    - name: RELEASE_VERSION
      value: 4.9.0-0.nightly-2022-05-11-100812
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:375606eb429ffe7ef890295bf55c5122c300ad3879577629827dd9ddbdc191a9
    imagePullPolicy: IfNotPresent
    name: cluster-cloud-controller-manager
    ports:
    - containerPort: 9258
      hostPort: 9258
      name: metrics
      protocol: TCP
    - containerPort: 9259
      hostPort: 9259
      name: healthz
      protocol: TCP
    resources:
      requests:
        cpu: 10m
        memory: 50Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/cloud-controller-manager-config/
      name: images
    - mountPath: /etc/kubernetes
      name: host-etc-kube
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-9j79z
      readOnly: true
  - command:
    - /bin/bash
    - -c
    - |
      #!/bin/bash
      set -o allexport
      if [[ -f /etc/kubernetes/apiserver-url.env ]]; then
        source /etc/kubernetes/apiserver-url.env
      else
        URL_ONLY_KUBECONFIG=/etc/kubernetes/kubeconfig
      fi
      exec /config-sync-controllers \
        --leader-elect=true \
        --leader-elect-lease-duration=137s \
        --leader-elect-renew-deadline=107s \
        --leader-elect-retry-period=26s \
        --leader-elect-resource-namespace=openshift-cloud-controller-manager-operator \
        --health-addr=127.0.0.1:9260
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:375606eb429ffe7ef890295bf55c5122c300ad3879577629827dd9ddbdc191a9
    imagePullPolicy: IfNotPresent
    name: config-sync-controllers
    ports:
    - containerPort: 9260
      hostPort: 9260
      name: healthz
      protocol: TCP
    resources:
      requests:
        cpu: 10m
        memory: 25Mi
    terminationMessagePath: /dev/termination-log
    terminationMessagePolicy: File
    volumeMounts:
    - mountPath: /etc/kubernetes
      name: host-etc-kube
      readOnly: true
    - mountPath: /var/run/secrets/kubernetes.io/serviceaccount
      name: kube-api-access-9j79z
      readOnly: true
  dnsPolicy: ClusterFirst
  enableServiceLinks: true
  hostNetwork: true
  imagePullSecrets:
  - name: cluster-cloud-controller-manager-dockercfg-ms79t
  nodeName: ip-10-0-52-239.us-east-2.compute.internal
  nodeSelector:
    node-role.kubernetes.io/master: ""
  preemptionPolicy: PreemptLowerPriority
  priority: 2000001000
  priorityClassName: system-node-critical
  restartPolicy: Always
  schedulerName: default-scheduler
  securityContext: {}
  serviceAccount: cluster-cloud-controller-manager
  serviceAccountName: cluster-cloud-controller-manager
  terminationGracePeriodSeconds: 30
  tolerations:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
    operator: Exists
  - effect: NoExecute
    key: node.kubernetes.io/unreachable
    operator: Exists
    tolerationSeconds: 120
  - effect: NoExecute
    key: node.kubernetes.io/not-ready
    operator: Exists
    tolerationSeconds: 120
  - effect: NoSchedule
    key: node.cloudprovider.kubernetes.io/uninitialized
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/not-ready
    operator: Exists
  - effect: NoSchedule
    key: node.kubernetes.io/memory-pressure
    operator: Exists
  volumes:
  - configMap:
      defaultMode: 420
      name: cloud-controller-manager-images
    name: images
  - hostPath:
      path: /etc/kubernetes
      type: Directory
    name: host-etc-kube
  - name: kube-api-access-9j79z
    projected:
      defaultMode: 420
      sources:
      - serviceAccountToken:
          expirationSeconds: 3607
          path: token
      - configMap:
          items:
          - key: ca.crt
            path: ca.crt
          name: kube-root-ca.crt
      - downwardAPI:
          items:
          - fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
            path: namespace
      - configMap:
          items:
          - key: service-ca.crt
            path: service-ca.crt
          name: openshift-service-ca.crt
status:
  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2022-05-12T07:27:26Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2022-05-12T07:27:28Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2022-05-12T07:27:28Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2022-05-12T07:27:26Z"
    status: "True"
    type: PodScheduled
  containerStatuses:
  - containerID: cri-o://8934c13b3c99a255ef7bef9bd3a1f91b1efc07bbdefcf030c899994d6575d307
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:375606eb429ffe7ef890295bf55c5122c300ad3879577629827dd9ddbdc191a9
    imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:375606eb429ffe7ef890295bf55c5122c300ad3879577629827dd9ddbdc191a9
    lastState: {}
    name: cluster-cloud-controller-manager
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-05-12T07:27:27Z"
  - containerID: cri-o://9dfc15a88d27619ef47aba70ea06f15e4d16f288da09e873b1f36ce5eae8f845
    image: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:375606eb429ffe7ef890295bf55c5122c300ad3879577629827dd9ddbdc191a9
    imageID: quay.io/openshift-release-dev/ocp-v4.0-art-dev@sha256:375606eb429ffe7ef890295bf55c5122c300ad3879577629827dd9ddbdc191a9
    lastState: {}
    name: config-sync-controllers
    ready: true
    restartCount: 0
    started: true
    state:
      running:
        startedAt: "2022-05-12T07:27:27Z"
  hostIP: 10.0.52.239
  phase: Running
  podIP: 10.0.52.239
  podIPs:
  - ip: 10.0.52.239
  qosClass: Burstable
  startTime: "2022-05-12T07:27:26Z"
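As a quick cross-check, restart counts for every container in the namespace can be listed in one pass (a sketch; the jsonpath expression is illustrative, not part of the original verification run):

oc get pods -n openshift-cloud-controller-manager-operator -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.containerStatuses[*].restartCount}{"\n"}{end}'

Both containers above report restartCount: 0, consistent with the pod never entering CrashLoopBackOff.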
Moving to verified based on these results.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.9.33 bug fix update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:2206