Description of problem:
Upgrading a cluster from 4.4 to 4.5 fails: the kube-controller-manager cluster operator goes degraded, and the kube-controller-manager containers crash-loop because /etc/kubernetes/manifests/recycler-pod.yaml does not exist.

Version-Release number of selected component (if applicable):
From 4.4.0-0.nightly-2020-08-19-234404/4.4.18 to 4.5.0-0.nightly-2020-08-20-031121

How reproducible:
Always

Steps to Reproduce:
1. Set up a cluster with 4.4.0-0.nightly-2020-08-19-234404
2. Upgrade the cluster to 4.5.0-0.nightly-2020-08-20-031121

Actual results:
The upgrade fails.

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-08-19-234404   True        True          137m    Unable to apply 4.5.0-0.nightly-2020-08-20-031121: the cluster operator kube-controller-manager is degraded

# oc get po -n openshift-kube-controller-manager
kube-controller-manager-ip-10-0-147-190.us-east-2.compute.internal   3/4   CrashLoopBackOff   28   132m
kube-controller-manager-ip-10-0-187-191.us-east-2.compute.internal   3/4   CrashLoopBackOff   29   133m
kube-controller-manager-ip-10-0-199-221.us-east-2.compute.internal   3/4   CrashLoopBackOff   27   133m

# oc -n openshift-kube-controller-manager logs kube-controller-manager-ip-10-0-147-190.us-east-2.compute.internal -c kube-controller-manager
F0820 10:47:25.473656 1 plugins.go:123] Could not create hostpath recycler pod from file /etc/kubernetes/manifests/recycler-pod.yaml: failed to read file path /etc/kubernetes/manifests/recycler-pod.yaml: open /etc/kubernetes/manifests/recycler-pod.yaml: no such file or directory

Expected results:
The upgrade succeeds.

Additional info:
The e2e CI jobs fail with the same error:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1296284570672435200
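To confirm the manifest is actually missing on an affected control-plane node, a quick check along these lines should work (a sketch, assuming oc debug access to the node; the node name is taken from the pod list above, substitute your own):

# oc debug node/ip-10-0-147-190.us-east-2.compute.internal -- chroot /host ls -l /etc/kubernetes/manifests/recycler-pod.yaml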
Tomas, you're looking into a similar upgrade problem; can you check whether this is related?
Possible duplicate: https://bugzilla.redhat.com/show_bug.cgi?id=1861102
Tail of one of those pods [1]:

E0820 04:03:22.279582 1 reflector.go:382] runtime/asm_amd64.s:1357: Failed to watch *v1.BuildConfig: Get https://localhost:6443/apis/build.openshift.io/v1/buildconfigs?allowWatchBookmarks=true&resourceVersion=27623&timeout=9m7s&timeoutSeconds=547&watch=true: dial tcp [::1]:6443: connect: connection refused
I0820 04:03:22.915192 1 leaderelection.go:277] failed to renew lease openshift-kube-controller-manager/cluster-policy-controller: timed out waiting for the condition
F0820 04:03:22.915235 1 policy_controller.go:94] leaderelection lost
I0820 04:03:22.921145 1 clusterquotamapping.go:142] Shutting down ClusterQuotaMappingController controller

Somewhat recently in a similar space is bug 1842002, although that was verified for 4.6 and I don't see anything about backporting changes to 4.5 or earlier.

[1]: https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1296284570672435200/artifacts/e2e-aws-upgrade/pods/openshift-kube-controller-manager_kube-controller-manager-ip-10-0-169-171.us-west-2.compute.internal_cluster-policy-controller_previous.log
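For anyone reproducing this on a live cluster rather than from the CI artifacts, the same previous-container log can be pulled with something like this (pod name taken from the artifact path above):

# oc -n openshift-kube-controller-manager logs kube-controller-manager-ip-10-0-169-171.us-west-2.compute.internal -c cluster-policy-controller --previous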
/etc/kubernetes/manifests/recycler-pod.yaml isn't related to the upgrade issue I am looking at. The recycler pod suggests storage, so I'm sending this to that team to investigate.
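For whoever picks this up: the hostpath recycler template path is the one kube-controller-manager receives via its --pv-recycler-pod-template-filepath-hostpath flag, so one way to see what the operator actually rendered (a sketch; the grep is just a convenience filter, adjust the pod name to your cluster) is:

# oc -n openshift-kube-controller-manager get pod kube-controller-manager-ip-10-0-147-190.us-east-2.compute.internal -o yaml | grep -i recycler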
This PR (for another ticket) should fix it: https://github.com/openshift/machine-config-operator/pull/2004
POST with no attached PR doesn't make sense to me. Marking this bug as a dup, with a comment on the associated bug about testing 4.4 -> 4.5 updates [1].

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1869962#c2

*** This bug has been marked as a duplicate of bug 1869962 ***
*** Bug 1872398 has been marked as a duplicate of this bug. ***
Removing UpgradeBlocker from this older bug, to remove it from the suspect queue described in [1]. If you feel this bug still needs to be a suspect, please add the keyword again.

[1]: https://github.com/openshift/enhancements/pull/475