Description of problem:
Upgrading a cluster from 4.4 to 4.5 fails: the kube-controller-manager cluster operator goes degraded, and the kube-controller-manager containers crash-loop because /etc/kubernetes/manifests/recycler-pod.yaml does not exist.

Version-Release number of selected component (if applicable):
From 4.4.0-0.nightly-2020-08-19-234404/4.4.18 to 4.5.0-0.nightly-2020-08-20-031121

How reproducible:
Always

Steps to Reproduce:
1. Set up a cluster with 4.4.0-0.nightly-2020-08-19-234404
2. Upgrade the cluster to 4.5.0-0.nightly-2020-08-20-031121

Actual results:
The upgrade fails.

# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.4.0-0.nightly-2020-08-19-234404   True        True          137m    Unable to apply 4.5.0-0.nightly-2020-08-20-031121: the cluster operator kube-controller-manager is degraded

# oc get po -n openshift-kube-controller-manager
kube-controller-manager-ip-10-0-147-190.us-east-2.compute.internal   3/4   CrashLoopBackOff   28   132m
kube-controller-manager-ip-10-0-187-191.us-east-2.compute.internal   3/4   CrashLoopBackOff   29   133m
kube-controller-manager-ip-10-0-199-221.us-east-2.compute.internal   3/4   CrashLoopBackOff   27   133m

# oc -n openshift-kube-controller-manager logs kube-controller-manager-ip-10-0-147-190.us-east-2.compute.internal -c kube-controller-manager
F0820 10:47:25.473656 1 plugins.go:123] Could not create hostpath recycler pod from file /etc/kubernetes/manifests/recycler-pod.yaml: failed to read file path /etc/kubernetes/manifests/recycler-pod.yaml: open /etc/kubernetes/manifests/recycler-pod.yaml: no such file or directory

Expected results:
The upgrade succeeds.

Additional info:
The e2e CI jobs fail with the same error:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1296284570672435200
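To confirm the manifest is actually missing on an affected control-plane node, a quick check along these lines should work (a sketch, assuming oc debug access to the node; the node name is taken from the pod list above, substitute your own):

# oc debug node/ip-10-0-147-190.us-east-2.compute.internal -- chroot /host ls -l /etc/kubernetes/manifests/recycler-pod.yaml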
Tomas, you're looking into a similar upgrade problem; can you check whether this is related?
Possible duplicate: https://bugzilla.redhat.com/show_bug.cgi?id=1861102
Tail of one of those pods [1]:

E0820 04:03:22.279582 1 reflector.go:382] runtime/asm_amd64.s:1357: Failed to watch *v1.BuildConfig: Get https://localhost:6443/apis/build.openshift.io/v1/buildconfigs?allowWatchBookmarks=true&resourceVersion=27623&timeout=9m7s&timeoutSeconds=547&watch=true: dial tcp [::1]:6443: connect: connection refused
I0820 04:03:22.915192 1 leaderelection.go:277] failed to renew lease openshift-kube-controller-manager/cluster-policy-controller: timed out waiting for the condition
F0820 04:03:22.915235 1 policy_controller.go:94] leaderelection lost
I0820 04:03:22.921145 1 clusterquotamapping.go:142] Shutting down ClusterQuotaMappingController controller

Somewhat recently in a similar space is bug 1842002, although that was verified for 4.6 and I don't see anything about backporting changes to 4.5 or earlier.

[1]: https://storage.googleapis.com/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/1296284570672435200/artifacts/e2e-aws-upgrade/pods/openshift-kube-controller-manager_kube-controller-manager-ip-10-0-169-171.us-west-2.compute.internal_cluster-policy-controller_previous.log
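For anyone reproducing this on a live cluster rather than from the CI artifacts, the same previous-container log can be pulled with something like this (pod name taken from the artifact path above):

# oc -n openshift-kube-controller-manager logs kube-controller-manager-ip-10-0-169-171.us-west-2.compute.internal -c cluster-policy-controller --previous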
/etc/kubernetes/manifests/recycler-pod.yaml isn't related to the upgrade issue I am looking at. The recycler pod suggests storage, so I'm sending this to that team to investigate.
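For whoever picks this up: the hostpath recycler template path is the one kube-controller-manager receives via its --pv-recycler-pod-template-filepath-hostpath flag, so one way to see what the operator actually rendered (a sketch; the grep is just a convenience filter, adjust the pod name to your cluster) is:

# oc -n openshift-kube-controller-manager get pod kube-controller-manager-ip-10-0-147-190.us-east-2.compute.internal -o yaml | grep -i recycler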
This PR (for another ticket) should fix it: https://github.com/openshift/machine-config-operator/pull/2004
POST with no attached PR doesn't make sense to me. Marking this bug as a dup, with a comment on the associated bug about testing 4.4 -> 4.5 updates [1].

[1]: https://bugzilla.redhat.com/show_bug.cgi?id=1869962#c2

*** This bug has been marked as a duplicate of bug 1869962 ***
*** Bug 1872398 has been marked as a duplicate of this bug. ***
Removing UpgradeBlocker from this older bug, to remove it from the suspect queue described in [1]. If you feel this bug still needs to be a suspect, please add the keyword again.

[1]: https://github.com/openshift/enhancements/pull/475