Bug 1820807

Summary: Run template e2e-aws-upgrade - e2e-aws-upgrade container test failed
Product: OpenShift Container Platform
Component: Etcd
Version: 4.4
Target Release: 4.5.0
Target Milestone: ---
Hardware: All
OS: All
Severity: high
Priority: unspecified
Status: CLOSED DUPLICATE
Reporter: Sebastian Jug <sejug>
Assignee: Alay Patel <alpatel>
QA Contact: ge liu <geliu>
CC: bparees, sbatsche, wking
Environment: test: Overall
Type: Bug
Last Closed: 2020-04-28 16:02:53 UTC

Description Sebastian Jug 2020-04-03 23:22:55 UTC
test: Run template e2e-aws-upgrade - e2e-aws-upgrade container test failed, see job: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/24370

...0-150-139.us-west-1.compute.internal container/olm-operator reason/Restarted
Apr 03 21:07:31.468 E ns/openshift-operator-lifecycle-manager pod/olm-operator-7c77f9f96d-phzgx node/ip-10-0-150-139.us-west-1.compute.internal container/olm-operator container exited with code 1 (Error):
time="2020-04-03T21:07:24Z" level=info msg="log level info"\n
time="2020-04-03T21:07:24Z" level=info msg="TLS keys set, using https for metrics"\nW0403 21:07:24.803627 1 client_config.go:543] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.\n
time="2020-04-03T21:07:24Z" level=info msg="Using in-cluster kube client config"\n
time="2020-04-03T21:07:24Z" level=info msg="Using in-cluster kube client config"\n
time="2020-04-03T21:07:30Z" level=fatal msg="couldn't clean previous release" error="Delete https://172.30.0.1:443/apis/operators.coreos.com/v1alpha1/namespaces/openshift-operator-lifecycle-manager/catalogsources/olm-operators: dial tcp 172.30.0.1:443: connect: no route to host"\n
Apr 03 21:07:38.410 I ns/openshift-etcd-operator deployment/etcd-operator reason/UnhealthyEtcdMember unhealthy members: ip-10-0-150-139.us-west-1.compute.internal,ip-10-0-135-25.us-west-1.compute.internal (164 times) 
Apr 03 21:07:47.532 W ns/openshift-console pod/console-849d7d5747-24frm node/ip-10-0-150-139.us-west-1.compute.internal container/console reason/Restarted 
Apr 03 21:07:56.642 W ns/openshift-kube-controller-manager-operator pod/kube-controller-manager-operator-769fffbd5d-ssp94 node/ip-10-0-150-139.us-west-1.compute.internal reason/BackOff Back-off restarting failed container (262 times) 
Apr 03 21:07:59.132 I test="[sig-arch][Feature:ClusterUpgrade] Cluster should remain functional during upgrade [Disruptive] [Serial] [Suite:openshift]" failed 
Failing tests: [sig-arch][Feature:ClusterUpgrade] Cluster should remain functional during upgrade [Disruptive] [Serial] [Suite:openshift]
Writing JUnit report to /tmp/artifacts/junit/junit_e2e_20200403-210759.xml
error: 1 fail, 0 pass, 0 skip (1h19m11s)
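
The fatal olm-operator message above is a raw TCP failure reaching the cluster's apiserver service VIP (172.30.0.1:443) from the node, which usually points at node-level service routing rather than the API server itself, and is plausible while masters reboot mid-upgrade. Below is a minimal sketch of a probe that reproduces that specific symptom; it assumes it is run from a pod on the affected node, and the address and timeout are illustrative values taken from this job's logs, not part of any OLM code:

    // probe.go: dial the in-cluster apiserver service VIP the way olm-operator's
    // HTTP client ultimately does, to distinguish "connect: no route to host"
    // (node service routing not ready) from TLS- or API-level errors.
    package main

    import (
        "fmt"
        "net"
        "time"
    )

    func main() {
        // 172.30.0.1:443 is the kubernetes service VIP seen in this job's logs.
        conn, err := net.DialTimeout("tcp", "172.30.0.1:443", 3*time.Second)
        if err != nil {
            // An error wrapping "connect: no route to host" matches the
            // olm-operator fatal above.
            fmt.Printf("dial failed: %v\n", err)
            return
        }
        defer conn.Close()
        fmt.Println("TCP connect to the service VIP succeeded")
    }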

Comment 1 W. Trevor King 2020-04-04 02:29:55 UTC
The proximal cause of the "Cluster should remain functional during upgrade" failure was:

  fail [github.com/openshift/origin/test/e2e/upgrade/upgrade.go:135]: during upgrade to registry.svc.ci.openshift.org/ocp/release:4.5.0-0.ci-2020-04-03-190912
  Unexpected error:
      <*errors.errorString | 0xc002818c80>: {
          s: "Cluster did not complete upgrade: timed out waiting for the condition: deployment openshift-operator-lifecycle-manager/olm-operator is not available MinimumReplicasUnavailable: Deployment does not have minimum availability.",
      }
      Cluster did not complete upgrade: timed out waiting for the condition: deployment openshift-operator-lifecycle-manager/olm-operator is not available MinimumReplicasUnavailable: Deployment does not have minimum availability.
  occurred
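
For context, that message is the upgrade test timing out while the Deployment's Available condition stays False. A minimal client-go sketch of an equivalent check follows; it assumes in-cluster credentials and a client-go release new enough to take a context in Get. The namespace and name come from the error above; everything else is illustrative and is not the test's actual code:

    // checkavail.go: report whether the olm-operator Deployment has minimum
    // availability, i.e. whether its Available condition is True.
    package main

    import (
        "context"
        "fmt"
        "log"

        appsv1 "k8s.io/api/apps/v1"
        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/rest"
    )

    func main() {
        cfg, err := rest.InClusterConfig()
        if err != nil {
            log.Fatal(err)
        }
        client, err := kubernetes.NewForConfig(cfg)
        if err != nil {
            log.Fatal(err)
        }
        deploy, err := client.AppsV1().Deployments("openshift-operator-lifecycle-manager").
            Get(context.TODO(), "olm-operator", metav1.GetOptions{})
        if err != nil {
            log.Fatal(err)
        }
        for _, cond := range deploy.Status.Conditions {
            if cond.Type != appsv1.DeploymentAvailable {
                continue
            }
            if cond.Status == corev1.ConditionTrue {
                fmt.Println("olm-operator has minimum availability")
            } else {
                // Matches the failure above: reason MinimumReplicasUnavailable.
                fmt.Printf("not available: reason=%s message=%s\n", cond.Reason, cond.Message)
            }
        }
    }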

Comment 2 Ben Parees 2020-04-28 16:02:53 UTC

*** This bug has been marked as a duplicate of bug 1817588 ***