Bug 1820807

Summary: Run template e2e-aws-upgrade - e2e-aws-upgrade container test failed
Product: OpenShift Container Platform
Component: Etcd
Version: 4.4
Target Release: 4.5.0
Target Milestone: ---
Hardware: All
OS: All
Severity: high
Priority: unspecified
Status: CLOSED DUPLICATE
Reporter: Sebastian Jug <sejug>
Assignee: Alay Patel <alpatel>
QA Contact: ge liu <geliu>
CC: bparees, sbatsche, wking
Environment: test: Overall
Type: Bug
Last Closed: 2020-04-28 16:02:53 UTC

Description Sebastian Jug 2020-04-03 23:22:55 UTC
test: Run template e2e-aws-upgrade - e2e-aws-upgrade container test failed, see job: https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/24370

...0-150-139.us-west-1.compute.internal container/olm-operator reason/Restarted
Apr 03 21:07:31.468 E ns/openshift-operator-lifecycle-manager pod/olm-operator-7c77f9f96d-phzgx node/ip-10-0-150-139.us-west-1.compute.internal container/olm-operator container exited with code 1 (Error):
time="2020-04-03T21:07:24Z" level=info msg="log level info"\n
time="2020-04-03T21:07:24Z" level=info msg="TLS keys set, using https for metrics"\nW0403 21:07:24.803627 1 client_config.go:543] Neither --kubeconfig nor --master was specified. Using the inClusterConfig. This might not work.\n
time="2020-04-03T21:07:24Z" level=info msg="Using in-cluster kube client config"\n
time="2020-04-03T21:07:24Z" level=info msg="Using in-cluster kube client config"\n
time="2020-04-03T21:07:30Z" level=fatal msg="couldn't clean previous release" error="Delete https://172.30.0.1:443/apis/operators.coreos.com/v1alpha1/namespaces/openshift-operator-lifecycle-manager/catalogsources/olm-operators: dial tcp 172.30.0.1:443: connect: no route to host"\n
Apr 03 21:07:38.410 I ns/openshift-etcd-operator deployment/etcd-operator reason/UnhealthyEtcdMember unhealthy members: ip-10-0-150-139.us-west-1.compute.internal,ip-10-0-135-25.us-west-1.compute.internal (164 times) 
Apr 03 21:07:47.532 W ns/openshift-console pod/console-849d7d5747-24frm node/ip-10-0-150-139.us-west-1.compute.internal container/console reason/Restarted 
Apr 03 21:07:56.642 W ns/openshift-kube-controller-manager-operator pod/kube-controller-manager-operator-769fffbd5d-ssp94 node/ip-10-0-150-139.us-west-1.compute.internal reason/BackOff Back-off restarting failed container (262 times) 
Apr 03 21:07:59.132 I test="[sig-arch][Feature:ClusterUpgrade] Cluster should remain functional during upgrade [Disruptive] [Serial] [Suite:openshift]" failed 
Failing tests: [sig-arch][Feature:ClusterUpgrade] Cluster should remain functional during upgrade [Disruptive] [Serial] [Suite:openshift]
Writing JUnit report to /tmp/artifacts/junit/junit_e2e_20200403-210759.xml
error: 1 fail, 0 pass, 0 skip (1h19m11s)
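
The fatal olm-operator message above is a raw TCP failure reaching the cluster's apiserver service VIP (172.30.0.1:443) from the node, which usually points at node-level service routing rather than the API server itself, and is plausible while masters reboot mid-upgrade. Below is a minimal sketch of a probe that reproduces that specific symptom; it assumes it is run from a pod on the affected node, and the address and timeout are illustrative values taken from this job's logs, not part of any OLM code:

    // probe.go: dial the in-cluster apiserver service VIP the way olm-operator's
    // HTTP client ultimately does, to distinguish "connect: no route to host"
    // (node service routing not ready) from TLS- or API-level errors.
    package main

    import (
        "fmt"
        "net"
        "time"
    )

    func main() {
        // 172.30.0.1:443 is the kubernetes service VIP seen in this job's logs.
        conn, err := net.DialTimeout("tcp", "172.30.0.1:443", 3*time.Second)
        if err != nil {
            // An error wrapping "connect: no route to host" matches the
            // olm-operator fatal above.
            fmt.Printf("dial failed: %v\n", err)
            return
        }
        defer conn.Close()
        fmt.Println("TCP connect to the service VIP succeeded")
    }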

Comment 1 W. Trevor King 2020-04-04 02:29:55 UTC
The proximal cause of the "Cluster should remain functional during upgrade" failure was:

  fail [github.com/openshift/origin/test/e2e/upgrade/upgrade.go:135]: during upgrade to registry.svc.ci.openshift.org/ocp/release:4.5.0-0.ci-2020-04-03-190912
  Unexpected error:
      <*errors.errorString | 0xc002818c80>: {
          s: "Cluster did not complete upgrade: timed out waiting for the condition: deployment openshift-operator-lifecycle-manager/olm-operator is not available MinimumReplicasUnavailable: Deployment does not have minimum availability.",
      }
      Cluster did not complete upgrade: timed out waiting for the condition: deployment openshift-operator-lifecycle-manager/olm-operator is not available MinimumReplicasUnavailable: Deployment does not have minimum availability.
  occurred
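
For context, that message is the upgrade test timing out while the Deployment's Available condition stays False. A minimal client-go sketch of an equivalent check follows; it assumes in-cluster credentials and a client-go release new enough to take a context in Get. The namespace and name come from the error above; everything else is illustrative and is not the test's actual code:

    // checkavail.go: report whether the olm-operator Deployment has minimum
    // availability, i.e. whether its Available condition is True.
    package main

    import (
        "context"
        "fmt"
        "log"

        appsv1 "k8s.io/api/apps/v1"
        corev1 "k8s.io/api/core/v1"
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/client-go/kubernetes"
        "k8s.io/client-go/rest"
    )

    func main() {
        cfg, err := rest.InClusterConfig()
        if err != nil {
            log.Fatal(err)
        }
        client, err := kubernetes.NewForConfig(cfg)
        if err != nil {
            log.Fatal(err)
        }
        deploy, err := client.AppsV1().Deployments("openshift-operator-lifecycle-manager").
            Get(context.TODO(), "olm-operator", metav1.GetOptions{})
        if err != nil {
            log.Fatal(err)
        }
        for _, cond := range deploy.Status.Conditions {
            if cond.Type != appsv1.DeploymentAvailable {
                continue
            }
            if cond.Status == corev1.ConditionTrue {
                fmt.Println("olm-operator has minimum availability")
            } else {
                // Matches the failure above: reason MinimumReplicasUnavailable.
                fmt.Printf("not available: reason=%s message=%s\n", cond.Reason, cond.Message)
            }
        }
    }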

Comment 2 Ben Parees 2020-04-28 16:02:53 UTC

*** This bug has been marked as a duplicate of bug 1817588 ***