https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-upgrade/23242#1:build-log.txt%3A11083

Failing tests:

[Disruptive] Cluster upgrade should maintain a functioning cluster [Feature:ClusterUpgrade] [Suite:openshift] [Serial]

2020/03/26 23:40:16 Container test in pod e2e-aws-upgrade failed, exit code 1, reason Error
2020/03/26 23:49:13 Copied 176.08MB of artifacts from e2e-aws-upgrade to /logs/artifacts/e2e-aws-upgrade
2020/03/26 23:49:13 Releasing lease for "aws-quota-slice"
2020/03/26 23:49:14 No custom metadata found and prow metadata already exists. Not updating the metadata.
2020/03/26 23:49:15 Ran for 1h31m34s
error: could not run steps: step e2e-aws-upgrade failed: template pod "e2e-aws-upgrade" failed: the pod ci-op-7shj0sm3/e2e-aws-upgrade failed after 1h30m9s (failed containers: test): ContainerFailed one or more containers exited
Container test exited with code 1, reason Error
---
ard
Mar 26 23:38:53.271 I ns/openshift-machine-config-operator pod/etcd-quorum-guard-9498659d4-8qb76 node/ created
Mar 26 23:38:53.281 I ns/openshift-machine-config-operator replicaset/etcd-quorum-guard-9498659d4 Created pod: etcd-quorum-guard-9498659d4-8qb76
Mar 26 23:38:53.284 W ns/openshift-machine-config-operator pod/etcd-quorum-guard-9498659d4-8qb76 0/6 nodes are available: 3 node(s) didn't match node selector, 3 node(s) didn't match pod affinity/anti-affinity, 3 node(s) didn't satisfy existing pods anti-affinity rules.
Mar 26 23:38:54.725 W ns/openshift-machine-config-operator pod/etcd-quorum-guard-5c9b9b597c-49t6w node/ip-10-0-135-254.us-west-2.compute.internal deleted
Mar 26 23:38:54.802 I ns/openshift-machine-config-operator pod/etcd-quorum-guard-9498659d4-8qb76 Successfully assigned openshift-machine-config-operator/etcd-quorum-guard-9498659d4-8qb76 to ip-10-0-135-254.us-west-2.compute.internal
Mar 26 23:38:55.362 I ns/openshift-machine-config-operator pod/etcd-quorum-guard-9498659d4-8qb76 Container image "registry.svc.ci.openshift.org/ocp/4.3-2020-03-26-221327@sha256:9e71afa828f820ece9d26153a3ba52ea597609b4298acf57c4db20096e52b0d5" already present on machine
Mar 26 23:38:55.515 I ns/openshift-machine-config-operator pod/etcd-quorum-guard-9498659d4-8qb76 Created container guard
Mar 26 23:38:55.545 I ns/openshift-machine-config-operator pod/etcd-quorum-guard-9498659d4-8qb76 Started container guard
Mar 26 23:39:17.714 W clusterversion/version cluster reached 4.3.0-0.ci-2020-03-26-221327
Mar 26 23:39:17.714 W clusterversion/version changed Progressing to False: Cluster version is 4.3.0-0.ci-2020-03-26-221327
Mar 26 23:39:28.850 I ns/openshift-ingress service/router-default Updated load balancer with new hosts (3 times)
Mar 26 23:40:15.008 I test="[Disruptive] Cluster upgrade should maintain a functioning cluster [Feature:ClusterUpgrade] [Suite:openshift] [Serial]" failed

Failing tests:

[Disruptive] Cluster upgrade should maintain a functioning cluster [Feature:ClusterUpgrade] [Suite:openshift] [Serial]
I think that "ard" bit is just a truncated line. The job actually failed because of:

fail [github.com/openshift/origin/test/extended/util/disruption/disruption.go:226]: Mar 26 23:39:42.393: Frontends were unreachable during disruption for at least 9m8s of 45m9s (20%):

which is a pretty severe outage. This was a 4.2.26 -> 4.3.0-0.ci-2020-03-26-221327 update job.
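For reference, that 20% is just the unreachable time divided by the overall disruption window. A quick sketch of the arithmetic, using the durations from the failure message above (this is only a recheck, not the actual disruption.go implementation):

package main

import (
	"fmt"
	"time"
)

func main() {
	// Durations copied from the reported failure, not measured here.
	unreachable := 9*time.Minute + 8*time.Second // 9m8s
	window := 45*time.Minute + 9*time.Second     // 45m9s
	fmt.Printf("unreachable for %.1f%% of the window\n",
		100*unreachable.Seconds()/window.Seconds()) // ~20.2%, matching the reported 20%
}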
Might also be an SDN issue like bug 1793635.
This bug is just another manifestation of #1809665 and isn't really adding any new information, but I'll keep it open and set a Depends On for now.
*** This bug has been marked as a duplicate of bug 1809665 ***