The PR is a test skip for ovn-kubernetes, which is not even used in this e2e-gcp job, so it is unrelated to the failure. https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/pr-logs/pull/24833/pull-ci-openshift-origin-master-e2e-gcp/7122
Ran this test 500 times against a 4.5 cluster with no flaking. Another data point: the admission controller plugin that enforces the constraint verified in the test is compiled into the apiserver and I'm not sure it's configurable. Not clear yet under what conditions the plugin could somehow be disabled or misbehaving.
Upon closer inspection of the admission code[1], it does seem at least _possible_ that an informer cache consistency issue could result in the admission plugin skipping defaulting of the Ingress if the IngressClass resources were persisted and the list call didn't return them (or returned just one of them). [1] https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/plugin/pkg/admission/defaultingressclass/admission.go#L150
More digging is needed, but I'm suspicious of the design of that admission plugin. Stale cache data can cause it to make the wrong admission decision, and there's no controller that can correct for mistakes. It seems inherently racy with IngressClass persistence. Maybe the "only one default IngressClass" invariant itself needs to be enforced during admission so that this plugin doesn't need to make the racy check itself.
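To make the suspected race concrete, here is a minimal sketch of the decision as I read it. It is a simplified model, not the vendored plugin code: the `IngressClass` and `Ingress` types, the `admit` function, and the error value are hypothetical stand-ins, and the cache is just a slice passed in by the caller. It assumes the plugin counts default classes in whatever the cached list returns and then skips, defaults, or rejects.

```go
package main

import (
	"errors"
	"fmt"
)

// IngressClass is a hypothetical stand-in for the real API type.
type IngressClass struct {
	Name      string
	IsDefault bool // models the "is default class" marker
}

// Ingress is a hypothetical stand-in; ClassName is nil when unset.
type Ingress struct {
	ClassName *string
}

var errMultipleDefaults = errors.New("more than one default IngressClass found")

// admit models the defaulting decision made from the cached list only.
// Whatever the cache is missing is invisible to the decision.
func admit(ing *Ingress, cached []IngressClass) error {
	if ing.ClassName != nil {
		return nil // class already set, nothing to default
	}
	var defaults []IngressClass
	for _, c := range cached {
		if c.IsDefault {
			defaults = append(defaults, c)
		}
	}
	switch len(defaults) {
	case 0:
		return nil // no default visible: admit without defaulting
	case 1:
		name := defaults[0].Name
		ing.ClassName = &name
		return nil
	default:
		return errMultipleDefaults
	}
}

func main() {
	// Two default classes have been persisted.
	persisted := []IngressClass{{Name: "a", IsDefault: true}, {Name: "b", IsDefault: true}}

	// Each case feeds admit a different view of the persisted state.
	views := map[string][]IngressClass{
		"fresh cache":         persisted,
		"stale cache (one)":   persisted[:1],
		"stale cache (empty)": nil,
	}
	for label, view := range views {
		ing := &Ingress{}
		err := admit(ing, view)
		fmt.Printf("%s: err=%v classSet=%t\n", label, err, ing.ClassName != nil)
	}
}
```

With the same persisted state, the three views produce three different admission outcomes, and nothing in this model revisits the Ingress afterwards, which is the part that worries me.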
Test [1] should be run [Serial]. It creates a default IngressClass, so when [3] runs in parallel, ingresses that do not specify an IngressClass [2] pick up the default created by [1]. [1] https://github.com/openshift/origin/blob/master/vendor/k8s.io/kubernetes/test/e2e/network/ingressclass.go#L69-L94 [2] https://github.com/openshift/origin/blob/master/test/extended/testdata/router/ingress.yaml [3] https://github.com/openshift/origin/blob/master/test/extended/router/router.go#L70-L100
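For anyone unfamiliar with the tag, a rough sketch of how a [Serial] marker in a test name can keep a test out of the parallel set. This assumes the runner selects tests by substring match on the name; `splitBySerialTag` and the test names below are hypothetical and only illustrate the idea, they are not the actual runner code.

```go
package main

import (
	"fmt"
	"strings"
)

// splitBySerialTag partitions test names into those that may run in
// parallel and those tagged [Serial], which must run one at a time.
func splitBySerialTag(names []string) (parallel, serial []string) {
	for _, n := range names {
		if strings.Contains(n, "[Serial]") {
			serial = append(serial, n)
		} else {
			parallel = append(parallel, n)
		}
	}
	return parallel, serial
}

func main() {
	// Hypothetical test names, for illustration only.
	names := []string{
		"IngressClass default handling [Serial]",
		"router converts ingresses",
	}
	parallel, serial := splitBySerialTag(names)
	fmt.Println("parallel:", parallel)
	fmt.Println("serial:", serial)
}
```

Once the IngressClass test is tagged this way, it no longer shares a time window with the router test, so the temporary default IngressClass cannot leak into ingresses created by other tests.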
Checked the latest CI test with the serialized patch and didn't see the same flakiness anymore: ------ https://testgrid.k8s.io/redhat-openshift-ocp-release-4.5-informing#release-openshift-ocp-installer-e2e-gcp-serial-4.5 https://testgrid.k8s.io/redhat-openshift-ocp-release-4.5-informing#release-openshift-ocp-installer-e2e-openstack-serial-4.5 https://testgrid.k8s.io/redhat-openshift-ocp-release-4.5-informing#release-openshift-ocp-installer-e2e-azure-serial-4.5 ------
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409