Description of problem:

[sig-network] NetworkPolicy [LinuxOnly] NetworkPolicy between server and client [Top Level] [sig-network] NetworkPolicy [LinuxOnly] NetworkPolicy between server and client should enforce policy based on NamespaceSelector with MatchExpressions[Feature:NetworkPolicy] [Skipped:Network/OpenShiftSDN/Multitenant] [Suite:openshift/conformance/parallel]

seems to be the top failing networking test right now, failing at a rate of 50% across all our jobs.

Example:
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.4-informing#release-openshift-ocp-installer-e2e-aws-ovn-4.4&sort-by-failures=&show-stale-tests=
https://prow.svc.ci.openshift.org/view/gcs/origin-ci-test/logs/release-openshift-ocp-installer-e2e-aws-ovn-4.4/1296

STEP: Saw pod success
Apr 13 11:59:21.455: INFO: Pod "client-a-4tqlm" satisfied condition "success or failure"
Apr 13 11:59:21.482: FAIL: Error getting container logs: the server could not find the requested resource (get pods client-a-4tqlm)

Full Stack Trace
github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/network.checkNoConnectivity(0xc000c5c280, 0xc00082f760, 0xc0012b2400, 0xc000558900)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/network/network_policy.go:1457 +0x2a0
github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/network.testCannotConnect(0xc000c5c280, 0xc00082f760, 0x558757b, 0x8, 0xc000558900, 0x50)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/network/network_policy.go:1406 +0x1fc
github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/network.glob..func13.2.7()
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/k8s.io/kubernetes/test/e2e/network/network_policy.go:285 +0x883
github.com/openshift/origin/pkg/test/ginkgo.(*TestOptions).Run(0xc001600d80, 0xc001431390, 0x1, 0x1, 0x0, 0x0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/pkg/test/ginkgo/cmd_runtest.go:59 +0x41f
main.newRunTestCommand.func1(0xc000b64500, 0xc001431390, 0x1, 0x1, 0x0, 0x0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:233 +0x15d
github.com/openshift/origin/vendor/github.com/spf13/cobra.(*Command).execute(0xc000b64500, 0xc001431190, 0x1, 0x1, 0xc000b64500, 0xc001431190)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/spf13/cobra/command.go:826 +0x460
github.com/openshift/origin/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc00080bb80, 0x0, 0x61efc60, 0x99c96b0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/spf13/cobra/command.go:914 +0x2fb
github.com/openshift/origin/vendor/github.com/spf13/cobra.(*Command).Execute(...)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/vendor/github.com/spf13/cobra/command.go:864
main.main.func1(0xc00080bb80, 0x0, 0x0)
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:57 +0x9c
main.main()
	/go/src/github.com/openshift/origin/_output/local/go/src/github.com/openshift/origin/cmd/openshift-tests/openshift-tests.go:58 +0x341
It looks to me like this is a race condition when the network policy and the pod are created at the same time. Both code paths access a shared namespacesPolicies map, but the network policy path does not take the lock before accessing the map. Dan already has a patch for this, so we will see if it fixes the failures in downstream CI runs: https://github.com/ovn-org/ovn-kubernetes/pull/1244
can we disable this test where it's not supported, and use this bug to re-enable it when the issue is fixed?
https://github.com/ovn-org/ovn-kubernetes/pull/1244 fixes most of the locking issues, plus another bug in address set creation related to this test case. I think we still need https://github.com/ovn-org/ovn-kubernetes/pull/1262 to address one remaining case where locking is missing.
(In reply to Ben Parees from comment #3)
> can we disable this test where it's not supported, and use this bug to
> re-enable it when the issue is fixed?

Ben, the fix is now present in 4.5 and 4.4. Thanks.
It looks like the CI test is passing. I also manually tested some matchExpressions selectors on 4.5.0-0.nightly-2020-04-21-103613. Verified.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409