The network stress job has been failing with significant flakes since the 06/21 code:
https://testgrid.k8s.io/redhat-openshift-ocp-release-4.9-informing#release-openshift-origin-installer-e2e-aws-sdn-network-stress-4.9

haproxy 2.4 was merged around that time, but the revert PR was failing only on network policy:
https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-launch-aws/1412856796430733312

fail [k8s.io/kubernetes.1/test/e2e/network/netpol/network_legacy.go:1908]: Jul 7 20:53:30.316: Pod client-a-2v6bp should be able to connect to service svc-server, but was not able to connect.
Pod logs:
TIMEOUT
TIMEOUT
REFUSED
REFUSED
REFUSED

Looking at jobs that fail with that error:
https://search.ci.openshift.org/?search=should+be+able+to+connect+to+service+svc-server%2C+but+was+not+able+to+connect&maxAge=48h&context=1&type=bug%2Bjunit&name=master%7C4.9&excludeName=&maxMatches=1&maxBytes=20971520&groupBy=job
shows about a 6% failure rate. Setting severity to high because this is showing up at that rate on what looks like all platforms.

The failure does not seem to happen in stress jobs prior to 06/25, because that is when we turned those tests on. The last 4.9 stress job to pass was https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/release-openshift-origin-installer-e2e-aws-sdn-network-stress-4.9/1408434925270470656, but note that it was running code from 6/21 because apparently we didn't promote for 4 days.

The network policy tests are just heavily flaky. If the set of tests can be made not flaky, we can leave them in stress; otherwise they need to be excluded. If they are excluded, this can drop to medium and remain open, provided it's ONLY the tests that are flaky (not the actual function). Note that NetworkPolicy should have a reasonable SLO, and network stress will push that heavily, so it's possible that instead of bypassing the tests we should optimize network policy.
Broader than https://bugzilla.redhat.com/show_bug.cgi?id=1975865
PRs reverting haproxy-2.4 => haproxy-2.2:
https://github.com/openshift/images/pull/97
https://github.com/openshift/router/pull/318
Disabling the tests doesn't need QA, and we don't want this bug to get closed anyway, because we need to track the fact that we still have to fix the tests so they stop being flaky.
*** Bug 1975865 has been marked as a duplicate of this bug. ***
*** Bug 1986119 has been marked as a duplicate of this bug. ***
*** Bug 1975476 has been marked as a duplicate of this bug. ***
(again moving back to NEW so this stays open to track the fact that we are skipping tests)
*** Bug 1989395 has been marked as a duplicate of this bug. ***
*** Bug 1990377 has been marked as a duplicate of this bug. ***
Closing this in favor of tracking the last bit (re-enabling the netpol test suite in OpenShift) in an issue: https://github.com/openshift/origin/issues/27535. This isn't really a bug; it's more of a tech-debt item, since we're already running these tests in upstream ovn-kubernetes. Thanks, Andrew