Bug 1970985

Summary: periodic ci-4.8-upgrade-from-stable-4.7-e2e-*-ovn-upgrade are permafailing on service/ingress disruption
Product: OpenShift Container Platform
Reporter: Vadim Rutkovsky <vrutkovs>
Component: Networking
Assignee: Surya Seetharaman <surya>
Networking sub component: ovn-kubernetes
QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA
Docs Contact:
Severity: high
Priority: unspecified
CC: aconstan, cholman, philipp.dallig, vpickard, wking
Version: 4.8
Target Milestone: ---
Target Release: 4.9.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
: 1987046
Environment:
Last Closed: 2021-10-18 17:33:48 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1929396, 1987046

Description Vadim Rutkovsky 2021-06-11 15:36:25 UTC
Description of problem:
https://prow.ci.openshift.org/job-history/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade:

The last 5 runs all fail with:
  [sig-arch][Feature:ClusterUpgrade] Cluster should remain functional during upgrade [Disruptive] [Serial]
fail [github.com/openshift/origin/test/e2e/upgrade/service/service.go:161]: Jun 11 15:04:57.712: Service was unreachable during disruption for at least 7m24s of 1h23m56s (9%):

and

disruption_tests: [sig-network-edge] Cluster frontend ingress remain available
Jun 11 15:04:57.713: Frontends were unreachable during disruption for at least 25m23s of 1h25m54s (30%):

Jun 11 14:22:20.615 E ns/openshift-console route/console Route stopped responding to GET requests over new connections
Jun 11 14:22:20.615 - 405s  E ns/openshift-console route/console Route is not responding to GET requests over new connections

https://prow.ci.openshift.org/view/gs/origin-ci-test/logs/periodic-ci-openshift-release-master-ci-4.8-upgrade-from-stable-4.7-e2e-aws-ovn-upgrade/1403330789755588608

These failures do not occur in OpenShiftSDN-based cluster upgrades.

Comment 2 Surya Seetharaman 2021-07-05 09:04:12 UTC
1) https://github.com/openshift/cluster-network-operator/pull/1141
2) https://github.com/ovn-org/ovn-kubernetes/pull/2183

These should improve the situation, even if they are not ideal fixes. I'll push on them and get them merged.

Comment 7 errata-xmlrpc 2021-10-18 17:33:48 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759