Bug 1881113

Summary: ovn on gcp is unusually unstable
Product: OpenShift Container Platform Reporter: David Eads <deads>
Component: Networking Assignee: Ricardo Carrillo Cruz <ricarril>
Networking sub component: ovn-kubernetes QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: urgent CC: aconstan, anusaxen, bbennett
Version: 4.6   
Target Milestone: ---   
Target Release: 4.6.0   
Hardware: Unspecified   
OS: Unspecified   
Last Closed: 2020-10-27 16:43:35 UTC Type: Bug

Description David Eads 2020-09-21 14:29:29 UTC
ovn on gcp passes about 25% of the time, versus about 60% for azure and aws.

https://testgrid.k8s.io/redhat-openshift-ocp-release-4.6-informing#release-openshift-ocp-installer-e2e-gcp-ovn-4.6&grid=old shows failures.

Looking into a few of them, there appears to be a crashlooping pod in ns/openshift-ovn-kubernetes on most of the failures.  This is just an observation, I'm not a networking expert.
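
A minimal sketch of how the crashlooping-pod observation could be checked, assuming the pod list for the namespace has been saved as JSON (for example from `oc get pods -n openshift-ovn-kubernetes -o json`). The helper name and the sample pod data below are made up for illustration and are not taken from the failing runs.

```python
import json


def crashlooping_containers(pod_list):
    """Return (pod, container, restart count) for containers waiting in CrashLoopBackOff."""
    hits = []
    for pod in pod_list.get("items", []):
        pod_name = pod.get("metadata", {}).get("name", "")
        statuses = pod.get("status", {}).get("containerStatuses") or []
        for cs in statuses:
            # A crashlooping container sits in the "waiting" state with this reason.
            waiting = cs.get("state", {}).get("waiting") or {}
            if waiting.get("reason") == "CrashLoopBackOff":
                hits.append((pod_name, cs.get("name", ""), cs.get("restartCount", 0)))
    return hits


# Hypothetical sample data, shaped like the JSON output of a pod listing.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "example-pod-a"},
      "status": {"containerStatuses": [
        {"name": "example-container", "restartCount": 12,
         "state": {"waiting": {"reason": "CrashLoopBackOff"}}}
      ]}
    },
    {
      "metadata": {"name": "example-pod-b"},
      "status": {"containerStatuses": [
        {"name": "example-container", "restartCount": 0,
         "state": {"running": {"startedAt": "2020-09-21T14:00:00Z"}}}
      ]}
    }
  ]
}
""")

for pod_name, container, restarts in crashlooping_containers(sample):
    print(f"{pod_name}/{container}: CrashLoopBackOff, {restarts} restarts")
```

An empty result would only mean nothing was crashlooping at the moment the list was captured; restart counts on running containers may still be worth a look.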

Comment 1 Anurag saxena 2020-09-21 15:08:38 UTC
Guess we are talking about this: https://bugzilla.redhat.com/show_bug.cgi?id=1877100

Comment 2 Juan Luis de Sousa-Valadas 2020-09-22 11:07:01 UTC
*** Bug 1877100 has been marked as a duplicate of this bug. ***

Comment 3 Ricardo Carrillo Cruz 2020-09-22 15:19:27 UTC
Alex mentioned this may be fixed by the PR for https://bugzilla.redhat.com/show_bug.cgi?id=1880974.
Keeping an eye on it.

Comment 4 Ben Bennett 2020-09-23 13:19:52 UTC
Useful sippy query for the job pass rates https://sippy.ci.openshift.org/detailed?release=4.6&endDay=1

Comment 5 Ricardo Carrillo Cruz 2020-09-23 13:20:41 UTC
It seems the pass rate has gone up?

https://sippy.ci.openshift.org/detailed?release=4.6&endDay=1

The PR from that related BZ landed, so hopefully that fixed the issue.

Comment 6 Ben Bennett 2020-09-24 17:34:36 UTC
GCP CI is looking a lot more stable.  I think the referenced fixes solved the problems.

Thanks all!

Comment 7 Anurag saxena 2020-09-24 17:47:05 UTC
Yep, various recent GCP installations look okay to QE as well.

Comment 9 Anurag saxena 2020-09-24 18:49:00 UTC
Verifying based on comment 7

Comment 12 errata-xmlrpc 2020-10-27 16:43:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196