Bug 1849728

Summary: Cluster fails to come up on OVN-kubernetes with hybridOverlayConfig
Product: OpenShift Container Platform
Reporter: ravig <rgudimet>
Component: Networking
Assignee: Jacob Tanenbaum <jtanenba>
Networking sub component: ovn-kubernetes
QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA
Docs Contact:
Severity: urgent
Priority: unspecified
CC: aconstan, aravindh, sdodson
Version: 4.6
Target Milestone: ---
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-27 16:08:40 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description ravig 2020-06-22 16:37:25 UTC
Description of problem:

The cluster frequently fails to come up with the following error when OVNKubernetes is used:



Cluster operator etcd Degraded is True with InstallerPodContainerWaiting_ContainerCreating::InstallerPodNetworking_FailedCreatePodSandBox: InstallerPodContainerWaitingDegraded: Pod \"installer-3-ip-10-0-214-132.us-east-2.compute.internal\" on node \"ip-10-0-214-132.us-east-2.compute.internal\" container \"installer\" is waiting for 29m13.564092979s because \"\"\nInstallerPodNetworkingDegraded: Pod \"installer-3-ip-10-0-214-132.us-east-2.compute.internal\" on node \"ip-10-0-214-132.us-east-2.compute.internal\" observed degraded networking: Failed to create pod sandbox: rpc error: code = Unknown desc = failed to create pod network sandbox k8s_installer-3-ip-10-0-214-132.us-east-2.compute.internal_openshift-etcd_344b6f2c-def3-4be4-971b-e4e0257a4d80_0(b6409b47cb699f331600ad52233d5ccb808d8e6c98361c33c2edea22a620ccc1): Multus: [openshift-etcd/installer-3-ip-10-0-214-132.us-east-2.compute.internal]: error adding container to network \"ovn-kubernetes\": delegateAdd: error invoking confAdd - \"ovn-k8s-cni-overlay\": error in getting result from AddNetwork: CNI request failed with status 400: '[openshift-etcd/installer-3-ip-10-0-214-132.us-east-2.compute.internal] failed to get pod annotation: timed out waiting for the condition\nInstallerPodNetworkingDegraded: 


We're noticing this frequently in our CI as well:

https://search.apps.build01.ci.devcluster.openshift.com/?search=error+in+getting+result+from+AddNetwork%3A+CNI+request+failed+with+status+400%3A&maxAge=48h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job
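
When triaging logs for this failure, the signature in the error above can be matched programmatically. The following is a minimal sketch, assuming the log text is already available as a string. The helper name `find_cni_annotation_timeouts` and the regular expression are illustrative assumptions, not part of any OpenShift or CI tooling.

```python
import re

# Hypothetical helper (an assumption for illustration, not OpenShift tooling):
# matches the "failed to get pod annotation" CNI error quoted in this report
# and captures the namespace and pod name from the bracketed prefix.
CNI_TIMEOUT = re.compile(
    r"CNI request failed with status 400: "
    r"'\[(?P<namespace>[^/\]]+)/(?P<pod>[^\]]+)\] "
    r"failed to get pod annotation: timed out waiting for the condition"
)


def find_cni_annotation_timeouts(log_text):
    """Return a sorted list of unique (namespace, pod) pairs hit by the failure."""
    matches = {
        (m.group("namespace"), m.group("pod"))
        for m in CNI_TIMEOUT.finditer(log_text)
    }
    return sorted(matches)


if __name__ == "__main__":
    # Sample text copied from the error message in this bug description.
    sample = (
        "error in getting result from AddNetwork: CNI request failed with "
        "status 400: '[openshift-etcd/installer-3-ip-10-0-214-132.us-east-2."
        "compute.internal] failed to get pod annotation: timed out waiting "
        "for the condition"
    )
    for namespace, pod in find_cni_annotation_timeouts(sample):
        print(f"{namespace}/{pod}")
```

Run against a saved operator status or kubelet log, this lists each affected pod once, which may help confirm whether a given failed install matches this bug rather than a different sandbox-creation error.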


Version-Release number of selected component (if applicable):


How reproducible:
Very frequently.

Steps to Reproduce:
1.
2.
3.

Actual results:
OCP cluster creation fails

Expected results:
OCP comes up with OVNKubernetes as network type

Additional info:

Comment 2 Anurag saxena 2020-06-22 19:32:02 UTC
It is worth waiting for https://github.com/openshift/machine-config-operator/pull/1830 to land in CI or a nightly build, then checking the installation again.

Comment 3 Aravindh Puthiyaparambil 2020-06-22 20:15:14 UTC
CI has already picked it up and we still see the issue in our PRs. The nightly has not picked it up yet, but that should not affect the PRs.

Comment 4 Anurag saxena 2020-06-22 21:15:57 UTC
Thanks Aravindh. I had a successful install on 4.6.0-0.ci-2020-06-22-171752 with just networkType: OVNKubernetes on AWS. Does this bug pertain to hybrid config along with the OVNKubernetes networkType?

So two questions

1) Was this bug opened on a specific platform?
2) Did your install-config use hybrid config as well?

Comment 5 Aravindh Puthiyaparambil 2020-06-22 21:47:58 UTC
> Thanks Aravindh. I had a successful install on 4.6.0-0.ci-2020-06-22-171752
> with just networkType: OVNKubernetes on AWS. Does this bug pertain to
> hybrid config along with the OVNKubernetes networkType?

This pertains to OVN hybrid.

> 
> So two questions
> 
> 1) Was this bug opened on a specific platform?

Given we are seeing this in CI, it would be OVN hybrid on AWS.

> 2) Did your install-config use hybrid config as well?

Yes

Comment 6 Anurag saxena 2020-06-26 19:43:38 UTC
Correcting the bug title to reflect the actual issue.

Comment 11 errata-xmlrpc 2020-10-27 16:08:40 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196