Bug 1848048

Summary: OpenShift installer fails when using ovn-kubernetes
Product: OpenShift Container Platform
Reporter: ravig <rgudimet>
Component: Networking
Assignee: Ben Bennett <bbennett>
Networking sub component: ovn-kubernetes
QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA
Docs Contact:
Severity: unspecified
Priority: unspecified
CC: dhellmann
Version: 4.6
Target Milestone: ---
Target Release: 4.6.0
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-10-27 16:07:36 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description ravig 2020-06-17 14:54:13 UTC
Description of problem:

The cluster fails to come up when ovn-kubernetes is used as the network type. The installer reports the following errors:


level=error msg="Cluster operator network Degraded is True with RolloutHung: DaemonSet \"openshift-ovn-kubernetes/ovnkube-node\" rollout is not making progress - last change 2020-06-16T20:41:34Z"
level=info msg="Cluster operator network Progressing is True with Deploying: DaemonSet \"openshift-multus/network-metrics-daemon\" is waiting for other operators to become ready\nDaemonSet \"openshift-multus/multus-admission-controller\" is waiting for other operators to become ready\nDaemonSet \"openshift-ovn-kubernetes/ovnkube-node\" is not available (awaiting 3 nodes)"
level=info msg="Cluster operator network Available is False with Startup: The network is starting up"
level=info msg="Pulling debug logs from the bootstrap machine"
level=info msg="Bootstrap gather logs captured here \"/tmp/installer/log-bundle-20200616210856.tar.gz\""
level=fatal msg="Bootstrap failed to complete: failed to wait for bootstrapping to complete: timed out waiting for the condition"
error: failed to execute wrapped command: exit status 1
2020/06/16 21:09:21 Container test in pod e2e-operator-ipi-install-install failed, exit code 1, reason Error
2020/06/16 21:09:22 Copied 7.24MB of artifacts from e2e-operator-ipi-install-install to /logs/artifacts/e2e-operator/ipi-install-install
2020/06/16 21:09:22 Executing "e2e-operator-gather-must-gather"
2020/06/16 21:09:25 Container cp-secret-wrapper in pod e2e-operator-gather-must-gather completed successfully
Running must-gather...
error: gather did not start for pod must-gather-5jf2q: timed out waiting for the condition
error: failed to execute wrapped command: exit status 1 


Version-Release number of selected component (if applicable):


How reproducible:

Every time in CI:

https://search.apps.build01.ci.devcluster.openshift.com/?search=Cluster+operator+network+Degraded+is+True+with+RolloutHung&maxAge=48h&context=1&type=bug%2Bjunit&name=&maxMatches=5&maxBytes=20971520&groupBy=job

Steps to Reproduce:
1. Spin up an OpenShift cluster with ovn-kubernetes as the network type.

Actual results:
The OpenShift installation fails.

Expected results:
The OpenShift installation succeeds.

Additional info:

Comment 1 ravig 2020-06-17 14:55:08 UTC
I0617 05:12:31.245747   45253 ovs.go:250] exec(126): stderr: "ovs-ofctl: br-int is not a bridge or a socket\n"
I0617 05:12:31.245754   45253 ovs.go:252] exec(126): err: exit status 1
F0617 05:12:31.245772   45253 ovnkube.go:129] timed out dumping br-int flow entries for node ip-10-0-139-86.us-east-2.compute.internal: timed out waiting for the condition

I can see the above error in the ovn-node log.

Comment 2 Ben Bennett 2020-06-18 13:04:26 UTC
We believe it is fixed by https://github.com/openshift/machine-config-operator/pull/1830

Comment 8 errata-xmlrpc 2020-10-27 16:07:36 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6 GA Images), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4196