Description of problem:

Running upstream tests using OVN in the OCP install-config:

  networkType: OVNKubernetes

Started upstream ovs-cni:

  oc apply -f https://raw.githubusercontent.com/kubevirt/ovs-cni/master/examples/ovs-cni.yml

and I see that the ovs-cni pods are all failing with CrashLoopBackOff. Only the marker reports logs:

  # oc logs -f -n kube-system ovs-cni-amd64-n4l29 -c ovs-cni-marker
  F1122 20:00:25.946505 1 main.go:42] failed to create a new marker object: Error creating the ovsdb connection: failed to connect to ovsdb error: Invalid socket file

Comparing OVNKubernetes to the default OpenShiftSDN, there appear to be OVS path changes.

OVN:

  [root@worker0 ~]# ls /etc/openvswitch/
  conf.db  system-id.conf
  [root@worker0 ~]# ls /run/openvswitch/
  br-int.mgmt  br-int.snoop  br-local.mgmt  br-local.snoop  br1ovs.mgmt  br1ovs.snoop  db.sock  ovn-controller.4227.ctl  ovn-controller.pid  ovnkube-node.pid  ovs-vswitchd.4242.ctl  ovs-vswitchd.pid  ovsdb-server.4189.ctl  ovsdb-server.pid

SDN (default OVS -- different cluster):

  [root@worker-perf39 ~]# ls /var/run/openvswitch/
  br0.mgmt  br0.snoop  br1.mgmt  br1ovs.mgmt  br1ovs.snoop  br1.snoop  db.sock  ovsdb-server.17232.ctl  ovsdb-server.pid  ovs-vswitchd.17443.ctl  ovs-vswitchd.pid
  [root@worker-perf39 ~]# ls /var/lib/openvswitch/
  conf.db  system-id.conf

I captured the pod description outputs:

OVN: http://perf1.perf.lab.eng.bos.redhat.com/pub/jhopper/OCP4/debug/OVN/ovnkube-node_openshift-ovn-kubernetes.txt
SDN (default OVS -- from a different cluster): http://perf1.perf.lab.eng.bos.redhat.com/pub/jhopper/OCP4/debug/OVN/ovs_openshift-sdn.txt

Version-Release number of selected component (if applicable):

OCP 4.3.0-0.nightly-2019-11-11-182924
RHCOS 43.81.201911111553.0
KubeVirt v0.23.0

  REPOSITORY                        TAG      IMAGE ID       CREATED       SIZE
  quay.io/kubevirt/ovs-cni-plugin   latest   9e05c4d27e5b   2 weeks ago   114 MB
  quay.io/kubevirt/ovs-cni-marker   latest   20b56e0e6a31   2 weeks ago   142 MB

How reproducible:

Fails every time.

Steps to Reproduce:

1. Start ovs-cni on a cluster using OVN.

I haven't tried the network operator yet, but I suspect it will hit the same error if the paths have not been changed.
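For anyone else debugging this, here is a rough way to compare those paths from outside the node using oc debug. This is only a sketch: the node names are the examples from the listings above, and note that on RHCOS /var/run is a symlink to /run, so the more telling differences are where conf.db lives and which sockets the node exposes:

  # Sketch: compare OVS paths on an OVN node vs. an SDN node.
  # Node names (worker0, worker-perf39) are examples from the listings above.

  # OVN cluster -- conf.db lives under /etc/openvswitch:
  oc debug node/worker0 -- chroot /host ls /etc/openvswitch/ /run/openvswitch/

  # SDN cluster -- conf.db lives under /var/lib/openvswitch:
  oc debug node/worker-perf39 -- chroot /host ls /var/lib/openvswitch/ /var/run/openvswitch/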
Also, in case it is interesting, here is the ovs-vsctl output from an OVN node: http://perf1.perf.lab.eng.bos.redhat.com/pub/jhopper/OCP4/debug/OVN/ovs-vsctl_OVN.txt
Jenifer, thanks for testing OVN with CNV. We aim to tackle its support in future releases.
The resolution for this issue is being tracked in Jira.
We are tracking this in Jira, so I am closing this bug.
I'm reopening this issue. We need to run CNV on OCP 4.3, and this bug blocks us from a successful deployment. CNV's OVS bridge marker fails on OCP 4.3 with:

  F0214 13:19:12.260672 1 main.go:42] failed to create a new marker object: Error creating the ovsdb connection: failed to connect to ovsdb error: Invalid socket file

The solution would be to backport https://github.com/openshift/cluster-network-operator/pull/357. In that PR, we split the OVS and OVN pods so that they share the OVS socket on the host. With the socket available on the host, CNV's OVS bridge marker should be able to start successfully and let the deployment complete.
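Once the backport lands, a rough way to confirm that the socket is actually shared with the host and to bounce the marker pods (a sketch; the node name is an example, and the app=ovs-cni label matches the CNV DaemonSet shown in the verification below):

  # Sketch: verify the OVS socket is visible on the host after the backport
  # (worker0 is an example node name):
  oc debug node/worker0 -- chroot /host sh -c 'test -S /var/run/openvswitch/db.sock && echo socket present'

  # The marker pods should then recover on their own; if not, recreate them:
  oc delete pod -n openshift-cnv -l app=ovs-cni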
Test Environment:
================

$ oc version
Client Version: 4.4.0-0.nightly-2020-04-20-224655
Server Version: 4.4.0-rc.8
Kubernetes Version: v1.17.1

$ oc get csv -n openshift-cnv | awk ' { print $4 } ' | tail -n1
2.3.0

Test Steps:
==========

1. Check the status of all the pods under namespace openshift-cnv with app=ovs-cni.

$ oc get pod -n openshift-cnv | grep ovs-cni
ovs-cni-amd64-4jmxb   2/2   Running   3   67m
ovs-cni-amd64-5lzpz   2/2   Running   0   67m
ovs-cni-amd64-7k2xk   2/2   Running   0   67m
ovs-cni-amd64-c679d   2/2   Running   0   67m
ovs-cni-amd64-dxgck   2/2   Running   3   67m
ovs-cni-amd64-w28td   2/2   Running   3   67m

2. Check the logs and see if there are any socket exceptions as mentioned above.

$ for pod in $(oc get pod -n openshift-cnv -l app=ovs-cni --no-headers | awk '{print $1}'); do echo ===$pod=== && oc logs -n openshift-cnv $pod --all-containers=true; done
===ovs-cni-amd64-4jmxb===
I0421 08:49:16.974668 1 main.go:44] Found the OVS socket
===ovs-cni-amd64-5lzpz===
I0421 08:30:59.423366 1 main.go:44] Found the OVS socket
===ovs-cni-amd64-7k2xk===
I0421 08:30:56.322587 1 main.go:44] Found the OVS socket
===ovs-cni-amd64-c679d===
I0421 08:30:55.216315 1 main.go:44] Found the OVS socket
===ovs-cni-amd64-dxgck===
I0421 08:56:24.900515 1 main.go:44] Found the OVS socket
===ovs-cni-amd64-w28td===
I0421 08:42:15.621076 1 main.go:44] Found the OVS socket
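As an extra sanity check (a sketch, assuming the marker mounts the socket via a hostPath volume as in the upstream manifest), the mount can be read straight from the DaemonSet spec; exact volume names and paths may differ between CNV versions:

  # Sketch: confirm the ovs-cni DaemonSet mounts the host OVS run directory.
  oc get daemonset -n openshift-cnv -l app=ovs-cni -o yaml | grep -B 2 -A 4 hostPath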
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1529