Bug 1942856 notes that it is apparently currently necessary to open port 9 between nodes for ovn-kubernetes egress IP checking to work. That is not supposed to be required.
The original openshift-sdn code does the pings over the pod network (ie, it pings the remote node's tun0 IP), so all ports are open and the code can use whatever port it wants. (I picked port 9 because it's semantically correct ("discard"), and because I didn't want to connect to a port that might actually have a server on it that would log a message about "received connection"/"connection closed unexpectedly" or whatever every time we pinged.)
ovn-kubernetes's egress IP code, as currently written, uses the node's primary IP rather than its pod-network IP as the ping target, so the traffic goes over the node network and fails, because most clusters do not allow traffic between nodes on port 9.
There are two possible fixes:
- Change the code to use the nodes' pod-network IPs like openshift-sdn does, rather than their node-network IPs
- Change the port to something which is already open between nodes. e.g., you could reserve a port in the 9000-9999 range in
Moving this bug back to ASSIGNED since the linked PR has been reverted. Please link the correct PR and update the bug status. Thanks.
https://github.com/openshift/ovn-kubernetes/pull/834 includes the required fixes downstream.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.