Description of problem: With the previous local gateway behavior, all egress traffic went via the host for routing. Because of this, a user could add more specific routes on the host to direct some traffic to a non-default gateway. With shared gateway mode this no longer works, because traffic egresses the node without going to the host for routing and only uses the routes in OVN.
A potential solution is to provide a config flag, e.g. --force-host-network, which would do the following:

1. Add route policies on the OVN DVR for all nodes to redirect traffic to mp0, forcing traffic from OVN pods into the host to be routed (sketched below).
2. Modify the load balancers on the GR (host -> service traffic):
   a. host -> service traffic still goes via br-ex
   b. only the hairpin host network endpoint is rerouted back to the shared gateway bridge and eventually the host
   c. other host endpoints would need to be DNAT'ed to their host endpoints, with routes added to the GR to force traffic towards the DVR (SNAT'ed to the join subnet, 100.64.0.x)
   d. the route policy on the DVR (set in step 1) forces the traffic to mp0, and the host routes and SNATs the traffic
   e. a route would be needed on the host so that return traffic for the join subnet goes back into mp0
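For illustration only, a hypothetical sketch of what the step 1 policy might look like per node; the node name, priority, and mp0 address are placeholders, not anything that exists today:

# Hypothetical per-node policy on the distributed router: match traffic
# entering from the node's switch port and reroute it to that node's
# management port (mp0) so the host routing table takes over.
# <node-name> and <mp0-ip> are placeholders.
ovn-nbctl lr-policy-add ovn_cluster_router 1004 \
    'inport == "rtos-<node-name>"' reroute <mp0-ip>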
Now that I have a better understanding of the problem, ignore the solution in comment 1; I've updated the description. A potential workaround is to add custom routes to each OVN gateway router (GR). This needs to be done manually on each node. To add a custom route to a GR:

1. Exec into the leader nbdb container of an ovnkube-master pod. You may need to try each one to find the leader: inside the container, run "ovn-nbctl show"; if the command succeeds, you are on the leader (see the sketch below for one way to do this with oc).

2. Check the current routes in your node's gateway router. The gateway router is always named "GR_<node name>". For example, if the node name is "ovn-worker":

[root@ovn-control-plane ~]# ovn-nbctl lr-route-list GR_ovn-worker
IPv4 Routes
             10.244.0.0/16              100.64.0.1 dst-ip
                 0.0.0.0/0              172.18.0.1 dst-ip rtoe-GR_ovn-worker

3. Add your new route:

[root@ovn-control-plane ~]# ovn-nbctl lr-route-add GR_ovn-worker 192.168.1.0/24 192.168.0.3
[root@ovn-control-plane ~]# ovn-nbctl lr-route-list GR_ovn-worker
IPv4 Routes
            192.168.1.0/24             192.168.0.3 dst-ip
             10.244.0.0/16              100.64.0.1 dst-ip
                 0.0.0.0/0              172.18.0.1 dst-ip rtoe-GR_ovn-worker

There is no CRD or other API exposed today to add custom routes into OVN.
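As a rough sketch of step 1 (assuming an OpenShift cluster where the masters run in the openshift-ovn-kubernetes namespace, the ovnkube-master pods carry an nbdb container, and the app=ovnkube-master label applies; adjust names for your environment):

# List the ovnkube-master pods (the label is an assumption based on typical
# deployments; verify with a plain "oc get pods" if it does not match).
oc -n openshift-ovn-kubernetes get pods -l app=ovnkube-master -o name

# Try each pod's nbdb container until "ovn-nbctl show" succeeds -- that pod is
# the leader, and it is where the lr-route-* commands above should be run.
oc -n openshift-ovn-kubernetes exec -it <ovnkube-master-pod> -c nbdb -- ovn-nbctl show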
This seems like a critical problem: how are we supposed to access services on networks to which the host is directly attached?
The workaround suggested here doesn't seem appropriate for directly connected host routes. We would need the OVN equivalent of:

ip route add 10.253.0.0/23 dev vlan210

The lr-route-add command doesn't seem like it will work because it requires a `nexthop` argument, and there won't be one for directly attached networks.

> One other possible solution is to have ovnkube read the host routing table for the subnet attached to br-ex, and automatically add those routes to each GR. I'm not sure we want to support this or not.

I don't think this would be sufficient: you don't want the routing table "for the subnet attached to br-ex"; you'd want the actual host routing table. Otherwise you still wouldn't have access to directly attached networks.

Tim has suggested via slack that we might be able to work around this with a route policy, such as:

ovn-nbctl lr-policy-add ovn_cluster_router 1004 \
  'inport == "rtos-oct-03-26-compute" && ip4.dst == 10.253.0.0/23' \
  reroute 10.130.0.2

This worked partially: we saw traffic egressing on ovn-k8s-mp0, which is what we wanted to see, but it didn't appear to leave the host. This also has the disadvantages that (a) it requires knowledge of the host IP on the target network, which ideally wouldn't be necessary, and (b) it requires one rule per host, which means we're stuck with manual work whenever we add a node.

David Guthrie has suggested adding the target network as an `additionalNetworks` configuration in the SDN config, so that pods get addresses directly on the target network. Does that seem like a viable solution? I guess it would solve the routing problem, but we would have to pay closer attention to IP address availability if we were consuming one address per container when we only need one per host.
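For anyone reproducing this, a small verification sketch; it assumes you are already on the nbdb leader and that the interface and subnet names match the example above:

# Confirm the policy is present on the distributed router.
ovn-nbctl lr-policy-list ovn_cluster_router

# On the node itself, watch whether pod traffic to the target network is
# actually being handed to the host via the management port.
tcpdump -i ovn-k8s-mp0 -nn net 10.253.0.0/23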
We were able to get the connectivity we wanted by implementing the routing policy that Tim suggested...

ovn-nbctl lr-policy-add ovn_cluster_router 1004 \
  'inport == "rtos-oct-03-26-compute" && ip4.dst == 10.253.0.0/23' \
  reroute 10.130.0.2

...and then adding a NAT rule on the nodes:

iptables -t nat -I POSTROUTING 1 -s 10.128.0.0/14 -d 10.253.0.0/23 -j MASQUERADE

(Where 10.128.0.0/14 is the cluster network.)

We've opted to try switching to OpenShift-SDN instead because that ultimately seems like a simpler solution, since the above changes require manual operation on the OVN controllers for every node (and every new node as we add them), combined with a MachineConfig to set up the iptables rules on the workers (a sketch follows below), and would probably result in supportability questions at some point.
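For reference, a hedged sketch of what such a MachineConfig for the iptables rule could look like, using a systemd unit to apply it at boot. The names are illustrative and this is not a validated or supported configuration:

cat <<'EOF' | oc apply -f -
apiVersion: machineconfiguration.openshift.io/v1
kind: MachineConfig
metadata:
  # Illustrative name; the "worker" role label targets the worker pool.
  name: 99-worker-masquerade-storage-net
  labels:
    machineconfiguration.openshift.io/role: worker
spec:
  config:
    ignition:
      version: 3.2.0
    systemd:
      units:
        - name: masquerade-storage-net.service
          enabled: true
          contents: |
            [Unit]
            Description=Masquerade pod traffic to the directly attached storage network
            After=network-online.target
            Wants=network-online.target

            [Service]
            Type=oneshot
            RemainAfterExit=yes
            ExecStart=/usr/sbin/iptables -t nat -I POSTROUTING 1 -s 10.128.0.0/14 -d 10.253.0.0/23 -j MASQUERADE

            [Install]
            WantedBy=multi-user.target
EOF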
After some discussion, we will support the previous functionality of routing all egress traffic via the kernel, so the previous behavior can continue working. This mode of ovn-kubernetes is called "local gateway" mode, while the default mode in 4.8 and later is called "shared gateway" mode. Local gateway mode still exists in 4.8 and later; it is currently only enabled via a hidden configuration. As a workaround for now, here are the instructions for enabling this mode. We will come up with a proper configuration knob exposed via the cluster network config to switch between gateway modes.

Note: for now we have only validated that migrating from local gateway mode to shared gateway mode works, not the reverse. We will validate/fix this though. If a customer is relying on custom routes/iptables rules to steer egress traffic, it is advised to stay on local gateway mode when upgrading from 4.7 -> 4.8 -> 4.9.

To deploy 4.8 or later with local gateway mode, you need to create a ConfigMap that indicates the gateway mode and have it present at deploy/upgrade time. For a fresh install (a shell sketch of these steps follows below):

1. Put your install-config.yaml in your <install folder>.

2. openshift-install create manifests --dir=<install folder>

3. Create a file like this in the newly created manifests dir:

apiVersion: v1
kind: ConfigMap
metadata:
  name: gateway-mode-config
  namespace: openshift-network-operator
data:
  mode: "local"
immutable: true

4. openshift-install create cluster --dir=<install folder>

I'll keep this bug open to address any issues with switching from shared gateway back to local gateway mode.
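The same fresh-install procedure as shell commands, as a minimal sketch; <install folder> and the manifest file name are placeholders:

openshift-install create manifests --dir=<install folder>

# Drop the gateway-mode ConfigMap into the generated manifests directory
# (the file name is arbitrary).
cat <<'EOF' > "<install folder>/manifests/gateway-mode-config.yaml"
apiVersion: v1
kind: ConfigMap
metadata:
  name: gateway-mode-config
  namespace: openshift-network-operator
data:
  mode: "local"
immutable: true
EOF

openshift-install create cluster --dir=<install folder>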
*** Bug 2000007 has been marked as a duplicate of this bug. ***
Hello, my customer already has OCP 4.8 deployed (with the default "shared gateway" mode). You only provided instructions on how to enable "local gateway" mode at installation time. Is it possible to change the configuration for a cluster that is already installed? What would the change be? Just adding the missing ConfigMap in the openshift-network-operator namespace?
Hi, we have just hit this problem with the gateway mode defaulting to shared in 4.8. The customer's use case is setting up an external ODF (Ceph) cluster. All nodes are connected to the external Ceph cluster network via a secondary interface, which is not the br-ex one. We then set up a couple of routes in the host network routing table via nmstate configs. That worked fine in 4.6 EUS; however, we noticed this change in 4.8 and needed to switch back to local gateway mode. I think this is a valid use case, since it makes sense to avoid mixing cluster traffic with storage traffic on the same br-ex interface. The customer would like to know whether policies or rules can be implemented directly in the OVN nbdb so that shared gateway mode can be kept (since it is the default configuration and probably more efficient than local gateway mode).
@trozet@redhat.com is it possible to switch the gateway mode of an existing cluster (absent either an upgrade or a fresh install)? We're hitting this on a second cluster that was upgraded from 4.6 -> 4.8 and suddenly stopped working. I can roll back to OpenShiftSDN, but I would be interested in trying to modify the gateway mode instead, if that's possible.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.10.3 security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2022:0056