Doc Text:
|
Cause:
In 4.8, the gateway mode for OVN-Kubernetes deployments moved from "local" gateway mode to "shared" gateway mode. These modes affect ingress and egress traffic into a Kubernetes node. With local, all traffic is routed to the host kernel networking stack before egressing/ingressing a cluster. In other words, the host routing table and iptables are evaluated on ingress/egress packets before either entering OVN, or being sent to the next hop outside of the node, respectively. In shared gateway mode, ingress/egress traffic to/from OVN networked pods bypass the kernel routing table and are sent directly out of the NIC via OVS. The advantages of this include better performance, ability to hardware offload, and less SNAT'ing. One of the disadvantages of shared (bypassing the kernel) is that a user's custom routes/iptables rules are not respected for egress traffic.
Consequence:
Users with custom host routing/iptables rules that upgrade to a version 4.8 or newer may have unintended egress routing due to the fact that packets are bypassing the host kernel. There is currently no support to configure the equivalent routes inside of OVN.
Fix:
Allow users to configure the gateway mode so that users who depended on custom routing rules in the kernel to steer traffic will continue to have this desired behavior. Note: migrating gateway modes in a cluster may result in some temporary ingress/egress traffic outage.
Result:
In releases 4.8 and 4.9, a config map needs to be created that will signal to cluster network operator (CNO) to switch the gateway mode:
apiVersion: v1
kind: ConfigMap
metadata:
name: gateway-mode-config
namespace: openshift-network-operator
data:
mode: "local"
immutable: true
For releases > 4.10, a new API is exposed called "routingViaHost". By setting this config in CNO, traffic will be routed to the kernel before egressing the node:
spec:
defaultNetwork:
type: OVNKubernetes
ovnKubernetesConfig:
mtu: 1400
genevePort: 6081
gatewayConfig:
routingViaHost: true
Workaround:
For users who are on 4.8 or 4.9 versions in *shared* gw mode without the fixed versions of 4.8 and 4.9, they may attempt to use the config map previously mentioned. However, after doing this service traffic and egress firewall may no longer function correctly. In order to fix service traffic a route needs to be manually deleted on each node matching on the service CIDR. For example, assume a service CIDR of 10.96.0.0/16:
1. On shared gateway mode, there will be a route towards the br-ex interface like:
[root@ovn-worker ~]# ip route show 10.96.0.0/16
10.96.0.0/16 via 172.18.0.1 dev br-ex mtu 1400
2. In local gateway mode for versions < 4.10, this route needs to point to ovn-k8s-mp0 interface. Manually remove the shared gateway route on each node:
[root@ovn-worker ~]# ip route del 10.96.0.0/16
[root@ovn-worker ~]# ip route show 10.96.0.0/16
[root@ovn-worker ~]#
3. Restart ovnkube-node. ovnkube-node will now re-add the correct route which should point towards ovn-k8s-mp0. For example:
[root@ovn-worker ~]# ip route show 10.96.0.0/16
10.96.0.0/16 via 10.244.1.1 dev ovn-k8s-mp0
Note, there is no current workaround to make egress firewall work. Users should upgrade to a fixed version to ensure egress firewall works.
|