Description of problem:
[OVN] EgressIP doesn't take effect when the egress node and the pod's node are the same one.

Version-Release number of selected component (if applicable):
4.8.0-0.nightly-2021-05-06-210840

How reproducible:

Steps to Reproduce:
1. Label two nodes as egressip nodes (see the example commands under Additional info below).

2. Create one EgressIP object:

oc get egressip
NAME       EGRESSIPS        ASSIGNED NODE                       ASSIGNED EGRESSIPS
egressip   172.31.249.182   huirwang-0507a-t7n9q-worker-46cpf   172.31.249.182

oc get egressip -o yaml
apiVersion: v1
items:
- apiVersion: k8s.ovn.org/v1
  kind: EgressIP
  metadata:
    creationTimestamp: "2021-05-07T08:38:46Z"
    generation: 2
    managedFields:
    - apiVersion: k8s.ovn.org/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          f:podSelector: {}
        f:status:
          .: {}
          f:items: {}
      manager: huirwang-0507a-t7n9q-master-1
      operation: Update
      time: "2021-05-07T08:38:46Z"
    - apiVersion: k8s.ovn.org/v1
      fieldsType: FieldsV1
      fieldsV1:
        f:spec:
          .: {}
          f:egressIPs: {}
          f:namespaceSelector:
            .: {}
            f:matchLabels:
              .: {}
              f:org: {}
      manager: kubectl-create
      operation: Update
      time: "2021-05-07T08:38:46Z"
    name: egressip
    resourceVersion: "167730"
    uid: 016a95fd-21d9-4c69-b98e-122da8c82505
  spec:
    egressIPs:
    - 172.31.249.182
    namespaceSelector:
      matchLabels:
        org: qe
    podSelector: {}
  status:
    items:
    - egressIP: 172.31.249.182
      node: huirwang-0507a-t7n9q-worker-46cpf
kind: List
metadata:
  resourceVersion: ""
  selfLink: ""

3. Create a project 0i9xy and a pod in it. Label the project with org=qe:

oc get ns 0i9xy --show-labels
NAME    STATUS   AGE   LABELS
0i9xy   Active   15m   kubernetes.io/metadata.name=0i9xy,org=qe

4. Check the source IP seen for traffic from project 0i9xy:

while true; do date; curl -s --connect-timeout 5 172.31.249.80:9095; sleep 2; done
Fri May 7 08:41:13 UTC 2021
172.31.249.43Fri May 7 08:41:15 UTC 2021
172.31.249.43Fri May 7 08:41:17 UTC 2021
172.31.249.43Fri May 7 08:41:19 UTC 2021
172.31.249.43Fri May 7 08:41:21 UTC 2021
172.31.249.43Fri May 7 08:41:23 UTC 2021
172.31.249.43Fri May 7 08:41:25 UTC 2021
172.31.249.43Fri May 7 08:41:27 UTC 2021
........
172.31.249.43Fri May 7 08:53:03 UTC 2021
172.31.249.43Fri May 7 08:53:05 UTC 2021
172.31.249.43Fri May 7 08:53:07 UTC 2021
172.31.249.43Fri May 7 08:53:09 UTC 2021
.......
172.31.249.43Fri May 7 09:04:34 UTC 2021
172.31.249.43Fri May 7 09:04:36 UTC 2021

oc get nodes -o wide
NAME                                STATUS   ROLES    AGE     VERSION                INTERNAL-IP      EXTERNAL-IP      OS-IMAGE                                                        KERNEL-VERSION          CONTAINER-RUNTIME
huirwang-0507a-t7n9q-master-0       Ready    master   6h56m   v1.21.0-rc.0+291e731   172.31.249.18    172.31.249.18    Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8
huirwang-0507a-t7n9q-master-1       Ready    master   6h57m   v1.21.0-rc.0+291e731   172.31.249.193   172.31.249.193   Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8
huirwang-0507a-t7n9q-master-2       Ready    master   6h56m   v1.21.0-rc.0+291e731   172.31.249.126   172.31.249.126   Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8
huirwang-0507a-t7n9q-worker-46cpf   Ready    worker   6h46m   v1.21.0-rc.0+291e731   172.31.249.43    172.31.249.43    Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8
huirwang-0507a-t7n9q-worker-lcflj   Ready    worker   6h46m   v1.21.0-rc.0+291e731   172.31.249.87    172.31.249.87    Red Hat Enterprise Linux CoreOS 48.84.202105061618-0 (Ootpa)   4.18.0-293.el8.x86_64   cri-o://1.21.0-89.rhaos4.8.git3f6209a.el8

Actual results:
The EgressIP didn't take effect; outbound traffic is using the node's IP (172.31.249.43).
Expected results:
Outbound traffic should use the EgressIP as configured.

Additional info:
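For reference, a minimal sketch of how the nodes and namespace were labeled and how the EgressIP object was created for the steps above. It assumes the standard OVN-Kubernetes egress node label (k8s.ovn.org/egress-assignable) and reuses the node and namespace names from this cluster, so adjust to your environment:

# Label two nodes as assignable for egress IPs
oc label node huirwang-0507a-t7n9q-worker-46cpf k8s.ovn.org/egress-assignable=""
oc label node huirwang-0507a-t7n9q-worker-lcflj k8s.ovn.org/egress-assignable=""

# Create the EgressIP object that selects namespaces labeled org=qe
cat <<EOF | oc create -f -
apiVersion: k8s.ovn.org/v1
kind: EgressIP
metadata:
  name: egressip
spec:
  egressIPs:
  - 172.31.249.182
  namespaceSelector:
    matchLabels:
      org: qe
EOF

# Label the test namespace so the EgressIP applies to it
oc label ns 0i9xy org=qe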
This BZ affects egress IP but is not caused by it. I am seeing the following listed on all nodes' GR:

[root@huirwang-0507a-t7n9q-master-1 ~]# ovn-nbctl -p /ovn-cert/tls.key -c /ovn-cert/tls.crt -C /ovn-ca/ca-bundle.crt --db ssl:172.31.249.126:9641,ssl:172.31.249.18:9641,ssl:172.31.249.193:9641 lr-nat-list GR_huirwang-0507a-t7n9q-worker-46cpf
TYPE             EXTERNAL_IP        EXTERNAL_PORT    LOGICAL_IP          EXTERNAL_MAC         LOGICAL_PORT
snat             172.31.249.182                      10.128.2.92
snat             172.31.249.182                      10.128.2.106
snat             172.31.249.182                      10.128.2.105
snat             172.31.249.43                       10.128.2.5
snat             172.31.249.43                       10.128.2.6
snat             172.31.249.43                       10.128.2.92
snat             172.31.249.43                       10.128.2.4
snat             172.31.249.43                       10.128.2.49
snat             172.31.249.43                       10.128.2.105
snat             172.31.249.43                       10.128.2.3
snat             172.31.249.43                       10.128.2.28
snat             172.31.249.43                       10.128.2.26
snat             172.31.249.43                       10.128.2.77
snat             172.31.249.43                       10.128.2.106

That is incorrect. The only SNATs that should exist on the GR are the egress IP ones. In this case something is assigning a dedicated SNAT for every pod running on the node. This in turn "scrambles" the egress IP configuration and causes OVN to use not the egress-IP-dedicated SNAT but the incorrect one - this is why we're not seeing the egress IP on the server's side.

I've looked at the logs to see what command creates these SNAT objects and I've found the following:

W0507 09:10:16.872586       1 pods.go:334] Failed to get options for port: 0i9xy_test-rc-vc6hh
I0507 09:10:16.872665       1 kube.go:61] Setting annotations map[k8s.ovn.org/pod-networks:{"default":{"ip_addresses":["10.128.2.105/23"],"mac_address":"0a:58:0a:80:02:69","gateway_ips":["10.128.2.1"],"ip_address":"10.128.2.105/23","gateway_ip":"10.128.2.1"}}] on pod 0i9xy/test-rc-vc6hh
W0507 09:10:16.908185       1 pods.go:334] Failed to get options for port: 0i9xy_test-rc-7tcn6
I0507 09:10:16.908341       1 kube.go:61] Setting annotations map[k8s.ovn.org/pod-networks:{"default":{"ip_addresses":["10.128.2.106/23"],"mac_address":"0a:58:0a:80:02:6a","gateway_ips":["10.128.2.1"],"ip_address":"10.128.2.106/23","gateway_ip":"10.128.2.1"}}] on pod 0i9xy/test-rc-7tcn6
2021-05-07T09:10:16.940Z|04306|nbctl|INFO|Running command run -- add address_set 7a3c0f32-1bf3-4dd2-b1d9-bd157e72410f addresses "\"10.128.2.105\""
2021-05-07T09:10:16.953Z|04307|nbctl|INFO|Running command run --if-exists -- lr-nat-del GR_huirwang-0507a-t7n9q-worker-46cpf snat 10.128.2.105/32
2021-05-07T09:10:16.957Z|04308|nbctl|INFO|Running command run -- lr-nat-add GR_huirwang-0507a-t7n9q-worker-46cpf snat 172.31.249.43 10.128.2.105/32
2021-05-07T09:10:16.969Z|04309|nbctl|INFO|Running command run -- add address_set 7a3c0f32-1bf3-4dd2-b1d9-bd157e72410f addresses "\"10.128.2.106\""
I0507 09:10:16.977094       1 pods.go:289] [0i9xy/test-rc-vc6hh] addLogicalPort took 104.719222ms

It seems that at some point during pod setup in addLogicalPort we start setting up a SNAT for the pod on the GR, and this is happening for every pod on every node. The reason is that commit https://github.com/openshift/cluster-network-operator/commit/14a5e41bb9b8fedaec0037b8551be4888e0ac821 added --disable-snat-multiple-gws to ovnkube-master, which now performs that per-pod setup in addLogicalPort. This is also the reason upstream CI did not pick the problem up (we have E2E tests for egress IP): that option is OpenShift specific. I need to talk to the Platform team about this. But this is clearly a regression that breaks egress IP for OpenShift, and I am thus setting the blocker+ flag.
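As a hedged cross-check of the above, the loop below lists, per gateway router, every SNAT whose external IP is not the configured egress IP; on a correctly configured cluster it should print nothing besides the router names. It is only a sketch and assumes it is run from the same context as the lr-nat-list command above, so that the /ovn-cert paths and NB addresses resolve:

# Wrap ovn-nbctl with the same TLS and NB DB options used above.
NBCTL='ovn-nbctl -p /ovn-cert/tls.key -c /ovn-cert/tls.crt -C /ovn-ca/ca-bundle.crt --db ssl:172.31.249.126:9641,ssl:172.31.249.18:9641,ssl:172.31.249.193:9641'

# Walk every gateway router (GR_<node>) and print SNATs that do not use the egress IP 172.31.249.182.
for gr in $($NBCTL --bare --columns=name list logical_router | grep '^GR_'); do
  echo "== $gr =="
  $NBCTL lr-nat-list "$gr" | awk '$1 == "snat" && $2 != "172.31.249.182"'
done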
Moreover, those pod annotations seem completely off to me: "ip_addresses":["10.128.2.106/23"] is not correct.

Another (cosmetic) problem: even though that flag is provided to ovnkube-master, the logged "parsed config" does not indicate that it has been set correctly:

+ exec /usr/bin/ovnkube --init-master huirwang-0507a-t7n9q-master-1 --config-file=/run/ovnkube-config/ovnkube.conf --ovn-empty-lb-events --loglevel 4 --metrics-bind-address 127.0.0.1:29102 --gateway-mode shared --gateway-interface br-ex --sb-address ssl:172.31.249.126:9642,ssl:172.31.249.18:9642,ssl:172.31.249.193:9642 --sb-client-privkey /ovn-cert/tls.key --sb-client-cert /ovn-cert/tls.crt --sb-client-cacert /ovn-ca/ca-bundle.crt --sb-cert-common-name ovn --nb-address ssl:172.31.249.126:9641,ssl:172.31.249.18:9641,ssl:172.31.249.193:9641 --nb-client-privkey /ovn-cert/tls.key --nb-client-cert /ovn-cert/tls.crt --nb-client-cacert /ovn-ca/ca-bundle.crt --nbctl-daemon-mode --nb-cert-common-name ovn --enable-multicast --disable-snat-multiple-gws --acl-logging-rate-limit 20

I0507 05:34:25.278043       1 config.go:1437] Parsed config file /run/ovnkube-config/ovnkube.conf
I0507 05:34:25.278112       1 config.go:1438] Parsed config: {Default:{MTU:1400 ConntrackZone:64000 EncapType:geneve EncapIP: EncapPort:6081 InactivityProbe:100000 OpenFlowProbe:180 RawClusterSubnets:10.128.0.0/14/23 ClusterSubnets:[]} Logging:{File: CNIFile: Level:4 LogFileMaxSize:100 LogFileMaxBackups:5 LogFileMaxAge:5 ACLLoggingRateLimit:20} Monitoring:{RawNetFlowTargets: RawSFlowTargets: RawIPFIXTargets: NetFlowTargets:[] SFlowTargets:[] IPFIXTargets:[]} CNI:{ConfDir:/etc/cni/net.d Plugin:ovn-k8s-cni-overlay} OVNKubernetesFeature:{EnableEgressIP:true} Kubernetes:{Kubeconfig: CACert: APIServer:https://api-int.huirwang-0507a.qe.devcluster.openshift.com:6443 Token: CompatServiceCIDR: RawServiceCIDRs:172.30.0.0/16 ServiceCIDRs:[] OVNConfigNamespace:openshift-ovn-kubernetes MetricsBindAddress: OVNMetricsBindAddress: MetricsEnablePprof:false OVNEmptyLbEvents:false PodIP: RawNoHostSubnetNodes: NoHostSubnetNodes:nil HostNetworkNamespace:openshift-host-network} OvnNorth:{Address: PrivKey: Cert: CACert: CertCommonName: Scheme: northbound:false exec:<nil>} OvnSouth:{Address: PrivKey: Cert: CACert: CertCommonName: Scheme: northbound:false exec:<nil>} Gateway:{Mode:local Interface: NextHop: VLANID:0 NodeportEnable:true DisableSNATMultipleGWs:false V4JoinSubnet:100.64.0.0/16 V6JoinSubnet:fd98::/64} MasterHA:{ElectionLeaseDuration:60 ElectionRenewDeadline:30 ElectionRetryPeriod:20} HybridOverlay:{Enabled:false RawClusterSubnets: ClusterSubnets:[] VXLANPort:4789} OvnKubeNode:{Mode:full}}

Specifically: DisableSNATMultipleGWs:false, which incorrectly indicates that the flag was not provided. The flag was provided, so that should be DisableSNATMultipleGWs:true.
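A possible way to cross-check the cosmetic logging issue (a sketch, assuming the ovnkube-master DaemonSet and the app=ovnkube-master pod label that the operator normally applies) is to compare the rendered container command line with what the binary logs after parsing:

# 1. Confirm the flag really is on the container command line.
oc -n openshift-ovn-kubernetes get ds ovnkube-master -o yaml | grep -n 'disable-snat-multiple-gws'

# 2. Compare with what each ovnkube-master logs after merging CLI flags and the config file;
#    on this build it still prints DisableSNATMultipleGWs:false.
for pod in $(oc -n openshift-ovn-kubernetes get pods -l app=ovnkube-master -o name); do
  echo "== $pod =="
  oc -n openshift-ovn-kubernetes logs "$pod" -c ovnkube-master | grep -m1 'DisableSNATMultipleGWs'
done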
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438