Summary: | DNAT rules for external IP services wrong in ovn-kubernetes | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Pablo Alonso Rodriguez <palonsor> | |
Component: | Networking | Assignee: | Andrew Stoycos <astoycos> | |
Networking sub component: | ovn-kubernetes | QA Contact: | Weibin Liang <weliang> | |
Status: | CLOSED ERRATA | Docs Contact: | ||
Severity: | medium | |||
Priority: | medium | CC: | aconstan, astoycos, philipp.dallig, swasthan, zzhao | |
Version: | 4.6 | Flags: | astoycos: needinfo- |
Target Milestone: | --- | |||
Target Release: | 4.9.0 | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | ||||
Fixed In Version: | Doc Type: | No Doc Update | ||
Doc Text: | Story Points: | --- | ||
Clone Of: | ||||
: | 1988487 | Environment: | ||
Last Closed: | 2021-10-18 17:31:04 UTC | Type: | Bug | |
Regression: | --- | Mount Type: | --- | |
Documentation: | --- | CRM: | ||
Verified Versions: | Category: | --- | ||
oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
Cloudforms Team: | --- | Target Upstream Version: | ||
Bug Depends On: | ||||
Bug Blocks: | 1955192, 1988487 |
Description
Pablo Alonso Rodriguez
2021-05-12 11:58:53 UTC
Hi Pablo,

I was able to reproduce this, and I'm working on an upstream patch that should recalculate the in-memory list of node IPs for each service event (add/update/delete). Now if you follow the above steps:

1. Add an IP to br-ex in one of the nodes, like `ip addr add 192.168.194.250/24 dev br-ex`
2. Create an external IP service with that IP

the correct rules should be calculated without having to restart the ovnkube-node pod. I will link the upstream patch when it's complete.

Thanks,
Andrew

Thanks!

Upstream PR -> https://github.com/ovn-org/ovn-kubernetes/pull/2226

Once that merges we will pull it downstream and backport accordingly.

Tested and verified in 4.9.0-0.nightly-2021-08-07-175228: without restarting the ovnkube-node pod, the correct externalIP svc rule gets updated for the node's secondary interface address.

[root@dell-per740-36 ~]# oc debug node/dell-per740-14.rhts.eng.pek2.redhat.com
sh-4.4# ip a show br-ex
13: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether e4:43:4b:5b:6c:28 brd ff:ff:ff:ff:ff:ff
    inet 10.73.116.62/23 brd 10.73.117.255 scope global dynamic noprefixroute br-ex
       valid_lft 32748sec preferred_lft 32748sec
    inet6 2620:52:0:4974:d94e:e1d5:fcfc:fdc7/64 scope global dynamic noprefixroute
       valid_lft 2591925sec preferred_lft 604725sec
    inet6 fe80::c13e:c5ff:e5d3:8193/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
sh-4.4# iptables -n -v -t nat -L OVN-KUBE-EXTERNALIP
Chain OVN-KUBE-EXTERNALIP (2 references)
 pkts bytes target prot opt in out source destination
sh-4.4# ip addr add 10.73.116.64/23 dev br-ex
sh-4.4# ip a s br-ex
13: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether e4:43:4b:5b:6c:28 brd ff:ff:ff:ff:ff:ff
    inet 10.73.116.62/23 brd 10.73.117.255 scope global dynamic noprefixroute br-ex
       valid_lft 32419sec preferred_lft 32419sec
    inet 10.73.116.64/23 scope global secondary br-ex
       valid_lft forever preferred_lft forever
    inet6 2620:52:0:4974:d94e:e1d5:fcfc:fdc7/64 scope global dynamic noprefixroute
       valid_lft 2591973sec preferred_lft 604773sec
    inet6 fe80::c13e:c5ff:e5d3:8193/64 scope link noprefixroute
       valid_lft forever preferred_lft forever
sh-4.4#
[root@dell-per740-36 ~]# curl -s https://raw.githubusercontent.com/weliang1/Openshift_Networking/master/Features/FC/externalip-svc.yaml | sed s/10.0.76.163/10.73.116.64/g | oc create -f -
service/externalip-svc created
[root@dell-per740-36 ~]# oc create -f https://raw.githubusercontent.com/weliang1/Openshift_Networking/master/Features/FC/externalip-pod.yaml
deployment.apps/externalip-pod created
[root@dell-per740-36 ~]# oc rsh externalip-pod-57f9dd7cfb-967pw
error: unable to upgrade connection: container not found ("externalip-pod")
[root@dell-per740-36 ~]# oc rsh externalip-pod-57f9dd7cfb-967pw
~ $ curl 10.73.116.64:27018
Customer-Blue Test ExternalIP
[root@dell-per740-36 ~]# oc project openshift-ingress
Now using project "openshift-ingress" on server "https://api.bm2-zzhao.qe.devcluster.openshift.com:6443".
[root@dell-per740-36 ~]# oc get all
NAME                                  READY   STATUS    RESTARTS   AGE
pod/router-default-696c499cdf-85w9g   1/1     Running   0          3h26m
pod/router-default-696c499cdf-jtfgp   1/1     Running   0          3h26m
NAME                              TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)                   AGE
service/router-internal-default   ClusterIP   172.30.92.96   <none>        80/TCP,443/TCP,1936/TCP   3h26m
NAME                             READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/router-default   2/2     2            2           3h26m
NAME                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/router-default-696c499cdf   2         2         2       3h26m
[root@dell-per740-36 ~]# oc rsh router-default-696c499cdf-85w9g
sh-4.4$ curl 10.73.116.64:27018
Customer-Blue Test ExternalIP
[root@dell-per740-36 ~]# curl 10.73.116.64:27018
Customer-Blue Test ExternalIP
[root@dell-per740-36 ~]# oc debug node/dell-per740-14.rhts.eng.pek2.redhat.com
Starting pod/dell-per740-14rhtsengpek2redhatcom-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.73.116.62
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# iptables -n -v -t nat -L OVN-KUBE-EXTERNALIP
Chain OVN-KUBE-EXTERNALIP (2 references)
 pkts bytes target prot opt in out source destination
    0     0 DNAT tcp -- * * 0.0.0.0/0 10.73.116.64 tcp dpt:27018 to:172.30.33.188:27018
sh-4.4#
[root@dell-per740-36 ~]# oc get clusterversion
NAME      VERSION                             AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.9.0-0.nightly-2021-08-07-175228   True        False         173m    Cluster version is 4.9.0-0.nightly-2021-08-07-175228
[root@dell-per740-36 ~]#
[root@dell-per740-36 ~]# ping -c 2 10.73.116.63
PING 10.73.116.63 (10.73.116.63) 56(84) bytes of data.
64 bytes from 10.73.116.63: icmp_seq=1 ttl=64 time=0.572 ms
64 bytes from 10.73.116.63: icmp_seq=2 ttl=64 time=0.525 ms
--- 10.73.116.63 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1003ms
rtt min/avg/max/mdev = 0.525/0.548/0.572/0.033 ms
[root@dell-per740-36 ~]#

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759
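
For readers following the thread, the idea in Andrew's comment (recalculating the list of node IPs on every service add/update/delete instead of relying on a list captured once) can be illustrated with a small sketch. This is not the ovn-kubernetes code, which is written in Go and is not shown in this report. The class `ExternalIPMatcher`, its method `on_service_event`, and the simulated `br_ex_addrs` list are all hypothetical names used only for illustration, and what the real patch does with the refreshed list is an assumption here.

```python
import ipaddress
from typing import Callable, Iterable, List


class ExternalIPMatcher:
    """Hypothetical helper, NOT part of ovn-kubernetes.

    It takes a callable that lists the node's current IPs and calls it
    again on every service event, so addresses added after startup are seen.
    """

    def __init__(self, list_node_ips: Callable[[], Iterable[str]]) -> None:
        self._list_node_ips = list_node_ips

    def on_service_event(self, external_ips: Iterable[str]) -> List[str]:
        # Refresh the node IP set on each event (assumed behaviour of the fix).
        node_ips = {ipaddress.ip_address(a) for a in self._list_node_ips()}
        # Return the external IPs that are currently assigned to this node.
        return sorted(
            ip for ip in external_ips if ipaddress.ip_address(ip) in node_ips
        )


# Simulated addresses on br-ex, taken from the verification transcript.
br_ex_addrs = ["10.73.116.62"]
matcher = ExternalIPMatcher(lambda: list(br_ex_addrs))

# Service event before the secondary address exists on the node.
before = matcher.on_service_event(["10.73.116.64"])

# Equivalent of `ip addr add 10.73.116.64/23 dev br-ex`, with no restart.
br_ex_addrs.append("10.73.116.64")
after = matcher.on_service_event(["10.73.116.64"])

print("before:", before)
print("after:", after)
```

Because the address list is fetched inside `on_service_event`, the second call sees the newly added address without the matcher being recreated, which mirrors the "no ovnkube-node restart" behaviour verified above.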