Bug 1959798
| Summary: | DNAT rules for external IP services wrong in ovn-kubernetes | |||
|---|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Pablo Alonso Rodriguez <palonsor> | |
| Component: | Networking | Assignee: | Andrew Stoycos <astoycos> | |
| Networking sub component: | ovn-kubernetes | QA Contact: | Weibin Liang <weliang> | |
| Status: | CLOSED ERRATA | Docs Contact: | ||
| Severity: | medium | |||
| Priority: | medium | CC: | aconstan, astoycos, philipp.dallig, swasthan, zzhao | |
| Version: | 4.6 | Flags: | astoycos: needinfo- | |
| Target Milestone: | --- | |||
| Target Release: | 4.9.0 | |||
| Hardware: | Unspecified | |||
| OS: | Unspecified | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | No Doc Update | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 1988487 | Environment: | ||
| Last Closed: | 2021-10-18 17:31:04 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | ||||
| Bug Blocks: | 1955192, 1988487 | |||
Description
Pablo Alonso Rodriguez
2021-05-12 11:58:53 UTC
Hi Pablo,

I was able to reproduce this, and I'm working on an upstream patch that should recalculate the in-memory list of node IPs for each service event (add/update/delete).

Now, if you follow the above steps:

1. Add an IP to br-ex on one of the nodes, for example `ip addr add 192.168.194.250/24 dev br-ex`
2. Create an external IP service with that IP

the correct rules should be calculated without having to restart the ovnkube-node pod.

I will link the upstream patch when it's complete.

Thanks,
Andrew

Thanks!

Upstream PR -> https://github.com/ovn-org/ovn-kubernetes/pull/2226

Once that merges, we will pull it downstream and backport accordingly.

Tested and verified in 4.9.0-0.nightly-2021-08-07-175228: without restarting the ovnkube-node pod, the correct externalIP svc rule gets updated for the node's secondary interface.
[root@dell-per740-36 ~]# oc debug node/dell-per740-14.rhts.eng.pek2.redhat.com
sh-4.4# ip a show br-ex
13: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether e4:43:4b:5b:6c:28 brd ff:ff:ff:ff:ff:ff
inet 10.73.116.62/23 brd 10.73.117.255 scope global dynamic noprefixroute br-ex
valid_lft 32748sec preferred_lft 32748sec
inet6 2620:52:0:4974:d94e:e1d5:fcfc:fdc7/64 scope global dynamic noprefixroute
valid_lft 2591925sec preferred_lft 604725sec
inet6 fe80::c13e:c5ff:e5d3:8193/64 scope link noprefixroute
valid_lft forever preferred_lft forever
sh-4.4# iptables -n -v -t nat -L OVN-KUBE-EXTERNALIP
Chain OVN-KUBE-EXTERNALIP (2 references)
pkts bytes target prot opt in out source destination
sh-4.4# ip addr add 10.73.116.64/23 dev br-ex
sh-4.4# ip a s br-ex
13: br-ex: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
link/ether e4:43:4b:5b:6c:28 brd ff:ff:ff:ff:ff:ff
inet 10.73.116.62/23 brd 10.73.117.255 scope global dynamic noprefixroute br-ex
valid_lft 32419sec preferred_lft 32419sec
inet 10.73.116.64/23 scope global secondary br-ex
valid_lft forever preferred_lft forever
inet6 2620:52:0:4974:d94e:e1d5:fcfc:fdc7/64 scope global dynamic noprefixroute
valid_lft 2591973sec preferred_lft 604773sec
inet6 fe80::c13e:c5ff:e5d3:8193/64 scope link noprefixroute
valid_lft forever preferred_lft forever
sh-4.4#
[root@dell-per740-36 ~]# curl -s https://raw.githubusercontent.com/weliang1/Openshift_Networking/master/Features/FC/externalip-svc.yaml | sed s/10.0.76.163/10.73.116.64/g | oc create -f -
service/externalip-svc created
[root@dell-per740-36 ~]# oc create -f https://raw.githubusercontent.com/weliang1/Openshift_Networking/master/Features/FC/externalip-pod.yaml
deployment.apps/externalip-pod created
[root@dell-per740-36 ~]# oc rsh externalip-pod-57f9dd7cfb-967pw
error: unable to upgrade connection: container not found ("externalip-pod")
[root@dell-per740-36 ~]# oc rsh externalip-pod-57f9dd7cfb-967pw
~ $ curl 10.73.116.64:27018
Customer-Blue Test ExternalIP
[root@dell-per740-36 ~]# oc project openshift-ingress
Now using project "openshift-ingress" on server "https://api.bm2-zzhao.qe.devcluster.openshift.com:6443".
[root@dell-per740-36 ~]# oc get all
NAME READY STATUS RESTARTS AGE
pod/router-default-696c499cdf-85w9g 1/1 Running 0 3h26m
pod/router-default-696c499cdf-jtfgp 1/1 Running 0 3h26m
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/router-internal-default ClusterIP 172.30.92.96 <none> 80/TCP,443/TCP,1936/TCP 3h26m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/router-default 2/2 2 2 3h26m
NAME DESIRED CURRENT READY AGE
replicaset.apps/router-default-696c499cdf 2 2 2 3h26m
[root@dell-per740-36 ~]# oc rsh router-default-696c499cdf-85w9g
sh-4.4$ curl 10.73.116.64:27018
Customer-Blue Test ExternalIP
[root@dell-per740-36 ~]# curl 10.73.116.64:27018
Customer-Blue Test ExternalIP
[root@dell-per740-36 ~]# oc debug node/dell-per740-14.rhts.eng.pek2.redhat.com
Starting pod/dell-per740-14rhtsengpek2redhatcom-debug ...
To use host binaries, run `chroot /host`
Pod IP: 10.73.116.62
If you don't see a command prompt, try pressing enter.
sh-4.4# chroot /host
sh-4.4# iptables -n -v -t nat -L OVN-KUBE-EXTERNALIP
Chain OVN-KUBE-EXTERNALIP (2 references)
pkts bytes target prot opt in out source destination
0 0 DNAT tcp -- * * 0.0.0.0/0 10.73.116.64 tcp dpt:27018 to:172.30.33.188:27018
sh-4.4#
[root@dell-per740-36 ~]# oc get clusterversion
NAME VERSION AVAILABLE PROGRESSING SINCE STATUS
version 4.9.0-0.nightly-2021-08-07-175228 True False 173m Cluster version is 4.9.0-0.nightly-2021-08-07-175228
[root@dell-per740-36 ~]#
[root@dell-per740-36 ~]# ping -c 2 10.73.116.63
PING 10.73.116.63 (10.73.116.63) 56(84) bytes of data.
64 bytes from 10.73.116.63: icmp_seq=1 ttl=64 time=0.572 ms
64 bytes from 10.73.116.63: icmp_seq=2 ttl=64 time=0.525 ms
--- 10.73.116.63 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1003ms
rtt min/avg/max/mdev = 0.525/0.548/0.572/0.033 ms
[root@dell-per740-36 ~]#
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759