Bug 2130045
| Summary: | the first ping from lsp to external through snat would fail in gateway router mode | ||
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | Jianlin Shi <jishi> |
| Component: | ovn22.06 | Assignee: | xsimonar |
| Status: | CLOSED ERRATA | QA Contact: | Jianlin Shi <jishi> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | ||
| Version: | FDP 21.I | CC: | ctrautma, dceara, jiji, xsimonar |
| Target Milestone: | --- | Keywords: | Regression |
| Target Release: | --- | ||
| Hardware: | Unspecified | ||
| OS: | Unspecified | ||
| Whiteboard: | |||
| Fixed In Version: | ovn22.06-22.06.0-59.el8fdp | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-11-03 00:30:59 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | |||
Git bisect points to:
https://github.com/ovn-org/ovn/commit/b89b96e1a16134c0aa8cd6513d920d49ff8c6cda

commit b89b96e1a16134c0aa8cd6513d920d49ff8c6cda
Author: Xavier Simonart <xsimonar>
Date: Mon Aug 29 05:27:20 2022 -0400

    controller: fix potential segmentation violation when removing ports

    If a logical switch port is added and connected to a logical router port (through options: router-port) before the router port is created, then this might cause further issues such as segmentation violation when the switch and router ports are deleted.

    Signed-off-by: Xavier Simonart <xsimonar>
    Signed-off-by: Han Zhou <hzhou>
    (cherry picked from commit 04292cc2dc2c3823b0cf86612e50ad0023bcb73f)

 controller/local_data.c | 38 +++++++++------------
 controller/pinctrl.c    | 16 +++++++--
 tests/ovn.at            | 89 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 117 insertions(+), 26 deletions(-)

When reproducing locally with the steps mentioned in the bug description, it's important to ensure that the SB MAC_Binding table is cleared between runs, e.g.:
ovn-sbctl --all destroy mac_binding

Fix posted upstream for review:
https://patchwork.ozlabs.org/project/openvswitch/patch/20220927151616.2490575-1-xsimonar@redhat.com/

ovn22.09 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2130752
ovn22.09 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2130753
ovn22.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2130754

The issue is fixed on ovn22.06-22.06.0-59.el8:
+ ovn-nbctl --wait=hv sync
+ sleep 2
+ ip netns exec ls1p1 ping 172.16.1.1 -c 1
PING 172.16.1.1 (172.16.1.1) 56(84) bytes of data.
64 bytes from 172.16.1.1: icmp_seq=1 ttl=63 time=6.41 ms
--- 172.16.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 6.411/6.411/6.411/0.000 ms
+ ip netns exec ls1p1 ping 172.16.1.1 -c 1
PING 172.16.1.1 (172.16.1.1) 56(84) bytes of data.
64 bytes from 172.16.1.1: icmp_seq=1 ttl=63 time=0.849 ms
--- 172.16.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.849/0.849/0.849/0.000 ms
+ ip netns exec ls1p1 ping6 1711::1 -c 1
PING 1711::1(1711::1) 56 data bytes
64 bytes from 1711::1: icmp_seq=1 ttl=63 time=7.42 ms
--- 1711::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 7.420/7.420/7.420/0.000 ms
+ ip netns exec ls1p1 ping6 1711::1 -c 1
PING 1711::1(1711::1) 56 data bytes
64 bytes from 1711::1: icmp_seq=1 ttl=63 time=0.886 ms
--- 1711::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.886/0.886/0.886/0.000 ms
[root@dell-per740-12 bz2130045]# rpm -qa | grep -e "openvswitch2.17|ovn22.06"
[root@dell-per740-12 bz2130045]# rpm -qa | grep -E "openvswitch2.17|ovn22.06"
ovn22.06-22.06.0-59.el8fdp.x86_64
ovn22.06-host-22.06.0-59.el8fdp.x86_64
openvswitch2.17-2.17.0-52.el8fdp.x86_64
ovn22.06-central-22.06.0-59.el8fdp.x86_64
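
The package list above can be compared against the Fixed In Version field (ovn22.06-22.06.0-59.el8fdp) with a few lines of shell. This is only a sketch: `release_of`, `has_fix` and `FIXED_RELEASE` are hypothetical names, and the comparison assumes the el8fdp ovn22.06 stream, where the release is the number that follows the last dash. A release below 59 only means the fix is absent; it does not mean the build is affected (the report states that -27 did not show the issue).

```shell
# Hypothetical helpers, not part of any OVN tooling.
# FIXED_RELEASE comes from "Fixed In Version: ovn22.06-22.06.0-59.el8fdp".
FIXED_RELEASE=59

# Print the numeric release of a package name such as
# ovn22.06-host-22.06.0-59.el8fdp.x86_64 (prints 59).
release_of() {
    rel="${1##*-}"       # keep what follows the last dash: 59.el8fdp.x86_64
    echo "${rel%%.*}"    # keep what precedes the first dot: 59
}

# Return 0 if the ovn22.06 package carries the fix, 1 if it is older,
# 2 if the name is not an ovn22.06 package at all.
has_fix() {
    case "$1" in
        ovn22.06-*) [ "$(release_of "$1")" -ge "$FIXED_RELEASE" ] ;;
        *) return 2 ;;
    esac
}
```

For example, `has_fix ovn22.06-22.06.0-57.el8fdp.x86_64` returns 1, while the same call with release 59 or 64 returns 0.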
Verified on ovn22.06-64.el8:
+ ovn-nbctl --wait=hv sync
+ sleep 2
+ ip netns exec ls1p1 ping 172.16.1.1 -c 1
PING 172.16.1.1 (172.16.1.1) 56(84) bytes of data.
64 bytes from 172.16.1.1: icmp_seq=1 ttl=63 time=5.55 ms
--- 172.16.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.553/5.553/5.553/0.000 ms
+ ip netns exec ls1p1 ping 172.16.1.1 -c 1
PING 172.16.1.1 (172.16.1.1) 56(84) bytes of data.
64 bytes from 172.16.1.1: icmp_seq=1 ttl=63 time=1.31 ms
--- 172.16.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.311/1.311/1.311/0.000 ms
+ ip netns exec ls1p1 ping6 1711::1 -c 1
PING 1711::1(1711::1) 56 data bytes
64 bytes from 1711::1: icmp_seq=1 ttl=63 time=8.35 ms
--- 1711::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 8.352/8.352/8.352/0.000 ms
+ ip netns exec ls1p1 ping6 1711::1 -c 1
PING 1711::1(1711::1) 56 data bytes
64 bytes from 1711::1: icmp_seq=1 ttl=63 time=1.42 ms
--- 1711::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.415/1.415/1.415/0.000 ms
[root@dell-per740-12 bz2130046]# rpm -qa | grep -E "openvswitch2.17|ovn22.06"
ovn22.06-central-22.06.0-64.el8fdp.x86_64
ovn22.06-22.06.0-64.el8fdp.x86_64
ovn22.06-host-22.06.0-64.el8fdp.x86_64
openvswitch2.17-2.17.0-58.el8fdp.x86_64

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (ovn22.06), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:7395
Description of problem:
The first ping from an lsp to external through snat fails in gateway router mode.

Version-Release number of selected component (if applicable):
ovn22.06-22.06.0-57.el8

How reproducible:
Always

Steps to Reproduce:
systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:20.0.187.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.187.25
systemctl restart ovn-controller
ovs-vsctl del-br br-test
ovs-vsctl add-br br-test
ip link set br-test up
ovs-vsctl set open . external-ids:ovn-bridge-mappings=provider:br-test
ovn-nbctl ls-add ls1
ovn-nbctl lsp-add ls1 ls1p1
ovn-nbctl lsp-set-addresses ls1p1 "00:00:00:01:01:01 192.168.1.1 2001::1"
ovn-nbctl lsp-add ls1 ls1p2
ovn-nbctl lsp-set-addresses ls1p2 "00:00:00:01:01:02 192.168.1.2 2001::2"
ovn-nbctl lr-add lr1
ovn-nbctl lrp-add lr1 lr1-ls1 00:00:00:00:00:01 192.168.1.254/24 2001::a/64
ovn-nbctl lsp-add ls1 ls1-lr1
ovn-nbctl lsp-set-addresses ls1-lr1 "00:00:00:00:00:01 192.168.1.254 2001::a"
ovn-nbctl lsp-set-type ls1-lr1 router
ovn-nbctl lsp-set-options ls1-lr1 router-port=lr1-ls1
ovn-nbctl set logical_router lr1 options:chassis=hv1
ovn-nbctl lrp-add lr1 lr1-pub 00:00:00:00:0f:01 172.16.1.254/24 1711::a/64
ovn-nbctl ls-add pub
ovn-nbctl lsp-add pub pub-lr1
ovn-nbctl lsp-set-type pub-lr1 router
ovn-nbctl lsp-set-options pub-lr1 router-port=lr1-pub
ovn-nbctl lsp-set-addresses pub-lr1 router
ovn-nbctl lsp-add pub ln
ovn-nbctl lsp-set-type ln localnet
ovn-nbctl lsp-set-addresses ln unknown
ovn-nbctl lsp-set-options ln network_name=provider
ovn-nbctl lr-nat-add lr1 snat 172.16.1.10 192.168.1.0/24
ovn-nbctl lr-nat-add lr1 snat 1711::10 2001::/64
ovs-vsctl add-port br-int ls1p1 -- set interface ls1p1 type=internal external_ids:iface-id=ls1p1
ovs-vsctl add-port br-int ls1p2 -- set interface ls1p2 type=internal external_ids:iface-id=ls1p2
ip netns add ls1p1
ip link set ls1p1 netns ls1p1
ip netns exec ls1p1 ip link set ls1p1 address 00:00:00:01:01:01
ip netns exec ls1p1 ip link set ls1p1 up
ip netns exec ls1p1 ip addr add 192.168.1.1/24 dev ls1p1
ip netns exec ls1p1 ip addr add 2001::1/64 dev ls1p1
ip netns exec ls1p1 ip route add default via 192.168.1.254 dev ls1p1
ip netns exec ls1p1 ip -6 route add default via 2001::a dev ls1p1
ip netns add ls1p2
ip link set ls1p2 netns ls1p2
ip netns exec ls1p2 ip link set ls1p2 address 00:00:00:01:01:02
ip netns exec ls1p2 ip link set ls1p2 up
ip netns exec ls1p2 ip addr add 192.168.1.2/24 dev ls1p2
ip netns exec ls1p2 ip addr add 2001::2/64 dev ls1p2
ip netns exec ls1p2 ip route add default via 192.168.1.254 dev ls1p2
ip netns exec ls1p2 ip -6 route add default via 2001::a
ovs-vsctl add-port br-test ext1 -- set interface ext1 type=internal
ip netns add ext1
ip link set ext1 netns ext1
ip netns exec ext1 ip link set ext1 up
ip netns exec ext1 ip addr add 172.16.1.1/24 dev ext1
ip netns exec ext1 ip -6 addr add 1711::1/64 dev ext1
ovn-nbctl --wait=hv sync
sleep 2
ip netns exec ls1p1 ping 172.16.1.1 -c 1
ip netns exec ls1p1 ping 172.16.1.1 -c 1
ip netns exec ls1p1 ping6 1711::1 -c 1
ip netns exec ls1p1 ping6 1711::1 -c 1

Actual results:
+ ip netns exec ls1p1 ping 172.16.1.1 -c 1
PING 172.16.1.1 (172.16.1.1) 56(84) bytes of data.
--- 172.16.1.1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms <=== the first ping failed
+ ip netns exec ls1p1 ping 172.16.1.1 -c 1
PING 172.16.1.1 (172.16.1.1) 56(84) bytes of data.
64 bytes from 172.16.1.1: icmp_seq=1 ttl=63 time=2.65 ms
--- 172.16.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 2.654/2.654/2.654/0.000 ms
+ ip netns exec ls1p1 ping6 1711::1 -c 1
PING 1711::1(1711::1) 56 data bytes
--- 1711::1 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
+ ip netns exec ls1p1 ping6 1711::1 -c 1
PING 1711::1(1711::1) 56 data bytes
64 bytes from 1711::1: icmp_seq=1 ttl=63 time=4.20 ms
--- 1711::1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 4.203/4.203/4.203/0.000 ms

Expected results:
pass

Additional info:
[root@wsfd-advnetlab16 test]# rpm -qa | grep -E "openvswitch2.17|ovn22.06"
openvswitch2.17-2.17.0-50.el8fdp.x86_64
python3-openvswitch2.17-2.17.0-50.el8fdp.x86_64
ovn22.06-central-22.06.0-57.el8fdp.x86_64
ovn22.06-host-22.06.0-57.el8fdp.x86_64
ovn22.06-22.06.0-57.el8fdp.x86_64

From tcpdump, the first ping doesn't reach ext1.
The issue didn't exist on ovn22.06-22.06.0-27.el8.
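
When rerunning the final pings from the steps above, the first-ping check can be wrapped so that every run starts cold, as the comment about clearing the SB MAC_Binding table suggests. The following is a minimal sketch, not part of the original reproducer: `reset_neighbors` and `first_ping` are hypothetical helper names, and the `SBCTL`, `NBCTL` and `SETTLE` variables are an assumption added so the commands can be swapped out for a dry run. The namespace and addresses in the usage note are those from the steps.

```shell
# Hypothetical helpers for rerunning the first-ping check.
# SBCTL, NBCTL and SETTLE are overridable (assumption, for dry runs).
SBCTL="${SBCTL:-ovn-sbctl}"
NBCTL="${NBCTL:-ovn-nbctl}"
SETTLE="${SETTLE:-2}"

# Drop learned MAC bindings and wait for the hypervisor to catch up,
# so the next ping is again a "first" ping.
reset_neighbors() {
    $SBCTL --all destroy mac_binding &&
    $NBCTL --wait=hv sync &&
    sleep "$SETTLE"
}

# Send exactly one echo request and report whether it was answered.
# $1 = network namespace, $2 = ping or ping6, $3 = destination address
first_ping() {
    if ip netns exec "$1" "$2" "$3" -c 1 >/dev/null 2>&1; then
        echo "PASS first $2 to $3"
    else
        echo "FAIL first $2 to $3"
        return 1
    fi
}
```

A rerun would then look like `reset_neighbors && first_ping ls1p1 ping 172.16.1.1`, followed by `reset_neighbors && first_ping ls1p1 ping6 1711::1`. On an affected build both calls are expected to report FAIL, matching the Actual results above.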