Description of problem:
Neighbor Advertisement not found on router failover. More information is on the ovs-discuss list:
https://mail.openvswitch.org/pipermail/ovs-discuss/2022-September/052060.html

We have seen this happen in OpenStack deployments using OVN too. This is a missing feature for OpenStack L3 High Availability on IPv6.

Version-Release number of selected component (if applicable):

How reproducible:
100%

Steps to Reproduce:
We observe the following situation:
- We have a logical-router bound to two gateway-chassis
- The logical-router-port connected to the physical network has an IPv4 address assigned

When we now fail over the router (e.g. by changing the priorities), we can observe GARP packets being sent out and updating the MAC address tables on the physical switches.

If the logical-router-port is changed from having an IPv4 address to an IPv6 address, failover does not trigger neighbor advertisements. This means that the MAC address tables on our physical switches are not updated, and manual intervention is required before traffic works again.
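For illustration only, the "failover by changing the priorities" step could be triggered roughly as follows. This is a minimal sketch, not the reporter's actual procedure: the helper name failover_lrp, the chassis names hv1/hv2, the port name lrp0 and the priority values 20/10 are all assumptions. It relies on `ovn-nbctl lrp-set-gateway-chassis PORT CHASSIS [PRIORITY]`, where the chassis with the highest priority is expected to become active.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): swap gateway-chassis
# priorities on a logical router port so the standby chassis takes over.
# Set OVN_NBCTL=echo for a dry run that only prints the commands.
failover_lrp() {
    lrp=$1       # logical router port, e.g. lrp0 (assumed name)
    active=$2    # chassis that is currently active
    standby=$3   # chassis that should take over
    nbctl=${OVN_NBCTL:-ovn-nbctl}

    # Give the former standby the higher priority, then lower the old active.
    $nbctl lrp-set-gateway-chassis "$lrp" "$standby" 20
    $nbctl lrp-set-gateway-chassis "$lrp" "$active" 10
}
```

A dry run such as `OVN_NBCTL=echo failover_lrp lrp0 hv1 hv2` prints the two ovn-nbctl invocations without touching the northbound database.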
It seems that it was already fixed in main and 22.12, so all it takes is to backport two patches:
https://github.com/ovn-org/ovn/commit/9ac5485242183fcfc0ecaed5c1c15a6ba68c420f
https://github.com/ovn-org/ovn/commit/f6edbc02929583e0d66ee7938264b471c7cda397
Hi, how far back do you need this fix backported? Down to 21.12?
Yes. The OVN version consumed by RHOS 17.0 would be ovn-2021, which I think is equivalent to 21.12. Thanks!!
Backports posted: https://patchwork.ozlabs.org/project/ovn/list/?series=339328
ovn22.09 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2166827
ovn22.09 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2166828
ovn22.06 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2166829
ovn22.03 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2166830
ovn22.03 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2166831
ovn-2021 fast-datapath-rhel-8 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2166832
ovn-2021 fast-datapath-rhel-9 clone created at https://bugzilla.redhat.com/show_bug.cgi?id=2166833
reproducer:

systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:127.0.0.1:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=127.0.0.1
systemctl restart ovn-controller

ovn-nbctl ls-add ls0
ovn-nbctl create Logical_Router name=lr0 options:chassis=hv1
ovn-nbctl lrp-add lr0 lrp0 f0:00:00:00:00:01 fd12:3456:789a:1::1/64
ovn-nbctl lsp-add ls0 lrp0-rp -- set Logical_Switch_Port lrp0-rp \
    type=router options:router-port=lrp0 addresses='"f0:00:00:00:00:01"'
ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router"
ovn-nbctl lr-nat-add lr0 dnat_and_snat 3010::119 fd12:3456:789a:1::119

ovs-vsctl \
    -- add-br br-eth0
ovs-vsctl add-port br-eth0 ext1 -- set interface ext1 type=internal
ip netns add ext1
ip link set ext1 netns ext1
ip netns exec ext1 ip link set ext1 up
ip netns exec ext1 ip addr add fd12:3456:789a:1::2/64 dev ext1
ip netns exec ext1 tcpdump -i ext1 -w ext1.pcap -nnle &
sleep 2

ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=physnet1:br-eth0
ovn-nbctl lsp-add ls0 ln_port
ovn-nbctl lsp-set-addresses ln_port unknown
ovn-nbctl lsp-set-type ln_port localnet
ovn-nbctl lsp-set-options ln_port network_name=physnet1
ovn-nbctl --wait=hv sync

while :
do
    if ovs-vsctl show | grep patch-br-int-to-ln_port
    then
        break
    else
        sleep 1
    fi
done

sleep 15
pkill tcpdump
sleep 2
tcpdump -r ext1.pcap -nnle -v

ovn-nbctl --wait=hv lsp-set-options lrp0-rp router-port=lrp0 nat-addresses=""
ip netns exec ext1 tcpdump -i ext1 -w ext1_2.pcap -nnle &
sleep 2
ovn-nbctl --wait=hv lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router"
sleep 5
pkill tcpdump
sleep 2
tcpdump -r ext1_2.pcap -nnle -v

result on ovn22.06-22.06.0-82.el8:

[root@wsfd-advnetlab16 bz2131676]# rpm -qa | grep -E "openvswitch2.17|ovn22.06"
python3-openvswitch2.17-2.17.0-60.el8fdp.x86_64
ovn22.06-central-22.06.0-82.el8fdp.x86_64
openvswitch2.17-2.17.0-60.el8fdp.x86_64
ovn22.06-22.06.0-82.el8fdp.x86_64
ovn22.06-host-22.06.0-82.el8fdp.x86_64

+ tcpdump -r ext1.pcap -nnle -v
reading from file ext1.pcap, link-type EN10MB (Ethernet)
dropped privs to tcpdump
20:49:15.062019 d6:6e:f7:51:ff:bc > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 110: (hlim 1, next-header Options (0) payload length: 56) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:ff00:2 to_ex, 0 source(s)] [gaddr ff02::1:ff51:ffbc to_ex, 0 source(s)]
20:49:15.638078 d6:6e:f7:51:ff:bc > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:ff00:2: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fd12:3456:789a:1::2 unknown option (14), length 8 (1): 0x0000: 7eaf 6668 1258
20:49:15.766061 d6:6e:f7:51:ff:bc > 33:33:ff:51:ff:bc, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:ff51:ffbc: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::d46e:f7ff:fe51:ffbc unknown option (14), length 8 (1): 0x0000: f24c 782e 021b
20:49:16.790107 d6:6e:f7:51:ff:bc > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 110: (hlim 1, next-header Options (0) payload length: 56) fe80::d46e:f7ff:fe51:ffbc > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:ff00:2 to_ex, 0 source(s)] [gaddr ff02::1:ff51:ffbc to_ex, 0 source(s)]
20:49:16.790143 d6:6e:f7:51:ff:bc > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::d46e:f7ff:fe51:ffbc > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16 source link-address option (1), length 8 (1): d6:6e:f7:51:ff:bc
20:49:17.814060 d6:6e:f7:51:ff:bc > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 110: (hlim 1, next-header Options (0) payload length: 56) fe80::d46e:f7ff:fe51:ffbc > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:ff00:2 to_ex, 0 source(s)] [gaddr ff02::1:ff51:ffbc to_ex, 0 source(s)]
20:49:20.502014 d6:6e:f7:51:ff:bc > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::d46e:f7ff:fe51:ffbc > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16 source link-address option (1), length 8 (1): d6:6e:f7:51:ff:bc
20:49:27.926049 d6:6e:f7:51:ff:bc > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::d46e:f7ff:fe51:ffbc > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16 source link-address option (1), length 8 (1): d6:6e:f7:51:ff:bc
<==== no rarp received

+ ovn-nbctl --wait=hv lsp-set-options lrp0-rp router-port=lrp0 nat-addresses=
+ ip netns exec ext1 tcpdump -i ext1 -w ext1_2.pcap -nnle
+ sleep 2
dropped privs to tcpdump
tcpdump: listening on ext1, link-type EN10MB (Ethernet), capture size 262144 bytes
+ ovn-nbctl --wait=hv lsp-set-options lrp0-rp router-port=lrp0 nat-addresses=router
+ sleep 5
+ pkill tcpdump
0 packets captured
0 packets received by filter
0 packets dropped by kernel
+ sleep 2
+ tcpdump -r ext1_2.pcap -nnle -v
reading from file ext1_2.pcap, link-type EN10MB (Ethernet)
dropped privs to tcpdump

result on ovn22.06-22.06.0-115.el8:

[root@wsfd-advnetlab16 bz2131676]# rpm -qa | grep -E "openvswitch2.17|ovn22.06"
python3-openvswitch2.17-2.17.0-60.el8fdp.x86_64
ovn22.06-22.06.0-115.el8fdp.x86_64
openvswitch2.17-2.17.0-60.el8fdp.x86_64
ovn22.06-central-22.06.0-115.el8fdp.x86_64
ovn22.06-host-22.06.0-115.el8fdp.x86_64

+ tcpdump -r ext1.pcap -nnle -v
reading from file ext1.pcap, link-type EN10MB (Ethernet)
dropped privs to tcpdump
20:51:27.526019 26:7c:8e:94:dc:89 > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 110: (hlim 1, next-header Options (0) payload length: 56) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:ff00:2 to_ex, 0 source(s)] [gaddr ff02::1:ff94:dc89 to_ex, 0 source(s)]
20:51:27.534032 26:7c:8e:94:dc:89 > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:ff00:2: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fd12:3456:789a:1::2 unknown option (14), length 8 (1): 0x0000: 8723 1008 5ccc
20:51:28.246079 26:7c:8e:94:dc:89 > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 110: (hlim 1, next-header Options (0) payload length: 56) fe80::247c:8eff:fe94:dc89 > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:ff00:2 to_ex, 0 source(s)] [gaddr ff02::1:ff94:dc89 to_ex, 0 source(s)]
20:51:28.246104 26:7c:8e:94:dc:89 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::247c:8eff:fe94:dc89 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16 source link-address option (1), length 8 (1): 26:7c:8e:94:dc:89
20:51:28.646025 26:7c:8e:94:dc:89 > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 110: (hlim 1, next-header Options (0) payload length: 56) fe80::247c:8eff:fe94:dc89 > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:ff00:2 to_ex, 0 source(s)] [gaddr ff02::1:ff94:dc89 to_ex, 0 source(s)]
20:51:30.321083 f0:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is f0:00:00:00:00:01 tell f0:00:00:00:00:01, length 28
20:51:32.323261 f0:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is f0:00:00:00:00:01 tell f0:00:00:00:00:01, length 28
20:51:32.854125 26:7c:8e:94:dc:89 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::247c:8eff:fe94:dc89 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16 source link-address option (1), length 8 (1): 26:7c:8e:94:dc:89
20:51:36.327413 f0:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is f0:00:00:00:00:01 tell f0:00:00:00:00:01, length 28
20:51:41.558026 26:7c:8e:94:dc:89 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::247c:8eff:fe94:dc89 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16 source link-address option (1), length 8 (1): 26:7c:8e:94:dc:89
<=== rarp is received

+ ovn-nbctl --wait=hv lsp-set-options lrp0-rp router-port=lrp0 nat-addresses=
+ sleep 2
+ ip netns exec ext1 tcpdump -i ext1 -w ext1_2.pcap -nnle
dropped privs to tcpdump
tcpdump: listening on ext1, link-type EN10MB (Ethernet), capture size 262144 bytes
+ ovn-nbctl --wait=hv lsp-set-options lrp0-rp router-port=lrp0 nat-addresses=router
+ sleep 5
+ pkill tcpdump
0 packets captured
0 packets received by filter
0 packets dropped by kernel
+ sleep 2
+ tcpdump -r ext1_2.pcap -nnle -v
reading from file ext1_2.pcap, link-type EN10MB (Ethernet)
dropped privs to tcpdump
<=== no rarp is received when the nat-addresses option is re-added

Ales, is this expected?
Yes, that is expected, due to the backoff and because nothing changed on the interface. I would suggest extending the test with another chassis and simply moving the LR to the second chassis. You should see the RARPs being sent on the second chassis.
updated reproducer based on comment 10:

systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:127.0.0.1:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=127.0.0.1
systemctl restart ovn-controller

ovn-nbctl ls-add ls0
ovn-nbctl create Logical_Router name=lr0 options:chassis=hv1
ovn-nbctl lrp-add lr0 lrp0 f0:00:00:00:00:01 fd12:3456:789a:1::1/64
ovn-nbctl lsp-add ls0 lrp0-rp -- set Logical_Switch_Port lrp0-rp \
    type=router options:router-port=lrp0 addresses='"f0:00:00:00:00:01"'
ovn-nbctl lsp-set-options lrp0-rp router-port=lrp0 nat-addresses="router"
ovn-nbctl lr-nat-add lr0 dnat_and_snat 3010::119 fd12:3456:789a:1::119

ovs-vsctl \
    -- add-br br-eth0
ovs-vsctl add-port br-eth0 ext1 -- set interface ext1 type=internal
ip netns add ext1
ip link set ext1 netns ext1
ip netns exec ext1 ip link set ext1 up
ip netns exec ext1 ip addr add fd12:3456:789a:1::2/64 dev ext1
ip netns exec ext1 tcpdump -i ext1 -w ext1.pcap -nnle &
sleep 2

ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=physnet1:br-eth0
ovn-nbctl lsp-add ls0 ln_port
ovn-nbctl lsp-set-addresses ln_port unknown
ovn-nbctl lsp-set-type ln_port localnet
ovn-nbctl lsp-set-options ln_port network_name=physnet1
ovn-nbctl --wait=hv sync

while :
do
    if ovs-vsctl show | grep patch-br-int-to-ln_port
    then
        break
    else
        sleep 1
    fi
done

sleep 15
pkill tcpdump
sleep 2
tcpdump -r ext1.pcap -nnle -v

ip netns exec ext1 tcpdump -i ext1 -w ext1_2.pcap -nnle &
sleep 2
ovn-nbctl --wait=hv set logical_router lr0 options:chassis=hv0
sleep 2
ovn-nbctl set logical_router lr0 options:chassis=hv1
sleep 5
pkill tcpdump
sleep 2
tcpdump -r ext1_2.pcap -nnle -v

result on ovn22.06-22.06.0-115.el8:

+ tcpdump -r ext1.pcap -nnle -v
reading from file ext1.pcap, link-type EN10MB (Ethernet)
dropped privs to tcpdump
02:25:35.806056 ba:4a:9a:f0:a6:a5 > 33:33:ff:f0:a6:a5, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:fff0:a6a5: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::b84a:9aff:fef0:a6a5 unknown option (14), length 8 (1): 0x0000: afda 2a3b 8a6f
02:25:35.838026 ba:4a:9a:f0:a6:a5 > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 110: (hlim 1, next-header Options (0) payload length: 56) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:ff00:2 to_ex, 0 source(s)] [gaddr ff02::1:fff0:a6a5 to_ex, 0 source(s)]
02:25:36.118074 ba:4a:9a:f0:a6:a5 > 33:33:ff:00:00:02, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:ff00:2: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fd12:3456:789a:1::2 unknown option (14), length 8 (1): 0x0000: 7e1e 26f3 8e2e
02:25:36.822086 ba:4a:9a:f0:a6:a5 > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 110: (hlim 1, next-header Options (0) payload length: 56) fe80::b84a:9aff:fef0:a6a5 > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:ff00:2 to_ex, 0 source(s)] [gaddr ff02::1:fff0:a6a5 to_ex, 0 source(s)]
02:25:36.822113 ba:4a:9a:f0:a6:a5 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::b84a:9aff:fef0:a6a5 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16 source link-address option (1), length 8 (1): ba:4a:9a:f0:a6:a5
02:25:37.462050 ba:4a:9a:f0:a6:a5 > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 110: (hlim 1, next-header Options (0) payload length: 56) fe80::b84a:9aff:fef0:a6a5 > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:ff00:2 to_ex, 0 source(s)] [gaddr ff02::1:fff0:a6a5 to_ex, 0 source(s)]
02:25:38.683833 f0:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is f0:00:00:00:00:01 tell f0:00:00:00:00:01, length 28
02:25:40.686050 f0:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is f0:00:00:00:00:01 tell f0:00:00:00:00:01, length 28
02:25:41.238159 ba:4a:9a:f0:a6:a5 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::b84a:9aff:fef0:a6a5 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16 source link-address option (1), length 8 (1): ba:4a:9a:f0:a6:a5
02:25:44.690155 f0:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is f0:00:00:00:00:01 tell f0:00:00:00:00:01, length 28
02:25:49.430167 ba:4a:9a:f0:a6:a5 > 33:33:00:00:00:02, ethertype IPv6 (0x86dd), length 70: (hlim 255, next-header ICMPv6 (58) payload length: 16) fe80::b84a:9aff:fef0:a6a5 > ff02::2: [icmp6 sum ok] ICMP6, router solicitation, length 16 source link-address option (1), length 8 (1): ba:4a:9a:f0:a6:a5

+ sleep 2
+ ip netns exec ext1 tcpdump -i ext1 -w ext1_2.pcap -nnle
dropped privs to tcpdump
tcpdump: listening on ext1, link-type EN10MB (Ethernet), capture size 262144 bytes
+ ovn-nbctl --wait=hv set logical_router lr0 options:chassis=hv0
+ sleep 2
+ ovn-nbctl set logical_router lr0 options:chassis=hv1
+ sleep 5
+ pkill tcpdump
2 packets captured
2 packets received by filter
0 packets dropped by kernel
+ sleep 2
+ tcpdump -r ext1_2.pcap -nnle -v
reading from file ext1_2.pcap, link-type EN10MB (Ethernet)
dropped privs to tcpdump
02:25:59.835694 f0:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is f0:00:00:00:00:01 tell f0:00:00:00:00:01, length 28
02:26:01.838062 f0:00:00:00:00:01 > ff:ff:ff:ff:ff:ff, ethertype Reverse ARP (0x8035), length 42: Ethernet (len 6), IPv4 (len 4), Reverse Request who-is f0:00:00:00:00:01 tell f0:00:00:00:00:01, length 28
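The check above was done by reading the tcpdump output by eye. A minimal sketch of how it might be automated is shown below; the helper names count_rarp and expect_rarp are hypothetical and not part of the reproducer. It assumes the capture is printed with the -e flag (as in `-nnle`), so that each frame line carries the "ethertype Reverse ARP (0x8035)" text seen in the output above.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original reproducer).

# Count RARP frames in tcpdump text output read from stdin or from files.
count_rarp() {
    # grep -c prints the number of matching lines; it exits non-zero when
    # that number is 0, so "|| true" keeps scripts using "set -e" running.
    grep -c 'ethertype Reverse ARP (0x8035)' "$@" || true
}

# Succeed only if at least one RARP frame is present in the input.
expect_rarp() {
    n=$(count_rarp "$@")
    [ "$n" -gt 0 ]
}
```

For example, `tcpdump -r ext1_2.pcap -nnle 2>/dev/null | expect_rarp || echo "FAIL: no RARP seen"` would flag a capture like the one from ovn22.06-22.06.0-82.el8.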
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (ovn22.06 bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2023:1290