Description of problem:

A VM is created with a port on a Geneve tenant network with mtu=8942. All the interfaces on the overcloud nodes are configured with MTU 9000. The VM's IPv4 subnet is connected to the external network through a router, and a FIP is assigned to the VM. 10.218.0.155 is an IP on the external network, attached to the undercloud node.

When the VM sends a ping of size 1476 or lower to 10.218.0.155 (north/south), a reply is received:

# ping -s 1476 -c1 10.218.0.155
--- 10.218.0.155 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.130/1.130/1.130/0.000 ms

When the VM sends a ping of size 1477 or higher to 10.218.0.155 (north/south), no reply is received:

# ping -s 1477 -c1 10.218.0.155
--- 10.218.0.155 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms

TRAFFIC CAPTURED ON THE COMPUTE TAP INTERFACE WHEN S=1476:

10:16:49.772492 fa:16:3e:74:e9:87 > fa:16:3e:68:05:cb, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 27562, offset 0, flags [+], proto ICMP (1), length 1500)
    10.100.0.8 > 10.218.0.155: ICMP echo request, id 4702, seq 1, length 1480
10:16:49.772532 fa:16:3e:74:e9:87 > fa:16:3e:68:05:cb, ethertype IPv4 (0x0800), length 38: (tos 0x0, ttl 64, id 27562, offset 1480, flags [none], proto ICMP (1), length 24)
    10.100.0.8 > 10.218.0.155: ip-proto-1
10:16:49.774502 fa:16:3e:68:05:cb > fa:16:3e:74:e9:87, ethertype IPv4 (0x0800), length 1518: (tos 0x0, ttl 63, id 50313, offset 0, flags [none], proto ICMP (1), length 1504)
    10.218.0.155 > 10.100.0.8: ICMP echo reply, id 4702, seq 1, length 1484

TRAFFIC CAPTURED ON THE COMPUTE TAP INTERFACE WHEN S=1477 (NO REPLY):

12:49:42.088355 fa:16:3e:74:e9:87 > fa:16:3e:68:05:cb, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 15297, offset 0, flags [+], proto ICMP (1), length 1500)
    10.100.0.8 > 10.218.0.155: ICMP echo request, id 4797, seq 1, length 1480
12:49:42.088400 fa:16:3e:74:e9:87 > fa:16:3e:68:05:cb, ethertype IPv4 (0x0800), length 39: (tos 0x0, ttl 64, id 15297, offset 1480, flags [none], proto ICMP (1), length 25)
    10.100.0.8 > 10.218.0.155: ip-proto-1

TRAFFIC CAPTURED ON THE COMPUTE EXTERNAL INTERFACE (ENS5) WHEN S=1477 (THE ECHO REPLY IPV4 PACKET LENGTH IS 1505):

12:48:47.241350 fa:16:3e:2a:2f:2f > 52:54:00:46:ee:1d, ethertype 802.1Q (0x8100), length 1518: vlan 218, p 0, ethertype IPv4, (tos 0x0, ttl 63, id 32320, offset 0, flags [+], proto ICMP (1), length 1500)
    10.218.0.198 > 10.218.0.155: ICMP echo request, id 4796, seq 1, length 1480
12:48:47.241380 fa:16:3e:2a:2f:2f > 52:54:00:46:ee:1d, ethertype 802.1Q (0x8100), length 43: vlan 218, p 0, ethertype IPv4, (tos 0x0, ttl 63, id 32320, offset 1480, flags [none], proto ICMP (1), length 25)
    10.218.0.198 > 10.218.0.155: ip-proto-1
12:48:47.241648 52:54:00:46:ee:1d > fa:16:3e:2a:2f:2f, ethertype 802.1Q (0x8100), length 1523: vlan 218, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 38547, offset 0, flags [none], proto ICMP (1), length 1505)
    10.218.0.155 > 10.218.0.198: ICMP echo reply, id 4796, seq 1, length 1485

So it seems the incoming packet with length greater than 1500 is dropped at br-int. As mentioned above, all the interfaces are configured with mtu=9000, but br-int's MTU cannot be configured by OpenStack, as far as I know:

[root@compute-0 ~]# ip a s br-int
12: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether 3a:f8:0c:5c:3c:8e brd ff:ff:ff:ff:ff:ff
[root@compute-0 ~]# ovs-vsctl get interface br-int mtu
1500

This issue was found in RHOSP 16.2, using downstream CI jobs. More specifically, [1] is the job detecting this issue. Several tempest tests failed due to this issue: [2] is one of them. The issue is reproduced by job [1] on ovn-2021-21.09.0-12, which configures the MTU of all the interfaces on the overcloud nodes to 9000 (see [3]) to test jumbo frames.
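As a side note on the br-int MTU observation: OVS itself does allow requesting an MTU for an internal interface through the Interface table's mtu_request column, which can be useful for manual testing on a node. Whether this is relevant to the drop seen here, and whether OSP director drives it, is a separate question; a sketch:

```shell
# Request a 9000-byte MTU on br-int's internal port (mtu_request is a
# column of the OVS Interface table; OVS applies it to the port).
ovs-vsctl set Interface br-int mtu_request=9000

# Confirm what was requested and what is currently in effect.
ovs-vsctl get Interface br-int mtu_request
ovs-vsctl get Interface br-int mtu
```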
The issue is NOT reproduced on ovn-2021-21.09.0-12 by other RHOSP 16.2 jobs that use the default MTU (1500). And most importantly, the issue is NOT reproduced by job [1] on ovn-2021-21.06.0-29. This means the behavior changed between those OVN releases, so this bug could be a regression. See the results below.

Ping from the VM instance to the undercloud (north/south):

# ping 10.218.0.155 -s 8000 -c1    <- PING SENT FROM THE VM INSTANCE
--- 10.218.0.155 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 5.568/5.568/5.568/0.000 ms

# tcpdump -vne -i eth0 icmp    <- CAPTURED ON THE VM INSTANCE; THE REQUEST IS FRAGMENTED, THE REPLY IS NOT, BUT IT IS NOT DROPPED EITHER
08:30:35.167663 fa:16:3e:4f:8d:92 > fa:16:3e:30:3d:04, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 32536, offset 0, flags [+], proto ICMP (1), length 1500)
    10.100.0.10 > 10.218.0.155: ICMP echo request, id 4543, seq 1, length 1480
08:30:35.167709 fa:16:3e:4f:8d:92 > fa:16:3e:30:3d:04, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 32536, offset 1480, flags [+], proto ICMP (1), length 1500)
    10.100.0.10 > 10.218.0.155: ip-proto-1
08:30:35.167711 fa:16:3e:4f:8d:92 > fa:16:3e:30:3d:04, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 32536, offset 2960, flags [+], proto ICMP (1), length 1500)
    10.100.0.10 > 10.218.0.155: ip-proto-1
08:30:35.167712 fa:16:3e:4f:8d:92 > fa:16:3e:30:3d:04, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 32536, offset 4440, flags [+], proto ICMP (1), length 1500)
    10.100.0.10 > 10.218.0.155: ip-proto-1
08:30:35.167714 fa:16:3e:4f:8d:92 > fa:16:3e:30:3d:04, ethertype IPv4 (0x0800), length 1514: (tos 0x0, ttl 64, id 32536, offset 5920, flags [+], proto ICMP (1), length 1500)
    10.100.0.10 > 10.218.0.155: ip-proto-1
08:30:35.167715 fa:16:3e:4f:8d:92 > fa:16:3e:30:3d:04, ethertype IPv4 (0x0800), length 642: (tos 0x0, ttl 64, id 32536, offset 7400, flags [none], proto ICMP (1), length 628)
    10.100.0.10 > 10.218.0.155: ip-proto-1
08:30:35.169572 fa:16:3e:30:3d:04 > fa:16:3e:4f:8d:92, ethertype IPv4 (0x0800), length 8042: (tos 0x0, ttl 63, id 50929, offset 0, flags [none], proto ICMP (1), length 8028)
    10.218.0.155 > 10.100.0.10: ICMP echo reply, id 4543, seq 1, length 8008

8008 bytes from 10.218.0.155: icmp_seq=1 ttl=63 time=10.9 ms

[1] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp_3net-ipv4-geneve-composable-vlan-provider-network/
[2] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp_3net-ipv4-geneve-composable-vlan-provider-network/17/testReport/neutron_plugin.tests.scenario.test_multicast/MulticastTestIPv4Common/test_igmp_snooping_same_network_and_unsubscribe_id_9f6cd7af_ca52_4979_89e8_ab7436905712_/
[3] https://code.engineering.redhat.com/gerrit/plugins/gitiles/Neutron-QE/+/refs/heads/master/vlan_provider_network_ovn/network/nic-configs/16/compute.yaml

Version-Release number of selected component (if applicable):
ovn-2021-21.09.0-12

How reproducible:
100%
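For reference, the packet sizes in the captures above follow directly from the standard header overheads (8-byte ICMP echo header, 20-byte IPv4 header, 14-byte Ethernet framing); a quick sanity check of the observed 1476/1477 boundary:

```shell
# IP total length of an unfragmented ICMP echo = payload + ICMP + IPv4 headers;
# the on-wire frame adds the Ethernet header on top of that.
icmp_hdr=8; ipv4_hdr=20; eth_hdr=14
ip_len() { echo $(( $1 + icmp_hdr + ipv4_hdr )); }

echo "ping -s 1476 -> IP length $(ip_len 1476), frame $(( $(ip_len 1476) + eth_hdr ))"
echo "ping -s 1477 -> IP length $(ip_len 1477), frame $(( $(ip_len 1477) + eth_hdr ))"
```

So -s 1476 yields the 1504-byte reply (1518-byte frame) that still gets through, while -s 1477 yields the 1505-byte reply that is dropped.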
@Eduardo, are you configuring options:gateway_mtu on the logical router ports at all in this scenario? If so, are you setting it to 9000 or 1500? That's what controls the check_pkt_larger actions that are programmed into OVS.

Please check the openflow dump on the chassis where the incoming ping is received, and see if there are check_pkt_larger actions in tables 8 or 23 that correspond with the MTU setting. If there are check_pkt_larger actions present, are the arguments in there 9000 or 1500? Do they correspond with the options:gateway_mtu on the logical router ports?

This will be easier to debug if we could get a copy of the northbound database that was used in this test. Is that available on any of the links in the first comment? Thanks.
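A minimal sketch of the checks described above, assuming br-int is the integration bridge on the affected chassis (the exact table numbers and the MTU argument you should expect depend on the environment):

```shell
# Dump the OpenFlow flows on the chassis (OVN requires OpenFlow 1.3+,
# hence the -O flag) and look for check_pkt_larger actions.
ovs-ofctl -O OpenFlow15 dump-flows br-int | grep check_pkt_larger

# Narrow to one of the tables mentioned above, e.g. table 23.
ovs-ofctl -O OpenFlow15 dump-flows br-int table=23 | grep check_pkt_larger

# Compare against the gateway_mtu configured on the logical router
# ports in the northbound DB (run wherever ovn-nbctl can reach it).
ovn-nbctl --columns=name,options list Logical_Router_Port
```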
Yes, the affected networks are provider networks in OSP and have gateway_mtu set; it is set to 1500.

We have another bug in the same environment that turned out to also be related to gateway_mtu: https://bugzilla.redhat.com/show_bug.cgi?id=2017424

It looks like this mechanism in 21.09 exposed several issues. Also note some suggestions from Numan in the other bug on Neutron's usage of gateway_mtu. Hope this helps somewhat.
OK, so this may or may not actually be the same issue as bug 2017424. I think you should try setting gateway_mtu on the logical router port to 9000 instead of 1500 and see if that helps. However, based on the findings in 2017424, this may not actually fix the problem if there is an underlying issue in check_pkt_larger; or you may find it works sometimes and not others, based on that conversation.

The other thing to check is to try removing gateway_mtu from the logical router port altogether and see whether traffic then flows properly.
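The two experiments suggested above would look roughly like this, where rp-public stands in for the actual logical router port name in the affected environment:

```shell
# Experiment 1: raise gateway_mtu to match the physical MTU, so the
# check_pkt_larger threshold programmed into OVS moves up accordingly.
ovn-nbctl set Logical_Router_Port rp-public options:gateway_mtu=9000

# Experiment 2: remove the option entirely, which should remove the
# check_pkt_larger flows for this port.
ovn-nbctl remove Logical_Router_Port rp-public options gateway_mtu
```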
This is the port that issues the pings:

_uuid               : d9c5cfd5-f4d5-4155-b64a-3425f253a3ef
addresses           : ["fa:16:3e:0d:72:b1 10.100.0.4"]
dhcpv4_options      : 6b2bcbef-2e9a-4cd9-97c6-9c21de40c872
dhcpv6_options      : []
dynamic_addresses   : []
enabled             : true
external_ids        : {"neutron:cidrs"="10.100.0.4/28", "neutron:device_id"="8f1ce9ab-c076-4a3a-9997-43057d1803d6", "neutron:device_owner"="compute:nova", "neutron:network_name"=neutron-01c328b6-8966-4e69-9396-b659fee5715b, "neutron:port_fip"="10.218.0.200", "neutron:port_name"="", "neutron:project_id"=c92bb2e2db754f09820cd78ad98526b2, "neutron:revision_number"="4", "neutron:security_group_ids"="726a387c-07ab-4064-a9e0-46b9f3e89ce1"}
ha_chassis_group    : []
name                : "16f538c5-3810-4a60-b1d6-34460bb41f54"
options             : {mcast_flood_reports="true", requested-chassis=compute-0.redhat.local}
parent_name         : []
port_security       : ["fa:16:3e:0d:72:b1 10.100.0.4"]
tag                 : []
tag_request         : []
type                : ""
up                  : true
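For completeness, a record like the one above can be pulled from the northbound database with the standard db commands (the port name is taken from the record above):

```shell
# Full record, as printed above.
ovn-nbctl list Logical_Switch_Port 16f538c5-3810-4a60-b1d6-34460bb41f54

# Or only the fields of interest for this bug.
ovn-nbctl --columns=addresses,options,external_ids \
    list Logical_Switch_Port 16f538c5-3810-4a60-b1d6-34460bb41f54
```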
@lorenzo.bianconi, can you please provide a build/RPM for Neutron to test?
Tested with the following steps.

server setup:

systemctl start openvswitch
systemctl start ovn-northd
ovn-nbctl set-connection ptcp:6641
ovn-sbctl set-connection ptcp:6642
ovs-vsctl set open . external_ids:system-id=hv1 external_ids:ovn-remote=tcp:20.0.40.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.40.25
systemctl restart ovn-controller

ovn-nbctl lr-add R1
ovn-nbctl ls-add sw0
ovn-nbctl ls-add public
ovn-nbctl lrp-add R1 rp-sw0 00:00:01:01:02:03 192.168.1.1/24
ovn-nbctl lrp-add R1 rp-public 00:00:02:01:02:03 172.16.1.1/24 1000::a/64 \
    -- lrp-set-gateway-chassis rp-public hv0

ovs-vsctl add-br br-ext
ovs-vsctl add-port br-ext ens4f1
ip link set ens4f1 up
ip link set ens4f1 mtu 1500

ovn-nbctl lsp-add sw0 sw0-rp -- set Logical_Switch_Port sw0-rp \
    type=router options:router-port=rp-sw0 \
    -- lsp-set-addresses sw0-rp router
ovn-nbctl lsp-add public public-rp -- set Logical_Switch_Port public-rp \
    type=router options:router-port=rp-public \
    -- lsp-set-addresses public-rp router

ovs-vsctl add-port br-int sw01 -- set interface sw01 type=internal external_ids:iface-id=sw01
ip netns add sw01
ip link set sw01 netns sw01
ip netns exec sw01 ip link set sw01 address f0:00:00:01:02:03
ip netns exec sw01 ip link set sw01 up
ip netns exec sw01 ip link set sw01 mtu 8942
ip netns exec sw01 ip addr add 192.168.1.2/24 dev sw01
ip netns exec sw01 ip route add default via 192.168.1.1 dev sw01
ovn-nbctl lsp-add sw0 sw01 \
    -- lsp-set-addresses sw01 "f0:00:00:01:02:03 192.168.1.2"

ovs-vsctl add-port br-ext server -- set interface server type=internal
ip netns add server
ip netns exec server ip link set lo up
ip link set server netns server
ip netns exec server ip link set server mtu 9000
ip netns exec server ip link set server up
ip netns exec server ip addr add 172.16.1.50/24 dev server

ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext
ovn-nbctl lsp-add public public1 \
    -- lsp-set-addresses public1 unknown \
    -- lsp-set-type public1 localnet \
    -- lsp-set-options public1 network_name=phynet

ovn-nbctl lr-nat-add R1 dnat_and_snat 172.16.1.10 192.168.1.2 sw01 00:00:02:01:02:03
ovn-nbctl set logical_router_port rp-public options:gateway_mtu=1500

client setup:

systemctl start openvswitch
ovs-vsctl set open . external_ids:system-id=hv0 external_ids:ovn-remote=tcp:20.0.40.25:6642 external_ids:ovn-encap-type=geneve external_ids:ovn-encap-ip=20.0.40.26
systemctl restart ovn-controller
ovs-vsctl add-br br-ext
ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=phynet:br-ext
ovs-vsctl add-port br-ext ens3f1
ip link set ens3f1 up
ip link set ens3f1 mtu 1500

ping on server:

sleep 2
ovn-nbctl --wait=hv sync
ip netns exec sw01 ping 172.16.1.50 -c 1
ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1472
ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1476
ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477

reproduced on ovn-2021-21.09.0-12.el8:

[root@dell-per740-12 bz2018179]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"
ovn-2021-central-21.09.0-12.el8fdp.x86_64
ovn-2021-21.09.0-12.el8fdp.x86_64
ovn-2021-host-21.09.0-12.el8fdp.x86_64
python3-openvswitch2.15-2.15.0-51.el8fdp.x86_64
openvswitch2.15-2.15.0-51.el8fdp.x86_64
[root@dell-per740-12 bz2018179]# bash -x ping.sh
+ ovn-nbctl --wait=hv sync
+ ip netns exec sw01 ping 172.16.1.50 -c 1
PING 172.16.1.50 (172.16.1.50) 56(84) bytes of data.
64 bytes from 172.16.1.50: icmp_seq=1 ttl=63 time=4.61 ms

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 4.606/4.606/4.606/0.000 ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1472
PING 172.16.1.50 (172.16.1.50) 1472(1500) bytes of data.
1480 bytes from 172.16.1.50: icmp_seq=1 ttl=63 time=0.737 ms

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.737/0.737/0.737/0.000 ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1476
PING 172.16.1.50 (172.16.1.50) 1476(1504) bytes of data.
1484 bytes from 172.16.1.50: icmp_seq=1 ttl=63 time=0.129 ms

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.129/0.129/0.129/0.000 ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
PING 172.16.1.50 (172.16.1.50) 1477(1505) bytes of data.
From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1500)

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
PING 172.16.1.50 (172.16.1.50) 1477(1505) bytes of data.

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
PING 172.16.1.50 (172.16.1.50) 1477(1505) bytes of data.

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
PING 172.16.1.50 (172.16.1.50) 1477(1505) bytes of data.

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

Verified on ovn-2021-21.09.0-20.el8:

[root@dell-per740-12 bz2018179]# rpm -qa | grep -E "openvswitch2.15|ovn-2021"
ovn-2021-host-21.09.0-20.el8fdp.x86_64
ovn-2021-central-21.09.0-20.el8fdp.x86_64
python3-openvswitch2.15-2.15.0-51.el8fdp.x86_64
ovn-2021-21.09.0-20.el8fdp.x86_64
openvswitch2.15-2.15.0-51.el8fdp.x86_64
[root@dell-per740-12 bz2018179]# bash -x ping.sh
+ ovn-nbctl --wait=hv sync
+ ip netns exec sw01 ping 172.16.1.50 -c 1
PING 172.16.1.50 (172.16.1.50) 56(84) bytes of data.
64 bytes from 172.16.1.50: icmp_seq=1 ttl=63 time=4.31 ms

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 4.305/4.305/4.305/0.000 ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1472
PING 172.16.1.50 (172.16.1.50) 1472(1500) bytes of data.
1480 bytes from 172.16.1.50: icmp_seq=1 ttl=63 time=0.726 ms

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.726/0.726/0.726/0.000 ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1476
PING 172.16.1.50 (172.16.1.50) 1476(1504) bytes of data.
1484 bytes from 172.16.1.50: icmp_seq=1 ttl=63 time=0.123 ms

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.123/0.123/0.123/0.000 ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
PING 172.16.1.50 (172.16.1.50) 1477(1505) bytes of data.
From 192.168.1.1 icmp_seq=1 Frag needed and DF set (mtu = 1500)

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
PING 172.16.1.50 (172.16.1.50) 1477(1505) bytes of data.

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
PING 172.16.1.50 (172.16.1.50) 1477(1505) bytes of data.
1485 bytes from 172.16.1.50: icmp_seq=1 ttl=63 time=1.21 ms

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 1.205/1.205/1.205/0.000 ms
+ ip netns exec sw01 ping 172.16.1.50 -c 1 -s 1477
PING 172.16.1.50 (172.16.1.50) 1477(1505) bytes of data.
1485 bytes from 172.16.1.50: icmp_seq=1 ttl=63 time=0.149 ms

--- 172.16.1.50 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.149/0.149/0.149/0.000 ms
set VERIFIED per comment 13
https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/job/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp_3net-ipv4-geneve-composable-vlan-provider-network/29/testReport/neutron_plugin.tests.scenario.test_multicast/MulticastTestIPv4Common/test_igmp_snooping_same_network_and_unsubscribe_id_9f6cd7af_ca52_4979_89e8_ab7436905712_/

Looks like the issue is fixed with:

openvswitch2.15.x86_64  2.15.0-42.el8fdp   @download-node-02.eng.bos.redhat.com_rhel-8_nightly_updates_FDP_latest-FDP-8-RHEL-8_compose_Server_x86_64_os
ovn-2021.x86_64         21.09.0-20.el8fdp  @download-node-02.eng.bos.redhat.com_rhel-8_nightly_updates_FDP_latest-FDP-8-RHEL-8_compose_Server_x86_64_os

There are a few other failures, but I think they are related to https://bugzilla.redhat.com/show_bug.cgi?id=2018365
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (ovn bug fix and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2021:5059
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days