+++ This bug was initially created as a clone of Bug #1850957 +++

Description of problem:
I installed the latest 16.1 puddle with all the configuration needed to support a 9000-byte MTU on internal networks, and set a 1500-byte MTU on the external network. All needed neutron settings were also configured. I found that the OVN router no longer sends ICMP 'need to frag' as it should according to this RFE: https://bugzilla.redhat.com/show_bug.cgi?id=1547074

I also still have an old environment handy where the feature is working.

Some details:

OLD environment build: RHOS-16.1-RHEL-8-20200511.n.0
puppet-ovn-15.4.1-0.20200311045730.192ac4e.el8ost.noarch
python3-networking-ovn-7.1.1-0.20200507153427.fd1c0c3.el8ost.noarch
ovn2.13-2.13.0-18.el8fdp.x86_64
openvswitch2.13-2.13.0-18.el8fdp.x86_64
kernel: 4.18.0-193.1.2.el8_2.x86_64

NEW environment build: RHOS-16.1-RHEL-8-20200616.n.0
python3-networking-ovn-7.2.1-0.20200611111150.18fabca.el8ost.noarch
puppet-ovn-15.4.1-0.20200311045730.192ac4e.el8ost.noarch
ovn2.13-2.13.0-30.el8fdp.x86_64
openvswitch2.13-2.13.0-25.el8fdp.1.x86_64
kernel: 4.18.0-193.6.3.el8_2.x86_64

Same on both (tested on all controllers):
[heat-admin@controller-0 ~]$ sudo ovs-appctl -t ovs-vswitchd dpif/show-dp-features br-int | grep "Check pkt"
Check pkt length action: Yes
[heat-admin@controller-0 ~]$ sudo podman exec -it neutron_api crudini --get /etc/neutron/plugins/ml2/ml2_conf.ini ovn ovn_emit_need_to_frag
True

All interfaces on all nodes are set to support MTU 9000.
gateway_mtu="1500" is set for the logical router gateway port.

Version-Release number of selected component (if applicable):
RHOS-16.1-RHEL-8-20200616.n.0
The same issue also occurs with the newer puddle RHOS-16.1-RHEL-8-20200623.n.0
python3-networking-ovn-7.2.1-0.20200611111150.18fabca.el8ost.noarch
puppet-ovn-15.4.1-0.20200311045730.192ac4e.el8ost.noarch
ovn2.13-2.13.0-30.el8fdp.x86_64
openvswitch2.13-2.13.0-25.el8fdp.1.x86_64
kernel: 4.18.0-193.6.3.el8_2.x86_64

How reproducible:
100%

Steps to Reproduce:
1. Create an external network, an internal network (with an MTU larger than that of the external network), a router, a keypair, and a security group with rules allowing icmp, ssh and udp. Connect both networks to the router.
2. Launch an instance on the internal network using the created keypair and security group.
3. From the VM, try to ping an IP address on the external network using a size larger than the MTU of the external network.

Actual results:
No ICMP 'fragmentation needed' is sent from the OVN router.

Expected results:
ICMP 'fragmentation needed' is sent by the OVN router.

Additional info:
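For step 3, the ping size that crosses the external MTU can be worked out from the header sizes. A minimal sketch, assuming IPv4 without IP options (20-byte header) and an 8-byte ICMP echo header; the function names are illustrative only and are not part of any tool mentioned in this report:

```python
IPV4_HEADER_BYTES = 20  # assumption: IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo request header


def max_unfragmented_ping_payload(mtu: int) -> int:
    """Largest ICMP echo payload (the size passed to ping) that fits in one packet."""
    return mtu - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES


def needs_fragmentation(payload: int, mtu: int) -> bool:
    """True if an echo request with this payload is larger than the MTU."""
    return payload > max_unfragmented_ping_payload(mtu)


if __name__ == "__main__":
    external_mtu = 1500  # gateway_mtu from this report
    print(max_unfragmented_ping_payload(external_mtu))
    print(needs_fragmentation(1473, external_mtu))
```

With the external MTU of 1500 used here, any payload above 1472 bytes produces an IP packet larger than 1500 bytes, which is the case the router is expected to answer with ICMP 'fragmentation needed'.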
Hi,

There are actually two issues here.

1) In the OVN version you were testing, ovs-vswitchd was dropping packets originated by ovn-controller. This is because ovn-controller was not using the proper OF port as the source. This was fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1832176 . If you upgrade to ovn2.13-2.13.0-35.el8fdp or newer, you will have this fix.

2) When stateful ACLs are enabled, OVN's ICMP responses go through conntrack. There is a kernel issue where the check_pkt_len action is not taking into account GSO information, resulting in miscalculation of the packet length. This is fixed in https://patchwork.ozlabs.org/project/openvswitch/patch/fd266728e5de48e1b4bd82d08e345f308f77eb5a.1592929525.git.lorenzo@kernel.org/ . This patch has been accepted upstream to -stable, but has not been released in an upstream kernel release or RHEL8 release.

From OVN's perspective, this issue is already fixed, but you may run into issue (2) until the linked patch is available in a RHEL8 kernel. I don't have an ETA for when it will be available.

Since the OVN side of things is fixed, would it be appropriate to close this issue?
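To check whether an installed package already contains fix (1), the release number in the package string can be compared against 35. A rough sketch, assuming package strings in the form quoted in this bug (ovn2.13-2.13.0-<release>.el8fdp...); it only compares the leading release integer for version 2.13.0 and is not a general RPM version comparison. The helper name and the regex are illustrative:

```python
import re

FIXED_RELEASE = 35  # from the comment above: ovn2.13-2.13.0-35.el8fdp or newer

# Assumption: package strings look like "ovn2.13-2.13.0-30.el8fdp.x86_64".
_NVR = re.compile(r"^ovn2\.13-2\.13\.0-(\d+)\.el8fdp")


def has_of_port_fix(package: str) -> bool:
    """True if the ovn2.13 2.13.0 package release is at least FIXED_RELEASE."""
    match = _NVR.match(package)
    if match is None:
        raise ValueError(f"unrecognized package string: {package!r}")
    return int(match.group(1)) >= FIXED_RELEASE


if __name__ == "__main__":
    for pkg in ("ovn2.13-2.13.0-30.el8fdp.x86_64",
                "ovn2.13-2.13.0-37.el8fdp.x86_64"):
        print(pkg, has_of_port_fix(pkg))
```

Applied to the builds listed in this bug, the -18 and -30 releases fall below the fixed release and the -37 release is above it.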
Shall we then reuse this BZ and change the component to the kernel so that we can track the issue?
(In reply to Daniel Alvarez Sanchez from comment #2)
> Shall we then reuse this BZ and change the component to the kernel so that
> we can track the issue?

I have already created a bz for the related kernel issue: https://bugzilla.redhat.com/show_bug.cgi?id=1851888
Since there is a kernel bug open to track the kernel problem, and an OSP bug open to track the OSP side, can we close this OVN bug?
The issue no longer occurs now that FDP 20.E has been released and is used in OSP.

Tested on puddle RHOS-16.1-RHEL-8-20200714.n.0 with ovn2.13-2.13.0-37.el8fdp.x86_64. ICMP 'fragmentation needed' is sent by the OVN router as expected.

Note: this was tested with the default RHEL 8.2 kernel, so fragmentation did not work, i.e. https://bugzilla.redhat.com/show_bug.cgi?id=1854084 is still valid. The fix for it is available in RHEL 8.3, see https://bugzilla.redhat.com/show_bug.cgi?id=1851888. There is a BZ for backporting this fix to RHEL 8.2, see https://bugzilla.redhat.com/show_bug.cgi?id=1854149