Description of problem:

OVS kernel datapath: packets are still forwarded to an internal port that is in DOWN state, and the kernel logs `dropped over-mtu packet` warnings if these packets are larger than the kernel interface MTU. I can easily reproduce this in a lab of mine with OSP 13 and OVS 2.11:

~~~
[root@overcloud-compute-1 ~]# ip link ls dev br-int
24: br-int: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 9e:ef:c0:47:ca:47 brd ff:ff:ff:ff:ff:ff
[root@overcloud-compute-1 ~]# ip link add name veth-host type veth peer name veth-guest
[root@overcloud-compute-1 ~]# ip link set dev veth-host up
[root@overcloud-compute-1 ~]# ip link set dev veth-guest up
[root@overcloud-compute-1 ~]# ovs-vsctl add-port br-int veth-host
[root@overcloud-compute-1 ~]# ovs-vsctl show | grep br-int -A10
    Bridge br-int
        Controller "tcp:127.0.0.1:6633"
            is_connected: true
        fail_mode: secure
        Port br-int
            Interface br-int
                type: internal
        Port veth-host
            Interface veth-host
        Port "tpi-00c6e537-cc"
            tag: 1
            Interface "tpi-00c6e537-cc"
                type: patch
                options: {peer="tpt-00c6e537-cc"}
        Port "qvo5d2f3fba-11"
            tag: 2
[root@overcloud-compute-1 ~]# ip link set dev veth-host mtu 8900
[root@overcloud-compute-1 ~]# ip link set dev veth-guest mtu 8900
[root@overcloud-compute-1 ~]# ip a a dev veth-guest 192.168.123.10/24
[root@overcloud-compute-1 ~]# ping 192.168.123.255 -b -M do -s 8000
WARNING: pinging broadcast address
PING 192.168.123.255 (192.168.123.255) 8000(8028) bytes of data.
^C
--- 192.168.123.255 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1000ms

[1]+  Done                    tcpdump -nne -i br-int -l | grep --color=auto 192.168.123
[root@overcloud-compute-1 ~]# dmesg | tail
(...)
[281752.475238] br-int: dropped over-mtu packet: 8028 > 1500
[281753.474201] br-int: dropped over-mtu packet: 8028 > 1500
[root@overcloud-compute-1 ~]# tcpdump -nne -i br-int
tcpdump: br-int: That device is not up
[root@overcloud-compute-1 ~]#
~~~

Workaround:

I guess the workaround is as easy as that:

~~~
[root@overcloud-compute-1 ~]# ovs-vsctl set interface br-int mtu_request=8900
[root@overcloud-compute-1 ~]# ovs-vsctl list interface br-int | grep mtu
mtu                 : 8900
mtu_request         : 8900
[root@overcloud-compute-1 ~]# ip link ls dev br-int
24: br-int: <BROADCAST,MULTICAST> mtu 8900 qdisc noqueue state DOWN mode DEFAULT group default qlen 1000
    link/ether 9e:ef:c0:47:ca:47 brd ff:ff:ff:ff:ff:ff
~~~

Version-Release number of selected component (if applicable):
~~~
[root@overcloud-compute-1 ~]# rpm -qa | egrep 'kernel|openvswitch'
kernel-3.10.0-1062.9.1.el7.x86_64
erlang-kernel-18.3.4.11-2.el7ost.x86_64
python-openvswitch2.11-2.11.0-26.el7fdp.x86_64
openstack-neutron-openvswitch-12.1.0-2.el7ost.noarch
openvswitch2.11-2.11.0-26.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-9.el7fdp.noarch
kernel-tools-3.10.0-1062.9.1.el7.x86_64
kernel-tools-libs-3.10.0-1062.9.1.el7.x86_64
python-rhosp-openvswitch-2.11-0.6.el7ost.noarch
rhosp-openvswitch-ovn-central-2.11-0.6.el7ost.noarch
rhosp-openvswitch-2.11-0.6.el7ost.noarch
rhosp-openvswitch-ovn-host-2.11-0.6.el7ost.noarch
~~~

How reproducible:
Always in the lab setup above.

Steps to Reproduce:
1. Add a veth pair to br-int and set an MTU on the veth devices that is larger than the MTU of the br-int internal port (which stays in DOWN state).
2. Send traffic larger than the br-int MTU through the bridge, e.g. `ping 192.168.123.255 -b -M do -s 8000`.
3. Check `dmesg`.
Actual results:
The kernel logs `dropped over-mtu packet` warnings for the br-int internal port even though the port is DOWN.

Expected results:
No over-mtu warnings for a port that is administratively DOWN.

Additional info:
I understand that everything here happens in the kernel, and that when the interface is down the packet is counted as rx_dropped:

~~~
[root@overcloud-compute-1 ~]# ovs-vsctl get interface br-int statistics
{collisions=0, rx_bytes=2792740, rx_crc_err=0, rx_dropped=10185074, rx_errors=0, rx_frame_err=0, rx_over_err=0, rx_packets=15244, tx_bytes=1064, tx_dropped=0, tx_errors=2, tx_packets=12}
~~~

So I suppose that there's nothing that we can do about this warning message in the logs:

~~~
[281752.475238] br-int: dropped over-mtu packet: 8028 > 1500
~~~

Feel free to close this bug report as won't fix if there's nothing that we can do.
Feel free to close this bug report as won't fix if there's nothing that we can do. But I'd like to find a way to suppress this log message, as it may spam the logs and an administrator may be worried about it.
Hi Andreas,

When a port is added to the bridge, OVS finds the minimum MTU among the ports and uses that as the bridge MTU. So, if in your scenario you had only veth-host attached, the bridge would have its MTU set to 8900. However, if there are other ports with a lower MTU, the bridge MTU is set so that communication to all ports remains possible. Having said that, I think the bridge MTU auto-configuration makes sense and we should not change it.

Now looking at the kernel: the driver is supposed to drop the packet if the device is down, but that check happens a bit after the code checking the MTU:

~~~
void ovs_vport_send(struct vport *vport, struct sk_buff *skb, u8 mac_proto)
{
	if (unlikely(packet_length(skb, vport->dev) > mtu &&
		     !skb_is_gso(skb))) {
		net_warn_ratelimited("%s: dropped over-mtu packet: %d > %d\n",
				     vport->dev->name,
				     packet_length(skb, vport->dev), mtu);
		vport->dev->stats.tx_errors++;
		goto drop;
	}

static netdev_tx_t internal_dev_recv(struct sk_buff *skb)
{
	struct net_device *netdev = skb->dev;

	if (unlikely(!(netdev->flags & IFF_UP))) {
		kfree_skb(skb);
		netdev->stats.rx_dropped++;
		return NETDEV_TX_OK;
	}
~~~

Perhaps we can change ovs_vport_send() to not log that message if the egress device is down. However, before I go down that route, I need to understand the use case. It doesn't seem correct to have different MTUs on the same bridge, because the connection between those ports would be broken. Could you please explain why the ports have different MTUs?

fbl
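For illustration, here is a minimal sketch of the idea described above, assuming the warning in ovs_vport_send() is simply gated on IFF_UP. This is only a sketch of the approach, not the patch that was later merged upstream (see the links in the following comments):

~~~
	/* Sketch only: suppress the over-mtu warning when the egress device is
	 * administratively down, since internal_dev_recv() drops the packet
	 * anyway and accounts it as rx_dropped. The tx_errors increment is
	 * kept unconditional here to preserve the existing accounting.
	 */
	if (unlikely(packet_length(skb, vport->dev) > mtu &&
		     !skb_is_gso(skb))) {
		if (vport->dev->flags & IFF_UP)
			net_warn_ratelimited("%s: dropped over-mtu packet: %d > %d\n",
					     vport->dev->name,
					     packet_length(skb, vport->dev),
					     mtu);
		vport->dev->stats.tx_errors++;
		goto drop;
	}
~~~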
Hi,

It's been a while for this one, but IIRC OpenStack Neutron sets the MTU only on the ports that connect to the instances; it does not raise the MTU of the bridge internal port.

I think the reason I opened this bug is that I would not expect a port in DOWN state to log "dropped over-mtu packet" messages. If I as an admin set a port down, I would not expect log messages telling me that its MTU is not set right, given that I'd expect the port not to receive anything anyway. I hope that makes sense.

Again, feel free to close this as won't fix if you think this is not a valid request. I have not run into a similar complaint since, and it's been a year.

- Andreas
Patch posted upstream: https://lists.openwall.net/netdev/2021/03/16/279
Patch has been accepted in net-next: https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=ebfbc46b35cb

We sync downstream openvswitch with upstream periodically, so this fix will eventually hit RHEL-8 and newer. Having said that, and based on comment#3, I am closing the ticket.

Thanks Andreas for reporting this bug!

fbl