Bug 1564346
| Summary: | Strange behaviour of MTU values of the tun0 of the node where router is present |
|---|---|
| Product: | OpenShift Container Platform |
| Component: | Networking |
| Version: | 3.6.0 |
| Status: | CLOSED ERRATA |
| Severity: | medium |
| Priority: | medium |
| Reporter: | Miheer Salunke <misalunk> |
| Assignee: | Ben Bennett <bbennett> |
| QA Contact: | Meng Bo <bmeng> |
| CC: | aconole, aos-bugs, dcbw, hongli, knakayam, rbost, zzhao |
| Target Milestone: | --- |
| Target Release: | 3.10.0 |
| Hardware: | Unspecified |
| OS: | Unspecified |
| Type: | Bug |
| Doc Type: | Bug Fix |
| Last Closed: | 2018-07-30 19:11:39 UTC |
| Bug Blocks: | 1634089 |

Doc Text:
- Cause: The MTU of tun0 is not correct when there are no pods on the node.
- Consequence: None, other than the 'ip a' output being confusing.
- Fix: Set the MTU explicitly for the tun0 port in OVS.
- Result: 'ip a' shows the correct MTU when there are no pods on the node.
Comment 2
Meng Bo
2018-04-09 05:53:46 UTC
I've reproduced this. It is weird, but harmless. As Meng Bo noted, the router runs with host networking by default, so it does not get attached to the OVS bridge. When a pod veth is attached to the OVS bridge, the MTU on tun0 changes to the correct value; when all pod veths are removed, it goes back to 1500. I'm not sure why that behaviour occurs yet, but since the value is correct whenever the bridge has a pod attached to it, it is harmless.

Nowhere in the OVS code is the MTU set without a corresponding change to the appropriate row in the mtu_requested column in the Interfaces table. I'm not sure what is setting this, but it isn't OVS. Maybe docker? Kubernetes?

I can reproduce this when docker and openshift are not running:

```
# ip link add v1 type veth peer name v2 mtu 1450
# ip l set v1 up
# ip l set v2 up
# ip a s v2
14: v2@v1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UP group default qlen 1000
    link/ether 66:8d:17:c1:38:75 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::648d:17ff:fec1:3875/64 scope link
       valid_lft forever preferred_lft forever
# ip a s tun0
6: tun0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 6e:f3:eb:d2:52:e5 brd ff:ff:ff:ff:ff:ff
    inet 10.128.0.1/23 brd 10.128.1.255 scope global tun0
       valid_lft forever preferred_lft forever
    inet6 fe80::6cf3:ebff:fed2:52e5/64 scope link
       valid_lft forever preferred_lft forever
# ovs-vsctl show
a5ec748f-4dd7-4933-b4d7-92ac39724f5d
    Bridge "br0"
        fail_mode: secure
        Port "br0"
            Interface "br0"
                type: internal
        Port "vxlan0"
            Interface "vxlan0"
                type: vxlan
                options: {key=flow, remote_ip=flow}
        Port "tun0"
            Interface "tun0"
                type: internal
    ovs_version: "2.8.1"
# ovs-vsctl add-port br0 v2
# ip a s tun0
6: tun0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ether 6e:f3:eb:d2:52:e5 brd ff:ff:ff:ff:ff:ff
    inet 10.128.0.1/23 brd 10.128.1.255 scope global tun0
       valid_lft forever preferred_lft forever
    inet6 fe80::6cf3:ebff:fed2:52e5/64 scope link
       valid_lft forever preferred_lft forever
```

(In reply to Aaron Conole from comment #4)
> Nowhere in the OVS code is the MTU set without a corresponding change to
> the appropriate row in the mtu_requested column in the Interfaces table.
>
> I'm not sure what is setting this, but it isn't OVS. Maybe docker?
> Kubernetes?

Kernel part of OVS, maybe?

I didn't see anything in the kernel. I'm not sure what's happening. Additional strange behaviours:

1. If we make the tun port using `ip tuntap add ...`, the MTU won't change.
2. If we let OVS create it by setting up an internal port (which is how OpenShift works), then the MTU will be adjusted when other ports are added or removed.
3. If we set up the internal port with `mtu_requested=` set, then the MTU will be assigned the value in that field.

I'm at a loss to explain this just yet, but I'll investigate it further.

Posted a PR to request the right MTU when the OVS bridge is created:
https://github.com/openshift/origin/pull/19372

Commit pushed to master at https://github.com/openshift/origin
https://github.com/openshift/origin/commit/9df7ca9edfa7ec002567a12c13e3c78dd0a61afc

Explicitly set the MTU on the tun0 interface

Before this, we did not set the MTU on the tun0 interface explicitly. Since that interface is of type internal, it is created and managed by OVS, and because it is managed by OVS, the MTU changes when pods are added to and removed from the bridge. With a pod attached we see the desired MTU on tun0, but when there are no pods, the MTU goes back to 1500. This is normally fine, but if packets above the MTU are sent between nodes using their SDN addresses, those packets will be dropped whenever there are no pods using the SDN on one of those nodes.

The change is to request the correct MTU when we add the port to OVS.

Fixes bug 1564346 (https://bugzilla.redhat.com/show_bug.cgi?id=1564346)

This has been resolved in 3.10, but there should be no need to backport.
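The fix itself lives in the openshift-sdn Go code, but its effect is equivalent to pinning the MTU on the internal port when it is added to the bridge, so that OVS stops recalculating it as pod veths come and go. A hypothetical command-line sketch of that idea (the bridge name and 1450 are example values, not taken from the PR; note the actual OVSDB column is spelled `mtu_request`):

```shell
# Create the tun0 internal port and pin its MTU via the OVSDB Interface
# table in the same transaction. Requires root and a running ovs-vswitchd;
# br0 and 1450 are illustrative values.
ovs-vsctl add-port br0 tun0 \
    -- set Interface tun0 type=internal mtu_request=1450
```

With `mtu_request` set, OVS keeps the interface at the requested MTU regardless of what other ports are attached to the bridge, which matches strange behaviour 3 observed above.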
The behavior is surprising, but harmless.

Verified with atomic-openshift-3.10.0-0.47.0.git.0.2fffa04.el7 and cannot reproduce the issue. The tun0 MTU now uses the configured value even when no veth is attached to the OVS bridge.

OS: Red Hat Enterprise Linux Server release 7.5 (Maipo)
Kernel: Linux ip-172-18-5-139.ec2.internal 3.10.0-862.el7.x86_64 #1 SMP Wed Mar 21 18:14:51 EDT 2018 x86_64 x86_64 x86_64 GNU/Linux

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1816
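As background, the 1450 byte value seen throughout this bug is the node's physical MTU minus the overhead the VXLAN-based SDN adds to every packet (an outer Ethernet, IPv4, UDP, and VXLAN header). A quick sketch of the arithmetic, assuming the standard 1500 byte Ethernet MTU used in the sessions above:

```shell
# Per-packet VXLAN encapsulation overhead, in bytes.
eth=14; ipv4=20; udp=8; vxlan=8
overhead=$((eth + ipv4 + udp + vxlan))   # 50 bytes total
phys_mtu=1500                            # underlying NIC MTU
sdn_mtu=$((phys_mtu - overhead))
echo "$sdn_mtu"                          # prints 1450
```

This is why a tun0 that falls back to 1500 can silently drop large node-to-node SDN packets: the encapsulated frame no longer fits in the physical MTU.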