
Bug 1717793

Summary: [OVN]gateway_mtu is not effective after change from a small value to a larger value
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: haidong li <haili>
Component: ovn2.11
Assignee: Dumitru Ceara <dceara>
Status: CLOSED NOTABUG
QA Contact: haidong li <haili>
Severity: medium
Priority: unspecified
Version: FDP 19.C
CC: ctrautma, dceara, qding
Target Milestone: ---
Target Release: ---
Hardware: x86_64
OS: Linux
Last Closed: 2019-06-24 19:48:11 UTC
Type: Bug

Description haidong li 2019-06-06 08:19:40 UTC
Description of problem:
gateway_mtu does not take effect after it is changed from a small value to a larger value.

Version-Release number of selected component (if applicable):
[root@dell-per730-57 ovn]# rpm -qa | grep ovn
ovn2.11-central-2.11.0-16.el8fdp.x86_64
ovn2.11-2.11.0-16.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-1.0-121.noarch
ovn2.11-host-2.11.0-16.el8fdp.x86_64
[root@dell-per730-57 ovn]# rpm -qa | grep openvswitch
kernel-kernel-networking-openvswitch-ovn-1.0-121.noarch
openvswitch-selinux-extra-policy-1.0-11.el8fdp.noarch
openvswitch2.11-2.11.0-9.el8fdp.x86_64
[root@dell-per730-57 ovn]# 


How reproducible:
Every time

Steps to Reproduce:
1. Set gateway_mtu to 1000; packets are fragmented to fit an MTU of 1000.
2. Change gateway_mtu to 1500; packets are still fragmented as if the MTU were 1000.

[root@dell-per730-57 ovn]# ovn-nbctl set logical_router_port r1_s3 options:gateway_mtu=1000
[root@dell-per730-57 ovn]# virsh console hv1_vm00
Connected to domain hv1_vm00
Escape character is ^]

[root@localhost ~]# ping -s 9000 172.16.103.11
PING 172.16.103.11 (172.16.103.11) 9000(9028) bytes of data.
From 172.16.102.1 icmp_seq=1 Frag needed and DF set (mtu = 982)
9008 bytes from 172.16.103.11: icmp_seq=2 ttl=63 time=1.45 ms
9008 bytes from 172.16.103.11: icmp_seq=3 ttl=63 time=0.991 ms
9008 bytes from 172.16.103.11: icmp_seq=4 ttl=63 time=0.975 ms

--- 172.16.103.11 ping statistics ---
4 packets transmitted, 3 received, +1 errors, 25% packet loss, time 3004ms
rtt min/avg/max/mdev = 0.975/1.141/1.458/0.225 ms

Packets captured on the peer:

[root@localhost ~]# tcpdump -ei eth1 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
04:14:18.053864 00:de:ad:ff:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 994: 172.16.102.11 > 172.16.103.11: ICMP echo request, id 11611, seq 2, length 960
04:14:18.053888 00:de:ad:ff:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 994: 172.16.102.11 > 172.16.103.11: ip-proto-1
04:14:18.053923 00:de:ad:ff:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 994: 172.16.102.11 > 172.16.103.11: ip-proto-1

Change gateway_mtu to 1500:
[root@dell-per730-57 ovn]# ovn-nbctl set logical_router_port r1_s3 options:gateway_mtu=1500
[root@dell-per730-57 ovn]# virsh console hv1_vm00
Connected to domain hv1_vm00
Escape character is ^]

[root@localhost ~]# ping -s 9000 172.16.103.11
PING 172.16.103.11 (172.16.103.11) 9000(9028) bytes of data.
9008 bytes from 172.16.103.11: icmp_seq=1 ttl=63 time=2.01 ms
9008 bytes from 172.16.103.11: icmp_seq=2 ttl=63 time=0.824 ms
9008 bytes from 172.16.103.11: icmp_seq=3 ttl=63 time=0.917 ms
9008 bytes from 172.16.103.11: icmp_seq=4 ttl=63 time=0.901 ms
9008 bytes from 172.16.103.11: icmp_seq=5 ttl=63 time=0.989 ms
9008 bytes from 172.16.103.11: icmp_seq=6 ttl=63 time=0.801 ms

--- 172.16.103.11 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5007ms
rtt min/avg/max/mdev = 0.801/1.074/2.014/0.425 ms
[root@localhost ~]# 

Packets captured on the peer:
[root@localhost ~]# tcpdump -ei eth1 -nn
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 262144 bytes
04:17:24.450836 00:de:ad:ff:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 994: 172.16.102.11 > 172.16.103.11: ICMP echo request, id 11614, seq 1, length 960
04:17:24.450851 00:de:ad:ff:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 994: 172.16.102.11 > 172.16.103.11: ip-proto-1
04:17:24.450855 00:de:ad:ff:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 994: 172.16.102.11 > 172.16.103.11: ip-proto-1
04:17:24.450858 00:de:ad:ff:01:03 > 00:de:ad:00:00:01, ethertype IPv4 (0x0800), length 994: 172.16.102.11 > 172.16.103.11: ip-proto-1
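
The 994-byte frames in both captures are consistent with the sender still fragmenting to an MTU of 982. A rough sketch of the arithmetic, assuming a 20-byte IPv4 header without options and a 14-byte untagged Ethernet header (the function names are hypothetical). Every fragment except the last must carry a multiple of 8 bytes:

```shell
#!/bin/sh
# Payload bytes carried by each non-final IPv4 fragment for a given MTU.
# The fragment offset field counts 8-byte units, so round down to a multiple of 8.
frag_payload() {
    echo $(( ($1 - 20) / 8 * 8 ))
}

# On-wire frame length as tcpdump -e reports it: payload + IPv4 + Ethernet headers.
# ASSUMPTION: 20-byte IPv4 header, 14-byte Ethernet header, no VLAN tag.
frag_frame_len() {
    echo $(( $(frag_payload "$1") + 20 + 14 ))
}

frag_payload 982      # prints 960, the "length 960" in the capture
frag_frame_len 982    # prints 994, the "length 994" in the capture
frag_frame_len 1482   # prints 1490, expected if the sender used MTU 1482
```

If the sender had picked up the larger MTU, the full-size fragments would be expected to appear as 1490-byte frames instead of 994.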



Expected results:
The fragment size should change to match the configured gateway_mtu.

Additional info:

Comment 1 Dumitru Ceara 2019-06-24 19:48:11 UTC
Hi,

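By default, hv1_vm00 learns the MTU toward a destination from the "ICMP unreachable - need to frag" packets sent by the gateway and caches it, so the VM keeps fragmenting to the previously learned MTU until the cached entry expires. The expiry timeout, in seconds, is controlled via /proc/sys/net/ipv4/route/mtu_expires.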

On my system:
# cat /proc/sys/net/ipv4/route/mtu_expires
600
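
To check what the VM has currently learned for a destination, `ip route get <destination>` prints the route in use, including an `mtu` value and an expiry countdown when a learned path MTU is cached. A minimal sketch of extracting those two values; the helper name `cached_pmtu` and the sample text are illustrative, and the exact output layout can vary between iproute2 versions.

```shell
#!/bin/sh
# Hypothetical helper: read `ip route get` output on stdin and print the
# cached MTU and its remaining lifetime, if present.
cached_pmtu() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "expires") expires = $(i + 1)
            if ($i == "mtu")     mtu = $(i + 1)
        }
    }
    END {
        if (mtu != "") print "mtu=" mtu " expires=" expires
        else           print "no cached path MTU"
    }'
}

# Illustrative sample, not captured from the systems in this report.
sample='172.16.103.11 via 172.16.102.1 dev eth0 src 172.16.102.11 uid 0
    cache expires 431sec mtu 982'
printf '%s\n' "$sample" | cached_pmtu

# On a live VM (assumes iproute2 is installed):
#   ip route get 172.16.103.11 | cached_pmtu
```

`ip route flush cache` is commonly used to discard learned entries without waiting for the timeout, although that was not tried in this report.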

To disable this behavior, in the VM:
# echo 0 > /proc/sys/net/ipv4/route/mtu_expires

After disabling MTU size learning, ping always tries to send packets as large as the locally configured MTU allows. In my case, with an MTU of 1500 in the VM:

On northd:
# ovn-nbctl set logical_router_port rtr-ls1 options:gateway_mtu=1000

In VM:
# ping 10.0.0.1 -s 9000
PING 10.0.0.1 (10.0.0.1) 9000(9028) bytes of data.
From 20.0.0.254 icmp_seq=1 Frag needed and DF set (mtu = 982)
From 20.0.0.254 icmp_seq=2 Frag needed and DF set (mtu = 982)

On northd increase gateway mtu to 1500:
# ovn-nbctl set logical_router_port rtr-ls1 options:gateway_mtu=1500

In VM:
# ip netns exec vm2 ping 10.0.0.1 -s 9000
PING 10.0.0.1 (10.0.0.1) 9000(9028) bytes of data.
From 20.0.0.254 icmp_seq=1 Frag needed and DF set (mtu = 1482)
From 20.0.0.254 icmp_seq=2 Frag needed and DF set (mtu = 1482)
From 20.0.0.254 icmp_seq=3 Frag needed and DF set (mtu = 1482)

# ip netns exec vm2 ping 10.0.0.1 -s 1454
PING 10.0.0.1 (10.0.0.1) 1454(1482) bytes of data.
1462 bytes from 10.0.0.1: icmp_seq=1 ttl=63 time=0.762 ms
1462 bytes from 10.0.0.1: icmp_seq=2 ttl=63 time=0.639 ms

So it seems that the functionality works as expected.
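
The numbers in the outputs above follow a simple pattern, sketched below. The 18-byte gap between gateway_mtu and the MTU reported in the ICMP error is taken from the outputs in this report (1000 gives 982, 1500 gives 1482), not from OVN documentation. The 28 bytes are the IPv4 header (20, no options) plus the ICMP echo header (8). The function names are hypothetical.

```shell
#!/bin/sh
# Sketch only: offsets are inferred from the ping outputs in this report.

# MTU value seen in "Frag needed and DF set (mtu = N)" for a given gateway_mtu.
# ASSUMPTION: the 18-byte difference is an observation from this thread.
advertised_mtu() {
    echo $(( $1 - 18 ))
}

# Largest "ping -s" payload that fits in one IPv4 packet of the given MTU:
# subtract the 20-byte IPv4 header (no options) and the 8-byte ICMP header.
max_ping_payload() {
    echo $(( $1 - 20 - 8 ))
}

advertised_mtu 1500                          # prints 1482
max_ping_payload "$(advertised_mtu 1500)"    # prints 1454
```

This matches the `ping -s 1454` run above, which reports 1454(1482) bytes and gets replies without an ICMP error.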

Comment 2 haidong li 2019-06-25 05:53:26 UTC
Hi,
  I have another question about fragmentation. If I set gateway_mtu to less than 568, the ping to the remote host does not succeed and only "(mtu = 542)" errors are displayed, like this:

[root@dell-per730-19 ovn]# ovn-nbctl get logical_router_port r1_s3 options:gateway_mtu
"560"
[root@dell-per730-19 ovn]# virsh console hv1_vm00
Connected to domain hv1_vm00
Escape character is ^]

[root@localhost ~]# ping -s 1000  172.16.103.11
PING 172.16.103.11 (172.16.103.11) 1000(1028) bytes of data.
From 172.16.102.1 icmp_seq=1 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=2 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=3 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=4 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=5 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=6 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=7 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=8 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=9 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=10 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=11 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=12 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=13 Frag needed and DF set (mtu = 542)
From 172.16.102.1 icmp_seq=14 Frag needed and DF set (mtu = 542)

--- 172.16.103.11 ping statistics ---
14 packets transmitted, 0 received, +14 errors, 100% packet loss, time 13017ms

But the ping succeeds if I change gateway_mtu to 562:

[root@dell-per730-19 ovn]# ovn-nbctl set logical_router_port r1_s3 options:gateway_mtu=562
[root@dell-per730-19 ovn]# ovn-nbctl get logical_router_port r1_s3 options:gateway_mtu
"562"

[root@localhost ~]# ping -s 1000  172.16.103.11
PING 172.16.103.11 (172.16.103.11) 1000(1028) bytes of data.
From 172.16.102.1 icmp_seq=1 Frag needed and DF set (mtu = 544)
1008 bytes from 172.16.103.11: icmp_seq=2 ttl=63 time=1.01 ms
1008 bytes from 172.16.103.11: icmp_seq=3 ttl=63 time=0.785 ms
1008 bytes from 172.16.103.11: icmp_seq=4 ttl=63 time=0.632 ms
1008 bytes from 172.16.103.11: icmp_seq=5 ttl=63 time=0.590 ms

--- 172.16.103.11 ping statistics ---
5 packets transmitted, 4 received, +1 errors, 20% packet loss, time 4003ms
rtt min/avg/max/mdev = 0.590/0.755/1.013/0.165 ms
[root@localhost ~]# 


Is this expected? Thanks.

Comment 3 haidong li 2019-06-25 05:55:19 UTC
The VM can ping successfully with a gateway_mtu of 562 or larger; this corrects the value given in comment 2.

Comment 4 Dumitru Ceara 2019-06-25 07:48:09 UTC
Hi,

Was hv1_vm00 rebooted between the gateway_mtu changes on r1_s3? Or did the configuration of /proc/sys/net/ipv4/route/mtu_expires on hv1_vm00 change?

If not, can you please share a tcpdump from hv1_vm00 when gateway_mtu is 562 and ping is successful?
tcpdump -n -i <interface> -v

Thanks
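
One factor that may be relevant to the small gateway_mtu case, and that is not confirmed anywhere in this thread: Linux does not lower a learned path MTU below net.ipv4.route.min_pmtu, which defaults to 552. The values reported in comment 2 (542 and 544) are both below that floor. A sketch of the clamp, with a hypothetical function name; whether it explains the difference between 560 and 562 would still need the requested tcpdump.

```shell
#!/bin/sh
# Path MTU the sender would use after an ICMP error advertising $1,
# given a floor of $2.
# ASSUMPTION: the floor is the kernel default min_pmtu of 552 unless overridden.
effective_pmtu() {
    advertised=$1
    floor=${2:-552}
    if [ "$advertised" -lt "$floor" ]; then
        echo "$floor"
    else
        echo "$advertised"
    fi
}

effective_pmtu 542    # prints 552
effective_pmtu 544    # prints 552
effective_pmtu 982    # prints 982

# On a live VM the configured floor can be read with:
#   cat /proc/sys/net/ipv4/route/min_pmtu
```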