Bug 1010931

Summary: Migration of OVS tunnel from GRE to VXLAN stops connectivity
Product: Red Hat OpenStack
Component: kernel
Version: 3.0
Status: CLOSED NOTABUG
Severity: medium
Priority: medium
Target Release: 4.0
Hardware: x86_64
OS: Linux
Reporter: Josep 'Pep' Turro Mauri <pep>
Assignee: Don Howard <dhoward>
QA Contact: Jean-Tsung Hsiao <jhsiao>
CC: hateya, lwang, rkhan, tgraf, tvvcox, yeylon
Doc Type: Bug Fix
Type: Bug
Last Closed: 2013-09-26 07:25:36 UTC

Description Josep 'Pep' Turro Mauri 2013-09-23 10:13:42 UTC
Description of problem:
Systems have GRE-type OVS tunnels configured and running fine. When changing them from GRE to VXLAN the connectivity across the tunnels stops.

Version-Release number of selected component (if applicable):
kernel-2.6.32-416.el6.x86_64
openvswitch-1.11.0_8ce28d-1.el6ost.x86_64

How reproducible:
always

Steps to Reproduce:
1. Have a GRE based setup working:
[root@host07 openvswitch]# ovs-vsctl show
(...)
    Bridge br-tun
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port br-tun
            Interface br-tun
                type: internal
        Port "gre-4"
            Interface "gre-4"
                type: gre
                options: {in_key=flow, out_key=flow, remote_ip="10.80.80.4"}
        Port "gre-3"
            Interface "gre-3"
                type: gre
                options: {in_key=flow, out_key=flow, remote_ip="10.80.80.5"}
        Port "gre-1"
            Interface "gre-1"
                type: gre
                options: {in_key=flow, out_key=flow, remote_ip="10.80.80.6"}
(...)

2. Switch tunnel from GRE to VXLAN:
[root@host07 openvswitch]# ovs-vsctl -- set interface gre-3 type=vxlan
[root@host07 openvswitch]# ovs-vsctl show
(...)
    Bridge br-tun
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port br-tun
            Interface br-tun
                type: internal
        Port "gre-4"
            Interface "gre-4"
                type: gre
                options: {in_key=flow, out_key=flow, remote_ip="10.80.80.4"}
        Port "gre-3"
            Interface "gre-3"
                type: vxlan
                options: {in_key=flow, out_key=flow, remote_ip="10.80.80.5"}
        Port "gre-1"
            Interface "gre-1"
                type: gre
                options: {in_key=flow, out_key=flow, remote_ip="10.80.80.6"}
(...)

The same change is made on the reciprocal endpoints.

Actual results:
Connectivity across the tunnel stops.

Expected results:
VXLAN tunnels working

Additional info:
Reverting the changes with "ovs-vsctl -- set interface gre-3 type=gre" restores the connectivity.

The following traces are written to ovs-vswitchd.log when changing to type=vxlan:

2013-09-16T10:37:16Z|00181|bridge|INFO|bridge br-tun: added interface gre-3 on port 3
2013-09-16T10:37:16Z|00182|tunnel|WARN|Dropped 12 log messages in last 515 seconds (most recently, 501 seconds ago) due to excessive rate
2013-09-16T10:37:16Z|00183|tunnel|WARN|receive tunnel port not found (10.80.80.7->10.80.80.5, key=0xd, dp port=4, skb mark=0)
2013-09-16T10:37:16Z|00184|tunnel|WARN|receive tunnel port not found (10.80.80.7->10.80.80.5, key=0x13, dp port=4, skb mark=0)
2013-09-16T10:37:16Z|00185|tunnel|WARN|receive tunnel port not found (10.80.80.7->10.80.80.5, key=0x13, dp port=4, skb mark=0)
2013-09-16T10:37:16Z|00186|ofproto_dpif|INFO|Dropped 9 log messages in last 511 seconds (most recently, 501 seconds ago) due to excessive rate
2013-09-16T10:37:16Z|00187|ofproto_dpif|INFO|received packet on unassociated port 65535
2013-09-16T10:37:17Z|00188|tunnel|WARN|receive tunnel port not found (10.80.80.7->10.80.80.5, key=0xe, dp port=4, skb mark=0)
2013-09-16T10:37:17Z|00189|ofproto_dpif|INFO|received packet on unassociated port 65535
2013-09-16T10:37:17Z|00190|tunnel|WARN|receive tunnel port not found (10.80.80.7->10.80.80.5, key=0xd, dp port=4, skb mark=0)
2013-09-16T10:37:17Z|00191|ofproto_dpif|INFO|received packet on unassociated port 65535
2013-09-16T10:37:17Z|00192|ofproto_dpif|INFO|received packet on unassociated port 65535
2013-09-16T10:37:18Z|00193|ofproto_dpif|INFO|received packet on unassociated port 65535

Resetting the bridge via:

  # ovs-vsctl emer-reset

didn't help
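
To see which endpoint pairs and tunnel keys trigger the "receive tunnel port not found" warnings shown above, the log can be summarized with standard text tools. This is only a sketch: the function name summarize_tunnel_warnings is hypothetical, and it assumes the log lines have exactly the layout of the excerpt above.

```shell
# Count "receive tunnel port not found" warnings per endpoint pair and key.
# Assumes the ovs-vswitchd.log line layout shown in the excerpt above.
summarize_tunnel_warnings() {
    grep 'receive tunnel port not found' "$1" |
        sed -E 's/.*\(([0-9.]+->[0-9.]+), key=(0x[0-9a-fA-F]+),.*/\1 key=\2/' |
        LC_ALL=C sort | uniq -c | awk '{print $1, $2, $3}'
}

# Usage (the log path is an assumption, adjust to the local installation):
#   summarize_tunnel_warnings /var/log/openvswitch/ovs-vswitchd.log
```

Each output line has the form "count src->dst key=0x...", which makes it easy to spot whether the warnings come from a single peer or from several.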

Comment 1 Jean-Tsung Hsiao 2013-09-23 19:09:10 UTC
Migration from OVS/GRE to OVS/VXLAN works under kernel-419. Note that the test was NOT done in an OpenStack environment.

Will try kernel-416.

*** gre tunnel ***

    Bridge "grebr0"
        Port "grebr0"
            Interface "grebr0"
                type: internal
        Port "gre1"
            Interface "gre1"
                type: gre
                options: {remote_ip="10.1.0.3"}

    Bridge "grebr0"
        Port "gre1"
            Interface "gre1"
                type: gre
                options: {remote_ip="10.1.0.1"}
        Port "grebr0"
            Interface "grebr0"
                type: internal
ping -c 10 -i 0.2 192.168.3.1
PING 192.168.3.1 (192.168.3.1) 56(84) bytes of data.
64 bytes from 192.168.3.1: icmp_seq=1 ttl=64 time=4.37 ms
64 bytes from 192.168.3.1: icmp_seq=2 ttl=64 time=0.132 ms
64 bytes from 192.168.3.1: icmp_seq=3 ttl=64 time=0.158 ms
64 bytes from 192.168.3.1: icmp_seq=4 ttl=64 time=0.138 ms
64 bytes from 192.168.3.1: icmp_seq=5 ttl=64 time=0.139 ms
64 bytes from 192.168.3.1: icmp_seq=6 ttl=64 time=0.144 ms
64 bytes from 192.168.3.1: icmp_seq=7 ttl=64 time=0.144 ms
64 bytes from 192.168.3.1: icmp_seq=8 ttl=64 time=0.144 ms
64 bytes from 192.168.3.1: icmp_seq=9 ttl=64 time=0.155 ms
64 bytes from 192.168.3.1: icmp_seq=10 ttl=64 time=0.153 ms

*** Changing gre to vxlan ***

    Bridge "grebr0"
        Port "grebr0"
            Interface "grebr0"
                type: internal
        Port "gre1"
            Interface "gre1"
                type: vxlan
                options: {remote_ip="10.1.0.3"}

    Bridge "grebr0"
        Port "gre1"
            Interface "gre1"
                type: vxlan
                options: {remote_ip="10.1.0.1"}
        Port "grebr0"
            Interface "grebr0"
                type: internal
ping -c 10 -i 0.2 192.168.3.1
PING 192.168.3.1 (192.168.3.1) 56(84) bytes of data.
64 bytes from 192.168.3.1: icmp_seq=1 ttl=64 time=0.513 ms
64 bytes from 192.168.3.1: icmp_seq=2 ttl=64 time=0.147 ms
64 bytes from 192.168.3.1: icmp_seq=3 ttl=64 time=0.181 ms
64 bytes from 192.168.3.1: icmp_seq=4 ttl=64 time=0.146 ms
64 bytes from 192.168.3.1: icmp_seq=5 ttl=64 time=0.147 ms
64 bytes from 192.168.3.1: icmp_seq=6 ttl=64 time=0.156 ms
64 bytes from 192.168.3.1: icmp_seq=7 ttl=64 time=0.131 ms
64 bytes from 192.168.3.1: icmp_seq=8 ttl=64 time=0.176 ms
64 bytes from 192.168.3.1: icmp_seq=9 ttl=64 time=0.145 ms
64 bytes from 192.168.3.1: icmp_seq=10 ttl=64 time=0.173 ms

--- 192.168.3.1 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 1800ms
rtt min/avg/max/mdev = 0.131/0.191/0.513/0.109 ms

Comment 2 Jean-Tsung Hsiao 2013-09-23 19:30:54 UTC
(In reply to Jean-Tsung Hsiao from comment #1)
> Migration from OVS/GRE to OVS/VXLAN works under kernel-419 --- test done NOT
> in a Openstack env.
> 
> Will try kernel-416.

Also works under kernel-416.

Comment 9 Josep 'Pep' Turro Mauri 2013-09-26 07:25:36 UTC
Mystery solved: the firewall was blocking the UDP-based VXLAN traffic. Thanks for the assistance.
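
For anyone hitting the same symptom, a rough sketch of the kind of firewall rule involved follows. It assumes the firewall is iptables and that VXLAN runs on UDP port 4789 (the IANA-assigned port); neither is confirmed in this report, so check the port actually in use before applying anything. The helper vxlan_accept_rule is hypothetical and only prints the commands for review instead of running them.

```shell
# Assumption: VXLAN uses UDP port 4789 (IANA-assigned). Override VXLAN_PORT
# if the tunnels are configured with a different destination port.
VXLAN_PORT="${VXLAN_PORT:-4789}"

# Print (do not execute) an iptables rule accepting VXLAN from one peer.
vxlan_accept_rule() {
    printf 'iptables -I INPUT -p udp -s %s --dport %s -j ACCEPT\n' "$1" "$VXLAN_PORT"
}

# Remote endpoints taken from the br-tun configuration in the description.
for peer in 10.80.80.4 10.80.80.5 10.80.80.6; do
    vxlan_accept_rule "$peer"
done
```

GRE is its own IP protocol (47) while VXLAN is carried over UDP, so a rule set that admits GRE does not automatically admit VXLAN; that difference is consistent with the tunnels working as GRE and failing as VXLAN.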