Bug 1559222

Summary: [ovn]fail to ping remote hv after two chassis claim lsp at the same time
Product: Red Hat Enterprise Linux 7 Reporter: haidong li <haili>
Component: openvswitch Assignee: Timothy Redaelli <tredaelli>
Status: CLOSED ERRATA QA Contact: haidong li <haili>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 7.5 CC: atragler, mmichels, mmirecki, pvauter
Target Milestone: rc   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version: openvswitch-2.9.0-22.el7fdn Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2018-06-21 13:36:35 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1549033    

Description haidong li 2018-03-22 02:13:46 UTC
Description of problem:
Ping to a guest on the remote hypervisor fails after two chassis claim the same logical switch port (lsp) at the same time.

Version-Release number of selected component (if applicable):
ovs_version: "2.9.0"

How reproducible:
Every time

Steps to Reproduce:

hv1 and hv0 are on two machines. Claim a logical port lsp0 with veth1 on chassis hv1 and hv0 at the same time; the port is bound to hv1 successfully after I set the requested-chassis option to hv1. Then, on hv1, pinging from veth2 to guest1 succeeds, but pinging from veth2 to guest0 fails unless I remove the claimed lsp0 configuration on hv0. (See the command sketch after the diagram below.)


          +-----+         +-----+
          |     |         |     |
          | hv1 |         | hv0 |
 guest1---|     |---------|     |------guest0
          |     |         |     |
          +-----+         +-----+
           |veth1            |veth1
           |                 |
           veth2            veth2


ovn-northd and ovn-controller are started on hv1:

[root@dell-per730-19 ~]# ovn-nbctl show
switch 0c96a118-2a66-49a5-8612-77458a20f7ce (s2)
    port hv1_vm00_vnet1                                                  <----------------------this is guest1's port
        addresses: ["00:de:ad:01:00:01 172.16.102.11 2001:db8:102::11"]
    port lsp0
        addresses: ["3e:83:be:5d:5d:30 172.16.102.33 2001:db8:102::33"]
    port hv0_vm01_vnet1
        addresses: ["00:de:ad:00:01:01 172.16.102.22 2001:db8:102::22"]
    port hv1_vm01_vnet1
        addresses: ["00:de:ad:01:01:01 172.16.102.12 2001:db8:102::12"]
    port hv0_vm00_vnet1
        addresses: ["00:de:ad:00:00:01 172.16.102.21 2001:db8:102::21"]     <-----------------------this is guest0's port
[root@dell-per730-19 ~]# ovs-vsctl show
bb95cb3e-0bf2-4f4f-bf4b-dd49cc26ea22
    Bridge br-int
        fail_mode: secure
        Port "hv1_vm00_vnet1"
            Interface "hv1_vm00_vnet1"
        Port "veth1"
            Interface "veth1"
        Port br-int
            Interface br-int
                type: internal
        Port "hv1_vm01_vnet1"
            Interface "hv1_vm01_vnet1"
        Port "ovn-hv0-0"
            Interface "ovn-hv0-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="20.0.0.26"}
    ovs_version: "2.9.0"
[root@dell-per730-19 ~]# ip link show veth2
47: veth2@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000
    link/ether 3e:83:be:5d:5d:30 brd ff:ff:ff:ff:ff:ff
[root@dell-per730-19 ~]#
[root@dell-per730-19 ~]# ovn-nbctl lsp-get-options lsp0
requested-chassis=hv1
[root@dell-per730-19 ~]# ovn-sbctl list Chassis
_uuid               : 340aa369-f590-4cc5-af73-c443b6ba8492
encaps              : [49f57c8f-91cd-4d9e-93d1-e3dbc9ac0ac6]
external_ids        : {datapath-type="", iface-types="geneve,gre,internal,lisp,patch,stt,system,tap,vxlan", ovn-bridge-mappings=""}
hostname            : "dell-per730-19.rhts.eng.pek2.redhat.com"
name                : "hv1"
nb_cfg              : 0
vtep_logical_switches: []

_uuid               : 87330357-3438-430a-930d-939181ad9d77
encaps              : [326c013a-8b6b-4b22-b53f-e7dcaff99194]
external_ids        : {datapath-type="", iface-types="geneve,gre,internal,lisp,patch,stt,system,tap,vxlan", ovn-bridge-mappings=""}
hostname            : "dell-per730-49.rhts.eng.pek2.redhat.com"
name                : "hv0"
nb_cfg              : 0
vtep_logical_switches: []
[root@dell-per730-19 ~]#
[root@dell-per730-19 ~]# ovs-vsctl get interface veth1 external-ids
{iface-id="lsp0"}
[root@dell-per730-19 ~]#
[root@dell-per730-19 ~]# ovn-nbctl lsp-get-options lsp0
requested-chassis=hv1
[root@dell-per730-19 ~]#
[root@dell-per730-19 ~]# ovn-sbctl --bare --columns chassis find port_binding logical_port=lsp0
340aa369-f590-4cc5-af73-c443b6ba8492
[root@dell-per730-19 ~]#
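
The binding above points at the chassis record whose _uuid matches hv1 in the Chassis list. A quick way to map the UUID back to the chassis name (a sketch, not taken from the original log):

ovn-sbctl get chassis 340aa369-f590-4cc5-af73-c443b6ba8492 name    # should print "hv1"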
[root@dell-per730-19 ~]# ping 172.16.102.21
PING 172.16.102.21 (172.16.102.21) 56(84) bytes of data.
^C
--- 172.16.102.21 ping statistics ---
1 packets transmitted, 0 received, 100% packet loss, time 0ms

[root@dell-per730-19 ~]# ping 172.16.102.11
PING 172.16.102.11 (172.16.102.11) 56(84) bytes of data.
64 bytes from 172.16.102.11: icmp_seq=1 ttl=64 time=0.715 ms
^C
--- 172.16.102.11 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.715/0.715/0.715/0.000 ms




ovn-controller is started on hv0:

[root@dell-per730-49 ovn]# ip add show veth2
45: veth2@veth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether da:9f:4d:86:51:62 brd ff:ff:ff:ff:ff:ff
    inet6 fe80::d89f:4dff:fe86:5162/64 scope link
       valid_lft forever preferred_lft forever
[root@dell-per730-49 ovn]# ovs-vsctl show
3b187b97-501a-494c-b322-b19ac5bcbcec
    Bridge br-int
        fail_mode: secure
        Port "hv0_vm00_vnet1"
            Interface "hv0_vm00_vnet1"
        Port "ovn-hv1-0"
            Interface "ovn-hv1-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="20.0.0.25"}
        Port br-int
            Interface br-int
                type: internal
        Port "veth1"
            Interface "veth1"
        Port "hv0_vm01_vnet1"
            Interface "hv0_vm01_vnet1"
    ovs_version: "2.9.0"
[root@dell-per730-49 ovn]# ovs-vsctl get interface veth1 external-ids
{iface-id="lsp0"}
[root@dell-per730-49 ovn]#


==============================================================================================

Then I clear the lsp0-related configuration on hv0, and the ping succeeds:

hv0:
[root@dell-per730-49 ovn]# ovs-vsctl clear  interface veth1 external-ids
[root@dell-per730-49 ovn]# ovs-vsctl get interface veth1 external-ids
{}
[root@dell-per730-49 ovn]#


hv1:
[root@dell-per730-19 ~]# ping 172.16.102.21
PING 172.16.102.21 (172.16.102.21) 56(84) bytes of data.
64 bytes from 172.16.102.21: icmp_seq=1 ttl=64 time=1.06 ms
64 bytes from 172.16.102.21: icmp_seq=2 ttl=64 time=0.253 ms
^C
--- 172.16.102.21 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.253/0.659/1.065/0.406 ms
[root@dell-per730-19 ~]#

Actual results:
Ping from veth2 on hv1 to guest0 (172.16.102.21) fails while lsp0 is still claimed on hv0.

Expected results:
Ping from veth2 on hv1 to guest0 succeeds, since requested-chassis=hv1 binds lsp0 to hv1.

Additional info:

Comment 2 Mark Michelson 2018-04-26 12:04:36 UTC
This patch is merged in OVS master and OVS 2.9 branches now. I am moving this to the MODIFIED state.

Comment 3 Mark Michelson 2018-04-26 12:05:49 UTC
For reference, the commit is 656208e735cf076a9166792cb82a46383f1ff6fe in the 2.9 branch of OVS.

Comment 5 Timothy Redaelli 2018-04-27 12:08:40 UTC
*** Bug 1561499 has been marked as a duplicate of this bug. ***

Comment 8 haidong li 2018-05-23 03:11:37 UTC
This issue is verified on the latest version:
[root@dell-per730-49 ovn]# rpm -qa | grep openvswitch
openvswitch-2.9.0-36.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-3.el7fdp.noarch
openvswitch-ovn-common-2.9.0-36.el7fdp.x86_64
openvswitch-ovn-host-2.9.0-36.el7fdp.x86_64
openvswitch-ovn-central-2.9.0-36.el7fdp.x86_64
[root@dell-per730-49 ovn]# 

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::   ovn_port_migration
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

:: [ 22:20:49 ] :: [   PASS   ] :: Command 'port_migration' (Expected 0, got 0)
::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::
::   Duration: 397s
::   Assertions: 1 good, 0 bad
::   RESULT: PASS


log:
http://beaker-archive.host.prod.eng.bos.redhat.com/beaker-logs/2018/05/24799/2479955/5149251/72013061/TESTOUT.log

job link:
https://beaker.engineering.redhat.com/jobs/2479955

Comment 10 errata-xmlrpc 2018-06-21 13:36:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:1962