Bug 1816616

Summary: [OVN] ARP requests not forwarded to chassis owning the logical port behind FIP in DVR scenarios.
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Dumitru Ceara <dceara>
Component: ovn2.13
Assignee: Dumitru Ceara <dceara>
Status: CLOSED ERRATA
QA Contact: ying xu <yinxu>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: FDP 20.A
CC: ctrautma, jishi, mmichels, ralongi, yinxu
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Clones: 1816617, 1816620
Environment:
Last Closed: 2020-04-14 08:21:39 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1816617, 1816620

Description Dumitru Ceara 2020-03-24 11:41:29 UTC
Description of problem:

In a distributed routing scenario, a VM (VM1) connected to the aggregation switch (public) tries to reach another VM (VM2), which is connected to a different logical switch, through a floating IP (a dnat_and_snat entry with external_mac and logical_port set). If VM2 resides on a different chassis than VM1, the ARP requests for the floating IP don't reach the chassis of VM2.

Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. On a physical topology with two chassis (hv1 and hv2), configure the following logical topology:

ovn-nbctl ls-add sw-agg
ovn-nbctl lsp-add sw-agg sw-agg-ext \
    -- lsp-set-addresses sw-agg-ext 00:00:00:00:00:01

ovn-nbctl lsp-add sw-agg sw-rtr1                   \
    -- lsp-set-type sw-rtr1 router                 \
    -- lsp-set-addresses sw-rtr1 00:00:00:00:01:00 \
    -- lsp-set-options sw-rtr1 router-port=rtr1-sw

ovn-nbctl lsp-add sw-agg sw-agg-ln
ovn-nbctl lsp-set-addresses sw-agg-ln unknown
ovn-nbctl lsp-set-type sw-agg-ln localnet
ovn-nbctl lsp-set-options sw-agg-ln network_name=phys

ovn-nbctl lr-add rtr1
ovn-nbctl lrp-add rtr1 rtr1-sw 00:00:00:00:01:00 10.0.0.1/24 10::1/64

ovn-nbctl lrp-add rtr1 rtr1-sw1 00:00:01:00:00:00 20.0.0.1/24 20::1/64

ovn-nbctl lrp-set-gateway-chassis rtr1-sw hv1 20

ovn-nbctl lr-nat-add rtr1 dnat_and_snat 10.0.0.122 20.0.0.12 sw1-p2 00:00:00:02:00:00

2. Configure the underlying physical network on both hv1 and hv2:
ovs-vsctl set open . external-ids:ovn-bridge-mappings=phys:br-phys

3. Bind sw-agg-ext to an OVS port on hv1.

4. Bind sw1-p2 to an OVS port on hv2.

5. Send ARP request from sw-agg-ext for 10.0.0.122.
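
Note on step 1: the commands add the router port rtr1-sw1 and reference sw1-p2 in the NAT entry, but the report does not show how the internal switch and the sw1-p2 port were created. Below is a minimal sketch of what that missing piece might look like. The switch name sw1, the port name sw1-rtr1 and the VM MAC 00:00:00:00:00:02 are assumptions, not taken from the report; the run wrapper and the DRY_RUN variable are illustrative helpers only.

```shell
# Hypothetical helper: print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Assumed names: sw1, sw1-rtr1; assumed VM MAC: 00:00:00:00:00:02.
# The router-port value and the 20.0.0.12 address match steps 1 and 4.
setup_sw1() {
    run ovn-nbctl ls-add sw1
    run ovn-nbctl lsp-add sw1 sw1-rtr1 \
        -- lsp-set-type sw1-rtr1 router \
        -- lsp-set-addresses sw1-rtr1 00:00:01:00:00:00 \
        -- lsp-set-options sw1-rtr1 router-port=rtr1-sw1
    run ovn-nbctl lsp-add sw1 sw1-p2 \
        -- lsp-set-addresses sw1-p2 "00:00:00:00:00:02 20.0.0.12"
}
```

Running `DRY_RUN=1 setup_sw1` prints the three ovn-nbctl invocations without touching the northbound database; running `setup_sw1` alone applies them.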

Actual results:
ARP requests don't reach hv2 and are not replied to.

Expected results:
ARP requests reach hv2 and get replied to. The neighbor entry is populated on sw-agg-ext.
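
One possible way to check the expected result from the sw-agg-ext side is to look for a resolved neighbor entry for the floating IP. The sketch below filters `ip neigh show`-style output; the function name neigh_resolved and the interface name in the usage note are assumptions for illustration.

```shell
# Reads `ip neigh show` output on stdin and prints the link-layer address
# recorded for the given IP. Exits non-zero if no resolved entry exists
# (for example when the entry is FAILED or INCOMPLETE and has no lladdr).
neigh_resolved() {
    awk -v ip="$1" '
        $1 == ip {
            for (i = 2; i < NF; i++)
                if ($i == "lladdr") { print $(i + 1); found = 1 }
        }
        END { exit !found }'
}
```

For example, after something like `arping -c 3 -I <iface> 10.0.0.122` in the namespace or VM that owns sw-agg-ext (the interface name depends on the setup), `ip neigh show | neigh_resolved 10.0.0.122` should succeed once the fix is in place and fail on an affected build.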

Additional info:
Originally reported upstream: https://mail.openvswitch.org/pipermail/ovs-discuss/2020-March/049856.html

Comment 5 ying xu 2020-03-30 04:05:20 UTC
Reproduced on version:
# rpm -qa|grep ovn
ovn2.13-2.13.0-4.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-basic-1.0-23.noarch
ovn2.13-central-2.13.0-4.el8fdp.x86_64
ovn2.13-host-2.13.0-4.el8fdp.x86_64

Set up the environment as below.
Topology:
s3-----------r1-------public------localnet
|                       |
hv0vm0                 hv1vm0

# ovn-nbctl show
switch 350adf54-a2e0-4b34-94f8-34e05c4c7aca (s3)
    port hv0_vm00_vnet1
        addresses: ["00:de:ad:00:00:01 172.16.103.11"]
    port s3_r1
        type: router
        addresses: ["00:de:ad:ff:01:03 172.16.103.1"]
        router-port: r1_s3
    port hv0_vm01_vnet1
        addresses: ["00:de:ad:00:01:01 172.16.103.12"]
switch 90dcc1b5-8b5d-4bcc-bf11-e4be61ced168 (public)
    port public_r1
        type: router
        router-port: r1_public
    port ln_p1
        type: localnet
        addresses: ["unknown"]
    port hv1_vm00_vnet1
        addresses: ["00:de:ad:01:00:01 172.16.102.11"]
router 826d7a1c-1268-4cd2-8772-a72c3b142336 (r1)
    port r1_public
        mac: "00:de:ad:ff:01:02"
        networks: ["172.16.102.1/24"]
        gateway chassis: [hv0]
    port r1_s3
        mac: "00:de:ad:ff:01:03"
        networks: ["172.16.103.1/24"]
    nat 27138ed3-6fe6-4828-8682-f16793c03034
        external ip: "172.16.102.201"
        logical ip: "172.16.103.11"
        type: "dnat_and_snat"
# ovs-vsctl show
54955998-12e7-4415-8fb0-69dc705bfa0f
    Bridge br-int
        fail_mode: secure
        Port "hv1_vm00_vnet1"
            Interface "hv1_vm00_vnet1"
        Port br-int
            Interface br-int
                type: internal
        Port "ovn-hv0-0"
            Interface "ovn-hv0-0"
                type: geneve
                options: {csum="true", key=flow, remote_ip="20.0.10.26"}
        Port "patch-br-int-to-ln_p1"
            Interface "patch-br-int-to-ln_p1"
                type: patch
                options: {peer="patch-ln_p1-to-br-int"}
    Bridge nat_test
        Port nat_test
            Interface nat_test
                type: internal
        Port "enp4s0d1"
            Interface "enp4s0d1"
        Port "patch-ln_p1-to-br-int"
            Interface "patch-ln_p1-to-br-int"
                type: patch
                options: {peer="patch-br-int-to-ln_p1"}
    ovs_version: "2.11.0"
After setting up the environment, set the gateway chassis:
ovn-nbctl lrp-set-gateway-chassis r1_public hv0 20

Then ping from hv1vm0 to hv0vm0; the ping fails:
# ip nei flush all;ping 172.16.102.201 -c10
PING 172.16.102.201 (172.16.102.201) 56(84) bytes of data.
From 172.16.102.11 icmp_seq=1 Destination Host Unreachable
From 172.16.102.11 icmp_seq=2 Destination Host Unreachable
From 172.16.102.11 icmp_seq=3 Destination Host Unreachable
From 172.16.102.11 icmp_seq=4 Destination Host Unreachable
From 172.16.102.11 icmp_seq=5 Destination Host Unreachable
From 172.16.102.11 icmp_seq=6 Destination Host Unreachable
From 172.16.102.11 icmp_seq=7 Destination Host Unreachable
From 172.16.102.11 icmp_seq=8 Destination Host Unreachable
From 172.16.102.11 icmp_seq=9 Destination Host Unreachable
From 172.16.102.11 icmp_seq=10 Destination Host Unreachable

--- 172.16.102.201 ping statistics ---
10 packets transmitted, 0 received, +10 errors, 100% packet loss, time 9001ms

Verified on version:
# rpm -qa|grep ovn
ovn2.13-2.13.0-7.el8fdp.x86_64
ovn2.13-host-2.13.0-7.el8fdp.x86_64
ovn2.13-central-2.13.0-7.el8fdp.x86_64

# ip nei flush all;ping 172.16.102.201 -c10
PING 172.16.102.201 (172.16.102.201) 56(84) bytes of data.
64 bytes from 172.16.102.201: icmp_seq=1 ttl=64 time=2.17 ms
64 bytes from 172.16.102.201: icmp_seq=2 ttl=64 time=0.384 ms
64 bytes from 172.16.102.201: icmp_seq=3 ttl=64 time=1.31 ms
64 bytes from 172.16.102.201: icmp_seq=4 ttl=64 time=0.520 ms
64 bytes from 172.16.102.201: icmp_seq=5 ttl=64 time=0.453 ms
64 bytes from 172.16.102.201: icmp_seq=6 ttl=64 time=0.447 ms
64 bytes from 172.16.102.201: icmp_seq=7 ttl=64 time=0.405 ms
64 bytes from 172.16.102.201: icmp_seq=8 ttl=64 time=0.477 ms
64 bytes from 172.16.102.201: icmp_seq=9 ttl=64 time=0.398 ms
64 bytes from 172.16.102.201: icmp_seq=10 ttl=64 time=0.483 ms

--- 172.16.102.201 ping statistics ---
10 packets transmitted, 10 received, 0% packet loss, time 9002ms
rtt min/avg/max/mdev = 0.384/0.705/2.176/0.555 ms
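
The two ping runs above differ only in the summary line, so a scripted version of this check could key on the packet-loss percentage. A minimal sketch, assuming the iputils summary format shown above; the function name ping_loss is made up for illustration:

```shell
# Reads ping output on stdin and prints the packet-loss percentage
# taken from the summary line ("..., N% packet loss, ...").
ping_loss() {
    sed -n 's/.* \([0-9.]*\)% packet loss.*/\1/p'
}
```

It could be used as `ip nei flush all; ping 172.16.102.201 -c10 | ping_loss`, treating any non-zero value as a failed verification.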

Comment 7 errata-xmlrpc 2020-04-14 08:21:39 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:1434