Bug 1761371 - [RHEL 8] GARP reply packets from switches are handled on all ovn-controllers
Summary: [RHEL 8] GARP reply packets from switches are handled on all ovn-controllers
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovn2.11
Version: FDP 19.G
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Numan Siddique
QA Contact: haidong li
URL:
Whiteboard:
Depends On: 1749739
Blocks:
 
Reported: 2019-10-14 09:36 UTC by Numan Siddique
Modified: 2020-01-17 02:19 UTC
6 users

Fixed In Version: ovn2.12-2.12.0-2
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1749739
Environment:
Last Closed: 2019-11-06 05:23:45 UTC
Target Upstream Version:
Embargoed:




Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:3721 0 None None None 2019-11-06 05:23:47 UTC

Description Numan Siddique 2019-10-14 09:36:19 UTC
+++ This bug was initially created as a clone of Bug #1749739 +++

Description of problem:
If all chassis have external connectivity (ovn-bridge-mappings defined) and the physical switch keeps sending periodic GARP replies (instead of GARP requests), the replies are handled by every chassis.

It is sufficient to handle those GARPs only on the gateway chassis where the distributed router port is scheduled.
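
For reference, the external-connectivity precondition is the usual bridge-mapping setup on every chassis. A minimal sketch, assuming an illustrative physical network name "physnet1" and provider bridge "br-ex":

# Map the physical network name to a provider bridge on this chassis.
# With this set on every chassis, localnet traffic (including the
# switch's GARPs) enters the OVN pipeline via br-int everywhere.
ovs-vsctl set Open_vSwitch . external-ids:ovn-bridge-mappings=physnet1:br-ex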



Comment 3 haidong li 2019-10-17 09:44:34 UTC
Hi Numan, to verify this bug, is the right approach to check the CPU usage on the chassis while sending GARPs from the physical switch? Thanks!

Comment 4 Numan Siddique 2019-10-17 13:25:29 UTC
Yes. You can flood GARPs from the physical network and monitor the CPU.
All chassis should have bridge mappings configured so that the packet enters the OVN pipeline, i.e. it should enter br-int.
But it should be processed by only one node, and there should be no CPU hogging.
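
To confirm the replies actually reach the integration bridge on a given chassis, a capture along these lines should work (a sketch; the BPF expression arp[6:2] = 2 matches ARP opcode 2, i.e. replies):

# Watch for ARP replies arriving on the integration bridge
tcpdump -eni br-int 'arp and arp[6:2] = 2'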

Thanks

Comment 5 haidong li 2019-10-18 08:40:34 UTC
Verified on the latest version:
[root@hp-dl380g10-05 bin]# rpm -qa | grep openvswitch
openvswitch-selinux-extra-policy-1.0-19.el8fdp.noarch
openvswitch2.11-2.11.0-24.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-1.0-148.noarch
[root@hp-dl380g10-05 bin]# rpm -qa | grep ovn
ovn2.11-2.11.1-8.el8fdp.x86_64
ovn2.11-host-2.11.1-8.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-1.0-148.noarch
ovn2.11-central-2.11.1-8.el8fdp.x86_64
[root@hp-dl380g10-05 bin]# 
[root@dell-per730-42 ovn]# ovn-nbctl show
switch 08f2e2c0-0a9f-4f76-ac5f-a74f809d6dd8 (s2)
    port hv1_vm01_vnet1
        addresses: ["00:de:ad:01:01:01 172.16.102.12"]
    port s2_r1
        type: router
        addresses: ["00:de:ad:ff:01:02 172.16.102.1"]
        router-port: r1_s2
    port hv1_vm00_vnet1
        addresses: ["00:de:ad:01:00:01 172.16.102.11"]
switch f89eb566-1792-445a-b0fc-c4dc2132e5ab (s3)
    port hv0_vm01_vnet1
        addresses: ["00:de:ad:00:01:01 172.16.103.12"]
    port hv0_vm00_vnet1
        addresses: ["00:de:ad:00:00:01 172.16.103.11"]
    port s3_r1
        type: router
        addresses: ["00:de:ad:ff:01:03 172.16.103.1"]
        router-port: r1_s3
switch 555cf4a3-aa90-44bc-9735-0b8ac50b3a0d (public)
    port ln_p1
        type: localnet
        addresses: ["unknown"]
    port public_r1
        type: router
        router-port: r1_public
router 38b406b7-e78d-48e6-bf5f-89ebec55ddd9 (r1)
    port r1_public
        mac: "40:44:00:00:00:03"
        networks: ["172.16.104.1/24"]
    port r1_s3
        mac: "00:de:ad:ff:01:03"
        networks: ["172.16.103.1/24"]
    port r1_s2
        mac: "00:de:ad:ff:01:02"
        networks: ["172.16.102.1/24"]
    nat 1e99a341-7b98-4758-9574-d9f85bde501a
        external ip: "172.16.104.200"
        logical ip: "172.16.102.11"
        type: "dnat_and_snat"
    nat c87da4ea-7a83-4fc8-9613-6092c1d07cb6
        external ip: "172.16.104.201"
        logical ip: "172.16.103.11"
        type: "dnat_and_snat"
[root@dell-per730-42 ovn]# 
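
To check where the distributed router port is scheduled (if gateway chassis were assigned to it), ovn-nbctl can list them; a sketch using the r1_public port from the topology above:

# List the gateway chassis (and priorities) assigned to the router port
ovn-nbctl lrp-get-gateway-chassis r1_public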


Generate a large number of GARP reply packets from another machine on the physical switch:

>>> from scapy.all import *
>>> for x in range(1000):
...     sendp(Ether(src="00:de:ad:01:00:01", dst="ff:ff:ff:ff:ff:ff")/ARP(op=2, hwsrc="00:de:ad:01:00:01", hwdst="00:de:ad:01:00:01", psrc="172.16.102.99", pdst="172.16.102.99"), iface="p4p1")
... 
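
While the flood runs, CPU usage can be sampled on each chassis with a loop along these lines (a sketch; it assumes a single ovn-controller process per chassis):

# Print one top snapshot line for ovn-controller every second
while true; do top -b -n 1 -p "$(pidof ovn-controller)" | tail -n 1; sleep 1; done

In the snapshots below, ovn-controller stays near idle on both chassis.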

[root@dell-per730-42 ~]# top

top - 03:39:47 up 1 day, 23:59,  2 users,  load average: 0.05, 0.10, 0.04
Tasks: 547 total,   1 running, 546 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31967.2 total,  23898.4 free,   1976.0 used,   6092.8 buff/cache
MiB Swap:  16128.0 total,  16128.0 free,      0.0 used.  29505.8 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                     
38168 openvsw+  10 -10 3408452 157256  33044 S   1.3   0.5   0:41.38 ovs-vswitchd                                                
38237 root      10 -10  264336   5632   3356 S   0.3   0.0   0:01.79 ovn-controller                                              
38412 qemu      20   0 7025884 567320  21220 S   0.3   1.7   1:11.51 qemu-kvm                                                    
38571 qemu      20   0 6169032 585332  21116 S   0.3   1.8   1:01.51 qemu-kvm                                                    
    1 root      20   0  244620  11584   8276 S   0.0   0.0   0:06.95 systemd                                                     
    2 root      20   0       0      0      0 S   0.0   0.0   0:00.07 kthreadd                      
[root@hp-dl380g10-05 ~]# top

top - 04:13:58 up 2 days, 32 min,  2 users,  load average: 0.00, 0.00, 0.00
Tasks: 525 total,   1 running, 524 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  63989.4 total,  55211.2 free,   2454.2 used,   6324.0 buff/cache
MiB Swap:  28608.0 total,  28608.0 free,      0.0 used.  60878.0 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND                                                   
35052 openvsw+  10 -10 3408432 157416  33044 S   2.0   0.2   2:09.00 ovs-vswitchd                                              
 4503 root      20   0  424984  31828  16224 S   0.3   0.0   0:14.86 tuned                                                     
    1 root      20   0  247016  14340   9040 S   0.0   0.0   0:08.47 systemd                                                   
    2 root      20   0       0      0      0 S   0.0   0.0   0:00.15 kthreadd                                                  
    3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp                                                    
    4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp                                                
    6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-kblockd                                      
    7 root      20   0       0      0      0 I   0.0   0.0   0:00.01 kworker/u96:0-cpuset_migrate_mm                           
    9 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq                                              
   10 root      20   0       0      0      0 S   0.0   0.0   0:00.04 ksoftirqd/0                                               
   11 root      20   0       0      0      0 I   0.0   0.0   0:34.95 rcu_sched

Comment 7 errata-xmlrpc 2019-11-06 05:23:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:3721

