+++ This bug was initially created as a clone of Bug #1749739 +++

Description of problem:
If all the chassis have external connectivity (ovn-bridge-mappings defined) and the physical switch keeps sending periodic GARP replies (instead of requests), they are handled by every chassis. It is enough for those GARPs to be handled only on the gateway chassis where the distributed router port is scheduled.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
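For reference, a gratuitous ARP carries the same sender and target protocol address; the only difference between a GARP request and the GARP replies described above is the ARP op field. A minimal scapy sketch of building one (the MAC/IP addresses and the interface name p4p1 are placeholders, not taken from a real deployment):

from scapy.all import ARP, Ether, sendp

# Gratuitous ARP: sender and target protocol addresses are identical.
# op=1 builds a GARP request ("who-has"); op=2 builds a GARP reply
# ("is-at"), which is what the physical switch in this report emits.
garp_reply = (
    Ether(src="00:de:ad:01:00:01", dst="ff:ff:ff:ff:ff:ff")
    / ARP(op=2,
          hwsrc="00:de:ad:01:00:01", hwdst="00:de:ad:01:00:01",
          psrc="172.16.102.99", pdst="172.16.102.99")
)
sendp(garp_reply, iface="p4p1")  # placeholder interface name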
Hi Numan,

To verify this bug, is it the right way to check the CPU usage of the chassis while sending GARPs from the physical switch? Thanks!
Yes. You can flood the GARPs from the physical network and monitor the CPU. All the chassis should have bridge mappings configured so that the packet enters the OVN pipeline, i.e. it should enter br-int. But it should be processed by only one node, and there should be no CPU hogging. Thanks
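For the monitoring side, a minimal sketch of sampling the CPU usage of ovs-vswitchd and ovn-controller on each chassis while the GARP flood runs (uses the psutil library; the 10-second sample window is an arbitrary choice):

import time
import psutil

WATCH = {"ovs-vswitchd", "ovn-controller"}

# Prime the per-process CPU counters, let the flood run, then read them.
procs = [p for p in psutil.process_iter(["name"]) if p.info["name"] in WATCH]
for p in procs:
    p.cpu_percent(None)
time.sleep(10)  # sample window while the GARPs are being flooded
for p in procs:
    print(p.info["name"], p.cpu_percent(None), "%CPU")

On a correctly patched setup, only the gateway chassis hosting the distributed router port should show noticeable load; the other chassis should stay near idle.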
Verified on the latest version:

[root@hp-dl380g10-05 bin]# rpm -qa | grep openvswitch
openvswitch-selinux-extra-policy-1.0-19.el8fdp.noarch
openvswitch2.11-2.11.0-24.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-1.0-148.noarch
[root@hp-dl380g10-05 bin]# rpm -qa | grep ovn
ovn2.11-2.11.1-8.el8fdp.x86_64
ovn2.11-host-2.11.1-8.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-1.0-148.noarch
ovn2.11-central-2.11.1-8.el8fdp.x86_64
[root@hp-dl380g10-05 bin]#

[root@dell-per730-42 ovn]# ovn-nbctl show
switch 08f2e2c0-0a9f-4f76-ac5f-a74f809d6dd8 (s2)
    port hv1_vm01_vnet1
        addresses: ["00:de:ad:01:01:01 172.16.102.12"]
    port s2_r1
        type: router
        addresses: ["00:de:ad:ff:01:02 172.16.102.1"]
        router-port: r1_s2
    port hv1_vm00_vnet1
        addresses: ["00:de:ad:01:00:01 172.16.102.11"]
switch f89eb566-1792-445a-b0fc-c4dc2132e5ab (s3)
    port hv0_vm01_vnet1
        addresses: ["00:de:ad:00:01:01 172.16.103.12"]
    port hv0_vm00_vnet1
        addresses: ["00:de:ad:00:00:01 172.16.103.11"]
    port s3_r1
        type: router
        addresses: ["00:de:ad:ff:01:03 172.16.103.1"]
        router-port: r1_s3
switch 555cf4a3-aa90-44bc-9735-0b8ac50b3a0d (public)
    port ln_p1
        type: localnet
        addresses: ["unknown"]
    port public_r1
        type: router
        router-port: r1_public
router 38b406b7-e78d-48e6-bf5f-89ebec55ddd9 (r1)
    port r1_public
        mac: "40:44:00:00:00:03"
        networks: ["172.16.104.1/24"]
    port r1_s3
        mac: "00:de:ad:ff:01:03"
        networks: ["172.16.103.1/24"]
    port r1_s2
        mac: "00:de:ad:ff:01:02"
        networks: ["172.16.102.1/24"]
    nat 1e99a341-7b98-4758-9574-d9f85bde501a
        external ip: "172.16.104.200"
        logical ip: "172.16.102.11"
        type: "dnat_and_snat"
    nat c87da4ea-7a83-4fc8-9613-6092c1d07cb6
        external ip: "172.16.104.201"
        logical ip: "172.16.103.11"
        type: "dnat_and_snat"
[root@dell-per730-42 ovn]#

Produce a lot of GARP reply packets from another machine on the switch:

>>> from scapy.all import *
>>> for x in range(1000):
...     sendp(Ether(src="00:de:ad:01:00:01",dst="ff:ff:ff:ff:ff:ff")/ARP(op=2,hwsrc="00:de:ad:01:00:01",hwdst="00:de:ad:01:00:01",psrc="172.16.102.99",pdst="172.16.102.99"),iface="p4p1")
...

[root@dell-per730-42 ~]# top
top - 03:39:47 up 1 day, 23:59,  2 users,  load average: 0.05, 0.10, 0.04
Tasks: 547 total,   1 running, 546 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  31967.2 total,  23898.4 free,   1976.0 used,   6092.8 buff/cache
MiB Swap:  16128.0 total,  16128.0 free,      0.0 used.  29505.8 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
38168 openvsw+  10 -10 3408452 157256  33044 S   1.3   0.5   0:41.38 ovs-vswitchd
38237 root      10 -10  264336   5632   3356 S   0.3   0.0   0:01.79 ovn-controller
38412 qemu      20   0 7025884 567320  21220 S   0.3   1.7   1:11.51 qemu-kvm
38571 qemu      20   0 6169032 585332  21116 S   0.3   1.8   1:01.51 qemu-kvm
    1 root      20   0  244620  11584   8276 S   0.0   0.0   0:06.95 systemd
    2 root      20   0       0      0      0 S   0.0   0.0   0:00.07 kthreadd

[root@hp-dl380g10-05 ~]# top
top - 04:13:58 up 2 days, 32 min,  2 users,  load average: 0.00, 0.00, 0.00
Tasks: 525 total,   1 running, 524 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.0 us,  0.0 sy,  0.0 ni,100.0 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  63989.4 total,  55211.2 free,   2454.2 used,   6324.0 buff/cache
MiB Swap:  28608.0 total,  28608.0 free,      0.0 used.  60878.0 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
35052 openvsw+  10 -10 3408432 157416  33044 S   2.0   0.2   2:09.00 ovs-vswitchd
 4503 root      20   0  424984  31828  16224 S   0.3   0.0   0:14.86 tuned
    1 root      20   0  247016  14340   9040 S   0.0   0.0   0:08.47 systemd
    2 root      20   0       0      0      0 S   0.0   0.0   0:00.15 kthreadd
    3 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_gp
    4 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 rcu_par_gp
    6 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 kworker/0:0H-kblockd
    7 root      20   0       0      0      0 I   0.0   0.0   0:00.01 kworker/u96:0-cpuset_migrate_mm
    9 root       0 -20       0      0      0 I   0.0   0.0   0:00.00 mm_percpu_wq
   10 root      20   0       0      0      0 S   0.0   0.0   0:00.04 ksoftirqd/0
   11 root      20   0       0      0      0 I   0.0   0.0   0:34.95 rcu_sched
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2019:3721