Bug 1745002

Summary: OVN: do not discard mac_bindings under heavy load
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: lorenzo bianconi <lorenzo.bianconi>
Component: ovn2.11
Assignee: lorenzo bianconi <lorenzo.bianconi>
Status: CLOSED ERRATA
QA Contact: haidong li <haili>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: RHEL 7.7
CC: ctrautma, fleitner, kfida, qding
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: ovn2.11-2.11.0-36.el7fdn
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-10-01 07:21:33 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description lorenzo bianconi 2019-08-23 13:00:20 UTC
Description of problem:


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:


Additional info:

Comment 2 haidong li 2019-08-28 02:14:35 UTC
Hi Lorenzo,
   Could you please describe how to reproduce this bug?
Is any new configuration needed? Thanks!

Comment 4 haidong li 2019-09-10 09:00:11 UTC
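Reproduced on the old version (ovn2.11-2.11.0-26.el7fdp). With ovs-vswitchd and ovn-controller at roughly 100% CPU, the ping succeeds but ovn-sbctl list mac_binding returns nothing: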

[root@dell-per730-19 ovn]# rpm -qa | grep openvswitch
kernel-kernel-networking-openvswitch-ovn-1.0-138.noarch
openvswitch-selinux-extra-policy-1.0-13.el7fdp.noarch
openvswitch2.11-2.11.0-18.el7fdp.x86_64
[root@dell-per730-19 ovn]# rpm -qa | grep ovn
ovn2.11-central-2.11.0-26.el7fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-1.0-138.noarch
ovn2.11-2.11.0-26.el7fdp.x86_64
ovn2.11-host-2.11.0-26.el7fdp.x86_64
[root@dell-per730-19 ovn]# 

[root@dell-per730-19 ovn]# top

top - 04:27:53 up 5 days,  5:30,  3 users,  load average: 2.16, 1.89, 1.76
Tasks: 461 total,   3 running, 458 sleeping,   0 stopped,   0 zombie
%Cpu(s):  7.7 us,  0.8 sy,  0.0 ni, 91.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 65708808 total, 60933784 free,  2292016 used,  2483008 buff/cache
KiB Swap: 29241340 total, 29241340 free,        0 used. 62828764 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                               
25693 openvsw+  10 -10 2400440 182780  17932 R 100.0  0.3  54:27.85 ovs-vswitchd                                          
25767 root      10 -10  384444 106492   1708 R  94.1  0.2  35:20.03 ovn-controller                                        
25735 root      20   0  167640 102308   1776 S  35.3  0.2   3:34.35 ovsdb-server                                          
25636 openvsw+  10 -10   98460  30460   1728 S  23.5  0.0   2:41.49 ovsdb-server                                          
16280 root      20   0  162316   2540   1524 R   5.9  0.0   0:00.03 top                                                   
    1 root      20   0  202004  15148   4224 S   0.0  0.0   0:30.98 systemd                                               
    2 root      20   0       0      0      0 S   0.0  0.0   0:00.24 kthreadd                                              
    4 root       0 -20       0      0      0 S   0.0  0.0   0:00.00 kworker/0:0H                                          
    6 root      20   0       0      0      0 S   0.0  0.0   0:02.18 ksoftirqd/0                                           
    7 root      rt   0       0      0      0 S   0.0  0.0   0:01.37 migration/0                                           
    8 root      20   0       0      0      0 S   0.0  0.0   0:00.00 rcu_bh  

[root@dell-per730-19 ovn]# ping 172.16.103.11
PING 172.16.103.11 (172.16.103.11) 56(84) bytes of data.
64 bytes from 172.16.103.11: icmp_seq=1 ttl=63 time=1.60 ms
64 bytes from 172.16.103.11: icmp_seq=2 ttl=63 time=0.200 ms
^C
--- 172.16.103.11 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.200/0.900/1.600/0.700 ms
[root@dell-per730-19 ovn]#
[root@dell-per730-19 ovn]# ovn-sbctl list mac_binding
[root@dell-per730-19 ovn]# ovn-sbctl list mac_binding
[root@dell-per730-19 ovn]# 
===============================================================================================================

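The fix is verified on the latest version (ovn2.11-2.11.0-36.el7fdp). With ovs-vswitchd again under heavy load, ovn-sbctl list mac_binding now shows entries for the pinged addresses: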
[root@dell-per730-19 ~]# uname -a
Linux dell-per730-19.rhts.eng.pek2.redhat.com 3.10.0-1062.el7.x86_64 #1 SMP Thu Jul 18 20:25:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@dell-per730-19 ~]# rpm -qa | grep openvswitch
openvswitch-selinux-extra-policy-1.0-13.el7fdp.noarch
openvswitch2.11-2.11.0-21.el7fdp.x86_64
[root@dell-per730-19 ~]# rpm -qa | grep ovn
ovn2.11-central-2.11.0-36.el7fdp.x86_64
ovn2.11-2.11.0-36.el7fdp.x86_64
ovn2.11-host-2.11.0-36.el7fdp.x86_64
[root@dell-per730-19 ~]#
[root@dell-per730-19 ~]# top

top - 02:17:03 up 5 days,  3:19,  3 users,  load average: 1.39, 0.35, 0.15
Tasks: 471 total,   3 running, 468 sleeping,   0 stopped,   0 zombie
%Cpu(s):  4.6 us,  2.0 sy,  0.0 ni, 93.3 id,  0.0 wa,  0.0 hi,  0.1 si,  0.0 st
KiB Mem : 65708808 total, 56627600 free,  3262252 used,  5818956 buff/cache
KiB Swap: 29241340 total, 29241340 free,        0 used. 61850996 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                               
32627 openvsw+  10 -10 2336220 125240  17936 R 120.2  0.2   7:15.68 ovs-vswitchd                                          
32709 root      10 -10  285656   8032   1740 R  36.1  0.0   2:46.45 ovn-controller                                        
32675 root      10 -10   62576   5232   1084 S   8.6  0.0   0:10.01 ovn-northd                                            
 7362 root      20   0  694216 157200   6780 S   5.3  0.2   9:57.87 NetworkManager                                        
32293 nobody    20   0   53896   1232    832 S   5.0  0.0   0:09.10 dnsmasq                                               
32661 root      20   0   70484   5144   1736 S   4.0  0.0   0:06.26 ovsdb-server                                          
32569 openvsw+  10 -10   73552   6108   1740 S   3.0  0.0   0:04.97 ovsdb-server                                          
32669 root      20   0   71124   5808   1776 S   1.7  0.0   0:04.83 ovsdb-server                                          
 6355 root      20   0  115944   2572   1684 S   1.0  0.0   0:11.43 bash                                                  
    9 root      20   0       0      0      0 S   0.7  0.0   6:14.15 rcu_sched                                             
  888 root      20   0   40228   6480   6056 S   0.7  0.0   0:42.95 systemd-journal                                       
 1450 dbus      20   0   66616   2852   1904 S   0.7  0.0   0:48.78 dbus-daemon                                           
 1933 root      20   0  265572   6456   3132 S   0.7  0.0   0:36.62 rsyslogd                                              
16290 root      20   0  162320   2664   1580 R   0.7  0.0   0:00.06 top                                                   
    1 root      20   0  202004  15140   4216 S   0.3  0.0   0:26.32 systemd                                               
   56 root      20   0       0      0      0 S   0.3  0.0   0:02.84 ksoftirqd/9  
[root@dell-per730-19 ~]# virsh console hv1_vm00
Connected to domain hv1_vm00
Escape character is ^]

[root@localhost ~]# ping 172.16.103.12
PING 172.16.103.12 (172.16.103.12) 56(84) bytes of data.
64 bytes from 172.16.103.12: icmp_seq=1 ttl=63 time=347 ms
64 bytes from 172.16.103.12: icmp_seq=2 ttl=63 time=0.928 ms

--- 172.16.103.12 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.928/174.316/347.704/173.388 ms
[root@localhost ~]# ping 172.16.102.12
PING 172.16.102.12 (172.16.102.12) 56(84) bytes of data.
64 bytes from 172.16.102.12: icmp_seq=1 ttl=64 time=0.813 ms
64 bytes from 172.16.102.12: icmp_seq=2 ttl=64 time=0.389 ms

--- 172.16.102.12 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1000ms
rtt min/avg/max/mdev = 0.389/0.601/0.813/0.212 ms
[root@dell-per730-19 ~]# ovn-sbctl list mac_binding
_uuid               : 37df4490-e886-48c8-92ac-314e830c8030
datapath            : a46f7f86-e38d-4871-9b12-a2ea70ad4541
ip                  : "172.16.103.11"
logical_port        : "r1_s3"
mac                 : "00:de:ad:00:00:01"

_uuid               : c84cf819-0f7b-49c8-bc43-a1e1bdedb5f5
datapath            : a46f7f86-e38d-4871-9b12-a2ea70ad4541
ip                  : "::"
logical_port        : "r1_s3"
mac                 : "00:00:00:00:00:00"

_uuid               : d9440e6d-9e76-4734-9cba-2e7e4b2e2868
datapath            : a46f7f86-e38d-4871-9b12-a2ea70ad4541
ip                  : "::"
logical_port        : "r1_s2"
mac                 : "00:00:00:00:00:00"

_uuid               : 62aa3d3d-5bc1-4a55-afe1-a46e27889076
datapath            : a46f7f86-e38d-4871-9b12-a2ea70ad4541
ip                  : "172.16.102.12"
logical_port        : "r1_s2"
mac                 : "00:de:ad:01:01:01"

_uuid               : bb7c2c96-0523-4abe-bbd8-d9fe28ef3188
datapath            : a46f7f86-e38d-4871-9b12-a2ea70ad4541
ip                  : "172.16.103.12"
logical_port        : "r1_s3"
mac                 : "00:de:ad:00:01:01"

_uuid               : cf776817-681b-4cfd-9159-428b96a58e3b
datapath            : a46f7f86-e38d-4871-9b12-a2ea70ad4541
ip                  : "172.16.102.11"
logical_port        : "r1_s2"
mac                 : "00:de:ad:01:00:01"
[root@dell-per730-19 ~]#
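
The manual check above (ping, then inspect the mac_binding table) could be automated roughly as follows. This is only a sketch: the helper names parse_mac_bindings and missing_bindings are hypothetical, the parser assumes the "key : value" layout with blank-line record separators seen in the output above, and SAMPLE_OUTPUT is a two-record excerpt of that output rather than a live query.

```python
# Sketch: parse `ovn-sbctl list mac_binding` text and report which
# expected IPs have no binding. Helper names are hypothetical.

SAMPLE_OUTPUT = """\
_uuid               : 37df4490-e886-48c8-92ac-314e830c8030
datapath            : a46f7f86-e38d-4871-9b12-a2ea70ad4541
ip                  : "172.16.103.11"
logical_port        : "r1_s3"
mac                 : "00:de:ad:00:00:01"

_uuid               : 62aa3d3d-5bc1-4a55-afe1-a46e27889076
datapath            : a46f7f86-e38d-4871-9b12-a2ea70ad4541
ip                  : "172.16.102.12"
logical_port        : "r1_s2"
mac                 : "00:de:ad:01:01:01"
"""


def parse_mac_bindings(text):
    """Return one dict per record; records are separated by blank lines."""
    records = []
    current = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            # A blank line closes the record being built, if any.
            if current:
                records.append(current)
                current = {}
            continue
        # Split on the first colon only, so MAC and IPv6 values stay intact.
        key, sep, value = line.partition(":")
        if not sep:
            continue
        current[key.strip()] = value.strip().strip('"')
    if current:
        records.append(current)
    return records


def missing_bindings(records, expected_ips):
    """Return the expected IPs that have no mac_binding record."""
    seen = {record.get("ip") for record in records}
    return [ip for ip in expected_ips if ip not in seen]


if __name__ == "__main__":
    bindings = parse_mac_bindings(SAMPLE_OUTPUT)
    print(missing_bindings(bindings, ["172.16.103.11", "172.16.102.12"]))
```

On an affected build the table is empty, so every pinged address would be reported as missing. In practice the text would come from running ovn-sbctl list mac_binding on the host instead of from SAMPLE_OUTPUT.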

Comment 6 errata-xmlrpc 2019-10-01 07:21:33 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:2943
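
To tell whether a host already carries the fix, the release number in the rpm -qa output can be compared against the Fixed In Version (2.11.0-36). The sketch below is a simplification, not rpm's own version comparison: it assumes package names shaped like the ones in this report and only handles the 2.11.0 stream. The names ovn_release and has_fix are hypothetical.

```python
import re

# Release number taken from "Fixed In Version: ovn2.11-2.11.0-36" above.
FIXED_RELEASE = 36

# Matches ovn2.11, ovn2.11-central and ovn2.11-host builds of 2.11.0.
_OVN_PACKAGE = re.compile(r"ovn2\.11(?:-[a-z]+)?-2\.11\.0-(\d+)\.")


def ovn_release(package):
    """Return the release number of an ovn2.11 package name, or None."""
    match = _OVN_PACKAGE.match(package)
    return int(match.group(1)) if match else None


def has_fix(package):
    """True if the package is an ovn2.11 2.11.0 build at release 36 or later."""
    release = ovn_release(package)
    return release is not None and release >= FIXED_RELEASE


if __name__ == "__main__":
    for name in ("ovn2.11-2.11.0-26.el7fdp.x86_64",
                 "ovn2.11-host-2.11.0-36.el7fdp.x86_64"):
        print(name, has_fix(name))
```

Non-OVN packages such as openvswitch2.11 do not match the pattern and are reported as not carrying the fix, so filter the package list to ovn2.11 names first.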