Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.
The FDP team is no longer accepting new bugs in Bugzilla. Please report your issues under FDP project in Jira. Thanks.

Bug 1731165

Summary: CPU usage goes high while constantly adding veth ports to an OVS bridge
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: haidong li <haili>
Component: openvswitch2.11
Assignee: Flavio Leitner <fleitner>
Status: CLOSED NOTABUG
QA Contact: haidong li <haili>
Severity: medium
Docs Contact:
Priority: unspecified
Version: FDP 19.D
CC: ctrautma, fleitner, jhsiao, mcroce, ralongi, tredaelli
Target Milestone: ---   
Target Release: ---   
Hardware: x86_64   
OS: Linux   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2019-08-21 17:41:20 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Description haidong li 2019-07-18 13:59:28 UTC
Description of problem:
CPU usage goes high while constantly adding veth ports to an OVS bridge.

Version-Release number of selected component (if applicable):
[root@dell-per730-42 ~]# uname -a
Linux dell-per730-42.rhts.eng.pek2.redhat.com 3.10.0-1061.el7.x86_64 #1 SMP Thu Jul 11 21:02:44 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
[root@dell-per730-42 ~]# rpm -qa | grep openvswitch
openvswitch2.11-2.11.0-14.el7fdp.x86_64
openvswitch-selinux-extra-policy-1.0-11.el7fdp.noarch
[root@dell-per730-42 ~]# 

How reproducible:
Every time.

Steps to Reproduce:
1. Add an OVS bridge.
2. Constantly add veth ports to the bridge.

[root@dell-per730-42 ~]# top

top - 07:50:36 up 47 min,  2 users,  load average: 0.43, 0.16, 0.10
Tasks: 464 total,   9 running, 455 sleeping,   0 stopped,   0 zombie
%Cpu(s):  1.5 us,  1.9 sy,  0.0 ni, 96.5 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
KiB Mem : 32744248 total, 31240408 free,   885072 used,   618768 buff/cache
KiB Swap: 16515068 total, 16515068 free,        0 used. 31469072 avail Mem 

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND                                                                       
12944 openvsw+  10 -10 3390808 138164  17080 R 132.9  0.4   0:59.22 ovs-vswitchd                                                                  
 2388 root      20   0  559508  22424   6704 R  14.0  0.1   0:15.53 NetworkManager                                                                
12862 openvsw+  10 -10   71432   5164   1736 S   3.3  0.0   0:01.15 ovsdb-server                                                                  
  517 root      20   0       0      0      0 S   1.7  0.0   0:00.72 kworker/35:1                                                                  
11632 root      20   0  115844   2464   1656 S   1.3  0.0   0:03.92 bash                                                                          
  157 root      20   0       0      0      0 S   0.7  0.0   0:00.06 kworker/29:0                                                                  
  318 root      20   0       0      0      0 S   0.7  0.0   0:00.90 kworker/32:1                                                                  
  321 root      20   0       0      0      0 S   0.7  0.0   0:00.18 kworker/27:1                                                                  
  516 root      20   0       0      0      0 S   0.7  0.0   0:01.02 kworker/33:1                                                                  
 1433 root      20   0   47664  12044  11664 R   0.7  0.0   0:01.39 systemd-journal                                                               
 2369 dbus      20   0   66596   2656   1876 R   0.7  0.0   0:01.59 dbus-daemon                                                                   
 7239 root      20   0   49024   3888    488 R   0.7  0.0   0:00.21 systemd-udevd    


[root@dell-per730-42 openvswitch]# cat ovs-vswitchd.log | grep CPU|tail -20
2019-07-18T13:33:37.542Z|00790|poll_loop|INFO|wakeup due to [POLLIN] on fd 14 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (91% CPU usage)
2019-07-18T13:33:37.542Z|00791|poll_loop|INFO|wakeup due to [POLLIN] on fd 15 (<->/var/run/openvswitch/db.sock) at ../lib/stream-fd.c:157 (91% CPU usage)
2019-07-18T13:33:37.542Z|00792|poll_loop|INFO|wakeup due to [POLLIN] on fd 17 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (91% CPU usage)
2019-07-18T13:33:43.261Z|00833|poll_loop|INFO|wakeup due to [POLLIN] on fd 14 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (99% CPU usage)
2019-07-18T13:33:49.316Z|00877|poll_loop|INFO|wakeup due to [POLLIN] on fd 14 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (100% CPU usage)
2019-07-18T13:33:55.142Z|00920|poll_loop|INFO|wakeup due to [POLLIN] on fd 14 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (98% CPU usage)
2019-07-18T13:34:22.595Z|00943|poll_loop|INFO|wakeup due to [POLLIN] on fd 14 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (74% CPU usage)
2019-07-18T13:34:22.595Z|00944|poll_loop|INFO|wakeup due to [POLLIN] on fd 15 (<->/var/run/openvswitch/db.sock) at ../lib/stream-fd.c:157 (74% CPU usage)
2019-07-18T13:34:22.722Z|00946|poll_loop|INFO|wakeup due to [POLLIN] on fd 14 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (74% CPU usage)
2019-07-18T13:34:22.722Z|00947|poll_loop|INFO|wakeup due to [POLLIN] on fd 15 (<->/var/run/openvswitch/db.sock) at ../lib/stream-fd.c:157 (74% CPU usage)
2019-07-18T13:34:25.271Z|00966|poll_loop|INFO|wakeup due to [POLLIN] on fd 14 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (74% CPU usage)
2019-07-18T13:34:31.474Z|01005|poll_loop|INFO|wakeup due to [POLLIN] on fd 16 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (99% CPU usage)
2019-07-18T13:36:23.283Z|01007|poll_loop|INFO|wakeup due to [POLLIN] on fd 22 (FIFO pipe:[72262]) at ../vswitchd/bridge.c:384 (87% CPU usage)
2019-07-18T13:36:23.783Z|01008|poll_loop|INFO|wakeup due to [POLLIN] on fd 22 (FIFO pipe:[72262]) at ../vswitchd/bridge.c:384 (87% CPU usage)
2019-07-18T13:36:24.283Z|01009|poll_loop|INFO|wakeup due to [POLLIN] on fd 22 (FIFO pipe:[72262]) at ../vswitchd/bridge.c:384 (87% CPU usage)
2019-07-18T13:36:24.503Z|01010|poll_loop|INFO|wakeup due to 220-ms timeout at ../vswitchd/bridge.c:2828 (87% CPU usage)
2019-07-18T13:36:24.783Z|01011|poll_loop|INFO|wakeup due to [POLLIN] on fd 22 (FIFO pipe:[72262]) at ../vswitchd/bridge.c:384 (87% CPU usage)
2019-07-18T13:36:25.283Z|01012|poll_loop|INFO|wakeup due to [POLLIN] on fd 22 (FIFO pipe:[72262]) at ../vswitchd/bridge.c:384 (87% CPU usage)
2019-07-18T13:36:25.784Z|01013|poll_loop|INFO|wakeup due to [POLLIN] on fd 22 (FIFO pipe:[72262]) at ../vswitchd/bridge.c:384 (87% CPU usage)
2019-07-18T13:36:59.802Z|01195|poll_loop|INFO|wakeup due to [POLLIN] on fd 14 (NETLINK_ROUTE<->NETLINK_ROUTE) at ../lib/netlink-socket.c:1401 (101% CPU usage)
[root@dell-per730-42 openvswitch]# 



script:
vlanid=0
for ((i=1; i<=20; i++)); do
        for ((j=1; j<=200; j++)); do
                ((vlanid+=1))
                ip link add name veth$vlanid type veth peer name peer$vlanid
                ip addr add 80.$i.$j.2/24 dev peer$vlanid
                ip link set veth$vlanid up
                ip link set peer$vlanid up

                # Zero-pad single-digit hex values so each MAC octet
                # stays two characters wide.
                M=""; N=""
                if ((i < 16)); then M=0; fi
                if ((j < 16)); then N=0; fi
                X=$(printf %x $i)
                Y=$(printf %x $j)
                mac_peer_hv1=00:00:00:22:$M$X:$N$Y
                mac_peer_hv0=00:00:00:11:$M$X:$N$Y

                ip link set peer$vlanid address $mac_peer_hv0
                ovs-vsctl add-port br-int veth$vlanid
                ovs-vsctl set interface veth$vlanid external_ids:iface-id=lsprmt$vlanid
        done
done


Actual results:


Expected results:


Additional info:
This issue can also be reproduced by constantly adding and removing a veth port from the OVS bridge.

Comment 1 Matteo Croce 2019-07-18 19:22:39 UTC
Hi,

is this really a bug? What should we expect during veth insertion?

Comment 2 haidong li 2019-07-19 02:13:58 UTC
I don't think veth insertion should take 100% CPU usage; it's just an add-port or del-port command against OVS.

Comment 3 Flavio Leitner 2019-07-29 17:46:49 UTC
Adding or removing a port on the switch forces it to rebuild itself, including its internal caches, and to write changes to the database.
I don't know the goal of this test, but if the goal is simply to add many interfaces to the bridge, then add them in batches in an atomic transaction.
E.g.:

   ovs-vsctl add-port br-int veth$vlanid -- \
             add-port br-int veth$((vlanid+1)) -- \
             add-port br-int veth$((vlanid+2)) -- \
             add-port br-int veth$((vlanid+3)) -- \
             add-port br-int veth$((vlanid+4)) -- \
             add-port br-int veth$((vlanid+5)) -- \
             add-port br-int veth$((vlanid+6)) -- \
             add-port br-int veth$((vlanid+7)) -- \
             add-port br-int veth$((vlanid+8)) -- \
             add-port br-int veth$((vlanid+9))

You could try adding 10 interfaces at once, or maybe 50, and see if it helps.
There will still be a spike in CPU usage, of course, but for a significantly shorter period of time.
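The batching suggestion above can be sketched as a loop that builds one ovs-vsctl invocation with `--` separators instead of calling ovs-vsctl once per port. This is a dry run that only prints the assembled command (the `batch_add_ports` helper name is mine, not part of OVS); remove the final `echo` indirection to execute it for real:

```shell
#!/bin/sh
# Build a single atomic ovs-vsctl transaction that adds $3 ports
# (veth$2, veth$(($2+1)), ...) to bridge $1 in one database commit.
batch_add_ports() {
    bridge=$1; start=$2; count=$3
    cmd="ovs-vsctl"
    i=0
    while [ "$i" -lt "$count" ]; do
        n=$((start + i))
        # "--" chains sub-commands into one transaction.
        if [ "$i" -gt 0 ]; then
            cmd="$cmd --"
        fi
        cmd="$cmd add-port $bridge veth$n"
        i=$((i + 1))
    done
    echo "$cmd"
}

batch_add_ports br-int 1 3
```

With a batch size of 3 this prints `ovs-vsctl add-port br-int veth1 -- add-port br-int veth2 -- add-port br-int veth3`, i.e. one vswitchd reconfiguration instead of three.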

HTH
fbl

Comment 4 Flavio Leitner 2019-08-21 17:41:20 UTC
Hi,

It has been almost a month without updates, so I am going to close this.
Feel free to re-open this bug if you need anything, and I will be happy to continue helping.
Thanks!
fbl