Bug 1473654 - vswitchd Sharp CPU utilization during Neutron create-router tests [NEEDINFO]
Status: NEW
Product: Red Hat OpenStack
Classification: Red Hat
Component: openvswitch
Version: 12.0 (Pike)
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Assigned To: Lance Richardson
QA Contact: Ofer Blaut
Keywords: AutomationBlocker
Reported: 2017-07-21 07:54 EDT by Joe Talerico
Modified: 2017-08-23 08:07 EDT (History)
CC: 11 users

Type: Bug
amuller: needinfo? (fleitner)


Attachments
perf record of vswitchd (18.58 MB, application/octet-stream)
2017-07-21 07:55 EDT, Joe Talerico

Description Joe Talerico 2017-07-21 07:54:26 EDT
Description of problem:
ovs-vswitchd CPU utilization jumps 50x during our normal Rally API tests (control plane - no data plane).[1]

This can begin to break down the cloud - I have seen services such as RabbitMQ lose their heartbeats, become partitioned, and the rest is history.


[1] http://norton.perf.lab.eng.rdu.redhat.com:3000/dashboard/db/openstack-general-system-performance?from=1500545668722&to=1500547429680&orgId=1&var-Cloud=openstack-12-ha-noncontainer&var-NodeType=*&var-Node=overcloud-controller-0&var-Interface=interface-br-ex&var-Disk=disk-sda&var-OSD=All&var-Pool=All&var-cpus0=All&var-cpus00=All&panelId=78&fullscreen
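
(For reference, the per-process spike can also be watched directly on the controller; a minimal sketch using pidstat from the sysstat package, with an arbitrary 5-second interval:

# report ovs-vswitchd CPU usage every 5 seconds
pidstat -u -p $(pidof ovs-vswitchd) 5
)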

Version-Release number of selected component (if applicable):
[root@overcloud-controller-0 heat-admin]# rpm -qa | grep openvswitch
openvswitch-ovn-central-2.7.0-8.git20170530.el7fdb.x86_64
openvswitch-2.7.0-8.git20170530.el7fdb.x86_64
openvswitch-ovn-host-2.7.0-8.git20170530.el7fdb.x86_64
openvswitch-ovn-common-2.7.0-8.git20170530.el7fdb.x86_64
openstack-neutron-openvswitch-11.0.0-0.20170628060509.9a72bfe.el7ost.noarch
python-openvswitch-2.7.0-8.git20170530.el7fdb.noarch
[root@overcloud-controller-0 heat-admin]# 


How reproducible:
100%

Steps to Reproduce:
1. With a 3x controller HA deployment, execute the Neutron Rally create-list-routers scenario (a minimal task sketch is shown below)
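
For reference, a minimal sketch of such a Rally task; the scenario name is Rally's stock NeutronNetworks.create_and_list_routers, but the file path, runner, and context values below are illustrative assumptions, not the exact CI task:

cat > /tmp/create-list-routers.yaml <<'EOF'
NeutronNetworks.create_and_list_routers:
  - args:
      network_create_args: {}
      subnet_create_args: {}
      subnet_cidr_start: "10.2.0.0/24"
      subnets_per_network: 1
      router_create_args: {}
    runner:
      type: constant
      times: 500
      concurrency: 32
    context:
      users:
        tenants: 2
        users_per_tenant: 2
EOF
rally task start /tmp/create-list-routers.yaml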

Actual results:
See the Grafana dashboard link [1] in the description.

Expected results:
ovs-vswitchd CPU utilization does not jump 50x.


Additional info:
From vswitchd log (tail):
2017-07-21T11:51:14.589Z|2164542|bridge|INFO|bridge br-int: deleted interface ha-4f9e2bce-e1 on port 1008
2017-07-21T11:51:14.615Z|2164543|bridge|INFO|bridge br-int: added interface ha-7bba4721-43 on port 995
2017-07-21T11:51:14.639Z|2164544|bridge|INFO|bridge br-int: added interface ha-4f9e2bce-e1 on port 1008
2017-07-21T11:51:14.713Z|2164545|bridge|INFO|bridge br-int: deleted interface qr-504e5f81-9a on port 1008
2017-07-21T11:51:14.715Z|2164546|bridge|INFO|bridge br-int: deleted interface ha-8ae9c391-16 on port 995
2017-07-21T11:51:14.743Z|2164547|bridge|INFO|bridge br-int: added interface qr-504e5f81-9a on port 1008
2017-07-21T11:51:14.763Z|2164548|bridge|INFO|bridge br-int: added interface ha-8ae9c391-16 on port 995
2017-07-21T11:51:14.833Z|2164549|bridge|INFO|bridge br-int: deleted interface ha-7bba4721-43 on port 995
2017-07-21T11:51:14.835Z|2164550|bridge|INFO|bridge br-int: deleted interface ha-4f9e2bce-e1 on port 1008
2017-07-21T11:51:14.858Z|2164551|bridge|INFO|bridge br-int: added interface ha-7bba4721-43 on port 995
2017-07-21T11:51:14.885Z|2164552|bridge|INFO|bridge br-int: added interface ha-4f9e2bce-e1 on port 1008
2017-07-21T11:51:14.961Z|2164553|bridge|INFO|bridge br-int: deleted interface qr-504e5f81-9a on port 1008
2017-07-21T11:51:14.963Z|2164554|bridge|INFO|bridge br-int: deleted interface ha-8ae9c391-16 on port 995
2017-07-21T11:51:14.987Z|2164555|bridge|INFO|bridge br-int: added interface qr-504e5f81-9a on port 1008
2017-07-21T11:51:15.012Z|2164556|bridge|INFO|bridge br-int: added interface ha-8ae9c391-16 on port 995
2017-07-21T11:51:15.081Z|2164557|bridge|INFO|bridge br-int: deleted interface ha-7bba4721-43 on port 995
2017-07-21T11:51:15.083Z|2164558|bridge|INFO|bridge br-int: deleted interface ha-4f9e2bce-e1 on port 1008
2017-07-21T11:51:15.108Z|2164559|bridge|INFO|bridge br-int: added interface ha-7bba4721-43 on port 995

From /var/log/messages : (tail)
Jul 21 07:50:16 localhost NetworkManager[1264]: <info>  [1500637816.7172] manager: (qr-504e5f81-9a): new Generic device (/org/freedesktop/NetworkManager/Devices/1027729)
Jul 21 07:50:16 localhost kernel: device ha-8ae9c391-16 entered promiscuous mode
Jul 21 07:50:16 localhost NetworkManager[1264]: <info>  [1500637816.7434] manager: (ha-8ae9c391-16): new Generic device (/org/freedesktop/NetworkManager/Devices/1027730)
Jul 21 07:50:16 localhost kernel: device ha-7bba4721-43 left promiscuous mode
Jul 21 07:50:16 localhost kernel: device ha-7bba4721-43 entered promiscuous mode
Jul 21 07:50:16 localhost kernel: device ha-8ae9c391-16 left promiscuous mode
Jul 21 07:50:16 localhost NetworkManager[1264]: <info>  [1500637816.8710] manager: (ha-7bba4721-43): new Generic device (/org/freedesktop/NetworkManager/Devices/1027731)
Jul 21 07:50:16 localhost kernel: device ha-4f9e2bce-e1 entered promiscuous mode
Jul 21 07:50:16 localhost NetworkManager[1264]: <info>  [1500637816.8896] manager: (ha-4f9e2bce-e1): new Generic device (/org/freedesktop/NetworkManager/Devices/1027732)
Jul 21 07:50:16 localhost kernel: device qr-504e5f81-9a left promiscuous mode
Jul 21 07:50:16 localhost kernel: device qr-504e5f81-9a entered promiscuous mode
Jul 21 07:50:16 localhost kernel: device ha-4f9e2bce-e1 left promiscuous mode
Jul 21 07:50:16 localhost NetworkManager[1264]: <info>  [1500637816.9925] manager: (qr-504e5f81-9a): new Generic device (/org/freedesktop/NetworkManager/Devices/1027733)
Jul 21 07:50:17 localhost kernel: device ha-8ae9c391-16 entered promiscuous mode
Jul 21 07:50:17 localhost kernel: device ha-7bba4721-43 left promiscuous mode
Jul 21 07:50:17 localhost NetworkManager[1264]: <info>  [1500637817.0129] manager: (ha-8ae9c391-16): new Generic device (/org/freedesktop/NetworkManager/Devices/1027734)
Jul 21 07:50:17 localhost kernel: device ha-7bba4721-43 entered promiscuous mode
Jul 21 07:50:17 localhost NetworkManager[1264]: <info>  [1500637817.1254] manager: (ha-7bba4721-43): new Generic device (/org/freedesktop/NetworkManager/Devices/1027735)
Jul 21 07:50:17 localhost kernel: device ha-8ae9c391-16 left promiscuous mode
Jul 21 07:50:17 localhost kernel: device ha-4f9e2bce-e1 entered promiscuous mode
Jul 21 07:50:17 localhost kernel: device qr-504e5f81-9a left promiscuous mode
Jul 21 07:50:17 localhost NetworkManager[1264]: <info>  [1500637817.1502] manager: (ha-4f9e2bce-e1): new Generic device (/org/freedesktop/NetworkManager/Devices/1027736)


perf report of ovs-vswitchd:

  17.11%  ovs-vswitchd    [kernel.kallsyms]   [k] snmp_fold_field                                                                                                                             
  11.09%  ovs-vswitchd    [kernel.kallsyms]   [k] internal_get_stats                                                                                                                          
   8.08%  ovs-vswitchd    [kernel.kallsyms]   [k] mutex_spin_on_owner                                                                                                                         
   4.17%  ovs-vswitchd    [kernel.kallsyms]   [k] find_next_bit                                                                                                                               
   3.16%  ovs-vswitchd    libc-2.17.so        [.] _int_malloc                                                                                                                                 
   2.25%  ovs-vswitchd    libc-2.17.so        [.] _int_free                                                                                                                                   
   1.39%  ovs-vswitchd    [kernel.kallsyms]   [k] __memcpy                                                                                                                                    
   1.35%  ovs-vswitchd    libc-2.17.so        [.] malloc                                                                                                                                      
   1.15%  ovs-vswitchd    [kernel.kallsyms]   [k] osq_lock                                                                                                                                    
   1.09%  ovs-vswitchd    libc-2.17.so        [.] __strlen_sse2_pminub                                                                                                                        
   0.98%  ovs-vswitchd    [kernel.kallsyms]   [k] strlen                                                                                                                                      
   0.93%  ovs-vswitchd    [kernel.kallsyms]   [k] netlink_broadcast_filtered                                                                                                                  
   0.92%  ovs-vswitchd    libc-2.17.so        [.] __libc_calloc                                                                                                                               
   0.90%  ovs-vswitchd    [kernel.kallsyms]   [k] strcpy                                                                                                                                      
   0.74%  ovs-vswitchd    libc-2.17.so        [.] malloc_consolidate                                                                                                                          
   0.68%  ovs-vswitchd    libpthread-2.17.so  [.] pthread_mutex_lock                                                                                                                          
   0.65%  ovs-vswitchd    libpthread-2.17.so  [.] pthread_mutex_unlock                                                                                                                        
   0.64%  ovs-vswitchd    ovs-vswitchd        [.] 0x000000000027f3e7                                                                                                                          
   0.59%  ovs-vswitchd    [kernel.kallsyms]   [k] _raw_spin_lock_irqsave                                                                                                                      
   0.56%  ovs-vswitchd    libc-2.17.so        [.] __strcmp_sse42                                                                                                                              
   0.54%  ovs-vswitchd    [kernel.kallsyms]   [k] kobject_uevent_env                                                                                                                          
   0.52%  ovs-vswitchd    ovs-vswitchd        [.] 0x00000000002c58d3                                                                                                                          
   0.52%  ovs-vswitchd    [kernel.kallsyms]   [k] __memset                                                                                                                                    
   0.50%  ovs-vswitchd    [kernel.kallsyms]   [k] dst_ifdown      
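
(The attached perf data was recorded against the running ovs-vswitchd; an equivalent capture would look roughly like the following, with an arbitrary 60-second window:

# sample ovs-vswitchd with call graphs for 60 seconds, then summarize
perf record -g -p $(pidof ovs-vswitchd) -o vswitchd.perf.data -- sleep 60
perf report -i vswitchd.perf.data --stdio | head -40
)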

Reviewing the controllers, Rally might not be cleaning up; however, the leftovers appear to be qdhcp namespaces (not router namespaces).

[root@overcloud-controller-0 heat-admin]# ip netns | grep qdhcp | wc -l
795
Comment 1 Joe Talerico 2017-07-21 07:55 EDT
Created attachment 1302316 [details]
perf record of vswitchd
Comment 2 Joe Talerico 2017-07-24 11:34:00 EDT
Neutron creates an HA port per router (see Assaf's blog post [1]) to carry keepalived traffic.

While attempting to diagnose why ovs-vswitchd had the sharp CPU utilization, I began deleting resources off of the switch. Deleting tap and qr devices had zero impact on the utilization. After deleting all of the HA ports, I saw [2]: CPU utilization returned to normal. (A rough sketch of that cleanup is at the end of this comment.)

[1] https://assafmuller.com/2014/08/16/layer-3-high-availability/

[2] https://snapshot.raintank.io/dashboard/snapshot/M8NiZW2ooIfqyUnAP7Bo3IpgaT81xtQW?panelId=72&fullscreen&orgId=2
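
For the record, a rough sketch of the HA-port cleanup; the exact commands used were not captured in this bug, and the "ha-" port-name prefix on br-int is assumed from the log output above:

# remove every ha-* port from br-int (tap/qr devices were removed the same way beforehand)
for port in $(ovs-vsctl list-ports br-int | grep '^ha-'); do
    ovs-vsctl del-port br-int "$port"
done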
Comment 3 Joe Talerico 2017-07-25 13:00:22 EDT
Packet capture on one of the HA ports

[root@overcloud-controller-1 neutron]# ip l s up ha-0cc675d6-46
[root@overcloud-controller-1 neutron]# tcpdump -vvv -i ha-0cc675d6-46                                                                       
tcpdump: listening on ha-0cc675d6-46, link-type EN10MB (Ethernet), capture size 262144 bytes
16:56:25.692546 IP6 (hlim 1, next-header Options (0) payload length: 36) overcloud-controller-1.perf.lab.eng.rdu.redhat.com > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff02::1:ff1f:6769 to_ex { }]
16:56:26.314133 IP (tos 0xc0, ttl 255, id 8109, offset 0, flags [none], proto VRRP (112), length 40)
    169.254.192.10 > 224.0.0.18: vrrp 169.254.192.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype none, intvl 2s, length 20, addrs: 169.254.0.1
16:56:28.315246 IP (tos 0xc0, ttl 255, id 8110, offset 0, flags [none], proto VRRP (112), length 40)
    169.254.192.10 > 224.0.0.18: vrrp 169.254.192.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype none, intvl 2s, length 20, addrs: 169.254.0.1
16:56:30.316924 IP (tos 0xc0, ttl 255, id 8111, offset 0, flags [none], proto VRRP (112), length 40)
    169.254.192.10 > 224.0.0.18: vrrp 169.254.192.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype none, intvl 2s, length 20, addrs: 169.254.0.1
16:56:32.318545 IP (tos 0xc0, ttl 255, id 8112, offset 0, flags [none], proto VRRP (112), length 40)
    169.254.192.10 > 224.0.0.18: vrrp 169.254.192.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype none, intvl 2s, length 20, addrs: 169.254.0.1
16:56:34.320108 IP (tos 0xc0, ttl 255, id 8113, offset 0, flags [none], proto VRRP (112), length 40)
    169.254.192.10 > 224.0.0.18: vrrp 169.254.192.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype none, intvl 2s, length 20, addrs: 169.254.0.1
16:56:36.321738 IP (tos 0xc0, ttl 255, id 8114, offset 0, flags [none], proto VRRP (112), length 40)
    169.254.192.10 > 224.0.0.18: vrrp 169.254.192.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype none, intvl 2s, length 20, addrs: 169.254.0.1
16:56:38.323299 IP (tos 0xc0, ttl 255, id 8115, offset 0, flags [none], proto VRRP (112), length 40)
    169.254.192.10 > 224.0.0.18: vrrp 169.254.192.10 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 50, authtype none, intvl 2s, length 20, addrs: 169.254.0.1
^C
8 packets captured
8 packets received by filter
0 packets dropped by kernel
Comment 4 Flavio Leitner 2017-07-26 10:27:23 EDT
Hi,

We need a sosreport from the reproducing system to see what is going on.
I am particularly interested in the output of 'ovsdb-tool -m show-log' to review the transaction history.
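
(For anyone collecting this, a minimal sketch of the requested data, assuming the default conf.db path on a RHEL-based install:

# OVSDB transaction history (the -m flag adds per-record detail)
ovsdb-tool -m show-log /etc/openvswitch/conf.db > /tmp/ovsdb-show-log.txt
# and a full sosreport for the rest of the system state
sosreport
)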

There have been some bug reports related to adding ports to an OVS bridge, and this sounds like just another consequence of the same bug. It's fixed in 2.7.2, so could you please check whether that resolves the issue?

http://download-node-02.eng.bos.redhat.com/brewroot/packages/openvswitch/2.7.2/1.git20170719.el7fdp/

Please capture the requested data before updating OVS because it might still be useful even if 2.7.2 doesn't show the issue.

Thanks,
fbl
Comment 5 Joe Talerico 2017-08-03 07:27:20 EDT
fbl - ovsdb-tool output : https://gist.github.com/jtaleric/28d908f289f073c102337d02546f1c8b

I will update to 2.7 and re-run today.
Comment 6 Flavio Leitner 2017-08-03 08:58:49 EDT
Hi,

If the output provided in comment#5 was from after or during reproducing the issue, then I think we can rule out external triggers, like an agent adding and removing ports in a loop.

Please try with our latest 2.7 to see if the fix in upstream helps with this issue as well.

fbl
Comment 7 Joe Talerico 2017-08-03 10:57:51 EDT
(In reply to Flavio Leitner from comment #6)
> Hi,
> 
> If the output provided in comment#5 was from after or during reproducing the
> issue, then I think we can rule out external triggers, like an agent adding
> and removing ports in a loop.
>

Yes - was after reproducing the bug.
 
> Please try with our latest 2.7 to see if the fix in upstream helps with this
> issue as well.
> 
> fbl

Yup. This was my next step.
Comment 8 Joe Talerico 2017-08-04 16:28:02 EDT
(In reply to Flavio Leitner from comment #6)
> Hi,
> 
> If the output provided in comment#5 was from after or during reproducing the
> issue, then I think we can rule out external triggers, like an agent adding
> and removing ports in a loop.
> 
> Please try with our latest 2.7 to see if the fix in upstream helps with this
> issue as well.
> 
> fbl

Updating to OVS 2.7.2 did not help. 

Disabling l3_ha (setting it to false across the controllers) seems to have helped.
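
(For reference, a rough sketch of that change on a non-containerized controller; the exact mechanism used here was not recorded, and crudini is just one way to edit the option:

# on each controller: disable HA routers and restart neutron-server
crudini --set /etc/neutron/neutron.conf DEFAULT l3_ha False
systemctl restart neutron-server
# note: this only affects routers created afterwards; existing HA routers keep their HA ports
)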

1830-~1930: l3_ha=true
2015-~2027: l3_ha=false

https://snapshot.raintank.io/dashboard/snapshot/btt0hoOW1rjKOSteD0ze2BQFNceEXikK?refresh=10s&panelId=72&fullscreen&orgId=2
Comment 9 Assaf Muller 2017-08-23 08:07:43 EDT
As far as the data plane is concerned, the only difference between HA routers and legacy routers would be the VRRP traffic from keepalived.

@Flavio do you have the info that you need? This is impacting some OSP 12 CI jobs.
