Bug 1589031

Summary: The mac table size of neutron bridges (br-tun, br-int, br-*) is too small by default and eventually makes openvswitch explode
Product: Red Hat OpenStack
Reporter: Miguel Angel Ajo <majopela>
Component: openstack-neutron
Assignee: Slawek Kaplonski <skaplons>
Status: CLOSED ERRATA
QA Contact: Toni Freger <tfreger>
Severity: urgent
Priority: urgent
Version: 10.0 (Newton)
CC: amcleod, amuller, apevec, atelang, bhaley, bjarolim, chrisw, cshastri, dvd, echaudro, erkki.peura, feimingyun, gkumar, jamsmith, jbenc, jlibosva, jraju, kiyyappa, majopela, mcroce, mori, nyechiel, oblaut, ovs-team, pabeni, pablo.iranzo, pmorey, ragiman, rcernin, rhos-maint, rkhan, sbandyop, srevivo, vkommadi, wlehman
Target Milestone: z9
Keywords: Triaged, ZStream
Target Release: 10.0 (Newton)
Hardware: Unspecified
OS: Unspecified
Fixed In Version: openstack-neutron-9.4.1-26.el7ost
Doc Type: Release Note
Doc Text:
With this update, the neutron OVS agent has a new configuration option `bridge_mac_table_size`. This value controls the maximum number of MAC addresses that can be learned on a bridge. The default value for this new option is 50,000, which should be enough for most systems. Values outside a reasonable range (10 to 1,000,000) might be overridden by Open vSwitch.
Clone Of: 1558336
Last Closed: 2018-09-17 16:52:32 UTC
Type: Bug
Bug Blocks: 1591204, 1591206

Description Miguel Angel Ajo 2018-06-08 09:16:12 UTC
We need to increase the default Open vSwitch MAC table size (2048) to a value that copes better with busy environments.

ovs-vsctl set bridge <bridge> other-config:mac-table-size=50000

+++ This bug was initially created as a clone of Bug #1558336 +++

Description of problem:

The CPU utilization of ovs-vswitchd is high even though DPDK is not enabled:

 PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
1512 root      10 -10 4352840 793864  12008 R  1101  0.3  15810:26 ovs-vswitchd

At the same time, we observed failures to send packets (ICMP) over the VXLAN tunnel. We think this might be related to the high CPU usage.

--- Additional comment from Jiri Benc on 2018-05-31 14:03:36 EDT ---

I managed to reproduce and analyze this.

First, the reproduction steps. It's actually surprisingly simple once you explore all the blind alleys.

Create an ovs bridge:

------
ovs-vsctl add-br ovs0
ip l s ovs0 up
------

Save this to a file named "reproducer.py":

------
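#!/usr/bin/python
import sys

from scapy.all import *

# Generate N random (MAC, IP) pairs once; N is the first command-line argument.
data = [(str(RandMAC()), str(RandIP())) for i in range(int(sys.argv[1]))]

# Layer 2 socket bound to the bridge's internal port.
s = conf.L2socket(iface="ovs0")
# Resend the same N packets in an endless loop.
while True:
    for mac, ip in data:
        p = Ether(src=mac, dst=mac)/IP(src=ip, dst=ip)
        s.send(p)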
------

Run the reproducer:

./reproducer.py 5000

--- Additional comment from Jiri Benc on 2018-05-31 14:26:26 EDT ---

The problem is how flow revalidation works in ovs. Several 'revalidator' threads are launched. They should normally sleep (apart from waking every 0.5 seconds to do nothing) and wake when anything of interest happens (udpif_revalidator => poll_block). On every wakeup, each revalidator thread checks whether flow revalidation is needed and, if it is, performs the revalidation.

Revalidation is very costly with a high number of flows. I also suspect there's a lot of contention between the revalidator threads.

The flow revalidation is triggered by many things. What is of interest for us is that any eviction of a MAC learning table entry triggers revalidation.

The reproducer script repeatedly sends the same 5000 packets, each with a different MAC address. This causes constant overflows of the MAC learning table and therefore constant revalidation. The revalidator threads are woken up immediately and busy-loop on the revalidation.
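
To make the overflow pattern concrete, here is a minimal sketch that counts evictions in a bounded learning table when the same set of addresses is replayed in a loop. It is only a model: it assumes a plain least-recently-used eviction policy, which is a simplification and not necessarily what ovs does internally, and the function name `count_evictions` is made up for this illustration. Integers stand in for MAC addresses.

```python
from collections import OrderedDict


def count_evictions(table_size, distinct_macs, passes):
    """Count evictions in a simplified, LRU-bounded MAC learning table.

    Assumption: plain LRU eviction; the real ovs policy may differ.
    """
    table = OrderedDict()  # keys ordered from oldest to most recently seen
    evictions = 0
    for _ in range(passes):
        for mac in range(distinct_macs):  # integers stand in for MAC addresses
            if mac in table:
                table.move_to_end(mac)  # refresh an already learned entry
                continue
            if len(table) >= table_size:
                table.popitem(last=False)  # drop the least recently used entry
                evictions += 1
            table[mac] = None
    return evictions


if __name__ == "__main__":
    for size in (2048, 50000):
        print(size, count_evictions(size, distinct_macs=5000, passes=3))
```

In this model, a table of 2048 entries fed with 5000 cycling addresses evicts on every packet once it is full, while a table of 50000 entries never evicts. Since each eviction triggers revalidation, the first case keeps the revalidator threads permanently busy.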

This is exactly the pattern in the customers' data: there are 16000+ flows, and the packet capture shows the packets repeating every second.

A quick fix is to increase the MAC learning table size:

ovs-vsctl set bridge <bridge> other-config:mac-table-size=50000

This should lower the CPU usage substantially; allow a few seconds for things to settle down.
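
For hosts with several bridges (br-int, br-tun, br-*), the same workaround could be scripted roughly as follows. This is a sketch under assumptions, not a supported tool: the helper names (`build_set_command`, `list_bridges`, `apply_to_all`) are invented here, it relies on `ovs-vsctl list-br` to enumerate bridges, and it must run with enough privileges to modify the OVS database.

```python
import subprocess


def build_set_command(bridge, size=50000):
    """Return the ovs-vsctl argv that sets the MAC table size on one bridge."""
    if size <= 0:
        raise ValueError("size must be positive")
    return ["ovs-vsctl", "set", "bridge", bridge,
            "other-config:mac-table-size=%d" % size]


def list_bridges():
    """Return the bridge names reported by `ovs-vsctl list-br`."""
    out = subprocess.check_output(["ovs-vsctl", "list-br"])
    return out.decode().split()


def apply_to_all(size=50000):
    """Apply the workaround to every bridge; raises if ovs-vsctl fails."""
    for bridge in list_bridges():
        subprocess.check_call(build_set_command(bridge, size))
```

Calling `apply_to_all()` on an affected node would set the value on every bridge. Whether a manually set value is kept may depend on whether the neutron agent manages the same key, so treat this as a stopgap until the fixed package is installed.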

Comment 2 James Smith 2018-07-09 23:13:29 UTC
Copied doc text from bug 1591206.

Comment 12 errata-xmlrpc 2018-09-17 16:52:32 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2715