Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1795308

Summary: [ovn] Reduce the number of connections to SB database from metadata agents
Product: Red Hat OpenStack
Reporter: Daniel Alvarez Sanchez <dalvarez>
Component: python-networking-ovn
Assignee: Lucas Alvares Gomes <lmartins>
Status: CLOSED CURRENTRELEASE
QA Contact: Eran Kuris <ekuris>
Severity: high
Docs Contact:
Priority: high
Version: 16.0 (Train)
CC: apevec, jlibosva, lhh, lmartins, majopela, scohen
Target Milestone: ---
Keywords: Triaged
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2021-02-11 09:28:31 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Bug Depends On:
Bug Blocks: 1797685

Description Daniel Alvarez Sanchez 2020-01-27 16:32:38 UTC
Right now, the number of metadata workers comes from the os_workers fact [0], which is calculated here [1] and capped at a maximum of 12.

Each of these workers opens a connection to the OVN SouthBound database, which, on large systems, can put unnecessary load on the central ovsdb-server.

In ML2/OVN, the metadata agent is deployed on every compute node, whereas in ML2/OVS it runs only on controllers/networkers. The number of workers per agent could therefore ideally be reduced while offering the same (or possibly better) level of scalability.
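To illustrate why the per-compute-node deployment matters, a back-of-the-envelope calculation (the fleet size of 500 compute nodes is an assumed figure, not from this bug; the cap of 12 workers is the os_workers cap described above):

```python
# Illustrative arithmetic with an assumed fleet size.
compute_nodes = 500
workers_per_agent = 12          # os_workers cap described above

# Today: every metadata worker on every compute node opens its own
# SB connection.
sb_connections_now = compute_nodes * workers_per_agent

# With a single SB-facing worker per agent, the count drops to one
# connection per compute node.
sb_connections_single = compute_nodes * 1

print(sb_connections_now)      # 6000
print(sb_connections_single)   # 500
```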

In order to reduce the stress on ovsdb-server and offer the same level of scalability when it comes to serving concurrent metadata requests from instances, we need to reduce the number of workers.

We could investigate whether we can keep the workers that process metadata requests coming from haproxy over the UNIX domain socket while having a single OVNWorker that processes events from the SB server.
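A rough sketch of that split, using threads and a queue for brevity (all names here, such as SbConnection, are hypothetical stand-ins, not the networking-ovn API; the real agent would use the ovsdbapp IDL event loop and separate worker processes):

```python
import queue
import threading

class SbConnection:
    """Hypothetical stand-in for the single OVN SB OVSDB connection."""
    def events(self):
        # The real agent would block on the ovsdbapp IDL event loop;
        # here we just fake three port events.
        yield from ({"port": i} for i in range(3))

def ovn_worker(conn, out_q):
    # The single SB-facing worker: the only place a SB connection
    # is opened. It forwards events to the request workers.
    for event in conn.events():
        out_q.put(event)
    out_q.put(None)  # sentinel: no more events

def request_worker(in_q, handled):
    # Request-processing workers (serving haproxy over the UNIX
    # socket) consume shared state/events and never talk to the
    # SB database themselves.
    while True:
        event = in_q.get()
        if event is None:
            break
        handled.append(event)

q = queue.Queue()
handled = []
t_ovn = threading.Thread(target=ovn_worker, args=(SbConnection(), q))
t_req = threading.Thread(target=request_worker, args=(q, handled))
t_ovn.start(); t_req.start()
t_ovn.join(); t_req.join()
print(len(handled))  # 3
```

The point of the sketch is only the topology: N request workers, one SB connection, so the load on the central ovsdb-server no longer scales with the worker count.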

As a note, the metadata agent does not monitor many tables [2], so at scale it will not suffer too much, since tables like Logical_Flows are not monitored. However, if the number of ports grows, it can become costly for both the metadata agent and the central ovsdb-server. See [3].

[0] https://opendev.org/openstack/puppet-neutron/src/branch/stable/train/manifests/agents/ovn_metadata.pp#L48
[1] https://opendev.org/openstack/puppet-openstacklib/src/branch/stable/train/lib/facter/os_workers.rb#L37
[2] https://github.com/openstack/networking-ovn/blob/stable/train/networking_ovn/agent/metadata/ovsdb.py#L34-L35
[3] https://bugzilla.redhat.com/show_bug.cgi?id=1795295