Bug 1940346

Summary: [OVN-SCALE] OVN: Optimize trivial logical flow to OF rule translation.
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Dumitru Ceara <dceara>
Component: ovn2.13
Assignee: OVN Team <ovnteam>
Status: NEW
QA Contact: Jianlin Shi <jishi>
Severity: medium
Priority: medium
Version: FDP 20.H
CC: ctrautma, jishi, ralongi
Type: Bug
Attachments: OVN NB sample database.

Description Dumitru Ceara 2021-03-18 09:10:31 UTC
Created attachment 1764301
OVN NB sample database.

Description of problem:

In large-scale deployments, OVN topologies usually consist of a large number of logical switches and routers.

The logical flow pipelines for logical datapaths have a number of common flows (trivial flows) that are identical across all datapaths and usually just advance to the next table in the pipeline.

Currently, with logical_dp_groups enabled, all these flows are associated with a single datapath group that consists of all logical switches/routers.

However, when generating OF rules, ovn-controller expands a trivial logical flow applied to a datapath group of N datapaths into N individual OpenFlow rules, prepending a "metadata=<datapath-key>" match to each of them.
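
As a rough illustration of the expansion described above, the following Python sketch models it with plain data structures. It is not ovn-controller code: the function name expand_per_datapath, the (table, match, actions) tuple layout, the table number and the group size are all hypothetical. Only the "metadata=<datapath-key>" match is taken from the description.

```python
def expand_per_datapath(lflow_match, lflow_actions, table, datapath_keys):
    """Model of the current behaviour: one OF rule per datapath in the group."""
    rules = []
    for key in datapath_keys:
        # Prepend the logical datapath key match to every generated rule.
        match = f"metadata={key:#x}"
        if lflow_match != "1":  # "1" is the logical flow match that accepts every packet
            match += f",{lflow_match}"
        rules.append((table, match, lflow_actions))
    return rules


# Hypothetical datapath group with 1000 members and one trivial "next;" flow.
group = range(1, 1001)
rules = expand_per_datapath("1", "next;", 8, group)
print(len(rules))  # 1000
print(rules[0])    # (8, 'metadata=0x1', 'next;')
```

In this model the number of generated rules grows linearly with the size of the datapath group, even though every rule carries the same actions.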

For example, with the attached OVN NB database (from a scale test run), ovn-controller generates ~400K OpenFlow rules on a single hypervisor.

Changing ovn-controller to not match on the logical datapath key for trivial flows could reduce the number of OpenFlow rules to ~200K.

Potential implementation:
- Extend the Logical_DP_Group SB schema with a column that indicates whether the datapath group consists of all switches/routers in the topology.
- In the first stage of the switch/router pipeline, set a bit in the "flags" logical register to indicate whether the packet is being processed on a switch or a router pipeline.
- When processing logical flows applied to datapath groups, if the group includes all switches, generate a single OF rule instead of N OF rules (one for each datapath in the group). The single rule carries an additional match on the "flags" register bit that indicates that the packet is processed on the switch pipeline.
- Similarly, optimize OF rules for trivial logical flows applied to logical routers.
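
The proposed translation could be sketched as follows. This is a minimal Python model under stated assumptions, not the actual implementation: translate_trivial_flow, the covers_all parameter (standing in for the proposed new SB column), the flag bit values and the "flags=<value>/<mask>" match notation are all hypothetical, since the report does not define them.

```python
# Hypothetical bit values in the "flags" logical register; the report does not define them.
PIPELINE_FLAG = {"switch": 0x1, "router": 0x2}


def translate_trivial_flow(table, actions, datapath_keys, covers_all=None):
    """Return modelled OF rules as (table, match, actions) tuples.

    covers_all is "switch" or "router" when the datapath group is marked
    (via the proposed new SB column) as containing every datapath of that
    kind, and None otherwise.
    """
    if covers_all is not None:
        bit = PIPELINE_FLAG[covers_all]
        # One rule for the whole group: match only the pipeline-type bit.
        return [(table, f"flags={bit:#x}/{bit:#x}", actions)]
    # Fallback: one rule per datapath, matching on the datapath key.
    return [(table, f"metadata={key:#x}", actions) for key in datapath_keys]


switch_keys = list(range(1, 1001))
optimized = translate_trivial_flow(8, "next;", switch_keys, covers_all="switch")
fallback = translate_trivial_flow(8, "next;", switch_keys)
print(len(optimized), len(fallback))  # 1 1000
print(optimized[0])                   # (8, 'flags=0x1/0x1', 'next;')
```

The sketch assumes the first pipeline stage has already set the relevant "flags" bit, so a rule that matches only on that bit cannot be hit by packets from the other pipeline type.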

Initial scale tests show an improvement of ~8% in the time it takes to bring up an OVN node with an OpenShift-like deployment. This is explained by the fact that, with this change, ovn-controller spends constant time generating OF rules for trivial flows instead of time proportional to the number of datapaths.