Bug 2043116

Summary: [OVN SCALE][ovn-northd] Incrementally build logical datapaths.
Product: Red Hat Enterprise Linux Fast Datapath Reporter: Dumitru Ceara <dceara>
Component: OVNAssignee: Dumitru Ceara <dceara>
Status: CLOSED WONTFIX QA Contact: Jianlin Shi <jishi>
Severity: unspecified Docs Contact:
Priority: high    
Version: FDP 21.KCC: ctrautma, jiji, mmichels
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2023-07-17 09:20:43 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 2043119, 2043128    
Attachments:
Description Flags
Scale test NB database. none

Description Dumitru Ceara 2022-01-20 16:28:20 UTC
Created attachment 1852237 [details]
Scale test NB database.

Description of problem:

In ovn-northd, the build_datapaths() function reconciles the Southbound DB Datapath_Binding records and ensures that there's exactly one record for every Northbound DB Logical_Switch and Logical_Router.

In some deployments (e.g., ovn-kubernetes) the datapaths (logical switches/routers) don't get added/removed often so recomputing the whole set every time something unrelated changes in the NB/SB doesn't make much sense.

Instead, we could extract this into a separate I-P engine node that computes 3 sets of datapaths:
- deleted
- updated
- added

This I-P node would need to have (at least) 3 input nodes:
- NB.Logical_Switch
- NB.Logical_Router
- SB.Datapath_Binding

For example, with the attached NB database (from a scale test run), build_datapaths() takes ~150msec on the test machine, each and every time northd runs to process a change in the NB/SB.

Furthermore, knowing the exact set of datapaths that changed can be the first step in adding incremental processing for other components like load balancers.

Comment 1 Mark Michelson 2022-01-20 18:05:53 UTC
If I understand correctly, the idea here is to have an engine node whose sole job is to output the set of deleted, updated, and added datapaths. It's then outside the scope of the engine node to use those sets of datapaths to determine how to act.

I think this is an ideal use of the I-P engine. It allows for different contexts to use the data produced by the I-P node as they see fit. I also think it's a good idea to limit the scope of this particular issue to reconciliation of the southbound datapaths since it is a more manageable change than trying to incrementally process *everything* involving datapath changes.

Question: Would it make sense to have two I-P engine nodes instead of one?

Node 1 inputs:
- NB.Logical_Switch
- SB.Datapath_Binding

Node 1 outputs the sets of updated, deleted, and added logical switches.

Node 2 inputs:
- NB.Logical_Router
- SB.Datapath Binding

Node 2 outputs the sets of updated, deleted, and added logical routers.

This might make the related load balancer issues (2043119 and 2043128) easier to implement, since each would be able to take distinct inputs of the appropriate logical datapath type.

Comment 2 Dumitru Ceara 2023-07-17 09:20:43 UTC
Closing this as "won't fix".  Recent patches from Han Zhou added support for incrementally processing switch/router port changes without the need of datapaths incremental processing:
https://github.com/ovn-org/ovn/commit/b337750e45be2de370eebd7bc8f1ecd52cb72750
https://github.com/ovn-org/ovn/commit/3b120ccf7f7c4d15e1936ea7a18d1fc047879c99

At the same time Numan is adding I-P for load balancers in a separate patch set:
https://patchwork.ozlabs.org/project/ovn/list/?series=362800&state=*

If we ever decide that we need LR/LS datapath incremental processing we can revisit this issue.