Bug 2097782 - Revisit revalidator flow-size reduction algorithm
Summary: Revisit revalidator flow-size reduction algorithm
Keywords:
Status: NEW
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: openvswitch
Version: FDP 20.E
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: ---
Assignee: Timothy Redaelli
QA Contact: qding
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-06-16 14:31 UTC by Adrián Moreno
Modified: 2023-07-13 07:25 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Issue Tracker FD-2036 0 None None None 2022-06-16 14:37:39 UTC

Description Adrián Moreno 2022-06-16 14:31:03 UTC
Currently, the revalidator has the following logic:

            duration = MAX(time_msec() - start_time, 1);
            if (duration > 2000) {
                flow_limit /= duration / 1000;
            } else if (duration > 1300) {
                flow_limit = flow_limit * 3 / 4;
            } else if (duration < 1000 &&
                       flow_limit < n_flows * 1000 / duration) {
                flow_limit += 1000;
            }

The goal of this mechanism is to always guarantee that we apply changes to the datapath within a "reasonable time": 2 seconds.
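To make the behaviour easier to reason about, here is a minimal standalone sketch of the adjustment quoted above. The function name adjust_flow_limit and its parameters are hypothetical (they are not OVS identifiers), and the sketch covers only the lines quoted in this report, not anything else the revalidator does around them.

```c
/* Local stand-in for the MAX macro used in the quoted snippet. */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Hypothetical helper: returns the flow limit that the quoted logic
 * would produce for one revalidation round.
 *   flow_limit - current limit
 *   n_flows    - number of flows seen in this round
 *   elapsed_ms - stands in for time_msec() - start_time */
static long long
adjust_flow_limit(long long flow_limit, long long n_flows,
                  long long elapsed_ms)
{
    long long duration = MAX(elapsed_ms, 1);

    if (duration > 2000) {
        /* Integer division: 2001..2999 ms divides the limit by 2,
         * 3000..3999 ms divides it by 3, and so on. */
        flow_limit /= duration / 1000;
    } else if (duration > 1300) {
        /* Slightly over budget: drop the limit by a quarter. */
        flow_limit = flow_limit * 3 / 4;
    } else if (duration < 1000
               && flow_limit < n_flows * 1000 / duration) {
        /* Fast round with headroom: grow the limit additively. */
        flow_limit += 1000;
    }
    return flow_limit;
}
```

With a starting limit of 200000, a 2000 ms round yields 150000, while a 2001 ms round already yields 100000, because the divisor is truncated to an integer. Recovery is much slower than reduction: the limit grows by only 1000 per round, and only when the round takes under 1000 ms.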

In an overloaded system, reducing the number of flows in the cache causes flows to be evicted. This can lead to a higher number of upcalls, which in turn puts more pressure on the upcall handlers (which typically run on the same cores as the revalidators) and can result in packet drops.

This task is to revisit this logic, test it under high pressure, and see whether we can make OVS more robust, or at least find a good balance between revalidation time and upcalls.

Since we're seeing deployments where ovs-vswitchd is restricted to a small number of CPUs (e.g., PAO), this becomes more relevant.

