Bug 2073473 - [OVN SCALE][ovn-northd] Unnecessary SB record no-op changes added to SB transaction.
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.10
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.11.0
Assignee: Surya Seetharaman
QA Contact: Anurag saxena
URL:
Whiteboard:
Depends On: 2069623
Blocks:
Reported: 2022-04-08 14:51 UTC by Surya Seetharaman
Modified: 2022-08-10 11:05 UTC (History)

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of: 2069623
Environment:
Last Closed: 2022-08-10 11:05:44 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ovn-kubernetes pull 1052 | 0 | None | Merged | Bug 2077357: Bump OVN to ovn22.03-22.03.0-24 | 2022-05-10 18:06:39 UTC
Red Hat Product Errata | RHSA-2022:5069 | 0 | None | None | None | 2022-08-10 11:05:58 UTC

Description Surya Seetharaman 2022-04-08 14:51:15 UTC
This bug tracks the OVN version bump needed in our Dockerfile to pick up the fix described below.

+++ This bug was initially created as a clone of Bug #2069623 +++

OVS commit 1cc618c32524 ("ovsdb-idl: Fix atomicity of writes that
don't change a column's value.") [0] explains why writes that don't
change a column's value cannot be optimized out early if the column is
read/write.

In northd, most tables have change tracking enabled, making all their
columns read/write.  That means that a write to a column (even if it
doesn't change the value) will add the row to the current transaction.
Validation is eventually performed before sending the transaction to the
server, but if there are lots of such records this becomes costly.
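
To make the cost concrete, here is a minimal toy model of the behaviour
described above. It is only a sketch: ToyRow and ToyTxn are hypothetical
names invented for illustration and are not the ovsdb-idl API. It assumes
that every write to a read/write column enrolls the row in the transaction
and that the no-op check only happens in a later validation pass.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRow:
    """Hypothetical row: a uuid plus a dict of column values."""
    uuid: str
    columns: dict = field(default_factory=dict)


class ToyTxn:
    """Toy transaction; NOT the real ovsdb-idl API."""

    def __init__(self):
        self.pending = {}      # uuid -> (row, {column: new value})
        self.comparisons = 0   # work done during validation

    def write(self, row, column, value):
        # Modeled read/write column: the write is always recorded,
        # even when the value equals what the row already holds.
        _, changes = self.pending.setdefault(row.uuid, (row, {}))
        changes[column] = value

    def commit(self):
        # Validation pass: compare every recorded write with the stored
        # value and keep only real changes for the server.
        to_send = {}
        for uuid, (row, changes) in self.pending.items():
            for column, value in changes.items():
                self.comparisons += 1
                if row.columns.get(column) != value:
                    to_send.setdefault(uuid, {})[column] = value
        self.pending.clear()
        return to_send


rows = [ToyRow(f"lb{i}", {"vips": {"10.0.0.1:80": "10.1.0.1:8080"}})
        for i in range(1000)]
txn = ToyTxn()
for row in rows:
    txn.write(row, "vips", dict(row.columns["vips"]))  # same value, no-op write

enrolled = len(txn.pending)
sent = txn.commit()
print(enrolled, len(sent), txn.comparisons)  # 1000 0 1000
```

In this model nothing is sent to the server, yet all 1000 rows are enrolled
and compared on every iteration, which is the kind of overhead the report
describes.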

Profiling northd with an NB database taken from an ovn-k8s-like scale
test (16K load balancers applied to 120 logical switches and routers)
shows that ovn-northd always writes to the SB.Load_Balancer columns,
even if nothing changed.

This doesn't translate into any transaction being sent to the SB server
but does, however, increase the amount of work the IDL needs to
perform.
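
One generic way for a caller to avoid enrolling unchanged rows is to compare
before writing. The sketch below illustrates that idea only; it is not
necessarily what the patches linked in this bug do. The helper name
set_if_changed, the dict standing in for a database row, and the touched
list are all assumptions made for the example.

```python
def set_if_changed(record, column, value, touched):
    """Write `value` only when it differs from the current column value.

    `record` is a plain dict standing in for a database row and `touched`
    collects the columns that were really modified (both hypothetical).
    """
    if column in record and record[column] == value:
        return False          # no-op: leave the row out of the transaction
    record[column] = value
    touched.append(column)
    return True


lb = {"name": "lb0", "vips": {"10.0.0.1:80": "10.1.0.1:8080"}}
touched = []

unchanged = set_if_changed(lb, "vips", {"10.0.0.1:80": "10.1.0.1:8080"}, touched)
changed = set_if_changed(lb, "vips", {"10.0.0.1:80": "10.1.0.2:8080"}, touched)
print(unchanged, changed, touched)  # False True ['vips']
```

The trade-off is that the comparison moves into the caller, so every write
site has to remember to use the guard.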

On a test setup, we see that with the attached database, the overhead
is ~6 seconds (out of a total of ~13 seconds ovn-northd spends in a
poll loop).

[0] https://github.com/openvswitch/ovs/commit/1cc618c32524

--- Additional comment from Dumitru Ceara on 2022-03-29 09:41:53 UTC ---

v2 posted for review: https://patchwork.ozlabs.org/project/ovn/list/?series=292465&state=*

--- Additional comment from Dumitru Ceara on 2022-04-01 11:26:13 UTC ---

Posted patch to add support for change tracking of write-only columns in the IDL instead:
https://patchwork.ozlabs.org/project/openvswitch/list/?series=293071&state=*

This is a more generic solution.

--- Additional comment from Dumitru Ceara on 2022-04-04 13:57:24 UTC ---

Moving back to ASSIGNED until the OVS patch is accepted and a patch to bump the OVS submodule in OVN is posted.

Comment 3 Mohit Sheth 2022-05-12 20:10:25 UTC
With the above patch, the observed P99 pod latency is 7.63s for node-density on a 120-node cluster.

Comment 5 errata-xmlrpc 2022-08-10 11:05:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Important: OpenShift Container Platform 4.11.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:5069

