Bug 1961757

Summary: ovn-kubernetes: Enable ovn-controller lflow-cache limits (memory and/or size)
Product: OpenShift Container Platform Reporter: Dumitru Ceara <dceara>
Component: Networking Assignee: Jaime Caamaño Ruiz <jcaamano>
Networking sub component: ovn-kubernetes QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: unspecified CC: aconstan, astoycos, dcbw, kkulkarn
Version: 4.8   
Target Milestone: ---   
Target Release: 4.9.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: Enhancement
Doc Text:
Feature: Adds the ability to configure logical flow cache limits in ovn-kubernetes. The memory limit is set to 1GB by default. Reason: Limits ovn-controller memory consumption on large clusters, at some cost in performance. Result: Logical flow cache limits can be configured, with a default of 1GB. Memory consumption on large clusters should decrease.
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-10-18 17:31:06 UTC Type: Bug
Bug Depends On:    
Bug Blocks: 2021221    

Description Dumitru Ceara 2021-05-18 16:10:08 UTC
Per https://bugzilla.redhat.com/show_bug.cgi?id=1954853#c8 opening a BZ to track enabling of ovn-controller lflow-cache limits (memory and/or number of entries) from ovn-kubernetes.

Relevant OVN documentation, section CONFIGURATION:

https://www.ovn.org/support/dist-docs/ovn-controller.8.html

"external_ids:ovn-enable-lflow-cache
       The  boolean  flag  indicates  if  ovn-controller  should
       enable/disable the logical flow in-memory cache  it  uses
       when processing Southbound database logical flow changes.
       By default caching is enabled.
       
external_ids:ovn-limit-lflow-cache
       When used, this configuration value determines the maximum
       number of logical flow cache entries ovn-controller may
       create when the logical flow cache is enabled. By default
       the size of the cache is unlimited.

external_ids:ovn-memlimit-lflow-cache-kb
       When used, this configuration value determines the maximum
       size of the logical flow cache (in KB) ovn-controller may
       create when the logical flow cache is enabled. By default
       the size of the cache is unlimited."
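
The keys quoted above are external_ids entries in the local Open_vSwitch table, so they can be set with ovs-vsctl. A minimal sketch follows, assuming a 1 GiB memory cap and assuming KB here means 1024 bytes. The helper names gib_to_kb and lflow_cache_memlimit_cmd are illustrative and are not part of OVN or ovn-kubernetes. The script only prints the command so that it can be reviewed before being run on a node.

```shell
#!/bin/sh
# Convert GiB to KB, assuming 1 GiB = 1024 * 1024 KB.
gib_to_kb() {
    echo $(( $1 * 1024 * 1024 ))
}

# Print (do not run) the ovs-vsctl command that caps the lflow cache
# at the given number of GiB. The helper name is hypothetical.
lflow_cache_memlimit_cmd() {
    echo "ovs-vsctl set Open_vSwitch . external_ids:ovn-memlimit-lflow-cache-kb=$(gib_to_kb "$1")"
}

lflow_cache_memlimit_cmd 1
```

For 1 GiB this yields a limit of 1048576 KB.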

Comment 1 Dumitru Ceara 2021-05-27 14:40:34 UTC
Lflow cache limit patchset backported to ovn2.13-20.12.0-135

Comment 2 Dan Williams 2021-07-02 19:14:55 UTC
Jaime posted upstream PR https://github.com/ovn-org/ovn-kubernetes/pull/2247
That was brought downstream to 4.9 in https://github.com/openshift/ovn-kubernetes/pull/582

CNO change to limit cache is posted in https://github.com/openshift/cluster-network-operator/pull/1147

Comment 4 Kedar Kulkarni 2021-08-06 18:21:44 UTC
Hi,

I reviewed the fix to confirm that the lflow cache enablement change is integrated. Cluster version tested: 4.9.0-0.nightly-2021-08-06-060446.

The ConfigMap reflects the changes posted in the PR: enable-lflow-cache is true and lflow-cache-limit-kb is set:

oc get -n openshift-ovn-kubernetes cm  ovnkube-config -oyaml 

apiVersion: v1
data:
  ovnkube.conf: |-
    [default]
    mtu="8901"
    cluster-subnets="10.128.0.0/10/22"
    encap-port="6081"
    enable-lflow-cache=true
    lflow-cache-limit-kb=1048576

    [kubernetes]
    service-cidrs="172.30.0.0/16"
    ovn-config-namespace="openshift-ovn-kubernetes"
    apiserver="https://api-int.<snip>openshift.com:6443"
    host-network-namespace="openshift-host-network"

    [ovnkubernetesfeature]
    enable-egress-ip=true
    enable-egress-firewall=true

    [gateway]
    mode=shared
    nodeport=true
kind: ConfigMap
metadata:
  creationTimestamp: "2021-08-06T16:09:19Z"
  name: ovnkube-config
  namespace: openshift-ovn-kubernetes
  ownerReferences:
  - apiVersion: operator.openshift.io/v1
    blockOwnerDeletion: true
    controller: true
    kind: Network
    name: cluster
    uid: d90d66a1-13eb-4d3c-83ca-b4afadb397f2
  resourceVersion: "2814"
  uid: 985db6a9-8ba0-4d6d-bd27-ef750adaa2af



Thanks,
KK.
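
The same check can be narrowed to just the lflow cache settings. This is a rough sketch, not part of the original verification: the function name lflow_settings is hypothetical, and the oc pipeline in the comment assumes a logged-in session with read access to the openshift-ovn-kubernetes namespace. The demo at the end runs on an excerpt of the config shown in the comment above, so it needs no cluster.

```shell
#!/bin/sh
# Keep only the lflow-cache settings from ovnkube.conf text read on stdin.
lflow_settings() {
    grep -E '^[[:space:]]*(enable-lflow-cache|lflow-cache-limit(-kb)?)='
}

# On a cluster (assumes oc is logged in):
#   oc get -n openshift-ovn-kubernetes cm ovnkube-config \
#       -o jsonpath='{.data.ovnkube\.conf}' | lflow_settings

# Self-contained demo using a few lines from the ConfigMap output.
printf '%s\n' \
    '[default]' \
    'mtu="8901"' \
    'encap-port="6081"' \
    'enable-lflow-cache=true' \
    'lflow-cache-limit-kb=1048576' | lflow_settings
```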

Comment 5 Kedar Kulkarni 2021-08-06 18:27:26 UTC
Additionally, I confirmed that the ovnkube CLI help lists the lflow cache options:

ovnkube -h | grep lflow

   --enable-lflow-cache            Enable the logical flow in-memory cache it uses when processing Southbound database logical flow changes. By default caching is enabled. (default: true)
   --lflow-cache-limit value       Maximum number of logical flow cache entries ovn-controller may create when the logical flow cache is enabled. By default the size of the cache is unlimited. (default: 0)
   --lflow-cache-limit-kb value    Maximum size of the logical flow cache ovn-controller may create when the logical flow cache is enabled. By default the size of the cache is unlimited. (default: 0)
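
The three options share their descriptions with the ovn-controller external_ids keys quoted in the bug description, which suggests a one-to-one mapping. The sketch below encodes that mapping and prints the ovs-vsctl commands that would read each value back on a node. The mapping is inferred from the matching help text, not confirmed from the ovnkube source, and the function name external_id_for is hypothetical.

```shell
#!/bin/sh
# Map an ovnkube option name to the external_ids key with the matching
# description. Assumption: inferred from help text, not from ovnkube code.
external_id_for() {
    case "$1" in
        enable-lflow-cache)   echo "ovn-enable-lflow-cache" ;;
        lflow-cache-limit)    echo "ovn-limit-lflow-cache" ;;
        lflow-cache-limit-kb) echo "ovn-memlimit-lflow-cache-kb" ;;
        *) echo "unknown option: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) the read-back command for each option.
for opt in enable-lflow-cache lflow-cache-limit lflow-cache-limit-kb; do
    echo "ovs-vsctl get Open_vSwitch . external_ids:$(external_id_for "$opt")"
done
```

If a key was never set, ovs-vsctl get reports an error for it instead of printing a value.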

Comment 8 errata-xmlrpc 2021-10-18 17:31:06 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759