Bug 1814099

Summary: [scale] enable monitor-all to reduce load on southbound database
Product: OpenShift Container Platform Reporter: Dan Williams <dcbw>
Component: Networking Assignee: Dan Williams <dcbw>
Networking sub component: ovn-kubernetes QA Contact: Anurag saxena <anusaxen>
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: aconstan, anbhat, bbennett, rbrattai, rkhan, zzhao
Version: 4.4   
Target Milestone: ---   
Target Release: 4.4.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard: SDN-CI-IMPACT
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: 1814098
: 1814100 Environment:
Last Closed: 2020-06-17 22:26:03 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1814100    
Bug Blocks: 1814098    

Description Dan Williams 2020-03-17 02:56:44 UTC
+++ This bug was initially created as a clone of Bug #1814098 +++

Setting monitor-all=true (the ovn-monitor-all external-id) in each node's ovsdb makes each ovn-controller monitor all southbound database records rather than only those relevant to its own chassis. This reduces load on the southbound database, at the cost of somewhat more CPU and network activity on each node, and so improves the ability to scale.

See OVN bug https://bugzilla.redhat.com/1808125 for more details.
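For illustration, a minimal sketch of how the option could be set and read back by hand on a single node. It assumes `ovs-vsctl` is available on the node; the function names and the `OVS_VSCTL` override variable are hypothetical and only there to allow a dry run. This is not necessarily how ovn-kubernetes itself applies the setting.

```shell
# Sketch only: set and read back ovn-monitor-all on the local node.
# OVS_VSCTL is a hypothetical override (e.g. OVS_VSCTL=echo for a dry run).
OVS_VSCTL="${OVS_VSCTL:-ovs-vsctl}"

# Write ovn-monitor-all=true into the external-ids column of the
# single Open_vSwitch record.
enable_monitor_all() {
    $OVS_VSCTL set Open_vSwitch . external-ids:ovn-monitor-all=true
}

# Read the current value of the key back.
show_monitor_all() {
    $OVS_VSCTL get Open_vSwitch . external-ids:ovn-monitor-all
}
```

Running `OVS_VSCTL=echo enable_monitor_all` prints the command arguments instead of changing the database.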

Comment 3 Dan Williams 2020-04-17 13:14:27 UTC
Verification would be the same as https://bugzilla.redhat.com/show_bug.cgi?id=1814100#c7

Comment 4 zhaozhanqi 2020-04-20 08:42:50 UTC
I just checked: https://github.com/openshift/ovn-kubernetes/pull/135 is still open, so I am moving this bug to 'POST' for now.

Comment 8 Ross Brattain 2020-06-09 16:25:32 UTC
Verified on 4.4.0-0.nightly-2020-06-08-083627



$ for f in $(oc get pods -o wide  -l app=ovnkube-node $JPMNS) ; do  oc exec -c ovn-controller "${f}"  -- ovs-vsctl get Open_vSwitch .  external-ids | grep --color=auto monitor-all ; done
{hostname=compute.internal, ovn-bridge-mappings="physnet:br-local", ovn-encap-ip="10.0.205.114", ovn-encap-type=geneve, ovn-monitor-all="true", ovn-nb="ssl:10.0.132.181:9641,ssl:10.0.185.61:9641,ssl:10.0.218.242:9641", ovn-openflow-probe-interval="180", ovn-remote="ssl:10.0.132.181:9642,ssl:10.0.185.61:9642,ssl:10.0.218.242:9642", ovn-remote-probe-interval="100000", rundir="/var/run/openvswitch", system-id="f527a7c7-e386-46c7-bb2f-4242e26ab9a5"}
{hostname=compute.internal, ovn-bridge-mappings="physnet:br-local", ovn-encap-ip="10.0.167.119", ovn-encap-type=geneve, ovn-monitor-all="true", ovn-nb="ssl:10.0.132.181:9641,ssl:10.0.185.61:9641,ssl:10.0.218.242:9641", ovn-openflow-probe-interval="180", ovn-remote="ssl:10.0.132.181:9642,ssl:10.0.185.61:9642,ssl:10.0.218.242:9642", ovn-remote-probe-interval="100000", rundir="/var/run/openvswitch", system-id="91912446-bb24-4f65-8574-061cba441eae"}
{hostname=compute.internal, ovn-bridge-mappings="physnet:br-local", ovn-encap-ip="10.0.132.181", ovn-encap-type=geneve, ovn-monitor-all="true", ovn-nb="ssl:10.0.132.181:9641,ssl:10.0.185.61:9641,ssl:10.0.218.242:9641", ovn-openflow-probe-interval="180", ovn-remote="ssl:10.0.132.181:9642,ssl:10.0.185.61:9642,ssl:10.0.218.242:9642", ovn-remote-probe-interval="100000", rundir="/var/run/openvswitch", system-id="31e001a1-e4a3-45be-b5a1-9e90c1ac5acb"}
{hostname=compute.internal, ovn-bridge-mappings="physnet:br-local", ovn-encap-ip="10.0.152.146", ovn-encap-type=geneve, ovn-monitor-all="true", ovn-nb="ssl:10.0.132.181:9641,ssl:10.0.185.61:9641,ssl:10.0.218.242:9641", ovn-openflow-probe-interval="180", ovn-remote="ssl:10.0.132.181:9642,ssl:10.0.185.61:9642,ssl:10.0.218.242:9642", ovn-remote-probe-interval="100000", rundir="/var/run/openvswitch", system-id="53cf8742-c3cf-4968-a535-ab0d154cffa6"}
{hostname=compute.internal, ovn-bridge-mappings="physnet:br-local", ovn-encap-ip="10.0.185.61", ovn-encap-type=geneve, ovn-monitor-all="true", ovn-nb="ssl:10.0.132.181:9641,ssl:10.0.185.61:9641,ssl:10.0.218.242:9641", ovn-openflow-probe-interval="180", ovn-remote="ssl:10.0.132.181:9642,ssl:10.0.185.61:9642,ssl:10.0.218.242:9642", ovn-remote-probe-interval="100000", rundir="/var/run/openvswitch", system-id="cf13cc53-6688-4eb9-b283-9d97133ea561"}
{hostname=compute.internal, ovn-bridge-mappings="physnet:br-local", ovn-encap-ip="10.0.218.242", ovn-encap-type=geneve, ovn-monitor-all="true", ovn-nb="ssl:10.0.132.181:9641,ssl:10.0.185.61:9641,ssl:10.0.218.242:9641", ovn-openflow-probe-interval="180", ovn-remote="ssl:10.0.132.181:9642,ssl:10.0.185.61:9642,ssl:10.0.218.242:9642", ovn-remote-probe-interval="100000", rundir="/var/run/openvswitch", system-id="4068bd8b-9c8e-4016-9f7f-a0d2d424d5f4"}
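The check above relies on reading the grep output by eye, and `$JPMNS` is not defined anywhere in this report. As a hedged sketch, the same verification could be made pass/fail with a small filter that reads one external-ids map per line and flags any node where `ovn-monitor-all="true"` is missing. The function name `check_monitor_all` and the `<namespace>` placeholder are assumptions, not part of the original verification.

```shell
# Sketch only: read one external-ids map per line on stdin, print every
# line that lacks ovn-monitor-all="true", and return non-zero if any
# such line was seen.
check_monitor_all() {
    missing=0
    while IFS= read -r line; do
        case "$line" in
            '') continue ;;                       # skip blank lines
            *'ovn-monitor-all="true"'*) ;;        # option enabled on this node
            *)
                printf 'monitor-all not enabled: %s\n' "$line"
                missing=1
                ;;
        esac
    done
    return "$missing"
}

# Possible usage (assumes a logged-in oc session; <namespace> is a placeholder):
#   for pod in $(oc get pods -n <namespace> -o name -l app=ovnkube-node); do
#       oc exec -n <namespace> -c ovn-controller "$pod" -- \
#           ovs-vsctl get Open_vSwitch . external-ids
#   done | check_monitor_all
```

Using `-o name` rather than `-o wide` in the usage sketch keeps the loop variable to one pod reference per iteration.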

Comment 10 errata-xmlrpc 2020-06-17 22:26:03 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2445