
Bug 1824424

Summary: [OVN SCALE] [ovn-controller] chassis external_ids update retriggers flow calculation
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: anil venkata <vkommadi>
Component: OVN
Assignee: Numan Siddique <nusiddiq>
Status: CLOSED DUPLICATE
QA Contact: Jianlin Shi <jishi>
Severity: unspecified
Docs Contact:
Priority: unspecified
Version: RHEL 8.0
CC: ctrautma, lmartins, nusiddiq
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2020-04-16 08:38:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
ovn-controller.log with debug info (flags: none)

Description anil venkata 2020-04-16 08:28:59 UTC
Created attachment 1679269: ovn-controller.log with debug info

Description of problem:

While we were running Rally scenario tests on OSP16 with the OVN driver, the system was loaded with Neutron resources, and the networking-ovn driver was frequently running liveness checks (every 5 seconds) that update the "nb_cfg" and "external_ids" columns of the SBDB Chassis table.

Each check increments nb_cfg and also adds a new timestamp to external_ids:
_uuid               : 925f3247-7132-48a9-8634-16f90d33f043
encaps              : [82d43060-1071-4c06-8c74-1b8424205719]
external_ids        : {datapath-type="", iface-types="erspan,geneve,gre,internal,ip6erspan,ip6gre,lisp,patch,stt,system,tap,vxlan", "neutron:liveness_check_at"="2020-04-14T18:48:21.122794+00:00", ovn-bridge-mappings="datacentre:br-ex", ovn-chassis-mac-mappings="", ovn-cms-options=enable-chassis-as-gw}
hostname            : "controller-2.redhat.local"
name                : "059c6ee3-4700-43a0-a652-227d30fbcd1c"
nb_cfg              : 147126
transport_zones     : []
vtep_logical_switches: []

On the compute nodes, it also updates external_ids with the "neutron-metadata-proxy-networks" and "neutron:ovn-metadata-sb-cfg" keys.

This triggers a recalculation of flows in the ovn-controllers, which slows down flow programming for the VMs.
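
To make the cost concrete, here is a minimal Python sketch of a change handler that decides whether a Chassis external_ids update needs a flow recompute. It is not ovn-controller's code (which is written in C) and not a confirmed fix. The function name and the set of ignorable keys are assumptions for illustration, taken only from the timestamp keys that appear in this report.

```python
# Hypothetical sketch: keys that only carry periodic Neutron check data.
# Treating them as irrelevant to flow computation is an assumption.
IGNORABLE_KEYS = frozenset({
    "neutron:liveness_check_at",
    "neutron:metadata_liveness_check_at",
})

_MISSING = object()  # distinguishes "key absent" from any stored value


def needs_flow_recompute(old_ids: dict, new_ids: dict) -> bool:
    """Return True if a key outside IGNORABLE_KEYS was added, removed or changed."""
    changed = {
        key
        for key in old_ids.keys() | new_ids.keys()
        if old_ids.get(key, _MISSING) != new_ids.get(key, _MISSING)
    }
    return bool(changed - IGNORABLE_KEYS)


old = {
    "ovn-bridge-mappings": "datacentre:br-ex",
    "neutron:liveness_check_at": "2020-04-14T18:48:21.122794+00:00",
}
# Only the liveness timestamp moves: no recompute would be needed.
timestamp_only = {**old, "neutron:liveness_check_at": "2020-04-16T07:03:05.472814+00:00"}
# A bridge mapping changes: flows must be recalculated.
mapping_changed = {**old, "ovn-bridge-mappings": "datacentre:br-ex2"}

print(needs_flow_recompute(old, timestamp_only))   # False
print(needs_flow_recompute(old, mapping_changed))  # True
```

Without a filter of this kind, every timestamp write is treated like a real configuration change, which matches the behaviour described above.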

The attached ovn-controller.log shows this:
2020-04-16T07:03:05.468Z|00011|poll_loop(stopwatch2)|DBG|wakeup due to [POLLIN] on fd 17 (FIFO pipe:[1086516]) at lib/stopwatch.c:458 (0% CPU usage)
2020-04-16T07:03:05.477Z|00275|poll_loop|DBG|wakeup due to [POLLIN] on fd 19 (172.17.1.54:49426<->172.17.1.54:6642) at lib/stream-fd.c:157 (0% CPU usage)
2020-04-16T07:03:05.477Z|00276|jsonrpc|DBG|tcp:172.17.1.54:6642: received notification, method="update2", params=[["monid","OVN_Southbound"],{"Chassis":{"10d6b5e2-99c7-4a8c-a0be-b8a090b6e1cc":{"modify":{"external_ids":["map",[["neutron:metadata_liveness_check_at","2020-04-16T07:03:05.472814+00:00"]]]}}}}]
2020-04-16T07:03:05.477Z|00286|inc_proc_eng|DBG|node: SB_chassis, changed: 1
2020-04-16T07:03:05.477Z|00289|inc_proc_eng|DBG|node: runtime_data, recompute (triggered)
2020-04-16T07:03:05.478Z|00296|inc_proc_eng|DBG|node: runtime_data, changed: 1
2020-04-16T07:03:05.478Z|00305|inc_proc_eng|DBG|node: flow_output, recompute (triggered)
2020-04-16T07:03:05.478Z|00306|ofctrl|DBG|ofctrl_add_flow flow: sb_uuid=6088cb73-ee83-4275-b7dc-c0eb5fa03fe5, table_id=0, priority=100, in_port=2, actions=move:NXM_NX_TUN_ID[0..23]->OXM_OF_METADATA[0..23],move:NXM_NX_TUN_METADATA0[16..30]->NXM_NX_REG14[0..14],move:NXM_NX_TUN_METADATA0[0..15]->NXM_NX_REG15[0..15],resubmit(,33)
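
The only change carried by the "update2" notification in the excerpt above is a single external_ids key. A small Python sketch that extracts the modified columns from the notification's params makes that visible. The helper name modified_columns is hypothetical, and the params string is copied from the log line.

```python
import json

# "params" of the update2 notification, copied from the log excerpt.
PARAMS = (
    '[["monid","OVN_Southbound"],{"Chassis":{"10d6b5e2-99c7-4a8c-a0be-b8a090b6e1cc":'
    '{"modify":{"external_ids":["map",[["neutron:metadata_liveness_check_at",'
    '"2020-04-16T07:03:05.472814+00:00"]]]}}}}]'
)


def modified_columns(params_json: str) -> dict:
    """Map (table, row uuid) -> {column: list of map keys, or None for non-map columns}."""
    _monitor_id, table_updates = json.loads(params_json)
    result = {}
    for table, rows in table_updates.items():
        for row_uuid, row_update in rows.items():
            columns = {}
            for column, value in row_update.get("modify", {}).items():
                # OVSDB encodes a map as ["map", [[key, value], ...]].
                if isinstance(value, list) and len(value) == 2 and value[0] == "map":
                    columns[column] = [key for key, _value in value[1]]
                else:
                    columns[column] = None
            result[(table, row_uuid)] = columns
    return result


print(modified_columns(PARAMS))
```

For this log line the result contains one Chassis row whose only modified column is external_ids, with the single key "neutron:metadata_liveness_check_at", yet the log shows runtime_data and flow_output being recomputed.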


This setup has 3 compute nodes and 3 controllers in an HA configuration (no DVR).
OSP16 puddle version: RHOS_TRUNK-16.0-RHEL-8-20200226.n.1

Comment 1 Lucas Alvares Gomes 2020-04-16 08:38:42 UTC
Thanks for opening this, Anil. I'm marking this as a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1824220 because the latter was already triaged by Dumitru.

*** This bug has been marked as a duplicate of bug 1824220 ***