Bug 1871054 - Avoid nb_cfg update notification flooding (Chassis_Private)
Summary: Avoid nb_cfg update notification flooding (Chassis_Private)
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: RHEL 8.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Numan Siddique
QA Contact: ying xu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2020-08-21 08:45 UTC by Lucas Alvares Gomes
Modified: 2020-09-16 16:01 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-09-16 16:01:25 UTC
Target Upstream Version:
Embargoed:




Links
System | ID | Private | Priority | Status | Summary | Last Updated
Red Hat Product Errata | RHBA-2020:3769 | 0 | None | None | None | 2020-09-16 16:01:34 UTC

Description Lucas Alvares Gomes 2020-08-21 08:45:33 UTC
The nb_cfg mechanism for "pinging" the OVN control plane is useful in many ways. However, the current implementation floods the whole control plane with update notifications. Each HV writes its nb_cfg number to the SB DB, and every one of these updates is notified to all the other HVs, which is O(n^2). Although the updates are batched into fewer than n^2 notifications, this still generates significant load on the SB DB and on the ovn-controllers.

To solve this problem and make the mechanism more scalable for large environments, we should move the chassis private data into its own table, so that each chassis can conditionally monitor those records and receive updates without flooding all the other hypervisors.
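The scaling argument above can be illustrated with a small counting model. This is only a sketch under simplifying assumptions: it counts per-row update deliveries for a single nb_cfg bump, ignores the batching mentioned in the description, and assumes each chassis also sees the update for its own record. The function name `count_deliveries` is hypothetical and not part of OVN.

```python
def count_deliveries(num_chassis, conditional):
    """Return (total, cross_chassis) row updates delivered for one nb_cfg bump.

    Simplified model: every chassis writes its nb_cfg once, and the SB DB
    delivers that row update to every chassis that monitors the row.
    """
    total = 0
    cross = 0
    for writer in range(num_chassis):
        for receiver in range(num_chassis):
            if conditional:
                # Assumed behaviour after the fix: a chassis monitors
                # only its own private record.
                monitors_row = receiver == writer
            else:
                # Behaviour described in the report: every chassis is
                # notified about every chassis record.
                monitors_row = True
            if monitors_row:
                total += 1
                if receiver != writer:
                    cross += 1
    return total, cross


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, count_deliveries(n, False), count_deliveries(n, True))
```

In this model the unconditional case grows quadratically with the number of chassis, while the conditional case grows linearly and delivers nothing across chassis.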

Comment 5 ying xu 2020-08-26 10:08:34 UTC
This bug adds a new table, Chassis_Private, which is separated from the Chassis table.

Verified on the following version:
# rpm -qa|grep ovn
ovn2.13-central-20.06.2-1.el8fdp.x86_64
ovn2.13-20.06.2-1.el8fdp.x86_64
ovn2.13-host-20.06.2-1.el8fdp.x86_64


# ovn-sbctl --columns name --bare find chassis
hv1

hv0
[root@dell-per730-19 multicast]# nb_global_id=$(ovn-nbctl --columns _uuid --bare find nb_global)
[root@dell-per730-19 multicast]# ovn-nbctl set NB_Global ${nb_global_id} nb_cfg=99
[root@dell-per730-19 multicast]# ovn-sbctl --columns nb_cfg --bare find chassis_private    ----------- this table is separate from the chassis table
99

99
[root@dell-per730-19 multicast]# ovn-sbctl --columns nb_cfg --bare find chassis
0

0

# ovn-nbctl --wait=hv sync    -----------sync
[root@dell-per730-19 multicast]# ovn-sbctl --columns nb_cfg --bare find chassis    ------- this table is not updated
0

0
[root@dell-per730-19 multicast]# ovn-sbctl --columns nb_cfg --bare find chassis_private   ----------- this table is updated (incremented by 1)
100

100
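
The before/after comparison above could also be checked mechanically. The following is a minimal sketch that works on the captured `--bare` outputs pasted in as strings, so it does not need a running OVN deployment; the helper name `parse_bare_column` is hypothetical, and the input strings are copied from the transcript above.

```python
def parse_bare_column(output):
    """Parse the blank-line-separated integers printed by a --bare query."""
    return [int(token) for token in output.split()]


# Outputs copied from the transcript (before and after the sync).
chassis_before = parse_bare_column("0\n\n0\n")
chassis_after = parse_bare_column("0\n\n0\n")
private_before = parse_bare_column("99\n\n99\n")
private_after = parse_bare_column("100\n\n100\n")

# The chassis table should be untouched by the sync.
chassis_unchanged = chassis_before == chassis_after

# Each chassis_private record should have moved forward by one.
private_increments = [
    after - before for before, after in zip(private_before, private_after)
]

if __name__ == "__main__":
    print("chassis unchanged:", chassis_unchanged)
    print("chassis_private increments:", private_increments)
```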

Running tcpdump on hv0 and then syncing once shows that only the update for hv0's own record is captured.
# ovn-sbctl find chassis_private
_uuid               : 89754dfe-b7d2-4888-be0d-fc797469bc4a
chassis             : e260facc-f8a1-4a1a-95d0-4dbb4a80d529
external_ids        : {}
name                : hv1
nb_cfg              : 101

_uuid               : 84d0f88b-29f1-4641-b922-c4a9c154f1a1    ---------- the UUID of the hv0 record
chassis             : c494a5cd-9e87-41ff-b223-14f7df0bcefc
external_ids        : {}
name                : hv0
nb_cfg              : 101


Inspecting the captured packets in Wireshark shows the following:
{"id":null,"method":"update3","params":[["monid","OVN_Southbound"],"00000000-0000-0000-0000-000000000000",{"Chassis_Private":{"84d0f88b-29f1-4641-b922-c4a9c154f1a1":{"modify":{"nb_cfg":101}}}}]}  -------- this is the UUID of hv0.

The UUID of hv1 does not appear.
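
The same check can be made on the captured payload itself. This is a rough sketch that parses the update3 notification shown above and lists which Chassis_Private rows it modifies; the function name `modified_rows` is hypothetical, and it assumes the params are laid out as monitor id, last transaction id, and table updates, as in the capture.

```python
import json

# Payload copied from the capture on hv0.
MESSAGE = (
    '{"id":null,"method":"update3","params":[["monid","OVN_Southbound"],'
    '"00000000-0000-0000-0000-000000000000",'
    '{"Chassis_Private":{"84d0f88b-29f1-4641-b922-c4a9c154f1a1":'
    '{"modify":{"nb_cfg":101}}}}]}'
)

HV0_UUID = "84d0f88b-29f1-4641-b922-c4a9c154f1a1"
HV1_UUID = "89754dfe-b7d2-4888-be0d-fc797469bc4a"


def modified_rows(message, table):
    """Return {row_uuid: modified_columns} for one table in an update3 message."""
    msg = json.loads(message)
    if msg.get("method") != "update3":
        return {}
    # Assumed layout: [monitor id, last transaction id, table updates].
    _monitor_id, _last_txn_id, table_updates = msg["params"]
    rows = table_updates.get(table, {})
    return {
        uuid: change["modify"]
        for uuid, change in rows.items()
        if "modify" in change
    }


if __name__ == "__main__":
    rows = modified_rows(MESSAGE, "Chassis_Private")
    print("hv0 present:", HV0_UUID in rows)
    print("hv1 present:", HV1_UUID in rows)
```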

Comment 7 errata-xmlrpc 2020-09-16 16:01:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (ovn2.13 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:3769

