
Bug 1827403

Summary: Explosion of logical flows in sb database after scaling to 100 nodes because of pre-hairpin changes
Product: Red Hat Enterprise Linux Fast Datapath Reporter: Aniket Bhat <anbhat>
Component: ovn2.13 Assignee: OVN Team <ovnteam>
Status: CLOSED ERRATA QA Contact: Jianlin Shi <jishi>
Severity: high Docs Contact:
Priority: high    
Version: FDP 20.A CC: avishnoi, ctrautma, dceara, jiji, jishi, mmichels, ralongi
Target Milestone: ---   
Target Release: ---   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-05-26 14:07:18 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Load balancer dump none
Dump of the entire sbdb none
Output of nbctl show none
Raw nbdb file none

Description Aniket Bhat 2020-04-23 20:18:02 UTC
Created attachment 1681248 [details]
Load balancer dump

Description of problem:

A 100 node OVN cluster is showing 318888 logical flows in sbdb.

Version-Release number of selected component (if applicable):
ovn-20.03.0-2.fc31.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Scale the OVN configuration to 100 nodes.
2. Dump the logical flows using "ovn-sbctl lflow-list".

Actual results:
318888 logical flows are seen. Many of them are in the pre-hairpin table.
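
To see which pipeline stages account for most of the flows, the "ovn-sbctl lflow-list" dump can be grouped by stage name. The following is a minimal sketch, not part of the original report. It assumes each flow line starts with a field shaped like "table=N(stage_name)", possibly padded with spaces; the function name count_flows_per_stage and that line shape are assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed shape of a flow line (illustrative, not confirmed by this report):
#   table=N(stage_name   ), priority=P, match=(...), action=(...)
STAGE_RE = re.compile(r"^\s*table=\s*\d+\s*\(\s*([A-Za-z0-9_]+)\s*\)")


def count_flows_per_stage(lines):
    """Return a Counter mapping stage name -> number of matching flow lines."""
    counts = Counter()
    for line in lines:
        match = STAGE_RE.match(line)
        if match:
            # Header and blank lines do not match and are skipped.
            counts[match.group(1)] += 1
    return counts
```

One possible use is to save the dump to a file (for example a hypothetical lflows.txt), pass the open file to count_flows_per_stage(), and print the result of most_common(10). Because lines that do not look like flows are skipped, the total may come out slightly lower than a plain line count of the same dump.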

Expected results:


Additional info:

Comment 1 Aniket Bhat 2020-04-23 20:18:54 UTC
Created attachment 1681249 [details]
Dump of the entire sbdb

Comment 2 Aniket Bhat 2020-04-23 20:19:21 UTC
Created attachment 1681250 [details]
Output of nbctl show

Comment 3 Aniket Bhat 2020-04-23 20:20:06 UTC
Created attachment 1681251 [details]
Raw nbdb file

Comment 8 Jianlin Shi 2020-04-29 02:41:32 UTC
Started ovn-northd with the attached ovnnb file and got the following results.

on ovn2.13.0-18:

[root@hp-dl380pg8-13 bz1827403]# ovn-sbctl lflow-list | wc -l
25755
[root@hp-dl380pg8-13 bz1827403]# ovn-sbctl lflow-list | grep pre_hairpin -c
9014

[root@hp-dl380pg8-13 bz1827403]# rpm -qa | grep ovn
ovn2.13-central-2.13.0-18.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-common-1.0-7.noarch
ovn2.13-host-2.13.0-18.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn_ha-1.0-55.noarch
ovn2.13-2.13.0-18.el8fdp.x86_64

on ovn2.13.0-21:

[root@hp-dl380pg8-13 bz1827403]# rpm -qa | grep ovn
ovn2.13-host-2.13.0-21.el8fdp.x86_64
ovn2.13-2.13.0-21.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-common-1.0-7.noarch
ovn2.13-central-2.13.0-21.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn_ha-1.0-55.noarch
[root@hp-dl380pg8-13 bz1827403]# ovn-sbctl lflow-list | wc -l
21443
[root@hp-dl380pg8-13 bz1827403]# ovn-sbctl lflow-list | grep pre_hairpin -c
4702

The same result is seen on the RHEL 7 version.
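
The before and after counts can be compared directly. The sketch below only redoes the arithmetic on the four numbers reported above. The dictionary keys and the reduction() helper are names chosen for illustration, and the counts are line counts from "wc -l" and "grep -c", not exact flow counts.

```python
# Line counts copied from the "ovn-sbctl lflow-list" runs above.
before = {"total_lines": 25755, "pre_hairpin_lines": 9014}  # ovn2.13.0-18
after = {"total_lines": 21443, "pre_hairpin_lines": 4702}   # ovn2.13.0-21


def reduction(old, new):
    """Return (absolute drop, drop as a fraction of the old value)."""
    drop = old - new
    return drop, drop / old


total_drop, total_frac = reduction(before["total_lines"], after["total_lines"])
hp_drop, hp_frac = reduction(
    before["pre_hairpin_lines"], after["pre_hairpin_lines"]
)

print(f"total:       -{total_drop} lines ({total_frac:.1%})")
print(f"pre_hairpin: -{hp_drop} lines ({hp_frac:.1%})")
```

Both drops come out to the same absolute number of lines (4312), which is consistent with the whole reduction on this test database coming from lines that mention pre_hairpin.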

Comment 9 Jianlin Shi 2020-04-29 08:55:40 UTC
Hi Dumitru,

I added comment 8 with the test results on the old version and the fixed version.
Does that mean the issue is fixed?
Is there a better way to verify the issue?

Comment 10 Dumitru Ceara 2020-04-29 11:54:28 UTC
(In reply to Jianlin Shi from comment #9)
> Hi Dumitru,
> 
> I added comment 8 with the test results on the old version and the fixed version.
> Does that mean the issue is fixed?
> Is there a better way to verify the issue?

Hi Jianlin,

It should be enough but I guess Aniket can also confirm that the issue is mitigated on his setup.

Thanks,
Dumitru

Comment 11 Aniket Bhat 2020-05-05 14:43:38 UTC
When I scale to 100 nodes, we now have about half the number of flows, around 160000. I am not sure whether we can optimize further, but if we have hit the theoretical minimum, then we are OK.

Comment 12 Jianlin Shi 2020-05-08 09:08:13 UTC
Hi Dumitru,

What do you think about comment 11?

Comment 13 Dumitru Ceara 2020-05-08 09:37:57 UTC
(In reply to Jianlin Shi from comment #12)
> Hi Dumitru,
> 
> What do you think about comment 11?

Hi Jianlin,

I think for now it's the best we can do.

Regards,
Dumitru

Comment 14 Jianlin Shi 2020-05-08 09:51:53 UTC
(In reply to Dumitru Ceara from comment #13)
> (In reply to Jianlin Shi from comment #12)
> > Hi Dumitru,
> > 
> > What do you think about comment 11?
> 
> Hi Jianlin,
> 
> I think for now it's the best we can do.
> 
> Regards,
> Dumitru

Got it, setting the bug to VERIFIED.

Comment 16 errata-xmlrpc 2020-05-26 14:07:18 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2317