Bug 1827403
| Summary: | Explosion of logical flows in sb database after scaling to 100 nodes because of pre-hairpin changes | | |
|---|---|---|---|
| Product: | Red Hat Enterprise Linux Fast Datapath | Reporter: | Aniket Bhat <anbhat> |
| Component: | ovn2.13 | Assignee: | OVN Team <ovnteam> |
| Status: | CLOSED ERRATA | QA Contact: | Jianlin Shi <jishi> |
| Severity: | high | Priority: | high |
| Version: | FDP 20.A | CC: | avishnoi, ctrautma, dceara, jiji, jishi, mmichels, ralongi |
| Hardware: | Unspecified | OS: | Unspecified |
| Last Closed: | 2020-05-26 14:07:18 UTC | Type: | Bug |

Attachments:
Created attachment 1681249
Dump of the entire sbdb

Created attachment 1681250
Output of nbctl show

Created attachment 1681251
Raw nbdb file

Started ovn-northd with the attached ovnnb file and got the following results.

On ovn2.13.0-18:

[root@hp-dl380pg8-13 bz1827403]# ovn-sbctl lflow-list | wc -l
25755
[root@hp-dl380pg8-13 bz1827403]# ovn-sbctl lflow-list | grep pre_hairpin -c
9014
[root@hp-dl380pg8-13 bz1827403]# rpm -qa | grep ovn
ovn2.13-central-2.13.0-18.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-common-1.0-7.noarch
ovn2.13-host-2.13.0-18.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn_ha-1.0-55.noarch
ovn2.13-2.13.0-18.el8fdp.x86_64

On ovn2.13.0-21:

[root@hp-dl380pg8-13 bz1827403]# rpm -qa | grep ovn
ovn2.13-host-2.13.0-21.el8fdp.x86_64
ovn2.13-2.13.0-21.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn-common-1.0-7.noarch
ovn2.13-central-2.13.0-21.el8fdp.x86_64
kernel-kernel-networking-openvswitch-ovn_ha-1.0-55.noarch
[root@hp-dl380pg8-13 bz1827403]# ovn-sbctl lflow-list | wc -l
21443
[root@hp-dl380pg8-13 bz1827403]# ovn-sbctl lflow-list | grep pre_hairpin -c
4702

The same result is seen on the rhel7 version.

Hi Dumitru,

I added comment 8 with the test results on the old version and the fixed version.
Does that mean the issue is fixed?
Is there a better way to verify the issue?

(In reply to Jianlin Shi from comment #9)
> Hi Dumitru,
>
> I added comment 8 with the test results on the old version and the fixed version.
> Does that mean the issue is fixed?
> Is there a better way to verify the issue?

Hi Jianlin,

It should be enough, but I guess Aniket can also confirm that the issue is mitigated on his setup.

Thanks,
Dumitru

So when I scale to 100 nodes, it looks like we now have about half the number of flows, around 160000. I am not sure if we can optimize further, but if we have hit the theoretical minimum then we are OK.

Hi Dumitru,

What do you think about comment 11?

(In reply to Jianlin Shi from comment #12)
> Hi Dumitru,
>
> What do you think about comment 11?

Hi Jianlin,

I think for now it's the best we can do.

Regards,
Dumitru

(In reply to Dumitru Ceara from comment #13)
> (In reply to Jianlin Shi from comment #12)
> > Hi Dumitru,
> >
> > What do you think about comment 11?
>
> Hi Jianlin,
>
> I think for now it's the best we can do.
>
> Regards,
> Dumitru

Got it, set VERIFIED.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2317
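
The flow counts in the verification comment above can be compared directly. The following is only a small arithmetic sketch in Python; the four numbers are copied from that comment, while the dictionary layout and the helper name `reduction` are illustrative choices, not part of any OVN tooling.

```python
# Counts copied from the verification comment (ovn-sbctl lflow-list output).
before = {"total": 25755, "pre_hairpin": 9014}  # ovn2.13.0-18
after = {"total": 21443, "pre_hairpin": 4702}   # ovn2.13.0-21


def reduction(old, new):
    """Return (absolute drop, drop as a percentage of the old value)."""
    drop = old - new
    return drop, 100.0 * drop / old


total_drop, total_pct = reduction(before["total"], after["total"])
hairpin_drop, hairpin_pct = reduction(before["pre_hairpin"], after["pre_hairpin"])

print(f"all flows:         -{total_drop} ({total_pct:.1f}%)")
print(f"pre_hairpin flows: -{hairpin_drop} ({hairpin_pct:.1f}%)")
```

Both drops come out to the same absolute number of flows, so on this test database the whole reduction between the two builds is accounted for by lines matching pre_hairpin.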

Created attachment 1681248
Load balancer dump

Description of problem:
A 100 node OVN cluster is showing 318888 logical flows in sbdb.

Version-Release number of selected component (if applicable):
ovn-20.03.0-2.fc31.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Scale ovn configuration to have 100 nodes
2. Dump sbctl logical flows using "ovn-sbctl lflow-list"

Actual results:
318888 logical flows are seen. A lot of them are in the pre-hairpin table

Expected results:

Additional info:
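
To see which logical flow tables dominate a dump like the one described here, the lines printed by "ovn-sbctl lflow-list" can be grouped by table name. Below is a minimal sketch in Python. It assumes each flow line carries a `table=N (name)` field, which is how lflow-list output typically looks but is not confirmed by this report. The function name `count_flows_per_table` and the sample lines are made up for illustration and are not taken from the attached dumps.

```python
import re
import sys
from collections import Counter

# Assumed flow line shape: "  table=12(ls_in_pre_hairpin), priority=..., match=(...), action=(...)"
TABLE_RE = re.compile(r"table=\d+\s*\(\s*(\w+)\s*\)")


def count_flows_per_table(lines):
    """Count logical flow lines per table name; lines without a table field are skipped."""
    counts = Counter()
    for line in lines:
        m = TABLE_RE.search(line)
        if m:
            counts[m.group(1)] += 1
    return counts


# Illustrative input only, not real output from the reporter's cluster.
SAMPLE_LINES = [
    'Datapath: "sw0" (00000000-0000-0000-0000-000000000000)  Pipeline: ingress',
    "  table=0 (ls_in_port_sec_l2  ), priority=100  , match=(eth.src[40]), action=(drop;)",
    "  table=11(ls_in_pre_hairpin  ), priority=2    , match=(ip4), action=(next;)",
    "  table=11(ls_in_pre_hairpin  ), priority=0    , match=(1), action=(next;)",
]

if __name__ == "__main__":
    # Reads a dump from stdin and prints the ten largest tables.
    for name, n in count_flows_per_table(sys.stdin).most_common(10):
        print(f"{n:8d}  {name}")
```

Saved under a hypothetical name such as lflow_count.py, it could be fed with `ovn-sbctl lflow-list | python3 lflow_count.py`, which would show at a glance whether the pre-hairpin tables account for most of the flows.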