Bug 1960042
| Summary: | [scale] northd at 100% and taking > 30sec to process changes | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Joe Talerico <jtaleric> |
| Component: | Networking | Assignee: | Tim Rozet <trozet> |
| Networking sub component: | ovn-kubernetes | QA Contact: | Anurag saxena <anusaxen> |
| Status: | CLOSED DUPLICATE | CC: | aconstan, astoycos, dcbw, dceara, vpickard |
| Severity: | high | Priority: | unspecified |
| Version: | 4.8 | Target Release: | 4.9.0 |
| Hardware: | All | OS: | All |
| Whiteboard: | perfscale-ovn | Type: | Bug |
| Last Closed: | 2021-10-05 17:25:28 UTC | Bug Depends On: | 1962338, 1962818, 1962833 |
Created attachment 1784815 [details]
OVN NBDB where we observe long (30s) poll intervals
After initial analysis, a significant share of each poll interval is
spent building router load balancer logical flows.
1. get_router_load_balancer_ips() is called far too often. We can
precompute those IPs once per iteration instead of calling it for every
logical router port. An initial test shows that this change reduces
the loop iteration time from ~22s to ~19s.
2. ovn_lflow_add_at() always builds a logical flow record, even though
the record is discarded if the logical flow is aggregated on a datapath
group. We can instead try to delay the creation of new flow records
until they are really necessary. This saves a decent number of
allocations and memory copies. An initial test shows that this change
reduces the loop iteration time further to ~15s.
3. We can try to change the way load balancer flows are built.
Currently for X routers (or switches) with Y load balancers applied to
them we do:
- for every router:
  - for every load balancer:
    - parse and generate lots of common lb stuff (e.g., VIPs, backends)
    - generate one logical flow per VIP.
I think we can save quite a lot of CPU by changing this to:
- for every load balancer:
  - parse and generate lots of common lb stuff (e.g., VIPs, backends)
  - for every router:
    - generate one logical flow per VIP.
4. I also enabled northd parallelization, which further reduced the
loop iteration times to ~8s at the cost of northd consuming up to
900% CPU. However, northd parallelization is a new feature; it needs
further testing and needs to be enabled in CI.
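To make item 2 concrete, here is a minimal Python model of the idea, not the actual northd C code. The `FlowTable` class, its method names, and the flow key layout are all hypothetical; the only thing taken from the analysis above is the pattern: look up an existing flow first and build a new record only on a miss, instead of building a record that is thrown away whenever the flow already exists.

```python
class FlowTable:
    """Toy model of a logical flow table with datapath aggregation."""

    def __init__(self):
        self.flows = {}       # flow key -> record
        self.allocations = 0  # how many records were ever built

    def _new_record(self, key):
        # Stand-in for the allocation and string copying done per record.
        self.allocations += 1
        stage, priority, match, actions = key
        return {"stage": stage, "priority": priority,
                "match": match, "actions": actions,
                "datapaths": set()}

    def add_eager(self, datapath, key):
        """Build the record first, then discover it may not be needed."""
        record = self._new_record(key)
        existing = self.flows.get(key)
        if existing is not None:
            existing["datapaths"].add(datapath)  # `record` is discarded
            return
        record["datapaths"].add(datapath)
        self.flows[key] = record

    def add_lazy(self, datapath, key):
        """Look up first; build a record only when the flow is new."""
        existing = self.flows.get(key)
        if existing is None:
            existing = self.flows[key] = self._new_record(key)
        existing["datapaths"].add(datapath)


# Hypothetical workload: 100 datapaths that all share the same 3 flows.
keys = [("stage_a", 100, f"match-{i}", "actions") for i in range(3)]
eager, lazy = FlowTable(), FlowTable()
for dp in range(100):
    for key in keys:
        eager.add_eager(f"dp-{dp}", key)
        lazy.add_lazy(f"dp-{dp}", key)

print(eager.allocations, lazy.allocations)
```

Both tables end up with identical contents. In this toy workload the eager variant builds one record per (datapath, flow) pair, while the lazy variant builds one per distinct flow.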
I'll open OVN BZs for all items above so we can track the work
independently.
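The loop reordering in item 3 can be sketched as follows. This is an illustrative Python model, not OVN code: `parse_lb`, the data layout, and the sizes are hypothetical, and it assumes every load balancer is applied to every router. Item 1 follows the same pattern of hoisting work that does not depend on the inner loop variable.

```python
def parse_lb(lb, stats):
    """Stand-in for the expensive, router-independent LB parsing."""
    stats["parses"] += 1
    return [(vip, tuple(backends))
            for vip, backends in sorted(lb["vips"].items())]


def router_major(routers, lbs):
    """Current shape: parse every load balancer again for every router."""
    stats = {"parses": 0}
    flows = set()
    for router in routers:
        for lb in lbs:
            for vip, backends in parse_lb(lb, stats):
                flows.add((router, vip, backends))
    return flows, stats["parses"]


def lb_major(routers, lbs):
    """Proposed shape: parse each load balancer once, then fan out."""
    stats = {"parses": 0}
    flows = set()
    for lb in lbs:
        parsed = parse_lb(lb, stats)  # hoisted out of the router loop
        for router in routers:
            for vip, backends in parsed:
                flows.add((router, vip, backends))
    return flows, stats["parses"]


# Hypothetical sizes: X = 300 routers, Y = 50 load balancers, one VIP each.
routers = [f"router-{i}" for i in range(300)]
lbs = [{"vips": {f"172.30.0.{i}:80": [f"10.128.0.{i}:8080"]}}
       for i in range(50)]

flows_a, parses_a = router_major(routers, lbs)
flows_b, parses_b = lb_major(routers, lbs)
print(parses_a, parses_b)
```

The generated flows are the same in both orders. Only the number of parse calls changes, from X * Y down to Y.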
All linked bugs have been addressed in ovn21.09-21.09.0-9.el8fdp which is part of the 4.9.0 release, via https://bugzilla.redhat.com/show_bug.cgi?id=1999852. Going to dupe this bug to https://bugzilla.redhat.com/show_bug.cgi?id=1999852
*** This bug has been marked as a duplicate of bug 1999852 ***
Created attachment 1782550 [details]
must-gather network

Description of problem:
OCP4.8 w/ OVNKubernetes as the SDN. Scaled to 300 nodes, we are seeing ovn-northd consume an entire core:

1 root 20 0 1521272 1.4g 7420 R 98.7 1.1 370:31.04 ovn-northd

TimR also noted that it is taking 30+ seconds for northd to process changes.

Version-Release number of selected component (if applicable):
4.8

How reproducible:
100%

Steps to Reproduce:
1. Deploy OCP4.8
2. Scale to 300 nodes
3. Run clusterdensity 2k

Actual results: