Bug 1986946
| Summary: | High ICNI2 application pod creation times | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Jose Castillo Lema <jlema> |
| Component: | Networking | Assignee: | Surya Seetharaman <surya> |
| Networking sub component: | ovn-kubernetes | QA Contact: | Yurii Prokulevych <yprokule> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | high | CC: | dblack, smalleni, surya, trozet, yprokule |
| Version: | 4.7 | Keywords: | FastFix |
| Target Milestone: | --- | Flags: | jlema: needinfo- |
| Target Release: | 4.9.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1996739 (view as bug list) | Environment: | |
| Last Closed: | 2021-10-18 17:42:58 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1996739 | | |
Description
Jose Castillo Lema
2021-07-28 15:05:31 UTC
Lots of N/S leader elections during the 120 BM node / 25 pods/worker test.

(In reply to Jose Castillo Lema from comment #0)

> - Lots of nbctl commands run in the ovnkube-masters during the test:
> 2021-07-28T11:15:07.403Z|921664|nbctl|INFO|Running command run --may-exist --bfd --policy=src-ip --ecmp-symmetric-reply -- lr-route-add GR_worker064-fc640 10.129.47.201/32 192.168.217.71 rtoe-GR_worker064-fc640
> 2021-07-28T11:15:07.422Z|921665|nbctl|INFO|Running command run -- lr-policy-add ovn_cluster_router 501 "inport == \"rtos-worker064-fc640\" && ip4.src == 10.129.47.201 && ip4.dst != 10.128.0.0/14" reroute 100.64.0.64

AFAICS, these are just the nbctl daemon logs; since we pass "--may-exist", they don't actually do anything. The policies are created only once, when the pod gets created, and the rest of the time these commands are no-ops, so they should not affect the perf. On the other hand, I am working on PRs to reduce the number of lflows and openflows created by the routes and policies. Hopefully this should have some effect at least.

> - nbdb and sbdb show a considerably higher memory usage

The priority-501 LRP with the "ip4.dst != ..." match can cause a lot of flow creations. For the ~2800 pods we were using, it would create 2800 lflows and 2800*21 = ~60,000 openflows during pod creation. With the fixes, we are hoping to bring this down to ~120 lflows (equal to the number of nodes in the cluster) and ~5,000 openflows in total.

We are seeing multiple ecmp/policy adds happening for a single exgw pod creation. This is because we are calling addPodExternalGW from both addLogicalPort and ensurePod. This means we do multiple adds for each pod, and that can really affect the pod latency.

```go
// ensurePod tries to set up a pod. It returns success or failure; failure
// indicates the pod should be retried later.
func (oc *Controller) ensurePod(oldPod, pod *kapi.Pod, addPort bool) bool {
	// Try unscheduled pods later
	if !util.PodScheduled(pod) {
		return false
	}

	if oldPod != nil && (exGatewayAnnotationsChanged(oldPod, pod) || networkStatusAnnotationsChanged(oldPod, pod)) {
		// No matter if a pod is ovn networked, or host networked, we still need to check for exgw
		// annotations. If the pod is ovn networked and is in update reschedule, addLogicalPort will take
		// care of updating the exgw updates
		oc.deletePodExternalGW(oldPod)
	}

	if util.PodWantsNetwork(pod) && addPort {
		if err := oc.addLogicalPort(pod); err != nil {
			klog.Errorf(err.Error())
			oc.recordPodEvent(err, pod)
			return false
		}
	} else {
		// *** this code is wrong. We see this code getting called for every update.
		if err := oc.addPodExternalGW(pod); err != nil {
			klog.Errorf(err.Error())
			oc.recordPodEvent(err, pod)
			return false
		}
	}
	return true
}
```

We see this happening three times per pod creation.
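A minimal sketch of one way to de-duplicate those repeated adds, assuming a hypothetical per-pod cache (the exgwCache type and addPodExternalGWOnce helper here are illustrative names, not the fix that was actually merged): the handler only touches OVN when the pod's gateway has not been programmed yet or has changed, so repeated ensurePod/addLogicalPort invocations for the same pod become no-ops.

```go
package main

import "fmt"

// exgwCache is a hypothetical de-duplication cache keyed by "namespace/name".
// It records the gateway IP already programmed for each exgw pod so repeated
// handler calls skip the lr-route-add / lr-policy-add work.
type exgwCache map[string]string

// addPodExternalGWOnce simulates the OVN programming, but only when the pod's
// gateway is new or has changed since the last call.
func (c exgwCache) addPodExternalGWOnce(key, gwIP string) {
	if prev, ok := c[key]; ok && prev == gwIP {
		// Already programmed with the same gateway: nothing to do.
		return
	}
	fmt.Printf("programming ecmp route + 501 policy for %s via %s\n", key, gwIP)
	c[key] = gwIP
}

func main() {
	cache := exgwCache{}
	// Simulate the three calls observed per pod creation; only the first
	// one reaches OVN.
	for i := 0; i < 3; i++ {
		cache.addPodExternalGWOnce("served-ns-9/pod-serving-9-2-serving-job", "192.168.218.9")
	}
}
```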
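Separately, for the lflow/openflow reduction mentioned above (going from one priority-501 policy per pod down to roughly one per node), here is a rough Go sketch of one shape such a change could take: a single per-node reroute policy whose source match references an OVN address set of pod IPs, so pod churn only updates the address set. The function names, the address-set naming, and the address-set approach itself are assumptions for illustration, not a description of the referenced PRs.

```go
package main

import (
	"fmt"
	"strings"
)

// perPodMatch is the current behaviour: one priority-501 logical router
// policy (and its lflows) per exgw-served pod.
func perPodMatch(node, podIP, clusterSubnet string) string {
	return fmt.Sprintf(`inport == "rtos-%s" && ip4.src == %s && ip4.dst != %s`, node, podIP, clusterSubnet)
}

// perNodeMatch is the sketched alternative: one priority-501 policy per node
// whose source match references an address set ($name) holding the pod IPs.
func perNodeMatch(node, addressSet, clusterSubnet string) string {
	return fmt.Sprintf(`inport == "rtos-%s" && ip4.src == $%s && ip4.dst != %s`, node, addressSet, clusterSubnet)
}

func main() {
	node := "worker080-r640"
	clusterSubnet := "10.128.0.0/14"
	podIPs := []string{"10.130.38.10", "10.130.38.11", "10.130.38.12"}

	// Today: one policy per pod -> O(pods) lflows.
	for _, ip := range podIPs {
		fmt.Println("per-pod:  ", perPodMatch(node, ip, clusterSubnet))
	}

	// Sketch: one policy per node -> O(nodes) lflows; adding a pod only
	// changes the address set contents.
	fmt.Println("per-node: ", perNodeMatch(node, "exgw_src_"+strings.ReplaceAll(node, "-", "_"), clusterSubnet))
}
```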
Example from the ovnkube-master logs, showing the route and policy being added three times for the same exgw pod:

```
I0823 21:15:24.588731 1 egressgw.go:38] External gateway pod: pod-serving-9-2-serving-job, detected for namespace(s) served-ns-9
2021-08-23T21:15:24.591Z|33431|nbctl|INFO|Running command run --may-exist --bfd --policy=src-ip --ecmp-symmetric-reply -- lr-route-add GR_worker080-r640 10.130.38.10/32 192.168.218.9 rtoe-GR_worker080-r640
2021-08-23T21:15:24.597Z|33433|nbctl|INFO|Running command run -- lr-policy-add ovn_cluster_router 501 "inport == \"rtos-worker080-r640\" && ip4.src == 10.130.38.10 && ip4.dst != 10.128.0.0/14" reroute 100.64.0.80
I0823 21:15:24.616615 1 egressgw.go:38] External gateway pod: pod-serving-9-2-serving-job, detected for namespace(s) served-ns-9
2021-08-23T21:15:24.619Z|33434|nbctl|INFO|Running command run --may-exist --bfd --policy=src-ip --ecmp-symmetric-reply -- lr-route-add GR_worker080-r640 10.130.38.10/32 192.168.218.9 rtoe-GR_worker080-r640
2021-08-23T21:15:24.624Z|33435|nbctl|INFO|Running command run -- lr-policy-add ovn_cluster_router 501 "inport == \"rtos-worker080-r640\" && ip4.src == 10.130.38.10 && ip4.dst != 10.128.0.0/14" reroute 100.64.0.80
I0823 21:15:24.624864 1 egressgw.go:38] External gateway pod: pod-serving-9-2-serving-job, detected for namespace(s) served-ns-9
2021-08-23T21:15:24.626Z|33436|nbctl|INFO|Running command run --may-exist --bfd --policy=src-ip --ecmp-symmetric-reply -- lr-route-add GR_worker080-r640 10.130.38.10/32 192.168.218.9 rtoe-GR_worker080-r640
2021-08-23T21:15:24.631Z|33437|nbctl|INFO|Running command run -- lr-policy-add ovn_cluster_router 501 "inport == \"rtos-worker080-r640\" && ip4.src == 10.130.38.10 && ip4.dst != 10.128.0.0/14" reroute 100.64.0.80
```

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.9.0 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:3759