Bug 1958958
| Summary: | [SCALE] segfault with ovnkube adding to address set | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Tim Rozet <trozet> |
| Component: | Networking | Assignee: | Federico Paolinelli <fpaoline> |
| Networking sub component: | ovn-kubernetes | QA Contact: | Mike Fiedler <mifiedle> |
| Status: | CLOSED ERRATA | Docs Contact: | |
| Severity: | high | | |
| Priority: | unspecified | CC: | aconstan, akrzos, anbhat, astoycos, dcbw, fpaoline, mifiedle |
| Version: | 4.8 | | |
| Target Milestone: | --- | | |
| Target Release: | 4.8.0 | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | perfscale-ovn | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2021-07-27 23:07:46 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | | | |
| Bug Blocks: | 1962590 | | |
| Attachments: | | | |
Very likely something is still trying to use the address set after it has been disposed of. Line 292 of address_set.go maps to "as.Lock()".

Created attachment 1781726: ovnkube master logs

*** Bug 1962886 has been marked as a duplicate of this bug. ***

Verified on 4.8.0-0.nightly-2021-06-09-095212. 300 nodes, 15K pods, no segfault in ovnkube-master.

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:2438
Description of problem:

During a scale test with 300 nodes and around 15k pods, we see a segfault when adding to an address set:

I0508 11:33:49.985364 1 pods.go:289] [cluster-density-7d10a4fe-d166-439c-b343-bcb4789a554d-4/deployment-2pod-1-7f7468ffdc-jtlhh] addLogicalPort took 21.669154476s
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x17b47e0]

goroutine 52 [running]:
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn/address_set.(*ovnAddressSets).AddIPs(0x0, 0xc037df90a0, 0x1, 0x1, 0x0, 0x0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/address_set/address_set.go:292 +0x60
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).addPodToNamespace(0xc000843600, 0xc05216c780, 0x36, 0xc024c82480, 0x0, 0x0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/namespace.go:69 +0x111
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).addLogicalPort(0xc000843600, 0xc002635000, 0x0, 0x0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/pods.go:501 +0xeea
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).ensurePod(0xc000843600, 0xc02e6be800, 0xc002635000, 0xc00e657d01, 0x413af0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/ovn.go:507 +0x545
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).WatchPods.func3(0x1bdc4a0, 0xc02e6be800, 0x1bdc4a0, 0xc002635000)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/ovn.go:547 +0x8c
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnUpdate(...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/cache/controller.go:238
k8s.io/client-go/tools/cache.FilteringResourceEventHandler.OnUpdate(0xc03e730700, 0x1e79ce0, 0xc0276526c0, 0x1bdc4a0, 0xc02e6be800, 0x1bdc4a0, 0xc002635000)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/cache/controller.go:273 +0x122
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*Handler).OnUpdate(...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:44
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*informer).newFederatedQueuedHandler.func2.1.1(0xc03070ced0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:221 +0x78
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*informer).forEachQueuedHandler(0xc00004a600, 0xc001331ef8)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:90 +0x1d6
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*informer).newFederatedQueuedHandler.func2.1(0xc04c30cf00)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:220 +0x52
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*informer).processEvents(0xc00004a600, 0xc0001aa360, 0xc000034360)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:157 +0x78
created by github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.newQueuedInformer
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:348 +0x11e
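
For readers unfamiliar with this failure mode: in the trace, the first argument to AddIPs is 0x0, which suggests the method was called on a nil *ovnAddressSets, so the first statement ("as.Lock()") dereferenced address 0x0. The following is a minimal sketch of that mechanism and of one possible guard. It is not the ovn-kubernetes code and not necessarily the fix that was merged; the addressSet type, safeAddIPs, callUnguarded, and errNoAddressSet are hypothetical names used only for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// addressSet is a hypothetical stand-in for the real ovnAddressSets type.
type addressSet struct {
	sync.Mutex
	ips map[string]struct{}
}

// AddIPs mirrors the shape of the crashing call: it takes the lock first.
func (as *addressSet) AddIPs(ips []string) error {
	as.Lock() // if as is nil, this dereferences a nil pointer and panics
	defer as.Unlock()
	for _, ip := range ips {
		as.ips[ip] = struct{}{}
	}
	return nil
}

// errNoAddressSet is a hypothetical sentinel error for the guarded path.
var errNoAddressSet = errors.New("address set is nil (it may already have been disposed of)")

// safeAddIPs checks the pointer before calling the method, so the caller
// gets an error it can handle instead of a process-wide crash.
func safeAddIPs(as *addressSet, ips []string) error {
	if as == nil {
		return errNoAddressSet
	}
	return as.AddIPs(ips)
}

// callUnguarded calls AddIPs directly and converts the resulting panic
// into an error, only so that this demo can keep running.
func callUnguarded(as *addressSet, ips []string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	return as.AddIPs(ips)
}

func main() {
	var disposed *addressSet // nil, as the receiver appears to be in the trace

	fmt.Println(callUnguarded(disposed, []string{"10.128.0.5"}))
	fmt.Println(safeAddIPs(disposed, []string{"10.128.0.5"}))

	live := &addressSet{ips: map[string]struct{}{}}
	fmt.Println(safeAddIPs(live, []string{"10.128.0.5"}), len(live.ips))
}
```

A nil check only turns the crash into an error. If the address set really is being disposed of while a pod add is still in flight, the caller would also need to decide whether to retry, recreate the set, or drop the event.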