Description of problem:

During a scale test with 300 nodes and around 15k pods, we see a segfault when adding to an address set:

I0508 11:33:49.985364       1 pods.go:289] [cluster-density-7d10a4fe-d166-439c-b343-bcb4789a554d-4/deployment-2pod-1-7f7468ffdc-jtlhh] addLogicalPort took 21.669154476s
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x17b47e0]

goroutine 52 [running]:
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn/address_set.(*ovnAddressSets).AddIPs(0x0, 0xc037df90a0, 0x1, 0x1, 0x0, 0x0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/address_set/address_set.go:292 +0x60
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).addPodToNamespace(0xc000843600, 0xc05216c780, 0x36, 0xc024c82480, 0x0, 0x0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/namespace.go:69 +0x111
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).addLogicalPort(0xc000843600, 0xc002635000, 0x0, 0x0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/pods.go:501 +0xeea
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).ensurePod(0xc000843600, 0xc02e6be800, 0xc002635000, 0xc00e657d01, 0x413af0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/ovn.go:507 +0x545
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).WatchPods.func3(0x1bdc4a0, 0xc02e6be800, 0x1bdc4a0, 0xc002635000)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/ovn.go:547 +0x8c
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnUpdate(...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/cache/controller.go:238
k8s.io/client-go/tools/cache.FilteringResourceEventHandler.OnUpdate(0xc03e730700, 0x1e79ce0, 0xc0276526c0, 0x1bdc4a0, 0xc02e6be800, 0x1bdc4a0, 0xc002635000)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/cache/controller.go:273 +0x122
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*Handler).OnUpdate(...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:44
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*informer).newFederatedQueuedHandler.func2.1.1(0xc03070ced0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:221 +0x78
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*informer).forEachQueuedHandler(0xc00004a600, 0xc001331ef8)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:90 +0x1d6
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*informer).newFederatedQueuedHandler.func2.1(0xc04c30cf00)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:220 +0x52
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*informer).processEvents(0xc00004a600, 0xc0001aa360, 0xc000034360)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:157 +0x78
created by github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.newQueuedInformer
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:348 +0x11e
Very likely something is still trying to use the address set after it has been disposed of. Line 292 maps to "as.Lock()", so AddIPs is being called on a nil *ovnAddressSets.
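To make that failure mode concrete, here is a minimal, self-contained Go sketch (not the actual ovn-kubernetes source; type and field names are illustrative) of how a nil *ovnAddressSets receiver makes the first statement of AddIPs, as.Lock(), the nil-pointer dereference, and how a caller-side nil check would sidestep the crash:

// nil_addset_sketch.go: illustrative only, not the ovn-kubernetes code.
// A racing namespace teardown leaves the caller with a nil *ovnAddressSets;
// calling AddIPs on it dereferences nil at as.Lock() and the process SIGSEGVs.
package main

import (
	"fmt"
	"net"
	"sync"
)

// ovnAddressSets stands in for the real type; fields are illustrative.
type ovnAddressSets struct {
	sync.RWMutex
	ips map[string]net.IP
}

func newOvnAddressSets() *ovnAddressSets {
	return &ovnAddressSets{ips: make(map[string]net.IP)}
}

// AddIPs mirrors the method in the panic trace: with a nil receiver,
// as.Lock() (address_set.go:292 in the report) is the first dereference.
func (as *ovnAddressSets) AddIPs(ips []net.IP) error {
	as.Lock() // panics with SIGSEGV when as == nil
	defer as.Unlock()
	for _, ip := range ips {
		as.ips[ip.String()] = ip
	}
	return nil
}

func main() {
	var as *ovnAddressSets // simulates an address set already destroyed by namespace cleanup

	// A guard like this (or holding the namespace lock across the whole
	// add-pod path) avoids the crash; the real fix lives in ovn-kubernetes.
	if as == nil {
		fmt.Println("address set already deleted; skipping AddIPs")
		return
	}
	_ = as.AddIPs([]net.IP{net.ParseIP("10.0.0.1")})
}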
Created attachment 1781726 [details] ovnkube master logs
*** Bug 1962886 has been marked as a duplicate of this bug. ***
Verified on 4.8.0-0.nightly-2021-06-09-095212: 300 nodes, 15K pods, no segfault in ovnkube-master.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:2438