Bug 1909911 - [OVN]EgressFirewall caused a segfault
Summary: [OVN]EgressFirewall caused a segfault
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Jacob Tanenbaum
QA Contact: huirwang
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2020-12-22 02:22 UTC by huirwang
Modified: 2021-02-24 15:48 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 15:47:55 UTC
Target Upstream Version:
Embargoed:


Links
System | ID | Private | Priority | Status | Summary | Last Updated
Github | openshift ovn-kubernetes pull 398 | 0 | None | closed | Bug 1909911: Fix egressFirewall segfault caused by restarting | 2021-01-25 15:43:00 UTC
Github | ovn-org ovn-kubernetes pull 1936 | 0 | None | closed | Fix egressFirewall segfault caused by restarting | 2021-01-25 15:43:00 UTC
Red Hat Product Errata | RHSA-2020:5633 | 0 | None | None | None | 2021-02-24 15:48:11 UTC

Description huirwang 2020-12-22 02:22:39 UTC
Description of problem:
While doing EgressFirewall testing in a shared cluster, I found a segfault in the ovnkube-master pods.

Version-Release number of selected component (if applicable):
4.7.0-0.nightly-2020-12-20-031835 

How reproducible:
Not sure

Steps to Reproduce:
The issue was found in a shared QE cluster, and it is not clear which step caused it. I tried several ways to reproduce it, but general EgressFirewall testing does not trigger it.

oc get pods -n openshift-ovn-kubernetes
NAME                   READY   STATUS             RESTARTS   AGE
ovnkube-master-jp2vh   5/6     CrashLoopBackOff   89         21h
ovnkube-master-wtxp2   6/6     Running            88         21h
ovnkube-master-z6ns5   5/6     Error              86         21h
ovnkube-node-9ksg2     3/3     Running            0          21h
ovnkube-node-d99qj     3/3     Running            0          21h
ovnkube-node-h7mc8     3/3     Running            0          21h
ovnkube-node-mfr7v     3/3     Running            0          21h
ovnkube-node-nfsgr     3/3     Running            0          21h
ovnkube-node-qcww2     3/3     Running            0          21h
ovnkube-node-rf9gw     3/3     Running            0          21h
ovs-node-4pbht         1/1     Running            0          21h
ovs-node-4xzrs         1/1     Running            0          21h
ovs-node-7wwqv         1/1     Running            0          21h
ovs-node-bpclq         1/1     Running            0          21h
ovs-node-frrh7         1/1     Running            0          21h
ovs-node-shn55         1/1     Running            0          21h
ovs-node-zk2hh         1/1     Running            0          21h

oc logs ovnkube-master-jp2vh -n openshift-ovn-kubernetes -c ovnkube-master

....


I1222 00:30:55.320860       1 ovn.go:778] Adding CRD clustercsidrivers.operator.openshift.io to cluster
I1222 00:30:55.320863       1 ovn.go:778] Adding CRD clusterserviceversions.operators.coreos.com to cluster
I1222 00:30:55.320867       1 ovn.go:778] Adding CRD authentications.operator.openshift.io to cluster
I1222 00:30:55.320871       1 ovn.go:778] Adding CRD egressfirewalls.k8s.ovn.org to cluster
I1222 00:30:55.321819       1 reflector.go:219] Starting reflector *v1.EgressFirewall (0s) from github.com/openshift/ovn-kubernetes/go-controller/pkg/crd/egressfirewall/v1/apis/informers/externalversions/factory.go:117
I1222 00:30:55.321841       1 reflector.go:255] Listing and watching *v1.EgressFirewall from github.com/openshift/ovn-kubernetes/go-controller/pkg/crd/egressfirewall/v1/apis/informers/externalversions/factory.go:117
I1222 00:30:55.421198       1 shared_informer.go:270] caches populated
I1222 00:30:55.421276       1 egressfirewall.go:77] Adding egressFirewall default in namespace test
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1761821]

goroutine 244 [running]:
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*EgressDNS).Add(0x0, 0xc001e91ef0, 0x4, 0xc001e91f10, 0xc, 0x0, 0x0, 0x0, 0x0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/egressfirewall_dns.go:61 +0x61
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).addLogicalRouterPolicyToClusterRouter(0xc0018d0000, 0xc001ee0920, 0x14, 0x0, 0x0, 0xc001e91ef0, 0x4, 0x270f, 0x0, 0x2)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/egressfirewall.go:243 +0x32a
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).addEgressFirewall(0xc0018d0000, 0xc0027d7400, 0x0, 0x0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/egressfirewall.go:128 +0x5bc
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).WatchEgressFirewall.func1(0x1b24ea0, 0xc001998500)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/ovn.go:815 +0x7e
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/cache/controller.go:231
k8s.io/client-go/tools/cache.FilteringResourceEventHandler.OnAdd(0xc0028fd900, 0x1dc0380, 0xc001cadcc0, 0x1b24ea0, 0xc001998500)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/cache/controller.go:264 +0x6a
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*Handler).OnAdd(...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:38
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.newInformer.func1(0xc001cb24e0, 0xc001cadd00, 0x2, 0x2)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:332 +0x93
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*informer).addHandler(0xc0021d8900, 0x8, 0xc0028fd900, 0x1dc0380, 0xc001cadcc0, 0xc001cadd00, 0x2, 0x2, 0xc001cadcc0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:122 +0xb4
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*WatchFactory).addHandler(0xc00064f810, 0x1deb980, 0x1b24ea0, 0x0, 0x0, 0x0, 0x0, 0x1dc0380, 0xc001cadcc0, 0x0, ...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/factory.go:373 +0x35b
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*WatchFactory).AddEgressFirewallHandler(...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/factory.go:434
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).WatchEgressFirewall(0xc0018d0000, 0x0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/ovn.go:812 +0x165
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).WatchCRD.func1(0x1b457a0, 0xc00100bf40)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/ovn.go:785 +0x285
k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/cache/controller.go:231
k8s.io/client-go/tools/cache.FilteringResourceEventHandler.OnAdd(0xc001ba4d40, 0x1dc0380, 0xc001bc9300, 0x1b457a0, 0xc00100bf40)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/cache/controller.go:264 +0x6a
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*Handler).OnAdd(...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:38
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.newInformer.func1(0xc001bce930, 0xc001326800, 0x61, 0x80)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:332 +0x93
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*informer).addHandler(0xc000452780, 0x7, 0xc001ba4d40, 0x1dc0380, 0xc001bc9300, 0xc001326800, 0x61, 0x80, 0xc001bc9300)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/handler.go:122 +0xb4
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*WatchFactory).addHandler(0xc00064f810, 0x1deb980, 0x1b457a0, 0x0, 0x0, 0x0, 0x0, 0x1dc0380, 0xc001bc9300, 0x0, ...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/factory.go:373 +0x35b
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/factory.(*WatchFactory).AddCRDHandler(...)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/factory/factory.go:444
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).WatchCRD(0xc0018d0000)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/ovn.go:775 +0x135
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).Run(0xc0018d0000, 0xc0005c2030, 0x1b, 0x0)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/ovn.go:355 +0x1f9
github.com/ovn-org/ovn-kubernetes/go-controller/pkg/ovn.(*Controller).Start.func1(0x1dc3f00, 0xc001b00980)
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/pkg/ovn/master.go:95 +0x192
created by k8s.io/client-go/tools/leaderelection.(*LeaderElector).Run
	/go/src/github.com/openshift/ovn-kubernetes/go-controller/vendor/k8s.io/client-go/tools/leaderelection/leaderelection.go:207 +0x113
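
Reading the top frame of the trace: the first argument printed for (*EgressDNS).Add is 0x0 and the fault address is addr=0x0, which suggests the method was called on a nil *EgressDNS receiver. Below is a minimal Go sketch of that failure mode. The dnsTracker type and the safeAdd and callUnguarded helpers are hypothetical names made up for illustration; they are not taken from ovn-kubernetes, and this is not a description of the change made in the linked pull requests.

```go
package main

import (
	"errors"
	"fmt"
)

// dnsTracker is a hypothetical stand-in for a type like EgressDNS.
// It is NOT the real ovn-kubernetes struct.
type dnsTracker struct {
	names map[string][]string // namespace -> DNS names seen in rules
}

// newDNSTracker returns a tracker with its map initialized.
func newDNSTracker() *dnsTracker {
	return &dnsTracker{names: make(map[string][]string)}
}

// Add reads a field through the receiver, so calling it on a nil
// *dnsTracker panics with a nil pointer dereference.
func (d *dnsTracker) Add(namespace, dnsName string) {
	d.names[namespace] = append(d.names[namespace], dnsName)
}

var errNoTracker = errors.New("dns tracker not initialized")

// safeAdd is a hypothetical guard: it returns an error instead of
// calling Add on a nil tracker.
func safeAdd(d *dnsTracker, namespace, dnsName string) error {
	if d == nil {
		return errNoTracker
	}
	d.Add(namespace, dnsName)
	return nil
}

// callUnguarded calls Add directly and converts any panic into an
// error so the demonstration can keep running.
func callUnguarded(d *dnsTracker, namespace, dnsName string) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered: %v", r)
		}
	}()
	d.Add(namespace, dnsName)
	return nil
}

func main() {
	var uninitialized *dnsTracker // nil, like the 0x0 receiver in the trace

	fmt.Println(callUnguarded(uninitialized, "test", "example.com"))
	fmt.Println(safeAdd(uninitialized, "test", "example.com"))
	fmt.Println(safeAdd(newDNSTracker(), "test", "example.com"))
}
```

A guard like safeAdd only avoids the crash. Whether the right remedy is a guard or making sure the object is constructed before the EgressFirewall add handler runs depends on the controller's startup order, which this report does not show.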

Actual results:
Segfault in ovnkube-master pods

Expected results:
No segfault in ovnkube-master pods

Additional info:

After this segfault was found, deleting the EgressFirewall and recreating it in the same namespace did not reproduce the issue.

Comment 4 Jacob Tanenbaum 2021-01-04 19:15:40 UTC
Upstream fix: https://github.com/ovn-org/ovn-kubernetes/pull/1936

Comment 9 errata-xmlrpc 2021-02-24 15:47:55 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2020:5633

