Bug 1394190

Summary: "panic: runtime error: invalid memory address or nil pointer dereference" appears in node log sometimes when trying to join network
Product: OpenShift Container Platform
Reporter: Meng Bo <bmeng>
Component: Networking
Assignee: Dan Williams <dcbw>
Status: CLOSED ERRATA
QA Contact: Meng Bo <bmeng>
Severity: high
Docs Contact:
Priority: medium
Version: 3.4.0
CC: aos-bugs, danw, tdawson
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2017-01-18 12:54:42 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Attachments:
Description: node log for the panic error
Flags: none

Description Meng Bo 2016-11-11 11:20:13 UTC
Created attachment 1219706 [details]
node log for the panic error

Description of problem:
When trying to merge pod networks via the oadm pod-network command, a panic sometimes appears in the node log, and the node service is restarted afterwards.

Version-Release number of selected component (if applicable):
openshift v3.4.0.24+52fd77b
kubernetes v1.4.0+776c994
etcd 3.1.0-rc.0
openvswitch 2.4.0

How reproducible:
sometimes

Steps to Reproduce:
1. Set up a multinode environment with the multitenant plugin

2. Create a lot of projects and pods and then delete them
$ for i in `seq 1 100` ; do oc new-project bmengtestpro$i ; oc create -f //raw.githubusercontent.com/openshift-qe/v3-testfiles/master/networking/pod-for-ping.json  -n bmengtestpro$i ; oc delete project bmengtestpro$i ; done

3. Create another two projects with some pods in them

4. Try to merge the two projects via oadm pod-network
# oadm pod-network join-projects project1 --to project2

5. Check the node log 
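
For step 5, a minimal sketch of one way to pull the panic and its traceback out of the node log. The helper name find_panics and the PANIC_CONTEXT variable are hypothetical and not part of this report; the match string is taken from the bug summary.

```shell
# Hypothetical helper: print every Go runtime panic line together with the
# lines that follow it (the traceback). Reads the files given as arguments,
# or stdin when called without arguments.
find_panics() {
    # PANIC_CONTEXT (assumed name) sets how many trailing lines to show.
    cat "$@" | grep -A "${PANIC_CONTEXT:-20}" 'panic: runtime error'
}
```

Usage, assuming the node service runs under systemd as atomic-openshift-node (the unit name may differ per installation, for example origin-node):

    # journalctl -u atomic-openshift-node --no-pager | find_panics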

Actual results:
5. A panic and traceback show up in the node log, and the projects cannot be merged.

Expected results:
The node should not panic, and merging the project networks should work.
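
A rough sketch of how the expected result could be checked, assuming the multitenant plugin assigns joined projects the same NetID in their NetNamespace objects. The function name projects_share_netid is hypothetical; the project names are the ones from step 4.

```shell
# Hypothetical check: succeed only when both projects report the same NetID.
# Assumes the NetNamespace field is exposed as .netid (the NETID column of
# "oc get netnamespaces").
projects_share_netid() {
    netid_a=$(oc get netnamespace "$1" -o jsonpath='{.netid}') || return 2
    netid_b=$(oc get netnamespace "$2" -o jsonpath='{.netid}') || return 2
    [ -n "$netid_a" ] && [ "$netid_a" = "$netid_b" ]
}
```

Usage after running the join command:

    # projects_share_netid project1 project2 && echo "networks merged"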

Additional info:
I have never hit this error when testing with openvswitch 2.5.0, but I am not sure that this is an ovs 2.4.0-only issue. I can only say that it reproduces at a high rate with ovs 2.4.0.

Full trace log attached.

Comment 2 Dan Winship 2016-11-11 16:10:54 UTC
Looks like the same panic as bug 138912, should also be fixed now. (But leaving this bug separate to make sure both failure modes get tested.)

Comment 4 Dan Williams 2016-11-11 17:33:57 UTC
Winship really means https://github.com/openshift/origin/pull/11852

Comment 5 Troy Dawson 2016-11-11 19:52:15 UTC
This has been merged into ocp and is in OCP v3.4.0.25 or newer.

Comment 8 Meng Bo 2016-11-15 11:00:11 UTC
Tested on OCP v3.4.0.26 with openvswitch 2.4

Cannot re-create the issue.

Marking the bug as verified.

Comment 10 errata-xmlrpc 2017-01-18 12:54:42 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:0066