Bug 1298942

Summary: atomic-openshift-node crash
Product: OpenShift Container Platform
Reporter: Jeremy Eder <jeder>
Component: Networking
Assignee: Dan Winship <danw>
Status: CLOSED ERRATA
QA Contact: Meng Bo <bmeng>
Severity: high
Docs Contact:
Priority: unspecified
Version: 3.1.0
CC: agoldste, aos-bugs, danw, eparis, jamills, jeder, jokerman, mmccomas, rpenta, tdawson, tstclair, uobergfe
Target Milestone: ---
Target Release: ---
Hardware: All
OS: Linux
Whiteboard:
Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2016-05-12 16:26:43 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:

Comment 5 Andy Goldstein 2016-01-15 16:17:51 UTC
Relevant log entry:

Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: panic: runtime error: invalid memory address or nil pointer dereference
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: [signal 0xb code=0x1 addr=0x0 pc=0xa6c350]
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: goroutine 244 [running]:
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: runtime.gopanic(0x26a3080, 0xc208028050)
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /usr/lib/golang/src/runtime/panic.go:425 +0x2a3 fp=0xc2085dd908 sp=0xc2085dd8a0
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: runtime.panicmem()
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /usr/lib/golang/src/runtime/panic.go:42 +0x4e fp=0xc2085dd930 sp=0xc2085dd908
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: runtime.sigpanic()
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /usr/lib/golang/src/runtime/sigpanic_unix.go:26 +0x274 fp=0xc2085dd980 sp=0xc2085dd930
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: net.networkNumberAndMask(0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0)
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /usr/lib/golang/src/net/ip.go:436 +0x200 fp=0xc2085dd9b8 sp=0xc2085dd980
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: net.(*IPNet).Contains(0x0, 0xc208577bb0, 0x10, 0x10, 0x10)
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /usr/lib/golang/src/net/ip.go:460 +0x28 fp=0xc2085dda28 sp=0xc2085dd9b8
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: github.com/openshift/openshift-sdn/plugins/osdn.(*Registry).OnEndpointsUpdate(0xc208978000, 0xc208b662a0, 0x1, 0x1)
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /builddir/build/BUILD/atomic-openshift-git-0.dba03a7/_thirdpartyhacks/src/github.com/openshift/openshift-sdn/plugins/osdn/registry.go:626 +0x2c8 fp=0xc2085ddea0 sp=0xc2085dda28
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: k8s.io/kubernetes/pkg/proxy/config.func·003(0x1fd9e40, 0xc208ccb2a0)
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /builddir/build/BUILD/atomic-openshift-git-0.dba03a7/_thirdpartyhacks/src/k8s.io/kubernetes/pkg/proxy/config/config.go:96 +0xd2 fp=0xc2085ddef0 sp=0xc2085ddea0
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: k8s.io/kubernetes/pkg/util/config.ListenerFunc.OnUpdate(0xc208d31b30, 0x1fd9e40, 0xc208ccb2a0)
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /builddir/build/BUILD/atomic-openshift-git-0.dba03a7/_thirdpartyhacks/src/k8s.io/kubernetes/pkg/util/config/config.go:110 +0x37 fp=0xc2085ddf08 sp=0xc2085ddef0
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: k8s.io/kubernetes/pkg/util/config.(*Broadcaster).Notify(0xc2084256e0, 0x1fd9e40, 0xc208ccb2a0)
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /builddir/build/BUILD/atomic-openshift-git-0.dba03a7/_thirdpartyhacks/src/k8s.io/kubernetes/pkg/util/config/config.go:138 +0xfb fp=0xc2085ddf90 sp=0xc2085ddf08
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: k8s.io/kubernetes/pkg/proxy/config.watchForUpdates(0xc2084256e0, 0x7ff9e9884750, 0xc208425650, 0xc208d85140)
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /builddir/build/BUILD/atomic-openshift-git-0.dba03a7/_thirdpartyhacks/src/k8s.io/kubernetes/pkg/proxy/config/config.go:277 +0x7d fp=0xc2085ddfc0 sp=0xc2085ddf90
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: runtime.goexit()
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /usr/lib/golang/src/runtime/asm_amd64.s:2232 +0x1 fp=0xc2085ddfc8 sp=0xc2085ddfc0
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: created by k8s.io/kubernetes/pkg/proxy/config.NewEndpointsConfig
Jan 14 07:53:13 dell-r730-02 atomic-openshift-node: /builddir/build/BUILD/atomic-openshift-git-0.dba03a7/_thirdpartyhacks/src/k8s.io/kubernetes/pkg/proxy/config/config.go:89 +0x254

Comment 7 Andy Goldstein 2016-01-15 18:35:57 UTC
Need a bump commit for origin; moving back to assigned.

Comment 11 Eric Paris 2016-01-16 00:23:45 UTC
https://github.com/openshift/origin/pull/6684

Comment 12 Eric Paris 2016-01-16 02:17:13 UTC
I am marking this UpcomingRelease. This is a transient failure at node startup, which should recover when systemd restarts the node a second time. We recognize the need to fix this problem, but since we are already 3 days past the code due date and this does not appear to be a new problem, I am not going to make it a blocker for the 3.1.1 release.

We thought we had a reasonable path forward, but debugging showed that the order in which the system comes up makes fixing the issue much more complex than anticipated.

Comment 13 Jeremy Eder 2016-01-18 13:05:42 UTC
Thanks, I have built OSE test packages with the commit below, and will let you know if it happens again.

https://github.com/danwinship/ose/commit/bde7b29ffa9c0782eb0754654702857ace2158c7

Comment 14 Dan Winship 2016-01-28 16:35:39 UTC
This is fixed in git master (by https://github.com/openshift/origin/pull/6684)

Comment 15 Meng Bo 2016-02-16 06:19:31 UTC
@Jeremy

Have you hit the crash recently? Can we close this bug?
It is difficult to reproduce or verify this bug on my side.

Thanks.

Comment 16 Jeremy Eder 2016-02-16 12:34:51 UTC
Yep, I think it's fixed.

Comment 17 Meng Bo 2016-02-17 05:25:27 UTC
Thanks.

Resolving the bug.

Comment 19 errata-xmlrpc 2016-05-12 16:26:43 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2016:1064