Bug 1785728

Summary: [4.2]openshift-sdn nil dereference in startup
Product: OpenShift Container Platform Reporter: Juan Luis de Sousa-Valadas <jdesousa>
Component: Networking Assignee: Juan Luis de Sousa-Valadas <jdesousa>
Networking sub component: openshift-sdn QA Contact: huirwang
Status: CLOSED ERRATA Docs Contact:
Severity: unspecified    
Priority: unspecified CC: bleanhar, erich, scuppett
Version: 4.2.0   
Target Milestone: ---   
Target Release: 4.2.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
273e1ef4-efd6-49c4-bd46-cdee8d9cfb03
Last Closed: 2020-02-24 16:52:45 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On: 1781707    
Bug Blocks:    

Description Juan Luis de Sousa-Valadas 2019-12-20 19:15:19 UTC
This bug was initially created as a copy of Bug #1781707

I am copying this bug because: 
Backport to 4.2

As seen in a training cluster:

    panic: runtime error: invalid memory address or nil pointer dereference
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x166faf9]
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: goroutine 1 [running]:
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: github.com/openshift/sdn/pkg/openshift-sdn.injectKubeAPIEnv(0x7ffd075a6cd8, 0x1a, 0x0, 0xc000705ba0)
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]:         /go/src/github.com/openshift/sdn/pkg/openshift-sdn/cmd.go:191 +0xa9
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: github.com/openshift/sdn/pkg/openshift-sdn.(*OpenShiftSDN).Run(0xc0000d0600, 0xc000676280, 0x1d424e0, 0xc0000b6010, 0xc000098480)
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]:         /go/src/github.com/openshift/sdn/pkg/openshift-sdn/cmd.go:77 +0x50
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: github.com/openshift/sdn/pkg/openshift-sdn.NewOpenShiftSDNCommand.func1.2(0xc000000008, 0x1b4f118)
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]:         /go/src/github.com/openshift/sdn/pkg/openshift-sdn/cmd.go:61 +0x4e
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: github.com/openshift/sdn/vendor/k8s.io/kubernetes/pkg/util/interrupt.(*Handler).Run(0xc00030f410, 0xc000705d50, 0x0, 0x0)
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]:         /go/src/github.com/openshift/sdn/vendor/k8s.io/kubernetes/pkg/util/interrupt/interrupt.go:103 +0xff
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: github.com/openshift/sdn/pkg/openshift-sdn.NewOpenShiftSDNCommand.func1(0xc000676280, 0xc00030f3b0, 0x0, 0x3)
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]:         /go/src/github.com/openshift/sdn/pkg/openshift-sdn/cmd.go:60 +0x154
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: github.com/openshift/sdn/vendor/github.com/spf13/cobra.(*Command).execute(0xc000676280, 0xc0000ba010, 0x3, 0x3, 0xc000676280, 0xc0000ba010)
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]:         /go/src/github.com/openshift/sdn/vendor/github.com/spf13/cobra/command.go:760 +0x2ae
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: github.com/openshift/sdn/vendor/github.com/spf13/cobra.(*Command).ExecuteC(0xc000676280, 0xd, 0x1d424e0, 0xc0000b6010)
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]:         /go/src/github.com/openshift/sdn/vendor/github.com/spf13/cobra/command.go:846 +0x2ec
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: github.com/openshift/sdn/vendor/github.com/spf13/cobra.(*Command).Execute(...)
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]:         /go/src/github.com/openshift/sdn/vendor/github.com/spf13/cobra/command.go:794
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]: main.main()
    Dec 09 14:39:33 ip-10-0-133-40 hyperkube[2027]:         /go/src/github.com/openshift/sdn/cmd/openshift-sdn/openshift-sdn.go:28 +0x17b

Comment 1 Stephen Cuppett 2020-02-05 19:10:42 UTC
How can NotReady nodes (workers and masters) that are failing because of this be brought back to Ready status so they can upgrade to this z-stream?

Comment 2 Juan Luis de Sousa-Valadas 2020-02-05 19:26:17 UTC
You can probably work around it by manually running:
cp /var/lib/kubelet/kubeconfig /etc/kubernetes/kubeconfig
I haven't tested it, though. If /etc/kubernetes/kubeconfig is already present on the node, do *not* overwrite it.
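
The "copy only if the target is missing" guard from the comment above can be sketched as a small shell function. This is an illustration under assumptions, not a tested fix: copy_if_absent is a hypothetical helper name, and only the two kubeconfig paths come from this bug.

```shell
#!/bin/sh
# copy_if_absent SRC DST
# Hypothetical helper: copies SRC to DST only when DST does not exist yet,
# so an existing file at DST is never overwritten.
copy_if_absent() {
    src=$1
    dst=$2

    # Leave any existing destination untouched.
    if [ -e "$dst" ]; then
        echo "keeping existing $dst" >&2
        return 0
    fi

    # Refuse to continue when the source is missing or unreadable.
    if [ ! -r "$src" ]; then
        echo "cannot read $src" >&2
        return 1
    fi

    # -p preserves the mode and timestamps of the source file.
    cp -p "$src" "$dst"
}
```

On an affected node (presumably as root) the call would be: copy_if_absent /var/lib/kubelet/kubeconfig /etc/kubernetes/kubeconfig. The existence check and the copy are two separate steps, so this is not safe against something else creating the file in between.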

Comment 4 zhaozhanqi 2020-02-06 03:07:06 UTC
@huiran, could you help check this bug?

Comment 7 errata-xmlrpc 2020-02-24 16:52:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0460