Bug 1665605 - Repeated panics in openshift-cluster-kube-scheduler-operator pod logs
Summary: Repeated panics in openshift-cluster-kube-scheduler-operator pod logs
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Node
Version: 4.1.0
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: medium
Target Milestone: ---
Target Release: 4.1.0
Assignee: ravig
QA Contact: Jianwei Hou
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2019-01-11 22:17 UTC by Mike Fiedler
Modified: 2019-06-04 10:41 UTC
CC List: 5 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-06-04 10:41:49 UTC
Target Upstream Version:


Attachments (Terms of Use)
openshift-cluster-kube-scheduler pod logs (164.63 KB, application/gzip)
2019-01-11 22:17 UTC, Mike Fiedler


Links
System ID Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2019:0758 None None None 2019-06-04 10:41:55 UTC

Description Mike Fiedler 2019-01-11 22:17:01 UTC
Created attachment 1520148 [details]
openshift-cluster-kube-scheduler pod logs

Description of problem:

My openshift-cluster-kube-scheduler-operator pod logs, covering about 8 hours, contain 126 instances of this panic:

I0111 16:41:50.567169       1 shared_informer.go:123] caches populated
I0111 16:41:50.567461       1 config_observer_controller.go:95] decode of existing config failed with error: EOF
E0111 16:41:50.568134       1 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/github.com/openshift/cluster-kube-scheduler-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
/go/src/github.com/openshift/cluster-kube-scheduler-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/github.com/openshift/cluster-kube-scheduler-operator/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/opt/rh/go-toolset-1.10/root/usr/lib/go-toolset-1.10-golang/src/runtime/asm_amd64.s:573
/opt/rh/go-toolset-1.10/root/usr/lib/go-toolset-1.10-golang/src/runtime/panic.go:502
/opt/rh/go-toolset-1.10/root/usr/lib/go-toolset-1.10-golang/src/runtime/panic.go:63
/opt/rh/go-toolset-1.10/root/usr/lib/go-toolset-1.10-golang/src/runtime/signal_unix.go:388
/go/src/github.com/openshift/cluster-kube-scheduler-operator/vendor/github.com/openshift/library-go/pkg/operator/configobserver/config_observer_controller.go:102
/go/src/github.com/openshift/cluster-kube-scheduler-operator/vendor/github.com/openshift/library-go/pkg/operator/configobserver/config_observer_controller.go:189
/go/src/github.com/openshift/cluster-kube-scheduler-operator/vendor/github.com/openshift/library-go/pkg/operator/configobserver/config_observer_controller.go:175
/go/src/github.com/openshift/cluster-kube-scheduler-operator/vendor/github.com/openshift/library-go/pkg/operator/configobserver/config_observer_controller.go:169
/go/src/github.com/openshift/cluster-kube-scheduler-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:133
/go/src/github.com/openshift/cluster-kube-scheduler-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:134
/go/src/github.com/openshift/cluster-kube-scheduler-operator/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
/opt/rh/go-toolset-1.10/root/usr/lib/go-toolset-1.10-golang/src/runtime/asm_amd64.s:2361


No pod restarts, though.
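The log lines above suggest the crash pattern: the config observer logs "decode of existing config failed with error: EOF" and then dereferences the config it never got, producing the nil-pointer panic that runtime.go recovers and logs (which is why the pod never restarts). A minimal, self-contained Go sketch of that pattern follows; the types and function names here are illustrative stand-ins, not the operator's actual code:

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// SchedulerConfig is a hypothetical stand-in for the observed config object.
type SchedulerConfig struct {
	PolicyName string
}

// decodeConfig simulates decoding an existing config: empty input yields
// io.EOF and a nil config, mirroring the "decode of existing config failed
// with error: EOF" log line.
func decodeConfig(raw []byte) (*SchedulerConfig, error) {
	if len(raw) == 0 {
		return nil, io.EOF
	}
	return &SchedulerConfig{PolicyName: string(raw)}, nil
}

// observeConfig reproduces the bug shape: EOF is tolerated, but the nil
// config is then dereferenced anyway. The recover mirrors the
// runtime.go:66 "Observed a panic" handler in the stack trace above.
func observeConfig(raw []byte) (name string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("observed a panic: %v", r)
		}
	}()
	cfg, decodeErr := decodeConfig(raw)
	if decodeErr != nil && !errors.Is(decodeErr, io.EOF) {
		return "", decodeErr
	}
	// Bug: cfg is nil when decodeErr == io.EOF; this dereference panics
	// with "invalid memory address or nil pointer dereference".
	return cfg.PolicyName, nil
}

func main() {
	_, err := observeConfig(nil)
	fmt.Println(err)
}
```

The fix direction is the obvious one: bail out (or fall back to a default config) when decode fails, instead of proceeding with a nil object.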


Version-Release number of selected component (if applicable): 4.0.0-0.nightly-2019-01-10-165754


How reproducible: Always in this build


Steps to Reproduce:
1. Install typical 3 master/3 worker cluster on AWS with nextgen installer using OCP build 4.0.0-0.nightly-2019-01-10-165754
2. oc logs -n openshift-cluster-kube-scheduler <pod>


Actual results:

Repeated instances of the subject panic
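A tally like the 126 instances above can be pulled out of a saved copy of the log with grep. A self-contained sketch over a synthetic two-line log (the file path and log contents are illustrative):

```shell
# write a tiny synthetic log with one panic line, then count matches
printf 'E0111 16:41:50 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference"\nI0111 16:41:50 shared_informer.go:123] caches populated\n' > /tmp/sched-operator.log
grep -c 'Observed a panic' /tmp/sched-operator.log
```

Against the live pod, the same pattern applies by piping `oc logs -n openshift-cluster-kube-scheduler <pod>` into `grep -c 'Observed a panic'`.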



Additional info:

Full pod log attached.

Comment 2 ravig 2019-02-26 19:00:00 UTC
We already have a fix for this. Thanks to Dan for identifying it earlier and providing the fix.

https://github.com/openshift/cluster-kube-scheduler-operator/pull/65

Comment 3 ravig 2019-02-27 09:15:29 UTC
The above PR has merged.

Comment 5 Mike Fiedler 2019-03-06 15:44:42 UTC
Verified on 4.0.0-0.nightly-2019-03-06-074438

Comment 8 errata-xmlrpc 2019-06-04 10:41:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2019:0758

