Bug 1701495

Summary: Eventrouter pod in CrashLoopBackOff status: "runtime error: invalid memory address or nil pointer dereference"
Product: OpenShift Container Platform Reporter: Qiaoling Tang <qitang>
Component: LoggingAssignee: Rich Megginson <rmeggins>
Status: CLOSED ERRATA QA Contact: Qiaoling Tang <qitang>
Severity: high Docs Contact:
Priority: high    
Version: 4.1.0CC: aos-bugs, clasohm, david, jcantril, jmalde, nhosoi, rmeggins, skrenger, sreber, vlaad
Target Milestone: ---   
Target Release: 4.3.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openshift-ose-logging-eventrouter:v4.3.0-201910250623 Doc Type: Bug Fix
Doc Text:
Cause: Eventrouter was not able to handle events of type "Info" or other event types unknown to it. Consequence: Eventrouter would crash trying to handle unknown event types. Fix: Add explicit support for "Info" types, and otherwise fix handling for unknown event types. Result: Eventrouter properly handles "Info" and unknown event types.
Story Points: ---
Clone Of:
: 1762409 (view as bug list) Environment:
Last Closed: 2020-01-23 11:03:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1762409, 1764312, 1764314    

Description Qiaoling Tang 2019-04-19 09:21:02 UTC
Description of problem:
Deploy logging cluster, then deploy eventrouter using https://raw.githubusercontent.com/QiaolingTang/logging/master/eventrouter.yaml, the eventrouter pod was in CrashLoopBackOff status, check eventrouter pod logs, some events in OCP already gatherd, but has error message in the pod log:

E0419 07:14:53.753914       1 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/go/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
/go/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/go/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/local/go/src/runtime/asm_amd64.s:573
/usr/local/go/src/runtime/panic.go:502
/usr/local/go/src/runtime/panic.go:63
/usr/local/go/src/runtime/signal_unix.go:388
/go/src/github.com/openshift/eventrouter/eventrouter.go:152
/go/src/github.com/openshift/eventrouter/eventrouter.go:113
/go/src/github.com/openshift/eventrouter/eventrouter.go:86
/go/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/controller.go:195
/go/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/shared_informer.go:545
/go/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/shared_informer.go:381
/go/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/shared_informer.go:381
/go/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71
/usr/local/go/src/runtime/asm_amd64.s:2361
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0xf092c9]

goroutine 98 [running]:
github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x107
panic(0x102c6a0, 0x199ab10)
	/usr/local/go/src/runtime/panic.go:502 +0x229
main.prometheusEvent(0xc4211de400)
	/go/src/github.com/openshift/eventrouter/eventrouter.go:152 +0xe9
main.(*EventRouter).addEvent(0xc420324200, 0x116fbc0, 0xc4211de400)
	/go/src/github.com/openshift/eventrouter/eventrouter.go:113 +0x3c
main.(*EventRouter).(main.addEvent)-fm(0x116fbc0, 0xc4211de400)
	/go/src/github.com/openshift/eventrouter/eventrouter.go:86 +0x3e
github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(0xc420252390, 0xc4202523a0, 0xc4202523b0, 0x116fbc0, 0xc4211de400)
	/go/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/controller.go:195 +0x49
github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache.(*processorListener).run(0xc42037e460)
	/go/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/shared_informer.go:545 +0x247
github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache.(*processorListener).(github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache.run)-fm()
	/go/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/shared_informer.go:381 +0x2a
github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start.func1(0xc4204e27c8, 0xc4202fe000)
	/go/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:71 +0x4f
created by github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/wait.(*Group).Start
	/go/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:69 +0x62

$ oc get pod
NAME                                            READY   STATUS             RESTARTS   AGE
cluster-logging-eventrouter-5866f97957-7vqnf    0/1     CrashLoopBackOff   1          37s
cluster-logging-operator-5b5fb64d66-m2tt7       1/1     Running            0          47m
elasticsearch-cdm-niao8lka-1-97dd9c775-r98lz    2/2     Running            0          45m
$ oc describe pod cluster-logging-eventrouter-5866f97957-7vqnf
Name:               cluster-logging-eventrouter-5866f97957-7vqnf
Namespace:          openshift-logging
Priority:           0
PriorityClassName:  <none>
Node:               dell-r730-066.dsal.lab.eng.rdu2.redhat.com/10.1.8.76
Start Time:         Fri, 19 Apr 2019 15:10:28 +0800
Labels:             component=eventrouter
                    logging-infra=eventrouter
                    pod-template-hash=5866f97957
                    provider=openshift
Annotations:        k8s.v1.cni.cncf.io/networks-status:
                      [{
                          "name": "openshift-sdn",
                          "interface": "eth0",
                          "ips": [
                              "10.130.0.46"
                          ],
                          "default": true,
                          "dns": {}
                      }]
                    openshift.io/scc: restricted
Status:             Running
IP:                 10.130.0.46
Controlled By:      ReplicaSet/cluster-logging-eventrouter-5866f97957
Containers:
  kube-eventrouter:
    Container ID:   cri-o://db33e4a0ced7115452436d0bdf8766ed48cb7d076398c179bf718dd13f800534
    Image:          quay.io/openshift/origin-logging-eventrouter:latest
    Image ID:       quay.io/openshift/origin-logging-eventrouter@sha256:1bca5f15a04bda7dfb56e5ecd9f7e6f6acc192b06a925a3693e43a9663dc605e
    Port:           <none>
    Host Port:      <none>
    State:          Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Fri, 19 Apr 2019 15:11:17 +0800
      Finished:     Fri, 19 Apr 2019 15:11:17 +0800
    Last State:     Terminated
      Reason:       Error
      Exit Code:    2
      Started:      Fri, 19 Apr 2019 15:11:00 +0800
      Finished:     Fri, 19 Apr 2019 15:11:01 +0800
    Ready:          False
    Restart Count:  2
    Limits:
      memory:  128Mi
    Requests:
      cpu:        100m
      memory:     128Mi
    Environment:  <none>
    Mounts:
      /etc/eventrouter from config-volume (rw)
      /var/run/secrets/kubernetes.io/serviceaccount from cluster-logging-eventrouter-token-898wp (ro)
Conditions:
  Type              Status
  Initialized       True 
  Ready             False 
  ContainersReady   False 
  PodScheduled      True 
Volumes:
  config-volume:
    Type:      ConfigMap (a volume populated by a ConfigMap)
    Name:      cluster-logging-eventrouter
    Optional:  false
  cluster-logging-eventrouter-token-898wp:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  cluster-logging-eventrouter-token-898wp
    Optional:    false
QoS Class:       Burstable
Node-Selectors:  <none>
Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
                 node.kubernetes.io/not-ready:NoExecute for 300s
                 node.kubernetes.io/unreachable:NoExecute for 300s
Events:
  Type     Reason     Age               From                                                 Message
  ----     ------     ----              ----                                                 -------
  Normal   Scheduled  58s               default-scheduler                                    Successfully assigned openshift-logging/cluster-logging-eventrouter-5866f97957-7vqnf to dell-r730-066.dsal.lab.eng.rdu2.redhat.com
  Normal   Pulling    50s               kubelet, dell-r730-066.dsal.lab.eng.rdu2.redhat.com  Pulling image "quay.io/openshift/origin-logging-eventrouter:latest"
  Normal   Pulled     28s               kubelet, dell-r730-066.dsal.lab.eng.rdu2.redhat.com  Successfully pulled image "quay.io/openshift/origin-logging-eventrouter:latest"
  Normal   Created    9s (x3 over 28s)  kubelet, dell-r730-066.dsal.lab.eng.rdu2.redhat.com  Created container kube-eventrouter
  Normal   Started    9s (x3 over 28s)  kubelet, dell-r730-066.dsal.lab.eng.rdu2.redhat.com  Started container kube-eventrouter
  Normal   Pulled     9s (x2 over 26s)  kubelet, dell-r730-066.dsal.lab.eng.rdu2.redhat.com  Container image "quay.io/openshift/origin-logging-eventrouter:latest" already present on machine
  Warning  BackOff    8s (x3 over 25s)  kubelet, dell-r730-066.dsal.lab.eng.rdu2.redhat.com  Back-off restarting failed container

Version-Release number of selected component (if applicable):
4.0.0-0.nightly-2019-04-18-170158
quay.io/openshift/origin-logging-eventrouter@sha256:1bca5f15a04bda7dfb56e5ecd9f7e6f6acc192b06a925a3693e43a9663dc605e

How reproducible:
not sure

when the eventrouter pod in CrashLoopBackOff status, delete all the eventrouter resources, then create again, do these step three or four times, the issue disappeared.

I tested in upi install on bare mental and upi install on vsphere, all have this issue

Steps to Reproduce:
1.
2.
3.

Actual results:


Expected results:
The eventrouter pod could start

Additional info:

Comment 1 Qiaoling Tang 2019-04-22 02:13:59 UTC
This issue also could be reproduced in env installed by IPI in AWS.

Comment 9 Simon Krenger 2019-09-12 14:32:27 UTC
I was able to reproduce this issue by manually creating an event with `"type": "Info"`:

1. Set up a cluster (I used v3.11.141), with `openshift3/ose-logging-eventrouter:v3.11.141-2`
2. Run `oc apply -f event.json` with any new event with `"type": "Info"`:

```
{
    "apiVersion": "v1",
    "count": 1,
    "eventTime": null,
    "involvedObject": {
        "apiVersion": "apps.openshift.io/v1",
        "kind": "DeploymentConfig",
        "name": "router",
        "namespace": "default",
    },
    "kind": "Event",
    "message": "My test",
    "metadata": {
        "name": "test-event",
        "namespace": "default"
    },
    "reason": "DeploymentCreated",
    "reportingComponent": "",
    "reportingInstance": "",
    "source": {
        "component": "deploymentconfig-controller"
    },
    "type": "Info"
}
```

3. Observe that `eventrouter` crashes immediately and stays in `CrashLoopBackOff`:

```
$ oc get po
NAME                          READY     STATUS             RESTARTS   AGE
[..]
logging-eventrouter-1-gv6jp   0/1       CrashLoopBackOff   1          6m
[..]
```

4. Observe the error message:

```
$ oc logs logging-eventrouter-1-gv6jp
[..]
E0912 14:23:42.370996       1 runtime.go:66] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:72
/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:65
/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51
/usr/lib/golang/src/runtime/asm_amd64.s:514
/usr/lib/golang/src/runtime/panic.go:489
/usr/lib/golang/src/runtime/panic.go:63
/usr/lib/golang/src/runtime/signal_unix.go:290
/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/eventrouter.go:152
/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/eventrouter.go:113
/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/eventrouter.go:86
/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/controller.go:192
<autogenerated>:54
/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/shared_informer.go:547
/usr/lib/golang/src/runtime/asm_amd64.s:2197
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x11c2239]

goroutine 35 [running]:
github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x126
panic(0x130ff20, 0x1d2c5b0)
	/usr/lib/golang/src/runtime/panic.go:489 +0x2cf
main.prometheusEvent(0xc4205ecd38)
	/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/eventrouter.go:152 +0xe9
main.(*EventRouter).addEvent(0xc42012d340, 0x1471800, 0xc4205ecd38)
	/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/eventrouter.go:113 +0x3c
main.(*EventRouter).(main.addEvent)-fm(0x1471800, 0xc4205ecd38)
	/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/eventrouter.go:86 +0x3e
github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache.ResourceEventHandlerFuncs.OnAdd(0xc4201a74a0, 0xc4201a74b0, 0xc4201a74c0, 0x1471800, 0xc4205ecd38)
	/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/controller.go:192 +0x49
github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache.(*ResourceEventHandlerFuncs).OnAdd(0xc42000ca60, 0x1471800, 0xc4205ecd38)
	<autogenerated>:54 +0x73
github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache.(*processorListener).run(0xc42009f6b0, 0xc420418120)
	/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/shared_informer.go:547 +0x2f6
created by github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache.(*sharedProcessor).run
	/builddir/build/BUILD/eventrouter-5bd925143c4263d18cb33bfc3f3edbcf4267b2a5/src/github.com/openshift/eventrouter/vendor/k8s.io/client-go/tools/cache/shared_informer.go:402 +0xb3
```

Comment 11 Jeff Cantrill 2019-09-30 17:30:11 UTC
Not a blocker for 4.2.  Moving to 4.3 and likely will need to be backported to 4.2 and prior

Comment 19 Qiaoling Tang 2019-10-29 01:26:38 UTC
Verified with ose-logging-eventrouter-v4.3.0-201910250623.

Comment 21 errata-xmlrpc 2020-01-23 11:03:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:0062