Description of problem:
After deploying logging and the eventrouter, the eventrouter can't gather event logs, and the eventrouter pod prints many error messages:

I0721 00:23:52.890323 1 reflector.go:240] Listing and watching *v1.Event from github.com/openshift/eventrouter/vendor/k8s.io/client-go/informers/factory.go:73
E0721 00:23:52.904347 1 reflector.go:205] github.com/openshift/eventrouter/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.Event: v1.EventList: Items: []v1.Event: v1.Event: ObjectMeta: v1.ObjectMeta: readObjectFieldAsBytes: expect : after object field, but found u, error found in #10 byte of ...|:{},"k:{\"uid\":\"30|..., bigger context ...|},"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"303e9c69-80bf-4001-9ccf-25c8f1f4c14e\"}":{|...
I0721 00:23:53.904461 1 reflector.go:240] Listing and watching *v1.Event from github.com/openshift/eventrouter/vendor/k8s.io/client-go/informers/factory.go:73
E0721 00:23:53.922332 1 reflector.go:205] github.com/openshift/eventrouter/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.Event: v1.EventList: Items: []v1.Event: v1.Event: ObjectMeta: v1.ObjectMeta: readObjectFieldAsBytes: expect : after object field, but found u, error found in #10 byte of ...|:{},"k:{\"uid\":\"30|..., bigger context ...|},"f:metadata":{"f:ownerReferences":{".":{},"k:{\"uid\":\"303e9c69-80bf-4001-9ccf-25c8f1f4c14e\"}":{|...

Version-Release number of selected component (if applicable):
ose-logging-eventrouter-v4.5.0-202007172106.p0
cluster version: 4.5.0-0.nightly-2020-07-20-152128

How reproducible:
In some clusters it is 100% reproducible; in other clusters there is no such issue.

Steps to Reproduce:
1. Deploy logging.
2. Deploy the eventrouter in the openshift-logging namespace with:

kind: Template
apiVersion: v1
metadata:
  name: eventrouter-template
  annotations:
    description: "A pod forwarding kubernetes events to cluster logging stack."
    tags: "events,EFK,logging,cluster-logging"
objects:
  - kind: ServiceAccount
    apiVersion: v1
    metadata:
      name: cluster-logging-eventrouter
      namespace: ${NAMESPACE}
  - kind: ClusterRole
    apiVersion: v1
    metadata:
      name: event-reader
    rules:
      - apiGroups: [""]
        resources: ["events"]
        verbs: ["get", "watch", "list"]
  - kind: ClusterRoleBinding
    apiVersion: v1
    metadata:
      name: event-reader-binding
    subjects:
      - kind: ServiceAccount
        name: cluster-logging-eventrouter
        namespace: ${NAMESPACE}
    roleRef:
      kind: ClusterRole
      name: event-reader
  - kind: ConfigMap
    apiVersion: v1
    metadata:
      name: cluster-logging-eventrouter
      namespace: ${NAMESPACE}
    data:
      config.json: |-
        {
          "sink": "stdout"
        }
  - kind: Deployment
    apiVersion: apps/v1
    metadata:
      name: cluster-logging-eventrouter
      namespace: ${NAMESPACE}
      labels:
        component: eventrouter
        logging-infra: eventrouter
        provider: openshift
    spec:
      selector:
        matchLabels:
          component: eventrouter
          logging-infra: eventrouter
          provider: openshift
      replicas: 1
      template:
        metadata:
          labels:
            component: eventrouter
            logging-infra: eventrouter
            provider: openshift
          name: cluster-logging-eventrouter
        spec:
          serviceAccount: cluster-logging-eventrouter
          containers:
            - name: kube-eventrouter
              image: ${IMAGE}
              imagePullPolicy: IfNotPresent
              resources:
                limits:
                  memory: ${MEMORY}
                requests:
                  cpu: ${CPU}
                  memory: ${MEMORY}
              volumeMounts:
                - name: config-volume
                  mountPath: /etc/eventrouter
          volumes:
            - name: config-volume
              configMap:
                name: cluster-logging-eventrouter
parameters:
  - name: IMAGE
    displayName: Image
    value: "image-registry.openshift-image-registry.svc:5000/openshift/ose-logging-eventrouter:latest"
  - name: MEMORY
    displayName: Memory
    value: "128Mi"
  - name: CPU
    displayName: CPU
    value: "100m"
  - name: NAMESPACE
    displayName: Namespace
    value: "openshift-logging"

3. Check the eventrouter pod logs.

Actual results:
The eventrouter fails to list *v1.Event; the errors above repeat continuously and no events are collected.

Expected results:
The eventrouter lists and watches events and forwards them without errors.

Additional info:
must-gather: http://file.apac.redhat.com/~qitang/must-gather.tar.gz
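For context on the failing code path: the reflector.go lines above come from a client-go shared informer listing and watching v1.Event. Below is a minimal sketch of that pattern; it is not the eventrouter's actual source, it assumes a recent client-go and an in-cluster config, and the print handler is purely illustrative.

package main

import (
	"fmt"
	"time"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	// In-cluster config, as the eventrouter pod would use.
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	// Shared informer factory; the reflector named in the log lines above
	// is started internally for exactly this kind of informer.
	factory := informers.NewSharedInformerFactory(client, 30*time.Minute)
	eventInformer := factory.Core().V1().Events().Informer()
	eventInformer.AddEventHandler(cache.ResourceEventHandlerFuncs{
		AddFunc: func(obj interface{}) {
			ev := obj.(*corev1.Event)
			fmt.Printf("%s/%s: %s\n", ev.Namespace, ev.Name, ev.Message)
		},
	})

	stop := make(chan struct{})
	defer close(stop)
	factory.Start(stop) // reflector ListAndWatch begins here; its initial List is what fails above
	select {}           // block forever; the handler prints events as they arrive
}

The initial List is where the decode fails. The "f:metadata" / "k:{\"uid\":...}" keys in the error context look like the metadata.managedFields notation from server-side apply, which newer API servers populate; an older JSON decoder vendored into the eventrouter choking on those escaped-quote map keys would match the symptom, though that is an inference from the log rather than something confirmed in this report.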
The issue couldn't be reproduced.
I cannot reproduce this issue, and it is not really a bug. I suspect something went wrong on the transfer side and the JSON transfer was incomplete. I will close for now, until someone can reproduce it.
For reference, this is also happening in 4.6: I see the same logs after deploying the template.
I can't reproduce this bug on 4.7 (current master).
I am unable to reproduce this issue after deploying the pod as described [1].

Using a 4.6 dev cluster:

$ oc version
Client Version: 4.5.0-0.nightly-2020-04-21-103613
Server Version: 4.6.0-0.nightly-2020-09-28-110510
Kubernetes Version: v1.19.0+e465e66

Image info:
Image: registry.redhat.io/openshift4/ose-logging-eventrouter:latest
Image ID: registry.redhat.io/openshift4/ose-logging-eventrouter@sha256:40433a3b3eaf34126c81d62ca7755675a5e25bc4489793dff7924abe447005ca

[1] https://docs.openshift.com/container-platform/4.6/logging/cluster-logging-eventrouter.html
It's happening only sometimes, so I don't expect it to reproduce 100% of the time. Is there any other information about the environment that could affect this use case and be causing the issue? We're not sure we can help with a reproducer, but if you need any extra logs, let us know.
(In reply to David Hernández Fernández from comment #14)
> It's happening sometimes,

Can you clarify this statement? Does it run for several days or hours and then stop producing events? Is it transient, i.e. a short blip after which the pod recovers and continues to collect events? Is it possible the watch expires and is not cleaned up properly?
It works for a few hours, but then it stops working and the following messages appear in the log:

E0226 07:09:06.235292 1 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=295, ErrCode=NO_ERROR, debug=""
E0226 07:09:06.235562 1 reflector.go:315] github.com/openshift/eventrouter/vendor/k8s.io/client-go/informers/factory.go:73: Failed to watch *v1.Event: Get "https://10.83.0.1:443/api/v1/events?resourceVersion=351123363&timeoutSeconds=578&watch=true": http2: no cached connection was available
I0226 07:09:07.235684 1 reflector.go:240] Listing and watching *v1.Event from github.com/openshift/eventrouter/vendor/k8s.io/client-go/informers/factory.go:73
E0226 07:09:07.235894 1 reflector.go:205] github.com/openshift/eventrouter/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.Event: Get "https://10.83.0.1:443/api/v1/events?resourceVersion=0": http2: no cached connection was available
I0226 07:09:08.236123 1 reflector.go:240] Listing and watching *v1.Event from github.com/openshift/eventrouter/vendor/k8s.io/client-go/informers/factory.go:73
E0226 07:09:08.236424 1 reflector.go:205] github.com/openshift/eventrouter/vendor/k8s.io/client-go/informers/factory.go:73: Failed to list *v1.Event: Get "https://10.83.0.1:443/api/v1/events?resourceVersion=0": http2: no cached connection was available

So it looks like the initial connection to the API server never gets re-established once it is dropped. Hope this helps to resolve the issue.

/Andreas
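Following up at the code level: the repeating "Listing and watching" lines show the reflector is already retrying with a fresh list and watch. That retry behavior looks roughly like the hand-rolled sketch below, assuming a recent client-go where List/Watch take a context; retryInterval is an illustrative name, not the eventrouter's actual setting.

package main

import (
	"context"
	"fmt"
	"time"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func main() {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	client := kubernetes.NewForConfigOrDie(cfg)

	const retryInterval = time.Second // illustrative back-off only
	for {
		// Re-list to obtain a current resourceVersion after the old stream died.
		list, err := client.CoreV1().Events("").List(context.TODO(), metav1.ListOptions{})
		if err != nil {
			fmt.Println("list failed, retrying:", err)
			time.Sleep(retryInterval)
			continue
		}
		// Watch from that resourceVersion; a GOAWAY simply ends the stream.
		w, err := client.CoreV1().Events("").Watch(context.TODO(), metav1.ListOptions{
			ResourceVersion: list.ResourceVersion,
		})
		if err != nil {
			fmt.Println("watch failed, retrying:", err)
			time.Sleep(retryInterval)
			continue
		}
		for ev := range w.ResultChan() { // drains until the server closes the stream
			fmt.Println("event:", ev.Type)
		}
		w.Stop() // stream ended; loop around and establish a new one
	}
}

The catch in this bug is that the failure ("http2: no cached connection was available") happens inside the pooled HTTP/2 transport, below this loop, so every retry fails before a request ever leaves the client. No amount of re-listing can recover until the dead connection is evicted or the vendored client-go/x/net code is updated, which is presumably why the fix had to land in the image itself.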
We are facing the exact same issue, including the following error:

E0306 01:39:01.630417 1 streamwatcher.go:109] Unable to decode an event from the watch stream: http2: server sent GOAWAY and closed the connection; LastStreamID=297, ErrCode=NO_ERROR, debug=""

Hoping this problem will receive some attention soon.
Thanks, this info helps us. We are already working on a solution.
Moved to ASSIGNED to resolve this for 4.6. Created https://issues.redhat.com/browse/LOG-1230 to address it in 5.x.
No such issue with the image provided by Vitalii Parfonov. Either the PR wasn't merged, or some packages were missing when the downstream image was built.
Verified and passed using image v4.6.26.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Important: OpenShift Container Platform 4.6.26 security and extras update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2021:1230