Bug 1835396
| Summary: | Cannot access logs in Kibana with the managed deployment orchestrated by the cluster logging operator | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Tyler Lisowski <lisowski> |
| Component: | Logging | Assignee: | Jeff Cantrill <jcantril> |
| Status: | CLOSED ERRATA | QA Contact: | Anping Li <anli> |
| Severity: | high | Docs Contact: | |
| Priority: | high | | |
| Version: | 4.3.z | CC: | aos-bugs, bgottfri, brian_mckeown, brueckner, cewong, ewolinet, ikarpukh, jason.greene, jcantril, jtpape, nbziouec, periklis, pweil, rkonuru, smerrow, tnakajo, wili, yhuang |
| Target Milestone: | --- | Keywords: | Reopened, UpcomingSprint |
| Target Release: | 4.4.z | | |
| Hardware: | x86_64 | | |
| OS: | Linux | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | | |
| : | 1854608 (view as bug list) | Environment: | |
| Last Closed: | 2020-07-21 10:31:05 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Bug Depends On: | 1854610 | | |
| Bug Blocks: | 1854997 | | |
| Attachments: | | | |
Description
Tyler Lisowski
2020-05-13 17:51:21 UTC
Created attachment 1688158 [details]
Kibana UI screenshot
Hello, is there any update on this ticket? Is this going to be fixed anytime soon? We have a customer who wants to use this feature on IBM Red Hat OpenShift clusters by the end of next week. Thank you.

There are additional issues that I discovered independently:

* All CLO pods are running fine:

```
oc get pods -n openshift-logging
NAME                                            READY   STATUS      RESTARTS   AGE
cluster-logging-operator-7647fbdfbf-gzbwq       1/1     Running     0          2d19h
curator-1589826600-dg5tf                        0/1     Completed   0          6m32s
elasticsearch-cdm-d9bzp7fs-1-549fd8d989-7wpdq   2/2     Running     0          2d19h
fluentd-t72c9                                   1/1     Running     0          2d19h
fluentd-xfvg6                                   1/1     Running     0          2d19h
fluentd-zvkf6                                   1/1     Running     0          2d19h
kibana-7479c479cc-mwmqj                         2/2     Running     0          2d19h
```

* I see Elasticsearch getting the documents, but I also see lots of exception messages in the Elasticsearch logs. What's even more weird is that Elasticsearch is creating all the document entries, but only some of the messages are correct and the rest are set to null. Not sure if it's a ROKS issue or a vanilla OpenShift issue at this point. Here is an excerpt from the Elasticsearch log:

```
[2020-05-18T03:32:05,231][WARN ][c.f.s.a.BackendRegistry ] Authentication finally failed for null
[2020-05-18T03:32:16,379][ERROR][i.f.e.p.OpenshiftAPIService] Error retrieving username from token
okhttp3.internal.http2.StreamResetException: stream was reset: PROTOCOL_ERROR
    at okhttp3.internal.http2.Http2Stream.takeHeaders(Http2Stream.java:158) ~[okhttp-3.12.6.jar:?]
    at okhttp3.internal.http2.Http2Codec.readResponseHeaders(Http2Codec.java:131) ~[okhttp-3.12.6.jar:?]
    at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:88) ~[okhttp-3.12.6.jar:?]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.12.6.jar:?]
    at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[okhttp-3.12.6.jar:?]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.12.6.jar:?]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:121) ~[okhttp-3.12.6.jar:?]
    at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) ~[okhttp-3.12.6.jar:?]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:147) ~[okhttp-3.12.6.jar:?]
    at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.jav
```

I log into an Elasticsearch node and get all the documents in my project index as follows:

```
curl -s -k --cert /etc/elasticsearch/secret/admin-cert --key /etc/elasticsearch/secret/admin-key 'https://localhost:9200/project.appteam-one.d46b3da5-5b5e-4208-93d1-a5cde9c5f276.2020.05.18/_search?pretty=true' | grep -i sending
"message" : "Sending message 59 : Hello world from hello-world2-654d89c79d-lqt84! blah blah blah!",
"message" : "Sending message 60 : Hello world from hello-world2-654d89c79d-lqt84! blah blah blah!",
"message" : "Sending message 61 : Hello world from hello-world2-654d89c79d-lqt84! blah blah blah!",
"message" : "Sending message 113 : Hello world from hello-world2-654d89c79d-mbv4s! blah blah blah!",
"message" : "Sending message 114 : Hello world from hello-world2-654d89c79d-mbv4s! blah blah blah!",
```

There should be 302 such entries (the index count is correct at 302), but only the docs above are correct; the rest are null.

Hello folks - would someone be able to take a look/comment on the issue reported? Thanks in advance.

While it is a different version of OCP, it would be the same version of Kibana/ES. I'm wondering if this is the same as what is being seen here: https://bugzilla.redhat.com/show_bug.cgi?id=1829062#c12

What page were you on when you received the original error in Kibana?

Hey! The page is https://kibana-openshift-logging.cesar-oc-playground-1-9e37478581b5d9de33607f5926d1d18f-0000.us-south.stg.containers.appdomain.cloud/app/kibana#/discover?_g=()

When testing Kibana, what user is being used to log in? (Is this a predefined user like "kubeadmin"? If not, what sort of permissions does the user have?)
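The "rest are null" observation above can be spot-checked straight from the search response, without Kibana, by counting null `message` fields. A minimal sketch, using an inline stand-in for the pretty-printed `_search` output (in practice the input would be piped from the `curl` command shown earlier; the exact `"message" : null` layout is an assumption about the pretty-printer's output):

```shell
# Stand-in excerpt of a pretty-printed _search response; on a real node this
# would come from the curl against https://localhost:9200/<index>/_search?pretty=true
response='"message" : "Sending message 59 : Hello world from hello-world2-654d89c79d-lqt84! blah blah blah!",
"message" : null,
"message" : null,'

# Count documents whose message field was indexed as null:
printf '%s\n' "$response" | grep -c '"message" : null'   # prints 2
```

Comparing that count against the index's document total (302 in this report) shows how many entries were lost.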
I see this in the ES logs:

```
[2020-05-20T22:58:09,202][ERROR][i.f.e.p.OpenshiftAPIService] Error retrieving username from token
```

which would explain why there are missing user permissions. Also, I don't see any user Kibana indices in ES...

Tangentially, looking at the .kibana index, it seems an index pattern for "logstash-*" was created; this will not match any indices to be viewed (not seeing the expected index pattern is expected, given the above error message):
```
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : ".kibana",
        "_type" : "index-pattern",
        "_id" : "AXI3oSsJOb_Wj8MybJOU",
        "_score" : 1.0,
        "_source" : {
          "title" : "logstash-*",
          "notExpandable" : true
        }
      },
      {
        "_index" : ".kibana",
        "_type" : "config",
        "_id" : "5.6.16",
        "_score" : 1.0,
        "_source" : {
          "buildNum" : 15690,
          "defaultIndex" : "AXI3oSsJOb_Wj8MybJOU"
        }
      }
    ]
  }
}
```
So, turning the log level up, I'm not seeing the Kibana request get processed. I see the metrics endpoint and the other cluster members... for some reason we aren't getting a token for Kibana... Going to investigate why that is occurring tomorrow.

@Tyler I will keep this on needsinfo until you report back your investigation results.

@periklis In my cluster, which I created independently from Tyler's cluster on stage, I see exceptions in the initialization of Elasticsearch where it is not finding classes related to sgadmin, and a message that sgadmin needs to be run. For your question on who is bringing up the Kibana UI: both the cluster creation and the UI bring-up were by the same id (mine). I am not clearing the needinfo flag so that Tyler can add to my comments. Thanks.

Update with the logging team: We're still seeing protocol issues... Elasticsearch is only configured to use TLS v1.1 and v1.2 (not sure if that's currently what's causing an issue), but we confirmed that it's not anything related to the token we get back, because if I curl the k8s service with my bearer token I can hit my expected endpoint and get a valid response of who my user is... but our plugin isn't able to make that call. They are looking into why specifically the Kibana plugin is experiencing authentication issues. They will provide an update when they are able to root-cause why that is occurring.
Update with the logging team: We found the older version: `clusterlogging.4.3.16-202004240713`. You can look at and examine older versions with:

```
# Current versions seem to range from 57.0.0 (newest) to 1.0.0 (oldest)
curl https://quay.io/cnr/api/v1/packages/redhat-operators/cluster-logging/53.0.0
# (note the digest)
curl -XGET https://quay.io/cnr/api/v1/packages/redhat-operators/cluster-logging/blobs/sha256/41d7170cbca29fd933202053bfe525fcde7fd3546f64e31cc056f6eccfdede36 -o cluster-logging.tar.gz
# (the digest is plugged in after sha256/)
tar xvzf cluster-logging.tar.gz
```

Then you can examine the ClusterServiceVersion and its dependencies at that point in time. Tomorrow the team will look at the difference between the newer and older versions.

To note: Release 53.0.0 (clusterlogging.4.3.16-202004240713) was created on May 4th:

```
"created_at":"2020-05-04T11:58:01"
```

The next release was created May 12th:

```
2020-05-12T05
```

This is when I expect the regression might have first been introduced. I am not familiar with a way to deploy the old version yet, but I believe that version was working.

```
#! validate-crd: deploy/chart/templates/0000_30_02-clusterserviceversion.crd.yaml
#! parse-kind: ClusterServiceVersion
apiVersion: operators.coreos.com/v1alpha1
kind: ClusterServiceVersion
metadata:
# The version value is substituted by the ART pipeline
name: clusterlogging.4.3.16-202004240713
namespace: openshift-logging
annotations:
capabilities: Seamless Upgrades
categories: "OpenShift Optional, Logging & Tracing"
certified: "false"
description: |-
The Cluster Logging Operator for OKD provides a means for configuring and managing your aggregated logging stack.
containerImage: registry.redhat.io/openshift4/ose-cluster-logging-operator@sha256:648b96c77f8b0068bd8323a092cf06793ebd7566046a6ffb88af1d7fabadeaa3
createdAt: 2018-08-01T08:00:00Z
support: AOS Logging
# The version value is substituted by the ART pipeline
olm.skipRange: ">=4.1.0 <4.3.16-202004240713"
alm-examples: |-
[
{
"apiVersion": "logging.openshift.io/v1",
"kind": "ClusterLogging",
"metadata": {
"name": "instance",
"namespace": "openshift-logging"
},
"spec": {
"managementState": "Managed",
"logStore": {
"type": "elasticsearch",
"elasticsearch": {
"nodeCount": 3,
"redundancyPolicy": "SingleRedundancy",
"storage": {
"storageClassName": "gp2",
"size": "200G"
}
}
},
"visualization": {
"type": "kibana",
"kibana": {
"replicas": 1
}
},
"curation": {
"type": "curator",
"curator": {
"schedule": "30 3 * * *"
}
},
"collection": {
"logs": {
"type": "fluentd",
"fluentd": {}
}
}
}
},
{
"apiVersion": "logging.openshift.io/v1alpha1",
"kind": "LogForwarding",
"metadata": {
"name": "instance",
"namespace": "openshift-logging"
},
"spec": {
"outputs": [
{
"name": "clo-default-output-es",
"type": "elasticsearch",
"endpoint": "elasticsearch.openshift-logging.svc:9200",
"secret": {
"name": "elasticsearch"
}
}
],
"pipelines": [
{
"name": "clo-default-app-pipeline",
"inputSource": "logs.app",
"outputRefs": ["clo-managaged-output-es"]
},
{
"name": "clo-default-infra-pipeline",
"inputSource": "logs.app",
"outputRefs": ["clo-managaged-output-es"]
}
]
}
}
]
spec:
relatedImages:
- name: ose-cluster-logging-operator
image: registry.redhat.io/openshift4/ose-cluster-logging-operator@sha256:648b96c77f8b0068bd8323a092cf06793ebd7566046a6ffb88af1d7fabadeaa3
- name: ose-logging-curator5
image: registry.redhat.io/openshift4/ose-logging-curator5@sha256:da8943a7eacfd34ac8687ae607e11fb1ad1f538e4bdcae95f3ed70039be72f04
- name: ose-logging-elasticsearch5
image: registry.redhat.io/openshift4/ose-logging-elasticsearch5@sha256:f02e4f75617b706d9b8e2dc06777aa572a443ccc3dd604ce4c21667f55725435
- name: ose-logging-fluentd
image: registry.redhat.io/openshift4/ose-logging-fluentd@sha256:a43ba2606777a8b6e3a45443bac1ae697600731b34c2abb84e35624ed8ef0270
- name: ose-logging-kibana5
image: registry.redhat.io/openshift4/ose-logging-kibana5@sha256:8f3dc6d2e8c80fce660f65c3c7be1330d6a7b73d003998be8c333e993ccafc78
- name: ose-oauth-proxy
image: registry.redhat.io/openshift4/ose-oauth-proxy@sha256:5fc02d6d99203f2d437068315434b5ca926b992ec02e686ae8b47fbc5ddc89a1
- name: ose-promtail
image: registry.redhat.io/openshift4/ose-promtail@sha256:1264aa92ebc6cccf46da3a35fbb54421b806dda5640c7e9706e6e815d13f509d
# The version value is substituted by the ART pipeline
version: 4.3.16-202004240713
displayName: Cluster Logging
minKubeVersion: 1.16.0
description: |
# Cluster Logging
The Cluster Logging Operator orchestrates and manages the aggregated logging stack as a cluster-wide service.
## Features
* **Create/Destroy**: Launch and create an aggregated logging stack to support the entire OKD cluster.
* **Simplified Configuration**: Configure your aggregated logging cluster's structure like components and end points easily.
## Prerequisites and Requirements
### Cluster Logging Namespace
Cluster logging and the Cluster Logging Operator is only deployable to the **openshift-logging** namespace. This namespace
must be explicitly created by a cluster administrator (e.g. `oc create ns openshift-logging`). To enable metrics
service discovery add namespace label `openshift.io/cluster-monitoring: "true"`.
For additional installation documentation see [Deploying cluster logging](https://docs.openshift.com/container-platform/4.1/logging/efk-logging-deploying.html)
in the OpenShift product documentation.
### Elasticsearch Operator
The Elasticsearch Operator is responsible for orchestrating and managing cluster logging's Elasticsearch cluster. This
operator must be deployed to the global operator group namespace
### Memory Considerations
Elasticsearch is a memory intensive application. Cluster Logging will specify that each Elasticsearch node needs
16G of memory for both request and limit unless otherwise defined in the ClusterLogging custom resource. The initial
set of OKD nodes may not be large enough to support the Elasticsearch cluster. Additional OKD nodes must be added
to the OKD cluster if you desire to run with the recommended(or better) memory. Each ES node can operate with a
lower memory setting though this is not recommended for production deployments.
keywords: ['elasticsearch', 'kibana', 'fluentd', 'logging', 'aggregated', 'efk']
maintainers:
- name: Red Hat
email: aos-logging
provider:
name: Red Hat, Inc
links:
- name: Elastic
url: https://www.elastic.co/
- name: Fluentd
url: https://www.fluentd.org/
- name: Documentation
url: https://github.com/openshift/cluster-logging-operator/blob/master/README.md
- name: Cluster Logging Operator
url: https://github.com/openshift/cluster-logging-operator
installModes:
- type: OwnNamespace
supported: true
- type: SingleNamespace
supported: true
- type: MultiNamespace
supported: false
- type: AllNamespaces
supported: false
install:
strategy: deployment
spec:
permissions:
- serviceAccountName: cluster-logging-operator
rules:
- apiGroups:
- logging.openshift.io
resources:
- "*"
verbs:
- "*"
- apiGroups:
- ""
resources:
- pods
- services
- endpoints
- persistentvolumeclaims
- events
- configmaps
- secrets
- serviceaccounts
- serviceaccounts/finalizers
verbs:
- "*"
- apiGroups:
- apps
resources:
- deployments
- daemonsets
- replicasets
- statefulsets
verbs:
- "*"
- apiGroups:
- route.openshift.io
resources:
- routes
- routes/custom-host
verbs:
- "*"
- apiGroups:
- batch
resources:
- cronjobs
verbs:
- "*"
- apiGroups:
- rbac.authorization.k8s.io
resources:
- roles
- rolebindings
verbs:
- "*"
- apiGroups:
- security.openshift.io
resources:
- securitycontextconstraints
resourceNames:
- privileged
verbs:
- use
- apiGroups:
- monitoring.coreos.com
resources:
- servicemonitors
- prometheusrules
verbs:
- "*"
clusterPermissions:
- serviceAccountName: cluster-logging-operator
rules:
- apiGroups:
- console.openshift.io
resources:
- consoleexternalloglinks
verbs:
- "*"
- apiGroups:
- scheduling.k8s.io
resources:
- priorityclasses
verbs:
- "*"
- apiGroups:
- oauth.openshift.io
resources:
- oauthclients
verbs:
- "*"
- apiGroups:
- rbac.authorization.k8s.io
resources:
- clusterroles
- clusterrolebindings
verbs:
- "*"
- apiGroups:
- config.openshift.io
resources:
- proxies
verbs:
- get
- list
- watch
- apiGroups:
- ""
resources:
- pods
- namespaces
- services
- services/finalizers
verbs:
- get
- list
- watch
deployments:
- name: cluster-logging-operator
spec:
replicas: 1
selector:
matchLabels:
name: cluster-logging-operator
template:
metadata:
labels:
name: cluster-logging-operator
spec:
serviceAccountName: cluster-logging-operator
containers:
- name: cluster-logging-operator
image: registry.redhat.io/openshift4/ose-cluster-logging-operator@sha256:648b96c77f8b0068bd8323a092cf06793ebd7566046a6ffb88af1d7fabadeaa3
imagePullPolicy: IfNotPresent
command:
- cluster-logging-operator
env:
- name: WATCH_NAMESPACE
valueFrom:
fieldRef:
fieldPath: metadata.annotations['olm.targetNamespaces']
- name: POD_NAME
valueFrom:
fieldRef:
fieldPath: metadata.name
- name: OPERATOR_NAME
value: "cluster-logging-operator"
- name: ELASTICSEARCH_IMAGE
value: "registry.redhat.io/openshift4/ose-logging-elasticsearch5@sha256:f02e4f75617b706d9b8e2dc06777aa572a443ccc3dd604ce4c21667f55725435"
- name: FLUENTD_IMAGE
value: "registry.redhat.io/openshift4/ose-logging-fluentd@sha256:a43ba2606777a8b6e3a45443bac1ae697600731b34c2abb84e35624ed8ef0270"
- name: KIBANA_IMAGE
value: "registry.redhat.io/openshift4/ose-logging-kibana5@sha256:8f3dc6d2e8c80fce660f65c3c7be1330d6a7b73d003998be8c333e993ccafc78"
- name: CURATOR_IMAGE
value: "registry.redhat.io/openshift4/ose-logging-curator5@sha256:da8943a7eacfd34ac8687ae607e11fb1ad1f538e4bdcae95f3ed70039be72f04"
- name: OAUTH_PROXY_IMAGE
value: "registry.redhat.io/openshift4/ose-oauth-proxy@sha256:5fc02d6d99203f2d437068315434b5ca926b992ec02e686ae8b47fbc5ddc89a1"
- name: PROMTAIL_IMAGE
value: "registry.redhat.io/openshift4/ose-promtail@sha256:1264aa92ebc6cccf46da3a35fbb54421b806dda5640c7e9706e6e815d13f509d"
customresourcedefinitions:
owned:
- name: clusterloggings.logging.openshift.io
version: v1
kind: ClusterLogging
displayName: Cluster Logging
description: A Cluster Logging instance
resources:
- kind: Deployment
version: v1
- kind: DaemonSet
version: v1
- kind: CronJob
version: v1beta1
- kind: ReplicaSet
version: v1
- kind: Pod
version: v1
- kind: ConfigMap
version: v1
- kind: Secret
version: v1
- kind: Service
version: v1
- kind: Route
version: v1
- kind: Elasticsearch
version: v1
- kind: LogForwarding
version: v1alpha1
- kind: Collector
version: v1alpha1
specDescriptors:
- description: The desired number of Kibana Pods for the Visualization component
displayName: Kibana Size
path: visualization.kibana.replicas
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:podCount'
- description: Resource requirements for the Kibana pods
displayName: Kibana Resource Requirements
path: visualization.kibana.resources
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:resourceRequirements'
- description: The node selector to use for the Kibana Visualization component
displayName: Kibana Node Selector
path: visualization.kibana.nodeSelector
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:nodeSelector'
- description: The desired number of Elasticsearch Nodes for the Log Storage component
displayName: Elasticsearch Size
path: logStore.elasticsearch.nodeCount
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:podCount'
- description: Resource requirements for each Elasticsearch node
displayName: Elasticsearch Resource Requirements
path: logStore.elasticsearch.resources
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:resourceRequirements'
- description: The node selector to use for the Elasticsearch Log Storage component
displayName: Elasticsearch Node Selector
path: logStore.elasticsearch.nodeSelector
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:nodeSelector'
- description: Resource requirements for the Fluentd pods
displayName: Fluentd Resource Requirements
path: collection.logs.fluentd.resources
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:resourceRequirements'
- description: The node selector to use for the Fluentd log collection component
displayName: Fluentd node selector
path: collection.logs.fluentd.nodeSelector
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:nodeSelector'
- description: The list of output targets that receive log messages
displayName: Forwarding Outputs
path: forwarding.outputs
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:forwardingOutputs'
- description: The list of mappings between log sources (e.g. application logs) and forwarding outputs
displayName: Forwarding Pipelines
path: forwarding.pipelines
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:forwardingPipelines'
- description: Resource requirements for the Curator pods
displayName: Curator Resource Requirements
path: curation.curator.resources
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:resourceRequirements'
- description: The node selector to use for the Curator component
displayName: Curator Node Selector
path: curation.curator.nodeSelector
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:nodeSelector'
- description: The cron schedule for the Curator component
displayName: Curation Schedule
path: curation.curator.schedule
statusDescriptors:
- description: The status for each of the Kibana pods for the Visualization component
displayName: Kibana Status
path: visualization.kibanaStatus.pods
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:podStatuses'
- description: The status for each of the Elasticsearch Client pods for the Log Storage component
displayName: Elasticsearch Client Pod Status
path: logStore.elasticsearchStatus.pods.client
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:podStatuses'
- description: The status for each of the Elasticsearch Data pods for the Log Storage component
displayName: Elasticsearch Data Pod Status
path: logStore.elasticsearchStatus.pods.data
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:podStatuses'
- description: The status for each of the Elasticsearch Master pods for the Log Storage component
displayName: Elasticsearch Master Pod Status
path: logStore.elasticsearchStatus.pods.master
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:podStatuses'
- description: The cluster status for each of the Elasticsearch Clusters for the Log Storage component
displayName: Elasticsearch Cluster Health
path: logStore.elasticsearchStatus.clusterHealth
- description: The status for each of the Fluentd pods for the Log Collection component
displayName: Fluentd status
path: collection.logs.fluentdStatus.pods
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:podStatuses'
- description: The status for migration of a clusterlogging instance
displayName: Fluentd status
path: migration
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:migrationStatus'
- name: logforwardings.logging.openshift.io
version: v1alpha1
kind: LogForwarding
displayName: Log Forwarding
description: Log forwarding spec to define destinations for specific log sources
specDescriptors:
- description: The list of output targets that receive log messages
displayName: Forwarding Outputs
path: outputs
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:forwardingOutputs'
- description: The list of mappings between log sources (e.g. application logs) and forwarding outputs
displayName: Forwarding Pipelines
path: pipelines
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:forwardingPipelines'
statusDescriptors:
- description: The status of the sources being collected
displayName: Source Status
path: sources
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:sourceStatuses'
- description: The status of forwarding outputs
displayName: Outputs Status
path: outputs
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:outputStatuses'
- description: The status of forwarding pipelines
displayName: Pipelines Status
path: pipelines
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:pipelineStatuses'
- description: The status of log forwarding resourece
displayName: Log Forwarding Status
path: status
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:logforwardingStatuses'
- name: collectors.logging.openshift.io
version: v1alpha1
kind: Collector
displayName: Log Collector
description: Log Collector spec to define log collection
specDescriptors:
- description: The type of log collector
displayName: Collector type
path: type
x-descriptors:
- 'urn:alm:descriptor:com.tectonic.ui:collectorType'
```
That is the cluster service version for 4.3.16
Update: we can prove that the old one works by doing the following.

Ensure the olm-operator stays scaled down so the old cluster-logging-operator does not get destroyed:
```
watch -n 5 kubectl -n openshift-operator-lifecycle-manager scale --replicas 0 deploy olm-operator
```
Delete the existing ClusterLogging instance:
```
kubectl delete clusterlogging -n openshift-logging instance
```
Let everything clean up, then apply the older cluster-logging-operator Deployment:
```
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
annotations:
deployment.kubernetes.io/revision: "2"
labels:
name: cluster-logging-operator
namespace: openshift-logging
spec:
progressDeadlineSeconds: 600
replicas: 1
revisionHistoryLimit: 10
selector:
matchLabels:
name: cluster-logging-operator
strategy:
rollingUpdate:
maxSurge: 25%
maxUnavailable: 25%
type: RollingUpdate
template:
metadata:
annotations:
alm-examples: |-
[
{
"apiVersion": "logging.openshift.io/v1",
"kind": "ClusterLogging",
"metadata": {
"name": "instance",
"namespace": "openshift-logging"
},
"spec": {
"managementState": "Managed",
"logStore": {
"type": "elasticsearch",
"elasticsearch": {
"nodeCount": 3,
"redundancyPolicy": "SingleRedundancy",
"storage": {
"storageClassName": "gp2",
"size": "200G"
}
}
},
"visualization": {
"type": "kibana",
"kibana": {
"replicas": 1
}
},
"curation": {
"type": "curator",
"curator": {
"schedule": "30 3 * * *"
}
},
"collection": {
"logs": {
"type": "fluentd",
"fluentd": {}
}
}
}
},
{
"apiVersion": "logging.openshift.io/v1alpha1",
"kind": "LogForwarding",
"metadata": {
"name": "instance",
"namespace": "openshift-logging"
},
"spec": {
"outputs": [
{
"name": "clo-default-output-es",
"type": "elasticsearch",
"endpoint": "elasticsearch.openshift-logging.svc:9200",
"secret": {
"name": "elasticsearch"
}
}
],
"pipelines": [
{
"name": "clo-default-app-pipeline",
"inputSource": "logs.app",
"outputRefs": ["clo-managaged-output-es"]
},
{
"name": "clo-default-infra-pipeline",
"inputSource": "logs.app",
"outputRefs": ["clo-managaged-output-es"]
}
]
}
}
]
capabilities: Seamless Upgrades
categories: OpenShift Optional, Logging & Tracing
certified: "false"
containerImage: registry.redhat.io/openshift4/ose-cluster-logging-operator@sha256:2e08105b56f4f3d2f1842fdc13571720aa36754e64885ccf55987ba69a14a079
createdAt: "2018-08-01T08:00:00Z"
description: The Cluster Logging Operator for OKD provides a means for configuring
and managing your aggregated logging stack.
olm.operatorGroup: openshift-logging-92zpg
olm.operatorNamespace: openshift-logging
olm.skipRange: '>=4.1.0 <4.3.20-202005121847'
olm.targetNamespaces: openshift-logging
support: AOS Logging
labels:
name: cluster-logging-operator
spec:
containers:
- command:
- cluster-logging-operator
env:
- name: WATCH_NAMESPACE
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.annotations['olm.targetNamespaces']
- name: POD_NAME
valueFrom:
fieldRef:
apiVersion: v1
fieldPath: metadata.name
- name: OPERATOR_NAME
value: cluster-logging-operator
- name: ELASTICSEARCH_IMAGE
value: registry.redhat.io/openshift4/ose-logging-elasticsearch5@sha256:f02e4f75617b706d9b8e2dc06777aa572a443ccc3dd604ce4c21667f55725435
- name: FLUENTD_IMAGE
value: registry.redhat.io/openshift4/ose-logging-fluentd@sha256:a43ba2606777a8b6e3a45443bac1ae697600731b34c2abb84e35624ed8ef0270
- name: KIBANA_IMAGE
value: registry.redhat.io/openshift4/ose-logging-kibana5@sha256:8f3dc6d2e8c80fce660f65c3c7be1330d6a7b73d003998be8c333e993ccafc78
- name: CURATOR_IMAGE
value: registry.redhat.io/openshift4/ose-logging-curator5@sha256:da8943a7eacfd34ac8687ae607e11fb1ad1f538e4bdcae95f3ed70039be72f04
- name: OAUTH_PROXY_IMAGE
value: registry.redhat.io/openshift4/ose-oauth-proxy@sha256:5fc02d6d99203f2d437068315434b5ca926b992ec02e686ae8b47fbc5ddc89a1
- name: PROMTAIL_IMAGE
value: registry.redhat.io/openshift4/ose-promtail@sha256:1264aa92ebc6cccf46da3a35fbb54421b806dda5640c7e9706e6e815d13f509d
image: registry.redhat.io/openshift4/ose-cluster-logging-operator@sha256:648b96c77f8b0068bd8323a092cf06793ebd7566046a6ffb88af1d7fabadeaa3
imagePullPolicy: IfNotPresent
name: cluster-logging-operator
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
dnsPolicy: ClusterFirst
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: cluster-logging-operator
serviceAccountName: cluster-logging-operator
terminationGracePeriodSeconds: 30
```
Then create a new Logging instance and let it initialize. Once it initializes, logs start flowing again, so this likely points to a regression somewhere between the 4.3.16 code and the 4.3.19 code.
The logging team is investigating the code delta between the versions and trying to pinpoint the change that caused it.

Hi folks - any recent news on how this is going? Btw, I have another customer case reporting the same issue, following these steps: https://cloud.ibm.com/docs/openshift?topic=openshift-health#oc_logging_operator

```
[object Object]: [security_exception] no permissions for [indices:data/read/field_caps] and User [name=CN=system.logging.kibana,OU=OpenShift,O=Logging, roles=[]]
```

- This client said they were using CLO v4.3.20-202005121847 when they hit the error.
- They said they were able to get cluster logging working with CLO v4.2.29-202004140532.

Does this tally with your understanding? Also, just for my own knowledge, what are the steps to install a lower version of the CLO than the default version available from OperatorHub, and is this then an environment we would support? [if they move from the defaults] Thanks a lot.

Hi folks, any update? Thank you.

I believe this to be a session and cookie issue as observed:
https://bugzilla.redhat.com/show_bug.cgi?id=1791837#c29
https://bugzilla.redhat.com/show_bug.cgi?id=1791837#c30
Closing NOTABUG.

Reopening, given that the solutions in other BZs have not worked in this case. Clearing all cookies and using incognito mode still results in the same error in the Kibana UI.

Looked at this a little bit more and narrowed the issue down to the elasticsearch container. There is no difference in Elasticsearch version (or plugin jar versions, for that matter) between the one that works and the one that doesn't. However, the JVM is different.

The one that works: java-1.8.0-openjdk-1.8.0.242.b08-1.el7.x86_64
The one that doesn't work: java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64

Which may explain the error mentioned above:

```
[2020-05-18T03:32:16,379][ERROR][i.f.e.p.OpenshiftAPIService] Error retrieving username from token
okhttp3.internal.http2.StreamResetException: stream was reset: PROTOCOL_ERROR
```

in the elasticsearch log. This may need to be resolved by ART. Moving to UpcomingSprint.

Created openjdk issue: https://issues.redhat.com/browse/OPENJDK-114

Hello, this is from IBM Cloud Support. We still have the issue with one of our customers, using IBM Cloud ROKS 4.3.12_1520_openshift. It is said that the root cause is the JDK version that the elasticsearch container used (works: java-1.8.0-openjdk-1.8.0.242.b08-1.el7.x86_64; doesn't work: java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64; see https://issues.redhat.com/browse/OPENJDK-114), but the issue still is not resolved. Can you investigate further? Please help to push Red Hat to update the elasticsearch image. The elasticsearch image in ROKS 4.3 is using openjdk 1.8.0.252.

@Jeff Cantrill @Cesar Wong, could you investigate the issue based on the last comment? Let us know if you need any more information from us.

Putting this back to low, because it is not a blocker for 4.5. The relevant workarounds provided by [1] need to be reevaluated. Releases earlier than 4.5 use the Kibana 5 related to this issue.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1835396#c21

@Periklis - I don't think this is accurate. The errors are seen immediately upon accessing Kibana using the 4.3-based CLO. This isn't a scenario of logging in, waiting a while for a token to expire, and then trying to refresh the page or access the dashboard links again: it straight-up fails immediately. What I mean is that the linked workarounds in that comment do *not* work in this 4.3 case. I'm sure the statements on 4.5 are accurate. ;)

Is the full log available for that Elasticsearch error?
It sounds like TLS is being used for this? I wonder if you are seeing:
https://github.com/square/okhttp/issues/5970
https://github.com/square/okhttp/pull/5971/files

Created attachment 1699378 [details]
elasticsearch log

Jason, that looks like it could be this bug. I am attaching the logs from one of the Elasticsearch nodes from a 4.3.23 ROKS cluster.
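For anyone triaging a similar cluster: the two JVM builds called out earlier differ only in the upstream 8u build, which can be extracted from the package NVR. A minimal sketch (the `oc exec` line in the comment is an assumption about pod and container names, not a command from this bug):

```shell
# On a live cluster one would query the installed JDK with something like:
#   oc -n openshift-logging exec <elasticsearch-pod> -c elasticsearch -- \
#     rpm -q java-1.8.0-openjdk
# The NVRs quoted in this bug differ only in the upstream version/build:
works='java-1.8.0-openjdk-1.8.0.242.b08-1.el7.x86_64'
fails='java-1.8.0-openjdk-1.8.0.252.b09-2.el7_8.x86_64'
for nvr in "$works" "$fails"; do
  # Strip the package-name prefix and dist suffix, keeping "1.8.0.NNN.bXX"
  echo "$nvr" | sed -E 's/^java-1\.8\.0-openjdk-(1\.8\.0\.[0-9]+\.b[0-9]+).*/\1/'
done
```

If the pod reports 1.8.0.252.b09 (or a later 8u252 build), it is running the JVM associated with the HTTP/2 PROTOCOL_ERROR behavior above; 1.8.0.242.b08 predates it.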
The bug above has been addressed higher up in the stack, in the kubernetes-client: https://github.com/fabric8io/kubernetes-client/pull/2227

I can confirm that running with a version of the elasticsearch plugin that includes the fix above does fix the issue.

@Cesar Wong, do you have any way to upgrade the kubernetes-client in the IBM Cloud ROKS cluster v4.3?

Excellent. Glad to hear the update addresses the issue.

@tnakajo.com - I've now submitted a PR to bump the k8s client version in the elasticsearch plugin: https://github.com/fabric8io/openshift-elasticsearch-plugin/pull/190

What I did to test it was to build the plugin from that repo and then inject that build into a new image for elasticsearch. Once the PR above merges, the elasticsearch build will need to be updated to pull it into the image.

@Cesar Wong It seems the PR above has been merged. Can you proceed further?

Hello, this is from IBM Cloud Support. I have another customer reporting the same issue using IBM Cloud ROKS 4.3 OpenShift. Could you provide an ETA for the bug fix?

@tnakajo Fixing this issue seems to be only a matter of merging the PR and integrating it into the elasticsearch images. AFAICS, the fix targets 4.3.z and thus needs to land first in 4.4.z and then in 4.3.z. The streams open and close weekly, e.g. 4.4.z by tomorrow, so we need to wait at least for the next cycle.

Put "UpcomingSprint", because the fix will not land in this week's 4.3.z release; the next releases are earliest in the next sprint.

@Periklis Tsirakidis @Cesar Wong Can you tell us the current status? Will the fix be merged in the 4.3.z release next week? FYI: We have 4 customers waiting for the fix now. By the way, once the 4.3.z fix is ready, how does the user update on their cluster?

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2913

@Periklis Tsirakidis @Cesar Wong @Jeff Cantrill It seems the fix has been merged into 4.4.z. Is the fix scheduled to merge into 4.3.z in the next sprint?

The BZ chain tells me that the 4.3 fix is also merged. Have a look here: https://bugzilla.redhat.com/show_bug.cgi?id=1854997