Created attachment 1697587 [details]
Elasticsearch plugin is red on kibana dashboard

Description of problem:

Followed the instructions at https://docs.openshift.com/container-platform/4.3/logging/cluster-logging-deploying.html to install the additional logging and elasticsearch operators on OCP 4.3.18. The Kibana dashboard opens, but the elasticsearch plugin is red, as shown in the attached file.

Followed the instructions at https://docs.openshift.com/container-platform/4.3/logging/config/cluster-logging-elasticsearch.html to create and expose the elasticsearch route.

```
[root@arc-es-ec43-bastion elk]# oc get pods -n openshift-logging
NAME                                            READY   STATUS    RESTARTS   AGE
cluster-logging-operator-7fdc89799c-rk4b7       1/1     Running   0          4m17s
elasticsearch-cdm-ak8cqf3b-1-7b8746c755-hlxjm   1/2     Running   0          55s
elasticsearch-cdm-ak8cqf3b-2-cdbb8ff65-pr2c2    1/2     Running   0          21s
elasticsearch-cdm-ak8cqf3b-3-558b97b6c-242pz    1/2     Running   0          18s
fluentd-5dlbd                                   1/1     Running   0          46s
fluentd-f9qlw                                   1/1     Running   0          46s
fluentd-g2bd9                                   1/1     Running   0          45s
fluentd-vl8gl                                   1/1     Running   0          47s
fluentd-wdmvw                                   1/1     Running   0          51s
kibana-77d496d75d-8lcdr                         2/2     Running   0          54s
```

```
[root@arc-es-ec43-bastion elk]# oc get service elasticsearch
NAME            TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
elasticsearch   ClusterIP   172.30.108.6   <none>        9200/TCP   97s

[root@arc-es-ec43-bastion elk]# oc get route elasticsearch -o jsonpath={.spec.host}
elasticsearch-openshift-logging.apps.arc-es-ec43.redhat.com

[root@arc-es-ec43-bastion elk]# curl -tlsv1.2 --insecure "https://elasticsearch-openshift-logging.apps.arc-es-ec43.redhat.com/.operations.*/_search?size=1" | jq
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  3131    0  3131    0     0   203k      0 --:--:-- --:--:-- --:--:--  203k
parse error: Invalid numeric literal at line 2, column 0
```

```
[root@arc-es-ec43-bastion elk]# oc get routes --all-namespaces
NAMESPACE                  NAME                HOST/PORT                                                             PATH   SERVICES            PORT    TERMINATION           WILDCARD
openshift-authentication   oauth-openshift     oauth-openshift.apps.arc-es-ec43.redhat.com                                  oauth-openshift     6443    passthrough/Redirect   None
openshift-console          console             console-openshift-console.apps.arc-es-ec43.redhat.com                        console             https   reencrypt/Redirect     None
openshift-console          downloads           downloads-openshift-console.apps.arc-es-ec43.redhat.com                      downloads           http    edge/Redirect          None
openshift-logging          elasticsearch       elasticsearch-openshift-logging.apps.arc-es-ec43.redhat.com                  elasticsearch       <all>   reencrypt              None
openshift-logging          kibana              kibana-openshift-logging.apps.arc-es-ec43.redhat.com                         kibana              <all>   reencrypt/Redirect     None
openshift-monitoring       alertmanager-main   alertmanager-main-openshift-monitoring.apps.arc-es-ec43.redhat.com           alertmanager-main   web     reencrypt/Redirect     None
openshift-monitoring       grafana             grafana-openshift-monitoring.apps.arc-es-ec43.redhat.com                     grafana             https   reencrypt/Redirect     None
openshift-monitoring       prometheus-k8s      prometheus-k8s-openshift-monitoring.apps.arc-es-ec43.redhat.com              prometheus-k8s      web     reencrypt/Redirect     None
openshift-monitoring       thanos-querier      thanos-querier-openshift-monitoring.apps.arc-es-ec43.redhat.com              thanos-querier      web     reencrypt/Redirect     None
```

The elasticsearch service is created, but I don't see any endpoints (pod IP addresses) referenced for it!

```
[root@arc-es-ec43-bastion ~]# oc describe svc elasticsearch -n openshift-logging
Name:              elasticsearch
Namespace:         openshift-logging
Labels:            cluster-name=elasticsearch
Annotations:       <none>
Selector:          cluster-name=elasticsearch,es-node-client=true
Type:              ClusterIP
IP:                172.30.108.6
Port:              elasticsearch  9200/TCP
TargetPort:        restapi/TCP
Endpoints:
Session Affinity:  None
Events:            <none>

[root@arc-es-ec43-bastion ~]# oc get ep
NAME                    ENDPOINTS                                                           AGE
elasticsearch                                                                               14d
elasticsearch-cluster   10.130.0.5:9300,10.130.0.6:9300,10.131.0.18:9300                    14d
elasticsearch-metrics                                                                       14d
fluentd                 10.128.0.77:24231,10.128.2.72:24231,10.129.0.74:24231 + 2 more...   14d
kibana                  10.131.1.0:3000                                                     14d
```

though the get pods command is showing proper IP addresses!

```
[root@arc-es-ec43-bastion ~]# oc get pods -n openshift-logging -l cluster-name=elasticsearch,es-node-client=true -o=wide
NAME                                            READY   STATUS    RESTARTS   AGE   IP            NODE                              NOMINATED NODE   READINESS GATES
elasticsearch-cdm-ak8cqf3b-1-7b8746c755-6cqkc   1/2     Running   1          8d    10.130.0.5    worker-0.arc-es-ec43.redhat.com   <none>           <none>
elasticsearch-cdm-ak8cqf3b-2-cdbb8ff65-xw6ks    1/2     Running   3          8d    10.130.0.6    worker-0.arc-es-ec43.redhat.com   <none>           <none>
elasticsearch-cdm-ak8cqf3b-3-558b97b6c-fqwt9    1/2     Running   5          8d    10.131.0.18   worker-1.arc-es-ec43.redhat.com   <none>           <none>
```

Version-Release number of selected component (if applicable):
4.3.18

How reproducible:
On all 4.3.18 GA'd builds of OCP on POWER

Steps to Reproduce:
1. Install OCP 4.3.18 on a PowerVM server.
2. Follow the documentation steps at https://docs.openshift.com/container-platform/4.3/logging/cluster-logging-deploying.html to enable the logging and elasticsearch operators.
3. The Kibana dashboard opens, but the ELK plugin is red, as shown in the attached file.

Actual results:
After enabling the elasticsearch and logging operators on an OCP 4.3 cluster, the Kibana dashboard is accessible but the elasticsearch plugin shows up in a bad state. The elasticsearch route has been created and added to the local system's /etc/hosts.

Expected results:

Additional info:
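The empty Endpoints for the elasticsearch service is consistent with the 1/2 READY pods above: Kubernetes adds a pod to a Service's Endpoints only when it matches the selector AND all of its containers pass their readiness probes. A minimal Python sketch of that selection logic (hypothetical data modeled on the output above, not the actual endpoints controller code):

```python
def endpoint_ips(selector, pods):
    """Mimic the endpoints controller: a pod backs a Service only if it
    carries every selector label AND is Ready (all containers ready)."""
    return [
        p["ip"]
        for p in pods
        if all(p["labels"].get(k) == v for k, v in selector.items()) and p["ready"]
    ]

# Selector copied from `oc describe svc elasticsearch` above.
selector = {"cluster-name": "elasticsearch", "es-node-client": "true"}

# The three ES pods match the selector but report 1/2 READY -> not ready.
pods = [
    {"ip": "10.130.0.5",  "labels": dict(selector), "ready": False},
    {"ip": "10.130.0.6",  "labels": dict(selector), "ready": False},
    {"ip": "10.131.0.18", "labels": dict(selector), "ready": False},
]

print(endpoint_ips(selector, pods))  # -> [] : matches the empty Endpoints field
```

So the pods having IPs is not a contradiction: until the elasticsearch container becomes Ready, the service legitimately has no endpoints to forward to.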
Address resolution is happening properly: curl run from the kibana pod resolves the svc address to 172.30.108.6, but it cannot connect because the traffic is not forwarded down to the pod's port.

```
[root@arc-es-ec43-bastion ~]# oc debug pod/kibana-68d8f7694d-wrn6x -n openshift-logging
Defaulting container name to kibana.
Use 'oc describe pod/kibana-68d8f7694d-wrn6x-debug -n openshift-logging' to see all of the containers in this pod.
Starting pod/kibana-68d8f7694d-wrn6x-debug ...
Pod IP: 10.130.0.80
If you don't see a command prompt, try pressing enter.
sh-4.2$ curl -v https://elasticsearch.openshift-logging.svc.cluster.local:9300
^C
sh-4.2$ curl -v https://elasticsearch.openshift-logging.svc.cluster.local:9200
* About to connect() to elasticsearch.openshift-logging.svc.cluster.local port 9200 (#0)
*   Trying 172.30.108.6...
```
I see the following messages in the ES pods:

```
[root@arc-es-ec43-bastion ~]# oc logs elasticsearch-cdm-ak8cqf3b-2-cdbb8ff65-xw6ks -c elasticsearch
[2020-05-31 11:12:46,140][INFO ][container.run            ] Begin Elasticsearch startup script
[2020-05-31 11:12:46,256][INFO ][container.run            ] Comparing the specified RAM to the maximum recommended for Elasticsearch...
[2020-05-31 11:12:46,261][INFO ][container.run            ] Inspecting the maximum RAM available...
[2020-05-31 11:12:46,285][INFO ][container.run            ] ES_JAVA_OPTS: ' -Xms8192m -Xmx8192m'
[2020-05-31 11:12:46,287][INFO ][container.run            ] Copying certs from /etc/openshift/elasticsearch/secret to /etc/elasticsearch/secret
[2020-05-31 11:12:46,729][INFO ][container.run            ] Building required jks files and truststore
Importing keystore /etc/elasticsearch/secret/admin.p12 to /etc/elasticsearch/secret/admin.jks...
Entry for alias 1 successfully imported.
Import command completed: 1 entries successfully imported, 0 entries failed or cancelled
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/admin.jks -destkeystore /etc/elasticsearch/secret/admin.jks -deststoretype pkcs12".
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/admin.jks -destkeystore /etc/elasticsearch/secret/admin.jks -deststoretype pkcs12".
Certificate was added to keystore
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/admin.jks -destkeystore /etc/elasticsearch/secret/admin.jks -deststoretype pkcs12".
Importing keystore /etc/elasticsearch/secret/elasticsearch.p12 to /etc/elasticsearch/secret/elasticsearch.jks...
Entry for alias 1 successfully imported.
Import command completed: 1 entries successfully imported, 0 entries failed or cancelled
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/elasticsearch.jks -destkeystore /etc/elasticsearch/secret/elasticsearch.jks -deststoretype pkcs12".
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/elasticsearch.jks -destkeystore /etc/elasticsearch/secret/elasticsearch.jks -deststoretype pkcs12".
Certificate was added to keystore
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/elasticsearch.jks -destkeystore /etc/elasticsearch/secret/elasticsearch.jks -deststoretype pkcs12".
Importing keystore /etc/elasticsearch/secret/logging-es.p12 to /etc/elasticsearch/secret/logging-es.jks...
Entry for alias 1 successfully imported.
Import command completed: 1 entries successfully imported, 0 entries failed or cancelled
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/logging-es.jks -destkeystore /etc/elasticsearch/secret/logging-es.jks -deststoretype pkcs12".
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/logging-es.jks -destkeystore /etc/elasticsearch/secret/logging-es.jks -deststoretype pkcs12".
Certificate was added to keystore
Warning: The JKS keystore uses a proprietary format. It is recommended to migrate to PKCS12 which is an industry standard format using "keytool -importkeystore -srckeystore /etc/elasticsearch/secret/logging-es.jks -destkeystore /etc/elasticsearch/secret/logging-es.jks -deststoretype pkcs12".
Certificate was added to keystore
Certificate was added to keystore
[2020-05-31 11:13:05,012][INFO ][container.run            ] Setting heap dump location /elasticsearch/persistent/heapdump.hprof
[2020-05-31 11:13:05,016][INFO ][container.run            ] ES_JAVA_OPTS: ' -Xms8192m -Xmx8192m -XX:HeapDumpPath=/elasticsearch/persistent/heapdump.hprof -Dsg.display_lic_none=false -Dio.netty.recycler.maxCapacityPerThread=0 -Dio.netty.allocator.type=unpooled'
[2020-05-31 11:13:05,090][INFO ][container.run            ] Checking if Elasticsearch is ready
[2020-05-31 11:18:10,916][ERROR][container.run            ] Timed out waiting for Elasticsearch to be ready
cat: elasticsearch_connect_log.txt: No such file or directory
```

The readinessProbe is failing and ES is also not running properly!

```
[root@arc-es-ec43-bastion ~]# oc exec -ti elasticsearch-cdm-ak8cqf3b-1-7b8746c755-6cqkc -n openshift-logging sh
Defaulting container name to elasticsearch.
Use 'oc describe pod/elasticsearch-cdm-ak8cqf3b-1-7b8746c755-6cqkc -n openshift-logging' to see all of the containers in this pod.
sh-4.2$ es_util --query=/_cat/indices?v -v
* About to connect() to localhost port 9200 (#0)
*   Trying ::1...
* Connection refused
*   Trying 127.0.0.1...
* Connection refused
* Failed connect to localhost:9200; Connection refused
* Closing connection 0
sh-4.2$ /usr/share/elasticsearch/probe/readiness.sh
Elasticsearch node is not ready to accept HTTP requests yet [response code: 000]
```
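The probe's `[response code: 000]` is curl's placeholder for "no HTTP response received at all" (connection refused or timed out), which lines up with the `Connection refused` output from es_util above. A rough Python rendition of that kind of readiness check (an illustrative sketch of the behavior, not the actual readiness.sh script):

```python
import socket
import urllib.error
import urllib.request

def probe(url, timeout=2.0):
    """Return the HTTP status code as a string, or '000' when no HTTP
    response is received at all (connection refused, timeout, etc.),
    mirroring curl's %{http_code} behavior."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return str(resp.status)
    except urllib.error.HTTPError as e:
        return str(e.code)   # got an HTTP response, just an error status
    except (urllib.error.URLError, OSError):
        return "000"         # nothing listening: same code the failing probe reports

# Find a local port that is definitely closed, then probe it: with no
# listener the probe yields "000", just like the failing ES readiness probe.
s = socket.socket()
s.bind(("127.0.0.1", 0))
closed_port = s.getsockname()[1]
s.close()
print(probe(f"http://127.0.0.1:{closed_port}"))  # -> 000
```

In other words, the probe never even reaches the HTTP layer: the Elasticsearch process inside the container is not listening on 9200, so the container stays unready and the pod reports 1/2.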
I noticed that the namespace you used was not the one "recommended" by the referenced documentation. Do you think it's possible something was misconfigured by using a different namespace than the one the instructions consistently use (i.e. openshift-logging instead of openshift-operators-redhat)?
BZ #1807201 is a WIP fix for Elasticsearch functionality that could potentially block this bug. I am linking 1807201 to this bug.
Encountered this issue on OCP 4.4.9 on Power:

```
# oc version
Client Version: 4.4.9
Server Version: 4.4.9
Kubernetes Version: v1.17.1+912792b

# oc get subscription
NAME              PACKAGE           SOURCE             CHANNEL
cluster-logging   cluster-logging   redhat-operators   4.4

# oc get csv
NAME                                           DISPLAY                  VERSION                 REPLACES   PHASE
clusterlogging.4.4.0-202006211643.p0           Cluster Logging          4.4.0-202006211643.p0              Succeeded
elasticsearch-operator.4.4.0-202006211643.p0   Elasticsearch Operator   4.4.0-202006211643.p0              Succeeded

# oc get pods -n openshift-logging
NAME                                            READY   STATUS    RESTARTS   AGE
cluster-logging-operator-74c9cf49bc-vk4cj       1/1     Running   0          29m
elasticsearch-cdm-00ygk06t-1-64d6b99f4b-dgb6v   1/2     Running   0          28m
elasticsearch-cdm-00ygk06t-2-56cf7dffc-dhgp9    1/2     Running   0          28m
elasticsearch-cdm-00ygk06t-3-664f84cdd6-lntdz   1/2     Running   0          28m
fluentd-6t6dw                                   1/1     Running   0          29m
fluentd-8vdkm                                   1/1     Running   0          29m
fluentd-jbbdd                                   1/1     Running   0          29m
fluentd-lrf6d                                   1/1     Running   0          29m
fluentd-n87zr                                   1/1     Running   0          29m
fluentd-tpgrf                                   1/1     Running   0          29m
fluentd-v4tv5                                   1/1     Running   0          29m
kibana-855d757cbd-swg79                         2/2     Running   2          29m

Events:
  Type     Reason     Age                   From                                      Message
  ----     ------     ----                  ----                                      -------
  Normal   Scheduled  29m                   default-scheduler                         Successfully assigned openshift-logging/elasticsearch-cdm-00ygk06t-2-56cf7dffc-dhgp9 to worker-1.test-4604.example.com
  Normal   Pulled     29m                   kubelet, worker-1.test-4604.example.com   Container image "registry.redhat.io/openshift4/ose-logging-elasticsearch5@sha256:2940a8ce2837ee02afcf5de485d8e7eb3584c9ce56602c2625efe5603d53a1b2" already present on machine
  Normal   Created    29m                   kubelet, worker-1.test-4604.example.com   Created container elasticsearch
  Normal   Started    29m                   kubelet, worker-1.test-4604.example.com   Started container elasticsearch
  Normal   Pulled     29m                   kubelet, worker-1.test-4604.example.com   Container image "registry.redhat.io/openshift4/ose-oauth-proxy@sha256:03289a1d986efec545ac68c9bd8839a3a1ef0ad4c5a082d4c392cd27c5143b21" already present on machine
  Normal   Created    29m                   kubelet, worker-1.test-4604.example.com   Created container proxy
  Normal   Started    29m                   kubelet, worker-1.test-4604.example.com   Started container proxy
  Warning  Unhealthy  4m2s (x299 over 28m)  kubelet, worker-1.test-4604.example.com   Readiness probe failed: Elasticsearch node is not ready to accept HTTP requests yet [response code: 000]
```
Bug #1855072 is still WIP; therefore, this bug could potentially be blocked by it.
The blocking bug 1855072 is still WIP. Therefore, it is unlikely that the fix for this bug will be in the current sprint before August 1. Adding UpcomingSprint tag
Setting the Target Release to match the blocking bug (4.3.z)
Hi Archana, could you re-test this bug to confirm that it is still an issue in 4.3? The blocking bug has been resolved.
This has been re-tested and the issue seems to have disappeared. However, afaik, we never saw any seccomp errors. I'll let Archana correct me if I'm mistaken. *I* didn't see any of those when I looked at her setup. :)
We tested this on the 4.3.32 build 4.3.0-0.nightly-ppc64le-2020-07-25-094111. It is working fine, and no other issues were seen. We can close this bug.
```
# oc get clusterversion
NAME      VERSION                                     AVAILABLE   PROGRESSING   SINCE   STATUS
version   4.3.0-0.nightly-ppc64le-2020-07-25-094111   True        False         12h     Cluster version is 4.3.0-0.nightly-ppc64le-2020-07-25-094111

# oc get csv
NAME                                            DISPLAY                  VERSION                  REPLACES   PHASE
clusterlogging.4.3.31-202007272153.p0           Cluster Logging          4.3.31-202007272153.p0              Succeeded
elasticsearch-operator.4.3.31-202007272153.p0   Elasticsearch Operator   4.3.31-202007272153.p0              Succeeded
```
Closing per feedback.