Bug 1985826
| Summary: | The elasticsearch-im-xxx job failed when trying to start the index management process for a non-existent (empty-named) index | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | yhe |
| Component: | Logging | Assignee: | Gerard Vanloo <gvanloo> |
| Status: | CLOSED ERRATA | QA Contact: | Anping Li <anli> |
| Severity: | medium | Docs Contact: | |
| Priority: | medium | | |
| Version: | 4.6.z | CC: | ableisch, achakrat, afurbach, aharchin, aos-bugs, asheth, ausov, bmehra, ChetRHosey, chrzhang, cruhm, ctakahas, dahernan, dgautam, dkulkarn, gvanloo, ivan.wong, jaliang, jcall, jcantril, kiyyappa, mchebbi, mmarkand, morgan.peterman, mzali, ncarmich, ocasalsa, periklis, pkhaire, pwaghmod, sascha.gruen, yener.yumlu |
| Target Milestone: | --- | Keywords: | Reopened |
| Target Release: | 4.6.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | logging-exploration | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2022-01-27 08:26:04 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Facing a similar issue in OpenShift 4.7, as mentioned in the description.
Below are the logs of the failed Elasticsearch index management jobs:
$ omg logs elasticsearch-im-app-1628712900-8nsjt
++++
2021-08-11T20:15:19.515097868Z Index management delete process starting for {"code":500,"message":"Internal
2021-08-11T20:15:19.515097868Z
2021-08-11T20:15:19.524015986Z Received an empty response from elasticsearch -- server may not be ready
++++
$ omg logs elasticsearch-im-infra-1628748000-wv7gf
++++
2021-08-12T06:00:12.478110967Z Index management delete process starting for {"error":{"root_cause":[{"type":"security_exception","reason":"_opendistro_security_dls_query
2021-08-12T06:00:12.478178196Z
2021-08-12T06:00:12.488256306Z Received an empty response from elasticsearch -- server may not be ready
++++
If you need any more information, please let me know.
Regards,
Bobby Mehra
Red Hat.
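For reference, a minimal sketch of how one might collect these logs on a live cluster (the logs above came from a must-gather via `omg`; this assumes direct access to the openshift-logging namespace and uses only standard `oc` options):
~~~
# List index-management pods that ended in Error and dump their logs.
# Assumption: the cronjob pods are named elasticsearch-im-*.
for pod in $(oc -n openshift-logging get pods \
    --field-selector=status.phase==Failed \
    -o name | grep elasticsearch-im-); do
  echo "=== ${pod} ==="
  oc -n openshift-logging logs "${pod}"
done
~~~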
Facing a similar issue in OpenShift 4.7.9, using IPI on AWS.
Cluster Logging 5.1.0-75
OpenShift Elasticsearch Operator 5.1.0-96
Log of elasticsearch-im-infra-1629355500-2flpr
========================
Index management delete process starting for infra
No indices to delete
========================
Index management rollover process starting for infra
Current write index for infra-write: infra-000493
Checking results from _rollover call
Next write index for infra-write: infra-000494
Checking if infra-000494 exists
Checking if infra-000494 is the write index for infra-write
Done!
========================
Index management rollover process starting for {"error":{"root_cause":[{"type":"security_exception","reason":"_opendistro_security_dls_query
Received an empty response from elasticsearch -- server may not be ready
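Since the run ends with "Received an empty response from elasticsearch -- server may not be ready", a quick readiness check against the cluster the cronjob talks to can help rule out an actual outage. A sketch, assuming the `es_util` helper shipped in the OpenShift Elasticsearch image and the `component=elasticsearch` pod label:
~~~
# Pick one Elasticsearch pod and query cluster health from inside it.
# Assumptions: component=elasticsearch label, es_util helper in the image.
ES_POD=$(oc -n openshift-logging get pods -l component=elasticsearch \
    -o jsonpath='{.items[0].metadata.name}')
oc -n openshift-logging exec -c elasticsearch "${ES_POD}" -- \
    es_util --query="_cluster/health?pretty"
~~~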
Facing a similar issue in OpenShift 4.8.4.
Cluster Logging 5.1.0-75
OpenShift Elasticsearch Operator 5.1.0-96
**app**
~~~
2021-08-18T07:45:02.880651084Z ========================
2021-08-18T07:45:02.880651084Z Index management delete process starting for app
2021-08-18T07:45:02.880651084Z
2021-08-18T07:45:03.024691766Z No indices to delete
2021-08-18T07:45:03.079613499Z ========================
2021-08-18T07:45:03.079613499Z Index management rollover process starting for app
2021-08-18T07:45:03.079636900Z
2021-08-18T07:45:03.157619939Z Current write index for app-write: app-001391
2021-08-18T07:45:03.174729379Z Checking results from _rollover call
2021-08-18T07:45:03.174785694Z Calculating next write index based on current write index...
2021-08-18T07:45:03.185094156Z Next write index for app-write: app-001392
2021-08-18T07:45:03.185094156Z Checking if app-001392 exists
2021-08-18T07:45:03.446187155Z {"error":{"root_cause":[{"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"app-001392","index_uuid":"_na_","index":"app-001392"}],"type":"index_not_found_exception","reason":"no such index","resource.type":"index_or_alias","resource.id":"app-001392","index_uuid":"_na_","index":"app-001392"},"status":404}
~~~
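To see whether app-001392 is genuinely missing or whether the 404 is just a symptom of a truncated response, the write alias and indices can be inspected directly. A sketch under the same assumptions as above (`es_util` helper, `component=elasticsearch` label), using the standard Elasticsearch `_alias` and `_cat/indices` APIs:
~~~
# Show what app-write currently resolves to and which app-* indices exist.
ES_POD=$(oc -n openshift-logging get pods -l component=elasticsearch \
    -o jsonpath='{.items[0].metadata.name}')
oc -n openshift-logging exec -c elasticsearch "${ES_POD}" -- \
    es_util --query="_alias/app-write?pretty"
oc -n openshift-logging exec -c elasticsearch "${ES_POD}" -- \
    es_util --query="_cat/indices/app-*?v"
~~~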
------------------------------------------------
**audit**
~~~
2021-08-18T11:15:03.170981328Z ========================
2021-08-18T11:15:03.170981328Z Index management delete process starting for Received
2021-08-18T11:15:03.170981328Z
2021-08-18T11:15:03.273149997Z Error while attemping to determine the active write alias: {'error': 'alias [Received-write] missing', 'status': 404}
2021-08-18T11:15:03.356975699Z ========================
2021-08-18T11:15:03.356975699Z Index management rollover process starting for audit
2021-08-18T11:15:03.356975699Z
2021-08-18T11:15:03.428212646Z Current write index for audit-write: audit-001393
2021-08-18T11:15:04.067360313Z Checking results from _rollover call
2021-08-18T11:15:04.098107464Z Next write index for audit-write: audit-001394
2021-08-18T11:15:04.098107464Z Checking if audit-001394 exists
2021-08-18T11:15:04.136677540Z Checking if audit-001394 is the write index for audit-write
2021-08-18T11:15:04.203876863Z Done!
~~~
-------------------------------------------------
**infra**
~~~
2021-08-18T10:30:03.390222987Z ========================
2021-08-18T10:30:03.390222987Z Index management delete process starting for infra
2021-08-18T10:30:03.390222987Z
2021-08-18T10:30:03.408917414Z Received an empty response from elasticsearch -- server may not be ready
~~~
Pods status:
~~~
$ oc -n openshift-logging get po --field-selector=status.phase==Failed
NAME READY STATUS RESTARTS AGE
elasticsearch-im-app-27155985-xxxx 0/1 Error 0 14m
elasticsearch-im-audit-27155820-xxxx 0/1 Error 0 179m
elasticsearch-im-infra-27155310-xxxx 0/1 Error 0 11h
~~~
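If the failed pods are only noise, they can be cleared and one of the cronjobs triggered by hand to see whether the next run completes. A sketch only; this re-runs the job, it does not fix the underlying problem, and the manual job name is arbitrary:
~~~
# Remove the failed runs, then kick off the app index-management
# cronjob once on demand.
oc -n openshift-logging delete pods --field-selector=status.phase==Failed
oc -n openshift-logging create job --from=cronjob/elasticsearch-im-app \
    elasticsearch-im-app-manual
~~~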
Seeing the same issue in 4.7.21:
# oc logs elasticsearch-im-audit-1631027700-8gqhd
========================
Index management delete process starting for audit
No indices to delete
========================
Index management delete process starting for {"error":{"root_cause":[{"type":"security_exception","reason":"_opendistro_security_dls_query
Received an empty response from elasticsearch -- server may not be ready
========================
Index management rollover process starting for audit
Current write index for audit-write: audit-000011
Checking results from _rollover call
Next write index for audit-write: audit-000011
Checking if audit-000011 exists
Checking if audit-000011 is the write index for audit-write
Done!
# oc get csv
NAME DISPLAY VERSION REPLACES PHASE
cluster-logging.5.2.0-57 Red Hat OpenShift Logging 5.2.0-57 cluster-logging.5.1.0-75 Succeeded
elasticsearch-operator.5.2.0-58 OpenShift Elasticsearch Operator 5.2.0-58 elasticsearch-operator.5.1.0-96 Succeeded
Facing a similar issue in 4.7.24.
Cluster Logging 5.1.0-75
OpenShift Elasticsearch Operator 5.1.0-96
$ oc logs elasticsearch-im-audit-1631232000-8mx4p
========================
Index management delete process starting for audit
No indices to delete
========================
Index management delete process starting for {"error":{"root_cause":[{"type":"security_exception","reason":"_opendistro_security_dls_query
Received an empty response from elasticsearch -- server may not be ready
========================
Index management rollover process starting for audit
Current write index for audit-write: audit-000590
Checking results from _rollover call
Next write index for audit-write: audit-000590
Checking if audit-000590 exists
Checking if audit-000590 is the write index for audit-write
Done!
Regards,
Yener Yumlu
Still facing the issue; it can be reproduced with the latest Cluster Logging version and the corresponding latest OpenShift Elasticsearch Operator.

*** Bug 2000309 has been marked as a duplicate of this bug. ***

Verified on csv: elasticsearch-operator.4.6.0-202201111633.
image: openshift/ose-elasticsearch-proxy/images/v4.6.0-202201111633.p0.g12d80b2.assembly.stream

Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.6.54 extras and security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2022:0181
Description of problem:

The elasticsearch-im-xxx job failed when trying to start the index management process for a non-existent (empty-named) index. The following can be seen in the logs of the elasticsearch-im-xxx-xxxxx pods that are in Error status.

2021-07-25T05:15:12.824232244Z Index management delete process starting for {"error":{"root_cause":[{"type":"security_exception","reason":"_opendistro_security_dls_query
2021-07-25T05:15:12.824232244Z
2021-07-25T05:15:12.839052478Z Received an empty response from elasticsearch -- server may not be ready

And also:

2021-07-25T20:45:15.629014714Z Index management rollover process starting for {"error":{"root_cause":[{"type":"security_exception","reason":"_opendistro_security_dls_query
2021-07-25T20:45:15.629014714Z
2021-07-25T20:45:15.640236263Z Received an empty response from elasticsearch -- server may not be ready

while the normal logs should look like:

2021-07-25T05:15:12.518377608Z Index management delete process starting for infra
2021-07-25T05:15:12.518377608Z
2021-07-25T05:15:12.714427228Z deleting indices: infra-003321
2021-07-25T05:15:12.817929576Z Done!
2021-07-25T05:15:12.892345720Z Index management rollover process starting for infra
2021-07-25T05:15:12.892345720Z
2021-07-25T05:15:12.978155500Z Current write index for infra-write: infra-003341
2021-07-25T05:15:13.029990544Z Checking results from _rollover call
2021-07-25T05:15:13.070198489Z Next write index for infra-write: infra-003341
2021-07-25T05:15:13.070198489Z Checking if infra-003341 exists
2021-07-25T05:15:13.116276007Z Checking if infra-003341 is the write index for infra-write
2021-07-25T05:15:13.194983109Z Done!

Will upload other related information later.

Version-Release number of selected component (if applicable):

How reproducible:
Sometimes

Steps to Reproduce:
1. Check the elasticsearch-im-xxx job's status
2. Find the failed elasticsearch-im-xxx job
3. Check the logs of the elasticsearch-im-xxx-xxxxx pod (see the sketch below)

Actual results:
The above error can be seen in the logs

Expected results:
The above error does not happen and the job completes correctly

Additional info:
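The reproduction steps map roughly onto the following commands (a sketch; the pod name in step 3 is a placeholder):
~~~
# 1. Check the status of the index-management jobs
oc -n openshift-logging get jobs | grep elasticsearch-im-
# 2. Find the failed job pods
oc -n openshift-logging get pods --field-selector=status.phase==Failed
# 3. Check the logs of a failed pod (example name, substitute your own)
oc -n openshift-logging logs elasticsearch-im-app-xxxxx-xxxxx
~~~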