Bug 1916910
Summary: | Sometimes the elasticsearch-delete-xxx job failed at "Unexpected exception indices:admin/aliases/get" | |||
---|---|---|---|---|
Product: | OpenShift Container Platform | Reporter: | Jeff Cantrill <jcantril> | |
Component: | Logging | Assignee: | Jeff Cantrill <jcantril> | |
Status: | CLOSED ERRATA | QA Contact: | Giriyamma <gkarager> | |
Severity: | high | Docs Contact: | ||
Priority: | high | |||
Version: | 4.6 | CC: | akhaire, alchan, andcosta, anisal, anli, aos-bugs, apaladug, ChetRHosey, cruhm, dageoffr, dahernan, dkulkarn, dseals, jcantril, juherrer, kiyyappa, ksathe, luaparicio, lvlcek, mrdest, mrobson, naoto30, naygupta, nnosenzo, ocasalsa, periklis, prdeshpa, qitang, rkant, ronald.rademaker, sauchter, shishika, sreber, ssadhale, ssonigra, tmicheli, vhernand, vjaypurk, xingli, ykarajag | |
Target Milestone: | --- | |||
Target Release: | 4.6.z | |||
Hardware: | Unspecified | |||
OS: | Unspecified | |||
Whiteboard: | logging-exploration | |||
Fixed In Version: | Doc Type: | Bug Fix | ||
Doc Text: |
Collapses the multiple policy cronjobs into a single job that runs multiple tasks:
delete
rollover
The reasoning is that there is a potential race condition between the previous jobs, which both rely on a -write alias, and this may lead to false information. Additionally, ES does not have transactions and is not ACID-compliant. By converting these into tasks that we execute for index management, we:
potentially free disk space for ES to do additional work
give the rollover a better chance of succeeding
|
Story Points: | --- | |
Clone Of: | 1890838 | |||
: | 1928772 | Environment: | ||
Last Closed: | 2021-02-08 13:41:45 UTC | Type: | --- | |
Bug Depends On: | 1890838 | |||
Bug Blocks: | 1919075, 1928772 |
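The Doc Text above describes collapsing the separate policy cronjobs into one job that runs delete and then rollover. A minimal sketch of that sequencing, written in Python rather than as the operator's actual scripts, might look like the following. The function name run_index_management, the task callables, and the choice to still attempt rollover when delete fails are all assumptions made for illustration; none of them are taken from the fix itself.

```python
from typing import Callable, List, Tuple

Task = Tuple[str, Callable[[], None]]


def run_index_management(tasks: List[Task]) -> List[Tuple[str, bool]]:
    """Run index management tasks one after another inside a single job.

    Running the tasks sequentially means delete and rollover never query
    the -write alias at the same time, which is the race described above.
    Assumption: a failed task is recorded and the next task still runs.
    """
    results = []
    for name, task in tasks:
        try:
            task()
            results.append((name, True))
        except Exception as exc:  # report the failure, keep going
            print(f"{name} failed: {exc}")
            results.append((name, False))
    return results


if __name__ == "__main__":
    # Hypothetical placeholder tasks; a real job would call Elasticsearch here.
    outcome = run_index_management([
        ("delete", lambda: print("deleting indices past retention")),
        ("rollover", lambda: print("rolling over the write alias")),
    ])
    print(outcome)
```

Putting delete first follows the reasoning in the Doc Text: freeing disk before the rollover gives the rollover a better chance of succeeding.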
Comment 2
Giriyamma
2021-01-28 07:51:40 UTC
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.6.16 extras security update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHSA-2021:0310

Guys, I've installed 4.6.16 and am still seeing the following error:
{"error":{"root_cause":[{"type":"security_exception","reason":"Unexpected exception indices:admin/aliases/get"}],"type":"security_exception","reason":"Unexpected exception indices:admin/aliases/get"},"status":500}
Error while attemping to determine the active write alias: {"error":{"root_cause":[{"type":"security_exception","reason":"Unexpected exception indices:admin/aliases/get"}],"type":"security_exception","reason":"Unexpected exception indices:admin/aliases/get"},"status":500}
Current write index for audit-write: audit-000160
Checking results from _rollover call
Next write index for audit-write: audit-000160
Checking if audit-000160 exists
Checking if audit-000160 is the write index for audit-write
Done!
Thanks, Ronald

Vedanti, here are the new bugs, as the issue is still not fixed. The fix is being verified in https://github.com/openshift/elasticsearch-operator/pull/678
Bug 1929688 (VERIFIED) Sometimes The Elasticsearch-Delete-Xxx Job Failed At "Unexpected Exception Indices:admin/Aliases/Get" - OCP 4.6.16
Bug 1928772 (VERIFIED) Sometimes The Elasticsearch-Delete-Xxx Job Failed: After OCP 4.6.16 Patch.
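In the log above, the alias lookup returns a security_exception with status 500, yet the job goes on to report a current write index. As a rough illustration of treating that payload as a hard failure, here is a small Python sketch that parses the body of an alias lookup. The names find_write_index and AliasLookupError are hypothetical, and the success-case layout (index name, then "aliases", then the alias with "is_write_index") is the usual shape of an Elasticsearch alias response rather than something shown in this report.

```python
import json
from typing import Optional


class AliasLookupError(Exception):
    """Raised when the alias lookup returned an error payload."""


def find_write_index(body: str, alias: str) -> Optional[str]:
    """Return the index flagged as the write index for `alias`.

    Raises AliasLookupError if the response is an error document, such as
    the security_exception quoted in this report, instead of continuing
    with stale or missing information.
    """
    data = json.loads(body)
    if "error" in data:
        err = data["error"]
        reason = err.get("reason", "unknown") if isinstance(err, dict) else str(err)
        raise AliasLookupError(f"status {data.get('status')}: {reason}")
    for index, info in data.items():
        alias_info = info.get("aliases", {}).get(alias)
        if alias_info is not None and alias_info.get("is_write_index"):
            return index
    return None  # alias not found, or no index is marked as the write index


if __name__ == "__main__":
    ok_body = '{"audit-000160": {"aliases": {"audit-write": {"is_write_index": true}}}}'
    print(find_write_index(ok_body, "audit-write"))
```

A caller could catch AliasLookupError and retry or exit non-zero, so that a failed lookup is not followed by a rollover check against an unverified index name.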