Bug 1916910

Summary:	Sometimes the elasticsearch-delete-xxx job failed at "Unexpected exception indices:admin/aliases/get"
Product:	OpenShift Container Platform	Reporter:	Jeff Cantrill <jcantril>
Component:	Logging	Assignee:	Jeff Cantrill <jcantril>
Status:	CLOSED ERRATA	QA Contact:	Giriyamma <gkarager>
Severity:	high	Docs Contact:
Priority:	high
Version:	4.6	CC:	akhaire, alchan, andcosta, anisal, anli, aos-bugs, apaladug, ChetRHosey, cruhm, dageoffr, dahernan, dkulkarn, dseals, jcantril, juherrer, kiyyappa, ksathe, luaparicio, lvlcek, mrdest, mrobson, naoto30, naygupta, nnosenzo, ocasalsa, periklis, prdeshpa, qitang, rkant, ronald.rademaker, sauchter, shishika, sreber, ssadhale, ssonigra, tmicheli, vhernand, vjaypurk, xingli, ykarajag
Target Milestone:	---
Target Release:	4.6.z
Hardware:	Unspecified
OS:	Unspecified
Whiteboard:	logging-exploration
Fixed In Version:		Doc Type:	Bug Fix
Doc Text:	Collapses the multiple policy cronjobs into a single job that runs multiple tasks: delete and rollover. The reasoning is that there is a potential race condition between the previous jobs, which both rely upon a -write alias, and this may lead to false information. Additionally, ES does not have transactions and is not ACID. By converting these into tasks that a single management job executes, we: (1) potentially free disk for ES to do additional work; (2) give the rollover a better chance of being successful.	Story Points:	---
Clone Of:	1890838
Clones:	1928772 (view as bug list)		Environment:
Last Closed:	2021-02-08 13:41:45 UTC	Type:	---
Regression:	---	Mount Type:	---
Documentation:	---	CRM:
Verified Versions:		Category:	---
oVirt Team:	---	RHEL 7.3 requirements from Atomic Host:
Cloudforms Team:	---	Target Upstream Version:
Embargoed:
Bug Depends On:	1890838
Bug Blocks:	1919075, 1928772
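
The Doc Text above describes replacing separate cronjobs with one job that runs its tasks one after another. A minimal sketch of that idea follows. It is not the operator's actual implementation: `run_tasks`, `delete_old_indices`, and `rollover_write_alias` are hypothetical names, and continuing to the next task after a failure is an assumption made for illustration.

```python
from typing import Callable, List, Tuple

Task = Tuple[str, Callable[[], None]]


def run_tasks(tasks: List[Task]) -> List[Tuple[str, bool]]:
    """Run each task in order, one at a time, and record whether it succeeded.

    Because the tasks run sequentially in one process, they cannot race
    each other the way two independently scheduled jobs could.
    """
    results = []
    for name, task in tasks:
        try:
            task()
            results.append((name, True))
        except Exception as exc:
            # Assumption: a failed task is reported but does not skip later tasks.
            print(f"task {name} failed: {exc}")
            results.append((name, False))
    return results


# Hypothetical stand-ins for the real delete and rollover steps.
calls: List[str] = []


def delete_old_indices() -> None:
    calls.append("delete")


def rollover_write_alias() -> None:
    calls.append("rollover")


results = run_tasks([
    ("delete", delete_old_indices),      # may free disk before the rollover
    ("rollover", rollover_write_alias),
])
```

Listing delete before rollover mirrors the order given in the Doc Text, where freeing disk first is said to give the rollover a better chance of succeeding.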

Comment 2 Giriyamma 2021-01-28 07:51:40 UTC

Verified this issue using clusterlogging.4.6.0-202101271348.p0, elasticsearch-operator.4.6.0-202101271348.p0.

Comment 6 errata-xmlrpc 2021-02-08 13:41:45 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.6.16 extras security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:0310

Comment 8 Ronald 2021-02-16 12:25:28 UTC

Guys, I've installed 4.6.16 and I'm still seeing the following error:


{"error":{"root_cause":[{"type":"security_exception","reason":"Unexpected exception indices:admin/aliases/get"}],"type":"security_exception","reason":"Unexpected exception indices:admin/aliases/get"},"status":500}
Error while attemping to determine the active write alias: {"error":{"root_cause":[{"type":"security_exception","reason":"Unexpected exception indices:admin/aliases/get"}],"type":"security_exception","reason":"Unexpected exception indices:admin/aliases/get"},"status":500}
Current write index for audit-write: audit-000160
Checking results from _rollover call
Next write index for audit-write: audit-000160
Checking if audit-000160 exists
Checking if audit-000160 is the write index for audit-write
Done!



Thanks,
Ronald
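
In the log above, the alias lookup returns an error payload, yet the script still reports audit-000160 as both the current and the next write index. The sketch below shows one way a caller could tell an error payload apart from a successful alias response before trusting the result. It assumes the Elasticsearch `GET /<alias>/_alias` response shape (index name mapped to `aliases`, with an `is_write_index` flag). `find_write_index` is a hypothetical helper, not code from the operator.

```python
import json
from typing import Optional


def find_write_index(alias: str, response_body: str) -> Optional[str]:
    """Return the index flagged as the write index for `alias`, or None.

    Raises RuntimeError when the body is an Elasticsearch error payload,
    so a failed lookup is never mistaken for a valid answer.
    """
    payload = json.loads(response_body)
    if "error" in payload:
        error = payload["error"]
        reason = error.get("reason") if isinstance(error, dict) else str(error)
        raise RuntimeError(f"alias lookup for {alias} failed: {reason}")
    for index_name, info in payload.items():
        if not isinstance(info, dict):
            continue
        alias_info = info.get("aliases", {}).get(alias)
        if alias_info is not None and alias_info.get("is_write_index"):
            return index_name
    return None


# Successful response shape (assumed) and the error body from the log above.
ok_body = '{"audit-000160": {"aliases": {"audit-write": {"is_write_index": true}}}}'
error_body = (
    '{"error":{"root_cause":[{"type":"security_exception",'
    '"reason":"Unexpected exception indices:admin/aliases/get"}],'
    '"type":"security_exception",'
    '"reason":"Unexpected exception indices:admin/aliases/get"},"status":500}'
)

write_index = find_write_index("audit-write", ok_body)
try:
    find_write_index("audit-write", error_body)
    lookup_failed = False
except RuntimeError:
    lookup_failed = True
```

Raising on the error payload lets a caller stop or retry instead of carrying a stale index name into the rollover checks.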

Comment 9 David Hernández Fernández 2021-02-16 12:39:59 UTC

https://bugzilla.redhat.com/show_bug.cgi?id=1928772 for 4.6.16+

Comment 11 David Hernández Fernández 2021-03-27 09:07:14 UTC

Vedanti, here are the new bugs as the issue is still not fixed. It's being verified in https://github.com/openshift/elasticsearch-operator/pull/678

Bug 1929688 (VERIFIED)
Sometimes the elasticsearch-delete-xxx job failed at "Unexpected exception indices:admin/aliases/get" - OCP 4.6.16
Bug 1928772 (VERIFIED)
Sometimes the elasticsearch-delete-xxx job failed: after OCP 4.6.16 patch.