NOTE: This is *not* a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1866019 / https://bugzilla.redhat.com/show_bug.cgi?id=1866963 / https://bugzilla.redhat.com/show_bug.cgi?id=1868675, which are caused by 500s from ES; or https://bugzilla.redhat.com/show_bug.cgi?id=1881709, where the ES query hangs.

Description of problem:
elasticsearch-delete-* pods fail with log:

Traceback (most recent call last):
  File "<string>", line 4, in <module>
KeyError: 'is_write_index'

Version-Release number of selected component (if applicable):
4.5.0-202009041228.p0 (and others)

How reproducible:
Unknown -- but once a cluster is in this state, it seems to stick there for "a while".

Additional info:
Debugging the failed pods, they consistently show the following, indicating that the error is the result of an unexpected alias response from elasticsearch: the alias body is empty, so it carries no is_write_index flag.

sh-4.2$ bash -x /tmp/scripts/delete
+ set -euo pipefail
+++ cat /var/run/secrets/kubernetes.io/serviceaccount/token
++ curl -s 'https://elasticsearch:9200/infra-*/_alias/infra-write' --cacert /etc/indexmanagement/keys/admin-ca '-HAuthorization: Bearer {redacted}' -HContent-Type:application/json
+ writeIndices='{"infra-000001":{"aliases":{"infra-write":{}}}}'
++ cat
+ CMD='import json,sys
r=json.load(sys.stdin)
alias="infra-write"
indices = [index for index in r if r[index]['\''aliases'\''][alias]['\''is_write_index'\'']]
if len(indices) > 0:
  print indices[0]
'
++ echo '{"infra-000001":{"aliases":{"infra-write":{}}}}'
++ python -c 'import json,sys
r=json.load(sys.stdin)
alias="infra-write"
indices = [index for index in r if r[index]['\''aliases'\''][alias]['\''is_write_index'\'']]
if len(indices) > 0:
  print indices[0]
'
Traceback (most recent call last):
  File "<string>", line 4, in <module>
KeyError: 'is_write_index'
+ writeIndex=
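For clarity, here is a minimal standalone reproduction of the parsing failure, using the exact alias payload returned by ES above. The defensive variant at the end is illustrative only (it is not claimed to be the change made in PR #488); it simply treats an alias entry without the is_write_index flag as "not the write index" instead of raising:

import json

# Exact payload from the trace above: the alias body for infra-write is {},
# i.e. is_write_index was never set on the alias.
payload = '{"infra-000001":{"aliases":{"infra-write":{}}}}'
r = json.loads(payload)
alias = "infra-write"

# The original lookup in the delete script,
#   [index for index in r if r[index]['aliases'][alias]['is_write_index']]
# raises KeyError: 'is_write_index' on this payload.

# Illustrative defensive variant (an assumption, not the shipped fix):
indices = [index for index in r
           if r[index].get('aliases', {}).get(alias, {}).get('is_write_index', False)]
print(indices)  # [] -- no write index can be determined from this response

Either way, the script is left with an empty writeIndex, which is the state shown at the end of the trace.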
(In reply to Eric Fried from comment #0)
> NOTE: This is *not* a duplicate of
> https://bugzilla.redhat.com/show_bug.cgi?id=1866019 /
> https://bugzilla.redhat.com/show_bug.cgi?id=1866963 /
> https://bugzilla.redhat.com/show_bug.cgi?id=1868675 which are caused by 500s
> from ES; or https://bugzilla.redhat.com/show_bug.cgi?id=1881709 where the ES
> query hangs.

No, but it is likely resolved when https://github.com/openshift/elasticsearch-operator/pull/488 merges. Closing as a duplicate for the time being until it can be verified otherwise.

*** This bug has been marked as a duplicate of bug 1868675 ***
I ran the delete script from https://github.com/openshift/elasticsearch-operator/pull/488 in the failing environment and it still fails the same way.

sh-4.2$ ./delete.latest
Error trying to determine the 'write' index from '{u'app-000002': {u'aliases': {u'app-write': {}}}}': <type 'exceptions.KeyError'>
sh-4.2$ echo $?
1
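If one wanted to unstick a cluster in this state by hand, a sketch along the following lines could mark the existing rollover index as the write index via the Elasticsearch _aliases API. This is an assumption/illustration only, not the fix shipped in the operator, and it assumes the requests library is available in the pod and reuses the same token and CA paths as the cron scripts above:

import json
import requests  # assumed available; otherwise the same call can be made with curl

ES = "https://elasticsearch:9200"
TOKEN = open("/var/run/secrets/kubernetes.io/serviceaccount/token").read().strip()
HEADERS = {"Authorization": "Bearer " + TOKEN, "Content-Type": "application/json"}

# Explicitly flag app-000002 as the write index behind the app-write alias,
# so the alias body is no longer empty.
body = {"actions": [{"add": {"index": "app-000002",
                             "alias": "app-write",
                             "is_write_index": True}}]}
resp = requests.post(ES + "/_aliases",
                     headers=HEADERS,
                     data=json.dumps(body),
                     verify="/etc/indexmanagement/keys/admin-ca")
print(resp.status_code, resp.text)

After that, re-querying /_alias/app-write should include "is_write_index": true and the delete script should be able to determine the write index again.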
4.5 backport added to https://github.com/openshift/elasticsearch-operator/pull/488
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.1 extras update), and where to find the updated files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4198