Bug 1882525 - elasticsearch-delete KeyError: 'is_write_index'
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.5
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: high
Target Milestone: ---
Target Release: 4.6.0
Assignee: Jeff Cantrill
QA Contact: Anping Li
URL:
Whiteboard: osd-45-logging
Depends On:
Blocks:
Reported: 2020-09-24 20:02 UTC by Eric Fried
Modified: 2023-12-15 19:31 UTC
CC: 3 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2020-10-27 15:12:50 UTC
Target Upstream Version:
Embargoed:


Links
Github openshift elasticsearch-operator pull 497 (closed): Bug 1882525: Resolve KeyError when determining indices to delete (last updated 2021-02-03 13:51:48 UTC)
Red Hat Product Errata RHBA-2020:4198 (last updated 2020-10-27 15:13:15 UTC)

Description Eric Fried 2020-09-24 20:02:59 UTC
NOTE: This is *not* a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1866019 / https://bugzilla.redhat.com/show_bug.cgi?id=1866963 / https://bugzilla.redhat.com/show_bug.cgi?id=1868675 which are caused by 500s from ES; or https://bugzilla.redhat.com/show_bug.cgi?id=1881709 where the ES query hangs.

Description of problem:

elasticsearch-delete-* pods fail with the following in their logs:

Traceback (most recent call last):
  File "<string>", line 4, in <module>
KeyError: 'is_write_index'

Version-Release number of selected component (if applicable):

4.5.0-202009041228.p0 (and others)

How reproducible:

Unknown -- but once a cluster is in this state, it seems to stick there for "a while".

Additional info:

Debugging the failed pods, they consistently show the following, indicating that the error is the result of an alias response from Elasticsearch that is valid JSON but lacks the expected 'is_write_index' field:

sh-4.2$ bash -x /tmp/scripts/delete
+ set -euo pipefail
+++ cat /var/run/secrets/kubernetes.io/serviceaccount/token
++ curl -s 'https://elasticsearch:9200/infra-*/_alias/infra-write' --cacert /etc/indexmanagement/keys/admin-ca '-HAuthorization: Bearer {redacted}' -HContent-Type:application/json
+ writeIndices='{"infra-000001":{"aliases":{"infra-write":{}}}}'
++ cat
+ CMD='import json,sys
r=json.load(sys.stdin)
alias="infra-write"
indices = [index for index in r if r[index]['\''aliases'\''][alias]['\''is_write_index'\'']]
if len(indices) > 0:
  print indices[0] '
++ echo '{"infra-000001":{"aliases":{"infra-write":{}}}}'
++ python -c 'import json,sys
r=json.load(sys.stdin)
alias="infra-write"
indices = [index for index in r if r[index]['\''aliases'\''][alias]['\''is_write_index'\'']]
if len(indices) > 0:
  print indices[0] '
Traceback (most recent call last):
  File "<string>", line 4, in <module>
KeyError: 'is_write_index'
+ writeIndex=
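
For comparison, a healthy alias response carries the flag, e.g. '{"infra-000001":{"aliases":{"infra-write":{"is_write_index":true}}}}'. A minimal defensive rewrite of the embedded Python (a sketch only, not the actual change from the linked PR) would treat a missing 'is_write_index' key as false instead of raising:

writeIndices='{"infra-000001":{"aliases":{"infra-write":{}}}}'
echo "$writeIndices" | python -c '
import json, sys
r = json.load(sys.stdin)
alias = "infra-write"
# .get() treats a missing "is_write_index" key as False, so an alias
# entry without the flag is skipped instead of raising KeyError.
indices = [i for i in r if r[i]["aliases"][alias].get("is_write_index", False)]
if indices:
    print(indices[0])
'

Note this still leaves writeIndex empty when no index carries the flag, so the calling script would have to decide how to handle that case.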

Comment 1 Jeff Cantrill 2020-09-24 20:42:29 UTC
(In reply to Eric Fried from comment #0)
> NOTE: This is *not* a duplicate of
> https://bugzilla.redhat.com/show_bug.cgi?id=1866019 /
> https://bugzilla.redhat.com/show_bug.cgi?id=1866963 /
> https://bugzilla.redhat.com/show_bug.cgi?id=1868675 which are caused by 500s
> from ES; or https://bugzilla.redhat.com/show_bug.cgi?id=1881709 where the ES
> query hangs.

No, but it is likely resolved when https://github.com/openshift/elasticsearch-operator/pull/488 merges.  Closing as a duplicate for the time being until it can be verified otherwise.

*** This bug has been marked as a duplicate of bug 1868675 ***

Comment 2 Eric Fried 2020-09-24 21:16:28 UTC
I ran the delete script from https://github.com/openshift/elasticsearch-operator/pull/488 in the failing environment and it still fails the same way.

sh-4.2$ ./delete.latest

Error trying to determine the 'write' index from '{u'app-000002': {u'aliases': {u'app-write': {}}}}': <type 'exceptions.KeyError'>
sh-4.2$ echo $?
1
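
For clusters stuck in this state, one plausible manual workaround (an assumption based on the standard Elasticsearch _aliases API, not something verified in this bug, and assuming app-000002 should indeed be the write index) is to explicitly mark the lone index as the write index, after which the detection logic above finds it:

curl -s -XPOST 'https://elasticsearch:9200/_aliases' \
  --cacert /etc/indexmanagement/keys/admin-ca \
  -H"Authorization: Bearer $(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
  -HContent-Type:application/json \
  -d'{"actions":[{"add":{"index":"app-000002","alias":"app-write","is_write_index":true}}]}'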

Comment 3 Jeff Cantrill 2020-09-24 21:57:42 UTC
4.5 backport added to https://github.com/openshift/elasticsearch-operator/pull/488

Comment 10 errata-xmlrpc 2020-10-27 15:12:50 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 4.6.1 extras update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4198

