Bug 1885723 - Old kibana index causing crashloop
Summary: Old kibana index causing crashloop
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 4.5
Hardware: x86_64
OS: Linux
Priority: medium
Severity: high
Target Milestone: ---
Target Release: 4.7.0
Assignee: Periklis Tsirakidis
QA Contact: Anping Li
URL:
Whiteboard: osd-45-logging, logging-exploration
Duplicates: 1870371
Depends On:
Blocks: 1909614
 
Reported: 2020-10-06 19:44 UTC by tfahlman
Modified: 2024-03-25 16:39 UTC
CC List: 22 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-02-24 11:21:19 UTC
Target Upstream Version:
Embargoed:


Attachments
reindex failed (30.39 KB, text/plain), 2020-10-11 14:41 UTC, Anping Li


Links
Github openshift/elasticsearch-operator pull 603 (closed): Bug 1885723: Allow kibana server to access OD tenantinfo (last updated 2021-02-14 11:13:54 UTC)
Red Hat Knowledge Base (Solution) 5332221 (last updated 2020-11-08 20:43:22 UTC)
Red Hat Knowledge Base (Solution) 5652591 (last updated 2020-12-18 08:41:00 UTC)
Red Hat Product Errata RHBA-2021:0652 (last updated 2021-02-24 11:22:11 UTC)

Description tfahlman 2020-10-06 19:44:53 UTC
Description of problem:

Using NODE_OPTIONS: '--max_old_space_size=368' (the memory setting is in MB), the Kibana pod crashloops with:
{"type":"log","@timestamp":"2020-10-06T14:23:56Z","tags":["fatal","root"],"pid":121,"message":"Error: Index .kibana belongs to a version of Kibana that cannot be automatically migrated. Reset it or use the X-Pack upgrade assistant.\n    at assertIsSupportedIndex (/opt/app-root/src/src/server/saved_objects/migrations/core/elastic_index.js:246:15)\n    at Object.fetchInfo (/opt/app-root/src/src/server/saved_objects/migrations/core/elastic_index.js:52:12)"}
 FATAL  Error: Index .kibana belongs to a version of Kibana that cannot be automatically migrated. Reset it or use the X-Pack upgrade assistant.

Version-Release number of selected component (if applicable):

4.5.0

This happened after an upgrade from 4.4.x to 4.5.0. The pod restarted 25 times before the issue went away.

Seems similar to this: https://bugzilla.redhat.com/show_bug.cgi?id=1835903
 
As I understand the status of that bz, this could be a regression.
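
For reference, one way to confirm whether an old-format .kibana index is still present is to query the cat indices API from an Elasticsearch pod. This is only a sketch; the openshift-logging namespace, the label selector, and the es_util helper are assumptions based on a typical OpenShift 4.x cluster-logging deployment:

# pick any Elasticsearch pod in the logging namespace (assumed: openshift-logging)
ES_POD=$(oc -n openshift-logging get pods -l component=elasticsearch -o name | head -1)

# list the Kibana-related indices; an old-format .kibana index would show up here
oc -n openshift-logging exec -c elasticsearch $ES_POD -- es_util --query='_cat/indices/.kibana*?v'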

Comment 1 Anping Li 2020-10-11 14:41:51 UTC
Created attachment 1720673 [details]
reindex failed

Reproduced it when upgrading from elasticsearch-operator.4.4.0-202009161309.p0 to elasticsearch-operator.4.5.0-202009182238.p0. It is not always reproducible.
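
For reference, the installed operator versions can be checked with something like the following; the namespaces are assumptions for a default deployment (elasticsearch-operator in openshift-operators-redhat, cluster-logging-operator in openshift-logging):

oc get csv -n openshift-operators-redhat
oc get csv -n openshift-logging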

Comment 3 Jeff Cantrill 2020-10-12 14:24:20 UTC
*** Bug 1870371 has been marked as a duplicate of this bug. ***

Comment 4 Jeff Cantrill 2020-10-23 15:20:10 UTC
Setting UpcomingSprint as unable to resolve before EOD

Comment 5 Periklis Tsirakidis 2020-11-09 10:24:09 UTC
(In reply to tfahlman from comment #0)
> Description of problem:
> 
> Using NODE_OPTIONS: '--max_old_space_size=368' Memory setting is in MB
> {"type":"log","@timestamp":"2020-10-06T14:23:56Z","tags":["fatal","root"],
> "pid":121,"message":"Error: Index .kibana belongs to a version of Kibana
> that cannot be automatically migrated. Reset it or use the X-Pack upgrade
> assistant.\n    at assertIsSupportedIndex
> (/opt/app-root/src/src/server/saved_objects/migrations/core/elastic_index.js:
> 246:15)\n    at Object.fetchInfo
> (/opt/app-root/src/src/server/saved_objects/migrations/core/elastic_index.js:
> 52:12)"}
>  FATAL  Error: Index .kibana belongs to a version of Kibana that cannot be
> automatically migrated. Reset it or use the X-Pack upgrade assistant.
> 
> Version-Release number of selected component (if applicable):
> 
> 4.5.0
> 
> This happened after an upgrade for 4.4.x to 4.5.0. The pod restarted 25
> times before this issue went away. 
> 
> Seems similar to this: https://bugzilla.redhat.com/show_bug.cgi?id=1835903
>  
> As I understand the status of that bz, this could be a regression.

Kibana index migration is done by the elasticsearch-operator in 4.5, because kibana6 requires some manual steps. Could you provide a cluster-logging must-gather for this cluster so we can verify that the migration does not fail in the operator?

The crashloop you see is nothing serious; it is only an indicator that the elasticsearch-operator has not yet finished migrating the index.
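
A sketch of how the migration progress can be watched; the namespace, deployment name, and label selector are assumptions for a default 4.5 install:

# follow the elasticsearch-operator logs for the Kibana index migration messages
oc -n openshift-operators-redhat logs deployment/elasticsearch-operator -f

# the Kibana pod should stop crashlooping once the migration has completed
oc -n openshift-logging get pods -l component=kibana -w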

Comment 6 Periklis Tsirakidis 2020-11-11 13:03:28 UTC
@sreber 

Please provide a must-gather for your customer case.

Comment 14 Periklis Tsirakidis 2020-11-26 17:01:37 UTC
@tmicheli

Looking through the various uploads, none of them is a proper cluster-logging must-gather taken with:

https://github.com/openshift/cluster-logging-operator/tree/master/must-gather

Can you please provide a recent snapshot collected using this must-gather?
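
For reference, a cluster-logging must-gather is typically collected along these lines; the namespace and deployment name assume a default install of the cluster-logging-operator:

oc adm must-gather --image=$(oc -n openshift-logging get deployment.apps/cluster-logging-operator \
  -o jsonpath='{.spec.template.spec.containers[?(@.name == "cluster-logging-operator")].image}')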

Comment 17 Periklis Tsirakidis 2020-11-27 15:51:35 UTC
@tmicheli

Based on a live session with @sreber, I have a hypothesis which I would like you both to validate on the customer side or on a lab setup.

First of all, the key observations:
Case 1: Kibana crashloops once after upgrading from 4.4 to 4.5 on old user `.kibana*` indices, because they still point to the old data model. Once they are deleted, everything works fine.

Case 2: Kibana crashloops again on a cluster where the internal migration from the old data model to the new model is in progress after users create their index patterns.

---

The hypothesis is that users in case 2 create index patterns that refer both to the new data model (e.g. app*) and to the old data model (e.g. project*).
While the migration happens, the old indices (e.g. project*, operations*) get deleted, so those index patterns end up in a broken state.

Could you please take a look at the users' `.kibana*` indices to identify which index patterns they are creating?
Would it be possible to get a dump of these indices so we can inspect them ourselves?
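
A sketch of one way to list the index patterns saved in the users' `.kibana*` indices and to dump those indices for inspection; es_util, the namespace, and the label selector are assumptions for a default deployment:

ES_POD=$(oc -n openshift-logging get pods -l component=elasticsearch -o name | head -1)

# show which index patterns the users have created
oc -n openshift-logging exec -c elasticsearch $ES_POD -- \
  es_util --query='.kibana*/_search?q=type:index-pattern&size=100&pretty'

# dump the full contents of the .kibana* indices for offline inspection
oc -n openshift-logging exec -c elasticsearch $ES_POD -- \
  es_util --query='.kibana*/_search?size=1000&pretty' > kibana-indices-dump.json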

Comment 37 Anping Li 2020-12-21 07:23:24 UTC
No regression was found in 4.7, so moving to VERIFIED. Further testing will be done in 4.5.

Comment 55 errata-xmlrpc 2021-02-24 11:21:19 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Errata Advisory for Openshift Logging 5.0.0), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2021:0652

