Bug 1670587
| Summary: | ES pod deployment timeout can corrupt logging indices | | |
|---|---|---|---|
| Product: | OpenShift Container Platform | Reporter: | Matthew Barnes <mbarnes> |
| Component: | Logging | Assignee: | Michael Burke <mburke> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Anping Li <anli> |
| Severity: | unspecified | Docs Contact: | |
| Priority: | unspecified | | |
| Version: | 3.11.0 | CC: | aos-bugs, cvogel, lvlcek, mburke, rmeggins |
| Target Milestone: | --- | Keywords: | OpsBlocker |
| Target Release: | 3.11.z | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | groom | | |
| Fixed In Version: | | Doc Type: | No Doc Update |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2020-02-06 21:56:44 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
Description

Matthew Barnes 2019-01-29 21:08:36 UTC
(In reply to Matthew Barnes from comment #0)

> Description of problem:
>
> ElasticSearch v5 stores indices on persistent volumes differently than earlier versions (using a hash value instead of the name of the index, I believe).
>
> When ElasticSearch is upgraded to v5, the new pods are not considered ready until the Searchguard index becomes green. Especially on large clusters this can take a VERY long time to complete, but the rollout strategy has a 30-minute default timeout before terminating the pods and rolling back to the previous version.

This is not accurate. The state of the Searchguard index is not involved in the determination of readiness. My assumption is that the pods are rolled back because the storage from the previous deployment is not released by AWS and attached to the new deployment before the rollback timeout is exceeded. This was fixed by [1].

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1655675

---

It has been a long time with this issue, but after talking to PM we may resolve it via documentation. We need to verify that running the script at [1] returns no results, and to document that customers must validate their ES clusters prior to upgrading; otherwise their data may not be recoverable.

[1] https://github.com/jcantrill/cluster-logging-tools/blob/release-3.x/scripts/dots-in-field-names

---

@lukas, I'm looking to turn this into a doc issue. I expect to reference the content of the script in #c3 and want to reference the ES changes. I found the mapping explosion [1], but I don't see a reference to dots in field names. Do you have a link?

[1] https://www.elastic.co/guide/en/elasticsearch/reference/5.6/breaking_50_mapping_changes.html#breaking_50_mapping_changes

---

Documentation PR: https://github.com/openshift/openshift-docs/pull/17931

---

Moving to ON_QA for validation, which may have already occurred given that QE has been involved in reviewing the docs.

---

LGTM

---

Changes are live:

https://docs.openshift.com/container-platform/3.11/upgrading/automated_upgrades.html#upgrading-efk-logging-stack

https://access.redhat.com/documentation/en-us/openshift_container_platform/3.11/html/upgrading_clusters/install-config-upgrading-automated-upgrades#upgrading-efk-logging-stack

---

The needinfo request[s] on this closed bug have been removed as they have been unresolved for 1000 days.
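A note on the storage change called out in the description: Elasticsearch 5.x names the index data directories on the persistent volume after the index UUID rather than the index name, so the directories under `nodes/0/indices/` no longer match what `_cat/indices` shows by default. Below is a minimal sketch of correlating the two, assuming a locally reachable ES endpoint (for example, via `oc port-forward`) and the `requests` library; the `ES_URL` value and the certificate handling are illustrative assumptions, not details from this bug.

```python
# Sketch: map the UUID-named data directories that Elasticsearch 5.x
# creates on a persistent volume back to human-readable index names.
import requests

ES_URL = "https://localhost:9200"  # assumption: port-forwarded ES endpoint

def uuid_to_index_name(es_url=ES_URL):
    # _cat/indices can emit JSON and lets us pick columns; in 5.x the
    # on-disk directory under nodes/0/indices/ is named by index UUID.
    resp = requests.get(
        f"{es_url}/_cat/indices",
        params={"format": "json", "h": "index,uuid"},
        verify=False,  # assumption: self-signed certs in the logging stack
    )
    resp.raise_for_status()
    return {row["uuid"]: row["index"] for row in resp.json()}

if __name__ == "__main__":
    for uuid, name in sorted(uuid_to_index_name().items()):
        print(f"{uuid}  ->  {name}")
```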
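On the 30-minute rollout timeout: for Recreate deployments, OpenShift consults `spec.strategy.recreateParams.timeoutSeconds` on the DeploymentConfig before rolling back. The sketch below raises that timeout on the logging ES DeploymentConfigs so that slow EBS volume detach/reattach does not trigger a rollback; the `component=es` label selector, the `openshift-logging` namespace, and the two-hour value are assumptions to verify against your cluster, not values prescribed by this bug.

```python
# Sketch: raise the deployment timeout on the logging Elasticsearch
# DeploymentConfigs via `oc patch` (strategic merge patch).
import json
import subprocess

PATCH = json.dumps(
    {"spec": {"strategy": {"recreateParams": {"timeoutSeconds": 7200}}}}
)

def es_deployment_configs(namespace="openshift-logging"):
    # List ES DCs by label; component=es is assumed to match the 3.11
    # logging stack, but confirm the label on your cluster first.
    out = subprocess.run(
        ["oc", "get", "dc", "-n", namespace,
         "-l", "component=es", "-o", "name"],
        check=True, capture_output=True, text=True,
    ).stdout
    return out.split()

def raise_timeouts(namespace="openshift-logging"):
    for dc in es_deployment_configs(namespace):
        subprocess.run(
            ["oc", "patch", dc, "-n", namespace, "-p", PATCH],
            check=True,
        )

if __name__ == "__main__":
    raise_timeouts()
```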
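Finally, on the pre-upgrade validation: the linked dots-in-field-names script looks for field names containing a literal dot, which Elasticsearch 5.x reinterprets as object paths and which can therefore produce unrecoverable mapping conflicts after the upgrade. The following is a hedged sketch of an equivalent check, not the script itself: it walks every index mapping and reports dotted field names. The `ES_URL` endpoint and TLS handling are the same assumptions as above.

```python
# Sketch: flag field names containing dots across all index mappings,
# approximating the pre-upgrade check done by dots-in-field-names.
import requests

ES_URL = "https://localhost:9200"  # assumption: port-forwarded ES endpoint

def dotted_fields(properties, path=""):
    # Recursively walk a mapping's "properties" tree, yielding any
    # field whose own name contains a dot.
    for name, spec in properties.items():
        full = f"{path}.{name}" if path else name
        if "." in name:
            yield full
        if isinstance(spec, dict) and "properties" in spec:
            yield from dotted_fields(spec["properties"], full)

def scan(es_url=ES_URL):
    mappings = requests.get(f"{es_url}/_mapping", verify=False).json()
    for index, body in mappings.items():
        for doc_type, mapping in body.get("mappings", {}).items():
            for field in dotted_fields(mapping.get("properties", {})):
                print(f"{index} [{doc_type}]: {field}")

if __name__ == "__main__":
    scan()
```

If this prints nothing, the cluster matches the "returns no results" condition discussed above; any output names fields that should be remediated before upgrading.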