Bug 1489498 - [3.5] Upgrade of 3.4 to 3.5 does not preserve replica count settings for Elasticsearch indices
Summary: [3.5] Upgrade of 3.4 to 3.5 does not preserve replica count settings for Elasticsearch indices
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.5.0
Hardware: All
OS: All
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: 3.5.z
Assignee: Jeff Cantrill
QA Contact: Anping Li
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2017-09-07 14:40 UTC by Peter Portante
Modified: 2017-11-21 05:41 UTC
CC List: 9 users

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
The logging role of `openshift-installer` did not consider the replica count from the gathered facts when re-installing, causing the replica counts to be set to the role defaults. Now the `openshift-installer` evaluates and reapplies the value from the facts, preserving the counts.
Clone Of:
Environment:
Last Closed: 2017-11-21 05:41:13 UTC
Target Upstream Version:
Embargoed:




Links
  GitHub: openshift/openshift-ansible pull 5766 (last updated 2017-10-13 21:16:38 UTC)
  GitHub: openshift/openshift-ansible pull 5831 (last updated 2017-10-20 19:31:33 UTC)
  GitHub: openshift/openshift-ansible pull 5852 (last updated 2017-10-23 22:45:00 UTC)
  Red Hat Product Errata: RHBA-2017:3255, SHIPPED_LIVE, OpenShift Container Platform atomic-openshift-utils bug fix and enhancement (last updated 2017-11-21 10:40:59 UTC)

Description Peter Portante 2017-09-07 14:40:39 UTC
Description of problem:

  When upgrading logging from 3.4 to 3.5, the settings for the number of primary
  shards and the number of replicas of those primary shards for new indices are
  not preserved.

  In 3.4, logging was deployed by a deployer pod via a template; in 3.5, the
  deployment is done via Ansible.

  The defaults for the Ansible playbook are:

    number_of_shards = 1
    number_of_replicas = 0
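
  In the deployed logging-elasticsearch configmap these defaults surface under
  the index: section of elasticsearch.yml. Assuming the usual layout of that
  configmap, a freshly upgraded cluster would therefore carry:

    index:
      number_of_shards: 1
      number_of_replicas: 0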

  This can lead to data loss as new indices are created after an upgrade. If a PV
  is lost or corrupted, or the node hosting that PV becomes unavailable, access
  to the data in those indices is lost entirely.

Version-Release number of selected component (if applicable):

  3.5.*

How reproducible:

  All upgrades from 3.4 to 3.5 (apparently).

  This bug needs to be verified with the exact conditions under which it occurs.

Steps to Reproduce:

  1. Take a 3.4 cluster with the logging-elasticsearch configmap having a value of
     1 for number_of_replicas
  2. Upgrade to 3.5
  3. Observe the logging-elasticsearch configmap to see what the value is for
     number_of_replicas
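
  For example, the current value can be inspected before and after the upgrade
  (the grep pattern below is just one convenient way to surface the index
  settings):

     # oc get configmap logging-elasticsearch -o yaml | grep -A 2 index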

Actual results:

  The elasticsearch configmap is changed to have a default of 0 for
  number_of_replicas and 1 for number_of_shards.

Expected results:

  The previous elasticsearch configmap is maintained.

Comment 3 openshift-github-bot 2017-10-20 19:10:53 UTC
Commits pushed to master at https://github.com/openshift/openshift-ansible

https://github.com/openshift/openshift-ansible/commit/46551d58d286fe18bb5637be2b9a21a928f05632
bug 1489498. preserve replica and shard settings

https://github.com/openshift/openshift-ansible/commit/495909e50146217adcca32e7c051f4f90dd39bf7
Merge pull request #5766 from jcantrill/1489498_preserve_replica_count

bug 1489498. preserve replica and shard settings
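
For illustration only, the general approach is to prefer a value recovered from
the already-deployed configmap over the role default. This sketch uses
hypothetical variable and fact names and is not the literal diff merged above:

  # Illustrative sketch only; variable and fact names are hypothetical and do
  # not reflect the exact openshift-ansible code.
  - name: Preserve the replica count already present in the deployed configmap
    set_fact:
      es_number_of_replicas: "{{ existing_es_index_settings.number_of_replicas
                                 | default(openshift_logging_es_number_of_replicas) }}"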

Comment 4 Jeff Cantrill 2017-10-23 22:45:01 UTC
3.5 backport https://github.com/openshift/openshift-ansible/pull/5852

Comment 6 Anping Li 2017-11-06 06:39:53 UTC
Verified and passed on openshift-ansible-roles-3.5.139. The shard and replica numbers are kept during the upgrade.


# oc get configmap logging-elasticsearch -o yaml | grep -A 2 index
    index:
      number_of_shards: 3
      number_of_replicas: 3

Comment 10 errata-xmlrpc 2017-11-21 05:41:13 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3255

