Bug 1469445 - Can't scale up elasticsearch by ansible deployment
Summary: Can't scale up elasticsearch by ansible deployment
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.7.0
Assignee: Jeff Cantrill
QA Contact: Xia Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-07-11 09:40 UTC by Xia Zhao
Modified: 2017-11-28 22:00 UTC
CC List: 5 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-28 22:00:46 UTC
Target Upstream Version:
Embargoed:


Attachments
inventory file used for logging deployment (746 bytes, text/plain), 2017-07-11 09:40 UTC, Xia Zhao
ES log (24.92 KB, text/plain), 2017-07-11 09:41 UTC, Xia Zhao
ansible log (1.02 MB, text/plain), 2017-07-11 09:59 UTC, Xia Zhao


Links
System ID: Red Hat Product Errata RHSA-2017:3188 | Private: no | Priority: normal | Status: SHIPPED_LIVE | Summary: Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update | Last Updated: 2017-11-29 02:34:54 UTC

Description Xia Zhao 2017-07-11 09:40:44 UTC
Created attachment 1296146 [details]
inventory file used for logging deployment

Description of problem:
Specified openshift_logging_es_number_of_replicas=2 in the logging 3.6.0 deployment inventory, but only 1 ES pod exists and the number of nodes/data nodes is still 1:

Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Clustername: logging-es
Clusterstate: GREEN
Number of nodes: 1
Number of data nodes: 1

Also hit this exception when seeding the .kibana index:
[2017-07-11 09:33:55,079][WARN ][rest.suppressed          ] path: /.kibana/config/4.6.4, params: {index=.kibana, op_type=create, id=4.6.4, type=config}
RemoteTransportException[[logging-es-data-master-u09phwje][10.129.0.25:9300][indices:data/write/index[p]]]; nested: UnavailableShardsException[[.kibana][0] Not enough active copies to meet write consistency of [QUORUM] (have 1, needed 2). Timeout: [1m], request: [index {[.kibana][config][4.6.4], source[{"buildNum":10229}]}]];
Caused by: UnavailableShardsException[[.kibana][0] Not enough active copies to meet write consistency of [QUORUM] (have 1, needed 2). Timeout: [1m], request: [index {[.kibana][config][4.6.4], source[{"buildNum":10229}]}]]


Version-Release number of selected component (if applicable):
openshift3/logging-auth-proxy    4cf6b1d60d2b
openshift3/logging-kibana    4563b27eac07
openshift3/logging-elasticsearch    8809f390a819
openshift3/logging-fluentd    a2ea005ef4f6
openshift3/logging-curator    ea1887b8e441

# rpm -qa | grep ansible
openshift-ansible-roles-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-callback-plugins-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-filter-plugins-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-playbooks-3.6.140-1.git.0.4a02427.el7.noarch
ansible-2.3.1.0-3.el7.noarch
openshift-ansible-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-lookup-plugins-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-docs-3.6.140-1.git.0.4a02427.el7.noarch

# openshift version
openshift v3.6.140
kubernetes v1.6.1+5115d708d7
etcd 3.2.1


How reproducible:
always

Steps to Reproduce:
1. Deploy logging 3.6.0, specifying openshift_logging_es_number_of_replicas=2 in the inventory file
2.
3.

Actual results:
Can't scale up elasticsearch

Expected results:
Elasticsearch should scale up

Additional info:
es log attached
inventory file attached
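
For reference, a minimal sketch of the relevant inventory lines (an illustrative sketch, not the attached file; the group name and the install flag are assumed defaults of the openshift-ansible logging role):

[OSEv3:vars]
# enable deployment of the logging stack
openshift_logging_install_logging=true
# parameter set for this deployment; expected to result in 2 ES pods
openshift_logging_es_number_of_replicas=2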

Comment 1 Xia Zhao 2017-07-11 09:41:43 UTC
Created attachment 1296148 [details]
ES log

Comment 2 Xia Zhao 2017-07-11 09:59:57 UTC
Created attachment 1296153 [details]
ansible log

Comment 3 Scott Dodson 2017-07-11 13:01:31 UTC
Unless this is a regression, I don't think we should consider this a 3.6 blocker.

Comment 4 ewolinet 2017-07-11 17:28:03 UTC
The parameter you are using is not the right one for getting more than one ES pod.
openshift_logging_es_number_of_replicas is specific to the ES indices and changes the number of shard replicas per index.

What you should be setting instead is openshift_logging_es_cluster_size.
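
To illustrate the difference, a minimal inventory sketch (the values are only examples, assuming the standard [OSEv3:vars] group):

[OSEv3:vars]
# number of Elasticsearch pods (cluster members) the role deploys
openshift_logging_es_cluster_size=2
# number of replica shards per index within Elasticsearch; does not add pods
openshift_logging_es_number_of_replicas=1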

Comment 5 Xia Zhao 2017-07-12 05:45:48 UTC
Oh, sorry for my mistake, and thanks for pointing this out. I redeployed the 3.6.0 logging stacks with openshift_logging_es_cluster_size=2 set in the inventory file, and this time I can see 2 es-master pods running, although https://bugzilla.redhat.com/show_bug.cgi?id=1469918 was observed.


Setting this bz to verified since the deployment parameter openshift_logging_es_cluster_size works fine.
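
One quick way to confirm the scale-up (a sketch, assuming the default "logging" namespace and the logging-es-data-master-* naming seen in the ES log):

# expect two logging-es-data-master deploymentconfigs and two running ES pods
oc get dc -n logging | grep logging-es-data-master
oc get pods -n logging | grep logging-es-data-master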

Ansible version tested with:
# rpm -qa | grep ansible
openshift-ansible-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-roles-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-docs-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-callback-plugins-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-filter-plugins-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-playbooks-3.6.140-1.git.0.4a02427.el7.noarch
ansible-2.2.3.0-1.el7.noarch
openshift-ansible-lookup-plugins-3.6.140-1.git.0.4a02427.el7.noarch

# openshift version
openshift v3.6.140
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

Comment 9 errata-xmlrpc 2017-11-28 22:00:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

