Bug 1469445 - Can't scale up elasticsearch by ansible deployment
Summary: Can't scale up elasticsearch by ansible deployment
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.6.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 3.7.0
Assignee: Jeff Cantrill
QA Contact: Xia Zhao
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2017-07-11 09:40 UTC by Xia Zhao
Modified: 2017-11-28 22:00 UTC
CC List: 5 users

Fixed In Version:
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-11-28 22:00:46 UTC
Target Upstream Version:
Embargoed:


Attachments
inventory file used for logging deployment (746 bytes, text/plain), 2017-07-11 09:40 UTC, Xia Zhao
ES log (24.92 KB, text/plain), 2017-07-11 09:41 UTC, Xia Zhao
ansible log (1.02 MB, text/plain), 2017-07-11 09:59 UTC, Xia Zhao


Links
System ID: Red Hat Product Errata RHSA-2017:3188 | Private: no | Priority: normal | Status: SHIPPED_LIVE | Summary: Moderate: Red Hat OpenShift Container Platform 3.7 security, bug, and enhancement update | Last Updated: 2017-11-29 02:34:54 UTC

Description Xia Zhao 2017-07-11 09:40:44 UTC
Created attachment 1296146 [details]
inventory file used for logging deployment

Description of problem:
Specified openshift_logging_es_number_of_replicas=2 in the logging 3.6.0 deployment inventory, but only 1 ES pod exists and the number of nodes/data nodes is still 1:

Contacting elasticsearch cluster 'elasticsearch' and wait for YELLOW clusterstate ...
Clustername: logging-es
Clusterstate: GREEN
Number of nodes: 1
Number of data nodes: 1

Also hit this exception when seeding the .kibana index:
[2017-07-11 09:33:55,079][WARN ][rest.suppressed          ] path: /.kibana/config/4.6.4, params: {index=.kibana, op_type=create, id=4.6.4, type=config}
RemoteTransportException[[logging-es-data-master-u09phwje][10.129.0.25:9300][indices:data/write/index[p]]]; nested: UnavailableShardsException[[.kibana][0] Not enough active copies to meet write consistency of [QUORUM] (have 1, needed 2). Timeout: [1m], request: [index {[.kibana][config][4.6.4], source[{"buildNum":10229}]}]];
Caused by: UnavailableShardsException[[.kibana][0] Not enough active copies to meet write consistency of [QUORUM] (have 1, needed 2). Timeout: [1m], request: [index {[.kibana][config][4.6.4], source[{"buildNum":10229}]}]]


Version-Release number of selected component (if applicable):
openshift3/logging-auth-proxy    4cf6b1d60d2b
openshift3/logging-kibana    4563b27eac07
openshift3/logging-elasticsearch    8809f390a819
openshift3/logging-fluentd    a2ea005ef4f6
openshift3/logging-curator    ea1887b8e441

# rpm -qa | grep ansible
openshift-ansible-roles-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-callback-plugins-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-filter-plugins-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-playbooks-3.6.140-1.git.0.4a02427.el7.noarch
ansible-2.3.1.0-3.el7.noarch
openshift-ansible-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-lookup-plugins-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-docs-3.6.140-1.git.0.4a02427.el7.noarch

# openshift version
openshift v3.6.140
kubernetes v1.6.1+5115d708d7
etcd 3.2.1


How reproducible:
always

Steps to Reproduce:
1. Deploy logging 3.6.0, specifying openshift_logging_es_number_of_replicas=2 in the inventory file
2.
3.

Actual results:
Can't scale up elasticsearch

Expected results:
Elasticsearch should scale up

Additional info:
es log attached
inventory file attached
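
For reference, a minimal sketch of the relevant inventory lines (an illustrative sketch, not the attached file; the group name and the install flag are assumed defaults of the openshift-ansible logging role):

[OSEv3:vars]
# enable deployment of the logging stack
openshift_logging_install_logging=true
# parameter set for this deployment; expected to result in 2 ES pods
openshift_logging_es_number_of_replicas=2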

Comment 1 Xia Zhao 2017-07-11 09:41:43 UTC
Created attachment 1296148 [details]
ES log

Comment 2 Xia Zhao 2017-07-11 09:59:57 UTC
Created attachment 1296153 [details]
ansible log

Comment 3 Scott Dodson 2017-07-11 13:01:31 UTC
Unless this is a regression, I don't think we should consider this a 3.6 blocker.

Comment 4 ewolinet 2017-07-11 17:28:03 UTC
The parameter you are using is not the right one for getting more than one ES pod.
openshift_logging_es_number_of_replicas is specific to the ES indices and changes the number of shard replicas per index.

What you should be setting instead is openshift_logging_es_cluster_size.
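
To illustrate the difference, a minimal inventory sketch (the values are only examples, assuming the standard [OSEv3:vars] group):

[OSEv3:vars]
# number of Elasticsearch pods (cluster members) the role deploys
openshift_logging_es_cluster_size=2
# number of replica shards per index within Elasticsearch; does not add pods
openshift_logging_es_number_of_replicas=1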

Comment 5 Xia Zhao 2017-07-12 05:45:48 UTC
Oh, sorry for my mistake, and thanks for pointing this out. I redeployed the 3.6.0 logging stacks with openshift_logging_es_cluster_size=2 set in the inventory file, and this time I can see 2 es-master pods running, although https://bugzilla.redhat.com/show_bug.cgi?id=1469918 was observed.


Setting this bz to verified since the deployment parameter openshift_logging_es_cluster_size works fine.
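
One quick way to confirm the scale-up (a sketch, assuming the default "logging" namespace and the logging-es-data-master-* naming seen in the ES log):

# expect two logging-es-data-master deploymentconfigs and two running ES pods
oc get dc -n logging | grep logging-es-data-master
oc get pods -n logging | grep logging-es-data-master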

Ansible version tested with:
# rpm -qa | grep ansible
openshift-ansible-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-roles-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-docs-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-callback-plugins-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-filter-plugins-3.6.140-1.git.0.4a02427.el7.noarch
openshift-ansible-playbooks-3.6.140-1.git.0.4a02427.el7.noarch
ansible-2.2.3.0-1.el7.noarch
openshift-ansible-lookup-plugins-3.6.140-1.git.0.4a02427.el7.noarch

# openshift version
openshift v3.6.140
kubernetes v1.6.1+5115d708d7
etcd 3.2.1

Comment 9 errata-xmlrpc 2017-11-28 22:00:46 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2017:3188

