Bug 1712955

Summary: Can't scale up ES nodes from 3 to N (N>3) in clusterlogging CRD instance.
Product: OpenShift Container Platform Reporter: ewolinet
Component: Logging Assignee: ewolinet
Status: CLOSED ERRATA QA Contact: Anping Li <anli>
Severity: medium Docs Contact:
Priority: unspecified    
Version: 4.1.z CC: anli, aos-bugs, ewolinet, jcantril, pweil, qitang, rmeggins, vlaad
Target Milestone: ---   
Target Release: 4.1.z   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: 1712721 Environment:
Last Closed: 2019-08-28 19:54:45 UTC Type: ---
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Bug Depends On: 1712721    
Bug Blocks:    

Description ewolinet 2019-05-22 15:11:13 UTC
+++ This bug was initially created as a clone of Bug #1712721 +++

Description of problem:
Deploy logging with 3 ES nodes and wait until all pods are running. Then change the ES node count to 4 in the clusterlogging CRD instance. After about 10 minutes, the ES node count in the elasticsearch CRD instance is still 3, and the cluster-logging-operator pod has logged nothing.

By contrast, when the ES node count is changed from 2 to 4 in the clusterlogging CRD instance, the count in the elasticsearch CRD instance changes to 4 and the ES pods are scaled up. The cluster-logging-operator pod then logs `level=info msg="Elasticsearch node configuration change found, updating elasticsearch"`.

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Deploy logging via OLM and set the ES node count to 3 in the clusterlogging CRD instance.
2. Wait until all logging pods are running, then change the ES node count to 4 in the clusterlogging CRD instance.
3. Check the pods in the `openshift-logging` namespace, and check the ES node count in both the elasticsearch CRD instance and the clusterlogging CRD instance.
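
Steps 2 and 3 can be sketched as a small shell helper. This is only an illustration, not the procedure the reporter ran: it assumes the ClusterLogging CR is named `instance` with the count at `spec.logStore.elasticsearch.nodeCount`, and that the Elasticsearch CR is named `elasticsearch` with per-group counts under `spec.nodes[]`. None of those paths are stated in this report, and the function names are hypothetical.

```shell
# Assumptions (not stated in this report): CR names and field paths below.
NS=openshift-logging

# Build a JSON merge patch that sets the desired ES node count.
es_nodecount_patch() {
  printf '{"spec":{"logStore":{"elasticsearch":{"nodeCount":%d}}}}' "$1"
}

# Sum whitespace-separated integers read from stdin.
sum_counts() {
  awk '{ for (i = 1; i <= NF; i++) s += $i } END { print s + 0 }'
}

# Step 2: request the new count. Step 3: print what each CR reports.
scale_and_compare() {
  oc -n "$NS" patch clusterlogging instance --type merge \
    -p "$(es_nodecount_patch "$1")"
  echo "clusterlogging: $(oc -n "$NS" get clusterlogging instance \
    -o jsonpath='{.spec.logStore.elasticsearch.nodeCount}')"
  echo "elasticsearch:  $(oc -n "$NS" get elasticsearch elasticsearch \
    -o jsonpath='{.spec.nodes[*].nodeCount}' | sum_counts)"
  oc -n "$NS" get pods
}
```

Calling `scale_and_compare 4` on a 3-node deployment and seeing the two counts disagree would correspond to the behaviour described in this bug.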

Actual results:

Expected results:

Additional info:

--- Additional comment from Qiaoling Tang on 2019-05-22 08:23:29 UTC ---

Actual results:
The ES nodeCount in the elasticsearch CRD instance does not change after the ES nodeCount is changed from 3 to n (n > 3) in the clusterlogging CRD instance.

Expected results:
The ES node count in the elasticsearch CRD instance should match the one in the clusterlogging CRD instance.

Additional info:
Scaling up ES nodes from 1 or 2 to n (n >= 3) works without issue.
Scaling up ES nodes from 4 or 5 to 6 works without issue.

This issue only happens when scaling up from 3 nodes to n (n > 3) nodes.

The workaround is:
1. Change the ES nodeCount in the clusterlogging CRD instance.
2. Run `oc delete elasticsearch elasticsearch -n openshift-logging` to delete the elasticsearch CRD instance. It is then recreated with the nodeCount set in the clusterlogging CRD instance.
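
The two workaround steps could be scripted roughly as follows. The `oc delete` command is taken from this comment. The `oc patch` call is an assumption: it presumes the ClusterLogging CR is named `instance` and stores the count at `spec.logStore.elasticsearch.nodeCount`. The `run` wrapper and `DRY_RUN` variable are hypothetical helpers that let the commands be printed before they are executed.

```shell
NS=openshift-logging

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Apply the workaround for a desired ES node count.
es_scale_workaround() {
  count=$1
  # Step 1 (assumed CR name and field path): set the count on clusterlogging.
  run oc -n "$NS" patch clusterlogging instance --type merge \
    -p "{\"spec\":{\"logStore\":{\"elasticsearch\":{\"nodeCount\":${count}}}}}"
  # Step 2 (from this report): delete the elasticsearch CR so it is recreated.
  run oc delete elasticsearch elasticsearch -n "$NS"
}
```

Setting `DRY_RUN=1` before calling `es_scale_workaround 4` only prints the two `oc` commands, which may be useful for review before touching a live cluster.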

--- Additional comment from Ben Parees on 2019-05-22 13:54:33 UTC ---

this should likely be cloned+backported to 4.1.z

Comment 3 Anping Li 2019-08-16 05:53:04 UTC
Test blocked by https://bugzilla.redhat.com/show_bug.cgi?id=1741753

Comment 4 Anping Li 2019-08-19 06:59:36 UTC
Verified in v4.1.12

Comment 8 Anping Li 2019-08-23 05:32:54 UTC
Verified in 4.1.13: ES can be scaled up from 3 to 4 nodes.
$ oc get pods
NAME                                            READY   STATUS      RESTARTS   AGE
cluster-logging-operator-54894bdc48-xn589       1/1     Running     0          34m
curator-1566538200-f2ddz                        0/1     Completed   0          118s
elasticsearch-cd-9ga2844i-1-7dc7f8d66-s4h5f     2/2     Running     0          2m2s
elasticsearch-cdm-80417wre-1-576558cfbb-wtgkf   2/2     Running     0          33m
elasticsearch-cdm-80417wre-2-78d4647ff5-sw58h   2/2     Running     0          33m
elasticsearch-cdm-80417wre-3-5cff54d779-g45vm   2/2     Running     0          33m
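
A check like the one above can be automated by counting the running ES pods in the `oc get pods` output. This is a minimal sketch: the function name is hypothetical, and it assumes the default column layout shown above (NAME in column 1, STATUS in column 3) and that ES pod names start with `elasticsearch-`.

```shell
# Count pods whose name starts with "elasticsearch-" and whose STATUS is
# Running, reading `oc get pods` output from stdin.
count_running_es() {
  awk '$1 ~ /^elasticsearch-/ && $3 == "Running" { n++ } END { print n + 0 }'
}
```

For example, `oc get pods -n openshift-logging | count_running_es` could be compared against the node count requested in the clusterlogging CRD instance.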

Comment 10 errata-xmlrpc 2019-08-28 19:54:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.