Created attachment 1595084 [details]
Inventory file (second variant with es_cluster_size=3)

Description of problem:
After provisioning OCP v3.11.129 with:

openshift_logging_es_cluster_size=1
openshift_logging_es_number_of_shards=1
openshift_logging_es_number_of_replicas=1

I realized that I needed:

openshift_logging_es_cluster_size=3
openshift_logging_es_number_of_shards=1
openshift_logging_es_number_of_replicas=1

After executing the openshift-logging playbook, it failed to deploy the remaining two ES pods. I had to manually run `oc rollout latest <es-deployment>` in the openshift-logging namespace; only then were the two additional ES pods created. This should be handled by the Ansible playbook.

Version-Release number of the following components:
# rpm -q openshift-ansible
openshift-ansible-3.11.123-1.git.0.db681ba.el7.noarch
# rpm -q ansible
ansible-2.6.18-1.el7ae.noarch
# ansible --version
ansible 2.6.18
  config file = /usr/share/ansible/openshift-ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Jun 11 2019, 12:19:05) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]

How reproducible:
Deploy OCP with only one instance of ES, then change the inventory file to openshift_logging_es_cluster_size=3 and run the openshift-logging playbook again.

Actual results:
FAILED - RETRYING: openshift_logging_elasticsearch : command (3 retries left).
FAILED - RETRYING: openshift_logging_elasticsearch : command (2 retries left).
FAILED - RETRYING: openshift_logging_elasticsearch : command (1 retries left).
fatal: [torii-ichi-master.local.nutius.com]: FAILED!
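For reference, the scale-up amounts to changing a single variable in the inventory. A minimal sketch of the relevant fragment (the variable values are taken from this report; the [OSEv3:vars] group name is the standard openshift-ansible one and is shown here only for context):

```ini
[OSEv3:vars]
# Scale the Elasticsearch cluster from 1 node to 3 for HA;
# shard/replica settings stay unchanged.
openshift_logging_es_cluster_size=3
openshift_logging_es_number_of_shards=1
openshift_logging_es_number_of_replicas=1
```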
=> {"attempts": 120, "changed": true, "cmd": ["oc", "--config=/etc/origin/master/admin.kubeconfig", "get", "pod", "-l", "component=es,provider=openshift", "-n", "openshift-logging", "-o", "jsonpath={.items[?(@.status.phase==\"Running\")].metadata.name}"], "delta": "0:00:00.182890", "end": "2019-07-31 14:48:27.233188", "rc": 0, "start": "2019-07-31 14:48:27.050298", "stderr": "", "stderr_lines": [], "stdout": "logging-es-data-master-vevnrhov-1-g8k89", "stdout_lines": ["logging-es-data-master-vevnrhov-1-g8k89"]} Expected results: Next two instances of ES should be created by ansible installation script without any issue. Additional information: I was simulating a situation when a customer has existing ES and needs to extend the ES for High-Availability. Without removing existing storage or existing ES pod.
Created attachment 1595085 [details]
Log from ansible playbook
Additional info
---------------
This cluster has seen many install/uninstall playbook executions, but before the last installation the OCP cluster was successfully uninstalled by the Ansible playbook.

Process workflow: Uninstall OCP -> Install OCP -> set es_cluster_size=3 -> deploy openshift-logging again