Bug 1734793 - [openshift-ansible]/openshift-logging/config.yaml failed provisioning additional instances of Elasticsearch
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Target Milestone: ---
Target Release: 3.11.z
Assignee: Jeff Cantrill
QA Contact: Anping Li
Depends On:
Reported: 2019-07-31 13:18 UTC by Radomir Ludva
Modified: 2020-02-02 01:29 UTC (History)
2 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Last Closed: 2020-02-02 01:29:36 UTC
Target Upstream Version:

Attachments (Terms of Use)
Inventory file (second variant with es_cluster_size=3) (5.94 KB, text/plain)
2019-07-31 13:18 UTC, Radomir Ludva
no flags
Log from ansible playbook (4.26 MB, text/plain)
2019-07-31 13:19 UTC, Radomir Ludva
no flags

Description Radomir Ludva 2019-07-31 13:18:32 UTC
Created attachment 1595084 [details]
Inventory file (second variant with es_cluster_size=3)

Description of problem:
After provisioning OCP v3.11.129 with

I realized that I needed:

After executing the openshift-logging playbook, it failed to deploy the remaining two ES pods. I had to manually run oc rollout latest <es-deployment> in the openshift-logging namespace; only then were the two additional ES pods created. This rollout should be handled by the Ansible playbook itself.

Version-Release number of the following components:
# rpm -q openshift-ansible

# rpm -q ansible

# ansible --version
ansible 2.6.18
  config file = /usr/share/ansible/openshift-ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Jun 11 2019, 12:19:05) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]

How reproducible:
Deploy OCP with only one instance of ES, then change the inventory file to set
openshift_logging_es_cluster_size=3 and redeploy openshift-logging.
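
For reference, the reproduction amounts to changing one inventory variable and re-running the logging playbook. A minimal sketch of the relevant inventory fragment (variable names are the standard openshift-ansible logging variables; the exact surrounding inventory is in the attachment):

```ini
# In the [OSEv3:vars] section of the inventory:
openshift_logging_install_logging=true
# was 1 (or unset) on the first deploy; raised to 3 to scale out ES
openshift_logging_es_cluster_size=3
```

Then re-run the logging playbook, e.g. ansible-playbook -i <inventory> /usr/share/ansible/openshift-ansible/playbooks/openshift-logging/config.yml (path may differ per install).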

Actual results:
FAILED - RETRYING: openshift_logging_elasticsearch : command (3 retries left).
FAILED - RETRYING: openshift_logging_elasticsearch : command (2 retries left).
FAILED - RETRYING: openshift_logging_elasticsearch : command (1 retries left).
fatal: [torii-ichi-master.local.nutius.com]: FAILED! => {"attempts": 120, "changed": true, "cmd": ["oc", "--config=/etc/origin/master/admin.kubeconfig", "get", "pod", "-l", "component=es,provider=openshift", "-n", "openshift-logging", "-o", "jsonpath={.items[?(@.status.phase==\"Running\")].metadata.name}"], "delta": "0:00:00.182890", "end": "2019-07-31 14:48:27.233188", "rc": 0, "start": "2019-07-31 14:48:27.050298", "stderr": "", "stderr_lines": [], "stdout": "logging-es-data-master-vevnrhov-1-g8k89", "stdout_lines": ["logging-es-data-master-vevnrhov-1-g8k89"]}
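
The failing task polls oc get pod with a jsonpath filter and retries until enough ES pods reach the Running phase. A minimal Python sketch of that selection logic, using hypothetical pod data (the real task shells out to oc; this is only an illustration of why the check keeps retrying when only the original DeploymentConfig was rolled out):

```python
def running_pod_names(pods):
    """Mimic jsonpath {.items[?(@.status.phase=="Running")].metadata.name}."""
    return [p["metadata"]["name"] for p in pods
            if p["status"]["phase"] == "Running"]

# Hypothetical pod list: only the original DC's pod is Running; the two
# new DeploymentConfigs exist but were never rolled out.
pods = [
    {"metadata": {"name": "logging-es-data-master-vevnrhov-1-g8k89"},
     "status": {"phase": "Running"}},
    {"metadata": {"name": "logging-es-data-master-newnode1-1-deploy"},
     "status": {"phase": "Pending"}},
]

names = running_pod_names(pods)
print(names)  # one Running pod, so a 3-node readiness check never passes
```

With es_cluster_size=3 the playbook expects three Running pods, but only the pre-existing one ever appears, so the task exhausts its 120 attempts and fails.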

Expected results:
The next two instances of ES should be created by the Ansible installation playbook without any issue.

Additional information:
I was simulating a situation where a customer has an existing ES and needs to extend it for high availability, without removing the existing storage or the existing ES pod.

Comment 1 Radomir Ludva 2019-07-31 13:19:21 UTC
Created attachment 1595085 [details]
Log from ansible playbook

Comment 2 Radomir Ludva 2019-07-31 13:24:40 UTC
Additional info:
This cluster has gone through many install/uninstall playbook executions. But before the last installation, OCP was successfully uninstalled by the Ansible playbook.

So the process workflow: Uninstall OCP -> Install OCP -> set openshift_logging_es_cluster_size=3 -> deploy openshift-logging again
