Description of problem:
Unable to fully execute Aggregated Logging playbook when specifying multiple replicas of Elasticsearch. Fails to roll out Elasticsearch replicas:

FAILED - RETRYING: Waiting for logging-es-data-master-hsjwgec4 to finish scaling up (60 retries left).
FAILED - RETRYING: Waiting for logging-es-data-master-hsjwgec4 to finish scaling up (59 retries left).
FAILED - RETRYING: Waiting for logging-es-data-master-hsjwgec4 to finish scaling up (58 retries left).
FAILED - RETRYING: Waiting for logging-es-data-master-hsjwgec4 to finish scaling up (57 retries left).

Issue can be overcome by specifying the following inventory variable:

logging_elasticsearch_rollout_override=true

Once the playbook completes, each Elasticsearch DeploymentConfig can be rolled out.

Version-Release number of selected component (if applicable):
3.7.23

How reproducible:
Always

Steps to Reproduce:
1. Specify multiple replicas of Elasticsearch in inventory:
   openshift_logging_es_number_of_replicas
2. Execute Aggregated Logging playbook:
   ansible-playbook [-i </path/to/inventory>] \
       /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-cluster/openshift-logging.yml

Actual results:
Playbook fails as Elasticsearch never becomes ready.

Expected results:
Aggregated logging playbook completes successfully.

Additional info:
Andy, is this during a fresh installation or an upgrade?
(In reply to ewolinet from comment #1)
> Andy,
> is this during a fresh installation or an upgrade?

It occurs on both installs and upgrades.
This should resolve the fresh install issue:
https://github.com/openshift/openshift-ansible/pull/7097

When you say that the upgrade fails, do you mean that the playbook ultimately fails, or that "FAILED - RETRYING: Waiting for logging-es-data-master-hsjwgec4 to finish scaling up (# retries left)." shows up in the logs a lot?

Also, to clarify: by "upgrade", do you mean that there is an existing deployment of logging and it is being upgraded (not just that OCP is being upgraded and a fresh installation of logging is being installed)?
I've been using the below command to deploy the ES pods after using Andy's workaround:

for x in $(oc get dc -l component=es -o=custom-columns=NAME:.metadata.name --no-headers); do oc rollout latest $x; done
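
If it helps anyone, here is a rough sketch of the same loop wrapped in a function that also waits for the rollouts. It triggers every rollout first, as the one-liner does, and only afterwards calls `oc rollout status` on each DeploymentConfig. The function name `rollout_es_dcs` is made up for illustration, and it assumes `oc` is already logged in with the logging project selected; I have not verified it against every 3.7 build.

```shell
# Hypothetical helper: roll out all Elasticsearch DeploymentConfigs, then wait.
# Assumes `oc` is logged in and the current project is the logging project.
rollout_es_dcs() {
  local dcs dc
  # Same selector and output format as the one-liner in this comment.
  dcs=$(oc get dc -l component=es -o=custom-columns=NAME:.metadata.name --no-headers) || return 1

  # First pass: trigger a new rollout for every DC without waiting.
  for dc in $dcs; do
    oc rollout latest "dc/${dc}" || return 1
  done

  # Second pass: block on each rollout until it completes or fails.
  for dc in $dcs; do
    oc rollout status "dc/${dc}" || return 1
  done
}
```

Run it as `rollout_es_dcs` after the playbook has completed with the override variable set. Waiting is deferred to the second loop so that no single DC has to finish before its peers have been started.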
Same issue with openshift3/ose-ansible/images/v3.7?
Pass with openshift-ansible:v3.7.36.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0636
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days.