+++ This bug was initially created as a clone of Bug #1541403 +++ Opened this bug to track this issue (https://github.com/openshift/openshift-ansible/issues/6911/) on OCP. A PR is already opened: https://github.com/openshift/openshift-ansible/pull/6910/ I wasn't using any version constraint for the installation of openshift-ansible-playbooks RPM package. Earlier this week, the version of the RPM package was 3.7.14-1.git.0.4b35b2d.el7. Since the package upgraded to 3.7.23-1.git.0.bc406aa.el7 installation of the cluster is failing Version ansible 2.4.2.0 openshift-ansible-3.7.23-1.git.0.bc406aa.el7.noarch Steps To Reproduce yum install openshift-ansible-3.7.23-1.git.0.bc406aa.el7.noarch Add the configuration below to the inventory file. /usr/bin/ansible-playbook -vvv /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml openshift_hosted_logging_deploy=true openshift_hosted_logging_storage_kind=dynamic openshift_hosted_logging_elasticsearch_cluster_size=1 openshift_logging_elasticsearch_nodeselector="{'purpose': 'infra'}" openshift_logging_curator_nodeselector="{'purpose': 'infra'}" openshift_logging_kibana_nodeselector="{'purpose': 'infra'}" openshift_logging_mux_nodeselector="{'purpose': 'infra'}" openshift_logging_curator_default_days=90 openshift_logging_curator_run_hour=0 openshift_logging_curator_run_minute=0 Expected Results Observed Results The reason is Ansible is using non-interactive SSH connections which means PATH is the system default such as /sbin:/bin:/usr/sbin:/usr/bin. At the same time, the line below is calling an oc command with a relative path which is not in PATH. TASK [openshift_logging_elasticsearch : command] ******************************* Friday 26 January 2018 18:04:18 +0000 (0:00:01.007) 0:16:18.018 ******** fatal: [ip-10-113-23-117.eu-west-2.compute.internal]: FAILED! => {"changed": false, "cmd": "oc get pod -l component=es,provider=openshift -n logging -o 'jsonpath={.items[*].metadata.name}'", "msg": "[Errno 2] No such file or directory", "rc": 2} I think oc command should be called with the absolute path Additional Information Red Hat Enterprise Linux Server release 7.3 (Maipo) --- Additional comment from Peter Verbist on 2018-02-07 17:49:09 EST --- Dear reader, Indeed. Had the same issue but on atomic hosts. This means I cannot just modify the path. My path = /sbin:/bin:/usr/sbin:/usr/bin ; but on the atomic host, the oc command can be found in: /usr/local/bin As a result I changed the following: diff --git a/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml b/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml index 16de6f2..f818925 100644 --- a/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml +++ b/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml @@ -1,21 +1,21 @@ --- - command: > - oc get pod -l component=es,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==\"Running\")].m + /usr/local/bin/oc get pod -l component=es,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase== register: _cluster_pods - name: "Getting ES version for logging-es cluster" command: > - oc exec {{ _cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_curl }} -XGET 'h + /usr/local/bin/oc exec {{ _cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_c register: _curl_output when: _cluster_pods.stdout_lines | count > 0 - command: > - oc get pod -l component=es-ops,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==\"Running\" + /usr/local/bin/oc get pod -l component=es-ops,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.pha register: _ops_cluster_pods - name: "Getting ES version for logging-es-ops cluster" command: > - oc exec {{ _ops_cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_curl }} -XGE + /usr/local/bin/oc exec {{ _ops_cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_loc register: _ops_curl_output when: _ops_cluster_pods.stdout_lines | count > 0 Main issue should be handled but in the mean time, you can deploy EFK on your cluster. Kind regards, Peter --- Additional comment from Jeff Cantrill on 2018-02-08 09:01:43 EST --- UPdated https://github.com/openshift/openshift-ansible/pull/7069 --- Additional comment from Jeff Cantrill on 2018-02-08 09:22:36 EST --- Fixed by https://github.com/openshift/openshift-ansible/pull/7068
Tested with openshift-ansible-3.9.0-0.47.0, es pod could be started up without deployment error # rpm -qa | grep openshift-ansible openshift-ansible-docs-3.9.0-0.47.0.git.0.f8847bb.el7.noarch openshift-ansible-roles-3.9.0-0.47.0.git.0.f8847bb.el7.noarch openshift-ansible-3.9.0-0.47.0.git.0.f8847bb.el7.noarch openshift-ansible-playbooks-3.9.0-0.47.0.git.0.f8847bb.el7.noarch There is one defect for node selector, but it is not related to this defect. https://bugzilla.redhat.com/show_bug.cgi?id=1547972
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:0489