Bug 1543478 - openshift_logging_elasticsearch fails with No such file or directory
Summary: openshift_logging_elasticsearch fails with No such file or directory
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Logging
Version: 3.9.0
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: low
Target Milestone: ---
Target Release: 3.9.0
Assignee: Jeff Cantrill
QA Contact: Junqi Zhao
URL:
Whiteboard:
Depends On: 1541403
Blocks:
Reported: 2018-02-08 14:27 UTC by Jeff Cantrill
Modified: 2018-04-04 12:47 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: The task assumed the oc binary was on the PATH available to Ansible.
Consequence: The task failed when the oc binary was not found.
Fix: The task was modified to allow the binary to be provided as an Ansible fact.
Result: The task completes successfully.
Clone Of: 1541403
Environment:
Last Closed: 2018-03-28 14:27:28 UTC
Target Upstream Version:




Links
System ID Priority Status Summary Last Updated
Github openshift openshift-ansible pull 6910 None None None 2018-02-08 14:27:55 UTC
Red Hat Product Errata RHBA-2018:0489 None None None 2018-03-28 14:27:56 UTC

Description Jeff Cantrill 2018-02-08 14:27:56 UTC
+++ This bug was initially created as a clone of Bug #1541403 +++

Opened this bug to track this issue (https://github.com/openshift/openshift-ansible/issues/6911/) on OCP. A PR is already opened: 
https://github.com/openshift/openshift-ansible/pull/6910/

I wasn't using any version constraint when installing the openshift-ansible-playbooks RPM package. Earlier this week, the version of the RPM package was 3.7.14-1.git.0.4b35b2d.el7. Since the package was upgraded to 3.7.23-1.git.0.bc406aa.el7, installation of the cluster has been failing.

Version
ansible 2.4.2.0
openshift-ansible-3.7.23-1.git.0.bc406aa.el7.noarch
Steps To Reproduce
1. yum install openshift-ansible-3.7.23-1.git.0.bc406aa.el7.noarch
2. Add the configuration below to the inventory file.
3. /usr/bin/ansible-playbook -vvv /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml

openshift_hosted_logging_deploy=true
openshift_hosted_logging_storage_kind=dynamic
openshift_hosted_logging_elasticsearch_cluster_size=1
openshift_logging_elasticsearch_nodeselector="{'purpose': 'infra'}"
openshift_logging_curator_nodeselector="{'purpose': 'infra'}"
openshift_logging_kibana_nodeselector="{'purpose': 'infra'}"
openshift_logging_mux_nodeselector="{'purpose': 'infra'}"
openshift_logging_curator_default_days=90
openshift_logging_curator_run_hour=0
openshift_logging_curator_run_minute=0
Expected Results
Observed Results
The cause is that Ansible uses non-interactive SSH connections, so PATH is the system default, such as /sbin:/bin:/usr/sbin:/usr/bin. The task shown below calls oc by its bare name rather than by an absolute path, and the directory containing oc is not in that PATH.

TASK [openshift_logging_elasticsearch : command] *******************************
Friday 26 January 2018  18:04:18 +0000 (0:00:01.007)       0:16:18.018 ******** 
fatal: [ip-10-113-23-117.eu-west-2.compute.internal]: FAILED! => {"changed": false, "cmd": "oc get pod -l component=es,provider=openshift -n logging -o 'jsonpath={.items[*].metadata.name}'", "msg": "[Errno 2] No such file or directory", "rc": 2}
I think the oc command should be called with its absolute path.
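
The failure mode can be reproduced outside Ansible. The following is a minimal Python sketch for illustration only, not code from openshift-ansible: it runs a command by its bare name with PATH restricted to a given value and reports the same kind of error seen in the task output. The helper name run_with_path and the constant DEFAULT_PATH are hypothetical; the PATH value is the one quoted in this report.

```python
import os
import subprocess

# PATH as quoted in this report for non-interactive SSH sessions.
DEFAULT_PATH = "/sbin:/bin:/usr/sbin:/usr/bin"


def run_with_path(argv, search_path=DEFAULT_PATH):
    """Run argv with PATH set to search_path.

    Returns (True, return_code) if the executable was found and started,
    or (False, message) if it could not be found on search_path.
    """
    env = dict(os.environ, PATH=search_path)
    try:
        completed = subprocess.run(
            argv,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except FileNotFoundError as exc:
        # A bare name that is absent from every PATH directory fails
        # with ENOENT before the command ever runs.
        return False, f"[Errno {exc.errno}] {os.strerror(exc.errno)}"
    return True, completed.returncode
```

Calling run_with_path(["oc", "version"]) on a host where oc lives only in a directory outside that PATH would return False together with an Errno 2 message, whereas passing an absolute path as the first element of argv bypasses the PATH lookup entirely.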

Additional Information
Red Hat Enterprise Linux Server release 7.3 (Maipo)

--- Additional comment from Peter Verbist on 2018-02-07 17:49:09 EST ---

Dear reader,

Indeed. I had the same issue, but on Atomic hosts, which means I cannot just modify the path. My PATH is /sbin:/bin:/usr/sbin:/usr/bin, but on the Atomic host the oc command is in /usr/local/bin.
As a result, I changed the following:

diff --git a/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml b/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml
index 16de6f2..f818925 100644
--- a/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml
+++ b/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml
@@ -1,21 +1,21 @@
 ---
 - command: >
-    oc get pod -l component=es,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==\"Running\")].m
+    /usr/local/bin/oc get pod -l component=es,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==
   register: _cluster_pods
 
 - name: "Getting ES version for logging-es cluster"
   command: >
-    oc exec {{ _cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_curl }} -XGET 'h
+    /usr/local/bin/oc exec {{ _cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_c
   register: _curl_output
   when: _cluster_pods.stdout_lines | count > 0
 
 - command: >
-    oc get pod -l component=es-ops,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==\"Running\"
+    /usr/local/bin/oc get pod -l component=es-ops,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.pha
   register: _ops_cluster_pods
 
 - name: "Getting ES version for logging-es-ops cluster"
   command: >
-    oc exec {{ _ops_cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_curl }} -XGE
+    /usr/local/bin/oc exec {{ _ops_cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_loc
   register: _ops_curl_output
   when: _ops_cluster_pods.stdout_lines | count > 0

The underlying issue still needs to be fixed properly, but in the meantime this change lets you deploy EFK on your cluster.

Kind regards,

Peter
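
Hardcoding /usr/local/bin/oc works on Atomic hosts but would fail where oc is installed elsewhere. A more general approach is to accept an explicitly configured binary and only fall back to searching. The Python sketch below illustrates that idea under stated assumptions; it is not the code from the openshift-ansible fix. The function name resolve_client_binary and its parameters are hypothetical, and /usr/local/bin is included as an extra directory only because it is the location mentioned in this comment.

```python
import os
import shutil


def resolve_client_binary(name="oc", configured=None,
                          extra_dirs=("/usr/local/bin",),
                          search_path=None):
    """Return a usable path for the client binary, or None.

    Lookup order (an assumption of this sketch):
      1. an explicitly configured path, if one is given;
      2. the directories in search_path (defaults to the current PATH);
      3. extra_dirs, for hosts that keep the binary outside PATH.
    """
    if configured:
        # Trust the configured value only if it is an executable file.
        if os.path.isfile(configured) and os.access(configured, os.X_OK):
            return configured
        return None
    if search_path is None:
        search_path = os.environ.get("PATH", os.defpath)
    dirs = [d for d in search_path.split(os.pathsep) if d]
    dirs.extend(extra_dirs)
    return shutil.which(name, path=os.pathsep.join(dirs))
```

A caller would resolve the binary once and then build each command from the returned path, so a host with an unusual layout only needs the configured value to be set.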

--- Additional comment from Jeff Cantrill on 2018-02-08 09:01:43 EST ---

Updated https://github.com/openshift/openshift-ansible/pull/7069

--- Additional comment from Jeff Cantrill on 2018-02-08 09:22:36 EST ---

Fixed by https://github.com/openshift/openshift-ansible/pull/7068

Comment 2 Junqi Zhao 2018-02-22 12:44:27 UTC
Tested with openshift-ansible-3.9.0-0.47.0; the ES pod started without a deployment error.
# rpm -qa | grep openshift-ansible
openshift-ansible-docs-3.9.0-0.47.0.git.0.f8847bb.el7.noarch
openshift-ansible-roles-3.9.0-0.47.0.git.0.f8847bb.el7.noarch
openshift-ansible-3.9.0-0.47.0.git.0.f8847bb.el7.noarch
openshift-ansible-playbooks-3.9.0-0.47.0.git.0.f8847bb.el7.noarch

There is one defect for node selector, but it is not related to this defect.
https://bugzilla.redhat.com/show_bug.cgi?id=1547972

Comment 5 errata-xmlrpc 2018-03-28 14:27:28 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489

