Bug 1541403

Summary: openshift_logging_elasticsearch fails with No such file or directory
Product: OpenShift Container Platform Reporter: Bruno Andrade <bandrade>
Component: LoggingAssignee: Jeff Cantrill <jcantril>
Status: CLOSED ERRATA QA Contact: Anping Li <anli>
Severity: low Docs Contact:
Priority: medium    
Version: 3.7.0CC: aos-bugs, bandrade, jmalde, juzhao, maupadhy, mmariyan, peter, rmeggins, smulholland, srirammageswaran
Target Milestone: ---   
Target Release: 3.7.z   
Hardware: Unspecified   
OS: Unspecified   
Fixed In Version: Doc Type: Bug Fix
Doc Text:
Cause: The task assumed the oc binary was in the usable path of ansible Consequence: The task failed when the oc binary is not found Fix: Modify the task to allow the binary to be provided as an ansible fact Result: The task completes successfully.
Story Points: ---
Clone Of:
: 1543478 (view as bug list) Environment:
Last Closed: 2018-04-05 09:37:08 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Bug Depends On:    
Bug Blocks: 1543478    

Description Bruno Andrade 2018-02-02 13:49:27 UTC
Opened this bug to track this issue (https://github.com/openshift/openshift-ansible/issues/6911/) on OCP. A PR is already opened: 

I wasn't using any version constraint for the installation of openshift-ansible-playbooks RPM package. Earlier this week, the version of the RPM package was 3.7.14-1.git.0.4b35b2d.el7. Since the package upgraded to 3.7.23-1.git.0.bc406aa.el7 installation of the cluster is failing

Steps To Reproduce
yum install openshift-ansible-3.7.23-1.git.0.bc406aa.el7.noarch
Add the configuration below to the inventory file.
/usr/bin/ansible-playbook -vvv /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
openshift_logging_elasticsearch_nodeselector="{'purpose': 'infra'}"
openshift_logging_curator_nodeselector="{'purpose': 'infra'}"
openshift_logging_kibana_nodeselector="{'purpose': 'infra'}"
openshift_logging_mux_nodeselector="{'purpose': 'infra'}"
Expected Results
Observed Results
The reason is Ansible is using non-interactive SSH connections which means PATH is the system default such as /sbin:/bin:/usr/sbin:/usr/bin. At the same time, the line below is calling an oc command with a relative path which is not in PATH.

TASK [openshift_logging_elasticsearch : command] *******************************
Friday 26 January 2018  18:04:18 +0000 (0:00:01.007)       0:16:18.018 ******** 
fatal: [ip-10-113-23-117.eu-west-2.compute.internal]: FAILED! => {"changed": false, "cmd": "oc get pod -l component=es,provider=openshift -n logging -o 'jsonpath={.items[*].metadata.name}'", "msg": "[Errno 2] No such file or directory", "rc": 2}
I think oc command should be called with the absolute path

Additional Information
Red Hat Enterprise Linux Server release 7.3 (Maipo)

Comment 1 Peter Verbist 2018-02-07 22:49:09 UTC
Dear reader,

Indeed. Had the same issue but on atomic hosts. This means I cannot just modify the path. My path = /sbin:/bin:/usr/sbin:/usr/bin ; but on the atomic host, the oc command can be found in: /usr/local/bin
As a result I changed the following:

diff --git a/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml b/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml
index 16de6f2..f818925 100644
--- a/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml
+++ b/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml
@@ -1,21 +1,21 @@
 - command: >
-    oc get pod -l component=es,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==\"Running\")].m
+    /usr/local/bin/oc get pod -l component=es,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==
   register: _cluster_pods
 - name: "Getting ES version for logging-es cluster"
   command: >
-    oc exec {{ _cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_curl }} -XGET 'h
+    /usr/local/bin/oc exec {{ _cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_c
   register: _curl_output
   when: _cluster_pods.stdout_lines | count > 0
 - command: >
-    oc get pod -l component=es-ops,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==\"Running\"
+    /usr/local/bin/oc get pod -l component=es-ops,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.pha
   register: _ops_cluster_pods
 - name: "Getting ES version for logging-es-ops cluster"
   command: >
-    oc exec {{ _ops_cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_curl }} -XGE
+    /usr/local/bin/oc exec {{ _ops_cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_loc
   register: _ops_curl_output
   when: _ops_cluster_pods.stdout_lines | count > 0

Main issue should be handled but in the mean time, you can deploy EFK on your cluster.

Kind regards,


Comment 2 Jeff Cantrill 2018-02-08 14:01:43 UTC
UPdated https://github.com/openshift/openshift-ansible/pull/7069

Comment 3 Jeff Cantrill 2018-02-08 14:22:36 UTC
Fixed by https://github.com/openshift/openshift-ansible/pull/7068

Comment 4 Jeff Cantrill 2018-02-08 14:29:18 UTC
3.7 PR https://github.com/openshift/openshift-ansible/pull/7070

Comment 6 Junqi Zhao 2018-02-14 08:56:01 UTC
The latest openshift-ansible version is openshift-ansible-roles-3.7.29-1.git.0.e1bfc35.el7.noarch, which does not shipped with fix in PR 7070

Change back to MODIFIED

Comment 8 Anping Li 2018-03-06 10:30:35 UTC
The logging deploy succeed with openshift-ansible-3.7.36-1.git.0.0f4a93c.el7.noarch

Comment 15 errata-xmlrpc 2018-04-05 09:37:08 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.