1543478 – openshift_logging_elasticsearch fails with No such file or directory

Summary: openshift_logging_elasticsearch fails with No such file or directory

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Logging
Sub Component:
Version:	3.9.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	medium
Severity:	low
Target Milestone:	---
Target Release:	3.9.0
Assignee:	Jeff Cantrill
QA Contact:	Junqi Zhao
Docs Contact:
URL:
Whiteboard:
Depends On:	1541403
Blocks:

Reported:	2018-02-08 14:27 UTC by Jeff Cantrill
Modified:	2021-06-10 14:32 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: The task assumed the oc binary was in the usable path of Ansible. Consequence: The task failed when the oc binary was not found. Fix: Modify the task to allow the binary to be provided as an Ansible fact. Result: The task completes successfully.
Clone Of:	1541403
Environment:
Last Closed:	2018-03-28 14:27:28 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Github	openshift openshift-ansible pull 6910	0	None	None	None	2018-02-08 14:27:55 UTC
Red Hat Product Errata	RHBA-2018:0489	0	None	None	None	2018-03-28 14:27:56 UTC

Description Jeff Cantrill 2018-02-08 14:27:56 UTC

+++ This bug was initially created as a clone of Bug #1541403 +++

Opened this bug to track this issue (https://github.com/openshift/openshift-ansible/issues/6911/) on OCP. A PR is already opened: 
https://github.com/openshift/openshift-ansible/pull/6910/

I wasn't using any version constraint when installing the openshift-ansible-playbooks RPM package. Earlier this week, the version of the RPM package was 3.7.14-1.git.0.4b35b2d.el7. Since the package was upgraded to 3.7.23-1.git.0.bc406aa.el7, installation of the cluster has been failing.

Version
ansible 2.4.2.0
openshift-ansible-3.7.23-1.git.0.bc406aa.el7.noarch
Steps To Reproduce
1. yum install openshift-ansible-3.7.23-1.git.0.bc406aa.el7.noarch
2. Add the configuration below to the inventory file.
3. /usr/bin/ansible-playbook -vvv /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml

openshift_hosted_logging_deploy=true
openshift_hosted_logging_storage_kind=dynamic
openshift_hosted_logging_elasticsearch_cluster_size=1
openshift_logging_elasticsearch_nodeselector="{'purpose': 'infra'}"
openshift_logging_curator_nodeselector="{'purpose': 'infra'}"
openshift_logging_kibana_nodeselector="{'purpose': 'infra'}"
openshift_logging_mux_nodeselector="{'purpose': 'infra'}"
openshift_logging_curator_default_days=90
openshift_logging_curator_run_hour=0
openshift_logging_curator_run_minute=0
Expected Results
Observed Results
The reason is that Ansible uses non-interactive SSH connections, which means PATH is the system default, such as /sbin:/bin:/usr/sbin:/usr/bin. At the same time, the task below calls the oc command by its bare name rather than by a full path, and the directory that contains oc is not in that PATH.

TASK [openshift_logging_elasticsearch : command] *******************************
Friday 26 January 2018  18:04:18 +0000 (0:00:01.007)       0:16:18.018 ******** 
fatal: [ip-10-113-23-117.eu-west-2.compute.internal]: FAILED! => {"changed": false, "cmd": "oc get pod -l component=es,provider=openshift -n logging -o 'jsonpath={.items[*].metadata.name}'", "msg": "[Errno 2] No such file or directory", "rc": 2}
I think the oc command should be called with its absolute path.
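
The failure mode described above can be reproduced without an OpenShift client. The following is a minimal sketch, assuming a POSIX host; it uses the Python interpreter as a stand-in for oc, and the helper names run_with_path and demo are hypothetical. The message format in the task output matches what Python reports when an executable cannot be found on PATH.

```python
import errno
import os
import subprocess
import sys
import tempfile


def run_with_path(argv, path_value):
    """Run argv with PATH replaced by path_value and report the outcome."""
    env = dict(os.environ, PATH=path_value)
    try:
        proc = subprocess.run(argv, env=env, stdout=subprocess.PIPE,
                              universal_newlines=True)
    except OSError as exc:
        # A bare command name that is not found in any PATH directory ends up here.
        return {"rc": exc.errno, "msg": "[Errno %d] %s" % (exc.errno, exc.strerror)}
    return {"rc": proc.returncode, "stdout": proc.stdout.strip()}


def demo():
    # Stand-in for oc (assumption): the running Python interpreter.
    binary_name = os.path.basename(sys.executable)
    with tempfile.TemporaryDirectory() as empty_dir:
        # PATH points at an empty directory, so the bare name cannot be resolved.
        by_name = run_with_path([binary_name, "-c", "print('ok')"], empty_dir)
        # The absolute path does not depend on PATH at all.
        by_abs = run_with_path([sys.executable, "-c", "print('ok')"], empty_dir)
    return by_name, by_abs


if __name__ == "__main__":
    for result in demo():
        print(result)
```

The first call fails with ENOENT because the lookup only consults the PATH it was given; the second succeeds because an absolute path bypasses the lookup.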

Additional Information
Red Hat Enterprise Linux Server release 7.3 (Maipo)

--- Additional comment from Peter Verbist on 2018-02-07 17:49:09 EST ---

Dear reader,

Indeed, I had the same issue, but on Atomic hosts, which means I cannot just modify the path. My PATH is /sbin:/bin:/usr/sbin:/usr/bin, but on the Atomic host the oc command is in /usr/local/bin.
As a result, I changed the following:

diff --git a/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml b/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml
index 16de6f2..f818925 100644
--- a/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml
+++ b/roles/openshift_logging_elasticsearch/tasks/get_es_version.yml
@@ -1,21 +1,21 @@
 ---
 - command: >
-    oc get pod -l component=es,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==\"Running\")].m
+    /usr/local/bin/oc get pod -l component=es,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==
   register: _cluster_pods
 
 - name: "Getting ES version for logging-es cluster"
   command: >
-    oc exec {{ _cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_curl }} -XGET 'h
+    /usr/local/bin/oc exec {{ _cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_c
   register: _curl_output
   when: _cluster_pods.stdout_lines | count > 0
 
 - command: >
-    oc get pod -l component=es-ops,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.phase==\"Running\"
+    /usr/local/bin/oc get pod -l component=es-ops,provider=openshift -n {{ openshift_logging_elasticsearch_namespace }} -o jsonpath={.items[?(@.status.pha
   register: _ops_cluster_pods
 
 - name: "Getting ES version for logging-es-ops cluster"
   command: >
-    oc exec {{ _ops_cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_local_curl }} -XGE
+    /usr/local/bin/oc exec {{ _ops_cluster_pods.stdout.split(' ')[0] }} -c elasticsearch -n {{ openshift_logging_elasticsearch_namespace }} -- {{ __es_loc
   register: _ops_curl_output
   when: _ops_cluster_pods.stdout_lines | count > 0

The main issue should still be handled properly, but in the meantime this change lets you deploy EFK on your cluster.

Kind regards,

Peter
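
Instead of hard-coding /usr/local/bin/oc as in the workaround above, the absolute path could be looked up across the default directories plus any extra ones. This is only a sketch of that idea, not what the role does; the function name resolve_binary and its parameters are hypothetical, and the default directories are the ones quoted in this report.

```python
import os
import shutil


def resolve_binary(name,
                   base_path="/sbin:/bin:/usr/sbin:/usr/bin",
                   extra_dirs=("/usr/local/bin",)):
    """Return the absolute path of name, or None if it is in none of the directories.

    base_path mirrors the non-interactive default PATH from the report;
    extra_dirs adds locations such as the one used on Atomic hosts.
    """
    search_path = os.pathsep.join([base_path] + list(extra_dirs))
    # shutil.which searches the given path string instead of os.environ["PATH"].
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    print(resolve_binary("oc"))
```

A caller would need to handle the None case explicitly, for example by failing early with a message that names the directories searched.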

--- Additional comment from Jeff Cantrill on 2018-02-08 09:01:43 EST ---

Updated https://github.com/openshift/openshift-ansible/pull/7069

--- Additional comment from Jeff Cantrill on 2018-02-08 09:22:36 EST ---

Fixed by https://github.com/openshift/openshift-ansible/pull/7068
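
The Doc Text for this bug describes the fix as letting the binary be provided as an Ansible fact. The content of the pull request is not reproduced here, so the following is only an illustration of that idea in Python: the client binary becomes a parameter instead of a literal oc. The function name es_pod_query and the parameter client_binary are hypothetical; the arguments are taken from the failing command in the task output.

```python
import shlex


def es_pod_query(client_binary="oc", namespace="logging"):
    """Build the argv that lists Elasticsearch pod names.

    client_binary is supplied by the caller (for example from a discovered
    fact), so a host where oc lives outside PATH can pass an absolute path.
    """
    return [
        client_binary, "get", "pod",
        "-l", "component=es,provider=openshift",
        "-n", namespace,
        "-o", "jsonpath={.items[*].metadata.name}",
    ]


def as_shell_string(argv):
    """Render argv as a single shell-quoted string for logging."""
    return " ".join(shlex.quote(arg) for arg in argv)


if __name__ == "__main__":
    print(as_shell_string(es_pod_query("/usr/local/bin/oc")))
```

Keeping the default at oc preserves the old behaviour on hosts where the binary is already on PATH.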

Comment 2 Junqi Zhao 2018-02-22 12:44:27 UTC

Tested with openshift-ansible-3.9.0-0.47.0; the ES pod started up without a deployment error.
# rpm -qa | grep openshift-ansible
openshift-ansible-docs-3.9.0-0.47.0.git.0.f8847bb.el7.noarch
openshift-ansible-roles-3.9.0-0.47.0.git.0.f8847bb.el7.noarch
openshift-ansible-3.9.0-0.47.0.git.0.f8847bb.el7.noarch
openshift-ansible-playbooks-3.9.0-0.47.0.git.0.f8847bb.el7.noarch

There is one defect with the node selector, but it is not related to this bug.
https://bugzilla.redhat.com/show_bug.cgi?id=1547972

Comment 5 errata-xmlrpc 2018-03-28 14:27:28 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0489
