Bug 1636616 - OCS 3.10 fails to deploy at 'Load heketi topology' with 'command not found'
Summary: OCS 3.10 fails to deploy at 'Load heketi topology' with 'command not found'
Keywords:
Status: CLOSED NOTABUG
Alias: None
Product: Red Hat Gluster Storage
Classification: Red Hat Storage
Component: CNS-deployment
Version: cns-3.10
Hardware: x86_64
OS: Linux
Priority: urgent
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Jose A. Rivera
QA Contact: Prasanth
URL:
Whiteboard:
Depends On:
Blocks: OCS-3.11.1-devel-triage-done 1642792
Reported: 2018-10-05 21:28 UTC by Davi Garcia
Modified: 2021-12-10 17:49 UTC (History)
CC List: 14 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2018-11-07 18:40:04 UTC
Embargoed:



Description Davi Garcia 2018-10-05 21:28:57 UTC
>> Description of problem:

We are trying to deploy OCS 3.10 in Independent Mode on an OCP 3.9 cluster (a combination that should be supported), but the deployment fails during the 'openshift_storage_glusterfs : Load heketi topology' task with multiple 'command not found' errors.

TASK [openshift_storage_glusterfs : Load heketi topology] ********************************************************************************************
Friday 05 October 2018  16:56:04 -0300 (0:00:01.729)       0:04:58.627 ******** 
fatal: [s01ops04]: FAILED! => {"changed": true, "cmd": ["oc", "--config=/tmp/openshift-glusterfs-ansible-GgLL9D/admin.kubeconfig", "rsh", "--namespace=openshift-storage-ext", "deploy-heketi-storage-2-6vg72", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--secret", "SkACnWqb3pfWaQKV6WGsM/0bhfo9OTsg7PE0g4t4o1E=", "topology", "load", "--json=/tmp/openshift-glusterfs-ansible-GgLL9D/topology.json", "2>&1"], "delta": "0:00:05.826397", "end": "2018-10-05 16:56:11.009775", "failed_when_result": true, "rc": 0, "start": "2018-10-05 16:56:05.183378", "stderr": "", "stderr_lines": [], "stdout": "Creating cluster ... ID: f22481462a83889ba9082080a563e9f2\n\tAllowing file volumes on cluster.\n\tAllowing block volumes on cluster.\n\tCreating node s01ocs01.example.com ... ID: a00903ff26d020f48592b44f07283605\n\t\tAdding device /dev/sdb ... Unable to add device: sudo: pvcreate: command not found\n\tCreating node s01ocs02.example.com ... Unable to create node: sudo: gluster: command not found\n\tCreating node s01ocs03.example.com ... Unable to create node: sudo: gluster: command not found\n\tCreating node s01ocs04.example.com ... Unable to create node: sudo: gluster: command not found", "stdout_lines": ["Creating cluster ... ID: f22481462a83889ba9082080a563e9f2", "\tAllowing file volumes on cluster.", "\tAllowing block volumes on cluster.", "\tCreating node s01ocs01.example.com ... ID: a00903ff26d020f48592b44f07283605", "\t\tAdding device /dev/sdb ... Unable to add device: sudo: pvcreate: command not found", "\tCreating node s01ocs02.example.com ... Unable to create node: sudo: gluster: command not found", "\tCreating node s01ocs03.example.com ... Unable to create node: sudo: gluster: command not found", "\tCreating node s01ocs04.example.com ... Unable to create node: sudo: gluster: command not found"]}

>> Version-Release:

OpenShift Container Platform 3.9.33
Advanced Installer (openshift-ansible) 3.9.z
OpenShift Container Storage 3.10.z

>> How reproducible:

Easy

>> Steps to Reproduce:
1. Using an OCP 3.9.z environment, customize the inventory with:

openshift_storage_glusterfs_namespace=openshift-storage-ext
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_storageclass_default=false
openshift_storage_glusterfs_block_deploy=true
openshift_storage_glusterfs_block_host_vol_create=true
openshift_storage_glusterfs_block_host_vol_size=400
openshift_storage_glusterfs_block_storageclass=true
openshift_storage_glusterfs_block_storageclass_default=false
openshift_storage_glusterfs_is_native=false
openshift_storage_glusterfs_heketi_is_native=true
openshift_storage_glusterfs_heketi_executor=ssh
openshift_storage_glusterfs_heketi_ssh_port=22
openshift_storage_glusterfs_heketi_ssh_user=heketi
openshift_storage_glusterfs_heketi_ssh_sudo=true
openshift_storage_glusterfs_heketi_ssh_keyfile="/root/.ssh/heketi_rsa"
openshift_storage_glusterfs_image=registry.access.redhat.com/rhgs3/rhgs-server-rhel7:v3.10
openshift_storage_glusterfs_block_image=registry.access.redhat.com/rhgs3/rhgs-gluster-block-prov-rhel7:v3.10
openshift_storage_glusterfs_heketi_image=registry.access.redhat.com/rhgs3/rhgs-volmanager-rhel7:v3.10

2. Run the openshift-glusterfs playbook:

ansible-playbook -i ~/inventory /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/config.yml

>> Actual results:

TASK [openshift_storage_glusterfs : Load heketi topology] ********************************************************************************************
Friday 05 October 2018  16:56:04 -0300 (0:00:01.729)       0:04:58.627 ******** 
fatal: [s01ops04]: FAILED! => {"changed": true, "cmd": ["oc", "--config=/tmp/openshift-glusterfs-ansible-GgLL9D/admin.kubeconfig", "rsh", "--namespace=openshift-storage-ext", "deploy-heketi-storage-2-6vg72", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--secret", "SkACnWqb3pfWaQKV6WGsM/0bhfo9OTsg7PE0g4t4o1E=", "topology", "load", "--json=/tmp/openshift-glusterfs-ansible-GgLL9D/topology.json", "2>&1"], "delta": "0:00:05.826397", "end": "2018-10-05 16:56:11.009775", "failed_when_result": true, "rc": 0, "start": "2018-10-05 16:56:05.183378", "stderr": "", "stderr_lines": [], "stdout": "Creating cluster ... ID: f22481462a83889ba9082080a563e9f2\n\tAllowing file volumes on cluster.\n\tAllowing block volumes on cluster.\n\tCreating node s01ocs01.example.com ... ID: a00903ff26d020f48592b44f07283605\n\t\tAdding device /dev/sdb ... Unable to add device: sudo: pvcreate: command not found\n\tCreating node s01ocs02.example.com ... Unable to create node: sudo: gluster: command not found\n\tCreating node s01ocs03.example.com ... Unable to create node: sudo: gluster: command not found\n\tCreating node s01ocs04.example.com ... Unable to create node: sudo: gluster: command not found", "stdout_lines": ["Creating cluster ... ID: f22481462a83889ba9082080a563e9f2", "\tAllowing file volumes on cluster.", "\tAllowing block volumes on cluster.", "\tCreating node s01ocs01.example.com ... ID: a00903ff26d020f48592b44f07283605", "\t\tAdding device /dev/sdb ... Unable to add device: sudo: pvcreate: command not found", "\tCreating node s01ocs02.example.com ... Unable to create node: sudo: gluster: command not found", "\tCreating node s01ocs03.example.com ... Unable to create node: sudo: gluster: command not found", "\tCreating node s01ocs04.example.com ... Unable to create node: sudo: gluster: command not found"]}

PLAY RECAP *******************************************************************************************************************************************
localhost                  : ok=12   changed=0    unreachable=0    failed=0   
s01ocs01.example.com       : ok=20   changed=0    unreachable=0    failed=0   
s01ocs02.example.com       : ok=17   changed=0    unreachable=0    failed=0   
s01ocs03.example.com       : ok=17   changed=0    unreachable=0    failed=0   
s01ocs04.example.com       : ok=17   changed=0    unreachable=0    failed=0   
s01ops04                   : ok=64   changed=12   unreachable=0    failed=1   
s01ops05                   : ok=19   changed=1    unreachable=0    failed=0   
s01ops06                   : ok=19   changed=1    unreachable=0    failed=0   
s01ops11                   : ok=1    changed=0    unreachable=0    failed=0     

>> Expected results:

OCS in Independent Mode is installed successfully.

>> Additional info:

We have already verified that the remote user configured for Heketi has 'sudo' permissions and that the proper SSH keys are distributed. We could also run the commands manually without problems:

[root@s01ocs01 ~]# su heketi -
[heketi@s01ocs01 root]$ whoami
heketi
[heketi@s01ocs01 root]$ sudo pvcreate --metadatasize=128M --dataalignment=256K '/dev/sdb'
  Physical volume "/dev/sdb" successfully created.

Comment 18 Davi Garcia 2018-11-07 18:40:04 UTC
We did some extra tests and verified that the 'heketi' user can only find the binaries in its PATH in login shells (the environment variable was being exported from /etc/profile). In non-interactive shells, such as the SSH sessions Heketi uses to run commands, PATH was not properly defined, and that was causing the problem. We fixed it by adding the export to .bashrc for the 'heketi' user.
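
For anyone hitting the same symptom, here is a minimal sketch of how the PATH difference could be checked. The helper name missing_cmds is hypothetical (it is not part of openshift-ansible or Heketi); it only reports which commands cannot be resolved with a given PATH value. The commented ssh line assumes the key file and host name from this report, and assumes the remote shell is bash, which typically reads ~/.bashrc but not /etc/profile for commands run over SSH.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each command that is NOT resolvable
# when PATH is set to the given value.
# Usage: missing_cmds "<PATH value>" cmd1 cmd2 ...
missing_cmds() {
    local search_path="$1"
    shift
    local cmd
    for cmd in "$@"; do
        # Do the lookup in a subshell so the caller's PATH stays untouched.
        ( PATH="$search_path"; command -v "$cmd" >/dev/null 2>&1 ) \
            || printf '%s\n' "$cmd"
    done
}

# To see the PATH that a non-interactive SSH session actually gets
# (key file and host taken from this report), something like:
#   ssh -i /root/.ssh/heketi_rsa heketi@s01ocs01.example.com 'echo "$PATH"'
# and then feed that value to the helper on the storage node:
#   missing_cmds "<PATH printed above>" pvcreate gluster
```

If the helper prints pvcreate or gluster for the PATH seen over SSH but prints nothing for the PATH of an interactive login, the environment differs between the two kinds of shell, which matches what we found here.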

