>> Description of problem:

We are trying to deploy OCS 3.10 in Independent Mode on an OCP 3.9 cluster (which should be supported), but it fails during the 'openshift_storage_glusterfs : Load heketi topology' task due to multiple 'command not found' errors.

TASK [openshift_storage_glusterfs : Load heketi topology] ********************************************************************************************
Friday 05 October 2018 16:56:04 -0300 (0:00:01.729) 0:04:58.627 ********
fatal: [s01ops04]: FAILED! => {"changed": true, "cmd": ["oc", "--config=/tmp/openshift-glusterfs-ansible-GgLL9D/admin.kubeconfig", "rsh", "--namespace=openshift-storage-ext", "deploy-heketi-storage-2-6vg72", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--secret", "SkACnWqb3pfWaQKV6WGsM/0bhfo9OTsg7PE0g4t4o1E=", "topology", "load", "--json=/tmp/openshift-glusterfs-ansible-GgLL9D/topology.json", "2>&1"], "delta": "0:00:05.826397", "end": "2018-10-05 16:56:11.009775", "failed_when_result": true, "rc": 0, "start": "2018-10-05 16:56:05.183378", "stderr": "", "stderr_lines": [], "stdout": "Creating cluster ... ID: f22481462a83889ba9082080a563e9f2\n\tAllowing file volumes on cluster.\n\tAllowing block volumes on cluster.\n\tCreating node s01ocs01.example.com ... ID: a00903ff26d020f48592b44f07283605\n\t\tAdding device /dev/sdb ... Unable to add device: sudo: pvcreate: command not found\n\tCreating node s01ocs02.example.com ... Unable to create node: sudo: gluster: command not found\n\tCreating node s01ocs03.example.com ... Unable to create node: sudo: gluster: command not found\n\tCreating node s01ocs04.example.com ... Unable to create node: sudo: gluster: command not found", "stdout_lines": ["Creating cluster ... ID: f22481462a83889ba9082080a563e9f2", "\tAllowing file volumes on cluster.", "\tAllowing block volumes on cluster.", "\tCreating node s01ocs01.example.com ... ID: a00903ff26d020f48592b44f07283605", "\t\tAdding device /dev/sdb ... Unable to add device: sudo: pvcreate: command not found", "\tCreating node s01ocs02.example.com ... Unable to create node: sudo: gluster: command not found", "\tCreating node s01ocs03.example.com ... Unable to create node: sudo: gluster: command not found", "\tCreating node s01ocs04.example.com ... Unable to create node: sudo: gluster: command not found"]}

>> Version-Release:

OpenShift Container Platform 3.9.33
Advanced Installer (openshift-ansible) 3.9.z
OpenShift Container Storage 3.10.z

>> How reproducible:

Easy

>> Steps to Reproduce:

1. Using an OCP 3.9.z environment, customize the inventory with:

openshift_storage_glusterfs_namespace=openshift-storage-ext
openshift_storage_glusterfs_storageclass=true
openshift_storage_glusterfs_storageclass_default=false
openshift_storage_glusterfs_block_deploy=true
openshift_storage_glusterfs_block_host_vol_create=true
openshift_storage_glusterfs_block_host_vol_size=400
openshift_storage_glusterfs_block_storageclass=true
openshift_storage_glusterfs_block_storageclass_default=false
openshift_storage_glusterfs_is_native=false
openshift_storage_glusterfs_heketi_is_native=true
openshift_storage_glusterfs_heketi_executor=ssh
openshift_storage_glusterfs_heketi_ssh_port=22
openshift_storage_glusterfs_heketi_ssh_user=heketi
openshift_storage_glusterfs_heketi_ssh_sudo=true
openshift_storage_glusterfs_heketi_ssh_keyfile="/root/.ssh/heketi_rsa"
openshift_storage_glusterfs_image=registry.access.redhat.com/rhgs3/rhgs-server-rhel7:v3.10
openshift_storage_glusterfs_block_image=registry.access.redhat.com/rhgs3/rhgs-gluster-block-prov-rhel7:v3.10
openshift_storage_glusterfs_heketi_image=registry.access.redhat.com/rhgs3/rhgs-volmanager-rhel7:v3.10

2. Run the playbook openshift-glusterfs:

ansible-playbook -i ~/inventory /usr/share/ansible/openshift-ansible/playbooks/openshift-glusterfs/config.yml

>> Actual results:

TASK [openshift_storage_glusterfs : Load heketi topology] ********************************************************************************************
Friday 05 October 2018 16:56:04 -0300 (0:00:01.729) 0:04:58.627 ********
fatal: [s01ops04]: FAILED! => {"changed": true, "cmd": ["oc", "--config=/tmp/openshift-glusterfs-ansible-GgLL9D/admin.kubeconfig", "rsh", "--namespace=openshift-storage-ext", "deploy-heketi-storage-2-6vg72", "heketi-cli", "-s", "http://localhost:8080", "--user", "admin", "--secret", "SkACnWqb3pfWaQKV6WGsM/0bhfo9OTsg7PE0g4t4o1E=", "topology", "load", "--json=/tmp/openshift-glusterfs-ansible-GgLL9D/topology.json", "2>&1"], "delta": "0:00:05.826397", "end": "2018-10-05 16:56:11.009775", "failed_when_result": true, "rc": 0, "start": "2018-10-05 16:56:05.183378", "stderr": "", "stderr_lines": [], "stdout": "Creating cluster ... ID: f22481462a83889ba9082080a563e9f2\n\tAllowing file volumes on cluster.\n\tAllowing block volumes on cluster.\n\tCreating node s01ocs01.example.com ... ID: a00903ff26d020f48592b44f07283605\n\t\tAdding device /dev/sdb ... Unable to add device: sudo: pvcreate: command not found\n\tCreating node s01ocs02.example.com ... Unable to create node: sudo: gluster: command not found\n\tCreating node s01ocs03.example.com ... Unable to create node: sudo: gluster: command not found\n\tCreating node s01ocs04.example.com ... Unable to create node: sudo: gluster: command not found", "stdout_lines": ["Creating cluster ... ID: f22481462a83889ba9082080a563e9f2", "\tAllowing file volumes on cluster.", "\tAllowing block volumes on cluster.", "\tCreating node s01ocs01.example.com ... ID: a00903ff26d020f48592b44f07283605", "\t\tAdding device /dev/sdb ... Unable to add device: sudo: pvcreate: command not found", "\tCreating node s01ocs02.example.com ... Unable to create node: sudo: gluster: command not found", "\tCreating node s01ocs03.example.com ... Unable to create node: sudo: gluster: command not found", "\tCreating node s01ocs04.example.com ... Unable to create node: sudo: gluster: command not found"]}

PLAY RECAP *******************************************************************************************************************************************
localhost                  : ok=12   changed=0    unreachable=0    failed=0
s01ocs01.example.com       : ok=20   changed=0    unreachable=0    failed=0
s01ocs02.example.com       : ok=17   changed=0    unreachable=0    failed=0
s01ocs03.example.com       : ok=17   changed=0    unreachable=0    failed=0
s01ocs04.example.com       : ok=17   changed=0    unreachable=0    failed=0
s01ops04                   : ok=64   changed=12   unreachable=0    failed=1
s01ops05                   : ok=19   changed=1    unreachable=0    failed=0
s01ops06                   : ok=19   changed=1    unreachable=0    failed=0
s01ops11                   : ok=1    changed=0    unreachable=0    failed=0

>> Expected results:

OCS Independent Mode installed successfully.

>> Additional info:

We already verified that the remote user configured for Heketi has 'sudo' permissions and that the proper SSH keys are distributed. We could also run the commands manually without problems:

[root@s01ocs01 ~]# su heketi -
[heketi@s01ocs01 root]$ whoami
heketi
[heketi@s01ocs01 root]$ sudo pvcreate --metadatasize=128M --dataalignment=256K '/dev/sdb'
Physical volume "/dev/sdb" successfully created.

We did some extra tests and verified that the 'heketi' user can only find the binaries in its PATH in login shells, because the environment variable was being exported from /etc/profile. In non-interactive shells, such as the SSH sessions Heketi opens to run its commands, PATH was not set properly, and that was causing the problem. We fixed it by putting the export in .bashrc for the 'heketi' user.
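
For anyone hitting the same error, the check we did can be sketched roughly as below. This is only an illustration, not part of openshift-ansible or Heketi: the helper name missing_in_path is made up for this comment, and the directories in the example export line (/usr/sbin and /sbin) are an assumption about where pvcreate and gluster live on the storage nodes. The helper takes a PATH value and a list of commands and prints the ones that cannot be resolved with that PATH.

```shell
#!/bin/sh
# Hypothetical helper: print every command that cannot be resolved
# when PATH is set to the value given as the first argument.
# Usage: missing_in_path "<PATH value>" cmd1 [cmd2 ...]
missing_in_path() {
    search_path=$1
    shift
    for cmd in "$@"; do
        # Do the lookup in a subshell so the caller's PATH is not modified.
        if ! ( PATH=$search_path; command -v "$cmd" >/dev/null 2>&1 ); then
            echo "$cmd"
        fi
    done
}

# Example (run from the Ansible control host; not executed here).
# Capture the PATH that a non-interactive SSH command gets, which is
# what the ssh executor sees, and compare it with a login shell:
#
#   noninteractive=$(ssh -i /root/.ssh/heketi_rsa heketi@s01ocs01.example.com 'echo "$PATH"')
#   login=$(ssh -i /root/.ssh/heketi_rsa heketi@s01ocs01.example.com 'bash -lc "echo \$PATH"')
#   missing_in_path "$noninteractive" pvcreate gluster
#   missing_in_path "$login" pvcreate gluster
#
# Assumed form of the line added to ~heketi/.bashrc on each storage node:
#
#   export PATH="$PATH:/usr/sbin:/sbin"
```

If the first call prints pvcreate and gluster while the second prints nothing, the PATH is only being set for login shells, which matches what we saw.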