Description of problem:
Installing OCP with native glusterfs fails when "openshift_storage_glusterfs_namespace=<other namespace>" is set.

Version-Release number of selected component (if applicable):
openshift-ansible-3.6.51-1.git.0.18eb563.el7

How reproducible:
100%

Steps to Reproduce:
1. Install OCP with native glusterfs and set "openshift_storage_glusterfs_namespace=glusterfs".

Actual results:
# ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
...
TASK [openshift_storage_glusterfs : Load heketi topology] **********************
Wednesday 03 May 2017 08:13:12 +0000 (0:00:03.145) 0:25:11.776 *********
fatal: [master.example.com]: FAILED! => {
    "changed": true,
    "cmd": [
        "heketi-cli",
        "-s",
        "http://10.128.2.2:8080",
        "--user",
        "admin",
        "--secret",
        "",
        "topology",
        "load",
        "--json=/tmp/openshift-glusterfs-ansible-rRMhiq/topology.json",
        "2>&1"
    ],
    "delta": "0:00:01.834207",
    "end": "2017-05-03 04:13:13.879441",
    "failed": true,
    "failed_when_result": true,
    "rc": 0,
    "start": "2017-05-03 04:13:12.045234",
    "warnings": []
}

STDOUT:

Creating cluster ... ID: cfc044eddb224dd217889fc92efca600
    Creating node glusterfsnode1.example.com ... ID: fb3f241ef776df50f4267820ca3831b0
        Adding device /dev/vsda ... Unable to add device: Failed to get list of pods
    Creating node glusterfsnode2.example.com ... Unable to create node: Failed to get list of pods
    Creating node glusterfsnode3.example.com ... Unable to create node: Failed to get list of pods

Expected results:
Installation succeeds.

Additional info:
# oc get po -n glusterfs
NAME                    READY     STATUS    RESTARTS   AGE
deploy-heketi-1-w0j51   1/1       Running   0          47m
glusterfs-5g22n         1/1       Running   0          50m
glusterfs-g4nxr         1/1       Running   0          50m
glusterfs-k3d54         1/1       Running   0          50m

# heketi-cli -s http://10.128.2.2:8080 --user admin topology load --json=/tmp/openshift-glusterfs-ansible-rRMhiq/topology.json
    Found node glusterfsnode1.example.com on cluster cfc044eddb224dd217889fc92efca600
        Adding device /dev/vsda ... Unable to add device: Failed to get list of pods
    Creating node glusterfsnode2.example.com ... Unable to create node: Failed to get list of pods
    Creating node glusterfsnode3.example.com ... Unable to create node: Failed to get list of pods
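For anyone trying to reproduce this, the step above corresponds roughly to an inventory like the sketch below. Only the namespace setting, the three node names, and the /dev/vsda device on glusterfsnode1 come from this report. The group layout follows the usual openshift-ansible convention of a [glusterfs] host group with a glusterfs_devices host variable, and the devices on the other two nodes are assumed, so treat it as an illustration and not the reporter's actual hosts file.

```ini
# Hypothetical inventory fragment. The [masters] and [nodes] sections and
# all other OSEv3 variables are omitted here.
[OSEv3:children]
masters
nodes
glusterfs

[OSEv3:vars]
# Deploy the GlusterFS and heketi pods into a non-default namespace
# (the setting that triggers this bug).
openshift_storage_glusterfs_namespace=glusterfs

[glusterfs]
# Device path for node1 is from the report; node2 and node3 are assumed.
glusterfsnode1.example.com glusterfs_devices='[ "/dev/vsda" ]'
glusterfsnode2.example.com glusterfs_devices='[ "/dev/vsda" ]'
glusterfsnode3.example.com glusterfs_devices='[ "/dev/vsda" ]'
```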
This should be fixed by the following PR: https://github.com/openshift/openshift-ansible/pull/4245
Checked with version openshift-ansible-3.6.121-1.git.0.ed0b72c.el7, installation still fails:

# ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
...
TASK [openshift_storage_glusterfs : Verify heketi service] *********************
Wednesday 21 June 2017 06:07:51 +0000 (0:00:00.043) 0:14:28.856 ********
fatal: [master.example.com]: FAILED! => {
    "changed": false,
    "cmd": [
        "oc",
        "rsh",
        "deploy-heketi-storage-1-zq9m8",
        "heketi-cli",
        "-s",
        "http://localhost:8080",
        "--user",
        "admin",
        "--secret",
        "r8SizUA1YQJs0lyWRplZEXl1eNy8lLnP4a67Kqq/OuA=",
        "cluster",
        "list"
    ],
    "delta": "0:00:00.196119",
    "end": "2017-06-21 02:07:51.045884",
    "failed": true,
    "rc": 1,
    "start": "2017-06-21 02:07:50.849765",
    "warnings": []
}

STDERR:

Error from server (NotFound): pods "deploy-heketi-storage-1-zq9m8" not found
...
Even newer PR that should hopefully actually fix this BZ: https://github.com/openshift/openshift-ansible/pull/4534
Checked with version openshift-ansible-3.6.122-1.git.0.62fcd88.el7, installation still fails:

# ansible-playbook -i hosts -v /usr/share/ansible/openshift-ansible/playbooks/byo/config.yml
...
TASK [openshift_storage_glusterfs : Verify heketi service] *********************
Friday 23 June 2017 05:07:35 +0000 (0:00:00.076) 0:14:48.712 ***********
fatal: [host-8-175-81.host.centralci.eng.rdu2.redhat.com]: FAILED! => {
    "changed": false,
    "cmd": [
        "oc",
        "rsh",
        "deploy-heketi-storage-1-25jv8",
        "heketi-cli",
        "-s",
        "http://localhost:8080",
        "--user",
        "admin",
        "--secret",
        "h7i7qOayOEI1IXflM0sx4UeccTiYx0HMTiJhP4o0Tyc=",
        "cluster",
        "list"
    ],
    "delta": "0:00:00.201332",
    "end": "2017-06-23 01:07:34.857759",
    "failed": true,
    "rc": 1,
    "start": "2017-06-23 01:07:34.656427",
    "warnings": []
}

STDERR:

Error from server (NotFound): pods "deploy-heketi-storage-1-25jv8" not found
...
I don't know why this was moved to ON_QA; the PR hasn't merged yet.
PR merged.
Verified with version openshift-ansible-3.6.126.0-1.git.0.f9c47bf.el7, installation succeeds. Pretty cool job!

# oc get po
NAME                       READY     STATUS    RESTARTS   AGE
docker-registry-1-12z27    1/1       Running   0          5m
registry-console-1-rdvsq   1/1       Running   0          4m
router-1-q3pwh             1/1       Running   0          6m

# oc get po -n glusterfs
NAME                      READY     STATUS    RESTARTS   AGE
glusterfs-storage-5hkh1   1/1       Running   0          10m
glusterfs-storage-xccjh   1/1       Running   0          10m
glusterfs-storage-xfvz4   1/1       Running   0          10m
heketi-storage-1-q9zr7    1/1       Running   0          7m
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHEA-2017:1716