Description of problem:

Fresh installation on OCP 3.7.

1> The customer noted that the GlusterFS installation playbook failed at the "Wait for GlusterFS pods" task, which is defined in /usr/share/ansible/openshift-ansible/roles/openshift_storage_glusterfs/tasks/glusterfs_deploy.yml. After making the change below, the installation playbook ran successfully:

~~
# retries: "{{ (glusterfs_timeout | int / 10) | int }}"
retries: 150
~~

2> Even then, not all of the GlusterFS pods reached the "1/1 Running" state, so he restarted the docker and node services:

---
systemctl restart docker.service
systemctl restart atomic-openshift-node.service
---

After that, the pods came up in the correct state:

---
READY   STATUS
1/1     Running
---

Is restarting these services mandatory?

Actual results:
The playbook fails.

Expected results:
GlusterFS installation on OpenShift 3.7 should succeed.

Additional info:
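For context, the failing task is a standard Ansible polling loop: it checks pod readiness repeatedly, and the default retry count is derived from the role's glusterfs_timeout (one attempt roughly every 10 seconds). A minimal sketch of such a task follows; the oc selector, jsonpath expression, and variable defaults here are illustrative assumptions, not the exact contents of glusterfs_deploy.yml:

~~
# Illustrative pod-readiness wait; the label selector and namespace
# variable are assumptions, not the role's exact code.
- name: Wait for GlusterFS pods
  command: >
    oc get pods --namespace={{ glusterfs_namespace }}
    --selector=glusterfs=storage-pod
    --output=jsonpath={.items[*].status.containerStatuses[*].ready}
  register: glusterfs_pods
  until: glusterfs_pods.stdout != '' and 'false' not in glusterfs_pods.stdout
  delay: 10
  # Default: retries scale with the timeout, e.g. glusterfs_timeout=300
  # yields 30 retries (~5 minutes) at a 10-second delay. Hard-coding
  # retries: 150 stretches the wait to ~25 minutes.
  retries: "{{ (glusterfs_timeout | int / 10) | int }}"
~~

Note that raising the retry count only waits longer; it does not address whatever kept the pods from becoming Ready in the first place.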
No, this should not be required. I'm not sure I fully understand the problem, however: are you saying that the customer set the retries to 150 and that the playbooks then succeeded, but the pods were not actually running?

Did you wipe the failed installation before re-running the playbooks? The GlusterFS playbooks are not idempotent, and it is not supported to run them more than once without uninstalling/wiping the failed deployment first. If you did do a second run, I suspect the playbooks detected the extant pods and skipped over the task that initially gave you problems. I don't know how they would have completed without heketi, however. A wipe-and-retry sketch follows below.

Please inspect the logs found in /var/log/glusterfs/glusterd.log as well as the systemctl/journalctl logs for docker and atomic-openshift-node for additional information.

Updating the summary line to a more useful sentence.
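As a concrete starting point, a wipe-and-retry plus the requested log collection might look like the following. The openshift_storage_glusterfs_wipe variable comes from the openshift_storage_glusterfs role; the playbook path is an assumption that should be verified against the installed openshift-ansible version:

---
# Re-run the deployment, wiping the previous failed attempt first
# (verify the playbook path for your openshift-ansible version).
ansible-playbook -i <inventory> \
    /usr/share/ansible/openshift-ansible/playbooks/byo/openshift-glusterfs/config.yml \
    -e openshift_storage_glusterfs_wipe=True

# Logs to collect from the affected nodes:
less /var/log/glusterfs/glusterd.log
journalctl -u docker.service
journalctl -u atomic-openshift-node.service
---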