Description of problem:
health check fails when advanced installer is used to deploy OCP with the error shown in actual results output.

Version-Release number of the following components:
openshift-ansible-3.9.0-0.42.0.git.0.1a9a61b.el7.noarch
openshift-ansible-roles-3.9.0-0.42.0.git.0.1a9a61b.el7.noarch
openshift-ansible-playbooks-3.9.0-0.42.0.git.0.1a9a61b.el7.noarch
ansible-2.4.2.0-2.el7.noarch
openshift-ansible-docs-3.9.0-0.42.0.git.0.1a9a61b.el7.noarch

ansible --version
ansible 2.4.2.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, May 3 2017, 07:55:04) [GCC 4.8.5 20150623 (Red Hat 4.8.5-14)]

How reproducible:
Always

Steps to Reproduce:
1. Install OCP using advanced installer using the config file attached

Actual results:
2018-02-13 12:32:54,491 p=53996 u=root | INSTALLER STATUS ******************************************
2018-02-13 12:32:54,494 p=53996 u=root | Initialization : Complete (0:00:53)
2018-02-13 12:32:54,494 p=53996 u=root | Health Check : In Progress (0:04:18)
2018-02-13 12:32:54,494 p=53996 u=root | This phase can be restarted by running: playbooks/openshift-checks/pre-install.yml
2018-02-13 12:32:54,495 p=53996 u=root | Failure summary:

  1. Hosts:    10.70.46.188, 10.70.46.30, 10.70.46.83
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               'ascii' codec can't encode character u'\u2019' in position 86: ordinal not in range(128)
               Traceback (most recent call last):
                 File "/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/action_plugins/openshift_health_check.py", line 222, in run_check
                   result = check.run()
                 File "/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/openshift_checks/docker_image_availability.py", line 133, in run
                   unreachable=unreachable_msg if unreachable else "",
               UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 86: ordinal not in range(128)

The execution of "/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml" includes checks designed to fail early if the requirements of the playbook are not met. One or more of these checks failed. To disregard these results,explicitly disable checks by setting an Ansible variable:
   openshift_disable_check=docker_image_availability
Failing check names are shown in the failure details above. Some checks may be configurable by variables if your requirements are different from the defaults; consult check documentation.

Expected results:
No traceback should be seen and a meaningful error message should be displayed

Additional info:
Logs shall be attached shortly.
Created attachment 1395344 [details] ansible.log
From facts output:

"oreg_url": "'brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-${component}:${version}.0’"

The closing quote of oreg_url is a non-ASCII Unicode character (\u2019) rather than a plain ASCII single quote, and that character causes the failure.
Created https://github.com/openshift/openshift-ansible/pull/7138
Looks like someone copied and pasted the right single quotation mark '’' (U+2019), which does not terminate the opening ASCII single quote in their variable. This happens easily when copying from a word processor or an email instead of a plain-text file.
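A quick way to spot this kind of character before running the installer is to scan the inventory line for anything outside ASCII. The snippet below is a minimal Python 3 sketch, not part of openshift-ansible; the helper name find_non_ascii is hypothetical, and the sample line is the oreg_url value from this report.

```python
import unicodedata


def find_non_ascii(line):
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (pos, ch, unicodedata.name(ch, "UNKNOWN"))
        for pos, ch in enumerate(line)
        if ord(ch) > 127
    ]


# Inventory line as pasted in this report, with U+2019 on both ends of the value.
line = "oreg_url=\u2019registry.example.com:443/openshift3/ose-${component}:${version}\u2019"

for pos, ch, name in find_non_ascii(line):
    print("column {}: {!r} is {}".format(pos, ch, name))
```

A line quoted with plain ASCII apostrophes yields an empty list, so the same helper can be run over every line of a hosts file.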
Fix is available in openshift-ansible-3.9.0-0.47.0.git.0.f8847bb.el7
Checked with version openshift-ansible-3.9.0-0.48.0.git.0.2fb33db.el7, in which the code has been merged. The traceback still appears:

# cat hosts
...
oreg_url=’registry.example.com:443/openshift3/ose-${component}:${version}’
...

# ansible-playbook -i hosts -vv /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
...
  1. Hosts:    qe-weshi-bug-master-etcd-1.0223-0l0.example.com, qe-weshi-bug-node-registry-router-1.0223-0l0.example.com
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               'ascii' codec can't encode character u'\u2019' in position 0: ordinal not in range(128)
               Traceback (most recent call last):
                 File "/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/action_plugins/openshift_health_check.py", line 222, in run_check
                   result = check.run()
                 File "/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/openshift_checks/docker_image_availability.py", line 120, in run
                   unreachable_msg = "Failed connecting to: {}\n".format(", ".join(unreachable))
               UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 0: ordinal not in range(128)
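The "position 0" in this traceback matches the registry name starting with U+2019. On Python 2, formatting a unicode value into a byte string implicitly encodes it as ASCII; the Python 3 sketch below reproduces only that encoding step explicitly. It is an illustration under that assumption, not the code from either pull request, and first_unencodable is a hypothetical helper.

```python
def first_unencodable(text, codec="ascii"):
    """Return the index of the first character that codec cannot encode, or None."""
    try:
        text.encode(codec)
    except UnicodeEncodeError as err:
        return err.start
    return None


# Registry name as it appears in the inventory above, with the leading U+2019.
registry = "\u2019registry.example.com:443"

print(first_unencodable(registry))           # index of the offending character
print(first_unencodable(registry, "utf-8"))  # UTF-8 can represent it
```

Encoding with UTF-8 succeeds, which is why keeping the message as text (or encoding it as UTF-8) avoids the exception even though the registry name itself is still wrong.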
Created https://github.com/openshift/openshift-ansible/pull/7287 with a better fix
Fix is available in openshift-ansible-3.9.2-1
Verified with version openshift-ansible-3.9.2-1.git.0.1a855b3.el7, looks better now.

# cat hosts
...
oreg_url=’registry.example.com:443/openshift3/ose-${component}:${version}’
...

# ansible-playbook -i hosts -vv /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
...
 [WARNING]: Failure using method (v2_runner_on_failed) in callback plugin (<ansible.plugins.callback.default.CallbackModule object at 0x4452790>): 'ascii' codec can't decode byte 0xe2 in position 65: ordinal not in range(128)

NO MORE HOSTS LEFT ******************************************

PLAY RECAP ******************************************
localhost : ok=11 changed=0 unreachable=0 failed=0
qe-weshi-bug-master-etcd-1.0307-hnf.qe.rhcloud.com : ok=33 changed=0 unreachable=0 failed=1
qe-weshi-bug-node-registry-router-1.0307-hnf.qe.rhcloud.com : ok=22 changed=0 unreachable=0 failed=1

INSTALLER STATUS ******************************************
Initialization : Complete (0:00:19)
Health Check : In Progress (0:00:55)
...

Failure summary:

  1. Hosts:    qe-weshi-bug-master-etcd-1.0307-hnf.qe.rhcloud.com, qe-weshi-bug-node-registry-router-1.0307-hnf.qe.rhcloud.com
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               One or more required container images are not available:
                   ’registry.example.com:443/openshift3/ose-deployer:v3.9.3',
                   ’registry.example.com:443/openshift3/ose-docker-registry:v3.9.3',
                   ’registry.example.com:443/openshift3/ose-haproxy-router:v3.9.3',
                   ’registry.example.com:443/openshift3/ose-pod:v3.9.3'
               Checked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>
               Default registries searched: registry.example.com:443, registry.access.redhat.com
               Blocked registries: registry.hacker.com
               Failed connecting to: ’registry.example.com:443

The execution of "/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml" includes checks designed to fail early if the requirements of the playbook are not met. One or more of these checks failed. To disregard these results,explicitly disable checks by setting an Ansible variable:
   openshift_disable_check=docker_image_availability
Failing check names are shown in the failure details above. Some checks may be configurable by variables if your requirements are different from the defaults; consult check documentation.

Variables can be set in the inventory or passed on the command line using the -e flag to ansible-playbook.
...
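The remaining [WARNING] mentions byte 0xe2, which is the first byte of the UTF-8 encoding of U+2019, so it plausibly comes from the same character in oreg_url. The Python 3 sketch below only illustrates that relationship; it makes no claim about where in the Ansible callback plugin the decode happens, and first_undecodable_byte is a hypothetical helper.

```python
def first_undecodable_byte(data, codec="ascii"):
    """Return the first byte value that codec cannot decode, or None."""
    try:
        data.decode(codec)
    except UnicodeDecodeError as err:
        return data[err.start]
    return None


# U+2019 is three bytes in UTF-8; the first of them is 0xe2.
encoded = "\u2019registry.example.com:443".encode("utf-8")

print(hex(first_undecodable_byte(encoded)))
print(encoded[:3])
```

Decoding the same bytes as UTF-8 returns the original text, so the warning points at an ASCII decode somewhere in the output path rather than at corrupted data.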
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2018:3748