1544648 – ocp deployment using advanced installation fails during health check

Summary: ocp deployment using advanced installation fails during health check

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	OpenShift Container Platform
Classification:	Red Hat
Component:	Installer
Sub Component:
Version:	3.9.0
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	3.9.0
Assignee:	Vadim Rutkovsky
QA Contact:	Johnny Liu
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2018-02-13 07:09 UTC by krishnaram Karthick
Modified:	2018-12-13 19:26 UTC
CC List:	7 users
Fixed In Version:
Doc Type:	Bug Fix
Doc Text:	Cause: Unicode characters in pullspecs for images were handled incorrectly. Consequence: Installation stopped with a cryptic message. Fix: Unicode handling in image paths was fixed. Result: A correct error message is displayed.
Clone Of:
Environment:
Last Closed:	2018-12-13 19:26:51 UTC
Target Upstream Version:
Embargoed:

Attachments
ansible.log (820.94 KB, text/plain) 2018-02-13 13:19 UTC, krishnaram Karthick	no flags

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2018:3748	0	None	None	None	2018-12-13 19:26:58 UTC

Description krishnaram Karthick 2018-02-13 07:09:49 UTC

Description of problem:
The health check fails when the advanced installer is used to deploy OCP. The error is shown under "Actual results" below.

Version-Release number of the following components:
openshift-ansible-3.9.0-0.42.0.git.0.1a9a61b.el7.noarch
openshift-ansible-roles-3.9.0-0.42.0.git.0.1a9a61b.el7.noarch
openshift-ansible-playbooks-3.9.0-0.42.0.git.0.1a9a61b.el7.noarch
ansible-2.4.2.0-2.el7.noarch
openshift-ansible-docs-3.9.0-0.42.0.git.0.1a9a61b.el7.noarch
ansible --version
ansible 2.4.2.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, May  3 2017, 07:55:04) [GCC 4.8.5 20150623 (Red Hat 4.8.5-14)]

How reproducible:
Always

Steps to Reproduce:
1. Install OCP with the advanced installer, using the attached config file

Actual results:
2018-02-13 12:32:54,491 p=53996 u=root |  INSTALLER STATUS ************************************************************************************************************************************************************
2018-02-13 12:32:54,494 p=53996 u=root |  Initialization             : Complete (0:00:53)
2018-02-13 12:32:54,494 p=53996 u=root |  Health Check               : In Progress (0:04:18)
2018-02-13 12:32:54,494 p=53996 u=root |        This phase can be restarted by running: playbooks/openshift-checks/pre-install.yml
2018-02-13 12:32:54,495 p=53996 u=root |  Failure summary:


  1. Hosts:    10.70.46.188, 10.70.46.30, 10.70.46.83
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               'ascii' codec can't encode character u'\u2019' in position 86: ordinal not in range(128)
               Traceback (most recent call last):
                 File "/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/action_plugins/openshift_health_check.py", line 222, in run_check
                   result = check.run()
                 File "/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/openshift_checks/docker_image_availability.py", line 133, in run
                   unreachable=unreachable_msg if unreachable else "",
               UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 86: ordinal not in range(128)

The execution of "/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml" includes checks designed to fail early if the requirements of the playbook are not met. One or more of these checks failed. To disregard these results,explicitly disable checks by setting an Ansible variable:
   openshift_disable_check=docker_image_availability
Failing check names are shown in the failure details above. Some checks may be configurable by variables if your requirements are different from the defaults; consult check documentation.


Expected results:
No traceback should be seen and a meaningful error message should be displayed

Additional info:
Logs shall be attached shortly.

Comment 1 krishnaram Karthick 2018-02-13 13:19:02 UTC

Created attachment 1395344
ansible.log

Comment 2 Vadim Rutkovsky 2018-02-13 14:36:11 UTC

From facts output:

"oreg_url": "'brew-pulp-docker01.web.prod.ext.phx2.redhat.com:8888/openshift3/ose-${component}:${version}.0’"

There's a Unicode char (\u2019) in oreg_url, which causes the failure
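
A character like this can be spotted before the playbook is run. The following is a minimal sketch, assuming the inventory value is available as a Python string; `find_non_ascii` is a hypothetical helper and is not part of openshift-ansible, and the sample value is illustrative.

```python
import unicodedata


def find_non_ascii(value):
    """Return (position, character, Unicode name) for each non-ASCII character."""
    return [
        (pos, ch, unicodedata.name(ch, "UNKNOWN"))
        for pos, ch in enumerate(value)
        if ord(ch) > 127
    ]


# Illustrative inventory value that ends in a typographic quote (U+2019)
# instead of a plain ASCII single quote.
oreg_url = "'registry.example.com:443/openshift3/ose-${component}:${version}\u2019"

for pos, ch, name in find_non_ascii(oreg_url):
    print("position %d: U+%04X %s" % (pos, ord(ch), name))
```

Reporting the position and the Unicode name makes it easier to find the offending character in the inventory file, since it is visually close to an ASCII quote.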

Comment 3 Vadim Rutkovsky 2018-02-13 21:28:45 UTC

Created https://github.com/openshift/openshift-ansible/pull/7138

Comment 4 Michael Gugino 2018-02-19 15:50:15 UTC

Looks like someone copied and pasted the right single quotation mark '’' (U+2019), which didn't properly terminate the first single quote in their variable.

This can happen easily when copying from word-processing software or email instead of plain-text files.
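
One way to guard against pasted typographic quotes is to normalize them before the value is used. This is only a sketch under that assumption: `SMART_QUOTES` and `normalize_quotes` are hypothetical names, and nothing here reflects how the installer itself handles the value.

```python
# Typographic quotes commonly introduced by word processors and mail clients,
# mapped to their plain ASCII equivalents.
SMART_QUOTES = {
    "\u2018": "'",  # LEFT SINGLE QUOTATION MARK
    "\u2019": "'",  # RIGHT SINGLE QUOTATION MARK
    "\u201c": '"',  # LEFT DOUBLE QUOTATION MARK
    "\u201d": '"',  # RIGHT DOUBLE QUOTATION MARK
}


def normalize_quotes(text):
    """Replace typographic quotes in text with ASCII quotes."""
    return text.translate(str.maketrans(SMART_QUOTES))


line = "oreg_url=\u2019registry.example.com:443/openshift3/ose-${component}:${version}\u2019"
print(normalize_quotes(line))
```

Silently rewriting user input can hide mistakes, so in practice it may be preferable to report the character and let the user correct the inventory.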

Comment 5 Vadim Rutkovsky 2018-02-22 09:45:19 UTC

Fix is available in openshift-ansible-3.9.0-0.47.0.git.0.f8847bb.el7

Comment 6 Wenkai Shi 2018-02-23 07:38:10 UTC

Checked with version openshift-ansible-3.9.0-0.48.0.git.0.2fb33db.el7; the code has been merged, but the traceback still appears:

# cat hosts
...
oreg_url=’registry.example.com:443/openshift3/ose-${component}:${version}’
...

# ansible-playbook -i hosts -vv /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
...
  1. Hosts:    qe-weshi-bug-master-etcd-1.0223-0l0.example.com, qe-weshi-bug-node-registry-router-1.0223-0l0.example.com
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               'ascii' codec can't encode character u'\u2019' in position 0: ordinal not in range(128)
               Traceback (most recent call last):
                 File "/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/action_plugins/openshift_health_check.py", line 222, in run_check
                   result = check.run()
                 File "/usr/share/ansible/openshift-ansible/roles/openshift_health_checker/openshift_checks/docker_image_availability.py", line 120, in run
                   unreachable_msg = "Failed connecting to: {}\n".format(", ".join(unreachable))
               UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 0: ordinal not in range(128)
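
For context: in Python 2, calling `.format()` on a byte-string template with a `unicode` argument implicitly encodes that argument as ASCII, which appears to be what the `format` call in the traceback runs into. The sketch below uses Python 3, where the same class of error can only be reproduced with an explicit encode; `try_ascii` is a hypothetical helper used for illustration.

```python
def try_ascii(text):
    """Return None if text is pure ASCII, else the UnicodeEncodeError raised."""
    try:
        text.encode("ascii")
    except UnicodeEncodeError as exc:
        return exc
    return None


err = try_ascii("\u2019registry.example.com:443")
if err is not None:
    # err.start is the index of the first character that could not be encoded.
    print("cannot encode %r at position %d" % (err.object[err.start], err.start))
```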

Comment 7 Vadim Rutkovsky 2018-02-26 13:39:22 UTC

Created https://github.com/openshift/openshift-ansible/pull/7287 with a better fix

Comment 8 Vadim Rutkovsky 2018-03-05 11:21:34 UTC

Fix is available in openshift-ansible-3.9.2-1

Comment 9 Wenkai Shi 2018-03-07 09:21:30 UTC

Verified with version openshift-ansible-3.9.2-1.git.0.1a855b3.el7, looks better now.

# cat hosts
...
oreg_url=’registry.example.com:443/openshift3/ose-${component}:${version}’
...

# ansible-playbook -i hosts -vv /usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml
...
 [WARNING]: Failure using method (v2_runner_on_failed) in callback plugin (<ansible.plugins.callback.default.CallbackModule object at 0x4452790>): 'ascii' codec can't decode byte 0xe2 in position 65: ordinal not in range(128)


NO MORE HOSTS LEFT ***************************************************************************************************************************************************************************

PLAY RECAP ***********************************************************************************************************************************************************************************
localhost                  : ok=11   changed=0    unreachable=0    failed=0   
qe-weshi-bug-master-etcd-1.0307-hnf.qe.rhcloud.com : ok=33   changed=0    unreachable=0    failed=1   
qe-weshi-bug-node-registry-router-1.0307-hnf.qe.rhcloud.com : ok=22   changed=0    unreachable=0    failed=1   


INSTALLER STATUS *****************************************************************************************************************************************************************************
Initialization             : Complete (0:00:19)
Health Check               : In Progress (0:00:55)
...
Failure summary:


  1. Hosts:    qe-weshi-bug-master-etcd-1.0307-hnf.qe.rhcloud.com, qe-weshi-bug-node-registry-router-1.0307-hnf.qe.rhcloud.com
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               One or more required container images are not available:
                   ’registry.example.com:443/openshift3/ose-deployer:v3.9.3',
                   ’registry.example.com:443/openshift3/ose-docker-registry:v3.9.3',
                   ’registry.example.com:443/openshift3/ose-haproxy-router:v3.9.3',
                   ’registry.example.com:443/openshift3/ose-pod:v3.9.3'
               Checked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>
               Default registries searched: registry.example.com:443, registry.access.redhat.com
               Blocked registries: registry.hacker.com
               Failed connecting to: ’registry.example.com:443
               

The execution of "/usr/share/ansible/openshift-ansible/playbooks/deploy_cluster.yml" includes checks designed to fail early if the requirements of the playbook are not met. One or more of these checks failed. To disregard these results,explicitly disable checks by setting an Ansible variable:
   openshift_disable_check=docker_image_availability
Failing check names are shown in the failure details above. Some checks may be configurable by variables if your requirements are different from the defaults; consult check documentation.
Variables can be set in the inventory or passed on the command line using the -e flag to ansible-playbook.

...

Comment 13 errata-xmlrpc 2018-12-13 19:26:51 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:3748
