Bug 1608476 - OpenShift-Ansible ignores oreg_url for registry-console Docker image
Summary: OpenShift-Ansible ignores oreg_url for registry-console Docker image
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.10.z
Assignee: Michael Gugino
QA Contact: Johnny Liu
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-07-25 15:25 UTC by Michał Dulko
Modified: 2018-11-11 16:40 UTC (History)

Fixed In Version:
Doc Type: Bug Fix
Doc Text:
Cause: Docker image availability checks did not utilize oreg_url properly. Consequence: The registry-console image did not honor oreg_url; in some environments this caused the image check to fail. Fix: Update the docker image checks to properly utilize oreg_url. Result: Image checks now succeed when using oreg_url.
Clone Of:
: 1613100
Environment:
Last Closed: 2018-11-11 16:39:10 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHSA-2018:2709 0 None None None 2018-11-11 16:40:08 UTC

Description Michał Dulko 2018-07-25 15:25:46 UTC
Description of problem:

Version-Release number of the following components:
rpm -q openshift-ansible
openshift-ansible-3.10.23-1.git.0.a9c7e7d.el7.noarch

rpm -q ansible
ansible-2.4.6.0-1.el7ae.noarch

ansible --version
ansible 2.4.6.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/cloud-user/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Feb 20 2018, 09:19:12) [GCC 4.8.5 20150623 (Red Hat 4.8.5-28)]

How reproducible:
Always

Steps to Reproduce:
1. Set oreg_url: 'registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}' and openshift_release to a tag that's not available in https://access.redhat.com/containers
2. Deploy OpenShift-Ansible with registry-console
3. Notice the error

Actual results:
Please include the entire output from the last TASK line through the end of output if an error is generated

fatal: [infra-node-0.openshift.example.com]: FAILED! => {"changed": true, "checks": {"disk_availability": {"skipped": true, "skipped_reason": "Disabled by user request"}, "docker_image_availability": {"changed": true, "failed": true, "failures": [["OpenShiftCheckException", "One or more required container images are not available:\n    registry.access.redhat.com/openshift3/registry-console:v3.10\nChecked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>\nDefault registries searched: registry.reg-aws.openshift.com:443, registry.access.redhat.com\n"]], "msg": "One or more required container images are not available:\n    registry.access.redhat.com/openshift3/registry-console:v3.10\nChecked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>\nDefault registries searched: registry.reg-aws.openshift.com:443, registry.access.redhat.com\n"}, "docker_storage": {"skipped": true, "skipped_reason": "Disabled by user request"}, "memory_availability": {"skipped": true, "skipped_reason": "Disabled by user request"}, "package_availability": {"skipped": true, "skipped_reason": "Disabled by user request"}, "package_version": {"skipped": true, "skipped_reason": "Disabled by user request"}}, "failed": true, "msg": "One or more checks failed", "playbook_context": "install"}
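When a result like the one above is long, it can help to pull out only the checks that actually failed. The following is a minimal sketch, assuming the result has already been parsed into a dict with the same "checks" layout as the output above. The function name failed_checks is hypothetical and the sample data is abridged from this report.

```python
def failed_checks(result: dict) -> dict:
    """Map each failed check name to its message (hypothetical helper)."""
    return {
        name: check.get("msg", "")
        for name, check in result.get("checks", {}).items()
        if check.get("failed")
    }


# Abridged copy of the result reported above; skipped checks carry no "failed" key.
sample = {
    "checks": {
        "disk_availability": {"skipped": True, "skipped_reason": "Disabled by user request"},
        "docker_image_availability": {
            "changed": True,
            "failed": True,
            "msg": "One or more required container images are not available:\n"
                   "    registry.access.redhat.com/openshift3/registry-console:v3.10\n",
        },
    },
    "failed": True,
    "msg": "One or more checks failed",
}

for name, msg in failed_checks(sample).items():
    print(name)
    print(msg)
```

Checks that were skipped or that passed do not appear in the returned dict.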

Expected results:
registry-console is fetched from registry.reg-aws.openshift.com:443.
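To make the expectation concrete, here is a minimal sketch of how the registry-console image name could be derived from the oreg_url template. It assumes, based only on the image names seen in this report, that the "ose-${component}" part of the template is replaced by "registry-console". The helper registry_console_image is hypothetical and is not the installer's actual code.

```python
from string import Template


def registry_console_image(oreg_url: str, version: str) -> str:
    """Hypothetical helper: derive the registry-console image from oreg_url."""
    # Assumption: "ose-${component}" is swapped for the literal image name,
    # so the "ose-" prefix is dropped for registry-console.
    templated = oreg_url.replace("ose-${component}", "registry-console")
    # Fill in ${version}; ${component} is also supplied in case the template
    # did not use the "ose-" prefix.
    return Template(templated).substitute(component="registry-console", version=version)


oreg_url = "registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}"
print(registry_console_image(oreg_url, "v3.10"))
# prints registry.reg-aws.openshift.com:443/openshift3/registry-console:v3.10
```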

Additional info:
Please attach logs from ansible-playbook with the -vvv flag

Comment 1 Michael Gugino 2018-08-06 21:58:40 UTC
PR Created: https://github.com/openshift/openshift-ansible/pull/9447

Comment 2 Michael Gugino 2018-08-17 19:03:34 UTC
PR Created in 3.10: https://github.com/openshift/openshift-ansible/pull/9659

Comment 4 Johnny Liu 2018-08-30 10:20:34 UTC
Retested this bug with openshift-ansible-3.10.41-1.git.0.fd15dd7.el7.noarch; it is only partially fixed.

The registry-console component now respects the oreg_url setting.
1. Create a private mirror registry, and point oreg_url to the registry.
openshift_deployment_type=openshift-enterprise
oreg_url=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/ose-${component}:${version}
openshift_examples_modify_imagestreams=false
openshift_docker_insecure_registries=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000

2. Sync all necessary images to the private registry, except the registry-console image.

3. The installation completes.

[root@qe-jialiu310z2-mrre-1 ~]# oc get po
NAME                        READY     STATUS             RESTARTS   AGE
docker-registry-1-7njps     1/1       Running            0          3m
registry-console-1-8vvn5    0/1       ImagePullBackOff   0          3m
registry-console-1-deploy   1/1       Running            0          3m
router-1-2s8h2              1/1       Running            0          3m


[root@qe-jialiu310z2-mrre-1 ~]# oc describe po registry-console-1-8vvn5
Name:           registry-console-1-8vvn5
<--snip-->
Events:
  Type     Reason          Age                From                            Message
  ----     ------          ----               ----                            -------
  Normal   Scheduled       5m                 default-scheduler               Successfully assigned registry-console-1-8vvn5 to qe-jialiu310z2-mrre-1
  Normal   Pulling         5m (x2 over 5m)    kubelet, qe-jialiu310z2-mrre-1  pulling image "host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/registry-console:v3.10"
  Warning  Failed          5m (x2 over 5m)    kubelet, qe-jialiu310z2-mrre-1  Failed to pull image "host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/registry-console:v3.10": rpc error: code = Unknown desc = Error: image testing/ocp3/registry-console:v3.10 not found
  Warning  Failed          5m (x2 over 5m)    kubelet, qe-jialiu310z2-mrre-1  Error: ErrImagePull
  Normal   SandboxChanged  5m (x7 over 5m)    kubelet, qe-jialiu310z2-mrre-1  Pod sandbox changed, it will be killed and re-created.
  Normal   BackOff         5m (x6 over 5m)    kubelet, qe-jialiu310z2-mrre-1  Back-off pulling image "host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/registry-console:v3.10"
  Warning  Failed          42s (x35 over 5m)  kubelet, qe-jialiu310z2-mrre-1  Error: ImagePullBackOff

The ImagePullBackOff error is expected, and proves that oreg_url is now respected.

The remaining issue is that the "Health Checks" at the beginning of the installation pass. They should fail, because no registry-console image exists on the private mirror registry.

PLAY [OpenShift Health Checks] *************************************************

TASK [Gathering Facts] *********************************************************
Thursday 30 August 2018  17:53:28 +0800 (0:00:00.143)       0:00:13.313 ******* 
ok: [host-8-250-35.host.centralci.eng.rdu2.redhat.com]

TASK [Run health checks (install) - EL] ****************************************
Thursday 30 August 2018  17:53:29 +0800 (0:00:00.415)       0:00:13.729 ******* 

CHECK [docker_storage : host-8-250-35.host.centralci.eng.rdu2.redhat.com] ******

CHECK [disk_availability : host-8-250-35.host.centralci.eng.rdu2.redhat.com] ***

CHECK [package_availability : host-8-250-35.host.centralci.eng.rdu2.redhat.com] ***

CHECK [package_version : host-8-250-35.host.centralci.eng.rdu2.redhat.com] *****

CHECK [docker_image_availability : host-8-250-35.host.centralci.eng.rdu2.redhat.com] ***

CHECK [memory_availability : host-8-250-35.host.centralci.eng.rdu2.redhat.com] ***
changed: [host-8-250-35.host.centralci.eng.rdu2.redhat.com] => {"changed": true, "checks": {"disk_availability": {}, "docker_image_availability": {"changed": true}, "docker_storage": {}, "memory_availability": {}, "package_availability": {"changed": false, "invocation": {"module_args": {"packages": ["PyYAML", "atomic-openshift", "atomic-openshift-clients", "atomic-openshift-hyperkube", "atomic-openshift-node", "bash-completion", "bind", "ceph-common", "cockpit-bridge", "cockpit-docker", "cockpit-system", "cockpit-ws", "dnsmasq", "docker", "firewalld", "flannel", "glusterfs-fuse", "httpd-tools", "iptables", "iptables-services", "iscsi-initiator-utils", "libselinux-python", "nfs-utils", "ntp", "openssl", "pyparted", "python-httplib2", "yum-utils"]}}}, "package_version": {"changed": false, "invocation": {"module_args": {"package_list": [{"check_multi": true, "name": "atomic-openshift", "version": ""}, {"check_multi": true, "name": "atomic-openshift-master", "version": ""}, {"check_multi": true, "name": "atomic-openshift-node", "version": ""}], "package_mgr": "yum"}}}}, "failed": false, "playbook_context": "install"}

TASK [Run health checks (install) - Fedora] ************************************
Thursday 30 August 2018  17:54:49 +0800 (0:01:19.971)       0:01:33.700 ******* 
skipping: [host-8-250-35.host.centralci.eng.rdu2.redhat.com] => {"changed": false, "skip_reason": "Conditional result was False", "skipped": true}

PLAY [Health Check Checkpoint End] *********************************************

Comment 5 Michael Gugino 2018-08-30 14:46:56 UTC
Health check is currently broken.  Refactor WIP for master here: https://github.com/openshift/openshift-ansible/pull/9827

This will be backported to 3.10 after merging.

Comment 9 Johnny Liu 2018-09-26 08:35:21 UTC
Verified this bug with openshift-ansible-3.10.50-1.git.0.96a93c5.el7.noarch; it passes.

1. Create a private mirror registry, and point oreg_url to the registry.
openshift_deployment_type=openshift-enterprise
oreg_url=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/ose-${component}:${version}
openshift_examples_modify_imagestreams=false
openshift_docker_insecure_registries=host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000

2. Sync all necessary images to the private registry, except the registry-console image.

The health check failed as expected.

  1. Hosts:    host-8-244-150.host.centralci.eng.rdu2.redhat.com, host-8-244-42.host.centralci.eng.rdu2.redhat.com, host-8-251-221.host.centralci.eng.rdu2.redhat.com
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               One or more required container images are not available:
                   host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/registry-console:v3.10
               Checked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>
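The "Checked with" line above can be reproduced by hand to confirm whether an image is present in the mirror. The following is a rough sketch that builds the same skopeo command line from Python. The helper names skopeo_inspect_cmd and image_available are hypothetical, and image_available assumes skopeo is installed on the host where it runs.

```python
import subprocess
from typing import List, Optional


def skopeo_inspect_cmd(image: str, tls_verify: bool = True,
                       creds: Optional[str] = None) -> List[str]:
    """Build the argv for: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>"""
    cmd = ["skopeo", "inspect"]
    if not tls_verify:
        cmd.append("--tls-verify=false")  # needed for an insecure registry
    if creds:
        cmd.append(f"--creds={creds}")    # "<user>:<pass>"
    cmd.append(f"docker://{image}")
    return cmd


def image_available(image: str, **kwargs) -> bool:
    """Return True if skopeo exits with status 0 (requires skopeo on PATH)."""
    proc = subprocess.run(skopeo_inspect_cmd(image, **kwargs),
                          stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return proc.returncode == 0


image = "host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/registry-console:v3.10"
print(" ".join(skopeo_inspect_cmd(image, tls_verify=False)))
```

Only the command line is printed here; calling image_available(image, tls_verify=False) would actually run skopeo against the registry.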

After syncing registry-console to the private mirror registry and running the installation again, it completed successfully. The registry-console component now respects the oreg_url setting.

# oc describe po registry-console-1-n8cl2|grep Image:
    Image:          host-8-241-45.host.centralci.eng.rdu2.redhat.com:5000/testing/ocp3/registry-console:v3.10

Comment 11 errata-xmlrpc 2018-11-11 16:39:10 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2018:2709

