Description of problem:

docker_image_availability check fails when deploying OpenShift via Director:

INSTALLER STATUS ***************************************************************
Initialization  : Complete (0:01:21)
Health Check    : In Progress (0:02:27)
        This phase can be restarted by running: playbooks/openshift-checks/pre-install.yml

Monday 01 October 2018  15:45:43 -0400 (0:02:20.792)       0:08:13.461 ********
===============================================================================

Failure summary:

  1. Hosts:    openshift-infra-0, openshift-infra-1, openshift-master-0,
               openshift-master-1, openshift-master-2, openshift-worker-0,
               openshift-worker-1, openshift-worker-2
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               One or more required container images are not available:
                   192.168.24.1:8787/openshift3/ose-deployer:v3.10,
                   192.168.24.1:8787/openshift3/ose-docker-registry:v3.10,
                   192.168.24.1:8787/openshift3/ose-haproxy-router:v3.10,
                   192.168.24.1:8787/openshift3/ose-pod:v3.10
               Checked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>
               Default registries searched: 192.168.24.1:8787, registry.access.redhat.com

The execution of "/var/lib/mistral/openshift/openshift/playbook.yml" includes checks designed to fail early if the requirements of the playbook are not met. One or more of these checks failed. To disregard these results, explicitly disable checks by setting an Ansible variable:

    openshift_disable_check=docker_image_availability

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:

1.
Deploy overcloud:

openstack overcloud deploy \
  --stack openshift \
  --templates \
  -r /home/stack/openshift_roles_data.yaml \
  -n /home/stack/network_data.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/openshift.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/openshift-cns.yaml \
  -e /home/stack/openshift_env.yaml \
  -e /home/stack/containers-default-parameters.yaml

[stack@undercloud-0 ~]$ cat containers-default-parameters.yaml
# Generated with the following on 2018-09-26T20:02:43.602821
#
#   openstack tripleo container image prepare -e /home/stack/containers-prepare-parameter.yaml --roles-file /home/stack/openshift_roles_data.yaml --output-env-file /home/stack/containers-default-parameters.yaml
#
parameter_defaults:
  DockerHAProxyConfigImage: 192.168.24.1:8787/rhosp14/openstack-haproxy:2018-09-06.1
  DockerHAProxyImage: 192.168.24.1:8787/rhosp14/openstack-haproxy:2018-09-06.1
  DockerInsecureRegistryAddress:
  - 192.168.24.1:8787
  DockerKeepalivedConfigImage: 192.168.24.1:8787/rhosp14/openstack-keepalived:2018-09-06.1
  DockerKeepalivedImage: 192.168.24.1:8787/rhosp14/openstack-keepalived:2018-09-06.1
  DockerOpenShiftBaseImage: 192.168.24.1:8787/openshift3/ose:v3.10
  DockerOpenShiftCockpitImage: 192.168.24.1:8787/openshift3/registry-console:v3.10
  DockerOpenShiftDeployerImage: 192.168.24.1:8787/openshift3/ose-deployer:v3.10
  DockerOpenShiftDockerRegistryImage: 192.168.24.1:8787/openshift3/ose-docker-registry:v3.10
  DockerOpenShiftEtcdImage: 192.168.24.1:8787/rhel7/etcd:latest
  DockerOpenShiftGlusterFSBlockImage: 192.168.24.1:8787/rhgs3/rhgs-gluster-block-prov-rhel7:latest
  DockerOpenShiftGlusterFSHeketiImage: 192.168.24.1:8787/rhgs3/rhgs-volmanager-rhel7:latest
  DockerOpenShiftGlusterFSImage: 192.168.24.1:8787/rhgs3/rhgs-server-rhel7:latest
  DockerOpenShiftHAProxyRouterImage: 192.168.24.1:8787/openshift3/ose-haproxy-router:v3.10
  DockerOpenShiftNodeImage: 192.168.24.1:8787/openshift3/node:v3.10
  DockerOpenShiftPodImage: 192.168.24.1:8787/openshift3/ose-pod:v3.10
  DockerOpenShiftWebConsoleImage: 192.168.24.1:8787/openshift3/ose-web-console:v3.10

Actual results:

Deployment fails because the docker_image_availability validation fails.

Expected results:

Deployment shouldn't fail, as the images are available:

[root@undercloud-0 stack]# skopeo inspect --tls-verify=false docker://192.168.24.1:8787/openshift3/ose-deployer:v3.10
{
    "Name": "192.168.24.1:8787/openshift3/ose-deployer",
    "Digest": "sha256:d147e7e4595481c724ff74bf393cfb00d10c281685830a0840f62b28fde140db",
    "RepoTags": [
        "v3.10"
    ],
    "Created": "2018-09-10T18:00:20.217044Z",
    "DockerVersion": "1.12.6",
    "Labels": {
        "License": "GPLv2+",
        "architecture": "x86_64",
        "authoritative-source-url": "registry.access.redhat.com",
        "build-date": "2018-09-10T17:59:46.483136",
        "com.redhat.build-host": "osbs-cpt-007.ocp.osbs.upshift.eng.rdu2.redhat.com",
        "com.redhat.component": "openshift-enterprise-deployer-container",
        "description": "This is a component of OpenShift Container Platform and executes the user deployment process to roll out new containers. It may be used as a base image for building your own custom deployer image.",
        "distribution-scope": "public",
        "io.k8s.description": "This is a component of OpenShift Container Platform and executes the user deployment process to roll out new containers. It may be used as a base image for building your own custom deployer image.",
        "io.k8s.display-name": "OpenShift Container Platform Deployer",
        "io.openshift.expose-services": "",
        "io.openshift.tags": "openshift,deployer",
        "maintainer": "Red Hat, Inc.",
        "name": "openshift3/ose-deployer",
        "release": "2",
        "summary": "Provides the latest release of Red Hat Enterprise Linux 7 in a fully featured and supported base image.",
        "url": "https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/ose-deployer/images/v3.10.45-2",
        "usage": "This image is very generic and does not serve a single use case. Use it as a base to build your own images.",
        "vcs-ref": "4f6dafb9ca673905a9cedcba73366ae972a9964d",
        "vcs-type": "git",
        "vendor": "Red Hat, Inc.",
        "version": "v3.10.45"
    },
    "Architecture": "amd64",
    "Os": "linux",
    "Layers": [
        "sha256:367d845540573038025f445c654675aa63905ec8682938fb45bc00f40849c37b",
        "sha256:b82a357e4f15fda58e9728fced8558704e3a2e1d100e93ac408edb45fe3a5cb9",
        "sha256:5305a8b2fbab987f183698f1c0845b7a6fbb2490e05fe51fa68a7f838fa950c7",
        "sha256:d452a8e8215cbbf07a2fb9e486ba6621edef3f9ae2182d80872a13a7b8e9e415",
        "sha256:f767ece49514ec0f4b960ffe124f62eeca2b4e9f5bdbbccedaa950dade6b4ba9"
    ]
}

Additional info:

[stack@undercloud-0 ~]$ cat openshift_env.yaml
resource_registry:
  OS::TripleO::Services::HAproxy: /usr/share/openstack-tripleo-heat-templates/docker/services/haproxy.yaml
  OS::TripleO::Services::Keepalived: /usr/share/openstack-tripleo-heat-templates//docker/services/keepalived.yaml
  OS::TripleO::NodeUserData: /home/stack/firstboot.yaml
  OS::TripleO::OpenShiftMaster::Net::SoftwareConfig: /home/stack/master-nic.yaml
  OS::TripleO::OpenShiftWorker::Net::SoftwareConfig: /home/stack/worker-nic.yaml
  OS::TripleO::OpenShiftInfra::Net::SoftwareConfig: /home/stack/infra-nic.yaml

parameter_defaults:
  CloudName: openshift.localdomain
  OvercloudOpenShiftMasterFlavor: master
  OpenShiftMasterHostnameFormat: '%stackname%-master-%index%'
  OvercloudOpenShiftWorkerFlavor: worker
  OpenShiftWorkerHostnameFormat: '%stackname%-worker-%index%'
  OvercloudOpenShiftInfraFlavor: infra
  OpenShiftInfraHostnameFormat: '%stackname%-infra-%index%'
  OpenShiftMasterCount: 3
  OpenShiftWorkerCount: 3
  OpenShiftInfraCount: 2
  NtpServer: ["clock.redhat.com","clock2.redhat.com"]
  ControlPlaneDefaultRoute: 192.168.24.1
  EC2MetadataIp: 192.168.24.1
  ControlPlaneSubnetCidr: 24
  DnsServers:
  - 10.0.0.1
  OpenShiftGlobalVariables:
    openshift_master_identity_providers:
    - name: allow_all
      login: 'true'
      challenge: true
      kind: AllowAllPasswordIdentityProvider
    openshift_deployment_type: openshift-enterprise
    openshift_master_default_subdomain: apps.openshift.localdomain
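For reference, the failure message suggests the check can be skipped entirely via openshift_disable_check. A minimal sketch of such an override, assuming that entries under OpenShiftGlobalVariables are forwarded to openshift-ansible the same way the openshift_* variables in openshift_env.yaml above are (this placement is an assumption, not a verified configuration), and noting that this hides the symptom rather than fixing the registry access:

```yaml
# Hypothetical debugging override: skip the failing health check.
# Use only to unblock a deployment while investigating.
parameter_defaults:
  OpenShiftGlobalVariables:
    # Comma-separated list of openshift-ansible checks to skip.
    openshift_disable_check: docker_image_availability
```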
I cannot reproduce this locally. Could it be a networking issue between your local registry and the OpenShift nodes? Or was the docker_image_availability check broken in openshift-ansible and fixed in a later release?

I'm using the openshift-ansible container image with these labels:

    "Labels": {
        "License": "GPLv2+",
        "architecture": "x86_64",
        "atomic.run": "once",
        "authoritative-source-url": "registry.access.redhat.com",
        "build-date": "2018-09-24T16:47:49.685343",
        "com.redhat.build-host": "osbs-cpt-013.ocp.osbs.upshift.eng.rdu2.redhat.com",
        "com.redhat.component": "aos3-installation-container",
        "description": "A containerized openshift-ansible image to let you run playbooks to install, upgrade, maintain and check an OpenShift cluster",
        "distribution-scope": "public",
        "io.k8s.description": "A containerized openshift-ansible image to let you run playbooks to install, upgrade, maintain and check an OpenShift cluster",
        "io.k8s.display-name": "openshift-ansible",
        "io.openshift.expose-services": "",
        "io.openshift.source-commit-url": "https://github.com/openshift/openshift-ansible/commit/5aef9413a9af137a8f63fa3c502f0a912bf0e481",
        "io.openshift.source-repo-commit": "5aef9413a9af137a8f63fa3c502f0a912bf0e481",
        "io.openshift.source-repo-url": "https://github.com/openshift/openshift-ansible",
        "io.openshift.tags": "openshift,install,upgrade,ansible",
        "maintainer": "Red Hat, Inc.",
        "name": "openshift3/ose-ansible",
        "release": "4",
        "summary": "OpenShift's installation and configuration tool",
        "url": "https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/ose-ansible/images/v3.10.45-4",
        "usage": "This image is very generic and does not serve a single use case. Use it as a base to build your own images.",
        "vcs-ref": "02cdb7f0791b2df3287af8a83780500547757ce1",
        "vcs-type": "git",
        "vendor": "Red Hat, Inc.",
        "version": "v3.10.45"
    }
I made some progress on this. Part of the problem is that we do not tell the docker_image_availability check to treat the local registry on the undercloud as an insecure registry, which fails the test. Adding the undercloud registry to the openshift_docker_insecure_registries Ansible variable gets us closer.

However, we're not done yet: the check still fails on the registry-console image. This was reported in [1] and already fixed in openshift-ansible with [2]. That commit first appeared in openshift-ansible-3.10.48-1, but the openshift-ansible container image with the v3.10 tag currently points at v3.10.45-4 [3], and the most recent released openshift-ansible image is v3.10.47-1.

One way to work around the issue until we have a fixed openshift-ansible is to restore the openshift_cockpit_deployer_* variables as we were doing before [4]. Ideally, we should use openshift-ansible v3.10.48-1 or newer.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1613100
[2] https://github.com/openshift/openshift-ansible/commit/37cd8d6136781a22c74cc0fc92510c990e63b718
[3] https://access.redhat.com/containers/?tab=tags#/registry.access.redhat.com/openshift3/ose-ansible
[4] https://review.openstack.org/#/c/602074/5/extraconfig/services/openshift-master.yaml
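The insecure-registry part of the workaround can be sketched as an addition to openshift_env.yaml. Note that passing openshift_docker_insecure_registries through OpenShiftGlobalVariables is an assumption based on how the other openshift_* variables are set in this report; the submitted patch may wire the variable in differently:

```yaml
# Sketch: tell the docker_image_availability check to query the
# undercloud registry (192.168.24.1:8787 in this report) as insecure,
# matching the DockerInsecureRegistryAddress setting above.
parameter_defaults:
  OpenShiftGlobalVariables:
    openshift_docker_insecure_registries:
    - 192.168.24.1:8787
```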
I submitted a patch to set the openshift_docker_insecure_registries variable accordingly: https://review.openstack.org/#/c/609603/
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045