Description of problem:
Upgrade failed at TASK [Run health checks (upgrade)] because the ose-control-plane:latest and ose-node:latest images were unavailable. The openshift_health_checker role ran before the openshift_version role, so openshift_release/openshift_image_tag was not set to any value, which caused every image tag to fall back to the default "latest" during the health check.

# vim roles/openshift_health_checker/openshift_checks/docker_image_availability.py
image_tag = self.get_var("openshift_image_tag", default="latest")

Failure summary:

  1. Hosts:    x.x.x.x
     Play:     OpenShift Health Checks
     Task:     Run health checks (upgrade)
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               One or more required container images are not available:
                   openshift3/ose-control-plane:latest,
                   openshift3/ose-node:latest
               Checked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>
               Default registries searched: registry.reg-aws.openshift.com:443, registry.access.redhat.com
               Blocked registries: registry.hacker.com

# docker pull openshift3/ose-control-plane:latest
Trying to pull repository registry.reg-aws.openshift.com:443/openshift3/ose-control-plane ...
Pulling repository registry.reg-aws.openshift.com:443/openshift3/ose-control-plane
Trying to pull repository registry.access.redhat.com/openshift3/ose-control-plane ...
Trying to pull repository registry.access.redhat.com/openshift3/ose-control-plane ...
Trying to pull repository docker.io/openshift3/ose-control-plane ...
repository docker.io/openshift3/ose-control-plane not found: does not exist or no pull access

# docker pull openshift3/ose-control-plane:v3.10
Trying to pull repository registry.reg-aws.openshift.com:443/openshift3/ose-control-plane ...
v3.10: Pulling from registry.reg-aws.openshift.com:443/openshift3/ose-control-plane
d1fe25896eb5: Already exists
001d79f68470: Already exists
51c5e732a200: Pull complete
4d0779510506: Pull complete
Digest: sha256:47b10c1856fdde3d3af8bb6759e19127608e99e5ac6e37a653b556fb5e408890
Status: Downloaded newer image for registry.reg-aws.openshift.com:443/openshift3/ose-control-plane:v3.10

Version-Release number of the following components:
openshift-ansible-3.10.0-0.36.0.git.0.521f0ef.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Run upgrade against docker containerized ocp on rhel

Actual results:
Upgrade fails due to the docker_image_availability check.

Expected results:
The docker_image_availability check should use the correct image tag.

Additional info:
workaround: openshift_disable_check=docker_image_availability
This issue affects only the internal registry, and it has been fixed there by pushing the latest tag. It shouldn't happen with registry.access.redhat.com. Let's verify that this fixes the problem, and then we can go ahead and close this.
Hi Scott,

Yes, once the image is tagged "latest" in the internal registry, this issue can be worked around. However, I filed this bug because I think the root cause is that the openshift_version role should run before the openshift_health_checker role, so that openshift_release/openshift_image_tag is set to the correct value. I also think docker_image_availability needs to check the image (v3.10/v3.10.0-x.x.x.x) that will actually be used in the later upgrade, not the default (latest) one, which will not be pulled at all.

BTW, although this wouldn't happen with registry.access.redhat.com, I'm not sure the "latest" tag always exists for users who prefer an internal registry (just like our registry.reg-aws.openshift.com:443). If it doesn't, I think the workaround would not work.

Changing the status back to wait for a more reasonable solution. Of course, this issue does not block anything.
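
To make the request concrete, here is a minimal sketch of the tag selection described above. It is only an illustration, not the openshift-ansible code: resolve_image_tag and required_images are hypothetical helpers, task_vars is assumed to be a plain dict of inventory variables, and deriving the tag by prefixing "v" to openshift_release is an assumption about what the version role would produce.

```python
def resolve_image_tag(task_vars, default="latest"):
    """Pick the tag the availability check should look for.

    Hypothetical helper for illustration; not part of openshift-ansible.
    """
    # Prefer an explicitly set image tag.
    tag = task_vars.get("openshift_image_tag")
    if tag:
        return str(tag)

    # Assumption: fall back to the release, normalized to a "v" prefix.
    release = task_vars.get("openshift_release")
    if release:
        release = str(release)
        return release if release.startswith("v") else "v" + release

    # Only when neither variable is set does the check use the default.
    return default


def required_images(task_vars, names=("ose-control-plane", "ose-node")):
    """Build the image references the check would need to find."""
    tag = resolve_image_tag(task_vars)
    return ["openshift3/{}:{}".format(name, tag) for name in names]


if __name__ == "__main__":
    # With nothing set, the check looks for :latest, as in this report.
    print(required_images({}))
    # With the release known, it looks for the tag the upgrade will pull.
    print(required_images({"openshift_release": "3.10"}))
```

With neither variable set, this reproduces the ":latest" lookup from the failure summary. With openshift_release set, the check would look for the v3.10 images that the upgrade actually pulls.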
There appear to be no active cases related to this bug. As such we're closing this bug in order to focus on bugs that are still tied to active customer cases. Please re-open this bug if you feel it was closed in error or a new active case is attached.