Bug 1575898 - docker_image_availability check in upgrade failed due to ose-control-plane&ose-node:latest image unavailable
Keywords:
Status: CLOSED WONTFIX
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Cluster Version Operator
Version: 3.10.0
Hardware: Unspecified
OS: Unspecified
Priority: low
Severity: low
Target Milestone: ---
Target Release: 3.10.0
Assignee: Scott Dodson
QA Contact: liujia
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-05-08 08:30 UTC by liujia
Modified: 2019-02-18 18:00 UTC (History)

Fixed In Version:
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-02-18 18:00:31 UTC
Target Upstream Version:



Description liujia 2018-05-08 08:30:31 UTC
Description of problem:
The upgrade failed at TASK [Run health checks (upgrade)] because the ose-control-plane:latest and ose-node:latest images were unavailable. The openshift_health_checker role ran before the openshift_version role, so openshift_release/openshift_image_tag had not been set yet, and the health check fell back to the default tag "latest" for all images.

# vim roles/openshift_health_checker/openshift_checks/docker_image_availability.py
image_tag = self.get_var("openshift_image_tag", default="latest")
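
The effect of that default can be sketched in isolation. This is a minimal illustration, not the checker's actual code: `get_var` and `required_images` are hypothetical stand-ins, the image list is limited to the two images named in this report, and the "v3.10" value is only an example.

```python
def get_var(task_vars, name, default=None):
    """Hypothetical stand-in for the check's variable lookup."""
    return task_vars.get(name, default)


def required_images(task_vars):
    """Build the image references the check would look for."""
    # Same fallback as the line quoted above: an unset tag becomes "latest".
    image_tag = get_var(task_vars, "openshift_image_tag", default="latest")
    names = ["openshift3/ose-control-plane", "openshift3/ose-node"]
    return ["{}:{}".format(name, image_tag) for name in names]


# Tag not set yet (health checks ran before openshift_version):
print(required_images({}))
# Tag already set (illustrative value):
print(required_images({"openshift_image_tag": "v3.10"}))
```

With an empty variable set the sketch asks for the `latest` images, which matches the failure reported here; once the tag is set, it asks for the tagged images instead.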

Failure summary:
  1. Hosts:    x.x.x.x
     Play:     OpenShift Health Checks
     Task:     Run health checks (upgrade)
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               One or more required container images are not available:
                   openshift3/ose-control-plane:latest,
                   openshift3/ose-node:latest
               Checked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>
               Default registries searched: registry.reg-aws.openshift.com:443, registry.access.redhat.com
               Blocked registries: registry.hacker.com
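
For manual verification, the "Checked with" template above can be expanded into one command per registry. The helper below is a rough sketch under assumptions: `skopeo_commands` is a hypothetical name, it only builds argument lists and does not run skopeo, and the real check may assemble its commands differently.

```python
def skopeo_commands(image, registries, tls_verify=True, creds=None):
    """Return one `skopeo inspect` argument list per registry."""
    commands = []
    for registry in registries:
        cmd = ["skopeo", "inspect"]
        if not tls_verify:
            # Optional flag from the template in the failure summary.
            cmd.append("--tls-verify=false")
        if creds:
            # creds is a (user, password) pair.
            cmd.append("--creds={}:{}".format(*creds))
        cmd.append("docker://{}/{}".format(registry, image))
        commands.append(cmd)
    return commands


registries = [
    "registry.reg-aws.openshift.com:443",
    "registry.access.redhat.com",
]
for cmd in skopeo_commands("openshift3/ose-control-plane:latest", registries):
    print(" ".join(cmd))
```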

# docker pull openshift3/ose-control-plane:latest
Trying to pull repository registry.reg-aws.openshift.com:443/openshift3/ose-control-plane ... 
Pulling repository registry.reg-aws.openshift.com:443/openshift3/ose-control-plane
Trying to pull repository registry.access.redhat.com/openshift3/ose-control-plane ... 
Trying to pull repository registry.access.redhat.com/openshift3/ose-control-plane ... 
Trying to pull repository docker.io/openshift3/ose-control-plane ... 
repository docker.io/openshift3/ose-control-plane not found: does not exist or no pull access


# docker pull openshift3/ose-control-plane:v3.10
Trying to pull repository registry.reg-aws.openshift.com:443/openshift3/ose-control-plane ... 
v3.10: Pulling from registry.reg-aws.openshift.com:443/openshift3/ose-control-plane
d1fe25896eb5: Already exists 
001d79f68470: Already exists 
51c5e732a200: Pull complete 
4d0779510506: Pull complete 
Digest: sha256:47b10c1856fdde3d3af8bb6759e19127608e99e5ac6e37a653b556fb5e408890
Status: Downloaded newer image for registry.reg-aws.openshift.com:443/openshift3/ose-control-plane:v3.10


Version-Release number of the following components:
openshift-ansible-3.10.0-0.36.0.git.0.521f0ef.el7.noarch

How reproducible:
always

Steps to Reproduce:
1. Run an upgrade against a Docker containerized OCP cluster on RHEL.

Actual results:
The upgrade fails at the docker_image_availability check.

Expected results:
The docker_image_availability check should use the correct image tag.

Additional info:
Workaround: openshift_disable_check=docker_image_availability

Comment 1 Scott Dodson 2018-05-11 19:18:15 UTC
This is an issue only with the internal registry, and it has been fixed by pushing the latest tag. This shouldn't happen in registry.access.redhat.com. Let's verify that this fixes the problem, and then we can go ahead and close this.

Comment 2 liujia 2018-05-15 01:38:05 UTC
Hi, Scott

Yes, once the image was tagged with "latest" in the internal registry, this issue can be worked around.

However, I filed this bug because I think the root cause is that the openshift_version role should run before the openshift_health_checker role, so that openshift_release/openshift_image_tag is set to the correct value. I also think docker_image_availability needs to check the image (v3.10/v3.10.0-x.x.x.x) that will actually be used later in the upgrade, not the default (latest) one, which will not be pulled at all.

BTW, although this wouldn't happen in registry.access.redhat.com, I'm not sure whether the "latest" tag always exists for users who prefer an internal registry (like our registry.reg-aws.openshift.com:443). If it doesn't, I think the workaround will not work.

Changing the status back to wait for a more reasonable solution. Of course, this issue does not block anything.

Comment 3 Scott Dodson 2019-02-18 18:00:31 UTC
There appear to be no active cases related to this bug. As such we're closing this bug in order to focus on bugs that are still tied to active customer cases. Please re-open this bug if you feel it was closed in error or a new active case is attached.

