Bug 1635010 - docker_image_availability check fails when deploying Openshift via Director
Summary: docker_image_availability check fails when deploying Openshift via Director
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-heat-templates
Version: 14.0 (Rocky)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: beta
Target Release: 14.0 (Rocky)
Assignee: Martin André
QA Contact: Marius Cornea
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2018-10-01 19:52 UTC by Marius Cornea
Modified: 2019-01-11 11:53 UTC (History)

Fixed In Version: openstack-tripleo-heat-templates-9.0.1-0.20181013060864.ffbe879.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-01-11 11:53:30 UTC
Target Upstream Version:
Embargoed:




Links
System                  ID              Private  Priority  Status  Summary  Last Updated
OpenStack gerrit        609603          0        None      None    None     2018-10-11 08:19:49 UTC
Red Hat Product Errata  RHEA-2019:0045  0        None      None    None     2019-01-11 11:53:35 UTC

Description Marius Cornea 2018-10-01 19:52:25 UTC
Description of problem:
docker_image_availability check fails when deploying Openshift via Director:

INSTALLER STATUS ***************************************************************
Initialization  : Complete (0:01:21)
Health Check    : In Progress (0:02:27)
	This phase can be restarted by running: playbooks/openshift-checks/pre-install.yml
Monday 01 October 2018  15:45:43 -0400 (0:02:20.792)       0:08:13.461 ******** 
=============================================================================== 


Failure summary:


  1. Hosts:    openshift-infra-0, openshift-infra-1, openshift-master-0, openshift-master-1, openshift-master-2, openshift-worker-0, openshift-worker-1, openshift-worker-2
     Play:     OpenShift Health Checks
     Task:     Run health checks (install) - EL
     Message:  One or more checks failed
     Details:  check "docker_image_availability":
               One or more required container images are not available:
                   192.168.24.1:8787/openshift3/ose-deployer:v3.10,
                   192.168.24.1:8787/openshift3/ose-docker-registry:v3.10,
                   192.168.24.1:8787/openshift3/ose-haproxy-router:v3.10,
                   192.168.24.1:8787/openshift3/ose-pod:v3.10
               Checked with: skopeo inspect [--tls-verify=false] [--creds=<user>:<pass>] docker://<registry>/<image>
               Default registries searched: 192.168.24.1:8787, registry.access.redhat.com
               

The execution of "/var/lib/mistral/openshift/openshift/playbook.yml" includes checks designed to fail early if the requirements of the playbook are not met. One or more of these checks failed. To disregard these results,explicitly disable checks by setting an Ansible variable:
   openshift_disable_check=docker_image_availability


Version-Release number of selected component (if applicable):


How reproducible:


Steps to Reproduce:
1. Deploy overcloud:

openstack overcloud deploy \
--stack openshift \
--templates \
-r /home/stack/openshift_roles_data.yaml \
-n /home/stack/network_data.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/openshift.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/openshift-cns.yaml \
-e /home/stack/openshift_env.yaml \
-e /home/stack/containers-default-parameters.yaml


[stack@undercloud-0 ~]$ cat containers-default-parameters.yaml 
# Generated with the following on 2018-09-26T20:02:43.602821
#
#   openstack tripleo container image prepare -e /home/stack/containers-prepare-parameter.yaml --roles-file /home/stack/openshift_roles_data.yaml --output-env-file /home/stack/containers-default-parameters.yaml
#

parameter_defaults:
  DockerHAProxyConfigImage: 192.168.24.1:8787/rhosp14/openstack-haproxy:2018-09-06.1
  DockerHAProxyImage: 192.168.24.1:8787/rhosp14/openstack-haproxy:2018-09-06.1
  DockerInsecureRegistryAddress:
  - 192.168.24.1:8787
  DockerKeepalivedConfigImage: 192.168.24.1:8787/rhosp14/openstack-keepalived:2018-09-06.1
  DockerKeepalivedImage: 192.168.24.1:8787/rhosp14/openstack-keepalived:2018-09-06.1
  DockerOpenShiftBaseImage: 192.168.24.1:8787/openshift3/ose:v3.10
  DockerOpenShiftCockpitImage: 192.168.24.1:8787/openshift3/registry-console:v3.10
  DockerOpenShiftDeployerImage: 192.168.24.1:8787/openshift3/ose-deployer:v3.10
  DockerOpenShiftDockerRegistryImage: 192.168.24.1:8787/openshift3/ose-docker-registry:v3.10
  DockerOpenShiftEtcdImage: 192.168.24.1:8787/rhel7/etcd:latest
  DockerOpenShiftGlusterFSBlockImage: 192.168.24.1:8787/rhgs3/rhgs-gluster-block-prov-rhel7:latest
  DockerOpenShiftGlusterFSHeketiImage: 192.168.24.1:8787/rhgs3/rhgs-volmanager-rhel7:latest
  DockerOpenShiftGlusterFSImage: 192.168.24.1:8787/rhgs3/rhgs-server-rhel7:latest
  DockerOpenShiftHAProxyRouterImage: 192.168.24.1:8787/openshift3/ose-haproxy-router:v3.10
  DockerOpenShiftNodeImage: 192.168.24.1:8787/openshift3/node:v3.10
  DockerOpenShiftPodImage: 192.168.24.1:8787/openshift3/ose-pod:v3.10
  DockerOpenShiftWebConsoleImage: 192.168.24.1:8787/openshift3/ose-web-console:v3.10


Actual results:
Deployment fails because docker_image_availability validation fails.

Expected results:
The deployment should not fail, because the images are available in the undercloud registry:

[root@undercloud-0 stack]# skopeo inspect --tls-verify=false docker://192.168.24.1:8787/openshift3/ose-deployer:v3.10
{
    "Name": "192.168.24.1:8787/openshift3/ose-deployer",
    "Digest": "sha256:d147e7e4595481c724ff74bf393cfb00d10c281685830a0840f62b28fde140db",
    "RepoTags": [
        "v3.10"
    ],
    "Created": "2018-09-10T18:00:20.217044Z",
    "DockerVersion": "1.12.6",
    "Labels": {
        "License": "GPLv2+",
        "architecture": "x86_64",
        "authoritative-source-url": "registry.access.redhat.com",
        "build-date": "2018-09-10T17:59:46.483136",
        "com.redhat.build-host": "osbs-cpt-007.ocp.osbs.upshift.eng.rdu2.redhat.com",
        "com.redhat.component": "openshift-enterprise-deployer-container",
        "description": "This is a component of OpenShift Container Platform and executes the user deployment process to roll out new containers. It may be used as a base image for building your own custom deployer image.",
        "distribution-scope": "public",
        "io.k8s.description": "This is a component of OpenShift Container Platform and executes the user deployment process to roll out new containers. It may be used as a base image for building your own custom deployer image.",
        "io.k8s.display-name": "OpenShift Container Platform Deployer",
        "io.openshift.expose-services": "",
        "io.openshift.tags": "openshift,deployer",
        "maintainer": "Red Hat, Inc.",
        "name": "openshift3/ose-deployer",
        "release": "2",
        "summary": "Provides the latest release of Red Hat Enterprise Linux 7 in a fully featured and supported base image.",
        "url": "https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/ose-deployer/images/v3.10.45-2",
        "usage": "This image is very generic and does not serve a single use case. Use it as a base to build your own images.",
        "vcs-ref": "4f6dafb9ca673905a9cedcba73366ae972a9964d",
        "vcs-type": "git",
        "vendor": "Red Hat, Inc.",
        "version": "v3.10.45"
    },
    "Architecture": "amd64",
    "Os": "linux",
    "Layers": [
        "sha256:367d845540573038025f445c654675aa63905ec8682938fb45bc00f40849c37b",
        "sha256:b82a357e4f15fda58e9728fced8558704e3a2e1d100e93ac408edb45fe3a5cb9",
        "sha256:5305a8b2fbab987f183698f1c0845b7a6fbb2490e05fe51fa68a7f838fa950c7",
        "sha256:d452a8e8215cbbf07a2fb9e486ba6621edef3f9ae2182d80872a13a7b8e9e415",
        "sha256:f767ece49514ec0f4b960ffe124f62eeca2b4e9f5bdbbccedaa950dade6b4ba9"
    ]
}
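
The manual check above covers one image, run as root on the undercloud with --tls-verify=false. A minimal sketch for repeating the same probe against all four images from the failure summary follows. The REGISTRY and SKOPEO variables and the check_images function are illustrative names, not part of openshift-ansible or Director, and SKOPEO exists only so the loop can be dry-run on a machine without skopeo installed.

```shell
#!/usr/bin/env bash
# Sketch: repeat the skopeo probe for every image the health check reported missing.
# REGISTRY, SKOPEO and check_images are hypothetical names introduced for this example.
REGISTRY="${REGISTRY:-192.168.24.1:8787}"
SKOPEO="${SKOPEO:-skopeo}"

# Images listed in the docker_image_availability failure summary.
IMAGES=(
  "openshift3/ose-deployer:v3.10"
  "openshift3/ose-docker-registry:v3.10"
  "openshift3/ose-haproxy-router:v3.10"
  "openshift3/ose-pod:v3.10"
)

# Prints one line per image and returns the number of images that could not be inspected.
check_images() {
  local missing=0 image
  for image in "${IMAGES[@]}"; do
    if "$SKOPEO" inspect --tls-verify=false "docker://${REGISTRY}/${image}" >/dev/null 2>&1; then
      echo "available: ${REGISTRY}/${image}"
    else
      echo "MISSING:   ${REGISTRY}/${image}"
      missing=$((missing + 1))
    fi
  done
  return "$missing"
}
```

Calling check_images on one of the OpenShift nodes rather than on the undercloud would exercise the same network path as the failing health check. Note that it always passes --tls-verify=false, so a success here does not prove the installer's own check treats the registry as insecure.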

Additional info:

[stack@undercloud-0 ~]$ cat openshift_env.yaml 
resource_registry:
  OS::TripleO::Services::HAproxy: /usr/share/openstack-tripleo-heat-templates/docker/services/haproxy.yaml
  OS::TripleO::Services::Keepalived: /usr/share/openstack-tripleo-heat-templates//docker/services/keepalived.yaml
  OS::TripleO::NodeUserData: /home/stack/firstboot.yaml
  OS::TripleO::OpenShiftMaster::Net::SoftwareConfig: /home/stack/master-nic.yaml
  OS::TripleO::OpenShiftWorker::Net::SoftwareConfig: /home/stack/worker-nic.yaml
  OS::TripleO::OpenShiftInfra::Net::SoftwareConfig: /home/stack/infra-nic.yaml

parameter_defaults:
  CloudName: openshift.localdomain

  OvercloudOpenShiftMasterFlavor: master
  OpenShiftMasterHostnameFormat: '%stackname%-master-%index%'
  OvercloudOpenShiftWorkerFlavor: worker
  OpenShiftWorkerHostnameFormat: '%stackname%-worker-%index%'
  OvercloudOpenShiftInfraFlavor: infra
  OpenShiftInfraHostnameFormat: '%stackname%-infra-%index%'

  OpenShiftMasterCount: 3
  OpenShiftWorkerCount: 3
  OpenShiftInfraCount: 2

  NtpServer: ["clock.redhat.com","clock2.redhat.com"]

  ControlPlaneDefaultRoute: 192.168.24.1
  EC2MetadataIp: 192.168.24.1
  ControlPlaneSubnetCidr: 24

  DnsServers:
   - 10.0.0.1

  OpenShiftGlobalVariables:

    openshift_master_identity_providers:
    - name: allow_all
      login: 'true'
      challenge: true
      kind: AllowAllPasswordIdentityProvider

    openshift_deployment_type: openshift-enterprise
    openshift_master_default_subdomain: apps.openshift.localdomain

Comment 1 Martin André 2018-10-09 07:47:46 UTC
I cannot reproduce this locally. Could it be a networking issue between your local registry and the OpenShift nodes? Or was the docker_image_availability check broken in openshift-ansible and fixed in a later release?

I'm using the openshift-ansible container image with the following labels:

            "Labels": {
                "License": "GPLv2+",
                "architecture": "x86_64",
                "atomic.run": "once",
                "authoritative-source-url": "registry.access.redhat.com",
                "build-date": "2018-09-24T16:47:49.685343",
                "com.redhat.build-host": "osbs-cpt-013.ocp.osbs.upshift.eng.rdu2.redhat.com",
                "com.redhat.component": "aos3-installation-container",
                "description": "A containerized openshift-ansible image to let you run playbooks to install, upgrade, maintain and check an OpenShift cluster",
                "distribution-scope": "public",
                "io.k8s.description": "A containerized openshift-ansible image to let you run playbooks to install, upgrade, maintain and check an OpenShift cluster",
                "io.k8s.display-name": "openshift-ansible",
                "io.openshift.expose-services": "",
                "io.openshift.source-commit-url": "https://github.com/openshift/openshift-ansible/commit/5aef9413a9af137a8f63fa3c502f0a912bf0e481",
                "io.openshift.source-repo-commit": "5aef9413a9af137a8f63fa3c502f0a912bf0e481",
                "io.openshift.source-repo-url": "https://github.com/openshift/openshift-ansible",
                "io.openshift.tags": "openshift,install,upgrade,ansible",
                "maintainer": "Red Hat, Inc.",
                "name": "openshift3/ose-ansible",
                "release": "4",
                "summary": "OpenShift's installation and configuration tool",
                "url": "https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/ose-ansible/images/v3.10.45-4",
                "usage": "This image is very generic and does not serve a single use case. Use it as a base to build your own images.",
                "vcs-ref": "02cdb7f0791b2df3287af8a83780500547757ce1",
                "vcs-type": "git",
                "vendor": "Red Hat, Inc.",
                "version": "v3.10.45"
            }

Comment 2 Martin André 2018-10-11 07:08:10 UTC
I made a bit of progress on this. Part of the problem is that we do not tell the docker_image_availability check to treat the local registry on the undercloud as an insecure registry, so the check fails. Adding the undercloud registry to the openshift_docker_insecure_registries Ansible variable gets us closer.

However, we're not done yet: the check still fails on the registry-console image. This was reported in [1] and has already been fixed in openshift-ansible with [2]. That commit first appeared in openshift-ansible-3.10.48-1.

The v3.10 tag of the openshift-ansible container image currently points at v3.10.45-4 [3], and the most recent released openshift-ansible image is v3.10.47-1, so neither includes the fix.

One way to work around the issue until we have a fixed openshift-ansible is to restore the openshift_cockpit_deployer_* variables, as we were doing before [4]. Ideally, we should use openshift-ansible v3.10.48-1 or newer.

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1613100

[2] https://github.com/openshift/openshift-ansible/commit/37cd8d6136781a22c74cc0fc92510c990e63b718

[3] https://access.redhat.com/containers/?tab=tags#/registry.access.redhat.com/openshift3/ose-ansible

[4] https://review.openstack.org/#/c/602074/5/extraconfig/services/openshift-master.yaml
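
As a rough aid for deciding whether a given ose-ansible image tag already carries the registry-console fix, the version comparison can be sketched in shell. It relies on the statement above that the commit first appeared in openshift-ansible-3.10.48-1, and it assumes GNU sort with -V (version sort) and tags of the form vX.Y.Z-N. FIXED_IN and has_registry_console_fix are hypothetical names introduced here.

```shell
#!/usr/bin/env bash
# Sketch: decide whether an openshift-ansible version is at or past the first fixed build.
# FIXED_IN comes from the comment above; the function name is hypothetical.
FIXED_IN="3.10.48-1"

# Succeeds when the given version (with or without a leading "v") is >= FIXED_IN.
has_registry_console_fix() {
  local candidate="${1#v}" lowest
  # Version-sort the two strings; the fix is present when FIXED_IN sorts first or is equal.
  lowest="$(printf '%s\n%s\n' "$FIXED_IN" "$candidate" | sort -V | head -n 1)"
  [ "$lowest" = "$FIXED_IN" ]
}

# Versions mentioned in this bug, checked for illustration.
for tag in v3.10.45-4 v3.10.47-1 v3.10.48-1; do
  if has_registry_console_fix "$tag"; then
    echo "$tag: includes the fix"
  else
    echo "$tag: predates the fix"
  fi
done
```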

Comment 3 Martin André 2018-10-11 08:19:49 UTC
I submitted a patch to set the openshift_docker_insecure_registries variable accordingly: https://review.openstack.org/#/c/609603/
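
Until a build containing that patch is installed, the same variable can presumably be set by hand through OpenShiftGlobalVariables, which openshift_env.yaml already uses to pass variables such as openshift_deployment_type to openshift-ansible. The helper below only prints the YAML fragment to merge under the existing key. The function name is hypothetical, and the list form of the value is an assumption modelled on DockerInsecureRegistryAddress, not taken from the patch.

```shell
#!/usr/bin/env bash
# Sketch: print an environment-file fragment that marks the undercloud registry as insecure.
# print_insecure_registry_fragment is a hypothetical helper; the list form is an assumption.
print_insecure_registry_fragment() {
  local registry="${1:-192.168.24.1:8787}"
  cat <<EOF
parameter_defaults:
  OpenShiftGlobalVariables:
    openshift_docker_insecure_registries:
    - ${registry}
EOF
}

print_insecure_registry_fragment
```

The fragment is meant to be merged into the existing OpenShiftGlobalVariables block in openshift_env.yaml rather than passed as a separate environment file. As far as I know, a later parameter_defaults entry for the same parameter replaces the earlier value instead of merging with it.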

Comment 11 errata-xmlrpc 2019-01-11 11:53:30 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHEA-2019:0045

