Bug 1615504 - Installer fails on task "Wait for the ServiceMonitor CRD to be created"
Status: CLOSED ERRATA
Product: OpenShift Container Platform
Classification: Red Hat
Component: Installer
Version: 3.11.0
Hardware: Unspecified
OS: Unspecified
Priority: unspecified
Severity: unspecified
Target Milestone: ---
Target Release: 3.11.0
Assigned To: Frederic Branczyk
QA Contact: Gaoyun Pei
Reported: 2018-08-13 14:08 EDT by Matt Bruzek
Modified: 2018-10-11 03:25 EDT
CC List: 4 users

Doc Type: No Doc Update
Last Closed: 2018-10-11 03:24:39 EDT
Type: Bug




External Trackers:
  Red Hat Product Errata RHBA-2018:2652 (Last Updated: 2018-10-11 03:25 EDT)

Description Matt Bruzek 2018-08-13 14:08:30 EDT
Description of problem:

While running the openshift-ansible install command: 
source /home/cloud-user/keystonerc; ansible-playbook -vvv --user openshift -i inventory -i openshift-ansible/playbooks/openstack/inventory.py openshift-ansible/playbooks/openstack/openshift-cluster/install.yml 2>&1 >> /home/cloud-user/logs/openshift_install.log

The installer fails on:

TASK [openshift_cluster_monitoring_operator : Wait for the ServiceMonitor CRD to be created] ***
task path: /home/cloud-user/openshift-ansible/roles/openshift_cluster_monitoring_operator/tasks/install.yaml:115

The cluster-monitoring-operator-cf688f46c-94cnf pod is in an ImagePullBackOff state. According to the log, the Deployment object specifies "image": "registry.redhat.io/openshift3/ose-cluster-monitoring-operator:v3.11.0".

I could not pull that image:

# docker pull registry.redhat.io/openshift3/ose-cluster-monitoring-operator:v3.11.0
Trying to pull repository registry.redhat.io/openshift3/ose-cluster-monitoring-operator ... 
Get https://registry.redhat.io/v2/openshift3/ose-cluster-monitoring-operator/manifests/v3.11.0: unauthorized: invalid credentials provided when attempting to perform docker authentication

The install is configured to use oreg_url: registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}

I found the image at that URL:
# docker pull registry.reg-aws.openshift.com:443/openshift3/ose-cluster-monitoring-operator:v3.11.0

I believe the deployment template for the cluster-monitoring-operator should use the oreg_url for the image location.
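
For illustration, here is how the expected substitution would resolve with the oreg_url above; the resolved string is an assumption based on the inventory value, not output captured from the installer:

oreg_url:       registry.reg-aws.openshift.com:443/openshift3/ose-${component}:${version}
resolved image: registry.reg-aws.openshift.com:443/openshift3/ose-cluster-monitoring-operator:v3.11.0
                (${component} -> cluster-monitoring-operator, ${version} -> v3.11.0)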


Version-Release number of the following components:
ansible 2.6.0
  config file = /etc/ansible/ansible.cfg
  configured module search path = [u'/home/cloud-user/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/lib/python2.7/site-packages/ansible
  executable location = /usr/bin/ansible
  python version = 2.7.5 (default, Jul 16 2018, 19:52:45) [GCC 4.8.5 20150623 (Red Hat 4.8.5-36)]
git describe: openshift-ansible-3.9.0-0.10.0-3147-g7ad2385

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

TASK [openshift_cluster_monitoring_operator : Wait for the ServiceMonitor CRD to be created] ***
task path: /home/cloud-user/openshift-ansible/roles/openshift_cluster_monitoring_operator/tasks/install.yaml:115
... 30 retries ...
<192.168.0.18> (1, '\n{"changed": true, "end": "2018-08-13 11:33:52.869678", "stdout": "", "cmd": ["oc", "get", "crd", "servicemonitors.monitoring.coreos.com", "-n", "openshift-monitoring", "--config=/tmp/openshift-cluster-monitoring-ansible-RtKcWm/admin.kubeconfig"], "failed": true, "delta": "0:00:00.159734", "stderr": "No resources found.\\nError from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \\"servicemonitors.monitoring.coreos.com\\" not found", "rc": 1, "invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": false, "_raw_params": "oc get crd servicemonitors.monitoring.coreos.com -n openshift-monitoring --config=/tmp/openshift-cluster-monitoring-ansible-RtKcWm/admin.kubeconfig", "removes": null, "argv": null, "creates": null, "chdir": null, "stdin": null}}, "start": "2018-08-13 11:33:52.709944", "msg": "non-zero return code"}\n', '')
fatal: [master-0.scale-ci.example.com]: FAILED! => {
    "attempts": 30, 
    "changed": true, 
    "cmd": [
        "oc", 
        "get", 
        "crd", 
        "servicemonitors.monitoring.coreos.com", 
        "-n", 
        "openshift-monitoring", 
        "--config=/tmp/openshift-cluster-monitoring-ansible-RtKcWm/admin.kubeconfig"
    ], 
    "delta": "0:00:00.159734", 
    "end": "2018-08-13 11:33:52.869678", 
    "invocation": {
        "module_args": {
            "_raw_params": "oc get crd servicemonitors.monitoring.coreos.com -n openshift-monitoring --config=/tmp/openshift-cluster-monitoring-ansible-RtKcWm/admin.kubeconfig", 
            "_uses_shell": false, 
            "argv": null, 
            "chdir": null, 
            "creates": null, 
            "executable": null, 
            "removes": null, 
            "stdin": null, 
            "warn": true
        }
    }, 
    "msg": "non-zero return code", 
    "rc": 1, 
    "start": "2018-08-13 11:33:52.709944", 
    "stderr": "No resources found.\nError from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" not found", 
    "stderr_lines": [
        "No resources found.", 
        "Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" not found"
    ], 
    "stdout": "", 
    "stdout_lines": []
}

...

Failure summary:


  1. Hosts:    master-0.scale-ci.example.com
     Play:     Configure Cluster Monitoring Operator
     Task:     Wait for the ServiceMonitor CRD to be created
     Message:  non-zero return code

Expected results: I expect the cluster-monitoring-operator image to be derived from oreg_url.
Comment 2 Scott Dodson 2018-08-13 16:44:48 EDT
Yeah, we need to account for evaluating oreg_url in order to support disconnected installs. Here's a relatively decent pattern to follow:

https://github.com/openshift/openshift-ansible/blob/master/roles/ansible_service_broker/defaults/main.yml#L26-L33

However, I've just noticed a bug: the default dictionary should contain ${component}, as in the corrected version below:

l_asb_default_images_dict:
  origin: 'docker.io/ansibleplaybookbundle/origin-${component}:latest'
  openshift-enterprise: 'registry.redhat.io/openshift3/ose-${component}:${version}'

l_asb_default_images_default: "{{ l_asb_default_images_dict[openshift_deployment_type] }}"
l_asb_image_url: "{{ oreg_url | default(l_asb_default_images_default) | regex_replace('${version}' | regex_escape, openshift_image_tag) }}"

ansible_service_broker_image: "{{ l_asb_image_url | regex_replace('${component}' | regex_escape, 'ansible-service-broker') }}"
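
By way of illustration only, the same pattern applied to the monitoring role could look like the sketch below. The l_cmo_* names and the openshift_cluster_monitoring_operator_image variable are placeholders for this sketch, not the actual variables used by openshift-ansible; oreg_url, openshift_image_tag, and the regex_replace/regex_escape filters are the same ones used in the pattern above.

# Hypothetical defaults for the cluster monitoring operator role; variable names are illustrative.
l_cmo_default_image: 'registry.redhat.io/openshift3/ose-${component}:${version}'

l_cmo_image_url: "{{ oreg_url | default(l_cmo_default_image) | regex_replace('${version}' | regex_escape, openshift_image_tag) }}"

openshift_cluster_monitoring_operator_image: "{{ l_cmo_image_url | regex_replace('${component}' | regex_escape, 'cluster-monitoring-operator') }}"

With the oreg_url from this report, that would resolve to the registry.reg-aws.openshift.com image that could be pulled successfully in the description.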
Comment 3 Scott Dodson 2018-08-14 09:38:22 EDT
https://github.com/openshift/openshift-ansible/pull/9477 should address this.
Comment 4 Scott Dodson 2018-08-14 17:24:31 EDT
Should be in openshift-ansible-3.11.0-0.15.0
Comment 7 errata-xmlrpc 2018-10-11 03:24:39 EDT
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:2652
