Bug 1680523 - [RHOSP14][OpenShift 3.11] OpenShift integration has failed while fetching the following image: ose-cluster-monitoring-operator
Summary: [RHOSP14][OpenShift 3.11] OpenShift integration has failed while fetching fol...
Keywords:
Status: CLOSED DUPLICATE of bug 1659183
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-tripleo-common
Version: 14.0 (Rocky)
Hardware: x86_64
OS: Linux
Priority: unspecified
Severity: high
Target Milestone: ---
Assignee: Adriano Petrich
QA Contact: Alexander Chuzhoy
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2019-02-25 09:34 UTC by Pradipta Kumar Sahoo
Modified: 2019-03-13 10:23 UTC
4 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2019-03-13 10:23:06 UTC
Target Upstream Version:
Embargoed:



Description Pradipta Kumar Sahoo 2019-02-25 09:34:19 UTC
Description of problem:

Referring to the following bug, the ServiceMonitor CRD issue was fixed, but the problem reappears when integrating OpenShift 3.11 with RHOSP 14:
https://bugzilla.redhat.com/show_bug.cgi?id=1640287

Version-Release number of selected component (if applicable):
$ cat /etc/rhosp-release 
Red Hat OpenStack Platform release 14.0.0 RC (Rocky)

$ sudo rpm -qa | grep -i openstack-tripleo-common
openstack-tripleo-common-containers-9.4.1-0.20181012010888.el7ost.noarch
openstack-tripleo-common-9.4.1-0.20181012010888.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:

1. The default OpenShift integration steps fail when following the official RHOSP 14 documentation guide:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/14/html-single/installing_openshift_container_platform_on_bare_metal_using_director/

2. After the failure, we noticed that the ose-cluster-monitoring-operator container image tag is not uniform with the other openshift3 images.
$ sudo docker images | grep ose
192.168.24.1:8787/openshift3/ose-ansible                         v3.11.82-5            1ec01060c252        10 days ago         700 MB
192.168.24.1:8787/openshift3/ose-cluster-monitoring-operator     v3.11.82-4            82b529a4a254        11 days ago         338 MB
192.168.24.1:8787/openshift3/ose-node                            v3.11.82-3            6fc64a6096d1        12 days ago         1.17 GB
192.168.24.1:8787/openshift3/ose-control-plane                   v3.11.82-3            af096add3983        12 days ago         808 MB
192.168.24.1:8787/openshift3/ose-docker-builder                  v3.11.82-3            fb22439848f3        12 days ago         430 MB
192.168.24.1:8787/openshift3/ose-haproxy-router                  v3.11.82-3            3d7ff92c689c        12 days ago         385 MB
192.168.24.1:8787/openshift3/ose-deployer                        v3.11.82-3            54d335969fb7        12 days ago         361 MB
192.168.24.1:8787/openshift3/ose-kube-rbac-proxy                 v3.11.82-3            8f398aed20a5        12 days ago         264 MB
192.168.24.1:8787/openshift3/ose-prometheus-operator             v3.11.82-3            d6a5556e326a        12 days ago         481 MB
192.168.24.1:8787/openshift3/ose-console                         v3.11.82-3            9c85542fef2c        12 days ago         254 MB
192.168.24.1:8787/openshift3/ose-prometheus-config-reloader      v3.11.82-3            f1b13d5d1906        12 days ago         409 MB
192.168.24.1:8787/openshift3/ose-kube-state-metrics              v3.11.82-3            3b7ea4583a0e        12 days ago         342 MB
192.168.24.1:8787/openshift3/ose-docker-registry                 v3.11.82-3            6c569c7205f7        12 days ago         289 MB
192.168.24.1:8787/openshift3/ose-pod                             v3.11.82-3            74ee8ca2c756        12 days ago         238 MB
192.168.24.1:8787/openshift3/ose-configmap-reloader              v3.11.82-3            078153495a24        12 days ago         293 MB
192.168.24.1:8787/openshift3/ose-web-console                     v3.11.82-3            4a5379481b8c        12 days ago         322 MB
192.168.24.1:8787/openshift3/ose-template-service-broker         v3.11.82-3            a583363cacbf        12 days ago         313 MB
192.168.24.1:8787/openshift3/ose-service-catalog                 v3.11.82-3            ea03dbaa21f8        12 days ago         309 MB
192.168.24.1:8787/openshift3/ose-ansible-service-broker          v3.11.82-3            34631fb45718        12 days ago         457 MB
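
For reference, the tag mismatch can be spotted quickly by listing only the image references; this is a minimal sketch using standard docker CLI formatting flags, with the registry address and tags taken from the listing above:

$ sudo docker images --format '{{.Repository}}:{{.Tag}}' | grep '^192.168.24.1:8787/openshift3/' | grep -v ':v3.11.82-3$'
192.168.24.1:8787/openshift3/ose-ansible:v3.11.82-5
192.168.24.1:8787/openshift3/ose-cluster-monitoring-operator:v3.11.82-4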

3. Also, I tried to force the image tag with the following TripleO variable, but it did not mitigate the failure.
   Please suggest a supported method to handle non-uniform image tags during the integration.

parameter_defaults:
  OpenShiftGlobalVariables:
    cluster-monitoring-operator:
      openshift_docker_image_tag: v3.11.82-4
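
As a possible interim workaround (my own assumption, not a documented procedure), the odd image could be retagged in the local undercloud registry so that all openshift3 images carry the same tag before re-running the deployment; a sketch using plain docker commands, assuming the registry at 192.168.24.1:8787 accepts pushes:

# Retag the already-pulled -4 image as -3 and push it back to the local registry
$ sudo docker tag 192.168.24.1:8787/openshift3/ose-cluster-monitoring-operator:v3.11.82-4 \
       192.168.24.1:8787/openshift3/ose-cluster-monitoring-operator:v3.11.82-3
$ sudo docker push 192.168.24.1:8787/openshift3/ose-cluster-monitoring-operator:v3.11.82-3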



4. openstack overcloud failures --stack overcloud

    "TASK [openshift_cluster_monitoring_operator : Wait for the ServiceMonitor CRD to be created] ***",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (30 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (29 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (28 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (27 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (26 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (25 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (24 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (23 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (22 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (21 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (20 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (19 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (18 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (17 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (16 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (15 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (14 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (13 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (12 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (11 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (10 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (9 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (8 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (7 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (6 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (5 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (4 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (3 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (2 retries left).\u001b[0m",
    "\u001b[1;30mFAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (1 retries left).\u001b[0m",
    "\u001b[0;31mfatal: [overcloud-master-1]: FAILED! => {\"attempts\": 30, \"changed\": true, \"cmd\": [\"oc\", \"get\", \"crd\", \"servicemonitors.monitoring.coreos.com\", \"-n\", \"openshift-monitoring\", \"--config=/tmp/openshift-cluster-monitoring-ansible-ppxWNO/admin.kubeconfig\"], \"delta\": \"0:00:00.223581\", \"end\": \"2019-02-25 02:22:40.241355\", \"msg\": \"non-zero return code\", \"rc\": 1, \"start\": \"2019-02-25 02:22:40.017774\", \"stderr\": \"No resources found.\\nError from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \\\"servicemonitors.monitoring.coreos.com\\\" not found\", \"stderr_lines\": [\"No resources found.\", \"Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \\\"servicemonitors.monitoring.coreos.com\\\" not found\"], \"stdout\": \"\", \"stdout_lines\": []}\u001b[0m",
    "",
    "PLAY RECAP *********************************************************************",
    "\u001b[0;32mlocalhost\u001b[0m                  : \u001b[0;32mok=22  \u001b[0m changed=0    unreachable=0    failed=0   ",
    "\u001b[0;33movercloud-infra-0\u001b[0m          : \u001b[0;32mok=176 \u001b[0m \u001b[0;33mchanged=71  \u001b[0m unreachable=0    failed=0   ",
    "\u001b[0;33movercloud-infra-1\u001b[0m          : \u001b[0;32mok=176 \u001b[0m \u001b[0;33mchanged=71  \u001b[0m unreachable=0    failed=0   ",
    "\u001b[0;33movercloud-infra-2\u001b[0m          : \u001b[0;32mok=177 \u001b[0m \u001b[0;33mchanged=71  \u001b[0m unreachable=0    failed=0   ",
    "\u001b[0;33movercloud-master-0\u001b[0m         : \u001b[0;32mok=351 \u001b[0m \u001b[0;33mchanged=149 \u001b[0m unreachable=0    failed=0   ",
    "\u001b[0;31movercloud-master-1\u001b[0m         : \u001b[0;32mok=666 \u001b[0m \u001b[0;33mchanged=276 \u001b[0m unreachable=0    \u001b[0;31mfailed=1   \u001b[0m",
    "\u001b[0;33movercloud-master-2\u001b[0m         : \u001b[0;32mok=351 \u001b[0m \u001b[0;33mchanged=149 \u001b[0m unreachable=0    failed=0   ",
    "\u001b[0;33movercloud-worker-0\u001b[0m         : \u001b[0;32mok=176 \u001b[0m \u001b[0;33mchanged=71  \u001b[0m unreachable=0    failed=0   ",
    "\u001b[0;33movercloud-worker-1\u001b[0m         : \u001b[0;32mok=176 \u001b[0m \u001b[0;33mchanged=71  \u001b[0m unreachable=0    failed=0   ",
    "",
    "",
    "INSTALLER STATUS ***************************************************************",
    "\u001b[0;32mInitialization               : Complete (0:02:38)\u001b[0m",
    "\u001b[0;32mHealth Check                 : Complete (0:02:20)\u001b[0m",
    "\u001b[0;32mNode Bootstrap Preparation   : Complete (0:26:51)\u001b[0m",
    "\u001b[0;32metcd Install                 : Complete (0:02:30)\u001b[0m",
    "\u001b[0;32mMaster Install               : Complete (0:12:10)\u001b[0m",
    "\u001b[0;32mMaster Additional Install    : Complete (0:19:07)\u001b[0m",
    "\u001b[0;32mNode Join                    : Complete (0:01:58)\u001b[0m",
    "\u001b[0;32mHosted Install               : Complete (0:01:33)\u001b[0m",
    "\u001b[0;31mCluster Monitoring Operator  : In Progress (0:15:32)\u001b[0m",
    "\tThis phase can be restarted by running: playbooks/openshift-monitoring/config.yml",
    "",
    "",
    "Failure summary:",
    "",
    "",
    "  1. Hosts:    overcloud-master-1",
    "     Play:     Configure Cluster Monitoring Operator",
    "     Task:     Wait for the ServiceMonitor CRD to be created",
    "     Message:  \u001b[0;31mnon-zero return code\u001b[0m"
]
|---> warnings: [
    "Consider using 'become', 'become_method', and 'become_user' rather than running sudo"
]
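
For completeness, the failing check can be reproduced manually on the affected master; this is only a sketch, and the kubeconfig path is an example (the deployment itself uses a temporary one, as seen in the log above):

$ oc get crd servicemonitors.monitoring.coreos.com --config=/etc/origin/master/admin.kubeconfig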

Thank You,
Pradipta

Comment 2 Luis Tomas Bolivar 2019-03-13 10:23:06 UTC
Seems to be a duplicate of https://bugzilla.redhat.com/show_bug.cgi?id=1659183. Closing it, feel free to re-open if that is not the case.

As a workaround, you can follow what is described in https://bugzilla.redhat.com/show_bug.cgi?id=1659183#c9, adjusting it for the images you have (different tags).

*** This bug has been marked as a duplicate of bug 1659183 ***

