Bug 1879855

Summary: [Kuryr] Install playbook fails waiting for CRD creation due to kuryr-controller crashloop
Product: OpenShift Container Platform Reporter: Jon Uriarte <juriarte>
Component: Networking Assignee: rdobosz
Networking sub component: kuryr QA Contact: Jon Uriarte <juriarte>
Status: CLOSED ERRATA Docs Contact:
Severity: urgent    
Priority: unspecified CC: bshirren, rdobosz
Version: 3.11.0   
Target Milestone: ---   
Target Release: 3.11.z   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2020-10-22 11:02:25 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Jon Uriarte 2020-09-17 07:47:09 UTC
Description of problem:

Cannot install openshift v3.11.286 (or v3.11.287): the kuryr-controller pod crashloops due to an incompatibility with the Octavia API.
The playbook fails in the task "Wait for the ServiceMonitor CRD to be created".

TASK [openshift_cluster_monitoring_operator : Wait for the ServiceMonitor CRD to be created] ***
FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (30 retries left).
...
FAILED - RETRYING: Wait for the ServiceMonitor CRD to be created (1 retries left).
fatal: [master-1.openshift.example.com]: FAILED! => {"attempts": 30, "changed": true, "cmd": ["oc", "get", "crd", "servicemonitors.monitoring.coreos.com", "-n", "openshift-monitoring", "--config=/tmp/openshift-cluster-monitoring-ansible-c8Hlpm/admin.kubeconfig"], "delta": "0:00:00.212646", "end": "2020-09-16 12:26:47.048829", "msg": "non-zero return code", "rc": 1, "start": "2020-09-16 12:26:46.836183", "stderr": "No resources found.\nError from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" not found", "stderr_lines": ["No resources found.", "Error from server (NotFound): customresourcedefinitions.apiextensions.k8s.io \"servicemonitors.monitoring.coreos.com\" not found"], "stdout": "", "stdout_lines": []}

Failure summary:


  1. Hosts:    master-1.openshift.example.com
     Play:     Configure Cluster Monitoring Operator
     Task:     Wait for the ServiceMonitor CRD to be created
     Message:  non-zero return code
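
The failing task is a bounded polling loop: it runs `oc get crd servicemonitors.monitoring.coreos.com` and retries until the command exits with 0 or the 30 attempts are used up. A minimal Python sketch of that pattern follows, for illustration only. The names `wait_for` and `crd_exists`, the delay value, and the injectable `check` callable are assumptions and are not taken from the openshift-ansible role.

```python
import subprocess
import time


def wait_for(check, retries=30, delay=0.0):
    """Call check() until it returns True or the retries run out.

    Returns the number of attempts used on success, or None on failure.
    """
    for attempt in range(1, retries + 1):
        if check():
            return attempt
        if attempt < retries:
            # Pause between attempts; the real task's delay is not shown
            # in this report, so this value is an assumption.
            time.sleep(delay)
    return None


def crd_exists(name="servicemonitors.monitoring.coreos.com"):
    """Return True if `oc get crd <name>` exits with status 0."""
    result = subprocess.run(
        ["oc", "get", "crd", name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Calling `wait_for(crd_exists, retries=30, delay=30)` on a master would mirror the task's behaviour. In the failure reported here every attempt returned rc 1 with a NotFound error, so a loop like this would return None.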


[openshift@master-0 ~]$ oc -n kuryr get pods
NAME                                READY     STATUS             RESTARTS   AGE
kuryr-cni-ds-25rcs                  2/2       Running            90         14h
kuryr-cni-ds-2v9c2                  2/2       Running            0          14h
kuryr-cni-ds-84kxq                  2/2       Running            0          14h
kuryr-cni-ds-8n7mb                  2/2       Running            28         14h
kuryr-cni-ds-8qcv5                  2/2       Running            0          14h
kuryr-cni-ds-9xds6                  2/2       Running            0          14h
kuryr-cni-ds-np5wr                  2/2       Running            0          14h
kuryr-cni-ds-x97jv                  2/2       Running            28         14h
kuryr-controller-6dd96d9587-jzp92   0/1       CrashLoopBackOff   169        14h
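
For triage, a listing like the one above can be filtered to show only pods that are not fully ready or not in the Running state. This is a rough sketch that assumes the default five-column `oc get pods` layout shown here; the helper name `unhealthy_pods` is hypothetical.

```python
def unhealthy_pods(listing):
    """Return (name, status, restarts) for pods that are not healthy.

    A pod is treated as unhealthy if its STATUS is not "Running" or
    its READY count (e.g. "0/1") is below the total.
    """
    pods = []
    # Skip the header row (NAME READY STATUS RESTARTS AGE).
    for line in listing.strip().splitlines()[1:]:
        fields = line.split()
        if len(fields) < 5:
            continue  # not a pod row
        name, ready, status = fields[0], fields[1], fields[2]
        restarts = int(fields[3])
        ready_now, ready_total = ready.split("/")
        if status != "Running" or ready_now != ready_total:
            pods.append((name, status, restarts))
    return pods
```

Applied to the listing above, only the kuryr-controller pod is reported. The kuryr-cni pods with 28 and 90 restarts are not flagged, because they are currently Running and 2/2 ready.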


kuryr-controller pod logs:
2020-09-17 06:11:25.793 1 INFO kuryr_kubernetes.config [-] /usr/bin/kuryr-k8s-controller version 0.0.0
2020-09-17 06:11:25.956 1 INFO os_vif [-] Loaded VIF plugins: noop, sriov, ovs, linux_bridge, noop
2020-09-17 06:11:25.959 1 INFO kuryr_kubernetes.controller.service [-] Configured handlers: ['vif', 'lb', 'lbaasspec', 'namespace']
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service [-] Exception when loading handlers lb = kuryr_kubernetes.controller.handlers.lbaas:LoadBalancerHandler.: AttributeError: 'Proxy' object has no attribute 'get_all_version_data'
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service Traceback (most recent call last):
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 195, in _load_plugins
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     verify_requirements,
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 158, in _load_one_plugin
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     verify_requirements,
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 227, in _load_one_plugin
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     obj = plugin(*invoke_args, **invoke_kwds)
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 162, in __init__
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     self._drv_lbaas = drv_base.LBaaSDriver.get_instance()
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/base.py", line 80, in get_instance
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     invoke_on_load=True)
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/stevedore/driver.py", line 61, in __init__
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     warn_on_missing_entrypoint=warn_on_missing_entrypoint
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 81, in __init__
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     verify_requirements)
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 203, in _load_plugins
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     self._on_load_failure_callback(self, ep, err)
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 195, in _load_plugins
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     verify_requirements,
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/stevedore/named.py", line 158, in _load_one_plugin
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     verify_requirements,
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/stevedore/extension.py", line 227, in _load_one_plugin
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     obj = plugin(*invoke_args, **invoke_kwds)
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 76, in __init__
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     v = self.get_octavia_version()
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service   File "/usr/lib/python2.7/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 101, in get_octavia_version
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service     regions = lbaas.get_all_version_data()
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service AttributeError: 'Proxy' object has no attribute 'get_all_version_data'
2020-09-17 06:11:25.977 1 ERROR kuryr_kubernetes.controller.service 
2020-09-17 06:11:25.979 1 CRITICAL kuryr_kubernetes.controller.service [-] Handlers entrypoint "lb = kuryr_kubernetes.controller.handlers.lbaas:LoadBalancerHandler" failed to load due to 'Proxy' object has no attribute 'get_all_version_data'.
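
The traceback ends in `lbaas.get_all_version_data()`, which raises AttributeError because the `Proxy` object in this image does not provide that method. The sketch below reproduces the shape of the failure with stand-in classes and shows a guarded lookup. `OldProxy`, `NewProxy`, `get_version_data`, and the return value are all hypothetical. This is not the kuryr-kubernetes code, and it is not the fix shipped in the advisory.

```python
class OldProxy(object):
    """Hypothetical stand-in for a client proxy that lacks the method."""


class NewProxy(object):
    """Hypothetical stand-in for a proxy that provides the method."""

    def get_all_version_data(self):
        # Made-up return value, only to keep the sketch runnable.
        return {"example-region": {}}


def get_version_data(proxy):
    """Return version data, or None if the proxy cannot provide it."""
    method = getattr(proxy, "get_all_version_data", None)
    if method is None:
        # Calling proxy.get_all_version_data() here would raise
        # AttributeError, which is the error seen in the log above.
        return None
    return method()
```

With `OldProxy()`, the unguarded call fails as it does in the log, while `get_version_data` returns None, so a caller can decide how to proceed without aborting handler loading.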


Version-Release number of selected component (if applicable):
openshift v3.11.286
OSP 13 2020-09-03.2
kuryr controller image: https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/ose-kuryr-controller/images/v3.11.286-1
kuryr cni image: https://access.redhat.com/containers/#/registry.access.redhat.com/openshift3/ose-kuryr-cni/images/v3.11.286-1

How reproducible: always


Steps to Reproduce:
1. Deploy OSP 13 with octavia
2. Install 3.11 with Kuryr

Actual results: Installer fails 


Expected results: Successful installation

Comment 2 Jon Uriarte 2020-09-22 06:18:31 UTC
Verified in:
OCP v3.11.292-1_2020-09-21.1
OSP 13 2020-09-16.1

Installation succeeded:

INSTALLER STATUS ***************************************************************
Initialization               : Complete (0:00:35)
Health Check                 : Complete (0:00:02)
Node Bootstrap Preparation   : Complete (0:06:14)
etcd Install                 : Complete (0:00:43)
Master Install               : Complete (0:06:37)
Master Additional Install    : Complete (0:01:54)
Node Join                    : Complete (0:00:46)
Hosted Install               : Complete (0:00:57)
Cluster Monitoring Operator  : Complete (0:03:48)
Web Console Install          : Complete (0:01:04)
Console Install              : Complete (0:00:42)

Comment 5 errata-xmlrpc 2020-10-22 11:02:25 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (OpenShift Container Platform 3.11.306 bug fix and enhancement update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:4170