Bug 1949540 - Kuryr-Controller crashes when it's missing the status object
Summary: Kuryr-Controller crashes when it's missing the status object
Keywords:
Status: CLOSED DUPLICATE of bug 1933880
Alias: None
Product: OpenShift Container Platform
Classification: Red Hat
Component: Networking
Version: 4.7
Hardware: Unspecified
OS: Unspecified
Priority: medium
Severity: medium
Target Milestone: ---
Target Release: 4.8.0
Assignee: sscavnic
QA Contact: GenadiC
URL:
Whiteboard:
Depends On: 1933880 1949541 1968418
Blocks:
Reported: 2021-04-14 13:58 UTC by OpenShift BugZilla Robot
Modified: 2021-06-07 11:26 UTC (History)

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2021-04-14 14:00:26 UTC
Target Upstream Version:
Embargoed:



Description OpenShift BugZilla Robot 2021-04-14 13:58:50 UTC
+++ This bug was initially created as a clone of Bug #1933880 +++

Description of problem:
In some situations, it is necessary to forcefully delete Octavia LBs from OpenStack. In that case, the way we prompt Kuryr to recreate them is by removing the information from the status object in the KuryrLoadBalancer resource.

The status needs to be changed to {}:
$ oc get kuryrloadbalancer -n openshift-monitoring grafana -o jsonpath='{.status}' | jq .
{}

If the user inadvertently deletes the status object itself, though, kuryr-controller fails with a traceback and cannot recover until the status object is restored.

Version-Release number of selected component (if applicable):
bash-4.4$ rpm -qa | grep kuryr
python3-kuryr-lib-1.1.1-0.20190923160834.41e6964.el8ost.noarch
python3-kuryr-kubernetes-4.7.0-202101262230.p0.git.2494.cd95ce5.el8.noarch
openshift-kuryr-controller-4.7.0-202101262230.p0.git.2494.cd95ce5.el8.noarch
openshift-kuryr-common-4.7.0-202101262230.p0.git.2494.cd95ce5.el8.noarch

How reproducible:
100%

Steps to Reproduce:
1. Edit the KuryrLoadBalancer resource for one of the LBs:
oc edit kuryrloadbalancer -n openshift-monitoring grafana
Remove everything from the status: line down, including the status: line itself.

E.g., change this:
[...]
  - name: https
    port: 3000
    protocol: TCP
    targetPort: https
  project_id: e75466bcb2eb4cf590026be2d94d95ef
  provider: ovn
  security_groups_ids:
  - e9d30328-ea13-4434-9ed2-fe8f4ddb3173
  subnet_id: 0b048882-9b6c-4a5d-97eb-e613645c90fd
  type: ClusterIP
status:
  listeners:
  - id: ea42c50c-b86f-40d7-a98a-310b46f16b70
    loadbalancer_id: 88648171-6441-4e16-8bd8-7959b9a52fae
    name: openshift-monitoring/grafana:TCP:3000
[...]

To this:
[...]
  - name: https
    port: 3000
    protocol: TCP
    targetPort: https
  project_id: e75466bcb2eb4cf590026be2d94d95ef
  provider: ovn
  security_groups_ids:
  - e9d30328-ea13-4434-9ed2-fe8f4ddb3173
  subnet_id: 0b048882-9b6c-4a5d-97eb-e613645c90fd
  type: ClusterIP
[...]


2. Observe that kuryr-controller starts failing with the following traceback:
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging [-] Failed to handle event {'type': 'MODIFIED', 'object': {'apiVersion': 'openstack.org/v1', 'kind': 'KuryrLoadBalancer', 'metadata': {'creationTimestamp': '2021-03-01T06:08:28Z', 'finalizers': ['kuryr.openstack.org/kuryrloadbalancer-finalizers'], 'generation': 34, 'managedFields': [{'apiVersion': 'openstack.org/v1', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:metadata': {'f:finalizers': {'.': {}, 'v:"kuryr.openstack.org/kuryrloadbalancer-finalizers"': {}}}, 'f:spec': {'.': {}, 'f:endpointSlices': {}, 'f:ip': {}, 'f:ports': {}, 'f:project_id': {}, 'f:provider': {}, 'f:security_groups_ids': {}, 'f:subnet_id': {}, 'f:type': {}}}, 'manager': 'python-requests', 'operation': 'Update', 'time': '2021-03-01T22:30:36Z'}], 'name': 'grafana', 'namespace': 'openshift-monitoring', 'resourceVersion': '2140553', 'selfLink': '/apis/openstack.org/v1/namespaces/openshift-monitoring/kuryrloadbalancers/grafana', 'uid': '1e8a70c2-350d-418c-b876-152cbb7d2f4b'}, 'spec': {'endpointSlices': [{'endpoints': [{'addresses': ['10.128.57.183'], 'conditions': {'ready': True}, 'targetRef': {'kind': 'Pod', 'name': 'grafana-6f4d96d7fd-vm8sv', 'namespace': 'openshift-monitoring', 'resourceVersion': '63165', 'uid': '04630764-2c7e-4e86-a4e8-f986f26931cd'}}], 'ports': [{'name': 'https', 'port': 3000, 'protocol': 'TCP'}]}], 'ip': '172.30.88.169', 'ports': [{'name': 'https', 'port': 3000, 'protocol': 'TCP', 'targetPort': 'https'}], 'project_id': 'e75466bcb2eb4cf590026be2d94d95ef', 'provider': 'ovn', 'security_groups_ids': ['e9d30328-ea13-4434-9ed2-fe8f4ddb3173'], 'subnet_id': '0b048882-9b6c-4a5d-97eb-e613645c90fd', 'type': 'ClusterIP'}}}: KeyError: 'status'
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging Traceback (most recent call last):
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/logging.py", line 37, in __call__
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event, *args, **kwargs)
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 80, in __call__
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event, *args, **kwargs)
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 84, in __call__
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging     self.on_present(obj)
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/loadbalancer.py", line 65, in on_present
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging     crd_lb = loadbalancer_crd['status'].get('loadbalancer')
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging KeyError: 'status'
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging
2021-03-01 22:35:01.243 1 ERROR kuryr_kubernetes.controller.managers.health [-] Component KuryrLoadBalancerHandler is dead.
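
The KeyError in the traceback above comes from indexing loadbalancer_crd['status'] directly. A minimal sketch of a more tolerant lookup follows. It assumes the object is a plain dict, as the logged event suggests; the helper name get_crd_loadbalancer is hypothetical and this is not necessarily how the actual fix was implemented:

```python
def get_crd_loadbalancer(loadbalancer_crd):
    """Return the 'loadbalancer' entry of the status, or None if absent.

    Hypothetical helper for illustration only; not taken from kuryr-kubernetes.
    """
    # `or {}` covers both a missing 'status' key and a 'status' set to null.
    status = loadbalancer_crd.get('status') or {}
    return status.get('loadbalancer')


# Sample objects modelled on the report: no status, empty status, populated status.
crd_without_status = {'spec': {'type': 'ClusterIP'}}
crd_empty_status = {'spec': {'type': 'ClusterIP'}, 'status': {}}
crd_with_lb = {
    'spec': {'type': 'ClusterIP'},
    'status': {'loadbalancer': {'id': '88648171-6441-4e16-8bd8-7959b9a52fae'}},
}

result_without_status = get_crd_loadbalancer(crd_without_status)
result_empty_status = get_crd_loadbalancer(crd_empty_status)
result_with_lb = get_crd_loadbalancer(crd_with_lb)
```

With this kind of lookup, a missing status is treated the same way as an empty one, so the handler could proceed as if the LB still had to be created instead of raising.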


Actual results:
kuryr-controller crashes without the status object

Expected results:
If the status object is required, it shouldn't be something that can be removed.

Additional info:
I only tested this on OCP 4.7, but I suspect it would be the same on 4.6.

--- Additional comment from mdulko on 2021-03-03 11:42:44 UTC ---

Putting this on medium sev/prio as we have an easy workaround - just make sure to put {} as status if you want to clear it.
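
To illustrate the difference between the workaround and the failing case, here is a small sketch on a plain dict standing in for the KuryrLoadBalancer object. The function name clear_status is hypothetical and this does not talk to the cluster; it only shows that the status key is kept and set to {} rather than deleted:

```python
import copy


def clear_status(loadbalancer_crd):
    """Return a copy of the object whose status is reset to an empty dict.

    Hypothetical helper for illustration; the key is kept, only its
    contents are dropped.
    """
    cleared = copy.deepcopy(loadbalancer_crd)
    cleared['status'] = {}
    return cleared


crd = {
    'spec': {'provider': 'ovn', 'type': 'ClusterIP'},
    'status': {
        'listeners': [{'id': 'ea42c50c-b86f-40d7-a98a-310b46f16b70'}],
    },
}

cleared_crd = clear_status(crd)

# The lookup that failed in the traceback works on the cleared object,
# because 'status' is still present.
crd_lb = cleared_crd['status'].get('loadbalancer')
```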

Comment 2 Michał Dulko 2021-04-14 14:00:26 UTC

*** This bug has been marked as a duplicate of bug 1933880 ***

