Bug 1933880

Summary: Kuryr-Controller crashes when it's missing the status object
Product: OpenShift Container Platform Reporter: Brendan Shephard <bshephar>
Component: Networking Assignee: Michał Dulko <mdulko>
Networking sub component: kuryr QA Contact: rlobillo
Status: CLOSED ERRATA Docs Contact:
Severity: medium    
Priority: medium CC: juriarte, mdulko, openshift-bugzilla-robot, pmannidi
Version: 4.7 Keywords: Triaged
Target Milestone: ---   
Target Release: 4.8.0   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: Doc Type: No Doc Update
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2021-07-27 22:48:44 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:
Bug Depends On:    
Bug Blocks: 1949540, 1949541    
Attachments:
Description Flags
kuryr controller logs none

Description Brendan Shephard 2021-03-01 22:38:04 UTC
Description of problem:
In some situations, it is necessary to forcefully delete Octavia LBs from OpenStack. When that happens, we prompt Kuryr to recreate them by removing the information from the status object in the kuryrloadbalancer CRD, so that the status becomes an empty object ({}):

$ oc get kuryrloadbalancer -n openshift-monitoring grafana -o jsonpath='{.status}' | jq .
{}

If the user inadvertently deletes the status object itself, however, kuryr-controller raises a traceback and cannot recover until the status object is restored.
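
As a rough illustration of the difference between clearing the status and deleting it, the sketch below resets the status of a plain dict to {} while keeping the key, and builds a JSON Patch (RFC 6902) document that does the same. The helper names clear_status and clear_status_json_patch are hypothetical, the sample dict is abbreviated from the resource in this report, and whether such a patch is accepted for this CRD has not been verified here.

```python
import copy
import json


def clear_status(crd):
    """Return a copy of a KuryrLoadBalancer dict whose status is reset to {}."""
    cleared = copy.deepcopy(crd)
    cleared["status"] = {}  # keep the key, drop only its contents
    return cleared


def clear_status_json_patch():
    """Return an RFC 6902 JSON Patch that replaces .status with {}.

    Hypothetical helper: applying it to a live resource is an assumption.
    """
    return [{"op": "replace", "path": "/status", "value": {}}]


# Abbreviated, illustrative resource; only the fields needed for the demo.
crd = {
    "kind": "KuryrLoadBalancer",
    "spec": {"type": "ClusterIP"},
    "status": {"listeners": [{"name": "openshift-monitoring/grafana:TCP:3000"}]},
}

print(json.dumps(clear_status(crd)["status"]))
print(json.dumps(clear_status_json_patch()))
```

The point of both helpers is that the status key stays present; removing the key altogether is what triggers the failure described in this report.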

Version-Release number of selected component (if applicable):
bash-4.4$ rpm -qa | grep kuryr
python3-kuryr-lib-1.1.1-0.20190923160834.41e6964.el8ost.noarch
python3-kuryr-kubernetes-4.7.0-202101262230.p0.git.2494.cd95ce5.el8.noarch
openshift-kuryr-controller-4.7.0-202101262230.p0.git.2494.cd95ce5.el8.noarch
openshift-kuryr-common-4.7.0-202101262230.p0.git.2494.cd95ce5.el8.noarch

How reproducible:
100%

Steps to Reproduce:
1. Edit the kuryrloadbalancer CRD for one of the LBs:
oc edit kuryrloadbalancer -n openshift-monitoring grafana
Remove everything from the status: line down, including the status: line itself.

eg:
[...]
  - name: https
    port: 3000
    protocol: TCP
    targetPort: https
  project_id: e75466bcb2eb4cf590026be2d94d95ef
  provider: ovn
  security_groups_ids:
  - e9d30328-ea13-4434-9ed2-fe8f4ddb3173
  subnet_id: 0b048882-9b6c-4a5d-97eb-e613645c90fd
  type: ClusterIP
status:
  listeners:
  - id: ea42c50c-b86f-40d7-a98a-310b46f16b70
    loadbalancer_id: 88648171-6441-4e16-8bd8-7959b9a52fae
    name: openshift-monitoring/grafana:TCP:3000
[...]

To this:
[...]
  - name: https
    port: 3000
    protocol: TCP
    targetPort: https
  project_id: e75466bcb2eb4cf590026be2d94d95ef
  provider: ovn
  security_groups_ids:
  - e9d30328-ea13-4434-9ed2-fe8f4ddb3173
  subnet_id: 0b048882-9b6c-4a5d-97eb-e613645c90fd
  type: ClusterIP
[...]


2. Observe kuryr-controller starts failing with the following traceback:
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging [-] Failed to handle event {'type': 'MODIFIED', 'object': {'apiVersion': 'openstack.org/v1', 'kind': 'KuryrLoadBalancer', 'metadata': {'creationTimestamp': '2021-03-01T06:08:28Z', 'finalizers': ['kuryr.openstack.org/kuryrloadbalancer-finalizers'], 'generation': 34, 'managedFields': [{'apiVersion': 'openstack.org/v1', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:metadata': {'f:finalizers': {'.': {}, 'v:"kuryr.openstack.org/kuryrloadbalancer-finalizers"': {}}}, 'f:spec': {'.': {}, 'f:endpointSlices': {}, 'f:ip': {}, 'f:ports': {}, 'f:project_id': {}, 'f:provider': {}, 'f:security_groups_ids': {}, 'f:subnet_id': {}, 'f:type': {}}}, 'manager': 'python-requests', 'operation': 'Update', 'time': '2021-03-01T22:30:36Z'}], 'name': 'grafana', 'namespace': 'openshift-monitoring', 'resourceVersion': '2140553', 'selfLink': '/apis/openstack.org/v1/namespaces/openshift-monitoring/kuryrloadbalancers/grafana', 'uid': '1e8a70c2-350d-418c-b876-152cbb7d2f4b'}, 'spec': {'endpointSlices': [{'endpoints': [{'addresses': ['10.128.57.183'], 'conditions': {'ready': True}, 'targetRef': {'kind': 'Pod', 'name': 'grafana-6f4d96d7fd-vm8sv', 'namespace': 'openshift-monitoring', 'resourceVersion': '63165', 'uid': '04630764-2c7e-4e86-a4e8-f986f26931cd'}}], 'ports': [{'name': 'https', 'port': 3000, 'protocol': 'TCP'}]}], 'ip': '172.30.88.169', 'ports': [{'name': 'https', 'port': 3000, 'protocol': 'TCP', 'targetPort': 'https'}], 'project_id': 'e75466bcb2eb4cf590026be2d94d95ef', 'provider': 'ovn', 'security_groups_ids': ['e9d30328-ea13-4434-9ed2-fe8f4ddb3173'], 'subnet_id': '0b048882-9b6c-4a5d-97eb-e613645c90fd', 'type': 'ClusterIP'}}}: KeyError: 'status'
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging Traceback (most recent call last):
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/logging.py", line 37, in __call__
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event, *args, **kwargs)
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 80, in __call__
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event, *args, **kwargs)
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 84, in __call__
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging     self.on_present(obj)
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/loadbalancer.py", line 65, in on_present
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging     crd_lb = loadbalancer_crd['status'].get('loadbalancer')
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging KeyError: 'status'
2021-03-01 22:35:00.876 1 ERROR kuryr_kubernetes.handlers.logging
2021-03-01 22:35:01.243 1 ERROR kuryr_kubernetes.controller.managers.health [-] Component KuryrLoadBalancerHandler is dead.
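
The traceback points at loadbalancer_crd['status'].get('loadbalancer'), which raises KeyError when the status key is absent. The sketch below reproduces that failure on a plain dict and contrasts it with a tolerant lookup. The function names are hypothetical and this is not necessarily the fix that shipped; it only illustrates the failure mode.

```python
def strict_lookup(loadbalancer_crd):
    # Same access pattern as the line shown in the traceback.
    return loadbalancer_crd['status'].get('loadbalancer')


def tolerant_lookup(loadbalancer_crd):
    # Assumption about desired behavior: treat a missing or null status
    # as an empty mapping instead of failing.
    status = loadbalancer_crd.get('status') or {}
    return status.get('loadbalancer')


crd_without_status = {'kind': 'KuryrLoadBalancer', 'spec': {'type': 'ClusterIP'}}
crd_with_empty_status = {'kind': 'KuryrLoadBalancer', 'spec': {}, 'status': {}}

try:
    strict_lookup(crd_without_status)
    raised = False
except KeyError:
    raised = True

print(raised)
print(tolerant_lookup(crd_without_status))
print(strict_lookup(crd_with_empty_status))
```

With an empty status object both lookups return None, which matches the workaround of leaving status as {} rather than removing it.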


Actual results:
kuryr-controller crashes without the status object

Expected results:
If the status object is required, it shouldn't be something that can be removed.

Additional info:
I only tested this on OCP 4.7, but I suspect it would be the same on 4.6.

Comment 1 Michał Dulko 2021-03-03 11:42:44 UTC
Putting this on medium sev/prio as we have an easy workaround - just make sure to put {} as status if you want to clear it.

Comment 2 Michał Dulko 2021-04-14 14:00:26 UTC
*** Bug 1949540 has been marked as a duplicate of this bug. ***

Comment 5 rlobillo 2021-04-22 13:32:30 UTC
Failed on OCP4.8.0-0.nightly-2021-04-17-044339 over OSP16.1 (RHOS-16.1-RHEL-8-20210323.n.0) with OVN-Octavia

After performing the procedure to replace an LB, kuryr-controller restarted.


Given this klb:

apiVersion: openstack.org/v1
kind: KuryrLoadBalancer
metadata:
  creationTimestamp: "2021-04-17T09:56:51Z"
  finalizers:
  - kuryr.openstack.org/kuryrloadbalancer-finalizers
  generation: 890
  name: demo
  namespace: demo
  resourceVersion: "4498869"
  uid: 3da6ac76-469d-4e9f-a8f8-dbec06f6f516
spec:
  endpointSlices:
  - endpoints:
    - addresses:
      - 10.128.124.60
      conditions:
        ready: true
      targetRef:
        kind: Pod
        name: demo-7897db69cc-jwc42
        namespace: demo
        resourceVersion: "1798942"
        uid: 3031075f-8ace-4410-90a0-de46baff5383
    - addresses:
      - 10.128.124.75
      conditions:
        ready: true
      targetRef:
        kind: Pod
        name: demo-7897db69cc-84k96
        namespace: demo
        resourceVersion: "1799470"
        uid: 0a63bd82-9aa1-4188-b86f-2f09909a61df
    - addresses:
      - 10.128.125.112
      conditions:
        ready: true
      targetRef:
        kind: Pod
        name: demo-7897db69cc-gckm8
        namespace: demo
        resourceVersion: "1799771"
        uid: 782b581a-ef0b-4af4-87f1-3cd1a991c304
    ports:
    - port: 8080
      protocol: TCP
  ip: 172.30.176.102
  ports:
  - port: 80
    protocol: TCP
    targetPort: "8080"
  project_id: b3a48f657fc144e18838c3dc5db2fac6
  provider: ovn
  security_groups_ids:
  - 3121fad5-3d32-4c8b-a205-f8f6cbe316e4
  subnet_id: 49fbfb3b-f432-4f80-8286-cab626764e85
  timeout_client_data: 0
  timeout_member_data: 0
  type: ClusterIP
status:
  listeners:
  - id: fb2df0af-bdbc-48b1-a6a1-b24e64a6c5fa
    loadbalancer_id: 0d194642-1189-4a81-bad4-503c705aaee7
    name: demo/demo:TCP:80
    port: 80
    project_id: b3a48f657fc144e18838c3dc5db2fac6
    protocol: TCP
    timeout_client_data: 50000
    timeout_member_data: 50000
  loadbalancer:
    id: 0d194642-1189-4a81-bad4-503c705aaee7
    ip: 172.30.176.102
    name: demo/demo
    port_id: 6b7e2e09-6589-42fe-a966-9f6d86f5e70f
    project_id: b3a48f657fc144e18838c3dc5db2fac6
    provider: ovn
    security_groups:
    - 3121fad5-3d32-4c8b-a205-f8f6cbe316e4
    subnet_id: 49fbfb3b-f432-4f80-8286-cab626764e85
  members:
  - id: 9819b690-de3a-4f83-91cb-b8015113063e
    ip: 10.128.124.60
    name: demo/demo-7897db69cc-jwc42:8080
    pool_id: abed4e5a-1a48-4b0f-b244-b86344702957
    port: 8080
    project_id: b3a48f657fc144e18838c3dc5db2fac6
    subnet_id: 537193b4-6ca3-47ed-a759-653f26de318f
  - id: ad17c769-d857-41d2-860e-020c623a88e5
    ip: 10.128.124.75
    name: demo/demo-7897db69cc-84k96:8080
    pool_id: abed4e5a-1a48-4b0f-b244-b86344702957
    port: 8080
    project_id: b3a48f657fc144e18838c3dc5db2fac6
    subnet_id: 537193b4-6ca3-47ed-a759-653f26de318f
  - id: 62947cbc-dab9-4bf1-a347-1e18ad25a84b
    ip: 10.128.125.112
    name: demo/demo-7897db69cc-gckm8:8080
    pool_id: abed4e5a-1a48-4b0f-b244-b86344702957
    port: 8080
    project_id: b3a48f657fc144e18838c3dc5db2fac6
    subnet_id: 537193b4-6ca3-47ed-a759-653f26de318f
  pools:
  - id: abed4e5a-1a48-4b0f-b244-b86344702957
    listener_id: fb2df0af-bdbc-48b1-a6a1-b24e64a6c5fa
    loadbalancer_id: 0d194642-1189-4a81-bad4-503c705aaee7
    name: demo/demo:TCP:80
    project_id: b3a48f657fc144e18838c3dc5db2fac6
    protocol: TCP

The following steps were performed:

openstack loadbalancer delete demo/demo --cascade
oc edit -n demo klb/demo
# Remove everything from the status: line to the end of the file.

After some minutes, kuryr-controller crashes:

2021-04-22 12:28:41.915 1 ERROR kuryr_kubernetes.controller.handlers.loadbalancer [-] Error updating KuryrLoadbalancer CRD {'apiVersion': 'openstack.org/v1', 'kind': 'KuryrLoadBalancer', 'metadata': {'creationTimestamp': '2021-04-17T09:56:51Z', 'finalizers': ['kuryr.openstack.org/kuryrloadbalancer-finalizers'], 'generation': 899, 'managedFields': [{'apiVersion': 'openstack.org/v1', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:metadata': {'f:finalizers': {'.': {}, 'v:"kuryr.openstack.org/kuryrloadbalancer-finalizers"': {}}}, 'f:spec': {'.': {}, 'f:endpointSlices': {}, 'f:ip': {}, 'f:ports': {}, 'f:project_id': {}, 'f:provider': {}, 'f:security_groups_ids': {}, 'f:subnet_id': {}, 'f:timeout_client_data': {}, 'f:timeout_member_data': {}, 'f:type': {}}, 'f:status': {}}, 'manager': 'python-requests', 'operation': 'Update', 'time': '2021-04-22T12:15:44Z'}], 'name': 'demo', 'namespace': 'demo', 'resourceVersion': '4499113', 'uid': '3da6ac76-469d-4e9f-a8f8-dbec06f6f516'}, 'spec': {'endpointSlices': [{'endpoints': [{'addresses': ['10.128.124.60'], 'conditions': {'ready': True}, 'targetRef': {'kind': 'Pod', 'name': 'demo-7897db69cc-jwc42', 'namespace': 'demo', 'resourceVersion': '1798942', 'uid': '3031075f-8ace-4410-90a0-de46baff5383'}}, {'addresses': ['10.128.124.75'], 'conditions': {'ready': True}, 'targetRef': {'kind': 'Pod', 'name': 'demo-7897db69cc-84k96', 'namespace': 'demo', 'resourceVersion': '1799470', 'uid': '0a63bd82-9aa1-4188-b86f-2f09909a61df'}}, {'addresses': ['10.128.125.112'], 'conditions': {'ready': True}, 'targetRef': {'kind': 'Pod', 'name': 'demo-7897db69cc-gckm8', 'namespace': 'demo', 'resourceVersion': '1799771', 'uid': '782b581a-ef0b-4af4-87f1-3cd1a991c304'}}], 'ports': [{'port': 8080, 'protocol': 'TCP'}]}], 'ip': '172.30.176.102', 'ports': [{'port': 80, 'protocol': 'TCP', 'targetPort': '8080'}], 'project_id': 'b3a48f657fc144e18838c3dc5db2fac6', 'provider': 'ovn', 'security_groups_ids': ['3121fad5-3d32-4c8b-a205-f8f6cbe316e4'], 'subnet_id': 
'49fbfb3b-f432-4f80-8286-cab626764e85', 'timeout_client_data': 0, 'timeout_member_data': 0, 'type': 'ClusterIP'}, 'status': {'loadbalancer': {'name': 'demo/demo', 'project_id': 'b3a48f657fc144e18838c3dc5db2fac6', 'subnet_id': '49fbfb3b-f432-4f80-8286-cab626764e85', 'ip': '172.30.176.102', 'security_groups': [], 'provider': 'ovn', 'id': 'f09c3033-4d1b-496c-b557-8bd4d08fb1e9', 'port_id': '6c04636a-b7db-4248-a62a-32e6500d248d'}, 'listeners': [{'name': 'demo/demo:TCP:80', 'project_id': 'b3a48f657fc144e18838c3dc5db2fac6', 'loadbalancer_id': 'f09c3033-4d1b-496c-b557-8bd4d08fb1e9', 'protocol': 'TCP', 'port': 80, 'id': '3fab2f4a-1fc5-444e-ab01-d152ecafd3e4'}]}}: kuryr_kubernetes.exceptions.K8sUnprocessableEntity: Unprocessable: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"the server rejected our request due to an error in our request","reason":"Invalid","details":{},"code":422}\n'
2021-04-22 12:28:41.915 1 ERROR kuryr_kubernetes.controller.handlers.loadbalancer Traceback (most recent call last):
2021-04-22 12:28:41.915 1 ERROR kuryr_kubernetes.controller.handlers.loadbalancer   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/loadbalancer.py", line 659, in _add_new_listeners
2021-04-22 12:28:41.915 1 ERROR kuryr_kubernetes.controller.handlers.loadbalancer     loadbalancer_crd['status'])
2021-04-22 12:28:41.915 1 ERROR kuryr_kubernetes.controller.handlers.loadbalancer   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 195, in patch_crd
2021-04-22 12:28:41.915 1 ERROR kuryr_kubernetes.controller.handlers.loadbalancer     self._raise_from_response(response)
2021-04-22 12:28:41.915 1 ERROR kuryr_kubernetes.controller.handlers.loadbalancer   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 110, in _raise_from_response
2021-04-22 12:28:41.915 1 ERROR kuryr_kubernetes.controller.handlers.loadbalancer     raise exc.K8sUnprocessableEntity(response.text)
2021-04-22 12:28:41.915 1 ERROR kuryr_kubernetes.controller.handlers.loadbalancer kuryr_kubernetes.exceptions.K8sUnprocessableEntity: Unprocessable: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"the server rejected our request due to an error in our request","reason":"Invalid","details":{},"code":422}\n'
2021-04-22 12:28:41.915 1 ERROR kuryr_kubernetes.controller.handlers.loadbalancer
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry [-] Report handler unhealthy KuryrLoadBalancerHandler: kuryr_kubernetes.exceptions.K8sUnprocessableEntity: Unprocessable: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"the server rejected our request due to an error in our request","reason":"Invalid","details":{},"code":422}\n'
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry Traceback (most recent call last):
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 80, in __call__
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry     self._handler(event, *args, **kwargs)
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 84, in __call__
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry     self.on_present(obj)
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/loadbalancer.py", line 93, in on_present
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry     if self._sync_lbaas_members(loadbalancer_crd):
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/loadbalancer.py", line 189, in _sync_lbaas_members
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry     if self._sync_lbaas_pools(loadbalancer_crd):
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/loadbalancer.py", line 490, in _sync_lbaas_pools
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry     if self._sync_lbaas_listeners(loadbalancer_crd):
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/loadbalancer.py", line 590, in _sync_lbaas_listeners
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry     if self._add_new_listeners(loadbalancer_crd):
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/loadbalancer.py", line 659, in _add_new_listeners
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry     loadbalancer_crd['status'])
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 195, in patch_crd
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry     self._raise_from_response(response)
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/k8s_client.py", line 110, in _raise_from_response
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry     raise exc.K8sUnprocessableEntity(response.text)
2021-04-22 12:28:41.922 1 ERROR kuryr_kubernetes.handlers.retry kuryr_kubernetes.exceptions.K8sUnprocessableEntity: Unprocessable: '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"the server rejected our request due to an error in our request","reason":"Invalid","details":{},"code":422}\n'


After the kuryr-controller restart, the LB is successfully recreated on OSP and the klb status is populated.

Attaching Kuryr Controller logs.

Comment 6 rlobillo 2021-04-22 13:34:46 UTC
Created attachment 1774485
kuryr controller logs

Comment 8 rlobillo 2021-06-02 08:54:32 UTC
Verified on OCP4.8.0-0.nightly-2021-05-29-114625 over OSP16.1 (RHOS-16.1-RHEL-8-20210323.n.0) with OVN-Octavia enabled.

The load balancer replacement procedure worked fine.

Given the project below:

$ oc get all -n demo
NAME                        READY   STATUS    RESTARTS   AGE
pod/demo-7897db69cc-c2nrz   1/1     Running   0          43h
pod/demo-7897db69cc-m8hsd   1/1     Running   0          43h
pod/demo-7897db69cc-n8zcw   1/1     Running   0          43h

NAME           TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/demo   ClusterIP   172.30.64.198   <none>        80/TCP    43h

NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/demo   3/3     3            3           43h

NAME                              DESIRED   CURRENT   READY   AGE
replicaset.apps/demo-7897db69cc   3         3         3       43h
(shiftstack) [stack@undercloud-0 ~]$ oc rsh -n demo pod/demo-7897db69cc-c2nrz curl 172.30.64.198
demo-7897db69cc-m8hsd: HELLO! I AM ALIVE!!!

Destroy the loadbalancer and remove the status section from the klb resource:

$ openstack loadbalancer delete demo/demo --cascade
$ oc edit -n demo klb/demo
kuryrloadbalancer.openstack.org/demo edited
$ oc rsh -n demo pod/demo-7897db69cc-c2nrz curl 172.30.64.198
^Ccommand terminated with exit code 130

This triggers the replacement of the load balancer after a few minutes:

$ oc rsh -n demo pod/demo-7897db69cc-c2nrz curl 172.30.64.198
demo-7897db69cc-n8zcw: HELLO! I AM ALIVE!!!

During this process, kuryr-controller remains stable:

$ oc get pods -n openshift-kuryr
NAME                                READY   STATUS    RESTARTS   AGE
kuryr-cni-2fbw7                     1/1     Running   0          44h
kuryr-cni-dtsqx                     1/1     Running   0          44h
kuryr-cni-ngnsw                     1/1     Running   0          45h
kuryr-cni-qmw74                     1/1     Running   0          45h
kuryr-cni-v9sbw                     1/1     Running   0          44h
kuryr-cni-xr7k5                     1/1     Running   0          45h
kuryr-controller-7f67c7ffd9-mhrqd   1/1     Running   0          72m

and the status section is updated on the klb resource:

$ oc get klb -n demo demo -o json | jq .status
{
  "listeners": [
    {
      "id": "f254bafb-d452-4d1b-b4b0-8cc12f8f7390",
      "loadbalancer_id": "5de8028b-04a6-4415-885f-3ec3097986a8",
      "name": "demo/demo:TCP:80",
      "port": 80,
      "project_id": "c1ac743dc7274e31b3f9fb7c6fa0b4b4",
      "protocol": "TCP"
    }
  ],
  "loadbalancer": {
    "id": "5de8028b-04a6-4415-885f-3ec3097986a8",
    "ip": "172.30.64.198",
    "name": "demo/demo",
    "port_id": "675e50fa-c31d-4c3c-b52a-00bf2f06aa7c",
    "project_id": "c1ac743dc7274e31b3f9fb7c6fa0b4b4",
    "provider": "ovn",
    "security_groups": [
      "9d27b2e0-ea78-4b26-b853-4310ed166751"
    ],
    "subnet_id": "a69803aa-fdb8-4c34-b8c3-ee149e508d9f"
  },
  "members": [
    {
      "id": "4eea3096-5ac9-415b-986a-504b97e00678",
      "ip": "10.128.124.232",
      "name": "demo/demo-7897db69cc-m8hsd:8080",
      "pool_id": "7f38f694-8d66-487c-a101-27465c1a315a",
      "port": 8080,
      "project_id": "c1ac743dc7274e31b3f9fb7c6fa0b4b4",
      "subnet_id": "1bd0f571-eb74-4108-b00f-248f216c1604"
    },
    {
      "id": "a906cbb8-ff29-4635-b5e8-18d95c3437a8",
      "ip": "10.128.125.251",
      "name": "demo/demo-7897db69cc-n8zcw:8080",
      "pool_id": "7f38f694-8d66-487c-a101-27465c1a315a",
      "port": 8080,
      "project_id": "c1ac743dc7274e31b3f9fb7c6fa0b4b4",
      "subnet_id": "1bd0f571-eb74-4108-b00f-248f216c1604"
    },
    {
      "id": "88b8c178-25f5-4674-af7f-9353701c7b08",
      "ip": "10.128.125.76",
      "name": "demo/demo-7897db69cc-c2nrz:8080",
      "pool_id": "7f38f694-8d66-487c-a101-27465c1a315a",
      "port": 8080,
      "project_id": "c1ac743dc7274e31b3f9fb7c6fa0b4b4",
      "subnet_id": "1bd0f571-eb74-4108-b00f-248f216c1604"
    }
  ],
  "pools": [
    {
      "id": "7f38f694-8d66-487c-a101-27465c1a315a",
      "listener_id": "f254bafb-d452-4d1b-b4b0-8cc12f8f7390",
      "loadbalancer_id": "5de8028b-04a6-4415-885f-3ec3097986a8",
      "name": "demo/demo:TCP:80",
      "project_id": "c1ac743dc7274e31b3f9fb7c6fa0b4b4",
      "protocol": "TCP"
    }
  ]
}
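
A quick consistency check on the regenerated status can be sketched as follows: every listener and pool should reference the load balancer's id, every pool should reference a known listener, and every member should reference a known pool. The function name status_inconsistencies is hypothetical, the field names are taken from the output above, and the sample is an abbreviated copy of that output; this is a sanity check under those assumptions, not part of Kuryr.

```python
def status_inconsistencies(status):
    """Return a list of problems found in a KuryrLoadBalancer .status dict."""
    loadbalancer = status.get("loadbalancer") or {}
    lb_id = loadbalancer.get("id")
    if lb_id is None:
        return ["status has no loadbalancer id"]

    problems = []
    listener_ids = set()
    for listener in status.get("listeners", []):
        listener_ids.add(listener["id"])
        if listener["loadbalancer_id"] != lb_id:
            problems.append("listener %s references another load balancer" % listener["id"])

    pool_ids = set()
    for pool in status.get("pools", []):
        pool_ids.add(pool["id"])
        if pool["loadbalancer_id"] != lb_id:
            problems.append("pool %s references another load balancer" % pool["id"])
        if pool["listener_id"] not in listener_ids:
            problems.append("pool %s references an unknown listener" % pool["id"])

    for member in status.get("members", []):
        if member["pool_id"] not in pool_ids:
            problems.append("member %s references an unknown pool" % member["id"])
    return problems


# Abbreviated copy of the status shown above (one member kept for brevity).
sample_status = {
    "loadbalancer": {"id": "5de8028b-04a6-4415-885f-3ec3097986a8"},
    "listeners": [{
        "id": "f254bafb-d452-4d1b-b4b0-8cc12f8f7390",
        "loadbalancer_id": "5de8028b-04a6-4415-885f-3ec3097986a8",
    }],
    "pools": [{
        "id": "7f38f694-8d66-487c-a101-27465c1a315a",
        "listener_id": "f254bafb-d452-4d1b-b4b0-8cc12f8f7390",
        "loadbalancer_id": "5de8028b-04a6-4415-885f-3ec3097986a8",
    }],
    "members": [{
        "id": "4eea3096-5ac9-415b-986a-504b97e00678",
        "pool_id": "7f38f694-8d66-487c-a101-27465c1a315a",
    }],
}

print(status_inconsistencies(sample_status))
```

An empty list means the references line up. The full status could be fed in by loading the JSON printed by the oc get command above with json.load.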

Comment 11 errata-xmlrpc 2021-07-27 22:48:44 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Moderate: OpenShift Container Platform 4.8.2 bug fix and security update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHSA-2021:2438