Description of problem:

When a load balancer resource (in this case a pool) transitions to ERROR status, Kuryr does not attempt to recreate it. Creation of any resource that depends on the one in ERROR consequently fails, causing the controller to restart continuously.

[stack@undercloud-0 ~]$ openstack loadbalancer pool show caea3e31-db8d-4d1a-afd3-0dc3fa02fd4d
+----------------------+-----------------------------------------------------+
| Field                | Value                                               |
+----------------------+-----------------------------------------------------+
| admin_state_up       | True                                                |
| created_at           | 2020-05-16T23:37:16                                 |
| description          |                                                     |
| healthmonitor_id     |                                                     |
| id                   | caea3e31-db8d-4d1a-afd3-0dc3fa02fd4d                |
| lb_algorithm         | ROUND_ROBIN                                         |
| listeners            | 39933e05-d9c6-4ed1-9a04-e4994e3ea4d3                |
| loadbalancers        | 6218ba3d-0290-4c25-a5e1-502f374f8b6c                |
| members              |                                                     |
| name                 | openshift-machine-api/machine-api-operator:TCP:8443 |
| operating_status     | ONLINE                                              |
| project_id           | f6b6420743ce45ac868c102a523ffde6                    |
| protocol             | TCP                                                 |
| provisioning_status  | ERROR                                               |
| session_persistence  | None                                                |
| updated_at           | 2020-05-16T23:37:20                                 |
| tls_container_ref    | None                                                |
| ca_tls_container_ref | None                                                |
| crl_container_ref    | None                                                |
| tls_enabled          | False                                               |
+----------------------+-----------------------------------------------------+

2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging [-] Failed to handle event {'type': 'ADDED', 'object': {'kind': 'Endpoints', 'apiVersion': 'v1', 'metadata': {'name': 'machine-api-operator', 'namespace': 'openshift-machine-api', 'selfLink': '/api/v1/namespaces/openshift-machine-api/endpoints/machine-api-operator', 'uid': 'c87b2853-180c-40ef-8086-1946c099476b', 'resourceVersion': '25606', 'creationTimestamp': '2020-05-16T23:23:16Z', 'labels': {'k8s-app': 'machine-api-operator'}, 'annotations': {'endpoints.kubernetes.io/last-change-trigger-time': '2020-05-16T23:35:05Z', 'openstack.org/kuryr-lbaas-spec': '{"versioned_object.data": {"ip": "172.30.53.190", "lb_ip": null, "ports": [{"versioned_object.data": {"name": "https", "port": 8443, "protocol": "TCP", "targetPort": "https"}, "versioned_object.name": "LBaaSPortSpec", "versioned_object.namespace": "kuryr_kubernetes", "versioned_object.version": "1.1"}], "project_id": "f6b6420743ce45ac868c102a523ffde6", "security_groups_ids": [], "subnet_id": "0e811b62-6954-4459-9d02-a232e06c040c", "type": "ClusterIP"}, "versioned_object.name": "LBaaSServiceSpec", "versioned_object.namespace": "kuryr_kubernetes", "versioned_object.version": "1.0"}'}, 'managedFields': [{'manager': 'python-requests', 'operation': 'Update', 'apiVersion': 'v1', 'time': '2020-05-16T23:26:23Z', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:metadata': {'f:annotations': {'f:openstack.org/kuryr-lbaas-spec': {}}}}}, {'manager': 'kube-controller-manager', 'operation': 'Update', 'apiVersion': 'v1', 'time': '2020-05-16T23:35:05Z', 'fieldsType': 'FieldsV1', 'fieldsV1': {'f:metadata': {'f:annotations': {'.': {}, 'f:endpoints.kubernetes.io/last-change-trigger-time': {}}, 'f:labels': {'.': {}, 'f:k8s-app': {}}}, 'f:subsets': {}}}]}, 'subsets': [{'addresses': [{'ip': '10.128.56.38', 'nodeName': 'ostest-zm4t6-master-1', 'targetRef': {'kind': 'Pod', 'namespace': 'openshift-machine-api', 'name': 'machine-api-operator-7dddccbf5-wcdf5', 'uid': '549d414d-9b8c-4adf-84f4-d101dbc8fe98', 'resourceVersion': '25604'}}], 'ports': [{'name': 'https', 'port': 8443, 'protocol': 'TCP'}]}]}}: kuryr_kubernetes.exceptions.ResourceNotReady: Resource not ready: LBaaSMember(id=<?>,ip=10.128.56.38,name='openshift-machine-api/machine-api-operator-7dddccbf5-wcdf5:8443',pool_id=caea3e31-db8d-4d1a-afd3-0dc3fa02fd4d,port=8443,project_id='f6b6420743ce45ac868c102a523ffde6',subnet_id=0e811b62-6954-4459-9d02-a232e06c040c)
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging Traceback (most recent call last):
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/logging.py", line 37, in __call__
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event)
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 93, in __call__
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     self._handler.set_liveness(alive=False)
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     self.force_reraise()
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     six.reraise(self.type_, self.value, self.tb)
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/six.py", line 693, in reraise
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     raise value
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/retry.py", line 78, in __call__
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     self._handler(event)
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/handlers/k8s_base.py", line 90, in __call__
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     self.on_present(obj)
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 190, in on_present
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     if self._sync_lbaas_members(endpoints, lbaas_state, lbaas_spec):
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 281, in _sync_lbaas_members
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     self._add_new_members(endpoints, lbaas_state, lbaas_spec)):
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/handlers/lbaas.py", line 388, in _add_new_members
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     listener_port=listener_port)
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 450, in ensure_member
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     self._find_member)
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging   File "/usr/lib/python3.6/site-packages/kuryr_kubernetes/controller/drivers/lbaasv2.py", line 688, in _ensure_provisioned
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging     raise k_exc.ResourceNotReady(obj)
2020-05-17 17:32:25.734 1 ERROR kuryr_kubernetes.handlers.logging kuryr_kubernetes.exceptions.ResourceNotReady: Resource not ready: LBaaSMember(id=<?>,ip=10.128.56.38,name='openshift-machine-api/machine-api-operator-7dddccbf5-wcdf5:8443',pool_id=caea3e31-db8d-4d1a-afd3-0dc3fa02fd4d,port=8443,project_id='f6b6420743ce45ac868c102a523ffde6',subnet_id=0e811b62-6954-4459-9d02-a232e06c040c)

Version-Release number of selected component (if applicable):

How reproducible:

Steps to Reproduce:
1. Transition an LB resource (pool/listener) to ERROR while the LB creation is being handled.
2.
3.

Actual results:
The kuryr-controller restarts continuously because it is unable to create the remaining LB resources needed to finish handling the event.

Expected results:
The load balancer resource in ERROR gets recreated.

Additional info:
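The traceback shows `_ensure_provisioned` raising `ResourceNotReady` for a member whose parent pool is stuck in ERROR, which the retry handler eventually turns into a controller restart. The expected behavior can be sketched as: treat an ERROR provisioning status as a signal to release and recreate the resource instead of retrying forever. The following is a minimal, self-contained illustration of that idea; the names `ensure_provisioned`, `create_fn`, and `release_fn` are hypothetical and do not mirror Kuryr's actual driver API.

```python
# Hedged sketch: recreate an LB resource whose provisioning_status is ERROR
# instead of raising ResourceNotReady until the controller gives up.
# All names here are illustrative, not Kuryr's real driver interface.

def ensure_provisioned(resource, create_fn, release_fn, max_attempts=3):
    """Return a resource in ACTIVE state, recreating it on ERROR."""
    for _ in range(max_attempts):
        if resource is None:
            resource = create_fn()
        status = resource.get("provisioning_status")
        if status == "ACTIVE":
            return resource
        if status == "ERROR":
            # The behavior described in this report was to raise
            # ResourceNotReady here; releasing and recreating avoids
            # the endless restart loop.
            release_fn(resource)
            resource = None
    raise RuntimeError("resource did not become ACTIVE")
```

For example, a pool found in ERROR would be released once and then recreated, after which dependent members can be added normally.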
Verified on OSP16 - RHOS_TRUNK-16.0-RHEL-8-20200513.n.1 and 4.5.0-0.nightly-2020-06-01-043833

1. Create pods in the test namespace and expose a service:

(overcloud) [stack@undercloud-0 ~]$ oc run --image kuryr/demo demo
pod/demo created
(overcloud) [stack@undercloud-0 ~]$ oc run --image kuryr/demo demo-caller
pod/demo-caller created
(overcloud) [stack@undercloud-0 ~]$ oc expose pod/demo --port 80 --target-port 8080
service/demo exposed
(overcloud) [stack@undercloud-0 ~]$ oc get pods,svc -o wide --show-labels
NAME              READY   STATUS    RESTARTS   AGE   IP               NODE                        NOMINATED NODE   READINESS GATES   LABELS
pod/demo          1/1     Running   0          38s   10.128.115.234   ostest-k7pdd-worker-8cf4l   <none>           <none>            run=demo
pod/demo-caller   1/1     Running   0          27s   10.128.115.247   ostest-k7pdd-worker-jpsbt   <none>           <none>            run=demo-caller

NAME           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE   SELECTOR   LABELS
service/demo   ClusterIP   172.30.202.242   <none>        80/TCP    22s   run=demo   run=demo
(overcloud) [stack@undercloud-0 ~]$ oc rsh demo-caller curl 172.30.202.242
demo: HELLO! I AM ALIVE!!!

2. Search for the LB resources:

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer list | grep demo
| d9b5421d-366a-4dab-9d0f-7cd6a3cb7bb9 | test/demo | 758d38e2352449eaa9d6ae554d0650e9 | 172.30.202.242 | ACTIVE | ovn |
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show d9b5421d-366a-4dab-9d0f-7cd6a3cb7bb9
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2020-06-01T14:25:55                  |
| description         |                                      |
| flavor_id           | None                                 |
| id                  | d9b5421d-366a-4dab-9d0f-7cd6a3cb7bb9 |
| listeners           | 9ebd024c-0872-40db-a056-5f514920b8ef |
| name                | test/demo                            |
| operating_status    | ONLINE                               |
| pools               | 80248153-89a8-4f4a-ad03-7a71d7ca5898 |
| project_id          | 758d38e2352449eaa9d6ae554d0650e9     |
| provider            | ovn                                  |
| provisioning_status | ACTIVE                               |
| updated_at          | 2020-06-01T14:26:13                  |
| vip_address         | 172.30.202.242                       |
| vip_network_id      | 0adf99a6-4b4e-4909-9537-680d4031de65 |
| vip_port_id         | 5e249dce-ca31-42c2-be80-e2fa4ac2ff35 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | 27dec9e5-623f-499d-a62d-512759da16cd |
+---------------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer member list 80248153-89a8-4f4a-ad03-7a71d7ca5898
+--------------------------------------+----------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| id                                   | name           | project_id                       | provisioning_status | address        | protocol_port | operating_status | weight |
+--------------------------------------+----------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| a45180e1-d15e-4ef3-9ed6-ac6a7067ac3f | test/demo:8080 | 758d38e2352449eaa9d6ae554d0650e9 | ACTIVE              | 10.128.115.234 | 8080          | NO_MONITOR       | 1      |
+--------------------------------------+----------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+

3. Set the LB resources to ERROR directly in the Octavia database:

MariaDB [octavia]> update member set provisioning_status='ERROR' where id='a45180e1-d15e-4ef3-9ed6-ac6a7067ac3f';
Query OK, 1 row affected (0.002 sec)
Rows matched: 1  Changed: 1  Warnings: 0

MariaDB [octavia]> update pool set provisioning_status='ERROR' where id='80248153-89a8-4f4a-ad03-7a71d7ca5898';
Query OK, 1 row affected (0.002 sec)
Rows matched: 1  Changed: 1  Warnings: 0

MariaDB [octavia]> update listener set provisioning_status='ERROR' where id='9ebd024c-0872-40db-a056-5f514920b8ef';
Query OK, 1 row affected (0.003 sec)
Rows matched: 1  Changed: 1  Warnings: 0

4. Check the ERROR status:

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer member list 80248153-89a8-4f4a-ad03-7a71d7ca5898
+--------------------------------------+----------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| id                                   | name           | project_id                       | provisioning_status | address        | protocol_port | operating_status | weight |
+--------------------------------------+----------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| a45180e1-d15e-4ef3-9ed6-ac6a7067ac3f | test/demo:8080 | 758d38e2352449eaa9d6ae554d0650e9 | ERROR               | 10.128.115.234 | 8080          | NO_MONITOR       | 1      |
+--------------------------------------+----------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer pool show 80248153-89a8-4f4a-ad03-7a71d7ca5898
+----------------------+--------------------------------------+
| Field                | Value                                |
+----------------------+--------------------------------------+
| admin_state_up       | True                                 |
| created_at           | 2020-06-01T14:26:11                  |
| description          |                                      |
| healthmonitor_id     |                                      |
| id                   | 80248153-89a8-4f4a-ad03-7a71d7ca5898 |
| lb_algorithm         | SOURCE_IP_PORT                       |
| listeners            | 9ebd024c-0872-40db-a056-5f514920b8ef |
| loadbalancers        | d9b5421d-366a-4dab-9d0f-7cd6a3cb7bb9 |
| members              | a45180e1-d15e-4ef3-9ed6-ac6a7067ac3f |
| name                 | test/demo:TCP:80                     |
| operating_status     | ONLINE                               |
| project_id           | 758d38e2352449eaa9d6ae554d0650e9     |
| protocol             | TCP                                  |
| provisioning_status  | ERROR                                |
| session_persistence  | None                                 |
| updated_at           | 2020-06-01T14:26:13                  |
| tls_container_ref    | None                                 |
| ca_tls_container_ref | None                                 |
| crl_container_ref    | None                                 |
| tls_enabled          | False                                |
+----------------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer listener show 9ebd024c-0872-40db-a056-5f514920b8ef
+-----------------------------+--------------------------------------+
| Field                       | Value                                |
+-----------------------------+--------------------------------------+
| admin_state_up              | True                                 |
| connection_limit            | -1                                   |
| created_at                  | 2020-06-01T14:26:09                  |
| default_pool_id             | 80248153-89a8-4f4a-ad03-7a71d7ca5898 |
| default_tls_container_ref   | None                                 |
| description                 |                                      |
| id                          | 9ebd024c-0872-40db-a056-5f514920b8ef |
| insert_headers              | None                                 |
| l7policies                  |                                      |
| loadbalancers               | d9b5421d-366a-4dab-9d0f-7cd6a3cb7bb9 |
| name                        | test/demo:TCP:80                     |
| operating_status            | ONLINE                               |
| project_id                  | 758d38e2352449eaa9d6ae554d0650e9     |
| protocol                    | TCP                                  |
| protocol_port               | 80                                   |
| provisioning_status         | ERROR                                |
| sni_container_refs          | []                                   |
| timeout_client_data         | 50000                                |
| timeout_member_connect      | 5000                                 |
| timeout_member_data         | 50000                                |
| timeout_tcp_inspect         | 0                                    |
| updated_at                  | 2020-06-01T14:26:13                  |
| client_ca_tls_container_ref | None                                 |
| client_authentication       | NONE                                 |
| client_crl_container_ref    | None                                 |
| allowed_cidrs               | None                                 |
+-----------------------------+--------------------------------------+

5. Trigger regeneration by generating a Kuryr event:

oc edit endpoints -n test
Remove the openstack.org/kuryr-lbaas-state element.

6. Check the new status:

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer list | grep demo
| d9b5421d-366a-4dab-9d0f-7cd6a3cb7bb9 | test/demo | 758d38e2352449eaa9d6ae554d0650e9 | 172.30.202.242 | ACTIVE | ovn |
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer show d9b5421d-366a-4dab-9d0f-7cd6a3cb7bb9
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2020-06-01T14:25:55                  |
| description         |                                      |
| flavor_id           | None                                 |
| id                  | d9b5421d-366a-4dab-9d0f-7cd6a3cb7bb9 |
| listeners           | e665ca54-768c-4d5e-b9a9-5b27b9c34389 |
| name                | test/demo                            |
| operating_status    | ONLINE                               |
| pools               | 80248153-89a8-4f4a-ad03-7a71d7ca5898 |
|                     | 868bb9dc-241b-45b2-95e4-9d2cf34f87d7 |
| project_id          | 758d38e2352449eaa9d6ae554d0650e9     |
| provider            | ovn                                  |
| provisioning_status | ACTIVE                               |
| updated_at          | 2020-06-01T14:30:12                  |
| vip_address         | 172.30.202.242                       |
| vip_network_id      | 0adf99a6-4b4e-4909-9537-680d4031de65 |
| vip_port_id         | 5e249dce-ca31-42c2-be80-e2fa4ac2ff35 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | 27dec9e5-623f-499d-a62d-512759da16cd |
+---------------------+--------------------------------------+
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer member list 868bb9dc-241b-45b2-95e4-9d2cf34f87d7
+--------------------------------------+----------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| id                                   | name           | project_id                       | provisioning_status | address        | protocol_port | operating_status | weight |
+--------------------------------------+----------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| 2dbc2049-fe36-4ea2-ad1f-b451759bf014 | test/demo:8080 | 758d38e2352449eaa9d6ae554d0650e9 | ACTIVE              | 10.128.115.234 | 8080          | NO_MONITOR       | 1      |
+--------------------------------------+----------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+

7. Check that connectivity is working:

(overcloud) [stack@undercloud-0 ~]$ oc rsh demo-caller curl 172.30.202.242
demo: HELLO! I AM ALIVE!!!

No errors observed in the kuryr-controller logs.
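The regeneration trigger in step 5 (removing the openstack.org/kuryr-lbaas-state annotation with `oc edit`) can also be applied programmatically: a JSON merge patch that sets the annotation to null deletes it. A minimal sketch that only builds the patch body is below; sending it (e.g. as a PATCH against the Endpoints object with Content-Type: application/merge-patch+json, or via a Kubernetes client library) is assumed to happen elsewhere, and the helper name is illustrative.

```python
# Hedged sketch: build the JSON merge patch that deletes the
# openstack.org/kuryr-lbaas-state annotation from an Endpoints object.
# Per JSON merge patch semantics (RFC 7386), a null value removes the key.
import json

def build_annotation_removal_patch(annotation="openstack.org/kuryr-lbaas-state"):
    # Setting the annotation to None serializes to null, which deletes it.
    return {"metadata": {"annotations": {annotation: None}}}

patch = build_annotation_removal_patch()
# Serialized body that would be PATCHed to
# /api/v1/namespaces/test/endpoints/demo
body = json.dumps(patch)
```

An equivalent one-liner from the CLI is `oc annotate endpoints demo -n test openstack.org/kuryr-lbaas-state-` (the trailing dash removes the annotation).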
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409