Description of problem:
When scaling replicas to zero, Octavia load balancer pool members are not updated accordingly.

Version-Release number of selected component (if applicable):
OpenShift 3.11.306 with Kuryr on OpenStack 13

How reproducible:
Create a deployment with a service and pod, scale the pods to 5 (for example), then scale down to zero, and check the members of the pool for that load balancer.

Steps to Reproduce:
1. Create deployment
2. Scale up to 5
3. Scale down to zero

Actual results:
The member list still has five members.

Expected results:
The member list should be empty.

Additional info:
After scaling a deployment from 5 replicas to zero, there are no pods left:

$ oc get all
NAME              TYPE        CLUSTER-IP        EXTERNAL-IP   PORT(S)   AGE
service/echo-01   ClusterIP   XXX.XXX.157.245   <none>        80/TCP    5h
service/echo-02   ClusterIP   XXX.XXX.146.71    <none>        80/TCP    5h

NAME                      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/echo-01   0         0         0            0           5h
deployment.apps/echo-02   0         0         0            0           5h

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/echo-01-5c8c6d56c8   0         0         0       5h
replicaset.apps/echo-02-77b5b75d95   0         0         0       5h

Checking the OpenStack side of things:

$ openstack loadbalancer member list f8745fbd-c001-4d60-90ba-6ff9b68cf8ce
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| id                                   | name                               | project_id                       | provisioning_status | address        | protocol_port | operating_status | weight |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| 51afb91a-7f73-4d63-8c8e-4ca8125577f4 | momo/echo-02-77b5b75d95-rv5m6:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XXX.XXX.19.132 | 8080          | NO_MONITOR       | 1      |
| 82aa7394-0fc8-479e-8a56-1817bdf083c5 | momo/echo-02-77b5b75d95-tslsb:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XXX.XXX.19.136 | 8080          | NO_MONITOR       | 1      |
| 38f3debf-96aa-4750-a1be-d540bf3839ce | momo/echo-02-77b5b75d95-xpmd8:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XXX.XXX.19.143 | 8080          | NO_MONITOR       | 1      |
| 819a8254-8fb1-44f1-8c82-47bc5860f82a | momo/echo-02-77b5b75d95-zzqrr:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XXX.XXX.19.144 | 8080          | NO_MONITOR       | 1      |
| 1aaf9dcd-1a9b-4d6e-be2b-8a17b88843cd | momo/echo-02-77b5b75d95-6z69w:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XXX.XXX.19.152 | 8080          | NO_MONITOR       | 1      |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+

$ openstack loadbalancer member list 0a3f85b6-941e-4ba4-9c76-b9dcc9da64d0
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| id                                   | name                               | project_id                       | provisioning_status | address        | protocol_port | operating_status | weight |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| e73d6142-7e5f-429c-a6c6-169a0a944f4f | momo/echo-01-5c8c6d56c8-s8mfq:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XXX.XXX.19.131 | 8080          | NO_MONITOR       | 1      |
| 0bb14dd4-1dba-418d-9e46-c4cfbdb3bde7 | momo/echo-01-5c8c6d56c8-kc5z7:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XXX.XXX.19.139 | 8080          | NO_MONITOR       | 1      |
| efe190c6-8dd9-406f-ac73-02e766afa375 | momo/echo-01-5c8c6d56c8-hn6td:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XXX.XXX.19.142 | 8080          | NO_MONITOR       | 1      |
| fa63bd37-8178-46e4-9b3b-0e7fb62f6e71 | momo/echo-01-5c8c6d56c8-zfxfr:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XXX.XXX.19.145 | 8080          | NO_MONITOR       | 1      |
| 9e369b9e-52f5-4ec5-b9a8-07f873425a02 | momo/echo-01-5c8c6d56c8-crp6c:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XXX.XXX.19.153 | 8080          | NO_MONITOR       | 1      |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
Hey Mo,

Is there any evidence that Kuryr has been able to delete any resources at all? This seems to be a common issue, with leftover ports in Neutron and leftover pool members in Octavia. I wonder whether Kuryr actually has the permissions it needs to delete objects.

You could find the cloud credentials being used by Kuryr-CNI and try to delete one of the incorrect pool members or ports using those same credentials. That would rule out permission issues.

Also, do you see any logs from Kuryr about updating Octavia pool members? We could cross-reference those messages and their timestamps with the messages from the Octavia API to see whether any issues were raised. The same applies to the Neutron ports in your other case.
Hi Brendan,

So far it seems this issue only appears when scaling down to zero. The problem we are seeing in production is that some services are scaled down to zero, but the list of members is not updated. Coincidentally, other pods (for different services) in the same namespace are being created and end up getting assigned one of the previously used IP addresses, which happen to still exist in the member list. This has been observed twice so far. Whenever we scale to a value above zero, there are no issues. See below.

Mohammad

----------------------------------------------------------
[openshift@master-2 ~]$ oc project momo
Already on project "momo" on server "https://XX.XX.128.1:8443".
[openshift@master-2 ~]$ oc get pods
No resources found.
[openshift@master-2 ~]$ oc get all
NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/echo-01   ClusterIP   XX.XX.157.245   <none>        80/TCP    3d
service/echo-02   ClusterIP   XX.XX.146.71    <none>        80/TCP    3d

NAME                      DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/echo-01   0         0         0            0           3d
deployment.apps/echo-02   0         0         0            0           3d

NAME                                 DESIRED   CURRENT   READY   AGE
replicaset.apps/echo-01-5c8c6d56c8   0         0         0       3d
replicaset.apps/echo-02-77b5b75d95   0         0         0       3d

dev1/mydc $ openstack loadbalancer list |grep momo
| 53e2f3e1-94c1-4871-90ed-30bdf9e619f0 | momo/echo-02 | 5499a185863a469ba0f8d724e886184f | XX.XX.146.71  | ACTIVE | octavia |
| 4b93af87-da44-4749-b035-6a8f81d8c121 | momo/echo-01 | 5499a185863a469ba0f8d724e886184f | XX.XX.157.245 | ACTIVE | octavia |

dev1/mydc $ openstack loadbalancer pool list --loadbalancer 4b93af87-da44-4749-b035-6a8f81d8c121
+--------------------------------------+---------------------+----------------------------------+---------------------+----------+--------------+----------------+
| id                                   | name                | project_id                       | provisioning_status | protocol | lb_algorithm | admin_state_up |
+--------------------------------------+---------------------+----------------------------------+---------------------+----------+--------------+----------------+
| 0a3f85b6-941e-4ba4-9c76-b9dcc9da64d0 | momo/echo-01:TCP:80 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | TCP      | ROUND_ROBIN  | True           |
+--------------------------------------+---------------------+----------------------------------+---------------------+----------+--------------+----------------+

dev1/mydc $ openstack loadbalancer pool list --loadbalancer 53e2f3e1-94c1-4871-90ed-30bdf9e619f0
+--------------------------------------+---------------------+----------------------------------+---------------------+----------+--------------+----------------+
| id                                   | name                | project_id                       | provisioning_status | protocol | lb_algorithm | admin_state_up |
+--------------------------------------+---------------------+----------------------------------+---------------------+----------+--------------+----------------+
| f8745fbd-c001-4d60-90ba-6ff9b68cf8ce | momo/echo-02:TCP:80 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | TCP      | ROUND_ROBIN  | True           |
+--------------------------------------+---------------------+----------------------------------+---------------------+----------+--------------+----------------+

dev1/mydc $ openstack loadbalancer member list 0a3f85b6-941e-4ba4-9c76-b9dcc9da64d0
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| id                                   | name                               | project_id                       | provisioning_status | address        | protocol_port | operating_status | weight |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| e73d6142-7e5f-429c-a6c6-169a0a944f4f | momo/echo-01-5c8c6d56c8-s8mfq:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.131   | 8080          | NO_MONITOR       | 1      |
| 0bb14dd4-1dba-418d-9e46-c4cfbdb3bde7 | momo/echo-01-5c8c6d56c8-kc5z7:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.139   | 8080          | NO_MONITOR       | 1      |
| efe190c6-8dd9-406f-ac73-02e766afa375 | momo/echo-01-5c8c6d56c8-hn6td:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.142   | 8080          | NO_MONITOR       | 1      |
| fa63bd37-8178-46e4-9b3b-0e7fb62f6e71 | momo/echo-01-5c8c6d56c8-zfxfr:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.145   | 8080          | NO_MONITOR       | 1      |
| 9e369b9e-52f5-4ec5-b9a8-07f873425a02 | momo/echo-01-5c8c6d56c8-crp6c:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.153   | 8080          | NO_MONITOR       | 1      |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+

dev1/mydc $ openstack loadbalancer member list f8745fbd-c001-4d60-90ba-6ff9b68cf8ce
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| id                                   | name                               | project_id                       | provisioning_status | address        | protocol_port | operating_status | weight |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| 51afb91a-7f73-4d63-8c8e-4ca8125577f4 | momo/echo-02-77b5b75d95-rv5m6:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.132   | 8080          | NO_MONITOR       | 1      |
| 82aa7394-0fc8-479e-8a56-1817bdf083c5 | momo/echo-02-77b5b75d95-tslsb:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.136   | 8080          | NO_MONITOR       | 1      |
| 38f3debf-96aa-4750-a1be-d540bf3839ce | momo/echo-02-77b5b75d95-xpmd8:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.143   | 8080          | NO_MONITOR       | 1      |
| 819a8254-8fb1-44f1-8c82-47bc5860f82a | momo/echo-02-77b5b75d95-zzqrr:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.144   | 8080          | NO_MONITOR       | 1      |
| 1aaf9dcd-1a9b-4d6e-be2b-8a17b88843cd | momo/echo-02-77b5b75d95-6z69w:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.152   | 8080          | NO_MONITOR       | 1      |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+

[openshift@master-2 ~]$ oc scale --replicas=1 deployment.apps/echo-01
deployment.apps/echo-01 scaled
[openshift@master-2 ~]$ oc scale --replicas=1 deployment.apps/echo-02
deployment.apps/echo-02 scaled

dev1/mydc $ openstack loadbalancer member list f8745fbd-c001-4d60-90ba-6ff9b68cf8ce
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| id                                   | name                               | project_id                       | provisioning_status | address        | protocol_port | operating_status | weight |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| f2b2b209-20e9-41ff-b6f6-584ddc893e4d | momo/echo-02-77b5b75d95-vsg7k:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.141   | 8080          | NO_MONITOR       | 1      |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+

dev1/mydc $ openstack loadbalancer member list 0a3f85b6-941e-4ba4-9c76-b9dcc9da64d0
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| id                                   | name                               | project_id                       | provisioning_status | address        | protocol_port | operating_status | weight |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| 06009a72-529c-484b-86be-63e23b58d532 | momo/echo-01-5c8c6d56c8-tzr4v:8080 | 5499a185863a469ba0f8d724e886184f | ACTIVE              | XX.XX.19.133   | 8080          | NO_MONITOR       | 1      |
+--------------------------------------+------------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
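
For anyone who wants to check their own pools for the same symptom, here is a rough Python sketch of the comparison done by hand above. It is only an illustration, not part of Kuryr. It assumes the member names follow the "<namespace>/<pod>:<port>" pattern visible in the listings, and that each member is available as a dict with "name" and "address" keys (for example, parsed from the CLI's JSON formatter output; the exact key names are an assumption). The helper names and the pod "momo/other-app-1" are hypothetical.

```python
def pod_ref_of(member):
    """Strip the trailing ":<port>" from a member name, if present."""
    name = member["name"]
    pod_ref, sep, _port = name.rpartition(":")
    return pod_ref if sep else name


def find_stale_members(members, live_pods):
    """Return the members whose "<namespace>/<pod>" is not a live pod."""
    return [m for m in members if pod_ref_of(m) not in live_pods]


def find_ip_collisions(stale_members, pod_ips):
    """Return (stale member name, pod) pairs that share an IP address.

    pod_ips maps "<namespace>/<pod>" to the pod's current IP.
    """
    by_address = {m["address"]: m["name"] for m in stale_members}
    return [
        (by_address[ip], pod)
        for pod, ip in sorted(pod_ips.items())
        if ip in by_address
    ]


# Two members copied from the echo-01 pool listing above.
members = [
    {"name": "momo/echo-01-5c8c6d56c8-s8mfq:8080", "address": "XX.XX.19.131"},
    {"name": "momo/echo-01-5c8c6d56c8-kc5z7:8080", "address": "XX.XX.19.139"},
]

# After scaling to zero there are no live pods, so every member is stale.
stale = find_stale_members(members, live_pods=set())

# Hypothetical new pod of another service that was handed a reused IP.
collisions = find_ip_collisions(stale, {"momo/other-app-1": "XX.XX.19.131"})
print(len(stale), collisions)
```

A non-empty collision list corresponds to the production case described above, where traffic for the scaled-down service could reach a pod that belongs to a different service.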
Hey Mo,

Yeah, I believe the issue here has been identified as missing logic to handle the scale-to-zero case:

https://github.com/openshift/kuryr-kubernetes/blob/release-3.11/kuryr_kubernetes/controller/handlers/lbaas.py#L237-L244

I also believe that the Kuryr engineering team is now working on adding this logic to resolve the issue.
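
To make the scale-to-zero gap concrete, here is a minimal Python sketch of one way such a bug can arise. This is not the Kuryr code at the link above; the function names are hypothetical and the early-return guard is an assumption used purely for illustration.

```python
def members_to_remove(current_members, endpoint_addresses):
    """Members in the pool that no endpoint address backs any more."""
    wanted = set(endpoint_addresses)
    return [m for m in current_members if m["address"] not in wanted]


def sync_skipping_empty(current_members, endpoint_addresses):
    """Hypothetical handler that ignores events with no endpoints.

    With this guard, a scale to zero never triggers any cleanup, so the
    old members stay in the pool.
    """
    if not endpoint_addresses:
        return []
    return members_to_remove(current_members, endpoint_addresses)


def sync_handling_empty(current_members, endpoint_addresses):
    """Hypothetical handler that treats "no endpoints" as "remove all"."""
    return members_to_remove(current_members, endpoint_addresses)


pool = [
    {"name": "momo/echo-01-5c8c6d56c8-s8mfq:8080", "address": "XX.XX.19.131"},
    {"name": "momo/echo-01-5c8c6d56c8-kc5z7:8080", "address": "XX.XX.19.139"},
]

# Scale to zero: the endpoints carry no addresses at all.
print(len(sync_skipping_empty(pool, [])))   # nothing scheduled for removal
print(len(sync_handling_empty(pool, [])))   # every member scheduled for removal
```

Both variants behave the same for any non-empty endpoint list, which would be consistent with the observation that scaling to any value above zero works fine.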
Verified on OCP4.7.0-0.nightly-2020-11-18-203317 over OSP16.1 with OVN-Octavia (RHOS-16.1-RHEL-8-20201110.n.1).

Created a deployment with 3 replicas and a service using the files below:

$ cat demo_deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo
  labels:
    app: demo
spec:
  replicas: 3
  selector:
    matchLabels:
      app: demo
  template:
    metadata:
      labels:
        app: demo
    spec:
      containers:
      - name: demo
        image: kuryr/demo
        ports:
        - containerPort: 8080

$ cat demo_svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: demo
  labels:
    app: demo
spec:
  selector:
    app: demo
  ports:
  - port: 80
    protocol: TCP
    targetPort: 8080

The result:

$ oc get all
NAME                       READY   STATUS    RESTARTS   AGE
pod/demo-66cdc7b66-558q4   1/1     Running   0          35s
pod/demo-66cdc7b66-6r6xz   1/1     Running   0          35s
pod/demo-66cdc7b66-lqgwm   1/1     Running   0          35s

NAME           TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)   AGE
service/demo   ClusterIP   172.30.253.185   <none>        80/TCP    20m

NAME                   READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/demo   3/3     3            3           21m

NAME                             DESIRED   CURRENT   READY   AGE
replicaset.apps/demo-66cdc7b66   3         3         3       21m

and

$ openstack loadbalancer show test/demo
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2020-11-20T10:29:10                  |
| description         |                                      |
| flavor_id           | None                                 |
| id                  | ea3e7a80-c506-4470-8f03-ffbb54c3582d |
| listeners           | a06a2ac5-29d9-4662-8bfc-ae8efede9dec |
| name                | test/demo                            |
| operating_status    | ONLINE                               |
| pools               | 7445a7f9-3828-4683-9578-8282b60c98bf |
| project_id          | 09384e0f276445b8b369945abd83baf0     |
| provider            | ovn                                  |
| provisioning_status | ACTIVE                               |
| updated_at          | 2020-11-20T10:47:25                  |
| vip_address         | 172.30.253.185                       |
| vip_network_id      | 707947c5-b9ef-416d-a50b-610b8d0c9288 |
| vip_port_id         | 114b59bb-cc40-4ed6-b3da-befd30767725 |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | a41d615c-c5e7-4ae4-9b54-139262d060c2 |
+---------------------+--------------------------------------+

$ openstack loadbalancer member list 7445a7f9-3828-4683-9578-8282b60c98bf
+--------------------------------------+--------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| id                                   | name                           | project_id                       | provisioning_status | address        | protocol_port | operating_status | weight |
+--------------------------------------+--------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+
| 8aeca25e-0aed-4047-836e-5ffb08ea6cfc | test/demo-66cdc7b66-6r6xz:8080 | 09384e0f276445b8b369945abd83baf0 | ACTIVE              | 10.128.118.121 | 8080          | NO_MONITOR       | 1      |
| 699f911d-040d-4413-a1c0-78ce6d9127c2 | test/demo-66cdc7b66-558q4:8080 | 09384e0f276445b8b369945abd83baf0 | ACTIVE              | 10.128.119.199 | 8080          | NO_MONITOR       | 1      |
| 79917927-1c2a-452d-a34e-58b8d4bb721e | test/demo-66cdc7b66-lqgwm:8080 | 09384e0f276445b8b369945abd83baf0 | ACTIVE              | 10.128.118.53  | 8080          | NO_MONITOR       | 1      |
+--------------------------------------+--------------------------------+----------------------------------+---------------------+----------------+---------------+------------------+--------+

Now, scaling to 0 with the command below:

$ oc scale --replicas=0 deployment.apps/demo
deployment.apps/demo scaled

After a while, all the members are removed from the pool:

$ openstack loadbalancer member list 7445a7f9-3828-4683-9578-8282b60c98bf

$
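
Since the members only disappear "after a while", an automated version of this check needs to poll rather than look once. The following Python sketch shows one way to do that under stated assumptions: wait_until_empty and its parameters are hypothetical names, and list_members stands in for whatever call returns the current members of the pool (for instance, a wrapper around the member list command used above).

```python
import time


def wait_until_empty(list_members, timeout=60.0, interval=2.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll list_members() until it returns no members or time runs out.

    Returns True if the pool became empty, False on timeout.
    sleep and clock are injectable so the helper can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        if not list_members():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Stand-in for the real pool: three members, then one, then none.
snapshots = iter([["m1", "m2", "m3"], ["m3"], []])
emptied = wait_until_empty(lambda: next(snapshots), sleep=lambda _s: None)
print(emptied)
```

The timeout and interval defaults are arbitrary; how long member removal takes in practice will depend on the environment.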
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Moderate: OpenShift Container Platform 4.7.0 security, bug fix, and enhancement update), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2020:5633