Description of problem:

When upgrading Octavia from OSP13 to OSP16, the CNO checks whether tagging is supported and, if it is, looks for an API load balancer carrying the cluster tag. However, the existing LB was created with OSP13, which does not support tagging, so the tag was stored in the description field instead. The CNO therefore fails to find the LB by tag, tries to create a new API LB, and fails because the address is already in use.

CNO logs:

2020/02/28 16:21:14 Failed to reconcile platform networking resources: failed to create OpenShift API loadbalancer: failed to create LB: Internal Server Error

Octavia logs:

Neutron server returns request_ids: ['req-b0e62a7e-26f7-4310-9749-ddf971dab7dc']: octavia_lib.api.drivers.exceptions.DriverError: IP address 172.30.0.1 already allocated in subnet 5b87bf2e-3038-4d3d-a610-ab8a936d50ac
Neutron server returns request_ids: ['req-b0e62a7e-26f7-4310-9749-ddf971dab7dc']
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils Traceback (most recent call last):
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils   File "/usr/lib/python3.6/site-packages/octavia/network/drivers/neutron/allowed_address_pairs.py", line 450, in allocate_vip
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils     new_port = self.neutron_client.create_port(port)
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils   File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 803, in create_port
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils     return self.post(self.ports_path, body=body)
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils   File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 359, in post
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils     headers=headers, params=params)
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils   File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 294, in do_request
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils     self._handle_fault_response(status_code, replybody, resp)
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils   File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 269, in _handle_fault_response
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils     exception_handler_v20(status_code, error_body)
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils   File "/usr/lib/python3.6/site-packages/neutronclient/v2_0/client.py", line 93, in exception_handler_v20
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils     request_ids=request_ids)
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils neutronclient.common.exceptions.IpAddressAlreadyAllocatedClient: IP address 172.30.0.1 already allocated in subnet 5b87bf2e-3038-4d3d-a610-ab8a936d50ac
2020-02-28 16:04:20.575 32 ERROR octavia.api.drivers.utils Neutron server returns request_ids: ['req-b0e62a7e-26f7-4310-9749-ddf971dab7dc']

Version-Release number of selected component (if applicable):
Upgrade Octavia OSP13 to Octavia OSP16

How reproducible:

Steps to Reproduce:
1.
2.
3.

Actual results:

Expected results:

Additional info:
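The failure mode can be sketched in Go (the CNO's language). This is an illustrative model only, not actual CNO code: `LoadBalancer` is a simplified stand-in for the Octavia LB resource, and `findAPILB` is a hypothetical helper showing the lookup the fixed operator needs — by tag first, then falling back to the cluster marker in the description that OSP13 used. Without the fallback, the lookup misses the pre-upgrade LB and the operator tries to recreate it on the already-allocated VIP.

```go
package main

import (
	"fmt"
	"strings"
)

// LoadBalancer is a simplified stand-in for an Octavia load balancer,
// keeping only the fields relevant to this bug.
type LoadBalancer struct {
	ID          string
	Name        string
	Description string
	Tags        []string
}

// findAPILB (hypothetical helper) looks up the API LB by cluster tag and,
// if nothing is tagged, falls back to matching the marker that OSP13
// deployments stored in the Description field.
func findAPILB(lbs []LoadBalancer, clusterTag string) *LoadBalancer {
	for i := range lbs {
		for _, t := range lbs[i].Tags {
			if t == clusterTag {
				return &lbs[i]
			}
		}
	}
	// Fallback: OSP13 Octavia has no tagging support, so the marker
	// lives in the description of the pre-upgrade LB.
	for i := range lbs {
		if strings.Contains(lbs[i].Description, clusterTag) {
			return &lbs[i]
		}
	}
	return nil
}

func main() {
	// An LB created under OSP13: marker in Description, no Tags.
	lbs := []LoadBalancer{{
		ID:          "d787e30b-79be-4383-b998-96f7f83465f2",
		Name:        "ostest-xk585-kuryr-api-loadbalancer",
		Description: "openshiftClusterID=ostest-xk585",
	}}
	if lb := findAPILB(lbs, "openshiftClusterID=ostest-xk585"); lb != nil {
		fmt.Println("found existing LB", lb.ID, "- reuse it instead of creating a new one")
	} else {
		fmt.Println("not found - a create attempt would hit the already-allocated VIP")
	}
}
```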
Verified in 4.5.0-0.nightly-2020-04-02-004321 on top of OSP 16 RHOS_TRUNK-16.0-RHEL-8-20200324.n.0 compose.

After a successful 4.5.0-0.nightly-2020-04-02-004321 installation, the following steps were performed in order to reproduce the scenario described in this BZ:

1. Add a description to the API LB and remove the tag (as is done in an OSP 13 deployment):

$ openstack loadbalancer list
+--------------------------------------+-------------------------------------+----------------------------------+-------------+---------------------+----------+
| id                                   | name                                | project_id                       | vip_address | provisioning_status | provider |
+--------------------------------------+-------------------------------------+----------------------------------+-------------+---------------------+----------+
...
| d787e30b-79be-4383-b998-96f7f83465f2 | ostest-xk585-kuryr-api-loadbalancer | bb444bffd5f64283a8ddc9897b149829 | 172.30.0.1  | ACTIVE              | amphora  |
+--------------------------------------+-------------------------------------+----------------------------------+-------------+---------------------+----------+

$ openstack loadbalancer set --description 'openshiftClusterID=ostest-xk585' ostest-xk585-kuryr-api-loadbalancer

$ openstack loadbalancer show d787e30b-79be-4383-b998-96f7f83465f2
+---------------------+--------------------------------------+
| Field               | Value                                |
+---------------------+--------------------------------------+
| admin_state_up      | True                                 |
| created_at          | 2020-04-02T08:14:44                  |
| description         | openshiftClusterID=ostest-xk585      |
| flavor_id           | None                                 |
| id                  | d787e30b-79be-4383-b998-96f7f83465f2 |
| listeners           | 2362221d-77ff-4612-b964-cab9bc5a31d0 |
| name                | ostest-xk585-kuryr-api-loadbalancer  |
| operating_status    | DEGRADED                             |
| pools               | df885ed7-f4d3-4f98-91ab-ae0dbf6f71ab |
| project_id          | bb444bffd5f64283a8ddc9897b149829     |
| provider            | amphora                              |
| provisioning_status | ACTIVE                               |
| updated_at          | 2020-04-02T09:35:23                  |
| vip_address         | 172.30.0.1                           |
| vip_network_id      | 0195014d-4745-41dd-b4cf-46b3752d86bd |
| vip_port_id         | 8c2dd26c-33eb-48bc-9d94-50bf4fd5e69c |
| vip_qos_policy_id   | None                                 |
| vip_subnet_id       | 88b4cbb6-16f7-4613-86e2-c3c55b443929 |
+---------------------+--------------------------------------+

The tag needs to be removed from the DB:

[root@controller-0 heat-admin]# podman exec -uroot -it galera-bundle-podman-0 mysql
MariaDB [(none)]> use octavia
MariaDB [octavia]> select * from tags where resource_id='d787e30b-79be-4383-b998-96f7f83465f2';
+--------------------------------------+---------------------------------+
| resource_id                          | tag                             |
+--------------------------------------+---------------------------------+
| d787e30b-79be-4383-b998-96f7f83465f2 | openshiftClusterID=ostest-xk585 |
+--------------------------------------+---------------------------------+
MariaDB [octavia]> delete from tags where resource_id='d787e30b-79be-4383-b998-96f7f83465f2';
Query OK, 1 row affected (0.002 sec)

2. Restart the CNO - it will start and detect the API LB as if it had been created in OSP 13:

$ oc -n openshift-network-operator delete pod network-operator-cc7649f7-7stdx
$ oc -n openshift-network-operator get pods
NAME                              READY   STATUS    RESTARTS   AGE
network-operator-cc7649f7-vmdmm   1/1     Running   0          25s

3. Check the CNO logs - it detects the existing API LB by its description, keeps it, and tags it:

2020/04/02 09:39:14 Creating OpenShift API loadbalancer with IP 172.30.0.1
2020/04/02 09:39:14 Detected Octavia API v2.13.0
2020/04/02 09:39:14 Tagging existing loadbalancer API d787e30b-79be-4383-b998-96f7f83465f2
2020/04/02 09:39:14 OpenShift API loadbalancer d787e30b-79be-4383-b998-96f7f83465f2 present

4. Check that the tag has been added in the DB (as for OSP 16):

MariaDB [octavia]> select * from tags where resource_id='d787e30b-79be-4383-b998-96f7f83465f2';
+--------------------------------------+---------------------------------+
| resource_id                          | tag                             |
+--------------------------------------+---------------------------------+
| d787e30b-79be-4383-b998-96f7f83465f2 | openshiftClusterID=ostest-xk585 |
+--------------------------------------+---------------------------------+
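The behaviour verified above — tagging in place an LB that carries the cluster marker only in its description — can be sketched in Go. `ensureTagged` and the struct below are hypothetical illustrations, not the actual CNO implementation; `tagSupported` stands in for whatever check the CNO derives from the detected Octavia API version (the logs above show v2.13.0 detected).

```go
package main

import (
	"fmt"
	"strings"
)

// LoadBalancer keeps only the fields relevant to the tag migration.
type LoadBalancer struct {
	ID          string
	Description string
	Tags        []string
}

// ensureTagged (hypothetical helper) adds the cluster tag to an LB that
// only carries it in the description (the OSP13 convention). It returns
// true when an update is needed; the caller would then issue the actual
// Octavia update request.
func ensureTagged(lb *LoadBalancer, clusterTag string, tagSupported bool) bool {
	if !tagSupported {
		return false // old Octavia: keep using the description
	}
	for _, t := range lb.Tags {
		if t == clusterTag {
			return false // already migrated, nothing to do
		}
	}
	if strings.Contains(lb.Description, clusterTag) {
		lb.Tags = append(lb.Tags, clusterTag)
		return true
	}
	return false
}

func main() {
	// The OSP13-created LB from the verification steps above.
	lb := LoadBalancer{
		ID:          "d787e30b-79be-4383-b998-96f7f83465f2",
		Description: "openshiftClusterID=ostest-xk585",
	}
	if ensureTagged(&lb, "openshiftClusterID=ostest-xk585", true) {
		fmt.Println("Tagging existing loadbalancer API", lb.ID)
	}
}
```

On a second reconciliation the tag is already present, so `ensureTagged` reports no update needed and the LB is simply kept, matching the "present" log line above.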
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2020:2409