Description of problem:

The following issue reproduces easily and consistently in a test environment.

First, I can reproduce the same issue as in https://bugzilla.redhat.com/show_bug.cgi?id=1693808:
~~~
openssl req -new -newkey rsa:4096 -x509 -sha256 -days 3650 -nodes -out server.crt -keyout server.key
openssl pkcs12 -export -inkey server.key -in server.crt -passout pass: -out server.p12
openstack secret store --name='tls_secret1' -t 'application/octet-stream' -e 'base64' --payload="$(base64 < server.p12)"
openstack acl user add -u ca8ac3feedc64c26bde24ef404586422 $(openstack secret list | awk '/ tls_secret1 / {print $2}')
openstack loadbalancer create --name lb1 --vip-subnet-id provider1-subnet
sleep 300
openstack loadbalancer listener create --protocol-port 443 --protocol TERMINATED_HTTPS --name listener1 --default-tls-container=$(openstack secret list | awk '/ tls_secret1 / {print $2}') lb1
~~~

Now, we know that the above BZ will soon be fixed in a new z-stream. The problem is what happens when one runs the following procedure without knowing about that fix. The first attempt fails:
~~~
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer listener create --protocol-port 443 --protocol TERMINATED_HTTPS --name listener1 --default-tls-container=$(openstack secret list | awk '/ tls_secret1 / {print $2}') lb1
Could not retrieve certificate: ['http://10.0.0.15:9311/v1/secrets/7558e3a1-ffc1-43a1-9506-e625be67f038'] (HTTP 400) (Request-ID: req-c6dddf4c-cab9-4950-a791-78e45641ff08)
(overcloud) [stack@undercloud-1 octavia_keys]$
~~~

It then becomes impossible to delete the listener or the load balancer, because the load balancer ends up in PENDING_UPDATE and is immutable:
~~~
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer listener create --protocol-port 443 --protocol TERMINATED_HTTPS --name listener1 lb1
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| connection_limit          | -1                                   |
| created_at                | 2019-05-21T14:02:40                  |
| default_pool_id           | None                                 |
| default_tls_container_ref | None                                 |
| description               |                                      |
| id                        | 4a8f5b7a-b376-4dca-8381-1ac40458b873 |
| insert_headers            | None                                 |
| l7policies                |                                      |
| loadbalancers             | 134940e3-2efe-4f44-96b8-7682cabb700e |
| name                      | listener1                            |
| operating_status          | OFFLINE                              |
| project_id                | 522130f5bf8847db8b86eb05d6493cc7     |
| protocol                  | TERMINATED_HTTPS                     |
| protocol_port             | 443                                  |
| provisioning_status       | PENDING_CREATE                       |
| sni_container_refs        | []                                   |
| updated_at                | None                                 |
+---------------------------+--------------------------------------+
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer listener list
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
| id                                   | default_pool_id | name      | project_id                       | protocol         | protocol_port | admin_state_up |
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
| 4a8f5b7a-b376-4dca-8381-1ac40458b873 | None            | listener1 | 522130f5bf8847db8b86eb05d6493cc7 | TERMINATED_HTTPS | 443           | True           |
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer listener set --default-tls-container-ref $(openstack secret list | awk '/ tls_secret1 / {print $2}') listener1
Could not retrieve certificate: ['http://10.0.0.15:9311/v1/secrets/7558e3a1-ffc1-43a1-9506-e625be67f038'] (HTTP 400) (Request-ID: req-892836f3-6926-42dc-a463-78813888ac4f)
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer list
+--------------------------------------+------+----------------------------------+-------------+---------------------+----------+
| id                                   | name | project_id                       | vip_address | provisioning_status | provider |
+--------------------------------------+------+----------------------------------+-------------+---------------------+----------+
| 134940e3-2efe-4f44-96b8-7682cabb700e | lb1  | 522130f5bf8847db8b86eb05d6493cc7 | 10.0.0.107  | PENDING_UPDATE      | octavia  |
+--------------------------------------+------+----------------------------------+-------------+---------------------+----------+
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer listener list
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
| id                                   | default_pool_id | name      | project_id                       | protocol         | protocol_port | admin_state_up |
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
| 4a8f5b7a-b376-4dca-8381-1ac40458b873 | None            | listener1 | 522130f5bf8847db8b86eb05d6493cc7 | TERMINATED_HTTPS | 443           | True           |
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
(overcloud) [stack@undercloud-1 octavia_keys]$
~~~

~~~
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer listener delete 4a8f5b7a-b376-4dca-8381-1ac40458b873
Load Balancer 134940e3-2efe-4f44-96b8-7682cabb700e is immutable and cannot be updated. (HTTP 409) (Request-ID: req-81496b1a-7a3d-42a8-9612-de0ab86825e3)
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer delete 134940e3-2efe-4f44-96b8-7682cabb700e
Validation failure: Cannot delete Load Balancer 134940e3-2efe-4f44-96b8-7682cabb700e - it has children (HTTP 400) (Request-ID: req-bfa88865-d1d9-4d9c-a7ad-f4eb61f5e606)
(overcloud) [stack@undercloud-1 octavia_keys]$
~~~

Version-Release number of selected component (if applicable):

[root@overcloud-controller-0 ~]# docker ps | grep octa
a03330b54524 registry.access.redhat.com/rhosp13/openstack-octavia-health-manager:13.0-72 "kolla_start" 16 hours ago Up 16 hours (healthy) octavia_health_manager
b09cd2fbcee0 registry.access.redhat.com/rhosp13/openstack-octavia-api:13.0-70 "kolla_start" 16 hours ago Up 16 hours (healthy) octavia_api
c48cde3dd94b registry.access.redhat.com/rhosp13/openstack-octavia-housekeeping:13.0-72 "kolla_start" 16 hours ago Up 16 hours (healthy) octavia_housekeeping
0ca1a8631bb1 registry.access.redhat.com/rhosp13/openstack-octavia-worker:13.0-72 "kolla_start" 16 hours ago Up 16 hours (healthy) octavia_worker
[root@overcloud-controller-0 ~]#

Additional info:

It is impossible to get rid of the load balancer and listener, even after applying the changes suggested in:
upstream: https://bugs.launchpad.net/tripleo/+bug/1824777
upstream bugfix: https://review.opendev.org/#/c/652603/1/deployment/octavia/octavia-base.yaml
I rebooted my single controller, just to see if this might have an effect, but it has none:

(overcloud) [stack@undercloud-1 ~]$ openstack loadbalancer listener delete listener1
Load Balancer 134940e3-2efe-4f44-96b8-7682cabb700e is immutable and cannot be updated. (HTTP 409) (Request-ID: req-d2ebc2aa-4d5f-4ba3-ad11-49d6b2e4722f)
(overcloud) [stack@undercloud-1 ~]$ openstack loadbalancer delete lb1
Validation failure: Cannot delete Load Balancer 134940e3-2efe-4f44-96b8-7682cabb700e - it has children (HTTP 400) (Request-ID: req-92a6b543-5f4e-4728-ab03-7a2fed51e54a)
So far, I found the following workaround, although one should obviously not mess around in the database (and this is not supported):
~~~
[root@overcloud-controller-0 ~]# docker exec -it galera-bundle-docker-0 /bin/bash
()[root@overcloud-controller-0 /]# mysql octavia
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MariaDB monitor. Commands end with ; or \g.
Your MariaDB connection id is 626
Server version: 10.1.20-MariaDB MariaDB Server

Copyright (c) 2000, 2016, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

MariaDB [octavia]> show tables;
+--------------------------+
| Tables_in_octavia        |
+--------------------------+
| alembic_version          |
| algorithm                |
| amphora                  |
| amphora_build_request    |
| amphora_build_slots      |
| amphora_health           |
| amphora_roles            |
| health_monitor           |
| health_monitor_type      |
| l7policy                 |
| l7policy_action          |
| l7rule                   |
| l7rule_compare_type      |
| l7rule_type              |
| lb_topology              |
| listener                 |
| listener_statistics      |
| load_balancer            |
| member                   |
| operating_status         |
| pool                     |
| protocol                 |
| provisioning_status      |
| quotas                   |
| session_persistence      |
| session_persistence_type |
| sni                      |
| vip                      |
| vrrp_auth_method         |
| vrrp_group               |
+--------------------------+
30 rows in set (0.00 sec)

MariaDB [octavia]> select * from listener \G
*************************** 1. row ***************************
         project_id: 522130f5bf8847db8b86eb05d6493cc7
                 id: 4a8f5b7a-b376-4dca-8381-1ac40458b873
               name: listener1
        description: NULL
           protocol: TERMINATED_HTTPS
      protocol_port: 443
   connection_limit: -1
   load_balancer_id: 134940e3-2efe-4f44-96b8-7682cabb700e
 tls_certificate_id: NULL
    default_pool_id: NULL
provisioning_status: PENDING_UPDATE
   operating_status: ONLINE
            enabled: 1
          peer_port: 1025
     insert_headers: NULL
         created_at: 2019-05-21 14:02:40
         updated_at: 2019-05-21 14:03:52
1 row in set (0.00 sec)

MariaDB [octavia]> select * from load_balancer \G
*************************** 1. row ***************************
         project_id: 522130f5bf8847db8b86eb05d6493cc7
                 id: 134940e3-2efe-4f44-96b8-7682cabb700e
               name: lb1
        description: NULL
provisioning_status: PENDING_UPDATE
   operating_status: ONLINE
            enabled: 1
           topology: SINGLE
    server_group_id: NULL
         created_at: 2019-05-21 13:49:54
         updated_at: 2019-05-21 14:03:52
1 row in set (0.01 sec)

MariaDB [octavia]> update listener set provisioning_status = 'ACTIVE' where id = '4a8f5b7a-b376-4dca-8381-1ac40458b873';
Query OK, 1 row affected (0.13 sec)
Rows matched: 1 Changed: 1 Warnings: 0

MariaDB [octavia]> update loadbalancer set provisioning_status = 'ACTIVE' where id = '134940e3-2efe-4f44-96b8-7682cabb700e';
ERROR 1146 (42S02): Table 'octavia.loadbalancer' doesn't exist
MariaDB [octavia]> update load_balancer set provisioning_status = 'ACTIVE' where id = '134940e3-2efe-4f44-96b8-7682cabb700e';
Query OK, 1 row affected (0.03 sec)
Rows matched: 1 Changed: 1 Warnings: 0

MariaDB [octavia]>
~~~

Now, I can delete the listener and loadbalancer:
~~~
(overcloud) [stack@undercloud-1 ~]$ openstack loadbalancer list
+--------------------------------------+------+----------------------------------+-------------+---------------------+----------+
| id                                   | name | project_id                       | vip_address | provisioning_status | provider |
+--------------------------------------+------+----------------------------------+-------------+---------------------+----------+
| 134940e3-2efe-4f44-96b8-7682cabb700e | lb1  | 522130f5bf8847db8b86eb05d6493cc7 | 10.0.0.107  | ACTIVE              | octavia  |
+--------------------------------------+------+----------------------------------+-------------+---------------------+----------+
(overcloud) [stack@undercloud-1 ~]$ openstack loadbalancer listener delete 4a8f5b7a-b376-4dca-8381-1ac40458b873
(overcloud) [stack@undercloud-1 ~]$ openstack loadbalancer delete 134940e3-2efe-4f44-96b8-7682cabb700e
(overcloud) [stack@undercloud-1 ~]$ openstack loadbalancer list

(overcloud) [stack@undercloud-1 ~]$ nova list --all
+--------------------------------------+--------------+----------------------------------+--------+------------+-------------+----------------------------------------------------------------------+
| ID                                   | Name         | Tenant ID                        | Status | Task State | Power State | Networks                                                             |
+--------------------------------------+--------------+----------------------------------+--------+------------+-------------+----------------------------------------------------------------------+
| 0d7c70f5-fbac-4a0d-974e-8e1358ce93dc | cirros-test1 | 522130f5bf8847db8b86eb05d6493cc7 | ACTIVE | -          | Running     | private=192.168.0.21, 2000:192:168:1:f816:3eff:fe0a:b7f3, 10.0.0.111 |
| 01640259-acbf-40dc-97e9-f5f432cf14c6 | rhel-test1   | 522130f5bf8847db8b86eb05d6493cc7 | ACTIVE | -          | Running     | private=192.168.0.9, 2000:192:168:1:f816:3eff:fe34:b706, 10.0.0.119  |
+--------------------------------------+--------------+----------------------------------+--------+------------+-------------+----------------------------------------------------------------------+
(overcloud) [stack@undercloud-1 ~]$ openstack loadbalancer amphora list

(overcloud) [stack@undercloud-1 ~]$
~~~
Also, I'd like to know whether I can run this database manipulation during a remote session with the customer. I don't really see why it would cause issues, and it looks to me like the only workable way to get rid of the failed load balancer.

Thanks,
Andreas
Can we get the SOS report? Yes, a path forward would be to set the provisioning status to ERROR in the database, then delete the resource and attempt to recreate it or trigger a failover.
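The suggestion above (set the provisioning status to ERROR, then delete) could be prepared along these lines. This is only a sketch, not a supported procedure: `build_error_update` is a made-up helper name, the table and column names are the ones shown in the MariaDB session earlier in this report, and the function only prints the statement so that it can be reviewed before anything touches the database.

```shell
#!/bin/sh
# Hypothetical helper: print (do not execute) the SQL that would move a stuck
# Octavia resource to ERROR. Table/column names come from the MariaDB output
# in this report; the function name is an assumption for illustration.
build_error_update() {
    table=$1
    id=$2

    # Only the two tables touched in this report are allowed.
    case "$table" in
        listener|load_balancer) ;;
        *) echo "unsupported table: $table" >&2; return 1 ;;
    esac

    # Accept only a canonical lowercase UUID. The length check keeps
    # multi-line input from slipping past the line-based grep.
    if [ "${#id}" -ne 36 ] || ! printf '%s\n' "$id" |
        grep -Eq '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'; then
        echo "not a UUID: $id" >&2
        return 1
    fi

    # Only rows that are still in a PENDING_* state are changed.
    printf "UPDATE %s SET provisioning_status = 'ERROR' WHERE id = '%s' AND provisioning_status LIKE 'PENDING_%%';\n" "$table" "$id"
}
```

For the resources in this report, `build_error_update listener 4a8f5b7a-b376-4dca-8381-1ac40458b873` and `build_error_update load_balancer 134940e3-2efe-4f44-96b8-7682cabb700e` would print the two statements, which could then be pasted into the `mysql octavia` session inside the galera container, followed by the usual `openstack loadbalancer listener delete` and `openstack loadbalancer delete`.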
Andreas, the Common Name needs to be set on the certificate. Patch https://review.opendev.org/#/c/667200/ will validate the content of certificates at the API level, so invalid ones will be rejected.
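Until that validation is in place, the Common Name could be checked locally before the certificate is stored in Barbican. The following is a minimal sketch: `subject_has_cn` is a hypothetical helper that inspects the line printed by `openssl x509 -noout -subject`, and it assumes one of the two subject layouts that OpenSSL commonly prints (slash-separated or comma-separated).

```shell
#!/bin/sh
# Hypothetical helper: succeed if an openssl "subject" line has a non-empty CN.
# Assumed input layouts:
#   subject= /C=US/O=Example/CN=host     (older OpenSSL)
#   subject=C = US, O = Example, CN = host  (newer OpenSSL)
subject_has_cn() {
    printf '%s\n' "$1" | grep -Eq '(^subject= ?|[/,] ?)CN ?= ?[^ ,/]'
}

# Example use against the certificate from the reproducer (not run here):
#   subject_has_cn "$(openssl x509 -in server.crt -noout -subject)" ||
#       echo "server.crt has no Common Name" >&2
```

When generating the test certificate, passing `-subj "/CN=lb.example.com"` to `openssl req` sets the Common Name non-interactively; the host name here is a placeholder.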
The bug was fixed in stable/queens upstream.
Reopening: this is not about invalid certificates, but about the fact that if we end up in an invalid state due to https://bugzilla.redhat.com/show_bug.cgi?id=1693808, listeners are in PENDING_UPDATE and can then never be deleted. I just ran into this again while troubleshooting another environment affected by https://bugzilla.redhat.com/show_bug.cgi?id=1693808.

The point is, we will end up in these situations again.

Here's the entire sequence that leads to this problem, pre-1693808:

Reproduce the same issue as in https://bugzilla.redhat.com/show_bug.cgi?id=1693808:
~~~
openssl req -new -newkey rsa:4096 -x509 -sha256 -days 3650 -nodes -out server.crt -keyout server.key
openssl pkcs12 -export -inkey server.key -in server.crt -passout pass: -out server.p12
openstack secret store --name='tls_secret1' -t 'application/octet-stream' -e 'base64' --payload="$(base64 < server.p12)"
openstack acl user add -u ca8ac3feedc64c26bde24ef404586422 $(openstack secret list | awk '/ tls_secret1 / {print $2}')
openstack loadbalancer create --name lb1 --vip-subnet-id provider1-subnet
sleep 300
openstack loadbalancer listener create --protocol-port 443 --protocol TERMINATED_HTTPS --name listener1 --default-tls-container=$(openstack secret list | awk '/ tls_secret1 / {print $2}') lb1
~~~

Error message as in 1693808:
~~~
Could not retrieve certificate: ['http://10.0.0.15:9311/v1/secrets/7558e3a1-ffc1-43a1-9506-e625be67f038'] (HTTP 400) (Request-ID: req-c6dddf4c-cab9-4950-a791-78e45641ff08)
(overcloud) [stack@undercloud-1 octavia_keys]$
~~~

Given that the above doesn't work, we create a loadbalancer listener *without* the default-tls-container as a workaround:
~~~
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer listener create --protocol-port 443 --protocol TERMINATED_HTTPS --name listener1 lb1
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| connection_limit          | -1                                   |
| created_at                | 2019-05-21T14:02:40                  |
| default_pool_id           | None                                 |
| default_tls_container_ref | None                                 |
| description               |                                      |
| id                        | 4a8f5b7a-b376-4dca-8381-1ac40458b873 |
| insert_headers            | None                                 |
| l7policies                |                                      |
| loadbalancers             | 134940e3-2efe-4f44-96b8-7682cabb700e |
| name                      | listener1                            |
| operating_status          | OFFLINE                              |
| project_id                | 522130f5bf8847db8b86eb05d6493cc7     |
| protocol                  | TERMINATED_HTTPS                     |
| protocol_port             | 443                                  |
| provisioning_status       | PENDING_CREATE                       |
| sni_container_refs        | []                                   |
| updated_at                | None                                 |
+---------------------------+--------------------------------------+
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer listener list
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
| id                                   | default_pool_id | name      | project_id                       | protocol         | protocol_port | admin_state_up |
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
| 4a8f5b7a-b376-4dca-8381-1ac40458b873 | None            | listener1 | 522130f5bf8847db8b86eb05d6493cc7 | TERMINATED_HTTPS | 443           | True           |
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
~~~

We now set the default-tls-container:
~~~
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer listener set --default-tls-container-ref $(openstack secret list | awk '/ tls_secret1 / {print $2}') listener1
Could not retrieve certificate: ['http://10.0.0.15:9311/v1/secrets/7558e3a1-ffc1-43a1-9506-e625be67f038'] (HTTP 400) (Request-ID: req-892836f3-6926-42dc-a463-78813888ac4f)
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer list
+--------------------------------------+------+----------------------------------+-------------+---------------------+----------+
| id                                   | name | project_id                       | vip_address | provisioning_status | provider |
+--------------------------------------+------+----------------------------------+-------------+---------------------+----------+
| 134940e3-2efe-4f44-96b8-7682cabb700e | lb1  | 522130f5bf8847db8b86eb05d6493cc7 | 10.0.0.107  | PENDING_UPDATE      | octavia  |
+--------------------------------------+------+----------------------------------+-------------+---------------------+----------+
(overcloud) [stack@undercloud-1 octavia_keys]$ openstack loadbalancer listener list
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
| id                                   | default_pool_id | name      | project_id                       | protocol         | protocol_port | admin_state_up |
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
| 4a8f5b7a-b376-4dca-8381-1ac40458b873 | None            | listener1 | 522130f5bf8847db8b86eb05d6493cc7 | TERMINATED_HTTPS | 443           | True           |
+--------------------------------------+-----------------+-----------+----------------------------------+------------------+---------------+----------------+
(overcloud) [stack@undercloud-1 octavia_keys]$
~~~

And we are stuck in PENDING_UPDATE. These resources can now *NEVER* be deleted via CLI and are stuck there forever. The only way to fix this is to manipulate the database: https://access.redhat.com/solutions/4251821
Have you tried the patch in your environment? If not, please give it a try. If yes, please share Octavia service logs (API and worker).
Carlos, the patch (either one) is fine. Certificate validation is perfect; I know that's needed. And 1693808 is fixed as well. I just know that, down the road, we will have another bug that leads to the same situation and/or symptoms, and then users will have no way of deleting load balancers and will have to go into the database. This is the purpose of this request. The patches you mentioned don't fix the problem that I'm reporting. They prevent it from happening.
Got it. Just to reinforce: Octavia resources should never get stuck in PENDING_* unless something or someone, for example, runs kill -9 on an Octavia controller service (worker, health manager or housekeeping) or reboots a node that holds the lock on the resource and is working on it. Should one find resources in PENDING_* without external interference, that is considered a bug in Octavia and should be addressed as it was in the patch attached to this BZ. The upstream community is working "to move to TaskFlow jobboard / using the persistence and resumption capability". This is a complex feature. Its development started during the Train cycle. We expect it to be available in the U cycle (experimental).
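To tell a transient PENDING_* state from a stuck one, the status could be polled with a bounded number of attempts. This is a rough sketch only: `get_lb_status` and `wait_for_lb` are made-up helper names, and the default of 30 polls at 10-second intervals is an arbitrary assumption, not an Octavia-documented timeout.

```shell
#!/bin/sh
# Print the bare provisioning status of a load balancer.
# "-f value -c <column>" are the standard openstack client output options.
get_lb_status() {
    openstack loadbalancer show "$1" -f value -c provisioning_status
}

# Hypothetical helper: poll until the load balancer leaves PENDING_*.
# Usage: wait_for_lb <name-or-id> [max_tries] [delay_seconds]
# Returns 0 and prints the status once it is no longer PENDING_*,
# 1 if it is still PENDING_* after max_tries polls,
# 2 if the status could not be queried.
wait_for_lb() {
    lb=$1
    max_tries=${2:-30}
    delay=${3:-10}
    tries=0
    status=unknown
    while [ "$tries" -lt "$max_tries" ]; do
        status=$(get_lb_status "$lb") || return 2
        case "$status" in
            PENDING_*) ;;                      # still being worked on (or stuck)
            *) echo "$status"; return 0 ;;     # e.g. ACTIVE or ERROR
        esac
        tries=$((tries + 1))
        sleep "$delay"
    done
    echo "$lb is still $status after $max_tries polls" >&2
    return 1
}
```

A return code of 1 from `wait_for_lb lb1` would then be the signal to collect the Octavia API and worker logs, since a resource that stays in PENDING_* without external interference is considered a bug.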
Hi Carlos, I understand what you mean, and you can go ahead and close this BZ if you wish. I just wanted to make my point that once in this PENDING_UPDATE state, the only option is to go into the DB. I agree that it's difficult to fix or work on this when the system is never meant to be in this state. But I wish there were some way for Octavia to "auto heal", that is, to go into ERROR instead of looping in PENDING_UPDATE forever.

- Andreas
Carlos, I see that your patch [1] is not relevant to the deletion of pending load balancers. What needs to be verified in this bug? Does this bug need to be verified manually, or do [2] and [3] cover this case?

[1] https://review.opendev.org/#/c/667200/
[2] https://review.opendev.org/#/c/667200/4/octavia/tests/functional/api/v2/test_listener.py
[3] https://review.opendev.org/#/c/667200/4/octavia/tests/functional/api/v2/test_load_balancer.py
The functional tests do not cover the bug that was fixed. We need to test creating a TERMINATED_HTTPS listener with an invalid certificate (e.g. Common Name value unset). The API should reject it and return an error. See the story associated with the patch for more details.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHSA-2019:3788