Note: This bug is displayed in read-only format because the product is no longer active in Red Hat Bugzilla.

Bug 1620169

Summary: Unable to delete a lb stuck in PENDING_UPDATE status
Product: Red Hat OpenStack
Reporter: Udi Shkalim <ushkalim>
Component: openstack-octavia
Assignee: Carlos Goncalves <cgoncalves>
Status: CLOSED CURRENTRELEASE
QA Contact: Alexander Stafeyev <astafeye>
Severity: urgent
Docs Contact:
Priority: medium
Version: 13.0 (Queens)
CC: amuller, bcafarel, cgoncalves, ihrachys, juriarte, lpeer, majopela, nyechiel, oblaut, ushkalim
Target Milestone: ---
Keywords: Reopened, Triaged, ZStream
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Whiteboard:
Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of:
Environment:
Last Closed: 2018-11-20 09:06:11 UTC
Type: Bug
Regression: ---
Mount Type: ---
Documentation: ---
CRM:
Verified Versions:
Category: ---
oVirt Team: ---
RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: ---
Target Upstream Version:
Embargoed:
Attachments:
Description Flags
Compute sosreport none

Description Udi Shkalim 2018-08-22 15:12:54 UTC
Created attachment 1477911
Compute sosreport

Description of problem:
LB got stuck in PENDING_UPDATE status and cannot be deleted:
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer list
+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+
| id                                   | name                                           | project_id                       | vip_address   | provisioning_status | provider |
+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+
| 274c97f8-81c0-4118-b8ea-440c38c4ecf7 | openshift-ansible-openshift.example.com-api-lb | a4666e8598fd4298982ad13bd0a1d371 | 172.30.0.1    | ACTIVE              | octavia  |
| 98806254-ffde-4b13-8bc7-e3c74ee36cf7 | openshift-web-console/webconsole               | a4666e8598fd4298982ad13bd0a1d371 | 172.30.92.248 | PENDING_UPDATE      | octavia  |
+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer delete openshift-web-console/webconsole
Validation failure: Cannot delete Load Balancer 98806254-ffde-4b13-8bc7-e3c74ee36cf7 - it has children (HTTP 400) (Request-ID: req-9a1863b8-5201-453c-bf1b-34709ab599b5)
 
 
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer delete openshift-web-console/webconsole --cascade
Invalid state PENDING_UPDATE of loadbalancer resource 98806254-ffde-4b13-8bc7-e3c74ee36cf7 (HTTP 409) (Request-ID: req-ac35548f-04cf-4489-9a08-ecc9fb40c111)

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list
+--------------------------------------+--------------------------------------+--------+------------+---------------+---------------+
| id                                   | loadbalancer_id                      | status | role       | lb_network_ip | ha_ip         |
+--------------------------------------+--------------------------------------+--------+------------+---------------+---------------+
| b0b3a145-e836-406c-b023-6298e79785f7 | 98806254-ffde-4b13-8bc7-e3c74ee36cf7 | ERROR  | STANDALONE | 172.24.0.8    | 172.30.92.248 |
+--------------------------------------+--------------------------------------+--------+------------+---------------+---------------+

SOS reports attached.
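
For reference, the 409 above is consistent with Octavia treating PENDING_* provisioning states as immutable. Below is a minimal Python sketch, assuming the ASCII table format printed by "openstack loadbalancer list" as shown above, that picks out the load balancers currently in such a state. The helper names are hypothetical and are not part of any OpenStack client.

```python
# Sketch only: parse_lb_table, is_pending and stuck_load_balancers are
# hypothetical helper names, not functions from python-octaviaclient.

def parse_lb_table(text):
    """Parse a CLI ASCII table (rows starting with '|') into a list of dicts."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip border lines such as +-----+-----+
        # Drop the outer pipes, then split the remaining cells.
        rows.append([cell.strip() for cell in line[1:-1].split("|")])
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


def is_pending(provisioning_status):
    """True for PENDING_CREATE, PENDING_UPDATE and PENDING_DELETE."""
    return provisioning_status.upper().startswith("PENDING_")


def stuck_load_balancers(text):
    """Return the rows whose provisioning_status is a PENDING_* state."""
    return [lb for lb in parse_lb_table(text)
            if is_pending(lb["provisioning_status"])]
```

Feeding it the captured output of "openstack loadbalancer list" returns one dict per load balancer that the API will currently refuse to modify or delete. Whether a load balancer is truly stuck, rather than briefly in transition, still has to be judged from how long it stays in that state.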

Version-Release number of selected component (if applicable):
openstack-octavia-health-manager-2.0.1-6.d137eaagit.el7ost.noarch
puppet-octavia-12.4.0-2.el7ost.noarch
openstack-octavia-common-2.0.1-6.d137eaagit.el7ost.noarch
python-octavia-2.0.1-6.d137eaagit.el7ost.noarch
openstack-octavia-api-2.0.1-6.d137eaagit.el7ost.noarch
python2-octaviaclient-1.4.0-1.el7ost.noarch
openstack-octavia-housekeeping-2.0.1-6.d137eaagit.el7ost.noarch
openstack-octavia-worker-2.0.1-6.d137eaagit.el7ost.noarch
octavia-amphora-image-x86_64-13.0-20180808.1.el7ost.noarch


How reproducible:


Steps to Reproduce:
1. Deploy an LB.
2. The LB gets stuck in PENDING_UPDATE.
3. Try to delete the LB; the delete fails.

Actual results:
Failed to delete

Expected results:
Delete is possible.

Additional info:

Comment 3 Carlos Goncalves 2018-09-08 20:03:14 UTC
First, there's something odd in your output. You have two LBs created:

1. LB 98806254-ffde-4b13-8bc7-e3c74ee36cf7 with amphora b0b3a145-e836-406c-b023-6298e79785f7
2. LB 274c97f8-81c0-4118-b8ea-440c38c4ecf7 with no amphora, although the LB is ACTIVE.

Did you collect the output of "openstack loadbalancer amphora list" after deleting LB 274c97f8-81c0-4118-b8ea-440c38c4ecf7?


The logs seem to have been truncated. I cannot find references to either of them other than a bunch of "WARNING octavia.controller.healthmanager.health_manager [-] Load balancer 98806254-ffde-4b13-8bc7-e3c74ee36cf7 is in immutable state PENDING_UPDATE. Skipping failover" in /var/log/containers/octavia/health-manager.log 
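
To see how often the health manager skipped a failover for a given LB, the warnings quoted above can be counted per load balancer. This is a minimal Python sketch that assumes the log lines have exactly the wording quoted above; the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the health-manager warning quoted above. The pattern is an
# assumption based on that single message format.
IMMUTABLE_RE = re.compile(
    r"Load balancer (?P<lb_id>[0-9a-f-]{36}) "
    r"is in immutable state (?P<state>\w+)\. Skipping failover"
)


def count_skipped_failovers(lines):
    """Count 'immutable state ... Skipping failover' warnings per (LB id, state)."""
    counts = Counter()
    for line in lines:
        match = IMMUTABLE_RE.search(line)
        if match:
            counts[(match.group("lb_id"), match.group("state"))] += 1
    return counts
```

It can be run over an open log file, for example "with open(path) as f: count_skipped_failovers(f)", where path points at the health-manager.log mentioned above.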


I see other errors in logs:

./var/log/containers/octavia/health-manager.log:2018-08-22 11:50:08.837 24 ERROR octavia.controller.worker.controller_worker AddrFormatError: failed to detect a valid IP address from None

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1577976



2018-08-22 11:58:17.627 23 INFO octavia.controller.queue.endpoint [-] Deleting load balancer '997e8f22-182a-4285-91d7-7f8aa4d76e1f'...
[...]
2018-08-22 11:58:39.276 23 ERROR octavia.network.drivers.neutron.allowed_address_pairs [-] All attempts to remove security group 03fa87fc-8a9d-4ef8-b51c-3a1f7384a52f have failed.: Conflict: Security Group 03fa87fc-8a9d-4ef8-b51c-3a1f7384a52f in use.

This is fixed, backported to stable/queens, and released in Octavia 2.0.2 (should be in OSP13z3). I suspect the problem you are reporting was caused by this issue. Please retry and share logs that cover everything from LB creation to the delete attempt.

Comment 4 Udi Shkalim 2018-09-12 13:11:24 UTC
If you think it is a duplicate, we can close it as one.
If the problem appears again, we can reopen it.

Comment 5 Udi Shkalim 2018-10-17 12:45:51 UTC
The problem happened again. This is a blocker.
In order to recover from this state, we need to re-install the undercloud and then deploy a new overcloud.

Comment 8 Carlos Goncalves 2018-11-20 09:06:11 UTC
The current release should provide all the fixes that allow deleting load balancers in ERROR state. Please feel free to reopen if you observe the issue again.