Bug 1620169
| Summary: | Unable to delete a lb stuck in PENDING_UPDATE status | | |
|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Udi Shkalim <ushkalim> |
| Component: | openstack-octavia | Assignee: | Carlos Goncalves <cgoncalves> |
| Status: | CLOSED CURRENTRELEASE | QA Contact: | Alexander Stafeyev <astafeye> |
| Severity: | urgent | Docs Contact: | |
| Priority: | medium | | |
| Version: | 13.0 (Queens) | CC: | amuller, bcafarel, cgoncalves, ihrachys, juriarte, lpeer, majopela, nyechiel, oblaut, ushkalim |
| Target Milestone: | --- | Keywords: | Reopened, Triaged, ZStream |
| Target Release: | --- | | |
| Hardware: | Unspecified | | |
| OS: | Unspecified | | |
| Whiteboard: | | | |
| Fixed In Version: | | Doc Type: | If docs needed, set a value |
| Doc Text: | | Story Points: | --- |
| Clone Of: | | Environment: | |
| Last Closed: | 2018-11-20 09:06:11 UTC | Type: | Bug |
| Regression: | --- | Mount Type: | --- |
| Documentation: | --- | CRM: | |
| Verified Versions: | | Category: | --- |
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | |
| Cloudforms Team: | --- | Target Upstream Version: | |
| Embargoed: | | | |
| Attachments: | | | |

First, there's something odd in your output. You have two LBs created:

1. LB 98806254-ffde-4b13-8bc7-e3c74ee36cf7 with amphora b0b3a145-e836-406c-b023-6298e79785f7
2. LB 274c97f8-81c0-4118-b8ea-440c38c4ecf7 with no amphora, although the LB is ACTIVE.

Did you collect the output of "openstack loadbalancer amphora list" after deleting LB 274c97f8-81c0-4118-b8ea-440c38c4ecf7?

The logs seem to have been truncated. I cannot find references to either of them other than a bunch of "WARNING octavia.controller.healthmanager.health_manager [-] Load balancer 98806254-ffde-4b13-8bc7-e3c74ee36cf7 is in immutable state PENDING_UPDATE. Skipping failover" in /var/log/containers/octavia/health-manager.log

I see other errors in logs:

./var/log/containers/octavia/health-manager.log:2018-08-22 11:50:08.837 24 ERROR octavia.controller.worker.controller_worker AddrFormatError: failed to detect a valid IP address from None
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1577976

2018-08-22 11:58:17.627 23 INFO octavia.controller.queue.endpoint [-] Deleting load balancer '997e8f22-182a-4285-91d7-7f8aa4d76e1f'...
[...]
2018-08-22 11:58:39.276 23 ERROR octavia.network.drivers.neutron.allowed_address_pairs [-] All attempts to remove security group 03fa87fc-8a9d-4ef8-b51c-3a1f7384a52f have failed.: Conflict: Security Group 03fa87fc-8a9d-4ef8-b51c-3a1f7384a52f in use.
Fixed, backported to stable/queens and released in Octavia 2.0.2 (should be in OSP13z3).

I suspect the issue you are trying to report was caused by this issue. Please retry and share logs covering everything from LB creation to the LB delete attempt.

If you think it is a duplicate, we can close it as one. If the problem appears again, we can re-open it.

The problem happened again. This is a blocker. In order to recover from this state, we need to re-install the undercloud and then deploy a new overcloud.

The current release should provide all the fixes that allow deleting load balancers in ERROR state. Please feel free to reopen if you observe the issue again.
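
When the logs are large, the "immutable state ... Skipping failover" warning quoted above can be counted per load balancer to see which LBs the health manager keeps skipping. The following is a minimal Python sketch, not part of Octavia: the regular expression is written only from the single warning line quoted in this bug, and the function name stuck_load_balancers is hypothetical.

```python
import re

# Assumption: the pattern mirrors the one health-manager warning quoted in
# this bug; other Octavia versions may word the message differently.
IMMUTABLE_WARNING = re.compile(
    r"Load balancer (?P<lb_id>[0-9a-f-]{36}) is in immutable state "
    r"(?P<state>[A-Z_]+)\. Skipping failover"
)


def stuck_load_balancers(log_lines):
    """Return {(lb_id, state): count} for every skipped-failover warning."""
    counts = {}
    for line in log_lines:
        match = IMMUTABLE_WARNING.search(line)
        if match:
            key = (match.group("lb_id"), match.group("state"))
            counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    sample = [
        "WARNING octavia.controller.healthmanager.health_manager [-] "
        "Load balancer 98806254-ffde-4b13-8bc7-e3c74ee36cf7 is in "
        "immutable state PENDING_UPDATE. Skipping failover",
    ]
    for (lb_id, state), count in stuck_load_balancers(sample).items():
        print(lb_id, state, count)
```

In practice the function could be fed the lines of /var/log/containers/octavia/health-manager.log; a high count for one LB suggests it never left the PENDING state.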
Created attachment 1477911: Compute sosreport

Description of problem:
LB got stuck in PENDING_UPDATE status and cannot be deleted:

(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer list
+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+
| id                                   | name                                           | project_id                       | vip_address   | provisioning_status | provider |
+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+
| 274c97f8-81c0-4118-b8ea-440c38c4ecf7 | openshift-ansible-openshift.example.com-api-lb | a4666e8598fd4298982ad13bd0a1d371 | 172.30.0.1    | ACTIVE              | octavia  |
| 98806254-ffde-4b13-8bc7-e3c74ee36cf7 | openshift-web-console/webconsole               | a4666e8598fd4298982ad13bd0a1d371 | 172.30.92.248 | PENDING_UPDATE      | octavia  |
+--------------------------------------+------------------------------------------------+----------------------------------+---------------+---------------------+----------+
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer delete openshift-web-console/webconsole
Validation failure: Cannot delete Load Balancer 98806254-ffde-4b13-8bc7-e3c74ee36cf7 - it has children (HTTP 400) (Request-ID: req-9a1863b8-5201-453c-bf1b-34709ab599b5)
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer delete openshift-web-console/webconsole --cascade
Invalid state PENDING_UPDATE of loadbalancer resource 98806254-ffde-4b13-8bc7-e3c74ee36cf7 (HTTP 409) (Request-ID: req-ac35548f-04cf-4489-9a08-ecc9fb40c111)
(overcloud) [stack@undercloud-0 ~]$ openstack loadbalancer amphora list
+--------------------------------------+--------------------------------------+--------+------------+---------------+---------------+
| id                                   | loadbalancer_id                      | status | role       | lb_network_ip | ha_ip         |
+--------------------------------------+--------------------------------------+--------+------------+---------------+---------------+
| b0b3a145-e836-406c-b023-6298e79785f7 | 98806254-ffde-4b13-8bc7-e3c74ee36cf7 | ERROR  | STANDALONE | 172.24.0.8    | 172.30.92.248 |
+--------------------------------------+--------------------------------------+--------+------------+---------------+---------------+

SOS reports attached.

Version-Release number of selected component (if applicable):
openstack-octavia-health-manager-2.0.1-6.d137eaagit.el7ost.noarch
puppet-octavia-12.4.0-2.el7ost.noarch
openstack-octavia-common-2.0.1-6.d137eaagit.el7ost.noarch
python-octavia-2.0.1-6.d137eaagit.el7ost.noarch
openstack-octavia-api-2.0.1-6.d137eaagit.el7ost.noarch
python2-octaviaclient-1.4.0-1.el7ost.noarch
openstack-octavia-housekeeping-2.0.1-6.d137eaagit.el7ost.noarch
openstack-octavia-worker-2.0.1-6.d137eaagit.el7ost.noarch
puppet-octavia-12.4.0-2.el7ost.noarch
python2-octaviaclient-1.4.0-1.el7ost.noarch
octavia-amphora-image-x86_64-13.0-20180808.1.el7ost.noarch

How reproducible:

Steps to Reproduce:
1. Deployed a LB
2. LB got stuck
3. Failure to delete

Actual results:
Failed to delete

Expected results:
Delete is possible.

Additional info:
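
The HTTP 409 above shows that the API rejects a cascade delete while the LB is in PENDING_UPDATE. As a triage aid, the table printed by "openstack loadbalancer list" can be parsed to flag such LBs before a delete is attempted. This is a rough Python sketch under stated assumptions: it treats any provisioning_status starting with "PENDING_" as immutable (only PENDING_UPDATE is confirmed by this report), and the helper names parse_cli_table and immutable_load_balancers are hypothetical.

```python
# Assumption: every PENDING_* provisioning status is immutable; this report
# only demonstrates the behaviour for PENDING_UPDATE.
PENDING_PREFIX = "PENDING_"


def parse_cli_table(text):
    """Parse the ASCII table printed by the openstack CLI into dicts."""
    header = None
    rows = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # skip borders, shell prompts and blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if header is None:
            header = cells  # the first data line holds the column names
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def immutable_load_balancers(rows):
    """Return the rows whose provisioning_status is a PENDING_* state."""
    return [
        row for row in rows
        if row.get("provisioning_status", "").startswith(PENDING_PREFIX)
    ]


if __name__ == "__main__":
    sample = """
+----+------+---------------------+
| id | name | provisioning_status |
+----+------+---------------------+
| a1 | lb-a | ACTIVE              |
| b2 | lb-b | PENDING_UPDATE      |
+----+------+---------------------+
"""
    for row in immutable_load_balancers(parse_cli_table(sample)):
        print(row["id"], row["name"], row["provisioning_status"])
```

The sample table uses shortened placeholder IDs for readability; the same parser applies to the full output shown in the description. Parsing the human-readable table is fragile, so this is meant for quick inspection of pasted output rather than automation.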