Bug 1412744

Summary: Load balancer v2 in PENDING* state cannot be deleted
Product: Red Hat OpenStack
Reporter: Ondrej <ochalups>
Component: openstack-neutron-lbaas
Assignee: Nir Magnezi <nmagnezi>
Status: CLOSED WONTFIX
QA Contact: Toni Freger <tfreger>
Severity: medium
Priority: medium
Version: 10.0 (Newton)
CC: amuller, apevec, giondog, lhh, nyechiel, ochalups
Hardware: Unspecified
OS: Unspecified
Last Closed: 2017-09-25 12:51:02 UTC
Type: Bug

Description Ondrej 2017-01-12 16:36:59 UTC
Description of problem:
Load balancer creation initially failed because the LBaaS agent was misconfigured. After the agent was fixed, the load balancer could not be deleted; the delete failed with an
"Invalid state PENDING_*" error.
We tried to work around this by setting the load balancer's provisioning_status
to ACTIVE directly in the database, but the delete then failed with an "Unable to find
loadbalancer(s) with id(s) 'id'" error and the load balancer ended up in PENDING_DELETE status.


# neutron lbaas-loadbalancer-list
+--------------------------------------+-----------------+--------------+---------------------+----------+
| id                                   | name            | vip_address  | provisioning_status | provider |
+--------------------------------------+-----------------+--------------+---------------------+----------+
| 27714a3d-0e6a-411d-9c64-d5d4980eb8fa | lb1             | 10.100.41.13 | ACTIVE              | haproxy  |
| 96f901de-0e21-46e4-a21b-7250c75aadda | Load Balancer 2 | 10.100.41.20 | ACTIVE              | haproxy  |
| c0967c7e-af39-40aa-87de-62496aa4b95c | Load Balancer 1 | 10.100.41.98 | PENDING_DELETE      | haproxy  |
| d8a14947-07c7-4540-8a13-1421b2ca0dce | Load Balancer 1 | 10.100.41.95 | PENDING_DELETE      | haproxy  |
+--------------------------------------+-----------------+--------------+---------------------+----------+

# mysql -u root ovs_neutron -e "update lbaas_loadbalancers set provisioning_status='ACTIVE' where id='d8a14947-07c7-4540-8a13-1421b2ca0dce';"

# neutron lbaas-loadbalancer-list
+--------------------------------------+-----------------+--------------+---------------------+----------+
| id                                   | name            | vip_address  | provisioning_status | provider |
+--------------------------------------+-----------------+--------------+---------------------+----------+
| 27714a3d-0e6a-411d-9c64-d5d4980eb8fa | lb1             | 10.100.41.13 | ACTIVE              | haproxy  |
| 96f901de-0e21-46e4-a21b-7250c75aadda | Load Balancer 2 | 10.100.41.20 | ACTIVE              | haproxy  |
| c0967c7e-af39-40aa-87de-62496aa4b95c | Load Balancer 1 | 10.100.41.98 | PENDING_DELETE      | haproxy  |
| d8a14947-07c7-4540-8a13-1421b2ca0dce | Load Balancer 1 | 10.100.41.95 | ACTIVE              | haproxy  |
+--------------------------------------+-----------------+--------------+---------------------+----------+

# neutron lbaas-loadbalancer-delete d8a14947-07c7-4540-8a13-1421b2ca0dce
Unable to find loadbalancer(s) with id(s) 'd8a14947-07c7-4540-8a13-1421b2ca0dce'
# neutron lbaas-loadbalancer-delete d8a14947-07c7-4540-8a13-1421b2ca0dce
Invalid state PENDING_DELETE of loadbalancer resource d8a14947-07c7-4540-8a13-1421b2ca0dce
Neutron server returns request_ids: ['req-b8f08392-b1e3-481c-989a-d1ad1ef9fbfb']
# neutron lbaas-loadbalancer-list
+--------------------------------------+-----------------+--------------+---------------------+----------+
| id                                   | name            | vip_address  | provisioning_status | provider |
+--------------------------------------+-----------------+--------------+---------------------+----------+
| 27714a3d-0e6a-411d-9c64-d5d4980eb8fa | lb1             | 10.100.41.13 | ACTIVE              | haproxy  |
| 96f901de-0e21-46e4-a21b-7250c75aadda | Load Balancer 2 | 10.100.41.20 | ACTIVE              | haproxy  |
| c0967c7e-af39-40aa-87de-62496aa4b95c | Load Balancer 1 | 10.100.41.98 | PENDING_DELETE      | haproxy  |
| d8a14947-07c7-4540-8a13-1421b2ca0dce | Load Balancer 1 | 10.100.41.95 | PENDING_DELETE      | haproxy  |
+--------------------------------------+-----------------+--------------+---------------------+----------+

The logs show a client conflict (HTTP 409):
2017-01-12 16:34:44.251 599852 INFO neutron.api.v2.resource [req-cdff930e-69da-46f6-9e95-baf923dca3ff 88f19bef2342447f90ad207abb5718ec b38cb8fea06540a7bb3bff9f89edb4e2 - - -] delete failed (client error): There was a conflict when trying to complete your request.
2017-01-12 16:34:44.254 599852 INFO neutron.wsgi [req-cdff930e-69da-46f6-9e95-baf923dca3ff 88f19bef2342447f90ad207abb5718ec b38cb8fea06540a7bb3bff9f89edb4e2 - - -] 10.100.54.14 - - [12/Jan/2017 16:34:44] "DELETE /v2.0/lbaas/loadbalancers/d8a14947-07c7-4540-8a13-1421b2ca0dce.json HTTP/1.1" 409 342 0.822013

[1] https://bugs.launchpad.net/octavia/+bug/1498130

Version-Release number of selected component (if applicable):
openstack-neutron-lbaas-9.1.0-1.el7ost.noarch
python-neutron-lbaas-9.1.0-1.el7ost.noarch

Logs are available in collab-shell:/cases/01769283/sosreport-20170112-180506

How reproducible:
Every time

Steps to Reproduce:
1. Update the provisioning_status of a pending load balancer to ACTIVE in the database
2. Delete the load balancer

Actual results:
"neutron lbaas-loadbalancer-delete <loadbalancer_id>" fails to delete the load balancer

Expected results:
"neutron lbaas-loadbalancer-delete <loadbalancer_id>" deletes the load balancer

Additional info:

Comment 1 Assaf Muller 2017-01-24 15:59:01 UTC
Not being able to delete a load balancer in a PENDING state is intended behavior. It seems like the issue here is that the load balancer was stuck in PENDING. Can you confirm this? If so, can you clarify how the balancer got stuck in PENDING?
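
For illustration, the guard described in this comment can be sketched as a small state check. This is a hypothetical model, not the neutron-lbaas source: the names StateInvalid and request_delete are assumptions made for this example, and only the error text and the 409 status are taken from the output quoted in the description.

```python
# Hypothetical model of the PENDING_* guard; not the neutron-lbaas code.
PENDING_PREFIX = "PENDING_"


class StateInvalid(Exception):
    """Raised when an operation targets a resource in a PENDING_* state."""

    http_status = 409  # the conflict status seen in the server log


def request_delete(lb):
    """Move a load balancer to PENDING_DELETE unless it is already pending.

    `lb` is a plain dict with "id" and "provisioning_status" keys
    (an assumption of this sketch).
    """
    status = lb["provisioning_status"]
    if status.startswith(PENDING_PREFIX):
        # Any PENDING_* state blocks further actions, including delete.
        raise StateInvalid(
            f"Invalid state {status} of loadbalancer resource {lb['id']}"
        )
    lb["provisioning_status"] = "PENDING_DELETE"
    return lb
```

In this model, a load balancer whose status was forced to ACTIVE accepts one delete request, moves to PENDING_DELETE, and then rejects every later delete request, which is consistent with the status transitions shown in the description.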

Comment 2 Ondrej 2017-01-24 16:05:56 UTC
The neutron-lbaasv2-agent was down because of a wrong device_driver setting. Load balancers
created before we were able to set the right driver and bring the agent up got stuck in PENDING_CREATE, and we are not able to get rid of them now.
I tried to cheat by changing the load balancer status in the neutron DB to ACTIVE, but that didn't work.

Comment 3 Assaf Muller 2017-01-24 21:05:25 UTC
(In reply to Ondrej from comment #2)
> The neutron-lbaasv2-agent was down because of wrong device_driver.
> Loadbalancers
> created before we were able to fix the right driver and bring the agent up
> were stuck in PENDING_CREATE. We're not able to get rid of them now.
> I tried to cheat it by changing the loadbalancer status in neutron db to
> ACTIVE, but that didn't work.

Would this be a reproducer then?

1) Install LBaaS v2, create a load balancer successfully
2) Shut down the LBaaS v2 agent
3) Create a new load balancer
4) Start up the LBaaS v2 agent again

Observe that the load balancer is stuck in PENDING_* and cannot be deleted.

Correct?

Comment 4 Ondrej 2017-01-25 08:06:06 UTC
Yes, exactly.

Comment 5 David Pasqua 2017-09-07 17:32:31 UTC
Hello, any update on this?
Was it fixed?
I just got the same error

Comment 6 Assaf Muller 2017-09-11 13:26:52 UTC
Assigned to Nir for RCA.

Comment 7 Nir Magnezi 2017-09-25 12:51:02 UTC
As Michael (the PTL) said [1], it is by design that actions are not allowed on load balancers in PENDING_* states.
You may simply delete the row for that specific load balancer from the database, but there is no mechanism to move those load balancers to a different state.
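
A minimal sketch of the row deletion suggested above, under loud assumptions: it runs against an in-memory SQLite table as a stand-in for the real MySQL neutron database, and the schema is reduced to the two columns named in this report (id and provisioning_status). The helper names are hypothetical. The real table has more columns and may have dependent rows in other tables that this sketch does not model, so back up the database before attempting anything similar.

```python
import sqlite3

# The three pending states; rows in any other state are left alone.
PENDING_STATES = ("PENDING_CREATE", "PENDING_UPDATE", "PENDING_DELETE")


def make_demo_db():
    """Build an in-memory stand-in for the lbaas_loadbalancers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lbaas_loadbalancers ("
        "id TEXT PRIMARY KEY, provisioning_status TEXT)"
    )
    conn.executemany(
        "INSERT INTO lbaas_loadbalancers VALUES (?, ?)",
        [
            ("27714a3d-0e6a-411d-9c64-d5d4980eb8fa", "ACTIVE"),
            ("d8a14947-07c7-4540-8a13-1421b2ca0dce", "PENDING_DELETE"),
        ],
    )
    conn.commit()
    return conn


def delete_stuck_loadbalancer(conn, lb_id):
    """Delete one row, but only if it is in a PENDING_* state.

    Returns the number of rows removed (0 or 1).
    """
    cur = conn.execute(
        "DELETE FROM lbaas_loadbalancers "
        "WHERE id = ? AND provisioning_status IN (?, ?, ?)",
        (lb_id, *PENDING_STATES),
    )
    conn.commit()
    return cur.rowcount
```

Restricting the DELETE to pending states is a design choice of the sketch: a mistyped ID that points at a healthy ACTIVE load balancer removes nothing.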

If anything, the issue I see here is that the client request to create the load balancer was not rejected while the agent was down, but that would really depend on how creation is implemented (synchronously or asynchronously). In any case, this bug is about load balancers that got stuck in a PENDING state.
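
The synchronous versus asynchronous distinction can be illustrated with a toy model. Everything here is hypothetical (AgentDown, create_sync, create_async, and the dict used as a database are assumptions for this sketch), and it makes no claim about how neutron-lbaas actually dispatches requests.

```python
class AgentDown(Exception):
    """Raised by the synchronous variant when no agent can serve the request."""


def create_sync(db, lb_id, agent_up):
    """Synchronous variant: the caller learns immediately if the agent is down."""
    if not agent_up:
        raise AgentDown(f"cannot create {lb_id}: agent is down")
    db[lb_id] = "ACTIVE"


def create_async(db, pending, lb_id):
    """Asynchronous variant: record the request and return without waiting.

    The row stays in PENDING_CREATE until something consumes `pending`;
    if nothing ever does, the load balancer is stuck in that state.
    """
    db[lb_id] = "PENDING_CREATE"
    pending.append(lb_id)
```

With the synchronous variant a request made while the agent is down fails up front and leaves no row behind. With the asynchronous variant the request is accepted, and the row keeps its PENDING_CREATE status for as long as the queued work goes unprocessed.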

BTW, such a scenario is handled in Octavia, but not in the legacy implementation.

[1] https://bugs.launchpad.net/octavia/+bug/1498130/comments/7