Description of problem: I've been able to reproduce this by deleting only 10 instances, each with their own internal and floating IP associated to them. # nova list +--------------------------------------+--------------------------------------------+--------+------------+-------------+-----------------------------------+ | ID | Name | Status | Task State | Power State | Networks | +--------------------------------------+--------------------------------------------+--------+------------+-------------+-----------------------------------+ | 02aad532-8abc-491a-a879-4a364f8bec5b | batch-02aad532-8abc-491a-a879-4a364f8bec5b | ACTIVE | - | Running | internal=10.10.10.38, 192.2.0.111 | | 04e080be-9173-44da-8c47-0806d76c8a25 | batch-04e080be-9173-44da-8c47-0806d76c8a25 | ACTIVE | - | Running | internal=10.10.10.40, 192.2.0.110 | | 5fbae796-7037-4539-970e-1c0b6d922f2a | batch-5fbae796-7037-4539-970e-1c0b6d922f2a | ACTIVE | - | Running | internal=10.10.10.39, 192.2.0.108 | | 91c5095d-7319-438c-a167-f4501ea4bb6e | batch-91c5095d-7319-438c-a167-f4501ea4bb6e | ACTIVE | - | Running | internal=10.10.10.33, 192.2.0.103 | | aaedd2cc-beb8-4d6d-8285-a76cfd05daa2 | batch-aaedd2cc-beb8-4d6d-8285-a76cfd05daa2 | ACTIVE | - | Running | internal=10.10.10.41, 192.2.0.109 | | d4d63f23-fa78-40a9-a19e-46543ce874df | batch-d4d63f23-fa78-40a9-a19e-46543ce874df | ACTIVE | - | Running | internal=10.10.10.36, 192.2.0.105 | | e9d1d2d9-0034-4714-8f43-c1b8c452723b | batch-e9d1d2d9-0034-4714-8f43-c1b8c452723b | ACTIVE | - | Running | internal=10.10.10.35, 192.2.0.104 | | ebd7b729-1586-4827-a7bf-053787811b60 | batch-ebd7b729-1586-4827-a7bf-053787811b60 | ACTIVE | - | Running | internal=10.10.10.42, 192.2.0.107 | | f1b1b73d-9f5f-437d-a53b-7acae472b758 | batch-f1b1b73d-9f5f-437d-a53b-7acae472b758 | ACTIVE | - | Running | internal=10.10.10.37, 192.2.0.106 | | fd604530-fb61-4770-98af-f535c80fc2b7 | batch-fd604530-fb61-4770-98af-f535c80fc2b7 | ACTIVE | - | Running | internal=10.10.10.34, 192.2.0.102 | +--------------------------------------+--------------------------------------------+--------+------------+-------------+-----------------------------------+ 2014-06-15 12:50:41.052 15781 ERROR neutron.openstack.common.rpc.amqp [-] Exception during message handling 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp Traceback (most recent call last): 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/amqp.py", line 462, in _process_data 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp **args) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib/python2.7/site-packages/neutron/common/rpc.py", line 45, in dispatch 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp neutron_ctxt, version, method, namespace, **kwargs) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib/python2.7/site-packages/neutron/openstack/common/rpc/dispatcher.py", line 172, in dispatch 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp result = getattr(proxyobj, method)(ctxt, **kwargs) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib/python2.7/site-packages/neutron/db/l3_rpc_base.py", line 111, in update_floatingip_statuses 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp status) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib/python2.7/site-packages/neutron/db/l3_db.py", line 684, in update_floatingip_status 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp fip_query.update({'status': status}, synchronize_session=False) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2690, in update 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp update_op.exec_() 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 816, in exec_ 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp self._do_exec() 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 913, in _do_exec 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp update_stmt, params=self.query._params) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib/python2.7/site-packages/neutron/openstack/common/db/sqlalchemy/session.py", line 597, in _wrap 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp return f(*args, **kwargs) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib/python2.7/site-packages/neutron/openstack/common/db/sqlalchemy/session.py", line 840, in execute 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp return super(Session, self).execute(*args, **kwargs) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 934, in execute 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp clause, params or {}) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 662, in execute 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp params) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 761, in _execute_clauseelement 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp compiled_sql, distilled_params 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 874, in _execute_context 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp context) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1024, in _handle_dbapi_exception 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp exc_info 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 196, in raise_from_cause 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp reraise(type(exception), exception, tb=exc_tb) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 867, in _execute_context 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp context) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 324, in do_execute 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp cursor.execute(statement, parameters) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/MySQLdb/cursors.py", line 174, in execute 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp self.errorhandler(self, exc, value) 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp File "/usr/lib64/python2.7/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp raise errorclass, errorvalue 2014-06-15 12:50:41.052 15781 TRACE neutron.openstack.common.rpc.amqp OperationalError: (OperationalError) (1205, 'Lock wait timeout exceeded; try restarting transaction') 'UPDATE floatingips SET status=%s WHERE floatingips.id = %s' ('ACTIVE', 'a030bb1e-31f0-42d7-84fc-520856f0ee66') # nova list +--------------------------------------+--------------------------------------------+--------+------------+-------------+-----------------------------------+ | ID | Name | Status | Task State | Power State | Networks | +--------------------------------------+--------------------------------------------+--------+------------+-------------+-----------------------------------+ | 91c5095d-7319-438c-a167-f4501ea4bb6e | batch-91c5095d-7319-438c-a167-f4501ea4bb6e | ERROR | deleting | Running | internal=10.10.10.33, 192.2.0.103 | | d4d63f23-fa78-40a9-a19e-46543ce874df | batch-d4d63f23-fa78-40a9-a19e-46543ce874df | ERROR | deleting | Running | internal=10.10.10.36, 192.2.0.105 | | e9d1d2d9-0034-4714-8f43-c1b8c452723b | batch-e9d1d2d9-0034-4714-8f43-c1b8c452723b | ERROR | deleting | Running | internal=10.10.10.35, 192.2.0.104 | | ebd7b729-1586-4827-a7bf-053787811b60 | batch-ebd7b729-1586-4827-a7bf-053787811b60 | ERROR | deleting | Running | internal=10.10.10.42, 192.2.0.107 | | f1b1b73d-9f5f-437d-a53b-7acae472b758 | batch-f1b1b73d-9f5f-437d-a53b-7acae472b758 | ERROR | deleting | Running | internal=10.10.10.37, 192.2.0.106 | | fd604530-fb61-4770-98af-f535c80fc2b7 | batch-fd604530-fb61-4770-98af-f535c80fc2b7 | ERROR | deleting | Running | internal=10.10.10.34, 192.2.0.102 | +--------------------------------------+--------------------------------------------+--------+------------+-------------+-----------------------------------+ Version-Release number of selected component (if applicable): # rpm -qa | grep neutron openstack-neutron-openvswitch-2014.1-26.el7ost.noarch python-neutronclient-2.3.4-2.el7ost.noarch openstack-neutron-2014.1-26.el7ost.noarch python-neutron-2014.1-26.el7ost.noarch openstack-neutron-ml2-2014.1-26.el7ost.noarch How reproducible: Always. Steps to Reproduce: 1. Create >= 10 instances and associate a floating IP to each one. 2. Mass delete these instances either through the cli and ui. Actual results: A number of instances are not deleted and enter an error state due to locking issues with the DB. Expected results: All instances deleted and floating IPs freed. Additional info:
Created attachment 908931 [details] Example server.log
Removing the blocker flag as this is a known u/s issue. Adding exception to track this as high priority.
The issue occurs when port delete results in floating IP disassociation which results in L3 notifier to be invoked under transaction that results in green thread yield to another thread that also attempts to update the same row in db (L3 agent updating floating IP status). To fix the issue, we should make sure we don't notify about floating IP disassociation while under db transaction. The issue can be reproduced as follows: 1. create multiple ports: $ for i in `seq 40`; do neutron port-create --name port$i private; done 2. associate a floating IP to each of the ports. $ for i in `seq 40`; do neutron floatingip-create --port_id `neutron port-show port$i|grep '| id' | awk '{print $4}'` public; done 3. delete ports in parallel. $ for i in `seq 40`; do neutron port-delete port$i; done While doing so, I got at least 3 locks.
On related note, Red Hat has started to push upstream to switch to another MySQL client library that should not show those and similar deadlocks: https://review.openstack.org/#/c/104905/
On RHEL7 rpm -qa | grep neutron python-neutronclient-2.3.4-2.el7ost.noarch python-neutron-2014.1.1-2.el7ost.noarch openstack-neutron-2014.1.1-2.el7ost.noarch openstack-neutron-openvswitch-2014.1.1-2.el7ost.noarch [root@puma45 ~(keystone_admin)]# nova boot --flavor 1 --image cirros --num-instances 10 --nic net-id=1db71fbf-1f97-4e34-b960-8a7b1936a583 test_vm [root@puma45 ~(keystone_admin)]# nova list +--------------------------------------+----------------------------------------------+--------+------------+-------------+-------------------+ | ID | Name | Status | Task State | Power State | Networks | +--------------------------------------+----------------------------------------------+--------+------------+-------------+-------------------+ | 168189d7-5af3-4902-8985-08da0d739f2d | test_vm-168189d7-5af3-4902-8985-08da0d739f2d | BUILD | spawning | NOSTATE | private=10.0.0.11 | | 3e8dcc30-0282-496e-ae59-80054457df82 | test_vm-3e8dcc30-0282-496e-ae59-80054457df82 | BUILD | spawning | NOSTATE | private=10.0.0.12 | | 531bed85-a898-48a9-b36e-67560f0429e9 | test_vm-531bed85-a898-48a9-b36e-67560f0429e9 | BUILD | spawning | NOSTATE | private=10.0.0.9 | | 563c3070-8596-4e43-8d8c-770696700037 | test_vm-563c3070-8596-4e43-8d8c-770696700037 | BUILD | spawning | NOSTATE | private=10.0.0.10 | | 66b7a853-78f0-4a32-bd24-ec8b414394b9 | test_vm-66b7a853-78f0-4a32-bd24-ec8b414394b9 | BUILD | spawning | NOSTATE | private=10.0.0.2 | | 6c92ec99-387c-45a1-b1c1-4f7cc807e4c5 | test_vm-6c92ec99-387c-45a1-b1c1-4f7cc807e4c5 | BUILD | spawning | NOSTATE | private=10.0.0.7 | | 76702532-d2e9-4e7a-89a2-e8c8f5d9d635 | test_vm-76702532-d2e9-4e7a-89a2-e8c8f5d9d635 | BUILD | spawning | NOSTATE | private=10.0.0.5 | | a3393005-3b42-42a1-bfb2-a978a1c2fd1e | test_vm-a3393005-3b42-42a1-bfb2-a978a1c2fd1e | BUILD | spawning | NOSTATE | private=10.0.0.8 | | b9d70aaa-3d2e-45e2-bb8e-fdd2b22fc9fd | test_vm-b9d70aaa-3d2e-45e2-bb8e-fdd2b22fc9fd | BUILD | spawning | NOSTATE | private=10.0.0.3 | | d07de2cf-bd6e-4b09-90af-7c194e90785e | test_vm-d07de2cf-bd6e-4b09-90af-7c194e90785e | BUILD | spawning | NOSTATE | private=10.0.0.4 | +--------------------------------------+----------------------------------------------+--------+------------+-------------+-------------------+ [root@puma45 ~(keystone_admin)]# for port in `neutron port-list --device_owner compute:none | awk '/subnet/ {print $2}'`; do neutron floatingip-create public --port-id $port; done [root@puma45 ~(keystone_admin)]# nova list +--------------------------------------+----------------------------------------------+--------+------------+-------------+---------------------------------+ | ID | Name | Status | Task State | Power State | Networks | +--------------------------------------+----------------------------------------------+--------+------------+-------------+---------------------------------+ | 168189d7-5af3-4902-8985-08da0d739f2d | test_vm-168189d7-5af3-4902-8985-08da0d739f2d | ACTIVE | - | Running | private=10.0.0.11, 10.35.166.6 | | 3e8dcc30-0282-496e-ae59-80054457df82 | test_vm-3e8dcc30-0282-496e-ae59-80054457df82 | ACTIVE | - | Running | private=10.0.0.12, 10.35.166.9 | | 531bed85-a898-48a9-b36e-67560f0429e9 | test_vm-531bed85-a898-48a9-b36e-67560f0429e9 | ACTIVE | - | Running | private=10.0.0.9, 10.35.166.12 | | 563c3070-8596-4e43-8d8c-770696700037 | test_vm-563c3070-8596-4e43-8d8c-770696700037 | ACTIVE | - | Running | private=10.0.0.10, 10.35.166.10 | | 66b7a853-78f0-4a32-bd24-ec8b414394b9 | test_vm-66b7a853-78f0-4a32-bd24-ec8b414394b9 | ACTIVE | - | Running | private=10.0.0.2, 10.35.166.5 | | 6c92ec99-387c-45a1-b1c1-4f7cc807e4c5 | test_vm-6c92ec99-387c-45a1-b1c1-4f7cc807e4c5 | ACTIVE | - | Running | private=10.0.0.7, 10.35.166.8 | | 76702532-d2e9-4e7a-89a2-e8c8f5d9d635 | test_vm-76702532-d2e9-4e7a-89a2-e8c8f5d9d635 | ACTIVE | - | Running | private=10.0.0.5, 10.35.166.13 | | a3393005-3b42-42a1-bfb2-a978a1c2fd1e | test_vm-a3393005-3b42-42a1-bfb2-a978a1c2fd1e | ACTIVE | - | Running | private=10.0.0.8, 10.35.166.4 | | b9d70aaa-3d2e-45e2-bb8e-fdd2b22fc9fd | test_vm-b9d70aaa-3d2e-45e2-bb8e-fdd2b22fc9fd | ACTIVE | - | Running | private=10.0.0.3, 10.35.166.7 | | d07de2cf-bd6e-4b09-90af-7c194e90785e | test_vm-d07de2cf-bd6e-4b09-90af-7c194e90785e | ACTIVE | - | Running | private=10.0.0.4, 10.35.166.11 | +--------------------------------------+----------------------------------------------+--------+------------+-------------+---------------------------------+ [root@puma45 ~(keystone_admin)]# nova delete `nova list | awk '/ACTIVE/ {print $2}'` [root@puma45 ~(keystone_admin)]# nova list +----+------+--------+------------+-------------+----------+ | ID | Name | Status | Task State | Power State | Networks | +----+------+--------+------------+-------------+----------+ +----+------+--------+------------+-------------+----------+
[root@puma45 ~(keystone_admin)]# neutron floatingip-list +--------------------------------------+------------------+---------------------+---------+ | id | fixed_ip_address | floating_ip_address | port_id | +--------------------------------------+------------------+---------------------+---------+ | 0422aa67-6e54-4507-8e38-25011e24b02c | | 10.35.166.10 | | | 04684362-dd91-4157-a9f5-0962c4a97264 | | 10.35.166.9 | | | 0686309f-4ee7-4f8c-9c41-2ceddf56ed21 | | 10.35.166.8 | | | 2005c94c-9160-431d-8c5f-d36903b34f3b | | 10.35.166.7 | | | 9288b138-35df-4cc0-9d31-0e537a78eebf | | 10.35.166.13 | | | b1699375-259d-4ea9-9759-21a8693dc04f | | 10.35.166.4 | | | dde5b717-b452-45fc-8cc7-fd09a05b2166 | | 10.35.166.11 | | | ed04b256-10c6-4e4f-8f96-d887e12fb792 | | 10.35.166.12 | | | ef86c585-9359-4199-a7e2-2a311f9a0735 | | 10.35.166.5 | | | f9d81e29-96e2-4bf9-953d-068c40038e9f | | 10.35.166.6 | | +--------------------------------------+------------------+---------------------+---------+
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory, and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. http://rhn.redhat.com/errata/RHBA-2014-0936.html