Bug 1306690 - "StaleDataError: UPDATE statement on table 'floatingips' expected to update 1 row(s); 0 were matched" on removing instance
Summary: "StaleDataError: UPDATE statement on table 'floatingips' expected to update 1...
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 7.0 (Kilo)
Hardware: Unspecified
OS: Unspecified
medium
unspecified
Target Milestone: async
: 7.0 (Kilo)
Assignee: Ihar Hrachyshka
QA Contact: Toni Freger
URL:
Whiteboard:
Depends On:
Blocks:
TreeView+ depends on / blocked
 
Reported: 2016-02-11 15:21 UTC by Vadim Rutkovsky
Modified: 2017-02-15 13:50 UTC (History)
5 users (show)

Fixed In Version: openstack-neutron-2015.1.4-12.el7ost
Doc Type: Bug Fix
Doc Text:
Clone Of:
Environment:
Last Closed: 2017-02-15 13:50:20 UTC
Target Upstream Version:
Embargoed:


Attachments (Terms of Use)


Links
System ID Private Priority Status Summary Last Updated
Launchpad 1513242 0 None None None 2016-02-11 15:21:46 UTC
OpenStack gerrit 192063 0 None None None 2017-02-09 16:04:17 UTC
Red Hat Product Errata RHBA-2017:0277 0 normal SHIPPED_LIVE openstack-neutron bug fix advisory 2017-02-15 18:49:17 UTC

Description Vadim Rutkovsky 2016-02-11 15:21:46 UTC
Description of problem:
Instance stays in Error state after delete, see neutron logs below.

Upstream issue: https://bugs.launchpad.net/neutron/+bug/1513242


Version-Release number of selected component (if applicable):
openstack-neutron-common-2015.1.2-6.el7ost.noarch
python-neutronclient-2.4.0-2.el7ost.noarch
openstack-neutron-ml2-2015.1.2-6.el7ost.noarch
openstack-neutron-openvswitch-2015.1.2-6.el7ost.noarch
python-neutron-2015.1.2-6.el7ost.noarch
openstack-neutron-2015.1.2-6.el7ost.noarch

How reproducible:
~10% of all 

Steps to Reproduce:
Spawn and delete instance, rinse, repeat

Actual results:
2016-01-28 05:44:15.391 60669 ERROR neutron.api.v2.resource [req-432b001f-2b68-4a16-887f-4ee20e08df5d ] delete failed
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource Traceback (most recent call last):
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib/python2.7/site-packages/neutron/api/v2/resource.py", line 83, in resource
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     result = method(request=request, **args)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib/python2.7/site-packages/neutron/api/v2/base.py", line 490, in delete
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     obj_deleter(request.context, id, **kwargs)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib/python2.7/site-packages/oslo_db/api.py", line 131, in wrapper
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     return f(*args, **kwargs)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib/python2.7/site-packages/neutron/plugins/ml2/plugin.py", line 1272, in delete_port
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     context, id, do_notify=False)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib/python2.7/site-packages/neutron/db/l3_dvr_db.py", line 274, in disassociate_floatingips
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     do_notify=do_notify)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib/python2.7/site-packages/neutron/db/l3_db.py", line 1361, in disassociate_floatingips
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     context, port_id)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib/python2.7/site-packages/neutron/db/l3_db.py", line 1088, in disassociate_floatingips
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     'router_id': None})
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 490, in __exit__
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     self.rollback()
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     compat.reraise(exc_type, exc_value, exc_tb)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 487, in __exit__
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     self.commit()
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 392, in commit
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     self._prepare_impl()
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 372, in _prepare_impl
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     self.session.flush()
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2004, in flush
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     self._flush(objects)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2122, in _flush
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     transaction.rollback(_capture_exception=True)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/langhelpers.py", line 60, in __exit__
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     compat.reraise(exc_type, exc_value, exc_tb)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 2086, in _flush
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     flush_context.execute()
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 373, in execute
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     rec.execute(self)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/unitofwork.py", line 532, in execute
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     uow
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 170, in save_obj
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     mapper, table, update)
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource   File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/persistence.py", line 712, in _emit_update_statements
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource     (table.description, len(records), rows))
2016-01-28 05:44:15.391 60669 TRACE neutron.api.v2.resource StaleDataError: UPDATE statement on table 'floatingips' expected to update 1 row(s); 0 were matched.

Expected results:
Instances are removed as expected

Additional info:

Comment 1 Assaf Muller 2016-06-04 20:46:46 UTC
Any idea about this one? The upstream bug expired, it's not clear if this was fixed in OSP8+ and if so is there a patch that we can backport?

Comment 2 Ihar Hrachyshka 2016-06-07 11:56:01 UTC
Assaf, I spent some time comparing Kilo and Liberty code (since the upstream reporter claimed Liberty is fixed for him). Nothing that would stand out, expect maybe a set of retry decorators applied to base api entry points, starting from https://review.openstack.org/#/c/191540/ That said, it does not seem that StaleDataError is handled by the decorators, though I may miss something since I am not that fluent with oslo.db code.

We probably want to backport those anyway since they drastically enhances API stability for deadlocks.

Now, in upstream Liberty we also switched to another mysql driver (pymysql), and that could also contribute to the issue disappearing for the reporter when Liberty code is used. We definitely cannot change the driver in OSP7.

Should I go and backport retry fixes?

Comment 3 Assaf Muller 2016-10-03 13:34:54 UTC
(In reply to Ihar Hrachyshka from comment #2)
> Assaf, I spent some time comparing Kilo and Liberty code (since the upstream
> reporter claimed Liberty is fixed for him). Nothing that would stand out,
> expect maybe a set of retry decorators applied to base api entry points,
> starting from https://review.openstack.org/#/c/191540/ That said, it does
> not seem that StaleDataError is handled by the decorators, though I may miss
> something since I am not that fluent with oslo.db code.
> 
> We probably want to backport those anyway since they drastically enhances
> API stability for deadlocks.
> 
> Now, in upstream Liberty we also switched to another mysql driver (pymysql),
> and that could also contribute to the issue disappearing for the reporter
> when Liberty code is used. We definitely cannot change the driver in OSP7.
> 
> Should I go and backport retry fixes?

The retry decorators have come up in multiple contexts, I think that the answer is yes, we want them in OSP 7 as well.

Comment 4 Ihar Hrachyshka 2017-01-06 16:58:38 UTC
I think the fix is: https://review.openstack.org/#/c/192063/ and we backported it into the latest package.

Comment 7 Ihar Hrachyshka 2017-02-07 14:41:31 UTC
Steps to reproduce for the bug are not clear, as is often the case with race conditions. You can try to repeat creating and deleting instances (with floating IPs) while also deleting the floating IPs in parallel.

From automation perspective, maybe it's just a matter of regression and stability tests using tempest.

Comment 9 Toni Freger 2017-02-14 12:37:37 UTC
The code is verified on latest OSP7 - openstack-neutron-2015.1.4-12.el7ost.noarch

Comment 11 errata-xmlrpc 2017-02-15 13:50:20 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://rhn.redhat.com/errata/RHBA-2017-0277.html


Note You need to log in before you can comment on or make changes to this bug.