Bug 1510162 - Bug in L3 agent code while cleaning up a router namespace
Summary: Bug in L3 agent code while cleaning up a router namespace
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 7.0 (Kilo)
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: zstream
Target Release: 7.0 (Kilo)
Assignee: Brian Haley
QA Contact: Toni Freger
URL:
Whiteboard:
Depends On: 1508091
Blocks: 1510157 1510159
Reported: 2017-11-06 20:05 UTC by Brian Haley
Modified: 2020-12-14 10:47 UTC (History)
CC List: 12 users

Fixed In Version: openstack-neutron-2015.1.4-26.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of: 1508091
Environment:
Last Closed: 2017-12-05 10:47:49 UTC
Target Upstream Version:
Embargoed:


Links
System: Red Hat Product Errata
ID: RHBA-2017:3381
Private: 0
Priority: normal
Status: SHIPPED_LIVE
Summary: openstack-neutron bug fix advisory
Last Updated: 2017-12-05 06:08:22 UTC

Comment 3 Toni Freger 2017-11-26 07:04:16 UTC
Brian,

I ran a Rally benchmark test (creation and deletion of 30 routers, 3 concurrent iterations) on version openstack-neutron-2015.1.4-26.el7ost.noarch.

You can find the test here - https://github.com/openstack/rally/blob/793735c152a573d72391a8ac21e2d908b631195a/samples/tasks/scenarios/neutron/create-and-delete-routers.json


2017-11-26 06:14:13.156 15052 ERROR neutron.agent.l3.ha_router [-] Unable to process HA router 59fad7c2-d393-464f-820b-334927047e64 without HA port
2017-11-26 06:14:13.156 15052 TRACE neutron.agent.l3.ha_router None
2017-11-26 06:14:13.156 15052 TRACE neutron.agent.l3.ha_router
2017-11-26 06:14:13.157 15052 ERROR neutron.agent.l3.agent [-] Error while initializing router 59fad7c2-d393-464f-820b-334927047e64
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent Traceback (most recent call last):
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 335, in _router_added
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     ri.initialize(self.process_monitor)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 83, in initialize
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     raise Exception(msg)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent Exception: Unable to process HA router 59fad7c2-d393-464f-820b-334927047e64 without HA port
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent
2017-11-26 06:14:13.157 15052 ERROR neutron.agent.l3.agent [-] Error while deleting router 59fad7c2-d393-464f-820b-334927047e64
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent Traceback (most recent call last):
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 342, in _router_added
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     ri.delete(self)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 359, in delete
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     self.destroy_state_change_monitor(self.process_monitor)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent AttributeError: 'HaRouter' object has no attribute 'process_monitor'
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent
2017-11-26 06:14:13.157 15052 ERROR neutron.agent.l3.agent [-] Failed to process compatible router '59fad7c2-d393-464f-820b-334927047e64'
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent Traceback (most recent call last):
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 509, in _process_router_update
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     self._process_router_if_compatible(router)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 450, in _process_router_if_compatible
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     self._process_added_router(router)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 455, in _process_added_router
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     self._router_added(router['id'], router)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 345, in _router_added
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     router_id)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 85, in __exit__
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 335, in _router_added
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     ri.initialize(self.process_monitor)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 83, in initialize
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent     raise Exception(msg)
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent Exception: Unable to process HA router 59fad7c2-d393-464f-820b-334927047e64 without HA port
2017-11-26 06:14:13.157 15052 TRACE neutron.agent.l3.agent

Comment 5 Brian Haley 2017-11-27 20:03:59 UTC
Hi Toni,

The first backtrace in Comment #3 looks like another bug in this code path that would be present in all releases.  self.process_monitor is only initialized in a super() call from the HA router initialize code.  In this case initialize() failed early and super() was never called.  I need to open an upstream bug and propose a change there.  This would have been triggered even without the new code from what I can tell and was just a race condition waiting to happen.
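
To illustrate the failure mode described above, here is a minimal sketch in Python. It is not the Neutron source: the classes are simplified stand-ins, and GuardedHaRouter and add_router are hypothetical names introduced only for this example. It assumes the cleanup path calls delete() after initialize() has raised, as the traceback in Comment #3 suggests, and it shows one possible defensive approach rather than the change actually proposed upstream.

```python
class RouterInfo(object):
    """Simplified stand-in for the base router class (not the real Neutron code)."""

    def __init__(self, router_id):
        self.router_id = router_id

    def initialize(self, process_monitor):
        # The attribute only comes into existence here.
        self.process_monitor = process_monitor


class HaRouter(RouterInfo):
    """Simplified stand-in for the HA router class."""

    def __init__(self, router_id, ha_port=None):
        super(HaRouter, self).__init__(router_id)
        self.ha_port = ha_port

    def initialize(self, process_monitor):
        if not self.ha_port:
            # Early exit: the super() call below is never reached.
            raise Exception('Unable to process HA router %s without HA port'
                            % self.router_id)
        super(HaRouter, self).initialize(process_monitor)

    def delete(self):
        # Raises AttributeError if initialize() failed before super().
        return self.process_monitor


class GuardedHaRouter(HaRouter):
    """Hypothetical defensive variant; not the actual upstream patch."""

    def __init__(self, router_id, ha_port=None):
        super(GuardedHaRouter, self).__init__(router_id, ha_port)
        # Give the attribute a default so cleanup can always read it.
        self.process_monitor = None

    def delete(self):
        if self.process_monitor is None:
            return None  # nothing was started, so nothing to tear down
        return self.process_monitor


def add_router(router_cls, router_id):
    """Mimic add-then-clean-up-on-failure; return the cleanup error name, if any."""
    ri = router_cls(router_id)
    try:
        ri.initialize(object())
    except Exception:
        try:
            ri.delete()
        except AttributeError as exc:
            return type(exc).__name__
    return None
```

With these stand-ins, add_router(HaRouter, ...) reports the AttributeError raised during cleanup, while the guarded variant cleans up without a secondary error.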

The second backtrace in Comment #4 is possibly something new, or could have been fixed upstream already as it looks familiar.  Since it's unrelated I guess I wouldn't necessarily hold things for it.

Let me look at the other bug updates you posted to see if the trace is similar.

Comment 7 Brian Haley 2017-11-28 16:38:48 UTC
Hi Scott,

The second issue (from Comment #4) is unrelated to the changes, so I would consider it new to OSP7.

The first issue (from Comment #3) is related to the changes, but is actually a new bug, i.e. fixing one bug uncovered another.  I am fine with merging this small change and the related one for https://bugzilla.redhat.com/show_bug.cgi?id=1496916, since they make the original failure more recoverable and do not fill the log files unnecessarily.

Hopefully Toni will agree.

Comment 9 Brian Haley 2017-11-30 21:55:01 UTC
Scott,

I think we should ship this as-is and I can fix any new bugs going forward.

Toni,

I opened https://bugs.launchpad.net/neutron/+bug/1735557 and have a patch up to fix the other l3-agent issue; I'm not sure whether you have opened a downstream bug for this yet.  I will take a look at the other issue you found as time permits.

Comment 10 Toni Freger 2017-12-04 16:22:54 UTC
Since functionality wasn't damaged, it is reasonable to work on the new bugs separately and to move this one to verified.

Comment 13 errata-xmlrpc 2017-12-05 10:47:49 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2017:3381

