Bug 1508091 - Bug in L3 agent code while cleaning up a router namespace
Summary: Bug in L3 agent code while cleaning up a router namespace
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 9.0 (Mitaka)
Hardware: All
OS: Linux
Priority: urgent
Severity: urgent
Target Milestone: zstream
Target Release: 9.0 (Mitaka)
Assignee: Brian Haley
QA Contact: Roee Agiman
URL:
Whiteboard:
Depends On:
Blocks: 1510157 1510159 1510162
Reported: 2017-10-31 20:05 UTC by Andreas Karis
Modified: 2020-12-14 10:44 UTC (History)

Fixed In Version: openstack-neutron-8.4.0-9.el7ost
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Clones: 1510157 1510159 1510162
Environment:
Last Closed: 2018-03-15 12:41:29 UTC
Target Upstream Version:
Embargoed:


Links
System ID Private Priority Status Summary Last Updated
Red Hat Product Errata RHBA-2018:0537 0 None None None 2018-03-15 12:42:44 UTC

Internal Links: 1508548

Description Andreas Karis 2017-10-31 20:05:14 UTC
Description of problem:
Bug in L3 agent code while cleaning up a router namespace

Version-Release number of selected component (if applicable):
neutron 8.4.0-6

How reproducible:
The customer has several other, unrelated issues in neutron. After upgrading the neutron RPM to the latest version, the customer gets the following.

After banning and clearing the resource on one of the controllers:
~~~
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent     pm.enable()
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/external_process.py", line 94, in enable
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent     run_as_root=self.run_as_root)
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 958, in execute
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent     log_fail_as_error=log_fail_as_error, **kwargs)
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 146, in execute
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent     raise ProcessExecutionError(msg, returncode=returncode)
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Guru mediation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent Option "verbose" from group "DEFAULT" is deprecated for removal.  Its value may be silently ignored in the future.
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent Option "notification_driver" from group "DEFAULT" is deprecated. Use option "driver" from group "oslo_messaging_notifications".
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent
2017-10-31 19:42:13.903 20498 ERROR neutron.agent.l3.agent
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent [-] Error while deleting router 9127bd7b-1bad-43f6-83e8-e70b731c85c5
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 369, in _router_added
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent     ri.delete()
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent TypeError: delete() takes exactly 2 arguments (1 given)
2017-10-31 19:42:14.025 20498 ERROR neutron.agent.l3.agent
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent [-] Failed to process compatible router: 9127bd7b-1bad-43f6-83e8-e70b731c85c5
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent Traceback (most recent call last):
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 523, in _process_router_update
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     self._process_router_if_compatible(router)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 460, in _process_router_if_compatible
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     self._process_added_router(router)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 465, in _process_added_router
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     self._router_added(router['id'], router)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 372, in _router_added
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     router_id)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     self.force_reraise()
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     six.reraise(self.type_, self.value, self.tb)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py", line 361, in _router_added
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     ri.initialize(self.process_monitor)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 118, in initialize
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     self.spawn_state_change_monitor(process_monitor)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py", line 351, in spawn_state_change_monitor
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     pm.enable()
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/external_process.py", line 94, in enable
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     run_as_root=self.run_as_root)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/ip_lib.py", line 958, in execute
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     log_fail_as_error=log_fail_as_error, **kwargs)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent   File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 146, in execute
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent     raise ProcessExecutionError(msg, returncode=returncode)
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent ProcessExecutionError: Exit code: 1; Stdin: ; Stdout: ; Stderr: Guru mediation now registers SIGUSR1 and SIGUSR2 by default for backward compatibility. SIGUSR1 will no longer be registered in a future release, so please use SIGUSR2 to generate reports.
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent Option "verbose" from group "DEFAULT" is deprecated for removal.  Its value may be silently ignored in the future.
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent Option "notification_driver" from group "DEFAULT" is deprecated. Use option "driver" from group "oslo_messaging_notifications".
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent
2017-10-31 19:42:14.026 20498 ERROR neutron.agent.l3.agent
~~~

This looks like a bug:

/usr/lib/python2.7/site-packages/neutron/agent/l3/agent.py
~~~
(...)
    321     def _create_router(self, router_id, router):
    322         args = []
    323         kwargs = {
    324             'router_id': router_id,
    325             'router': router,
    326             'use_ipv6': self.use_ipv6,
    327             'agent_conf': self.conf,
    328             'interface_driver': self.driver,
    329         }
    330
    331         if router.get('distributed'):
    332             kwargs['agent'] = self
    333             kwargs['host'] = self.host
    334
    335         if router.get('distributed') and router.get('ha'):
    336             if self.conf.agent_mode == l3_constants.L3_AGENT_MODE_DVR_SNAT:
    337                 kwargs['state_change_callback'] = self.enqueue_state_change
    338                 return dvr_edge_ha_router.DvrEdgeHaRouter(*args, **kwargs)
    339
    340         if router.get('distributed'):
    341             if self.conf.agent_mode == l3_constants.L3_AGENT_MODE_DVR_SNAT:
    342                 return dvr_router.DvrEdgeRouter(*args, **kwargs)
    343             else:
    344                 return dvr_local_router.DvrLocalRouter(*args, **kwargs)
    345
    346         if router.get('ha'):
    347             kwargs['state_change_callback'] = self.enqueue_state_change
    348             return ha_router.HaRouter(*args, **kwargs)
    349
    350         return legacy_router.LegacyRouter(*args, **kwargs)
(...)
    352     def _router_added(self, router_id, router):
    353         ri = self._create_router(router_id, router)
    354         registry.notify(resources.ROUTER, events.BEFORE_CREATE,
    355                         self, router=ri)
    356
    357         self.router_info[router_id] = ri
    358
    359         # If initialize() fails, cleanup and retrigger complete sync
    360         try:
    361             ri.initialize(self.process_monitor)
    362         except Exception:
    363             with excutils.save_and_reraise_exception():
    364                 del self.router_info[router_id]
    365                 LOG.exception(_LE('Error while initializing router %s'),
    366                               router_id)
    367                 self.namespaces_manager.ensure_router_cleanup(router_id)
    368                 try:
    369                     ri.delete()
    370                 except Exception:
    371                     LOG.exception(_LE('Error while deleting router %s'),
    372                                   router_id)
(...)
~~~
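
To make the two tracebacks in the log easier to read, here is a rough sketch of the control flow in `_router_added`, using plain `try`/`except`/`raise` in place of `excutils.save_and_reraise_exception()`. Everything in it (`BrokenRouter`, `router_added`, the `messages` list) is made up for illustration and is not neutron code. The only assumption is that the original `initialize()` error is re-raised after the cleanup block runs, which is what the log suggests. Written for Python 3.

```python
class BrokenRouter(object):
    """Hypothetical router: initialize() fails, and delete() requires an
    argument that the cleanup path does not pass."""

    def initialize(self, process_monitor):
        raise RuntimeError("initialize failed")

    def delete(self, agent):
        pass


def router_added(ri, messages):
    """Sketch of the cleanup-then-reraise pattern; not neutron code."""
    try:
        ri.initialize(None)
    except Exception:
        messages.append("Error while initializing router")
        try:
            ri.delete()  # wrong arity: raises TypeError
        except Exception as exc:
            # The cleanup failure is only logged, as in the listing above.
            messages.append("Error while deleting router: %s"
                            % type(exc).__name__)
        raise  # the original initialize() error still propagates


messages = []
try:
    router_added(BrokenRouter(), messages)
except RuntimeError as exc:
    messages.append("Failed to process router: %s" % exc)

for line in messages:
    print(line)
```

The cleanup failure shows up as its own logged error, and the caller still sees the original failure afterwards. This matches the order of "Error while deleting router" followed by "Failed to process compatible router" in the log.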

~~~
/usr/lib/python2.7/site-packages/neutron/agent/l3/legacy_router.py:class LegacyRouter(router.RouterInfo):
~~~

~~~
/usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py:class HaRouter(router.RouterInfo):
~~~

And from /usr/lib/python2.7/site-packages/neutron/agent/l3/ha_router.py
~~~
    413     def delete(self, agent):
    414         self.destroy_state_change_monitor(self.process_monitor)
    415         self.disable_keepalived()
    416         self.ha_network_removed()
    417         super(HaRouter, self).delete(agent)
    418
~~~

/usr/lib/python2.7/site-packages/neutron/agent/l3/legacy_router.py
~~~
(...)
    362     def delete(self, agent):
    363         self.router['gw_port'] = None
    364         self.router[l3_constants.INTERFACE_KEY] = []
    365         self.router[l3_constants.FLOATINGIP_KEY] = []
    366         self.process_delete(agent)
    367         self.disable_radvd()
    368         self.router_namespace.delete()
(...)
~~~


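Note the argument mismatch: both delete() methods shown above are defined as delete(self, agent), but _router_added calls ri.delete() with no arguments, which is what raises the TypeError in the log. ri.delete() should be called with 2 arguments (self and agent).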
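
A minimal, self-contained sketch of the mismatch follows. `FakeRouter` and `FakeAgent` are hypothetical stand-ins, not neutron classes. Passing the agent (`ri.delete(agent)`) is only one plausible way to satisfy the signature and is not necessarily what the actual patch does. On Python 3 the TypeError text differs from the Python 2.7 message in the log.

```python
class FakeAgent(object):
    """Hypothetical stand-in for the L3 agent; not a neutron class."""


class FakeRouter(object):
    """Hypothetical stand-in mirroring the delete(self, agent) signature."""

    def __init__(self):
        self.deleted_by = None

    def delete(self, agent):
        # Same shape as the delete(self, agent) methods quoted above.
        self.deleted_by = agent


def call_delete_without_agent(ri):
    """Mimic the cleanup call ri.delete(); return the TypeError it raises."""
    try:
        ri.delete()  # missing the required 'agent' argument
    except TypeError as exc:
        return exc
    return None


agent = FakeAgent()
ri = FakeRouter()

error = call_delete_without_agent(ri)
print(type(error).__name__)

# Assumption: supplying the agent satisfies the signature.
ri.delete(agent)
print(ri.deleted_by is agent)
```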

- Andreas

Comment 3 Brian Haley 2017-11-06 14:22:35 UTC
I have a change for this that I tried to push upstream, but the affected stable branches are already closed, so I'll just push it downstream.

Comment 13 Roee Agiman 2018-02-26 08:47:39 UTC
Verified.
[stack@undercloud-0 ~]$ cat /etc/yum.repos.d/latest-installed 
9   -p 2018-02-19.1
[stack@undercloud-0 ~]$ rpm -qa | grep neutron-
openstack-neutron-8.4.0-17.el7ost.noarch
python-neutron-8.4.0-17.el7ost.noarch

Comment 16 errata-xmlrpc 2018-03-15 12:41:29 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2018:0537

