Bug 2064794 - Unable to run 'openstack network agent list' command
Summary: Unable to run 'openstack network agent list' command
Keywords:
Status: CLOSED DUPLICATE of bug 1849166
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: openstack-neutron
Version: 16.1 (Train)
Hardware: Unspecified
OS: Unspecified
Priority: urgent
Severity: urgent
Target Milestone: ---
Target Release: ---
Assignee: Lucas Alvares Gomes
QA Contact: Eran Kuris
URL:
Whiteboard:
Depends On:
Blocks:
 
Reported: 2022-03-16 14:59 UTC by camorris@redhat.co
Modified: 2023-03-13 14:36 UTC
CC List: 6 users

Fixed In Version:
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-05-26 09:25:09 UTC
Target Upstream Version:
Embargoed:




Links:
Red Hat Issue Tracker OSP-13974 (last updated: 2022-03-16 15:07:23 UTC)

Description camorris@redhat.co 2022-03-16 14:59:14 UTC
Description of problem:
Unable to run command `openstack network agent list`

Version-Release number of selected component (if applicable):
Red Hat OpenStack Platform release 16.1.7 GA (Train)

How reproducible:
Every time

Steps to Reproduce:
1. Run `openstack network agent list`

Actual results:
(overcloud) [stack@director01 ~]$ openstack network agent list
HttpException: 500: Server Error for url: https://xxx.local:13696/v2.0/agents, Request Failed: internal server error while processing your request.
(overcloud) [stack@director01 ~]$


Expected results:
An actual list of network agents.

Additional info:

- We are able to create non-provider networks. 
- We have multiple provider networks running, and it has never been an issue in the past.

Trying to sync the database does not help:

podman exec neutron_api neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode repair

Comment 2 Lucas Alvares Gomes 2022-03-22 15:04:26 UTC
Hi,

Looking at the SOSReport provided, the error related to the "agent list" command seems to be:

2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall [req-d96a98a2-85ab-42ae-a0e7-6f8aeab6432f - - - - -] Fixed interval looping call 'neutron.plugins.ml2.plugin.AgentDbMixin.agent_health_check' failed: AttributeError: Row instance has no attribute 'hostname'
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall Traceback (most recent call last):
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib64/python3.6/site-packages/ovs/db/idl.py", line 1018, in __getattr__
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     column = self._table.columns[column_name]
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall KeyError: 'hostname'
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall During handling of the above exception, another exception occurred:
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall Traceback (most recent call last):
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_service/loopingcall.py", line 150, in _run_loop
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     result = func(*self.args, **self.kw)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 139, in wrapped
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     setattr(e, '_RETRY_EXCEEDED', True)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     self.force_reraise()
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     six.reraise(self.type_, self.value, self.tb)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/six.py", line 675, in reraise
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     raise value
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 135, in wrapped
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     return f(*args, **kwargs)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_db/api.py", line 154, in wrapper
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     ectxt.value = e.inner_exc
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     self.force_reraise()
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     six.reraise(self.type_, self.value, self.tb)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/six.py", line 675, in reraise
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     raise value
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 135, in wrapped
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     return f(*args, **kwargs)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_db/api.py", line 154, in wrapper
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     ectxt.value = e.inner_exc
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     self.force_reraise()
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     six.reraise(self.type_, self.value, self.tb)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/six.py", line 675, in reraise
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     raise value
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_db/api.py", line 142, in wrapper
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     return f(*args, **kwargs)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 183, in wrapped
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     LOG.debug("Retry wrapper got retriable exception: %s", e)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     self.force_reraise()
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     six.reraise(self.type_, self.value, self.tb)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/six.py", line 675, in reraise
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     raise value
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/neutron_lib/db/api.py", line 179, in wrapped
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     return f(*dup_args, **dup_kwargs)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/neutron/db/agents_db.py", line 312, in agent_health_check
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     filters={'admin_state_up': [True]})
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/networking_ovn/ml2/mech_driver.py", line 1060, in fn
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     return op(results, new_method(*args, _driver=self, **kwargs))
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/networking_ovn/ml2/mech_driver.py", line 1132, in get_agents
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     for agent in _driver.agents_from_chassis(ch, update_db).values():
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/networking_ovn/ml2/mech_driver.py", line 1050, in agents_from_chassis
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     agent_dict[agent.agent_id] = agent.as_dict(alive)
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib/python3.6/site-packages/networking_ovn/agent/neutron_agent.py", line 51, in as_dict
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     'host': self.chassis.hostname,
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall   File "/usr/lib64/python3.6/site-packages/ovs/db/idl.py", line 1021, in __getattr__
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall     (self.__class__.__name__, column_name))
2022-03-01 14:43:06.794 41 ERROR oslo.service.loopingcall AttributeError: Row instance has no attribute 'hostname'

It appears many times in the Neutron logs. For example, on controller-0:

[lmartins@supportshell 03161847]$ zgrep "Row instance has no attribute" 0030-sosreport-s97ctrl0-2022-03-01-wjsnnsn.tar.xz/sosreport-s97ctrl0-2022-03-01-wjsnnsn/var/log/containers/neutron/server.log.* | wc -l
40

Searching for this error in previous BZs, I found https://bugzilla.redhat.com/show_bug.cgi?id=1975264. It seems like the workaround for this problem is to restart the neutron_api containers (see comment #5). Can we restart the neutron_api container on all 3 controllers and see if that works?
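
For reference, a minimal sketch of that workaround (the container name is taken from the podman command in the description; the systemd unit name below is an assumption and should be verified on the controllers):

# Run on each of the three controllers:
sudo podman restart neutron_api
# Or, if the container is managed through systemd (unit name is an assumption;
# check with `systemctl list-units 'tripleo_*'`):
sudo systemctl restart tripleo_neutron_api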

For a long-term fix, I believe we need this patch from upstream: https://review.opendev.org/c/openstack/neutron/+/797796. In this patch the author introduced a new method in the neutron_agent.py module, chassis_from_private(), which falls back to the Chassis table if there's an AttributeError when getting a column from Chassis_Private (Chassis_Private does not have the "hostname" column in the OVSDB). This patch is not included in OSP 16, and upstream it has only been backported down to stable/wallaby (OSP 16 is based on stable/train), so I can work on the backport for this in parallel.
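
To illustrate the idea, here is a simplified sketch of that fallback in Python (not the actual upstream code; the helper name and the explicit chassis argument are assumptions made for this example):

# Simplified illustration of the fallback described above, not the upstream
# patch itself: read a column from Chassis_Private and, if it is missing
# (e.g. 'hostname'), read it from the corresponding Chassis row instead.
def get_chassis_column(chassis_private, chassis, column):
    try:
        return getattr(chassis_private, column)
    except AttributeError:
        # 'hostname' is only a column of the Chassis table
        return getattr(chassis, column)

With a fallback like this, the as_dict() call in the traceback above would resolve 'hostname' via the Chassis row instead of raising AttributeError.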

In the meantime, can we try the workaround of restarting the neutron_api containers on the controllers and see if that mitigates the issue for now?

Comment 11 Jakub Libosvar 2023-03-13 14:36:53 UTC

*** This bug has been marked as a duplicate of bug 1849166 ***

