Description
Matsvei Hauryliuk
2023-03-13 09:43:47 UTC
Description of problem:
The client is trying to deploy VMs on a particular compute node, but the operation fails.
The following traceback can be found in the logs:
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc [req-d0b73883-67aa-4be4-a9b6-77705ccc1171 c29260b8c93b4f999c4744bee5776360 07639f0089f341ff9edb87c809dc7c4b - - -] Exception while dispatching port events: 'Chassis_Private' object has no attribute 'hostname': AttributeError: 'Chassis_Private' object has no attribute 'hostname'
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc Traceback (most recent call last):
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/neutron/plugins/ml2/ovo_rpc.py", line 133, in dispatch_events
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc self._resource_push_api.push(context, [obj], rpc_event)
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/neutron/api/rpc/handlers/resources_rpc.py", line 245, in push
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc self._push(context, resource_type, type_resources, event_type)
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/neutron/api/rpc/handlers/resources_rpc.py", line 251, in _push
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc for version in version_manager.get_resource_versions(resource_type):
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/neutron/api/rpc/callbacks/version_manager.py", line 250, in get_resource_versions
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc return _get_cached_tracker().get_resource_versions(resource_type)
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/neutron/api/rpc/callbacks/version_manager.py", line 226, in get_resource_versions
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc self._check_expiration()
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/neutron/api/rpc/callbacks/version_manager.py", line 222, in _check_expiration
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc self._update_consumer_versions()
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/neutron/api/rpc/callbacks/version_manager.py", line 211, in _update_consumer_versions
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc neutron_plugin.get_agents_resource_versions(new_tracker)
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/neutron/db/agents_db.py", line 468, in get_agents_resource_versions
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc for agent in self._get_agents_considered_for_versions():
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/neutron/db/agents_db.py", line 455, in _get_agents_considered_for_versions
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc filters={'admin_state_up': [True]})
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/networking_ovn/ml2/mech_driver.py", line 1076, in fn
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc return op(results, new_method(*args, _driver=self, **kwargs))
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/networking_ovn/ml2/mech_driver.py", line 1140, in get_agents
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc agent_dict = agent.as_dict()
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc File "/usr/lib/python3.6/site-packages/networking_ovn/agent/neutron_agent.py", line 60, in as_dict
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc 'host': self.chassis.hostname,
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc AttributeError: 'Chassis_Private' object has no attribute 'hostname'
2023-01-29 00:32:55.059 24 ERROR neutron.plugins.ml2.ovo_rpc
This traceback looks like an inconsistency between the Chassis and Chassis_Private tables in the Southbound DB. Restarting the "tripleo_neutron_api.service" container on all controller nodes remediates the issue temporarily; however, a permanent solution is needed because the issue recurs regularly.
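One way to check for such an inconsistency is to compare the chassis names recorded in the two Southbound tables. A minimal sketch, assuming it is run from a controller with access to the Southbound DB (exact connection options depend on the deployment):

# Compare chassis names present in Chassis vs Chassis_Private
diff <(ovn-sbctl --bare --columns=name list Chassis | sort) \
     <(ovn-sbctl --bare --columns=name list Chassis_Private | sort)

Any name reported on only one side would point at a stale or missing row.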
One of the possible scenarios we were considering was high load on the deployment causing this. To rule that out, we asked the client to increase the probing interval on all nodes:
ovs-vsctl set open . external_ids:ovn-remote-probe-interval=180000
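For reference, whether the new value was actually applied on a node can be verified with the matching get command (same "open ." record as above):

ovs-vsctl get open . external_ids:ovn-remote-probe-interval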
Increasing the probe interval did not solve the issue either.
I need your help with the RCA on this.
Version-Release number of selected component (if applicable):
ovn-15.5.0-2.20220216005905.4f55857.el8ost.noarch
OSP 16.2.3
How reproducible:
Happens regularly.
Steps to Reproduce:
1. Deploy a VM on a compute node (an example command is sketched below).
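A minimal sketch of such a deployment targeting a specific compute host; the flavor, image, network, and host names below are placeholders rather than values from this report, and forcing the host via the availability zone requires admin credentials:

# Deploy a test VM pinned to a specific compute node (placeholder names)
openstack server create --flavor m1.small --image cirros --network private \
    --availability-zone nova:compute-0.example.com test-vm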
Actual results:
The VM deployment fails and the Neutron server logs show the AttributeError traceback above.
Expected results:
Active VM deployed on a compute node.
Additional info:
Attached to the case are sosreports from the 3 controllers and from the compute node where the issue took place, the output of "openstack server show" for the instance, and the ovnnb_db.db and ovnsb_db.db files.
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA.
For information on the advisory (Red Hat OpenStack Platform 16.2.6 (Train) bug fix and enhancement advisory), and where to find the updated files, follow the link below.
If the solution does not work for you, open a new bug report.
https://access.redhat.com/errata/RHBA-2023:6307