Bug 2007689 - [OSP16.1] Failed to call periodic networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_metadata_ports
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat OpenStack
Classification: Red Hat
Component: python-networking-ovn
Version: 16.1 (Train)
Hardware: x86_64
OS: Linux
Priority: high
Severity: high
Target Milestone: z8
Target Release: 16.1 (Train on RHEL 8.2)
Assignee: Rodolfo Alonso
QA Contact: Eduardo Olivares
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-09-24 15:27 UTC by Luigi Tamagnone
Modified: 2022-10-07 18:45 UTC (History)

Fixed In Version: python-networking-ovn-7.3.1-1.20211112134207.4e24f4c.el8ost
Doc Type: No Doc Update
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-03-24 11:01:35 UTC
Target Upstream Version:
Embargoed:



Links
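+------------------------+----------------+---------+----------+--------+-------------------------------------------------+-------------------------+
| System                 | ID             | Private | Priority | Status | Summary                                         | Last Updated            |
+------------------------+----------------+---------+----------+--------+-------------------------------------------------+-------------------------+
| Launchpad              | 1939726        | 0       | None     | None   | None                                            | 2021-09-30 13:40:07 UTC |
| OpenStack gerrit       | 811996         | 0       | None     | MERGED | Fix "_sync_metadata_ports" with no DHCP subnets | 2021-10-19 16:38:12 UTC |
| OpenStack gerrit       | 812339         | 0       | None     | NEW    | Fix "_sync_metadata_ports" with no DHCP subnets | 2021-10-19 16:38:09 UTC |
| Red Hat Issue Tracker  | OSP-9906       | 0       | None     | None   | None                                            | 2021-11-15 12:43:01 UTC |
| Red Hat Product Errata | RHBA-2022:0986 | 0       | None     | None   | None                                            | 2022-03-24 11:01:58 UTC |
+------------------------+----------------+---------+----------+--------+-------------------------------------------------+-------------------------+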

Description Luigi Tamagnone 2021-09-24 15:27:44 UTC
Description of problem:
The OVN DB entered an inconsistent state (probably because of a connection issue), and the network cannot recover from it:
~~~
2021-09-15 19:13:22.267 34 ERROR networking_ovn.common.ovn_client [req-3b2ec9ef-fa65-458f-9f98-47deb162ea01 819471c72c6c4e8ab53960a3efd8cb32 2aae7a68d7b94e07a9f6445abe353c70 - default default] Unable to delete floating ip in gateway router. Error: Cannot find Logical_Switch_Port with uuid=eadecdd1-746f-4c38-b105-2647ea56494a: ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Logical_Switch_Port with uuid=eadecdd1-746f-4c38-b105-2647ea56494a
~~~
Manual recovery also failed:
neutron-ovn-db-sync-util --config-file /etc/neutron/neutron.conf --config-file /etc/neutron/plugins/ml2/ml2_conf.ini --ovn-neutron_sync_mode repair
~~~
2021-09-16 16:57:56.760 259594 CRITICAL neutron_ovn_db_sync_util [req-cdb14a39-f013-4fe5-9598-a8502a5c026f - - - - -] Unhandled error: neutron_lib.exceptions.IpAddressGenerationFailure: No more IP addresses available on network 054893c4-00dd-4f46-8f50-ee5c52e03543.
2021-09-16 16:57:56.760 259594 ERROR neutron_ovn_db_sync_util Traceback (most recent call last):
2021-09-16 16:57:56.760 259594 ERROR neutron_ovn_db_sync_util   File "/usr/lib/python3.6/site-packages/neutron/db/ipam_pluggable_backend.py", line 138, in _ipam_allocate_ips
[...]
2021-09-16 16:57:56.760 259594 ERROR neutron_ovn_db_sync_util   File "/usr/lib/python3.6/site-packages/neutron/db/ipam_pluggable_backend.py", line 141, in _ipam_allocate_ips
2021-09-16 16:57:56.760 259594 ERROR neutron_ovn_db_sync_util     net_id=port['network_id'])
2021-09-16 16:57:56.760 259594 ERROR neutron_ovn_db_sync_util neutron_lib.exceptions.IpAddressGenerationFailure: No more IP addresses available on network 054893c4-00dd-4f46-8f50-ee5c52e03543.
2021-09-16 16:57:56.760 259594 ERROR neutron_ovn_db_sync_util 
~~~
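
To find every network hit by the same failure, the logs can be scanned for the network ID that follows the `IpAddressGenerationFailure` message. The following is a minimal sketch, not part of any OpenStack tool: the function name `affected_networks` and the sample input are assumptions made for illustration, and the message format is taken from the log lines quoted in this report.

```python
import re

# Matches the network UUID that follows the failure message, using the
# message format seen in the log lines quoted in this report.
FAILURE_RE = re.compile(
    r"IpAddressGenerationFailure: No more IP addresses available on network "
    r"(?P<net>[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def affected_networks(log_lines):
    """Return the sorted, de-duplicated network IDs that hit IP exhaustion.

    Hypothetical helper: `log_lines` is any iterable of log line strings.
    """
    found = set()
    for line in log_lines:
        match = FAILURE_RE.search(line)
        if match:
            found.add(match.group("net"))
    return sorted(found)


if __name__ == "__main__":
    sample_lines = [
        "CRITICAL neutron_ovn_db_sync_util Unhandled error: "
        "neutron_lib.exceptions.IpAddressGenerationFailure: No more IP "
        "addresses available on network 054893c4-00dd-4f46-8f50-ee5c52e03543.",
        "ERROR neutron_ovn_db_sync_util Traceback (most recent call last):",
    ]
    print(affected_networks(sample_lines))
```

In practice the lines would come from the neutron server or sync-util log file rather than an inline list.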
The issue appears to be tied to subnets that have only one allocatable IP address, such as 172.xx.xx.52/30:
subnet_72e0c020-abb0-4569-8153-abde2f9eb8fe:
+-------------------+----------------------------------------------------+
| Field             | Value                                              |
+-------------------+----------------------------------------------------+
| allocation_pools  | {"start": "172.xx.xx.54", "end": "172.xx.xx.54"}   |
| cidr              | 172.xx.xx.52/30                                    |
| created_at        | 2021-02-24T14:35:55Z                               |
| enable_dhcp       | False                                              |
| gateway_ip        | 172.xx.xx.53                                       |
| id                | 72e0c020-abb0-4569-8153-abde2f9eb8fe               |
| ip_version        | 4                                                  |
| network_id        | 054893c4-00dd-4f46-8f50-ee5c52e03543               |

The port list shows one port in DOWN status with no fixed IP:
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------+--------+
| ID                                   | Name | MAC Address       | Fixed IP Addresses                                                           | Status |
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------+--------+
| b0c9d846-dba4-4330-872b-ee64e0dfebf0 |      | fa:16:xx:xx:xx:ca | ip_address='172.xx.xx.54',  subnet_id='72e0c020-abb0-4569-8153-abde2f9eb8fe' | ACTIVE |
| c2515017-416a-49f1-9f61-0f0c271de698 |      | fa:16:xx:xx:xx:58 |                                                                              | DOWN   |
+--------------------------------------+------+-------------------+------------------------------------------------------------------------------+--------+
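
The address arithmetic behind the subnet and port tables above can be checked with Python's standard `ipaddress` module. This is only a sketch: the addresses in the report are masked, so `172.16.0.52/30` is a hypothetical stand-in with the same last octets. A /30 holds four addresses. Once the network address, the broadcast address and the gateway are excluded, a single address remains, and the ACTIVE port already holds it.

```python
import ipaddress

# Hypothetical stand-in for the masked 172.xx.xx.52/30 subnet in the report.
subnet = ipaddress.ip_network("172.16.0.52/30")
gateway = ipaddress.ip_address("172.16.0.53")

# hosts() already excludes the network (.52) and broadcast (.55) addresses.
usable = [ip for ip in subnet.hosts() if ip != gateway]

# Addresses already held by ports on the subnet (the ACTIVE port uses .54).
in_use = {ipaddress.ip_address("172.16.0.54")}
free = [ip for ip in usable if ip not in in_use]

print(usable)  # [IPv4Address('172.16.0.54')]
print(free)    # []
```

With no free address left, any further allocation attempt on this network, such as the one for the DOWN port, fails with `IpAddressGenerationFailure`.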

A workaround is deleting the DHCP port (c2515017-416a-49f1-9f61-0f0c271de698) for this network.


Version-Release number of selected component (if applicable):
[redhat-release] Red Hat Enterprise Linux release 8.2 (Ootpa)
[rhosp-release] Red Hat OpenStack Platform release 16.1.5 GA (Train)


Actual results:
Some new and some old instances are not reachable at all.

Expected results:
The network recovers automatically after a network issue.


Additional info:
Even though the network now seems to be OK, check_metadata_ports still logs an ERROR:
~~~
2021-09-24 13:59:08.836 39 ERROR futurist.periodics [req-9fe6abf5-0fdc-458f-a190-8cc68343587f - - - - -] Failed to call periodic 'networking_ovn.common.maintenance.DBInconsistenciesPeriodics.check_metadata_ports' (it runs every 1800.00 seconds): neutron_lib.exceptions.IpAddressGenerationFailure: No more IP addresses available on network 054893c4-00dd-4f46-8f50-ee5c52e03543.
2021-09-24 13:59:08.836 39 ERROR futurist.periodics Traceback (most recent call last):
2021-09-24 13:59:08.836 39 ERROR futurist.periodics   File "/usr/lib/python3.6/site-packages/neutron/db/ipam_pluggable_backend.py", line 138, in _ipam_allocate_ips
2021-09-24 13:59:08.836 39 ERROR futurist.periodics     ip_address, subnet_id = ipam_allocator.allocate(ip_request)
2021-09-24 13:59:08.836 39 ERROR futurist.periodics   File "/usr/lib/python3.6/site-packages/neutron/ipam/subnet_alloc.py", line 240, in allocate
2021-09-24 13:59:08.836 39 ERROR futurist.periodics     raise ipam_exc.IpAddressGenerationFailureAllSubnets()
2021-09-24 13:59:08.836 39 ERROR futurist.periodics neutron.ipam.exceptions.IpAddressGenerationFailureAllSubnets: No more IP addresses available.
2021-09-24 13:59:08.836 39 ERROR futurist.periodics
2021-09-24 13:59:08.836 39 ERROR futurist.periodics During handling of the above exception, another exception occurred:
[...]
2021-09-24 13:59:08.836 39 ERROR futurist.periodics   File "/usr/lib/python3.6/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2021-09-24 13:59:08.836 39 ERROR futurist.periodics     six.reraise(self.type_, self.value, self.tb)
2021-09-24 13:59:08.836 39 ERROR futurist.periodics   File "/usr/lib/python3.6/site-packages/six.py", line 675, in reraise
2021-09-24 13:59:08.836 39 ERROR futurist.periodics     raise value
2021-09-24 13:59:08.836 39 ERROR futurist.periodics   File "/usr/lib/python3.6/site-packages/neutron/db/ipam_pluggable_backend.py", line 141, in _ipam_allocate_ips
2021-09-24 13:59:08.836 39 ERROR futurist.periodics     net_id=port['network_id'])
2021-09-24 13:59:08.836 39 ERROR futurist.periodics neutron_lib.exceptions.IpAddressGenerationFailure: No more IP addresses available on network 054893c4-00dd-4f46-8f50-ee5c52e03543.
2021-09-24 13:59:08.836 39 ERROR futurist.periodics
~~~
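
The linked Gerrit change is titled 'Fix "_sync_metadata_ports" with no DHCP subnets', and the subnet in this report has `enable_dhcp` set to False. The diff itself is not reproduced here, so the sketch below is not the merged patch. It only illustrates one plausible reading of the title: skip subnets without DHCP when building the fixed IP requests for a metadata port. The helper name `metadata_fixed_ips` and the dict layout are assumptions modelled on the subnet table in the description.

```python
def metadata_fixed_ips(subnets):
    """Build fixed_ips requests only for IPv4 subnets with DHCP enabled.

    Hypothetical helper, not networking-ovn code: `subnets` is a list of
    dicts shaped like the subnet table in this report.
    """
    return [
        {"subnet_id": subnet["id"]}
        for subnet in subnets
        if subnet["ip_version"] == 4 and subnet["enable_dhcp"]
    ]


# The subnet from this report: IPv4, DHCP disabled, single-address pool.
reported_subnets = [
    {
        "id": "72e0c020-abb0-4569-8153-abde2f9eb8fe",
        "ip_version": 4,
        "enable_dhcp": False,
    }
]

# No DHCP-enabled subnet, so no IP allocation would be requested at all.
print(metadata_fixed_ips(reported_subnets))
```

Under this assumption, a periodic check built on such a filter would not try to allocate an address on the exhausted subnet, and so would not raise the error shown above.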

Comment 21 errata-xmlrpc 2022-03-24 11:01:35 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (Red Hat OpenStack Platform 16.1.8 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0986

