Bug 2081766
| Summary: | [Neutron][OVN] - Synchronizing Neutron and OVN databases maintenance task failures on port groups. | |||
|---|---|---|---|---|
| Product: | Red Hat OpenStack | Reporter: | Matt Flusche <mflusche> | |
| Component: | openstack-neutron | Assignee: | Jakub Libosvar <jlibosva> | |
| Status: | CLOSED NEXTRELEASE | QA Contact: | Eran Kuris <ekuris> | |
| Severity: | high | Docs Contact: | ||
| Priority: | high | |||
| Version: | 16.1 (Train) | CC: | chrisw, froyo, scohen, ssigwald | |
| Target Milestone: | z10 | Keywords: | Triaged | |
| Target Release: | 16.1 (Train on RHEL 8.2) | |||
| Hardware: | x86_64 | |||
| OS: | Linux | |||
| Whiteboard: | ||||
| Fixed In Version: | Doc Type: | If docs needed, set a value | ||
| Doc Text: | Story Points: | --- | ||
| Clone Of: | ||||
| : | 2086899 | Environment: | ||
| Last Closed: | 2022-11-14 16:25:46 UTC | Type: | Bug | |
| Regression: | --- | Mount Type: | --- | |
| Documentation: | --- | CRM: | ||
| Verified Versions: | Category: | --- | ||
| oVirt Team: | --- | RHEL 7.3 requirements from Atomic Host: | ||
| Cloudforms Team: | --- | Target Upstream Version: | ||
| Embargoed: | ||||
| Bug Depends On: | 2080222, 2086899 | |||
| Bug Blocks: | ||||
It looks like maintenance task inconsistency detection doesn't work on port groups. This will be fixed in 16.2.
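As a rough illustration of what inconsistency detection for port groups involves, the sketch below compares the port group names expected from Neutron security group IDs with the names actually present in OVN. It is not the networking-ovn implementation: the function names are hypothetical, the input lists are sample data, and the naming rule (`pg_` followed by the security group UUID with dashes replaced by underscores) is an assumption inferred from the port group names that appear in this report.

```python
from typing import Iterable, Set


def port_group_name(security_group_id: str) -> str:
    """Hypothetical helper. Assumes the name is 'pg_' plus the UUID
    with dashes replaced by underscores, as the names in this report suggest."""
    return "pg_" + security_group_id.replace("-", "_")


def missing_port_groups(neutron_sg_ids: Iterable[str],
                        ovn_pg_names: Iterable[str]) -> Set[str]:
    """Return the expected port group names that are absent from OVN."""
    expected = {port_group_name(sg_id) for sg_id in neutron_sg_ids}
    return expected - set(ovn_pg_names)


# Sample data only: security group IDs as Neutron would report them,
# and port group names as listed from the OVN database.
sg_ids = [
    "0f5d1826-f5a1-4702-bf1e-b157b5e95b55",
    "b3bd2359-6630-479e-84da-ec3cbff07ea7",
]
present = ["pg_b3bd2359_6630_479e_84da_ec3cbff07ea7"]

print(sorted(missing_port_groups(sg_ids, present)))
```

A check of this kind only reports which port groups are absent; recreating them and their ACLs is a separate step.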
Description of problem:
In this environment, new instances fail to launch with Neutron internal server errors [1]. The Neutron server log shows a failure to find a port group [2]. In the OVN DB, we verified that the port group is missing:

# export NBDB=$(sudo ovs-vsctl get open . external_ids:ovn-remote | sed -e 's/\"//g' | sed -e 's/6642/6641/g')
# alias ovn-nbctl='sudo podman exec ovn_controller ovn-nbctl --db=$NBDB'
# ovn-nbctl list Port_Group pg_0f5d1826_f5a1_4702_bf1e_b157b5e95b55
(nil)

A resync maintenance task runs every 5 minutes to fix such inconsistencies, but it appears to be broken [3]. Instances launch successfully when a new security group is used.

[1] - Nova instance failure.

{'code': 500, 'created': '2022-04-20T20:02:53Z', 'message': "Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance dbfc727e-9ccc-4560-99c9-e3a4747ae4b7. Last exception: Request Failed: internal server error while processing your request.\nNeutron server returns request_ids: ['req-6283cc", 'details': 'Traceback (most recent call last):\n File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 637, in build_instances\n filter_properties, instances[0].uuid)\n File "/usr/lib/python3.6/site-packages/nova/scheduler/utils.py", line 895, in populate_retry\n raise exception.MaxRetriesExceeded(reason=msg)\nnova.exception.MaxRetriesExceeded: Exceeded maximum number of retries. Exceeded max scheduling attempts 3 for instance dbfc727e-9ccc-4560-99c9-e3a4747ae4b7. Last exception: Request Failed: internal server error while processing your request.\nNeutron server returns request_ids: [\'req-6283ccf0-6fe5-46cf-8342-80f706bd86d8\']\n'}
flavor | disk='80', ephemeral='0', , original_name='m1.large', ram='8192', swap='0', vcpus='4'

[2] - port group failure from neutron server.log

2022-04-20 16:02:51.479 30 ERROR ovsdbapp.backend.ovs_idl.transaction [req-6283ccf0-6fe5-46cf-8342-80f706bd86d8 b7c2463c9c8b1856fb969afd935dbe8666185eba22b83b1f7050d91f9b9fdbfc 6d443e 80563646ba8ccfddeeed2380f1 - 2896954cd77544dc8e673b41d318f3e9 2896954cd77544dc8e673b41d318f3e9] Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovsdbapp/schema/ovn_northbound/commands.py", line 1329, in run_idl
    pg = self.api.lookup('Port_Group', self.port_group)
  File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 204, in lookup
    return self._lookup(table, record)
  File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 260, in _lookup
    row = idlutils.row_by_value(self, rl.table, rl.column, record)
  File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 130, in row_by_value
    raise RowNotFound(table=table, col=column, match=match)
ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Port_Group with name=pg_040f6daa_0ab4_4914_b4c7_0ff8228f0fb7

[3] - ovn sync failure example from neutron server.log

2022-04-20 16:03:02.297 50 DEBUG networking_ovn.common.maintenance [req-4fd77dba-3997-426b-b5da-e465c9ebc689 - - - - -] Maintenance task: Synchronizing Neutron and OVN databases check_for_inconsistencies /usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py:341
2022-04-20 16:03:02.298 50 DEBUG networking_ovn.common.maintenance [req-4fd77dba-3997-426b-b5da-e465c9ebc689 - - - - -] Maintenance task: Number of inconsistencies found at create/update: security_group_rules=6 _log /usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py:322
2022-04-20 16:03:02.298 50 DEBUG networking_ovn.common.maintenance [req-4fd77dba-3997-426b-b5da-e465c9ebc689 - - - - -] Maintenance task: Fixing resource 9fd01595-30fa-4b65-bd7c-3caa67d1e518 (type: security_group_rules) at create/update check_for_inconsistencies /usr/lib/python3.6/site-packages/networking_ovn/common/maintenance.py:353
2022-04-20 16:03:02.307 50 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): PgAclAddCommand(entity=pg_b3bd2359_6630_479e_84da_ec3cbff07ea7, direction=from-lport, priority=1002, match=inport == @pg_b3bd2359_6630_479e_84da_ec3cbff07ea7 && ip4 && ip4.dst == 0.0.0.0/0 && tcp && tcp.dst == 25, action=allow-related, log=False, may_exist=False, severity=[], name=[], external_ids={'neutron:security_group_rule_id': '9fd01595-30fa-4b65-bd7c-3caa67d1e518'}) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:84
2022-04-20 16:03:02.308 50 ERROR ovsdbapp.backend.ovs_idl.transaction [req-4fd77dba-3997-426b-b5da-e465c9ebc689 - - - - -] Traceback (most recent call last):
  File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 128, in run
    txn.results.put(txn.do_commit())
  File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 86, in do_commit
    command.run_idl(txn)
  File "/usr/lib/python3.6/site-packages/ovsdbapp/schema/ovn_northbound/commands.py", line 121, in run_idl
    self.direction, self.priority, self.match))
RuntimeError: ACL (from-lport, 1002, inport == @pg_b3bd2359_6630_479e_84da_ec3cbff07ea7 && ip4 && ip4.dst == 0.0.0.0/0 && tcp && tcp.dst == 25) already exists
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command [req-4fd77dba-3997-426b-b5da-e465c9ebc689 - - - - -] Error executing command: RuntimeError: ACL (from-lport, 1002, inport == @pg_b3bd2359_6630_479e_84da_ec3cbff07ea7 && ip4 && ip4.dst == 0.0.0.0/0 && tcp && tcp.dst == 25) already exists
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command Traceback (most recent call last):
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 111, in transaction
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     yield self._nested_txns_map[cur_thread_id]
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command KeyError: 139979826286216
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command During handling of the above exception, another exception occurred:
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command Traceback (most recent call last):
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 42, in execute
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     t.add(self)
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     next(self.gen)
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib/python3.6/site-packages/networking_ovn/ovsdb/impl_idl_ovn.py", line 196, in transaction
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     yield t
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib64/python3.6/contextlib.py", line 88, in __exit__
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     next(self.gen)
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 119, in transaction
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     del self._nested_txns_map[cur_thread_id]
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib/python3.6/site-packages/ovsdbapp/api.py", line 69, in __exit__
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     self.result = self.commit()
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 62, in commit
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     raise result.ex
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 128, in run
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     txn.results.put(txn.do_commit())
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 86, in do_commit
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     command.run_idl(txn)
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command   File "/usr/lib/python3.6/site-packages/ovsdbapp/schema/ovn_northbound/commands.py", line 121, in run_idl
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command     self.direction, self.priority, self.match))
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command RuntimeError: ACL (from-lport, 1002, inport == @pg_b3bd2359_6630_479e_84da_ec3cbff07ea7 && ip4 && ip4.dst == 0.0.0.0/0 && tcp && tcp.dst == 25) already exists
2022-04-20 16:03:02.309 50 ERROR ovsdbapp.backend.ovs_idl.command
2022-04-20 16:03:02.309 50 ERROR networking_ovn.common.maintenance [req-4fd77dba-3997-426b-b5da-e465c9ebc689 - - - - -] Maintenance task: Failed to fix resource 9fd01595-30fa-4b65-bd7c-3caa67d1e518 (type: security_group_rules): RuntimeError: ACL (from-lport, 1002, inport == @pg_b3bd2359_6630_479e_84da_ec3cbff07ea7 && ip4 && ip4.dst == 0.0.0.0/0 && tcp && tcp.dst == 25) already exists

Version-Release number of selected component (if applicable):
container version: openstack-neutron-server-ovn:16.1.7-12

$ podman run --net host registry.redhat.io/rhosp-rhel8/openstack-neutron-server-ovn:16.1.7-12 rpm -qa |grep neutron
python3-neutron-dynamic-routing-15.0.1-1.20210528043020.56de1c4.el8ost.noarch
puppet-neutron-15.5.1-1.20210614113305.7d0406b.el8ost.noarch
python3-neutron-lib-1.29.1-1.20210527195021.4ef4b71.el8ost.noarch
python3-neutron-15.2.1-1.20210712133309.el8ost.noarch
openstack-neutron-ml2-15.2.1-1.20210712133309.el8ost.noarch
python3-neutronclient-6.14.1-1.20210528021924.a09e824.el8ost.noarch
openstack-neutron-common-15.2.1-1.20210712133309.el8ost.noarch
openstack-neutron-15.2.1-1.20210712133309.el8ost.noarch

$ podman run --net host registry.redhat.io/rhosp-rhel8/openstack-neutron-server-ovn:16.1.7-12 rpm -qa |grep ovn
puppet-ovn-15.4.1-1.20210528102649.192ac4e.el8ost.noarch
python3-networking-ovn-7.3.1-1.20210714143310.el8ost.noarch

How reproducible:
100% in this specific env.

Steps to Reproduce:
1. Launch instance with this specific security group

Additional info:
Will provide
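
For triage on a larger server.log, the maintenance task failures shown in [3] can be collected with a short script. The following is a minimal sketch, assuming the "Failed to fix resource" message keeps the format quoted in this report; the function name `failed_fixes` and the regular expression are illustrative and not part of networking-ovn.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Matches the maintenance task message format quoted in this report.
FAILED_FIX = re.compile(
    r"Maintenance task: Failed to fix resource (?P<id>[0-9a-f-]{36}) "
    r"\(type: (?P<type>\w+)\): (?P<error>.*)"
)


def failed_fixes(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (resource_type, resource_id, error) for each failed fix."""
    found = []
    for line in lines:
        match = FAILED_FIX.search(line)
        if match:
            found.append((match.group("type"),
                          match.group("id"),
                          match.group("error").strip()))
    return found


# Two log lines copied from the report; only the first is a failed fix.
sample = [
    "2022-04-20 16:03:02.309 50 ERROR networking_ovn.common.maintenance "
    "[req-4fd77dba-3997-426b-b5da-e465c9ebc689 - - - - -] Maintenance task: "
    "Failed to fix resource 9fd01595-30fa-4b65-bd7c-3caa67d1e518 "
    "(type: security_group_rules): RuntimeError: ACL (from-lport, 1002, "
    "inport == @pg_b3bd2359_6630_479e_84da_ec3cbff07ea7 && ip4 && "
    "ip4.dst == 0.0.0.0/0 && tcp && tcp.dst == 25) already exists",
    "2022-04-20 16:03:02.297 50 DEBUG networking_ovn.common.maintenance "
    "[req-4fd77dba-3997-426b-b5da-e465c9ebc689 - - - - -] Maintenance task: "
    "Synchronizing Neutron and OVN databases",
]

fixes = failed_fixes(sample)
# Count failures per resource type to see which kind of resource is stuck.
print(Counter(resource_type for resource_type, _, _ in fixes))
```

To scan a real log, pass an open file object instead of `sample`; the same resource ID recurring across successive 5-minute runs would indicate a fix that never succeeds.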