Description of problem:

Many rally tests failed during this job [1]. All the rally results can be seen here [2]. Let's focus on the create_and_list_networks failures [3], which were the first tests failing chronologically.

Haproxy logs [4]:

At 11:49:32 some network creations fail on both controller-0 and controller-1 with errors 500 and 404, which means the neutron server returned an error:

May 24 11:49:32 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34714 [24/May/2022:11:49:30.565] neutron neutron/controller-0.internalapi.redhat.local 0/0/0/1579/1579 500 352 - - ---- 42/21/19/7/0 0/0 "POST /v2.0/networks HTTP/1.1"
May 24 11:49:34 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34720 [24/May/2022:11:49:30.571] neutron neutron/controller-0.internalapi.redhat.local 0/0/0/3830/3830 404 320 - - ---- 43/21/18/6/0 0/0 "POST /v2.0/networks HTTP/1.1"
May 24 11:49:34 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34694 [24/May/2022:11:49:30.552] neutron neutron/controller-0.internalapi.redhat.local 0/0/1/4230/4231 500 352 - - ---- 44/21/17/5/0 0/0 "POST /v2.0/networks HTTP/1.1"
May 24 11:49:35 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34700 [24/May/2022:11:49:30.558] neutron neutron/controller-1.internalapi.redhat.local 0/0/0/5062/5062 500 352 - - ---- 46/21/16/6/0 0/0 "POST /v2.0/networks HTTP/1.1"

Two minutes later, some network creation requests are answered with a 504 proxy error, which means the neutron server didn't reply to any of them (they had been sent to controller-0, -1 and -2):

May 24 11:51:30 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34688 [24/May/2022:11:49:30.549] neutron neutron/controller-0.internalapi.redhat.local 0/0/0/-1/120001 504 194 - - sH-- 20/20/19/4/0 0/0 "POST /v2.0/networks HTTP/1.1"
May 24 11:51:30 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34690 [24/May/2022:11:49:30.549] neutron neutron/controller-2.internalapi.redhat.local 0/0/0/-1/120001 504 194 - - sH-- 19/19/18/7/0 0/0 "POST /v2.0/networks HTTP/1.1"
May 24 11:51:30 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34692 [24/May/2022:11:49:30.551] neutron neutron/controller-1.internalapi.redhat.local 0/0/0/-1/120000 504 194 - - sH-- 18/18/17/6/0 0/0 "POST /v2.0/networks HTTP/1.1"
May 24 11:51:30 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34698 [24/May/2022:11:49:30.554] neutron neutron/controller-2.internalapi.redhat.local 0/0/0/-1/120001 504 194 - - sH-- 17/17/16/6/0 0/0 "POST /v2.0/networks HTTP/1.1"
May 24 11:51:30 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34702 [24/May/2022:11:49:30.561] neutron neutron/controller-0.internalapi.redhat.local 0/0/0/-1/120002 504 194 - - sH-- 16/16/15/2/0 0/0 "POST /v2.0/networks HTTP/1.1"
May 24 11:51:30 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34708 [24/May/2022:11:49:30.562] neutron neutron/controller-0.internalapi.redhat.local 0/0/0/-1/120000 504 194 - - sH-- 15/15/14/2/0 0/0 "POST /v2.0/networks HTTP/1.1"
May 24 11:51:30 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34706 [24/May/2022:11:49:30.561] neutron neutron/controller-1.internalapi.redhat.local 0/0/1/-1/120002 504 194 - - sH-- 14/14/13/5/0 0/0 "POST /v2.0/networks HTTP/1.1"
May 24 11:51:30 controller-1 haproxy[12]: 2620:52:0:13b8::fe:ce:34704 [24/May/2022:11:49:30.561] neutron neutron/controller-2.internalapi.redhat.local 0/0/0/-1/120002 504 194 - - sH-- 13/13/12/5/0 0/0 "POST /v2.0/networks HTTP/1.1"

Let's focus now on controller-0's neutron server logs [5].

Three network creation requests are received almost simultaneously (<1 sec):

2022-05-24 11:49:30.670 17 DEBUG neutron.api.v2.base [req-426cea11-530a-4cac-a3f5-47c20b70b3e3 b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] Request body: {'network': {'name': 's_rally_459ef235_KOA3tzCw'}} prepare_request_body /usr/lib/python3.6/site-packages/neutron/api/v2/base.py:719
...
2022-05-24 11:49:31.110 16 DEBUG neutron.api.v2.base [req-e89b3c23-17d7-41ac-8d46-2fb4e23fa06d b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] Request body: {'network': {'name': 's_rally_459ef235_CfRiDwM5'}} prepare_request_body /usr/lib/python3.6/site-packages/neutron/api/v2/base.py:719
...
2022-05-24 11:49:31.381 16 DEBUG neutron.api.v2.base [req-434cad68-a47e-4887-ad1e-374364b58b1e b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] Request body: {'network': {'name': 's_rally_459ef235_SRHmc3mp'}} prepare_request_body /usr/lib/python3.6/site-packages/neutron/api/v2/base.py:719

However, the last one received is the first one triggering an operation on the OVN NBDB (LsAddCommand):

2022-05-24 11:49:31.551 16 DEBUG networking_ovn.db.revision [req-434cad68-a47e-4887-ad1e-374364b58b1e b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] create_initial_revision uuid=c060d40f-7c98-49c7-8f82-19f9b38af4f5, type=networks, rev=-1 create_initial_revision /usr/lib/python3.6/site-packages/networking_ovn/db/revision.py:59
2022-05-24 11:49:31.621 16 DEBUG neutron_lib.callbacks.manager [req-434cad68-a47e-4887-ad1e-374364b58b1e b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] Notify callbacks ['neutron.plugins.ml2.ovo_rpc._ObjectChangeHandler.handle_event--9223372036845666552', 'neutron.api.rpc.agentnotifiers.dhcp_rpc_agent_api.DhcpAgentNotifyAPI._native_event_send_dhcp_notification-6604134'] for network, after_create _notify_loop /usr/lib/python3.6/site-packages/neutron_lib/callbacks/manager.py:193
2022-05-24 11:49:31.652 16 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): LsAddCommand(switch=neutron-c060d40f-7c98-49c7-8f82-19f9b38af4f5, columns={'external_ids': {'neutron:network_name': 's_rally_459ef235_SRHmc3mp', 'neutron:mtu': '1442', 'neutron:revision_number': '1'}, 'other_config': {'mcast_snoop': 'true', 'mcast_flood_unregistered': 'false', 'vlan-passthru': 'false'}}, may_exist=True) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:89
2022-05-24 11:49:31.655 16 DEBUG neutron.api.rpc.handlers.resources_rpc [req-434cad68-a47e-4887-ad1e-374364b58b1e b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - - -] Pushing event deleted for resources: {'Network': ['ID=c060d40f-7c98-49c7-8f82-19f9b38af4f5,revision_number=None']} push /usr/lib/python3.6/site-packages/neutron/api/rpc/handlers/resources_rpc.py:243
2022-05-24 11:49:31.669 16 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): CheckLivenessCommand() do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:89
2022-05-24 11:49:31.686 16 WARNING networking_ovn.db.revision [req-434cad68-a47e-4887-ad1e-374364b58b1e b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] No revision row found for c060d40f-7c98-49c7-8f82-19f9b38af4f5 (type: networks) when bumping the revision number. Creating one.: sqlalchemy.orm.exc.NoResultFound: No row was found for one()

After that warning message, the following logs are printed several times:

2022-05-24 11:49:31.708 17 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] ChassisAgentWriteEvent : Matched Chassis_Private, update, None None matches /usr/lib/python3.6/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:61
2022-05-24 11:49:31.710 17 DEBUG networking_ovn.ovsdb.ovsdb_monitor [-] ChassisAgentWriteEvent : Matched Chassis_Private, update, None None matches /usr/lib/python3.6/site-packages/networking_ovn/ovsdb/ovsdb_monitor.py:61

And finally, an exception is raised for req-434cad68-a47e-4887-ad1e-374364b58b1e:

2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers [req-434cad68-a47e-4887-ad1e-374364b58b1e b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] Mechanism driver 'ovn' failed in create_network_postcommit: networking_ovn.common.exceptions.StandardAttributeIDNotFound: Standard attribute ID not found for c060d40f-7c98-49c7-8f82-19f9b38af4f5
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers Traceback (most recent call last):
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/networking_ovn/db/revision.py", line 98, in _ensure_revision_row_exist
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers resource_type=resource_type).one()
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3282, in one
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers raise orm_exc.NoResultFound("No row was found for one()")
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers sqlalchemy.orm.exc.NoResultFound: No row was found for one()
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers During handling of the above exception, another exception occurred:
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers Traceback (most recent call last):
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/networking_ovn/db/revision.py", line 47, in _get_standard_attr_id
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers id=resource_uuid).one()
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers File "/usr/lib64/python3.6/site-packages/sqlalchemy/orm/query.py", line 3282, in one
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers raise orm_exc.NoResultFound("No row was found for one()")
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers sqlalchemy.orm.exc.NoResultFound: No row was found for one()
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers During handling of the above exception, another exception occurred:
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers Traceback (most recent call last):
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/neutron/plugins/ml2/managers.py", line 477, in _call_on_drivers
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers getattr(driver.obj, method_name)(context)
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/networking_ovn/ml2/mech_driver.py", line 474, in create_network_postcommit
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers self._ovn_client.create_network(network)
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/networking_ovn/common/ovn_client.py", line 1704, in create_network
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers db_rev.bump_revision(network, ovn_const.TYPE_NETWORKS)
...
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/networking_ovn/db/revision.py", line 63, in create_initial_revision
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers session, resource_uuid, resource_type)
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/networking_ovn/db/revision.py", line 51, in _get_standard_attr_id
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers resource_uuid=resource_uuid)
2022-05-24 11:49:31.719 16 ERROR neutron.plugins.ml2.managers networking_ovn.common.exceptions.StandardAttributeIDNotFound: Standard attribute ID not found for c060d40f-7c98-49c7-8f82-19f9b38af4f5

And req-434cad68-a47e-4887-ad1e-374364b58b1e is answered with an error 500:

2022-05-24 11:49:32.143 16 INFO neutron.wsgi [req-434cad68-a47e-4887-ad1e-374364b58b1e b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] fd00:fd00:fd00:2000::149 "POST /v2.0/networks HTTP/1.1" status: 500 len: 352 time: 1.5765831
Then, a new network creation request is received and immediately processed (the OVN NBDB command LsAddCommand is executed), but another exception is raised (similar to the previous one, not exactly the same). Remember the first two requests have not triggered a LsAddCommand yet:

2022-05-24 11:49:33.481 16 DEBUG neutron.api.v2.base [req-9662a11e-ff89-4a01-8705-bcd95756b39d b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] Request body: {'network': {'name': 's_rally_459ef235_riuJ339c'}} prepare_request_body /usr/lib/python3.6/site-packages/neutron/api/v2/base.py:719
...
2022-05-24 11:49:33.782 16 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Running txn n=1 command(idx=0): LsAddCommand(switch=neutron-831abde6-6cf2-4de1-bd41-50527bad8489, columns={'external_ids': {'neutron:network_name': 's_rally_459ef235_riuJ339c', 'neutron:mtu': '1442', 'neutron:revision_number': '1'}, 'other_config': {'mcast_snoop': 'true', 'mcast_flood_unregistered': 'false', 'vlan-passthru': 'false'}}, may_exist=True) do_commit /usr/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:89
...
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers [req-9662a11e-ff89-4a01-8705-bcd95756b39d b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] Mechanism driver 'ovn' failed in create_network_postcommit: neutron_lib.exceptions.NetworkNotFound: Network 831abde6-6cf2-4de1-bd41-50527bad8489 could not be found.
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers Traceback (most recent call last):
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/neutron/plugins/ml2/managers.py", line 477, in _call_on_drivers
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers getattr(driver.obj, method_name)(context)
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/networking_ovn/ml2/mech_driver.py", line 474, in create_network_postcommit
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers self._ovn_client.create_network(network)
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/networking_ovn/common/ovn_client.py", line 1705, in create_network
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers self.create_metadata_port(context, network)
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/networking_ovn/common/ovn_client.py", line 2183, in create_metadata_port
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers p_utils.create_port(self._plugin, context, port)
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/neutron_lib/plugins/utils.py", line 336, in create_port
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers context, {port_apidef.RESOURCE_NAME: port_data})
...
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/neutron/plugins/ml2/plugin.py", line 1446, in create_port
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers result, mech_context = self._create_port_db(context, port)
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/neutron/plugins/ml2/plugin.py", line 1413, in _create_port_db
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers port_db = self.create_port_db(context, port)
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers File "/usr/lib/python3.6/site-packages/neutron/db/db_base_plugin_v2.py", line 1446, in create_port_db
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers raise exc.NetworkNotFound(net_id=network_id)
2022-05-24 11:49:33.925 16 ERROR neutron.plugins.ml2.managers neutron_lib.exceptions.NetworkNotFound: Network 831abde6-6cf2-4de1-bd41-50527bad8489 could not be found.

There are no other logs about the first and second requests, req-426cea11-530a-4cac-a3f5-47c20b70b3e3 and req-e89b3c23-17d7-41ac-8d46-2fb4e23fa06d, until ~14 minutes later (this explains why some requests were replied with a 504 proxy error):

2022-05-24 12:03:07.517 17 DEBUG neutron_lib.callbacks.manager [req-426cea11-530a-4cac-a3f5-47c20b70b3e3 b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] Callback networking_ovn.ml2.mech_driver.OVNMechanismDriver._create_security_group_precommit--9223372036847719610 raised (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')
...
2022-05-24 12:03:07.519 16 DEBUG neutron_lib.callbacks.manager [req-e89b3c23-17d7-41ac-8d46-2fb4e23fa06d b859859d233f466e8a721ec0289f59d8 4db2f18a5c0c42119b3e871b8fef26a5 - default default] Callback neutron.plugins.ml2.plugin.SecurityGroupDbMixin._ensure_default_security_group_handler--9223372036853696690 raised (pymysql.err.OperationalError) (2013, 'Lost connection to MySQL server during query')

I checked the mysql logs from the controller nodes, and the only message printed around this time is the following one (on controller-0, the same message is printed many times):

2022-05-24 11:53:31 0 [Warning] InnoDB: A long semaphore wait: --Thread 139734580496128 has waited at lock0wait.cc line 96 for 241.00 seconds the semaphore: Mutex at 0x55eec7f64d00, Mutex LOCK_SYS created lock0lock.cc:461, lock var 2
2022-05-24 11:53:31 0 [Warning] InnoDB: A long semaphore wait: --Thread 139735306073856 has waited at lock0lock.cc line 3377 for 241.00 seconds the semaphore: Mutex at 0x55eec7f64d00, Mutex LOCK_SYS created lock0lock.cc:461, lock var 2

[1] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/network/view/networking-ovn/job/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp-ipv6-geneve-dvr-rally/40/
[2] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp-ipv6-geneve-dvr-rally/40/infrared/.workspaces/workspace_2022-05-24_10-28-58/test_results/results-20220524T114637.html#/
[3] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp-ipv6-geneve-dvr-rally/40/infrared/.workspaces/workspace_2022-05-24_10-28-58/test_results/results-20220524T114637.html#/NeutronNetworks.create_and_list_networks/failures
[4] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp-ipv6-geneve-dvr-rally/40/controller-1/var/log/containers/haproxy/haproxy.log.gz
[5] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp-ipv6-geneve-dvr-rally/40/controller-0/var/log/containers/neutron/server.log.2.gz

Version-Release number of selected component (if applicable):
RHOS-16.2-RHEL-8-20220513.n.2

How reproducible:
For the moment, it has been reproduced only once.

Steps to Reproduce:
1. Run the rally job.

Maybe we can try to reproduce it by sending 20 processes in parallel creating networks; AFAIK, this is what rally does (see the sketch below).
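As a starting point for a reproducer outside rally, the following is a minimal sketch that fires 20 concurrent network-create requests, roughly what the create_and_list_networks scenario does. It assumes openstacksdk is installed and clouds.yaml has an entry for this cloud; the cloud name 'overcloud' and the network names are illustrative:

# Hedged reproduction sketch: 20 concurrent POST /v2.0/networks requests.
# Assumes openstacksdk and a clouds.yaml entry named 'overcloud' (illustrative).
import uuid
from concurrent.futures import ThreadPoolExecutor, as_completed

import openstack

def create_one(i):
    conn = openstack.connect(cloud='overcloud')  # one client per worker
    name = 'repro_%s_%d' % (uuid.uuid4().hex[:8], i)
    return conn.network.create_network(name=name).id

with ThreadPoolExecutor(max_workers=20) as pool:
    futures = [pool.submit(create_one, i) for i in range(20)]
    for fut in as_completed(futures):
        try:
            print('created', fut.result())
        except Exception as exc:  # a 500/504 from the API would show up here
            print('failed:', exc)

Threads are used here only for simplicity; separate processes (e.g. multiprocessing) would match the "20 processes" idea more literally.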
*** Bug 2090796 has been marked as a duplicate of this bug. ***
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.4), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:8794