Description of problem: Tempest tests are executed on several processes/workers in parallel. Some tests fail performing different operations on routers (the issue has been observed with DELETE and PUT requests) raising the following error: RouterNotFound - Router <router-id> could not be found ---BEGIN EDIT DESCRIPTION--- I have reviewed again the tempest logs and I have to correct what I said above: the issue is only observed with PUT requests, and more specifically it only happens with add_router_interface and remove_router_interface PUT requests All the occurrences of this bug during the execution of tempest can be obtained running the following script on the tempest logs from [1]: j=0 for i in $(seq 1 $(zgrep -c "404 PUT.*router" tempest.log.gz)) do requestfirstline=$(zgrep -m$i -A4 "404 PUT.*router" tempest.log.gz | tail -5 | head -1) requestlastline=$(zgrep -m$i -A4 "404 PUT.*router" tempest.log.gz | tail -1) routernotfound=$(echo $requestlastline | grep "Router .* could not be found" | sed 's/.*Router \(.*\) could not be found.*/\1/') [ "$routernotfound" != "" ] && echo $requestfirstline| grep -v $routernotfound && echo "RouterNotFound: $routernotfound" && ((j=j+1)) && echo "^^^ Ocurrence number $j" done ---END EDIT DESCRIPTION--- When this error happens, the request is sent for routerA, but the error response refers to routerB. And it seems routerB was used in a previous test and was deleted before. I will paste here some tempest logs (obtained at [1]): 1) The tempest PID 302589 successfully deletes the router d4e8bfce-0004-4255-93c1-e88f370039f1 at 02:36:20 2022-02-14 02:36:20.355 302589 INFO tempest.lib.common.rest_client [req-4ebda313-720f-4459-887b-a27a750f9da0 ] Request (RoutersTest:_run_cleanups): 204 DELETE http://10.218.0.101:9696/v2.0/routers/d4e8bfce-0004-4255-93c1-e88f370039f1 3.315s 2) The tempest PID 302593 fails at disconnecting a subnet from the router 3ef3f26c-ea8c-4d86-a730-f371e5866807 at 02:36:27 - the error response says "Router d4e8bfce-0004-4255-93c1-e88f370039f1 could not be found": 2022-02-14 02:36:27.400 302593 INFO tempest.lib.common.rest_client [req-a94eda04-3140-4a71-8611-02e4887a5b0a ] Request (TestGettingAddress:_run_cleanups): 404 PUT http://10.218.0.101:9696/v2.0/routers/3ef3f26c-ea8c-4d86-a730-f371e5866807/remove_router_interface 12.749s 2022-02-14 02:36:27.400 302593 DEBUG tempest.lib.common.rest_client [req-a94eda04-3140-4a71-8611-02e4887a5b0a ] Request - Headers: {'Content-Type': 'application/json', 'Accept': 'application/json', 'X-Auth-Token': '<omitted>'} Body: {"subnet_id": "c305b87c-cb76-40f9-9723-23d59a86fe52"} Response - Headers: {'content-type': 'application/json', 'content-length': '135', 'x-openstack-request-id': 'req-a94eda04-3140-4a71-8611-02e4887a5b0a', 'date': 'Mon, 14 Feb 2022 02:36:27 GMT', 'connection': 'close', 'status': '404', 'content-location': 'http://10.218.0.101:9696/v2.0/routers/3ef3f26c-ea8c-4d86-a730-f371e5866807/remove_router_interface'} Body: b'{"NeutronError": {"type": "RouterNotFound", "message": "Router d4e8bfce-0004-4255-93c1-e88f370039f1 could not be found", "detail": ""}}' _log_request_full /usr/lib/python3.6/site-packages/tempest/lib/common/rest_client.py:457 The neutron server logs at controller-1 (see [2]) show some warning messages for the failed request mentioned above at 2), but the wrong router id is always seen on those logs: $ zgrep "WARNING.*req-a94eda04-3140-4a71-8611-02e4887a5b0a" server.log.2.gz 2022-02-14 02:36:24.898 15 WARNING neutron.scheduler.dhcp_agent_scheduler [req-a94eda04-3140-4a71-8611-02e4887a5b0a 3f4622530e9b4262be6f525269e61f1c 936c75b581f3439b88a58c06b0ff4be9 - default default] No more DHCP agents: neutron_lib.exceptions.l3.RouterNotFound: Router d4e8bfce-0004-4255-93c1-e88f370039f1 could not be found 2022-02-14 02:36:24.905 15 WARNING neutron.api.rpc.agentnotifiers.dhcp_rpc_agent_api [req-a94eda04-3140-4a71-8611-02e4887a5b0a 3f4622530e9b4262be6f525269e61f1c 936c75b581f3439b88a58c06b0ff4be9 - default default] Unable to schedule network a1a38b62-079b-4869-85e9-9c7c9223c967: no agents available; will retry on subsequent port and subnet creation events.: neutron_lib.exceptions.l3.RouterNotFound: Router d4e8bfce-0004-4255-93c1-e88f370039f1 could not be found 2022-02-14 02:36:27.103 15 WARNING neutron.scheduler.dhcp_agent_scheduler [req-a94eda04-3140-4a71-8611-02e4887a5b0a 3f4622530e9b4262be6f525269e61f1c 936c75b581f3439b88a58c06b0ff4be9 - default default] No more DHCP agents: neutron_lib.exceptions.l3.RouterNotFound: Router d4e8bfce-0004-4255-93c1-e88f370039f1 could not be found 2022-02-14 02:36:27.118 15 WARNING neutron.api.rpc.agentnotifiers.dhcp_rpc_agent_api [req-a94eda04-3140-4a71-8611-02e4887a5b0a 3f4622530e9b4262be6f525269e61f1c 936c75b581f3439b88a58c06b0ff4be9 - default default] Unable to schedule network a1a38b62-079b-4869-85e9-9c7c9223c967: no agents available; will retry on subsequent port and subnet creation events.: neutron_lib.exceptions.l3.RouterNotFound: Router d4e8bfce-0004-4255-93c1-e88f370039f1 could not be found Several tempest tests from the following job failed due to this issue. See [3]. Version-Release number of selected component (if applicable): RHOS-16.2-RHEL-8-20220210.n.1 How reproducible: issue only found in one job, but affecting several tests [1] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp_3net-ipv4-geneve-composable-vlan-provider-network/64/undercloud-0/home/stack/tempest-dir/tempest.log.gz [2] http://rhos-ci-logs.lab.eng.tlv2.redhat.com/logs/rcj/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp_3net-ipv4-geneve-composable-vlan-provider-network/64/controller-1/var/log/containers/neutron/server.log.2.gz [3] https://rhos-ci-jenkins.lab.eng.tlv2.redhat.com/view/DFG/view/network/view/networking-ovn/job/DFG-network-networking-ovn-16.2_director-rhel-virthost-3cont_2comp_3net-ipv4-geneve-composable-vlan-provider-network/64/
Upstream patch ready: https://review.opendev.org/c/openstack/neutron/+/830509
(In reply to Elvira from comment #5) > Upstream patch ready: https://review.opendev.org/c/openstack/neutron/+/830509 Hi Elvira, can you provide the d/s patch [backport of 2 branches trunk and non-trunk]
Since the problem described in this bug report should be resolved in a recent advisory, it has been closed with a resolution of ERRATA. For information on the advisory (Release of components for Red Hat OpenStack Platform 16.2.2), and where to find the updated files, follow the link below. If the solution does not work for you, open a new bug report. https://access.redhat.com/errata/RHBA-2022:1001