Bug 2283600

Summary: set_gateway_mtu fails if one of the lrp is deleted while it's running
Product: Red Hat OpenStack Reporter: Yatin Karel <ykarel>
Component: openstack-neutron Assignee: Yatin Karel <ykarel>
Status: CLOSED ERRATA QA Contact: Fiorella Yanac <fyanac>
Severity: high Docs Contact:
Priority: high    
Version: 17.1 (Wallaby) CC: chrisw, fyanac, mariel, mtomaska, scohen
Target Milestone: z4 Keywords: Triaged
Target Release: 17.1   
Hardware: Unspecified   
OS: Unspecified   
Whiteboard:
Fixed In Version: openstack-neutron-18.6.1-17.1.20240822200817.85ff760.el9ost Doc Type: If docs needed, set a value
Doc Text:
Story Points: ---
Clone Of: Environment:
Last Closed: 2024-11-21 09:40:45 UTC Type: Bug
Regression: --- Mount Type: ---
Documentation: --- CRM:
Verified Versions: Category: ---
oVirt Team: --- RHEL 7.3 requirements from Atomic Host:
Cloudforms Team: --- Target Upstream Version:
Embargoed:

Description Yatin Karel 2024-05-28 06:50:46 UTC
Description of problem:

Seen in some downstream jobs running with antelope content and ovn_emit_need_to_frag = true: tests fail randomly during add_router_interface or remove_router_interface calls, for example:

ft9.1: setUpClass (tempest.api.compute.servers.test_server_actions.ServerActionsV293TestJSON)
testtools.testresult.real._StringException: Traceback (most recent call last):
  File "/usr/lib/python3.9/site-packages/tempest/test.py", line 206, in setUpClass
    raise value.with_traceback(trace)
  File "/usr/lib/python3.9/site-packages/tempest/test.py", line 191, in setUpClass
    cls.setup_credentials()
  File "/usr/lib/python3.9/site-packages/tempest/api/compute/servers/test_server_actions.py", line 827, in setup_credentials
    super(ServerActionsV293TestJSON, cls).setup_credentials()
  File "/usr/lib/python3.9/site-packages/tempest/api/compute/base.py", line 75, in setup_credentials
    super(BaseV2ComputeTest, cls).setup_credentials()
  File "/usr/lib/python3.9/site-packages/tempest/test.py", line 419, in setup_credentials
    manager = cls.get_client_manager(
  File "/usr/lib/python3.9/site-packages/tempest/test.py", line 764, in get_client_manager
    creds = getattr(cred_provider, credentials_method)()
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/dynamic_creds.py", line 473, in get_primary_creds
    return self.get_project_member_creds()
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/dynamic_creds.py", line 508, in get_project_member_creds
    return self.get_credentials(['member'], scope='project')
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/dynamic_creds.py", line 459, in get_credentials
    network, subnet, router = self._create_network_resources(
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/dynamic_creds.py", line 319, in _create_network_resources
    self._add_router_interface(router['id'], subnet['id'])
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/dynamic_creds.py", line 383, in _add_router_interface
    self.routers_admin_client.add_router_interface(router_id,
  File "/usr/lib/python3.9/site-packages/tempest/lib/services/network/routers_client.py", line 72, in add_router_interface
    return self.update_resource(uri, kwargs)
  File "/usr/lib/python3.9/site-packages/tempest/lib/services/network/base.py", line 77, in update_resource
    resp, body = self.put(req_uri, req_post_data)
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/rest_client.py", line 372, in put
    return self.request('PUT', url, extra_headers, headers, body, chunked)
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/rest_client.py", line 742, in request
    self._error_checker(resp, resp_body)
  File "/usr/lib/python3.9/site-packages/tempest/lib/common/rest_client.py", line 922, in _error_checker
    raise exceptions.ServerFault(resp_body, resp=resp,
tempest.lib.exceptions.ServerFault: Got server fault
Details: Request Failed: internal server error while processing your request.

The request fails with an internal server error:
192.168.16.2 - - [13/May/2024:15:09:44 +0000] "PUT /v2.0/routers/a76e7487-6fcb-4134-8de4-f08a3eca33cb/add_router_interface HTTP/1.1" 500 150 "-" "python-urllib3/1.26.5"

2024-05-13T15:10:15.685665853+00:00 stdout F 2024-05-13 15:10:15.684 16 DEBUG ovsdbapp.backend.ovs_idl.transaction [None req-8b7c93b4-c0d8-4963-b711-cab362fcada4 - - - - - -] Running txn n=1 command(idx=20): LrpSetOptionsCommand(_result=None, entity=lrp-9309bd6d-2e46-404e-8c47-fda1adc15cc5, options={}) do_commit /usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py:89
2024-05-13T15:10:15.686489887+00:00 stdout F 2024-05-13 15:10:15.685 16 ERROR ovsdbapp.backend.ovs_idl.transaction [None req-d211c9ca-9694-4d69-9e57-b330a231c30f 9486b7f231584e7ab8f143c840957bf3 6bcd5aaa226d4e69b9e20287ea2d3b11 - - default default] Traceback (most recent call last):
2024-05-13T15:10:15.686489887+00:00 stdout F File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 118, in run
2024-05-13T15:10:15.686489887+00:00 stdout F txn.results.put(txn.do_commit())
2024-05-13T15:10:15.686489887+00:00 stdout F File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 92, in do_commit
2024-05-13T15:10:15.686489887+00:00 stdout F command.run_idl(txn)
2024-05-13T15:10:15.686489887+00:00 stdout F File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/command.py", line 349, in run_idl
2024-05-13T15:10:15.686489887+00:00 stdout F entity = self.api.lookup(self.table, self.entity)
2024-05-13T15:10:15.686489887+00:00 stdout F File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 183, in lookup
2024-05-13T15:10:15.686489887+00:00 stdout F return self._lookup(table, record)
2024-05-13T15:10:15.686489887+00:00 stdout F File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/__init__.py", line 234, in _lookup
2024-05-13T15:10:15.686489887+00:00 stdout F row = idlutils.row_by_value(self, rl.table, rl.column, record)
2024-05-13T15:10:15.686489887+00:00 stdout F File "/usr/lib/python3.9/site-packages/ovsdbapp/backend/ovs_idl/idlutils.py", line 114, in row_by_value
2024-05-13T15:10:15.686489887+00:00 stdout F raise RowNotFound(table=table, col=column, match=match)
2024-05-13T15:10:15.686489887+00:00 stdout F ovsdbapp.backend.ovs_idl.idlutils.RowNotFound: Cannot find Logical_Router_Port with name=lrp-9309bd6d-2e46-404e-8c47-fda1adc15cc5

This happens because the router port is removed while LrpSetOptionsCommand is running for that port; the transaction assumes the port exists [1]. LrpSetOptionsCommand/lrp_set_options would need if-exists=true support to handle such cases.

[1] https://opendev.org/openstack/neutron/src/branch/master/neutron/plugins/ml2/drivers/ovn/mech_driver/ovsdb/ovn_client.py#L2084-L2088
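The if-exists idea above can be sketched in isolation. This is only an illustration under stated assumptions, not the actual neutron or ovsdbapp change: FakeNbApi, SetLrpOptions and the local RowNotFound class are hypothetical stand-ins for the real Logical_Router_Port lookup and LrpSetOptionsCommand.

```python
# Hypothetical stand-ins only; none of these are the real ovsdbapp classes.


class RowNotFound(Exception):
    """Stand-in for the RowNotFound error seen in the traceback."""


class FakeNbApi:
    """Toy in-memory Logical_Router_Port table keyed by port name."""

    def __init__(self):
        self.ports = {}

    def lookup(self, name):
        try:
            return self.ports[name]
        except KeyError:
            raise RowNotFound(
                f"Cannot find Logical_Router_Port with name={name}")


class SetLrpOptions:
    """Set options on a router port, optionally tolerating its absence."""

    def __init__(self, api, name, if_exists=False, **options):
        self.api = api
        self.name = name
        self.if_exists = if_exists
        self.options = options

    def run(self):
        try:
            row = self.api.lookup(self.name)
        except RowNotFound:
            if self.if_exists:
                # Port was deleted concurrently: skip instead of
                # failing the whole transaction.
                return False
            raise
        row["options"] = dict(self.options)
        return True
```

With if_exists=True, a port that disappears between scheduling and execution makes the command a no-op, so the other commands in the same transaction can still be committed. With the default if_exists=False the lookup error propagates, which matches the failure in the log above.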


Version-Release number of selected component (if applicable):


How reproducible:
Random test failures when run with higher concurrency + ovn_emit_need_to_frag = true

Steps to Reproduce:
1.
2.
3.

Actual results:
Tests fail randomly

Expected results:
Tests shouldn't fail due to this

Additional info:
This bug tracks backports of the fixes for upstream issue https://bugs.launchpad.net/neutron/+bug/2065701

Comment 14 errata-xmlrpc 2024-11-21 09:40:45 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (RHOSP 17.1.4 bug fix and enhancement advisory), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2024:9974

Comment 15 Red Hat Bugzilla 2025-03-22 04:25:14 UTC
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 120 days