Bug 1956430

Summary: Add support for specifying a row's UUID to C and Python IDL
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Terry Wilson <twilson>
Component: openvswitch2.13
Assignee: Open vSwitch development team <ovs-team>
Status: NEW
QA Contact: Zhiqiang Fang <zfang>
Severity: unspecified
Priority: unspecified
Version: FDP 21.C
CC: ctrautma, jhsiao, ralongi
Hardware: Unspecified
OS: Unspecified
Type: Bug

Description Terry Wilson 2021-05-03 16:51:32 UTC
Description of problem:
There are situations, especially with OVN, where API requests are distributed among workers. If an object is created on one worker and then updated or referenced on another (e.g., a logical switch is created on one, and a port is immediately added on another), the second request may be processed before ovsdb-server has sent that client the row created by the first request. The insert then fails because we can neither look up the logical switch in memory nor predict the UUID the server will assign, so we cannot create a "fake" row with which to craft the JSON-RPC request.

If we could set the UUID ourselves, we could, for example, create the Logical_Switch with the UUID of the associated Neutron network, so both workers would know the UUID in advance. When creating the Logical_Switch_Port, we could then just try to create it and update the 'ports' field using a Row with the UUID of that Neutron network.

A patch was merged to optimize a ddlog use case: https://github.com/openvswitch/ovs/commit/a529e3cd1f410747778c3a8be1370a618ef0c861. It adds support for this at the JSON-RPC level, but there is no support at the IDL level.
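For illustration, the wire-level request that the commit makes possible might look like the sketch below. It assumes the "insert" operation accepts an optional "uuid" member, as added by the linked commit. The network ID, the switch name, and the helper name build_switch_insert are hypothetical, and the request is only built and printed, not sent to a server.

```python
import json
import uuid

# Hypothetical Neutron network ID that every worker already knows.
NETWORK_ID = "6f1c3c1e-8a0b-4c52-9d6e-2b7a4f0e9a11"


def build_switch_insert(network_id, request_id=0):
    """Build an OVSDB "transact" request that inserts a Logical_Switch
    with a caller-chosen UUID.

    Assumption: the "uuid" member of "insert" is the extension added by
    the commit referenced above; this helper is not an existing IDL API.
    """
    # Validate and normalize the UUID string before it goes on the wire.
    row_uuid = str(uuid.UUID(network_id))
    return {
        "method": "transact",
        "params": [
            "OVN_Northbound",
            {
                "op": "insert",
                "table": "Logical_Switch",
                "uuid": row_uuid,
                # Illustrative name only.
                "row": {"name": "neutron-" + row_uuid},
            },
        ],
        "id": request_id,
    }


request = build_switch_insert(NETWORK_ID)
print(json.dumps(request, indent=2))
```

The missing piece requested in this bug is the equivalent at the IDL level, so that clients do not have to hand-craft such requests.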

Version-Release number of selected component (if applicable):


How reproducible:
100%; this is a feature request/limitation of the existing code. In the real world, especially on heavily loaded systems, we see inserts fail sporadically. It's fairly common.


Steps to Reproduce:
1. Create two IDL connections to the NB DB.
2. Artificially delay replies from ovsdb-server to the second client (with tc, perhaps).
3. Create a logical switch on the first client, then in a separate transaction create a logical switch port on that switch on the second (delayed) client.

Actual results:
There is no way to make this succeed, because the second client neither knows what the logical switch UUID will be nor has a copy of the switch in memory.

Expected results:
The second client can craft a transaction using the known UUID of the logical switch when adding the logical port to the Logical_Switch.ports column.
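A minimal sketch of such a transaction at the JSON-RPC level follows. It uses only standard OVSDB operations ("insert" with "uuid-name", and "mutate" with a "where" clause on "_uuid"). The network ID, the port name, and the helper name build_port_add are hypothetical, and nothing here is an existing IDL API.

```python
import json

# Hypothetical Neutron network ID, assumed to be the Logical_Switch UUID.
NETWORK_ID = "6f1c3c1e-8a0b-4c52-9d6e-2b7a4f0e9a11"


def build_port_add(switch_uuid, port_name, request_id=1):
    """Build a "transact" request that creates a Logical_Switch_Port and
    adds it to a switch addressed purely by its known UUID, without
    needing a local copy of the switch row."""
    return {
        "method": "transact",
        "params": [
            "OVN_Northbound",
            # Insert the port; the server assigns its UUID, which the
            # next operation references through the "uuid-name".
            {
                "op": "insert",
                "table": "Logical_Switch_Port",
                "uuid-name": "new_lsp",
                "row": {"name": port_name},
            },
            # Add the new port to the switch's "ports" column, selecting
            # the switch by UUID instead of by a row cached in memory.
            {
                "op": "mutate",
                "table": "Logical_Switch",
                "where": [["_uuid", "==", ["uuid", switch_uuid]]],
                "mutations": [
                    ["ports", "insert", ["set", [["named-uuid", "new_lsp"]]]]
                ],
            },
        ],
        "id": request_id,
    }


port_request = build_port_add(NETWORK_ID, "port-1")
print(json.dumps(port_request, indent=2))
```

The "count" member in the result of the "mutate" operation tells the caller whether the switch row was actually matched, so a client could detect the case where the switch does not exist yet on the server.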

Additional info:
This fix/feature would be a prerequisite for changes in ml2/ovn, which could use it to avoid these race conditions by treating ovsdb-server itself as the source of truth when writing to the DB, rather than relying on what has been replicated to the client so far.