Description of problem:
A client sends a request to create a row, and ovsdb undergoes a leader change during the transaction. Ovsdb successfully creates the new row, but replies to the client with an error:

E0831 00:11:57.634386 1 ovn.go:546] error while creating logical port node-density-bbf1add6-f5dd-4f2a-b4b5-ba4694a31ed4_node-density-1298 error: Transaction Failed due to an error: constraint violation details: Transaction causes multiple rows in "Logical_Switch_Port" table to have identical values (node-density-bbf1add6-f5dd-4f2a-b4b5-ba4694a31ed4_node-density-1298) for index on column "name". First row, with UUID e2106f3d-fa06-4ca1-be12-4d116c4c3778, existed in the database before this transaction and was not modified by the transaction. Second row, with UUID 07f77f40-089c-4dd5-9863-e79fc4b4f1ad, was inserted by this transaction. in committing transaction

Version-Release number of selected component (if applicable):
openvswitch2.15-2.15.0-28.el8fdp.x86_64

How reproducible:
Often

Steps to Reproduce:
1. Run an ovn-k scale test with 20 nodes and 4700 pods per test.
2. One or two pods will hit this initial transaction failure. On retry the pod add is successful.

Will attach logs.
Created attachment 1819412 [details] Summary of the logs with affected row name/uuids
Created attachment 1819413 [details] full nbdb logs during txn error
Copying here what I said on Slack previously:

There is a possible scenario where the database adds a record but the transaction still fails. When the leader has a transaction to commit, it writes it to the database and then sends append requests to the followers. A follower may write the change to its log and reply to the leader. But if the leader transfers leadership before receiving that reply, it rejects the append reply (it is no longer the leader). The server that receives the leadership, however, considers the change applied, because it received the append request earlier. So the transaction is correctly committed by all servers, but the old leader rejected the append replies and therefore reports the transaction as failed. If the client re-tries this transaction, it receives an error, since the data is already committed.

OTOH, this issue can be mostly avoided if the client re-connects to the new leader once leadership is transferred. In that case, I think, the client will disconnect before receiving the transaction failure from the old leader. While connecting to the new leader, the client will receive updated database contents where the transaction is already committed, so it will not need to re-try it. Still, we need to think about how to avoid failing this transaction in the first place.
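To illustrate the client-side workaround described above, here is a minimal, hypothetical sketch in Python. The `Client`, `TransactionError`, and `create_row_idempotent` names are illustrative stand-ins, not a real libovsdb or IDL API; the toy client deliberately simulates the bug (the insert lands in the database, but the old leader still reports a failure):

```python
class TransactionError(Exception):
    """Raised when the server reports a transaction failure."""
    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason

class Client:
    """Toy in-memory stand-in for an OVSDB connection (hypothetical)."""
    def __init__(self):
        self.rows = {}          # name -> row
        self.fail_next = False  # simulate a leadership change mid-commit

    def lookup(self, table, name):
        return self.rows.get(name)

    def insert(self, table, row):
        # Simulate the bug: the commit actually lands in the database,
        # but the (old) leader still reports an error to the client.
        self.rows[row["name"]] = row
        if self.fail_next:
            self.fail_next = False
            raise TransactionError("leadership changed during commit")

def create_row_idempotent(client, table, row, retries=3):
    """Insert a row; treat 'already present after a failed txn' as success."""
    for _ in range(retries):
        try:
            client.insert(table, row)
            return row
        except TransactionError:
            # After reconnecting to the new leader, re-read the database:
            # if the row is already there, the transaction really committed
            # and a blind retry would hit the duplicate-index error.
            existing = client.lookup(table, row["name"])
            if existing is not None:
                return existing
    raise RuntimeError("insert failed after retries")
```

The key point is the re-read after a failure: instead of blindly re-submitting the insert (which produces the "multiple rows ... for index on column name" constraint violation seen in the logs), the client checks whether the transaction already took effect.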
The core OVN team discussed this issue today during our weekly meeting. https://github.com/openvswitch/ovs/commit/04e5adfedd2a2e4ceac62136671057542a7cf875 solves the double-insert in ovsdb-server. There are still potential client-side issues to address in the C and Python IDLs: they need to ensure that when a transaction fails due to a leadership change, the client connects to the new leader and checks whether the transaction actually needs to be retried. As I understand it, libovsdb (used by OCP) should already connect to the new leader when a transaction fails due to a leadership change. I am closing this issue and in its place raising a (likely) lower-priority issue: https://bugzilla.redhat.com/show_bug.cgi?id=2137412