Bug 1999650

Summary: ovsdb transaction during leadership change may result in reply error
Product: Red Hat Enterprise Linux Fast Datapath
Component: ovsdb2.15
Assignee: Open vSwitch development team <ovs-team>
Reporter: Tim Rozet <trozet>
Status: CLOSED CURRENTRELEASE
QA Contact: ovs-qe
Severity: medium
Priority: high
Version: RHEL 8.0
CC: ctrautma, i.maximets, jhsiao, mmichels, ralongi
Hardware: Unspecified
OS: Unspecified
Whiteboard: perfscale-ovn
Last Closed: 2022-10-24 20:01:50 UTC
Type: Bug

Attachments:
- Summary of the logs with affected row name/uuids
- full nbdb logs during txn error

Description Tim Rozet 2021-08-31 13:48:45 UTC
Description of problem:
A client sends a request to create a row, and ovsdb undergoes a leader change during the transaction. ovsdb successfully creates the new row, but replies to the client that there was an error:

E0831 00:11:57.634386       1 ovn.go:546] error while creating logical port node-density-bbf1add6-f5dd-4f2a-b4b5-ba4694a31ed4_node-density-1298 error: Transaction Failed due to an error: constraint violation details: Transaction causes multiple rows in "Logical_Switch_Port" table to have identical values (node-density-bbf1add6-f5dd-4f2a-b4b5-ba4694a31ed4_node-density-1298) for index on column "name".  First row, with UUID e2106f3d-fa06-4ca1-be12-4d116c4c3778, existed in the database before this transaction and was not modified by the transaction.  Second row, with UUID 07f77f40-089c-4dd5-9863-e79fc4b4f1ad, was inserted by this transaction. in committing transaction


Version-Release number of selected component (if applicable):
openvswitch2.15-2.15.0-28.el8fdp.x86_64

How reproducible:
Often

Steps to Reproduce:
1. Run ovn-k scale test with 20 nodes and 4700 pods per test
2. One or two pods will have this initial transaction failure. On retry the pod add is successful.

Will attach logs.

Comment 1 Tim Rozet 2021-08-31 13:51:54 UTC
Created attachment 1819412 [details]
Summary of the logs with affected row name/uuids

Comment 2 Tim Rozet 2021-08-31 13:53:18 UTC
Created attachment 1819413 [details]
full nbdb logs during txn error

Comment 3 Ilya Maximets 2021-09-10 14:42:58 UTC
Copying here what I said on slack previously:

There is a possible scenario where the DB adds a record but still fails the
transaction.  If the leader has a transaction to commit, it writes it to the DB
and then sends append requests to the followers.  A follower may also write
the change to its log and reply to the leader.  But if the leader transfers
leadership before receiving the reply, it rejects the append reply (not a
leader).  Meanwhile, the server that receives the leadership considers the
change applied, because it received the append request previously.
So the transaction is correctly committed by all servers, but the old leader
rejected the append replies and therefore fails the transaction.  If the
client re-tries this transaction, it receives a constraint-violation error,
since the data is already committed.

OTOH, this issue can mostly be avoided if the client re-connects to the new
leader once leadership is transferred.  In that case, I think, the client
disconnects before receiving the transaction failure from the old leader.
While connecting to the new leader, the client receives the updated database
contents, where the transaction is already committed, so the client does not
need to re-try it.


Though, we still need to think about how to avoid failing this transaction.
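The race described above can be illustrated with a toy model (this is purely illustrative Python, not OVSDB or Raft code; the `Server` class and the sequencing inside `commit_with_leadership_transfer` are simplifying assumptions):

```python
# Toy model of the race: all servers persist the entry (so it is committed),
# yet the old leader reports failure because it stepped down before
# processing the follower acks.

class Server:
    def __init__(self, name):
        self.name = name
        self.log = []
        self.is_leader = False

    def append(self, entry):
        self.log.append(entry)
        return True  # follower ack

def commit_with_leadership_transfer(leader, followers, entry):
    leader.log.append(entry)           # leader writes locally
    acks = [f.append(entry) for f in followers]  # followers persist the entry
    leader.is_leader = False           # leadership transferred *before* the
                                       # leader counts the acks
    if not leader.is_leader:
        return "error: not leader"     # txn reported failed to the client
    return "success"

a, b, c = Server("a"), Server("b"), Server("c")
a.is_leader = True
result = commit_with_leadership_transfer(a, [b, c], {"name": "lsp-1"})

# The entry is in every log (committed cluster-wide), yet the client sees an
# error, so a naive retry re-inserts the row and hits the "name" index
# constraint from the description above.
assert result == "error: not leader"
assert all(s.log == [{"name": "lsp-1"}] for s in (a, b, c))
```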

Comment 4 Mark Michelson 2022-10-24 20:01:50 UTC
The core OVN team discussed this issue today during our weekly meeting. https://github.com/openvswitch/ovs/commit/04e5adfedd2a2e4ceac62136671057542a7cf875 solves the double-insert in ovsdb-server. There are still potential client-side issues to address in the C and Python IDLs. These would need to ensure that when a transaction fails due to a leadership change, the client connects to the new leader and checks whether the transaction needs to be retried. Based on my understanding, libovsdb (used by OCP) should already connect to the new leader when a transaction fails due to a leadership change.
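The client-side mitigation described here amounts to making the retry idempotent: after a failure around a leadership change, check whether the insert already took effect before retrying. A minimal sketch, using a hypothetical dict-backed store (`db_lookup` and `db_insert` are illustrative stand-ins, not libovsdb or IDL APIs):

```python
# Hedged sketch: check-before-retry so a transaction that "failed" on the old
# leader but actually committed is not inserted a second time.

def db_lookup(db, table, name):
    """Return the row with this name, or None (toy stand-in for a DB read)."""
    return db.get(table, {}).get(name)

def db_insert(db, table, name):
    """Insert a new row keyed by name (toy stand-in for a DB insert)."""
    row = {"name": name}
    db.setdefault(table, {})[name] = row
    return row

def ensure_row(db, table, name):
    existing = db_lookup(db, table, name)
    if existing is not None:
        # The earlier "failed" transaction actually committed; treat the
        # insert as done instead of retrying and hitting the index constraint.
        return existing
    return db_insert(db, table, name)

db = {}
first = ensure_row(db, "Logical_Switch_Port", "lsp-1")
retry = ensure_row(db, "Logical_Switch_Port", "lsp-1")  # same row, no duplicate
assert first is retry
assert len(db["Logical_Switch_Port"]) == 1
```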

I am closing this issue and in its place raising a (likely) lower-priority issue: https://bugzilla.redhat.com/show_bug.cgi?id=2137412