Bug 1821360

Summary: [OVN SCALE] HA/RAFT not integrated correctly into ovsdb
Product: Red Hat Enterprise Linux Fast Datapath
Reporter: Anton Ivanov <anivanov>
Component: ovsdb
Assignee: OVN Team <ovnteam>
Status: NEW
QA Contact: ovs-qe
Severity: unspecified
Priority: low
Version: RHEL 8.0
CC: ctrautma, dcbw, echaudro, fleitner, jhsiao, mmichels, qding, ralongi
Target Milestone: ---
Target Release: ---
Hardware: Unspecified
OS: Unspecified
Type: Bug

Description Anton Ivanov 2020-04-06 16:37:07 UTC
This was found when analysing failures in the early prototypes of the async IO patchset.

Initial condition:

A database is removed from OVSDB, either by an admin request or by a Raft HA decision.

The removal triggers monitor-cancel notifications for all clients subscribed to the database. These notifications are queued for transmission to the clients, and sending them takes a finite amount of time.

Any client that issues a transaction against the database during this time window receives a "syntax error" JSON reply.
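The window described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not ovsdb-server code: `ToyServer`, `simulate`, the database name "db", and the shape of the `monitor_canceled` message are all hypothetical stand-ins chosen for illustration. The "syntax error" string is the one quoted in this report.

```python
from collections import deque


class ToyServer:
    """Hypothetical stand-in for the server side; not ovsdb-server code."""

    def __init__(self, databases):
        self.databases = set(databases)
        self.outbox = deque()  # notifications queued but not yet delivered

    def remove_db(self, name):
        # Removal takes effect immediately; the notification is only queued.
        self.databases.discard(name)
        self.outbox.append(
            {"method": "monitor_canceled", "params": [name], "id": None}
        )

    def transact(self, request_id, db):
        if db not in self.databases:
            # Error string as quoted in the report.
            return {"id": request_id, "result": None, "error": "syntax error"}
        return {"id": request_id, "result": [{}], "error": None}


def simulate(deliver_first):
    """Return the transaction reply, or None if the client never sent one."""
    server = ToyServer({"db"})
    monitored = {"db"}  # client-side view of which databases are usable
    server.remove_db("db")
    if deliver_first:
        # The client gets to read the queued notifications before acting.
        while server.outbox:
            note = server.outbox.popleft()
            monitored.discard(note["params"][0])
    if "db" not in monitored:
        return None  # client already knows the database is gone
    return server.transact(1, "db")
```

With `simulate(False)` the client acts while the notification is still queued and gets the error reply; with `simulate(True)` it sees the cancel first and never sends the transaction.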

This will be extremely difficult to fix without major API additions: there is no mandatory flush, and there is no "upper level" means of triggering a JSON-RPC "echo" and waiting for the echo reply to ensure that everything on the wire between the server and the client has been flushed.
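For reference, the echo-and-wait idea mentioned above could look roughly like the following at the message level. This is a hedged sketch, not an existing OVS API (the report's point is that no such upper-level hook exists): `echo_barrier`, `FakePeer`, and the token value are hypothetical, and the in-memory peer only imitates an ordered stream on which the echo reply is queued behind everything sent earlier.

```python
import json
from collections import deque


def echo_barrier(send, recv, token="barrier-1"):
    """Send an echo request and collect every message received before its reply."""
    send(json.dumps({"method": "echo", "params": [token], "id": token}))
    seen = []
    while True:
        msg = json.loads(recv())
        if msg.get("id") == token and msg.get("result") == [token]:
            return seen  # everything queued ahead of the reply has been read
        seen.append(msg)


class FakePeer:
    """In-memory stand-in for the server end of a connection (hypothetical)."""

    def __init__(self, queued):
        # Messages the server has queued but the client has not read yet.
        self.wire = deque(json.dumps(m) for m in queued)

    def send(self, text):
        request = json.loads(text)
        if request.get("method") == "echo":
            reply = {"id": request["id"], "result": request["params"], "error": None}
            self.wire.append(json.dumps(reply))  # queued behind earlier messages

    def recv(self):
        return self.wire.popleft()
```

Usage: `peer = FakePeer([{"method": "monitor_canceled", "params": ["m1"], "id": None}])` followed by `echo_barrier(peer.send, peer.recv)` returns the pending notification, so a client could act on it before issuing further transactions.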

This is likely to be an issue ONLY at scale, when there are many pending requests and many pending notifications to transmit.

It is somewhat mitigated by the ovsdb connection being effectively half-duplex: it does not invoke the jsonrpc session receive while there is pending transmit. This mitigates the problem but does not fix it. If ovsdb is optimized for throughput in any way, the problem is likely to become easier to reproduce.
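The half-duplex behaviour can be pictured with a small model. This is an illustrative sketch only: `HalfDuplexSession`, `run_once`, and the message strings are hypothetical and do not reproduce the actual ovsdb session loop.

```python
from collections import deque


class HalfDuplexSession:
    """Toy session that never receives while it still has data to transmit."""

    def __init__(self):
        self.backlog = deque()  # outbound messages not yet transmitted
        self.inbox = deque()    # inbound requests that have already arrived
        self.log = []           # order in which work was actually done

    def run_once(self):
        if self.backlog:
            # Pending transmit: drain it first and skip receive entirely.
            self.log.append(("tx", self.backlog.popleft()))
        elif self.inbox:
            self.log.append(("rx", self.inbox.popleft()))


def demo():
    session = HalfDuplexSession()
    session.backlog.extend(["monitor_canceled #1", "monitor_canceled #2"])
    session.inbox.append("transact")  # arrived while notifications were queued
    for _ in range(3):
        session.run_once()
    return session.log
```

In this model `demo()` transmits both notifications before the waiting transaction is read. The gate only orders the server's own work; it says nothing about when the client reads what was transmitted, which is consistent with the report's point that this mitigates the problem rather than fixing it.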