Bug 1806460 - [OVN-Scale-Testing] Develop SB DB standalone performance test (OVSDB Record/Replay)
Summary: [OVN-Scale-Testing] Develop SB DB standalone performance test (OVSDB Record/Replay)
Status: CLOSED UPSTREAM
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: OVN
Version: RHEL 8.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: ---
Assignee: Ilya Maximets
QA Contact: Jianlin Shi
Blocks: 1806468
 
Reported: 2020-02-24 09:28 UTC by Ilya Maximets
Modified: 2021-07-19 13:29 UTC
CC List: 3 users

Last Closed: 2021-07-19 13:29:02 UTC



Description Ilya Maximets 2020-02-24 09:28:44 UTC
The idea is to capture the real requests that appear in an OVN cluster in scaling scenarios and replay them using ovsdb-client or raw network packets, without bringing up the whole [fake] multinode cluster.  This should allow testing ovsdb-sb performance on fairly small systems without tying up the scale lab.
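
As a sketch, a single captured request could be re-issued against a standalone
copy of the SB DB with ovsdb-client; the socket path and the transaction below
are purely illustrative:

    # Re-issue one captured transaction against a local SB DB copy:
    ovsdb-client transact unix:/tmp/ovnsb_db.sock \
        '["OVN_Southbound", {"op": "select", "table": "Chassis", "where": []}]'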

Comment 1 Ilya Maximets 2020-04-13 22:10:53 UTC
I'm working on the third generation of the test concept.

The first idea was to capture requests to ovsdb-sb, then parse and replay them
using external tools like ovsdb-client.  However, it turned out to be not that
convenient to dump the traffic, parse it with external scripts, and drive the
system.  Also, it's not clear how to measure the actual work done by ovsdb, since
running a large number of clients on the same machine causes a lot of spikes and
delays from the connections themselves.
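
For reference, dumping the traffic itself is easy enough (6642 is the usual SB DB
port; the capture file name is arbitrary) -- the inconvenient part is everything
after that:

    # Capture SB DB traffic for later parsing with external scripts:
    tcpdump -i any -w sbdb.pcap 'tcp port 6642'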

The second idea was to generate a synthetic database and transactions to approximate
the ovsdb-sb load.  But this is not that good either, since we'd be changing the
way ovn-controller interacts with the DB, and it's hard to estimate what would
happen in a real setup.
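
E.g., a crude generator like the loop below produces load, but it's all one kind
of request and says little about the real mix of monitors and transactions that
ovn-controller generates (names and paths below are illustrative):

    # Hammer the SB DB with identical synthetic read transactions:
    for i in $(seq 1 1000); do
        ovsdb-client transact unix:/tmp/ovnsb_db.sock \
            '["OVN_Southbound", {"op": "select", "table": "Port_Binding", "where": []}]'
    done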

The approach I'm working on right now is to create replay scenarios right from
inside ovsdb.  What I've implemented is a new cmdline argument
(--stream-replay-record) that makes ovsdb write all received connections and
requests to special "replay" files via an altered stream implementation.  All the
writes are done in exactly the same order as connections and data appear on the
sockets.  When passed another argument (--stream-replay), ovsdb will start
"receiving" the same connections and the same data in exactly the same order, but
without any delays or actual communication.  Assuming that the database is in the
same original state, ovsdb should go through exactly the same load as in the run
where the "replay" was recorded.  Performance in this case can be measured by
catching the end of the last recorded transaction.
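
To make that concrete, a run would look roughly like this (flag spelling as in my
draft; whether the flags take an argument for the replay file location may still
change):

    # Record: run the server as usual while capturing all stream traffic:
    ovsdb-server --remote=punix:/tmp/ovnsb_db.sock --stream-replay-record ovnsb_db.db
    # ... apply the test load, then stop the server ...

    # Replay: start again from the same initial database; no real clients needed:
    ovsdb-server --stream-replay ovnsb_db.db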

I have a draft implementation here:
https://github.com/igsilya/ovs/tree/tmp-stream-replay

One positive thing about the "replay" solution is that it can be used not only for
performance testing but also for debugging: if you have recorded bad conditions,
you can replay them locally, perhaps with a debugger attached or more logging
enabled.
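
Something along these lines (hypothetical invocation):

    # Replay a recorded failure locally under gdb, with extra jsonrpc logging:
    gdb --args ovsdb-server --stream-replay -vjsonrpc:dbg ovnsb_db.db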

Since the implementation is at the "stream" level, this will record all the DB
connections, OpenFlow, and unixctl.  ovsdb-server is a good application to record
because it has no sources of events other than streams.  ovn-controller might
also be a good candidate.

Comment 2 Ilya Maximets 2020-06-30 11:11:15 UTC
The first version of stream record/replay functionality posted upstream:
    https://patchwork.ozlabs.org/project/openvswitch/list/?series=186549

Comment 3 Ilya Maximets 2021-04-13 18:06:53 UTC
v2 posted upstream:
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=238830&state=*

Comment 5 Ilya Maximets 2021-07-19 13:29:02 UTC
v3 of the OVSDB record/replay functionality got accepted and will be part
of the upstream 2.16 release.  Integration of this feature into OVN daemons
and tests is tracked in BZ 1980793.
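
For anyone picking this up: with the accepted version the invocation should be
along the lines below (option spelling as I recall it from the merged series;
double-check against the 2.16 documentation):

    # Record a run into a directory of replay files:
    ovsdb-server --record=./replay-dir ovnsb_db.db

    # Replay it later without any real clients or network:
    ovsdb-server --replay=./replay-dir ovnsb_db.db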

