Bug 1990058 - [RFE] raft: Reduce memory consumption by storing snapshot as a string instead of json object
Keywords:
Status: CLOSED ERRATA
Alias: None
Product: Red Hat Enterprise Linux Fast Datapath
Classification: Red Hat
Component: ovsdb2.16
Version: RHEL 8.0
Hardware: Unspecified
OS: Unspecified
Priority: high
Severity: high
Target Milestone: ---
Target Release: FDP 21.I
Assignee: Ilya Maximets
QA Contact: Rick Alongi
URL:
Whiteboard:
Depends On:
Blocks:
Reported: 2021-08-04 16:24 UTC by Ilya Maximets
Modified: 2022-01-10 16:51 UTC (History)

Fixed In Version: openvswitch2.16-2.16.0-6.el8fdp
Doc Type: If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed: 2022-01-10 16:50:58 UTC
Target Upstream Version:
Embargoed:


Links
System                  ID              Private  Priority  Status  Summary  Last Updated
Red Hat Issue Tracker   FD-1468         0        None      None    None     2021-08-22 04:31:17 UTC
Red Hat Product Errata  RHBA-2022:0053  0        None      None    None     2022-01-10 16:51:09 UTC

Description Ilya Maximets 2021-08-04 16:24:11 UTC
The RAFT module inside ovsdb-server holds a static JSON object
with a database snapshot, and it takes a lot of RAM.
For the 125 MB database from the ovn-k8s cluster-density test,
this JSON object takes ~612 MB of RAM out of the 1.3 GB total
memory consumption of the process.  For the 270 MB database from
ovn-heater's node-density-heavy 120-node test, this JSON
object takes ~1.55 GB of the 3.8 GB total for the process.

In most cases this object is used only to be serialized to a string
that is stored on disk or sent over the network.  It is needed as an
object only for a short time, to re-apply changes after compaction.
So it should be possible to serialize this object once, store
the string instead, and not keep this huge JSON object in memory forever.
Since the size of the serialized string should be the same as the
size of the on-disk database after compaction, this change should
save a significant amount of RAM.
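
A rough way to see the gap between a parsed JSON object and its serialized string is sketched below. It is written in Python purely for illustration: ovsdb-server is C and its in-memory JSON layout differs, so the ratio printed here says nothing about the real numbers above. The `deep_sizeof` helper and the synthetic `snapshot` contents are assumptions made up for this example.

```python
import json
import sys


def deep_sizeof(obj, seen=None):
    """Approximate total size of a JSON-like object, counting each object once."""
    seen = set() if seen is None else seen
    if id(obj) in seen:
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        size += sum(deep_sizeof(k, seen) + deep_sizeof(v, seen)
                    for k, v in obj.items())
    elif isinstance(obj, (list, tuple)):
        size += sum(deep_sizeof(item, seen) for item in obj)
    return size


# Synthetic stand-in for a database snapshot (hypothetical contents).
snapshot = {
    f"row-{i}": {"name": f"port-{i}",
                 "addresses": [f"10.0.{i % 256}.{i % 200}"],
                 "up": True}
    for i in range(1000)
}

serialized = json.dumps(snapshot)          # serialize once
object_bytes = deep_sizeof(snapshot)       # cost of keeping the parsed object
string_bytes = sys.getsizeof(serialized)   # cost of keeping only the string

print(f"parsed object: {object_bytes} bytes, string: {string_bytes} bytes")
```

The string still carries everything needed to write the snapshot to disk or send it over the network, and it can be parsed back in the rare case the object form is needed again.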

Side quest: figure out whether we can do the same for all the raft
log entries to save more memory.  This might be needed anyway
for implementation clarity, as the snapshot is just another
raft entry.

Comment 1 Ilya Maximets 2021-08-20 16:55:03 UTC
Patches sent for review:
  https://patchwork.ozlabs.org/project/openvswitch/list/?series=259000&state=*

Comment 3 OvS team 2021-09-01 22:02:34 UTC
* Tue Aug 31 2021 Ilya Maximets <i.maximets> - 2.16.0-6
- ovsdb: monitor: Store serialized json in a json cache. [RH git: bc20330c85] (#1996152)
    commit 43e66fc27659af2a5c976bdd27fe747b442b5554
    Author: Ilya Maximets <i.maximets>
    Date:   Tue Aug 24 21:00:39 2021 +0200
    
        The same json from a json cache is typically sent to all the clients,
        e.g., in case of an OVN deployment with ovn-monitor-all=true.
    
        There could be hundreds or thousands of connected clients, and ovsdb
        will serialize the same json object for each of them before sending.
    
        Serialize it once before storing it into the json cache to speed up
        processing.
    
        This change saves a lot of CPU cycles and a bit of memory,
        since we need to store in memory only a string and not the full json
        object.
    
        Testing with ovn-heater on 120 nodes using density-heavy scenario
        shows reduction of the total CPU time used by Southbound DB processes
        from 256 minutes to 147.  Duration of unreasonably long poll intervals
        also reduced dramatically from 7 to 2 seconds:
    
                   Count   Min    Max   Median    Mean   95 percentile
         -------------------------------------------------------------
          Before   1934   1012   7480   4302.5   4875.3     7034.3
          After    1909   1004   2730   1453.0   1532.5     2053.6
    
        Acked-by: Dumitru Ceara <dceara>
        Acked-by: Han Zhou <hzhou>
        Signed-off-by: Ilya Maximets <i.maximets>
    
    Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1996152
    Signed-off-by: Ilya Maximets <i.maximets>
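
The "serialize once, send to every client" idea from the commit above can be sketched as follows. This is a minimal Python illustration, not the ovsdb monitor code: the `JsonCache` class, its `get` method, the cache key, and the `serializations` counter are all hypothetical names invented for the example.

```python
import json


class JsonCache:
    """Hypothetical cache that stores update messages already serialized."""

    def __init__(self):
        self._cache = {}
        self.serializations = 0  # counter kept only for this demonstration

    def get(self, key, build_update):
        """Return the serialized update for key, building it at most once."""
        if key not in self._cache:
            self._cache[key] = json.dumps(build_update())
            self.serializations += 1
        return self._cache[key]


def build_update():
    # Stand-in for the expensive construction of a monitor update.
    return {"Port_Binding": {"row-1": {"up": True}}}


cache = JsonCache()
# A thousand clients asking for the same update share one serialized string.
messages = [cache.get(("monitor-all", 42), build_update) for _ in range(1000)]
print(cache.serializations, len(set(messages)))
```

Each client receives the same string object, so the per-client cost drops to a lookup instead of a full serialization pass.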


* Tue Aug 31 2021 Ilya Maximets <i.maximets> - 2.16.0-5
- raft: Don't keep full json objects in memory if no longer needed. [RH git: 4606423e8b] (#1990058)
    commit 0de882954032aa37dc943bafd72c33324aa0c95a
    Author: Ilya Maximets <i.maximets>
    Date:   Tue Aug 24 21:00:38 2021 +0200
    
        raft: Don't keep full json objects in memory if no longer needed.
    
        Raft log entries (and the raft database snapshot) contain json objects
        of the data.  A follower receives append requests with data that gets
        parsed and added to the raft log.  The leader receives execution requests,
        parses data out of them and adds it to the log.  In both cases,
        ovsdb-server later reads the log with ovsdb_storage_read(), constructs
        a transaction and updates the database.  On followers these json objects
        are, in the common case, never used again.  The leader may use them to send
        append requests or snapshot installation requests to followers.
        However, all these operations (except for ovsdb_storage_read()) are
        just serializing the json in order to send it over the network.
    
        Json objects are significantly larger than their serialized string
        representation.  For example, the snapshot of the database from one of
        the ovn-heater scale tests takes 270 MB as a string, but 1.6 GB as
        a json object from the total 3.8 GB consumed by ovsdb-server process.
    
        ovsdb_storage_read() for a given raft entry happens only once in its
        lifetime, so after this call we can serialize the json object, store
        the string representation and free the actual json object that ovsdb
        will never need again.  This can save a lot of memory and can also
        save serialization time, because each raft entry for append requests
        and snapshot installation requests is serialized only once instead of
        every time such a request needs to be sent.
    
        JSON_SERIALIZED_OBJECT can be used in order to seamlessly integrate
        pre-serialized data into raft_header and similar json objects.
    
        One major special case is creation of a database snapshot.
        Snapshot installation request received over the network will be parsed
        and read by ovsdb-server just like any other raft log entry.  However,
        snapshots created locally with raft_store_snapshot() will never be
        read back, because they reflect the current state of the database,
        hence already applied.  For this case we can free the json object
        right after writing snapshot on disk.
    
        Tests performed with ovn-heater on the 60-node density-light scenario,
        where the on-disk database goes up to 97 MB, show that average memory
        consumption of ovsdb-server Southbound DB processes decreased by 58%
        (from 602 MB to 256 MB per process) and peak memory consumption
        decreased by 40% (from 1288 MB to 771 MB).
    
        A test with 120 nodes on the density-heavy scenario with a 270 MB on-disk
        database shows a 1.5 GB decrease in memory consumption, as expected.
        Also, total CPU time consumed by the Southbound DB process was reduced
        from 296 to 256 minutes.  The number of unreasonably long poll intervals
        was reduced from 2896 down to 1934.
    
        Deserialization is also implemented just in case.  I didn't see this
        function being invoked in practice.
    
        Acked-by: Dumitru Ceara <dceara>
        Acked-by: Han Zhou <hzhou>
        Signed-off-by: Ilya Maximets <i.maximets>
    
    Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1990058
    Signed-off-by: Ilya Maximets <i.maximets>
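
The entry lifecycle described in the commit above (read once, then keep only the string) might look roughly like this. It is a Python sketch under stated assumptions, not the raft.c implementation: `LogEntry`, `read`, `wire_format`, `holds_parsed` and the `serialize_calls` counter are hypothetical names, with `read` standing in for what ovsdb_storage_read() does to an entry.

```python
import json


class LogEntry:
    """Hypothetical raft log entry: parsed data until read, a string afterwards."""

    def __init__(self, data):
        self._parsed = data        # full parsed object (large)
        self._serialized = None    # compact string form (small)
        self.serialize_calls = 0   # counter kept only for this demonstration

    def _ensure_serialized(self):
        if self._serialized is None:
            self._serialized = json.dumps(self._parsed, separators=(",", ":"))
            self.serialize_calls += 1

    def read(self):
        """Hand the parsed data to the database, then drop the entry's copy."""
        if self._parsed is None:
            # Rare fallback: the object is already gone, so parse the string.
            return json.loads(self._serialized)
        self._ensure_serialized()
        data, self._parsed = self._parsed, None
        return data

    def wire_format(self):
        """Return the string used for append or snapshot installation requests."""
        self._ensure_serialized()
        return self._serialized

    def holds_parsed(self):
        return self._parsed is not None


entry = LogEntry({"table": "Port_Binding", "rows": [1, 2, 3]})
applied = entry.read()                               # the single expected read
messages = [entry.wire_format() for _ in range(3)]   # later sends reuse the string
print(entry.holds_parsed(), entry.serialize_calls)
```

A locally created snapshot would never go through `read()` at all, which is why the commit can free its object right after the snapshot is written to disk.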


* Tue Aug 31 2021 Ilya Maximets <i.maximets> - 2.16.0-4
- json: Add support for partially serialized json objects. [RH git: 885e5ce1b5] (#1990058)
    commit b0bca6f27aae845c3ca8b48d66a7dbd3d978162a
    Author: Ilya Maximets <i.maximets>
    Date:   Tue Aug 24 21:00:37 2021 +0200
    
        json: Add support for partially serialized json objects.
    
        Introducing a new json type JSON_SERIALIZED_OBJECT.  It's not an
        actual type that can be seen in a json message on the wire, but
        an internal type that is intended to hold a serialized version of
        some other json object.  For this reason it's defined after
        JSON_N_TYPES so as not to confuse parsers and other parts of the code
        that rely on compliance with RFC 4627.
    
        With this JSON type internal users may construct large JSON objects,
        parts of which are already serialized.  This way, while serializing
        the larger object, data from JSON_SERIALIZED_OBJECT can be added
        directly to the result, without additional processing.
    
        This will be used by the next commits to add pre-serialized JSON data
        to the raft_header structure, which can be converted to JSON
        before writing the file transaction to disk or sending it to other
        servers.  The same technique can also be used to pre-serialize json_cache
        for ovsdb monitors; this should avoid performing serialization
        for every client and will save some more memory.
    
        Since serialized JSON is just a string, the 'json->string'
        pointer is reused for it.
    
        Acked-by: Dumitru Ceara <dceara>
        Acked-by: Han Zhou <hzhou>
        Signed-off-by: Ilya Maximets <i.maximets>
    
    Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1990058
    Signed-off-by: Ilya Maximets <i.maximets>
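
The splicing behaviour described for JSON_SERIALIZED_OBJECT can be illustrated with a small Python sketch. It does not use or mirror the OVS json library API: the `Serialized` wrapper and the `to_json` function are hypothetical, and the header fields are invented for the example.

```python
import json


class Serialized:
    """Hypothetical marker for a fragment that is already serialized JSON."""

    def __init__(self, text):
        self.text = text


def to_json(value):
    """Serialize value, copying pre-serialized fragments in verbatim."""
    if isinstance(value, Serialized):
        return value.text  # no re-processing of the fragment
    if isinstance(value, dict):  # keys are assumed to be strings
        items = (json.dumps(k) + ":" + to_json(v) for k, v in value.items())
        return "{" + ",".join(items) + "}"
    if isinstance(value, list):
        return "[" + ",".join(to_json(v) for v in value) + "]"
    return json.dumps(value)  # strings, numbers, booleans, null


# Large payload serialized once, ahead of time.
payload = Serialized(json.dumps({"x": [1, 2]}, separators=(",", ":")))

# Small header object built around the pre-serialized payload.
header = {"term": 3, "index": 7, "data": payload}
out = to_json(header)
print(out)
```

The result parses as ordinary JSON, so a receiver cannot tell that part of the message was serialized earlier.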


* Tue Aug 31 2021 Ilya Maximets <i.maximets> - 2.16.0-3
- json: Optimize string serialization. [RH git: bb1654da63] (#1990069)
    commit 748010ff304b7cd2c43f4eb98a554433f0df07f9
    Author: Ilya Maximets <i.maximets>
    Date:   Tue Aug 24 23:07:22 2021 +0200
    
        json: Optimize string serialization.
    
        The current string serialization code puts all characters one by one.
        This is slow because the dynamic string needs to perform length checks
        on every ds_put_char(), and it also doesn't allow the compiler to use
        better memory copy operations, i.e. doesn't allow copying several bytes
        at once.
    
        Special symbols are rare in a typical database.  Quotes are frequent,
        but not too frequent.  In databases created by ovn-kubernetes, for
        example, there are usually at least 10 to 50 chars between quotes.
        So, it's better to count characters that don't require escaping
        and use a fast data copy for the whole sequential block.
    
        Testing with a synthetic benchmark (included) on my laptop shows
        the following performance improvement:
    
           Size      Q  S       Before       After       Diff
         -----------------------------------------------------
         100000      0  0 :    0.227 ms     0.142 ms   -37.4 %
         100000      2  1 :    0.277 ms     0.186 ms   -32.8 %
         100000      10 1 :    0.361 ms     0.309 ms   -14.4 %
         10000000    0  0 :   22.720 ms    12.160 ms   -46.4 %
         10000000    2  1 :   27.470 ms    19.300 ms   -29.7 %
         10000000    10 1 :   37.950 ms    31.250 ms   -17.6 %
         100000000   0  0 :  239.600 ms   126.700 ms   -47.1 %
         100000000   2  1 :  292.400 ms   188.600 ms   -35.4 %
         100000000   10 1 :  387.700 ms   321.200 ms   -17.1 %
    
        Here Q - probability (%) for a character to be a '\"' and
        S - probability (%) to be a special character ( < 32).
    
        Testing with a closer to real world scenario shows overall decrease
        of the time needed for database compaction by ~5-10 %.  And this
        change also decreases CPU consumption in general, because string
        serialization is used in many different places including ovsdb
        monitors and raft.
    
        Signed-off-by: Ilya Maximets <i.maximets>
        Acked-by: Numan Siddique <numans>
        Acked-by: Dumitru Ceara <dceara>
    
    Reported-at: https://bugzilla.redhat.com/show_bug.cgi?id=1990069
    Signed-off-by: Ilya Maximets <i.maximets>
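
The block-copy approach from the commit above can be sketched in Python as follows. This is an illustration of the technique only, not the json.c code: `serialize_string` and `_ESCAPES` are hypothetical names, and Python slicing stands in for the fast memory copy.

```python
import json

# Short escapes defined by JSON; other control characters use \u00XX.
_ESCAPES = {'"': '\\"', "\\": "\\\\", "\b": "\\b", "\f": "\\f",
            "\n": "\\n", "\r": "\\r", "\t": "\\t"}


def serialize_string(s):
    """Serialize s as a JSON string, copying runs of plain characters in bulk."""
    parts = ['"']
    run_start = 0  # start of the current run of characters needing no escape
    for i, ch in enumerate(s):
        if ch in _ESCAPES or ord(ch) < 32:
            parts.append(s[run_start:i])  # copy the whole clean run at once
            parts.append(_ESCAPES.get(ch, f"\\u{ord(ch):04x}"))
            run_start = i + 1
    parts.append(s[run_start:])  # trailing run after the last escape
    parts.append('"')
    return "".join(parts)


sample = 'say "hi"\tnow'
# Cross-check against the standard library serializer.
print(serialize_string(sample) == json.dumps(sample, ensure_ascii=False))
```

With quotes and special characters being rare, most of the input falls into long clean runs, which is where the bulk copy pays off.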

Comment 6 Rick Alongi 2022-01-05 18:31:11 UTC
Per email with Dev (i.maximets), no specific test case is feasible for this change; general testing performed during the release is sufficient for regression testing.  A full test suite was run for FDP 21.J using openvswitch2.16-2.16.0-32.el8fdp.

Marking as Verified.

Comment 7 Rick Alongi 2022-01-06 12:55:48 UTC
SanityOnly info:

[ralongi@ralongi openvswitch2.16]$ git log --oneline --grep=1990058
4606423e8b raft: Don't keep full json objects in memory if no longer needed.
885e5ce1b5 json: Add support for partially serialized json objects.
[ralongi@ralongi openvswitch2.16]$ git show 4606423e8b
commit 4606423e8b9bd399c1639fb9b00e13d3870adce9
Author: Ilya Maximets <i.maximets>
Date:   Tue Aug 24 21:00:38 2021 +0200

    raft: Don't keep full json objects in memory if no longer needed.
    
    commit 0de882954032aa37dc943bafd72c33324aa0c95a
    Author: Ilya Maximets <i.maximets>
    Date:   Tue Aug 24 21:00:38 2021 +0200
    
        raft: Don't keep full json objects in memory if no longer needed.
    
        Raft log entries (and raft database snapshot) contains json objects
        of the data.  Follower receives append requests with data that gets
        parsed and added to the raft log.  Leader receives execution requests,
        parses data out of them and adds to the log.  In both cases, later
        ovsdb-server reads the log with ovsdb_storage_read(), constructs
        transaction and updates the database.  On followers these json objects
        in common case are never used again.  Leader may use them to send
        append requests or snapshot installation requests to followers.
        However, all these operations (except for ovsdb_storage_read()) are
        just serializing the json in order to send it over the network.
    
[ralongi@ralongi openvswitch2.16]$ git show 885e5ce1b5
commit 885e5ce1b5a646185dcb653dd7d608a84ab43f53
Author: Ilya Maximets <i.maximets>
Date:   Tue Aug 24 21:00:37 2021 +0200

    json: Add support for partially serialized json objects.
    
    commit b0bca6f27aae845c3ca8b48d66a7dbd3d978162a
    Author: Ilya Maximets <i.maximets>
    Date:   Tue Aug 24 21:00:37 2021 +0200
    
        json: Add support for partially serialized json objects.
    
        Introducing a new json type JSON_SERIALIZED_OBJECT.  It's not an
        actual type that can be seen in a json message on a wire, but
        internal type that is intended to hold a serialized version of
        some other json object.  For this reason it's defined after the
        JSON_N_TYPES to not confuse parsers and other parts of the code
        that relies on compliance with RFC 4627.
    
        With this JSON type internal users may construct large JSON objects,
        parts of which are already serialized.  This way, while serializing
        the larger object, data from JSON_SERIALIZED_OBJECT can be added
        directly to the result, without additional processing.

Comment 9 errata-xmlrpc 2022-01-10 16:50:58 UTC
Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory (openvswitch2.16 bug fix update), and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2022:0053

