2228464 – ovsdb-server doesn't limit transaction history size on initial database file read

Keywords:
Status:	CLOSED CURRENTRELEASE
Alias:	None
Product:	Red Hat Enterprise Linux Fast Datapath
Classification:	Red Hat
Component:	ovsdb3.1
Sub Component:
Version:	FDP 23.K
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	high
Target Milestone:	---
Target Release:	---
Assignee:	Ilya Maximets
QA Contact:	Jianlin Shi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:

Reported:	2023-08-02 12:21 UTC by Ilya Maximets
Modified:	2024-04-23 20:45 UTC
CC List:	6 users
Fixed In Version:	openvswitch3.1-3.1.0-46.el8fdp
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Environment:
Last Closed:	2024-04-23 20:45:29 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Issue Tracker	FD-3078	0	None	None	None	2023-08-02 12:22:57 UTC

Description Ilya Maximets 2023-08-02 12:21:48 UTC

In clustered database mode, ovsdb-server may consume a lot of memory
during the initial database file read after a restart:

  |00004|memory|INFO|95650400 kB peak resident set size after 96.9 seconds
  |00005|memory|INFO|atoms:3083346 cells:1838767 monitors:0 raft-log:123309 txn-history:123307 txn-history-atoms:1647022868

In the example above, it's a Northbound OVN database with 123K small transaction
records in the file.  The total file size is just 150 MB; fully compacted, 80 MB.
During the initial database file read, ovsdb-server allocates 95 GB of RAM to
store these transactions in the history.  The history is drained right after the
initial read completes, but the memory may not be returned to the system
until the next compaction (glibc behavior).  The process may even be killed
before finishing the read if there is not enough memory in the system.
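
To put the counters from the memory report quoted above in proportion, here
is a small sketch that parses that line and compares the history size with the
database size.  The helper name parse_memory_counters is made up for this
illustration and is not part of any OVS tooling; the input is the log line
copied from this report.

```python
import re


def parse_memory_counters(line):
    """Return the name:value counters found in a memory log line as ints."""
    return {name: int(value)
            for name, value in re.findall(r"([\w-]+):(\d+)", line)}


# Log line copied verbatim from the report.
LINE = ("|00005|memory|INFO|atoms:3083346 cells:1838767 monitors:0 "
        "raft-log:123309 txn-history:123307 txn-history-atoms:1647022868")

counters = parse_memory_counters(LINE)

# Average number of atoms held per history entry.
atoms_per_txn = counters["txn-history-atoms"] / counters["txn-history"]

# How many times larger the history is than the database itself, in atoms.
history_vs_db = counters["txn-history-atoms"] / counters["atoms"]

print(f"atoms per history entry: {atoms_per_txn:.0f}")
print(f"history atoms / database atoms: {history_vs_db:.0f}")
```

The ratios show that each history entry holds thousands of atoms on average,
and that the history as a whole holds several hundred times more atoms than
the database contents.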

How to reproduce:

1. Create OVN setup with clustered databases.
2. Execute 100K small Northbound database updates in a short time, while making
   sure that ovsdb-server does not compact the database in the meantime.
3. Restart ovsdb-server.
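
A minimal sketch of step 2, assuming ovn-nbctl is in PATH and can reach the
Northbound database.  The switch name "repro-sw" and the external_ids key
"counter" are arbitrary choices for this illustration, and both helper
functions are hypothetical.  The sketch does nothing to prevent compaction and
spawns one process per update, so it may be too slow for the "short time"
requirement on some systems.

```python
import subprocess


def build_update_commands(n_updates, switch="repro-sw"):
    """Build ovn-nbctl argv lists: one switch creation plus n small updates.

    Each update sets a different external_ids value, so every command
    results in a separate small Northbound transaction.
    """
    commands = [["ovn-nbctl", "ls-add", switch]]
    for i in range(n_updates):
        commands.append(["ovn-nbctl", "set", "Logical_Switch", switch,
                         f"external_ids:counter={i}"])
    return commands


def run_all(commands):
    """Run the commands one by one, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_all(build_update_commands(100_000)) on a test cluster would issue
the updates; nothing is executed until run_all is called.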

Expected results:

A freshly restarted ovsdb-server process should not consume significantly more
memory than it did before the restart.
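
To illustrate the difference between trimming the history only after the
whole file is read and trimming it while records are replayed, here is a toy
model.  It is not the upstream patch: the class BoundedHistory, the function
replay, and the max_atoms cap are all hypothetical and only model the
accounting, not the actual ovsdb-server data structures.

```python
from collections import deque


class BoundedHistory:
    """Toy model of a transaction history capped by total atom count."""

    def __init__(self, max_atoms):
        self.max_atoms = max_atoms   # hypothetical cap, in atoms
        self.entries = deque()       # atom count of each stored transaction
        self.total_atoms = 0
        self.peak_atoms = 0          # highest total ever held

    def append(self, n_atoms, trim=True):
        self.entries.append(n_atoms)
        self.total_atoms += n_atoms
        self.peak_atoms = max(self.peak_atoms, self.total_atoms)
        if trim:
            self.trim()

    def trim(self):
        # Drop the oldest entries until the total fits under the cap.
        while self.entries and self.total_atoms > self.max_atoms:
            self.total_atoms -= self.entries.popleft()


def replay(n_records, atoms_per_record, max_atoms, trim_during_read):
    """Replay n_records into the history; return (peak, final) atom totals."""
    history = BoundedHistory(max_atoms)
    for _ in range(n_records):
        history.append(atoms_per_record, trim=trim_during_read)
    history.trim()  # the history is drained after the read in both cases
    return history.peak_atoms, history.total_atoms


peak_late, final_late = replay(1000, 10, 100, trim_during_read=False)
peak_early, final_early = replay(1000, 10, 100, trim_during_read=True)
print(peak_late, final_late)
print(peak_early, final_early)
```

Both runs end with the same history size, but only the run that trims during
the read keeps the peak close to the cap, which is the behavior the expected
result asks for.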

Comment 1 Ilya Maximets 2023-08-02 13:48:29 UTC

Patch posted for review:
  https://patchwork.ozlabs.org/project/openvswitch/patch/20230802134532.2370039-1-i.maximets@ovn.org/

Comment 2 ovs-bugzilla 2023-08-03 18:17:31 UTC

* Thu Aug 03 2023 Open vSwitch CI <ovs-ci> - 3.1.0-46
- Merging upstream branch-3.1 [RH git: ab94f613c7]
    Commit list:
    8b1795c69f ovsdb-tool: Fix json leak while showing clustered log.
    d4d068fef6 ovsdb-server: Fix excessive memory usage on DB open. (#2228464)
    369daff0d4 tests: Add ovsdb execution cases for set size constraints.
    eb33626b59 ovsdb: relay: Fix handling of XOR updates with size constraints.
    8d2c8c33e7 ovsdb: file: Fix diff application to a default column value.
    3797558158 ovsdb: file: Fix inability to read diffs that violate type size.
    96d02ee7a8 ovs-tcpdump: Clear auto-assigned ipv6 address of mirror port.

Comment 3 Ilya Maximets 2024-04-23 20:45:29 UTC

The issue was fixed long ago, but wasn't added to any errata for some reason.
Identical fixes for 2.17 were released and verified.

It doesn't make much sense to keep this one open any longer.
