1835729 – [ovsdb-server] Memory leak of RAFT incomplete command

Keywords:
Status:	CLOSED ERRATA
Alias:	None
Product:	Red Hat Enterprise Linux Fast Datapath
Classification:	Red Hat
Component:	openvswitch2.13
Sub Component:
Version:	FDP 20.C
Hardware:	Unspecified
OS:	Unspecified
Priority:	unspecified
Severity:	unspecified
Target Milestone:	---
Target Release:	---
Assignee:	Dumitru Ceara
QA Contact:	Jianlin Shi
Docs Contact:
URL:
Whiteboard:
Depends On:
Blocks:	1836307

Reported:	2020-05-14 11:59 UTC by Dumitru Ceara
Modified:	2020-07-15 13:02 UTC
CC List:	6 users
Fixed In Version:	openvswitch2.13-2.13.0-24.el7fdp
Doc Type:	If docs needed, set a value
Doc Text:
Clone Of:
Clones:	1836307
Environment:
Last Closed:	2020-07-15 13:02:04 UTC
Target Upstream Version:
Embargoed:

Links
System	ID	Private	Priority	Status	Summary	Last Updated
Red Hat Product Errata	RHBA-2020:2944	0	None	None	None	2020-07-15 13:02:20 UTC

Description Dumitru Ceara 2020-05-14 11:59:25 UTC

Description of problem:

This BZ tracks the backport of the fix for the RAFT incomplete command memory leak to openvswitch2.13.

Upstream patch:
https://mail.openvswitch.org/pipermail/ovs-dev/2020-May/370105.html

Comment 1 OvS team 2020-05-15 00:57:45 UTC

* Thu May 14 2020 Dumitru Ceara <dceara> - 2.13.0-24
- raft: Disable RAFT jsonrpc inactivity probe. (#1822290)
  [b12acf45a6872dda85642cbc73dd86eb529be17e]

* Thu May 14 2020 Dumitru Ceara <dceara> - 2.13.0-23
- raft: Fix leak of the incomplete command. (#1835729)
  [bb552cffb89104c2bb19b8aff749b8b825a6db13]

* Thu May 14 2020 Dumitru Ceara <dceara> - 2.13.0-22
- raft: Fix the problem of stuck in candidate role forever. (#1828639)
  [c5937276691bb90f99fad1871b5e3ca4ac9391e7]

* Thu May 14 2020 Dumitru Ceara <dceara> - 2.13.0-21
- raft: Fix next_index in install_snapshot reply handling. (#1828639)
  [09ac3c327ec678f36cd9df451b7846acdf734c0f]

* Thu May 14 2020 Dumitru Ceara <dceara> - 2.13.0-20
- raft: Avoid busy loop during leader election. (#1828639)
  [19683b041e19a49e275a4b42f5bb5b0528de898a]

* Thu May 14 2020 Dumitru Ceara <dceara> - 2.13.0-19
- raft: Fix raft_is_connected() when there is no leader yet. (#1828639)
  [2dae730162e5e1b084ac0d1fc339d2f09bd8cddb]

* Thu May 14 2020 Dumitru Ceara <dceara> - 2.13.0-18
- ovsdb-server: Don't disconnect clients after raft install_snapshot. (#1828639)
  [da9680c6095df8d6c477aa10e29baa8f00dc2e25]

* Thu May 14 2020 Dumitru Ceara <dceara> - 2.13.0-17
- raft-rpc: Fix message format. (#1828639)
  [e9bb63d6190925db63b4cad83e57a945c4ac0629]

Comment 5 Dumitru Ceara 2020-06-17 09:21:55 UTC

I verified the fix on the downstream package by running the ovsdb-cluster testsuite under valgrind, specifically the test case "OVSDB cluster - txn on follower-2, follower-2 crash before sending execReq, reconnect to follower-3".

On openvswitch2.13-2.13.0-17 (without fix):

# clone the openvswitch2.13 dist-git repo.
# checkout the revision corresponding to openvswitch2.13-2.13.0-17.
$ rhpkg prep
$ cd ovs-2.13.0 && ./boot.sh && ./configure
$ make check-ovsdb-cluster-valgrind TESTSUITEFLAGS="124"
[...]
124: OVSDB cluster - txn on follower-2, follower-2 crash before sending execReq, reconnect to follower-3 ok
[...]
$ grep "are definitely lost" tests/ovsdb-cluster-testsuite.dir/*/valgrind.* 
tests/ovsdb-cluster-testsuite.dir/124/valgrind.14805:==14805== 72 bytes in 1 blocks are definitely lost in loss record 452 of 642

On openvswitch2.13-2.13.0-29 (with fix):

$ make check-ovsdb-cluster-valgrind TESTSUITEFLAGS="126"
[...]
126: OVSDB cluster - txn on follower-2, follower-2 crash before sending execReq, reconnect to follower-3 ok
[...]
$ grep "are definitely lost" tests/ovsdb-cluster-testsuite.dir/*/valgrind.* 
$
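
For repeated runs, the grep check above can be condensed into a single number. This is only a minimal sketch, assuming POSIX sh and awk; `definitely_lost_bytes` is a hypothetical helper name and is not part of the OVS tree or its testsuite. It sums the byte counts from valgrind's "are definitely lost" lines across the given log files.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with OVS): total the bytes that valgrind
# reports as "definitely lost" in the given log files.
definitely_lost_bytes() {
    for f in "$@"; do
        # Skip arguments that are not regular files, e.g. an unmatched glob.
        if [ -f "$f" ]; then
            cat -- "$f"
        fi
    done | awk '
        # A matching line looks like:
        #   ==14805== 72 bytes in 1 blocks are definitely lost in loss record 452 of 642
        # so the byte count is the second field. Valgrind prints thousands
        # separators (e.g. 1,024), which are stripped before adding.
        /are definitely lost/ { gsub(",", "", $2); total += $2 }
        END { print total + 0 }
    '
}
```

Called as `definitely_lost_bytes tests/ovsdb-cluster-testsuite.dir/*/valgrind.*`, it prints 72 for the single loss record shown in the run without the fix, and 0 when no log contains a "definitely lost" record.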

Jianlin, could you please move this to VERIFIED? I don't seem to have the rights to do that.

Thanks,
Dumitru

Comment 6 Jianlin Shi 2020-06-17 09:25:37 UTC

Thanks, Dumitru, for running valgrind. Set to VERIFIED per comment 5.

Comment 8 errata-xmlrpc 2020-07-15 13:02:04 UTC

Since the problem described in this bug report should be
resolved in a recent advisory, it has been closed with a
resolution of ERRATA.

For information on the advisory, and where to find the updated
files, follow the link below.

If the solution does not work for you, open a new bug report.

https://access.redhat.com/errata/RHBA-2020:2944