Created attachment 1702209 [details]
ovn log files

Description of problem:

sb-db cluster node's memory consumption grows to 12G for 100 nodes, 9K pods, and around 1800 services:

```
ovnkube-master-hbvrv sbdb 40m 12393Mi
```

The sb-db cluster was running with an election timer of 30 seconds. The node's memory consumption grows consistently (a sketch of the sampling loop follows below):

```
ovnkube-master-hbvrv sbdb 2m 832Mi
Tue Jul 21 19:48:15 UTC 2020
Tue Jul 21 20:59:06 UTC 2020
ovnkube-master-hbvrv sbdb 46m 958Mi
Tue Jul 21 21:02:30 UTC 2020
ovnkube-master-hbvrv sbdb 45m 1184Mi
Tue Jul 21 21:05:54 UTC 2020
ovnkube-master-hbvrv sbdb 34m 1841Mi
Tue Jul 21 21:23:02 UTC 2020
ovnkube-master-hbvrv sbdb 30m 4062Mi
Tue Jul 21 21:43:51 UTC 2020
ovnkube-master-hbvrv sbdb 34m 5663Mi
Tue Jul 21 22:01:24 UTC 2020
ovnkube-master-hbvrv sbdb 36m 7393Mi
Tue Jul 21 22:19:09 UTC 2020
ovnkube-master-hbvrv sbdb 578m 9147Mi
Tue Jul 21 22:33:31 UTC 2020
ovnkube-master-hbvrv sbdb 12m 9128Mi
Tue Jul 21 22:40:44 UTC 2020
ovnkube-master-hbvrv sbdb 16m 9466Mi
Tue Jul 21 22:47:57 UTC 2020
ovnkube-master-hbvrv sbdb 6m 9992Mi
Tue Jul 21 22:55:10 UTC 2020
ovnkube-master-hbvrv sbdb 822m 10616Mi
Tue Jul 21 22:58:46 UTC 2020
ovnkube-master-hbvrv sbdb 282m 9973Mi
Tue Jul 21 23:02:27 UTC 2020
ovnkube-master-hbvrv sbdb 44m 10458Mi
Tue Jul 21 23:20:29 UTC 2020
ovnkube-master-hbvrv sbdb 154m 9974Mi
Tue Jul 21 23:24:14 UTC 2020
ovnkube-master-hbvrv sbdb 429m 10631Mi
Tue Jul 21 23:27:49 UTC 2020
ovnkube-master-hbvrv sbdb 576m 9974Mi
Tue Jul 21 23:31:24 UTC 2020
ovnkube-master-hbvrv sbdb 92m 10745Mi
Tue Jul 21 23:34:58 UTC 2020
ovnkube-master-hbvrv sbdb 994m 9994Mi
Tue Jul 21 23:52:52 UTC 2020
ovnkube-master-hbvrv sbdb 492m 11579Mi
Tue Jul 21 23:56:25 UTC 2020
ovnkube-master-hbvrv sbdb 30m 9975Mi
Tue Jul 21 23:59:58 UTC 2020
ovnkube-master-hbvrv sbdb 5m 9975Mi
Wed Jul 22 00:03:42 UTC 2020
Wed Jul 22 00:07:44 UTC 2020
ovnkube-master-hbvrv sbdb 131m 10809Mi
Wed Jul 22 00:11:19 UTC 2020
ovnkube-master-hbvrv sbdb 668m 11464Mi
Wed Jul 22 00:14:56 UTC 2020
ovnkube-master-hbvrv sbdb 132m 10809Mi
ovnkube-master-hbvrv sbdb 338m 12156Mi
Wed Jul 22 00:46:52 UTC 2020
ovnkube-master-hbvrv sbdb 8m 11492Mi
Wed Jul 22 00:50:24 UTC 2020
ovnkube-master-hbvrv sbdb 80m 10667Mi
Wed Jul 22 01:15:06 UTC 2020
ovnkube-master-hbvrv sbdb 97m 10667Mi
Wed Jul 22 01:18:38 UTC 2020
ovnkube-master-hbvrv sbdb 48m 10667Mi
Wed Jul 22 01:22:10 UTC 2020
ovnkube-master-hbvrv sbdb 671m 11490Mi
Wed Jul 22 01:25:42 UTC 2020
ovnkube-master-hbvrv sbdb 18m 11489Mi
Wed Jul 22 01:29:14 UTC 2020
ovnkube-master-hbvrv sbdb 87m 10976Mi
Wed Jul 22 02:18:36 UTC 2020
ovnkube-master-hbvrv sbdb 35m 11419Mi
Wed Jul 22 04:39:43 UTC 2020
Wed Jul 22 04:43:45 UTC 2020
ovnkube-master-hbvrv sbdb 15m 11419Mi
Wed Jul 22 04:47:16 UTC 2020
Wed Jul 22 04:51:18 UTC 2020
ovnkube-master-hbvrv sbdb 1715m 3319Mi
Wed Jul 22 04:54:50 UTC 2020
ovnkube-master-hbvrv sbdb 991m 15412Mi
Wed Jul 22 04:58:21 UTC 2020
ovnkube-master-hbvrv sbdb 0m 3999Mi
Wed Jul 22 05:01:52 UTC 2020
ovnkube-master-hbvrv sbdb 0m 0Mi
Wed Jul 22 05:05:24 UTC 2020
```

At one point memory consumption reached 15G. It looks like the oom_killer kills the sb-db based on its oom_score, which corrupts the db, and two of the three nodes then fail to restart because of the corrupt db.

This memory bloating, plus the nb-db memory bloating on the master node, causes OOM for other components running on the node where the master pods are scheduled. That results in failures to provision pods, and the CNI APIs time out. Must-gather collection fails too, so I collected logs from the specific pods.
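For the record, a minimal sketch of the kind of loop that produces the samples above (the pod name is from this report; the namespace and interval are assumptions, not taken from the original setup):

```
# Hypothetical sampling loop; namespace and interval are assumptions.
while true; do
    date -u
    oc -n openshift-ovn-kubernetes adm top pod ovnkube-master-hbvrv --containers \
        | awk '$2 == "sbdb" {print $1, $2, $3, $4}'
    sleep 210
done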
All the logs (ovnkube-master logs, ovnkube-node logs, memory growth logs for all ovn-kubernetes components, cluster status, etc.) are attached.

Version-Release number of selected component (if applicable):
ovn4.5.rc7
openvswitch2.13-2.13.0-29.el7fdp.x86_64
ovn2.13-2.13.0-31.el7fdp.x86_64

How reproducible:
Always

Steps to Reproduce:
1. Install a cluster using 4.5.0-rc.7
2. Deploy a 100-node cluster
3. Run cluster-density (MasterVertical) with at least 1000 namespaces

Actual results:
Pod networking configuration fails, or the pod annotation request fails when the API server is killed because of the memory bloating.

Expected results:
Pod creation should not fail.

Additional info:
This bug is related to the following bugzilla, which tracks the nb-db memory issue:
https://bugzilla.redhat.com/show_bug.cgi?id=1855408
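For anyone reproducing this, the RAFT state and election timer of the sb-db cluster can be inspected from inside an ovnkube-master pod. A hedged sketch (the control-socket path is an assumption and varies by release):

```
# Show RAFT role, term, election timer, and log state for the SB database.
ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound

# Raise the election timer (in milliseconds). Must be run on the leader,
# and the value can only be roughly doubled per invocation, so reaching a
# large value takes several calls.
ovn-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 30000
```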
Anil, will this still be a problem after https://github.com/ovn-org/ovn-kubernetes/pull/1711 lands?
Over to Numan as he's working on a patch to reduce the number of flows for each reject ACL.
Hi Anil, is it possible to attach the OVN north db file?
Hi Numan, I observed this issue in parallel with the issue reported in bug https://bugzilla.redhat.com/show_bug.cgi?id=1855408. The logs are uploaded here: https://drive.google.com/file/d/18dIf6qNP3IQvQOlVAOVjH6ppZuF-jKoV/view?usp=sharing
Submitted the patches for review - https://patchwork.ozlabs.org/project/ovn/list/?submitter=77669 - which reduce the number of lflows in the sb db. With the db attached to this bz and with OVN master, ovn-northd crashes on my laptop with 16 GB of memory. With OVN master plus the above patches, ovn-northd didn't crash. These patches should certainly help with ovn-northd memory usage.
Without the patches, the number of logical flows with the attached db is 1869383; with the patches it is 667979.
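For reference, a hedged sketch of how such a count can be taken on a live deployment (this assumes the default ovn-sbctl socket is reachable, e.g. from inside an ovnkube-master pod; it is not necessarily how the numbers above were produced):

```
# Count rows in the SB Logical_Flow table; each record prints one "_uuid" line.
# --no-leader-only lets the query run against any raft member.
ovn-sbctl --no-leader-only --columns=_uuid list Logical_Flow | grep -c '_uuid'
```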
Thanks Numan. Similarly, on another scale setup I found tons of lflows per service hairpin. I saw 434000 of these:

```
table=6 (ls_in_acl ), priority=2000 , match=(((!ct.trk || !ct.est || (ct.est && ct_label.blocked == 1))) && ip4 && (ip4.dst==172.30.0.83 && tcp && tcp.dst==6443)), action=(reg0 = 0; icmp4 { eth.dst <-> eth.src; ip4.dst <-> ip4.src; outport <-> inport; next(pipeline=egress,table=5); };)
table=6 (ls_in_acl ), priority=2000 , match=(((!ct.trk || !ct.est || (ct.est && ct_label.blocked == 1))) && ip4 && (ip4.dst==172.30.1.1 && tcp && tcp.dst==80)), action=(reg0 = 0; icmp4 { eth.dst <-> eth.src; ip4.dst <-> ip4.src; outport <-> inport; next(pipeline=egress,table=5); }
```

They look like hairpin flows to me. Can you confirm what they are? Could we cut those down? The action always seems to be the same, but the match criteria hit each service.
(In reply to Tim Rozet from comment #7)
> Thanks Numan. Similarly, on another scale setup I found tons of lflows per
> service hairpin. I saw 434000 of these:
> [...]
> They look like hairpin flows to me. Can you confirm what they are? Could we
> cut those down? The action always seems to be the same, but the match
> criteria hit each service.

Yes. These are hairpin flows. I'm also looking into cutting these flows down if possible.
(In reply to Numan Siddique from comment #8)
> (In reply to Tim Rozet from comment #7)
> > They look like hairpin flows to me. Can you confirm what they are? Could we
> > cut those down?
>
> Yes. These are hairpin flows. I'm also looking into cutting these flows down
> if possible.

Correction: the flows you listed here are not hairpin flows, but lflows added for reject ACL actions. There definitely are many hairpin flows, though: the attached db has around 250000 of them. I'm working on reducing these lflows.
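To see where the lflows concentrate, one way is to bucket the lflow-list output by pipeline stage. A sketch; the stage-name regex is an assumption based on the `table=N (stage_name)` format shown in comment #7:

```
# Dump all SB logical flows, then count them per pipeline stage.
# The reject-ACL flows land in ls_in_acl/ls_out_acl; outsized buckets
# point at the flow classes worth reducing.
ovn-sbctl --no-leader-only lflow-list > /tmp/lflows.txt
grep -oE '\(l[sr]_[a-z0-9_]+' /tmp/lflows.txt | sort | uniq -c | sort -rn | head
```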
The patches to address hairpin flows are submitted for review - https://bugzilla.redhat.com/show_bug.cgi?id=1859924

There is another BZ for the hairpin flows - https://bugzilla.redhat.com/show_bug.cgi?id=1833373
(In reply to Numan Siddique from comment #13)
> The patches to address hairpin flows are submitted for review -
> https://bugzilla.redhat.com/show_bug.cgi?id=1859924

this one - https://patchwork.ozlabs.org/project/ovn/list/?series=209175

> There is another BZ for the hairpin flows -
> https://bugzilla.redhat.com/show_bug.cgi?id=1833373
*** Bug 1885713 has been marked as a duplicate of this bug. ***
*** Bug 1884049 has been marked as a duplicate of this bug. ***
*** Bug 1891002 has been marked as a duplicate of this bug. ***
Adding the testblocker flag, per email from Dustin on Fri, Dec 4th, 4:16 pm.
Per comment 27 the issue is fixed in ovn2.13-20.12.0-1 and later.
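A trivial way to verify a host has the fixed build (assuming rpm-based FDP packaging):

```
# Expect ovn2.13-20.12.0-1 or newer.
rpm -q ovn2.13
```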
Fix shipped in FDP 21.A in ovn2.13-20.12.0-1
The needinfo request[s] on this closed bug have been removed as they have been unresolved for 500 days